Pre-processing (or composition of functions)
This is sort of a theoretical question. How do we know if it is easier to perform an operation on some data by splitting it into two simpler operations that do the same thing as the original operation you wanted to perform? So when f(x) = (g o h)(x), how do we know when we can benefit by using (g o h)(x) instead of f(x)?
As a perfect example, it's less expensive computationally to sort and then search an unsorted list than it is to search the original unsorted list.
The reason I ask is, I'm looking again at the Pyxie XML processing library (which I've looked at before) (also see this followup to the XML.com article). I'm wondering if it's simpler to convert XML to PYX and then process it than it is to process the raw XML in the first place. It's probably simpler to program a parser in these two steps, but is it also simpler in a theoretical sense? Will it take less time computationally?
The reason I'm asking the question is a mundane one, but I'm mostly interested in the theoretical aspects of this. How can we study this phenomenon? How do we look for it?
I'm going to cross-post this on LtU. I finally created an account there. Ok, here it is.
Feel free to post a comment below. Please see my comment policy.
Formatting Rules (No HTML):