Thu, 28 Jul 2005

Now that I've written a little about why you might want to
replace Excel AutoFilter, here's how to actually do it. To frame the discussion,
there are two problems to solve:

The tricky bit of the formula-based filter is the second problem: displaying the result set in a contiguous range of rows with no gaps. Each cell that

As it turns out, even though having the result set compute a mapping from the input set is quite hard, solving the reverse problem isn't too bad. Having the input set compute the mapping to the result set is easy. Here's how it works, by column:

reddit this! Digg Me!

- Deciding which rows of the input set are part of the result set
- Displaying the result set in a contiguous sequence of spreadsheet rows.

**In Query?**:The tricky bit of the formula-based filter is the second problem: displaying the result set in a contiguous range of rows with no gaps. Each cell that

*might*display part of the result set has to figure out itself what part of the result set to display, if any, and pull the data from the input set. A simple MATCH or LOOKUP can't handle this, since MATCH or LOOKUP can't be told to return the second, third, or nth match. They return the first match, which isn't quite enough for what we're trying to do.As it turns out, even though having the result set compute a mapping from the input set is quite hard, solving the reverse problem isn't too bad. Having the input set compute the mapping to the result set is easy. Here's how it works, by column:

**Ord.**- The row ordinal number of the row in the input set, starting with 1.**Result Ord.**- This column starts at zero, in the row preceeding the first row of the result set, and increments by 1 for each row where**In Query?**is TRUE. For each row with**In Query?**of TRUE, this column is the row ordinal number of this row in the result set.... We are almost there.**Result Rows.**- The input row ordinal of each row in the output set. This is done by using MATCH to find the first row for each number in the**Result Ord.**.

**Result Rows.**column has been calculated, populating the actual result set is just a matter of using INDEX. ISERROR can be called on cells in**Result Rows.**to identify rows that don't contain values. After all this is said and done, we have a spreadsheet range that contains only a result set, updates like every other range in Excel, and can be used in formulas like every other range. I have a sample spreadsheet that implements a lot of this here.reddit this! Digg Me!