Mike Schaeffer's Weblog
Mon, 18 Jul 2005
PS: I think that AutoFilter is typical of Excel...
I think that the weaknesses of the Excel AutoFilter turn out to be pretty typical of Excel in general.

To me, the brilliance of the spreadsheet was that it took a data model that business people were familar with, the accountant's paper spreadsheet, and layered on automatic computation and reporting facilities in a natural way. There's something very intuitive about going to cell c1, entering =A1+B1, and then having C1 contain the sum of the other two cells, automatically updated as the source cells change. It just makes sense, and is at the very core of every software spreadsheet dating back to the first, VisiCalc.

For years, spreadsheets worked at making this model work better. Lotus 1-2-3 introduced something called natural recalculation order that made it easier to follow the logic of spreadsheet calculation. Somewhere along the way, spreadsheets started doing limited recalculation, where formulas that didn't change weren't recalculated (thus saving time). New intrinsic functions were added, and Excel made a huge stride when it added array formulas: individual formulas that can produce more than one result. The gateway to user defined functions written in VisualBASIC was another huge win.

The core strength of all of these ideas is that they rely on and extend the core concept of the software spreadsheet: the software tracks dependancies between cells and automatically recalculates the appropriate results as necessary. As powerful as that concept is, Microsoft lost the plot somewhere around Excel 4 or 5 and keeps sinking money and effort into features that don't fully participate:
  • Excel has two data filter features: neither one can automatically update a table as a part of recalculation.
  • PivotTables don't update when their source data updates either. (For SQL data sources, this is understandable, but not so much when the source data comes from Excel itself).
  • PivotTables produce tables with missing values (to improve the formatting), which makes them very difficult to query with spreadsheet lookup functions.
  • The historgram function (among others) of the Analysis ToolPak is a one-time thing: you use it, it generates a histogram, and that's it. It's not possible to incorporate histogram generation into the dataflow driven recalculation of a spreadsheet.
  • There's no way to use an Excel formula to determine if a row is excluded or included in an AutoFilter query. Actually, there's no way to have the result set of an AutoFilter query drive spreadsheet recalculation at all.
Maybe this is being picky, but spreadsheets have a real strength in that they made it a lot easier for non-techies to specify how a computer can automatically solve certain types of problems. It's just a shame that so many of Excel's features are excluded from the natural way Excel is programmed.

reddit this! Digg Me!

[/tech/excel] permanent link

List filtering with Formulas in Microsoft Excel: Motivation
One of Excel's more interesting features for querying data sets is the AutoFilter. Applied to a table of data in a spreadsheet, The AutoFilter allows the table to be queried for subsets of data based on combo boxes in the table's header row. It's a simple way to filter out extraneous data and it can support quite elaborate query semantics (since it can filter based on values in computed cells).

However, AutoFilter is not without its problems:
  • AutoFilter imposes its own user interface: if you want a look-and-feel other than stock, you're out of luck.
  • For wide data tables with lots of columns, it can be hard to see the current AutoFilter query. To see the entire query requires horizontal scrolling down the header row.
  • Cell formatting and AutoFilter are independant of each other. If you want position dependant formatting (alternate row formatting, for example), it has to be recreated after each AutoFilter adjustment.
  • An AutoFilter works by selectively 'hiding' rows in the worksheet it's a part of. This means that an AutoFiltered list can't share rows with anything else that you don't also want selectively 'filtered' from view.
  • You can't have more than one AutoFilter on a worksheet tab.
  • AutoFilter isn't part of the natural 'ebb and flow' of the life of a spreadsheet: it doesn't participate in the dependancy driven formula solver that drives Excel's computational capability. This has some profound (bad) implications:
    • As data rows are added and removed from the list being AutoFiltered, the AutoFilter has to be removed and reapplied to the new data list to reflect changes to its source.
    • You can't use AutoFilter to filter a list and then search that list with =LOOKUP() or =MATCH(): the lookup operation will search the entire list, not the filtered list.
    • If you AutoFilter a list that contains calculated cells, and those cells change value, the set of filtered rows is not updated.
Anyway, I could go on, but I hope it's pretty clear by now that there are sometimes good reasons to look for other list filter mechanisms than AutoFilter. (FYI: 'Advanced Filter' has its own limitations, some of which are very similar to AutoFilter's.) I'll post a way to get AutoFilter-like behavior directly from Excel formulas. This technique has its own issues, but it does address lots of the issues I mentioned here.

reddit this! Digg Me!

[/tech/excel] permanent link