Mike Schaeffer's Weblog

Wed, 14 Dec 2005

Feeds and Reports
I've been doing a lot of analysis of feeds and reports lately, and have come up with a couple of suggestions for file design that can make feeds easier to work with. None of this should be earth-shattering advice, but collectively it can mean the difference between an easy file to work with and a complete pain in the ...well, you know.
  • Prefer machine readable formats - "Pretty printers" for reports have a lot of utility: they can make it easy for users to view and understand results. However, they also have disadvantages: it's harder to use "pretty" reports for the further downstream processing that someone will inevitably want to do. This is something that needs to be considered carefully, keeping your audience in mind, but if possible, pick a format that a machine can easily work with.
  • Use a standard file format - There are lots of standard formats available for reports and feeds: XML, CSV, Tab Delimited, S-Expression, INI File, etc. Use one of these. Tools already exist to process and manipulate these kinds of files, and one of these formats will be able to contain your data.
  • Prefer the simplest format that will work - The simpler the format, the easier it will be to parse/handle. CSV is a good example of this: XML is sexier and much more powerful, but CSV has been around forever and has many more tools. A good example of what I mean is XML support in Excel. Excel has been getting XML support in the most recent versions, but it's had CSV support since the beginning. Also, from a conceptual standpoint, anybody who can understand a spreadsheet can understand a tabular file, but hierarchical data is considerably more complex a concept. (In business settings, there's a very good chance your feed/report audience will be business analysts that know Excel backwards and forwards but have no technical computer science training.)
  • Prefer delimited formats to formats based on field widths - The thing about having columns based on field widths (column 1 is 10 characters wide, column 2 is 20, etc.) is that you have to remember and specify the field widths when you want to extract out the tabular data. In the worst case, without the column widths you can't read your file at all. In the best case, it's just something else you have to do when you load a file.
  • If you specify column names, ensure they are unique. - This isn't necessary for a lot of data analysis tools, but some tools (cough... MS Access) get confused when importing a table with multiple columns of the same name.
  • Include a header that describes the feed. - To fully understand the contents of a file, you really have to understand what it contains and where it came from. This is useful both in testing (did this report come from build 28 or build 29?) and in production (when was this file generated?) My suggestions for header contents include:
    • The version of the report specification
    • Name of the source application
    • Version of the source application (This version number should be updated with every build.)
    • Environment in which the source application was running to produce the report.
    • The date on which the report was run
    • If the report has an effective date, include it too.
    Also, this is a subtle point, but the header row should ideally be in a similar format to the rest of the data in the feed. That is, if your file is a CSV file, the header should be one comma-delimited row.
  • Document your report - Without good, precise documentation of your file format, it'll be very hard to reliably consume files in the format. Similarly, have as many people as possible peer review your file format. Even if your system's code is complete garbage, the file format represents an interface to your system that will possibly live much longer than the system itself.
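As a rough illustration of a few of the suggestions above, here is a minimal Python sketch that writes a delimited feed whose descriptive header is itself one CSV row, and that refuses duplicate column names. The function name `write_feed` and the header fields in the usage note are made up for the example; they are not part of any real feed specification.

```python
import csv
import io


def write_feed(rows, columns, header):
    """Return CSV text: one descriptive header row, one column-name row, then data."""
    # Some import tools get confused by repeated column names, so reject them early.
    if len(set(columns)) != len(columns):
        raise ValueError("column names must be unique")

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)    # the feed description, in the same format as the data
    writer.writerow(columns)   # unique column names
    writer.writerows(rows)     # the data itself
    return out.getvalue()
```

Calling `write_feed([[1, "a"]], ["id", "name"], ["spec=1.0", "app=demo"])` produces three comma-delimited lines that any CSV reader can load without special handling for the header.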



[/tech/programming] permanent link

Wed, 16 Nov 2005

UseHR, I give up!!!
A few months ago I wrote a bit on the impact of high resolution displays on the way Internet Explorer renders graphics. I had really planned on using the default setting. Not anymore!

This is the awful default:


This is as it should be:


Now, guess what Firefox does.


[/tech/general] permanent link

Wed, 09 Nov 2005

Thirty days hath September...

"Thirty days hath September,
All the rest I can't remember.
The calendar hangs on the wall;
Why bother me with this at all?"

http://leapyearday.com/30Days.htm



Here's an Excel one-liner that computes the number of days in a particular month. Cell A2 contains the year of the month you're looking for, Cell B2 contains the month's ordinal (1=January, 2=February, etc.):
=DAY(DATE(A2,B2+1,1)-1)
This is mainly useful to illustrate what can be done with Excel's internal representation of dates. Dates and times in Windows versions of Excel are normally stored as serial numbers that count days, with January 1st, 1900 as day 1. You can see this by entering a date in a cell, and then reformatting the cell to display as a number rather than a date. For example, this reveals April 1st, 2004 to be represented internally as the number 38078. That is two more than the 38,076 days that actually separate January 1st, 1900 and April 1st, 2004: one because the count starts at 1 rather than 0, and one because Excel treats 1900 as a leap year and counts a February 29th, 1900 that never existed.
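To see where a number like 38078 comes from, a small Python sketch can reproduce the serial-number calculation. It assumes the 1900 date system and uses 1899-12-30 as day zero, which agrees with Excel only for dates from March 1st, 1900 onward (Excel counts a February 29th, 1900 that never existed). The name `excel_serial` is invented for the example.

```python
from datetime import date

# Assumption: 1900 date system. Using 1899-12-30 as day zero absorbs both the
# "January 1st, 1900 is day 1" offset and Excel's phantom February 29th, 1900.
EXCEL_DAY_ZERO = date(1899, 12, 30)


def excel_serial(d):
    """Serial number Excel would show for date d (valid from 1900-03-01 onward)."""
    return (d - EXCEL_DAY_ZERO).days
```

With this definition, `excel_serial(date(2004, 4, 1))` gives 38078, the same number the reformatted cell shows.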

The formula above relies on this in its computation of the number of days in a month. The sub-expression DATE(A2,B2+1,1) computes the date number for the first day of the month immediately following the month we're interested in. We then subtract one from that number, which gives us the date number for the last day of the month that we are interested in. The call to DAY then returns the number of the day within the month, which happens to be the number of days in the month.
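The same "first day of the next month, minus one day" idea can be sketched outside Excel. This is only an illustration in Python; unlike Excel's DATE, Python's `date` does not roll month 13 over into the next year, so the sketch handles December explicitly. The name `days_in_month` is made up for the example.

```python
from datetime import date, timedelta


def days_in_month(year, month):
    """Number of days in the given month (1=January ... 12=December)."""
    # First day of the month following the one we care about.
    if month == 12:
        first_of_next = date(year + 1, 1, 1)
    else:
        first_of_next = date(year, month + 1, 1)
    # Step back one day to land on the last day of the target month;
    # its day-of-month is the month's length.
    return (first_of_next - timedelta(days=1)).day
```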


[/tech/excel] permanent link

Fri, 04 Nov 2005

SRFI-74: Octet-Addressed Binary Blocks
Michael Sperber has written an SRFI that documents "Octet-Addressed Binary Blocks". Basically these things are like BLOBs in SQL: blocks of memory, opaque to the data model of the language, that can be used to store arbitrary binary data. I can think of a bunch of applications for this:
  • An internal representation for compiled byte code functions.
  • A way to interoperate with C code that expects binary data formats. (Like the Win32 API, for example.)
  • A way to represent binary data longer than a byte that's written to and read from binary ports.



[/tech/lisp] permanent link

Tips on Programming
I don't know who this person is, but they have a good collection of programming tips online. A lot of this stuff looks pretty relevant.

Related to that is this deck of slides written by Kent Pitman and Peter Norvig. It's an excellent discussion of good programming style in Lisp.


[/tech/programming] permanent link

"Apropos of..." Better Autofilter Results.
At my job, we use Excel extensively to keep track of software testing progress. One typical use is to maintain a list of features to be tested, along with their current pass/fail statuses and an attempt at a rough subdivision into functional areas. Excel's AutoFilter then makes it easy to ask questions like "show me all failed tests relating to function block scheduling."

This works really well as long as "function block scheduling" is one of the categories into which you've subdivided your features list. If it's not, you have to get a little creative to filter your list. One approach to this problem I've found useful is filtering based on columns populated with a formula similar to this:
=IF(ISERROR(SEARCH($K$5,K6)),"No","Yes")
If column K contains feature descriptions, this formula returns "Yes" if the description matches the search string in K5 and "No" otherwise. Filtering based on this formula makes it possible to display every list item whose description matches a word. If there is more than one column to search, you can use string concatenation to aggregate the columns together:
=IF(ISERROR(SEARCH($K$5,K6&L6&M6)),"No","Yes")
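For readers more comfortable with code than with worksheet formulas, the logic of the helper column can be sketched in Python. This is an approximation only: SEARCH is case-insensitive, which the sketch imitates with `lower()`, but SEARCH also accepts wildcards, which the sketch does not. The name `matches` is invented for the example.

```python
def matches(term, *cells):
    """Return "Yes" if term occurs anywhere in the concatenated cells, else "No"."""
    # Concatenating the cells mirrors K6&L6&M6 in the worksheet formula.
    haystack = "".join(cells).lower()
    return "Yes" if term.lower() in haystack else "No"
```

For example, `matches("sched", "Function block ", "Scheduling")` returns "Yes", just as the filter column would for a row whose description mentions scheduling.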
So, why the name apropos? Follow this link.


[/tech/excel] permanent link

Tue, 11 Oct 2005

Excel 12's Conditional Formatting Rules
David Gainer has summarized a number of new conditional formatting rules in Excel 12, over on the Excel 12 blog. These rules were designed to "make a greater number of scenarios possible without needing to write formulas." In other words, all these scenarios have simple solutions directly visible in the Excel 12 UI.

Well, if you can't wait for Excel 12, Excel is pretty darned powerful as it is, and as Mr. Gainer states: most of these scenarios have formula-based approaches that work right now. Here are some of the approaches for current versions of Excel:
  • With data bars, color scales, or icons based on the numeric value in the cell, percentages, percentiles, or a formula. See the posts on data bars, color scales, and icon sets for more information on each of these. - This approach to 'databars' generalizes to formula-based scaling, although it's not as pretty, not a color scale, and not an icon set.
  • Containing, not containing, beginning with, or ending with specific text. For example, highlighting parts containing certain characters in a parts catalog. - Use a formula: a lot of these conditions can be tested using FIND: =FIND(string, A1)=1, checks for parts that begin with string, for example.
  • Containing dates that match dynamic conditions like yesterday, today, tomorrow, in the last 7 days, last week, this week, next week, last month, this month, next month. For example, highlight all items dated yesterday. The great part about these conditions is that Excel handles calculating the date based on the system clock, so the user doesn't need to worry about updating the condition. - Use a formula: the system date is available via NOW() (or TODAY(), if you only want the date), and Excel offers plenty of date arithmetic functions to check for specific conditions.
  • That are blank or that are non-blank. - Use a formula: =ISBLANK(A1) or =NOT(ISBLANK(A1))
  • That have errors or that do not have errors. - Use a formula: =ISERROR(A1) or =NOT(ISERROR(A1))
  • That are in the top n of a selected range (where n is whatever number you want) OR that are in the top n percent of a selected range (again, where n is adjustable). For example, highlighting the top 10 investment returns in a table of 1,000 investments. - Use a formula: =RANK(A1, range)<=n. (By default RANK numbers values from the largest down, so the top n values have ranks 1 through n.)
  • Cells that have the bottom n values OR cells that are the bottom n percent of a selected range. - Use a formula: =RANK(A1, range)>ROWS(range)-n.
  • Cells that are above average, below average, equal to or above average, equal to or below average, 1 standard deviation above, 1 standard deviation below, 2 standard deviations above, 2 standard deviations below, 3 standard deviations above, 3 standard deviations below a selected range. - This type of thing can be solved using a particular form of formula: =A1<(AVERAGE(range)-n*STDEV(range)) or =A1>(AVERAGE(range)+n*STDEV(range)). For large ranges, it probably makes sense to move the computation of AVERAGE and STDEV into a cell, and have the conditional format reference (with an absolute reference) that cell.
  • Cells that are duplicate values or, conversely, cells that are unique values. - Use a formula: =COUNTIF(range, A1)>1 for duplicates or =COUNTIF(range, A1)=1 for unique values. Ensure that the range you use in the formula has an absolute address. If your range is sorted on the 'key' field, you can use this style of formula: =A1<>A2. This can be much, much faster, particularly for large tables. (For the Comp. Sci. types it's O(N), rather than O(N^2), once you have sorted data.)
  • Based on comparisons between two columns in tables. For example, highlight values where values in the "Actual Sales" column are less than in the "Sales Target" column. - Use a conditional format formula: =A1<B1. Apply it to the entire column you want shaded, and Excel will evaluate the formula separately for each cell. The cell references in the format formula are relative to the current cell in the selected range. The current cell is the cell in the range that is not highlighted (but is surrounded by a selection border), and can be moved around the four corners of the range with Control+. (period).
  • When working with tables, we have also made it easy to format the entire row based on the results of a condition. - Relative formulas can be made to do this: select an entire range, and define a conditional formula using absolute column addresses (i.e., =$A1). Excel evaluates the format formula for each cell in the range, and since the column addresses are absolute, each cell in a row will pull from the same columns. Therefore, each cell in a row will share the same conditional format, which is what we want.
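A few of the tests in the list above are easier to reason about when written as ordinary code. The Python sketch below is illustrative only: `excel_rank` imitates RANK's default descending order (1 plus the number of larger values, so ties share a rank), and `is_duplicate_sorted` is one possible single-pass check over sorted data that flags every member of a run, which differs slightly from a bare =A1<>A2 comparison. All function names are invented for the example.

```python
def excel_rank(x, values):
    """Descending rank of x within values: 1 + the count of strictly larger values."""
    return 1 + sum(1 for v in values if v > x)


def top_n(values, n):
    """True for each value that would satisfy RANK(value, range) <= n."""
    return [excel_rank(v, values) <= n for v in values]


def bottom_n(values, n):
    """True for each value that would satisfy RANK(value, range) > ROWS(range) - n."""
    return [excel_rank(v, values) > len(values) - n for v in values]


def is_duplicate_sorted(sorted_values):
    """Flag values equal to a neighbour; one O(N) pass, valid only on sorted input."""
    flags = []
    for i, v in enumerate(sorted_values):
        same_as_prev = i > 0 and sorted_values[i - 1] == v
        same_as_next = i + 1 < len(sorted_values) and sorted_values[i + 1] == v
        flags.append(same_as_prev or same_as_next)
    return flags
```

The COUNTIF approach corresponds to scanning the whole range once per cell, which is where the O(N^2) cost comes from; the neighbour comparison touches each cell a constant number of times.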
Based on this, you don't have to wait for Excel 12 to get a lot of these features; you only have to wait if you want Excel to do it for you automatically. My suggestion would be to learn how to use conditional formatting formulas, but I tend to be a "here's how to fish" kind of guy more than a "here's a fish" kind of guy.


[/tech/excel] permanent link

Fri, 07 Oct 2005

Excel 12 Databars, Without VBA.
I suspected as much, but Excel has a way to duplicate my UDF using Excel formulas.

=REPT("█",A1)&REPT("▌",ROUND(A1-FLOOR(A1,1),0))

That formula evaluates to a bar of length A1 units, rounded down to a multiple of 0.5. Rescaling can be done in another cell. If you're interested in a bar that can be right-justified, you can use this:

=REPT("▐",ROUND(A1-FLOOR(A1,1),0))&REPT("█",A1)

The trickiest part about this is getting the block characters into the formula. For that, I recommend using the Windows Character Map.

Qualitatively compared to VBA, this method requires more logic to be represented in the spreadsheet: that adds complexity for readers and makes it trickier to set up than the VBA. On the other hand, it avoids the performance hit of calling a UDF and the requirement that the spreadsheet contain a macro. I honestly don't know which is better style, but can say that this would be a perfect time to use a parameterized range name (if Excel had such a thing).


[/tech/excel] permanent link

Excel 12 Databars, Now. (Sort of)
Microsoft has just announced a cool new feature on the Excel 12 blog: the databar. I think a picture (linked from Microsoft's Excel 12 Blog) can explain it better than I can:



This will be a nice way to look for trends/outliers, but I can also see it being useful for tracking parallel completion percentages in status reports, etc. Of the Excel 12 features announced so far, this is the one that I'm the most excited about. Of course, it's also the one that's easiest to approximate in Excel <12. Andrew has an approach using Autoshapes on his blog, and I'm going to present a slightly different approach.

IMO, his approach looks a lot better, but this approach has the benefit of updating automatically. Pick your poison.

It all centers around this little UDF:
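Option Explicit

' Returns a text bar whose length is x rescaled from [Low, High] to [0, ScaleTo].
Function GraphBar(ByVal x As Double, _
                  ByVal Low As Double, _
                  ByVal High As Double, _
                  ByVal ScaleTo As Double) As String

    ' A zero-width range cannot be rescaled; return an empty bar instead of
    ' failing with a division by zero.
    If High = Low Then
        GraphBar = ""
        Exit Function
    End If

    ' Rescale x from [Low, High] to [0, ScaleTo].
    x = ((x - Low) / (High - Low)) * ScaleTo

    Dim i As Long

    Dim blockFull As String
    Dim blockHalf As String

    blockFull = ChrW(9608) ' full block
    blockHalf = ChrW(9612) ' left half block

    GraphBar = ""

    ' One full block per whole unit of the scaled value.
    For i = 1 To Fix(x)
        GraphBar = GraphBar & blockFull
    Next

    ' Add a half block when the fractional part exceeds 0.5.
    If x - Fix(x) > 0.5 Then
        GraphBar = GraphBar & blockHalf
    End If
End Function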
This isn't rocket science: all it does is rescale x from the range [Low, High] to the range [0.0, ScaleTo]. Then, it strings together that many ChrW(9608)s, followed by a ChrW(9612) if the scaled value's fractional part is >0.5. The trick in this is that ChrW(9608) and ChrW(9612) are VBA expressions that produce the Unicode equivalent of the old line drawing characters IBM put in the original PC [1]. 9608 is a full box ("█"), 9612 is a half box on the left ("▌"). The result of this function ends up being a string that (when displayed as Arial) looks like a horizontal bar ("████▌"). Put a few of those in adjacent cells, and you get this:



The formula in C2 (and filled down) is =GraphBar(B2,MIN(B$2:B$8),MAX(B$2:B$8),5). The MIN and MAX set the scale, the 5 sets the maximum length of a bar. The maximum length, font size, and column width can be tweaked to produce a reasonably attractive result, although I do recommend using vertical centering.
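For anyone who wants to experiment with the rescale-and-repeat logic outside Excel, here is a rough Python port of the UDF together with the MIN/MAX call pattern. It is a sketch rather than a drop-in replacement: the names `graph_bar` and `graph_bars` are invented, and `int()` is used because it truncates toward zero in the same way VBA's Fix does.

```python
FULL_BLOCK = "\u2588"  # same character as ChrW(9608)
HALF_BLOCK = "\u258c"  # same character as ChrW(9612)


def graph_bar(x, low, high, scale_to):
    """Text bar for x, rescaled from [low, high] to [0, scale_to]."""
    scaled = (x - low) / (high - low) * scale_to
    whole = int(scaled)              # truncates toward zero, like Fix
    bar = FULL_BLOCK * whole         # a negative count yields an empty string
    if scaled - whole > 0.5:         # half block only when the fraction exceeds 0.5
        bar += HALF_BLOCK
    return bar


def graph_bars(values, scale_to):
    """Bars for a whole column, scaled between its minimum and maximum."""
    low, high = min(values), max(values)
    return [graph_bar(v, low, high, scale_to) for v in values]
```

As in the worksheet version, the smallest value in the column gets an empty bar and the largest gets a bar of `scale_to` full blocks.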

If you want to get a little fancier, conditional formatting works on plot cells...



...whitespace can possibly improve the appearance...



...and this technique can scale.



[1] The original PC didn't have standard graphics; they were an option. If you bought the monochrome, non-graphics video board, characters like this were as close as you could get to a bar chart.


[/tech/excel] permanent link

Thu, 06 Oct 2005

Excel: 'Repeat', the Simplest Macro
This is a simple little two-bit Excel trick that I find myself using all the time, particularly when formatting worksheets.

In Excel, Control+Y is the 'other half' of the Undo/Redo pair. If you undo an action and want to redo what you just undid, Control+Y undoes the undo, so to speak. However, if you haven't undone anything, and there's nothing on the redo queue, Control+Y repeats the last single action you took.

Repeatable actions can actually be quite complex. For example, opening the Format Cell dialog box and applying a format counts as one repeatable action, regardless of how many format attributes you change. Once you make that format change to one cell, and before you do anything else, Control+Y has become a key that applies that specific format change to as many other cells as you like.

In a sense, Control+Y is a command that's eternally bound to a simple macro that Excel keeps updating with your last action. If you plan your work to group actions together, this 'automatic' macro can save a lot of time.


[/tech/excel] permanent link

Mon, 03 Oct 2005

Formulas Driven from AutoFilters
I had this written out and then discovered a better way. SUBTOTAL is "sensitive to AutoFilter settings", right? Assuming A1 isn't empty, the formula =SUBTOTAL(3, A1)=1 returns TRUE if row 1 is visible and FALSE otherwise. (SUBTOTAL takes the function number first; 3 selects COUNTA, which counts any non-empty cell.) No VBA necessary.

Not too long ago, I made a post that describes how to replicate some of the behavior of Excel AutoFilters using a purely formula-based approach. One of the arguments I put forward in support of that technique is that it makes it possible to use filtered result sets to drive other calculations. However, the approach also has two disadvantages: it's slow to compute and can be a little tricky to set up and understand. As a sort of intermediate ground between using the AutoFilter and re-implementing it, this post describes how an Excel formula can determine if a row is a member of an AutoFilter result set. The magic bit is this little user defined function:
Function IsVisible(rng As Range) As Boolean
    
    IsVisible = True
    
    Dim row As Range
    Dim col As Range
           
    For Each row In rng.Rows
        If row.RowHeight = 0 Then
            IsVisible = False
            Exit Function
        End If
    Next
        
    For Each col In rng.Columns
        If col.ColumnWidth = 0 Then
            IsVisible = False
            Exit Function
        End If
    Next
End Function
Given a range, this function returns true if every cell in the range is visible (non-zero row height and column width). The way Excel works, the Row Height of a row hidden by the Autofilter is reported as zero. Therefore, IsVisible returns false when given a reference to a cell in a hidden AutoFiltered row. Of course, it also returns False for cells in manually hidden rows and columns, but if you're careful, you can avoid that.

For a simple use case, this function can be used to generate alternating color bars that always alternate regardless of the AutoFilter settings. To set it up, put TRUE in the topmost cell of a free column next to the AutoFilter to be colored. Below the TRUE, fill down with a formula like this: =IF(isvisible(D2),NOT(D1),D1). This formula inverts the value in the column, but only for cells that are visible. This guarantees that regardless of the AutoFilter settings, this column will always alternate TRUE/FALSE in the set of visible rows. This column can then be used to drive a conditional format that highlights alternating visible rows.
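The fill-down formula is a running toggle, which may be easier to follow as a loop. The Python sketch below assumes the visibility of each row is already known as a list of booleans (standing in for IsVisible) and that the seed cell above the first row holds TRUE; `band_flags` is a made-up name.

```python
def band_flags(visible, start=True):
    """Per-row flag that flips on every visible row and is carried through hidden ones."""
    flags = []
    previous = start  # the TRUE typed into the topmost cell
    for is_visible in visible:
        # Mirrors IF(IsVisible(row), NOT(previous), previous).
        current = (not previous) if is_visible else previous
        flags.append(current)
        previous = current
    return flags
```

With rows 2 and 3 hidden, `band_flags([True, False, False, True, True])` gives `[False, False, False, True, False]`: reading only the visible rows, the flags go False, True, False, so the banding still alternates.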

A couple sidenotes:
  • This function works because adjusting an AutoFilter triggers recalculation, and Excel notices that this function depends on row heights. For hiding columns, it's a lot less reliable. All the calls to IsVisible have to be forced to recompute after the column is hidden or displayed. To do this, IsVisible can be marked as volatile and recalculation forced by pressing F9. This is a lousy solution.
  • To optimize performance, the function short-circuits its search. The Exit Function statements bail out of the calculation as soon as the first hidden row or column is discovered.
  • Excel's SUBTOTAL intrinsic function is also sensitive to AutoFilter settings.



[/tech/excel] permanent link

Tue, 20 Sep 2005

The world's most popular functional programming language...
This may be something of a surprise, but Excel has even gotten the attention of Microsoft Research. Simon Peyton Jones, Margaret Burnett, and Alan Blackwell have written a paper that describes "extensions to the Excel spreadsheet that integrate user-defined functions into the spreadsheet grid". Of course, Excel doesn't do this... but, I wonder if that should be "Excel doesn't do this yet".

As a sidenote, this reminds me a little of how LabView handled subfunction definitions: subfunctions are defined using the same visual tools as top-level functions. It worked, but 'felt' a little heavyweight in actual use.


[/tech/excel] permanent link

MoveAfterReturn, OnTime, and the Excel Status Bar
I really liked this post by Dick Kusleika, over on Daily Dose of Excel. I'm a big fan of controlling frequently used options with keyboard shortcuts. To riff on Mr. Kusleika's post a little, here's a refinement I've found useful in the past for macros like these. Basically this allows the same macro to toggle a state as well as non-destructively display the current state.

The first time the macro MaybeToggleMAR is invoked, it displays the current state in the status bar, and sets a timer to expire in 3 seconds. If the macro is invoked a second time before the timer expires (easy to do if it's bound to a keystroke) the state is toggled. Technically speaking, the trickiest bit is that the function that sets the 3 second timer also has to handle cancelling any previous instance of the same timer. It works without the timer cancellation, but without it, the UI behaves oddly after multiple keypresses in rapid succession.

Chip Pearson's website has useful content discussing the Excel API's for Timers and the Status Bar.

Here's the code: to use it, stick it in a module and bind MaybeToggleMAR to the keyboard shortcut of your choice.
Option Explicit
Private MARChangesEnabled As Boolean

Public NextDisableTime As Double

Sub DisableMARChanges()
    Application.StatusBar = False
    MARChangesEnabled = False
End Sub

Sub DisableMARChangesSoon()
    On Error Resume Next
    Application.OnTime NextDisableTime, "DisableMARChanges", , False
    
    NextDisableTime = Now + TimeSerial(0, 0, 3)
    Application.OnTime NextDisableTime, "DisableMARChanges", , True
End Sub


Sub MaybeToggleMAR()
    Dim NewStatusText As String
    
    NewStatusText = ""
    
    If MARChangesEnabled Then
        Application.MoveAfterReturn = Not Application.MoveAfterReturn
        NewStatusText = "Status changed: "
    Else
        MARChangesEnabled = True
        NewStatusText = "Second press will change status: "
    End If
            
    If Application.MoveAfterReturn Then
        NewStatusText = NewStatusText & "MoveAfterReturn Enabled"
    Else
        NewStatusText = NewStatusText & "MoveAfterReturn Disabled"
    End If
                
    Application.StatusBar = NewStatusText
    
    DisableMARChangesSoon
End Sub
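The "show on first press, toggle on second press" behavior can also be modeled without any Excel machinery. The Python sketch below swaps the OnTime timer for a stored deadline that is compared against a caller-supplied clock value, which makes the logic easy to test; it is an illustration of the idea, not a translation of the macro. The class name, the `press` method, and the `now` parameter are all invented for the example.

```python
class TwoPressToggle:
    """First press reports the state; a second press within the window flips it."""

    WINDOW_SECONDS = 3

    def __init__(self, enabled=True):
        self.enabled = enabled    # stands in for Application.MoveAfterReturn
        self.armed_until = None   # stands in for the pending OnTime timer

    def press(self, now):
        """Handle one key press at time `now` (seconds); return the status text."""
        armed = self.armed_until is not None and now < self.armed_until
        if armed:
            self.enabled = not self.enabled
            text = "Status changed: "
        else:
            text = "Second press will change status: "

        # Overwriting the deadline plays the role of cancelling the previous
        # timer before scheduling a new one.
        self.armed_until = now + self.WINDOW_SECONDS

        state = "Enabled" if self.enabled else "Disabled"
        return text + "MoveAfterReturn " + state
```

Because every press replaces the deadline, several presses in rapid succession keep toggling, and the window only closes after three quiet seconds.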



[/tech/excel] permanent link

Wed, 07 Sep 2005

Programming Well: Write (and Read) your Data ASAP
One of the first functions I like to write when creating a new data structure is a human-readable dumper. This is a simple function that takes the data you're working with and dumps it to an output stream in a readable way. I've found that these things can save huge amounts of debugging time: rather than paging through debugger watch windows, you can assess your program's state by calling a function and reading it out by eye. A few tips for dump functions:
  • The more use this kind of scaffolding code gets, the more cost-effective it was to write. Time spent before dumpers are in place is time they can't be used, which makes them progressively less cost-effective. Implement them early, if you can.
  • Look for cheap alternatives already in your toolkit: Lisp can already print most of its structures, and .Net includes object serialization to XML. The standard solution might not be perfect, but it is 'free'.
  • Make sure your dumpers are correct from the outset. The whole point of this is to save debugging time later on; if you can't trust your view into your data structures during debugging, it will cost you time.
  • Dump into standard formats. If you can, dump into something like CSV, XML, S-expressions, or Dotty. If you have a considerable amount of data to analyze, this'll make it easier to use other tools to do some of the work.
  • Maintain your dumpers. Your software isn't going to go away, and neither are your data structures. If it's useful during initial development, it's likely to be useful during maintenance.
  • For structures that might be shared, or exist on the system heap, printing object addresses and reference counts can be very useful.
  • For big structures, it can be useful to elide specific content. For example: a list of 1000 items can be printed as (item_0, item_1, item_2, ..., item_999).
  • This stuff works for disk files too. For binary save formats, specific tooling to examine files can save time compared to an on-disk hex-editor/viewer. (Since you have code to read your disk format into a data structure in memory, if you also have code to dump your in-memory structure, this does not have to be much more work. Sharing code between the dump utility and the actual application also makes it more likely the dumper will show you the same view your application will see.)
  • Reading dumped structures back in can also be useful.
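A couple of the tips above (eliding the middle of big structures, dumping into a standard format, and reading the dump back in) can be sketched in a few lines of Python. JSON is used here purely as a convenient standard format for the example; the post itself suggests CSV, XML, S-expressions, or Dotty. The function names are invented.

```python
import json


def dump_list(items, head=3):
    """Readable one-line dump that elides the middle of long lists."""
    texts = [str(item) for item in items]
    if len(texts) > head + 1:
        # Keep the first few items and the last one; replace the rest with "...".
        texts = texts[:head] + ["..."] + texts[-1:]
    return "(" + ", ".join(texts) + ")"


def dump_record(record):
    """Dump a dict-like record in a standard, tool-friendly format."""
    return json.dumps(record, sort_keys=True, indent=2)


def load_record(text):
    """Read a dumped record back in; handy for replaying a captured state."""
    return json.loads(text)
```

A round trip through `dump_record` and `load_record` is also a cheap check that the dumper is not silently dropping anything.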



[/tech/programming] permanent link

Dusty Decks, Lisp, and Early Computing
I've found a couple interesting websites related to computer history. The first is Dusty Decks, a blog related to some efforts to reconstruct Lisp and FORTRAN history. A highlight of this is a discussion on the Birth of the FORTRAN subroutine. Also via Dusty Decks is a website on the early history of the Lisp Programming Language.

That leads me to a couple of books I've been reading lately. The first is Lisp in Small Pieces, by Christian Queinnec. I'm only a couple chapters in (stuck on continuations right now), but it's already been pretty profound. So far, the most useful aspect of the book is that it goes through several core design choices Lisp implementors have to make (Lisp-1 vs. Lisp-2, lexical scope vs. dynamic scope, types of continuations to support) and goes into depth regarding the implications and history of the choices involved. I think I'm finally starting to understand more of the significance of funcall and function in Common Lisp, not to mention throw/catch and block/return-from.

Book two is The First Computers--History and Architectures, edited by Raul Rojas. This book is a collection of papers discussing the architecture of significant early computers from the late 30's and 40's. The thing that's so unique about the book is that it focuses on the architectural issues surrounding these machines: the kinds of hardware they were built with, how they processed information, and how they were programmed. Just as an example, it has a detailed description of many of ENIAC's functional units, even going into descriptions of how problems were set up on the machine. Another highlight of the book for me (so far) has been a description of Konrad Zuse's relay-based Z3, down to the level of a system architectural diagram, schematics of a few key circuits, and coverage of its microprogramming (!).


[/tech/history] permanent link

Wed, 24 Aug 2005

I had a dream...
I literally dreamed about this last night. It would be wonderful if Excel supported formulas like this:
=LET(value=MATCH(item,range,0), IF(ISERROR(value), 0, value))
If you're into Lisp-y languages, it'd look like this:
(let ((value (match item range 0)))
  (if (is-error? value) 0 value))
The function call =LET(name=binding, expression) would create a local range name named name, bound (equal) to the value returned by binding, to be used during the evaluation of expression. In the example above, during the evaluation of IF(ISERROR(value), 0, value), value would be bound to the value returned by MATCH(item, range, 0).

It's worth pointing out that this is slightly different from how normal Excel range names work. Range names in Excel work through textual substitution. With textual substitution, the initial expression would be logically equivalent to this:

=IF(ISERROR(MATCH(item, range, 0)), 0, MATCH(item, range, 0))
In other words, Excel would treat every instance of value as if MATCH(item, range, 0) was explicitly spelled out. This means there are two calls to MATCH and two potential searches through the range. While it's possible that Excel optimizes the second search away, I'm not sure that anybody outside of Microsoft can know for sure how this is handled.
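The difference between binding a value once and substituting the expression textually can be made concrete with a call counter. The Python sketch below is only an analogy: `match` stands in for Excel's MATCH (returning None where Excel would return #N/A), and `CALLS` records how many searches actually happen. None of these names come from Excel.

```python
CALLS = {"match": 0}


def match(item, values):
    """1-based position of item in values, or None if absent; counts each search."""
    CALLS["match"] += 1
    try:
        return values.index(item) + 1
    except ValueError:
        return None


def lookup_substituted(item, values):
    # Textual substitution: the search expression appears (and may run) twice.
    return 0 if match(item, values) is None else match(item, values)


def lookup_let(item, values):
    # LET-style binding: search once, then reuse the bound value.
    value = match(item, values)
    return 0 if value is None else value
```

When the item is present, `lookup_substituted` performs two searches and `lookup_let` performs one, which is exactly the cost a LET-style binding would avoid.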

Microsoft's current recommendation for handling the specific ISERROR scenario in the first expression is this VBA function:

Function IfError(formula As Variant, show As String)

    On Error GoTo ErrorHandler

    If IsError(formula) Then
        IfError = show
    Else
        IfError = formula
    End If

    Exit Function

ErrorHandler:
    Resume Next

End Function
This isn't bad, but it requires that spreadsheet authors and readers understand VBA. It also imposes significant performance costs: calling into VBA from a worksheet takes time.


[/tech/excel] permanent link

Wed, 17 Aug 2005

Bell Labs group 1127 has been disbanded...
Group 1127, the group at Bell Labs that originally developed Unix, has been disbanded in a reorganization. I'm not exactly sure why this matters since all the original staff are gone and the remnants of systems research are elsewhere, but it did make the front page of Slashdot.


[/tech/general] permanent link

Mon, 08 Aug 2005

Nine things from MacOS X
I saw a list of nine things KDE can learn from MacOS X over on Planet KDE. This should go beyond OS X: most modern software could stand to follow this advice. (I'm a big fan of Item 2, single toolbars, and Item 3, simple default views. Toolbars got way too complicated around the time Microsoft introduced Word 6.0, and simple default views make sense for the simple reason that users should have to explicitly ask for more complex or confusing functionality.)


[/tech/general] permanent link

Arc Hub
Paul Graham solicited comments on his Arc programming language a few years ago. These comments are online, and are very interesting reading. Lots of good comments.


[/tech/lisp] permanent link

Fri, 29 Jul 2005

Interesting links and blogs...
Non-Blogs Blogs

[/tech/links] permanent link

Thu, 28 Jul 2005

List filtering with Formulas in Microsoft Excel: How it's done
Now that I've written a little about why you might want to replace Excel AutoFilter, here's how to actually do it. To frame the discussion, there are two problems to solve:
  • Deciding which rows of the input set are part of the result set
  • Displaying the result set in a contiguous sequence of spreadsheet rows.
The first problem is easy: add another column alongside the input set with a formula that evaluates to TRUE if the row belongs in the result. This can be any valid Excel formula: it can include complex logic, it can depend on other cells containing control parameters. In my example spreadsheet, this formula is in column H, labeled In Query?:



The tricky bit of the formula-based filter is the second problem: displaying the result set in a contiguous range of rows with no gaps. Each cell that might display part of the result set has to figure out itself what part of the result set to display, if any, and pull the data from the input set. A simple MATCH or LOOKUP can't handle this, since MATCH or LOOKUP can't be told to return the second, third, or nth match. They return the first match, which isn't quite enough for what we're trying to do.

As it turns out, even though having the result set compute a mapping from the input set is quite hard, solving the reverse problem isn't too bad. Having the input set compute the mapping to the result set is easy. Here's how it works, by column:
  • Ord. - The row ordinal number of the row in the input set, starting with 1.
  • Result Ord. - This column starts at zero, in the row preceding the first row of the result set, and increments by 1 for each row where In Query? is TRUE. For each row with In Query? of TRUE, this column is the row ordinal number of this row in the result set. We are almost there.
  • Result Rows. - The input row ordinal of each row in the output set. This is done by using MATCH to find the first row for each number in the Result Ord. column.
Once the Result Rows. column has been calculated, populating the actual result set is just a matter of using INDEX. ISERROR can be called on cells in Result Rows. to identify rows that don't contain values. After all this is said and done, we have a spreadsheet range that contains only a result set, updates like every other range in Excel, and can be used in formulas like every other range. I have a sample spreadsheet that implements a lot of this here.
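
The column-by-column recipe above can also be sketched outside of Excel. What follows is a minimal Python sketch, not the spreadsheet itself: the function name filter_rows and the sample data are hypothetical, and plain lists stand in for the Ord., Result Ord., and Result Rows. columns.

```python
def filter_rows(rows, in_query):
    """Mimic the formula-based filter.

    rows is the input set; in_query is the parallel list of TRUE/FALSE
    flags (the "In Query?" column).
    """
    # Ord.: 1-based ordinal of each input row.
    ords = list(range(1, len(rows) + 1))

    # Result Ord.: a running count that starts at zero above the first row
    # and increments on every row where In Query? is TRUE.
    result_ord = []
    count = 0
    for flag in in_query:
        if flag:
            count += 1
        result_ord.append(count)

    # Result Rows.: for each output position n, find the FIRST input row
    # whose Result Ord. equals n (the job MATCH(n, range, 0) does).
    # None plays the role of the error that ISERROR would detect.
    result_rows = []
    for n in ords:
        try:
            result_rows.append(result_ord.index(n) + 1)
        except ValueError:
            result_rows.append(None)

    # INDEX: pull the data for each output position; unused positions
    # at the end stay empty.
    return [rows[r - 1] if r is not None else None for r in result_rows]


data = ["apple", "bolt", "avocado", "cherry", "apricot"]
flags = [name.startswith("a") for name in data]
print(filter_rows(data, flags))
```

The key point carries over from the spreadsheet: the first occurrence of each number in Result Ord. is always the row where the count incremented, so a first-match search is enough.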

[/tech/excel] permanent link

Wed, 27 Jul 2005

Apple's new iBook
As rumored, Apple just refreshed the iBook. The other rumor, the one about a new chassis and a widescreen display, did not come true. Between that and Apple's desire not to encroach too much on the PowerBooks, there wasn't much headroom for major upgrades:
  • 2-finger trackpad scrolling.
  • Sudden motion sensing for the disk. (Is this done by the disk itself with a built in motion sensor or by the motherboard/CPU?)
  • Standard Bluetooth
  • A minor speed bump: the peak CPU is now a 1.42GHz G4 with a 142MHz bus.
I was hoping for more, but given Apple's total lack of maneuvering room in the laptop space, this is an understandable bump. If they upgraded the iBook too much, there'd be little reason to pay extra for the PowerBook. Since they can't upgrade the PowerBook too much (thanks to the stagnant G4) they have a natural cap on the features in the iBook. Thus, Apple is restricted to selling up its five year old laptop with slogans like "a fast 133MHz or 142MHz system bus" (fast? Dell's $500 Inspiron 1200 runs its system bus at 400MHz) and "brilliant 1024 by 768 pixel resolution" (maybe it was brilliant five years ago).

Anyway, I've recently come to have a theory on the limited display resolution of Apple's notebooks. It seems obvious in retrospect, but Apple can't scale up the display resolution since they don't have the CPU or memory bandwidth to support higher resolutions as well as they want. With modern display stacks like Quartz and Quartz Extreme, pushing pixels around is one of the biggest user-visible performance burdens on a modern machine (hence, "the snappy"). While a GPU can help, there's no getting around the fact that if they doubled the resolution, they'd double the number of bytes their system has to process to render the same sized desktop on the screen. Given that Apple's best G4s have less than half the main memory bandwidth of the lowest end Centrinos, it's no wonder Apple's not champing at the bit to eat up more of their bus.
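
As a rough back-of-the-envelope check on the byte counts involved, here's a small Python sketch. It assumes 4 bytes per pixel (32-bit color) and a single full-screen buffer. Neither figure comes from Apple, and real compositing touches more memory than this, so treat the numbers as illustrative only.

```python
def framebuffer_bytes(width, height, bytes_per_pixel=4):
    """Bytes in one full-screen buffer (bytes_per_pixel=4 is an assumption)."""
    return width * height * bytes_per_pixel


xga = framebuffer_bytes(1024, 768)     # the iBook's 1024 by 768 panel
wuxga = framebuffer_bytes(1920, 1200)  # a 1920x1200 panel, for comparison

print(xga)    # bytes per frame at 1024x768
print(wuxga)  # bytes per frame at 1920x1200
print(round(wuxga / xga, 2))  # how many times more data per frame
```

The byte count scales directly with the pixel count, which is the whole argument: more pixels means proportionally more traffic over the same bus.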

Since Apple's first wave of Centrino laptops should bring fixes for all of this, the computing community has some pretty amazing hardware to look forward to in a year or so.

[/tech/apple] permanent link

Mon, 18 Jul 2005

PS: I think that AutoFilter is typical of Excel...
I think that the weaknesses of the Excel AutoFilter turn out to be pretty typical of Excel in general.

To me, the brilliance of the spreadsheet was that it took a data model that business people were familiar with, the accountant's paper spreadsheet, and layered on automatic computation and reporting facilities in a natural way. There's something very intuitive about going to cell C1, entering =A1+B1, and then having C1 contain the sum of the other two cells, automatically updated as the source cells change. It just makes sense, and is at the very core of every software spreadsheet dating back to the first, VisiCalc.
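
To make the =A1+B1 idea concrete, here's a toy sketch in Python. The Sheet class is hypothetical, and it cheats: instead of tracking dependencies the way a real spreadsheet does, it simply re-evaluates a formula every time the cell is read. The visible behavior is the same, though: C1 always reflects the current A1 and B1.

```python
class Sheet:
    """Toy spreadsheet: a cell holds either a constant or a formula.

    A formula is a function of the sheet. It is re-evaluated on every read,
    which is a simplification of real dependency-driven recalculation.
    """

    def __init__(self):
        self.cells = {}

    def set(self, name, value):
        self.cells[name] = value

    def get(self, name):
        value = self.cells[name]
        return value(self) if callable(value) else value


sheet = Sheet()
sheet.set("A1", 2)
sheet.set("B1", 3)
sheet.set("C1", lambda s: s.get("A1") + s.get("B1"))  # =A1+B1

before = sheet.get("C1")
sheet.set("A1", 10)       # change a source cell...
after = sheet.get("C1")   # ...and C1 follows along with no extra work
print(before, after)
```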

For years, spreadsheets worked at making this model work better. Lotus 1-2-3 introduced something called natural recalculation order that made it easier to follow the logic of spreadsheet calculation. Somewhere along the way, spreadsheets started doing limited recalculation, where formulas that didn't change weren't recalculated (thus saving time). New intrinsic functions were added, and Excel made a huge stride when it added array formulas: individual formulas that can produce more than one result. The gateway to user defined functions written in Visual Basic was another huge win.

The core strength of all of these ideas is that they rely on and extend the core concept of the software spreadsheet: the software tracks dependencies between cells and automatically recalculates the appropriate results as necessary. As powerful as that concept is, Microsoft lost the plot somewhere around Excel 4 or 5 and keeps sinking money and effort into features that don't fully participate:
  • Excel has two data filter features: neither one can automatically update a table as a part of recalculation.
  • PivotTables don't update when their source data updates either. (For SQL data sources, this is understandable, but not so much when the source data comes from Excel itself).
  • PivotTables produce tables with missing values (to improve the formatting), which makes them very difficult to query with spreadsheet lookup functions.
  • The histogram function (among others) of the Analysis ToolPak is a one-time thing: you use it, it generates a histogram, and that's it. It's not possible to incorporate histogram generation into the dataflow driven recalculation of a spreadsheet.
  • There's no way to use an Excel formula to determine if a row is excluded or included in an AutoFilter query. Actually, there's no way to have the result set of an AutoFilter query drive spreadsheet recalculation at all.
Maybe this is being picky, but spreadsheets have a real strength in that they made it a lot easier for non-techies to specify how a computer can automatically solve certain types of problems. It's just a shame that so many of Excel's features are excluded from the natural way Excel is programmed.

[/tech/excel] permanent link

List filtering with Formulas in Microsoft Excel: Motivation
One of Excel's more interesting features for querying data sets is the AutoFilter. Applied to a table of data in a spreadsheet, the AutoFilter allows the table to be queried for subsets of data based on combo boxes in the table's header row. It's a simple way to filter out extraneous data and it can support quite elaborate query semantics (since it can filter based on values in computed cells).

However, AutoFilter is not without its problems:
  • AutoFilter imposes its own user interface: if you want a look-and-feel other than stock, you're out of luck.
  • For wide data tables with lots of columns, it can be hard to see the current AutoFilter query. To see the entire query requires horizontal scrolling across the header row.
  • Cell formatting and AutoFilter are independent of each other. If you want position dependent formatting (alternate row formatting, for example), it has to be recreated after each AutoFilter adjustment.
  • An AutoFilter works by selectively 'hiding' rows in the worksheet it's a part of. This means that an AutoFiltered list can't share rows with anything else that you don't also want selectively 'filtered' from view.
  • You can't have more than one AutoFilter on a worksheet tab.
  • AutoFilter isn't part of the natural 'ebb and flow' of the life of a spreadsheet: it doesn't participate in the dependency driven formula solver that drives Excel's computational capability. This has some profound (bad) implications:
    • As data rows are added and removed from the list being AutoFiltered, the AutoFilter has to be removed and reapplied to the new data list to reflect changes to its source.
    • You can't use AutoFilter to filter a list and then search that list with =LOOKUP() or =MATCH(): the lookup operation will search the entire list, not the filtered list.
    • If you AutoFilter a list that contains calculated cells, and those cells change value, the set of filtered rows is not updated.
Anyway, I could go on, but I hope it's pretty clear by now that there are sometimes good reasons to look for other list filter mechanisms than AutoFilter. (FYI: 'Advanced Filter' has its own limitations, some of which are very similar to AutoFilter's.) I'll post a way to get AutoFilter-like behavior directly from Excel formulas. This technique has its own issues, but it does address lots of the issues I mentioned here.

[/tech/excel] permanent link

Thu, 07 Jul 2005

Using Internet Explorer as a non-Anonymous FTP client
This is pretty well documented online, but I can never seem to find it when I need it. So, I'm putting it here too.

Internet Explorer defaults to anonymous FTP, when sometimes you need to log in with an explicit username and password. One of the lesser known features of URLs is that they allow login information to be specified as part of a web address.

     ftp://username:password@hostname/

The :password part is optional, but sometimes necessary. As the Rutgers site points out, there are security issues involved with this, particularly on public terminals. That said, FTP (RFC 959) sends passwords as unencrypted text anyway, so I wouldn't be using my most secure passwords to log into an FTP site.
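
To see how the pieces of such a URL break apart, here's a small sketch using Python's standard urllib.parse module. The username, password, and hostname are the same placeholders used above, not real credentials.

```python
from urllib.parse import urlsplit

# Full form: the login information sits between "//" and "@".
parts = urlsplit("ftp://username:password@hostname/")
print(parts.scheme)    # the protocol
print(parts.username)  # the login name
print(parts.password)  # the password, in plain text
print(parts.hostname)  # the server

# The :password part is optional; when it's left out, password is None.
no_password = urlsplit("ftp://username@hostname/")
print(no_password.username, no_password.password)
```

Note that the password is just a substring of the URL, which is exactly why this is risky on a shared machine: anything that records the address records the password too.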

Also, Microsoft have a Knowledge Base Article that describes this in more detail, including a way to log in from a menu command, if you have the right settings enabled.

[/tech/tips] permanent link

Thu, 30 Jun 2005

UseHR, High Resolution Displays, and the Internet
For a few years, I used this graphic as the front matter for my website:

michael.schaeffer


This, the logo for my website, is basically just antialiased text rendered into a bitmap. At the time, it seemed like a good idea to render the text as a bitmap because I didn't trust the browser to render it for me. Bad idea.

As it turns out, Internet Explorer rescales bitmaps on high resolution displays. This is a somewhat misguided attempt to keep bitmap sizing consistent. Bitmaps aren't rendered at 1:1 zoom; they are rendered at screen_dpi/96dpi. On non-96dpi screens, that results in ugly scaling. While scaling can be disabled, that's not the ideal solution. The ideal solution is to do as much of the rendering as possible in the browser, which should know more about the client's display than the server. Therefore, my logo is now CSS formatted plain text. That means it looks the right size on more screens, anti-aliases appropriately, and uses ClearType if it's available. The next step is going to be to switch from pixel sizes to 'real' sizes.
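
The screen_dpi/96dpi rule is easy to work through by hand, but here's a quick Python sketch of the arithmetic anyway. The 200-pixel bitmap width is a made-up example, and rounding to the nearest pixel is my assumption about what the browser does, not something I've verified.

```python
def scaled_size(pixels, screen_dpi, base_dpi=96):
    """Size a bitmap ends up on screen when scaled by screen_dpi / base_dpi."""
    return round(pixels * screen_dpi / base_dpi)


print(scaled_size(200, 96))   # 96dpi screen: drawn 1:1
print(scaled_size(200, 120))  # 120dpi screen: stretched by a factor of 1.25
```

A factor like 1.25 is where the ugliness comes from: the bitmap's pixels no longer line up with the screen's pixels, so the browser has to resample.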

[/tech/general] permanent link

Mapping XML to S-Expressions
I've been playing around with how to map XML to S-Expressions. For a while, I had been considering a mapping like the following:

From: <phone_book name="Jenny">867-5309</phone_book>
To:   (phone_book ((name . "Jenny")) "867-5309")

In other words, a symbol for the tag name in the car of the list, an association list of attribute values in the cadr, and then the subelements in the cddr. This seems reasonable, aside from the fact that attributes and tag values are still weirdly disjoint.

On the way to lunch today, I came up with another mapping that might be more reasonable:

From: <phone_book name="Jenny">867-5309</phone_book>
To:   (phone_book (name "Jenny") :end-of-attribute-marker "867-5309")

This is simpler in that a tag is modeled as a list containing the tag symbol and then all of the sub-items, attributes or not. Data stored as an attribute doesn't get special treatment relative to data stored as a tag value. The symbol :end-of-attribute-marker makes it possible to still distinguish between attributes and tags. If you don't care, a simple call to remove can remove the marker symbol.
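
Here's a rough sketch of that second mapping, written in Python using the standard xml.etree.ElementTree parser. This is not the vCalc implementation: the Symbol class and the to_sexpr and render helpers are hypothetical names made up for illustration, and nested Python lists stand in for Lisp lists.

```python
import xml.etree.ElementTree as ET


class Symbol(str):
    """A bare (unquoted) name, as opposed to a string literal."""


END_OF_ATTRIBUTES = Symbol(":end-of-attribute-marker")


def to_sexpr(element):
    """Map an element to a list: tag, attributes, marker, then contents."""
    items = [Symbol(element.tag)]
    for name, value in element.attrib.items():
        items.append([Symbol(name), value])
    items.append(END_OF_ATTRIBUTES)
    if element.text and element.text.strip():
        items.append(element.text.strip())
    for child in element:
        items.append(to_sexpr(child))
        # Text that follows a child element still belongs to this element.
        if child.tail and child.tail.strip():
            items.append(child.tail.strip())
    return items


def render(item):
    """Write nested lists in S-expression syntax, quoting plain strings."""
    if isinstance(item, list):
        return "(" + " ".join(render(i) for i in item) + ")"
    if isinstance(item, Symbol):
        return str(item)
    return '"' + item.replace('"', '\\"') + '"'


doc = ET.fromstring('<phone_book name="Jenny">867-5309</phone_book>')
sexpr = to_sexpr(doc)
print(render(sexpr))

# If the attribute/content distinction doesn't matter, drop the marker.
without_marker = [i for i in sexpr if i != END_OF_ATTRIBUTES]
print(render(without_marker))
```

The last two lines are the Python equivalent of the "simple call to remove" mentioned above.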

It's a subtle design point, but this'll probably end up in vCalc in the XML support... I've had XML for vCalc on the back-burner for a while now, but due to some real work obligations, I might have to make it a higher priority.

[/tech/lisp] permanent link

Commodore Amiga Marketing
Heh... saw some quotes on Slashdot referring to Commodore's marketing of the old Amiga. I thought they were funny enough to share here:
  • If Commodore bought KFC they would have changed the name to "Warm dead birds in a paper bucket".
  • Commodore Sushi: Cold, dead, raw fish.
I have no idea what the attribution should be for these.

[/tech/general] permanent link

Some historical context around Apple/x86
I ran across this quote the other day from I, Cringely:

"The market has stupidly decided that Intel microprocessors are better than Apple's preferred PowerPCs, so Apple will be at a disadvantage trying to sell PowerPC machines into the Intel market. This is what's right now killing Silicon Graphics, which is finding rough going pitting its MIPS processors against Intel. ... Yes, Apple will build computers with Intel processors. Their aim, as in all of these products, is for the high end. Based on Intel's new Merced chip, the new Apple machine will have PCI slots, Universal Serial Bus, Fast Ethernet, IEEE 1394 FireWire, IRDA, DIMM sockets, but no ISA slots and no backwards compatibility to DOS. So this is NOT a PC in the strictest sense, since it will only run Rhapsody, but not System 8 or Windows NT. It will run Mac applications inside Rhapsody. And because Apple is both the author of Rhapsody and the designer of this machine, Jobs believes that more customers will want to buy their Rhapsody wrapped in Apple hardware than not."

Funny thing is... that quote is from October of 1997. A lot has changed since then, but since the core reasoning was sound it probably shouldn't be too much of a surprise that he was ultimately right.

The other interesting bit was that Cringely wrote that piece around 1997, which is when the NDA for 'Project Star Trek' expired. Star Trek was a project in which a few Apple, Novell, and Intel software engineers got MacOS 7 running on PC hardware. I'm not sure what the business story would've been, but it was a nice technical accomplishment nonetheless.

[/tech/apple] permanent link

Tue, 28 Jun 2005

The Inspiron has arrived...
I haven't had as much time to play with it as I'd like, but the laptop arrived today. In the hour I've had it running, so far I'm quite impressed. A couple quick thoughts:
  • I like the keyboard: nice and solid. Since the layout is more like a Dell D600 than a D400 (what I have from work), there'll be a little getting used to it. The D400 layout puts page up and page down near the arrow keys, which I've gotten used to for reading documents. The I6000 (and D600/D800) puts page up and page down up near the display. If that gets too obnoxious, I might have to investigate remapping some of the media keys on the front of the machine to more useful keys.
  • I love the WUXGA (1920x1200) display. The machine came from the factory with large icons enabled and set to 120dpi. Set up that way, it seems readable enough to me, but my vision is so far correctable to 20/20. If smaller text adds to fatigue or is harder to read on a bouncy train, it'll be possible to enlarge text through preferences, so I'm not worried about it at all. At this point, the 1024x768 D400 is going to feel very cramped.
  • Dell still dumps its machines full of software. This machine came with several broadband offers, four media players, and a bunch of modem stuff. Most of that's getting uninstalled in the name of system stability. I already have broadband, I don't use streaming media that much, and I haven't used a modem in years.
  • XP Media edition looks the same as XP Pro, so far.


[/tech/general] permanent link

Thu, 23 Jun 2005

CGA... EGA... VGA... ... ... WQUXGA?!?!?
I thought these acronyms were dead 15 years ago, when SuperVGA and the 8514/A started to replace the VGA. Here's a good history and glossary of the terms, but what's wrong with "2.3MP, 16:10" instead of WUXGA? So much simpler, and it also provides a decent way to compare screen resolutions with digital camera resolutions.
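
For what it's worth, the "megapixels plus aspect ratio" label is trivial to compute. Here's a small Python sketch, taking WUXGA to be 1920x1200; the describe function is just an illustrative name. One wrinkle: reducing the ratio to lowest terms gives 8:5, which is the same shape that's conventionally written 16:10.

```python
from math import gcd


def describe(width, height):
    """Label a resolution as megapixels plus aspect ratio in lowest terms."""
    megapixels = width * height / 1_000_000
    divisor = gcd(width, height)
    return f"{megapixels:.1f}MP, {width // divisor}:{height // divisor}"


print(describe(1920, 1200))  # WUXGA; 8:5 is the same ratio as 16:10
print(describe(1024, 768))   # XGA
```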

[/tech/general] permanent link

The "Star of the American Road"
Oddly enough, the search term that brings the most visitors to my little website is "Texaco". I suspect most of those people go away totally unsatisfied, but there is a decent story behind the connection:

About ten years ago, a good friend of mine went to work in Texaco's IT Shop as a summer intern. One of his job responsibilities was to develop an intranet website. I forget the details, but somewhere along the way he decided he wanted to put a fancy banner picture atop the page. At the time, we were both interested in ray tracing, so we decided to throw together a raytraced version of the Texaco Star logo.

Using our copious free time, we found an online copy of the Texaco logo, took measurements of the star and rendered it as a white solid set against a metallic red hemisphere. We even went to the trouble of animating the star so it rotates, generating a bunch of frames and using a GIF tool to put together an animated GIF. The final result was a nicely animated Texaco logo with an "attractive" (This is by 1995 intranet standards, remember) banner to the side. Since then, I've dragged the model out, re-rendered it at higher resolution, and stuck it in a little Raytracing Gallery I have set up.
For some reason, that picture brings more visitors to this site than anything else. If you happen across this site and actually use the image for something, you owe its presence to a ten year old accident of fate.

[/tech/this_blog] permanent link

Inspiron 6000d (and Laptop Shopping Advice)
I just placed an order for a Dell Inspiron 6000D, using one of Dell's recent $750 off deals. With any luck, it'll ship in a couple weeks. In the course of doing research on the machine, I found this site describing James Carter's experiences with the machine. It is without a doubt the best, most comprehensive laptop review I have ever seen. If you write a product review for a laptop computer, you should emulate this.

Something else worth mentioning is that laptop vendors typically use standard parts in their hardware. While they don't publicize part numbers (partially so they can switch suppliers), it is possible to find datasheets describing things like LCD panels. While it takes some inference to figure out what part is being used, this can reveal statistics about LCD panels that might otherwise be hard to find. While Google is your friend, this site has a bunch of links to useful datasheets.

PS: I've ordered the WUXGA (>2 Megapixels, ~140dpi) display with the 128MB Radeon X300 video adapter. If on screen content isn't too small, I expect the detail to be fabulous. I'll post comments (and screenshots) when I get some experience with the machine.

[/tech/general] permanent link

Mon, 06 Jun 2005

So it's true...
I didn't believe it was possible when I first heard the rumors a few weeks ago, but here it is: Apple will transition to x86, specifically Intel, in 2006. The whole line will go x86 in 2007. Microsoft is behind the switch, as is Adobe. Interestingly, the developer transition kit has an Intel compiler at its core. I wonder why not GCC.

The next question is how well it will be pulled off. In theory it could be seamless. It needs to be.

[/tech/apple] permanent link

Apple on Intel - Not^H^H^H gonna happen
So, the big rumor is that Apple is switching to Intel processors, and Steve Jobs is going to make the announcement during his WWDC keynote address this morning (10:00AM PST). I had been planning on writing a debunking article, but now I'm not so sure. Here's why:

Reason not to switch | Counterargument
If Apple switches to Intel, they introduce another architecture break into their hardware platform. | Emulation can make existing binaries run seamlessly on Intel.
But isn't emulation really slow? | Modern emulation technology has gotten a lot better: it can compile code on the fly, just like a modern JVM or Virtual PC.
But I've run virtual machines before, and they're still really slow. | All of the operating system services can be made to run natively, at full speed. The only thing that will be emulated is the application code itself. So, except for very computation-intensive application code, things could still run smoothly.
Okay, but a lot of OS X (like Quartz Extreme) is optimized to run on Macintosh hardware. | Macintosh video hardware is the exact same as PC video hardware these days. In fact, most of the supporting hardware in Macintosh is the same as on a PC.
The PowerPC is part of Apple's 'uniqueness'. | It doesn't matter to most consumers what chip or ISA is running their software. The reason people pay for Apple, their core unique value, is their appealing design and the attention they spend developing a well integrated system. Even if Apple switches to Intel, there's no reason any of that has to change. (Anyway, they could still do something pretty unusual, like putting a Pentium M in a desktop).
Lots of new stuff in Tiger like CoreImage uses AltiVec a great deal. | CoreImage actually compiles dataflow graphs to native hardware at runtime, picking the approach that runs best on the target hardware. CoreImage could well compile to x86/SSE2 (or whatever else). That means that even a PPC binary running emulated on an Intel Macintosh could have access to full speed CoreImage services compiled to SSE2.
This will alienate existing PowerPC customers. | Why does it have to? If their emulation works well enough, Apple could easily introduce Intel hardware and retain PowerPC as the standard binary format for a while. The common case for ISVs would be to continue developing PowerPC binaries and selling into both the x86/OSX and PPC/OSX markets. The only 'schism' would arise for software vendors who had to have full performance on x86/OSX. They'd have to worry about shipping some kind of fat binary that ran on both platforms. Even then, PPC/OSX customers wouldn't see a difference.
Will Windows run on an Intel Mac? Won't that make it easier for Microsoft to drop Office for OS X? | Apple could easily make it virtually impossible to run Windows on whatever hardware they sell. With respect to Office for OS X, Microsoft doesn't really care what the target architecture is: they just want to sell licenses to Office. They'll go where the money is, and that might end up being an OSX/Intel port.


Now that I think about it, the switch to Intel would basically boil down to the same story Apple told in 1993, when it initially switched from the Motorola 680X0 to the PowerPC. Apple pulled it off well in 1993, and now they have the benefit of experience (they've done it before), better emulation technology, and an already more standard hardware platform. It seems plausible to me. The only thing that's left is to figure out why they'd do it, and I have some ideas there too:
  • They could finally move their laptops to a faster chip than the G4.
  • x86 is not going away and it's not going to end up marginalized any time soon. This could be a 'final' switch.
  • If IBM is growing cold on the desktop CPU business (and who could blame them), Apple's hand might be forced into switching away from PPC. Right now, IBM is the only high performance CPU story Apple has.
Anyway, let's see what Jobs says...

[/tech/apple] permanent link

Fri, 03 Jun 2005

Readership
Blog readership on this blog is very low. I'd go so far as to say that I'm the only person that regularly hits it, and that's usually to test layout, formatting, etc.

To make it easier to distinguish traffic from me and traffic from other folks, I've added a symlink to my configuration that makes it possible to hit the blog from a different, private, URL. That way, my hits and other folks' hits are nicely bucketed out in my ISP's reporting. This is a cheap and easy solution, and I recommend it.

[/tech/this_blog] permanent link

It looks better on an LCD, honest!
This post spoke to the use of ClearType to improve text rendering in vCalc. This "after" screenshot was taken from a laptop running ClearType:



Since ClearType depends on the unique properties of LCD's, it won't look as good on a CRT. (I still think it looks better than normal, though).

[/tech/general] permanent link

Thu, 02 Jun 2005

A Pretty Printer for Excel Formulas
This is a little add in for Excel that takes formulas and reformats them in a more readable style.

[/tech/excel] permanent link

"Saving Excursions" in Excel
Lately, I've been finding myself spending lots of time toggling between two Excel spreadsheets to make edits. This little macro makes it easy in Excel 2000 to toggle between two spreadsheet windows. I recommend you bind it to a keystroke.
Option Explicit

Dim lastWindow As Variant

Sub HereAndThere()
    If IsEmpty(lastWindow) Then
        Set lastWindow = ActiveWindow
    Else
        Dim currentWindow As Window
        Set currentWindow = ActiveWindow
        
        lastWindow.Activate
        
        Set lastWindow = currentWindow
    End If
End Sub
Here's how you use it:
  • Run the macro once to save your current location.
  • Switch to your other spreadsheet
  • Now, running the macro will switch back and forth between the two spreadsheet windows.
The "saving excursions" in the title is a reference to the save-excursion special form in Emacs Lisp. This macro isn't quite the same (and not nearly as powerful), but it reminded me of the Emacs feature. If it turns out to be useful, I might generalize my little macro to include some of the capabilities of Emacs' save-excursion.

[/tech/excel] permanent link

Wed, 01 Jun 2005

Should blog posts be written like a newspaper article?
Should blog posts be written with the essential content at the beginning and less interesting details at the end? Or is putting the punchline at the end acceptable?

Since people tend not to read online, and blogs are often read in huge volumes via an aggregator, my hunch is that the title (and maybe the first paragraph) have to convince people your article is worth the time...

More established bloggers, with a better track record of writing interesting stuff, might have more leeway.

[/tech/this_blog] permanent link

Some things never change...
I've been shopping for a laptop recently. My target specs are these:
  • Any modern laptop processor is probably adequate.
  • 1GB RAM.
  • 30-60GB Disk.
  • A DVD writer would be nice, but not necessary.
  • 14-15 inch display, the highest dot pitch I can find.
  • Reasonable 2D graphics performance, 3D is not that important to me.
  • Touchpad pointing device.
  • 3 year warranty, accident insurance is a nice plus
  • Long battery life, >3 hours.
  • Reasonable expectation of 2-3 years of reliable life.
  • Can run a couple small Windows applications I need to do my job.
  • Can run MS Office.
That's a long list, but nothing on it is very demanding. Let's see how close a couple vendors get:

             | Dell D610      | ThinkPad T4x   | Apple 15" PowerBook | Apple 14" iBook
CPU (GHz)    | Pentium M, 1.6 | Pentium M, 1.8 | G4, 1.5             | G4, 1.33
RAM          | 1GB, 2 DIMMs   | 1GB, 1 DIMM    | 1GB, 2 DIMMs        | 768MB, 2 DIMMs
Hard Disk    | 60GB           | 60GB           | 80GB                | 60GB
Optical Disk | DVD+/-RW       | DVD+/-RW       | DVD+/-RW            | DVD+/-RW
Screen       | 14.1", 1.5MP   | 14.1", 1.5MP   | 15", 1MP            | 14", 0.75MP
Warranty     | 3 year         | 3 year         | 3 year              | 3 year
Insurance    | 3 year         | none           | none                | none
Price        | $1,893         | $2,306         | $2,648              | $1,948


So, as ever, Apple is the most expensive choice, even when compared to nicer PCs like the ThinkPad.

Maybe the thing that surprises me the most about this is that Apple isn't even close to the bleeding edge of display technology. Given the energy they've put into OS X's desktop rendering pipeline, I'd expect them to have displays that could compete with Sony's XBrite or maybe the 2MP 15" widescreen that Dell makes available on the D810. OS X could drive those displays better than pre-Avalon Windows. Maybe this is an artifact of the suppliers Apple is using?

[/tech/apple] permanent link

Fri, 27 May 2005

Anti Grain Geometry
I just found out about it, but I already think it might end up in vCalc. Anti Grain Geometry is an open source 2D rendering library with a very liberal license. The feature set looks pretty comprehensive: it supports anti-aliasing, affine transforms, sub-pixel resolution, and alpha blending. Even better, it's designed as a lightweight set of C++ classes, so it shouldn't bloat or slow down vCalc too much. About the only hole is that it doesn't have any kind of built in text rendering; however, even there, there are detailed instructions for using the Windows TrueType renderer to generate glyphs.

All I need now is time...



[/tech/general] permanent link

Dell Service Manuals
This is cool... I knew IBM (er, Lenovo) did this, but Dell does it too. They have an online site with all of the service manuals and documentation for every machine they've ever sold. This includes detailed instructions on disassembling and rebuilding laptops.

Even more cool is that the archive goes back to the beginning, back when Dell was called PC's Limited.

Note: The IBM link above is actually still on the IBM site... I expect the link to break whenever Lenovo takes the contents.

[/tech/general] permanent link

Tue, 17 May 2005

Seymour Cray
For some reason, I've been thinking a lot lately about Seymour Cray. When I was growing up, I remember asking my dad about who made the fastest computers in the world, and the answer at the time was Cray. I don't know if he meant the man or the company, but for a while both were true. I suppose it made an impression.

I've found a bunch of good things online about the man and his work. Reading through them, a couple of things made impressions on me:
  • He didn't mind throwing bad ideas away (or saving them for later). The Cray 1 took a very different approach from the CDC 8600.
  • Cray failed a lot. He was always pushing the limits and taking risks, and paid the price of those risks. The CDC 8600 failed, as did several designs for the Cray 2. The Cray 3 failed to sell, and the 4 doesn't seem to have hit the prototype stage at all. Even the Cray 2 doesn't seem to have been an unqualified success, thanks to issues with memory bandwidth.
  • He had a very 'startup mentality'. His career seems to be a repeating story of initial success, spin off lab, and spin off company.
  • A lot of his design problems weren't electronic at all. He seems to have struggled as much (if not more) with packaging and cooling as with anything else.
  • He had a keen sense of style. With the possible exception of the Connection Machine CM-1/2, his machines were the most visually striking of the major supercomputers. Maybe it's superficial, but it can't have hurt the sales or publicity.
  • He knew what he had accomplished. There's a story about his surprise when Steve Chen developed the X-MP from the Cray 1 and doubled (?) the performance. Of course, the story goes on to describe how Cray ended up appreciating the new design.
Anyway, I have nothing but the utmost respect for the man and his accomplishments. R.I.P., Mr. Cray.


[/tech/general] permanent link

Better Text for GDI Applications
This is well documented on MSDN, but it's still pretty cool.

I've never been happy with the text quality of the vCalc display: it's jagged and at a font size that doesn't rasterize well on the displays I have access to. Well, as it turns out, this is relatively easy to fix. The LOGFONT structure that GDI uses to select fonts has a field, lfQuality, that is used to select the quality of the text rendering. Back in olden days, this field was used to do things like disallow scaling of bitmap fonts (if you don't know what that is, be thankful: it was awful). These days, it's used to turn on anti-aliasing and ClearType (on Windows XP). Thus, this one line of code...

lf.lfQuality = CLEARTYPE_QUALITY;

...transformed this...


...into this.


There's also a setting for anti-aliasing:

lf.lfQuality = ANTIALIAS_QUALITY;

Anti-aliasing (in Windows) dates back to the Windows 95 Plus pack, so this setting should be much more widely supported. However, it's also much less powerful: it doesn't do any of the sub-pixel stuff and it is enabled far less often. In my experimentation, non-bold fonts had to be pretty big before anti-aliasing was used at all.

The other caveat is that this doesn't automatically buy you decent formatting of the text you display. That is, if you're still computing text positioning on per-pixel increments, you'll still get mediocre layout. vCalc does this, but it also has very minimal text layout requirements for now.


[/tech/general] permanent link

Tue, 12 Apr 2005

What is a Company for?
The last line of my VB6 post was this: "Commercial vendors, particularly, have no legal obligation to their customers." To clarify this, companies are legally obligated to their owners, not their customers. Since the owners own the company and have their investment at risk, the company has to act in their interest... even if it's in opposition to their customers' interest.

Since a company has to have customers to survive, most of the time the interests of the owners are in line with the interests of the customers. However, this isn't always the case: Microsoft's VB6/VB.Net decision might be an example. If you believe that the lower costs and better prospects of VB.Net outweigh the lost goodwill of all those VB6 developers, then you can also argue that dumping VB6 was a net profitable thing to do. This is despite the fact that so many customers are paying a price for the decision.

So... if you're a VB6 developer and you're upset about the way you were treated, the best protest you can make is to make Microsoft's decision a bad one. Make it unprofitable. When it comes time to pick a replacement platform, vote with your wallets and send your dollars somewhere else (and hopefully to a platform served by more than one vendor).


[/tech] permanent link

Programming Well: Global Variables
Global variables tend to get a bad rap, kind of like goto and pointers. Personally, I think they can be pretty useful if you are careful. Here are the guidelines I use to determine if a global variable is an appropriate solution:

  • Will there ever be a need for more than one instance of the variable?
  • How much complexity does passing the variable to all its accessors entail?
  • Does the variable represent global state? (A heap free list, configuration information, a pool of threads, a global mutex, etc.)
  • Can the data be more effectively modeled as a static variable in a function or private member variable in a singleton object? (Both of these are other forms of global storage, but they wrap the variable accesses in accessor functions.)
  • Can you support the lifecycle you need for the variable any other way? Global variables exist for the duration of your program's run-time. Local variables exist for the duration of a function. If you don't have heap allocated variables, or if your heap allocator sucks, then a global variable might be the best way to get to storage that lasts longer than any one function invocation.
  • Do you need to use environment features that are specific to globals? In MSVC++, this can mean things like specifying the segment in which a global is stored or declaring a variable as thread-local.
If all that leads you to the decision that a global variable is the best choice, you can then take steps to mitigate some of the risks involved. The first thing I'd do is prefix global variable names with a unique qualifier, maybe something like g_. This lowers the risk of namespace collisions and clearly denotes which variables are global when you have to read or alter your code. If you have multiple global variables, I'd also be tempted to wrap them all up in a structure, for some of the same reasons.
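
As a rough sketch of those two mitigations together: the struct, its fields, and the accessor names below are made up for illustration and don't come from any real program. All of the global state sits in one structure behind a single g_-prefixed variable, and the accessors keep the reads and writes in one place.

```c
#include <stddef.h>

/* Hypothetical example: every global lives in one struct, reached through
   a single g_-prefixed variable, so global accesses are easy to spot. */
struct GlobalState
{
  int    verbose;      /* configuration flag */
  size_t allocations;  /* running count of allocations */
};

static struct GlobalState g_state = { 0, 0 };

/* Accessor functions keep the reads and writes in one place. */
void setVerbose(int verbose)
{
  g_state.verbose = verbose;
}

int isVerbose(void)
{
  return g_state.verbose;
}

void noteAllocation(void)
{
  g_state.allocations++;
}

size_t allocationCount(void)
{
  return g_state.allocations;
}
```

Declaring g_state as static limits it to one source file, so the rest of the program can only reach it through the accessors. Whether that extra layer is worth it depends on how many places touch the state.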


[/tech/programming] permanent link

Visual BASIC
There's been some 'controversy' in the blog world about a petition that's circulating to ask Microsoft to continue supporting "Classic" Visual BASIC in addition to the replacement VB.Net. A month ago, I had a pretty long post dedicated to the topic, but due to technical problems I wasn't able to get it online. Therefore, I'll keep this sweet and to the point.

The core problem VB6 developers are facing is that they sank lots of development money into a closed, one-vendor language. Choosing VB6 basically amounted to a gamble that Microsoft would continue to support and develop the language for the duration of a project's active life. That gamble hasn't paid off for some developers, and companies with sizable investments in VB6 code now need to figure out how to make the most of that investment while still evolving their software.

With standardized languages like C, languages with multiple tool vendors, the risk is significantly lower. If one vendor drops their version of a language, switching to another implementation is going to be a lot easier than porting to an entirely different platform (particularly if you've avoided or isolated vendor-specific features).

So... what's the moral of this story? Before you base your business on a particular language or tool, make sure you know what happens if that platform ever loses support. Pick something standardized, with multiple viable vendors. Or alternatively pick something open source, where you can take over platform development yourself (if you absolutely need to). Whatever you do, don't pick a one vendor tool and complain when the vendor decides to drop it. Commercial vendors, particularly, have no legal obligation to their customers.


[/tech/general] permanent link

Fri, 04 Mar 2005

General posts on a topic don't go at the root...
Maybe this should have been obvious from the beginning, but I'm no longer putting "general interest" blog posts at the root for the topic. Rather, those posts are now going under a "general" subtopic of the root.

The problem with making general interest posts at the root of a topic is that there's then no way to watch only the general topics. If you look at the root topic, you get the whole topic.


[/tech/this_blog] permanent link

A few good Lisp and Scheme (and Smalltalk) Related Links
It never ceases to amaze me how much good material there is online. Here's some more:

  • Olin Shivers' history of the T implementation of the Scheme programming language.
  • Aubrey Jaffer has some interesting material on interpreter performance issues at his SCM site.
  • Alan Kay's Early History of Smalltalk
  • In 1994, Richard Stallman started a debate (flamewar?) on comp.lang.tcl with a post entitled Why you should not use Tcl. The Guile scripting language was the logical outgrowth of this.
  • Kent Pitman has posted a bunch of his writings relating to Lisp and Scheme. Among other things, he edited the Common Lisp HyperSpec, which is an on-line version of the Common Lisp specification.
  • Peter Seibel has written a book, Practical Common Lisp, and has gotten permission to put it online indefinitely.

    P.S.: Be sure to check out Olin Shivers' philosophy of undergraduate advising. It's an example to be followed. ;-)

    [/tech/lisp] permanent link

    Outsourcing vs. Offshoring
    I'm writing some content for a future post on the offshoring of jobs overseas, but I want to clear something up before it gets posted: Outsourcing and offshoring are two different and orthogonal concepts. This seems to be something that gets misunderstood a great deal, but simply put, outsourcing is the movement of jobs to a different company and offshoring is the movement of jobs to a different country. Either one can be done without the other.

    The scenarios that people tend to get upset about (at least in the United States) are the scenarios involving offshoring, the movement of work overseas. Outsourcing, however, does not necessarily imply that the work gets moved to a different country: it's very common for work to be outsourced to another American business employing American workers. An example of this is hiring a Madison Avenue firm to put together an ad campaign. Sure, it'd be possible to develop the talent in house to do this yourself, but there are many advantages in outsourcing the work to a more specialized vendor.


    [/tech/business] permanent link

    Color Picker...
    This is cool. It's a web-based color picker that automatically gives you a couple different kinds of complementary colors.


    [/tech/general] permanent link

    Sony Ericsson T-637
    A few months ago, my wife and I switched from a Sanyo 4700 and a 4900 on Sprint PCS to a pair of Sony Ericsson T-637's on Cingular Wireless. Overall, the switch has been an improvement, but there are still a few nagging issues:

  • Cingular's selection of Java games is much sparser and more expensive than Sprint's.
  • There's no "Phone Ringing" Ringtone on the phone, just a bunch of generic and/or unrecognizable music files.
  • There are buttons on the side of the phone that activate the web browser and camera. These are pretty easy to hit by accident.
  • Sanyos and Nokias have this problem too, but the Sony doesn't really handle the case of multiple directory entries with the same phone number. When called by someone at a number for which I have multiple entries, I'd really like to see a list of all of the entries containing that number. (This would help handle the case of two people each with cell phones and with one home number.)
  • The incoming call logs are by number, not by call. This makes it difficult to tell when you've missed multiple calls from the same number.
  • The incoming call logs rely on automatic horizontal scrolling to reveal information like time of call and number of calls missed. This means that you have to select a log entry and sit on it for a few seconds while the phone scrolls the information you want into view. I'd much rather have some kind of details/summary view toggle button on the side of the phone. Of the four side mounted buttons, surely one could be for this.
  • There's a music editor built in that lets you compose custom ring tones. However, it only lets you work with a fixed set of clips, so it loses its appeal very quickly.

    I guess that looks like a lot of complaining, but otherwise the phone is very nice. The last phone I've liked as much is my old Nokia 8260 (and the 6160 before that). The Sanyo 4900 doesn't even come close. I'm happy enough with this phone to consider buying another Sony Ericsson. (The new W800i looks pretty nice...)

    [/tech/products] permanent link

    Wed, 02 Mar 2005

    I guess I had forgotten how slow I/O was, particularly bad I/O.
    I'm in the middle of developing a Scheme compiler for a future release of vCalc. While I've been developing the code, I've peppered it full of debugging print statements that look something like this:

    (format #t "compiling ~l, tail?=~l, value?=~l" form tail? value?)

    With the output statements in place, the compiler takes about 250-300ms to compile relatively small functions. Not great, particularly considering that there's no optimization being done at all. Anyway, on a hunch I removed the format statements, and execution time improved by a couple orders of magnitude to a millisecond or two per function. That's a lot closer to what I was hoping for at this stage of development.

    On the other hand, I hadn't realized that my (ad hoc, slapped together in an hour) format function was running quite that slowly. I think it'll end up being an interesting optimization problem sooner or later.


    [/tech/lisp] permanent link

    Tue, 01 Mar 2005

    Message Dialog
    This should be part of the Win32 API if it's not already. Basically, it amounts to a variant of the MessageBox API that allows custom button labels, rather than just "Yes", "No", "Abort", "Retry", etc.

    Maybe this kind of API would have made abominations like this less likely:



    At least users might be able to avoid calling errors "OK", of all things...


    [/tech/general] permanent link

    Programming Well: Embrace Idempotence, Part 2 (It works at runtime too)
    Idempotence has benefits at a program's run-time, as well as at build time. To illustrate, consider the case of a reference counted string. For the sake of example, it might be declared like this (In case you're wondering, no, I don't think this is a production-ready counted string library...):

    struct CountedString
    {
      int _references;
      char *_data;
    };

    CountedString *makeString(char *data)
    {
      CountedString *cs = (CountedString *)malloc(sizeof(CountedString));

      cs->_references = 1;
      cs->_data = strdup(data);

      return cs;
    }

    CountedString *referToString(CountedString *cs)
    {
      cs->_references++;
      return cs;
    }

    void doneWithString(CountedString *cs)
    {
      cs->_references--;

      if (cs->_references == 0)
      {
        free(cs->_data);
        free(cs);
      }
    }

    // ... useful library functions go here...


    The reference counting mechanism buys you two things. It gives you the ability to delete strings when they're no longer accessible; it also gives you the ability to avoid string copies by deferring them to the last possible moment. This second benefit, known as copy-on-write, is where idempotence can play a role. Copy-on-write means that whenever you write to a resource, you first make sure that you have a copy unique to yourself. If the copy you have isn't unique, copy-on-write requires that you duplicate the resource and modify the copy instead of the original. If you never modify the string, you never make the copy.

    This means that the beginning of every string function that alters a string has to look something like this:

    CountedString *alterString(CountedString *cs)
    {
      if (cs->_references > 1)
      {
        CountedString *uniqueString = makeString(cs->_data);
        doneWithString(cs);
        cs = uniqueString;
      }

       // ... now, cs can be modified at will

       return cs;
    }

    Apply a little refactoring, and you get this...

    CountedString *ensureUniqueInstance(CountedString *cs)
    {
      if (cs->_references > 1)
      {
        CountedString *uniqueString = makeString(cs->_data);
        doneWithString(cs);
        cs = uniqueString;
      }

      return cs;
    }

    CountedString *alterString(CountedString *cs)
    {
      cs = ensureUniqueInstance(cs);

      // ... now, cs can be modified at will

      return cs;
    }


    Of course, ensureUniqueInstance ends up being idempotent: it gets you into a known state from an unknown state, and it doesn't (semantically) matter if you call it too often. That's the key insight into why idempotence can be useful. Because idempotent processes don't rely on foreknowledge of your system's state to work reliably, they can be a predictable means to get into a known state. Also, if you hide idempotent processes behind the appropriate abstractions, they allow you to write code that's more self documenting. A function that begins with a line like cs = ensureUniqueInstance(cs); more clearly says to the reader that it needs a unique instance of cs than lines of code that check the reference count of cs and potentially duplicate it.
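
For anyone who wants to try this out, here's a compilable sketch of the pieces above in plain C. It's still not a production-ready string library. A few things differ from the listings above and are my own choices for the example: the struct gets a typedef, the fields drop their leading underscores, allocation failures return NULL, and capitalize is a hypothetical mutating function added only to exercise the copy-on-write path.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct CountedString
{
  int   references;
  char *data;
} CountedString;

/* Allocate a new string with a single reference; returns NULL on failure. */
CountedString *makeString(const char *data)
{
  CountedString *cs = malloc(sizeof *cs);
  if (cs == NULL)
    return NULL;

  cs->data = malloc(strlen(data) + 1);
  if (cs->data == NULL)
  {
    free(cs);
    return NULL;
  }

  strcpy(cs->data, data);
  cs->references = 1;
  return cs;
}

CountedString *referToString(CountedString *cs)
{
  cs->references++;
  return cs;
}

void doneWithString(CountedString *cs)
{
  if (--cs->references == 0)
  {
    free(cs->data);
    free(cs);
  }
}

/* Idempotent: the result always has exactly one reference, and calling
   this on an already-unique string hands back the same pointer. */
CountedString *ensureUniqueInstance(CountedString *cs)
{
  if (cs->references > 1)
  {
    CountedString *unique = makeString(cs->data);
    if (unique == NULL)
      return NULL;  /* allocation failed; cs is left untouched */
    doneWithString(cs);
    cs = unique;
  }
  return cs;
}

/* Hypothetical mutating function: upper-cases the first character. */
CountedString *capitalize(CountedString *cs)
{
  cs = ensureUniqueInstance(cs);
  if (cs != NULL)
    cs->data[0] = (char)toupper((unsigned char)cs->data[0]);
  return cs;
}
```

If two holders share a string and one of them calls capitalize, that caller ends up with its own modified copy and the other holder's string is unchanged. Calling ensureUniqueInstance a second or third time on the result does no further work.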

    Next up are a few more examples of idempotence, as well as a look into some of the pitfalls.


    [/tech/programming/idempotence] permanent link

    Michael Kaplan's Blog and a Few Other Good Links
    Another useful blog from Microsoft. Michael Kaplan has been blogging for quite some time on internationalization and other Unicode-related issues. His blog is full of deep, technical information on a part of Windows that seems to get overlooked a lot. I've been taking the very first steps toward getting vCalc (and my Scheme interpreter) to be Unicode aware, so his blog has been timely reading.

    I've also found, via Lambda the Ultimate, a website dedicated to Alexander Stepanov's papers and code. Stepanov is one of the principals behind the C++ STL (STepanov and Lee) Standard Template Library.


    [/tech/general] permanent link

    New York Laundromat
    Over the last several months, I've been spending a great deal of time in New York City on business. In that time, I have never been quite as surprised by prices as I have tonight. I wasn't even trying to do anything all that unusual, just two loads of laundry.

    Normally, I'd expect that two loads of laundry would cost about $6. The washer would be $1.50 or so per run, and the dryer would be another $1.50 per run, for a total of $6. Maybe even $7.50, if you decide to run a second dryer cycle. Even in New York (west Midtown Manhattan), I've been in apartments recently that charge about that much.

    However, this apartment is special: they use a "smart" card system to manage payments. There's a dispenser on the side of the wall that sells $7 cards for $10 (cards themselves cost $3). The dispenser also allows you to reload cards in $5 and $10 increments. Once you have a card, there are slots in each of the washers and dryers that accept the card and debit from it the $2.50 it takes to buy a cycle in one of the machines. Yes, you read that right: $2.50. $2.50 in my apartment complex buys a 34 minute washer cycle or a 30 (yes, 30) minute dryer cycle.

    So tonight, I spent $15 (200% of my estimate) and got this:

  • A $3 "smart" Card to carry around and not lose
  • Two complete loads of laundry that will inevitably end up damp, thanks to the pathetic dryer cycle.
  • $2 of "change" on my "smart" card that I will never get to spend. (Since every machine in the laundromat costs $2.50, and the card can only be reloaded with $5 or $10)

    The part of this that bothers me the most is the $3 surcharge on the smart card. Thanks to the pricing structure of the laundromat, the $3 surcharge really amounts to a $5 surcharge. This means that someone was either stupid enough not to notice that customers would always end up with $2 of useless change, or was malicious enough to use this as a sleazy way to bilk customers out of an extra $2. Not to mention that I get the hassle of trying not to lose this stupid card, lest I want to drop another $5 on yet another card.
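
The $2 isn't bad luck; it falls out of the arithmetic. A new card starts with $7.00 of credit, every reload adds a multiple of $5.00, and every cycle costs $2.50. Since $5.00 is exactly two cycles, reloading never changes what's left over once you've bought as many cycles as you can. Here's a quick sketch in C, working in cents (the function and constant names are mine):

```c
enum
{
  CARD_PRICE_CENTS  = 1000,  /* cash paid for a new card */
  CARD_CREDIT_CENTS = 700,   /* credit that comes loaded on a new card */
  RELOAD_UNIT_CENTS = 500,   /* reloads come in $5 and $10 increments */
  CYCLE_COST_CENTS  = 250    /* one washer or dryer cycle */
};

/* Credit stranded on the card after buying as many cycles as possible,
   given the number of $5 reload units added (a $10 reload is 2 units). */
int strandedCents(int reloadUnits)
{
  int balance = CARD_CREDIT_CENTS + reloadUnits * RELOAD_UNIT_CENTS;
  return balance % CYCLE_COST_CENTS;
}

/* Total cash handed over for the card plus its reloads. */
int cashPaidCents(int reloadUnits)
{
  return CARD_PRICE_CENTS + reloadUnits * RELOAD_UNIT_CENTS;
}
```

Tonight's case is one reload: cashPaidCents(1) is 1500, the card holds $12.00 of credit, four cycles use $10.00, and strandedCents(1) is 200. It's 200 for any other number of reloads too.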

    [/personal/nyc] permanent link

    Sun, 27 Feb 2005

    Jef Raskin
    Reading Slashdot today, I heard that Jef Raskin has passed away from cancer. If you don't know who Jef is, it's safe to say that you have been influenced by his ideas if you're reading this blog.

    Dr. Raskin was one of the first human interface experts to contribute to and be involved in the Apple Macintosh computer. While it's true that the design took a different direction from some of his initial ideas, he played a major role in defining the user interface ethic of the Macintosh, and consequently basically every other major computer interface.

    After leaving Apple, Jef went on to continue his ideas with the SwyftCard and Canon Cat. The best articulation I've seen of his ideas regarding interface design is in his book, The Humane Interface. He has also put a great deal of his work on his personal website.

    This is a sad day, indeed.


    [/tech/general] permanent link

    Tue, 22 Feb 2005

    Larry Osterman on Concurrency
    Larry Osterman has been running a nice series of posts on thread synchronization and concurrency. It's been full of useful tips and tricks; I particularly like part 2, Avoiding the Issue. That's a technique that's worked well for me in the multithreaded systems I've worked on. Of course, if you're writing SQL Server, etc. I'm sure you can't take nearly as simple an approach.


    [/tech/general] permanent link

    Blosxom, Annotated
    Frank Hecker, in an effort to teach himself more about Blosxom, has done a cool thing. He has taken the source for Blosxom and annotated it with extra comments to describe what it's doing each step of the way. I don't know if I'll ever use the knowledge to hack Blosxom, but it's still good reading.


    [/tech/this_blog] permanent link

    Programming Well: Embrace Idempotence, Part 1
    There's a good definition of the word idempotent over on Dictionary.com. In a nutshell, the word is used to describe mathematical functions that satisfy the relationship f(x)=f(f(x)): functions for which repeated applications produce the same result as the first. For functions that satisfy this condition, you can rest assured that you can apply the function as many times as you like, get the expected result, and not screw anything up if you apply it more times than you absolutely need. This turns out to be a useful concept for people developing software systems.
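
To make the definition concrete, here's a small sketch in C. The functions are my own examples rather than anything from the dictionary entry. Clamping a value to a range is idempotent, because a value that has already been clamped is left alone. Incrementing is not.

```c
/* Clamp x to the range [0, 100]. Applying it again changes nothing, so
   clampPercent(clampPercent(x)) == clampPercent(x) for every x. */
int clampPercent(int x)
{
  if (x < 0)
    return 0;
  if (x > 100)
    return 100;
  return x;
}

/* By contrast, incrementing is not idempotent: each call moves the value. */
int increment(int x)
{
  return x + 1;
}
```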

    One of the most common examples of this is in C-style include files. It's common practice to write code like this, to guard against multiple inclusions:

    #ifndef __HEADER_FILE_GUARD
    #define __HEADER_FILE_GUARD

    // ... declarations go here...

    #endif // __HEADER_FILE_GUARD


    This idiomatic C code protects the include file against multiple inclusions. Include files with this style of guard can be included as many times as you like with no ill effect.

    The benefit to this is that it basically changes the meaning of the code #include <foo.h> from "Include these declarations" to "Ensure that these declarations have been made". That's a much safer kind of statement to make since it delegates the whole issue of multiple inclusions to a simple piece of automated logic.
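
Here's a sketch of the guard at work. To keep it in one file, the "header" is pasted in twice, which is roughly what the compiler sees when two other headers both include it. The names (POINT_H_GUARD, struct Point, pointSum) are made up for the example. I've also avoided the leading double underscore here, since identifiers of that form are reserved for the implementation in C and C++.

```c
/* First "inclusion": the guard macro is undefined, so the declarations
   are seen and the guard gets defined. */
#ifndef POINT_H_GUARD
#define POINT_H_GUARD

struct Point
{
  int x;
  int y;
};

#endif /* POINT_H_GUARD */

/* Second "inclusion": the guard is now defined, so everything up to the
   #endif is skipped. Without the guard, this would be a redefinition error. */
#ifndef POINT_H_GUARD
#define POINT_H_GUARD

struct Point
{
  int x;
  int y;
};

#endif /* POINT_H_GUARD */

/* Code after the includes sees exactly one definition of struct Point. */
int pointSum(struct Point p)
{
  return p.x + p.y;
}
```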

    Of course, this is pretty commonplace. More is to come...


    [/tech/programming/idempotence] permanent link

    Wed, 16 Feb 2005

    The Portland Pattern Repository
    A year or two ago, I started noticing a disproportionate number of my programming queries on Google ended up at www.c2.com. A little exploration showed the web server to be the host of the Portland Pattern Repository, the original Wiki, and dedicated to software engineering related topics (for the most part). The site is highly worth spending some time reading.


    [/tech/general] permanent link

    Joel is Right
    Jon Galloway recently posted a set of six counterarguments to Joel Spolsky's assertion that Computer Science college students should learn C. I take issue with all six of Jon's arguments, but my core arguments boil down to these two points:

  • An understanding of C, and the issues it raises, is essential to programming well in higher level environments.
  • Interesting and useful work is still being done in C.

    Now, speaking to Jon's particular issues

    1. It's not a skill you'll use in most of the software development jobs you'd want to have

    ...here are the kind of things you might use C for these days - writing some kind of device driver, maintaining extremely old applications, embeded development on very limited hardware, or maybe writing an operating system or compiler or something. I guess you could like hack away on the Linux (or is that GNU-Linux) kernel....

    Having done this work for seven years (and enough of IT-style work to know the difference), this is some of the most interesting stuff you can do with a computer science education. In C, I've developed two programming languages, an object oriented framework for distributed real-time process control applications, a bunch of objects using that framework, and yes, my share of device drivers and RTOS extensions. It's been interesting and deep technical work, despite the fact that it was wrapped in a plain C wrapper.

    "...Consider this though - you're not really going to be solving any new problems...."

    Not much commercial software work involves solving truly new technical problems. If you really want that, pick a problem to solve, get your Ph.D. and enter a research environment.

    Otherwise, there are still plenty of interesting systems to build and lots of deep thinking to do in commercial software development. Some of those jobs require C. Even more of those jobs require a mastery of the skills that C requires you to have and develop.

    "...If you want to do web development, database development, desktop applications - you know, stuff you can talk about at a party - you're probably not going to be working in C....

    This is a weird argument... I'm not sure why this is stuff that's more suitable for small talk at a party than any other programming work. All of it seems equally bad, actually. In any event, do you really want to choose your career (>=40 hours a week for years) based on what plays well at a party?

    2. It can give you a false sense of control

    "...Worse still is that it can make you think that programming is about telling the computer what to do with its registers and memory. ..."

    At its core, that is exactly what programming is about: telling a computer how to accomplish useful tasks in a language/vocabulary it can understand (i.e. bits/bytes, registers, memory, etc.). It's nice to build abstractions atop that so we don't have to think about moving bytes around, but if those abstractions ever leak, you will have to know why to effectively deliver software.

    3. It can teach you to get in the way

    "... If all you learned from C is that you are the boss, you will most certainly write code that plays poorly with others..."

    This is probably true, particularly if you treat other languages like they were C.

    The power of learning C is that it forces you to take control. To program effectively in C, you have to understand how higher level constructs like strings, objects, processes, etc. map down to the basic concepts supported. You have to explicitly think about every memory allocation, where the storage is allocated, and when the memory gets freed. You have to think about when values are passed by reference, and by value. Compared to Java and C#, you have to think about a lot of things that, in the modern world, seem like low-level trivia.

    This is the whole point. Even if you never touch C again, the language you do end up using has to solve exactly these problems. And it likely does it, via some mechanism like garbage collection, that imposes its own costs and constraints. If you don't understand these low-level mechanisms, and the constraints they impose, you can't be considered a fluent programmer in whatever language you use. This is true for the same reason that, when I went to school, around 1994, I had to learn Motorola 68000 assembler code. I've never written 68K assembler commercially, but that coursework made it crystal clear what various high level language constructions cost.
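
To make the by-value/by-reference and allocation questions concrete, here's a small sketch in C. The struct and function names are made up for illustration. In each case the programmer has to know which copy is being modified and who is responsible for freeing the storage.

```c
#include <stdlib.h>
#include <string.h>

struct Account
{
  int balance;
};

/* Pass by value: the function works on its own copy of the struct,
   so the caller's struct is unchanged. */
void depositByValue(struct Account account, int amount)
{
  account.balance += amount;
}

/* Pass by pointer (C's form of pass by reference): the caller's struct
   is the one that gets modified. */
void depositByPointer(struct Account *account, int amount)
{
  account->balance += amount;
}

/* Heap allocation: the caller owns the returned storage and must free()
   it. Returns NULL if the allocation fails. */
char *duplicateString(const char *source)
{
  char *copy = malloc(strlen(source) + 1);
  if (copy != NULL)
    strcpy(copy, source);
  return copy;
}
```

A garbage-collected language makes the same decisions about copies and storage lifetime. It just makes them for you.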

    4. It can make it hard for you to love framework based development

    "...To be productive as a programmer these days, you either need to be learning to get the most out of existing frameworks or helping to build tomorrow's frameworks. ..."

    It's possible to build frameworks in C, as well as to use them. One of Jon's framework examples, Gtk, is written in C. The Gimp is an example of an application that is written in C, and based on GTK. There are plenty of other examples of C-based framework development.

    5. It can teach you philosophies which will prevent you from really understanding modern programming

    It teaches the philosophies on which modern programming is built. Basically all modern OS's are built in C, at the core. Basically all commercial run-time environments are built in C, at the core. Lower-level still, modern CPU's and ISA's evolved in a time when C was king, and are well suited to running compiled C code.

    6. It can teach you divert your problems from the real challenges of software engineering

    "...The point is, today's software development environment is dynamic, evolving, and extremely challenging. If you're going to be of help, you need to do something more productive with your time than learn a 20 year old language..."

    If you're going to be of help, you can do things more productive with your time than encouraging people not to learn about the core aspects of their profession.

    I'll be the first to admit that you can get away with never programming professionally in C, but your programming will ultimately suffer for not knowing it.



    [/tech/general] permanent link

    Wed, 09 Feb 2005

    ./blosxom.cgi: 444 lines, 16674 characters.
    I've spent a little time doing some things to tweak Blosxom so that it fits better into my website. So far, I've:

  • Changed the html flavour to refer to my CSS file and use it correctly.
  • Set up a simple hierarchy of post topics.
  • Gotten static rendering working (as a test, it's not in use now.)

    None of this is all that earth shattering, but it was all trivial to do in Blosxom. For a one-file, 16K Perl script, Blosxom brings a lot to the table.

    Next on the agenda is getting a web form set up for posting and hopefully editing blog posts, and then setting up a web-based way to upload images into the blog. My current workflow for posting to the blog involves two levels of nested SSH logins and the use of vi. *shudder*.

    [/tech/this_blog] permanent link

    Tue, 08 Feb 2005

    A couple Lisp/Programming Language Blogs
    One interest of mine is programming languages, and more specifically, Lisp and Scheme. Lately, the blogosphere has produced a couple interesting blogs that tie to this interest:

  • Lambda the Ultimate
  • Planet Lisp

    Planet Lisp aggregates a bunch of Lisp-related blogs, while LtU is more general and more of a discussion site.

    Related, I came to LtU via Eric Lippert's Blog over on Weblogs @ASP.Net. Eric is a developer at Microsoft who's done a lot of work on Windows scripting and the Windows script languages.

    [/tech/lisp] permanent link

    vCalc
    vCalc is the other side project I have going on right now. It's a simple RPN style calculator written for Win32. Underlying vCalc is a Scheme interpreter that I talk about a little here. The ultimate goal for vCalc is to have a calculator that can be easily extended with Scheme functions, in addition to the keystroke sequences you might expect. As it turns out, there are a lot of interesting problems that crop up trying to make this work right. I hope to blog more on this in the future.

    Like Noisemaker, vCalc is shareware available through IceGiant.




    [/tech/ectworks/vcalc] permanent link

    Noisemaker
    One of my side projects is a little tool called Noisemaker. Noisemaker is a utility that runs in the background and generates white noise over the computer's speakers to mask out distractions like the TV, phone, annoying co-workers, etc... If you need it, you need it badly...

    It's available as a shareware program at IceGiant Software.


    [/tech/ectworks/noisemaker] permanent link

    First Post
    Welcome to my little blog. This is the first post.

    It took a while, but I decided to base the thing on Blosxom. Nice and simple...


    [] permanent link