Wed, 14 Dec 2005
- Prefer machine readable formats - "Pretty printers" for reports have a lot of utility: they can make it easy for users to view and understand results. However, they also have disadvantages: it's harder to use "pretty" reports for the further downstream processing that someone will inevitably want to do. This is something that needs to be considered carefully, keeping your audience in mind, but if possible, pick a format that a machine can easily work with.
- Use a standard file format - There are lots of standard formats available for reports and feeds: XML, CSV, Tab Delimited, S-Expression, INI File, etc. Use one of these. Tools already exist to process and manipulate these kinds of files, and one of these formats will be able to contain your data.
- Prefer the simplest format that will work - The simpler the format, the easier it will be to parse/handle. CSV is a good example of this: XML is sexier and much more powerful, but CSV has been around forever and has many more tools. A good example of what I mean is XML support in Excel. Excel has been getting XML support in the most recent versions, but it's had CSV support since the beginning. Also, from a conceptual standpoint, anybody who can understand a spreadsheet can understand a tabular file, but hierarchical data is considerably more complex a concept. (In business settings, there's a very good chance your feed/report audience will be business analysts that know Excel backwards and forwards but have no technical computer science training.)
- Prefer delimited formats to formats based on field widths - The thing about having columns based on field widths (column 1 is 10 characters wide, column 2 is 20, etc.) is that you have to remember and specify the field widths when you want to extract out the tabular data. In the worst case, without the column widths you can't read your file at all. In the best case, it's just something else you have to do when you load a file.
- If you specify column names, ensure they are unique. - This isn't necessary for a lot of data analysis tools, but some tools (cough... MS Access) get confused when importing a table with multiple columns of the same name.
- Include a header that describes the feed. - To fully understand a file, you have to know what it contains and where it came from. This is useful both in testing (did this report come from build 28 or build 29?) and in production (when was this file generated?). My suggestions for header contents include:
- The version of the report specification
- Name of the source application
- Version of the source application (This version number should be updated with every build.)
- Environment in which the source application was running to produce the report.
- The date on which the report was run
- If the report has an effective date, include it too.
- Document your report - Without good, precise documentation of your file format, it'll be very hard to reliably consume files in the format. Similarly, have as many people as possible peer review your file format. Even if your system's code is complete garbage, the file format represents an interface to your system that will possibly live much longer than the system itself.
Wed, 16 Nov 2005
This is the awful default:
This is as it should be:
Now, guess what Firefox does.
Wed, 09 Nov 2005
"Thirty days hath September,
All the rest I can't remember.
The calendar hangs on the wall;
Why bother me with this at all?"
—http://leapyearday.com/30Days.htm
Here's an Excel one-liner that computes the number of days in a particular month. Cell A2 contains the year of the month you're looking for, and cell B2 contains the month's ordinal (1=January, 2=February, etc.):
=DAY(DATE(A2,B2+1,1)-1)
This is mainly useful to illustrate what can be done with Excel's internal representation of dates. Dates and times in Windows versions of Excel are normally stored as serial numbers that count days, with January 1st, 1900 as day 1. You can see this by entering a date in a cell, and then reformatting the cell to display as a number rather than a date. For example, this reveals April 1st, 2004 to be represented internally as the number 38078: it is day 38,078 in Excel's count, which starts at January 1st, 1900 and includes a February 29th, 1900 that never actually existed.
The formula above relies on this in its computation of the number of days in a month. The sub-expression DATE(A2,B2+1,1) computes the date number for the first day of the month immediately following the month we're interested in. We then subtract one from that number, which gives us the date number for the last day of the month that we are interested in. The call to DAY then returns the number of the day within the month, which happens to be the number of days in the month.
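For readers who think better in code, here is a rough Python sketch of the same trick: find the first day of the following month, step back one day, and read off the day number. The function names are made up for illustration, and the serial-number helper assumes the Windows 1900 date system and only matches Excel for dates from March 1st, 1900 onward.

```python
from datetime import date, timedelta

def days_in_month(year, month):
    """Day number of the last day of the month, found by stepping back
    one day from the first day of the following month."""
    # Excel's DATE() rolls month 13 over into the next year; Python's
    # date() does not, so December needs the rollover spelled out.
    if month == 12:
        first_of_next = date(year + 1, 1, 1)
    else:
        first_of_next = date(year, month + 1, 1)
    return (first_of_next - timedelta(days=1)).day

def excel_serial(d):
    """Serial day number as shown by Windows Excel (assumption: 1900 date
    system; valid for dates on or after 1900-03-01)."""
    return (d - date(1899, 12, 30)).days

print(days_in_month(2004, 2))           # February of a leap year
print(excel_serial(date(2004, 4, 1)))   # the example date from the text
```

The December special case is the only place the sketch differs from the formula: Excel handles the rollover inside DATE itself.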
Fri, 04 Nov 2005
- An internal representation for compiled byte code functions.
- A way to interoperate with C code that expects binary data formats. (Like the Win32 API, for example.)
- A way to represent binary data longer than a byte that's written to and read from binary ports.
Related to that is this deck of slides written by Kent Pitman and Peter Norvig. It's an excellent discussion of good programming style in Lisp.
This works really well as long as "function block scheduling" is one of the categories into which you've subdivided your features list. If it's not, you have to get a little creative to filter your list. One approach to this problem I've found useful is filtering based on columns populated with a formula similar to this:
=IF(ISERROR(SEARCH($K$5,K6)),"No","Yes")
If column K contains feature descriptions, this formula returns "Yes" if the description matches the search string in K5 and "No" otherwise. Filtering based on this formula makes it possible to display every list item whose description matches a word. If there is more than one column to search, you can use string concatenation to aggregate the columns together:
=IF(ISERROR(SEARCH($K$5,K6&L6&M6)),"No","Yes")
So, why the name apropos? Follow this link.
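As a rough illustration of what the filter column computes, here is a small Python sketch. It assumes each row is a tuple of text columns and only mimics the case-insensitive substring behaviour of SEARCH; it ignores SEARCH's wildcard support, and the function name is hypothetical.

```python
def apropos(rows, needle):
    """Return the rows whose concatenated columns contain needle,
    ignoring case (a simplified stand-in for Excel's SEARCH)."""
    needle = needle.lower()
    # Joining the columns mirrors K6&L6&M6 in the formula above.
    return [row for row in rows if needle in "".join(row).lower()]

features = [
    ("Function block scheduling", "core"),
    ("Report export", "CSV"),
    ("Scheduler UI", "blocks"),
]
for row in apropos(features, "sched"):
    print(row)
```

Like the concatenating formula, this can match text that straddles two columns; joining with a separator character would avoid that if it matters.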
Tue, 11 Oct 2005
Well, if you can't wait for Excel 12, Excel is pretty darned powerful as it is, and as Mr. Gainer states: most of these scenarios have formula-based approaches that work right now. Here are some of the approaches for current versions of Excel:
- With data bars, color scales, or icons based on the numeric value in the cell, percentages, percentiles, or a formula. See the posts on data bars, color scales, and icon sets for more information on each of these. - This approach to 'databars' generalizes to formula-based scaling, although it's not as pretty, not a color scale, and not an icon set.
- Containing, not containing, beginning with, or ending with specific text. For example, highlighting parts containing certain characters in a parts catalog. - Use a formula: a lot of these conditions can be tested using FIND: =FIND(string, A1)=1, checks for parts that begin with string, for example.
- Containing dates that match dynamic conditions like yesterday, today, tomorrow, in the last 7 days, last week, this week, next week, last month, this month, next month. For example, highlight all items dated yesterday. The great part about these conditions is that Excel handles calculating the date based on the system clock, so the user doesn't need to worry about updating the condition. - Use a formula: the system date is available via NOW(), and Excel offers plenty of date arithmetic functions to check for specific conditions.
- That are blank or that are non-blank. - Use a formula: =ISBLANK(A1) or =NOT(ISBLANK(A1))
- That have errors or that do not have errors. - Use a formula: =ISERROR(A1) or =NOT(ISERROR(A1))
- That are in the top n of a selected range (where n is whatever number you want) OR that are in the top n percent of a selected range (again, where n is adjustable). For example, highlighting the top 10 investment returns in a table of 1,000 investments. - Use a formula: =RANK(A1, range)<=n. (By default, RANK gives the largest value a rank of 1, so the top n values are the ones ranked n or better.)
- Cells that have the bottom n values OR cells that are the bottom n percent of a selected range. - Use a formula: =RANK(A1, range)>ROWS(range)-n.
- Cells that are above average, below average, equal to or above average, equal to or below average, 1 standard deviation above, 1 standard deviation below, 2 standard deviations above, 2 standard deviations below, 3 standard deviations above, 3 standard deviations below a selected range. - This type of thing can be solved using a particular form of formula: =A1<(AVERAGE(range)-n*STDEV(range)) or =A1>(AVERAGE(range)+n*STDEV(range)). For large ranges, it probably makes sense to move the computation of AVERAGE and STDEV into a cell, and have the conditional format reference (with an absolute reference) that cell.
- Cells that are duplicate values or, conversely, cells that are unique values. - Use a formula: =COUNTIF(range, A1)=1 or =COUNTIF(range, A1)>1. Ensure that the range you use in the formula has an absolute address. If your range is sorted on the 'key' field, you can use this style of formula: =A1<>A2. This can be much, much faster, particularly for large tables. (For the Comp. Sci. types it's O(N), rather than O(N^2), once you have sorted data.)
- Based on comparisons between two columns in tables. For example, highlight values where values in the "Actual Sales" column are less than in the "Sales Target" column. - Use a conditional format formula: =A1<B1. Apply it to the entire column you want shaded, and Excel will evaluate the formula separately for each cell. The cell references in the format formula are relative to the current cell in the selected range. The current cell is the cell in the range that is not highlighted (but is surrounded by a selection border), and can be moved around the four corners of the range with Control+. (period).
- When working with tables, we have also made it easy to format the entire row based on the results of a condition. - Relative formulas can be made to do this: select an entire range, and define a conditional formula using absolute column addresses (i.e. =$A1). Excel evaluates the format formula for each cell in the range, and since the column addresses are absolute, each cell in a row will pull from the same columns. Therefore, each cell in a row will share the same conditional format, which is what we want.
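The performance remark in the duplicate-values item above is easier to see in code. This is a minimal Python sketch, not anything Excel actually runs: the first function mirrors the COUNTIF style (every cell rescans the whole range), the second relies on sorted data and only looks at neighbours. Unlike the =A1<>A2 formula, the sketch checks both neighbours so that every member of a duplicate group gets flagged; both function names are made up.

```python
def duplicates_by_count(values):
    """Mirror of =COUNTIF(range, A1)>1: each item scans the whole list,
    so the total work grows as O(N^2)."""
    return [values.count(v) > 1 for v in values]

def duplicates_sorted(values):
    """Assumes values is already sorted: a value is duplicated exactly
    when it equals a neighbour, so one O(N) pass is enough."""
    n = len(values)
    return [
        (i > 0 and values[i] == values[i - 1])
        or (i < n - 1 and values[i] == values[i + 1])
        for i in range(n)
    ]

keys = [1, 2, 2, 3, 3, 3, 4]
print(duplicates_by_count(keys))
print(duplicates_sorted(keys))
```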
Fri, 07 Oct 2005
=REPT("█",A1)&REPT("▌",ROUND(A1-FLOOR(A1,1),0))
That formula evaluates to a bar of length A1 units, rounded to the nearest 0.5. Rescaling can be done in another cell. If you're interested in a bar that can be right-justified, you can use this:
=REPT("▐",ROUND(A1-FLOOR(A1,1),0))&REPT("█",A1)
The trickiest part about this is getting the block characters into the formula. For that, I recommend using the Windows Character Map.
Qualitatively compared to VBA, this method requires more logic to be represented in the spreadsheet: that adds complexity for readers and makes it trickier to set up than the VBA. On the other hand, it avoids the performance hit of calling a UDF and the requirement that the spreadsheet contain a macro. I honestly don't know which is better style, but can say that this would be a perfect time to use a parameterized range name (if Excel had such a thing).
This will be a nice way to look for trends/outliers, but I can also see it being useful for tracking parallel completion percentages in status reports, etc. Of the Excel 12 features announced so far, this is the one that I'm the most excited about. Of course, it's also the one that's easiest to approximate in Excel <12. Andrew has an approach using Autoshapes on his blog, and I'm going to present a slightly different approach.
IMO, his approach looks a lot better, but this approach has the benefit of updating automatically. Pick your poison.
It all centers around this little UDF:
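Option Explicit

' Returns a text bar whose length is x rescaled from [Low, High] to [0, ScaleTo].
Function GraphBar(ByVal x As Double, _
                  ByVal Low As Double, _
                  ByVal High As Double, _
                  ByVal ScaleTo As Double) As String
    Dim i As Long
    Dim blockFull As String
    Dim blockHalf As String

    GraphBar = ""

    ' An empty scale (High = Low) would divide by zero; return an empty bar.
    If High = Low Then Exit Function

    ' Rescale x from [Low, High] to [0, ScaleTo].
    x = ((x - Low) / (High - Low)) * ScaleTo

    blockFull = ChrW(9608) ' full block
    blockHalf = ChrW(9612) ' left half block

    ' One full block per whole unit...
    For i = 1 To Fix(x)
        GraphBar = GraphBar & blockFull
    Next

    ' ...plus a half block if the fractional part is more than 0.5.
    If x - Fix(x) > 0.5 Then
        GraphBar = GraphBar & blockHalf
    End If
End Function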
This isn't rocket science: all it does is rescale x from the range [Low, High] to the range [0.0, ScaleTo]. Then, it strings together that many ChrW(9608)'s, followed by a ChrW(9612) if the scaled value's fractional part is >0.5. The trick in this is that ChrW(9608) and ChrW(9612) are VBA expressions that produce the Unicode equivalents of the old line drawing characters IBM put in the original PC [1]. 9608 is a full box ("█"), 9612 is a half box on the left ("▌"). The result of this function ends up being a string that (when displayed as Arial) looks like a horizontal bar ("████▌"). Put a few of those in adjacent cells, and you get this:
The formula in C2 (and filled down) is =GraphBar(B2,MIN(B$2:B$8),MAX(B$2:B$8),5). The MIN and MAX set the scale, and the 5 sets the maximum length of a bar. The maximum length, font size, and column width can be tweaked to produce a reasonably attractive result, although I do recommend using vertical centering.
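If you'd rather experiment outside of Excel, the same rescale-and-draw idea can be sketched in a few lines of Python. This is only an illustration under the same assumptions as the UDF (a half block is added only when the fractional part is strictly greater than 0.5); graph_bar and graph_column are hypothetical names, and graph_column plays the role of the MIN/MAX arguments in the worksheet formula.

```python
FULL_BLOCK = "\u2588"  # same character as ChrW(9608)
HALF_BLOCK = "\u258c"  # same character as ChrW(9612)

def graph_bar(x, low, high, scale_to):
    """Rescale x from [low, high] to [0, scale_to] and draw it as text."""
    if high == low:
        return ""  # nothing sensible to draw on an empty scale
    scaled = (x - low) / (high - low) * scale_to
    whole = int(scaled)  # truncates toward zero, like VBA's Fix
    bar = FULL_BLOCK * whole
    if scaled - whole > 0.5:
        bar += HALF_BLOCK
    return bar

def graph_column(values, scale_to=5):
    """Draw one bar per value, scaled to the column's own min and max."""
    low, high = min(values), max(values)
    return [graph_bar(v, low, high, scale_to) for v in values]

for value, bar in zip([0, 2, 7.5, 8], graph_column([0, 2, 7.5, 8], scale_to=4)):
    print(f"{value:>4} {bar}")
```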
If you want to get a little fancier, conditional formatting works on plot cells...
...whitespace can possibly improve the appearance...
...and this technique can scale.
[1] The original PC didn't have standard graphics; they were an option. If you bought the monochrome, non-graphics video board, characters like this were as close as you could get to a bar chart.
Thu, 06 Oct 2005
In Excel, Control+Y is the 'other half' of the Undo/Redo pair. If you undo an action and want to redo what you just undid, Control+Y undoes the undo, so to speak. However, if you haven't undone anything, and there's nothing on the redo queue, Control+Y repeats the last single action you took.
Repeatable actions can actually be quite complex. For example, opening the Format Cell dialog box and applying a format counts as one repeatable action, regardless of how many format attributes you change. Once you make that format change to one cell, and before you do anything else, Control+Y has become a key that applies that specific format change to as many other cells as you like.
In a sense, Control+Y is a command that's eternally bound to a simple macro that Excel keeps updating with your last action. If you plan your work to group actions together, this 'automatic' macro can save a lot of time.
Mon, 03 Oct 2005
Not too long ago, I made a post that describes how to replicate some of the behavior of Excel AutoFilters using a purely formula-based approach. One of the arguments I put forward in support of that technique is that it makes it possible to use filtered result sets to drive other calculations. However, the approach also has two disadvantages: it's slow to compute and can be a little tricky to set up and understand. As a sort of intermediate ground between using the AutoFilter and re-implementing it, this post describes how an Excel formula can determine if a row is a member of an AutoFilter result set. The magic bit is this little user-defined function:
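' Returns True only if every row and column of rng is visible,
' i.e. has a non-zero height or width.
Function IsVisible(rng As Range) As Boolean
    Dim row As Range
    Dim col As Range

    IsVisible = True

    ' Rows hidden by an AutoFilter (or by hand) report a RowHeight of zero.
    For Each row In rng.Rows
        If row.RowHeight = 0 Then
            IsVisible = False
            Exit Function
        End If
    Next

    ' Hidden columns report a ColumnWidth of zero.
    For Each col In rng.Columns
        If col.ColumnWidth = 0 Then
            IsVisible = False
            Exit Function
        End If
    Next
End Function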
Given a range, this function returns true if every cell in the range
is visible (non-zero row height and column width). The way Excel
works, the Row Height of a row hidden by the Autofilter is reported
as zero. Therefore, IsVisible returns false when given a reference to
a cell in a hidden AutoFiltered row. Of course, it also returns False
for cells in manually hidden rows and columns, but if you're careful,
you can avoid that.
For a simple use case, this function can be used to generate alternating color bars that always alternate regardless of the AutoFilter settings. To set it up, put TRUE in the topmost cell of a free column next to the AutoFilter to be colored. Below the TRUE, fill down with a formula like this: =IF(IsVisible(D2),NOT(D1),D1). This formula inverts the value in the column, but only for cells that are visible. This guarantees that regardless of the AutoFilter settings, this column will always alternate TRUE/FALSE in the set of visible rows. This column can then be used to drive a conditional format that highlights alternating visible rows.
A few side notes:
- This function works because adjusting an AutoFilter triggers recalculation, and Excel notices that this function depends on row heights. For hiding columns, it's a lot less reliable. All the calls to IsVisible have to be forced to recompute after the column is hidden or displayed. To do this, IsVisible can be marked as volatile and recalculation forced by pressing F9. This is a lousy solution.
- To optimize performance, the function short-circuits its search. The Exit Function statements bail out of the calculation as soon as the first hidden row or column is discovered.
- Excel's SUBTOTAL intrinsic function is also sensitive to AutoFilter settings.
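The alternating-band column described above is really just a running toggle that only flips on visible rows. Here is a minimal Python sketch of that logic, assuming the visibility of each row is already known as a list of booleans (in the spreadsheet, that is what IsVisible supplies); band_flags is a hypothetical name.

```python
def band_flags(visible, start=True):
    """For each row, carry the previous flag forward and invert it only
    when the row is visible, like =IF(IsVisible(...),NOT(prev),prev)."""
    flags = []
    current = start  # plays the role of the TRUE in the topmost cell
    for is_visible in visible:
        if is_visible:
            current = not current
        flags.append(current)
    return flags

visible = [True, False, False, True, True, False, True]
flags = band_flags(visible)
# Only the flags on visible rows matter for the conditional format.
print([flag for flag, shown in zip(flags, visible) if shown])
```

However many hidden rows sit between two visible ones, the flags seen on the visible rows keep alternating.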
Tue, 20 Sep 2005
As a sidenote, this reminds me a little of how LabView handled subfunction definitions: subfunctions are defined using the same visual tools as top-level functions. It worked, but 'felt' a little heavy weight in actual use.
The first time the macro MaybeToggleMAR is invoked, it displays the current state in the status bar, and sets a timer to expire in 3 seconds. If the macro is invoked a second time before the timer expires (easy to do if it's bound to a keystroke) the state is toggled. Technically speaking, the trickiest bit is that the function that sets the 3 second timer also has to handle cancelling any previous instance of the same timer. It works without the timer cancellation, but without it, the UI behaves oddly after multiple keypresses in rapid succession.
Chip Pearson's website has useful content discussing the Excel APIs for Timers and the Status Bar.
Here's the code: to use it, stick it in a module and bind MaybeToggleMAR to the keyboard shortcut of your choice.
Option Explicit

' True while a second keypress should toggle the setting.
Private MARChangesEnabled As Boolean
' Time at which the pending DisableMARChanges call is scheduled.
Public NextDisableTime As Double

' Timer callback: restore the default status bar and disarm the toggle.
Sub DisableMARChanges()
    Application.StatusBar = False
    MARChangesEnabled = False
End Sub

' (Re)start the 3 second timer, cancelling any timer already pending.
Sub DisableMARChangesSoon()
    ' Cancelling fails if no timer is pending; ignore that error only.
    On Error Resume Next
    Application.OnTime NextDisableTime, "DisableMARChanges", , False
    On Error GoTo 0

    NextDisableTime = Now + TimeSerial(0, 0, 3)
    Application.OnTime NextDisableTime, "DisableMARChanges", , True
End Sub

' First press shows the current state; a second press within 3 seconds toggles it.
Sub MaybeToggleMAR()
    Dim NewStatusText As String
    NewStatusText = ""

    If MARChangesEnabled Then
        Application.MoveAfterReturn = Not Application.MoveAfterReturn
        NewStatusText = "Status changed: "
    Else
        MARChangesEnabled = True
        NewStatusText = "Second press will change status: "
    End If

    If Application.MoveAfterReturn Then
        NewStatusText = NewStatusText & "MoveAfterReturn Enabled"
    Else
        NewStatusText = NewStatusText & "MoveAfterReturn Disabled"
    End If

    Application.StatusBar = NewStatusText
    DisableMARChangesSoon
End Sub
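Stripped of the Excel plumbing, the macro is a tiny state machine: a press either arms the toggle or, if it is already armed, flips the setting, and every press restarts the timeout. The Python sketch below models just that logic so it can be poked at without a spreadsheet. It is an illustration, not a translation of the Excel API: the class name is made up, and the clock is passed in as a plain number of seconds instead of using a real timer.

```python
class DoublePressToggle:
    """First press reports the state; a press while armed toggles it.
    Every press re-arms the toggle for `timeout` seconds."""

    def __init__(self, timeout=3.0, state=True):
        self.timeout = timeout
        self.state = state
        self.armed_until = None  # deadline of the current "timer", if any

    def press(self, now):
        armed = self.armed_until is not None and now < self.armed_until
        if armed:
            self.state = not self.state
            message = "Status changed: "
        else:
            message = "Second press will change status: "
        # Replacing the deadline is the equivalent of cancelling the old
        # timer and scheduling a new one.
        self.armed_until = now + self.timeout
        return message + ("Enabled" if self.state else "Disabled")

toggle = DoublePressToggle()
for t in (0.0, 1.0, 2.0, 10.0):
    print(t, toggle.press(t))
```

Because the deadline is simply overwritten, the "cancel the previous timer" step that makes the VBA fiddly disappears in this model.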
Wed, 07 Sep 2005
- The more use this kind of scaffolding code gets, the more cost-effective it is to write. Time spent before dumpers are in place reduces the amount of use they can get and makes them progressively less cost-effective. Implement them early, if you can.
- Look for cheap alternatives already in your toolkit: Lisp can already print most of its structures, and .NET includes object serialization to XML. The standard solution might not be perfect, but it is 'free'.
- Make sure your dumpers are correct from the outset. The whole point of this is to save debugging time later on; if you can't trust your view into your data structures during debugging, it will cost you time.
- Dump into standard formats. If you can, dump into something like CSV, XML, S-expressions, or Dotty. If you have a considerable amount of data to analyze, this'll make it easier to use other tools to do some of the work.
- Maintain your dumpers. Your software isn't going to go away, and neither are your data structures. If it's useful during initial development, it's likely to be useful during maintenance.
- For structures that might be shared, or exist on the system heap, printing object addresses and reference counts can be very useful.
- For big structures, it can be useful to elide specific content. For example: a list of 1000 items can be printed as (item_0, item_1, item_2, ..., item_999 ).
- This stuff works for disk files too. For binary save formats, specific tooling to examine files can save time compared to an on-disk hex-editor/viewer. (Since you have code to read your disk format into a data structure in memory, if you also have code to dump your in-memory structure, this does not have to be much more work. Sharing code between the dump utility and the actual application also makes it more likely the dumper will show you the same view your application will see.)
- Reading dumped structures back in can also be useful.
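To make the point about eliding content concrete, here is a minimal Python sketch of a list dumper that prints the first few items, an ellipsis, and the last item. The function name and the head/tail parameters are invented for the example; a real dumper would be shaped by your own data structures.

```python
def dump_list(items, head=3, tail=1):
    """Format a sequence as "(a, b, c, ..., z)", eliding the middle of
    long sequences so the dump stays readable."""
    if len(items) <= head + tail:
        shown = [repr(x) for x in items]
    else:
        shown = [repr(x) for x in items[:head]]
        shown.append("...")
        # Slice from an explicit index so that tail=0 shows nothing
        # (items[-0:] would return the whole list).
        shown.extend(repr(x) for x in items[len(items) - tail:])
    return "(" + ", ".join(shown) + ")"

print(dump_list(list(range(1000))))
print(dump_list(["a", "b"]))
```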
That leads me to a couple books I've been reading lately. The first is Lisp in Small Pieces, by Christian Queinnec. I'm only a couple chapters in (stuck on continuations right now), but it's already been pretty profound. So far, the most useful aspect of the book is that it walks through several core design choices Lisp implementors have to make (Lisp-1 vs. Lisp-2, lexical scope vs. dynamic scope, types of continuations to support) and goes into depth regarding the implications and history of each choice. I think I'm finally starting to understand more of the significance of funcall and function in Common Lisp, not to mention throw/catch and block/return-from.
Book two is The
First Computers--History and Architectures, edited by
Raul Rojas. This book is a collection
of papers discussing the architecture of significant early computers from the late
30's and 40's. The thing that's so unique about the book is that it focuses on
the architectural issues surrounding these machines: the kinds of hardware they
were built with, how they processed information, and how they were programmed. Just
as an example, it has a detailed description of many of ENIAC's functional units,
even going into descriptions of how problems were set up on the machine. Another
highlight of the book for me (so far) has been a description of Konrad Zuse's
relay-based Z3, down to the level of a system architectural diagram, schematics of
a few key circuits, and coverage of its microprogramming (!).
Wed, 24 Aug 2005
=LET(value=MATCH(item,range,0), IF(ISERROR(value), 0, value))
If you're into Lisp-y languages, it'd look like this:
(let ((value (match item range 0))) (if (is-error? value) 0 value))
The function call
=LET(name=binding, expression)
would create a local range name named name, bound (equal) to the value returned by binding, to be used during the evaluation of expression. In the example above, during the evaluation of IF(ISERROR(value), 0, value), value would be bound to the value returned by MATCH(item, range, 0).
It's worth pointing out that this is slightly different from how normal Excel range names work. Range names in Excel work through textual substitution. With textual substitution, the initial expression would be logically equivalent to this:
=IF(ISERROR(MATCH(item, range, 0)), 0, MATCH(item, range, 0))
In other words, Excel would treat every instance of value as if MATCH(item, range, 0) was explicitly spelled out. This means there are two calls to MATCH and two potential searches through the range. While it's possible that Excel optimizes the second search away, I'm not sure that anybody outside of Microsoft can know for sure how this is handled.
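The difference between a true binding and textual substitution can be made visible by counting how often the lookup runs. The Python sketch below does that with a stand-in for MATCH; everything in it (the NA marker, make_match, and the two wrapper functions) is hypothetical scaffolding for the illustration and says nothing about what Excel does internally.

```python
NA = object()  # stands in for Excel's #N/A error value

def make_match():
    """Build a MATCH-like lookup plus a counter of how often it ran."""
    calls = {"count": 0}
    def match(item, values):
        calls["count"] += 1
        return values.index(item) + 1 if item in values else NA
    return match, calls

def with_substitution(match, item, values):
    # Every use of the name is replaced by the full lookup expression.
    return 0 if match(item, values) is NA else match(item, values)

def with_let(match, item, values):
    # The lookup runs once and its result is bound to a local name.
    value = match(item, values)
    return 0 if value is NA else value

for lookup in (with_substitution, with_let):
    match, calls = make_match()
    print(lookup.__name__, lookup(match, "b", ["a", "b", "c"]), calls["count"])
```

When the item is found, the substituted form searches the list twice while the bound form searches it once, which is the cost the proposed LET is meant to avoid.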
Microsoft's current recommendation for handling the specific ISERROR scenario in the first expression is this VBA function:
Function IfError(formula As Variant, show As String)
    On Error GoTo ErrorHandler
    If IsError(formula) Then
        IfError = show
    Else
        IfError = formula
    End If
    Exit Function
ErrorHandler:
    Resume Next
End Function
This isn't bad, but it requires that spreadsheet authors and
readers understand VBA. It also imposes significant performance
costs: calling into VBA from a worksheet takes time.
Wed, 17 Aug 2005
Mon, 08 Aug 2005
Fri, 29 Jul 2005
- A Comparison Between Erlang and C++ for Implementation of Telecom Applications
- Designing an Authentication System a Dialogue in Four Scenes
- Frontier Kernel
- Graphics Hardware Archives
- LispNYC ErLisp
- Los Alamos From Below Reminiscences 1943-1945, by Richard Feynman
- Making the Jump to tableless design
- Optimizing Search on an 8 Puzzle
- RSS 2.0 and Atom 1.0, Compared
- Sysinternals Freeware - Inside the Native API
- The New C Standard
- The New Guidelines for Writing Spreadsheets
- Time Series in Finance
- Cray Supercomputer FAQ
- Functional PostScript
- Journal of Statistical Software
- Metro Planet - Subway travel information
- NewtonScript Byte Code Specification
- Numerical Recipes Home Page
- Papers about Self and OO Programming
- Puyo's Page - libfov
- SELF and the Origins of NewtonScript
- The Autodesk File
- Undocumented C# Types and Keywords
- Windows A Software Engineering Odyssey
- The Gallery of Old Iron
- world subways
- Brad Abrams
- Cockpit Conversation
- Daily Dose of Excel
- Daryll McDade - The High School Guy
- Edward Tufte Ask E.T. forum
- Exorcyst's Padded Cell
- Fabulous Adventures In Coding
- Lambda the Ultimate Programming Languages Weblog
- Land and Hold Short
- Linux Is No Longer Free
- Mike Stall's .NET Debugging Blog
- Mini-Microsoft
- MozillaZine Weblogs
- Planet GNOME
- Planet Lisp
- Quoderat
- Rob Fahrni, at the core
- SQL Server 2005 CLR Integration
- Slava Pestov's Weblog
- Stochastic Keithp
- Surfin' Safari
- The Blog of Death
- Under The Hood - Matt Pietrek
- Wadler's Blog
- Windows Shell-User
- fintanr's weblog
- jeffdav's WebLog
- rentzsch.com tales from the red shed
- simplegeek
Thu, 28 Jul 2005
- Deciding which rows of the input set are part of the result set
- Displaying the result set in a contiguous sequence of spreadsheet rows.
The tricky bit of the formula-based filter is the second problem: displaying the result set in a contiguous range of rows with no gaps. Each cell that might display part of the result set has to figure out itself what part of the result set to display, if any, and pull the data from the input set. A simple MATCH or LOOKUP can't handle this, since MATCH or LOOKUP can't be told to return the second, third, or nth match. They return the first match, which isn't quite enough for what we're trying to do.
As it turns out, even though having the result set compute a mapping from the input set is quite hard, solving the reverse problem isn't too bad. Having the input set compute the mapping to the result set is easy. Here's how it works, by column:
- Ord. - The row ordinal number of the row in the input set, starting with 1.
- Result Ord. - This column starts at zero, in the row preceding the first row of the result set, and increments by 1 for each row where In Query? is TRUE. For each row with In Query? of TRUE, this column is the row ordinal number of this row in the result set. We are almost there.
- Result Rows. - The input row ordinal of each row in the output set. This is done by using MATCH to find the first row for each number in the Result Ord. column.
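The column-by-column mapping above can be sketched in Python to show why "first match" is enough once the running count exists. This is a minimal sketch assuming the In Query? column is available as a list of booleans; filter_mapping is a made-up name, and the list index plus one plays the role of the Ord. column.

```python
def filter_mapping(in_query):
    """Return (result_ord, result_rows) for a list of In Query? flags.

    result_ord  - running count of matching rows, one entry per input row
    result_rows - for result row n, the ordinal of the input row it shows
    """
    result_ord = []
    count = 0
    for flag in in_query:
        if flag:
            count += 1
        result_ord.append(count)

    # The first row where the running count reaches n is the nth match,
    # which is exactly what a first-match lookup like MATCH finds.
    result_rows = [result_ord.index(n) + 1 for n in range(1, count + 1)]
    return result_ord, result_rows

print(filter_mapping([False, True, True, False, True]))
```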
Wed, 27 Jul 2005
- 2-finger trackpad scrolling.
- Sudden motion sensing for the disk. (Is this done by the disk itself with a built in motion sensor or by the motherboard/CPU?)
- Standard Bluetooth
- A minor speed bump: the peak CPU is now a 1.42GHz G4 with a 142MHz bus.
Anyway, I've recently come to have a theory on the limited display resolution of Apple's notebooks. It seems obvious in retrospect, but Apple can't scale up the display resolution since they don't have the CPU or memory bandwidth to support higher resolutions as well as they want. With modern display stacks like Quartz and Quartz Extreme, pushing pixels around is one of the biggest user-visible performance burdens on a modern machine (hence, "the snappy"). While a GPU can help, there's no getting around the fact that if they doubled the resolution, they'd double the number of bytes their system has to process to render the same sized desktop on the screen. Given that Apple's best G4s have less than half the main memory bandwidth of the lowest-end Centrinos, it's no wonder Apple's not champing at the bit to eat up more of their bus.
Since Apple's first wave of Centrino laptops should bring fixes for all of this, the computing community has some pretty amazing hardware to look forward to in a year or so.
Mon, 18 Jul 2005
To me, the brilliance of the spreadsheet was that it took a data model that business people were familiar with, the accountant's paper spreadsheet, and layered on automatic computation and reporting facilities in a natural way. There's something very intuitive about going to cell C1, entering =A1+B1, and then having C1 contain the sum of the other two cells, automatically updated as the source cells change. It just makes sense, and is at the very core of every software spreadsheet dating back to the first, VisiCalc.
For years, spreadsheets worked at making this model work better. Lotus 1-2-3 introduced something called natural recalculation order that made it easier to follow the logic of spreadsheet calculation. Somewhere along the way, spreadsheets started doing limited recalculation, where formulas that didn't change weren't recalculated (thus saving time). New intrinsic functions were added, and Excel made a huge stride when it added array formulas: individual formulas that can produce more than one result. The gateway to user-defined functions written in Visual Basic was another huge win.
The core strength of all of these ideas is that they rely on and extend the core concept of the software spreadsheet: the software tracks dependencies between cells and automatically recalculates the appropriate results as necessary. As powerful as that concept is, Microsoft lost the plot somewhere around Excel 4 or 5 and keeps sinking money and effort into features that don't fully participate:
- Excel has two data filter features: neither one can automatically update a table as a part of recalculation.
- PivotTables don't update when their source data updates either. (For SQL data sources, this is understandable, but not so much when the source data comes from Excel itself.)
- PivotTables produce tables with missing values (to improve the formatting), which makes them very difficult to query with spreadsheet lookup functions.
- The histogram function (among others) of the Analysis ToolPak is a one-time thing: you use it, it generates a histogram, and that's it. It's not possible to incorporate histogram generation into the dataflow-driven recalculation of a spreadsheet.
- There's no way to use an Excel formula to determine if a row is excluded or included in an AutoFilter query. Actually, there's no way to have the result set of an AutoFilter query drive spreadsheet recalculation at all.
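For anyone who hasn't thought about what "dependency driven recalculation" means mechanically, here is a toy Python sketch of the core idea: each formula names the cells it reads, and cells are recomputed in an order where inputs always come before the formulas that use them. It is a deliberately naive model (it recomputes everything, has no notion of limited recalculation, and is not how Excel is implemented); recalculate and the cell dictionary layout are assumptions for the example, and graphlib needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def recalculate(values, formulas):
    """Recompute every formula cell in dependency order.

    values   - dict of cell name -> current value (updated in place)
    formulas - dict of cell name -> (list of input cells, function)
    """
    # Map each formula cell to the cells it depends on.
    graph = {cell: set(inputs) for cell, (inputs, _) in formulas.items()}
    # static_order() yields dependencies before the cells that use them.
    for cell in TopologicalSorter(graph).static_order():
        if cell in formulas:
            inputs, fn = formulas[cell]
            values[cell] = fn(*(values[name] for name in inputs))
    return values

cells = {"A1": 1, "B1": 2}
formulas = {
    "D1": (["C1"], lambda c: c * 10),            # =C1*10
    "C1": (["A1", "B1"], lambda a, b: a + b),    # =A1+B1
}
print(recalculate(cells, formulas))
cells["A1"] = 5   # change a source cell and recalculate
print(recalculate(cells, formulas))
```

The features listed above are frustrating precisely because they sit outside this loop: nothing re-runs them when a cell they depend on changes.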
However, AutoFilter is not without its problems:
- AutoFilter imposes its own user interface: if you want a look-and-feel other than stock, you're out of luck.
- For wide data tables with lots of columns, it can be hard to see the current AutoFilter query. To see the entire query requires horizontal scrolling down the header row.
- Cell formatting and AutoFilter are independent of each other. If you want position-dependent formatting (alternate row formatting, for example), it has to be recreated after each AutoFilter adjustment.
- An AutoFilter works by selectively 'hiding' rows in the worksheet it's a part of. This means that an AutoFiltered list can't share rows with anything else that you don't also want selectively 'filtered' from view.
- You can't have more than one AutoFilter on a worksheet tab.
- AutoFilter isn't part of the natural 'ebb and flow' of the life of a spreadsheet: it doesn't participate in the dependency-driven formula solver that drives Excel's computational capability. This has some profound (bad) implications:
  - As data rows are added and removed from the list being AutoFiltered, the AutoFilter has to be removed and reapplied to the new data list to reflect changes to its source.
  - You can't use AutoFilter to filter a list and then search that list with =LOOKUP() or =MATCH(): the lookup operation will search the entire list, not the filtered list.
  - If you AutoFilter a list that contains calculated cells, and those cells change value, the set of filtered rows is not updated.
Thu, 07 Jul 2005
Internet Explorer defaults to anonymous FTP, but sometimes you need to log in with an explicit username and password. One of the lesser-known features of URLs is that they allow login information to be specified as part of the address:
ftp://username:password@hostname/
The
Also, Microsoft have a Knowledge Base Article that describes this in more detail, including a way to log in from a menu command, if you have the right settings enabled.
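If you ever need to build one of these URLs in code, it's just string formatting. Here's a minimal sketch in C; format_ftp_login_url is a name I made up for this example, and it assumes the username and password contain no characters that would need escaping (an '@', ':' or '/' in either one would have to be percent-encoded first, which this sketch does not do).

```c
#include <stdio.h>

/* Hypothetical helper: writes "ftp://user:password@host/" into buf.
   Returns 0 on success, or -1 if the result would not fit in buf. */
int format_ftp_login_url(char *buf, size_t size,
                         const char *user, const char *password,
                         const char *host)
{
    int n = snprintf(buf, size, "ftp://%s:%s@%s/", user, password, host);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

Keep in mind that a URL like this puts the password in plain view anywhere the address is displayed or logged.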
Thu, 30 Jun 2005
This, the logo for my website, is basically just antialiased text rendered into a bitmap. At the time, it seemed like a good idea to render the text as a bitmap because I didn't trust the browser to render it for me. Bad idea.
As it turns out, Internet Explorer rescales bitmaps on high resolution displays. This is a somewhat misguided attempt to keep bitmap sizing consistent: bitmaps aren't rendered at 1:1 zoom, they are rendered at screen_dpi/96dpi. On non-96dpi screens, that results in ugly scaling. While scaling can be disabled, that's not the ideal solution. The ideal solution is to do as much of the rendering as possible in the browser, which should know more about the client's display than the server does. Therefore, my logo is now CSS-formatted plain text. That means it looks the right size on more screens, anti-aliases appropriately, and uses ClearType if it's available. The next step is going to be to switch from pixel sizes to 'real' sizes.
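The arithmetic behind the scaling is simple enough to sketch. The function below is my own illustration of the screen_dpi/96dpi ratio; the name scaled_pixels is made up, and the round-to-nearest step is an assumption, since I don't know exactly how the browser rounds.

```c
/* Size, in device pixels, of a bitmap dimension authored for a 96dpi
   screen once it has been rescaled for the actual screen DPI.
   Rounds to the nearest pixel (an assumption for this sketch). */
int scaled_pixels(int bitmap_pixels, int screen_dpi)
{
    return (bitmap_pixels * screen_dpi + 48) / 96;
}
```

On a 120dpi display the ratio is 1.25, so a 200 pixel wide logo gets stretched to 250 pixels. Since that isn't a whole multiple, every source pixel has to be resampled, and that's where the ugliness comes from.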
[/tech/general] permanent link
| From: | <phone_book name="Jenny">867-5309</phone_book> |
| To: | (phone_book ((name . "Jenny")) "867-5309") |
In other words, a symbol for the tag name in the car of the list, an association list of attribute values in the cadr, and then the subelements in the cddr. This seems reasonable, aside from the fact that attributes and tag values are still weirdly disjoint.
On the way to lunch today, I came up with another mapping that might be more reasonable:
| From: | <phone_book name="Jenny">867-5309</phone_book> |
| To: | (phone_book (name "Jenny") :end-of-attribute-marker "867-5309") |
This is simpler in that a tag is modeled as a list containing the tag symbol and then all of the sub-items, attributes or not. Data stored as an attribute doesn't get special treatment relative to data stored as a tag value. The symbol :end-of-attribute-marker makes it possible to still distinguish between attributes and tags. If you don't care, a simple call to remove can remove the marker symbol.
It's a subtle design point, but this'll probably end up in vCalc in the XML support... I've had XML for vCalc on the back-burner for a while now, but due to some real work obligations, I might have to make it a higher priority.
- If Commodore bought KFC they would have changed the name to "Warm dead birds in a paper bucket".
- Commodore Sushi: Cold, dead, raw fish.
[/tech/general] permanent link
"The market has stupidly decided that Intel microprocessors are better than Apple's preferred PowerPCs, so Apple will be at a disadvantage trying to sell PowerPC machines into the Intel market. This is what's right now killing Silicon Graphics, which is finding rough going pitting its MIPS processors against Intel. ... Yes, Apple will build computers with Intel processors. Their aim, as in all of these products, is for the high end. Based on Intel's new Merced chip, the new Apple machine will have PCI slots, Universal Serial Bus, Fast Ethernet, IEEE 1394 FireWire, IRDA, DIMM sockets, but no ISA slots and no backwards compatibility to DOS. So this is NOT a PC in the strictest sense, since it will only run Rhapsody, but not System 8 or Windows NT. It will run Mac applications inside Rhapsody. And because Apple is both the author of Rhapsody and the designer of this machine, Jobs believes that more customers will want to buy their Rhapsody wrapped in Apple hardware than not."
Funny thing is... that quote is from October of 1997. A lot has changed since then, but since the core reasoning was sound it probably shouldn't be too much of a surprise that he was ultimately right.
The other interesting bit is that Cringely wrote that piece around 1997, which is when the NDA for 'Project Star Trek' expired. Star Trek was a project in which a few Apple, Novell, and Intel software engineers got System 7 running on PC hardware. I'm not sure what the business story would've been, but it was a nice technical accomplishment nonetheless.
Tue, 28 Jun 2005
- I like the keyboard: nice and solid. Since the layout is more like a Dell D600 than a D400 (what I have from work), there'll be a little getting used to it. The D400 layout puts page up and page down near the arrow keys, which I've gotten used to for reading documents. The I6000 (and D600/D800) puts page up and page down up near the display. If that gets too obnoxious, I might have to investigate remapping some of the media keys on the front of the machine to more useful keys.
- I love the WUXGA (1920x1200) display. The machine came from the factory with large icons enabled and set to 120dpi. Set up that way, it seems readable enough to me, but my vision is so far correctable to 20/20. If smaller text adds to fatigue or is harder to read on a bouncy train, it'll be possible to enlarge text through preferences, so I'm not worried about it at all. At this point, the 1024x768 D400 is going to feel very cramped.
- Dell still dumps its machines full of software. This machine came with several broadband offers, four media players, and a bunch of modem stuff. Most of that's getting uninstalled in the name of system stability. I already have broadband, I don't use streaming media that much, and I haven't used a modem in years.
- XP Media edition looks the same as XP Pro, so far.
[/tech/general] permanent link
Thu, 23 Jun 2005
[/tech/general] permanent link
About ten years ago, a good friend of mine went to work in Texaco's IT Shop as a summer intern. One of his job responsibilities was to develop an intranet website. I forget the details, but somewhere along the way he decided he wanted to put a fancy banner picture atop the page. At the time, we were both interested in ray tracing, so we decided to throw together a raytraced version of the Texaco Star logo.
Using our copious free time, we found an online copy of the Texaco logo, took measurements of the star and rendered it as a white solid set against a metallic red hemisphere. We even went to the trouble of animating the star so it rotates, generating a bunch of frames and using a GIF tool to put together an animated GIF. The final result was a nicely animated Texaco logo with an "attractive" (This is by 1995 intranet standards, remember) banner to the side. Since then, I've dragged the model out, re-rendered it at higher resolution, and stuck it in a little Raytracing Gallery I have set up.
[/tech/this_blog] permanent link
Something else worth mentioning is that laptop vendors typically use standard parts in their hardware. While they don't publicize part numbers (partially so they can switch suppliers), it is possible to find datasheets describing things like LCD panels. While it takes some inference to figure out what part is being used, this can reveal statistics about LCD panels that might otherwise be hard to find. While Google is your friend, this site has a bunch of links to useful datasheets.
PS: I've ordered the WUXGA (>2 Megapixels, ~140dpi) display with the 128MB Radeon X300 video adapter. If on screen content isn't too small, I expect the detail to be fabulous. I'll post comments (and screenshots) when I get some experience with the machine.
[/tech/general] permanent link
Mon, 06 Jun 2005
The next question is how well it will be pulled off. In theory it could be seamless. It needs to be.
| Reason not to switch | Counterargument |
| If Apple switches to Intel, they introduce another architecture break into their hardware platform. | Emulation can make existing binaries run seamlessly on Intel. |
| But isn't emulation really slow? | Modern emulation technology has gotten a lot better: it can compile code on the fly, just like a modern JVM or Virtual PC. |
| But I've run virtual machines before, and they're still really slow. | All of the operating system services can be made to run natively, at full speed. The only thing that will be emulated is the application code itself. So, except for very computation-intensive application code things could still run smoothly. |
| Okay, but a lot of OS X (like Quartz Extreme) is optimized to run on Macintosh hardware. | Macintosh video hardware is the exact same as PC video hardware these days. In fact, most of the supporting hardware in Macintosh is the same as on a PC. |
| The PowerPC is part of Apple's 'uniqueness'. | It doesn't matter to most consumers what chip or ISA is running their software. The reason people pay for Apple, their core unique value, is their appealing design and the attention they spend developing a well integrated system. Even if Apple switches to Intel, there's no reason any of that has to change. (Anyway, they could still do something pretty unusual, like putting a Pentium M in a desktop). |
| Lots of new stuff in Tiger like CoreImage uses AltiVec a great deal. | CoreImage actually compiles dataflow graphs to native hardware at runtime, picking the approach that runs best on the target hardware. CoreImage could well compile to x86/SSE2 (or whatever else). That means that even a PPC binary running emulated on an Intel Macintosh could have access to full speed CoreImage services compiled to SSE2. |
| This will alienate existing PowerPC customers. | Why does it have to? If their emulation works well enough, Apple could easily introduce Intel hardware and retain PowerPC as the standard binary format for a while. The common case for ISVs would be to continue developing PowerPC binaries and selling into both the x86/OSX and PPC/OSX markets. The only 'schism' would arise for software vendors who had to have full performance on x86/OSX. They'd have to worry about shipping some kind of fat binary that ran on both platforms. Even then, PPC/OSX customers wouldn't see a difference. |
| Will Windows run on an Intel Mac? Won't that make it easier for Microsoft to drop Office for OS X? | Apple could easily make it virtually impossible to run Windows on whatever hardware they sell. With respect to Office for OS X, Microsoft doesn't really care what the target architecture is: they just want to sell licenses to Office. They'll go where the money is, and that might end up being an OSX/Intel port. |
Now that I think about it, the switch to Intel would basically boil down to the same story Apple told in 1993, when it initially switched from the Motorola 680X0 to the PowerPC. Apple pulled it off well in 1993, and now they have the benefit of experience (they've done it before), better emulation technology, and an already more standard hardware platform. It seems plausible to me. The only thing that's left is to figure out why they'd do it, and I have some ideas there too:
- They could finally move their laptops to a faster chip than the G4.
- x86 is not going away and it's not going to end up marginalized any time soon. This could be a 'final' switch.
- If IBM is growing cold on the desktop CPU business (and who could blame them), Apple's hand might be forced into switching away from PPC. Right now, IBM is the only high performance CPU story Apple has.
Fri, 03 Jun 2005
To make it easier to distinguish traffic from me and traffic from other folks, I've added a symlink to my configuration that makes it possible to hit the blog from a different, private, URL. That way, my hits and other folks' hits are nicely bucketed out in my ISP's reporting. This is a cheap and easy solution, and I recommend it.
[/tech/this_blog] permanent link
Since ClearType depends on the unique properties of LCDs, it won't look as good on a CRT. (I still think it looks better than normal, though).
[/tech/general] permanent link
Thu, 02 Jun 2005
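Option Explicit

' The window that was active the last time the macro ran (Empty until first use).
Dim lastWindow As Variant

Sub HereAndThere()
    If IsEmpty(lastWindow) Then
        ' First run: just remember where we are.
        Set lastWindow = ActiveWindow
    Else
        ' Later runs: jump to the remembered window, and remember the one we left.
        Dim currentWindow As Window
        Set currentWindow = ActiveWindow
        lastWindow.Activate
        Set lastWindow = currentWindow
    End If
End Sub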
Here's how you use it:
- Run the macro once to save your current location.
- Switch to your other spreadsheet.
- Now, running the macro will switch back and forth between the two windows.
Wed, 01 Jun 2005
Since people tend not to read online, and blogs are often read in huge volumes via an aggregator, my hunch is that the title (and maybe the first paragraph) have to convince people your article is worth the time...
More established bloggers, with a better track record of writing interesting stuff, might have more leeway.
[/tech/this_blog] permanent link
- Any modern laptop processor is probably adequate.
- 1GB RAM.
- 30-60GB Disk.
- A DVD writer would be nice, but not necessary.
- 14-15 inch display, the highest dot pitch I can find.
- Reasonable 2D graphics performance, 3D is not that important to me.
- Touchpad pointing device.
- 3 year warranty, accident insurance is a nice plus
- Long battery life, >3 hours.
- Reasonable expectation of 2-3 years of reliable life.
- Can run a couple small Windows applications I need to do my job.
- Can run MS Office.
| | Dell D610 | ThinkPad T4x | Apple 15" PowerBook | Apple 14" iBook |
| CPU | Pentium M, 1.6 GHz | Pentium M, 1.8 GHz | G4, 1.5 GHz | G4, 1.33 GHz |
| RAM | 1GB, 2 DIMMs | 1GB, 1 DIMM | 1GB, 2 DIMMs | 768MB, 2 DIMMs |
| Hard Disk | 60GB | 60GB | 80GB | 60GB |
| Optical Disk | DVD+/-RW | DVD+/-RW | DVD+/-RW | DVD+/-RW |
| Screen | 14.1", 1.5MP | 14.1", 1.5MP | 15", 1MP | 14", 0.75MP |
| Warranty | 3 year | 3 year | 3 year | 3 year |
| Insurance | 3 year | none | none | none |
| Price | $1,893 | $2,306 | $2,648 | $1,948 |
So, as ever, Apple is the most expensive choice, even when compared to nicer PCs like the ThinkPad.
Maybe the thing that surprises me the most about this is that Apple isn't even close to the bleeding edge of display technology. Given the energy they've put into OS X's desktop rendering pipeline, I'd expect them to have displays that could compete with Sony's XBrite or maybe the 2MP 15" widescreen that Dell makes available on the D810. OS X could drive those displays better than pre-Avalon Windows. Maybe this is an artifact of the suppliers Apple is using?
Fri, 27 May 2005
All I need now is time...

[/tech/general] permanent link
Even more cool is that the archive goes back to the beginning, back when Dell was called PC's Limited.
Note: The IBM link above is actually still on the IBM site... I expect the link to break whenever Lenovo takes the contents.
[/tech/general] permanent link
Tue, 17 May 2005
I've found a bunch of good things online about the man and his work:
- http://research.microsoft.com/users/gbell/craytalk/
- http://en.wikipedia.org/wiki/Seymour_Cray
- http://www.cwheroes.org/oral_history_archive/seymour_cray/index.asp
- http://www.spikynorman.dsl.pipex.com/CrayWWWStuff/
- He didn't mind throwing bad ideas away (or saving them for later). The Cray 1 took a very different approach from the CDC 8600.
- Cray failed a lot. He was always pushing the limits and taking risks, and paid the price of those risks. The CDC 8600 failed, as did several designs for the Cray 2. The Cray 3 failed to sell, and the 4 doesn't seem to have hit the prototype stage at all. Even the Cray 2 doesn't seem to have been an unqualified success, thanks to issues with memory bandwidth.
- He had a very 'startup mentality'. His career seems to be a repeating story of initial success, spin off lab, and spin off company.
- A lot of his design problems weren't electronic at all. He seems to have struggled as much (if not more) with packaging and cooling as with anything else.
- He had a keen sense of style. With the possible exception of the Connection Machine CM-1/2, his machines were the most visually striking of the major supercomputers. Maybe it's superficial, but it can't have hurt the sales or publicity.
- He knew what he had accomplished. There's a story about his surprise when Steve Chen developed the X-MP from the Cray 1 and doubled (?) the performance. Of course, the story goes on to describe how Cray ended up appreciating the new design.
[/tech/general] permanent link
I've never been happy with the text quality of the vCalc display: it's jagged, and at a font size that doesn't rasterize well on the displays I have access to. Well, as it turns out, this is relatively easy to fix. The LOGFONT structure that GDI uses to select fonts has a field, lfQuality, that is used to select the quality of the text rendering. Back in olden days, this field was used to do things like disallow scaling of bitmap fonts (if you don't know what that is, be thankful: it was awful). These days, it's used to turn on anti-aliasing and ClearType (on Windows XP). Thus, this one line of code...
lf.lfQuality = CLEARTYPE_QUALITY;
...transformed this...
...into this.
There's also a setting for anti-aliasing:
lf.lfQuality = ANTIALIAS_QUALITY;
Anti-aliasing (in Windows) dates back to the Windows 95 Plus pack, so this setting should be much more widely supported. However, it's also much less powerful: it doesn't do any of the sub-pixel stuff and it is enabled far less often. In my experimentation, non-bold fonts had to be pretty big before anti-aliasing was used at all.
The other caveat is that this doesn't automatically buy you decent formatting of the text you display. That is, if you're still computing text positioning on per-pixel increments, you'll still get mediocre layout. vCalc does this, but it also has very minimal text layout requirements for now.
[/tech/general] permanent link
Tue, 12 Apr 2005
Since a company has to have customers to survive, most of the time the interests of the owners are in line with the interests of the customers. However, this isn't always the case: Microsoft's VB6/VB.Net decision might be an example. If you believe that the lower costs and better prospects of VB.Net outweigh the lost goodwill of all those VB6 developers, then you can also argue that dumping VB6 was a net profitable thing to do. This is despite the fact that so many customers are paying a price for the decision.
So... if you're a VB6 developer and you're upset about the way you were treated, the best protest you can make is to make Microsoft's decision a bad one. Make it unprofitable. When it comes time to pick a replacement platform, vote with your wallets and send your dollars somewhere else (and hopefully to a platform served by more than one vendor).
- Will there ever be a need for more than one instance of the variable?
- How much complexity does passing the variable to all its accessors
entail?
- Does the variable represent global state? (A heap free list, configuration
information, a pool of threads, a global mutex, etc.)
- Can the data be more effectively modeled as a static variable in a
function or private member variable in a singleton object? (Both
of these are other forms of global storage, but they wrap the variable
accesses in accessor functions.)
- Can you support the lifecycle you need for the variable any other way? Global
variables exist for the duration of your program's run-time. Local variables
exist for the duration of a function. If you don't have heap allocated
variables, or if your heap allocator sucks, then a global variable might
be the best way to get to storage that lasts longer than any one function
invocation.
- Do you need to use environment features that are specific to globals?
In MSVC++, this can mean things like
specifying the segment in which a global is stored or declaring a variable as
thread-local.
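The "static variable in a function" option from the list above might look something like this in C. It's only a sketch: the Config structure, its fields, and the accessor names are all hypothetical, made up to show the shape of the idea.

```c
/* Hypothetical configuration record, for illustration only. */
typedef struct Config {
    int verbose;
    int max_connections;
} Config;

/* The only way to reach the configuration. The storage still has a
   global lifetime (it lasts for the whole run of the program), but its
   name is private to this function. */
Config *get_config(void)
{
    static Config config = { 0, 16 };   /* initialized before first use */
    return &config;
}

/* Accessors give one place to add validation, logging, or locking. */
int config_max_connections(void)
{
    return get_config()->max_connections;
}

void config_set_max_connections(int n)
{
    if (n > 0)   /* reject nonsensical values */
        get_config()->max_connections = n;
}
```

The data is every bit as global as it was before, but every read and write now goes through a function, so if the answer to the first question ever changes to "yes", there are only a few places to fix.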
[/tech/programming] permanent link
The core problem VB6 developers are facing is that they sank lots of development money into a closed, one-vendor language. Choosing VB6 basically amounted to a gamble that Microsoft would continue to support and develop the language for the duration of a project's active life. That gamble hasn't paid off for some developers, and companies with sizable investments in VB6 code now need to figure out how to make the most of that investment while still evolving their software.
With standardized languages like C, languages with multiple tool vendors, the risk is significantly lower. If one vendor drops their version of a language, switching to another implementation is going to be a lot easier than porting to an entirely different platform (particularly if you've avoided or isolated vendor-specific features).
So... what's the moral of this story? Before you base your business on a particular language or tool, make sure you know what happens if that platform ever loses support. Pick something standardized, with multiple viable vendors. Or alternatively pick something open source, where you can take over platform development yourself (if you absolutely need to). Whatever you do, don't pick a one vendor tool and complain when the vendor decides to drop it. Commercial vendors, particularly, have no legal obligation to their customers.
[/tech/general] permanent link
Fri, 04 Mar 2005
The problem with making general interest posts at the root of a topic is that there's then no way to watch only the general topics. If you look at the root topic, you get the whole topic.
[/tech/this_blog] permanent link
PS: Be sure to check out Olin Shivers's philosophy of undergraduate advising. It's an example to be followed. ;-)
The scenarios that people tend to get upset about (at least in the United States) are the scenarios involving offshoring, the movement of work overseas. Outsourcing, however, does not necessarily imply that the work gets moved to a different country: it's very common for work to be outsourced to another American business employing American workers. An example of this is hiring a Madison Avenue firm to put together an ad campaign. Sure, it'd be possible to develop the talent in house to do this yourself, but there are many advantages in outsourcing the work to a more specialized vendor.
[/tech/business] permanent link
[/tech/general] permanent link
I guess that looks like a lot of complaining, but otherwise the phone is very nice. The last phone I've liked as much is my old Nokia 8260 (and the 6160 before that). The Sanyo 4900 doesn't even come close. I'm happy enough with this phone to consider buying another Sony Ericsson. (The new W800i looks pretty nice...)
[/tech/products] permanent link
Wed, 02 Mar 2005
(format #t "compiling ~l, tail?=~l, value?=~l" form tail? value?)
With the output statements in place, the compiler takes about 250-300ms to compile relatively small functions. Not great, particularly considering that there's no optimization being done at all. Anyway, on a hunch I removed the format statements, and execution time improved by a couple orders of magnitude to a millisecond or two per function. That's a lot closer to what I was hoping for at this stage of development.
On the other hand, I hadn't realized that my (ad hoc, slapped together in an hour) format function was running quite that slowly. I think it'll end up being an interesting optimization problem sooner or later.
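One common way to keep this kind of debugging output around without paying for it is to put the formatting behind a switch. Here's a sketch of that idea in C rather than in the compiler's own Lisp; the TRACE macro, the trace_enabled flag, and compile_form are all hypothetical names, and trace_calls_formatted exists only so the example can show when formatting actually happens.

```c
#include <stdio.h>

int trace_enabled = 0;          /* hypothetical runtime switch */
int trace_calls_formatted = 0;  /* instrumentation for this example only */

/* Formats and prints only when tracing is on. When it's off, the
   arguments are never evaluated and no formatting work is done. */
#define TRACE(...)                          \
    do {                                    \
        if (trace_enabled) {                \
            trace_calls_formatted++;        \
            fprintf(stderr, __VA_ARGS__);   \
        }                                   \
    } while (0)

/* Stand-in for a compiler entry point; the real work is elided. */
int compile_form(const char *form, int is_tail, int wants_value)
{
    TRACE("compiling %s, tail?=%d, value?=%d\n", form, is_tail, wants_value);
    /* ... real compilation would go here ... */
    return 0;
}
```

With trace_enabled left at 0, the cost per call is one integer test, however slow the formatter underneath happens to be.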
Tue, 01 Mar 2005
Maybe this kind of API would have made abominations like this less likely:
At least users might be able to avoid calling errors "OK", of all things...
[/tech/general] permanent link
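#include <stdlib.h>
#include <string.h>   // strdup (POSIX)

typedef struct CountedString
{
    int _references;   // number of live references to this string
    char *_data;       // heap-allocated, NUL-terminated text
} CountedString;

// Create a new string holding a private copy of data, with one reference.
// (Allocation failure checks are omitted for brevity.)
CountedString *makeString(const char *data)
{
    CountedString *cs = (CountedString *)malloc(sizeof(CountedString));
    cs->_references = 1;
    cs->_data = strdup(data);
    return cs;
}

// Take out another reference to an existing string.
CountedString *referToString(CountedString *cs)
{
    cs->_references++;
    return cs;
}

// Give up a reference, freeing the string when the last one goes away.
void doneWithString(CountedString *cs)
{
    cs->_references--;
    if (cs->_references == 0)
    {
        free(cs->_data);
        free(cs);
    }
}

// ... useful library functions go here...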
The reference counting mechanism buys you two things. It gives you the ability to delete strings when they're no longer accessible. It also gives you the ability to avoid string copies by deferring them to the last possible moment. This second benefit, known as copy-on-write, is where idempotence can play a role. Copy-on-write means that whenever you write to a resource, you first ensure that you hold a copy unique to yourself. If the copy you have isn't unique, copy-on-write requires that you duplicate the resource and modify the copy instead of the original. If you never modify the string, you never make the copy.
This means that the beginning of every string function that alters a string has to look something like this:
CountedString *alterString(CountedString *cs)
{
    if (cs->_references > 1)
    {
        // Shared: make a private copy and give up our reference to the original.
        CountedString *uniqueString = makeString(cs->_data);
        doneWithString(cs);
        cs = uniqueString;
    }
    // ... now, cs can be modified at will
    return cs;
}
Apply a little refactoring, and you get this...
CountedString *ensureUniqueInstance(CountedString *cs)
{
    if (cs->_references > 1)
    {
        // Shared: make a private copy and give up our reference to the original.
        CountedString *uniqueString = makeString(cs->_data);
        doneWithString(cs);
        cs = uniqueString;
    }
    return cs;
}

CountedString *alterString(CountedString *cs)
{
    cs = ensureUniqueInstance(cs);
    // ... now, cs can be modified at will
    return cs;
}
Of course, ensureUniqueInstance ends up being idempotent: it gets you into a known state from an unknown state, and it doesn't (semantically) matter if you call it too often. That's the key insight into why idempotence can be useful. Because idempotent processes don't rely on foreknowledge of your system's state to work reliably, they can be a predictable means to get into a known state. Also, if you hide idempotent processes behind the appropriate abstractions, they allow you to write code that's more self-documenting. A function that begins with a line like cs = ensureUniqueInstance(cs); more clearly says to the reader that it needs a unique instance of cs than lines of code that check the reference count of cs and potentially duplicate it.
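To see the copy-on-write behavior end to end, here's a self-contained sketch that repeats the functions above in compact form and adds one example mutator. The capitalize function is something I've made up for the illustration, and I've swapped strdup for malloc plus memcpy so the sketch sticks to standard C.

```c
#include <stdlib.h>
#include <string.h>

typedef struct CountedString {
    int _references;
    char *_data;
} CountedString;

CountedString *makeString(const char *data)
{
    size_t len = strlen(data) + 1;
    CountedString *cs = malloc(sizeof *cs);
    if (cs == NULL)
        return NULL;
    cs->_data = malloc(len);
    if (cs->_data == NULL) {
        free(cs);
        return NULL;
    }
    memcpy(cs->_data, data, len);
    cs->_references = 1;
    return cs;
}

CountedString *referToString(CountedString *cs)
{
    cs->_references++;
    return cs;
}

void doneWithString(CountedString *cs)
{
    if (--cs->_references == 0) {
        free(cs->_data);
        free(cs);
    }
}

/* Idempotent: after this returns, the caller holds the only reference,
   no matter how many there were before or how often it is called. */
CountedString *ensureUniqueInstance(CountedString *cs)
{
    if (cs->_references > 1) {
        CountedString *uniqueString = makeString(cs->_data);
        if (uniqueString == NULL)
            return NULL;          /* original reference left untouched */
        doneWithString(cs);
        cs = uniqueString;
    }
    return cs;
}

/* Hypothetical mutator: upper-cases the first character, copying the
   string first if anyone else is still referring to it. */
CountedString *capitalize(CountedString *cs)
{
    cs = ensureUniqueInstance(cs);
    if (cs != NULL && cs->_data[0] >= 'a' && cs->_data[0] <= 'z')
        cs->_data[0] = (char)(cs->_data[0] - 'a' + 'A');
    return cs;
}
```

If a and b refer to the same "hello", then b = capitalize(b) leaves a untouched and gives b its own "Hello". Calling ensureUniqueInstance on b again, once or ten times, just hands back the same pointer.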
Next up are a few more examples of idempotence, as well as a look into some of the pitfalls.
[/tech/programming/idempotence] permanent link
I've also found, via Lambda the Ultimate, a website dedicated to Alexander Stepanov's papers and code. Stepanov is one of the principals behind the C++ STL (STepanov and Lee) Standard Template Library.
[/tech/general] permanent link
Normally, I'd expect that two loads of laundry would cost about $6. The washer would be $1.50 or so per run, and the dryer would be another $1.50 per run, for a total of $6. Maybe even $7.50, if you decide to run a second dryer cycle. Even in New York (west Midtown Manhattan), I've been in apartments recently that charge about that much.
However, this apartment is special: they use a "smart" card system to manage payments. There's a dispenser on the side of the wall that sells $7 cards for $10 (cards themselves cost $3). The dispenser also allows you to reload cards in $5 and $10 increments. Once you have a card, there are slots in each of the washers and dryers that accept the card and debit from it the $2.50 it takes to buy a cycle in one of the machines. Yes, you read that right: $2.50. $2.50 in my apartment complex buys a 34 minute washer cycle or a 30 (yes, 30) minute dryer cycle.
So tonight, I spent $15 (200% of my estimate) and got this:
The part of this that bothers me the most is the $3 surcharge on the smart card. Thanks to the pricing structure of the laundromat, the $3 surcharge really amounts to a $5 surcharge. This means that someone was either stupid enough not to notice that customers would always end up with $2 of useless change, or was malicious enough to use this as a sleazy way to bilk customers out of an extra $2. Not to mention that I get the hassle of trying not to lose this stupid card, lest I want to drop another $5 on yet another card.
[/personal/nyc] permanent link
Sun, 27 Feb 2005
Dr. Raskin was one of the first human interface experts to contribute to and be involved in the Apple Macintosh computer. While it's true that the design took a different direction from some of his initial ideas, he played a major role in defining the user interface ethic of the Macintosh, and consequently basically every other major computer interface.
After leaving Apple, Jef went on to continue his ideas with the SwyftCard and Canon Cat. The best articulation I've seen of his ideas regarding interface design is in his book, The Humane Interface. He has also put a great deal of his work on his personal website.
This is a sad day, indeed.
[/tech/general] permanent link
Tue, 22 Feb 2005
[/tech/general] permanent link
[/tech/this_blog] permanent link
One of the most common examples of this is in C-style include files. It's common practice to write code like this, to guard against multiple inclusions:
#ifndef __HEADER_FILE_GUARD
#define __HEADER_FILE_GUARD
// ... declarations go here...
#endif // __HEADER_FILE_GUARD
This idiomatic C code protects the include file against multiple inclusions. Include files with this style of guard can be included as many times as you like with no ill effect.
The benefit to this is that it basically changes the meaning of the code #include <foo.h> from "Include these declarations" to "Ensure that these declarations have been made". That's a much safer kind of statement to make since it delegates the whole issue of multiple inclusions to a simple piece of automated logic.
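To see the guard at work without juggling several files, here's a sketch that pastes the body of a made-up header (call it point.h) into one file twice, which is roughly what the preprocessor sees when a header gets included twice. The Point type, point_sum, and the POINT_H_GUARD name are all hypothetical.

```c
/* ---- first "inclusion" of the hypothetical point.h ---- */
#ifndef POINT_H_GUARD
#define POINT_H_GUARD

typedef struct Point { int x; int y; } Point;

static int point_sum(Point p) { return p.x + p.y; }

#endif /* POINT_H_GUARD */

/* ---- second "inclusion": POINT_H_GUARD is already defined, so the
        preprocessor skips everything up to the matching #endif ---- */
#ifndef POINT_H_GUARD
#define POINT_H_GUARD

typedef struct Point { int x; int y; } Point;

static int point_sum(Point p) { return p.x + p.y; }

#endif /* POINT_H_GUARD */
```

Take the guards out and this stops compiling, because struct Point and point_sum would each be defined twice. With them in, the second copy costs nothing and changes nothing, which is exactly the "ensure" behavior described above.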
Of course, this is pretty commonplace. More is to come...
[/tech/programming/idempotence] permanent link
Wed, 16 Feb 2005
[/tech/general] permanent link
Now, speaking to Jon's particular issues
1. It's not a skill you'll use in most of the software development jobs you'd want to have
"...here are the kind of things you might use C for these days - writing some kind of device
driver, maintaining extremely old applications, embedded development on very limited hardware,
or maybe writing an operating system or compiler or something. I guess you could like hack
away on the Linux (or is that GNU-Linux) kernel...."
Having done this work for seven years (and enough IT-style work to know the difference), I can
say this is some of the most interesting stuff you can do with a computer science education. In C,
I've developed two programming languages, an object oriented framework for distributed real-time
process control applications, a bunch of objects using that framework, and yes, my share of
device drivers and RTOS extensions. It's been interesting and deep technical work, despite
the fact that it was wrapped in a plain C wrapper.
"...Consider this though - you're not really going to be solving any new problems...."
Not much commercial software work involves solving truly new technical problems. If you really
want that, pick a problem to solve, get your Ph.D., and enter a research environment.
Otherwise, there are still plenty of interesting systems to build and lots of deep thinking
to do in commercial software development. Some of those jobs require C. Even
more of those jobs require a mastery of the skills that C requires you to have and develop.
"...If you want to do web development, database development, desktop applications - you know,
stuff you can talk about at a party - you're probably not going to be working in C...."
This is a weird argument... I'm not sure why this is stuff that's more suitable for small talk
at a party than any other programming work. All of it seems equally bad, actually. In any event,
do you really want to choose your career (>=40 hours a week for years) based on what plays well
at a party?
2. It can give you a false sense of control
"...Worse still is that it can make you think that programming is about telling the
computer what to do with its registers and memory. ..."
At its core, that is exactly what programming is about: telling a computer how to accomplish
useful tasks in a language/vocabulary it can understand (i.e. bits/bytes, registers, memory, etc.).
It's nice to build abstractions atop that so we don't have to think about moving bytes around,
but if those abstractions ever leak, you will have to know why in order to deliver software
effectively.
3. It can teach you to get in the way
"... If all you learned from C is that you are the boss, you will most certainly write code that
plays poorly with others..."
This is probably true, particularly if you treat other languages like they were C.
The power of learning C is that it forces you to take control. To program effectively in C, you
have to understand how higher level constructs like strings, objects, processes, etc. map
down to the basic concepts supported by the machine. You have to explicitly think about every
memory allocation, where the storage is allocated, and when the memory gets freed. You have to
think about when values are passed by reference and when by value. Compared to Java and C#, you
have to think about a lot of things that, in the modern world, seem like low-level trivia.
This is the whole point. Even if you never touch C again, the language you do end up using has
to solve exactly these problems. It likely does so via some mechanism, like garbage
collection, that imposes its own costs and constraints. If you don't understand these
low-level mechanisms and the constraints they impose, you can't be considered a fluent programmer
in whatever language you use. This is the same reason that, when I went to school around
1994, I had to learn Motorola 68000 assembler. I've never written 68K assembler commercially,
but that coursework made it crystal clear what various high-level language constructs
cost.
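To make those costs concrete, here is a small sketch of the bookkeeping C asks for: the difference between passing a value and passing its address, and a string built from an explicit allocation that the caller has to free. All of the function names are invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pass by value: the function gets its own copy of n, so the caller's
   variable is untouched. */
void bump_copy(int n)
{
    n += 1;
    (void)n; /* the change dies with the copy */
}

/* Pass by reference, C style: the caller passes an address, and the
   function modifies the caller's variable through it. */
void bump_in_place(int *n)
{
    *n += 1;
}

/* A "string" is just an array of bytes ending in '\0'. This one lives on
   the heap; the caller owns the result and must free() it.
   Returns NULL if the allocation fails. */
char *join_words(const char *a, const char *b)
{
    size_t len_a = strlen(a);
    size_t len_b = strlen(b);
    char *out = malloc(len_a + 1 + len_b + 1); /* a + space + b + '\0' */
    if (out == NULL)
        return NULL;
    memcpy(out, a, len_a);
    out[len_a] = ' ';
    memcpy(out + len_a + 1, b, len_b + 1); /* copies b's terminator too */
    return out;
}

int value_vs_reference_demo(void)
{
    int count = 1;
    bump_copy(count);
    assert(count == 1);    /* the copy changed, not the original */
    bump_in_place(&count);
    assert(count == 2);    /* the original changed this time */
    return count;
}
```

A garbage-collected language makes the same decisions about where the joined string lives and when it goes away; it just makes them for you, at a cost that is easier to reason about once you have done it by hand.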
4. It can make it hard for you to love framework based development
"...To be productive as a programmer these days, you either need to be learning to get the most
out of existing frameworks or helping to build tomorrow's frameworks. ..."
It's possible to build frameworks in C, as well as to use them. One of Jon's framework
examples, GTK, is written in C. The Gimp is an example of an application that is written in C
and based on GTK. There are plenty of other examples of C-based framework development.
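As a rough sketch of what framework-style code can look like in plain C, the example below has the "framework" own the loop and call back into application code through a function pointer. This is a hypothetical mini-dispatcher, not GTK's actual API; every name in it is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mini-framework: the application registers a handler, and
   the framework decides when to call it. */
typedef void (*event_handler)(int event, void *user_data);

struct dispatcher {
    event_handler handler;
    void *user_data;
};

void dispatcher_connect(struct dispatcher *d, event_handler h, void *user_data)
{
    d->handler = h;
    d->user_data = user_data;
}

/* The framework's "main loop": feeds each event to the registered handler. */
void dispatcher_run(const struct dispatcher *d, const int *events, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (d->handler != NULL)
            d->handler(events[i], d->user_data);
    }
}

/* Application code: a handler that adds up the events it sees. */
void sum_events(int event, void *user_data)
{
    int *total = user_data;
    *total += event;
}

int framework_demo(void)
{
    int events[] = {1, 2, 3};
    int total = 0;
    struct dispatcher d = {NULL, NULL};

    dispatcher_connect(&d, sum_events, &total);
    dispatcher_run(&d, events, sizeof events / sizeof events[0]);
    assert(total == 6);
    return total;
}
```

The void pointer plays the role that a closure or object reference would play in a higher-level language: it lets the framework carry application state around without knowing anything about it.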
5. It can teach you philosophies which will prevent you from really understanding modern programming
It teaches the philosophies on which modern programming is built. Basically all modern OSes are built in C,
at the core. Basically all commercial run-time environments are built in C, at the core. Lower-level still,
modern CPUs and ISAs evolved in a time when C was king, and are well suited to running compiled C code.
6. It can divert you from the real challenges of software engineering
"...The point is, today's software development environment is dynamic, evolving, and extremely
challenging. If you're going to be of help, you need to do something more productive with your
time than learn a 20 year old language..."
If you're going to be of help, you can do things more productive with your time than encouraging
people not to learn about the core aspects of their profession.
I'll be the first to admit that you can get away with never programming professionally in C, but
your programming will ultimately suffer for not knowing it.
[/tech/general] permanent link
Wed, 09 Feb 2005
None of this is all that earth-shattering, but it was all trivial to do in Blosxom. For a one-file, 16K Perl script, Blosxom brings a lot to the table.
Next on the agenda is getting a web form set up for posting and
hopefully editing blog posts, and then setting up a web-based
way to upload images into the blog. My current workflow for posting
to the blog involves two levels of nested SSH logins and the use
of vi. *shudder*.
[/tech/this_blog] permanent link
Tue, 08 Feb 2005
Planet Lisp aggregates a bunch of Lisp-related blogs, while LtU is more general and more of a discussion site.
On a related note, I came to LtU via Eric
Lippert's Blog over on Weblogs @ASP.Net.
Eric is a developer at Microsoft who's done a lot of work on Windows scripting
and the Windows script languages.
Like Noisemaker, vCalc is shareware available through IceGiant.
[/tech/ectworks/vcalc] permanent link
It's available as a shareware program at Icegiant Software.
[/tech/ectworks/noisemaker] permanent link
It took a while, but I decided to base the thing on Blosxom.
Nice and simple...