Mike Schaeffer's Blog

The Commodore 64 and Apple ][

mike@mschaef.com (Mike Schaeffer) — Tue, 05 Aug 2025 00:00:00 +0000

A little over a year ago, I wrote about some formative experiences I had with computers early in my life. That article focused on the Timex Sinclair 1000, but sitting off in the wings are two other computers - the Commodore 64 and the Apple II. A few weeks ago, there was some dialog between John Gruber, Drew Saur, and Jason Snell that's brought those two machines back to mind. It's all very worth reading if you have any interest in the early history of personal computing.

It's hard for me to admit that it's early history, but these machines are five years further in the past to us today than the ENIAC would've been to somebody opening up a brand new Apple II. Time flies.

As you might expect, there are partisans for each machine. Drew Saur likes the Commodore and the other two prefer the Apple machines. Having spent some time using and thinking about both, I have my preferences also, but I've also come to a different conclusion. While the Apple and the Commodore competed in a similar place in the early 1980's market, it's equally true that they are products of different times, different design philosophies, and of companies with different goals. For me, it's hard to reduce it to a single "favorite" machine, and it's hard to boil it down to specifications and case design. There's more to the story.

My exposure to the Commodore 64 and Apple II series was entirely in the context of grade school education. My school had two computer labs opposite from each other across the same hall. Turn left and you step into a lab full of Commodores, turn right and you step into the Apple II lab. While the majority of the school's educational use of the computers centered around the Apple products, it was the Commodore 64 that made the first big impression.

I entered third grade in 1983. This was three years after Seymour Papert first published a book entitled Mindstorms. If you're not familiar with the book, and have any interest in computer literacy education, I highly suggest you read it. In his book, Papert talks about the Logo programming language, and how he viewed it as central to a Constructivist approach to education. The idea is that children could learn by building things, and exploring the properties of what they had built. The Logo language itself was designed to support this. Children could start with a particularly accessible form of graphics programming - turtle graphics - and move on to the sorts of list and symbolic processing normally associated with Lisp. (This is no accident - Papert spent most of his career at MIT.)

For whatever reason, my school was piloting a program to test some of Papert's ideas in the classroom setting. Practically speaking, this meant we had that lab full of Commodore 64's running Logo, and some time each week to teach virtual turtles how to draw shapes on screen. If you haven't seen Logo or turtle graphics before, this shows how you might define a procedure to draw a square:

to square
   forward 10
   right 90
   forward 10
   right 90
   forward 10
   right 90
   forward 10
   right 90
end

This works by instructing a 'turtle' animated on the screen to move forward 10 pixels, turn right 90 degrees, and repeat the process. The turtle carried a virtual pen that caused it to draw a square on the screen.

By design it was an accessible way to program, even to a third grader. Papert had realized it was easy for children to envision themselves as the turtle, and by imagining themselves as the turtle, they could more easily think through how their instructions to the computer would be understood. That made it easier to make direct connections between what they wanted to happen and the instructions they needed to give the computer. It eliminated the need for cartesian coordinates to put a picture on the screen.

The next step in drawing a square might be to notice that the same pair of steps is repeated 4 times. Logo has an easily understandable looping construct for repetitive operations:

to square
   repeat 4 [
      forward 10
      right 90
   ]
end

From there, you might decide you want to be able to change the size of the squares you draw, and add a parameter:

to square :size
   repeat 4 [
      forward :size
      right 90
   ]
end

If you add a second parameter, you can vary the number of sides and draw regular polygons. (Set :nsides high enough, and the output looks a lot like a circle.)

to polygon :nsides :size
   repeat :nsides [
      forward :size
      right 360 / :nsides
   ]
end
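For readers without a Logo interpreter handy, the same idea can be approximated in Python. This is only a sketch: the `Turtle` class below is a hypothetical stand-in that tracks position and heading with basic trigonometry instead of drawing anything, and its starting heading and angle conventions are assumptions of the sketch rather than anything taken from the C64 Logo implementation.

```python
import math


class Turtle:
    """Hypothetical stand-in for a Logo turtle: records points instead of drawing."""

    def __init__(self):
        self.x, self.y = 0.0, 0.0
        self.heading = 90.0  # degrees; 90 is assumed to mean "facing up the screen"
        self.path = [(self.x, self.y)]

    def forward(self, distance):
        radians = math.radians(self.heading)
        self.x += distance * math.cos(radians)
        self.y += distance * math.sin(radians)
        self.path.append((self.x, self.y))

    def right(self, degrees):
        # Turning right is clockwise, which lowers the heading angle.
        self.heading = (self.heading - degrees) % 360


def polygon(turtle, nsides, size):
    """Mirror of the Logo polygon procedure."""
    for _ in range(nsides):
        turtle.forward(size)
        turtle.right(360 / nsides)


t = Turtle()
polygon(t, 4, 10)
# After a complete polygon the turtle should be back at its starting point,
# up to floating point rounding.
print(math.isclose(t.x, 0.0, abs_tol=1e-9), math.isclose(t.y, 0.0, abs_tol=1e-9))
```

The point of the sketch is the one Papert made: every instruction is relative to where the turtle is and which way it faces, so the procedure never mentions a screen coordinate.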

For me, it was utterly seductive and very compelling. Adding fuel to the fire was the fact I occasionally had free run of the lab for hours at a time. My mom was one of the other third grade teachers at the school, and she was also the teacher responsible for managing the Logo lab. On days when she had work to do outside of school hours, I'd amble down to the lab and experiment with Logo. There were even a few times I was responsible for setting the lab up - wiring the hardware together.

Initially, I used my extra time in the lab to elaborate on ideas I'd developed in class. Later on, I started exploring the Logo system's manual to see what else could be done. There was also a second diskette that shipped with example programs. Among many others, there was an interactive editor for C64 VIC-II sprites and an animal guessing game. The animal game would try to guess the animal you were thinking of. If it got it wrong, it would ask you to teach it a yes or no question that would differentiate the animal you were thinking of for the next time. It was a simple form of Lisp-style symbolic AI, and heady stuff for a ten year old.
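I no longer have the original Logo source, but the core of an animal game like that one can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the tree representation, the function names `guess` and `learn`, and the sample questions. The knowledge base is a binary tree where leaves are animals and interior nodes are yes/no questions.

```python
# A node is either a string (an animal to guess) or a
# (question, yes_branch, no_branch) tuple.

def guess(node, answer_fn):
    """Walk the tree, asking yes/no questions until reaching an animal."""
    path = []  # record of (question, answer) pairs taken along the way
    while isinstance(node, tuple):
        question, yes_branch, no_branch = node
        answer = answer_fn(question)
        path.append((question, answer))
        node = yes_branch if answer else no_branch
    return node, path


def learn(node, path, new_animal, new_question, answer_for_new):
    """Return a new tree where the wrong guess is replaced by a question."""
    if not path:
        # `node` is the animal that was guessed incorrectly.
        if answer_for_new:
            return (new_question, new_animal, node)
        return (new_question, node, new_animal)
    question, yes_branch, no_branch = node
    (_, answer), rest = path[0], path[1:]
    if answer:
        return (question,
                learn(yes_branch, rest, new_animal, new_question, answer_for_new),
                no_branch)
    return (question,
            yes_branch,
            learn(no_branch, rest, new_animal, new_question, answer_for_new))


tree = ("Does it live in water?", "fish", "dog")
facts = {"Does it live in water?": False}        # the player is thinking of a cat
animal, path = guess(tree, lambda q: facts[q])   # the game guesses "dog"
tree = learn(tree, path, "cat", "Does it purr?", True)
print(tree)
```

Each wrong guess grows the tree by one question, which is all the "learning" there is. It is still a nice demonstration of a program whose data, rather than its code, changes its behavior.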

In retrospect, while this time in my life was only a few years, I credit it with my choice of profession. I also believe my early exposure to Logo played a profound role in the way I think about writing software. The C64 Logo implementation had an excellent manual, covering not only how to use the Logo interpreter, but also how to program in general. It talked about subjects like functional decomposition and recursion effectively enough that I was exploring the ideas myself in my spare time in the computer lab. In no time, I was writing functions that would do things like draw n-sided figures on screen. I'd call those with other functions that recursively called themselves to draw even more complex composite shapes. To this day, I am convinced it shaped the way I think about decomposing a problem into component modules. With all that in mind, I hope it goes without saying that I have a fond place in my heart for the Commodore 64.

Unfortunately, if you look a little deeper into what it was like to use a Commodore 64, you can see where it went off the rails in comparison to the Apple. In addition to Logo, my extra time in the lab allowed me to explore other aspects of how the machines worked. I had a couple games and other programs on a floppy disk I'd picked up at a local school computer fair. A later upgrade to the lab brought a row of Commodore 64C's along with GEOS. None of this was used in the curriculum, but I had time to play around and experiment a bit.

While I can say the sprites and sounds were fun, I'd boot the machine into BASIC and be frustrated by the fact that the interpreter didn't have commands for graphics. An Apple II booted into a BASIC interpreter that had commands for both low and high resolution graphics. Three commands were all it took to switch to graphics mode and plot a colored line on the screen. The same task on a Commodore required direct access to memory mapped hardware with PEEK and POKE.

Similarly, while I didn't understand why it happened, I noticed that when I wanted to load Logo from disk, it took an awfully long time. While the hardware capabilities for video and audio were nice, there were other aspects of the machine that just weren't right. This strange mix of tradeoffs in comparison to the Apple hardware was related in many ways to the different origin stories of the two machines.

The Apple was hardware Steve Wozniak built as a passion project that Steve Jobs turned into a viable product. The Commodore came a few years later, from a much more mercenary company, albeit a company with its own semiconductor fab. It says it all that Wozniak picked the 6502 chip because he could afford it, and Commodore picked the 6502 because they owned the fab that made it. That manufacturing capability (and some talented engineering) gave Commodore the ability to do things like design the custom semiconductors for video and audio that gave the Commodore so much of its power.

The flip side of the C64 story is that it was also built to a price point. Commodore had bought that semiconductor fab as a way to stay competitive in the calculator market and moved into computers as a way to hedge against the fact that market was shrinking. While Wozniak was able to develop much of the Apple II hardware in isolation, the Commodore 64 was very much a product of a wider corporate engineering effort. They could do things like build their own chips, but they didn't have the focus of Wozniak's stated goal of building a computer that could implement Breakout in BASIC. (Apple's later failures with the Apple /// and Lisa should show that Commodore is not the only company of the era with the ability to lose its focus.)

The net effect of this is that while the Commodore had some nice hardware, it also suffered due to cost cutting. The lack of investment into a BASIC interpreter that supported the hardware is one such example, but the situation was even worse with respect to Commodore 64 disk I/O. If you're not familiar with how disks worked on a Commodore 64, they were the diametric opposite in design when compared to the Apple Disk II. Steve Wozniak famously designed the Apple disk controller using the machine's host CPU and a handful of chips on an expansion board about the size of a deck of cards.

In contrast, Commodore 64 disk drives were standalone computers with their own CPUs. There was the disk hardware, but there was also another 6502 chip, RAM, and ROM firmware. In a sense, a Commodore 64 disk drive was what we might view today as a network server. It ran most of the disk operating system itself and served files to the host Commodore 64 via a serial link. This added costs to the disk hardware, but carried a few advantages. First, it was simpler to package and install than the Apple II's plug-in expansion card. (Atari did something similar with its 8-bit disk drives for what I presume are similar reasons.) The other advantage was that this networked approach made it possible to multiplex a single Commodore 64 disk drive (and printer) across several Commodore 64's. My school's Logo lab did this to share disk and printer pairs across groups of four C64's. They had to buy multiplexing hardware to do this, but I suspect the cost per seat was much lower than it would have been with either the more expensive Apple hardware or a dedicated Commodore disk drive per machine.

Besides higher drive cost, the other downside of this approach to disk storage was that it relied on the performance of a serial communications link between the Commodore 64 and the disk drive. For a range of reasons related to hardware compatibility with the earlier VIC-20, the higher memory bandwidth requirements of Commodore 64 graphics, and a rush to market, the performance of this serial link was severely compromised on the Commodore 64. Not only was Commodore 64 disk I/O slower than on a VIC-20 with the same disk drive, it was less than 10% the speed of an Apple II disk. Benchmarks put the transfer rate of the disk drive (with the stock link protocol) at significantly less than 1K/sec. Loading that Logo interpreter could take a minute or more, which was made even worse by the way the lab shared a single disk across four C64's. While it was possible to fix the serial link performance issues with a fast load cartridge, that consumed the closest thing the Commodore had to an expansion port solely to correct an issue that should not have shipped in the first place. I argue that this sort of failure (and a general lack of expandability) has a lot to do with the limited shelf life of a Commodore 64.
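To get a feel for how a sub-1K/sec link turns into a minute-long wait, here is a back-of-the-envelope sketch. The specific figures are assumptions picked for illustration and not measurements: I do not know the exact size of the Logo interpreter or the exact throughput of the lab's drives.

```python
def load_time_seconds(size_bytes, bytes_per_second):
    """Time to move a program across the link, ignoring seeks and overhead."""
    return size_bytes / bytes_per_second


def worst_case_wait_seconds(single_load_seconds, machines_sharing):
    """If loads are served one at a time, the last machine waits for all of them."""
    return single_load_seconds * machines_sharing


ASSUMED_PROGRAM_BYTES = 32 * 1024   # assumption: a 32 KB program
ASSUMED_LINK_RATE = 400             # assumption: bytes/sec, well under 1K/sec

single = load_time_seconds(ASSUMED_PROGRAM_BYTES, ASSUMED_LINK_RATE)
shared = worst_case_wait_seconds(single, 4)
print(f"one machine: {single:.0f}s, last of four: {shared:.0f}s")
```

Under those assumptions a single load is already over a minute, and the fourth machine in a shared group waits several minutes.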

So, while the video and audio capabilities of the Apple II weren't in the same league as the Commodore, it had some strong compensation in the form of performant disk I/O and expandable hardware. My middle school library used an Apple II with a Corvus hard disk to maintain its book inventory. A few years later (early 1990's), my high school physics teachers had built a fully integrated test generation and grading system around a heavily upgraded Apple //e. They would generate custom tests (with graphics) for each student and grade them in real time as students were taking the test. If we missed a problem while taking the exam, we could try again during the same class period for half credit and again after that for quarter credit. I suppose it would have been possible to do these sorts of things with a Commodore, but there were more gaps in the machine's capabilities that would need to be filled and fewer ways to expand the machine to fill those gaps.

Technical issues aside, what I'd like to leave you with is the notion of coherence of vision. Wozniak designed the Apple II in the context of a company that was already selling hardware (the Apple I), but he did it with a very clear and personal vision of what the machine should be. In part, he envisioned the Apple II as a machine that could take the Atari Breakout game and re-implement it in software. This vision guided the features he built into the machine and turned it into a product with a purpose. He knew what he was designing, who he was designing it for, and when it was complete. He also had the humility and foresight to recognize that the Apple II itself was the beginning of the story and would need to be expanded and potentially carried forward into subsequent models. While I don't think we should diminish the role of Steve Jobs' marketing vision, I do think there's a lot to be said for starting a design with specific requirements and a specific audience. This may not be as sexy or as fun as hardware specifications, but it is a better core vision for a product with staying power. That's a lesson as true now as it was then.

Workload

mike@mschaef.com (Mike Schaeffer) — Fri, 22 Nov 2024 00:00:00 +0000

Lately, I've been reading "Beyond Mach 3", Col. Buddy Brown's memoir of his years piloting the U-2 and SR-71 spy planes. If you like that sort of thing, it's an interesting read on its own, but it also has parallels to the software industry, particularly in the way it describes pilot workload. The concept of pilot workload provides a nice way to think explicitly about the burdens our systems place on us as engineers and operators to keep them running properly.

If you're not familiar with the U-2, it's a 1950's era aircraft designed to overfly hostile territory and take high resolution pictures of the ground for later analysis. It was a U-2 that flew over Cuba to take the pictures of Soviet missiles at the beginning of the Cuban Missile Crisis. To make these sorts of missions possible, the plane was designed to fly at very high altitudes, upwards of 70,000 feet. This is almost double the altitude of a modern passenger airliner.

What may not be obvious is that airplanes get progressively more difficult to fly at higher altitudes. If speed drops below the stall speed, the wings stall and lose lift. If speed gets too high, the airflow separates and the wings lose lift. Either way, if the speed deviates out of this window, the pilot loses control entirely. Due to the decreasing density of the atmosphere at higher altitudes, this window gets narrower as the plane flies higher. Fly high enough, the window closes entirely, and you've just discovered the ceiling of your aircraft. It can't fly that high at all.

A U-2 flying at 70,000 feet is flying very close to its ceiling. The range of permissible speeds at this point in its flight is just 5-10mph wide. Regardless of what else is going on, the pilot has to be completely precise managing their speed in order to retain control. This region of a flight envelope is referred to as the coffin corner, due to the catastrophic implications of a mistake.

On top of the work required to keep the plane in the air, the pilot of a U-2 is also taking celestial navigation readings every 30 minutes, compensating for the curvature of the earth, and running the sensor packages that are the point of the mission in the first place. In the best case, a mistake means the mission doesn't produce the desired results. In the worst case, a mistake means a crash in hostile territory and an international incident. To put it plainly, the U-2 asked a lot of its pilot as part of its basic operation. In aerospace engineering, this is referred to as having a high pilot workload.

For the U-2 program, this was arguably an appropriate design choice. The program had the ability to be highly selective in the pilots it chose, it had the ability to engage in extensive and costly training of those pilots, and it was operating at the limits of available technology. This was the only way their mission requirements could be satisfied at the time, so they were forced into a design that placed high workload demands on the pilot. As soon as technology got better, it became possible to fulfill many of the same requirements with less likelihood of error and overall lower risk, which is what happened.

There are a few lessons in this for software professionals. While most commercial software systems don't overfly hostile enemy territory, they hopefully do have tangible real world implications anyway. Otherwise, what's the point of building software at all? The reality is that if your software matters, operational failures will have tangible real-world impacts and costs. This means that to the extent economically possible, we have to get it right. This is where the workload concept can play a role.

In the context of aerospace engineering, workload refers to the burdens on the pilot and crew to keep the aircraft operating as desired. The same sort of concept also applies to software. How much time and effort is required to keep a software system operating properly? Costs here can include everything from periodic backups and monitoring to difficulty of installation and upgrades, environment management, complexity of testing... the list goes on and on. Viewed from that perspective, you should ask yourself explicitly if you and your team are building a system with a low or a high workload. What does it take to keep your system running, and is it appropriate for what it does?

At some level, this is old news. Build, deployment, and test automation have been on the software professional's agenda for literally decades at this point. 17 years ago, Jeff Atwood wrote about Rico Mariani's highly related concept of the pit of success. That said, how close are you really to where you want to be with respect to the systems you help build and maintain? Even if the concept of workload is old, the need to revisit it and make sure you're handling it appropriately is evergreen.

Concrete Control Flow

mike@mschaef.com (Mike Schaeffer) — Fri, 26 Apr 2024 00:00:00 +0000

Over the last few months, I've started making an attempt (with the help of my patient coworkers) to more intentionally learn the programming language Haskell. I have professional reasons why this is directly important, but it's also been a good way to challenge some long held assumptions. Being purely functional and lazily evaluated, Haskell lacks traits like implicitly sequenced evaluation, forcing its reconstruction where needed. Sequence is among the first concepts taught in programming, and it can be disconcerting to have to consider it explicitly. Despite my good intentions and years of experience with functional programming (mainly in various Lisps), the Haskell learning curve has presented challenges. It has been a rewarding journey with a number of connections to other parts of software design. I hope to share some of that here, and hopefully draw some connections that will make it easier to approach this content whether you use Haskell or not.

To see what I mean about the challenges of the Haskell learning curve, there is no better place to start than the beginning. First consider the definition of a function to double an integer.

doubleInt :: Int -> Int
doubleInt x = x * 2

The first line declares the type of doubleInt. It is a function from Int to Int. The arrow in the type declaration (->) signifies doubleInt as a function. Types without arrows are value types. With that in mind, consider this complete Haskell program. As you might guess, this is a classic "Hello World" with the message split on two output lines.

main :: IO ()
main = do
  putStrLn "Hello"
  putStrLn "World"

If you read carefully, you may notice that even though main defines the entry point for the program, it is not actually a function. There is no arrow in the type declaration, and so main is declared to be a value. Specifically, main is declared to be a value of type IO (). To give a more direct sense of how profoundly strange that is, consider a similar Python program also written with main as a value:

main = [
    lambda : print('hello'),
    lambda : print('world')
]

To bring this example at least a little closer to the Haskell, it is also possible to build main using functions to streamline the syntax:

def putStrLn(msg):
    return lambda : print(msg)

main = [
    putStrLn('hello'),
    putStrLn('world')
]

Run as written, neither of these Python programs does anything useful. They both allocate some data structures, bind them to a variable (main), and immediately exit without doing anything else. The second version is more sophisticated in that it uses Python code to help build those data structures, but it is fundamentally the same. In both versions, there is a gap between the value of main and externally visible action.

It is not necessarily obvious, but the Haskell program has the same problem. The only reason it works in Haskell is that there is magic in the Haskell language runtime. This magic knows how to interpret an IO value and turn it into the actions that might make the program useful.

The gap between main and useful action can be solved in the Python code by bringing our own magic. This is code that takes the array value of main, imposes a meaning and executes according to that interpretation. A simple example might just execute the array elements in sequence:

def putStrLn(msg):
    return lambda : print(msg)

main = [
    putStrLn('hello'),
    putStrLn('world')
]

def run(blocks):
    for block in blocks:
        block()

run(main)

Run this version of the program, and you get the following output:

$ python main.py
hello
world

This is exactly the same output as the alternative implementation you might be wishing I began with:

def main():
  print('hello')
  print('world')

main()

This simpler version relies on the property of Python that statements execute in the order in which they are written. This is built into the language, can be assumed, and does not need to be explicitly stated as in the run function. Underneath this, however, the Python code is interpreted in very similar fashion. There is an array of operations and a loop that walks the contents of that array, executing each operation according to its semantics. You can inspect the array with the Python bytecode disassembler (the exact opcodes vary by Python version):

>>> import dis
>>> dis.dis(main)
  1           0 RESUME                   0

  2           2 LOAD_GLOBAL              1 (NULL + print)
             14 LOAD_CONST               1 ('hello')
             16 PRECALL                  1
             20 CALL                     1
             30 POP_TOP

  3          32 LOAD_GLOBAL              1 (NULL + print)
             44 LOAD_CONST               2 ('world')
             46 PRECALL                  1
             50 CALL                     1
             60 POP_TOP
             62 LOAD_CONST               0 (None)
             64 RETURN_VALUE

Dropping down a level, the Python interpreter itself works the same way. In this case, the 'array' is machine code, and the 'loop' is the hardware in the CPU (working in conjunction with the operating system) to execute the intent of that code. This is as good a time as any to point out that run is about as simple an interpreter as it gets, and it is possible to extensively elaborate that same basic idea.
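As one small example of elaborating on run, this sketch adds two things the simple loop lacks: a log of what ran, and a policy for what happens when a block fails. The names `run_with_policy`, `stop_on_error`, and `fail` are hypothetical and exist only for this illustration.

```python
def putStrLn(msg):
    return lambda: print(msg)


def fail(msg):
    """Build a block that raises when it is eventually executed."""
    def block():
        raise RuntimeError(msg)
    return block


def run_with_policy(blocks, stop_on_error=True):
    """Run blocks in order and return a list of (index, status) records."""
    log = []
    for index, block in enumerate(blocks):
        try:
            block()
            log.append((index, "ok"))
        except Exception as error:
            log.append((index, f"failed: {error}"))
            if stop_on_error:
                break
    return log


main = [putStrLn("hello"), fail("disk full"), putStrLn("world")]
print(run_with_policy(main, stop_on_error=False))
```

The value of main is the same list of blocks either way. What changes is the interpreter, and with it the meaning of the program when something goes wrong.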

The point here is that the idea of expressing control flow as data to be interpreted by some external magic is not such a strange concept. In fact, it is essential to the way modern computers work. While all this hints at the power of what is happening with main and run, it leaves open the question of why you would bother. Why would you ever bother with such explicitly managed control flow, when it is already so common?

The justification boils down to control. A program that explicitly manages its execution is a program that has control over its execution. In the case of Haskell, the need for this control is easy to justify. Haskell is functional, lazily evaluated, and does not offer the same sort of implicit ordering provided by languages like Python. Given that I/O operations typically assume at least some level of ordering, Haskell needs a way to reconstruct ordering when needed. This role is served by IO and its associated machinery. If you have ever wondered why Haskell I/O is built in terms of Category Theory and usually covered in something like Chapter 7 of whatever Haskell book you are reading, this is why.

As ever, there is an alternative. Early versions of Haskell handled sequencing by delegating I/O operations to an external server process. This server process consumed a stream of I/O requests and produced a stream of I/O responses. Haskell code could then be expressed as a mapping across these two streams. This essentially outsourced the ordering concerns to the outside world, leaving Haskell functionally pure.

Here is a rough example of what my Hello World might look like in that style. (Note that unlike before, this main is actually a function.)

data Request = PutStrLn String | GetLine | Exit | ...
data Response = Success | Str String | ...

main :: [Response] -> [Request]
main _ = [
    PutStrLn "hello",
    PutStrLn "world",
    Exit
  ]

It is probably obvious at this point, but I have never written code in this style. Even still, it is easy to imagine how awkwardly it scales to force all I/O through two sequential streams and an external process. I do not know how long it took, but it seems like the Haskell community quickly abandoned that approach in favor of the approach we have been discussing: monadic I/O.
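To make the stream model a bit more concrete, here is a rough Python sketch of the same arrangement. It is an analogy and not a description of how early Haskell implementations worked: a Python generator stands in for a lazy list, the tuple-based request format is invented for the example, and `serve` is a hypothetical stand-in for the external I/O server.

```python
from collections import deque


def main(responses):
    """Map a stream of responses to a stream of requests."""
    yield ("PutStrLn", "What is your name?")
    next(responses)                   # response to the PutStrLn
    yield ("GetLine",)
    name = next(responses)            # response to the GetLine
    yield ("PutStrLn", "hello " + name)
    next(responses)
    yield ("Exit",)


def serve(program, input_lines):
    """Hypothetical I/O server: performs requests and feeds back responses."""
    pending = deque()                 # responses the program has not consumed yet

    def response_stream():
        while True:
            yield pending.popleft()

    inputs = iter(input_lines)
    output = []
    for request in program(response_stream()):
        if request[0] == "PutStrLn":
            output.append(request[1])
            pending.append("Success")
        elif request[0] == "GetLine":
            pending.append(next(inputs))
        elif request[0] == "Exit":
            break
    return output


print(serve(main, ["Ada"]))
```

Even in this tiny example, the program has to keep its requests and responses lined up by hand. Miscounting by one would pair the wrong response with a request, which hints at why the style scaled awkwardly.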

With monadic I/O, Haskell programs are expressed as values of type IO. These values represent I/O operations interleaved with computation. Internally, they are data structures with an explicit sequence. If the mapping from the code to the data structure is not obvious, it is because Haskell has a well developed syntax for making monadic I/O code look a little more imperative. The do notation is a good example. do looks sequential, but it is really syntactic sugar for an expression that constructs a value using a specific pattern.

Once assembled, values of type IO can be passed to an internal component of the Haskell runtime for interpretation. The runtime automatically does this for main. A Haskell program starts up, computes main and then immediately passes it to a component inside the runtime that knows what to do with an IO.
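The same shape can be imitated in Python. This is a deliberately tiny model and not how GHC represents IO: the `IO` class, `bind`, `then`, `unit`, and `unsafe_run` are all names made up for the sketch. The `bind` and `then` methods play roughly the role that Haskell's `>>=` and `>>` operators play once do notation has been desugared.

```python
class IO:
    """Hypothetical miniature of an IO value: a description of an action."""

    def __init__(self, perform):
        self._perform = perform       # zero-argument function, not yet run

    def bind(self, make_next):
        """Sequence: run this action, hand its result to make_next, run that."""
        return IO(lambda: make_next(self._perform())._perform())

    def then(self, next_io):
        """Sequence two actions, discarding the result of the first."""
        return self.bind(lambda _: next_io)


def unit(value):
    """An action that does nothing except produce a value."""
    return IO(lambda: value)


def putStrLn(msg):
    return IO(lambda: print(msg))


def unsafe_run(io):
    """Stand-in for the runtime component that interprets an IO value."""
    return io._perform()


main = putStrLn("Hello").then(putStrLn("World"))
# Nothing has been printed at this point; main is only a value.
unsafe_run(main)
```

Building main and running main are separate steps here, which is the same separation the Haskell runtime maintains between computing main and interpreting it.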

For users of traditional (non-functional) languages, this may sound limiting, but it turns out not to be. Haskell includes functions among its value types, so the full power of Haskell functions is available within IO values. Haskell code is used to construct IO values that contain Haskell code. The use of IO just provides a specific structure and establishes a boundary between the code that cares about order and the code that does not. This is similar to the partitioning achieved in the older stream based I/O model, with the exception that it is fully expressed within Haskell, type checked, and does not require explicitly reducing all I/O to two streams.

Even without quite that pressing a need to manage sequence, there is still utility in taking explicit control. For languages that already have a notion of sequence, explicitly managing control flow can be useful for the opposite reason. Control flow is usually modeled at the language level as a single stream of execution. Taking more control can make it possible to run multiple streams at once, suspend or alter computations in the event of failures, pause and restart, or even selectively introduce concurrency. If these sound like operating system facilities, they usually are. The difference is that at the operating system level, they are often provided at a higher cost than what can be done if managed internally to a process. A thread is expensive, but a list of operations is not. There is a reason so much of high throughput server computing is expressed in terms of asynchronous operations with control flow expressed in data.
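As a sketch of what running multiple streams at once can look like without threads, the following uses Python generators as pausable tasks and a small loop as the scheduler. The names `counter` and `run_round_robin` are hypothetical, and a real asynchronous runtime does far more than this (waiting on I/O, timers, cancellation).

```python
from collections import deque


def counter(name, count, log):
    """A task: a generator that yields wherever it is willing to be paused."""
    for i in range(count):
        log.append(f"{name}:{i}")
        yield


def run_round_robin(tasks):
    """Interleave tasks one step at a time until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)    # not done yet, so it goes to the back of the line
        except StopIteration:
            pass                  # task finished, drop it


log = []
run_round_robin([counter("a", 2, log), counter("b", 3, log)])
print(log)
```

The two tasks take turns even though there is only one thread of execution. The interleaving policy lives entirely in `run_round_robin`, so changing it means changing a few lines of ordinary code.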

Closing out, there is another parallel I would like to call out. If you have worked with Promises in JavaScript or other languages, the notion of modeling control flow as a data structure might feel vaguely familiar. Consider the following:

new Promise((resolve) => resolve())
  .then(() => console.log('hello'))
  .then(() => console.log('world'));

This code constructs a description of a desired computation in the form of a linked list of promises - a control flow graph. The control flow graph is then implicitly passed into the JavaScript runtime for evaluation.

The previous code explicitly lists out each step in the control flow graph (log hello then log world), but the full power of JavaScript is available when constructing a sequence of actions. This uses promises to print a list of numbers from 0 to 4. It constructs a chain of promises rather than printing the numbers directly.

function putStrLn(msg) {
  return () => console.log(msg);
}

let p = new Promise((resolve) => resolve());

for (let ii = 0; ii < 5; ii++) {
  // Reassign p so each step is chained after the previous one.
  p = p.then(putStrLn(ii));
}

I hope the parallel here is clear. The Python code used Python to generate an array to be interpreted. The Haskell code used Haskell to generate an IO to be interpreted. This JavaScript uses JavaScript to generate a chain of promises to be interpreted. In all three cases, there are elements both of code generation and interpretation.

If this example seems a bit contrived, printing a list of numbers is very much so. The pattern in general is not. A similar scenario came up practically at my day job not too long ago. I was building a data conversion process at the time. The process needed to query for a list of ID's and then issue a series of asynchronous commands to convert batches of data items in sequence. The solution involved standard functional programming techniques to partition the ID's into the batches followed by construction of a chain of promises to issue the API calls. The details were different, but the structure was the same.
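The shape of that solution can be sketched with Python's asyncio in place of JavaScript promises. This is not the code from that project: `convert_batch` is a hypothetical stand-in for the real API call, and the batch size and ID values are made up.

```python
import asyncio


def partition(items, size):
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


async def convert_batch(batch, log):
    """Hypothetical stand-in for an asynchronous conversion API call."""
    await asyncio.sleep(0)
    log.append(list(batch))


async def convert_all(ids, batch_size, log):
    # First build a description of the work: one deferred call per batch.
    steps = [lambda b=batch: convert_batch(b, log)
             for batch in partition(ids, batch_size)]
    # Then interpret it, finishing each step before starting the next.
    for step in steps:
        await step()


log = []
asyncio.run(convert_all([1, 2, 3, 4, 5], 2, log))
print(log)
```

Keeping the list of steps separate from the loop that awaits them is what makes it easy to later change the policy, for example to run a few batches concurrently, without touching the code that defines a batch.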

There are two key insights I would like to leave you with. The first is that the justification for these techniques comes from the ability to more fluently control your program's execution. Sometimes you need more structure, sometimes you need less structure, and sometimes you just need something different than the default. Being explicit about it can open up new possibilities.

The second insight is that in all of these cases, the code that you write as a developer is not quite the code that directly executes a computation. Rather, you are writing code that computes a description of a computation. The description is then processed elsewhere and according to slightly different rules. This is the origin of the technique's power, but if it feels harder to write code in this style, it probably should. If it feels harder to understand a stack trace from this sort of code, again, it probably should.

For me, I have found it helpful over the years to understand the concepts I use with at least some amount of depth. This makes it easier to understand and predict the design. This also makes it easier to develop instincts for how to use a given technique, which is particularly important if there is no other option. I hope that what I have written here provides a useful viewpoint into what it means to be explicit about control flow, a few places where you will see it, and how to use it effectively.

A few thoughts on Lisp syntax

mike@mschaef.com (Mike Schaeffer) — Mon, 05 Jun 2023 00:00:00 +0000

If you've been around programming for a while you've no doubt come across the Lisp family of languages. One of the oldest language families still in use, much of what we take for granted in modern programming has roots in Lisp. This includes everything from dynamic memory management to first class functions and a comprehensive library of standard data structures. However, despite the considerable influence of Lisp on the field, one aspect of the language that hasn't been widely adopted is its syntax. Lisp syntax is one of its most distinctive aspects, and is both a strength and a weakness. In this post, I'll share a few thoughts on why that is.

To give you a taste of Lisp's syntax if you're not familiar, here's a simple Clojure function for parsing a string identifier. The identifier is passed in as either a numeric string (123) or a hash ID (tlXyzzy), and the function parses either form into a number.

(defn decode-list-id [ list-id ]
  (or (try-parse-integer list-id)
      (hashid/decode :tl list-id)))

In a "C-like language", the same logic looks more or less like this:

function decodeListId(listId) {
    return tryParseInteger(listId) || hashid::decode("tl", listId);
}

Right off the bat, you'll notice a heavy reliance on parentheses to delimit logical blocks. With the exception of the argument list (`[ list-id ]`), every logical grouping in the code is delimited by parentheses. You'll also notice the variable name (list-id) contains a hyphen - not allowed in C-like languages. What's not as obvious is that the syntax directly corresponds with the native data structures of the language. The brackets delineate vectors and the parens delineate sequences (or lists). The rough Java equivalents would be ArrayList and LinkedList. This direct mapping between syntax and structure goes a long way to explaining why the language looks the way it does and why syntax works the way it does.

In short, it's strange, but there are reasons. The strangeness, while it imposes costs, also offers benefits. It's these benefits that I wish to discuss.

Before I continue, I'd like to first credit Fernando Borretti's recent post on Lisp syntax. It's always good to see a defense of Lisp syntax, and I think his article nicely illustrates the way that the syntax of the language supports one of Lisp's other hallmark features: macros. If you haven't already read it, you should click that link and read it now. That said, there's more to the story, which is why I'm writing something myself.

If you've studied compilers, it's probably struck you how much of the first part of the course is spent on various aspects of language parsing. You'll study lexical analysis, which lets you divide streams of characters into tokens. Once you understand the basics of lexical analysis, you'll then study how to fold linear sequences of tokens into trees according to a grammar. Then, a few more tree transformations, and finally linearization back to a sequence of instructions for some more primitive machine. Lisp's syntax concerns the first two steps of this - lexical and syntactic analysis.

Lexical analysis for Lisp is very similar to lexical analysis for other languages. The main difference is that the rules are a bit different. Lisp allows hyphens in symbols (see above), and most other languages do not. This changes how the language looks, but isn't a huge structural advantage to Lisp's syntax:

(defn decodeListId [ listId ]
  (or (tryParseInteger listId)
      (hashid/decode :tl listId)))

Where things get interesting for Lisp is in the syntactic analysis stage - the folding of linear lists of tokens into trees. One of the first parsing techniques you might learn while studying compilers is known as predictive recursive descent, specifically for LL(1) grammars. Without going into details, these are simple parsers to write by hand. The grammar of an LL(1) language can be mapped directly to collections of functions. Then, if there's a choice to be made during parsing, it can always be resolved by looking a single token ahead to predict the next rule you need to follow. These parsers have many limitations in what they can parse (no infix expressions), but they can parse quite a bit, and they're easy to write.

Do you see where this is going? Lisp falls into the category of languages that can easily be parsed using a recursive descent parser. Another way to put it is that it doesn't take a lot of sophistication to impart structure on a sequence of characters representing a Lisp program. While it may be hard to write a C++ parser, it's comparatively easy to write one for Lisp. Thanks to the simple nature of a Lisp's grammar, the language really wears its syntax tree on its sleeve. This is and has been one of the key advantages Lisp derives from its syntax.
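To give a sense of scale, here is a minimal sketch of a recursive descent reader, written in JavaScript to stay close to the C-like example above. It is a simplification rather than a real Lisp reader: it only knows about parens, brackets, and whitespace-separated atoms, it ignores strings and comments, and the function names and the `{ type, items }` representation are assumptions made for this example.

```javascript
// Lexical analysis: split source text into brackets, parens, and atoms.
function tokenize(text) {
  return text.match(/[()\[\]]|[^\s()\[\]]+/g) || [];
}

// Parse one form starting at tokens[pos]. Returns [value, nextPos].
// A single token of lookahead is enough to pick the rule to follow.
function parseForm(tokens, pos) {
  const token = tokens[pos];
  if (token === undefined) throw new Error("unexpected end of input");
  if (token === "(") return parseSeq(tokens, pos + 1, ")", "list");
  if (token === "[") return parseSeq(tokens, pos + 1, "]", "vector");
  if (token === ")" || token === "]") throw new Error("unexpected " + token);
  return [token, pos + 1]; // an atom, kept as its source text
}

// Parse forms until the closing delimiter, collecting them into one node.
function parseSeq(tokens, pos, close, type) {
  const items = [];
  while (tokens[pos] !== close) {
    if (pos >= tokens.length) throw new Error("missing " + close);
    const [value, next] = parseForm(tokens, pos);
    items.push(value);
    pos = next;
  }
  return [{ type, items }, pos + 1];
}

// Read the first form in a piece of source text.
function read(text) {
  return parseForm(tokenize(text), 0)[0];
}

const tree = read("(defn decode-list-id [ list-id ] (or (a list-id) (b list-id)))");
console.log(JSON.stringify(tree, null, 2));
```

Two mutually recursive functions cover the whole grammar, and the tree that comes out has the same shape as the text that went in.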

The first advantage is that simple parsing makes for simple tooling. If it's easier to write a parser for a language, it's easier to write external tools for that language that understand it in terms of its syntax. Emacs' paredit-mode is a good example of this. paredit-mode offers commands for interacting with Lisp code on the level of its syntactic structure. It lets you cut text based on subexpressions, swap subexpressions around, and similar sorts of operations based on the structure of the language. It is easier to write tools that operate on a language like this if the syntax is easily parsed. To see what I mean, imagine a form of paredit-mode for C++ and think how hard it would be to cut a subexpression there. What sorts of parsing capabilities would that command require, and how would it handle the case where code in the editor is only partially correct?

This is also true for human users of this sort of tooling. Lisp's simple grammar enables it to wear its structure on its sleeve for automatic tools, but also for human users of those tools. The properties of Lisp that make it easy for tools to identify a specific subexpression also make it easier for human readers of a block of code to identify that same subexpression. To put it in terms of paredit-mode, it's easier for human readers to understand what the commands of that mode will do, since the syntactic structure of the language is so much more evident.

A side benefit to a simple grammar is that simpler grammars are more easily extended. Fernando Borretti speaks to the power of Lisp macros in his article, but Common Lisp also offers reader macros. A reader macro is bound to a character or sequence of characters, and receives control when the standard Lisp reader encounters that sequence. The standard Lisp reader will pass in the input stream and allow the reader macro function to do what it wants, returning a Lisp value reflecting the content of what it read. This can be used to do things like add support for XML literals or infix expressions.

If the implications are not totally clear, Lisp's syntactic design is arguably easier for tools, and it allows easier extension to completely different syntaxes. The only constraint is that the reader macro has to accept its input as a Lisp input stream, process it somehow with Lisp code, and then return the value it "read" as a single Lisp value. It's very capable, and fits naturally into the simple lexical and syntactic structure of a Lisp. Infix languages have tried to be this extensible, but have largely failed, due to the complexity of the task.

Of course, the power of Lisp reader macros is also their weakness. By operating at the level of character streams (rather than Lisp data values) they make it impossible for external tools to fully parse Common Lisp source text. As soon as a Lisp reader macro becomes involved, there exists the possibility of character sequences in the source text that are entirely outside the realm of a standard s-expression. This is like JSX embedded in JavaScript or SQL embedded in C - blocks of text that are totally foreign to the dominant language of the source file. While it's possible to add special cases for specific sorts of reader macros, it's not possible to do this in general. The first reader macro you write will break your external tools' ability to reason about the code that uses it.

This problem provides a great example of where Clojure deviates from the Common Lisp tradition. Rather than providing full reader macros, Clojure offers tagged literals. Unlike a reader macro, a tagged literal never gets control over the reader's input stream. Rather, it gets an opportunity at read-time to process a value that's already been read by the standard reader. What this means is that a tagged literal processes data very early in the compilation process, but it does not have the freedom to deviate from the standard syntax of a Clojure S-expression. This implies both flexibility to customize the reader and the ability for external tools to fully understand ahead of time the syntax of a Clojure source file, regardless of whether or not it uses tagged literals. Whether or not this is a good trade off might be a matter of debate, but it's in the context of a form of customization that most languages don't offer at all.
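The difference is easier to see in a toy reader. The sketch below is JavaScript, not Clojure, and it does not reproduce Clojure's actual reader: the tags `#upper` and `#count`, the handler table, and the function names are all made up for illustration. The point is only the order of events. The standard reader reads the next form first, and the tag handler receives the finished value.

```javascript
// Hypothetical tag table: each handler receives an already-read value.
const tagHandlers = {
  "#upper": (value) => String(value).toUpperCase(),
  "#count": (value) => value.length,
};

// Split source text into parens and whitespace-separated atoms.
function tokenize(text) {
  return text.match(/[()]|[^\s()]+/g) || [];
}

// Returns [value, nextPos]. Handlers never see tokens or characters.
function readForm(tokens, pos) {
  const token = tokens[pos];
  if (token === undefined) throw new Error("unexpected end of input");
  if (token === ")") throw new Error("unexpected )");
  if (token === "(") {
    const items = [];
    pos += 1;
    while (tokens[pos] !== ")") {
      if (pos >= tokens.length) throw new Error("missing )");
      const [value, next] = readForm(tokens, pos);
      items.push(value);
      pos = next;
    }
    return [items, pos + 1];
  }
  if (token.startsWith("#")) {
    const handler = tagHandlers[token];
    if (!handler) throw new Error("no handler for tag " + token);
    // The standard reader reads the next form first...
    const [value, next] = readForm(tokens, pos + 1);
    // ...and only then hands the finished value to the tag handler.
    return [handler(value), next];
  }
  return [token, pos + 1];
}

// Read the first form in a piece of source text.
function read(text) {
  return readForm(tokenize(text), 0)[0];
}

console.log(read("(a #upper b #count (x y z))"));
```

Because the handler only ever sees a value, a tool that knows nothing about `#upper` can still find where the tagged form begins and ends. A reader macro in the Common Lisp style would instead be handed the character stream, and could consume anything at all.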

To be clear, there's more to the story. As Fernando Borretti mentions in his article, Lisp's uniform syntax extends across the language. A macro invocation looks the same as a special form, a function call, or a defstruct. Disambiguating between the various semantics of a Lisp form requires you to understand the context of the form and how symbols within that form are bound to meanings within this context. Put more simply, a function call and a macro invocation can look the same, even though they may have totally different meanings. This is a problem, and it's a problem that directly arises from the simplicity of Lisp syntax I extol above. I don't have a solution to this problem other than to observe that if you're going to adopt Lisp syntax and face the problems of that syntax, you'd do well to fully understand and use the benefits of that syntax as compensation. Everything in engineering, as in life, is a tradeoff.

It's that last observation that's my main point. We live in a world where the mathematical tradition has, for centuries, been infix expressions. This has carried through to programming, which has also significantly converged on C-like syntax for its dominant languages. Lisp stands against both of these traditions in its choice of prefix expressions written in a simpler grammar than the norm. There are costs to this choice, and these costs tend to be immediately obvious. There are also benefits, and these benefits take time to make themselves known. If you have that time, it can be a rewarding thing to explore the possibilities, even if you never get the chance to use them directly in production.

Computing, back in the day...

mike@mschaef.com (Mike Schaeffer) — Fri, 20 Jan 2023 00:00:00 +0000

As a child of the 80's, I had a front row seat to the beginning of what was then called personal computing. My elementary school got its first Apple around the time I entered kindergarten. That was also the time personal computers were starting to make inroads into offices (largely thanks to VisiCalc and Lotus 1-2-3). By modern standards these machines weren't very good. At the time they were transformative. They brought computing to places it hadn't been before, and gave access to entirely new sets of people. For someone with an early adopter's mindset, it was an optimistic and exploratory time. It's for this reason (and the fact it was my childhood) that I like looking back on these old machines. That's something I hope to do here in an informal series of posts. If there happen to be a few lessons for modern computing along the way, so much the better.

If you're reading this, you're probably familiar with retrocomputing. It's easy to go to eBay, buy some used equipment, and play around with a period machine from the early 80's. Emulators make it even easier. As much as I appreciate the movement, it doesn't quite provide the full experience of the time. To put it in perspective, an Apple //e was a $4,000 purchase in today's money. This is before adding disk drives, software, or a monitor. After bringing it home, and turning it on, all you had was a black screen and a blinking prompt from Applesoft BASIC. If you needed help, you were limited to the manual, a few books and magazines at the local bookstore, and whoever else you happened to know. The costs were high, the utility wasn't obvious, and there wasn't a huge network of people to fall back on for help. It was a different time in a way retrocomputing doesn't quite capture.

My goal here is to talk about my own experiences in that time. What it was like to grow up with these machines, both in school and at home. It's one person's perspective (from a position of privilege) but hopefully it'll capture a little of the spirit of the day.

If you want a way to apply this to modern computing, I'd suggest thinking about the ways it was possible for these machines to be useful with such limited capabilities. I'm typing this on a laptop with a quarter million times the memory of an Apple //e. It's arguably surprising that the Apple was useful at all. But it was, and without much of the software and hardware we take for granted today. This suggests that we might have more ways to produce useful software systems than we think. Do you really need to take on the complexities of Kubernetes or React to meet your requirements? Maybe it's possible to bring a little of the minimalist spirit of 1983 forward, take advantage of the fact modern computers are as good as they are, and deliver more value for less cost.

Before I continue, I should start off by acknowledging just how privileged I am to be able to write these stories. I grew up in a stable family with enough extra resources to be able to devote a significant chunk of money to home computing. My dad is an engineer by training, with experience in computing dating back to the 60's. He was able to apply computing at his job in a number of capacities, and then had the desire and ability to bring it home. To support this, his employer offered financing programs to help employees buy their own home machines. For my mom's part, she taught third grade at my elementary school, which in 1983 (when I was in third grade) happened to be piloting a Logo programming course for third graders. Not only was I part of the course, my mom helped run the lab, and I often had free run after school to explore on my own. (At least one summer, when I was ten or eleven, I was responsible for setting up all the hardware in the lab for the upcoming school year.)

I didn't always see it at the time, but this was an amazingly uncommon set of circumstances. It literally set the direction of my intellectual and professional life, and is something I will always be thankful for. I am thankful to my parents, and also to the good fortune of the circumstances which enabled it to happen for us as a family. It could have been very different, and for most people, it was.

But before most of that, one of the first personal computers I was ever exposed to was my Uncle Herman's Timex Sinclair 1000. This was a Z80 machine, built in Clive Sinclair fashion - to the lowest possible price point. It was intended to be a machine for beginning hobbyists, and sold for $100. (In modern dollars, that's roughly the same as a low end iPad.) Uncle Herman had his TS1000 connected to a black and white TV and sitting on his kitchen table. It's the first and only time I've ever computed on an embroidered tablecloth.

The machine itself, as you might guess from the price, was dominated by its limitations. The first was memory. A stock 1000 had a total of 2KB of memory. KB. Not GB. Not MB. KB.

The second limitation of the machine was the keyboard. To save on cost, the keyboard was entirely membrane based. The keys were drawn on a sheet of flat plastic, didn't move when you pressed them, and offered no tactile feedback at all. The closest modern experience is the butterfly keyboard, over which Apple was sued and ultimately settled.

Fortunately for the machine, the software design had a trick up its sleeve that addressed both limitations at the same time. Like many other machines of the time, the 1000's only user interface was through a BASIC interpreter. When you plug the computer in (there was no power switch) you're immediately dropped into a REPL for a BASIC interpreter that serves as the command line interface. However, due to the memory limitations, the 1000 lacked space for a line editor. There wasn't enough memory in the machine to commit the bytes necessary to buffer a line of text character by character, before parsing it to a more memory efficient tokenized representation.

The solution to this problem was to allow users to enter BASIC code directly in tokenized form, without the need to parse text. Rather than typing the five characters PRINT and having the interpreter translate that to a one byte token code, the user directly pressed a button labeled PRINT. The code for the PRINT button then emitted the one byte code for that operation. This bypassed the need for both the string buffer and the parse/tokenize step.
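A small sketch makes the savings visible. This is not the 1000's actual firmware or token table: the token values below are made-up placeholders, and `enterAsText` and `enterAsKeys` are hypothetical names for the two entry styles.

```javascript
// Hypothetical one-byte token codes. These are placeholders for
// illustration, not the values used by the real machine.
const KEY_TOKENS = new Map([
  ["PRINT", 0x80],
  ["GOTO", 0x81],
  ["LET", 0x82],
]);

// Text entry: every character is buffered and must be parsed later.
function enterAsText(line) {
  return Array.from(line, (ch) => ch.charCodeAt(0));
}

// Keyword entry: each key press is either a keyword key (one token byte)
// or an ordinary character key (one character byte).
function enterAsKeys(keys) {
  return keys.map((key) =>
    KEY_TOKENS.has(key) ? KEY_TOKENS.get(key) : key.charCodeAt(0)
  );
}

const asText = enterAsText("PRINT 1");
const asKeys = enterAsKeys(["PRINT", "1"]);
console.log(asText.length, asKeys.length);
```

The text version needs a buffer of seven bytes and a parser. The keyword version produces two bytes directly from two key presses, with nothing left to parse.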

Beyond the reduced memory consumption of this approach, it also meant you could say PRINT with one keypress instead of five. This is good, given the lousy keyboard. There are also discoverability benefits. With each BASIC command labeled directly on the keyboard, it was easy for the beginner to see a list of the possible commands. The downside is that the number of possible operations is limited by the number of keys and shift states. (A problem shared by programmable pocket calculators of the time.)

Of course, the machine had other limitations too. Graphics were blocky and monochrome, and a lack of hardware forced a hard tradeoff between CPU and display refresh. It's easy to forget this now, but driving a display is a demanding task. Displays require continual refresh, and every pixel has to be driven every frame. If this doesn't happen the display goes blank. The 1000 was so down on hardware capacity that it forced a choice on the programmer. There were two modes for controlling the tradeoff between display refresh and execution speed. FAST mode gave faster execution of user code, at the expense of sacrificing display refresh. Run your code and the display goes blank. If you wanted simultaneous execution and display, you had to select SLOW mode and pay the performance price of multiplexing too little hardware to do too much work.

Despite these limitations, the machine did offer a few options for expansion. The motherboard exposed an edge connector on the back of the case. There were enough pins on this connector for a memory expansion module to hang off the back of the machine. 2K was easy to exhaust, so an extra 16K was a nice addition. The issue here is that the connection between the computer and the expansion module was unreliable. The module could rock back and forth as you typed and the machine would occasionally totally fail when the CPU lost its electrical connection to the expansion memory.

The usual mitigation strategy for an unreliable machine is to save your work often. This is a good idea in general, and even more advisable when pressing any given key might disconnect your CPU from its memory and totally crash the machine. The difficulty here is that the Timex only had an analog cassette tape interface for storage. I never did get this to work, but it provided at least theoretical persistent storage for your programs. The idea here is that the computer would encode a data stream as an analog signal that could be recorded on audio tape. Later, the signal could be played back from the tape to the computer to reconstruct the data in memory.
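The round trip can be sketched in a few lines. This is an illustration of the general idea only, not the Timex's actual tape format: the audio is abstracted to a list of pulse bursts, and the burst lengths, the threshold, and the function names are assumptions chosen for the example.

```javascript
// Illustrative assumption: a 0 bit is recorded as a short burst of pulses
// and a 1 bit as a long burst. The counts are made up for this sketch.
const PULSES = { 0: 4, 1: 9 };
const THRESHOLD = 6; // bursts longer than this are read back as a 1

// Turn bytes into a sequence of bursts, most significant bit first.
function encode(bytes) {
  const bursts = [];
  for (const byte of bytes) {
    for (let bit = 7; bit >= 0; bit--) {
      bursts.push(PULSES[(byte >> bit) & 1]);
    }
  }
  return bursts;
}

// Rebuild bytes from bursts, eight bursts per byte.
function decode(bursts) {
  const bytes = [];
  for (let i = 0; i < bursts.length; i += 8) {
    let byte = 0;
    for (const burst of bursts.slice(i, i + 8)) {
      byte = (byte << 1) | (burst > THRESHOLD ? 1 : 0);
    }
    bytes.push(byte);
  }
  return bytes;
}

const tape = encode([72, 105]);
console.log(decode(tape));
```

The threshold is what gives a scheme like this some tolerance for a noisy recording: a burst that comes back one pulse short or long still decodes to the same bit.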

This is not the best example of an old computer with a lot of utility. In fact, the closest analog to a Timex Sinclair 1000 might not have been a computer at all. Between the keystroke programming, limited memory, and flashing display, the 1000 was arguably closest in scope to a programmable pocket calculator. Even with those limitations, if you had a 1000, you had a machine you could use to learn programming. It was possible to get a taste of what personal computing was about, and decide about taking the next step.

Git Commits and Rebasing vs. Merging

mike@mschaef.com (Mike Schaeffer) — Sat, 05 Mar 2022 00:00:00 +0000

Over most of the ten years I've been using git, I've been a strong proponent of merging over rebasing. It seemed more honest to avoid rewriting commits and more likely to produce a complete history. There are also problems that arise when you rewrite shared history, and you can avoid those entirely if you just never rewrite history at all. While all of this is true, the hidden costs of the approach came to play an increasing role in my thinking, and these days, I essentially avoid merge entirely. The result has been an easier workflow, with a more useful history of more coherent commits.

History tracking in a tool like git serves a few development purposes, some tactical and some strategic. Tactically speaking, it's nice to be able to have confidence that you can always reset to a particular state of the codebase, no matter how badly you've screwed it up. It's easier to make "risky" changes to code when you know that you're a split second away from your last known good state. Further, git remotes give you easy access to a form of off site backup and tags give you the ability to label releases. Not only does the history in a tool like git make it easier to get to your last known good state during development, it also makes it easier to get back to the version you released last month before your dog destroyed your laptop.

At a strategic level, history tracking can give other longer term benefits. With a little effort, it's an excellent way to document the how and why of your code's evolution over time. Correctly done (and with an IDE), a good version history gives developers immediate access to the origin of each line of code, along with an explanation of how and why it got there. Of course, it takes effort to get there. Your history can easily devolve into a bunch of "WIP" messages in a randomly associated stream of commits. Like everything else in life worth doing, it takes effort to ensure that you actually have a commit history that can live up to its strategic value.

This starts with a commit history that people bother to read, and like everything else, it takes effort to produce something worth reading. For people to bother reading your commit history, they need to believe that it's worth the time spent to do so. For that to happen, enough effort needs to have been spent assembling the history that it's possible to understand what's being said. This is where the notion of a complete history runs into trouble. Just like historians curate facts into readable narratives, it is our responsibility as developers to take some time to curate our projects' change history. At least if we expect them to be read. My argument for rebasing over merging boils down to the fact that rebase/squash makes it easier to do this curation and produce a history that has these useful properties.

For a commit to be useful in the future as a point of documentation, it needs to contain a coherent unit of work. git thinks in terms of commits, so it's important that you also think in terms of commits. Being able to trust that a single commit contains a complete single change is useful both from the point of view of interpreting a history, and also from the point of view of using git to manipulate the history. It's easier to cherry-pick one commit with a useful change than it is three commits, each with a part of that one change.

Another way of putting this is that nobody cares about the history of how you developed a given feature. Imagine adding a field to a screen. You make a back end change in one commit, a front end change in the next, and then submit them both in one branch as a PR. A year after, does it really matter to you or to anybody else that you modified the back end first and then the front end? The two commits are just noise in the history. They document a state that never existed in anything like a production environment.

These two commits also introduce a certain degree of ongoing risk. Maybe you're trying to backport the added field into an earlier maintenance release of your software. What happens if you cherry-pick just one of the two commits into the maintenance release? Most likely, that results in a wholly invalid state that you may or may not detect in testing. Sure, the two commits honestly documented the history, but there's a cost. You lose documentation of the fact that both the front and back end changes are necessary parts of a single whole.

Given this argument for squashing, or curating, commits into useful atomic units, development branches largely reduce down to single commits. You may have a sequence of commits during development to personally track your work, but by the time you merge, you've squashed it down to one atomic commit describing one useful change. This simplifies your history directly, but it also makes it easier to rebase your development branch. Rebasing a branch with a single commit avoids introducing historical states that "never existed". The single commit also dramatically simplifies the process of merge conflict resolution. Rebase a branch with 10 commits, and you may have 10 sets of merge conflicts to resolve. Do you really care about the first nine? Will you really go back to those commits and verify that they still work post-rebase? If you don't, you're just dumping garbage in your commit log that might not even compile, much less run.

I'll close with the thought that this approach also lends itself to better commit messages. If there are fewer commits, there are fewer commit messages to write. With fewer commit messages to write, you can take more time on each to write something useful. It's also easier to write commit messages when your commits are self-contained atomic units. Squashing and curating commits is useful by itself in that it leads to a cleaner history, but it also leads to more opportunities to produce good and useful commit messages. It points in the direction of a virtuous cycle where positive changes drive other positive changes.

The Minimum Viable Product

mike@mschaef.com (Mike Schaeffer) — Fri, 14 Jan 2022 00:00:00 +0000

This image has been circulating on LinkedIn as a tongue-in-cheek example of a minimum viable product.

Of course, at least one of the responses was that it's not an MVP without some extras. It needs 24/7 monitoring or a video camera with a motion alarm. It needs to detect quakes that occur off hours or when you're otherwise away from the detector. The trouble with this statement is the same as with the initial claimed MVP status of this design - both claims make assumptions about requirements. The initial claim assumes you're okay missing quakes when you're not around and the second assumes you really do need to know. To identify an MVP, you need to understand what it means to be viable. You need to understand the goals and requirements of your stakeholders and user community.

Personally, I'm sympathetic to the initial claim that two googly eyes stuck on a sheet of construction paper might actually be a viable earthquake detector. As a Texan transplant to the Northeast, I'd never experienced an earthquake until the 2011 Virginia earthquake rattled the walls of my suburban Philly office. Not having any real idea what was going on, my co-workers and I walked over to a wall of windows to figure it out. Nothing bad happened, but it wasn't a smart move, and exactly the sort of thing a wall mounted earthquake detector might have helped avoid. The product doesn't do much, but it does do something, and that might well be enough that it's viable.

This viability, though, is contingent on the fact that there was no need to know about earthquakes that occurred off-hours. Add that requirement in, and more capability is needed. The power of the MVP is that it forces you to develop a better understanding of what it is that you're trying to accomplish. Getting to an MVP is less about the product and more about the requirements that drive the creation of that product.

In a field like technology, where practitioners are often attracted to the technology itself, the distinction between what is truly required and what is not can be easy to miss. Personally, I came into this field because I like building things. It's fun and rewarding to turn an idea into a working system. The trouble with the MVP from this point of view is that defining a truly minimum product may easily eliminate the need to build something cool. The answer may well be that no, you don't get to build the video detection system, because you don't need it and your time is better spent elsewhere. The notion of the MVP inherently pulls you away from the act of building and forces you to consider that there may be no immediate value in the thing you aim to build.

One of my first consulting engagements was years ago, for a bank building out a power trading system. They wanted to enter the business to hedge other trades, and the lack of a trading system to enforce controls limits was the reason they couldn't. Contrary to the advice of my team's leadership, they initially decided to scratch build a trading system in Java. There were two parts of this experience that spoke to the idea of understanding requirements and the scope of the minimum viable product.

The first case can be boiled down to the phrase 'training issue'. Coming from a background of packaged software development, my instincts at the time were completely aligned around building software that helps avoid user error. In mass market software, you can't train all of your users, so the software has to fill the gap. There's a higher standard for viability in that the software is required to do more and be more reliable.

This trading platform was different in that it was in-house software with a known user base that numbered in the dozens. With a user base that small and well known, it's feasible to just train everybody to avoid bugs. A crashing, high severity bug that might block a mass market software release might just be addressed by training users to avoid it. This can be much faster, which is important when the software schedule is blocking the business from operating in the first place. The software fix might not actually be required for the product to be viable. This was less about perfect software, and more about getting to minimum viability and getting out of the way of a business that needed to run.

The second part of the story is that most of the way through the first phase of the build, the client dropped the custom build entirely. Instead, they'd deploy a commercial trading platform with some light customizations. There was a lot less to build, and it went live much more quickly, particularly in the more complex subsequent phases of the work. It turned out that none of the detailed customizations enabled by the custom build were actually required.

Note that this is not fundamentally a negative message. What the MVP lets you do is lower the cost of your build by focusing on what is truly required. In the case of a trading organization, it can get your traders doing their job more quickly. In the case of an earthquake detector, maybe it means you can afford more than just one. Lowering the cost of your product can enable it to be used sooner and in more ways than otherwise.

The concept of an MVP has power because it focuses your attention on the actual requirements you're trying to meet. With that clearer focus, you can achieve lower costs by reducing your scope. This in turn implies you can afford to do more of real value with the limited resources you have available. It's not as much about doing less as it is about doing more of value with the resources you have at hand. That's a powerful thing, and something to keep in mind as you decide what you really must build.

Packaging Small Clojure Apps

mike@mschaef.com (Mike Schaeffer) — Wed, 12 Aug 2020 00:00:00 +0000

Like a lot of engineers, I have a handful of personal projects I keep around for various reasons. Some are useful and some are just for fun, but none of them get the same sort of investment as a funded commercial effort. The consequence of this is that it's all the more important to keep things as simple as possible, to focus the investment where it counts. Part of the way I achieve that is that I've spent some initial time putting together a standard packaging approach. I know, I know - "standard packaging approach" doesn't sound like "fun personal project" - but getting the packaging out of the way makes it easier to focus on the actual fun part - building out functionality. It's for that reason that I've also successfully used variants of this approach on smaller commercial projects too. Hopefully, this will be useful to you too.

Setting the stage, the top level view is this:

Uberjar packaging of single binaries using Leiningen and a few plugins.
Standard scripts and tools for packaging and install.
Use of existing Linux mechanisms for service control.
A heavy tendency toward 12 Factor principles.

What this gets you is a good local interactive development story and easy deployment to a server. I've also gotten it to work with client-side code, using Figwheel.

What it doesn't get you is direct support for large numbers of processes or servers. Modern hardware is fast and capable, so you may not have those requirements, but if you do, you'll need something heavier weight, to reduce both management overhead and costs. (In my day job, we've done some amazing things with Kubernetes.)

The example project I'm going to use is the engine for this blog, Rhinowiki. It's useful, but simple enough to be used as a model for a packaging approach. If you're also interested in strategies for managing apps with read/write persistence (SQL) and rich client code, I have a couple other programs packaged this way with those features. Even with these, the essentials of the packaging strategy are exactly the same as what I describe here:

Toto - Web app with persistence to an embedded SQL database managed with sql-file.
Metlog - Single Page App done with a Clojure back end, Reagent, ClojureScript, and Figwheel.

Everything begins with a traditional project.clj, and the project can be started locally with the usual lein run.

Once running, main immediately writes a herald log message at info level:

(defn -main [& args]
  (log/info "Starting Rhinowiki" (get-version))
  (let [config (load-config)]
    (log/debug "config" config)
    (webserver/start (:http-port config)
                     (blog/blog-routes (blog/blog-init config)))
    (log/info "end run.")))

This immediately lets you know the process has started, logs are working, and which version of the code is running. These are usually the first things verified after an install, so it's good to ensure they happen early on. This is particularly useful for software that's not interactive or running on slow hardware. (I've run some of this code on Raspberry Pi hardware that takes ten or so seconds to get to the startup herald.)

The way the version is acquired is interesting too. The call to get-version is really a macro invocation and not a function call.

(defmacro get-version []
  ;; Capture compile-time property definition from Lein
  (System/getProperty "rhinowiki.version"))

Because macros are evaluated at compile time, the macroexpansion of get-version has access to JVM properties defined at build time by Leiningen.
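To make the compile-time behavior concrete, here is a minimal sketch that can be pasted into a REPL. The macro name build-property and the property name demo.version are made up for illustration, and the property is set by hand here instead of by Leiningen.

```clojure
;; Hypothetical macro: expands to the value the named JVM system property
;; had at the moment the calling form was compiled. The argument must be a
;; literal string, because the macro sees the unevaluated form.
(defmacro build-property [prop-name]
  (System/getProperty prop-name))

;; Stand-in for the build tool defining the property before compilation.
(System/setProperty "demo.version" "1.0.0")

;; The macro call is expanded while this defn is compiled, so the function
;; body is effectively just the string literal "1.0.0".
(defn version-at-compile-time []
  (build-property "demo.version"))

;; Changing the property afterwards has no effect on the compiled function.
(System/setProperty "demo.version" "2.0.0")

(version-at-compile-time) ; → "1.0.0"
```

A plain function that called System/getProperty in its body would return "2.0.0" here, which is the difference the macro is exploiting.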

The next step is to pull in configuration settings using Anatoly Polinsky's cprop library (https://github.com/tolitius/cprop). cprop can do more than what I use it for here, but here, I use it to load a single EDN config file. cprop lets the name of that file be identified at startup via a system property, making it possible to define a file for each server, as well as a local config file specified in project.clj:

:jvm-opts ["-Dconf=local-config.edn"]
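The idea of picking the config file through a system property can be sketched without cprop at all. The following is not cprop's API or implementation, just a rough stand-in built on clojure.edn. The function name load-edn-config is hypothetical, and it assumes the same conf property name used above.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io])

;; Hypothetical helper: read one EDN map from the file named by the
;; "conf" system property (set with -Dconf=... on the JVM command line).
(defn load-edn-config []
  (if-let [path (System/getProperty "conf")]
    (with-open [r (java.io.PushbackReader. (io/reader path))]
      (edn/read r))
    (throw (ex-info "No config file given; start the JVM with -Dconf=<file>" {}))))
```

With -Dconf=local-config.edn in effect, (load-edn-config) returns whatever map that file contains, so (:http-port config) style lookups work the same way they do in main.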

I've also found it useful to minimize the number and power of configuration settings. Every setting that changes is a risk that something will break when you promote code. Every setting that doesn't change is a risk of introducing a bug in the settings code.

I also dump the configuration to a log very early in the startup process.

(log/debug "config" config)

Given the importance of configuration settings, it's occasionally important to be able to inspect the settings in use at run-time. However, this log is written at debug level, so it doesn't normally print. This reduces the risk of accidentally revealing secret keys in the log stream. Depending on the importance of those keys, there is also much more you can do to protect them, if preventing the risk is worth the effort.
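One low-effort step in that direction, sketched here as an assumption and not something Rhinowiki itself does, is to mask known-sensitive values before the configuration is logged. The key names and the redact helper are hypothetical.

```clojure
;; Hypothetical set of config keys that should never reach the log stream.
(def secret-keys #{:db-password :api-key})

;; Return the config with the value of every secret key that is present
;; replaced by a placeholder. Other keys pass through untouched.
(defn redact [config]
  (reduce (fn [m k]
            (if (contains? m k)
              (assoc m k "<redacted>")
              m))
          config
          secret-keys))

(redact {:http-port 8080 :api-key "s3cr3t"})
;; → {:http-port 8080, :api-key "<redacted>"}
```

The debug line would then log (redact config) in place of config. This only covers top-level keys, so nested secrets would need a recursive walk.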

After all that's done, main transfers control over to the actual application:

(webserver/start (:http-port config)
                 (blog/blog-routes (blog/blog-init config)))

With a configurable application running, the next step is to get it packaged in a way that lets us predictably install it elsewhere. The strategy here is a two step approach: build the code as an uberjar and include the uberjar in a self-contained .tar.gz as an installation package.

The installer package contains everything needed to install the software (the one exception being the JVM itself).
The package name includes the version number of the software: rhinowiki-0.3.3.tar.gz.
Files in the installation package all have a prefix (rhinowiki-install, in this case) to confine the installation files to a single directory when installing. This is to make it easy to avoid crosstalk between multiple installers and delete installation directories after you're done with an installation.
There is an idempotent installation script (install.sh) at the root of the package. Running this script either creates or updates an installation.
The software is installed as a Linux service.

The net result of this packaging is an installation/upgrade process that works like this:

tar xzvf rhinowiki-0.3.3.tar.gz
cd rhinowiki-install
sudo service rhinowiki stop
sudo ./install.sh
sudo service rhinowiki start

To get to this point, I use the Leiningen release task and the lein-tar plugin, both originally by Phil Hagelberg. There's a wrapper script, but the essential command is lein clean && lein release $RELEASE_LEVEL. This instructs Leiningen to execute a series of tasks listed in the release-tasks key in project.clj.

I've had to modify Leiningen's default list of release tasks, in two ways: I skip signing of tagged releases in git, and I invoke lein-tar rather than deploy. However, the full task list needs to be completely restated in project.clj (https://github.com/mschaef/rhinowiki/blob/master/project.clj#L42), so it's a lengthy setting.

:release-tasks [["vcs" "assert-committed"]
                ["change" "version" "leiningen.release/bump-version" "release"]
                ["vcs" "commit"]
                ["vcs" "tag" "--no-sign" ]
                ["tar"]
                ["change" "version" "leiningen.release/bump-version"]
                ["vcs" "commit"]
                ["vcs" "push"]]

The configuration for lein-tar is more straightforward - include the plugin, and specify a few options. The options request that the packaged output be written in the project root, include an uberjar, and extract into an install directory rather than just CWD.

:plugins [[lein-ring "0.9.7"]
          [lein-tar "3.3.0"]]

;; ...

:tar {:uberjar true
      :format :tar-gz
      :output-dir "."
      :leading-path "rhinowiki-install"}

Give the uberjar a specific fixed name:

:uberjar-name "rhinowiki-standalone.jar"

And populate it with a few files additional to the uberjar itself - lein-tar accepts these files in pkg/ at the root of the project directory hierarchy. These files include everything else needed to install the application - a configuration map for cprop, an install script, a service script, and log configuration.

The install script is the last part of the process. It's an idempotent script that, when run on a server with sudo, guarantees that the application is installed. It sets up users and groups, copies files from the package to wherever they belong, and uses update-rc.d to ensure that the service scripts are correctly installed.

This breaks down the packaging and installation process to the following:

./package.sh
scp package tarball to server and ssh in
Extract the package - tar xzvf rhinowiki-0.3.3.tar.gz
Change into the expanded package directory - cd rhinowiki-install
Stop any existing instances of the service - sudo service rhinowiki stop
Run the install script - sudo ./install.sh
(Re)Start the service - sudo service rhinowiki start

At this point, I've sketched out the approach end to end, and I hope it's evident that this can be used in fairly simple scenarios. Before I close, let me also talk about a few sharp edges to be aware of. Like every other engineering approach, this packaging strategy has tradeoffs, and some of these tradeoffs require specific compromises.

The first is that this approach requires dependencies (notably the JVM) to be manually installed on target servers. For smaller environments, this can be acceptable. For larger numbers of target VMs, almost definitely not.

The second is that there's nothing about persistence in this approach. It either needs to be managed externally, or the entire persistence story needs to be internal to the deployed uberjar. This is why I wrote sql-file, which provides a built in SQL database with schema migration support. Another approach is just to handle it altogether externally, which is what I do for Rhinowiki. The Rhinowiki store is a git repository, and it's managed out of band with respect to the deployment of Rhinowiki itself.

But these are both specific problems that can be managed for smaller applications. Often times, it's worth the costs associated with these problems, to gain the benefits of reducing the number of software components and moving pieces. If you're in a situation like that, I hope you give this approach a try and find it useful. Please let me know if you do.

Git Stuff

mike@mschaef.com (Mike Schaeffer) — Wed, 18 Sep 2019 00:00:00 +0000

Amazingly enough, git is now 14 years old. What started out as Linus Torvalds's 'three day' replacement for BitKeeper is now dominant enough in its domain that even the Windows Kernel is hosted on git. (If you really are amazed by the age of git, that last bit might be even more amazing.) In any event, I also use git and have done so for close to ten years. Along with a compiler and an editor, I'd consider it one of the three essential development tools. That experience has left me with a set of preconceived notions about how git should be used and some tips and tricks on how to use it better. I've been meaning to get it all into a single place for a while, and this is the attempt.

This isn't really the place to start learning git (that would be a tutorial). This is for people that have used git for a while, understand the basic mechanics, and want to look for ways to elevate their game and streamline their workflow.

The Underlying Data Model

git is built on a distinct data structure, and the implications of this structure permeate the user experience.

Understanding the underlying data model is important, and not that complicated from a computer science perspective.

Every revision of a source tree managed by git can be considered a complete snapshot of every source file. This is called a commit.
Every commit has a name (or address), which is a hash of the entire contents of the commit. These names are not user friendly (They look like d674bf514fc5e8301740534efa42a28ca4466afd), but they're essentially guaranteed to be unique.
If two commits have different contents, they also have different hashes. A hash is enough to completely identify a state of a source tree.
Because hashes are a pain to work with, git also has refs. Refs are user friendly symbolic names (master, fix-bug-branch) that can each point to a commit by hash.
Commits can't be mutated, because any change to their contents would change their name/hash. Refs are where git allows mutations to occur.
If you think of a ref as a variable that contains a hash and points to a commit, you're not far off.
Commits can themselves refer to other commits - each commit can contain references to zero or more predecessors. These backlinks are what allow git to construct a history of commits (and therefore a history of a source code tree).
The 'first commit' has zero predecessors, a merge commit has two or more.

The result of all this is that the core data structure is a directed acyclic graph, covered nicely in this post by Tommi Virtanen.
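The points above can be sketched as a toy model in a few lines of Clojure. This is not how git actually serializes or stores objects. Every name here (sha1-hex, make-commit, commit!, history) is made up for illustration, and the hash is taken over a printed Clojure map instead of git's real object format.

```clojure
(import 'java.security.MessageDigest)

;; Hex-encoded SHA-1 of a string.
(defn sha1-hex [^String s]
  (let [digest (.digest (MessageDigest/getInstance "SHA-1")
                        (.getBytes s "UTF-8"))]
    (apply str (map #(format "%02x" %) digest))))

;; A commit is an immutable snapshot plus its parent hashes. Its name is a
;; hash of that content, so changing the content would change the name.
(defn make-commit [tree parents]
  (let [content {:tree (into (sorted-map) tree) :parents (vec parents)}]
    (assoc content :hash (sha1-hex (pr-str content)))))

(def objects (atom {}))  ; hash -> commit; entries are never modified
(def refs    (atom {}))  ; ref name -> hash; the only place mutation happens

;; Record a new snapshot on a ref and move the ref to point at it.
(defn commit! [ref-name tree]
  (let [parent (get @refs ref-name)
        c      (make-commit tree (if parent [parent] []))]
    (swap! objects assoc (:hash c) c)
    (swap! refs assoc ref-name (:hash c))
    (:hash c)))

;; Walk the backlinks from a commit to the first commit. A merge would
;; list shared ancestors more than once; that is acceptable for a sketch.
(defn history [hash]
  (when-let [c (get @objects hash)]
    (cons hash (mapcat history (:parents c)))))
```

Calling (commit! "master" {"a.txt" "1"}) twice with different contents leaves two entries in objects and moves the master ref to the second, and (history (get @refs "master")) returns the two hashes newest first.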

Friend Authorization Checks and Compojure Routing

mike@mschaef.com (Mike Schaeffer) — Thu, 24 Jan 2019 00:00:00 +0000

Despite several good online resources, it's not necessarily obvious how friend's wrap-authorize interacts with Compojure routing.

This set of routes handles /4 incorrectly:

(defroutes app-routes
  (GET "/1" [] (site-page 1))
  (GET "/2" [] (site-page 2))
  (friend/wrap-authorize (GET "/3" [] (site-page 3)) #{::user})
  (GET "/4" [] (site-page 4)))

Any attempt to route to /4 for a user that doesn't have the ::user role will fail with the same error you would expect to (and do) get from an unauthorized attempt to route to /3. The reason this happens is that Compojure considers the four routes in the sequence in which they are listed and wrap-authorize works by throw-ing out if there is an authorization error (and aborting the routing entirely).

So, even though the code looks like the authorization check is associated with /3, it's really associated with the point in evaluation after /2 is considered, but before /3 or /4. So for an unauthorized user of /3, Compojure never considers either the /3 or /4 routes. /4 (and anything that might follow it) is hidden behind the same security as /3.

This is what's meant when the documentation says to do the authorization check after the routing and not before. Let the route decide if the authorization check gets run and then your other routes won't be impacted by authorization checks that don't apply.

What that looks like in code is this (with the friend/authorize check inside the body of the route):

(defroutes app-routes
  (GET "/1" [] (site-page 1))
  (GET "/2" [] (site-page 2))
  (GET "/3" [] (friend/authorize #{::user} (site-page 3)))
  (GET "/4" [] (site-page 4)))

The documentation does mention the use of context to help solve this problem. Where that plays a role is when a set of routes need to be hidden behind the same authorization check. But the essential point is to check and enforce authorization only after you know you need to do it.
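The ordering effect can be reproduced without either library. The sketch below is a toy stand-in, not friend's or Compojure's real implementation: route, first-match, authorize! and wrap-authorize-sketch are all hypothetical helpers, and roles are plain keywords on the request map.

```clojure
;; A route returns a response when the path matches and nil otherwise.
(defn route [path body-fn]
  (fn [req] (when (= path (:uri req)) (body-fn req))))

;; Try each handler in order and return the first non-nil response.
(defn first-match [& handlers]
  (fn [req] (some #(% req) handlers)))

;; Throw if the request lacks the role, aborting routing entirely.
(defn authorize! [req role]
  (when-not (contains? (:roles req) role)
    (throw (ex-info "unauthorized" {:uri (:uri req)}))))

;; Middleware-style check: runs BEFORE the wrapped route tests the path.
(defn wrap-authorize-sketch [handler role]
  (fn [req] (authorize! req role) (handler req)))

;; Check wrapped around the route: everything after /2 is guarded.
(def broken
  (first-match
   (route "/2" (fn [_] "page 2"))
   (wrap-authorize-sketch (route "/3" (fn [_] "page 3")) :user)
   (route "/4" (fn [_] "page 4"))))

;; Check inside the route body: only a matched /3 is guarded.
(def fixed
  (first-match
   (route "/2" (fn [_] "page 2"))
   (route "/3" (fn [req] (authorize! req :user) "page 3"))
   (route "/4" (fn [_] "page 4"))))

(fixed {:uri "/4" :roles #{}}) ; → "page 4"
```

With the same request, (broken {:uri "/4" :roles #{}}) throws before the /4 route is ever consulted, which is the behavior described above.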

Small Computing History Resources

mike@mschaef.com (Mike Schaeffer) — Fri, 21 Dec 2018 00:00:00 +0000

I've lately run across several interesting small computer history sites. If you have any interest in small computing's emergence from 1980 to 1990 or so, these are worth a look.

In no particular order:

OS/2 Museum - Covers OS/2, but also gets into detail around PC architecture. Among other interesting bits, this is just one of several articles on A20 gate handling, and here's something on the IBM 8514/A.
DTACK Grounded - A newsletter written to promote Hal Hardenbergh's side business of attached Motorola 68000 processor boards. Mostly interesting for his commentary on then-current events leading up to the emergence and use of 32-bit microprocessors. Notably, this was written at the time of Intel's pivot from the iAPX 432 to the 80386. The commentary on the relative unreliability of DRAM is amusing too.
CRPG Addict - Not sure how he has the time, but the author of this blog has set himself the challenge of playing through and documenting every early CRPG game from the late 70's and well into the 90's.
The Digital Antiquarian - Critical commentary on early small computer gaming. Lots of details about how games came to be made and their content.
Retrocomputing Stack Exchange site - This is currently more like Netflix than anything else. Coverage is spotty, but that doesn't mean you can't find something interesting to read.

Rhinowiki

mike@mschaef.com (Mike Schaeffer) — Fri, 03 Aug 2018 00:00:00 +0000

It's been a long time coming, but I've finally replaced blosxom with a custom CMS I've been writing called Rhinowiki. More than a serious attempt at a CMS, this is mainly a fun little side project to write some Clojure, experiment a bit with JGit, and hopefully make it easier to implement a few of my longer term plans that might have been tricky to do in straight Perl.

Full source in the link above, a high level summary here:

Everything is in Clojure.
Backend format is Markdown as interpreted by markdown-clj.
Source code is highlighted using highlight.js.
Markdown rendering is done entirely on the server, with syntax highlighting on the client. (I'm looking into Nashorn to run highlight.js server side too, but don't know if that's possible within my time constraints.)
Back end storage is managed with git and retrieved via JGit.
All requests are served out of memory.
There's a hand rolled (and conformant) Atom feed.
Also RSS 2.0.

Working with Directories

mike@mschaef.com (Mike Schaeffer) — Fri, 30 Sep 2016 00:00:00 +0000

This is a bash function definition that takes you to the top level directory of a git project.

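function cdtop() {
    local git_root

    # Ask git for the top level directory; fail if we're not in a work tree.
    git_root=$(git rev-parse --show-toplevel) || return 1

    # Quote the path so directories containing spaces work.
    cd "${git_root}"
}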

Here's a git alias that serves a similar purpose. What this does is define a new alias, exec, that executes a shell command in the current project's root.

git config --global alias.exec '!exec '

With this alias defined, you can say the following and it will take you to the project root.

cd `git exec pwd`

http://stackoverflow.com/questions/957928/is-there-a-way-to-get-the-git-root-directory-in-one-command

MacBook 2015 Revisited

mike@mschaef.com (Mike Schaeffer) — Wed, 29 Apr 2015 00:00:00 +0000

Since my last post, I dropped by an Apple Store to take a look at the 2015 MacBook. It is difficult to overstate how startlingly small the new machine is in person. I may be biased by the internal specifications, but the impression is much more 'big tablet' than 'small laptop'. The other standout feature was the touchpad. It continues Apple's tradition of high-quality touchpad implementations, removes the mechanical switch and hinge, and adds force sensitivity and haptic feedback. The mechanical simplifications alone are a worthwhile improvement.

I also spent some time typing on the keyboard. It's as shallow as you'd think, but the keys are very direct and have a positive feel. There's none of the subtle rattling found on most small keyboards and it registered every keypress. I'm not completely convinced yet, but it at least seems possible that this type of keyboard could become the preferred keyboard for some typists.

The performance of the machine is also a point of interest. Even the lightly loaded demo machine on the showroom floor had a few hiccups paging the display around from one virtual desktop to the next. Maybe it's nothing, but it does make me wonder if the machine can keep up with daily use, particularly after a few OSX updates have been released. (For me, I think it'd be fine, but I spend most of my time in Terminal, Emacs, and Safari, none of which are exactly heavy-hitters.)

RPN Calc Part 10 – Macros and the Intent of the Code

mike@mschaef.com (Mike Schaeffer) — Sat, 20 Dec 2014 00:00:00 +0000

One of the key attributes I look for when writing and reviewing code is that code should express the intent of the developer more than the mechanism used to achieve that intent. In other words, code should read as much as possible as if it were a description of the end goal to be achieved. The mechanism used to achieve that goal is secondary.

Over the years, I’ve found this emphasis improves the quality of a system by making it easier to write correct code. By removing the distraction of the mechanism underneath the code, it’s easier for the author of that code to stay in the mindset of the business process they’re implementing. To see what I mean, consider how hard it would be to query a SQL database if every query was forced to specify the details of each table scan, index lookup, sort, join, and filter. The power of SQL is that it eliminates the mechanism of the query from consideration and lets a developer focus on the logic itself. The computer handles the details. Compilers do the same sort of thing for high level languages: coding in Java means not worrying about register allocation, machine instruction ordering, or the details of free memory reclamation. In the short-term, these abstractions make it easier to think about the problem I’m being paid to solve. Over a longer time scale, the increased distance between the intent and mechanism makes it easier to improve the performance or reliability of a system. Adding an index can transparently change a SQL query plan and Java seamlessly made the jump from an interpreter to a compiler.

One of the unique sources of power in the Lisp family of languages is a combination of features that makes it easier to build the abstractions necessary to elevate code from mechanism to intent. The combination of dynamic typing, higher order functions, good data structures, and macros can make it possible to develop abstractions that allow developers to focus more on what matters, the intent of the paying customer, and less on what doesn’t. In this article, I’ll talk about what that looks like for the calculator example and how Clojure brings the tools needed to focus on the intent of the code.

To level set, I’m going to go back to the calculator’s addition command defined in the last installment of this series:

(fn [ { [x y & more] :stack } ]
   { :stack (cons (+ y x) more)})

Given a stack, this command removes the top two arguments from the stack, adds them, and pushes the result back on top of the stack. This stack:

1 2 5 7

becomes this stack:

3 5 7

While the Clojure addition command is shorter than the Java version, the Clojure version still includes a number of assumptions about the machinery used in the underlying implementation:

Calculator state is passed to the command as a map with a key :stack that holds the stack.
The input stack can be destructured as a sequence.
The output state is represented in a map allocated at the end of the command’s execution.
The output stack is a sequence of cons cells and the output of this command is stored in a newly allocated cell.
The command has a single point in time at which it begins execution.
The command has a single point in time at which it ends execution.
The execution of this command cannot overlap with other commands that manipulate the stack.

Truth be told, there isn’t a single item on this list that’s essential to the semantics of our addition command. Particularly in the case where a sequence of commands is linked together to make a composite command, every item on that list might be incorrect. This is because the state of the stack between elements of a composite command might not ever be directly visible to the user. Keeping that in mind, what would be nice is some kind of shorthand notation for stack operations that hides these implementation details. This type of notation would make it possible to express the intent of a command without the machinery. Fortunately, the programming language Forth has a stack effect notation often used in comments that might do the trick.

Forth is an interesting and unique language with a heavy dependency on stack manipulation. One of the coding conventions sometimes used in Forth development is that every ‘composite command’ (‘word’, in Forth terminology) is associated with a comment that shows a picture of the stack at the beginning and end of the command’s execution. For addition, such a comment might look like this:

: add ( x y -- x+y ) .... ;

This comment shows that the command takes two arguments off the top of the stack, ‘x’ and ‘y’, and returns a single value ‘x+y’. None of the details regarding how the stack is implemented are included in the comment. The only thing that’s left in the comment are the semantics of the operation. This is more or less perfect for defining a calculator command. Mapped into Clojure code, it might look something like this:

(stack-op [x y] [(+ x y)])

This Clojure form indicates a stack operation and has stack pictures that show the top of the stack both before and after the evaluation of the command. The notation is short, yes, but it’s particularly useful because it doesn’t overspecify the command by including the details of the mechanics. All that’s left in this notation is the intent of the command.

Of course, the mechanics of the command still need to be there for the command to work. The magic of macros in Clojure is that they make it easier to bridge the gap from the notation you want to the mechanism you need. Fortunately, all it takes in this case is a short three line macro that tells Clojure how to reconstitute a function definition from our new stack-op notation:

(defmacro stack-op [ before after ]
  `(fn [ { [ ~@before & more# ] :stack } ]
     { :stack (concat ~after more# ) } ) )

Squint your eyes, and the structure of the original Clojure add command function should be visible within the macro definition. That’s because this macro really serves as a kind of IDE snippet hosted by the compiler, providing blanks to be filled in with the macro parameters. Multiple calls to a macro are like expanding the same snippet multiple times with different parameters. The only difference is that when you expand a snippet within an IDE, it only helps you when you’re entering the code into the editor; the relationship between a block of code in the editor and the snippet from which it came is immediately lost. Macros preserve that relationship, and thanks to Lisp’s syntax, do so in a way that avoids some of the worst issues that plague C macros. This gives us both the more ‘intentional’ notation, as well as the ability to later change the underlying implementation in more profound ways.
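To make the snippet analogy concrete, here is the same macro filling in its blanks for a few different commands. The macro is restated so the example stands alone. The command names swap and dup are illustrative and are not taken from the calculator's source.

```clojure
;; The stack-op macro from above, restated so this block is self-contained.
(defmacro stack-op [before after]
  `(fn [{[~@before & more#] :stack}]
     {:stack (concat ~after more#)}))

;; Each definition is only a before/after picture of the top of the stack.
(def add  (stack-op [x y] [(+ x y)]))
(def swap (stack-op [x y] [y x]))   ; illustrative command
(def dup  (stack-op [x]   [x x]))   ; illustrative command

(add  {:stack [1 2 5 7]}) ; → {:stack (3 5 7)}
(swap {:stack [1 2 5]})   ; → {:stack (2 1 5)}
(dup  {:stack [1 2 5]})   ; → {:stack (1 1 2 5)}
```

Evaluating (macroexpand-1 '(stack-op [x y] [(+ x y)])) at a REPL shows the generated fn form, with the original destructuring and :stack map visible inside it.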

Before I close the post, I should mention that there are ways to approach this type of design in other languages. In C, the preprocessor provides access to compile-time macro expansion, and for Java and C#, code generation techniques are well accepted. For JavaScript, any of the higher level languages that compile into JavaScript can be viewed as particularly heavy-weight forms of this technique. Where Lisp and Clojure shine is that they make it easy by building it into the language directly. This post only scratches the surface, but the next post will continue the theme by exploring how we can improve the calculator now that we have a syntax that more accurately expresses our intent.

RPN Calc Part 9 – State and Commands in Clojure

mike@mschaef.com (Mike Schaeffer) — Mon, 15 Dec 2014 00:00:00 +0000

In my last post, I started porting the RPN calculator example from Java to Clojure, moving a functional program into a functional language. In this post, I finish the work and show how the Clojure calculator models both state and calculator commands.

Going back to the last post, the Clojure version of the Read-Eval-Print-Loop (REPL) has the following code.

(defn main []
  (loop [ state (make-initial-state) ]
    (let [command (parse-command-string (read-command-string state))]
      (if-let [new-state (apply-command state command)]
        (recur new-state)
        nil))))

As with the Java REPL, this function continually loops, gathering commands to evaluate, evaluating them against the current state, and printing the state after each command is executed. The REPL function controls the lifecycle of the calculator state from beginning to end, starting by invoking the state constructor function:

(defn make-initial-state []
  {
   :stack ()
   :regs (vec (take 20 (repeat 0)))
   })

As with main, the empty brackets signify that this is a 0-arity function, a function that takes 0 arguments. Looking back at the call site, this is why the name of the function appears by itself within the parentheses:

(make-initial-state)

If the function required arguments, they’d be to the right of the function name at the call site:

(make-initial-state  ... )

This is the way that Lisp-like languages represent function and macro call sites. Every function or macro call is syntactically a list delimited by parentheses. The first element of that list identifies the function or macro being invoked, and the arguments to that function or macro are in the second list position and beyond. This is the rule, and it is essentially universal, even including the syntax used to define functions. In this form, defn is the name of the function definition macro, and it takes the function name, argument list, and body as arguments:

(defn make-initial-state []
  {
   :stack ()
   :regs (vec (take 20 (repeat 0)))
   })

For this function, the body of the function is a single statement, a literal for a two element hash map. In Clojure, whenever run time control flow passes to an object literal, a new instance of that literal is constructed and populated.

{
 :stack ()
 :regs (vec (take 20 (repeat 0)))
 }
 

This one statement is thus the rough equivalent of calling a constructor and then a series of calls to populate the new object. Paraphrasing into faux-Java:

Mapping m = new Mapping();
 
m.put("stack", Sequence.EMPTY);
m.put("regs", vec(take(20, repeat(0))));

Once the state object is constructed, the first thing the REPL has to do is prompt the user for a command. The function to read a new command takes a state as an argument. This is so it can print out the state prior to prompting the user and reading the command string:

(defn read-command-string [ state ]
  (show-state state)
  (print "> ")
  (flush)
  (.readLine *in*))

This code should be fairly understandable, but the last line is worthy of an explicit comment. *in* is a reader that Clojure wraps around the usual java.lang.System.in, and the leading dot is Clojure syntax for invoking a method on that object. That last line is roughly equivalent to this Java code:

new BufferedReader(new InputStreamReader(System.in)).readLine();

There’s more use of Clojure/Java interoperability in the command parser:

(defn parse-command-string [ str ]
  (make-composite-command
   (map parse-single-command (.split (.trim str) "\\s+"))))

The Java-interop part is in this bit here:

(.split (.trim str) "\\s+")

Translating into Java:

str.trim().split("\\s+")

Because str is a java.lang.String, all the usual string methods are available. This makes it easy to use standard Java facilities to trim the leading and trailing white space from a string and then split it into space-delimited tokens. Going back to part 2 of this series, this is the original algorithm I used to handle multiple calculator commands entered at the same prompt.

The rest of parse-command-string also follows the original part-2 design: each token is parsed individually as a command, and the list of all commands is then assembled into a single composite command. The difference is that there’s less notation in the Clojure version, mainly due to the use of the higher-order function map. map applies a function to each element of an input sequence and returns a new sequence containing the results. This one function encapsulates a loop, two variable declarations, a constructor call, and the method call needed to populate the output sequence:

List<Command> subCmds = new LinkedList<Command>();
  
for (String subCmdStr : cmdStr.split("\\s+"))
    subCmds.add(parseSingleCommand(subCmdStr));

What’s nice about this is that eliminating the code eliminates the possibility of making certain kinds of errors. It also makes the code more about the intent of the logic, and less about the mechanism used to achieve that intent. This opens up optimization opportunities like Clojure’s lazy evaluation of mapping functions.
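
For what it's worth, Java 8 streams allow something close to the Clojure shape. This is only a sketch under assumptions: MapSketch and parseAll are hypothetical names, and the single-command parser is passed in as a function so the example stays self-contained.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class MapSketch {
    // Apply parseSingleCommand to every whitespace-delimited token and
    // collect the results, with no explicit loop or temporary list.
    static <T> List<T> parseAll(String cmdStr, Function<String, T> parseSingleCommand) {
        return Arrays.stream(cmdStr.trim().split("\\s+"))
                .map(parseSingleCommand)
                .collect(Collectors.toList());
    }
}
```

The loop, the list construction, and the add call are all gone from the call site; what remains is the statement of intent: parse each token.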

The final bit of new notation I’d like to point out is the way the Clojure version represents commands. Commands in the Clojure version of the calculator are functions on calculator state, represented as Clojure functions:

(fn [ { [x y & more] :stack } ]
    { :stack (cons (+ y x) more)})

This function, the addition command, accepts a state object and uses argument list destructuring to extract the stack portion of the state. It then assembles a new state object whose stack has the top two elements of the previous stack replaced by their sum. Rather than focusing on the machinery used to gather and manipulate stack arguments, Clojure’s notation makes it easier for the code behind the command to match the intent. As before, this helps reduce the chance for errors, and it also opens up new optimization opportunities.

(If you’ve read closely and are wondering what happened to regs, commands in the Clojure version of the calculator can actually return a partial state. If a command doesn’t return a state element, then the previous value for that state element is used in the next state. Because add doesn’t change regs, it doesn’t bother to return it.)
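
One way to picture the partial-state rule is as a map merge: start from the previous state and overlay whatever the command returned. The Java sketch below is my own illustration of that idea, not the calculator's actual code; PartialStateSketch, initialState, add, and applyCommand are all hypothetical names, and the state is modeled as a plain map with the top of the stack stored first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class PartialStateSketch {
    // Hypothetical initial state: a stack (top element first) and 20 zeroed registers.
    static Map<String, Object> initialState(double... stack) {
        List<Double> items = new ArrayList<>();
        for (double value : stack)
            items.add(value);
        List<Double> regs = new ArrayList<>(Collections.nCopies(20, 0.0));
        Map<String, Object> state = new HashMap<>();
        state.put("stack", items);
        state.put("regs", regs);
        return state;
    }

    // Addition command: returns only a new "stack" entry and never mentions "regs".
    static Map<String, Object> add(Map<String, Object> state) {
        @SuppressWarnings("unchecked")
        List<Double> stack = (List<Double>) state.get("stack");
        List<Double> next = new ArrayList<>();
        next.add(stack.get(1) + stack.get(0));        // y + x
        next.addAll(stack.subList(2, stack.size()));  // the rest of the stack
        Map<String, Object> partial = new HashMap<>();
        partial.put("stack", next);
        return partial;
    }

    // Assumed merge rule: keys the command did not return keep their previous values.
    static Map<String, Object> applyCommand(Map<String, Object> state,
            Function<Map<String, Object>, Map<String, Object>> command) {
        Map<String, Object> next = new HashMap<>(state);
        next.putAll(command.apply(state));
        return next;
    }
}
```

Applying add to a state whose stack is 1, 2, 5 gives a state whose stack is 3, 5 and whose registers are carried over untouched.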

RPN Calc Part 8 – Moving to Clojure

mike@mschaef.com (Mike Schaeffer) — Tue, 02 Dec 2014 00:00:00 +0000

So far in this series, I’ve taken a basic calculator written in Java and transformed it from a command-oriented procedural design into a more functional style. In some ways, this has made for simpler code: calculator state is better encapsulated in value objects, and explicit control flow structures have been replaced with domain-specific higher order functions. Unfortunately, Java wasn’t designed to be a functional language, so the notation has become progressively more cumbersome and lengthy. 151 lines of idiomatic Java is now 327 lines of inner classes, custom iterators, and inverted control flow patterns. It should be difficult to get this kind of code through a serious Java code review.

Despite this difficulty, there is value in the functional design approach; what we need is a new notation. To show what I mean, this article switches gears and ports the latest version of the calculator from Java to Clojure. This reduces the size of the code from 327 lines down to a more reasonable-for-the-functionality 82. More importantly, the new notation opens up new opportunities for better expressiveness and further optimization. Building on the Clojure port, I’ll ultimately build out a version of the calculator that uses eval for legitimate purposes, compiling calculator macros so that they run almost as fast as code written directly in Java.

The first step to understanding the Clojure port is to understand how it’s built from source. For the Java versions of the code, I used Apache Maven to automate the build process. Maven provides standard access to dependencies, a standard project directory structure, and a standard set of verbs for building, installing, and running the project. In the Clojure world, the equivalent tool is called Leiningen. It provides the same structure and services for a Clojure project as Maven does for a Java project, including the ability to pull in Maven dependencies. While it’s possible to build Clojure code with Maven, Leiningen is a better choice for new work, largely because it’s better integrated into the Clojure ecosystem out of the box.

For the RPN calculator project, the project definition file looks like this:

(defproject rpn-calc "0.1.0-SNAPSHOT"
  :description "KSM Partners - RPN Calculator"
 
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
 
  :dependencies [[org.clojure/clojure "1.5.0"]]
 
  :repl-options {
                 :host "0.0.0.0"
                 :port 53095
                 }
 
  :main rpn-calc.main)

This file contains the same sorts of information as the equivalent POM file for the Java version of the project. (In fact, Leiningen provides a way to take a Leiningen project definition file and translate it into an equivalent Maven pom.xml.) Rather than XML, the Leiningen project file is written as an S-expression, and it contains a few additional settings. Notably, the last line is the name of the project’s entry point: the function that gets called when Leiningen runs the project. For this project, rpn-calc.main is a function that ultimately delegates to one of three entry points for the three Clojure versions of the calculator. For this post, the implementation-specific entry point looks like this:

(defn main []
  (loop [ state (make-initial-state) ]
    (let [command (parse-command-string (read-command-string state))]
      (if-let [new-state (apply-command state command)]
        (recur new-state)
        nil))))

For comparison, here is the corresponding main loop from the Java version of the calculator:

public void main() throws Exception
{
    State state = new State();
 
    while(state != null) {
        System.out.println();
        showStack(state);
        System.out.print("> ");
 
        String cmdLine = System.console().readLine();
 
        if (cmdLine == null)
            break;
 
        Command cmd = parseCommandString(cmdLine);
 
        state = cmd.execute(state);
    }
}

Unwrapping the code, both function definitions include construction of the initial state and then the body of the Read-Eval-Print-Loop. These two lines of code include both elements.

(loop [ state (make-initial-state) ]
    ...
    (recur new-state))

The loop form, surrounded by parentheses, is the body of the loop. Any loop iteration variables are defined and initialized within the bracketed form at the beginning of the loop. In this case, a variable state is initialized to hold the value returned by a call to make-initial-state. Within the body of the loop, there can be one or more recur forms that jump back to the beginning of the loop and provide new values for all the iteration variables defined for the loop. This gives a bit more flexibility than Java’s while loop: there can be multiple jumps to the beginning of a loop.

The body of this loop form is entirely composed of a let form. A let form establishes local variable bindings over a block of source code and provides initial values for those variables. If this sounds a lot like a loop form without the looping, that’s exactly what it is.

(let [command (parse-command-string (read-command-string state))]
   ...)

This code calls read-command-string, passing in the current state, and then passes the returned command string into a call to parse-command-string. The result of this two-step read process is the Clojure equivalent of a command object, which is modeled as a function from a calculator state to a state.

Digressing a moment, there are several attributes of the Clojure syntax that are worth pointing out. The most important is that, as with most Lisps, parentheses play a major role in the syntax of the language. Parentheses (and braces and brackets) delimit all statements and expressions, group statements into logical blocks, delimit function definitions, and serve as the syntax for composite object literals. In contrast, a language like Java uses a combination of semicolons, braces, and parsing grammar to serve the same purposes. This gives Clojure a more homogeneous syntax, with fewer rules, that’s easier to parse and analyze. Explicit statement delimiters also allow Lisp more freedom to pick symbol names. Symbols in Lisp can include characters ('-', '<', '&', etc.) that infix languages can't use for the purpose, because the explicit statement grouping makes it easier to distinguish a symbol from its context. The topic of Lisp syntax is really interesting enough for its own lengthy series of posts and articles.

Going back to the Clojure calculator's main loop, the next statement in the loop is yet another binding form. Like loop, this binding form also includes an element of control flow.

(if-let [new-state (apply-command state command)]
   (recur new-state)
   nil)

It may be easiest to see the meaning of this block of code by paraphrasing it into Java:

State newState = applyCommand(state, command);
 
if (newState != null)
    return recur(newState);
else
    return null;

What if-let does is to establish a new local variable and then conditionally pick between two alternative control flow paths based on the value of the new variable. It’s a common pattern within code, so it’s good to have a specific syntax for the purpose. What’s interesting about Clojure, though, is that if the language didn’t have it built in, a programmer working in Clojure could add it with a macro and you couldn’t tell the difference from built-in features. (In fact, the default Clojure implementation of if-let is itself a macro.)

At this point, I’ve covered the basic structure of the Clojure project, as well as the project’s main entry point. Subsequent posts will cover modeling of application state within Clojure, as well as the command parser, and the commands themselves. Once I’ve covered the basic functionality of the calculator, I’ll use that as a starting point to discuss custom syntax for command definitions, and ultimately a compiler for the calculator.

RPN Calc Part 7 – Refactoring Loops with Reduce

mike@mschaef.com (Mike Schaeffer) — Sun, 01 Jun 2014 00:00:00 +0000

In the last installment of this series, we started using Java iterators to decompose the monolithic REPL (read-eval-print-loop) into modular components. This let us start decoupling the semantics of the REPL from the mechanisms that it uses to implement read, evaluate, and print. Unfortunately, the last version of rpncalc only modularized the command prompt itself: the ‘R’ in REPL. The evaluator and printer are still tightly bound to the main command loop. In this post I’ll use another kind of custom iterator to further decompose the main loop, breaking out the evaluator and leaving only the printer itself in the loop.

Going back to the original command loop from the stateobject version of rpncalc, the loop traverses two sequences of values in parallel.

state = new State();
 
while(running) {
    System.out.println();
    showStack();
    System.out.print("> ");
 
    String cmdLine = System.console().readLine();
 
    if (cmdLine == null)
        break;
 
    Command cmd = parseCommandString(cmdLine);
 
    State initialState = state;
 
    state = cmd.execute(state);
 
    lastState = initialState;
}

Neither of the two sequences this loop traverses is made explicit within the code; both are implicit in the sequence of values taken on by variables managed by the loop. The first sequence the loop traverses is the sequence of commands that the user enters at the console. This sequence manifests in the code as the sequence of values taken on by cmd through each iteration of the loop. The second sequence is similarly implicit: the sequence of states that state takes on through each iteration. Last post, when we added the CommandStateIterator, the key to that refactoring was that we took one of the implicit loop sequences and made it explicitly a sequence within the code. Having an explicit structure within the code for the sequence of commands provided a place for the loop to invoke the reader that wasn’t in the body of the loop itself.

// Set initial state
State state = new State();
 
// Loop over all input commands
for(Command cmd : new ConsoleCommandStream()) {
 
    // Evaluate the command and produce the next state.
    state = cmd.execute(state);
 
    if (state == null)
        break;
 
    // Print the current state
    showStack(state);
}

Looking forward, the next refactoring for the REPL is to make explicit the implicit sequence of result states in the same way we transformed the sequence of input commands. This will let us take our current loop, which loops over input commands, and turn it into a loop over states. The call to evaluate will be pushed into an iterator in the same way that we pushed the reader into an iterator in the last post. This leaves us with a main loop that simply loops over states and prints them out:

for(State state : new CommandStateReduction(new State(), new CommandStream()))
    showStack(state);

This code is short, but it’s dense: most of the logic is now outside the text of the loop, and within CommandStateReduction and CommandStream. The command stream is the same stream of commands used in the last version of rpncalc. The ‘command state reduction’ stream is the stream that invokes the commands to produce the sequence of states. I’ve given it the name ‘reduction’ because of the way it relates to reduce in functional programming. To see why, look back at the abstract class we’re using to model a command:

abstract class Command
{
    abstract State execute(State in);
}

Given a state, applying a command results in a new state, returned from the execute method. A second command can then be applied to the new state giving an even newer state, and there’s no inherent bound on the number of times this can happen. In this way, a sequence of commands applied to an initial state produces a corresponding sequence of output states. The sequence of output states is the sequence of command results that the REPL needs to print for each entered command. Each time a command is executed, the result state needs to be printed and stored for the next command.

The relationship between this and reduction comes from the fact that reduction combines the elements of a sequence into an aggregate result. Reducing + over a list of numbers gives the sum of those numbers. Applying a sequence of commands combines the effects of those commands into a single final result. The initial value that gets passed into the reduction is the initial state. The sequence over which the reduction is applied is the sequence of commands from the console. The combining operator is command application. The most significant difference between this and traditional reduce is that we need more than just the final result: we also need each intermediate result. (This makes our reduction more like Haskell’s scan operator.)
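
To make the distinction concrete, here is a small eager sketch of both operations, using numbers and + in place of states and commands. ScanSketch, reduce, and scan are names I've chosen for illustration, and this scan leaves the initial value out of its output, on the assumption that the REPL only prints the states that result from commands.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

class ScanSketch {
    // Traditional reduce: fold every item into the accumulator, keep only the end result.
    static <S, C> S reduce(S initial, Iterable<C> items, BiFunction<S, C, S> combine) {
        S acc = initial;
        for (C item : items)
            acc = combine.apply(acc, item);
        return acc;
    }

    // Scan: the same fold, but every intermediate accumulator value is kept.
    static <S, C> List<S> scan(S initial, Iterable<C> items, BiFunction<S, C, S> combine) {
        List<S> results = new ArrayList<>();
        S acc = initial;
        for (C item : items) {
            acc = combine.apply(acc, item);
            results.add(acc);
        }
        return results;
    }
}
```

Reducing + over 1, 2, 3 starting from 0 gives 6, while scanning gives 1, 3, 6. In the calculator, the accumulator is a State, each item is a Command, and the combining function is command execution.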

Practically speaking, CommandStateReduction is implemented as an Iterable. The constructor takes two arguments: the initial state before any commands are executed, and a sequence of commands to be executed.

class CommandStateReduction implements Iterable<State>
{
    CommandStateReduction(State initialState, Iterable<Command> cmds)

Note that the only property that the command state reduction requires of the sequence of commands is that it be Iterable and produce Commands. There’s nothing about the signature of the reduction iterator that requires the sequence of commands to be concrete and known ahead of time. This is useful, because our current command source is CommandStream, which lazily produces commands. Both the command stream and the command state reduction are lazily evaluated, and only operate when a caller makes a request. The command stream doesn’t read until the evaluator requests a command, and the evaluator doesn’t evaluate until the printer makes a request for a value. Despite the fact that it’s hidden behind a pipeline of iterable objects, the REPL still operates as it did before: first it reads, then it evaluates, then it prints, and then it loops back.

As with the command state iterator, most of the logic in command state reduction is handled with a single advanceIfNecessary method. The instance variable state is used to maintain the state between command applications:

private State state = initialState;
 
private boolean needsAdvance = true;
 
Iterator<Command> cmdIterator = cmds.iterator();
 
private void advanceIfNecessary()
{
    if (!needsAdvance)
        return;
 
    needsAdvance = false;
 
    if (cmdIterator.hasNext())
        state = cmdIterator.next().execute(state);
    else
        state = null;
}
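
To show how hasNext and next can sit on top of advanceIfNecessary, here is a self-contained sketch of the same pattern. It is an illustration rather than the rpncalc source: LazyReductionSketch is a hypothetical generic stand-in, with S playing the role of State, C the role of Command, and a BiFunction standing in for Command.execute.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

class LazyReductionSketch<S, C> implements Iterable<S> {
    private final S initialState;
    private final Iterable<C> cmds;
    private final BiFunction<S, C, S> execute;

    LazyReductionSketch(S initialState, Iterable<C> cmds, BiFunction<S, C, S> execute) {
        this.initialState = initialState;
        this.cmds = cmds;
        this.execute = execute;
    }

    @Override
    public Iterator<S> iterator() {
        final Iterator<C> cmdIterator = cmds.iterator();

        return new Iterator<S>() {
            private S state = initialState;
            private boolean needsAdvance = true;

            // Pull and execute at most one command, and only when a caller asks.
            private void advanceIfNecessary() {
                if (!needsAdvance)
                    return;

                needsAdvance = false;

                if (cmdIterator.hasNext())
                    state = execute.apply(state, cmdIterator.next());
                else
                    state = null;   // a null state marks the end of the sequence
            }

            @Override
            public boolean hasNext() {
                advanceIfNecessary();
                return state != null;
            }

            @Override
            public S next() {
                advanceIfNecessary();
                if (state == null)
                    throw new NoSuchElementException();
                needsAdvance = true;   // the following request triggers the next command
                return state;
            }
        };
    }
}
```

Because nothing happens until hasNext or next is called, a for-each loop over this object reads, evaluates, and prints one command at a time, in that order.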

Looking back at the code, the Java version of the RPN calculator has come a long way. From heavily procedural origins, we’ve added command pattern based undo logic, switched over to a functional style of implementation, and redesigned our main loop so that it operates via lazy operations on streams of values. We’ve taken a big step in the direction of functional programming. The downside has been in the size of the code. The functional style has many benefits, but it’s not a style that’s idiomatic to Java (at least before Java 8). Our code size has more than doubled from 150 to 320 LOC. In the next few entries of this series, we’ll continue evolving rpncalc, but switch over to Clojure. This will let us continue this line of development without getting buried in the syntax of Java.

Details in Java Code: Error Reporting and Loop Control Variables

mike@mschaef.com (Mike Schaeffer) — Mon, 31 Mar 2014 00:00:00 +0000

Sometimes, it’s easy to focus so much on the architecture of a system that the details of its implementation get lost. While it’s true that inattention to architectural concerns can cause a system to fail, it’s also true that poor attention to the details can undermine even the best overall system design. This post covers a few minor details of code structure that I’ve found to be useful in my work.

It’s a small thing, but one of my favorite utility methods is a short way to throw run-time exceptions.

public static void FAIL(String message)
{
    throw new RuntimeException(message);
}

Defining this method accomplishes a few useful goals. The first is that (with an import static) it makes it possible to throw a RuntimeException with 22 fewer characters of source text per call site. If you’re writing usefully descriptive error messages (which you should be), this can significantly improve the readability of the code. The text FAIL tends to stand out in source code listings, and bringing the error message closer to the left margin of the source text makes it more obvious. The symbol FAIL is also easy to identify with tools like grep, ack, and M-x occur.

To handle re-throw scenarios, it's also useful to have another definition that lets you specify a cause for the failure.

public static void FAIL(String message, Throwable cause)
{
    throw new RuntimeException(message, cause);
}
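
Here is a sketch of what call sites might look like with both overloads in place. FailSketch and parsePort are hypothetical names used only for this example.

```java
class FailSketch {
    public static void FAIL(String message) {
        throw new RuntimeException(message);
    }

    public static void FAIL(String message, Throwable cause) {
        throw new RuntimeException(message, cause);
    }

    // Hypothetical call site: validate a TCP port number given as text.
    static int parsePort(String text) {
        // FAIL is declared void, so the compiler cannot tell that it never
        // returns; port needs an initial value to satisfy definite assignment.
        int port = -1;

        try {
            port = Integer.parseInt(text);
        } catch (NumberFormatException ex) {
            FAIL("Port is not a number: " + text, ex);
        }

        if (port < 1 || port > 65535)
            FAIL("Port out of range: " + port);

        return port;
    }
}
```

The error paths stay close to the left margin and read almost like assertions, at the cost of the compiler not knowing that control never continues past a FAIL.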

Related to this is a useful naming convention for loop control variables. Thanks in large part to FORTRAN, and its mathematical heritage, it's very common to use the names i, j, and k for loop control variables. These names aren't very descriptive, but they're short and for small loop bodies, there's usually enough context that a longer name would be superfluous. (If your loop spans pages of text, you should use a more descriptive variable name... but first, you should try to break up your loop into sensible, testable functions.) One technique I've found useful for making loop control variables more obvious (and searchable) without going to fully descriptive variable names is to double up the letters, giving ii, jj, and kk.
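
As a small illustration of the doubled-letter convention, here is a nested loop written that way. LoopNamesSketch and transpose are made-up names for the example.

```java
class LoopNamesSketch {
    // Transpose a non-empty rectangular matrix.
    static int[][] transpose(int[][] rows) {
        int height = rows.length;
        int width = rows[0].length;
        int[][] result = new int[width][height];

        // Searching for "ii" or "jj" finds only the loop variables, whereas
        // searching for "i" or "j" matches most of the file.
        for (int ii = 0; ii < height; ii++)
            for (int jj = 0; jj < width; jj++)
                result[jj][ii] = rows[ii][jj];

        return result;
    }
}
```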

These are both small changes, but they both can improve the readability of the code. Try them out and see if you like them. If you disagree that they are improvements, it's easy to switch back.

Recent Blog Posts

mike@mschaef.com (Mike Schaeffer) — Wed, 26 Mar 2014 00:00:00 +0000

Update 2019-01-17: KSM recently redesigned their website in a way that removes the original blog. Because of this, I've taken some of what I wrote then for KSM and re-hosted it here. Thanks are due both to KSM Technology Partners for allowing me to do this and to the Wayback Machine for retaining the content. All the links below are updated to reflect the articles' new locations.

Sorry for the radio silence, but recently I've been focusing my writing time on the KSM Technology Partners Blog. My writing there is still technical in nature, but it tends to be more heavily focused on the JVM. If you're interested, here are a few of what I consider to be the highlights.

In mid-2013, I started out writing about how to use Runnable to explicitly enforce dynamic extent in Java. In a nutshell, this is a way to implement try-with-resources in versions of Java that don't have it built in to the language. I then used the dynamic extent technique to build a ThreadLocal that plays nicely with thread pools. This is useful because thread locals require an understanding of which thread you're running on, which thread pooling techniques can abstract away.

Later in the year, I focused more on Clojure, starting off with a quick bit on the relationship of lexical closures to Java inner classes. I also wrote about a particular kind of stack overflow exception that can happen with lazy sequences. Lazy sequences can nicely remove the need to use recursion while traversing their length, but each time two unrealized lazy sequences are combined, it adds to the recursive depth required to compute the first element. For me, this stack overflow was a difficult error to diagnose, because it seemed so counter-intuitive.

I'm also in the middle of a series of posts that relate the GoF command pattern to functional programming. The posts start off with Java, but will ultimately describe a Clojure implementation that compiles a stack based expression language into optimized Java bytecode. If you'd like to play with the code, it's on github.