Mike Schaeffer's Blog

Articles with tag: ksm
December 17, 2013

Throughout the last four parts of this series, the common theme has been that the state of the calculator program has been managed globally. This requires the main command loop to directly update global data after each command, to prepare the state for the next command. While this works, it would be nice to remove the need for the global update. This post talks about how that’s done in the functional version of rpncalc.

The last installment in the series updated the calculator by adding a Java class to model State. This allows the entire state of the calculator to be managed as a first-class entity within the program. That’s evident from the fact that the calculator now models state through a single global reference:

public class RpnCalc extends Calculator
{
    // ...
 
    private State state = null;

Before I continue, I should clarify by saying that state isn’t truly a global variable. Obviously state is an instance variable within a single instance of RpnCalc. However, because of the fact that the calculator program only runs with a single instance of RpnCalc, any instance variables within that class are effectively global within the program, and suffer many of the same problems associated with true globals. This parallel is also true for instance variables within singletons managed by Spring. They’re instance variables, but they act like globals and should be viewed with the same kind of suspicion.

To remove the global state, we’re going to take advantage of the fact that the last update to rpncalc changed the signature for a command, so that it returns a new copy of the state:

abstract class Command
    {
        State execute(State in)

This implies that the main command loop has direct access to the updated state, and can manage it as a local variable:

public void main()
        throws Exception
    {
        State state = new State();
 
        while(state != null) {
            // ...read and gather command...
            state = cmd.execute(state);
        }
    }

This eliminates the global variable, moving the same data to a single variable contained within a function definition. This hugely reduces the variable’s footprint within the code. The global variable can be read and updated by 200 lines of code, compared to 20 lines of code for the local variable. For a developer, this makes the management of state more evident within the code, and makes it easier to write and maintain correct code.

Unfortunately, there’s a problem. For most of the commands, storing state in a local variable works well: a command like addition only needs access to the current state to produce the next state. Where it breaks down is in the undo command. In contrast to every other command, undo doesn’t need access to the current state, it needs access to the previous state. Now that we’ve moved the state management entirely within the command loop function, there’s no way for the undo command to get access to the history of states. Given that undo is a requirement, we need to find a solution.

One approach would be to extend the command loop to maintain a full history of states, and then broaden the command function signature to accept both the current and previous states. Each command would then be a function on a history of states. This approach has some appeal, but a simpler way to achieve the same result is to acknowledge the fact that each state is a function of the previous state, and store a back reference within each state:

private class State
{
    State prev;
    // ...
 
    State()
    {
        prev = null;
        // ...
    }
 
    State(State original)
    {
        prev = original;
        // ...
    }
}

This makes undo easy:

cmds.put("undo", new Command() {
        public State execute(State in) {
            // skip past the copy made for 'undo', and get to the previous.
            return in.prev.prev;
        }
    });

This design is also closely related to how git stores commits: each commit stores a back link to the previous commit in the history. It does have potentially unbounded memory usage, but our state in rpncalc is small enough in size and our memory is large enough that this shouldn’t present a problem. If it ever did present a memory consumption problem, adding a bound on the length number of retained states is as simple as walking down the previous links and nulling the previous link of the last history element to retain.

Next up, we’ll start taking a bit more advantage of the newly functional nature of rpncalc, and start using functional techniques to decompose the main command loop into component modules.

December 10, 2013

In the last installment of this series, I built a basic undo facility on top of the command pattern. One of the problems with that implementation is that the Command class has to know too much about how to save and restore the overall state of the calculator. In this post, I’ll introduce a way around this problem.

To see the problem, note that every instance of `Commandq has a private member variable for each global state variable maintained by the calculator:

private Deque<Double> oldStack;
private Double[] oldRegs;

Commands also have methods to save and restore the global state from the calculator instance itself:

void saveState()
{
    oldStack = new LinkedList<Double>(stack);
    oldRegs = Arrays.copyOf(regs, regs.length);
}
 
void restoreState()
{
    stack = new LinkedList<Double>(oldStack);
    regs = Arrays.copyOf(oldRegs, oldRegs.length);
}

Aside from the fact that Commands are directly updating the global state of the calculator, this design makes it more difficult to extend or modify the calculator’s notion of state. Any change to the calculator’s design that that adds a global state variable forces the Command class to be extended to manage the new global state. This means at least adding new private fields to Command and updating saveState and restoreState implementations. The stateobject version of RpnCalc addresses this issue by introducing a new class for the purpose of managing calculator state: State.

As you might have guessed, State has the two fields we need to store our current notion of state:

Deque<Double> stack = null;
 Double[] regs = null;

There are also two constructors, one for constructing a new, blank state, and a copy constructor for duplicating an existing state:

State()
{
    stack = new LinkedList<Double>();
    regs = new Double[NUM_REGISTERS];
}
 
State(State original)
{
    stack = new LinkedList<Double>(original.stack);
    regs = Arrays.copyOf(original.regs, original.regs.length);
}

The State class simplifies the calculator’s state variable declaration down to a single member variable:

private State state = null;

Along with that simplification, Command no longer needs to worry about state management, so it’s back down to a single method:

abstract class Command
{
    abstract State execute(State in);
}

The biggest change here is that the execute method now returns an instance of State. With state objects, the behavior of a command is now that it accepts a ‘before’ state, and then returns the state that results from applying the command to the output state. Every implementation of Command, save for two, first makes a copy of the state using the copy constructor, and then updates the copied state and returns it to the caller. The command doesn’t touch the input state, aside from the initial copy operation (which isn’t an update).

At this point, you might want to re-read that last paragraph - it’s kind of a big deal. With this latest change to the calculator, we’ve really made two big improvements. The first is the change that motivated the post in the first place: Commands no longer need to know how to save and restore state and the no longer need to be aware of the contents of that state. The second change is that Commands are no longer modifying global state at all. Each command is now a Function, mapping from state to state.

This second aspect of the design change has many positive implications, but the first is that it makes it easier to implement undo. Because the input state passed into Command.execute isn’t touched, the caller can save off a copy of the previous state in its entirety. The calculator’s main command loop does just that:

Command cmd = parseCommandString(cmdLine);
 
State initialState = state;
 
state = cmd.execute(state);
 
LastState = initialState; // (lastState is an instance variable alongside state)

The undo command itself just ignores its input state and returns whatever the lastState was:

cmds.put("undo", new Command() {
        public State execute(State in) {
            return lastState;
        }
    });

This version of the calculator gets us most of the way to a mathematically functional implementation. The next installment will take us further by removing state and lastState.

November 19, 2013

One of the reasons given in the GoF book for the use of the command pattern is to support undo. By recording the commands executed by a user, and giving the commands the ability to reverse themselves, a user interface can be designed to allow users to undo mistakes they make. For this post, I’ll talk about how this is done by extending the last version of the calculator to support undo.

There are three main extensions to rpncalc that enable it to support undo. The first is that I’ve broadened the Command interface into an abstract class that supports operations for managing state. There is a new method, saveState, that saves a copy of the globally shared state within the command instance itself:

abstract class Command
{
    private Deque<Double> oldStack;
    private Double[] oldRegs;
 
    void saveState()
    {
        oldStack = new LinkedList<Double>(stack);
        oldRegs = Arrays.copyOf(regs, regs.length);
    }

There is also the inverse of saveState, restoreState:

void restoreState()
{
    stack = new LinkedList<Double>(oldStack);
    regs = Arrays.copyOf(oldRegs, oldRegs.length);
}

Taken together, these changes provide the two new fundamental operations necessary to support undo: Saving global state, in case an undo operation requires that it be restored at a later point in time. If the definition of global state ever broadens to include other variables, the change can be confined to these two methods. If there’s ever a more efficient way for a command to save state than a full copy, the change can be confined to these two methods. In this way the two methods provide an abstraction over the idea of ‘global state’. By adding methods (verbs) to manipulate global state, we’ve extended the language to allow direct statements about global state.

The next step is to use this new abstraction to implement undo. The first step is easy: each command needs to be modified to save the current global state before it executes. There are other ways to do this, but what I’ve done here is add a call to saveState at the beginning of each command implementation. This is simple, and for a small program like rpncalc, it works well:

cmds.put("+", new Command() {
        public void execute() {
            saveState();
 
            Double x = stack.pop();
            Double y = stack.pop();
 
            stack.push(x + y);
        }
    });

Implementing undo is just a bit trickier. One way to think about undo is that it restores the state at the beginning of the last command that was executed. To implement undo in these terms requires that we keep a record of the last executed command.

Command cmd = parseCommandString(cmdLine);
 
cmd.execute();
 
lastCmd = cmd;

From here, ‘undo’ is a new command that restores the state at the beginning of lastCmd:

cmds.put("undo", new Command() {
        public void execute() {
            saveState();
 
            lastCmd.restoreState();
        }
    });

Note that undo itself saves its own state. This makes undo a reversible operation: ‘undo’ can be ‘undone’.

To this point, we’ve developed the basic calculator idea around the idea of state contained in a collection of global variables. The methods saveState and `restoreStateq provide a nice, verb-oriented way to hide the details behind state management.

This approach has a great deal of headroom, but it doesn’t take advantage of the capabilities of the language to logically group data into tuples. The next installation of this series will look into a way we can use a Java object to package our entire calculator state into a single object that can be manipulated as a whole. If this post introduced verbs for managing state, the next post will introduce the noun.

October 28, 2013

If you’ve played around with the basic version of the Java calculator from the last post in this series, you may have tried to enter multiple commands on the same prompt line:

Begin: class com.ksmpartners.rpncalc.basic.RpnCalc

> 1 2
Uncaught Exception: For input string: "1 2"
java.lang.NumberFormatException: For input string: "1 2"

This is a convenient way to enter calculator commands, but it doesn’t work for two reasons. This post discusses why it fails, and how to fix it:

The immediate failure is because the parser considers the entire input string as a single command, without regard to whitespace. The parser looks for a command named "1 2", fails to find it, and then attempts to parse the entire string as a number. The number parse then fails and throws an exception that kills the process. This part of the problem can be fixed by altering the parser to split the input string into tokens, and parsing each token separately as its own command. Unfortunately, this shows the second more structural problem. The way the REPL is currently written, it’s built to read and execute a single Command for each prompt:

Command cmd = parseCommandString(cmdLine.trim());
 
cmd.execute();

To correct the second problem, the data flow between parseCommandString and execute needs to be widened so that the parser can return multiple commands for a single input string. The easiest way to achieve this is to have the parser return a list of Commands.

List<Command> cmds = parseCommandString(cmdLine.trim());
 
for(Command cmd : cmds)
    cmd.execute();

This approach works well because it’s simple. It doesn’t add any additional class definitions, there’s still a single call site for execute(), and java.util.List is well-understood by developers that use Java. However, if you’re willing to trade some of that simplicity away, there’s an alternative approach to the problem. The alternative approach allows the original REPL to go unchanged, and still process groups of commands. It does this by introducing a new and more powerful kind of command object that can be used to compose lists of commands into single Command instances. The parser then uses these CompositeCommands to signal to the REPL that there are multiple commands to be executed:

private Command parseCommandString(String cmdStr)
    throws Exception
{
    List<Command> subCmds = new LinkedList<Command>();
 
    for (String subCmdStr : cmdStr.split("\\s+"))
        subCmds.add(parseSingleCommand(subCmdStr));
 
    return new CompositeCommand(subCmds);
}

There are a few things to notice about this code. The first is that the type signature of parseCommandString is unchanged: it’s still a function from String to Command. This is why the REPL doesn’t need to be updated. The second aspect of the code is that there’s still a list of command objects: The list of commands we introduced when we modified the REPL example is still present, even with the CompositeCommand. What the composite command object does is hide the list from the REPL; It wraps the list up, abstracts it away, and lets the REPL pretend that there aren’t any lists.

The internal machinery that lets composite command achieve this isn’t too involved. All CompositeCommand has to do is remember the Command objects it’s composing, and execute each of those commands as if they were entered individually:

private class CompositeCommand implements Command
{
    private List<Command> subCmds = new LinkedList<Command>();
 
    CompositeCommand(Collection<Command> subCmds)
    {
        this.subCmds.addAll(subCmds);
    }
 
    public void execute()
    {
        for(Command subCmd : subCmds)
            subCmd.execute();
    }
}

And that’s all there is to it. While this approach doesn’t have the surface simplicity of our first attempt, what it has done for us is give us a new abstraction for sequences of commands. This is fully a third of the ‘sequence’, ‘selection’, and ‘iteration’ that are necessary to express any computable program. That’s not a bad result for a single class definition.

Older Articles...