Mike Schaeffer's Blog

August 19, 2013

As part of a team conversation this morning, I worked up a quick Java translation of some more-interesting-than-it-looks Clojure code. It’s a good example of how lexical closures map to objects.

The code we started out with was this:

(defn make-adder [x]
  (let [y x]
    (fn [z] (+ y z))))
 
(def add2 (make-adder 2))
 
(add2 4)

What this code does is define and return a new type of function that adds values to the constant x. In Java, it looks like this:

// Not needed in Clojure... the Clojure runtime implicitly gives types
// that look similar to this to all Clojure functions.
interface Function
{
   Integer invoke(Integer z);
}
 
public class Foo
{
   // (defn make-adder [x]
   //   (let [y x]
   //     (fn [z] (+ y z))))
   public static Function makeAdder(Integer x)
   {
       final Integer y = x;
 
       return new Function() {
           public Integer invoke(Integer z) {
               return z + y;
           }
       }
   }
 
   public static int main(String[] args)
   {
       // (def add2 (make-adder 2))
       Function add2 = Foo.makeAdder(2);
 
       // (add2 4)
       System.out.println(add2.invoke(4));
   }
}

Five lines of Clojure code translate into 30 lines of Java: a function definition becomes a class definition, with state.

This idiom is powerful, but coming from Java, the power is hidden behind unusual syntax and terse notation. There are good reasons for both the syntax and the notation, but if you’re not used to either, it’s very easy to look at a page of Clojure code and be completely lost. Getting past this barrier by developing an intuitive feel for the language is a major challenge faced by teams transitioning to Clojure and Scala. One of the goals I have for my posts in this blog is help fellow developers along the way. It should be a fun ride.

August 6, 2013

Occasionally, it’s useful to be able to print nicely formatted tables of data to a textual output stream. This is particularly the case when writing command line tools. To make the output of these tools more readable, any tables they write should have columns that line-up from row to row. The Unix tool ls does this when it prints out long form directory listings. In this example, notice how the dates line up, even though the file size column varies in width.

drwxr-xr-x+ 1 mschaef Domain Users        0 Oct  3 09:20 docs
-rwxr-xr-x  1 mschaef Domain Users 29109013 Oct 10 13:38 file.zip
-rwxr-xr-x  1 mschaef Domain Users    77500 Oct 10 13:17 file2.zip

To accomplish this, it’s necessary to accumulate all of the lines of text to be written, compute the column widths when all lines are known, and then print the lines out, with appropriate padding to ensure that columns occupy the same width in each row. This is easy to accomplish, with just a bit of reusable Java.

package com.ksmpartners.utility;
 
import java.util.LinkedList;
import java.util.List;
 
import org.apache.commons.lang3.StringUtils;
 
public class TableBuilder
{
    List<String[]> rows = new LinkedList<String[]>();
 
    public void addRow(String... cols)
    {
        rows.add(cols);
    }
 
    private int[] colWidths()
    {
        int cols = -1;
 
        for(String[] row : rows)
            cols = Math.max(cols, row.length);
 
        int[] widths = new int[cols];
 
        for(String[] row : rows) {
            for(int colNum = 0; colNum < row.length; colNum++) {
                widths[colNum] =
                    Math.max(
                        widths[colNum],
                        StringUtils.length(row[colNum]));
            }
        }
 
        return widths;
    }
 
    @Override
    public String toString()
    {
        StringBuilder buf = new StringBuilder();
 
        int[] colWidths = colWidths();
 
        for(String[] row : rows) {
            for(int colNum = 0; colNum < row.length; colNum++) {
                buf.append(
                    StringUtils.rightPad(
                        StringUtils.defaultString(
                            row[colNum]), colWidths[colNum]));
                buf.append(' ');
            }
 
            buf.append('\n');
        }
 
        return buf.toString();
    }
 
}

The calling convention for this class is very much in line with the calling convention for Java’s StringBuilder.

TableBuilder tb = new TableBuilder();
 
tb.addRow("alpha", "beta", "gamma");
tb.addRow("-----", "----", "-----");
tb.addRow("1", "20000000", "foo");
tb.addRow("x", "yzz", "y");
 
System.out.println(tb.toString());

That code will write the following output:

alpha beta     gamma
----- ----     -----
1     20000000 foo
x     yzz      y

This isn’t necessarily the prettiest output in the world, but it’s easy to accomplish and is much better than many of the alternatives.

Tags:javaksm
July 27, 2013

The other day, our team ran across an interesting design problem related to Java synchronization. Without getting into the details of why this problem came up, the gist of it was that we had to somehow to write a synchronization wrapper around OutputStream. (Well, technically, it was a BufferedWriter, but the issue is the same.) This wrapper needed to correctly allows multiple unsynchronized writer threads to write to an underlying writer, each thread atomically writing lines of text. We wound up managing to avoid having to do this by changing our interface, but the solution to the OutputStream problem still provides an interesting look at a lesser-known aspect of the Java Runtime: ThreadLocal variables.

To illustrate the problem, consider this simpler version of the OutputStream interface. (Unlike the full version, this interface eliminates the two bulk-write forms of the write method, which don’t matter for this conversation.)

interface SimpleOutputStream {
    void write(int byte);
}

The simplest way to writing a synchronized wrapper for this interface is to wrap each call to the underlying implementation method in a synchronized block. This wrapper then guarantees that each thread entering write will have exclusive and atomic access to write its byte to the underlying implementation.

public void write(int byte)
{
    synchronized(underlying)
    {
        underlying.write(byte);
    }
}

The difficulty with this approach is that the locking is too fine grained to produce a coherent stream of output bytes on the underlying writer. A context switch in the middle of a line of output text will cause that line to contain bytes from two source threads, with no way to distinguish one thread’s data from another. The only thing the locking has provided is a guarantee that we won’t have multiple threads reentrantly calling into the underlying stream. What we need is a way for our wrapper to buffer the lines of text on a per-thread basis, and then write full lines within a synchronized block, and this is where ThreadLocal comes into play.

An instance of ThreadLocal is exactly what it sounds like: a container for a value that is local to a thread. Each thread that gets a value from an instance of a ThreadLocal gets its own, unique copy of that value. Ignoring how exactly it might work, this is the abstraction that will enable our implementation of OutputStream to buffer lines of text from each writer thread, prior to writing them out. The key is is in the specific use of the thread local.

ThreadLocal threadBuf = new ThreadLocal() {
    protected StringBuffer initialValue() {
        return new StringBuffer();
    }
};

This code allocates an instance of a local derivation of ThreadLocal specialized to hold a StringBuffer. The overrided initialValue method determines the initial value of the ThreadLocal for each thread that retrieves a value – it is called once for each thread. In this specific case, it allocates a new StringBuffer for each thread that requests a value from threadBuf. This provides us with a line buffer per thread. From here on out, the implementation is largely connecting the dots.

First, we need an implementation of write that buffers into threadBuf:

public void write(int b)
      throws IOException
{
    threadBuf.get().append((char)b);
 
    if (b == (int)'\n')
        flush();
}

The get method pulls the thread local buffer out of threadBuf. Note that because this buffer is thread local, it is not shared, so we don’t need to synchronize on it.

Second, we need an implementation of flush that atomically writes the current thread line buffer to the underlying output:

protected void flush()
    throws IOException
{
    synchronized(underlying) {
        underlying.write(threadBuf.get().toString().getBytes());
 
        threadBuf.get().setLength(0);
    }
}

Here we do need synchronization, because underlying is shared between threads, even if the threadBuf is not.

While there are still a few implementation details to worry about, this provides the bulk of the functionality we were originally looking for. Multiple threads can write into an output stream wrapped in this way, and the output will be synchronized on a per-line basis. If you’d like to experiment with the idea a bit, or read through an implementation, I’ve put code on github. This sample project contains a synchronized output stream, as well as a test program that launches multiple threads, all writing to System.out, via our synchronized. A command line flag lets you turn the wrapper off to see the ill effects of finer grained synchronization.

Tags:javaksm
July 22, 2013

One of the best things about writing code in the Java ecosystem is that so much of the underlying platform is open source. This makes it easy to get good answers to questions about how the platform actually works. To illustrate, I’ll walk through the JVM code to show why Java I/O isn’t interruptable. This will explain why threads performing Java I/O can’t be interrupted.

The core issue with interrupting a Java thread performing I/O is that the underlying system call is uninterruptable. Let’s see why that is, for a FileInputStream.

The start of the call stack is in the Java standard class library. In a traditional source tree, this is located under: ${JDK_SRC_ROOT}/jdk/src/share/classes/. In keeping with Java tradition, the class library code is structured with a directory per package. Looking at the code for FileInputStream, both bulk read operations delegate to a method readBytes:

public int read(byte b[]) throws IOException {
    return readBytes(b, 0, b.length);
}
// ...
public int read(byte b[], int off, int len) throws IOException {
    return readBytes(b, off, len);
}

readBytes, however, is declared to be native:

private native int readBytes(byte b[], int off, int len) throws IOException;

The native code for the class library is located in a parallel directory structure. ${JDK_SRC_ROOT}/jdk/src/share/native/.... Like the Java code, this native code is structured with a file-per-class and directory-per package. Looking at this code, you can see that the function name is structured to include the java class name, and the prototype contains extra parameters that the JVM uses to manage internal state. env stores a pointer to the current thread’s JVM environment, and this is an explicit declaration of the usual implicit Java this argument. The remaining three arguments are the arguments declared in the original source file.

JNIEXPORT jint JNICALL
Java_java_io_FileInputStream_readBytes(JNIEnv *env, jobject this,
    jbyteArray bytes, jint off, jint len) {
    return readBytes(env, this, bytes, off, len, fis_fd);
}

At this point, the class library is calling into another layer of code, outside the class library. Most of the Java policy surrounding File I/O is implemented in readBytes. (Note particularly the call to malloc in line 23…. traditional File I/O in Java is implemented by reading into a local buffer, and then copying from the local buffer into the Java byte[] initially passed into int read(byte buf[]). This double-copy is slow, but it also requires heap allocation of a second read buffer, if the read buffer is large than 8K. The last time I was reading this code, it was to diagnose an OutOfMemory caused by an over-large buffer size.)

jint
readBytes(JNIEnv *env, jobject this, jbyteArray bytes,
          jint off, jint len, jfieldID fid)
{
    jint nread;
    char stackBuf[BUF_SIZE];
    char *buf = NULL;
    FD fd;
 
    if (IS_NULL(bytes)) {
        JNU_ThrowNullPointerException(env, NULL);
        return -1;
    }
 
    if (outOfBounds(env, off, len, bytes)) {
        JNU_ThrowByName(env, "java/lang/IndexOutOfBoundsException", NULL);
        return -1;
    }
 
    if (len == 0) {
        return 0;
    } else if (len > BUF_SIZE) {
        buf = malloc(len);
        if (buf == NULL) {
            JNU_ThrowOutOfMemoryError(env, NULL);
            return 0;
        }
    } else {
        buf = stackBuf;
    }
 
    fd = GET_FD(this, fid);
    if (fd == -1) {
        JNU_ThrowIOException(env, "Stream Closed");
        nread = -1;
    } else {
        nread = IO_Read(fd, buf, len);
        if (nread > 0) {
            (*env)->SetByteArrayRegion(env, bytes, off, nread, (jbyte *)buf);
        } else if (nread == JVM_IO_ERR) {
            JNU_ThrowIOExceptionWithLastError(env, "Read error");
        } else if (nread == JVM_IO_INTR) {
            JNU_ThrowByName(env, "java/io/InterruptedIOException", NULL);
        } else { /* EOF */
            nread = -1;
        }
    }
 
    if (buf != stackBuf) {
        free(buf);
    }
    return nread;
}

The meat of the read is done by the IO_Read in line 37. This is aliased with a preprocessor definition to JVM_Read, which is a JVM primitive operation. JVM primitives are outside the Java class library, and are in the HotSpot JVM itself. This particular primitive is defined in ${JDK_SRC_ROOT}/hotspot/src/share/vm/prims/jvm.cpp. (In case you’re wondering how I’ve been finding these functions outside of the class library, I usually use a code text search facility.)

JVM_LEAF(jint, JVM_Read(jint fd, char *buf, jint nbytes))
  JVMWrapper2("JVM_Read (0x%x)", fd);
 
  //%note jvm_r6
  return (jint)os::restartable_read(fd, buf, nbytes);
JVM_END

The Java read operation is the point where the code path goes from code common to all JVM platforms and into OS specific code. For the Solaris (Unix) code path, the definition looks like this.

size_t os::restartable_read(int fd, void *buf, unsigned int nBytes) {
  INTERRUPTIBLE_RETURN_INT(::read(fd, buf, nBytes), os::Solaris::clear_interrupted);
}

Line 2 of this code, finally is the OS system call itself: ::read. However, it’s wrapped in a macro call to INTERRUPTIBLE_RETURN_INT. This macro turns out to be a standard retry loop for a Unix system call.

#define INTERRUPTIBLE_RETURN_INT(_cmd, _clear) do { \
  int _result; \
  do { \
    INTERRUPTIBLE(_cmd, _result, _clear); \
  } while((_result == OS_ERR) && (errno == EINTR)); \
  return _result; \
} while(false)

This macro expansion turns into a loop that will repeatedly issue the system call as long as it returns EINTR - the return code for an interrupted system call. As long as the system call doesn’t outright fail, the JVM will keep retrying the read if it’s interrupted. To get interruptable I/O semantics, you have to call the OS differently.

Slava Pestov has written a nice piece on how EINTR is used by Unix.

Unix’s use of EINTR is one of the original aspects of Unix that led to ‘worse is better’. Other contemporary operating systems of the time went to greater lengths to handle long running system calls. ITS would interrupt the system call, and then arrange for it to be restarted after the interruption, but with parameters that allow it to pick up where it left off.

See Also: The original paper on ITS system call restarts.

Tags:javaksm
July 2, 2013

I recently blogged about solving problems with thread local storage. In that instance, thread local storage was a way to add a form of ‘after the fact’ synchronization to an interface that was initially designed to be called from a single thread. Thread local storage is useful for this purpose, because it allows each thread to easily isolate the data it manipulates from other threads. Unfortunately, while this is the strength of thread local storage, it is also the weakness. In modern multi-threaded designs, threading issues are often abstracted behind thread pools and work queues. With these abstractions at work, the threads themselves become an implementation detail, and thread local variables are too low level to serve some useful scenarios.

One way to address this issue is to use some of the techniques from an earlier post on dynamic extent. The general gist of the idea is to provide Runnable‘s with way of reconstructing relevant thread local state at they time they are invoked on a pool thread. This maps well to the idea of ‘dynamic extent’, as presented in the earlier post. ‘Establishing the precondition’ is initializing the thread locals for the run, and ‘Establishing the post condition’ is restoring their original values. Here’s how it might look in code:

// This is the thread local storage that we want to migrate to worker threads.
ThreadLocal<Object> tls = new ThreadLocal<Object>();
 
Runnable bindToTls(final Runnable runnable)
{
   return new Runnable() {
      // Captured at the time/thread making the initial call to bindToTls
      final Object boundTls = tls.get();
 
      public void run() {
         // Captured at the time the Runnable is invoked
         Object initialTls = tls.get();
 
         try {
            tls.set(boundTls);
 
            runnable.run();
         } finally {
            tls.set(initialTls);
         }
   };
}

Calling bindToTls on a Runnable then gives a new instance of Runnable that remembers the preserved thread local state at the time of the call to bindToTls. The returned Runnable can then be provided to a thread pool, and when it is run, it will remember the thread local state. At the end of the run, it then ensures that the thread local is restored to its original value. (Which eliminates the possibility of certain classes of errors, including potential memory leaks.)

One caveat to this version of bindToTls is that is only migrates the one thread local variable that it’s been written to migrate. While this could be made considerably more general, my suggestion is that it’s usually unnecessary to do so. My current project uses one instance of bindToTls to provide a custom Spring scope. From there, Spring provides all the generality we need, and the lower level thread management code is kept contained, which is as it should be.

Tags:javaksm
May 30, 2012

In my Lisp programming, I find myself using Anaphoric Macros quite a bit. My first exposure to this type of macro (and deliberate variable capture) was in Paul Graham's On Lisp. Since I haven't been able to find Emacs Lisp implementations of these macos, I wrote my own.

The first of the two macros is an anaphoric version of the standard if special form:

(defmacro aif (test if-expr &optional else-expr)
  "An anaphoric variant of (if ...). The value of the test
expression is locally bound to 'it' during execution of the
consequent clauses. The binding is present in both consequent
branches."
  (declare (indent 1))
  `(let ((it ,test))
     (if it ,if-expr ,else-expr)))

The second macro is an anaphoric version of while:

(defmacro awhile (test &rest body)
  "An anaphoric varient of (while ...). The value of the test
expression is locally bound to 'it' during execution of the body
of the loop."
  (declare (indent 1))
  (let ((escape (gensym "awhile-escape-")))
    `(catch ',escape
       (while t
         (let ((it ,test))
           (if it
               (progn ,@body)
             (throw ',escape ())))))))

What both of these macros have in common is that they emulate an existing conditional special form, while adding a local binding that makes it possible to access the result of the condition. This is particularly useful in scenarios where a predicate function returns a true value that contains useful information beyond t or nil.

January 12, 2012

Not too long ago, I wrote a bit on life with an iPhone 3G. Since then, Apple has revised the platform a few times, and I'verecently upgraded to the iPhone 4S. This makes now as good a time as any to revisit the points in my earlier post to see what has changed:

  • Touch Screen - The Apple touch screen is about as good as it gets. The size is a good balance between utility and portability, the hardware is well executed, and the software is very, very fluid. That said, there's still the problem that touch screens eliminate the tactile feedback you get from physical buttons. It's harder to use the phone when your eyes aren't visually focused on the display. This limitation is innate to touch screens, but it's still annoying.

  • 'Ambient Information' - iOS 5 handles notifications much more nicely than in earlier versions of iOS. However, the homescreen is still largely dead to ambient information. The only two exceptions are the numeric badges attached to icons and the calendar icon (which displays the current date). The clock icon is wrong, the weather is wrong, and the map is wrong. My hunch is that this is partially to save on battery life, but given that the iPod nano can keep an analog clock icon current, some of this limitation seems gratuitous.

  • Inconvenient Portrait/Landscape Switching - Fixed with a nice lock facility in the task switcher. (Although I rarely use the lock, so maybe it wasn't a big problem after all.)

  • Multiple e-Mail boxes - Fixed in iOS 4 with the unified mailbox view.

  • Large e-mails - I'm not honestly sure if this has been fixed, or if I just get fewer large e-mails, but I haven't noticed this nearly as much.

  • Latency - The iPhone 4S is almost completely beyond reproach. (The latency on my old 3G got terrible, with the upgrade to iOS 4.0, and the subseuent patches did nothing to correct it.)

  • App Store Rejections - There seems to have been less public drama lately around App Store rejections and policy changes. However, I'm suspecting it's mainly because Apple has given a little and developers have come to grudgingly accept the limitations that Apple still imposes.

  • App Store - It's grown to 500,000 (!) applications, but Apple still controls the horizontal and the vertical. Because they control the way applications are displayed, they have a huge degree of control over the exposure their ISV's get and the revenues those ISV's earn.

  • Keyboard - After over three years, it's still tedious and error-prone. It works, but just. What's changed in my thinking over the last couple years is that I no longer care. For me, the iPhone is almost entirely about content consumption, and the keyboard doesn't really matter that much.

  • Industrial Design - I still love the way the phone looks and feels. What's different for me is that I no longer bother with the add on case.

A couple years ago, this is where I said I wouldn't switch away from an iPhone. I recently replaced one iPhone with another, so for me, this is still mostly true; The iPhone has evolved nicely over the years, and it still fits my needs better than the alternatives. However, two things have changed in the last few years. The first is that there's now a reasonable competitor. Unlike then, the alternative to iOS isn't Windows Mobile 6.5... the modern alternative, Android, has a touch screen, a modern web browser, and a fully stocked app store. Unless Apple sues Android into submission, it has lost these things as competitive differentiators.

The second thing that's changed for me in the last couple years is more personal. As much as I like the iPhone, I can't shake the feeling that it isn't a net improvement to my overall standard of living. Amy Breesman said it well when she was recently quoted in an NPR Story: " I would almost say it's, like, a negative effect that it's had on my life. It's just kind of this rabbit hole that you're always going down.". Maybe I'd miss it more if it were gone, but I can't shake the feeling that the time spent on the phone would be better spent elsewhere. Then again, I wouldn't have known about that NPR quotation, unless I had heard it on the NPR app in my phone.

June 29, 2011

One typical property of Lisp systems is that they they intern symbols. Roughly speaking, when symbols are interned, two symbols with the same print name will also have the same identity. This design choice has several significant implications elsewhere in the Lisp implementation. It is also one of the places where Clojure differs from Lisp tradition.

In code, the most basic version of the intern algorithm is easy to express:

(define (intern! symbol-name)
   (unless (hash-has? *symbol-table* symbol-name)
      (hash-set! *symbol-table* symbol-name (make-symbol symbol-name)))
   (hash-ref *symbol-table* symbol-name))

This code returns the symbol with the given name in the global symbol table. If there's not already a symbol under that name in the global table, it creates a symbol with that name and stores it in the hash prior to returning it.. This ensures that make-symbol is only called once for each symbol-name, and the symbol stored in *symbol-table* is always the symbol returned for a given name. Any two calls to intern! with the same name are therefore guaranteed to return the exact, same, eq? symbol object. At a vCalc REPL, this looks like so (The fact that both symbols are printed with ##0 implies that they have the same identity.):

user> (intern! "test-symbol")
; ##0 = test-symbol
user> (intern! "test-symbol")
; ##0 = test-symbol

This design has several properties that have historially been useful when implementing a Lisp. First, by sharing the internal representation of symbols with the same print name, interning can reduce memory consumption. A careful programmer can write an implementation of interned symbols that doesn't allocate any memory on the heap unless it sees a new, distinct symbol. Interning also gives a (theoretically) cheaper mechanism for comparing two symbols for equality. Enforcing symbol identity equality for symbol name equality implies that symbol name equality can be reduced to a single machine instruction. In the early days of Lisp, these were very significant advantages. With modern hardware, they are less important. However, the semantics of interned symbols do still differ in important ways.

One example of this is that interned symbols make it easy to provide a global environment 'for free'. To see what I mean by this, here is the vCalc declaration of a symbol:

struct
{
     ...
     lref_t vcell;  // Current Global Variable Binding
     ...
} symbol;

Each symbol carries with it three fields that are specific to each symbol, and are created and initialized at the time the symbol is created. Because vcell for the symbol is created at the same time as the symbol, the global variable named by the symbol is created at the same time as the symbol itself. Accessing the value of that global variable is done through a field stored at an offset relative to the beginning of the symbol. These benefits also accrue to property lists, as they can also be stored in a field of a symbol. This is a cheap implementation strategy for global variables and property lists, but it comes at the cost of imposing a tight coupling between two distinct concepts: symbols and the global environment.

The upside of this coupling is that it encourages the use of global symbol attributes (bindings and properties). During interactive programming at a REPL, global bindings turn out to be useful because they make it easy to 'say the name' of the bindings to the environment. For bindingsthat directly map to symbols, the symbol itself is sufficient to name the binding and use it during debugging. Consider this definition:

(define *current-counter-value* 0)

(define (next-counter-value)
   (incr! *current-counter-value*)
   *current-counter-value*)

This definition of next-counter-value makes it easy to inspect the current counter value. It's stored in a global variable binding, so it can be inspected and modified during debugging using its name: *current-counter-value*. A more modular style of programming would store the current counter value in a binding local to the definition of next-counter-value:

(let ((current-counter-value 0))
  (define (next-counter-value)
    (incr! current-counter-value)
    current-counter-value))

This is 'better' from a stylistic point of view, because it reduces the scope of the binding of current-counter-value entirely to the scope of the next-counter-value function. This eliminates the chance that 'somebody else' will break encapsulation and modify the counter value in a harmful fashion. Unfortunately, 'somebody else' also includes the REPL itself. The 'better' design imposes the cost that it's no longer as easy to inspect or modify the current-counter-value from the REPL. (Doing so requires the ability to inspect or name the local bindings captured in the next-counter-value closure.)

The tight coupling between interned symbols and global variable bindings should not come as a suprise, because interning a symbol necessarily makes the symbol itself global. In a Lisp that interns symbols, the following code fragment creates two distinct local variable bindings, despite the fact that the bindings are named by the same, eq? symbol: local-variable.

(let ((local-variable 0))
   (let ((local-variable 0))
      local-variable))

The mismatch between globally interned symbols and local bindings implies that symbols cannot as directly be involved in talking about local bindings. A Common Lisp type declaration is an S-expression that says something about the variable named by a symbol.

(declare (fixnum el))

In contrast, a Clojure type declaration is a reader expression that attaches metadata to the symbol itself:

^String x

The ^ syntax in Clojure gathers up metadata and then applies it using withMeta to the next expression in the input stream. In the case of a type declaration, the metadata gets applied to the symbol naming the binding. This can be done in one of two ways. The first is to destructively update metadata attached to an interned symbol. If Clojure had done this, then each occurrance of symbol metadata would overwrite whatever metadata was there before, and that one copy of the metadata would apply to every occurance of the symbol in the source text. Every variable with the same name would have to have the same type declarations.

Clojure took the other approach, and avoids the problem by not interning symbols. This allows metadata to be bound to a symbol locally. In the Clojure equivalent of the local-variable example, there are two local variables and each are named by two distinct symbols (but with the same name).

(let [local-variable 0]
   (let [local-variable 0]
      local-variable))

The flexibility of this approch is useful, but it comes at the cost of losing the ability to store values in the symbols themselves. Global symbol property lists in Lisp have to be modeled using some other means. Global variable bindings are not stored in symbols, but rather in Vars. (This is something that compiled Lisps tend to do anyway.) These changes result in symbols in Clojure becoming slightly 'smaller' than in Lisp, and more well aligned with how they are used in moodern, lexically scoped Lisps. Because there are still global variable bindings availble in the language, the naming benefits of globals are still available for use in the REPL. It's taken a while for me to get there, but the overall effect of un-interned symbols on the design of a Lisp seems generally positive.

September 12, 2009

It took long enough, but finally, I've taken the time to set up a better workflow for this blog:

  • The master copy of the blog contents is no longer on the server. It's now on one of my personal machines.
  • I'm managing site history using git . This was a nice idea, but git and blosxom have a fundamental difference of opinion on the importance of file datestamps. blosxom relies on datestamps to assign dates to posts and git deliberately updates datestamps to work with build systems. There are ways to reconcile the two, but it's not worth the time right now.
  • Uploads to the server are done with rsync invoked through a makefile. (ssh's public key authentication makes this blazingly fast and easy.)

Maybe now, I'll finally get around to writing a little more. (Or, I could investigate incorporating Markdown, or the Baseline CSS Framework, or....)

August 5, 2009

I've been meaning to write this for months... after switching to an iPhone last October I have some thoughts on the transition away from Windows Mobile. Most of my detailed comments are complaints, so before I continue, it's worth saying that I do think the iPhone is the best smart phone you can buy. It is, by far, the best answer the industry has come up with for this class of device. That said, it's more fun (and potentially useful) to complain:

  • Touch Screen - I remember shopping with my parents for a car in the late 80's. One of the cars we looked at was a Buick Riveria with a touch screen in the center console. It was cool, but since it lacked tactile feedback, you had to be looking at it to use it. Flash forward 23 years, and you can replicate this experience in the palm of your hand, for better or for worse.

  • 'Ambient Information' - The phone does a poor job of making inforation ambiently available. To see your next appointment, you need to open the Calendar unless the reminder has already displayed. (This could go on the home page.) To be notified of a new e-mail, you need to unlock the phone and look at the home page. (This could be a LED on the case.) As notifications build up, they wind up truncated and incomplete, presumably so they can fit in an artificially small box on the screen.

  • Portrait/Landscape - I wake up in the morning and want to check e-mail before I get up.. I grab the phone off the nightstand, look at the display, and it... switches to landscape mode. I'm driving down the road and want to skip a track, so I grab the phone (eyes on the road), put my finger in the general area where the 'forward' button is, and it... switches to landscape mode. Landscape is useful when you need it, and a usability menace when you don't. There needs to be better control over when it engages and when it doesn't. (In this case, physical buttons for skipping forward and backward among tracks might be nice too... Buick ultimately dropped the touch screen entirely, and modern cars with navigation tend to also offer physical controls for key functions.)

  • e-Mail - I have two e-mail accounts set up on my phone: personal and business. It takes five taps to switch between them. A unified view would be nice. (A list of the union of all inboxes, color-coded by in-box). An easier way to pick an in-box would be almost as nice.

  • Large e-mails - By default, large e-mails are only partially downloaded to the phone and there's a button at the bottom of a one of these mails that lets you download the rest. Of course, once it does, it then zips you back to the top of the mail, so you have to manually scroll through the (remember, it's large) e-mail to get to where you were reading. Argh.

  • Latency - Maybe a 3GS would fix this, but the phone seems very slow to change modes and update the display. I find myself continually waiting split-seconds for the thing to animate the transition from one display to the next. I'm asking a lot here, but I don't care... I use the thing most waking hours of most days.

  • App Store Rejections - This is a problem, it sucks for app developers, and it won't matter to the success of the platform. The vast majority of customers will never hear that Apple censored a dictionary (!), and even if they did, it won't stop them from buying. In the short term, my guess is that Apple will make whatever minimal changes it needs to make to keep developers quiet enough, and the iPhone will continue to do very well. Phone buyers don't care enough about choice, and App developers will tend to always want to code for the platform where they have the best shot at making money, which is currently the iPhone. In the long term, my guess is that a lot more of this content will wind up on mobile web sites than through the store. After all, a website can be an icon on the home page, avoid the risk of Apple's rejecton, and also get to run on Android, Pre, and Windows Mobile.

  • App Store - 25,000 applications on the site, and I might look at 10 or 20 before deciding to make a purchase. The way the store presents applications (controlled by Apple) has a huge impact on which apps succeed and which apps fail. Even if the rejection problem magically goes away, Apple still controls the horizontal and the vertical. (A lot like Google's control over the fate of websites...)

  • Keyboard - After ten months, it's still tedious and error-prone for me. It works, but just. Apple should provide a keyboard layout that works like a Blackberry (or even T9) and trades off multiple letters per key in exchange for larger keys.

  • Industrial Design - I love the way the phone looks and feels, so I wrap it in a tacky add on case to 'protect it'. So does most everybody else. Last I heard, good design was about making a product that looks good and works well. The rampant sales of cases implies to me that something is missing with the 'works well' part of that equation.

I do like the thing, and I wouldn't switch away, but it's far from perfect. Let's hope it gets better.

Older Articles...