Friday, 13 September 2013

Writing a simple Lisp interpreter in Clojure

Having spent some time recently looking at functional programming and Lisp, I inevitably ended up having a look at Clojure. I started out looking at Common Lisp, but while it is clearly very powerful, it did seem quite an old-school experience too. Seeing as I've been working in the Java world for years, it made a lot of sense to have a look at Clojure instead, a modern, clean rethink of Lisp that runs on the JVM and benefits from a lot of the tooling that this platform offers. Meeting up with the London Clojurians for a code dojo also proved that there are a lot of very bright, enthusiastic and helpful Clojure coders out there and that was a real inspiration too.

The exercise I did of writing a simple Lisp interpreter in Scala proved so useful and interesting that I decided to do the same with Clojure. What could be better than implementing Lisp in another Lisp? And a useful experience it turned to be this time as well.

An initial solution


The resulting code lives on github for anyone who's interested in having a look. I think the end result looks OK, although being new to Clojure I'm sure there's a lot of scope for improvement. I'd be very grateful for any suggestions on how to simplify things, and make them more idiomatic Clojure.

Counting line numbers, the solution is roughly the same size as my previous Scala solution, at a couple of hundred lines of code. There are definitely some things that are simpler in the Clojure version. For example the environment, i.e. the data structure that contains symbol bindings. In Clojure, I implemented this as a function that when passed a value returns the corresponding solution. The neat thing is that Clojure maps behave like this, i.e. you can do this:

=> (def m {:a 1 :b 2 :c 3})
m
=> (m :a)
1

So a simple map is used for the predefined, root environment, and nested environments are implemented by a function that wraps the parent environment.

Predefined arithmetic and comparison operators were trivial to define as we could just delegate to the named functions in Clojure itself.

Strictly functional


Closure is a functional language and only provides immutable variables and collections. This meant that unlike in my Scala version where I "cheated" and used a mutable map in the environment, that was not really an option. This meant that the eval function can't simply take in the environment and mutate it at will, it has to return the new version of the environment alongside the result of the evaluation. This does seem like a cleaner, more "proper" functional design though (and it would have been better to do the Scala version this way too).

Another thing that emerged due to the immutable nature of the environment is that there's no way to write a lambda that calls itself recursively. I.e. if you write:

(define factorial
   (lambda (n)
       (if (<= n 1)
            1
            (* n (factorial (- n 1))))))


Then the "factorial" symbol is not defined in the environment at the time of creating the lambda, so it won't be able to look up and call itself. Looking around, this is the case in Common Lisp as well. There, the labels special operator lets you access defined names immediately.

To work around this problem, I added a special form, defn, that makes the lambda available in the environment of the lambda itself. My implementation of this uses a Clojure atom as one point of mutable that gets round the chicken-and-egg situation of making the lambda available in the environment that's passed in to the lambda when it's created.

Pattern matching


The Scala solution benefited greatly from the use of pattern matching in the eval function, so I had a look at the core.match library which provides similar functionality in Clojure. This worked out pretty well so I went along with this solution. It's quite impressive that functionality like this can be provided via a library, using macros, as opposed to being built into the language itself.

There were a few problems though: when matching more complex patterns with this library, I started getting an error "java.lang.ClassFormatError: Invalid method Code length". This did indeed seem to be related to the length of the function, as I only started getting it after adding a certain number of patterns to a match expression. I didn't investigate this further, but ended up limiting myself to simple patterns instead and doing further destructuring in a separate let expression. For example, I would do:

    [["eq?" & args]] (let [[arg1 arg2 arg3] args
                           _ (error-if (or (nil? arg1) (nil? arg2) arg3)
                                "Exactly two arguments expected for 'eq?'")


instead of destructuring arg1 and arg2 in the match pattern itself.

Maybe this is a bug in core.match that will be resolved at some point, I didn't have time to investigate this further though. And admittedly the eval function in this code is unusually long, longer than one would typically write a function, though in this case I don't think it would help readability to split it into several functions.

Final thoughts


So, the "write a simple Lisp" exercise proved useful once again. This strictly functional nature of this one helped shed light on a few more subtleties of Lisp and functional programming. It was also an excellent way to try out a few different parts of the Clojure language and related tools, such as data structures, functions, recursion using loop and recur. Again, I highly recommend this exercise as a way to learn a language.

Postscript


Since I last did this exercise I have realised that it's quite a common thing to do. I've come across some weird and wonderful little Lisp implementations along the way, and here are some favourites: A JavaScript version, a version written in Postscript(!), and one written in LaTeX macros(!!). And perhaps most impressive of all: a version written in C++ using template meta-programming techniques, that evaluates Lisp expressions statically as part of compiling the C++ code! If you know if any weird and wonderful implementations then do let me know.

No comments:

Post a Comment

Real Time Web Analytics