Back to Lisp

I have taken lisp back into my language repertoire, and I am studying what it has to offer. Considering my previous posts and my boosting of the importance of syntax, this is a drastic change of heart and party. This post tells the story of how I ended up doing it, so that I will remember it later.

The rewrite time

It was time to rewrite everything in my language runtime again. It had grown to 4000 lines of code, yet I realised that if the language has generators and coroutines, then the C functions should be written so that they work well alongside those routines. Otherwise it would become inflexible, too hard or even gimmicky to create things such as hash table operations, which have to check whether the hash table contains a key using user-defined comparator functions.

I started coroutinizing the functions in a rewrite. The language runtime began to seem simpler, but it was still a bit unclear what it should look like in order to work. I wasn't sure what to do, so I decided to take a break from my own language and study alternative ideas on language design. I had read about the Escher language lately, and I kept remembering it from the picture on the repository front page. The picture was M. C. Escher's work "Hand with Reflecting Sphere", which portrays the viewer of the picture as Escher himself.

I don't know what to think about the author of the Escher language. Is he a gag or a genius? Escher seems to be just a data flow language, although all the material creates an atmosphere of excitement over something I do not really understand. What caught my attention was a small detail: the author boasts that Escher has fewer than four grammar rules. I ended up wondering how many unique grammar rules my language has. I removed everything that could be represented by other constructs, based on what I knew at the time. Here's the summary of the grammar that I thought was required to implement the language:

This is a tall order, and it gets even taller. I would have to implement function calls in continuation passing style to make it easy to reintroduce generators and coroutines into the language later. Similarly, I left list/dictionary constructors and multiple dispatch out because function calls covered those. The hardest thing was exception handling, because I'm not sure whether it's the best choice for error recovery, even if it has been a popular choice for error handling. I wanted leeway from exception handling to experiment with alternatives that I am still quite unaware of.

All the exception handling systems I've seen are a bit smelly. The control flow occasionally makes a huge leap. When handling is needed, the exception handlers start piling up in the functions and the resulting constructs look a bit like conditionals. Add finalizers and you've got complex error handling logic, in a slightly different language from the one the rest of the code is written in. The default behavior of Python is also a bit questionable: it unwinds the whole stack for you and drops the values of all the variables by default. At the very least, exception handling would need variations that one can try out.

Cheney on the MTA

I went to study, search and ask about exception handling. I was pointed to Common Lisp restarts and told that continuations can represent any kind of control flow, including exception handling. I showed my recent rewrite with the coroutines and was told about the Cheney-on-the-MTA compilation strategy. In this strategy the Scheme program is compiled into C functions. Faithful to the name of the strategy, they never return. Instead they call a continuation. This fills up the call stack; when it grows past a limit, the live data on it is copied into the heap and the stack is reset back to the bottom. For this to work, everything is converted into continuation passing style before compiling to C, including all the control flow constructs, even the simplest ones such as if and while. The interesting thing is that in that form you have only three constructs: lambda, let, jump. This makes it very straightforward to compile into C, or even into native machine code if you need to. Thanks to the people on the #python IRC channel for the hint. :)
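As a rough illustration of what "functions that never return" looks like, here is a minimal sketch in Python rather than in compiler-generated C. It is not Cheney on the MTA itself: a trampoline stands in for the stack reset, and all names (trampoline, fact_cps, div_cps, run) are hypothetical. It also shows the hint about exception handling: an error handler is just one more continuation passed along.

```python
# Sketch under assumptions: every name here is hypothetical. The trampoline
# is a stand-in for the stack reset of Cheney on the MTA, not that strategy.

def trampoline(thunk):
    """Run one jump at a time so the host stack never grows deep."""
    while callable(thunk):
        thunk = thunk()
    return thunk


def fact_cps(n, k):
    """Factorial that hands its result to the continuation k."""
    if n == 0:
        return lambda: k(1)
    # The recursive step gets a new continuation that finishes the multiply.
    return lambda: fact_cps(n - 1, lambda r: lambda: k(n * r))


def div_cps(a, b, ok, err):
    """Division with two continuations: ok for the result, err for failure."""
    if b == 0:
        return lambda: err("division by zero")
    return lambda: ok(a / b)


def run(start):
    """Drive a CPS computation; start receives the final continuation."""
    box = []
    # box.append returns None, which is not callable, so the loop stops there.
    trampoline(lambda: start(box.append))
    return box[0]


def fact(n):
    return run(lambda done: fact_cps(n, done))


def safe_div(a, b):
    # The "exception handler" is an ordinary continuation.
    return run(lambda done: div_cps(a, b, done,
                                    lambda msg: done("error: " + msg)))
```

Because every step is a jump to a continuation, fact(2000) runs without hitting Python's recursion limit, and safe_div(1, 0) reaches its handler without any stack unwinding.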

Cheney on the MTA led me to Chicken Scheme and Ur-Scheme, and to a search for a minimal implementation of lisp. And of course back to the wiki and the Lambda the Ultimate blog. I'm in the middle of reading "Lisp in Small Pieces" while writing this, to see if I can still find something.

Minimal viable language

With the knowledge that I could transform every control flow construct into a continuation, I analysed my language again, removing the constructs I could represent with the ones left in. This is what was left:

It makes up a very simple and clean model of the language. It let me see that I could implement the whole language over these constructs and the following data structures:

This kind of organization means I need to write very little, if any, C to get myself an implementation that I can test and experiment with. Besides, the reduction in data structures makes for a language that is easier to inspect and easier to dump from memory, whether for deferred debugging, for accelerating the startup of the process, or for inspecting the program state.

It seems that I had returned to lisp, although my lisp implementation will likely never return.

Sticking to lisp

I could just pick the concepts I learn from lisp and apply them to my language. I'm not doing that, because I notice a pattern here: many great discoveries have been made in lisp, or they've ended up using lisp. Knowing that there has been awesome work and research on parsing algorithms makes me think lisp languages never needed to lack algebraic syntax to stay simple. Going bare bones is a deliberate choice. I have a theory that it is about mobility, although I don't know that for certain.

Lisp is a crazy incubator of crazy futuristic ideas that occasionally become popular

EMACS is a programmable text editor which you can modify and extend with your own code while it's running. It consists of a small core crucial to its function that starts up when you run the program, but everything else is written in such a way that it could have been written or modified by the user. One can write a command and see the effects of the change immediately. A programming error in the user's custom code won't crash the editor. Instead an interactive prompt is given. Stallman's preference for dynamic scope is a bit odd, but otherwise this seems like something from recent popular talks emphasizing visualization of the program state.

There are other such stories. I remember Paul Graham's "Beating the Averages". I think I read his articles around the time I first studied lisp. The common pattern here is that people do transformative things in lisp, things that work very well. Then people try to take the idea out of lisp, the thing shatters apart, and everybody forgets that it ever happened. The stupidity of the computing industry and of individuals is something amazing. In "Beating the Averages", PG quotes Eric Raymond:

Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot.

This is a conjecture. I'm not certain that the experience and enlightenment from lisp survive if you leave lisp. Not anymore. Like the amazing concepts and ideas introduced in lisp, it might very well just shatter too. Granted, things could just as well shatter even if lisp was kept. But if the choice of language didn't matter, why does lisp end up as the base when people do amazing things with their programming language?

This is the theory I am considering: the freedom of syntax and the clear perception of code as data make lisp a highly mobile vehicle for thought. After all, there are many kinds of lisps, and in a stupid sort of way lisp is defined by its parentheses. The simple language connects all those languages together, turning into a language toolkit, which again gives the lisp language family strong mobility. You can switch between languages with ease, since they all fit into the same structures that are in concrete form on your screen.

Meaning of syntax in programming

I used to think that syntax is very important for readability. My recent discoveries are challenging this belief. Consider how exceptions are constructed in the languages that implement them. The syntax is different in every single language, but you're still saying the same things with the exceptions. There are radical counterpoints to this, though. Some languages have an overly verbose way of saying things, for example JavaScript's function statement. This makes concise syntax choices, such as the function definition of CoffeeScript, very attractive. It seems to be a common opinion that syntax matters.

While looking for papers on programming language syntax, I stumbled upon the Quorum programming language, which was engineered based on peer-reviewed studies. The idea of designing a language based on user testing is an interesting thought. Unfortunately I can't find any reference or support for the claim of the language being "heavily tested in formal scientific peer reviewed studies". The only paper I found had a sample size of 18 test subjects, with an arbitrary sample of languages, from which it's dangerous to draw conclusions. Once you look closer, the authors say they altered the design based on observational work with novices and a few papers. It seems measly. It'd be very difficult to find a large enough sample for a syntax-related study, but maybe you don't need a study. Once you give people the choice, it converges. Just design a language whose syntax or coding mechanism can be chosen. That seems to be... lisp.

It comes back to lisp over and over again, whether the subject is syntax or not. Lisp seems to be the vehicle for figuring out solutions to hard problems. If that's true, then it doesn't matter what it looks like. Whatever benefit you get from better syntax would always be outweighed by the benefits you get by just staying in lisp.

The popular languages with highly articulated grammars seem to eventually stagnate because their new users don't have control over them. People can't easily change or improve their languages, so they keep doing the same things over and over. The crystal-clear understanding of the implementation eventually returns to the lisp world, even if those 'better' languages had lisp features such as lexical scoping or first-class functions in them.

Syntax might be a dead end anyway

I have identified several scenarios of how programming languages might evolve from where they are now:

In many of these scenarios syntax may become less relevant, entirely irrelevant, or a hindrance. In some of them you replace the syntax, or even the semantics, of the language with another. Lisp allows this kind of change, which sets it apart from other languages.

The way people talk about lisp makes it easy to forget that it's a family of programming languages, or actually a family of families of programming languages. If your lisp has proper macros, then every time you create a new macro you end up with a different language. This happens because the language gets a new semantic rule that needs to be documented and remembered by the reader. Having rampant macros makes it harder to analyse the language or to provide syntax for it. This doesn't mean that a language shouldn't have macros or that macros are a bad construct. It just means that they should be thought of as a change of language and used accordingly.
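To make the "new macro, new language" point concrete, here is a minimal sketch in Python. It assumes programs are represented as nested Python lists with plain strings as symbols. The names MACROS, defmacro and expand are hypothetical, and so is the core form (if test then) that the macro expands into.

```python
# Sketch only: nested lists stand in for s-expressions, strings for symbols.
# MACROS, defmacro and expand are hypothetical names, not a real lisp API.

MACROS = {}


def defmacro(name):
    """Register a function that rewrites one form into another form."""
    def register(fn):
        MACROS[name] = fn
        return fn
    return register


def expand(form):
    """Rewrite macro calls until only non-macro forms remain."""
    if not isinstance(form, list) or not form:
        return form
    head = form[0]
    if isinstance(head, str) and head in MACROS:
        # A macro call: rewrite it, then expand the result again.
        return expand(MACROS[head](*form[1:]))
    return [expand(item) for item in form]


@defmacro("unless")
def unless(test, *body):
    # New rule for the reader: (unless test body...) means
    # (if (not test) (do body...)).
    return ["if", ["not", test], ["do", *body]]
```

Before the unless definition, a form starting with unless would be an ordinary call. After it, the same form means something else, and that is exactly the extra rule a reader of the code has to learn.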

My lisp flavour

I'm not just taking the syntax for granted, though. I keep the things I think are important in lisp and define the language around them. For the rest, there's a little bit of snake in my lisp, because it's influenced by Python. Strings have escape characters, with a literal backslash written as \\. Comments start with the # character. There's a Bx syntax for numbers in a different base. My lisp will have the following production rules:

SExpr -> ( SExprs )
      -> Atom
      -> SExpr .attribute

SExprs -> 
       -> SExprs SExpr

Atom -> "string"
     -> 'string'
     -> symbol
     -> 1234
     -> 123.5
     -> 0x15beef
     -> 2x010110
     -> 8x777

There have been attempts to improve lisp syntax. I will consider these later, after I've gotten to use lisp more. I've seen sweet-expressions. I think they might work, or then again they might not.