Pyllisp Syntax

I have a theory that so-called 'forms' are the enabling element in Lisp that sets it apart from the other languages I've programmed in.

(+ 1 3)
(if (== 5 2))
(set! x 5)

The simple syntax is easy to teach and doesn't get in the way. To see whether the theory holds, I extended the syntax but kept the syntax of forms. There's a form only where you see parentheses, and there are no parentheses without a form.

Lisp syntax avoids explicit semantics, but it's not pleasant for arithmetic expressions or for expressions spanning multiple lines.

To make arithmetic nicer, I implemented layout-sensitive prefix, postfix and infix syntax. Here are the forms and what they translate to:

*prefix    (:prefix {*} prefix)
in*fix     ({*} in fix)
in * fix   ({*} in fix)
postfix*   (:postfix {*} postfix)

It is implemented as a top-down precedence parser. It looks ahead two tokens to determine what to do.
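The layout rules in the table above can be sketched in Python as a whitespace check around the operator. This is only an illustration: the function name `classify` and the character-level inspection are assumptions, while the actual reader works on tokens with two tokens of lookahead.

```python
def classify(source, start, end):
    """Classify the operator at source[start:end] by the whitespace around it."""
    # The start and end of the text count as whitespace (an assumption).
    space_before = start == 0 or source[start - 1].isspace()
    space_after = end == len(source) or source[end].isspace()
    if space_before and not space_after:
        return "prefix"     # *prefix
    if space_after and not space_before:
        return "postfix"    # postfix*
    return "infix"          # in*fix or in * fix


for text in ["*prefix", "in*fix", "in * fix", "postfix*"]:
    start = text.index("*")
    print(text, "is", classify(text, start, start + 1))
```

An operator with space on both sides and one with space on neither side both come out as infix, matching the two infix rows of the table.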

The programmer can define new operators before the file is parsed. Because parentheses always denote a form, curly braces are used for grouping expressions:

{1+2}*3
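To show how brace grouping fits into a top-down precedence parser, here is a minimal Python sketch covering infix operators only. The binding powers in `BINDING`, the tokenizer, and the function names are assumptions made for the example, not the Pyllisp implementation. The output uses the `({op} left right)` notation from the table above.

```python
import re

# Hypothetical binding powers; in Pyllisp the programmer defines the operators.
BINDING = {"+": 10, "-": 10, "*": 20, "/": 20}

TOKEN = re.compile(r"\s*(\w+|[{}]|[-+*/])")


def tokenize(text):
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected character at %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens, min_power=0):
    """Parse an infix expression, consuming tokens from the front of the list."""
    token = tokens.pop(0)
    if token == "{":
        # Braces only group; they leave no trace in the result.
        left = parse(tokens, 0)
        if not tokens or tokens.pop(0) != "}":
            raise SyntaxError("expected }")
    else:
        left = token
    # Keep extending to the right while the next operator binds tighter.
    while tokens and tokens[0] in BINDING and BINDING[tokens[0]] > min_power:
        op = tokens.pop(0)
        right = parse(tokens, BINDING[op])
        left = "({%s} %s %s)" % (op, left, right)
    return left


print(parse(tokenize("{1+2}*3")))   # -> ({*} ({+} 1 2) 3)
print(parse(tokenize("1+2*3")))     # -> ({+} 1 ({*} 2 3))
```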

Comparators

Additionally, the language has syntax for comparators with chaining rules, as well as and, or and not semantics. These follow Python's operator precedence rules.

a < b <= c      {a < b} and {b <= c}

Internally the chaining is treated as a separate expression to avoid introducing an extra computation.
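One way to read this is that a chain keeps the shared operand as a single node, so it is computed once. The Python sketch below evaluates a chain shaped like `expr (symbol expr)*`. The helper `eval_chain`, the use of zero-argument callables as operands, and the Python-style short-circuiting are all assumptions made for illustration.

```python
import operator

COMPARATORS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}


def eval_chain(first, links):
    """Evaluate a comparison chain; operands are zero-argument callables."""
    left = first()
    for symbol, operand in links:
        right = operand()            # each operand is computed exactly once
        if not COMPARATORS[symbol](left, right):
            return False             # short-circuit, as Python does
        left = right                 # reuse the value for the next link
    return True


evaluated = []

def b():
    evaluated.append("b")
    return 2

# a < b <= c with a = 1, b = 2, c = 3
print(eval_chain(lambda: 1, [("<", b), ("<=", lambda: 3)]))   # -> True
print(evaluated)                                               # -> ['b']
```

Expanding the chain textually into `{a < b} and {b <= c}` would instead mention `b` twice, which is the extra computation the separate node avoids.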

Lists, Index & Attribute

There's syntax for list literals, indexing and taking an attribute:

items = [1 2 3 4 5]
items[5]
items.length

Assignment

Assignments are parsed at the same precedence as indexes and attributes:

term = expression
term := expression

There are local and upscope assignments. Finally, there's an augmented form, which computes the new value from the original one:

term <<= expression     (:aug term << {expression})

There's no in-place assignment for and, or, not, or the chainable operators. Additionally, the augmented form is not available where the augmented operator would shadow some other operator.
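The availability rule can be sketched in Python as a table built from the operator set. The function names, the operator lists, and the tuple layout are assumptions for the example. The layout follows the `(:aug term << {expression})` translation shown above.

```python
def augmented_operators(binary_ops, all_ops, excluded):
    """Map each augmented spelling such as '<<=' to its base operator."""
    table = {}
    for op in binary_ops:
        spelling = op + "="
        # Skip excluded operators and spellings that would shadow an existing one.
        if op in excluded or spelling in all_ops:
            continue
        table[spelling] = op
    return table


def desugar(term, spelling, expression, table):
    """Translate 'term op= expression' into an :aug node."""
    return (":aug", term, table[spelling], expression)


# Hypothetical operator set for the example.
ALL_OPS = {"+", "<<", "<", "<=", "=", "==", ":="}
CHAINABLE = {"<", "<=", "=="}

table = augmented_operators(["+", "<<", "<", "="], ALL_OPS, CHAINABLE)
print(sorted(table))                                 # -> ['+=', '<<=']
print(desugar("term", "<<=", "expression", table))   # -> (':aug', 'term', '<<', 'expression')
```

In this example `=` gets no augmented form because `==` is already an operator, and `<` gets none because it is chainable.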

Off-Side Capture

An outermost parenthesized form can capture the indented expressions that come after it:

main = (func name)
    (println "hello" name)

The block doesn't translate into anything by itself. Instead, the captured block is passed along in a scope, so macros can give it special treatment.

If no macro captures the block, it is passed as extra arguments to the outermost form:

(gl.clearColor)
    0.0 0.0 0.0 1.0

If the outermost form is a macro that doesn't capture the block, an error is raised.

If the outermost expression isn't a form, then the rightmost outermost form is used. If there's no such form on the line, the capturing rule isn't applied. Additionally, forms on the same line capture each other:

(f) 4.5 (g) (h)
(f 4.5 (g (h)))

The above expressions are equivalent iff f and g do not specify a special form.
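The same-line rule can be read as "each form captures everything to its right on the line". The Python sketch below implements that reading, with forms represented as Python lists and everything else as an atom. It is inferred from the single example above, and it ignores special forms and indented blocks.

```python
def capture_line(items):
    """Fold the items of one line so each form captures what follows it."""
    if not items:
        return []
    head, rest = items[0], capture_line(items[1:])
    if isinstance(head, list):      # a parenthesized form
        return [head + rest]        # the rest of the line becomes extra arguments
    return [head] + rest            # atoms are passed through unchanged


# (f) 4.5 (g) (h)
print(capture_line([["f"], 4.5, ["g"], ["h"]]))   # -> [['f', 4.5, ['g', ['h']]]]
```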

Of course, expressions on different lines but in the same block must align:

(x)
    y
  z

This is not allowed, because z aligns with neither (x) nor y.
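The alignment rule can be checked with a stack of open indentation levels. The following Python sketch groups lines into nested blocks and rejects a line that falls between two open levels. The function name and the `(text, children)` representation are assumptions, and only spaces are counted as indentation.

```python
def group_blocks(lines):
    """Nest lines by indentation; returns a list of (text, children) pairs."""
    root = []
    stack = []                      # (indent, siblings) for each open block
    for line in lines:
        text = line.strip()
        if not text:
            continue                # blank lines don't affect the layout
        indent = len(line) - len(line.lstrip(" "))
        if not stack:
            stack.append((indent, root))
        elif indent > stack[-1][0]:
            # Deeper than the current block: open a block under the previous line.
            parent = stack[-1][1][-1]
            stack.append((indent, parent[1]))
        else:
            # Close blocks until one at exactly this indentation is found.
            while stack and stack[-1][0] > indent:
                stack.pop()
            if not stack or stack[-1][0] != indent:
                raise SyntaxError("misaligned line: %r" % text)
        stack[-1][1].append((text, []))
    return root


print(group_blocks(["(x)", "    y", "    z"]))
# -> [('(x)', [('y', []), ('z', [])])]
```

Passing the misaligned example `["(x)", "    y", "  z"]` raises `SyntaxError`, because the indentation of `z` matches no open block.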

Chaining rule

(if (good_day))
    (println "come in" name)
(elif (medium_day))
    (println "umm")
(else)
    (println "go away" name)

There's still one rule left. The programmer can define chained special forms. Such 'chain macros' are captured by the expression that comes before them, and that expression may itself be a chained form.

If the chained forms aren't used by the capturing form, the reader produces an error.
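As a rough Python sketch of this rule, the function below attaches every chained form to the expression directly before it, so `else` hangs off `elif`, which hangs off `if`. The nesting is one possible reading of the rule. The `CHAIN_MACROS` set and the dictionary layout are assumptions, and the check that the capturing form actually uses its chain is left out.

```python
CHAIN_MACROS = {"elif", "else"}     # hypothetical: names registered as chained forms


def attach_chains(forms):
    """Attach each chained form to the expression that precedes it."""
    result = []
    last = None                     # most recent node, chained or not
    for form in forms:
        head = form[0] if isinstance(form, list) and form else None
        node = {"form": form, "chained": []}
        if head in CHAIN_MACROS:
            if last is None:
                raise SyntaxError("chained form %r has nothing before it" % head)
            last["chained"].append(node)
        else:
            result.append(node)
        last = node
    return result


forms = [["if", "good_day"], ["elif", "medium_day"], ["else"]]
tree = attach_chains(forms)
print(len(tree))                                        # -> 1
print(tree[0]["chained"][0]["form"])                    # -> ['elif', 'medium_day']
print(tree[0]["chained"][0]["chained"][0]["form"])      # -> ['else']
```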

AST

Programs that consume the reader's output receive literals and expressions. The following node types exist:

:float
:int
:string
:symbol
:attr expr symbol
:index expr*
:aug symbol expr expr
:form expr*
:chain expr (symbol expr)*
:prefix symbol expr
:postfix symbol expr

Implementation

I rewrote the parser several times before finishing the design. The implementation, along with everything required to use it except documentation, is about 400 lines. I verified that RPython can compile it.

I gave the syntax to Pyllisp. The parser will appear in the Pyllisp repository once I get Pyllisp to evaluate the new syntax.
