Merging Lisp with Lisp with types

Statically typed and dynamically typed languages could be merged in a way that most people have not thought about.

The key to understanding this approach is to realize that a dynamically typed language does come with type checking, but that checking is partitioned into many small pieces that happen during the evaluation of the program.
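
For instance, in Python nothing complains about a bad argument until the offending operation actually runs:

def f(x):
    return x + 1   # the "check" lives here, inside the + itself

f(1)       # fine
f("foo")   # TypeError, raised during evaluation, never before it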

The idea I'm going to present allows you to merge dynamic and statically typed languages together into a single coherent language.

Lisp

If you interpreted "Lisp" as a speech impediment you wouldn't be far off. It's a family of languages that look unclear because the input text is a direct encoding of the parse tree it produces.

Lisps are nice languages because they are like cars without bonnets and sometimes without doors and windshields. They are easy to examine and work with.

Specifically, we are looking at languages resembling Scheme. These would include languages such as Python and JavaScript if they were stripped of their syntax and "semantic sugar".

Here's a Scheme program that would compute 1+2+3:

(+ 1 2 3)
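
Since the text directly encodes the parse tree, the evaluator below can operate on plain nested lists. In these sketches symbols are strings; this representation is mine, the linked source may differ:

program = ["+", 1, 2, 3]   # (+ 1 2 3)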

The evaluator can be very simple:

def eval(expr, env):
    if is_list(expr) and len(expr) == 0:
        return None                                  # empty list evaluates to nothing
    elif is_list(expr) and expr[0] == quote:
        return expr[1:]                              # quoted data is returned as-is
    elif is_list(expr) and expr[0] == function:
        return Thunk([builtin_closure, env] + expr[1:])  # capture the environment
    elif is_list(expr):
        command = [eval(arg, env) for arg in expr]   # evaluate head and arguments
        return apply(command[0], command[1:])
    elif is_symbol(expr) and env is None:
        return Thunk([expr])                         # unbound symbol: suspend as a thunk
    elif is_symbol(expr) and env[0] == expr:
        return env[1]                                # found the binding
    elif is_symbol(expr):
        return eval(expr, env[2])                    # look further up the environment
    else:
        return expr                                  # constants evaluate to themselves
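
The evaluator leans on a handful of definitions that aren't shown inline; here is a minimal sketch of what they could look like (my guesses, the linked source may differ):

# Minimal supporting definitions, sketched to make the snippets runnable.
class Thunk:
    def __init__(self, expr):
        self.expr = expr   # a suspended [head, arg, ...] form

class Builtin:
    def __init__(self, fn):
        self.fn = fn       # used as a decorator: @Builtin

# Markers for the two special forms; plain strings in this sketch.
quote    = "quote"
function = "function"

def is_list(expr):    return isinstance(expr, list)
def is_symbol(expr):  return isinstance(expr, str)
def is_int(expr):     return isinstance(expr, int)
def is_thunk(obj):    return isinstance(obj, Thunk)
def is_builtin(obj):  return isinstance(obj, Builtin)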

The apply is a helper subroutine that applies a function to its arguments.

def apply(obj, args):
    while is_thunk(obj):
        args = obj.expr[1:] + args   # prepend the thunk's pending arguments
        obj = obj.expr[0]            # and unwrap down to the head
    if is_builtin(obj):
        return obj.fn(args)          # a builtin can finally do some work
    else:
        return Thunk([obj] + args)   # still can't run it: suspend as a thunk
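
To see the unwinding order, here is a hypothetical trace of apply on a nested thunk:

# apply(Thunk([Thunk([f, 1]), 2]), [3])
#   args = [2, 3],    obj = Thunk([f, 1])
#   args = [1, 2, 3], obj = f
# then f either is a builtin and runs, or the whole thing suspends again.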

The builtin that produces a closure is not complicated either.

@Builtin
def builtin_closure(args):
    if len(args) >= 3 and is_list(args[1]):
        env = args[0]      # the captured environment
        binds = args[1]    # parameter names
        body = args[2]
        args = args[3:]    # actual arguments
        while len(args) > 0 and len(binds) > 0:
            env = (binds[0], args[0], env)   # bind one parameter
            binds = binds[1:]
            args = args[1:]
        if len(binds) == 0 and len(args) == 0:
            return eval(body, env)               # saturated: evaluate the body
        elif len(binds) == 0:
            return apply(eval(body, env), args)  # oversaturated: keep applying
        else:
            return Thunk([builtin_closure, env, binds, body] + args)  # partial application
    else:
        return Thunk([builtin_closure] + args)
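
For instance, partially applying a two-parameter function leaves a resumable thunk (hypothetical, using the sketched representation):

prog = [[function, ["a", "b"], ["add", "a", "b"]], 1]
eval(prog, None)
# => Thunk([builtin_closure, ("a", 1, None), ["b"], ["add", "a", "b"]])
#    i.e. a closure still waiting for b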

Here are a few example evaluations:

(+ 1 2)
= (+ 1 2)

((function (a) (add 1 (add a 3))) 2)
= (add 1 (add 2 3))
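
With the sketched definitions above, the same evaluations can be reproduced from Python (hypothetical driver; the representation is mine):

# (+ 1 2): + is unbound, so the whole call suspends as a thunk.
eval(["+", 1, 2], None)
# => Thunk(["+", 1, 2])

# ((function (a) (add 1 (add a 3))) 2): a is bound to 2, add stays unbound.
eval([[function, ["a"], ["add", 1, ["add", "a", 3]]], 2], None)
# => Thunk(["add", 1, Thunk(["add", 2, 3])])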

There's source code (.py) for it if you want to tinker.

A larger implementation of this kind might have better macro expansion, dynamic scope, or a context that travels along with evaluation.

It is a dynamically typed language, though. The version above simply builds a thunk whenever it is unable to evaluate something. Evaluation proceeds in a fixed order that is simple to express in programming languages that do the same.

with types

"Lisp with types" refers to a typed language that implements type theory in some sort of a functional language. It usually ends up being some variation of ML language. These languages are not as bare as lisp. Here's an example of Factorial in ML:

fun fac 0 = 1
  | fac n = n * fac (n - 1)

Valid programs in these languages must be type annotated. This means that we must have a fragment such as:

fac : nat -> nat

..to mark that the above factorial function is defined for natural numbers.

Evaluation is tied to the type annotations and does not proceed if the types in the annotation do not match the structure.

In the simplest form it's the simply typed lambda calculus:

x : type           (annotation)
variable ↦ body    (abstraction)
x y                (application)
x → y              (function type)

Some things behave exactly as they did in the untyped language; we just need to place type annotations into the expression:

(a ↦ add 1 (add a 3) : nat → nat) 2
= add 1 (add 2 3) : nat

There's an interesting thing about this language that bidirectional typing makes use of: a type annotation only makes sense where it sits between a normal and a neutral form. Normal forms are forms that represent constructions, whereas neutral forms are structures that are prepared to pick things apart.

This means that the following:

a ↦ (b ↦ add a b : nat → nat) : nat → nat

..is just an inconvenient way to write:

a ↦ b ↦ add a b : nat → nat → nat

These types are checked so that the program conforms to them, and they don't need to be checked in the middle of evaluating the program.
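
To make that concrete, here is a minimal sketch of such a bidirectional checker (mine, not from the post), reusing the linked-triple convention the evaluator used for environments:

# A minimal sketch of bidirectional checking. Types are "nat" or
# ("->", a, b); terms are variables (strings), ["lam", x, body],
# ["app", f, x] or ["ann", e, ty]; contexts are (name, type, parent)
# triples, like the evaluator's environments. All names here are mine.

def lookup(ctx, x):
    while ctx is not None:
        if ctx[0] == x:
            return ctx[1]
        ctx = ctx[2]
    raise TypeError("unbound variable: " + x)

def check(ctx, term, ty):
    # Normal forms (constructions) are checked against a given type.
    if isinstance(term, list) and term[0] == "lam":
        if ty[0] != "->":
            raise TypeError("a lambda must have a function type")
        check((term[1], ty[1], ctx), term[2], ty[2])
    elif infer(ctx, term) != ty:
        raise TypeError("type mismatch")

def infer(ctx, term):
    # Neutral forms (things that pick structures apart) synthesize a type.
    if isinstance(term, str):
        return lookup(ctx, term)
    elif term[0] == "ann":
        check(ctx, term[1], term[2])   # the annotation bridges the two modes
        return term[2]
    elif term[0] == "app":
        fty = infer(ctx, term[1])
        if fty[0] != "->":
            raise TypeError("applying a non-function")
        check(ctx, term[2], fty[1])
        return fty[2]
    raise TypeError("cannot infer a type; add an annotation")

# a ↦ b ↦ add a b : nat → nat → nat, with add : nat → nat → nat in scope
nat2 = ("->", "nat", ("->", "nat", "nat"))
ctx = ("add", nat2, None)
term = ["ann", ["lam", "a", ["lam", "b", ["app", ["app", "add", "a"], "b"]]], nat2]
assert infer(ctx, term) == nat2   # checked once, before any evaluation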

Types may also provide a strong normalization property, meaning that every reduction sequence terminates: the program's evaluation completes no matter which evaluation strategy we pick.

The obvious solution is the wrong solution

So the first idea everybody thinks of is to add type annotations to the untyped language. This results in a language that's pretty much like the untyped language, but with some additional checking.

add "foo" 2
= add "foo" 2

add (5 : string) 2
= add (5 : string) 2

add (5 : nat) 2
= 7 : nat

We get more ways for the program to fail. The Python 3 and Dart/TypeScript communities rejoice, though I don't think I really get anything useful here.

Observe where it typechecks & align them

The answer here would seem to be to unify the places where the dynamic and static languages typecheck their programs.

I actually didn't show where the untyped language typechecks. It does that in the builtin functions.

@Builtin
def builtin_add(args):
    if len(args) == 2 and is_int(args[0]) and is_int(args[1]):
        return args[0] + args[1]             # the type check lives right here
    else:
        return Thunk([builtin_add] + args)   # wrong shape: suspend instead of failing

That's the boundary where you want to add the types.
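
One hypothetical way to mark that boundary, continuing the earlier sketches, is to attach the static type to the builtin itself:

# Hypothetical: the builtin carries its type, so the annotation merging
# described below can happen once, ahead of time, instead of the is_int
# checks firing on every call.
builtin_add.type = ("->", "nat", ("->", "nat", "nat"))   # add : nat → nat → nat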

Mixing the two

So consider that we had these two languages mixed already. We'd have this well-typed expression, checked at compile time:

(a ↦ b ↦ add a b : nat → nat → nat)

You compile the innards and expose it to the surrounding program as a typed program.

Now let's say something feeds another typed program as an argument to it.

(a ↦ b ↦ add a b : nat → nat → nat) (5 : nat)

The type annotations are merged. If they merge successfully, it becomes a running thunk.

(a ↦ b ↦ add a b : nat → nat → nat) 5 : nat → nat
= b ↦ add 5 b : nat → nat

If they do not merge:

(a ↦ b ↦ add a b : nat → nat → nat) ("foo" : string) : *bork*
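
Sketching the merge rule itself, with the ("->", a, b) representation from the checker above:

# Hypothetical merge of annotations at an application: the argument's
# type must match the function type's domain, and the application
# carries the codomain.
def merge_apply(fn_ty, arg_ty):
    if fn_ty[0] == "->" and fn_ty[1] == arg_ty:
        return fn_ty[2]
    raise TypeError("*bork*: annotations do not merge")

merge_apply(("->", "nat", ("->", "nat", "nat")), "nat")   # => ("->", "nat", "nat")
# merge_apply(("->", "nat", ("->", "nat", "nat")), "string") would raise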

You can also build new static structures inside dynamic code. Here, if you had something : nat → nat in scope, you could add an annotation like this:

let arg = (a ↦ add a (something a)) : nat → nat
in arg 5

The annotation would be checked, and the program would fail if it can't build the typed structure.

Conclusion

It seems and feels quite nice. I think it's something that could be implemented today, although it might end up being a bit underwhelming.

Dynamically typed languages usually don't have fancy datatypes. This would bring in the types from an actual typed language, so that'd be nice.

Maybe the point of this was to show that there's a good glue language that's been around for a while.

It brings up a question though: If you add logic programming into this mix, do you end up with two systems that do roughly the same thing? Or would there be a point in having all of them together?
