Explaining lambda calculus to a front-end web developer

Today I'm going to answer a question from a person who has been reading these posts. I used the wrong sender field by mistake, and the lack of a response afterwards suggests that my reply was lost. Since then I've studied more and rewritten the post to improve it. The plan was to publish this anyway, although it's perhaps a bit early.

How do you explain lambda calculus (or just calculus) to a front-end web developer? How does calculus boost their productivity?

There is a bit of an immediacy in this question that I didn't catch at first. It's a bit like asking "how do I burn belly fat?" The answer is that you ensure your macronutrients are in balance, and once they are, you do a bit of strength training and cardio exercise every week. It won't do anything at first, but after a year you'll notice you're in much better health and get much more done with the time you have, even if you only end up improving your nutrition.

It's a bit like that with (lambda) calculus, but in the context of programming. It takes a while to make anything productive out of what you learn. Ultimately you'll be able to apply the ideas, but it'll take a while and may require you to make a habit of learning. Eventually you may learn to write programs and prove they're correct, without needing to debug or run them to know it.

Lambda calculus

Lambda calculus is an incredibly compact model of computation. Despite its simplicity it's fairly readable. In whole, the language has only 3 rules that you use to form expressions.

The first rule introduces a variable. It's a neutral form, marked with a down-arrow in the expression, meaning that it takes something apart and makes something else from it.

────── variable
x ⊢ x↓

The perhaps weird symbol in there is a turnstile, with the context on its left side and a lambda calculus expression on its right side.

When the context is not empty, we have an open expression. If the context is empty, then it's a closed expression. Note that whatever is used to satisfy a variable in the context must be a normal form.

With variables alone you can't do much of anything yet; you need the whole language, with all of its rules, to do something. There's another neutral form that we can use to apply functions to values.

Γ ⊢ a↓     Γ ⊢ b↑
───────────────── application
Γ ⊢ a b↓

The things above the line tell you what you need in order to form the expression below it. The arrows indicate that 'a' must be a neutral form, while 'b' must be normal. Normal forms are expressions that form new values. The symbol Γ stands for a context that may contain many variables.

Finally we have a normal-form that forms functions:

Γ,x ⊢ a↑
────────── abstraction
Γ ⊢ x ↦ a↑

The comma extracts a variable from the scope, and that variable no longer appears in the context of the formed expression.

That's the whole language, with two additional rules:

  1. Every neutral form (↓) can be used in place of a normal form (↑), but a normal form cannot be used in a place where a neutral form is required.
  2. If a normal form (↑) appears in the place of a neutral form (↓), we get a reducible form, which is marked with (⋄). Whenever a rule is applied with a reducible form in either place, the resulting expression is also reducible.

Reducible forms are programs whereas normal forms are values. The last rule makes this language untyped lambda calculus.

In programming we'd like to be precise with our definitions, and there's one thing I haven't explained: the behavior of the context. I haven't done so because those rules are structural, and thus you can deduce when they have been used. The only structural rule present here allows you to drop variables from the context as you go upwards.
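
Since the audience is a front-end web developer, I'll sketch some of these ideas in TypeScript along the way. Here's a minimal sketch of the three formation rules as a data type, together with the classification into neutral, normal and reducible forms. The names Term and classify, and the tag strings, are my own choices, not standard terminology.

type Term =
  | { tag: "var"; name: string }                // variable, a neutral form (↓)
  | { tag: "app"; fn: Term; arg: Term }         // application, a neutral form (↓)
  | { tag: "lam"; param: string; body: Term };  // abstraction, a normal form (↑)

// Classify a term following the two additional rules: a neutral form
// counts as a normal form, and a normal form on the left side of an
// application makes the whole expression reducible.
function classify(t: Term): "neutral" | "normal" | "reducible" {
  switch (t.tag) {
    case "var":
      return "neutral";
    case "lam":
      return classify(t.body) === "reducible" ? "reducible" : "normal";
    case "app": {
      const fn = classify(t.fn);
      const arg = classify(t.arg);
      if (fn === "reducible" || arg === "reducible") return "reducible";
      return fn === "neutral" ? "neutral" : "reducible";
    }
  }
}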

Examples of lambda calculus expressions

Now we can form programs expressed in lambda calculus by applying the rules. This is the identity function; it gives you back whatever you pass into it.

──────
x ⊢ x↓
──────────
Ø ⊢ x ↦ x↑

The Ø is an empty context, which means that the identity function is a closed expression. We can build another function using all of these rules. This function is an open expression.

──────     ──────
x ⊢ x↓     y ⊢ y↓
─────────────────
x,y ⊢ x y↓
────────────
y ⊢ x ↦ x y↑

Note that the above function uses a neutral form in place of a normal form. Here's something that's reducible:

──────
x ⊢ x↓
──────────    ──────
Ø ⊢ x ↦ x↑    y ⊢ y↓
────────────────────
y ⊢ (x ↦ x) y⋄
──────────────────
Ø ⊢ y ↦ (x ↦ x) y⋄

If you have seen or heard of lambda calculus before, you may be mystified to see something completely different from the usual presentation of the language.

This presentation encodes some ideas that will be useful later, when we actually use the language.

More lambda calculus

The structures presented above are proofs: they verify that the expressions can be formed in lambda calculus, at least in the version of it that was just described.

You may have noticed, or already know, that they're highly redundant. From the previous proofs we can erase nearly everything, even the context, and reconstruct it all again.

x ↦ x

That is, we can construct the whole earlier proof from just this piece.

x ↦ x↑      abstraction is a normal-form

Ø ⊢ x ↦ x↑  every variable in the expression is bound.

x ⊢ x↑
──────────  it had to be constructed using the abstraction rule.
Ø ⊢ x ↦ x↑

──────      likewise there's only one way to construct a variable,
x ⊢ x↓      and it is a neutral form that fits into the normal form we have.
──────────
Ø ⊢ x ↦ x↑

The above means that lambda calculus expressions encode their own formation proofs. If you associate a type with every term in a lambda calculus expression, it becomes capable of encoding intuitionistic logic proofs. The following tutorial tells more, if you're interested in how they relate.

Precedence rules

There is still one thing left to define in the language: the precedence rules. If we didn't define those, our notation would be ambiguous.

Application is left-associative, and it binds more tightly than abstraction:

a b c       = (a b) c
x ↦ x y     = x ↦ (x y)

The parentheses are available for grouping terms.

Reduction

Lambda calculus evaluates through substitution. We mark it with brackets: e[x := value].

  p ⊢ (x ↦ x p) (y ↦ y)⋄
= p ⊢ (x p)[x := (y ↦ y)↑]
= p ⊢ (y ↦ y) p⋄
= p ⊢ y[y := p]
= p ⊢ p↓

The substitution follows the context. For example, consider substituting the 'x' in the context away, replacing it with 'p', in the following expression:

──────────
p,x,x ⊢ x↓
──────────────
p,x ⊢ (x ↦ x)↑

The result would be this:

──────────
p,x ⊢ x↓
──────────────
p ⊢ (x ↦ x)↑

Otherwise the substitution replaces every variable of matching name:

──────────
p,x,y ⊢ x↓
──────────────
p,x ⊢ (y ↦ x)↑

The above expression would turn into this:

────────
p,y ⊢ p↓
────────────
p ⊢ (y ↦ p)↑
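
Here's substitution as code, a sketch assuming the Term type from earlier. It implements the shadowing behavior shown just above by stopping at an abstraction that rebinds the name. A full implementation would also rename bound variables to avoid capturing free ones; that part is omitted here.

// e[x := value], assuming the Term type sketched earlier.
function substitute(e: Term, x: string, value: Term): Term {
  switch (e.tag) {
    case "var":
      return e.name === x ? value : e;
    case "app":
      return { tag: "app",
               fn: substitute(e.fn, x, value),
               arg: substitute(e.arg, x, value) };
    case "lam":
      return e.param === x
        ? e // 'x' is shadowed here, so the body stays untouched
        : { tag: "lam", param: e.param, body: substitute(e.body, x, value) };
  }
}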

The reduction rule we have been using is β-reduction. There is another reduction rule that tends to be used. Suppose that in the program below, 'x' does not appear in 'p'.

p ⊢ (x ↦ p x)↑

If the outer abstraction were to be reduced, we'd end up calling 'p' anyway. Therefore it's common to remove it:

p ⊢ (x ↦ p x)↑
= p ⊢ p↑

That is η-reduction. We can do it on normal forms, while β-reduction only applies to reducible forms.
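
With substitution in hand, a reducer is short. This sketch is my own illustration, not a canonical algorithm: it performs β-reduction wherever a normal form sits on the left side of an application.

function reduce(t: Term): Term {
  switch (t.tag) {
    case "var":
      return t;
    case "lam":
      return { tag: "lam", param: t.param, body: reduce(t.body) };
    case "app": {
      const fn = reduce(t.fn);
      const arg = reduce(t.arg);
      if (fn.tag === "lam") {
        // β-reduction: a normal form appeared in the place of a neutral form
        return reduce(substitute(fn.body, fn.param, arg));
      }
      return { tag: "app", fn, arg };
    }
  }
}

Nothing guarantees that this recursion finishes, which is exactly what the next section is about.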

Reduction termination

Without type information present, lambda calculus is computationally universal. The proof would require showing how to simulate a Turing machine in lambda calculus.

For now I'll show that untyped lambda calculus programs can fail to terminate. This is a consequence of being computationally universal.

(x ↦ x x) (x ↦ x x)⋄

Consider this reducible expression. Let's apply the reduction rules I just described to it.

  (x x)[x := x ↦ x x↑]
= x[x := x ↦ x x↑] x[x := x ↦ x x↑]↓
= (x ↦ x x) (x ↦ x x)⋄

We end up with the same form we started with, which demonstrates that the computation never finishes: it would keep doing the same reduction over and over again.
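
The same term is directly expressible in TypeScript, and it misbehaves in the same way; the only difference is that the engine runs out of stack instead of looping forever.

const selfApply = (x: (y: any) => any) => x(x);
// selfApply(selfApply); // uncommenting this recurses until the stack overflows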

A lambda calculus expression may contain multiple parts you could reduce at once. The order in which you reduce them doesn't matter: if reduction terminates, it produces the same value no matter which way you went (this property is known as confluence). This isn't hard to observe, as the other reducible expressions remain as they are when you reduce one of them away.

If you can predict how many nested functions each variable potentially represents in your expression, then you know whether the reduction on it terminates.

For example we know this reduction does terminate because we're able to count the reductions done on the variables.

w ⊢ (x ↦ (y ↦ x y) x) (z ↦ w)⋄
0    1    0            0

Reduction indeed shows that it does terminate:

= w ⊢ (y ↦ (z ↦ w) y) (z ↦ w)⋄
= w ⊢ (z ↦ w) (z ↦ w)⋄
= w ⊢ w↓

If a reduction terminates, the resulting program is in a normal-form.

How to count the reductions isn't obvious, but the counts are something you get if you have typed lambda calculus expressions.

If you know the number of reductions, you'll be able to construct a reduction algorithm that terminates. This is described in the 2010 Keller-Altenkirch paper: "Hereditary Substitutions for Simple Types, Formalized".

Let's try it on the expression we already know never becomes a normal form. We'll just assume some reduction counts.

(x ↦ x x) (x ↦ x x)⋄
 2         1

To carry the algorithm out, we η-expand each variable as many times as we predicted it reduces; expansion means that we use η-reduction in reverse. Then we mark every application that we didn't create through η-expansion. Here's the result of doing the expansion:

(x ↦ (a ↦ b ↦ x a b) (c ↦ d ↦ x c d)⋄) (x ↦ (e ↦ x e) (f ↦ x f)⋄)⋄

The idea in the η-expansion is to allow eager application of arguments. When we normalize the expression, we should never reach a point where application of the function becomes impossible.

There's a rule that we must have normal forms on both sides of an application: the left side of an application cannot be a neutral form. Therefore the deepest parts are normalized first.

  (a ↦ b ↦ x a b) (c ↦ d ↦ x c d)⋄
= b ↦ x (c ↦ d ↦ x c d) b↑

  (e ↦ x e) (f ↦ x f)⋄
= x (f ↦ x f)↑

These operations succeed because they don't activate the thunks they substitute into. We obtain the form:

(x ↦ b ↦ x (c ↦ d ↦ x c d) b) (x ↦ x (f ↦ x f))⋄

The application results in:

b ↦ (x ↦ x (f ↦ x f)) (c ↦ d ↦ (x ↦ x (f ↦ x f)) c d⋄) b⋄

Looking at the inner active form there:

  (x ↦ x (f ↦ x f)) c d⋄
= ((x ↦ x (f ↦ x f)) c⋄) d⋄

  (x ↦ x (f ↦ x f)) c⋄
= c (f ↦ c f)⋄

Since the left side of an application cannot be a neutral form, the reduction has failed at this point.

Now let's do it again on the expression that we know terminates.

  w ⊢ (x ↦ (y ↦ (a ↦ x a) y⋄) (a ↦ x a)⋄) (z ↦ w)⋄
= w ⊢ (x ↦ (y ↦ x y) (a ↦ x a)⋄) (z ↦ w)⋄
= w ⊢ (x ↦ x (a ↦ x a)) (z ↦ w)⋄
= w ⊢ (z ↦ w) (a ↦ (z ↦ w) a⋄)⋄
= w ⊢ (z ↦ w) (z ↦ w)⋄
= w ⊢ w↑

You can see the algorithm has no trouble reducing the expression.

The way this works is that the η-expansion ensures there is a finite number of steps involved in each reduction.

Reduction counts can be read directly from types, if you have those.

Non-function values in lambda calculus

One thing that should be specifically appreciated about lambda calculus is that it doesn't need to be made fancy to be useful.

You don't need functions with multiple arguments because you can get them from single-argument functions.
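
For example, a two-argument function is just a function returning a function, which is everyday code in JavaScript and TypeScript:

const add = (x: number) => (y: number) => x + y; // x ↦ y ↦ x + y
console.log(add(2)(3)); // 5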

If you apply a function repeatedly to something else, you can use the number of repetitions to represent numeric values.

0 = f,x ⊢ x
1 = f,x ⊢ f x
2 = f,x ⊢ f (f x)
3 = f,x ⊢ f (f (f x))
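
These are the Church numerals. Transcribed into TypeScript (the Church type and the helper names are mine), a numeral n applies 'f' to 'x' n times:

type Church = <A>(f: (a: A) => A) => (x: A) => A;

const zero: Church = f => x => x;
const succ = (n: Church): Church => f => x => f(n(f)(x));

// Read a numeral back by counting the applications.
const toNumber = (n: Church): number => n((k: number) => k + 1)(0);
console.log(toNumber(succ(succ(zero)))); // 2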

Pairs are fairly easy to build:

(a,b) = a,b ⊢ e ↦ e a b
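
In TypeScript the same encoding looks like this (Pair, fst and snd are names I picked): a pair is a function that hands both components to a consumer 'e'.

type Pair<A, B> = <C>(e: (a: A) => (b: B) => C) => C;

const pair = <A, B>(a: A, b: B): Pair<A, B> => e => e(a)(b);
const fst = <A, B>(p: Pair<A, B>): A => p(a => _ => a);
const snd = <A, B>(p: Pair<A, B>): B => p(_ => b => b);

console.log(fst(pair(1, "two"))); // 1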

You can build lists and maps in a similar way. Even I/O can be carried out by returning a triple consisting of a number, an input value, and a callback function. When the surrounding program receives that triple, it can carry out the I/O operation and then resume by applying the callback to the output.
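
As a sketch of the list encoding (the List type and nil/cons/sum names are my own), a list can be represented as its own fold:

type List<A> = <R>(cons: (head: A) => (tail: R) => R) => (nil: R) => R;

const nil: List<never> = _cons => nilV => nilV;
const cons = <A>(head: A, tail: List<A>): List<A> =>
  c => n => c(head)(tail(c)(n));

// Sum a list of numbers by folding it.
const sum = (xs: List<number>): number =>
  xs((h: number) => (t: number) => h + t)(0);
console.log(sum(cons(1, cons(2, nil)))); // 3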

All of this is similar to building datatypes in a Turing machine, though you can see that in lambda calculus the number and list representations are somewhat close to their "best" representations.

Termination guarantees can be useful

Termination guarantees do not seem so useful at first, because although they tell you that the program terminates, you won't know how long it is going to take. In practice, though, they are extremely useful.

Evaluation of any program can be understood as a proof search. If it happens to terminate, that proves that it terminates. The step-by-step trace of the program forms the actual proof, while the type of the output is the proposition that was proved.

If you have a proof that a program terminates, then the program itself is sufficient to prove a proposition.

A silly example: you can trivially prove that there are such things as natural numbers by producing them. The proofs are values themselves: 0, 1, 2, ...

Two lambda calculus expressions reduce to the same normal form if they are equivalent and they terminate. For example, (x ↦ x) = (x ↦ x (y ↦ y)) (z ↦ z) is true.

These kinds of ideas provide a foundation for building formal proofs and propositions that can be automatically checked by a computer.

Modern formal verification systems can do higher-order logic and cover quite many of the things you'd like to describe. When using such theorem proving systems, not everything you reason about needs to be proved, but the things you manage to prove do hold.

The way it works can be intricate, but generally propositions describe something about programs, and the programs behave as their proofs.

For example, you could have a proposition that says there is a bidirectional map between an HTTP response string and the protocol's semantics. The proof of this would be an encoder/decoder for HTTP response messages. Prove that such a map is an isomorphism, e.g. that a roundtrip produces the same value, and you have proven that the encoding is consistent.
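
As a down-to-earth sketch of that roundtrip idea, here it is as an executable check. The Response type and the encode/decode functions are stand-ins of my own, not a real HTTP implementation, and a real proof would be written in a theorem prover rather than checked on examples.

type Response = { status: number; body: string };

const encode = (r: Response): string => r.status + "\n" + r.body;
const decode = (s: string): Response => {
  const i = s.indexOf("\n");
  return { status: Number(s.slice(0, i)), body: s.slice(i + 1) };
};

// The isomorphism property: decoding an encoded value gives it back.
const roundtrips = (r: Response): boolean => {
  const d = decode(encode(r));
  return d.status === r.status && d.body === r.body;
};
console.log(roundtrips({ status: 200, body: "hello" })); // true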

Mathematical definitions can only be checked against other mathematical definitions, though such definitions can be very clean and close to how you'd state which properties you want from your program.

Type inference & checking

Earlier, for untyped lambda calculus, I described a rule:

If a normal form (↑) appears in the place of a neutral form (↓), we get a reducible form, which is marked with (⋄). Whenever a rule is applied with a reducible form in either place, the resulting expression is also reducible.

If you change this rule to require a type annotation on the normal form where it's used, then every reducible expression in your program is typed in a way that is easy to typecheck. The type of a neutral form can be inferred, and the type of a normal form can be checked.

It works because neutral forms take apart something that is already typed, which makes inferring their types easy, whereas normal forms build structures, which makes them easy to check against a type.
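
Here's the idea as a small bidirectional checker, a sketch under my own naming, with an explicit annotation node added to the term type as the modified rule requires:

type Type =
  | { tag: "base"; name: string }
  | { tag: "arrow"; from: Type; to: Type };

type Ctx = Record<string, Type>;

type ATerm =
  | { tag: "var"; name: string }
  | { tag: "app"; fn: ATerm; arg: ATerm }
  | { tag: "lam"; param: string; body: ATerm }
  | { tag: "ann"; term: ATerm; type: Type }; // the required annotation

const sameType = (a: Type, b: Type): boolean =>
  a.tag === "base"
    ? b.tag === "base" && a.name === b.name
    : b.tag === "arrow" && sameType(a.from, b.from) && sameType(a.to, b.to);

// Neutral forms are inferred.
function infer(ctx: Ctx, t: ATerm): Type {
  switch (t.tag) {
    case "var":
      if (!(t.name in ctx)) throw new Error("unbound variable " + t.name);
      return ctx[t.name];
    case "app": {
      const fnTy = infer(ctx, t.fn); // taking apart something already typed
      if (fnTy.tag !== "arrow") throw new Error("applying a non-function");
      check(ctx, t.arg, fnTy.from);
      return fnTy.to;
    }
    case "ann":
      check(ctx, t.term, t.type);
      return t.type;
    case "lam":
      throw new Error("a bare abstraction needs an annotation");
  }
}

// Normal forms are checked against an expected type.
function check(ctx: Ctx, t: ATerm, ty: Type): void {
  if (t.tag === "lam") {
    if (ty.tag !== "arrow") throw new Error("an abstraction needs an arrow type");
    check({ ...ctx, [t.param]: ty.from }, t.body, ty.to);
  } else if (!sameType(infer(ctx, t), ty)) {
    throw new Error("type mismatch");
  }
}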

Additionally, the types of simply typed lambda calculus can be entirely inferred with a constraint solver.

To some extent, if you have a type, then programs can be inferred as well. The normal/neutral/reducible separation of expressions is helpful in building all of these tools.

Immediate uses?

The most interesting use cases require more effort from somebody, though there are some ways you can use lambda calculus already.

There are plenty of computational properties presented in this description; they can be applied to other systems to some extent, and similar ideas can be used to examine those systems.

The theory here may be immediately useful for building business rule engines, though it may need something else on top to make a conveniently usable system.

Lambda calculus may be a useful communication tool. It's a clear description of mathematical functions, and it allows mechanically deducing what else you need to make things work.

It can be used to write down intuitionistic logic proofs on paper, whether formal or informal. They work like those formation rules: you write below the line what you're looking for and pile above the line what you need to get it. Next to the line you write the step that takes you from above the line to below it.
