Hindley-Milner type system/Algorithm W study

These ideas and thoughts have converged towards type theory, and I remembered familiar patterns so I did a closer study at the Hindley-Milner, to understand it much better than before.

As the basis for studying the subject, I did read few papers:

The concepts I describe here are likely better described in these papers. Despite that, I'll do a layman explanation to how Algorithm W works, then provide some observations.

Many of these observations have likely been done before. Some of them are going to be important for my work.

Language

We can treat Typed lambda calculus as a simplified version of several programming languages. It is a model deliberately made for study of the theory.

The rules have been selected such, that we can derive the observations we make here to cover the programming languages that we use. This is helpful for studying programming language theory and design.

The language consists of the following pieces:

Unary application expr(expr)
Unary lambda abstraction fn var. expr
Constant assignment let var = expr in expr
Variables var

This language is then annotated with simple types. One type for functions, an arrow is used (->) to mark it. It is accompanied by type variables.

Here's a context-free grammar for the language and its type annotations.

expr:
    expr "(" expr ")"               / app
    "fn" var "." expr               / abs
    "let" var "=" expr "in" expr    / let
    var                             / var

type:
    var "->" var
    var

terminal var

The expression is given with a type environment. In literature universal quantification (∀) is used for differentiating between "free" and "generalized" type variables. I use "forall var. expr" syntax for this.

The structure describes parametric polymorphism present in the system. I take an example that should describe this well.

id   :: forall x. x -> x
heat :: y -> y

Both of these things describe functions. The id has polymorphic parameters and it has been defined for all types you have. You can do id(sausage), id(cauliflower), id(seltzer). It returns back a value of the same type.

But in the system described by above types, you can only call heat on some specific type. For example, if the y ends up being substituted by pizza, then you can only heat pizzas.

Here's a grammar for describing bindings in type environments:

binding:
    var "::" scheme

scheme:
    type
    "forall" var+ "." type

substitution: "[" type "/" var "]"

result: expr ":" type

Additionally above we have syntax for substitutions of variables. The idea is that we can write down substitutions so that we don't need to do it immediately. Having this distinction helps with making an efficient implementation of type inference.

We also mark intermediate results with single colon. They are not part of the type environment that is produced.

Algorithm W

Once the language has been abstracted, the Algorithm W becomes easy to explain.

Given a type environment and an expression, there's a trivial way to determine the type for the expression.

Instantiation (Var)

For every occurrence of a variable, we can seek the type for it from the environment and instantiate the type.

Instantiation means that all generalized variables are replaced by fresh type variables.

Example:

id :: forall x. x -> x
heat :: y -> y
pizza :: forall g. g

The instantiation of id results in type z -> z.
The instantiation of heat results in type y -> y.
The instantiation of pizza results in type w

Subsequent instantiations always produce new, fresh type variables for all generalized variables in the expression.

Abstraction (Abs)

An abstraction binds a new type variable into the environment, and proceeds with the expression.

Example:

id :: forall x. x -> x
fn x. id(x)

The next step is:

id :: forall x. x -> x
x :: z
id(x)

The type for fn x. id(x) is a function.

The type can be produced by combining the type variable with the type we get by inferencing the body of the function.

id :: forall x. x -> x
x :: z
id(x) : z
fn x. id(x) : z -> z

Note that the type variable created may end up being substituted by something else. If that substitution exists, it appears when type inferencing the body of the function.

Application (App)

The type inference of the application is the only part in the algorithm that produces substitutions.

Lets consider the expression id(x) described earlier on.

id :: forall x. x -> x
x :: z
id(x)

The left-hand side produces w -> w.

The right-hand side produces z.

If the right-hand side produces substitutions, they are applied to the left-hand side.

Next a fresh type variable (g) is produced, then the type on the left side is unified with the type (z -> g).

Finally the fresh type variable is substituted and returned. All substitutions produced here are returned along with the result.

The unification takes two arguments and generates substitutions that need to be made to make them equal. The unify(a,b) can be described by a function:

1. if a is variable, then substitution `[b/a]` is produced.
2. if b is variable, then substitution `[a/b]` is produced.
3. if the structure of a, b match, then unify the elements.
4. otherwise, fail at the unification.

Unification of functions follow the usual rules:

s1 = unify(a.domain, b.domain)
s2 = unify(a.codomain / s1, b.codomain / s1)
return s1 ++ s2

When unification produces a substitution, it checks that the substitution isn't redundant. Eg. You don't substitute a with a. It also checks that the substitution eliminates the variable without producing recursive types. Eg. [a -> b / a] is an invalid substitution.

Example: Unification for functions (a -> b), (c -> a)

(->) matches on the both sides.
unify(a,c) therefore [c/a] is produced.
unify(b [c/a], a [c/a]) therefore.
unify(b, c) therefore [c/b] is produced.

[c/a] [c/b] are returned.

Only the unification of two types result in new substitutions.

Generalization (Let)

Let-expression presents the final piece.

r :: a
y(r) : w -> a
let z = y(r) in z

The type inference for (y(r)) is just taken care of. To construct a type environment for inferencing the (z), the y(r) must go through generalization.

The result is:

r :: a
z :: forall b. b -> a
z

Every type variable that is free in the y(r) but not free in the type environment end up being generalized. The above might happen in part of a larger expression, for example:

fn r. let z = y(r) in z

The fresh type variable that are not free in the environment must result from instantiations. Therefore they can be generalized to allow further parametric polymorphism to occur inside the expression.

Recursion

The type inferencing algorithm described above doesn't allow recursion to occur.

Simply typed lambda-calculus is not turing-complete. For example, beta-reduction eventually terminates.

The following fixed-point combinator is required for representing the recursion:

fix :: forall a. (a -> a) -> a
fix = fn f. f(fix(f))

This is a recursive clause, and therefore it's not part of simply typed lambda calculus. But we can use this combinator to represent recursion in our system.

Lets do this trick for the following 'factorial' expression and see how it works out.

fact(n) = if n > 0
    then n * fact(n-1)
    else 1

We would fail at typing the fact because it's not in our scope. We are doing the following to get it first into the type variable scope:

y_fact(fact)(n) = if n > 0
    then n * fact(n-1)
    else 1
fact = fix(y_fact)

Next we will add the necessary environment:

(>) :: (nat, nat) -> bool
(-) :: (nat, nat) -> nat
(*) :: (nat, nat) -> nat
1 :: nat
0 :: nat
fix :: forall a. (a -> a) -> a

The type inference algorithm above will reveal that the y_fact and fact become:

y_fact :: (nat -> nat) -> nat -> nat
fact :: nat -> nat

This technique is trivial to extend for mutually recursive clauses. You just imagine different kind of fixed-point combinators that do this same thing, but for multiple functions.

Whether you allow recursion, you don't need to do this kind of lifting for every function. If you calculate strongly connected components of your call graph, you can identify the parts that require this treatment.

This open up the way to type inferencing loops and other complex structures, such as SSA, with the algorithm w.

Observations

The function types described in Algorithm W are parametric. The (->) is a type constructor. It clues in how to extend the system for parametric types. Overall, you can build a very complex language over the ideas presented here.

The environment and type environment maps variables into representations and types. But type variables do not have their own environment.

Type declarations resemble equations or graphs by their behavior.

The HM type inferencers resemble constraint solvers. Though they solve the types through "equality". This approach hides a detail that you would have to otherwise see.

The types are not associated to the valuables, rather they are associated to expressions and functions in the language. The type "as it" represents two concepts: provision of a feature, and requirement of a feature.

Solving the type constraint problems by unification means that we assume an object to be useful in respect to only one feature. This ignores the details in types during the type inference.

The parametric polymorphism allowed by HM appears to explictly come from the forall -clauses. The variations between types have to be described in terms of those variations. This allows type inference for every program that is written, but it means that you cannot inference the type first and then choose the polymorphism.

Instead, the polymorphism in functionality must be written before the type is decided. This is a favorable property in Hindley-Milner type systems. The inability to subtyping tends to decrease its value though.

Unvisited ideas

There exists a correspondence between proof systems and computation models that presents there's a correspondence between Hilbert-style intuitionistic implicational logic and Simply typed combinatory logic.. This Curry-Howard correspondence is something I should study in future.

axiom ............................. variable
introduction rule ................. constructor
elimination rule .................. destructor
normal deduction .................. normal form
normalisation of deductions ....... weak normalisation
provability ....................... type inhabitation problem
intuitionistic tautology .......... inhabited type

Next it's time to put this knowledge into a good use. I believe it starts with reading Stephen Dolan's thesis again. It wouldn't be entirely necessary but this way I can verify some ideas and concepts that crystallized during this last study.