Recursion

The objective of this article is to teach recursion, recurrence and loops. After comprehending this post you will:

  1. Learn what for you need recursion.
  2. Understand how different notions of recursion relate to each other.
  3. Grok the basics for proving recursive programs correct.
  4. Know how to write recursion in new and existing languages.
  5. Capture recurrences into loops or recursion when it is feasible.

This is intended to be a mostly self-contained explanation.

In addition there are real-world examples that use structures that are explained for the extent of connecting the content of this post into current practice. It is not required that you understand them in full, they are merely there to help in case you need them.

Preliminary: Judgements & Equations

Whenever you see a single colon separating a term from a type, it means that the term is interpreted to be a value of that type.

term : type

The equality mark = strictly forms a mathematical equation that means the term on the left side can be replaced by the term on the right side, if the equation holds.

Exemption is made in examples that are written in programming languages, that assign a different meaning for these symbols.

Preliminary: Functions

Function types are denoted by an arrow, and constructed from two types, a and b.

a → b

You can construct a function by constructing b while assuming you have the a. We mark this with λ, followed by the name of a inside the clause.

λx. e

Function can extract b if you supply the a. That is, you can “apply” a function to a value.

f x

The construction and elimination of a function can be inserted or removed in pairs, for example this equation holds:

(λx. f x) y = f y

Preliminary: Type variables

In this text we use type variables. They are denoted with , and the variable names are listed. These variables can be used in the e.

∀a b c. e

We do not provide or require constructors or eliminators for the newly formed type, because it will appear as the outermost structure, and only used for denoting places where a function can take a value of any type.

Recurrence in programming

TODO: ??

This while-loop.

while (x != 10) { x := x + 1 }

You can interpret it to be equivalent to an infinite sequence of “if” -expressions.

if (x != 10) { x := x + 1 }
if (x != 10) { x := x + 1 }
if (x != 10) { x := x + 1 }
...
if (x != 10) { x := x + 1 }

Alternatively you can interpret it as if the while-loop unrolled infinitely many times.

if (x != 10) { x := x + 1 }
if (x != 10) { x := x + 1 }
if (x != 10) { x := x + 1 }
...
while (x != 10) { x := x + 1 }

Fixed point

A fixed point of a function is a value that is mapped to itself by the function.

f x = x

It is not obvious outright, but since x appears alone on the right side of the equation it means you can substitute x by f x because they have the same value.

If we do that kind of substitution, we end up with a sequence of equations such as these:

f (f x) = x
f (f (f x)) = x
f (f (f (f x))) = x
...

We can introduce a term fix to represent these elements. The fix will have the following type:

fix : ∀a. (a → a) → a

It is fully defined by the following equation:

fix f = f (fix f)

The fixed point captures the idea that the term appears inside itself in an equation. Such self-referential terms are called recursive terms.

Type theory restricts use of fixed point for a reason

If you take untyped lambda calculus, it comes with a fixed point of its own.

fix f = (λx. f (x x)) (λx. f (x x))

However, this equation cannot be given a type. The type itself would be infinite.

Check this out to verify that, lets give the fixed point a type:

fix : ∀a. (a → a) → a

Now we’ll start examining whether the proposed equation can be interpreted under our type for the fixed point.

f : (a → a) → a
⊢ (λx. f (x x)) (λx. f (x x)) : a

We are going to have to assume that the term we got here produces a function, and an argument to that function. b in the judgements below is just a placeholder variable.

(λx. f (x x)) : b
(λx. f (x x)) : b → a

This immediately means for trouble because you can see the fixed point appearing in the type, b = b → a it says.

But lets continue.

Since b is constructed by a function, it must be a function as well.

(λx. f (x x)) : (c → d)
(λx. f (x x)) : (c → d) → a

f : (a → a) → a
x : c
⊢ f (x x) : d

So we can still go further.

x x : (a → a)
x : c
x : c → (a → a)

Now we have a concrete pair of judgements that state c = c → (a → a). Therefore you have the fixed point there again.

For amusement, we can also use fixed point in type equations.

fix (λc. c → (a → a))

And here’s how it’d unroll:

fix (λc. c → (a → a)) → (a → a)
(fix (λc. c → (a → a)) → (a → a)) → (a → a)
((fix (λc. c → (a → a)) → (a → a)) → (a → a)) → (a → a)
...

This is why in typed lambda calculus, you’d struggle to define fixed point combinator, fix, without giving an equation with a term that appears within itself.

Consider these attempts to derive such a function, the question mark represents a hole in the equation:

fix f = ?
fix f = f ?
fix f = f (f ?)
fix f = f (f (f ?))

If you could construct an infinitely long term here, you could define this function. Otherwise you cannot construct the fixed point.

Fixed point captures the notions of recursion, repetition and looping.

Formal logic content of programs

Recursion has an interesting role in mathematical logic.

Here’s a judgement that 4 is a number.

4 : number

Now consider that we can also assume a number, without stating which number it is. We just promise that it can be retrieved if needed.

x : number

You can trust that ‘x’ would eventually turn out to be some number that you can read out.

Fixed point creates a nice structure that you can pass as an anything that you can come up with:

fix (λx.x) : number

All sort of expressions, such as a function that returns whatever is passed into it:

fix (λx.x) : ∀a. a → a

Or a function that converts anything into anything else:

fix (λx.x) : ∀a b. a → b

If you try to expand it with the rule of the fixed point, you get the same equation back.

(λx.x) (fix (λx.x)) = fix (λx.x)

Without recursion a valid number would mean that you have something that forms a number. With recursion you may end up with a structure that gets stuck.

This is important because you can come up with complex and convoluted types that mean interesting things, especially when you have a term for them.

Restoring logical content for recursion

All forms of recursion do not destroy the logical content of expressions. If recursion eventually forms a value it’s supposed to do, then it preserves logical content.

The very well known forms for this kind of recursion involve induction and coinduction. These rely on recursion appearing only on construction/elimination side of a structure.

Induction

Recall the struggle in constructing the fixed point:

fix : ∀a. (a → a) → a
fix f = ?
fix f = f ?
fix f = f (f ?)
fix f = f (f (f ?))
...

If we had something to plug there, we’d obtain natural numbers.

nat : ∀a. (a → a) → a → a
nat f x = x
nat f x = f x
nat f x = f (f x)
nat f x = f (f (f x))
...

You can see how the f is applied either zero or n times.

This means that we have a finitely constructed structure. We can also represent it like this:

zero : nat
succ : nat → nat

zero
succ zero
succ (succ zero)
succ (succ (succ zero))
...

Now examine the type for nat.

nat = ∀a. (a → a) → a → a

You could interpret that the zero (a) is some base case. The succ (a → a) is an inductive step.

The elimination of this kind of a natural number forms a structural induction proof.

Structural induction proof is a proof that something holds for the minimal structures (zero in this case), then, if it holds for immediate substructures of all structures (succ), then it holds for all structures.

For example, if you want to add a number defined in this way. You’d form the following structure:

add : nat → nat → nat
add n m = n (λx. x) m

Inductive structures are this kind of finitely constructed structures that can be eliminated recursively.

Coinduction

Induction has an “inverse” concept that are structures constructed recursively but they are finitely eliminated.

Coinduction requires that each step in the recursion “progresses” meaning it constructs the structure. Otherwise the construct can go on and on.

Next : ∀a. a → Stream a → Stream a

stream : Stream Nat
stream num = Next num (stream (num+1))

Coinductive structures cannot be taken apart recursively.

Examples of recursion

Next lets go through some examples of recursion in practice.

Quicksort

Sorting is to arrange items systematically. List sorting is to arrange a list of items such that the smallest item, in some sense, comes first.

Quicksort is an efficient sorting algorithm to sort a list. The idea is simple: You take some item from the list, then divide rest of items in the list to a pile that consists of smaller items, and pile that consists of larger items, then do this again for the rest of the items.

Here’s a bunch of equations to define quicksort. This sorts a list when it gets a way to compare items.

sort : ∀a. (a → a → Bool) → (List a) → (List a)
sort lte Empty = Empty 
sort lte (Cons x xs) = (sort lte (filter (λy. lte y x) xs))
                    ++ [x]
                    ++ (sort lte (filter (λy. lte x y) xs))

filter : ∀a. (a → Bool) → (List a) → (List a)
filter f Empty = Empty
filter f (Cons x xs) = if (f x)
                       then (Cons x (filter f xs))
                       else (filter f xs)

Note that to sort a list these equations require the comparison function is anticommutative, eg: lte x y = not (lte y x). You can consider what would happen with functions that are not anticommutative.

State machines

State machine is a machine that stays in some state and it can be transitioned into an another state.

You can interpret a state machine as a coinductive, recursive structure. You get the current state, and a way to transition it into the next state.

type State a t = (a, t → State a)

open_door, closed_door :: State DoorState DoorTransition
open_door
    = (Open,
       λt. case t of
           Close -> closed_door;
           Open -> open_door)
closed_door
    = (Closed,
       λt. case t of 
           Close -> closed_door
           Open -> open_door)

More common way to represent state machines is to just write down the state transitions.

door_action :: DoorState → DoorTransition → DoorState
door_action Closed Close = Closed
door_action Closed Open  = Open
door_action Open   Close = Closed
door_action Open   Open  = Open

Example: Publishing date selection

Here’s something that occurs often enough that it’s worthwhile to show an example of it.

I have a small Python script to create directories for blogposts. I commonly post every monday and create the directory a week prior the posting day. I have written a command to create or display the directory and this script adds it into the command line program.

date_cmd = subparsers.add_parser("monday",
    help="print the blog directory path for monday")
date_cmd.add_argument('--create', '-c', action='store_true')
date_cmd.set_defaults(func=date_fn(0))

Next I ended up writing also on wednesdays, so I ended up writing an another command to post on wednesdays:

date_cmd = subparsers.add_parser("monday",
    help="print the blog directory path for monday")
date_cmd.add_argument('--create', '-c', action='store_true')
date_cmd.set_defaults(func=date_fn(0))

date_cmd = subparsers.add_parser("wednesday",
    help="print the blog directory path for wednesday")
date_cmd.add_argument('--create', '-c', action='store_true')
date_cmd.set_defaults(func=date_fn(2))

Note there are differences but they’re mostly the same text that varies in two locations.

weekday, index

date_cmd = subparsers.add_parser(weekday,
    help="print the blog directory path for " + weekday)
date_cmd.add_argument('--create', '-c', action='store_true')
date_cmd.set_defaults(func=date_fn(index))

When I considered that I might sometimes also post on friday, I eventually decided to just provide all weekdays. Instead of repeating the same command 7 times, I just made a list of things that vary and folded that list into the commands.

weekdays = [
    (0, 'monday'), (1, 'tuesday'), (2, 'wednesday'), (3, 'thursday'),
    (4, 'friday'), (5, 'saturday'), (6, 'sunday') ]
for index, weekday in weekdays:
    date_cmd = subparsers.add_parser(weekday,
        help="print the blog directory path for " + weekday)
    date_cmd.add_argument('--create', '-c', action='store_true')
    date_cmd.set_defaults(func=date_fn(index))

This is a common way to treat recurrences.