Chu construction on linear logic

None of the options I already knew for building a substrate for linear logic seemed simple enough, so I decided to keep studying.

I hadn't studied the Chu construction or Chu spaces yet and thought they would be difficult, but it turns out they don't take much effort to understand. They also turned out to be very useful.

Chu spaces provide a way to rewrite linear logic propositions into intuitionistic logic propositions. We can then interpret linear logic in the calculus of constructions or any other pure type system. This gives us implementation strategies and ways to verify the work.

First I explain what Chu spaces are, then how the Chu construction models linear logic and how linear logic can be interpreted through it.

At the end of the post there are two small prototype interpreters.

Chu spaces

A Chu space is a matrix together with some transformation rules. It is denoted in the following way:

(X,r,A)
r : (X,A) → K

X is the type of points, A is the type of states, and K is the alphabet.

To transform a Chu space (X,r,A) into another one (Y,s,B), you need a pair of functions:

f: X → Y
g: B → A
satisfying the condition s(f(x),b) = r(x,g(b))
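
To get a feel for the condition, here's a minimal sketch in Haskell, with names of my own choosing: the matrix r is a function from points and states to the alphabet, and a transform is a pair of functions whose condition we have to check separately, since the types alone don't enforce it.

data Chu x a k = Chu { matrix :: x -> a -> k }

data ChuMap x a y b = ChuMap { onPoints :: x -> y, onStates :: b -> a }

-- check s(f(x), b) = r(x, g(b)) over the given sample of points and states
valid :: Eq k => Chu x a k -> Chu y b k -> ChuMap x a y b -> [x] -> [b] -> Bool
valid (Chu r) (Chu s) (ChuMap f g) points states =
  and [ s (f x) b == r x (g b) | x <- points, b <- states ]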

Vaughan Pratt has a website dedicated to Chu spaces. The following example is from there.

String editing example

I liked examining these examples first; they cleared up a lot of things. Here we look at deleting and inserting a letter in a string, through Chu spaces.

String editing commands become Chu transforms once we represent strings as Chu spaces. All strings of length 3 can be represented as:

points: X = strings of length 3
states: A = positions in the string that we can read
K = the English alphabet
r matrix contents:

      0 1 2
"cat" C A T
"hat" H A T
"bat" B A T
"bot" B O T
...

There is a finite number of rows, but too many to write them all out here. This Chu space contains every 3-letter string you can build from a 26-character alphabet.

At first it looks a bit odd, because the row labels contain the same information as the whole table. The catch is that you can forget about head and tail.

Normally the strings would be constructed with cons and nil, and destructed with these two eliminators:

data String = (cons K String) | (nil)

head: (s:String) → (s = cons x y) → K
tail: (s:String) → (s = cons x y) → String

head (cons letter _)    = letter
tail (cons _ remaining) = remaining

When working with Chu spaces we forget these eliminators, because the contents of the Chu space contain all of that information:

  1. head(x) is the same as r(x, 0)
  2. (head∘tail)(x) is the same as r(x, 1)
  3. (head∘tail∘tail)(x) is the same as r(x, 2)

The following two functions together form a Chu transform that erases the first letter of the input string:

f: X → Y
f "cat" = "at"
f "hat" = "at"
f "bat" = "at"
f "bot" = "ot"
...

g: B → A
g 0 = 1
g 1 = 2

satisfying the condition s(f(x),b) = r(x,g(b))
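
In terms of the earlier Haskell sketch (still my own names), the 3-letter and 2-letter spaces and the deletion transform look like this:

len3, len2 :: Chu String Int Char
len3 = Chu (\s i -> s !! i)
len2 = Chu (\s i -> s !! i)

dropFirst :: ChuMap String Int String Int
dropFirst = ChuMap tail (+ 1)   -- f drops the first letter, g shifts positions by one

-- valid len3 len2 dropFirst ["cat", "hat", "bat", "bot"] [0, 1]  should give True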

When this transform is applied to the earlier Chu space, we get the following Chu space as the result:

points: Y = strings of length 2
states: B = positions in the string that we can read
K = the English alphabet
s matrix contents:

     0 1
"at" A T
"ot" O T
...

Chu spaces can be transposed so that the points and states switch places. A Chu transform has to be flipped around as well if it is to be applied to the transposed spaces: the transform above becomes f: B → A, g: X → Y, with the two functions swapping roles.

The opposite transform, which would insert a letter, is impossible to represent, because it would have to introduce new information: the letter that gets inserted. The position map could be constructed, but g would become Y → X, and we can't build it because we don't know which letter to insert:

g "at" = ?
g "ot" = ?

In this sense insertion is an active operation that adds information, whereas deleting a letter doesn't add any.

It's a trivial observation, but all of this already resembles how linear logic works.

Linear logic viewed through Chu spaces

Chu construction is a general method for constructing a star-autonomous category from a closed symmetric monoidal category. -- ncatlab.org

We get slightly different systems depending on how we do the Chu construction.

The first construction I want to show is from Vaughan Pratt's paper "Chu spaces as a semantic bridge between linear logic and mathematics". There's a detail that shows up better in this one than in the next one, which in turn is nicer to interpret and is the one actually used later.

We start with an alphabet. Let's just use zero and one:

Σ = {0,1}

The unit is described as the following Chu space:

unit = ({0}, r, Σ)
r 0 0 = 0
r 0 1 = 1

In tabular form the units look like this:

 1│0 1
──┼───
 0│0 1

~1│0
──┼─
 0│0
 1│1

The inversion just flips the points and states around.

~(A, r, X) = (X, transpose r, A)

transpose r = (x, y) ↦ r(y, x)
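
In terms of the earlier Haskell sketch this is just flip applied to the matrix:

dual :: Chu x a k -> Chu a x k
dual (Chu r) = Chu (flip r)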

The tensor product A⊗B is described as:

(A, r, X) ⊗ (B, s, Y) = ((A, B), t, F)

F is the set of all pairs of functions (f: A → Y, g: B → X) that satisfy the equation:

∀a:A.∀b:B. s(b,f(a)) = r(a,g(b)) = t((a,b),(f,g))

It's a bit tricky to work with, but the next thing is what I wanted to show.

~1*~1│
─────┤      Impossible to construct a pair of functions such that the equation holds.
  0*0│ 
  0*1│ 
  1*0│
  1*1│

It says that the tensor of the two negated units has no states at all: the state set is entirely uninhabited.

So let's look at 1*1 next.

  1*1│f h
─────┼─────
  0*0│0 1

        f│
─────────┼─────────┐
a=0 → y=0│b=0 → x=0│0

        h│
─────────┼─────────┐
a=0 → y=1│b=0 → x=1│1

We are enumerating the pairs of functions that satisfy the equation; that's what makes it a bit tricky.

 1*~1│f
─────┼─────
  0*0│0
  0*1│1

        f│0
─────────┼─────────┐
a=0 → y=0│b=0 → x=0│0
a=0 → y=0│b=1 → x=1│1

You could write a routine for finding these functions (a cruder brute-force sketch in Haskell follows the list):

  1. Scan through every combination of a and b.
  2. For each combination, find every pair of states x, y where the matrix values match.
  3. Take the cartesian product over all of the matched values, plotting the results in as you go.
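
Here is the cruder brute-force sketch of that search (all names mine): it simply enumerates every candidate pair (f : A → Y, g : B → X) as lookup tables and keeps the ones where the matrix values match everywhere.

import Data.Maybe (fromJust)

-- all functions from a finite domain to a finite codomain, as lookup tables
functions :: [a] -> [y] -> [[(a, y)]]
functions dom cod = mapM (\a -> [ (a, y) | y <- cod ]) dom

tensorStates
  :: (Eq a, Eq b, Eq k)
  => [a] -> [x] -> (a -> x -> k)    -- first space: points, states, matrix r
  -> [b] -> [y] -> (b -> y -> k)    -- second space: points, states, matrix s
  -> [([(a, y)], [(b, x)])]         -- the surviving (f, g) pairs
tensorStates as xs r bs ys s =
  [ (f, g)
  | f <- functions as ys
  , g <- functions bs xs
  , and [ s b (fromJust (lookup a f)) == r a (fromJust (lookup b g))
        | a <- as, b <- bs ]
  ]

Running it on the unit from above should find exactly the two states shown in the 1*1 table, and running it on the transposed matrices for ~1*~1 should find none.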

The empty state set that appears on certain constructions was the thing I wanted to show you.

From constructive logic to linear logic

The other Chu construction presents itself as pairs of types. It can be found in Mike Shulman's paper "Linear logic for constructive mathematics". There are also slides from Shulman's talk on the subject where the rules are neatly written out.

The idea is that we can describe Linear logic connectives in terms of a proof and a refutation for the constructed proposition.

connective : (a:Type) → (b:Type) → ¬(a ∧ b) → LinearLogicProposition
connective a b _ = (a,b)
proof p      = p+
refutation p = p-

The third argument of the constructor means that the proof and the refutation must cancel each other out (a+ ∧ a- → 0).

All linear logic connectives can be described in this way:

~a      => (a-, a+)
T       => (1, 0)
F       => (0, 1)
a & b   => (a+ ∧ b+, a- ∨ b-)
a ⊕ b   => (a+ ∨ b+, a- ∧ b-)
a ⊗ b   => (a+ ∧ b+, (a+ → b-) ∧ (b+ → a-))
a ⅋ b   => ((a- → b+) ∧ (b- → a+), a- ∧ b-)
!p      => (p+, p+ → 0)
?p      => (p- → 0, p-)
∃x.P(x) => (∃x.P(x)+, ∀x.P(x)-)
∀x.P(x) => (∀x.P(x)+, ∃x.P(x)-)
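
To make the table concrete, here is a sketch of the units and the four binary connectives as Haskell type synonyms; the names and the encoding as plain pairs of types are my own choice.

import Data.Void (Void)

type Top    = ((), Void)                                     -- T => (1, 0)
type Bottom = (Void, ())                                     -- F => (0, 1)

type With   aP aN bP bN = ((aP, bP), Either aN bN)           -- a & b
type Plus   aP aN bP bN = (Either aP bP, (aN, bN))           -- a ⊕ b
type Tensor aP aN bP bN = ((aP, bP), (aP -> bN, bP -> aN))   -- a ⊗ b
type Par    aP aN bP bN = ((aN -> bP, bN -> aP), (aN, bN))   -- a ⅋ b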

Now all valid linear logic inference rules can be described in terms of these constructions, and they can be composed together as long as we remember that the refutation side composes contravariantly.

comp :: Rule f g → Rule g h → Rule f h
comp a b =
    logic_rule (pos b . pos a) (neg a . neg b)
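
With explicit types this reads roughly as follows (Rule, pos and neg are names of my own): the proof sides compose forwards and the refutation sides compose backwards.

data Rule aP aN bP bN = Rule { pos :: aP -> bP, neg :: bN -> aN }

comp :: Rule aP aN bP bN -> Rule bP bN cP cN -> Rule aP aN cP cN
comp a b = Rule (pos b . pos a) (neg a . neg b)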

All of the linear logic connectives form functors or bifunctors in this presentation. I'm not entirely sure how to invert the inference rules.

tensor_bifunctor :: f → g → f ⊗ g
tensor_bifunctor f g =
    logic_rule
        ((x,y) ↦ ((pos f) x, (pos g) y))
        ((h,k) ↦ 
          (((neg g) . h . (pos f)),
           ((neg f) . k . (pos g))))

par_bifunctor :: f → g → f ⅋ g
par_bifunctor f g =
    logic_rule
        ((h,k) ↦ 
          (((pos g) . h . (neg f)),
           ((pos f) . k . (neg g))))
        ((x,y) ↦ ((neg f) x, (neg g) y))

Here's the list of difficult rules from the previous post. They are no longer so difficult.

id : a → a
com : a ∘ b → b ∘ a
asl : (a ∘ b) ∘ c → a ∘ (b ∘ c)
srl : a⊗(b⅋c) → (a⊗b)⅋c
ax  : T → a⅋~a
cut : a⊗~a → F

The identity/commutativity/associativity shouldn't be problematic at all. Let's look at srl and ax. Can we implement these now?

srl : a⊗(b⅋c) → (a⊗b)⅋c
srl = logic_rule srl_proof srl_refutation

srl_proof : (a⊗(b⅋c))+ → ((a⊗b)⅋c)+
srl_refutation : (a⊗(b⅋c))- ← ((a⊗b)⅋c)-

srl_proof (a, (f, g)) = (((k, j) ↦ f (k a)), (c ↦ (a, g c)))
srl_refutation ((f, g), a) = ((c ↦ (f c, a)), ((k,j) ↦ g (j a)))

ax  : T → a⅋~a
ax = logic_rule ax_proof ax_refutation

ax_proof : T+ → (a⅋~a)+
ax_proof () = (a ↦ inversion_lemma a, b ↦ inversion_lemma b)

ax_refutation : T- ← (a⅋~a)-
ax_refutation (a, na) = cancellation na a

inversion_lemma : (~a)+ = a-
  Inversion just flips the pair in the construction around, so the proof of ~a is the refutation of a.
  Programmatically this is little more than a flip-the-pair-around function.

cancellation : a+ → a- → 0
  Proof and refutation must cancel out, like described earlier on.

Unit introductions and eliminations such as the ones below are also required, but they can be implemented by just filling in the unit on the side that vanishes.

a → a ⊗ T
a ⊗ T → a

Here we have direct instructions for evaluating linear logic proofs as lambda calculus terms.

Before going deeper into this, let me briefly explain something I think is cool.

Briefly about negation

I haven't provided an implementation for cancellation above, and somebody might wonder why. Here's an explanation.

The thing is that the cancellation function is never evaluated. You may be surprised to hear that despite this it's not redundant. In fact, once you realise what it is, you may feel frustrated that your favorite programming language doesn't have the concept.

Let's examine what 0, 1 or 2 mean when they are used as types: the number counts the constructors, so 2 has two values, 1 has exactly one, and 0 has none at all.

What does it mean that there are no constructors? It means there is no value of that type: the little piece that should plug into a hole labeled 0 doesn't exist.

What if we still manage to produce a function a → 0? Now we have a way to construct a 0 from an a. Since that little piece doesn't exist, a value of a can't exist either: a function a → 0 is a refutation of a.

In logic this kind of reasoning is proof by contraposition. If we have a function (a → b), we also have (¬b → ¬a). And guess how simple ¬0 is to provide: it's equivalent to having a function (0 → 0), which is just the identity.
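
As a small Haskell sketch of the same reasoning (Not is my own synonym):

import Data.Void (Void)

type Not a = a -> Void

-- from a → b and a refutation of b we get a refutation of a
contrapositive :: (a -> b) -> Not b -> Not a
contrapositive f notB = notB . f

-- ¬0 is trivial: it's the identity function 0 → 0
notZero :: Not Void
notZero = id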

Another thing that may be perplexing to somebody stumbling onto this article is the need to restrict recursion in formal verification.

If your program is guaranteed to get stuck, it never produces a value at all, so it can claim to produce anything, even a value of type 0.

def thing(x):
    return thing(x)

You know this program would never return anything, if Python didn't create a stack record for every call and then crash. Therefore it can be given the type 0.

The same program in Haskell is:

thing :: a -> a
thing x = thing x

If you're able to produce this kind of program, then you also have a trivial way to prove anything.

absolute_proof :: unicorns
absolute_proof = absolute_proof

There we just proved that unicorns exist. Since our proof doesn't actually prove anything, it just gets stuck when evaluated. Maybe we asked too much of it... nah.

But if we know that we can't really reach some point in the program, then we don't need to supply anything there. For this purpose there is the absurd function:

absurd: 0 → c

It looks useless, because there is nothing you could ever pass to it, and indeed it never gets called. That is exactly the point: it lets you fill any unreachable hole, in whatever shape the hole requires:

(x ↦ absurd x): 0 → (0 → c)
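
In Haskell the same function exists as Data.Void.absurd; a sketch of how it can be defined, assuming the EmptyCase extension: a case expression over a type with no constructors needs no branches.

{-# LANGUAGE EmptyCase #-}
import Data.Void (Void)

absurd' :: Void -> c
absurd' v = case v of {}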

This is what makes negative proofs useful. They're also fairly easy to work with in a dependently typed language; even a simple example can show how to use them.

Example: Construction of predecessor accessor

If you'd like to construct a function that gives you the predecessor of a natural number, you stumble on the problem of what to do when you're asked for the predecessor of zero.

predecessor :: (x:Nat) → Nat
predecessor x =
    case x of
        zero ⇒ __what_here__
        succ x ⇒ x

A very pragmatic programmer would substitute [__what_here__ := zero], or use machine integers instead, but then we are stating that the predecessor of zero is zero. Programmers used to living in a Turing tarpit accept this kind of shenanigans.

Dependent typing and the bottom type provide a convenient solution. Using them, your program never has to answer the question of what the predecessor of zero is.

For example with natural numbers we might say:

has_predecessor :: Nat → Set
has_predecessor x = (case x of zero ⇒ 0, succ _ ⇒ 1)

Now if we happen to know that our natural number has a predecessor, we'll be able to produce a value of that type:

(*) : (has_predecessor 5)

Now we can implement a predecessor function that always works:

predecessor :: (x:Nat) → has_predecessor x → Nat
predecessor x x_has_predecessor =
    case x of
        zero ⇒ absurd x_has_predecessor
        succ x ⇒ x

Note that doing it directly like this requires that the type system substitutes the value being matched on into the typing context of each case branch.

This works because for any nonzero natural number we'll always be able to write:

predecessor 5 (* : has_predecessor 5)

However, we can't call the predecessor function with 0 at all. By requiring a value of has_predecessor x, the question of zero's predecessor conveniently disappears.
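
Haskell can approximate this with GADTs instead of full dependent types; here's a hedged sketch with names of my own. HasPred n has no constructor at 'Zero, so a caller simply cannot produce the evidence needed to ask for zero's predecessor.

{-# LANGUAGE GADTs, DataKinds, KindSignatures, EmptyCase #-}

data Nat = Zero | Succ Nat

data SNat (n :: Nat) where
  SZero :: SNat 'Zero
  SSucc :: SNat n -> SNat ('Succ n)

data HasPred (n :: Nat) where
  EvSucc :: HasPred ('Succ m)

data SomeNat where
  SomeNat :: SNat n -> SomeNat

predecessor :: SNat n -> HasPred n -> SomeNat
predecessor (SSucc m) EvSucc = SomeNat m
predecessor SZero     ev     = case ev of {}   -- the impossible branch

The difference from the dependently typed version is that Haskell won't substitute a runtime value into the types for us, so the singleton SNat has to carry the number at both levels.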

Disturbing proposal

This could actually translate into a useful construct in traditional programming languages.

You may have written something like this before:

assert false; // Execution should never reach this point!

Writing absurd x_has_predecessor is a bit the same, except that you state the reason, and the programming environment proves that it's true.

Traditional programming environments can't verify that such claims are true, but the idea might still translate into a useful programming construct.

Consider what if you had written this in Python:

@precondition
def has_predecessor(x):
    return x != zero

def predecessor(x, x_has_predecessor):
    if x != zero: 
        return x.predecessor
    else:
        return absurd(x_has_predecessor)

...
    x = 5
    x_has_predecessor = has_predecessor(5)
    ...
    predecessor(x, x_has_predecessor)

What's the point? The point shows up in the traceback errors once the informally stated logic eventually breaks down. Today you get this:

Traceback (most recent call last):
  File "naturals.py", line 1248, in foobar
    predecessor(x, x_has_predecessor)
  File "naturals.py", line 64, in predecessor
    assert False, "This can never happen!!"
AssertionError: This can never happen!!

Do you see the point yet? Compare it with what the absurd version could report:

Traceback (most recent call last):
  File "naturals.py", line 1248, in foobar
    predecessor(x, x_has_predecessor)
  File "naturals.py", line 64, in predecessor
    return absurd(x_has_predecessor)
Absurdity: has_predecessor(5)
Precondition was checked at the:
  File "naturals.py", line 892, in foobar

It's not nearly as useful as the real thing, formally verified logic, but it would be usable.

Prototype

The earlier linear logic system may look a bit flippant (hoping you get the pun), and you might feel entitled to respond "you silly boy! That won't work." Okay, let me show it to you.

$ mkdir haskell_chu_construction
$ cd haskell_chu_construction
haskell_chu_construction$ ...
...

Let's backtrack a bit first.

...

mkdir ../python_chu_construction
cd ../python_chu_construction

The haskell_chu_construction directory actually contains an Agda file, but it's not complete. I only managed to get it right in Python thanks to doing it in Agda first.

I don't think this is very insightful as-is, but it actually works on the small example I gave.

# -*- encoding:utf-8 -*-
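# Python 2 code. Each rule is a (proof, refutation) pair of functions;
# composition runs the proofs forwards and the refutations backwards.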

identity_f = (lambda x: x)
identity = (identity_f, identity_f)

def inv((a, b)):
    return (b, a)

def compose(*args):
    def proof(x):
        for f in args:
            x = f[0](x)
        return x
    def refutation(x):
        for f in reversed(args):
            x = f[1](x)
        return x
    return (proof, refutation)

def tensor(f, g):
    def proof((x,y)):
        return (f[0](x), g[0](y))
    def refutation((h, k)):
        return (
            (lambda x: g[1](h(f[0](x)))),
            (lambda x: f[1](k(g[0](x)))))
    return (proof, refutation)

def par(f, g):
    def proof((h, k)):
        return (
            (lambda x: g[0](h(f[1](x)))),
            (lambda x: f[0](k(g[1](x)))))
    def refutation((x,y)):
        return (f[1](x), g[1](y))
    return (proof, refutation)

def right_init_proof(x):
    return (x, (None, None))
def right_init_refutation((f, g)):
    return g((None, None))
right_init = (right_init_proof, right_init_refutation)

def left_elim_proof((f, g)):
    return f((None, None))
def left_elim_refutation(x):
    return (((None,None), x))
left_elim = (left_elim_proof, left_elim_refutation)

def ax_proof(x):
    assert x == (None, None)
    return (inv, inv)
def ax_refutation(a):
    return (None, None)
ax = (ax_proof, ax_refutation)
cut = inv(ax)

def srl_proof(x):
    a, (f,g) = x
    return (
        (lambda kj: f(kj[0](a))),
        (lambda c: (a, g(c))))
def srl_refutation(x):
    (f,g), a = x
    return (
        (lambda c: (f(c), a)),
        (lambda kj: g(kj[1](a))))
srl = (srl_proof, srl_refutation)

def p(t):
    def _message(x):
        print t, x
        return x
    return _message

def main():
    a = (p("A Forwards"), p("A Backwards"))
    b = (p("B Forwards"), p("B Backwards"))
    simple = compose(
        right_init,
        tensor(a, compose(ax, par(b, identity))),
        srl,
        par(cut, identity),
        left_elim)

    print simple[0]((1,None))
    print simple[1]((None,2))

if __name__ == "__main__":
    main()

The program being evaluated is a variation of the program from the previous post:

A ⊗ (ax . B ⅋ id) . srl . f ⅋ id . com

It should reduce down to (A . ~B), and indeed it evaluates like so:

A Forwards (1, None)
B Backwards (None, 1)
(1, None)
B Forwards (2, None)
A Backwards (None, 2)
(None, 2)

Everything just works, although it's still a bit vague to me what exactly is going on there.

Another one in Haskell!

When I finally figured out how it should go, I also ended up doing it in Haskell:

module Main where
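
-- Same encoding as the Python prototype: each rule is a (proof, refutation)
-- pair; proofs compose forwards, refutations backwards.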

identity = (id, id)
inv (x,y) = (y,x)

comp (f,f') (g,g') = (g . f, f' . g')

tensor (f,f') (g,g') = (proof, refutation) where
    proof (x,y) = (f x, g y)
    refutation (h,k) = (g'.h.f, f'.k.g)

par (f,f') (g,g') = (proof, refutation) where
    proof (h,k) = (g.h.f', f.k.g')
    refutation (x,y) = (f' x, g' y)

right_init = (proof, refutation) where
    proof x = (x, ((), undefined))
    refutation (f,g) = g ((), undefined)

left_elim = (proof, refutation) where
    proof (f,g) = f ((), undefined)
    refutation x = (((), undefined), x)

ax = (proof, refutation) where
    proof x = (inv, inv)
    refutation x = (undefined, ())

putString = (proof, refutation) where
    proof (io, s) = ((fst io) >> putStrLn (fst s), (snd io))
    refutation = undefined

cut = inv ax

srl = (proof, refutation) where
    proof (a, (f,g)) = (prove, refute) where
        prove (k,j) = (f . k) a
        refute c = (a, g c)
    refutation ((f,g), a) = (prove, refute) where
        prove c = (f c, a)
        refute (k,j) = (g . j) a

initial_io = ((return ()), undefined)

main = fst $ (fst putString) (initial_io, ("Hello",undefined))

Using this you could perhaps already write concurrent programs in Haskell: pick the Control.Concurrent module and add some more wrapping for different IO objects. Note though that you'd have to break out of the monad to retrieve values. Either it works just fine, or it fails in some odd way.

It appears that linear logic doesn't only extend lambda calculus; it can also be built out of lambda calculus.

Next I have to look into pure type systems.
