Exploration into categorical abstract machines

Examining the rules I gave in the previous post, I tried to write them in terms of each other to find a minimal group of rules. About 15 of them remain, although all except the rules on multiplicatives just form pairs that interact by annihilation. The difficult rules are the multiplicatives; I could isolate them to this:

id : a → a
com : a ∘ b → b ∘ a
asl : (a ∘ b) ∘ c → a ∘ (b ∘ c)
srl : a⊗(b⅋c) → (a⊗b)⅋c
ax  : 1 → a⅋~a
cut : a⊗~a → ~1
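
For the experiments below it's handy to have these rules as a plain syntax tree. Here's a minimal Haskell sketch; the encoding and its names are my own:

data Term
  = Id               -- id  : a → a
  | Com              -- com : a ∘ b → b ∘ a
  | Asl              -- asl : (a ∘ b) ∘ c → a ∘ (b ∘ c)
  | Srl              -- srl : a⊗(b⅋c) → (a⊗b)⅋c
  | Ax               -- ax  : 1 → a⅋~a
  | Cut              -- cut : a⊗~a → ~1
  | Seq Term Term    -- composition, written '.' in the rules
  | Tensor Term Term -- f ⊗ g
  | Par Term Term    -- f ⅋ g
  deriving (Eq, Show)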

Y. Lafont's paper "The linear abstract machine" has rules similar to the ones I just wrote. Though instead of srl, ax and cut, the paper provides the following:

cur(f : a⊗b → c) : a → (b ⊸ c)
app : (a ⊸ b)⊗a → b

Lafont's combinators form intuitionistic linear logic. These combinators can be implemented with our "classical" linear logic combinators:

cur f = id ⊗ ax . srl . f ⅋ id . com
app   = com . srl . cut ⅋ id
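
In the Term encoding from above, the same two definitions read:

cur :: Term -> Term
cur f = Tensor Id Ax `Seq` Srl `Seq` Par f Id `Seq` Com

app :: Term
app = Com `Seq` Srl `Seq` Par Cut Id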

I'm not sure whether this can be done the other way around without introducing problems into how the linear abstract machine works.

Categorical abstract machine's evaluation

(If you wonder where to read more about categorical abstract machines, there's one in Tatsuya Hagino's thesis about the Categorical programming language. There's also a paper called "The Categorical Abstract Machine", written by G. Cousineau, P.-L. Curien and M. Mauny.)

It is worth mentioning how both the categorical abstract machine and Lafont's linear abstract machine work. The principle can be summarized like this: there are a number of elements that rewrite to something else when they meet:

⊳⋄⊲ ↝ ⋄

Since we have composed our program out of these sequences, we can obtain a normal form by interpreting the sequences from left to right and keeping a buffer that consists only of normal-forms. Here's a simplified illustration of how that happens, with very simple rules:

⋄⊳⊲⊳⊲⊲⊳⊲⊲⊳⊲⊲⊲⊳⊲⊳
⊳⋄⊲⊳⊲⊲⊳⊲⊲⊳⊲⊲⊲⊳⊲⊳
  ⋄⊳⊲⊲⊳⊲⊲⊳⊲⊲⊲⊳⊲⊳
  ⊳⋄⊲⊲⊳⊲⊲⊳⊲⊲⊲⊳⊲⊳
    ⋄⊲⊳⊲⊲⊳⊲⊲⊲⊳⊲⊳
    ⊲⋄⊳⊲⊲⊳⊲⊲⊲⊳⊲⊳
    ⊲⊳⋄⊲⊲⊳⊲⊲⊲⊳⊲⊳
      ⊲⋄⊲⊳⊲⊲⊲⊳⊲⊳
      ⊲⊲⋄⊳⊲⊲⊲⊳⊲⊳
      ⊲⊲⊳⋄⊲⊲⊲⊳⊲⊳
        ⊲⊲⋄⊲⊲⊳⊲⊳
        ⊲⊲⊲⋄⊲⊳⊲⊳
        ⊲⊲⊲⊲⋄⊳⊲⊳
        ⊲⊲⊲⊲⊳⋄⊲⊳
          ⊲⊲⊲⊲⋄⊳
          ⊲⊲⊲⊲⊳⋄

The annihilations change the program state and eventually we finish with some normal form in the buffer. Category theory is a tool for constructing logic out of these kinds of objects that form such interacting pairs.
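
As a sanity check, here is that left-to-right pass as a minimal Haskell sketch, writing ⊳ as '>' and ⊲ as '<'; the buffer is a stack of elements that are already in normal form:

normalize :: String -> String
normalize = reverse . foldl step []
  where
    step ('>':buf) '<' = buf       -- ⊳⋄⊲ ↝ ⋄ : the pair annihilates
    step buf       c   = c : buf   -- otherwise the element joins the buffer

Running normalize on "><><<><<><<<><>" yields "<<<<>", matching the run illustrated above.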

The evaluation can be fairly straightforward because there is some certainty about what kinds of actions are taken.

fst ⋄ pair(a,b) = a⋄
app ⋄ pair(x, y):
  start evaluating x⋄
  when it finishes, we end up with 'cur(f).z' in the buffer.
  = f ⋄ pair(z, y)
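
A minimal Haskell sketch of these rules, assuming code and values are kept as separate types (all names here are mine):

data Cam
  = CFst | CSnd | CApp
  | CCur Cam
  | CPair Cam Cam
  | CComp Cam Cam        -- f . g, read as "apply g first, then f"
  | CLit Int

data Value
  = VPair Value Value
  | VCur Cam Value       -- the 'cur(f).z' closure from the rule above
  | VLit Int

eval :: Cam -> Value -> Value
eval CFst (VPair a _)  = a                  -- fst ⋄ pair(a,b) = a⋄
eval CSnd (VPair _ b)  = b
eval (CPair f g) v     = VPair (eval f v) (eval g v)
eval (CCur f) v        = VCur f v
eval CApp (VPair x y)  = case x of          -- app ⋄ pair(x, y)
  VCur f z -> eval f (VPair z y)            --   = f ⋄ pair(z, y)
  _        -> error "app expects a closure"
eval (CComp f g) v     = eval f (eval g v)
eval (CLit n) _        = VLit n
eval _ _               = error "ill-typed program"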

The evaluation may also proceed on-demand:

pair(a,b)*: result obtained.
cur(f)*: result obtained.
app * pair(x,y):
  get result from *x
  the result is cur(f)*z because the code has been typechecked
  * f . pair(z, y)
pi1 * pair(x,y) = x
a * b = a . b *

There is a little complication in the on-demand evaluation if the pair(x,y) runs out of items before there is a cur(f). In that case the pair(x,y) becomes an active element that eats the next item.
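
Incidentally, in the Haskell sketch above the on-demand behaviour comes for free from laziness: the components of a pair are thunks that are only forced when something like fst or app demands them. Continuing that sketch:

-- The second component is never demanded, so the error never fires:
demo :: Value
demo = eval CFst (VPair (VLit 1) (error "never demanded"))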

It still takes pleasingly little effort to evaluate lambda calculus in this way. Unfortunately, the rules from the beginning of this post mess it up for classical linear logic.

Lafont's paper also has this very neat way to construct exponentials:

eins : !a → a
edrop : !a → 1
edup : !a → !a ⊗ !a
make(ins: a → b, drop: a → 1, dup: a → a⊗a) : a → !b

This allows the elimination of tricky rules in the exponentials (!(a ⊗ ?b) → !a ⊗ ?b, !a ⊗ !b → !(a ⊗ b)), so I'm going to steal it.
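
A rough Haskell rendering of that interface; plain Haskell cannot enforce linearity, and the recursive unfolding of make is my own reading of it:

data Bang a = Bang { eins :: a, edup :: (Bang a, Bang a) }
-- edrop needs no field: in plain Haskell, discarding a value is free.

make :: (r -> a) -> (r -> ()) -> (r -> (r, r)) -> (r -> Bang a)
make ins _drop dup = go
  where
    go r = let (r1, r') = dup r    -- one copy feeds ins,
               (r2, r3) = dup r'   -- the rest keep duplicating
           in Bang (ins r1) (go r2, go r3)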

There remains an important question after that exploration, though: should I also adopt the cur/app combinators? If these combinators are equivalent, should I continue with the tricky-looking symmetric combinators, or switch to the simple cur/app ones?

I'd like to see if I can keep the symmetric combinators, because the ability to negate proofs is interesting. I think they convey the rules present in linear logic better. One example of this is that I don't need to introduce continuations separately: the "program" to introduce continuations can be produced with the combinators alone.

One way to work this out would be to treat the multiplicative fragment as a substrate. But a question remains: can we eliminate the rules in the substrate, or radically simplify it? Is there a tool for doing that?

Multiplicative units and srl

There seem to be at least three scenarios where the srl rule disappears:

a ⊗ (ax) . srl . (cut) ⅋ b = a . b
(a) ⊗ b . srl . c ⅋ d = b . ((a)⊗id.c)⅋d
a ⊗ b . srl . c ⅋ (d) = a⊗(b.id⅋(d)) . c

In short, either the srl interacts with cut/axiom rules, or the whole rule slides out from the structure and gets untangled.
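
For instance, the first scenario can be written as a rewrite over the Term sketch from the beginning, assuming the chain is associated exactly as written; a real implementation would have to match modulo associativity:

elimSrl :: Term -> Term
elimSrl (Seq (Seq (Tensor a Ax) Srl) (Par Cut b)) = Seq a b   -- a ⊗ ax . srl . cut ⅋ b = a . b
elimSrl t = t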

Multiplicative units are involved when ending composition chains. It may be interesting to examine them for a moment. It'd seem we may get any of the 8 possible configurations of units around srl:

1⊗(1⅋1) → (1⊗1)⅋1 --- leads to elimination
1⊗(~1⅋1) → (1⊗~1)⅋1 --- leads to elimination
1⊗(1⅋~1) → (1⊗1)⅋~1 --- leads to elimination
1⊗(~1⅋~1) → (1⊗~1)⅋~1 --- leads to elimination
~1⊗(1⅋1) → (~1⊗1)⅋1
~1⊗(~1⅋1) → (~1⊗~1)⅋1
~1⊗(1⅋~1) → (~1⊗1)⅋~1 --- leads to elimination
~1⊗(~1⅋~1) → (~1⊗~1)⅋~1 --- leads to elimination

There are two configurations where it's likely we won't see any sort of elimination. It may be that they would only appear in structures that cannot be evaluated.

The given elimination rules would seem sufficient, but they require that we track the appearance of unit types and axiom rules in the expressions. Whenever the composition chain can be slid so that a unit "cuts" the srl apart, the srl disappears.

In short, if a suitable unit appears inside a conjunction, it cuts the chain. If the chain is cut, it may unravel out of the well-formed structure rules.

BCI combinators in there

Btw. you can identify the BCI combinators in the deduction rules. This isn't related to the rest of the post, but it's something fun to pick out.

The I combinator is just id or ax. The C combinator can be obtained from com and asl. The srl rule is the B combinator in disguise. That last one is a particularly funny thing to notice. First take the signature of the B combinator and translate it:

B : (b -> c) -> (a -> b) -> a -> c

B : (b ⊗ ~c) ⅋ (a ⊗ ~b) ⅋ ~a ⅋ c

Next take the srl rule and flip the whole thing to the right side:

ax.srl : 1 → (~a⅋ (~b⊗~c)) ⅋ ((a⊗b)⅋c)

The ~a ⅋ c pops out right there; move it to the right. Next, invert the b, and you get:

(b ⊗ ~c) ⅋ (a ⊗ ~b) ⅋ (~a ⅋ c)

There you have the B combinator in srl.
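
For reference, the three combinators with their usual functional types:

i :: a -> a
i x = x

c :: (p -> q -> r) -> (q -> p -> r)
c f y x = f x y

b :: (q -> r) -> (p -> q) -> (p -> r)
b g f x = g (f x)   -- the shape we just unfolded out of srl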

Duals with everything

Before we give up entirely, there are still a few unexamined options. What would we obtain as duals if we just inverted the curry/eval structures?

cur(f : a⊗b → c) : a → (~b ⅋ c)
app : (~a ⅋ b)⊗a → b

To invert them, invert the arrows and replace the types with their duals.

cocur(f : c → a⅋~b) : (b ⊗ c) → a
coapp : b → (a ⊗ b)⅋~a

This seems to break something, as we can now pass coapp into app.

app . pair(coapp, id) : ((~a ⊗ b) ⅋ c)⊗(a ⅋ ~b) → c

Likewise we may pass pair into cocur.

cocur(f).pair(x,y) : a ⊗ b → c

Also, it'd seem we cannot pass cocur into coapp, because the dual for pair is missing. cocur even seems like it could evaluate spontaneously.

We could also treat negation as a structure, basically:

neg(f : a → b) : ~b → ~a

Then we can no longer pass coapp into app, but the rules still seem weird, because coapp only fits inside cocur.

cocur(coapp) : (b ⊗ d) → (b ⊗ d)
cocur(coapp . a) = pair(pi1, a.pi2)

This would seem to break down, because we're using pair in reasoning that should be additive conjunction in linear logic. Also, we can apply negation to composition:

neg(a.b) = ~b . ~a

So if we stay aware of these rules, we can still get by with fairly simple ones.

The cocur(coapp . f) seems like one half of a bifunctor for multiplicative conjunction. The corresponding half for multiplicative disjunction would be a map (a ⅋ d) → (b ⅋ d):

cur(app . cocur(coapp . f)) : (a ⅋ d) → (b ⅋ d)

The introduction of all these rules doesn't seem to solve anything, but at least we now have some new tools and machinery to solve something with. To understand it better, it may be worthwhile to examine the evaluation rules again.

The evaluation-by-demand rules suggest that we could have some really trivial evaluation of operations such as app and pi1. They try to evaluate, and if they can't, they go to the stack. Both react if there's a pair next. pi1 just selects the leftmost item from the pair, removes the pair and drops the item to the evaluator. app triggers evaluation on the leftmost item to obtain a cur(f), then takes it and drops the f on the front of the pair.

This pair-shuffling and dropping is what we care about in the first place. Thanks to the negation rule, we would now be able to start on-demand reduction from either end.

You could shuffle the values in different ways to obtain different versions of app:

lrapp : (~a ⅋ b)⊗a → b
rrapp : (b ⅋ ~a)⊗a → b
llapp : a⊗(~a ⅋ b) → b
rlapp : a⊗(b ⅋ ~a) → b

The negation flips the "reaction side" of the element.

cur(f : a⊗b → c) : a → (~b ⅋ c)
app : (~a ⅋ b)⊗a → b
~app : b → (~a ⊗ b)⅋a 
~cur(f : c → a⅋b) : (~b ⊗ c) → a

Unfortunately this makes the rules fancy too.

app . (cur(f)⊗g) = f . (id⊗g)
(~cur(f) ⅋ g) . ~app = (id⅋g) . f

app⅋id . ~app = com ⅋ id . com . ~app : b → a⅋(b ⊗ ~a)
app . ~app⊗id = app . com ⊗ id . com  : a⊗(b ⅋ ~a) → b
cur(~cur(f : c → b⅋a)) = com . cur(~f . com) : a → (c ⅋ ~b)
~cur(cur(f)) = cur(com . ~f) . com

The number of rules here seems infuriating. At least our srl rule just returns a variation of itself when inverted. But it makes you wonder: do we need anything special after all? Well, I guess a substrate is needed for the multiplicative logic operators.

Continuation passing style

A well-known extension to lambda calculus is continuations. Continuations can be transformed back to lambda calculus, though. This happens by never returning after a call.

In the categorical abstract machine, it roughly means a transformation such as the following:

g . app . pair(cur(f).z, x)

= app . pair(app . pair(cur(cur(f)).z, cur(g . snd)), x)

The function signatures take the form below. The return value is no longer interesting, so we blank it out.

(b ⊸ ¤) ⊸ a ⊸ ¤
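
In functional terms this is the usual CPS wrapping of a function; a tiny Haskell sketch, with x standing in for ¤:

cps :: (a -> b) -> ((b -> x) -> a -> x)
cps f k = k . f   -- the call never returns: the result goes to the continuation k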

In classical linear logic, it is clear what we just did. Variables no longer move in the other direction? Then it's an axiom or a cut.

(b ⊗ ~¤) ⅋ ~a ⅋ ¤

We could add something into the hole to give the computation some context, but if we clear it and make it a unit, it means that we've essentially just added flow-direction markers.

The continuations still have to be exposed to the user. This is usually done through a little function, call/cc; let's call it E, just like the linear continuations paper calls it.

E : (a ⊸ ¤) ⊸ ¤ ⊸ a

What happens to this signature if we rewrite it?

E : (a ⊸ ¤) ⊸ (((¤ ⊸ ¤) ⊸ a ⊸ ¤) ⊸ ¤) ⊸ ¤

Seems difficult. Here's a term that does it:

E = f ↦ g ↦ g (k ↦ x ↦ k (f x))

If you wonder whether it's correct: yes, we can typecheck this and see that it matches. GHCi would give the following output:

(r3 -> r2) -> (((r2 -> r1) -> r3 -> r1) -> r) -> r
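
For reference, the definition that produces this type, up to renaming of type variables:

e :: (r3 -> r2) -> (((r2 -> r1) -> r3 -> r1) -> r) -> r
e f g = g (\k x -> k (f x))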

Continuation passing gives all the pieces for classical linear logic, but it's also more complicated than having a few rules.

There doesn't seem to be much more to do with this. To solve this I have to build a substrate that conveys the rules of linear logic and can be written and evaluated. It is needed, and it shapes how the rest of the system turns out.
