The categorical abstract machine Mk.2
The categorical abstract machine is a neat virtual machine designed in 1987 by Cousineau, Curien and Mauny.
This post extends the categorical abstract machine with typechecking and proof search.
De Bruijn's notation can be used to give a simple denotational semantics to the lambda calculus.
{0}(p,d) = d, {n+1}(p,d) = {n}p,
{c}p = c,
{MN}p = {M}p({N}p),
{λ.M}pd = {M}(p,d).
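Transcribed into Haskell, the equations look like this. This is just my own sketch to fix intuitions; the types DB and Val are hypothetical and not part of the machine below, and the environment is a list consed on the left rather than a right-nested pair.
-- Hypothetical sketch of the semantics above; not part of the CAM code.
data DB = DVar Int | DCon Int | DApp DB DB | DLam DB

data Val = VCon Int | VFun (Val -> Val)

denote :: DB -> [Val] -> Val
denote (DVar 0)   (d:_) = d                            -- {0}(p,d) = d
denote (DVar n)   (_:p) = denote (DVar (n-1)) p        -- {n+1}(p,d) = {n}p
denote (DCon c)   _     = VCon c                       -- {c}p = c
denote (DApp m n) p     = case denote m p of           -- {MN}p = {M}p({N}p)
                            VFun f -> f (denote n p)
                            _ -> error "not a function"
denote (DLam m)   p     = VFun (\d -> denote m (d:p))  -- {λ.M}pd = {M}(p,d)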
CAM is a very simple virtual machine based on these semantics. It has only a few state transition rules, which can be easily described in Haskell.
data Sem a = Nil
           | Cons (Sem a) (Sem a)
           | Closure (Sem a -> Sem a)
           | Syn a
data Tm = MPair [Tm] [Tm]
        | MAbs [Tm]
        | MConst
        | MFst
        | MSnd
        | MAp
        deriving (Show)
eval :: Sem a -> Tm -> Sem a
eval v (MAbs f)    = Closure (\w -> foldl eval (Cons v w) f) -- capture the environment v
eval v (MPair f g) = Cons (foldl eval v f) (foldl eval v g)  -- run both branches on v
eval _ MConst      = Nil                                     -- the terminal value
eval (Cons f _) MFst = f
eval (Cons _ g) MSnd = g
eval (Cons (Closure f) w) MAp = f w                          -- apply a closure to its argument
Described in this way, it's an obvious implementation of normalisation by evaluation. The "eval" constructs an initial semantic interpretation of a term.
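To see it in action, here's a small evaluation traced by hand (my own example, not from the post):
--   foldl eval Nil [MPair [MConst] [MConst], MFst]
-- = eval (eval Nil (MPair [MConst] [MConst])) MFst
-- = eval (Cons Nil Nil) MFst
-- = Nil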
Throughout this blog post I use the following "type information". We use the categorical terms "terminal", "product" and "exponent" to refer to units, tuples and functions. There are also two sets of type variables: some for me, and some for you.
data Ty = Terminal
        | Prod Ty Ty
        | Exp Ty Ty
        | Some Int
        | Var Int
        deriving (Show, Eq)
The Ty values have a similar role to Sem values, but on a different level: Ty values are produced by interpreting functors, whereas Sem values are produced by interpreting natural transformations.
Since we have an NbE, we should be able to reify semantics back into syntactic terms, and reflect syntactic terms into semantics.
reify :: Ty -> Sem [Tm] -> [Tm]
reify Terminal Nil = [MConst]
reify (Prod a b) (Cons x y) = [MPair (reify a x) (reify b y)]
reify (Exp a b) (Closure f) = [MAbs (reify a (f (reflect b (scope a))))]
  where scope (Exp a _) = MFst : scope a
        scope _ = [MSnd]
reify (Some _) (Syn t) = t
reify (Var _) (Syn t) = t
reflect :: Ty -> [Tm] -> Sem [Tm]
reflect Terminal _ = Nil
reflect (Prod a b) t = Cons (reflect a (t ++ [MFst]))
                            (reflect b (t ++ [MSnd]))
reflect (Exp a b) t = Closure (\x -> reflect a [MPair t (reify b x), MAp])
reflect (Some _) t = Syn t
reflect (Var _) t = Syn t
nbe :: Ty -> Ty -> [Tm] -> [Tm]
nbe a b t = reify b (foldl eval (reflect a (scope b)) t)
  where scope (Exp e _) = MFst : scope e
        scope _ = []
Given a well-typed term, reify should be a total function. I'm not convinced at all that this is correct, because the reconstruction of the scope in reify seems funky, but examining how the evaluator produces the closure suggests it'd be necessary.
*CAM2> nbe (Some 0) (Some 0) []
[]
*CAM2> nbe (Prod (Some 0) Terminal) (Some 0) [MFst]
[MFst]
*CAM2> nbe (Prod (Some 0) Terminal) Terminal [MSnd]
[MConst]
*CAM2> nbe Terminal (Exp (Some 0) (Some 0)) [MAbs [MSnd]]
[MAbs [MSnd]]
*CAM2> nbe Terminal (Exp (Exp (Some 0) (Some 1)) (Some 0)) [MAbs [MAbs [MFst,MSnd]]]
[MAbs [MAbs [MFst,MSnd]]]
The computation yields a normal form, and it's very interesting to examine how normal forms are shaped. If you don't share this opinion, watch some of Frank Pfenning's logic lectures.
We can work out what the normal form is through two methods. One is to look at the evaluator and think about where it would get stuck. Another is to examine the reify/reflect functions we just described.
data Norm = Pair Norm Norm
          | Abs Norm
          | Const
          | Neut Neut
          deriving (Show, Eq)
data Neut = Id
          | Fst Neut
          | Snd Neut
          | Ap Neut Norm
          deriving (Show, Eq)
There's something interesting happening with the neutral form of "apply". It gets this weird [MPair t (reify b x), MAp] form; that is, it gets stuck with a pair that is "loaned" to it. Next we define nreify and nreflect.
nreify :: Ty -> Sem Neut -> Norm
nreify Terminal Nil = Const
nreify (Prod a b) (Cons x y) = Pair (nreify a x) (nreify b y)
nreify (Exp a b) (Closure f) = Abs (nreify a (f (nreflect b (scope a Id))))
  where scope (Exp a _) x = scope a (Fst x)
        scope _ x = Snd x
nreify (Some _) (Syn t) = Neut t
nreify (Var _) (Syn t) = Neut t
nreflect :: Ty -> Neut -> Sem Neut
nreflect Terminal _ = Nil
nreflect (Prod a b) t = Cons (nreflect a (Fst t)) (nreflect b (Snd t))
nreflect (Exp a b) t = Closure (\x -> nreflect a (Ap t (nreify b x)))
nreflect (Some _) t = Syn t
nreflect (Var _) t = Syn t
nnbe :: Ty -> Ty -> [Tm] -> Norm
nnbe a b t = nreify b (foldl eval (nreflect a (scope b Id)) t)
  where scope (Exp e _) x = scope e (Fst x)
        scope _ x = x
Now that we have normal forms, what do they mean as terms? That can be written down like this.
meaning :: Norm -> [Tm]
meaning (Pair a b) = [MPair (meaning a) (meaning b)]
meaning (Abs a) = [MAbs (meaning a)]
meaning Const = [MConst]
meaning (Neut Id) = []
meaning (Neut (Fst a)) = meaning (Neut a) ++ [MFst]
meaning (Neut (Snd b)) = meaning (Neut b) ++ [MSnd]
meaning (Neut (Ap f n)) = [MPair (meaning (Neut f)) (meaning n), MAp]
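For instance, tracing these definitions by hand (my own check), the identity abstraction reads back like this:
--   meaning (Abs (Neut (Snd Id)))
-- = [MAbs (meaning (Neut (Snd Id)))]
-- = [MAbs [MSnd]]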
Now we get normal forms as structures that cannot be filled by anything but a normal form.
*CAM2> nnbe (Some 0) (Some 0) []
Neut Id
*CAM2> nnbe (Prod (Some 0) Terminal) (Some 0) [MFst]
Neut (Fst Id)
*CAM2> nnbe (Prod (Some 0) Terminal) Terminal [MSnd]
Const
*CAM2> nnbe Terminal (Exp (Some 0) (Some 0)) [MAbs [MSnd]]
Abs (Neut (Snd Id))
*CAM2> nnbe Terminal (Exp (Exp (Some 0) (Some 1)) (Some 0)) [MAbs [MAbs [MFst,MSnd]]]
Abs (Abs (Neut (Snd (Fst Id))))
*CAM2> nnbe Terminal (Exp (Exp (Some 0) (Some 1)) (Prod Terminal (Some 0))) [MAbs [MAbs [MFst,MSnd,MSnd]]]
Abs (Abs (Neut (Snd (Snd (Fst Id)))))
Why come up with such a structure? Well, there are some things we can do with it.
Normal forms are shaped such that they can be typechecked without including annotations inside them. We can use a bidirectional typechecking algorithm for that:
check :: Ty -> Ty -> Norm -> Bool
check c Terminal Const = True
check c (Prod a b) (Pair f g) = check c a f && check c b g
check c (Exp a b) (Abs f) = check (Prod c b) a f
check a c@(Some _) (Neut n) = maybe False (==c) (infer a n)
check a c@(Var _) (Neut n) = maybe False (==c) (infer a n)
check _ _ _ = False
infer :: Ty -> Neut -> Maybe Ty
infer a (Fst n) = fmap fst (infer a n >>= valid_product)
infer a (Snd n) = fmap snd (infer a n >>= valid_product)
infer a (Ap n f) = do
  (c,d) <- infer a n >>= valid_exponent
  guard (check a d f) >> pure c
infer a Id = Just a
valid_product :: Ty -> Maybe (Ty,Ty)
valid_product (Prod a b) = Just (a,b)
valid_product _ = Nothing
valid_exponent :: Ty -> Maybe (Ty,Ty)
valid_exponent (Exp a b) = Just (a,b)
valid_exponent _ = Nothing
guard :: Bool -> Maybe ()
guard True = pure ()
guard False = Nothing
Note that the right side of the Neut structure is supposed to be abstract. It's there because we could substitute Id away if we knew its type, so for the structure to be in the specific normal form that we want, this property is required.
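If I'm reading check correctly, this is also why a bare neutral at a product type is rejected while its expanded form passes. These are my own hand-checked examples, not from the post:
-- check (Prod (Some 0) (Some 1)) (Prod (Some 0) (Some 1)) (Neut Id)
--   == False  -- falls through to the catch-all case
-- check (Prod (Some 0) (Some 1)) (Prod (Some 0) (Some 1))
--       (Pair (Neut (Fst Id)) (Neut (Snd Id)))
--   == True   -- the expanded form is accepted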
Let's try this out on something, just to verify it's not entirely wrong.
*CAM2> check Terminal (Exp (Exp (Some 0) (Some 1)) (Prod Terminal (Some 0))) (Abs (Abs (Neut (Snd (Snd (Fst Id))))))
True
This is useful, but in practice you may want to store programs that aren't in normal form. However, it is also possible to form arbitrary terms by annotating their types into the leaves.
To understand how to produce internally verified programs, you need to take a little trip into category theory. The computation rules described here can be derived from adjoint functors. Adjointness is a rich condition providing plenty of structure that can be relied upon.
For functors to be adjoint, they must form unit/counit natural transformations. For example, pair/(fst*snd) forms a unit/counit pair: the pair functor is a right adjoint for the diagonal functor.
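Spelled out for this adjunction, in the same notation as the naturality equations below, the triangle identities read (my own spelling-out, not from the machine code):
fst ∘ pair = id
snd ∘ pair = id
(fst * snd) ∘ pair = id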
Functors form structures that behave like polymorphic types, and natural transformations are a lot like functors themselves. The condition is that a natural transformation behaves like a structure-preserving transformation from one functor to another. This means that units and counits allow things to "pass through":
f ∘ fst = fst ∘ (f * g)
g ∘ snd = snd ∘ (f * g)
(f * f) ∘ pair = pair ∘ f
If we recognize this property, we can conclude that natural transformations can be treated as if they took parameters. This also makes them self-identifying:
fst f g
snd f g
pair f
After all, natural transformations are families of morphisms. If you identify the parameters they take, then you can identify individual morphisms by filling in the parameters.
Next I construct the natural transformation presentation. It's a little bit neat.
data NT
  = NSym
  | NTerm
  | NComp NT NT
  | NProd NT NT
  | NExp NT NT
  | NConst NT
  | NPair NT
  | NCur NT NT
  | NFst NT NT
  | NSnd NT NT
  | NAp NT NT
  deriving (Eq, Show)
The validation is surprisingly simple.
nt_env :: NT -> Maybe (NT, NT)
nt_env NSym = pure (NSym, NSym)
nt_env NTerm = pure (NTerm, NTerm)
nt_env (NComp f g) = do
  (a,b) <- nt_env f
  (c,d) <- nt_env g
  guard (b == c)
  pure (a,d)
nt_env (NProd x y) = do
  (a,b) <- nt_env x
  (c,d) <- nt_env y
  pure (NProd a c, NProd b d)
nt_env (NExp x y) = do
  (a,b) <- nt_env x
  (c,d) <- nt_env y
  pure (NProd a d, NProd b c)
nt_env (NConst x) = do
  (_,c) <- nt_env x
  pure (NTerm, c)
nt_env (NPair f) = do
  (a,b) <- nt_env f
  pure (NProd a a, b)
nt_env (NCur f g) = do
  (a,b) <- nt_env f
  (c,d) <- nt_env g
  pure (NExp (NProd a d) c, b)
nt_env (NFst f g) = do
  (a,b) <- nt_env f
  (_,d) <- nt_env g
  pure (a, NProd b d)
nt_env (NSnd f g) = do
  (_,b) <- nt_env f
  (c,d) <- nt_env g
  pure (c, NProd b d)
nt_env (NAp f g) = do
  (a,b) <- nt_env f
  (c,d) <- nt_env g
  pure (a, NProd (NExp b c) d)
You might be asking: why does this work? What are we doing differently here? The trick would appear to be that we end up correctly describing the full shapes of fst, snd and const.
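For instance, tracing nt_env by hand (my own check, not a captured session), the first projection comes out with both of its parameters explicit:
--   nt_env (NFst NSym NSym)
-- = Just (NSym, NProd NSym NSym)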
Now, as a reminder that we aren't going to abandon lambda calculus, let's leave these structures here.
data LC = LVar Int
        | LConst
        | LLam LC | LAp LC LC
        | LPair LC LC | LFst LC | LSnd LC
        deriving (Show)
It's trivial to build up functions relating these structures to the main content of this blog post. For now I leave them out.
There's still one thing about those normal forms that I didn't cover: we can use them to drive a proof search. Actually, if you look at those natural transformations, you may notice that they allow you to treat type inference as a form of proof search. This is a relatively cool thing, and I'm going to cover it after I've introduced how to do proof search with these structures.
First we're going to need a logic programming environment.
class Unifiable a where
  var :: Int -> a
  as_var :: a -> Either Int a
  rewrite :: (Int -> Maybe a) -> a -> a
  overlap :: a -> a -> Maybe [(a,a)]
  occurs :: Int -> a -> Bool
And we have to lay it out over our type declarations.
instance Unifiable Ty where
  var i = Var i
  as_var (Var i) = Left i
  as_var a = Right a
  rewrite _ Terminal = Terminal
  rewrite f (Prod x y) = Prod (rewrite f x) (rewrite f y)
  rewrite f (Exp x y) = Exp (rewrite f x) (rewrite f y)
  rewrite _ (Some i) = Some i
  rewrite f x@(Var i) = case f i of
    Nothing -> x
    Just v -> v
  overlap Terminal Terminal = pure []
  overlap (Some i) (Some j) | i == j = pure []
  overlap (Var i) (Var j) | i == j = pure []
  overlap x@(Var _) y = pure [(x, y)]
  overlap x y@(Var _) = pure [(x, y)]
  overlap (Prod x y) (Prod z w) = do
    xs <- overlap x z
    ys <- overlap y w
    pure (xs ++ ys)
  overlap (Exp x y) (Exp z w) = do
    xs <- overlap x z
    ys <- overlap y w
    pure (xs ++ ys)
  overlap _ _ = Nothing
  occurs i (Var j) = i == j
  occurs _ Terminal = False
  occurs i (Prod x y) = any (occurs i) [x,y]
  occurs i (Exp x y) = any (occurs i) [x,y]
  occurs _ (Some _) = False
You can think of unification as a coequalizer in a suitable category where morphisms form substitutions.
-- These definitions need Data.Map:
--   import Data.Map (Map)
--   import qualified Data.Map as Map
unify :: Unifiable a => a -> a -> Map Int a -> Maybe (Map Int a)
unify x y m = unify' (as_var $ subs m x) (as_var $ subs m y) m
unify' :: Unifiable a => Either Int a -> Either Int a -> Map Int a -> Maybe (Map Int a)
unify' (Left i) (Left j) m = if i == j then pure m else pure (Map.insert i (var j) m)
unify' (Left i) (Right x) m = if occurs i x then Nothing else pure (Map.insert i x m)
unify' (Right x) (Left i) m = if occurs i x then Nothing else pure (Map.insert i x m)
unify' (Right x) (Right y) m = overlap x y >>= (foldl (>>=) (pure m) . fmap (uncurry unify))
subs :: Unifiable a => Map Int a -> a -> a
subs m k = rewrite (\i -> Map.lookup i m >>= pure . subs m) k
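A couple of hand-checked examples of what to expect (my own, assuming the definitions above): a variable binds to a concrete type, and the occurs check rejects circular bindings.
-- unify (Var 0) Terminal Map.empty
--   = Just (Map.fromList [(0, Terminal)])
-- unify (Var 0) (Prod (Var 0) Terminal) Map.empty
--   = Nothing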
This is very much an implementation of μKanren. You treat the logic program as a stream of results.
type State a = (Map Int a, Int)
data Stream a b
  = Empty
  | Delay (Stream a b)
  | Filled (State a) b (Stream a b)
  | Suspend (Stream a b)
stream_head :: Stream a b -> Maybe (State a, b)
stream_head Empty = Nothing
stream_head (Delay tail) = stream_head tail
stream_head (Filled st x tail) = Just (st, x)
stream_head (Suspend s) = stream_head s
stream_list :: Stream a b -> [(State a, b)]
stream_list Empty = []
stream_list (Delay tail) = stream_list tail
stream_list (Filled st x tail) = (st, x):stream_list tail
stream_list (Suspend s) = stream_list s
The Suspend constructor differs from the usual μKanren implementation. I was not sure how to implement this feature properly, so I extended the definition of the stream to include it.
A suspended stream is empty if it appears in an empty context. These are meant to stack, such that we have layers of traps that determine what the context is in each case.
For now this is likely overconstrained and does not produce every proof. I suppose suspension would need De Bruijn levels to work properly.
The implementation wraps goals so that they can be put into a monad. This is done to make it convenient to write logic programs.
newtype Goal a b = Goal {enumerate :: State a -> Stream a b}
-- Enumeration of solutions
empty_state :: State a
empty_state = (Map.empty, 0)
any_solution :: Goal a b -> Maybe (State a, b)
any_solution g = stream_head (enumerate g empty_state)
all_solutions :: Goal a b -> [(State a, b)]
all_solutions g = stream_list (enumerate g empty_state)
-- It is nice for the REPL to display a few solutions for me.
instance (Show a, Show b) => Show (Goal a b) where
  show g = let some = fmap snd (take 20 (all_solutions g))
           in case some of
                [] -> "Nope"
                _ -> "Solutions: \n " ++ (foldr (\x y -> x ++ "\n " ++ y) "" $ fmap show some)
The following code implements the usual μKanren environment.
infixr 7 ≡
infixr 6 ∧
infixr 3 ∨
fresh :: Unifiable a => (a -> Goal a x) -> Goal a x
fresh f = Goal fresh' where
  fresh' (m,i) = enumerate (f (var i)) (m, i+1)

(≡), eq :: Unifiable a => a -> a -> Goal a ()
(≡) = eq
eq x y = Goal eq' where
  eq' (st,i) = case unify x y st of
    Just a -> Filled (a,i) () Empty
    Nothing -> Empty
false :: Goal a b
false = Goal (\st -> Empty)
mplus :: Stream a b -> Stream a b -> Stream a b
mplus Empty g = g
mplus (Delay f) g = mplus g f
mplus (Filled st x f) g = Filled st x (mplus f g)
mplus (Suspend f) (Suspend g) = Suspend (mplus g f)
mplus (Suspend f) g = mplus g (Suspend f)
bind :: Stream a t -> (t -> State a -> Stream a b) -> Stream a b
bind Empty g = Empty
bind (Delay f) g = Delay (bind f g)
bind (Filled st x f) g = mplus (g x st) (bind f g)
bind (Suspend s) g = Suspend (bind s g)
mutate :: (x -> y) -> Stream a x -> Stream a y
mutate fn Empty = Empty
mutate fn (Delay f) = Delay (mutate fn f)
mutate fn (Filled st x f) = Filled st (fn x) (mutate fn f)
mutate fn (Suspend s) = Suspend (mutate fn s)
(∧), conj :: Goal a x -> Goal a y -> Goal a (x,y)
(∧) = conj
conj f g = Goal impl' where
  impl' st = bind (enumerate f st) (comma (enumerate g))
  comma h x = mutate (\y -> (x,y)) . h

(∨), disj :: Goal a x -> Goal a x -> Goal a x
(∨) = disj
disj f g = Goal disj' where
  disj' st = mplus (enumerate f st) (enumerate g st)
Stream trapping is provided by the "suspend" and "trap" functions.
suspend :: Goal u a -> Goal u a
suspend g = Goal (\st -> Suspend (enumerate g st))
trapshutter :: Stream a b -> Stream a b
trapshutter Empty = Empty
trapshutter (Delay s) = Delay (trapshutter s)
trapshutter (Filled st x s) = Filled st x (trapshutter_open s)
trapshutter (Suspend s) = Empty
trapshutter_open :: Stream a b -> Stream a b
trapshutter_open Empty = Empty
trapshutter_open (Delay s) = Delay (trapshutter_open s)
trapshutter_open (Filled st x s) = Filled st x (trapshutter_open s)
trapshutter_open (Suspend s) = s
trap :: Goal u a -> Goal u a
trap g = Goal (\st -> trapshutter (enumerate g st))
The kreify function is used for retrieving results from the logic program. You get quite useful results this way.
-- Apply all substitutions to allow readout
kreify :: Unifiable a => a -> Goal a a
kreify x = Goal (\st -> Filled st (subs (fst st) x) Empty)
Delaying gives another stream a turn to produce values. Despite this feature, the proof search does not appear to produce a balanced stream of values.
-- Delay recursive expressions.
delay :: Stream a b -> Stream a b
delay Empty = Empty
delay (Delay s) = Delay (delay s)
delay (Filled st x s) = Filled st x (Delay (delay s))
delay (Suspend s) = Suspend (delay s)
zzz :: Goal a x -> Goal a x
zzz f = Goal (\st -> delay (enumerate f st))
Finally, here's some convenience to let us write logic programs a bit more easily in Haskell.
-- Provide fresh variable as output
freshvar :: Unifiable a => Goal a a
freshvar = fresh pure
instance Functor (Goal a) where
  fmap f x = Goal (mutate f . enumerate x)
instance Applicative (Goal a) where
  pure x = Goal (\st -> Filled st x Empty)
  (<*>) x y = Goal (\st -> bind (enumerate x st) impl') where
    impl' f = mutate f . enumerate y
instance Monad (Goal a) where
  (>>=) f g = Goal (\st -> bind (enumerate f st) (\y -> enumerate (g y)))
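As a quick sanity check of the whole environment, here's a tiny goal of my own (the name demo is hypothetical, not from the machine): it enumerates both alternative bindings of a fresh variable, so the Show instance ought to list Terminal and Some 0 as solutions.
-- My own sanity check; "demo" is a hypothetical name.
demo :: Goal Ty Ty
demo = do
  x <- freshvar
  (x ≡ Terminal) ∨ (x ≡ Some 0)
  kreify x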
The proof search structure can be derived from the checking routine, though there's one thing that needs to be taken into account.
If you convert the normal-form type checking into a search program, you will find that it gets stuck if you give it (a → a) → a. That type is the type of the fixed point, and you may guess what's happening there. The proof search engine finds the result f _, but it wants to fill the hole, so it fills it with f _, resulting in f (f _), and it keeps going without ever returning a result.
Some proof search engines have a search depth limit to fix this. It's a bit of an inconvenience, though, because the proof search must then take the depth limit as an argument.
Technically, the engine searching up to a depth isn't the issue. Consider the following example: (a → a) → a → a. Actually, let me show you the difference.
*CAM2> fix_test
Nope
*CAM2> nat_test
Solutions:
Neut (Snd Id)
Neut (Ap (Fst Id) (Neut (Snd Id)))
Neut (Ap (Fst Id) (Neut (Ap (Fst Id) (Neut (Snd Id)))))
Neut (Ap (Fst Id) (Neut (Ap (Fst Id) (Neut (Ap (Fst Id) (Neut (Snd Id)))))))
I'll show these programs later, though you can see that the second one produces results because there's a base case.
This is achieved by constraining the neutral form search. Before the procedure is allowed to search for f (_:a) : a, it must produce some x : a first. If the base case is available, it can fill the f _ as well.
Ok, here is the search program.
proofsearch :: Ty -> Ty -> Goal Ty Norm
proofsearch a b = search_norm [] a b

search_norm :: [Ty] -> Ty -> Ty -> Goal Ty Norm
search_norm ctx a b = (neut ∨ term ∨ zzz exp ∨ zzz prod) where
  term = do
    b ≡ Terminal
    atom_constraint_check ctx
    pure Const
  exp = do
    c <- freshvar
    d <- freshvar
    b ≡ Exp c d
    atom_constraint_check ctx
    x <- search_norm ctx (Prod a d) c
    pure (Abs x)
  prod = do
    c <- freshvar
    d <- freshvar
    b ≡ Prod c d
    atom_constraint_check ctx
    x <- search_norm ctx a c
    y <- search_norm ctx a d
    pure (Pair x y)
  neut = do
    b' <- kreify b
    atomic_term b'
    capture_recursives suspend ctx b'
    trap $ do
      trace <- prooftrace ctx a b'
      neut <- search_neut a (b':ctx) trace Id
      pure (Neut neut)
I likely should reimplement this with μKanren extended into a constraint solver. The atom_constraint_check is everywhere because we have to ensure those variables stay abstract.
To see how the suspension may be implemented wrong for now, think about how search_norm [x] produces different results than search_norm [] under multiple traps.
Before the trap is set up, we have to set the trapping variable. I guess there's a better way, but for now we trace a spine.
data ProofTrace = PT0 | PT1 | PTA Ty deriving (Show)

prooftrace :: [Ty] -> Ty -> Ty -> Goal Ty [ProofTrace]
prooftrace ctx a b = (atomic ∨ zzz prod0 ∨ zzz prod1 ∨ zzz ap) where
  atomic = do
    kreify a >>= atomic_term
    a ≡ b
    atom_constraint_check ctx
    pure []
  prod0 = do
    x <- freshvar
    y <- freshvar
    a ≡ Prod x y
    kreify b >>= atomic_term
    atom_constraint_check ctx
    rest <- prooftrace ctx x b
    pure (PT0:rest)
  prod1 = do
    x <- freshvar
    y <- freshvar
    a ≡ Prod x y
    kreify b >>= atomic_term
    atom_constraint_check ctx
    rest <- prooftrace ctx y b
    pure (PT1:rest)
  ap = do
    x <- freshvar
    y <- freshvar
    a ≡ Exp x y
    kreify b >>= atomic_term
    atom_constraint_check ctx
    rest <- prooftrace ctx x b
    pure (PTA y:rest)
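For instance, tracing this by hand with a = Prod (Exp (Some 0) (Some 0)) (Some 0) and b = Some 0, I'd expect spines roughly like these (up to variable bindings; my own reading, not a captured session):
-- [PT1]                -- use the second component of the pair
-- [PT0, PTA (Some 0)]  -- take the first component and apply it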
The search_neut function fills the spine up with the neutral values it requires.
search_neut :: Ty -> [Ty] -> [ProofTrace] -> Neut -> Goal Ty Neut
search_neut _ _ [] neut = pure neut
search_neut a ctx (PT0:tail) neut = search_neut a ctx tail (Fst neut)
search_neut a ctx (PT1:tail) neut = search_neut a ctx tail (Snd neut)
search_neut a ctx (PTA b:tail) neut = do
  norm <- search_norm ctx a b
  search_neut a ctx tail (Ap neut norm)
The capture_recursives function is what actually suspends the search when the goal type hits the context.
capture_recursives :: (Goal Ty () -> Goal Ty ()) -> [Ty] -> Ty -> Goal Ty ()
capture_recursives _ [] _ = pure ()
capture_recursives susp (x:xs) y = do
  x' <- kreify x
  atomic_term x'
  if x' == y
    then susp (pure ())
    else capture_recursives (suspend . susp) xs y
Proof search tends to get stuck if we hit branches with few or no results in them. This happens if we check a trivial constraint too late: the constraint eventually discards the whole branch, but we have already searched through it.
One such trivially pruning constraint is our requirement that we look for η-long normal forms: variables ending up on the right side of "Neut" must be constrained to stay variables.
atom_constraint_check :: [Ty] -> Goal Ty ()
atom_constraint_check [] = pure ()
atom_constraint_check (x:xs) = do
  kreify x >>= atomic_term
  atom_constraint_check xs

atomic_term :: Ty -> Goal Ty ()
atomic_term (Var _) = pure ()
atomic_term (Some _) = pure ()
atomic_term _ = false
There are a few things I could do to improve the proof search here, but the whole thing is just an ongoing experiment, a proof of concept, at this stage.
I was going to present how type inference is just a form of proof search. To do that you have to count occurrences of variables in the type.
occurrences :: Ty -> Map Int Int -> Map Int Int
occurrences Terminal m = m
occurrences (Var x) m = Map.alter increment x m
  where increment Nothing = Just 1
        increment (Just n) = Just (n+1)
occurrences (Some x) m = Map.alter increment (-x) m
  where increment Nothing = Just 1
        increment (Just n) = Just (n+1)
occurrences (Exp x y) m = occurrences x (occurrences y m)
occurrences (Prod x y) m = occurrences x (occurrences y m)

each_var_occurs_only_once :: Ty -> Bool
each_var_occurs_only_once t = Map.foldr f True (occurrences t Map.empty)
  where f count j = j && (count == 1)
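For example, a type that uses its variable twice gets rejected; tracing by hand (my own check):
-- occurrences (Exp (Var 0) (Var 0)) Map.empty
--   = Map.fromList [(0,2)]
-- each_var_occurs_only_once (Exp (Var 0) (Var 0)) = False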
If a variable occurs more than once, we discard the result. But we still have the problem that I described earlier, which means discarding a lot of results.
But it still produces some results that we can look at.
type_inference_demonstration :: Goal Ty (Ty, Norm)
type_inference_demonstration = do
  x <- freshvar
  prog <- proofsearch x x
  x' <- kreify x
  if each_var_occurs_only_once x'
    then pure (x', prog)
    else false
*CAM2> type_inference_demonstration
Solutions:
(Var 0,Neut Id)
(Terminal,Const)
(Exp (Var 5) (Var 8),Abs (Neut (Ap (Fst Id) (Neut (Snd Id)))))
(Prod (Var 5) (Var 6),Pair (Neut (Fst Id)) (Neut (Snd Id)))
Do these seem familiar? They're descriptions of identity natural transformations: natural transformations that form an identity morphism for a functor.
It was quite an interesting thing to find out. I used to think type inference was a weird thing, the opposite of proof search. Turns out it's not; it's just a simple case of proof search.
The next interesting thing to do with a proof search engine is to check whether the routines are right.
all_programs :: Goal Ty (Norm, Ty, Ty, Bool)
all_programs = do
  x <- freshvar
  y <- freshvar
  prog <- proofsearch x y
  x' <- kreify x
  y' <- kreify y
  pure (prog, x', y', check x' y' prog)
Like I said, the thing is not quite balanced, so it returns neutral terms only when the search happens to be entirely unconstrained.
*CAM2> all_programs
Solutions:
(Neut Id,Var 1,Var 1,True)
(Neut (Fst Id),Prod (Var 1) (Var 3),Var 1,True)
(Neut (Snd Id),Prod (Var 2) (Var 1),Var 1,True)
(Neut (Ap Id Const),Exp (Var 1) Terminal,Var 1,True)
(Neut (Ap Id (Abs (Neut (Snd Id)))),Exp (Var 1) (Exp (Var 4) (Var 4)),Var 1,True)
(Neut (Ap Id (Pair Const Const)),Exp (Var 1) (Prod Terminal Terminal),Var 1,True)
(Neut (Ap Id (Abs (Neut (Fst (Snd Id))))),Exp (Var 1) (Exp (Var 4) (Prod (Var 4) (Var 9))),Var 1,True)
(Neut (Ap Id (Pair Const (Abs (Neut (Snd Id))))),Exp (Var 1) (Prod Terminal (Exp (Var 6) (Var 6))),Var 1,True)
(Neut (Ap Id (Abs (Neut (Snd (Snd Id))))),Exp (Var 1) (Exp (Var 4) (Prod (Var 8) (Var 4))),Var 1,True)
Those neutral terms do seem to typecheck, though, as they should.
Once you have the proof search, a simple check is to search for the SKI combinators. The proof search is entirely broken if it cannot do this, though.
i_program :: Goal Ty Norm
i_program = proofsearch Terminal (Exp (Some 0) (Some 0))

k_program :: Goal Ty Norm
k_program = do
  let x = Terminal
      y = Exp (Exp (Some 0) (Some 1)) (Some 0)
  proofsearch x y

s_program :: Goal Ty Norm
s_program = do
  let a = Some 0
      b = Some 1
      c = Some 2
      x = Terminal
      y = Exp (Exp (Exp c a) (Exp b a)) (Exp (Exp c b) a)
  proofsearch x y
*CAM2> i_program
Solutions:
Abs (Neut (Snd Id))
*CAM2> k_program
Solutions:
Abs (Abs (Neut (Snd (Fst Id))))
*CAM2> s_program
Solutions:
Abs (Abs (Abs (Neut (Ap (Ap (Snd (Fst (Fst Id))) (Neut (Snd Id))) (Neut (Ap (Snd (Fst Id)) (Neut (Snd Id))))))))
Some types have unique programs. In those cases the proof could even be omitted.
swap_program :: Goal Ty Norm
swap_program = do
  let a = Some 0
      b = Some 1
      x = Prod a b
      y = Prod b a
  proofsearch x y

prod_programs :: Goal Ty Norm
prod_programs = do
  let a = Some 0
      x = Prod a a
      y = Prod a a
  proofsearch x y
*CAM2> swap_program
Solutions:
Pair (Neut (Snd Id)) (Neut (Fst Id))
*CAM2> prod_programs
Solutions:
Pair (Neut (Fst Id)) (Neut (Fst Id))
Pair (Neut (Fst Id)) (Neut (Snd Id))
Pair (Neut (Snd Id)) (Neut (Fst Id))
Pair (Neut (Snd Id)) (Neut (Snd Id))
There are 4 programs of the form (a*a) → (a*a), but only 1 program of the form a*b → b*a.
Of course, when we have a nice way to generate programs, we can generate programs that fit together, then run them.
composition_test :: Goal Ty (Ty, Ty, Norm, Norm)
composition_test = do
  x <- freshvar
  y <- freshvar
  z <- freshvar
  prog1 <- proofsearch x y
  prog2 <- proofsearch y z
  x' <- kreify x
  z' <- kreify z
  pure (x', z', prog1, prog2)

composition_test_run :: Goal Ty Norm
composition_test_run = do
  (x, y, prog1, prog2) <- composition_test
  pure (nnbe x y (meaning prog1 ++ meaning prog2))
*CAM2> composition_test
Solutions:
(Var 2,Var 2,Neut Id,Neut Id)
(Prod (Var 2) (Var 4),Var 2,Neut Id,Neut (Fst Id))
(Prod (Var 3) (Var 2),Var 2,Neut Id,Neut (Snd Id))
(Exp (Var 2) Terminal,Var 2,Neut Id,Neut (Ap Id Const))
(Exp (Var 2) (Exp (Var 5) (Var 5)),Var 2,Neut Id,Neut (Ap Id (Abs (Neut (Snd Id)))))
(Exp (Var 2) (Prod Terminal Terminal),Var 2,Neut Id,Neut (Ap Id (Pair Const Const)))
(Exp (Var 2) (Exp (Var 5) (Prod (Var 5) (Var 10))),Var 2,Neut Id,Neut (Ap Id (Abs (Neut (Fst (Snd Id))))))
*CAM2> composition_test_run
Solutions:
Neut Id
Neut (Fst Id)
Neut (Snd Id)
Neut (Ap Id Const)
Neut (Ap Id (Abs (Neut (Snd Id))))
Neut (Ap Id (Pair Const Const))
Neut (Ap Id (Abs (Neut (Fst (Snd Id)))))
Neut (Ap Id (Pair Const (Abs (Neut (Snd Id)))))
Here's the "nat" and "fix" program generators.
nat_test :: Goal Ty Norm
nat_test = do
  let x = Prod (Exp (Some 0) (Some 0)) (Some 0)
      y = Some 0
  proofsearch x y

fix_test :: Goal Ty Norm
fix_test = do
  let x = Exp (Some 0) (Some 0)
      y = Some 0
  proofsearch x y
Btw, if you added a fixed point to the language, there would be a trivial program for every type.
I think we aren't far away from having a programming language founded mainly on category theory. Two things are missing.
- We'd need some way to abstract over arithmetic operations, and it's not outright obvious how.
- Another is that we need a way to translate from denotational structures into machine datatypes and APIs (you can use linear logic to model interactions and mutable state).
Note that the logic programs written here were very difficult to construct; they usually get stuck if you don't write them just right. Still, the results are impressive enough that it was worthwhile.
You can download CAM2.hs and fiddle around with it.