Review of "Simple Essence of Algebraic Subtyping"
MLsub has been picked up again: Lionel Parreaux wrote a paper, "The Simple Essence of Algebraic Subtyping", which I'm examining in this post. There's also a blog, Considerations on Codecrafting, running a blogpost series on subtype inference. Parreaux's paper is at [v1.8] at the time of this review.
Stephen Dolan's thesis
I talk a bit about Stephen Dolan's "Algebraic Subtyping" thesis first, to give some context.
- His thesis introduces type inference to an extensible subtyping system.
- You can introduce new types without turning previously typeable programs untypeable.
- The results are achieved by never relying on the non-existence of weird types, and by putting the "algebra before syntax".
"Find the simplest algebra of types, and some syntax for them," as the thesis reads.
Influence on my work
Back when Dolan published his thesis, I was impressed by the online demo and the thesis's introduction. MLsub's type system was a better match for how I was used to thinking about types at the time, when I was still writing Python.
I thought type inference would let me attach type information to the language and then use it to target platforms that are out of reach for a JIT-equipped interpreter, platforms such as graphics accelerators and microcontrollers.
While studying how to make the compilation work well and how to build a better implementation of my programming language, I learned more about logic programming and dependent type theory. This eventually made me realise that what I was building with the Lever programming language was not as useful as it could be. Realising this, I put my programming language project on ice and moved on. At that point my interest in MLsub waned.
Lionel Parreaux's paper
I appreciate the effort to produce a simpler subtyping inference algorithm. In this post I'm going to try to implement it in a different language and see if I understand how it works. The paper's language is Scala, and I'd like to see the implementation in Haskell.
I need to comment on a few things in the paper first.
Practical usefulness of extensibility
In section 2.1, "Background on Algebraic Subtyping", Parreaux mentions that the practical usefulness of extensibility is unclear. In a footnote he agrees that it simplifies the subtyping, but also says that practical programming languages already have extensible type systems by design.
I think the extensibility does matter. If you create a new type that combines properties of previous types, it's valuable to be certain that it doesn't mess up structures that are already there and unrelated to it. This is achieved by not assuming the non-existence of any potential type, which is a good principle in such a subtyped environment anyway.
Subtyping effects on correctness
Section 2.3.3, "Typing Surprises", cued me to try something, and I found out this:
let pum = fun x -> x x
let kaput = pum pum
val pum: 'a ∧ ('a -> 'b) -> 'b
val kaput: ⊥
An algebraically subtyped lambda calculus would seem to correspond to an inconsistent logic.
If we're thinking about what the types mean here:
- ⊤ is a type that can be constructed from anything, because it is discarded and not used at all.
- ⊥ is a type that cannot be constructed, because it can be used in every way.
The above means that we can interpret a fixed-point combinator as a ⊥, and therefore we've found a contradiction in the corresponding logic.
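Concretely, kaput = pum pum is the classic diverging term Ω: it β-reduces only to itself and never to a value, which is how a term manages to claim the uninhabitable type.
Ω = (λx. x x) (λx. x x)  →  (λx. x x) (λx. x x)  →  …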
This should help in understanding what you are working with here. If you need termination proofs or proof content, I believe there are workarounds for obtaining them.
Implementation in Haskell
{-# LANGUAGE TypeFamilies,
TypeSynonymInstances,
ScopedTypeVariables,
DeriveFunctor,
DeriveTraversable #-}
module SimpleSub where
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)
I'm implementing the terms with De Bruijn indexing.
data Term
= Lit Integer
| Var Integer
| Lam Term
| App Term Term
| Rcd [(String, Term)]
| Sel Term String
| Let { isRec :: Bool
, rhs :: Term
, body :: Term }
deriving (Show)
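For example (these names are mine, for illustration), the identity function and the K combinator are written with indices that count enclosing lambdas:
-- λx.x : Var 0 refers to the nearest enclosing Lam
identity :: Term
identity = Lam (Var 0)
-- λx.λy.x : Var 1 skips one binder to reach x
constTerm :: Term
constTerm = Lam (Lam (Var 1))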
To implement the rest, I need to create a monad that captures memoization. The constraining function in the algorithm doesn't terminate without it: constraining a variable against a bound that mentions that same variable would recurse forever. While I'm at it, let's use this monad to track state variables and catch errors from the computation.
class Monad m => MuscularMonad m where
type PrimState m
liftST :: ST (PrimState m) a -> m a
memoize :: (SimpleType m -> SimpleType m -> m ())
-> SimpleType m -> SimpleType m -> m ()
err :: Error m -> m z
There are two kinds of errors this system can produce. (BareError derives Traversable so that the ST-bound types inside an error can later be coalesced into user-facing types.)
type Error m = BareError (SimpleType m)
data BareError ty
= MissingField String
| ConstraintViolation ty ty
deriving (Functor, Traversable, Foldable, Show)
This resembles a technique I remember, one that separated constraint generation from constraint resolution. It was described in Bastiaan Heeren's paper "Generalizing Hindley-Milner Type Inference Algorithms" [pdf]. Here, though, we're separating simplification from constraint propagation instead. The first part propagates the constraints.
type SimpleType m = SimpleTypeST (PrimState m)
data SimpleTypeST s
= Variable (STRef s (VariableStateST s))
| Primitive String
| Function (SimpleTypeST s) (SimpleTypeST s)
| Record [(String, SimpleTypeST s)]
deriving (Eq)
type VariableState m = VariableStateST (PrimState m)
type VariableStateST s = ([SimpleTypeST s], [SimpleTypeST s])
type Ctx m = [SimpleType m]
The weird-looking type aliases are there for convenience. There's a moment when the monad needs to be "run", which requires that we "detach" the structure from that monad.
Next there's the basic type inference. This is precisely what you would expect in a Hindley-Milner implementation as well.
typeTerm :: (MuscularMonad m)
=> Ctx m -> Term
-> m (SimpleType m)
typeTerm ctx (Lit n) = pure (Primitive "int")
typeTerm ctx (Var i) = pure (ctx !! i)
typeTerm ctx (Rcd fs) = Record <$> mapM (typeRecordTerm ctx) fs
typeTerm ctx (Lam body) = do
a <- freshVar
Function a <$> typeTerm (a:ctx) body
typeTerm ctx (App f x) = do
a <- freshVar
t' <- typeTerm ctx f
u' <- Function <$> typeTerm ctx x <*> pure a
constrain t' u'
pure a
typeTerm ctx (Sel term name) = do
a <- freshVar
t' <- typeTerm ctx term
constrain t' (Record [(name, a)])
pure a
typeRecordTerm :: (MuscularMonad m)
=> Ctx m -> (t, Term) -> m (t, SimpleType m)
typeRecordTerm ctx (name, term)
= fmap (\t -> (name, t)) (typeTerm ctx term)
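Note that typeTerm above leaves the Let constructor unhandled. Here's a minimal monomorphic sketch of how it could go (my own addition, wired into typeTerm as typeTerm ctx (Let r rhs body) = typeLet ctx r rhs body; the paper instead generalizes let-bound types to obtain let-polymorphism):
typeLet :: (MuscularMonad m)
        => Ctx m -> Bool -> Term -> Term
        -> m (SimpleType m)
typeLet ctx False rhs body = do
    -- non-recursive let: type the right-hand side in the outer context
    t <- typeTerm ctx rhs
    typeTerm (t:ctx) body
typeLet ctx True rhs body = do
    -- recursive let: the rhs may refer to itself through a fresh variable
    a <- freshVar
    t <- typeTerm (a:ctx) rhs
    constrain t a
    typeTerm (a:ctx) body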
Constraining is a memoized function. It produces a constraint that ensures the thing on the left side passes as the thing on the right side, i.e. that the left side is a subtype of the right side.
constrain :: (MuscularMonad m)
=> SimpleType m -> SimpleType m
-> m ()
constrain = memoize constrain'
where
constrain' (Primitive n0) (Primitive n1) | (n0 == n1) = pure ()
constrain' (Function l0 r0) (Function l1 r1)
= constrain l1 l0 >> constrain r0 r1
constrain' (Record fs0) (Record fs1)
= mapM_ (constrainRecord fs0) fs1
constrain' (Variable ref) rhs = do
(lob, upb) <- liftST (readSTRef ref)
mapM_ (\a -> constrain a rhs) lob
liftST (writeSTRef ref (lob, rhs : upb))
constrain' lhs (Variable ref) = do
(lob, upb) <- liftST (readSTRef ref)
mapM_ (\a -> constrain lhs a) upb
liftST (writeSTRef ref (lhs : lob, upb))
constrain' x y = err (ConstraintViolation x y)
constrainRecord :: (MuscularMonad m)
=> [(String, SimpleType m)] -> (String, SimpleType m)
-> m ()
constrainRecord fs0 (n1, t1) = do
case lookup n1 fs0 of
Nothing -> err (MissingField n1)
Just t0 -> constrain t0 t1
freshVar :: MuscularMonad m => m (SimpleType m)
freshVar = liftST (Variable <$> newSTRef ([], []))
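To see the variance rules at work, here's a hypothetical little demonstration (demoConstrain is my own name, not from the paper):
demoConstrain :: (MuscularMonad m) => m ()
demoConstrain = do
    a <- freshVar
    b <- freshVar
    -- ('a -> 'a) <: (int -> 'b) splits into int <: 'a (contravariant
    -- argument) and 'a <: 'b (covariant result):
    constrain (Function a a) (Function (Primitive "int") b)
    -- afterwards "int" is a lower bound of 'a and 'b an upper bound
    -- of 'a; the lower bound propagates, so "int" also bounds 'b below.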
To use these things, we need to flex some muscle and get ourselves an instance of the monad we described. There's plenty of type trickery here that perhaps would have been nicer with monad transformers, though the recursion through SimpleType (Muscle s) means we need a newtype to tie the knot either way.
newtype Muscle s a = Muscle
((STRef s [(SimpleType (Muscle s), SimpleType (Muscle s))])
-> (ST s (Either (Error (Muscle s)) a)))
unmusc :: Muscle s a
-> ((STRef s [(SimpleType (Muscle s), SimpleType (Muscle s))])
-> ST s (Either (Error (Muscle s)) a))
unmusc (Muscle x) = x
instance Functor (Muscle s) where
fmap f (Muscle x) = Muscle (fmap (fmap (fmap f)) x)
instance Applicative (Muscle s) where
pure x = Muscle (pure (pure (pure x)))
Muscle f <*> Muscle x = Muscle ((fmap ((<*>) . fmap (<*>)) f) <*> x)
instance Monad (Muscle s) where
Muscle m >>= f = Muscle (\q -> m q >>= crunch (\x -> (unmusc . f) x q))
where
crunch :: (t -> ST s (Either a b)) -> Either a t -> ST s (Either a b)
crunch f (Left a) = pure (Left a)
crunch f (Right a) = f a
instance MuscularMonad (Muscle s) where
type PrimState (Muscle s) = s
liftST st = Muscle (\q -> fmap pure st)
memoize fn x y = Muscle (\q -> do
qv <- readSTRef q
if elem (x,y) qv
then pure (pure ())
else do
writeSTRef q ((x,y):qv)
unmusc (fn x y) q)
err msg = Muscle (pure (pure (Left msg)))
runMusc :: Muscle s a
-> [(SimpleType (Muscle s), SimpleType (Muscle s))]
-> ST s (Either (BareError (SimpleTypeST s)) a)
runMusc m x = newSTRef x >>= unmusc m
Our memoization function here isn't particularly excellent: it updates the ST variable and compares every structure pair for equality, scanning the whole list each time. It should be sufficient, though. Now that we've got some muscle, it's time to look at the user-facing type representation.
User-facing representations of types
The process of retrieving the type seems a bit taxing.
Although the author tested his code and ran an experimental evaluation with millions of randomly generated expressions, I'm a bit concerned: is the use of the 'recursive' map correct here, and in the paper's algorithm?
The complexity of MLsub's implementation may have been part of what motivated me to look into formal verification. Those difficult parts seem to have been distilled down here.
data Type
= Top
| Bot
| Meet Type Type
| Join Type Type
| FunctionType Type Type
| RecordType [(String,Type)]
| RecursiveType Int Type -- No longer De Bruijn indices
| TypeVariable Int -- Just ordinary labels
| PrimitiveType String
deriving (Show)
type PolarVariable s = (STRef s (VariableStateST s), Bool)
coalesceType :: SimpleTypeST s -> ST s Type
coalesceType (ty :: SimpleTypeST s) = do
lazyVariables <- newSTRef []
recursive <- newSTRef []
nextName <- newSTRef 0
let freshName :: ST s Int
freshName = do
nn <- readSTRef nextName
writeSTRef nextName (nn+1)
pure nn
checkRecursive :: PolarVariable s -> ST s (Maybe Int)
checkRecursive vs_pol = do
r <- readSTRef recursive
pure (lookup vs_pol r)
checkAndUpdateRecursive :: PolarVariable s -> ST s Int
checkAndUpdateRecursive vs_pol = do
r <- readSTRef recursive
case lookup vs_pol r of
Nothing -> do
i <- freshName
writeSTRef recursive ((vs_pol,i):r)
pure i
Just a -> pure a
-- This is the part that I missed at first.
getVariable :: STRef s (VariableStateST s) -> ST s Type
getVariable vs = do
v <- readSTRef lazyVariables
case lookup vs v of
Nothing -> do
t <- TypeVariable <$> freshName
writeSTRef lazyVariables ((vs,t):v)
pure t
Just t -> pure t
go :: SimpleTypeST s -> Bool -> [PolarVariable s] -> ST s Type
go (Primitive n) _ _ = pure (PrimitiveType n)
go (Function l r) polar inProcess = FunctionType
<$> go l (not polar) inProcess
<*> go r polar inProcess
go (Record fs) polar inProcess = do
RecordType <$> mapM (\(n,t) -> do
t' <- go t polar inProcess
pure (n,t')) fs
go (Variable vs) polar inProcess = do
let vs_pol = (vs, polar)
if elem vs_pol inProcess
then do
TypeVariable <$> checkAndUpdateRecursive vs_pol
else do
(lb,ub) <- readSTRef vs
let bounds = if polar then lb else ub
boundTypes <- mapM
(\m -> go m polar (vs_pol:inProcess))
bounds
let mrg = if polar then Join else Meet
res <- (\t -> foldl mrg t boundTypes) <$> getVariable vs
rc <- checkRecursive vs_pol
case rc of
Nothing -> do
pure res
Just a -> do
pure (RecursiveType a res)
go ty True []
Since the ST type variable needs to be lugged around, I had to enable ScopedTypeVariables to get that variable up there. The annotations on those let-bound helpers make the whole program a bit easier to read.
OK, that should do it. Finally, here's the function to run the full algorithm.
runTypeTerm :: Term -> Either (BareError Type) Type
runTypeTerm term = runST $ do
result <- runMusc (typeTerm [] term) []
case result of
Left e -> Left <$> mapM coalesceType e
Right t -> Right <$> coalesceType t
Let's try it and see if we succeeded in implementing this
Parreaux demonstrates how his algorithm works with the Church numeral 2. It's a nice thing to try, so let's do the same here.
λf.λx.f (f x)
I convert it into the corresponding structure.
church_numeral_2 :: Term
church_numeral_2 = Lam (Lam (App (Var 1) (App (Var 1) (Var 0))))
I get some result.
> runTypeTerm church_numeral_2
Right (FunctionType
(Meet (FunctionType (TypeVariable 0) (TypeVariable 1))
(FunctionType (TypeVariable 2) (TypeVariable 3)))
(FunctionType (TypeVariable 4) (TypeVariable 5)))
2020-08-11: Corrected algorithm run:
> runTypeTerm church_numeral_2
Right (FunctionType
(Meet (Meet (TypeVariable 3)
(FunctionType (TypeVariable 0) (TypeVariable 1)))
(FunctionType (TypeVariable 2) (TypeVariable 0)))
(FunctionType (TypeVariable 2) (TypeVariable 1)))
I was surprised by the duplicates at first, but this seems to be close to the result in the paper. The type is further compacted by observing where the variables appear in the structure; the paper tells how it's done but does not provide the code.
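As a rough illustration of that observation, here's my own sketch (not the paper's algorithm) of its first step: finding the variables that occur in only one polarity. Such variables never constrain anything, so a simplifier may erase them. Labels introduced by RecursiveType are binders rather than real variables and are excluded:
import qualified Data.Set as Set -- would go at the top of the module

-- (positive occurrences, negative occurrences) of type variables;
-- the Bool argument is the polarity of the position we are in.
polarVars :: Bool -> Type -> (Set.Set Int, Set.Set Int)
polarVars polar ty = case ty of
    TypeVariable v -> if polar then (Set.singleton v, mempty)
                               else (mempty, Set.singleton v)
    FunctionType l r -> polarVars (not polar) l <> polarVars polar r
    RecordType fs -> foldMap (polarVars polar . snd) fs
    Meet a b -> polarVars polar a <> polarVars polar b
    Join a b -> polarVars polar a <> polarVars polar b
    RecursiveType _ b -> polarVars polar b
    _ -> mempty

-- Labels bound by RecursiveType; these must be kept.
recLabels :: Type -> Set.Set Int
recLabels (RecursiveType n b) = Set.insert n (recLabels b)
recLabels (FunctionType l r) = recLabels l <> recLabels r
recLabels (RecordType fs) = foldMap (recLabels . snd) fs
recLabels (Meet a b) = recLabels a <> recLabels b
recLabels (Join a b) = recLabels a <> recLabels b
recLabels _ = mempty

-- Variables that a simplifier could erase from a coalesced type.
removable :: Type -> Set.Set Int
removable ty = (onlyPos <> onlyNeg) `Set.difference` recLabels ty
  where (pos, neg) = polarVars True ty
        onlyPos = pos `Set.difference` neg
        onlyNeg = neg `Set.difference` pos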
I tried yet another one.
> runTypeTerm (Lam (App (Var 0) (Var 0)))
Right (FunctionType
(FunctionType (TypeVariable 0) (TypeVariable 1))
(TypeVariable 2))
2020-08-11: Corrected algorithm run:
> runTypeTerm (Lam (App (Var 0) (Var 0)))
Right (FunctionType
(Meet (TypeVariable 0)
(FunctionType (TypeVariable 0) (TypeVariable 1)))
(TypeVariable 1))
I'm a bit confused by that result; I'm not sure whether my implementation is correct. (Note, though, that the corrected result matches the type inferred for pum earlier: 'a ∧ ('a -> 'b) -> 'b.)
The full program & conclusions
You can download the whole source code here: SimpleSub.hs
A whole lot of state juggling is present in the Scala implementation. It may be short code, but it holds logical complexity similar to what was present in Stephen Dolan's original algorithm. I remember a similarly recursive implementation with a lot of mutable references.
2020-08-10: Parreaux answered that he liked the review. He told me that this:
res <- do
if length boundTypes == 0
then TypeVariable <$> freshName
else pure (foldl1 mrg boundTypes)
Should likely be instead:
res <- (\t -> foldl mrg t boundTypes) <$> getVariable vs
This is the corresponding line in the Scala implementation. I've applied the fix to the implementation above and re-run the commands.
I had thoughtlessly changed the program there without understanding it well enough. Parreaux explained to me that this detail connects negative and positive occurrences of variables, which means the additional variable needs to be folded into the result. You can see the effect in the corrected runs above: without the fix, the result-side variables were disconnected from the argument-side ones.