Horn-clauses as imperative program representation
I have to write this down now, otherwise I might forget it while I am searching for more.
In spring I stumbled upon the use of
horn clauses as a representation of imperative programs.
The format is good for analysis and type inference so
I decided to use it in my virtual machine.
The translation ended up revealing use-before-def bugs in the program flow
and the runtime could work without a definition of null
.
To grasp the ideas, lets work through an example program.
Introduction through an example
Using horn clauses, the strategy for depth first search could be written like this:
visit : procedure
visited : set
children : function
dfs(a) <-
a in visited.
dfs(a) <-
a not in visited,
add(visited, a),
visit(a),
dfs_children(children(a)).
dfs_children(rest) <-
next(rest; child, rest`),
dfs(child),
dfs_children(rest`).
dfs_children(rest) <-
empty(rest;).
The same could be of course written more concisely in Python:
def dfs(a):
if a not in visited:
visited.add(a)
visit(a)
for child in a.children:
dfs(child)
The earlier horn-clause-program is clearly easier to evaluate and analyse, yet it may even be much easier to understand.
Each clause in the program stands as a conclusion for a procedure
and only one is accepted at a time.
For example, if we try dfs_children([a, b, c])
.
This is the result:
dfs_children([a, b, c]) <-
next([a, b, c]; a, [b, c]),
dfs(a),
dfs_children([b, c])
.
dfs_children([a, b, c]) <-
empty([a, b, c];)
Ø
Since the [a,b,c]
is clearly not an empty sequence, we have marked that
the second clause terminates there on this argument.
Likewise if the sequence was empty,
then we could not extract a value from it.
To execute, the program must be in a form that can be encoded as a choice-tree.
Here's an example of a choice tree for the dfs_children
:
dfs_children(rest):
if next(rest; child, rest`),
dfs(child),
dfs_children(rest`).
else empty(rest).
The next
and empty
form guard clauses and when used together
they must be complementary in the same context.
In this case, next(rest; ...)
is only possible
if empty(rest;)
is not possible.
This must hold for every pair of guards in the choice tree.
If you can straightforwardly define such a choice tree for the program, then the program is in the required format for execution.
Use of global variables in the examples is for clarity and if you find that as bad you can think of them as coeffects.
Analysis of the program
There are all sort of interesting things we can conclude from the program. For example, we can determine that it is going to terminate.
Calling dfs
expands into further calls of dfs
, but we can glance
through the program to acknowledge that dfs
does nothing
when it faces an element that has been visited:
dfs(a) <-
a in visited.
To terminate, every item that may be passed to dfs
must appear in
the visited set.
We can also conclude that dfs
is populating the visited
-set,
in the second clause:
dfs(a) <-
a not in visited,
add(visited, a),
visit(a),
dfs_children(children(a)).
From there we may conclude that visit(a)
runs exactly once for every
element that is passed to dfs(a)
,
because the addition of the item into the visited set blocks this
clause from repeating again with the same element.
The program runs for every element that is reachable through the children
.
The visited
set collects up a
transitive closure
for the original argument of dfs
.
The representation would appear to make it easier to reason about the behavior of the program, which seems to translate to the program being easier to understand in whole.
Relationship with mathematical logic
Curry-Howard correspondence states that there is a link between logic and a computational calculus. Proofs correspond to programs and propositions correspond to types.
We could think of programming as a construction of theorems and proofs for them. This is a powerful way to think of programming, and I will try to show why it is so, in a moment.
The correspondence is described as follows:
- Implication - functions
- Conjunctions - products
- Disjunctions - sums
- True formula - unit type that has only one value.
- False formula - bottom type that has no value.
So a sequent like this A ├ B
,
could be read that for every value of A
we can get something from B
.
Types in programs are constraints that prevent what the value can be in a variable. In computation values never exist without a type. For example, in a Python program every value is a Python object. Python object is a sum type that allow every value that can be represented with Python's primitive values.
To represent that something is a Python object,
we could describe it in logic like this: python_object(A)
Likewise we could represent
that an item in B
is either 0
or 1
by stating boolean(B)
.
Unit types correspond to tautologies, always true.
In terms of logic this could mean something like boolean(1)
or python_object("hello")
.
There is no possibility for anything to vary in the expression.
Bottom types correspond to constraints that cannot be satisfied.
natural(A) and A = A+1
.
If you know that A
is a natural number,
then its successor cannot be the number itself.
To understand what implication actually is,
we could ask ourselves,
why is "hello"
a python object?
python_string("hello") ∧ python_string(X) -> python_object(X)
─────────────────────────────────────────────────────────────
python_object("hello")
In Python, and many other languages "hello"
means for a string value.
Strings in python are also python objects,
therefore we have a trivial way to get a python object from any python string.
The opposite is not true, because all python objects are not python strings. Functions are supposed to produce something with every input that is provided. Partial functions do so too, but they have a choice to produce no information.
For example:
python_object(X) -> python_string(X) ∨ true
─────────────────────────────────────────────────
python_object("hello") -> python_string("hello")
python_object(5) -> true
To understand the distinction, the partial functions are proofs with loopholes. The function doesn't need to retrieve you a string in this case, which means that when the X is not a string, it may just retrieve you nothing.
Every Python function is a partial function, which guarantees to return an error message if they do not return you anything.
I've left out the universal and existential quantification because I find them difficult to understand myself. They are useful, but well we can get along without them for now.
Construction of programs
Natural deduction forms a framework that can be generally used to solve problems and verify choices. Also the system is somewhat layman understandable.
Things that ensure hold for : a ├ b ∧ a ├ c
───────────────────────────────────────────
Thing you want to hold : a ├ b,c
While you construct a proof like this, it encodes into a program:
thing_you_want_to_hold(a ├ b ∧ c) <-
things_that_ensure_hold_b(a ├ b)
things_that_ensure_hold_c(a ├ c)
A well-trained programmer can take up and do programming in an any environment, but even then there's use in such powerful representations of logic. It leaves up the effort to be done for the harder problems than those that have been already solved.
Fictional example: Robot and the Milk
I yet use a fictional example here that is not usual to do because usually when they appear then they are the only examples given. In this case it leaves the reader some freedom to not worry about details and let them to focus on the process of working with the concepts themselves and to show that they have all the skills to program a computer.
For a moment think that you have a very fancy home-assistant robot that knows how to do really complex tasks, but it's made by IBM and it's really evil and wicked but otherwise safe to use. It loves doing bad things and only things it loves doing more is obeying rules.
Next, lets say you'd want to have milk with your morning coffee every morning.
milk every morning for my coffee
You can set this as a constraint for the robot, but since it's an evil robot it will always buy IBM-branded Nestle milk powder instead of the real milk. When it cannot obtain milk, it will still aggressively pursue it. There are reports from iRobots harassing cattle in the morning news.
When you search for a solution in the web, you will find the script to program the robot to fetch milk periodically.
it is evening has money in the tab
───────────────────────────────────────
buy specific milk from the nearest shop
───────────────────────────────────────
milk every morning for my coffee
The program ensures that since the robot cannot be evil in the traditional ways, it won't attempt to loan money with terrible terms to buy milk. Program is bit more intricate of course as the money to buy milk must come from the tab and the tab must be described as a concept for the robot.
The problem in this program is that the robot always does it every evening. The shelves fill up from the milk and when they're full of expired bottles the robot starts to pile the new cartons to the warmest place in the room and they spoil overnight. You can change the period but you do not regularly consume milk. Also if the specific milk is not available or the brand in the shop changes, then you may be out of milk.
After a long time af thinking about the problem, you might end up with the following program for the robot:
milk_every_morning_for_my_coffee() <-
fridge has no milk or milk has expired,
has money in the tab,
buy milk from the nearest shop with money in the tab
but milk must be good_milk,
place milk to the frigde.
milk_every_morning_for_my_coffee() <-
do not pour good milk into the drain.
milk_every_morning_for_my_coffee() <-
otherwise wait.
good_milk(milk) <-
milk is pasteurized,
milk is from an animal,
milk is cold but not frozen,
milk costs between 0.5$ and 2$ per Litre,
milk has expire at least 7 days forward from today.
good_milk(milk) <-
otherwise it is not good.
I limit the thought-play here because I really ran out of the time I had reserved for logging today. If you'd like to continue from here, you may come up with a way that the robot would misinterpret the instructions and either not buy milk when it can be bought, or be evil otherwise. Also ask yourself is the above program sufficient?
One problem is that it goes to buy milk when a carton is empty, it can proceed with attempting to buy milk and then wait until the shop is open. It can detect that carton is empty at night, then go wait for shop to open at 2AM and then someone can steal him meanwhile. You may like to try think of how to prevent this.
If you follow through with this small example, well then you've been basically doing programming.
Conclusion
We've seen a "low-level" presentation of a program that is "higher-level" than the equivalent representation in one of the most popular programming languages today.