Explaining formal proofs
It has been a tiresome and boring week. Let's discuss formal proofs and verification. There are things in this subject I don't understand at all.
I was told that there's a tendency in academic computer science to declare programming, logic and types to be "the same thing" because some correspondences were observed once. The claim is that we confuse and conflate things when asserting that "programming = mathematics = logic".
Symbolic manipulation as computation
I was told that something very mathematical is lost if we view formal mathematics as computation, because "calculating" 1,000,000 * 1,000,000 by manipulation of syntactic objects makes no sense.
So let's say we multiply the left number by 10 and divide the right number by 10. It's a syntactic manipulation, as we only add or remove zero symbols in the expression. We can conclude this keeps the value the same because:
- a / a = 1
- 1 is the identity for multiplication, e.g. a * 1 = a
- The operation is multiplication.
Then we apply the operation:
1,000,000 * 1,000,000 ⊢ 10,000,000 * 100,000
10,000,000 * 100,000 ⊢ 100,000,000 * 10,000
100,000,000 * 10,000 ⊢ 1,000,000,000 * 1,000
1,000,000,000 * 1,000 ⊢ 10,000,000,000 * 100
10,000,000,000 * 100 ⊢ 100,000,000,000 * 10
100,000,000,000 * 10 ⊢ 1,000,000,000,000 * 1
1,000,000,000,000 * 1 ⊢ 1,000,000,000,000
But this is calculation by manipulation of syntactic objects, and therefore it supposedly makes no sense. We can extend this to cover counting with objects such as an abacus; after all, the beads on the board are taken as symbols standing for something, rather than being things we directly care about. Computers merely manipulate symbols as well.
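Out of curiosity, here is that zero-shifting done literally as string manipulation in Python. The function names are made up for this post, and the sketch only handles the special case where the right operand is a power of ten, which is all the example above needs.

def shift_zero(left, right):
    # "Multiply the left number by 10, divide the right by 10" as pure
    # symbol edits: append a '0' to one string, drop a '0' from the other.
    assert right.endswith("0")
    return left + "0", right[:-1]

def multiply_by_shuffling(left, right):
    # Repeat the shift until the right operand is the symbol "1";
    # the left string is then the product. No arithmetic is performed.
    while right != "1":
        left, right = shift_zero(left, right)
    return left

print(multiply_by_shuffling("1000000", "1000000"))  # 1000000000000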
As for losing something very mathematical, I don't agree: even if I took the equivalence between proofs and programs as a fact, I still wouldn't treat the whole of mathematics as being only computation, or as being only about computation.
I wrote this so you could recognize the strawman here, or even in your own thinking: a stupid strawman that counts cows by actually using cows to count. The Curry-Howard correspondence does not restrict you to thinking of things only in terms of computation.
Formal proofs
A formal proof is a finite sequence of formulas where each formula is either an axiom or follows from earlier formulas by a rule of inference.
You can mechanically check that such a proof is valid. You can also compose proofs together and simplify them.
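To make "mechanically check" concrete, here is a toy checker in Python for a tiny made-up system with one inference rule, modus ponens. The representation of formulas is invented for this sketch and is not any standard encoding.

def check_proof(axioms, steps):
    # A proof is a finite list of steps; each step is either an axiom or
    # an application of modus ponens to two earlier steps.
    # A formula is a string, or a pair (a, b) standing for "a implies b".
    proven = []
    for step in steps:
        if step[0] == "axiom":
            formula = step[1]
            ok = formula in axioms
        elif step[0] == "mp":
            # Modus ponens: from a and (a, b) proved earlier, conclude b.
            a, imp = proven[step[1]], proven[step[2]]
            ok = isinstance(imp, tuple) and imp[0] == a
            formula = imp[1] if ok else None
        else:
            ok = False
        if not ok:
            return False
        proven.append(formula)
    return True

axioms = ["it_rains", ("it_rains", "ground_is_wet")]
proof = [("axiom", "it_rains"),
         ("axiom", ("it_rains", "ground_is_wet")),
         ("mp", 0, 1)]
print(check_proof(axioms, proof))  # True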
Curry-Howard correspondence
If you take the above definition of a formal proof and compare it with the definition of machine code, you'll find that you get a similar finite sequence, where the instructions act as the inference rules and the formulas describe the machine state.
Any combination of instructions forms a program the machine will run, but we have ways to check validity similar to those we had for proofs. Using Hoare logic and separation logic you can apply this to reasoning about ordinary programs.
Say you want to add two numbers and multiply the result by 3, in RISC-V. You can take the equation:
c = (x + y)*3
Assuming we have 'x' and 'y', we want to obtain some value 'c' such that the equation is satisfied. We can state that as a triple:
{x, y} P {c = (x + y)*3}
To solve this we take some RISC-V instructions, each annotated with a precondition and a postcondition. I do not discuss the immediate field or address size limitations. The variables refer to numbers in modular arithmetic. Additionally I provide a discard operation that removes a variable; it's not present in the actual instruction set, but we keep it here for bookkeeping.
{x,y} add z, x, y {z=x+y}
{x,y} mul z, x, y {z=x*y}
{x} addi z, x, c {z=x+c}
{x} discard x {}
You can construct the following derivation to produce a program that satisfies the formulas we use:
x,y
────────── add w, x, y; discard x, y
w=x+y
────────── addi g, zero, 3
w=x+y, g=3
────────── mul z, w, g; discard g, w
z=(x+y)*3
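To convince yourself that the composed program add w, x, y; addi g, zero, 3; mul z, w, g really establishes the postcondition, you can also brute-force it. The Python sketch below is only an illustration: it models the three instructions in modular arithmetic with a made-up 8-bit word size, nothing RISC-V specific.

MOD = 2 ** 8  # made-up word size for the sketch

def run(x, y):
    # Registers as a dictionary; every instruction works modulo MOD,
    # standing in for the machine's modular arithmetic.
    r = {"zero": 0, "x": x, "y": y}
    r["w"] = (r["x"] + r["y"]) % MOD   # add  w, x, y
    r["g"] = (r["zero"] + 3) % MOD     # addi g, zero, 3
    r["z"] = (r["w"] * r["g"]) % MOD   # mul  z, w, g
    return r["z"]

# Test the postcondition z = (x + y)*3 over all 8-bit inputs.
assert all(run(x, y) == ((x + y) * 3) % MOD
           for x in range(MOD) for y in range(MOD))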
The formulas between the horizontal lines in the derivation above correspond to types. It may seem a bit weird: how can (x+y)*3 possibly refer to a type? Data types are collections of values, and formulas such as g=3 and z=(x+y)*3 describe collections just as well as they describe single values.
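One way to read this, sketched below in Python with names made up for this post: a formula denotes the collection of states that satisfy it, and a single concrete state is merely one member of that collection.

def satisfies_postcondition(state):
    # The formula z = (x + y)*3 read as a predicate over states.
    return state["z"] == (state["x"] + state["y"]) * 3

print(satisfies_postcondition({"x": 2, "y": 5, "z": 21}))  # True: inside the collection
print(satisfies_postcondition({"x": 2, "y": 5, "z": 20}))  # False: outside it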
You can construct multiple correspondences of this kind. Usually the correspondence is presented between intuitionistic logic and typed lambda calculus, but really it is all over the system, and you can see it once you study the subject.
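For a small taste of the usual presentation: a function that turns an A → B and a B → C into an A → C is, read through the correspondence, a proof that implication is transitive. The sketch below abuses Python type hints as a stand-in for a typed lambda calculus.

from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")

def chain(f: Callable[[A], B], g: Callable[[B], C]) -> Callable[[A], C]:
    # The proof term for (A -> B) -> (B -> C) -> (A -> C) is just composition.
    return lambda a: g(f(a))

print(chain(len, lambda n: n + 1)("abc"))  # 4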
Example of proving things
You can go and compute with proofs, but it doesn't mean that you're coerced into doing just computation. I'll try to provide an example.
When examining some simple things I could build examples from, I ended up with a somewhat less obvious question: does there exist a numerical base where you cannot determine whether a number is odd or even by examining only the lowest digit?
If such a base exists, we cannot take an arbitrary numerical base and conclude that
∀(B ∈ Nat, x ∈ Nat). even(x) = even(ldigit(x, B))
holds.
If our numbers are defined as a pair of a lowest digit and the higher digits:
x = ldigit(x, B) + B*(rdigit(x, B) or 0)
Next we take some rules applying to even:
even(x + y) = (even(x) ∧ even(y)) ∨ (¬even(x) ∧ ¬even(y))
even(x * y) = even(x) ∨ even(y)
It'd take some additional work to ensure these definitions match the Peano representation, and that the axioms for even are correct. For now I just assume they are.
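It costs little to at least spot-check the axioms against the ordinary definition of evenness before assuming them. The brute-force Python sketch below samples a small range, which is of course no substitute for actually proving them:

def even(n):
    return n % 2 == 0

for x in range(50):
    for y in range(50):
        # even(x + y) = (even(x) ∧ even(y)) ∨ (¬even(x) ∧ ¬even(y))
        assert even(x + y) == ((even(x) and even(y)) or (not even(x) and not even(y)))
        # even(x * y) = even(x) ∨ even(y)
        assert even(x * y) == (even(x) or even(y))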
Now we can do the following:
even(x) = even(ldigit(x, B))
───────────────────────────────────────────────────────────────
let L = ldigit(x, B)
let R = rdigit(x, B)
even(L + B*(R or 0)) = even(L)
───────────────────────────────────────────────────────────────
let L_e = even(L)
let R_e = even(B*(R or 0))
(L_e ∧ R_e) ∨ (¬L_e ∧ ¬R_e) = L_e
───────────────────────────────────────────────────────────────
let L_e = even(L)
let R_e = even(B) ∨ even(R or 0)
(L_e ∧ R_e) ∨ (¬L_e ∧ ¬R_e) = L_e
Next, consider the cases where even(L) is True and where it is False:
assume L_e = True
let R_e = even(B) ∨ even(R or 0)
R_e = True
assume L_e = False
let R_e = even(B) ∨ even(R or 0)
¬R_e = False
You can see that in both cases the equation reduces to requiring that the following is True (it could probably also have been deduced more directly):
even(B) ∨ even(R or 0)
From this we learn that the higher digits can be disregarded only when the base is even.
Now we could collect this proof and write a program out of it. If the program compiles, we know our proof is correct, but we likely wouldn't do anything with the program as such.
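For a quick sanity check on the conclusion itself (not the proof, and not a program extracted from it), you can brute-force small cases in Python: every even base lets the lowest digit determine parity, and every odd base has a counterexample.

def parity_visible_in_lowest_digit(base, limit=1000):
    # Does even(x) match even(lowest digit of x in this base) for all sampled x?
    return all((x % 2 == 0) == ((x % base) % 2 == 0) for x in range(limit))

for base in range(2, 11):
    print(base, parity_visible_in_lowest_digit(base))
# True for bases 2, 4, 6, 8, 10; False for 3, 5, 7, 9.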
Note that you should be aware of exactly what you've proven. We didn't prove anything about whether we can conclude the result by examining whether the lowest digit is not even. You should also note that the proofs and assumptions here themselves give cues about what you can prove about them in the first place.
Mid-conclusions
From the above you hopefully see that programs indeed are a lot like proofs in every sense. The correspondence is there, and you can't paper over it by claiming that "the Curry-Howard correspondence is taken too far".
As for the formulas and propositions, they seem to serve the following purposes:
- The propositions roughly encode what we know about the problem at hand.
- They are all the things that we know before we evaluate the program.
- They are the cues that we have available to solve the problem.
- They describe constraints that we have insisted must hold.
- They let you encode the goals that you strive to reach.
But why would you want this?
So given that programmatic presentations are so similar to logic, why should you be interested in formal proofs or formal verification? By now you should understand that you can still make very bad errors, such as typing a formula or an equation wrong, or solving the wrong problem in the first place. If you don't get anything more than what Python or assembly code provides you, why should you use it?
I can illustrate my motivations with a little story.
You've got an incredibly complex machine that juggles balls. By observing it for a while, you notice that certain balls always go through the same hoop.
"Let's depend on that and place a shutter on the hoop to discard some balls of a color that we don't like." Sometimes a ball doesn't go through the hoop, but it's the right color, so everything works and nobody cares.
You brew coffee with this ball-juggler.
Some day a ball doesn't go through the hoop and it's the wrong color. As a result there are balls all over the floor, and the coffee machine has been reduced to molten plastic flowing down from the table.
This is the Python experience. It's fun until you get bored of it.
With formal verification you don't need to assume that something always happens just because you've observed it to. Or at least you have a choice between doing that and keeping your marbles in the bag. Instead you can ask the system whether your observation is true, or describe in the first place what you want the program to do.
Often saying things like this is interpreted as if Python were a bad programming language that shouldn't be used. That's the wrong conclusion, because it ends up being used as a crutch for even shittier languages.
Even type theory as a whole has nothing to do with the arguments over the preference between static and dynamic type checking.
What you really want is for things to stop running well before it gets inconvenient to deal with the consequences of running the program. Plain static typing is likely too simple-minded to achieve this while retaining practicality, and dynamic typing cannot achieve it alone. Ideas around proof theory and proof search hold the keys to achieving it, though.