Grow past the dynamic/static typing debate

Many of us have witnessed a programmer spinning on his office chair like a compass needle searching for a cardinal point, before he posts yet another static vs. dynamic typing post online.

Having looked at proof theory, the foundations of formal reasoning and computing, and how they interact, it has become apparent to me how corny the whole subject is, and how the lip and finger motion would be better spent on something else entirely than debating the differences between static and dynamic type systems.

If we were to think of language users as tribes, dynamic typers would refuse to understand proof theory and rely only on informal reasoning, whereas static typers would never take their understanding of proof theory far enough for it to make any difference. Most programmers simply aren't capable of writing correct or reliable software with the tools they tend to grab from their directories.

Dynamic and static typing differ in the computational model and in seemingly important semantics of the programming language. The problem is that the validity of a program, that is, whether it compiles, only verifies that it fits into the computational model of the language. If we did things right, this wouldn't be as interesting as it's treated.

Computational models and the necessity of types

Every computational model can be illustrated as a single rewriting rule. Evaluation applies the rewriting rule to an input until the rule can no longer be applied.

Due to the finite nature of computing, the computational model must describe a type for the expression being rewritten by the machine. Validity in this sense means that the expression satisfies that type.

For example, we can examine the Turing machine this way. You get the input state and the transition table. You must describe which symbols the machine can write before you can actually create it. In practice very simple choices, such as '0' and '1', are enough to get yourself a Turing machine.

Here's a short textual illustration of how it goes. Let's evaluate a program that writes 0,1,0,1,... to the tape. First we have to describe the transition table. Let's describe it in something that resembles early BASIC.

1. PUT 0, RIGHT
2. PUT 1, RIGHT, GOTO 1

Next we can pretend to feed this thing an empty tape, and see it build the string.

1. 00000000
---*------- 1. PUT 0, RIGHT
2. 00000000
----*------ 2. PUT 1, RIGHT, GOTO 1
1. 01000000
-----*----- 1. PUT 0, RIGHT
2. 01000000
------*---- 2. PUT 1, RIGHT, GOTO 1
1. 01010000
-------^--- ...

We apply the rewriting rule for the Turing tape; it's the same rule every time, but for clarity I display information from the transition table to show what is happening. The star is the read-write head of the Turing machine's tape. The computation doesn't finish, so I've marked that at the bottom.
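For the curious, here is a minimal sketch of this particular machine in C, assuming a fixed 8-cell tape and the two-rule table above; it is an illustration of the trace, not a general Turing machine simulator:

#include <stdio.h>

int main(void) {
    char tape[] = "00000000";   /* a short, fixed piece of the infinite tape */
    int head = 0;               /* the read-write head, the 'star' above */
    int state = 1;              /* start from rule 1 */
    for (int step = 0; step < 8; step++) {
        if (state == 1) {       /* 1. PUT 0, RIGHT */
            tape[head++] = '0';
            state = 2;
        } else {                /* 2. PUT 1, RIGHT, GOTO 1 */
            tape[head++] = '1';
            state = 1;
        }
        printf("%d. %s\n", state, tape);
    }
    return 0;                   /* cut off here; the real machine never halts */
}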

It turns out that if a program terminates, it produces a formal proof. You can verify this by examining the definition of a formal proof. In this way you can consider computation a form of proof search. It's also possible to examine computation as proof normalisation, but I find it useful to build this post on the former basis.

The input to the Turing machine is clearly typed. Let's describe the type of that input. You can check ahead of time that you don't feed in an invalid state, that the tape contains only the allowed symbols, and that the transition table contains correct state transitions. If the input satisfies these conditions, it is valid.

From the computational model, this validity extends into programs. It extends in any case; dynamic and static typing in programming languages only differ in how deep this concept of validity extends.

To understand what I mean, let's re-examine the behaviour of division in programming languages. Independent of the language, the expression a/b may fail due to several issues:

a/b, where a and b have mismatching types
a/b, where a and b match in type, but division has not been defined for that type
a/0, where division by zero has not been defined.

A dynamically typed language does not extend the validity of a program to include the idea that types have to match. It can fail due to any of these issues.

A statically typed language extends the validity of a program to cover the type match/mismatch. It can still fail due to the last issue.
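A minimal C sketch (C standing in for the statically typed side) shows where the line is drawn: the commented-out line is the first failure mode and the compiler rejects it, while the division by zero compiles fine and only blows up when run.

#include <stdio.h>

int main(void) {
    int a = 10, b = 0;
    /* int c = a / "two";      type mismatch: rejected at compile time */
    int c = a / b;          /* division by zero: accepted, undefined at run time */
    printf("%d\n", c);
    return 0;
}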

There was a reason, which I've since forgotten, that prevented easy typechecking of the division-by-zero condition in typed languages.

The deeper you manage to extend the validity of the computational model into the program, the more computation you can fold away from the program.

This is the source of the static/dynamic typing debate, but it shouldn't be a feature that you fix down in a language in the first place.

For the computational model to be universal, it cannot guarantee termination or predictable answers. We already know we can't win there, but we don't need to.

You should plan things out a bit better

Currently, the choice of computational model among programmers is an arbitrary tradeoff between evaluation efficiency and some other hard-to-determine properties, left as a question of taste. Our languages model machines that are a bit like the Turing tape, and we use them to pretend that we're computing directly on those machines.

I think it's a bit silly. If you treat every type at the last possible moment, dynamically typing, you're wasting computing resources. But you're still wasting computing resources, and your own patience, if you treat machine types as abstract mathematical types, statically typing. And no matter what you do, the program can still produce a result you didn't expect.

The results can be unexpected because the program may describe a computation that is very different from the description of what we expect to happen. These can be mathematically checked to correspond, but we need a bit of tooling first.

It starts from basic reasoning and planning. I assume this is not unfamiliar to a reader who has gotten this far, because I know chickens can do some reasoning and planning, and sometimes people can do it too. The planning usually produces a proof that is valid when every rule you've used is applicable.

For example, if you were baking a pancake, you might consider two plans:

3 eggs ∧ 250g flour ∧ 1/4tsp salt
───────────────────────────────── sieve flours into bowl
bowl_a                            and mix with egg contents

                0.5L milk  
         ─────────────────────── halve
bowl_a ∧ 0.25L milk ∧ 0.25L milk
──────────────────────────────── pour in and whisk briskly
bowl_b              ∧ 0.25L milk
──────────────────────────────── pour rest in until the batter is smooth
bowl_c
────── pour to baking plate    
plate
─────── bake at 225 - 250 °C, 20-25 minutes until puffed and golden brown
pancake

This is the other plan.

─────── Wish for a pancake from the Finnish pie god
pancake

If you have an oven, a sieve, and a bowl, then the first plan is valid. The other plan likely doesn't work, for obvious reasons.

These are close to formal proofs, though since they are a recipe, they define what a pancake is. Still, it's predictable that you get a pancake if you follow the instructions. With some tools we can also describe algorithms and verify their validity using such definitions, and then derive programs from those definitions as well.

The main shortcoming in our pancake example is that the rewriting rules in the proofs are informal. The rules depend on what kind of system you select to represent your logic. Milk disappears into the flour in the example I showed; such interactions can be described formally by linear logic.
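As a rough sketch, using linear-logic notation (⊗ for "both, as resources" and ⊸ for "consume the left to produce the right"), the whisking step could be written as:

bowl_a ⊗ 0.25L milk ⊸ bowl_b

Here the bowl contents and the quarter litre of milk are consumed by the rule and are no longer available to later steps, which is exactly the milk-disappearing behaviour the informal proof relies on.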

We have such formal systems, so why aren't they already in widespread use? Lots of people are still chiseling programs directly for some specific computational model they have. I guess it's because they've never been taught to do formal reasoning, although they have to learn informal reasoning anyway.

Sometimes the current tools for formal reasoning aren't flexible enough to clearly express what you want. That is still not a good enough reason to simply throw the whole thing away.

Notes about notation

The notation is slightly informal. To really get the most out of it, the formal proofs and proving should interplay with the informal reasoning you do naturally. The formality is needed to make clean rewriting rules and, eventually, to make the whole thing work mechanically.

The notation I've used here resembles open deduction rules in deep inference and natural deduction. Deep inference differs from natural deduction in that the rewriting rules can be applied deep inside the structures. I likely use (∧) once in a way that might not form a valid rule in practice. I've tried to keep things simple, because the exact notation is not very important at this early stage.

The semantics of the propositions are important to understand, as is what the horizontal line means. From the pancake example the meaning of the line should be clear: we apply a rule to move from one proposition to another.

Consider an expression like a ∧ b. It's a proposition that is proven if both a and b are proven. In open deduction we can split the proof apart at that point and prove each part separately.

⎛...           ⎞   ⎛...           ⎞
⎜─── proof of a⎟ ∧ ⎜─── proof of b⎟
⎝ a            ⎠   ⎝ b            ⎠

Sometimes specific occurrences of propositions in the proof matter; in that case they are marked with a subscript. For now it should be fairly obvious how the proof builds up, so I haven't used subscripts.

Computational example: greatest common divisor

To illustrate how all of this would operate in an actual example, I can show you the Euclidean algorithm for computing the greatest common divisor, written in C. It's often used as an example, so it shouldn't be very unfamiliar.

int gcd(int a, int b) {
    while (b != 0) {
        int t = b;      /* remember the old b */
        b = a % b;      /* replace b with the remainder */
        a = t;          /* the old b becomes the new a */
    }
    return a;
}

You can note that the program has been overconstrained in order to make it fit into the computational model that the C language provides. The greatest common divisor is defined on more things than just the machine's modulo-N integers. For now I just leave the idea of integers entirely out of these proofs.

The definition of the gcd is in the name: it's the largest divisor that two numbers share. Therefore we can model it as the following proof:

             a ∧ b 
────────────────────────────── gcd(a, b)
max(divisors(a) ∩ divisors(b))

This defines the gcd entirely. To apply the rule, we need two propositions, a and b, to apply it to. These play the same role as the variables in the first program. In this context the proposition a is true if we can obtain such a value somehow, and false if we cannot.

The whole thing already defines a program as well. If we think of how the lower expression builds up, it consists of the following proofs, or programs:

     a
─────────── take divisors of a
divisors(a)

a ∧ b
───── intersect sets a, b
a ∩ b 

  a
────── take largest value from the set a
max(a)

The divisors of a number form a set, so we can use set notation to describe the common divisors, and then retrieve the largest number among them.
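As a rough sketch, back on positive machine integers and with a hypothetical name, the definitional reading can be transcribed into C just to see how the three small proofs compose:

#include <stdio.h>

/* A sketch of the definitional gcd for positive a and b: walk the candidate
 * divisors, keep those shared by both (divisors(a) ∩ divisors(b)), and
 * remember the largest one seen (max). */
int gcd_by_definition(int a, int b) {
    int best = 1;
    for (int d = 1; d <= a && d <= b; d++)
        if (a % d == 0 && b % d == 0)
            best = d;
    return best;
}

int main(void) {
    printf("%d\n", gcd_by_definition(12, 8));   /* prints 4 */
    return 0;
}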

If we examine the gcd program as a proof, we get the following:

a ∧ b ∧ b=0
─────────── gcd(a, b)
a

a ∧ b ∧ b≠0 ∧ c=gcd(b, a%b)
─────────────────────────── gcd(a, b)
c

The program forms through mathematical induction. We have a base case and an induction rule. Applying the induction rule requires that we apply some rule of gcd again. Repeated application of the rule eventually leads to the base case being applied, and the program terminates.
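Read this way, the two rules transcribe almost one-to-one back into a recursive C function (a sketch, with a distinct name so it doesn't clash with the loop version above):

/* The first branch is the base case, the second applies the inductive rule. */
int gcd_rec(int a, int b) {
    if (b == 0)
        return a;             /* base case: a ∧ b ∧ b=0, result is a */
    return gcd_rec(b, a % b); /* inductive rule: c = gcd(b, a % b) */
}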

Now we can determine that the base case is trivially true.

'a' is the highest divisor of itself, which can be represented as:

max(divisors(a)) = a

All non-zero divisors are divisors of zero:

divisors(0) = all

Set-intersection with anything:

divisors(a) ∩ all = divisors(a)

Now we can confirm that max(divisors(a) ∩ divisors(0)) reduces to a.
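Spelled out, using only the three facts above:

max(divisors(a) ∩ divisors(0)) = max(divisors(a) ∩ all) = max(divisors(a)) = a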

Proving the inductive rule is harder. We would have to prove that the following equation is valid:

max(divisors(a) ∩ divisors(b)) = max(divisors(b) ∩ divisors(a % b))

If you think about what the modulus is and what its definition would be, you can do it. I leave it for you as an exercise. :)
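To get a feel for the equation (a sanity check, not the proof), take a = 12 and b = 8:

max(divisors(12) ∩ divisors(8)) = max({1,2,3,4,6,12} ∩ {1,2,4,8}) = max({1,2,4}) = 4
max(divisors(8) ∩ divisors(12 % 8)) = max({1,2,4,8} ∩ {1,2,4}) = max({1,2,4}) = 4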

Also, even though we removed machine integers from the picture, notice that there's an interesting edge case where this algorithm doesn't produce the same result as the definition. Can you figure out what it is? There's also an error somewhere on this page that hides the issue. Everything you need to figure it out is already on this page.

I'll add the answers to next week's post.

29.4.2019: Improved the post based on feedback. Emphasis on the validity concept of the computational model. Explained the notation used. Improved the notation for induction based on earlier examples.
