Null references were a mistake, but what's the take-away?

Not too long ago I used to think of null as a useful feature that can be directly derived from the pointer arithmetic on a computer. I used to think that Sir Tony Hoare was wrong, or talking about a different null than the one I knew.

Null values appear in both statically and dynamically typed languages. In statically typed languages they break the type system beyond repair. In dynamically typed languages they are an expensive crutch for bad design. And their existence cannot be properly justified by how a computer does pointer arithmetic.

What's the take-away from knowing that null references are bad? Once I started turning against null I wondered about this. Eventually I realised that the take-away is in understanding how horrendous null is, so you won't reinvent it or mistakenly recognize anything else as null.

Types

To perceive null as shoddy requires that you have a fairly good understanding of what types should be. Types are things you know about the program before you run it.

As an example of something you might not think of as a type: Hoare triples are type declarations.

Let's show some simple boolean types that have only a few values. So let's say we notate booleans like this:

A ∈ {0,1}

This means that A is either '0' or '1'. We can add another one to make a product:

A ∈ {0,1} and B ∈ {0,1}

The values this type allows are:

A=0, B=0
A=0, B=1
A=1, B=0
A=1, B=1

You could use these to count from 0 to 3 using booleans. That'd require you to say that A=0, B=0 is zero, and A=0, B=1 is the successor of A=0, B=0. Whether these are programs or types is a thing to wonder about.
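As a minimal sketch, here is roughly how that product type could look in Haskell. The names TwoBits and toInt are made up for illustration:

-- A product of two booleans: exactly the four values listed above.
data TwoBits = TwoBits Bool Bool

-- One way to read the four values as the numbers 0..3.
toInt :: TwoBits -> Int
toInt (TwoBits a b) = 2 * fromEnum a + fromEnum b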

Now we could restrict the type further:

A ∈ {0,1} and B ∈ {0,1} and xor(A,B)=1

And we'd get the following values:

A=0, B=1
A=1, B=0
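A language with sum types can capture this restricted type directly, so that the two allowed combinations are the only values that can ever be constructed. A sketch in Haskell, with made-up constructor names:

-- Exactly two values: one for A=0, B=1 and one for A=1, B=0.
data XorBits = A0B1 | A1B0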

Finally we may restrict the type so much that it has no values:

A ∈ {0,1} and B ∈ {0,1} and false

Or we may present a type that has only one value:

true
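Both extremes exist as ordinary types. In Haskell they could be sketched like this:

-- A type with no values at all: there is no way to construct one.
data Empty

-- A type with exactly one value; the unit type () plays this role.
unitValue :: ()
unitValue = ()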

Now if you look at all these types, you may notice that their memory layout does not need to correspond to how they are written. Why would you need 2 bits to store something that fits into 1 bit?

Accepting null means tightly marrying the memory layout and the types together. You can no longer examine types while ignoring how they're laid out in memory. In modern times you'd probably prefer them to stay in an open relationship.

Introducing null multiplies the state you need to reason about. Either you get lots of null checks, or you get fragile logic. Without that fragility, a type system would be a form of proof: a form of mechanized reasoning that supports your own reasoning.

A null exception or a segmentation fault is not any more dignified than a runtime type error in Python. Maybe slightly less so, as you're the dork who bent a perfectly well-working spoon.

Laxity

We can look at that counting idea again to illustrate the problems with null.

So let's say we wanted to use two bits for counting and added a successor function to do this.

successor for A=0, B=0: give A=0, B=1
successor for A=0, B=1: give A=1, B=0
successor for A=1, B=0: give A=1, B=1
successor for A=1, B=1: give ??

It looks like this function is not closed. It is missing a result for one possible value you can pass in. We could make it closed by wrapping it around:

successor for A=1, B=1: give A=0, B=0
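The wrapped-around version is a closed (total) function: every input has a result. A sketch using the hypothetical TwoBits type from earlier:

-- Successor modulo 4: wraps around from 3 back to 0.
succWrap :: TwoBits -> TwoBits
succWrap (TwoBits False False) = TwoBits False True
succWrap (TwoBits False True ) = TwoBits True  False
succWrap (TwoBits True  False) = TwoBits True  True
succWrap (TwoBits True  True ) = TwoBits False False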

We could do other things as well. For example, we could add an additional constant to describe that the number is saturated:

successor for A=1, B=1:   give saturation
successor for saturation: give saturation
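Saturation means the result needs one extra value in its type, which is again just a sum type. A sketch building on the earlier definitions:

-- Either a real two-bit count or the saturated marker.
data Count = Value TwoBits | Saturated

succSat :: Count -> Count
succSat (Value (TwoBits True True)) = Saturated
succSat (Value n)                   = Value (succWrap n)  -- n is below the maximum here
succSat Saturated                   = Saturated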

It could be that we have to be exact. Then we could fail and keep the successor function open:

successor for A=1, B=1:   fail
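Failing keeps the function open: the maximum value simply has no successor, and we say so immediately instead of handing the caller a placeholder. A sketch:

-- Open version: fail right away when there is no successor.
succOrFail :: TwoBits -> TwoBits
succOrFail (TwoBits True True) = error "no successor: counter is at its maximum"
succOrFail n                   = succWrap n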

Finally, we can return null and leave the problem that the function isn't closed for the next guy to resolve:

successor for A=1, B=1:   give null

In most languages that allow null to be part of any type, we also hide the problem by using null. That's convenient because it lets you forget about it until someone gets a null exception.
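For contrast, here is a sketch of what "leave it for the next guy" looks like when the absence is part of the type rather than a pervasive null. The caller can see from the type that a result may be missing, and can't accidentally treat the missing case as a real counter value:

-- The missing result is visible in the type, not smuggled in as null.
succMaybe :: TwoBits -> Maybe TwoBits
succMaybe (TwoBits True True) = Nothing
succMaybe n                   = Just (succWrap n)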

Vague semantics

Dynamically typed languages such as Javascript or Python should not have null at all. The problem is that null collapses the semantics of these languages. When you meet a null value, you don't know which semantics it assumes; it could be nearly anything.

I've seen null being used to represent all sorts of unrelated things, and nothing in the value tells you which one was intended. It lets the user of the language be very sloppy.

Gateway null theory

But null also lets the creator of the language be extremely sloppy. It's an open portal for introducing all sorts of other bad ideas into the language.

Null allows an easy relief to the problem: "This thing would work if I could leave that otherwise necessary value out." When you finally get it wrong, it takes a long while to figure it out, because you've bypassed the program's types or structure and they no longer help you. Finally, you won't blame your use of null for making the mistake in the first place, so you're bound to repeat it many times if you don't get this advice.

If the variable flow in the language isn't working, you can still make it "work" by assigning null to the variables you can't bind and then running the program. With a bit of luck it works in all of your tests and you won't trigger any of the systems that needed proper values.

If the operation of the interpreter doesn't quite match the spec, null makes it easier to not spot that they don't match.

You might think at first that this is akin to functional builds in the car industry: something in the direction of correcting imperfections in parts by building assemblies that accept imperfections. Unfortunately null isn't about finding perfection without perfect parts. Null is more about being stupid, foolishly lax in a place where you could attain perfection, and paying the harsh price for being dumb.

So what to do?

Null, None, or whatever it's called is deeply ingrained into the languages that exploit it. There are nonsensical special rules, such as integers not being allowed to be "null", because in those languages the type system is the same thing as the memory layout. When this is the case, my advice is to not bother. The effort is already lost there.

However, you can make use of this advice when you create new things and new languages. Don't repeat the mistake of null.
