Static typing increases the need to refactor

I came up with an idea while studying type theory. I am a vehement proponent of dynamic typing, but several ideas I am working on require me to dig into type theory, so I am studying it.

I want to see the idea contested, so here's the bait, with the most inflammatory yet relevant title I could think of.

Notice: Depending on how you use types in Haskell or the like, the contents of this article may or may not apply to you.

Preliminary refresher

For accessibility, let's go through the terminology used in this post. That way you can also check that we are talking about the same thing.

Refactoring is the process of restructuring computer code without affecting its external behavior. You can do it by hand, but if you have statically typed code, you can also automate parts of the process.

Static typing, or static type checking, means that the compiler checks the type safety of a program at compile time. In contrast, dynamically typed programs check type safety at run time.

By type safety we refer both to the formal definition and to the kind of type safety that popular programming languages such as C++ and Java provide. That these two definitions are as far apart as the sky and the sea forms the crack that this debate can expand in.

Concrete type may be an unfamiliar term to a larger audience as well. It refers to a type that uniquely identifies the representation of values and objects, as opposed to an "abstract type". E.g. an integer, a string, a list containing integers.
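For concreteness, here is a minimal C++ sketch of the distinction; the names are made up for illustration:

    #include <string>
    #include <vector>

    // Concrete types: each pins down one specific representation.
    int              count = 0;              // integer
    std::string      name  = "example";      // string
    std::vector<int> items = {1, 2, 3};      // a list containing integers

    // Not concrete: the template parameter T stands for any type that
    // supports operator+, not for one specific storage layout.
    template <typename T>
    T add(T a, T b) { return a + b; }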

When discussing static typing you have to be careful

It is unfortunate that you cannot say static type checking guarantees type safety or correctness. Static typing proponents say it anyway, but add weasel words:

In careful texts it's never "all errors" or even "all type errors".

This is often the reason people give when they claim that static typing is beneficial. The other reasons usually listed are improved tooling support, improved correctness, improved documentation, and efficiency of the produced program.

The best lies have a little bit of truth in them. The morsels of truth here are the words that weaken the claims, such as "some" and "certain", and the word "efficiency".

Quoting Knuth

Knuth's famous quote about premature optimization has been used to bludgeon static typing so often that people have started to claim it's a misquote of what he said.

Here's the popular quote again:

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

There is a larger context that points out you still have to care about efficiency, but we have to remember that Knuth started his life's work well before personal computing was a thing. The source of this quote was published in 1974. At that time nearly everything, even trivial things, had to be optimized because otherwise it would not run.

The big motivation for me picking this subject is that writing statically, concretely typed code in statically, concretely typed programming environments is a larger waste of time than the constant argument between dynamic and static typing.

Premature concern over efficiency

With static typing you pretend that you have a baby in the bathwater, but you actually throw the baby out, because the bathing goes much faster when there's no baby to bathe.

The biggest f$*kup in statically typed programming environments is that they require concrete type definitions for the programs they produce.

The concrete type restriction is a heritage from simpler times, when you had to use your platform's assembly language and you ran out of RAM faster than you ran out of development time.

When you describe a concrete type, you also describe very specific implementation details of the value's representation, such as its storage layout and size.
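A small C++ sketch of what gets fixed in place when you pick a concrete type; the Packet struct is a made-up example:

    #include <cstdint>
    #include <cstdio>

    // Picking uint8_t over uint32_t is not just a conceptual choice: it fixes
    // the size, the alignment, and the wrap-around behavior of the value.
    struct Packet {
        std::uint8_t  kind;    // 1 byte
        std::uint32_t length;  // 4 bytes; forces padding after 'kind'
    };

    int main() {
        std::printf("uint8_t:  %zu\n", sizeof(std::uint8_t));   // 1
        std::printf("uint32_t: %zu\n", sizeof(std::uint32_t));  // 4
        std::printf("Packet:   %zu\n", sizeof(Packet));         // typically 8, due to padding
    }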

There is Knuth's critical "3%" of the time, when the operation must evaluate efficiently and we cannot afford several representations of the same concept.

The rest of the time it would be enough that the selected type satisfies some mathematical properties. For example, you might want an input stream, but you don't care whether it's a list, a generator, a file, or a network stream.

Specifically for streaming you may find a way to generalize over many kinds of streams, as sketched below. There are often ways to express this kind of abstraction, but they aren't the default everywhere when the compiler has been designed to expect concrete types.
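As a sketch of the difference, here is a toy summing function in C++: the first version demands one concrete container, the second accepts anything that can be read as a stream of integers. The file name is hypothetical.

    #include <cstdio>
    #include <fstream>
    #include <iterator>
    #include <numeric>
    #include <vector>

    // Overconstrained: the caller must hand over exactly a std::vector<int>.
    int sum_vector(const std::vector<int>& values) {
        return std::accumulate(values.begin(), values.end(), 0);
    }

    // Generalized: any pair of input iterators will do -- a vector, a list,
    // or an istream_iterator reading from a file or a socket.
    template <typename InputIt>
    int sum_stream(InputIt first, InputIt last) {
        return std::accumulate(first, last, 0);
    }

    int main() {
        std::vector<int> v = {1, 2, 3};
        std::printf("from vector: %d\n", sum_stream(v.begin(), v.end()));

        std::ifstream file("numbers.txt");  // hypothetical input file
        std::printf("from file:   %d\n",
                    sum_stream(std::istream_iterator<int>(file),
                               std::istream_iterator<int>()));
    }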

This bias for efficiency means that the author is encouraged to select an overconstrained type, and frequently he doesn't even have the means to describe what the program actually needs in order to run.

Because of this overconstraining, the claimed benefits are not there. The value of the type as documentation diminishes, because the type describes more properties than are actually expected.

Then you need refactoring tools

The overconstrained types end up spread all across the software. Since each type is more specific than it needs to be, you have more reasons to change the types in the program. Now you need refactoring tools, because the type signature of any specific object has to be changed in many places.

Additionally you may face code quality problems from passing more information into a function than it requires, in order to avoid the refactoring cost of changing an individual type.
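A hedged C++ sketch of that trade-off; the Config type and its fields are made up for this example:

    #include <string>

    // Hypothetical application-wide configuration object.
    struct Config {
        std::string log_path;
        int         retry_count;
        double      timeout_seconds;
    };

    // Only needs the timeout, but takes the whole Config so that the signature
    // never has to change when the requirements shift later. The reader can no
    // longer tell from the type what the function actually depends on.
    bool should_retry(const Config& cfg, double elapsed_seconds) {
        return elapsed_seconds < cfg.timeout_seconds;
    }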

The type signature doesn't state which specific properties are expected. This means that when you change one type into another, you might introduce a bug that won't be detected by your type system.

An example of this kind of bug shows up with modular arithmetic. Widen an 8-bit unsigned integer to a 32-bit one and there's a chance you broke something, because a counter no longer wraps around when it goes above 255.
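A C++ sketch of the hazard, assuming the original code relied on the 8-bit wrap-around:

    #include <cstdint>
    #include <cstdio>

    int main() {
        // Before: an 8-bit counter that wraps back to 0 after 255, and
        // some other code depends on exactly that behavior.
        std::uint8_t counter8 = 250;
        for (int i = 0; i < 10; ++i) ++counter8;
        std::printf("uint8_t counter:  %u\n", static_cast<unsigned>(counter8));   // prints 4

        // After a "harmless" widening refactor the type checker is still
        // satisfied, but the counter no longer wraps at 256.
        std::uint32_t counter32 = 250;
        for (int i = 0; i < 10; ++i) ++counter32;
        std::printf("uint32_t counter: %u\n", static_cast<unsigned>(counter32));  // prints 260
    }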

Now you're likely to say that doesn't happen. Perhaps that's because you're remembering a specific scenario where you noticed the problem and fixed it before it bit you. Or you might be thinking it's unwise to do that kind of coercion, because people pick 'int' instead of 'uint8' when the size doesn't matter.

Static typing proponents say one thing yet do another. Why pick a statically typed system with concrete types if you aren't hellbent on efficiency?

There is no law stating that, when you type a program statically, the types must be concretely chosen by the time compilation is done.

(The title is missing 'Concrete' from the front, but I guess that's appropriate? We aren't being too specific, are we?!)

Update: A reader on Reddit sent me a link about this same subject, and it may be easier to follow than mine. It is titled:

"Data Structures Are Antithetical to Functional Programming".

I also added a little to the post above, but if you go to the Reddit thread you will find comments filling in what was not covered here.
