Can we solve dynamic memory management?
In the previous post I left a subject open. Do we need memory-unsafe programming languages?
It would seem that when the gut feeling of software being correct is true, then there exists a way to construct a formal proof that confirms it.
Maintaining memory safety in C programs and assembly code relies on ad hoc proofs. If you can identify the construction pieces of these proofs then they can become formal proofs that can be automatically checked. We won't get that far here.
Memory/Value/Type/Layout separation
Before we start we should make a separation between concepts, at least conceptually.
- Memory is mutable storage for values, but a memory address is not the value that is held in the memory.
- Value is a non-recursive compound constant that may contain free variables.
- Type is a constraint over values.
- Layout is an in-memory representation for values of some type.
Note that nothing is intimately tied together here. Values may belong to several types, types may have several layouts and memory may hold several layouts, types and values.
All of this is not necessary, but it is important that you don't attempt to solve this problem with the full burden of an existing implementation because existing mainstream programming languages remain surprisingly low level.
Dynamic memory axioms/theorems
I identify three rules that any program satisfies if it does not leak.
- Every allocation must follow with a free.
- It must be ensured that the free occurs before all references are lost.
- After freeing memory, the memory must not be used with any references that were there before the free.
Depending on how you think these may be theorems as well. If we treat them as theorems, it means we must prove these properties hold. If we treat them as axioms, we must reject any construct that can violate them.
For example. The duplication of a reference must be rejected, if the safety for the operation is not certain.
The simplest way to maintain these rules is to keep only one reference around and statically ensure it is freed.
There seem to be a few other fairly simple ways to stay on the safe side.
- Stratification of references. Refuse freeing a reference unless every reference that can reference it is freed.
- Reference counting. Keep count of references, this is valid as long as we can prove no cyclic references are forming. It can be achieved through an occurrence check.
- Substitution of invalidated references. Scan through referencing structures and substitute away the references that have been freed.
- Ensure fixed amount of references that are removed all at once. For example, in a doubly linked list.
I've been wondering whether everything boils down to these memory management strategies. All automatic memory management schemes would be included if the above list is exhaustive.
The above list might be exhaustive. All valid memory management strategies must be tracking the references and then be able of reducing the amount of valid references to one before freeing or reusing the memory behind it. There aren't that many ways to do this.