Multiple Dispatch Problems

I chose multimethods to implement operator overloading in my language. As I have worked with the system, a growing hunch has told me that something is wrong with it.

I didn't find any articles pointing out problems in multiple dispatch, so I wrote one. Here it is. This post isn't going to provide a solution to the problem, but it will try to specify the problem more precisely.

Feature description

Multiple dispatch means that an implementation of a function is selected based on the types of two or more arguments.

The definition usually implies dynamic dispatch, but many of these potential problems should also appear in static function overloading on multiple arguments.

It also shouldn't matter whether the dynamic dispatch happens in two steps. E.g. in Python the addition operator calls .__add__(self, other) and then .__radd__(other, self). The method that runs is still determined by the types of both operands.

The t² issue

Let's say that we have t types and we have to be able to combine any two of them in a meaningful way. How many methods are necessary? The answer is that t² methods are necessary.

We need a square two-dimensional table to store all the methods. The problem is that the number of methods grows quadratically. We already need 16 methods to handle 4 tightly coupled types.
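To make the table concrete, here is a hypothetical sketch (the type names are made up for illustration) of a dispatch table keyed on ordered pairs of types; four types already force sixteen entries:

```python
from itertools import product

# Four hypothetical, tightly coupled types.
class Int: pass
class Float: pass
class Complex: pass
class Quaternion: pass

types = [Int, Float, Complex, Quaternion]

# One implementation per ordered pair of argument types: t * t methods in total.
add_table = {pair: (lambda a, b: NotImplemented) for pair in product(types, types)}

print(len(add_table))  # 16
```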

Also, the implementations of most of those methods are usually trivial, suggesting that our implementation does not fit the semantics of what we are trying to implement.

Note that this was a hypothetical case where you'd need all the implementations because every single pair of types has to interact.

Also note that we ignore the possibility of collapsing highly coupled types through type inheritance or subtyping. More complex type promotion systems are also ignored for the moment.

False sense of extensibility

In the situations where we actually have 3-to-3 or 4-to-4 interactions, and beyond, writing every method may be workable if the types form a closed system where the author is aware of every available element.

But multiple dispatch is supposed to allow extensibility from the outside. Unfortunately, the useful applications of this turn out to be severely limited. We can go through two scenarios to confirm it.

In our scenarios there are two authors, each with a library. Both libraries extend the same set of types to implement additional forms of arithmetic. Then there is a user who has to use these arithmetic forms together.

We do not specify what kind of arithmetic is involved here, but we assume it does not matter. In the existing system there are 4 types, which means there already exist 16 methods.

Each author adds 2 additional types to accompany the 4 that are already there. This would be an unusual setting, but it keeps the description simple by having only two authors with such extraordinary needs.

The first scenario

In the first scenario the authors are not aware of each other. Each author sees himself extending the group of 4 types to 6, which means each author adds 6² − 4² = 20 new methods.

Unfortunately the total number of types afterwards is 8, so out of 64 entries we have 64 − 16 − 20 − 20 = 8 holes in our dispatch table: the pairs that mix one author's types with the other's.

This means that someone who wants to use the two libraries together would have to implement the 8 missing methods himself.

You could argue that these two libraries should not interact in the first place because they cannot, unless either author is aware of the other. This is a valid concern.

The second scenario

In the second scenario the authors are aware of each other and co-operate so that their libraries can be used together.

Unless the libraries cross-import each other, one of the authors has to implement the 8 additional cross methods to make his library work in the combination. This causes a fairness problem, because the other author does not carry this additional responsibility.

Also, there end up being a total of 64 methods. Each author who requires additional types and joins this group faces an ever-growing pile of responsibilities.

The 20th introduction of a type requires 39 new methods. After that there is a total of 400 methods.

By comparison, if we needed only one method per type, the 20th introduction of a type would only require 1 additional method.
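These figures follow directly from the table size: introducing the t-th type grows the table from (t − 1)² to t² entries, so the newcomer owes 2t − 1 new methods. A quick check (the function name is mine):

```python
def methods_for_nth_type(t):
    # Growing a (t-1) x (t-1) dispatch table into a t x t table.
    return t * t - (t - 1) * (t - 1)  # equals 2*t - 1

print(methods_for_nth_type(20))  # 39
print(sum(methods_for_nth_type(t) for t in range(1, 21)))  # 400 in total
```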

Real examples

I went through my project repository, collecting every case where I used multimethods for something. This provides more information to help analyse the situation.

In an implementation of subtyping

I have written a multimethod which tells whether some type is a subtype of another. It is the following kind of binary multimethod:

is_subtype = multimethod(2)
is_subtype.default = (a, b):
    return false
is_subtype[[StrictInt, StrictInt]] = (a, b):
    return a == b
is_subtype[[Pointer, Pointer]] = (a, b):
    return is_subtype(a.type, b.type) and a.storage == b.storage

The default behavior is to return 'false', determining that the left-side type is not a subtype of the right-side type.

This is one of the 'closed form' functions. It does not reveal any new insights; we already knew this is something you might end up doing with multimethods.
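For comparison, the same structure can be sketched in Python with a dict-based dispatch table (my own minimal stand-in, not the actual Multimethod class of the runtime):

```python
class Multimethod:
    """Dispatch on the exact types of all arguments, with an optional default."""
    def __init__(self, arity, default=None):
        self.arity = arity
        self.table = {}
        self.default = default

    def __setitem__(self, types, fn):
        assert len(types) == self.arity
        self.table[tuple(types)] = fn

    def __call__(self, *args):
        fn = self.table.get(tuple(type(a) for a in args), self.default)
        if fn is None:
            raise TypeError("no method for %r" % (args,))
        return fn(*args)

class StrictInt:
    def __init__(self, width):
        self.width = width
    def __eq__(self, other):
        return isinstance(other, StrictInt) and self.width == other.width

class Pointer:
    def __init__(self, to, storage):
        self.type = to
        self.storage = storage

# Mirrors the is_subtype definition above.
is_subtype = Multimethod(2, default=lambda a, b: False)
is_subtype[[StrictInt, StrictInt]] = lambda a, b: a == b
is_subtype[[Pointer, Pointer]] = lambda a, b: (
    is_subtype(a.type, b.type) and a.storage == b.storage)

i32 = StrictInt(32)
print(is_subtype(Pointer(i32, "heap"), Pointer(StrictInt(32), "heap")))  # True
```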

Specifying rules for sorting a list

I wrote an ordering function for a context-free grammar's symbols. I put it into a separate table because I did not want to populate the actual comparison table unnecessarily.

symbol_lt = multimethod(2)
symbol_lt[[Terminal, Nonterminal]] = (a, b):
    return false

symbol_lt[[Nonterminal, Terminal]] = (a, b):
    return true

symbol_lt[[Nonterminal, Nonterminal]] = (a, b):
    return a.name < b.name

symbol_lt[[Terminal, Terminal]] = (a, b):
    return a.name < b.name

At first sight this doesn't provide any additional insight either, as it is a 'closed form' function just like the one before.

The definition of additional methods is "global" in one sense of the word. Therefore, if there is no need to modify the behavior of the operation from other modules, there is no significant motivation to add yet another rule into a multimethod table.

So this example illustrates that when we reach for multimethods, it is because we desire interaction with other modules.

Remember that this does not apply to all languages, because not all of them allow local renaming of operators. Such behavior is illustrated below:

lt = %"<"
%"<" = new_lt(a, b):
    if isinstance(a, Terminal) and isinstance(b, Nonterminal)
        return false
    elif isinstance(a, Nonterminal) and isinstance(b, Terminal)
        return true
    else
        return lt(a, b)
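In Python there is no %"<" construct, but a comparable effect, matching the symbol_lt table above, can be sketched by wrapping and shadowing a comparison function locally (all names here are hypothetical):

```python
import operator

class Terminal:
    def __init__(self, name):
        self.name = name

class Nonterminal:
    def __init__(self, name):
        self.name = name

def make_symbol_lt(lt=operator.lt):
    # Nonterminals sort before terminals; anything else defers to the old lt.
    def new_lt(a, b):
        if isinstance(a, Nonterminal) and isinstance(b, Terminal):
            return True
        elif isinstance(a, Terminal) and isinstance(b, Nonterminal):
            return False
        return lt(a.name, b.name)
    return new_lt

symbol_lt = make_symbol_lt()
print(symbol_lt(Nonterminal("expr"), Terminal("+")))  # True
```

The rebinding stays local to whoever called make_symbol_lt, which is exactly what a global multimethod table cannot offer.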

Runtime operator overloading

I didn't find any other examples of multimethod use in my libraries. There is still one side left to examine: how my operators are implemented in the runtime.

For starters, here is the full list of multimethods in my language:

add  = arithmetic_multimethod(u'+',   (lambda a, b: a + b), flo=True)
and_ = arithmetic_multimethod(u'&',   (lambda a, b: a & b))
clamp = Multimethod(3)
cmp_ = Multimethod(2)
coerce = Multimethod(2)
concat = Multimethod(2)
cross = Multimethod(2)
#distance = Multimethod(2) # planned
div  = by_symbol[u'/'] = Multimethod(2)
dot = Multimethod(2)
eq  = Multimethod(2)
floordiv  = by_symbol[u'//'] = Multimethod(2)
ge  = Multimethod(2)
gt  = Multimethod(2)
le  = Multimethod(2)
lt  = Multimethod(2)
max_ = arithmetic_multimethod(u'max', (lambda a, b: max(a, b)), flo=True)
min_ = arithmetic_multimethod(u'min', (lambda a, b: min(a, b)), flo=True)
mod  = arithmetic_multimethod(u'%',   (lambda a, b: a % b))
mul  = arithmetic_multimethod(u'*',   (lambda a, b: a * b), flo=True)
ne  = Multimethod(2)
or_  = arithmetic_multimethod(u'|',   (lambda a, b: a | b))
pow_ = Multimethod(2)
#reflect = Multimethod(2)  # planned
#refract = Multimethod(3)  # planned
shl = by_symbol[u'<<'] = Multimethod(2)
shr = by_symbol[u'>>'] = Multimethod(2)
sub  = arithmetic_multimethod(u'-',   (lambda a, b: a - b), flo=True)
xor  = arithmetic_multimethod(u'^',   (lambda a, b: a ^ b))

I have left out unary multimethods from the above list because single dispatch is not going to cause any trouble. For reference, these are the unary methods:

abs(a)         # absolute value
length(a)      # vector length
normalize(a)   # normalize
neg(a), pos(a) # negative and positive sign

To make sense of this, it is worthwhile to categorize it all a little. The single biggest category here is arithmetic.

Arithmetic multimethods

The arithmetic multimethods form the single biggest group of multimethods. These include:

add, and, cross, div, dot,
floordiv, max, min, mod, mul,
or, pow, shl, shr, sub, xor

The arithmetic methods have a common aspect. With the notable exception of shl/shr, their machine-code equivalents require that the types on both sides are equal.

I have taken this into account and implemented a coercion multimethod, which is called whenever an arithmetic multimethod fails to find a match.

Arithmetic coercion

The only coercion rule I have had in my coercion table so far is the int -> float coercion.
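A sketch of how that fallback can work (the names arith_add and coerce are mine, not the runtime's): when the dispatch table has no exact match, the operands are coerced, here with only the int -> float rule, and the operation is retried.

```python
def coerce(a, b):
    # The only rule so far: int -> float.
    if isinstance(a, int) and isinstance(b, float):
        return float(a), b
    if isinstance(a, float) and isinstance(b, int):
        return a, float(b)
    raise TypeError("cannot coerce %r and %r" % (a, b))

add_table = {
    (int, int): lambda a, b: a + b,
    (float, float): lambda a, b: a + b,
}

def arith_add(a, b):
    fn = add_table.get((type(a), type(b)))
    if fn is None:
        # No exact match: coerce to a common type, then retry.
        a, b = coerce(a, b)
        fn = add_table[(type(a), type(b))]
    return fn(a, b)

print(arith_add(1, 2.5))  # 3.5
```

This keeps the arithmetic table linear in the number of types, at the cost of a coercion table that must still know how types relate.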

Another useful case of coercion appears to come from vector/matrix/tensor arithmetic, and from the complex number and quaternion side.

We would prefer to be able to convert integers and floats into these types.

In a certain weird way, scalar values such as integers and floats are valid multidimensional arrays. You could describe the dimensions in the following way:

[=1, =1, ...] scalars
[>=2, =1, =1, ...] vectors
[>=2, >=2, =1, =1, ...] matrices
[>=2, >=2, >=2, =1, =1, ...] tensors

This aspect of vectors also motivates implementing parametric types, even if the language were otherwise dynamically typed.

Comparison and equality

The comparison and equality multimethods are potentially the group that requires the most careful examination. This group forms the tools for comparing values and objects.

cmp, eq, ge, gt, le, lt, ne

These are the comparison operations in a fairly standard naming scheme: "g" stands for "greater", "l" for "less", "e" for "equal" and "t" for "than".

ge, gt, le and lt could potentially be ordinary functions that just call 'cmp'. The 'cmp' itself has some similarity with the other arithmetic operations: it can use arithmetic coercion to extend its ability to compare.
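A sketch of that reduction, using plain functions and Python 2's cmp convention of returning -1, 0 or 1:

```python
def cmp(a, b):
    # Returns -1, 0 or 1, in the style of Python 2's built-in cmp().
    return (a > b) - (a < b)

def lt(a, b): return cmp(a, b) < 0
def le(a, b): return cmp(a, b) <= 0
def gt(a, b): return cmp(a, b) > 0
def ge(a, b): return cmp(a, b) >= 0

print(lt(1, 2), ge(2, 2))  # True True
```

Only cmp would need to live in a multimethod table; the four derived operators need no dispatch of their own.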

In target machine terms, comparison of integer numbers is related to subtraction. Because of that, some sanity can be observed in these rules.

Equality and inequality are potentially troublesome. On one hand, one is merely shorthand for the negation of the other.

Right now equality is implemented such that it calls 'cmp'; if 'cmp' does not provide a result, then we fall back to comparing identity.

It appears that if we want to use this kind of definition for these operators, then each object falls into one of three groups:

  1. Has equality, inequality. Are equal by unique value.
  2. Has equality, inequality. Are equal by structure.
  3. Has cmp. Can be compared numerically.
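The fallback chain for equality can be sketched in Python (hypothetical names; cmp_table stands in for the runtime's cmp multimethod):

```python
cmp_table = {
    (int, int): lambda a, b: (a > b) - (a < b),
}

def eq(a, b):
    fn = cmp_table.get((type(a), type(b)))
    if fn is not None:
        return fn(a, b) == 0
    # No cmp method available: fall back to identity.
    return a is b

class Box:
    pass

b = Box()
print(eq(1, 1), eq(b, b), eq(b, Box()))  # True True False
```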

The only ternary multimethod: clamp(x, lowbound, highbound)

clamp is the only multimethod with three arguments. On closer inspection, though, we note that 'x' is the interesting item and the interpretation of the remaining arguments depends on what 'x' is. Therefore this could be implemented as a single-dispatch method.
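As a sketch of that claim (with a hypothetical Vec2 type; none of these names come from the runtime), dispatching on the type of x alone is enough:

```python
def clamp_scalar(x, lo, hi):
    return min(max(x, lo), hi)

class Vec2:
    def __init__(self, x, y):
        self.x, self.y = x, y

def clamp_vec2(v, lo, hi):
    # The bounds are interpreted componentwise, relative to the vector.
    return Vec2(clamp_scalar(v.x, lo.x, hi.x),
                clamp_scalar(v.y, lo.y, hi.y))

# Single dispatch: only the type of the first argument selects the method.
clamp_by_type = {int: clamp_scalar, float: clamp_scalar, Vec2: clamp_vec2}

def clamp(x, lo, hi):
    return clamp_by_type[type(x)](x, lo, hi)

print(clamp(5, 0, 3))  # 3
```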

Outlier: concat(a, b)

Concatenation appears to be useful as-is, but in retrospect it might benefit from rules similar to the arithmetic methods.

Intermediary conclusion

What we have seen so far suggests that multiple dispatch may be unnecessary for a good implementation of operator overloading. Even with expressive forms of operations, the desired behavior boils down to aligning the type signatures before the task can be done.

The problem with multimethod-like dispatch seems to be that it cannot be used for the purpose that would potentially be the most useful one in such a system: open-ended extension from outside.

If it cannot be used for one of the most useful things it could enable, is it any good in the first place? Considering that there is a much simpler way to redefine an operator in a local context, I would resoundingly say "no".

Multiple dispatch may be an anti-feature.

Solutions

There are potentially several good solutions to the problems described here. Unfortunately the problem description itself barely fits into a week's worth of study, so any resolutions have to be left for another time.

Similar posts