APL, ignored wisdom under plain sight

The usual vector algebra implementations you find in any programming language resemble glmatrix. They focus on vectors and matrices with low dimensions and provide a lot of operations specific to some use. They are also very laborious and repetitive to implement.

I've grown dissatisfied with solutions that are laborious for a language and library maintainer. The time of a person with expertise to do this kind of work is always better spent elsewhere.

I think that the solution to this was invented 51 years ago. That sometimes seem to be a theme of this blog, everything has been already figured out already. Surprisingly the solution is the APL this time.

I happened to stumble upon this because APL has been recently featured in several places and I really was looking for a solution to this from everywhere else first. But I eventually ended up to the conclusion that computer programs have to represent their arithmetic mainly as a sequence of operations. An alternative would be to base the whole work to some mathematical theory, but doing so would suppress some other useful mathematical theories.

I think the fundamental problem is originating from a simple misunderstanding that comes up when we do programming with matrices. It is best explained through an example:

mat3 m = mat3(
   1.1, 2.1, 3.1, // first column 
   1.2, 2.2, 3.2, // second column
   1.3, 2.3, 3.3  // third column
);
vec3 column3 = m[2]; // = vec3(1.3, 2.3, 3.3)
float a = column3[0]; // = 1.3

It appears to be that matrices consist of vectors, and that vectors consists of floating point values. This is true, but it leads us to thinking inconsistently about vectors.

The problem is that the containment-concept doesn't support the idea of multiplying a vector by scalar. a -> a -> a prevents that in Haskell. Also it suspiciously looks a lot like you'd allow operation between a list and its contents, which is usually not very popular action to do either.

The embedding of numbers to arrays work if you think that the number it itself an array that you can extend infinitely with ranks. Now addition with a vector makes sense because numbers are rank-0 arrays and vectors are rank-1 arrays. Addition works because both elements are of same type, and you can bring both of them into the same rank.

Example of 1.2 + [2, 3, 4]:

1.2
2.0, 3.0, 4.0
operation=(+), incompatible shape.

expand rank [] to [3] to reach compatible shape.

1.2, 1.2, 1.2
2.0, 3.0, 4.0
------------- (+)
3.2, 4.2, 5.2

This kind of thinking leads to the Iverson notation that allows every operation to work on arrays just as well as they work on individual elements. Iverson notation allows you to represent many subjects through elementary operations.

length: sqrt(fold((+), a*b))
dot product: + fold((+), a*b)
cross product: (↑a)*(↓b) - (↓a)*(↑b)
matrix multiplication: C[x][y] = A[x][k] * B[k][y]

If you can represent the vector arithmetic on this level, you wouldn't really necessarily need a library for representing these concepts. A library might even get on the way!

I recently heard about Egison and the paper "Scalar and Tensor Parameters for Importing Tensor Index Notation including Einstein Summation Notation". Also the pattern matching paper seems something that could be integrated into my language even if Egison is lazy and Lever is eager.

Overall Egison's symbolic and numerical computation capabilities seem to be exactly what I would hope to have in Lever, also the aesthetics of the author seem to be near with mine although the features implemented in either language are complementary. We have been solving the same problem from different ends.

After writing this post, I finished reading Egison's tensor paper. I have concluded that this notation is excellent and that I will also implement it into Lever. Though I propose it is not a good idea to have free variables flowing all around, even if they were using invariants such as in Einstein's summation notation. I likely implement this in the following formats:

c[.i^j] = a[.i] * b[^j]
[i] length = sqrt(+ fold a[.i]*b[^i])

The variable must be introduced, and it is preferred that the context ends up eliminating the variable before it escapes the context.

Similar posts