Concurrency in JavaScript
Ever since we have had multitasking operating systems, we have had concurrency. It has a history with thousands of war stories.
Concurrency means having many programs running at the same time, or taking turns.
Concurrency appears in many different forms throughout computing history. Early on, application programmers encountered concurrency problems via operating system threads. Concurrency bugs, and the locks used to keep them in place, became legends that still echo around the Internet.
Concurrency is your best choice for obtaining non-deterministic bugs. For that reason it is considered dangerous, traitorous, unpredictable, vile.
JavaScript started out as a simple language. It thrived in the browsers of the early Internet years.
JavaScript never really grew out of this role. Its design is still steered by what the least capable living browser vendor is able to cope with. This is not always a flaw in the language design.
There is an event loop at the bottom of JavaScript's call stack. The event loop responds to external events by dispatching calls. A call the event loop dispatches has to return before the event loop can respond to another event.
This design is simple to implement, but it presented a challenge to programming in JavaScript: the whole website would be jammed while a script was running. The outcome is that JavaScript programs could not wait for results from calls.
As features were added to JavaScript, the qualities of its event loop surfaced in interesting ways.
One of the earliest interesting things to do with JavaScript was to download more content to display after the website itself had loaded. If this had been implemented the way it was in Python, the HTTP request would have returned the content as a return value.
In JavaScript this kind of behavior would have jammed the event loop for as long as the download took. To work around this problem, a call to XMLHttpRequest does not download the content. Instead it enqueues a download task, and takes a few callbacks that the event loop calls when the task is completed.
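The shape of this callback interface can be sketched without a browser. The `enqueueDownload` function below is a hypothetical stand-in for XMLHttpRequest, with setTimeout playing the role of the network:

```javascript
// A sketch of the "enqueue a task, pass callbacks" shape.
// 'enqueueDownload' and its fake responses are illustrative
// assumptions, not a real API; setTimeout stands in for the network.
function enqueueDownload(url, onSuccess, onError) {
    // The "download" is enqueued; this function returns immediately.
    setTimeout(function () {
        if (url.indexOf("http") === 0) {
            onSuccess("<contents of " + url + ">");
        } else {
            onError(new Error("bad url: " + url));
        }
    }, 0);
}

enqueueDownload("http://example.com/more.html",
    function (content) { console.log("loaded:", content); },
    function (err)     { console.log("failed:", err.message); });
console.log("enqueued; the event loop is free in the meantime");
```

Note that the last `console.log` runs before either callback: the calling code returns to the event loop first, and only then does the queued task complete.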
JavaScript has exception handling, but it is hardly used, because most of JavaScript's API functions cannot raise meaningful exceptions: enqueuing a task does not tell you whether the task will succeed or fail. Instead, errors are transmitted back within callbacks.
As there came more things you could do in JavaScript, and more people interested in doing them, JavaScript got its own unique ways to do things. These ways were far more complex than in the languages that preceded it.
The smartest people saw, already about ten years ago, that there was something hairy going on here. Partitioning JavaScript programs into callbacks lets you enqueue several different tasks at the same time. Program execution would jump from callback to callback in partial order, gathering values on the way.
People enjoyed achieving things with JavaScript, but they grew to hate it and the callback-ridden code full of deeply nested function statements. Maintaining such code was a pain, and where there is pain, there are marketers bringing up solutions.
The solution with the highest perceived elegance/cost ratio came from 1977. To make it easier to enqueue and handle many tasks, let's redefine every JavaScript function that enqueues a task so that it returns a promise, on which code can place its callbacks before returning to the event loop.
To provide sequential execution order and stacking of promises, every attachment of a callback would itself return a promise. This returned promise could be used to attach another callback that runs after the first one. The first callback could then itself return a new promise that is waited upon before the callback chained after it may run.
It can sound complicated, although what forms is just an ordinary call flow. Let's consider the following sequence with promises:
return task_a(0)
    .then(function(a) {
        return task_b(a, 1);
    }).then(function(b) {
        return task_c(b, 2);
    }).then(function(c) {
        return task_d(c, 3);
    }).catch(error_handling_callback);
This function body would return a promise. That promise would resolve with the result of task_d() after it runs. The task_a() here would be enqueued to the event loop. After the function is called and control returns to the event loop, these four tasks would run one after another.
Note that each scope loses the variable it introduces as an argument. To keep a variable, you would have to move the callback inside another one, resulting in code that looks like this:
return task_a(0)
    .then(function(a) {
        return task_b(a, 1)
            .then(function(b) {
                return task_c(a, b);
            });
    }).then(function(c) {
        return task_d(c, 3);
    }).catch(error_handling_callback);
So if your promises had a lot of intra-variable dependencies, the code would start to regain the shape of a callback hierarchy, now with promises! To fix this problem, they added syntax for promises: async and await.
The new syntax could be implemented by a transpiler. The async prefix in front of a function triggers behavior that, with the help of the await keyword, makes it more logical and cleaner to represent these complex promise hierarchies:
async function run() {
    try {
        var a = await task_a(0);
        var b = await task_b(a, 1);
        var c = await task_c(a, b);
        return task_d(c, 3);
    } catch (error) {
        return error_handling_callback(error);
    }
}
You can see it is much cleaner. It still enqueues a task, and if you call the async-defined function directly, it returns a promise as before.
So you can call that function like before and use .then() to retrieve a value. Or, if your function is itself of the async flavour, you can await the value directly.
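Both calling styles can be shown side by side; fetchNumber here is a hypothetical async-defined function, not anything from the standard library:

```javascript
// An async function wraps its return value in a promise.
async function fetchNumber() {
    return 42;
}

// 1. Plain call: you get a promise and attach .then() to it.
fetchNumber().then(function (n) {
    console.log("via then:", n);   // via then: 42
});

// 2. From another async function: await unwraps the promise.
async function main() {
    var n = await fetchNumber();
    console.log("via await:", n);  // via await: 42
}
main();
```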
Now it is so clean that you could make every function just like this. But if everything became async, would you still need to explicitly mark it? Let's say we translated the remaining synchronous constructs to async, so that everything could be represented as async.
This would turn our language back into a synchronous one. Just imagine the above code sample without the async and await keywords. It would mean you can no longer attach a .then() to a promise, because the promises are hidden.
But this could be restored by inverting the keywords. Say we wanted to run task_a in parallel with task_b. We could use the async keyword for 'inverting' the promise back out, and sequence the tasks to be run together:
var results = Promise.all([
    async task_a(0),
    async task_b(1)
]);
Or give task_c a life entirely of its own:
(async task_c(2))
    .then(return_some_day);
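That inverted syntax is hypothetical, but in today's real syntax the same effect comes from starting tasks without awaiting them immediately; task_a and task_b here are illustrative stubs:

```javascript
// Hypothetical stub tasks, standing in for any async work.
async function task_a(x) { return x; }
async function task_b(x) { return x * 10; }

async function runBoth() {
    var pa = task_a(0);  // starts immediately, not awaited yet
    var pb = task_b(1);  // starts while task_a is still in flight
    return await Promise.all([pa, pb]);  // resolves with [0, 10]
}

runBoth().then(function (results) {
    console.log(results);  // [ 0, 10 ]
});
```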
People start to hide under their chairs and take cover from the impending doom.
Someone screams "that is a pthread_join!" and points at a C program:
pthread_join(task_a, &a);
pthread_join(task_b, &b);
Another one cries "You have brought concurrency upon us! You shall be burned at the stake!!" The C program continues:
pthread_create(&running_task_c, NULL, task_c, NULL);
But nothing was introduced except new syntax. Therefore async and await must be to blame. But then again, that too is nothing but new syntax. Therefore promises must bring the concurrency. But promises just wrap callbacks. Therefore callbacks must provide the concurrency.
All this time, during the last ten years, some people knew that JavaScript had a way to write concurrent programs, in the form of callbacks. Ever since we got XMLHttpRequest, JavaScript has been a language that supports concurrency.
We get to the world where people propose we should use the async and await constructs. They will later propose we should use them sparingly, and be aware of all the daemons that concurrency can bring upon us.
Concurrency loosens atomicity constraints in the code. If you have a resource that can be written and read, and you have to use async functions, you may see a bug where portions of that resource are written by another task while your program hasn't finished reading it, corrupting the state of the reading program. This kind of bug can also happen in reverse, with the problems appearing on the writer's end. And the bug is nearly always nondeterministic: the reads and writes must happen interleaved for the bug to occur.
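A minimal sketch of such an interleaving bug, assuming a shared account object and a delay helper that stands in for any await point (network, disk, timer):

```javascript
// Two async tasks share 'account'. Each does a read-modify-write
// with an await in between, so their steps can interleave.
var account = { balance: 100 };

function delay(ms) {
    return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

async function withdraw(amount) {
    var seen = account.balance;       // read
    await delay(10);                  // another task may run here
    account.balance = seen - amount;  // write based on a stale read
}

async function main() {
    await Promise.all([withdraw(60), withdraw(60)]);
    // Both withdrawals read 100 before either wrote, so the final
    // balance is 40, not -20: one withdrawal was silently lost.
    console.log(account.balance);  // 40
}
main();
```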
To fix that kind of bug you may have to reorganize your code a lot. Or you introduce a semaphore. Just prepare to be burned twice at the stake, as async and await were not supposed to have this kind of problem.
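Such a semaphore can be sketched with promises themselves; the Mutex below is an illustrative toy, not a library API:

```javascript
// A minimal promise-based mutex: each caller chains its task onto
// the tail of a queue, so critical sections run one at a time.
function Mutex() {
    this.tail = Promise.resolve();
}
Mutex.prototype.run = function (task) {
    var result = this.tail.then(task);
    // Keep the chain alive even if a task fails.
    this.tail = result.then(function () {}, function () {});
    return result;
};

var lock = new Mutex();
var account = { balance: 100 };

function delay(ms) {
    return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

async function withdraw(amount) {
    return lock.run(async function () {
        var seen = account.balance;
        await delay(10);
        account.balance = seen - amount;  // safe: no interleaving now
    });
}

Promise.all([withdraw(60), withdraw(60)]).then(function () {
    console.log(account.balance);  // -20: both withdrawals applied
});
```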
Promises provide their own complexity when it comes to reasoning about programs. With promises you have two prominent ways to represent sequential execution and returning calls. On top of that, you have an additional way to raise errors. Once the new syntax arrives, all programming constructs start to get variations that work with promises. Languages end up implementing every one of their control flow constructs twice to support async use.
Promises are quite heavyweight in how they are implemented. As language vendors look for optimized variants, the language implementations begin their painful, but eventual, transformation into supporting continuations. The complex behavior of promises is easy to represent efficiently with continuations, because continuations allow you to transform a return from a function into a callback.
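That last transformation, a return turned into a callback, is the core of continuation-passing style, and fits in a few lines; add and addCPS here are illustrative:

```javascript
// Direct style: the result comes back as a return value.
function add(a, b) {
    return a + b;
}

// Continuation-passing style: the 'return' becomes a call to the
// continuation k, which receives the result instead.
function addCPS(a, b, k) {
    k(a + b);
}

console.log(add(1, 2));                            // 3
addCPS(1, 2, function (sum) { console.log(sum); }); // 3
```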
The idea of promises, callbacks hidden inside them, is much more complicated than honest concurrency with continuations. People will confuse their promises with values. They keep using promises because they are considered safer than true asynchronous co-operative control flow. But in reality, callback-partitioned programs are still, at heart, true concurrency, capable of all the evil that concurrency can bring into the world.
It will be much like the argument that static typing results in safer programs. The other side of the coin, because it is on the other side, goes unseen: the premature specialization of the code, forcing people to write more complicated code in pursuit of safety through types. They will end up either way, safer or less safe, and which way they end up cannot be determined beforehand. Either way, there is more code afterwards.
To save themselves from the world, programmers are very good at jumping into the closest boiling pot they can find.