Javascript Needs Module Loaders

Today we have access to tons of things from javascript. You can play audio with it, draw 3D graphics to the browser window, grab mouse input, and even make the browser window fullscreen. If you're going to use these features, you'll face two problems: how to fetch everything you need, and how to organize it all.

Javascript was designed for simple scripts written by nonprofessional programmers. That might be the reason it became popular in the first place: a limited feature set and a simple, robust design. There was not much that could have gone entirely wrong for whoever designed it. Language design is hard, so it was a good choice at the time.

Now javascript has the fastest interpreted language implementations I know of. In most situations it's slower than C by a factor of ten or less. With that kind of performance it's idiotic to write large programs in, for example, C++, because you sacrifice development speed and maintainability just to get a program that runs in 2 milliseconds instead of 20. That makes it very interesting to write all sorts of programs in javascript.

Unfortunately javascript is still on the same level as C when it comes to structuring your programs. You just include files into the main script, which is the html. There is no concept of a module or a module boundary. On the other hand this might be fortunate, because a function boundary easily turns into a module boundary. In its default state, javascript leaves you to maintain the module boundary yourself. It worked for C, after all! The problem comes when programs become more complex, as they do when there are tons of useful things you can do.

For instance, let's say you want to embed a code editor into your website. The Ace and CodeMirror editors are available to you. You 'import' them into your website by downloading a couple of files, putting them in place, and referring to them from the html file. There's no other way to do it. It gets worse if you need to provide or use a library which depends on something like CodeMirror: either the whole editor gets embedded along with the library, or the user is asked to include the scripts in the html file. And what happens when a new version of the editor rolls out and you want to update? You remove all the references to the old editor by hand, then add the references to the new version by hand, plus references to any new files the editor might have introduced.

What would happen if we modularised it all? Every app and library would organize themselves into a directory and reference or bundle the modules they need. Adding a nice web app would require just a single line - <import src="nice_web_app"/>. Or perhaps even better by using a CDN - <import src="cdn.example.org/meganetwork" apikey="0101010101"/> Now you know why I'm writing this post. The remaining part describes what is needed to implement a module system that's good.

Proposal of Modules

You would want the website to take care of all the loading for you. Every script and asset would have to be downloaded before the program runs. The program would then get all the resources as ArrayBuffer objects and piece itself together in whatever way it needs to. Either a simple .json file or a tar archive could take care of this task. The browser or website could show a progress bar while loading.
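To make the tar option concrete, here is a minimal sketch of turning one downloaded archive into a set of ArrayBuffer objects. It assumes the plain ustar layout (512-byte blocks, file name at offset 0, octal size at offset 124, type flag at offset 156) and ignores checksums, long names and anything that isn't a regular file. The name untar is made up for this example.

```javascript
// Minimal tar reader: returns an object mapping file name -> ArrayBuffer.
// Assumptions: regular files only, names under 100 bytes, no checksum check.
function untar(archive) {
  var bytes = new Uint8Array(archive);
  var files = {};
  var offset = 0;

  // Read a NUL-terminated ASCII field out of a header block.
  function field(start, length) {
    var text = "";
    for (var i = start; i < start + length && bytes[i] !== 0; i++) {
      text += String.fromCharCode(bytes[i]);
    }
    return text;
  }

  while (offset + 512 <= bytes.length) {
    var name = field(offset, 100);
    if (name === "") break; // an empty name means we hit the zero padding
    var size = parseInt(field(offset + 124, 12).trim(), 8) || 0;
    var typeflag = field(offset + 156, 1);
    var start = offset + 512;
    // "0" (or an empty flag in old archives) marks a regular file.
    if (typeflag === "0" || typeflag === "") {
      files[name] = archive.slice(start, start + size);
    }
    // File data is padded up to the next 512-byte boundary.
    offset = start + Math.ceil(size / 512) * 512;
  }
  return files;
}
```

Each returned ArrayBuffer could then become the contents of one module.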

First we need a function which loads a program from sources and lets us evaluate it, while giving us access to control the scope of the program and specify the source of the program for debugging. We almost have such a function. It's conveniently called Function.

Earlier I said that a function scope boundary can easily be turned into a module boundary. Function kind of does that: you pass it argument names and the source code for the function body, and it returns a function. The form and structure of the arguments just aren't very convenient, and it does not handle source maps properly. You would need something more like this, except with support for source maps:

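var buildFunction = function(source, options, namespace) {
  var arg, args, name, vars;
  // Name the generated code so debuggers can show where it came from.
  if (options.sourceURL != null) {
    source += "\n//# sourceURL=" + options.sourceURL;
  }
  // Split the namespace into parameter names and the values to pass in.
  vars = [];
  args = [];
  for (name in namespace) {
    arg = namespace[name];
    vars.push(name);
    args.push(arg);
  }
  // Function(name1, ..., body) builds the function; call it right away.
  vars.push(source);
  return Function.apply(null, vars).apply(null, args);
};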

By using that function, you could then load any script, giving them the right parameters to run:

buildFunction(source, {sourceURL: "something.js"}, {
  module:  module,
  // Return the result, otherwise require() always yields undefined.
  require: function(name) { return module.require(name); },
  exports: module.exports
});

You could build any kind of module system on top of this, which is a great thing. This way the solution not only solves the problem but also makes it easier to redesign things if the old module system ever turns out to be bad. As a last-resort fallback, the <script> tags would still remain too. It also opens the door for custom module loaders, whenever anyone needs such a thing.
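As a rough illustration of a custom loader built on nothing but Function, here is a tiny sketch that runs two modules out of an in-memory table. The sources table, the cache and the name requireFrom are all made up for this example; buildFunction is repeated in compact form so the snippet runs on its own.

```javascript
// Same idea as buildFunction above, repeated so the sketch is self-contained.
function buildFunction(source, options, namespace) {
  var names = [], values = [];
  if (options.sourceURL != null) source += "\n//# sourceURL=" + options.sourceURL;
  for (var name in namespace) { names.push(name); values.push(namespace[name]); }
  names.push(source);
  return Function.apply(null, names).apply(null, values);
}

// Hypothetical in-memory "file system": path -> source text.
var sources = {
  "greet.js": "exports.greet = function(who) { return 'hello ' + who; };",
  "main.js":  "var g = require('greet.js'); module.exports = g.greet('world');"
};
var cache = {};

// Run a module once, remember it, and hand back whatever it exported.
function requireFrom(path) {
  if (cache[path]) return cache[path].exports;
  if (!(path in sources)) throw new Error("module not found: " + path);
  var module = cache[path] = { exports: {} };
  buildFunction(sources[path], { sourceURL: path }, {
    module: module,
    exports: module.exports,
    require: requireFrom
  });
  return module.exports;
}

console.log(requireFrom("main.js")); // prints "hello world"
```

Swapping the sources table for something else (a tar archive, a json file, a network fetch) changes the loader without touching the scripts it loads.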

What is a Module

So every module would form a small, complete filesystem containing everything the app needs to run. What should the module look like and how would you fill the structure? There would be a module for every file. Every module would contain the ArrayBuffer of the file it represents. Overall, modules would contain the following:

.buffer - ArrayBuffer object, file contents.
.url - complete url of the object.
.parent - parent module of the object, or null.
.basename - name of the module.
.submodules - object containing modules that are submodules of this module.
.type - either "file", "directory", or "package".
.resolve - a function, which resolves a path relative to this module.
.require - implementation of code, which initializes the module if it is a script.
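The data fields in the list above could be sketched as a constructor like the following. This is only one possible shape: .resolve and .require are left out here, and the attach helper is a hypothetical convenience that isn't part of the list.

```javascript
// A sketch of the module record; field names follow the list above.
function Module(type) {
  this.type = type;        // "file", "directory", or "package"
  this.buffer = null;      // ArrayBuffer with the file contents
  this.url = null;         // complete url of the object
  this.parent = null;      // parent module, or null at the root
  this.basename = "";      // name of the module
  this.submodules = {};    // basename -> child module
}

// Hypothetical helper (not in the list): link a child under a parent.
Module.prototype.attach = function(child, basename) {
  child.basename = basename;
  child.parent = this;
  this.submodules[basename] = child;
  return child;
};

// Build a three-level tree: root directory -> package -> script file.
var root = new Module("directory");
var app = root.attach(new Module("package"), "app");
var main = app.attach(new Module("file"), "main.js");
main.buffer = new Uint8Array([49, 43, 49]).buffer; // the bytes of "1+1"
```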

You could fill an entry into this structure like this, assuming you have properly working dirname and basename:

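// addModule is an illustrative name for this helper.
function addModule(root, type, path, url, buffer) {
  var module = new Module(type);
  module.buffer = buffer;
  module.url = url;
  // Set the basename before registering, since it is the key in submodules.
  module.basename = basename(path);
  module.parent = root.resolve(dirname(path));
  module.parent.submodules[module.basename] = module;
  return module;
}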

If you search for dirname and basename code in javascript you will find regex implementations by PHP programmers. Do not use them: they are broken and only appear to work.
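For reference, here is one regex-free way to write the two helpers. It's a sketch that assumes forward-slash paths and roughly follows the behaviour of the unix commands of the same name. The splitPath helper is made up for this example.

```javascript
// Strip trailing slashes (keeping a lone "/") and find the last separator.
function splitPath(path) {
  var end = path.length;
  while (end > 1 && path.charAt(end - 1) === "/") end--;
  path = path.slice(0, end);
  return { path: path, cut: path.lastIndexOf("/") };
}

// Last component of the path, e.g. "assets/knight.png" -> "knight.png".
function basename(path) {
  var p = splitPath(path);
  if (p.path === "/") return "/";
  return p.path.slice(p.cut + 1);
}

// Everything before the last component, e.g. "assets/knight.png" -> "assets".
function dirname(path) {
  var p = splitPath(path);
  if (p.cut === -1) return "."; // no separator: the current directory
  if (p.cut === 0) return "/";  // directly under the root
  var end = p.cut;
  // Collapse repeated separators such as "a//b".
  while (end > 1 && p.path.charAt(end - 1) === "/") end--;
  return p.path.slice(0, end);
}
```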

Module Resolution

So what exactly would require do? It would first use .resolve to look up the module for a path. That would look somewhat like this:

var module = this.resolve(path);

if ((module != null) && module.type === 'directory') {
  module = module.resolve('index.js');
}

if (module == null) {
  module = this.resolve(path + '.js');
}

If no module is found, it throws an error. Otherwise it looks for module.exports; if that isn't there yet, it uses the following code to create it and return it.

module.exports = {};
module.loaded  = false;
// Decode the raw file contents into source text.
var source = new TextDecoder('utf-8').decode(
    new Uint8Array(module.buffer));
buildFunction(source, {sourceURL: module.url}, {
    module:  module,
    exports: module.exports,
    require: function(path) { return module.require(path); }
});
module.loaded = true;
return module.exports;

What would resolve do? It would parse something like a unix path. Here are a few examples:

assets/knight
looks up inside the closest directory module.

/webgl/utility
backtracks to the root module before the lookup.

@/plop
backtracks to the closest package module before the lookup.

../blah
backtracks one level, to the parent, before the lookup.

This system would allow you to access any file within the module tree.
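The four path forms above could be handled by something like the following sketch. It's written as a free function over plain objects with .type, .parent and .submodules rather than as a method, and the node helper and the example tree are made up for this illustration.

```javascript
// Resolve `path` relative to `start`; returns the module found, or null.
function resolve(start, path) {
  var current = start;
  var parts = path.split("/");

  if (parts[0] === "") {
    // Leading "/": back up to the root module.
    while (current.parent) current = current.parent;
    parts.shift();
  } else if (parts[0] === "@") {
    // Leading "@/": back up to the closest package module.
    while (current && current.type !== "package") current = current.parent;
    parts.shift();
  } else if (current.type === "file") {
    // Plain relative path: start from the directory holding this file.
    current = current.parent;
  }

  for (var i = 0; i < parts.length && current; i++) {
    var part = parts[i];
    if (part === "" || part === ".") continue;
    if (part === "..") { current = current.parent; continue; }
    current = Object.prototype.hasOwnProperty.call(current.submodules, part)
      ? current.submodules[part] : null;
  }
  return current || null;
}

// Hypothetical helper that builds a small tree to try the function on.
function node(type, name, parent) {
  var n = { type: type, basename: name, parent: parent || null, submodules: {} };
  if (parent) parent.submodules[name] = n;
  return n;
}
var root    = node("directory", "");
var webgl   = node("directory", "webgl", root);
var utility = node("file", "utility", webgl);
var app     = node("package", "app", root);
var plop    = node("file", "plop", app);
var blah    = node("file", "blah", app);
var src     = node("directory", "src", app);
var main    = node("file", "main", src);
var assets  = node("directory", "assets", src);
var knight  = node("file", "knight", assets);
```

With this tree, resolving "assets/knight", "/webgl/utility", "@/plop" and "../blah" from main reaches knight, utility, plop and blah respectively.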

Demonstration

If it works, you'll see that my website has this background, which may look a bit annoying to some people as it's animated. That's a webgl background packaged as a tar archive. As an extra I've bundled coffeescript in, and I compile the scripts needed to run the background inside the browser. The code requires lots of 'bootup' scripts and might not be entirely robust or complete because of that.

The sources are in my Github account, as usual. The repository is coffee-modules.

Other tools trying to modularize javascript

To show that I'm not ignoring existing work on bringing modules to javascript, here is how other tools differ from this proposal.

RequireJS

RequireJS does not fetch non-script assets for you, or keep track of them. Also, if you're suffering from async brace pyramids of doom, it will make your brace pyramids one level deeper.

npm and browserify

Browserify packs everything into a single file. That also works, but it needs to encode binary files and images to do so. I'm not sure whether it keeps track of the assets inside the modules.

ECMAScript6 modules

ES6 modules introduce new syntax, which lets you import relative to the URL. They do not handle non-script files either. They are supposed to work together with the web components spec to construct a website. On the upside, they apparently do not require any kind of package summary to function well.

Why would you want the module system to handle anything other than scripts?

Lots of the existing tools handle nothing but scripts. But for a script to function as promised, there are times when it needs actual data, or placeholders; the program can't function without them. Either you preload them after the program loads, adding a few async pyramids and a preloader along the way, or you have them loaded along with the code. The latter seems more sane.