Webassembly is friendship

Dear blog and a small part of the Internet. I have met an unexpected companion.

I wrote a Lever program that opens and translates Webassembly binaries into IR code. It is giving my Lever compiler framework a flying start!

Webassembly provides me with precompiled programs in an easy-to-type-check format. Thanks to it, I can start building my compiler backend tomorrow.
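
To show what "easy to type-check" means in practice, here is a minimal sketch in Python rather than Lever. It assumes a tiny, hand-picked subset of instructions and only handles straight-line code; the `SIGNATURES` table and the `check` function are made up for illustration and are not part of my translator.

```python
# Illustrative subset: instruction name -> (types popped, types pushed).
SIGNATURES = {
    "i32.const": ([], ["i32"]),
    "f64.const": ([], ["f64"]),
    "i32.add": (["i32", "i32"], ["i32"]),
    "f64.add": (["f64", "f64"], ["f64"]),
    "i32.eqz": (["i32"], ["i32"]),
    "f64.convert_i32_s": (["i32"], ["f64"]),
}


def check(body, results):
    """Return True if the straight-line body leaves exactly `results` on the stack."""
    stack = []
    for op in body:
        params, outs = SIGNATURES[op]
        start = len(stack) - len(params)
        # The top of the type stack must match the operand types exactly.
        if start < 0 or stack[start:] != params:
            return False
        del stack[start:]
        stack.extend(outs)
    return stack == results


print(check(["i32.const", "i32.const", "i32.add"], ["i32"]))  # True
print(check(["i32.const", "f64.const", "i32.add"], ["i32"]))  # False
```

Because every instruction has a fixed signature, one pass with a stack of types is enough. Real validation also has to deal with blocks, branches and locals, which this sketch leaves out.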

Decoding and encoding of Webassembly

The only obstacle in using .wasm files is that you have to get them decoded somehow.

I despise menial data entry clerk work. If something has already been written, I would prefer not to write it again. Therefore I wrote a machine-readable specification for Webassembly files. You can read the spec here.

The spec is written in a style that can be fed through different interpreters to produce both a decoder and an encoder. The decoder gives you completely disassembled Webassembly files.
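
The idea of one description driving both directions can be sketched roughly as follows, in Python rather than Lever. The notation here (`HEADER`, `SECTION` and the field kinds) is invented for this example and is not the notation of my spec. It assumes the Webassembly 1.0 binary layout: a 4-byte magic, a little-endian 32-bit version, then sections made of an id byte and a LEB128-sized payload. Section payloads are left undecoded.

```python
import struct


def read_varuint(data, pos):
    """Decode an unsigned LEB128 integer, returning (value, new_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def write_varuint(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


# Hypothetical spec notation: a list of (field name, field kind).
HEADER = [("magic", "bytes4"), ("version", "u32le")]
SECTION = [("id", "byte"), ("payload", "varbytes")]


def decode(spec, data, pos=0):
    """Interpret a spec as a decoder, returning (record, new_pos)."""
    record = {}
    for name, kind in spec:
        if kind == "bytes4":
            record[name], pos = data[pos:pos + 4], pos + 4
        elif kind == "u32le":
            record[name], pos = struct.unpack_from("<I", data, pos)[0], pos + 4
        elif kind == "byte":
            record[name], pos = data[pos], pos + 1
        elif kind == "varbytes":
            size, pos = read_varuint(data, pos)
            record[name], pos = data[pos:pos + size], pos + size
        else:
            raise ValueError("unknown field kind: " + kind)
    return record, pos


def encode(spec, record):
    """Interpret the same spec as an encoder."""
    out = bytearray()
    for name, kind in spec:
        value = record[name]
        if kind == "bytes4":
            out += value
        elif kind == "u32le":
            out += struct.pack("<I", value)
        elif kind == "byte":
            out.append(value)
        elif kind == "varbytes":
            out += write_varuint(len(value)) + value
        else:
            raise ValueError("unknown field kind: " + kind)
    return bytes(out)


def decode_module(data):
    header, pos = decode(HEADER, data)
    sections = []
    while pos < len(data):
        section, pos = decode(SECTION, data, pos)
        sections.append(section)
    return header, sections


def encode_module(header, sections):
    return encode(HEADER, header) + b"".join(encode(SECTION, s) for s in sections)


# A module holding one type section (id 1) with an empty vector of types.
module = b"\x00asm\x01\x00\x00\x00\x01\x01\x00"
header, sections = decode_module(module)
assert encode_module(header, sections) == module
```

The field list is written once, and the round trip at the end is a cheap sanity check that the two interpreters agree with each other.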

WebKit developers helpfully pointed out that they had already given the opcode tables the JSON treatment. The guys at WebKit have spread their opcode table sideways, so reading it is a bit like reading a treasure map. (YARR!)

Shared ammunition is magic

There will be an endless stream of hipsters porting C and C++ programs to Webassembly. The ability to run Webassembly files will let me channel this power of androgynous hairstyles into the Lever programming language and leverage it.

Although getting a C library to work in Lever is easy, C++ is a different thing. Using anything C++ in Lever is difficult, so I would be delighted to use, for example, this ammo.js port of the Bullet engine.

The intention is to make this work in both directions. To get there I will need to implement Emscripten's relooper and LLVM treegraphs.

But we thought Lever is a dynamic language?

Lever's runtime is written in Python. It uses the RPython toolchain from PyPy to translate itself into C, which is then compiled by a C compiler down to machine code. In the process Lever gets a marvellous just-in-time compiler into its runtime.

I am planning to implement the same features in Lever, and design the language to support the concepts. This means that you can design the programs in a dynamic language and then compile them down to SPIR-V or Webassembly modules.

Though, to get this to work better and in a more permissive manner, I need a piece of a compiler inside my standard libraries. Webassembly lets me start this small compiler project before the translation from Lever to Lever's IR is in place.

Your instruction tables are my instruction tables

If you happen to know how many of me there are, you would think I'm picking too big a chunk to chew. I am essentially writing a compiler backend for x86_64 as a one-man team.

It'll be a proper compiler that properly selects instructions and allocates the available registers to as many variables as it can. How could I come up with anything like this in a short time?

The trick here is that Google and Intel have already written a good part of my compiler for me. Intel was the data entry clerk, and Google was just being Google and reprocessing the data. The Intel instruction manual has been made machine-readable by Google's CPU-instructions project.

So I learned how protobuf works, just enough to get these tables translated to JSON. I'm so glad to have them that I'll implement a protobuf library for my language some day.

A tiny bit of parsing on these results, and I'll have a tool to assemble and disassemble x86_64 instructions. The file driving such a system is so large that I may have to split out the instructions the compiler actually uses.
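
To give a feel for how a table can drive both directions, here is a minimal sketch in Python. The `TABLE` below is hand-written for illustration and covers only a few single-byte x86_64 instructions; it is not the data from Google's project, whose format I am not reproducing here. Real encodings also need prefixes, ModRM bytes and immediates, which this ignores.

```python
# Register numbers as used in the low three bits of the opcode byte.
REGISTERS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

# Illustrative table: mnemonic -> (base opcode, operand kind).
TABLE = {
    "nop": (0x90, None),
    "ret": (0xC3, None),
    "int3": (0xCC, None),
    "push": (0x50, "reg"),  # 50+rd
    "pop": (0x58, "reg"),   # 58+rd
}


def assemble(mnemonic, operand=None):
    """Encode one instruction from the table into bytes."""
    opcode, kind = TABLE[mnemonic]
    if kind == "reg":
        return bytes([opcode + REGISTERS.index(operand)])
    return bytes([opcode])


def disassemble(code):
    """Decode a byte string into a list of (mnemonic, operand) pairs."""
    result = []
    for byte in code:
        for mnemonic, (opcode, kind) in TABLE.items():
            if kind == "reg" and opcode <= byte < opcode + 8:
                result.append((mnemonic, REGISTERS[byte - opcode]))
                break
            if kind is None and byte == opcode:
                result.append((mnemonic, None))
                break
        else:
            raise ValueError("unknown opcode byte: 0x%02x" % byte)
    return result


code = assemble("push", "rbp") + assemble("pop", "rbp") + assemble("ret")
print(code.hex())         # 555dc3
print(disassemble(code))  # [('push', 'rbp'), ('pop', 'rbp'), ('ret', None)]
```

Splitting out a subset for the compiler would then amount to filtering the table down to the mnemonics the instruction selector can emit.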

A bit of mapping and preprocessing, and I have everything I need to run an instruction selector for most desktop PCs today.