Compiler in Plain Python: Instruction Select
MenuetOS binaries are roughly the same form as the source code it is assembled from. I feel that kind of isomorphism as empowering. It drove me to writing a compiler in python. This far I've started on instruction selection. The source code is in my github account.
At first I tried to get LLVM compile out unformatted binaries. Only way to do it would be to write a linker script and there would still be a chance that it will generate wrong code. Besides LLVM doesn't have a target for x86 real mode.
Despite I didn't get LLVM to do what I want I found its documentation valuable, especially the descriptions about the code generator. It describes many intricacies done to support difficult platforms.
The code I wrote is doing instruction selection. I started on that because I want to generate assembly that resembles the source code in Menuet OS. I'm planning to make it emit FASM for easier comparison with Menuet OS sources and to avoid having to implement an assembler. I've tried to do something similar before but I used to start it by writing a parser and ended up with poor code generation step. I'm noticing that doing it opposite way gives me enormous leverage in form of early compilation results.
Code generates optimal instruction tiling by dynamic programming taking in one tree structure at a time. Ultimately I don't want to supply the input in a tree structure but after reading the Near-Optimal Instruction Selection on DAGs it convinced me that I can extend this simple instruction selector to generate good code for anything else as well.
I haven't set up any benchmarks to tune the costs for each tiling so the optimality of the selected assembly instructions are subjective. Doing that would be hard when it's not doing anything else than selecting instructions.
I am testing my instruction selector by giving it input structures generated by python code. Another input it gets is the instruction tiling patterns with associated costs and code for generating each pattern. The code traverses the given input in postorder comparing it to the tiling patterns. At every tree node it selects the least costing tile. For example on Const(0)
it selects xor dst, dst
to mimick how assembly programmer would encode a zero constant using exclusive or -instruction. On Op('sub', [Const(0), Const(6)])
it emits move dst, src1
to work around two-operand instruction sets and sub dst, 6
because the rightmost side is a constant and on leftmost side the zero constant was cheap enough to produce compared to alternatives it had in the tiling table.
Next I'll write a traditional register allocator without spilling. The compiling just fails if the register allocation has to spill a register. At that point it may already generate machine code that can be evaluated by an intel machine.