Thursday, December 13, 2012

Mapping .NET IL code to Android's Dex code

The primary function of the dot42 compiler is to convert .NET byte code (IL) to Android byte code (dex). This article describes how.

Differences and similarities

First let's look at the differences and similarities between the two. IL is a stack-oriented language. This means that all operations take their operands from the stack and place their results back on the stack. Dex, in contrast, is executed by a register-based virtual machine called Dalvik. A dex operation takes operands from registers and places the result in a register. This means that a key function of the dot42 compiler is to go from a stack-based set to a register-based set.

Another difference between the two is the set of primitive types that are available. Most match exactly (bool, char, short, int, long, float and double), but there is a problem with bytes. IL has support for signed and unsigned bytes, while dex only has signed bytes. Similarly, dex has no support for unsigned 32-bit and 64-bit integers. The conversion between signed and unsigned bytes is explained later.

Furthermore, there are differences related to generics and structs. These will be the topics of future blog articles.

Stack to registers

The first conversion that takes place in the dot42 compiler is going from from stack-based single instruction to stack-based expressions. For example:

load a
load b
add
load c
add

is converted to:

add(add(load(a), load(b)), load(c))

The latter expression is much easier to alter and optimize towards the dex register-based operations.

RL - Register Language

From this expression-based format, a second conversion takes place to an intermediate format we call RL (for register language). It takes the expressions from the first conversion and converts them towards register-based instructions that are similar to dex instructions, but have a few important differences.

The first difference is that RL only has 1 variation of each instructions. For example there is only 1 "const" instruction, where Dex has "const/4", "const/16", "const" and "const/high16". All of which are variations for the size of the constant and the number of bits used by the destination register. Simplifying instructions like this makes them much easier to optimize and work with.

The second difference between RL and dex is that RL starts in single-assignment mode. That means that each register is assigned a value only once.

Each stack-based expression is now converted to RL using an approach like this:

  1. Convert all operands to RL and save the registers that hold their resulting value
  2. Allocate an RL register for the result
  3. Emit the RL instruction (with operand registers from step 1 and result register from step 2)
  4. Return the result register allocated in step 2.

Optimize RL

Once all stack-based expressions are converted to RL, we have code that is fully register-based. Now we can optimize this code and convert it to dex.

The first optimization that takes place is to share constants. An expression like add(1, 1) has resulted in two "const" instructions, each with a value of "1", but a different destination register. If both registers are never modified, we can safely share them and remove the unused "const" instruction.

A second optimization is to share registers as much as possible. For this purpose, instructions are divided into basic blocks. We then calculate the range of instructions in which a register is used. If that range does not overlap with a range of another register, we can use the same register in both ranges. In reality this is a bit more complex because you also have to deal with register pairs: in dex, long and double operands use two adjacent registers, called a register pair.

Furthermore, we perform various optimizations like nop removal and branch optimization. These are not discussed here.

From RL to Dex

Finally, it is time to convert the code to dex. Since RL is very close to dex and we already share registers as much as possible, the basic conversion is trivial.

The main problem we have to solve now, is that not all instructions take all registers as their operands. For example, all "invoke-x" instructions take only registers that fit in 4 bits. For that we apply register spilling. If an operand of an instruction does not fit within the limits of that instruction, we copy the register to a "low register", execute the instruction and then copy it back to the original register (when needed) afterwards.

Finally it is time to consider the first difference between RL and dex: the various variations for many instructions. We go over all instructions and take the best variation that fits the operands.

Debug information

During the entire process described above, we maintain the debug information that is present in the original .NET assembly. In doing so, we can generate dex debug information containing accurate line number information, variable names, etc.

But also this conversion was more complex in reality. In .NET, debug information describes ranges in a source file. Each range has a start line, a start column, an end line and an end column. Dex has (start) line numbers only. This means that if there are multiple code ranges on one line (e.g. "for (var x = i; i < 10; x++)" has 3 ranges), we have to cheat a bit to make sure that all ranges get unique line numbers. As a result, the line numbers reported in an Android exception stack trace may not be 100% correct, but stepping through the code in the Visual Studio debugger works as expected.