VM Instruction Formats

The design of a binary format for virtual-machine instructions is governed by these considerations:

In an academic setting, the instruction format should be easy to debug.

The SVM instruction format is inspired by the MIPS instruction set, developed at Stanford in the 1980s. MIPS, which powered the Sony PlayStation and PlayStation 2, uses a 32-bit instruction word and supports 32 hardware registers. The SVM instruction format also nods to the instruction format of the Lua virtual machine, which supports 512 virtual registers.

An SVM instruction fits in a 32-bit word. It includes an 8-bit opcode in bits 24 to 31 (the most significant bits); the remaining 24 bits may encode three 8-bit register names, one 8-bit register name and a 16-bit index, or a 24-bit signed offset. I made every field a multiple of 8 bits wide so that you can read off the fields of any instruction just by looking at the word rendered in hexadecimal: each 8-bit field is exactly two hex digits. An 8-bit field can name 256 different registers, and that will be enough for us.
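To make the hexadecimal reading concrete, here is a minimal sketch in C. The helper names index16 and offset24 are hypothetical, not part of the SVM. The sketch assumes that the 16-bit index sits in bits 0 to 15, and that the 24-bit signed offset sits in bits 0 to 23 in two's-complement form; the text above fixes only the position of the opcode.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Instruction;  // one 32-bit instruction word

// Hypothetical helper: the 16-bit index, ASSUMED to sit in bits 0 to 15.
static inline uint16_t index16(Instruction i) {
    return (uint16_t)(i & 0xffffu);
}

// Hypothetical helper: the 24-bit signed offset in bits 0 to 23,
// ASSUMED to be two's complement; sign-extends it to 32 bits.
static inline int32_t offset24(Instruction i) {
    int32_t v = (int32_t)(i & 0xffffffu);
    return (v & 0x800000) ? v - 0x1000000 : v;
}

// Each 8-bit field is exactly two hexadecimal digits of the word.
void check_hex_reading(void) {
    Instruction i = 0x2a01ffffu;          // reads as 2a | 01 | ffff
    assert((i >> 24) == 0x2a);            // opcode: bits 24 to 31
    assert(index16(i) == 0xffff);         // low 16 bits as an unsigned index
    assert(offset24(0x07fffffeu) == -2);  // 0xfffffe is -2 in 24 bits
    assert(offset24(0x07000010u) == 16);  // small positive offset
}
```

The offset is sign-extended with a subtraction rather than a right shift of a signed value, because right-shifting a negative number is implementation-defined in C.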

The four bytes of the instruction are named OP, X, Y, and Z. The (unsigned) values of these fields are extracted by decoding functions opcode, uX, uY, and uZ.
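The decoding functions might be sketched as follows. This is an illustration under stated assumptions, not the course's actual code: the text places only OP (bits 24 to 31), so the sketch assumes that X, Y, and Z follow in order from more significant to less significant bytes. The Instruction typedef and the uint8_t return types are assumptions as well.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Instruction;  // one 32-bit SVM instruction word

// OP occupies bits 24 to 31, as stated in the text.
static inline uint8_t opcode(Instruction i) { return (uint8_t)(i >> 24); }

// ASSUMED layout: X in bits 16 to 23, Y in bits 8 to 15, Z in bits 0 to 7.
static inline uint8_t uX(Instruction i) { return (uint8_t)((i >> 16) & 0xff); }
static inline uint8_t uY(Instruction i) { return (uint8_t)((i >> 8) & 0xff); }
static inline uint8_t uZ(Instruction i) { return (uint8_t)(i & 0xff); }

// Decode a sample word; the fields can be read straight off the hex digits.
void check_decoders(void) {
    Instruction i = 0x2a010203u;
    assert(opcode(i) == 0x2a);
    assert(uX(i) == 0x01);
    assert(uY(i) == 0x02);
    assert(uZ(i) == 0x03);
}
```

Because each function only shifts and masks, a compiler can inline all four, so decoding adds very little cost inside an interpreter loop.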

The SVM supports four instruction formats:

The formats and the decoding functions tell you everything you need to know to write your vmrun function in module 1. The encoding functions won’t be used until module 2.
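For orientation, encoding is simply the inverse of decoding. The sketch below uses hypothetical names (encode_xyz and encode_x_index are not the names used in the course) and assumes the same byte order as above: OP in the top byte, then X, then Y, then Z, with a 16-bit index occupying the Y and Z bytes.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Instruction;  // one 32-bit SVM instruction word

// Hypothetical encoder for the three-register layout (ASSUMED byte order).
Instruction encode_xyz(uint8_t op, uint8_t x, uint8_t y, uint8_t z) {
    return ((uint32_t)op << 24) | ((uint32_t)x << 16)
         | ((uint32_t)y << 8)   |  (uint32_t)z;
}

// Hypothetical encoder for one register plus a 16-bit index.
Instruction encode_x_index(uint8_t op, uint8_t x, uint16_t index) {
    return ((uint32_t)op << 24) | ((uint32_t)x << 16) | (uint32_t)index;
}

// The encoded words read back field by field in hexadecimal.
void check_encoders(void) {
    Instruction i = encode_xyz(0x2a, 1, 2, 3);
    assert(i == 0x2a010203u);
    assert((i >> 24) == 0x2a);  // the opcode lands in bits 24 to 31
    assert(encode_x_index(0x10, 7, 0xbeef) == 0x1007beefu);
}
```

Each argument is widened to uint32_t before shifting so that the shift by 24 never happens in a signed int.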