Virtual object code

All term we will be implementing new languages by adding a little bit to a language we already have. Our first new language is virtual object code, which adds these three little features to virtual-machine code:

The design I have chosen for the on-disk format exemplifies a strategy that works well for putting any kind of binary data on disk: represent a data structure on disk as a program that, when interpreted, materializes the data structure. Virtual object code is just such a program, and to describe some aspects of it, I have given it a tiny operational semantics.
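To make the "data structure as a program" strategy concrete, here is a minimal sketch in Python. It is not the virtual object code format: the two instructions `push` and `list`, and the function names `emit` and `materialize`, are hypothetical and chosen only for illustration. The sketch writes a nested list of integers as a sequence of text instructions, then rebuilds the list by interpreting those instructions with a stack.

```python
def emit(value):
    """Return instruction lines that rebuild `value` when interpreted.

    `value` is an int or a (possibly nested) list of ints.
    The instruction names `push` and `list` are hypothetical.
    """
    if isinstance(value, list):
        lines = []
        for item in value:
            lines.extend(emit(item))          # first build every element
        lines.append(f"list {len(value)}")    # then collect them into a list
        return lines
    return [f"push {value}"]


def materialize(lines):
    """Interpret instruction lines and return the data structure they describe."""
    stack = []
    for line in lines:
        op, arg = line.split()
        n = int(arg)
        if op == "push":
            stack.append(n)
        elif op == "list":
            start = len(stack) - n
            if start < 0:
                raise ValueError(f"'{line}' needs {n} values on the stack")
            items = stack[start:]             # the top n values, in order
            del stack[start:]
            stack.append(items)
        else:
            raise ValueError(f"unknown instruction: {op}")
    if len(stack) != 1:
        raise ValueError("program must leave exactly one value")
    return stack[0]


program = emit([1, [2, 3], []])
print("\n".join(program))
print(materialize(program))
```

The on-disk form is just the text of `program`. Nothing about the reader depends on how the writer laid out the data in memory, which is what makes the strategy attractive for binary data in general.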

The main new technology needed to implement virtual object code is parsing, which is described in another handout.


You may have seen grammars and EBNF in other courses, including 105. If not, or if you want a refresher, there is decent material from Matt Might, the Wikipedia article is not terrible, and there’s a decent short summary from Pete Jinks, which also mentions railroad diagrams. Or you can consult the parsing sources mentioned in the module handout.

Here’s the grammar of our virtual object code:

<modules>     ::= { <module> }

<module>      ::= .load module <length> \n
<body>        ::= { <instruction> }

<instruction> ::= <opcode> {<operand>} \n
               |  <opcode> <register> <literal> \n
               |  .load <register> function <arity> <length> \n

<literal>     ::= true | false | <number> | emptylist | nil
               |  string <length> { <byte> }


<opcode> is the mnemonic name of an opcode (single token)
<operand> is an integer literal representing a register number or immediate value
<register> is an integer literal in the range 0..255
<number> is a numeric literal (integer or floating point)
<length> is an integer literal
<arity> is an integer literal
<byte> is an integer literal in the range 0..255
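As an illustration of how one production of this grammar might be parsed, the following Python sketch handles `<literal>` given a list of whitespace-separated tokens. It rests on assumptions that the grammar alone does not settle: that `<length>` in a string literal counts the `<byte>` tokens that follow, and that the Python values used to represent `emptylist`, `nil`, and strings (`[]`, `None`, and `bytes`) are acceptable stand-ins for VM values. The name `parse_literal` is hypothetical.

```python
def parse_literal(tokens):
    """Parse one <literal> from the front of `tokens`.

    Returns (value, remaining_tokens). Raises ValueError on malformed input.
    """
    if not tokens:
        raise ValueError("expected a literal, found end of line")
    head, rest = tokens[0], tokens[1:]

    if head == "true":
        return True, rest
    if head == "false":
        return False, rest
    if head == "emptylist":
        return [], rest          # stand-in representation (assumption)
    if head == "nil":
        return None, rest        # stand-in representation (assumption)

    if head == "string":
        if not rest:
            raise ValueError("string literal is missing its length")
        length = int(rest[0])    # assumed to count the bytes that follow
        if length < 0:
            raise ValueError("string length must be nonnegative")
        byte_tokens = rest[1:1 + length]
        if len(byte_tokens) != length:
            raise ValueError("string literal is shorter than its length")
        # bytes() itself rejects values outside 0..255 with ValueError
        data = bytes(int(b) for b in byte_tokens)
        return data, rest[1 + length:]

    # Otherwise the token must be a <number>: integer or floating point.
    try:
        return int(head), rest
    except ValueError:
        return float(head), rest  # raises ValueError if not numeric


print(parse_literal("string 2 104 105".split()))
print(parse_literal("3.5".split()))
```

Because a string literal announces its length before its bytes, the parser never has to search for a closing delimiter or handle escape sequences; it reads a count and then consumes exactly that many tokens.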

A virtual object file is a program that defines a list of modules and that also has side effects on a VM state. The program has a well-defined semantics, which I’ll describe informally.