The P-Machine

The P-Machine: Dynamically-linked procedures for interactive fiction

Representation of procedures
Identifying procedures to the call instruction
Associating procedures with procedure IDs
Running bytecode procedures
Startup
Calling C procedures
Adding procedures on the fly
What's left?

This document sketches a proposal for a C-like approach to providing high-level services (I/O, text decompression, dictionary lookup) in a virtual machine for the support of interactive fiction. The essence of the approach is that instead of providing special bytecodes (VM instructions) for these services, we arrange for them to be provided by ordinary procedures. Hence, the ``procedure machine'' or ``P-machine.'' A crucial aspect of the proposal is that procedures may be referred to by name---this means the set of procedures can easily be extended. The names are resolved to individual procedures when the story file is loaded, just before it starts to run---hence ``dynamic linking.''

Representation of procedures

I assume that arguments are passed and results are returned on a stack of ``VM values.'' It's probably a good idea if all VM values are the same size, e.g., 32 bits.

<type definitions>= (U->) [D->]
typedef unsigned int vmvalue;  /* supposed to be 32 bits */

Defines vmvalue (links are to index).

I'm also going to assume that there's some virtual-machine state in a struct vmstate. I won't bother to specify what this is, but you can imagine memory, program counter, a value stack, whatever you like.

<type definitions>+= (U->) [<-D->]
typedef struct vmstate *Vmstate;

Defines Vmstate (links are to index).

Now, let's imagine an arbitrary procedure called from the VM. It needs a pointer to the VM state. I'm also going to assume an explicit specification of the arguments passed, so that a procedure that just messes with arguments and results can be compiled without any knowledge of the VM state. Finally, I'll assume that the procedure takes an extra argument that is somehow associated with the procedure itself. This extra argument can support an object-oriented or functional style from within C. In various different places, you'll see it called self, or a closure, or a cookie, or a rock. We'll use the term ``closure.'' A VM procedure is therefore represented as a value of type Vmproc:

<type definitions>+= (U->) [<-D]
typedef struct vmproc {
  unsigned (*f)(Vmstate vm, int argc, vmvalue *argv, void *closure);
  void *closure;
} Vmproc;

Defines Vmproc (links are to index).

The unsigned return value from f should indicate how many results were left on the stack.

Identifying procedures to the call instruction

At run time, procedures are identified by small integers, which we'll call procedure IDs. Let us assume our VM has a stack. Just before a call, the VM code should push all the arguments on the stack. Then, the very last value pushed should be a procedure ID identifying the procedure to be called. [The reason for putting the procedure ID on the stack instead of in the instruction is to make it possible for games to create new procedures on the fly. This feature is probably useless in a VM intended for adventure games, and if you could do without it you could make the procedure ID part of the call instruction itself. You'd want at least 16 bits for procedure IDs.] The call instruction uses the ID to find the procedure and call it.

Here's some hypothetical code. In real life, making the stack and sp global variables would probably be a mistake---they should be part of that struct vmstate.

<hypothetical call example>= (U->)
vmvalue stack[];                /* the VM value stack */
vmvalue *sp;                    /* pointer to the first empty slot on the stack */

Vmproc *known_procedures;       /* pointer to all known procedures */

static void vm_call(void) {     /* implements the call opcode */
  extern struct vmstate *current_state; /* pointer to VM state (bogus) */
  Vmproc *p;
  <advance VM's program counter past the call instruction>
  assert(sp > 0);
  p = &known_procedures[sp[-1]];
  sp = stack + p->f(current_state, sp - stack - 1, stack, p->closure);
}

Defines known_procedures, vm_call (links are to index).

In a real VM, of course, vm_call wouldn't be a procedure; it would be inlined.

Associating procedures with procedure IDs

The crucial question is how does the compiler know which procedures go with which procedure IDs? That question has a two-part answer, depending on whether the procedure is part of compiler-generated code (game and library code) or the interpreter.

Assume there are N procedures, with IDs ranging from 0 to N-1. M of these procedures are compiler-generated, and the rest are provided by the interpreter. Let's call the procedure IDs from 0 to M-1 the low procedure IDs, and the procedure IDs from M toN-1 the high procedure IDs. The convention is that low procedure IDs refer to procedures generated by the compiler, and high procedure IDs refer to procedures provided as part of the interpreter. The compiler gets to pick N and M, and it gets to assign procedure IDs any way it likes. But how does it tell the interpreter about them? By putting two tables into the story file.

The first table, or the bytecode procedure table, has M entries, one for each low procedure ID. Each entry points (somehow) to the information for that procedure in the story file. Such information probably includes the number of local variables used by a procedure, possibly the number of arguments it expects, and certainly the bytecodes that implement the procedure.

The second table, or the native procedure table, has N-M entries, one for each high procedure ID. Each entry is a string that gives the standard name for some procedure. The interpreter should have a list of all the standard procedures it implements, together with their names. The names are part of the VM standard; every name has to have a well-defined semantics, so that the procedure with that name does the same thing on every platform, no matter what. The name will probably encode at least a subsystem (e.g., glk), a procedure, and a version number.

The interpreter uses both tables to build the known_procedures table before the game begins. Note that if the native procedure table asks the interpreter for a name that is not known to the interpreter, that's OK---the interpreter should bind in a procedure that will issue an error message if never called. This way, a game can link to a special procedure but not call it if the procedure isn't there. [A good standard procedure would be System.provides, to ask the interpreter if it provides a procedure with a given name.]

Here's an example error procedure for a Unix interpreter:

<sample error procedure>= (U->)
static unsigned vm_unknown_proc(Vmstate vm, int argc, vmvalue *argv, void *closure) {
  fprintf(stderr, "Standard procedure `%s' is not available in this interpreter\n",
          (char *)closure);
  exit(1);
}

Defines vm_unknown_proc (links are to index).

A better procedure would try to roll the VM back to its last consistent state, so the user could have a chance to save the game.

Running bytecode procedures

You might have guessed by this point that all of the compiled procedures are going to use the name native-code procedure, but with different closures. Here's an example. I'm defining a VM with a current program counter, a value stack, a local-variable stack, a return-address stack, and global memory. To keep the example simple, I'm not going to worry about overflow or about the lengths of anything.

<vm example>= (U->) [D->]
struct vmstate {
  unsigned char *memory;
  vmvalue *locals;              /* local vars of currently active procedure */
  unsigned num_locals;          /* number of local variables */
  vmvalue *stack;               /* the value stack */
  vmvalue *sp;                  /* next available slot in stack */
  struct frame *suspended;      /* states for suspended bytecode procs */
  unsigned char *pc;            /* pointer to next bytecode to interpret */
};

The struct frame contains the information needed to implement return. It just restores the VM state as directed:

<vm example>+= (U->) [<-D->]
struct frame {
  vmvalue *locals;
  unsigned num_locals;
  unsigned char *pc;            
};

Imagine this is the representation of a bytecode procedure:

<procedure example>= (U->)
struct bytecode_proc {
  unsigned num_locals;          /* number of local variables */
  unsigned char *bytecodes;     /* pointer to bytecodes */
};

(For performance reasons, you probably also want each procedure annotated with its maximum stack usage. This means you can check for stack overflow only on call and return, instead of at every push. The stack usage could be computed for every procedure at startup time, by abstract intrepretation, or you could make the compiler do it.)

Now we can write call and return. This call initializes local variables either to the arguments or to zero.

<vm example>+= (U->) [<-D->]
static unsigned vm_bytecode_call(Vmstate vm, int argc, vmvalue *argv, void *closure) {
  struct bytecode_proc *p = closure;

  /* save current state */
  vm->suspended++;
  vm->suspended->locals     = vm->locals;
  vm->suspended->num_locals = vm->num_locals;
  vm->suspended->pc         = vm->pc;

  /* allocate new frame */
  vm->locals += vm->num_locals;
  vm->num_locals = p->num_locals;

  /* initalize locals to arguments or zero */
  assert(argc <= p->num_locals);
  memcpy(vm->locals, argv, argc * sizeof (*argv));
  memset(vm->locals + argc, 0, (p->num_locals - argc) * sizeof (*vm->locals));

  vm->pc = p->bytecodes;        /* set pc */
  return 0;                     /* start new proc with empty stack */
}

Defines vm_bytecode_call (links are to index).

A more clever scheme would be to use the same space to represent both local variables and the value stack; that would make it possible to start a procedure just by shifting a pointer, and it wouldn't be necessary to copy the arguments.

In a return, results are left on the stack by the bytecodes, so we simply indicate the values that are already there. Return doesn't actually have to be done as a procedure; it could be done inline as a bytecode.

<vm example>+= (U->) [<-D]
static unsigned vm_bytecode_return(Vmstate vm, int argc, vmvalue *argv, void *closure) {
  /* restore previous procedure */
  vm->locals     = vm->suspended->locals;
  vm->num_locals = vm->suspended->num_locals;
  vm->pc         = vm->suspended->pc;
  vm->suspended--;
  return argc;
}

Defines vm_bytecode_return (links are to index).

Startup

At startup time, we assume the interpreter has a way of looking up its procedures by name. Then it can initialize the known_procedures table as follows:

<sample startup code>= (U->)
extern struct vmproc *lookup_procedure(char *procname);

static void init_procs(unsigned M, struct bytecode_proc *low_procs,
                       unsigned N, char **high_procs) {
  int i;

  known_procedures = malloc(N * sizeof (*known_procedures));
  for (i = 0; i < M; i++) {
    known_procedures[i].f = vm_bytecode_call;
    known_procedures[i].closure = low_procs+i;
  }
  for (     ; i < N; i++) {
    Vmproc *p = lookup_procedure(high_procs[i-M]);
    if (p)
      known_procedures[i] = *p;
    else {
      known_procedures[i].f = vm_unknown_proc;
      known_procedures[i].closure = high_procs[i-M];
    }
  }
}

Defines init_procs, lookup_procedure (links are to index).

Calling C procedures

It's a nuisance to specify the behavior of each standard procedure in terms of a list of vmvalues. It would really be simpler if we could write most standard procedures in C, then have a standard way of invoking such procedures from VM code. We can do this by stealing some technology from the implementation of Remote Procedure Call. Unmarshalling is the process of taking a sequence of vmvalues and converting them into arguments for an ordinary C call. Marshalling is the process of taking a result from a C function and converting it into a sequence of vmvalues. With luck and good management, marshalling and unmarshalling code can be generated automatically by a stub generator. Depending on the application, the stub generator can operate on a raw .h file or a description that provides a little more information.

Here's an example. Suppose we have the following declarations:

<sample C procedures to be converted>= (U->)
typedef struct window_st *window_t;
typedef struct event_struct {
    unsigned type;
    window_t win;
    unsigned val1, val2;
} event_t;
void    ick_request_char_event(window_t win);
event_t ick_wait_for_event(void);

Defines event_t, ick_request_char_event, window_t (links are to index).

The stub generator might produce this:

<sample output from stub generation>= (U->)
static unsigned my_ick_request_char_event
  (Vmstate vm, int argc, vmvalue *argv, void *closure) 
{
  assert(argc == 1);
  ick_request_char_event((window_t) argv[0]);
  return 0;
}

static unsigned my_ick_wait_for_event
  (Vmstate vm, int argc, vmvalue *argv, void *closure) 
{
  event_t e;
  assert(argc == 0);
  e = ick_wait_for_event();
  argv[0] = e.type;
  argv[1] = (vmvalue) e.win;
  argv[2] = e.val1;
  argv[3] = e.val2;
  return 4;
}

struct predefined_binding {
  char *name;
  struct vmproc proc;
} procedures[] = {
  { "ick.request-char-event.1", { my_ick_request_char_event, 0} }, 
  { "ick.wait-for-event.1",     { my_ick_wait_for_event,     0} }, 
  { 0, {0, 0} }
};

Defines my_ick_request_char_event, my_ick_wait_for_event, procedures (links are to index).

(Actually, rather than compile in the assertions, you might have the stub generator create two versions---one with no checking, and one with extensive debugging, e.g., to make sure every value claimed to be a window_t was actually returned as a window_t. Then the interpreter could pick the normal version or the debugging version at startup time.)

Of course, if a procedure is really in the inner loop, you probably don't want the minor overhead of calling it through a stub, and so you write a Vmproc directly.

Adding procedures on the fly

If the game includes a compiler, it would be useful to be able to take a struct bytecode_proc from VM memory and assign it a procedure ID on the fly. This would be easy to implement as another procedure.

What's left?

In this system, lots of things that used to be opcodes are now procedures. What's left? What do you continue to implement as bytecodes? Generically, the answer is that cheap operations you want to run with very low overhead should be implemented as bytecodes. This certainly includes arithmetic and logical operations, stack manipulations, and loads and stores. These are just the sorts of things that compile into machine code in C programs. In fact, one of the intriguing ideas about this approach would be to take the bytecode procedures in a story file and compile them into native code. (This could be done through C without much difficulty.) Maybe then my Pilot would run ``The Meteor, The Stone And A Long Glass Of Sherbet'' at reasonable speeds!

Compiling all the examples

This puts all the code together so it can be compiled.

<*>=
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

<type definitions>
<hypothetical call example>
<sample error procedure>

<procedure example>
<vm example>

<sample startup code>

<sample C procedures to be converted>
<sample output from stub generation>

This is necessary, but it can't do anything, because we haven't actually defined the format of the call instruction.

<advance VM's program counter past the call instruction>= (<-U)
/* do nothing */

<*>: D1
<advance VM's program counter past the call instruction>: U1, D2
<hypothetical call example>: D1, U2
<procedure example>: D1, U2
<sample C procedures to be converted>: D1, U2
<sample error procedure>: D1, U2
<sample output from stub generation>: D1, U2
<sample startup code>: D1, U2
<type definitions>: D1, D2, D3, U4
<vm example>: D1, D2, D3, D4, U5

event_t: D1, U2
ick_request_char_event: D1, U2
init_procs: D1
known_procedures: D1, U2, U3
lookup_procedure: D1
my_ick_request_char_event: D1
my_ick_wait_for_event: D1
procedures: U1, D2
vm_bytecode_call: D1, U2
vm_bytecode_return: D1
vm_call: D1
Vmproc: D1, U2, U3, U4
Vmstate: D1, U2, U3, U4, U5, U6
vm_unknown_proc: D1, U2
vmvalue: D1, U2, U3, U4, U5, U6, U7, U8, U9
window_t: D1, U2