COMP40 Assignment: Assembly-Language Programming

COMP40 Assignment: Assembly-Language Programming

Assignment due Sunday, December 11 at 11:59 PM.
Design (in limited form; see below) due Tuesday December 6 at 11:59 PM.

Table of Contents


Purpose and overview

The purpose of this assignment is to deliver on the second half of the course title: you get to do some assembly-language programming. You will consolidate and solidify your knowledge of machine-level programming by implementing a calculator that uses Reverse Polish Notation, like the immortal HP 15C.


CommandFunction
nPush n onto the value stack, where n is a numeral (sequence of digits).
spaceDoes nothing, but may be used to separate numerals, as in the command sequence ``6 7*.''
newlinePrint the contents of the value stack
+Pop y from the value stack, then pop x from the value stack, then push x+y.
-Pop y from the value stack, then pop x from the value stack, then push x-y.
*Pop y from the value stack, then pop x from the value stack, then push x×y.
/Pop y from the value stack, then pop x from the value stack, then push x ÷y. If y is zero, print an error message and leave the stack unchanged.
|Pop y from the value stack, then pop x from the value stack, then push x \/y, where \/ stands for bitwise or.
&Pop y from the value stack, then pop x from the value stack, then push x /\y, where /\ stands for bitwise and.
c(Change sign.) Pop x from the value stack, then push -x.
~Pop x from the value stack, then push ¬x, where ¬ stands for bitwise complement.
sSwap the two values on top of the value stack (exchange x and y).
dDuplicate the value on the top of the stack. (The HP 15C uses the Enter key.)
pPop a value off the value stack and discard it.
zRemove all values from the value stack (zero stack).

Calculator commands [*]

  sunfire31{nr}403: calc40
  6 7 *
  >>> 42
  2 +
  >>> 44
  11 /
  >>> 4
  c
  >>> -4
  p
  466 319sd+240c807c   sd-
  >>> 0
  >>> -807
  >>> 932
  >>> 319
Interacting with the RPN calculator [*]


An RPN calculator

The COMP 40 RPN calculator reads commands from standard input and prints results to standard output. Like all RPN calculators, it works with a value stack. In this case, a value on the stack is one Universal Machine word. The command set is shown in Figure [<-]; Figure [<-] shows an example interaction. You will find a complete reference implementation in file /comp/40/www/homework/calc.c, and you can run a binary in /comp/40/bin/calc40.

Your assignment is to implement this calculator in Universal Machine Assembly language. Your calculator must duplicate the output of the reference implementation exactly.

The implementation of the calculator is mostly straightforward: the only persistent state is the value stack, and this value stack is manipulated by each command independently of the others, using purely local reasoning. There is one dirty trick, however: in order to make it possible to read the digits of a numeral one character at a time, the calculator uses a finite-state machine with two states: waiting and entering. The normal state, which is also the initial state, is waiting. The entering state is used only when the entry of a numeral is in progress.

Here are two examples: You can see for yourself the difference between 42 with no space and 4 2 with a space:
  42
  >>> 42
  p
  4 2
  >>> 2
  >>> 4
The C code keeps track of the state through the position of the program counter, using the two labels entering and waiting. To avoid duplicating the implementations of any commands, if the code for the entering state does not see a digit, it uses a goto to reuse the same code used in the waiting state.

Technical information

Useful macro instructions

Some critically important macro instructions are not explained in your handout:

push r3 on stack r2Register r2 points to a stack, and this instruction subtracts 1 from r2, then stores r3 at offset r2 in segment 0.
pop r5 off stack r2Register r2 points to a stack, and this instruction loads register r5 from offset r2 in segment 0, then adds 1 to r2.
pop stack r2Adds 1 to register r2.
goto p linking r1Sets register r1 to the offset of the instruction immediately following the call macro, then transfers control to the instruction labelled p in segment 0. Used to implement procedure calls.

Recommended calling convention

You may choose any calling convention you like, but for general purposes I recommend the following convention:

  1. Arguments are passed on the call stack, which is pointed to by register r2. The callee sees the first argmument is at the lowest address (m[0][r2]), with subsequent arguments at higher addresses. In this convention, if you examine a sequence of push instructions in the caller, you'll see that the caller pushes the first argument last.
  2. Register r0 is always zero.
  3. On entry to a procedure, register r1 holds the return address. If you write a procedure that itself makes a call, you will have to save and restore the procedure's return address.

    If a procedure returns a result, the result should be returned in register r1.

  4. Register r2 is the stack pointer.
  5. Registers r3 and r4 are nonvolatile general-purpose registers. If you use either of these registers in a procedure, you must save and restore them.
  6. Registers r5, r6, and r7 are volatile registers and are not saved and restored by procedure calles.
I also recommend that you dedicate registers r6 and r7 for use as temporaries.


  .zero r0
  .temps r6, r7
  .section text

  // return address in r1, which gets result
  // stack pointer in r2
  // nonvolatiles r0, r3, r4
  // r0 is zero
  double:
    push r1 on stack r2 // save return address
    push r3 on stack r2 // save nonvolatile registers
    push r4 on stack r2

    r3 := m[r0][r2+3] // load argument into r3
    r1 := r3 + r3     // result goes into register

    pop r4 off stack r2 // restore nonvolatile registers
    pop r3 off stack r2
    pop r5 off stack r2 // put return address in r5
    goto r5 // return
An assembly procedure that returns double its argument [*]

Using this convention, Figure [<-] shows a slightly paranoid procedure that doubles its argument. In Figure [<-], it is not really necessary to save r3 and r4, since everything could have been done using r5, but the model works in the general case.

Design and implementation plan

Sections

In my assembly code, I use these sections

textContains procedure definitions, including the definition of main.
dataContains a preallocated call stack and other data structures.
rodataContains jump tables.
initContains setup code, including code to set up the stack, code to initialize jump tables, and code to call main when setup is complete.

Modules

My calculator is split into four assembly-language source files:

  1. File stack.ums allocates space for the call stack (in the data section) and initializes the stack pointer (with code in the init section). Not counting blank lines or comments, my implementation of this module is only 6 lines of assembly code.
  2. File printd.ums contains a function for printing Universal Machine words in decimal.
  3. File calc40.ums contains my calculator-related functions.
  4. File callmain.ums puts code in the init section which makes the initial call to main, then halts. Not counting blank lines or comments, my implementation of this module is only 5 lines of assembly code.
It is important that stack.ums come first and callmain.ums come last, so that the stack pointer is initialized before any other code runs, and so that main is not called until all the other code in the init section runs. For example,
  umasm stack.ums calc40.ums printd.ums callmain.ums > calc40.um

Data structures

There is really only one data structure in the program, which is the value stack. I recommend that you reserve space in segment 0 so that you can take advantage of the push and pop instructions. (We will be testing your calculator on random inputs, so be sure that your value stack is capable of holding at least ten thousand values.) Another alternative is to use segments to make a linked list.

Implementation of the print module

The print module is the most challenging module in the calculator. Printing Universal Machine words as numbers requires three or four cases:

The reason the print module is difficult is that the only way to get the digits of a number is to get the least-significant digit first—but numbers are conventionally printed with the most-significant digit first. I'm aware of two kinds of solutions:

Implementation of the calculator module

My implementation of the calculator module is about 250 lines of assembly code, but most of these lines are very repetitive—there are fifteen commands, and each one has to check for operands on the stack, do some manipulation, and some control flow. The codes are all quite similar. I recommend you take advantage of these tricks:

Debugging techniques

Assembly code is hard to debug. You will need to add some debugging code to your Universal Machine. I made my debugging code conditional on an environment variable called UMTRACE. When I start my Univeral Machine, I make one check in the environment to see if I should be tracing:

  bool trace = getenv("UMTRACE") != NULL;
Then, in my execution loop, I print information conditioned on the trace:
  if (trace) {
    Um_instruction instruction = *pc;
    char *asm = (char *)Um_disassemble(instruction);
    if (OP(instruction) == LV)
      fprintf(stderr, "%7" PRIdPTR ": %s\n", pc - prog, asm);
    else
      fprintf(stderr, "%7" PRIdPTR ": %s  (r%d = %d, r%d = %d, r%d = %d)\n",
              pc - prog, asm,
              A(instruction), RA, B(instruction), RB, C(instruction), RC);
    FREE(asm);
  }
This code prints each PC and instruction before it is executed, along with the values of the registers mentioned in the instruction.

Here's some advice:

What we provide for you

Your mission is to implement the RPN calculator in Universal Machine assembly language. Here's the support you get from us:

What we expect from you

Documentation

Assembly code requires the same kinds of documentation as C code, but in more places.

``Design''

On this project, I'm afraid you don't get to do much design. But to give you help getting started, and to give you feedback on your documentation, we're asking you to submit two things early:

Using the script submit40-asmcoding-design, please submit these two files by Tuesday, December 6 at 11:59 PM.

Final submission

By Sunday, December 11 at 11:59 PM, use the script submit40-asmcoding to submit