The Universal Machine Macro Assembler


This document is the specification for the Universal Machine Macro Assembler Language and for the UMASM Macro Assembler program.

With only 14 instructions, the Universal Machine is a Spartan environment for even the most seasoned assembly-language programmer. The Universal Machine Macro Assembler, called UMASM, is a front end that extends the Universal Machine to create a more usable assembly language. [In assembly-language jargon, a "macro" is something that appears to be a single instruction but actually expands to a sequence of machine instructions. A true macro assembler would let you, the programmer, define your own macros. Maybe next year.]
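To make the idea of a macro concrete, here is a minimal sketch in Python (not part of UMASM itself) of how a single statement such as `r1 := r2 & r3` could expand into a sequence of two machine instructions. It assumes the standard UM instruction encoding (opcode in the top four bits, registers A, B, and C in the low nine bits), NAND as opcode 6, and eight registers numbered 0 to 7. The names `three_register` and `expand_and` are hypothetical, and the real assembler may well choose a different expansion.

```python
NAND = 6  # assumed: opcode of bitwise NAND in the standard UM numbering


def three_register(op, ra, rb, rc):
    """Encode a three-register UM instruction as one 32-bit word."""
    for r in (ra, rb, rc):
        if not 0 <= r <= 7:
            raise ValueError(f"register out of range: r{r}")
    # Opcode occupies the top 4 bits; A, B, C occupy the low 9 bits.
    return (op << 28) | (ra << 6) | (rb << 3) | rc


def expand_and(dest, x, y):
    """Expand `dest := x & y` into two NAND instructions.

    No temporary register is needed: the first instruction reads x and y
    before it writes dest, and the second reads only dest.
    """
    return [
        three_register(NAND, dest, x, y),        # dest := ~(x & y)
        three_register(NAND, dest, dest, dest),  # dest := ~dest, which is x & y
    ]
```

Calling `expand_and(1, 2, 3)` yields two instruction words where the programmer wrote one statement, which is exactly the sense in which the statement is a macro.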

The macro assembler adds capabilities that are not available from individual UM instructions but that are available to programmers using the UMASM language. When a program uses one of these added capabilities, the UMASM assembler emits multiple machine instructions.

  <comment> ::= from # or // to end-of-line
 <reserved> ::= if | m | goto | map | segment | nand | xor | string
             |  unmap | input | output | in | program | using
             |  off | here | halt | words | push | pop | on | stack
    <ident> ::= identifier as in C, except <reserved> or <reg>
    <label> ::= <ident>
      <reg> ::= rNN, where NN is any decimal number
        <k> ::= <hex-literal> | <decimal-literal> | <character-literal>
   <lvalue> ::= <reg> | m[<reg>][<rvalue>]
   <rvalue> ::= <reg> | m[<reg>][<rvalue>]
             |  <k> | <label> | <label> + <k> | <label> - <k>
    <relop> ::= != | == | <s | >s | <=s | >=s
    <binop> ::= + | - | * | / | nand | & | '|' | xor | mod
     <unop> ::= - | ~
    <instr> ::= <lvalue> := <rvalue>
             |  <lvalue> := <rvalue> <binop> <rvalue>
             |  <lvalue> := <unop> <rvalue>
             |  <lvalue> := input()
             |  <lvalue> := map segment (<rvalue> words)
             |  <lvalue> := map segment (string <string-literal>)
             |  unmap m[<reg>]
             |  output <rvalue>
             |  output <string-literal>
             |  goto <rvalue> [linking <reg>]
             |  if (<rvalue> <relop> <rvalue>) goto <rvalue>
             |  if (<rvalue> <relop> <rvalue>) <lvalue> := <rvalue>
             |  push  <rvalue> on   stack <reg>
             |  pop  [<lvalue> off] stack <reg>
             |  halt
             |  goto *<reg> in program m[<reg>]
<directive> ::= .section <ident>
             |  .data <label> [(+|-) <k>]
             |  .data <k>
             |  .space <k>
             |  .string <string-literal>
             |  .zero <reg> | .zero off              // identify zero register
             |  .temps <reg> {, <reg>} | .temps off  // temporary regs
     <line> ::= {<label>:} [<instr> [using <reg> {, <reg>}] | <directive>]
  <program> ::= {<line> (<comment> | newline | ;)}
Figure: Grammar for the Universal Machine Macro Assembler

Notable features of the Macro Assembler

The grammar figure above gives the full language accepted by the Macro Assembler; the start symbol of the grammar is <program>, at the bottom. Note that the <string-literal>, <hex-literal>, <decimal-literal>, and <character-literal> productions are omitted from the grammar. Each matches the equivalent C literal syntax: for example, <string-literal> accepts a C-style double-quoted string. The nonterminals of major interest are <instr> and <directive>.
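As an illustration of the <reg> and <k> productions, the following Python sketch recognizes register names of the form rNN and converts hex, decimal, and character literals written in C syntax to integers. It is only a sketch under stated assumptions: the function names `parse_reg` and `parse_k` are hypothetical, only a small subset of C escape sequences is handled, and the real assembler's lexer may differ.

```python
import re

# Assumed subset of C character escapes; a full lexer would handle more.
ESCAPES = {"n": 10, "t": 9, "r": 13, "0": 0, "\\": 92, "'": 39, '"': 34}


def parse_reg(token):
    """Return the register number for a <reg> token such as 'r3', else None."""
    m = re.fullmatch(r"r([0-9]+)", token)
    return int(m.group(1)) if m else None


def parse_k(token):
    """Return the integer value of a <k> literal, or raise ValueError."""
    if re.fullmatch(r"0[xX][0-9a-fA-F]+", token):
        return int(token, 16)                      # <hex-literal>
    if re.fullmatch(r"0|[1-9][0-9]*", token):
        return int(token, 10)                      # <decimal-literal>
    if re.fullmatch(r"'(\\.|[^\\'\n])'", token):   # <character-literal>
        body = token[1:-1]
        if body.startswith("\\"):
            if body[1] not in ESCAPES:
                raise ValueError(f"unsupported escape: {token}")
            return ESCAPES[body[1]]
        return ord(body)
    raise ValueError(f"not a <k> literal: {token}")
```

For example, `parse_k("0x1F")` and `parse_k("31")` denote the same value, and `parse_reg("rx")` returns `None` because `rx` is an ordinary identifier rather than a register.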

The Macro Assembler Program

Usage of the Macro Assembler program is straightforward:

  umasm [-help] [-grammar] [-o file] [source.ums ...]
The -help option prints a longer explanation of the options, including several that are intended only for debugging the Macro Assembler itself. The -grammar option prints the input language of the Macro Assembler. The -o option names a file to which the binary UM code should be written; if it is not given, the Assembler writes to standard output. [For the COMP 40 implementation of the macro assembler program, the supplied framework parses and implements all of these command-line switches; students supply only specific pieces of code to implement selected macros and to assist with management of segments.]
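The output behavior described above can be sketched in Python as follows. This is not the COMP 40 framework's code; it assumes that binary UM code is a sequence of 32-bit words written most significant byte first (the usual UM program format), and the names `um_bytes` and `write_um_binary` are hypothetical.

```python
import struct
import sys


def um_bytes(words):
    """Serialize 32-bit instruction words as big-endian bytes (assumed format)."""
    # struct.pack raises struct.error if a word does not fit in 32 bits.
    return b"".join(struct.pack(">I", w) for w in words)


def write_um_binary(words, path=None):
    """Write the words to `path`, or to standard output when no path is given."""
    data = um_bytes(words)
    if path is None:
        sys.stdout.buffer.write(data)   # mirrors the behavior without -o
    else:
        with open(path, "wb") as f:     # mirrors the behavior with -o
            f.write(data)
    return len(data)
```

Writing through `sys.stdout.buffer` rather than `sys.stdout` keeps the output binary, so the bytes are not altered by text encoding or newline translation.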