Modules and Abstract Types

COMP 105 Assignment

Due Wednesday, December 9, 2020 at 11:59PM

Overview

To build systems at scale, we break them into modules, and we put abstraction barriers between the modules. The most effective abstraction barrier we have is data abstraction. It’s a key element in the design of systems implemented in Ada, C, C++, Eiffel, Go, Haskell, Java, and related languages. Modules and abstract types are supported by concepts and mechanisms that you’ll learn in the context of Standard ML, which, not coincidentally, has one of the most expressive and powerful system-level abstraction mechanisms ever created.

In this assignment, you will

You’ll accomplish these ends by completing the three parts of the assignment:

Overall, you’ll answer reading-comprehension questions, you’ll deliver three modules, and you’ll assemble the abstraction functions and the invariants from two of those modules. As usual, you do reading-comprehension questions on your own. You may tackle the modules either on your own or with a partner.

Note: Because you’re creating modules, not just functions, and you’re also learning some new technology, you’ll do a lot of reading before you write your first line of code. This experience makes the assignment feel time-consuming, but overall, past students have reported spending the same amount of time on this assignment as on other assignments. And those past assignments included a long section on two-player games, which I have removed from the homework (and replaced with heapsort).

Setup

The code in this handout is installed for you in /comp/105/lib, where you don’t have to look at it. You will compile your own code using a special script, compile105-sml, which is available on the servers through the usual command use comp105. This script does not produce any executable binaries. Instead, it creates binary modules (“.uo files”) that you can load into Moscow ML, as in load "natural";. You can use it to compile all files or just a single file:

compile105-sml
compile105-sml natural.sml

To be able to load the binaries that we provide, you must supply an additional argument to mosmlc and mosml, as in

    mosml -I /comp/105/lib -P full

If you run into any surprises, consult Appendix I below, which explains, in detail, your options for compiling.

Dire warnings

Unless otherwise noted below, functions and constructs that were forbidden on earlier assignments remain forbidden.

In addition,

Reading

For this homework, you will need to understand the ML Modules section of “Learning Standard ML.” The book chapter on modules, chapter 9, is not included in your abridged edition—the chapter is still under review. Instead of the book chapter, we recommend the sixth lesson on program design: “Program design with abstract data types.”

You’ll also need to understand Appendix B, “Arithmetic”, starting on page S15 of Programming Languages: Build, Prove, and Compare. This appendix is included in the supplement. The appendix contains everything you need to know, but we recommend an additional handout, “Mastering Multiprecision Arithmetic”, which provides many hints about implementation.

Reading comprehension

These questions will help guide you through the reading. We recommend that you complete them before starting the other exercises below. You can download the questions.

  1. Using one of the sources in the ML learning guide, read about structures, signatures, and matching. Then answer questions about the structure and signature below.

    The following structure contains definitions that should be familiar from the ML homework and from code you may have seen in the course interpreters:

    structure ExposedEnv = struct
      type name   = string
      type 'a env = (name * 'a) list
      exception NotFound of name
      val emptyEnv = []
    
      fun lookup (name, [])              = raise NotFound name
        | lookup (name, (x, v) :: pairs) =
            if x = name then v else lookup (name, pairs)
    
      fun bindVar (name, value, env) = (name, value) :: env
    end

    Here is a signature:

    signature ENV = sig
      type name = string
      type 'a env
      val emptyEnv : 'a env
      val lookup   : name * 'a env -> 'a
      val bindVar  : name * 'a * 'a env -> 'a env
    end

    Answer these questions:

    1. Does the structure match the signature? That is, if we write

      structure Env :> ENV = ExposedEnv

      does the resulting code typecheck? Please answer yes or no.

    2. Does the signature expose enough information for us to write the following function? Please answer yes or no.

      fun extendEnv (names, vals, rho) =
        ListPair.foldrEq Env.bindVar rho (names, vals)
    3. Does the signature expose enough information for us to write the following function? Please answer yes or no.

      fun isBound (name, rho) = (Env.lookup (name,rho) ; true) 
                                handle Env.NotFound _ => false
    4. If in part (b) or part (c), if it is not possible to write the function given, change the signature to make it possible. If necessary, please copy, paste, and edit your new version in here:

    5. Suppose I change the ENV signature to make the name type abstract, so the code reads

      signature ENV' = sig
        type name
        type 'a env
        val emptyEnv : 'a env
        val lookup   : name * 'a env -> 'a
        val bindVar  : name * 'a * 'a env -> 'a env
      end
      structure Env' :> ENV' = ExposedEnv

      The new structure Env', sealed with signature ENV', is useless. Please explain why it is useless:

    You now have the basic ideas needed to understand what is being asked of you in this assignment.

  2. An ML functor is a function that operates on the module level. Think of it as a “module in waiting” or a “module builder”; a software engineer would call it a “generic module.” A functor’s formal parameters, if any, are specified by a sequence of declarations, and its actual parameters are given by a sequence of definitions. A functor’s result is a structure.

    Read about functors in Harper, as recommended in the ML learning guide. Then read about structure matching in section 24 of Tofte’s “Tips for Computer Scientists on Standard ML.” Then answer the questions below.

    1. On page 183, Harper defines a functor DictFun which takes one parameter: a structure K matching signature ORDERED. A dictionary is implemented using a binary tree. Suppose instead you want to implement a dictionary using a hash table. So you define a new functor HashDictFun, and it expects one parameter: a structure K with signature HASHABLE:

      functor HashDictFun(structure K : HASHABLE) :>
                MUTABLE_DICT where type Key.t = K.t
        =
      struct
        ... imagine your beautiful hash table here ...
      end

      The new signature HASHABLE is analogous to Harper’s signature ORDERED: it defines a key type t, plus every thing we need to know about keys to build a generic hash table with elements of type t.

      Fill in the complete definition of signature HASHABLE here:

      signature HASHABLE = sig
        ... fill in declarations here ...
        ... only info about keys; no operations on tables...
      end
    2. For each component of your HASHABLE signature, whether it is a type component or a value component, say what you expect it to be used for in functor HashDictFun. Write only English, not code:

    3. Suppose you have a structure IntHash that matches signature

      HASHABLE where type t = int

      Now suppose you apply function DictFun, from Harper’s chapter, to structure IntHash. This scenario is like the examples on the bottom of page 184; I’m suggesting you try

      structure HIntDict = DictFun (structure K = IntHash).

      What will the ML compiler say? Will it reject this application of DictFun or accept it?

      • If the compiler would reject the application, say one specific thing the compiler would complain about:

      • If the compiler would accept the application, explain why the compiler would be OK with it even though the functor expects module K with signature ORDERED and you are giving it module K with signature HASHABLE:

    You now understand functors well enough to use them in exercises H and I.

  3. Read about “signature refinement or specialization” in the ML learning guide. Now,

    1. In Harper’s DictFun functor, explain what is going on with the where type syntax.

    2. Explain what would go wrong if Harper wrote a definition without where type:

      functor DictFun (structure K : ORDERED) :> DICT
        =
      struct
        ...
      end

    You now know enough to use functors in exercises H and I.

  4. Read about abstraction functions and invariants in the lesson “Program design with abstract data types”. Then, from the ML homework, review the algebraic data type from the natural-number problems.

    Now answer these questions:

    1. The lesson describes a sorted list as one possible representation for a set. Define a function invariant that takes as argument a list of integers and returns a Boolean saying if the list is sorted in strictly ascending order (that is, increasing, with no repeats). You may use ML or μScheme, and you may reuse any function assigned for homework this term:

    2. In the ML homework, the algebraic type nat satisfies two invariants. In ML, define a function invariant of type nat -> bool, which returns true if and only if the given representation satisfies both invariants:

    You are now ready to write abstraction functions and invariants in exercises I, N, and ADT.

  5. Read about short division starting on page S21 of the book, and in “Mastering Multiprecision Arithmetic”.

    1. Divide 2918 by 7, calculating both quotient and remainder.
      At each step, you divide a two-digit number by 7. The remainder is passed along to form the next two-digit number.

        _________
      7 | 2 9 1 8

      At each step of the computation, you will take a two-digit dividend, divide by 7, and give quotient and remainder. The first step is

        02 divided by 7  ==  0 remainder 2
        29 divided by 7  ==  ...

      There are four steps in total. Edit the text above to state the dividend, divisor, quotient, and remainder at each step. Here, write the final four-digit quotient and the one-digit remainder:

    You are now ready to implement short division on natural numbers (for exercise N).

  6. Going back to the same reading, and following the examples in the section “Using short division for base conversion,” convert a number from decimal to binary and another number from decimal to octal.

    1. Using repeated division by 2, convert decimal 13 to binary. The “Mastering Multiprecision Arithmetic” handout shows the form, so please just fill in the right-hand sides here:

        q0 =           r0 = 
        q1 =           r1 = 
        q2 =           r2 = 
        q3 =           r3 = 

      Now write the converted numeral here:

    2. Using repeated division by 8, convert decimal 25 to octal 31. Follow the same model: at each step, give the intermediate quotient and remainder, and then form the final quotient by reading off the remainders.

    You are now ready to implement the decimal operation on natural numbers (for exercise N). This will also enable you to implement the toString operation on signed integers.

Programming, part one: Warming up with heapsort

Heapsort is a sorting algorithm that gets O(N log N) performance by using an auxiliary data structure: a priority queue.2 A priority queue is a collection of elements with an operation that quickly finds and removes a minimal element. (A priority queue requires a total order exists on the elements; a minimal element is one that is at least as small as any other element in the queue. A priority queue may contain more than one minimal element, in which case it is not specified which such element is removed.)

Like most data structures, priority queues come in both immutable and mutable forms. Immutable data structures are easier to test, specify, and share. Mutable data structures usually have lower space costs and sometimes lower time costs. One version of an immutable priority queue is described by this interface:

signature PQUEUE = sig
  type elem  (* element (a totally ordered type) *)
  val compare_elem : elem * elem -> order (* observer: total order of elems *)

  type pqueue  (* Abstraction: a sorted list of 'elem' *)
  val empty  : pqueue                   (* the empty list *)
  val insert : elem * pqueue -> pqueue  (* producer: insert (x, xs) == sort (x::xs) *)
  val isEmpty : pqueue -> bool          (* observer: isEmpty xs == null xs *)
  exception Empty
  val deletemin : pqueue -> elem * pqueue
    (* observer: deletemin [] = raises the Empty exception
                 deletemin (x::xs) = (x, xs) *)

  (* cost model: insert and deletemin take at most logarithmic time;
                 isEmpty takes constant time*)
end

The type pqueue is abstract, so its representation is not specified, but the interface says what abstraction a pqueue corresponds to: a sorted list of values of type elem.3 Each operation is described by giving the imagined effect on the abstraction. When specifying an interface that uses abstract data types, working with an abstraction in this way is a best practice—but not always easy or even possible.

A mutable version of the same abstraction is described by this interface:

signature MPQUEUE = sig
  type elem  (* element (a totally ordered type) *)
  val compare_elem : elem * elem -> order (* observer: total order of elems *)

  type pqueue  (* Abstraction: mutable cell containing a sorted list of 'elem' *)
  val new : unit -> pqueue              (* returns `ref []` *)
  val insert : elem * pqueue -> unit    (* q := sort (x :: !q) *)
  val isEmpty : pqueue -> bool          (* observer: isEmpty q == null (!q) *)
  exception Empty
  val deletemin : pqueue -> elem        (* remove and return the first element *)
end

Observers can conclude that the abstraction is mutable just by inspecting the types: They see unit types, and they observe that deletemin doesn’t return a pqueue.

Programming exercise H: Heapsort

Use a priority-queue abstraction to implement a sort module matching this signature:

signature SORT = sig
  type elem
  val compare : elem * elem -> order
  val sort : elem list -> elem list  (* return increasing *)
end

You must implement your sort by inserting elements into a priority queue and then removing them. To represent the priority queue, you may use either of the abstractions defined above: PQUEUE or MPQUEUE. Your choice of abstraction will determine the details of your solution:

Either way, place your solution in file sort.sml. You can compile it by running

compile105-sml sort.sml
    

You don’t need a structure that implements signature PQUEUE or MPQUEUE. This is the whole point! (Just keep in mind that unless you choose to write such a structure, you will not be able to run any tests that you write for your heapsort. Running the tests is optional and is at your discretion.)

Also, you do not need the source code that defines the signatures PQUEUE and MPQUEUE—the compile105-sml script finds those files in /comp/105/lib.

If you use the exception, avoid this common mistake: don’t try catching exception Empty. The exception you need to catch must be identified by its full qualified name, Q.empty.

How big is it? The body of the functor is under 10 lines of code.

Related reading:

  • Read about functors in sources from the ML learning guide.

What’s the point? This problem illustrates a key advantage of the ML module system: you can program against an unimplemented interface. When you’re designing an interface, you often need to know if the interface meets the needs of its clients, and you want to get the design right before you spend the budget needed to create an implementation. Several languages provide mechanisms that will do the job (e.g., Java interfaces); in ML, those mechanisms are signatures and functors.

If you want to run tests

If you want to test your heapsort, you’ll need to implement a heap. The book excerpt has two implementations that you are welcome to translate from Molecule into Standard ML. An even easier approach is to use the heap data structure from the ML lectures:

datatype 'a heap = EHEAP
                 | HEAP of 'a * 'a heap * 'a heap

Here’s a trick: every time you insert into a nonempty heap, swap the left and right subheaps. For the special case of heapsort, this trick will give perfect O(Nlog N) performance.

Implementing heaps and testing heapsort is optional.

Programming, part two: Arbitrary-precision numbers

Standard ML’s primitive type int is limited to machine precision. If arithmetic on int results in a value that is too large or too small to fit in one word, the primitive functions raise Overflow. In the next two exercises, you will implement a true integer abstraction, called bigint, which never raises Overflow. (The worst it can do is make your program run out of memory.)

You will implement this interface:

signature BIGNUM = sig
   type bigint
   exception BadDivision                (* contract violation for sdiv *)

   val ofInt   : int -> bigint
   val negated : bigint -> bigint       (* "unary minus" *)
   val <+>     : bigint * bigint -> bigint
   val <->     : bigint * bigint -> bigint
   val <*>     : bigint * bigint -> bigint
   val sdiv    : bigint * int -> { quotient : bigint, remainder : int }

     (* Contract for "short division" sdiv, which is defined only on
        *nonnegative* integers:

        Provided 0 < d <= K and n >= 0,
          sdiv (n, d) returns { quotient = q, remainder = r }
        such that
          n == q /*/ ofInt d /+/ ofInt r
          0 <= r < d

        Given a negative n or d <= 0, sdiv (n, d) raises BadDivision.

        Given d > K, the program halts with a checked run-time error.
        The constant K depends on the number of bits in a machine integer,
        so it is not specified, but it is known to be at least 10.
     *)
        
   val compare : bigint * bigint -> order

   val toString : bigint -> string

     (* toString n returns a string giving the natural 
        representation of n in the decimal system.  If n is
        negative, toString should use the conventional minus sign "-".

        And when toString returns a string containing two or more digits,
        the first digit must not be zero.
     *)

  val invariant : bigint -> bool  (* representation invariant---instructions below *)

end

You will not build your implementation from scratch. Instead, you will use ML’s functor mechanism to build on top of natural numbers. Natural numbers implement this interface:

signature NATURAL = sig
   type nat
   exception Negative     (* the result of an operation is negative *)
   exception BadDivisor   (* divisor zero or negative *)

   val ofInt : int -> nat          (* could raise Negative *)
   val /+/   : nat * nat -> nat
   val /-/   : nat * nat -> nat    (* could raise Negative *)
   val /*/   : nat * nat -> nat
   val sdiv  : nat * int -> { quotient : nat, remainder : int }

     (* Contract for "Short division" sdiv:

        Provided 0 < d <= K,
          sdiv (n, d) returns { quotient = q, remainder = r }
        such that
          n == q /*/ ofInt d /+/ ofInt r
          0 <= r < d

        Given a d <= 0, sdiv (n, d) raises BadDivisor.

        Given d > K, the program halts with a checked run-time error.
        The constant K depends on the number of bits in a machine integer,
        so it is not specified, but it is known to be at least 10.
     *)
        
   val compare : nat * nat -> order

   val decimal : nat -> int list

     (* decimal n returns a list giving the natural decimal
        representation of n, most significant digit first.
        For example,  decimal (ofInt 123) = [1, 2, 3]
                      decimal (ofInt 0)   = [0]
        It must never return an empty list.
        And when it returns a list of two or more digits,
        the first digit must not be zero.
     *)

  val invariant : nat -> bool  (* representation invariant---instructions below *)

end

In exercises I and N below, you implement both interfaces. You can do them in either order.

Programming exercise I: Integers from natural numbers

Abstract data type: large integers. In file bignum.sml, define a functor BignumFn that takes as its argument a structure N matching signature NATURAL, and returns as its result a structure matching signature BIGNUM. Your functor should look like this:

functor BignumFn(structure N : NATURAL) :> BIGNUM
  =
struct
  ... sequence of definitions here ...
end

Within the body of the functor, you refer to the representation of natural numbers as type N.nat, and you call operations using the fully qualified names of the functions as in N./+/ (n, m).

ML syntax hint: The introduction form for the quotient/remainder record is

{ quotient = e1, remainder = e2 }

for any expressions e1 and e2.

The elimination form is

let val { quotient = q, remainder = r } = …

for any patterns q and r. (In typical usage, q and r will both be names.)

How big is it? The course implementation is about 70 lines, of which about 15 are blank.

Representation, abstraction function, and invariant

What you need to know about an integer is how big it is and whether it is negative: its magnitude and sign. Within that constraint, you have your choice of representations. Several representations will work; here are three good ones:

Each of these representations has its advantages, and they all work. Pick what you think will make your job easy.

Document your representation by writing down the abstraction function and invariant:

Put the abstraction function and invariant inside your BignumFn functor, right after the definition of type bigint.

Two mild warnings and one dire warning

The only major pitfall in this abstraction is that the integer zero is likely to have two or even three different representations. You must make it impossible for client code to distinguish them. That is, it must be impossible for the teaching staff to write a program that uses the BIGNUM interface and can tell one representation of zero from another. The risks are greatest in exported functions compare and toString.

There is also a minor pitfall: if a small-minded person hands you the most negative integer (see Int.minInt), its magnitude cannot be represented as a machine integer. To compute the magnitude of a negative integer safely, use the following steps:

  1. Add one to the negative integer
  2. Negate the sum
  3. Convert the result to nat
  4. Add one to the nat

The dire warning is this: do not include large integer literals, like ~4611686018427387904, in your source code. Your code won’t compile on our test machine, and you will get No Credit. You do not need to mess around with Int.minInt, but if you feel compelled to do so, refer to it by name.

Guidance: infix operators

Inside your BignumFn functor, we recommend you change the “fixity” of the major operators so you can write both calls and definitions using infix notation. Here’s all you have to write:

infix 6 <+> <->
infix 7 <*> sdiv

If you also want to use the operations from the NATURAL interface as infix operators, try something like this:

val /+/ = N./+/
val /-/ = N./-/
val /*/ = N./*/

infix 6 /+/ /-/
infix 7 /*/ 

Following these declarations, you can write both definitions and calls using infix notation:

fun thing1 <+> thing2 = ...

... mag1 /+/ mag2 ...

Guidance: algebraic laws

Arithmetic on signed integers requires a collection of algorithms. The heavy lifting is done in the implementation of natural numbers (below); the integer abstraction mostly has to get the signs right. To help you get signs right, we provide some algebraic laws. In the laws, magnitudes appear as variables N and M; signs appear as symbols + and -, and the BIGNUM and NATURAL operations are written using infix notation.

Guidance: short division

If it gets a divisor that’s too big, function sdiv is allowed to fail with a checked run-time error. How big is too big? Anything that makes division on natural numbers fail. In other words, your BignumFn only has to check the sign of the dividend and divisor; it doesn’t have to worry about the magnitude of the divisor.

Related reading: Using the information in the ML learning guide, read about ML signatures, structures, and functors.

Programming exercise N: Arbitrary-precision natural numbers

In this exercise, you finish the implementation of natural numbers—of arbitrary precision—that you started on the first ML assignment. Arithmetic will never overflow; the worst that can happen is you run out of memory. You will design and build an implementation of signature NATURAL, which you will put in file natural.sml. You will tackle the implementation in two parts: first choose a representation and invariant, then implement the operations.

ML syntax hint: The introduction and elimination forms for the quotient/remainder record are shown above.

How big is it? The course solutions include two different implementations using two significantly different representations. Together they total about 270 lines of code, of which 50 lines are blank. The two implementations are roughly the same size.

What’s the point? Abstract data types put a firewall between an interface and its implementation, so that you can easily change the base of natural numbers, or even the representation, and no program can tell the difference. In Standard ML, the firewall is emplaced through opaque signature ascription, known to a select group of 105 alumni as “the beak.”

Getting started: Representation, abstraction function, and invariant. A natural number should be represented by a sequence of digits. But “sequence of digits” has many representations! You probably want a list of digits, an array of digits, or an algebraic data type like the one from the ML homework. And what counts as a “digit” depends on the base. In this assignment, the choice of base is yours, but to get full credit, you must choose a base that is different from 10.

Document your representation by writing down the abstraction function and invariant:

Put the abstraction function and invariant inside your Natural module, right after the definition of type nat.

Related reading:

Finishing up: Implementing the interface. In file natural.sml, define a structure Natural that implements signature NATURAL. The right way to do it is with code that looks like this:

(* inconvenient code *)
structure Natural :> NATURAL = struct
  ...
  type     nat = ... your definition from part (a) ...,  OR
  datatype nat = ... a datatype is also OK here ...
  (* abstraction function:
       ... *)
  fun invariant ... = ...

  ...
end

But there’s a small problem here: once the module is sealed, you will find it almost impossible to debug. Here’s a trick of the ML masters:

(* convenient code *)
structure ExposedNatural = struct               (* Look!  No ascription *)
  ...
  type     nat = ...     OR
  datatype nat = ... 
  (* abstraction function:
       ... *)
  fun invariant ... = ...

  ...
end

structure Natural :> NATURAL = ExposedNatural   (* the module is sealed here *)

You can debug ExposedNatural, while your client code uses the sealed Natural.

Whichever way you choose to write it, be sure you seal structure Natural with the :> operator, so that no client code can see the representation and violate its invariants. Sealing is especially important if you choose a mutable representation of this immutable abstraction.

Feel free to draw on the course model solutions from the first ML assignment, or from your own solutions—just acknowledge your sources.

Related reading:

  • For the details of addition, subtraction, and conversion, revisit the ML assignment. For multiplication, look at the model solutions.

  • Also from that assignment, revisit the “smart constructor”.

  • The short handout “Mastering Multiprecision Arithmetic” recommends exactly how to implement natural numbers, including detailed suggestions about helper functions that you may find most useful and ideas about what operations to implement first.

  • Read about arithmetic in Appendix B of Programming Languages: Build, Prove, and Compare, which is part of the “Supplement”.

  • If you need a refresher on ML signatures, structures, and functors, consult the ML learning guide.

Guidance for choosing a representation of natural numbers

No one representation is equally good for all operations.

You can pick one representation and stick with it, or you can convert temporarily as needed to facilitate each operation. You can even define a representation as a set of alternatives, as in

datatype nat = ARRAY         of digit array
             | LITTLE_ENDIAN of digit list
             | BIG_ENDIAN    of digit list

A little freedom can be dangerous: if you choose a representation like this one, you risk having to handle at least nine cases per binary operator. But the choice is yours.

If you’d like to use mutable arrays to implement natural numbers, here are some hints based on our experience:

As you complete exercise N, you may want to revisit your representation choices. That’s part of the point—representations can change without affecting any external code.

Guidance for implementing natural-number operations

Choice of representation determines what’s hard and what’s easy. Here are a few notes and hints:

Testing numbers

Testing should begin with natural numbers. To test anything, you need ofInt and decimal, which probably also means sdiv. If you can’t get sdiv working and you’re desperate to write tests, try using base 10 and writing a special-purpose implementation of decimal.

With these parts in place, compute some 20-digit numbers by taking a list of digits and folding with multiply and add. Since you only have to multiply by 10, you can test addition without multiplication. Here’s a function that multiplies by 10, using only addition:

fun tenTimes n =
  let infix 6 /+/
      val p = n
      val p = n /+/ n   (* p ==  2n *)
      val p = p /+/ p   (* p ==  4n *)
      val p = p /+/ n   (* p ==  5n *)
      val p = p /+/ p   (* p == 10n *)
  in  p
  end

Once you are confident that addition works, you can test subtraction of natural numbers by generating a long random sequence, then subtracting the same sequence in which all digits except the most significant are replaced by zero.

You can create more ambitious tests of subtraction by generating random natural numbers and using the algebraic law (m + n) − m = n. You can also check to see that unless n is zero, m − (m + n) raises the Negative exception.

It is harder to test multiplication, but you can at least use repeated addition to test multiplication by small values.

You can also easily test multiplication by large powers of 10.

To create more test cases, you can use the interactive ghci interpreter on the servers. It implements the functional language Haskell, in which large-integer arithmetic is the (sensible) default. Here’s an example:

$ ghci
GHCi, version 8.0.2: http://www.haskell.org/ghc/  :? for help
Prelude> 1234567890 * 987654321 
1219326311126352690
Prelude> 1234567890 * 987654321 * 9999999999
12193263110044200588873647310
Prelude>

To use such big numbers in your own code, you can convert them from strings. Here is a rough sketch of conversion from string to nat:

fun digitOfChar c =
  if Char.isDigit c then Char.ord c - Char.ord #"0"
  else raise Match

val natOfDigit = Natural.ofInt o digitOfChar

val ten  = Natural.ofInt 10
val zero = Natural.ofInt 0

val /*/ = Natural./*/
val /+/ = Natural./+/

infix 7 /*/
infix 6 /+/

fun natOfString s =
  foldl (fn (d, n) => n /*/ ten /+/ natOfDigit d) zero (explode s)

Beyond these simple kinds of tests, skilled engineers don’t write unit tests by hand—we write computer programs that generate unit tests. At any point, you can just generate random expressions that compute with large numbers, then compare your results with a working implementation of bignums. Both MLton and Standard ML of New Jersey (command sml) provide a structure IntInf that implements arbitrary-precision signed integers.

We provide some computer-generated tests for natural numbers. To run the tests, compile everything using compile105-sml, then load them into the interactive system, apply the functor, and run the tests:

$ mosml -P full -I /comp/105/lib
Moscow ML version 2.10-3 (Tufts University, April 2012)
Enter `quit();' to quit.
- load "natural-tests";
> val it = () : unit
- load "natural";
> val it = () : unit
- structure Run = UnitTestsFun(structure Natural = Natural);
> structure Run : {val run : unit -> unit}
- Run.run();
All 72 internal Unit tests passed.
> val it = () : unit

We also provide some generated tests for signed integers, which require something like the following:

- structure B  = BignumFn(structure N = Natural);
- structure BT = UnitTestsFun(structure Bignum = B);
- BT.run();

Programming wrapup: Abstraction functions and invariants

Exercise ADT. Collect representations, abstraction functions, and invariants.

In a new file adt.txt, summarize your design work:

Avoid common mistakes

It’s a surprisingly common mistake to write a BIGNUM implementation of ofInt that doesn’t work with negative inputs.

It’s a common mistake to overlook the equations for short division in the middle of page S22 in Appendix B in the supplement. Those equations do all the heavy lifting.

It’s a common mistake to think you need write code in sdiv to avoid K. No such code needs to be written; If a divisor is too big, an intermediate multiplication will overflow. This is one of those classic situations where we say, “it’s not your code’s responsibility to detect or deal with contract violations.” It was never more true than here.

It’s a mistake to think that K has anything to do with the base of natural numbers. K is a large number that occupies roughly half a machine word. Fortunately, you don’t have to worry about it.

What and how to submit

As soon as you have tackled the reading comprehension, run submit105-sml-solo to submit your answers to the CQ’s.

For the programming exercises, submit the following files:

The ML files that you submit should contain all structure and function definitions that you write for this assignment (including any helper functions that may be necessary), in the order they should be compiled. Do not copy or submit the signatures provided with the assignment; we already have them.

The files you submit must compile with Moscow ML, using the compile105-sml script we give you. We will reject files with syntax errors or type errors. Your files must compile without warning messages.

As soon as you have the files listed above, run submit105-sml-pair to submit a preliminary version of your work. Keep submitting until your work is complete; we grade only the last submission.

Appendices

Appendix I: Two ways to compile Standard ML modules

The Definition of Standard ML does not specify where or how a compiler should look for modules in a filesystem. And each compiler looks in its own idiosyncratic way. You should be able to get away with using compile105-sml, but if something goes wrong, this appendix explains not only what is going on but also how to compile with MLton. (MLton is going to be important when the time comes to test your game.)

Compiling Standard ML modules using Moscow ML

To compile an individual module using Moscow ML, you type

mosmlc -I /comp/105/lib -c -toplevel filename.sml

This puts compiler-interface information into filename.ui and implementation information into filename.uo. Perhaps surprisingly, either a signature or a structure will produce both .ui and .uo files. This behavior is an artifact of the way Moscow ML works; don’t let it alarm you.

If your module depends on another module, you will have to mention the .ui file on the command line as you compile. For example, a BignumFn functor depends on both NATURAL and BIGNUM signatures. If BignumFn is defined in bignum.sml, NATURAL is defined in natural-sig.sml, and BIGNUM is defined in bignum-sig.sml, then to compile BignumFn you run

mosmlc -I /comp/105/lib -toplevel -c natural-sig.ui bignum-sig.ui bignum.sml

The script compile105-sml knows about the files that are assigned for the homework, and in most situations it inserts the .ui references for you.

Calling mosmlc produces a bignum.uo file. The .uo files are good for two things:

Compiling Standard ML to native machine code using MLton

If your games are running too slow, compile them with MLton. MLton is a whole-program compiler that produces optimized native code. Nothing on this assignment really requires native-code speed, but because of its superior error messages, MLton can be worth using anyway.

To use MLton, you list all your modules in an MLB file, and MLton compiles them at one go. If you want to try MLton on your own codes, download file sml-homework.mlb, then compile with

mlton -verbose 1 sml-homework.mlb

You can then run any Unit tests with ./sml-homework.

More information about MLton is available on the man page and at mlton.org.

If you want to try this with the coins game, download files coins.mlb and runcoins.sml, and then compile with

mlton -output coins -verbose 1 coins.mlb

(You can also download and compile sticks.mlb, runsticks.sml, ttt.mlb, and runttt.sml.)

Because MLton requires source code, you will be able to use it only once you have your own AGS.

Appendix II: How your work will be evaluated

Program structure

We’ll be looking for you to seal all your modules. We’ll also be looking for the usual hallmarks of good ML structure.

Exemplary Satisfactory Must improve
Structure

• All modules are sealed using the opaque sealing operator :>

• Code uses basis functions effectively, especially higher-order functions on list and vector types.

• Code has no redundant case analysis

• Code is no larger than is necessary to solve the problem.

• Most modules are sealed using the opaque sealing operator :>

• Code uses the familiar functions, but misses opportunities to use unfamiliar functions like Vector.tabulate.

• Code has one redundant case analysis

• Code is somewhat larger then necessary to solve the problem.

• Only some or no modules are sealed using the opaque sealing operator :>

• A module is defined without ascribing any signature to it (unsealed)

• Code misses opportunities to use map, fold, or other familiar HOFs.

• Code has more than one redundant case analysis

• Code is almost twice as large as necessary to solve the problem.

Performance and correctness

We’ll look to be sure your code meets specifications, and that the performance of your AGS and arithmetic is as good as reasonably possible.

Exemplary Satisfactory Must improve
Performance

• Long computations on natural numbers are basically instantaneous, even when hundreds of decimal digits are involved.

Or, we can do hundreds of operations on natural numbers, including multipliation, on numbers with dozens of decimal digits, in less then a hundred milliseconds.

• The natural-number tests I provide run in less than a hundred milliseconds.

• There is a computation on natural numbers that takes more than a second.


  1. If you copy a signature, sml-lint will complain. Don’t copy signatures.

  2. “Priority queue” is the name of the abstraction. “Heap” is the name of the efficient implementation. It’s confusing.

  3. Sometimes a priority queue is said to correspond to a “bag” of elements or even a set, but a sorted list is simpler.

  4. Words like “subtrahend” and “minuend” are useful, but I always have to look them up.