COMP 105: Supplement to Ullman

This document has a few fine points not covered in Ullman.

Opening structures

Ullman sometimes abbreviates by opening structures, e.g., open TextIO. Never do this---it is bad enough to open structures in the standard basis, but if you open other structures, your code will be hopelessly difficulty to maintain. Instead, abbreviate structure names as needed. For example, after structure T = TextIO, you can use T.openIn, etc., without (much) danger of confusion.

Getting record elements

Some students have the idea that a good way to get the second element of a pair p is to write #2 p. This style is not idiomatic or readable. The proper way to handle this is by pattern matching, so

  fun first  (x, _) = x
  fun second (_, y) = y

is preferred, and not

  fun bogus_first  p = #1 p
  fun bogus_second p = #2 p

(For reasons I don't want to discuss, but will answer in class if asked, these versions don't even type-check.) If your pair or tuple is not an argument to a function, use val to do the pattern matching:

  val (x, y) = lookup_pair mumble

But usually you can include matching in ordinary fun matching.

Points will be deducted on homework for using #1, #2, and their friends.

Optional types

Let's suppose you want to represent a value, except the value might not actually be known. For example, I could represent a grade on a homework by an integer, except if a grade hasn't been submitted. Or the contents of a square on a chessboard is a piece, except the square might be empty. This problem comes up so often that the initial basis for ML has a special type constructor called option, which lets you handle it. The definition of option is

  datatype 'a option = NONE | SOME of 'a

and it is already defined when you start the interactive system. You need not and should not define it yourself.

Some examples

  - datatype chesspiece = K | Q | R | N | B | P
  - type square = chesspiece option
  - val empty : square = NONE
  - val lower_left : square = SOME R
  - fun play piece = SOME piece : square;
  > val play = fn : chesspiece -> chesspiece option

  - SOME true; 
  > val it = SOME true : bool option
  - SOME 37;
  > val it = SOME 37 : int option

  - SOME "fish" = SOME "fowl";
  > val it = false : bool
  - SOME "fish" = NONE;
  > val it = false : bool
  - "fish" = NONE;
  ! Toplevel input:
  ! "fish" = NONE;
  !          ^^^^
  ! Type clash: expression of type
  !   'a option
  ! cannot be made to have type
  !   string

The option type is covered in Ullman on pages 111-113, 208, etc.


Although Ullman describes the mutable Array structure in Chapter 7, he doesn't cover the immutable Vector structure except for a couple of pages deep in Chapter 9. Like an array, a vector offers constant-time access to an array of elements, but a vector is not mutable. Because of its immutability, Vector is often preferred. It is especially flexible when initialized with Vector.tabulate. Here's the signature:

  signature VECTOR =
      eqtype 'a vector
      val maxLen : int
      val fromList : 'a list -> 'a vector
      val tabulate : int * (int -> 'a) -> 'a vector
      val length : 'a vector -> int
      val sub : 'a vector * int -> 'a
      val update : 'a vector * int * 'a -> 'a vector
      val concat : 'a vector list -> 'a vector
      val appi : (int * 'a -> unit) -> 'a vector -> unit
      val app : ('a -> unit) -> 'a vector -> unit
      val mapi : (int * 'a -> 'b) -> 'a vector -> 'b vector
      val map : ('a -> 'b) -> 'a vector -> 'b vector
      val foldli : (int * 'a * 'b -> 'b) -> 'b -> 'a vector -> 'b
      val foldri : (int * 'a * 'b -> 'b) -> 'b -> 'a vector -> 'b
      val foldl : ('a * 'b -> 'b) -> 'b -> 'a vector -> 'b
      val foldr : ('a * 'b -> 'b) -> 'b -> 'a vector -> 'b
      val findi : (int * 'a -> bool) -> 'a vector -> (int * 'a) option
      val find : ('a -> bool) -> 'a vector -> 'a option
      val exists : ('a -> bool) -> 'a vector -> bool
      val all : ('a -> bool) -> 'a vector -> bool
      val collate : ('a * 'a -> order) -> 'a vector * 'a vector -> order


It's easy to be confused about when you need parentheses. Here's a checklist to tell you when you need parentheses around an expression or a pattern:

  1. Is it an argument to a (possibly Curried) function, and if so, is it more than a single token?
  2. Is it an infix expression that has to be parenthesized because the precedence of another infix operator would do the wrong thing otherwise?
  3. Are you forming a tuple?
  4. Are you parenthesizing an expression involving fn, case, or handle?

If the answer to any of these questions is yes, use parentheses. Otherwise, you almost certainly don't need them---so get rid of them!


Ullman's style is less than ideal. Here are some short recommendations.

Back to the class home page