COMP 105: Supplement to Ullman

This document has a few fine points not covered in Ullman.

Opening structures
Getting record elements
Optional types
Vectors
Parentheses

Opening structures

Ullman sometimes abbreviates by opening structures, e.g., open TextIO. Never do this---it is bad enough to open structures in the standard basis, but if you open other structures, your code will be hopelessly difficulty to maintain. Instead, abbreviate structure names as needed. For example, after structure T = TextIO, you can use T.openIn, etc., without (much) danger of confusion.

Getting record elements

Some students have the idea that a good way to get the second element of a pair p is to write #2 p. This style is not idiomatic or readable. The proper way to handle this is by pattern matching, so

  fun first  (x, _) = x
  fun second (_, y) = y

is preferred, and not

  fun bogus_first  p = #1 p
  fun bogus_second p = #2 p

(For reasons I don't want to discuss, but will answer in class if asked, these versions don't even type-check.) If your pair or tuple is not an argument to a function, use val to do the pattern matching:

  val (x, y) = lookup_pair mumble

But usually you can include matching in ordinary fun matching.

Points will be deducted on homework for using #1, #2, and their friends.

Optional types

Let's suppose you want to represent a value, except the value might not actually be known. For example, I could represent a grade on a homework by an integer, except if a grade hasn't been submitted. Or the contents of a square on a chessboard is a piece, except the square might be empty. This problem comes up so often that the initial basis for ML has a special type constructor called option, which lets you handle it. The definition of option is

  datatype 'a option = NONE | SOME of 'a

and it is already defined when you start the interactive system. You need not and should not define it yourself.

Some examples

  - datatype chesspiece = K | Q | R | N | B | P
  - type square = chesspiece option
  - val empty : square = NONE
  - val lower_left : square = SOME R
  - fun play piece = SOME piece : square;
  > val play = fn : chesspiece -> chesspiece option

  - SOME true; 
  > val it = SOME true : bool option
  - SOME 37;
  > val it = SOME 37 : int option

  - SOME "fish" = SOME "fowl";
  > val it = false : bool
  - SOME "fish" = NONE;
  > val it = false : bool
  - "fish" = NONE;
  ! Toplevel input:
  ! "fish" = NONE;
  !          ^^^^
  ! Type clash: expression of type
  !   'a option
  ! cannot be made to have type
  !   string

The option type is covered in Ullman on pages 111-113, 208, etc.

Vectors

Although Ullman describes the mutable Array structure in Chapter 7, he doesn't cover the immutable Vector structure except for a couple of pages deep in Chapter 9. Like an array, a vector offers constant-time access to an array of elements, but a vector is not mutable. Because of its immutability, Vector is often preferred. It is especially flexible when initialized with Vector.tabulate. Here's the signature:

  signature VECTOR =
    sig
      eqtype 'a vector
      val maxLen : int
      val fromList : 'a list -> 'a vector
      val tabulate : int * (int -> 'a) -> 'a vector
      val length : 'a vector -> int
      val sub : 'a vector * int -> 'a
      val update : 'a vector * int * 'a -> 'a vector
      val concat : 'a vector list -> 'a vector
      val appi : (int * 'a -> unit) -> 'a vector -> unit
      val app : ('a -> unit) -> 'a vector -> unit
      val mapi : (int * 'a -> 'b) -> 'a vector -> 'b vector
      val map : ('a -> 'b) -> 'a vector -> 'b vector
      val foldli : (int * 'a * 'b -> 'b) -> 'b -> 'a vector -> 'b
      val foldri : (int * 'a * 'b -> 'b) -> 'b -> 'a vector -> 'b
      val foldl : ('a * 'b -> 'b) -> 'b -> 'a vector -> 'b
      val foldr : ('a * 'b -> 'b) -> 'b -> 'a vector -> 'b
      val findi : (int * 'a -> bool) -> 'a vector -> (int * 'a) option
      val find : ('a -> bool) -> 'a vector -> 'a option
      val exists : ('a -> bool) -> 'a vector -> bool
      val all : ('a -> bool) -> 'a vector -> bool
      val collate : ('a * 'a -> order) -> 'a vector * 'a vector -> order
    end

Parentheses

It's easy to be confused about when you need parentheses. Here's a checklist to tell you when you need parentheses around an expression or a pattern:

Is it an argument to a (possibly Curried) function, and if so, is it more than a single token?
Is it an infix expression that has to be parenthesized because the precedence of another infix operator would do the wrong thing otherwise?
Are you forming a tuple?
Are you parenthesizing an expression involving fn, case, or handle?

If the answer to any of these questions is yes, use parentheses. Otherwise, you almost certainly don't need them---so get rid of them!

Style

Ullman's style is less than ideal. Here are some short recommendations.

Never write x = nil. Either use null (that's why it's in the initial basis) or use pattern matching.
You need the semicolon at interactive toplevel, but it should almost never appear in your code. Don't use a semicolon unless you are deliberately sequencing imperative code. Ullman's book is full of unnecessary semicolons.
As per the checklist above, never parenthesize the condition in an if expression.

Back to the class home page