COMP 105 (Programming Languages): Supplement to Ullman

This document has a few fine points not covered in Ullman.

Opening structures

Ullman sometimes abbreviates by opening structures, e.g., open TextIO. Never do this---it is bad enough to open structures in the standard basis, but if you open other structures, your code will be hopelessly difficulty to maintain. Instead, abbreviate structure names as needed. For example, after structure T = TextIO, you can use T.openIn, etc., without (much) danger of confusion.

Getting record elements

Some students have the idea that a good way to get the second element of a pair p is to write #2 p. This style is not idiomatic or readable. The proper way to handle this is by pattern matching, so
fun first  (x, _) = x
fun second (_, y) = y
is preferred, and not
fun bogus_first  p = #1 p
fun bogus_second p = #2 p
(For reasons I don't want to discuss, but will answer in class if asked, these versions don't even type-check.) If your pair or tuple is not an argument to a function, use val to do the pattern matching:
val (x, y) = lookup_pair mumble
But usually you can include matching in ordinary fun matching.

Points will be deducted on homework for using #1, #2, and their friends.

Optional types

Let's suppose you want to represent a value, except the value might not actually be known. For example, I could represent a grade on a homework by an integer, except if a grade hasn't been submitted. Or the contents of a square on a chessboard is a piece, except the square might be empty. This problem comes up so often that the initial basis for ML has a special type constructor called option, which lets you handle it. The definition of option is
datatype 'a option = NONE | SOME of 'a
and it is already defined when you start the interactive system. You need not and should not define it yourself.

Some examples

- datatype chesspiece = K | Q | R | N | B | P
- type square = chesspiece option
- val empty : square = NONE
- val lower_left : square = SOME R
- fun play piece = SOME piece : square;
> val play = fn : chesspiece -> chesspiece option

- SOME true; 
> val it = SOME true : bool option
- SOME 37;
> val it = SOME 37 : int option

- SOME "fish" = SOME "fowl";
> val it = false : bool
- SOME "fish" = NONE;
> val it = false : bool
- "fish" = NONE;
! Toplevel input:
! "fish" = NONE;
!          ^^^^
! Type clash: expression of type
!   'a option
! cannot be made to have type
!   string

The option type is covered in Ullman on pages 111-113, 208, etc.


Although Ullman describes the mutable Array structure in Chapter 7, he doesn't cover the immutable Vector structure except for a couple of pages deep in Chapter 9. Like an array, a vector offers constant-time access to an array of elements, but a vector is not mutable. Because of its immutability, Vector is often preferred. It is especially flexible when initialized with Vector.tabulate. Here's the signature:
signature VECTOR =
    eqtype 'a vector
    val maxLen : int
    val fromList : 'a list -> 'a vector
    val tabulate : int * (int -> 'a) -> 'a vector
    val length : 'a vector -> int
    val sub : 'a vector * int -> 'a
    val update : 'a vector * int * 'a -> 'a vector
    val concat : 'a vector list -> 'a vector
    val appi : (int * 'a -> unit) -> 'a vector -> unit
    val app : ('a -> unit) -> 'a vector -> unit
    val mapi : (int * 'a -> 'b) -> 'a vector -> 'b vector
    val map : ('a -> 'b) -> 'a vector -> 'b vector
    val foldli : (int * 'a * 'b -> 'b) -> 'b -> 'a vector -> 'b
    val foldri : (int * 'a * 'b -> 'b) -> 'b -> 'a vector -> 'b
    val foldl : ('a * 'b -> 'b) -> 'b -> 'a vector -> 'b
    val foldr : ('a * 'b -> 'b) -> 'b -> 'a vector -> 'b
    val findi : (int * 'a -> bool) -> 'a vector -> (int * 'a) option
    val find : ('a -> bool) -> 'a vector -> 'a option
    val exists : ('a -> bool) -> 'a vector -> bool
    val all : ('a -> bool) -> 'a vector -> bool
    val collate : ('a * 'a -> order) -> 'a vector * 'a vector -> order


It's easy to be confused about when you need parentheses. Here's a checklist to tell you when you need parentheses around an expression or a pattern:
  1. Is it an argument to a (possibly Curried) function, and if so, is it more than a single token?
  2. Is it an infix expression that has to be parenthesized because the precedence of another infix operator would do the wrong thing otherwise?
  3. Are you forming a tuple?
  4. Are you parenthesizing an expression involving fn, case, or handle?
If the answer to any of these questions is yes, use parentheses. Otherwise, you almost certainly don't need them---so get rid of them!


Ullman's style is less than ideal. Here are some short recommendations.