Lecture notes for COMP 105 (Programming Languages)

Table of Contents

21 Jan 2011: Introduction to Comp 105

p1210000 p1214432

Why so many languages?

Topic of study: the stuff software is made of

Conclusion: make it easier to write programs that really work

Reading: Paul Graham, especially the "Blub Paradox"

What do the professionals say?

In public: a language should express computations

So, no more uncivilized low-level C, and no more C++ that no mortal being can understand

In the pub: professionals are keenly interested in their code

Your language can make you or break you.

What can you get out of Comp 105?

Bonus: preparation for advanced study

How will it work?

Intellectual tools you need to understand, evaluate, and use languages effectively

Notations of the trade (source of precision, further study)

Learn by doing:

(Just as intellectually challenging as COMP 40, but in an entirely different direction.

Great languages begin with great features

Whorf's hypothesis applies

There aren't that many great features, and you will see them over and over

You'll choose features, and therefore languages, to fit your needs

Some are more powerful than others

The Blub paradox: habit blinds us to power

Blub looks good enough only if you think in Blub.

In Comp 105,

OK, what are the great ideas?

To say more with less (expressive power):

To promote reliability and reuse:

Describing it all precisely:

Course logistics and administration

Three lectures per week: 50% more cracks at tough ideas

No cadre of teaching assistants: material is new to everyone except me and Professor Guyer

Instead, small-group section once a week led by Engineering Fellow:

Syllabus and schedule are works in progress:

You must get my book

Homework will be frequent and challenging:

Both individual and pair work:

Other policies and procedures in handout and on web - You are responsible!

Prerequisites and Rough syllabus

This topic marries programming and mathematics.
If you're going to enjoy the course,

24 Jan 2011: Abstract syntax and environments

p1244433 p1244434 p1244435 p1244436 p1244437 p1244438 p1244440 p1244441 p1244442 p1244443

Forms and translation of programs

Each of these is a program:

Transition: lexer/parser/lowering/optimization/codegen/assembler/linker

There is one true representation (and that's the syntax)

Supporting cast:

Introducing our common framework

Goal: make comparisons easy by eliminating superficial differences

No new language ideas. Instead, imperative programming with an IMPerative CORE:

Idea of LISP syntax:

Two syntactic categories: definitions, expressions (no statements!)

An Impcore program is a sequence of definitions [possible demo here]

(define mod (m n) (- m (* n (/ m n))))

Compare

int mod (int m, int n) { 
  return m - n * (m / n); 
}

Or

(val n 99)

Compare

int n = 99;

Also, expressions at top level (definition of it)

No statements means expression-oriented:

(if e1 e2 e3)
(while e1 e2)
(set x e)
(begin e1 ... en)
(f e1 ... en)

Each one has a value and may have side effects!

Functions may be primitive (+ - * / = < > print)
or defined with (define f ...).

The only type of data is "machine integer" (deliberate oversimplification)

Scoping rules for Impcore

Scopes also called "names spaces"; we will call them "environments" because that's the pointy-headed theory term---and if you want to read some of the exciting theory papers, pointy-headed theory has to be second nature.

Abuse of separate name spaces: [online demo if possible]

-> (val f 33)
33
-> (define f (x) (+ x x))
f
-> (f f)
66

Concrete syntax for Impcore

def ::= (val x exp)
     |  exp
     |  (define f (formals) e)
     |  (use filename)

exp ::= integer-literal
     |  variable-name
     |  (set x exp)
     |  (if exp1 exp2 exp3)
     |  (while exp1 exp2)
     |  (begin exp1 ... expn)
     |  (op exp1 ... expn)  

op ::= function-name | primitive-name

Abstract syntax and its representation in C

Exp = LITERAL (Value)
    | VAR     (Name)
    | SET     (Name name, Exp exp)
    | IFX     (Exp cond, Exp true, Exp false)
    | WHILEX  (Exp cond, Exp exp)
    | BEGIN   (Explist)
    | APPLY   (Name name, Explist actuals)

Notice that there is one kind of "application" for both user-defined and primitive functions.

C code:

typedef struct Exp *Exp;
typedef enum {
  LITERAL, VAR, SET, IFX, WHILEX, BEGIN, APPLY
} Expalt;        /* which alternative is it? */

struct Exp {  // only two fields: 'alt' and 'u'!
    Expalt alt;
    union {
        Value literal;
        Name var;
        struct { Name name; Exp exp; } set;
        struct { Exp cond; Exp true; Exp false; } ifx;
        struct { Exp cond; Exp exp; } whilex;
        Explist begin;
        struct { Name name; Explist actuals; } apply;
    } u;
};

Example AST for (f x (* y 3)) (using Explist)

Example Ast for (define abs (x) (if (< x 0) (- 0 x) x)) (using Namelist)

Trick question:

What's the value of (* y 3)? OK, what's its meaning?

Meanings, part I: Names

Environment associates each variable with one value

Greek letter rho, mapping { x1 |--> n1, ..., xk -> nk }

Environment is a finite map, aka partial function (theorists love functions):

x in dom rho // x is defined in environment rho rho(x) // the value of x in environment rho

Representing environments in C

First, an abstract type. (For 40 vets: like a Hanson Table_T, only simpler because it's monomorphic.)

typedef struct Valenv *Valenv;

Valenv mkValenv(Namelist vars, Valuelist vals);
int isvalbound(Name name, Valenv env);
Value fetchval(Name name, Valenv env);
void bindval(Name name, Value val, Valenv env);

Implementation uses pair of lists. Consider

(val x 1)
(val y 2)
(val z 3)

Global environment xi says

 nl -> z -> y -> x
 vl -> 3 -> 2 -> 1

Representing environments efficiently a key challenge for language implementor. (You'll tackle Exercise 18.)

Bonus content: inductively defined types

Let's simplify to

exp ::= numeral
     | (op exp exp)
op  :: + | - | * | /

There's no such thing as a "recursive definition." Instead there are only recursion equations and solutions.

Consider

E = Z union { op | e1 in E, e2 in E, op in { +, -, *, / } } / \ e1 e2

What is the intersection of all sets satisfying this equation?

Tricky bit: proving the intersection is not empty

Solution is called an "initial algebra" and we won't go there.

Key fact: in any expression, child expressions are smaller

Therefore, recursive functions on expressions terminate
(provided they make recursive calls on child expressions).

This is called structural recursion.

Structural recursion is the simplest known argument for

a. Writing recursive functions

b. Proving they terminate

Preview: meaning of expressions

An expression is evaluated wrt environment (composition of xi, phi, rho)

In our interpreters, implemented by structural recursion

Base cases:

26 Jan 2011: Operational semantics

p1264444 p1264445 p1264446 p1264447 p1264448 p1264449 p1264450 p1264451 p1264452 p1264453 p1264454

(Guest lecture by Geoff Mainland)

Geoff's notes for his first lecture

28 Jan 2011: Using operational semantics in proofs

p1284455 p1284456 p1284457 p1284458 p1284459 p1284460 p1284461 p1284462 p1284463

(Guest lecture by Geoff Mainland)

Geoff's notes for his second lecture

31 Jan 2011: Semantics and Impcore review

p1314464 p1314465 p1314466 p1314467 p1314468 p1314469 p1314470 p1314471 p1314472 p1314473 p1314474 p1314476 p1314477 p1314478 p1314479 p1314480 p1314481 p1314482 p1314483

Semantics review

Questions

Example: what changes the environment?

4 Feb 2011: Impcore final review; Introduction to Scheme

p2044484 p2044485 p2044486 p2044487 p2044488 p2044489 p2044490 p2044491 p2044492 p2044493 p2044494 p2044495 p2044496 p2044497 p2044498 p2044499

Time in class is the tip of the iceberg

It's your job to make connections:

Each data type leads to a coding pattern

Homework review: from equational reasoning to computation

Metatheory gives interesting facts with easy proofs.

What is the inductive definition of the natural numbers?

To discover recursive functions, write algebraic laws:

exp x 0 = 1
exp x (n + 1) = x * (exp x n)

Can you find a direction in which something gets smaller?

Another example:

binary 0 = 0
binary (2n + 1) = binary n .. 1
binary (2n + 0) = binary n .. 0

Right-hand sides in arithmetic?

binary (2n + 1) = 10 * binary n + 1
binary (2n + 0) = 10 * binary n + 0

Which direction gets smaller? What's the code?

Common syntactic categories you find around the house every day

Expression, statement, definition, declaration, type

Impcore final review

Things to notice about impcore

Lots of environments:

Typical complexity!

The Five Questions

About any language, you can ask these questions:

  1. What is the abstract syntax? What are the syntactic categories, and what are the terms in each category?

  2. What are the values? What do expressions/terms evaluate to?

  3. What environments are there? That is, what can names stand for?

  4. How are terms evaluated? What are the judgments? (What are the rules?)

  5. What's in the initial basis? Primitives and otherwise, what is built in?

Introduction to Scheme

Two new kinds of data:

(The setting is Scheme)

Scheme, child of Lisp

LISP, circa 1960; Scheme, circa 1975

Values of Scheme

Values are S-expressions.

Comforting lies:

An S-expression is a symbol or a literal integer or a literal Boolean or a list of S-expressions.

A list of S-expressions is either - the empty list '() - an S-expression followed by a list of S-expressions

Like any other abstract data type

N.B. creators + producers = constructors

No mutators in uScheme

Focus on lists

Creators and producers: nil and cons

'()

(cons S' S)

More examples:

(cons 'a '())
(cons 'a '(b))

Note the literal '(b).

Observers: car, cdr, null?, pair? (also known as "first" and "rest", "head" and "tail", and many other names)

Algebraic laws of lists:

(null? '()) == ...
(null? (cons v vs)) == ...
(car (cons v vs)) == ...
(cdr (cons v vs)) == ...

Do these laws hold more generally?

(car (cons e1 e2)) = ...

What about

S-Expressions more broadly

More S-expression observers:

number?
symbol?
symbol literal symbol? ---
integer literal number? ---
cons pair? car, cdr
'() null? ---

More observers: <, =, >

Equality: number, symbol, Boolean, nil---never cons cells

7 February 2011: The rule of algebraic law; the method of accumulating parameters

p2074503 p2074504 p2074505 p2074506 p2074507 p2074508 p2074509 p2074510 p2074511 p2074512 p2074513 p2074514 p2074515 p2074516 p2074517 p2074518 p2074519 p2074520

Recursive functions for recursive types

Inductively defined data type: small set satisfying this equation:

Inductive definition of lists:
TeX slide

Any list is therefore constructed with nil or with cons.

Example: length

Polymorphic equality testing

Polymorphic means "operating on values of many types"

The primitive equal? is polymorphic but returns false on cons cells
It is useful only for comparing atoms

          *** Atoms and polymorphic equality ***
(define atom? (x)
  (or (number? x) 
  (or (symbol? x) 
  (or (boolean? x) 
      (null? x)))))

(define equal? (s1 s2)
  (if (or (atom? s1) (atom? s2))
    (= s1 s2)
    (and (equal? (car s1) (car s2)) 
         (equal? (cdr s1) (cdr s2)))))

The major cost center is cons

Allocation is the big cost.

Algebraic law for append, by cases, what is xs ys?

'() .. ys == ys
(x .. xs) .. ys == x .. (xs .. ys)
(cons x xs) .. ys = (cons x (xs .. ys))

From there to

          *** Equations and function for append ***
(append '()         ys) == ys
(append (cons z zs) ys) == (cons z (append zs ys))

(define append (xs ys)
  (if (null? xs) ys
      (cons (car xs) (append (cdr xs) ys))))

Why does it terminate?
How many cons cells are allocated?

(N.B. You won't find xs in the book; that update missed the print deadline. The book calls a list l, which is not as good.)

Other laws

(append (list1 x) ys) = ???
(if p #t #f) = ???

What about list reversal?

reverse '() = '()
reverse (x .. xs) = reverse xs .. reverse x = reverse xs .. x

And the code?

          *** Naive list reversal ***
(define reverse (xs)
   (if (null? xs) '()
       (append (reverse (cdr xs))
               (list1 (car xs)))))

How many cons cells are allocated?

The method of accumulating parameters

Let's try a new algebraic law:

reverse (x .. xs) .. zs = reverse xs .. x .. zs = reverse xs .. (cons x zs)
reverse '() .. zs       = zs

The code

          *** Reversal by accumulating parameters ***
(define revapp (xs zs)
   (if (null? xs) zs
       (revapp (cdr xs) (cons (car xs) zs))))

(define reverse (xs) (revapp xs '()))

Parameter zs is the accumulating parameter.
(A powerful, general technique.)

Assocation lists represent finite maps (not to be covered in class)

Implementation: list of key-value pairs

'((k1 v1) (k2 v2) ... (kn vn))

Picture with spine of cons cells, car, cdar, caar, cadar.

          *** A-list observer: find ***
(define caar  (xs) (car (car  xs)))
(define cdar  (xs) (cdr (car  xs)))
(define cadar (xs) (car (cdar xs)))
(define find (x alist)
  (if (null? alist) '()
    (if (equal? x (caar alist))
      (cadar alist)
      (find x (cdr alist)))))



          *** A-list producer: bind ***
(define bind (x y alist)
  (if (null? alist)
    (list1 (list2 x y))
    (if (= x (caar alist))
      (cons (list2 x y) (cdr alist))
      (cons (car alist) 
                (bind x y (cdr alist))))))


          *** A-list example ***
    -> (find 'Room '((Course 105) (Room H111A) 
                              (Instructor Ramsey)))
    H111A
    -> (val nr (bind 'Office 'Extension-006 
               (bind 'Courses '(105)
               (bind 'Email 'comp105-staff '()))))
    ((Email comp105-staff) (Courses (105)) (Office Extension-006))
    -> (find 'Office nr) 
    Extension-006
    -> (find 'Favorite-food nr)
    ()

Notes:

Algebraic laws of association lists

          *** Laws of assocation lists ***
(find k (bind k v l)) = v
(find k (bind k' v l)) = (find k l), provided k != k'
(find k '()) =  '() --- bogus!

9 February 2011: Features of a functional language

p2094521 p2094522 p2094523 p2094524 p2094525 p2094526 p2094527 p2094528 p2094529 p2094530 p2094531 p2094532 p2094533 p2094534 p2094535

Prologue

Depth, not breadth.

Today:

  1. Homework review
  2. First-class, higher-order functions (definitely)
  3. Classic higher-order functions that consume lists (probably not)

Homework

Notes:

Example: undeclared variables:

 x notin dom xi      x notin dom rho
---------------------------------------
<VAR(x), xi, phi, rho>  evals  < ???? >

Handy new feature of Scheme: let binding

          *** Introduce local names into environment ***
(let ((x1 e1)
      ...
      (xn en))
    e)

Evaluate e1 through en, bind answers to x1, ... xn

Also let* (one at a time) and letrec (local recursive functions)

Note that we really have definititions and it might be easier to read if McCarthy had actually used definition syntax, which you'll see in ML, Haskell, and other functional languages:

          *** What McCarthy should have done ***
(let ((val x1 e1)
      ...
      (val xn en))
   e)

From Impcore to uScheme

Things that should offend you about Impcore:

All these problems have one solution: lambda

Anonymous, first-class functions

From Church's lambda-calculus:

(lambda (x) (+ x x))

"The function that maps x to x plus x"

At top level, like define. (Or more accurately, define is a synonym for lambda.)

In general, \x.E or (lambda (x) E)

The ability to "capture" free variables is what makes it interesting:

(lambda (x) (+ x y))  ; means what??

First-class, nested functions

What matters is that y can be a parameter or let-bound variable of an enclosing functions.

First example:

          *** A zero-finder ***
(define findzero-between (f lo hi)
   ; binary search
   (if (>= (+ lo 1) hi)
       hi
       (let ((mid (/ (+ lo hi) 2)))
          (if (< (f mid) 0)
              (findzero-between f mid hi)
              (findzero-between f lo mid)))))
(define findzero (f) (findzero-between f 0 100))

findzero-between is a higher-order function because it takes another function as an argument. You can do this in C.

          *** Function escapes! ***
-> (define to-the-n-minus-k (n k)
      (let
        ((x-to-the-n-minus-k (lambda (x) 
                                (- (exp x n) k))))
        x-to-the-n-minus-k))
-> (val x-cubed-minus-27 (to-the-n-minus-k 3 27))
-> (x-cubed-minus-27 2)
-19

to-the-n-minus-k is a higher-order function because it returns another (escaping) function as a result. Much more interesting!

          *** No need to name the escaping function ***
-> (define to-the-n-minus-k (n k)
      (lambda (x) (- (exp x n) k)))



-> (val x-cubed-minus-27 (to-the-n-minus-k 3 27))
-> (x-cubed-minus-27 2)
-19


          *** The cube root of 27 and the square root of 16 ***
-> (findzero (to-the-n-minus-k 3 27))                                    
3
-> (findzero (to-the-n-minus-k 2 16))
4


          *** What's an "escaping" function? ***
-> (define to-the-n-minus-k (n k)
      (lambda (x) (- (exp x n) k)))

"Escape" means "outlive the activation in which the lambda was evaluated."

Escaping and lifetimes are some of those universal decisions every programmer has to think about.

Where are n and k stored???

C programmers do this explicitly with malloc

In a language with first-class, nested functions, part of the semantics of lambda

Closures:
TeX slide

Operational semantics of closures:
TeX slide

Higher-order functions!

Preview: in math, what is the following equal to?

(f o g)(x) == ???

Another algebraic law, another function:

          *** Functions create new functions ***
-> (define o (f g) (lambda (x) (f (g x))))
-> (define even? (n) (= 0 (mod n 2)))
-> (val odd? (o not even?))
-> (odd? 3)
-> (odd? 4)

11 Feb 2011: More higher-order functions

p2114536 p2114537 p2114538 p2114539 p2114540 p2114541 p2114542 p2114543 p2114544 p2114545 p2114546 p2114547 p2114548 p2114549 p2114550 p2114551 p2114552 p2114553 p2114554 p2114555

Plan for today:

  1. Preparation for some particular homework problems

  2. An important example of functions creating functions: Currying (and its partner, partial application)

  3. Start (but don't finish) HOFs on lists

Homework preparation

Beyond lists

Imposing your will on S-expressions

Board:

LIST(A) = UNIT  u  A * LIST(A)
ALIST(K, V) = LIST( { (list2 k v) | k in K, v in V } )

Not every S-expression is a list:
TeX slide

Rose trees ($k$-ary trees):
TeX slide

Board:

ROSE(A) = A   u   LIST(ROSE(a))

Your job:

Association lists and sets of assocation lists

On your own:

Sets represented as Boolean functions

Q: What can you do if you write a function that consumes a function?

A: apply it!

You can't mutate or modify a function, but you can create as many new ones as you like:

          *** SLIDE ***
-> (define o (f g) (lambda (x) (f (g x))))

Return of the higher-order function

          *** Classic functional technique: Currying ***
-> (val positive? (lambda (y) (< 0 y)))
-> (positive? 3)
-> (val <-curried (lambda (x) (lambda (y) (< x y))))
-> (val positive? (<-curried 0)) ; "partial application"
-> (positive? 0)

No need to Curry by hand!:
TeX slide

What's the algebraic law for curry?

 ...   (curry f) ...    =  ... f ...

Keeping in mind all you can do with a function is apply it?

Bonus content: Lambda as an abstraction barrier

          *** Bonus content: vulnerable variables? ***
-> (val seed 1)
-> (val rand (lambda ()
      (set seed (mod (+ (* seed 9) 5) 1024)))))
-> (rand)
14
-> (rand)
131
-> (set seed 1)
1
-> (rand)
14


          *** Bonus: Lambda as abstraction barrier! ***
-> (val mk-rand (lambda (seed)
     (lambda ()
       (set seed (mod (+ (* seed 9) 5) 1024))))))
-> (val rand (mk-rand 1))
-> (rand)
14
-> (rand)
131
-> (set seed 1)
error: set unbound variable seed
-> (rand)
160

Standard higher-order functions on lists

Capture common patterns of computation

Folds also called reduce, accum, "catamorphism"

List search: exists?

Algebraic law:

(exists? p? '())          == ???
(exixts? p? '(cons a as)) == ???

          *** SLIDE ***
-> (define exists? (p? xs)
      (if (null? xs)
          (if (p? (car xs)) 
              (exists? p? (cdr xs)))))
-> (exists? pair? '(1 2 3))
-> (exists? pair? '(1 2 (3)))
-> (exists? ((curry =) 0) '(1 2 3))
-> (exists? ((curry =) 0) '(0 1 2 3))

List selection: filter

Problem: Give me a list of numbers; return only the even elements.

What are the laws?

(filter even? '())          == ???
(filter even? '(cons a as)) == ???
(filter even? '(cons a as)) == ???


          *** SLIDE ***
-> (define filter (p? xs)
     (if (null? xs)
       '()
       (if (p? (car xs))
         (cons (car xs) (filter p? (cdr xs)))
         (filter p? (cdr xs)))))
-> (filter (lambda (n) (>  n 0)) '(1 2 -3 -4 5 6))
(1 2 5 6)
-> (filter (lambda (n) (<= n 0)) '(1 2 -3 -4 5 6))
(-3 -4)
-> (filter ((curry <)  0) '(1 2 -3 -4 5 6))
(1 2 5 6)
-> (filter ((curry >=) 0) '(1 2 -3 -4 5 6))
(-3 -4)




          *** List filtering: composition revisited ***
-> (val positive? ((curry <) 0))
<procedure>
-> (filter positive?         '(1 2 -3 -4 5 6))
(1 2 5 6)
-> (filter (o not positive?) '(1 2 -3 -4 5 6))
(-3 -4)

"Lifting" functions to lists: map

Problem: square every element of a list.

What are the laws?

(map square '())         ==
(map square (cons a as)) ==

          *** SLIDE ***
-> (define map (f xs)
     (if (null? xs)
       '()
       (cons (f (car xs)) (map f (cdr xs)))))
-> (map number? '(3 a b (5 6)))
(#t #f #f #f)
-> (map ((curry *) 100) '(5 6 7))
(500 600 700)
-> (val square* ((curry map) (lambda (n) (* n n))))
<procedure>
-> (square* '(1 2 3 4 5))
(1 4 9 16 25)

The universal list function: fold

TeX slide

Another view of operator folding:
TeX slide

foldr is a catamorphism

Another catamorphism: foldl associates to left

14 Feb 2011: Continuations - what to do next

p2144556 p2144557 p2144558 p2144559 p2144560 p2144561 p2144562 p2144563 p2144564 p2144565 p2144566 p2144567 p2144568 p2144569 p2144570 p2144571 p2144572 p2144573 p2144574 p2144575

Higher-order functions on lists, continued

Review

          *** Existence, by laws ***
(exists? p? '())          == #f
(exists? p? '(cons a as)) == (p? a) \/ (exists p? as)


          *** Filter, by laws ***
(filter p? '())          == '()
(filter p? '(cons a as)) == (cons a (filter p? as)), 
                                if (p? a)
(filter p? '(cons a as)) == (filter p? as), 
                                otherwise


          *** Filter, by code ***
-> (define filter (p? xs)
     (if (null? xs)
       '()
       (if (p? (car xs))
         (cons (car xs) (filter p? (cdr xs)))
         (filter p? (cdr xs)))))
-> (filter ((curry <)  0) '(1 2 -3 -4 5 6))
(1 2 5 6)
-> (filter ((curry >=) 0) '(1 2 -3 -4 5 6))
(-3 -4)

What have we got:

What we haven't got:

"Lifting" functions to lists: map

Problem: square every element of a list.

What are the laws?

(map square '())         ==
(map square (cons a as)) ==

          *** SLIDE ***
-> (define map (f xs)
     (if (null? xs)
       '()
       (cons (f (car xs)) (map f (cdr xs)))))
-> (map number? '(3 a b (5 6)))
(#t #f #f #f)
-> (map ((curry *) 100) '(5 6 7))
(500 600 700)
-> (val square* ((curry map) (lambda (n) (* n n))))
<procedure>
-> (square* '(1 2 3 4 5))
(1 4 9 16 25)

The universal list function: fold

TeX slide

* SLIDE* -> (define foldr (plus zero xs) (if (null? xs) zero (plus (car xs) (foldr plus zero (cdr xs))))) -> (val sum (lambda (xs) (foldr + 0 xs))) -> (val prod (lambda (xs) (foldr * 1 xs))) -> (sum '(1 2 3 4)) 10 -> (prod '(1 2 3 4)) 24

Another view of operator folding:
TeX slide

foldr is a catamorphism

Another catamorphism: foldl associates to left

Continuations

Direct style - last action is to return a value

Continuation-passing style - last action is to "throw" value to a continuation

Simulate with an ordinary tail call.

How functions finish:
TeX slide

Motivating example: from existence to witness

Problem in interface design:
TeX slide

Ideas?

Bad choices:

Good choice:

Solution: new interface:
TeX slide

          *** Coding \lit{witness} with continuations ***
(define witness-cps (p? xs succ fail)
   (if (null? xs)
       (fail)
       (let ((x (car xs)))
         (if (p? x)
             (succ x)
             (witness-cps p? (cdr xs) succ fail)))))

Example use: instructor lookup:
TeX slide

          *** Simple continuations at work ***
-> (val 2011s '((Ramsey 105)(Hescott 170)(Chow 116)))
-> (instructor-info 'Ramsey 2011s)
(Ramsey teaches 105)
-> (instructor-info 'Chow 2011s)
(Chow teaches 116)
-> (instructor-info 'Souvaine 2011s)
(Souvaine is-not-on-the-list)

Crucial question:
TeX slide

16 Feb 2011: Writing CPS; Semantics; Implementation

p2161000 p2161001 p2161002 p2161003 p2161004 p2161005 p2161006 p2161007 p2161008 p2161009 p2161010 p2161011 p2161012 p2161013 p2164576

Continuations

Review: CPS as an interface:
TeX slide

Writing code in CPS:
TeX slide

Continuations for search problems

{Continuations for search problems}:
TeX slide

Concluding remark (board): functions are cheap

18 Feb 2011: Semantics and implementation of uScheme; full Scheme

p2184577 p2184578 p2184579 p2184580 p2184581 p2184582 p2184583 p2184584 p2184585 p2184586 p2184587 p2184588 p2184589 p2184590 p2184591 p2184592 p2184593 p2184594 p2184595 p2184596 p2184597 p2184598 p2184599 p2184600

First three of five questions: Syntax, values, environments

Key changes from Impcore

{\uscheme\ vs \impcore \ (Probably overdue)}:
TeX slide

{Evaluation rules}:
TeX slide

{Implementation of closures}:
TeX slide

{Applying closures}:
TeX slide

{Locations in closures}:
TeX slide

Real closures

On the heap, but

Example: SML/NJ has closure register, arg register.

{Closures in pictures}:
TeX slide

Closure optimizations

{\uscheme\ in one slide}:
TeX slide

Scheme as it really is

  1. Macros!
  2. Cond expressions (solve nesting problem)
  3. Mutation
  4. Garbage collection
  5. Tail calls
  6. Applications and assessment

Macros!

A Scheme program is just another S-expression

Conditional expressions

          *** Real Scheme Conditionals ***
(cond (c1 e1)    ; if c1 then e1
      (c2 e2)    ; else if c2 then e2
       ...            ...
      (cn en)    ; else if cn then en

; Syntactic sugar---'if' is a macro:
(if e1 e2 e3) == (cond (e1 e2)
                       (#t e3))

Tail calls

A call in tail position.

What is tail position?

Tail position is defined inductively:

Anything in tail position is the last thing executed!

Key idea is tail-call optimization!

Tail-call optimization:
TeX slide

Example of tail position:
TeX slide

Example of tail position:
TeX slide

Question: how much stack space is used by the call?

Another example of tail position:
TeX slide

Another example of tail position:
TeX slide

Question: how much stack space is used by the call?

A third example of tail position:
TeX slide

A third example of tail position:
TeX slide

Question: how much stack space is used by the call?

Question: in your previous life, what did you call a construct that

  1. Transfers control to an arbitrary point in the code?
  2. Uses no stack space?

Assessment

Common Lisp, Scheme:

Down sides:

Bottom line: it's all about lambda

A synopsis of your Scheme homework

Results of the Impcore homework

Back to fundamentals:

Coming attractions

Apply your new knowlwedge in Standard ML:

Much less intellectual effort

23 Feb 2011: ML --- programming with algebraic types

p2234601 p2234602 p2234603 p2234604 p2234605 p2234606 p2234607 p2234608 p2234609 p2234610 p2234611 p2234612 p2234613 p2234614 p2234615 p2234616 p2234617

Prelude

What's the one thing I wanted you to remember from Friday's lecture?

Different kind of effort:

This week:

  1. Algebraic type and pattern matching
  2. A touch of types
  3. Everything else

Introduction

Meta: Not your typical introduction to a new language

ML Overview

Designed for programs, logic, symbolic data

Apples to Oranges: uScheme in ML

Theme: talking about data

Board: ML = uScheme + pattern matching + exceptions + static types

Three new ideas:

  1. Pattern matching is big and important. You will like it.
  2. Exceptions are easy
  3. Static types get two to three weeks in their own right.

And lots of new concrete syntax!

ML---The Five Questions

Syntax: expressions, definitions, patterns, types

Values: num/string/bool, record/tuple, \rlap{\high{algebraic data}}

Environments: names stand for values (and types)

Evaluation: uScheme + case and pattern matching

Initial Basis: medium size; emphasizes lists

\medskip

(Question Six: type system---a coming attraction)

A note about books

Ullman is easy to digest

Ullman is clueless about good style

Suggestion:

(My style is not flawless---some Haskell style has crept in---but it will serve you better than Ullman. And I've made some effort at consistency.)

Algebraic data types

Tidbits:

Defining algebraic types

Board:

Exegesis (on board):

Syntactic sugar for lists

          *** Syntactic sugar for lists ***
- 1 :: 2 :: 3 :: 4 :: nil; (* :: associates to the right *)
> val it = [1, 2, 3, 4] : int list

- INTERNAL (LEAF 1 :: INTERNAL (LEAF 2 :: LEAF 3 :: nil) ::
            LEAF 4 :: nil);
> val it = INTERNAL [LEAF 1, INTERNAL [LEAF 2, LEAF 3],
                     LEAF 4] : int srose

Consuming values of algebraic types

Consuming algebraic types

New language construct case (an expression)

fun length xs =
  case xs
    of []      => 0
     | (x::xs) => 1 + length xs

Exegesis:

24 Feb 2011: ML and more ML

p2243400 p2243423 p2243433 p2243450 p2243471 p2243501 p2243544 p2244053 p2244055 p2244233

Types, especially algebraic ones

Types and their uses:
TypeProduce Consume
Introduceeliminate
arrow Lambda (fn) Application
algebraic Apply constructor Pattern match
tuple (e1, ..., en) Pattern match!

Consuming algebraic types---better

Case analysis in definition

fun length []      = 0
  | length (x::xs) = 1 + length xs

(Very close to an algebraic law.)

This is the standard.

Harper: my students don't believe...

          *** Type of a function consuming lists ***
- map;
> val ('a, 'b) it = fn : ('a -> 'b) -> 'a list -> 'b list

(* means this:

val map: forall 'a, 'b . ('a -> 'b) -> ('a list -> 'b list)
          ^^^^^^^^^^^^^^^
*)

Milner left out the forall. A tragic mistake ("language you need a PhD to use").

And the arrow type associates to the right

          *** Code for a function consuming lists ***
fun map f []      = []
  | map f (x::xs) = f x :: map f xs

(*  FUNCTION APPLICATION HAS HIGHER
    PRECEDENCE THAN ANY INFIX OPERATOR! *)


          *** SLIDE ***
val filter : forall 'a . ('a -> bool) ->
                         'a list -> 'a list
fun filter p []      = []
  | filter p (x::xs) =
       case p x
         of true  => x :: filter p xs
          | false =>      filter p xs

fun filter p []      = []  
  | filter p (x::xs) = (* 'if' better than 'case' *)
       if p x then x :: filter p xs
              else      filter p xs

Exercise --- write

val flatten : forall 'a . 'a srose -> 'a list

where

datatype 'a srose
  = LEAF     of 'a
  | INTERNAL of 'a srose list


          *** SLIDE ***
fun flatten (LEAF a)      = [a]
  | flatten (INTERNAL ts) = (List.concat o map flatten) ts

fun flatten (LEAF a)      = [a]
  | flatten (INTERNAL ts) = List.concat (map flatten ts)

fun flatten t =
  let fun flatapp (LEAF a, tail) = a :: tail
        | flatapp (INTERNAL ts, tail) =
             foldr flatapp tail ts
  in  flatapp (t, [])
  end

25 Feb 2011: A whirlwind tour

p2254618 p2254619 p2254620 p2254621 p2254622 p2254623 p2254624 p2254625 p2254626 p2254627 p2254628 p2254629 p2254630 p2254631 p2254632 p2254633 p2254634 p2254635 p2254636 p2254637 p2254638 p2254639 p2254640 p2254641 p2254642 p2254643 p2254644 p2254645 p2254646 p2254647 p2254648 p2254649 p2254650 p2254651 p2254652 p2254653

Board:

curry     flip       infixr 0 $        optmap
uncurry   fst
          snd

Basic elements

Basic values and expression

Infix arithmetic with "high minus"

~3 + 3 = 0

String concatenation: concat, infix hat ^

Short-circuit andalso, orelse for Booleans

Strings "as in C and C++"
Characters #"a", #"b", ...

Identifiers

Formation rules:

Fixity:

infixr 5 @ (* associativity, precedence *)

Revert using op

          *** Lists and fixity ***
- [+, -, *, div];
! Toplevel input:
! [+, -, *, div];
!  ^
! Ill-formed infix expression
- [op +, op -, op *, op div];
> val it = [fn, fn, fn, fn] : (int * int -> int) list
- length it;
> val it = 4 : int
- 

Environments

The value environment:
TeX slide

Beware the semicolon!

Interactive R-E-P-L uses semicolons

Never put a semicolon in a file

(Did I mention Ullman was clueless?)

Nesting environments

At top level, definitions

Definitions contain expressions:

def ::= val ident = exp

Expressions contain definitions:

exp ::= let defs in exp end

Sequence of defs has let-star semantics

Structured values and terms: lists:
TeX slide

Structured values and terms: tuples:
TeX slide

Function pecularities: lambda

Lambda is spelled fn:\normalsize

val rec factorial =
  fn n => if n = 0 then 1 else n * fact (n-1)

Function pecularities: 1 argument

Each function takes 1 argument, returns 1 result

For "multiple arguments," use tuples!\normalsize

 fun factorial n =
   let fun f (i, prod) = 
         if i > n then prod else f (i+1, i*prod)
   in  f (1, 1)
   end

 fun factorial n =  (* you can also Curry *)
   let fun f i prod = 
         if i > n then prod else f (i+1) (i*prod)
   in  f 1 1
   end

Tuples are "usual and customary."

Mutual recursion:
TeX slide

More pattern matching

Pattern-matching in definitions

def ::= val pattern = exp

Example:

val (pivot, left, right) = split xs

Function definition:

def ::= fun ident patterns = exp
{| ident patterns = exp }

Cases must be exhaustive! (Overlap discouraged)

What is a pattern?

pattern ::= variable
          | wildcard
          | value-constructor [pattern]
          | tuple-pattern
          | record-pattern
          | integer-literal
          | list-pattern

Design bug: no lexical distinction between

Workaround: programming convention

Pattern-matching example:
TeX slide

Types

Type examples:
TeX slide

Values and types:
TeX slide

Syntax of ML types:
TeX slide

Syntax of ML types:
TeX slide

Polymorphic types:
TeX slide

Old and new friends:
TeX slide

Exceptions

Board:

Syntax:

Semantics:

Gotchas

Gotcha - Value polymorphism:
TeX slide

          *** Gotcha --- overloading ***
- fun plus x y = x + y;
> val plus = fn : int -> int -> int
- fun plus x y = y + y : real;
> val plus = fn : real -> real -> real

Gotcha --- equality types:
TeX slide

Gotcha --- parentheses

Put parentheses around anything with |

case, handle, fn

Function application has higher precedence than any infix operator

Running ML

To run Moscow ML:

ledit mosml -P full

And tell it:

help "";
use "warmup.sml";

Also /usr/bin/sml, /usr/bin/mlton

28 Feb 2011: Type systems

p2284654 p2284655 p2284656 p2284657 p2284658 p2284659 p2284660 p2284661 p2284662 p2284663 p2284664 p2284665 p2284666 p2284667 p2284668 p2284669 p2284670 p2284671

Responses to Scheme homework

Recursion for lists:

(if (null? xs)
    ... one case ...
    ... the other case ...)

Recursion for rose trees:

(if (atom? t)
    ... one case ...
    ... the other case ..)

Never correct:

... (atom? (car l)) ...  ; there is no such data structure

Code improvement:

(append (list1 x) l)

Introduction to static types

[N.B. "Static" means "at compile time"]

Why have a static type system?

Why are we learning about types?

But why are we learning this theory stuff?

Two flavors:

What is a type?

Examples: types

These are types:

Board: little language of types

Examples: not really types

These "types in waiting" don't classify any terms

These are utter nonsense

Type constructors

Technical name for "types in waiting"

Zero or more args, produce type

More complex type constructors:

Rules for constructors:
TeX slide

Connecting types to values:
TeX slide

Type soundness:
TeX slide

Types help programmers

What do I write next?

  1. What types of values do I have?
  2. What type of value am I trying to produce?
  3. What are the types of the available functions?

Plan for this unit

Plan:

A monomorphic type system

New syntax

Typical syntactic support for types

Explicit types on lambda and define:

Abstract syntax: \small

datatype exp = ...
 | LAMBDA of (name * tyex) list * exp
    ...
datatype def = ...
 | DEFINE of name * tyex * ((name * tyex) list * exp)
    ...

2 March 2011: A monomorphic type system, and type checking

p3024673 p3024674 p3024675 p3024676 p3024677 p3024678 p3024679 p3024680 p3024681 p3024682 p3024683 p3024684 p3024685 p3024686 p3024687 p3024688 p3024689 p3024690 p3024691 p3024692 p3024693 p3024694

Review: type formation and syntax

Type formation rules (Typed Impcore):
TeX slide

Board: what datatype goes with these rules?

New syntax

Typical syntactic support for types

Explicit types on lambda and define:

Typing judgment; type soundnes

Type judgments for monomorphic system

Two judgments

Type environment gives type of each variable.

Type soundness again:
TeX slide

Monomorphic type rules

Type rules for variables:
TeX slide

Notice: one rule for if!! (and while)

Type rules for control:
TeX slide

Product types: both x and y:
TeX slide

Pair rules generalize to product types with many elements (tuples,''structs,'' ``records'')

Sum types: either x or y:
TeX slide

Sum types generalize to ``union'' of N alternatives

          *** Digression: Product and sum in ML ***
type ('a, 'b) prod = 'a * 'b
val pair = fn (a, b) => (a, b)
val fst  = fn (a, b) => a
val snd  = fn (a, b) => b

datatype ('a, 'b) sum (* often "either" *)
  = LEFT  of 'a
  | RIGHT of 'b

Arrow types: function from x to y:
TeX slide

Array types: array of x:
TeX slide

Type checking

Type checking:
TeX slide

4 March 2011: Type checking, polymorphism

p3044695 p3044696 p3044697 p3044698 p3044699 p3044700 p3044701 p3044702 p3044703 p3044704 p3044705 p3044706 p3044707 p3044708 p3044709 p3044710 p3044711 p3044712 p3044713 p3044714 p3044715 p3044716 p3044717 TeX slide

Board: Lambda rule

          *** Type-checking LAMBDA ***
datatype exp = LAMBDA of (name * tyex) list * exp 
   ...
fun ty (LAMBDA (formals, body)) = 
  let val Gamma' = (* body gets new env *)
        foldl (fn ((n, ty), g) => bind (n, ty, g))
              Gamma formals
      val bodytype = typeof (body, Gamma')
      val formaltypes = 
        map (fn (n, ty) => ty) formals
  in  funtype (formaltypes, bodytype)
  end

Limitations of monomorphic type systems

Monomorphic types are limiting

Each new type constructor requires

Monomorphic burden: Array types:
TeX slide

Notes:

Monomorphism hurts programmers too

Users can't create new syntax

User-defined functions are monomorphic:\normalsize

(define int lengthI ((list int) l)
   (if (null? l) 0 (+ 1 (lengthI (cdr l)))))
(define int lengthB ((list bool) l)
   (if (null? l) 0 (+ 1 (lengthB (cdr l)))))
(define int lengthS ((list sym) l)
   (if (null? l) 0 (+ 1 (lengthS (cdr l)))))

Polymorphic type systems

Quantified types

Quantified types:
TeX slide

          *** Substitute for quantified variables ***
-> length
<proc> : (forall ('a) (function ((list 'a)) int))
-> (@ length int)
<proc> : (function ((list int)) int)
-> (length '(1 2 3))
type error: function is polymorphic; instantiate before applying
-> ((@ length int) '(1 2 3))
3 : int


          *** Substitute what you like ***
-> length
<proc> : (forall ('a) (function ((list 'a)) int))
-> (@ length bool)
<proc> : (function ((list bool)) int)
-> ((@ length bool) '(#t #f))
2 : int


          *** More "Instantiations" ***
-> (val length-int (@ length int))
length-int : (function ((list int)) int)
-> (val cons-bool (@ cons bool))
cons-bool : (function (bool (list bool)) 
                                (list bool))
-> (val cdr-sym (@ cdr sym))
cdr-sym : (function ((list sym)) (list sym))
-> (val empty-int (@ '() int))
() : (list int)

Bonus instantiation:

-> map
<proc> : 
  (forall ('a 'b) 
      (function ((function ('a) 'b) 
                 (list 'a)) 
           (list 'b)))
-> (@ map int bool)
<proc> : (function ((function (int) bool) 
                    (list int)) 
              (list bool))

Create your own!

Abstract over unknown type using type-lambda

  -> (val id (type-lambda ('a)
                (lambda (('a x)) x)))
  id : (forall ('a) (function ('a) 'a))

\large 'a is type parameter (an unknown type)

This feature is parametric polymorphism

Board: two forms of abstraction:

termtype
`lambda``function` (arrow)
`type-lambda``forall`

Power comes at notational cost

Function composition

-> (val o (type-lambda ('a 'b 'c) 
     (lambda (((function ('b) 'c) f)
              ((function ('a) 'b) g))
        (lambda (('a x)) (f (g x))))))
o : (forall ('a 'b 'c) 
       (function ((function ('b) 'c) 
                  (function ('a) 'b)) 
           (function ('a) 'c)))

Aka \black o : \mbox{$\forall \alpha, \beta, \gamma \mathrel. (\beta \arrow \gamma) \cross (\alpha \arrow \beta) \arrow (\alpha \arrow \gamma) $}

Type rules for polymorphism

Instantiate by substitution:
TeX slide

Generalize with type-lambda:
TeX slide

7 March 2011: Haskell type classes (guest lecture by Greg Morrisett)

p3074718 p3074719 p3074720 p3074721 p3074722 p3074723 p3074724 p3074725 p3074726 p3074727 p3074728 p3074729 p3074730 p3074731 p3074732 p3074733 p3074734 p3074735 p3074736 p3074737 p3074738 p3074739 p3074740

14 March 2011: Type systems wrapup

p3144741 p3144742 p3144743 p3144744 p3144745 p3144746 p3144747 p3144748 p3144749 p3144750 p3144751 p3144752 p3144753 p3144754 p3144755 p3144756

Administration

Midterm evaluations (board --- participation grade)

Typed uScheme policy reversal

Fresh issue of extension tokens to total 7

ML problems

(blots!)

Also length considered harmful

          *** SLIDE ***
datatype 'a tree 
  = NODE of 'a tree * 'a * 'a tree 
  | LEAF


          *** SLIDE ***
fun tfr f LEAF                   tail = tail
  | tfr f (NODE(LEAF, x, LEAF))  tail = f (x, tail)
  | tfr f (NODE(LEAF, x, right)) tail =
                      f (x, tfr f right tail)
  | tfr f (NODE(left, x, LEAF))  tail =
                      tfr f left (f (x, tail))
  | tfr f (NODE(left, x, right)) tail =
      tfr f left (f (x, tfr f right tail))


          *** SLIDE ***
fun length (nil)   = 0
  | length (x::xs) = 
      foldr (fn (_, n) => n + 1) 0 (x::xs) 

Where have we been and were are we going?

From a monomorphic language to a polymorphic language:

Polymorphic types wrapup

Review

bs - Instantiate by substitution $\forall$ elimination:

begin{center} \trule {\ptypeis e {\forall\ldotsn\alpha.\tau}} {\ptypeis {\xtyapply(e, \ldotsn\tau)} {\tau[\ldotsmapston \alpha \tau]}} \end{center} es

Generalize with type-lambda:
TeX slide

Arrow-introduction, corrected {\&} compared:
TeX slide

Type formation through kinds

Type formation through kinds:
TeX slide

Use kinds to give arities:
TeX slide

The kinding judgment:
TeX slide

          *** The phase distinction revealed ***
fun checkThenEval (d, (delta, gamma, rho)) =
  let val (gamma', tystring)  = elabdef (d, gamma, delta)
      val (rho',   valstring) = evaldef (d, rho)
      val _ = print (valstring ^ " : " ^ tystring)
  in  (delta, gamma', rho')
  end

How to write elabdef:

Kinding rules for types:
TeX slide

Three environments:
TeX slide

Three environments revealed:
TeX slide

Three environments:
TeX slide

Three environments revealed:
TeX slide

16 March 2011: ML type inference

p3164757 p3164758 p3164759 p3164760 p3164761 p3164762 p3164763 p3164764 p3164765 p3164766 p3164767 p3164768 p3164769 p3164770 p3164771 p3164772 p3164773 p3164774 p3164775 p3164776 p3164777 p3164778 p3164779 p3164780 p3164781 p3164782

Review

Relationship between case and either:

case e
  of LEFT  x => a
   | RIGHT y => b

becomes

(either e
     (lambda (x) a)
     (lambda (y) b))

and it is continuation-passing style

Why do this???*

ML Type inference

          *** Working up to type inference ***
-> (define     double (x)       (+ x x))
double                         ;; uScheme
-> (define int double ((int x)) (+ x x))
double : (function (int) int)  ;; Typed uSch.
-> (define     double (x)       (+ x x))
double : int -> int            ;; uML




          *** Infer a polymorphic type ***
->    (lambda                     (f)  ;; uScheme
          (lambda     (x)  (f (f x))))
<procedure>
-> (type-lambda ('a)                   ;; Typed uScheme
      (lambda (((function ('a) 'a) f)) 
          (lambda (('a x)) (f (f x)))))
<procedure> : (forall ('a) 
      (function ((function ('a) 'a)) (function ('a) 'a)))
->    (lambda                     (f)  ;; uML
          (lambda     (x)  (f (f x))))
<procedure> : forall 'a . ('a -> 'a) -> ('a -> 'a)
- fn f => fn x => f (f x);             ;; Moscow ML
> val 'a it = fn : ('a -> 'a) -> 'a -> 'a

The Hindley-Milner type system

Restricted polymorphism:

Advantage: no notation

Bonus! Translation from ML

ML to Typed uScheme:

val id = fn x => x

becomes

(val id (type-lambda ('a) (lambda (('a x)) x)))

Term

3 :: [] 

becomes

((@ cons int) 3 (@ '() int))

Representing Hindley-Milner types

Representing Hindley-Milner types:
TeX slide

          *** Type schemes in codeland ***
datatype ty
  = TYVAR  of name        
  | TYCON  of name        
  | CONAPP of ty * ty list

datatype type_scheme
  = FORALL of name list * ty

Key ideas:
TeX slide

Key ideas repeated:
TeX slide

{ML type rules can be nondeterministic!}:
TeX slide

{ML type rules, continued}:
TeX slide

{Things to notice}:
TeX slide

{Type inference}:
TeX slide

{Instances and substitution}:
TeX slide

{Instance properties}:
TeX slide

{Instantiation intuition}:
TeX slide

{More instantiation}:
TeX slide

18 March 2011: Unification and type inference

p3184783 p3184784 p3184785 p3184786 p3184787 p3184788 p3184789 p3184790 p3184791 p3184792 p3184793 p3184794 p3184795 p3184796 p3184797 p3184798 p3184799 p3184800 p3184801 p3184802 p3184803 p3184804 p3184805 p3184806

Notes

John Hughes roadshow 29 March

Tucket Taft 31 March

Board

{Unification}:
TeX slide

{Most General Unifiers}:
TeX slide

{Implementing unification}:
TeX slide

{From type rules to type inference}:
TeX slide

{Type rules to type inference, cont'd}:
TeX slide

{Type rules to type inference, finish}:
TeX slide

28 March 2011: Type inference wrapup;

p3280005 p3280542 p3280611 p3281534 p3281549 p3281848 p3281914 p3282136 p3283315 p3283327 p3283334 p3284349 p3284358 p3284438 p3284807 p3284808 p3284809 p3284810 p3284935 p3285156 p3285523 p3285935

Unification review:
TeX slide

{Type inference review: Lambda}:
TeX slide

{Type rules to type inference: apply}:
TeX slide

{More explicit substitutions}:
TeX slide

{Type inference, operationally}:
TeX slide

{Implementing type inference}:
TeX slide

{Type inference example}

Try

val f = fn x => op + (x, 1)

{Type systems: Things to remember}:
TeX slide

On to the lambda calculus!

30 March 2011: Lambda calculus

p3304811 p3304812 p3304813 p3304814 p3304815 p3304816 p3304817 p3304818 p3304819 p3304820 p3304821 p3304822 p3304823 p3304824 p3304825 p3304826 p3304827 p3304828 p3304829

Theory Ahead

What is theory and why is it important:

Precise answer to what does this program mean

Guides implementors and programmers

Elements of theory

Lambda Calculus

World's simplest language

exp ::= var | \var.exp | exp exp

Also typically written as

M,N ::= x | \x.M | M N

Application associates to the left and binds tighter than abstraction

(Equivalent: body of lambda-term extends to the closing right paren)

Lambda is as powerful as any language known

Ideal for

Lambda calculus and the five questions

Abstract syntax: application, abstraction, variable

Values: no values! Values are just terms

Environments: not used! (But names stand for terms)

Evaluation rules: coming

Initial basis: often add constants like + or 2

Operational semantics of lambda calculus

New kind of semantics: small-step

New judgment form

M --> N   ("M reduces to N in one step")

No context!! No turnstile!!

Just pushing terms around == calculus

{Reduction rules}:
TeX slide

Example reduction to normal form:

(\x.\y.yx)(+ 3 4)(\x.+ x 2) -> (beta)
(\y.y(+ 3 4))(\x.+ x 2) -> (beta)
(\x.+ x 2)(+ 3 4) -> (beta)
+(+ 3 4)2

This is a normal form. If we have rules for constants, we can go further (to 9)

The substitution in the beta rule is the heart of the lambda calculus

Free variables:
TeX slide

Examples of free variables:

\x . + x y

\x. \y. x

Substitution:
TeX slide

1 April 2011: Lambda calculus continued: substitution, normalization, computing

p4014830 p4014831 p4014832 p4014833 p4014834 p4014835 p4014836 p4014837 p4014838 p4014839 p4014840 p4014841 p4014842 p4014843 p4014844 p4014845 p4014846 p4014847 p4014848

Interlude: Midterm evaluations

Over 20 pages of feedback

Focus on quick improvements

Substitution:
TeX slide

Example:

(\yes.\no.yes)(\time.no) ->
\z.\time.no

and never

\no.\time.no    // WRONG!!!!!!

{Renaming of bound variables}:
TeX slide

Conversion:
TeX slide

Nondeterminism of conversion:

               A
              / \
             V   V
            B     C

Now what??

Church-Rosser Theorem:
TeX slide

Normal forms

What if we have A but no B with A -> B

(Each set of reduction rules induces its own normal form)

Normal forms are values:
TeX slide

Applicative-order reduction

Given a beta-redex

(\x.M) N

do the beta-reduction only if N is in normal form

Q: Does applicative order ever prevent you from making progress?

A: No. We can prove it by induction on the number of lambdas in N

Normal-order reduction

Always choose leftmost, outermost redex

"Normal-order" stands for produces a normal form, not for "the normal way of doing things"

Normal-order illustration:
TeX slide

Not your typical call-by-value semantics!

4 April 2011: Computing with the lambda calculus

p4044849 p4044850 p4044851 p4044852 p4044853 p4044854 p4044855 p4044856 p4044857 p4044858 p4044859 p4044860 p4044861 p4049900 p4049901 p4049902 p4049903 p4049904 p4049905 p4049906 p4049907 p4049908 p4049909 p4049910 p4049911 p4049912 p4049913 p4049914 p4049915 p4049916 p4049917

(bad lambda)

Preview: the Curry-Howard isomorphism

Anybody remember propositions, logic, tautologies?

LogicPL
PropositionsTypes
Proofs Terms

Basis of many modern theorem provers

Logical formulas:

A -> B -> A
A -> B -> B

A /\ B -> A
A /\ B -> B

A -> A \/ B
B -> A \/ B

Types:

A -> B -> A
A -> B -> B

A * B -> A
A * B -> B

A -> A + B
B -> A + B

What terms have these types?

Is

A -> B

a tautology?

Q: What term could possibly have that type?

A: Type soundness guarantees that if it terminates, it's good

A: So, a term that doesn't terminate

Q: How would you represent a pair, using only functions?
Followup: given a pair, what can a function do

A: continuation-passing style

Q: So then what function type might represent A * B

A: forall C . (A -> B -> C) -> C

Followup: does (forall C . (A -> B -> C) -> C) -> A?

A: Yes; substitute A for C and ((A -> B -> A) -> A) -> A

Followup: a term of that type?

A: \f. f (\a.\b.a)

Computing with lambda

Agenda: code a simple functional language, like Scheme, using lambda calculus.

Idea: continuation-passing style

Booleans

if M then N else P   ===> M N P

M chooses N or P based on truth or falsehood?

Q: How are true and false coded??

A: \x.\y.x and \x.\y.y

Products (tuples, records, structs)

Q: What is the capability of a pair?

A: To access both elements

Q: Then what does the continuation look like?

A: Takes both elements and produces an answer

Q: How do we represent the pair (M, N)?

A: \k. k M N

Q: What does the pair function look like?

A: `pair = \x.\y.\f.f x y

Q: How do we implement fst and snd?

A: fst = \p.p(\x.\y.x)
A: snd = \p.p(\x.\y.y)

Sums (discriminated unions)

Q: What is the capability of a pair?

A: To access either one element or the other

Q: How many continuations should a sum expect?

A: Two

Q: What does a continuation look like?

A: Takes one element and produces an answer

Q: How do we represent the value (LEFT M)

A: \f.\g. f M

Q: How do we represent the value (RIGHT N)

A: \f.\g. g N

Q: So what do the LEFT and RIGHT functions look like?

A: left = \a.\f.\g.f a
A: right = \b.\f.\g.g b

Q: How do we do case analysis?

A: Our old friend either!

case X of LEFT a => M | RIGHT b => N

either X (\a.M) (\b.N)

either s f g = s f g

Simulating S-expressions:
TeX slide

6 April 2011: Church encodings finalized

p4064862 p4064863 p4064864 p4064865 p4064866 p4064867 p4064868 p4064869 p4064870 p4064871 p4064872 p4064873 p4064874 p4064875 p4064876 p4064877 p4064878 p4064879

Review: encoding sums of products

datatype bool = true | false

datatype 'a heap = EMPTY | HEAP of 'a heap * 'a * 'a heap

datatype 'a seq = ESEQ | SINGLE of 'a | APPEND of 'a seq * 'a seq

What are the Church encodings of the following?

true
false

EMPTY
HEAP

ESEQ
SINGLE
APPEND

Encoding natural numbers

Natural numbers as functions:
TeX slide

Church Numerals:
TeX slide

          *** Church Numerals to machine integers ***
-> (define to-int (n)
          ((n ((curry +) 1)) 0))
-> (to-int three)
3
-> (to-int ((times three) four))
12




          *** Church Numerals in $\lambda$ ***
zero  = \f.\x.x;
succ  = \n.\f.\x.f (n f x);
plus  = \n.\m.n succ m;
times = \n.\m.n (plus m) zero;
 ...
-> four;
\f.\x.f (f (f (f x)))
-> three;
\f.\x.f (f (f x))
-> times four three;
\f.\x.f (f (f (f (f (f (f (f (f (f (f (f x)))))))))))

Polymorphic Typed Lambda Calculus: F2

Typed lambda is a simplified version of Typed uScheme:

M  ::= x
    |  \x:tau . M
    |  M N
    |  /\'a . M
    |  M [tau]                // instantiation
    |  constant

tau ::= mu                    // type constructor
    |   'a                    // type variable
    |   forall 'a . tau
    |   tau tau

There is also a simply-typed lambda calculus without type-lambda, instantiation, or quantified types.

Church numerals in a typed calculus:
TeX slide

Denotational semantics and fixed-point theory

Lambda calculus is just pushing terms around

Operational semantics is just executing some automaton

How do we justify calling it "functional programming?" Where are the functions????

Functions are mathematical objects

Denotational semantics says what mathematical object corresponds to

Why functions are a problem

A function is a value.

If values are V, then V must include V -> V.

What is V -> V? It can't be all functions from V to V

But wait! There are only countably many source codes!

Intuition: include only those functions whose behavior is required by the source code

8 April 2011: Semantic domains and the fixed-point construction

p4084880 p4084881 p4084882 p4084883 p4084884 p4084885 p4084886 p4084887 p4084888 p4084889 p4084890 p4084891 p4084892 p4084893 p4084894 p4084895 p4084896 p4084897 p4084898 p4084899 p4084900 p4084901

Overview of results

Two heroes here:

Scott's ideas underly classical compiler optimizations based on dataflow analysis. Plus they apply to all recursion equations

Strachey's ideas are used in every compiler and interpreter.

(N.B. First book ever written on these topics was written by Joe Stoy.)

Our goals (board):

Key problems in domain theory

Avoid "junk" values, including non-computable functions

Deal with nontermination through a bottom element

Solve recursion equations with fixed points

Domains as Complete partial orders

Bottom element

Partial order approximation

Functions in semantic domains are not "over-defined"

Partial order review:

Examples:

Complete partial order or CPO

The "least" upper bound makes the fewest assumptions

Well-behaved functions

Given a cpo S, what's the semantic domain of functions from S to S?

Lift approximation to functions pointwise:

Refinement ordering on functions:
TeX slide

Well-behaved functions:
TeX slide

Bonuses:

Solving recursion equations

Recursion = Fixed point:
TeX slide

11 April 2011: Fixed-point construction; program semantics

p4114902 p4114903 p4114904 p4114905 p4114906 p4114907 p4114908 p4114909 p4114910 p4114911 p4114912 p4114913 p4114914 p4114915 p4114916 p4114917 p4114918 p4114919 p4114920 p4114921 p4114922

Recursion = Fixed point:
TeX slide

Suppose g F = F. Proof that F is factorial.

For all n, g F n = n!, by induction:

F 0 = g F 0 = 1
F n
  = { by assumption }
g F n 
  = { definition of g }
if n = 0 then 1 else n * F (n-1)
  = { assumption, n > 0 }
n * F (n-1)
  = { induction hypothesis }
n * (n-1)!
  = { definitiion of factorial }
n!

Now you do it

          *** Conversion to fixed point ***

length = \xs.null? xs 0 (+ 1 (length (cdr xs)))


lg = \lf.\xs.null? xs 0 (+ 1 (lf (cdr xs)))

Existence and construction of fixed points

Monotonic, continous $g$ has fixed point:
TeX slide

Fixed-point construction:
TeX slide

What do I want from you?

Fixed points in (untyped) lambda:
TeX slide

Fixed points work on domains:
TeX slide

Bonus content: example fixed-point construction

This evening, my friends and I discovered that we had a hard time answering a question on the assignment because we had no way to frame what it meant to "write the function whose fixed point is a solution to the equation".

Write a function f such that if (f ys = ys) then

ys = (cons 0 (map ((curry +) 1) ys)))

Could you produce a different, dissimilar recursion equation and use the steps listed on the assignment to solve it? I think such an example would be helpful.

Equation

p = pair 0 (fst p)

Function:

f = \p.pair 0 (fst p)

zero approximation:

p0 = _|_
p1 = f p0 = pair 0 (fst _|_)          = pair 0 _|_
p2 = f p1 = pair 0 (fst (pair 0 _|_)) = pair 0 0
p3 = f p2 = pair 0 (fst (pair 0 0))   = pair 0 0

p2 is the least fixed point of function f and the least defined solution to the recursion equation.

Mapping programs to domains

Straight-line code, value semantics

Direct-style value semantics for a stratified language with statements and expression.

Domains: (board)

V         = Int
Ide
sigma : S = Ide -> V
Exp       = S -> V
Stm       = S -> S

Notes:

Semantic functions on base types and on syntax:

I : ID  -> Ide
N : NUM -> V
E : exp -> Exp
S : stm -> Stm

Semantic functions for values and store

plus   : V -> V -> V
update : S * Ide * V -> S

Abstract syntax and semantic equations for statements:

S [[ID := exp]] sigma = update(sigma, I[[ID]], E[[exp]]sigma)
S [[s1; s2]]    sigma = S[[s2]](S[[s1]] sigma)

Abstract syntax and semantic equations for expressions:

E [[NUM]] sigma = N[[NUM]]      // note state is ignored
E [[ID]]  sigma = sigma(I[[ID]])
E [[e1 + e2]] sigma = plus (E [[e1]] sigma) (E [[e2]] sigma)

Things to notice:

13 April 2011: More denotational definitions

p4134923 p4134924 p4134925 p4134926 p4134927 p4134928 p4134929 p4134930 p4134931 p4134932 p4134933 p4134934 p4134935 p4134936 p4134937 p4134938 p4134939 p4134940 p4134941 p4134942 p4134943 p4134944 p4134945 p4134946 p4134947 p4134948 p4134949 p4134950 p4134951 p4134952

Today's plan:

  1. Design choices, quickly and easily
  2. Continuations
  3. Definitional interpreters

DON'T FORGET JOE STOY'S TALK TOMORROW, 3PM

Locations and environments

Ideas:

Domains: (board)

V         = Int
Ide
l : L     = Locations    // new
rho : Env = Ide -> L     // new
sigma : S = L -> V       // changed
Exp       = Env -> S -> V      // changed
Stm       = Env -> L -> S -> S // changed

Notes:

Semantic functions for base types and syntax

I : ID  -> Ide
N : NUM -> V
E : exp -> Exp
S : stm -> Stm
P : prog -> Env -> L -> S -> L -> V
                   ^         ^
                   |         +-- location where answer is stored
                   |
                   +--- location for next local variable

Semantic functions for values, environment, and store

plus   : V -> V -> V
update_S : S * L * V -> S
update_E : Env * Ide * L -> Env

New functions:

l0   : L        // location supply
next : L -> L

sigma0 : S      // initial state

rho0 : Env      // initial env
rho1 : Env = update_E(rho0, I [[Answer]], l0)

Abstract syntax and semantic equations for programs:

P [[s]] = S [[s] rho_1 (next l0) sigma0 l0

Abstract syntax and semantic equations for statements:

S [[ID := exp]] rho l sigma = update_S(sigma, rho(I[[ID]]), E[[exp]] rho sigma)
S [[{ VAR x ; s }]] rho l sigma =
     S [[s]](update_E(rho, I[[x]], l))(next l) sigma
S [[s1; s2]]    rho l sigma = S[[s2]] rho l (S[[s1]] rho l sigma)

Abstract syntax and semantic equations for expressions:

E [[NUM]] rho sigma = N[[NUM]]      // note state is ignored
E [[ID]]  rho sigma = sigma(rho(I[[ID]]))
E [[e1 + e2]] rho sigma = plus (E [[e1]] rho sigma) (E [[e2]] rho sigma)

Things to notice:

??????

Control flow in direct style

Ideas:

Domains: (board)

V         = Int
Ide
sigma : S = Ide -> V       // changed
Exp       = S -> V      // changed
Stm       = S -> S // changed

Semantic functions for base types and syntax

I : ID  -> Ide
N : NUM -> V
E : exp -> Exp
S : stm -> Stm

Semantic functions for values, environment, and store

plus   : V -> V -> V
update : S * L * V -> S
ifnz   : V * S * S -> S

Abstract syntax and semantic equations for statements:

S [[ID := exp]] sigma = update(sigma, I[[ID]], E[[exp]] sigma)
S [[s1; s2]]    sigma = S[[s2]] (S[[s1]] sigma)
S [[IF exp THEN s1 ELSE s2]] sigma =
     ifnz(E[[exp]]sigma, S[[s1]] sigma, S[[s2]]sigma)
S [[WHILE exp DO stm]] sigma =
     Y(\f.\sigma.ifnz(E[[exp]]sigma,f(S[[stm]]sigma),sigma)) sigma
S [[skip]] sigma = sigma

Abstract syntax and semantic equations for expressions:

E [[NUM]] sigma = N[[NUM]]      // note state is ignored
E [[ID]]  sigma = sigma(I[[ID]])
E [[e1 + e2]] sigma = plus (E [[e1]] sigma) (E [[e2]] sigma)

Things to notice:

??????

Control flow with continuations

Ideas:

Why are we doing this?

We want to model those constructs.

Domains: (board)

V         = Int
Ide
sigma : S = Ide -> V
A         = Answers     // new
theta : C = S -> A      // new
Exp       = S -> V      // same
Stm       = C -> C      // changed
Prog      = A           // by definition, the answer you get when
                        // you run it (e.g., stdout, stderr, exit code)

Semantic functions for base types and syntax

I : ID  -> Ide
N : NUM -> V
E : exp -> Exp
S : stm -> Stm
P : prog -> Prog

Semantic functions for values, environment, and store

plus   : V -> V -> V
update : S * L * V -> S
ifnz   : V * S * S -> S
done   : C  // end of the program

Abstract syntax and semantic equations for a program:

P[[stm]] = S[[stm]] done sigma_0

Abstract syntax and semantic equations for statements:

S [[ID := exp]] theta = \sigma.theta(update(sigma, I[[ID]], E[[exp]] sigma))
S [[s1; s2]]    theta = S[[s1]] (S[[s2]] theta)  // order restored!
S [[IF exp THEN s1 ELSE s2]] theta =
     \sigma.ifnz(E[[exp]]sigma, S[[s1]] theta sigma, S[[s2]]theta sigma)

Abstract syntax and semantic equations for expressions:

E [[NUM]] sigma = N[[NUM]]      // note state is ignored
E [[ID]]  sigma = sigma(I[[ID]])
E [[e1 + e2]] sigma = plus (E [[e1]] sigma) (E [[e2]] sigma)

Things to notice:

15 April 2011: Continuation wrapup; Smalltalk

p4154953 p4154954 p4154955 p4154956 p4154957 p4154958 p4154959 p4154960 p4154961 p4154962 p4154963 p4154964 p4154965 p4154966 p4154967 p4154968 p4154969 p4154970 p4154971 p4154972 p4154973 p4154974 p4154975 p4154976

Context: operational, type theory, denotational

Gotos and labels: Important recursion equation

rho_final, theta = S[[stm]] (rho_final, done)

where

Stm = C env * C -> C env * C

Review: Control flow with continuations

Domains: (board)

V         = Int
Ide
sigma : S = Ide -> V
A         = Answers     // new
theta : C = S -> A      // new
Exp       = S -> V      // same
Stm       = C -> C      // changed
Prog      = A           // by definition, the answer you get when
                        // you run it (e.g., stdout, stderr, exit code)

Semantic functions for base types and syntax

I : ID  -> Ide
N : NUM -> V
E : exp -> Exp
S : stm -> Stm
P : prog -> Prog

Semantic functions for values, environment, and store

plus   : V -> V -> V
update : S * L * V -> S
ifnz   : V * S * S -> S
done   : C  // end of the program

Abstract syntax and semantic equations for a program:

P[[stm]] = S[[stm]] done sigma_0

Abstract syntax and semantic equations for statements:

S [[ID := exp]] theta = \sigma.theta(update(sigma, I[[ID]], E[[exp]] sigma))
S [[s1; s2]]    theta = S[[s1]] (S[[s2]] theta)  // order restored!
S [[IF exp THEN s1 ELSE s2]] theta =
     \sigma.ifnz(E[[exp]]sigma, S[[s1]] theta sigma, S[[s2]]theta sigma)

Abstract syntax and semantic equations for expressions:

E [[NUM]] sigma = N[[NUM]]      // note state is ignored
E [[ID]]  sigma = sigma(I[[ID]])
E [[e1 + e2]] sigma = plus (E [[e1]] sigma) (E [[e2]] sigma)

Things to notice:

Expression continuations

Ideas:

Domains: (board)

V         = Int
Ide
sigma : S = Ide -> V
A         = Answers     
theta : C = S -> A      
kappa : K = V -> C      // new
Exp       = K -> C      // changed
Stm       = C -> C      // same
Prog      = A           // same

Semantic functions for base types and syntax

I : ID  -> Ide
N : NUM -> V
E : exp -> Exp
S : stm -> Stm
P : prog -> Prog

Semantic functions for values, environment, and store

plus   : V -> V -> V
update : S * L * V -> S
ifnz   : V * A * A -> A   // changed
done   : C  // end of the program
error  : A  // an error answer

Abstract syntax and semantic equations for a program:

P[[stm]] = S[[stm]] (\sigma.error) sigma_0

Abstract syntax and semantic equations for statements:

S [[ID := exp]] theta = E[[exp]](\v.\sigma.theta(update(sigma, I[[ID]], v)))
S [[RETURN exp]] theta = E[[exp]](\v.\sigma.done v)   // ignores theta!!
S [[s1; s2]]    theta = S[[s1]] (S[[s2]] theta)       // unchanged

Abstract syntax and semantic equations for expressions:

E [[NUM]] kappa = kappa(N[[NUM]]) // feed value to context
E [[ID]]  kappa = \sigma.kappa(sigma(I[[ID]]))sigma // lookup and continue
E [[e1 + e2]] kappa =  // must make contexts for e1 and e2
  E[[e1]](\v1.E[[e1]](\v2.kappa (plus v1 v2)))
E [[stm; exp]] kappa = S[[stm]](E[[exp]]kappa)  // bonus construct!

Things to notice:

Definitional interpreter!

          *** ML domains ***
type V     = int
type Ide   = string
type S     = Ide -> V
datatype A = ERROR
           | ANSWER of V (* value returned *)
type C     = S -> A
type K     = V -> C
type Exp   = K -> C
type Stm   = C -> C


          *** Starting values ***
exception NotFound
val plus   : V * V -> V = op +
val upd    : S * Ide * V -> S 
  = fn (sigma, n, v) => fn n' => 
      if n = n' then v else sigma n'
val done   : V -> A = fn x => ANSWER x
val sigma0 : S      = fn n => raise NotFound


          *** Syntax of programs ***
datatype prog = PROG of stm
and      stm = op :=  of Ide * exp
             | RETURN of exp
             | SEQs   of stm * stm
and      exp = SEQe   of stm * exp
             | NUM    of V
             | ID     of Ide
             | op +   of exp * exp


          *** Semantic equations, part I ***
fun P (PROG stm1) = S stm1 (fn s => ERROR) sigma0
and S (id := exp1)  th = 
      E exp1 (fn v => fn s => th (upd(s, id, v)))
  | S (RETURN exp1) th = 
      E exp1 (fn v => fn s => done v)
  | S (SEQs (stm1, stm2)) th = S stm1 (S stm2 th)

val _ = P : prog -> A
val _ = S : stm  -> C -> C


          *** Semantic equations, part II ***
and E (SEQe (stm1, exp1)) k = S stm1 (E exp1 k)
  | E (NUM v) k = k v
  | E (ID  n) k = (fn s => k (s n) s)
  | E (exp1 + exp2) k = 
      E exp1 (fn v1 => 
              E exp2 (fn v2 => k (plus(v1, v2))))

val _ = E : exp  -> K -> C

Continuations applied (bonus content): Gotos and labels

Ideas:

Domains: (board)

V         = Int
Ide
Lbl                     // labels, new
sigma : S = Ide -> V
A         = Answers     
theta : C = S -> A      
rho : Lenv= Lbl -> C    // label meanings, new
Exp       = S -> V      // back to direct-style expressions
Stm       = Env * C -> Env * C    // changed
Prog      = A           

Semantic functions for base types and syntax

I : ID  -> Ide
N : NUM -> V
E : exp -> Exp
S : stm -> Stm
P : prog -> Prog

Semantic functions for values, environment, and store

plus   : V -> V -> V
update : S * L * V -> S
ifnz   : V * S * S -> S
done   : C  // end of the program
fst, snd

Abstract syntax and semantic equations for a program:

P[[stm]] = snd(Y(\(rho_L, theta).S[[stm]](rho_L, done))) sigma_0

    rho_final, theta = S[[stm]] (rho_final, done)

Abstract syntax and semantic equations for statements:

S [[ID := exp]] (rho_L,theta) =
      (rho_L,\sigma.theta(update(sigma, I[[ID]], E[[exp]] sigma)))
S [[s1; s2]]    (rho_L,theta) = S[[s1]] (S[[s2]] (rho_L,theta))
S [[IF exp THEN GOTO L]] (rho_L,theta) =
     (rho_L, \sigma.ifnz(E[[exp]]sigma, rho_L([[L]]), theta sigma))
S [[L:]] (rho_L, theta) = (update(rho_L, [[L]], theta), theta)

Abstract syntax and semantic equations for expressions:

E [[NUM]] sigma = N[[NUM]]      // note state is ignored
E [[ID]]  sigma = sigma(I[[ID]])
E [[e1 + e2]] sigma = plus (E [[e1]] sigma) (E [[e2]] sigma)

Things to notice:

20 April 2011: Objects

p4204977 p4204978 p4204979 p4204980 p4204981 p4204982 p4204983 p4204984 p4204985 p4204986 p4204987 p4204988 p4204989 p4204990 p4204991

Midterm results

80 and up    Excellent
60 to 79     Very Good
42 to 59     Good
30 to 41     Fair
below 30     Poor

Three-quarters of exams are Good or better.
Median grade is 53.
40 or below is cause for concern

15% or more improvement on the final replaces the midterm grade completely

Why objects?

Theory is so over!

Where have we been?

What about big programs?

An area of agreement and a great divide:

                      INFORMATION HIDING
                          /         \
                         /           \
                  reuse /             \ modular reasoning
                       /               \
                      /                 \
                   OBJECTS            MODULES

Two kinds of reuse:

Don't conflate them!

Smalltalk

Why Smalltalk?

The five questions:

Syntax

          *** Abstract syntax (approximate) ***
datatype exp = VAR     of name
             | SET     of name * exp
             | BEGIN   of exp list
             | SEND    of name * exp * exp list
             | BLOCK   of name list * exp list
             | LITERAL of rep

Analysis of syntax (board):

Protocol --- the interface to an object

Protocol is Smalltalk term of art:

Ruby dialect "duck typing" is a statement about protocols

Protocol is determined by the class of which an object is an instance

TeX slide

Arities:

Every object gets not just the protocol but the implementations from its class. And a class can inherit from its superclass (more undefined terms) all the way up to class Object.

          *** Simple examples ***
-> (isKindOf: 3 Object)
<True>
-> (isMemberOf: 3 Object)
<False>
-> (isNil 3)
<False>
-> (isNil nil)
<True>
-> (println nil)
nil
nil


          *** SLIDE ***
-> true
<True>
-> True
<class True>
-> Object
<class Object>

Subtyping

Key strategy for reuse in object-oriented languages: subtype polymorphism

Subtyping mathematically:
TeX slide

Subtyping is not inheritance:
TeX slide

22 April 2011: Classes and Inheritance (Nathan Ricci)

p4224993 p4224994 p4224995 p4224996 p4224997 p4224998 p4224999 p4225000 p4225001 p4225002 p4225003 p4225004 p4225005 p4225006 p4225007 p4225008 p4225009

25 April 2011: Smalltalk wrapup

p4255010 p4255011 p4255012 p4255013 p4255014 p4255015 p4255016 p4255017 p4255018 p4255019 p4255020 p4255021 p4255022

On offer:

  1. Correction from last time about metaclasses
  2. Information hiding
  3. Double dispatch
  4. Blocks and Boolean protocol and implementation (optional)
  5. Collections
  6. Homework stuff

Metaclasses

                                      Object!
                                        /
                                       /
                                    Class
                                     /
                                    /
                 Object ===> Object's metaclass
                   /              /
                  /              /
   beetle ====> Car ===> Car's metaclass
                /              /
               /              /
prius ====> HybridCar ===> Hybrid Car's metaclass

Information hiding using objects

Smalltalk

(Other object-oriented languages have elaborate controls for public/private)

Double Dispatch

Typical object-orientation:

What if you need to choose code based on both receiver and argument?

Solution: method name encodes both operation and type of argument

Examples:

addIntegerTo:
addFloatTo:

Smalltalk's initial basis

Blocks and Booleans

Blocks are closures

Blocks are objects

Booleans use continuation-passing style

Protocol for Booleans:
TeX slide

Booleans implemented with two classes True and False

Blocks, control flow, the block protocol

          *** Blocks manage loops ***
-> (val x 10)
-> (val y 20)
-> (whileTrue: [(<= x (* 10 y))]
      [(set x (* x 3))])
nil
-> x
270

Protocol for blocks:
TeX slide

Collections

Goal of objects is reuse

Key to successful reuse is a well-designed class hierarchy

``Collection hierarchy'':
TeX slide

Collectioncontain things
Setobjects in no particular order
KeyedCollectionobjects accessible by keys
Dictionaryany key
SequenceableCollectionkeys are consecutive integers
Listcan grow and shrink
Arrayfixed size, fast access

Collection mutators:
TeX slide

Collection observers:
TeX slide

Collection iterators:
TeX slide

Implementing Collections

Implementing collections:
TeX slide

Reusable methods:
TeX slide

\lit{species} method:
TeX slide

Example collection - Sets

TeX slide

Most subclass methods work by delegating all or part of work to list members

N.B. Set is a client of List, not a subclass!

Real Smalltalk syntax

Real Smalltalk syntax: receiver first:
TeX slide

Mixfix message sends:
TeX slide

27 April 2011: Modules and separate compilation

p4275023 p4275024 p4275025 p4275026 p4275027 p4275028 p4275029 p4275030 p4275031 p4275032 p4275033 p4275034 p4275035 p4275036 p4275037

ENGINEERING FELLOWS FOCUS GROUPS

Why modules?

Unlocking the final door for building large software systems

Many paths will be open to you:

Modules overview

Functions of a true modules system:

Real modules include separately compilable interfaces and implementations

Interfaces

Collect declarations

Things typically declared:

Terminology: a module is a client of the interfaces it depends on

Roles of interfaces in programming:

The best-proven technology for structuring large systems.

Ways of thinking about interfaces

Two approaches to writing interfaces

Interface "projected" from implementation:

Full interfaces:

Structure of implementations

Standard ML Modules

The Perl of module languages?

What we've been using so far is the core language

Modules are a separate language layered on top.

ML module terminology

Interface is a signature

Implementation is a structure

Generic module (parameterized implementation) is a functor

You have too many choices

Implementations

Structures match signatures

The implementation relation is called matching

Surprise! A single structure can match many signatures

Structure Basics

A structure has a private name space

To refer to other structures, use qualified names

    Int.toString
    TextIO.stdIn
    Real.fromInt

Structures relate to other structures

No explicit imports (terrible decision)

Explicit exports handled by signature ascription (good decision)

Structures may be nested (great decision)

Workaround for lack of imports

External structures should usually be abbreviated:

structure S = String
structure C = Char
val ucase = S.map C.toUpper
val doubleVision = 
  S.translate (fn c => S.implode [c, c])

If you use open, I will hunt you down and hurt you
(breaks modularity)

Signature basics

Signature says what's in a structure

Specify types (w/kind), values (w/type), exceptions.

Ordinary type examples:

    type t        // abstract type, kind *
    eqtype t
    type t = ...  // 'manifest' type
    datatype t = ...

Type constructors work too

    type 'a t     // abstract, kind * => *   
    eqtype 'a t
    type 'a t = ...
    datatype 'a t = ...

29 April 2011: ML Modules examples

p4295038 p4295039 p4295040 p4295041 p4295042 p4295043 p4295044 p4295045 p4295046 p4295047 p4295048 p4295049 p4295050 p4295051 p4295052 p4295053

First Example: Environments

          *** Example: environment signature ***
signature ENV = sig
  type 'a env
  eqtype name
  exception NotFound of name
  val empty : 'a env
  val bind  : name * 'a * 'a env -> 'a env
  val find  : name * 'a env -> 'a
end


          *** Environment structure ***
structure ListEnv : ENV = struct
  type name = string
  type 'a env = (name * 'a) list
  exception NotFound of name
  val empty : 'a env = []
  fun bind(n, x, rho) = (n,x) :: rho
  fun find(n, []) = raise NotFound n
    | find(n, (n', x')::l) =
        if n = n' then x' else find(n, l)

Matching is OK, but too many details exposed

Two kinds of ascription (matching)

Two kinds of ascription (matching)

What is hidden?

Transparent ascription ListENV : ENV

Opaque ascription ListENV :> ENV

But wait! The opaque version is useless! We don't know what type name is.

Reveal that names are strings:
TeX slide

Second example: more abstract environments

Going back, signature does not really work with search trees

Equality types are a bad idea anyway

          *** More abstract environments ***
signature ENV = sig
  type 'a env
  type name
  exception NotFound of name
  val empty : 'a env
  val bind  : name * 'a * 'a env -> 'a env
  val find  : name * 'a env -> 'a
end


          *** Environment with binary-search tree ***
signature ORD_KEY = sig
  type ord_key
  val compare : ord_key * ord_key -> order
end

functor SearchTreeFn(Key: ORD_KEY) : ENV = 
  struct
    ...  binary-tree implementation ...
  end

(ord_key?! These people have no taste!)

You give me an ordered type, and I'll give you an environment:

Key idea: SearchTreeFn is compiled once, used over and over, and supports modular type checking

Let's see the code!

          *** Implementation of search-tree functor ***
functor SearchTreeFn(Key: ORD_KEY) : ENV = struct
 type name = Key.ord_key
 datatype 'a tree
   = NULL
   | NODE of name * 'a * 'a tree * 'a tree
 type 'a env = 'a tree
 exception NotFound of name

 val empty = NULL
 fun find (name, NULL) = raise NotFound name
   | find (name, NODE (n', x', L, R)) =
        case Key.compare (name, n')
          of EQUAL   => x'
           | LESS    => find (name, L)
           | GREATER => find (name, R)


          *** Search-tree functor, finale ***
 fun bind (name, x, NULL) = NODE (name, x, NULL, NULL)
   | bind (name, x, NODE (n', x', L, R)) =
       case Key.compare (name, n')
         of EQUAL   => NODE (name, x, L, R)
          | LESS    => NODE (n', x', bind(name, x, L), R)
          | GREATER => NODE (n', x', L, bind(name, x, R))
end


          *** Using the functor ***
structure OrderedString = struct
  type ord_key = string
  val compare = String.compare
end

structure StringEnv =
            SearchTreeFn(OrderedString)

Note that ORD_KEY is not required

Signature matching defined

Two-step process

Signature matching in two steps:

Step one: realize the signature

Step two: cut down the structure to match

A form of subtyping!!

"More polymorphic"

2 May 2011: ML modules, type identity, how to use ML

p5025054 p5025055 p5025056 p5025057 p5025058 p5025059 p5025060 p5025061 p5025062 p5025063 p5025064

Review

signatures, structures, functors

signature matching:

Type identity

When are two types equal?

Abbreviation defined with type equals RHS

Type defined with datatype is always fresh

Abstract types are never equal,

Example with sharing

Interactive interpreter with persistent store.

signature INTERPRETER = sig
 type value
 type exp
 val eval : exp * value ref Env.env -> value
end

signature PERSISTENT_STORE = sig
 type value
 val save    : (string * value) list -> unit
 val restore : unit -> (string * value) list
end

Abstractions must agree

Interpreter and store must have same value type:

functor INTERACTIVE_SYSTEM (
  structure I : INTERPETER
  structure S : PERSISTENT_STORE
    sharing type I.value = S.value
) = struct ... end

Styles of using Standard ML Modules

  1. Could use SML like Modula, Ada

    • One structure, one signature, no/rare functors

    Doesn't work as well as Modula, Ada:

    • If you even mention another structure, you get import dependencies that are not easy to discover

    Goal of SML is flexibility, not control.

  2. Tofte style:

    • Write everything as functors and signatures
    • Build the whole system by functor application

    (See "four lectures on standard ML")

    Crucial advantages:

    • Because there are no structures until the end, properties of structures can't "leak" by accident.

    • Dependencies between modules are explicit and totally controlled.

  3. What most people do: a haphazard mix of functors and structures

Function vs functor

Core language Modules language
Function maps value to valueFunctor maps structure to structure
Cannot depend on a typeCan depend on a type
Unknown type? Must carry around clunky type parameterUnknown type? No problem! It's invisible. (Think `type 'a env` does not depend on names)
Lightweight notationHeavyweight notation
Inconvenient if there are a lot of argumentsEasy when there are a lot of arguments (names!)
Has a run-time costEssentially no run-time cost

Comparing modules and objects

Modules more flexible about information hiding

Comparing modules and objects

A module can encapsulate two types:

signature CONTROL_FLOW_GRAPH = sig
  type graph
  type node
  val entry      : graph -> node
  val exit       : graph -> node
  val graph_of   : node -> graph
  val successors : node -> node list
    ...
end

Both graph and node ``methods'' know all

Modules, objects, and reuse

Like Smalltalk protocols and Java interfaces,
SML signatures support subtyping

But no automatic code reuse (no inheritance)

Assessing Standard ML modules

SML modules: powerful but strange

Power beyond mortal ken

Clean winner on expressive power

Reliability, not so much

Module semantics is considered intimidating!

Needed: good models, guidelines, tutorials

7 May 2011: Review for the final

p5075065 p5075066 p5075067 p5075068 p5075069 p5075070 p5075071 p5075072 p5075073 p5075074 p5075075 p5075076

Notes will be mailed out.


Back to class home page