Lecture notes for COMP 105 (Programming Languages)

9 September 2020: Introduction to Comp 105

There are PDF slides for 9/10/2020.

Handout: 105 Tip Sheet

Handout: 105 Syllabus

Handout: Experiences of successful 105 students

Handout: Lessons in program design

Introduction

My pronouns are she, her, hers.

Please call me “Kathleen,” or “Professor Fisher”

Reflection: Are all programming languages created equal?

Under what circumstances is one programming language better than another one? Write 2-3 sentences explaining your thoughts.

Why we are here

Why it’s different from COMP 40:

What it involves:

Today:

What is a programming language?

What is a PL?

If you’ve taken 170, a caution: every programming language is a formal language, but not every formal language is a programming language.

My answer: a language with a compositional structure that has an element of run time to it.

Key idea: phase distinction

We will talk about “compile time” and “run time” (and later, type-checking time).

Key ideas: static and dynamic

The thing that you write down is syntax

What flies around at run time is values

(You know about syntax and values already—but this semester we’re going to think about them very systematically.

First three classes: new ways of thinking about stuff you already know

Check your understanding: The big picture

  1. The goal of this course is to teach you principles of programming languages so you’ll be able to teach yourself new languages quickly and improve your coding in many different languages.
  1. Almost all languages are equally powerful, and so are equally well suited to solving any programming task.

Check your understanding: Syntax vs. semantics

Indicate whether each statement is more about the syntax or the semantics of a program:

  1. A valid C program always has a matching } for every {.
  1. A valid C program never dereferences an uninitialized pointer.
  1. In a valid C program, the right-hand side of an = operation must be an expression.
  1. In our house style, every if statement has a corresponding else.
  1. When a ternary conditional expression x ? y : z is evaluated, it acts a lot like an if

Reflection: Inductive definitions for numerals

Numerals are an example of syntax: the rules for how things are written. At run time, a numeral stands for a natural number; the natural number is its semantics. Numerals are things like 0, 2020, 73, and 1852.
They are not things like 311t3, 2+2, or -12.

  1. Write an inductive definition of numerals (think base case and inductive case).

  2. There’s more than one! When you finish, look for another.

Value structure for numerals: Induction

Example: Numerals

Numerals are syntax; Natural numbers are semantics

Numerals:

2018
73
1852

Not numerals:

3l1t3
2+2
-12

Approach 1: the cons approach

Approach 2: the snoc approach

Approach 3: the tree approach

have written a list or sequence of digits as a definition. That’s good thinking, but it leaves open the problem of defining “list” or “sequence”. These definitions take care of that issue.

Check your understanding: Representations of numerals

Different representations make different operations easier. Consider the three different ways we discussed to represent numerals, which we called the cons, snoc, and tree approaches.

  1. Which representation makes it easiest to access the most significant digit?
  1. Which representation makes it easiest to access the least significant digit?
  1. Which representation makes it equally easy to access either the first or last digit?

Reflection: Natural numbers

We have limited operations on numerals: stick a digit before or after a numeral, or stick two numerals together. But on numbers we have many more operations. What are two simple, useful operations that are defined on natural numbers?

Value structure for natural numbers: induction again

Natural numbers are semantics for numerals.

Infinite set, that we’ll define using induction.

Check your understanding: Representations of numerals

The decimal definition of natural numbers in the video ignores almost all operations on natural numbers. It uses just addition and multiplication. And not just any addition and multiplication, but a highly stylized one: multiply by ten and then add. So out of all the gazillions of ways we might compute a natural number, we are limiting ourselves. According to this definition, how many different ways of forming natural numbers must we consider to be able to account for every natural number?

Reflection: Defining functions over natural numbers

We want to define a function that examines a natural number (actually a machine integer) to find out if its numeral can written using only the digit 4. How many cases will such a function have to consider? Do any of the cases look like they will require recursion?

Writing code with recursion and induction

Data-driven design process. Connections to design ideas from 1970s, 1980s, 1990s, plus test-driven development (2000s).

Recursive-function problem

Exercise: all-fours?

Write a function that takes a natural number n and returns true (1) if and only if all the digits in n’s numeral are 4’s.

Key design step: Form of number

Choose inductive structure for natural numbers:

Step 1: Forms of DECNUMERAL proof system (1st lesson in program design):

Example inputs

Step 2:

Function’s name and contract

Steps 3 and 4:

Function (all-fours? n) returns nonzero if and only if the decimal representation of n can be written using only the digit 4.

Example results

Step 5: write expected results as unit tests:

(check-assert      (all-fours?  4))
(check-assert (not (all-fours?  9)))
(check-assert      (all-fours? 44))
(check-assert (not (all-fours? 48)))
(check-assert (not (all-fours? 907)))

Algebraic laws

Step 6: Generalize example results to arbitrary forms of data

(all-fours? d) ==  (= d 4)

(all-fours? (+ (* 10 m) d)) ==
   (= d 4) && (all-fours? m)

Left-hand sides turn into case analysis

Step 7:

; (all-fours? d) == ...
; (all-fours? (+ (* 10 m) d)) ==  ...

(define all-fours? (n)
  (if (< n 10)
    ... case for n = d ...
    ... case for n = (+ (* 10 m) d),
        so m = (/ n 10) and
        d = (mod n 10) ...))

Each right-hand side becomes a result

Step 8:

; (all-fours? d) ==  (= d 4)
; (all-fours? (+ (* 10 m) d)) ==
;           (= d 4) && (all-fours? m)

(define all-fours? (n)
   (if (< n 10)
       (= n 4)
       (and (= 4 (mod n 10))
            (all-fours? (/ n 10)))))

Revisit tests:

Step 9:

(check-assert      (all-fours?  4))
(check-assert (not (all-fours?  9)))
(check-assert      (all-fours? 44))
(check-assert (not (all-fours? 907)))

(check-assert (not (all-fours? 48)))

Checklist:

Check your understanding: Design process

Put the nine steps of the design process for writing functions guided by the structure of the data and algebraic laws in order.

  1. Turn right-hand sides of algebraic laws into results
  2. Specify example results
  3. Choose function’s name
  4. Revisit Tests
  5. Describe function’s contract
  6. List example inputs
  7. Choose representation of inputs
  8. Turn left-hand sides of algebraic laws into case analysis
  9. Write algebraic laws

The all-fours? function

;; forms of data
;; Either a single digit d
;; OR 10 * m + d where m not equal 0

;; example inputs
(val x1 4 )
(val x2 9)
(val x3 44)
(val x4 907)
(val x5 48)

;; Pick the name now
;; Write the contract now
;; Write check assert cases now

;; Algebraic Laws
;; (all-fours? d) = (= d 4)
;; (all-fours? (+ (* 10 m) d)) == 
;;     (= d 4) && (all-fours? m)       

;; The code!
  ;; return nonzero if and only if 
  ;; decimal rep of n can be written
  ;; using only the digit 4
(define all-fours? (n)
  (if (< n 10)
      (= n 4)
      (and (= 4 (mod n 10)) 
           (all-fours? (/ n 10)))))

        (check-assert      (all-fours? 4))
        (check-assert (not (all-fours? 9)))
        (check-assert      (all-fours? 44))
        (check-assert (not (all-fours? 48)))
        (check-assert (not (all-fours? 907)))

Coding Interlude: all-fours?

Consider the all-fours? code from the previous video. Run the code on the sample data.

Change the code to make it be the all-nines? function.

Reflection: Various syntactic forms

Consider the following two functions that show all the syntactic forms in Impcore.

(define even? (n) (= (mod n 2) 0))

(define 3n+1-sequence (n) ; from Collatz
  (begin
    (while (!= n 1)
       (begin
         (println n)
         (if (even? n)
             (set n (/ n 2))
             (set n (+ (* 3 n) 1)))))
    n))
  1. What do you think these functions do?

  2. What do you notice about the syntax of these functions (ie, how they are written down)?

  3. How might you write these same two functions in C?

Syntax in programming languages

Idea of LISP syntax

Parenthesized prefix syntax:

Examples:

(+ 2 2)
(if (isbound? x rho) (lookup rho x) (error 99))

(For now, we use just the round brackets)

Impcore structure

Two syntactic categories: expressions, definitions

No statements!—expression-oriented

(if e1 e2 e3)
(while e1 e2)
(set x e)
(begin e1 ... en)
(f e1 ... en)

Evaluating e has value, may have side effects

Functions f named (e.g., + - * / = < > print)

The only type of data is “machine integer” (deliberate oversimplification)

Syntactic structure of Impcore

An Impcore program is a sequence of definitions

(define mod (m n) (- m (* n (/ m n))))

Compare

int mod (int m, int n) { 
  return m - n * (m / n); 
}

Impcore variable definition

Example

(val n 99)

Compare

int n = 99;

Slide 14 

14 September 2020: Abstract syntax and operational semantics

There are PDF slides for 9/15/2020.

Retrospective: Introduction to the study of programming languages, induction and recursion, and Impcore.

Pedagogical studies show that reflection about what you are learning really helps the learning process, so we will start each module by asking you to briefly reflect on what you learned in the previous module.

Write one thing you learned in the previous module and one question you have about that material.

Your responses may be anonymously shared with the rest of the class.

Reflection: Exploring composition in programming languages

Composition, which is the ability to build many sensible things using just a few construction principles, is a critical to programming languages. An example at the syntactic level is the ability to build bigger programs out of smaller pieces. To get some practice with this idea, complete the following exercises:

Programming-language semantics

“Semantics” means “meaning.”

What problem are we trying to solve?

Know what’s supposed to happen when you run the code

Ways of knowing:

You may recall the beginner exercise “make a peanut butter and jelly sandwich”

Why bother with precise semantics?

The programming languages you encounter after 105 will certainly look different from what we study this term. But most of them will actually be the same. Studying semantics helps you identify that.

The payoff: Your new skills will apply

Behavior decomposes

We want a computational notion of meaning.

What happens when we run (* y 3)?

We must know something about *, y, 3, and function application.

Knowledge is expressed inductively

Example of domain where compositionaliy doesn’t apply: English spelling and pronunciation

Check your understanding: Principle of composition

The following questions are designed to give you practice with composition by asking you to think about how C statements and expressions are composed to form more complex C statements and expressions.

C has a construct of the form ___?___:___. For the result to be syntactically well formed, what kinds of things can go in the blanks?

Is the result of filling in the blanks in the above C construct

C also has an if construct of the form if (___) ___; else ___;.
For the result to be syntactically well formed, what kinds of things can go in the blanks?

Is the result of filling in the blanks in the C if construct

Recall Impcore concrete syntax

Definitions and expressions:

Define behaviors inductively

We’ll focus on expressions

Base cases (plural!): numerals, names

Inductive steps: compound forms

First, simplify the task of specification

What’s different? What’s the same?

 x = 3;               (set x 3)

 while (i * i < n)    (while (< (* i i) n)
   i = i + 1;            (set i (+ i 1)))

Abstract away gratuitous differences

Abstract syntax

Same inductive structure as concrete syntax

But,

Concrete syntax: sequence of symbols

Abstract syntax: ???

The abstraction is a tree

The abstract-syntax tree (AST):

Exp = LITERAL (Value)
    | VAR     (Name)
    | SET     (Name name, Exp exp)
    | IFX     (Exp cond, Exp true, Exp false)
    | WHILEX  (Exp cond, Exp exp)
    | BEGIN   (Explist)
    | APPLY   (Name name, Explist actuals)

One kind of “application” for both user-defined and primitive functions (One syntax, two behaviors)

In C, trees are fiddly

typedef struct Exp *Exp;
typedef enum {
  LITERAL, VAR, SET, IFX, WHILEX, BEGIN, APPLY
} Expalt;        /* which alternative is it? */

struct Exp {  // 'alt' combines with any field in union
    Expalt alt;
    union {
        Value literal;
        Name var;
        struct { Name name; Exp exp; } set;
        struct { Exp cond; Exp true; Exp false; } ifx;
        struct { Exp cond; Exp exp; } whilex;
        Explist begin;
        struct { Name name; Explist actuals; } apply;
    };
};

ASTs

Question: What do we assign behavior to?

Answer: The Abstract Syntax Tree (AST) of the program.

Question: How can we represent all while loops?

while (i < n && a[i] < x) { i++ }

Answer:

As a data structure:

Let’s picture some trees

An expression:

  (f x (* y 3))

Let’s picture some trees

And a definition:

  (define abs (n)
    (if (< n 0) (- 0 n) n))

Check your understanding: AST Representations

Which of the following is the AST representation of the concrete Impcore syntax (if (> x 9) 1 0)?

Key question: names

Reflection: What’s in a name?

We’ve talked about how composition is really important in giving meaning to programs: you can understand a program fragment such as (+ e1 e2) by understanding each of its pieces (+, e1, and e2) and then putting them together.

Some parts of programs can’t be broken down further. An integer, for example, doesn’t have any smaller pieces that are still an integer. But we can understand an integer just by look at the bits that comprise it (aka, its representation).

What about names, such as a variable x? A name doesn’t have any smaller pieces, so we can’t understand it by understanding its pieces. And a name doesn’t contain in itself its meaning the way an integer does.

Explain how you think language implementors might give meaning to a name.

Hint: think about how dictionaries give meanings to words.

From ASTs to behaviors

Behaviors of ASTs, part I: Atomic forms

Numeral: stands for a value

Name: stands for what?

Slide 10 

Slide 11 

``Environment’’ is formal term

You may also hear:

So-called “Scope rules” determine which environment governs where

Find behavior using environment

Consider the program fragment

  (* y 3)   ;; what does it mean?

Key Idea: Look up the meaning of variables in an environment

Impcore uses three environments

Global variables ξ (“ksee”)

Functions ϕ (“fee”)

Formal parameters ρ (“roe”)

There are no local variables

Function environment ϕ not shared with variables—just like Perl

Check your understanding: Operational semantics judgments

  1. We find the meaning of variables by looking them up in an environment.

    • True
    • False
  2. If an Impcore name f stands for a function, which environment will it be in?

    • ξ (“ksee”)
    • ϕ (“fee”)
    • ρ (“roe”)
  3. If an Impcore name x is defined in environment ξ (meaning x is in the domain of ξ), then it represents
    • a global variable
    • a function
    • a formal parameter

16 September 2020: Judgments, Rules, and Proofs

There are PDF slides for 9/17/2020.

Handout: Redacted Impcore rules

Handout: Overview of Induction

Reflection: Modeling computation in math

Suppose the notation e ⇓ 3 means that expression e evaluates to 3 and the notation n dom ρ means that variable n is in the environment ρ (and hence is defined in ρ).

  1. How would you say an expression evaluates to 7?
  2. How would you say formally that an expression like (+ n 1) evaluates successfully only if n is defined?
  3. How would you say what (set x 0) does, and also (set n 0), and (set m 0), without having to repeat yourself for every variable you want to set to 0?
  4. What information would you need to know what (all-fours? 344) evaluates to?
  5. Why would you want to write these expressions in math, instead of just writing in natural language?

Rules and metatheory

Slide 1 

Slide 2 

Slide 3 

Reflection: Operational semantics judgments

What do you think the rules are for evaluating a literal and for evaluating a variable? Fill in the question marks with your best guess!

Slide 4 

Slide 5 

Slide 6 

Check your understanding: Semantics of variables

Consider the semantic rules for assignment to variables FormalAssign and GlobalAssign.

If x is in both the global environment ξ and the formal environment ρ, which version takes precedent?

During the execution of the expression set(x, e), which environments may change? (Check all that apply)

Reflection: Mapping semantics to code

Consider again the semantic rules for assignment to variables FormalAssign and GlobalAssign.

How might these rules guide a language implementor?

Slide 7 

Slide 8 

Slide 9 

Slide 10 

Slide 11 

Slide 12 

Slide 13 

Check your understanding: C implementation of environment operations

  1. In the implementation of variable lookup, what C operation corresponds to the semantic check that x ∈ dom ρ?
  1. In the implementation of assignment, what part of the semantic rules corresponds to the C operation of bindval(e->set.name, v, globals)?

Note that the implementation doesn’t precisely follow the semantics. Instead of returning a new environment as directed by the semantics, the implementation instead updates the old environment in place using a side-effecting operation. This choice is more efficient than copying the old environment. It’s a metatheorem of Impcore that this optimization is sound. It’s a metatheorem that will no longer hold when we shift to Scheme and start copying environments to form closures.

Reflection: Using semantics to understand how programs are evaluated

Consider the following code:

      (define and (p q)
        (if p q 0))

      (define digit? (n)
        (and (<= 0 n) (< n 10)))

Suppose we evaluate (digit? 7)

  1. In the body of digit?, what expressions are evaluated in what order?
  1. The body of the digit? function matches the template (f e1 e2) where f, e1, and e2 are meta-variables that stand for different program fragments. In matching the template to the body of digit?,

    • What is f?
    • What is e1?
    • What is e2?

Good and bad judgments

Reflection: Good and bad judgments

Just because something is written in math notation doesn’t make it true!

Which of the following judgements correctly describe what happens at runtime? Explain your answers. Pay special attention to the lack of primes on the concluding environments.

  1. (+ 2 2), ξ, ϕ, ρ⟩ ⇓ ⟨4, ξ, ϕ, ρ

  2. (+ 2 2), ξ, ϕ, ρ⟩ ⇓ ⟨99, ξ, ϕ, ρ

  3. (+ 2 2), ξ, ϕ, ρ⟩ ⇓ ⟨4, ∅, ϕ, ρ

  4. (while 1 0), ξ, ϕ, ρ⟩ ⇓ ⟨77, ξ, ϕ, ρ

  5. (begin (set n (+ n 1)) 17)ξ, ϕ, ρ
     ⇓ ⟨17, ξ, ϕ, ρ

Proofs: Putting rules together

Every terminating computation is described by a data structure—we’re going to turn computation into a data structure: a tree. Proofs about computations are hard (see: COMP 170), but proofs about trees are lots easier (see: COMP 61).

How do we know it’s right?

From rules to proofs

Judgment speaks truth when ``derivable’’

Special kind of proof: derivation

A form of “syntactic proof”

Recursive evaluator constructs derivation

Root of derivation at the bottom (surprise!)

Build

First let’s see a movie

<– The ``Tony Hawk’’ algorithm –>

Building a derivation

Slide 16 

Slide 17 

Slide 18 

Slide 19 

Slide 20 

Slide 21 

Slide 22 

Slide 23 

Slide 24 

Slide 25 

Slide 26 

Slide 27 

Slide 28 

Slide 29 

Building derivations

Slide 30 

Check your understanding: Derivation trees

  1. In constructing the derivation tree for the program (digit? 7), what is the rule at the bottom of the tree?
  1. What environment will be used to find the meaning of the name digit??
  1. What rule will be used to find the meaning of the sub-expression 7?

21 September 2020: Derivations and metatheory

There are PDF slides for 9/22/2020.

All your operational-semantics questions are answered.

Reflection: Metatheory vs theory

Metatheory allows us to prove properties about all derivations, which means it allows us to prove properties about all programs in a language.

The derivation that demonstrates that (digit? 7) evaluates to true is a proof about that particular program, so it theory but not metatheory. A metatheoretic property about Java is that in a well-typed program, every variable mentioned in the program has previously been defined.

Classify the following properties as theory or metatheory:

  1. The evaluation of any Impcore program will never access undefined memory.

    • theory
    • metatheory
  2. The evaluation of (and (<= 0 n) (< n 10)) will access the value associated with the variable n.

    • theory
    • metatheory
  3. For any Impcore program p, every variable mentioned in p will be in the domain of either ξ, ϕ, or ρ.

    • theory
    • metatheory

Note: a theoretic or metatheoretic property can be either true or false. The distinction is whether the property would apply only to specific programs or to all programs were it to be true. The proof methods used to prove theoretic and metatheoretic properties differ because a theory proof applies only to a specific program while a metatheoretic proof applies to all programs in a language.

Proofs about all derivations: metatheory

Syntactic proofs empower metatheory

Derivation/proof 𝒟 is a data structure

Got a fact about all derivations?

Prove meta-theoretic properties by structural induction over derivations

Example: Evaluating an expression doesn’t create or destroy any global variables (the set of defined global variables is invariant)

Metatheorems often help implementors

More example metatheorems:

Slide 3 

Reflection: Writing metatheoretic properties

  1. Using the judgment form for Impcore evaluation that we are already familiar with, e, ξ, ϕ, ρ⟩ ⇓ ⟨v, ξ′, ϕ′, ρ′⟩, how would you formally write “In Impcore, evaluation doesn’t change the set of defined global variables?”?
  1. Using the judgment form for Impcore evaluation that we are already familiar with, e, ξ, ϕ, ρ⟩ ⇓ ⟨v, ξ′, ϕ′, ρ′⟩, how would you formally write “In Impcore, evaluation doesn’t change the values of any global variables”?

Metatheorems are proved by induction

Induction over height of derivation trees 𝒟

These are “math-class proofs” (not derivations)

Proof

Let’s try it!

Slide 5 

Cases to try:

For your homework, “Simplified Impcore” leaves out While and Begin rules.

Slide 6 

Slide 7 

Slide 8 

Slide 9 

Slide 10 

Check your understanding: Proving metatheoretic properties

The following questions all relate to the metatheoretic proof that evaluation in Impcore does not change the set of defined global variables, ie, for any Impcore expression e, if e, ξ, ϕ, ρ⟩ ⇓ ⟨v, ξ′, ϕ′, ρ′⟩, then dom(ξ) = dom(ξ′). We prove this property by induction on the derivation of e, ξ, ϕ, ρ⟩ ⇓ ⟨v, ξ′, ϕ′, ρ′⟩.

  1. The proof proceeds by case analysis on the last rule used in the derivation e, ξ, ϕ, ρ⟩ ⇓ ⟨v, ξ′, ϕ′, ρ′⟩.

    • True
    • False
  1. One of the rules that must be considered is the IfFalse rule

    • True
    • False
  1. The case for GlobalVar is an inductive case.

    • True
    • False
  1. The case for GlobalAssign is an inductive case.

    • True
    • False

Summary and Next Steps

Theory: Operational semantics

Practice: For the next month, programming!

Return to algebraic laws! Lean on them!

23 September 2020: Scheme I: Recursive Programming with Lists and S-Expressions

There are PDF slides for 9/24/2020.

Retrospective: Operational Semantics

Write one thing you learned about operational semantics and one question you still have. Your answers may be anonymously shared with your classmates.

Reflection: From inference rules to recursive functions

In the previous module, you encountered a whole bunch of inference rules. How do you think those inference rules related to functions? When might they be recursive functions?

For a new language, five powerful questions

As a lens for understanding, you can ask these questions about any language:

  1. What are the values? What do expressions/terms evaluate to?

  2. What environments are there? That is, what can names stand for?

  3. What is the abstract syntax? What are the syntactic categories, and what are the forms in each category?

  4. How are terms evaluated? What are the judgments? What are the evaluation rules?

  5. What’s in the initial basis? Primitives and otherwise, what is built in?

(Initial basis for μScheme on page 99)

Introduction to Scheme

Why Scheme?

Vehicle to study recursion and composition, two powerful programming techniques:

Today: programming with lists and S-expressions using algebraic laws

Two new kinds of data:

Scheme Values

Most values are S-expressions.

An fully general S-expression is one of

It is very common to work with a list of S-expressions, which is either

We say “an S-expression followed by a list of S-expressions”

Lists: A subset of S-Expressions.

These are your linked lists from COMP 11 or COMP 15.

Can be defined via a recursion equation or by inference rules:

Slide 1 

Slide 2 

Check your understanding: S-Expressions

  1. Every Scheme list is an S-expression.
  1. Every Scheme S-expression is a List(A) for some A.
  1. Select each of the following S-expressions that are also List(A) where A is the set of symbols.

Reflection: Manipulating values

  1. Values are only useful to the extent that they can be manipulated. What operations do you think would be useful to have on a cons cell?

  2. Recall how we defined numbers via different inductive definitions in Impcore. Using similar thinking, what cases will we need to handle for lists?

Understanding values through their operations

Constructors:

Observers:

Laws of lists:

(null? '()) == #t
(null? (cons v vs)) == #f
(car (cons v vs)) == v
(cdr (cons v vs)) == vs

Check your understanding: Deconstructing lists

A list is formed with either '() or cons.

  1. What operation distinguishes between these two types of list?
  1. Given this structure, how many cases will a function that consumes a list typically have?

Why are lists useful?

These “cheap and cheerful” representations are less efficient than balanced search trees, but are very easy to implement and work with—see the book.

The only thing new here is automatic memory management. Everything else you could do in C. (You can have automatic memory management in C as well.)

Programming with lists: The member? function

;; forms of data
;; '()
;; (cons n ns) where n is a number and 
               ns is a list of numbers

;; example inputs
(val x1 '())
(val x2 '(1 2 3 4))
(val x3 '(8))
(val x4 '(10 10))

;; Write check assert cases now

;; Algebraic Laws
;; (member? n '()) = #f
;; (member? n (cons n ns)) = #t
;; (member? n (cons n' ns)) = (member? n ns), 
                              where n' != n

;; The code!
  ;; return true if and only if 
  ;; list of numbers ms contains
  ;; number m
(define member? (m ms)
  (if (null? ms)
      (if (= m (car ms))
      (member? m (cdr ms)))))

        (check-assert (not (member? 8 '())))
        (check-assert (not (member? 8 '(1 2 3))))
        (check-assert      (member? 8 '(8)))
        (check-assert      (member? 8 '(10 7 8 12)))

Coding Interlude: member?

Consider the member? code from the previous video. Run the code on the sample data.

  1. Add a check-assert case where the element is in the list.

  2. Add a check-assert case where the element is not in the list.

Programming with lists: The sum-16? function

Do two distinct list elements sum to 16?

;; forms of data
;; '()
;; (cons n ns) where n is a number and 
               ns is a list of numbers

;; example inputs
(val x5 '(6 10))
(val x6 '(3 6 12 7 8 9 10))

;; Write check assert cases now

;; Algebraic Laws
;; (sum-16? '()) = #f
;; (sum-16? (cons n ns)) = 
;;      (sum-16? ns)  OR
;;      n and an element in ns sum to 16
;;      i.e., (member?  (- 16 n) ns))


;; The code!
  ;; returns #t if and only if ms contains 
  ;; two distinct elements that sum to 16, 
  ;; where ms is a list of numbers

(define sum-16? (ms)
  (if (null? ms)
      (|| (sum-16? (cdr ms))  
       (member? (- 16 (car ms)) (cdr ms)))))

        (check-assert (not (sum-16? x1)))
        (check-assert (not (sum-16? x2)))
        (check-assert (not (sum-16? x3)))
        (check-assert (not (sum-16? x4)))
        (check-assert      (sum-16? x5)))
        (check-assert      (sum-16? x6)))

Coding Interlude: sum-16?

Consider the sum-16? code from the previous video. Run the code on the sample data.

Modify the algebraic laws and the code to check if the product of two elements in the list equals 16. Note that * and / are the multiplication and division operations in μScheme.

Reflection: Quote syntax

Nobody wants to write a constant list of numbers as (cons 0 (cons 1 (cons 2 (cons 3 (cons 4 (cons 5 (cons 6 (cons 7 (cons 8 (cons 9 '())))))))).

What notation would you suggest instead?

Quoted literals

Defining constants can be verbose:

(cons 6 (cons 7 (cons 8 (cons 9 '()))))

Quotation syntax more compact:

(quote (6 7 8 9))     or     '(6 7 8 9)

Works with any constant S-expression:

(cons 'a (cons (cons 'b '()) '()))

replaces

'(a (b))

Check your understanding: Quoted literals

Write each ordinary S-expression using quote syntax:

  1. (cons 1 '()) = BLANK1
  2. (cons 1 (cons 2 '())) = BLANK2
  3. (cons (cons 1 '()) '()) = BLANK3
  4. (cons (cons 'a (cons 'b '())) (cons 'c '())) = BLANK4

Reflection: New syntactic forms

To program with lists and S-expressions, we absolutely have to have functions like null?, cons, car, and cdr. And the quote syntax is certainly useful. To make it even easier to program with S-expressions, what other new syntax might be useful?

μScheme’s new syntax

Slide 4 

28 September 2020: First-class and higher-order functions

There are PDF slides for 9/29/2020.

Big picture:

Retrospective: Scheme I: Recursive programming with lists

Write one thing you learned in the previous module. Your answers may be anonymously shared with your classmates.

es

Reflection: Where is there shared structure?

Consider writing code to answer the following:

Now answer these questions:

Functions are values

Review

Do two elements in a list add to 16?

;; (sum-16? '())         = #f
;; (sum-16? (cons n ns)) =
;;      (sum-16? ns)  OR
;;      ns has an element m with n + m = 16

C implementation: nested for loops

Functional implementation: recursive functions

More two-element search problems

Let’s not rewrite the same nested loops/recursive functions over and over

Idea: Functional programming, Part One

Divide and conquer:

The key trick:

Example:

-> (define product-is-16? (n m) (= (* m n) 16))
product-is-16?
-> (any-two? product-is-16? '(1 2 3 4))
#f
-> (any-two? product-is-16? '(8 2))    
#t

Reflection: Shared structure

Consider again the following questions:

For each question, describe the question-specific search criteria. For example, for the first question, the criterion is

  (define sum-is-16? (n m) (= (+ m n) 16))

Answers in pseudo-code or English are fine.

Implementing any-two?

Implementing

Algebraic laws for special case:

;; (sum-16? '())         = #f
;; (sum-16? (cons n ns)) = (sum-16? ns)  OR
;;                         ns has an element m
;;                         with m + n = 16

Algebraic laws for general case:

;; (any-two? q? '())         == ...
;; (any-two? q? (cons y ys)) == ...

Implementing

Algebraic laws for special case:

;; (sum-16? '())         = #f
;; (sum-16? (cons n ns)) = (sum-16? ns)  OR
;;                         ns has an element m
;;                         with m + n = 16

Algebraic laws for general case:

;; (any-two? q? '())         == #f
;; (any-two? q? (cons y ys)) == (any-two? q? ys) OR
;;                              ys has z with (q? y z)

Start with linear search!

Linear search

;; (exists? p? '())         == ...
;; (exists? p? (cons y ys)) == ...

Linear search

;; (exists? p? '())         == #f
;; (exists? p? (cons y ys)) == (|| (p? y)
;;                                 (exists? p? ys))

Operator || short-circuits:
Evaluates second argument only if first returns #f

Check your understanding: The exists? function

Introduction to lambda

Two-element search

Algebraic laws:

;; (any-two? q? '())         == #f
;; (any-two? q? (cons y ys)) == (any-two? q? ys) OR
;;                              (exists? q-with-y? ys)
;;               where (q-with-y? z) == (q? y z)

But where does q-with-y? come from?

Functional programming, Part Two

(q-with-y? z) == (q? y z)

q-with-y? = a function that takes z, returns

q-with-y? = (lambda (z) (q? y z))

Built into Scheme since 1973!
Added to C++ in 2011

Two-element search, complete {&} formal

;; (any-two? q? '())         == #f
;; (any-two? q? (cons y ys)) ==
;;   (|| (any-two? q? ys)
;;       (exists? (lambda (z) (q? y z)) ys))

Code in lecture notes.

(define any-two? (q? xs)
  ;; tell if any two elements y and z 
  ;; in order satisfy (q? y z)
  (if (null? xs)
      (|| (any-two? q? (cdr xs))
          (exists? (lambda (z) (q? (car xs) z))
                   (cdr xs)))))

Lambda expressions in general

In general, (lambda (x1 ... xn) e)

Capture: lambda “takes” free variables with it.

Programming with functions

Functions are “first-class” values

Created with lambda
(In Scheme, define combines “val” and “lambda”)

Check your understanding: Lambda

Consider the following code when answering the questions below:

(define combine (p1? p2?)
   (lambda (x) (if (p1? x) (p2? x) #f)))

(define divvy (p1? p2?)
   (lambda (x) (if (p1? x) #t (p2? x))))

(val c-p-e (combine prime? even?))
(val d-p-o (divvy   prime? odd?))
  1. The expression (c-p-e 9) equals

    • #t
    • #f
  2. The expression (c-p-e 8) equals

    • #t
    • #f
  3. The expression (c-p-e 7) equals

    • #t
    • #f
  4. The expression (d-p-o 9) equals

    • #t
    • #f
  5. The expression (d-p-o 8) equals

    • #t
    • #f
  6. The expression (d-p-o 7) equals

    • #t
    • #f

Escaping functions

Slide 13 

μScheme’s semantics: Overview and let

Reflection: Scheme vs Impcore

We’ve now studied two different languages: Impcore and μScheme.

Impcore: Things that should offend you

Look up function vs look up variable:

To get variable, check multiple environments

Create a function? Must give it a name

Slide 15 

Slide 16 

It’s not precisely true that rho never changes.
New variables are added when they come into scope.
Old variables are deleted when they go out of scope.
But the location associated with a variable never changes.

Slide 17 

Strange syntax, but simple semantics!

Key idea: don’t worry about memory

Slide 18 

Check your understanding: meta-variables

Consider the semantic rule for let.

Match the meta-variables with their meanings.

Meta-variables:

Meanings:

Another way to write this one:

Check your understanding: The semantics of let

Consider the semantic rule for let.

Expression e1 is evaluated after expression e2.

The memory cells allocated for x1 and x2 must be distinct, meaning they cannot alias.

In evaluating the body of the let e, any side effects that occured during the evaluation of e1 and e2 will be visible and variables x1 and x2 will be in scope.

Semantics of lambda

Key idea: ``every name I can see—remember where it is stored.’’

Slide 19 

Check your understanding: Semantics of lambda

Consider the semantic rule for lambda.

Side effects (changes to the global store) triggered by the evaluation of the body of the lambda (ie, e) will take place during the evaluation of the lambda.

The environment captured in the closure is the same as the environment in force at the point where the lambda appears in the source program

Semantics of function application

Slide 20 

Questions about ApplyClosure:

Check your understanding: Semantics of application

Consider the semantic rule for ApplyClosure2.

Which expression evaluates to a closure?

Which expression represents the first argument to the function?

Which expression represents the body of the function being applied?

In what order are the expressions mentioned in the rule evaluated?

Which environment is in force at the point where the body of the function is evaluated?

Immutable data: its costs and benefits (not covered in class)

Key idea of functional programming: constructed data. Instead of mutating, construct a new one. Supports composition, backtracking, parallelism, shared state.

Already doing it: immutable data (append)

More functional programming

Back to lists!

Append two lists

(define append (xs ys)
  ;; returns a new list that holds all
  ;; the elements of xs followed by
  ;; all the elements of ys
  ...)

Rapid design process

(define append (xs ys) ...)

Forms of data?

Algebraic laws?

(append '()         ys) == 

(append (cons z zs) ys) == 

Equations and function for append

(append '()         ys) == ys

(append (cons z zs) ys) == (cons z (append zs ys))


(define append (xs ys)

  (if (null? xs)

      ys

      (cons (car xs) (append (cdr xs) ys))))

Cost model

The major cost center is cons because it corresponds to allocation.

How many cons cells are allocated?

Let’s rigorously explore the cost of append.

Structural Induction Principle for List(A)

Suppose I can prove two things:

  1. IH (’())

  2. Whenever a in A and also IH(as), then IH (cons a as)

then I can conclude

Forall as in List(A), IH(as)

Claim: Cost (append xs ys) = (length xs)

Proof: By induction on the structure of xs.

Base case: xs = ’()

Inductive case: xs = (cons z zs)

Conclusion: Cost of append is linear in length of first argument.

More functional programming (not covered in class)

List-design shortcuts

Three forms of “followed by”

Two lists? Four cases!

Using informal math notation with .. for “followed by” and e for the empty sequence, we have these laws:

xs .. e         = xs
e .. ys         = ys
(z .. zs) .. ys = z .. (zs .. ys)
xs .. (y .. ys) = (xs .. y) .. ys

The underlying operations are append, cons, and snoc. Which ..’s are which?

Costs of list reversal

Algebraic laws for list reversal:

reverse '() = '()
reverse (x .. xs) = reverse xs .. reverse '(x) = reverse xs .. '(x)

And the code?

Naive list reversal

(define reverse (xs)
   (if (null? xs)
       '()
       (append (reverse (cdr xs))
               (list1 (car xs)))))

The list1 function maps an atom x to the singleton list containing x.

How many cons cells are allocated? Let’s let n = |xs|.

The method of accumulating parameters

Write laws for

(revapp xs ys) = (append (reverse xs) ys)

Who could write the code?

Reversal by accumulating parameters

(define revapp (xs ys)
   ; return (append (reverse xs) ys)
   (if (null? xs)
       ys
       (revapp (cdr xs) 
               (cons (car xs) ys))))

(define reverse (xs) (revapp xs '()))

The cost of this version is linear in the length of the list being reversed.

Parameter ys is the accumulating parameter.
(A powerful, general technique.)

30 September 2020: Higher-order functions

There are PDF slides for 10/1/2020.

Reflection: List functions

Consider the function, with the algebraic laws:

This algorithm searches for an element in a list.

List other algorithms you are familiar with that operate over a list (or linked list, or an array) and explain briefly what they do.

Predefined list algorithms

Some classics:

The power of lambda

Using first-class functions to enlarge your vocabulary

Supported in many languages: Haskell, ML, Python, JavaScript

List search: exists?

Algorithm encapsulated: linear search

Example: Is there a number in the list?

Algebraic laws:

(exists? p? '())          == ???
(exixts? p? (cons x xs))  == ???


(exists? p? '())          == #f
(exists? p? (cons x xs))  == #t,
                             if (p? x)
(exixts? p? (cons x xs))  == exists? p? xs,
                             otherwise

Defining

; (exists? p? '())         = #f
; (exists? p? (cons y ys)) = #t,          if (p? y)
; (exists? p? (cons y ys)) = (exists? p? ys),
                                          otherwise
-> (define exists? (p? xs)
      (if (null? xs)
          #f
          (if (p? (car xs)) 
              #t
              (exists? p? (cdr xs)))))
-> (exists? number? '(1 2 zoo))
#t
-> (exists? number? '(apple orange))
#f

Coding Interlude: exists?

Define a list ys and two predicates p1? and p2? such that exists? p1? ys evaluates to #t and exists? p2? ys evalutes to #f. Run the code to confirm your answers are correct.

An alternative formulation of the algebraic laws for the exists? function is:

;; (exists? p? '() = #f
;; (exists? p? (cons x xs)) = (|| (p? x) 
;;                                (exists? p? xs))

Modify the definition of exists? following this algebraic formulation instead, and re-run your examples.

List selection: filter

Algorithm encapsulted: Filter list according to a predicate

Example: Keep positive numbers in a list.

Defining

; (filter p? '())         == '()
; (filter p? (cons y ys)) ==
;         (cons y (filter p? ys)), when (p? y)
; (filter p? (cons y ys)) ==
;         (filter p? ys), when (not (p? y))

-> (define filter (p? xs)
     (if (null? xs)
       '()
       (if (p? (car xs))
         (cons (car xs) (filter p? (cdr xs)))
         (filter p? (cdr xs)))))

Running filter

-> (filter (lambda (n) (>  n 0)) '(1 2 -3 -4 5 6))
(1 2 5 6)
-> (filter (lambda (n) (<= n 0)) '(1 2 -3 -4 5 6))
(-3 -4)

Coding Interlude: filter

Use lambda to define a predicate that returns true iff the argument is a number larger than 9.

Define a list ys that contans a mix of one- and two-digit numbers.

Use your lambda function and the filter function defined in the video to produce a new list zs that contains only the elements in ys with at least two-digits.

Run your code.

How would you modify the code to produce a list with only one-digit numbers?

Reflection: Lifting functions to lists: map

Consider the map function. The algorithm it encapsulates is to transform every element according to an argument function. If sq is the function that squares its argument, then map sq '(1,2,3) will produce the list '(1,4,9).

Please complete the algebraic laws for the map function:

(map f '())         ==  ???
(map f (cons n ns)) ==  ???

Algorithm encapsulated: Transform every element according to argument function

Example: Square every element of a list.

Algebraic laws:

(map f '())         ==  '()
(map f (cons n ns)) ==  (cons (f y) (map f ys))

Defining and running map


-> (define map (f xs)
     (if (null? xs)
       '()
       (cons (f (car xs)) (map f (cdr xs)))))
-> (map number? '(3 a b (5 6)))
(#t #f #f #f)
-> (map (lambda(x)(* x x)) '(5 6 7))
(25 36 49)

Slide 7 

The universal list function: foldr

Algorithm encapsulated: Aggregate all elements in a list.

Example: What is the sum of the numbers in a list?

Slide 8 

foldr takes two arguments:

Example: foldr plus zero '(a b)

cons a (cons b '())
 |       |      |
 v       v      v
plus a (plus b zero)

Slide 9 

Slide 10 

Check your understanding:

What does the following code evaluate to?

-> (define foldr (plus zero xs)
     (if (null? xs)
       zero
       (plus (car xs) (foldr plus zero (cdr xs)))))
-> (define combine (x a) (+ 1 a))
-> (foldr combine 0 '(2 3 4 1))

Reflection: Language design: Why the redundancy?

We’ve just seen that functions like exists?, map, and filter can all be easily written in terms of the foldr function. The foldr function, in turn, can easily be written in terms of basic recursion. Other languages have similar redundancy involving loops: for, while, etc.

It takes extra effort for library designers to code more functions or for language designers to support more features.

What do you think makes it worth their effort to support this range of functions and looping constructs?

Slide 11 

Currying

Deriving one-argument functions from two-argument functions

Currying: The motivation

Remember me?

q-with-y? = (lambda (z) (q? y z))

Happens so often, there is a function for it:

q-with-y? = ((curry q?) y)

Called a partial application (one now, one later)

Map/search/filter love curried functions

-> (map     ((curry +) 3) '(1 2 3 4 5))
; add 3 to each element


-> (exists? ((curry =) 3) '(1 2 3 4 5))
; is there an element equal to 3?


-> (filter  ((curry >) 3) '(1 2 3 4 5))
; keep elements that 3 is greater then

The idea of currying

What is the benefit?

To get there, we use currying and partial application.

Slogan: Curried functions take their arguments “one-at-a-time.”

Slide 15 

Slide 16 

Check your understanding:

What does the following code evaluate to?

-> (map     ((curry +) 3) '(1 2 3 4 5))
-> (exists? ((curry =) 3) '(1 2 3 4 5))
-> (filter  ((curry >) 3) '(1 2 3 4 5))
                           ; tricky

Composing functions

Composing Functions

In math, what is the following equal to?

(f o g)(x) == ???

Another algebraic law, another function:

(f o g) (x) = f(g(x))
(f o g) = lambda x. (f (g (x)))

One-argument functions compose

-> (define o (f g) (lambda (x) (f (g x))))
-> (define even? (n) (= 0 (mod n 2)))
-> (val odd? (o not even?))
-> (odd? 3)
#t
-> (odd? 4)
#f

Coding Interlude: compose

The predicate null? returns #t iff the argument list is empty, while the function not inverts a boolean. Use the compose function (o) to write a predicate that tests whether a list is non-empty.

Run your code on both empty and non-empty lists.

Proofs about functions

Proofs about functions

Function consuming A is related to proof about A

5 Oct 2020: Costs, tail calls, and continuations

There are PDF slides for 10/6/2020.

Retrospective: Higher-order functions retrospective

Write one thing you learned about higher-order and anonymous functions. Your answers may be anonymously shared with your classmates.

Reflection: Expected performance

Consider the following algebraic laws for list reversal:

reverse '() = '()
reverse (x .. xs) = reverse xs .. reverse '(x) 
                  = reverse xs .. '(x)

where .. denotes the append operation on lists.

Now consder the corresponding code:

(define reverse (xs)
   (if (null? xs)
       '()
       (append (reverse (cdr xs))
               (list1 (car xs)))))

where list1 is a function that makes a one-element list from its argument.

Answer these questions:

Where have we been, where are we going: functions

Today: costs and control flow

Costs of list reversal

Naive list reversal

(define reverse (xs)
   (if (null? xs)
       '()
       (append (reverse (cdr xs))
               (list1 (car xs)))))

Consider the call reverse xs and let n = |xs|.

How many stack frames are allocated for reverse and append?

How many cons cells are allocated?

Alarm bells should be ringing: Reversing a list should be linear!

We’ll get to a linear verson of list reversal via tail calls and the method of accumulating parameters.

Tail calls

Slide 2 

Slide 3 

What is tail position?

Tail position is defined inductively:

Idea: The last thing that happens

Check your understanding: Tail Position

Consider the following code

-> (define foldr (plus zero xs)
     (if (null? xs)
       zero
       (plus (car xs) (foldr plus zero (cdr xs)))))

Now answer the following questions.

The occurrence of expression zero in the true-branch of the if expression is in tail position.

The occurrence of expression plus in the false-branch of the if expression is in tail position.

The occurrence of expression foldr in the false-branch of the if expression is in tail position.

Tail calls and the method of accumulating parameters

Tail-call optimization & \ accumulating parameters

Tail-call optimization

Accumulating Parameter

Example: Accumulating reverse function

Key ideas:

Contract:

  (revapp xs ys) = (append (reverse xs) ys)

Laws:

  (revapp '()         ys) == ys
  (revapp (cons z zs) ys) ==
             (revapp zs (cons z ys))

Reversal by accumulating parameters

; laws: (revapp '()         ys) = ys
;       (revapp (cons z zs) ys) =
;                     (revapp zs (cons z ys))

(define revapp (xs ys)
   ; return (append (reverse xs) ys)
   (if (null? xs)
       ys
       (revapp (cdr xs) 
               (cons (car xs) ys))))


(define reverse (xs) (revapp xs '()))

We’ll seee that the cost of this version is linear in the length of the list being reversed.

Parameter ys is the accumulating parameter.

Slide 8 

Slide 9 

Example: revapp '(1 2) '()

Call stack: (each line replaces previous one)

revapp '(1 2) '()    --> 
revapp '(2)   '(1)   -->
revapp '()    '(2 1)

Consider the call revapp xs '() and let n = |xs|.

How many stack frames are allocated?

How many cons cells are allocated?

Check your understanding: Accumulating Parameters

An accumulating parameter to a recursive function f accumulates the answer so that a recursive call to f can be in tail position.

The stack frame of a function call that is not in tail position cannot be shared with subsequent calls because its values may still be needed.

Tail-call optimizations and the use of accumulating parameters can only result in constant-time improvements to the complexity of a function.

Continuations

Reflection: Another construct that transfers control

Are tail calls familiar? What other programming construct

  1. Transfers control to another point in the code?

  2. Uses no stack space?

Tail calls as gotos with arguments

Gotos

Tail calls

Continuations: First-class tail calls

A continuation is code that represents “the rest of the computation.”

Context: an advanced technique, heavily used in event-driven programming:

Different coding styles

Direct style

Continuation-passing style (CPS)

Uses of continuations

Implementation

Check your understanding: Continuation Introduction

A continuation is different from a normal function because it never returns to the context from which it was invoked.

Continuations cannot take arguments.

Continuations are used in GUI frameworks and to mimic exceptions.

Motivating Example: Using continuations to signal exceptions

Reflection: Handling unusual circumstances

Consider the following algebraic laws for a witness function:

  (witness p? xs) == x, where (member x xs),
                        provided (exists? p? xs)

Now answer this questions:

Design Problem: Missing Value

Provide a witness to existence:

 (witness p? xs) == x, where (member x xs),
                       provided (exists? p? xs)

Problem: What if there exists no such x?

Bad ideas:

Good idea: exception (but not available in uScheme)

Slide 16 

Slide 17 

Slide 18 

Slide 19 

Slide 20 

Question: How much stack space is used by the call?

Answer: Constant

Example Use: Instructor Lookup

-> (val 2020f '((Fisher 105)(Monroe 40)(Chow 116)))
-> (instructor-info 'Fisher 2020f)
(Fisher teaches 105)

-> (instructor-info 'Chow 2020f)
(Chow teaches 116)

-> (instructor-info 'Votipka 2020f)
(Votipka is-not-on-the-list)

Slide 22 

Slide 23 

Slide 24 

Slide 25 

Coding Interlude: witness-cps

Consider the following code:

(define witness-cps (p? xs succ fail)
  (if (null? xs)
      (fail)
      (let ([z (car xs)])
        (if (p? z)
            (succ z)
            (witness-cps p? (cdr xs) succ fail)))))


(val 2020f '((Fisher 105)(Monroe 40)(Chow 116)))


(define instructor-info (instructor classes)
  (let (
         [s  (lambda (info)    ; success continuation
                (list3 instructor 'teaches (cadr info)))]
         [f  (lambda ()        ; failure continuation
                (list2 instructor 'is-not-on-the-list))])
        (witness-cps (o ((curry =) instructor) car)
                      classes s f)))

(instructor-info 'Fisher 2020f)
(instructor-info 'Chow 2020f)
(instructor-info 'Votipka 2020f)

Modify the code to add another professor and course number, and add a test case for that professor.

Run your code.

Modify the code to return 'is-not-teaching-this-semester if the instructor’s name is not on the list.

Run your code to confirm it returns your new response when given a name not on the list.

Explain how would you modify the code to search for a course number instead of an instructor name.

7 Oct 2020: Continuations for backtracking

There are PDF slides for 10/8/2020.

Expressions to avoid

(if <p> #t #f)  ; worst

(append (list1 <x>) <ys>) ; most common

(cons <x> (cons <y> '())) ; prefer `list2`

Homework alert

Functions list-of?, formula?

Slide 3 

Reflection: Solving Boolean formulas

On your homework, you will use continuations to organize a search to find satisfying assignments for Boolean formulas.

A formula has one of these forms:

given the following record declarations:

For each formula, find a satisfying assignment or explain why none exists.

(val f1 (make-and (list4 'x 'y 'z (make-not 'x))))
   ; x /\ y /\ z /\ !x

(val f2 (make-not (make-or (list2 'x 'y)))) 
   ; !(x \/ y)

(val f3 (make-or (list2 'x 'y)))
   ; x \/ y

(val f4 (make-not (make-and (list3 'x 'y 'z))))
   ; !(x /\ y /\ z)

Solving Boolean formulas with continuations

A formula is one of these:

given the following record definitions (in book):

Solving Boolean Formulas

(val f1 (make-and (list4 'x 'y 'z (make-not 'x))))
   ; x /\ y /\ z /\ !x  

Answer: NONE

Solving Boolean Formulas

(val f2 (make-not (make-or (list2 'x 'y)))) 
   ; !(x \/ y)          

Answer:

{ x |-> #f, y |-> #f }

Solving Boolean Formulas

(val f3 (make-or (list2 'x 'y)))
   ; x \/ y 

Answer: Three solutions!

{ x |-> #t, ... }
{ y |-> #t, ... }

Solving Boolean Formulas

(val f4 (make-not (make-and (list3 'x 'y 'z))))
   ; !(x /\ y /\ z)     

Answer: Many solutions
(all 7 ways the variables can not all be #t):

{ x |-> #f, ... }, ...

Searching via continuations: An example

Finding a satisfying assignment

Example formula:

  (x \/ y) /\ (!x /\ !z)

Slide 10 

Slide 11 

the name of the literal in the box.

Solving a Literal

start carries a partial truth assignment to variables current

Box describes how to extend current to make a variable, say x, true.

Case 1: current(x) = #t

Call success continuation with current

Pass fail as resume continuation (argument to success)

Case 2: current(x) = #f

Call fail continuation

Case 3: x not in current

Call success continuation with current{x -> #t}

Pass fail as resume continuation

Solving a literal

; (satisfy-literal-true x current succ fail) =
;   (succ current fail), when x is bound to #t in cur
;   (fail),              when x is bound to #f in cur
;   (succ (bind x #t current) fail), x unbound in cur
(define satisfy-literal-true (x current succ fail)
  (if (bound? x current)
      (if (find x current)
          (succ current fail)
          (fail))
      (succ (bind x #t current) fail)))

Check your understanding: Solving a Negated Literal

Consider the schema we’ve been using to find satisfying assignments for Boolean formulas. In the previous video, we walked through how to instantiate the schema for a positive literal. The following questions ask you to do the same thing for a negated literal, which is a term of the form not x where x is a literal.

The start continuation is called with argument current, the partial truth assignment constructed so far. The box needs to describe how to extend current to make the negated variable not x true.

If current(x) = #t, what should the box do?

If current(x) = #f, what should the box do?

If x is not in current, what should the box do?

The success continuation takes two arguments: current (the partial solution found so far) and resume (the continuation that takes over if current is found to be inconsistent with some new part of the formula. When calling the success continuation, what should we pass as the resume continuation?

Please explain your answer.

Slide 13 

Solving A and B

  1. Solver enters A

  2. If A is solved, newly allocated success continuation starts B

  3. If B succeeds, we’re done! Use success continuation from context.

  4. If B fails, use resume continuation A passed to B as fail.

  5. If A fails, the whole thing fails. Use fail continuation from context.

Solving A or B

  1. Solver enters A

  2. If A is solved, we’re good! But what if context doesn’t like solution? It can resume A using the resume continuation passed out as fail.

  3. If A can’t be solved, don’t give up! Try a newly allocated failure continuation to start B.

  4. If ever B is started, we’ve given up on A entirely. So B’s success and failure continuations are exactly the ones in the context.

  5. If B succeeds, but the context doesn’t like the answer, the context can resume B.

  6. If B fails, abject failure all around; call the original fail continuation.

Lisp and Scheme Retrospective

Slide 14 

Five powerful questions

  1. What are the values?
    What do expressions/terms evaluate to?

  2. What environments are there?
    What can names stand for?

  3. What is the abstract syntax?
    Syntactic categories? Terms in each category?

  4. How are terms evaluated?
    Judgments? Evaluation rules?

  5. What’s in the initial basis?
    Primitives and predefined, what is built in?

Slide 16 

Next steps

Before tomorrow’s lecture,

Consider good and bad points of Scheme

(Informed by your experience)

Common Lisp, Scheme

Advantages:

Down sides:

Bottom line: it’s all about lambda

Next time: think about good and bad points

Bonus content: Scheme as it really is

  1. Macros!
  2. Cond expressions (solve nesting problem)
  3. Mutation

Macros!

Full Scheme: Macros

A Scheme program is just another S-expression

(See book sections 2.16, 2.17.4)

Conditional expressions

Full Scheme: Conditionals

(cond [c1 e1]    ; if c1 then e1
      [c2 e2]    ; else if c2 then e2
       ...            ...
      [cn en])   ; else if cn then en

; Syntactic sugar---'if' is a macro:
(if e1 e2 e3) == (cond [e1 e2]
                       [#t e3])

Mutation

Full Scheme: Mutation

Not only variables can be mutated.

Mutate heap-allocated cons cell:

(set-car! '(a b c) 'd)  => (d b c)

Circular lists, sharing, avoids allocation

12 Oct 2020: Introduction to ML

There are PDF slides for 10/13/2020.

Retrospective: Continuations

Write one thing you learned about continuations. Your answers may be anonymously shared with your classmates.

Reflection: μScheme evaluation

Evaluating an idea or an artifact is an advanced cognitive task:

  1. Remember
  2. Understand
  3. Apply
  4. Analyze
  5. Evaluate
  6. Create

Given what you have learned so far in this course, you are in a good position to start to evaluate μScheme.

Please enter at least two strengths of μScheme:

Please enter at least two weaknesses of μScheme.

Your answers may be anonymously shared with your classmates.

The language that we will study next, ML, addresses some of the weaknesses of μScheme.

Introduction to ML

Meta: Apply your new knowledge in Standard ML:

ML Overview

Three new ideas

ML = uScheme + pattern matching + static types + exceptions

  1. Pattern matching is big and important. You will like it. It’s “coding with algebraic laws”
  2. Static types get two to three weeks in their own right.
  3. Exceptions are easy

Pattern matching makes code look more like algebraic laws: one pattern for each case

Static types tell us at compile time what the cases are.

Exceptions solve the problem “I can’t return anything sensible!”

And lots of new concrete syntax!

Example: The length function.

Coding Interlude: length

Use the length function

   fun length []      = 0
     | length (x::xs) = 1 + length xs

to calculate the length of the list [1,2,3,4].

Now write the sum function in ML that calculates the sum of the same list.

μScheme to ML Rosetta stone

uScheme                    SML

 (cons x xs)             x :: xs

 '()                     []
 '()                     nil

 (lambda (x) e)          fn x => e

 (lambda (x y z) e)      fn (x, y, z) => e

 ||  &&                  andalso    orelse


 (let* ([x e1]) e2)      let val x = e1 in e2 end

 (let* ([x1 e1]          let val x1 = e1
        [x2 e2]              val x2 = e2
        [x3 e3]) e)          val x3 = e3
                         in  e
                         end

Check your understanding: ML Syntax

Match the μScheme syntax with its ML equivalent.

μScheme:

 &&                  

 (lambda (x y z) e)      

 '()                     

 ||  

 (lambda (x) e)          

 (let* ([x e1]) e2)      

 (cons x xs)             

 '()                     


 (let* ([x1 e1]          
        [x2 e2]          
        [x3 e3]) e)      

ML

 fn (x, y, z) => e

 nil

 fn x => e

 x :: xs

 andalso    

 []

 false

 let val x = e1 in e2 end

 true   

 let val x1 = e1
     val x2 = e2
     val x3 = e3
 in  e
 end

 orelse

ML forms of data

Live coding:

(* ML's comment form *)
bool (* true, false *)
string (* "Tufts" *)
char (* #"A" *)
int  (* including negative integers, eg ~12 *)
list of int  [1,2,3]
list of bool [true, false]
ill-typed list [1, true]
pair  (1, "Halligan")
triple  (105, "Scheme", true)

Check your understanding: Simple ML Types

Which of the following is not syntactically correct ML code?

ML Functions

(* Can't use ? in the names of functions in ML *)
fun even x = (x mod 2 = 0)
fun map f []        = []
  | map f (y :: ys) = f y  :: map f ys

(* Anonymous functions are more verbose in ML *)
val res1 = map (fn (x) => (x + 1)) [1,2,3]
(* ML also has a case statement;
   It's not as elegant as clausal definitions. *)
(* orelse shortcircuits *)
fun exists p xs = 
    case xs 
      of [] => false
       | y :: ys => p y orelse exists p ys


val res2 = exists even [1,2,3,4]
(* Higher order functions exist in ML too *)
(* Notation (op +) converts infix to prefix *)
fun sum xs = foldr (op +) 0 [1,2,3,4]

Check your understanding: ML Functions

Which of the following functions correctly implements the filter function, defined by the following algebraic laws? (Note, all of the functions are syntatically correct.)

    filter p [] = []
    filter p (y :: ys) = 
        if p y then y :: filter p ys
               else filter p ys
fun filter p xs =
  case xs
    of [] => []
     | y :: ys => 
          if p y then y :: filter p ys
                 else filter p ys 
fun filter p [] = []
  | filter p (y :: ys) = 
       if p y then y :: filter p ys
             else filter p ys
fun filter p xs = 
     foldr (fn(x,a) => 
              if p x then (x::a) else a) 
           [] xs

All of the above.

New Forms of Data

Example: Binary Tree of Integers

(* LEAF and NODE are called data type constructors *)
datatype itree = LEAF
               | NODE of itree * int * itree


(* Data type constructors build data structures*)
(* and can be used in pattern matching *)
fun insert n LEAF                    = 
        NODE (LEAF, n, LEAF)
  | insert n (NODE (left, m, right)) =
        if n < m then
       NODE (insert n left, m, right)
        else if n > m then
       NODE (left, m, insert n right)
    else
       NODE (left, m, right)


(* insert is curried; 
   we might also want an uncurried version of insert *)
fun insert' (n, tree) = insert n tree

(* @ is ML's notation for appending two lists *)
fun elements LEAF = []
  | elements (NODE (left, n, right)) = 
      elements left @ (n :: elements right)

(* o is ML's composition operation *)
val isort = elements o foldl insert' LEAF
  (* uses binary-search tree for sorting *)

val result = isort [4,2,7,3,4,2,8,4,8,9];

Check your understanding: ML Data Type Introduction

Consider the itree datatype declaration:

datatype itree = LEAF
               | NODE of itree * int * itree

and the function elements

fun elements LEAF = []
  | elements (NODE (left, n, right)) = 
      elements left @ (n :: elements right)

If the elements function were applied to each of the following itree data structures, which would return the list [1,0,5]?

Exceptions in ML

Example: Take from a list

(* Note use of exceptions. *)
(* And ; to sequence imperative operations *)
exception TooShort
fun take 0 _       = []  (* wildcard! *)
  | take n []      = raise TooShort
  | take n (x::xs) = x :: (take (n-1) xs)

val res5 = take 2 [1,2,3,4]
val res6 = take 3 [1]
           handle TooShort =>
             (print "List too short!"; [])

Check your understanding: ML Exceptions

Recall the code for the take function:

exception TooShort
fun take 0 _       = []  (* wildcard! *)
  | take n []      = raise TooShort
  | take n (x::xs) = x :: (take (n-1) xs)

and some code that exercises it:

val res5 = take 2 [1,2,3,4]
val res6 = take 3 [1]
           handle TooShort =>
             (print "List too short!"; [])

Consider the partially-written function find which satisfies the following algebraic laws:

find p [] = raise NotFound
find p (x::xs) = x          if p x
                 find p xs  otherwise  

and some code that exercises it:

exception Blank1
fun find p [] = Blank2 NotFound
  | find p (x::xs) = 
       if p x then x 
              else find p xs
val res = find even [1,3,5] 
            Blank3 NotFound => Blank4

Fill in the blanks to correctly complete the find function and return an answer of 0?

ML—Five Questions

Values: num/string/bool, constructed data

Environments: names stand for values (and types)

Syntax: definitions, expressions, patterns, types

Evaluation: uScheme + case and pattern matching

Initial Basis: medium size; emphasizes lists

(Question Six: type system—a coming attraction)

in a course-specific video or in some other medium.

A note about books

Ullman is easy to digest

Ullman costs money but saves time

Ullman is clueless about good style

Suggestion:

Details in course guide Learning Standard ML

Length

fun length []      = 0
  | length (x::xs) = 1 + length xs

val res = length [1,2,3]

Map

fun map f []      = []
  | map f (x::xs) = (f x) :: (map f xs)

val res1 =
  map length [[], [1], [1,2], [1,2,3]]

Map, without redundant parentheses

fun map f []      = []
  | map f (x::xs) =  f x  ::  map f xs

val res1 =
  map length [[], [1], [1,2], [1,2,3]]

Filter

fun filter pred []      = [] 
  | filter pred (x::xs) =  (* no 'pred?' *)
      let val rest = filter pred xs
      in  if pred x then
            (x :: rest) 
          else
            rest
      end

val res2 =
  filter (fn x => (x mod 2) = 0) [1,2,3,4]

Filter, without redundant parentheses

fun filter pred []      = [] 
  | filter pred (x::xs) =  (* no 'pred?' *)
      let val rest = filter pred xs
      in  if pred x then
             x :: rest
          else
            rest
      end

val res2 =
  filter (fn x => (x mod 2) = 0) [1,2,3,4]

Exists

fun exists pred []      = false
  | exists pred (x::xs) = 
      (pred x) orelse (exists pred xs)

val res3 =
  exists (fn x => (x mod 2) = 1) [1,2,3,4]
(* Note: fn x => e is syntax for lambda *)

Exists, without redundant parentheses

fun exists pred []      = false
  | exists pred (x::xs) = 
       pred x  orelse  exists pred xs

val res3 =
  exists (fn x => (x mod 2) = 1) [1,2,3,4]
(* Note: fn x => e is syntax for lambda *)

All

fun all pred []      = true
  | all pred (x::xs) =
      (pred x) andalso (all pred xs)

val res4 = all (fn x => (x >= 0)) [1,2,3,4]

All, without redundant parentheses

fun all pred []      = true
  | all pred (x::xs) =
       pred x  andalso  all pred xs

val res4 = all (fn x => (x >= 0)) [1,2,3,4]

Take

exception TooShort
fun take 0 _       = []  (* wildcard! *)
  | take n []      = raise TooShort
  | take n (x::xs) = x :: (take (n-1) xs)

val res5 = take 2 [1,2,3,4]
val res6 = take 3 [1]
           handle TooShort =>
             (print "List too short!"; [])

(* Note use of exceptions. *)

Take, without redundant parentheses

exception TooShort
fun take 0 _       = []  (* wildcard! *)
  | take n []      = raise TooShort
  | take n (x::xs) = x ::  take (n-1) xs

val res5 = take 2 [1,2,3,4]
val res6 = take 3 [1]
           handle TooShort =>
             (print "List too short!"; [])

(* Note use of exceptions. *)

Drop

fun drop 0 zs      = zs
  | drop n []      = raise TooShort
  | drop n (x::xs) = drop (n-1) xs

val res7 = drop 2 [1,2,3,4]
val res8 = drop 3 [1]
           handle TooShort =>
             (print "List too short!"; [])

Takewhile

fun takewhile p [] = []
  | takewhile p (x::xs) = 
      if p x then (x :: (takewhile p xs))
      else []

fun even x = (x mod 2 = 0)
val res8 = takewhile even [2,4,5,7]
val res9 = takewhile even [3,4,6,8]

Takewhile, without redundant parentheses

fun takewhile p [] = []
  | takewhile p (x::xs) = 
      if p x then  x ::  takewhile p xs
      else []

fun even x = (x mod 2 = 0)
val res8 = takewhile even [2,4,5,7]
val res9 = takewhile even [3,4,6,8]

Dropwhile

fun dropwhile p []              = []
  | dropwhile p (zs as (x::xs)) = 
      if p x then (dropwhile p xs) else zs
val res10 = dropwhile even [2,4,5,7]
val res11 = dropwhile even [3,4,6,8]

(* fancy pattern form: zs as (x::xs) *)

Dropwhile, without redundant parentheses

fun dropwhile p []              = []
  | dropwhile p (zs as (x::xs)) = 
      if p x then  dropwhile p xs  else zs
val res10 = dropwhile even [2,4,5,7]
val res11 = dropwhile even [3,4,6,8]

(* fancy pattern form: zs as (x::xs) *)

Folds

fun foldr p zero []      = zero
  | foldr p zero (x::xs) = p (x, (foldr p zero xs))

fun foldl p zero []      = zero
  | foldl p zero (x::xs) = foldl p (p (x, zero)) xs


val res12 = foldr (op +)  0 [1,2,3,4] 
val res13 = foldl (op * ) 1 [1,2,3,4] 

(* Note 'op' to use infix operator as a value *)

Folds, without redundant parentheses

fun foldr p zero []      = zero
  | foldr p zero (x::xs) = p (x,  foldr p zero xs )

fun foldl p zero []      = zero
  | foldl p zero (x::xs) = foldl p (p (x, zero)) xs


val res12 = foldr (op +)  0 [1,2,3,4] 
val res13 = foldl (op * ) 1 [1,2,3,4] 

(* Note 'op' to use infix operator as a value *)

14 Oct 2020: Programming with constructed data and types

There are PDF slides for 10/15/2020.

Handout: Learning ML

be two text answer boxes, one for the C/scheme answer and one for the ML answer.

Reflection: Defining new kinds of data

Suppose you were going to program a system to play card games in which suits (CLUBS, DIAMONDS, HEARTS, and SPADES) matter, such as poker, solitaire, bridge, or gin rummy. Briefly describe how you would represent the suits in C and in μScheme.

Now consider ML’s datatype mechanism that we used to define the itree type. Briefly describe how you could use ML’s datatype mechanism to represent suits.

Programming with Types

Pain points from Scheme

Unsolved:

Solved:

Programming with Types

Three tasks:

Foundation: Data

Syntax is always the presenting complaint, but data is what’s always important

“Language support for forms of data”

“Distinguish one cons cell (or one record) from another”

Lightweight constructed data: Tuples

Example (splitList from merge sort):

    let val (left, right) = splitList xs
        in  merge (sort left, sort right)
        end

Enumerated types

Datatypes can define an enumerated type and associated values.

datatype suit = HEARTS | DIAMONDS | SPADES | CLUBS

Here suit is the name of a new type.

The value constructors HEARTS, DIAMONDS, SPADES, and CLUBS are the values of type suit.

Value constructors are separated by vertical bars.

Pattern matching

Datatypes are deconstructed using pattern matching.

fun toString HEARTS   = "hearts"
  | toString DIAMONDS = "diamonds"
  | toString SPADES   = "spades"
  | toString CLUBS    = "clubs"

val suitName = toString HEARTS

Check your understanding: Basic Datatypes

Consider the following partially written isRed function that is supposed to determine if a given suit is one of the two red suits (ie, is either a heart or a diamond).

fun isRed  HEART  = BLANK1
  | BLANK2 BLANK3 = true
  | isRed  _      = false

But wait, there’s more: Value constructors can take arguments!

datatype int_tree = 
    LEAF | NODE of int * int_tree * int_tree

int_tree is the name of a new type.

There are two data constructors: LEAF and NODE.

NODEs take a tuple of three arguments: a value at the node, and left and right subtrees.

The keyword of separates the name of the data constructor and the type of its argument.

When fully applied, data constructors have the type of the defining datatype (ie, int_tree).

It’s recursive!

Building values with constructors

We build values of type int_tree using the associated constructors:

 val tempty = LEAF
 val t1 = NODE (1, tempty, tempty)
 val t2 = NODE (2, t1, t1)
 val t3 = NODE (3, t2, t2)

What is the in-order traversal of t3?

 [1,2,1,3,1,2,1]

What is the pre-order traversal of t3?

 [3,2,1,1,2,1,1]

Deconstruct values with pattern matching

(The @ symbol denotes append in ML)

fun inOrder LEAF = []
  | inOrder (NODE (v, left, right)) = 
       inOrder left @ [v] @ inOrder right

val il3 = inOrder t3



fun preOrder LEAF = []
  | preOrder (NODE (v, left, right)) = 
       v :: preOrder left @ preOrder right

val pl3 = preOrder t3

Reflection: How general is the code?

Consider the int_tree datatype and the code we’ve written to manipulate it:

   datatype int_tree = 
      LEAF | NODE of int * int_tree * int_tree

   fun inOrder LEAF = []
      | inOrder (NODE (v, left, right)) = 
           inOrder left @ [v] @ inOrder right

    fun preOrder LEAF = []
      | preOrder (NODE (v, left, right)) = 
           v :: preOrder left @ preOrder right

Discuss the extent to which this code requires the value stored at each node to be an integer. Type errors aside, would the code still work if we stored a different kind of value in the tree?

But wait, there’s still more: Polymorphic datatypes!

int_tree is monomorphic because it has a single type.

Note though that the inOrder and preOrder functions only cared about the structure of the tree, not the payload value at each node.

Polymorphic datatypes are written using type variables that can be instantiated with any type.

datatype 'a tree = 
   CHILD | PARENT of 'a * 'a tree * 'a tree

tree is a type constructor (written in post-fix notation), which means it produces a type when applied to a type argument.

Examples:

'a is a type variable: it can represent any type.

It is introduced on the left-hand of the = sign. References on the right-hand side are types.

CHILD and PARENT are value constructors.

CHILD takes no arguments, and so has type 'a tree

When given a value of type 'a and two 'a trees, PARENT produces a 'a tree

Constructors build tree values

val empty = CHILD
val tint1 = PARENT (1, empty, empty)
val tint2 = PARENT (2, tint1, tint1)
val tint3 = PARENT (3, tint2, tint2)

val tstr1 = PARENT ("a", empty, empty)
val tstr2 = PARENT ("b", tstr1, tstr1)
val tstr3 = PARENT ("c", tstr2, tstr2)

Pattern matching deconstructs tree values

fun inOrder CHILD = []
  | inOrder (PARENT (v, left, right)) = 
       (inOrder left) @ [v] @ (inOrder right)

fun preOrder CHILD = []
  | preOrder (Parent (v, left, right)) = 
       v :: (preOrder left) @ (preOrder right)

Functions inOrder and preOrder are polymorphic: they work on any value of type 'a tree. 'a is a type variable and can be replaced with any type.

Environments

Datatype definitions introduce names into:

  1. the type environment: suit, int_tree, tree

  2. the value environment: HEART, LEAF, PARENT

Inductive

Datatype definitions inherently inductive:

Check your understanding: Defining Datatypes

We can define a subset of the S-expressions in Scheme using the following specification:

An SX1 is either an ATOM or a list of SX1s, where ATOM is represented by the ML type atom.

Complete the encoding of this definition by filling in the blanks in the following ML datatype:

datatype sx1 = ATOM of BLANK1
             | LIST of BLANK2 list

Another way of defining a subset of the S-expressions in Scheme uses a different specification:

An SX2 is either an ATOM or the result of consing together two values v1 and v2, where both v1 and v2 are themselves members of the set SX2.

Complete the encoding of this definition by filling in the blanks in the following ML datatype:

datatype sx2 = ATOM of BLANK1
             | PAIR of BLANK2 BLANK3 BLANK4

Coding interlude: Defining a datatype

Designing a datatype is a three step process:

  1. For each form, choose a value constructor
  2. Identify the “parts” type that each constructor is of
  3. Write the datatype definition

Another definition of a Scheme S-expression is that it is one of:

Define an ML datatype sx that encodes this version of an S-expression.

Making types work for you

Type-directed programming

Common idea in functional programming: lifting

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

The type of a function tells us a lot about how to write it—the type often narrows the code down to a very few sensible choices. This technique works especially well with polymorphic functions.

Here’s the problem we’re going to solve: we have a predicate on single elements, and we want a predicate on lists of elements:

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

And here is the main tool we are going to use: for every type, we are going to be aware of how to make a value of the type (its introduction form) and what we can do with a value of the type (its elimination form).

Let’s review for the types we have:

Table:

     Type                 Intro                  Elim
     
      ... -> ...           fn                     application
      
      bool                true, false             if, pattern matching (case/fun)
   
      ... list            [], ::                  pattern matching (case/fun)


      'a                    ---                    ---

Let’s get started: our goal is to make a function of type

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

We can always use fun

fun lift p = _goal1

Q: What is the type of p? How do we know?

Q: What is the type of _goal1? How do we know?

Answers:

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) = (_goal1 : 'a list -> bool)

Our new goal is a term of type 'a list -> bool. Which of these forms of expression is always safe and makes progress toward a goal of this type?

  1. A recursive call
  2. A lambda expression of the form (fn ... => ...)
  3. A let expression

Our goal is an arrow type, and it is always safe to make a value of arrow type using lambda:

_goal1 = (fn xs => _goal2)

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) = (fn xs => _goal2)

Q: What is the type of xs? How do we know?

Q: What is the type of _goal2? How do we know?

Answers:

_goal1 = (fn xs => _goal2)

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) = (fn (xs : 'a list) => (_goal2 : bool))

Things we have

lift : ('a -> bool) -> ('a list -> bool)   [it could call itself recursively]
p : 'a -> bool
xs : 'a list

Now we have a goal of type bool and we want to use the parameter xs of type 'a list. What is the safest tactic to make progress using xs?

  1. Break it down by cases, that is, ask whether it is nil or cons
  2. Pass it to foldl or foldr
  3. Pass it to lift p

Now we have a new goal of type bool. And we know two ways to make a value of type bool: we can return true or false. But if we choose true or false, we will have defined a function that is not very interesting, and we will have completely ignored all the parameters to lift. That is probably not a great design choice.

Is there some other way to make a bool? Some way to use what we have?

Next take:

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) =
  fn (xs : 'a list) =>
     case xs
       of [ ] => _goal3 : bool
        | y :: ys => _goal4 : bool

Now instead of one hole we have two holes, each of type bool. But we have used xs, so that’s progress. What’s next? Let’s look at the first hole.

This first hole still has type bool. And we now know that xs is empty, so we have used all the information available in xs, but we still haven’t used p. Which of the following tactics is viable for fulfilling our goal of type bool? Mark all that apply.

  1. Look for a value of type 'a to apply p to
  2. Give up on p and fill the hole with true
  3. Give up on p and fill the hole with false

Let’s flip a coin. My coin comes up true. So now we have

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) =
  fn (xs : 'a list) =>
     case xs
       of [ ] => true
        | y :: ys => _goal4 : bool

Our final goal has type bool, but now we have more Booleans than we know what to do with: both p y and lift p ys. What answer best describes what we should do?

  1. Ignore p y and fill the hole with lift p ys
  2. Ignore lift p ys and fill the hole with p y
  3. Combine the two Booleans using andalso
  4. Combine the two Booleans using orelse
  5. Two Booleans can be combined in 16 different ways, and they are all equally good.

We have to fill that last hole. And we have new variables y and ys. What are their types? How do we know?

y : 'a
ys : 'a list

Add them to everything we have and it’s now

lift : ('a -> bool) -> ('a list -> bool)
p : 'a -> bool
y : 'a
ys : 'a list

How can we use y and ys?

So now these are the things we have:

lift : ('a -> bool) -> ('a list -> bool)
p : 'a -> bool
y : 'a
ys : 'a list

p y : bool
lift p ys : bool

We have just one hole to fill, but we have two terms of type bool. If we want to use all our information, both y and ys, we’d better combine them.

Q: What are common ways to combine two Booleans?

A: Two good ways are andalso and orelse.

Q: Which one do you like here?

A: Let’s think about both choices:

Q: Nice. And here it is:

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) =
  fn (xs : 'a list) =>
     case xs
       of [ ] => true
        | y :: ys => p y andalso lift p ys

Or if we pretty it up a little,

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift p xs = 
     case xs
       of [ ] => true
        | y :: ys => p y andalso lift p ys

or even

fun lift p []        = true
  | lift p (y :: ys) = p y andalso lift p ys

Making types work for you: The false case

Q: Suppose back at the empty-list step, the coin had come down differently and we had used true there. What would we want to do then?

A: I know this one! We combine two Booleans with orelse instead of andalso! That’s the exists function!

Q: Indeed:

fun lift' p []        = false
  | lift' p (y :: ys) = p y orelse lift p ys

Potential bonus content: inorder traversal of binary tree.

val find : forall 'a . ('a -> bool) -> ('a list -> 'a option)

Bonus content

The rest of this slide deck is “bonus content”

New vocabulary for ML

Data:

Code:

Types:

Structure of algebraic types

An algebraic data type is a collection of alternatives

The thing named is the value constructor

(Also called “datatype constructor”)

Additional language support for algebraic types: case expressions

“Eliminate” values of algebraic types

New language construct case (an expression)

fun length xs =
  case xs
    of []      => 0
     | (x::xs) => 1 + length xs

Clausal definition is preferred
(sugar for val rec, fn, case)

works for any datatype

 fun toStr t = 
     case t 
       of EHEAP => "empty heap"
        | HEAP (v, left, right) =>
                   "nonempty heap"

But often a clausal definition is better style:

 fun toStr' EHEAP = "empty heap"
   | toStr' (HEAP (v,left,right)) =
                    "nonempty heap"

Bonus content: Exceptions — Handling unusual circumstances

Syntax:

Informal Semantics:

Exception handling in action

loop (evaldef (reader (), rho, echo))
handle EOF            => finish ()
  | Div               => continue "Division by zero"
  | Overflow          => continue "Arith overflow"
  | RuntimeError msg  => continue ("error: " ^ msg)
  | IO.Io {name, ...} => continue ("I/O error: " ^
                                   name)
  | SyntaxError msg   => continue ("error: " ^ msg)
  | NotFound n        => continue (n ^ "not found")

Bonus Content: A different explanation of ML Datatypes

Tidbits:

Notes:

Datatype definitions

datatype  suit    = HEARTS | DIAMONDS | CLUBS | SPADES

datatype 'a list  = nil           (* copy me NOT! *)
                  | op :: of 'a * 'a list

datatype 'a heap  = EHEAP
                  | HEAP of 'a * 'a heap * 'a heap

type suit          val HEARTS : suit, ...
type 'a list       val nil   : forall 'a . 'a list
                   val op :: : forall 'a . 
                               'a * 'a list -> 'a list
type 'a heap
val EHEAP: forall 'a.                          'a heap
val HEAP : forall 'a.'a * 'a heap * 'a heap -> 'a heap

Exegesis (on board):

Bonus Content: ML traps and pitfalls

Slide 10 

Order of clauses matters


fun take n (x::xs) = x :: take (n-1) xs
  | take 0 xs      = []
  | take n []      = []

(* what goes wrong? *)

Gotcha — overloading

- fun plus x y = x + y;
> val plus = fn : int -> int -> int
- fun plus x y = x + y : real;
> val plus = fn : real -> real -> real

Slide 13 

Gotcha — parentheses

Put parentheses around anything with |

case, handle, fn

Function application has higher precedence than any infix operator

Bonus content (seen in examples)

Syntactic sugar for lists

Syntactic sugar for lists

- 1 :: 2 :: 3 :: 4 :: nil; (* :: associates to the right *)
> val it = [1, 2, 3, 4] : int list

- "the" :: "ML" :: "follies" :: [];
> val it = ["the", "ML", "follies"] : string list

> concat it;
val it = "theMLfollies" : string

Bonus content: ML from 10,000 feet

Slide 16 

Environments

The value environment

Names bound to immutable values

Immutable ref and array values point to mutable locations

ML has no binding-changing assignment

Definitions add new bindings (hide old ones):

val pattern = exp
val rec pattern = exp
fun ident patterns = exp
datatype … = …

Nesting environments

At top level, definitions

Definitions contain expressions:

def ::= val pattern = exp

Expressions contain definitions:

exp ::= let defs in exp end

Sequence of defs has let-star semantics

Patterns

What is a pattern?

pattern ::= variable
          | wildcard
          | value-constructor [pattern]
          | tuple-pattern
          | record-pattern
          | integer-literal
          | list-pattern

Design bug: no lexical distinction between

Workaround: programming convention

Functions

Function pecularities: 1 argument

Each function takes 1 argument, returns 1 result

For “multiple arguments,” use tuples!

 fun factorial n =
   let fun f (i, prod) = 
         if i > n then prod else f (i+1, i*prod)
   in  f (1, 1)
   end


 fun factorial n =  (* you can also Curry *)
   let fun f i prod = 
         if i > n then prod else f (i+1) (i*prod)
   in  f 1 1
   end

Tuples are “usual and customary.”

Slide 21 

Types

Slide 22 

Slide 23 

Slide 24 

Slide 25 

19 Oct 2020: Types

There are PDF slides for 10/20/2020.

Retrospective: Evaluating ML

You are now in a position to start to evaluate ML.

Please enter at least one strength of ML.

Please enter at least one weakness of ML.

Your answers may be anonymously shared with your classmates.

Reflection: Catching errors with types

We have seen that one of the key differences between μScheme and ML is that ML has a static type system. One of the key roles of a type system is to detect errors. List at least two kinds of errors that type systems can detect at compile time (Hint: think about the kinds of type errors the ML compiler complains about).

Type systems aren’t omniscient, though. Now list at least one kind of error that ML’s type system can’t find and do your best to explain why it can’t.

Type systems

What they do:

How they work

World’s most widely deployed static analysis

Where type systems are found:

Our trajectory:

Monomorphic types systems are the past.
Inference and polymorphism are the present and future.
(Visible trend just like lambda.)

This “lecture”:

Next time:

Type systems

What kind of thing is it?

``Types classify terms’’

“Term” is theory word for “syntax”:

n + 1  : int

"hello" ^ "world"  : string

(fn n => n * (n - 1))  : int -> int

if p then 1 else 0 : int, provided  p : bool

Questions type systems can answer:

Questions type systems typically do not answer:

Decidability and Type Checking

Suppose L is a Turing-Complete Language.

P is the set of programs in L that terminate.

Wish: a type system to statically classify terminating programs:

Expression e in L has type T (e : T) iff e terminates.

But: Undecideable!

We can prove no such type system exists. (See Comp170 for the Halting Problem)

Choices:

Static vs. Dynamic Type Checking

Most languages use a combination of static and dynamic checks

Static: “for all inputs”

Dynamic: “for some inputs”

Check your understanding: Static vs. Dynamic Type Systems

Type systems can statically detect all errors.

Static type systems generally use a combiniation of static and dynamic checks.

Dynamic type systems impose runtime overhead.

Type System and Checker for a Simple Language

Why do we write a type checker?

Three questions any type system must answer

  1. What types do I have?

  2. Given a type, how do I create a value

  3. What can I do with a value (of that type)?

Simple Type System Example

Simple language of machine-level expressions

What types do I have?

Type-system example (machine model)

How do I create a word?

What can I do with a word?

How do I create a flag?

What can I do with a flag?

Language of expressions to type

Words and flags:

datatype exp = ARITH of arithop * exp * exp
             | CMP   of relop   * exp * exp
             | LIT   of int
             | IF    of exp     * exp * exp
and      arithop = PLUS | MINUS | TIMES | ...
and      relop   = EQ | NE | LT | LE | GT | GE

datatype ty = WORDTY | FLAGTY

(Looks a lot like int and bool)

Examples to rule out

Can’t add a word and a flag:

3 + (3 < 99)

(ARITH(PLUS, LIT 3, CMP (LT, LIT 3, LIT 99)))

Can’t compare a word and a flag

(3 < (4 = 24))

CMP (LT, LIT 3, CMP(EQ (LIT 4, LIT 24)))

Check your understanding: AST Representation of Simple Language

Recall the AST for the simple machine language:

datatype exp = ARITH of arithop * exp * exp
             | CMP   of relop   * exp * exp
             | LIT   of int
             | IF    of exp     * exp * exp
and      arithop = PLUS | MINUS | TIMES | ...
and      relop   = EQ | NE | LT | LE | GT | GE

datatype ty = WORDTY | FLAGTY

The AST term

CMP(GT, (CMP(EQ, LIT 105, LIT 170)), LIT 160)

which encodes the expression

((105 = 170) > 160)

should be flagged as a type error.

Inference rules to define a type system

Rule for arithmetic operators

Informal example:

|- 3 : word    |- 5 : word 
-------------------------
|- 3 + 5 : word

Rules out:

|- 'a' : char    |- 5 : word
---------------------------
|- 'a' + 5 : ???

General form:

|- e1 : word    |- e2 : word
-----------------------------
|- ARITH ( _ , e1, e2) : word

Rule for comparisons

Informal example:

|- 7 : word    |- 10 : word
-----------------------------
|- 7 < 10 : flag

General form:

|- e1 : word    |- e2 : word
-----------------------------
|- CMP ( _ , e1, e2) : flag

Rule for literals

Informal example:

|- 14 : word

General form:

--------------------
|- LIT (n) : word

Rule for conditionals:

Informal example:

|- true : flag    
|- 3    : word
|- 42   : word      
--------------------------
|- IF (true, 3, 42) : word

General form:

|- e : flag    
|- e1 : tau1   
|- e2 : tau2      tau1 equiv tau2
-----------------------------------
|- IF ( e, e1, e2) : tau1

Typing rules let us read off what a type checker needs to do.

Type checker in ML

val typeof : exp -> ty
exception IllTyped
fun typeof (ARITH (_, e1, e2)) = 
      case (typeof e1, typeof e2) 
        of (WORDTY, WORDTY) => WORDTY
         | _              => raise IllTyped
  | typeof (CMP (_, e1, e2)) = 
      case (typeof e1, typeof e2) 
        of (WORDTY, WORDTY) => FLAGTY
         | _              => raise IllTyped
  | typeof (LIT _) = WORDTY
  | typeof (IF (e,e1,e2)) = 
      case (typeof e, typeof e1, typeof e2) 
        of (FLAGTY, tau1, tau2) => 
           if eqType (tau1, tau2) 
           then tau1 else raise IllTyped
         | _                    => raise IllTyped

Slide 7 

Check your understanding: A type checker from inference rules

Recall the AST for the simple machine language:

datatype exp = ARITH of arithop * exp * exp
             | CMP   of relop   * exp * exp
             | LIT   of int
             | IF    of exp     * exp * exp
and      arithop = PLUS | MINUS | TIMES | ...
and      relop   = EQ | NE | LT | LE | GT | GE

datatype ty = WORDTY | FLAGTY

and the inference rule for comparisons:

    |- e1 : word    |- e2 : word
    -----------------------------
    |- CMP ( _ , e1, e2) : flag

and our in-progress type checking code typeof

    val typeof : exp -> ty
    exception IllTyped
    fun typeof (ARITH (_, e1, e2)) = 
          case (typeof e1, typeof e2) 
            of (WORDTY, WORDTY) => WORDTY
             | _              => raise IllTyped
      | typeof (CMP (_, e1, e2)) = 
          case (BLANK1, BLANK2) 
            of (BLANK3, BLANK4) => BLANK5
             | _              => raise IllTyped
      | typeof (LIT _) = WORDTY
      | typeof (IF (e,e1,e2)) = 
          case (typeof e, typeof e1, typeof e2) 
            of (FLAGTY, tau1, tau2) => 
               if eqType (tau1, tau2) 
               then tau1 else raise IllTyped
             | _                    => raise IllTyped

Fill in the blanks based on the comparison inference rule to complete the code for the CMP case.

An implementor’s trick: If you see identical types in a rule,

Typing Rules: Contexts and Term Variables

Let’s add variables!

    datatype exp = ARITH of arithop * exp * exp
                 | CMP   of relop   * exp * exp
                 | LIT   of int
                 | IF    of exp     * exp * exp
                 | VAR   of name
                 | LET   of name    * exp * exp
    and      arithop = PLUS | MINUS | TIMES | ...
    and      relop   = EQ | NE | LT | LE | GT | GE

    datatype ty = WORDTY | FLAGTY

New forms to support variables

    datatype exp = ...
                 | VAR   of name
                 | LET   of name    * exp * exp

    datatype ty = WORDTY | FLAGTY

What could go wrong with a variable?

So we need to track variable use to ensure consistency

Key idea: Type environment (Gamma) tracks the types of variables.

Rule for var

x in dom Gamma        tau = Gamma(x) 
----------------------------------------
Gamma |- VAR x : tau

So as long as x is recorded in Gamma as a WORDTY, it can only be used as a WORDTY.

Rule for let

Gamma          |- e1 : tau1
Gamma{x->tau1} |- e2 : tau2   
-------------------------------------
Gamma |- LET x = e1 in e2 : tau2

Type Checker

Type checker needs Gamma – gives type of each “term variable”.

Type checking for variables

val typeof : exp * ty env -> ty
fun typeof (ARITH ..., Gamma ) = <as before>
  | typeof (VAR x, Gamma)      =
      (case maybeFind (x, Gamma)
         of SOME tau => tau
          | NONE     => raise IllTyped)
  | typeof (LET (x, e1, e2), Gamma) = 
      <your turn>

Check your understanding: Adding variables

Given the typing rule for the let form we added to the simple machine language:

    Gamma          |- e1 : tau1
    Gamma{x->tau1} |- e2 : tau2   
    -------------------------------------
    Gamma |- LET x = e1 in e2 : tau2

Fill in the missing blanks to complete the type checker we started in the video:

    val typeof : exp * ty env -> ty
    fun typeof (ARITH ..., Gamma ) = <as before>
      | typeof (VAR x, Gamma)      =
          (case maybeFind (x, Gamma)
             of SOME tau => tau
              | NONE     => raise IllTyped)
      | typeof (LET (x, e1, e2), Gamma) = 
          let tau1 = typeof (BLANK1, BLANK2)
          in  BLANK3 (BLANK4, extend Gamma x BLANK5)
          end 

Taking Stock

This is a big chunk of what language designers do.

Bonus content

What is a type?

Source of new language ideas for next 20 years

Needed if you want to understand advanced designs (or create your own)

Type soundness

Key theorem: prediction is accurate

What types can rule out

things that could go wrong:

(8 < 10) + 4

(8 == 8) < 9

x + (x :: xs)

let val y = 10 in length y end

21 October 2020: Type checking with type constructors

There are PDF slides for 10/22/2020.

should have two separate answer boxes: one for what they have learned and one for what they have questions about.

Reflection: Simple type checking

In the first part of this module on type checking, we talked about the purposes and limitations of type systems, we formalized a simple type system with just two types using inference rules, and we converted those inference rules into a type checker.

Describe two specific things you have learned so far in this module.

Write one question you have about what we’ve learned so far.

Your responses may be shared anonymously with your classmates.

Reflection: How many types?

In the previous “lecture”, we studied a type system with just two types: words and flags. In this “lecture”, we’re going to add type constructors such as function types and lists. When we ``add lists" to a language, how many types are we adding?

Where we’ve been

New watershed in the homework

Where we’re going

Focus on two questions about type systems:

We’ll look at these questions in two contexts: monomorphic (this “lecture”) and polymorphic (next “lecture”) languages.

Monomorphic vs Polymorphic Types

Monomorphic types have “one shape.”

Polymorphic types have “many shapes.”

Designing monomorphic type systems

Mechanisms:

Language designer’s process:

Words “introduce” and “eliminate” are the native vocabulary of type-theoretic language design—it’s like knowing what to say when somebody sneezes.

Managing the set of types: Type formation

Slide 1 

Examples: Not yet types

``Types in waiting’’ (yet)

Examples: Not types at all!

These are

Type-formation rules

Are about not trusting the source code

We need a way to classify type expressions into:

Slide 5 

Slide 6 

Type judgments for monomorphic system

Two judgments:

Check your understanding: Types, type constructors, or garbage

In each of the following, indicate whether the “type” is in fact a type (ie, something that classifies terms), a type constructor (ie, something that will classify terms when applied to some number of type arguments), or utter nonsense.

bool list

array

bool bool

->

These rules are actually the same as we saw for the simple arithmetic language, but we’ll review them again here.

Monomorphic type rules

Slide 8 

Slide 9 

Slide 10 

Slide 11 

Check your understanding: Checking type equalities

We saw how to modify the typing rule for Set to explicitly mark where two types were being compared for equality. The typing rule for If that we saw in the video has a similar implicit type equality check. Fill in the missing blanks in the typing rule below to create a version with an explicit equality check.

Classic types for data structures

Slide 12 

At run time, these pairs are identical to cons, car, cdr.

These rules dictate type checker code just like in previous “lecture”.

Slide 13 

Slide 14 

Slide 15 

Typical syntactic support for types

Explicit types on lambda and define:

Abstract syntax:

datatype exp = ...
 | LAMBDA of (name * tyex) list * exp
    ...
datatype def = ...
 | DEFINE of name * tyex * ((name * tyex) list * exp)
    ...

Slide 17 

Slide 18 

Check your understanding: Writing Typing Rules

ML References are similar to C/C++ pointers. Given the following surface syntax (on the left) and corresponding AST represenations (on the right):

Surface Syntax AST
ref τ ref(τ)
ref e ref-make(e)
!e ref-get(e)
e1 := e2 ref-set(e1,e2)

fill in the blanks to correctly complete the following formation, introduction, and elimination rules.

=10pt

Bonus Content

Coding the arrow-introduction rule

Slide 19 

Type-checking LAMBDA

datatype exp = LAMBDA of (name * tyex) list * exp 
   ...
fun ty (Gamma, LAMBDA (formals, body)) = 
  let val Gamma' = (* body gets new env *)
        foldl (fn ((x, ty), g) => bind (x, ty, g))
              Gamma formals
      val bodytype = ty (Gamma', body)
      val formaltypes = 
        map (fn (x, ty) => ty) formals
  in  FUNTY (formaltypes, bodytype)
  end

Approaching types as programmers

Understanding language design

Questions about types never seen before:

Examples: pointer, struct, function, record

Talking type theory

Formation: make new types

Introduction: make new values

Elimination: observe (“take apart”) existing values

Talking type theory: Introduction and elimination constructs

Part of learning any new field: talk to people in their native vocabulary

It’s like knowing what to say when somebody sneezes.

Slide 23 

Slide 24 

Slide 25 

Type soundness

Slide 26 

26 Oct 2020: Polymorphic Type Checking

There are PDF slides for 10/27/2020.

Your type-system questions are answered.

Today

Reflection: Type checking type constructors

In the previous two “lectures” we studied how to type check a known set of types. In each case, the syntax for the type was hard-coded into the AST for the language. Type checking the creation and uses of values with the type were also hard-coded into the type checker. That approach works fine if the language designers know all the types in advance. But, in any language worth its salt, programmers can create new types, (for example, by defining algebraic datatypes in ML, structs in C, or classes in Java). The language has to be able to type check these new types without having to recompile the language implementation first.

Spend a few minutes thinking about how we might redesign the implementation of the type checker to accommodate user-defined types.

Describe your preliminary design in a few sentences.

Limitations of monomorphic type systems

With monomorphism, new types are costly

Closed world

A new type constructor (“array”) requires

Slide 2 

Notes:

Costly for programmers

Monomorphism leads to code duplication

User-defined functions are monomorphic:

(check-function-type swap
                     ([array bool] int int -> unit))
(define unit swap ([a : [array bool]]
                   [i : int]
                   [j : int])
  (begin
     (set tmp       (array-at a i))
     (array-put a i (array-at a j))
     (array-put a j tmp)
     (begin)))

Check your understanding: Monomorphism limitations

If a language has a monomorphic type system, it will not let programmers write a single length function for lists. Instead, programmers must write a length function for each kind of list they want to calculate the length of, eg, a length function for lists of booleans, and another for lists of integers, etc.

Quantified types

Idea: Use functions not syntax to represent new types

Benefits:

Requires: more expressive function types

Better type for a swap function

(check-type
    swap
    (forall ('a) ([array 'a] int int -> unit)))

Slide 6 

Slide 7 

Check your understanding: Quantified Types

The type theory expression and the expression are two different representations for the same polymorphic type.

In the type (forall ('a) ([list 'a] -> 'a)), the notation 'a denotes a type variable that stands for the type of element in the list argument and the type of the result returned by the function.

The type (forall ('a) ([list 'a] -> [list 'a])) is the type of the car function.

Programming with quantified types

Substitute for quantified variables: “instantiate”

-> length
<function> : (forall ('a) ((list 'a) -> int))
-> [@ length int]
<function> : ((list int) -> int)
-> (length '(1 2 3))
type error: function is polymorphic; 
instantiate before applying
-> ([@ length int] '(1 2 3))
3 : int

Instantiate as you like

-> length
 : (forall ('a) ((list 'a) -> int))
-> [@ length bool]
 : ((list bool) -> int)
-> ([@ length bool] '(#t #f))
2 : int

More instantiations

-> (val length-int [@ length int])
length-int : ((list int) -> int)
-> (val cons-bool [@ cons bool])
cons-bool : ((bool (list bool)) -> (list bool))
-> (val cdr-sym [@ cdr sym])
cdr-sym : ((list sym) -> (list sym))
-> (val empty-int [@ '() int])
() : (list int)

Bonus instantiation:

-> map
<function> : 
  (forall ['a 'b]
    (('a -> 'b) (list 'a) -> (list 'b)))
-> [@ map int bool]
<function> : 
  ((int -> bool) (list int) -> (list bool))

Slide 11 

Polymorphic array swap

(check-type swap
    (forall ('a) ([array 'a] int int -> unit)))

(val swap
 (type-lambda ('a)
  (lambda ([a : (array 'a)]
           [i : int]
           [j : int])
   (let ([tmp ([@ Array.at 'a] a i)])
    (begin
      ([@ Array.put 'a] a i ([@ Array.at 'a] a j))
      ([@ Array.put 'a] a j tmp))))))

Two forms of abstraction:

Power comes at high notational cost

Function composition

-> (val o (type-lambda ['a 'b 'c]         
    (lambda ([f : ('b -> 'c)] 
             [g : ('a -> 'b)])
     (lambda ([x : 'a]) (f (g x))))))    

o : (forall ('a 'b 'c) 
       (('b -> 'c) ('a -> 'b) -> ('a -> 'c)))

Aka o :

Coding Interlude: Instantiating Polymorphic Functions

Enter the length function into the interpreter to observe its type.

Apply the length function to the list [1,2,3]. Observe the response.

Create two instantiations of the length function: one for ints and one for bools.

Apply the length-int function to the list [1,2,3] and the length-bool function to the list [#t, #f] and observe the result.

What do you see as the advantages and disadvantages of ’s approach to polymorphism so far?

Polymorphic Type Checking

Representing quantified types

Representing quantified types

Two new alternatives for tyex:

datatype tyex
 = TYCON  of name                // int
 | CONAPP of tyex * tyex list    // (list bool)
 | FUNTY  of tyex list * tyex    // (int int -> bool)
 | TYVAR  of name                // 'a
 | FORALL of name list * tyex    // (forall ('a) ...)

Check your understanding: Representation of quantified types

Given the tyex datatype:

datatype tyex
 = TYCON  of name                // int
 | CONAPP of tyex * tyex list    // (list bool)
 | FUNTY  of tyex list * tyex    // (int int -> bool)
 | TYVAR  of name                // 'a
 | FORALL of name list * tyex    // (forall ('a) ...)

Match the surface syntax for polymorphic types with their representations using the tyex datatype.

Surface syntax:

  1. bool
  2. 'a
  3. (list int)
  4. (bool bool -> int)
  5. (forall ['b] ((list 'b) -> int))

Represenation:

  1. FUNTY ([TYCON "bool", TYCON "bool"], TYCON "int")
  2. TYVAR "a"
  3. FORALL ("b", FUNTY ([CONAPP (TYCON "list", [TYVAR "b"])], TYCON "int"))
  4. TYCON "bool"
  5. CONAPP (TYCON "list", [TYCON "int"])

Type rules for polymorphism

Slide 15 

Slide 16 

Check your understanding: Rules for type-lambda

Slide 17 

  1. In the new judgement form for type checking type-lambda, the meta-variable Δ is a kind environment that records the type variables that have been used in typing a program.
  1. The AST term tylambda(α1,…,αn, e) is the representation of the surface syntax (@ e τ1 ... τn)
  1. In the introduction rule for type-lambda, it is okay if the type variables to be quantified over already appear in the context Γ.
  1. In the elimination rule for type-lambda, the resulting type is obtained by substituting the type arguments for the corresponding quantified type variables. If the type of the expression e is quantified over m type variables and e is applied to n type arguments, a type error should result if n ≠ m.

Type formation through kinds

What have we gained?

No more type-specific introduction rules:

No more type-specific elimination rules:

But, we still need formation rules

User types can’t be blindly trusted

-> (lambda ([a : array]) (Array.size a))
type error: used type constructor `array' as a type
-> (lambda ([x : (bool int)]) x)
type error: tried to apply type bool as type constructor
-> (@ car list)
type error: instantiated at type constructor `list', 
which is not a type

How can we know which types are OK?

Slide 20 

Slide 21 

Slide 22 

Slide 23 

Slide 24 

Check your understanding: Use kinds to classify types

For each type expression, select its kind or indicate it is not well formed.

  1. char
  1. list list
  1. array

Opening the closed world

Designer’s burden reduced

To extend Typed Impcore:

To extend Typed μScheme, none of the above! Just

You’ll do arrays both ways

Programmer’s power increased?

Typed Impcore:

Typed μScheme:

Standard ML:

Bonus content: a definition manipulates three environments

Slide 27 

Slide 28 

Slide 29 

28 October 2020: Introduction to type inference

There are PDF slides for 10/29/2020.

Retrospective: Type Checking

Write one thing you learned in the previous module about type checking and one question you have about that material.

Your responses may be anonymously shared with the rest of the class.

Reflection: How does the compiler know?

Consider the following ML code for the append function:

fun append (x::xs) ys = x :: append xs ys
  | append []      ys = ys

When you enter this code into the ML comiler, it returns the type of the function.

Describe how you think the ML compiler might figure out the type of the function.

Why study type inference?

What type inference accomplishes

-> (define     double (x)       (+ x x))
double                         ;; uScheme
-> (define int double ([x : int]) (+ x x))
double : (int -> int)          ;; Typed uSch.
-> (define     double (x)       (+ x x))
double : (int -> int)          ;; nML

What else type inference accomplishes

-> ([@ cons bool] #t ([@ cons bool] #f [@ '() bool]))
(#t #f) : (list bool)    ;; typed uScheme
-> (   cons       #t (   cons       #f    '()      ))
(#t #f) : (list bool)    ;; nML

How it works

  1. For each unknown type, a fresh type variable

  2. Every typing rule adds equality constraints

  3. Instantiate every variable automatically

  4. Introduce polymorphism at let/val bindings

Plan of study

Let’s do some examples

N.B. Book is “constraints first;” lecture will be “type system first.” Use whatever way works for you

Example: (val-rec double (lambda (x) (+ x x)))

What do we know?

Key idea: Record the constraint in a typing judgement.

'a2 ~ int /\ 'a2 ~ int, 
{ double : 'a1, x : 'a2 } |- (+ x x) : int
Example: (if y 1 0)
Example: (if z z (- 0 z))

Check your understanding: Gathering constraints

  1. In this judgment
    'ax ~ int /\ 'ay ~ int, 
    {x : 'ax, y : 'ay } |- (+ x y) : int

the notation 'ay ~ int is a constraint that means that the type variable 'ay must be equal to type int.

  1. In this judgment
    'ax ~ int /\ 'ay ~ int, 
    {x : 'ax, y : 'ay } |- (+ x y) : int

the notation y : 'ay means that expression variable y has type 'ay. Furthermore, because y has a type variable for its type, y’s type can become any syntactically valid type.

  1. Which set of constraints might have been generated when running type inference on the following program fragment? Pick the most complete set that applies.

(if x (+ 1 y) z)

Inferring polymorphic types

(val app2 (lambda (f x y)
             (begin
                (f x)
                (f y))))

Assume f : 'a_f

Assume x : 'a_x

Assume y : 'a_y

f x implies 'a_f ~ ('a_x -> 'a1)

f y implies 'a_f ~ ('a_y -> 'a2)

Together, these constraints imply 'a_x ~ 'a_y and 'a1 ~ 'a2

begin implies result of function is 'a2

So,

app2 : (('a_x -> 'a2) 'a_x 'a_x -> 'a2)

'a_x and 'a2 aren’t mentioned anywhere else in program, so

we can generalize to:

(forall ('a_x 'a2) (('a_x -> 'a2) 'a_x 'a_x -> 'a2))

which is the same thing as:

app2 : (forall ('a 'b) (('a -> 'b) 'a 'a -> 'b))

Check your understanding: Type inference with polymorphic functions

This question walks you through generating constraints for a simple function that makes two calls to a polymorphic function. It illustrates how those two calls can end up being used at different types.

Each subquestion builds on the previous one. The explanations given after each subquestion explain what you need to know for the next part.

Consider the following program:

(val cc (lambda (nss) (car_2 (car_1 nss))))

where car_1 and car_2 are two instances of the polymorphic car function that has type

car : forall 'a . 'a list -> 'a

For the purposes of this question we’ve named these uses car_1 and car_2 so we can be clear which use we are talking about as we study their effect on the type of the expression.

  1. Variable nss is a parameter, and so during type checking, we assign it a fresh type variable, say 'b.

    • True
    • False
  1. We need to instantiate the two instances of the polymorphic car function before we can apply them. Which set of instantiations is correct?
  1. Given what we have already inferred about the types of car_1 and nss (and given in the answers in parts 1 and 2), which of the following constraints is generated by the application (car_1 nss)?
  1. Given what we have already inferred about the types of car_1, car_2, and nss (and given in the answers in parts 1, 2, and 3), which of the following constraints is generated by the application of the function car_2 to the argument (car_1 nss) in the term (car_2 (car_1 nss)):

Extra examples

Bonus example

-> (val second (lambda (xs) (car (cdr xs))))
second : ...
-> (val two    (lambda (f) (lambda (x) (f (f x)))))
two : ...

Bonus example solved

-> (val second (lambda (xs) (car (cdr xs))))
second : (forall ('a) ((list 'a) -> 'a))
-> (val two    (lambda (f) (lambda (x) (f (f x)))))
two :    (forall ('a) (('a -> 'a) -> ('a -> 'a)))

Precise inference with Hindley-Milner types

Making Type Inference Precise

Sad news:

Solution:

Consequences:

Slide 9 

Representing Hindley-Milner types

type tyvar = name
datatype ty
  = TYVAR  of tyvar
  | TYCON  of name
  | CONAPP of ty * ty list

datatype type_scheme
  = FORALL of tyvar list * ty

fun funtype (args, result) = 
  CONAPP (TYCON "function", 
          [CONAPP (TYCON "arguments", args),
           result])

Slide 11 

Slide 12 

All the pieces

  1. Hindley-Milner types
  2. Bound names : σ, expressions : τ
  3. Type inference yields type-equality constraint
  4. Constraint solving produces substitution
  5. Substitution refines types
  6. Call solver, introduce polytypes at val
  7. Call solver, introduce polytypes at all let forms

To code the type-inference algorithm, replace eqType with constraint generation!

Check your understanding: Hindley-Milner Types

  1. With Hindley-Milner types, the meta-variable τ cannot stand for a type variable α.
  1. With Hindley-Milner types, the meta-variable σ represents a polymorphic type and has the form α1,… αn . τ.
  1. Choose the appropriate values for the blanks in the following ML data structure to make it represent the the Hindley-Miler type scheme α . α list -> α.
FORALL (BLANK1, 
        CONAPP(BLANK2, 
               [CONAPP (TYCON "arguments", BLANK3), 
                        TYVAR "alpha" ]))

BLANK1:

BLANK2:

BLANK3:

  1. In Hindley-Miler type inference, type environments Γ always bind variables to type scheme σ. If a variable is monomorphic, the type scheme will quantify over an empty list of type variables (ie, be of the form . τ)
  1. Match each step in Hindley-Milner type inference with its purpose.

Steps:

  1. constraint generation
  2. constraint solving
  3. substitution
  4. generaliation
  5. instantiation

Purposes:

  1. Refining types based on information inferred during constraint solving.
  2. Collating clues to make sure expressions are used consistently.
  3. Converting type scheme into a type so that it may be used in a given context.
  4. Collecting clues about how expressions are used in the program.
  5. Introducing type scheme to memorialize inference that given expression will work for any type.

2 November 2020: Making type inference formal

There are PDF slides for 11/3/2020.

Retrospective: Introduction to type inference

Write one thing you learned about type inference so far and one question you have. Your answers may be anonymously shared with your classmates.

Type inference algorithm, formally

All the pieces

  1. Hindley-Milner types
  2. Bound names : σ, expressions : τ
  3. Type inference yields type-equality constraint
  4. Constraint solving produces substitution
  5. Substitution refines types
  6. Call solver, introduce polytypes at val
  7. Call solver, introduce polytypes at all let forms

Slide 2 

Apply rule: e’s type could just be a type variable—we need to force it into the arrow form we need.

Slide 3 

Slide 4 

What you know and can do now

Taking stock

You can complete typeof

Except for let forms.

Next up: Solving constraints

Check your understanding: Type inference algorithm

  1. The judgment form C, Γ |- e : τ corresponds to the function typeof. For each element of the judgment, say whether that element is an input or an output of the function typeof:
  1. Consider the typing rule for apply written in the new style:

Slide 6 

Now answer the following questions about the rule.

A. The type τf is the type of the first argument to the function being applied.

B. The arguments to the function e1, …, en have types τ1, … τn, respectively.

C. The type variable α is the result type of the function.

Reflection: Solving constraints

For each of the following constraints, say if it is solvable. If it is, give a solution.

 1. int           ~ bool
 2. (list int)    ~ (list bool)
 3. 'a            ~ int
 4. 'a            ~ (list int)
 5. 'a            ~ ((args int) -> int)
 6. 'a            ~ 'a
 7. (args 'a int) ~ (args bool 'b)
 8. (args 'a int) ~ ((args bool) -> 'b)
 9. 'a            ~ (pair 'a int)
10. 'a            ~ tau    // arbitrary tau

What does it mean to solve constraints?

Slide 7 

Representing constraints

But first, we need to represent them!

datatype con = ~   of ty  * ty
             | /\  of con * con
             | TRIVIAL
infix 4 ~
infix 3 /\

Example:

  TYVAR "alpha1" ~ TYCON "int" /\  
  TYVAR "alpha2" ~ TYCON "bool"

Slide 9 

Two questions: what’s substitution, and when is a constraint satisfied?

Slide 10 

just formalize it. It’s a function that behaves a certain way.

Slide 11 

Slide 12 

Constraint satisfaction: equal types mean equal constructors applied to equal arguments—same is in eqType.

Slide 13 

Check your understanding: Substitutions and satisfied constraints

Answer the following questions.

  1. The constraint τ1 ~ τ2 is satisfied if τ1 = τ2.
  1. A substitution θ is a mapping from type variables to type variables, but not more complex types.
  1. A substitution θ ``solves’’ a constraint C by ensuring all types related in C via the ~ equivalence relation are equal.

Solving simple type equalities

Slide 14 

Reflection: Solving type-equality constraints

One type of constraint we need to solve is that involving two types τ1 ∼ τ2.

Recall that each type can have one of three forms:

datatype ty
  = TYVAR  of tyvar
  | TYCON  of name
  | CONAPP of ty * ty list

Briefly describe how you would determine whether the constraint τ1 ∼ τ2 was satsified.

How many potential cases are there to consider?

Solving type equalities: The nine cases

tau1 \ tau2    TyVar2    TyCon2      ConApp2

                alpha2     mu2     (tau21,...,tau2n)tau2

TyVar1             1        2          3

TyCon1             S      mu1=mu2?     Fail

ConApp1            S        S          Recurse

I = immediate; S = symmetry; 

1 = alpha1 -> alpha2

2 = alpha1 -> mu2

3 = if occurs check passes, alpha1 -> (tau21,...,tau2n)tau2

Solving conjunctions

Slide 16 

Check your understanding: Solving constraints

  1. When asked to produce a substitution θ that solves an equality constraint, which has the form τ1 ~ τ2, there are nine cases to consider. The number nine comes from the fact that each type has three possible forms: a type variable α, a type constructor μ, or a type application (τ1,…,τn)τ.
  1. When asked to produce a substitution θ that solves an equality constraint, which has the form τ1 ~ τ2, if τ1 has the form μ1 and τ2 has the form μ2, then there is no solution.
  1. When asked to produce a substitution θ that solves an equality constraint, which has the form τ1 ~ τ2, if τ1 has the form μ1 and τ2 has the form (τ1,…,τn)τ then there is no solution.
  1. In the type system we have been studying, the constraint α ~ α * τ is solvable.
  1. When solving the conjunction of constraints C1 C2, it is sufficient to produce a solution θ1 for constraint C1 and a separate solution θ2 for constraint C2, and then compose those substitutions to produce a solution θ1 o θ2 for the conjunction C1 C2.

The inference algorithm, formally

All the pieces

  1. Hindley-Milner types
  2. Bound names : σ, expressions : τ
  3. Type inference yields type-equality constraint
  4. Constraint solving produces substitution
  5. Substitution refines types
  6. Call solver, introduce polytypes at val
  7. Call solver, introduce polytypes at all let forms

What you can do so far

Write type inference for everything except VAL, VALREC, and LETX

Write function solve, which, given a constraint C, has one of three outcomes:

4 November 2020: Instantiation and generalization

There are PDF slides for 11/5/2020.

Retrospective: Making type inference precise

Write one thing you learned about formalizing type inference so far and one question you have. Your answers may be anonymously shared with your classmates.

Reflection: Introducing type schemes

Suppose your type inferencer has produced a type that mentions a type variable α. Under what circumstances do you think it is safe to generalize that type to a polymorphic type scheme α . τ?

Introducing and eliminating type schemes

All the pieces

  1. Hindley-Milner types
  2. Bound names : σ, expressions : τ
  3. Type inference yields type-equality constraint
  4. Constraint solving produces substitution
  5. Substitution refines types
  6. Call solver, introduce polytypes at val form
  7. Call solver, introduce polytypes at all let forms

Slide 2 

Moving from type scheme to types (Instantiation)

Moving from types to type scheme (Generalization)

Slide 3 

Check your understanding: Instantiation

  1. Instantiation is the process of converting from a polymorphic type scheme σ to a monomorphic instance τ.
    Such conversions always take place when variables are type checked using the Var typing rule.
  1. Consider the Var typing rule:

Slide 4 

Suppose we want to infer a type for the expression

(if y (fst 2 3) 4)

in the environment

Gamma = {fst : forall 'a 'b. 'a * 'b -> 'a, y : 'c}

Suppose we instantiate fst as follows:

fst : 'c * 'b -> 'c

This instantiation and the expression gives rise to the following constraints:

'c ~ bool  ;; because y:'c and y is tested by the if

'c ~ int   ;; because first argument to fst is an int

These constraints aren’t satisfiable.

But, the expression should have the typing

 'c ~ bool, Gamma |- (if y (fst 2 3) 4) : int

What went wrong?

Bonus question: Why the distinctness requirement?

Consider again

Gamma = {fst : forall 'a 'b. 'a * 'b -> 'a, y : 'c}

??, Gamma |- (if y (fst 2 3) 4) : ??

Now suppose we instantiate fst as follows:

fst : 'a * 'a -> 'a

Gamma |- fst 2 #t

Application rule yields constraints:

'a ~ int

'a ~ bool

which aren’t satisfiable. Which means this code would also be erroneously flagged as an error.

Correct typing:

Gamma |- fst 2 #t : int

The mistake was ignoring the distinctness requirement when instantiating the type scheme.

From types to type scheme: Val rule

Slide 5 

Slide 6 

The set A above will be useful when some variables in τ are mentioned in the environment.

We can’t generalize over those variables.

Applying idea to the type inferred for the function fst:

generalize('a * 'b -> 'a, emptyset) = 
forall 'a, 'b. 'a * 'b -> 'a

Slide 7 

Note the new judgement form above for type checking a declaration.

Slide 8 

Slide 9 

Slide 10 

On the condition θΓ = Γ: Γ is “input”: it can’t be changed.
The condition ensures that θ doen’t conflict with Γ.

We can’t generalize over free type variables in Γ.

Their presence indicates they can be used somewhere else, and hence they aren’t free to be instantiated with any type.

Check your understanding: The full Val rule

Consider the Val rule:

Slide 11 

In this series of questions, we will walk through using the Val rule to type check the following definition:

(val pick-t (lambda (y z) (pick #t y z)))

in the context

Γ = { pick : α.bool $\cross$ α $\cross$ α $\arrow$ α}

  1. The first hypothesis above the line in the Val rule that we must check is the judgment C, Γ |- e : τ. Suppose we have already derived the following instance of that judgment as we infer a type for the right-hand side of the pick-t definition:

    $$ \begin{array}{l} \alpha_y \sim \alpha_p \land \alpha_z \sim \alpha_p, \Gamma \vdash \\ \quad \lit{(lambda (y z) (pick \#t y z)) }: \alpha_y \cross \alpha_z \arrow \alpha_p \\ \end{array} $$

To derive that judgment, which of the following might have been how the polymorphic pick function was instantiated, assuming αy and αz are the types assumed for variables y and z, respectively? Consider what instantiation would have been necessary to produce the observed constraints and resulting type.

  1. The next step is to produce a substitution θ that solves the constraints αy ~ αp αz ~ αp and leaves Γ unchanged (θ Γ = Γ). Which of the following substitutions is a such solution?
  1. The next step is to generalize. Given the substitution θ = { αy -> αp, αz -> αp}, what type should be passed as the first argument to the generalize function? (The second argument is the empty set because ftv(Γ) = .

Reflection: Type inference for let form

Consider the following nML code. ML will infer a type scheme for the first and third examples, but reports an error for the second.

Say what each of the functions is attempting to do and explain why you think ML reports a type error for the second function.

(lambda (ys)  
   (let ([s (lambda (x) (cons x '()))])
      (pair (s 1) (s #t))))
<function> : 
  (forall ['a] ('a -> (pair (list int) (list bool))))

(lambda (ys)  
   (let ([extend (lambda (x) (cons x ys))])
      (pair (extend 1) (extend #t))))
type error: cannot make int equal to bool

(lambda (ys)  ; OK
    (let ([extend (lambda (x) (cons x ys))])
       (extend 1)))
<function> : ((list int) -> (list int))

Slide 12 

Slide 13 

Slide 14 

Let with constraints, operationally:

  1. typesof: returns τ1, …, τn and C

  2. val theta = solve C

  3. C-prime from map, conjoinConstraints, dom, inter, freetyvarsGamma

  4. freetyvarsGamma, union, freetyvarsConstraint

  5. Map anonymous lambda using generalize, get all the σi

  6. Extend the typing environment Gamma (pairfoldr)

  7. Recursive call to type checker, gets C_b, \tau

  8. Return (tau, C' /\ C_b)

Check your understanding: The Let rule

The following questions ask about the Let rule:

Slide 15 

  1. The let-bound variables x1, …, xn are given polymorphic type schemes within the scope of the body of the let e, but the type inferred for e is monomorphic.

    • True
    • False
  1. If the constraint solver returned the substitution θ = {α -> (α, β)}, type inference would be able to proceed with calculating a residual constraint and generalization.
  1. The residual constraint C retains all information in the substitution θ that relates to any type variable that might be relevant after type checking the let expression as determined by being mentioned in the free type variables of the context Γ.
  1. During generalization, we only need to worry about not generalizing over the free type variables mentioned in Γ because those type variables might be constrained in various ways by other parts of the program.

Slide 16 

Slide 17 

Check your understanding: The val-rec and letrec rules

The key insight with the two recursive type inference rules is to follow a three-step process:

  1. Introduce a fresh type variable α to represent the unknown type for the recursive binding.
  2. Infer constraints and a type τ for the recursive binding as usual
  3. Add a constraint relating the type variable and the inferred type (α ~ τ).

Otherwise, the recursive variants of the type inference rules proceed just as the non-recursive versions.

Forall things

Managing Quantified types
val and val-rec let, letrec, … lambda
FORALL contains all variables (because none are free in the context) FORALL contains variables not free in the context FORALL is empty
Generalize over all variables (because none are free in the context) Generalize over variables not free in the context Never generalize

9 November 2020: Introduction to Object-orientation

There are PDF slides for 11/10/2020.

Retrospective: Type Inference

Write one thing you learned in the previous module about type inference and one question you have about that material.

Your responses may be anonymously shared with the rest of the class.

Reflection: Introduction to Object-Oriented Programming

Suppose you were implementing a drawing program that lets users manipulate various kinds of shapes: circles, triangles, diamonds, etc.

Information Hiding

So far: Programming in the small

What about big programs?

An area of agreement and a great divide:

                 INFORMATION HIDING
             (especially: mutable data)
                      /       \
                     /         \
       code reuse   /           \  modular reasoning 
                   /             \ 
interoperability  /               \  internal access
between reps     /                 \  to reps
                /                   \
            OBJECTS                 MODULES               
                                  ABSTRACT TYPES

Key idea: Interfaces

Introduction to OOP

Example Languages: JavaScript, Ruby, Java, C++, Python, …

Goals:

Mechanism:

Object-oriented demo

Demo: circle, diamond, triangle cards with these methods:

Front:

Object 1 : Circle

Back:

Object 1: Circle

;; stores center of shape

center: coord

methods: ...

Message sends:

  1. object1 set-pos:to: 'S (0,0)

  2. object1 pos: 'N answer: (0,2)

  3. object2 set-pos:to: 'S (0,2)

  4. object2 pos: 'N answer: (0,4)

  5. object3 set-pos:to: 'SW (0,4)

  6. object1 draw

  7. object2 draw

  8. object3 draw

Key concepts of object-orientation

Functional vs. OOP mindset

Reflected in the syntax: the decider comes first!

Examples

Print every element of a list

Multiplication of natural numbers, base 10

Check your understanding: Key concepts of object-orientation

In object-oriented programming, computation proceeds by sending messages to objects.

Information-hiding is important to large-scale programming because it allows a large program to be divided into smaller units that can be understood separately.

In both functional and object-oriented languages it is idiomatic to ask about the representation of arguments to functions/methods.

One way an interface supports programming-in-the-large is by limiting access to the internal details of part of a program so those details may evolve independently of the rest of the program.

the fact that the concepts are intertwined to the point that it is hard to present them one at a time. In this video, I’ll review some of the key terms to give you some context, but we’ll introduce them again as we see them in use.

OOP terminology

Example: List select (aka filter)

-> (val ns (List withAll: '(1 2 3 4 5)))    
List( 1 2 3 4 5 )
-> (ns select: [block (n) ((n mod: 2) = 0)])
List( 2 4 )

Vocabulary notes:

Check your understanding: OOP Terminology

Match the OOP term to its definition:

OOP term:

  1. object
  2. class
  3. instance variables
  4. class method
  5. dynamic dispatch
  6. inheritance
  7. instantiation

Definition:

  1. construct used for defining objects
  2. imperative state allocated on a per-object basis
  3. process of creating an object from a class; the new object is called an instance of the class
  4. dynamic entity that receives messages
  5. behavior invoked at the class level typically for instantiating and initializing objects
  6. mechanism whereby one class can be defined by extending the behavior of a parent or super class
  7. another term for message sending

Design process in an OO setting

No interrogation about form!

Design process still works

  1. Each method defined on a class

    • The class knows the form!
  2. Class determines

    • How object is formed (class method)
    • From what parts (instance variables)
    • How object responds to messages (instance method)

Each form of data gets its own methods in its own class!

Using classes to implement select

Class determines how object responds.

Key classes in lists:

(method select: (_) self) ;; on ListSentinel
(method select: (aBlock)  ;; on Cons
   ([aBlock value: car] ifFalse:ifTrue:
       {(cdr select: aBlock)}
       {(((Cons new) car: car) 
                     cdr: (cdr select: aBlock))}))

Check your understanding: Classes represent forms of data

When coding lists using object-oriented idioms, different classes are used to represent the different forms of data: one class for the empty list and another class for non-empty lists.

Dynamic dispatch replaces if

In OOP, conditionals are deprecated

Instead, idiomatic code uses dynamic dispatch.

We’ll expore via example: List filtering via iteration

Iteration in Scheme

Ask value about form:

(define app (f xs)
  (if (null? xs)
      'do-nothing
      (begin
         (f (car xs))
         (app f (cdr xs)))))

Iteration in uSmalltalk

Use dynamic dispatch:

Instead of (app f xs), we have

(xs do: f-block)

For each element x in xs,

send

Example: Iteration

-> (val ms (ns select: [block (n) ((n mod: 2) = 0)]))
List( 2 4 )
-> (ms do: [block (m) ('element print) (space print)
                      ('is print) (space print)
                      (m println)])
element is 2
element is 4
nil

Implementing iteration

What happens if we send “do f” to an empty

What happens if we send “do f” to a cons cell?

Iteration by dynamic dispatch

Sending do: to the empty list:

  (method do: (aBlock) nil)
      ; nil is a global object

Sending do: to a cons cell:

  (method do: (aBlock)
      (aBlock value: car)
      (cdr do: aBlock))

Look! No if! Decisions made by dynamic dispatch

Basic OOP mechanism review

Not shown: a method can use primitive operations.

Check your understanding: Code selection via dynamic dispatch

Dynamic dispatch means it is the run-time class of the receiver object that determines the code to be executed.

Idiomatic object-oriented code often avoids conditionals through the use of dynamic dispatch, for example by sending messages to objects from different classes to trigger class-appropriate behavior instead of asking the object about its representation and then branching on the result.

The six questions for Smalltalk

The six questions

  1. Values are objects (even true, 3, "hello")

    Even classes are objects!

    There are no functions—only methods on objects

The six questions

  1. Syntax:

Slide 13 

Slide 14 

Slide 15 

Slide 16 

Message passing

Look at SEND

N.B. BLOCK and LITERAL are special objects.

The six questions

  1. Environments

The six questions

  1. Types

    There is no compile-time type system.

    At run time, Smalltalk uses behavioral subtyping, known to Rubyists as “duck typing”

  2. Dynamic semantics

    • Main rule is method dispatch (complicated)
    • The rest is familiar
  3. The initial basis is enormous

    • Why? To demonstrate the benefits of reuse, you need something big enough to reuse.

Check your understanding: The six questions for Smalltalk

In μSmalltalk, all values are objects, even booleans, integers, strings, classes.

Because μSmalltalk focuses so strongly on objects, it does not provide any language feature akin to anonymous functions.

The μSmalltalk type system statically guarantees that no message-not-understood errors will happen at run-time. That is, if a message m is sent to an object obj in the text of the program, then obj belongs to a class in which method m is defined.

10 November 2020: OOP Mechanisms

There are PDF slides for 11/11/2020.

Retrospective: Introduction to OOP

Write one thing you learned about OOP so far and one question you have. Your answers may be anonymously shared with your classmates.

Smalltalk syntax

Slide 1 

Slide 2 

Slide 3 

Slide 4 

Slide 5 

Slide 6 

Slide 7 

little livecoding.

Coding in uSmalltalk

Coding Interlude: Coding in Smalltalk

Consider the following code:

  ;; blocks
  (val twice [block (n) (n + n)])
  (twice value: 3)
  ;; note lightweight syntax for parameterless blocks
  (val delayed {('hello println) 42})
  delayed
  (delayed value)
  ;; blocks sent to Booleans
  (val x 10)
  (val y 20)
  ;; Note the number of columns matches number of args
  ((x <= y) ifTrue:ifFalse: {x} {y}) 

Now answer the following questions:

  1. Why doesn’t entering delayed cause the string hello to be printed and the number 42 to be returned?
  2. Why does entering (delayed value) cause the string hello to be printed and the number 42 to be returned?
  3. What is the meaning of the braces, as in {('hello println) 42} and in the arguments to theifTrue:ifFalse:method ({x}and{y}`)?
  4. What about the syntax of the ifTrue:ifFalse method tells you how many arguments it is expecting?

Classes

Parts of a class definition

Example: CoordPair

(class CoordPair
   [subclass-of Object]   ;; Inherits from class Object
   [ivars x y]            ;; Instance variables

   (class-method withX:y: (anX aY)
      ((self new) initializeX:andY: anX aY))
   (method initializeX:andY: (anX aY) ;; private
        (set x anX)
        (set y aY)
        self)

   (method x () x)
   (method y () y)
   (method = (coordPair)
      ((x = (coordPair x)) & (y = (coordPair y))))
   (method * (k)           ;;;;;; Scaling and translation
        (CoordPair withX:y: (x * k) (y * k)))
   ... )

Check your understanding: Parts of a class definition

Consider the following class definition:

(class CoordPair
   [subclass-of Object]   ;; Inherits from class Object
   [ivars x y]            ;; Instance variables

   (class-method withX:y: (anX aY)
      ((self new) initializeX:andY: anX aY))
   (method initializeX:andY: (anX aY) ;; private
        (set x anX)
        (set y aY)
        self)

   (method x () x)
   (method y () y)
   (method = (coordPair)
      ((x = (coordPair x)) & (y = (coordPair y))))
   (method * (k)           ;;;;;; Scaling and translation
        (CoordPair withX:y: (x * k) (y * k)))
   ... )

Now answer the following questions.

  1. Class CoordPair inherits from a class Pair
  1. Objects instantiated from class CoordPair have four instance variables: x, y, anX, and anY.
  1. The class method withX:y: is sent to the class CoordPair.
  1. The method = is an instance method that is sent to objects instantiated from the class CoordPair.
  1. In the class CoordPair, the identifier x is used as both an instance variable and an instance method.

Access rights

Instance variables

Methods

coordPair; need to use accessor methods; common idiom.

Example: Access rights

(class CoordPair
   [subclass-of Object]   ;; Inherits from class Object
   [ivars x y]            ;; Instance variables

   (method initializeX:andY: (anX aY) ;; private
        (set x anX)
        (set y aY)
        self)

   (method x () x)
   (method y () y)
   (method = (coordPair)
      ((x = (coordPair x)) & (y = (coordPair y))))

   ...
)

Check your understanding: Access rights

Consider the following code:

(class CoordPair
   [subclass-of Object]   ;; Inherits from class Object
   [ivars x y]            ;; Instance variables

   (method initializeX:andY: (anX aY) ;; private
        (set x anX)
        (set y aY)
        self)

   (method x () x)
   (method y () y)
   (method = (coordPair)
      ((x = (coordPair x)) & (y = (coordPair y))))

   ...
)

Now answer the following questions:

  1. In the class CoordPair, the instance variable y is private, which means it is accessible within the methods of the class, but not elsewhere. As a result, it is accessible within initializeX:andY:, y, and =. However, any y instance variable that might be part of the argument object coordPair is not accessible in the = method.
  1. μSmalltalk makes it impossible to access the method initializeX:andY: outside of the class CoordPair because of the comment private.

invariants for each class in hierarchy.

Object initialization

Objects created by sending creator method to class.

(class CoordPair
   [subclass-of Object]   ;; Inherits from class Object
   [ivars x y]            ;; Instance variables

   (class-method withX:y: (anX aY)
      ((self new) initializeX:andY: anX aY))

   (method initializeX:andY: (anX aY) ;; private
        (set x anX)
        (set y aY)
        self)
    ...

   (method * (k)          ;;;;;; Scaling and translation
        (CoordPair withX:y: (x * k) (y * k)))
)

Check your understanding: Object initialization

Consider the following code:

(class CoordPair
   [subclass-of Object]   ;; Inherits from class Object
   [ivars x y]            ;; Instance variables

   (class-method withX:y: (anX aY)
      ((self new) initializeX:andY: anX aY))

   (method initializeX:andY: (anX aY) ;; private
        (set x anX)
        (set y aY)
        self)
    ...

   (method * (k)          ;;;;;; Scaling and translation
        (CoordPair withX:y: (x * k) (y * k)))
)

Now answer the following questions:

  1. The method * sends the message withX:y: to the class CoordPair to create a new CoordPair object initialized with scaled up x and y coordinates.
  1. The class method withX:y: sends the new message to self to allocate a fresh CoordPair object. It then sends the message initializeX:andY: to the resulting object to initialize its instance variables x and y to the parameter values anX and aY.
  1. An alternative way to get a new CoordPair object in a method defined within the CoordPair class would be to send the message initializeX:andY: to self.

Special name self

  • Allows message sends to host object
  • Idiomatic to return self to support message chaining
  • Not a proper variable because can’t be set

Example: A short-hand method

     (method toOrigin ()
         (self setLoc: (0,0)))

Special name super

  • Starts method search with parent class
  • Statically determined based on defining class of method
  • Typically used when refining inherited behavior
  • Not a proper variable because can’t be set

Example: A speedy shape

(method move: (coord) (super move: (coord * 2)))

Check your understanding: Special names self and super

  1. A method defined on an object obj sends a message to the special name self to invoke another one of obj’s methods.
  • True
  • False
  1. A method defined in a class S that inherits from class P sends a message m to the special name super when it wants to use the method implementation that P would have used for m.
  • True
  • False

Abstract and concrete classes

Abstract class

  • Meant to be inherited from

  • Some ( > 0) subclassResponsibility methods

  • Examples: Shape, Boolean, Collection

Concrete class (aka, a regular class)

  • Meant to be instantiated

  • No subclassResponsibility methods

  • Examples: Triangle, True, List

the collction protocol:

definitions with more efficient implementations.

Slide 16 

Check your understanding: Concrete and abstract classes

  1. An abstract class has exactly one abstract method whose implementation is subclassResponsibility.
  • True
  • False
  1. The purpose of an abstract class is to define some concrete methods in terms of the abstract methods so that behavior can be inherited by subclasses.
  • True
  • False

Slide 17 

Method Dispatch

dispatch. We’ll use Booleans as an example.

class implements: it’s interface. Often standard methods defined on all objects omitted (like print).

Slide 18 

Slide 19 

Slide 20 

send a message to the object true.

Method dispatch: The algorithm

To answer a message:

  1. Consider the class of the receiver

  2. Is the method with that name defined?

  3. If so, use it

  4. If not, repeat with the superclass

Run out of superclasses?

“Message not understood”

Method dispatch: An example

Q: What happens if is sent to ?

  • (true ifTrue: aBlock)

A: Search starts in True class and method is found.

Q: What if is sent to ?

  • (true print)

A: Search starts in True, moves to parent Boolean, stops in class Object where it finds the print method.

Q: What if is sent to ?

  • (true move)

A: Search starts in True, moves to Boolean, moves to Object, fails to find move, so reports “Message not understood” error.

Check your understanding: Method dispatch

  1. When a message is sent to an object, the runtime system searches for a corresponding method definition starting with the class that defined the object.
  • True
  • False
  1. If a message m is sent to an object obj and if the runtime system cannot find a method definition in obj’s class, then the system reports a message-not-understood error.

16 November 2020: OOP Case Study: The Number Hierarchy

There are PDF slides for 11/17/2020.

Case study: Magnitudes and numbers

Reflection: Design of a Number Hierarchy

Imagine you are designing a hierarchy of number types such as reals, floats, integers and fractions. How would you organize this hierarchy? How are these number types similar and how are they different?

Key problems on homework

Case Study: The Number Hierarchy

Goal: Develop feel for programming in the large

Issues we’ll consider along the way:

  • Class design

    • What methods go where?
  • Invariants

    • Class methods establish invariants
    • Instance methods must maintain them
  • Information hiding

    • “semi-private” methods for like objects
    • Double-dispatch when argument form matters.

Slide 2 

Slide 3 

Implementation of : Reuse

(class Magnitude            ; abstract class
    [subclass-of Object]
    (method =  (x) (self subclassResponsibility))
                    ; may not inherit = from Object
    (method <  (x) (self subclassResponsibility))
    (method >  (y) (y < self))
    (method <= (x) ((self > x) not))
    (method >= (x) ((self < x) not))
    (method min: (aMag)
       ((self < aMag) ifTrue:ifFalse: {self} {aMag}))
    (method max: (aMag)
       ((self > aMag) ifTrue:ifFalse: {self} {aMag}))
)

Check your understanding: Design of magnitude class

Consider the Magnitude Class

(class Magnitude ; abstract class
  [subclass-of Object]
  (method = (x) (self subclassResponsibility))
                ; may not inherit = from Object
  (method < (x) (self subclassResponsibility))
  (method > (y) (y < self))
  (method <= (x) ((self > x) not))
  (method >= (x) ((self < x) not))
  (method min: (aMag)
      ((self < aMag) ifTrue:ifFalse: {self} {aMag}))
  (method max: (aMag)
      ((self > aMag) ifTrue:ifFalse: {self} {aMag}))
)
  1. What methods must a subclass of Magnitude implement at a minimum?
  • <, =
  • =
  • <,>,=
  • <,=,<=,>=,>
  1. True or False: Can you compare two instances of Magnitude for equality using the = method as defined?
  • True
  • False

Slide 5 

receiver but represents the same value as the argument

Slide 6 

Reflection: OOP Design: Implementing class Number

Code reuse is a powerful feature. Consider the methods negated, reciprocal, +, -, * / .

How might you implement some of these methods in terms of others?

Slide 7 

Concrete Number Methods

(method -  (y) (self + (y negated)))
(method abs () ((self isNegative) ifTrue:ifFalse:
                   {(self negated)}
                   {self}))
(method /  (y) (self * (y reciprocal)))

(method isNegative         ()
   (self  < (self coerce: 0)))
(method isNonnegative      ()
   (self >= (self coerce: 0)))
(method isStrictlyPositive ()
   (self  > (self coerce: 0)))

Abstract Number Methods

(class Number
    [subclass-of Magnitude] ; abstract class 
    ;;;;;;; arithmetic
    (method +   (aNumber)     (self subclassResponsibility)) 
    (method *   (aNumber)     (self subclassResponsibility)) 
    (method negated    ()     (self subclassResponsibility)) 
    (method reciprocal ()     (self subclassResponsibility)) 

    ;;;;;;; coercion
    (method asInteger ()      (self subclassResponsibility)) 
    (method asFraction ()     (self subclassResponsibility)) 
    (method asFloat ()        (self subclassResponsibility)) 
    (method coerce: (aNumber) (self subclassResponsibility)) 
)

Slide 10 

Check your understanding: Design of number class

Consider the Extended Number Heirarchy

Slide 11 

  1. At a minimum what messages will instances of Fraction, Float and Integer be able to respond to?

    • Methods defined in Number, Magnitude and Object
    • Methods defined in Number only
  1. At a minimum what messages will instances of Small Integer and LargeInteger be able to respond to?

    • Only Methods defined in Number
    • Methods defined in Number and Methods Defined in Integer

Fractions: A case study in information hiding and initialization

  1. the denominator is always positive

  2. If the numerator is zero, the denominator is 1.

  3. The numerator and the denominator are reduced to lowest terms, that is, their only common divisor is 1.

Example class Fraction: Initialization

(class Fraction
    [subclass-of Number]
    [ivars num den] 
      ;; invariants: lowest terms, minus sign on top
      ;;             if num = 0, den = 1
      ;; maintained by divReduce, signReduce
    (class-method num:den: (a b) 
        ((self new) initNum:den: a b))
    (method initNum:den: (a b) ; private
        (self setNum:den: a b)
        (self signReduce)
        (self divReduce))
    (method setNum:den: (a b)
        (set num a) (set den b) self) ; private
    .. other methods of class Fraction ... )

Information revealed to self

Instance variables num and den

  • Directly available
  • Always and only go with self

Object knows its own representation, invariants, private methods:

(method asFraction ()
   self)
(method print () 
   (num print) ('/ print) (den print) self)
(method reciprocal ()
   (((Fraction new) setNum:den: den num) signReduce))

Information revealed to others

Reflection: Implementing coerce

Coercion is the process of making an instance fit a specified type. For instance, the Integer 1 can be coerced to the Float 1.0.

Imagine you wanted to add a Float x to an Integer y without losing any precision. Which instance should be coerced to what type before adding?

The method coerce: implements this idea. It returns the value of the argument in the representation of the receiver. How would you implement coerce: ?

What does the caller of the method coerce: need to know about the type and protocol of the receiver?

Slide 14 

Information revealed to others: coerce

How would you implement coerce:?

  • Value of argument, rep of receiver

  • Challenge: Can’t access rep of argument!

(method asFraction ()
   self)
(method print () 
   (num print) ('/ print) (den print) self)
(method reciprocal ()
   (((Fraction new) setNum:den: den num) signReduce))
(method coerce: (aNumber)
   (aNumber asFraction))

Saved: Number protocol includes asFraction!

Slide 16 

Slide 17 

Slide 18 

Slide 19 

Reflection: Implementing multiply on fractions

(class Fraction
    [subclass-of Number]
    [ivars num den]
        ;; invariants: lowest terms, minus sign on top
        ;; if num = 0, den = 1
        ;; maintained by divReduce, signReduce
    (class-method num:den: (a b)
        ((self new) initNum:den: a b))
    (method initNum:den: (a b) ; private
        (self setNum:den: a b)
        (self signReduce)
        (self divReduce))
    (method setNum:den: (a b)
        (set num a) (set den b) self) ; private
    .. other methods of class Fraction ... )

Consider implementing multiply between instances of the Fraction class. How do you multiply two fractions? How might you implement multiply given what we know about multiplying two fractions?

Slide 20 

Slide 21 

Controlled information sharing in an open system

Reflection: Implementing multiply across differnt kinds of numbers

How might you implement multiply when we have lots of different kinds of numbers with different representations?

Extending open systems

Number protocol: like multiplies with like

Goal:

  • Large integers and small integers are both Integers

  • Messages =, <, +, * ought to mix freely

Constraint: Each object has its own algorithm.

  • Small: Use machine-primitive multiplication

  • Large: Multiply magnitudes; choose sign

Double dispatch to the rescue!

In the following slide, the notation :+ and :- are value constructors for positive and negative large integers. The n and m are magnitudes (natural numbers).

Slide 23 

Slide 24 

Double dispatch to implement addition

How do you act?

  • As small integer, you’re asked to add large positive integer N to self (aka plusLP: N)

Response: ((self asLargeInteger) plus N)

  • As small integer, you’re asked to add small integer n to self (aka plusSP: n)

Response: (self plus n)

  • As large positive integer, you’re asked to add large positive integer N to self (aka plusLP: N)

Response: (self plus n)

  • As large positive integer, you’re asked to add small integer n to self (aka plusSP: n)

Response: (self plus (n asLargeInteger))

Check your understanding: Double dispatch

  1. What two things does a double dispatch method encode?

    • Operation and protocol
    • Operation and instance
    • Instance and protocol
  1. Consider using double dispatch to implement multiplication between your objects.

For each item, consider your class, and then the class of the object you are being multiplied against.

Do you need to coerce yourself into a different type of number for the operation to be successful?

Choose the appropriate smalltalk code to accomplish what you’re asked to do.

  • As small integer, you’re asked to multiply large positive integer N by self (aka timesLP: N)
    • (self * n)
    • ((self asLargeInteger) * N)
  • As small integer, you’re asked to multiply small integer n by self (aka timesSP: n)
    • (self * n)
    • ((self asLargeInteger) * N)
  • As large positive integer, you’re asked to multiply large positive integer N by self (aka timesLP: N)
    • (self * n)
    • ((self asLargeInteger) * N)
  • As large positive integer, you’re asked to multiply small integer n by self (aka timesSP: n)
    • (self * n)
    • ((self asLargeInteger) * N)
  1. Take a look at the code below. Can you determine which class each method belongs to? Hint: Your instance knows it’s own type and broadcasts that to the object the method is operating on.
  • (method + (aNumber) (aNumber addSmallIntegerTo: self))
    • SmallInteger
    • LargePositiveInteger
  • (method * (anInteger) (anInteger multiplyByLargePositiveInteger: self))
    • SmallInteger
    • LargePositiveInteger

Your turn: Double dispatch to implement multiplication

How do you act?

  • As small integer, you’re asked to multiply large positive integer N by self (aka timesLP: N)

Response: ((self asLargeInteger) * N)

  • As small integer, you’re asked to multiply small integer n by self (aka timesSP: n)

Response: (self * n)

  • As large positive integer, you’re asked to multiply large positive integer N by self (aka timesLP: N)

Response: (self * n)

  • As large positive integer, you’re asked to multiply small integer n by self (aka timesSP: n)

Response: (self * (n asLargeInteger))

Slide 27 

Slide 28 

Knowledge of protocol

Three levels:

  • I know only your

Example: send to any collection

  • You are like me: share

Example: send or to

  • I must get to know you:

Example: send to to

Extra: Dealing with overflow

New law for multiplication:

  (self * small) 
=
  ((primitive mulWithOverflow
       self
       small
       {((self asLargeInteger) * small)})
   value))

Primitive is not a method

  • Answers good block or exception block
  • Answer is then sent value

18 November 2020: Subtyping and Equality

There are PDF slides for 11/19/2020.

Retrospective: Double Dispatch

Write one thing you learned about Double Dispatch and one question you have. Your answers may be anonymously shared with your classmates.

Reflection: Subtyping

Consider when can an object of one type be used in a context expecting an object of another type.

What needs to be true for this substitution to work?

Hint: Imagine you need to pound a nail. In this context you expect to use a hammer. If you had to use something else besides a hammer, what needs to be true about the alternative object for to accomplish your task?

Subtyping

Subtype polymorphism

  • Key mechanism to support code reuse

  • A is a subtype of B (written A <: B) if value a:A can be used whenever a value of supertype B is expected.

  • Example: Circle, Diamond, and Triangle can be used in any context expecting a Shape

  • Subtyping relationship can be checked statically (e.g. Java, C++, Scala) or dynamically (e.g. Smalltalk, Ruby)

Slide 2 

Slide 3 

Behavioral subtyping

(in Ruby, ``duck typing’’)

Types aren’t enough:

  • Methods should also behave as expected.

Example:

  • draw method in Shape hierarchy vs.
  • draw method in first-person shooter game

Not enforced in type systems because of decidablity.

Check your understanding: Subtyping

  1. True or False: A subtype doesn’t need to understand all the messages of its supertype.
  • True
  • False
  1. True or False: If A is a subtype of B and B is a subtype of C A can understand all the messages of C.
  • True
  • False

Reflection: Subtyping vs. Inheritance

Subtyping describes the relationship between two objects in terms of what messages both objects understand.

Inheritance describes the relationship between two objects in terms of what code is shared between objects.

First, describe two objects that aren’t related via inheritance, but that are subtypes.

Next, describe two objects that are related via inheritance, but that aren’t subtypes.

Subtyping vs Inheritance

Subtyping is not inheritance

  - **SUBTYPE != SUBCLASS**
  - **SUPERTYPE != SUPERCLASS**

Some languages like C++ identify subtype with subclass, but conceptually they are different.

Examples

Example: circle and square with same protocol but defined independently

  • Subtypes but not related via inheritance

Example: Non-mutable dictionary that inherits from Dictionary but hides all methods that can update entries

  • Related via inheritance but not subtypes

Subtyping and Inheritance: Code Reuse

What reuse is being enabled?

  • Inheritance: Parent code to facilitate class definition

  • Subtyping: All the client code

Food for thought: Which has the bigger impact?

Check your understanding: Subtyping vs. Inheritance

  1. Match each term with the correct definition.
  • Subtyping
    • related in class definition
    • substitutability
  • Inheritance
    -related in class definition
    • substitutability
  1. True or False: When you add a method to a class, you need to add it to all the classes which inherit from that class.
  • True
  • False

Reflection: Equality

When should two objects be considered equal? Consider comparing two strings with == in C. What problems could result?

Equality

When are two entities equal?

  • Easy to get wrong esp. for abstract types

  • Logically equal but have different reps

Example: lists used to represent sets

[1,2,3] and [2,1,3]

  • Logically distinct by have the same rep

Example: Simulation of an ecosystem

Two distinct animals may have same data

Different answers for different types

Scalars

Examples: int, bool, char, etc.

Test: Same bit representation

  • Efficient

  • But: Don’t use for floats!

Functions

Example: int * int list -> int list

Test: Same code?

  • Decidable, but meaningful?

Test: Same math function?

  • Undecideable
  • Reason for ML’s ''a types:
    any type that supports equality

Usually disallowed.

Compound types: Option 1

Examples:

  • a dict object from class Dictionary
  • a rabbit from a simulation class

Test: object identity

  • Same object in memory
  • Very efficient
  • May fail to equate things that should be equal

Compound types: Option 2

Examples:

  • a dict object from class Dictionary
  • a rabbit from a simulation class

Test: structural equality

  • Recursively examine all the parts
  • Can be expensive!
  • Less picky, but may also fail to equate things that should be equal

Compound types: Option 3

Examples:

  • a dict object from class Dictionary
  • a rabbit from a simulation class

Test: User-defined

  • Can precisely define correct notion of equality
  • May or may not be efficient
  • Requires programmer attention

Tension

Correctness

vs. Efficiency

vs. Programmer Convenience

Check your understanding: Equality

  1. True or False: Different types may require different notions of equality. For example a notion of equality between two integers may be different than a notion of equality between two functions, or two expressions or two strings.
  • True
  • False

Semantics of Equivalence

Observational equivalence

Key idea: observational equivalence

  • Two values should be considered equal if no context can tell them apart

Key question: What should the context be allowed to do?

  • If context can use reflection (eg, class methods to query object formation), then no abstraction

    • Rule out reflection
  • Internals of an abstraction can see more than intended

    • Only let context use client code
  • If object is mutable and context is allowed to mutate it

    • either: Object identity
    • or: Rule out mutation

Check your understanding: Observational Equivalence

What is the key idea of Observational Equivalence?

  • Two values should be considered equal if no context can tell them apart.

  • Two values should be considered equal if they have the same structure or binary representation in memory.

  • Two values should be considered equal if there is a context which produces the same result for both.

Equality in uSmalltalk

Method == is object identity

;; object identity
(method  == (anObject) (primitive sameObject self anObject))

Method = is a form of observational equivalence

  • Principle: Two objects are considered equivalent if, without mutating either object, client code cannot tell them apart

  • Each class decides for itself by overwriting default method

  • Defaults to object identity

    (method = (anObject) (self == anObject))

  • Used by check-expect

  • Why disallow client code from using mutation to observe?
    Pragmatic: already have object-identity with = method

Example: Equivalent but not equal

(-> (val ns (List new)) List( )
-> (ns addFirst: 3)
-> (ns addFirst: 2)
-> (ns addFirst: 1)
-> ns
List( 1 2 3 )
-> (val ms (List new)) List( )
-> (ms addLast: 1)
-> (ms addLast: 2)
-> (ms addLast: 3)
-> ms
List( 1 2 3 )
-> (ns == ms) 
<False>
-> (ns = ms)
<True>

Check your understanding: Equality in Smalltalk

Match each of the smalltalk operators ( =, == ) with the correct definition.

  • ==
    • Are two objects observationally equivalent?
    • Are two objects pointing to the same place in memory?
  • =
    • Are two objects observationally equivalent?
    • Are two objects pointing to the same place in memory?

Review: Designing with objects

Visibility rules

Private messages to self (or instance of self)

Secret protocol to other objects of the same class (which can be spoofed)

Slide 20 

How to approach object-orientation

What to remember:

  • Dynamic dispatch tells your own form of data
  • Avoid knowing any argument’s form!
    Use semi-private methods, or (last resort) double dispatch

What not to do:

  • Don’t guess what code will answer a message
  • Don’t try to trace follow procedures step by step
    (crosses too many class/method boundaries)

What to do instead:

  • Trust the contract
  • Keep each method dirt simple

23 November 2020: Hiding information with abstract data types

There are PDF slides for 11/24/2020.

Retrospective: Object-oriented programming

Write one thing you learned about object-oriented programming and one question you still have. Your answers may be anonymously shared with your classmates.

Reflection: Introduction to Module Systems

Describe two issues that arise when working on large programs that to not arise when working on small ones.

Pick a particular language feature or programming idiom and describe why that feature or idiom makes programming-in-the-large either easier or harder.

Data abstraction

How to write large programs?

An area of agreement and a great divide:

                 INFORMATION HIDING
             (especially: mutable data)
                      /       \
                     /         \
 modular reasoning  /           \  code reuse
                   /             \ 
internal access   /               \  interoperability 
to rep           /                 \  between reps
                /                   \
            MODULES               OBJECTS           
        ABSTRACT TYPES

Introduction to Module Systems

Key idea: Internal details hidden behind clean interface

        Implementation      Interface

      ------------------   -----
      |                |   |   |
      |     Module     |   | I |
      |                |   |   |
      ------------------   -----
                ^            ^
                |            |
   Nitty gritty -            - Public face

Key idea: Hierarchical building blocks

     ------        ------
     | C1 |        | C2 |
     ------        ------
        \            /
         \          /
     --------------------
     |        I3        |
     --------------------
     |        M3        |
     |  ------  ------  |
     |  | I1 |  | I2 |  |
     |  ------  ------  |
     |  |    |  |    |  |
     |  | M1 |  | M2 |  |
     --------------------

Terminology: a module is a client of the interfaces it depends on

Module system overview

Functions of a module system:

Desirable features:

Aside: C and C++ are cheap imitations

Interfaces

Collect declarations

Things typically declared:

Key idea: a declared type can be abstract

Ways of thinking about interfaces

Example interface

signature NATURAL_NUMBER = sig
  type nat
  val intOfNat  : nat -> int
  val natOfInt  : int -> nat
  val natString : nat -> string

  exception Negative
  val addNats : nat * nat -> nat
  val subNats : nat * nat -> nat
  val mulNats : nat * nat -> nat
end

Design choice: How are interfaces constructed?

Interface “projected” from implementation:

  • No separate interface
  • Compiler extracts from implementation
    (CLU, Java class, Haskell)
  • When code changes, must extract again
  • Easy to construct but few cognitive benefits

Interface written independently:

  • Distinct file, separately compiled
    (Caml, Go, Java interface, Modula, Ada)
  • Implementations can change independently
  • Harder to construct but full cognitive benefits

Check your understanding: Interfaces

  1. An interface can be thought of as the type of a module.
  • True
  • False
  1. An interface typically includes the implementation of one or more functions.
  • True
  • False
  1. An interface can declare a type but leave its definition abstract, forcing client code to treat that type abstractly, as if it were a primitive type.
  • True
  • False

Module Implementations

     ------        ------
     | C1 |        | C2 |
     ------        ------
        \            /
         \          /
     --------------------
     |        I3        |
     --------------------
     |        M3        |
     |  ------  ------  |
     |  | I1 |  | I2 |  |
     |  ------  ------  |
     |  |    |  |    |  |
     |  | M1 |  | M2 |  |
     --------------------

Check your understanding: Modules and module systems

  1. A module provides the implementation of one or more interfaces.
  • True
  • False
  1. A module may have private information hidden from client code.
  • True
  • False

Standard ML Modules

The Perl of module languages?

What we’ve been using so far is the core language

Modules are a separate language layered on top.

ML module terminology

Interface is a signature

Implementation is a structure

Generic module is a functor

  • A compile-time function over structures

  • The point: Reuse without violating abstraction

Structures and functors match signature

Analogy: Signatures are the ``types’’ of structures.

Signature basics

Signature says what’s in a structure

Specify types (w/kind), values (w/type), exceptions

Examples: Ordinary types

    type t        // abstract type, kind *
    eqtype t
    type t = ...  // 'manifest' type
    datatype t = ...

Examples: Type constructors work too

    type 'a t     // abstract, kind * => *   
    eqtype 'a t
    type 'a t = ...
    datatype 'a t = ...

Signature example: Ordering


signature ORDERED = sig
  type t
  val lt : t * t -> bool
  val eq : t * t -> bool
end

operation that quickly finds and removes a minimal element.

without knowing the details of the implementation.

Signature example: Priority queue

signature PQUEUE = sig
  type elem           (* <-- ABSTRACT type *)
  val compare_elem : elem * elem -> order

  type pqueue         (* <-- ABSTRACT type *)
  val empty  : pqueue
  val insert : elem * pqueue -> pqueue
  val isEmpty : pqueue -> bool
  exception Empty
  val deletemin : pqueue -> elem * pqueue
end

Check your understanding: The anatomy of a signature

When specifying a value in an ML signature, it is necessary to give a type for that value.

  • True
  • False

An ML signature can specify types, but not type constructors.

  • True
  • False

An ML signature cannot include exceptions.

  • True
  • False

ML modules extended example: Numbers

Signature example: Integers

signature INTEGER = sig
  eqtype int             (* <-- ABSTRACT type *)
  val ~   : int -> int
  val +   : int * int -> int
  val -   : int * int -> int
  val *   : int * int -> int
  val div : int * int -> int
  val mod : int * int -> int
  val >   : int * int -> bool
  val >=  : int * int -> bool
  val <   : int * int -> bool
  val <=  : int * int -> bool
  val compare : int * int -> order
  val toString   : int    -> string
  val fromString : string -> int option
end

Implementations of integers

A selection…

structure Int    :> INTEGER
structure Int31  :> INTEGER  (* optional *)
structure Int32  :> INTEGER  (* optional *)
structure Int64  :> INTEGER  (* optional *)
structure IntInf :> INTEGER  (* optional *)

Natural Numbers

signature NATURAL = sig
   type nat   (* abstract, NOT 'eqtype' *)
   exception Negative
   exception BadDivisor

   val ofInt   : int -> nat 
   val /+/     : nat * nat -> nat
   val /-/     : nat * nat -> nat
   val /*/     : nat * nat -> nat
   val sdiv    : nat * int -> 
                 { quotient : nat, remainder : int }
   val compare : nat * nat -> order
   val decimal : nat -> int list
end

Large Integers

signature BIGNUM = sig
   type bigint
   exception BadDivision

   val ofInt    : int -> bigint
   val negated  : bigint -> bigint
   val <+>      : bigint * bigint -> bigint
   val <->      : bigint * bigint -> bigint
   val <*>      : bigint * bigint -> bigint
   val sdiv : bigint * int -> 
              { quotient : bigint, remainder : int }
   val compare  : bigint * bigint -> order
   val toString : bigint -> string
end

Check your understanding: Number signatures

  1. Consider the INTEGER signature:
signature INTEGER = sig
  eqtype int             (* <-- ABSTRACT type *)
  val ~   : int -> int
  val +   : int * int -> int
  val -   : int * int -> int
  val *   : int * int -> int
  val div : int * int -> int
  val mod : int * int -> int
  val >   : int * int -> bool
  val >=  : int * int -> bool
  val <   : int * int -> bool
  val <=  : int * int -> bool
  val compare : int * int -> order
  val toString   : int    -> string
  val fromString : string -> int option
end

In this signature, the representation of int is kept abstract except for the fact that values of type int may be compared for equality using the built-in equality function.

  • True
  • False
  1. Consider the NATURAL signature:
signature NATURAL = sig
   type nat   (* abstract, NOT 'eqtype' *)
   exception Negative
   exception BadDivisor

   val ofInt   : int -> nat 
   val /+/     : nat * nat -> nat
   val /-/     : nat * nat -> nat
   val /*/     : nat * nat -> nat
   val sdiv    : nat * int -> 
                 { quotient : nat, remainder : int }
   val compare : nat * nat -> order
   val decimal : nat -> int list
end

Given this signature, if you do not already have a value of type nat, the only way to get a value one is to use the ofInt function.

  • True
  • False

Signatures collect

signature QUEUE = sig
  type 'a queue    (* another abstract type *)
  exception Empty

  val empty : 'a queue
  val put : 'a * 'a queue -> 'a queue
  val get : 'a queue -> 'a * 'a queue   (* raises Empty *)

  (* LAWS:  get(put(a, empty))     ==  (a, empty)
            ...
   *)
end

Structures collect

structure Queue :> QUEUE = struct   (* opaque seal *)
  type 'a queue = 'a list
  exception Empty

  val empty = []
  fun put (x,q) = q @ [x]
  fun get [] = raise Empty
    | get (x :: xs) = (x, xs)


  (* LAWS:  get(put(a, empty))     ==  (a, empty)
            ...
   *)
end

Dot notation to access components

fun single x = Queue.put (Queue.empty, x)
val _ = single : 'a -> 'a Queue.queue

Coding Interlude: Signature for a stack

Given the following SML structure, write a corresponding SML signature that hides the implementation as a list but allows polymorphic stacks to be created and elements to be pushed and popped.

Structure for a stack

structure Stack = struct
   type 'a stack = 'a list
   exception Empty
   val empty = []
   val push  = op ::
   fun pop []            = raise Empty
     | pop (top :: rest) = (top, rest)
end

Reflection: When does a structure match a signature?

If we can write structures and signatures independently, we must have some way to determine whether a given structure implements given signature.

Give examples of three different circumstances that would mean a given structure does not implement an interface. One such example would be that the interface requires the presence of a value MAX_INT but the structure fails to define such a value.

What interface with what implementation?

Maybe mixed together, extracted by compiler!

  • CLU, Haskell

Maybe matched by name:

  • Modula-2, Modula-3, Ada

Best: any interface with any implementation:

  • Java, Standard ML

But: not “any”—only some matches are OK

Signature Matching

Structure declaration is well-formed

 structure Queue :> QUEUE = QueueImpl

if principal signature of QueueImpl matches ascribed signature QUEUE

  • Every type in QUEUE is in QueueImpl

  • Every exception in QUEUE is in QueueImpl

  • Every value in QUEUE is in QueueImpl
    (type could be more polymorphic)

  • Every substructure matches, too (none here)

Matching items must be compatible.

Signature Ascription

Ascription attaches signature to structure

  • Transparent Ascription: types are revealed

    structure strid : sig_exp = struct_exp

    This method leaks information (legacy)
    (But it’s awfully convenient)

  • Opaque Ascription: types are hidden (“sealing”)

    structure strid :> sig_exp = struct_exp

    This method respects abstraction
    (And when you need to expose, can be tiresome)

Slogan: “use the beak”

Transparent Ascription

Not recommended!

Example:

  structure IntLT : ORDERED = struct
     type t = int
     val le = (op <)
     val eq = (op =)
  end

Exposed: IntLT.t = int

  • Violates abstraction

Opaque Ascription

Recommended

Example:

 structure Queue :> QUEUE = struct
   type 'a queue = 'a list
   exception Empty

   val empty = []
   fun put (x, q) = q @ [x]
   fun get [] = raise Empty
    | get (x :: xs) = (x, xs)
 end

Not exposed: type ’a Queue.queue = ’a list

  • Respects abstraction

Check your understanding: Signature ascription

  1. The principal signature of a structure is the signature that contains the maximal interface information for that structure.
  • True
  • False
  1. With opaque signature ascription (Queue :> QUEUE), type information in the structure is hidden. Only information described in the ascribing signature is available to client code.
  • True
  • False

Abstract data types

Opaque ascription and access to internals

Outside module, no access to representation

  • Protects invariants
  • Allows software to evolve
  • Enforced by type system

Inside module, complete access to representation

  • Every function sees rep of every argument
  • Key distinction modules vs object

Abstract data types and number hierarchy

Natural numbers

signature NATURAL = sig
   type nat   (* abstract, NOT 'eqtype' *)
   ...
   val /+/     : nat * nat -> nat
   val /-/     : nat * nat -> nat
   val /*/     : nat * nat -> nat
end
  • Funs/+/,/-/,/*/ see both representations

  • Makes arithmetic relatively easy

  • But type nat works only with type nat
    (no “mixed” arithmetic, unlike in OOP)

Check your understanding: Module access to representations

Consider the signature NATURAL:

signature NATURAL = sig
   type nat   (* abstract, NOT 'eqtype' *)
   ...
   val /+/     : nat * nat -> nat
   val /-/     : nat * nat -> nat
   val /*/     : nat * nat -> nat
end

Now answer the following question.

  1. In any structure implementing this signature, the code for the /+/ operation will be able to access the representation of both arguments.

30 November 2020: Functors and Extended Case Study

There are PDF slides for 12/1/2020.

Retrospective: Structures and Signatures

Write one thing you learned about structures and signatures so far and one question you have. Your answers may be anonymously shared with your classmates.

Reflection: Parameterizing a structure by another structure

We’ve seen a couple of different kinds of functions. A lambda expression takes an expression-level value as an argument, while a type-lambda expression takes a type as an argument.
Suppose we introduced a new kind of function that took a structure as an argument and returned a structure as a result.

  1. How do you think we might we “type” the argument and the result?

  2. What do you think such functions might be used for?

Functors (aka generic modules aka parameterized structures)

A is a generic module

A new form of parametric polymorphism:

  • lambda and type-lambda in one mechanism

  • Introduction form is functor (definition form)

“Generics” found across language landscape
(wherever large systems are built)

Introducing functors

A functor abstracts over a structure

Formal parameters are declarations

Example:

functor MkSingle(structure Q:QUEUE) = 
   struct
     structure Queue = Q
     fun single x = Q.put (Q.empty, x)
   end

Functors combine familiar ideas:

  • Higher-order functions (value parameter Q.put)
  • type-lambda (type parameter Q.queue)

Using Functors

Actual parameters are definitions

structure QueueS  = MkSingle(structure Q = Queue)
structure EQueueS = MkSingle(structure Q = EQueue)

(EQueue might be a more efficient implementation)

functor MkSingle(structure Q:QUEUE) = 
   struct
     structure Queue = Q
     fun single x = Q.put (Q.empty, x)
   end

Functor applications are evaluated at compile time.

Check your understanding: Functors

  1. A functor is a function over structures that combines elements of higher-order functions and type-lambda.
  • True
  • False
  1. Functor applications are evaluated at runtime.
  • True
  • False

Slide 4 

Check your understanding: where type

Consider the following code:

signature ORDER = sig
  type t
  val compare : t * t -> order
end

signature MAP = sig
  type key
  type 'a table
  val insert : key -> 'a -> 'a table -> 'a table
  ...
end

functor RBTree(structure Ord : ORDER) 
                  :> MAP where type key = Ord.t = 
   struct  ...   end

Now answer the following questions.

  1. The ORDER signature defines an abstract type t.
  • True
  • False
  1. If applying the RBTree functor to some concrete structure Foo passes the type checker, then Foo must not define a type t.
  • True
  • False
  1. The MAP signature defines an abstract type key.
  • True
  • False
  1. The structure that results from applying the functor RBTree to structure Foo will satisfy the MAP signature refined by the where type clause so that key type is defined to be equal to Foo’s definition of type t.
  • True
  • False

Slide 5 

Extended example: Error-tracking Interpreter

Reusable Abstractions: Extended Example

Error-tracking interpreter for a toy language

Why this example?

Error modules: Three common implementations

Classic ``accumulator’’ for errors

signature ERROR = sig
  type error   (* a single error *)
  type summary (* summary of what errors occurred *)

  val nothing : summary                  (* no errors *)
  val <+> : summary * summary -> summary (* combine *)

  val oneError : error -> summary

  (* laws:   nothing <+> s == s
             s <+> nothing == s
             s1 <+> (s2 <+> s3) == (s1 <+> s2) <+> s3    
                                        // associativity
   *)
end

First Error: Implementation

structure FirstError :> 
    ERROR where type error = string
            and type summary = string option =
  struct
    type error   = string
    type summary = string option

    val nothing = NONE
    fun <+> (NONE,   s) = s
      | <+> (SOME e, _) = SOME e

    val oneError = SOME
  end

All Errors: Implementation

structure AllErrors :>
    ERROR where type error   = string
            and type summary = string list =
  struct
    type error   = string
    type summary = error list

    val nothing = []
    val <+> = op @
    fun oneError e = [e]
  end

Check your understanding: Error signature and implementations

Consider the ERROR signature:

signature ERROR = sig
  type error   (* a single error *)
  type summary (* summary of what errors occurred *)

  val nothing : summary                  (* no errors *)
  val <+> : summary * summary -> summary (* combine *)

  val oneError : error -> summary

  (* laws:   nothing <+> s == s
             s <+> nothing == s
             s1 <+> (s2 <+> s3) == (s1 <+> s2) <+> s3    
                                        // associativity
   *)
end

Now answer the following questions.

  1. There is only one sensible way to implement this signature.
  • True
  • False
  1. In the signature, the type error represents one detected error.
  • True
  • False
  1. In the signature, the combinator <+> concatenates a new error onto a summary of the errors detected so far.
  • True
  • False

Computations Abstraction

Ambitious! (will make your head hurt a bit)

Computations either:

Errors must be threaded through everything :-(

Example: Simple arithmetic interpreter

(* Given: *)
datatype 'a comp = OK of 'a | ERR of AllErrors.summary 
datatype exp = LIT  of int
             | PLUS of exp * exp
             | DIV  of exp * exp
(* Write an evaluation function that tracks errors. *)
val eval : exp -> int comp = ...

Reflection: Error tracking interpreter

Given the AllErrors implementation of the ERROR signature:

structure AllErrors :>
    ERROR where type error   = string
            and type summary = string list =
  struct
    type error   = string
    type summary = error list

    val nothing = []
    val <+> = op @
    fun oneError e = [e]
  end

and the following datatype declarations:

datatype 'a comp = OK of 'a | ERR of AllErrors.summary 

datatype exp = LIT  of int
             | PLUS of exp * exp
             | DIV  of exp * exp
  1. Fill in evaluation code for PLUS case using the AllErrors implementation of ERROR signature:
fun eval (LIT n) = OK n
  | eval (PLUS (e1, e2)) = ...
  1. Fill in evaluation code for DIV case using the AllErrors implementation of ERROR signature:
fun eval (LIT n) = OK n
  | ...
  | eval (DIV  (e1, e2)) - ...

Interpreter: LIT and PLUS cases

fun eval (LIT n) = OK n
  | eval (PLUS (e1, e2)) = 
     (case eval e1
        of OK v1 => 
          (case eval e2
             of OK  v2 => OK (v1 + v2)
              | ERR s2 => ERR s2)
       | ERR s1 =>
          (case eval e2
             of OK  _  => ERR s1
              | ERR s2 => ERR (AllErrors.<+> (s1, s2))))

Interpreter: DIV case

  | eval (DIV (e1, e2)) = 
     (case eval e1
        of OK v1 =>
          (case eval e2
             of OK   0 => ERR (AllErrors.oneError "Div 0")
              | OK  v2 => OK  (v1 div v2)
              | ERR s2 => ERR s2)
       | ERR s1 =>
           (case eval e2
              of OK  v2 => ERR s1
               | ERR s2 => ERR (AllErrors.<+> (s1, s2)))

That’s really painful!

We can extend the computation abstraction with sequencing operations to help.

Combining computations

signature COMPUTATION = sig
  type 'a comp    (* Computation! When run, results in
                     value of type 'a or error summary. *)
  (* A computation without errors always succeeds. *)
  val succeeds : 'a -> 'a comp   
  (* Apply a pure function to a computation. *)
  val <$> : ('a -> 'b) * 'a comp -> 'b comp
  (* Application inside computations. *)
  val <*> : ('a -> 'b) comp * 'a comp -> 'b comp
  (* Computation followed by continuation. *)
  val >>= : 'a comp * ('a -> 'b comp) -> 'b comp
end

Example:

eval e1 + eval e2

(op +) (eval e1, eval e2)

curry (op +) (eval e1) (eval e2)

curry (op +) <$> eval e1 <*> eval e2

The first three versions are not well typed. Why?

The last version will thread errors through the compuation behind the scenes.

Note:

eval e1, eval e2 : int comp

curry (op +) : int -> (int -> int)

<$> : (int -> (int -> int)) * (int comp) -> (int -> int) comp

<*> : (int -> int) comp * int comp -> int comp

curry (op +) <$> eval e1 : (int -> int) comp

let pa = curry (op +) <$> eval e1

note by above,  pa : (int -> int) comp 

pa <*> eval e2  : int comp

Buckets of algebraic laws

succeeds a >>= k  == k a                  // identity
comp >>= succeeds == comp                 // identity
comp >>= (fn x => k x >>= h) == (comp >>= k) >>= h  
                                          // associativity
succeeds f <*> succeeds x == succeeds (f x)  // success
...

Environments can use computation abstraction, too!

signature COMPENV = sig
  type 'a env   (* environment mapping strings
                   to values of type 'a *)
  type 'a comp  (* computation of 'a or
                   an error summary *)

  val lookup : string * 'a env -> 'a comp
end

Check your understanding: Computation combinators

Consider the signature of the computation abstraction:

signature COMPUTATION = sig
  type 'a comp    (* Computation! When run, results in
                     value of type 'a or error summary. *)

  (* A computation without errors always succeeds. *)
  val succeeds : 'a -> 'a comp   

  (* Apply a pure function to a computation. *)
  val <$> : ('a -> 'b) * 'a comp -> 'b comp

  (* Application inside computations. *)
  val <*> : ('a -> 'b) comp * 'a comp -> 'b comp

  (* Computation followed by continuation. *)
  val >>= : 'a comp * ('a -> 'b comp) -> 'b comp
end

Now answer the following questions:

  1. The type 'a comp represents a computation that either produces a value of type 'a or results in an error.
  • True
  • False
  1. A value of type 'a comp created by calling the succeeds combinator will always return a value of type 'a.
  • True
  • False
  1. The <*> combinator applies a pure function f with type 'a -> 'b to a value with type 'a comp.
  • True
  • False

Error-tracking interpreter

functor InterpFn(
  structure Error : ERROR
  structure Comp  : COMPUTATION
  structure Env   : COMPENV
  val zerodivide  : Error.error
  val error       : Error.error -> 'a Comp.comp
  sharing type Comp.comp = Env.comp) =
struct
  val (<*>, <$>, >>=) = (Comp.<*>, Comp.<$>, Comp.>>=)
  
  (* Definition of Interpreter *)

end

Error-tracking intepreter, continued

datatype exp = LIT of int
             | VAR of string
             | PLUS of exp * exp
             | DIV  of exp * exp
fun eval (e, rho) =
 let fun ev(LIT n) = Comp.succeeds n
       | ev(VAR x) = Env.lookup (x, rho)
       | ev(PLUS (e1, e2)) = curry op + <$> ev e1 <*> ev e2
       | ev(DIV (e1, e2))  = ev e1 >>= (fn n1 =>
                             ev e2 >>= (fn n2 =>
                             case n2
                               of 0 => error zerodivide
                                | _ => Comp.succeeds
                                              (n1 div n2)))
 in  ev e
 end

Computation abstraction is a ``monad’’

Supported by special syntax in Haskell:

eval :: Exp -> Hopefully Int
eval (LIT v)       = return v
eval (PLUS  e1 e2) = 
  do { v1 <- eval e1
     ; v2 <- eval e2
     ; return (v1 + v2) }
eval (DIV  e1 e2) =
  do { v1 <- eval e1
     ; v2 <- eval e2
     ; if v2 == 0 then Error "div 0"
       else return (v1 `div` v2) }

Check your understanding: Using the computation abstraction

Consider the following code for implementing an expression interpreter using the computation abstraction:

functor InterpFn(
  structure Error : ERROR
  structure Comp  : COMPUTATION
  structure Env   : COMPENV
  val zerodivide  : Error.error
  val error       : Error.error -> 'a Comp.comp
  sharing type Comp.comp = Env.comp) =
struct
  val (<*>, <$>, >>=) = (Comp.<*>, Comp.<$>, Comp.>>=)
  

datatype exp = LIT of int
             | VAR of string
             | PLUS of exp * exp
             | DIV  of exp * exp

fun eval (e, rho) =
 let fun ev(LIT n) = Comp.succeeds n
       | ev(VAR x) = Env.lookup (x, rho)
       | ev(PLUS (e1, e2)) = curry op + <$> ev e1 <*> ev e2
       | ev(DIV (e1, e2))  = ev e1 >>= (fn n1 =>
                             ev e2 >>= (fn n2 =>
                             case n2
                               of 0 => error zerodivide
                                | _ => Comp.succeeds
                                              (n1 div n2)))
 in  ev e
 end

end

Now answer the following questions:

  1. The InterpFn functor takes as a parameter a value declaration that represents a divide by zero error. This value is used in the evaluation of the DIV operation.
  • True
  • False
  1. The evaluation of the VAR case can never fail and so doesn’t need to be part of the computation abstraction.
  • True
  • False
  1. The evaluation of the PLUS operation uses the <$> and <*> combinators to thread errors through the computation.
  • True
  • False
  1. In the evaluation of DIV, n1 is bound to the result of evaluating e1 only in the case where e1 evaluates without error.
    If e1 evaluates to an error, the rest of the computation is short-circuited and the error is returned directly.
  • True
  • False

ML Module summary

ML module summary

New syntactic category: declaration

  • Of type, value, exception, or module

Signature groups declarations: interface

Structure groups definitions: implementation

Functor (generic module) enables reuse:

  • Formal parameter: declarations
  • Actual parameter: definitions

Opaque ascription hides information

  • Enforces abstraction

2 December 2020: Drive-by tour of lambda calculus

There are PDF slides for 12/3/2020.

Retrospective: SML’s Module System

Write one thing you learned about ML’s module system so far and one question you have. Your answers may be anonymously shared with your classmates.

Reflection: Models of computation

Think back to our formal models of impcore and μscheme. The simpler a formal model is, while still capturing the essence of what is being modelled, the better, because it allows us to focus on what matters most.

In programming languages, we generally want the langauge to be Turing complete. Practically speaking, that means it has to be able to express conditionals, natural numbers, and general recursion.

Describe how you would go about designing a minimal language that supported these features.

Why study lambda calculus?

The world’s simplest reasonable programming language

Syntax: Only three forms!

M ::= x | \x.M | M M'

Everything is programming with functions

First example:

(\x.\y.x) M N --> (\y.M) N --> M

Crucial:

Programming in Lambda Calculus

Absolute minimum of code forms:

Assembly language of programming languages

Systematic approach to constructed data:

Alert to the reading: Wikipedia is reasonably good on this topic

Coding Booleans and if expressions

Q: Who remembers the boolean-formula solver?

Q: What classes of results could it produce?

Q: How were the results delivered?

Q: How shall we do Booleans?

A Boolean takes two continuations:

Laws:

true  s f = s
false s f = f

Code:

true  = \x.\y.x
false = \x.\y.y

Coding the if expression, laws:

if true  then T else E = T
if false then T else E = E

Therefore, code is:

if B then T else E = 
if-then-else B T E = \B.\T.\E.B T E

Code:

true  = \x.\y.x
false = \x.\y.y
if    = \b.\t.\e.b t e

Coding Interlude: Coding not in lambda calculus

Given that we can code booleans in lambda calculus as

true  = \x.\y.x
false = \x.\y.y
if    = \b.\t.\e.b t e

your job is to write the not function.

  • As a first step, write the algebraic laws for not. What are the forms of the input data? For each form, what should the output be?

  • Now write the code for not

You can try out your code in the lamstep interpreter on the homework cluster.

Coding data structures

Coding pairs

Coding lists

Check your understanding: Coding data structures in lambda calculus

  1. The encoding of a pair constructor in lambda calculus is as a function that takes the two elements x and y of the pair and a continuation k that says what to do with those elements.
  • True
  • False
  1. The function \p.p (\x.y.x) encodes the snd projection function on pairs in the lambda calculus.
  • True
  • False
  1. Lists are coded in lambda calculus as functions that take two continuations: one for what to do if the list is nil and one for what to do if the list is a cons cell. The nil continuation takes no arguments, while the cons case takes one argument: what to do for the head of the list.
  • True
  • False
  1. The function \xs.xs true (\y.\ys.false) encodes the null? function on lists in the lambda calculus.
  • True
  • False

Coding numbers: Church Numerals

Church numerals

Key idea: Encode a number as the number of times a function f is applied.

The number n: Apply f to x, n times

 \f.\x.fold f x n

Slide 2 

Church Numerals in lambda calculus

zero  = \f.\x.x;
succ  = \n.\f.\x.f (n f x);
plus  = \n.\m.n succ m;
times = \n.\m.n (plus m) zero;
 ...
-> four;
\f.\x.f (f (f (f x)))
-> three;
\f.\x.f (f (f x))
-> times four three;
\f.\x.f (f (f (f (f (f (f (f (f (f (f (f x)))))))))))

Check your understanding: Church numerals

  1. The Church numeral encoding of the number 3 is \f.\x.f(f(f(f x)))
  • True
  • False
  1. The function \n.\f.\x.f(n f x) is the successor function on Church numerals
  • True
  • False

Taking stock:

Q: What’s missing from this picture?

A: Recursive functions.

Astonishing fact: We don’t need letrec or val-rec.

Reflection: Encoding recursion

What value of
f
solves this equation:


f = λn.if n = 0 then 1 else n × f(n − 1)?

Coding recursion

Slide 4 

Factorial in lambda calculus

Wish for:

fact = \n.(zero? n) 1 (times n (fact (pred n)));

But: on right-hand side, fact is not defined.

Successive approximations

Function bot always goes into an infinite loop.

Consider how these functions relate to factorial:

fact0 = \n.(zero? n) 1 (times n (bot   (pred n)));


fact1 = \n.(zero? n) 1 (times n (fact0 (pred n)));


fact2 = \n.(zero? n) 1 (times n (fact1 (pred n)));

General pattern:

fact-i = \n.(zero? n) 1 (times n (fact-i-1   (pred n)));

cessive approximations (construct

fact-i = \n.(zero? n) 1 (times n (fact-i-1   (pred n)));


g = \f.  \n.(zero? n) 1 (times n (f          (pred n)));


fact0 = g bot;


fact1 = g fact0;      // = g (g bot)


fact2 = g fact1;      // = g (g (g bot))


fact3 = g fact2;      // = g (g (g (g bot)))


   ...

Fixed point

Suppose f = g f.

f is a fixed point of g.

Claim: f n is n factorial!

Proof: by induction on n.

Left as an exercise to the viewer :-)

Fixed-point combinator

What if we had a combinator fix such that

fix g = g (fix g)

Term fix g is a fixed point of g, and so by previous slide,

fix g n is n factorial!

fix g = g (fix g)
      = g (g (fix g))
      = g (g (g (fix g)))
      = ...

Expand as much as you need to.

Slide 10 

Check your understanding: Encoding recursion

Consider the recursively-defined length function:

length = \xs.null? xs 0 (+ 1 (length (cdr xs)))

If we were to take the fixpoint of the following functions, which of them would produce the length function?

  • lg1 = \lf.\xs.null? xs 1 (+ 1 (lf (cdr xs)))
  • lg2 = \lf.\xs.null? xs 0 (+ 0 (lf (cdr xs)))
  • lg3 = \lf.\xs.null? xs 0 (+ 1 (lf (cdr xs)))
  • lg4 = \lf.\xs.null? xs 0 (+ 0 (lf (car xs)))

Using fixed points

Enrichment Reflection.

For extra credit, for each of the following recursive equations, say whether there is a solution, (ie, a recursive function that makes the equation true). If there is a solution, say whether it is unique and what it is.

f1 = \n.\m.(eq? n m) n 
              (plus n (f1 (succ n) m));

f2 = \n.f2 (isZero? n 100 (pred n));

f3 = \xs.xs nil (\z.\zs.cons 0 (f3 zs));

f4 = \xs.\ys.f4 ys xs;

Summary

7 December 2020: Producing Software with Integrity: Motivations, Capabilities, and Impediments

There were no PDF slides on 12/8/2020.

Retrospective: Lambda calculus

Write one thing you learned about lambda calculus so far and one question you have. Your answers may be anonymously shared with your classmates.

Reflection: Security-critical software

Despite high-profile data breaches stemming from flaws in software and frequent demonstrations of the insecurity of security and safety critical systems at Black Hat, high-quality security-critical software seems to remain elusive. Why do you think that might be?

Misused buffers are a major security problem

Check your understanding: Software bugs lead to exploits

  1. Given all the vulnerabilities listed in MITRE’s Common Vulnerability and Expostures Database, issues related to misusing buffers are a relatively minor problem.
  • True
  • False
  1. Buffer overflow/underflow attacks such as Heartbleed and the issue reported in MS security bulletin MS15-078 can result in exfiltrating information from a system without leaving any trace in log files and/or allow a remote attacker to take over control of a computer.
  • True
  • False

Consequences of poor cybersecurity

Consequences of poor cybersecurity: Ransomware

Consequences of poor cybersecurity: Physical damage

Consequences of poor cybersecurity: Changing the nature of warfare

Check your understanding: Consequences of poor cybersecurity

  1. Ransomware is a way hackers can monetize malware. It typical works when hackers break into a system and encypt files. They then demand payment to decrypt the files. Whether or not victims pay the ransom, recovery costs can be in the millions of US dollars.
  • True
  • False
  1. When breaking into automobiles and other cyber-physical systems, hackers have to be physically close to the system they are attacking.
  • True
  • False
  1. Although disguised as anonymous ransomware, subsequent analysis has revealed that NotPetya constituted a purely destructive attack launched by one sovereign nation (Russia) against another (Ukraine). It caused billions of U.S. dollars of damage, both within the Ukraine and worldwide.
  • True
  • False
  1. There is no known evidence of successful attacks on critical infrastructure such as power distribution systems.
  • True
  • False

Reflection: Techniques to build better software

What approaches do you think might be effective for building higher quality software?

We know how to build higher-quality software

Check your understanding: Proof of concept for building higher-quality software

  1. No C code was involved in the high-assurance code for the SMACCMcopter or Boeing’s Unmanned Little Bird.
  • True
  • False
  1. At the end of DARPA’s HACMS program, the red team was unable to disrupt the flight operations or communication channels of Boeing’s Unmanned Little Bird despite being given root access to one of the partitions on the helicopter.
  • True
  • False

Reflection: Impediments to applying known technology

Assuming we do in fact know how to build higher-quality software, why do you think we might be choosing not to use such capabilities?

The economics of cybersecurity: Misaligned incentives and market failures

The economics of cybersecurity: Misaligned incentives

Check your understanding: Misaligned incentives

  1. In retail banking in the 1990s, the UK had lower fraud rates than the US because consumers were held responsible.
  • True
  • False
  1. Critical infrastructure operators are making uninformed decisions when they connect their cyber-physical systems to the internet given the security risks that such linkages cause.
  • True
  • False

Market Failure: Information asymmetries

Check your understanding: Information asymmetries

  1. Information asymmetry is a market failure that plagues the software security market: customers can’t evaluate the security of software and so aren’t willing to pay a premium for software with better security. As a result, vendors are reluctant to invest in better security because they won’t be able to recoup the cost in a higher price for their product.
  • True
  • False
  1. Another example of information assymetry in the software security industry is that lack of information about cybersecurity incidents. Lacking such data means we cannot properly assess the size of the problem, making it difficult to gauge whether we are properly allocating defensive resources.
  • True
  • False

Where do we go from here? Possible policy interventions

Possible policy interventions: Information disclosure

Possible policy interventions: Cyber insurance

Possible policy interventions: Software product liability

Check your understanding: Possible policy interventions

  1. Mandatory disclosure of data breaches has increased the attention paid to data breaches, making cyberinsurance for data breaches the most mature segment of the cyberinsurance market.
  • True
  • False
  1. Only a handful of insurance companies offer cyber insurance because it is too hard to measure the risk.
  • True
  • False
  1. The software industry has successfully lobbied against software product liability on the grounds that it would stifle innovation.
  • True
  • False

Producing software with integrity

A Prediction

9 December 2020: Comp 105 Conclusion

There are PDF slides for 12/10/2020.

Retrospective: Producing Software with Integrity

Write one thing you learned about producing software with integrity so far and one question you have. Your answers may be anonymously shared with your classmates.

Comp105 in Review

What have you learned?

Principled design process for writing code

  • Algebraic laws guided by the form of data

  • 9-step design process

  • Recursive functions whose structure follows from laws

    • Explain why function terminates
  • Inductive proofs that follow same structure

    • Explain why induction is well-founded

Understanding a language: Six key questions

  1. What is the abstract syntax?

  2. What are the values?

  3. What are the environments?

  4. How does evaluation happen?

  5. What is the initial basis?

  6. What are the types?

Operational semantics

  • Precisely describe meaning of programs

    • What value does a program evaluate to?

    • What side effects are caused in the process?

  • Written using inference rules

  • Judgment forms capture all relevant information

  • Environments track information about variables

  • Stores map locations to values

First-class functions

  • What they are

  • How to use them effectively

  • Lambdas create anonymous functions

  • Closures

    • run-time representation of functions;
    • capture environment at closure-definition time
  • Continuations capture ``the rest of the computation’’

Cost Models and Optimizations

  • How many cons cells?

  • How many activation records?

  • Method of accumulating parameters

  • Tail calls

ML

  • Datatypes

  • Pattern matching

  • Exceptions

Type systems

Type Systems: Big Idea

Static vs. Dynamic Typing

  • Expressiveness (+ Dynamic)

  • Don’t have to worry about types (+ Dynamic)

  • Dependent on input (- Dynamic)

  • Runtime overhead (- Dynamic)

  • Serve as documentation (+ Static)

  • Catch errors at compile time (+ Static)

  • Used in optimization (+ Static)

Type Systems: Big Idea

  • Undecideability forces tradeoff:

    • Dynamic or

    • Approximate or

    • Non-terminating

  • Example: array bounds checking

    • Occasional negative consequences: e.g., Heartbleed

Type Systems: Mechanics

  • Monomorphic and Polymorphic Types

  • Types, Type Constructors, Quantified Types (α.τ)

  • Kinds (κ) classify types:

    • well-formed,

    • types (*),

    • type constructors: κ ⇒ κ

  • Type Environments: type identifiers kinds

  • Typing Rules

    • Introduction and Elimination forms
  • Type Checking

Type inference

Hindley-Milner Type Inference: Big Idea

  • Inferred vs Declared Types

    • Advantages of Inference: write fewer types, infer more general types.

    • Advantages of Declarations: better documentation, more general type system.

  • Canonical example of static analysis:

    • Proving properties of programs based only on text of program.

    • Useful for compilers and security analysis.

Hindley-Milner Type Inference: Mechanics

  • Use fresh type variables for unknown types

  • Generate constraints that collect clues

  • Solve constraints just before introducing quantifiers

  • Compromises to preserve decideability:

    • Only generalize lets and top-level declarations

    • Polymorphic functions aren’t first-class

Object-Oriented Programming

Object-Oriented Programming: Big Ideas

  • “Programming-in-the-medium”

  • Advantages and Disadvantages

    • Enables code reuse

    • Easy to add new kinds of objects

    • Hard to add new operations

    • Algorithms smeared across many classes

    • Hard to know what code is executing

  • Good match for GUI programming

  • Smalltalk mantra: Everything is an Object

    • Can redefine basic operations

Object-Oriented Programming: Mechanics

  • Classes and objects

  • Computation via sending messages

    • special names: self and super
  • Blocks to code anonymous functions & continuations

  • Inheritance for implementation reuse

  • Subtyping (“duck typing”) for client code reuse

  • Subtyping is not Inheritance

  • Idioms for information sharing

    • by convention private methods for like objects
    • Double-dispatch

Module Systems

Module Systems a la SML: Big Ideas

  • “Programming-in-the-large”

  • Separate implementation from interface

  • Enforced modularity

    • Swap implementations w/o breaking clients

Module Systems a la SML: Mechanics

  • Signatures describe interfaces

    • types, values, exceptions, substructures

    • include to extend

  • Structures provide implementations

  • Signature ascription hides structure contents (Heap :> HEAP)

  • Functors

    • Functions over structures

    • Executed at compile time

Lambda Calculus

Lambda Calculus: Big Ideas

  • Three forms:
      e :  :  = x | λx.e | e1e2

  • Church-Turing Thesis:

    • All computable functions expressable in lambda calculus

    • Data types: booleans, pairs, lists, naturals

    • Recursive functions via fixed points

  • Y combinator calculates fixed points:

    • Y = λf.(λx.f(x x))(λx.f(x x))

Producing software with integrity

Producing software with integrity

  • Software flaws have significant costs

    • financial, physical, national-security
  • We know how to build better software

  • We aren’t building better software

    • misaligned incentives, market failures
  • Policy interventions might change that

    • information disclosure, cyber insurance, product liability

Programming experience

Programming Experience

  • Recursion and higher-order functions are now second-nature to you.

    • You’ll miss pattern matching and algebraic data types in any language you use that doesn’t have them!
  • C for impcore (imperative language)

  • Scheme (dynamically typed functional language)

  • ML (statically typed functional language)

  • uSmalltalk (dynamically typed OO language)

Built substantial pieces of code

  • SAT solver using continuations

  • Type checker (ML pattern matching!)

  • Type inference system (using constraints, reading typing rules)

  • BigNums (Power of OO abstractions; resulting challenges)

Where might you go from here?

Where might you go from here?

Haskell

  • At the research frontier: Still evolving.

  • Lazy:

    • Expressions only evaluated when needed.

    • Conflict with side-effects.

    • Solution: Monads (computation abstraction)

  • Type Classes:

    • Ad hoc polymorphism (aka, overloading)

    • ML: Hard-wire certain operations (+, *)

    • Haskell: User programmable.

New languages

  • Swift (Apple)

    • type inference, modules, higher-order functions, pattern matching, memory management
  • Go (Google)

    • advanced concurrency, modules (aka packages) higher-order functions, garbage collection
  • Rust (Mozilla)

    • advanced type system, modules, higher-order functions, support for systems programming

Additional Courses

  • Compilers

  • Software engineering

  • Special Topics:

    • Programming language design

    • Static analysis

    • Virtual machines

Congratulations!

  • You have learned an amazing amount.

  • You have really impressed me.

  • I’m confident you will be able to use your newly developed skills in many contexts.

    • When that happens, tell us about it!