# 9 September 2020: Introduction to Comp 105

There are PDF slides for 9/10/2020.

Handout: 105 Tip Sheet

Handout: 105 Syllabus

Handout: Experiences of successful 105 students

Handout: Lessons in program design

### Introduction

My pronouns are she, her, hers.

Please call me “Kathleen,” or “Professor Fisher”

### Reflection: Are all programming languages created equal?

Under what circumstances is one programming language better than another one? Write 2-3 sentences explaining your thoughts.

## Why we are here

• Write code from scratch
• In a language you’ve never seen
• That sails past code review

Why it’s different from COMP 40:

• Machines haven’t really changed since 1970 (mainframes), 1980 (minicomputers), 1982 (microcomputers)
• Languages are changing all the time

What it involves:

• Programming practice (emphasize widely used features: functions, types, modules, objects)
• Mathematical description (how things work, see patterns)
(Example: see patterns of recursion in homework)

Today:

• Then a little about experience and expectations

## What is a programming language?

What is a PL?

If you’ve taken 170, a caution: every programming language is a formal language, but not every formal language is a programming language.

My answer: a language with a compositional structure that has an element of run time to it.

Key idea: phase distinction

• Something you write down, which just sits there

• Something that runs

We will talk about “compile time” and “run time” (and later, type-checking time).

Key ideas: static and dynamic

The thing that you write down is syntax

What flies around at run time is values

(You know about syntax and values already—but this semester we’re going to think about them very systematically.

First three classes: new ways of thinking about stuff you already know

### Check your understanding: The big picture

1. The goal of this course is to teach you principles of programming languages so you’ll be able to teach yourself new languages quickly and improve your coding in many different languages.
• True
• False
1. Almost all languages are equally powerful, and so are equally well suited to solving any programming task.
• True
• False

### Check your understanding: Syntax vs. semantics

Indicate whether each statement is more about the syntax or the semantics of a program:

1. A valid C program always has a matching } for every {.
• syntax
• semantics
1. A valid C program never dereferences an uninitialized pointer.
• syntax
• semantics
1. In a valid C program, the right-hand side of an = operation must be an expression.
• syntax
• semantics
1. In our house style, every if statement has a corresponding else.
• syntax
• semantics
1. When a ternary conditional expression x ? y : z is evaluated, it acts a lot like an if
• syntax
• semantics

### Reflection: Inductive definitions for numerals

Numerals are an example of syntax: the rules for how things are written. At run time, a numeral stands for a natural number; the natural number is its semantics. Numerals are things like 0, 2020, 73, and 1852.
They are not things like 311t3, 2+2, or -12.

1. Write an inductive definition of numerals (think base case and inductive case).

2. There’s more than one! When you finish, look for another.

### Value structure for numerals: Induction

#### Example: Numerals

Numerals are syntax; Natural numbers are semantics

Numerals:

2018
73
1852

Not numerals:

3l1t3
2+2
-12

Approach 1: the cons approach

• if d is a digit then d is a numeral
• if d is a digit and n is a numeral then dn is a numeral

Approach 2: the snoc approach

• if d is a digit then d is a numeral
• if d is a digit and n is a numeral then nd is a numeral

Approach 3: the tree approach

• if d is a digit then d is a numeral
• if n1 is a numeral and n2 is a numeral then n1 n2 is a numeral

have written a list or sequence of digits as a definition. That’s good thinking, but it leaves open the problem of defining “list” or “sequence”. These definitions take care of that issue.

### Check your understanding: Representations of numerals

Different representations make different operations easier. Consider the three different ways we discussed to represent numerals, which we called the cons, snoc, and tree approaches.

1. Which representation makes it easiest to access the most significant digit?
• cons
• snoc
• tree
1. Which representation makes it easiest to access the least significant digit?
• cons
• snoc
• tree
1. Which representation makes it equally easy to access either the first or last digit?
• cons
• snoc
• tree

### Reflection: Natural numbers

We have limited operations on numerals: stick a digit before or after a numeral, or stick two numerals together. But on numbers we have many more operations. What are two simple, useful operations that are defined on natural numbers?

### Value structure for natural numbers: induction again

Natural numbers are semantics for numerals.

Infinite set, that we’ll define using induction.

• Base case: a digit is a nat
• Inductive case: if m is a nat and d is a digit then 10*m + d is a nat

### Check your understanding: Representations of numerals

The decimal definition of natural numbers in the video ignores almost all operations on natural numbers. It uses just addition and multiplication. And not just any addition and multiplication, but a highly stylized one: multiply by ten and then add. So out of all the gazillions of ways we might compute a natural number, we are limiting ourselves. According to this definition, how many different ways of forming natural numbers must we consider to be able to account for every natural number?

• one
• two
• ten
• gazillions
• infinite

### Reflection: Defining functions over natural numbers

We want to define a function that examines a natural number (actually a machine integer) to find out if its numeral can written using only the digit 4. How many cases will such a function have to consider? Do any of the cases look like they will require recursion?

## Writing code with recursion and induction

Data-driven design process. Connections to design ideas from 1970s, 1980s, 1990s, plus test-driven development (2000s).

Recursive-function problem

Exercise: all-fours?

Write a function that takes a natural number n and returns true (1) if and only if all the digits in n’s numeral are 4’s.

Key design step: Form of number

Choose inductive structure for natural numbers:

• Which case analysis do we want?

Step 1: Forms of DECNUMERAL proof system (1st lesson in program design):

• Either a single digit d
• Or 10 × m + d, where m ≠ 0

Example inputs

Step 2:

• Single digits: 4, 9

• Multi-digits: 44, 907, 48

Function’s name and contract

Steps 3 and 4:

Function (all-fours? n) returns nonzero if and only if the decimal representation of n can be written using only the digit 4.

Example results

Step 5: write expected results as unit tests:

(check-assert      (all-fours?  4))
(check-assert (not (all-fours?  9)))
(check-assert      (all-fours? 44))
(check-assert (not (all-fours? 48)))
(check-assert (not (all-fours? 907)))

Algebraic laws

Step 6: Generalize example results to arbitrary forms of data

(all-fours? d) ==  (= d 4)

(all-fours? (+ (* 10 m) d)) ==
(= d 4) && (all-fours? m)

Left-hand sides turn into case analysis

Step 7:

; (all-fours? d) == ...
; (all-fours? (+ (* 10 m) d)) ==  ...

(define all-fours? (n)
(if (< n 10)
... case for n = d ...
... case for n = (+ (* 10 m) d),
so m = (/ n 10) and
d = (mod n 10) ...))

Each right-hand side becomes a result

Step 8:

; (all-fours? d) ==  (= d 4)
; (all-fours? (+ (* 10 m) d)) ==
;           (= d 4) && (all-fours? m)

(define all-fours? (n)
(if (< n 10)
(= n 4)
(and (= 4 (mod n 10))
(all-fours? (/ n 10)))))

Revisit tests:

Step 9:

(check-assert      (all-fours?  4))
(check-assert (not (all-fours?  9)))
(check-assert      (all-fours? 44))
(check-assert (not (all-fours? 907)))

(check-assert (not (all-fours? 48)))

Checklist:

• For each form of data, one true and one false
• One extra corner case (partly fours)
• Tests pass

### Check your understanding: Design process

Put the nine steps of the design process for writing functions guided by the structure of the data and algebraic laws in order.

1. Turn right-hand sides of algebraic laws into results
2. Specify example results
3. Choose function’s name
4. Revisit Tests
5. Describe function’s contract
6. List example inputs
7. Choose representation of inputs
8. Turn left-hand sides of algebraic laws into case analysis
9. Write algebraic laws

### The all-fours? function

;; forms of data
;; Either a single digit d
;; OR 10 * m + d where m not equal 0

;; example inputs
(val x1 4 )
(val x2 9)
(val x3 44)
(val x4 907)
(val x5 48)

;; Pick the name now
;; Write the contract now
;; Write check assert cases now

;; Algebraic Laws
;; (all-fours? d) = (= d 4)
;; (all-fours? (+ (* 10 m) d)) ==
;;     (= d 4) && (all-fours? m)

;; The code!
;; return nonzero if and only if
;; decimal rep of n can be written
;; using only the digit 4
(define all-fours? (n)
(if (< n 10)
(= n 4)
(and (= 4 (mod n 10))
(all-fours? (/ n 10)))))

(check-assert      (all-fours? 4))
(check-assert (not (all-fours? 9)))
(check-assert      (all-fours? 44))
(check-assert (not (all-fours? 48)))
(check-assert (not (all-fours? 907)))


### Coding Interlude: all-fours?

Consider the all-fours? code from the previous video. Run the code on the sample data.

Change the code to make it be the all-nines? function.

### Reflection: Various syntactic forms

Consider the following two functions that show all the syntactic forms in Impcore.

(define even? (n) (= (mod n 2) 0))

(define 3n+1-sequence (n) ; from Collatz
(begin
(while (!= n 1)
(begin
(println n)
(if (even? n)
(set n (/ n 2))
(set n (+ (* 3 n) 1)))))
n))
1. What do you think these functions do?

2. What do you notice about the syntax of these functions (ie, how they are written down)?

3. How might you write these same two functions in C?

## Syntax in programming languages

Idea of LISP syntax

Parenthesized prefix syntax:

• Names and numerals are basic atoms

• Other constructs bracketed with () or [] (Possible keyword after opening bracket)

Examples:

(+ 2 2)
(if (isbound? x rho) (lookup rho x) (error 99))

(For now, we use just the round brackets)

Impcore structure

Two syntactic categories: expressions, definitions

No statements!—expression-oriented

(if e1 e2 e3)
(while e1 e2)
(set x e)
(begin e1 ... en)
(f e1 ... en)

Evaluating e has value, may have side effects

Functions f named (e.g., + - * / = < > print)

The only type of data is “machine integer” (deliberate oversimplification)

Syntactic structure of Impcore

An Impcore program is a sequence of definitions

(define mod (m n) (- m (* n (/ m n))))

Compare

int mod (int m, int n) {
return m - n * (m / n);
}

Impcore variable definition

Example

(val n 99)

Compare

int n = 99;

# 14 September 2020: Abstract syntax and operational semantics

There are PDF slides for 9/15/2020.

### Retrospective: Introduction to the study of programming languages, induction and recursion, and Impcore.

Pedagogical studies show that reflection about what you are learning really helps the learning process, so we will start each module by asking you to briefly reflect on what you learned in the previous module.

Write one thing you learned in the previous module and one question you have about that material.

Your responses may be anonymously shared with the rest of the class.

### Reflection: Exploring composition in programming languages

Composition, which is the ability to build many sensible things using just a few construction principles, is a critical to programming languages. An example at the syntactic level is the ability to build bigger programs out of smaller pieces. To get some practice with this idea, complete the following exercises:

• In C++, write down three expressions and two statements.

• Now compose one of the three expressions with the two statements to make another statement.

• Now compose all three expressions to make another expression.

## Programming-language semantics

“Semantics” means “meaning.”

### What problem are we trying to solve?

Know what’s supposed to happen when you run the code

Ways of knowing:

• People learn from examples
• You can build intuition from words
(Book is full of examples and words)
• To know exactly, unambiguously, you need more precision

You may recall the beginner exercise “make a peanut butter and jelly sandwich”

• You can watch and learn, but a computer can’t
• We must specify exactly what must be done.

### Why bother with precise semantics?

• Prepare to program in languages you’ve never seen before
• “X-ray glasses”
• Compare languages by translating into semantics
• Python and JavaScript are much more similar than they look
• C++ and Java are much different than they look
• Prove properties of programs and programming languages
• Useful for security audits and program optimizations
• Understand what a correct language implementation must do

The programming languages you encounter after 105 will certainly look different from what we study this term. But most of them will actually be the same. Studying semantics helps you identify that.

The payoff: Your new skills will apply

### Behavior decomposes

We want a computational notion of meaning.

What happens when we run (* y 3)?

We must know something about *, y, 3, and function application.

Knowledge is expressed inductively

• Atomic forms: Describe behavior directly (e.g., constants, variables)
• Compound forms: Behavior specified by composing behaviors of parts

Example of domain where compositionaliy doesn’t apply: English spelling and pronunciation

• fish vs ghoti
• Both composed from letters, but no rules of composition guide pronunciation

### Check your understanding: Principle of composition

The following questions are designed to give you practice with composition by asking you to think about how C statements and expressions are composed to form more complex C statements and expressions.

C has a construct of the form ___?___:___. For the result to be syntactically well formed, what kinds of things can go in the blanks?

• statements only
• expressions only
• a mix

Is the result of filling in the blanks in the above C construct

• always a statement?
• always an expression?
• either a statement or an expression, depending on whether the second two blanks were filled with expresions or statements?

C also has an if construct of the form if (___) ___; else ___;.
For the result to be syntactically well formed, what kinds of things can go in the blanks?

• statements only
• expressions only
• a mix

Is the result of filling in the blanks in the C if construct

• always a statement?
• always an expression?
• either a statement or an expression, depending on whether the second two blanks were filled with expresions or statements?

Recall Impcore concrete syntax

Definitions and expressions:

Define behaviors inductively

We’ll focus on expressions

Base cases (plural!): numerals, names

Inductive steps: compound forms

• To determine behavior of a compound form, look at behaviors of its parts

First, simplify the task of specification

What’s different? What’s the same?

 x = 3;               (set x 3)

while (i * i < n)    (while (< (* i i) n)
i = i + 1;            (set i (+ i 1)))

Abstract away gratuitous differences

Abstract syntax

Same inductive structure as concrete syntax

But,

• More uniform representation

• Designed for compiler, not programmer

Concrete syntax: sequence of symbols

Abstract syntax: ???

The abstraction is a tree

The abstract-syntax tree (AST):

Exp = LITERAL (Value)
| VAR     (Name)
| SET     (Name name, Exp exp)
| IFX     (Exp cond, Exp true, Exp false)
| WHILEX  (Exp cond, Exp exp)
| BEGIN   (Explist)
| APPLY   (Name name, Explist actuals)

One kind of “application” for both user-defined and primitive functions (One syntax, two behaviors)

In C, trees are fiddly

typedef struct Exp *Exp;
typedef enum {
LITERAL, VAR, SET, IFX, WHILEX, BEGIN, APPLY
} Expalt;        /* which alternative is it? */

struct Exp {  // 'alt' combines with any field in union
Expalt alt;
union {
Value literal;
Name var;
struct { Name name; Exp exp; } set;
struct { Exp cond; Exp true; Exp false; } ifx;
struct { Exp cond; Exp exp; } whilex;
Explist begin;
struct { Name name; Explist actuals; } apply;
};
};

### ASTs

Question: What do we assign behavior to?

Answer: The Abstract Syntax Tree (AST) of the program.

• An AST is a data structure that represents a program.

• A parser converts program text into an AST.

Question: How can we represent all while loops?

while (i < n && a[i] < x) { i++ }

• Tag code as a while loop
• Identify the condition, which can be any expression
• Identify the body, which can be any expression

As a data structure:

• WHILEX(exp1, exp2), where
• exp1 is the representation of (i < n && a[i] < x), and
• exp2 is the representation of i++

Let’s picture some trees

An expression:

  (f x (* y 3))

Let’s picture some trees

And a definition:

  (define abs (n)
(if (< n 0) (- 0 n) n))

### Check your understanding: AST Representations

Which of the following is the AST representation of the concrete Impcore syntax (if (> x 9) 1 0)?

• WHILEX( APPLY (Name '>', Explist ( VAR 'x', LITERAL LITERAL 1, LITERAL 0))]
• (IFX( APPLY (Name '>', Explist ( VAR 'x', LITERAL 9)), LITERAL 0, LITERAL 1))
• (IFX( APPLY (Name '>', Explist ( VAR 'x', LITERAL 7)), LITERAL 1, LITERAL 0))
• (IFX( APPLY (Name '>', Explist ( VAR 'y', LITERAL 9)), LITERAL 1, LITERAL 0))
• (IFX( APPLY (Name '>', Explist ( VAR 'x', LITERAL 9)), LITERAL 1, LITERAL 0))

## Key question: names

### Reflection: What’s in a name?

We’ve talked about how composition is really important in giving meaning to programs: you can understand a program fragment such as (+ e1 e2) by understanding each of its pieces (+, e1, and e2) and then putting them together.

Some parts of programs can’t be broken down further. An integer, for example, doesn’t have any smaller pieces that are still an integer. But we can understand an integer just by look at the bits that comprise it (aka, its representation).

What about names, such as a variable x? A name doesn’t have any smaller pieces, so we can’t understand it by understanding its pieces. And a name doesn’t contain in itself its meaning the way an integer does.

Explain how you think language implementors might give meaning to a name.

Hint: think about how dictionaries give meanings to words.

### From ASTs to behaviors

Behaviors of ASTs, part I: Atomic forms

Numeral: stands for a value

Name: stands for what?

Environment’’ is formal term

You may also hear:

• Symbol table
• Name space

So-called “Scope rules” determine which environment governs where

Find behavior using environment

Consider the program fragment

  (* y 3)   ;; what does it mean?

Key Idea: Look up the meaning of variables in an environment

Impcore uses three environments

Global variables ξ (“ksee”)

Functions ϕ (“fee”)

Formal parameters ρ (“roe”)

There are no local variables

• Just like awk; if you need temps, use extra formal parameters

Function environment ϕ not shared with variables—just like Perl

### Check your understanding: Operational semantics judgments

1. We find the meaning of variables by looking them up in an environment.

• True
• False
2. If an Impcore name f stands for a function, which environment will it be in?

• ξ (“ksee”)
• ϕ (“fee”)
• ρ (“roe”)
3. If an Impcore name x is defined in environment ξ (meaning x is in the domain of ξ), then it represents
• a global variable
• a function
• a formal parameter

# 16 September 2020: Judgments, Rules, and Proofs

There are PDF slides for 9/17/2020.

Handout: Redacted Impcore rules

Handout: Overview of Induction

### Reflection: Modeling computation in math

Suppose the notation e ⇓ 3 means that expression e evaluates to 3 and the notation n dom ρ means that variable n is in the environment ρ (and hence is defined in ρ).

1. How would you say an expression evaluates to 7?
2. How would you say formally that an expression like (+ n 1) evaluates successfully only if n is defined?
3. How would you say what (set x 0) does, and also (set n 0), and (set m 0), without having to repeat yourself for every variable you want to set to 0?
4. What information would you need to know what (all-fours? 344) evaluates to?
5. Why would you want to write these expressions in math, instead of just writing in natural language?

## Rules and metatheory

### Reflection: Operational semantics judgments

What do you think the rules are for evaluating a literal and for evaluating a variable? Fill in the question marks with your best guess!

• LITERAL(v), ξ, ϕ, ρ⟩ ⇓ ⟨?, ?, ?, ?⟩

• VAR(x), ξ, ϕ, ρ⟩ ⇓ ⟨?, ?, ?, ?⟩

• In the second judgment, how can you tell which environment x belongs to?

### Check your understanding: Semantics of variables

Consider the semantic rules for assignment to variables FormalAssign and GlobalAssign.

If x is in both the global environment ξ and the formal environment ρ, which version takes precedent?

• Global (ξ)
• Formal (ρ)

During the execution of the expression set(x, e), which environments may change? (Check all that apply)

• ξ
• ρ
• ϕ

### Reflection: Mapping semantics to code

Consider again the semantic rules for assignment to variables FormalAssign and GlobalAssign.

How might these rules guide a language implementor?

### Check your understanding: C implementation of environment operations

1. In the implementation of variable lookup, what C operation corresponds to the semantic check that x ∈ dom ρ?
• isvalbound(e->var, formals)
• isvalbound(e->var, globals)
• fetchval(e->var, formals)
1. In the implementation of assignment, what part of the semantic rules corresponds to the C operation of bindval(e->set.name, v, globals)?
• x∈ dom(ρ)
• ρ′{x ↦ v}
• x∈ domξ
• ξ′{x ↦ v}

Note that the implementation doesn’t precisely follow the semantics. Instead of returning a new environment as directed by the semantics, the implementation instead updates the old environment in place using a side-effecting operation. This choice is more efficient than copying the old environment. It’s a metatheorem of Impcore that this optimization is sound. It’s a metatheorem that will no longer hold when we shift to Scheme and start copying environments to form closures.

### Reflection: Using semantics to understand how programs are evaluated

Consider the following code:

      (define and (p q)
(if p q 0))

(define digit? (n)
(and (<= 0 n) (< n 10)))

Suppose we evaluate (digit? 7)

1. In the body of digit?, what expressions are evaluated in what order?
1. The body of the digit? function matches the template (f e1 e2) where f, e1, and e2 are meta-variables that stand for different program fragments. In matching the template to the body of digit?,

• What is f?
• What is e1?
• What is e2?

### Reflection: Good and bad judgments

Just because something is written in math notation doesn’t make it true!

Which of the following judgements correctly describe what happens at runtime? Explain your answers. Pay special attention to the lack of primes on the concluding environments.

1. (+ 2 2), ξ, ϕ, ρ⟩ ⇓ ⟨4, ξ, ϕ, ρ

2. (+ 2 2), ξ, ϕ, ρ⟩ ⇓ ⟨99, ξ, ϕ, ρ

3. (+ 2 2), ξ, ϕ, ρ⟩ ⇓ ⟨4, ∅, ϕ, ρ

4. (while 1 0), ξ, ϕ, ρ⟩ ⇓ ⟨77, ξ, ϕ, ρ

5. (begin (set n (+ n 1)) 17)ξ, ϕ, ρ
⇓ ⟨17, ξ, ϕ, ρ

## Proofs: Putting rules together

Every terminating computation is described by a data structure—we’re going to turn computation into a data structure: a tree. Proofs about computations are hard (see: COMP 170), but proofs about trees are lots easier (see: COMP 61).

How do we know it’s right?

## From rules to proofs

Judgment speaks truth when derivable’’

Special kind of proof: derivation

• It’s a data structure (a derivation tree)
• Leaves are axioms: variables and constants
• Nodes are compound expressions
• Valid derivation formed by instantiating semantic rules
• Spacelike representation of timelike behavior

A form of “syntactic proof”

Recursive evaluator constructs derivation

Root of derivation at the bottom (surprise!)

Build

• Start on the left, go up
• Cross the
• Finish on the right, go down

First let’s see a movie

<– The Tony Hawk’’ algorithm –>

### Check your understanding: Derivation trees

1. In constructing the derivation tree for the program (digit? 7), what is the rule at the bottom of the tree?
• ApplyUser
• Var
• Literal
1. What environment will be used to find the meaning of the name digit??
• ξ
• ϕ
• ρ
1. What rule will be used to find the meaning of the sub-expression 7?
• ApplyUser
• Var
• Literal

# 21 September 2020: Derivations and metatheory

There are PDF slides for 9/22/2020.

### Reflection: Metatheory vs theory

Metatheory allows us to prove properties about all derivations, which means it allows us to prove properties about all programs in a language.

The derivation that demonstrates that (digit? 7) evaluates to true is a proof about that particular program, so it theory but not metatheory. A metatheoretic property about Java is that in a well-typed program, every variable mentioned in the program has previously been defined.

Classify the following properties as theory or metatheory:

1. The evaluation of any Impcore program will never access undefined memory.

• theory
• metatheory
2. The evaluation of (and (<= 0 n) (< n 10)) will access the value associated with the variable n.

• theory
• metatheory
3. For any Impcore program p, every variable mentioned in p will be in the domain of either ξ, ϕ, or ρ.

• theory
• metatheory

Note: a theoretic or metatheoretic property can be either true or false. The distinction is whether the property would apply only to specific programs or to all programs were it to be true. The proof methods used to prove theoretic and metatheoretic properties differ because a theory proof applies only to a specific program while a metatheoretic proof applies to all programs in a language.

## Proofs about all derivations: metatheory

Syntactic proofs empower metatheory

Derivation/proof 𝒟 is a data structure

Got a fact about all derivations?

• It’s a fact about all terminating evaluations
• They are in 1 to 1 correspondance

Prove meta-theoretic properties by structural induction over derivations

• aka “induction on height of derivation tree”

Example: Evaluating an expression doesn’t create or destroy any global variables (the set of defined global variables is invariant)

Metatheorems often help implementors

More example metatheorems:

• OK to mutate environments if you use a stack (Impcore)

• Interactive browser doesn’t leak space (POPL 2012)

• Device driver can’t harm kernel (Microsoft Singularity)

### Reflection: Writing metatheoretic properties

1. Using the judgment form for Impcore evaluation that we are already familiar with, e, ξ, ϕ, ρ⟩ ⇓ ⟨v, ξ′, ϕ′, ρ′⟩, how would you formally write “In Impcore, evaluation doesn’t change the set of defined global variables?”?
1. Using the judgment form for Impcore evaluation that we are already familiar with, e, ξ, ϕ, ρ⟩ ⇓ ⟨v, ξ′, ϕ′, ρ′⟩, how would you formally write “In Impcore, evaluation doesn’t change the values of any global variables”?

Metatheorems are proved by induction

Induction over height of derivation trees 𝒟

These are “math-class proofs” (not derivations)

Proof

• Goes by case analysis of the last rule in the derivation.

• Has one case for each rule

• Base cases don’t have proper sub-derivations.

• Inductive cases assume the induction hypothesis for any proper sub-derivation

Let’s try it!

Cases to try:

• Literal
• GlobalVar
• SetGlobal
• IfTrue
• ApplyUser2

For your homework, “Simplified Impcore” leaves out While and Begin rules.

### Check your understanding: Proving metatheoretic properties

The following questions all relate to the metatheoretic proof that evaluation in Impcore does not change the set of defined global variables, ie, for any Impcore expression e, if e, ξ, ϕ, ρ⟩ ⇓ ⟨v, ξ′, ϕ′, ρ′⟩, then dom(ξ) = dom(ξ′). We prove this property by induction on the derivation of e, ξ, ϕ, ρ⟩ ⇓ ⟨v, ξ′, ϕ′, ρ′⟩.

1. The proof proceeds by case analysis on the last rule used in the derivation e, ξ, ϕ, ρ⟩ ⇓ ⟨v, ξ′, ϕ′, ρ′⟩.

• True
• False
1. One of the rules that must be considered is the IfFalse rule

• True
• False
1. The case for GlobalVar is an inductive case.

• True
• False
1. The case for GlobalAssign is an inductive case.

• True
• False

## Summary and Next Steps

Theory: Operational semantics

• A “high-level assembly language”
• Justify algebraic laws
• Guide to implementation

Practice: For the next month, programming!

# 23 September 2020: Scheme I: Recursive Programming with Lists and S-Expressions

There are PDF slides for 9/24/2020.

### Retrospective: Operational Semantics

Write one thing you learned about operational semantics and one question you still have. Your answers may be anonymously shared with your classmates.

### Reflection: From inference rules to recursive functions

In the previous module, you encountered a whole bunch of inference rules. How do you think those inference rules related to functions? When might they be recursive functions?

## For a new language, five powerful questions

As a lens for understanding, you can ask these questions about any language:

1. What are the values? What do expressions/terms evaluate to?

2. What environments are there? That is, what can names stand for?

3. What is the abstract syntax? What are the syntactic categories, and what are the forms in each category?

4. How are terms evaluated? What are the judgments? What are the evaluation rules?

5. What’s in the initial basis? Primitives and otherwise, what is built in?

(Initial basis for μScheme on page 99)

## Introduction to Scheme

### Why Scheme?

Vehicle to study recursion and composition, two powerful programming techniques:

• Recursive functions in depth

• Two recursive data structures: the list and the S-expression

• More powerful ways of putting functions together (compositionality again, and it leads to reuse)

Today: programming with lists and S-expressions using algebraic laws

Two new kinds of data:

• The function closure: the key to “first-class” functions

• Pointer to automatically managed cons cell (mother of civilization)

### Scheme Values

Most values are S-expressions.

An fully general S-expression is one of

• a symbol 'Halligan 'tufts

• a literal integer 0 77

• a literal Boolean #t #f

• (cons v1 v2), where v1 and v2 are S-expressions

It is very common to work with a list of S-expressions, which is either

• the empty list '()
• (cons v vs), where v is an S-expression and vs is a list of S-expressions

We say “an S-expression followed by a list of S-expressions”

### Lists: A subset of S-Expressions.

Can be defined via a recursion equation or by inference rules:

1. Every Scheme list is an S-expression.
• True
• False
1. Every Scheme S-expression is a List(A) for some A.
• True
• False
1. Select each of the following S-expressions that are also List(A) where A is the set of symbols.
• (cons 'Tufts 'Jumbos)
• (cons 'Tufts '())
• (cons '2020 '())

### Reflection: Manipulating values

1. Values are only useful to the extent that they can be manipulated. What operations do you think would be useful to have on a cons cell?

2. Recall how we defined numbers via different inductive definitions in Impcore. Using similar thinking, what cases will we need to handle for lists?

#### Understanding values through their operations

Constructors:

• Form values
• List constructors: '() and cons
• Any list is therefore constructed with '() or with cons applied to a value and a smaller list.

Observers:

• These ask, “how were you formed, and from what parts?”
• They are null?, pair?, car, and cdr (car and cdr also known as “first” and “rest”, “head” and “tail”, and many other names)

Laws of lists:

(null? '()) == #t
(null? (cons v vs)) == #f
(car (cons v vs)) == v
(cdr (cons v vs)) == vs

### Check your understanding: Deconstructing lists

A list is formed with either '() or cons.

1. What operation distinguishes between these two types of list?
• car
• cdr
• null?
1. Given this structure, how many cases will a function that consumes a list typically have?
• 0
• 1
• 2
• 3

### Why are lists useful?

• Sequences a frequently used abstraction

• Can easily approximate a set

• Can implement finite maps with association lists (aka dictionaries)

• You don’t have to manage memory

These “cheap and cheerful” representations are less efficient than balanced search trees, but are very easy to implement and work with—see the book.

The only thing new here is automatic memory management. Everything else you could do in C. (You can have automatic memory management in C as well.)

### Programming with lists: The member? function

;; forms of data
;; '()
;; (cons n ns) where n is a number and
ns is a list of numbers

;; example inputs
(val x1 '())
(val x2 '(1 2 3 4))
(val x3 '(8))
(val x4 '(10 10))

;; Write check assert cases now

;; Algebraic Laws
;; (member? n '()) = #f
;; (member? n (cons n ns)) = #t
;; (member? n (cons n' ns)) = (member? n ns),
where n' != n

;; The code!
;; return true if and only if
;; list of numbers ms contains
;; number m
(define member? (m ms)
(if (null? ms)
(if (= m (car ms))
(member? m (cdr ms)))))

(check-assert (not (member? 8 '())))
(check-assert (not (member? 8 '(1 2 3))))
(check-assert      (member? 8 '(8)))
(check-assert      (member? 8 '(10 7 8 12)))


### Coding Interlude: member?

Consider the member? code from the previous video. Run the code on the sample data.

1. Add a check-assert case where the element is in the list.

2. Add a check-assert case where the element is not in the list.

### Programming with lists: The sum-16? function

Do two distinct list elements sum to 16?

;; forms of data
;; '()
;; (cons n ns) where n is a number and
ns is a list of numbers

;; example inputs
(val x5 '(6 10))
(val x6 '(3 6 12 7 8 9 10))

;; Write check assert cases now

;; Algebraic Laws
;; (sum-16? '()) = #f
;; (sum-16? (cons n ns)) =
;;      (sum-16? ns)  OR
;;      n and an element in ns sum to 16
;;      i.e., (member?  (- 16 n) ns))

;; The code!
;; returns #t if and only if ms contains
;; two distinct elements that sum to 16,
;; where ms is a list of numbers

(define sum-16? (ms)
(if (null? ms)
(|| (sum-16? (cdr ms))
(member? (- 16 (car ms)) (cdr ms)))))

(check-assert (not (sum-16? x1)))
(check-assert (not (sum-16? x2)))
(check-assert (not (sum-16? x3)))
(check-assert (not (sum-16? x4)))
(check-assert      (sum-16? x5)))
(check-assert      (sum-16? x6)))


### Coding Interlude: sum-16?

Consider the sum-16? code from the previous video. Run the code on the sample data.

Modify the algebraic laws and the code to check if the product of two elements in the list equals 16. Note that * and / are the multiplication and division operations in μScheme.

### Reflection: Quote syntax

Nobody wants to write a constant list of numbers as (cons 0 (cons 1 (cons 2 (cons 3 (cons 4 (cons 5 (cons 6 (cons 7 (cons 8 (cons 9 '())))))))).

What notation would you suggest instead?

Quoted literals

Defining constants can be verbose:

(cons 6 (cons 7 (cons 8 (cons 9 '()))))

Quotation syntax more compact:

(quote (6 7 8 9))     or     '(6 7 8 9)

Works with any constant S-expression:

(cons 'a (cons (cons 'b '()) '()))

replaces

'(a (b))

### Check your understanding: Quoted literals

Write each ordinary S-expression using quote syntax:

1. (cons 1 '()) = BLANK1
2. (cons 1 (cons 2 '())) = BLANK2
3. (cons (cons 1 '()) '()) = BLANK3
4. (cons (cons 'a (cons 'b '())) (cons 'c '())) = BLANK4

### Reflection: New syntactic forms

To program with lists and S-expressions, we absolutely have to have functions like null?, cons, car, and cdr. And the quote syntax is certainly useful. To make it even easier to program with S-expressions, what other new syntax might be useful?

# 28 September 2020: First-class and higher-order functions

There are PDF slides for 9/29/2020.

Big picture:

• Think of S-expressions and lists, not pointers
• Write less code and reuse more code

### Retrospective: Scheme I: Recursive programming with lists

Write one thing you learned in the previous module. Your answers may be anonymously shared with your classmates.

es

### Reflection: Where is there shared structure?

Consider writing code to answer the following:

• Do two list elements add to 16?
• Do two list elements multiply to 16?
• Are two list elements identical (duplicate?)
• Are two students driving to same location?
• Do two TAs combine to cover hours 4:00–6:00?

• How is the code for each question similar?
• How does the code for each question differ?

## Functions are values

Review

Do two elements in a list add to 16?

;; (sum-16? '())         = #f
;; (sum-16? (cons n ns)) =
;;      (sum-16? ns)  OR
;;      ns has an element m with n + m = 16

C implementation: nested for loops

Functional implementation: recursive functions

More two-element search problems

• Two elements add to 16?

• Two elements multiply to 16?

• Two elements identical (duplicate?)

• Two students driving to same location?

• Two TAs combine to cover hours 4:00–6:00?

Let’s not rewrite the same nested loops/recursive functions over and over

Idea: Functional programming, Part One

Divide and conquer:

• Write two-element search algorithm (once!)
• Describe search criteria in a separate function

The key trick:

• Search-criteria function is a parameter

Example:

-> (define product-is-16? (n m) (= (* m n) 16))
product-is-16?
-> (any-two? product-is-16? '(1 2 3 4))
#f
-> (any-two? product-is-16? '(8 2))
#t

### Reflection: Shared structure

Consider again the following questions:

• Do two list elements add to 16?
• Do two list elements multiply to 16?
• Are two list elements identical (duplicate?)
• Are two students driving to same location?
• Do two TAs combine to cover hours 4:00–6:00?

For each question, describe the question-specific search criteria. For example, for the first question, the criterion is

  (define sum-is-16? (n m) (= (+ m n) 16))

Answers in pseudo-code or English are fine.

## Implementing any-two?

Implementing

Algebraic laws for special case:

;; (sum-16? '())         = #f
;; (sum-16? (cons n ns)) = (sum-16? ns)  OR
;;                         ns has an element m
;;                         with m + n = 16

Algebraic laws for general case:

;; (any-two? q? '())         == ...
;; (any-two? q? (cons y ys)) == ...

Implementing

Algebraic laws for special case:

;; (sum-16? '())         = #f
;; (sum-16? (cons n ns)) = (sum-16? ns)  OR
;;                         ns has an element m
;;                         with m + n = 16

Algebraic laws for general case:

;; (any-two? q? '())         == #f
;; (any-two? q? (cons y ys)) == (any-two? q? ys) OR
;;                              ys has z with (q? y z)

Linear search

;; (exists? p? '())         == ...
;; (exists? p? (cons y ys)) == ...

Linear search

;; (exists? p? '())         == #f
;; (exists? p? (cons y ys)) == (|| (p? y)
;;                                 (exists? p? ys))

Operator || short-circuits:
Evaluates second argument only if first returns #f

## Introduction to lambda

Two-element search

Algebraic laws:

;; (any-two? q? '())         == #f
;; (any-two? q? (cons y ys)) == (any-two? q? ys) OR
;;                              (exists? q-with-y? ys)
;;               where (q-with-y? z) == (q? y z)

But where does q-with-y? come from?

• It is different for every y!

Functional programming, Part Two

(q-with-y? z) == (q? y z)

q-with-y? = a function that takes z, returns

q-with-y? = (lambda (z) (q? y z))

Built into Scheme since 1973!

Two-element search, complete {&} formal

;; (any-two? q? '())         == #f
;; (any-two? q? (cons y ys)) ==
;;   (|| (any-two? q? ys)
;;       (exists? (lambda (z) (q? y z)) ys))

Code in lecture notes.

(define any-two? (q? xs)
;; tell if any two elements y and z
;; in order satisfy (q? y z)
(if (null? xs)
(|| (any-two? q? (cdr xs))
(exists? (lambda (z) (q? (car xs) z))
(cdr xs)))))

Lambda expressions in general

In general, (lambda (x1 ... xn) e)

• lambda is an anonymous function
• x1 to xn are parameters
• e is the body
• x1 to xn are bound in e
• Other variables are free in e

Capture: lambda “takes” free variables with it.

Programming with functions

Functions are “first-class” values

• Passed as parameters
• Returned as results
• Stored in data structures

Created with lambda
(In Scheme, define combines “val” and “lambda”)

Consider the following code when answering the questions below:

(define combine (p1? p2?)
(lambda (x) (if (p1? x) (p2? x) #f)))

(define divvy (p1? p2?)
(lambda (x) (if (p1? x) #t (p2? x))))

(val c-p-e (combine prime? even?))
(val d-p-o (divvy   prime? odd?))
1. The expression (c-p-e 9) equals

• #t
• #f
2. The expression (c-p-e 8) equals

• #t
• #f
3. The expression (c-p-e 7) equals

• #t
• #f
4. The expression (d-p-o 9) equals

• #t
• #f
5. The expression (d-p-o 8) equals

• #t
• #f
6. The expression (d-p-o 7) equals

• #t
• #f

## μScheme’s semantics: Overview and let

### Reflection: Scheme vs Impcore

We’ve now studied two different languages: Impcore and μScheme.

• Name two things you can do in μScheme that you can’t do in Impcore.

• Speculate on how the semantics of μScheme might differ from that of Impcore to account for those features.

Impcore: Things that should offend you

Look up function vs look up variable:

• Different interfaces!

To get variable, check multiple environments

Create a function? Must give it a name

• A sign of second-class citizenship

It’s not precisely true that rho never changes.
New variables are added when they come into scope.
Old variables are deleted when they go out of scope.
But the location associated with a variable never changes.

Strange syntax, but simple semantics!

Key idea: don’t worry about memory

Consider the semantic rule for let.

Match the meta-variables with their meanings.

Meta-variables:

• e
• ρ
• σ
• v
• x

Meanings:

• The names that are being given values.
• The expressions that specify the values to associate with each variable.
• The values the expressions evaluate to.
• The locations where values will be stored.
• The environment that associates variable names with locations.
• The store that maps locations to values.

Another way to write this one:

• The names that are being given values.
• The expressions that specify the values to associate with each variable.
• The values the expressions evaluate to.
• The locations where values will be stored.
• The environment that associates variable names with locations.
• The store that maps locations to values.

### Check your understanding: The semantics of let

Consider the semantic rule for let.

Expression e1 is evaluated after expression e2.

• True
• False

The memory cells allocated for x1 and x2 must be distinct, meaning they cannot alias.

• True
• False

In evaluating the body of the let e, any side effects that occured during the evaluation of e1 and e2 will be visible and variables x1 and x2 will be in scope.

• True
• False

### Semantics of lambda

Key idea: every name I can see—remember where it is stored.’’

### Check your understanding: Semantics of lambda

Consider the semantic rule for lambda.

Side effects (changes to the global store) triggered by the evaluation of the body of the lambda (ie, e) will take place during the evaluation of the lambda.

• True
• False

The environment captured in the closure is the same as the environment in force at the point where the lambda appears in the source program

• True
• False

### Semantics of function application

• What if we used σ instead of σ0 in evaluation of e1?

• What if we used σ instead of σ0 in evaluation of arguments?

• What if we used ρc instead of ρ in evaluation of arguments?

• What if we did not require 1, 2 dom(σ)?

• What is the relationship between ρ and σ?

### Check your understanding: Semantics of application

Consider the semantic rule for ApplyClosure2.

Which expression evaluates to a closure?

• e
• e1
• e2
• ec

Which expression represents the first argument to the function?

• e
• e1
• e2
• ec

Which expression represents the body of the function being applied?

• e
• e1
• e2
• ec

In what order are the expressions mentioned in the rule evaluated?

• e1, e2, ec, e
• ec, e1, e2, e
• e, e1, e2, ec
• ec, e2, e1, e

Which environment is in force at the point where the body of the function is evaluated?

• ρ
• ρc
• ρc{x1 ↦ ℓ1, x2 ↦ ℓ2}

## Immutable data: its costs and benefits (not covered in class)

Key idea of functional programming: constructed data. Instead of mutating, construct a new one. Supports composition, backtracking, parallelism, shared state.

Already doing it: immutable data (append)

• Always safe to share data (I can’t mess up things for you)
• Perfect for parallel/distributed (think Erlang)
• Perfect for web (JSON, XML)

More functional programming

Back to lists!

Append two lists

(define append (xs ys)
;; returns a new list that holds all
;; the elements of xs followed by
;; all the elements of ys
...)

Rapid design process

(define append (xs ys) ...)

Forms of data?

Algebraic laws?

(append '()         ys) ==

(append (cons z zs) ys) == 

Equations and function for append

(append '()         ys) == ys

(append (cons z zs) ys) == (cons z (append zs ys))

(define append (xs ys)

(if (null? xs)

ys

(cons (car xs) (append (cdr xs) ys))))


### Cost model

The major cost center is cons because it corresponds to allocation.

How many cons cells are allocated?

Let’s rigorously explore the cost of append.

### Structural Induction Principle for List(A)

Suppose I can prove two things:

1. IH (’())

2. Whenever a in A and also IH(as), then IH (cons a as)

then I can conclude

Forall as in List(A), IH(as)

Claim: Cost (append xs ys) = (length xs)

Proof: By induction on the structure of xs.

Base case: xs = ’()

• I am not allowed to make any assumptions.

(append '() ys)
= { because xs is null }
ys

Nothing has been allocated, cost is zero

(length xs) is also zero.

Therefore, cost = (length xs).

Inductive case: xs = (cons z zs)

• I am allowed to assume the inductive hypothesis for zs.

Therefore, I may assume the number of cons cells allocated by (append zs ys) equals (length zs)

Now, the code:

(append (cons z zs) ys)
= { because first argument is not null }
= { because (car xs) = z }
= { because (cdr xs) = zs }
(cons z (append zs ys))

The number of cons cells allocated is 1 + the number of cells allocated by (append zs ys).

cost of (append xs ys)
= { reading the code }
1 + cost of (append zs ys)
= { induction hypothesis }
1 + (length zs)
= { algebraic law for length }
(length (cons z zs))
= { definition of xs }
(length xs)

Conclusion: Cost of append is linear in length of first argument.

## More functional programming (not covered in class)

### List-design shortcuts

Three forms of “followed by”

• Given: Element .. list of values = (cons x xs)

• Define: List of values .. list of values = (append xs ys)

• Ignore: List of values .. element = (snoc xs y)
Or (append xs (list1 y))

Two lists? Four cases!

Using informal math notation with .. for “followed by” and e for the empty sequence, we have these laws:

xs .. e         = xs
e .. ys         = ys
(z .. zs) .. ys = z .. (zs .. ys)
xs .. (y .. ys) = (xs .. y) .. ys

The underlying operations are append, cons, and snoc. Which ..’s are which?

• But we have no snoc

• If we cross out the snoc law, we are left with three cases… but case analysis on the first argument is complete.

So cross out the law xs .. e == xs.

• Which rules look useful for writing append?

### Costs of list reversal

Algebraic laws for list reversal:

reverse '() = '()
reverse (x .. xs) = reverse xs .. reverse '(x) = reverse xs .. '(x)

And the code?

Naive list reversal

(define reverse (xs)
(if (null? xs)
'()
(append (reverse (cdr xs))
(list1 (car xs)))))


The list1 function maps an atom x to the singleton list containing x.

How many cons cells are allocated? Let’s let n = |xs|.

• Q: How many calls to reverse? A: n
• Q: How many calls to append? A: n
• Q: How long a list is passed to reverse? A: n-1, n-2, … , 0
• Q: How long a list is passed as first argument to append? A: n-1, n-2, … , 0
• Q: How many cons cells are allocated by call to list1? A: one per call to reverse.
• Conclusion: O(n2) cons cells allocated. (We could prove it by induction.)

### The method of accumulating parameters

Write laws for

(revapp xs ys) = (append (reverse xs) ys)

Who could write the code?

Reversal by accumulating parameters

(define revapp (xs ys)
; return (append (reverse xs) ys)
(if (null? xs)
ys
(revapp (cdr xs)
(cons (car xs) ys))))

(define reverse (xs) (revapp xs '()))


The cost of this version is linear in the length of the list being reversed.

Parameter ys is the accumulating parameter.
(A powerful, general technique.)

# 30 September 2020: Higher-order functions

There are PDF slides for 10/1/2020.

### Reflection: List functions

Consider the function, with the algebraic laws:

This algorithm searches for an element in a list.

List other algorithms you are familiar with that operate over a list (or linked list, or an array) and explain briefly what they do.

Predefined list algorithms

Some classics:

• exists? (Example: Is there a number?)
• all?    (Example: Is everything a number?)
• filter  (Example: Select only the numbers)
• map     (Example: Add 1 to every element)
• foldr   (Visit every element;
also called reduce, accum, a “catamorphism”)

The power of lambda

Using first-class functions to enlarge your vocabulary

• List computations
• Cheap functions from other functions

Supported in many languages: Haskell, ML, Python, JavaScript

### List search: exists?

Algorithm encapsulated: linear search

Example: Is there a number in the list?

Algebraic laws:

(exists? p? '())          == ???
(exixts? p? (cons x xs))  == ???

(exists? p? '())          == #f
(exists? p? (cons x xs))  == #t,
if (p? x)
(exixts? p? (cons x xs))  == exists? p? xs,
otherwise

Defining

; (exists? p? '())         = #f
; (exists? p? (cons y ys)) = #t,          if (p? y)
; (exists? p? (cons y ys)) = (exists? p? ys),
otherwise
-> (define exists? (p? xs)
(if (null? xs)
#f
(if (p? (car xs))
#t
(exists? p? (cdr xs)))))
-> (exists? number? '(1 2 zoo))
#t
-> (exists? number? '(apple orange))
#f


### Coding Interlude: exists?

Define a list ys and two predicates p1? and p2? such that exists? p1? ys evaluates to #t and exists? p2? ys evalutes to #f. Run the code to confirm your answers are correct.

An alternative formulation of the algebraic laws for the exists? function is:

;; (exists? p? '() = #f
;; (exists? p? (cons x xs)) = (|| (p? x)
;;                                (exists? p? xs))

Modify the definition of exists? following this algebraic formulation instead, and re-run your examples.

### List selection: filter

Algorithm encapsulted: Filter list according to a predicate

Example: Keep positive numbers in a list.

Defining

; (filter p? '())         == '()
; (filter p? (cons y ys)) ==
;         (cons y (filter p? ys)), when (p? y)
; (filter p? (cons y ys)) ==
;         (filter p? ys), when (not (p? y))

-> (define filter (p? xs)
(if (null? xs)
'()
(if (p? (car xs))
(cons (car xs) (filter p? (cdr xs)))
(filter p? (cdr xs)))))


Running filter

-> (filter (lambda (n) (>  n 0)) '(1 2 -3 -4 5 6))
(1 2 5 6)
-> (filter (lambda (n) (<= n 0)) '(1 2 -3 -4 5 6))
(-3 -4)


### Coding Interlude: filter

Use lambda to define a predicate that returns true iff the argument is a number larger than 9.

Define a list ys that contans a mix of one- and two-digit numbers.

Use your lambda function and the filter function defined in the video to produce a new list zs that contains only the elements in ys with at least two-digits.

How would you modify the code to produce a list with only one-digit numbers?

### Reflection: Lifting functions to lists: map

Consider the map function. The algorithm it encapsulates is to transform every element according to an argument function. If sq is the function that squares its argument, then map sq '(1,2,3) will produce the list '(1,4,9).

Please complete the algebraic laws for the map function:

(map f '())         ==  ???
(map f (cons n ns)) ==  ???

Algorithm encapsulated: Transform every element according to argument function

Example: Square every element of a list.

Algebraic laws:

(map f '())         ==  '()
(map f (cons n ns)) ==  (cons (f y) (map f ys))

Defining and running map


-> (define map (f xs)
(if (null? xs)
'()
(cons (f (car xs)) (map f (cdr xs)))))
-> (map number? '(3 a b (5 6)))
(#t #f #f #f)
-> (map (lambda(x)(* x x)) '(5 6 7))
(25 36 49)


### The universal list function: foldr

Algorithm encapsulated: Aggregate all elements in a list.

Example: What is the sum of the numbers in a list?

foldr takes two arguments:

• zero: What to do with the empty list.

• plus: How to combine next element with running results.

Example: foldr plus zero '(a b)

cons a (cons b '())
|       |      |
v       v      v
plus a (plus b zero)

What does the following code evaluate to?

-> (define foldr (plus zero xs)
(if (null? xs)
zero
(plus (car xs) (foldr plus zero (cdr xs)))))
-> (define combine (x a) (+ 1 a))
-> (foldr combine 0 '(2 3 4 1))
• 0
• 10
• 24
• 4

### Reflection: Language design: Why the redundancy?

We’ve just seen that functions like exists?, map, and filter can all be easily written in terms of the foldr function. The foldr function, in turn, can easily be written in terms of basic recursion. Other languages have similar redundancy involving loops: for, while, etc.

It takes extra effort for library designers to code more functions or for language designers to support more features.

What do you think makes it worth their effort to support this range of functions and looping constructs?

## Currying

Deriving one-argument functions from two-argument functions

Currying: The motivation

Remember me?

q-with-y? = (lambda (z) (q? y z))

Happens so often, there is a function for it:

q-with-y? = ((curry q?) y)

Called a partial application (one now, one later)

Map/search/filter love curried functions

-> (map     ((curry +) 3) '(1 2 3 4 5))
; add 3 to each element

-> (exists? ((curry =) 3) '(1 2 3 4 5))
; is there an element equal to 3?

-> (filter  ((curry >) 3) '(1 2 3 4 5))
; keep elements that 3 is greater then


The idea of currying

• Input: a binary function f(x,y)

• Output: a function f'
• Input: argument x
• Output: a function f''
• Input: argument y
• Output: f(x,y)

What is the benefit?

• Functions like exists?, all?, map, and filter expect a function of one argument.

To get there, we use currying and partial application.

Slogan: Curried functions take their arguments “one-at-a-time.”

What does the following code evaluate to?

-> (map     ((curry +) 3) '(1 2 3 4 5))
• 15
• 18
• '(4 5 6 7 8)
• '(1 2 3 3 4 5)
• '(3 4 5 6 7 8)
• 4
-> (exists? ((curry =) 3) '(1 2 3 4 5))
• '()
• #t
• #f
• 0
-> (filter  ((curry >) 3) '(1 2 3 4 5))
; tricky

• '(1 2)
• '(1 2 3)
• ’()
• '(3 4 5)
• '(4 5)

### Composing functions

Composing Functions

In math, what is the following equal to?

(f o g)(x) == ???

Another algebraic law, another function:

(f o g) (x) = f(g(x))
(f o g) = lambda x. (f (g (x)))

One-argument functions compose

-> (define o (f g) (lambda (x) (f (g x))))
-> (define even? (n) (= 0 (mod n 2)))
-> (val odd? (o not even?))
-> (odd? 3)
#t
-> (odd? 4)
#f


### Coding Interlude: compose

The predicate null? returns #t iff the argument list is empty, while the function not inverts a boolean. Use the compose function (o) to write a predicate that tests whether a list is non-empty.

Run your code on both empty and non-empty lists.

Function consuming A is related to proof about A

• Q: How to prove two lists are equal?

A: Prove they are both ’() or that they are both cons cells cons-ing equal car’s to equal cdr’s

• Q: How to prove two functions equal?

A: Prove that when applied to equal arguments they produce equal results.

# 5 Oct 2020: Costs, tail calls, and continuations

There are PDF slides for 10/6/2020.

### Reflection: Expected performance

Consider the following algebraic laws for list reversal:

reverse '() = '()
reverse (x .. xs) = reverse xs .. reverse '(x)
= reverse xs .. '(x)

where .. denotes the append operation on lists.

Now consder the corresponding code:

(define reverse (xs)
(if (null? xs)
'()
(append (reverse (cdr xs))
(list1 (car xs)))))

where list1 is a function that makes a one-element list from its argument.

• For a list reversal function to be considered to have good performance, what should its running time be in terms of the length of its argument?
• What is the running time of the reverse function above?

Where have we been, where are we going: functions

Today: costs and control flow

### Costs of list reversal

Naive list reversal

(define reverse (xs)
(if (null? xs)
'()
(append (reverse (cdr xs))
(list1 (car xs)))))


Consider the call reverse xs and let n = |xs|.

How many stack frames are allocated for reverse and append?

How many cons cells are allocated?

• Q: How many calls to reverse? A: n
• Q: How many calls to append?
A: n
• Q: How long is list passed to reverse?
A: n-1, n-2, … , 0
• Q: How long is list passed as 1st arg to append?
A: n-1, n-2, … , 0
• Q: How many cons cells are allocated by call to list1? A: one per call to reverse.
• Conclusions: (We could prove by induction.)
• O(n) stack frames are allocated.
• O(n2) cons cells allocated.

Alarm bells should be ringing: Reversing a list should be linear!

We’ll get to a linear verson of list reversal via tail calls and the method of accumulating parameters.

## Tail calls

What is tail position?

Tail position is defined inductively:

• The body of a function is in tail position

• When (if e1 e2 e3) is in tail position, so are e2 and e3

• When (let (...) e) is in tail position, so is e, and similary for letrec and let*.

• When (begin e1 ... en) is in tail position, so is en.

Idea: The last thing that happens

### Check your understanding: Tail Position

Consider the following code

-> (define foldr (plus zero xs)
(if (null? xs)
zero
(plus (car xs) (foldr plus zero (cdr xs)))))

The occurrence of expression zero in the true-branch of the if expression is in tail position.

• True
• False

The occurrence of expression plus in the false-branch of the if expression is in tail position.

• True
• False

The occurrence of expression foldr in the false-branch of the if expression is in tail position.

• True
• False

## Tail calls and the method of accumulating parameters

Tail-call optimization & \ accumulating parameters

Tail-call optimization

• Anything in tail position is the last thing executed,
so its stack frame can be recycled

• Before executing a call in tail position,
abandon current stack frame

• Results in asymptotic space savings

Accumulating Parameter

• Moves recursive call to tail position

Example: Accumulating reverse function

Key ideas:

• Recursive call is in a tail position

Contract:

  (revapp xs ys) = (append (reverse xs) ys)

Laws:

  (revapp '()         ys) == ys
(revapp (cons z zs) ys) ==
(revapp zs (cons z ys))

Reversal by accumulating parameters

; laws: (revapp '()         ys) = ys
;       (revapp (cons z zs) ys) =
;                     (revapp zs (cons z ys))

(define revapp (xs ys)
; return (append (reverse xs) ys)
(if (null? xs)
ys
(revapp (cdr xs)
(cons (car xs) ys))))

(define reverse (xs) (revapp xs '()))


We’ll seee that the cost of this version is linear in the length of the list being reversed.

Parameter ys is the accumulating parameter.

Example: revapp '(1 2) '()

Call stack: (each line replaces previous one)

revapp '(1 2) '()    -->
revapp '(2)   '(1)   -->
revapp '()    '(2 1)

Consider the call revapp xs '() and let n = |xs|.

How many stack frames are allocated?

How many cons cells are allocated?

• Q: How many calls to revapp? A: n
• Q: How many of them are tail calls? A: n
• Q: How many stack frames are allocated for revapp? A: 1
• Q: How much work is done per call to revapp? A: constant.
• Conclusions: (We could prove by induction.)
• O(1) stack frames for revapp are allocated.
• O(n) cons cells allocated.
• Function is linear in length of list being reversed.

### Check your understanding: Accumulating Parameters

An accumulating parameter to a recursive function f accumulates the answer so that a recursive call to f can be in tail position.

• True
• False

The stack frame of a function call that is not in tail position cannot be shared with subsequent calls because its values may still be needed.

• True
• False

Tail-call optimizations and the use of accumulating parameters can only result in constant-time improvements to the complexity of a function.

• True
• False

## Continuations

### Reflection: Another construct that transfers control

Are tail calls familiar? What other programming construct

1. Transfers control to another point in the code?

2. Uses no stack space?

Tail calls as gotos with arguments

Gotos

• Transfer control to another point in the code
• Use no stack space
• Can only go to code marked with labels

Tail calls

• Transfer control to another point in the code
• Use no stack space
• Can transfer control to any function
• Can take parameters!

Continuations: First-class tail calls

A continuation is code that represents “the rest of the computation.”

• Not a normal function call because continuations never return
• Think “goto with arguments”

Context: an advanced technique, heavily used in event-driven programming:

• GUIs, games, web, embedded devices

Different coding styles

Direct style

• Last action of a function is to return a value. (This style is what you are used to.)

Continuation-passing style (CPS)

• Last action of a function is to “throw” answer to a continuation.
(For us, tail call to a parameter.)

Uses of continuations

• Compiler representation:
• Compilers for functional languages often convert direct-style user code to CPS because CPS matches control-flow of assembly.
• First-class continuations.
• Some languages provide a construct for capturing the current continuation and giving it a name k.
• Control can be resumed at captured continuation by throwing to k.
• A style of coding that can mimic exceptions

• Callbacks in GUI frameworks and web services

• Event loops in game programming and other concurrent settings

Implementation

• We’re going to simulate continuations with function calls in tail position.

• First-class continuations require compiler support.
• primitive function that materializes “current continuation” into a variable.

### Check your understanding: Continuation Introduction

A continuation is different from a normal function because it never returns to the context from which it was invoked.

• True
• False

Continuations cannot take arguments.

• True
• False

Continuations are used in GUI frameworks and to mimic exceptions.

• True
• False

## Motivating Example: Using continuations to signal exceptions

### Reflection: Handling unusual circumstances

Consider the following algebraic laws for a witness function:

  (witness p? xs) == x, where (member x xs),
provided (exists? p? xs)

• What should the function do if there exists no such x?

Design Problem: Missing Value

Provide a witness to existence:

 (witness p? xs) == x, where (member x xs),
provided (exists? p? xs)

Problem: What if there exists no such x?

• nil
• special symbol fail
• run-time error

Good idea: exception (but not available in uScheme)

Question: How much stack space is used by the call?

Example Use: Instructor Lookup

-> (val 2020f '((Fisher 105)(Monroe 40)(Chow 116)))
-> (instructor-info 'Fisher 2020f)
(Fisher teaches 105)

-> (instructor-info 'Chow 2020f)
(Chow teaches 116)

-> (instructor-info 'Votipka 2020f)
(Votipka is-not-on-the-list)


### Coding Interlude: witness-cps

Consider the following code:

(define witness-cps (p? xs succ fail)
(if (null? xs)
(fail)
(let ([z (car xs)])
(if (p? z)
(succ z)
(witness-cps p? (cdr xs) succ fail)))))

(val 2020f '((Fisher 105)(Monroe 40)(Chow 116)))

(define instructor-info (instructor classes)
(let (
[s  (lambda (info)    ; success continuation
[f  (lambda ()        ; failure continuation
(list2 instructor 'is-not-on-the-list))])
(witness-cps (o ((curry =) instructor) car)
classes s f)))

(instructor-info 'Fisher 2020f)
(instructor-info 'Chow 2020f)
(instructor-info 'Votipka 2020f)

Modify the code to add another professor and course number, and add a test case for that professor.

Modify the code to return 'is-not-teaching-this-semester if the instructor’s name is not on the list.

Run your code to confirm it returns your new response when given a name not on the list.

Explain how would you modify the code to search for a course number instead of an instructor name.

# 7 Oct 2020: Continuations for backtracking

There are PDF slides for 10/8/2020.

Expressions to avoid

(if <p> #t #f)  ; worst

(append (list1 <x>) <ys>) ; most common

(cons <x> (cons <y> '())) ; prefer list2

Functions list-of?, formula?

• Can be passed any value

• Must handle all cases (Figure 2.2, page 94)

### Reflection: Solving Boolean formulas

On your homework, you will use continuations to organize a search to find satisfying assignments for Boolean formulas.

A formula has one of these forms:

given the following record declarations:

For each formula, find a satisfying assignment or explain why none exists.

(val f1 (make-and (list4 'x 'y 'z (make-not 'x))))
; x /\ y /\ z /\ !x

(val f2 (make-not (make-or (list2 'x 'y))))
; !(x \/ y)

(val f3 (make-or (list2 'x 'y)))
; x \/ y

(val f4 (make-not (make-and (list3 'x 'y 'z))))
; !(x /\ y /\ z)

Solving Boolean formulas with continuations

A formula is one of these:

given the following record definitions (in book):

Solving Boolean Formulas

(val f1 (make-and (list4 'x 'y 'z (make-not 'x))))
; x /\ y /\ z /\ !x  

Answer: NONE

Solving Boolean Formulas

(val f2 (make-not (make-or (list2 'x 'y))))
; !(x \/ y)          

{ x |-> #f, y |-> #f }

Solving Boolean Formulas

(val f3 (make-or (list2 'x 'y)))
; x \/ y 

{ x |-> #t, ... }
{ y |-> #t, ... }

Solving Boolean Formulas

(val f4 (make-not (make-and (list3 'x 'y 'z))))
; !(x /\ y /\ z)     

(all 7 ways the variables can not all be #t):

{ x |-> #f, ... }, ...

### Searching via continuations: An example

Finding a satisfying assignment

Example formula:

  (x \/ y) /\ (!x /\ !z)

the name of the literal in the box.

### Solving a Literal

start carries a partial truth assignment to variables current

Box describes how to extend current to make a variable, say x, true.

Case 1: current(x) = #t

Call success continuation with current

Pass fail as resume continuation (argument to success)

Case 2: current(x) = #f

Call fail continuation

Case 3: x not in current

Call success continuation with current{x -> #t}

Pass fail as resume continuation

Solving a literal

; (satisfy-literal-true x current succ fail) =
;   (succ current fail), when x is bound to #t in cur
;   (fail),              when x is bound to #f in cur
;   (succ (bind x #t current) fail), x unbound in cur
(define satisfy-literal-true (x current succ fail)
(if (bound? x current)
(if (find x current)
(succ current fail)
(fail))
(succ (bind x #t current) fail)))

### Check your understanding: Solving a Negated Literal

Consider the schema we’ve been using to find satisfying assignments for Boolean formulas. In the previous video, we walked through how to instantiate the schema for a positive literal. The following questions ask you to do the same thing for a negated literal, which is a term of the form not x where x is a literal.

The start continuation is called with argument current, the partial truth assignment constructed so far. The box needs to describe how to extend current to make the negated variable not x true.

If current(x) = #t, what should the box do?

• Call success continuation with current as argument.
• Call fail continuation
• Call success continuation with current extended with {x->#t}
• Call success continuation with current extended with {x->#f}

If current(x) = #f, what should the box do?

• Call success continuation with current as argument.
• Call fail continuation
• Call success continuation with current extended with {x->#t}
• Call success continuation with current extended with {x->#f}

If x is not in current, what should the box do?

• Call success continuation with current as argument.
• Call fail continuation
• Call success continuation with current extended with {x->#t}
• Call success continuation with current extended with {x->#f}

The success continuation takes two arguments: current (the partial solution found so far) and resume (the continuation that takes over if current is found to be inconsistent with some new part of the formula. When calling the success continuation, what should we pass as the resume continuation?

• (lambda (x) x)
• fail
• (lambda (x) (not x))
• start

### Solving A and B

1. Solver enters A

2. If A is solved, newly allocated success continuation starts B

3. If B succeeds, we’re done! Use success continuation from context.

4. If B fails, use resume continuation A passed to B as fail.

5. If A fails, the whole thing fails. Use fail continuation from context.

### Solving A or B

1. Solver enters A

2. If A is solved, we’re good! But what if context doesn’t like solution? It can resume A using the resume continuation passed out as fail.

3. If A can’t be solved, don’t give up! Try a newly allocated failure continuation to start B.

4. If ever B is started, we’ve given up on A entirely. So B’s success and failure continuations are exactly the ones in the context.

5. If B succeeds, but the context doesn’t like the answer, the context can resume B.

6. If B fails, abject failure all around; call the original fail continuation.

## Lisp and Scheme Retrospective

Five powerful questions

1. What are the values?
What do expressions/terms evaluate to?

2. What environments are there?
What can names stand for?

3. What is the abstract syntax?
Syntactic categories? Terms in each category?

4. How are terms evaluated?
Judgments? Evaluation rules?

5. What’s in the initial basis?
Primitives and predefined, what is built in?

Next steps

Before tomorrow’s lecture,

Consider good and bad points of Scheme

### Common Lisp, Scheme

• High-level data structures
• Cheap, easy recursion
• Automatic memory management (garbage collection!)
• Programs as data!
• Hygenic macros for extending the language
• Big environments, tiny interpreters, everything between
• Sophisticated Interactive Development Environments
• Used in AI applications; ITA; Paul Graham’s company Viaweb

Down sides:

• Hard to talk about data
• Hard to detect errors at compile time

Bottom line: it’s all about lambda

• Major win
• Real implementation cost (heap allocation)

## Bonus content: Scheme as it really is

1. Macros!
2. Cond expressions (solve nesting problem)
3. Mutation

### Macros!

Full Scheme: Macros

A Scheme program is just another S-expression

• Function define-syntax manipulates syntax at compile time

• Macros are hygienic—name clashes impossible

• let, &&, record, others implemented as macros

(See book sections 2.16, 2.17.4)

### Conditional expressions

Full Scheme: Conditionals

(cond [c1 e1]    ; if c1 then e1
[c2 e2]    ; else if c2 then e2
...            ...
[cn en])   ; else if cn then en

; Syntactic sugar---'if' is a macro:
(if e1 e2 e3) == (cond [e1 e2]
[#t e3])


### Mutation

Full Scheme: Mutation

Not only variables can be mutated.

Mutate heap-allocated cons cell:

(set-car! '(a b c) 'd)  => (d b c)

Circular lists, sharing, avoids allocation

• still for specialists only

# 12 Oct 2020: Introduction to ML

There are PDF slides for 10/13/2020.

### Reflection: μScheme evaluation

1. Remember
2. Understand
3. Apply
4. Analyze
5. Evaluate
6. Create

Given what you have learned so far in this course, you are in a good position to start to evaluate μScheme.

Please enter at least two strengths of μScheme:

Please enter at least two weaknesses of μScheme.

The language that we will study next, ML, addresses some of the weaknesses of μScheme.

### Introduction to ML

Meta: Apply your new knowledge in Standard ML:

• You’ve already learned (most of) the ideas
• There will be a lot of new detail
• Not your typical introduction to a new language
• Not definition before use, as in a manual
• Not tutorial, as in Ullman
• Instead, the most important ideas that are most connected to your work up to now

### ML Overview

• Designed for manipulating programs, logic, and symbolic data
• Good language for implementing language features
• Good language for studying type systems
• Theme: Precise ways to describe data

### Three new ideas

ML = uScheme + pattern matching + static types + exceptions

1. Pattern matching is big and important. You will like it. It’s “coding with algebraic laws”
2. Static types get two to three weeks in their own right.
3. Exceptions are easy

Pattern matching makes code look more like algebraic laws: one pattern for each case

Static types tell us at compile time what the cases are.

Exceptions solve the problem “I can’t return anything sensible!”

And lots of new concrete syntax!

### Example: The length function.

• Algebraic laws:

length []      = 0
length (x::xs) = 1 + length xs
• The code:

fun length []      = 0
| length (x::xs) = 1 + length xs

val res = length [1,2,3]
• Things to notice:

• No brackets! (I hate the damn parentheses)

• Function application by juxtaposition

• Function application has higher precedence than any infix operator

• Compiler checks all the cases (try in the interpreter)

### Coding Interlude: length

Use the length function

   fun length []      = 0
| length (x::xs) = 1 + length xs

to calculate the length of the list [1,2,3,4].

Now write the sum function in ML that calculates the sum of the same list.

### μScheme to ML Rosetta stone

uScheme                    SML

(cons x xs)             x :: xs

'()                     []
'()                     nil

(lambda (x) e)          fn x => e

(lambda (x y z) e)      fn (x, y, z) => e

||  &&                  andalso    orelse

(let* ([x e1]) e2)      let val x = e1 in e2 end

(let* ([x1 e1]          let val x1 = e1
[x2 e2]              val x2 = e2
[x3 e3]) e)          val x3 = e3
in  e
end

### Check your understanding: ML Syntax

Match the μScheme syntax with its ML equivalent.

μScheme:

 &&

(lambda (x y z) e)

'()

||

(lambda (x) e)

(let* ([x e1]) e2)

(cons x xs)

'()

(let* ([x1 e1]
[x2 e2]
[x3 e3]) e)



ML

 fn (x, y, z) => e

nil

fn x => e

x :: xs

andalso

[]

false

let val x = e1 in e2 end

true

let val x1 = e1
val x2 = e2
val x3 = e3
in  e
end

orelse


### ML forms of data

Live coding:

(* ML's comment form *)
bool (* true, false *)
string (* "Tufts" *)
char (* #"A" *)
int  (* including negative integers, eg ~12 *)
list of int  [1,2,3]
list of bool [true, false]
ill-typed list [1, true]
pair  (1, "Halligan")
triple  (105, "Scheme", true)

### Check your understanding: Simple ML Types

Which of the following is not syntactically correct ML code?

• [1,2,3,5]
• (* This is a comment *) [true, false, true]
• (true, "hello")
• ~12
• #"K"
• None of the above

ML Functions

(* Can't use ? in the names of functions in ML *)
fun even x = (x mod 2 = 0)

fun map f []        = []
| map f (y :: ys) = f y  :: map f ys

(* Anonymous functions are more verbose in ML *)
val res1 = map (fn (x) => (x + 1)) [1,2,3]

(* ML also has a case statement;
It's not as elegant as clausal definitions. *)
(* orelse shortcircuits *)
fun exists p xs =
case xs
of [] => false
| y :: ys => p y orelse exists p ys

val res2 = exists even [1,2,3,4]
(* Higher order functions exist in ML too *)
(* Notation (op +) converts infix to prefix *)
fun sum xs = foldr (op +) 0 [1,2,3,4]


### Check your understanding: ML Functions

Which of the following functions correctly implements the filter function, defined by the following algebraic laws? (Note, all of the functions are syntatically correct.)

    filter p [] = []
filter p (y :: ys) =
if p y then y :: filter p ys
else filter p ys
fun filter p xs =
case xs
of [] => []
| y :: ys =>
if p y then y :: filter p ys
else filter p ys 
fun filter p [] = []
| filter p (y :: ys) =
if p y then y :: filter p ys
else filter p ys
fun filter p xs =
foldr (fn(x,a) =>
if p x then (x::a) else a)
[] xs

All of the above.

### New Forms of Data

Example: Binary Tree of Integers

(* LEAF and NODE are called data type constructors *)
datatype itree = LEAF
| NODE of itree * int * itree

(* Data type constructors build data structures*)
(* and can be used in pattern matching *)
fun insert n LEAF                    =
NODE (LEAF, n, LEAF)
| insert n (NODE (left, m, right)) =
if n < m then
NODE (insert n left, m, right)
else if n > m then
NODE (left, m, insert n right)
else
NODE (left, m, right)

(* insert is curried;
we might also want an uncurried version of insert *)
fun insert' (n, tree) = insert n tree

(* @ is ML's notation for appending two lists *)
fun elements LEAF = []
| elements (NODE (left, n, right)) =
elements left @ (n :: elements right)

(* o is ML's composition operation *)
val isort = elements o foldl insert' LEAF
(* uses binary-search tree for sorting *)

val result = isort [4,2,7,3,4,2,8,4,8,9];

### Check your understanding: ML Data Type Introduction

Consider the itree datatype declaration:

datatype itree = LEAF
| NODE of itree * int * itree

and the function elements

fun elements LEAF = []
| elements (NODE (left, n, right)) =
elements left @ (n :: elements right)

If the elements function were applied to each of the following itree data structures, which would return the list [1,0,5]?

• NODE (NODE (LEAF,0,LEAF),1,NODE (LEAF,5,LEAF))
• NODE (NODE (LEAF,5,LEAF),1,NODE (LEAF,0,LEAF))
• NODE (NODE (LEAF,1,LEAF),0,NODE (LEAF,5,LEAF))

### Exceptions in ML

Example: Take from a list

(* Note use of exceptions. *)
(* And ; to sequence imperative operations *)
exception TooShort
fun take 0 _       = []  (* wildcard! *)
| take n []      = raise TooShort
| take n (x::xs) = x :: (take (n-1) xs)

val res5 = take 2 [1,2,3,4]
val res6 = take 3 [1]
handle TooShort =>
(print "List too short!"; [])

### Check your understanding: ML Exceptions

Recall the code for the take function:

exception TooShort
fun take 0 _       = []  (* wildcard! *)
| take n []      = raise TooShort
| take n (x::xs) = x :: (take (n-1) xs)

and some code that exercises it:

val res5 = take 2 [1,2,3,4]
val res6 = take 3 [1]
handle TooShort =>
(print "List too short!"; [])

Consider the partially-written function find which satisfies the following algebraic laws:

find p [] = raise NotFound
find p (x::xs) = x          if p x
find p xs  otherwise  

and some code that exercises it:

exception Blank1
fun find p [] = Blank2 NotFound
| find p (x::xs) =
if p x then x
else find p xs
val res = find even [1,3,5]
Blank3 NotFound => Blank4


Fill in the blanks to correctly complete the find function and return an answer of 0?

ML—Five Questions

Values: num/string/bool, constructed data

Environments: names stand for values (and types)

Syntax: definitions, expressions, patterns, types

Evaluation: uScheme + case and pattern matching

Initial Basis: medium size; emphasizes lists

(Question Six: type system—a coming attraction)

in a course-specific video or in some other medium.

Ullman is easy to digest

Ullman costs money but saves time

Ullman is clueless about good style

Suggestion:

• Learn the syntax from Ullman
• Learn style from Ramsey, Harper, & Tofte

Details in course guide Learning Standard ML

Length

fun length []      = 0
| length (x::xs) = 1 + length xs

val res = length [1,2,3]


Map

fun map f []      = []
| map f (x::xs) = (f x) :: (map f xs)

val res1 =
map length [[], [1], [1,2], [1,2,3]]


Map, without redundant parentheses

fun map f []      = []
| map f (x::xs) =  f x  ::  map f xs

val res1 =
map length [[], [1], [1,2], [1,2,3]]


Filter

fun filter pred []      = []
| filter pred (x::xs) =  (* no 'pred?' *)
let val rest = filter pred xs
in  if pred x then
(x :: rest)
else
rest
end

val res2 =
filter (fn x => (x mod 2) = 0) [1,2,3,4]


Filter, without redundant parentheses

fun filter pred []      = []
| filter pred (x::xs) =  (* no 'pred?' *)
let val rest = filter pred xs
in  if pred x then
x :: rest
else
rest
end

val res2 =
filter (fn x => (x mod 2) = 0) [1,2,3,4]


Exists

fun exists pred []      = false
| exists pred (x::xs) =
(pred x) orelse (exists pred xs)

val res3 =
exists (fn x => (x mod 2) = 1) [1,2,3,4]
(* Note: fn x => e is syntax for lambda *)


Exists, without redundant parentheses

fun exists pred []      = false
| exists pred (x::xs) =
pred x  orelse  exists pred xs

val res3 =
exists (fn x => (x mod 2) = 1) [1,2,3,4]
(* Note: fn x => e is syntax for lambda *)


All

fun all pred []      = true
| all pred (x::xs) =
(pred x) andalso (all pred xs)

val res4 = all (fn x => (x >= 0)) [1,2,3,4]


All, without redundant parentheses

fun all pred []      = true
| all pred (x::xs) =
pred x  andalso  all pred xs

val res4 = all (fn x => (x >= 0)) [1,2,3,4]


Take

exception TooShort
fun take 0 _       = []  (* wildcard! *)
| take n []      = raise TooShort
| take n (x::xs) = x :: (take (n-1) xs)

val res5 = take 2 [1,2,3,4]
val res6 = take 3 [1]
handle TooShort =>
(print "List too short!"; [])

(* Note use of exceptions. *)


Take, without redundant parentheses

exception TooShort
fun take 0 _       = []  (* wildcard! *)
| take n []      = raise TooShort
| take n (x::xs) = x ::  take (n-1) xs

val res5 = take 2 [1,2,3,4]
val res6 = take 3 [1]
handle TooShort =>
(print "List too short!"; [])

(* Note use of exceptions. *)


Drop

fun drop 0 zs      = zs
| drop n []      = raise TooShort
| drop n (x::xs) = drop (n-1) xs

val res7 = drop 2 [1,2,3,4]
val res8 = drop 3 [1]
handle TooShort =>
(print "List too short!"; [])


Takewhile

fun takewhile p [] = []
| takewhile p (x::xs) =
if p x then (x :: (takewhile p xs))
else []

fun even x = (x mod 2 = 0)
val res8 = takewhile even [2,4,5,7]
val res9 = takewhile even [3,4,6,8]


Takewhile, without redundant parentheses

fun takewhile p [] = []
| takewhile p (x::xs) =
if p x then  x ::  takewhile p xs
else []

fun even x = (x mod 2 = 0)
val res8 = takewhile even [2,4,5,7]
val res9 = takewhile even [3,4,6,8]


Dropwhile

fun dropwhile p []              = []
| dropwhile p (zs as (x::xs)) =
if p x then (dropwhile p xs) else zs
val res10 = dropwhile even [2,4,5,7]
val res11 = dropwhile even [3,4,6,8]

(* fancy pattern form: zs as (x::xs) *)


Dropwhile, without redundant parentheses

fun dropwhile p []              = []
| dropwhile p (zs as (x::xs)) =
if p x then  dropwhile p xs  else zs
val res10 = dropwhile even [2,4,5,7]
val res11 = dropwhile even [3,4,6,8]

(* fancy pattern form: zs as (x::xs) *)


Folds

fun foldr p zero []      = zero
| foldr p zero (x::xs) = p (x, (foldr p zero xs))

fun foldl p zero []      = zero
| foldl p zero (x::xs) = foldl p (p (x, zero)) xs

val res12 = foldr (op +)  0 [1,2,3,4]
val res13 = foldl (op * ) 1 [1,2,3,4]

(* Note 'op' to use infix operator as a value *)


Folds, without redundant parentheses

fun foldr p zero []      = zero
| foldr p zero (x::xs) = p (x,  foldr p zero xs )

fun foldl p zero []      = zero
| foldl p zero (x::xs) = foldl p (p (x, zero)) xs

val res12 = foldr (op +)  0 [1,2,3,4]
val res13 = foldl (op * ) 1 [1,2,3,4]

(* Note 'op' to use infix operator as a value *)


# 14 Oct 2020: Programming with constructed data and types

There are PDF slides for 10/15/2020.

Handout: Learning ML

### Reflection: Defining new kinds of data

Suppose you were going to program a system to play card games in which suits (CLUBS, DIAMONDS, HEARTS, and SPADES) matter, such as poker, solitaire, bridge, or gin rummy. Briefly describe how you would represent the suits in C and in μScheme.

Now consider ML’s datatype mechanism that we used to define the itree type. Briefly describe how you could use ML’s datatype mechanism to represent suits.

## Programming with Types

Pain points from Scheme

Unsolved:

• Clunky records (no dot notation)

Solved:

• Land of Infinite Parentheses
• Algebraic laws are just comments
• car or cdr of empty list
• car or cdr of non-list
• Too many ways to use cons
• Return value of wrong type
• Wrong types/number of arguments

Programming with Types

• Define a type
• Create value (“introduction”)
• Observe a value (“elimination”)

For functions,

All you can do with a function is apply it

For constructed data,

“How were you made & from what parts?”

## Foundation: Data

Syntax is always the presenting complaint, but data is what’s always important

“Language support for forms of data”

• Base types: int, real, bool, char, string
• Functions: A -> B
• Constructed data:

• Tuples: pairs A*B, triples A*B*C, etc.
• (Records with named fields)
• Lists and other algebraic types

Deconstruct using pattern matching

“Distinguish one cons cell (or one record) from another”

### Lightweight constructed data: Tuples

• Types defined wtih *: A1 * A2 * … * An

• Expressions to create: (e1, e2, …, en)

• Patterns to observe: (p1, p2, …, pn)

Example (splitList from merge sort):

    let val (left, right) = splitList xs
in  merge (sort left, sort right)
end

### Enumerated types

Datatypes can define an enumerated type and associated values.

datatype suit = HEARTS | DIAMONDS | SPADES | CLUBS

Here suit is the name of a new type.

The value constructors HEARTS, DIAMONDS, SPADES, and CLUBS are the values of type suit.

Value constructors are separated by vertical bars.

### Pattern matching

Datatypes are deconstructed using pattern matching.

fun toString HEARTS   = "hearts"
| toString DIAMONDS = "diamonds"
| toString CLUBS    = "clubs"

val suitName = toString HEARTS

### Check your understanding: Basic Datatypes

Consider the following partially written isRed function that is supposed to determine if a given suit is one of the two red suits (ie, is either a heart or a diamond).

fun isRed  HEART  = BLANK1
| BLANK2 BLANK3 = true
| isRed  _      = false

### But wait, there’s more: Value constructors can take arguments!

datatype int_tree =
LEAF | NODE of int * int_tree * int_tree

int_tree is the name of a new type.

There are two data constructors: LEAF and NODE.

NODEs take a tuple of three arguments: a value at the node, and left and right subtrees.

The keyword of separates the name of the data constructor and the type of its argument.

When fully applied, data constructors have the type of the defining datatype (ie, int_tree).

It’s recursive!

### Building values with constructors

We build values of type int_tree using the associated constructors:

 val tempty = LEAF
val t1 = NODE (1, tempty, tempty)
val t2 = NODE (2, t1, t1)
val t3 = NODE (3, t2, t2)

What is the in-order traversal of t3?

 [1,2,1,3,1,2,1]

What is the pre-order traversal of t3?

 [3,2,1,1,2,1,1]

### Deconstruct values with pattern matching

(The @ symbol denotes append in ML)

fun inOrder LEAF = []
| inOrder (NODE (v, left, right)) =
inOrder left @ [v] @ inOrder right

val il3 = inOrder t3

fun preOrder LEAF = []
| preOrder (NODE (v, left, right)) =
v :: preOrder left @ preOrder right

val pl3 = preOrder t3

### Reflection: How general is the code?

Consider the int_tree datatype and the code we’ve written to manipulate it:

   datatype int_tree =
LEAF | NODE of int * int_tree * int_tree

fun inOrder LEAF = []
| inOrder (NODE (v, left, right)) =
inOrder left @ [v] @ inOrder right

fun preOrder LEAF = []
| preOrder (NODE (v, left, right)) =
v :: preOrder left @ preOrder right

Discuss the extent to which this code requires the value stored at each node to be an integer. Type errors aside, would the code still work if we stored a different kind of value in the tree?

### But wait, there’s still more: Polymorphic datatypes!

int_tree is monomorphic because it has a single type.

Note though that the inOrder and preOrder functions only cared about the structure of the tree, not the payload value at each node.

Polymorphic datatypes are written using type variables that can be instantiated with any type.

datatype 'a tree =
CHILD | PARENT of 'a * 'a tree * 'a tree

tree is a type constructor (written in post-fix notation), which means it produces a type when applied to a type argument.

Examples:

• int tree is a tree of integers

• bool tree is a tree of booleans

• char tree is a tree of characters

• int list tree is a tree of a list of integers.

'a is a type variable: it can represent any type.

It is introduced on the left-hand of the = sign. References on the right-hand side are types.

CHILD and PARENT are value constructors.

CHILD takes no arguments, and so has type 'a tree

When given a value of type 'a and two 'a trees, PARENT produces a 'a tree

### Constructors build tree values

val empty = CHILD
val tint1 = PARENT (1, empty, empty)
val tint2 = PARENT (2, tint1, tint1)
val tint3 = PARENT (3, tint2, tint2)

val tstr1 = PARENT ("a", empty, empty)
val tstr2 = PARENT ("b", tstr1, tstr1)
val tstr3 = PARENT ("c", tstr2, tstr2)

### Pattern matching deconstructs tree values

fun inOrder CHILD = []
| inOrder (PARENT (v, left, right)) =
(inOrder left) @ [v] @ (inOrder right)

fun preOrder CHILD = []
| preOrder (Parent (v, left, right)) =
v :: (preOrder left) @ (preOrder right)

Functions inOrder and preOrder are polymorphic: they work on any value of type 'a tree. 'a is a type variable and can be replaced with any type.

### Environments

Datatype definitions introduce names into:

1. the type environment: suit, int_tree, tree

2. the value environment: HEART, LEAF, PARENT

### Inductive

Datatype definitions inherently inductive:

• the type int_tree appears in its own definition

• the type tree appears in its own definition

### Check your understanding: Defining Datatypes

We can define a subset of the S-expressions in Scheme using the following specification:

An SX1 is either an ATOM or a list of SX1s, where ATOM is represented by the ML type atom.

Complete the encoding of this definition by filling in the blanks in the following ML datatype:

datatype sx1 = ATOM of BLANK1
| LIST of BLANK2 list

Another way of defining a subset of the S-expressions in Scheme uses a different specification:

An SX2 is either an ATOM or the result of consing together two values v1 and v2, where both v1 and v2 are themselves members of the set SX2.

Complete the encoding of this definition by filling in the blanks in the following ML datatype:

datatype sx2 = ATOM of BLANK1
| PAIR of BLANK2 BLANK3 BLANK4

### Coding interlude: Defining a datatype

Designing a datatype is a three step process:

1. For each form, choose a value constructor
2. Identify the “parts” type that each constructor is of
3. Write the datatype definition

Another definition of a Scheme S-expression is that it is one of:

• A symbol (string)
• A number (int)
• A Boolean (bool)
• A list of S-expressions

Define an ML datatype sx that encodes this version of an S-expression.

## Making types work for you

### Type-directed programming

Common idea in functional programming: lifting

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

The type of a function tells us a lot about how to write it—the type often narrows the code down to a very few sensible choices. This technique works especially well with polymorphic functions.

Here’s the problem we’re going to solve: we have a predicate on single elements, and we want a predicate on lists of elements:

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

And here is the main tool we are going to use: for every type, we are going to be aware of how to make a value of the type (its introduction form) and what we can do with a value of the type (its elimination form).

Let’s review for the types we have:

• Arrow types: A function of any type can always be create with an explicit lambda (fn). And the only thing we can do with a function is apply it.

• Boolean: can make Booleans with true and false. We can use them in if expressions and we can pattern match on them.

• List types: Any value of list type is always made with ::.
And to take one apart, we use pattern matching (in case or fun).

• The unknown type 'a: We have to be prepared for any 'a at all, so there’s no syntax that goes with it. Instead

• If we want an expression of unknown type 'a, our choices are that we have a variable of 'a (maybe a parameter), or maybe we can apply a function that returns a value of 'a.

• All we can do with a value of unknown type are the things we can do with any type: pass it to functions or store it in data structures.

Table:

     Type                 Intro                  Elim

... -> ...           fn                     application

bool                true, false             if, pattern matching (case/fun)

... list            [], ::                  pattern matching (case/fun)

'a                    ---                    ---

Let’s get started: our goal is to make a function of type

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

We can always use fun

fun lift p = _goal1

Q: What is the type of p? How do we know?

Q: What is the type of _goal1? How do we know?

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) = (_goal1 : 'a list -> bool)

Our new goal is a term of type 'a list -> bool. Which of these forms of expression is always safe and makes progress toward a goal of this type?

1. A recursive call
2. A lambda expression of the form (fn ... => ...)
3. A let expression

Our goal is an arrow type, and it is always safe to make a value of arrow type using lambda:

_goal1 = (fn xs => _goal2)

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) = (fn xs => _goal2)

Q: What is the type of xs? How do we know?

Q: What is the type of _goal2? How do we know?

_goal1 = (fn xs => _goal2)

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) = (fn (xs : 'a list) => (_goal2 : bool))

### Things we have

lift : ('a -> bool) -> ('a list -> bool)   [it could call itself recursively]
p : 'a -> bool
xs : 'a list

Now we have a goal of type bool and we want to use the parameter xs of type 'a list. What is the safest tactic to make progress using xs?

1. Break it down by cases, that is, ask whether it is nil or cons
2. Pass it to foldl or foldr
3. Pass it to lift p

Now we have a new goal of type bool. And we know two ways to make a value of type bool: we can return true or false. But if we choose true or false, we will have defined a function that is not very interesting, and we will have completely ignored all the parameters to lift. That is probably not a great design choice.

Is there some other way to make a bool? Some way to use what we have?

• Our p has type 'a -> bool, and that’s an arrow type, so the only thing we can do with it is apply it. But we don’t have a value of 'a to apply it to. Let’s set it aside.

• Our xs has type 'a list. This doesn’t solve our immediate goal of a bool, but since it’s a list, we can try to take it apart. We know xs is either nil or cons. So let’s fill the hold [goal2] with a case expression.

Next take:

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) =
fn (xs : 'a list) =>
case xs
of [ ] => _goal3 : bool
| y :: ys => _goal4 : bool

Now instead of one hole we have two holes, each of type bool. But we have used xs, so that’s progress. What’s next? Let’s look at the first hole.

This first hole still has type bool. And we now know that xs is empty, so we have used all the information available in xs, but we still haven’t used p. Which of the following tactics is viable for fulfilling our goal of type bool? Mark all that apply.

1. Look for a value of type 'a to apply p to
2. Give up on p and fill the hole with true
3. Give up on p and fill the hole with false
• The hole _goal3 has a type bool. We know everything available to be known about xs (namely, it’s empty). All we have left is p.

And p has type 'a -> bool, with no prospect of 'a to apply it to. But we have to fill the hole.

Our only choices are true and false. Which would you like to pick?

Let’s flip a coin. My coin comes up true. So now we have

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) =
fn (xs : 'a list) =>
case xs
of [ ] => true
| y :: ys => _goal4 : bool

Our final goal has type bool, but now we have more Booleans than we know what to do with: both p y and lift p ys. What answer best describes what we should do?

1. Ignore p y and fill the hole with lift p ys
2. Ignore lift p ys and fill the hole with p y
3. Combine the two Booleans using andalso
4. Combine the two Booleans using orelse
5. Two Booleans can be combined in 16 different ways, and they are all equally good.

We have to fill that last hole. And we have new variables y and ys. What are their types? How do we know?

y : 'a
ys : 'a list

Add them to everything we have and it’s now

lift : ('a -> bool) -> ('a list -> bool)
p : 'a -> bool
y : 'a
ys : 'a list

How can we use y and ys?

• We can apply p y and that’s an expression of bool.

• But we’d also like to use ys.

Q: Can we use ys?

A: Yes! We can all lift p recursively. We know it terminates because the list argument keeps getting smaller.

So now these are the things we have:

lift : ('a -> bool) -> ('a list -> bool)
p : 'a -> bool
y : 'a
ys : 'a list

p y : bool
lift p ys : bool

We have just one hole to fill, but we have two terms of type bool. If we want to use all our information, both y and ys, we’d better combine them.

Q: What are common ways to combine two Booleans?

A: Two good ways are andalso and orelse.

Q: Which one do you like here?

A: Let’s think about both choices:

• Q: Suppose we pick orelse here. How does lift behave?

A: It always returns true

• Q: Suppose we pick andalso here. How does lift behave?

A: Hey! It behaves just like the all function! I know that function!

Q: Nice. And here it is:

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift (p : 'a -> bool) =
fn (xs : 'a list) =>
case xs
of [ ] => true
| y :: ys => p y andalso lift p ys

Or if we pretty it up a little,

val lift : forall 'a . ('a -> bool) -> ('a list -> bool)

fun lift p xs =
case xs
of [ ] => true
| y :: ys => p y andalso lift p ys

or even

fun lift p []        = true
| lift p (y :: ys) = p y andalso lift p ys

### Making types work for you: The false case

Q: Suppose back at the empty-list step, the coin had come down differently and we had used true there. What would we want to do then?

A: I know this one! We combine two Booleans with orelse instead of andalso! That’s the exists function!

Q: Indeed:

fun lift' p []        = false
| lift' p (y :: ys) = p y orelse lift p ys

Potential bonus content: inorder traversal of binary tree.

val find : forall 'a . ('a -> bool) -> ('a list -> 'a option)

Bonus content

The rest of this slide deck is “bonus content”

New vocabulary for ML

Data:

• Constructed data
• Value constructor

Code:

• Pattern
• Pattern matching
• Clausal definition
• Clause

Types:

• Type variable (’a)

Structure of algebraic types

An algebraic data type is a collection of alternatives

• Each alternative must have a name

The thing named is the value constructor

(Also called “datatype constructor”)

## Additional language support for algebraic types: case expressions

“Eliminate” values of algebraic types

New language construct case (an expression)

fun length xs =
case xs
of []      => 0
| (x::xs) => 1 + length xs

Clausal definition is preferred
(sugar for val rec, fn, case)

works for any datatype

 fun toStr t =
case t
of EHEAP => "empty heap"
| HEAP (v, left, right) =>
"nonempty heap"

But often a clausal definition is better style:

 fun toStr' EHEAP = "empty heap"
| toStr' (HEAP (v,left,right)) =
"nonempty heap"

## Bonus content: Exceptions — Handling unusual circumstances

### Syntax:

• Definition: exception EmptyQueue
• Introduction: raise e where e : exn
• Elimination: e1 handle pat => e2

### Informal Semantics:

• alternative to normal termination
• can happen to any expression
• tied to function call
• if evaluation of body raises exn, call raises exn
• Handler uses pattern matching

e handle pat1 => e1 | pat2 => e2

Exception handling in action

loop (evaldef (reader (), rho, echo))
handle EOF            => finish ()
| Div               => continue "Division by zero"
| Overflow          => continue "Arith overflow"
| RuntimeError msg  => continue ("error: " ^ msg)
| IO.Io {name, ...} => continue ("I/O error: " ^
name)
| SyntaxError msg   => continue ("error: " ^ msg)


## Bonus Content: A different explanation of ML Datatypes

Tidbits:

• The most important idea in ML!

• Originated with Hope (Burstall, MacQueen, Sannella), in the same lab as ML, at the same time (Edinburgh!)

Notes:

• A “suit” is constructed using HEARTS, DIAMONDS, CLUBS, or SPADES

• A “list of A” is constructed using nil or a :: as, where a is an A and as is a “list of A”

• A “heap of A” is either empty or it’s an A and two child heaps

Datatype definitions

datatype  suit    = HEARTS | DIAMONDS | CLUBS | SPADES

datatype 'a list  = nil           (* copy me NOT! *)
| op :: of 'a * 'a list

datatype 'a heap  = EHEAP
| HEAP of 'a * 'a heap * 'a heap

type suit          val HEARTS : suit, ...
type 'a list       val nil   : forall 'a . 'a list
val op :: : forall 'a .
'a * 'a list -> 'a list
type 'a heap
val EHEAP: forall 'a.                          'a heap
val HEAP : forall 'a.'a * 'a heap * 'a heap -> 'a heap


Exegesis (on board):

• Notation 'a is a type variable

• On left-hand side, it is a formal type parameter
• On right-hand side it is an ordinary type
• In both cases it represents a single unknown type
• Name before = introduces a new type constructor into the type environment. Type constructors may be nullary.

• Alternatives separated by bars are value constructors of the type

• They are new and hide previous names

• (Do not hide built-in names nil and list from the initial basis!)

• Value constructors build constructed data

• Value constructors participate in pattern matching

• Complete by themselves: HEARTS, SPADES, nil

• Expect parameters to make a value or pattern: ::, HEAP

• The value a constructor is applied to is said to be carried

• Every val on this slide is a value constructor

• op enables an infix operator to appear in a nonfix context

• Type application is postfix

• A list of integer lists is written: int list list
• New names into two environments:

• suit, list, heap stand for new type constructors

• HEARTS, CLUBS, nil, ::, EHEAP, HEAP stand for new value constructors

• Algebraic datatypes are inherently inductive (list appears in its own definition)—to you, that means finite trees

• 'a * 'a list is a pair typeinfix operators are always applied to pairs

## Bonus Content: ML traps and pitfalls

Order of clauses matters


fun take n (x::xs) = x :: take (n-1) xs
| take 0 xs      = []
| take n []      = []

(* what goes wrong? *)


- fun plus x y = x + y;
> val plus = fn : int -> int -> int
- fun plus x y = x + y : real;
> val plus = fn : real -> real -> real


Gotcha — parentheses

Put parentheses around anything with |

case, handle, fn

Function application has higher precedence than any infix operator

## Bonus content (seen in examples)

### Syntactic sugar for lists

Syntactic sugar for lists

- 1 :: 2 :: 3 :: 4 :: nil; (* :: associates to the right *)
> val it = [1, 2, 3, 4] : int list

- "the" :: "ML" :: "follies" :: [];
> val it = ["the", "ML", "follies"] : string list

> concat it;
val it = "theMLfollies" : string


## Bonus content: ML from 10,000 feet

### Environments

The value environment

Names bound to immutable values

Immutable ref and array values point to mutable locations

ML has no binding-changing assignment

Definitions add new bindings (hide old ones):

val pattern = exp
val rec pattern = exp
fun ident patterns = exp
datatype … = …

Nesting environments

At top level, definitions

Definitions contain expressions:

def ::= val pattern = exp

Expressions contain definitions:

exp ::= let defs in exp end

Sequence of defs has let-star semantics

### Patterns

What is a pattern?

pattern ::= variable
| wildcard
| value-constructor [pattern]
| tuple-pattern
| record-pattern
| integer-literal
| list-pattern

Design bug: no lexical distinction between

• VALUE CONSTRUCTORS
• variables

Workaround: programming convention

### Functions

Function pecularities: 1 argument

Each function takes 1 argument, returns 1 result

For “multiple arguments,” use tuples!

 fun factorial n =
let fun f (i, prod) =
if i > n then prod else f (i+1, i*prod)
in  f (1, 1)
end

fun factorial n =  (* you can also Curry *)
let fun f i prod =
if i > n then prod else f (i+1) (i*prod)
in  f 1 1
end

Tuples are “usual and customary.”

# 19 Oct 2020: Types

There are PDF slides for 10/20/2020.

### Retrospective: Evaluating ML

You are now in a position to start to evaluate ML.

Please enter at least one strength of ML.

Please enter at least one weakness of ML.

### Reflection: Catching errors with types

We have seen that one of the key differences between μScheme and ML is that ML has a static type system. One of the key roles of a type system is to detect errors. List at least two kinds of errors that type systems can detect at compile time (Hint: think about the kinds of type errors the ML compiler complains about).

Type systems aren’t omniscient, though. Now list at least one kind of error that ML’s type system can’t find and do your best to explain why it can’t.

Type systems

What they do:

• Guide coding
• Document code (checked by compiler!)
• Rule out certain errors
• Provide information to compiler

How they work

• Constrain what values can be at run time

World’s most widely deployed static analysis

Where type systems are found:

• C#, Swift, Go
• Java, Scala, Rust (polymorphism)
• ML, Haskell, TypeScript, Hack (inference)

Our trajectory:

• Formalize familiar, monomorphic type systems (like C)
• Learn polymorphic type systems
• Eventually, infer polymorphic types

Monomorphic types systems are the past.
Inference and polymorphism are the present and future.
(Visible trend just like lambda.)

This “lecture”:

• Formal type system with only two types:
• words and flags
• Type checker derived from inference rules

Next time:

• Unbounded number of types! (Formation, introduction, elimination)

## Type systems

What kind of thing is it?

Types classify terms’’

“Term” is theory word for “syntax”:

n + 1  : int

"hello" ^ "world"  : string

(fn n => n * (n - 1))  : int -> int

if p then 1 else 0 : int, provided  p : bool

• What kind of value does it evaluate to (if it terminates)?

• What is the contract of the function?

• Is each function called with the right number of arguments?
(And similar errors)

• Who has the rights to look at it/change it?

• Is the number miles or millimeters?

Questions type systems typically do not answer:

• Can it divide by zero?

• Can it access an array out of bounds?

• Can it take the car of '()?

• Will it terminate?

### Decidability and Type Checking

Suppose L is a Turing-Complete Language.

P is the set of programs in L that terminate.

Wish: a type system to statically classify terminating programs:

Expression e in L has type T (e : T) iff e terminates.

But: Undecideable!

We can prove no such type system exists. (See Comp170 for the Halting Problem)

Choices:

• Allow type checker to run forever.

• Don’t use type system to track termination.

### Static vs. Dynamic Type Checking

Most languages use a combination of static and dynamic checks

Static: “for all inputs”

• input independent
• efficient at run-time
• approximate : rules out some programs that won’t trigger errors
example: (if false then 2 else "hi") ^ "there"

Dynamic: “for some inputs”

• depends on input
• precise

### Check your understanding: Static vs. Dynamic Type Systems

Type systems can statically detect all errors.

• True
• False

Static type systems generally use a combiniation of static and dynamic checks.

• True
• False

Dynamic type systems impose runtime overhead.

• True
• False

## Type System and Checker for a Simple Language

Why do we write a type checker?

• Canonical example of static analysis. Used pervasively in industry in code improvement and security analysis.

• Great practice connecting math to code. To be educated about programming languages, you must be able to realize inference rules in code. Eventually you should learn to “speak” inference rules like a native. Implementing a type system is a valuable way to build these competencies.

• Tool for structuring language design, which applies to input validation. If (when!) you get to do your own language designs, type systems are an area where you are most likely to be able to innovate. The ideas behind type systems apply any time you need to validate user input, for example. This is the highest level of cognitive task: creation of new things.

Three questions any type system must answer

1. What types do I have?

2. Given a type, how do I create a value

3. What can I do with a value (of that type)?

Simple Type System Example

Simple language of machine-level expressions

What types do I have?

• word predicts a machine word
(in a general-purpose register)

• flag predicts a single bit
(in a flags register)

Type-system example (machine model)

How do I create a word?

• Arithmetic

What can I do with a word?

• Arithmetic, comparison

How do I create a flag?

• Comparison

What can I do with a flag?

• Drive conditional (IF)

Language of expressions to type

Words and flags:

datatype exp = ARITH of arithop * exp * exp
| CMP   of relop   * exp * exp
| LIT   of int
| IF    of exp     * exp * exp
and      arithop = PLUS | MINUS | TIMES | ...
and      relop   = EQ | NE | LT | LE | GT | GE

datatype ty = WORDTY | FLAGTY

(Looks a lot like int and bool)

### Examples to rule out

Can’t add a word and a flag:

3 + (3 < 99)

(ARITH(PLUS, LIT 3, CMP (LT, LIT 3, LIT 99)))

Can’t compare a word and a flag

(3 < (4 = 24))

CMP (LT, LIT 3, CMP(EQ (LIT 4, LIT 24)))

### Check your understanding: AST Representation of Simple Language

Recall the AST for the simple machine language:

datatype exp = ARITH of arithop * exp * exp
| CMP   of relop   * exp * exp
| LIT   of int
| IF    of exp     * exp * exp
and      arithop = PLUS | MINUS | TIMES | ...
and      relop   = EQ | NE | LT | LE | GT | GE

datatype ty = WORDTY | FLAGTY

The AST term

CMP(GT, (CMP(EQ, LIT 105, LIT 170)), LIT 160)

which encodes the expression

((105 = 170) > 160)

should be flagged as a type error.

• True
• False

### Inference rules to define a type system

• Form of judgment Context |- term : type

Written |- e : tau

Contexts vary between type systems

(Right now, the empty context)

• Judgment is proved by derivation

• Derivation made using inference rules

• Inference rules determine how to code val typeof : exp -> ty:

Given e, find tau such that |- e : tau

### Rule for arithmetic operators

Informal example:

|- 3 : word    |- 5 : word
-------------------------
|- 3 + 5 : word

Rules out:

|- 'a' : char    |- 5 : word
---------------------------
|- 'a' + 5 : ???

General form:

|- e1 : word    |- e2 : word
-----------------------------
|- ARITH ( _ , e1, e2) : word

### Rule for comparisons

Informal example:

|- 7 : word    |- 10 : word
-----------------------------
|- 7 < 10 : flag

General form:

|- e1 : word    |- e2 : word
-----------------------------
|- CMP ( _ , e1, e2) : flag

### Rule for literals

Informal example:

|- 14 : word

General form:

--------------------
|- LIT (n) : word

### Rule for conditionals:

Informal example:

|- true : flag
|- 3    : word
|- 42   : word
--------------------------
|- IF (true, 3, 42) : word

General form:

|- e : flag
|- e1 : tau1
|- e2 : tau2      tau1 equiv tau2
-----------------------------------
|- IF ( e, e1, e2) : tau1

Typing rules let us read off what a type checker needs to do.

• input to checker: e

• output from checker: tau

## Type checker in ML

val typeof : exp -> ty
exception IllTyped
fun typeof (ARITH (_, e1, e2)) =
case (typeof e1, typeof e2)
of (WORDTY, WORDTY) => WORDTY
| _              => raise IllTyped
| typeof (CMP (_, e1, e2)) =
case (typeof e1, typeof e2)
of (WORDTY, WORDTY) => FLAGTY
| _              => raise IllTyped
| typeof (LIT _) = WORDTY
| typeof (IF (e,e1,e2)) =
case (typeof e, typeof e1, typeof e2)
of (FLAGTY, tau1, tau2) =>
if eqType (tau1, tau2)
then tau1 else raise IllTyped
| _                    => raise IllTyped

### Check your understanding: A type checker from inference rules

Recall the AST for the simple machine language:

datatype exp = ARITH of arithop * exp * exp
| CMP   of relop   * exp * exp
| LIT   of int
| IF    of exp     * exp * exp
and      arithop = PLUS | MINUS | TIMES | ...
and      relop   = EQ | NE | LT | LE | GT | GE

datatype ty = WORDTY | FLAGTY

and the inference rule for comparisons:

    |- e1 : word    |- e2 : word
-----------------------------
|- CMP ( _ , e1, e2) : flag


and our in-progress type checking code typeof

    val typeof : exp -> ty
exception IllTyped
fun typeof (ARITH (_, e1, e2)) =
case (typeof e1, typeof e2)
of (WORDTY, WORDTY) => WORDTY
| _              => raise IllTyped
| typeof (CMP (_, e1, e2)) =
case (BLANK1, BLANK2)
of (BLANK3, BLANK4) => BLANK5
| _              => raise IllTyped
| typeof (LIT _) = WORDTY
| typeof (IF (e,e1,e2)) =
case (typeof e, typeof e1, typeof e2)
of (FLAGTY, tau1, tau2) =>
if eqType (tau1, tau2)
then tau1 else raise IllTyped
| _                    => raise IllTyped


Fill in the blanks based on the comparison inference rule to complete the code for the CMP case.

An implementor’s trick: If you see identical types in a rule,

• Give each type a distinct subscript

• Introduce equality constraints

• Remember to be careful using primitive equality to check types—you are better off with eqType.

## Typing Rules: Contexts and Term Variables

    datatype exp = ARITH of arithop * exp * exp
| CMP   of relop   * exp * exp
| LIT   of int
| IF    of exp     * exp * exp
| VAR   of name
| LET   of name    * exp * exp
and      arithop = PLUS | MINUS | TIMES | ...
and      relop   = EQ | NE | LT | LE | GT | GE

datatype ty = WORDTY | FLAGTY


New forms to support variables

    datatype exp = ...
| VAR   of name
| LET   of name    * exp * exp

datatype ty = WORDTY | FLAGTY


What could go wrong with a variable?

• Used inconsistently:

;; x can’t be both an integer and a list

x + x @ x

;; y can’t be both an integer and a string

let y = 10 in y ^ “hello” end

So we need to track variable use to ensure consistency

Key idea: Type environment (Gamma) tracks the types of variables.

### Rule for var

x in dom Gamma        tau = Gamma(x)
----------------------------------------
Gamma |- VAR x : tau

So as long as x is recorded in Gamma as a WORDTY, it can only be used as a WORDTY.

### Rule for let

Gamma          |- e1 : tau1
Gamma{x->tau1} |- e2 : tau2
-------------------------------------
Gamma |- LET x = e1 in e2 : tau2

### Type Checker

Type checker needs Gamma – gives type of each “term variable”.

Type checking for variables

val typeof : exp * ty env -> ty
fun typeof (ARITH ..., Gamma ) = <as before>
| typeof (VAR x, Gamma)      =
(case maybeFind (x, Gamma)
of SOME tau => tau
| NONE     => raise IllTyped)
| typeof (LET (x, e1, e2), Gamma) =
<your turn>

Given the typing rule for the let form we added to the simple machine language:

    Gamma          |- e1 : tau1
Gamma{x->tau1} |- e2 : tau2
-------------------------------------
Gamma |- LET x = e1 in e2 : tau2

Fill in the missing blanks to complete the type checker we started in the video:

    val typeof : exp * ty env -> ty
fun typeof (ARITH ..., Gamma ) = <as before>
| typeof (VAR x, Gamma)      =
(case maybeFind (x, Gamma)
of SOME tau => tau
| NONE     => raise IllTyped)
| typeof (LET (x, e1, e2), Gamma) =
let tau1 = typeof (BLANK1, BLANK2)
in  BLANK3 (BLANK4, extend Gamma x BLANK5)
end


### Taking Stock

• I gave you syntax for simple language.

• We came up with formal typing rules.

• We used the typing rules to implement the type checker.

• You will design new syntax and typing rules for lists

• You will extend an existing type checker

• You will implement a full type checker from scratch

This is a big chunk of what language designers do.

### What is a type?

• Weak: a set of values

• Better: a conservative prediction about values

• Best: the precise definition: classifier for terms!!

• The relationship to values becomes a proof obligation.

• Note: a computation can have a type even if it doesn’t terminate! (Or doesn’t produce a value)

Source of new language ideas for next 20 years

Needed if you want to understand advanced designs (or create your own)

Type soundness

Key theorem: prediction is accurate

• Will state more precisely next week

• Best explanation for how/why type system works

• Proof beyond the scope of 105

### What types can rule out

things that could go wrong:

(8 < 10) + 4

(8 == 8) < 9

x + (x :: xs)

let val y = 10 in length y end

# 21 October 2020: Type checking with type constructors

There are PDF slides for 10/22/2020.

should have two separate answer boxes: one for what they have learned and one for what they have questions about.

### Reflection: Simple type checking

In the first part of this module on type checking, we talked about the purposes and limitations of type systems, we formalized a simple type system with just two types using inference rules, and we converted those inference rules into a type checker.

Describe two specific things you have learned so far in this module.

Write one question you have about what we’ve learned so far.

### Reflection: How many types?

In the previous “lecture”, we studied a type system with just two types: words and flags. In this “lecture”, we’re going to add type constructors such as function types and lists. When we add lists" to a language, how many types are we adding?

## Where we’ve been

New watershed in the homework

• You’ve been developing and polishing programming skills: recursion, higher-order functions, using types to your advantage. But the problems have been mostly simple problems around simple data structures, mostly lists.

• You’ve already seen everything you need to know to implement a basic type checker, and you are almost fully equipped to add array operations and types to Typed Impcore.

## Where we’re going

• We’re now going to shift and spend the next several weeks doing real programming-languages stuff, starting with type systems.

• What’s next is much more sophisticated type systems, with an infinite number of types.

Focus on two questions about type systems:

• What is and is not a good type, that is, a classifier for terms?

• How shall we represent types?

We’ll look at these questions in two contexts: monomorphic (this “lecture”) and polymorphic (next “lecture”) languages.

## Monomorphic vs Polymorphic Types

Monomorphic types have “one shape.”

• Examples: int, bool, int -> bool, int * int

Polymorphic types have “many shapes.”

• Examples:

(forall ('a) (list 'a))
(forall ('a) ((list 'a) -> int))
(forall ('a) ((list 'a) -> (list 'a)))

Substitute any type you like for 'a.

• ML Examples: 'a list, 'a list -> int, 'a list -> 'a list

In ML, forall is only at the top level, and it isn’t written explicitly

## Designing monomorphic type systems

Mechanisms:

• Every new variety of type requires special syntax (examples: structs, pointers, arrays)

• Implementation is a straightforward application of what you already know.

Language designer’s process:

• What new types do I have?

• Formation rules
• How do I create new values with that type?

• Introduction rules

For introduce think “produce”, “create”, “make”, “define”

• What new syntax do I have to observe terms of a type?

• Elimination rules

For eliminate think “consume”, “examine”, “interrogate”, “look inside”, or “take apart”, “observe”, “use”, “mutate”

Words “introduce” and “eliminate” are the native vocabulary of type-theoretic language design—it’s like knowing what to say when somebody sneezes.

### Managing the set of types: Type formation

Examples: Not yet types

Types in waiting’’ (yet)

Examples: Not types at all!

These are

Type-formation rules

Are about not trusting the source code

We need a way to classify type expressions into:

• types that classify terms

• type constructors that build types

• nonsense that doesn’t mean anything

Type judgments for monomorphic system

Two judgments:

• The familiar typing judgment Γ ⊢ e : τ

• Now we add the judgment “τ is a type”

### Check your understanding: Types, type constructors, or garbage

In each of the following, indicate whether the “type” is in fact a type (ie, something that classifies terms), a type constructor (ie, something that will classify terms when applied to some number of type arguments), or utter nonsense.

bool list

• type
• type constructor
• utter nonsense

array

• type
• type constructor
• utter nonsense

bool bool

• type
• type constructor
• utter nonsense

->

• type
• type constructor
• utter nonsense

These rules are actually the same as we saw for the simple arithmetic language, but we’ll review them again here.

### Check your understanding: Checking type equalities

We saw how to modify the typing rule for Set to explicitly mark where two types were being compared for equality. The typing rule for If that we saw in the video has a similar implicit type equality check. Fill in the missing blanks in the typing rule below to create a version with an explicit equality check.

## Classic types for data structures

At run time, these pairs are identical to cons, car, cdr.

These rules dictate type checker code just like in previous “lecture”.

Typical syntactic support for types

Explicit types on lambda and define:

• For lambda, argument types:

(lambda ([n : int] [m : int]) (+ (* n n) (* m m)))
• For define, argument and result types:

(define int max ([x : int] [y : int])
(if (< x y) y x))

Abstract syntax:

datatype exp = ...
| LAMBDA of (name * tyex) list * exp
...
datatype def = ...
| DEFINE of name * tyex * ((name * tyex) list * exp)
...

### Check your understanding: Writing Typing Rules

ML References are similar to C/C++ pointers. Given the following surface syntax (on the left) and corresponding AST represenations (on the right):

Surface Syntax AST
ref τ ref(τ)
ref e ref-make(e)
!e ref-get(e)
e1 := e2 ref-set(e1,e2)

fill in the blanks to correctly complete the following formation, introduction, and elimination rules.

=10pt

### Coding the arrow-introduction rule

Type-checking LAMBDA

datatype exp = LAMBDA of (name * tyex) list * exp
...
fun ty (Gamma, LAMBDA (formals, body)) =
let val Gamma' = (* body gets new env *)
foldl (fn ((x, ty), g) => bind (x, ty, g))
Gamma formals
val bodytype = ty (Gamma', body)
val formaltypes =
map (fn (x, ty) => ty) formals
in  FUNTY (formaltypes, bodytype)
end


### Approaching types as programmers

Understanding language design

Questions about types never seen before:

• What types can I make?
• What syntax goes with each form?
• To create values?
• To do things with values?
• What functions?

Examples: pointer, struct, function, record

Talking type theory

Formation: make new types

Introduction: make new values

Elimination: observe (“take apart”) existing values

## Talking type theory: Introduction and elimination constructs

Part of learning any new field: talk to people in their native vocabulary

• Introduce means “produce”, “create”, “make”, “define”

• Eliminate means “consume”, “examine”, “observe”, “use”, “mutate”

It’s like knowing what to say when somebody sneezes.

# 26 Oct 2020: Polymorphic Type Checking

There are PDF slides for 10/27/2020.

### Today

• Burdens of monomorphism
• On designers and implementors
• On programmers
• Lift burden with quantified types
• Extensible type formation

## Reflection: Type checking type constructors

In the previous two “lectures” we studied how to type check a known set of types. In each case, the syntax for the type was hard-coded into the AST for the language. Type checking the creation and uses of values with the type were also hard-coded into the type checker. That approach works fine if the language designers know all the types in advance. But, in any language worth its salt, programmers can create new types, (for example, by defining algebraic datatypes in ML, structs in C, or classes in Java). The language has to be able to type check these new types without having to recompile the language implementation first.

Spend a few minutes thinking about how we might redesign the implementation of the type checker to accommodate user-defined types.

## Limitations of monomorphic type systems

With monomorphism, new types are costly

Closed world

• Only language designer can add a new type constructor

A new type constructor (“array”) requires

• New expression and type syntax
• New internal representation (type formation)
• New type rules (intro, elim)
• New code in type checker
• New or revised proof of soundness

Notes:

• Implementing arrays on homework
• Writing rules for lists on homework

Costly for programmers

User-defined functions are monomorphic:

(check-function-type swap
([array bool] int int -> unit))
(define unit swap ([a : [array bool]]
[i : int]
[j : int])
(begin
(set tmp       (array-at a i))
(array-put a i (array-at a j))
(array-put a j tmp)
(begin)))

### Check your understanding: Monomorphism limitations

If a language has a monomorphic type system, it will not let programmers write a single length function for lists. Instead, programmers must write a length function for each kind of list they want to calculate the length of, eg, a length function for lists of booleans, and another for lists of integers, etc.

• True
• False

## Quantified types

Idea: Use functions not syntax to represent new types

Benefits:

• No new syntax
• No new internal representation
• No new type rules
• No new code in type checker
• No new proof of soundness
• Programmers can add new types

Requires: more expressive function types

Better type for a swap function

(check-type
swap
(forall ('a) ([array 'a] int int -> unit)))

### Check your understanding: Quantified Types

The type theory expression and the expression are two different representations for the same polymorphic type.

• True
• False

In the type (forall ('a) ([list 'a] -> 'a)), the notation 'a denotes a type variable that stands for the type of element in the list argument and the type of the result returned by the function.

• True
• False

The type (forall ('a) ([list 'a] -> [list 'a])) is the type of the car function.

• True
• False

Programming with quantified types

Substitute for quantified variables: “instantiate”

-> length
<function> : (forall ('a) ((list 'a) -> int))
-> [@ length int]
<function> : ((list int) -> int)
-> (length '(1 2 3))
type error: function is polymorphic;
instantiate before applying
-> ([@ length int] '(1 2 3))
3 : int

Instantiate as you like

-> length
: (forall ('a) ((list 'a) -> int))
-> [@ length bool]
: ((list bool) -> int)
-> ([@ length bool] '(#t #f))
2 : int


More instantiations

-> (val length-int [@ length int])
length-int : ((list int) -> int)
-> (val cons-bool [@ cons bool])
cons-bool : ((bool (list bool)) -> (list bool))
-> (val cdr-sym [@ cdr sym])
cdr-sym : ((list sym) -> (list sym))
-> (val empty-int [@ '() int])
() : (list int)


Bonus instantiation:

-> map
<function> :
(forall ['a 'b]
(('a -> 'b) (list 'a) -> (list 'b)))
-> [@ map int bool]
<function> :
((int -> bool) (list int) -> (list bool))

Polymorphic array swap

(check-type swap
(forall ('a) ([array 'a] int int -> unit)))

(val swap
(type-lambda ('a)
(lambda ([a : (array 'a)]
[i : int]
[j : int])
(let ([tmp ([@ Array.at 'a] a i)])
(begin
([@ Array.put 'a] a i ([@ Array.at 'a] a j))
([@ Array.put 'a] a j tmp))))))


Two forms of abstraction:

Power comes at high notational cost

Function composition

-> (val o (type-lambda ['a 'b 'c]
(lambda ([f : ('b -> 'c)]
[g : ('a -> 'b)])
(lambda ([x : 'a]) (f (g x))))))

o : (forall ('a 'b 'c)
(('b -> 'c) ('a -> 'b) -> ('a -> 'c)))

Aka o :

### Coding Interlude: Instantiating Polymorphic Functions

Enter the length function into the interpreter to observe its type.

Apply the length function to the list [1,2,3]. Observe the response.

Create two instantiations of the length function: one for ints and one for bools.

Apply the length-int function to the list [1,2,3] and the length-bool function to the list [#t, #f] and observe the result.

What do you see as the advantages and disadvantages of ’s approach to polymorphism so far?

## Polymorphic Type Checking

### Representing quantified types

Representing quantified types

Two new alternatives for tyex:

datatype tyex
= TYCON  of name                // int
| CONAPP of tyex * tyex list    // (list bool)
| FUNTY  of tyex list * tyex    // (int int -> bool)
| TYVAR  of name                // 'a
| FORALL of name list * tyex    // (forall ('a) ...)

### Check your understanding: Representation of quantified types

Given the tyex datatype:

datatype tyex
= TYCON  of name                // int
| CONAPP of tyex * tyex list    // (list bool)
| FUNTY  of tyex list * tyex    // (int int -> bool)
| TYVAR  of name                // 'a
| FORALL of name list * tyex    // (forall ('a) ...)

Match the surface syntax for polymorphic types with their representations using the tyex datatype.

Surface syntax:

1. bool
2. 'a
3. (list int)
4. (bool bool -> int)
5. (forall ['b] ((list 'b) -> int))

Represenation:

1. FUNTY ([TYCON "bool", TYCON "bool"], TYCON "int")
2. TYVAR "a"
3. FORALL ("b", FUNTY ([CONAPP (TYCON "list", [TYVAR "b"])], TYCON "int"))
4. TYCON "bool"
5. CONAPP (TYCON "list", [TYCON "int"])

### Check your understanding: Rules for type-lambda

1. In the new judgement form for type checking type-lambda, the meta-variable Δ is a kind environment that records the type variables that have been used in typing a program.
• True
• False
1. The AST term tylambda(α1,…,αn, e) is the representation of the surface syntax (@ e τ1 ... τn)
• True
• False
1. In the introduction rule for type-lambda, it is okay if the type variables to be quantified over already appear in the context Γ.
• True
• False
1. In the elimination rule for type-lambda, the resulting type is obtained by substituting the type arguments for the corresponding quantified type variables. If the type of the expression e is quantified over m type variables and e is applied to n type arguments, a type error should result if n ≠ m.
• True
• False

## Type formation through kinds

What have we gained?

No more type-specific introduction rules:

No more type-specific elimination rules:

But, we still need formation rules

User types can’t be blindly trusted

-> (lambda ([a : array]) (Array.size a))
type error: used type constructor array' as a type
-> (lambda ([x : (bool int)]) x)
type error: tried to apply type bool as type constructor
-> (@ car list)
type error: instantiated at type constructor list',
which is not a type

How can we know which types are OK?

### Check your understanding: Use kinds to classify types

For each type expression, select its kind or indicate it is not well formed.

1. char
• *
• * => *
• not well defined
1. list list
• *
• * => *
• not well defined
1. array
• *
• * => *
• not well defined

## Opening the closed world

Designer’s burden reduced

To extend Typed Impcore:

• New syntax
• New type rules
• New internal representation
• New code
• New soundness proof

To extend Typed μScheme, none of the above! Just

• New functions
• New primitive type constructor in Δ

You’ll do arrays both ways

Programmer’s power increased?

Typed Impcore:

• Closed world (no new types)

• Simple formation rules

Typed μScheme:

• Semi-closed world (new type variables)

• Types are formed via explicit abstraction and instantiation

Standard ML:

• Open world (programmers create new types)

• How are types formed (from other types)?

# 28 October 2020: Introduction to type inference

There are PDF slides for 10/29/2020.

### Retrospective: Type Checking

Write one thing you learned in the previous module about type checking and one question you have about that material.

Your responses may be anonymously shared with the rest of the class.

### Reflection: How does the compiler know?

Consider the following ML code for the append function:

fun append (x::xs) ys = x :: append xs ys
| append []      ys = ys

When you enter this code into the ML comiler, it returns the type of the function.

Describe how you think the ML compiler might figure out the type of the function.

Why study type inference?

• Useful in its own right (as we’ll see shortly)
• Essential in modern languages with complex type systems
• Canonical example of Static Analysis. Key for:
• Program optimization
• Software engineering
• Cybersecurity

What type inference accomplishes

-> (define     double (x)       (+ x x))
double                         ;; uScheme
-> (define int double ([x : int]) (+ x x))
double : (int -> int)          ;; Typed uSch.
-> (define     double (x)       (+ x x))
double : (int -> int)          ;; nML


What else type inference accomplishes

-> ([@ cons bool] #t ([@ cons bool] #f [@ '() bool]))
(#t #f) : (list bool)    ;; typed uScheme
-> (   cons       #t (   cons       #f    '()      ))
(#t #f) : (list bool)    ;; nML


How it works

1. For each unknown type, a fresh type variable

2. Every typing rule adds equality constraints

3. Instantiate every variable automatically

4. Introduce polymorphism at let/val bindings

Plan of study

• First, we’ll walk through several examples, showing how to generate constraints

• Then, you’ll do an example

• Then, we focus on solving constraints

• For the next homework, you’ll implement both algorithms

#### Let’s do some examples

N.B. Book is “constraints first;” lecture will be “type system first.” Use whatever way works for you

##### Example: (val-rec double (lambda (x) (+ x x)))

What do we know?

• double has type a1

• x has type a2

• + has type int * int -> int

• (+ x x) is an application, what does it require?

• a2 ~ int and a2 ~ int
• Is this possible?

Key idea: Record the constraint in a typing judgement.

'a2 ~ int /\ 'a2 ~ int,
{ double : 'a1, x : 'a2 } |- (+ x x) : int
##### Example: (if y 1 0)
• y has type a3, 1 has type int, 0 has type int

• Requires what constraints? (int ~ int, a3 ~ bool)

##### Example: (if z z (- 0 z))
• z has type a3, 0 has type int, - has type int * int -> int

• Requires what constraints? (a3 ~ bool /\ int ~ int /\ a3 ~ int)

• These constraints are not solveable so the program is ill-typed.

### Check your understanding: Gathering constraints

1. In this judgment
    'ax ~ int /\ 'ay ~ int,
{x : 'ax, y : 'ay } |- (+ x y) : int

the notation 'ay ~ int is a constraint that means that the type variable 'ay must be equal to type int.

• True
• False
1. In this judgment
    'ax ~ int /\ 'ay ~ int,
{x : 'ax, y : 'ay } |- (+ x y) : int

the notation y : 'ay means that expression variable y has type 'ay. Furthermore, because y has a type variable for its type, y’s type can become any syntactically valid type.

• True
• False
1. Which set of constraints might have been generated when running type inference on the following program fragment? Pick the most complete set that applies.

(if x (+ 1 y) z)

• 'ax ~ 'ay, 'ay ~ int, 'az ~ int
• 'ax ~ bool, 'ay ~ int, 'az ~ int
• 'ax ~ 'ay, 'ay ~ 'az
• 'ay ~ int, 'az ~ int
• 'ax ~ bool, 'ay ~ int

## Inferring polymorphic types

(val app2 (lambda (f x y)
(begin
(f x)
(f y))))

Assume f : 'a_f

Assume x : 'a_x

Assume y : 'a_y

f x implies 'a_f ~ ('a_x -> 'a1)

f y implies 'a_f ~ ('a_y -> 'a2)

Together, these constraints imply 'a_x ~ 'a_y and 'a1 ~ 'a2

begin implies result of function is 'a2

So,

app2 : (('a_x -> 'a2) 'a_x 'a_x -> 'a2)

'a_x and 'a2 aren’t mentioned anywhere else in program, so

we can generalize to:

(forall ('a_x 'a2) (('a_x -> 'a2) 'a_x 'a_x -> 'a2))

which is the same thing as:

app2 : (forall ('a 'b) (('a -> 'b) 'a 'a -> 'b))

### Check your understanding: Type inference with polymorphic functions

This question walks you through generating constraints for a simple function that makes two calls to a polymorphic function. It illustrates how those two calls can end up being used at different types.

Each subquestion builds on the previous one. The explanations given after each subquestion explain what you need to know for the next part.

Consider the following program:

(val cc (lambda (nss) (car_2 (car_1 nss))))

where car_1 and car_2 are two instances of the polymorphic car function that has type

car : forall 'a . 'a list -> 'a

For the purposes of this question we’ve named these uses car_1 and car_2 so we can be clear which use we are talking about as we study their effect on the type of the expression.

1. Variable nss is a parameter, and so during type checking, we assign it a fresh type variable, say 'b.

• True
• False
1. We need to instantiate the two instances of the polymorphic car function before we can apply them. Which set of instantiations is correct?
• car_1 : 'a1 list -> 'a1 and car_2 : 'a1 list -> 'a1

• car_1 : 'a1 list -> 'a2 and car_2 : 'a2 list -> 'a1

• car_1 : 'a2 list -> 'a1 and car_2 : 'a1 list -> 'a2

• car_1 : 'a1 list -> 'a1 and car_2 : 'a2 list -> 'a2

1. Given what we have already inferred about the types of car_1 and nss (and given in the answers in parts 1 and 2), which of the following constraints is generated by the application (car_1 nss)?
• 'b ~ 'a1
• 'b ~ 'a1 list
• 'a1 ~ 'a2 list
• 'b ~ 'a2 list
1. Given what we have already inferred about the types of car_1, car_2, and nss (and given in the answers in parts 1, 2, and 3), which of the following constraints is generated by the application of the function car_2 to the argument (car_1 nss) in the term (car_2 (car_1 nss)):
• 'a2 ~ 'a1
• 'a2 ~ 'a1 list
• 'b ~ 'a2
• 'a1 ~ 'a2 list

### Extra examples

Bonus example

-> (val second (lambda (xs) (car (cdr xs))))
second : ...
-> (val two    (lambda (f) (lambda (x) (f (f x)))))
two : ...


Bonus example solved

-> (val second (lambda (xs) (car (cdr xs))))
second : (forall ('a) ((list 'a) -> 'a))
-> (val two    (lambda (f) (lambda (x) (f (f x)))))
two :    (forall ('a) (('a -> 'a) -> ('a -> 'a)))


## Precise inference with Hindley-Milner types

Making Type Inference Precise

• Type inference for polymorphism is undecidable

Solution:

• Each formal parameter has a monomorphic type

Consequences:

• The argument to a higher-order function cannot be mandated to be polymorphic
• forall appears only outermost in types

Representing Hindley-Milner types

type tyvar = name
datatype ty
= TYVAR  of tyvar
| TYCON  of name
| CONAPP of ty * ty list

datatype type_scheme
= FORALL of tyvar list * ty

fun funtype (args, result) =
CONAPP (TYCON "function",
[CONAPP (TYCON "arguments", args),
result])


All the pieces

1. Hindley-Milner types
2. Bound names : σ, expressions : τ
3. Type inference yields type-equality constraint
4. Constraint solving produces substitution
5. Substitution refines types
6. Call solver, introduce polytypes at val
7. Call solver, introduce polytypes at all let forms

To code the type-inference algorithm, replace eqType with constraint generation!

### Check your understanding: Hindley-Milner Types

1. With Hindley-Milner types, the meta-variable τ cannot stand for a type variable α.
• True
• False
1. With Hindley-Milner types, the meta-variable σ represents a polymorphic type and has the form α1,… αn . τ.
• True
• False
1. Choose the appropriate values for the blanks in the following ML data structure to make it represent the the Hindley-Miler type scheme α . α list -> α.
FORALL (BLANK1,
CONAPP(BLANK2,
[CONAPP (TYCON "arguments", BLANK3),
TYVAR "alpha" ]))

BLANK1:

• “alpha”
• [“alpha”]
• TYVAR “alpha”

BLANK2:

• TYCON “function”
• [TYCON “function”]
• “function”

BLANK3:

• CONAPP(TYCON “list”, [TYVAR “alpha”])
• CONAPP(“list”, [TYVAR “alpha”])
• CONAPP(TYCON “list”, TYVAR “alpha”)
• CONAPP(“list”, “alpha”)
1. In Hindley-Miler type inference, type environments Γ always bind variables to type scheme σ. If a variable is monomorphic, the type scheme will quantify over an empty list of type variables (ie, be of the form . τ)
• True
• False
1. Match each step in Hindley-Milner type inference with its purpose.

Steps:

1. constraint generation
2. constraint solving
3. substitution
4. generaliation
5. instantiation

Purposes:

1. Refining types based on information inferred during constraint solving.
2. Collating clues to make sure expressions are used consistently.
3. Converting type scheme into a type so that it may be used in a given context.
4. Collecting clues about how expressions are used in the program.
5. Introducing type scheme to memorialize inference that given expression will work for any type.

# 2 November 2020: Making type inference formal

There are PDF slides for 11/3/2020.

### Retrospective: Introduction to type inference

Write one thing you learned about type inference so far and one question you have. Your answers may be anonymously shared with your classmates.

### Type inference algorithm, formally

All the pieces

1. Hindley-Milner types
2. Bound names : σ, expressions : τ
3. Type inference yields type-equality constraint
4. Constraint solving produces substitution
5. Substitution refines types
6. Call solver, introduce polytypes at val
7. Call solver, introduce polytypes at all let forms

Apply rule: e’s type could just be a type variable—we need to force it into the arrow form we need.

### What you know and can do now

Taking stock

You can complete typeof

• Takes e and Γ, returns τ and C

Except for let forms.

Next up: Solving constraints

### Check your understanding: Type inference algorithm

1. The judgment form C, Γ |- e : τ corresponds to the function typeof. For each element of the judgment, say whether that element is an input or an output of the function typeof:
• Constraint C:

• input
• output
• Context Γ:

• input
• output
• Expression e:

• input
• output
• Type τ:

• input
• output
1. Consider the typing rule for apply written in the new style:

A. The type τf is the type of the first argument to the function being applied.

• True
• False

B. The arguments to the function e1, …, en have types τ1, … τn, respectively.

• True
• False

C. The type variable α is the result type of the function.

• True
• False

### Reflection: Solving constraints

For each of the following constraints, say if it is solvable. If it is, give a solution.

 1. int           ~ bool
2. (list int)    ~ (list bool)
3. 'a            ~ int
4. 'a            ~ (list int)
5. 'a            ~ ((args int) -> int)
6. 'a            ~ 'a
7. (args 'a int) ~ (args bool 'b)
8. (args 'a int) ~ ((args bool) -> 'b)
9. 'a            ~ (pair 'a int)
10. 'a            ~ tau    // arbitrary tau

## What does it mean to solve constraints?

Representing constraints

But first, we need to represent them!

datatype con = ~   of ty  * ty
| /\  of con * con
| TRIVIAL
infix 4 ~
infix 3 /\

Example:

  TYVAR "alpha1" ~ TYCON "int" /\
TYVAR "alpha2" ~ TYCON "bool"


Two questions: what’s substitution, and when is a constraint satisfied?

just formalize it. It’s a function that behaves a certain way.

Constraint satisfaction: equal types mean equal constructors applied to equal arguments—same is in eqType.

### Check your understanding: Substitutions and satisfied constraints

1. The constraint τ1 ~ τ2 is satisfied if τ1 = τ2.
• True
• False
1. A substitution θ is a mapping from type variables to type variables, but not more complex types.
• True
• False
1. A substitution θ solves’’ a constraint C by ensuring all types related in C via the ~ equivalence relation are equal.
• True
• False

### Reflection: Solving type-equality constraints

One type of constraint we need to solve is that involving two types τ1 ∼ τ2.

Recall that each type can have one of three forms:

datatype ty
= TYVAR  of tyvar
| TYCON  of name
| CONAPP of ty * ty list

Briefly describe how you would determine whether the constraint τ1 ∼ τ2 was satsified.

How many potential cases are there to consider?

Solving type equalities: The nine cases

tau1 \ tau2    TyVar2    TyCon2      ConApp2

alpha2     mu2     (tau21,...,tau2n)tau2

TyVar1             1        2          3

TyCon1             S      mu1=mu2?     Fail

ConApp1            S        S          Recurse

I = immediate; S = symmetry;

1 = alpha1 -> alpha2

2 = alpha1 -> mu2

3 = if occurs check passes, alpha1 -> (tau21,...,tau2n)tau2

### Check your understanding: Solving constraints

1. When asked to produce a substitution θ that solves an equality constraint, which has the form τ1 ~ τ2, there are nine cases to consider. The number nine comes from the fact that each type has three possible forms: a type variable α, a type constructor μ, or a type application (τ1,…,τn)τ.
• True
• False
1. When asked to produce a substitution θ that solves an equality constraint, which has the form τ1 ~ τ2, if τ1 has the form μ1 and τ2 has the form μ2, then there is no solution.
• True
• False
1. When asked to produce a substitution θ that solves an equality constraint, which has the form τ1 ~ τ2, if τ1 has the form μ1 and τ2 has the form (τ1,…,τn)τ then there is no solution.
• True
• False
1. In the type system we have been studying, the constraint α ~ α * τ is solvable.
• True
• False
1. When solving the conjunction of constraints C1 C2, it is sufficient to produce a solution θ1 for constraint C1 and a separate solution θ2 for constraint C2, and then compose those substitutions to produce a solution θ1 o θ2 for the conjunction C1 C2.
• True
• False

### The inference algorithm, formally

All the pieces

1. Hindley-Milner types
2. Bound names : σ, expressions : τ
3. Type inference yields type-equality constraint
4. Constraint solving produces substitution
5. Substitution refines types
6. Call solver, introduce polytypes at val
7. Call solver, introduce polytypes at all let forms

What you can do so far

Write type inference for everything except VAL, VALREC, and LETX

Write function solve, which, given a constraint C, has one of three outcomes:

• Returns the identity substitution in the case where C is trivially satisfied

• Returns a non-trivial substitution if C is satisfiable otherwise.

• Calls unsatisfiableEquality in when C cannot be satisfied

# 4 November 2020: Instantiation and generalization

There are PDF slides for 11/5/2020.

### Retrospective: Making type inference precise

Write one thing you learned about formalizing type inference so far and one question you have. Your answers may be anonymously shared with your classmates.

### Reflection: Introducing type schemes

Suppose your type inferencer has produced a type that mentions a type variable α. Under what circumstances do you think it is safe to generalize that type to a polymorphic type scheme α . τ?

## Introducing and eliminating type schemes

All the pieces

1. Hindley-Milner types
2. Bound names : σ, expressions : τ
3. Type inference yields type-equality constraint
4. Constraint solving produces substitution
5. Substitution refines types
6. Call solver, introduce polytypes at val form
7. Call solver, introduce polytypes at all let forms

Moving from type scheme to types (Instantiation)

Moving from types to type scheme (Generalization)

1. Instantiation is the process of converting from a polymorphic type scheme σ to a monomorphic instance τ.
Such conversions always take place when variables are type checked using the Var typing rule.
• True
• False
1. Consider the Var typing rule:

Suppose we want to infer a type for the expression

(if y (fst 2 3) 4)

in the environment

Gamma = {fst : forall 'a 'b. 'a * 'b -> 'a, y : 'c}

Suppose we instantiate fst as follows:

fst : 'c * 'b -> 'c

This instantiation and the expression gives rise to the following constraints:

'c ~ bool  ;; because y:'c and y is tested by the if

'c ~ int   ;; because first argument to fst is an int

These constraints aren’t satisfiable.

But, the expression should have the typing

 'c ~ bool, Gamma |- (if y (fst 2 3) 4) : int

What went wrong?

• We ignored the freshness requirement when instantiating fst.

• We ignored the distinctness requirement when instantiating fst.

• Nothing: the program shouldn’t be typeable.

## Bonus question: Why the distinctness requirement?

Consider again

Gamma = {fst : forall 'a 'b. 'a * 'b -> 'a, y : 'c}

??, Gamma |- (if y (fst 2 3) 4) : ??

Now suppose we instantiate fst as follows:

fst : 'a * 'a -> 'a

Gamma |- fst 2 #t

Application rule yields constraints:

'a ~ int

'a ~ bool

which aren’t satisfiable. Which means this code would also be erroneously flagged as an error.

Correct typing:

Gamma |- fst 2 #t : int

The mistake was ignoring the distinctness requirement when instantiating the type scheme.

## From types to type scheme: Val rule

The set A above will be useful when some variables in τ are mentioned in the environment.

We can’t generalize over those variables.

Applying idea to the type inferred for the function fst:

generalize('a * 'b -> 'a, emptyset) =
forall 'a, 'b. 'a * 'b -> 'a

Note the new judgement form above for type checking a declaration.

On the condition θΓ = Γ: Γ is “input”: it can’t be changed.
The condition ensures that θ doen’t conflict with Γ.

We can’t generalize over free type variables in Γ.

Their presence indicates they can be used somewhere else, and hence they aren’t free to be instantiated with any type.

### Check your understanding: The full Val rule

Consider the Val rule:

In this series of questions, we will walk through using the Val rule to type check the following definition:

(val pick-t (lambda (y z) (pick #t y z)))

in the context

Γ = { pick : α.bool $\cross$ α $\cross$ α $\arrow$ α}

1. The first hypothesis above the line in the Val rule that we must check is the judgment C, Γ |- e : τ. Suppose we have already derived the following instance of that judgment as we infer a type for the right-hand side of the pick-t definition:

$$\begin{array}{l} \alpha_y \sim \alpha_p \land \alpha_z \sim \alpha_p, \Gamma \vdash \\ \quad \lit{(lambda (y z) (pick \#t y z)) }: \alpha_y \cross \alpha_z \arrow \alpha_p \\ \end{array}$$

To derive that judgment, which of the following might have been how the polymorphic pick function was instantiated, assuming αy and αz are the types assumed for variables y and z, respectively? Consider what instantiation would have been necessary to produce the observed constraints and resulting type.

• pick : bool $\cross$ αp $\cross$ αy $\arrow$ αy

• pick : bool $\cross$ αz $\cross$ αz $\arrow$ αz

• pick : bool $\cross$ αp $\cross$ αp $\arrow$ αp

• pick : αp $\cross$ αp $\cross$ αp $\arrow$ αp

1. The next step is to produce a substitution θ that solves the constraints αy ~ αp αz ~ αp and leaves Γ unchanged (θ Γ = Γ). Which of the following substitutions is a such solution?
• θ = { αz -> αp}

• θ = { αy -> αp, αz -> αp}

• θ = { αp -> αx, αz -> αp}

• None of the above because they all change Γ.

1. The next step is to generalize. Given the substitution θ = { αy -> αp, αz -> αp}, what type should be passed as the first argument to the generalize function? (The second argument is the empty set because ftv(Γ) = .
• θ(αy $\cross$ αz $\arrow$ αp) = αp $\cross$ αp $\arrow$ αp

• θ(αy $\cross$ αz $\arrow$ αp) = αz $\cross$ αz $\arrow$ αz

• θ(α $\cross$ α $\arrow$ α) = α $\cross$ α $\arrow$ α

• θ(αy) = αp

## Reflection: Type inference for let form

Consider the following nML code. ML will infer a type scheme for the first and third examples, but reports an error for the second.

Say what each of the functions is attempting to do and explain why you think ML reports a type error for the second function.

(lambda (ys)
(let ([s (lambda (x) (cons x '()))])
(pair (s 1) (s #t))))
<function> :
(forall ['a] ('a -> (pair (list int) (list bool))))

(lambda (ys)
(let ([extend (lambda (x) (cons x ys))])
(pair (extend 1) (extend #t))))
type error: cannot make int equal to bool

(lambda (ys)  ; OK
(let ([extend (lambda (x) (cons x ys))])
(extend 1)))
<function> : ((list int) -> (list int))

Let with constraints, operationally:

1. typesof: returns τ1, …, τn and C

2. val theta = solve C

3. C-prime from map, conjoinConstraints, dom, inter, freetyvarsGamma

4. freetyvarsGamma, union, freetyvarsConstraint

5. Map anonymous lambda using generalize, get all the σi

6. Extend the typing environment Gamma (pairfoldr)

7. Recursive call to type checker, gets C_b, \tau

8. Return (tau, C' /\ C_b)

### Check your understanding: The Let rule

1. The let-bound variables x1, …, xn are given polymorphic type schemes within the scope of the body of the let e, but the type inferred for e is monomorphic.

• True
• False
1. If the constraint solver returned the substitution θ = {α -> (α, β)}, type inference would be able to proceed with calculating a residual constraint and generalization.
• True
• False
1. The residual constraint C retains all information in the substitution θ that relates to any type variable that might be relevant after type checking the let expression as determined by being mentioned in the free type variables of the context Γ.
• True
• False
1. During generalization, we only need to worry about not generalizing over the free type variables mentioned in Γ because those type variables might be constrained in various ways by other parts of the program.
• True
• False

### Check your understanding: The val-rec and letrec rules

The key insight with the two recursive type inference rules is to follow a three-step process:

1. Introduce a fresh type variable α to represent the unknown type for the recursive binding.
2. Infer constraints and a type τ for the recursive binding as usual
3. Add a constraint relating the type variable and the inferred type (α ~ τ).

Otherwise, the recursive variants of the type inference rules proceed just as the non-recursive versions.

• True
• False

Forall things

Managing Quantified types
val and val-rec let, letrec, … lambda
FORALL contains all variables (because none are free in the context) FORALL contains variables not free in the context FORALL is empty
Generalize over all variables (because none are free in the context) Generalize over variables not free in the context Never generalize

# 9 November 2020: Introduction to Object-orientation

There are PDF slides for 11/10/2020.

### Retrospective: Type Inference

Write one thing you learned in the previous module about type inference and one question you have about that material.

Your responses may be anonymously shared with the rest of the class.

### Reflection: Introduction to Object-Oriented Programming

Suppose you were implementing a drawing program that lets users manipulate various kinds of shapes: circles, triangles, diamonds, etc.

• What information would each shape need to know to be able to respond to user commands to display itself or to move to a new location?

• What information would each shape need to know about the other shapes?

• What benefits would there be to minimizing the amount of information shared between objects?

• How would you organize your code if you knew you were going to have to add lots of new shapes in the future?

## Information Hiding

So far: Programming in the small

• Expressive power
• Success stories:

• First-class functions
• Algebraic data types and pattern matching
• Polymorphic type systems

• Immersion in pile of interpreter code: not fun

An area of agreement and a great divide:

                 INFORMATION HIDING
(especially: mutable data)
/       \
/         \
code reuse   /           \  modular reasoning
/             \
interoperability  /               \  internal access
between reps     /                 \  to reps
/                   \
OBJECTS                 MODULES
ABSTRACT TYPES

Key idea: Interfaces

• Provide external abstraction of gory details
• Limit access so internals can evolve

### Introduction to OOP

Example Languages: JavaScript, Ruby, Java, C++, Python, …

Goals:

• Abstraction
• Code reuse
• Scaling to large code bases

Mechanism:

• Every object is a black box
• All you can do is send it messages!
• What messages? What do they do?
That’s the protocol
(an example of an interface)

### Object-oriented demo

Demo: circle, diamond, triangle cards with these methods:

Front:

Object 1 : Circle

• pos: cardinal-point

• set-pos:to: cardinal-point coord

• draw

Back:

Object 1: Circle

;; stores center of shape

center: coord

methods: ...

Message sends:

1. object1 set-pos:to: 'S (0,0)

2. object1 pos: 'N answer: (0,2)

3. object2 set-pos:to: 'S (0,2)

4. object2 pos: 'N answer: (0,4)

5. object3 set-pos:to: 'SW (0,4)

6. object1 draw

7. object2 draw

8. object3 draw

## Key concepts of object-orientation

(information hiding)

• Ask me to tell you something
• Ask me to do something

(In object-oriented world, imperative “do this” is common.)

• If you ask me something, I might delegate to my parent.

(Code reuse: one parent can do things for many children.)

### Functional vs. OOP mindset

• Functional and procedural:

• I find out what form you are
• I decide what to do and how to do it.
• Function name comes first in the syntax.
• Object oriented:

• I ask you what to do
• You decide how to do it, based on what form you are.
• Receiving object comes first in the syntax.

Reflected in the syntax: the decider comes first!

### Examples

Print every element of a list

• Functional: Are you nil or cons?
• OOP: Ask every element to do something.

Multiplication of natural numbers, base 10

• Functional: Are you zero or nonzero?
• OOP: return a number that is 10 times yourself.

### Check your understanding: Key concepts of object-orientation

In object-oriented programming, computation proceeds by sending messages to objects.

• True
• False

Information-hiding is important to large-scale programming because it allows a large program to be divided into smaller units that can be understood separately.

• True
• False

In both functional and object-oriented languages it is idiomatic to ask about the representation of arguments to functions/methods.

• True
• False

One way an interface supports programming-in-the-large is by limiting access to the internal details of part of a program so those details may evolve independently of the rest of the program.

• True
• False

the fact that the concepts are intertwined to the point that it is hard to present them one at a time. In this video, I’ll review some of the key terms to give you some context, but we’ll introduce them again as we see them in use.

OOP terminology

• Object:

• Dynamic entity that receives messages
• Encapsulates instance variables (generally imperative) and methods (aka instance methods)
• Special variable self refers to the host object
• Class: Construct for defining objects

• Specifies methods for objects
• Defines object initialization in class methods
• New objects created by sending message to class
• May inherit behavior from parent or super class
• An object instantiated from a class is called an instance of that class.
• Dynamic dispatch: Another term for message sending

Example: List select (aka filter)

-> (val ns (List withAll: '(1 2 3 4 5)))
List( 1 2 3 4 5 )
-> (ns select: [block (n) ((n mod: 2) = 0)])
List( 2 4 )

Vocabulary notes:

• List is a class
• withAll is a class method

• It encapsulates a use of new
• ns is an object and an instance of class List
• ns is sent the message select
• select is an instance method of class List
• Bonus: block construct analogous to lambda

### Check your understanding: OOP Terminology

Match the OOP term to its definition:

OOP term:

1. object
2. class
3. instance variables
4. class method
5. dynamic dispatch
6. inheritance
7. instantiation

Definition:

1. construct used for defining objects
2. imperative state allocated on a per-object basis
3. process of creating an object from a class; the new object is called an instance of the class
4. dynamic entity that receives messages
5. behavior invoked at the class level typically for instantiating and initializing objects
6. mechanism whereby one class can be defined by extending the behavior of a parent or super class
7. another term for message sending

Design process in an OO setting

Design process still works

1. Each method defined on a class

• The class knows the form!
2. Class determines

• How object is formed (class method)
• From what parts (instance variables)
• How object responds to messages (instance method)

Each form of data gets its own methods in its own class!

Using classes to implement select

Class determines how object responds.

Key classes in lists:

• class ListSentinel: an instance denotes end of list

• class Cons: instance is cons cell with car and cdr instance variables

(method select: (_) self) ;; on ListSentinel
(method select: (aBlock)  ;; on Cons
([aBlock value: car] ifFalse:ifTrue:
{(cdr select: aBlock)}
{(((Cons new) car: car)
cdr: (cdr select: aBlock))}))

### Check your understanding: Classes represent forms of data

When coding lists using object-oriented idioms, different classes are used to represent the different forms of data: one class for the empty list and another class for non-empty lists.

• True
• False

Dynamic dispatch replaces if

In OOP, conditionals are deprecated

Instead, idiomatic code uses dynamic dispatch.

We’ll expore via example: List filtering via iteration

• Uses imperative idioms (common in OOP)

Iteration in Scheme

(define app (f xs)
(if (null? xs)
'do-nothing
(begin
(f (car xs))
(app f (cdr xs)))))

Iteration in uSmalltalk

Use dynamic dispatch:

Instead of (app f xs), we have

(xs do: f-block)

For each element x in xs,

send

Example: Iteration

-> (val ms (ns select: [block (n) ((n mod: 2) = 0)]))
List( 2 4 )
-> (ms do: [block (m) ('element print) (space print)
('is print) (space print)
(m println)])
element is 2
element is 4
nil

Implementing iteration

What happens if we send “do f” to an empty

What happens if we send “do f” to a cons cell?

Iteration by dynamic dispatch

Sending do: to the empty list:

  (method do: (aBlock) nil)
; nil is a global object

Sending do: to a cons cell:

  (method do: (aBlock)
(aBlock value: car)
(cdr do: aBlock))

Look! No if! Decisions made by dynamic dispatch

### Basic OOP mechanism review

• Message send replaces function application

• (ns select: pred) vs filter ns pred
• Receiver appears first: it’s in charge!
• Object responds to message by evaluating method
• Method determined by object’s class

Not shown: a method can use primitive operations.

### Check your understanding: Code selection via dynamic dispatch

Dynamic dispatch means it is the run-time class of the receiver object that determines the code to be executed.

• True
• False

Idiomatic object-oriented code often avoids conditionals through the use of dynamic dispatch, for example by sending messages to objects from different classes to trigger class-appropriate behavior instead of asking the object about its representation and then branching on the result.

• True
• False

## The six questions for Smalltalk

The six questions

1. Values are objects (even true, 3, "hello")

Even classes are objects!

There are no functions—only methods on objects

The six questions

1. Syntax:
• Mutable variables

• Message send

• Sequential composition of mutations and message sends
(side effects used commonly)

• “Blocks”
(objects and closures in one, used as continuations)

• No if or while
Functionality implemented by passing continuations to Boolean objects.

### Message passing

Look at SEND

• Message identified by name (messages are not values)
• Always sent to a receiver
• Optional arguments must match arity of message name
(no other static checking)

N.B. BLOCK and LITERAL are special objects.

The six questions

1. Environments
• Name stands for a mutable cell containing an object:

• Global variables
• Instance variables

The six questions

1. Types

There is no compile-time type system.

At run time, Smalltalk uses behavioral subtyping, known to Rubyists as “duck typing”

2. Dynamic semantics

• Main rule is method dispatch (complicated)
• The rest is familiar
3. The initial basis is enormous

• Why? To demonstrate the benefits of reuse, you need something big enough to reuse.

### Check your understanding: The six questions for Smalltalk

In μSmalltalk, all values are objects, even booleans, integers, strings, classes.

• True
• False

Because μSmalltalk focuses so strongly on objects, it does not provide any language feature akin to anonymous functions.

• True
• False

The μSmalltalk type system statically guarantees that no message-not-understood errors will happen at run-time. That is, if a message m is sent to an object obj in the text of the program, then obj belongs to a class in which method m is defined.

• True
• False

# 10 November 2020: OOP Mechanisms

There are PDF slides for 11/11/2020.

### Retrospective: Introduction to OOP

Write one thing you learned about OOP so far and one question you have. Your answers may be anonymously shared with your classmates.

## Smalltalk syntax

little livecoding.

## Coding in uSmalltalk

• ;; blocks
• (val twice [block (n) (n + n)])
• (twice value: 3)
• ;; note lightweight syntax for parameterless blocks
• (val delayed {(’hello println) 42})
• delayed
• (delayed value)
• ;; blocks sent to Booleans
• (val x 10)
• (val y 20)
• ;; Note the number of columns matches number of args
• ((x <= y) ifTrue:ifFalse: {x} {y})
• ;; numbers
• ;; Fractions: exact 1 / 3
• ;; (1/3) is a name; (1 / 3) is what you want!
• ((1 / 3) - (1 / 2))
• ((1 / 3) + 1) ;; mixing classes!
• ((1 / 3) asFloat) ;; software floating point
• (2 asFloat)
• ((2 asFloat) sqrt)
• (((2 asFloat) sqrt) squared)
• (100 squared)
• (it squared)
• … keep going until overflow …
• (use bignum.smt)
• 100
• (it squared)
• Try again, go until CPU cap, stack trace

### Coding Interlude: Coding in Smalltalk

Consider the following code:

  ;; blocks
(val twice [block (n) (n + n)])
(twice value: 3)
;; note lightweight syntax for parameterless blocks
(val delayed {('hello println) 42})
delayed
(delayed value)
;; blocks sent to Booleans
(val x 10)
(val y 20)
;; Note the number of columns matches number of args
((x <= y) ifTrue:ifFalse: {x} {y}) 
• Enter the code, run it, and observe the results.

• Write a block square that multiples its argument object by itself (the symbol * represents multiplication), and use it to square the number 5.

1. Why doesn’t entering delayed cause the string hello to be printed and the number 42 to be returned?
2. Why does entering (delayed value) cause the string hello to be printed and the number 42 to be returned?
3. What is the meaning of the braces, as in {('hello println) 42} and in the arguments to theifTrue:ifFalse:method ({x}and{y})?
4. What about the syntax of the ifTrue:ifFalse method tells you how many arguments it is expecting?

## Classes

Parts of a class definition

• Superclass: What do I inherit?

• Instance variables:

• Parts from which objects are formed
• Class methods:

• How will class respond to class messages?
(Yes, in uSmalltalk, classes are also objects)
• Typically used to create and initialize instances.
• Instance methods:

• How will each object respond to messages?
• Some may override methods in parent class.

Example: CoordPair

(class CoordPair
[subclass-of Object]   ;; Inherits from class Object
[ivars x y]            ;; Instance variables

(class-method withX:y: (anX aY)
((self new) initializeX:andY: anX aY))
(method initializeX:andY: (anX aY) ;; private
(set x anX)
(set y aY)
self)

(method x () x)
(method y () y)
(method = (coordPair)
((x = (coordPair x)) & (y = (coordPair y))))
(method * (k)           ;;;;;; Scaling and translation
(CoordPair withX:y: (x * k) (y * k)))
... )

### Check your understanding: Parts of a class definition

Consider the following class definition:

(class CoordPair
[subclass-of Object]   ;; Inherits from class Object
[ivars x y]            ;; Instance variables

(class-method withX:y: (anX aY)
((self new) initializeX:andY: anX aY))
(method initializeX:andY: (anX aY) ;; private
(set x anX)
(set y aY)
self)

(method x () x)
(method y () y)
(method = (coordPair)
((x = (coordPair x)) & (y = (coordPair y))))
(method * (k)           ;;;;;; Scaling and translation
(CoordPair withX:y: (x * k) (y * k)))
... )

1. Class CoordPair inherits from a class Pair
• True
• False
1. Objects instantiated from class CoordPair have four instance variables: x, y, anX, and anY.
• True
• False
1. The class method withX:y: is sent to the class CoordPair.
• True
• False
1. The method = is an instance method that is sent to objects instantiated from the class CoordPair.
• True
• False
1. In the class CoordPair, the identifier x is used as both an instance variable and an instance method.
• True
• False

Access rights

Instance variables

• Visible only to me (“private”)
• I can’t access argument instance variables directly;
I can only send messages.

• Instance variables are encapsulated

Methods

• By mechanism, “public” (anyone can send)

• By convention, some are private

• Send to myself, or to an instance I create
(Simplest thing that could possibly work.)

coordPair; need to use accessor methods; common idiom.

Example: Access rights

(class CoordPair
[subclass-of Object]   ;; Inherits from class Object
[ivars x y]            ;; Instance variables

(method initializeX:andY: (anX aY) ;; private
(set x anX)
(set y aY)
self)

(method x () x)
(method y () y)
(method = (coordPair)
((x = (coordPair x)) & (y = (coordPair y))))

...
)


### Check your understanding: Access rights

Consider the following code:

(class CoordPair
[subclass-of Object]   ;; Inherits from class Object
[ivars x y]            ;; Instance variables

(method initializeX:andY: (anX aY) ;; private
(set x anX)
(set y aY)
self)

(method x () x)
(method y () y)
(method = (coordPair)
((x = (coordPair x)) & (y = (coordPair y))))

...
)


1. In the class CoordPair, the instance variable y is private, which means it is accessible within the methods of the class, but not elsewhere. As a result, it is accessible within initializeX:andY:, y, and =. However, any y instance variable that might be part of the argument object coordPair is not accessible in the = method.
• True
• False
1. μSmalltalk makes it impossible to access the method initializeX:andY: outside of the class CoordPair because of the comment private.
• True
• False

invariants for each class in hierarchy.

Object initialization

Objects created by sending creator method to class.

(class CoordPair
[subclass-of Object]   ;; Inherits from class Object
[ivars x y]            ;; Instance variables

(class-method withX:y: (anX aY)
((self new) initializeX:andY: anX aY))

(method initializeX:andY: (anX aY) ;; private
(set x anX)
(set y aY)
self)
...

(method * (k)          ;;;;;; Scaling and translation
(CoordPair withX:y: (x * k) (y * k)))
)


### Check your understanding: Object initialization

Consider the following code:

(class CoordPair
[subclass-of Object]   ;; Inherits from class Object
[ivars x y]            ;; Instance variables

(class-method withX:y: (anX aY)
((self new) initializeX:andY: anX aY))

(method initializeX:andY: (anX aY) ;; private
(set x anX)
(set y aY)
self)
...

(method * (k)          ;;;;;; Scaling and translation
(CoordPair withX:y: (x * k) (y * k)))
)


1. The method * sends the message withX:y: to the class CoordPair to create a new CoordPair object initialized with scaled up x and y coordinates.
• True
• False
1. The class method withX:y: sends the new message to self to allocate a fresh CoordPair object. It then sends the message initializeX:andY: to the resulting object to initialize its instance variables x and y to the parameter values anX and aY.
• True
• False
1. An alternative way to get a new CoordPair object in a method defined within the CoordPair class would be to send the message initializeX:andY: to self.
• True
• False

Special name self

• Allows message sends to host object
• Idiomatic to return self to support message chaining
• Not a proper variable because can’t be set

Example: A short-hand method

     (method toOrigin ()
(self setLoc: (0,0)))

Special name super

• Starts method search with parent class
• Statically determined based on defining class of method
• Typically used when refining inherited behavior
• Not a proper variable because can’t be set

Example: A speedy shape

(method move: (coord) (super move: (coord * 2)))

### Check your understanding: Special names self and super

1. A method defined on an object obj sends a message to the special name self to invoke another one of obj’s methods.
• True
• False
1. A method defined in a class S that inherits from class P sends a message m to the special name super when it wants to use the method implementation that P would have used for m.
• True
• False

Abstract and concrete classes

Abstract class

• Meant to be inherited from

• Some ( > 0) subclassResponsibility methods

• Examples: Shape, Boolean, Collection

Concrete class (aka, a regular class)

• Meant to be instantiated

• No subclassResponsibility methods

• Examples: Triangle, True, List

the collction protocol:

definitions with more efficient implementations.

### Check your understanding: Concrete and abstract classes

1. An abstract class has exactly one abstract method whose implementation is subclassResponsibility.
• True
• False
1. The purpose of an abstract class is to define some concrete methods in terms of the abstract methods so that behavior can be inherited by subclasses.
• True
• False

## Method Dispatch

dispatch. We’ll use Booleans as an example.

class implements: it’s interface. Often standard methods defined on all objects omitted (like print).

send a message to the object true.

Method dispatch: The algorithm

1. Consider the class of the receiver

2. Is the method with that name defined?

3. If so, use it

4. If not, repeat with the superclass

Run out of superclasses?

“Message not understood”

Method dispatch: An example

Q: What happens if is sent to ?

• (true ifTrue: aBlock)

A: Search starts in True class and method is found.

Q: What if is sent to ?

• (true print)

A: Search starts in True, moves to parent Boolean, stops in class Object where it finds the print method.

Q: What if is sent to ?

• (true move)

A: Search starts in True, moves to Boolean, moves to Object, fails to find move, so reports “Message not understood” error.

### Check your understanding: Method dispatch

1. When a message is sent to an object, the runtime system searches for a corresponding method definition starting with the class that defined the object.
• True
• False
1. If a message m is sent to an object obj and if the runtime system cannot find a method definition in obj’s class, then the system reports a message-not-understood error.

# 16 November 2020: OOP Case Study: The Number Hierarchy

There are PDF slides for 11/17/2020.

## Case study: Magnitudes and numbers

### Reflection: Design of a Number Hierarchy

Imagine you are designing a hierarchy of number types such as reals, floats, integers and fractions. How would you organize this hierarchy? How are these number types similar and how are they different?

Key problems on homework

• Natural is a Magnitude

• “Large integer” is a Number

Case Study: The Number Hierarchy

Goal: Develop feel for programming in the large

Issues we’ll consider along the way:

• Class design

• What methods go where?
• Invariants

• Class methods establish invariants
• Instance methods must maintain them
• Information hiding

• “semi-private” methods for like objects
• Double-dispatch when argument form matters.

Implementation of : Reuse

(class Magnitude            ; abstract class
[subclass-of Object]
(method =  (x) (self subclassResponsibility))
; may not inherit = from Object
(method <  (x) (self subclassResponsibility))
(method >  (y) (y < self))
(method <= (x) ((self > x) not))
(method >= (x) ((self < x) not))
(method min: (aMag)
((self < aMag) ifTrue:ifFalse: {self} {aMag}))
(method max: (aMag)
((self > aMag) ifTrue:ifFalse: {self} {aMag}))
)

### Check your understanding: Design of magnitude class

Consider the Magnitude Class

(class Magnitude ; abstract class
[subclass-of Object]
(method = (x) (self subclassResponsibility))
; may not inherit = from Object
(method < (x) (self subclassResponsibility))
(method > (y) (y < self))
(method <= (x) ((self > x) not))
(method >= (x) ((self < x) not))
(method min: (aMag)
((self < aMag) ifTrue:ifFalse: {self} {aMag}))
(method max: (aMag)
((self > aMag) ifTrue:ifFalse: {self} {aMag}))
)
1. What methods must a subclass of Magnitude implement at a minimum?
• <, =
• =
• <,>,=
• <,=,<=,>=,>
1. True or False: Can you compare two instances of Magnitude for equality using the = method as defined?
• True
• False

receiver but represents the same value as the argument

### Reflection: OOP Design: Implementing class Number

Code reuse is a powerful feature. Consider the methods negated, reciprocal, +, -, * / .

How might you implement some of these methods in terms of others?

Concrete Number Methods

(method -  (y) (self + (y negated)))
(method abs () ((self isNegative) ifTrue:ifFalse:
{(self negated)}
{self}))
(method /  (y) (self * (y reciprocal)))

(method isNegative         ()
(self  < (self coerce: 0)))
(method isNonnegative      ()
(self >= (self coerce: 0)))
(method isStrictlyPositive ()
(self  > (self coerce: 0)))

Abstract Number Methods

(class Number
[subclass-of Magnitude] ; abstract class
;;;;;;; arithmetic
(method +   (aNumber)     (self subclassResponsibility))
(method *   (aNumber)     (self subclassResponsibility))
(method negated    ()     (self subclassResponsibility))
(method reciprocal ()     (self subclassResponsibility))

;;;;;;; coercion
(method asInteger ()      (self subclassResponsibility))
(method asFraction ()     (self subclassResponsibility))
(method asFloat ()        (self subclassResponsibility))
(method coerce: (aNumber) (self subclassResponsibility))
)


### Check your understanding: Design of number class

Consider the Extended Number Heirarchy

1. At a minimum what messages will instances of Fraction, Float and Integer be able to respond to?

• Methods defined in Number, Magnitude and Object
• Methods defined in Number only
1. At a minimum what messages will instances of Small Integer and LargeInteger be able to respond to?

• Only Methods defined in Number
• Methods defined in Number and Methods Defined in Integer

## Fractions: A case study in information hiding and initialization

1. the denominator is always positive

2. If the numerator is zero, the denominator is 1.

3. The numerator and the denominator are reduced to lowest terms, that is, their only common divisor is 1.

Example class Fraction: Initialization

(class Fraction
[subclass-of Number]
[ivars num den]
;; invariants: lowest terms, minus sign on top
;;             if num = 0, den = 1
;; maintained by divReduce, signReduce
(class-method num:den: (a b)
((self new) initNum:den: a b))
(method initNum:den: (a b) ; private
(self setNum:den: a b)
(self signReduce)
(self divReduce))
(method setNum:den: (a b)
(set num a) (set den b) self) ; private
.. other methods of class Fraction ... )

Information revealed to self

Instance variables num and den

• Directly available
• Always and only go with self

Object knows its own representation, invariants, private methods:

(method asFraction ()
self)
(method print ()
(num print) ('/ print) (den print) self)
(method reciprocal ()
(((Fraction new) setNum:den: den num) signReduce))

## Information revealed to others

### Reflection: Implementing coerce

Coercion is the process of making an instance fit a specified type. For instance, the Integer 1 can be coerced to the Float 1.0.

Imagine you wanted to add a Float x to an Integer y without losing any precision. Which instance should be coerced to what type before adding?

The method coerce: implements this idea. It returns the value of the argument in the representation of the receiver. How would you implement coerce: ?

What does the caller of the method coerce: need to know about the type and protocol of the receiver?

Information revealed to others: coerce

How would you implement coerce:?

• Value of argument, rep of receiver

• Challenge: Can’t access rep of argument!

(method asFraction ()
self)
(method print ()
(num print) ('/ print) (den print) self)
(method reciprocal ()
(((Fraction new) setNum:den: den num) signReduce))
(method coerce: (aNumber)
(aNumber asFraction))

Saved: Number protocol includes asFraction!

### Reflection: Implementing multiply on fractions

(class Fraction
[subclass-of Number]
[ivars num den]
;; invariants: lowest terms, minus sign on top
;; if num = 0, den = 1
;; maintained by divReduce, signReduce
(class-method num:den: (a b)
((self new) initNum:den: a b))
(method initNum:den: (a b) ; private
(self setNum:den: a b)
(self signReduce)
(self divReduce))
(method setNum:den: (a b)
(set num a) (set den b) self) ; private
.. other methods of class Fraction ... )

Consider implementing multiply between instances of the Fraction class. How do you multiply two fractions? How might you implement multiply given what we know about multiplying two fractions?

### Reflection: Implementing multiply across differnt kinds of numbers

How might you implement multiply when we have lots of different kinds of numbers with different representations?

Extending open systems

Number protocol: like multiplies with like

Goal:

• Large integers and small integers are both Integers

• Messages =, <, +, * ought to mix freely

Constraint: Each object has its own algorithm.

• Small: Use machine-primitive multiplication

• Large: Multiply magnitudes; choose sign

Double dispatch to the rescue!

In the following slide, the notation :+ and :- are value constructors for positive and negative large integers. The n and m are magnitudes (natural numbers).

How do you act?

• As small integer, you’re asked to add large positive integer N to self (aka plusLP: N)

Response: ((self asLargeInteger) plus N)

• As small integer, you’re asked to add small integer n to self (aka plusSP: n)

Response: (self plus n)

• As large positive integer, you’re asked to add large positive integer N to self (aka plusLP: N)

Response: (self plus n)

• As large positive integer, you’re asked to add small integer n to self (aka plusSP: n)

Response: (self plus (n asLargeInteger))

### Check your understanding: Double dispatch

1. What two things does a double dispatch method encode?

• Operation and protocol
• Operation and instance
• Instance and protocol
1. Consider using double dispatch to implement multiplication between your objects.

For each item, consider your class, and then the class of the object you are being multiplied against.

Do you need to coerce yourself into a different type of number for the operation to be successful?

Choose the appropriate smalltalk code to accomplish what you’re asked to do.

• As small integer, you’re asked to multiply large positive integer N by self (aka timesLP: N)
• (self * n)
• ((self asLargeInteger) * N)
• As small integer, you’re asked to multiply small integer n by self (aka timesSP: n)
• (self * n)
• ((self asLargeInteger) * N)
• As large positive integer, you’re asked to multiply large positive integer N by self (aka timesLP: N)
• (self * n)
• ((self asLargeInteger) * N)
• As large positive integer, you’re asked to multiply small integer n by self (aka timesSP: n)
• (self * n)
• ((self asLargeInteger) * N)
1. Take a look at the code below. Can you determine which class each method belongs to? Hint: Your instance knows it’s own type and broadcasts that to the object the method is operating on.
• (method + (aNumber) (aNumber addSmallIntegerTo: self))
• SmallInteger
• LargePositiveInteger
• (method * (anInteger) (anInteger multiplyByLargePositiveInteger: self))
• SmallInteger
• LargePositiveInteger

Your turn: Double dispatch to implement multiplication

How do you act?

• As small integer, you’re asked to multiply large positive integer N by self (aka timesLP: N)

Response: ((self asLargeInteger) * N)

• As small integer, you’re asked to multiply small integer n by self (aka timesSP: n)

Response: (self * n)

• As large positive integer, you’re asked to multiply large positive integer N by self (aka timesLP: N)

Response: (self * n)

• As large positive integer, you’re asked to multiply small integer n by self (aka timesSP: n)

Response: (self * (n asLargeInteger))

Knowledge of protocol

Three levels:

• I know only your

Example: send to any collection

• You are like me: share

Example: send or to

• I must get to know you:

Example: send to to

Extra: Dealing with overflow

New law for multiplication:

  (self * small)
=
((primitive mulWithOverflow
self
small
{((self asLargeInteger) * small)})
value))

Primitive is not a method

• Answers good block or exception block
• Answer is then sent value

# 18 November 2020: Subtyping and Equality

There are PDF slides for 11/19/2020.

### Reflection: Subtyping

Consider when can an object of one type be used in a context expecting an object of another type.

What needs to be true for this substitution to work?

Hint: Imagine you need to pound a nail. In this context you expect to use a hammer. If you had to use something else besides a hammer, what needs to be true about the alternative object for to accomplish your task?

## Subtyping

Subtype polymorphism

• Key mechanism to support code reuse

• A is a subtype of B (written A <: B) if value a:A can be used whenever a value of supertype B is expected.

• Example: Circle, Diamond, and Triangle can be used in any context expecting a Shape

• Subtyping relationship can be checked statically (e.g. Java, C++, Scala) or dynamically (e.g. Smalltalk, Ruby)

Behavioral subtyping

(in Ruby, duck typing’’)

Types aren’t enough:

• Methods should also behave as expected.

Example:

• draw method in Shape hierarchy vs.
• draw method in first-person shooter game

Not enforced in type systems because of decidablity.

1. True or False: A subtype doesn’t need to understand all the messages of its supertype.
• True
• False
1. True or False: If A is a subtype of B and B is a subtype of C A can understand all the messages of C.
• True
• False

### Reflection: Subtyping vs. Inheritance

Subtyping describes the relationship between two objects in terms of what messages both objects understand.

Inheritance describes the relationship between two objects in terms of what code is shared between objects.

First, describe two objects that aren’t related via inheritance, but that are subtypes.

Next, describe two objects that are related via inheritance, but that aren’t subtypes.

### Subtyping vs Inheritance

Subtyping is not inheritance

  - **SUBTYPE != SUBCLASS**
- **SUPERTYPE != SUPERCLASS**

Some languages like C++ identify subtype with subclass, but conceptually they are different.

Examples

Example: circle and square with same protocol but defined independently

• Subtypes but not related via inheritance

Example: Non-mutable dictionary that inherits from Dictionary but hides all methods that can update entries

• Related via inheritance but not subtypes

Subtyping and Inheritance: Code Reuse

What reuse is being enabled?

• Inheritance: Parent code to facilitate class definition

• Subtyping: All the client code

Food for thought: Which has the bigger impact?

### Check your understanding: Subtyping vs. Inheritance

1. Match each term with the correct definition.
• Subtyping
• related in class definition
• substitutability
• Inheritance
-related in class definition
• substitutability
1. True or False: When you add a method to a class, you need to add it to all the classes which inherit from that class.
• True
• False

### Reflection: Equality

When should two objects be considered equal? Consider comparing two strings with == in C. What problems could result?

## Equality

When are two entities equal?

• Easy to get wrong esp. for abstract types

• Logically equal but have different reps

Example: lists used to represent sets

[1,2,3] and [2,1,3]

• Logically distinct by have the same rep

Example: Simulation of an ecosystem

Two distinct animals may have same data

Scalars

Examples: int, bool, char, etc.

Test: Same bit representation

• Efficient

• But: Don’t use for floats!

Functions

Example: int * int list -> int list

Test: Same code?

• Decidable, but meaningful?

Test: Same math function?

• Undecideable
• Reason for ML’s ''a types:
any type that supports equality

Usually disallowed.

Compound types: Option 1

Examples:

• a dict object from class Dictionary
• a rabbit from a simulation class

Test: object identity

• Same object in memory
• Very efficient
• May fail to equate things that should be equal

Compound types: Option 2

Examples:

• a dict object from class Dictionary
• a rabbit from a simulation class

Test: structural equality

• Recursively examine all the parts
• Can be expensive!
• Less picky, but may also fail to equate things that should be equal

Compound types: Option 3

Examples:

• a dict object from class Dictionary
• a rabbit from a simulation class

Test: User-defined

• Can precisely define correct notion of equality
• May or may not be efficient
• Requires programmer attention

Tension

Correctness

vs. Efficiency

vs. Programmer Convenience

1. True or False: Different types may require different notions of equality. For example a notion of equality between two integers may be different than a notion of equality between two functions, or two expressions or two strings.
• True
• False

## Semantics of Equivalence

Observational equivalence

Key idea: observational equivalence

• Two values should be considered equal if no context can tell them apart

Key question: What should the context be allowed to do?

• If context can use reflection (eg, class methods to query object formation), then no abstraction

• Rule out reflection
• Internals of an abstraction can see more than intended

• Only let context use client code
• If object is mutable and context is allowed to mutate it

• either: Object identity
• or: Rule out mutation

### Check your understanding: Observational Equivalence

What is the key idea of Observational Equivalence?

• Two values should be considered equal if no context can tell them apart.

• Two values should be considered equal if they have the same structure or binary representation in memory.

• Two values should be considered equal if there is a context which produces the same result for both.

Equality in uSmalltalk

Method == is object identity

;; object identity
(method  == (anObject) (primitive sameObject self anObject))

Method = is a form of observational equivalence

• Principle: Two objects are considered equivalent if, without mutating either object, client code cannot tell them apart

• Each class decides for itself by overwriting default method

• Defaults to object identity

(method = (anObject) (self == anObject))

• Used by check-expect

• Why disallow client code from using mutation to observe?
Pragmatic: already have object-identity with = method

Example: Equivalent but not equal

(-> (val ns (List new)) List( )
-> ns
List( 1 2 3 )
-> (val ms (List new)) List( )
-> ms
List( 1 2 3 )
-> (ns == ms)
<False>
-> (ns = ms)
<True>

### Check your understanding: Equality in Smalltalk

Match each of the smalltalk operators ( =, == ) with the correct definition.

• ==
• Are two objects observationally equivalent?
• Are two objects pointing to the same place in memory?
• =
• Are two objects observationally equivalent?
• Are two objects pointing to the same place in memory?

## Review: Designing with objects

Visibility rules

Private messages to self (or instance of self)

Secret protocol to other objects of the same class (which can be spoofed)

How to approach object-orientation

What to remember:

• Dynamic dispatch tells your own form of data
• Avoid knowing any argument’s form!
Use semi-private methods, or (last resort) double dispatch

What not to do:

• Don’t guess what code will answer a message
• Don’t try to trace follow procedures step by step
(crosses too many class/method boundaries)

• Trust the contract
• Keep each method dirt simple

# 23 November 2020: Hiding information with abstract data types

There are PDF slides for 11/24/2020.

### Retrospective: Object-oriented programming

Write one thing you learned about object-oriented programming and one question you still have. Your answers may be anonymously shared with your classmates.

### Reflection: Introduction to Module Systems

Describe two issues that arise when working on large programs that to not arise when working on small ones.

Pick a particular language feature or programming idiom and describe why that feature or idiom makes programming-in-the-large either easier or harder.

## Data abstraction

How to write large programs?

An area of agreement and a great divide:

                 INFORMATION HIDING
(especially: mutable data)
/       \
/         \
modular reasoning  /           \  code reuse
/             \
internal access   /               \  interoperability
to rep           /                 \  between reps
/                   \
MODULES               OBJECTS
ABSTRACT TYPES

### Introduction to Module Systems

Key idea: Internal details hidden behind clean interface

        Implementation      Interface

------------------   -----
|                |   |   |
|     Module     |   | I |
|                |   |   |
------------------   -----
^            ^
|            |
Nitty gritty -            - Public face


Key idea: Hierarchical building blocks

     ------        ------
| C1 |        | C2 |
------        ------
\            /
\          /
--------------------
|        I3        |
--------------------
|        M3        |
|  ------  ------  |
|  | I1 |  | I2 |  |
|  ------  ------  |
|  |    |  |    |  |
|  | M1 |  | M2 |  |
--------------------


Terminology: a module is a client of the interfaces it depends on

### Module system overview

Functions of a module system:

• Hide representations, implementations, private names
• “Firewall” separately compiled units
(promote independent compilation)
• Possibly reuse units

Desirable features:

• Modular (i.e., independent) static type checking
• Separately compilable interfaces and implementations

Aside: C and C++ are cheap imitations

• C doesn’t provide namespaces
• C++ doesn’t provide modular type checking for templates

### Interfaces

Collect declarations

• Unlike definition, provides partial information about a name
(usually environment and type)

Things typically declared:

• Variables or constants (values, mutable or immutable)
example: val MAX_INT : int

• Types  example: type nat

• Procedures (if different from values)
example: val intOfNat : nat -> int

• Exceptions
example: exception Negative

Key idea: a declared type can be abstract

• Just like a primitive type

• Cannot ask “how were you formed, and from what parts?”

### Ways of thinking about interfaces

• Specification or contract for a module

• The “type” of a module
• Includes contracts for all declared functions
• Means of hiding information (“what secret does it protect?”)

• A way to limit what we have to understand

• The unit of sharing and reuse
• A contract between programmers

• Parties might be you and your future self
• Explainer of libraries
• The best technology for structuring large systems.

• Estimated force multiplier x10

Example interface

signature NATURAL_NUMBER = sig
type nat
val intOfNat  : nat -> int
val natOfInt  : int -> nat
val natString : nat -> string

exception Negative
val addNats : nat * nat -> nat
val subNats : nat * nat -> nat
val mulNats : nat * nat -> nat
end


Design choice: How are interfaces constructed?

Interface “projected” from implementation:

• No separate interface
• Compiler extracts from implementation
• When code changes, must extract again
• Easy to construct but few cognitive benefits

Interface written independently:

• Distinct file, separately compiled
(Caml, Go, Java interface, Modula, Ada)
• Implementations can change independently
• Harder to construct but full cognitive benefits

1. An interface can be thought of as the type of a module.
• True
• False
1. An interface typically includes the implementation of one or more functions.
• True
• False
1. An interface can declare a type but leave its definition abstract, forcing client code to treat that type abstractly, as if it were a primitive type.
• True
• False

### Module Implementations

     ------        ------
| C1 |        | C2 |
------        ------
\            /
\          /
--------------------
|        I3        |
--------------------
|        M3        |
|  ------  ------  |
|  | I1 |  | I2 |  |
|  ------  ------  |
|  |    |  |    |  |
|  | M1 |  | M2 |  |
--------------------

• Hold all dynamically executed code

• expressions/statements
• May include private names

• May depend only on interfaces, or on interfaces and implementations both

• max cognitive benefits when all dependency is mediated by interfaces
• Dependencies may be implicit or explicit (import, require, use)

### Check your understanding: Modules and module systems

1. A module provides the implementation of one or more interfaces.
• True
• False
1. A module may have private information hidden from client code.
• True
• False

## Standard ML Modules

The Perl of module languages?

• Has all known features

• Supports all known styles

• Rejoice at the expressive power

• Weep at

• the terminology,
• the redundancy,
• the confusing syntax

What we’ve been using so far is the core language

Modules are a separate language layered on top.

ML module terminology

Interface is a signature

Implementation is a structure

Generic module is a functor

• A compile-time function over structures

• The point: Reuse without violating abstraction

Structures and functors match signature

Analogy: Signatures are the types’’ of structures.

### Signature basics

Signature says what’s in a structure

Specify types (w/kind), values (w/type), exceptions

Examples: Ordinary types

    type t        // abstract type, kind *
eqtype t
type t = ...  // 'manifest' type
datatype t = ...

Examples: Type constructors work too

    type 'a t     // abstract, kind * => *
eqtype 'a t
type 'a t = ...
datatype 'a t = ...

Signature example: Ordering


signature ORDERED = sig
type t
val lt : t * t -> bool
val eq : t * t -> bool
end



operation that quickly finds and removes a minimal element.

without knowing the details of the implementation.

Signature example: Priority queue

signature PQUEUE = sig
type elem           (* <-- ABSTRACT type *)
val compare_elem : elem * elem -> order

type pqueue         (* <-- ABSTRACT type *)
val empty  : pqueue
val insert : elem * pqueue -> pqueue
val isEmpty : pqueue -> bool
exception Empty
val deletemin : pqueue -> elem * pqueue
end


### Check your understanding: The anatomy of a signature

When specifying a value in an ML signature, it is necessary to give a type for that value.

• True
• False

An ML signature can specify types, but not type constructors.

• True
• False

An ML signature cannot include exceptions.

• True
• False

### ML modules extended example: Numbers

Signature example: Integers

signature INTEGER = sig
eqtype int             (* <-- ABSTRACT type *)
val ~   : int -> int
val +   : int * int -> int
val -   : int * int -> int
val *   : int * int -> int
val div : int * int -> int
val mod : int * int -> int
val >   : int * int -> bool
val >=  : int * int -> bool
val <   : int * int -> bool
val <=  : int * int -> bool
val compare : int * int -> order
val toString   : int    -> string
val fromString : string -> int option
end


Implementations of integers

A selection…

structure Int    :> INTEGER
structure Int31  :> INTEGER  (* optional *)
structure Int32  :> INTEGER  (* optional *)
structure Int64  :> INTEGER  (* optional *)
structure IntInf :> INTEGER  (* optional *)

Natural Numbers

signature NATURAL = sig
type nat   (* abstract, NOT 'eqtype' *)
exception Negative

val ofInt   : int -> nat
val /+/     : nat * nat -> nat
val /-/     : nat * nat -> nat
val /*/     : nat * nat -> nat
val sdiv    : nat * int ->
{ quotient : nat, remainder : int }
val compare : nat * nat -> order
val decimal : nat -> int list
end

Large Integers

signature BIGNUM = sig
type bigint

val ofInt    : int -> bigint
val negated  : bigint -> bigint
val <+>      : bigint * bigint -> bigint
val <->      : bigint * bigint -> bigint
val <*>      : bigint * bigint -> bigint
val sdiv : bigint * int ->
{ quotient : bigint, remainder : int }
val compare  : bigint * bigint -> order
val toString : bigint -> string
end

### Check your understanding: Number signatures

1. Consider the INTEGER signature:
signature INTEGER = sig
eqtype int             (* <-- ABSTRACT type *)
val ~   : int -> int
val +   : int * int -> int
val -   : int * int -> int
val *   : int * int -> int
val div : int * int -> int
val mod : int * int -> int
val >   : int * int -> bool
val >=  : int * int -> bool
val <   : int * int -> bool
val <=  : int * int -> bool
val compare : int * int -> order
val toString   : int    -> string
val fromString : string -> int option
end

In this signature, the representation of int is kept abstract except for the fact that values of type int may be compared for equality using the built-in equality function.

• True
• False
1. Consider the NATURAL signature:
signature NATURAL = sig
type nat   (* abstract, NOT 'eqtype' *)
exception Negative

val ofInt   : int -> nat
val /+/     : nat * nat -> nat
val /-/     : nat * nat -> nat
val /*/     : nat * nat -> nat
val sdiv    : nat * int ->
{ quotient : nat, remainder : int }
val compare : nat * nat -> order
val decimal : nat -> int list
end

Given this signature, if you do not already have a value of type nat, the only way to get a value one is to use the ofInt function.

• True
• False

Signatures collect

signature QUEUE = sig
type 'a queue    (* another abstract type *)
exception Empty

val empty : 'a queue
val put : 'a * 'a queue -> 'a queue
val get : 'a queue -> 'a * 'a queue   (* raises Empty *)

(* LAWS:  get(put(a, empty))     ==  (a, empty)
...
*)
end


Structures collect

structure Queue :> QUEUE = struct   (* opaque seal *)
type 'a queue = 'a list
exception Empty

val empty = []
fun put (x,q) = q @ [x]
fun get [] = raise Empty
| get (x :: xs) = (x, xs)

(* LAWS:  get(put(a, empty))     ==  (a, empty)
...
*)
end


Dot notation to access components

fun single x = Queue.put (Queue.empty, x)
val _ = single : 'a -> 'a Queue.queue


### Coding Interlude: Signature for a stack

Given the following SML structure, write a corresponding SML signature that hides the implementation as a list but allows polymorphic stacks to be created and elements to be pushed and popped.

Structure for a stack

structure Stack = struct
type 'a stack = 'a list
exception Empty
val empty = []
val push  = op ::
fun pop []            = raise Empty
| pop (top :: rest) = (top, rest)
end

### Reflection: When does a structure match a signature?

If we can write structures and signatures independently, we must have some way to determine whether a given structure implements given signature.

Give examples of three different circumstances that would mean a given structure does not implement an interface. One such example would be that the interface requires the presence of a value MAX_INT but the structure fails to define such a value.

What interface with what implementation?

Maybe mixed together, extracted by compiler!

Maybe matched by name:

Best: any interface with any implementation:

• Java, Standard ML

But: not “any”—only some matches are OK

Signature Matching

Structure declaration is well-formed

 structure Queue :> QUEUE = QueueImpl

if principal signature of QueueImpl matches ascribed signature QUEUE

• Every type in QUEUE is in QueueImpl

• Every exception in QUEUE is in QueueImpl

• Every value in QUEUE is in QueueImpl
(type could be more polymorphic)

• Every substructure matches, too (none here)

Matching items must be compatible.

Signature Ascription

Ascription attaches signature to structure

• Transparent Ascription: types are revealed

structure strid : sig_exp = struct_exp

This method leaks information (legacy)
(But it’s awfully convenient)

• Opaque Ascription: types are hidden (“sealing”)

structure strid :> sig_exp = struct_exp

This method respects abstraction
(And when you need to expose, can be tiresome)

Slogan: “use the beak”

Transparent Ascription

Not recommended!

Example:

  structure IntLT : ORDERED = struct
type t = int
val le = (op <)
val eq = (op =)
end

Exposed: IntLT.t = int

• Violates abstraction

Opaque Ascription

Recommended

Example:

 structure Queue :> QUEUE = struct
type 'a queue = 'a list
exception Empty

val empty = []
fun put (x, q) = q @ [x]
fun get [] = raise Empty
| get (x :: xs) = (x, xs)
end

Not exposed: type ’a Queue.queue = ’a list

• Respects abstraction

### Check your understanding: Signature ascription

1. The principal signature of a structure is the signature that contains the maximal interface information for that structure.
• True
• False
1. With opaque signature ascription (Queue :> QUEUE), type information in the structure is hidden. Only information described in the ascribing signature is available to client code.
• True
• False

## Abstract data types

• Protects invariants
• Allows software to evolve
• Enforced by type system

• Every function sees rep of every argument
• Key distinction modules vs object

Abstract data types and number hierarchy

Natural numbers

signature NATURAL = sig
type nat   (* abstract, NOT 'eqtype' *)
...
val /+/     : nat * nat -> nat
val /-/     : nat * nat -> nat
val /*/     : nat * nat -> nat
end

• Funs/+/,/-/,/*/ see both representations

• Makes arithmetic relatively easy

• But type nat works only with type nat
(no “mixed” arithmetic, unlike in OOP)

Consider the signature NATURAL:

signature NATURAL = sig
type nat   (* abstract, NOT 'eqtype' *)
...
val /+/     : nat * nat -> nat
val /-/     : nat * nat -> nat
val /*/     : nat * nat -> nat
end


1. In any structure implementing this signature, the code for the /+/ operation will be able to access the representation of both arguments.
• True
• False

# 30 November 2020: Functors and Extended Case Study

There are PDF slides for 12/1/2020.

### Retrospective: Structures and Signatures

Write one thing you learned about structures and signatures so far and one question you have. Your answers may be anonymously shared with your classmates.

### Reflection: Parameterizing a structure by another structure

We’ve seen a couple of different kinds of functions. A lambda expression takes an expression-level value as an argument, while a type-lambda expression takes a type as an argument.
Suppose we introduced a new kind of function that took a structure as an argument and returned a structure as a result.

1. How do you think we might we “type” the argument and the result?

2. What do you think such functions might be used for?

## Functors (aka generic modules aka parameterized structures)

A is a generic module

A new form of parametric polymorphism:

• lambda and type-lambda in one mechanism

• Introduction form is functor (definition form)

“Generics” found across language landscape
(wherever large systems are built)

Introducing functors

A functor abstracts over a structure

Formal parameters are declarations

Example:

functor MkSingle(structure Q:QUEUE) =
struct
structure Queue = Q
fun single x = Q.put (Q.empty, x)
end

Functors combine familiar ideas:

• Higher-order functions (value parameter Q.put)
• type-lambda (type parameter Q.queue)

Using Functors

Actual parameters are definitions

structure QueueS  = MkSingle(structure Q = Queue)
structure EQueueS = MkSingle(structure Q = EQueue)

(EQueue might be a more efficient implementation)

functor MkSingle(structure Q:QUEUE) =
struct
structure Queue = Q
fun single x = Q.put (Q.empty, x)
end

Functor applications are evaluated at compile time.

1. A functor is a function over structures that combines elements of higher-order functions and type-lambda.
• True
• False
1. Functor applications are evaluated at runtime.
• True
• False

### Check your understanding: where type

Consider the following code:

signature ORDER = sig
type t
val compare : t * t -> order
end

signature MAP = sig
type key
type 'a table
val insert : key -> 'a -> 'a table -> 'a table
...
end

functor RBTree(structure Ord : ORDER)
:> MAP where type key = Ord.t =
struct  ...   end

1. The ORDER signature defines an abstract type t.
• True
• False
1. If applying the RBTree functor to some concrete structure Foo passes the type checker, then Foo must not define a type t.
• True
• False
1. The MAP signature defines an abstract type key.
• True
• False
1. The structure that results from applying the functor RBTree to structure Foo will satisfy the MAP signature refined by the where type clause so that key type is defined to be equal to Foo’s definition of type t.
• True
• False

## Extended example: Error-tracking Interpreter

Reusable Abstractions: Extended Example

Error-tracking interpreter for a toy language

### Why this example?

• Sketch composing large systems

• Lots of interfaces using ML signatures

• Some ambitious, very abstract abstractions—it’s toward the end of term, and you should see something ambitious.

• Practice implementing functors

Error modules: Three common implementations

• Collect all errors

• Keep just the first error

• Keep the most severe error

Classic accumulator’’ for errors

signature ERROR = sig
type error   (* a single error *)
type summary (* summary of what errors occurred *)

val nothing : summary                  (* no errors *)
val <+> : summary * summary -> summary (* combine *)

val oneError : error -> summary

(* laws:   nothing <+> s == s
s <+> nothing == s
s1 <+> (s2 <+> s3) == (s1 <+> s2) <+> s3
// associativity
*)
end


First Error: Implementation

structure FirstError :>
ERROR where type error = string
and type summary = string option =
struct
type error   = string
type summary = string option

val nothing = NONE
fun <+> (NONE,   s) = s
| <+> (SOME e, _) = SOME e

val oneError = SOME
end


All Errors: Implementation

structure AllErrors :>
ERROR where type error   = string
and type summary = string list =
struct
type error   = string
type summary = error list

val nothing = []
val <+> = op @
fun oneError e = [e]
end


### Check your understanding: Error signature and implementations

Consider the ERROR signature:

signature ERROR = sig
type error   (* a single error *)
type summary (* summary of what errors occurred *)

val nothing : summary                  (* no errors *)
val <+> : summary * summary -> summary (* combine *)

val oneError : error -> summary

(* laws:   nothing <+> s == s
s <+> nothing == s
s1 <+> (s2 <+> s3) == (s1 <+> s2) <+> s3
// associativity
*)
end


1. There is only one sensible way to implement this signature.
• True
• False
1. In the signature, the type error represents one detected error.
• True
• False
1. In the signature, the combinator <+> concatenates a new error onto a summary of the errors detected so far.
• True
• False

### Computations Abstraction

Computations either:

• return a value

• produce an error

Errors must be threaded through everything :-(

Example: Simple arithmetic interpreter

(* Given: *)
datatype 'a comp = OK of 'a | ERR of AllErrors.summary 
datatype exp = LIT  of int
| PLUS of exp * exp
| DIV  of exp * exp
(* Write an evaluation function that tracks errors. *)
val eval : exp -> int comp = ...

### Reflection: Error tracking interpreter

Given the AllErrors implementation of the ERROR signature:

structure AllErrors :>
ERROR where type error   = string
and type summary = string list =
struct
type error   = string
type summary = error list

val nothing = []
val <+> = op @
fun oneError e = [e]
end


and the following datatype declarations:

datatype 'a comp = OK of 'a | ERR of AllErrors.summary

datatype exp = LIT  of int
| PLUS of exp * exp
| DIV  of exp * exp
1. Fill in evaluation code for PLUS case using the AllErrors implementation of ERROR signature:
fun eval (LIT n) = OK n
| eval (PLUS (e1, e2)) = ...
1. Fill in evaluation code for DIV case using the AllErrors implementation of ERROR signature:
fun eval (LIT n) = OK n
| ...
| eval (DIV  (e1, e2)) - ...

Interpreter: LIT and PLUS cases

fun eval (LIT n) = OK n
| eval (PLUS (e1, e2)) =
(case eval e1
of OK v1 =>
(case eval e2
of OK  v2 => OK (v1 + v2)
| ERR s2 => ERR s2)
| ERR s1 =>
(case eval e2
of OK  _  => ERR s1
| ERR s2 => ERR (AllErrors.<+> (s1, s2))))



Interpreter: DIV case

  | eval (DIV (e1, e2)) =
(case eval e1
of OK v1 =>
(case eval e2
of OK   0 => ERR (AllErrors.oneError "Div 0")
| OK  v2 => OK  (v1 div v2)
| ERR s2 => ERR s2)
| ERR s1 =>
(case eval e2
of OK  v2 => ERR s1
| ERR s2 => ERR (AllErrors.<+> (s1, s2)))


That’s really painful!

We can extend the computation abstraction with sequencing operations to help.

Combining computations

signature COMPUTATION = sig
type 'a comp    (* Computation! When run, results in
value of type 'a or error summary. *)
  (* A computation without errors always succeeds. *)
val succeeds : 'a -> 'a comp   
  (* Apply a pure function to a computation. *)
val <$> : ('a -> 'b) * 'a comp -> 'b comp  (* Application inside computations. *) val <*> : ('a -> 'b) comp * 'a comp -> 'b comp  (* Computation followed by continuation. *) val >>= : 'a comp * ('a -> 'b comp) -> 'b comp end ### Example: eval e1 + eval e2 (op +) (eval e1, eval e2) curry (op +) (eval e1) (eval e2) curry (op +) <$> eval e1 <*> eval e2

The first three versions are not well typed. Why?

The last version will thread errors through the compuation behind the scenes.

Note:

eval e1, eval e2 : int comp

curry (op +) : int -> (int -> int)

<$> : (int -> (int -> int)) * (int comp) -> (int -> int) comp <*> : (int -> int) comp * int comp -> int comp curry (op +) <$> eval e1 : (int -> int) comp

let pa = curry (op +) <$> eval e1 note by above, pa : (int -> int) comp pa <*> eval e2 : int comp Buckets of algebraic laws succeeds a >>= k == k a // identity comp >>= succeeds == comp // identity comp >>= (fn x => k x >>= h) == (comp >>= k) >>= h // associativity succeeds f <*> succeeds x == succeeds (f x) // success ... Environments can use computation abstraction, too! signature COMPENV = sig type 'a env (* environment mapping strings to values of type 'a *) type 'a comp (* computation of 'a or an error summary *) val lookup : string * 'a env -> 'a comp end ### Check your understanding: Computation combinators Consider the signature of the computation abstraction: signature COMPUTATION = sig type 'a comp (* Computation! When run, results in value of type 'a or error summary. *) (* A computation without errors always succeeds. *) val succeeds : 'a -> 'a comp (* Apply a pure function to a computation. *) val <$> : ('a -> 'b) * 'a comp -> 'b comp

(* Application inside computations. *)
val <*> : ('a -> 'b) comp * 'a comp -> 'b comp

(* Computation followed by continuation. *)
val >>= : 'a comp * ('a -> 'b comp) -> 'b comp
end


1. The type 'a comp represents a computation that either produces a value of type 'a or results in an error.
• True
• False
1. A value of type 'a comp created by calling the succeeds combinator will always return a value of type 'a.
• True
• False
1. The <*> combinator applies a pure function f with type 'a -> 'b to a value with type 'a comp.
• True
• False

Error-tracking interpreter

functor InterpFn(
structure Error : ERROR
structure Comp  : COMPUTATION
structure Env   : COMPENV
val zerodivide  : Error.error
val error       : Error.error -> 'a Comp.comp
sharing type Comp.comp = Env.comp) =
struct
val (<*>, <$>, >>=) = (Comp.<*>, Comp.<$>, Comp.>>=)

(* Definition of Interpreter *)

end

Error-tracking intepreter, continued

datatype exp = LIT of int
| VAR of string
| PLUS of exp * exp
| DIV  of exp * exp
fun eval (e, rho) =
let fun ev(LIT n) = Comp.succeeds n
| ev(VAR x) = Env.lookup (x, rho)
| ev(PLUS (e1, e2)) = curry op + <$> ev e1 <*> ev e2 | ev(DIV (e1, e2)) = ev e1 >>= (fn n1 => ev e2 >>= (fn n2 => case n2 of 0 => error zerodivide | _ => Comp.succeeds (n1 div n2))) in ev e end Computation abstraction is a monad’’ Supported by special syntax in Haskell: eval :: Exp -> Hopefully Int eval (LIT v) = return v eval (PLUS e1 e2) = do { v1 <- eval e1 ; v2 <- eval e2 ; return (v1 + v2) } eval (DIV e1 e2) = do { v1 <- eval e1 ; v2 <- eval e2 ; if v2 == 0 then Error "div 0" else return (v1 div v2) } ### Check your understanding: Using the computation abstraction Consider the following code for implementing an expression interpreter using the computation abstraction: functor InterpFn( structure Error : ERROR structure Comp : COMPUTATION structure Env : COMPENV val zerodivide : Error.error val error : Error.error -> 'a Comp.comp sharing type Comp.comp = Env.comp) = struct val (<*>, <$>, >>=) = (Comp.<*>, Comp.<$>, Comp.>>=) datatype exp = LIT of int | VAR of string | PLUS of exp * exp | DIV of exp * exp fun eval (e, rho) = let fun ev(LIT n) = Comp.succeeds n | ev(VAR x) = Env.lookup (x, rho) | ev(PLUS (e1, e2)) = curry op + <$> ev e1 <*> ev e2
| ev(DIV (e1, e2))  = ev e1 >>= (fn n1 =>
ev e2 >>= (fn n2 =>
case n2
of 0 => error zerodivide
| _ => Comp.succeeds
(n1 div n2)))
in  ev e
end

end


1. The InterpFn functor takes as a parameter a value declaration that represents a divide by zero error. This value is used in the evaluation of the DIV operation.
• True
• False
1. The evaluation of the VAR case can never fail and so doesn’t need to be part of the computation abstraction.
• True
• False
1. The evaluation of the PLUS operation uses the <\$> and <*> combinators to thread errors through the computation.
• True
• False
1. In the evaluation of DIV, n1 is bound to the result of evaluating e1 only in the case where e1 evaluates without error.
If e1 evaluates to an error, the rest of the computation is short-circuited and the error is returned directly.
• True
• False

## ML Module summary

ML module summary

New syntactic category: declaration

• Of type, value, exception, or module

Signature groups declarations: interface

Structure groups definitions: implementation

Functor (generic module) enables reuse:

• Formal parameter: declarations
• Actual parameter: definitions

Opaque ascription hides information

• Enforces abstraction

# 2 December 2020: Drive-by tour of lambda calculus

There are PDF slides for 12/3/2020.

### Retrospective: SML’s Module System

Write one thing you learned about ML’s module system so far and one question you have. Your answers may be anonymously shared with your classmates.

### Reflection: Models of computation

Think back to our formal models of impcore and μscheme. The simpler a formal model is, while still capturing the essence of what is being modelled, the better, because it allows us to focus on what matters most.

In programming languages, we generally want the langauge to be Turing complete. Practically speaking, that means it has to be able to express conditionals, natural numbers, and general recursion.

Describe how you would go about designing a minimal language that supported these features.

### Why study lambda calculus?

• A metalanguage for describing other languages, known to all educated PL people

Church-Turing Thesis: Any computable operator can be encoded in lambda calculus

• Theoretical underpinnings for most programming languages (all in this class).

• Test bench for new language features

### The world’s simplest reasonable programming language

Syntax: Only three forms!

M ::= x | \x.M | M M'

Everything is programming with functions

• Everything is curried

• Application associates to the left

• Arguments are not evaluated before application

First example:

(\x.\y.x) M N --> (\y.M) N --> M

Crucial:

• Argument N is never evaluated (could have an infinite loop)
• Substitution is “capture-avoiding”

## Programming in Lambda Calculus

Absolute minimum of code forms:

• no set, while, begin
• but also no if and no define!

Assembly language of programming languages

Systematic approach to constructed data:

• Everything is continuation-passing style

• EXCEPT for the way they encode lists, which is bogus (violates abstraction)

### Coding Booleans and if expressions

Q: Who remembers the boolean-formula solver?

Q: What classes of results could it produce?

Q: How were the results delivered?

Q: How shall we do Booleans?

A Boolean takes two continuations:

• one for true, one for false

Laws:

true  s f = s
false s f = f

Code:

true  = \x.\y.x
false = \x.\y.y

Coding the if expression, laws:

if true  then T else E = T
if false then T else E = E

Therefore, code is:

if B then T else E =
if-then-else B T E = \B.\T.\E.B T E

Code:

true  = \x.\y.x
false = \x.\y.y
if    = \b.\t.\e.b t e

### Coding Interlude: Coding not in lambda calculus

Given that we can code booleans in lambda calculus as

true  = \x.\y.x
false = \x.\y.y
if    = \b.\t.\e.b t e

your job is to write the not function.

• As a first step, write the algebraic laws for not. What are the forms of the input data? For each form, what should the output be?

• Now write the code for not

You can try out your code in the lamstep interpreter on the homework cluster.

## Coding data structures

### Coding pairs

• To pair two values, how many alternatives are there?
(A: one, pair constructor)

• How many continuations?
(A: two, one for each element of the pair)

• What information does each expect?
(A: the two elements of the pair)

• What are the algebraic laws?

fst (pair x y) = x
snd (pair x y) = y
• Code pair, fst, snd
pseudo-code:

pair x y k = k x y
fst p = p (\x.\y.x)
snd p = p (\x.\y.y)

lambda calculus:

pair = \x.\y.\k.k x y
fst  = \p.p (\x.\y.x)
snd  = \p.p (\x.\y.y)
• Code pair, fst, snd

pair x y k = k x y
fst p = p (\x.\y.x)
snd p = p (\x.\y.y)

pair = \x.\y.\k.k x y
fst  = \p.p (\x.\y.x)
snd  = \p.p (\x.\y.y)

### Coding lists

• List laws

null? nil         = true
null? (cons x xs) = false
car (cons x xs)   = x
cdr (cons x xs)   = xs
• How many ways can lists be created?  (A: two: nil and cons)

• How many continuations?
(A: two, one for each, kn and kc)

• What does each continuation expect?

• nil: nothing
• cons: car and cdr
• For each creator, what are its continuation laws?

cons y ys kn kc = kc y ys
nil       kn kc = kn

car xs = xs error (\y.\ys.y)
cdr xs = xs error (\y.\ys.ys)

null? xs = xs true (\y.\ys.false)
• What are the definitions?

cons = \y.\ys.\kn.\kc.kc y ys
nil  = \kn.\kc.kn

car = \xs.xs error (\y.\ys.y)
cdr = \xs.xs error (\y.\ys.ys)

null? = \xs.xs true (\y.\ys.false)
• What do those second continuations look like? (This is the source of Wikipedia’s terrible hack)

• What are the definitions?

cons = \y.\ys.\kn.\kc.kc y ys
nil  = \kn.\kc.kn

car = \xs.xs error (\y.\ys.y)
cdr = \xs.xs error (\y.\ys.ys)

null? = \xs.xs true (\y.\ys.false)

### Check your understanding: Coding data structures in lambda calculus

1. The encoding of a pair constructor in lambda calculus is as a function that takes the two elements x and y of the pair and a continuation k that says what to do with those elements.
• True
• False
1. The function \p.p (\x.y.x) encodes the snd projection function on pairs in the lambda calculus.
• True
• False
1. Lists are coded in lambda calculus as functions that take two continuations: one for what to do if the list is nil and one for what to do if the list is a cons cell. The nil continuation takes no arguments, while the cons case takes one argument: what to do for the head of the list.
• True
• False
1. The function \xs.xs true (\y.\ys.false) encodes the null? function on lists in the lambda calculus.
• True
• False

## Coding numbers: Church Numerals

Church numerals

Key idea: Encode a number as the number of times a function f is applied.

The number n: Apply f to x, n times

 \f.\x.fold f x n

Church Numerals in lambda calculus

zero  = \f.\x.x;
succ  = \n.\f.\x.f (n f x);
plus  = \n.\m.n succ m;
times = \n.\m.n (plus m) zero;
...
-> four;
\f.\x.f (f (f (f x)))
-> three;
\f.\x.f (f (f x))
-> times four three;
\f.\x.f (f (f (f (f (f (f (f (f (f (f (f x)))))))))))


### Check your understanding: Church numerals

1. The Church numeral encoding of the number 3 is \f.\x.f(f(f(f x)))
• True
• False
1. The function \n.\f.\x.f(n f x) is the successor function on Church numerals
• True
• False

Taking stock:

• bools
• pairs
• lists
• numbers

Q: What’s missing from this picture?

A: Recursive functions.

Astonishing fact: We don’t need letrec or val-rec.

• We can encode recursion in the lambda calculus!
• Recursion + numbers makes the lambda calculus Turing complete.

### Reflection: Encoding recursion

What value of
f
solves this equation:

f = λn.if n = 0 then 1 else n × f(n − 1)?

## Coding recursion

Factorial in lambda calculus

Wish for:

fact = \n.(zero? n) 1 (times n (fact (pred n)));

But: on right-hand side, fact is not defined.

Successive approximations

Function bot always goes into an infinite loop.

Consider how these functions relate to factorial:

fact0 = \n.(zero? n) 1 (times n (bot   (pred n)));

fact1 = \n.(zero? n) 1 (times n (fact0 (pred n)));

fact2 = \n.(zero? n) 1 (times n (fact1 (pred n)));

General pattern:

fact-i = \n.(zero? n) 1 (times n (fact-i-1   (pred n)));

cessive approximations (construct

fact-i = \n.(zero? n) 1 (times n (fact-i-1   (pred n)));

g = \f.  \n.(zero? n) 1 (times n (f          (pred n)));

fact0 = g bot;

fact1 = g fact0;      // = g (g bot)

fact2 = g fact1;      // = g (g (g bot))

fact3 = g fact2;      // = g (g (g (g bot)))

...

Fixed point

Suppose f = g f.

f is a fixed point of g.

Claim: f n is n factorial!

Proof: by induction on n.

Left as an exercise to the viewer :-)

Fixed-point combinator

What if we had a combinator fix such that

fix g = g (fix g)

Term fix g is a fixed point of g, and so by previous slide,

fix g n is n factorial!

fix g = g (fix g)
= g (g (fix g))
= g (g (g (fix g)))
= ...

Expand as much as you need to.

### Check your understanding: Encoding recursion

Consider the recursively-defined length function:

length = \xs.null? xs 0 (+ 1 (length (cdr xs)))

If we were to take the fixpoint of the following functions, which of them would produce the length function?

• lg1 = \lf.\xs.null? xs 1 (+ 1 (lf (cdr xs)))
• lg2 = \lf.\xs.null? xs 0 (+ 0 (lf (cdr xs)))
• lg3 = \lf.\xs.null? xs 0 (+ 1 (lf (cdr xs)))
• lg4 = \lf.\xs.null? xs 0 (+ 0 (lf (car xs)))

### Using fixed points

Enrichment Reflection.

For extra credit, for each of the following recursive equations, say whether there is a solution, (ie, a recursive function that makes the equation true). If there is a solution, say whether it is unique and what it is.

f1 = \n.\m.(eq? n m) n
(plus n (f1 (succ n) m));

f2 = \n.f2 (isZero? n 100 (pred n));

f3 = \xs.xs nil (\z.\zs.cons 0 (f3 zs));

f4 = \xs.\ys.f4 ys xs;

### Summary

• Lambda calculus is Turing Complete

• Conditionals (via continuations)
• Numbers (via Church encoding)
• Recursion (via Y combinator)
• Essence of most programming languages

• All that we have studied
• Evaluation proceeds by substituting arguments for formal variables (called beta reduction)

# 7 December 2020: Producing Software with Integrity: Motivations, Capabilities, and Impediments

There were no PDF slides on 12/8/2020.

### Retrospective: Lambda calculus

Write one thing you learned about lambda calculus so far and one question you have. Your answers may be anonymously shared with your classmates.

### Reflection: Security-critical software

Despite high-profile data breaches stemming from flaws in software and frequent demonstrations of the insecurity of security and safety critical systems at Black Hat, high-quality security-critical software seems to remain elusive. Why do you think that might be?

### Misused buffers are a major security problem

• Graph showing number of buffer overflow

• CVE: Common Vulnerability and Exposures Database (>67,000)
• CWE: Common Weaknesses Enumeration (CWE) (120 categories)
• CVSS: Common Vulnerability Scoring System (CVSS)

• score 1-10, above a 7 is ‘severe’
• severe vulnerability has these characteristics:

• network based attack
• complexity is low
• no privilege is required
• no user interaction is required
• no workaround exists
• buffer overflow most common: > 12,000 identified CVEs, more often than not secure

• Examples:

• Heartbleed: missing bounds check, exfiltration
• Microsoft Security Bulletin MS15-078: buffer underflow, remote code execution in all supported windows platforms

1. Given all the vulnerabilities listed in MITRE’s Common Vulnerability and Expostures Database, issues related to misusing buffers are a relatively minor problem.
• True
• False
1. Buffer overflow/underflow attacks such as Heartbleed and the issue reported in MS security bulletin MS15-078 can result in exfiltrating information from a system without leaving any trace in log files and/or allow a remote attacker to take over control of a computer.
• True
• False

### Check your understanding: Consequences of poor cybersecurity

1. Ransomware is a way hackers can monetize malware. It typical works when hackers break into a system and encypt files. They then demand payment to decrypt the files. Whether or not victims pay the ransom, recovery costs can be in the millions of US dollars.
• True
• False
1. When breaking into automobiles and other cyber-physical systems, hackers have to be physically close to the system they are attacking.
• True
• False
1. Although disguised as anonymous ransomware, subsequent analysis has revealed that NotPetya constituted a purely destructive attack launched by one sovereign nation (Russia) against another (Ukraine). It caused billions of U.S. dollars of damage, both within the Ukraine and worldwide.
• True
• False
1. There is no known evidence of successful attacks on critical infrastructure such as power distribution systems.
• True
• False

### Reflection: Techniques to build better software

What approaches do you think might be effective for building higher quality software?

### Check your understanding: Proof of concept for building higher-quality software

1. No C code was involved in the high-assurance code for the SMACCMcopter or Boeing’s Unmanned Little Bird.
• True
• False
1. At the end of DARPA’s HACMS program, the red team was unable to disrupt the flight operations or communication channels of Boeing’s Unmanned Little Bird despite being given root access to one of the partitions on the helicopter.
• True
• False

### Reflection: Impediments to applying known technology

Assuming we do in fact know how to build higher-quality software, why do you think we might be choosing not to use such capabilities?

### Check your understanding: Misaligned incentives

1. In retail banking in the 1990s, the UK had lower fraud rates than the US because consumers were held responsible.
• True
• False
1. Critical infrastructure operators are making uninformed decisions when they connect their cyber-physical systems to the internet given the security risks that such linkages cause.
• True
• False

### Check your understanding: Information asymmetries

1. Information asymmetry is a market failure that plagues the software security market: customers can’t evaluate the security of software and so aren’t willing to pay a premium for software with better security. As a result, vendors are reluctant to invest in better security because they won’t be able to recoup the cost in a higher price for their product.
• True
• False
1. Another example of information assymetry in the software security industry is that lack of information about cybersecurity incidents. Lacking such data means we cannot properly assess the size of the problem, making it difficult to gauge whether we are properly allocating defensive resources.
• True
• False

### Check your understanding: Possible policy interventions

1. Mandatory disclosure of data breaches has increased the attention paid to data breaches, making cyberinsurance for data breaches the most mature segment of the cyberinsurance market.
• True
• False
1. Only a handful of insurance companies offer cyber insurance because it is too hard to measure the risk.
• True
• False
1. The software industry has successfully lobbied against software product liability on the grounds that it would stifle innovation.
• True
• False

### Producing software with integrity

• Software flaws enabled by obsolete language design enable remote code execution and information exfiltration.

• Poor cybersecurity has significant negative consequences at many levels: financial losses, damage in the physical world, and threats to national security.

• We know how to build software to much higher quality standards using a variety of techniques. HACMS highlighted some of these, but there are many others.

• Misaligned incentives and market failures such as information asymmetry explain why we are not building better software even though we know how.

• Policy interventions are necessary to align incentives and fix market failures. Possible policy interventions include requiring information disclosure, mandating cyber insurance, or introducing product liability for software.

### A Prediction

• Cybersecurity is changing the nature of international conflict.

• We are going to take it a lot more seriously

• Hopefully before something goes catastrophically wrong.

# 9 December 2020: Comp 105 Conclusion

There are PDF slides for 12/10/2020.

### Retrospective: Producing Software with Integrity

Write one thing you learned about producing software with integrity so far and one question you have. Your answers may be anonymously shared with your classmates.

## Comp105 in Review

• Review the highlights of what we’ve learned this semester:

• topics
• languages
• projects
• Worth reviewing because it’s easy to lose track

• you’ve learned a lot!
• Briefly discuss where you might go from here

• languages
• courses

## What have you learned?

Principled design process for writing code

• Algebraic laws guided by the form of data

• 9-step design process

• Recursive functions whose structure follows from laws

• Explain why function terminates
• Inductive proofs that follow same structure

• Explain why induction is well-founded

Understanding a language: Six key questions

1. What is the abstract syntax?

2. What are the values?

3. What are the environments?

4. How does evaluation happen?

5. What is the initial basis?

6. What are the types?

Operational semantics

• Precisely describe meaning of programs

• What value does a program evaluate to?

• What side effects are caused in the process?

• Written using inference rules

• Judgment forms capture all relevant information

• Environments track information about variables

• Stores map locations to values

First-class functions

• What they are

• How to use them effectively

• Lambdas create anonymous functions

• Closures

• run-time representation of functions;
• capture environment at closure-definition time
• Continuations capture the rest of the computation’’

Cost Models and Optimizations

• How many cons cells?

• How many activation records?

• Method of accumulating parameters

• Tail calls

ML

• Datatypes

• Pattern matching

• Exceptions

### Type systems

Type Systems: Big Idea

Static vs. Dynamic Typing

• Expressiveness (+ Dynamic)

• Don’t have to worry about types (+ Dynamic)

• Dependent on input (- Dynamic)

• Serve as documentation (+ Static)

• Catch errors at compile time (+ Static)

• Used in optimization (+ Static)

Type Systems: Big Idea

• Dynamic or

• Approximate or

• Non-terminating

• Example: array bounds checking

• Occasional negative consequences: e.g., Heartbleed

Type Systems: Mechanics

• Monomorphic and Polymorphic Types

• Types, Type Constructors, Quantified Types (α.τ)

• Kinds (κ) classify types:

• well-formed,

• types (*),

• type constructors: κ ⇒ κ

• Type Environments: type identifiers kinds

• Typing Rules

• Introduction and Elimination forms
• Type Checking

### Type inference

Hindley-Milner Type Inference: Big Idea

• Inferred vs Declared Types

• Advantages of Inference: write fewer types, infer more general types.

• Advantages of Declarations: better documentation, more general type system.

• Canonical example of static analysis:

• Proving properties of programs based only on text of program.

• Useful for compilers and security analysis.

Hindley-Milner Type Inference: Mechanics

• Use fresh type variables for unknown types

• Generate constraints that collect clues

• Solve constraints just before introducing quantifiers

• Compromises to preserve decideability:

• Only generalize lets and top-level declarations

• Polymorphic functions aren’t first-class

### Object-Oriented Programming

Object-Oriented Programming: Big Ideas

• “Programming-in-the-medium”

• Enables code reuse

• Easy to add new kinds of objects

• Hard to add new operations

• Algorithms smeared across many classes

• Hard to know what code is executing

• Good match for GUI programming

• Smalltalk mantra: Everything is an Object

• Can redefine basic operations

Object-Oriented Programming: Mechanics

• Classes and objects

• Computation via sending messages

• special names: self and super
• Blocks to code anonymous functions & continuations

• Inheritance for implementation reuse

• Subtyping (“duck typing”) for client code reuse

• Subtyping is not Inheritance

• Idioms for information sharing

• by convention private methods for like objects
• Double-dispatch

### Module Systems

Module Systems a la SML: Big Ideas

• “Programming-in-the-large”

• Separate implementation from interface

• Enforced modularity

• Swap implementations w/o breaking clients

Module Systems a la SML: Mechanics

• Signatures describe interfaces

• types, values, exceptions, substructures

• include to extend

• Structures provide implementations

• Signature ascription hides structure contents (Heap :> HEAP)

• Functors

• Functions over structures

• Executed at compile time

### Lambda Calculus

Lambda Calculus: Big Ideas

• Three forms:
e :  :  = x | λx.e | e1e2

• Church-Turing Thesis:

• All computable functions expressable in lambda calculus

• Data types: booleans, pairs, lists, naturals`

• Recursive functions via fixed points

• Y combinator calculates fixed points:

• Y = λf.(λx.f(x x))(λx.f(x x))

### Producing software with integrity

Producing software with integrity

• Software flaws have significant costs

• financial, physical, national-security
• We know how to build better software

• We aren’t building better software

• misaligned incentives, market failures
• Policy interventions might change that

• information disclosure, cyber insurance, product liability

### Programming experience

Programming Experience

• Recursion and higher-order functions are now second-nature to you.

• You’ll miss pattern matching and algebraic data types in any language you use that doesn’t have them!
• C for impcore (imperative language)

• Scheme (dynamically typed functional language)

• ML (statically typed functional language)

• uSmalltalk (dynamically typed OO language)

Built substantial pieces of code

• SAT solver using continuations

• Type checker (ML pattern matching!)

• Type inference system (using constraints, reading typing rules)

• BigNums (Power of OO abstractions; resulting challenges)

## Where might you go from here?

Where might you go from here?

• At the research frontier: Still evolving.

• Lazy:

• Expressions only evaluated when needed.

• Conflict with side-effects.

• Type Classes:

• ML: Hard-wire certain operations (+, *)

New languages

• Swift (Apple)

• type inference, modules, higher-order functions, pattern matching, memory management

• advanced concurrency, modules (aka packages) higher-order functions, garbage collection
• Rust (Mozilla)

• advanced type system, modules, higher-order functions, support for systems programming

• Compilers

• Software engineering

• Special Topics:

• Programming language design

• Static analysis

• Virtual machines

Congratulations!

• You have learned an amazing amount.

• You have really impressed me.

• I’m confident you will be able to use your newly developed skills in many contexts.

• When that happens, tell us about it!