6 September 2017: Introduction to Comp 105
There are PDF slides for 9/7/2017.
Handout: Experiences of successful 105 students
Announcements
Recitations start this week (tomorrow!)
First HW due Wednesday, September 13.
Overview
Why so many languages?
Question: What languages have you programmed in?
There are thousands of programming languages, each unique.
Question: Why do you suppose there are so many?
The right language for the job makes it easier to write programs that really work
An invaluable skill for software practitioners
Your language can make you or break you.
- Example
- Writing a compiler requires a language that facilitiates maniuplating tree data structures
- Relevant features: algebraic data types, pattern matching, efficient recursion, higher-order functions
The Blub paradox
- Essay by Paul Graham on the relative power of programming languages.
What this course isn’t
- Simula in September
- Objective-C in Ocobter
- Visual Basic in November
- C in December
Why not?
- Because you’d spend most of your time learning shallow details like program syntax, tool chains, and library interfaces.
What this course is:
- Reusable Principles
Why?
- Once you know the principles, you’ll be able to teach yourself new languages quickly and improve your coding in many different languages.
What are reusable principles?
What if the course were called “Cooking”?
You’d need to know something about how cooking works (THEORY)
Want to make bread? How does yeast work?
Want to avoid getting sick? Under what conditions do bacteria thrive?
Want to develop flavor? What triggers the Maillard reaction?
You’d need to know something about how to cook effectively (PRACTICE)
French cuisine: mirepoix (onions, carrots, celery cooked in butter)
Base sauces (the 5 “mother sauces” of Western cooking); Know what they are and when to use them.
The same division for programming languages:
How programming languages work (THEORY)
- MATH: logic, semantics, type theory, induction
How to make them effective (PRACTICE)
- What features enable compact, efficient, maintainable CODE?
What Programming Languages are, technically
The marriage of math and code
Principal tools: Induction and recursion
What can you get out of Comp 105?
Discover new ways think about programming (in many languages)
- For example: master using recursive, higher-order functions
Double your productivity
- By choosing the right language/feature for the job
Become a sophisticated consumer, recognizing familiar features in new languages
Learn new languages quickly
Bonus: Prepare for advanced study
(Course serves everyone from recent 15/61 grads to grad students)
Students who get the most out of 105 report
- They enjoy programming (a la 15)
- They also like math (a la 61, sort of—induction and proofs)
- They work hard
Great languages begin with great features
Language shapes your thinking
There aren’t that many great features, and you will see them over and over
You’ll choose features, and therefore languages, to fit your needs
Some are more powerful than others
Examples: first-class functions, continuations, pattern matching, type inference
In Comp 105,
We explode your brain so you can think differently
You’ll know you’re doing it right if at first your head hurts terribly, then you have a breakthrough moment and it all seems pleasant afterwards
How will we study language features?
Write (lots of!) small programs exercising those features
- High power-to-weight ratio (lots of thought per line)
Learn formal tools to describe language features precisely
Operational Semantics (What do programs mean?)
Typing Rules (What can we prove about all programs without running them?)
Extend language implementations so you understand what is under the hood.
Prove properties about various language features
Common Framework
Sequence of Scheme-based pedagogical languages with increasing power
- Simplest language: ImpCore (IMPerative CORE)
Implementation language:
Start in C
Shift to ML once we have learned that language
Foundation of operational semantics and typing rules
Course logistics and administration
Books
You must get Norman’s book (Both Volumes!!!)
ML book is optional, but very useful. You won’t need it until October 11.
Homework
Homework will be frequent and challenging:
- Many small programming problems
- Some theory problems, more like a math problem set
- The occasional larger project, like a type checker or a game solver
- Submit everything electronically
- First homework is due a week from today; designed to get you moving quickly.
- The course is relentlessly cumulative.
Both individual and pair work:
- All problems should be discussed with others
(Essential to your success) - Discussions must be acknowleged
- Most problems must be completed individually
- Do not allow anyone else to see your code.
- For some problems larger in scope, you can work in pairs
- Be very careful to separate your pair work and your individual work. (A mistake could be major violation of academic integrity, with severe penalties.)
Arc of the homework looks something like this:
Assignment | Difficulty |
---|---|
impcore | one star |
opsem | two stars |
scheme | three stars |
hofs | four stars |
And it’s more or less four-star homeworks from there on out.
Lesson: Don’t make decisions based on the first couple of homeworks!
Just as intellectually challenging as COMP 40, but in an entirely different direction.
- Not “How long until this huge pile of code works?”
- Instead “How long until I get the Aha! Moment that makes these 10 lines work?”
Everyone who takes this class has the ability to master the material; Succeeding just requires digging in.
We provide lots of resources to help:
Lectures
Readings
Recitations
Office hours
Piazza
We encourage you to form study groups so you have thought partners.
Two two bad habits to avoid:
Working on your own.
Trying to cram the assignments at the last minute.
The role of lectures
We don’t cover everything in lecture
Lecture is for just the hard parts
We’ll talk very little about the code (just the interesting bits)
In a 100-level course, you are responsible for your own learning
Course evaluations from previous years: a few students want everything gone over in lecture. That’s not how things work in real life, and that’s not how things work here. We point you in the right direction and identify traps and pitfalls, and we find good problems for you to work on.
If you’re expecting to see everything in lecture, you have a couple of choices: change your expectations, or take the course next year when you will have more experience and will be more prepared to manage your own learning.
Recitations
- Class goes very fast; recitations provide chance to dig in to key topics with classmates.
- Start this Thursday and Friday.
- Location information available in SIS.
- Designed to be interactive, so bring your thinking cap.
- Count towards class participation.
Questions and answers on Piazza
- Don’t just ask questions; answer them too.
- Both activities count toward class participation.
- Be super careful that any question containing your code must be private. (This is an issue of academic integrity.)
Other policies and procedures on the web
- You are responsible!
- Treasure Hunt for class participation points
What am I called?
Call me “Kathleen,” “Professor Fisher”, or “Profesor.”
ImpCore: The first language in our common framework
Exercise: all-fours?
Write a function that takes a natural number n and returns true (1) iff all the digits in n are 4’s.
- Impcore interpretor: > impcore
- Command (use homework/fours.imp); will load file into interpretor
- Syntax: parentheses with keyword or function name to start
- An Impcore program is a sequence of definitions (and expresions)
Scoping rules for Impcore
Scopes also called “name spaces”; we will call them “environments” because that’s the pointy-headed theory term—and if you want to read some of the exciting papers, pointy-headed theory has to be second nature.
Recursion: a review
Ways a recursive function could decompose a natural number n.
Peel back one (Peano numbers):
n = 0 n = m + 1, m is also a natural number
Split into two pieces:
n = 0 n = k + (n - k) 0 < k < n (everything gets smaller)
Sequence of decimal digits (see study problems on digits)
n = d, where 0 <= d < 10 n = 10 * m + d, where 0 <= d < 10 and m > 0
To do your homework problems, which I recommend starting today, you’ll need to invent at least one more.
11 September 2017: Introduction to Semantics
There are PDF slides for 9/12/2017.
Handout: 105 Impcore Semantics, Part 1
Today: Abstract Syntax and Operational Semantics
Discussion: Two things you learned last class.
Programming-language semantics
Semantics means meaning.
What problem are we trying to solve?
Know what’s supposed to happen when you run the code
Ways of knowing:
- People learn from examples
- You can build intuition from words
(Book is full of examples and words) - To know exactly, unambiguously, you need more precision
Q: Does anyone know the beginner exercise “make a peanut butter and jelly sandwich”? (Videos on YouTube)
- You can watch and learn, but a computer can’t.
- “Put the peanut butter on the bread”
Why bother with precise semantics?
Same reason as other forms of math:
- Distill understanding
- Express it in sharable way
- Prove useful properties. For example:
- private information doesn’t leak
- device driver can’t crash the OS kernel
- compiler optimizations prserve program meaning
- Most important for you: things that look different are actually the same
Plus, needed to build language implementation and tests
The programming languages you encounter after 105 will certainly look different from what we study this term. But most of them will actually be the same. Studying semantics helps you identify that.
The idea: The skills you learn in this class will apply
Behavior decomposes
We want a computational notion of meaning.
What happens when we run
(* y 3)
?
We must know something about *
, y
, 3, and function application.
Knowledge is expressed inductively
Atomic forms: Describe behavior directly (e.g., constants, variables)
Compound forms: Behavior specified by composing behaviors of parts
(Non)-Example of compositionality: Spelling/pronunciation in English
fish
vsghoti
- Both composed from letters, but no rules of composition for pronunciation.
By design, programming languages more orderly than natural language.
ASTs
Question: What do we assign behavior to?
Answer: The Abstract Syntax Tree (AST) of the program.
An AST is a data structure that represents a program.
A parser converts program text into an AST.
Question: How can we represent all while loops?
while (i < n && a[i] < x) { i++ }
Answer:
- Tag code as a while loop
- Identify the condition, which can be any expression
- Identify the body, which can be any expression
As a data structure:
- WHILEX(exp1, exp2), where
- exp1 is the representation of (i < n && a[i] < x), and
- exp2 is the representation of i++
With that as background, we can now dive in to the semantics for Impcore!
13 September 2017: Semantics, Syntactic Proofs, Metatheory
There are PDF slides for 9/14/2017.
Handout: Impcore expression rules
Announcements
- Impcore homework due tonight
- Opsem homework now available
Today
- Operational semantics of function application
- How we know what the code is supposed to do at run time: valid derivations
- What we know about valid derivations: metatheory
Last Time
- Compositionality
- Abstract Syntax Trees
- Environments: Globals (ξ), Functions (ϕ), Locals (ρ)
- Abstract machines
- Evaluation judgement
- Operational Semantics
- Correspondance between code and inference rules
Both math and code on homework
You’re good with code—lecture and recitation will focus on math
Questions:
- In what order are the actual parameters evaluated?
- How can you tell?
What happens if the formal parameter names are duplicated?
- How many formal parameters can the body of f access?
- What are their names?
Can changes to formal parameters in the body of f be seen by the code calling f?
Can changes to globals in the body of f be seen by the code calling f?
Using Operational Semantics
The big idea:
Every terminating computation is described by a data structure—we’re going to turn computation into a data structure. Proofs about computations are hard (see: COMP 170), but proofs about data structures are lots easier (see: COMP 61).
Valid derivations, or “How do I know what this program should evaluate to?”
Code example
(define and (p q)
(if p q 0))
(define digit? (n)
(and (<= 0 n) (< n 10)))
Suppose we evaluate (digit? 7)
Exercise:
In the body of
digit?
, what expressions are evaluated in what order?As a function application, the body matches template
(
f e1 e2)
. In this example,- What is f?
- What is e1?
- What is e2?
What is the result of (digit? 7)
?
How do we know it’s right?
From rules to proofs
What can a proof tell us?
Example derivation (rules in handout)
Building derivations
At this point, we’ve now covered derivations and how a single derivation corresponds to evaluating a particular program.
Proofs about all derivations: Metatheory
Cases to try:
- Literal
- GlobalVar
- SetGlobal
- IfTrue
- ApplyUser2
For your homework, “Theory Impcore” leaves out While and Begin rules.
18 September 2017: Metatheory wrapup. Intro to functional programming
There are PDF slides for 9/19/2017.
Announcements
- Impcore homework returned via email
Today
- More induction on derivations (metatheory)
- Introduction to Scheme
Last Time
- Operational semantics of function application
- A valid derivation defines the execution of a single program.
- Metatheory allows us to prove things about all programs in the language.
Where are we going?
Recursion and composition:
Recursive functions in depth
Two recursive data structures: the list and the S-expression
More powerful ways of putting functions together (compositionality again, and it leads to reuse)
Recursion comes from inductive structure of input
Structure of the input drives the structure of the code.
You’ll learn to use a three-step design process:
- Inductive structure
- Equations (“algebraic laws”)
- Code
To discover recursive functions, write algebraic laws:
sum 0 = 0
sum n = n + sum (n - 1)
Which direction gets smaller?
Code:
(define sum (n)
(if (= n 0) 0 (+ n (sum (- n 1)))))
Another example:
exp x 0 = 1
exp x (n + 1) = x * (exp x n)
Can you find a direction in which something gets smaller?
Code:
(define exp (x m)
(if (= m 0)
1
(* x (exp x (- m 1)))))
For a new language, five powerful questions
As a lens for understanding, you can ask these questions about any language:
What is the abstract syntax? What are the syntactic categories, and what are the terms in each category?
What are the values? What do expressions/terms evaluate to?
What environments are there? That is, what can names stand for?
How are terms evaluated? What are the judgments? What are the evaluation rules?
What’s in the initial basis? Primitives and otherwise, what is built in?
(Initial basis for μScheme on page 157)
Introduction to Scheme
Question 2: What are the values?
Two new kinds of data:
The function closure: the key to “first-class” functions
Pointer to automatically managed cons cell (mother of civilization)
Graphically:
(cons 3 (cons 2 ’()))
Scheme Values
Values are S-expressions.
An S-expression is either
a symbol
'Halligan
'tufts
a literal integer
0
77
a literal Boolean
#t
#f
(cons
v1 v2)
, where v1 and v2 are S-expressions
Many predefined functions work with a list of S-expressions
A list of S-expressions is either
the empty list
'()
(cons
v1 v2)
, where v1 is an S-expression and v2 is a list of S-expressionsWe say “an S-expression followed by a list of S-expressions”
S-Expression operators
Like any other abstract data type, S-Expresions have:
creators that create new values of the type
'()
producers that make new values from existing values
(cons s s')
mutators that change values of the type (not in uScheme)
observers that examine values of the type
number?
symbol?
boolean?
null?
pair?
car
cdr
N.B. creators + producers = constructors
The symbol ’ is pronounced “tick.”
It indicates that what follows is a literal.
Picture of (cons c (cons b (cons a '())))
Your turn!
- What is the representation of
'((a b) (c d))
which can be alternatively written
cons( (cons a (cons b '()))
`(cons (cons c (cons d '())) '()))`
- What is the representation of
cons('a 'b)
Contrast this representation with the one for
cons('a '())
Both of these expressions are S-expressions, but only cons('a '())
is a list.
20 September 2017: More Scheme
There are PDF slides for 9/21/2017.
Announcements
- OpSem Homework due tonight
- Scheme I: Recursive Programming with Lists Homework now available.
Today
- Lists
- Algebraic Laws for writing functions
- The
cons
cost model - The method of accumulating parameters
Last Time
- Wrapped up induction on derivations
- Introduction to Scheme
- S-expressions and cons cells
Lists
Subset of S-Expressions.
Can be defined via a recursion equation or by inference rules:
Constructors: '(),
cons`
Observers: null?
, pair?
, car
, cdr
(also known as first
and rest
, head
and tail
, and many other names)
Why are lists useful?
Sequences a frequently used abstraction
Can easily approximate a set
Can implement finite maps with association lists (aka dictionaries)
You don’t have to manage memory
These “cheap and cheerful” representations are less efficient than balanced search trees, but are very easy to implement and work with—see the book.
The only thing new here is automatic memory management. Everything else you could do in C. (You can have automatic memory management in C as well.)
Immutable data structures
Key idea of functional programming. Instead of mutating, build a new one. Supports composition, backtracking, parallelism, shared state.
Review: Algebraic laws of lists
You fill in these right-hand sides:
(null? '()) ==
(null? (cons v vs)) ==
(car (cons v vs)) ==
(cdr (cons v vs)) ==
(length '()) ==
(length (cons v vs)) ==
Combine creators/producers with observers to create laws.
Can use laws to prove properties of code and to write better code.
Recursive functions for recursive types
Any list is therefore constructed with '()
or with cons
applied to an atom and a smaller list.
- How can you tell the difference between these types of lists?
- What, therefore, is the structure of a function that consumes a list?
Example: length
Algebraic Laws for length
Code:
;; you fill in this part
Algebraic laws to design list functions
Using informal math notation with .. for “followed by” and e for the empty sequence, we have these laws:
xs .. e = xs
e .. ys = ys
(z .. zs) .. ys = z .. (zs .. ys)
xs .. (y .. ys) = (xs .. y) .. ys
The underlying operations are append
, cons
, and snoc.
Which ..’s are which?
But we have no
snoc
If we cross out the
snoc
law, we are left with three cases… but case analysis on the first argument is complete.So cross out the law
xs .. e == xs
.
Example: Append
- Which rules look useful for writing append?
You fill in these right-hand sides:
(append '() ys) ==
(append (cons z zs) ys) ==
Why does it terminate?
Cost model
The major cost center is cons
because it corresponds to allocation.
How many cons cells are allocated?
Let’s rigorously explore the cost of append.
Induction Principle for List(Z)
Suppose I can prove two things:
IH (’())
Whenever z in Z and also IH(zs), then IH (cons z zs)
then I can conclude
Forall zs in List(Z), IH(zs)
Example: The cost of append
Claim: Cost (append xs ys) = (length xs)
Proof: By induction on the structure of xs.
Base case: xs = ’()
I am not allowed to make any assumptions.
(append '() ys) = { because xs is null } ys
Nothing has been allocated, so the cost is zero.
(length xs)
is also zero.Therefore, cost =
(length xs)
.
Inductive case: xs = (cons z zs)
I am allowed to assume the inductive hypothesis for
zs
.Therefore, I may assume the number of cons cells allocated by
(append zs ys)
equals(length zs)
Now, the code:
(append (cons z zs) ys) = { because first argument is not null } = { because (car xs) = z } = { because (cdr xs) = zs } (cons z (append zs ys))
The number of cons cells allocated is 1 + the number of cells allocated by
(append zs ys)
.cost of (append xs ys) = { reading the code } 1 + cost of (append zs ys) = { induction hypothesis } 1 + (length zs) = { algebraic law for length } (length (cons z zs)) = { definition of xs } (length xs)
Conclusion: Cost of append is linear in length of first argument.
Example: list reversal
Algebraic laws for list reversal:
reverse '() = '()
reverse (x .. xs) = reverse xs .. reverse '(x) = reverse xs .. '(x)
And the code?
The list1
function maps an atom x
to the singleton list containing x
.
How many cons cells are allocated? Let’s let n = |xs|
.
- Q: How many calls to
reverse
? A:n
- Q: How many calls to
append
? A:n
- Q: How long a list is passed to
reverse
? A:n-1
,n-2
, … ,0
- Q: How long a list is passed as first argument to
append
? A:n-1
,n-2
, … ,0
- Q: How many
cons
cells are allocated by call tolist1
? A: one per call toreverse
. - Conclusion: O(n2) cons cells allocated. (We could prove it by induction.)
The method of accumulating parameters
The function revapp
takes two list arguments xs
and ys
.
It reverses xs
and appends the result to ys
:
(revapp xs ys) = (append (reverse xs) ys)
Write algebraic laws for revapp
involving different possible forms for xs
.
Who could write the code?
The cost of this version is linear in the length of the list being reversed.
Parameter ys
is the accumulating parameter.
(A powerful, general technique.)
Linear reverse, graphically
We call reverse
on the list '(1 2 3)
:
Function reverse
calls the helper function revapp
with '()
as the ys
argument:
The xs
parameter isn’t '()
, so we recursively call revapp
with the cdr
of xs
and the result of cons
ing the car
of xs
onto ys
:
The xs
parameter still isn’t '()
, so we again call revapp
recursively:
Still not '()
, so we recurse again:
This time xs
is '()
, so now we just return ys
, which now contains the original list, reversed!
25 September 2017: Let and Lambda
There are PDF slides for 9/26/2017.
Announcements
- Scheme I HW due 9/27
Last Time
- Inductive definitions: List of Z
- List functions (
'()
,cons
,car
,cdr
,null?
) - Cost model: number of
cons
allocations - Accumulating parameters:
revapp
Today
- Association lists [Not covered in class]
- Let construct
- Anonymous functions
Association lists represent finite maps [Not covered in class]
Implementation: List of key-value pairs
'((k1 v1) (k2 v2) ... (kn vn))
Picture with spine of cons cells
Functions car
, cdar
, caar
, cadar
can help navigate.
car
: Contents of the address registercaar
: Contents of the address then address registercdar
: Contents of the address then data registercadar
: Contents of the address then data then address registers
Recall that the left box in a cons
cell is the address and the right box is the data. Read the a
as “address” and the d
as “data” from right to left.
In association lists, these operations correspond to
car
: First key value pair, e.g.,'(k1 v1)
caar
: Key of first key value pair, e.g.'k1
cdar
: List of values of first key value pair, e.g.'(v1)
cadar
: Contents of the address then data then address registers, e.g.'v1
Notes:
- An attribute can be a list or any other value.
'()
stands for ‘not found’
Algebraic laws of association lists
Handy new feature of Scheme: let
binding
Evaluate e1
through en
, bind answers to x1
, … xn
Name intermediate results (simpler code, less error prone)
Creates new environment for local use only:
rho {x1 |-> v1, ..., xn |-> vn}
Also let*
(one at a time) and letrec
(local recursive functions)
Note that we have definititions in the language and it might be easier to read if McCarthy had actually used definition syntax, which you’ll see in ML, Haskell, and other functional languages:
From Impcore to uScheme
Things that should offend you about Impcore:
Looking up a function and looking up a variable require different interfaces! (
isvalbound
andisfunbound
)To get a variable, must check 2 or 3 environments (ξ, ϕ, ρ),
- Can’t create a function without giving it a name:
- High cognitive overhead
- A sign of second-class citizenship
All these problems have one solution: Lambda! (λ)
Anonymous, first-class functions
From Church’s lambda-calculus:
(lambda (x) (+ x x))
“The function that maps x to x plus x”
At top level, like define
. (Or more accurately, define
is a synonym for lambda
that also gives the lambda
a name.)
In general, \x.E
or (lambda (x) E)
x
is bound inE
- other variables are free in
E
The ability to “capture” free variables is what makes it interesting.
Functions become just like any other value.
First-class, nested functions
(lambda (x) (+ x y)) ; means what??
What matters is that y
can be a parameter or a let
-bound variable of an enclosing function.
- Can tell at compile time what is captured.
- To understand why anyone cares, you’ll need examples
First example: Finding roots. Given n and k, find an x such that x^n = k.
Step 1: Write a function that computes x^n - k.
Step 2: Write a function that finds a zero between lo
and hi
bounds.
Picture of zero-finding function.
Algorithm uses binary search over integer interval between lo
and hi
. Finds point in that interval in which function is closest to zero.
Code that computes the function x^n - k
given n
and k
:
The function to-the-n-minus-k
is a higher-order function because it returns another (escaping) function as a result.
General purpose zero-finder that works for any function f
:
findzero-between
is also a higher-order function because it takes another function as an argument. But nothing escapes; you can do this in C.
Example uses:
Your turn!!
Escaping functions
“Escape” means “outlive the function in which the lambda
was evaluated.”
Typically returned
More rarely, stored in a global variable or a heap-allocated data structure
We have already seen an example:
Where are n
and k
stored???
Values that escape have to be allocated on the heap
C programmers use
malloc
to explicitly manage such values.In a language with first-class, nested functions, storage of escaping values is part of the semantics of
lambda
.
Picture of activation record for to-the-n-minus-k
with n
and k
being popped.
An example:
Higher-order functions!
Preview: in math, what is the following equal to?
(f o g)(x) == ???
Another algebraic law, another function:
(f o g) (x) = f(g(x))
(f o g) = \x. (f (g (x)))
Another example: (o not null?)
Currying
Currying converts a binary function f(x,y)
to a function f'
that takes x
and returns a function f''
that takes y
and returns the value f(x,y)
.
As we study higher-order functions in more detail, you will see why currying is useful.
Curried functions take their arguments “one-at-a-time.”
Your turn!!
Bonus content: Lambda as an abstraction barrier
Q: What’s the problem with this approach?
A: The seed
is exposed to the end user, who can break the abstraction of the rand
function.
27 September 2017: Higher-order functions
There are PDF slides for 9/28/2017.
Announcements
- Instructor Office Visits on Saturday (see Piazza)
- Scheme I HW due tonight
- Scheme II HW (HOFS) due 10/4
Last Time
- Association lists [bonus]
- Let construct
- Anonymous functions
Today
- Reasoning about Functions
- Useful Higher-Order Functions
- Tail Calls
Reasoning about code
Reasoning principle for lists
Recursive function that consumes A has the same structure as a proof about A
Q: How to prove two lists are
equal?
A: Prove they are both
'()
or that they are bothcons
cells cons-ing equal car’s to equal cdr’s
Reasoning principle for functions
Q: Can you do case analysis on a function?
A: No!
Q: So what can you do then?
A: Apply it!
Q: How to prove two functions equal?
A: Prove that when applied to equal arguments they produce equal results.
Higher-Order Functions
Goal: Start with functions on elements, end up with functions on lists
- Generalizes to sets, arrays, search trees, hash tables, …
Goal: Capture common patterns of computation or algorithms
exists?
(Ex: Is there a number?)all?
(Ex: Is everything a number?)filter
(Ex: Take only the numbers)map
(Ex: Add 1 to every element)- foldr (General: can do all of the above.)
Fold also called reduce
, accum
, or a “catamorphism”
List search: exists?
Algorithm encapsulated: linear search
Example: Is there a even element in the list?
Algebraic laws:
(exists? p? '()) == ???
(exixts? p? '(cons a as)) == ???
(exists? p? '()) == #f
(exixts? p? '(cons a as)) == p? x or exists? p? xs
Your turn: Does everything match: all?
Example: Is every element in a list even?
Algebraic laws:
(all? p? '()) == ???
(all? p? '(cons a as)) == ???
(all? p? '()) == #t
(all? p? '(cons a as)) == p? x and all? p? xs
List selection: filter
Algorithm encapsulated: Linear filtering
Example: Given a list of numbers, return only the even ones.
Algebraic laws:
(filter p? '()) == ???
(filter p? '(cons m ms)) == ???
(filter p? '()) == '()
(filter p? '(cons m ms)) == if (p? m)
(cons m (filter p? ms))
(filter p? ms)
“Lifting” functions to lists: map
Algorithm encapsulated: Transform every element
Example: Square every element of a list.
Algebraic laws:
(map f '()) == ???
(map f (cons n ns)) == ???
(map f '()) == '()
(map f (cons n ns)) == cons (f n) (map f ns)
The universal list function: fold
foldr
takes two arguments:
zero
: What to do with the empty list.plus
: How to combine next element with running results.
Example: foldr plus zero '(a b)
cons a (cons b '())
| | |
v v v
plus a (plus b zero)
In-class exercise: Folding combine?
Tail calls
Intuition: In a function, a call is in tail position if it is the last thing the function will do.
A tail call is a call in tail position.
Important for optimizations: Can change complexity class.
Anything in tail position is the last thing executed!
Key idea is tail-call optimization!
Example: reverse '(1 2)
Question: How much stack space is used by the call?
Call stack:
reverse '()
append
reverse '(2)
append
reverse '(1 2)
Answer: Linear in the length of the list
Example: revapp '(1 2) '()
Question: How much stack space is used by the call?
Call stack: (each line replaces previous one)
revapp '(1 2) '()
–>
revapp '(2) '(1)
–>
revapp '() '(2 1)
Answer: Constant
Question: Why can’t we do this same optimization on reverse
?
Answer: reverse
has to do further computation with results of recursive calls, so can’t eliminate the stack frame until later.
Answer: a goto!!
Think of “tail call” as “goto with arguments”
2 October 2017: Continuations
There are PDF slides for 10/3/2017.
Last Time
- Reasoning about functions
- Higher-order functions
exists?
all?
filter
map
fold
- Tail calls
Announcements
- Scheme II HW (HOFs) due Wednesday 10/4
Today
- Continuations
Continuations
A continuation is code that represents “the rest of the computation.”
- Not a normal function call because continuations never return
- Think “goto with arguments”
Different coding styles
Direct style: Last action of a function is to return a value. (This style is what you are used to.)
Continuation-passing style (CPS): Last action of a function is to “throw” a value to a continuation.
Uses of continuations
Call-backs in GUI frameworks
A style of coding that can mimic exceptions
Some languages provide a construct for capturing the current continuation and giving it a name
k
. Control can be resumed at captured continuation by throwing tok
.Compiler representation: Compilers for functional languages often convert direct-style user code to CPS because CPS matches control-flow of assembly.
Implementation
First-class continuations require compiler support.
We’re going to simulate continuations with function calls in tail position.
Motivating Example: From existence to witness
Ideas?
Bad choices:
- nil
- special symbol
'fail
- run-time error
Good choice:
- exception (not in uScheme)
Question: How much stack space is used by the call?
Answer: Constant
Extended Example: A SAT Solver
Continuations for Search
Solving a Literal
start
carries a partial truth assignment to variables current
Box describes how to extend current
to make a variable, say x
, true.
Case 1: current(x) = #t
Call success
continuation with current
Pass fail
as resume
continuation (argument to success
)
Case 2: current(x) = #f
Call fail
continuation
Case 3: x
not in current
Call success
cotinuation with current{x -> #t}
Pass fail
as resume
continuation
Solving a Negated Literal (Your turn)
start
carries a partial truth assignment to variables current
Box describes how to extend current
to make a negated variable, say not x
, true.
Case 1: current(x) = #f
Call success
continuation with current
Pass fail
as resume
continuation (argument to success
)
Case 2: current(x) = #t
Call fail
continuation
Case 3: x
not in current
Call success
cotinuation with current{x -> #f}
Pass fail
as resume
continuation
These diagrams (and the corresponding code) compose!
Solving A and B
Solver enters A
If A is solved, newly allocated success continuation starts B
If B succeeds, we’re done! Use
success
continuation from context.If B fails, use
resume
continuation A passed to B asfail
.If A fails, the whole thing fails. Use
fail
continuation from context.
Solving A or B
Solver enters A
If A is solved, we’re good! But what if context doesn’t like solution? It can resume A using the resume continuation passed out as
fail
.If A can’t be solved, don’t give up! Try a newly allocated failure continuation to start B.
If ever B is started, we’ve given up on A entirely. So B’s success and failure continuations are exactly the ones in the context.
If B succeeds, but the context doesn’t like the answer, the context can resume B.
If B fails, abject failure all around; call the original
fail
continuation.
4 October 2017: Scheme Semantics
There are PDF slides for 10/5/2017.
Announcements
If you can, bring your laptop to this week’s recitation.
Scheme HW II (hofs) due tonight.
Scheme HW III (continuations and semantics) due on Monday 10/16.
Today
Scheme Semantics
Stores
Lambdas evaluate to closures
Application
Last Time
- Continuations
- “gotos with arguments”
- Example: Handling missing values (association list)
- Example: Structuring a search (SAT solver)
New Syntax, Values, Environments, and Evaluation Rules
First four of five questions: Syntax, Values, Environments, Evaluation
Key changes from Impcore:
- New constructs: let, lambda, application (not just named functions)
New values:
cons
cells and functions (closures)A single kind of environment
Environment maps names to mutable locations, not values.
A store maps locations to values.
Environments get copied (in closures).
It’s not precisely true that rho never changes.
New variables are added when they come into scope.
Old variables are deleted when they go out of scope.
But the location associated with a variable never changes.
The book includes all rules for uScheme. Here we will discuss on key rules.
Variables
Board: Picture of environment pointing to store.
Questions about Assign:
What changes are captured in σ′?
What changes are captured in σ′{ℓ↦v}?
What would happen if we used σ instead of σ′
What would happen if we used a fresh ℓ?
Some other ℓ in the range of ρ?
Lambdas
Function Application
Question: Which even
is referenced when f
is called?
Answer: With static scoping, it’s the predicate. With dynamic scoping it’s the one bound to 3.
Questions about ApplyClosure:
What if we used σ instead of σ0 in evaluation of e1?
What if we used σ instead of σ0 in evaluation of arguments?
What if we used ρc instead of ρ in evaluation of arguments?
What if we did not require ℓ1, …, ℓn ∉ dom(σ)?
What is the relationship between ρ and σ?
Picture of environment and store that results from executing above program.
Closure Optimizations
- Major issue in making functional programs efficient
- Keep closures on the stack
- Share closures
- Eliminate closures (when functions don’t escape)
11 October 2017: Scheme Wrap-up; ML Intro
There are PDF slides for 10/12/2017.
Announcements
Today
- Scheme Wrap-up
- Intro to ML
Last Time
Scheme semantics
Single kind of environment that maps names to locations.
A store maps locations to values.
Functions & Closures
Application
Lets
Which let
is which and why?
Three versions of let
:
let
puts the new bindings in scope only for the body expression.let*
adds each binding one at a time, so each binding is in scope for the later ones.letrec
considers all the bindings to be mutually recursive.
Lisp and Scheme Retrospective
Common Lisp, Scheme
Advantages:
- High-level data structures
- Cheap, easy recursion
- Automatic memory management (garbage collection!)
- Programs as data!
- Hygenic macros for extending the language
- Big environments, tiny interpreters, everything between
- Sophisticated Interactive Development Environments
- Used in AI applications; ITA; Paul Graham’s company Viaweb
Down sides:
- Hard to talk about data
- Hard to detect errors at compile time
Bottom line: it’s all about lambda
- Major win
- Real implementation cost (heap allocation)
Bonus content: Scheme as it really is
- Macros!
- Cond expressions (solve nesting problem)
- Mutation
- …
Macros!
Conditional expressions
Mutation
Introduction to ML
Apply your new knowledge in Standard ML:
- You’ve already learned (most of) the ideas
- There will be a lot of new detail
- Good language for implementing language features
- Good language for studying type systems
Lectures on ML:
- Algebraic types and pattern matching
- Exceptions
- An introduction to types
Meta: Not your typical introduction to a new language
- Not definition before use, as in a manual
- Not tutorial, as in Ullman
- Instead, the most important ideas that are most connected to your work up to now
ML Overview
Designed for programs, logic, symbolic data
Theme: Precise ways to describe data
ML = uScheme + pattern matching + exceptions + static types
uScheme -> ML Rosetta Stone
uScheme SML
(cons x xs) x :: xs
'() []
'() nil
(lambda (x) e) fn x => e
(lambda (x y z) e) fn (x, y, z) => e
|| && andalso orelse
(let* ([x e1]) e2) let val x = e1 in e2 end
(let* ([x1 e1] let val x1 = e1
[x2 e2] val x2 = e2
[x3 e3]) e) val x3 = e3
in e
end
Three new ideas
- Pattern matching is big and important. You will like it.
- Exceptions are easy
- Static types get two to three weeks in their own right.
Pattern matching makes code look more like algebraic laws: one pattern for each case.
Static types tell us at compile time what the cases are.
And lots of new concrete syntax!
Examples
The length
function.
Algebraic laws:
length [] = 0 length (x::xs) = 1 + length xs
The code:
fun length [] = 0 | length (x::xs) = 1 + length xs
Things to notice:
No brackets! (I hate the damn parentheses)
Function application by juxtaposition
Function application has higher precedence than any infix operator
Compiler checks all the cases (try in the interpreter)
Let’s try another! map, filter, exists, all, take, drop, takewhile, dropwhile
16 Oct 2017: Programming with constructed data and types
There are PDF slides for 10/17/2017.
Announcements
Continuations HW due tonight
ML homework is now available. Due 10/25.
Today
- Datatypes
- Types, Patterns, Exceptions
- ML Traps & Pitfalls
Last Time
Semantics of let
Scheme wrap up
Introduction to ML: functions and patterns
Foundation: Data
Syntax is always the presenting complaint, but data is what’s always important
- Base types:
int
,real
,bool
,char
,string
- Functions
Constructed data:
- Tuples: pairs, triples, etc
- (Records with named fields)
- Lists and other algebraic data types
“Distinguish one cons cell (or one record) from another”
Algebraic Datatypes
Enumerated types
Datatypes can define an enumerated type and associated values.
datatype suit = heart | diamond | spade | club
Here suit
is the name of a new type.
The data constructors heart
, dimaond
, spade
, and club
are the values of type suit
.
Data constructors are separated by vertical bars.
Pattern matching
Datatypes are deconstructed using pattern matching.
fun toString heart = "heart"
| toString diamond = "diamond"
| toString spade = "spade"
| toString club = "club"
val suitName = toString heart
But wait, there’s more: Data constructors can take arguments!
datatype IntTree = Leaf | Node of int * IntTree * IntTree
IntTree
is the name of a new type.
There are two data constructors: Leaf
and Node
.
Node
s take a tuple of three arguments: a value at the node, and left and right subtrees.
The keyword of separates the name of the data constructor and the type of its argument.
When fully applied, data constructors have the type of the defining datatype (ie, IntTree
).
Building values with constructors
We build values of type IntTree
using the associated constructors: (Draw on board)
val tempty = Leaf
val t1 = Node (1, tempty, tempty)
val t2 = Node (2, t1, t1)
val t3 = Node (3, t2, t2)
What is the in-order traversal of t3
?
[1,2,1,3,1,2,1]
What is the pre-order traversal of t3
?
[3,2,1,1,2,1,1]
Deconstruct values with pattern matching
(The @
symbol denotes append in ML)
fun inOrder Leaf = []
| inOrder (Node (v, left, right)) =
(inOrder left) @ [v] @ (inOrder right)
val il3 = inOrder t3
fun preOrder Leaf = []
| preOrder (Node (v, left, right)) =
v :: (preOrder left) @ (preOrder right)
val pl3 = preOrder t3
IntTree
is monomorphic because it has a single type.
Note though that the inOrder
and preOrder
functions only cared about the structure of the tree, not the payload value at each node.
But wait, there’s still more: Polymorphic datatypes!
Polymorphic datatypes are written using type variables that can be instantiated with any type.
datatype 'a tree = Child | Parent of 'a * 'a tree * 'a tree
tree
is a type constructor (written in post-fix notation), which means it produces a type when applied to a type argument.
Examples:
int tree
is a tree of integersbool tree
is a tree of booleanschar tree
is a tree of charactersint list tree
is a tree of a list of integers.
'a
is a type variable: it can represent any type.
It is introduced on the left-hand of the =
sign. References on the right-hand side are types.
Child
and Parent
are data constructors.
Child
takes no arguments, and so has type 'a tree
When given a value of type 'a
and two 'a tree
s, Parent
produces a 'a tree
Constructors build tree
values
val empty = Child
val tint1 = Parent (1, empty, empty)
val tint2 = Parent (2, tint1, tint1)
val tint3 = Parent (3, tint2, tint2)
val tstr1 = Parent ("a", empty, empty)
val tstr2 = Parent ("b", tstr1, tstr1)
val tstr3 = Parent ("c", tstr2, tstr2)
Pattern matching deconstructs tree
values
fun inOrder Child = []
| inOrder (Parent (v, left, right)) =
(inOrder left) @ [v] @ (inOrder right)
fun preOrder Child = []
| preOrder (Parent (v, left, right)) =
v :: (preOrder left) @ (preOrder right)
Functions inOrder
and preOrder
are polymorphic: they work on any value of type 'a tree
. 'a
is a type variable and can be replaced with any type.
Things to notice about datatypes
Environments
Datatype declarations introduce names into:
the type environment:
suit
,IntTree
,tree
the value environment:
heart
,Leaf
,Parent
Inductive
Datatype declarations are inherently inductive:
the type
IntTree
appears in its own definitionthe type
tree
appears in its own definition
Datatype Exercise
Case expressions: How we use datatypes
Bonus: Talking type theory: Introduction and elimination constructs
Part of learning any new field: talk to people in their native vocabulary
Introduce means “produce”, “create”, “make”, “define”
Eliminate means “consume”, “examine”, “observe”, “use”
It’s like knowing what to say when somebody sneezes.
Tuple Pattern Matching
val (x,y) = (1,2)
val (left, pivot, right) = split xs
val (n,xs) = (3, [1,2,3])
val (x::xs) = [1,2,3]
val (_::xs) = [1,2,3]
Exceptions: Handling unusual circumstances
Syntax:
- Declaration:
exception EmptyQueue
- Introduction:
raise e
wheree : exn
- Elimination:
e1 handle pat => e2
Informal Semantics:
- alternative to normal termination
- can happen to any expression
- tied to function call
- if evaluation of body raises exn, call raises exn
Handler uses pattern matching
e handle pat1 => e1 | pat2 => e2
Bonus Content: ML traps and pitfalls
Bonus content (seen in examples)
Syntactic sugar for lists
Bonus content: ML from 10,000 feet
Environments
Patterns
Functions
Tuples are “usual and customary.”
Types
18 October 2017: Types
There are PDF slides for 10/19/2017.
Announcements
- Handout (slides from today)
Today
Type systems
Typing rules for a simple language
Type checker for a simple language
Adding environments
Type systems
What kind of value do we have?
Slogan: “Types classify terms.”
n + 1 : int
"hello" ^ "world" : string
(fn n => n * (n - 1)) : int -> int
if p then 1 else 0 : int, provided that p : bool
Questions type systems can answer:
What kind of value does it evaluate to (if it terminates)?
What is the contract of the function?
Is each function called with the right number of arguments? (And similar errors)
Who has the rights to look at it/change it?
Is the number miles or millimeters?
Questions type systems generally cannot answer:
Will my program contain a division by zero?
Will my program contain an array bounds error?
Will my program take the
car
of `’()?Will my program terminate?
Decidability and Type Checking
Suppose L is a “Turing-Complete” Language.
TP is the set of programs in L that terminate.
Wish: a type system to statically classify terminating programs:
Expression e in L has type T (e : T) iff e terminates.
But: Undecideable!
We can prove no such type system exists.
Choices:
Allow type checker to run forever.
Don’t use type system to track termination.
Static vs. Dynamic Type Checking
Most languages use a combination of static and dynamic checks
Static: “for all inputs”
input independent
efficient at run-time
approximate : rules out some programs that won’t trigger errors example:
(if false then 2 else "hi") ^ "there"
Dynamic: “for some inputs”
depends on input
run-time overhead
precise
Type System and Checker for a Simple Language
Define an AST for expressions with:
- Simple integer arithmetic operations
- Numeric comparisons
- Conditional
- Numeric literal
Examples to rule out
Can’t add an integer and a boolean:
3 + (3 < 99)
(ARITH(PLUS, LIT 3, CMP (LT, LIT 3, LIT 99)))
Can’t compare an integer and a boolean
(3 < (4 = 24))
CMP (LT, LIT 3, CMP(EQ (LIT 4, LIT 24)))
Inference rules to define a type system
Form of judgment Context
|-
term:
typeWritten
|- e : tau
Contexts vary between type systems
(Right now, the empty context)
Inference rules determine how to write type checker
typeof : exp -> ty
:Given e, find tau such that
|- e : tau
What inference rules do you recommend for this language?
Rule for arithmetic operators
Informal example:
|- 3 : int |- 5 : int
------------------------------------------------------------
|- 3 + 5 : int
Rules out:
|- 'a' : char |- 5 : int
------------------------------------------------------------
|- 'a' + 5 : ???
General form:
|- e1 : int |- e2 : int
------------------------------------------------------------
|- ARITH ( _ , e1, e2) : int
Rule for comparisons
Informal example:
|- 7 : int |- 10 : int
------------------------------------------------------------
|- 7 < 10 : bool
General form:
|- e1 : int |- e2 : int
------------------------------------------------------------
|- CMP ( _ , e1, e2) : bool
Rule for literals
Informal example:
|- 14 : int
General form:
-----------------------------------
|- LIT (n) : int
Rule for conditionals:
Informal example:
|- true : bool
|- 3 : int
|- 42 : int
------------------------------------------------------------
|- IF (true, 3, 42) : int
General form:
|- e : bool
|- e1 : tau1
|- e2 : tau2 tau1 equiv tau2
------------------------------------------------------------
|- IF ( e, e1, e2) : tau1
Experience shows it is better to test two types for equivalence than to write rule with same type appearing twice.
Typing rules let us read off what a type checker needs to do.
input to checker: e
output from checker: tau
What is a type?
Working definition: a set of values
Precise definition: classifier for terms!!
The relationship to values becomes a proof obligation.
Note: a computation can have a type even if it doesn’t terminate! (Or doesn’t produce a value)
Type checker in ML
val typeof : exp -> ty
exception IllTyped
fun typeof (ARITH (_, e1, e2)) =
case (typeof e1, typeof e2)
of (INTTY, INTTY) => INTTY
| _ => raise IllTyped
| typeof (CMP (_, e1, e2)) =
case (typeof e1, typeof e2)
of (INTTY, INTTY) => BOOLTY
| _ => raise IllTyped
| typeof (LIT _) = INTTY
| typeof (IF (e,e1,e2)) =
case (typeof e, typeof e1, typeof e2)
of (BOOLTY, tau1, tau2) =>
if eqType(tau1, tau2)
then tau1 else raise IllTyped
| _ => raise IllTyped
An implementor’s trick: If you see identical types in a rule,
Give each type a distinct subscript
Introduce equality constraints
Remember to be careful using primitive equality to check types—you are better off with
eqType
.
Typing Rules: Contexts and Term Variables
Add variables and let
binding to our language, what happens?
What could go wrong with a variable?
Used inconsistently:
;; x can’t be both an integer and a list
x + x @ x
;; y can’t be both an integer and a string
let y = 10 in y ^ “hello” end
Need to track variable use to ensure consistency
Key idea: Type environment (Gamma) tracks the types of variables.
Rule for var
x in domain Gamma tau = Gamma(x)
------------------------------------------------------------
Gamma |- VAR x : tau
Rule for let
Gamma |- e : tau
Gamma{x->tau} |- e' : tau'
------------------------------------------------------------
Gamma |- LET x = e in e' : tau'
Type Checker
Type checker needs Gamma – gives type of each “term variable”.
val typeof : ty env -> exp -> ty
fun typeof Gamma (ARITH ... ) = <as before>
| typeof Gamma (VAR x) =
case Gamma (x)
of Some tau => tau
| None => raise IllTyped
| typeof Gamma (LET x, e1, e2) =
let tau1 = typeof Gamma e1
in typeof (extend Gamma x tau1) e2
end
Review
I gave you syntax for simple language
You came up with typing rules
I showed you how to implement the type checker.
Then on your homework,
You will design new syntax and typing rules for lists
You will extend an existing type checker
You will implement a full type checker from scratch
This is a big chunk of what language designers do.
23 October 2017: Type Checking with Type Constructors
There are PDF slides for 10/24/2017.
Announcements
- Midterm: Wednesday Nov. 1 in class; 1-page self-prepared sheet of notes
Last Time
What are types?
Undecideability
Static vs. Dynamic types
Typing Rules
Type Checker
Type environment
Today
Type checking with type constructors
Formation, Introduction, and Elimination Rules
Functions
Introduction:
Gamma{x->tau1} |- e : tau2
------------------------------------------------------------
Gamma |- fn x : tau1 => e : tau1 -> tau2
Elimination:
Gamma |- e : tau1 -> tau2
Gamma |- e1 : tau1
------------------------------------------------------------
Gamma |- e e1 : tau2
Where we’ve been and where we’re going
New watershed in the homework
You’ve been developing and polishing programming skills: recursion, higher-order functions, using types to your advantage. But the problems have been mostly simple problems around simple data structures, mostly lists.
We’re now going to shift and spend the next several weeks doing real programming-languages stuff, starting with type systems.
You’ve already seen everything you need to know to implement a basic type checker, and you are almost fully equipped to add array operations and types to Typed Impcore.
What’s next is much more sophisticated type systems, with an infinite number of types. We’ll focus on two questions about type systems:
What is and is not a good type, that is, a classifier for terms?
How shall we represent types?
We’ll look at these questions in two contexts: monomorphic and polymorphic languages.
Monomorphic vs Polymorphic Types
Monomorphic types have “one shape.”
- Examples:
int
,bool
,int -> bool
,int * int
Polymorphic types have “many shapes.”
- Examples:
'a list
,'a list -> 'a list
,('a * int)
Design and implementation of monomorphic languages
Mechanisms:
Every new variety of type requires special syntax
Implementation is a straightforward application of what you already know.
Language designer’s process when adding new kinds of types:
What new types do I have (formation rules)?
What new syntax do I have to create new values with that type (introduction rules)?
For introduce think “produce”, “create”, “make”, “define”
What new syntax do I have to observe terms of a type (elimination rules)?
For eliminate think “consume”, “examine”, “interrogate”, “look inside”, or “take apart”, “observe”, “use”, “mutate”
Words “introduce” and “eliminate” are the native vocabulary of type-theoretic language design—it’s like knowing what to say when somebody sneezes.
Question: If I add lists to a language, how many new types am I introducing?
Managing the set of types: Type Formation
Monomorphic type rules
Notice: one rule for if!!
Classic types for data structures
(At run time, identical to cons
, car
, cdr
)
Typing Rule Exercise
Coding the arrow-introduction rule
25 October 2017: Polymorphic Type Checking
There are PDF slides for 10/26/2017.
Announcements
Continuation homework returned
ML HW due tonight.
Midterm: A week from today
Last Time
- Monomorphic type systems (Typed Impcore)
Typing rules
- Formation: Is this type valid for classifying terms?
- Introduction: How to I create a value of this type?
- Elimination: How do I use a value of this type?
Examples:
- Pairs, functions, arrays, and references
Today
- Polymorphic type systems (TypedUScheme)
- Generic type representations
- Kinds for classifying types
Limitations of monomorphic type systems
Notes:
- Implementing arrays on homework
- Writing rules for lists on homework
Quantified types
Type formation via kinds
Polymorphic Type Checking
Quantified types
Bonus instantiation:
-> map
<procedure> :
(forall ('a 'b)
(('a -> 'b) (list 'a) -> (list 'b)))
-> (@ map int bool)
<procedure> :
((int -> bool) (list int) -> (list bool))
Two forms of abstraction:
Type rules for polymorphism
Type formation through kinds
Bonus content: a definition manipulates three environments
30 October 2017: Midterm Review
There are PDF slides for 10/31/2017.
Announcements
ML HW returned on Saturday; Course summary on Sunday
Midterm: Wednesday in class; 1-page self-prepared sheet of notes
Course evaluations: At the end of class today.
Last Time
Kinds classify type constructors into:
*: types (nullary type constructors that classify values)
- => *: type constructors that become types when applied to some number of arguments
nonsense type expressions
Polymorphic types
Introduction: Type Lambda
- Example:
(type-lambda ['a] (lambda ([x : 'a] x)))
- Example:
Elimination: Explicit Instantiation
(@ length int)
Example:
length : (forall ('a) ('a list -> int)
Example:
(@ length int) : int list -> int
Today
Midterm review
Sample Problems
Course Evaluations
Midterm Review
6 November 2017: Type Inference
There are PDF slides for 11/7/2017.
Announcements
Midterm: Returned at the end of class today.
Final: Thursday, December 14, 8:30 to 10:30.
- Send email to comp105-grades@cs.tufts.edu if you have another exam at the same time.
HW: Type Checking due Wednesday 11/8
Today
Type Inference Intuition
Key Ideas:
Fresh type variables represent unknown types.
- Example: In
(lambda (x) (+ x 3))
, assignx
fresh type variable α
- Example: In
Constraints record knowledge about type variables.
- Example: α ≡
int
- Example: α ≡
Why Study?
Useful in its own right (as we’ll see shortly)
Canonical example of Static Analysis, which is key tool in cybersecurity
The compiler tells you useful information and there is a lower annotation burden.
Let’s do an example on the board
(val-rec double (lambda (x) (+ x x)))
What do we know?
double
has type ′a1x
has type ′a2+
has typeint * int -> int
(+ x x)
is an application, what does it require?- ′a2 =
int
and ′a2 =int
- ′a2 =
Is this possible?
Key idea: Record the constraint in a typing judgement.
'a2 = int /\ 'a2 = int, { double : 'a1, x : 'a2 } |- (+ x x) : int
General form of typing judgement:
C, Gamma |- e : tau
which is pronounced “Assuming the constraints in C
are true, in environment Gamma
, expression e
has type tau
.”
Example: if
(if y 1 0)
y
has type ′a3,1
has typeint
,0
has typeint
Requires what constraints? (
int
=int
, ′a3 = bool`)
Example:
(if z z (- 0 z))
z
has type ′a3,0
has typeint
,-
has typeint * int -> int
Requires what constraints? (′a3 =
bool
/\
int
=int
/\
′a3 =int
)Is this possible?
Why not?
Inferring polymorphic types
(val app2 = (lambda (f x y)
(begin
(f x)
(f y))))
Assume f
: 'a_f
Assume x
: 'a_x
Assume y
: 'a_y
f x
implies 'a_f = 'a_x -> 'a1
f y
implies 'a_f = 'a_y -> 'a2
Together, these constraints imply 'a_x = 'a_y
and 'a1 = 'a2
begin
implies result of function is 'a2
So, app2 : ('a_x -> 'a1) * 'a_x * 'a_x -> 'a1
'a_x
and 'a
aren’t mentioned anywhere else in program, so
we can generalize to:
forall 'a_x, 'a1 . ('a_x -> 'a1) * 'a_x * 'a_x -> 'a1
which is the same thing as:
forall 'a, 'b . ('a -> 'b) * 'a * 'a -> 'b
Assume nss : 'b
We know car : forall 'a . 'a list -> 'a
=> car_1 : 'a1 list -> 'a1
=> car_2 : 'a2 list -> 'a2
(car_1 nss)
=> 'b = 'a1 list
(car_2 (car_1 nss))
=> 'a1 = 'a2 list
(car_2 (car_1 nss)) : 'a2
nss : 'b
: 'a1 list
: ('a2 list) list
So, cc : ('a2 list) list -> 'a2
Because 'a2
is unconstrained, we can generalize:
cc : forall 'a . ('a2 list) list -> 'a
8 November 2017: Formalizing Type Inference and Instantiation
There are PDF slides for 11/9/2017.
Announcements
Last Time
Type inference
Type variables represent unknown types
Type constraints record requirements on those types
Constraint judgement C, Gamma |- e : tau
Today
Formalizing type inference
Moving from type schemes to types (Instantiation)
Moving from types to type schemes (Generalization)
Formalizing Type Inference
What you know and can do now
Solving Constraints
Question: What does a solution to a set of constraints look like?
Answer: A substitution/mapping from types variables to types: θ.
Question: in solving tau1 ~ tau2
, how many potential cases are there to consider?
Question: how are you going to handle each case?
What you know and can do after this lecture
From Type Scheme to Type
Why the freshness requirement?
Consider
Gamma = {fst : forall 'a 'b. 'a * 'b -> 'a, y : 'ay}
??, Gamma |- if (y, fst 2 3, 4) : ??
Imagine we ignore the freshness constraint when instantiating fst
:
fst : 'ay * 'b -> 'ay
From if
, we get the constraints:
'ay ~ bool
'ay ~ int
which aren’t satisfiable. Which means this code would be erroneously flagged as an error.
Correct typing:
'ay ~ bool, Gamma |- if (y, fst 2 3, 4) : int
Why the distinctness requirement?
fst : 'a * 'a -> 'a
Gamma |- fst 2 #t
Application rule yields constraints:
'a ~ int
'a ~ bool
which aren’t satisfiable. Which means this code would also be erroneously flagged as an error.
Correct typing:
Gamma |- fst 2 #t : int
13 November 2017: Generalization
There are PDF slides for 11/14/2017.
Announcements
Last Time
Formalizing type inference
Judgement form: C,Gamma |- e1, … en : tau1, …, taun
Representing constraints
Solving constraints: tau1 ~ tau2 and C1 / C2
From type scheme to types: Instantiation
- Freshness and Distinctness requirements
Today
Generalization: going from types to type schemes
- Inference rule for
val
- Inference for
let
- Inference for
val-rec
andlet-rec
From Type to Type Scheme
The set A above will be useful when some variables in τ are mentioned in the environment.
We can’t generalize over those variables.
Applying idea to the type inferred for the function fst
:
generalize('a * 'b -> 'a, emptyset) = forall 'a, 'b. 'a * 'b -> 'a
Note the new judgement form above for type checking a declaration.
On the condition ΘΓ = Γ: Γ is “input”: it can’t be changed.
The condition ensures that Θ doen’t conflict with Γ.
We can’t generalize over free type variables in Γ.
Their presence indicates they can be used somewhere else, and hence they aren’t free to be instantiated with any type.
Let with constraints, operationally:
typesof
: returns τ1, …, τn and CC-prime from
map
,conjoinConstraints
,dom
,inter
,freetyvarsGamma
val theta = solve C'
freetyvarsGamma
,union
,freetyvarsConstraint
Map anonymous lambda using
generalize
, get all the σiExtend the typing environment Gamma (pairfoldr)
Recursive call to type checker, gets
C_b
,\tau
Return
(tau, C' /\ C_b)
Forall things
val and val-rec |
let , letrec , … |
lambda |
---|---|---|
FORALL contains all variables (because none are free in the context) | FORALL contains variables not free in the context | FORALL is empty |
Generalize over all variables (because none are free in the context) | Generalize over variables not free in the context | Never generalize |
15 November 2017: Hiding information with abstract data types
There are PDF slides for 11/16/2017.
Announcements
Last Time
Generalization: Going from types to type schemes
Inferring types for
val
,val-rec
,let
, andlet-rec
Today
Module Systems
Structures/Implementations
Signatures/Interfaces
Where have we been?
- Programming in the small
- Expressive power
Success stories:
- First-class functions
- Algebraic data types and pattern matching
- Polymorphic type systems
What about big programs?
An area of agreement and a great divide:
INFORMATION HIDING
/ \
modular reasoning / \ code reuse
/ \
internal access / \ interoperability
to rep / \ between reps
/ \
MODULES OBJECTS
ABSTRACT TYPES
Why modules?
Unlocking the final door for building large software systems
You have all gotten good at first-class functions, algebraic data types, and polymorphic types
Modules are the last tool you need to build big systems
Implementation Interface
| | | | | Module | | I | | | | | ^ ^ | |
Nitty gritty - - Stuff you want others to see
Modules overview
Functions of a true module system:
Hide representations, implementations, private names
“Firewall” separately compiled units (promote independent compilation)
Possibly reuse units
Real modules include separately compilable interfaces and implementations
Designers almost always choose static type checking, which should be “modular” (i.e., independent)
C and C++ are cheap imitations
- C doesn’t provide namespaces
- C++ doesn’t provide modular type checking for templates
Interfaces
Collect declarations
- Unlike definition, provides partial information about a name (usually environment and type)
Things typically declared:
Variables or constants (values, mutable or immutable)
Types
Procedures (if different from values)
Exceptions
Key idea: a declared type can be abstract
Terminology: a module is a client of the interfaces it depends on
Roles of interfaces in programming:
The unit of sharing and reuse
Explainer of libraries
Underlie component technology
The best-proven technology for structuring large systems.
Ways of thinking about interfaces
Means of hiding information (ask “what secret does it hide?”)
A way to limit what we have to understand about a program
- Estimated force multiplier x10
A contract between programmers
- Essential for large systems
- Parties might be you and your future self
Interface is the specification or contract that a module implements
- Includes contracts for all declared functions
Module Implementations
Holds all dynamically executed code (expressions/statements)
May include private names
May depend only on interfaces, or on interfaces and implementations both (max cognitive benefits when all dependency is mediated by interfaces)
Dependencies may be implicit or explicit (
import
,require
,use
)
Standard ML Modules
The Perl of module languages?
Has all known features
Supports all known styles
Rejoice at the expressive power
Weep at the terminology, the redundancy, the bad design decisions
What we’ve been using so far is the core language
Modules are a separate language layered on top.
Signature basics
ML Modules examples, part I
Abstract data types
Data abstraction for reuse
20 November 2017: Functors and an Extended SML Example
There are PDF slides for 11/21/2017.
Announcements
type inference homework due tonight
Module HW due Sunday 12/3
Last Time
Modules/structures
Interfaces/signatures
Ascription
Today
Functors
Extended SML example
Computation abstraction
Extended Example: Error-tracking Interpreter
Why this example?
Lots of interfaces using ML signatures
Idea of how to compose large systems
Some ambitious, very abstract abstractions—it’s the end of class, and you should see something ambitious.
Practice implementing functors
Error modules: Three common implementations
Collect all errors
Keep just the first error
Keep the most severe error
Your obligations: two types, three functions, algebraic laws
Computations Abstraction
Ambitious! (will make your head hurt a bit)
Computations either:
return a value
produce an error
Errors must be threaded through everything :-(
That’s really painful!
We can extend the computation abstraction with sequencing operations to help.
Example:
eval e1 + eval e2
(op +) (eval e1, eval e2)
curry (op +) (eval e1) (eval e2)
curry (op +) <$> eval e1 <*> eval e2
The first three versions are not well typed. Why?
The last version will thread errors through the compuation behind the scenes.
Note:
eval e1, eval e2 : int comp
curry (op +) : int -> (int -> int)
<$> : (int -> (int -> int)) * (int comp) -> (int -> int) comp
<*> : (int -> int) comp * int comp -> int comp
curry (op +) <$> eval e1 : (int -> int) comp
let pa = curry (op +) <$> eval e1
note by above, pa : (int -> int) comp
pa <*> eval e2 : int comp
27 November 2017: Object-orientation
There are PDF slides for 11/28/2017.
Announcements
- SML HW due Sunday, December 3
Last Time
Functors
Extended example: Error-tracking interpretor
“Computation” abstraction
Today
Objects
Message passing
Demo: circle, square, and triangle objects
Instructions to student volunteers
- You have one instance variable, which represents the coordinate position at the ``center’’ of the object.
Note: Mutable state is back!
Key concepts of object-orientation
Key mechanisms
Private instance variables
- Only object knows its instance variables and can see them
- C++ calls these “members”
- Like the coordinate of the geometric figure
Code attached to objects and classes
- Code needed to draw the object is associated with the object in one of its ``methods.’’
Dynamic dispatch
- Methods invoked via message sends (like
draw
andposition:
). - Message sender doesn’t know what code will be called.
- Object receiving the message send is called the ``receiver’’.
- In a method, special variable
self
is bound to the receiver.
Key idea
Protocol determines behavioral subtyping
The protocol of an object is the set of messages it understands.
Object
A
is a behavioral subtype of objectB
ifA
understands all the messages thatB
does in a compatible way.Intuition: If
A
is a behavioral subtype ofB
, thenA
can be used in any context whereB
can be used.
Class-based object-orientation
Object implementations determined by its class definition.
So, each class implicitly defines the protocol for its objects, and, dynamic dispatch is determined by object’s class.
Code reuse by sending messages around like crazy.
Example: list filter
Blocks and Booleans
[block (formals) expressions]
For parameterless blocks (normally continuations),
{expressions}
Blocks are objects
- You don’t “apply” a block; you “send it the
value
message”
Booleans use continuation-passing style
- Blocks delay evaluation
Booleans implemented with two classes True
and False
- one value apiece
Method dispatch in the Booleans
Board - Method dispatch
To answer a message:
Consider the class of the receiver
Is the method with that name defined?
If so, use it
If not, repeat with the superclass
Run out of superclasses?
“Message not understood”
29 November 2017: Inheritance
There are PDF slides for 11/30/2017.
Announcements
Solutions for ty-inf available at the end of class.
Course evaluations!
Send confirmation to comp105-grades@eecs.tufts.edu and we’ll give you class participation credit.
Last Time
Objects
Dynamic Dispatch
Protocols and behavioral subtyping
Blocks
Today
Inheritance
Abstract classes and methods
Object-oriented design
Object initialization
History and overview of objects
History of objects
Pioneers were Nygaard and Dahl, who added objects to Algol 60, producing SIMULA-67, the first object-oriented language
Bjarne Stroustrup liked Simula but wanted complete control of costs, so he created C++
James Gosling wanted something a little cleaner and a little more like Simula, created Java
Microsoft funded C#
Objects are everywhere
What’s an object?
We know that mixing code and data can create powerful abstractions (function closures)
Objects are another way to mix code and data
Agglutination containing
Some mutable state (instance variables)
Code that can respond to messages (code is called methods)
A lot like a closure
What are objects good at?
Not really useful for building small programs.
If you build a big, full-featured abstraction, you can use inheritance to build another, similar abstraction.
Very good at adding new kinds of things that behave similarly to existing things.
Programs that are evolving
A particular kind of evolution: operations stay the same, but we add new kinds of things
Example: GUIs (operations are
paint
andrespond-to-mouse-click
)Example: numbers
For your homework, you’ll take a Smalltalk system that has three kinds of numbers, and you’ll add a fourth kind of number.
What’s hard about objects?
If you do anything at all interesting, your control flow becomes smeared out over half a dozen classes, and your algorithms are nearly impossible to undrstand.
Smalltalk objects
Why Smalltalk?
- Another Turing Award
- Small, simple, pure objects
- Almost the complete language can be done in a relatively small interpreter
Alive and well today
- At the core of Ruby
- As an extension to C (“Objective C”) for Apple products
The six questions:
Values are objects (even
true
, 3,"hello"
)Even classes are objects!
There are no functions—only methods on objects
Syntax:
mutable variables
message send
sequential composition of mutations and message sends (side effects)
“blocks” (really closures, objects and closures in one, used as continuations)
No
if
orwhile
. These are implemented by passing continuations to Boolean objects.
(Smalltalk programmers have been indoctrinated and don’t even notice)
Environments
Name stands for a mutable cell containing an object:
- Global variables
- “Instance variables” (new idea, not yet defined)
Types
There is no compile-time type system.
At run time, Smalltalk uses behavioral subtyping, known to Rubyists as “duck typing”
Dynamic semantics
- Main rule is method dispatch (complicated)
- The rest is familiar
The initial basis is enormous
- Why? To demonstrate the benefits of reuse, you need something big enough to reuse.
Message passing
Look at SEND
- Message identified by name (messages are not values)
- Always sent to a receiver
- Optional arguments must match arity of message name
(no other static checking)
N.B. BLOCK
and LITERAL
are special objects.
Magnitudes and numbers
Key problems on homework
Natural
is aMagnitude
“Large integer” is a
Number
4 December 2017: Double dispatch, collections
There are PDF slides for 12/5/2017.
Announcements
Smalltalk HW available, due 12/10
Course evaluations!
Send confirmation to comp105-grades@eecs.tufts.edu and we’ll give you class participation credit.
Last Time
Inheritance
Abstract/Concrete classes
Object-oriented design
Object initialization
Today
Information hidden and revealed; three layers
Double dispatch
Subtyping
Making open system extensible
Subtyping
Key strategy for reuse in object-oriented languages: subtype polymorphism
A value of the subtype can be used wherever a value of the supertype is expected.
Board:
- SUBTYPE != SUBCLASS
- SUPERTYPE != SUPERCLASS
Some languages like C++ identify subtype with subclass, but conceptually they are different.
Subtyping relationship can be checked statically (e.g. Java, C++, Scala) or dynamically (e.g. Smalltalk, Ruby)
Bonus content not covered in class: Collections
Why collections?
Goal of objects is reuse
Key to successful reuse is a well-designed class hierarchy
Killer app: toolkits for building user interfaces
Smalltalk blue book is 90 pages on language, 300 pages on library
Lots of abstract classes
- Define protocols
- Build reusable stuff, just like
Boolean
,Magnitude
,Number
Implementing Collections
Question: what’s the most efficient way to find the size of a list?
Question: what’s the most efficient way to find the size of an array?
Example collection - Sets
Most subclass methods work by delegating all or part of work to list members
N.B. Set
is a client of List
, not a subclass!
Next example highlight: class method and super
!
6 December 2017: Lambda Calculus
There are PDF slides for 12/7/2017.
Announcements
Last Time
Information hiding in Smalltalk
“Private methods”
Double Dispatch
Subtyping vs. Inheritance
Today
Lambda Calculus Overview
Programming in the Lambda Calculus
Operational Semantics
Lambda Calculus
Why study lambda calculus?
Theoretical underpinnings for most programming langauges (all in this class).
Church-Turing Thesis: Any computable operator can be expressed as an encoding in lambda calculus
Test bench for new language features
The world’s simplest reasonable programming language
Only three syntactic forms:
M ::= x | \x.M | M M'
Everything is programming with functions
Everything is Curried
Application associates to the left
First example:
(\x.\y.x) M N --> (\y.M) N --> M
Crucial: argument N is never evaluated (could have an infinite loop)
Programming in Lambda Calculus
Everything is continuation-passing style
Q: Who remembers the boolean equation solver?
Q: What classes of results could it produce?
A: success, failure
Q: How were the results delivered?
A: calling continuations
Q: How shall we do Booleans?
A: true continuation, false continuation
Coding Booleans
Booleans take two continuations:
true = \t.\f.t
false = \t.\f.f
if M then N else P = ??? (* M N P *)
if = \b.\t.\e.b t e
Coding Pairs
How many ways can pairs be created? (A: one, pair constructor)
How many continuations? (A: one, corresponding to the pair)
What information does it expect? (A: the two elements of the pair)
What are the algebraic laws?
fst (pair x y) = x
snd (pair x y) = y
Code
pair
,fst
, andsnd
pair x y f = f x y fst p = p (\x.\y.x) snd p = p (\x.\y.y) pair = \x.\y.\f.f x y fst = \p.p (\x.\y.x) snd = \p.p (\x.\y.y)
Coding Lists
How many ways can lists be created? (A: two, nil and cons)
How many continuations? (A: two, one for each)
What does each continuation expect? (A: nil - nothing; cons - hd, tl)
For each creator, what are the laws regarding its continuations?
cons y ys n c = c y ys nil n c = n car xs = xs error (\y.\ys.y) cdr xs = xs error (\y.\ys.ys) null? xs = xs true (\y.\ys.false) cons = \y.\ys.\n.\c.c y ys nil = \n.\c.n car = \xs.xs error (\y.\ys.y) cdr = \xs.xs error (\y.\ys.ys) null? = \xs.xs true (\y.\ys.false)
Coding numbers: Church Numerals
Wikipedia good: “Church numerals”
Key Idea: The value of a numeral is the number of times it applies its argument function.
Taking stock:
bools
pairs
lists
numbers
Question: What’s missing from this picture?
Answer: Recursive functions.
Astonishing fact: we don’t need letrec
or val-rec
The Y-combinator = \f.(\x.f (x x))(\x.f (x x))
can encode recursion.
Operational semantics of lambda calculus
New kind of semantics: small-step
New judgment form
M --> N ("M reduces to N in one step")
No context!! No turnstile!!
Just pushing terms around == calculus
Beta-reduction
The substitution in the beta rule is the heart of the lambda calculus
- It’s hard to get right
- It’s a stupid design for real programming (shell, tex, tcl)
- It’s even hard for theorists!
- But it’s the simplest known thing
Board examples:
Are these functions the same?
\x.\y.x \w.\z.w
Are these functions the same?
\x.\y.z \w.\z.z
Examples of free variables:
\x. + x y (* y is free *)
\x.\y. x (* nothing is free *)
Example:
(\yes.\no.yes)(\time.no) ->
\z.\time.no
and never
\no.\time.no // WRONG!!!!!!
Must rename the bound variable:
(\yes.\no.yes) (\time.no) tuesday
->
(\yes.\z.yes) (\time.no) tuesday
->
(\z.\time.no) tuesday
->
\time.no
Summary
Lambda calculus is Turing Complete
Essence of most programming languages
Evaluation proceeds by substituting arguments for formal variables (called beta reduction)
Definition of free variables
Alpha-conversion allows bound variables to be renamed.
11 December 2017: Comp 105 Conclusion
There are PDF slides for 12/12/2017.
Announce
Final: Thursday, December 14, 8:30 - 10:30 in Barnum 08 (class)
May bring one double-sided page of notes
Online evaluations: Send screen-shot to comp105-grades for participation credit.
Last Time
Lambda calculus
Bools, pairs, lists, numbers
Y-combinator supports recursion
Church-Turing thesis
β-reduction
α- conversion
definitions of free and bound variables.
Today
What have we done (mostly) since the midterm?
Where might you go from here?
Your questions!
Class Feedback