Sign up for Piazza: piazza.com/tufts/fall2013/comp50
Quick start on class home page
Diagram of the course:
Solve Problems
/ | \
Domain / | \
Knowledge--/ Design and Build \- Math
Programs - algebra
| - trig
| - probability (counting)
Understand **Data**
Ultimately,
**better** programs
Problem-solving technique:
Problem -> Data -> Design Recipe -> Program -> Better Program
“Better” means
Plus the skill to articulate how you have solved a problem.
Labs, homeworks, and projects, of which the best are projects. Goals:
They should be interesting.
Some projects will connect to problems and ideas outside of computer science.
Some projects will help you learn how technology works
Some examples:
Working with GPS coordinates, following GPS navigation to a location
Using data and visualization to understand election results or to see what we can learn from Twitter about the people who use it
From its genetic code, what can we learn about an organism?
What language is this web page written in? (Google Translate)
Not all guaranteed to be offered, but definitely GPS and definitely web-page language.
Primarily about learning, not grades
I don’t have lots of little rules that tell you exactly what grade you are going to get.
I don’t have a quota — I can give as many A’s as I like.
My goal is for you to learn, not to get an A. If this is not also your goal, you are going to be uncomfortable. But if you are new to Tufts, stick with your discomfort: throw off your chains.
You’re doing the work of learning, but we do a lot of stuff that helps:
In-class quizzes, just about every day.
Goal: get your brain loaded and into the on position.
Board: Monday’s quiz (not graded): high-school trigonometry. Review your sines, cosines, and radians.
I create labs, which are guided, hands-on experiences.
These experiences address problem-solving by computer or help you learn technology.
I create homeworks and projects, which are independent, hands-on experiences solving problems by computer.
My staff and I respond to questions: on Piazza, in labs, in office hours.
We also teach one on one in labs and office hours.
Last but not least, I lecture. This has little to do with transmitting information; my goal here is to empower you to do your best work. For those of you who have played organized sports, it may help to think of me as a coach.
Don’t try to write it all down; do write down what’s meaningful to you. My notes are all on the web.
Yes. Although I am not about grades, I recognize that grades are important:
No traditional exams: it is too difficult to measure the skills that are really important
In-class quizzes, just about every day. Some are graded, pass/fail.
Weekly lab sessions. You will perform an activity of learners: articulate what you did and what you learned from it. We’ll help.
Homework and projects: we’ll grade something from every homework, but we don’t have enough staff to grade every problem from every homework.
In lieu of final exam, learning portfolio. In effect, you identify an area where you can demonstrate learning. It’s two samples of your work and a very short explanation of how they demonstrate learning. Very similar to samples of your work and a cover letter that you might submit to get a job interview or an internship—and we hope you will use it exactly that way.
We’ll give you lots of help, and we’ll talk more about this later.
I point out the path up the mountain. You climb it.
You understand that lecture is not for transmitting information. You won’t get everything you need in lecture!
You understand that to get the information you need, it’s your responsibility to turn to other sources:
The textbook
Perhaps Khan Academy (if you need to bone up on any math)
Perhaps other Web resources
You hone your skills through deliberate practice [board] — look it up
Our commitment to you: we work very hard to make it possible for you to have a great experience.
What’s your corresponding commitment to us?
I expect two things:
First, you do something ambitious. You get to define what your ambition is. It might be to build something, to master a new way of thinking about something, to make your math knowledge practical, to articulate your problem-solving skills brilliantly, or something else entirely.
Second, you must engage in brief, daily sessions of deliberate practice:
A couple of words about brief, daily sessions:
Trumps prior experience
After two or three weeks of 45 to 90 minutes per day, most people start to experience “instant recall” of where they were.
Are we detecting a theme here? You are responsible for learning new skills and new abilities, and you will learn to assess yourself so that you know where you are. Everything in the course is bent toward that end.
Please have with you a driver’s license, government ID, or Tufts ID.
What kinds of data do you see?
How are they formed?
What kind of arithmetic do they use?
What questions can you ask about this data?
How are the data related?
You have homework already. Tiny amount of code, serious thinking.
Two goals:
Learn by feel what clear instructions means
Help with problem 3 on the homework
The big picture is that real-world data all come with a form of arithmetic:
Arithmetic of numbers
Arithmetic of dates
Arithmetic of strings
Arithmetic of images!
In each case, arithmetic leads to formulas
We are therefore about to
Get a feeling for what a procedure is
Practice describing observed phenomena with formulas, aka understanding the structure of data we find in the world
All your programs will have formulas in them
Class exercise: procedures and formulas, part 1:
Procedure: everybody get up and stand in a line
(Impossibly complicated for the computer)
Procedure: find the person in the middle
How do we simplify this enough for a computer?
Initialization: Two people on the end raise your hands
Step: If a person next to you has hands raised and you do not, raise your hand
Termination: last person to raise a hand is the person in the middle (could be one person or two)
We won’t count these.
Procedure: step forward from the middle out (count and find formula)
Initialization: Middle person step forward
Step: If the person next to you is forward and you are not, step forward
I’m going to put on the board the number of people who are forward at the end of each step
Termination: everyone is forward
Procedure: choose two people and sit down (count and find formula)
Initialization: I choose someone to raise a hand
Step:
All those with hands up go to the podium
If you are on the podium and your hand is raised, point to two people still in line whose hands are not raised. Then sit down.
I’m going to put on the board the number of people who are seated at the end of each step
Termination: everyone is seated
Formulas:
What formula predicts how many people are standing as a function of time?
What formula predicts how many people are seated as a function of time?
Live coding: check our work
Contract purpose and header
Examples
Body
Test
Useful functions: expt
(define (log10 x) (/ (log x) (log 10)))
(define (decibans x) ...)
Take the “flash of insight” and put it into place in a system
We know when we’re solving a puzzle
We know when we’re being creative
We know when we’re carrying out a creative solution
Part of the system is the table of examples:
We have (N) | We want (seated) | We also know (expt 2 N) |
---|---|---|
1 | 1 | 2 |
2 | 3 | 4 |
3 | 7 | 8 |
Use this method for filling in function template.
What’s going on with Tesla? What did Tesla measure?
Q: How did we discover the rule?
It looks like we’re adding a halftone? How does that get translated into multiplication?
Answer: it’s a fact of human audio perception. True of loudness as well as pitch. Also affects perception of brightness (but this is more complicated).
Key algebraic question: what function f has the property that
f(x + y) = f(x) × f(y)?
Another question: what function g has the property that
g(x × y) = g(x) + g(y)?
These functions are a key to understanding behaviors and to dealing with large numbers.
We won’t worry about efficiency for quite some time. But we will be working sooner with probability, and that second function g will be very handy.
Finally, Tesla’s karma problem, anyone?
Likert scale:
If an event is equally likely and unlikely, what are the odds?
What’s the log of the odds
What’s a rational scale look like?
What slots did you pick?
Where are your events?
This week, just immigration, conditionals, animations
Next week, GPS. However, I promised you a quiz!
Slice through the Earth; here’s Boston
Angle θ is Boston’s latitude
Radius R is the radius of the earth (6378.137 km)
How far is Boston from the Earth’s axis of rotation?
If you take a hovercraft over the Earth’s surface, how far do you travel to get to the equator?
Answers should be two formulas using R and θ
Post-quiz: in the Earth’s frame of reference, how far does Boston travel in an hour?
Daily practice: reading, at least one exercise, check Piazza
Study groups: lab will help you meet people and form them
Lab today! 3pm
For lab you need a CS account and password.
For lab you need a signed copy of the Bargain.
Commentary on homework:
Hours vs minutes: minor deduction.
Confusion about probability vs odds. (No penalty; solutions should clear it up.)
Quadratic formulas.
Approximates any smooth function over a sufficiently small interval (calculus).
Some of you have been damaged. I’m so sorry.
Resist the impulse
Numbers are the most familiar data but also the most difficult to work with. Fortunately today’s second example is almost entirely symbolic.
Problem is to compute “learning capacity hours”
Data includes
Designing programs requires us to (a) make a plan, (b) build the bricks, (c) stack the bricks together.
Today:
big-bang
programs (design recipe)First: any questions about tonight’s homework?
Board: latitudes
Goal: review for tonight’s homework
Study abroad: subtropical? temperate? subpolar? How long are the shortest days?
Goal: preparation for next homework (user-interface experiment)
(Demonstrate handin server)
Eventually we will build a simulation of the algorithms from the first day of class. But that simulation will require advanced techniques. Today we will simulate a traffic light.
Ideas of big-bang
:
The world and the model (program)
Rendering the model as an image (to-draw
)
Allowing the world to evolve on its own (on-tick
)
(Later, allowing interactions to affect the world)
Incremental design of the world’s state
Ideas of how to design programs:
Make a plan
Build the bricks
Put them together
Start with the simplest plan that could possibly work
Refine it
This is a chunk of the skill you’ll develop. You’ll get to practice it over and over. Second time today.
So, questions:
What exists in the world of the traffic light?
How shall we represent it in the model of the computer?
How does it behave?
What does the design recipe tell us about how to write the function?
How can we test it?
How can we render it as an image?
Domain knowledge: the transparent square
(See handout)
It’s everywhere. In particular, it’s everywhere in your homework!
The traffic light
The circle dropper
The line drawer
Ask for a programmer
Roll dice for functions
Airline customers:
(define-struct mr (first last))
(define-struct ms (first last))
(define-strudr dr (first last))
Design function "Welcome aboard, Mr Ramsey!"
Employees:
(define-struct grad-ta (name stipend))
(define-struct hourly-ta (name wage hours/week))
Design function
Pay to the order of Tom Williams $538
Board:
DEFN BY CHOICE <-------> DEFN BY PARTS
| /
| /
V /
ATOMIC [SELF-REFERENCE]
(by name) (by name)
Add these arrows later:
Quiz: data description
The university has two missions:
Every professor contributes to both of these missions.
This past week:
For example, I learned that
The part where you guess the formula based on your table of examples? There is a web site that tries to do this (but it uses a different language from BSL).
There are powerful new techniques for integrating database queries into a language related to Intermediate Student Language
Something very like Intermediate Student Language is being used in the Pediatric Anesthesia Research Team and the University Hospital in British Columbia. Computer controls drugs (one of which killed Michael Jackson).
As a result, very far behind on grading. It’s now the top priority.
Congratulations! You’ve reached Level 1: basic programming with bounded data. You also have a lot of practice writing interactive graphics programs with big-bang
.
If you’re concerned about the course, I’d love to see you. We’re losing three students a week. I’d like to know what I need to do differently.
What’s next? The last way to organize data. What problem does it solve?
Let’s draw a picture of the conspiracy:
NR
/ \
AB JA
/ \
MW AP ? ?
Plan:
Build it out
Who hasn’t fully recruited yet?
Get more recruits! (At most two per person)
How long does it take a conspiracy to spread?
Ask everybody to send email to comp50conspiracy@gmail.com
. Include your name and the name of the person who recruited you. If you like, you may include your recruits’ names as well.
Key phrase:
When you have found a new recruit, give your recruit these instructions
The instructions refer to themselves!
The data description is also going to refer to itself.
Data description
A person is a structure:
(make-person first last)
where first
and last
are strings
A conspiracy is either
A structure:
(make-cell head recruits1 recruits2)
where
head
is a personrecruits1
is a conspiracyrecruits2
is a conspiracyA structure
(make-no-recruits)
And DRAW THE ARROWS
We need
(define-struct person (first last))
(define-struct conspiracy (head recruits1 recruits2))
(define-struct no-recruits ())
Commentary:
A self-referential data structure is always defined by choice
The empty structure (make-no-recruits)
comes up often:
It acts a bit like “zero.”
How many people are in the conspiracy?
I can ask Ali and Jordan, who can ask their recruits, and so on…
How do we make this idea precise?
Template
Things to notice about the template:
Any self-referential data is wrapped in a call to the function being defined
In other words, the instructions refer to themselves!
This coding pattern is called natural recursion
Tests!
Does the data definition Your template needs as many cond
distinguish among different clauses as subclasses that the
subclasses of data? data definition distinguishes.
How do the subclasses differ from Use the differences to formulate each other? a condition per clause.
Do any of the clauses deal with If so, add appropriate selector structured values? expressions to the clause.
Does the data definition use Formulate “natural recursions” self-references? for the template to represent
the self-references of the data definition.
Does the data definition refer to Contemplate the development another data definition? of a separate template. See Structures In Lists in the book. ———————————————————————————
: Formulating a template for self-referential data
What are the answers for the The examples should tell you which non-recursive cond clauses? values you need here. If not,
formulate appropriate examples and tests.
What do the selector expressions The data definitions tell you what in the recursive clauses compute? kind of data these expressions
extract and the interpretations of the data definitions tell you what this data represents.
What do the natural recursions Use the purpose statement of the
compute function to determine what the
value of the recursion means not
how it computes this answer. If
the purpose statement doesn’t tell
you the answer, improve the
purpose statement.
How can you combine these values If you are stuck here, arrange the into the desired answer? examples from the third step in a table. Place the given input in
the first column and the desired
output in the last column. In the intermediate columns enter the
values of the selector expressions and the natural recursion(s). Add examples until you see a pattern
emerge that suggests a ``combining’’ function.
———————————————————————————
: How to turn the template into a function definition
"closed"
(make-open tourists rangers)
Where
(define-struct open (tourists rangers))
Show the template.
Example: cost-to-taxpayers
Now you do one:
(make-lecture seats)
(make-lab computers)
where
(define-struct lecture (seats))
(define-struct lab (computers))
What is the template for a function that consumes rooms? (Example: capacity
)
big-bang
this weekRevisit the conspiracy tree:
(make-cell head recruits1 recruits2)
(no-recruits)
(define bos (make-stn "South Station" 0))
(define bby (make-stn "Back Bay" 1.1))
(define pvd (make-stn "Providence" 42.6))
(define nhv (make-stn "New Haven" 156.4))
(define nro (make-stn "New Rochelle" 212.1))
(define nyp (make-stn "NY Penn Station" 228.7))
What’s different?
Functions:
cons
and empty
cons?
and empty?
first
and rest
Today:
Coming attraction:
(define nyp (make-stn "NY Penn Station" 228.7))
(define nro (make-stn "New Rochelle" 212.1))
(define nhv (make-stn "New Haven" 156.4))
(define pvd (make-stn "Providence" 42.6))
(define bby (make-stn "Back Bay" 1.1))
(define bos (make-stn "South Station" 0))
Live demo: how to make a tree
Functions:
southmost
northmost
stations-on
the railway; the append
functionappend
is our first function that consumes two lists. (Built into BSL)stations-on
: south-to-north?
decreasing?
singleton?
(used the enriched recipe)List design:
What are you learning?
define
, cond
, define-struct
, make-posn
(define (f x) (+ (* 5 x) 22))
(define dog (circle ...))
, how to arrange functionsWhat do we think of all this?
Language is dominated by fashion (some languages are better for some purposes, but they don’t always win.
Vocabulary and grammar are rote learning. Fluency will come with time and practice. Lean on the manuals and the first Intermezzo.
Domain knowledge rocks. Every problem solver needs it. Sometimes (usually later in your career) you will work with a “domain expert.” For example, Professor Brodley works with radiologists to help teach the computer how to read X-rays and MRIs. But while you’re a student, and early in your career, or if you start a company, often you have to become the domain expert.
Start now. Take your hobbies seriously. Go deep into something that interests you—know more than anybody else.
The rules of computing are necessary. They become interesting when
As a last resort, you have to figure out what went wrong
The key idea of this course is systematic problem solving. It helps you with business (courses), engineering, math, English, journalism, medicine, and other things. I guarantee it.
Roles:6. Tests
5. Code
4. Template
3. Functional examples
2. Signature/purpose/header
1b. Data examples
1a. Describe data
Add variants!
forms of data
=============
| "atomic" | "enumerations" | "structure"
process steps | numbers, images, ... | intervals, strings | define-struct
===============-------------------------------------------------------------------
| | [1,3] (1,3) [1,3) | (make-posn 3 4)
describe data | Number Image | color is one of: .. | (make-dst "h")
----------------------------------------------------------------------------------
signature | Number -> Image | XYZ -> Number | ABC -> String
purpose | create sq of size s | convert into number | add "Dear" to name
header | (define (square s) ...) | |
----------------------------------------------------------------------------------
functional | 3 |---> [] | [1,3] |---> 2 | (make-posn 3 4)
examples | .... | ... | |--> "5"
----------------------------------------------------------------------------------
template | domain knowledge | cond cases! | selector expr.
----------------------------------------------------------------------------------
now code
----------------------------------------------------------------------------------
tests | turn examples into tests (use of check-expect)
| (check-expect (S 3) []) | (check-expect | (check-expect 5
| | (f 1 3) 2) | (f (make-posn 3 4)))
Interplay of the steps, e.g., in the line-drawing or circle-dropping programs.
Two more extension tokens
Quiz: function templates
Convert list of points to list of positions:
Convert list of (x1 y1 x2 y2 x3 y3 ...)
into list of posn
Q: What is the data description? An xy-list is…
Followups:
What happens to a singleton list?
How do we detect a list with an odd number of elements?
How do we detect a singleton list?
Testing data examples for the apocalyptic railway:
Extract all numbers from the 1D-tree (finger exercise)
Class exercise: is the list of numbers decreasing?
(decreasing? (list 3 2 1) true)
(decreasing? (list 1 2 3) false)
(decreasing? (list 2 1 4 3) false)
How will you write it?
Three cases:
Hints to lockstep:
Lockstep example: zombies on the railway
(define northeast (list nyp nro nhv pvd bby bos))
(define zs (list 0 0 7 83 40 0))
Problem: compute a list of stops on the apocalyptic railway, where a stop is either a station or a horde of zombies (see demo).
In-class demo: tally-invasion
In-class demo: my-append
NOTE: Issued two more extension tokens for a total of 5
Signature must not just say ‘structure’; it must say what structure
Apocalyptic railway: Is it enough to follow the boundary?
What kinds of examples are we going to look for to make our case?
Who can come up with an argument or with a counterexample?
How to avoid zombies
Simplify!
This one is tricky. Luckily you have the template handout!
A nonempty list of strings (1LOS) is one of:
(cons string empty)
(cons string 1LOS)
Write a template for a function that consumes a nonempty list of strings. For example, you can imagine a function that puts a space between words instead of the one you did in lab that puts spaces after words.
Data definition:
; An S-expr (S-expression) is one of:
; – Atom
; – SL
; An SL (S-list) is one of:
; – empty
; – (cons S-expr SL)
; An Atom is one of:
; – Number
; – String
; – Symbol
We must be sure we can build it!
Call for data examples
One for every choice
A simple one for every choice
Maybe something more complicated for testing
Data definition:
; An SX (S-expression) is one of:
; – Atom
; – SL
; An SL (S-list) is one of:
; – empty
; – (cons SX SL)
; An Atom is one of:
; – Number
; – String
; – Symbol
Examples:
Round | SX | SL | Atom |
---|---|---|---|
1 | — | empty |
3 , "Hello" , 'joe , '+ |
2 | 3 , "Hello" , 'joe , '+ empty |
— | — |
3 | — | '(cons 3 empty) , '(list 3) , '(3) , ("Hello") , (+) |
— |
… | '(("Hello") "Hello") , '(* (+ x 1) (- x 1) , … |
More examples:
'(* (- x 1) (+ x 1))
Does it meet the definition?
'((nr 50) (bh 15) (ms 11))
Does it meet the definition?
Key ideas:
For every problem, design three functions together
Functional examples should
In template, data reference by name implies code makes function call
To translate data definition into template, use handout
Q: Where should coding start?
Q: How should coding be driven?
Programming problems:
Define the predicate atom?
Count how many literal numbers occur in an S-expression
Substitute one symbol for another
Can we define a BSL-expression? (Symbol in operator position)
Are two S-expressions the same except for numbers?
Are two S-expressions different in only one atom?
Are two S-expressions different in only one subexpression?
Reading: Part IV, Section 19 (and Intermezzo 3)
Placing text in the corners of a game board (and other demos)
local
Demonstration of costs
What causes code bloat?
Functions that have very similar purpose statements (and that operate on the same kind of data)
Duplicate or near-duplicate phrases within code
Coding without a plan:
Big data definitions, especially big structures
What makes software simple or complex?
Conditionals within other conditionals multiply cases and make code complex.
Nested conditionals often benefit from being pulled into helper functions, which have names, signatures, purpose statements, and tests. These adjuncts help people understand the code.
Big data definitions, especially
There is a tradeoff here in how to manage complexity that is inherent in the world. Sometimes it is better to push choices into another named data definition, so that not all code has to be aware of the choice.
Today:
local
Relevant reading: First Edition, Sections 19 and 20
Quiz: kinds of data definition, corresponding templates
Homework review:
cond
by appeal to dataExample: Corner placement
Where else have you seen repetition?
There is a demo file for text placement.
Relevant reading: First Edition, Sections 21 and 22
Do not share code! (Violation may have occurred; I have no choice)
Longest-string example is in demo; see function longest
in section “More than just clean laundry”.
“Constant” (means independent of input) | logN | N | NlogN | N2 | 2N (exponential) |
---|---|---|---|---|---|
cons? |
find station safely | search all stations | efficient sort | insertion sort (worst case) | all arrangements |
longest string with repeat calls | |||||
find nearest town | search a list | ||||
length of list | |||||
insertion sort (if nearly sorted) | |||||
longest string (one call) |
Cost table in notes
Design goal: reuse (a mark of quality software)
Abstraction in data definitions:
(listof X)
(1d-located X)
– apocalyptic railway
(gps-located X)
– town search
Now the data definition looks like a function. The capital X
is a type parameter. It stands for an unknown class of values. Like the Stepper, the data definition works by substitution.
A `(listof X) is one of
empty
(cons x as)
where x
is an X
and xs
is a (listof X)
(See two occurrences of type variable.)
Type variable stands for an unknown type: student, number, string, station, and so on. (In a regular function, the parameter stands for a single unknown value.)
This style of programming is called programming with generics (Java) or sometimes parametric polymorphism. C++ uses something called templates. We will talk about parametric data definitions.
Purpose statements:
To produce a list of all students in this class who carry MacBooks
To produce a list of all students in this class who live in Tilton
To produce a list of all students in this class who have Android phones
To produce a list of all students in this class who are sophomores
Other purposes:
To produce a list of heights of all students in this class
What student in this class lives furthest from Tufts?
What word on this list is most similar to this one I have misspelled?
To solve these problems, other languages have “loops”. In 15 you will encounter while
loops and for
loops and perhaps other flavors of loops.
We will define functions. These functions are called polymorphic or generic.
Questions:
procedure?
map
andmap ; aka forall?
ormap ; aka exists?
filter
foldl
foldr
compose
build-list
argmax
argmin
apply
quicksort
sort
XXX memf
XXXbuild-string
XXX for-each
Responses to midterm evals:
Workload: All intro CS classes are a ton of work. Just as much work as intro to a foreign language. That’s because computer science is a “knowing how to do things” kind of field rather than a “knowing about ideas” kind of field.
This doesn’t help you, but it’s my job to make this clear from the outset to future students.
Reading: I trusted advice from others who have taught from this book elsewhere. Result: disaster.
We can correct: how can I best guide you in the reading?
Labs: I need to work with my staff to increase the number of interventions. I am trying to avoid interrupting people who are working productively.
The labs are designed for the average student to finish half. I believe in that model and have seen it produce good results. Of course I made a disastrous mistake in the first big-bang
lab.
I will work harder on trying to split lab problems into small pieces.
Feedback: Without throwing any of my people under the bus, this is a train wreck. Homework you submitted on September 23 has not been returned. I am raising hell about it.
Although I do a significant chunk of the grading for each assignment, I am responsible for 45 students across two new classes, and I cannot do all the grading myself. Lack of feedback is my #1 concern.
Yes, definitely this issue will be accounted for in final grades. I will review all of the summary assessments and will not penalize anyone for making the same mistake multiple times without feedback.
Similar purpose statements:
The longest string
The student with the highest GPA
The closest hospital
The least expensive boxed set of all 7 seasons of Buffy
DEMO: The longest string
DEMO: Let’s generalize!
And other ideas from students.
What is the average SAT score of all the freshmen?
Today:
Next week schedule
Where have we been and where are we going: big-picture concerns
Build up a vocabulary for working with lists
(Same thing happens in any other domain or data structure)
Not every function has to have a name
Data structures beyond lists (sneak preview of what’s coming)
(You’re not beginners any more.)
More lists!
Example solution to permissible words (with lambda)
“Do something to each” == fold
Today’s topic: Generative recursion First a recap (structural recursion):
A large system, I worked a little bit on it every day.
We keep track with a data structure:
A Table is a (listof Line) where a Line is a (listof Numbers)
Data example: (list (list 30 40 50) (list 0 43 9))
Now let’s define a function:
;; summarize-table :: table -> line
;; aggregates the elements of the given Table as a Line
[functional examples not captured]
Template:
(define (table-summary a-table)
(cond
[(empty? a-table) ... ]
[(cons? a-table) (... a-table ... (first a-table)
... (rest a-table) ... (table-summary (rest a-table)))]))
Template lets us know what we have available to us. We get first
and rest
“for free” when working with lists.
Idea: we need a helper function that adds lines (he’s already written the template)
addTwoLines :: Line -> Line -> Line
(Now we are talking about this helper function: how many cases do we have?
We go through the template for two lists;
...
Now what’s the last step of the design recipe? Tests. Let’s turn our examples into the tests with check-expects (This has been done already in the code for brevity’s sake).
But first can we can use “loops” to simplify our code. (This is another name for fold, map, and so on).
Hey, let’s use a foldr! (And we do it, all the check-expects pass again. Super.)
Ok, continuing with the example: I stored my data in a Table, but my friend stored his in an Excel file (CSV format).
We talk about CSV files – Racket has a friendly way of dealing with this format.
A CSV is a (listof datum)
where a datum is:
a number
eol (end of line character)
Data examples:
empty
(list eol)
(list 60 eol)
(list eol eol) //two empty lines
Hmm now we have two different data representations; we need a function that takes a CSV and returns a Table.
(We go through the purpose statement, contract and examples…should be in the code)
First instinct, let’s write the natural recursion template.
Uh oh, there’s no way to do this with structural recursion. Tricky.
Instead of using the structure to create a new, smaller problem to solve, we generate a new problem from the data.
We don’t have to change the template too much–just rearrange some words.
We need a function called getFirstLine, that returns the first line of the CSV as a list.
(He wrote it for us. This is just structural recursion.)
OK, what else do we need? We only processed the first line, now we need to get the rest of the CSV but we no longer have ‘rest’. We need to generate a new problem, so we want a function that removes the first line (i.e. truncates the CSV file). (He wrote this one again; the template is identical to getFirstLine; we cruised through these two functions…)
Q: What’s the signature for removeFirstLine?
A: CSV -> CSV, because a CSV can be empty.
What’s the trivially solvable problem?
How do we create the new problem?
Now let’s write tests! Make them from our examples.
We’re using a template that looks different but we aren’t going to throw out the design recipe. It’s actually the same template, but we just have different vocabulary.
The structure of the data give you the new problems to solve; be careful, and use your imagination to come up with new ways to solve the problem.
Pitfalls:
Q: In structural recursion, how do we know our function terminates?
A: calls to rest produce smaller lists).
Q: How can we make sure that functions terminate in generative recursion? We gotta be careful here; it’s more complicated than in structural recursion.
We now need to have a termination argument. Every time you write a function using generative recursion, you need to guarantee that the recursive call terminates.
AVOID GENERATIVE RECURSION (if you can): There are usually simpler weapons.
Fibonacci numbers. To figure out how fast he’s gonna get more rabbits, Fibonacci made a sequence, etc. Now we are looking at shells and the golden ratio to motivate Fib numbers in real life.
Let’s write a function (in five minutes) to calculate Fib numbers.
Fib : Positive -> Positive
Data definition:
a Positive is either:
0 or
(+ 1 n) where n is a Positive
(Now we are looking at the code for the purpose statement; data examples and template; Uh oh, my template doesn’t give me (- 2 n) based on the structure of our data definition;
Q: What’s different in this template from structural recursion?
A: There are n numbers of trivial cases; we can have multiple recursive calls and we combine the simplest sub-problems together
Problems:
Make a 2D-tree. Can it be done by structural recursion? What are the costs?
Sort a list. Can it be done by structural recursion? What are the costs?
Balance a binary search tree. How do we measure balance? Can it be done by structural recursion? What are the costs?
Removing HTML tags. Can it be done by structural recursion?
Board:
Draw 2D points
CHOOSE THREE DICE WITH DIFFERENT NUMBERS OF SIDES
Problems:
Make a 2D-tree
The straight-line dice game:
From your three different dice, choose one
I pay out $1 for each straight line in the number you roll
Will you pay $1.20 to play the game with your die?
Will you pay $1.50?
How many tries should it take you to roll 7 or less?
Odds of dice 1 vs dice 2 = P(observation|dice1) / P(observation|dice2)
How much evidence is added by an observation?
d4+d6:
`((0 = 0)
(1 = 0)
(2 = 0.0416)
(3 = 0.125)
(4 = 0.25)
(5 = 0.416)
(6 = 0.583)
(7 = 0.75)
(8 = 0.875)
(9 = 0.9583)
(10 = 1))
d4+d8:
`((0 = 0)
(1 = 0)
(2 = 0.03125)
(3 = 0.09375)
(4 = 0.1875)
(5 = 0.3125)
(6 = 0.4375)
(7 = 0.5625)
(8 = 0.6875)
(9 = 0.8125)
(10 = 0.90625))
d4+d10
`((0 = 0)
(1 = 0.025)
(2 = 0.075)
(3 = 0.15)
(4 = 0.25)
(5 = 0.35)
(6 = 0.45)
(7 = 0.55)
(8 = 0.65)
(9 = 0.75)
(10 = 0.85))
d4+d12
`((0 = 0)
(1 = 0)
(2 = 0.02083)
(3 = 0.0625)
(4 = 0.125)
(5 = 0.2083)
(6 = 0.2916)
(7 = 0.375)
(8 = 0.4583)
(9 = 0.5416)
(10 = 0.625))
d4+d20:
`((0 = 0)
(1 = 0)
(2 = 0.0125)
(3 = 0.0375)
(4 = 0.075)
(5 = 0.125)
(6 = 0.175)
(7 = 0.225)
(8 = 0.275)
(9 = 0.325)
(10 = 0.375))
d6+d8:
`((0 = 0)
(1 = 0)
(2 = 0.02083)
(3 = 0.0625)
(4 = 0.125)
(5 = 0.2083)
(6 = 0.3125)
(7 = 0.4375)
(8 = 0.5625)
(9 = 0.6875)
(10 = 0.7916))
d6 + d10
`((0 = 0)
(1 = 0.016)
(2 = 0.05)
(3 = 0.1)
(4 = 0.16)
(5 = 0.25)
(6 = 0.35)
(7 = 0.45)
(8 = 0.55)
(9 = 0.65)
(10 = 0.75))
d6+d12:
`((0 = 0)
(1 = 0)
(2 = 0.0138)
(3 = 0.0416)
(4 = 0.083)
(5 = 0.138)
(6 = 0.2083)
(7 = 0.2916)
(8 = 0.375)
(9 = 0.4583)
(10 = 0.5416))
d8+d10
`((0 = 0)
(1 = 0.0125)
(2 = 0.0375)
(3 = 0.075)
(4 = 0.125)
(5 = 0.1875)
(6 = 0.2625)
(7 = 0.35)
(8 = 0.45)
(9 = 0.55)
(10 = 0.65))
d8 + d12:
`((0 = 0)
(1 = 0)
(2 = 0.010416)
(3 = 0.03125)
(4 = 0.0625)
(5 = 0.10416)
(6 = 0.15625)
(7 = 0.21875)
(8 = 0.2916)
(9 = 0.375)
(10 = 0.4583))
d10+d12
`((0 = 0)
(1 = 0.0083)
(2 = 0.025)
(3 = 0.05)
(4 = 0.083)
(5 = 0.125)
(6 = 0.175)
(7 = 0.23)
(8 = 0.3)
(9 = 0.375)
(10 = 0.4583))
Note: Online course evaluations
I know that grading is totally broken
I know I need to improve lab
What else do I need to know to make the course better?
What does the dept head need to know about this course?
Today:
Online course evaluations
Generative recursion: splitting in two (for performance)
Live coding: writing drop
Live coding: balanced 2D trees
split-by
sort-by
Live coding: path ratio and tree balance
(load a model)
People’s choice:
Live coding: balancing binary-search trees
Live coding: merge sort
Live coding: probability and expectation
Handout: creating languages with Racket
Note: course evaluations (labs and projects)
List of barriers to entry (board):
Concrete syntax (everywhere)
Distinct expressions and statements (most places)
Mutable state (most places)
Pointers and memory management (unique to C and C++)
Review of syntactic structure that you know:
Definitions (structures, functions, values)
Tests
Expressions (function application and cond
, local
, lambda
)
Example BSTs:
Lua
Haskell
C++
Concrete syntax:
Just like memorizing vocabulary; done through repetition
Provokes endless religious wars
Lots of beautiful syntax out there
Even more ugly syntax
Why do we have concrete syntax? It is the human interface. Language designers are trying to express their thoughts concisely and beautifully. I am not making this up.
Mutable state:
Rather than make a function call, mutate the value of an existing variable
Most basically called “assignment”
Motivated by machine structures, not by helping people think or solve problems
Can be avoided in many contexts
Is hard to avoid in C or C++
It is known that reasoning about sequences of mutations presents a cognitive barrier to creation and understanding. Mutation will be one of the more difficult ideas you encounter. I recommend the advanced part of the textbook.
Why do we have mutation? It is “closer to the machine” and is a way of making it very easy to predict costs.
Split into “statements” and “expressions”:
Grounded in a false distinction between “do something” and “compute something”
In most languages, case analysis (conditionals) is done by “statement”; arithmetic and other function calls are done by expressions.
Language that are not broken in this way are called “expression-oriented”; there are lots besides *SL, but they are not fashionable.
This is really a minor impediment; you will get used to it quickly.
Why do we have it? Not enough imagination (or insufficiently smart compilers) to escape the machine. It is deeply embedded into our folklore.
Pointers:
Unique to the world of C and C++
The goal is ultimate control of the machine so that you know exactly what things cost.
The micromanagement exacts a heavy human cost.
History:
As recently as 1990, machines were slow and weak, and every practicing problem-solver had to micromanage machine resources.
By the late 1990s, the world had changed. Now you need to know this only if you want to work for Google or Microsoft or Apple or Amazon or some other place where costs really matter.
Delaying the onset of pointers is a big reason we want to change the curriculum.
Example:
Problem: given that a node can contain two other nodes, how big is a node?
Solution: the contents of the node is stored off in a place called “the heap”, and we use a “pointer” to the node.
Template: new
, constructor.
Arrays are like lists except:
Access to any element in constant time
Versus
;; nth : natural (listof X) -> X
;; to produce the `n`th element of xs, counting from zero
(define (nth n xs)
(first (drop n xs)))
But: unlike a list, an array is hard to grow and shrink
(thus, “dynamic arrays”)
The template for arrays:
for
loop (just like a structural recursion)Note: Alert if portfolio in by 9pm
Don’t forget you can cheat: It’s always legitimate to solve a problem using Intermediate Student Language, then borrow structure from your solution and transfer to the unfamiliar context.
This is the key question:
Every language comes with atomic data
Numbers are universal, strings are spotty, images are rare, symbols are rare
Every language is good at definition by parts.
The problem spot is definition by choices:
Data definition is often enforced:
Usually less expressive than what you’re used to
Not all languages have parametric data definitions (called parametric polymorphism or generics)
(I know from reading your homework that you mostly understand the concept but have struggled with the details and the notations)
The error messages are going to be a barrier. You will break through the barrier with help and persistence. Hold onto your concepts and vocabulary; don’t let the new vocabulary hide the underlying similarities.
check-expect
Unit testing tends to be much more heavyweight than you are used to.
Most languages have “frameworks.”
Fairly large cost of entry
Afterward, the marginal cost of adding new tests is relatively small—it is comparable to building code
People won’t tell them to you—you will have to suss them out. But if you ask how to “examine” or “observe” a data structure, that will help.
The principles of choice and selection will always work for you
The idea of natural recursion will always work for you
In many contexts, people may expect you to write natural recursions using “loops.” You will learn about loops and arrays (called vectors in your book).
Example BSTs:
Lua
Haskell
C++
Concrete syntax: memorize through repetition
Mutable state: a new skill, some support in your book
Rather than make a function call, mutate the value of an existing variable
Most basically called “assignment”
Motivated by machine structures, not by helping people think or solve problems
Can be avoided in many contexts
Is hard to avoid in C or C++
It is known that reasoning about sequences of mutations presents a cognitive barrier to creation and understanding. Mutation will be one of the more difficult ideas you encounter. I recommend the advanced part of the textbook.
Why do we have mutation? It is “closer to the machine” and is a way of making it very easy to predict costs.
Split into “statements” and “expressions”: covered last time; not hard
Pointers:
Unique to the world of C and C++
The goal is ultimate control of the machine so that you know exactly what things cost.
The micromanagement exacts a heavy human cost.
History:
As recently as 1990, machines were slow and weak, and every practicing problem-solver had to micromanage machine resources.
By the late 1990s, the world had changed. Now you need to know this only if you want to work for Google or Microsoft or Apple or Amazon or some other place where costs really matter.
Delaying the onset of pointers is a big reason we want to change the curriculum.
Example:
Problem: given that a node can contain two other nodes, how big is a node?
Solution: the contents of the node is stored off in a place called “the heap”, and we use a “pointer” to the node.
Template: new
, constructor.
Arrays are like lists except:
Access to any element in constant time
Versus
;; nth : natural (listof X) -> X
;; to produce the `n`th element of xs, counting from zero
(define (nth n xs)
(first (drop n xs)))
But: unlike a list, an array is hard to grow and shrink
(thus, “dynamic arrays”)
The template for arrays:
for
loop (just like a structural recursion)Back to COMP 50 home page.