This handout discusses the role of “function composition” in the design process. In the textbook, this topic is not as explicit as I would like. But it is the key success with abstract list functions (andmap
, filter
, map
, and so on).
TL;DR: sometimes, instead of using a data-driven template, a function’s body should simply compose (i.e., apply) other functions.
By this time, you’re well acquainted with the idea of one function for each data definition. For example, if you’re searching for list tags, you might define three functions:
; element-list-tags : element -> (setof symbol)
; to return all the list tags used in the given element
; loe-list-tags : (listof element) -> (setof symbol)
; to return all the list tags used in the given list of elements
; tag-list-tags : symbol -> (setof symbol)
; to return all the list tags that equal the given symbol
You also know exactly how to use one of these functions:
At what stage of the design process do you deploy one of these functions? When you are turning a template into code.
The first mention of function composition is all the way back in Section 3, especially Section 3.1. The guiding principle is as follows:
When you see a relationship between two quantities (or other values) in a problem statement, embody that relationship in a function.
A single fact you know can just be written down. The step forward is that relationships among facts can be written as functions.
The book goes on to say that such functions will probably be useful, but it is a bit less clear how they will be used. One technique we know is that when you are filling in a function template, you might use these functions to combine
The shorthand for this technique is combining the values of the template’s subexpressions into the final answer.1
Section 12 of the first edition (Composing Functions, Revisited Again) hints at a kind of definition that does not follow the usual template. Look in Section 12.1 at bullet item 4:
Here is an example of what is meant, drawn from the (non-apocalyptic) railway problem:
The purpose of your main function is to take a point on the railway, identified by miles from Boston, and return directions to the nearest station.
This problem problem can be solved using natural recursion on a function that returns directions, but many programmers prefer to solve a more general problem “find the nearest station.” The main function then becomes this short definition:
; directions-from : number railway -> directions
; to return directions from the given point to the nearest station
(define (directions-from miles rail)
(directions-to-station miles (nearest-station miles rail)))
where the purpose statement of directions-to-station
is “given a point and a station, return directions from the point to that station.”
The key thing about the example is that directions-from
does not follow the template for the complex railway
data. Instead, directions-from
delegates or defers the templated computation to the nearest-station
function.
For the railway problem narrowly defined, deferring the computation provides no immediate benefit—but we all sense that a function “find the nearest station” is somehow “better” than a function “give me directions to the nearest station.” And we are right: the nearest-station
function is easier to reuse.
At what stage of the design process do you deploy one of these functions? The answer is not entirely clear to me, but I think it goes something like this:
If during the process of turning a template into code, you feel that your signature and purpose statement are a “bad fit” or are some how “unnatural”, consider changing horses midstream into a more general signature and purpose statement that will lead to more natural code.
Then, go back and rebuild the original function according to its original signature and purpose statement, but don’t use the same template. Instead, write a simple call to the new, more general version.
The second edition says roughly
Before going through the work of writing down a complex, reflect on the nature of the function. If the problem statement suggests that there are several tasks to be performed, it is likely that a function composition is needed instead of a template. In that case, skip the template step.
The key idea is that coding by “function composition” replaces a data-driven template. This idea comes to its own when we start serious programming with lists.
Anyone who works extensively with lists quickly builds up a vocabulary of frequently used computations. Some of these computations are exemplified by these purpose statements:
Each one of these computations is different, but they are representative of so many frequently used computations that we don’t want to write them over and over. Instead, we generalize each purpose statement to come up with an abstract list function. And when possible, we use one of these functions instead of writing the list template. Here are a couple of examples you’ve already seen:
; anymac? : (listof student) -> boolean
; to tell if any of the given students carries a MacBook
(define (anymac? students)
(ormap carries-MacBook? students))
; sophomores : (listof student) -> (listof student)
; to produce a list of all the given students who are sophomores
(define (sophomores? students)
(local [; soph? : student -> boolean
; to tell if the given student is a sophomore
(define (soph? s)
(= 2013 (student-class-year s)))]
(filter soph? students)))
Notice especially the definition of sophomores
. The local function soph?
is developed according using the template for structured data: its body uses the selector function student-class-year
. But the body of sophomores?
does not use the template for list data—it delegates the list processing to the abstract list function filter
.
The list-processing functions are also very powerful in combination. For example, we can break down “produce a list of permissible Scrabble words” into two problems: “tell if a word is permissible” and “produce the permissible words”.
; scrabble-words : (listof string) -> (listof string)
; to produce a list of those given strings that are made from lowercase letters
(define (scrabble-words los)
(local [; all-lower : string -> boolean
; to tell if the given string is made from lowercase letters
(define (all-lower? w)
(andmap char-lower-case? (string->list w)))]
(filter all-lower? los)))
None of these example functions uses the standard template for lists: there is no cond
, and we never ask the questions empty?
or cons?
. Those tasks are delegated to filter
and andmap
.
How does this technique affect the design process? It provides an alternative to the construction of a template based on input data:
If your signature resembles the signature of a general-purpose list function, perhaps your function is a candidate for function composition. (To bring out the resemblance, you will have to substitute for type variables.)
If your purpose statement resembles the purpose statements of any of the standard general-purpose list functions, then your function is likely a candidate for function composition.
At least as you are learning, it is probably easier to see resemblance to actual purpose statements like the ones shown above, rather than the more abstract purpose statements that go with the general list functions.
If your plan is to process list elements one at a time, while accumulating the results of processing earlier or later list elements, then your function is a candidate for a composition that uses foldl
or foldr
.
Your ultimate goal is to reuse code that clearly expresses your purpose:
The ideal is to find a way to reuse andmap
, ormap
, filter
, map
, argmin
, argmax
, sort
, or other functions with a relatively focused purpose.
The next best thing is to reuse the very general-purpose functions foldl
and foldr
.
The method of last resort is to define a new recursive function. Why is this the method of last resort? Because anyone reading your code will assume that if you couldn’t do it using a standard function, it is not possible to do using a standard function—and they will become confused.
HtDP, Section 12.1, page 169↩