This handout discusses the role of “function composition” in the design process. In the textbook, this topic is not as explicit as I would like. But it is the key success with abstract list functions (andmap, filter, map, and so on).

TL;DR: sometimes, instead of using a data-driven template, a function’s body should simply compose (i.e., apply) other functions.

A function for each named data definition

By this time, you’re well acquainted with the idea of one function for each data definition. For example, if you’re searching for list tags, you might define three functions:

; element-list-tags : element -> (setof symbol)
; to return all the list tags used in the given element

; loe-list-tags : (listof element) -> (setof symbol)
; to return all the list tags used in the given list of elements 

; tag-list-tags : symbol -> (setof symbol)
; to return all the list tags that equal the given symbol

You also know exactly how to use one of these functions:

At what stage of the design process do you deploy one of these functions? When you are turning a template into code.

Auxiliary functions as a problem-solving technique

The first mention of function composition is all the way back in Section 3, especially Section 3.1. The guiding principle is as follows:

When you see a relationship between two quantities (or other values) in a problem statement, embody that relationship in a function.

A single fact you know can just be written down. The step forward is that relationships among facts can be written as functions.

The book goes on to say that such functions will probably be useful, but it is a bit less clear how they will be used. One technique we know is that when you are filling in a function template, you might use these functions to combine

The shorthand for this technique is combining the values of the template’s subexpressions into the final answer.1

Auxiliary functions for “complex programs”

Section 12 of the first edition (Composing Functions, Revisited Again) hints at a kind of definition that does not follow the usual template. Look in Section 12.1 at bullet item 4:

  1. If the natural formulation of the function isn’t quite what we want, it is most likely a generalization of our target. In this case, the main function is a short definition that defers the computation to the generalized auxiliary program. [Emphasis mine]

Here is an example of what is meant, drawn from the (non-apocalyptic) railway problem:

The key thing about the example is that directions-from does not follow the template for the complex railway data. Instead, directions-from delegates or defers the templated computation to the nearest-station function.

For the railway problem narrowly defined, deferring the computation provides no immediate benefit—but we all sense that a function “find the nearest station” is somehow “better” than a function “give me directions to the nearest station.” And we are right: the nearest-station function is easier to reuse.

At what stage of the design process do you deploy one of these functions? The answer is not entirely clear to me, but I think it goes something like this:

The second edition says roughly

Before going through the work of writing down a complex, reflect on the nature of the function. If the problem statement suggests that there are several tasks to be performed, it is likely that a function composition is needed instead of a template. In that case, skip the template step.

The key idea is that coding by “function composition” replaces a data-driven template. This idea comes to its own when we start serious programming with lists.

Reusable auxiliary functions for lists

Anyone who works extensively with lists quickly builds up a vocabulary of frequently used computations. Some of these computations are exemplified by these purpose statements:

Each one of these computations is different, but they are representative of so many frequently used computations that we don’t want to write them over and over. Instead, we generalize each purpose statement to come up with an abstract list function. And when possible, we use one of these functions instead of writing the list template. Here are a couple of examples you’ve already seen:

; anymac? : (listof student) -> boolean
; to tell if any of the given students carries a MacBook
(define (anymac? students)
  (ormap carries-MacBook? students))

; sophomores : (listof student) -> (listof student)
; to produce a list of all the given students who are sophomores
(define (sophomores? students)
  (local [; soph? : student -> boolean
          ; to tell if the given student is a sophomore
          (define (soph? s)
             (= 2013 (student-class-year s)))]
     (filter soph? students)))

Notice especially the definition of sophomores. The local function soph? is developed according using the template for structured data: its body uses the selector function student-class-year. But the body of sophomores? does not use the template for list data—it delegates the list processing to the abstract list function filter.

The list-processing functions are also very powerful in combination. For example, we can break down “produce a list of permissible Scrabble words” into two problems: “tell if a word is permissible” and “produce the permissible words”.

; scrabble-words : (listof string) -> (listof string)
; to produce a list of those given strings that are made from lowercase letters
(define (scrabble-words los)
  (local [; all-lower : string -> boolean
          ; to tell if the given string is made from lowercase letters
          (define (all-lower? w)
             (andmap char-lower-case? (string->list w)))]
    (filter all-lower? los)))

None of these example functions uses the standard template for lists: there is no cond, and we never ask the questions empty? or cons?. Those tasks are delegated to filter and andmap.

How does this technique affect the design process? It provides an alternative to the construction of a template based on input data:

Your ultimate goal is to reuse code that clearly expresses your purpose:

  1. The ideal is to find a way to reuse andmap, ormap, filter, map, argmin, argmax, sort, or other functions with a relatively focused purpose.

  2. The next best thing is to reuse the very general-purpose functions foldl and foldr.

  3. The method of last resort is to define a new recursive function. Why is this the method of last resort? Because anyone reading your code will assume that if you couldn’t do it using a standard function, it is not possible to do using a standard function—and they will become confused.

  1. HtDP, Section 12.1, page 169