Handout: Composing Functions

This handout discusses the role of “function composition” in the design process. In the textbook, this topic is not as explicit as I would like. But it is the key success with abstract list functions (andmap, filter, map, and so on).

TL;DR: sometimes, instead of using a data-driven template, a function’s body should simply compose (i.e., apply) other functions.

A function for each named data definition

By this time, you’re well acquainted with the idea of one function for each data definition. For example, if you’re searching for list tags, you might define three functions:

; element-list-tags : element -> (setof symbol)
; to return all the list tags used in the given element

; loe-list-tags : (listof element) -> (setof symbol)
; to return all the list tags used in the given list of elements 

; tag-list-tags : symbol -> (setof symbol)
; to return all the list tags that equal the given symbol

You also know exactly how to use one of these functions:

When your purpose involves finding list tags
When you have in hand the kind of data that the function expects

At what stage of the design process do you deploy one of these functions? When you are turning a template into code.

Auxiliary functions as a problem-solving technique

The first mention of function composition is all the way back in Section 3, especially Section 3.1. The guiding principle is as follows:

When you see a relationship between two quantities (or other values) in a problem statement, embody that relationship in a function.

A single fact you know can just be written down. The step forward is that relationships among facts can be written as functions.

The book goes on to say that such functions will probably be useful, but it is a bit less clear how they will be used. One technique we know is that when you are filling in a function template, you might use these functions to combine

values extracted with selector functions
results of natural recursions
arguments not yet scrutinized

The shorthand for this technique is combining the values of the template’s subexpressions into the final answer.¹

Auxiliary functions for “complex programs”

Section 12 of the first edition (Composing Functions, Revisited Again) hints at a kind of definition that does not follow the usual template. Look in Section 12.1 at bullet item 4:

If the natural formulation of the function isn’t quite what we want, it is most likely a generalization of our target. In this case, the main function is a short definition that defers the computation to the generalized auxiliary program. [Emphasis mine]

Here is an example of what is meant, drawn from the (non-apocalyptic) railway problem:

The purpose of your main function is to take a point on the railway, identified by miles from Boston, and return directions to the nearest station.

This problem problem can be solved using natural recursion on a function that returns directions, but many programmers prefer to solve a more general problem “find the nearest station.” The main function then becomes this short definition:
```
; directions-from : number railway -> directions
; to return directions from the given point to the nearest station
(define (directions-from miles rail)
  (directions-to-station miles (nearest-station miles rail)))
```
where the purpose statement of directions-to-station is “given a point and a station, return directions from the point to that station.”

The key thing about the example is that directions-from does not follow the template for the complex railway data. Instead, directions-from delegates or defers the templated computation to the nearest-station function.

For the railway problem narrowly defined, deferring the computation provides no immediate benefit—but we all sense that a function “find the nearest station” is somehow “better” than a function “give me directions to the nearest station.” And we are right: the nearest-station function is easier to reuse.

At what stage of the design process do you deploy one of these functions? The answer is not entirely clear to me, but I think it goes something like this:

If during the process of turning a template into code, you feel that your signature and purpose statement are a “bad fit” or are some how “unnatural”, consider changing horses midstream into a more general signature and purpose statement that will lead to more natural code.

Then, go back and rebuild the original function according to its original signature and purpose statement, but don’t use the same template. Instead, write a simple call to the new, more general version.

The second edition says roughly

Before going through the work of writing down a complex, reflect on the nature of the function. If the problem statement suggests that there are several tasks to be performed, it is likely that a function composition is needed instead of a template. In that case, skip the template step.

The key idea is that coding by “function composition” replaces a data-driven template. This idea comes to its own when we start serious programming with lists.

Reusable auxiliary functions for lists

Anyone who works extensively with lists quickly builds up a vocabulary of frequently used computations. Some of these computations are exemplified by these purpose statements:

Tell if any student in the class carries a MacBook
Tell if every student in the class lives in Tilton
Produce a list of all students in the class who are sophomores
Produce a list containing the SAT score of every student in the class
Produce the tallest student in the class
Produce the student in the class who has the shortest trip home for Thanksgiving
Produce a list of all students in the class ordered from oldest to youngest

Each one of these computations is different, but they are representative of so many frequently used computations that we don’t want to write them over and over. Instead, we generalize each purpose statement to come up with an abstract list function. And when possible, we use one of these functions instead of writing the list template. Here are a couple of examples you’ve already seen:

; anymac? : (listof student) -> boolean
; to tell if any of the given students carries a MacBook
(define (anymac? students)
  (ormap carries-MacBook? students))

; sophomores : (listof student) -> (listof student)
; to produce a list of all the given students who are sophomores
(define (sophomores? students)
  (local [; soph? : student -> boolean
          ; to tell if the given student is a sophomore
          (define (soph? s)
             (= 2013 (student-class-year s)))]
     (filter soph? students)))

Notice especially the definition of sophomores. The local function soph? is developed according using the template for structured data: its body uses the selector function student-class-year. But the body of sophomores? does not use the template for list data—it delegates the list processing to the abstract list function filter.

The list-processing functions are also very powerful in combination. For example, we can break down “produce a list of permissible Scrabble words” into two problems: “tell if a word is permissible” and “produce the permissible words”.

; scrabble-words : (listof string) -> (listof string)
; to produce a list of those given strings that are made from lowercase letters
(define (scrabble-words los)
  (local [; all-lower : string -> boolean
          ; to tell if the given string is made from lowercase letters
          (define (all-lower? w)
             (andmap char-lower-case? (string->list w)))]
    (filter all-lower? los)))

None of these example functions uses the standard template for lists: there is no cond, and we never ask the questions empty? or cons?. Those tasks are delegated to filter and andmap.

How does this technique affect the design process? It provides an alternative to the construction of a template based on input data:

If your signature resembles the signature of a general-purpose list function, perhaps your function is a candidate for function composition. (To bring out the resemblance, you will have to substitute for type variables.)
If your purpose statement resembles the purpose statements of any of the standard general-purpose list functions, then your function is likely a candidate for function composition.

At least as you are learning, it is probably easier to see resemblance to actual purpose statements like the ones shown above, rather than the more abstract purpose statements that go with the general list functions.
If your plan is to process list elements one at a time, while accumulating the results of processing earlier or later list elements, then your function is a candidate for a composition that uses foldl or foldr.

Your ultimate goal is to reuse code that clearly expresses your purpose:

The ideal is to find a way to reuse andmap, ormap, filter, map, argmin, argmax, sort, or other functions with a relatively focused purpose.
The next best thing is to reuse the very general-purpose functions foldl and foldr.
The method of last resort is to define a new recursive function. Why is this the method of last resort? Because anyone reading your code will assume that if you couldn’t do it using a standard function, it is not possible to do using a standard function—and they will become confused.

HtDP, Section 12.1, page 169↩

Handout: Composing Functions

COMP 50

Fall 2013

A function for each named data definition

Auxiliary functions as a problem-solving technique

Auxiliary functions for “complex programs”

Reusable auxiliary functions for lists