How Closures Work

Closure conversion is the last of the three most fun transformations in our compiler. (The first two are code generation and K-normalization.) It gets us from higher-order Scheme to “closed” first-order Scheme. Closure conversion is a simple, popular, efficient way to implement nested lambdas.

(If you’re reading this handout in a hurry just before lab,1 the key sections are the introduction, the three species of variables, and the section on embedding.)

Introduction

Here’s a standard μScheme function that returns a lambda:

(define o (f g)
  (lambda (x) (f (g x))))

In a definitional interpreter, when we call

(o not positive?)

the interpreter allocates a closure, which I’ll write this way:

⦇ (λ(x) (f (g x))), { f ↦ not, g ↦ positive? } ⦈

The “unimproved” uscheme interpreter used in COMP 105 copies the entire environment, but this example shows the “improved” version, which builds an environment that contains only the variables that appear free in the λ expression (f and g).2 When this closure is called with an actual parameter v, the interpreter extends the saved environment with a binding x ↦ v, then evaluates the body of the λ expression.

How should this closure be translated into a low-level intermediate language? By playing the same game that we played during K-normalization: instead of keeping an environment ρ around at run time, we’ll use the environment only at compile time. The compiler will remember the locations of f and g and will pop them straight into the generated code. But unlike formal parameters or let-bound local variables, f and g won’t be in registers—they’ll be in a closure.

The translation is driven by the representation of closures. The closure above is represented by a record that contains a function pointer and two “slots”:

Inside the lambda, the values of f and g are obtained by indexing into the closure record. By convention, the closure record of a running function is in register $r0—a convention that you have already implemented. Variables stored in a closure record are called captured, so in the abstract syntax, the indexing operation is written CAPTURED 0 for f and CAPTURED 1 for g. In the concrete syntax of my assembly language, I write $r0.0 and $r0.1.

Now that I know the representation of the closure, I can translate the anonymous λ into K-normal form. The closure is in $r0, and according to our calling convention, x arrives in register $r1. The internal lambda, whose body is (f (g x)), might be K-normalized like this:

let f = $r0.0 in     -- take f from slot 0 of the closure in $r0
let g = $r0.1 in     -- take g from slot 1 of the closure in $r0
let t = g($r1) in    -- call g(x)
f(t)                 -- return f(g(x))

To make this code work, o has to allocate the closure record that is eventually stored in $r0. And that’s the rest of the translation: o allocates the closure record, associates it with the inner λ, and initalizes each slot. This computation is expressed in abstract syntax by a new CLOSURE form. The form (lambda (x) (f (g x))) is closure-converted into

((["x"], ...), [LOCAL "f", LOCAL "g"])

where the ... is the conversion of the body, which looks like this:

FUNCALL (CAPTURED 0, [FUNCALL (CAPTURED 1, [LOCAL "x"])])

Why it works: the three species of variables

Closure conversion, like K-normalization, gets good performance out of variables. To understand it, we must refine our understanding of variables.

In vScheme, there is only one species of variable. In module 5, we introduced Disambiguated vScheme, which has two species of variables: local and global. Now we split the non-global variables into two distinct species, resulting in three overall. The three species are relative to a given function f.

This classification is already partially implemented: the disambiguation pass from module 5 distinguishes GLOBAL variables from LOCAL variables. To implement closure conversion, we have to examine the LOCAL variables more closely: which ones are truly local, and which ones are captured? There’s a simple answer:

The captured variables of a lambda expression e are precisely those variables that are identified as LOCAL during the disambiguation pass and are free in e.

Once the captured variables are stored in a closure record, we are guaranteed fast, constant-time access to every non-global variable. (Fast access to global variables can be had by means of a trick, which is one of the depth-point options in module 2.)

Summary of the translation

Pure, higher-order vScheme is converted into first-order code by this transformation:

Because the source language is pure, a captured variable is not mutated. Mutability is a feature you can support if you want to earn depth points.

Representations and syntactic forms

Higher-order functions are supported by one new value representation in the SVM and by two new syntactic forms in some of the intermediate languages.

In the SVM

In the SVM, a closure is represented by code, plus n captured variables, each in its own slot.

struct VMClosure {
  struct VMFunction *f;
  int nslots;
  struct Value captured[];
};

This structure is allocated by instruction mkclosure, read and written by instructions getclslot and setclslot, and supported by instructions call and tailcall.

If you replace the struct VMFunction * pointer with a pointer to machine code, you get exactly the representation that is used in native-code compilers.

In Closed Scheme

In the intermediate languages, we need a form that allocates a closure and another form that refers to the value of a captured variable. Adding those forms to First-Order Scheme defines Closed Scheme. The new types and syntactic forms are as follows:

type funcode = name list * exp   (* a lambda with no free variables *)
                                 (* eventually becomes a function ptr *)

type closure = funcode * exp list

datatype exp 
  = ...
  | CLOSURE  of closure
  | CAPTURED of int

In K-normal form

Your K-normal form has to be extended with these same two constructs. And as expected for K-normal form, every subexpression has to fit in a register. The CAPTURED form doesn’t have any subexpressions. And in the CLOSURE form, a funcode already fits in a register—that’s the LOADFUNC work you did way back in module 2. But there’s a complication: by the time we get to the machine level, it’s not possible to allocate and initialize a closure form atomically—the closure form can be allocated in one instruction, but storing all the captured variables may require arbitrarily many VM instructions. To hold the closure while it’s being initialized, the K-normal form requires a dedicated register. So the syntactic form looks like

let x = ((funcode (x₁, …, xₙ) => e), [y₁, …, yₘ]) in e

and the representations look like this:

type 'a funcode = 'a list * 'a exp
type 'a closure = 'a funcode * 'a list  
                   (* the value of every captured variable is now in a register *)
datatype 'a exp 
  = ...
  | LETCLOSURE of 'a * 'a closure * 'a exp
  | CAPTURED   of int

Re-embedding closure-converted code

Once you’ve implemented closure conversion, you’ll want to see if the converted code is right. Closure-converted code is first-order Scheme, extended with the two new forms listed above: allocate a closure, load a captured variable. To debug it, you’ll re-embed it into ordinary Scheme.

I’ve modified the vScheme interpreter so it can call such a list as if it were a function, according to this equivalence:

APPLY (CONS_CELL ⟨f, ys⟩, vs) ≡ APPLY (f, CONS_CELL ⟨f, ys⟩ :: vs)

As part of the closure-conversion module, you will re-embed converted code into μScheme. You’ll have to implement only the two new forms:

With this embedding in place, you’ll be able to prettyprint and run your closure-converted code.

Here’s an example:

(define flip (f) (lambda (x y) (f y x)))

The lambda has one free variable, f, which will be stored in closure slot 0. The internal f becomes CAPTURED 0, which is embedded as (CAPTURED-IN 0 $closure). The NEWCLOSURE record is embedded as5

(list2 (lambda ($closure x y) ((CAPTURED-IN 0 $closure) y x))
       f)

And the entire function is

(define flip (f) 
  (list2 (lambda ($closure x y) ((CAPTURED-IN 0 $closure) y x))
         f))

This closure can be called in the vscheme interpreter. For interactivity, I invoke vscheme with the -vv option:

$ vscheme -vv
-> (use flip.scm)
flip
-> (cons 'A 'B)
(A . B)
-> ((flip cons) 'A 'B)
(B . A)

A common mistake

The one thing that complicates closure conversion is that by the time you’re ready to allocate a closure, not every captured variable is still a variable—a captured variable might actually refer to a slot in an outer closure. If you overlook this possibility, your UFT may generate code like this:

echo '(lambda (f) (lambda (x) (lambda (y) (f x y))))' | uft ho-cl
(cons
   (lambda ($closure1 f) ;; top level; no free variables and closure won't be used
      (cons
         (lambda ($closure2 x) ;; code for this function is faulty
            (cons
               (lambda ($closure3 y)
                  ((CAPTURED 1 $closure3) (CAPTURED 0 $closure3) y))
               (cons x
                  (cons f '())))) ;; this `f` is wrong
         (cons f '()))) ;; this `f` is in scope
   '())

To make it a little easier to see what’s going on, I’ve renumbered the three $closure parameters. When the intermediate lambda tries to build its closure, it attempts to refer directly to f, which it doesn’t have the rights to do. Think about what code should be generated instead.


  1. remote contingency, to be sure.

  2. If you’ve taken COMP 105 with me or with Kathleen Fisher, you’ve implemented this improvement.

  3. The parameter is “extra” because it does not appear in the original source code.

  4. In the SVM, such an addition is not necessary because the closure record is already passed in register 0. This is also how closures work in native-code compilers.

  5. Becuase CAPTURED-IN refers to a global variable, just like cons or car, it is not considered a “free” variable—as you know, it compiles to GETGLOBAL, not GETLOCAL.