Reading Guide to vmheap.c

Draft

This guide will help you focus your attention on the important parts of file vmheap.c.

Include files and macros

The fake Valgrind macros make it possible to compile the code on a machine without Valgrind.

The SMALLHEAP macro, when defined, makes pages smaller so that garbage collection is triggered more frequently—which makes it easier to debug

Heap pages

The from-space, to-space, and allocation areas are subdivided into pages, which are defined here. The page structure is borrowed from Dave Hanson’s allocator in C Interfaces and Implementations. Its representation is not important.

Main heap data structures

Study these data structures, including the terminology and invariants.

Global variables current, available, next, and limit support the allocator. The hot one is next; if that were to be kept in a machine register, we would have a state-of-the-art high-speed allocator.

The count structure contains all sorts of counters. You’ll use them to determine the survival rate of objects at each garbage collection.

The page functions here are basic but useful. You will use make_available to make reclaimed pages available to the allocator. And you will use acquire_available_page to grow the heap.

Garbage-collector data structures

The only persistent data structure used by the collector is the gray list. It doesn’t actually have to be persistent, but I don’t want to be allocating a new one at every collection.

Initialization and finalization

Setup and teardown are not interesting, but take note of the statistics.

Test-and-increment, page-based allocator

The allocator is why we’re here. No program ever asked to stop and collect garbage; what programs want is to allocate. This allocator is a careful facsimile of a state-of-the-art, high-speed allocator, but since we’re focused on the garbage collector, we won’t pay much attention.

Triggering garbage collection and heap growth

The collector should run when the heap is half full. If it waits longer, there might not be space to copy all the data. And if it runs sooner, it’s doing work that’s not necessary.

I’ve implemented target_gamma, but you’ll need to implement growheap, which enforces the gamma policy. And you’ll need to manage the availability_floor so the collector runs at the right time.

Critical garbage-collector functions

The core operation of a copying collector is forwarding pointers, which can entail copying objects. Because our pointers and objects are statically typed, there is a lot of near-duplicate code here. All of this machinery is needed just to duplicate the simple code on page 277b of my book. It’s depressing, but you don’t have to write it; you just use it.

You need to understand what remember and forward_payload do: they are the key functions you will use in your collector.

Implementations of copy functions

Nothing really to see here.

Implementations of forward functions

It’s worth understanding what forward_payload is doing.

Scanning functions

Pay close attention to scan_value because your functions scan_activation and scan_vmstate have similar obligations. Keep in mind that a Value can take multiple forms, which complicates scanning. A VM state takes just one form, and an activation record also takes one form,1 so your scanning functions will be simpler than the ones you see here.

Page functions

Nothing to see here.

Random utility functions

Nothing to see here.

Draft

  1. Unless you have created some extra forms to get depth points.