This guide will help you focus your attention on the important parts of file vmheap.c
.
Include files and macros
The fake Valgrind macros make it possible to compile the code on a machine without Valgrind.
The SMALLHEAP
macro, when defined, makes pages smaller so that garbage collection is triggered more frequently—which makes it easier to debug
Heap pages
The from-space, to-space, and allocation areas are subdivided into pages, which are defined here. The page structure is borrowed from Dave Hanson’s allocator in C Interfaces and Implementations. Its representation is not important.
Main heap data structures
Study these data structures, including the terminology and invariants.
Global variables current
, available
, next
, and limit
support the allocator. The hot one is next
; if that were to be kept in a machine register, we would have a state-of-the-art high-speed allocator.
The count
structure contains all sorts of counters. You’ll use them to determine the survival rate of objects at each garbage collection.
The page functions here are basic but useful. You will use make_available
to make reclaimed pages available to the allocator. And you will use acquire_available_page
to grow the heap.
Garbage-collector data structures
The only persistent data structure used by the collector is the gray list. It doesn’t actually have to be persistent, but I don’t want to be allocating a new one at every collection.
Initialization and finalization
Setup and teardown are not interesting, but take note of the statistics.
Test-and-increment, page-based allocator
The allocator is why we’re here. No program ever asked to stop and collect garbage; what programs want is to allocate. This allocator is a careful facsimile of a state-of-the-art, high-speed allocator, but since we’re focused on the garbage collector, we won’t pay much attention.
Triggering garbage collection and heap growth
The collector should run when the heap is half full. If it waits longer, there might not be space to copy all the data. And if it runs sooner, it’s doing work that’s not necessary.
I’ve implemented target_gamma
, but you’ll need to implement growheap
, which enforces the gamma policy. And you’ll need to manage the availability_floor
so the collector runs at the right time.
Critical garbage-collector functions
The core operation of a copying collector is forwarding pointers, which can entail copying objects. Because our pointers and objects are statically typed, there is a lot of near-duplicate code here. All of this machinery is needed just to duplicate the simple code on page 277b of my book. It’s depressing, but you don’t have to write it; you just use it.
You need to understand what remember
and forward_payload
do: they are the key functions you will use in your collector.
Implementations of copy functions
Nothing really to see here.
Implementations of forward functions
It’s worth understanding what forward_payload
is doing.
Scanning functions
Pay close attention to scan_value
because your functions scan_activation
and scan_vmstate
have similar obligations. Keep in mind that a Value
can take multiple forms, which complicates scanning. A VM state takes just one form, and an activation record also takes one form,1 so your scanning functions will be simpler than the ones you see here.
Page functions
Nothing to see here.
Random utility functions
Nothing to see here.