Featherweight concurrency in a portable assembly language
What abstractions should a reusable code generator
provide to make
it easy for a language implementor to compile a highly concurrent
language? The implementation of concurrency is typically tightly
interwoven with the code generator and run-time system of the high-level
language. Our contribution is to tease out the tricky low-level concurrency
mechanisms and to package them in an elegant way, so they can be
reused by many front ends.
Here, then, is the problem we address: we want to make it
easy for a front end to support efficient user-level threads,
in such a way that all the policy decisions are in the hands of the
front end, while all the tricky mechanism is supported by C--.
Like many fine phrases, this goal is easier to state than achieve.
The contribution of this paper is a design that extends C-- with
clean, architecture-independent mechanisms
that can be used to implement common concurrency policies
The paper is available as
US Letter PostScript (218K),
US Letter PDF (238K),
US Letter TeX DVI (85K).
Since this paper was written, our thinking about C-- and concurrency
Here are some notes:
- The NewContinuation procedure in the run-time interface (from Section 3.2) has evolved
into a CreateStack procedure.
This procedure receives a delimited stack and returns a continuation
As of Fall 2005, we have implemented a slightly simpler
CreateThread primitive without the limit cookie.
This primitive is described in Section 11.2.1 of the C-- Specification.
- The story about global variables is even murkier than it was in
We have therefore backed off the whole story of Section 5 and are
considering the problem of global-register variables to be still
- We've moved forward a bit on the limitcheck primitive
described in Section 7.
Our thinking has evolved toward a slightly simpler, less powerful
primitive as described in
Section 11.2.2 of the C-- Specification.
This section also talks about moving stack frames and redirecting
- We believe we need to extend the run-time interface beyond what's
in this paper.
The details remain to be worked out, but the idea is simple:
while retaining an efficient representation in which most frames are
the run-time interface should give the client the ability to treat a
call stack as a linked list.
For example, it should be possible to insert and delete frames,
to copy frames,
and to move frames from one memory location to another.
Such an interface should support a wide variety of techniques commonly
used in language implementations:
- Small, contiguous thread stacks that can grow
- Segmented stacks
- First-class Scheme continuations
- One-shot continuations (as in Chez Scheme)
- Incremental stack scanning for real-time garbage collection