Featherweight concurrency in a portable assembly language
What abstractions should a reusable code generator
provide to make
it easy for a language implementor to compile a highly concurrent
language? The implementation of concurrency is typically tightly
interwoven with the code generator and run-time system of the high-level
language. Our contribution is to tease out the tricky low-level concurrency
mechanisms and to package them in an elegant way, so they can be
reused by many front ends.
Here, then, is the problem we address: we want to make it
easy for a front end to support efficient user-level threads,
in such a way that all the policy decisions are in the hands of the
front end, while all the tricky mechanism is supported by C--.
Like many fine phrases, this goal is easier to state than achieve.
The contribution of this paper is a design that extends C-- with
clean, architecture-independent mechanisms
that can be used to implement common concurrency policies
efficiently.
Full paper
The paper is available as
US Letter PostScript (218K),
US Letter PDF (238K),
and
US Letter TeX DVI (85K).
Revisions
Since this paper was written, our thinking about C-- and concurrency
has evolved.
Here are some notes:
- The NewContinuation procedure in the run-time interface (from Section 3.2) has evolved
into a CreateStack procedure.
This procedure receives a delimited stack and returns a continuation
and a
``stack-limit cookie.''
As of Fall 2005, we have implemented a slightly simpler
CreateThread primitive without the limit cookie.
This primitive is described in Section 11.2.1 of the C-- Specification.
- The story about global variables is even murkier than it was in
2001.
We have therefore backed off the whole story of Section 5 and are
considering the problem of global-register variables to be still
unsolved.
- We've moved forward a bit on the limitcheck primitive
described in Section 7.
Our thinking has evolved toward a slightly simpler, less powerful
primitive as described in
Section 11.2.2 of the C-- Specification.
This section also talks about moving stack frames and redirecting
pointers.
- We believe we need to extend the run-time interface beyond what's
in this paper.
The details remain to be worked out, but the idea is simple:
while retaining an efficient representation in which most frames are
allocated contiguously,
the run-time interface should give the client the ability to treat a
call stack as a linked list.
For example, it should be possible to insert and delete frames,
to copy frames,
and to move frames from one memory location to another.
Such an interface should support a wide variety of techniques commonly
used in language implementations:
- Small, contiguous thread stacks that can grow
- Segmented stacks
- ``Stacklets''
- First-class Scheme continuations
- One-shot continuations (as in Chez Scheme)
- Incremental stack scanning for real-time garbage collection