Norman Ramsey - Research Activities
This page sketches my research activities and shows how to get the
papers that have the details.
Information is organized by topic as far as possible, although some
work could reasonably appear under more than one topic.
A chronological list of papers is also available.
Topics covered on this page include:
I am intrigued by the problems of retargeting machine-dependent software
to different hardware platforms.
This topic is of especial interest to those who create such software,
since it is hard for improvements in debuggers, linkers,
etc. to have an impact unless the software works on multiple platforms.
My contributions include a retargetable
debugger and a toolkit for recognizing and
generating machine instructions.
A research summary is available.
Machine code
Representations of machine instructions
ldb's breakpoint implementations required a rudimentary
facility for recognizing machine instructions.
Attempts to generalize that facility led to the
New Jersey Machine-Code Toolkit, which uses a new formalism to
describe binary encodings. In one sentence, it gives application
writers the ability to manipulate true binary representations of
machine code while programming at an assembly-language level of
abstraction.
There are two papers describing the main ideas:
Those who actually want to use the toolkit will probably find the
following technical reports useful:
The toolkit project has also led to a machine-independent method of
relocating instructions, described in
Relocating machine instructions by currying.
ACM
SIGPLAN '96 Conference on Programming Language
Design and Implementation, 226--236, May 1996.
[Abstract]
The toolkit's equation solver is described in
A simple solver for linear equations containing nonlinear operators.
Software---Practice & Experience, 26(4):467--487, April 1996.
An earlier version appeared as
Technical Report 95-068, Purdue
University, Dept of Computer
Sciences, November 1995.
The method we use to test toolkit specifications is described in
Automatic
Checking of Instruction Specifications.
1997 International
Conference on Software Engineering, pp 326-336, May 1997 (with Mary F.
Fernández).
My research summary for
retargeting describes the toolkit.
The toolkit home page provides
entrée to the software and its documentation.
Semantics of machine instructions
I've developed Lambda-RTL, a new metalanguage for specifying the
semantics of machine instructions.
You can read about goals and some preliminary results in
Machine Descriptions to Build
Tools for Embedded Systems (with Jack
W. Davidson).
In ACM
SIGPLAN Workshop on Languages, Compilers, and Tools for
Embedded Systems (LCTES'98), volume 1474 of LNCS, pages 172--188.
Springer Verlag, June 1998.
The semantic descriptions are based on register transfers.
Unlike other tools, Lambda-RTL produces fully explicit
register transfers, in which the sizes of all operands and results are
explicit, as
well as the byte order used for all fetches and stores.
The Lambda-RTL translator uses type inference with subtyping to add
this information to specifications written in a simple metalanguage.
Nascent applications include
More information is available on the
Zephyr
Lambda-RTL pages.
Compiler-writers often
use C as a portable assembly language, so they can avoid the pain
of writing multiple optimizing
back ends.
For this purpose, C's outstanding virtue is its ubiquity.
But C was designed as a high-level programming language, and as a
compiler-target language it has many deficiencies.
In particular, it lacks support for tail calls,
exception handling, garbage collection,
and lightweight concurrency.
I am therefore working with Simon Peyton Jones
on the design of C--, a
language
that is designed as a compiler-target language.
My particular interest is in compile-time and run-time support for
garbage collection, exceptions, and debugging.
Simon and I have written three papers about C--, but we're still not
satisfied.
Please send feedback!
- Machine-independent support for
garbage collection, debugging, exception handling, and concurrency (draft).
Technical Report CS-98-19, Department of Computer Science, University
of Virginia, August 1998 (with Simon L. Peyton Jones).
An extensive proposal for extending C-- to support high-level run-time
services. Parts are superseded by the next two papers.
- C--: a portable assembly
language that supports
garbage collection.
Invited paper,
International Conference on Principles and Practice of
Declarative Programming, LNCS 1702, pp 1-28, September 1999 (with Simon L. Peyton
Jones and Fermin Reig).
A refinement of some of the ideas in the large technical report.
Explains what C--
is and how it may be used to support
such high-level run-time services as garbage collection, exception
handling, debugging, profiling and concurrency. Only garbage
collection is handled in detail.
An extensive proposal for extending C-- to support high-level run-time
services. Parts are superseded by the exceptions paper, next.
- A Single Intermediate Language
That Supports Multiple Implementations of Exceptions
(with Simon
L. Peyton Jones).
To appear in PLDI 2000.
A revised and simplified treatment of exceptions within the context of
C--. This supersedes parts of the early proposal, but it is much
more narrowly focused.
One significant improvement is a formal operational semantics for much
of C--.
Christian
Lindig and I are buildng the Quick C-- implementation.
I'm working with Cristina
Cifuentes on a retargetable
platform for binary translation.
Cristina has put together a nice overview
page.
This work is supported by the Australian Research Council.
We have written two papers:
Debugging
The goal of the Debugging Everywhere project is to
make debugging a cheap, ubiquitous service.
The idea behind the project is to use Active Debugging
Information, i.e., debugging symbols containing executable
content.
Rather than have the compiler give the debugger a bunch of
source-level information about the program, forcing the debugger to
figure out what it means on a particular machine, we have the compiler
provide the debugger with code it can execute directly to do things
like evaluate expressions and print values.
This approach will have two benefits:
- Considerable simplification in the debugger, making it nearly
independent of the source programming language and target machine.
- Simplification for the compiler writer.
Because the debugging information need not be represented in a limited
standard like ``stabs'' or DWARF, the debugging
information can be represented in whatever form is convenient for the
compiler.
The project is based on earlier work on ldb,
a retargetable
debugger.
I've written two papers and a thesis:
- A retargetable debugger.
ACM SIGPLAN '92 Conference on Programming Language Design and
Implementation,in SIGPLAN Notices, 27(7):22-31, July 1992 (with
David R. Hanson).
[Abstract]
An overview of a work in progress. It has little
explanation of expression evaluation or of breakpoints, since they
weren't implemented yet.
- A
Retargetable Debugger.
PhD thesis, Princeton University, Department of Computer Science, December
1992. [Abstract]
The definitive reference, but embarrassingly prolix.
- Correctness
of trap-based breakpoint implementations.
In Proceedings of the 21st ACM Symposium on the Principles of Programming
Languages, pages 15-24, Portland, Oregon, January 1994.
[Abstract]
This literate program formalizes ldb's breakpoint
implementation for the MIPS and SPARC, which uses strategically
placed trap instructions. The paper also shows an apparently correct
implementation that could actually miss breakpoint events.
My research summary for
retargeting describes ldb.
Interpreters
There's nothing like a quick interpreter to try out a new language
idea.
I am especially interested in embedding interpreters into functional
languages.
I have written Lua-ML, which
is an implementation of the Lua
scripting language in Objective
Caml.
It is very useful for scripting and configuring Caml programs; we use
it to configure our compiler for C--.
I have written two papers on Lua-ML:
Language Design
Most of my work is in implementation, but I am also interested in
language design.
Literate Programming
Literate programming is the act of writing programs primarily as
documents to be read by human beings, and only secondarily as
instructions to be executed by computer.
The term was coined by Don Knuth to describe WEB, the tool he created
to write TeX.
My contributions to literate programming
have included
analyzing tools,
building tools to support multiple programming languages,
and simplifying tools by recognizing and
eliminating the inessential.
I have also published some literate programs:
- Correctness of trap-based breakpoint implementations.
In Proceedings of the 21st ACM Symposium on the Principles of
Programming Languages, pages 15--24, Portland, OR, January 1994.
- A simple solver for linear equations containing nonlinear operators.
Software---Practice & Experience, 26(4):467--487, April 1996.
(An earlier version appeared as
Technical Report 95-068, Purdue
University, Dept of Computer
Sciences, November 1995.)
- Eliminating spurious error messages using
exceptions, polymorphism, and higher-order functions,
Computer Journal, 42(5):360-372, 1999.
Spider - Generating WEBs for different languages
In 1986, you couldn't write a literate program in a new programming
language without first writing a new tool.
Incompatible, language-dependent
versions of WEB were starting to proliferate.
I created Spider, a WEB generator, which can create a WEB
system for any language whose lexical and syntactic structure are
roughly Algol-like.
I described Spider in a ``Literate Programming'' column for
CACM, and the system is documented in two
Princeton technical reports:
- Literate programming:
Weaving a language-independent WEB.
Communications of the ACM, 32(9):1051-1055, September 1989.
- A Spider user's guide.
Technical Report TR-225-89, Department of Computer Science, Princeton
University, August 1989.
- The Spidery WEB system of structured documentation.
Technical Report TR-226-89, Department of Computer Science, Princeton
University, August 1989.
Spidery WEB, like the other members of the WEB family, is too complex (see paper
below), so I discourage readers from digging up
the manuals or the software.
noweb is a good alternative.
Experience with WEB
We actually used Spidery WEB for a real project.
One novelty was the idea that literate programming might be a useful
software-engineering technique, even though
we never had any intention of creating a
literate program of publishable quality.
For example, it was helpful to use the literate program to bring new
project members up to speed.
A colleague and I described the experience in:
Readers may find this article valuable for its enumeration of criteria for
evaluating literate-programming tools and for its critique
of WEB.
It is, so far as I know, the only published description of
industrial experience with literate programming.
noweb - Simplification and language-independence
I created the noweb literate-programming tool
to address the shortcomings of WEB, as identified
during the experience with Spidery WEB.
My contributions include making
the first language-independent literate-programming tool,
developing a simpler, more general model of
literate program, identifying and stripping away all inessential
features of WEB, and creating an extensible implementation by using a
pipelined architecture.
These contributions are described for the non-specialist in this article:
The noweb home page provides
entrée to the software and its documentation.
Formal Methods
At one time, I worked on formal verification of Ada programs, as
described in
The
original was set for large paper and then photoreduced; I'm afraid
I've supplied a rather botched reprint whose only virtue is that it
fits on standard US letter paper.
I got interested in file synchronization and developed a new
specification technique.
If you're interested in this topic, check out
Unison.
Those guys were first, and we're swimming in their wake.
The paper on breakpoint correctness is an
application of formal methods.
Miscellany
As a student project, I explored how combining polymorphism, garbage
collection, and the Standard ML modules system could make it easy to
write concurrent programs.
The technical report
describes what I came up with.
The closest thing to a research contribution was an implementation
idea that John Reppy used in his CML system, but some people find the
paper entertaining anyway.
Have you ever been blasted by a screenful of compiler error messages,
only to discover that the first message was the only one that meant
anything?
If so, you'll be glad to read
about
The unparsing problem is to take an internal representation
of a program and produce a concrete external representation that, when
parsed, reproduces the original representation.
The trick is to come up with a readable representation.
The paper
explains how to solve the unparsing problem, focusing on the
elimination of unnecessary parentheses.
Most of my work in machine-level software has involved program
generators, and the New Jersey Machine-Code
Toolkit is a serious program generator. The paper
describes some of the difficulties of making program generators reusable.
Kevin Scott and I ran the 1999 ICFP Programming
Contest. The paper
describes the problem, the field, and the winning entries.
My experience with the Unison
file synchronizer got me interested in the problem of specifying the
behavior of such a synchronizer. The paper
presents some preliminary results.
Physics
In a former life, I was a physicist.
Write me if you want either of
these papers:
- Formation and breakup rates of RbXe van der Waals molecules in He and N2 gas.
Chemical Physics Letters, 102:340, 1983 (with Will Happer et al.).
- A search for capacitance fluctuations in tunnel capacitors (M.S. thesis).
Technical report, Materials Science Center, Cornell University, August 1986.
Copyright
The documents contained in these pages are included
to ensure timely
dissemination of scholarly and technical work on a
non-commercial basis. Copyright and all rights therein
are maintained by the authors or by other copyright
holders, notwithstanding that they have offered their
works here electronically. It is understood that all
persons copying this information will adhere to the terms
and constraints invoked by each author's copyright.
These works may not be reposted without the explicit
permission of the copyright holder.
Back to Norman Ramsey's home page.