Automatically Proving the Correctness of Compiler Optimizations

Sorin Lerner, Todd Millstein, Craig Chambers
Department of Computer Science and Engineering
University of Washington
{lerns,todd,chambers}@cs.washington.edu

Technical Report UW-CSE-02-11-02

Abstract

We describe a technique for automatically proving compiler optimizations sound, meaning that their transformations are always semantics-preserving. We first present a domain-specific language for implementing optimizations as guarded rewrite rules. Optimizations operate over a C-like intermediate representation including unstructured control flow, pointers to local variables and dynamically allocated memory, and recursive procedures. Then we describe a technique for automatically proving the soundness of optimizations implemented in this language. Our technique requires only a small set of optimization-specific proof obligations to be discharged for each optimization by an automatic theorem prover. We have written a variety of forward and backward intraprocedural dataflow optimizations in our language, including constant propagation and folding, branch folding, (partial) redundancy elimination, (partial) dead assignment elimination, and simple forms of points-to analysis. We have implemented our proof strategy with the Simplify automatic theorem prover, and we have used this implementation to automatically prove our optimizations correct. Our system also found many subtle bugs during the course of developing our optimizations.

1 Introduction

Compilers are an important part of the infrastructure relied upon by programmers. If a compiler is faulty, then so are potentially all programs compiled with it. Unfortunately, compiler errors can be difficult for programmers to detect and debug. First, because the compiler's output cannot be easily inspected, problems can often be found only by running a compiled program. Second, the compiler may appear to be correct over many runs, with a problem only manifesting itself when a particular compiled program is run with a particular input. Finally, when a problem does appear, it can be difficult to determine whether it is an error in the compiler or in the source program that was compiled. For these and other reasons, it is very useful to develop tools and techniques that give compiler developers and programmers confidence in their compilers.

One way to gain confidence in the correctness of a compiler is to run it on various programs and check that the optimized version of each input program produces correct results on various inputs. While this method can increase confidence, it cannot provide any guarantees: it does not guarantee the absence of bugs in the compiler, nor does it even guarantee that any one particular optimized program is correct on all inputs. It also can be tedious to assemble an extensive test suite of programs and program inputs.

Credible compilers [24, 23] and translation validation [17] improve on this testing approach by having the compiler automatically check whether or not the optimized version of an input program is semantically equivalent to the original program. The compiler can therefore guarantee the correctness of certain optimized programs, but the compiler itself is still not guaranteed to be bug-free: there may exist programs for which the compiler produces incorrect output. There is little recourse for a programmer if the compiler reports that it cannot validate the programmer's compiled program.
Furthermore, these approaches can have a substantial impact on the time to run an optimization. For example, Necula mentions that translation validation of an optimization pass takes about four times longer than the optimization pass itself [17].

The best solution would be to prove the compiler sound, meaning that for any input program, the compiler always produces an equivalent output program. Optimizations, and sometimes even complete compilers, have been proven sound by hand [1, 2, 13, 11, 6, 21, 3, 9]. However, manually proving large parts of a compiler sound requires a lot of effort and theoretical skill on the part of the compiler writer. In addition, these proofs are usually done for optimizations as written on paper, and bugs may still arise when the algorithms are implemented from the paper specification.

We present a new technique for proving the soundness of compiler optimizations that combines the benefits of the last two approaches: our approach is fully automated, as in credible compilers and translation validation, but it also proves optimizations correct once and for all, for any input program. We achieve this goal by providing the compiler writer with a domain-specific language for implementing optimizations that is both flexible enough to express a variety of optimizations and amenable to automated correctness reasoning. The main contributions of this paper are as follows:

- We present a language for defining optimizations over programs expressed in a C-like intermediate language including unstructured control flow, pointers to local variables and dynamically allocated memory, and recursive procedures. To implement an optimization (i.e., an analysis plus a code transformation), users provide a rewrite rule along with a guard describing the conditions that must hold for the rule to be triggered at some node of an input program's control-flow graph (CFG). The optimization also includes a small predicate over program states, which captures the key "insight" behind the optimization that justifies its correctness. Our language also allows users to express pure analyses, such as pointer analysis. Pure analyses can be used both to verify properties of interest about a program and to provide information to be consumed by later transformations. Optimizations and pure analyses written in our language are directly executable by a special dataflow analysis engine written for this purpose; they do not need to be reimplemented in a different language to be run.

- We have used our language to express a variety of intraprocedural forward and backward dataflow optimizations, including constant propagation and folding, copy propagation, common subexpression elimination, dead assignment elimination, branch folding, partial redundancy elimination, partial dead code elimination, and loop-invariant code motion. We have also used our language to express several simple intraprocedural pointer analyses, whose results we have exploited in the above optimizations.

- We present a strategy for automatically proving the soundness of optimizations and analyses expressed in our language. The strategy requires an automatic theorem prover to discharge a small set of proof obligations for each optimization. We have manually proven that if these obligations hold for any particular optimization, then that optimization is sound. The manual proof takes care of the necessary induction over program execution traces, which is difficult to automate.
As a result, the automatic theorem prover is given only non-inductive theorems to prove about individual program states.

- We have implemented our correctness checking strategy using Simplify [25, 20], the automatic theorem prover used in the Extended Static Checker for Java [5]. We have written a general set of axioms that are used by Simplify to automatically discharge the optimization-specific proof obligations generated by our strategy. The axioms simply encode the semantics of programs in our intermediate language. New optimization programs can be written and proven sound without requiring any modifications to Simplify's axiom set.

- We have used our correctness checker to automatically prove correct all of the optimizations and pure analyses listed above. The correctness checker uncovered a number of subtle problems with earlier versions of our optimizations that might have eluded manual testing for a long time.

By providing greater confidence in the correctness of compiler optimizations, we hope to provide a foundation for extensible compilers. An extensible compiler would allow users to include new optimizations tailored to their applications or domains of interest. The extensible compiler can protect itself from buggy user optimizations by verifying their correctness using our strategy; any bugs in the resulting extended compiler can be blamed on other aspects of the compiler's implementation, not on the user's optimizations. Extensible compilers could also be a good vehicle for research into new compiler optimizations.

The next section introduces our language for expressing optimizations by example and sketches our strategy for automatically proving soundness of such optimizations. Sections 3 and 4 formally define our optimization language and automatic proof strategy, respectively. Section 5 evaluates our work. Section 6 discusses our current and future work, including an extension to support interprocedural optimizations. Section 7 discusses related work, and Section 8 offers our conclusions. The optional appendices contain definitions of all the optimizations and analyses we have written in our language.

2 Overview

In this section, we informally describe our language for defining optimizations and our technique for proving those optimizations sound.

2.1 Forward Transformation Patterns

2.1.1 Semantics

The heart of an optimization program is its transformation pattern. For a forward optimization, a transformation pattern has the following form:

    ψ1 followed by ψ2 until s ⇒ s′ with witness P

A transformation pattern describes the conditions under which a statement s may be transformed to s′. The formulas ψ1 and ψ2, which are properties of a statement such as "x is defined and y is not used," together act as the guard indicating when it is legal to perform this transformation: s can be transformed to s′ if on all paths in the CFG from the start of the procedure being optimized to s, there exists a statement satisfying ψ1, followed by zero or more statements satisfying ψ2, followed by s. Figure 1 shows this scenario pictorially.

Forward transformation patterns codify a scenario common to many forward dataflow analyses: an enabling statement establishes the conditions necessary for a transformation to be performed downstream, and any intervening statements are innocuous, i.e., do not invalidate the conditions. The formula ψ1 captures the properties that make a statement enabling, and ψ2 captures the properties that make a statement innocuous.
The witness P captures the conditions established by the enabling statement that allow the transformation to be safely performed. Witnesses have no effect on the semantics of an optimization; they will be discussed more below in the context of our strategy for automatically proving optimizations sound.

Example 1. A simple form of constant propagation replaces statements of the form X := Y with X := C if there is an earlier (enabling) statement of the form Y := C and no intervening (innocuous) statement modifies Y. The enabling statement ensures that variable Y holds the value C, and this condition is not invalidated by the innocuous statements, thereby allowing the transformation to be safely performed downstream. The "pattern variables" X and Y may be instantiated with any variables of the procedure being optimized, while the pattern variable C may be instantiated with constants in the procedure. This sequence of events is expressed by the following transformation pattern (the witness is discussed in more detail in section 2.1.2):

    stmt(Y := C) followed by ¬mayDef(Y) until X := Y ⇒ X := C with witness η(Y) = C

[Figure 1 (diagram not reproduced). Caption: CFG paths leading to a statement s which can be transformed to s′ by the transformation pattern ψ1 followed by ψ2 until s ⇒ s′ with witness P. The shaded region can only be entered through a statement satisfying ψ1, and all statements within the region satisfy ψ2. The statement s can only be reached by first passing through this shaded region.]

2.1.2 Soundness

A transformation pattern is sound, i.e., correct, if all the transformations it allows are semantics-preserving. Forward transformation patterns have a natural approach for understanding their soundness. Consider a statement s transformed to s′. Then any execution trace of the procedure that contains s′ will at some point execute an enabling statement, then zero or more innocuous statements, before reaching s′. As mentioned earlier, executing the enabling statement establishes some conditions at the subsequent state of execution. These conditions are then preserved by the innocuous statements. Finally, the conditions imply that s and s′ have the same effect at the point where s′ is executed. As a result, the original program and the transformed program have the same semantics.

Our automatic strategy for proving optimizations sound is based on the above intuition. As part of the code for a forward transformation pattern, optimization writers provide a forward witness P, which is a (possibly first-order) predicate over an execution state, denoted η. The witness plays the role of the conditions mentioned in the previous paragraph and is the intuitive reason why the transformation pattern is correct. Our strategy attempts to prove that the witness is established by the enabling statement and preserved by the innocuous statements, and that it implies that s and s′ have the same effect. (The correctness of our approach does not depend on the correctness of the witness, since our approach independently verifies that the witness has the required properties.) We call the region of an execution trace between the enabling statement and the transformed statement the witnessing region. In Figure 1, the part of a trace that is inside the shaded area is its witnessing region.

In example 1, the forward witness η(Y) = C denotes the fact that the value of Y in execution state η is C. Our implementation proves automatically that the witness η(Y) = C is established by the statement Y := C, preserved by statements that do not modify the contents of Y, and implies that X := Y and X := C have the same effect. Therefore, the constant propagation transformation pattern is automatically proven to be sound.
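To make the pieces of such a pattern concrete, the following sketch encodes the constant-propagation pattern of example 1 as plain data in Python. The encoding (the ForwardPattern class, the string form of the guards, and the dictionary-valued execution state) is our own illustration for exposition only; it is not the concrete syntax accepted by the system.

    # Hedged sketch (not the system's syntax): the constant-propagation pattern
    # from example 1 as data.  Guard formulas are kept as strings over labels,
    # the rewrite rule as a pair of statement templates, and the witness as a
    # predicate over an execution state under a substitution theta.
    from dataclasses import dataclass
    from typing import Callable, Dict

    State = Dict[str, object]          # toy execution state: variable -> value
    Subst = Dict[str, object]          # pattern variable -> program fragment

    @dataclass
    class ForwardPattern:
        psi1: str                      # enabling condition
        psi2: str                      # innocuous condition
        lhs: str                       # statement s to be rewritten
        rhs: str                       # replacement statement s'
        witness: Callable[[Subst, State], bool]

    const_prop = ForwardPattern(
        psi1="stmt(Y := C)",
        psi2="not mayDef(Y)",
        lhs="X := Y",
        rhs="X := C",
        # witness  eta(Y) = C  under substitution theta
        witness=lambda theta, eta: eta.get(theta["Y"]) == theta["C"],
    )

    # Under theta = [X -> x, Y -> y, C -> 7] the witness holds exactly in
    # states where y currently contains 7.
    theta = {"X": "x", "Y": "y", "C": 7}
    assert const_prop.witness(theta, {"y": 7, "x": 0})
    assert not const_prop.witness(theta, {"y": 3, "x": 0})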
2.1.3 Labels

Each node (i.e., statement) in a procedure's CFG is labeled with properties that are true at that node, such as mayDef(y) or stmt(x := 5). The formulas ψ1 and ψ2 in an optimization are propositional boolean expressions over these labels, which may reference pattern variables like Y and C. Our framework provides a single pre-defined label, stmt(s), which is true at a node if and only if that node is the statement s. Users can then define their own labels in two ways: syntactic labels and semantic labels.

A syntactic label is defined only in terms of the stmt label and other syntactic labels of the current statement. For example, synDef(Y), which stands for syntactic definition of Y, can be defined as:

    synDef(Y) ≜ stmt(Y := ...)

Then the syntactic (and conservative) version of the mayDef(Y) label from example 1 can be defined as:

    mayDef(Y) ≜ synDef(Y) ∨ stmt(*X := ...) ∨ stmt(... := P(...))

In other words, a statement may define variable Y if the statement is either a syntactic definition of Y, a pointer store (since our language allows taking the address of a local variable), or a procedure call (since the procedure may be passed pointers from which the address of Y is reachable).

In contrast to syntactic labels, a semantic label of a node can incorporate information about the node's surrounding context in the CFG. For example, a doesNotPointTo(X, Y) label, which says that the contents of X is definitely not the address of Y, is a semantic label: its truth value at a node depends on the execution paths leading to the node. Section 2.4 shows how semantic labels are defined and how they can be used to make the mayDef label less conservative in the face of pointers.

2.2 Backward Transformation Patterns

A backward transformation pattern is similar to a forward one, except that the direction of the flow of analysis is reversed:

    ψ1 preceded by ψ2 until s ⇒ s′ with witness P

The backward transformation pattern above says that s may be transformed to s′ if on all paths in the CFG from s to the end of the procedure, there exists a statement satisfying ψ1, preceded by zero or more statements satisfying ψ2, preceded by s. The witnessing region of a program execution trace consists of the states between the transformed statement and the statement satisfying ψ1; P is called a backward witness.

As with forward transformation patterns, the backward witness plays the role of an invariant in the witnessing region. However, in a backward transformation the witnessing region occurs after, rather than before, the point where the transformed statement has been executed. Therefore, in general a backward witness must be a predicate that relates two execution states η_old and η_new, representing corresponding execution states in the witnessing region of traces in the original and transformed programs. Our automatic proof strategy attempts to prove that the backward witness is established by the transformation and preserved by the innocuous statements. Finally, we prove that after the enabling statement is executed, the witness implies that the original and transformed execution states become identical, implying that the transformation is semantics-preserving.
Example 2. Dead assignment elimination may be implemented in our language by the following backward transformation pattern:

    (synDef(X) ∨ stmt(return ...)) ∧ ¬mayUse(X)
    preceded by ¬mayUse(X)
    until X := E ⇒ skip
    with witness η_old/X = η_new/X

We express statement removal by replacement with a skip statement. (An execution engine would not actually insert such skips.) The removal of X := E is enabled by either a later assignment to X, indicated by synDef(X), or a return statement, which signals the end of the procedure. Preceding statements are innocuous if they don't use the contents of X. The backward witness η_old/X = η_new/X says that η_old and η_new are equal "up to" X: corresponding states in the witnessing region of the original and transformed programs are identical except for the contents of variable X. This invariant is established by the removal of X := E and preserved throughout the region because X is not used. The witness implies that a re-definition of X or a return statement causes the execution states of the two traces to become identical.

2.3 Profitability Heuristics

If an optimization's transformation pattern is proven sound, then all matching occurrences of that pattern are legal to be transformed. For some optimizations, including our two examples above, all legal transformations are also profitable. However, in more complex optimizations, such as code motion and optimizations that trade off time and space, many transformations may preserve program behavior while only a small subset of them improve the code. To address this distinction between legality and profitability, an optimization is written in two pieces. The transformation pattern defines only which transformations are legal. An optimization separately describes which of the legal transformations are also profitable and should be performed; we call this second piece of an optimization its profitability heuristic.

An optimization's profitability heuristic is expressed via a choose function, which can be arbitrarily complex and written in a language of the user's choice. Given the set Θ of the legal transformations determined by the transformation pattern and the procedure being optimized, choose returns the subset of the transformations in Θ that should actually be performed. A complete optimization in our language therefore has the following form, where O_pat is a transformation pattern:

    O_pat filtered through choose

This way of factoring optimizations into a transformation pattern and a profitability heuristic is critical to our ability to prove optimizations sound automatically, since only an optimization's transformation pattern affects soundness. Transformation patterns tend to be simple even for complicated optimizations, with the bulk of an optimization's complexity pertaining to profitability. Profitability heuristics can be written in any language, thereby removing any limitations on their expressiveness. Without profitability heuristics, the extra complexity added to guards to express profitability information would prevent automated correctness reasoning.

For the constant propagation and dead assignment elimination optimizations shown earlier, the choose function returns all instances: choose_all(Θ, p) = Θ. This profitability heuristic is the default if none is specified explicitly.
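Because the choose function lies outside the part of an optimization that must be verified, it can be written directly in a general-purpose language. The sketch below illustrates its shape in Python; the encoding of Θ as a set of (node index, substitution) pairs and the loop_nodes helper are our own assumptions, not the system's API.

    # Hedged sketch of the choose interface: given the set Theta of legal
    # transformations and the procedure being optimized, return the subset of
    # transformations that should actually be performed.
    from typing import FrozenSet, Set, Tuple

    Transform = Tuple[int, FrozenSet]     # (CFG node index, substitution)
    Theta = Set[Transform]

    def choose_all(theta: Theta, proc) -> Theta:
        """Default heuristic: perform every legal transformation."""
        return theta

    def choose_outside_loops(theta: Theta, proc) -> Theta:
        """Toy non-default heuristic: only transform nodes outside loops.
        proc.loop_nodes is an assumed helper, standing in for whatever CFG
        information a real heuristic would consult."""
        return {(idx, subst) for (idx, subst) in theta
                if idx not in proc.loop_nodes}

The soundness proof never inspects these functions: any subset they return is covered by the proof of the transformation pattern itself.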
Below we give an example of an optimization with a non-trivial choose function.

Example 3. Consider the implementation of partial redundancy elimination (PRE) [12, 8] in our optimization language. One way to perform PRE is to first insert copies of statements in well-chosen places in order to convert partial redundancies into full redundancies, and then to eliminate the full redundancies by running a standard common subexpression elimination (CSE) optimization expressible in our language. For example, in the following code fragment, the computation x := a + b at the end is partially redundant, since it is redundant only when the true leg of the branch is executed:

    b := ...;
    if (...) {
      a := ...;
      x := a + b;
    } else {
      ... // don't define a, b, or x, and don't use x.
    }
    x := a + b;

This partial redundancy can be eliminated by making a copy of the assignment x := a + b in the false leg of the branch. Now the assignment after the branch is fully redundant and can be removed by running CSE followed by self-assignment removal (removing assignments of the form x := x).

The criterion that determines when it is legal to duplicate a statement is relatively simple. Most of the complexity in PRE involves determining which of the many legal duplications are profitable, so that partial redundancies will be converted to full redundancies at minimum cost. The first, "code duplication" pass of PRE can be expressed in our language as the following backward optimization:

    stmt(X := E) ∧ unchanged(E)
    preceded by unchanged(E) ∧ ¬mayDef(X) ∧ ¬mayUse(X)
    until skip ⇒ X := E
    with witness η_old/X = η_new/X
    filtered through ...

Analogous to statement removal, we express statement insertion as replacement of a skip statement. (An execution engine for optimizations would conceptually insert skips dynamically as needed to perform insertions.) The label unchanged(E) is defined (by the optimization writer, as described in section 2.1.3) to be true at a statement s if s does not redefine the contents of any variable mentioned in E. The transformation pattern for code duplication allows the insertion if, on all paths in the CFG from the skip, X := E is preceded by statements that do not modify E and X and do not use X, which are preceded by the skip. In the code fragment above, the transformation pattern allows x := a + b to be duplicated in the else branch, as well as other (unprofitable) duplications.

This optimization's choose function is responsible for selecting those legal code insertions that also are the latest ones that turn all partial redundancies into full redundancies and do not introduce any partially dead computations. This condition is rather complicated, but it can be implemented in a language of the user's choice and can be ignored when verifying the soundness of code duplication. A sample choose function for PRE is shown in appendix A.

2.4 Pure Analyses

In addition to optimizations, our language allows users to write pure analyses that do not perform transformations. These analyses can be used to compute or verify properties of interest about a procedure and to provide information to be consumed by later transformations. A pure analysis defines a new semantic label, and the result of the analysis is a labeling of the given CFG. For instance, the does-not-point-to analysis (a definition of which is shown in appendix A) results in nodes of the CFG being annotated with labels of the form doesNotPointTo(X, Y). These labels can then be used by other optimizations in their guards.

A pure analysis is similar to a forward optimization, except that it does not contain a rewrite rule or a profitability heuristic. (Our language currently has no notion of backward analyses. In addition, we currently only allow the results of a forward analysis to be used in a forward optimization, or in another forward analysis.) Instead, it has a defines clause that gives a name to the new semantic label.
A pure analysis has the form

    ψ1 followed by ψ2 defines label with witness P

The new label can be added to a statement s if on all paths to s, there exists an (enabling) statement satisfying ψ1, followed by zero or more (innocuous) statements satisfying ψ2, followed by s. The given forward witness should be established by the enabling statement and preserved by the innocuous statements. If so, the witness provides the new label's meaning: if a statement s has semantic label label, then the corresponding witness P is true of the program state just before execution of s. The following example shows how a pure analysis can be used to compute a simple form of pointer information.

Example 4. We say that a variable is tainted at a program point if its address may have been taken prior to that program point. The following analysis defines the notTainted label:

    stmt(decl X) followed by ¬stmt(... := &X) defines notTainted(X) with witness notPointedTo(X, η)

The analysis says that a variable is not tainted at a statement if on all paths leading to that statement, the variable was declared, and then its address was never taken. The witness notPointedTo(X, η) is a first-order predicate defined by the user (and shown in appendix A) that ensures that no memory location in η contains a pointer to X. The notTainted label can be used to define a more precise version of the mayDef label from earlier examples, which incorporates the fact that neither pointer stores nor procedure calls can affect variables that are untainted:

    mayDef(Y) ≜ synDef(Y) ∨ (stmt(*X := ...) ∧ ¬notTainted(Y)) ∨ (stmt(... := P(...)) ∧ ¬notTainted(Y))

With this new definition, optimizations using mayDef become less conservative in the face of pointer stores and calls.

3 Language for Defining Optimizations

This section provides a formal definition of our optimization language and the intermediate language that optimizations manipulate. The full formal details can be found in appendix B.

3.1 Intermediate Language

A program π in our (untyped) intermediate language is described by the following grammar:

    Progs        π    ::= pr ... pr
    Procs        pr   ::= p(x) { s; ...; s; }
    Stmts        s    ::= decl x | skip | lhs := e | x := new | x := p(b)
                          | if b goto ι else ι | return x
    Exprs        e    ::= b | *x | &x | op b ... b
    Locatables   lhs  ::= x | *x
    Base Exprs   b    ::= x | c
    Ops          op   ::= various operators with arity ≥ 1
    Vars         x    ::= x | y | z | ...
    Proc Names   p    ::= p | q | r | ...
    Consts       c    ::= constants
    Indices      ι    ::= 0 | 1 | 2 | ...

A program π is a sequence of procedures, and each procedure is a sequence of statements. We assume a distinguished procedure named main. Statements include local variable declarations, assignments to local variables and through pointers, heap memory allocation, procedure calls and returns, and conditional branches (unconditional branches can be simulated with conditional branches). We assume that each procedure ends with a return statement. Statements are indexed consecutively from 0, and stmtAt(π, ι) returns the statement with index ι in π. Expressions include constants, local variable references, pointer dereferences, taking the addresses of local variables, and n-ary operators over non-pointer values.
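For concreteness, the grammar can be transcribed directly into an abstract-syntax representation. The Python rendering below is our own (the constructor names are ours and not part of the system); it is included only to fix notation for the constructs listed above.

    # Hedged sketch: abstract syntax for the intermediate language of section 3.1.
    from dataclasses import dataclass
    from typing import List, Union

    @dataclass
    class Var:                      # base expression: variable reference
        name: str

    @dataclass
    class Const:                    # base expression: constant
        value: object

    Base = Union[Var, Const]

    @dataclass
    class Deref:                    # *x
        var: Var

    @dataclass
    class AddrOf:                   # &x
        var: Var

    @dataclass
    class Op:                       # op b ... b  (n-ary operator, arity >= 1)
        op: str
        args: List[Base]

    Expr = Union[Var, Const, Deref, AddrOf, Op]
    Lhs = Union[Var, Deref]         # locatables: x | *x

    @dataclass
    class Decl:                     # decl x
        var: Var

    @dataclass
    class Skip:                     # skip
        pass

    @dataclass
    class Assign:                   # lhs := e
        lhs: Lhs
        rhs: Expr

    @dataclass
    class New:                      # x := new
        var: Var

    @dataclass
    class Call:                     # x := p(b)
        var: Var
        proc: str
        arg: Base

    @dataclass
    class CondGoto:                 # if b goto i1 else i2 (statement indices)
        cond: Base
        then_index: int
        else_index: int

    @dataclass
    class Return:                   # return x
        var: Var

    Stmt = Union[Decl, Skip, Assign, New, Call, CondGoto, Return]

    @dataclass
    class Proc:                     # p(x) { s; ...; s; }
        name: str
        param: Var
        body: List[Stmt]

    @dataclass
    class Program:                  # a sequence of procedures, one named main
        procs: List[Proc]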
A state of execution of a program is a tuple η = (ι, ρ, σ, κ, M). The index ι indicates which statement is about to be executed. The environment ρ is a map from variables in scope to their locations in memory, and the store σ describes the contents of memory by mapping locations to values (constants and locations). The dynamic call chain is represented by a stack κ, and M is the memory allocator, which returns fresh locations as needed.

The states of a program π transition according to the state transition function →_π. We denote by η →_π η′ the fact that η′ is the program state that is "stepped to" when execution proceeds from state η. The definition of →_π is standard and is given in appendix B. We also define an intraprocedural state transition function ↪_π. This function acts like →_π except when the statement to be executed is a procedure call. In that case, ↪_π steps "over" the call, returning the program state that will eventually be reached when control returns to the calling procedure. We model run-time errors through the absence of state transitions: if in some state η program execution would fail with a run-time error, there won't be any η′ such that η →_π η′ is true. Likewise, if a procedure call does not return successfully, e.g., because of infinite recursion, there won't be any η′ such that η ↪_π η′ is true.

3.2 Optimization Language

In this section, we first specify the syntax of a rewrite rule's original and transformed statements s and s′. Then we define the language used for expressing ψ1 and ψ2. Finally, we provide the semantics of optimizations. The witness P does not affect the (dynamic) semantics of optimizations.

3.2.1 Syntax of s and s′

Statements s and s′ are defined in the syntax of the extended intermediate language, which augments the intermediate language with a form of free variables called pattern variables. Each production in the grammar of the original intermediate language is extended with a case for a pattern variable. A few examples are shown below:

    Exprs   e ::= ... | E
    Vars    x ::= ... | X | Y | Z | ...
    Consts  c ::= ... | C

Statements in the extended intermediate language are instantiated by substituting for each pattern variable a program fragment of the appropriate kind from the intermediate-language program being optimized. For example, the statement X := E in the extended intermediate language contains two pattern variables X and E, and this statement can be instantiated to form an intermediate-language statement assigning any expression occurring in the intermediate program to any variable occurring in the intermediate program.

3.2.2 Syntax and Semantics of ψ1 and ψ2

The syntax for ψ is described by the following grammar:

    ψ ::= true | false | pr | ¬ψ | ψ1 ∨ ψ2 | ψ1 ∧ ψ2

In this grammar, pr stands for atomic predicates, each of which is formed from the pre-defined stmt label or from any user-defined label, with parameters drawn from the extended intermediate language. For example, pr includes predicates such as mayDef(X) and unchanged(E), where X and E are pattern variables.

The semantics of a formula ψ is defined with respect to a labeled CFG. Each node n in the CFG for procedure p is labeled with a finite set L_p(ι), where ι is n's index. L_p(ι) includes atomic predicates pr that do not contain pattern variables. For example, a node could be labeled with stmt(x := 3) and mayDef(x). The meaning of a formula ψ at a node depends on a substitution θ mapping the pattern variables in ψ to fragments of p. We extend substitutions to formulas and program fragments containing pattern variables in the usual way.
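As a small illustration of instantiation, the sketch below applies a substitution θ to the statement template X := E; the nested-tuple encoding of statements is our own stand-in for the system's representation.

    # Hedged sketch: applying a substitution to a statement template.  Pattern
    # variables are single upper-case letters; concrete fragments are lower-case
    # strings or nested tuples.  This encoding is ours, not the system's.
    def is_pattern_var(t):
        return isinstance(t, str) and len(t) == 1 and t.isupper()

    def substitute(template, theta):
        """Replace every pattern variable occurring in 'template' by theta[var]."""
        if is_pattern_var(template):
            return theta[template]
        if isinstance(template, tuple):
            return tuple(substitute(t, theta) for t in template)
        return template                   # concrete fragment: left unchanged

    # The template X := E under theta = [X -> x, E -> y + 1] becomes x := y + 1.
    template = ("assign", "X", "E")
    theta = {"X": "x", "E": ("op+", "y", 1)}
    assert substitute(template, theta) == ("assign", "x", ("op+", "y", 1))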
We write ι ⊨_{p,θ} ψ to indicate that the node with index ι satisfies ψ in the labeled CFG of p under substitution θ. The definition of ι ⊨_{p,θ} ψ is straightforward, with the base case being ι ⊨_{p,θ} pr ⟺ θ(pr) ∈ L_p(ι). The complete definition of ⊨_{p,θ} is in appendix B.

3.2.3 Semantics of Optimizations

We define the semantics of analyses and optimizations in several pieces. First, the meaning of a forward guard ψ1 followed by ψ2 is a function that takes a procedure and returns a set of matching indices with their corresponding substitutions:

Definition 1. The meaning of a forward guard O_guard of the form ψ1 followed by ψ2 is as follows:

    ⟦O_guard⟧(p) = { (ι, θ) | for all paths ι1, ..., ιj, ι in p's CFG such that ι1 is the index of p's entry node,
                              ∃k. (1 ≤ k ≤ j ∧ ιk ⊨_{p,θ} ψ1 ∧ ∀i. (k < i ≤ j ⟹ ιi ⊨_{p,θ} ψ2)) }

The above definition formalizes the description of forward guards from section 2. The meaning of a backward guard ψ1 preceded by ψ2 is identical, except that the guard is evaluated on CFG paths ι, ιj, ..., ι1 that start, rather than end, at ι, where ι1 is the index of the procedure's exit node. Guards can be seen as a restricted form of temporal logic formula, expressible in variants of both LTL and CTL.

Next we define the semantics of transformation patterns. A (forward or backward) transformation pattern O_pat = O_guard until s ⇒ s′ with witness P simply filters the set of nodes matching its guard to include only those nodes of the form s:

    ⟦O_pat⟧(p) = { (ι, θ) | (ι, θ) ∈ ⟦O_guard⟧(p) and ι ⊨_{p,θ} stmt(s) }

The meaning of an optimization is a function that takes a procedure p and returns the procedure produced by applying to p all transformations chosen by the choose function.

Definition 2. Given an optimization O of the form O_pat filtered through choose, where O_pat has rewrite rule s ⇒ s′, the meaning of O is as follows:

    ⟦O⟧(p) = let Θ := ⟦O_pat⟧(p) in app(s′, p, choose(Θ, p) ∩ Θ)

where app(s′, p, Θ′) returns the procedure identical to p but with the node with index ι transformed to θ(s′), for each (ι, θ) in Θ′.

Finally, the meaning of a pure analysis O_guard defines label with witness P applied to a procedure p is a new version of p's CFG where for each pair (ι, θ) in ⟦O_guard⟧(p), the node with index ι is additionally labeled with θ(label).
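Definition 1 can be read as a path property over the labeled CFG. The sketch below evaluates a forward guard for one fixed substitution θ by a greatest-fixpoint computation over the CFG, purely to mirror the quantifier structure of the definition; the CFG and label encodings are our own, and the real system evaluates guards with the dataflow engine described in section 6.

    # Hedged sketch: for a fixed substitution theta, compute the node indices
    # at which the forward guard "psi1 followed by psi2" holds (Definition 1).
    def satisfies(labels, node, formula):
        """'formula' is a set of atomic labels that must all be present at the
        node (a simplified stand-in for the propositional formulas of 3.2.2)."""
        return formula <= labels[node]

    def forward_guard(entry, preds, labels, psi1, psi2):
        """Greatest-fixpoint computation of the nodes n such that every CFG
        path from 'entry' to n passes through a psi1 node followed only by
        psi2 nodes before reaching n.  The guard never holds at the entry node
        itself, since the empty path contains no enabling statement."""
        nodes = set(labels)
        holds = {n: n != entry for n in nodes}     # optimistic start
        changed = True
        while changed:
            changed = False
            for n in nodes:
                if n == entry:
                    continue
                new = all(satisfies(labels, m, psi1) or
                          (satisfies(labels, m, psi2) and holds[m])
                          for m in preds[n])
                if new != holds[n]:
                    holds[n] = new
                    changed = True
        return {n for n in nodes if holds[n]}

    # Toy CFG for "y := 7; x := y" under theta = [Y -> y, C -> 7, X -> x]:
    labels = {0: {"stmt(y := 7)"}, 1: {"stmt(x := y)", "notMayDef(y)"}}
    preds = {0: [], 1: [0]}
    psi1 = {"stmt(y := 7)"}          # theta applied to stmt(Y := C)
    psi2 = {"notMayDef(y)"}          # theta applied to not mayDef(Y)
    assert forward_guard(0, preds, labels, psi1, psi2) == {1}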
4 Proving Soundness Automatically

In this section we describe in detail our technique for automatically proving soundness of optimizations. We say that an intermediate-language program π′ is a semantically equivalent transformation of π if, whenever main(v1) returns v2 in π, for some values v1 and v2, then it also does in π′. Let π[p ↦ p′] denote the program identical to π but with procedure p replaced by p′. An optimization O is sound if for all intermediate-language programs π and procedures p in π, π[p ↦ ⟦O⟧(p)] is a semantically equivalent transformation of π.

The first subsection describes our technique for proving optimizations sound, which requires an automatic theorem prover to discharge only a small set of simple proof obligations. The second subsection describes how these obligations are implemented and automatically discharged with the Simplify theorem prover.

4.1 Soundness of Optimizations

We say that a transformation pattern O_pat with rewrite rule s ⇒ s′ is sound if, for all intermediate-language programs π and procedures p in π, for all subsets Θ ⊆ ⟦O_pat⟧(p), π[p ↦ app(s′, p, Θ)] is a semantically equivalent transformation of π. If a transformation pattern is sound, then any optimization O with that transformation pattern is sound, since the optimization will select some subset of the transformation pattern's suggested transformations, and each of these is known to be a semantically equivalent transformation of π. Therefore, we need not reason at all about an optimization's profitability heuristic in order to prove that the optimization is sound.

4.1.1 Forward Transformation Patterns

Consider a forward transformation pattern of the following form:

    ψ1 followed by ψ2 until s ⇒ s′ with witness P

As discussed in section 2, our proof strategy entails showing that the forward witness P holds throughout the witnessing region and that the witness implies s and s′ have the same semantics in this context. This can naturally be shown by induction over the states in the witnessing region of an execution trace leading to a transformed statement. In general, it is difficult for an automatic theorem prover to determine when proof by induction is necessary and to perform such a proof with a strong enough inductive hypothesis. Therefore we instead require an automatic theorem prover to discharge only non-inductive obligations, which pertain to individual execution states rather than entire execution traces. We have proven (see Theorems 1 and 2 below) that if these obligations hold for any particular optimization, then that optimization is sound.

We use index as an accessor on states: index((ι, ρ, σ, κ, M)) = ι. The optimization-specific obligations, to be discharged by an automatic theorem prover, are as follows, where θ(P) is the predicate formed by applying θ to each pattern variable in the definition of P, and π′ denotes the transformed program:

F1. If η ↪_π η′ and index(η) ⊨_{p,θ} ψ1, then θ(P)(η′).

F2. If θ(P)(η) and η ↪_π η′ and index(η) ⊨_{p,θ} ψ2, then θ(P)(η′).

F3. If θ(P)(η) and η ↪_π η′ and ι = index(η) and stmtAt(π, ι) = θ(s) and stmtAt(π′, ι) = θ(s′), then η ↪_{π′} η′.

Condition F1 ensures that the witness holds at any state following the execution of an enabling statement (one satisfying ψ1). Condition F2 ensures that the witness is preserved by any innocuous statement (one satisfying ψ2). Finally, condition F3 ensures that s and s′ have the same semantics when executed from a state satisfying the witness.

As an example, consider condition F1 for the constant propagation optimization from example 1. The condition looks as follows:

    If η ↪_π η′ and index(η) ⊨_{p,θ} stmt(Y := C), then θ(η′(Y) = C).

The condition is easily proven automatically from the semantics of assignments and the stmt label. The following theorem validates the optimization-specific proof obligations.

Theorem 1. If O is a forward optimization satisfying conditions F1, F2, and F3, then O is sound.

The proof of this theorem, which is given in appendix B, uses conditions F1 and F2 as part of the base case and the inductive case, respectively, in an inductive argument that the witness holds throughout a witnessing region. Condition F3 is then used to show that s and s′ have the same semantics in this context. Our proof also handles the case when multiple transformations, with possibly overlapping witnessing regions, are performed.
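To make these obligations concrete, the sketch below instantiates F1-F3 for the constant-propagation pattern of example 1 under the substitution θ = [X ↦ x, Y ↦ y, C ↦ 7] and spot-checks them on a toy one-statement interpreter with dictionary states. This is only an illustration of the shape of the obligations (the interpreter and state encoding are ours); Simplify must prove the corresponding facts for all states, not test a few.

    # Hedged sketch: F1-F3 for constant propagation, spot-checked on toy states.
    def step(state, stmt):
        """Toy intraprocedural step for one assignment 'var := rhs', where rhs
        is either a constant or a variable name."""
        var, rhs = stmt
        value = state[rhs] if isinstance(rhs, str) else rhs
        new_state = dict(state)
        new_state[var] = value
        return new_state

    def witness(state):                 # theta(P): eta(y) = 7
        return state.get("y") == 7

    some_states = [{"x": 0, "y": 0}, {"x": 3, "y": 7}, {"y": 9, "z": 1}]

    # F1: executing the enabling statement  y := 7  establishes the witness.
    assert all(witness(step(s, ("y", 7))) for s in some_states)

    # F2: an innocuous statement (one not defining y, e.g. z := x) preserves it.
    assert all(witness(step(s, ("z", "x")))
               for s in some_states if witness(s) and "x" in s)

    # F3: from a state satisfying the witness, x := y and x := 7 step to the
    # same successor state.
    assert all(step(s, ("x", "y")) == step(s, ("x", 7))
               for s in some_states if witness(s))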
4.1.2 Backward Transformation Patterns

Consider a backward transformation pattern of the following form:

    ψ1 preceded by ψ2 until s ⇒ s′ with witness P

The optimization-specific obligations are similar to those for a forward transformation pattern, except that the ordering of events in the witnessing region is reversed:

B1. If η ↪_π η_old and η ↪_{π′} η_new and ι = index(η) and stmtAt(π, ι) = θ(s) and stmtAt(π′, ι) = θ(s′), then θ(P)(η_old, η_new).

B2. If θ(P)(η_old, η_new) and η_old ↪_π η′_old and ι_old = index(η_old) and ι_new = index(η_new) and ι_old ⊨_{p,θ} ψ2 and stmtAt(π, ι_old) = stmtAt(π′, ι_new), then there exists some η′_new such that η_new ↪_{π′} η′_new and θ(P)(η′_old, η′_new).

B3. If θ(P)(η_old, η_new) and η_old ↪_π η and ι_old = index(η_old) and ι_new = index(η_new) and ι_old ⊨_{p,θ} ψ1 and stmtAt(π, ι_old) = stmtAt(π′, ι_new), then η_new ↪_{π′} η.

Condition B1 ensures that the backward witness holds between the original and transformed programs, after s and s′ are respectively executed. (This condition assumes that s′ does not get "stuck" by causing a run-time error. That assumption must actually be proven, but for simplicity we elide this issue here. It is addressed by requiring a few additional obligations to be discharged that imply that s′ cannot get stuck if the original program does not get stuck. Details are in appendix B.) Condition B2 ensures that the backward witness is preserved through the innocuous statements. Condition B3 ensures that the two traces become identical again after executing the enabling statement (and exiting the witnessing region).

Analogous to the forward case, the following theorem validates the optimization-specific proof obligations for backward optimizations.

Theorem 2. If O is a backward optimization satisfying conditions B1, B2, and B3, then O is sound.

4.2 Implementation with Simplify

We have implemented our proof strategy with the Simplify automatic theorem prover. For each optimization, we ask Simplify to prove the three associated optimization-specific obligations. To do so, Simplify requires background information in the form of a set of axioms. These axioms, which simply encode the semantics of our intermediate language and of the stmt label, are optimization-independent: they need not be modified in order to prove new optimizations sound.

We introduce function symbols to represent term constructors for each kind of expression and statement. For example, the term assgn(var(x), deref(var(y))) represents the statement x := *y. Next we formalize the representation of program states. Simplify has built-in axioms about a map data structure, with associated functions select and update to access elements and (functionally) update the map. This is useful for representing many components of a state. For example, an environment is a map from variables to locations, and a store is a map from locations to values. Given our representation for states, we define axioms for a function symbol evalExpr, which evaluates an expression in a given state. The evalExpr function represents the function η(·) used in section 2. We also define axioms for a function evalLExpr which computes the location of an lhs expression given a program state. Finally, we provide axioms for the stepIndex, stepEnv, stepStore, stepStack, and stepMem functions, which together define the state transition function →_π from section 3.1. These functions take a state and a program and return the new value of the state component being "stepped." As an example, the axioms for stepping an index and a store through an assignment lhs := e are as follows:

    ∀π, η, lhs, e. stmtAt(π, index(η)) = assgn(lhs, e) ⟹ stepIndex(η, π) = index(η) + 1

    ∀π, η, lhs, e. stmtAt(π, index(η)) = assgn(lhs, e) ⟹ stepStore(η, π) = update(store(η), evalLExpr(η, lhs), evalExpr(η, e))

The first axiom says that the new index is the current index incremented by one. The second axiom says that the new store is the same as the old one, but with the location of lhs updated to the value of e.
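Read operationally, the two axioms describe a simple state update. The sketch below is our own Python transcription of that reading, not the axioms as given to Simplify; only the index and store components are modeled, since those are the two the axioms above constrain, and the eval helpers handle only plain variables and constants.

    # Hedged sketch: the operational content of the two assignment axioms.
    # A state is (index, env, store); env maps variables to locations and
    # store maps locations to values.
    def eval_lexpr(state, lhs):
        """Location of an assignable expression; here only a plain variable."""
        _, env, _ = state
        return env[lhs]

    def eval_expr(state, e):
        """Value of an expression; here a constant or a variable read."""
        _, env, store = state
        return store[env[e]] if isinstance(e, str) else e

    def step_assign(state, lhs, e):
        """stepIndex: index + 1; stepStore: store updated at the location of
        lhs with the value of e (a functional update, as in the axioms)."""
        index, env, store = state
        new_store = dict(store)
        new_store[eval_lexpr(state, lhs)] = eval_expr(state, e)
        return (index + 1, env, new_store)

    # x := y in a state where x lives at location 0 and y at location 1:
    state = (5, {"x": 0, "y": 1}, {0: 99, 1: 7})
    assert step_assign(state, "x", "y") == (6, {"x": 0, "y": 1}, {0: 7, 1: 7})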
The ↪_π function is then defined in terms of the →_π function.

We have implemented and automatically proven sound a dozen optimizations and analyses in our language (which are given in appendix A). On a modern workstation, the time taken by Simplify to discharge the optimization-specific obligations for these optimizations ranges from 2 to 89 seconds, with an average of 22 seconds.

5 Discussion

In this section, we evaluate our system along three dimensions: expressiveness of our language, debugging value, and reduced trusted computing base.

Expressiveness. One of the key choices in our approach is to restrict the language in which optimizations can be written, in order to gain automatic reasoning about soundness. However, the restrictions of our optimization language are not as onerous as they may first appear. First, much of the complexity of an optimization can be factored out into the profitability heuristic, which is unrestricted. Second, the pattern of a witnessing region (beginning with a single enabling statement and passing through zero or more innocuous statements before reaching the statement to be transformed) is common to many forward intraprocedural dataflow analyses, and similarly for backward intraprocedural dataflow analyses. Third, optimizations that traditionally are expressed as having effects at multiple points in the program, such as various sorts of code motion and partial redundancy elimination, can in fact be decomposed into several simpler transformations, each of which fits our constraint of making a single transformation at the end of a witnessing region.

The PRE example in section 2.3 illustrates all three of these points. PRE is a complex code-motion optimization [12, 8], and yet it can be expressed in our language using simple forward and backward passes with appropriate profitability heuristics. Our way of factoring complicated optimizations into smaller pieces, and separating the part that affects soundness from the part that doesn't, allows users to write optimizations that are intricate and expressive yet still amenable to automated correctness reasoning.

Even so, our current language does have limitations. For example, it cannot express interprocedural optimizations or one-to-many transformations. As mentioned in section 6, our ongoing work is addressing these limitations. Also, optimizations and analyses that build complex data structures to represent their dataflow facts may be difficult to express. Finally, it is possible for limitations in either our proof strategy or in the automatic theorem prover to cause a sound optimization expressible in our language to be rejected. In these cases, optimizations can be written outside of our framework, perhaps verified using translation validation. Optimizations written in our optimization language and proven correct can peacefully co-exist with optimizations written "the normal way."

Debugging benefit. Writing correct optimizations is difficult because there are many corner cases to consider, and it is easy to miss one. Our system in fact found several subtle problems in previous versions of our optimizations. For example, we have implemented a form of common subexpression elimination (CSE) that eliminates not only redundant arithmetic expressions, but also redundant loads.
In particular, this optimization tries to eliminate a computation of *X if the result is already available from a previous load. Our initial version of the optimization precluded pointer stores from the witnessing region, to ensure that the value of *X was not modified. However, a failed soundness proof made us realize that even a direct assignment Y := ... can change the value of *X, because X could point to Y. Once we incorporated pointer information to make sure that direct assignments in the witnessing region were not changing the value of *X, our implementation was able to automatically prove the optimization sound. Without the static checks to find the bug, it could have gone undetected for a long time, because that particular corner case may not occur in many programs.

Reduced trusted computing base. The trusted computing base (TCB) ordinarily includes the entire compiler. In our system we have moved the compiler's optimization phase, one of the most intricate and error-prone portions, outside of the TCB. Instead, we have shifted the trust in this phase to three components: the automatic theorem prover, the manual proofs done as part of our framework, and the run-time engine that executes optimizations. Because all of these components are optimization-independent, new optimizations can be incorporated into the compiler without enlarging the TCB. Furthermore, as discussed in section 6, the run-time engine can be implemented as a single dataflow analysis common to all user-defined optimizations. This means that the trustworthiness of the run-time engine is akin to the trustworthiness of a single optimization pass in a traditional compiler.

Trust can be further enhanced in several ways. First, we could use an automatic theorem prover that generates proofs, such as the prover in the Touchstone compiler [19]. This would allow trust to be shifted from the theorem prover to a simpler proof checker. The manual proofs of our framework are made public for peer review in appendix B to increase confidence. We could also use an interactive theorem prover such as PVS [22] to validate these proofs.

6 Current and Future Work

Our current work is focused in two directions. First, we are implementing the dynamic semantics of our optimization language as an analysis in the Whirlwind compiler, a successor to Vortex [4]. This analysis stores at every program point a set of substitutions, each substitution representing a potential witnessing region. Consider a forward optimization:

    ψ1 followed by ψ2 until s ⇒ s′ with witness P filtered through choose

The flow function for our analysis works as follows. First, if the statement being processed satisfies ψ1, then the flow function adds to the outgoing dataflow fact the substitution that caused ψ1 to be true. Also, for each substitution θ in the incoming dataflow fact, the flow function checks if θ(ψ2) is true at the current statement. If it is, then θ is propagated to the outgoing dataflow fact, otherwise it is dropped. Finally, merge nodes simply take the intersection of the incoming dataflow facts. After the analysis has reached a fixed point, if a statement has a substitution θ in its incoming dataflow fact that makes θ(stmt(s)) true, and the choose function selects this statement, then the statement is transformed to θ(s′).

For example, in constant propagation we have ψ1 = stmt(Y := C) and ψ2 = ¬mayDef(Y). The following program fragment shows the dataflow facts propagated after each statement:

    S1: a := 2;   [Y ↦ a, C ↦ 2]
    S2: b := 3;   [Y ↦ a, C ↦ 2], [Y ↦ b, C ↦ 3]
    S3: c := a;
S1 satisfies ψ1, and so its outgoing dataflow fact contains the substitution [Y ↦ a, C ↦ 2]. S2 satisfies ψ2 under this substitution, and so the substitution is propagated; S2 also satisfies ψ1 and so [Y ↦ b, C ↦ 3] is added to the outgoing dataflow fact. In fact, the dataflow information after S2 is very similar to the regular constant propagation dataflow fact {a ↦ 2, b ↦ 3}. At fixed point, the statement c := a can be transformed to c := 2 because the incoming dataflow fact contains the map [Y ↦ a, C ↦ 2]. Note that this implementation evaluates all "instances" of the constant propagation optimization pattern simultaneously. (We also plan to explore potentially more efficient implementation techniques, such as generating specialized code to run each optimization [26].)
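The flow and merge functions just described can be sketched as follows. The encoding of dataflow facts as sets of frozensets and the matches_psi1/matches_psi2 callbacks (standing in for label evaluation under a substitution) are our own assumptions; the transformation and choose steps performed at fixed point are omitted.

    # Hedged sketch of the flow and merge functions described above.  A dataflow
    # fact is a set of substitutions, each a frozenset of (pattern-variable,
    # fragment) pairs so it can be stored in a set.
    def flow(fact, stmt, matches_psi1, matches_psi2):
        """Propagate substitutions across one statement.  matches_psi1(stmt)
        returns the substitutions under which the statement satisfies psi1;
        matches_psi2(stmt, theta) says whether it satisfies theta(psi2)."""
        out = set(matches_psi1(stmt))                  # new witnessing regions
        out |= {theta for theta in fact if matches_psi2(stmt, theta)}
        return out

    def merge(facts):
        """Merge nodes keep only substitutions valid along every incoming edge."""
        facts = list(facts)
        return set.intersection(*facts) if facts else set()

    # Constant propagation on the fragment  a := 2; b := 3 :
    def matches_psi1(stmt):                            # stmt(Y := C)
        var, rhs = stmt
        return [frozenset({("Y", var), ("C", rhs)})] if isinstance(rhs, int) else []

    def matches_psi2(stmt, theta):                     # not mayDef(Y)
        var, _ = stmt
        return ("Y", var) not in theta

    fact = set()
    for stmt in [("a", 2), ("b", 3)]:
        fact = flow(fact, stmt, matches_psi1, matches_psi2)
    # After b := 3 the fact contains [Y -> a, C -> 2] and [Y -> b, C -> 3].
    assert fact == {frozenset({("Y", "a"), ("C", 2)}),
                    frozenset({("Y", "b"), ("C", 3)})}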
Our analysis is being implemented using our earlier framework for composing optimizations in Whirlwind [10]. This framework allows optimizations to be defined modularly and then automatically combines all-forward or all-backward optimizations in order to gain mutually beneficial interactions. Analyses and optimizations written in our language will therefore also be composable in this way. Furthermore, Whirlwind's framework automatically composes an optimization with itself, allowing a recursively defined optimization to be solved in an optimistic, iterative manner; this property will likewise be conferred on optimizations written in our language. For example, a recursive version of dead-assignment elimination would allow X := E to be removed even if X is used before being re-defined, as long as it is only used by other dead assignments (including itself).

The other direction of our current work is in extending the language to handle interprocedural optimizations. One approach would extend the scope of analysis from a single procedure to the whole program's control-flow supergraph. One technical challenge here is the need to express the witness P in a way that makes sense across procedure calls. For example, the predicate η(Y) = C does not make sense once a call is stepped into, because Y has gone out of scope. We intend to extend the syntax for the witness to be more precise about which variable is being talked about. A different approach to interprocedural analysis would use pure analyses to define summaries of procedures, which could be used in intraprocedural optimizations of callers.

There are also many directions for future work. First, the optimization language currently supports only transformations that replace a single statement with a single statement. It should be relatively straightforward to generalize the framework to handle one-to-many statement transformations, allowing optimizations like inlining to be expressed. Supporting many-to-many statement transformations would also be interesting. We plan to try inferring the witnesses, which are currently provided by the user. It may be possible to use some simple heuristics to guess a witness from the given transformation pattern. As a simple example, in the constant propagation example of section 2, the appropriate witness, that Y has the value C, is simply the strongest postcondition of the enabling statement Y := C. Many of the other forward optimizations that we have written also have this property.

Finally, an important consideration that we have not addressed is the interface between the optimization writer and our automatic soundness prover. It will be critical to provide useful error messages when an optimization is unable to be proven sound. When Simplify cannot prove a given proposition, it returns a counterexample context, which is a state of the world that violates the proposition. An interesting approach would be to use this counterexample context to synthesize a small intermediate-language program that illustrates a potential unsoundness of the given optimization.

7 Related Work

Our work is inspired by that of Lacey et al. [9]. Lacey describes a language for writing optimizations as guarded rewrite rules evaluated over a labeled CFG, and our transformation patterns are modeled on this language. Lacey's intermediate language lacks several constructs found in realistic languages, including pointers, dynamic memory allocation, and procedures. Lacey describes a general strategy, based on relating execution traces of the original and transformed programs, for manually proving the soundness of optimizations in his language. Three example optimizations are shown and proven sound by hand using this strategy. Unfortunately, the generality of this strategy makes it difficult to automate. Lacey's guards may be arbitrary CTL formulas, while our guard language can be viewed as a strict subset of CTL that codifies a particularly common idiom. However, we are still able to express more precise versions of Lacey's three example optimizations (as well as many others) and to prove them sound automatically. Further, Lacey's optimization language has no notion of semantic labels nor of profitability heuristics. Therefore, expressing optimizations that employ pointer information (assuming Lacey's language were augmented with pointers) or optimizations like PRE would instead require writing more complicated guards, and some optimizations that we support may not be expressible by Lacey.

As mentioned in the introduction, much other work has been done on manually proving optimizations correct [11, 13, 1, 2, 6, 21, 3]. Transformations have also been proven correct mechanically, but not automatically: the transformation is proven sound using an interactive theorem prover, which increases one's confidence but requires user interaction. For example, Young [28] has proven a code generator correct using the Boyer-Moore theorem prover enhanced with an interactive interface [7].

Instead of proving that the compiler is always correct, credible compilation [24, 23] and translation validation [17] both attack the problem of checking the correctness of a given compilation run. Therefore, a bug in an optimization will only appear when the compiler is run on a program that triggers the bug. Our work allows optimizations to be proven correct before the compiler is even run once. However, to do so we require optimizations to be written in a special-purpose language, while credible compilation and translation validation typically do not.

Proof-carrying code [16], certified compilation [18], typed intermediate languages [27], and typed assembly languages [14, 15] have all been used to prove properties of programs generated by a compiler. However, the kind of properties that these approaches have typically guaranteed are type safety and memory safety. In our work, we prove the stronger property of semantic equivalence between the original and resulting programs.

8 Conclusion

We have presented an approach for automatically proving the correctness of compiler optimizations.
Our technique provides the optimization writer with a domain-specific language for writing optimizations that is both reasonably expressive and amenable to automated correctness reasoning. Using our technique we have proven correct our implementations of several optimizations over a realistic intermediate language. We believe our approach is a promising step toward the goal of reliable and user-extensible compilers.

References

[1] Patrick Cousot and Radhia Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, pages 238-252, January 1977.

[2] Patrick Cousot and Radhia Cousot. Systematic design of program analysis frameworks. In Conference Record of the Sixth ACM Symposium on Principles of Programming Languages, pages 269-282, January 1979.

[3] Patrick Cousot and Radhia Cousot. Systematic design of program transformation frameworks by abstract interpretation. In Conference Record of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 2002.

[4] Jeffrey Dean, Greg DeFouw, Dave Grove, Vassily Litvinov, and Craig Chambers. Vortex: An optimizing compiler for object-oriented languages. In Proceedings of the 1996 ACM Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 83-100, San Jose, CA, October 1996.

[5] Cormac Flanagan, K. Rustan M. Leino, Mark Lillibridge, Greg Nelson, James B. Saxe, and Raymie Stata. Extended static checking for Java. In Proceedings of the ACM SIGPLAN '02 Conference on Programming Language Design and Implementation, June 2002.

[6] J. Guttman, J. Ramsdell, and M. Wand. VLISP: a verified implementation of Scheme. Lisp and Symbolic Computation, 8(1-2):33-110, 1995.

[7] M. Kauffmann and R.S. Boyer. The Boyer-Moore theorem prover and its interactive enhancement. Computers and Mathematics with Applications, 29(2):27-62, 1995.

[8] Jens Knoop, Oliver Ruthing, and Bernhard Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, 16(4):1117-1155, July 1994.

[9] David Lacey, Neil D. Jones, Eric Van Wyk, and Carl Christian Frederiksen. Proving correctness of compiler optimizations by temporal logic. In Conference Record of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 2002.

[10] Sorin Lerner, David Grove, and Craig Chambers. Composing dataflow analyses and transformations. In Conference Record of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 2002.

[11] J. McCarthy and J. Painter. Correctness of a compiler for arithmetic expressions. In J. T. Schwartz, editor, Proceedings of Symposia in Applied Mathematics, January 1967.

[12] E. Morel and C. Renvoise. Global optimization by suppression of partial redundancies. Communications of the ACM, 22(2):96-103, February 1979.

[13] F. Lockwood Morris. Advice on structuring compilers and proving them correct. In Conference Record of the 1st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 1973.

[14] Greg Morrisett, Karl Crary, Neal Glew, Dan Grossman, Richard Samuels, Frederick Smith, David Walker, Stephanie Weirich, and Steve Zdancewic. TALx86: A realistic typed assembly language. In 1999 ACM SIGPLAN Workshop on Compiler Support for System Software, pages 25-35, Atlanta, GA, USA, May 1999.

[15] Greg Morrisett, David Walker, Karl Crary, and Neal Glew.
From System F to Typed Assembly Language. ACM Transactions on Programming Languages and Systems, 21(3):528–569, May 1999.
[16] George C. Necula. Proof-carrying code. In Conference Record of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 1997.
[17] George C. Necula. Translation validation for an optimizing compiler. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 83–95, Vancouver, Canada, June 2000.
[18] George C. Necula and Peter Lee. The design and implementation of a certifying compiler. In Proceedings of the ACM SIGPLAN '98 Conference on Programming Language Design and Implementation, June 1998.
[19] George C. Necula and Peter Lee. Proof generation in the Touchstone theorem prover. In Proceedings of the International Conference on Automated Deduction, pages 25–44, Pittsburgh, Pennsylvania, June 2000. Springer-Verlag LNAI 1831.
[20] Greg Nelson and Derek C. Oppen. Simplification by cooperating decision procedures. ACM Transactions on Programming Languages and Systems, 1(2):245–257, October 1979.
[21] D. P. Oliva, J. Ramsdell, and M. Wand. The VLISP verified PreScheme compiler. Lisp and Symbolic Computation, 8(1-2):111–182, 1995.
[22] S. Owre, S. Rajan, J. M. Rushby, N. Shankar, and M. K. Srivas. PVS: Combining specification, proof checking, and model checking. In Computer-Aided Verification, CAV '96, volume 1102 of Lecture Notes in Computer Science, pages 411–414, New Brunswick, NJ, July/August 1996. Springer-Verlag.
[23] Martin Rinard. Credible compilation. Technical Report MIT-LCS-TR-776, Massachusetts Institute of Technology, March 1999.
[24] Martin Rinard and Darko Marinov. Credible compilation. In Proceedings of the FLoC Workshop on Run-Time Result Verification, July 1999.
[25] Simplify automatic theorem prover home page. http://research.compaq.com/SRC/esc/Simplify.html.
[26] Bernhard Steffen. Data flow analysis as model checking. In T. Ito and A. R. Meyer, editors, Theoretical Aspects of Computer Software (TACS '91), Sendai, Japan, volume 526 of Lecture Notes in Computer Science, pages 346–364, Heidelberg, Germany, September 1991. Springer-Verlag.
[27] David Tarditi, Greg Morrisett, Perry Cheng, Chris Stone, Robert Harper, and Peter Lee. TIL: A type-directed optimizing compiler for ML. In Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design and Implementation, May 1996.
[28] William D. Young. A mechanically verified code generator. Journal of Automated Reasoning, 5(4):493–518, December 1989.

A Additional Optimizations

A.1 Optimizations

Copy propagation

stmt(Y := Z)
followed by ¬mayDef(Z) ∧ ¬mayDef(Y)
until X := Y ⇒ X := Z
with witness η(Y) = η(Z)

Constant propagation

stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C
with witness η(Y) = C

stmt(X := C)
followed by ¬mayDef(X)
until if X goto P1 else P2 ⇒ if C goto P1 else P2
with witness η(X) = C

stmt(Y := C)
followed by ¬mayDef(Y)
until X := op B Y ⇒ X := op B C
with witness η(Y) = C

stmt(Y := C)
followed by ¬mayDef(Y)
until X := op Y B ⇒ X := op C B
with witness η(Y) = C

Constant folding

true
followed by true
until X := op C1 C2 ⇒ X := C    (C = op(C1, C2))
with witness true

The above guard, true followed by true, holds at all nodes.
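To make the effect of these rules concrete, the following is a minimal Python sketch of our own (it is not the paper's dataflow engine or optimization language) showing what constant propagation and constant folding do to a straight-line block. The statement encoding and the function name propagate_and_fold are ours; because the block has no branches, pointers, or calls, the "followed by ¬mayDef(Y)" guard degenerates to "Y has not been reassigned since Y := C".

import operator

OPS = {"+": operator.add, "*": operator.mul}           # interpretation of op symbols

def propagate_and_fold(block):
    """block: list of ('assign', x, rhs) where rhs is a constant, a variable name,
    or ('op', op_symbol, a, b).  Returns the rewritten block."""
    known = {}                                          # var -> constant it must hold here
    out = []
    for (_, x, rhs) in block:
        if isinstance(rhs, str) and rhs in known:       # X := Y  with  eta(Y) = C
            rhs = known[rhs]                            # rewrite to X := C
        if isinstance(rhs, tuple) and rhs[0] == "op":
            _, op, a, b = rhs
            a = known.get(a, a) if isinstance(a, str) else a
            b = known.get(b, b) if isinstance(b, str) else b
            if isinstance(a, int) and isinstance(b, int):
                rhs = OPS[op](a, b)                     # X := op C1 C2  =>  X := C
            else:
                rhs = ("op", op, a, b)
        known[x] = rhs if isinstance(rhs, int) else None
        if known[x] is None:
            known.pop(x)                                # assignment kills the known-constant fact
        out.append(("assign", x, rhs))
    return out

if __name__ == "__main__":
    block = [("assign", "y", 3),
             ("assign", "x", "y"),                      # becomes x := 3
             ("assign", "z", ("op", "+", "x", 4))]      # becomes z := 7
    print(propagate_and_fold(block))

Running the example rewrites x := y to x := 3 and folds z := x + 4 to z := 7, mirroring the first constant-propagation rule and the constant-folding rule above.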
Branch folding

In the following optimizations, we use goto P1 as sugar for if true goto P1 else P1.

true
followed by true
until if true goto P1 else P2 ⇒ goto P1
with witness true

true
followed by true
until if C goto P1 else P2 ⇒ goto P2    (C ≠ true)
with witness true

Common subexpression elimination

unchanged(E) ∧ stmt(Z := E)
followed by ¬mayDef(Z) ∧ unchanged(E)
until X := E ⇒ X := Z
with witness η(Z) = η(E)

Load removal

stmt(Y := &Z)
followed by ¬mayDef(Y) ∧ ¬stmt(decl Z)
until X := *Y ⇒ X := Z
with witness η(Y) = η(&Z)

Dead assignment elimination

(synDef(X) ∨ stmt(return ...)) ∧ ¬mayUse(X)
preceded by ¬mayUse(X)
until X := E ⇒ skip
with witness η_old/X = η_new/X

Code hoisting (PRE)

This is the PRE optimization from Section 2.3, with pseudocode for a sample choose function. The choose function shown here is Steffen's optimal computation placement condition [26], which intuitively tries to place duplicates as early as possible. A duplicate is placed at node n if placing the duplicate any earlier would either not be legal or would cover different redundancies from those covered by the placement at n. This is just one possible choose function, given here for illustrative purposes.

stmt(X := E) ∧ unchanged(E)
preceded by unchanged(E) ∧ ¬mayDef(X) ∧ ¬mayUse(X)
until skip ⇒ X := E
with witness η_old/X = η_new/X
filtered through
    let φ1 = stmt(X := E) ∧ unchanged(E)
    let φ2 = unchanged(E) ∧ ¬mayDef(X) ∧ ¬mayUse(X)
    let φ3 = ¬unchanged(E) ∨ mayUse(X) ∨ (mayDef(X) ∧ ¬stmt(X := E))
    in { (ι, θ) | for all paths in the CFG leading to ι: there exists a node at which φ3 holds, followed by nodes at which ¬legal holds, followed by the node with index ι }
    where legal holds at a node with index ι' if for all paths in the CFG from ι': there exists a node at which φ1 holds, preceded by nodes at which φ2 holds, preceded by the node with index ι'

Code sinking (PDE)

We show a version of code sinking that performs partial dead code elimination. We perform code sinking by inserting copies of certain computations downstream of where they occur in the CFG, and then running dead-assignment elimination to remove the original computations. Our code sinking optimization is symmetric to the code hoisting optimization described above, and the intuition for how it works is the same.

stmt(X := E) ∧ unchanged(E)
followed by unchanged(E) ∧ ¬mayDef(X) ∧ ¬mayUse(X)
until skip ⇒ X := E
with witness η(X) = η(E)
filtered through
    let φ1 = stmt(X := E) ∧ unchanged(E)
    let φ2 = unchanged(E) ∧ ¬mayDef(X) ∧ ¬mayUse(X)
    let φ3 = ¬unchanged(E) ∨ mayUse(X) ∨ (mayDef(X) ∧ ¬stmt(X := E))
    in { (ι, θ) | for all paths in the CFG from ι: there exists a node at which φ3 holds, preceded by nodes at which ¬legal holds, preceded by the node with index ι }
    where legal holds at a node with index ι' if for all paths in the CFG leading to ι': there exists a node at which φ1 holds, followed by nodes at which φ2 holds, followed by the node with index ι'

A.2 Analyses

Tainted-variable analysis

stmt(decl X)
followed by ¬stmt(... := &X)
defines notTainted(X)
with witness notPointedTo(X, η)

where notPointedTo is defined as:

notPointedTo(X, η) ≜ ∀l ∈ domain(store(η)). store(η)(l) ≠ η(&X)
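The intuition behind notTainted is that a variable whose address is never taken cannot be reached through the store, so neither indirect stores nor callees can modify it. The following Python sketch is our own flow-insensitive illustration of that intuition over a toy statement encoding; the analysis above is flow-sensitive, starting at decl X and checking the "followed by" condition along CFG paths, so this sketch is only an approximation of its spirit.

def address_taken(stmts):
    """stmts: list of (lhs, rhs) pairs over a toy IR in which an rhs of the form
    ('&', x) takes the address of x.  Returns the set of possibly tainted variables."""
    tainted = set()
    for _lhs, rhs in stmts:
        if isinstance(rhs, tuple) and rhs[0] == "&":
            tainted.add(rhs[1])
    return tainted

def not_tainted(x, stmts):
    # x is never pointed to if its address is never taken anywhere in the procedure
    return x not in address_taken(stmts)

if __name__ == "__main__":
    prog = [("p", ("&", "a")),     # p := &a   -> a is possibly tainted
            ("b", 1),              # b := 1
            ("*p", 5)]             # *p := 5   may write a, but can never write b
    assert not not_tainted("a", prog)
    assert not_tainted("b", prog)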
Simple points-to analysis

The following is a simple points-to analysis. In Section A.3, we show how this label is combined with the notTainted label to define the doesNotPointTo label, which in turn is used to define labels such as mayDef, mayUse, and unchanged. This optimization uses npMayDef, which is a version of mayDef that does not use pointer information.

stmt(X := &Z) ∧ ¬synUse(Y) ∧ hasBeenDeclared(Y)
followed by ¬npMayDef(X)
defines simpleNotPntTo(X, Y)
with witness η(X) ≠ η(&Y)

Has-been-declared analysis

stmt(decl X)
followed by true
defines hasBeenDeclared(X)
with witness X ∈ domain(env(η))

A.3 Label definitions

We now define the labels used in this paper. Labels that start with np (e.g., npMayDef, npMayUse, npUnchanged) are versions of the labels that do not use pointer information (and thus are more conservative).

synDef(Y) ≜ stmt(Y := ...)
synUse(Y) ≜ stmt(... := ... Y ...) ∨ stmt(*Y := ...) ∨ stmt(if Y ...)
npMayDef(Y) ≜ synDef(Y) ∨ stmt(*X := ...) ∨ stmt(... := P(...))
npMayUse(Y) ≜ synUse(Y) ∨ stmt(... := *X) ∨ stmt(... := P(...))
doesNotPointTo(X, Y) ≜ simpleNotPntTo(X, Y) ∨ notTainted(Y)
mayPointTo(X, Y) ≜ ¬doesNotPointTo(X, Y)
mayDef(Y) ≜ synDef(Y) ∨ (stmt(*X := ...) ∧ mayPointTo(X, Y)) ∨ (stmt(... := P(...)) ∧ ¬notTainted(Y))
mayUse(Y) ≜ synUse(Y) ∨ (stmt(... := *X) ∧ mayPointTo(X, Y)) ∨ stmt(... := P(...))

The unchanged label is defined as follows. When E is not *X, unchanged(E) is defined as the conjunction of ¬mayDef for all the variables in E. For *X, unchanged(*X) is defined as follows:

unchanged(*X) ≜ ¬mayDef(X) ∧ ¬stmt(*Y := ...) ∧ ¬stmt(... := P(...)) ∧ (stmt(Y := ...) ⇒ doesNotPointTo(X, Y))

npUnchanged is a version of unchanged that does not use pointer information. For *X, npUnchanged(*X) is false. When E is not *X, it is defined as the conjunction of ¬npMayDef for all the variables in E.

B Formalization

B.1 Semantics of the Intermediate Language

B.1.1 Preliminaries

The set of indices of program π is denoted Indices_π. The set of indices of procedure p is denoted Indices_p. The formal argument name of procedure p in program π is denoted formal_p. The index of the first statement in procedure p of program π is denoted start_p. We assume WLOG that no if statements in a procedure p refer to indices not in Indices_p, nor to the index start_p (footnote 6).

The arity of an operator op is denoted arity(op). We assume a fixed interpretation function for each n-ary operator symbol op: ⟦op⟧ : Consts^n → Consts. We assume an infinite set Locations of memory locations, with metavariable l ranging over the set. We assume that the set Consts is disjoint from Locations and contains the distinguished elements true and uninit. Then the set of values is defined as Values = Locations ∪ Consts.

An environment is a partial function ρ : Vars ⇀ Locations; we denote by Environments the set of all environments. A store is a partial function σ : Locations ⇀ Values; we denote by Stores the set of all stores. The domain of an environment ρ is denoted dom(ρ), and similarly for the domain of a store. The notation ρ[x ↦ l] denotes the environment identical to ρ but with variable x mapping to location l; if x ∈ dom(ρ), the old mapping for x is shadowed by the new one. The notation σ[l ↦ v] is defined similarly. The notation σ/{l1, ..., li} denotes the store identical to σ except that all pairs (l, v) ∈ σ such that l ∈ {l1, ..., li} are removed.

The current dynamic call chain is represented by a stack. A stack frame is a triple f = (ι, l, ρ) ∈ Indices × Locations × Environments. Here ι is the index of the first statement following the call currently being executed, l is the location in which to put the return value from the call, and ρ is the current lexical environment at the point of the call. We denote by Frames the set of all stack frames.
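For readers who find an executable rendering helpful, the following Python sketch gives one possible concrete representation of the state components just defined (environments, stores, and stack frames). The specific Python types and field names are our choice; only the shapes follow the definitions above.

from dataclasses import dataclass
from typing import Dict, Union

Location = int                        # an element of Locations
Const = Union[int, bool, str]         # Consts, including 'uninit' and true
Value = Union[Location, Const]        # Values = Locations ∪ Consts

Environment = Dict[str, Location]     # rho : Vars -> Locations (partial)
Store = Dict[Location, Value]         # sigma : Locations -> Values (partial)

@dataclass
class Frame:                          # f = (iota, l, rho)
    return_index: int                 # index of the first statement after the call
    result_loc: Location              # location in which to put the return value
    saved_env: Environment            # caller's lexical environment at the call

if __name__ == "__main__":
    rho: Environment = {"x": 0}
    sigma: Store = {0: "uninit"}
    print(Frame(return_index=7, result_loc=0, saved_env=rho), rho, sigma)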
A stack S = ⟨f1 ... fn⟩ ∈ Stacks is a sequence of stack frames. The set of all stacks is denoted Stacks. Stacks support two operations, defined as follows:

push : (Frames × Stacks) → Stacks
push(f, ⟨f1 ... fn⟩) = ⟨f f1 ... fn⟩

pop : Stacks ⇀ (Frames × Stacks)
pop(⟨f1 f2 ... fn⟩) = (f1, ⟨f2 ... fn⟩), where n > 0

Finally, a memory allocator M is an infinite stream ⟨l1, l2, ...⟩ of locations. We denote the set of all memory allocators as MemAllocs.

A state of execution of a program π is a quintuple η = (ι, ρ, σ, S, M) where ι ∈ Indices_π, ρ ∈ Environments, σ ∈ Stores, S ∈ Stacks, and M ∈ MemAllocs. We denote the set of program states by States. We refer to the corresponding index of a state η as index(η), and we similarly define accessors env, store, stack, and mem.

Footnote 6: This last restriction ensures that the entry node of p's CFG will have no incoming edges.

B.1.2 State Transition Functions

The evaluation of an expression e in a program state η, where env(η) = ρ and store(η) = σ, is given by the function η(e) : (States × Exprs) → Values defined by:

η(x) = σ(ρ(x))        where x ∈ dom(ρ), ρ(x) ∈ dom(σ)
η(c) = c
η(*x) = σ(σ(ρ(x)))    where x ∈ dom(ρ), ρ(x) ∈ dom(σ), σ(ρ(x)) ∈ dom(σ)
η(&x) = ρ(x)          where x ∈ dom(ρ)
η(op b1 ... bn) = ⟦op⟧(η(b1), ..., η(bn))    where arity(op) = n and for all 1 ≤ j ≤ n, η(bj) ∈ Consts

Similarly, the evaluation of a locatable expression lhs in a state η, where env(η) = ρ and store(η) = σ, is given by the function η_l(lhs) : (States × Locatables) → Locations defined by:

η_l(x) = ρ(x)         where x ∈ dom(ρ)
η_l(*x) = σ(ρ(x))     where x ∈ dom(ρ), ρ(x) ∈ dom(σ), σ(ρ(x)) ∈ Locations

Definition 3 Given a program π, the state transition function →_π ⊆ States × States is defined by:

- If stmtAt(π, ι) = decl x then (ι, ρ, σ, S, ⟨l, l1, l2, ...⟩) →_π (ι+1, ρ[x ↦ l], σ[l ↦ uninit], S, ⟨l1, l2, ...⟩), where l ∉ dom(σ).
- If stmtAt(π, ι) = skip then (ι, ρ, σ, S, M) →_π (ι+1, ρ, σ, S, M).
- If stmtAt(π, ι) = (lhs := e) then (ι, ρ, σ, S, M) →_π (ι+1, ρ, σ[η_l(lhs) ↦ η(e)], S, M), where η = (ι, ρ, σ, S, M).
- If stmtAt(π, ι) = (x := new) then (ι, ρ, σ, S, ⟨l, l1, l2, ...⟩) →_π (ι+1, ρ, σ[ρ(x) ↦ l][l ↦ uninit], S, ⟨l1, l2, ...⟩), where x ∈ dom(ρ), l ∉ dom(σ).
- If stmtAt(π, ι) = (x := p'(b)) then (ι, ρ, σ, S, ⟨l, l1, l2, ...⟩) →_π (ι', {(y, l)}, σ[l ↦ η(b)], push(f, S), ⟨l1, l2, ...⟩), where η = (ι, ρ, σ, S, ⟨l, l1, l2, ...⟩), ι' = start_{p'}, y = formal_{p'}, l ∉ dom(σ), x ∈ dom(ρ), f = (ι+1, ρ(x), ρ).
- If stmtAt(π, ι) = (if b goto ι1 else ι2) then (ι, ρ, σ, S, M) →_π (ι1, ρ, σ, S, M), where (ι, ρ, σ, S, M)(b) = true.
- If stmtAt(π, ι) = (if b goto ι1 else ι2) then (ι, ρ, σ, S, M) →_π (ι2, ρ, σ, S, M), where (ι, ρ, σ, S, M)(b) ≠ true.
- If stmtAt(π, ι) = (return x) then (ι, ρ, σ, S, M) →_π (ι', ρ', σ', S', M), where pop(S) = ((ι', l', ρ'), S'), dom(ρ) = {x1, ..., xi}, and σ' = (σ/{ρ(x1), ..., ρ(xi)})[l' ↦ (ι, ρ, σ, S, M)(x)].

We denote the reflexive, transitive closure of →_π by →*_π.

Definition 4 Given a program π, the intraprocedural state transition function ↪→_π ⊆ States × States is defined by:

- If stmtAt(π, index(η)) is not a procedure call, then η ↪→_π η' where η →_π η'.
- If stmtAt(π, index(η)) is a procedure call, then η ↪→_π η' where η →_π η'' →*_π η' and η' is the first state on that trace after η such that stack(η') = stack(η).

We denote the reflexive, transitive closure of ↪→_π by ↪→*_π. We write η ↛_π if there does not exist some η' such that η ↪→_π η'.
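As an illustration only (not part of the formalization), a few of the transition rules of Definition 3 can be transcribed directly into a small step function. The sketch below uses a statement encoding of our own and covers just skip, assignment to a variable, and the conditional branch; declarations, allocation, calls, returns, and pointer-valued left-hand sides are omitted.

def eval_expr(env, store, e):
    """eta(e) for the expression forms used below: constants and variables."""
    if isinstance(e, str):                    # a variable x:  eta(x) = sigma(rho(x))
        return store[env[e]]
    return e                                  # a constant c:  eta(c) = c

def step(program, state):
    """One ->_pi step.  program: dict index -> statement; state: (i, env, store, stack, mem)."""
    i, env, store, stack, mem = state
    stmt = program[i]
    if stmt == ("skip",):
        return (i + 1, env, store, stack, mem)
    if stmt[0] == "assign":                   # ('assign', x, e): store[rho(x) -> eta(e)]
        _, x, e = stmt
        new_store = dict(store)
        new_store[env[x]] = eval_expr(env, store, e)
        return (i + 1, env, new_store, stack, mem)
    if stmt[0] == "if":                       # ('if', b, i1, i2): branch on eta(b) = true
        _, b, i1, i2 = stmt
        return ((i1 if eval_expr(env, store, b) is True else i2), env, store, stack, mem)
    raise NotImplementedError(stmt)

if __name__ == "__main__":
    prog = {0: ("assign", "x", 1), 1: ("if", True, 3, 2), 2: ("skip",), 3: ("skip",)}
    st = (0, {"x": 0}, {0: "uninit"}, [], [])
    st = step(prog, st)       # executes x := 1
    st = step(prog, st)       # branches on true to index 3
    print(st)                 # (3, {'x': 0}, {0: 1}, [], [])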
B.1.3 Programs and Program Transformations

Definition 5 The semantic function of a program π is the partial function ⟦π⟧ : Consts × MemAllocs ⇀ Values defined by:

⟦π⟧(c, ⟨l, l1, l2, ...⟩) = σ(l)

where (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) →*_π (ι+1, {(x, l)}, σ, ⟨⟩, M) and ι ∉ Indices_π and stmtAt(π, ι) is defined to be x := main(c).

Definition 6 We say that π' is a semantically equivalent transformation of π if for all c, M such that ⟦π⟧(c, M) = v, it is the case that ⟦π'⟧(c, M) = v.

B.2 Semantics of Optimizations

Let stmtAt(p, ι) denote the statement at index ι of procedure p.

Definition 7 The control flow graph (CFG) of a procedure p is a graph (Indices_p, →cfg), where →cfg ⊆ Indices_p × Indices_p and ι1 →cfg ι2 if and only if:

(stmtAt(p, ι1) ∈ {decl x, lhs := e, skip, x := new, x := p'(b), return x} ∧ ι2 = ι1 + 1) ∨
(stmtAt(p, ι1) = (if b goto ι else ι') ∧ (ι2 = ι ∨ ι2 = ι'))

Each node with index ι is annotated with the label stmt(s), where s = stmtAt(p, ι). The definition of ι ⊨_p^θ φ, which evaluates a formula φ at the node for index ι in the CFG of p, is as follows:

ι ⊨_p^θ true ⟺ true
ι ⊨_p^θ false ⟺ false
ι ⊨_p^θ ¬φ ⟺ not ι ⊨_p^θ φ
ι ⊨_p^θ φ1 ∨ φ2 ⟺ ι ⊨_p^θ φ1 or ι ⊨_p^θ φ2
ι ⊨_p^θ φ1 ∧ φ2 ⟺ ι ⊨_p^θ φ1 and ι ⊨_p^θ φ2
ι ⊨_p^θ pr ⟺ θ(pr) ∈ L_p(ι)

B.3 Proof Obligations

B.3.1 Forward Optimizations

Consider a forward transformation pattern of the following form:

φ1 followed by φ2 until s ⇒ s' with witness P

The optimization-specific obligations, to be discharged by an automatic theorem prover, are as follows:

F1. If η ↪→_π η' and index(η) ⊨_p^θ φ1, then θ(P)(η').
F2. If θ(P)(η) and η ↪→_π η' and index(η) ⊨_p^θ φ2, then θ(P)(η').
F3. If θ(P)(η) and η ↪→_π η' and ι = index(η) and stmtAt(π, ι) = θ(s) and stmtAt(π', ι) = θ(s'), then η ↪→_{π'} η'.

B.3.2 Backward Optimizations

Consider a backward transformation pattern of the following form:

φ1 preceded by φ2 until s ⇒ s' with witness P

We require that the witness P match program points in the original and transformed states, or in other words that P ⇒ (index(η_old) = index(η_new)). The optimization-specific obligations, to be discharged by an automatic theorem prover, are as follows:

B1. If η ↪→_π η_old and η ↪→_{π'} η_new and ι = index(η) and stmtAt(π, ι) = θ(s) and stmtAt(π', ι) = θ(s'), then θ(P)(η_old, η_new).
B2. If θ(P)(η_old, η_new) and η_old ↪→_π η'_old and ι_old = index(η_old) and ι_new = index(η_new) and ι_old ⊨_p^θ φ2 and stmtAt(π, ι_old) = stmtAt(π', ι_new), then there exists some η'_new such that η_new ↪→_{π'} η'_new and θ(P)(η'_old, η'_new).
B3. If θ(P)(η_old, η_new) and η_old ↪→_π η and ι_old = index(η_old) and ι_new = index(η_new) and ι_old ⊨_p^θ φ1 and stmtAt(π, ι_old) = stmtAt(π', ι_new), then η_new ↪→_{π'} η.

In rule B1, we assume that both programs can step. However, we in fact need to prove that the transformed program steps if the original one does, in order to show that the transformed program is semantically equivalent to the original one. Unfortunately, it is not possible to prove this for B1 using only local knowledge. Therefore, we allow B1 to assume that the transformed program steps, and we separately prove the property using some additional obligations. We introduce the notion of an error predicate E(η). Intuitively, the error predicate says what the state of the original program must look like, at the point in the trace where a transformation is allowed, if that transformation would get stuck.
We then show that the error predicate would continue to hold on the original program throughout the witnessing region, eventually implying that the original program itself will get stuck. So we will have shown that the transformed program gets stuck only if the original one does. We currently infer the error predicate: it is simply a predicate stating the conditions under which the transformed statement s' is "stuck", i.e., cannot take a step. This inference has been sufficient to automatically prove soundness of all the backward optimizations we have written. However, in the obligations below, we allow an arbitrary error predicate to be specified.

B1'. If η ↪→_π η_old and η ↛_{π'} and ι = index(η) and stmtAt(π, ι) = θ(s) and stmtAt(π', ι) = θ(s'), then θ(E)(η_old).
B2'. If θ(E)(η) and η ↪→_π η' and ι = index(η) and ι ⊨_p^θ φ2, then θ(E)(η').
B3'. If θ(E)(η) and ι = index(η) and ι ⊨_p^θ φ1, then η ↛_π.

B.3.3 Analyses

Consider a pure analysis of the following form:

φ1 followed by φ2 defines label with witness P

The optimization-specific obligations, to be discharged by an automatic theorem prover, are as follows:

A1. If η ↪→_π η' and index(η) ⊨_p^θ φ1, then θ(P)(η').
A2. If θ(P)(η) and η ↪→_π η' and index(η) ⊨_p^θ φ2, then θ(P)(η').

B.4 Metatheory

B.4.1 Forward Optimizations

Theorem 1 If O is a forward optimization with transformation pattern φ1 followed by φ2 until s ⇒ s' with witness P satisfying conditions F1, F2, and F3, then O is sound.

Proof: Let π be an intermediate-language program, let p be a procedure in π, let A ⊆ ⟦O_pat⟧(p), and let π' be the program identical to π but with p replaced by app(s', p, A). It suffices to show that π' is a semantically equivalent transformation of π. Let c be a constant and M be a memory allocator such that ⟦π⟧(c, M) = v. By Definition 6 we must show that also ⟦π'⟧(c, M) = v. Since ⟦π⟧(c, M) = v, by Definition 5 we have that (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) →*_π (ι+1, {(x, l)}, σ, ⟨⟩, M') and stmtAt(π, ι) is defined to be x := main(c) and ι ∉ Indices_π and v = σ(l), where M = ⟨l, l1, l2, ...⟩.

Define ⇝_π to act like →_π ordinarily, but to act like ↪→_π when executing a state at some node with index ι' such that (ι', θ) ∈ A for some θ (and define ⇝_{π'} analogously). Then we have that (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) ⇝_π η1 ⇝_π η2 ⇝_π ... ⇝_π η_{k-1} ⇝_π (ι+1, {(x, l)}, σ, ⟨⟩, M') for some η1, ..., η_{k-1}. To prove that ⟦π'⟧(c, M) = v we will show that also (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) ⇝_{π'} η1 ⇝_{π'} η2 ⇝_{π'} ... ⇝_{π'} η_{k-1} ⇝_{π'} (ι+1, {(x, l)}, σ, ⟨⟩, M'), where stmtAt(π', ι) is defined to be x := main(c). Let η_k = (ι+1, {(x, l)}, σ, ⟨⟩, M'). We show by induction on k that every prefix of the trace in π up to η_j, for all 1 ≤ j ≤ k, is mirrored in π'.

- Case j = 1. Since stmtAt(π, ι) = stmtAt(π', ι) = x := main(c) and (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) ⇝_π η1, by definition of ⇝_π we have (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) →_π η1. Therefore also (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) →_{π'} η1, so (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) ⇝_{π'} η1.

- Case 1 < j ≤ k. By induction we have that (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) ⇝_{π'} η1 ⇝_{π'} ... ⇝_{π'} η_{j-1}. Let index(η_{j-1}) = ι_{j-1}. We have two sub-cases:

  - ¬∃θ. (ι_{j-1}, θ) ∈ A. Then by the definition of π' we have stmtAt(π, ι_{j-1}) = stmtAt(π', ι_{j-1}). Therefore, since η_{j-1} ⇝_π η_j, by definition of ⇝_π we have η_{j-1} →_π η_j. Then also η_{j-1} →_{π'} η_j, so η_{j-1} ⇝_{π'} η_j and the result follows.

  - ∃θ. (ι_{j-1}, θ) ∈ A. Then by the definition of π', there is some θ such that stmtAt(π', ι_{j-1}) = θ(s').
Then also (ι_{j-1}, θ) ∈ ⟦O_pat⟧(p), so we have ι_{j-1} ⊨_p^θ stmt(s), so stmtAt(π, ι_{j-1}) = θ(s). By definition of ⇝_π we know that (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) →*_π η_{j-1}, and we also have stmtAt(π, ι) = x := main(c) and ι_{j-1} ∈ Indices_p. Assume (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) →_π η'_1 →_π ... →_π η'_v, where η'_v = η_{j-1}. Then there must be some t such that 1 ≤ t < v and index(η'_t) = start_p, representing the first statement executed on the same invocation of p as in η_{j-1}. Then let η''_w, ..., η''_1 be identical to the sequence η'_v, ..., η'_t, but with all states that are not in the same invocation of p as in η_{j-1} removed. Let index(η''_x) = ι''_x for all 1 ≤ x ≤ w. Then η''_w = η_{j-1}. It is easy to show that η''_1 ↪→_π ... ↪→_π η''_w. Also, by the definition of an intraprocedural CFG, the nodes with indices ι''_w, ..., ι''_1 represent a backward path in the CFG of p to the entry node. Therefore, since (ι_{j-1}, θ) ∈ ⟦O_pat⟧(p), it follows that there exists some r such that 1 ≤ r < w and ι''_r ⊨_p^θ φ1, and for all q such that r < q < w we have ι''_q ⊨_p^θ φ2.

First we prove that ∀q. (r < q ≤ w) ⇒ θ(P)(η''_q). We prove it by induction on q:

  - (base case) q = r + 1. So we have η''_r ↪→_π η''_q and index(η''_r) ⊨_p^θ φ1, and the result follows from condition F1.
  - (inductive case) q > r + 1. By the inductive hypothesis we have θ(P)(η''_{q-1}). We also know that η''_{q-1} ↪→_π η''_q and, since r < q-1 < w, index(η''_{q-1}) ⊨_p^θ φ2. Then the result follows from condition F2.

So we have shown in particular that θ(P)(η_{j-1}) holds. We saw above that η_{j-1} ⇝_π η_j, and by definition of ⇝_π that means η_{j-1} ↪→_π η_j. We also know that stmtAt(π, ι_{j-1}) = θ(s) and stmtAt(π', ι_{j-1}) = θ(s'). Then by condition F3 we have η_{j-1} ↪→_{π'} η_j, so also η_{j-1} ⇝_{π'} η_j and the result follows.

B.4.2 Backward Optimizations

Lemma 1 Let O be a backward optimization with transformation pattern φ1 preceded by φ2 until s ⇒ s' with witness P and error predicate E such that B1'-B3' hold. Let p be a procedure, let π be a program containing p, let ι ∈ Indices_p, and let stmtAt(π, ι) = θ(s). Let η be a state such that index(η) = ι. Let π' be a program such that stmtAt(π', ι) = θ(s') and η ↛_{π'}. If η ↪→_π η1 ↪→_π ... ↪→_π η_k and for all 1 ≤ j < k we have index(η_j) ⊨_p^θ φ2 and index(η_k) ⊨_p^θ φ1, then η_k ↛_π.

Proof: We will first prove by induction on k that θ(E)(η_j) holds for all 1 ≤ j ≤ k.

- Case j = 1. Since η ↪→_π η1 and η ↛_{π'} and stmtAt(π, ι) = θ(s) and stmtAt(π', ι) = θ(s'), the result follows from B1'.
- Case 1 < j ≤ k. By induction, assume that θ(E)(η_{j-1}) holds. We are given that η_{j-1} ↪→_π η_j, and since j-1 < k we also have that index(η_{j-1}) ⊨_p^θ φ2. Then the result follows from B2'.

So in particular we have shown that θ(E)(η_k) holds. We are given that index(η_k) ⊨_p^θ φ1, so by B3' we have η_k ↛_π.

Theorem 2 If O is a backward optimization with transformation pattern φ1 preceded by φ2 until s ⇒ s' with witness P, with error predicate E, and satisfying conditions B1, B2, B3, B1', B2', and B3', then O is sound.

Proof: Let π be an intermediate-language program, let p be a procedure in π, let A ⊆ ⟦O_pat⟧(p), and let π' be the program identical to π but with p replaced by app(s', p, A). It suffices to show that π' is a semantically equivalent transformation of π.

We define an infinite family of generalized intermediate-language programs as follows. Let π_j denote the program that acts like π' for the first j states but henceforth acts like π. Formally, we define the transition relation of π_j directly as a relation →_{π_j}
on prefixes of execution traces, rather than as a relation on states. Let T = [η1 ... η_r] denote a partial trace of π_j such that index(η1) ∉ Indices_{π_j} and the statement of π_j at index(η1) is a call to main. We say that T →_{π_j} T' if and only if T' = [η1 ... η_{r+1}], where

r ≤ j ⇒ η_r →_{π'} η_{r+1}
r > j ⇒ η_r →_π η_{r+1}

Let →*_{π_j} denote the reflexive, transitive closure of →_{π_j}. We also define an intraprocedural version ↪→_{π_j} in the identical way that ↪→_π is defined from →_π. Finally, we define the semantic function of π_j by the straightforward modification of Definition 5.

We prove that for all j ≥ 1, π_j is a semantically equivalent transformation of π. Since π' = π_∞ and the semantic equivalence relation is transitive, it then follows easily that π' is a semantically equivalent transformation of π. The proof proceeds by induction on j.

For the base case, j = 1. Let c be a constant and M = ⟨l, l1, l2, ...⟩ such that ⟦π⟧(c, M) = v. By Definition 6 we must show that ⟦π_1⟧(c, M) = v. By Definition 5 we have that v = σ(l) and (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) →*_π (ι+1, {(x, l)}, σ, ⟨⟩, M) and stmtAt(π, ι) is defined to be x := main(c) and ι ∉ Indices_π. Therefore assume that

(ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) →_π η2 →_π ... →_π η_k →_π (ι+1, {(x, l)}, σ, ⟨⟩, M)

Let η1 = (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) and η_{k+1} = (ι+1, {(x, l)}, σ, ⟨⟩, M). Also, the statement of π_1 at ι is defined to be x := main(c). Then I claim that

[η1] →_{π_1} [η1, η2] →_{π_1} ... →_{π_1} [η1, ..., η_k, η_{k+1}]

If we can prove this, then the result follows. We prove inductively that each transition in the above sequence of transitions holds.

- Base Case. We must show that [η1] →_{π_1} [η1, η2]. We are given that η1 →_π η2 and stmtAt(π, ι) = stmtAt(π_1, ι) = x := main(c). Then η1 →_{π'} η2, so by the definition of →_{π_1} the result follows.
- Inductive Case. By induction we have [η1] →_{π_1} [η1, η2] →_{π_1} ... →_{π_1} [η1, ..., η_q], for some 1 < q ≤ k. We are given that η_q →_π η_{q+1}. Then by the definition of →_{π_1} we have that [η1, ..., η_q] →_{π_1} [η1, ..., η_{q+1}].

For the inductive case, j > 1 and π_{j-1} is a semantically equivalent transformation of π. We will prove that π_j is a semantically equivalent transformation of π_{j-1}. Let c be a constant and M = ⟨l, l1, l2, ...⟩ such that ⟦π⟧(c, M) = v. It suffices to show that ⟦π_j⟧(c, M) = v. Since π_{j-1} is a semantically equivalent transformation of π, we know that ⟦π_{j-1}⟧(c, M) = v. Then we have that v = σ(l) and [(ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩)] →*_{π_{j-1}} [(ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩), η2, ..., η_k, (ι+1, {(x, l)}, σ, ⟨⟩, M)] and stmtAt(π, ι) = stmtAt(π_{j-1}, ι) is defined to be x := main(c) and ι ∉ Indices_π. Therefore assume that

[η1] →_{π_{j-1}} [η1, η2] →_{π_{j-1}} ... →_{π_{j-1}} [η1, ..., η_{k+1}]

where η1 = (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) and η_{k+1} = (ι+1, {(x, l)}, σ, ⟨⟩, M).

For each 1 ≤ t ≤ k+1, let Ψ(t, θ) denote the following predicate:

t > j ∧ (index(η_j), θ) ∈ A ∧ stmtAt(π, index(η_j)) = θ(s) ∧ stmtAt(π', index(η_j)) = θ(s') ∧ ∀m. ((j < m < t ∧ η_j ↪→*_π η_m) ⇒ index(η_m) ⊭_p^θ φ1)

Define ⇝_{π_{j-1}} as a view on the above execution trace, in the following way: ⇝_{π_{j-1}} acts like →_{π_{j-1}} ordinarily, but it acts like ↪→_{π_{j-1}} when it is either at a state η_t such that Ψ(t, θ) holds for some θ, or at the state η_j, where (index(η_j), θ) ∈ A. Then we have

[η'_1] ⇝_{π_{j-1}} [η'_1, η'_2] ⇝_{π_{j-1}} ... ⇝_{π_{j-1}} [η'_1, ..., η'_z]

where η1 = η'_1 and η_{k+1} = η'_z.
Then I claim that

[η''_1] ⇝_{π_j} [η''_1, η''_2] ⇝_{π_j} ... ⇝_{π_j} [η''_1, η''_2, ..., η''_z]

where ⇝_{π_j} acts like →_{π_j} ordinarily, but acts like ↪→_{π_j} when it is either at a state η''_y such that η'_y = η_t and Ψ(t, θ) holds for some θ, or at a state η''_y such that η'_y = η_j, where (index(η_j), θ) ∈ A. Further, each η''_y is defined as follows. For each 1 ≤ y ≤ z:

- If there exists θ such that Ψ(t, θ), where 1 ≤ t ≤ k+1 and η'_y = η_t, then θ(P)(η'_y, η''_y).
- Else η'_y = η''_y.

If we can prove this, then we have that η''_z = η'_z = η_{k+1}, and the result follows. We prove inductively that each of the partial traces in the sequence above exists.

- For the base case, y = 1. We saw above that η1 = η'_1. Since j > 1, we have 1 ≯ j, so ∀θ. ¬Ψ(1, θ). Therefore we must prove that η'_1 = η''_1. We are given that η1 = (ι, {(x, l)}, {(l, uninit)}, ⟨⟩, ⟨l1, l2, ...⟩) and stmtAt(π, ι) = x := main(c). Therefore [η'_1] is a valid partial trace for π_j.

- For the inductive case, y > 1 and [η''_1, ..., η''_{y-1}] is a valid partial trace for π_j with each component state meeting the definition above. We must show that there exists η''_y meeting the definition above such that [η''_1, ..., η''_y] is a valid partial trace for π_j. Let t be the integer between 1 and k+1 such that η'_{y-1} = η_t. There are several cases.

  - t < j. Since t ≯ j, by definition of Ψ we have that η''_{y-1} = η'_{y-1}. By the definition of ⇝_{π_{j-1}}, η'_{y-1} →_{π'} η'_y. Then by definition of ⇝_{π_j} we have [η''_1, ..., η''_{y-1}] ⇝_{π_j} [η''_1, ..., η''_{y-1}, η'_y]. Since t+1 ≯ j, by definition of Ψ we must show that η''_y = η'_y, so the result follows.

  - t = j. Then by definition of Ψ we have that η''_{y-1} = η'_{y-1}. There are two sub-cases.

    - ¬∃θ. (index(η_j), θ) ∈ A. Then by definition of ⇝_{π_{j-1}}, we have η'_{y-1} →_π η'_y. Also stmtAt(π, index(η_j)) = stmtAt(π', index(η_j)), so we have η'_{y-1} →_{π'} η'_y. Therefore by definition of ⇝_{π_j} we have [η''_1, ..., η''_{y-1}] ⇝_{π_j} [η''_1, ..., η''_{y-1}, η'_y]. Since ¬∃θ. (index(η_j), θ) ∈ A, by definition of Ψ we must show that η'_y = η''_y, so the result follows.

    - ∃θ. (index(η_j), θ) ∈ A. Then by definition of π', there is some θ such that (index(η_j), θ) ∈ A and stmtAt(π', index(η_j)) = θ(s'). Then by definition of ⇝_{π_{j-1}}, we have η'_{y-1} ↪→_π η'_y. Further, since A ⊆ ⟦O_pat⟧(p), we have stmtAt(π, index(η_j)) = θ(s). Let η'_y = η_{t'}, for some j < t' ≤ k+1. Since η'_{y-1} ↪→_π η'_y, we vacuously have that ∀m. ((j < m < t' ∧ η_j ↪→*_π η_m) ⇒ index(η_m) ⊭_p^θ φ1). Therefore we have shown Ψ(t', θ), so by definition of Ψ and ⇝_{π_j} we have to show that there exists η''_y such that η''_{y-1} ↪→_{π'} η''_y and θ(P)(η'_y, η''_y). We have two cases. Suppose there exists η''_y such that η''_{y-1} ↪→_{π'} η''_y. Then by condition B1 the result follows. Now suppose there does not exist η''_y such that η''_{y-1} ↪→_{π'} η''_y, so that η''_{y-1} ↛_{π'}. We are given that (index(η_j), θ) ∈ A. By definition of ⇝_{π_{j-1}} we know that π_{j-1} acts like π from η_j on in the sequence [η1, ..., η_{k+1}]. Further, η_{k+1} = (ι+1, {(x, l)}, σ, ⟨⟩, M), where ι+1 ∉ Indices_π. Therefore one of the states between η_j and η_{k+1} exclusive must represent the return node from the same invocation of p as η_j. Therefore we have that there exists some j < r ≤ k such that index(η_r) ⊨_p^θ φ1 and, for all t ≤ q < r such that η_q is in the same invocation of p as η_t, we have index(η_q) ⊨_p^θ φ2. Then by Lemma 1 we have η_r ↛_π, and we have a contradiction.

  - t > j. There are two sub-cases.

    - ¬∃θ. Ψ(t, θ). By definition of ⇝_{π_{j-1}}, we have η'_{y-1} →_π η'_y.
Therefore η''_{y-1} = η'_{y-1}, and for all θ we have that either (index(η_j), θ) ∉ A, or stmtAt(π, index(η_j)) ≠ θ(s), or stmtAt(π', index(η_j)) ≠ θ(s'), or ∃m. (j < m < t ∧ η_j ↪→*_π η_m ∧ index(η_m) ⊨_p^θ φ1). Then also ¬∃θ. Ψ(t', θ), where η'_y = η_{t'}, so by the definition of ⇝_{π_j} we must show that η''_{y-1} →_π η''_y, where η'_y = η''_y. Since η'_{y-1} →_π η'_y, the result follows.

    - ∃θ. Ψ(t, θ). Therefore θ(P)(η'_{y-1}, η''_{y-1}) and, by definition of ⇝_{π_{j-1}}, we have η'_{y-1} ↪→_π η'_y. We have two sub-cases.

      - index(η_t) ⊭_p^θ φ1. Then since η'_{y-1} ↪→_π η'_y, we have ∀m. ((j < m < t' ∧ η_j ↪→*_π η_m) ⇒ index(η_m) ⊭_p^θ φ1), where η'_y = η_{t'}, so Ψ(t', θ). Then we must show that η''_{y-1} ↪→_π η''_y, where θ(P)(η'_y, η''_y). Since ∃θ. Ψ(t, θ), we have (index(η_j), θ) ∈ A. We know that η_{j+1} →_π ... →_π η_t. Therefore either index(η_w) ⊨_p^θ φ2 for all j+1 ≤ w < t such that η_w is a state in the same invocation of p as η_j, or there exists j+1 ≤ w < t such that index(η_w) ⊨_p^θ φ1 and η_w is a state in the same invocation of p as η_j. Since we saw above that ∀m. ((j < m < t' ∧ η_j ↪→*_π η_m) ⇒ index(η_m) ⊭_p^θ φ1), it must be the case that index(η_t) ⊨_p^θ φ2. Therefore the result follows from condition B2.

      - index(η_t) ⊨_p^θ φ1. Then by the definition of Ψ we have ¬Ψ(t', θ), where η'_y = η_{t'}. Since θ is the unique substitution such that stmtAt(π, index(η_j)) = θ(s) and stmtAt(π', index(η_j)) = θ(s'), we have ¬∃θ. Ψ(t', θ). Therefore by the definition of ⇝_{π_j} we must show that η''_{y-1} ↪→_π η'_y. The result follows from condition B3.

B.4.3 Pure Analyses

Let φ1 followed by φ2 defines label with witness P be a pure analysis. We say that the analysis is sound if for all programs π, all procedures p in π, all indices ι in Indices_p, and all substitutions θ, the following condition holds: if the analysis puts a label of the form θ(label) on the node of p's CFG with index ι, then θ(P) holds at all program states η of all execution traces of π such that index(η) = ι.

Theorem 3 If φ1 followed by φ2 defines label with witness P is a pure analysis satisfying conditions A1 and A2, then the analysis is sound.

Proof: Identical to the argument used in the proof of Theorem 1 to show that θ(P) holds at any execution's program state just before a transformed statement is executed. (Note that conditions A1 and A2 are the same as F1 and F2.)