The toolkit-generated program ``checks'' a specification by applying pairs of constructors for each instruction. The first constructor emits the instruction's binary code. For convenience, a constructor's binary representation is emitted using assembly directives (e.g., .byte, .word, .long), which allows the checker to emit both representations in one assembly file. The second constructor in a pair emits the the assembly-language representation of the instruction. When executed, the checker program emits its output into an assembly file, which is then assembled by the native assembler. This technique assumes that the native assembler permits both instructions and data in the text segment; the SGI-Irix assembler is the only one that we have found that prohibits this mix.
The resulting object file is disassembled using a native disassembler or debugger. We use gdb -batch to disassemble the object file. gdb reads the file gdb-{asm,bin}output, which disassemble the instructions generated by the checker.
<gdb-binoutput>= disas main _asmoutput
<gdb-asmoutput>= disas _asmoutput _endoutput
If the specification's assembly-to-binary mappings are correct, the assembly output should contain pairs of matching instructions; a matched pair indicates that the assembler's binary encoding of an instruction matches the toolkit's encoding. A short script compares pairs of instructions and flags any mismatched pairs. Because the checker emits mismatched pairs in assembly, the user can easily identify the source of the discrepancy in the original specification.
<chkrfilter>= #!/bin/sh # # Usage: chckfilter <gawk or nawk> <object file name> # # Compare pairs of disassembled instructions GNAWK=$1 shift gdb -batch -x gdb-asmoutput $1 | \ sed -e 's/^[^:]*://g' >$1.asmoutput gdb -batch -x gdb-binoutput $1 | \ sed -e 's/^[^:]*://g' >$1.binoutput echo "Differences" # Display diffs in 2 column format diff $1.asmoutput $1.binoutput | expand | \ $GNAWK '/^---$/ { next } /^< / { left[nextleft++] = substr($0,3); next } /^> / { right[nextright++] = substr($0,3); next } function dump () { if (nextright==1 && right[0]=="# ERROR") { nextleft = 1; left[0] = ">>> ERROR <<<" } while (nextleft < nextright) left[nextleft++]="" while (nextright < nextleft) right[nextright++]="" for (i=0; i<nextleft; i++) { printf "%-38.38s | %-38.38s\n", left[i], right[i] } nextleft = nextright = 0 } { dump(); print } END {dump()} ' rm -f $1.asmoutput $1.binoutput
base-checker
,
which contains all
the files necessary to build checkers for the MIPS, SPARC,
and Pentium targets.
It also contains a makefile that will build and run a checker.
We describe the steps necessary to build and run a checker here.
A checker requires an assembly emitter (e.g., target-asm.c), a binary emitter (e.g., target-bin.c), target-specific code to emit bytes in assembly pseudo-ops, the checker program and the toolkit library. The checker program accesses the binary and assembly emitters indirectly via the interfaces target_asm and target_bin.
To check instructions that take relocatable operands, we use a trick to guarantee that the address assigned by the assembler matches that used by the binary emitter. The address in the file chkraddr is the entry point for main; we get that value before building the checker program, then use it to assign known addresses to relocatable operands.
<rules to build check address>= chkraddr: echo "main(){}" >_boot.c $(CC) -g _boot.c -o _boot nm _boot | egrep "T _main|T main" | \ awk '{print "0x"$$1}' | cpif $@ rm -f _boot.c _boot
A specification is checked in several steps.
All of these steps appear in the base-checker
's makefile.
First, the checker program emits assembly code in the file _target.s.
The assembly file is assembled into the binary file _target.out.
Then the chkrfilter shell script disassembles the binary file and
compares the two representations. Any differences are printed to
standard output.
emit_checker
produces the checker.
For each constructor, generate_input_values
produces a table that associates operands'
names with sets of test inputs that satisfy the constructor's
constraints.
For each instruction constructor, generate_checker
enumerates all the test values of the operands
associated with that constructor, and it emits assembly and binary
forms by applying the constructor to each combination.
If an operand is of a constructor type (i.e., not a field or integer)
the checker repeats the enumeration process recursively on
constructors of that type.
This recursion explores the entire state space defined by the
constructors and the sets of operand values.
generate_checker
emits a C program that applies
the pairs of constructors.
The assembly constructors are accessed indirectly
through the interface name_asm
and the binary
constructors via name_bin
.
An interface name must be specified using the -indirect
command-line option.
<*>= [D->] global chkr_input_tbl global TESTLIMIT global zerotab procedure emit_checker(outfilename) local outfile, altoutfile, pp TESTLIMIT := 1024 zerotab := table(0) zerotab[1] := 0 verbose("Emitting checker program") if (/chkr_input_tbl) then chkr_input_tbl := table() if (/indirectname) then error("Interface name undefined. Checker requires ``-indirect'' option") outfile := openfile(outfilename || implementation_extension, "cw") | error("Could not open ", outfilename, " for writing") pp := PPnew(outfile) emit_template(pp, "checker.t", "interface", image(outfilename || interface_extension), "irname", indirectname) every cons := kept_constructors() do { write(\cdebug, "\nChecker for " || expimage(cons.name)) p := pattern_to_case(subst_tab(crhs(cons), parmtab(cons), 1)) chkr_input_tbl[cons.name] := generate_input_values(p, cons) } generate_checker(pp, outfilename, indirectname) PPxwrites(pp, "$b}$n") PPwrite(pp) # flush prettyprinter return end
Defineschkr_input_tbl
,emit_checker
,TESTLIMIT
,zerotab
(links are to index).
generate_checker
emits the ASCII-encoded binary instructions
in the text section main
and then the assembly instructions
in _asmoutput
. UNIX-style assembler pseudo-ops
are interleaved between the two sections so that the disassembler
can find the entry points of each section.
<*>+= [<-D->] procedure generate_checker(pp, outfilename, indirectname) local cons, test, asmoutput, binoutput, outfile, app count := 0 asmoutput := [] binoutput := "" asmir := indirectname || "_asm" || "->" binir := indirectname || "_bin" || "->" outfile := openfile(outfilename || "_tmp_", "cw") | error("Could not open ", outfilename, " for writing") app := PPnew(outfile) every cons := kept_constructors() do if cons.type === instructionctype then every test := generate_tests(cons, 1, !chkr_input_tbl[cons.name], asmir) do { pexp := pretty(Gcall(cons.name, test)) asmoutput := "/* Pair " || string(count +:= 1) || ":" || pexp || " */\n" asmoutput ||:= asmir || pexp || ";\n" PPxwrite(app, asmoutput) binoutput := "/* Pair " || string(count) || ":" || pexp || " */\n" binoutput ||:= binir || pexp || ";\n" PPxwrite(pp, binoutput) } emit_template(pp, "inchecker.t") PPwrite(app) # flush assembly prettyprinter outfile := openfile(outfilename || "_tmp_", "r") | error("Could not open ", outfilename, " for reading") every line := !outfile do PPxwrite(pp, line) emit_template(pp, "afterchecker.t") remove(outfilename || "_tmp_") end
Definesgenerate_checker
(links are to index).
generate_tests
generates the test inputs for arguments idx
through the last argument to constructor cons
.
It suspends a list of test values for the operands from idx
to
*cons.operands
.
<*>+= [<-D->]
procedure generate_tests(cons, idx, test_values, asmir)
local i, rest
write(\cdebug, "Generate tests for ", cons.name, " ",
idx, " ", expimage(cons.operands[idx]))
<print test_values
>
if idx > *cons.operands then
suspend []
else if type(cons.operands[idx]) == "literal" then
suspend generate_tests(cons, idx+1, test_values, asmir)
else
every rest := generate_tests(cons, idx+1, test_values, asmir) &
i := tests_of(cons.operands[idx], test_values, asmir)
do {
push(rest, i)
suspend rest
pop(rest)
}
end
Definesgenerate_tests
(links are to index).
<*>+= [<-D->] procedure tests_of(input, tbl, asmir) local i, m, n if type(input) == "literal" then suspend "" else if type(input) == "input" then { case type(input.meaning) of { "integer" : { input := lookuptype(input.name, "field") <check that test values exist for input> suspend tbl[input.name] } "string" : { <check that test values exist for input> suspend if (input.meaning == "reloc" & integer(tbl[input.name])) then Gcall("unsigned_to_raddr", [tbl[input.name]]) else tbl[input.name] } "field" | "null" | "string" : { <check that test values exist for input> suspend tbl[input.name] } "constype" : every m := kept_constructors(input.meaning) & i := generate_tests(m, 1, !chkr_input_tbl[m.name], asmir) do suspend Gcall(asmir || m.name, i) default : typeerror(input.meaning, "field, input, or constype", input.name) } } end
Definestests_of
(links are to index).
<check that test values exist for input>= (<-U) inputs := \tbl[input.name] | \ error("No test values specified for `", input.name, "'")
merge_chkr_tables
merges two input tables and complains if test
values have already been supplied for an input.
<*>+= [<-D->] procedure merge_chkr_tables(L) local t, k chkr_tbl := table() every t := !L do { every k := key(t) do { if /chkr_tbl[k] then chkr_tbl[k] := t[k] else warning("Checker inputs for ", k, " already defined. Using first set.") } } return chkr_tbl end
Definesmerge_chkr_tables
(links are to index).
rd = 1, rd = 2
, in which case, no set
of input values exists that satisfies these equations.
Instead, we use heuristics to generate the test values for a
particular branch.
First, we find equality conditions that constrain an input
to a constant value, e.g., the condition rd = 1
determines
rd
's test value.
That initial set is used to simplify the branch's conditions,
which may determine other inputs' values.
Test values are added until no new inputs are constrained.
If some inputs remain unconstrained, they are assigned
a random test value within the input's width constraints.
<*>+= [<-D->] procedure generate_input_values(p, cons) local d, conds, fits, test_values, test_sets, f, tmp, names test_sets := [] case type(p) of { "Stagcase" : { every c := kept_constructors(p.type) do test_sets |||:= generate_input_values(p.arms[c], cons) } "pattern" : { remove_duplicate_conditions(p, cons) while d := get(p.disjuncts) do { test_values := table() test_values[Epc] := Epc <extractd
's conditions> <create initial set oftest_values
> <generate input values> put(test_sets, test_values) } } } return test_sets end
Definesgenerate_input_values
(links are to index).
This code is derived from the code in constructors.nw
that
extracts a disjunct's conditions before emitting its encoding code.
To generate input values, we only need the field-width constraints
in fits
and the general conditions in conds
.
<extractd
's conditions>= (<-U) set_patlabel_offsets(d) d := gsubst(d, Epatlabel_to_Epc) ds_conds := extract_conditions(d, cons) conds := ds_conds[2] fits := ds_conds[4] sanitize_sequents(d, fits) <eliminateEforce
on relocatable inputs> <add constraints on unconstrained inputs> <add constraints on field inputs> <omit constraints and conditions on constructor-typed operands> <printfits
andconds
>
Relocatable addresses are always "forceable", because the checker determines their values.
<eliminate Eforce
on relocatable inputs>= (<-U)
every f := !fits do f.x := unforce(f.x)
<*>+= [<-D->] procedure unforce(x) return if (type(x) == "Eforce") then x.x else x end
Definesunforce
(links are to index).
fits
andconds
>= (<-U) if (*fits > 0) then write(\cdebug, "Width constraints for " || cons.name) write(\cdebug, " " || expimage(fits)) write(\cdebug, print_conditions(conds))
<print test_values
>= (U->)
every k := key(test_values) do
write(\cdebug, "test_values[", expimage(k), "] = ", expimage(test_values[k]))
<add constraints on unconstrained inputs>= (<-U) every i := inputs_of(cons) & type(i.meaning) == "null" do { write(\cdebug, "adding Efitsu(" || i.name || ", "|| wordsize || "), " || type(i.meaning)) constrained := &null every f := !fits & string(f.x) == i.name do constrained := 1 /constrained & insert(fits, Efitsu(i.name, wordsize)) # if (string(i.meaning) == "reloc") then { # if (type(conds) == "integer") then conds := set() # insert(conds, eqn(Emod(i.name, 4), "=", zerotab)) # } }
We insert constraints on field operands into fits
so we can generate test values for them.
<add constraints on field inputs>= (<-U) every i := inputs_of(cons, "field") do insert(fits, Efitsu(i.name, fwidth(i.meaning))) every i := inputs_of(cons, "integer") do insert(fits, Efitss(i.name, i.meaning))
A constructor's pattern may include constraints and conditions
on its integer, field, and relocatable operands, and
on the inputs to its constructor-typed operands.
We can omit the constraints on constructor-typed operands
from fits
and conds
, because
these constraints are satisfied when test values
are generated for instances of the operands' constructor types.
<omit constraints and conditions on constructor-typed operands>= (<-U) tmp := copy(fits) every f := !\fits & (type(f.x) == "Einstance_input") do delete(tmp, f) fits := tmp tmp := conds every c := !conds & subterms_matching(c, "Einstance_input") do delete(tmp, c) conds := tmp
<insert equality conditions in test_values
>= (U->)
every c := !conds do {
if (<equality condition>) then {
verbose("inserting test_values[" || expimage(x) || "] = " || v)
test_values[x] := v
}
}
<equality condition>= (<-U) type(c) == "eqn" & c.op == "=" & ((v := integer(untableexp(c.right)) & (not integer(c.left)) & x := c.left) | (v := integer(untableexp(c.left)) & (not integer(c.right)) & x := c.right))
The initial set of test values includes equality conditions
in conds
, e.g. rd = 1
, and equality conditions derived
from this initial set, e.g., rd = 1, rs = rd => rs = 1
.
We try to generate random test values for any operand without an initial value until all conditions dependent on the operand are satisfied. For most operands, we choose a random value that falls within its field-width constraints. If a field operand, however, is sparsely populated, i.e., a set of meaningful values are specified by the user, we choose a random value from this set.
<sparsely populated>= (U->) names := \fieldname_table(i.meaning)
<generate input values>= (<-U) every (i := inputs_of(cons) & type(i.meaning) ~== "constype" & /test_values[i.name]) do { every f := !fits & string(f.x) == i.name do { unsatisfied := 1 count := 0 if (type(i.meaning) == "field" & <sparsely populated>) then test_values[f.x] := ?names else test_values[f.x] := random_nbit_value(f) while (\unsatisfied & count < TESTLIMIT) do { verbose("Try " || expimage(f.x) || " = " || expimage(test_values[f.x])) <check conditions> if (\unsatisfied) then { slices := slices_of(f.x, unsatisfied) verbose("Slices are: ") every verbose(expimage(!slices)) test_values[f.x] := interleave_slices(random_nbit_value(f), slices) } count +:= 1 } if (count == TESTLIMIT) then { warning("Checker can't generate test values for " || i.name || print_conditions(unsatisfied, "not satisfied.")) } } } if (*test_values > 1) then write(\cdebug, "Tests " || (if (*p.disjuncts > 1) then " for disjunct:\n" || expimage(d) else ":")) every i := inputs_of(cons) & \test_values[i.name] do write(\cdebug, expimage(i.name) || " = " || expimage(test_values[i.name])) <check conditions> if (\unsatisfied) then { warning("Checker can't generate test values for disjunct:", "\n", expimage(d), print_conditions(unsatisfied, "not satisfied.")) } if (\unknown) then { warning("Test values for disjunct:", "\n", expimage(d), print_conditions(unknown, "may not be satisfied by test inputs.")) }
slices_of
projects conditions dependent on slices of v
.
<*>+= [<-D->] procedure slices_of(v, conds) local s s := set() every c := !conds do if (type(c.left) == "Eslice" & (c.left.x := unforce(c.left.x)) == v) then insert(s,eqn(c.left, c.op, integer(untableexp(c.right)))) else if (type(c.right) == "Eslice" & (c.right.x := unforce(c.right.x)) == v) then insert(s, eqn(c.right, c.op, integer(untableexp(c.left)))) return s end
Definesslices_of
(links are to index).
Now try to interleave the slices, generating random bits for the unspecified parts. No checking for overlapping slices! The solver should do that.
<*>+= [<-D->] procedure interleave_slices(result, slices) local e, mask every e := !slices & e.op == "=" do { mask := icom(ishift(2^(e.left.n)-1, e.left.lo)) result := ior(iand(result, mask), ishift(iand(e.right, 2^(e.left.n)-1), e.left.lo)) } return result end
Definesinterleave_slices
(links are to index).
<create initial set oftest_values
>= (<-U) every i := inputs_of(cons, "string") & i.meaning == "reloc" & test_values[i.name] := "reloc" n_tests := 0 <insert equality conditions intest_values
> while (*test_values > n_tests) do { n_tests := *test_values <simplify conditions> <insert equality conditions intest_values
> }
<simplify conditions>= (<-U) new_conds := set() every insert(new_conds, (simplify_conditions(conds, test_values))[2]) conds := new_conds
Simplifying conditions can produce satisfied conditions, unsatisfied conditions, or conditions with unknown satisfiability.
<check conditions>= (<-U) unsatisfied := &null unknown := &null every l := simplify_conditions(conds, test_values) do { c := l[1] newc := l[2] if (type(newc) == "eqn") then { if (reduceable(newc)) then /unknown := [c] | push(unknown, c) else if (not tautology(newc)) then /unsatisfied := [c] | push(unsatisfied, c) } }
<*>+= [<-D->] procedure print_conditions(conds, msg) local l l := [] (*conds == 0) & return "" every put(l, expimage(!conds)) return "Conditions " || (\msg | " ") || ":" || " " || commaseparate(l) end
Definesprint_conditions
(links are to index).
random_nbit_value
produces a random value in
the range 0...2^n-1 for unsigned, n-bit values
and values in the range -2^(n-1)...2^(n-1)-1
for signed, n-bit values.
<*>+= [<-D->] procedure random_nbit_value(f) local v return case type(f) of { "Efitsu": ?(2^(f.n)-1) "Efitss": { v := ?(2^(f.n-1)) if (?2 = 1) then -v else if (v = 2^(f.n-1)) then v - 1 else v } default : error("Invalid field-width constraint: ", expimage(f)) } end
Definesrandom_nbit_value
(links are to index).
simplify_conditions
simplifies each condition in
a set of conditions and then suspends a list pair
which contains the original and new condition.
<*>+= [<-D->] procedure simplify_conditions(conditions, values) local c, cp every c := !\conditions do { # write("==> " || image(type(c)) || "," || expimage(c)) cp := simplify(subst_tab(c, values)) # write("<== " || image(type(cp)) || "," || expimage(cp)) suspend [c, cp] } end
Definessimplify_conditions
(links are to index).
An equation is reduceable if it contains a variable expression.
<*>+= [<-D] procedure reduceable(c) return not integer(untableexp(c.left)) | not integer(untableexp(c.right)) end procedure tautology(c) local l,r if (l := integer(untableexp(c.left)) & r := integer(untableexp(c.right))) then return case c.op of { "=" : l = r ">=" : l >= r "<=" : l <= r "!=" : l ~= r "<" : l > r ">" : l < r default: error("Invalid operator " || c.op) } return end
Definesreduceable
,tautology
(links are to index).
<checker.t>= #include <mclib.h> #include %interface #include <stdio.h> #include <stdarg.h> extern struct %{irname}_asm *%{irname}_asm; extern struct %{irname}_asm *%{irname}_bin; static void checkerfail(char *, ...); static void reloc_print(RAddr r); static void reloc_print(RAddr r) {$t assert(location_known(r)); asmprintf(asmprintfd, "%%s", r->label->name); /* asmprintf(asmprintfd, "0x%%08x", location(r)); */$b } static void checkerfail(char *fmt, ...) {$t va_list ap; va_start(ap, fmt); fprintf(stderr, "Error in checker: "); vfprintf(stderr, fmt, ap); fprintf(stderr, "\n"); va_end(ap); exit(1); $b}$n void *mc_alloc(int size, Mc_alloc_pool pool) {$t char *p = (char *)malloc(size); return (void *)p; $b}$n RClosure mc_alloc_closure(size_in_bytes, dest_block, dest_lc) unsigned size_in_bytes; RBlock dest_block; unsigned dest_lc; {$t RClosure cl; checkerfail("attempt to call mc_alloc_closure!"); return cl; $b}$n main(){$t RBlock rb; RAddr reloc; fail = &checkerfail; asmprintreloc = &reloc_print; rb = block_new(0); block_label(rb)->name = "reloc"; set_block(rb); set_address(rb, (unsigned)CHKRADDR); reloc = addr_new(block_label(rb), 0x0); printf(".text\n"); printf(".globl reloc\n"); printf(".globl main\n"); printf(".globl _main\n"); printf("reloc:\n"); printf("main:\n"); printf("_main:\n");
<inchecker.t>= printf(".text\n"); printf(".globl _asmoutput\n"); printf("_asmoutput:\n");
<afterchecker.t>= printf(".text\n"); printf(".globl _endoutput\n"); printf("_endoutput:\n"); exit(0);
test_values
>: U1, D2
Eforce
on relocatable inputs>: U1, D2
d
's conditions>: U1, D2
test_values
>: D1, U2
fits
and conds
>: U1, D2
test_values
>: U1, D2