% l2h ignore change {
\chapter{Vaporware}

This appendix describes procedures that we would like to have in the
library but that we haven't had time to implement.
These procedures are geared to supporting {\em toolkit object code},
viz., a disk-based representation of the data structures used in
encoding applications.
We intend to use {\em tables} from \citeN{hanson:interfaces} so that
labels and such can be referred to by name, and we need to implement
{\em pickles} to be able to read and write relocatable blocks,
relocation closures, and so on.

\section{Tables}

Applications will need tables for binding names of objects to their
values.
The library provides two predefined, global tables:
\begin{fields*}
labeltab&Binds the names of global labels to the labels themselves\\
sectiontab&Binds the names of sections to lists of relocatable blocks\\
\end{fields*}
An assembler may use [[labeltab]] to store labels defined in assembly
text and [[sectiontab]] to store relocatable blocks that represent data
sections or procedure text.
The application may create more tables.

The application must tell the library what section names it understands.
[[newsection]] creates a new section if it has not already been created.
An assembler, for example, might call [[newsection]] when assembling a
pseudo-op such as ``.data'' for the first time.
<>=
procedure newsection(char *name)
  /sectiontab[name] := [] | fail("multiply defined sections")
@ %def newsection
Each entry of [[sectiontab]] is a list.
Relocatable blocks are added to a section by appending the block to the
appropriate list.
A C implementation of the library will provide the C equivalents of the
Icon code [[l := [], put(l,x), every x := !l do ...]].
@
\section{Pickling and Unpickling}

The application can externalize and internalize all of the data
structures provided by the library using the following ``pickling'' and
``unpickling'' interface.
The following EBNF grammar specifies the layout of pickled objects on disk.%
\footnote{In EBNF, [[|]] denotes alternation, [[[...]]] optional fields,
and [[{...}]] Kleene closure.}
The smallest unit of reading and writing is a word.
Within the grammar, four-character strings, integer literals, and symbols
in all caps are terminal symbols standing for single words.
Symbols in mixed case are nonterminals.
The word represented by a four-character string is computed by putting
the first character of the string in the most significant byte, and so on.
For example, the string [["tabl"]], used to tag tables, forms the type tag
\begin{quote}
[[('t' @<< 24) | ('a' @<< 16) | ('b' @<< 8) | 'l']].
\end{quote}
This type tag {\em cannot} always be formed by casting the string
[["tabl"]] to an integer using the native byte order of the machine.

The first word in a pickle is always [["picl"]].
By checking this word, a program reading the pickle can tell whether the
words in the pickle were written in the native or swapped byte order.
A pickle always represents a single object.
To provide a modicum of error checking, the object is always tagged.
Most applications pickle their own data structures, which may point to
more than one object of the kind defined in this library.
<>=
Pickle => "picl" Tagged
@
Every object within a pickle is identified by a unique signed integer,
or {\em UID}; UIDs make it possible to represent cycles and pointer
sharing.
The UID~0 always represents the [[NULL]] or [[NIL]] pointer.
Under some circumstances, a UID may be followed by a {\em type tag},
which identifies the type of the object.
For example, every element of a heterogeneous list must carry a type tag.
This library reserves the type tags in the range 0--32767 for use by
applications.
The following productions show the type tags reserved for use with the
toolkit types.
The library supports only homogeneous lists and homogeneous tables keyed
by strings.
<>=
Any    => [ Table | List | RLabel | RAddr | RBlock | RClosure
          | INTEGER | UserDef ]
Tagged => [ "tabl" Table | "list" List | "rlbl" RLabel | "radr" RAddr
          | "rblk" RBlock | "rclo" RClosure | "chrs" String
          | "int\0" INTEGER | TAG UserDef ]
@
A pickled table includes a [[TAG]] denoting the type of the table's
elements, a [[COUNT]] denoting the total number of table elements, and a
sequence of key-value pairs.
The values are untagged because the single [[TAG]] at the beginning
describes all of the values.
<>=
Table => UID [TAG COUNT {String Any}]
@
A [[List]] is just like a table but does not include any keys.
<>=
List => UID [TAG COUNT {Any}]
@
Of the four basic data types provided by the library, [[RAddr]],
[[RLabel]], [[RBlock]], and [[RClosure]], the first two are completely
straightforward:
<>=
RAddr  => UID [OFFSET RLabel]
RLabel => UID [OFFSET RBlock]
@
An [[RBlock]] contains two optional fields, [[ADDRESS]] and
[[REG OFFSET]].
The value of [[FLAGS]] determines the presence of these fields.
If [[FLAGS & 1]] is nonzero, [[ADDRESS]] is the value of [[rb.address]];
otherwise, [[ADDRESS]] is omitted and [[rb.address]] is null.
{\em and similarly for other fields\ldots}
% }]] is emitted; if [[Flags[1]]] is set, [[{ Reg Offset }]]
%is emitted.
[[String]] denotes [[rb.contents]].
<>=
RBlock => UID [LOW RLabel FLAGS [ADDRESS] [REG OFFSET] String]
@
An [[RClosure]] includes a [[Function]], which represents the closure's
[[apply]] method.
[[COUNT]] denotes the number of relocatable addresses in the closure.
Since a closure is created only when a relocatable address is unknown,
there must be at least one [[RAddr]] in the closure.
<>=
RClosure => UID [OFFSET Function RBlock COUNT {RAddr}]
@
A function is represented by bytecodes.
So far, the only defined ``bytecode'' is actually a PostScript-like
``string code'' described in \citeN{ramsey:relocating}.
The toolkit generates these codes when it generates closure functions.
This code is indicated by the tag [["tkps"]].
<>=
Function => "tkps" String
@
Strings are an exception to the rule that all reading and writing is
word-oriented.
The disk representation of a string is a word indicating the number of
bytes in the string, followed by the bytes themselves.
The bytes are followed by zero or more zero bytes, used to pad the string
so its length is a multiple of the word length.
This representation is {\em not} equivalent to the C representation; for
example, strings may contain embedded null characters.
The C version of the library provides a procedure that pickles and
unpickles strings in this format.
<>=
String => UID [COUNT bytes padding]
@
\section{C language support for pickles}
@
To pickle and unpickle a type, the library needs a tag, a pickling
procedure, and an unpickling procedure:
<>=
typedef struct pklstream *PklStream;
unsigned getword(PklStream s);                /* probably a macro */
procedure putword(PklStream s, unsigned n);   /* probably a macro */
procedure read_bytes(char *buf, int size, PklStream s);
typedef struct pkltag {
  unsigned tagword;                           /* TAG */
  void (*write)(PklStream s, void *data, ...);
  void *(*read)(PklStream s);
} *PklTag;
@
All calls to write procedures must be bracketed between [[pickleBegin]]
and [[pickleEnd]].
A [[PklTag]] is provided for each library data structure.
The write procedures for homogeneous tables and lists take an extra
argument, which is a tag describing the element type.%
\footnote{This admittedly ugly scheme is a poor man's substitute for real
closures.}
Pickling procedures that are part of the application may use this extra
argument to pickle tables and lists of the application's own data
structures.
This library provides no support for pickling and unpickling
heterogeneous tables and lists.
<>=
PklTag pickleTab, pickleList, pickleRAddr, pickleRBlock,
       pickleRClosure, pickleRLabel;
@
As an example, here is the code used to pickle a table [[d]] in which
the elements are labels:
<>=
pickleTab->write(s, d, pickleRLabel);
@
[[pickleRBlockList]] is provided as a tag that simplifies pickling of
[[sectiontab]].
To pickle [[sectiontab]], the application would call
\begin{quote}
[[pickleTab->write(s, sectiontab, pickleRBlockList)]].
\end{quote}
<>=
PklTag pickleRBlockList;
  pickleRBlockList->write(s, l) == pickleList->write(s, l, pickleRBlock)
@
Applications use the tags to write and read pickles.
\iffalse
{\em Guidelines for application writers wanting to create their own tags
and pickling procedures appear where?}
\fi
[[pickleWrite(fd, p, tag)]] pickles the object pointed to by [[p]];
pointers within the object are followed and their referents pickled, and
so on transitively.
Pointer cycles and sharing are correctly represented.
{\em flag for tagging every single object?}
{\hfuzz=11pt\par}
<>=
procedure pickleWrite(int fd, void *p, PklTag tag);
  write disk representation of p to descriptor fd.
@
The library uses [[int fd]] instead of [[FILE *fp]] for input and output
because some applications (like {\tt lcc} and {\tt mld}) don't tolerate
[[FILE *]].
@
[[pickleRead]] reads a pickled object from disk.
{\em Support for pickles in which all objects are tagged?}
<>=
procedure *pickleRead(int fd, PklTag type_expected, Arena *p,
                      char *(*pickleGetmem)(int size), int initialSize);
  return pointer to object in pickle
@
{\em Allocation of pickles is arena-based,} and all objects in a pickle
are allocated in the same arena.
If [[*p]] is nonzero, it is used as the arena; otherwise a new arena is
created and [[*p]] is set to it.
If [[pickleGetmem]] is nonzero, it is used to add space to the arena.
If [[initialSize]] is nonzero, it is used as the initial size of the
arena when an arena must be created.
@
\iffalse
\section{What's missing?}
Floating-point encoding, which is a separable issue.
@ \fi
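The word-level conventions of the pickle format (tag words built from
four-character strings, byte-order detection from the leading [["picl"]]
word, and zero padding after string bytes) can be illustrated by a small
C sketch.
It is only a sketch under stated assumptions: words are taken to be
32~bits (4~bytes) wide, and the names [[tag_word]], [[swap_word]],
[[pickle_byte_order]], and [[string_padding]] are hypothetical helpers
that are not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a library procedure): build the tag word for
   a four-character string, first character in the most significant byte.
   Assumes 32-bit words. */
static uint32_t tag_word(const char *tag)
{
    assert(tag != NULL);
    return ((uint32_t)(unsigned char)tag[0] << 24)
         | ((uint32_t)(unsigned char)tag[1] << 16)
         | ((uint32_t)(unsigned char)tag[2] << 8)
         |  (uint32_t)(unsigned char)tag[3];
}

/* Reverse the order of the four bytes in a word. */
static uint32_t swap_word(uint32_t w)
{
    return (w >> 24)
         | ((w >> 8) & 0x0000ff00u)
         | ((w << 8) & 0x00ff0000u)
         | (w << 24);
}

/* Classify the first word of a pickle as read in native order:
   0 means native byte order, 1 means swapped, -1 means not a pickle. */
static int pickle_byte_order(uint32_t first_word)
{
    uint32_t picl = tag_word("picl");
    if (first_word == picl)
        return 0;
    if (first_word == swap_word(picl))
        return 1;
    return -1;
}

/* Number of zero bytes that follow a string of n bytes so that the
   string data ends on a word boundary (assumed word length: 4 bytes). */
static size_t string_padding(size_t n)
{
    return (4 - n % 4) % 4;
}
```

A reader following this sketch would call [[pickle_byte_order]] on the
first word of the input and, if the result is 1, apply [[swap_word]] to
every subsequent word before interpreting it.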