\ifhtml\section*{Contents} \tableofcontents \fi
\section{Intel Pentium instruction-set specification}
This specification describes the Intel Pentium~\cite{intel:pentium}. At the instruction-set level, this specification could almost be used to describe the 486; the Pentium supports just a few instructions not found on the 486. This specification has {\em not} been used in an application, which means that it is probably full of bugs.%
\footnote{We have plans for a test harness that should generate instructions at random, making sure that we get the same object code no matter whether we emit binary or assembly language, but as of July 1994 that harness is not in place.}

Specifying the $x86$ is very different from specifying a RISC machine. In particular, it is difficult to know how to factor this architecture, or indeed whether to try to factor it at all. We've ended by trying to factor the opcodes into tables, but not to factor the instructions. Handling the $x86$ opcode tables is painful, but we prefer it to specifying the opcodes individually, because we believe the tables reduce the likelihood of error. Factoring the instructions turned out to be a hopeless exercise, with one exception: there is a family of 8 groups of arithmetic instructions that factor nicely.

The $x86$ is not quite sure whether it is an 8-bit, a 16-bit, or a 32-bit machine. We don't know the exact history, but we believe that the machine started life as an 8-bit machine, and that new opcodes were added when the architecture was extended to 16~bits. New opcodes were {\em not} added when it was extended again to 32~bits; instead, the presence or absence of a prefix is used to distinguish 16- from 32-bit instructions.
Unfortunately, the encoding isn't completely specified at assembly time; the meaning of the prefix depends on the setting of a bit in the ``executable-segment descriptor.'' We have chosen to specify encodings in which instructions without prefixes operate on 32-bit quantities and the prefix selects 16-bit operation, but the encoding can be changed by changing just two lines in Section~\ref{section:op-prefix}. An application that wanted to be able to switch between encodings dynamically would have to use the toolkit to generate two sets of encoding procedures, one each for the 16- and 32-bit defaults. One thing we do is provide a way to generate just the 32-bit subset:
<>=
keep <<32-bit constructors>> <>
<>=
discard <>
@
\begin{figure}[p]
\def\={\-\thick8&\omit\leaders\hrule height 1pt\hfill\kern0pt\\}
\def\op#1#2{\opspan{#1}\hfil #2\hfil}
\begin{optable}[5.8em]
&\opspan9\hfil\normalsize{\bf One-Byte Opcode Map (page 0)}\hfil\\
%\noalign{\medskip} % ???
\numbercols 0 8
%
\=0|\op6{ADD}|PUSH!POP\.
\lineonly\linecols6\skipcols2\\
|Eb,Gb?Ev,Gv?Gb,Eb?Gv,Ev?AL,Ib?eAX,Iv|ES!ES\.
%
\line1|\op6{ADC}|PUSH!POP\.
\lineonly\linecols6\skipcols2\\
|Eb,Gb?Ev,Gv?Gb,Eb?Gv,Ev?AL,Ib?eAX,Iv|SS!SS\.
%
\line2|\op6{AND}| SEG!DAA\.
\lineonly\linecols6\skipcols2\\
|Eb,Gb?Ev,Gv?Gb,Eb?Gv,Ev?AL,Ib?eAX,Iv| =ES!\.
%
\line3|\op6{XOR}| SEG!AAA\.
\lineonly\linecols6\skipcols2\\
|Eb,Gb?Ev,Gv?Gb,Eb?Gv,Ev?AL,Ib?eAX,Iv| =SS!\.
%
\=4|\op8{INC general register}\.
\lineonly\linecols8\\
|eAX?eCX?eDX?eBX?eSP?eBP?eSI?eDI\.
%
\line5|\op8{PUSH general register}\.
\lineonly\linecols8\\
|eAX?eCX?eDX?eBX?eSP?eBP?eSI?eDI\.
%
\=6|PUSHA! POPA! BOUND!ARPL! SEG!SEG!Operand!Address\.
|PUSHAD!POPAD!Gv,Ma!Ew,Gw!=FS!=GS!Size! Size\.
%
\line7|\op8{Short-displacement jump on condition (Jb)}\.
\thinline|JO?JNO?JB/JNAE/J?JNB/JAE/J?JZ?JNZ?JBE?JNBE\.
%
\=8|\op2{Immediate Grp1}!MOVB${}^*$!Grp1!\op2{TEST}!\op2{XCHG}\.
\lineonly\linecols2\skipcols2\linecols4\\
|Eb,Ib?Ev,Iv!AL,immed!Ev,Ib!Eb,Gb?Ev,Gv!Eb,Gb?Ev,Gv\.
%
\=9|NOP!\op7{XCHG word or double-word register with eAX}\.
\lineonly\skipcols1\linecols7\\
|!eCX?eDX?eBX?eSP?eBP?eSI?eDI\.
%
\= A|\op4{MOV}|MOVSB!MOVSW!CMPSB!CMPSW\.
\lineonly\linecols4\skipcols4\\
|AL,Ob?eAX,Ov?Ob,AL?Ov,eAX|Xb,Yb!Xv,Yv!Xb,Yb!Xv,Yv\.
%
\= B|\op8{MOV immediate byte into register}\.
\lineonly\linecols8\\
|AL?CL?DL?BL?AH?CH?DH?BH\.
%
\= C|\op2{Shift Grp2a}!\op2{RET near}|LES!LDS!\op2{MOV}\.
\lineonly\linecols4\skipcols2\linecols2\\
|Eb,Ib?Ev,Ib!Iw? |Gv,Mp?Gv,Mp!Eb,Ib?Ev,Iv\.
%
\line D|\op4{Shift Grp2}|AAM!AAD!$*$!XLAT\.
\lineonly\linecols4\skipcols4\\
|Eb,1?Ev,1?Eb,CL?Ev,CL|!!!\.
%
\= E|LOOPNE!LOOPE!LOOP!JCXZ/JEC!\op2{IN}!\op2{OUT}\.
\lineonly\skipcols4\linecols4\\
|Jb!Jb!Jb!Jb!AL,Ib?AX,Ib!Ib,AL?Ib,eAX\.
%
\line F|LOCK!$*$!REPNE!REP!HLT!CMC!\op2{Unary Grp3}\.
\lineonly\skipcols6\linecols2\\
|!!!REPE!!!Eb?Ev\.
%
\=
\end{optable}
\caption{Pentium opcodes (page 0)}\label{opcode0}
\end{figure}
\begin{figure}[p]
\def\={\-\thick8&\omit\leaders\hrule height 1pt\hfill\kern0pt\\}
\def\op#1#2{\opspan{#1}\hfil #2\hfil}
\begin{optable}[5.8em]
&\opspan9\hfil\normalsize{\bf One-Byte Opcode Map (page 1)}\hfil\\
%\noalign{\medskip} % ???
\numbercols{8}{16}
%
\=0|\op6{OR}|PUSH!2-byte\.
\lineonly\linecols6\skipcols2\\
|Eb,Gb?Ev,Gv?Gb,Eb?Gv,Ev?AL,Ib?eAX,Iv|CS!escape\.
%
\line1|\op6{SBB}|PUSH!POP\.
\lineonly\linecols6\skipcols2\\
|Eb,Gb?Ev,Gv?Gb,Eb?Gv,Ev?AL,Ib?eAX,Iv|DS!DS\.
%
\line2|\op6{SUB}| SEG!DAS\.
\lineonly\linecols6\skipcols2\\
|Eb,Gb?Ev,Gv?Gb,Eb?Gv,Ev?AL,Ib?eAX,Iv| =CS!\.
%
\line3|\op6{CMP}| SEG!AAS\.
\lineonly\linecols6\skipcols2\\
|Eb,Gb?Ev,Gv?Gb,Eb?Gv,Ev?AL,Ib?eAX,Iv| =DS!\.
%
\=4|\op8{DEC general register}\.
\lineonly\linecols8\\
|eAX?eCX?eDX?eBX?eSP?eBP?eSI?eDI\.
%
\line5|\op8{POP into general register}\.
\lineonly\linecols8\\
|eAX?eCX?eDX?eBX?eSP?eBP?eSI?eDI\.
%
\=6|PUSH!IMUL! PUSH!IMUL! INSB! INSW/D!OUTSB!OUTSW/D\.
|Iv! Gv,Ev,Iv!Ib! Gv,Ev,Ib!Yb,DX!Yv,DX! DX,Xb!DX,Xv\.
%
\line7|\op8{Short-displacement jump on condition (Jb)}\.
\thinline|JS?JNS?JP?JNP?JL?JNL?JLE?JNLE\.
%
\=8|\op4{MOV}|MOV!LEA!MOV!POP\.
\lineonly\linecols4\skipcols4\\
|Eb,Gb?Ev,Gv?Gb,Eb?Gv,Ev|Ew,Sw!Gv,M!Sw,Ew!Ev\.
%
\=9|CBW!CWD/CDQ!CALL!WAIT|PUSHF!POPF!SAHF!LAHF\.
|!!aP!|Fv!Fv!!\.
%
\line A|\op2{TEST}!STOSB!STOSW/D|LODSB!LODSW/D!SCASB!SCASW/D\.
\lineonly\linecols2\skipcols6\\
|AL,Ib?eAX,Iv!Yb,AL!Yv,eAX|AL,Xb!eAX,Xv!AL,Yb!eAX,Yv\.
%
\= B|\op8{MOV immediate word or double into word or double register}\.
\thinline|eAX!eCX!eDX!eBX!eSP!eBP!eSI!eDI\.
%
\=C|ENTER!LEAVE!RET far!RET far!INT!INT!INTO!IRET\.
|Iw,Ib!!Iw!!3!Ib!!\.
%
\= D|\op8{ESC (Escape to coprocessor instruction set)}\.
\thinline|\op8{}\.
%
\= E|CALL!\op3{JMP}!\op2{IN}!\op2{OUT}\.
\lineonly\skipcols1\linecols7\\
|Jv!Jv?Ap?Jb!AL,DX?eAX,DX!DX,AL?DX,eAX\.
%
\line F|CLC!STC!CLI!STI!CLD!STD!INC/DEC!INC/DEC\.
|!!!!!!Grp4!Grp5\.
%
\=
\end{optable}
\caption{Pentium opcodes (page 1)}\label{opcode1}
\end{figure}

Intel uses some naming conventions to try to tame the confusion surrounding operands. ``Operand specifiers'' describe the locations and sizes of operands. The specifiers are (mostly) composed from the following pieces:
\begin{quote}
\begin{tabular}{llll}
\strut&\omit\hfil Operand\strut\hfil&&\omit\hfil Width\hfil\\
\noalign{\smallskip}
\strut\tt E&\strut effective address& \tt b&8-bit bytes\\
\strut &(memory or register)& \tt w&16-bit words\\
\strut\tt G&\strut general-purpose register& \tt d&32-bit doublewords\\
\strut\tt I&\strut immediate& \tt v&variable ({\tt w} or {\tt d})\\
\end{tabular}
\end{quote}
For example, the specifier ``Eb,Gb'' describes an 8-bit instruction whose operands are an effective address and a general-purpose register. ``Ev,Gv'' describes a similar instruction that operates on 16 or 32~bits, depending on the presence or absence of a prefix. Most of the $x86$ instructions are overloaded, but the opcode tables often use operand specifiers as suffixes to distinguish them. In some cases, where the machine specification doesn't give distinguishing suffixes, we have invented some.
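
As a concrete illustration of the prefix convention (a sketch, assuming the 32-bit default adopted in this specification; the byte values are Intel's standard encodings and are quoted here only for orientation), consider ADD with a register in each operand position:
\begin{quote}
\begin{tabular}{lll}
Specifier&Bytes (hex)&Meaning\\
\noalign{\smallskip}
\tt Eb,Gb&\tt 00 D8&add BL into AL (8 bits)\\
\tt Ev,Gv&\tt 01 D8&add EBX into EAX (32 bits, no prefix)\\
\tt Ev,Gv&\tt 66 01 D8&add BX into AX (16 bits, operand-size prefix)\\
\end{tabular}
\end{quote}
The opcode byte {\tt 01} is the same in the last two rows; only the prefix byte {\tt 66} distinguishes the 16-bit form from the 32-bit form.
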
Many of the instructions support all three sizes, and we specify them in two variants: a {\tt b} variant with no prefix, and {\tt v} variants with and without prefixes. When the two {\tt v} variants differ only in the presence or absence of a prefix, we can specify them simultaneously using the [[ov]] pattern, which we define in Section~\ref{section:op-prefix} to mean ``optional prefix.'' Sometimes, however, we have to specify all three variants explicitly, as when there is an immediate operand---in that case, we have to give three different output patterns because the token holding the immediate operand may be 8, 16, or 32~bits wide.
@ Here is the overall structure of the Intel specification:
<>=
<>
patterns
<>
<>
<>
<>
relocatable reloc
<>
<>
<>
constructors
<>
@
\subsection{Opcodes}
For other machines, we were able to specify entire opcode tables in single declarations. The Intel tables are less tractable, because they don't just contain opcodes; they contain a mix of opcodes and suffixes. We've broken most of the tables into pieces, as shown below. There's an argument for abandoning opcode tables entirely, using only constructors to describe the encodings, but we've decided to keep opcodes. This style of specification lets us use a little factoring, and it gives us a little protection against errors in the distributed opcodes, like the {\tt group{\em x}} opcodes.

\subsection{One-byte opcodes}
We want to take advantage of factoring when possible, but it works well only for the ``arithmetic group,'' shown in the upper left corners of Figures \ref{opcode0} and \ref{opcode1}, which represent the ``one-byte opcode map'' from pages A-5 and A-6 of the Intel manual~\cite{intel:pentium}. This corner of the opcode table clearly represents an outer product of 8~opcodes with 6~suffixes, and we treat it as such. Most of the rest of the opcode table we treat in purely geometric fashion, using row, column, and page numbers as Cartesian coordinates to determine opcodes.
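
As a worked illustration of the Cartesian view (a sketch, assuming the field layout declared for the [[opcodet]] token: [[row]] in bits 4--7, [[page]] in bit~3, and [[col]] in bits 0--2), the opcode byte can be computed as
\[
\mbox{opcode} = 16\cdot\mbox{row} + 8\cdot\mbox{page} + \mbox{col}.
\]
For instance, CMP with specifier eAX,Iv sits at row~3, page~1, column~5, which gives $16\cdot 3 + 8\cdot 1 + 5 = 61 = \mbox{\tt 0x3D}$, and HLT sits at row~15, page~0, column~4, which gives $16\cdot 15 + 8\cdot 0 + 4 = 244 = \mbox{\tt 0xF4}$. Both values agree with the standard Intel encodings.
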
<>=
fields of opcodet (8) row 4:7 col 0:2 page 3:3
<>
<>=
placeholder for opcodet is HLT
@ Because the map is so chaotic, we break it into rows, for the most part specifying one or two rows at a time, but sometimes breaking rows into pieces. The first rows are among the most interesting; rows 0--3 contain the outer product of arithmetic operators with operand specifiers:
<>=
arith is any of [ ADD OR ADC SBB AND SUB XOR CMP ],
which is row = {0 to 3} & page = [0 1]
[ Eb.Gb Ev.Gv Gb.Eb Gv.Ev AL.Ib eAX.Iv ] is col = {0 to 5}
@ %def arith
The other columns in rows 0--3 follow a less discernible pattern.
<>=
[ PUSH.ES POP.ES PUSH.CS esc2 PUSH.SS POP.SS PUSH.DS POP.DS SEG.ES DAA SEG.CS DAS SEG.SS AAA SEG.DS AAS ] is row = {0 to 3} & page = [0 1] & col = [6 7]
@ Rows 4 and 5 are the general-register opcodes, formed by an outer product of operation and register specifier.\label{r32} Although the register specifier is actually part of the opcode, we treat it as an operand below. We create the field [[r32]] as an alias for [[col]], so we can use special register names for the values.
<>=
regops is any of [ INC DEC PUSH POP ], which is row = [4 5] & page = [0 1]
@ %def regops
<>=
r32 0:2
<>=
fieldinfo r32 is [names [ eAX eCX eDX eBX eSP eBP eSI eDI ]]
@ %def r32
<>=
sr16 0:2
<>=
fieldinfo sr16 is [sparse [ cs=1, ss=2, ds=3, es=4, fs=5, gs=6 ] ]
@ %def sr16
<>=
r16 0:2
<>=
fieldinfo r16 is [names [ AX CX DX BX SP BP SI DI ]]
@ %def r16
@ \noindent Row~6:
<>=
[ PUSHA POPA BOUND ARPL SEG.FS SEG.GS OpPrefix AddrPrefix PUSH.Iv IMUL.Iv PUSH.Ib IMUL.Ib INSB INSv OUTSB OUTSv ] is page = [0 1] & row = 6 & col = {0 to 7}
@ \noindent Row~7 is factored so the jump codes can be re-used in the two-byte opcode map. Again, the row contains an outer product, but this time the columns determine jump conditions, not operand specifiers.
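
A short worked example may make the factoring concrete (a sketch, assuming the [[row]]/[[page]]/[[col]] layout of the opcode token; the opcode bytes are Intel's standard encodings). Every short jump lies in row~7, so its opcode is
\[
16\cdot 7 + 8\cdot\mbox{page} + \mbox{col} = \mbox{\tt 0x70} + 8\cdot\mbox{page} + \mbox{col},
\]
and the condition alone selects the page and column, as read from row~7 of Figures \ref{opcode0} and \ref{opcode1}:
\begin{quote}
\begin{tabular}{llll}
Mnemonic&page&col&Opcode\\
\noalign{\smallskip}
JO&0&0&\tt 0x70\\
JNZ&0&5&\tt 0x75\\
JL&1&4&\tt 0x7C\\
\end{tabular}
\end{quote}
Because a condition constrains only [[page]] and [[col]], it can be conjoined with a different row. With row~8, for example, the same page and column for JNZ give $128 + 0 + 5 = \mbox{\tt 0x85}$, which is the second byte of the long form of JNZ.
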
<>= Jb is row = 7 cond is any of [ .O .NO .B .NB .Z .NZ .BE .NBE .S .NS .P .NP .L .NL .LE .NLE ], which is page = [0 1] & col = {0 to 7} @ \noindent Row 8 is a bit hard to follow because it contains a seemingly random mix of opcodes and operand specifiers. On page~0, we've left the ``immediate Group1'' implicit, giving only the operand specifiers. On page~1, [[MOV]] shares the operand specifiers we gave with the arithmetic instructions in rows~0--3. <>= [ Eb.Ib Ev.Iv MOVB Ev.Ib TEST.Eb.Gb TEST.Ev.Gv XCHG.Eb.Gb XCHG.Ev.Gv ] is row = 8 & page = 0 & col = {0 to 7} MOV is row = 8 & page = 1 [ MOV.Ew.Sw LEA MOV.Sw.Ew POP.Ev ] is row = 8 & page = 1 & col = {4 to 7} @ \noindent On page 0, row 9 is [[XCHG]] (or [[NOP]]). Again, the register operand is actually part of the opcode. Page~1 has several opcodes. <>= XCHG is row = 9 & page = 0 NOP is XCHG & col = 0 [ CBW CWDQ CALL.aP WAIT PUSHF POPF SAHF LAHF ] is row = 9 & page = 1 & col = {0 to 7} @ \noindent Although there is an outer product or two lurking in row 10 (A), we don't try to specify it, because the operand specifiers used there aren't widely useful. <>= [ MOV.AL.Ob MOV.eAX.Ov MOV.Ob.AL MOV.Ov.eAX MOVSB MOVSv CMPSB CMPSv TEST.AL.Ib TEST.eAX.Iv STOSB STOSv LODSB LODSv SCASB SCASv ] is row = 10 & page = [0 1] & col = {0 to 7} @ \noindent Row 11 (B) is another row in which the register operand is implicit in the opcode, but we need to define a new field to denote 8-bit registers. <>= MOVib is row = 11 & page = 0 MOViv is row = 11 & page = 1 <>= r8 0:2 <>= fieldinfo r8 is [names [ AL CL DL BL AH CH DH BH ]] @ \noindent Rows 12 and 13 contain the bit operators; the others are easy. <>= [ B.Eb.Ib B.Ev.Ib RET.Iw RET LES LDS MOV.Eb.Ib MOV.Ev.Iv B.Eb.1 B.Ev.1 B.Eb.CL B.Ev.CL AAM AAD _ XLAT ] is row = [12 13] & page = 0 & col = {0 to 7} [ ENTER LEAVE RET.far.Iw RET.far INT3 INT.Ib INTO IRET ] is row = 12 & page = 1 & col = {0 to 7} ESC is row = 13 & page = 1 @ \noindent Neither of the last two rows can use factoring. 
<>= [ LOOPNE LOOPE LOOP JCXZ IN.AL.Ib IN.eAX.Ib OUT.Ib.AL OUT.Ib.eAX LOCK _ REPNE REP HLT CMC grp3.Eb grp3.Ev CALL.Jv JMP.Jv JMP.Ap JMP.Jb IN.AL.DX IN.eAX.DX OUT.DX.AL OUT.DX.eAX CLC STC CLI STI CLD STD grp4 grp5 ] is page = [0 1] & row = [14 15] & col = {0 to 7} @ \subsection{Two-byte opcodes} The two-byte tables are fairly sparse, so we haven't bothered to reproduce them in a table for this report. We describe them one page at a time beginning with page~0. The first token of each two-token pattern contains the [[esc2]] opcode. @ \begingroup\parindent=0pt The first two rows are: <>= [ grp6 grp7 LAR LSL MOV.Eb.Gb MOV.Gv.Ev MOV.Gb.Eb MOV.Ev.Gv ] is esc2; page = 0 & row = [0 1] & col = {0 to 3} CLTS is esc2; page = 0 & row = 0 & col = 6 @ Row~3 is a block of MOV instructions. <>= [ MOV.Rd.Cd MOV.Rd.Dd MOV.Cd.Rd MOV.Dd.Rd MOV.Rd.Td _ MOV.Td.Rd ] is esc2; page = 0 & row = 3 & col = {0 to 6} @ Row 4: <>= [ WRMSR RDTSC RDMSR ] is esc2; page = 0 & row = 4 & col = {0 to 2} @ The jumps in row 8 and sets in row 9 span two pages. They are outer products of opcodes and the conditions defined in row~7 of the one-byte opcode map. <>= Jv is esc2; row = 8 SETb is esc2; row = 9 @ Rows 10 and 11 are more madness: <>= [ PUSH.FS POP.FS CPUID BT SHLD.Ib SHLD.CL _ _ CMPXCHG.Eb.Gb CMPXCHG.Ev.Gv LSS BTR LFS LGS MOVZX.Gv.Eb MOVZX.Gv.Ew ] is esc2; page = 0 & row = [10 11] & col = {0 to 7} @ Row 12 is the only remaining non-empty row on page 0. <>= [ XADD.Eb.Gb XADD.Ev.Gv grp9 ] is esc2; page = 0 & row = 12 & col = [0 1 7] @ \bigskip There are even fewer opcodes on page~1. <>= [INVD WBINVD] is esc2; row = 0 & page = 1 & col = [0 1] @ Rows 1--7 are empty, and 8 and 9 were covered on the previous page. 
Row~10 is: <>= [ PUSH.GS POP.GS RSM BTS SHRD.Ib SHRD.CL _ IMUL.Gv.Ev] is esc2; row = 10 & page = 1 & col = {0 to 7} @ Row 11: <>= [ grp8 BTC BSF BSR MOVSX.Gv.Eb MOVSX.Gv.Ew] is esc2; page = 1 & row = 11 & col = {2 to 7} @ And the single opcode on row~12: <>= BSWAP is esc2; row = 12 & page = 1 @ \endgroup @ \subsection{Operands and effective addresses} Intel operands and addresses are described in Section 25.2.1 and Figure 25-2 of the Pentium manual. Effective addresses use a ``Mod R/M'' byte, the [[mod]] field of which determines the addressing mode. The Mod R/M byte also holds some bits that denote either a register operand or some extra parts of the opcode (as with the {\tt group{\em x}} instructions). Indexed addressing modes use an additional byte, called ``SIB,'' which holds a scale factor and index and base registers. <>= fields of modrm (8) mod 6:7 reg_opcode 3:5 r_m 0:2 fields of sib (8) ss 6:7 index 3:5 base 0:2 <>= fieldinfo [ base index ] is [ names [ eAX eCX eDX eBX eSP eBP eSI eDI ] ] fieldinfo ss is [ sparse [ "1" = 0, "2" = 1, "4" = 2, "8" = 3 ] ] @ %def mod reg_opcode r_m ss index base <>= placeholder for modrm is HLT placeholder for sib is HLT @ We're faced with a specification problem because some Intel instructions accept only effective addresses that refer to operands in memory; register modes are not permitted. Most instructions, however, accept any kind of effective address. A good way to express this restriction would be with subtyping, but the toolkit doesn't implement subtyping, so instead we define two different constructor types: [[Mem]] to refer to effective addresses of operands in memory, and [[Eaddr]] to refer to any effective address. This separation requires the use of an identity constructor~[[E]] to map [[Mem]]s into [[Eaddr]]s. Such a thing is bad enough in a specification, but these [[E]]'s have to be used in application programs, too. 
The ugliness is justified because it confers protection against inadvertently using a register operand with an instruction that doesn't permit one.
<>=
relocatable d a
constructors
Indir [reg] : Mem { reg != 4, reg != 5 } is mod = 0 & r_m = reg
Disp8 d[reg] : Mem { reg != 4 } is mod = 1 & r_m = reg; i8 = d
Disp32 d[reg] : Mem { reg != 4 } is mod = 2 & r_m = reg; i32 = d
Abs32 [a] : Mem is mod = 0 & r_m = 5; i32 = a
Reg reg : Eaddr is mod = 3 & r_m = reg
Index [base][index * ss] : Mem { index != 4, base != 5 } is mod = 0 & r_m = 4; index & base & ss
Index8 d[base][index * ss] : Mem { index != 4 } is mod = 1 & r_m = 4; index & base & ss; i8 = d
Index32 d[base][index * ss] : Mem { index != 4 } is mod = 2 & r_m = 4; index & base & ss; i32 = d
ShortIndex d[index * ss] : Mem { index != 4 } is mod = 0 & r_m = 4; index & base = 5 & ss; i32 = d
E Mem : Eaddr is Mem
@ %def Eaddr
We'll eventually want to be able to keep only 32-bit constructors, to cut down on the time needed to generate encoding procedures.
<<32-bit constructors>>=
Indir Disp32 Reg Index Index32 E
@ Now, this is good as far as it goes, but there are a couple of problems: no support for conditional assembly, and lots of constructors, which increases generation time.
So let's get a bit clever: <>= constructors Indir0 [reg] { reg != 4, reg != 5 } is mod = 0 & r_m = reg Indir8 d[reg] { reg != 4 } is mod = 1 & r_m = reg; i8 = d Indir32 d[reg] { reg != 4 } is mod = 2 & r_m = reg; i32 = d Index0 [base][index*ss] { index != 4, base != 5 } is mod = 0 & r_m = 4; index & base & ss Index8 d[base][index*ss] { index != 4 } is mod = 1 & r_m = 4; index & base & ss; i8 = d Index32 d[base][index*ss] { index != 4 } is mod = 2 & r_m = 4; index & base & ss; i32 = d relocatable d constructors Indir d[reg] : Mem when { d = 0 } is Indir0 ( reg) when { } is Indir8 (d, reg) otherwise is Indir32(d, reg) Index d[base][index*ss] : Mem when { d = 0 } is Index0 ( base, index, ss) when { } is Index8 (d, base, index, ss) otherwise is Index32(d, base, index, ss) ShortIndex d[index*ss] : Mem { index != 4 } is mod = 0 & r_m = 4; index & base = 5 & ss; i32 = d E Mem : Eaddr is Mem Reg reg : Eaddr is mod = 3 & r_m = reg discard Indir0 Indir8 Indir32 Index0 Index8 Index32 @ The only problem now is that it's no longer possible to split out the 32-bit constructors. @ Immediate operands occupy whole tokens and have their own classes. <>= fields of I8 (8) i8 0:7 fields of I16 (16) i16 0:15 fields of I32 (32) i32 0:31 <>= placeholder for I8 is HLT placeholder for I16 is HLT; HLT placeholder for I32 is HLT; HLT; HLT; HLT @ \subsection{Mod~R/M opcodes} The Intel architecture offers the spectacle of putting some of the opcode bits in with the operands. The eight values of the [[reg_opcode]] field of the Mod~R/M byte specify different opcodes, depending on the value of the opcode preceding the effective address. Most of these opcodes are notated by ``Group{\em x}'' in the manual. We use different sets of names for the values depending on what opcode precedes the effective-address specification. 
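
To make the role of [[reg_opcode]] concrete, here is a small worked example (a sketch, assuming the [[modrm]] field layout declared earlier, with [[mod]] in bits 6--7, [[reg_opcode]] in bits 3--5, and [[r_m]] in bits 0--2; the byte values are Intel's standard encodings). The Mod~R/M byte is
\[
\mbox{modrm} = 64\cdot\mbox{mod} + 8\cdot\mbox{reg\_opcode} + \mbox{r\_m}.
\]
Take the Group~4 opcode byte {\tt 0xFE} followed by a register operand AL, so that $\mbox{mod}=3$ and $\mbox{r\_m}=0$. With $\mbox{reg\_opcode}=0$ the Mod~R/M byte is $192 + 0 + 0 = \mbox{\tt 0xC0}$, and the sequence {\tt FE C0} increments AL; with $\mbox{reg\_opcode}=1$ it is $192 + 8 + 0 = \mbox{\tt 0xC8}$, and {\tt FE C8} decrements AL. The first byte is identical in both cases; only the three [[reg_opcode]] bits select the operation.
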
To make sure we don't mistakenly use a name like [[INC.Eb]] for [[reg_opcode = 0]] when the actual denotation is [[INC.Ev]], we include in the specifications the opcode that must precede the Mod~R/M byte---in this case, either [[grp4]] or [[grp5]]. This kind of specification is conjoined below with specifications of opcodes and effective addresses. If
\begin{quote}
[[INC.Eb is grp4; reg_opcode = 0]]
\end{quote}
then we might conjoin it with a [[grp4]] opcode followed by an effective address, writing:
\begin{quote}
[[INC.Eb Eaddr is (grp4; Eaddr) & INC.Eb]]
\end{quote}
The conjunction of the explicit [[grp4]] with the [[grp4]] in the definition of [[INC.Eb]] ensures that [[INC.Eb]] is used correctly, since $\hbox{[[grp4 & grp4]]} \equiv \hbox{[[grp4]]}$. If we incorrectly used [[grp5]] when defining the [[INC.Eb]] constructor, [[grp4 & grp5]] would evaluate to a pattern that never matches anything, and the toolkit would complain.

We've glossed over an important detail in the definition of this constructor. Because conjunction distributes over concatenation, the right-hand side of the [[INC.Eb]] constructor winds up being equivalent to
\begin{quote}
[[grp4; (Eaddr & reg_opcode = 0)]].
\end{quote}
This conjunction, unfortunately, breaks the rules of conjunction, which require both conjoined patterns to have the same {\em shape}, or sequence of token classes. The conjunction is legal when [[Eaddr]] is an instance of [[Indir]] or [[Reg]], because those effective addresses consist solely of one Mod~R/M token, but the other modes contain more tokens, and their shapes don't match [[reg_opcode = 0]]. The solution is to relax the shape constraint by using the ellipsis operator. \mbox{``[[p ...]]''} creates a pattern that is equivalent to [[p]], except that it is permissible to write \mbox{[[p ... & q]]} whenever [[p]]'s shape is a prefix of [[q]]'s shape.%
\footnote{%
The ellipsis may also be used as a prefix operator on patterns, in which case [[...
p & q]] is permissible whenever [[p]]'s shape is a suffix of [[q]]'s shape. We haven't had occasion to use such patterns in machine descriptions, because most hardware decodes complex instructions from left to right.}
In the case at hand, every [[Eaddr]] begins with a Mod~R/M token, so we can always write
\begin{quote}
[[Eaddr & reg_opcode = 0 ...]]
\end{quote}
We've now covered enough detail to specify the Mod~R/M or ``Group{\em x}'' opcodes. Just to add some extra complexity, groups 1--3 include opcodes that denote different operand specifiers. For example, [[ADDi]] can denote an integer add of bytes ([[Eb.Ib]]), 16-bit or 32-bit words ([[Ev.Iv]]), or words and bytes ([[Ev.Ib]]), depending on the suffix added to the opcode. For each constructor created from the [[ADDi]] pattern, only one operand specifier is conjoined into the output pattern; the resulting pattern has just one non-contradictory disjunct, so the bits to be emitted are uniquely determined. The constructors for these patterns are defined in Section~\ref{subsec:arith-insts}.
<>=
patterns
arithI is any of [ ADDi ORi ADCi SBBi ANDi SUBi XORi CMPi ], # group 1
which is (Eb.Ib | Ev.Iv | Ev.Ib); reg_opcode = {0 to 7} ...
bshifts is B.Eb.1 | B.Eb.CL # D0 D2
vshifts is B.Ev.1 | B.Ev.CL # D1 D3
immshifts is B.Eb.Ib | B.Ev.Ib # C0 C1
rot is any of [ ROL ROR RCL RCR SHLSAL SHR _ SAR], which is (bshifts | vshifts | immshifts); reg_opcode = {0 to 7} ...
grp3ops is any of [ TEST.Ib.Iv _ NOT NEG MUL.AL.eAX IMUL.AL.eAX DIV.AL.eAX IDIV.AL.eAX ], which is (grp3.Eb | grp3.Ev); reg_opcode = {0 to 7} ...
grp4ops is any of [ INC.Eb DEC.Eb ], which is grp4; reg_opcode = [0 1] ...
grp5ops is any of [ INC.Ev DEC.Ev CALL.Ev CALL.Ep JMP.Ev JMP.Ep PUSH.Ev _ ], which is grp5; reg_opcode = {0 to 7} ...
grp6ops is any of [ SLDT STR LLDT LTR VERR VERW _ _ ], which is grp6; reg_opcode = {0 to 7} ...
grp7ops is any of [ SGDT SIDT LGDT LIDT SMSW _ LMSW INVLPG ], which is grp7; reg_opcode = {0 to 7} ...
bittestI is any of [ BTi BTSi BTRi BTCi ], which is grp8; reg_opcode = {4 to 7} ...
CMPXCHG8B is grp9; reg_opcode = 1 ...
@
\subsection{Operand-size and address-size prefixes}
\label{section:op-prefix}
The Intel architecture uses prefixes to distinguish 16- from 32-bit operands. The meaning of a prefix depends on the mode and on the setting of the $D$ bit in the executable-segment descriptor (see pages 25-1ff of the Pentium manual). We assume here that $D=1$, making the default size 32~bits, but that assumption could be changed by reversing the definitions of [[ow]] and [[od]] given here. An application that wanted to be able to use both encodings would have to generate two sets of encoding procedures, perhaps using function pointers to switch back and forth.

In specifications, instructions with [[b]] suffixes (e.g., ``Eb,Gb'') use no prefix. Instructions with [[v]] suffixes (e.g., ``Ev,Gv'') begin with [[ov]], which is followed by the rest of the instruction. When [[ov]] is used to build an opcode, this technique automatically creates two variants: [[od]], a 32-bit variant with no prefix ([[epsilon]]), and [[ow]], a 16-bit variant with prefix [[OpPrefix]].
<>=
patterns
ow is OpPrefix
od is epsilon
ov is ow | od
@ The address prefix is similar, but we haven't figured out how it's to be used.
<>=
patterns
aw is AddrPrefix
ad is epsilon
av is aw | ad
@
\subsection{Floating-point opcodes}
The specifications of floating-point opcodes consume many more tables, but it's not necessary to say much about them; the specification techniques needed are those we use above. We have defined patterns [[D8]] through [[DF]], which express opcode values in hex; these are used in the specifications of the opcodes, so it seemed expedient to make them patterns, rather than continually writing something like [[opcode = 0xd8]].
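
A short calculation shows why these pattern names coincide with the hexadecimal opcode values (a sketch, assuming the [[row]]/[[page]]/[[col]] layout used for the one-byte opcode map). [[ESC]] occupies row~13 of page~1, so the eight escape opcodes are
\[
16\cdot 13 + 8\cdot 1 + \mbox{col} = 216 + \mbox{col} = \mbox{\tt 0xD8} + \mbox{col}, \qquad \mbox{col} = 0,\ldots,7,
\]
that is, the bytes {\tt 0xD8} through {\tt 0xDF}.
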
%\footnote{For which we would have to add an [[opcode]] field
%equivalent to the whole [[opcode]] token---a feature that ought to be
%automatic, dammit.}
<>=
patterns
[ D8 D9 DA DB DC DD DE DF ] is ESC & col = {0 to 7}
[ FADD FMUL FCOM FCOMP FSUB FSUBR FDIV FDIVR ] is reg_opcode = {0 to 7}
[ FLD _ FST FSTP FLDENV FLDCW FSTENV FSTCW ] is reg_opcode = {0 to 7} ...
[ FNOP ] is D9; mod = 3 & reg_opcode = 2 & r_m = [0]
[ FCHS FABS _ _ FTST FXAM _ _ ] is D9; mod = 3 & reg_opcode = 4 & r_m = {0 to 7}
[ F2XM1 FYL2X FPTAN FPATAN FXTRACT FPREM1 FDECSTP FINCSTP ] is D9; mod = 3 & reg_opcode = 6 & r_m = {0 to 7}
FXCH is D9; mod = 3 & reg_opcode = 1
Fconstants is any of [ FLD1 FLDL2T FLDL2E FLDPI FLDLG2 FLDLN2 FLDZ _ ], which is D9; mod = 3 & reg_opcode = 5 & r_m = {0 to 7}
[ FPREM FYL2XP1 FSQRT FSINCOS FRNDINT FSCALE FSIN FCOS ] is D9; mod = 3 & reg_opcode = 7 & r_m = {0 to 7}
[ FIADD FIMUL FICOM FICOMP FISUB FISUBR FIDIV FIDIVR ] is reg_opcode = {0 to 7} ...
FUCOMPP is DA; mod = 3 & reg_opcode = 5 & r_m = 1
[ FILD _ FIST FISTP FBLD FLD.ext FBSTP FSTP.ext ] is reg_opcode = {0 to 7} ...
[ FCLEX FINIT ] is DB; mod = 3 & reg_opcode = 4 & r_m = [2 3]
[ FRSTOR _ FSAVE FSTSW ] is reg_opcode = {4 to 7} ...
[ FFREE _ FST.st FSTP.st FUCOM FUCOMP _ _ ] is mod = 3 & reg_opcode = {0 to 7}
[ FADDP _ FSUBRP FDIVRP FMULP _ FSUBP FDIVP ] is mod = 3 & reg_opcode = {0 to 7}
FCOMPP is DE; mod = 3 & reg_opcode = 3 & r_m = 1
FSTSW.AX is DF; mod = 3 & reg_opcode = 4 & r_m = 0
@ This next group of floating-point patterns defines suffixes that we use on other opcodes, not actual opcodes.
<>= patterns .STi is DD; mod = 3 Fstack is any of [ .ST.STi .STi.St P.STi.ST ], which is [ D8 DC DE ]; mod = 3 Fint is any of [.I32 .I16], which is [DA DE] Fmem is any of [.R32 .R64], which is [D8 DC] FlsI is any of [.lsI16 .lsI32], which is [DF DB] FlsR is any of [.lsR32 .lsR64], which is [D9 DD] @ \subsection{Arithmetic instructions} \label{subsec:arith-insts} There are eight arithmetic instructions which have many different modes and which are all treated alike. The regular modes are shown in the upper left corners of Figures \ref{opcode0} and \ref{opcode1}; the immediate modes are the ``group 1'' instructions (denoted here by [[arithI]]). This is the only part of the Intel specification we were able to factor very well, but it does give us 112~constructors in just a dozen lines, so it is worth doing. <>= constructors arith^"iAL" i8! is arith & AL.Ib ; i8 arith^"iAX" i16! is ow; arith & eAX.Iv; i16 arith^"iEAX" i32! is od; arith & eAX.Iv; i32 arithI^"b" Eaddr, i8! is (Eb.Ib; Eaddr) & arithI; i8 arithI^"w" Eaddr, i16! is ow; (Ev.Iv; Eaddr) & arithI; i16 arithI^"d" Eaddr, i32! is od; (Ev.Iv; Eaddr) & arithI; i32 arithI^ov^"b" Eaddr, i8! is ov; (Ev.Ib; Eaddr) & arithI; i8 arith^"mrb" Eaddr, reg8 is arith & Eb.Gb; Eaddr & reg_opcode = reg8 ... arith^"mr"^ov Eaddr, reg is ov; arith & Ev.Gv; Eaddr & reg_opcode = reg ... arith^"rmb" reg8, Eaddr is arith & Gb.Eb; Eaddr & reg_opcode = reg8 ... arith^"rm"^ov reg, Eaddr is ov; arith & Gv.Ev; Eaddr & reg_opcode = reg ... <<32-bit constructors>>= arith^"iEAX" arithI^"d" arith^"mr"^od arith^"rm"^od @ \subsection{Other instructions (in alphabetical order)} Trying to factor the non-arithmetic instructions proved a thankless task, so we've given almost all instructions merely in alphabetical order as they appear in Chapter~25 of the Pentium manual. There's a little bit of local factoring, as with some bit operations. 
It's not really appropriate to try to explain this part of the specification; the best way to read this section is by comparing it with the alphabetical section of the Pentium architecture manual~\cite{intel:pentium}. To generate a checker by the end of the millennium, we divide the spec into four parts and check each individually.
@
<>=
<>
<>
<>
<>
<>
<>=
AAA
AAD is AAD; i8 = 10
AAM is AAM; i8 = 10
AAS
# ADC, ADD, AND are in arith group
ARPL Eaddr, reg16 is ARPL; Eaddr & reg_opcode = reg16 ...
<<32-bit constructors>>=
AAA AAD AAM AAS
<>=
AAA AAD AAM AAS ARPL
@ Note that [[ARPL]] requires a 16-bit register, e.g., [[%ax, %bx]], for its second operand.
@ The ``short'' variant of [[BOUND]] ([[boundw]]) requires a 16-bit register for its first operand.
<>=
constructors
BOUND^ov reg, Mem is ov; BOUND; Mem & reg_opcode = reg ...
BSF^ov reg, Eaddr is ov; BSF; Eaddr & reg_opcode = reg ...
BSR^ov reg, Eaddr is ov; BSR; Eaddr & reg_opcode = reg ...
BSWAP r32 is BSWAP & ... r32
BT^ov Eaddr, reg is ov; BT; Eaddr & reg_opcode = reg ...
BTi^ov Eaddr, i8! is ov; (grp8; Eaddr) & BTi; i8
BTC^ov Eaddr, reg is ov; BTC; Eaddr & reg_opcode = reg ...
BTCi^ov Eaddr, i8! is ov; (grp8; Eaddr) & BTCi; i8
BTR^ov Eaddr, reg is ov; BTR; Eaddr & reg_opcode = reg ...
BTRi^ov Eaddr, i8! is ov; (grp8; Eaddr) & BTRi; i8
BTS^ov Eaddr, reg is ov; BTS; Eaddr & reg_opcode = reg ...
BTSi^ov Eaddr, i8! is ov; (grp8; Eaddr) & BTSi; i8
<>=
BOUND^ov BSF^ov BSR^ov BSWAP BT^ov BTi^ov BTC^ov BTCi^ov BTR^ov BTRi^ov BTS^ov BTSi^ov
<<32-bit constructors>>=
BOUND^od BSF^od BSR^od BSWAP BT^od BTi^od BTC^od BTCi^od BTR^od BTRi^od BTS^od BTSi^od
@ To deal with relative displacements, we set up constructors to compute them. The displacements are relative to the {\em end} of the word.
<>=
constructors
rel8 reloc : Rel8 { reloc = L + i8! } is i8; L: epsilon
rel16 reloc : Rel16 { reloc = L + i16! } is i16; L: epsilon
rel32 reloc : Rel32 { reloc = L + i32!
} is i32; L: epsilon
<<32-bit constructors>>=
rel32
<>=
CALL.Jv^ow reloc is ow; CALL.Jv; rel16(reloc)
CALL.Jv^od reloc is od; CALL.Jv; rel32(reloc)
CALL.Ep^ov Mem is ov; (grp5; Mem) & CALL.Ep
CALL.aP^ow CS":" IP is ow; CALL.aP; i16 = CS; i16 = IP
CALL.aP^od CS":" IP is od; CALL.aP; i16 = CS; i32 = IP
CALL.Ev^ov Eaddr is ov; (grp5; Eaddr) & CALL.Ev
CBW is ow; CBW
CWDE is od; CBW
CLC
CLD
CLI
CLTS
CMC
@ The Linux assembler doesn't support multiple segments, so the [[CALL]] opcodes that take a code segment and offset are discarded when generating assembly code for a Linux assembler.
<>=
discard CALL.aP^ow CALL.aP^od
<>=
# CMP is in the arith group
CMPSB^av is av; CMPSB
CMPSv^ov^av is (av; ov | ov; av); CMPSv
CMPXCHG.Eb.Gb Eaddr, reg is CMPXCHG.Eb.Gb; Eaddr & reg_opcode = reg ...
CMPXCHG.Ev.Gv^ov Eaddr, reg is ov; CMPXCHG.Ev.Gv; Eaddr & reg_opcode = reg ...
CMPXCHG8B Mem is (grp9; Mem) & CMPXCHG8B
CPUID
CWD is ow; CWDQ
CDQ is od; CWDQ
<<32-bit constructors>>=
CALL.Jv^od CALL.Ep^od CALL.Ev^od CALL.aP^od CBW CWDE CLC CLD CLI CLTS CMC CMPSv^od^av CMPXCHG.Ev.Gv^od CWD CDQ
@ [[CMPXCHG8B]] and [[CPUID]] are Pentium instructions, so we can't check them on a 486.
<>=
CMPXCHG8B CPUID CWD
<>=
DAA
DAS
DEC.Eb Eaddr is (grp4; Eaddr) & DEC.Eb
DEC.Ev^ov Eaddr is ov; (grp5; Eaddr) & DEC.Ev
DEC^ov r32 is ov; DEC & r32
DIV^"AL" Eaddr is (grp3.Eb; Eaddr) & DIV.AL.eAX
DIV^"AX" Eaddr is ow; (grp3.Ev; Eaddr) & DIV.AL.eAX
DIV^"eAX" Eaddr is od; (grp3.Ev; Eaddr) & DIV.AL.eAX
<<32-bit constructors>>=
DAA DAS DEC.Ev^od DEC^od DIV^"eAX"
<>=
DAA DAS DEC.Eb
<>=
ENTER i16, i8! is ENTER; i16; i8
F2XM1
<<32-bit constructors>>=
ENTER F2XM1
<>=
ENTER F2XM1
@ The left-hand-side ellipsis is first used in the definition of [[FADD^Fstack]]. The left ellipsis denotes that the pattern [[(FADD & r_m = idx)]] must be a legal suffix of the pattern [[Fstack]].
<>=
FABS
FADD^Fmem Mem is Fmem; Mem & FADD ...
FADD^Fstack idx is Fstack & ...
(FADD & r_m = idx)
<>=
FABS FADD^Fmem FADD^Fstack
<>=
FIADD^Fint Mem is Fint; Mem & FIADD ...
FBLD Mem is DF; Mem & FBLD
FBSTP Mem is DF; Mem & FBSTP
<>=
FCHS
<>=
FCHS
<>=
FCLEX is WAIT; FCLEX
<>=
FNCLEX is FCLEX
<>=
patterns
  FCOMs is FCOM | FCOMP
<>=
constructors
  FCOMs^Fmem Mem is Fmem; Mem & FCOMs ... # includes FICOM, FICOMP
  FCOMs^.ST.STi idx is .ST.STi & ... (FCOMs & r_m = idx)
  FCOMPP
  FCOS
<>=
FCOMs^Fmem FCOMs^.ST.STi FCOMPP FCOS
<>=
discard FCOMs^.ST.STi
<>=
FDECSTP
FDIV^Fmem Mem is Fmem; Mem & FDIV ...
FDIV^Fstack idx is Fstack & ... (FDIV & r_m = idx)
FDIVR^Fmem Mem is Fmem; Mem & FDIVR ...
FDIVR^Fstack idx is Fstack & ... (FDIVR & r_m = idx)
<>=
FDECSTP FDIV^Fmem FDIV^Fstack FDIVR^Fmem FDIVR^Fstack
<>=
FFREE idx is DD; FFREE & r_m = idx
<>=
FICOM^Fint Mem is Fint; Mem & FICOM
FICOMP^Fint Mem is Fint; Mem & FICOMP
FILD^FlsI Mem is FlsI; Mem & FILD
FILD64 Mem is DF; Mem & FLD.ext ...
FINIT
<>=
patterns
  FISTs is FIST | FISTP
<>=
constructors
  FISTs^FlsI Mem is FlsI; Mem & FISTs
  FISTP64 Mem is DF; Mem & FSTP.ext
<>=
FICOM^Fint FICOMP^Fint FICOM16 FICOM32 FICOMP16 FICOMP32 FILD^FlsI FILD64 FINIT FISTs^FlsI FISTP64
@ {\em EDIT ME? There is no obvious way to reference the stack pointer indirectly, i.e., [[(%esp)]], because the encoding of this addressing mode is an escape that indicates an [[sib]] byte follows the [[mod-rm]] byte. The only way to construct this address is to build a [[sib]] byte that ignores its [[index]] field. Page~26-7 specifies how to do this. Luckily, we only need [[(%esp)]] for a few floating-point instructions that use the address on the top of the stack and then pop it off. Although this addressing mode is technically an [[Eaddr]], we specify it individually because virtually no other instruction uses it.}
<>=
FLD^FlsR Mem is FlsR; Mem & FLD
FLD80 Mem is DB; Mem & FLD.ext ...
FLD.STi idx is D9; mod = 3 & FLD & r_m = idx Fconstants FLDCW Mem is D9; Mem & FLDCW FLDENV Mem is D9; Mem & FLDENV <>= FLD^FlsR FLD80 FLD.STi Fconstants FLDCW FLDENV <>= FMUL^Fmem Mem is Fmem; Mem & FMUL ... FMUL^Fstack idx is Fstack & ... (FMUL & r_m = idx) <>= FMUL^Fmem FMUL^Fstack <>= FNOP <>= FNOP <>= FPATAN FPREM FPREM1 FPTAN <>= FPATAN FPREM FPREM1 FPTAN <>= FRNDINT FRSTOR Mem is DD; Mem & FRSTOR <>= <>= FNSAVE Mem is DD; Mem & FSAVE <>= <>= FSAVE Mem is WAIT; FNSAVE(Mem) <>= <>= FSCALE FSIN FSINCOS FSQRT <>= patterns FSTs is FST | FSTP FSTs.st is FST.st | FSTP.st <>= constructors FSTs^FlsR Mem is FlsR; Mem & FSTs FSTP80 Mem is DB; Mem & FSTP.ext FSTs.st^.STi idx is .STi & ... (FSTs.st & r_m = idx) FSTCW Mem is D9; Mem & FSTCW <>= FNSTCW Mem is WAIT; FSTCW(Mem) <>= FSCALE FSIN FSINCOS FSQRT FSTs^FlsR FSTP80 FSTs.st^.STi FSTCW FNSTCW <>= FSTENV Mem is D9; Mem & FSTENV <>= FNSTENV Mem is WAIT; FSTENV(Mem) <>= FSTENV FNSTENV <>= FSTSW Mem is DD; Mem & FSTSW FSTSW.AX <>= FNSTSW Mem is WAIT; FSTSW(Mem) FNSTSW.AX is WAIT; FSTSW.AX() <>= FSTSW FSTSW.AX FNSTSW FNSTSW.AX <>= FSUB^Fmem Mem is Fmem; Mem & FSUB ... FSUB^Fstack idx is Fstack & ... (FSUB & r_m = idx) FSUBR^Fmem Mem is Fmem; Mem & FSUBR ... FSUBR^Fstack idx is Fstack & ... (FSUBR & r_m = idx) <>= FSUB^Fmem FSUB^Fstack FSUBR^Fmem FSUBR^Fstack <>= FTST <>= FTST <>= patterns FUCOMs is FUCOM | FUCOMP <>= constructors FUCOMs idx is DD; FUCOMs & r_m = idx FUCOMPP <>= FUCOMs FUCOMPP <>= FWAIT is WAIT <>= FWAIT <>= FXAM FXCH idx is FXCH & ... 
r_m = idx FXTRACT <>= FXAM FXCH FXTRACT <>= FYL2X FYL2XP1 <>= FYL2X FYL2XP1 <>= FIADD^Fint FBLD FBSTP FNCLEX FCOS FDECSTP FDIVR^Fmem FIDIV^Fint FIDIVR^Fint FICOM^Fint FICOMP^Fint FILD64 FINCSTP FINIT FISTP64 FLD80 Fconstants FLDCW FLDENV FIMUL^Fint FNOP FPATAN FPREM FPREM1 FPTAN FRNDINT FRSTOR FSCALE FSIN FSINCOS FSQRT FSUBR^Fmem FISUB^Fint FISUBR^Fint FTST FUCOMs FUCOMPP FXAM FXCH FXTRACT FYL2X FYL2XP1 <>= HLT <<32-bit constructors>>= HLT <>= IDIV Eaddr is (grp3.Eb; Eaddr) & IDIV.AL.eAX IDIV^"AX" Eaddr is ow; (grp3.Ev; Eaddr) & IDIV.AL.eAX IDIV^"eAX" Eaddr is od; (grp3.Ev; Eaddr) & IDIV.AL.eAX IMULb Eaddr is (grp3.Eb; Eaddr) & IMUL.AL.eAX IMUL^ov Eaddr is ov; (grp3.Ev; Eaddr) & IMUL.AL.eAX IMULrm^ov reg, Eaddr is ov; IMUL.Gv.Ev; Eaddr & reg_opcode = reg ... IMUL.Ib^ov reg, Eaddr, i8! is ov; IMUL.Ib; Eaddr & reg_opcode = reg ... ; i8 IMUL.Iv^"w" reg, Eaddr, i16! is ow; IMUL.Iv; Eaddr & reg_opcode = reg ... ; i16 IMUL.Iv^"d" reg, Eaddr, i32! is od; IMUL.Iv; Eaddr & reg_opcode = reg ... ; i32 IN.AL.Ib i8! is IN.AL.Ib; i8 IN.eAX.Ib^ov i8! is ov; IN.eAX.Ib; i8 IN.AL.DX IN.eAX.DX^ov is ov; IN.eAX.DX INC.Eb Eaddr is (grp4; Eaddr) & INC.Eb INC.Ev^ov Eaddr is ov; (grp5; Eaddr) & INC.Ev INC^ov r32 is ov; INC & r32 INSB INSv^ov is ov; INSv INT3 INT.Ib i8! is INT.Ib; i8 INTO INVD INVLPG Mem is (grp7; Mem) & INVLPG IRET <<32-bit constructors>>= IDIV^"eAX" IMUL^od IMULrm^od IMUL.Iv^"d" IN.eAX.DX^od INC.Ev^od INC^od INSv^od INT3 INT.Ib INTO INVD INVLPG IRET <>= Jb^cond reloc is Jb & cond; rel8(reloc) Jv^cond^ow reloc is ow; Jv & ... cond; rel16(reloc) Jv^cond^od reloc is od; Jv & ... 
cond; rel32(reloc) JCXZ reloc is JCXZ ; rel8(reloc) JMP.Jb reloc is JMP.Jb; rel8(reloc) JMP.Jv^ow reloc is ow; JMP.Jv; rel16(reloc) JMP.Jv^od reloc is od; JMP.Jv; rel32(reloc) JMP.Ap^ow CS, IP is ow; JMP.Ap; i16 = CS; i16 = IP JMP.Ap^od CS, IP is od; JMP.Ap; i16 = CS; i32 = IP JMP.Ev^ov Eaddr is ov; (grp5; Eaddr) & JMP.Ev JMP.Ep^ov Mem is ov; (grp5; Mem ) & JMP.Ep <<32-bit constructors>>= Jv^cond^od JMP.Jv^od JMP.Ap^od JMP.Ev^od JMP.Ep^od <>= discard JMP.Ap^ow JMP.Ap^ov <>= LAHF LAR^ov reg, Eaddr is ov; LAR; Eaddr & reg_opcode = reg ... <>= patterns lfp is LDS | LES | LFS | LGS | LSS <>= constructors lfp^ov reg, Mem is ov; lfp; Mem & reg_opcode = reg ... LEA^ov reg, Mem is ov; LEA; Mem & reg_opcode = reg ... LEAVE LGDT Mem is (grp7; Mem) & LGDT LIDT Mem is (grp7; Mem) & LIDT LLDT Eaddr is (grp6; Eaddr) & LLDT LMSW Eaddr is (grp7; Eaddr) & LMSW LOCK LODSB LODSv^ov is ov; LODSv <>= patterns LOOPs is LOOP | LOOPE | LOOPNE <>= constructors LOOPs^ov reloc is ov; LOOPs; rel8(reloc) LSL^ov reg, Eaddr is ov; LSL; Eaddr & reg_opcode = reg ... LTR Eaddr is (grp6; Eaddr) & LTR <<32-bit constructors>>= LAHF LAR^ov lfp^ov LEA^ov LEAVE LGDT LIDT LLDT LMSW LOCK LODSB LODSv^ov <>= MOV^"mrb" Eaddr, reg is MOV & Eb.Gb; Eaddr & reg_opcode = reg ... MOV^"mr"^ov Eaddr, reg is ov; MOV & Ev.Gv; Eaddr & reg_opcode = reg ... MOV^"rmb" reg, Eaddr is MOV & Gb.Eb; Eaddr & reg_opcode = reg ... MOV^"rm"^ov reg, Eaddr is ov; MOV & Gv.Ev; Eaddr & reg_opcode = reg ... MOV.Ew.Sw Mem, sr16 is ow; MOV.Ew.Sw; Mem & reg_opcode = sr16 ... MOV.Sw.Ew Mem, sr16 is MOV.Sw.Ew; Mem & reg_opcode = sr16 ... # assume 32-bit address mode MOV.AL.Ob offset is MOV.AL.Ob; i32 = offset MOV.eAX.Ov^ov offset is ov; MOV.eAX.Ov; i32 = offset MOV.Ob.AL offset is MOV.Ob.AL; i32 = offset MOV.Ov.eAX^ov offset is ov; MOV.Ov.eAX; i32 = offset MOVib r8, i8! is MOVib & r8; i8 MOViw r16, i16! is ow; MOViv & r16; i16 MOVid r32, i32! is od; MOViv & r32; i32 MOV.Eb.Ib Eaddr, i8! 
is MOV.Eb.Ib; Eaddr & reg_opcode = 0 ...; i8 MOV.Ev.Iv^ow Eaddr, i16! is ow; MOV.Ev.Iv; Eaddr & reg_opcode = 0 ...; i16 MOV.Ev.Iv^od Eaddr, i32! is od; MOV.Ev.Iv; Eaddr & reg_opcode = 0 ...; i32 MOV.Cd.Rd cr, reg is MOV.Cd.Rd; mod = 3 & r_m = reg & reg_opcode = cr MOV.Rd.Cd reg, cr is MOV.Rd.Cd; mod = 3 & r_m = reg & reg_opcode = cr MOV.Dd.Rd dr, reg is MOV.Dd.Rd; mod = 3 & r_m = reg & reg_opcode = dr MOV.Rd.Dd reg, dr is MOV.Rd.Dd; mod = 3 & r_m = reg & reg_opcode = dr MOVSB MOVSv^ov is ov; MOVSv MOVSX.Gv.Eb^ov r32, Eaddr is ov; MOVSX.Gv.Eb; Eaddr & reg_opcode = r32 ... MOVSX.Gv.Ew r16, Eaddr is MOVSX.Gv.Ew; Eaddr & reg_opcode = r16 ... MOVZX.Gv.Eb^ov r32, Eaddr is ov; MOVZX.Gv.Eb; Eaddr & reg_opcode = r32 ... MOVZX.Gv.Ew r16, Eaddr is MOVZX.Gv.Ew; Eaddr & reg_opcode = r16 ... MUL.AL Eaddr is (grp3.Eb; Eaddr) & MUL.AL.eAX MUL.AX^ov Eaddr is ov; (grp3.Ev; Eaddr) & MUL.AL.eAX <<32-bit constructors>>= MOV^"mrb" MOV^"mr"^ov MOV^"rmb" MOV^"rm"^ov MOV.Ew.Sw MOV.Sw.Ew MOV.AL.Ob MOV.eAX.Ov^ov MOV.Ob.AL MOV.Ov.eAX^ov MOVib MOViw MOVid MOV.Eb.Ib MOV.Ev.Iv^ow MOV.Ev.Iv^od MOVSB MOVSv^ov MOVSX.Gv.Eb^od MOVSX.Gv.Ew MOVZX.Gv.Eb^od MOVZX.Gv.Ew MUL.AL MUL.AX^ov <>= discard MOVSX.Gv.Ew MOVZX.Gv.Eb^ov MOVZX.Gv.Ew MOV.Ew.Sw <>= NEGb Eaddr is (grp3.Eb; Eaddr) & NEG NEG^ov Eaddr is ov; (grp3.Ev; Eaddr) & NEG NOP NOTb Eaddr is (grp3.Eb; Eaddr) & NOT NOT^ov Eaddr is ov; (grp3.Ev; Eaddr) & NOT <<32-bit constructors>>= NEGb NEG^ov NOP NOTb NOT^ov <>= # OR is in the arith group OUT.Ib.AL i8! is OUT.Ib.AL; i8 OUT.Ib.eAX^ov i8! is ov; OUT.Ib.eAX; i8 OUT.DX.AL OUT.DX.eAX^ov is ov; OUT.DX.eAX OUTSB OUTSv^ov is ov; OUTSv <<32-bit constructors>>= OUT.Ib.AL OUT.Ib.eAX^ov OUT.DX.AL OUT.DX.eAX^ov OUTSB OUTSv^ov <>= POP.Ev^ov Mem is ov; POP.Ev; Mem & reg_opcode = 0 ... 
POP^ov r32 is ov; POP & r32 <>= patterns POPs is POP.ES | POP.SS | POP.DS | POP.FS | POP.GS POPv is POPA | POPF <>= constructors POPs POPv^ov is ov; POPv PUSH.Ev^ov Eaddr is ov; (grp5; Eaddr) & PUSH.Ev PUSH^ov r32 is ov; PUSH & r32 PUSH.Ib i8! is PUSH.Ib; i8 PUSH.Iv^ow i16! is ow; PUSH.Iv; i16 PUSH.Iv^od i32! is od; PUSH.Iv; i32 <>= patterns PUSHs is PUSH.CS | PUSH.SS | PUSH.DS | PUSH.ES | PUSH.FS | PUSH.GS PUSHv is PUSHA | PUSHF <>= constructors PUSHs PUSHv^ov is ov; PUSHv <<32-bit constructors>>= POPs POPv^ov PUSH.Ev^ov PUSH^ov PUSH.Ib PUSH.Iv^ow PUSH.Iv^od PUSHs PUSHv^ov <>= # ROL ROR RCL RCR SHLSAL SHR SAR rot^bshifts Eaddr is (bshifts; Eaddr) & rot rot^vshifts^ov Eaddr is ov; (vshifts; Eaddr) & rot rot^B.Eb.Ib Eaddr, i8! is (B.Eb.Ib; Eaddr) & rot; i8 rot^B.Ev.Ib^ov Eaddr, i8! is ov; (B.Ev.Ib; Eaddr) & rot; i8 <<32-bit constructors>>= rot^bshifts rot^vshifts^ov rot^B.Ev.Ib^ov <>= RDMSR REP REPNE RET RET.far RET.Iw i16 is RET.Iw; i16 RET.far.Iw i16 is RET.far.Iw; i16 RSM <<32-bit constructors>>= RDMSR REP REPNE RET RET.far RET.Iw RET.far.Iw RSM <>= discard RDMSR <>= SAHF # SAL SAR SHL SR above with rot # SBB is in the arith group SCASB SCASv^ov is ov; SCASv ## SETb^cond Mem is SETb & ... cond; Mem SETb^cond Eaddr is SETb & ... cond; Eaddr SGDT Mem is (grp7; Mem) & SGDT SIDT Mem is (grp7; Mem) & SIDT <<32-bit constructors>>= SCASB SCASv^ov SETb^cond SGDT SIDT <>= patterns shdIb is SHRD.Ib | SHLD.Ib shdCL is SHRD.CL | SHLD.CL <>= constructors shdIb^ov Eaddr, reg, count is ov; shdIb; Eaddr & reg_opcode = reg ... ; i8 = count shdCL^ov Eaddr, reg, "CL" is ov; shdCL; Eaddr & reg_opcode = reg ... 
SLDT Eaddr is (grp6; Eaddr) & SLDT SMSW Eaddr is (grp7; Eaddr) & SMSW STC STD STI STOSB STOSv^ov is ov; STOSv STR Mem is (grp6; Mem) & STR # SUB is in the arith group <<32-bit constructors>>= shdIb^ov shdCL^ov SLDT SMSW STC STD STI STOSB STOSv^ov STR <>= TEST.AL.Ib i8 is TEST.AL.Ib; i8 TEST.eAX.Iv^ow i16 is ow; TEST.eAX.Iv; i16 TEST.eAX.Iv^od i32 is od; TEST.eAX.Iv; i32 TEST.Eb.Ib Eaddr, i8 is (grp3.Eb; Eaddr) & TEST.Ib.Iv; i8 TEST.Ew.Iw Eaddr, i16 is ow; (grp3.Ev; Eaddr) & TEST.Ib.Iv; i16 TEST.Ed.Id Eaddr, i32 is od; (grp3.Ev; Eaddr) & TEST.Ib.Iv; i32 TEST.Eb.Gb Eaddr, reg is TEST.Eb.Gb; Eaddr & reg_opcode = reg ... TEST.Ev.Gv^ov Eaddr, reg is ov; TEST.Ev.Gv; Eaddr & reg_opcode = reg ... <<32-bit constructors>>= TEST.AL.Ib TEST.eAX.Iv^ow TEST.eAX.Iv^od TEST.Eb.Ib TEST.Ew.Iw TEST.Ed.Id TEST.Eb.Gb TEST.Ev.Gv^ov <>= VERR Eaddr is (grp6; Eaddr) & VERR VERW Eaddr is (grp6; Eaddr) & VERW <<32-bit constructors>>= VERR VERW <>= WAIT WBINVD WRMSR <<32-bit constructors>>= WAIT WBINVD WRMSR <>= discard WRMSR <>= XADD.Eb.Gb Eaddr, reg is XADD.Eb.Gb; Eaddr & reg_opcode = reg ... XADD.Ev.Gv^ov Eaddr, reg is ov; XADD.Ev.Gv; Eaddr & reg_opcode = reg ... XCHG^"eAX"^ov r32 is ov; XCHG & r32 XCHG.Eb.Gb Eaddr, reg is XCHG.Eb.Gb; Eaddr & reg_opcode = reg ... XCHG.Ev.Gv^ov Eaddr, reg is ov; XCHG.Ev.Gv; Eaddr & reg_opcode = reg ... XLATB is XLAT # XOR is in the arith group <<32-bit constructors>>= XADD.Eb.Gb XADD.Ev.Gv^ov XCHG^"eAX"^ov XCHG.Eb.Gb XCHG.Ev.Gv^ov XLATB @ \subsection{Synthetic instructions} The only synthetics we've identified are those that are preceded by a [[WAIT]] instruction, which we've included in the alphabetical list above, since they're all given together in Chapter~25. @ The Gnu-Linux assembler does not recognize instructions prefixed by [[WAIT]] as synthetics. For example, [[wait; fclex]] disassembles as [[fclex]]; [[fclex]] disassembles as [[fnclex]]. <>= constructors <> @ [[mld]] needs some synthetics to help it deal with overloaded operators. 
This is a hack but better than writing them all out by hand. <>= constructors _call_l reloc : Function is call_jvod(reloc) _call_rm Mem : Function is call_evod(Mem) callfn Function is Function @ The following constructor is used to emit relocatable addresses. It is the same as that defined for the MIPS and SPARC. <>= fields of addrtoken (32) addr32 0:31 placeholder for addrtoken is addr32 = 7 constructors emit_raddr reloc is addr32 = reloc @ \subsection{Assembly Specification for Gnu-Linux Assembler} Some assemblers overload instruction names, using context to determine which instruction is meant. For example, the Pentium [[add]] can mean any of five different instructions, depending on the sizes and locations of the operands. The toolkit cannot do this kind of overloading; it must use different names for different instructions (constructors). The reason is that the toolkit must generate a different encoding procedure for each instruction, and in most programming languages, different procedures must have different names. Even in languages that do support overloading, we might not be able to use the same name, because the name-resolution mechanisms used in programming languages are typically type-based and quite different from what an assembler uses (LR parsing). We solve this problem by requiring each constructor to have a different name. Typically, the specification writer distinguishes variants by adding suffixes to the base name of the constructor. For example, the Pentium [[add]] instructions include constructors called [[addb]], [[addib]], [[addiowb]], [[addmrb]], and [[addrmb]]. To get from these names back to assembly language, we have to define an appropriate mapping. These mappings are defined separately from the main specification, because different vendors use different syntaxes for their assembly languages. \iffalse The toolkit needs the assembly names of constructors and the assembly format for their operands to generate assembly emitters. 
Such emitters are called by toolkit-generated checker programs and can be called from an application just like binary emitters. Determining the assembly name for a constructor is difficult, because assembly opcodes are often overloaded, i.e., one name maps to multiple opcodes in the target instruction set. On the Pentium, for example, the assembly opcode [[addb]] can be assembled into at least five different target instructions, i.e., the instructions emitted by the constructors [[addb]], [[addib]], [[addiowb]], [[addmrb]], and [[addrmb]]. Operator overloading is not a problem for an assembler, because with the help of an LR parser, it can use the types and format of an instruction's operands to distinguish between variants. Operator overloading poses a difficult problem in the design of the toolkit specification language and in the implementation of the toolkit itself. %The toolkit generator produces procedures that encode instructions; An application writer must be able to distinguish between variants of overloaded instructions when calling the encoding procedures generated by the toolkit. One solution is to generate the encoding procedures in a language that supports overloading of procedure names, but this significantly limits portability of the toolkit. Our current solution requires that the specification writer provide unique names for overloaded opcodes in constructor specifications. The specification or application writer provides a separate specification that maps overloaded assembly names to the constructor names they represent. Decoupling constructor names from assembly names enables use of a single instruction set specification with multiple assembly specifications. This is useful when one architecture is the target for multiple assemblers with mutually incompatible syntax; the Intel~486 assembly languages provided by the Gnu Linux and the Borland MS/DOS assemblers, for example, are incompatible. % assembly syntax for the Intel~486. 
\fi An assembly specification includes three parts: a mapping of constructor names to assembly opcodes, the assembly format for each constructor operand, and the assembly syntax for each constructor. We use the specification for the assembly language supported by the Gnu-Linux assembler to illustrate assembly specifications. First, we describe assembly name mappings, then operand format and constructor assembly syntax. \subsubsection{Assembly names for opcodes} A constructor's name may contain multiple parts derived from pattern names and constant strings. Each part may or may not contribute to the assembly name. For example, the constructor name [[add^"mrb"]] contains two parts, the first derived from the pattern [[add]] and the second from the string [["mrb"]]. The assembly name for this constructor is [[addb]], so the pattern [[add]] contributes its name and the suffix [["mrb"]] is mapped to the string [["b"]]. This example illustrates how the constructor [[addmrb]] has a more specific name to disambiguate it from other overloaded variants of [[addb]]. [[assembly opcode]] introduces mappings from complete constructor names (strings) to their assembly names (strings). [[assembly component]] introduces mappings from parts of constructor names to their assembly names. We provide a component-wise mapping, because it improves factoring of assembly names among constructors that share common suffixes and prefixes. For example, the suffix [[B.Eb.Ib]] always maps to [[b]] in every constructor name where it appears. There is redundancy in the mapping of constructor names, so we use a regular expression syntax to group related names. The regular expression syntax is the same as the syntax for C-shell ``globbing'' expressions~\cite{joy:c-shell}. If a complete name mapping exists for a constructor name, it is applied first. 
If no complete mapping exists, mappings are applied individually to {\em each} part of a constructor's name, and the resulting strings are concatenated into the complete assembly name. For example, the mappings applied to the parts of the constructor name [[add^"mrb"]] are:
<>=
assembly component
  add is add
  {"mrb","rmb"} is b
@ The first rule maps [[add]] to itself, and the second maps any string that matches [[mrb]] or [[rmb]] to [[b]]. The least general rule that matches a string is applied. It is often useful to define a default mapping, i.e., one for the pattern ``[[*]]''. On the Pentium, for example, most constructor names do not map directly to assembly names, so the default maps a name to the empty string. {\em This is wrong. }
<>=
assembly component
  {Indir,{Disp*},Abs32,Reg,{*Index*},E,rel{8,16,32}} is ""
  {*} is $1
@ On the MIPS and SPARC, however, most constructor names are assembly names, so the default maps a name to itself, i.e., [[assembly component {*} is $1]].
@ The remaining rules specify all the assembly names for the Pentium constructors and illustrate the use of C-shell globbing expressions. In globbing expressions, ``*'' matches any string, and any other character, including ``.'', matches itself. The concatenation operator is implicit, so adjacent characters are concatenated. Alternatives are comma-separated lists of strings delimited by ``\{'' and ``\}''. We provide one extension to the C-shell syntax: expressions in curly braces may be referenced on the right-hand side by \$$n$, which stands for the $n$th braced expression on the left-hand side. For example, the rules for [[{o,a}w]] and [[{o,a}d]] specify that the suffixes [[ow]] and [[aw]] map to the assembly name [[w]], and [[od]] and [[ad]] map to [[l]]. The rest of these rules map all the suffixes used in constructor names to their assembly names.
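As a reading aid for the back-reference extension, the following trace shows how we would expect a few of the rules in this section to apply to individual name parts. It assumes only the matching behaviour described above (``*'' matches any string, braces list alternatives, and \$$n$ stands for the text matched by the $n$th braced expression); it is a hand-worked illustration, not output produced by the toolkit.
\begin{verbatim}
rule                    name part    result
{o,a}d is l             od           l
{o,a}d is l             ad           l
{b,w} is $1             w            w
{FLD,FSTP}80 is $1t     FSTP80       FSTPt
{*}.st is $1            FST.st       FST
\end{verbatim}
In the last two lines, \$1 carries the matched prefix through to the assembly name while the rest of the part is dropped or replaced.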
<>=
assembly component
  {iAL,AL} is b
  {iAX,AX} is w
  {iEAX,eAX} is l
  {o,a}d is l
  {o,a}w is w
  {.I32,.R64,.lsI32,.lsR64} is l
  {.I16,.R32,.lsR32} is s
  .lsI16 is w
  b.* is b
  {b,w} is $1
  d is l
  B.{Eb.1,Eb.CL,Eb.Ib,Ev.Ib} is b
  B.{Ev.1,Ev.CL} is w
  {.STi,.ST.STi,.STi.St} is ""
  P.STi.ST is P
  .{O,NO,B,NB,Z,NZ,BE,NBE,S,NS,P,NP,L,NL,LE,NLE} is $1
@ Many constructor names contain suffixes that are mnemonics for the opcodes they represent. These suffixes are often eliminated in the assembly name. For example, the assembly names for the immediate opcodes [[ADD.i]] and [[OR.i]] are [[ADD]] and [[OR]], respectively. The following rules truncate these suffixes.
<>=
assembly component
  {CALL}.* is $1
  {CALL}l is $1
  CMPXCHG8B is CMPXCHG
  {CMPXCHG,XADD,XCHG,TEST}.Eb.Gb is $1b
  CMPSv is CMPS
  {CMP*}.* is $1
  {DEC,INC}.* is $1
  {DIV}.* is $1
  {*}.st is $1
  {FLD,FSTP}80 is $1t
  {FLD,FSTP}.* is $1
  {FILD,FISTP}.* is $1
  {FICOM*}16 is $1s
  {FICOM*}32 is $1l
  {FILD,FISTP*}64 is $1ll
  {IDIV,IMUL}.* is $1
  {IN,INT,J}.* is $1
  JMP.Ep is lJMP
  {JMP}.* is $1
  MOV{.Eb.Ib,.AL.Ob,.Ob.AL} is MOVb
  {MOViv,MOV.Ev.Iv} is MOV
  MOVSX.Gv.Ew is MOVSwl
  {MOV.Ew.Sw,MOV.Sw.Ew} is MOVw
  {MOVS,MOVZ}X.Gv.Eb is $1b
  {MOVSv,MOVSX.*} is MOVS
  {MOV,MOVS}.* is $1
  {MOVSX,MOVZX}.* is $1
  MOVi{b,w} is MOV$1
  MOVid is MOVl
  {*}.AX is $1
  {OUT.Ib.AL,OUT.DX.AL} is OUTb
  {OUT,OUTS}.* is $1
  {RET.far}* is lRET
  {POP,PUSH,RET}.* is $1
  {SCAS,STOS}v is $1
  {SHRD,SHLD}.* is $1
  SHRSAL is SHR
  TEST{.*.Ib,.Eb.*} is TESTb
  TEST.*.Iw is TESTw
  TEST.*.Id is TESTl
  TEST.* is TEST
  {XADD*}.* is $1
  {XCHG*}.* is $1
  {*}i is $1
@ The remaining rules are for constructors with special assembly names.
<>=
assembly component
  {"mr","rm"} is ""
  {"mrb","rmb"} is b
  {*}64 is $1
  {IDIV,DIV}"AL" is $1
  {IDIV,DIV}"AX" is $1
  {IDIV,DIV}"eAX" is $1
  {IMULrm} is IMUL
  INT3 is INT
  FLD.ext is FLDLL
  {Jv,Jb} is J
  {INC}.Eb is INCb
  {INS,LODS}v is $1
  MUL.AL is MULb
  OUTSv is OUTS
  SHLSAL is SHL
  TEST.Ew.Iw is TESTw
  TEST.Ed.Id is TESTl
  SETb is SET
@ Component-wise mapping doesn't work for all names.
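To see why, consider the string-compare constructor [[CMPSv^ov^av]] with the [[od]] and [[ad]] suffixes. The trace below is an illustration only: it assumes that the name splits into the parts [[CMPSv]], [[od]], and [[ad]], that each part is mapped by the component rules above, and that the results are concatenated.
\begin{verbatim}
part      component rule     result
CMPSv     CMPSv is CMPS      CMPS
od        {o,a}d is l        l
ad        {o,a}d is l        l

component-wise name:   CMPSll
complete-name rule:    CMPSv{od,ow}ad is CMPSl
\end{verbatim}
Under these assumptions the component-wise result carries one size suffix too many, so a rule on the complete name is needed; the complete-name rule shown is taken from the chunk that follows.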
<>=
assembly opcode
  CALL.{Ev}{od} is CALL
  CALL.{Jv,Ep}{od,ow} is lCALL
  CALL.aP{od} is CALL
  CMPSv{od,ow}ad is CMPSl
  CMPSv{od,ow}aw is CMPSw
  JMP.Epod is lJMP
  MOVSX.Gv.Ebod is MOVSbl
  MOVSX.Gv.Ebow is MOVSbw
  {ROL,ROR,RCL,RCR,SHR,SAR}{B.Ev.*}od is $1l
  {ROL,ROR,RCL,RCR,SHR,SAR}{B.Ev.*}ow is $1w
  SHLSAL{B.Ev.*}od is SHLl
  SHLSAL{B.Ev.*}ow is SHLw
  XCHGeAXow is XCHGw
  XCHGeAXod is XCHGl
@ \subsubsection{Assembly formats for operands}
[[assembly operand]] introduces mappings from operands to formatted strings that specify how to print the operands in assembly code. Operands may be fields, integer inputs, relocatable addresses, or constant strings. We use [[printf]]-style syntax for formatted strings. The first [[assembly operand]] rule below specifies that immediate operands are prefixed by ``\$'' and are printed as integers. The second rule specifies that the listed field inputs are prefixed by ``\%'' and printed as strings, using the names provided in their [[fieldinfo]] declarations.
<>=
assembly operand
  [count i8 i16 i32] is "$%d"
  [r32 sr16 r16 r8 base index] is "%%%s"
@ Some inputs are not declared as fields but should be printed in the same format as fields. For example, the input [[reg]] should be printed using the names associated with the field [[base]]. The following rule uses the optional [[using ]]{\em field} clause to specify that [[reg]], [[reg8]], etc. should be printed using the fieldinfo associated with [[base]].
<>=
assembly operand
  [reg reg8 sreg cr dr] is "%%%s" using field base
@ Some operands are used implicitly in a constructor. For example, the constructor \verb|OUT.DX.AL "dx","al"| implicitly uses the [[dx]] and [[al]] registers as its operands. Like all other operands, their format is assembler-dependent, so we provide mappings for printing them in the Gnu-Linux format.
<>= assembly operand dx is "%%dx" ax is "%%ax" @ \subsubsection{Assembly syntax for constructors} The default assembly syntax for a constructor appears in the constructor's specification; an alternate syntax may be specified with [[assembly syntax]]. Providing assembly syntax in a constructor specification can help a specification writer or user read and identify a constructor, and it is concise when only one assembly syntax is required. An alternate syntax may be needed, however, if more than one assembly language is used on the target. [[assembly syntax]] uses the same syntax as the [[constructors]] directive: a constructor name followed by a list of operands. The assembly-syntax specification must use the same set of operands that the constructor uses, but the operands may appear in any order and with any syntactic sugar. To reduce redundancy, we define new patterns to group constructors that share the same assembly syntax. \iffalse All field and constructor operands must match in name, number, and type with the operands provided in the constructor's specification. Additional strings are permitted in the assembly syntax. \fi The Gnu-Linux assembly language reverses the order of all operands. The following directives reverse the order. 
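For orientation, here are three operand lists as they appear in the constructor definitions earlier in this specification, next to the order used in the directives that follow. The pairs are copied from this document; only the side-by-side layout is new.
\begin{verbatim}
constructor definition        assembly syntax
MOVib r8, i8!                 MOVib i8!, r8
LEA^ov reg, Mem               LEA^ov Mem, reg
MOV^"rm"^ov reg, Eaddr        MOV^"rm"^ov Eaddr, reg
\end{verbatim}
A constructor that has no [[assembly syntax]] directive keeps the default syntax given in its definition.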
<>= assembly syntax arith^"iAL" i8!, "%al" arith^"iAX" i16!, "%ax" arith^"iEAX" i32!, "%eax" DIV^"AL" Eaddr, "%al" DIV^"AX" Eaddr, "%ax" DIV^"eAX" Eaddr, "%eax" arithI^"b" i8!, Eaddr arithI^"w" i16!, Eaddr arithI^"d" i32!, Eaddr arithI^ov^"b" i8!, Eaddr MOV.Eb.Ib i8!, Eaddr MOV.Ev.Iv^ow i16!, Eaddr MOV.Ev.Iv^od i32!, Eaddr @ <>= assembly syntax arith^"rmb" Eaddr, reg8 arith^"rm"^ov Eaddr, reg IMULrm^ov Eaddr, reg MOV^"rmb" Eaddr, reg MOV^"rm"^ov Eaddr, reg MOVZX.Gv.Ew Eaddr, r16 MOVSX.Gv.Ew Eaddr, r16 MOVZX.Gv.Eb^ov Eaddr, r32 MOVSX.Gv.Eb^ov Eaddr, r32 BSF^ov Eaddr, reg BSR^ov Eaddr, reg LAR^ov Eaddr, reg arith^"mrb" reg8, Eaddr arith^"mr"^ov reg, Eaddr MOV^"mr"^ov reg, Eaddr MOV^"mrb" reg, Eaddr TEST.Ev.Gv^ov reg, Eaddr BT^ov reg, Eaddr BTi^ov i8!, Eaddr BTC^ov reg, Eaddr BTCi^ov i8!, Eaddr BTR^ov reg, Eaddr BTRi^ov i8!, Eaddr BTS^ov reg, Eaddr BTSi^ov i8!, Eaddr CMPXCHG.Eb.Gb reg, Eaddr CMPXCHG.Ev.Gv^ov reg, Eaddr @ <>= patterns fstack is FADD | FDIV | FDIVR | FMUL | FSUB | FSUBR fsti is fstack | FCOMs stidx is FFREE | FUCOMs | FXCH Sstack is P.STi.ST | .STi.St assembly syntax fstack^Sstack "%st", "%st"(idx) fstack^.ST.STi "%st"(idx), "%st" FCOMs^.ST.STi "%st"(idx), "%st" FSTs.st^.STi "%st"(idx) FLD.STi "%st"(idx) stidx "%st"(idx) FNSTSW.AX "%ax" FSTSW.AX "%ax" IDIV^"AX" Eaddr, "%ax" IDIV^"eAX" Eaddr, "%eax" IN.AL.Ib i8!, "%al", i8! IN.eAX.Ib^ov i8!, "%eax", i8! IN.AL.DX "%dx, %al" IN.eAX.DX^ov "%dx, %eax" IMUL.Iv^"d" i32!, Eaddr, reg INT3 "$3" LEA^ov Mem, reg MOVib i8!, r8 MOViw i16!, r16 MOVid i32!, r32 MOV.AL.Ob offset, "%al" MOV.eAX.Ov^ov offset, "%eax" MOV.Ob.AL "%al", offset MOV.Ov.eAX^ov "%eax", offset OUT.Ib.AL "%al", i8! OUT.Ib.eAX^ov "%eax", i8! 
OUT.DX.AL "%al", "%dx"
OUT.DX.eAX^ow "%ax", "%dx"
OUT.DX.eAX^od "%eax", "%dx"
patterns
  pES is POP.ES | PUSH.ES
  pSS is POP.SS | PUSH.SS
  pDS is POP.DS | PUSH.DS
  pFS is POP.FS | PUSH.FS
  pGS is POP.GS | PUSH.GS
assembly syntax
  pES "%ES"
  pSS "%SS"
  pDS "%DS"
  pFS "%FS"
  pGS "%GS"
  PUSH.CS "%CS"
  rot^B.Eb.1 "$1", Eaddr
  rot^B.Ev.1^ov "$1", Eaddr
  rot^B.Eb.CL "%cl", Eaddr
  rot^B.Ev.CL^ov "%cl", Eaddr
  rot^B.Eb.Ib i8!, Eaddr
  rot^B.Ev.Ib^ov i8!, Eaddr
  shdIb^ov count, reg, Eaddr
  shdCL^ov "%cl", reg, Eaddr
<>=
  TEST.AL.Ib i8, "%al"
  TEST.eAX.Iv^ow i16, "%ax"
  TEST.eAX.Iv^od i32, "%eax"
  TEST.Eb.Ib i8, Eaddr
  TEST.Ew.Iw i16, Eaddr
  TEST.Ed.Id i32, Eaddr
  TEST.Eb.Gb reg, Eaddr
  TEST.Ev.Gv^ov reg, Eaddr
  XADD.Eb.Gb reg, Eaddr
  XADD.Ev.Gv^ov reg, Eaddr
  XCHG.Eb.Gb reg, Eaddr
  XCHG^"eAX"^ov "%eax", r32
  XCHG.Ev.Gv^ov reg, Eaddr
@ Effective addresses also use a different syntax under Gnu-Linux.
<>=
assembly syntax
  Indir (reg)
  Disp32 d(reg)
  Index (base,index,ss)
  Index32 d(base,index,ss)
  ShortIndex d(,index,ss)
@ \subsection{Miscellaneous}
The Intel~486 instruction set is a subset of the Pentium set. When generating instructions for that target, the Pentium-only instructions are discarded.
<>=
<>
discard <> <>
@ Names for generating assembly emitters are specified in several chunks.
<>=
<> <> <>
<>=
<>
patterns
<> <> <> <> <> <>
@ Check the arithmetic constructors. Substitute other chunks for this one to check each chunk.
<>=
<> <>
<>=
<> <>
<>=
<> <>
<>=
<> <>
<>=
<> <>
<>=
<> <>
<>=
<> <>
@ The Pentium checker checks all 32-bit instructions accepted by the Gnu-Linux assembler. We omit some variants of the [[rot]] instructions, because exhaustively checking each one seemed unnecessary.
<>=
<> <>
discard rot^B.Eb.1 rot^B.Ev.CL ROL^B.Eb.Ib RCL^B.Eb.Ib SHLSAL^B.Eb.Ib SAR^B.Eb.Ib ROR^B.Ev.Ib^ov RCR^B.Ev.Ib^ov SHR^B.Ev.Ib^ov
<>=
.align 16