AMD64 overview

Comp 40

Key locations

Integer unit

The 64-bit registers, in order of register number, are rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi, and r8 through r15. The figure "AMD64 Integer Registers" shows the various sub-registers. You are quite likely to encounter such registers as eax or edi, especially when dealing with functions that take 32-bit parameters.
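As a rough illustration of the sub-register idea, the C sketch below extracts the low 32, 16, and 8 bits of a 64-bit value. This mirrors the relationship between rax and its sub-registers eax, ax, and al. The function names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Each function keeps only the low-order bits of a 64-bit value,
 * the way a sub-register names the low-order bits of its full register. */
uint32_t low32(uint64_t r) { return (uint32_t)r; }  /* like eax within rax */
uint16_t low16(uint64_t r) { return (uint16_t)r; }  /* like ax within rax  */
uint8_t  low8(uint64_t r)  { return (uint8_t)r;  }  /* like al within rax  */
```

For the value 0x1122334455667788, the three functions yield 0x55667788, 0x7788, and 0x88.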

The integer status register includes the typical flags OF (overflow flag), SF (sign flag), ZF (zero flag), and CF (carry flag). Flags unique to the Intel family include PF (parity flag), AF (auxiliary carry flag), and DF (direction flag for string operations). Flags are set by most arithmetic operations and tested by the "jump conditional" instructions.
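To make the four typical flags concrete, here is a minimal C sketch that computes what OF, SF, ZF, and CF would be after a 64-bit addition. The struct and function names are hypothetical; the hardware sets these flags itself, and this code only models the result.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of the four typical flags named in the text. */
struct flags { bool of, sf, zf, cf; };

/* Model a 64-bit add: store the wrapped sum and report the flags. */
struct flags add_flags(uint64_t a, uint64_t b, uint64_t *sum)
{
    uint64_t r = a + b;                 /* wraps modulo 2^64 */
    struct flags f;
    f.cf = r < a;                       /* carry out of bit 63 (unsigned overflow) */
    f.zf = (r == 0);                    /* result is zero */
    f.sf = (r >> 63) != 0;              /* sign bit of the result */
    /* signed overflow: both operands have a sign the result does not */
    f.of = (((a ^ r) & (b ^ r)) >> 63) != 0;
    *sum = r;
    return f;
}
```

Adding 1 to the largest unsigned value sets CF and ZF but not OF; adding 1 to the largest signed value sets OF and SF but not CF.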

128-bit multimedia unit

This unit includes sixteen 128-bit registers numbered xmm0 to xmm15. This unit provides a variety of vector-parallel instructions (Streaming SIMD Extensions, or SSE) including vector-parallel floating-point operations on either 32-bit or 64-bit IEEE floating-point numbers (single and double precision).

IEEE Floating-point unit

The IEEE floating-point unit has eight 80-bit registers numbered fpr0 to fpr7. It provides floating-point operations on 80-bit IEEE floating-point numbers (double extended precision).

Parameter registers

The first six integer or pointer parameters are passed, in order, in registers %rdi, %rsi, %rdx, %rcx, %r8, and %r9. Single-precision and double-precision floating-point parameters (float and double) are passed in registers %xmm0 through %xmm7. Large structure parameters, extended-precision floating-point numbers (long double), and parameters too numerous to fit in registers are passed on the stack.
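The sketch below annotates a C function with the locations its parameters would occupy under the convention just described. The function mix is hypothetical; the comments describe what a conforming compiler is expected to do, and the C code itself cannot observe the registers.

```c
/* For a call mix(1, 2, 3, 4, 5, 6, 7, 0.5) under the convention above:
 *   a in %rdi, b in %rsi, c in %rdx, d in %rcx, e in %r8, f in %r9,
 *   g on the stack (the integer registers are used up),
 *   x in %xmm0 (the first floating-point parameter). */
long mix(long a, long b, long c, long d, long e, long f, long g, double x)
{
    return a + b + c + d + e + f + g + (long)(x * 2.0);
}
```

Compiling this with the -S option and reading the assembly output is one way to check the register assignments on a given system.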

Result registers

An integer result is normally returned in rax. If an integer result is too large to fit in a 64-bit register, it is returned in the rdx:rax register pair. A single-precision or double-precision floating-point result is returned in xmm0; an extended-precision floating-point result is returned on top of the floating-point stack in st0. Complex numbers return their imaginary parts in xmm1 or st1.
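One way to see a result that is too large for a single 64-bit register is a 128-bit product. The sketch below assumes a compiler that supports the nonstandard unsigned __int128 type (GCC and Clang do); the function name wide_mul is hypothetical.

```c
#include <stdint.h>

/* The 128-bit result does not fit in one 64-bit register, so it is a
 * candidate for return in a register pair, as described above.
 * unsigned __int128 is a compiler extension, not standard C. */
unsigned __int128 wide_mul(uint64_t a, uint64_t b)
{
    return (unsigned __int128)a * b;
}
```

Shifting the result right by 64 recovers the high half, and converting it to uint64_t recovers the low half.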

Registers preserved across calls

Most registers are overwritten by a procedure call, but the values in the following registers must be preserved:

  %rbx  %rsp  %rbp  %r12  %r13  %r14  %r15

In addition, the contents of the x87 floating-point control word, which controls rounding modes and other behavior, must be preserved across calls.

A typical procedure arranges preservation with a prolog that pushes %rbp and %rbx and subtracts a constant k from %rsp. The body of the procedure usually avoids %r12 through %r15 entirely. Finally, before returning, the procedure adds k to %rsp, then pops %rbx and %rbp. But there are many other ways to achieve the same goal, which is that on exit, the nonvolatile registers have the same values they had on entry.

[Figure: AMD64 Integer Registers. Each 64-bit register is drawn with its 32-bit, 16-bit, and 8-bit sub-registers, with bit positions 64, 32, 16, 8, and 0 marked.]

Assembly-language reference to operands and results

A reference to an operand or result is called an effective address. The value of an operand may be coded into the instruction as a literal or immediate operand, or it may be stored in a container. A result is always stored in a container.

Immediate operands begin with $ and are followed by C syntax for a decimal or hexadecimal literal, as in $0x18 or $0x3.

In DDD and gdb, literals are written as in C, without the $ sign. As in C, hexadecimal literals must have a leading 0x.

The machine can refer to two kinds of containers: registers and memory. Registers are referred to by name, with a % sign in the assembler and in objdump:

   %rax   %xmm0

In DDD, registers are referred to with a $ sign.

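Memory locations are always referred to by the address of the first byte; the assembly-language syntax is arcane. In the forms below, %R and %I stand for any 64-bit integer registers:

   (%R)                 The address is the value in %R.
   0x10(%R)             The address is the value in %R plus 16.
   -0x8(%R)             The address is the value in %R minus 8.
   0x4089a0(, %I, 8)    The address is 0x4089a0 + 8 * %I. This form of reference can be used for very fast array indexing, provided the elements of the array are 8 bytes in size, as in an array of pointers. Only multipliers 1, 2, 4, and 8 are supported.
   (%R, %I, 8)          The address is the sum of the value in %R and 8 times the value in %I.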
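The address arithmetic behind these memory references can be sketched in C. The function names below are hypothetical; effective_address models the general form disp(base, index, scale), and element_address shows that indexing an array of 8-byte elements computes the same kind of address. The comparison between the two assumes 8-byte pointers, as on AMD64.

```c
#include <stdint.h>

/* Address named by disp(base, index, scale): base + index * scale + disp.
 * The hardware accepts only scale 1, 2, 4, or 8; this model does not check. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned scale, int64_t disp)
{
    return base + index * scale + (uint64_t)disp;
}

/* Indexing an array of pointers: the address of table[i] is
 * the address of table plus 8 * i when pointers are 8 bytes. */
const char **element_address(const char **table, uint64_t i)
{
    return &table[i];
}
```

With base 0, index 3, scale 8, and displacement 0x4089a0, the modeled address is 0x4089a0 + 24.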

Here are some example instructions, in which %R and %S stand for any 64-bit registers and %E stands for any 32-bit register:

   mov -0x8(%R), %E      Take the 32-bit word whose first byte is stored at the memory address %R - 8, and store it in %E.
   mov 0x8(%rsp), %R     Take from the stack the 64-bit word whose first byte is located at address %rsp + 8, and store it in %R.
   add $0x1, %R          Add 1 to the value in %R.
   addq $0x1, 0x8(%R)    Add 1 to the 64-bit word located at address %R + 8. The q suffix is needed on the add because the literal 1 could represent an integer of any size, and the address could refer to a value of any size. The q means "64 bits." (l means 32 bits, w means 16 bits, and b means 8 bits.) A suffix is normally unnecessary, because the way the register is named indicates the size (examples include %rax and %eax).
   lea -0x30(%R), %S     Compute the address %R - 48, but do not fetch the contents of memory. Instead, store the address itself into register %S. This is the "load effective address" instruction: its binary coding is short, it doesn't tie up the integer unit, and it doesn't set the flags.

Selected integer instructions

Opcode  Example                               RTL                                         Notes
add     add $0x18, %rsp                       rsp := rsp + 24
        add 0x8(%rcx), %rdx                   rdx := rdx + M[rcx + 8]
sub     sub $0x18, %rsp                       rsp := rsp - 24
        sub %rax, 0x8(%rdx)                   M[rdx + 8] := M[rdx + 8] - rax
        sub %rdx, %rax                        rax := rax - rdx
lea     lea 0x10(%rsp), %rax                  rax := rsp + 16                             load effective address
        lea (%rbx, %rax, 8), %rax             rax := rbx + rax ×_u 8                      (flags unchanged)
adc     adc $0x0, %ecx                        ecx := ecx + 0 + CF                         add with carry
        adc $0xffffffffffffffff, %r12         r12 := r12 - 1 + CF
sbb     sbb %eax, %eax                        eax := eax - (eax + CF)                     subtract with borrow
        sbb $0x3, %rdi                        rdi := rdi - (3 + CF)
neg     neg %edx                              edx := -edx                                 two's-complement negate
        negq 0x28(%rsp)                       M[rsp + 40]_64 := -M[rsp + 40]_64
mul     mul %rcx                              rdx:rax := rax ×_u rcx                      unsigned multiply
        mul %ecx                              edx:eax := eax ×_u ecx
imul    imul 0x10(%rbx), %rbp                 rbp := lobits_64(rbp ×_s M[rbx + 16])       signed multiply
div     div %esi                              edx := edx:eax mod_u esi                    unsigned divide
                                              eax := edx:eax ÷_u esi
idiv    idiv %r8                              rdx := rdx:rax mod_s r8                     signed divide
                                              rax := rdx:rax ÷_s r8
shl     shl %cl, %rax                         rax := rax << (cl mod 64)                   shift left
sar     sar %cl, %rdx                         rdx := rdx >>_s (cl mod 64)                 shift arithmetic right (signed)
shr     shr %cl, %rax                         rax := rax >>_u (cl mod 64)                 shift right (unsigned)
        shrl $0x8, 0x8c(%rsp)                 M[rsp + 140]_32 := M[rsp + 140]_32 >>_u (8 mod 32)
and     and %r11, %rcx                        rcx := rcx and r11                          bitwise and
or      or %ebx, 0x10(%rsp)                   M[rsp + 16] := M[rsp + 16] or ebx           bitwise or
xor     xorb $0x36, (%rax, %r12, 1)           M[rax + r12]_8 := M[rax + r12]_8 xor 54     bitwise exclusive or
not     not %ebp                              ebp := not ebp                              one's complement
mov     mov $0x7fffffffffffffff, %rax         rax := 2^63 - 1                             load immediate
        mov %rax, (%r9, %rsi, 8)              M[r9 + rsi × 8]_64 := rax                   store
        mov 0x8(%rsp), %rdi                   rdi := M[rsp + 8]_64                        load
movs    movsbq (%rbx), %rdx                   rdx := sx_8→64 M[rbx]                       sign-extending load
        movslq %edi, %rax                     rax := sx_32→64 edi                         sign-extending move
movz    movzbl 0x10(%rdi), %esi               esi := zx_8→32 M[rdi + 16]                  zero-extending load
        movzbl 0x2(%r12, %rax, 1), %eax       eax := zx_8→32 M[r12 + rax + 2]
pop     pop %rbx                              rbx := M[rsp]; rsp := rsp + 8               (flags unchanged)
push    push %r14                             M[rsp - 8] := r14; rsp := rsp - 8           (flags unchanged)

In the RTL column, := is assignment and M[a] is the memory at address a, with a trailing _8, _32, or _64 giving the width in bits where it is stated. CF is the carry flag. Operators marked _u are unsigned and those marked _s are signed; sx and zx are sign extension and zero extension.
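The sign-extending and zero-extending moves, and the signed and unsigned right shifts, correspond closely to C conversions and shifts. The sketch below uses hypothetical function names. It relies on two behaviors that the C standard leaves implementation-defined but that GCC and Clang provide on AMD64: converting an out-of-range value to a signed type wraps, and right-shifting a negative value is an arithmetic shift.

```c
#include <stdint.h>

/* Extensions: widen a small value to a larger one. */
int64_t  sx_8_64(uint8_t b)   { return (int64_t)(int8_t)b;  }  /* like movsbq */
uint32_t zx_8_32(uint8_t b)   { return (uint32_t)b;         }  /* like movzbl */
int64_t  sx_32_64(uint32_t w) { return (int64_t)(int32_t)w; }  /* like movslq */

/* Shifts: the count is reduced mod 64, as in the table. */
int64_t  sar64(int64_t x, unsigned n)  { return x >> (n % 64); }  /* like sar */
uint64_t shr64(uint64_t x, unsigned n) { return x >> (n % 64); }  /* like shr */
```

The byte 0xff sign-extends to -1 but zero-extends to 255, and shifting -16 right by 2 gives -4 when signed but a large positive number when unsigned.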

Comparisons and control flow

Opcode  Example               Meaning                                                                               Name
jmp     jmp L                 start executing program at label L                                                    jump
cmp     cmp %r13, %r12        set flags as if for sub %r13, %r12 (but leave r12 unchanged)                          compare
test    testb $0x10, (%rsi)   set flags as if for andb $0x10, (%rsi) (but leave memory unchanged)                   test bit(s)
        test %eax, %eax       set ZF := (eax and eax = 0), and set other flags also
ja      ja L                  if comparison showed >_u, jump to label L                                             jump if above
jae     jae L                 if comparison showed >=_u, jump to label L                                            jump if above or equal
jb      jb L                  if comparison showed <_u, jump to label L                                             jump if below
jbe     jbe L                 if comparison showed <=_u, jump to label L                                            jump if below or equal
jc      jc L                  if CF != 0, jump to label L                                                           jump if carry
je      je L                  if comparison showed equal (result of subtraction = 0), jump to label L               jump if equal
jg      jg L                  if comparison showed >_s, jump to label L                                             jump if greater
jge     jge L                 if comparison showed >=_s, jump to label L                                            jump if greater or equal
jl      jl L                  if comparison showed <_s, jump to label L                                             jump if less
jle     jle L                 if comparison showed <=_s, jump to label L                                            jump if less or equal
jz      jz L                  if last result was zero, jump to label L (same as je)                                 jump if zero
call    call printf           push address of next instruction and go to printf                                     call
        callq *%rax           push address of next instruction and go to instruction at address found in rax
        callq *0x10(%rcx)     push address of next instruction and go to instruction at address found in M[rcx + 16]
ret     retq                  pop an address from the stack and go to that address                                  return

There are many more conditional comparison instructions to be found in the architecture manual. Most notably, every conditional jump comes in both positive and negative versions; for example, the negative version of ja is jna, i.e., "jump if not above."
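The difference between the "above/below" jumps and the "greater/less" jumps is the difference between unsigned and signed comparison in C. In the sketch below, which uses hypothetical function names, the same bit patterns compare differently depending on the type. A compiler would be expected to pick an unsigned condition (as in ja) for the first function and a signed one (as in jg) for the second, though the exact instructions it emits may vary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned comparison: corresponds to the "above" condition. */
bool above(uint64_t a, uint64_t b) { return a > b; }

/* Signed comparison: corresponds to the "greater" condition. */
bool greater(int64_t a, int64_t b) { return a > b; }
```

The all-ones bit pattern is the largest unsigned value, so it is above 1, but as a signed value it is -1, which is not greater than 1.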

   SASL library           Firefox binary
   Count  Instruction     Count  Instruction
   75222  mov              3364  mov
   11881  test              693  call
   11073  callq             569  lea
   10887  je                507  pop
    9267  lea               505  push
    7567  xor               435  add
    7531  jne               405  nop
    5818  jmpq              367  test
    5180  add               318  je
    4397  cmp               301  sub
    2908  movq              271  jmp
    2791  movl              267  ret
    2633  sub               226  movl
    2292  nopl              212  cmp
    2285  pop               126  jne
    1944  testb             108  xor
    1804  and                89  movzbl
    1782  push               42  movzwl
    1732  retq               41  jbe
    1560  jmp                35  jae
    1528  movzwl             33  js
    1422  movzbl             33  ja
    1180  cmpq               31  xchg
     931  nopw               27  shr
     649  shl                24  jb
     524  cmpl               24  cmpb
     499  xchg               23  leave
     499  nop                21  movsbl
     496  ja                 19  and
     445  or                 18  movb
     439  jbe                13  shl
     414  cmove              13  addl
     406  cmpb               12  sete
     373  orl                12  fxch
     331  sar                12  fstp
     326  ror                11  imul
     299  shr                10  setne
     285  movb               10  sar
     269  sete               10  movswl
     258  movslq              9  cmpl
     257  sbb                 8  ror
     230  addl                8  flds

Popular instructions by mnemonic and suffix