0% found this document useful (0 votes)
14 views

xaa

The document is the Programmer's Manual for Flat Assembler 1.73, detailing its features, system requirements, and assembly syntax. It provides comprehensive information on executing the compiler, understanding its instruction set, and utilizing various directives for data definitions and formatting. The manual serves as a guide for both novice and experienced assembly language programmers to effectively use the Flat Assembler for x86 architecture processors.

Uploaded by

nohew14488
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views

xaa

The document is the Programmer's Manual for Flat Assembler 1.73, detailing its features, system requirements, and assembly syntax. It provides comprehensive information on executing the compiler, understanding its instruction set, and utilizing various directives for data definitions and formatting. The manual serves as a guide for both novice and experienced assembly language programmers to effectively use the Flat Assembler for x86 architecture processors.

Uploaded by

nohew14488
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
You are on page 1/ 17

,'''

,,;,, ,,,, ,,,,, ,,, ,,


; ; ; ; ; ;
; ,''''; '''', ; ; ;
; ',,,,;, ,,,,,' ; ; ;

flat assembler 1.73


Programmer's Manual

Table of contents
-----------------

Chapter 1 Introduction

1.1 Compiler overview


1.1.1 System requirements
1.1.2 Executing compiler from command line
1.1.3 Compiler messages
1.1.4 Output formats

1.2 Assembly syntax


1.2.1 Instruction syntax
1.2.2 Data definitions
1.2.3 Constants and labels
1.2.4 Numerical expressions
1.2.5 Jumps and calls
1.2.6 Size settings

Chapter 2 Instruction set

2.1 The x86 architecture instructions


2.1.1 Data movement instructions
2.1.2 Type conversion instructions
2.1.3 Binary arithmetic instructions
2.1.4 Decimal arithmetic instructions
2.1.5 Logical instructions
2.1.6 Control transfer instructions
2.1.7 I/O instructions
2.1.8 Strings operations
2.1.9 Flag control instructions
2.1.10 Conditional operations
2.1.11 Miscellaneous instructions
2.1.12 System instructions
2.1.13 FPU instructions
2.1.14 MMX instructions
2.1.15 SSE instructions
2.1.16 SSE2 instructions
2.1.17 SSE3 instructions
2.1.18 AMD 3DNow! instructions
2.1.19 The x86-64 long mode instructions
2.1.20 SSE4 instructions
2.1.21 AVX instructions
2.1.22 AVX2 instructions
2.1.23 Auxiliary sets of computational instructions
2.1.24 AVX-512 instructions
2.1.25 Other extensions of instruction set
2.2 Control directives
2.2.1 Numerical constants
2.2.2 Conditional assembly
2.2.3 Repeating blocks of instructions
2.2.4 Addressing spaces
2.2.5 Other directives
2.2.6 Multiple passes

2.3 Preprocessor directives


2.3.1 Including source files
2.3.2 Symbolic constants
2.3.3 Macroinstructions
2.3.4 Structures
2.3.5 Repeating macroinstructions
2.3.6 Conditional preprocessing
2.3.7 Order of processing

2.4 Formatter directives


2.4.1 MZ executable
2.4.2 Portable Executable
2.4.3 Common Object File Format
2.4.4 Executable and Linkable Format

Chapter 1 Introduction
-----------------------

This chapter contains all the most important information you need to begin
using the flat assembler. If you are experienced assembly language programmer,
you should read at least this chapter before using this compiler.

1.1 Compiler overview

Flat assembler is a fast assembly language compiler for the x86 architecture
processors, which does multiple passes to optimize the size of generated
machine code. It is self-compilable and versions for different operating
systems are provided. All the versions are designed to be used from the system
command line and they should not differ in behavior.

1.1.1 System requirements

All versions require the x86 architecture 32-bit processor (at least 80386),
although they can produce programs for the x86 architecture 16-bit processors,
too. DOS version requires an OS compatible with MS DOS 2.0 and either true
real mode environment or DPMI. Windows version requires a Win32 console
compatible with 3.1 version.

1.1.2 Executing compiler from command line

To execute flat assembler from the command line you need to provide two
parameters - first should be name of source file, second should be name of
destination file. If no second parameter is given, the name for output
file will be guessed automatically. After displaying short information about
the program name and version, compiler will read the data from source file and
compile it. When the compilation is successful, compiler will write the
generated code to the destination file and display the summary of compilation
process; otherwise it will display the information about error that occurred.
The source file should be a text file, and can be created in any text
editor. Line breaks are accepted in both DOS and Unix standards, tabulators
are treated as spaces.
In the command line you can also include "-m" option followed by a number,
which specifies how many kilobytes of memory flat assembler should maximally
use. In case of DOS version this options limits only the usage of extended
memory. The "-p" option followed by a number can be used to specify the limit
for number of passes the assembler performs. If code cannot be generated
within specified amount of passes, the assembly will be terminated with an
error message. The maximum value of this setting is 65536, while the default
limit, used when no such option is included in command line, is 100.
It is also possible to limit the number of passes the assembler
performs, with the "-p" option followed by a number specifying the maximum
number of passes.
There are no command line options that would affect the output of compiler,
flat assembler requires only the source code to include the information it
really needs. For example, to specify output format you specify it by using
the "format" directive at the beginning of source.

1.1.3 Compiler messages

As it is stated above, after the successful compilation, the compiler displays


the compilation summary. It includes the information of how many passes was
done, how much time it took, and how many bytes were written into the
destination file.
The following is an example of the compilation summary:

flat assembler version 1.72 (16384 kilobytes memory)


38 passes, 5.3 seconds, 77824 bytes.

In case of error during the compilation process, the program will display an
error message. For example, when compiler can't find the input file, it will
display the following message:

flat assembler version 1.72 (16384 kilobytes memory)


error: source file not found.

If the error is connected with a specific part of source code, the source line
that caused the error will be also displayed. Also placement of this line in
the source is given to help you finding this error, for example:

flat assembler version 1.72 (16384 kilobytes memory)


example.asm [3]:
mob ax,1
error: illegal instruction.

It means that in the third line of the "example.asm" file compiler has
encountered an unrecognized instruction. When the line that caused error
contains a macroinstruction, also the line in macroinstruction definition
that generated the erroneous instruction is displayed:

flat assembler version 1.72 (16384 kilobytes memory)


example.asm [6]:
stoschar 7
example.asm [3] stoschar [1]:
mob al,char
error: illegal instruction.

It means that the macroinstruction in the sixth line of the "example.asm" file
generated an unrecognized instruction with the first line of its definition.

1.1.4 Output formats

By default, when there is no "format" directive in source file, flat


assembler simply puts generated instruction codes into output, creating this
way flat binary file. By default it generates 16-bit code, but you can always
turn it into the 16-bit or 32-bit mode by using "use16" or "use32" directive.
Some of the output formats switch into 32-bit mode, when selected - more
information about formats which you can choose can be found in 2.4.
All output code is always in the order in which it was entered into the
source file.

1.2 Assembly syntax

The information provided below is intended mainly for the assembly language
programmers that have been using some other assembly compilers before.
If you are beginner, you should look for the assembly programming tutorials.
Flat assembler by default uses the Intel syntax for the assembly
instructions, although you can customize it using the preprocessor
capabilities (macroinstructions and symbolic constants). It also has its own
set of the directives - the instructions for compiler.
All symbols defined inside the sources are case-sensitive.

1.2.1 Instruction syntax

Instructions in assembly language are separated by line breaks, and one


instruction is expected to fill the one line of text. If a line contains
a semicolon, except for the semicolons inside the quoted strings, the rest of
this line is the comment and compiler ignores it. If a line ends with "\"
character (eventually the semicolon and comment may follow it), the next line
is attached at this point.
Each line in source is the sequence of items, which may be one of the three
types. One type are the symbol characters, which are the special characters
that are individual items even when are not spaced from the other ones.
Any of the "+-*/=<>()[]{}:,|&~#`" is the symbol character. The sequence of
other characters, separated from other items with either blank spaces or
symbol characters, is a symbol. If the first character of symbol is either a
single or double quote, it integrates any sequence of characters following it,
even the special ones, into a quoted string, which should end with the same
character, with which it began (the single or double quote) - however if there
are two such characters in a row (without any other character between them),
they are integrated into quoted string as just one of them and the quoted
string continues then. The symbols other than symbol characters and quoted
strings can be used as names, so are also called the name symbols.
Every instruction consists of the mnemonic and the various number of
operands, separated with commas. The operand can be register, immediate value
or a data addressed in memory, it can also be preceded by size operator to
define or override its size (table 1.1). Names of available registers you can
find in table 1.2, their sizes cannot be overridden. Immediate value can be
specified by any numerical expression.
When operand is a data in memory, the address of that data (also any
numerical expression, but it may contain registers) should be enclosed in
square brackets or preceded by "ptr" operator. For example instruction
"mov eax,3" will put the immediate value 3 into the EAX register, instruction
"mov eax,[7]" will put the 32-bit value from the address 7 into EAX and the
instruction "mov byte [7],3" will put the immediate value 3 into the byte at
address 7, it can also be written as "mov byte ptr 7,3". To specify which
segment register should be used for addressing, segment register name followed
by a colon should be put just before the address value (inside the square
brackets or after the "ptr" operator).

Table 1.1 Size operators


/-------------------------\
| Operator | Bits | Bytes |
|==========|======|=======|
| byte | 8 | 1 |
| word | 16 | 2 |
| dword | 32 | 4 |
| fword | 48 | 6 |
| pword | 48 | 6 |
| qword | 64 | 8 |
| tbyte | 80 | 10 |
| tword | 80 | 10 |
| dqword | 128 | 16 |
| xword | 128 | 16 |
| qqword | 256 | 32 |
| yword | 256 | 32 |
| dqqword | 512 | 64 |
| zword | 512 | 64 |
\-------------------------/

Table 1.2 Registers


/-----------------------------------------------------------------\
| Type | Bits | |
|=========|======|================================================|
| | 8 | al cl dl bl ah ch dh bh |
| General | 16 | ax cx dx bx sp bp si di |
| | 32 | eax ecx edx ebx esp ebp esi edi |
|---------|------|------------------------------------------------|
| Segment | 16 | es cs ss ds fs gs |
|---------|------|------------------------------------------------|
| Control | 32 | cr0 cr2 cr3 cr4 |
|---------|------|------------------------------------------------|
| Debug | 32 | dr0 dr1 dr2 dr3 dr6 dr7 |
|---------|------|------------------------------------------------|
| FPU | 80 | st0 st1 st2 st3 st4 st5 st6 st7 |
|---------|------|------------------------------------------------|
| MMX | 64 | mm0 mm1 mm2 mm3 mm4 mm5 mm6 mm7 |
|---------|------|------------------------------------------------|
| SSE | 128 | xmm0 xmm1 xmm2 xmm3 xmm4 xmm5 xmm6 xmm7 |
|---------|------|------------------------------------------------|
| AVX | 256 | ymm0 ymm1 ymm2 ymm3 ymm4 ymm5 ymm6 ymm7 |
|---------|------|------------------------------------------------|
| AVX-512 | 512 | zmm0 zmm1 zmm2 zmm3 zmm4 zmm5 zmm6 zmm7 |
|---------|------|------------------------------------------------|
| Opmask | 64 | k0 k1 k2 k3 k4 k5 k6 k7 |
|---------|------|------------------------------------------------|
| Bounds | 128 | bnd0 bnd1 bnd2 bnd3 |
\-----------------------------------------------------------------/
1.2.2 Data definitions

To define data or reserve a space for it, use one of the directives listed in
table 1.3. The data definition directive should be followed by one or more of
numerical expressions, separated with commas. These expressions define the
values for data cells of size depending on which directive is used. For
example "db 1,2,3" will define the three bytes of values 1, 2 and 3
respectively.
The "db" and "du" directives also accept the quoted string values of any
length, which will be converted into chain of bytes when "db" is used and into
chain of words with zeroed high byte when "du" is used. For example "db 'abc'"
will define the three bytes of values 61, 62 and 63.
The "dp" directive and its synonym "df" accept the values consisting of two
numerical expressions separated with colon, the first value will become the
high word and the second value will become the low double word of the far
pointer value. Also "dd" accepts such pointers consisting of two word values
separated with colon, and "dt" accepts the word and quad word value separated
with colon, the quad word is stored first. The "dt" directive with single
expression as parameter accepts only floating point values and creates data in
FPU double extended precision format.
Any of the above directive allows the usage of special "dup" operator to
make multiple copies of given values. The count of duplicates should precede
this operator and the value to duplicate should follow - it can even be the
chain of values separated with commas, but such set of values needs to be
enclosed with parenthesis, like "db 5 dup (1,2)", which defines five copies
of the given two byte sequence.
The "file" is a special directive and its syntax is different. This
directive includes a chain of bytes from file and it should be followed by the
quoted file name, then optionally numerical expression specifying offset in
file preceded by the colon, and - also optionally - comma and numerical
expression specifying count of bytes to include (if no count is specified, all
data up to the end of file is included). For example "file 'data.bin'" will
include the whole file as binary data and "file 'data.bin':10h,4" will include
only four bytes starting at offset 10h.
The data reservation directive should be followed by only one numerical
expression, and this value defines how many cells of the specified size should
be reserved. All data definition directives also accept the "?" value, which
means that this cell should not be initialized to any value and the effect is
the same as by using the data reservation directive. The uninitialized data
may not be included in the output file, so its values should be always
considered unknown.

Table 1.3 Data directives


/----------------------------\
| Size | Define | Reserve |
| (bytes) | data | data |
|=========|========|=========|
| 1 | db | rb |
| | file | |
|---------|--------|---------|
| 2 | dw | rw |
| | du | |
|---------|--------|---------|
| 4 | dd | rd |
|---------|--------|---------|
| 6 | dp | rp |
| | df | rf |
|---------|--------|---------|
| 8 | dq | rq |
|---------|--------|---------|
| 10 | dt | rt |
\----------------------------/

1.2.3 Constants and labels

In the numerical expressions you can also use constants or labels instead of
numbers. To define the constant or label you should use the specific
directives. Each label can be defined only once and it is accessible from the
any place of source (even before it was defined). Constant can be redefined
many times, but in this case it is accessible only after it was defined, and
is always equal to the value from last definition before the place where it's
used. When a constant is defined only once in source, it is - like the label -
accessible from anywhere.
The definition of constant consists of name of the constant followed by the
"=" character and numerical expression, which after calculation will become
the value of constant. This value is always calculated at the time the
constant is defined. For example you can define "count" constant by using the
directive "count = 17", and then use it in the assembly instructions, like
"mov cx,count" - which will become "mov cx,17" during the compilation process.
There are different ways to define labels. The simplest is to follow the
name of label by the colon, this directive can even be followed by the other
instruction in the same line. It defines the label whose value is equal to
offset of the point where it's defined. This method is usually used to label
the places in code. The other way is to follow the name of label (without a
colon) by some data directive. It defines the label with value equal to
offset of the beginning of defined data, and remembered as a label for data
with cell size as specified for that data directive in table 1.3.
The label can be treated as constant of value equal to offset of labeled
code or data. For example when you define data using the labeled directive
"char db 224", to put the offset of this data into BX register you should use
"mov bx,char" instruction, and to put the value of byte addressed by "char"
label to DL register, you should use "mov dl,[char]" (or "mov dl,ptr char").
But when you try to assemble "mov ax,[char]", it will cause an error, because
fasm compares the sizes of operands, which should be equal. You can force
assembling that instruction by using size override: "mov ax,word [char]", but
remember that this instruction will read the two bytes beginning at "char"
address, while it was defined as a one byte.
The last and the most flexible way to define labels is to use "label"
directive. This directive should be followed by the name of label, then
optionally size operator (it can be preceded by a colon) and then - also
optionally "at" operator and the numerical expression defining the address at
which this label should be defined. For example "label wchar word at char"
will define a new label for the 16-bit data at the address of "char". Now the
instruction "mov ax,[wchar]" will be after compilation the same as
"mov ax,word [char]". If no address is specified, "label" directive defines
the label at current offset. Thus "mov [wchar],57568" will copy two bytes
while "mov [char],224" will copy one byte to the same address.
The label whose name begins with dot is treated as local label, and its name
is attached to the name of last global label (with name beginning with
anything but dot) to make the full name of this label. So you can use the
short name (beginning with dot) of this label anywhere before the next global
label is defined, and in the other places you have to use the full name. Label
beginning with two dots are the exception - they are like global, but they
don't become the new prefix for local labels.
The "@@" name means anonymous label, you can have defined many of them in
the source. Symbol "@b" (or equivalent "@r") references the nearest preceding
anonymous label, symbol "@f" references the nearest following anonymous label.
These special symbol are case-insensitive.

1.2.4 Numerical expressions

In the above examples all the numerical expressions were the simple numbers,
constants or labels. But they can be more complex, by using the arithmetical
or logical operators for calculations at compile time. All these operators
with their priority values are listed in table 1.4. The operations with higher
priority value will be calculated first, you can of course change this
behavior by putting some parts of expression into parenthesis. The "+", "-",
"*" and "/" are standard arithmetical operations, "mod" calculates the
remainder from division. The "and", "or", "xor", "shl", "shr", "bsf", "bsr"
and "not" perform the same bit-logical operations as assembly instructions of
those names. The "rva" and "plt" are special unary operators that perform
conversions between different kinds of addresses, they can be used only with
few of the output formats and their meaning may vary (see 2.4).
The arithmetical and bit-logical calculations are processed as if they
operated on infinite precision 2-adic numbers, and assembler signalizes an
overflow error if because of its limitations it is not table to perform the
required calculation, or if the result is too large number to fit in either
signed or unsigned range for the destination unit size.
The numbers in the expression are by default treated as a decimal, binary
numbers should have the "b" letter attached at the end, octal number should
end with "o" letter, hexadecimal numbers should begin with "0x" characters
(like in C language) or with the "$" character (like in Pascal language) or
they should end with "h" letter. Also quoted string, when encountered in
expression, will be converted into number - the first character will become
the least significant byte of number.
The numerical expression used as an address value can also contain any of
general registers used for addressing, they can be added and multiplied by
appropriate values, as it is allowed for the x86 architecture instructions.
The numerical calculations inside address definition by default operate with
target size assumed to be the same as the current bitness of code, even if
generated instruction encoding will use a different address size.
There are also some special symbols that can be used inside the numerical
expression. First is "$", which is always equal to the value of current
offset, while "$$" is equal to base address of current addressing space. The
other one is "%", which is the number of current repeat in parts of code that
are repeated using some special directives (see 2.2) and zero anywhere else.
There's also "%t" symbol, which is always equal to the current time stamp.
Any numerical expression can also consist of single floating point value
(flat assembler does not allow any floating point operations at compilation
time) in the scientific notation, they can end with the "f" letter to be
recognized, otherwise they should contain at least one of the "." or "E"
characters. So "1.0", "1E0" and "1f" define the same floating point value,
while simple "1" defines an integer value.

Table 1.4 Arithmetical and bit-logical operators by priority


/-------------------------\
| Priority | Operators |
|==========|==============|
| 0 | + - |
|----------|--------------|
| 1 | * / |
|----------|--------------|
| 2 | mod |
|----------|--------------|
| 3 | and or xor |
|----------|--------------|
| 4 | shl shr |
|----------|--------------|
| 5 | not |
|----------|--------------|
| 6 | bsf bsr |
|----------|--------------|
| 7 | rva plt |
\-------------------------/

1.2.5 Jumps and calls

The operand of any jump or call instruction can be preceded not only by the
size operator, but also by one of the operators specifying type of the jump:
"short", "near" or "far". For example, when assembler is in 16-bit mode,
instruction "jmp dword [0]" will become the far jump and when assembler is
in 32-bit mode, it will become the near jump. To force this instruction to be
treated differently, use the "jmp near dword [0]" or "jmp far dword [0]" form.
When operand of near jump is the immediate value, assembler will generate
the shortest variant of this jump instruction if possible (but will not create
32-bit instruction in 16-bit mode nor 16-bit instruction in 32-bit mode,
unless there is a size operator stating it). By specifying the jump type
you can force it to always generate long variant (for example "jmp near 0")
or to always generate short variant and terminate with an error when it's
impossible (for example "jmp short 0").

1.2.6 Size settings

When instruction uses some memory addressing, by default the smallest form of
instruction is generated by using the short displacement if only address
value fits in the range. This can be overridden using the "word" or "dword"
operator before the address inside the square brackets (or after the "ptr"
operator), which forces the long displacement of appropriate size to be made.
In case when address is not relative to any registers, those operators allow
also to choose the appropriate mode of absolute addressing.
Instructions "adc", "add", "and", "cmp", "or", "sbb", "sub" and "xor" with
first operand being 16-bit or 32-bit are by default generated in shortened
8-bit form when the second operand is immediate value fitting in the range
for signed 8-bit values. It also can be overridden by putting the "word" or
"dword" operator before the immediate value. The similar rules applies to the
"imul" instruction with the last operand being immediate value.
Immediate value as an operand for "push" instruction without a size operator
is by default treated as a word value if assembler is in 16-bit mode and as a
double word value if assembler is in 32-bit mode, shorter 8-bit form of this
instruction is used if possible, "word" or "dword" size operator forces the
"push" instruction to be generated in longer form for specified size. "pushw"
and "pushd" mnemonics force assembler to generate 16-bit or 32-bit code
without forcing it to use the longer form of instruction.

Chapter 2 Instruction set


--------------------------

This chapter provides the detailed information about the instructions and
directives supported by flat assembler. Directives for defining labels were
already discussed in 1.2.3, all other directives will be described later in
this chapter.
2.1 The x86 architecture instructions

In this section you can find both the information about the syntax and
purpose the assembly language instructions. If you need more technical
information, look for the Intel Architecture Software Developer's Manual.
Assembly instructions consist of the mnemonic (instruction's name) and from
zero to three operands. If there are two or more operands, usually first is
the destination operand and second is the source operand. Each operand can be
register, memory or immediate value (see 1.2 for details about syntax of
operands). After the description of each instruction there are examples
of different combinations of operands, if the instruction has any.
Some instructions act as prefixes and can be followed by other instruction
in the same line, and there can be more than one prefix in a line. Each name
of the segment register is also a mnemonic of instruction prefix, altough it
is recommended to use segment overrides inside the square brackets instead of
these prefixes.

2.1.1 Data movement instructions

"mov" transfers a byte, word or double word from the source operand to the
destination operand. It can transfer data between general registers, from
the general register to memory, or from memory to general register, but it
cannot move from memory to memory. It can also transfer an immediate value to
general register or memory, segment register to general register or memory,
general register or memory to segment register, control or debug register to
general register and general register to control or debug register. The "mov"
can be assembled only if the size of source operand and size of destination
operand are the same. Below are the examples for each of the allowed
combinations:

mov bx,ax ; general register to general register


mov [char],al ; general register to memory
mov bl,[char] ; memory to general register
mov dl,32 ; immediate value to general register
mov [char],32 ; immediate value to memory
mov ax,ds ; segment register to general register
mov [bx],ds ; segment register to memory
mov ds,ax ; general register to segment register
mov ds,[bx] ; memory to segment register
mov eax,cr0 ; control register to general register
mov cr3,ebx ; general register to control register

"xchg" swaps the contents of two operands. It can swap two byte operands,
two word operands or two double word operands. Order of operands is not
important. The operands may be two general registers, or general register
with memory. For example:

xchg ax,bx ; swap two general registers


xchg al,[char] ; swap register with memory

"push" decrements the stack frame pointer (ESP register), then transfers
the operand to the top of stack indicated by ESP. The operand can be memory,
general register, segment register or immediate value of word or double word
size. If operand is an immediate value and no size is specified, it is by
default treated as a word value if assembler is in 16-bit mode and as a double
word value if assembler is in 32-bit mode. "pushw" and "pushd" mnemonics are
variants of this instruction that store the values of word or double word size
respectively. If more operands follow in the same line (separated only with
spaces, not commas), compiler will assemble chain of the "push" instructions
with these operands. The examples are with single operands:

push ax ; store general register


push es ; store segment register
pushw [bx] ; store memory
push 1000h ; store immediate value

"pusha" saves the contents of the eight general register on the stack.
This instruction has no operands. There are two version of this instruction,
one 16-bit and one 32-bit, assembler automatically generates the appropriate
version for current mode, but it can be overridden by using "pushaw" or
"pushad" mnemonic to always get the 16-bit or 32-bit version. The 16-bit
version of this instruction pushes general registers on the stack in the
following order: AX, CX, DX, BX, the initial value of SP before AX was pushed,
BP, SI and DI. The 32-bit version pushes equivalent 32-bit general registers
in the same order.
"pop" transfers the word or double word at the current top of stack to the
destination operand, and then increments ESP to point to the new top of stack.
The operand can be memory, general register or segment register. "popw" and
"popd" mnemonics are variants of this instruction for restoring the values of
word or double word size respectively. If more operands separated with spaces
follow in the same line, compiler will assemble chain of the "pop"
instructions with these operands.

pop bx ; restore general register


pop ds ; restore segment register
popw [si] ; restore memory

"popa" restores the registers saved on the stack by "pusha" instruction,


except for the saved value of SP (or ESP), which is ignored. This instruction
has no operands. To force assembling 16-bit or 32-bit version of this
instruction use "popaw" or "popad" mnemonic.

2.1.2 Type conversion instructions

The type conversion instructions convert bytes into words, words into double
words, and double words into quad words. These conversions can be done using
the sign extension or zero extension. The sign extension fills the extra bits
of the larger item with the value of the sign bit of the smaller item, the
zero extension simply fills them with zeros.
"cwd" and "cdq" double the size of value AX or EAX register respectively
and store the extra bits into the DX or EDX register. The conversion is done
using the sign extension. These instructions have no operands.
"cbw" extends the sign of the byte in AL throughout AX, and "cwde" extends
the sign of the word in AX throughout EAX. These instructions also have no
operands.
"movsx" converts a byte to word or double word and a word to double word
using the sign extension. "movzx" does the same, but it uses the zero
extension. The source operand can be general register or memory, while the
destination operand must be a general register. For example:

movsx ax,al ; byte register to word register


movsx edx,dl ; byte register to double word register
movsx eax,ax ; word register to double word register
movsx ax,byte [bx] ; byte memory to word register
movsx edx,byte [bx] ; byte memory to double word register
movsx eax,word [bx] ; word memory to double word register

2.1.3 Binary arithmetic instructions

"add" replaces the destination operand with the sum of the source and
destination operands and sets CF if overflow has occurred. The operands may
be bytes, words or double words. The destination operand can be general
register or memory, the source operand can be general register or immediate
value, it can also be memory if the destination operand is register.

add ax,bx ; add register to register


add ax,[si] ; add memory to register
add [di],al ; add register to memory
add al,48 ; add immediate value to register
add [char],48 ; add immediate value to memory

"adc" sums the operands, adds one if CF is set, and replaces the destination
operand with the result. Rules for the operands are the same as for the "add"
instruction. An "add" followed by multiple "adc" instructions can be used to
add numbers longer than 32 bits.
"inc" adds one to the operand, it does not affect CF. The operand can be a
general register or memory, and the size of the operand can be byte, word or
double word.

inc ax ; increment register by one


inc byte [bx] ; increment memory by one

"sub" subtracts the source operand from the destination operand and replaces
the destination operand with the result. If a borrow is required, the CF is
set. Rules for the operands are the same as for the "add" instruction.
"sbb" subtracts the source operand from the destination operand, subtracts
one if CF is set, and stores the result to the destination operand. Rules for
the operands are the same as for the "add" instruction. A "sub" followed by
multiple "sbb" instructions may be used to subtract numbers longer than 32
bits.
"dec" subtracts one from the operand, it does not affect CF. Rules for the
operand are the same as for the "inc" instruction.
"cmp" subtracts the source operand from the destination operand. It updates
the flags as the "sub" instruction, but does not alter the source and
destination operands. Rules for the operands are the same as for the "sub"
instruction.
"neg" subtracts a signed integer operand from zero. The effect of this
instructon is to reverse the sign of the operand from positive to negative or
from negative to positive. Rules for the operand are the same as for the "inc"
instruction.
"xadd" exchanges the destination operand with the source operand, then loads
the sum of the two values into the destination operand. The destination operand
may be a general register or memory, the source operand must be a general
register.
All the above binary arithmetic instructions update SF, ZF, PF and OF flags.
SF is always set to the same value as the result's sign bit, ZF is set when
all the bits of result are zero, PF is set when low order eight bits of result
contain an even number of set bits, OF is set if result is too large for a
positive number or too small for a negative number (excluding sign bit) to fit
in destination operand.
"mul" performs an unsigned multiplication of the operand and the
accumulator. If the operand is a byte, the processor multiplies it by the
contents of AL and returns the 16-bit result to AH and AL. If the operand is a
word, the processor multiplies it by the contents of AX and returns the 32-bit
result to DX and AX. If the operand is a double word, the processor multiplies
it by the contents of EAX and returns the 64-bit result in EDX and EAX. "mul"
sets CF and OF when the upper half of the result is nonzero, otherwise they
are cleared. Rules for the operand are the same as for the "inc" instruction.
"imul" performs a signed multiplication operation. This instruction has
three variations. First has one operand and behaves in the same way as the
"mul" instruction. Second has two operands, in this case destination operand
is multiplied by the source operand and the result replaces the destination
operand. Destination operand must be a general register, it can be word or
double word, source operand can be general register, memory or immediate
value. Third form has three operands, the destination operand must be a
general register, word or double word in size, source operand can be general
register or memory, and third operand must be an immediate value. The source
operand is multiplied by the immediate value and the result is stored in the
destination register. All the three forms calculate the product to twice the
size of operands and set CF and OF when the upper half of the result is
nonzero, but second and third form truncate the product to the size of
operands. So second and third forms can be also used for unsigned operands
because, whether the operands are signed or unsigned, the lower half of the
product is the same. Below are the examples for all three forms:

imul bl ; accumulator by register


imul word [si] ; accumulator by memory
imul bx,cx ; register by register
imul bx,[si] ; register by memory
imul bx,10 ; register by immediate value
imul ax,bx,10 ; register by immediate value to register
imul ax,[si],10 ; memory by immediate value to register

"div" performs an unsigned division of the accumulator by the operand.


The dividend (the accumulator) is twice the size of the divisor (the operand),
the quotient and remainder have the same size as the divisor. If divisor is
byte, the dividend is taken from AX register, the quotient is stored in AL and
the remainder is stored in AH. If divisor is word, the upper half of dividend
is taken from DX, the lower half of dividend is taken from AX, the quotient is
stored in AX and the remainder is stored in DX. If divisor is double word,
the upper half of dividend is taken from EDX, the lower half of dividend is
taken from EAX, the quotient is stored in EAX and the remainder is stored in
EDX. Rules for the operand are the same as for the "mul" instruction.
"idiv" performs a signed division of the accumulator by the operand.
It uses the same registers as the "div" instruction, and the rules for
the operand are the same.

2.1.4 Decimal arithmetic instructions

Decimal arithmetic is performed by combining the binary arithmetic


instructions (already described in the prior section) with the decimal
arithmetic instructions. The decimal arithmetic instructions are used to
adjust the results of a previous binary arithmetic operation to produce a
valid packed or unpacked decimal result, or to adjust the inputs to a
subsequent binary arithmetic operation so the operation will produce a valid
packed or unpacked decimal result.
"daa" adjusts the result of adding two valid packed decimal operands in
AL. "daa" must always follow the addition of two pairs of packed decimal
numbers (one digit in each half-byte) to obtain a pair of valid packed
decimal digits as results. The carry flag is set if carry was needed.
This instruction has no operands.
"das" adjusts the result of subtracting two valid packed decimal operands
in AL. "das" must always follow the subtraction of one pair of packed decimal
numbers (one digit in each half-byte) from another to obtain a pair of valid
packed decimal digits as results. The carry flag is set if a borrow was
needed. This instruction has no operands.
"aaa" changes the contents of register AL to a valid unpacked decimal
number, and zeroes the top four bits. "aaa" must always follow the addition
of two unpacked decimal operands in AL. The carry flag is set and AH is
incremented if a carry is necessary. This instruction has no operands.
"aas" changes the contents of register AL to a valid unpacked decimal
number, and zeroes the top four bits. "aas" must always follow the
subtraction of one unpacked decimal operand from another in AL. The carry flag
is set and AH decremented if a borrow is necessary. This instruction has no
operands.
"aam" corrects the result of a multiplication of two valid unpacked decimal
numbers. "aam" must always follow the multiplication of two decimal numbers
to produce a valid decimal result. The high order digit is left in AH, the
low order digit in AL. The generalized version of this instruction allows
adjustment of the contents of the AX to create two unpacked digits of any
number base. The standard version of this instruction has no operands, the
generalized version has one operand - an immediate value specifying the
number base for the created digits.
"aad" modifies the numerator in AH and AL to prepare for the division of two
valid unpacked decimal operands so that the quotient produced by the division
will be a valid unpacked decimal number. AH should contain the high order
digit and AL the low order digit. This instruction adjusts the value and
places the result in AL, while AH will contain zero. The generalized version
of this instruction allows adjustment of two unpacked digits of any number
base. Rules for the operand are the same as for the "aam" instruction.

2.1.5 Logical instructions

"not" inverts the bits in the specified operand to form a one's complement
of the operand. It has no effect on the flags. Rules for the operand are the
same as for the "inc" instruction.
"and", "or" and "xor" instructions perform the standard logical operations.
They update the SF, ZF and PF flags. Rules for the operands are the same as
for the "add" instruction.
"bt", "bts", "btr" and "btc" instructions operate on a single bit which can
be in memory or in a general register. The location of the bit is specified
as an offset from the low order end of the operand. The value of the offset
is the taken from the second operand, it either may be an immediate byte or
a general register. These instructions first assign the value of the selected
bit to CF. "bt" instruction does nothing more, "bts" sets the selected bit to
1, "btr" resets the selected bit to 0, "btc" changes the bit to its
complement. The first operand can be word or double word.

bt ax,15 ; test bit in register


bts word [bx],15 ; test and set bit in memory
btr ax,cx ; test and reset bit in register
btc word [bx],cx ; test and complement bit in memory

"bsf" and "bsr" instructions scan a word or double word for first set bit
and store the index of this bit into destination operand, which must be
general register. The bit string being scanned is specified by source operand,
it may be either general register or memory. The ZF flag is set if the entire
string is zero (no set bits are found); otherwise it is cleared. If no set bit
is found, the value of the destination register is undefined. "bsf" scans from
low order to high order (starting from bit index zero). "bsr" scans from high
order to low order (starting from bit index 15 of a word or index 31 of a
double word).

bsf ax,bx ; scan register forward


bsr ax,[si] ; scan memory reverse

"shl" shifts the destination operand left by the number of bits specified
in the second operand. The destination operand can be byte, word, or double
word general register or memory. The second operand can be an immediate value
or the CL register. The processor shifts zeros in from the right (low order)
side of the operand as bits exit from the left side. The last bit that exited
is stored in CF. "sal" is a synonym for "shl".

shl al,1 ; shift register left by one bit


shl byte [bx],1 ; shift memory left by one bit
shl ax,cl ; shift register left by count from cl
shl word [bx],cl ; shift memory left by count from cl

"shr" and "sar" shift the destination operand right by the number of bits
specified in the second operand. Rules for operands are the same as for the
"shl" instruction. "shr" shifts zeros in from the left side of the operand as
bits exit from the right side. The last bit that exited is stored in CF.
"sar" preserves the sign of the operand by shifting in zeros on the left side
if the value is positive or by shifting in ones if the value is negative.
"shld" shifts bits of the destination operand to the left by the number
of bits specified in third operand, while shifting high order bits from the
source operand into the destination operand on the right. The source operand
remains unmodified. The destination operand can be a word or double word
general register or memory, the source operand must be a general register,
third operand can be an immediate value or the CL register.

shld ax,bx,1 ; shift register left by one bit


shld [di],bx,1 ; shift memory left by one bit
shld ax,bx,cl ; shift register left by count from cl
shld [di],bx,cl ; shift memory left by count from cl

"shrd" shifts bits of the destination operand to the right, while shifting
low order bits from the source operand into the destination operand on the
left. The source operand remains unmodified. Rules for operands are the same
as for the "shld" instruction.
"rol" and "rcl" rotate the byte, word or double word destination operand
left by the number of bits specified in the second operand. For each rotation
specified, the high order bit that exits from the left of the operand returns
at the right to become the new low order bit. "rcl" additionally puts in CF
each high order bit that exits from the left side of the operand before it
returns to the operand as the low order bit on the next rotation cycle. Rules
for operands are the same as for the "shl" instruction.
"ror" and "rcr" rotate the byte, word or double word destination operand
right by the number of bits specified in the second operand. For each rotation
specified, the low order bit that exits from the right of the operand returns
at the left to become the new high order bit. "rcr" additionally puts in CF
each low order bit that exits from the right side of the operand before it
returns to the operand as the high order bit on the next rotation cycle.
Rules for operands are the same as for the "shl" instruction.
"test" performs the same action as the "and" instruction, but it does not
alter the destination operand, only updates flags. Rules for the operands are
the same as for the "and" instruction.
"bswap" reverses the byte order of a 32-bit general register: bits 0 through
7 are swapped with bits 24 through 31, and bits 8 through 15 are swapped with
bits 16 through 23. This instruction is provided for converting little-endian
values to big-endian format and vice versa.

bswap edx ; swap bytes in register

2.1.6 Control transfer instructions

"jmp" unconditionally transfers control to the target location. The


destination address can be specified directly within the instruction or
indirectly through a register or memory, the acceptable size of this address
depends on whether the jump is near or far (it can be specified by preceding
the operand with "near" or "far" operator) and whether the instruction is
16-bit or 32-bit. Operand for near jump should be "word" size for 16-bit
instruction or the "dword" size for 32-bit instruction. Operand for far jump
should be "dword" size for 16-bit instruction or "pword" size for 32-bit
instruction. A direct "jmp" instruction includes the destination address as
part of the instruction (and can be preceded by "short", "near" or "far"
operator), the operand specifying address should be the numerical expression
for near or short jump, or two numerical expressions separated with colon for
far jump, the first specifies selector of segment, the second is the offset
within segment. The "pword" operator can be used to force the 32-bit far call,
and "dword" to force the 16-bit far call. An indirect "jmp" instruction
obtains the destination address indirectly through a register or a pointer
variable, the operand should be general register or memory. See also 1.2.5 for
some more details.

jmp 100h ; direct near jump


jmp 0FFFFh:0 ; direct far jump
jmp ax ; indirect near jump
jmp pword [ebx] ; indirect far jump

"call" transfers control to the procedure, saving on the stack the address
of the instruction following the "call" for later use by a "ret" (return)
instruction. Rules for the operands are the same as for the "jmp" instruction,
but the "call" has no short variant of direct instruction and thus it not
optimized.
"ret", "retn" and "retf" instructions terminate the execution of a procedure
and transfers control back to the program that originally invoked the
procedure using the address that was stored on the stack by the "call"
instruction. "ret" is the equivalent for "retn", which returns from the
procedure that was executed using the near call, while "retf" returns from
the procedure that was executed using the far call. These instructions default
to the size of address appropriate for the current code setting, but the size
of address can be forced to 16-bit by using the "retw", "retnw" and "retfw"
mnemonics, and to 32-bit by using the "retd", "retnd" and "retfd" mnemonics.
All these instructions may optionally specify an immediate operand, by adding
this constant to the stack pointer, they effectively remove any arguments that
the calling program pushed on the stack before the execution of the "call"
instruction.
"iret" returns control to an interrupted procedure. It differs from "ret" in
that it also pops the flags from the stack into the flags register. The flags
are stored on the stack by the interrupt mechanism. It defaults to the size of
return address appropriate for the current code setting, but it can be forced
to use 16-bit or 32-bit address by using the "iretw" or "iretd" mnemonic.
The conditional transfer instructions are jumps that may or may not transfer
control, depending on the state of the CPU flags when the instruction
executes. The mnemonics for conditional jumps may be obtained by attaching
the condition mnemonic (see table 2.1) to the "j" mnemonic,
for example "jc" instruction will transfer the control when the CF flag is
set. The conditional jumps can be short or near, and direct only, and can be
optimized (see 1.2.5), the operand should be an immediate value specifying
target address.

Table 2.1 Conditions


/-----------------------------------------------------------\
| Mnemonic | Condition tested | Description |
|==========|=======================|========================|
| o | OF = 1 | overflow |
|----------|-----------------------|------------------------|
| no | OF = 0 | not overflow |
|----------|-----------------------|------------------------|
| c | | carry |
| b | CF = 1 | below |
| nae | | not above nor equal |
|----------|-----------------------|------------------------|
| nc | | not carry |
| ae | CF = 0 | above or equal |
| nb | | not below |
|----------|-----------------------|------------------------|
| e | ZF = 1 | equal |
| z | | zero |
|----------|-----------------------|------------------------|
| ne | ZF = 0 | not equal |
| nz | | not zero |
|----------|-----------------------|------------------------|
| be | CF or ZF = 1 | below or equal |
| na | | not above |
|----------|-----------------------|------------------------|
| a | CF or ZF = 0 | above |
| nbe | | not below nor equal |
|----------|-----------------------|------------------------|
| s | SF = 1 | sign |
|----------|-----------------------|------------------------|
| ns | SF = 0 | not sign |
|----------|-----------------------|------------------------|
| p | PF = 1 | parity |
| pe | | parity even |
|----------|-----------------------|------------------------|
| np | PF = 0 | not parity |
| po | | parity odd |
|----------|-----------------------|------------------------|
| l | SF xor OF = 1 | less |
| nge | | not greater nor equal |
|----------|-----------------------|------------------------|
| ge | SF xor OF = 0 | greater or equal |
| nl | | not less |
|----------|-----------------------|------------------------|
| le | (SF xor OF) or ZF = 1 | less or equal |
| ng | | not greater |
|----------|-----------------------|------------------------|
| g | (SF xor OF) or ZF = 0 | greater |
| nle | | not less nor equal |

You might also like