xaa
xaa
Table of contents
-----------------
Chapter 1 Introduction
Chapter 1 Introduction
-----------------------
This chapter contains all the most important information you need to begin
using the flat assembler. If you are experienced assembly language programmer,
you should read at least this chapter before using this compiler.
Flat assembler is a fast assembly language compiler for the x86 architecture
processors, which does multiple passes to optimize the size of generated
machine code. It is self-compilable and versions for different operating
systems are provided. All the versions are designed to be used from the system
command line and they should not differ in behavior.
All versions require the x86 architecture 32-bit processor (at least 80386),
although they can produce programs for the x86 architecture 16-bit processors,
too. DOS version requires an OS compatible with MS DOS 2.0 and either true
real mode environment or DPMI. Windows version requires a Win32 console
compatible with 3.1 version.
To execute flat assembler from the command line you need to provide two
parameters - first should be name of source file, second should be name of
destination file. If no second parameter is given, the name for output
file will be guessed automatically. After displaying short information about
the program name and version, compiler will read the data from source file and
compile it. When the compilation is successful, compiler will write the
generated code to the destination file and display the summary of compilation
process; otherwise it will display the information about error that occurred.
The source file should be a text file, and can be created in any text
editor. Line breaks are accepted in both DOS and Unix standards, tabulators
are treated as spaces.
In the command line you can also include "-m" option followed by a number,
which specifies how many kilobytes of memory flat assembler should maximally
use. In case of DOS version this options limits only the usage of extended
memory. The "-p" option followed by a number can be used to specify the limit
for number of passes the assembler performs. If code cannot be generated
within specified amount of passes, the assembly will be terminated with an
error message. The maximum value of this setting is 65536, while the default
limit, used when no such option is included in command line, is 100.
It is also possible to limit the number of passes the assembler
performs, with the "-p" option followed by a number specifying the maximum
number of passes.
There are no command line options that would affect the output of compiler,
flat assembler requires only the source code to include the information it
really needs. For example, to specify output format you specify it by using
the "format" directive at the beginning of source.
In case of error during the compilation process, the program will display an
error message. For example, when compiler can't find the input file, it will
display the following message:
If the error is connected with a specific part of source code, the source line
that caused the error will be also displayed. Also placement of this line in
the source is given to help you finding this error, for example:
It means that in the third line of the "example.asm" file compiler has
encountered an unrecognized instruction. When the line that caused error
contains a macroinstruction, also the line in macroinstruction definition
that generated the erroneous instruction is displayed:
It means that the macroinstruction in the sixth line of the "example.asm" file
generated an unrecognized instruction with the first line of its definition.
The information provided below is intended mainly for the assembly language
programmers that have been using some other assembly compilers before.
If you are beginner, you should look for the assembly programming tutorials.
Flat assembler by default uses the Intel syntax for the assembly
instructions, although you can customize it using the preprocessor
capabilities (macroinstructions and symbolic constants). It also has its own
set of the directives - the instructions for compiler.
All symbols defined inside the sources are case-sensitive.
To define data or reserve a space for it, use one of the directives listed in
table 1.3. The data definition directive should be followed by one or more of
numerical expressions, separated with commas. These expressions define the
values for data cells of size depending on which directive is used. For
example "db 1,2,3" will define the three bytes of values 1, 2 and 3
respectively.
The "db" and "du" directives also accept the quoted string values of any
length, which will be converted into chain of bytes when "db" is used and into
chain of words with zeroed high byte when "du" is used. For example "db 'abc'"
will define the three bytes of values 61, 62 and 63.
The "dp" directive and its synonym "df" accept the values consisting of two
numerical expressions separated with colon, the first value will become the
high word and the second value will become the low double word of the far
pointer value. Also "dd" accepts such pointers consisting of two word values
separated with colon, and "dt" accepts the word and quad word value separated
with colon, the quad word is stored first. The "dt" directive with single
expression as parameter accepts only floating point values and creates data in
FPU double extended precision format.
Any of the above directive allows the usage of special "dup" operator to
make multiple copies of given values. The count of duplicates should precede
this operator and the value to duplicate should follow - it can even be the
chain of values separated with commas, but such set of values needs to be
enclosed with parenthesis, like "db 5 dup (1,2)", which defines five copies
of the given two byte sequence.
The "file" is a special directive and its syntax is different. This
directive includes a chain of bytes from file and it should be followed by the
quoted file name, then optionally numerical expression specifying offset in
file preceded by the colon, and - also optionally - comma and numerical
expression specifying count of bytes to include (if no count is specified, all
data up to the end of file is included). For example "file 'data.bin'" will
include the whole file as binary data and "file 'data.bin':10h,4" will include
only four bytes starting at offset 10h.
The data reservation directive should be followed by only one numerical
expression, and this value defines how many cells of the specified size should
be reserved. All data definition directives also accept the "?" value, which
means that this cell should not be initialized to any value and the effect is
the same as by using the data reservation directive. The uninitialized data
may not be included in the output file, so its values should be always
considered unknown.
In the numerical expressions you can also use constants or labels instead of
numbers. To define the constant or label you should use the specific
directives. Each label can be defined only once and it is accessible from the
any place of source (even before it was defined). Constant can be redefined
many times, but in this case it is accessible only after it was defined, and
is always equal to the value from last definition before the place where it's
used. When a constant is defined only once in source, it is - like the label -
accessible from anywhere.
The definition of constant consists of name of the constant followed by the
"=" character and numerical expression, which after calculation will become
the value of constant. This value is always calculated at the time the
constant is defined. For example you can define "count" constant by using the
directive "count = 17", and then use it in the assembly instructions, like
"mov cx,count" - which will become "mov cx,17" during the compilation process.
There are different ways to define labels. The simplest is to follow the
name of label by the colon, this directive can even be followed by the other
instruction in the same line. It defines the label whose value is equal to
offset of the point where it's defined. This method is usually used to label
the places in code. The other way is to follow the name of label (without a
colon) by some data directive. It defines the label with value equal to
offset of the beginning of defined data, and remembered as a label for data
with cell size as specified for that data directive in table 1.3.
The label can be treated as constant of value equal to offset of labeled
code or data. For example when you define data using the labeled directive
"char db 224", to put the offset of this data into BX register you should use
"mov bx,char" instruction, and to put the value of byte addressed by "char"
label to DL register, you should use "mov dl,[char]" (or "mov dl,ptr char").
But when you try to assemble "mov ax,[char]", it will cause an error, because
fasm compares the sizes of operands, which should be equal. You can force
assembling that instruction by using size override: "mov ax,word [char]", but
remember that this instruction will read the two bytes beginning at "char"
address, while it was defined as a one byte.
The last and the most flexible way to define labels is to use "label"
directive. This directive should be followed by the name of label, then
optionally size operator (it can be preceded by a colon) and then - also
optionally "at" operator and the numerical expression defining the address at
which this label should be defined. For example "label wchar word at char"
will define a new label for the 16-bit data at the address of "char". Now the
instruction "mov ax,[wchar]" will be after compilation the same as
"mov ax,word [char]". If no address is specified, "label" directive defines
the label at current offset. Thus "mov [wchar],57568" will copy two bytes
while "mov [char],224" will copy one byte to the same address.
The label whose name begins with dot is treated as local label, and its name
is attached to the name of last global label (with name beginning with
anything but dot) to make the full name of this label. So you can use the
short name (beginning with dot) of this label anywhere before the next global
label is defined, and in the other places you have to use the full name. Label
beginning with two dots are the exception - they are like global, but they
don't become the new prefix for local labels.
The "@@" name means anonymous label, you can have defined many of them in
the source. Symbol "@b" (or equivalent "@r") references the nearest preceding
anonymous label, symbol "@f" references the nearest following anonymous label.
These special symbol are case-insensitive.
In the above examples all the numerical expressions were the simple numbers,
constants or labels. But they can be more complex, by using the arithmetical
or logical operators for calculations at compile time. All these operators
with their priority values are listed in table 1.4. The operations with higher
priority value will be calculated first, you can of course change this
behavior by putting some parts of expression into parenthesis. The "+", "-",
"*" and "/" are standard arithmetical operations, "mod" calculates the
remainder from division. The "and", "or", "xor", "shl", "shr", "bsf", "bsr"
and "not" perform the same bit-logical operations as assembly instructions of
those names. The "rva" and "plt" are special unary operators that perform
conversions between different kinds of addresses, they can be used only with
few of the output formats and their meaning may vary (see 2.4).
The arithmetical and bit-logical calculations are processed as if they
operated on infinite precision 2-adic numbers, and assembler signalizes an
overflow error if because of its limitations it is not table to perform the
required calculation, or if the result is too large number to fit in either
signed or unsigned range for the destination unit size.
The numbers in the expression are by default treated as a decimal, binary
numbers should have the "b" letter attached at the end, octal number should
end with "o" letter, hexadecimal numbers should begin with "0x" characters
(like in C language) or with the "$" character (like in Pascal language) or
they should end with "h" letter. Also quoted string, when encountered in
expression, will be converted into number - the first character will become
the least significant byte of number.
The numerical expression used as an address value can also contain any of
general registers used for addressing, they can be added and multiplied by
appropriate values, as it is allowed for the x86 architecture instructions.
The numerical calculations inside address definition by default operate with
target size assumed to be the same as the current bitness of code, even if
generated instruction encoding will use a different address size.
There are also some special symbols that can be used inside the numerical
expression. First is "$", which is always equal to the value of current
offset, while "$$" is equal to base address of current addressing space. The
other one is "%", which is the number of current repeat in parts of code that
are repeated using some special directives (see 2.2) and zero anywhere else.
There's also "%t" symbol, which is always equal to the current time stamp.
Any numerical expression can also consist of single floating point value
(flat assembler does not allow any floating point operations at compilation
time) in the scientific notation, they can end with the "f" letter to be
recognized, otherwise they should contain at least one of the "." or "E"
characters. So "1.0", "1E0" and "1f" define the same floating point value,
while simple "1" defines an integer value.
The operand of any jump or call instruction can be preceded not only by the
size operator, but also by one of the operators specifying type of the jump:
"short", "near" or "far". For example, when assembler is in 16-bit mode,
instruction "jmp dword [0]" will become the far jump and when assembler is
in 32-bit mode, it will become the near jump. To force this instruction to be
treated differently, use the "jmp near dword [0]" or "jmp far dword [0]" form.
When operand of near jump is the immediate value, assembler will generate
the shortest variant of this jump instruction if possible (but will not create
32-bit instruction in 16-bit mode nor 16-bit instruction in 32-bit mode,
unless there is a size operator stating it). By specifying the jump type
you can force it to always generate long variant (for example "jmp near 0")
or to always generate short variant and terminate with an error when it's
impossible (for example "jmp short 0").
When instruction uses some memory addressing, by default the smallest form of
instruction is generated by using the short displacement if only address
value fits in the range. This can be overridden using the "word" or "dword"
operator before the address inside the square brackets (or after the "ptr"
operator), which forces the long displacement of appropriate size to be made.
In case when address is not relative to any registers, those operators allow
also to choose the appropriate mode of absolute addressing.
Instructions "adc", "add", "and", "cmp", "or", "sbb", "sub" and "xor" with
first operand being 16-bit or 32-bit are by default generated in shortened
8-bit form when the second operand is immediate value fitting in the range
for signed 8-bit values. It also can be overridden by putting the "word" or
"dword" operator before the immediate value. The similar rules applies to the
"imul" instruction with the last operand being immediate value.
Immediate value as an operand for "push" instruction without a size operator
is by default treated as a word value if assembler is in 16-bit mode and as a
double word value if assembler is in 32-bit mode, shorter 8-bit form of this
instruction is used if possible, "word" or "dword" size operator forces the
"push" instruction to be generated in longer form for specified size. "pushw"
and "pushd" mnemonics force assembler to generate 16-bit or 32-bit code
without forcing it to use the longer form of instruction.
This chapter provides the detailed information about the instructions and
directives supported by flat assembler. Directives for defining labels were
already discussed in 1.2.3, all other directives will be described later in
this chapter.
2.1 The x86 architecture instructions
In this section you can find both the information about the syntax and
purpose the assembly language instructions. If you need more technical
information, look for the Intel Architecture Software Developer's Manual.
Assembly instructions consist of the mnemonic (instruction's name) and from
zero to three operands. If there are two or more operands, usually first is
the destination operand and second is the source operand. Each operand can be
register, memory or immediate value (see 1.2 for details about syntax of
operands). After the description of each instruction there are examples
of different combinations of operands, if the instruction has any.
Some instructions act as prefixes and can be followed by other instruction
in the same line, and there can be more than one prefix in a line. Each name
of the segment register is also a mnemonic of instruction prefix, altough it
is recommended to use segment overrides inside the square brackets instead of
these prefixes.
"mov" transfers a byte, word or double word from the source operand to the
destination operand. It can transfer data between general registers, from
the general register to memory, or from memory to general register, but it
cannot move from memory to memory. It can also transfer an immediate value to
general register or memory, segment register to general register or memory,
general register or memory to segment register, control or debug register to
general register and general register to control or debug register. The "mov"
can be assembled only if the size of source operand and size of destination
operand are the same. Below are the examples for each of the allowed
combinations:
"xchg" swaps the contents of two operands. It can swap two byte operands,
two word operands or two double word operands. Order of operands is not
important. The operands may be two general registers, or general register
with memory. For example:
"push" decrements the stack frame pointer (ESP register), then transfers
the operand to the top of stack indicated by ESP. The operand can be memory,
general register, segment register or immediate value of word or double word
size. If operand is an immediate value and no size is specified, it is by
default treated as a word value if assembler is in 16-bit mode and as a double
word value if assembler is in 32-bit mode. "pushw" and "pushd" mnemonics are
variants of this instruction that store the values of word or double word size
respectively. If more operands follow in the same line (separated only with
spaces, not commas), compiler will assemble chain of the "push" instructions
with these operands. The examples are with single operands:
"pusha" saves the contents of the eight general register on the stack.
This instruction has no operands. There are two version of this instruction,
one 16-bit and one 32-bit, assembler automatically generates the appropriate
version for current mode, but it can be overridden by using "pushaw" or
"pushad" mnemonic to always get the 16-bit or 32-bit version. The 16-bit
version of this instruction pushes general registers on the stack in the
following order: AX, CX, DX, BX, the initial value of SP before AX was pushed,
BP, SI and DI. The 32-bit version pushes equivalent 32-bit general registers
in the same order.
"pop" transfers the word or double word at the current top of stack to the
destination operand, and then increments ESP to point to the new top of stack.
The operand can be memory, general register or segment register. "popw" and
"popd" mnemonics are variants of this instruction for restoring the values of
word or double word size respectively. If more operands separated with spaces
follow in the same line, compiler will assemble chain of the "pop"
instructions with these operands.
The type conversion instructions convert bytes into words, words into double
words, and double words into quad words. These conversions can be done using
the sign extension or zero extension. The sign extension fills the extra bits
of the larger item with the value of the sign bit of the smaller item, the
zero extension simply fills them with zeros.
"cwd" and "cdq" double the size of value AX or EAX register respectively
and store the extra bits into the DX or EDX register. The conversion is done
using the sign extension. These instructions have no operands.
"cbw" extends the sign of the byte in AL throughout AX, and "cwde" extends
the sign of the word in AX throughout EAX. These instructions also have no
operands.
"movsx" converts a byte to word or double word and a word to double word
using the sign extension. "movzx" does the same, but it uses the zero
extension. The source operand can be general register or memory, while the
destination operand must be a general register. For example:
"add" replaces the destination operand with the sum of the source and
destination operands and sets CF if overflow has occurred. The operands may
be bytes, words or double words. The destination operand can be general
register or memory, the source operand can be general register or immediate
value, it can also be memory if the destination operand is register.
"adc" sums the operands, adds one if CF is set, and replaces the destination
operand with the result. Rules for the operands are the same as for the "add"
instruction. An "add" followed by multiple "adc" instructions can be used to
add numbers longer than 32 bits.
"inc" adds one to the operand, it does not affect CF. The operand can be a
general register or memory, and the size of the operand can be byte, word or
double word.
"sub" subtracts the source operand from the destination operand and replaces
the destination operand with the result. If a borrow is required, the CF is
set. Rules for the operands are the same as for the "add" instruction.
"sbb" subtracts the source operand from the destination operand, subtracts
one if CF is set, and stores the result to the destination operand. Rules for
the operands are the same as for the "add" instruction. A "sub" followed by
multiple "sbb" instructions may be used to subtract numbers longer than 32
bits.
"dec" subtracts one from the operand, it does not affect CF. Rules for the
operand are the same as for the "inc" instruction.
"cmp" subtracts the source operand from the destination operand. It updates
the flags as the "sub" instruction, but does not alter the source and
destination operands. Rules for the operands are the same as for the "sub"
instruction.
"neg" subtracts a signed integer operand from zero. The effect of this
instructon is to reverse the sign of the operand from positive to negative or
from negative to positive. Rules for the operand are the same as for the "inc"
instruction.
"xadd" exchanges the destination operand with the source operand, then loads
the sum of the two values into the destination operand. The destination operand
may be a general register or memory, the source operand must be a general
register.
All the above binary arithmetic instructions update SF, ZF, PF and OF flags.
SF is always set to the same value as the result's sign bit, ZF is set when
all the bits of result are zero, PF is set when low order eight bits of result
contain an even number of set bits, OF is set if result is too large for a
positive number or too small for a negative number (excluding sign bit) to fit
in destination operand.
"mul" performs an unsigned multiplication of the operand and the
accumulator. If the operand is a byte, the processor multiplies it by the
contents of AL and returns the 16-bit result to AH and AL. If the operand is a
word, the processor multiplies it by the contents of AX and returns the 32-bit
result to DX and AX. If the operand is a double word, the processor multiplies
it by the contents of EAX and returns the 64-bit result in EDX and EAX. "mul"
sets CF and OF when the upper half of the result is nonzero, otherwise they
are cleared. Rules for the operand are the same as for the "inc" instruction.
"imul" performs a signed multiplication operation. This instruction has
three variations. First has one operand and behaves in the same way as the
"mul" instruction. Second has two operands, in this case destination operand
is multiplied by the source operand and the result replaces the destination
operand. Destination operand must be a general register, it can be word or
double word, source operand can be general register, memory or immediate
value. Third form has three operands, the destination operand must be a
general register, word or double word in size, source operand can be general
register or memory, and third operand must be an immediate value. The source
operand is multiplied by the immediate value and the result is stored in the
destination register. All the three forms calculate the product to twice the
size of operands and set CF and OF when the upper half of the result is
nonzero, but second and third form truncate the product to the size of
operands. So second and third forms can be also used for unsigned operands
because, whether the operands are signed or unsigned, the lower half of the
product is the same. Below are the examples for all three forms:
"not" inverts the bits in the specified operand to form a one's complement
of the operand. It has no effect on the flags. Rules for the operand are the
same as for the "inc" instruction.
"and", "or" and "xor" instructions perform the standard logical operations.
They update the SF, ZF and PF flags. Rules for the operands are the same as
for the "add" instruction.
"bt", "bts", "btr" and "btc" instructions operate on a single bit which can
be in memory or in a general register. The location of the bit is specified
as an offset from the low order end of the operand. The value of the offset
is the taken from the second operand, it either may be an immediate byte or
a general register. These instructions first assign the value of the selected
bit to CF. "bt" instruction does nothing more, "bts" sets the selected bit to
1, "btr" resets the selected bit to 0, "btc" changes the bit to its
complement. The first operand can be word or double word.
"bsf" and "bsr" instructions scan a word or double word for first set bit
and store the index of this bit into destination operand, which must be
general register. The bit string being scanned is specified by source operand,
it may be either general register or memory. The ZF flag is set if the entire
string is zero (no set bits are found); otherwise it is cleared. If no set bit
is found, the value of the destination register is undefined. "bsf" scans from
low order to high order (starting from bit index zero). "bsr" scans from high
order to low order (starting from bit index 15 of a word or index 31 of a
double word).
"shl" shifts the destination operand left by the number of bits specified
in the second operand. The destination operand can be byte, word, or double
word general register or memory. The second operand can be an immediate value
or the CL register. The processor shifts zeros in from the right (low order)
side of the operand as bits exit from the left side. The last bit that exited
is stored in CF. "sal" is a synonym for "shl".
"shr" and "sar" shift the destination operand right by the number of bits
specified in the second operand. Rules for operands are the same as for the
"shl" instruction. "shr" shifts zeros in from the left side of the operand as
bits exit from the right side. The last bit that exited is stored in CF.
"sar" preserves the sign of the operand by shifting in zeros on the left side
if the value is positive or by shifting in ones if the value is negative.
"shld" shifts bits of the destination operand to the left by the number
of bits specified in third operand, while shifting high order bits from the
source operand into the destination operand on the right. The source operand
remains unmodified. The destination operand can be a word or double word
general register or memory, the source operand must be a general register,
third operand can be an immediate value or the CL register.
"shrd" shifts bits of the destination operand to the right, while shifting
low order bits from the source operand into the destination operand on the
left. The source operand remains unmodified. Rules for operands are the same
as for the "shld" instruction.
"rol" and "rcl" rotate the byte, word or double word destination operand
left by the number of bits specified in the second operand. For each rotation
specified, the high order bit that exits from the left of the operand returns
at the right to become the new low order bit. "rcl" additionally puts in CF
each high order bit that exits from the left side of the operand before it
returns to the operand as the low order bit on the next rotation cycle. Rules
for operands are the same as for the "shl" instruction.
"ror" and "rcr" rotate the byte, word or double word destination operand
right by the number of bits specified in the second operand. For each rotation
specified, the low order bit that exits from the right of the operand returns
at the left to become the new high order bit. "rcr" additionally puts in CF
each low order bit that exits from the right side of the operand before it
returns to the operand as the high order bit on the next rotation cycle.
Rules for operands are the same as for the "shl" instruction.
"test" performs the same action as the "and" instruction, but it does not
alter the destination operand, only updates flags. Rules for the operands are
the same as for the "and" instruction.
"bswap" reverses the byte order of a 32-bit general register: bits 0 through
7 are swapped with bits 24 through 31, and bits 8 through 15 are swapped with
bits 16 through 23. This instruction is provided for converting little-endian
values to big-endian format and vice versa.
"call" transfers control to the procedure, saving on the stack the address
of the instruction following the "call" for later use by a "ret" (return)
instruction. Rules for the operands are the same as for the "jmp" instruction,
but the "call" has no short variant of direct instruction and thus it not
optimized.
"ret", "retn" and "retf" instructions terminate the execution of a procedure
and transfers control back to the program that originally invoked the
procedure using the address that was stored on the stack by the "call"
instruction. "ret" is the equivalent for "retn", which returns from the
procedure that was executed using the near call, while "retf" returns from
the procedure that was executed using the far call. These instructions default
to the size of address appropriate for the current code setting, but the size
of address can be forced to 16-bit by using the "retw", "retnw" and "retfw"
mnemonics, and to 32-bit by using the "retd", "retnd" and "retfd" mnemonics.
All these instructions may optionally specify an immediate operand, by adding
this constant to the stack pointer, they effectively remove any arguments that
the calling program pushed on the stack before the execution of the "call"
instruction.
"iret" returns control to an interrupted procedure. It differs from "ret" in
that it also pops the flags from the stack into the flags register. The flags
are stored on the stack by the interrupt mechanism. It defaults to the size of
return address appropriate for the current code setting, but it can be forced
to use 16-bit or 32-bit address by using the "iretw" or "iretd" mnemonic.
The conditional transfer instructions are jumps that may or may not transfer
control, depending on the state of the CPU flags when the instruction
executes. The mnemonics for conditional jumps may be obtained by attaching
the condition mnemonic (see table 2.1) to the "j" mnemonic,
for example "jc" instruction will transfer the control when the CF flag is
set. The conditional jumps can be short or near, and direct only, and can be
optimized (see 1.2.5), the operand should be an immediate value specifying
target address.