assembly language pdf
A computer is made up of hardware and software. The hardware of a computer consists of four
types of components:
The Processor
A processor is also called the central processing unit (CPU). The processor consists
of at least the following three components:
A. Registers: A register is a storage location inside the CPU. It is used to hold data
and/or a memory address during the execution of an instruction. Because registers
are located inside the CPU, they provide fast access to operands for program
execution. The number of registers varies greatly from processor to processor.
B. Arithmetic logic unit (ALU): The ALU performs all the numerical computations
and logical evaluations for the processor. The ALU receives data from the memory,
performs the operations, and, if necessary, writes the result back to the memory.
Today's supercomputers can perform trillions of operations per second.
Smaller size
Lower cost
Higher reliability
Lower power consumption
More powerful
Versatility
80x86 Microprocessor
The 8086 microprocessor is an enhanced version of the 8085 microprocessor. Intel
began designing it in 1976.
Features of 80x86 Microprocessor:
1. The Intel 8086 was launched in 1978.
2. It was Intel's first 16-bit microprocessor.
3. This microprocessor offered a major improvement in execution speed over the
8085.
4. It is available as a 40-pin Dual-Inline-Package (DIP).
5. It is available in three versions:
a. 8086 (5 MHz)
b. 8086-2 (8 MHz)
c. 8086-1 (10 MHz)
6. It consists of 29,000 transistors.
N.B. In 80x86, x represents the version number.
The 8086 signals can be categorized in three groups. The first are the signals having
common functions in minimum as well as maximum mode, the second are the signals,
which have special functions for minimum mode, and the third are the signals having
special functions for maximum mode.
The following signal descriptions are common for both the minimum and maximum
modes.
AD15-AD0 These are the time-multiplexed memory and I/O address and data lines. The address
remains on the lines during the T1 state, while the data is available on the data bus during
T2, T3, Tw, and T4. Here T1, T2, T3, T4, and Tw are the clock states of a machine cycle;
Tw is a wait state. These lines are active high and float to a tristate during interrupt
acknowledge and local bus hold acknowledge cycles.
A19/S6, A18/S5, A17/S4, A16/S3 These are the time-multiplexed address and status
lines. During T1, these are the most significant address lines for memory operations.
During I/O operations, these lines are low. During memory or I/O operations, status
information is available on these lines for T2, T3, Tw, and T4. The status of the interrupt
enable flag bit (displayed on S5) is updated at the beginning of each clock cycle.
The S4 and S3 combination indicates which segment register is presently being used for
memory accesses, as shown in the table above. These lines are tristated
during the local bus hold acknowledge. The status line S6 is always low (logic 0). The
address bits are separated from the status bits using latches controlled by the ALE
signal.
BHE/S7 - Bus High Enable/Status The bus high enable signal is used to indicate the
transfer of data over the higher-order (D15-D8) data bus, as shown in the table below. It
goes low for data transfers over D15-D8 and is used to derive chip selects of the odd
address memory bank or peripherals. BHE is low during T1 for read, write, and interrupt
acknowledge cycles, whenever a byte is to be transferred on the higher byte of the data
bus. The status information is available during T2, T3, and T4. The signal is active low.
RD - Read The read signal, when low, indicates to the peripherals that the processor is
performing a memory or I/O read operation. RD is active low and shows the state for
T2, T3, and Tw of any read cycle. The signal remains tristated during the 'hold
acknowledge'.
READY This is the acknowledgement from the slow devices or memory that they have
completed the data transfer. The signal made available by the devices is synchronized
by the 8284A clock generator to provide ready input to the 8086. The signal is active
high.
INTR - Interrupt Request This is a level-triggered input. It is sampled during the last
clock cycle of each instruction to determine the availability of the request. If any interrupt
request is pending, the processor enters the interrupt acknowledge cycle. This can be
internally masked by resetting the interrupt enable flag. This signal is active high and
internally synchronized.
TEST This input is examined by a 'WAIT' instruction. If the TEST input goes low,
execution will continue, else, the processor remains in an idle state. The input is
synchronized internally during each clock cycle on leading edge of clock.
CLK - Clock Input The clock input provides the basic timing for processor operation and
bus control activity. It is an asymmetric square wave with a 33% duty cycle. The range of
frequency for different 8086 versions is from 5 MHz to 10 MHz.
Vcc +5V power supply for the operation of the internal circuit.
MN/MX The logic level at this pin decides whether the processor is to operate in either
minimum (single processor) or maximum (multiprocessor) mode. The following pin
functions are for the minimum mode operation of 8086.
INTA - Interrupt Acknowledge this signal is used as a read strobe for interrupt
acknowledge cycles. In other words, when it goes low, it means that the processor has
accepted the interrupt. It is active low during T2, T3 and Tw of each interrupt
acknowledge cycle.
ALE-Address Latch Enable This output signal indicates the availability of the valid
address on the address/data lines, and is connected to latch enable input of latches.
This signal is active high and is never tristated.
DT/R - Data Transmit/Receive This output is used to decide the direction of data flow
through the transceivers (bi-directional buffers). When the processor sends out data,
this signal is high, and when the processor is receiving data, this signal is low.
DEN - Data Enable This signal indicates the availability of valid data over the
address/data lines. It is used to enable the transceivers (bi-directional buffers) to
separate the data from the multiplexed address/data signal. It is active from the middle
of T2 until the middle of T4. DEN is tristated during the 'hold acknowledge' cycle.
HOLD, HLDA - Hold/Hold Acknowledge When the HOLD line goes high, it indicates to
the processor that another master is requesting the bus access. The processor, after
receiving the HOLD request, issues the hold acknowledge signal on HLDA pin, in the
middle of the next clock cycle after completing the current bus (instruction) cycle. At the
same time, the processor floats the local bus and control lines. When the processor
detects the HOLD line low, it lowers the HLDA signal. HOLD is an asynchronous input,
and it should be externally synchronized. If the DMA request is made while the CPU is
performing a memory or I/O cycle, it will release the local bus during T4 provided:
The following pin functions are applicable for maximum mode operation of 8086.
S2, S1, and S0 - Status Lines These are the status lines, which reflect the type of
operation being carried out by the processor. These status lines are encoded as shown in
Table 3.
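The standard 8086 maximum-mode status encoding can be written out as a small lookup. The sketch below uses Python purely for illustration; the names BUS_CYCLE and decode_status are hypothetical, and the keys are the logic levels seen on the S2, S1, and S0 pins.

```python
# Bus-cycle type indicated by the S2, S1, S0 status pins (8086 maximum mode).
# Keys are (S2, S1, S0) logic levels as they appear on the pins.
BUS_CYCLE = {
    (0, 0, 0): "Interrupt acknowledge",
    (0, 0, 1): "Read I/O port",
    (0, 1, 0): "Write I/O port",
    (0, 1, 1): "Halt",
    (1, 0, 0): "Code access",
    (1, 0, 1): "Read memory",
    (1, 1, 0): "Write memory",
    (1, 1, 1): "Passive",
}


def decode_status(s2, s1, s0):
    """Return the bus-cycle name for the given status pin levels."""
    return BUS_CYCLE[(s2, s1, s0)]


print(decode_status(1, 0, 1))
```

In a real system this decoding is done in hardware by the bus controller rather than in software.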
QS1, QS0 - Queue Status These lines give information about the status of the code-prefetch
queue. These are active during the CLK cycle after which the queue operation
is performed. These are encoded as shown in Table 4.
The instruction execution cycle is never broken for a fetch operation. After decoding the
first byte, the decoding circuit decides whether the instruction has a single opcode byte or
a double opcode byte. The next byte after the instruction is completed is again the first opcode byte of
the next instruction. A similar procedure is repeated until the complete execution of the
program. The main point to be noted here is that the fetch operation of the next
instruction is overlapped with the execution of the current instruction. As shown in the
architecture, there are two separate units, namely the execution unit and the bus interface unit.
While the execution unit is busy executing an instruction, after it is completely
decoded, the bus interface unit may be fetching the bytes of the next instruction from
memory, depending upon the queue status.
RQ/GT0, RQ/GT1 - Request/Grant In maximum mode, other local bus masters use these pins
to force the processor to release the local bus at the end of the processor’s current bus cycle. Each of the
pins is bi-directional, with RQ/GT0 having higher priority than RQ/GT1. The RQ/GT pins have
internal pull-up resistors and may be left unconnected. The request/grant sequence is
as follows:
1. A pulse one clock wide from another bus master requests bus access from the 8086.
2. During the T4 (current) or T1 (next) clock cycle, a pulse one clock wide from the 8086 to the
requesting master indicates that the 8086 has allowed the local bus to float and that
it will enter the "hold acknowledge" state at the next clock cycle. The CPU's bus interface
unit is likely to be disconnected from the local bus of the system.
Data Registers
AX (Accumulator Register): AH and AL
BX (Base Register): BH and BL
CX (Count Register): CH and CL
DX (Data Register): DH and DL
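Each 16-bit data register can be accessed as two 8-bit halves. As a minimal sketch (Python is used only for illustration, and the helper names split_register and join_register are hypothetical), the relation between AX and AH/AL looks like this:

```python
def split_register(value16):
    """Split a 16-bit register value into (high byte, low byte), e.g. AX -> (AH, AL)."""
    value16 &= 0xFFFF
    return (value16 >> 8) & 0xFF, value16 & 0xFF


def join_register(high, low):
    """Combine a high and a low byte into one 16-bit value, e.g. (AH, AL) -> AX."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


ah, al = split_register(0x1234)
print(hex(ah), hex(al))
```

Writing to AL therefore changes only the low byte of AX, and writing to AH only the high byte.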
Address registers
A. Pointer and Index Registers
SI (Source Index) is a 16-bit register. SI is used for indexed, based indexed, and
register indirect addressing, as well as a source data address in string manipulation
instructions.
DI (Destination Index) is a 16-bit register. DI is used for indexed, based indexed, and
register indirect addressing, as well as a destination data address in string
manipulation instructions.
SP (Stack Pointer) is a 16-bit register pointing to the program stack.
BP (Base Pointer) is a 16-bit register pointing to data in the stack segment. The BP register
is usually used for based, based indexed, or register indirect addressing.
IP (Instruction Pointer)
B. Segment Registers
Additional registers, called segment registers, generate memory addresses when combined
with other registers in the microprocessor. In the 8086 microprocessor, memory is divided into 4
segments:
CS (Code Segment): The CS register is used for addressing a memory location in the
code segment of the memory, where the executable program is stored.
DS (Data Segment): The DS register addresses the segment that contains most data used by the program. Data are accessed
in the data segment by an offset address or the content of another register that holds the offset
address.
SS (Stack Segment): SS defines the area of memory used for the stack.
ES (Extra Segment): ES is an additional data segment that is used by some of the string
instructions to hold the destination data.
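On the 8086, a segment register and a 16-bit offset combine into a 20-bit physical address: the segment value is shifted left by 4 bits (multiplied by 16) and the offset is added. The following is a sketch of that calculation, assuming real-mode 8086 behaviour; the function name physical_address is hypothetical.

```python
def physical_address(segment, offset):
    """Return the 20-bit 8086 physical address for a segment:offset pair."""
    # Shift the 16-bit segment left by 4, add the 16-bit offset,
    # and keep 20 bits because the 8086 has only 20 address lines.
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF


print(hex(physical_address(0x1234, 0x0010)))
```

Because of the 20-bit limit, an address such as FFFF:0010 wraps around to 00000H on the 8086.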
FLAG Registers
1. Conditional Flags
2. Control Flags
1. Conditional Flags
Conditional flags represent result of last arithmetic or logical instruction executed.
Conditional flags are as follows:
Carry Flag (CF): This flag indicates an overflow condition for unsigned integer
arithmetic. It is also used in multiple-precision arithmetic.
Auxiliary Flag (AF): If an operation performed in the ALU generates a carry/borrow
from the lower nibble (i.e. D0 – D3) to the upper nibble (i.e. D4 – D7), the AF flag is set, i.e.
the carry given by bit D3 to bit D4 is the AF flag. This is not a general-purpose flag; it is
used internally by the processor to perform binary to BCD conversion.
Parity Flag (PF): This flag is used to indicate the parity of the result. If the lower-order
8 bits of the result contain an even number of 1’s, the Parity Flag is set, and for an odd
number of 1’s, the Parity Flag is reset.
Zero Flag (ZF): It is set if the result of an arithmetic or logical operation is zero; otherwise
it is reset.
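The four flag definitions above can be checked with a small model of an 8-bit addition. This is only an illustrative sketch in Python; the function name add8_flags and the sample operands are assumptions, not part of the original notes.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags) for CF, AF, PF, ZF."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                     # carry out of bit 7
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),   # carry from bit 3 to bit 4
        "PF": int(bin(result).count("1") % 2 == 0),  # even number of 1 bits
        "ZF": int(result == 0),                      # result is zero
    }
    return result, flags


print(add8_flags(0x73, 0x4F))
```

For example, adding FFH and 01H gives a result of 00H with CF, AF, PF, and ZF all set.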
Early computer systems were literally programmed by hand. Front panel switches were
used to enter instructions and data. These switches represented the address, data and
control lines of the computer system. To enter data into memory, the address switches
were toggled to the correct address, the data switches were toggled next, and finally the
write switch was toggled. This wrote the binary value on the front panel data switches
to the address specified. Once all the data and instruction were entered, the run switch
was toggled to run the program.
The programmer also needed to know the instruction set of the processor. Each
instruction needed to be manually converted into bit patterns by the programmer so the
front panel switches could be set correctly. This led to errors in translation as the
programmer could easily misread 8 as the value B. It became obvious that such
methods were slow and error prone.
With the advent of better hardware which could address larger memory, and the
increase in memory size (due to better production techniques and lower cost), programs
were written to perform some of this manual entry. Small monitor programs became
popular, which allowed entry of instructions and data via hex keypads or terminals.
Additional devices such as paper tape and punched cards became popular as storage
methods for programs.
Programs were still hand-coded, in that the conversion from mnemonics to instructions
was still performed manually. To increase programmer productivity, the idea of writing a
program to interpret another was a major breakthrough. This would be run by the
computer, and translate the actual mnemonics into instructions. The benefits of such a
program would be
reduced errors
faster translation times
As programmers were writing the source code in mnemonics anyway, it seemed the
logical next step. The source file was fed as input into the program, which translated the
mnemonics into instructions, then wrote the output to the desired place (paper-tape etc).
This sequence is now accepted as commonplace.
The only advances have been the increasing use of high level languages to increase
programmer productivity. Assembly language programming is writing machine
instructions in mnemonic form, using an assembler to convert these mnemonics into
actual processor instructions and associated data.
There is also a probable maintenance phase associated. The chosen language will
undoubtedly need to be converted into the appropriate binary bit-patterns which make
sense to the target processor (the processor on which the software will be run). This
process of conversion is called translation.
The other type of system software that you need to know is translator software. This is
software that allows new programs to be written and run on computers, by converting
source code into machine code. There are three types that we'll cover in a lot more
detail shortly:
Allows the programmer to use mnemonics when writing source code programs.
variables are represented by symbolic names, not as memory locations
symbolic code is easier to read and follow
error checking is provided
changes can be quickly and easily incorporated with a re-assembly
programming aids are included for relocation and expression evaluation
The assembler converts the written assembly language source program into a format
which runs on the processor. Each machine code instruction (the binary or hex value) is
represented by a mnemonic, for example:
+----------+-----+----------+--------------------------+
| Binary   | Hex | Mnemonic | Description              |
+----------+-----+----------+--------------------------+
| 01001111 | 4F  | CLRA     | Clears the A accumulator |
+----------+-----+----------+--------------------------+
| 00110110 | 36  | PSHA     | Saves A acc on stack     |
+----------+-----+----------+--------------------------+
| 01001101 | 4D  | TSTA     | Tests A acc for 0        |
+----------+-----+----------+--------------------------+
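The translation step an assembler performs can be sketched as a table lookup. The toy example below, written in Python for illustration, knows only the three mnemonics listed in the table; the names OPCODES and assemble are hypothetical, and a real assembler also handles operands, labels, and directives.

```python
# Opcode table taken from the three entries shown above.
OPCODES = {"CLRA": 0x4F, "PSHA": 0x36, "TSTA": 0x4D}


def assemble(source):
    """Translate one mnemonic per line into machine-code bytes."""
    machine_code = []
    for line in source.splitlines():
        mnemonic = line.split(";")[0].strip().upper()  # drop comments and spaces
        if not mnemonic:
            continue  # skip blank lines
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        machine_code.append(OPCODES[mnemonic])
    return bytes(machine_code)


program = """
    clra   ; clear accumulator A
    psha   ; save it on the stack
    tsta   ; test it for zero
"""
print(assemble(program).hex())
```

The error raised for an unknown mnemonic is a small instance of the error checking that an assembler provides.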
Assembly - Variables
NASM provides various define directives for reserving storage space for variables.
The define assembler directive is used for allocation of storage space. It can be used to
reserve as well as initialize one or more bytes.
choice DB 'y'
number DW 12345
neg_number DW -12345
big_number DQ 123456789
real_number1 DD 1.234
real_number2 DQ 123.456
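To see what these definitions place in memory, the byte layout can be reproduced with Python's struct module. This is a sketch that assumes the little-endian byte order used by x86 processors and IEEE 754 encoding for the floating-point value; the variable names simply mirror the definitions above.

```python
import struct

# "<" selects little-endian, the byte order used by x86 processors.
choice = struct.pack("<c", b"y")           # DB 'y'        -> 1 byte
number = struct.pack("<H", 12345)          # DW 12345      -> 2 bytes
neg_number = struct.pack("<h", -12345)     # DW -12345     -> 2 bytes (two's complement)
big_number = struct.pack("<Q", 123456789)  # DQ 123456789  -> 8 bytes
real_number1 = struct.pack("<f", 1.234)    # DD 1.234      -> 4 bytes

for name, data in [("choice", choice), ("number", number),
                   ("neg_number", neg_number), ("big_number", big_number),
                   ("real_number1", real_number1)]:
    print(name, data.hex(" "))
```

Note that the low-order byte is stored first: 12345 is 3039H and is stored as the bytes 39H, 30H.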
Directive Purpose
RESB Reserve a Byte
RESW Reserve a Word
RESD Reserve a Doubleword
RESQ Reserve a Quadword
REST Reserve Ten Bytes
Multiple Definitions
You can have multiple data definition statements in a program. For example −
Multiple Initializations
The TIMES directive allows multiple initializations to the same value. For example, an
array named marks of size 9 can be defined and initialized to zero using the following
statement −
marks TIMES 9 DW 0
There are several directives provided by NASM that define constants. We have already
used the EQU directive in previous chapters. We will particularly discuss three
directives −
EQU
%assign
%define
The EQU directive is used for defining constants. The syntax of the EQU directive is as
follows −
For example,
TOTAL_STUDENTS equ 50
You can then use this constant value in your code, like −
mov ecx, TOTAL_STUDENTS
cmp eax, TOTAL_STUDENTS
LENGTH equ 20
WIDTH equ 10
AREA equ LENGTH * WIDTH
%assign TOTAL 10
%assign TOTAL 20
The %define directive allows defining both numeric and string constants. This directive
is similar to the #define in C. For example, you may define the constant PTR as −
1. Label: a text string, much like a variable name in a high-level language
up: MOV AX, BX
CLASSIFICATION OF INSTRUCTIONS
The 8086 microprocessor supports 6 types of instructions
2. ARITHMETIC INSTRUCTIONS
2.1. Addition Instructions
ADD Add specified byte-to-byte or specified word to word.
ADC Add byte + byte + carry flag or word + word + carry flag.
INC Increment specified byte or specified word by 1.
AAA ASCII adjust after addition.
DAA Decimal (BCD) adjust after addition.
2.2. Subtraction Instructions:
SUB Subtract byte from byte or word from word.
SBB Subtract byte and carry flag from byte or word and carry flag
from word.
DEC Decrement specified byte or specified word by 1.
NEG Negate: invert each bit of a specified byte or word and add 1
(form 2’s complement).
CMP Compare two specified bytes or two specified words.
AAS ASCII adjust after subtraction.
DAS Decimal (BCD) adjust after subtraction.
2.3. Multiplication Instructions
MUL Multiply unsigned byte by byte or unsigned word by word.
IMUL Multiply signed byte by byte or signed word by word.
AAM ASCII adjust after multiplication.
CBW Fill upper byte of word with copies of sign bit of lower byte.
CWD Fill upper word of double word with sign bit of lower word.
Until CX = 0.
EXAMPLES:
PUSH BX Decrement SP by 2, copy BX to stack
PUSH DS Decrement SP by 2, copy DS to stack
PUSH AL Illegal, must push a word
PUSH TABLE [BX] Decrement SP by 2, copy word from memory in
DS at EA = TABLE + [BX] to stack
EXAMPLES:
POP DX Copy a word from top of stack to DX; increment SP by 2
POP DS Copy a word from top of stack to DS; increment SP by 2
POP TABLE [BX] Copy a word from top of stack to memory in DS with EA =
TABLE + [BX]; increment SP by 2
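The way PUSH and POP move the stack pointer can be modelled in a few lines. The class below is a simplified Python sketch, not the processor's actual implementation: Stack8086 is a hypothetical name, it models a single 64 KB stack segment, and the starting SP value is arbitrary.

```python
class Stack8086:
    """Toy model of the 8086 stack inside one 64 KB stack segment."""

    def __init__(self, sp=0x0100):
        self.memory = bytearray(0x10000)
        self.sp = sp

    def push(self, word):
        """PUSH: decrement SP by 2, then store the word (low byte first)."""
        self.sp = (self.sp - 2) & 0xFFFF
        self.memory[self.sp] = word & 0xFF
        self.memory[(self.sp + 1) & 0xFFFF] = (word >> 8) & 0xFF

    def pop(self):
        """POP: read the word at the top of the stack, then increment SP by 2."""
        low = self.memory[self.sp]
        high = self.memory[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low


stack = Stack8086()
stack.push(0x1234)   # e.g. PUSH BX with BX = 1234H
print(hex(stack.sp), hex(stack.pop()), hex(stack.sp))
```

Words come off the stack in the reverse of the order in which they were pushed.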
For the variable-port-type IN instruction, the port address is loaded into the DX register before
the IN instruction. Since DX is a 16-bit register, the port address can be any number between
0000H and FFFFH. Therefore, up to 65,536 ports are addressable in this mode.
MOV DX, 0FF78H Initialize DX to point to port
IN AL, DX Input a byte from 8-bit port 0FF78H to AL
IN AX, DX Input a word from 16-bit port 0FF78H to AX
The variable-port IN instruction has the advantage that the port address can be computed or
dynamically determined in the program.
1.4.2 OUT Port, Accumulator AL or AX
The OUT instruction copies a byte from AL or a word from AX to the specified port. The
OUT instruction has two possible forms, fixed port and variable port.
For the fixed-port form, the 8-bit port address is specified directly in the instruction. With this
form, any one of 256 possible ports can be addressed.
EXAMPLES:
OUT 2CH, AX Copy the contents of AX to port 2CH
For the variable-port form of the
OUT instruction, the contents of AL or AX will be copied to the port at an address
contained in DX. Therefore, the DX register must always be loaded with the desired port
address before this form of the OUT instruction is used. The advantage of the variable-port
form of addressing is described in the discussion of the IN instruction. The OUT
instruction does not affect any flags.
These instructions add a number from some source to a number from some destination
and put the result in the specified destination. The Add with Carry instruction, ADC,
also adds the status of the carry flag into the result. The source may be an immediate
number, a register, or a memory location specified by any one of the 24 addressing
modes. The destination may be a register or a memory location specified by any one of
the 24 addressing modes. The source and the destination in an instruction cannot both
be memory locations. The source and the destination must be of the same type. In other
words, they must both be byte locations, or they must both be word locations. If we want
to add a byte to a word, we must copy the byte to a word location and fill the upper byte
of the word with 0's before adding. Flags affected: AF, CF, OF, PF, SF, ZF.
EXAMPLES:
ADD AL, 74H Add immediate number 74H to contents of AL. Result in AL
ADC CL, BL Add contents of BL plus carry status to contents of CL.
ADD DX, BX Add contents of BX to contents of DX
ADD DX,[SI] Add word from memory at offset [SI]in DS to contents of DX
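One common use of ADC is adding numbers wider than 16 bits: ADD handles the low words and ADC adds the high words plus the carry from the first step. The Python sketch below models that sequence; the function names add16 and add32 are hypothetical.

```python
def add16(a, b, carry_in=0):
    """Model a 16-bit ADD (carry_in=0) or ADC; return (result, carry flag)."""
    total = (a & 0xFFFF) + (b & 0xFFFF) + carry_in
    return total & 0xFFFF, total >> 16


def add32(a, b):
    """Add two 32-bit values using a 16-bit ADD followed by a 16-bit ADC."""
    low, carry = add16(a & 0xFFFF, b & 0xFFFF)     # ADD on the low words
    high, carry = add16(a >> 16, b >> 16, carry)   # ADC on the high words
    return (high << 16) | low, carry


print([hex(x) for x in add32(0x0001FFFF, 0x00000001)])
```

The carry out of the low-word addition is what makes the high word come out as 0002H instead of 0001H in this example.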
1.5.2 FLAG RESULTS FOR SIGNED ADDITION EXAMPLE
CF = 0 No carry out of bit 7.
PF = 0 Result has odd parity.
AF = 1 Carry was produced out of bit 3.
ZF = 0 Result in destination was not 0.
SF = 1 Copies most significant bit of result; indicates negative result if we are
adding signed numbers.
OF = 1 Set to indicate that the result of the addition was too large to fit in the lower 7
bits of the destination used to represent the magnitude of a signed number. In other
words, the result was greater than + 127 decimal, so the result overflowed into the sign
bit position and incorrectly indicated that the result was negative. If we are adding two
EXAMPLES:
AL = 0101 1001 = 59 BCD; BL = 0011 0101 = 35 BCD
ADD AL, BL AL = 1000 1110 = 8EH
DAA Add 0110 because 1110 > 9; AL = 1001 0100 = 94 BCD
AL = 1000 1000 = 88 BCD; BL = 0100 1001 = 49 BCD
ADD AL, BL AL = 1101 0001, AF = 1
DAA Add 0110 because AF = 1; AL = 1101 0111 = D7H
1101 > 9, so add 0110 0000
AL = 0011 0111 = 37 BCD, CF = 1
The DAA instruction updates AF, CF, PF, and ZF. OF is undefined after a DAA
instruction.
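The two DAA examples above can be reproduced with a short model of the adjustment rule. This Python sketch follows the rule as Intel documents it for DAA (add 06H if the low nibble exceeds 9 or AF is set, add 60H if the original value exceeds 99H or CF is set); the helper names add8 and daa are hypothetical.

```python
def add8(a, b):
    """8-bit ADD; return (AL, CF, AF)."""
    total = (a & 0xFF) + (b & 0xFF)
    af = int((a & 0x0F) + (b & 0x0F) > 0x0F)
    return total & 0xFF, total >> 8, af


def daa(al, cf, af):
    """Decimal adjust AL after adding two packed BCD bytes; return (AL, CF, AF)."""
    old_al, old_cf = al, cf
    if (al & 0x0F) > 9 or af:
        al = (al + 0x06) & 0xFF   # correct the low BCD digit
        af = 1
    else:
        af = 0
    if old_al > 0x99 or old_cf:
        al = (al + 0x60) & 0xFF   # correct the high BCD digit
        cf = 1
    else:
        cf = 0
    return al, cf, af


for a, b in [(0x59, 0x35), (0x88, 0x49)]:
    al, cf, af = daa(*add8(a, b))
    print(f"{a:02X} + {b:02X} -> AL = {al:02X}, CF = {cf}")
```

In the second case the carry flag represents the hundreds digit, so 88 + 49 = 137 appears as CF = 1 with AL = 37 BCD.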
1.6 Subtraction Instructions
CF ZF SF
CX = BX 0 1 0 Result of subtraction is 0
EXAMPLES:
MUL BH AL times BH, result in AX
MUL CX AX times CX, result high word in DX, low word in AX
MUL BYTE PTR [BX] AL times byte in DS pointed to by [BX]
MUL CONVERSION_FACTOR [BX] Multiply AL times byte at effective address
CONVERSION_FACTOR [BX] if it was declared as
type byte with DB. Multiply AX times word at
effective address CONVERSION_FACTOR [BX] if
it was declared as type word with DW.
EXAMPLES:
IMUL BH Signed byte in AL times signed byte in BH, result in AX
IMUL AX AX times AX, result in DX and AX
Before we can multiply two ASCII digits, we must first mask the upper 4 bits of each. This
leaves unpacked BCD (one BCD digit per byte) in each byte. After the two unpacked BCD
digits are multiplied, the AAM instruction is used to adjust the product to two unpacked BCD
digits in AX. AAM works only after the multiplication of two unpacked BCD bytes, and it
works only on an operand in AL. AAM updates PF, SF, and ZF, but AF, CF, and OF are left
undefined.
EXAMPLE:
AL 00000101 unpacked BCD 5
BH 00001001 unpacked BCD 9
MUL BH AL x BH; result in AX
AX = 00000000 00101101 = 002DH
AAM AX = 00000100 00000101 = 0405H,
which is unpacked BCD for 45.
If ASCII codes for the result are desired, use next instruction
OR AX, 3030H Put 3 in upper nibble of each byte.
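The adjustment AAM performs amounts to dividing AL by 10: the quotient goes to AH and the remainder to AL. The sketch below models the example in Python; the function name aam is hypothetical.

```python
def aam(al):
    """ASCII adjust after multiply: AH = AL // 10, AL = AL % 10; return AX."""
    al &= 0xFF
    return ((al // 10) << 8) | (al % 10)


product = 5 * 9                 # MUL BH leaves 002DH in AX
ax = aam(product & 0xFF)        # unpacked BCD digits 4 and 5
ascii_ax = ax | 0x3030          # OR AX, 3030H gives the ASCII codes
print(f"{product:04X} {ax:04X} {ascii_ax:04X}")
```

The final value 3435H holds the ASCII characters '4' and '5' in AH and AL.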
EXAMPLES:
DIV BL Divide word in AX by byte in BL. Quotient in AL, remainder in AH
DIV CX Divide double word in DX and AX by word in CX.
Quotient in AX, remainder in DX.
DIV SCALE [BX] AX/(byte at effective address SCALE[BX]), if SCALE[BX] is of
type byte, or (DX and AX)/(word at effective address SCALE
[BX]) if SCALE[BX] is of type word
DIV BH AX = 37D7H = 14,295 decimal BH = 97H = 151 decimal
AX/BH, AL = quotient = 5EH = 94 decimal
AH = remainder = 65H = 101 decimal
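The numeric example above can be verified with a small model of the word-by-byte form of DIV. This is a Python sketch under simplifying assumptions: the function name div8 is hypothetical, and the Python exceptions stand in for the divide-error interrupt that the processor generates when the divisor is zero or the quotient does not fit in AL.

```python
def div8(ax, divisor):
    """Unsigned divide of the word in AX by a byte; return (AL quotient, AH remainder)."""
    divisor &= 0xFF
    if divisor == 0:
        raise ZeroDivisionError("divide by zero")
    quotient, remainder = divmod(ax & 0xFFFF, divisor)
    if quotient > 0xFF:
        raise OverflowError("quotient does not fit in AL")
    return quotient, remainder


al, ah = div8(0x37D7, 0x97)
print(f"AL = {al:02X}H, AH = {ah:02X}H")
```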
This instruction is used to divide a signed word by a signed byte, or to divide a signed double
word (32 bits) by a signed word. When dividing a signed word by a signed byte, the word must
be in the AX register. The divisor can be in an 8-bit register or a memory location. After the
division, AL will contain the signed result (quotient), and AH will contain the signed remainder.
The sign of the remainder will be the same as the sign of the dividend.
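The rule that the remainder takes the sign of the dividend means the quotient is truncated toward zero. The Python sketch below models the word-by-byte case; note that Python's own // operator rounds toward negative infinity, so the truncation is done explicitly. The names to_signed and idiv8 are hypothetical, and the -128 to +127 range check is a simplification.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def idiv8(ax, divisor):
    """Signed divide of AX by a byte; return (AL quotient, AH remainder) as raw bytes."""
    dividend = to_signed(ax, 16)
    d = to_signed(divisor, 8)
    if d == 0:
        raise ZeroDivisionError("divide by zero")
    quotient = abs(dividend) // abs(d)       # magnitude, truncated toward zero
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d      # same sign as the dividend
    if not -128 <= quotient <= 127:          # simplified range check (assumption)
        raise OverflowError("quotient does not fit in AL")
    return quotient & 0xFF, remainder & 0xFF


al, ah = idiv8(0xFFF9, 0x02)                 # -7 divided by 2
print(to_signed(al, 8), to_signed(ah, 8))
```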
The 8086 Convert Word to Double word instruction, CWD, will copy the sign bit of AX to all the
bits of DX.
EXAMPLE:
AX = 00000000 10011011 AL = 9BH = -101 decimal as a signed byte
CBW Convert signed byte in AL to signed word in AX
Result: AX = 11111111 10011011 = -101 decimal
For further examples of the use of CBW, see the IDIV instruction description.
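Sign extension as done by CBW and CWD can be modelled in a few lines. This is an illustrative Python sketch; the function names cbw and cwd simply mirror the instruction names.

```python
def cbw(al):
    """Extend the sign bit of AL through AH; return the new AX."""
    al &= 0xFF
    return (0xFF00 | al) if al & 0x80 else al


def cwd(ax):
    """Extend the sign bit of AX through DX; return (DX, AX)."""
    ax &= 0xFFFF
    dx = 0xFFFF if ax & 0x8000 else 0x0000
    return dx, ax


print(f"{cbw(0x9B):04X}")
```

The value 9BH is negative as a signed byte, so AH is filled with 1s and AX becomes FF9BH.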
CWD-Convert Signed Word to Signed Double word
EXAMPLES:
NOT BX Complement contents of BX register
NOT BYTE PTR [BX] Complement memory byte at offset [BX] in data segment
EXAMPLES:
AND CX, [SI] AND word in DS at offset [SI] with word in CX register
BX = 10110011 01011110
AND BX, 00FFH Mask out upper 8 bits of BX
Result: BX = 00000000 01011110 CF, OF, PF, SF, ZF = 0
OR Destination, Source
This instruction ORs each bit in a source byte or word with the corresponding bit in a
destination byte or word. The result is put in the specified destination. The contents of the
specified source will not be changed. The result for each bit will follow the truth table for a two-
input OR gate
OR AH, CL CL ORed with AH, result in AH. CL not changed
OR BP, SI SI ORed with BP, result in BP. SI not changed
OR SI, BP BP ORed with SI, result in SI. BP not changed
OR BL, 80H BL ORed with immediate 80H. Set MSB of BL to a 1
OR CX, TABLE [BX][SI] CX ORed with word from effective address
TABLE [BX][SI] in data segment.
OR CX, 0FF00H CX = 00111101 10100101; OR CX with immediate
FF00H. Result in CX = 11111111 10100101,
CF = 0, OF = 0, PF = 1, SF = 1, ZF = 0.
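The flag results quoted in the AND and OR examples can be checked with a short model. In this Python sketch the function name logic_flags is hypothetical; it relies on the fact that the logical instructions clear CF and OF and that PF looks only at the low 8 bits of the result.

```python
def logic_flags(result, bits=16):
    """Return the flags left by a logical instruction (AND, OR) for a given result."""
    result &= (1 << bits) - 1
    return {
        "CF": 0,                                            # always cleared
        "OF": 0,                                            # always cleared
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),  # parity of low byte
        "SF": result >> (bits - 1),                         # copy of the MSB
        "ZF": int(result == 0),
    }


cx = 0x3DA5 | 0xFF00    # OR CX, 0FF00H
bx = 0xB35E & 0x00FF    # AND BX, 00FFH
print(f"{cx:04X}", logic_flags(cx))
print(f"{bx:04X}", logic_flags(bx))
```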
EXAMPLES:
SAL BP, CL Shift word in BP left (CL) bit positions, 0's in LSBs
Rotate Instructions
ROL - Rotate All Bits of Operand Left, MSB to LSB - ROL Destination, Count This
instruction rotates all the bits in a specified word or byte to the left some number of bit
positions. The operation can be thought of as circular, because the data bit rotated out
of the MSB is circled back into the LSB. The data bit rotated out of the MSB is also
copied to CF during ROL. In the case of multiple bit rotates, CF will contain a copy of
the bit most recently moved out of the MSB. See the following diagram.
The destination operand can be in a register or in a memory location specified by any one of
the addressing modes. If we want to rotate the operand one bit position, we can specify this by
putting a 1 in the count position
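The circular behaviour of ROL, including the copy of the rotated-out bit in CF, can be sketched as follows. The Python function rol is an illustrative model only; it returns None for CF when the count is zero, since no bit has been rotated in that case.

```python
def rol(value, count, bits=16):
    """Rotate value left by count positions; return (result, CF)."""
    mask = (1 << bits) - 1
    value &= mask
    cf = None                               # CF is not changed by a zero count
    for _ in range(count):
        cf = value >> (bits - 1)            # bit leaving the MSB
        value = ((value << 1) & mask) | cf  # the same bit re-enters at the LSB
    return value, cf


print([hex(rol(0x8001, 1)[0]), rol(0x8001, 1)[1]])
```

Rotating 8001H left by one position gives 0003H with CF = 1, because the 1 that leaves the MSB both enters the LSB and is copied to CF.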