We know that a processor (also known as the CPU, or Central Processing Unit) executes
all types of operations, effectively working as the brain of a computer. However, it only
recognizes strings of 0's and 1's, and as you can imagine, it's cumbersome to code in
machine language. So assembly language was designed: a low-level language, specific to
a family of processors, that represents instructions in symbolic code that is far
easier for a human being to understand. But, as you can also guess, it's still difficult
and somewhat inconvenient to develop in assembly language.
Well, you can think of the following points to decide whether to learn it or not.
Enhance your skill set.
Learn the language closest to machine language, which gives you the most control over performance.
Embed assembly language in a higher-level language to use features
unsupported by the higher-level language or for performance reasons.
Fill in the knowledge gap for understanding how the higher-level languages
came to be.
Code editors are software in which you can write code, modify it, and save it to a
file. Some tools that support assembly language are VS Code, DOSBox, emu8086,
and so on. Online assemblers are also available, like the popular online editor Ideone.
We will use emu8086, which comes with the environment needed to start our journey
in assembly language.
Code structure
We can simply write the assembly code and emulate it in emu8086, and it'll run.
However, without an exit call or a halt instruction, the program will continue
executing whatever instruction comes next in memory until it is halted by the OS or
by emu8086 itself. Assembly code is saved in a file with the .asm extension.
There are also some good practices, like defining the memory model and stack size at
the very beginning. For the small model, define the data and code segments after the
stack. The code segment contains the code to execute. In the example structure given
here, I have created a main procedure (called a function or method in other programming
languages), in which code execution starts. At the end of it, I call DOS interrupt 21H
with function 4CH in AH to indicate that the code has finished executing.
.model small
.stack 100H
; Data segment
.data ; if there is nothing in the data segment, you can omit this line.
; Code segment
.code
main PROC
; Write your code here
exit:
MOV AH, 4CH
INT 21H
main ENDP
END main
The first line, .model small, defines the memory model to use. Some recognized memory
models are tiny, small, medium, compact, large, and so on. The small memory model
supports one data segment and one code segment, which is usually enough for small
programs. The next line, .stack 100H, sets the stack size as a hexadecimal number;
100H is 256 bytes in decimal. Anything after a semicolon (;) on a line, whether it
starts the line or follows an instruction, is a comment that the assembler ignores.
General purpose registers: There are four general purpose registers, each 16 bits
wide and divided into a low half and a high half. For example, AX is divided into AL
and AH, each 8 bits long.
Accumulator (AX)
Base (BX)
Counter (CX)
Data (DX)
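To make the low/high split concrete, here is a minimal sketch in emu8086 syntax. The values are arbitrary and chosen only for illustration; stepping through it in the emulator should show how writing to AL or AH changes only one half of AX.

```asm
.model small
.stack 100H

.code
main PROC
    MOV AX, 1234H   ; AX = 1234H, so AH = 12H and AL = 34H
    MOV AL, 0FFH    ; only the low byte changes: AX = 12FFH
    MOV AH, 0       ; only the high byte changes: AX = 00FFH

    MOV AH, 4CH     ; DOS function 4CH: terminate the program
    INT 21H
main ENDP
END main
```

Note that a hexadecimal constant starting with a letter needs a leading 0 (0FFH), so the assembler does not mistake it for a name.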
Segment registers: There are also four segment registers.
Code Segment (CS)
Data Segment (DS)
Stack Segment (SS)
Extra Segment (ES)
Special purpose registers: There are two index registers and three pointer
registers.
Source Index (SI)
Destination Index (DI)
Base Pointer (BP)
Stack Pointer (SP)
Instruction Pointer (IP)
Flag register: This is a 16-bit register, of which the 8086 uses 9 bits to
indicate the current state of the processor. The nine flags are categorized into two
groups.
Status flags: Six status flags reflect the result of the most recently executed
arithmetic or logic instruction.
Carry flag (CF)
Parity flag (PF)
Auxiliary flag (AF)
Zero flag (ZF)
Sign flag (SF)
Overflow flag (OF)
Control flags: Three control flags control certain
operations of the processor.
Interrupt flag (IF)
Direction flag (DF)
Trap flag (TF)
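As a rough illustration of how the status flags follow an arithmetic result, the sketch below uses the ADD (addition) instruction with two hand-picked values. The flag values in the comments follow from 8-bit arithmetic; the flags panel in emu8086 can be used to check them while single-stepping.

```asm
.model small
.stack 100H

.code
main PROC
    MOV AL, 0FFH
    ADD AL, 1       ; AL = 00H: ZF = 1 (result is zero), CF = 1 (carry out of bit 7)

    MOV AL, 7FH
    ADD AL, 1       ; AL = 80H: OF = 1 (signed overflow), SF = 1 (bit 7 set), ZF = 0, CF = 0

    MOV AH, 4CH     ; DOS function 4CH: terminate the program
    INT 21H
main ENDP
END main
```

MOV itself does not modify any flag, so the flags seen after each ADD come from that ADD alone.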
To read more about these registers and what they are used for, visit this page.
In this article, I'll focus only on a few instructions necessary for understanding the
later parts.
Copy data (MOV): This instruction copies a byte (8-bit) or a word (16-bit)
from source to destination. Both operands must be of the same size (byte or
word). The syntax of this instruction is:
MOV destination, source
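A few MOV forms, as a minimal sketch; the register choices and values here are only for illustration.

```asm
.model small
.stack 100H

.code
main PROC
    MOV AX, 5       ; immediate value -> 16-bit register
    MOV BX, AX      ; register -> register (both 16-bit)
    MOV CL, 'A'     ; character constant -> 8-bit register (CL = 41H)
    ; MOV AX, BL    ; not allowed: 16-bit destination with an 8-bit source

    MOV AH, 4CH     ; DOS function 4CH: terminate the program
    INT 21H
main ENDP
END main
```

The commented-out line shows the size rule from the text: the assembler rejects a MOV whose operands differ in size.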
; Subtraction
SUB destination, source
SUB BL, 10
Label: A label is a symbolic name for the address of the instruction that is
given immediately after the label declaration. It can be placed at the beginning
of a statement and serve as an instruction operand. The exit: used before is a
label. Labels are of two types.
Symbolic Labels: A symbolic label consists of an identifier or symbol
followed by a colon (:). They must be defined only once as they have
global scope and appear in the object file's symbol table.
Numeric Labels: A numeric label consists of a single digit in the range
zero (0) through nine (9) followed by a colon (:). They are used only for
local reference and excluded in the object file's symbol table. Hence,
they have a limited scope and can be re-defined repeatedly.
; Symbolic label
label:
MOV AX, 5
; Numeric label
1:
MOV AX, 5
Compare (CMP): This instruction takes two operands and subtracts the second from
the first, then sets the OF, SF, ZF, AF, PF, and CF flags accordingly. The result is
not stored anywhere, so both operands keep their values.
CMP operand1, operand2
The operand1 operand can be a register or memory address, and operand2 can be a
register, memory, or immediate value.
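The sketch below shows CMP changing only the flags, using arbitrary values. The flag values in the comments follow from the 8-bit subtraction that CMP performs internally.

```asm
.model small
.stack 100H

.code
main PROC
    MOV AL, 5
    CMP AL, 5       ; 5 - 5 = 0: ZF = 1, and AL still holds 5
    CMP AL, 7       ; 5 - 7 needs a borrow: CF = 1, SF = 1, ZF = 0

    MOV AH, 4CH     ; DOS function 4CH: terminate the program
    INT 21H
main ENDP
END main
```

A conditional jump placed right after a CMP reads these flags to decide whether to branch.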
Here, variable-name is the identifier for each storage space. The assembler associates an
offset value for each variable name defined in the data segment.
Following is an example of variable declaration, where we initialize num and char with
a value that can be changed later. The output is initialized with a string and has a dollar
symbol ($) at the end to indicate the end of string. The input_char is declared without any
initial value. We can use ? to indicate that the value is currently unknown.
; Data segment
.data
num DB 31H
char DB 'A'
output DB "Hello, World!!$"
input_char DB ?
We cannot use the variables in the code segment just yet! For using these variables in
the code segment, we have to first move the address of the data segment to
the DS (data segment) register. Use this line at the beginning of the code segment to
import all variables.
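The initialization described above is usually written as two instructions, because a segment register cannot be loaded with an immediate value directly. The following is a minimal sketch that reuses the num variable from the declaration example.

```asm
.model small
.stack 100H

.data
num DB 31H

.code
main PROC
    MOV AX, @DATA   ; @DATA stands for the data segment's address
    MOV DS, AX      ; copy it into DS through AX

    MOV AL, num     ; the variable is now reachable: AL = 31H

    MOV AH, 4CH     ; DOS function 4CH: terminate the program
    INT 21H
main ENDP
END main
```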
; Output a character
MOV AH, 2
MOV DL, 35
INT 21H
; Output a string
MOV AH, 9
LEA DX, output
INT 21H
As shown in the code, for a single character output, we store the value in
the DL register because a character is one byte or 8 bits long. However, string
output is a bit different. We must load the effective address (address with offset) of
the string variable into the DX register using the LEA instruction. The string variable
must be defined in the data segment and end with a dollar symbol ($).
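For input, one common approach is DOS function 1 of INT 21H, which reads a single character from the keyboard, echoes it, and returns it in AL. The sketch below combines it with the two output functions shown above. The prompt variable and its text are made up for this example; input_char is the variable declared earlier.

```asm
.model small
.stack 100H

.data
prompt     DB "Press a key: $"   ; hypothetical prompt text
input_char DB ?

.code
main PROC
    MOV AX, @DATA       ; make the data segment reachable
    MOV DS, AX

    MOV AH, 9           ; function 9: print the $-terminated string at DS:DX
    LEA DX, prompt
    INT 21H

    MOV AH, 1           ; function 1: read one character, returned in AL
    INT 21H
    MOV input_char, AL  ; keep it in the variable

    MOV AH, 2           ; function 2: print the character in DL
    MOV DL, input_char
    INT 21H

    MOV AH, 4CH         ; DOS function 4CH: terminate the program
    INT 21H
main ENDP
END main
```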
The complete code containing variable declaration, input and output is provided
in GitHub.
; Compare
CMP AL, 5
JG greater ; if greater
JE equal ; else if equal
JMP less ; else
greater:
MOV BL, 'G'
JMP after
equal:
MOV BL, 'E'
JMP after
less:
MOV BL, 'L'
after:
; Other codes
; Note: if AL held 5 before the comparison, BL contains 'E' at this point
Using loops
We can also use loops in assembly language. However, unlike higher-level languages,
it does not provide different loop constructs. Although emu8086 supports five loop
instructions (LOOP, LOOPE, LOOPNE, LOOPNZ, and LOOPZ), they are not flexible
enough for many situations. We can create our own loops using compare and jump
instructions. Following are various types of loops implemented in assembly
language, all of which are equivalent.
For loop
The for loop has an initialization section where loop variables are initialized, a loop
condition section, and finally, an increment/decrement section to do some calculation
or change loop variables before the next iteration. Following is an example for loop
in C language.
char bl = '0';
for (int cl = 0; cl < 5; cl++) {
// body
bl++;
}
init_for:
; initialize loop variables
MOV BL, '0'
MOV CL, 0
for:
; condition
CMP CL, 5
JGE outside_for
; body
INC BL
; increment, then next iteration
INC CL
JMP for
outside_for:
; other codes
While loop
Unlike the for loop, the while loop has no initialization section. It only has a loop
condition section, which, if satisfied, executes the body. In the body, we can do some
calculations before the next iteration. Following is an example while loop
in C language.
char bl = '0';
int cl = 0;
while (cl < 5) {
// body
bl++;
cl++;
}
MOV CL, 0
MOV BL, '0'
while:
; condition
CMP CL, 5
JGE outside_while
; body
INC BL
INC CL
; next iteration
JMP while
outside_while:
; other codes
Do-while loop
Similar to the while loop, the do-while loop has a loop condition section and body.
The only difference is that the code in the body executes at least once, even if the
condition evaluates to false. Following is an example do-while loop in C language.
char bl = '0';
int cl = 0;
do {
// body
bl++;
cl++;
} while (cl < 5);
MOV CL, 0
MOV BL, '0'
do_while:
; body
INC BL
INC CL
; condition
CMP CL, 5
JL do_while
; other codes
; initialize counter
MOV CX, 5
loop1:
INC BL
; LOOP decrements CX and jumps to loop1 while CX is not zero
LOOP loop1
Include directive
The Include directive is used to access and use procedures and macros defined in
other files. The syntax is include followed by a file name with an extension.
include file_name
The assembler automatically searches for the file in two locations and shows an error
if it cannot find it. The locations are:
In the Inc folder, there is a file emu8086.inc, which defines some useful procedures
and macros that can make coding easier. We have to include the file at the beginning
of our source code to use these functionalities.
include 'emu8086.inc'
Now, we can use these macros in the code segment. Some of these macros and
procedures that I find most useful are:
To learn more about the macros and procedures inside the emu8086.inc file visit this
page.
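As a small illustration, the sketch below uses three macros that emu8086.inc defines: PRINT (print a string), PRINTN (print a string followed by a new line), and PUTC (print a single character). The strings are arbitrary, and the program layout simply follows the structure used earlier in this article.

```asm
include 'emu8086.inc'

.model small
.stack 100H

.code
main PROC
    PRINT 'Hello, '     ; print a string, cursor stays on the same line
    PRINTN 'World!'     ; print a string, then move to a new line
    PUTC '#'            ; print a single character

    MOV AH, 4CH         ; DOS function 4CH: terminate the program
    INT 21H
main ENDP
END main
```

Because these macros keep their string inside the code, no data segment or DS setup is needed for this sketch.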
Try it yourself first and if you cannot solve it, then read on.
; Print new-line
inner_loop:
; Print #
LOOP inner_loop
outside_loop:
; other codes
Summary
We covered a lot of ground in this article. First, we saw what assembly language is
and which tools can be used to write it. Then we looked at the code structure and
went through the registers and flags of the 8086 microprocessor. After covering
some assembly instructions, we learned how to define a variable, how to take input
from the user, and how to output something on the screen. Then we learned about
conditions and loops, and finally, to wrap up, we solved a problem using assembly
language.