IBM PC ASSEMBLY
LANGUAGE
AND PROGRAMMING
Third Edition
Peter Abel
British Columbia
Institute of Technology
The author and publisher of this book have used their best efforts in preparing this book. These efforts include the
development, research, and testing of the theories and programs to determine their effectiveness. The author and
publisher make no warranty of any kind, expressed or implied, with regard to these programs or the documentation
contained in this book. The author and publisher shall not be liable in any event for incidental or consequential
damages in connection with, or arising out of, the furnishing, performance, or use of these programs.
ISBN 0-13-192063-4
PREFACE
1 INTRODUCTION TO PC HARDWARE
Introduction
Bits and Bytes
Binary Numbers
Hexadecimal Representation
ASCII Code
The Processor
Internal Memory
Segments and Addressing
Registers
Key Points
Questions
2 PC SOFTWARE REQUIREMENTS
Introduction
Operating System Characteristics
The Boot Process
DOS-BIOS Interface
System Program Loader
The Stack
Program Addressing
Memory and Register References
Key Points
Questions
3 EXECUTION OF INSTRUCTIONS
Introduction
The DEBUG Program
Viewing Memory Locations
Machine Language Example I: Immediate Data
Machine Language Example II: Defined Data
Entering a Symbolic Assembly Program
Using the INT Instruction
Saving a Program from within DEBUG
Assembly Language Example: The PTR Operator
Key Points
Questions
Introduction
Features of String Operations
REP: Repeat String Prefix
MOVS: Move String
LODS: Load String
STOS: Store String
Transferring Data with LODS and STOS
CMPS: Compare String
SCAS: Scan String
Scan and Replace
Alternative Coding for String Instructions
Duplicating a Pattern
Right Adjusting on the Screen
Key Points
Questions
Introduction
Addition and Subtraction
Multiword Arithmetic
Unsigned and Signed Data
Multiplication
Multiword Multiplication
Special Multiplication Instructions
Multiplication by Shifting
Division
Division by Shifting
Reversing the Sign
Numeric Data Processors
Key Points
Questions
Introduction
Data in Decimal Format
Processing ASCII Data
Processing Unpacked BCD Data
Processing Packed BCD Data
Conversion of ASCII to Binary Format
Conversion of Binary to ASCII Format
Shifting and Rounding
Program to Convert ASCII Data
Key Points
Questions
20 PRINTING
Introduction
Common Printer Control Characters
DOS 21H, Function 40H: Print Characters
Printing With Page Overflow and Headings
Printing ASCII Files and Handling Tabs
DOS 21H, Function 05H: Print Character
Special Printer Control Characters
BIOS INT 17H Functions for Printing
Key Points
Questions
Comments
Using a Macro within a Macro Definition
The LOCAL Directive
Includes from a Macro Library
Concatenation
Repetition Directives
Conditional Directives
Key Points
Questions
Introduction
The Boot Process
The BIOS Data Area
Interrupt Services
BIOS Interrupts
Key Points
Questions
Introduction
DOS Interrupts
DOS INT 21H Services
Key Points
Questions
Introduction
Type Specifiers
Operators
Directives
Introduction
Register Notation
Addressing Mode Byte
Two-Byte Instructions
Three-Byte Instructions
Four-Byte Instructions
Instruction Set
APPENDIXES
A Conversion between Hexadecimal and Decimal
B ASCII Character Codes
C Reserved Words
D Assembler and Link Options
E The DOS DEBUG Program
F Keyboard Scan Codes and ASCII Codes
ANSWERS TO SELECTED QUESTIONS
INDEX
Preface
• A program written in assembly language requires considerably less memory and
execution time than a program written in what are known as high-level languages, such
as Pascal and C.
• Assembly language gives a programmer the ability to perform highly technical tasks
that would be difficult, if not impossible, in a high-level language.
OPERATING SYSTEMS
The major purposes of an operating system are (1) to allow users to instruct a computer re-
garding actions it is to take (such as executing a particular program) and (2) to provide
means of storing (“cataloging”) information on disk and of accessing it.
The most common operating system for the PC and its compatibles is MS-DOS from
Microsoft, known as PC-DOS on the IBM PC. Each version of DOS has provided additional
features that have extended the capability of the PC. A discussion of such advanced operating
systems as OS/2 and UNIX is outside the scope of this book.
The programs also omit macro instructions (explained in Chapter 22); although pro-
fessional programmers use macros extensively, their appearance in a book of this nature
would interfere with learning the principles of the language. Once these principles are
learned, a programmer can adopt the clever techniques of the professional.
Learning assembly language and getting your programs to work is an exciting and
challenging experience. For the time and effort invested, the rewards are sure to be great.
• The inclusion of, and more emphasis on, additional functions in more recent versions
of DOS
• Programming for mouse operations
ACKNOWLEDGMENTS
The author is grateful for the assistance and cooperation of all those who contributed sug-
gestions for, reviews of, and corrections to earlier editions. For this third edition, a special
thanks to Brian R. Anderson of the British Columbia Institute of Technology for inputs on
mouse and C programming.
PART A — Fundamentals of PC Hardware
and Software
CHAPTER 1
Introduction to PC Hardware
OBJECTIVE
To explain the basic features of microcomputer hardware
and program organization.
INTRODUCTION
Writing a program in assembly language requires knowledge of the computer’s hard-
ware (or architecture), its instruction set, and the rules for using that instruction set. An ex-
planation of the basic hardware—bits, bytes, registers, memory, the processor, and the data
bus—is provided in this chapter. The instruction set and its use are developed throughout
the book.
The fundamental building blocks of a computer are the bit and the byte. These sup-
ply the means by which a computer can represent data and instructions in memory.
The main internal hardware features of a computer are a microprocessor, memory,
and registers; external hardware features are the computer’s input/output devices such as
the keyboard, monitor, and disk. Software consists of the various programs and data files
(including the operating system), stored on the disk. To execute (or run) a program, the sys-
tem copies it from disk into internal memory. (Internal memory is what people mean when
they claim that their computer has, for example, 8 megabytes of memory.) The micro-
processor executes the program instructions, and the registers handle arithmetic, data move-
ment, and addressing.
An assembly language program consists of one or more segments for defining data
and for storing machine instructions and a segment named the stack that contains stored
addresses.
Bytes
A group of nine bits is called a byte, which represents storage locations both in internal
memory and on external disk. In memory, each byte has a unique address, beginning with
zero for the first byte. Each byte consists of eight bits for data and one bit for parity:
The eight data bits provide the basis for binary arithmetic and for representing such
characters as the letter A and the asterisk symbol (*). Eight bits allow 256 different
combinations of on-off conditions, from all bits off (00000000) through all bits on (11111111). For
example, a representation of the bits for the letter A is 01000001 and for the asterisk is
00101010, although you don’t have to memorize such facts.
Parity requires that in each byte, the number of bits that are on is always odd. Since
the letter A contains two bits that are on, the processor automatically sets its parity bit
on also (01000001-1), to force odd parity. Similarly, since the asterisk contains three bits
that are on, the processor sets its parity bit off (00101010-0), to maintain odd parity.
When an instruction references a byte in internal storage, the processor checks its parity.
If its parity is even, the system assumes that a bit is “lost” and displays an error message.
A parity error may be a result of a hardware fault or an electrical disturbance; either way,
it is a rare event.
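The odd-parity rule just described can be sketched in a few lines of Python. This is only an illustration of the counting rule, not of how the hardware implements it; the function names odd_parity_bit and parity_ok are hypothetical and do not come from the text.

```python
def odd_parity_bit(data_byte: int) -> int:
    """Return the parity bit (0 or 1) that makes the total number of 1-bits odd."""
    if not 0 <= data_byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    ones = bin(data_byte).count("1")
    # An odd count of data bits needs parity off; an even count needs parity on.
    return 0 if ones % 2 == 1 else 1


def parity_ok(data_byte: int, parity_bit: int) -> bool:
    """Check that the eight data bits plus the parity bit hold an odd number of 1-bits."""
    return (bin(data_byte).count("1") + parity_bit) % 2 == 1


if __name__ == "__main__":
    # The letter A and the asterisk, as given in the text.
    for name, value in (("A", 0b01000001), ("*", 0b00101010)):
        print(name, format(value, "08b"), "parity bit:", odd_parity_bit(value))
```

For the letter A (two bits on) the sketch yields a parity bit of 1, and for the asterisk (three bits on) a parity bit of 0, matching the two cases in the text.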
You may have wondered how a computer “knows” that bit value 01000001 repre-
sents the letter A. When you key in A on the keyboard, the system delivers a signal from
that particular key into memory and sets a byte (in an input location) to the bit value
01000001. You can move the contents of this byte about in memory as you will, and you
can even print it or display it on the screen as the letter A.
For reference purposes, the bits in a byte are numbered 0 to 7 from right to left, as
shown here for the letter A (we no longer need be concerned with the parity bit):
Bit number:         7 6 5 4 3 2 1 0
Bit contents for A: 0 1 0 0 0 0 0 1
Related Bytes
A program can treat a group of bytes as a unit of data, such as time or distance. A group of
one or more bytes that defines a particular value is commonly known as a field. A computer
also supports certain sizes that are natural to it:
* Word. A 2-byte (16-bit) field. Bits in a word are numbered 0 through 15 from right
to left, as shown here for the letters ‘PC’:
BINARY NUMBERS
Because a computer can distinguish only between 0 and 1 bits, it works in a base-2 num-
bering system known as binary. In fact, the word “bit” is a contraction of “Binary digIT.”
A collection of bits can represent any numeric value. The value of a binary number
is based on the relative positions of the bits and whether each is a zero or a one. Just as in
decimal numbers, the positions represent ascending powers (but of 2, not 10) from right to
left. In the following eight-bit number, all bits are set to one (on):
Position:         7    6    5    4    3    2    1    0
Bit value:        1    1    1    1    1    1    1    1
Position value: 128   64   32   16    8    4    2    1
The rightmost bit assumes the value 1 (2⁰), the next digit to the left assumes the value 2 (2¹),
the next the value 4 (2²), and so forth. The value of the binary number in this case is 1 + 2
+ 4 + ... + 128 = 255 (or 2⁸ - 1).
In a similar manner, the value of the binary number 01000001 is calculated to be 1
plus 64, or 65:
But isn’t 01000001 the letter A? Indeed, it is. The bits 01000001 can represent either the
number 65 or the letter A, as follows:
• If a program defines the data for arithmetic purposes, then 01000001 represents a
binary number equivalent to the decimal number 65.
• If a program defines the data for descriptive purposes, such as a heading, then
01000001 represents an alphabetic character.
When you start programming, you will see this distinction more clearly, because you de-
fine and use each data item for a specific purpose; in practice, the two uses are rarely a
source of confusion.
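As a rough illustration of the two readings of the same bits, the following Python sketch sums the positional values of a bit string and then interprets the result as a character. The helper name positional_value is hypothetical; chr is Python's built-in conversion from a code number to a character.

```python
def positional_value(bits: str) -> int:
    """Sum the powers of 2 for every position that holds a 1-bit (rightmost is position 0)."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit not in ("0", "1"):
            raise ValueError("a bit string may contain only 0 and 1")
        if bit == "1":
            total += 2 ** position
    return total


if __name__ == "__main__":
    value = positional_value("01000001")
    print(value)       # the bits read as a binary number
    print(chr(value))  # the same bits read as a character code
```

The bits themselves do not change; only the use the program makes of them differs.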
A binary number is not limited to 8 bits. A processor that uses 16-bit (or 32-bit)
architecture handles 16-bit (or 32-bit) numbers automatically. For 16 bits, 2¹⁶ - 1
provides values up to 65,535, and for 32 bits, 2³² - 1 provides values up to
4,294,967,295.
Binary Arithmetic
A microcomputer performs arithmetic only in binary format. Consequently, an assembly
language programmer has to be familiar with binary format and binary addition. The fol-
lowing examples illustrate binary addition:
 0     0     1     1
+0    +1    +1    +1
 0     1    10    +1
                  11
Note the carry of a 1-bit in the last two examples. Now, let’s add 01000001 and 00101010.
Are we adding the letter A and an asterisk? No, they are the decimal values 65 and 42:
Decimal Binary
65 01000001
+42 +00101010
107 01101011
Check that the binary sum 01101011 is actually 107. As another example, add the decimal
values 60 and 53:
Decimal Binary
60 00111100
+53 +00110101
113 01110001
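The column-by-column addition with carries can be sketched as follows. This is a minimal illustration under the assumption that both operands are bit strings of equal length; the name add_binary is hypothetical.

```python
def add_binary(a: str, b: str) -> tuple[str, int]:
    """Add two equal-length bit strings right to left; return (sum bits, carry out)."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of bits")
    carry = 0
    columns = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column = int(bit_a) + int(bit_b) + carry
        columns.append(str(column % 2))  # the bit written in this column
        carry = column // 2              # the 1-bit carried to the next column
    return "".join(reversed(columns)), carry


if __name__ == "__main__":
    print(add_binary("01000001", "00101010"))  # 65 + 42
    print(add_binary("00111100", "00110101"))  # 60 + 53
```

Returning the carry out separately keeps the result at the same width as the operands, which is how a fixed-size register behaves.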
Negative Numbers
The preceding binary numbers are all positive values because in each the leftmost bit is a
zero. A negative binary number contains a 1-bit in its leftmost position. However, it’s not
as simple as changing the leftmost bit to 1, such as 01000001 (+65) to 11000001. A nega-
tive value is expressed in two’s complement notation; that is, to represent a binary number
as negative, the rule is: Reverse the bits and add 1. Let’s find the two’s complement of
01000001 (or 65) as an example:
A binary number is negative if its leftmost bit is 1, but if you add the 1-bit values to con-
vert the number 10111111 to decimal, you won’t get 65. To determine the absolute value of a
negative binary number, simply repeat the previous operation; that is, reverse the bits and add 1:
Number —65: 10111111
Reverse bits: 01000000
Add 1: 1
Number +65: 01000001
The sum of +65 and —65 should be zero. Let’s try it:
+65      01000001
-65     +10111111
 00   (1)00000000
In the sum, the 8-bit value is all zeros, and the carry of the 1-bit on the left is lost. But be-
cause there is a carry into the sign bit and a carry out, the result is correct.
Binary subtraction is a simple matter: Convert the number being subtracted to two’s
complement format, and add the numbers. Let’s subtract 42 from 65. The binary represen-
tation for 42 is 00101010, and its two’s complement is 11010110:
  65        01000001
+(-42)     +11010110
  23     (1)00010111
The result, 23, is correct. Once again, there is a valid carry into the sign bit and a carry out.
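The rule "reverse the bits and add 1", and subtraction by adding the two's complement, can be sketched in Python as follows. The sketch assumes 8-bit values and discards the carry out with a mask; the names MASK, twos_complement, and subtract are hypothetical.

```python
MASK = 0xFF  # keep only the low-order eight bits


def twos_complement(value: int) -> int:
    """Reverse the bits of an 8-bit value and add 1."""
    return ((value ^ MASK) + 1) & MASK


def subtract(a: int, b: int) -> int:
    """Subtract b from a by adding the two's complement of b; the carry out is lost."""
    return (a + twos_complement(b)) & MASK


if __name__ == "__main__":
    print(format(twos_complement(0b01000001), "08b"))  # -65
    print(format(twos_complement(0b10111111), "08b"))  # back to +65
    print(subtract(65, 42))
```

Applying twos_complement twice returns the original value, which is why the same operation serves both to negate a number and to recover the absolute value of a negative one.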
If the justification for two’s complement notation isn’t immediately clear, consider
the following question: What value would you have to add to binary 00000001 to make it
equal to 00000000? In terms of decimal numbers, the answer would be — 1. The two’s com-
plement of 1 is 11111111. So we add +1 and —1 as follows:
   1         00000001
+(-1)       +11111111
Result:   (1)00000000
Ignoring the carry of 1, you can see that the binary number 11111111 is equivalent to dec-
imal —1. You can also see a pattern form as the binary numbers decrease in value:
+3    00000011
+2    00000010
+1    00000001
 0    00000000
-1    11111111
-2    11111110
-3    11111101
In fact, the 0-bits in a negative binary number indicate its (absolute) value: Treat the
positional value of each 0-bit as if it were a 1-bit, sum the values, and add 1.
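The shortcut of summing the positions of the 0-bits can be checked with a short Python sketch. It assumes an 8-bit string whose leftmost bit is 1; the name magnitude_from_zero_bits is hypothetical.

```python
def magnitude_from_zero_bits(bits: str) -> int:
    """Absolute value of a negative 8-bit number: sum the 0-bit positions, then add 1."""
    if len(bits) != 8 or bits[0] != "1":
        raise ValueError("expected an 8-bit string with the leftmost bit set")
    total = sum(2 ** position
                for position, bit in enumerate(reversed(bits))
                if bit == "0")
    return total + 1


if __name__ == "__main__":
    print(magnitude_from_zero_bits("10111111"))  # the representation of -65
    print(magnitude_from_zero_bits("11111101"))  # the representation of -3
```

In 10111111 the only 0-bit sits at position 6 (value 64), so the absolute value is 64 + 1 = 65.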
You'll find this material on binary arithmetic and negative numbers particularly rel-
evant when you get to Chapters 12 and 13 on arithmetic.
HEXADECIMAL REPRESENTATION
Imagine that you want to view the contents of a binary value in four adjacent bytes (a
doubleword) in memory. Although a byte may contain any of the 256 bit combinations, there
is no way to display or print many of them as standard ASCII characters. (Examples of such
characters include the bit configurations for Tab, Enter, Form Feed, and Escape.) Conse-
quently, computer designers developed a shorthand method of representing binary data.
The method divides each byte in half and expresses the value of each half-byte. As an ex-
ample, consider the following four bytes:
Since the numbers 11, 12, and 14 require two digits, let’s extend the numbering sys-
tem so that 10 = A, 11 = B, 12 = C, 13 = D, 14 = E, and 15 = F. Here’s the revised short-
hand number that represents the contents of the bytes just given:
59 35 B9 CE
The numbering system thus involves the “digits” 0 through F and, since there are 16 such
digits, the system is known as hexadecimal (or hex) representation. Figure 1-1 shows the
decimal numbers 0 through 15 along with their equivalent binary and hexadecimal values.
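Binary  Decimal  Hexadecimal      Binary  Decimal  Hexadecimal
0000       0         0            1000       8         8
0001       1         1            1001       9         9
0010       2         2            1010      10         A
0011       3         3            1011      11         B
0100       4         4            1100      12         C
0101       5         5            1101      13         D
0110       6         6            1110      14         E
0111       7         7            1111      15         F
Figure 1-1 Binary, Decimal, and Hexadecimal Representation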
  6      5      F      F     10     FF
 +4     +8     +1     +F    +30     +1
  A      D     10     1E     40    100
Note also that hex 40 equals decimal 64, hex 100 is decimal 256, and hex 1,000 is decimal
4,096.
To indicate a hex number in a program, code an “H” immediately after the number;
thus 25H = decimal 37. By convention, a hex number always begins with a decimal digit
0-9, so you should code B8H as 0B8H. In this book, we indicate a hexadecimal value with
the word “hex” or an “H” following the number (such as hex 4C or 4CH); a binary value
with the word binary or a “B” following the number (such as binary 01001100 or
01001100B); and a decimal value simply by a number (such as 76). An occasional excep-
tion occurs where the base is obvious from the context.
Appendix A gives an explanation of how to convert hex numbers to decimal and
vice versa.
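As an illustration of the half-byte shorthand, the following Python sketch converts a byte to two hex digits and a string of hex digits back to decimal. It is a sketch only; the names HEX_DIGITS, byte_to_hex, and hex_to_decimal are hypothetical, and the trailing "H" used by the assembler is assumed to have been removed already.

```python
HEX_DIGITS = "0123456789ABCDEF"


def byte_to_hex(byte: int) -> str:
    """Split a byte into two half-bytes and express each as one hex digit."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    high, low = byte >> 4, byte & 0x0F
    return HEX_DIGITS[high] + HEX_DIGITS[low]


def hex_to_decimal(text: str) -> int:
    """Accumulate hex digits left to right, multiplying by 16 at each step."""
    value = 0
    for digit in text.upper():
        value = value * 16 + HEX_DIGITS.index(digit)
    return value


if __name__ == "__main__":
    print(byte_to_hex(0b10111001), byte_to_hex(0b11001110))
    print(hex_to_decimal("25"), hex_to_decimal("100"))
```

Python's built-in int(text, 16) performs the same conversion; the loop is written out only to show the positional arithmetic.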
ASCII CODE
To standardize the representation of characters, microcomputer manufacturers have
adopted the ASCII (American Standard Code for Information Interchange) code.
A standard code facilitates the transfer of data between different computer devices. The 8-
bit extended ASCII code that the PC uses provides 256 characters, including symbols for
foreign alphabets. For example, the combination of bits 01000001 (hex 41) indicates the
letter A. Appendix B provides a list of the 256 ASCII characters, and Chapter 8 shows how
to display most of them on the screen.
THE PROCESSOR
An important hardware element of the PC is the system unit, which contains a system board,
power supply, and expansion slots for optional boards. Features of the system board are an
Intel (or equivalent) microprocessor, read-only memory (ROM), and random access mem-
ory (RAM).
The brain of the PC and compatibles is a microprocessor based on the Intel 8086 fam-
ily that performs all processing of instructions and data. Processors vary in their speed and
capacity of memory, registers, and data bus. A data bus transfers data between the proces-
sor, memory, and external devices, in effect, managing data traffic. Following is a brief de-
scription of various Intel processors:
8088/80188. These processors have 16-bit registers and an 8-bit data bus and can
address up to 1 million bytes of internal memory. The registers can process two bytes at a
time, whereas the data bus can transfer only one byte at a time. The 80188 is a souped-up
8088 with a few additional instructions. Both types of processor run in what is known as
real mode, that is, one program at a time.
8086/80186. These processors are similar to the 8088/80188, but have a 16-bit data
bus and can run faster. The 80186 is a souped-up 8086 with a few additional instructions.
80286. This processor can run faster than the preceding processors and can ad-
dress up to 16 million bytes. It can run in real mode or in protected mode for multitasking.
80386. This processor has 32-bit registers and a 32-bit data bus and can address up
to 4 billion bytes of memory. It can run in real mode or in protected mode for multitasking.
80486. This processor also has 32-bit registers and a 32-bit data bus (although
some clones have a 16-bit data bus) and is designed for enhanced performance. It can run
in real mode or in protected mode for multitasking.
Pentium (or P5). This processor has 32-bit registers and a 64-bit data bus and
can execute more than one instruction per clock cycle. (Intel adopted the name “Pentium”
because, in contrast to numbers, names can be trademarked.)
The processor is partitioned into two logical units: an execution unit (EU) and a bus inter-
face unit (BIU), as illustrated in Figure 1-2. The role of the EU is to execute instructions,
whereas the BIU delivers instructions and data to the EU. The EU contains an arithmetic
and logic unit (ALU), a control unit (CU), and a number of registers. These features pro-
vide for execution of instructions and arithmetic and logical operations.
The most important function of the BIU is to manage the bus control unit, segment
registers, and instruction queue. The BIU controls the buses that transfer data to the EU, to
memory, and to external input/output devices, whereas the segment registers control
memory addressing.
Figure 1-2 Execution Unit and Bus Interface Unit (labels recovered from the figure:
Program Control; ALU: Arithmetic and Logic Unit; Instruction Queue; Flags Register)
Another function of the BIU is to provide access to instructions. Since the instruc-
tions for a program that is executing are in memory, the BIU must access instructions from
memory and place them in an instruction queue. Because this queue is from 4 to 32 bytes
in size, depending on the processor, the BIU is able to look ahead and prefetch instructions
so that there is always a queue of instructions ready to execute.
The EU and BIU work in parallel, with the BIU keeping one step ahead. The EU no-
tifies the BIU when it needs access to data in memory or an I/O device. Also, the EU requests
machine instructions from the BIU instruction queue. The top instruction is the currently ex-
ecutable one, and while the EU is occupied executing an instruction, the BIU fetches another
instruction from memory. This fetching overlaps with execution and speeds up processing.
Processors up through the 80486 have what is known as a single pipeline, which
restricts them to completing one instruction before starting the next. The Pentium and later
processors have a dual pipeline structure that enables them to run many operations in parallel.
INTERNAL MEMORY
A microcomputer contains two types of internal memory: random access memory (RAM)
and read-only memory (ROM). Bytes in memory are numbered consecutively, beginning
with 00, so that each location has a uniquely numbered address.
Figure 1-3 shows a physical memory map of an 8086-type PC. Of the first megabyte
of memory, the first 640K is RAM, most of which is available for your own use.
ROM. ROM is a special memory chip that (as the full name suggests) can only be
read. Since instructions and data are permanently “burned into” a ROM chip, they cannot
be altered. The ROM Basic Input/Output System (BIOS) begins at address 768K and han-
dles input/output devices, such as a hard disk controller. ROM beginning at 960K controls
the computer’s basic functions, such as the power-on self-test, dot patterns for graphics, and
the disk self-loader. When you switch on the power, ROM performs various check-outs and
loads special system data from disk into RAM.
Figure 1-3 Physical memory map (columns: Start, Address, Purpose; recoverable rows:
640K at address A0000; zero at address 00000, conventional memory)
A segment in real mode can be up to 64K bytes. There may be any number of seg-
ments; to address a particular segment, it is necessary only to change the address in the ap-
propriate segment register. The three main segments are the code, data, and stack segments.
Code Segment
The code segment contains the machine instructions that are to execute. Typically, the first
executable instruction is at the start of this segment, and the operating system links to that
location to begin program execution. As the name implies, the code segment (CS) register
addresses the code segment. If your code area requires more than 64K, your program may
need to define more than one code segment.
Data Segment
The data segment contains a program’s defined data, constants, and work areas. The data
segment (DS) register addresses the data segment. If your data area requires more than 64K,
your program may need to define more than one data segment.
Stack Segment
In simple terms, the stack contains any data and addresses that you need to save temporar-
ily or for use by your own “called” subroutines. The stack segment (SS) register addresses
the stack segment.
Segment Boundaries
The segment registers contain the starting address of each segment. Figure 1-4 presents a
graphic view of the CS, DS, and SS registers; the registers and segments are not necessar-
ily in the order shown. Other segment registers are the ES (extra segment) and, on the 80386
and later processors, the FS and GS registers, which have specialized uses.
As discussed earlier, a segment begins on a paragraph boundary, which is an address
evenly divisible by decimal 16, or hex 10. Assume that a data segment begins at memory
location 045F0H. Since in this and all other cases the rightmost hex digit is zero, the
computer designers decided that it would be unnecessary to store the zero digit in the segment
register. Thus 045F0H is stored as 045F, with the rightmost zero understood. Where
appropriate, the text refers to the rightmost zero through the use of square brackets, such as
in 045F[0].
Figure 1-4 (labels recovered from the figure: the SS, DS, and CS segment registers each
hold the address of a segment that is relocatable in memory)
Segment Offsets
Within a program, all memory locations are relative to a segment’s starting address. The
distance in bytes from the segment address is expressed as an offset (or displacement). A
two-byte (16-bit) offset can range from 0000H through FFFFH, or zero through 65,535.
Thus the first byte of the code segment is at offset 00, the second byte is at offset 01, and
so forth, through to offset 65,535. To reference any memory address in a segment, the
processor combines the segment address in a segment register with an offset value.
In the following example, the DS register contains the segment address of the data
segment at hex 045F[0], and an instruction references a location with an offset of 0032H
bytes within the data segment.
DS segment address 045F0H        offset 0032H
The actual memory location of the byte referenced by the instruction is therefore
04622H:
DS segment address:     045F0H
Offset:               + 0032H
Actual address:         04622H
Note that a program contains one or more segments, which may begin almost any-
where in memory, may vary in size, and may be in any sequence.
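The combination of a segment address and an offset can be sketched in Python as follows. The sketch assumes real-mode addressing as described above, in which the value in the segment register is shifted left four bits to supply the understood rightmost hex zero; the name physical_address is hypothetical, and wraparound at the top of memory is not modeled.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment register value with a 16-bit offset (real mode)."""
    if not 0 <= segment <= 0xFFFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting left four bits appends the hex zero that the register does not store.
    return (segment << 4) + offset


if __name__ == "__main__":
    address = physical_address(0x045F, 0x0032)
    print(format(address, "05X") + "H")
```

Because the shift multiplies the register value by 16, every segment necessarily starts on a paragraph boundary.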
Addressing Capacity
The PC series has used a number of Intel processors that provide different addressing
capabilities.
80286 Addressing. In real mode, the 80286 processor handles addressing the
same as an 8086 does. In protected mode, the processor uses 24 bits for addressing, so that
FFFFF[0] allows addressing up to 16 million bytes. The segment registers act as selectors
for accessing a 24-bit segment address from memory and add this value to a 16-bit
offset address.
REGISTERS
The processor’s registers are used to control instructions being executed, to handle ad-
dressing of memory, and to provide arithmetic capability. The registers are addressable by
name. Bits are conventionally numbered from right to left, as in
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Segment Registers
A segment register is 16 bits long and provides for addressing an area of memory known
as the current segment. As discussed earlier, a segment aligns on a paragraph boundary, and
its address in a segment register assumes four 0-bits to its right.
CS register. DOS stores the starting address of a program’s code segment in the
CS register. This segment address, plus an offset value in the instruction pointer (IP) regis-
ter, indicates the address of an instruction to be fetched for execution. For normal pro-
gramming purposes, you need not reference the CS register.
ES register. Some string (character data) operations use the extra segment regis-
ter to handle memory addressing. In this context, the ES register is associated with the DI
(index) register. A program that requires the use of the ES register may initialize it with an
appropriate segment address.
FS and GS Registers. These are additional extra segment registers on the 80386
and later processors.
Pointer Registers
The SP (stack pointer) and BP (base pointer) registers are associated with the SS register
and permit the system to access data in the stack segment.
SP register. The 16-bit stack pointer is associated with the SS register and pro-
vides an offset value that refers to the current word being processed in the stack. The 80386
and later processors have an extended 32-bit stack pointer, the ESP register. The system au-
tomatically handles these registers.
In the following example, the SS register contains segment address 27B3[0]H, and
the SP contains offset 312H. To find the current word being processed in the stack, the com-
puter combines the addresses in the SS and SP:
Segment address in SS register: 27B30H
Offset in SP register: + 312H
Address in stack: 27E42H
27B3[0]H 312H
SS segment address SP offset
BP register. The 16-bit BP facilitates referencing parameters, which are data and
addresses passed via the stack. The 80386 and later processors have an extended 32-bit BP
called the EBP register.
General-Purpose Registers
The AX, BX, CX, and DX general-purpose registers are the workhorses of the system.
They are unique in that you can address them as one word or as a one-byte portion. The
leftmost byte is the “high” portion and the rightmost byte is the “low” portion. For exam-
ple, the CX register consists of a CH (high) and a CL (low) portion, and you can reference
any portion by its name. The following instructions move zeros to the CX, CH, and CL
registers, respectively:
MOV CX,00 ;Clear the full 16-bit CX register
MOV CH,00 ;Clear only the high byte of CX
MOV CL,00 ;Clear only the low byte of CX
The 80386 and later processors support all the general-purpose registers, plus 32-bit
extended versions of them: the EAX, EBX, ECX, and EDX.
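The relationship between a 16-bit register and its high and low portions can be modeled with a small Python class. This is only a sketch of the idea, not of any real interface; the class name Register16 and its attributes value, high, and low are hypothetical.

```python
class Register16:
    """A 16-bit register whose high and low bytes can be read or set by name."""

    def __init__(self, value: int = 0) -> None:
        self.value = value & 0xFFFF

    @property
    def high(self) -> int:
        """Leftmost byte (for CX, this is CH)."""
        return (self.value >> 8) & 0xFF

    @high.setter
    def high(self, byte: int) -> None:
        self.value = ((byte & 0xFF) << 8) | (self.value & 0x00FF)

    @property
    def low(self) -> int:
        """Rightmost byte (for CX, this is CL)."""
        return self.value & 0xFF

    @low.setter
    def low(self, byte: int) -> None:
        self.value = (self.value & 0xFF00) | (byte & 0xFF)


if __name__ == "__main__":
    cx = Register16(0x1234)
    cx.high = 0x00  # like MOV CH,00: only the high byte changes
    print(format(cx.value, "04X"))
```

Setting one portion leaves the other untouched, which is the point of the three MOV instructions shown earlier: each affects a different part of the same register.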
AX register. The AX register, the primary accumulator, is used for operations in-
volving input/output and most arithmetic. For example, multiply, divide, and translate in-
structions assume the use of the AX. Also, some instructions generate more efficient code
if they reference the AX rather than another register.
BX register. The BX is known as the base register since it is the only general-pur-
pose register that can be used as an index to extend addressing. Another common purpose
of the BX is for computations.
CX register. The CX is known as the count register. It may contain a value to con-
trol the number of times a loop is repeated or a value to shift bits left or right. The CX is
also used for many computations.
You may use any of the general-purpose registers for addition and subtraction of
8-bit, 16-bit, or 32-bit values.
Index Registers
The SI and DI registers are available for indexed addressing and for use in addition and
subtraction.
SI register. The 16-bit source index register is required for some string (charac-
ter) operations. In this context, the SI is associated with the DS register. The 80386 and later
processors support a 32-bit extended register, the ESI.
DI register. The 16-bit destination index register is also required for some string
operations. In this context, the DI is associated with the ES register. The 80386 and later
processors support a 32-bit extended register, the EDI.
Flags Register
Of the 16 bits of the flags register, 9 are common to all 8086-family processors to indicate
the current status of the machine and the results of processing. Many instructions involv-
ing comparisons and arithmetic change the status of the flags, which some instructions may
test to determine subsequent action.
Briefly, the common flag bits are as follows:
AF (auxiliary carry). Contains a carry out of bit 3 on eight-bit data, for special-
ized arithmetic.
The flags most relevant to assembly programming are O, S, Z, and C for comparisons
and arithmetic operations, and D for the direction of string operations. The 80286 and later
processors have some flags used for internal purposes, concerned primarily with protected
mode. The 80386 and later processors have a 32-bit extended flags register known as
Eflags. Chapter 8 contains more details about the flags register.
KEY POINTS
The computer distinguishes only between bits that are 0 (off) and 1 (on) and performs
arithmetic only in binary format.
The value of a binary number is determined by the placement of its bits. Thus binary
1101 equals 2° +22 ++ 0! + 2°, or 13.
A negative binary number is represented in two’s complement notation: Reverse the
bits of its positive representation and add 1.
A single character of memory is a byte, comprised of eight data bits and one parity
bit. Two adjacent bytes comprise a word, and four adjacent bytes comprise a dou-
bleword.
The value K equals $2^{10}$, or 1,024 bytes.
Hexadecimal format is a shorthand notation for representing groups of four bits. Hex
digits 0-9 and A-F represent the binary values 0000 through 1111.
The representation of character data is done in ASCII format.
The heart of the PC is a microprocessor. The processor stores numeric data in words
in memory in reverse-byte sequence.
The two types of internal memory are ROM and RAM.
An assembly language program consists of one or more segments: a stack segment
for maintaining return addresses, a data segment for defined data and work areas, and
a code segment for executable instructions. Locations in a segment are expressed as
an offset relative to the segment’s starting address.
The CS, DS, and SS registers provide for addressing the code, data, and stack seg-
ments, respectively.
The IP register contains the offset address of the next instruction that is to execute.
The SP and BP pointer registers are associated with the SS register and permit the
system to access data in the stack segment.
The AX, BX, CX, and DX general-purpose registers are the system’s workhorses.
The leftmost byte is the “high” portion, and the rightmost byte is the “low” portion.
The AX (primary accumulator) is used for input/output and most arithmetic. The BX
(base register) can be used as an index to extend addressing. The CX is known as the
count register, and the DX is known as the data register.
The SI and DI index registers are available for extended addressing and for use in ad-
dition and subtraction. These registers are also required for some string (character)
operations.
The flags register indicates the current status of the computer and the results of
executing instructions.
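Three of the key points above (place values, two's complement, and reverse-byte storage of words) can be checked with a few lines of Python. This is only an illustrative sketch; the function names are hypothetical and are not part of any assembler or DOS facility.

```python
# Illustrative helpers; the names are hypothetical, not from the text.

def binary_value(bits: str) -> int:
    """Sum the place values of the 1-bits, e.g. '1101' -> 8 + 4 + 0 + 1."""
    return sum(2 ** i for i, b in enumerate(reversed(bits)) if b == "1")

def twos_complement(bits: str) -> str:
    """Reverse every bit, then add 1, keeping the original width."""
    width = len(bits)
    flipped = "".join("1" if b == "0" else "0" for b in bits)
    return format((int(flipped, 2) + 1) % (1 << width), f"0{width}b")

def store_word(value: int) -> bytes:
    """Return a 16-bit word as it sits in memory: low-order byte first."""
    return value.to_bytes(2, "little")

print(binary_value("1101"))          # 13
print(twos_complement("01000001"))   # 10111111
print(store_word(0x0123).hex())      # 2301
```

The same helpers may be handy for checking your own answers to the questions that follow.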
QUESTIONS
1-1. Provide the binary bit configuration for the following numbers: (a) 6; (b) 14; (c) 22; (d) 28;
(e) 30.
1-2. Add the following binary numbers:
(a) 00010101    (b) 00111101    (c) 00011101    (d) 01010111
    00001101        00101010        00000011        00111101
1-3. Determine the two’s complement of the following binary numbers: (a) 00010110;
(b) 00111101; (c) 00111100.
1-4. Determine the positive (absolute) value of the following negative binary numbers:
(a) 11001000; (b) 10111101; (c) 11111110; (d) 11111111.
1-5. Determine the hex representation of the following values: (a) ASCII letter Q; (b) ASCII num-
ber 7; (c) binary 01011101; (d) binary 01110111.
1-6. Add the following hex numbers:
(a) 23A6    (b) 51FD    (c) FID     (d) EABE    (e) FBAC
   +0022       +0003       +0887       +26C4       +0CBE
1-7. Determine the hex representation of the following decimal numbers. Refer to Appendix A for
the conversion method. You could also check your result by converting the hex to binary and
adding the 1-bits. (a) 19; (b) 33; (c) 89; (d) 255; (e) 4095; (f) 63,398.
1-8. Provide the ASCII bit configuration for the following one-byte characters. Use Appendix B as
a guide: (a) P; (b) p; (c) #; (d) 5.
1-9. What is the purpose of the processor?
1-10. What are the two main kinds of memory on the PC, and what are their main purposes?
1-11. Show how the system stores hex 012345 as a value in memory.
1-12. Explain the following: (a) segment; (b) offset; (c) address boundary.
1-13. What are (a) the three kinds of segments, (b) their maximum size, and (c) the address bound-
ary on which they begin?
1-14. Explain the purpose of each of the three segment registers.
1-15. Explain which registers are used for the following purposes: (a) addition and subtraction;
(b) counting for looping; (c) multiplication and division; (d) addressing segments; (e) indica-
tion of a zero result; (f) offset address of an instruction that is to execute.
1-16. Show the EAX register and the size and position of the AH, AL, and AX within it.
1-17. Code the assembly language instructions to move the value 25 to the following registers:
(a) CH; (b) CL; (c) CX; (d) ECX.
CHAPTER 2
PC Software Requirements
OBJECTIVE
To explain the general software environment for the PC.
INTRODUCTION
In this chapter, we describe the PC software environment: the functions of DOS and its
main components. We examine the boot process (how the system loads itself when you
power up the computer), and consider how the system loads a program for execution, how
the system uses the stack, and how an instruction in the code segment addresses data in the
data segment.
The chapter completes the basic explanations of the PC’s hardware and software and
enables us to proceed to Chapter 3, where we take up keying programs into memory and
executing them step by step.
Among the DOS functions that concern us in this book are the following:
File management. DOS maintains the directories and files on the system’s disks. Pro-
grams create and update files, but DOS bears the responsibility of managing their lo-
cation on disk.
Input/output. Programs request input data from DOS or deliver such data to DOS by
means of interrupts. DOS relieves the programmer of coding at the I/O level.
Program loading. A user or program requests execution of a program; DOS handles
the steps involved in accessing the program from disk, placing it in memory, and ini-
tializing it for execution.
Memory management. When DOS loads a program for execution, it allocates a large
enough space in memory for the program code and its data. Programs can process
data within their memory area, can release unwanted memory, and can request addi-
tional memory.
Interrupt handling. DOS allows users to install resident programs that attach them-
selves to the interrupt system to perform special functions.
Organization of DOS
The three major components of DOS are IO.SYS, MSDOS.SYS, and COMMAND.COM.
IO.SYS performs initialization functions at bootup time and also contains important
input/output functions and device drivers that supplement the primitive I/O support in ROM
BIOS. This component is stored on disk as a hidden system file and is known under PC-
DOS as IBMBIO.COM.
MSDOS.SYS acts as the DOS kernel and is concerned with file management, mem-
ory management, and input/output. This component is stored on disk as a hidden system
file and is known under PC-DOS as IBMDOS.COM.
COMMAND.COM is a command processor or shell that acts as the interface between
the user and the operating system. It displays the DOS prompt, monitors the keyboard, and
processes user commands such as deleting a file or loading a program for execution.
1. An interrupt service table that begins in low memory at location 0 and contains ad-
dresses for interrupts that occur.
2. A BIOS data area beginning at location 40[0], largely concerned with attached
devices.
[Figure 2-1: map of conventional memory up to 640K. The topmost area holds the COMMAND.COM transient portion, which executing programs may erase.]
BIOS next determines whether a disk containing the DOS system files is present and,
if so, it accesses the bootstrap loader from the disk. This program loads system files IO.SYS
and MSDOS.SYS from the disk into memory and transfers control to the entry point of
IO.SYS, which contains device drivers and other hardware-specific code. IO.SYS relocates
itself in memory and transfers control in its turn to MSDOS.SYS. This module initializes
internal DOS tables and the DOS portion of the interrupt table. It also reads the CON-
FIG.SYS file and executes its commands. Finally, MSDOS.SYS passes control to COM-
MAND.COM, which processes the AUTOEXEC.BAT file, displays its prompt, and
monitors the keyboard for input.
At this point, conventional memory up to 640K appears as shown in Figure 2-1. Under
memory management, part of DOS may be relocated into high memory.
DOS-BIOS INTERFACE
BIOS contains a set of routines in ROM to provide device support. BIOS tests and initial-
izes attached devices and provides services that are used for reading from and for writing to
the devices. One task of DOS is to interface with BIOS when there is a need to access its
facilities.
When a user program requests a service of DOS, it may transfer the request to BIOS,
which in its turn accesses the requested device. Sometimes, however, a program makes re-
quests directly to BIOS, especially for keyboard and screen services. And at other times—
although rarely and not recommended—a program can bypass both DOS and BIOS to
access a device directly. Figure 2-2 shows these alternative paths.
[Figure 2-2 DOS-BIOS Interface: diagram of the paths from User Programs to Hardware/Devices]
When you request DOS to load an .EXE program from disk into memory for execu-
tion, the loader performs the following operations:
In the foregoing way, the DOS loader correctly initializes the CS:IP and SS:SP reg-
isters. But note that the loader program stores the address of the PSP in both the DS and ES
registers, although your program normally needs the address of the data segment in these
registers. As a consequence, your programs have to initialize the DS with the address of the
data segment, as you'll see in Chapter 4.
We’ll now examine the stack and then the code and data segments.
THE STACK
Both .COM and .EXE programs require an area in the program reserved as a stack. The pur-
pose of the stack is to provide a space for the temporary storage of addresses and data items.
DOS automatically defines the stack for a .COM program, whereas you must explic-
itly define a stack for an .EXE program. Each data item in the stack is one word (two bytes).
The SS register, as initialized by DOS, contains the address of the beginning of the stack.
Initially, the SP contains the size of the stack, a value that points to the byte past the end of
the stack. The stack differs from other segments in its method of storing data: It begins at
the highest location and stores data downward through memory.
[Diagram: the SS register holds the segment address of the stack; the SP register points to the top of the stack.]
The PUSH instruction (among others) decrements the SP by 2 to the next lower stor-
age word in the stack and stores (or pushes) a value there. The POP instruction (among oth-
ers) returns a value from the stack and increments the SP by 2 to the next higher storage word.
The following example illustrates pushing the contents of the AX and BX registers
onto the stack and then subsequently popping them off. Assume that the AX contains
015AH, the BX contains 03D2H, and the SP contains 28H. (The address in the SS does not
concern us here.)
1. Initially, the SP contains 28H:

   SS: segment address of stack        SP = 28: top of stack

2. PUSH AX: Decrements the SP by 2 (to 26H) and stores the contents of the AX,
   015AH, in the stack. Note that the operation reverses the sequence of the stored bytes,
   so that 015A becomes 5A01:

   SS: segment address of stack        SP = 26: top of stack

3. PUSH BX: Decrements the SP by 2 (to 24H) and stores the contents of the BX,
   03D2H, in the stack:

   SS: segment address of stack        SP = 24: top of stack

4. POP BX: Restores the word from where the SP points in the stack to the BX register
   and increments the SP by 2 (to 26H). The BX now contains 03D2H, with the bytes
   correctly reversed:

   SS: segment address of stack        SP = 26: top of stack

5. POP AX: Restores the word from where the SP points in the stack to the AX register
   and increments the SP by 2 (to 28H). The AX now contains 015AH, with the bytes
   correctly reversed:

   SS: segment address of stack        SP = 28: top of stack

Note that POP instructions are coded in reverse sequence from PUSH instructions. Thus the
example pushed the AX and BX, but popped the BX and AX, in that order. Also, the val-
ues pushed onto the stack are still there, although the SP no longer points to them.
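The push and pop sequence just described can be mimicked with a toy model in Python. This is a sketch under stated assumptions, not how the processor is implemented: the class name Stack8086 is hypothetical, the stack is taken to be 28H bytes long so that the SP starts at 28H, and offsets are relative to the address in the SS.

```python
class Stack8086:
    """Toy model of a stack segment; offsets are relative to SS."""

    def __init__(self, size: int):
        self.mem = bytearray(size)
        self.sp = size           # SP starts one byte past the end of the stack

    def push(self, word: int) -> None:
        self.sp -= 2                                   # decrement SP first
        self.mem[self.sp:self.sp + 2] = word.to_bytes(2, "little")

    def pop(self) -> int:
        word = int.from_bytes(self.mem[self.sp:self.sp + 2], "little")
        self.sp += 2                                   # then increment SP
        return word

stack = Stack8086(0x28)
ax, bx = 0x015A, 0x03D2
stack.push(ax)            # SP = 26H, bytes 5A 01 at offsets 26H-27H
stack.push(bx)            # SP = 24H, bytes D2 03 at offsets 24H-25H
bx = stack.pop()          # SP = 26H
ax = stack.pop()          # SP = 28H
print(f"AX={ax:04X} BX={bx:04X} SP={stack.sp:02X}")   # AX=015A BX=03D2 SP=28
print(stack.mem[0x24:0x28].hex())   # d2035a01: the old values are still in memory
```

Popping in the same order as pushing (AX first, then BX) would exchange the two register values, which is why the POP sequence is reversed.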
You should always ensure that your program coordinates pushing values onto the
stack with popping them off of it. Although this is a fairly straightforward requirement, an
error can result in a program crash. Also, for an .EXE program, you have to define a stack
that is large enough to contain all values that could be pushed onto it.
Other related instructions that push values onto the stack and pop them off of it are:
• PUSHF and POPF: Save and restore the status of the flags.
• PUSHA and POPA (for the 80286 and later): Save and restore the contents of all the
general-purpose registers.
PROGRAM ADDRESSING
Normally, programmers write in symbolic code and use the assembler to translate it into
machine code. For program execution, DOS loads only machine code into memory. Every
instruction consists of at least an operation, such as move, add, or return. Depending on the
operation, an instruction may also have one or more operands that reference the data the
operation is to process.
As discussed in Chapter 1, the CS register provides the address of the beginning of a
program’s code segment, and the DS register provides the address of the beginning of the
data segment. The code segment contains instructions that are to be executed, whereas the
data segment contains data that the instructions reference. The IP register indicates the off-
set address of the current instruction in the code segment that is to be executed. An in-
struction operand indicates an offset address in the data segment that is to be referenced.
Consider an example in which DOS has determined that it is to load an .EXE program
into memory, beginning at location 04AF0H. DOS accordingly sets the CS register
with segment address 04AF[0]H and the DS with, say, segment address 04B1[0]H. The
program has already begun executing, and the IP currently contains the offset 0013H. The
CS:IP together determine the address of the next instruction to execute, as follows:
CS segment address:     4AF0H
IP offset:            + 0013H
Instruction address:    4B03H
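The addition above can be expressed as a one-line calculation. The following Python sketch is illustrative only; the function name is hypothetical, and the 20-bit mask assumes 8086-style real-mode addressing.

```python
def physical_address(segment: int, offset: int) -> int:
    """Shift the segment left four bits (append a hex 0) and add the offset."""
    # Assumption: 8086-style real mode, so the result is kept to 20 bits.
    return ((segment << 4) + offset) & 0xFFFFF

cs, ip = 0x04AF, 0x0013
print(f"{physical_address(cs, ip):05X}")   # 04B03
```

The same function applies to any segment register paired with an offset, such as the DS with the offset in an instruction operand.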
Let’s say that the instruction beginning at 04B03H copies the contents of a byte in memory
into the AL register; the byte is at offset 0012H in the data segment. Here are both the ma-
chine code and the symbolic code for this operation:
A01200    MOV AL,[0012]
Memory location 04B03H contains the first byte (A0) of the instruction the processor is to
access. The second and third bytes contain the offset value, in reversed-byte sequence (0012
becomes 1200). To access the data item, the processor determines its location from the
segment address in the DS register plus the offset (0012H) in the instruction operand. Since
the DS contains 04B1[0]H, the actual location of the referenced data item is
DS segment address:      4B10H
Segment offset:        + 0012H
Address of data item:    4B22H
[Figure 2-3 Segments and Offsets: the instruction A01200 at offset 0013 in the code segment references the byte at offset 0012 in the data segment]
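To make the operand handling concrete, here is a small Python sketch that takes the three machine-code bytes A0 12 00, un-reverses the offset, and combines it with the DS. The function name is hypothetical, and only this one three-byte form of the move is handled.

```python
def decode_mov_al_mem(code: bytes, ds: int) -> tuple:
    """Decode the three-byte form A0 lo hi; return (offset, memory address)."""
    if len(code) != 3 or code[0] != 0xA0:
        raise ValueError("expected opcode A0 followed by a two-byte offset")
    offset = int.from_bytes(code[1:3], "little")   # 12 00 -> 0012H
    return offset, (ds << 4) + offset              # DS[0] + offset

offset, address = decode_mov_al_mem(bytes.fromhex("A01200"), ds=0x04B1)
print(f"offset={offset:04X} address={address:05X}")   # offset=0012 address=04B22
```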
One feature to get clear is the use in instruction operands of names, of names in square
brackets, and of numbers. In the following examples, WORDA is defined as a word (two
bytes) in memory:
The square brackets in the fourth example define an index operator that means: Use the
offset address in the BX (combined with the segment address in the DS, as DS:BX)
to locate a word in memory, and move its contents to the AX. Compare the effect of
this instruction with that of the first example, which simply moves the contents of the
BX to the AX.
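The difference between a register operand and a register in square brackets can be modeled in Python as the difference between copying a number and using that number as an index into memory. The memory contents and the offset 0020H below are hypothetical values chosen for illustration.

```python
def read_word(memory: bytearray, offset: int) -> int:
    """Read a 16-bit word stored low-order byte first."""
    return int.from_bytes(memory[offset:offset + 2], "little")

memory = bytearray(0x100)                  # stand-in for the data segment (DS)
memory[0x20:0x22] = bytes([0x34, 0x12])    # hypothetical word 1234H at offset 0020H
bx = 0x0020

ax_direct = bx                       # like MOV AX,BX   : copy the register itself
ax_indirect = read_word(memory, bx)  # like MOV AX,[BX] : copy the word at DS:BX
print(f"{ax_direct:04X} {ax_indirect:04X}")   # 0020 1234
```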
KEY POINTS
The three major components of DOS are IO.SYS, MSDOS.SYS, and COM-
MAND.COM.
Turning on the computer’s power causes a “cold boot.” The processor enters a reset
state, clears all memory locations to zero, performs a parity check of memory, and
sets the CS register and the IP register to the entry point of BIOS in ROM.
The two types of DOS programs are .COM and .EXE.
When you request DOS to load an .EXE program for execution, DOS constructs a
256-byte (100H) PSP on a paragraph boundary in memory and stores the program
immediately following the PSP. It then loads the address of the PSP in the DS and ES
registers, loads the address of the code segment in the CS, sets the IP to the offset of
the first instruction in the code segment, loads the address of the stack in the SS, and
sets the SP to the size of the stack. Finally, the loader transfers control to the program
for execution.
The purpose of the stack is to provide a space for the temporary storage of addresses
and data items. Each data item in the stack is one word (two bytes).
DOS defines the stack for a .COM program, whereas you must explicitly define a
stack for an .EXE program.
As the processor fetches each byte of an instruction, it increments the IP register so
that the IP contains the offset for the next instruction.
QUESTIONS
2-1. What are the five main functions of DOS?
2-2. What are the three main components of DOS, and what is the purpose of each?
2-3. What steps does the system take on a “cold boot”?
2-4. (a) What data area does DOS construct and store in front of an executable module when the
module is loaded for execution? (b) What is the size of this data area?
2-5. DOS performs certain operations when it loads an .EXE program for execution. What values
does DOS initialize (a) in the CS and IP registers? (b) in the SS and SP registers? (c) in the DS
and ES registers?
2-6. What is the purpose of the stack?
2-7. In what way is the stack defined for (a) a .COM program and (b) an .EXE program? (That is,
who or what defines the stack?)
2-8. (a) What is the size of each entry in the stack? (b) Where initially is the top of the stack, and
how is it addressed?
2-9. During execution of a program, the CS contains 5A2B[0], the SS contains 5B53[0], the IP
contains 52H, and the SP contains 48H. (Values are shown in normal, not reversed-byte,
sequence.) Calculate the addresses of (a) the instruction to execute and (b) the top (current
location) of the stack.
2-10. The DS contains 5B24[0], and an instruction that moves data from memory to the AL is
A03A01 (where A0 means “move”). Calculate the referenced memory address.
CHAPTER 3
Execution of Instructions
OBJECTIVE
INTRODUCTION
This chapter uses a DOS program named DEBUG that allows you to view memory, to en-
ter programs in memory, and to trace their execution. The text describes how you can en-
ter these programs directly into memory in a code segment and provides an explanation of
each execution step. Some readers may have access to sophisticated debuggers such as
CODEVIEW or TurboDebugger; however, we’ll use DEBUG since it is simple to use and
universally available.
In the initial exercises, you get to inspect the contents of particular areas of memory.
The first program example uses “immediate” data defined within the instructions for load-
ing data into registers and performing arithmetic. The second program example uses data
defined separately in the data segment. Tracing these instructions as they execute provides
insight into the operation of a computer and the role of the registers.
You can start right in with no prior knowledge of assembly language or even of pro-
gramming. All you need is an IBM PC or equivalent computer and a disk containing the
DOS operating system. We do assume, however, that you are familiar with booting up a
computer, handling diskettes, and selecting disk drives and files.
THE DEBUG PROGRAM
The DOS system comes with a program named DEBUG that is used for testing and de-
bugging executable programs. A feature of DEBUG is that it displays all program code and
data in hexadecimal format, and any data that you enter into memory is also in hex format.
Another feature is that DEBUG allows you to execute a program in single-step mode, so
that you can view the effect of each instruction on memory locations and registers.
DEBUG Commands
DEBUG provides a set of commands that lets you perform a number of useful operations.
The commands that concern us at this point are the following:
For its own purposes, DEBUG does not distinguish between lowercase and uppercase
letters, so you may enter commands either way. Also, you enter a space only where it is
needed to separate parameters in a command. The following three examples use DEBUG’s
D command to display the same area of memory, beginning at offset 200H in the data seg-
ment (DS):
Note that you specify segments and offsets with a colon, in the form segment:offset.
Also, DEBUG assumes that all numbers are in hexadecimal format.
Each line displays 16 bytes of memory. The address to the left refers only to the leftmost
byte, in segment:offset format; you can count across the line to determine the position of
each byte. The hex representation area shows two hex characters for each byte, followed by
a space for readability. Also, a hyphen separates the second eight bytes from the first eight,
again for readability. Thus if you want to locate the byte at offset xx13H, start with xx10H,
and count three bytes successively to the right.
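The line layout described above can be approximated in Python. This sketch is not DEBUG's own code: the function name is hypothetical, the exact spacing is an assumption, and nonprintable bytes are simply shown as periods.

```python
def dump_line(segment: int, offset: int, data: bytes) -> str:
    """Format up to 16 bytes roughly the way a line of the D display looks."""
    cells = [f"{b:02X}" for b in data[:16]]
    left, right = " ".join(cells[:8]), " ".join(cells[8:])
    hex_area = f"{left}-{right}" if right else left   # hyphen after the eighth byte
    ascii_area = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data[:16])
    return f"{segment:04X}:{offset:04X}  {hex_area:<47}  {ascii_area}"

# Hypothetical segment address 21C1 and nine sample bytes.
print(dump_line(0x21C1, 0x0200, bytes.fromhex("2301250000002A2A2A")))
```

In the list named cells, the byte at offset xx13H of a line that starts at xx10H is cells[3], which matches counting three bytes to the right of the first one.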
This book makes considerable use of DEBUG and explains details of its commands
as they are needed. Appendix E provides a full description of DEBUG commands.
Starting DEBUG
To start DEBUG, set the system to the directory on hard disk containing DEBUG, or insert
a DOS diskette containing DEBUG in the default drive. To initiate the program, key in the
word DEBUG and press Enter. DEBUG should load from disk into memory. When DE-
BUG’s prompt, a hyphen (-), appears on the screen, DEBUG is ready to accept your com-
mands. (That is a hyphen, although it resembles the cursor.) Let’s now use DEBUG to
snoop about in memory.
In this example, the two bytes in the equipment status word contain the hex values 63 and
44. We reverse the bytes (44 63) and convert them to binary:
Bit:     15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Binary:   0  1  0  0  0  1  0  0  0  1  1  0  0  0  1  1
Here’s an explanation of the hex code:
BITS    DEVICE
15,14   Number of parallel printer ports attached = 1 (binary 01)
11-9    Number of serial ports attached = 2 (binary 010)
7,6     Number of diskette devices = 2 (where 00 = 1, 01 = 2, 10 = 3, and 11 = 4)
5,4     Initial video mode = 10 (where 01 = 40 x 25 color, 10 = 80 x 25 color,
        and 11 = 80 x 25 monochrome)
1       1 = math coprocessor is present
0       1 = diskette drive is present
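The decoding above can be reproduced by reversing the two displayed bytes and masking out each group of bits. The following Python sketch uses a hypothetical function name and covers only the fields listed in the table.

```python
def decode_equipment(low_byte: int, high_byte: int) -> dict:
    """Decode the listed fields from the two bytes as displayed (low byte first)."""
    word = (high_byte << 8) | low_byte               # reverse the bytes: 63 44 -> 4463H
    return {
        "word": word,
        "parallel_ports": (word >> 14) & 0b11,        # bits 15,14
        "serial_ports": (word >> 9) & 0b111,          # bits 11-9
        "diskette_drives": ((word >> 6) & 0b11) + 1,  # bits 7,6 (00 means 1 drive)
        "video_mode_bits": (word >> 4) & 0b11,        # bits 5,4
        "coprocessor": bool(word & 0b10),             # bit 1
        "diskette_present": bool(word & 0b01),        # bit 0
    }

info = decode_equipment(0x63, 0x44)
print(f"{info['word']:016b}")   # 0100010001100011
print(info["parallel_ports"], info["serial_ports"], info["diskette_drives"])   # 1 2 2
```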
The first two bytes displayed at offset 0013H are kilobytes of memory size in hexadecimal,
with the bytes in reverse sequence. Here are two examples showing reversed hex, corrected
hex, and the decimal equivalent:
The screen should display a seven-digit serial number followed, on conventional machines,
by a copyright notice. The serial number is viewable as hex numbers, whereas the copy-
right notice is more recognizable from the ASCII area to the right. The copyright notice
may continue past what is already displayed; to view it, simply press D followed by the
Enter key.
Knowing this date could be useful for determining a computer’s age and model.
Checking Model ID
Immediately following the ROM BIOS manufacture date is the model ID at location
FFFFEH, or FFFF:E. Here are a number of model IDs:
CODE MODEL
F8 PS/2 models 70 and 80
F9 PC convertible
FA PS/2 model 30
FB PC-XT (1986)
FC PC-AT (1984), PC-XT model 286, PS/2 models 50 and 60, etc.
FE PC-XT (1982), portable (1982)
FF Original IBM PC
Now that you know how to use the display command, you can view the contents of
any storage location. You can also step through memory simply by pressing D repeatedly—
DEBUG displays eight lines successively, continuing from the last D operation.
When you’ve completed poking about, enter Q (for quit) to exit from DEBUG, or
continue with the next exercise.
define an immediate value in reverse-byte sequence.) MOV is the instruction, the AX reg-
ister is the first operand, and the immediate value 0123H is the second operand.
MACHINE SYMBOLIC
INSTRUCTION CODE EXPLANATION
You may have noticed that machine instructions may be one, two, or three bytes in length.
The first byte is the actual operation, and any other bytes that are present are operands—
references to an immediate value, a register, or a memory location. Program execution be-
gins with the first machine instruction and steps through each instruction, one after another.
At this point do not expect to make much sense of the machine code. For example, in one
case the machine code (the first byte) for move is hex B8, and in another case the code for
move is hex 8B.
Begin this exercise just as you did the preceding one: Key in the command DEBUG and
press Enter. When DEBUG is fully loaded, it displays its prompt (-). To enter this pro-
gram directly into memory, just type in the machine language portion, but not the sym-
bolic code or explanation. Key in the following E (Enter) command, including the blanks,
where indicated:
E CS:100 B8 23 01 05 25 00
CS:100 indicates the starting memory address at which the data is to be stored—100H (256)
bytes following the start of the code segment (the normal starting address for machine code
under DEBUG). The E command causes DEBUG to store each pair of hexadecimal digits
into a byte in memory, from CS:100 through CS:105.
The next E command stores six bytes, starting at CS:106 through 107, 108, 109, 10A,
and 10B:
E CS:106 8B D8 03 D8 8B CB
The last E command stores five bytes, starting at CS:10C through 10D, 10E, 10F, and 110:
E CS:10C 2B C8 2B C0 90
If you key in an incorrect command, simply repeat it with the correct values.
-E CS:100 B8 23 01 05 25 00
-E CS:106 8B D8 03 D8 8B CB
-E CS:10C 2B C8 2B C0 90
-R
AX=0000  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=21C1  ES=21C1  SS=21C1  CS=21C1  IP=0100   NV UP EI PL NZ NA PO NC
21C1:0100 B82301        MOV     AX,0123
-T
Because of differences in the various DOS versions, some register contents on your
screen may differ from those shown in Figure 3-1. The IP register displays IP=0100,
indicating that execution of instructions is to begin 100H bytes past the start of the code
segment. (That is why you used E CS:100 to enter the start of the program.)
The flags register in Figure 3-1 shows the following settings:
NV UP EI PL NZ NA PO NC
These settings mean no overflow, up (or right) direction, enable interrupt, plus sign,
nonzero, no auxiliary carry, parity odd, and no carry, respectively. At this time, none of
these settings is important to us.
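For reference, the two-letter codes can be produced from a flags value with a short Python sketch. The bit positions are the standard 8086 ones; the table and function names are hypothetical, and the sample flag values are chosen only for illustration.

```python
# (bit position, code when the flag is set, code when it is clear), in display order
FLAG_NAMES = [
    (11, "OV", "NV"),  # overflow
    (10, "DN", "UP"),  # direction
    (9,  "EI", "DI"),  # interrupt
    (7,  "NG", "PL"),  # sign
    (6,  "ZR", "NZ"),  # zero
    (4,  "AC", "NA"),  # auxiliary carry
    (2,  "PE", "PO"),  # parity
    (0,  "CY", "NC"),  # carry
]

def flag_mnemonics(flags: int) -> str:
    """Render a 16-bit flags value as the eight two-letter codes."""
    return " ".join(on if flags & (1 << bit) else off for bit, on, off in FLAG_NAMES)

print(flag_mnemonics(0x0200))   # NV UP EI PL NZ NA PO NC
print(flag_mnemonics(0x0246))   # NV UP EI PL ZR NA PE NC
```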
The R command also displays at offset 0100H the first instruction to be executed.
Note that in the figure the CS register contains 21C1. Since your CS segment address is sure
to differ from this, we’ll show it as xxxx for the instructions:
xxxx:0100 B82301 MOV AX,0123
• xxxx indicates the start of the code segment as xxxx[0]. The value xxxx:0100 means
offset 100H bytes following the CS segment address xxxx[0].
• B82301 is the machine code that you entered at CS:100.
• MOV AX,0123 is the symbolic assembly instruction for the machine code. This
instruction means, in effect, move the immediate value 0123H into the AX register.
DEBUG has “unassembled” the machine instructions so that you may interpret them
more easily. In later chapters, you will code assembly instructions exclusively.
At this point, the MOV instruction has not executed. For that purpose, key in T
(Trace) and press the Enter key. The machine code is B8 (move to AX register) followed
by 2301. The operation moves the 23 to the low half (AL) of the AX register and the 01 to
the high half (AH) of the AX register:
       AH     AL
AX:  | 01  |  23  |
DEBUG displays the results in the registers. The contents of the IP register is 0103H, to in-
dicate the offset location in the code segment of the next instruction to be executed, namely:
To execute this instruction, enter another T. The ADD instruction adds 25H to the low half
(AL) of the AX register and 00H to the high half (AH), in effect adding 0025H to the AX.
AX now contains 0148H, and IP contains 0106H for the next instruction to be executed:
Key in another T command. The MOV instruction moves the contents of the AX register to
the BX register. Note that after the move the BX contains 0148H. AX still contains 0148H
because MOV copies rather than actually moves the data from one location to another.
Now key in successive T commands to step through the remaining instructions. The
ADD instruction adds the contents of AX to BX, giving 0290H in BX. Then the program
moves (copies) the contents of BX into CX, subtracts AX from CX, and subtracts AX
from itself. After this last operation, the zero flag is changed from NZ (nonzero) to ZR
(zero), to indicate that the result of the last operation was zero. (Subtracting AX from itself
cleared it to zero.)
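As a cross-check on the trace, the following Python sketch steps through the same 17 bytes of machine code and reports the final register contents. It is a toy model only: the function name is hypothetical, it understands just the handful of opcodes used in this exercise, and it tracks only the zero flag.

```python
REGS = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]  # register codes 0-7

def trace(code: bytes, ip: int = 0x100) -> dict:
    """Execute a tiny subset of 8086 machine code and return the final state."""
    reg = dict.fromkeys(REGS, 0)
    zero = False                                  # models the ZR/NZ flag only
    pos = 0
    while pos < len(code):
        op = code[pos]
        if op == 0x90:                            # NOP: treat as end of program
            break
        if op in (0xB8, 0x05):                    # MOV AX,imm16 / ADD AX,imm16
            imm = int.from_bytes(code[pos + 1:pos + 3], "little")
            if op == 0xB8:
                reg["AX"] = imm                   # MOV leaves the flags alone
            else:
                reg["AX"] = (reg["AX"] + imm) & 0xFFFF
                zero = reg["AX"] == 0
            pos += 3
        elif op in (0x8B, 0x03, 0x2B):            # MOV/ADD/SUB register,register
            modrm = code[pos + 1]
            if modrm >> 6 != 0b11:
                raise ValueError("only register-to-register forms are modeled")
            dst, src = REGS[(modrm >> 3) & 7], REGS[modrm & 7]
            if op == 0x8B:
                reg[dst] = reg[src]
            else:
                delta = reg[src] if op == 0x03 else -reg[src]
                reg[dst] = (reg[dst] + delta) & 0xFFFF
                zero = reg[dst] == 0
            pos += 2
        else:
            raise ValueError(f"opcode {op:02X} is not modeled")
    return {**reg, "IP": ip + pos, "ZR": zero}

program = bytes.fromhex("B82301 052500 8BD8 03D8 8BCB 2BC8 2BC0 90")
state = trace(program)
print(f"AX={state['AX']:04X} BX={state['BX']:04X} CX={state['CX']:04X} "
      f"IP={state['IP']:04X} ZR={state['ZR']}")
# AX=0000 BX=0290 CX=0148 IP=0110 ZR=True
```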
If you want to reexecute these instructions, reset the IP register to 100H and trace
through them again. Enter R IP, enter 100, and then enter R and the required number of T
commands, all followed by the Enter key.
D CS:100
DEBUG now displays 16 bytes (32 hex digits) of data on each line. To the right is the ASCII
representation (if printable) of each byte (pair of hex digits). In the case of machine code,
the ASCII representation is meaningless and may be ignored. Later sections discuss the
right side of the display in more detail.
The first line of the display begins at offset 100H of the code segment and represents
the contents of locations CS:100 through CS:10F. The second line represents the contents
of CS:110 through CS:11F. Although your program ends at CS:110, the D command
automatically displays eight lines, the last of which begins at CS:170.
Figure 3-2 shows the results of the D CS:100 command. Expect only the machine
code from CS:100 through 110 to be identical to that of your own display; the bytes that
follow could contain anything. Also, the figure shows that the DS, ES, SS, and CS regis-
ters all contain the same address. This is because DEBUG happens to treat the program area
as one segment, with code and data (if any) in the same segment, although you must keep
them separated.
Enter Q (Quit) to end the DEBUG session, or continue with the next exercise.
Correcting an Entry
If you enter an incorrect value in the data segment or code segment, reenter the E command
to correct it. Also, to resume execution at the first instruction, set the IP register to 0100.
Key in the R command followed by the designated register, that is, R IP [Enter]. DEBUG
displays the contents of the IP and waits for an entry. Key in the value 0100 (followed by
Enter). Next, key in an R command (without the IP). DEBUG displays the registers, flags,
and first instruction to be executed. You can now use T to retrace the instruction steps. If
your program accumulates totals, you may have to clear some memory locations and reg-
isters. But be sure not to change the contents of the CS, DS, SP, and SS registers, all of
which have specific purposes.
0200H 2301H
0202H 2500H
0204H 0000H
0206H 2A2A2AH
Remember that a hex digit occupies a half-byte, so that, for example, 23H is stored in off-
set 0200H (the first byte) of the data area, and 01H is stored in offset 0201H (the second
byte). Here are the machine language instructions that process these data items:
INSTRUCTION EXPLANATION
A10002 Move the word (two bytes) beginning at DS offset 0200H into the
AX register.
03060202 Add the contents of the word (two bytes) beginning at DS offset
0202H into the AX register.
A30402 Move the contents of the AX register to the word beginning at DS
offset 0204H.
90 No operation.
You may have noticed that the two move instructions have different machine codes: A1 and
A3. The actual machine code is dependent on the registers that are referenced, the size of
data (byte or word), the direction of data transfer (from or to a register), and the reference
to immediate data or memory.
Now use the E command to key in the instructions, again beginning at CS:100:
The first E command stores the three words (six bytes) at the start of the data area,
DS:0200. Note that you have to enter these words with the bytes reversed, so that 0123 is
2301 and 0025 is 2500. When a MOV instruction subsequently accesses these words and
loads them into a register, it “unreverses” the bytes, so that 2301 becomes 0123 and 2500
becomes 0025.
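The byte reversal can be verified outside DEBUG with a few lines of Python. This is only a sketch: store_word and load_word are hypothetical names, and "little" is Python's term for the low-byte-first order that the processor uses.

```python
def store_word(value):
    """Return the two bytes of a 16-bit word in the order the PC stores them."""
    return value.to_bytes(2, byteorder="little")


def load_word(stored):
    """Rebuild the 16-bit value from two bytes as they appear in memory."""
    return int.from_bytes(stored, byteorder="little")


for value in (0x0123, 0x0025):
    stored = store_word(value)
    print(f"{value:04X} is keyed in as {stored.hex(' ').upper()}")

# Loading the stored bytes back restores the original word.
print(f"{load_word(bytes.fromhex('2301')):04X}")
```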
The second E command stores three asterisks (***), defined as 2A2A2A, so that you
can view them later using the D (Display) command. Otherwise, these asterisks serve no
particular purpose in the data segment.
Figure 3-3 shows all the steps in the program, including the E commands. Your
screen should display similar results, although the addresses in the CS and DS probably dif-
fer. To examine the stored data (at DS:200H through 208H) and the instructions (at
CS:100H through 10AH), key in the following D commands:
Check that the contents of both areas (other than segment addresses) are identical to what
is shown in Figure 3-3.
-E DS:200 23 01 25 00 00 00
-E DS:206 2A 2A 2A
-E CS:100 A1 00 02 03 06 02 02
-E CS:107 A3 04 02 90
-D DS:200,208
21C1:0200 23 01 25 00 00 00 2A 2A-2A #.%...***
-D CS:100,10A
21C1:0100 A1 00 02 03 06 02 02 A3-04 02 90
-R
AX=0000 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=21C1 ES=21C1 SS=21C1 CS=21C1 IP=0100 NV UP EI PL NZ NA PO NC
21C1:0100 A10002 MOV AX,[0200] DS:0200=0123
-T
CS:0100 references your first instruction, A10002. DEBUG interprets this instruction
as a MOV and has determined that the reference is to the first location [0200H] in the
data area. The square brackets are to tell you that this reference is to a memory address and
not an immediate value. (An immediate value for moving 0200H to the AX register would
appear as MOV AX,0200.)
Now key in the T (Trace) command. The instruction MOV AX,[0200] moves the con-
tents of the word at offset 0200H to the AX register. The contents are 2301H, which the op-
eration reverses in the AX as 0123H.
Enter another T command to cause execution of the next instruction, ADD. The op-
eration adds the contents of the word in memory at DS offset 0202 to the AX register. The
result in the AX is now the sum of 0123H and 0025H, or 0148H.
The next instruction is MOV [0204],AX. Key in a T command for it to execute. The
instruction moves the contents of the AX register to the word in memory at DS offset
0204H. To view the changed contents of the data from 200H through 208H, key in
D DS:200,208 [Enter]
Offset: 200 201 202 203 204 205 206 207 208
The value 0148H is moved from the AX register to the data area at offsets 204H and 205H
and is reversed as 4801H. The left side of the display shows the actual machine code as it
appears in memory. The right side simply helps you locate character data more easily. Note
that these hex values are represented on the right of the screen by their ASCII equivalents.
Thus 23H generates a number (#) symbol, and 25H generates a percent (%) symbol, while
the three 2AH bytes generate asterisks (*).
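The three traced instructions can be imitated in Python to confirm the values you should see. This is a rough model, not DEBUG itself: memory, read_word, and write_word are hypothetical names, and the character column assumes that a period stands for any byte that is not a printable ASCII character.

```python
memory = bytearray(0x300)
# The data keyed in with the two E commands at DS:0200.
memory[0x200:0x209] = bytes.fromhex("23 01 25 00 00 00 2A 2A 2A")


def read_word(offset):
    """Load a word from memory, low byte first."""
    return int.from_bytes(memory[offset:offset + 2], "little")


def write_word(offset, value):
    """Store a word in memory, low byte first."""
    memory[offset:offset + 2] = (value & 0xFFFF).to_bytes(2, "little")


ax = read_word(0x200)                      # MOV AX,[0200]
ax = (ax + read_word(0x202)) & 0xFFFF      # ADD AX,[0202]
write_word(0x204, ax)                      # MOV [0204],AX

data = memory[0x200:0x209]
hex_side = data.hex(" ").upper()
ascii_side = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)
print(f"AX={ax:04X}")
print(hex_side, ascii_side)
```

The bytes at offsets 204H and 205H come out as 48 01, the reversed form of 0148H.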
Since there are no more instructions to execute, enter Q (Quit) to end the DEBUG
session, or continue with the next exercise (and remember to reset the IP to 100).
The A Command
The A (Assemble) command tells DEBUG to begin accepting symbolic assembly instructions
and to convert them into machine language. Initialize the starting address in the code
segment at offset 100H for your instructions as
A 100 [Enter]
DEBUG displays the value of the code segment and the offset as xxxx:0100. Type in each
instruction, followed by the Enter key. Try entering the following program:
When you’ve keyed in the program, press Enter again to exit from the A command. That’s
one extra Enter, which tells DEBUG you have no more symbolic instructions to enter. On
completion, DEBUG should display the following:
xxxx:0106 NOP
You can see that DEBUG has determined the starting location of each instruction. But be-
fore executing the program, let’s use DEBUG’s U (Unassemble) command to examine the
generated machine language.
U 100,106 [Enter]
The screen should display columns for the location, machine code, and symbolic code:
xxxx:0106 90 NOP
Now trace the execution of the program—the machine code is what actually executes. Be-
gin by entering R to display the registers and the first instruction, and then T successively
to trace subsequent instructions. When you get to the NOP at location 106H, continue with
the next exercise or press Q to quit execution.
You can now see how to enter a program in either machine or assembly language.
However, DEBUG is really intended for what its name implies—debugging programs—
and most of your efforts will involve the use of conventional assembly language, which is
not associated with DEBUG.
MOV AH,30
INT 21
NOP
To trace execution of the instructions, first enter R to view the registers and T to trace the
MOV. Instead of tracing the INT instruction, enter P (Proceed) to execute through the en-
tire DOS routine. Processing stops at the NOP instruction. You can now view the AL for
the DOS major version number, such as the X in DOS X.20, and the AH for the minor num-
ber, such as 14H (or 20) in DOS X.20.
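Since DEBUG shows the AX in hex, reading the version takes one small conversion. The Python sketch below illustrates it; dos_version is a hypothetical helper, and the AX value 1406H is an invented example rather than output from a real session.

```python
def dos_version(ax):
    """Format the version from AX as returned by INT 21H function 30H."""
    al = ax & 0xFF           # AL holds the major version number
    ah = (ax >> 8) & 0xFF    # AH holds the minor version number
    return f"{al}.{ah:02d}"


# Hypothetical register contents: AH=14H (20 decimal), AL=06H.
print(dos_version(0x1406))
```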
Press Q to quit, or continue with the next exercise (and reset the IP to 100).
MOV AH,2A
INT 21
NOP
Enter R to display the registers and T to execute the MOV. Then enter P to proceed through
the interrupt routine; the operation stops at the NOP instruction. The registers display this
information:
INT 12
NOP
Enter R to display the registers and the first instruction. The instruction, INT 12H, passes
control to a routine in BIOS that delivers the size of memory to the AX. Press T (and En-
ter) repeatedly to see each BIOS instruction execute. (Yes, we are violating a rule against
tracing through an interrupt, but this time it works all right.)
The actual instructions in your BIOS may differ somewhat from these, depending on
the version installed (the comments to the right are the author’s):
STI ;Set interrupt
If you survived this adventure into BIOS, the AX contains the size of memory, in 1K bytes.
The last T command exits from BIOS and returns to DEBUG. The displayed instruction is
the NOP that you entered. Press Q to quit or continue with the next exercise (and reset the
IP to 100).
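The AX value is displayed in hex, so it helps to convert it before reading it as a memory size. A minimal Python sketch follows; memory_bytes is a hypothetical helper, and 0280H is an assumed example (640K), not a value taken from the text.

```python
def memory_bytes(ax):
    """Convert the 1K-block count in AX to a size in bytes."""
    return ax * 1024


ax = 0x0280  # assumed example value as DEBUG would display it
print(f"{ax:04X}H = {ax}K = {memory_bytes(ax)} bytes")
```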
xxxx:0102 MOV BL,32
xxxx:0106 NOP
Since the last instruction, NOP, is one byte, the program size is 100H through 106H
inclusive, or 7.
• First use R BX to display the BX, and enter 0 to clear it.
• Next use R CX to display the CX register. DEBUG replies with CX 0000 (zero
value), and you reply with the program size, 7.
• Write the revised program: W [Enter].
The BX is cleared because DEBUG takes the program length from the BX:CX pair,
although the CX alone is adequate for our purposes.
DEBUG displays a message, “Writing nnnn bytes.” If the number is zero, you have
failed to enter the program length; try again. Watch out for the size of the program, since
the last instruction could be longer than one byte.
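The size calculation is easy to get wrong by one, so here is a small Python sketch of it. The name program_size is hypothetical; the function assumes the program starts at offset 100H and that you know the length of the last instruction.

```python
def program_size(last_offset, last_length, start=0x100):
    """Bytes from the start offset through the end of the last instruction."""
    return last_offset + last_length - start


# The last instruction is a one-byte NOP at 106H.
print(program_size(0x106, 1))
# If the last instruction at 106H were three bytes long instead:
print(program_size(0x106, 3))
```

The result is the value to key into the CX before the W command.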
118 NOP
119 NOP
11A DB 14 23
11C DB 05 00
11E DB 00 00
120 DB 00 00 00
100: Move the contents of memory locations 11AH-11BH to the AX. The square brackets
indicate a memory address rather than an immediate value.
103: Add the contents of memory locations 11CH-11DH to the AX.
107: Add the immediate value 25H to the AX.
10A: Move the contents of the AX to memory locations 11EH-11FH.
10D: Move the immediate value 25H to memory locations 120H-121H. Note the use of
the WORD PTR operator, which tells DEBUG that the 25H is to move into a word
in memory. If you were to code the instruction as MOV [120],25, DEBUG would
have no way of determining what length is intended and would display an ERROR
message. Although you will seldom need to use the PTR operator, it’s vital to know
when it is needed.
113: Move the immediate value 30H to memory location 122H. This time, we want to
move a byte, and the BYTE PTR operator indicates this length.
11A: Define the byte values 14H and 23H. DB here means “define byte(s)” and allows
you to define data items that your instructions (such as the one at 100) are to
reference.
11C, 11E, and 120: Define other byte values for use in the program.
To run this program, first type in A 100 [Enter], and then key in each symbolic in-
struction (but not the location). At the end, key in an additional Enter to exit from the A
command. Begin by entering R to display the registers and the first instruction; then enter
successive T commands. Quit execution when you get to the NOP at 118. The register
display shows the changed contents of the AX (233E); key in D 110 to view the changed
contents of locations 11EH-11FH (3E23), 120H-121H (2500), and 122H (30).
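The expected results can be checked with a Python model of the six instructions. This is only a sketch of what the trace should produce: memory, read_word, and write_word are hypothetical names, and the model ignores flags and segment registers.

```python
memory = bytearray(0x130)
# The DB definitions at 11AH through 122H.
memory[0x11A:0x123] = bytes.fromhex("14 23 05 00 00 00 00 00 00")


def read_word(offset):
    """Load a word from memory, low byte first."""
    return int.from_bytes(memory[offset:offset + 2], "little")


def write_word(offset, value):
    """Store a word in memory, low byte first."""
    memory[offset:offset + 2] = (value & 0xFFFF).to_bytes(2, "little")


ax = read_word(0x11A)                      # 100: MOV AX,[11A]
ax = (ax + read_word(0x11C)) & 0xFFFF      # 103: ADD AX,[11C]
ax = (ax + 0x25) & 0xFFFF                  # 107: ADD AX,25
write_word(0x11E, ax)                      # 10A: MOV [11E],AX
write_word(0x120, 0x25)                    # 10D: MOV WORD PTR [120],25
memory[0x122] = 0x30                       # 113: MOV BYTE PTR [122],30

print(f"AX={ax:04X}")
print(memory[0x11E:0x123].hex(" ").upper())
```

The WORD PTR store changes two bytes (25 00), whereas the BYTE PTR store changes only the one byte at 122H.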
KEY POINTS
• The DOS DEBUG program is useful for testing and debugging machine language and
assembly language programs.
• DEBUG provides a set of commands that lets you perform a number of useful operations,
such as display, enter, and trace.
• Since DEBUG does not distinguish between lowercase and uppercase letters, you
may enter commands either way.
• DEBUG assumes that all numbers are in hexadecimal format.
• If you enter an incorrect value in the data segment or code segment, reenter the E command
to correct it.
• To resume execution at the first instruction, set the instruction pointer (IP) register to
0100. Key in the R (Register) command, followed by the designated register, as R IP
[Enter]. DEBUG displays the contents of the IP and waits for an entry. Key in the
value 0100 (followed by Enter).
QUESTIONS
3-1. What is the purpose of each of the following DEBUG commands? (a) A; (b) D; (c) E; (d) P;
(e) Q; (f) R; (g) T; (h) U.
3-2. Provide the DEBUG commands for the following requirements.
(a) Display the memory beginning at offset 264H in the data segment.
(b) Display the memory beginning at location 410H. (Note: Separate this address into its seg-
ment and offset values.)
(c) Enter the hex value A8B364 into the data segment beginning at location 200H.
(d) Display the contents of (i) all registers and (ii) the IP register only.
(e) Unassemble the machine code in locations 100H through 11EH.
3-3. Provide the machine code instructions for the following operations: (a) Move the hex value 4629
to the AX register; (b) add the hex value 036A to the AX register.
3-4. Assume that you have used DEBUG to enter the following E command:
E CS:100 B8 45 01 05 25 00
The hex value 45 was supposed to be 54. Code another E command to correct only the one byte
that is incorrect; that is, change the 45 to 54 directly.
3-5. Assume that you have used DEBUG to enter the following E command:
E CS:100 B8 04 30 05 00 30 90
(a) What are the three symbolic instructions represented here? (The first program in this chap-
ter gives a clue.)
(b) On executing this program, you discover that the AX register ends up with 6004 instead of
the expected 0460. What is the error, and how would you correct it?
(c) Having corrected the instructions, you now want to reexecute the program from the first in-
struction. What two DEBUG commands are required?
3-6. Consider the machine language program
B0 25 D0 E0 B3 15 F6 E3 90
This program performs the following:
Moves the hex value 25 to the AL register.
Shifts the contents of the AL one bit left. (The result is 4A.)
Moves the hex value 15 to the BL register.
Multiplies the AL by the BL.
Use DEBUG’s E command to enter the program beginning at CS:100. Remember that these
are hexadecimal values. After entering the program, key in D CS:100 to view it. Then key in
R and enough successive T commands to step through the program until reaching the NOP.
What is the final product in the AX register?
3-7. Use DEBUG’s E command to enter the following machine language program:
MOV BX,25
ADD BX,30
SHL BX,01
SUB BX,22
NOP
Unassemble the instructions and trace their execution through to the NOP, and check the value
in the BX after each instruction.
3-10. What is the purpose of the INT instruction?
PART B — Fundamentals of Assembly
Language
CHAPTER 4
Assembly Language Requirements
OBJECTIVE
To cover the basic requirements for coding an assembly
language program and defining data items.
INTRODUCTION
Chapter 3 showed how to use DEBUG for keying in and executing machine language pro-
grams. No doubt, you were very aware of the difficulty in deciphering the machine code,
even for a small program. Probably no one seriously codes in machine language other than
for the smallest programs. A higher level of coding is the assembly level, in which a pro-
grammer uses symbolic instructions in place of machine instructions and descriptive names
for data items and memory locations. You write an assembly program according to a strict
set of rules and then use the assembler translator program to convert the assembly program
into machine code.
In this chapter, we explain the basic requirements for developing an assembly pro-
gram: the use of comments, the general coding format, the directives for printing a program
listing, and the directives for defining segments and procedures. We also cover the general
organization of a program, including initializing the program and ending its execution. Fi-
nally, we cover the requirements for defining data items.
A common practice is to combine the benefits of both programming levels: Code the
bulk of a project in a high-level language, and code critical modules (those that cause no-
ticeable delays) in assembly language.
Regardless of the programming language you use, it is still a symbolic language that
has to be translated into a form the computer can execute. A high-level language uses a com-
piler to translate the source code into machine code (technically, object code). A low-level
language uses an assembler to perform the translation. A linker program for both high and
low levels completes the process by converting the object code into executable machine
language.
Since a comment appears only on a listing of an assembled source program and gen-
erates no machine code, you may include any number of comments without affecting the
assembled program’s size or execution. In this book, assembly instructions are in upper-
case letters and comments are in lowercase, only as a convention and to make the programs
more readable. Technically, you can freely use upper- or lowercase for instructions and
comments.
RESERVED WORDS
Certain words in assembly language are reserved for its own purposes, to be used only un-
der special conditions. By category, reserved words include
• instructions, such as MOV and ADD, which are operations that the computer can
execute;
• directives, such as END or SEGMENT, which you use to provide commands to the
assembler;
• operators, such as FAR and SIZE, which you use in expressions; and
• predefined symbols, such as @Data and @Model, which return information to your
program.
Using a reserved word for a wrong purpose causes the assembler to generate an error
message. Appendix C provides a list of assembly language reserved words.
IDENTIFIERS
An identifier is a name that you apply to items in your program. The two types of identifier
are name, which refers to the address of a data item, and label, which refers to the address
of an instruction. The same rules apply to both names and labels. An identifier can use the
following characters:
The first character of an identifier must be an alphabetic letter or a special character, except
for the period. Since the assembler uses some special words that begin with the @ symbol,
you should avoid using it for your own definitions.
The assembler treats uppercase and lowercase letters the same. The maximum length
of an identifier is 31 characters (247 since MASM 6.0). Examples of valid names are
COUNT, PAGE25, and $E10. Descriptive, meaningful names are recommended. The
names of registers, such as AX, DI, and AL, are reserved for referencing those registers.
Consequently, in an instruction such as
ADD AX, BX
the assembler automatically knows that AX and BX refer to registers. However, in an in-
struction such as
MOV REGSAVE, AX
the assembler can recognize the name REGSAVE only if you define it elsewhere in the
program.
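The naming rules can be expressed as a small checker. The Python sketch below is a rough illustration only: is_valid_identifier is a hypothetical name, the set of special characters (? _ $ @ and the period) is an assumption because the list is not reproduced here, and the function follows the first-character rule as stated above. It does not check for reserved words.

```python
import string

SPECIAL = "?_$@."    # assumed set of special characters
MAX_LENGTH = 31      # 247 since MASM 6.0


def is_valid_identifier(name, max_length=MAX_LENGTH):
    """Apply the length, character, and first-character rules to a name."""
    if not name or len(name) > max_length:
        return False
    allowed = string.ascii_letters + string.digits + SPECIAL
    if any(ch not in allowed for ch in name):
        return False
    first = name[0]
    # First character: a letter, or a special character other than the period.
    return first in string.ascii_letters or (first in SPECIAL and first != ".")


for name in ("COUNT", "PAGE25", "$E10", "25PAGE"):
    print(name, is_valid_identifier(name))
```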
STATEMENTS
An assembly language program consists of a set of statements. The two types of state-
ments are:
1. instructions such as MOV and ADD, which the assembler translates to object
code; and
2. directives, which tell the assembler to perform a specific action, such as define a
data item.
Here is the general format for a statement, where square brackets indicate an optional
entry:
[identifier] operation [operand(s)] [;comment]
An identifier (if any), operation, and operand (if any) are separated by at least
one blank or tab character. There is a maximum of 132 characters on a line (512 since
MASM 6.0), although most programmers prefer to stay within 80 characters because that
is the maximum number the screen will accommodate. Two examples of statements are the
following:
The identifier, operation, and operand may begin in any column. However, consistently
starting at the same column for these entries makes a more readable program. Also, most
editor programs provide useful tab stops every eight positions to facilitate spacing.
Identifier
As described earlier, the term name applies to the name of a defined item or directive, whereas
the term label applies to the name of an instruction; we’ll use these terms from now on.
Operation
The operation, which must be coded, is most commonly used for defining data areas and
coding instructions. For a data item, an operation such as DB or DW defines a field, work
area, or constant. For an instruction, an operation such as MOV or ADD indicates an action
to perform.
Operand
The operand (if any) provides information for the operation to act on. For a data item, the
operand defines its initial value. For example, in the following definition of a data item
named COUNTER, the operation DB means “define byte,” and the operand initializes its
contents with a zero value:
NAME OPERATION OPERAND COMMENT
COUNTER DB 0 ;Define byte (DB) with 0 value
DIRECTIVES
Assembly language supports a number of statements that enable you to control the way in
which a program assembles and lists. These statements, called directives, act only during
the assembly of a program and generate no machine-executable code. The most common
directives are explained in the next few sections. Chapter 27 covers all of the directives in
detail; you may use that chapter as a reference any time.
The following common example provides 60 lines per page and 132 characters per line:
PAGE 60,132
The number of lines per page may range from 10 through 255, and the number of charac-
ters per line may range from 60 through 132. Omission of a PAGE statement causes the as-
sembler to default to PAGE 50,80.
Suppose that the line count for PAGE is defined as 60. Then, when the assembled
program has listed 60 lines, it ejects the forms to the top of the next page and increments a
page count. You may also want to force a page to eject at a specific line in the program list-
ing, such as the end of a segment. At the required line, simply code PAGE with no operand.
On encountering PAGE, the assembler automatically ejects the page and resumes printing
at the top of the next page.
TITLE. You can use the TITLE directive to cause a title for a program to print on
line 2 of each page of the program listing. You may code TITLE once, at the start of the
program. Its general format is
TITLE text
For the text operand, a recommended technique is to use the name of the program, as
cataloged on disk. For example, if you named the program ASMSORT, code that name plus
an optional descriptive comment, all up to 60 characters in length, like this:
SEGMENT Directive
An assembly program in .EXE format consists of one or more segments. A stack segment
defines stack storage, a data segment defines data items, and a code segment provides for
executable code. The directives for defining a segment, SEGMENT and ENDS, have the
following format:
The SEGMENT statement defines the start of a segment. The segment name must be pre-
sent, must be unique, and must follow the naming conventions of the language. The ENDS
statement indicates the end of the segment and contains the same name as the SEGMENT
statement. The maximum size of a segment is 64K. The operand of a SEGMENT statement
may contain three types of options: alignment, combine, and class, coded in this format:
Alignment type. The align entry indicates the boundary on which the segment is
to begin. For the typical requirement, PARA, the segment aligns on a paragraph boundary,
so that the starting address is evenly divisible by 16, or 10H. Omission of an operand causes
the assembler to default to PARA.
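Paragraph alignment amounts to rounding an address up to the next multiple of 16. The Python sketch below shows the arithmetic; align_para is a hypothetical name, and the address 4B23H is an invented example.

```python
PARAGRAPH = 16  # 10H bytes


def align_para(address):
    """Round an address up to the next paragraph boundary."""
    return (address + PARAGRAPH - 1) & ~(PARAGRAPH - 1)


address = 0x4B23                 # invented example address
aligned = align_para(address)
print(f"{address:05X} -> {aligned:05X}")
# An aligned address ends in hex 0, so dropping that digit gives the segment value.
print(f"segment value {aligned >> 4:04X}")
```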
Combine type. The combine entry indicates whether to combine the segment
with other segments when they are linked after assembly (explained later under “Linking
the Program”). Combine types are STACK, COMMON, PUBLIC, and AT expression. For
example, the stack segment is commonly defined as
name SEGMENT PARA STACK
You may use PUBLIC and COMMON where you intend to combine separately as-
sembled programs when linking them. Otherwise, where a program is not to be combined
with other programs, you may omit this option or code NONE.
Class type. The class entry, enclosed in apostrophes, is used to group related seg-
ments when linking. This book uses the classes ‘code’ for the code segment (recommended
by Microsoft), ‘data’ for the data segment, and ‘stack’ for the stack segment.
The following example defines a stack segment with alignment, combine, and class
types:
name SEGMENT PARA STACK ‘Stack’
The partial program in Figure 4-1 illustrates SEGMENT statements with various
options.
PROC Directive
The code segment contains the executable code for a program. It also contains one or more
procedures, defined with the PROC directive. A segment that contains only one procedure
would appear as follows:
NAME      OPERATION  OPERAND  COMMENT
segname   SEGMENT    PARA
procname  PROC       FAR      ;One
                              ;procedure
                              ;within
                              ;the code
procname  ENDP                ;segment
segname   ENDS
The procedure name must be present, must be unique, and must follow naming conventions
for the language. The operand FAR in this case is related to program execution. When you
request execution of a program, the DOS program loader uses this procedure name as the
entry point for the first instruction to execute.
The ENDP directive indicates the end of a procedure and contains the same name as
the PROC statement to enable the assembler to relate the two. Since procedures must be
fully contained within segments, ENDP defines the end of the procedure before ENDS de-
fines the end of the segment.
The code segment may contain any number of procedures used as subroutines, each
with its own set of PROC and ENDP statements. Each additional PROC is usually coded
with (or defaults to) the NEAR operand; Chapter 7 covers this situation.
ASSUME Directive
A program uses the SS register to address the stack, the DS register to address the data seg-
ment, and the CS register to address the code segment. To this end, you have to tell the as-
sembler the purpose of each segment in the program. The directive for this purpose is
ASSUME, coded in the code segment as follows:
OPERATION OPERAND
ASSUME SS:stackname,DS:datasegname,CS:codesegname, ...
SS:stackname means that the assembler is to associate the name of the stack segment with
the SS register, and similarly for the other operands shown. The operands may appear in
any sequence. ASSUME may also contain an entry for the ES, such as ES:datasegname; if
your program does not use the ES register, you may omit its reference or code ES:NOTHING.
(Since MASM 6.0, the assembler automatically generates an ASSUME for the code
segment.)
Like other directives, ASSUME is just a message to help the assembler convert sym-
bolic code to machine code; you may still have to code instructions that physically load ad-
dresses in segment registers at execution time.
END Directive
As already mentioned, the ENDS directive ends a segment, and the ENDP directive ends a
procedure. An END directive ends an entire program. Its general format is:
OPERATION OPERAND
END [procname]
The operand may be blank if the program is not to execute; for example, you may want to
assemble only data definitions, or you may want to link the program with another (main)
module. In most programs, the operand contains the name of the first or only PROC desig-
nated as FAR, where program execution is to begin.
         PAGE    60,132
         TITLE   PO4ASM1 Skeleton of an .EXE Program
STACKSG  SEGMENT
STACKSG  ENDS
DATASG   SEGMENT
CODESG   SEGMENT PARA 'Code'
BEGIN    PROC    FAR
         ASSUME  SS:STACKSG,DS:DATASG,CS:CODESG
         MOV     AX,DATASG    ;Get address of data segment
         MOV     DS,AX        ;Store address in DS
LINE   EXPLANATION
1      The PAGE directive for this listing establishes 60 lines and 132 columns
       per page.
2      The TITLE directive identifies the program’s name as PO4ASM1.
3      Lines 3, 7, and 11 are comments that clearly set out the defined segments.
4-6    These statements define the stack segment, STACKSG (but not its contents
       in this example).
8-10   These statements define the data segment, DATASG (but not its contents).
12-21  These statements define the code segment, CODESG.
13-20  These statements define the code segment’s only procedure, named BEGIN.
       This procedure illustrates common initialization and exit requirements for
       an .EXE program. The two requirements for initializing are (1) notify the
       assembler which segments to associate with segment registers and (2) load
       the DS with the address of the data segment.
14     The ASSUME directive notifies the assembler to associate certain segments
       with certain segment registers, in this case, STACKSG with the SS,
       DATASG with the DS, and CODESG with the CS:
       ASSUME SS:STACKSG,DS:DATASG,CS:CODESG
15,16  The first MOV loads the address of the data segment into the AX register, and
       the second MOV copies the address from the AX into the DS. Two MOVs
       are required because no instruction can move data directly from memory to
       a segment register; you have to move the address from another register to the
       segment register. Thus the statement MOV DS,DATASG would be illegal.
       Chapter 5 discusses initializing segment registers in more detail.
18,19  These two instructions request an end to program execution and a return to
       DOS. A later section discusses them in more detail.
22     The END statement tells the assembler that this is the end of the program,
       and the BEGIN operand provides the entry point for subsequent program
       execution.
The sequence in which you define segments is usually unimportant. Figure 4-1 defines
them as follows:
Keep this point in mind: The program in the figure is coded in symbolic language. To
execute it, you have to use an assembler program and a linker to translate it into executable
machine code. In that case, it would become an .EXE program.
As described in Chapter 2, when DOS loads an .EXE program from disk into memory
for execution, it constructs a 256-byte (100H) PSP on a paragraph boundary in available in-
ternal memory and stores the program immediately following the boundary. DOS then
The DOS loader initializes the CS:IP and SS:SP registers, but not the DS and ES reg-
isters. However, your program normally needs the address of the data segment in the DS
(and often in the ES as well). As a consequence, you have to initialize the DS with the ad-
dress of the data segment, as shown by the two MOV instructions in Figure 4-1.
Now, even if this initialization is not clear at this point, take heart: Every .EXE pro-
gram has virtually identical initialization steps that you can duplicate each time you code
an assembly program.
The return code for normal completion of a program is usually 0 (zero). You may also code
the two MOVs as one statement (as shown in Figure 4-1):
MOV AX,4C00H ;Request normal exit
DOS function 4CH has superseded the original end operations INT 20H and INT
21H, function 00H.
page 60,132
TITLE PO4ASM1 (EXE) Move and add operations
FLDA DW 250
FLDB DW 125
FLDC DW ?
When loading a program from disk into memory for execution, the system loader sets
the actual addresses in the SS and CS registers, but, as shown by the first two MOV in-
structions, you have to initialize the DS (and ES) register.
We'll trace the assembly, linkage, and execution of this program in Chapter 5.
.386
Initialization of the data segment register could look like this, since on these proces-
sors the DS register is still 16 bits in size:
The STI, CLI, IN, and OUT instructions, available in real mode, are restricted in
protected mode: a program can execute them only if its privilege level permits I/O.
.MODEL memory-model
The memory model may be TINY, SMALL, MEDIUM, COMPACT, or LARGE. (Another
model, HUGE, need not concern us here.) The requirements for each model are:
TINY * *
SMALL 1 1
You may use any of these models for a stand-alone program (that is, a program that is not
linked to another program). The TINY model is intended for the exclusive use of .COM
programs, which have their data, code, and stack in one segment. The SMALL model requires
that code fit within a 64K segment and data fit within another 64K segment; this
model is suitable for most of the examples in the book. The .MODEL directive automatically
generates the required ASSUME statement.
The general formats (including the leading period) for the directives that define the
stack, data, and code segments are:
.STACK [size]
.DATA
.CODE [name]
Each of these directives causes the assembler to generate the required SEGMENT statement
and its matching ENDS. The default segment names (which you don’t have to define)
are STACK, _DATA, and _TEXT (for the code segment). The underline (or break) character
at the beginning of _DATA and _TEXT is intended. As the coding format indicates,
you may override the default name for the code segment. The default stack size is 1,024
bytes, which you may also override. You use these directives to identify where in the program
the three segments are to be located. Note, however, that the instructions you now use
to initialize the address of the data segment in the DS are:
MOV AX, @data
MOV DS, AX
Figure 4-2 gave an example of a program using conventionally defined segments. Figure
4-3 provides the same example, but this time using the simplified segment directives
.STACK, .DATA, and .CODE. The memory model is specified as SMALL in the fourth
line. The stack is defined as 64 bytes (32 words). Note that the assembler does not generate
conventional SEGMENT and ENDS statements, and you also don’t code an ASSUME
statement.
        page    60,132
        TITLE   PO4ASM2 (EXE) Move and add operations
        .STACK  64          ;Define stack
        .DATA               ;Define data
        MOV     AX,FLDA     ;Move 0250 to AX
        ADD     AX,FLDB     ;Add 0125 to AX
        MOV     FLDC,AX     ;Store sum in FLDC
As you'll see in the next chapter, the assembler handles programs coded with sim-
plified segment directives slightly differently from those using conventional segment
directives.
MASM 6.0 introduced the .STARTUP and .EXIT directives to simplify program initialization
and termination. .STARTUP generates the instructions to initialize the segment registers,
whereas .EXIT generates the INT 21H function 4CH instructions for exiting the
program. For purposes of learning assembly language, examples in this text code the full
sets of instructions and leave shortcuts to more experienced programmers.
DATA DEFINITION
As already discussed, the purpose of the data segment in an .EXE program is to define con-
stants, work areas, and input/output areas. The assembler permits definitions of items in
various lengths according to a set of directives that defines data. For example, DB defines
a byte and DW defines a word. A data item may contain an undefined (that is, uninitialized)
value, or it may contain a constant, defined either as a character string or as a numeric value.
Here is the general format for data definition:
[name]   Dn   expression
Name. A program that references a data item does so by means of a name. The
name of an item is otherwise optional, as indicated by the square brackets. The earlier sec-
tion, “Statements,” provides the rules for names.
Directive. The directives that define data items are DB (byte), DW (word), DD
(doubleword), DF (farword), DQ (quadword), and DT (tenbytes), each of which explicitly
indicates the length of the defined item.
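Expression. The expression in the operand may specify an uninitialized value, coded as a
question mark (?), or a constant. For an uninitialized item such as FLD1, when your
program begins execution, the initial value of the item is unknown to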
you. The normal practice before using this item is to move some value into it (any at all,
but it must fit the defined size).
You can also use the operand to define a constant, in which case the item (called FLD2 in
this discussion) begins execution containing that value. You can freely use this initialized
value throughout your program and can even change the contents of FLD2.
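The two cases can be sketched as follows. The names FLD1 and FLD2 come from the text, but the DB length, the constant 25, and the instructions that use the items are assumptions made for illustration.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
FLD1    DB      ?               ;Uninitialized byte: contents unknown at start
FLD2    DB      25              ;Byte initialized with decimal 25 (hex 19)
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize the DS
        MOV     DS,AX           ; register
        MOV     FLD1,10         ;Give FLD1 a known value before using it
        MOV     AL,FLD2         ;AL = 25
        ADD     FLD2,AL         ;FLD2 now contains 50: an initialized
                                ; item may still be changed
        MOV     AX,4C00H        ;End processing
        INT     21H
BEGIN   ENDP
        END     BEGIN
```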
An expression may contain multiple constant values separated by commas and lim-
ited only by the length of the line, as follows:
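The original example is not reproduced in this copy. A definition consistent with the discussion that follows (first constant 11, second constant 12, fourth constant 14) could look like this sketch, in which the DB length and the values after 14 are assumptions. Like the data-only program in Figure 4-4, it can be assembled to examine the listing but is not meant for execution.

```asm
        .MODEL  SMALL
        .DATA
;               FLD3+0  +1  +2  +3  +4  +5
FLD3    DB      11, 12, 13, 14, 15, 16  ;Six one-byte constants
        END
```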
The assembler defines these constants in adjacent bytes. A reference to FLD3 is to the first
one-byte constant, 11 (you could think of the first byte as FLD3+0), and a reference to
FLD3+1 is to the second constant, 12. For example, the instruction
MOV AL,FLD3+3
loads the value 14 (0EH) into the AL register. The expression also permits duplication of
constants in a statement of the general form
[name]   Dn   repeat-count DUP(expression) ...
DUP operations may be nested: an inner DUP that generates four copies of the digit 8
(8888), duplicated three times by an outer DUP, gives twelve 8s in all.
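To make the duplication forms concrete, here is a short sketch. Only the nested case is described in the text; the names WORK1 through WORK3 and the first two definitions are hypothetical. The module can be assembled to inspect the generated object code but is not meant for execution.

```asm
        .MODEL  SMALL
        .DATA
WORK1   DW      10 DUP(?)       ;Ten words, uninitialized
WORK2   DB      5 DUP(12)       ;Five bytes, each containing hex 0C
WORK3   DB      3 DUP(4 DUP(8)) ;Three copies of 08080808: twelve bytes
        END
```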
An expression may define and initialize a character string or a numeric constant.
Character Strings
Character strings are used for descriptive data such as people’s names and page titles. The
string is defined within single quotes, such as ‘PC’, or within double quotes, such as “PC”.
The assembler translates character strings into object code in normal ASCII format.
Strangely, DB is the only format that defines a character string exceeding two char-
acters and stores the characters in normal left-to-right sequence. Consequently, DB is the
conventional format for defining character data of any length. An example is
DB 'Character string'
The assembler stores the characters in ASCII format, without the apostrophes. If the string
must contain a single or double quote, you can define it in one of these ways:
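The original examples are missing from this copy. The following sketch shows the forms that MASM accepts for embedding a quote mark in a string; the names SHOP1, SHOP2, and QUOTE1 and the string contents are hypothetical.

```asm
        .MODEL  SMALL
        .DATA
SHOP1   DB      "Crazy Sam's CD Emporium"   ;Double quotes enclose the string,
                                            ; so a single quote is ordinary data
SHOP2   DB      'Crazy Sam''s CD Emporium'  ;Single quotes enclose the string; two
                                            ; single quotes store one apostrophe
QUOTE1  DB      'She said "PC"'             ;Single quotes enclose the string,
                                            ; so double quotes are ordinary data
        END
```

SHOP1 and SHOP2 generate identical object code.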
Numeric Constants
Numeric constants are used to define arithmetic values and memory addresses. The con-
stant is not defined within quotes, but is followed by an optional radix specifier, such as H
in the hexadecimal value 12H. For most of the data definition directives, the assembler con-
verts defined numeric constants to hexadecimal and stores the generated bytes in object
code in reverse sequence—from right to left. Following are the various numeric formats.
Decimal. Decimal format permits defining the decimal digits 0 through 9, op-
tionally followed by the radix specifier D, such as 125 or 125D. Although the assembler al-
lows you to define values in decimal format as a coding convenience, it converts your
decimal values to binary object code and represents them in hex. For example, a definition
of decimal 125 becomes hex 7D.
Hexadecimal. Hex format permits defining the hex digits 0 through F, followed
by the radix specifier H, which you can use to define binary values. Since the assembler expects
that a reference beginning with a letter is a symbolic name, the first digit of a hex constant
must be 0 to 9. Examples are 2EH and 0FD8H, which the assembler stores as 2E and
D80F, respectively. Note that the bytes in the second example are stored in reverse sequence.
Binary. Binary format permits defining the binary digits 0 and 1, followed by the
radix specifier B. The normal use for binary format is to distinguish values for the bit-han-
dling instructions AND, OR, XOR, and TEST.
Since the assembler converts all numeric values to binary (and represents them in
hex), definitions of decimal 12, hex C, and binary 1100 all generate the same value: binary
00001100 or hex 0C, depending on how you view the contents of the byte.
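A short sketch of the three formats side by side may help. The names VAL1 through VAL5 are hypothetical; the values are the ones used in the discussion above.

```asm
        .MODEL  SMALL
        .DATA
VAL1    DB      12              ;Decimal: one byte containing hex 0C
VAL2    DB      0CH             ;Hexadecimal: the same byte, hex 0C
VAL3    DB      00001100B       ;Binary: the same byte, hex 0C
VAL4    DB      125D            ;Decimal with explicit radix D: hex 7D
VAL5    DW      0FD8H           ;Leading 0 required; stored as D8 0F
        END
```

Assembling this module and examining the listing should show the same object code, 0C, for each of the first three items.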
Because the letters D and B act as both radix specifiers and hex digits, they may cause
some confusion. As a solution, MASM 6.0 introduced the use of T (as in ten) and Y (as in
binary) as radix specifiers for decimal and binary, respectively.
Real. The assembler converts a given real value—a decimal or hex constant
followed by the radix specifier R—into floating-point format for use with a numeric
coprocessor.
Be sure to distinguish between the use of character and numeric constants. A charac-
ter constant defined as DB ‘12’ generates two ASCII characters, represented as hex 3132.
A numeric constant defined as DB 12 generates a binary number, represented as hex 0C.
This text uses the conventional directives because of their commonly accepted usage.
The assembled program in Figure 4-4 provides examples of directives that define
character strings and numeric constants, with the generated object code on the left, which
you are urged to examine.
[Figure 4-4: assembled listing of P04DEFIN (EXE), "Define data items", coded with .MODEL SMALL and .DATA; the body of the listing is not reproduced, but its symbol table follows]
Symbols:
Name        Type      Value   Attr
FLD1DB      L BYTE    0000    _DATA
FLD1DD      L DWORD   004A    _DATA
FLD1DQ      L QWORD   0062    _DATA
FLD1DT      L TBYTE   007A    _DATA
FLD1DW      L WORD    0030    _DATA
FLD2DB      L BYTE    0001    _DATA
FLD2DD      L DWORD   004E    _DATA
FLD2DQ      L QWORD   006A    _DATA
FLD2DT      L TBYTE   0084    _DATA
FLD2DW      L WORD    0032    _DATA
FLD3DB      L BYTE    0002    _DATA
FLD3DD      L DWORD   0052    _DATA
FLD3DQ      L QWORD   0072    _DATA
FLD3DT      L TBYTE   008E    _DATA
FLD3DW      L WORD    0034    _DATA
FLD4DB      L BYTE    0003    _DATA
FLD4DD      L DWORD   005A    _DATA
FLD4DW      L WORD    0036    _DATA
FLD5DB      L BYTE    0004    _DATA   Length = 000A
FLD5DD      L DWORD   005E    _DATA
FLD5DW      L WORD    0040    _DATA   Length = 0005
FLD6DB      L BYTE    000E    _DATA
FLD7DB      L BYTE    001F    _DATA
FLD8DB      L BYTE    0024    _DATA
0 Warning Errors
0 Severe Errors
Figure 4-4 (continued)
Note that the object code for uninitialized values appears as hex
zeros. Since this program consists of only a data segment, it is not suitable for execution.
The DW directive defines items that are one word (two bytes) in length. A DW (or WORD)
numeric expression may define one or more one-word constants. The largest positive one-word
hex number is 7FFF; all “higher” numbers, 8000 through FFFF (where the sign bit is 1),
represent negative values. In terms of decimal numbers, the limits are +32,767 and -32,768.
The assembler converts DW numeric constants to binary object code (represented in
hex), but stores the bytes in reverse sequence. Consequently, a decimal value defined as
12345 converts to hex 3039, but is stored as 3930.
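The reversed byte sequence can be observed with a small test program such as the following sketch, in which the name AMOUNT and the choice of registers are assumptions. Tracing it under DEBUG should show the low-order byte at the lower address.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
AMOUNT  DW      12345           ;Hex 3039, stored in memory as 39 30
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize the DS
        MOV     DS,AX           ; register
        MOV     AX,AMOUNT       ;AX = 3039H (AH = 30H, AL = 39H)
        MOV     BL,BYTE PTR AMOUNT      ;BL = 39H, the byte stored first
        MOV     BH,BYTE PTR AMOUNT+1    ;BH = 30H, the byte stored second
        MOV     AX,4C00H        ;End processing
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

Loading the full word into AX restores the normal sequence automatically; the reversal matters only when a program addresses the individual bytes.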
In Figure 4-4, FLD1DW and FLD2DW define DW numeric constants. FLD3DW defines
the operand as an address, in this case the offset address of FLD7DB. The generated
object code is 001F (the R to the right means relocatable), and a check of the figure shows
that the offset address of FLD7DB (the leftmost column) is indeed 001F.
A DW character expression is limited to two characters, which the assembler reverses
in the object code, so that ‘PC’ would become ‘CP.’ If you think that DW is of limited use
for defining character strings, you’re right.
FLD4DW defines a table of five numeric constants. Note that the length of each con-
stant is one word (two bytes).
The DD directive defines items that are a doubleword (four bytes) in length. A DD (or
DWORD) numeric expression may define one or more constants, each with a maximum
of four bytes (eight hex digits). The largest positive doubleword hex number is 7FFFFFFF;
all “higher” numbers, 80000000 through FFFFFFFF (where the sign bit is 1), represent
negative values. In terms of decimal numbers, these maximums are +2,147,483,647 and
-2,147,483,648.
The assembler converts DD numeric constants to binary object code (represented in
hex), but stores the bytes in reverse sequence. Consequently, a decimal value defined as
12345678 converts to 00BC614EH, but is stored as 4E61BC00H.
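As a sketch of how a program using 16-bit registers might pick up such a doubleword, the following loads the two halves into the DX:AX pair. The name BIGNUM and the choice of registers are assumptions.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
BIGNUM  DD      12345678        ;Hex 00BC614E, stored as 4E 61 BC 00
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize the DS
        MOV     DS,AX           ; register
        MOV     AX,WORD PTR BIGNUM      ;AX = 614EH, the low-order word
        MOV     DX,WORD PTR BIGNUM+2    ;DX = 00BCH, the high-order word
        MOV     AX,4C00H        ;End processing
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

The low-order word comes first in memory because the whole doubleword, not just each word, is stored in reverse byte sequence.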
In Figure 4-4, FLD2DD defines a DD numeric constant, and FLD3DD defines two
numeric constants. FLD4DD generates the numeric difference between two defined ad-
dresses; in this case, the result is the length of FLD2DB.
A DD character expression is also limited to two characters and is as trivial as those
for DW. The assembler reverses the characters and left-adjusts them in the four-byte
doubleword, as shown in the object code for FLD5DD.
The DF directive defines a farword as six bytes. Its normal use is for the 80386 and later
processors.
The DQ directive defines items that are four words (eight bytes) in length. A DQ (or
QWORD) numeric expression may define one or more constants, each with a maximum of
eight bytes, or 16 hex digits. The largest positive quadword hex number is 7 followed by 15 Fs.
You issued DS:100 for the display because the loader set the DS with the address of
the PSP, and the data segment for this program is 100 bytes after that address. Later, when
you use DEBUG for .EXE programs that initialize the DS to the address of the data seg-
ment, you'll use DS:0 for displaying it.
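The EQU directive does not define a data item; instead, it gives a name to a value that the
assembler substitutes wherever the name appears. Suppose that a program contains the statement
TIMES EQU 10
The name, in this case TIMES, may be any name acceptable to the assembler. Now whenever
the word TIMES appears in an instruction or another directive, the assembler substitutes
the value 10. For example, the assembler converts the directive
FIELDA DB TIMES DUP(?)
to its equivalent, FIELDA DB 10 DUP(?). Similarly, if COUNTR is equated to the value 05,
an instruction may use the name as an operand:
MOV CX,COUNTR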
The assembler replaces COUNTR in the MOV operand with the value 05, making the
operand an immediate value, as if it were coded
MOV CX,05 ;Assembler substitutes 05
The advantage of EQU is that many statements may use the value defined by
COUNTR. If the value has to be changed, you need change only the EQU statement. Needless
to say, you can use an equated value only where a substitution makes sense to the assembler.
You can also equate symbolic names, as in the following code:
TOTALPAY DW 0
TP EQU TOTALPAY
MPY EQU MUL
The first EQU equates the nickname TP to the defined item TOTALPAY. For any instruc-
tion that contains the operand TP, the assembler replaces it with the address of TOTAL-
PAY. The second EQU enables a program to use the word MPY in place of the regular sym-
bolic instruction MUL.
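The uses of EQU described in this section can be gathered into one small program, sketched below. The equated names come from the text; the instructions that use them, and the arithmetic they perform, are assumptions chosen only to exercise each substitution.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
TIMES   EQU     10              ;An EQU generates no storage
COUNTR  EQU     05
FIELDA  DB      TIMES DUP(?)    ;Assembled as FIELDA DB 10 DUP(?)
TOTALPAY DW     0
TP      EQU     TOTALPAY        ;Nickname for TOTALPAY
MPY     EQU     MUL             ;Alternative name for the MUL instruction
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize the DS
        MOV     DS,AX           ; register
        MOV     CX,COUNTR       ;Assembled as MOV CX,05
        MOV     AX,TIMES        ;Assembled as MOV AX,10
        MPY     CX              ;Assembled as MUL CX: DX:AX = 10 x 5 = 50
        MOV     TP,AX           ;Stores 50 in TOTALPAY
        MOV     AX,4C00H        ;End processing
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

Changing the EQU for COUNTR and reassembling changes every instruction that uses the name, with no other edits required.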
MASM 6.0 introduced a TEXTEQU directive for text data.
KEY POINTS
• DW, DD, and DQ store numeric values in object code with the bytes in reverse
sequence.
• DB items are used for processing half registers (AL, BL, etc.), DW for full registers
(AX, BX, etc.), and DD for extended registers (EAX, EBX, etc.). Longer numeric
items require special handling.
QUESTIONS
4-1. Distinguish between a compiler and an assembler.
4-2. What is a reserved word in assembler language? Give two examples.
4-3. What are the two types of identifiers?
4-4. Determine which of the following names are valid: (a) PC_AT; (b) $50; (c) @$_Z; (d) 34B7;
(e) AX.
4-5. Distinguish between a directive and an instruction.
4-6. What commands cause the assembler (a) to print a heading at the top of a page of the program
listing and (b) to eject to a new page?
4-7. What is the purpose of each of the three segments described in this chapter?
4-8. The format for the SEGMENT directive is
OBJECTIVE
To cover the steps in assembling, linking, and executing
an assembly language program.
INTRODUCTION
This chapter explains the procedure for keying in an assembly language program and for
assembling, linking, and executing it. The symbolic instructions that you code in assembly
language are known as the source program. You use the assembler program to translate the
source program into machine code, known as the object program. Finally, you use the linker
program to complete the machine addressing for the object program, generating an exe-
cutable module.
The sections on assembling explain how to request execution of the assembler pro-
gram, which provides diagnostics (including any error messages) and generates the object
program. Also explained are details of the assembler listing and, in general terms, how the
assembler processes a source program.
The sections on linking explain how to request execution of the linker program so that
you can generate an executable module. Also explained are details of the generated link
map, as well as the diagnostics. Finally, a section explains how to request execution of the
executable module.
As it stands, the program is just a text file that cannot execute—you must first as-
semble and link it.
1. The assembly step involves translating the source code into object code and generat-
ing an intermediate .OBJ (object) file, or module. (You have already seen examples
of machine code and source code in earlier chapters.) One of the assembler’s tasks is
to calculate the offset for every data item in the data segment and every instruction in
the code segment. The assembler also creates a header immediately ahead of the gen-
erated .OBJ module; part of the header contains information about incomplete ad-
dresses. The .OBJ module is not quite in executable form.
2. The link step involves converting the .OBJ module to an .EXE (executable) machine
code module. One of the linker’s tasks is to combine separately assembled programs
into one executable module.
3. The last step is to load the program for execution. Since the loader knows where the
program is about to load, it is able to complete any addresses indicated in the header
that were left incomplete. The loader drops the header and creates a PSP immediately
before the program loaded in memory.
Figure 5-1 provides a chart of the steps involved in assembling, linking, and executing
a program.
[Figure 5-1 Steps in Assembly, Link, and Execute: (1) Assemble: the assembler translates the source program and creates an object program (.OBJ), optionally with Prog.LST and Prog.CRF files; (2) Link: the linker takes the object program and creates an executable program (.EXE); (3) Execute: load and execute the .EXE program]
• Options provides for such features as setting levels of warning messages and is explained
in Appendix D. Since the assembler’s defaults are usually adequate, you’ll
seldom need to use options.
• Source identifies the name of the source program, such as P05ASM1. The assembler
assumes the extension .ASM, so you need not enter it. You can also enter a disk drive
number if you don’t want to accept the current default drive.
• Object provides for a generated .OBJ file. The drive, subdirectory, and filename may
be the same as or different from those in the source.
• Listing provides for a generated .LST file that contains both the source and object
code. The drive, subdirectory, and filename may be the same as or different from
those in the source.
• Crossref generates a cross-reference file containing the symbols used in the program,
which you can use for a cross-reference listing. The extension is .CRF for MASM
and .XRF for TASM. The drive, subdirectory, and filename may be the same as or
different from those in the source.
You always enter the name of the source file, and you usually request an .OBJ file,
which is required for linking a program into executable form. You’ll probably often request
.LST files, especially when you want to examine the generated machine code. A .CRF file
is useful for very large programs where you want to see which instructions reference which
data items. Also, the .CRF request causes the assembler to generate line numbers for state-
ments in the .LST file to which the .CRF file refers. Later sections cover .LST and .CRF
files in detail.
Example 1: Specify source file P05ASM1 on drive D, and generate object, listing,
and cross-reference files. If a filename is to be the same as the one in the source, you need
not repeat it; a reference to drive number is sufficient to indicate a request for a file:
MASM/TASM D:P05ASM1,D:,D:,D:
Example 2: Generate only an object file. In this case, you may omit the reference to
the listing and cross-reference files and simply enter the command
MASM/TASM D:P05ASM1,D:
The assembler converts your source statements into machine code and displays any
errors on the screen. Typical errors include a name that violates naming conventions, an op-
eration that is spelled incorrectly (such as MOVE instead of MOV), and an operand con-
taining a name that is not defined. There are about 100 error messages, explained in the
assembler manual. Since there are many different assembler versions, we won’t attempt to
list the errors. The assembler attempts to correct some errors, but in any event, you should
reload your editor, correct the .ASM source program, and reassemble it.
ASSEMBLER LISTING OF CONVENTIONAL SEGMENT DEFINITIONS
Figure 5-2 provides the listing that the assembler produced under the name P05ASM1.LST.
The line width is 132 positions because of the PAGE entry. You can also print this listing
if your printer can compress the print line. Many impact printers have a switch that will
force compressed printing, or you could request your editor or word processor to print in
compressed mode. Another way is to use the DOS MODE command; for 132 characters per
line and six lines per inch, turn on the printer, key in the command MODE LPT1:132,6,
and request DOS PRINT.
Note at the top of the listing how the assembler has acted on the PAGE and TITLE
directives. None of the directives, including SEGMENT, PROC, ASSUME, and END, gen-
erates machine code, since they are just messages to the assembler.
At the extreme left is the number for each line. The second column shows the hex ad-
dresses of data fields and instructions. The third column shows the translated machine code
in hexadecimal format. To the right is the original source code.
76 Assembling, Linking, and Executing a Program Chapter 5
[Figure 5-2: assembled listing P05ASM1.LST; only the following fragments are reproduced]
   1                    page    60,132
   2            TITLE   P05ASM1 (EXE) Move and add operations
   3            ;---------------------------------------------
   4 0000       STACKSG SEGMENT PARA STACK 'Stack'
   5 0000  0020[        DW      32 DUP(0)
   6            0000
   7                 ]
   ...
Symbols:
Name      Type     Value   Attr
BEGIN     F PROC   0000    CODESG   Length = 0014
27 Source Lines
27 Total Lines
15 Symbols
0 Warning Errors
0 Severe Errors
For each of the three segments, the SEGMENT directive notifies the assembler to
align the segment on an address that is evenly divisible by hex 10—the statement itself
generates no machine code. Theoretically, each segment address begins at offset location
0000. Actually, when the program begins execution, the segment is stored in memory ac-
cording to an address that DOS loads in the segment register and is offset zero bytes from
that address.
Note that the stack, data segment, and code segment are separate areas, each with its
own offset value for data or instructions.
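Because the listing in Figure 5-2 survives here only in fragments, the following sketch of a source program may help in following the segment-by-segment discussion. The segment names, field names, and BEGIN come from the text; the class names 'Data' and 'Code', the comment lines, and the exact line layout are assumptions, so line numbers quoted elsewhere in the chapter may not match this sketch.

```asm
        page    60,132
TITLE   P05ASM1 (EXE) Move and add operations
;---------------------------------------------
STACKSG SEGMENT PARA STACK 'Stack'
        DW      32 DUP(0)       ;32 words = 64 bytes (40H)
STACKSG ENDS
;---------------------------------------------
DATASG  SEGMENT PARA 'Data'
FLDA    DW      250             ;Assembled as 00FA
FLDB    DW      125             ;Assembled as 007D
FLDC    DW      ?               ;Uninitialized word
DATASG  ENDS
;---------------------------------------------
CODESG  SEGMENT PARA 'Code'
BEGIN   PROC    FAR
        ASSUME  SS:STACKSG,DS:DATASG,CS:CODESG
        MOV     AX,DATASG       ;Offset 0000: set address of DATASG
        MOV     DS,AX           ;Offset 0003:  in DS register
        MOV     AX,FLDA         ;Offset 0005: move 0250 to AX
        ADD     AX,FLDB         ;Offset 0008: add 0125 to AX
        MOV     FLDC,AX         ;Offset 000C: store sum in FLDC
        MOV     AX,4C00H        ;Offset 000F: end processing
        INT     21H             ;Offset 0012
BEGIN   ENDP                    ;End of procedure (length 0014)
CODESG  ENDS                    ;End of segment
        END     BEGIN           ;End of program
```

The offsets in the comments follow from the instruction lengths (3, 2, 3, 4, 3, 3, and 2 bytes) and agree with the procedure length of 0014 shown in the symbol table.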
Stack Segment
The stack segment contains a DW (Define Word) directive that defines 32 words, each gen-
erating a zero value designated by (0). This definition of 32 words is a realistic size for a
stack because a large program may require many interrupts for input/output and calls to sub-
programs, all involving use of the stack. The stack segment ends at offset 0040H, which is
equivalent to decimal value 64 (32 words × 2 bytes).
If the stack size is too small to contain all the items pushed onto it, neither the assem-
bler nor the linker warns you, and the executing program may crash in an unpredictable way.
Data Segment
The program defines a data segment, DATASG, containing three defined values, all in DW
(Define Word) format. FLDA defines a word (two bytes) initialized with decimal value 250,
which the assembler has translated to 00FAH (shown on the left). FLDB defines a word
initialized with decimal value 125, assembled as 007DH. The actual storage values of these
two constants are, respectively, FA00 and 7D00, which you can check with DEBUG.
FLDC is coded as a DW with ? in the operand to define a word with an uninitialized
constant.
Code Segment
The program defines a code segment, CODESG, which contains the program’s executable
code, all in one procedure (PROC).
Three statements establish the addressability of the data segment:
• The ASSUME directive relates DATASG to the DS register. Note that the program
does not require the ES register, but some programmers define it as a standard practice.
ASSUME simply provides information to the assembler, which generates no machine
code for it.
• The first MOV instruction “stores” DATASG in the AX register. Now, an instruction
cannot actually store a segment in a register; the assembler simply recognizes an attempt
to load the address of DATASG. Note the machine code to the left: B8 ---- R.
The four hyphens mean that at this point the assembler cannot determine the address
of DATASG; the system determines this address only when the object program is
linked and loaded for execution. Since the system loader may locate a program
anywhere in memory, the assembler leaves the address open and indicates the fact
with an R; the DOS loader program is to replace (or relocate) the incomplete address
with the actual one.
• The second MOV instruction moves the contents of the AX register to the DS register.
Since there is no valid instruction for moving an immediate value, such as a segment
address, directly to the DS register, you have to code two instructions to initialize the DS.
The DOS loader automatically initializes the SS and CS when it loads a program for
execution, but it is your responsibility to initialize the DS, and the ES if required.
For the simplified segment directives, initialize the DS like this:
MOV AX,@data
MOV DS,AX
While all this business may seem unduly involved, at this point you really don’t have
to understand it. All programs in this book use a standard definition and initialization, and
you simply have to reproduce this code for each of your programs. To this end, store a skele-
ton assembly program on disk, and for each new program that you want to create, COPY
the skeleton program into a file with its correct name, and use your editor to complete the
additional instructions.
The first instruction after initializing the DS register is MOV AX,FLDA, which begins
at offset location 0005 and generates machine code A1 0000. The space between A1
(the operation) and 0000 (the operand) is only for readability. The next instruction is ADD
AX,FLDB, which begins at offset location 0008 and generates four bytes of machine code.
In this example, machine instructions are two, three, or four bytes in length.
The last statement in the program, END, contains the operand BEGIN, which relates
to the name of the PROC at offset 0000. This is the location in the code segment where the
program loader is to transfer control for execution.
Following the program listing are a Segments and Groups table and a Symbols table.
Symbols Table
The second table provides the names of data fields in the data segment (FLDA, FLDB, and
FLDC) and the labels applied to instructions in the code segment. For BEGIN (the only en-
try in the example), Type F PROC means far procedure. The Value column gives the off-
set for the beginning of the segment for names, labels, and procedures. The column headed
Attr (for attribute) provides the segment in which the item is defined.
Appendix D explains all the options for these tables. To cause the assembler to omit
the tables, code a /N option following the MASM command, that is, MASM /N.
As for the last three entries, @CPU identifies the processor, @FILENAME gives the
name of the program, and @VERSION shows the assembler version in the form n.nn.
Figure 4-3 showed how to code a program using the simplified segment directives. Figure
5-3 provides the assembled listing of that program. The first part of the symbol table under
“Segments and Groups” shows the three segments renamed by the assembler and listed
alphabetically:
Under the heading “Symbols” are names defined in the program or default names.
The simplified segment directives provide a number of predefined equates, which begin
with an @ symbol and which you are free to reference in a program. As well as @data,
they are:
You may use @code and @data in ASSUME and executable statements, such as
MOV AX, @data.
TWO-PASS ASSEMBLER
Many assemblers make two passes through a source program in order to resolve forward
references to addresses not yet encountered in the program. During pass 1, the assembler
reads the entire source program and constructs a symbol table of names and labels used in
the program, that is, names of data fields and program labels and their relative locations
(offsets) within the segment. You can see such a symbol table immediately following the
assembled program in Figure 5-3, where the offsets for FLDA, FLDB, and FLDC are 0000,
0002, and 0004 bytes, respectively. Although the program defines no instruction labels,
they would appear in the code segment with their own offsets. Pass 1 determines the amount
of code to be generated for each instruction. MASM starts generating object code in pass
1, whereas TASM does it in pass 2.
During pass 2, the assembler uses the symbol table that it constructed in pass 1. Now
that it “knows” the length and relative position of each data field and instruction, it can
complete the object code for each instruction. It then produces, if requested, the various
object (.OBJ), list (.LST), and cross-reference (.CRF) files.
[Figure 5-3: assembled listing of P05ASM2 (EXE), "Move and add operations", coded with simplified segment directives; only the following fragments are reproduced]
        page    60,132
TITLE   P05ASM2 (EXE) Move and add operations
        .MODEL  SMALL
        .STACK  64              ;Define stack
        .DATA                   ;Define data
0000  00FA      FLDA    DW      250
0002  007D      FLDB    DW      125
0004  0000      FLDC    DW      ?
   ...
Symbols:
Name      Type     Value   Attr
BEGIN     F PROC   0000    _TEXT    Length = 0014
0 Warning Errors
0 Severe Errors
A potential problem in pass 1 is forward references: A jump instruction in the code
segment may reference a label, but the assembler has not yet encountered its definition.
MASM constructs object code based on what it supposes is the length of each generated
machine language instruction. If there are any differences between pass 1 and pass 2
concerning instruction lengths, MASM issues an error message “Phase error between
passes.” Such errors are relatively rare, and if one appears, you’ll have to trace its cause
and correct it.
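A forward reference of the kind just described can be sketched as follows. The labels L10 and L20 and the countdown logic are hypothetical; the point of the sketch is the SHORT operator, which states in advance that the target lies within reach of a two-byte jump, so the assembler need not guess the instruction's length in pass 1.

```asm
        .MODEL  SMALL
        .STACK  64
        .CODE
BEGIN   PROC    FAR
        MOV     CX,3            ;Count for the loop
        JMP     SHORT L20       ;Forward reference: L20 is not yet defined,
                                ; but SHORT fixes the length at two bytes
L10:    DEC     CX              ;Count down
        JNZ     L10             ;Backward reference: offset of L10 is known
        MOV     AX,4C00H        ;End processing
        INT     21H
L20:    JMP     L10             ;Backward reference: no guess is needed
BEGIN   ENDP
        END     BEGIN
```

Backward references such as JNZ L10 pose no difficulty, since the label's offset is already in the symbol table when the assembler reaches the jump.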
Since version 6.0, MASM does a more effective job of handling instruction lengths,
taking as many passes through the file as necessary.
LINKING AN OBJECT PROGRAM
Once your program is free of error messages, your next step is to link the object module,
P05ASM1.OBJ, that was produced by the assembler and that contains only machine code.
The linker performs the following functions:
• Combines, if requested, more than one separately assembled module into one executable
program, such as two or more assembly programs or an assembly program
with a C program.
• Generates an .EXE module and initializes it with special instructions to facilitate its
subsequent loading for execution.
Once you have linked one or more .OBJ modules into an .EXE module, you may ex-
ecute the .EXE module any number of times. But whenever you need to make a change in
the program, you must correct the source program, assemble it into another .OBJ module,
and link the .OBJ module into an .EXE module. Even if initially these steps are not entirely
clear, you will find that with only a little experience, they become automatic.
You may convert many .EXE programs to .COM programs. See Chapter 7 for details.
The linker version for Microsoft is LINK, whereas the Borland version is TLINK.
You can key in LINK or TLINK with a command line or by means of prompts. (Since
MASM 6.0, the ML command provides for both assembling and linking.) This section
shows how to link using a command line; see Appendix D for using prompts. The command
line for linking is
• Objfile identifies the object file generated by the assembler. The linker assumes the
extension .OBJ, so you need not enter it. The drive, subdirectory, and filename may
be the same as or different from those in the source.
• Exefile provides for generating an .EXE file. The drive, subdirectory, and filename
may be the same as or different from those in the source.
• Mapfile provides for generating a file with an extension .MAP that indicates the relative
location and size of each segment and any errors that LINK has found. A typical
error is the failure to define a stack segment. Entering CON (for console) tells the
linker to display the map on the screen (instead of writing it on disk) so that you can
view the map immediately for errors.
• Libraryfile provides for the libraries option, which you don’t need at this early stage
of assembly language programming.
This example links the object file P05ASM1.OBJ that was generated by the earlier
assembly. The linker is to write the .EXE file on drive D, display the map, and ignore the
library option:
LINK D:P05ASM1,D:,CON
If the filename is to be the same as that of the source, you need not repeat it: the reference
to drive number is sufficient to indicate a request for the file. Appendix D supplies other
options.
• The stack is the first segment and begins at offset zero bytes from the start of the program.
Since it is defined as 32 words, it is 64 bytes long, as its length (40H) indicates.
• The data segment begins at the next paragraph boundary, offset 40H.
• The code segment begins at the next paragraph boundary, offset 50H. Some assemblers
rearrange the segments into alphabetical order.
• Program entry point 0005:0000, which is in the form “relative (not absolute) segment:offset,”
refers to the address of the first executable instruction. In effect, the relative
starting address is at segment 5[0], offset 0 bytes, which corresponds to the
segment boundary at 50H. The program loader uses this value when it loads the program
into memory for execution.
At this stage, the only error that you are likely to encounter is entering wrong file-
names. The solution is to restart with the link command.
• The code segment is now the first segment and begins at offset zero bytes from the
start of the program.
• The data segment begins at the next word boundary, offset 14H.
• The stack begins at the next word boundary, offset 20H.
• The program entry point is now 0000:0000, which means that the relative location of
the code segment begins at segment 0, offset 0.
EXECUTING A PROGRAM
Having assembled and linked a program, you can now (at last!) execute it. If the .EXE file
is in the default drive, you could cause DOS to load it for execution by entering
P05ASM1.EXE or P05ASM1
If you omit typing the file extension, DOS assumes it is .EXE (or .COM). However, since
this program produces no visible output, it is suggested that you run it under DEBUG instead
and step through its execution with trace commands. Key in the following, including
the extension .EXE:
DEBUG D:P05ASM1.EXE
DEBUG loads the .EXE program module and displays its hyphen prompt. To view
the stack segment, key in
D SS:0
The stack contains all zeros because it was initialized that way. To view the data segment,
key in
D DS:0
The operation displays the three data items as FA 00 7D 00 00 00, with the bytes for each
word in reverse sequence. To view the code segment, key in
D CS:0
Compare the displayed machine code with that of the code segment in the assembled listing:
B8----8ED8A10000 ...
In this case, the assembled listing does not accurately show the machine code, since the as-
sembler did not know the address for the operand of the first instruction. You can now de-
termine this address by examining the displayed code.
Key in R to view the registers, and trace through program execution with successive
T commands. As you step through the program, note the contents of the registers. When
you reach the last instruction, you can use L to reload and rerun the program or Q to quit
the DEBUG session.
CROSS-REFERENCE LISTING
The assembler generates an optional .CRF or .XRF file that you can use to produce a cross-
reference listing of a program’s identifiers, or symbols. However, you still have to convert
this file to a properly sorted cross-reference file. A program on the assembler disk performs
this function: CREF for Microsoft or TCREF for Borland. You can key in CREF or TCREF
with a command line or by means of prompts. This section uses a command line; see Appendix D
for using prompts. The command to convert the cross-reference file is
CREF/TCREF xreffile,reffile
• xreffile identifies the cross-reference file generated by the assembler. The program assumes
the extension, so you need not enter it. You can also enter a disk drive number.
• reffile provides for generating a .REF file. The drive, subdirectory, and filename may
be the same as or different from those in the source.
The Listing
Figure 5-4 contains the cross-reference listing produced by CREF for the program in Figure 5-2.
The symbols in the first column are in alphabetic order. The numbers in the second
column, shown as n#, indicate the lines in the .LST file where the symbols are defined.
Numbers to the right of this column are line numbers showing where the symbol is referenced.
For example, CODESG is defined in line 17 and is referenced in lines 19 and 29.
FLDC is defined in line 14 and referenced in line 25+, where the “+” means its value is
modified.
[Figure 5-4: cross-reference listing produced by CREF]
Generated Files
Assembling a number of programs may use a lot of disk space. You can safely delete .OBJ,
.CRF, and .LST files. Keep .ASM source programs in case of further changes and .EXE
files for executing the programs.
ERROR DIAGNOSTICS
The assembler provides diagnostics for any programming errors that violate its rules. The
program in Figure 5-5 is the same as the one in Figure 5-2, except that it has a number of
intentional errors inserted for illustrative purposes. The program was run under MASM;
TASM generates a similar error listing. Here are the errors, as coded:
LINE  EXPLANATION
14    FLDC requires an operand.
19    ASSUME does not relate the SS to STACKSG, although the assembler has
      not detected this omission.
20    DATSEG should be spelled DATASG.
21    DX should be coded as DS, although the assembler does not know that this
      is an error.
23    AS should be coded as AX.
25    FLDD should be coded as FLDC.
28    Correcting the other errors will cause this diagnostic to disappear.

 1                              page 60,132
 2                      TITLE   P05ASM3 (EXE) Illustrate assembly errors
 3                      ;-----------------------------------------------
 4 0000                 STACKSG SEGMENT PARA STACK 'Stack'
 5 0000  0020[                  DW      32 DUP(0)
 6           0000
 7                  ]
 8
 9 0040                 STACKSG ENDS
10                      ;-----------------------------------------------
11 0000                 DATASG  SEGMENT PARA 'Data'
12 0000  00FA           FLDA    DW      250
13 0002  007D           FLDB    DW      125
14 0004                 FLDC    DW
p05asm3.ASM(11): error A2027: Operand expected
15 0004                 DATASG  ENDS
16                      ;-----------------------------------------------
17 0000                 CODESG  SEGMENT PARA 'Code'
18 0000                 BEGIN   PROC    FAR
19                              ASSUME  CS:CODESG,DS:DATASG
20 0000  A1 0000 U              MOV     AX,DATSEG       ;Address of DATASG
p05asm3.ASM(17): error A2009: Symbol not defined: DATSEG
21 0003  8B D0                  MOV     DX,AX           ; in DS register
22
23                              MOV     AS,FLDA         ;Move 0250 to AX
p05asm3.ASM(20): error A2009: Symbol not defined: AS
24 0005  03 06 0002 R           ADD     AX,FLDB         ;Add 0125 to AX
25 0009  A3 0000 U              MOV     FLDD,AX         ;Store sum in FLDC
p05asm3.ASM(22): error A2009: Symbol not defined: FLDD
26 000C  B8 4C00                MOV     AX,4C00H        ;Exit to DOS
27 000F  CD 21                  INT     21H
28 0011                 BEGIN   ENDP
p05asm3.ASM(25): error A2006: Phase error between passes
29 0011                 CODESG  ENDS
30                              END     BEGIN
The last error message, “Phase error between passes,” occurs when addresses generated
in pass 1 of a two-pass assembler differ from those of pass 2. To isolate an obscure error,
use the /D option for MASM to list both the pass 1 and the pass 2 files, and compare
the offset addresses.
KEY POINTS
• Both MASM and TASM provide a command line for assembling, including (at least)
the name of the source program. MASM also provides prompts for entering options.
• The assembler converts a source program to an .OBJ file and generates optional listing
and cross-reference files.
• The Segments and Groups table following an assembler listing shows any segments
and groups defined in the program. The Symbols table shows all symbols (data names
and instruction labels).
• The linker (LINK or TLINK) converts an .OBJ file to an executable .EXE file. You
may link using a command line or by means of prompts (LINK only).
• The simplified segment directives generate the names _DATA for the data segment,
STACK for the stack segment, and _TEXT for the code segment. They also generate
a number of predefined equates.
• The CREF (or TCREF) program produces a useful cross-reference listing.
QUESTIONS
5-1. Code the command line to assemble a source program named DISCOUNT.ASM with files
.LST, .OBJ, and .CRF. Assume that the source program and assembler are in drive C.
5-2. Code the LINK or TLINK command line to link DISCOUNT.OBJ from Question 5-1.
5-3. Code the commands for DISCOUNT.EXE from Question 5-2 for the following: (a) execution
through DEBUG; (b) direct execution from DOS.
5-4. Give the purpose of each of the following files: (a) file.ASM; (b) file.CRF; (c) file.LST; (d)
file.EXE; (e) file.OBJ; (f) file.MAP.
5-5. Code the two instructions to initialize the DS register. Assume that the name of the data seg-
ment is DATSEG.
5-6. Write an assembly program using conventional segment definitions for the following: (a) Move
immediate value hex 40 to the AL register; (b) shift the AL contents one bit left (code SHL
AL,1); (c) move immediate value hex 22 to the BL; (d) multiply AL by BL (code MUL BL).
Remember the instructions required to end program execution. The program does not need to
define or initialize the data segment. Be sure to COPY a skeleton program and use your editor
to develop the program. Assemble and link. Use DEBUG to trace and to check the code segment
and registers.
5-7. Revise the program in Question 5-6 for simplified segment directives. Assemble and link it, and
compare the object code, symbol tables, and link map with those of the original program.
5-8. Add a data segment to the program in Question 5-6 for the following:
• Define a one-byte item (DB) named FIELDA containing hex 40 and another named FIELDB
containing hex 22.
• Define a two-byte item (DW) named FIELDC with no constant.
• Move the contents of FIELDA to the AL register, and shift left one bit.
• Multiply the AL by FIELDB (code MUL FIELDB).
• Move the product in the AX to FIELDC.
Assemble, link, and use DEBUG to test the program.
5-9. Revise the program in Question 5-8 for simplified segment directives. Assemble and link it, and
compare the object code, symbol tables, and link map with those of the original program.
CHAPTER 6
Processor Instructions and
Addressing
Objective
INTRODUCTION
This chapter introduces the processor instruction set, and then describes the basic addressing
formats that are used throughout the rest of the book. The instructions formally covered
in this chapter are MOV, MOVSX, MOVZX, XCHG, LEA, INC, DEC, and INT. You
can also define a constant in an instruction operand as an immediate value.
Finally, the chapter explains address alignment and the segment override prefix.
The following is a list of the instructions for the 8086 processor family, arranged by
category. Although the list seems formidable, many of the instructions are rarely
needed.
Arithmetic
• DEC: Decrement by 1
• DIV: Unsigned Divide
• IDIV: Signed (Integer) Divide
• IMUL: Signed (Integer) Multiply
• INC: Increment by 1
• MUL: Unsigned Multiply
• NEG: Negate
• SBB: Subtract with Borrow
• SUB: Subtract Binary Values
ASCII-BCD Conversion
Bit Shifting
• RCL: Rotate Left Through Carry
• RCR: Rotate Right Through Carry
• ROL: Rotate Left
• ROR: Rotate Right
• SAL: Shift Algebraic Left
• SAR: Shift Algebraic Right
• SHL: Shift Logical Left
• SHR: Shift Logical Right
• SHLD/SHRD: Shift Double Precision (80386 and later)
Comparison
• BSF/BSR: Bit Scan (80386 and later)
• BT/BTC/BTR/BTS: Bit Test (80386 and later)
• CMP: Compare
• CMPS: Compare String
• TEST: Test Bits
Data Transfer
Flag Operations
• CLC: Clear Carry Flag
• CLD: Clear Direction Flag
• CLI: Clear Interrupt Flag
• CMC: Complement Carry Flag
• LAHF: Load AH from Flags
• POPF: Pop Flags off Stack
• PUSHF: Push Flags onto Stack
• SAHF: Store Contents of AH in Flags
• STC: Set Carry Flag
• STD: Set Direction Flag
• STI: Set Interrupt Flag
Input/Output
• IN: Input Byte or Word
• OUT: Output Byte or Word
Logical Operations
• AND: Logical AND
• NOT: Logical NOT
• OR: Logical OR
• XOR: Exclusive OR
Looping
• LOOP: Loop until Complete
• LOOPE/LOOPZ: Loop While Equal or Loop While Zero
• LOOPNE/LOOPNZ: Loop While Not Equal or Loop While Not Zero
Processor Control
• ESC: Escape
• HLT: Enter Halt State
• LOCK: Lock Bus
• NOP: No Operation
• WAIT: Put Processor in Wait State
Stack Operations
• POP: Pop Word off Stack
• POPA: Pop All General Registers (80286 and later)
• PUSH: Push onto Stack
• PUSHA: Push All General Registers (80286 and later)
String Operations
• CMPS: Compare String
• LODS: Load String
• MOVS: Move String
• REP: Repeat String
• REPE/REPZ: Repeat While Equal or Repeat While Zero
• REPNE/REPNZ: Repeat While Not Equal or Repeat While Not Zero
• SCAS: Scan String
• STOS: Store String
Transfer (Conditional)
• INTO: Interrupt on Overflow
• JA/JNBE: Jump If Above or Jump If Not Below or Equal
• JAE/JNB: Jump If Above or Equal or Jump If Not Below
• JB/JNAE: Jump If Below or Jump If Not Above or Equal
• JBE/JNA: Jump If Below or Equal or Jump If Not Above
• JC/JNC: Jump If Carry or Jump If No Carry
• JCXZ: Jump If CX is Zero
• JE/JZ: Jump If Equal or Jump If Zero
• JG/JNLE: Jump If Greater or Jump If Not Less or Equal
• JGE/JNL: Jump If Greater or Equal or Jump If Not Less
• JL/JNGE: Jump If Less or Jump If Not Greater or Equal
• JLE/JNG: Jump If Less or Equal or Jump If Not Greater
• JNE/JNZ: Jump If Not Equal or Jump If Not Zero
Transfer (Unconditional)
• CALL: Call a Procedure
• INT: Interrupt
• IRET: Interrupt Return
• JMP: Unconditional Jump
• RET: Return
• RETN/RETF: Return Near or Return Far
Type Conversion
• CBW: Convert Byte to Word
• CDQ: Convert Doubleword to Quadword (80386 and later)
• CWD: Convert Word to Doubleword
• CWDE: Convert Word to Extended Doubleword (80386 and later)
OPERANDS
An operand provides a source of data for an instruction. Some instructions, such as CLC
and RET, do not require an operand, whereas other instructions may have one or two
operands. Where there are two operands, the second operand is the source, which contains
either the data to be delivered (immediate) or the address (of a register or in memory) of the
data. The source data is unchanged by the operation. The first operand is the destination,
which contains data in a register or in memory and which is to be processed.
operand1, operand2
Let’s now examine how the operand can affect the addressing of data.
Register Operands
For this type, the register provides the name of any one of the 8-, 16-, or 32-bit registers.
Depending on the instruction, the register may be coded in the first operand, the second
operand, or both:
WORDX   DW      ?
        MOV     WORDX,BX        ;Register in second operand
Processing data between registers is the fastest type of operation, since there is no ref-
erence to memory.
Immediate Operands
In immediate format, the second operand contains a constant value or an expression. The
destination field in the first operand defines the length of the data and may be a register or
a memory location. Here are some examples:
SAVE    DB      ?
BYTE1   DB      0
The last two examples use square brackets as index specifiers to indicate a reference
to memory. (The offset is combined with the address in the DS.) The omission of
square brackets, as in MOV BX,38B0H, indicates an immediate value; note the significant
difference.
The last example increments the byte in memory at offset 2F0H (the offset combined
with the DS address). Since the operand indicates only a starting memory location, we need
the BYTE PTR modifier here to define the length.
In the following, a data item acts as an offset address in an instruction operand:
TABLEX DB 25 DUP(?)
The first MOV uses an index specifier to access the fourth byte from TABLEX. The sec-
ond MOV uses a plus operator for exactly the same effect.
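The two equivalent forms just described can be sketched as follows. This is an illustration only: the use of the CL register and the surrounding program skeleton (simplified segment directives, the label BEGIN) are assumptions, not taken from the original example.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
TABLEX  DB      25 DUP(?)       ;Table of 25 bytes
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize DS
        MOV     DS,AX
        MOV     CL,TABLEX[3]    ;Fourth byte, using an index specifier
        MOV     CL,TABLEX+3     ;Fourth byte, using a plus operator
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

Both MOVs reference the same location, offset TABLEX+3, because the first byte of the table is at TABLEX+0.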
DATAFLD DB ?
The effect of the two MOVs is the same as coding MOV DATAFLD,0, although the uses
for indexed addressing are usually not so trivial. The following related instruction moves
zero to a location two bytes immediately following DATAFLD:
MOV BYTE PTR [BX+2],0 ;Move 0 to DATAFLD+2
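As a sketch of the indirect addressing discussed here, the following loads the offset of DATAFLD into the BX and then stores zeros through it. Two assumptions are made for illustration: DATAFLD is defined as three bytes so that DATAFLD+2 stays inside the field, and BYTE PTR is coded so that the assembler knows the length of the immediate move.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
DATAFLD DB      3 DUP(?)        ;Three bytes (assumed length)
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize DS
        MOV     DS,AX
        MOV     BX,OFFSET DATAFLD       ;Offset of DATAFLD into BX
        MOV     BYTE PTR [BX],0         ;Move 0 to DATAFLD
        MOV     BYTE PTR [BX+2],0       ;Move 0 to DATAFLD+2
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```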
You may also combine registers in an indirect address. Thus [BX+SI] means the address
in BX plus the address in the SI.
Note that any reference in square brackets to the BX, DI, SI, or BP register implies
an indirect operand, and the system treats the contents of the register as an offset address.
Here are a few more examples:
MOV BL,[BX] ;DS:BX
TABLEX DB 25 DUP(?)
MOV TABLEX[DI],CL
moves an address into the EBX that consists of the contents of (the ECX times 2) plus the
contents of (the ESP plus 4).
Here are four examples of valid MOV operations by category, given the following
data items:
BYTEVAL DB ?
WORDVAL DW ?
1. Immediate Moves
MOV BYTEVAL,25 ;Immediate-to-memory, direct
2. Register Moves
MOV BYTEVAL,BH ;Register-to-memory, direct
MOV AX,WORDVAL[BX] ;Memory-to-register, indirect
MOVE-AND-FILL INSTRUCTIONS
A limitation of the MOV instruction is that the destination must be the same length as the
source, such as byte to byte and word to word. On the 80386 and later processors, the
MOVSX and MOVZX (move and fill) instructions facilitate transferring data from a byte
or word source to a word or doubleword destination. Here is the general format for MOVSX
and MOVZX:
MOVSX, for use with signed arithmetic values, moves a byte or word to a word or
doubleword destination and fills the sign bit (the leftmost bit of the source) into leftmost
bits of the destination. MOVZX, for use with unsigned numeric values, moves a byte or
word to a word or doubleword destination and fills zero bits into leftmost bits of the desti-
nation. As an example, consider moving a byte containing 1011 0000 to a word; the result
in the destination word depends on the choice of instruction:
BYTEVAL DB ?
WORDVAL DW ?
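The difference between the two instructions can be sketched with the byte 1011 0000 mentioned above. The initial value given to BYTEVAL, the choice of the CX and DX registers, and the program skeleton are assumptions for illustration; the program needs an 80386 or later processor to execute.

```asm
        .MODEL  SMALL
        .386                    ;Allow MOVSX and MOVZX
        .STACK  64
        .DATA
BYTEVAL DB      10110000B       ;Byte with the sign bit on
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize DS
        MOV     DS,AX
        MOVSX   CX,BYTEVAL      ;CX = 1111 1111 1011 0000 (FFB0H)
        MOVZX   DX,BYTEVAL      ;DX = 0000 0000 1011 0000 (00B0H)
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

MOVSX copies the sign bit (1) into the eight leftmost bits of the word, whereas MOVZX fills them with zeros.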
IMMEDIATE OPERANDS
In the following example of an immediate operand, the instruction
MOV AX,0123H
moves the immediate constant 0123H to the AX register. The three-byte object code for this
instruction is B82301, where B8 means “move an immediate value to the AX register” and
the following two bytes contain the value itself (2301H, in reverse-byte sequence). Many
instructions provide for two operands; the first operand may be a register or memory loca-
tion, and the second operand may be an immediate constant.
The use of an immediate operand provides more efficient processing than defining a
numeric constant in the data segment and referencing it in the operand of the MOV, as, for
example, in the following:
Data segment: AMT1 DW 0123H ;Define AMT1 as word
the assembler expands the immediate operand to two bytes, 0025H, and stores the object
code as 2500H.
The 80386 and later processors permit four-byte (doubleword) immediate operands,
such as in
MOV EAX,12345678H ;Move doubleword
Immediate Formats
An immediate constant may be any valid defined format. Here are some examples:
Hexadecimal: 0123H
[Figure 6-1: P06IMMED (EXE), example of immediate operands (coded for assembly only, not for execution)]
MOV, ADD, and SUB are three of many instructions that allow immediate operands.
Figure 6-1 gives examples of these instructions. The .386 directive allows the assembler to
recognize the reference to the EBX register. You don’t need an 80386 or later processor to
assemble this statement, but you do need one to execute it. Since the example is not intended
for execution, it does not define a stack or initialize the DS register.
Processing items longer than the capacity of a register involves additional coding,
covered in later chapters.
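The following is a minimal sketch of immediate operands with MOV, ADD, and SUB. It is not a reproduction of Figure 6-1; the field names FLDA and FLDB and the constants are assumed for illustration, and the EBX instruction requires an 80386 or later processor to execute.

```asm
        .MODEL  SMALL
        .386                    ;Allow the reference to EBX
        .STACK  64
        .DATA
FLDA    DB      ?               ;Byte item (assumed name)
FLDB    DW      ?               ;Word item (assumed name)
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize DS
        MOV     DS,AX
        MOV     BX,275          ;Move immediate (decimal)
        ADD     BX,0123H        ;Add immediate (hex)
        SUB     BL,01000000B    ;Subtract immediate (binary)
        MOV     EBX,0           ;Move immediate (80386 and later)
        MOV     FLDA,25         ;Immediate to a byte in memory
        MOV     FLDB,0123H      ;Immediate to a word in memory
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

In each case the destination in the first operand determines the length of the immediate value that the assembler generates.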
The XCHG instruction performs another type of data transfer, but rather than copy the data
from one location to another, XCHG swaps the two data items. The general format for
XCHG is
Valid XCHG operations involve exchanging data between two registers and between a reg-
ister and memory. Here are examples:
WORDX DW ?
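As a sketch of the two kinds of exchange, assuming an initial value for WORDX and arbitrary contents for the AX and BX:

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
WORDX   DW      1234H           ;Assumed initial value
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize DS
        MOV     DS,AX
        MOV     AX,0011H
        MOV     BX,0022H
        XCHG    AX,BX           ;Registers: AX = 0022H, BX = 0011H
        XCHG    BX,WORDX        ;Register and memory:
                                ; BX = 1234H, WORDX = 0011H
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```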
The LEA instruction is useful for initializing a register with an offset address. In fact, a more
descriptive name for this instruction would be “Load Offset Address.” The general format
for LEA is
A common use for LEA is to initialize an offset in the BX, DI, or SI register for indexing an
address in memory. We’ll be doing a lot of that throughout this book. Here’s an example:
SAVBYTE DB ?
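A minimal sketch of this use of LEA follows. The table name DATATBL and its contents are hypothetical; the byte is moved by way of the AL because MOV cannot transfer directly from memory to memory.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
DATATBL DB      25 DUP(0)       ;Table (hypothetical name)
SAVBYTE DB      ?
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize DS
        MOV     DS,AX
        LEA     BX,DATATBL      ;Load offset address of table
        MOV     AL,[BX]         ;First byte of table to AL
        MOV     SAVBYTE,AL      ;Save the byte
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

For a simple data name, MOV BX,OFFSET DATATBL loads the same offset address.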
INC and DEC are convenient instructions for incrementing and decrementing the contents
of registers and memory locations by 1. The general format for INC and DEC is
[label:] INC/DEC {register/memory}
Note that these instructions require only one operand. Depending on the result, the opera-
tions clear or set the OF, SF, and ZF flags, which conditional jump instructions may test for
minus, zero, or plus.
[Program listing P06MOVE (EXE): extended move operations. The data segment defines the nine-byte fields NAME1 and NAME2, which hold the character strings 'ABCDEFGHI' and 'JKLMNOPQR'.]
Since these fields are each nine bytes long, more than a simple MOV instruction is required.
The program contains a number of new features.
In order to step through NAME1 and NAME2, the routine initializes the CX register
to 9 (the length of the two fields) and uses the SI and DI index registers. Two LEA in-
structions load the offset addresses of NAME1 and NAME2 into the SI and DI as follows:
LEA SI,NAME1 ;Load offset addresses
LEA DI,NAME2 ; of NAME1 and NAME2
The program uses the addresses in the SI and DI registers to move the first byte of NAME1
to the first byte of NAME2. The square brackets around SI and DI in the MOV operands
mean that the instruction is to use the offset address in the given register for accessing the
memory location. Thus
means “Use the offset address in SI (NAME1+0) to move the referenced byte to the AL
register.” And the instruction
MOV [DI],AL
means “Move the contents of the AL to the offset address referenced by DI (NAME2+0).”
The program has to repeat these two MOV instructions nine times, once for each character
in the respective fields. To this end, it uses an instruction that we have not yet explained:
JNE (Jump if Not Equal).
Two INC instructions increment the SI and DI registers by 1, and DEC decrements
the CX by 1. DEC also sets or clears the Zero flag, depending on the result in the CX; if the
contents are not zero, there are still more characters to move, and JNE jumps back to the
label B20 to repeat the move instructions. And since the SI and DI have been incremented
by 1, the next MOVs reference NAME1+1 and NAME2+1. The loop continues in this
fashion until it has moved nine characters in all, up through moving NAME1+8 to
NAME2+8.
(You might want to key in this program, assemble and link it, and use DEBUG to
trace it. Note the effect on the registers, the instruction pointer, and the stack. Use D DS:0
to view the changes to NAME2.)
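The routine described above can be sketched as the following complete program. It is a reconstruction from the description, not the original listing: the label B20 and the field names come from the text, whereas the assignment of the two strings to NAME1 and NAME2 and the use of simplified segment directives are assumptions.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
NAME1   DB      'ABCDEFGHI'     ;Nine-byte source field
NAME2   DB      'JKLMNOPQR'     ;Nine-byte destination field
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize segment
        MOV     DS,AX           ; registers
        MOV     ES,AX
        MOV     CX,09           ;Initialize to move 9 characters
        LEA     SI,NAME1        ;Load offset addresses
        LEA     DI,NAME2        ; of NAME1 and NAME2
B20:
        MOV     AL,[SI]         ;Get character from NAME1
        MOV     [DI],AL         ;Store it in NAME2
        INC     SI              ;Increment to next character
        INC     DI
        DEC     CX              ;Decrement count
        JNE     B20             ;Count not zero? Yes, repeat
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

When the loop ends, the CX contains zero and NAME2 contains the same nine characters as NAME1.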
• Decrements the stack pointer by 2 and pushes the contents of the flags register onto
the stack.
• Clears the interrupt and trap flags.
• Decrements the stack pointer by 2 and pushes the CS register onto the stack.
• Decrements the stack pointer by 2 and pushes the instruction pointer onto the stack.
• Causes the required operation to be performed.
To return from an interrupt, the routine issues an IRET (interrupt return), which pops
the registers off the stack and returns to the instruction immediately following the INT in
your program.
Since the preceding process is entirely automatic, your only concerns are to define a
stack large enough for the necessary pushing and popping and to use the appropriate INT
operations. Starting with Chapter 9, we’ll be making considerable use of the INT instruction.
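As a small sketch of INT in use, the following program asks DOS to display one character and then to end processing. The choice of DOS function 02H (display the character in the DL) and of the character itself is an assumption for illustration; function 4CH is the exit used throughout these chapters.

```asm
        .MODEL  SMALL
        .STACK  64              ;Stack used by INT for its pushes
        .CODE
BEGIN   PROC    FAR
        MOV     AH,02H          ;DOS function: display character
        MOV     DL,'A'          ;Character to display
        INT     21H             ;Transfer to DOS, then return here
        MOV     AX,4C00H        ;DOS function: exit
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

Each INT pushes the flags, the CS, and the instruction pointer, which is why even a program this small defines a stack.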
ALIGNMENT OF ADDRESSES
Since the 8086 and 80286 have a 16-bit (word) data bus, they execute faster if accessed
words begin on an even-numbered (word) address. Consider a situation in which off-
sets 0012H and 0013H contain the word 63 A7H. The processor can access the full word
at offset 0012H directly into a register. But the word could begin on an odd-numbered ad-
dress, such as 0013H:
Memory contents:   xx    63    A7    xx
Offset:           0012  0013  0014  0015
In this case, the processor has to perform two accesses. First, it accesses the bytes at 0012H
and 0013H and delivers the byte from 0013H (63) to the AL register. Then, it accesses the
bytes at 0014H and 0015H and delivers the byte from 0014H (A7) to the AH register. The
AX now contains A763H.
You don’t have to perform any special programming for even or odd locations, nor
do you have to know whether an address is even or odd. The accessing operation automat-
ically reverses a word from memory into a register so that it resumes its correct sequence.
The 80386 and later processors have a 32-bit data bus and, accordingly, prefer align-
ment of referenced items on addresses evenly divisible by four (a doubleword address).
(Technically, the 486 and Pentium processors prefer alignment on a 16-byte (paragraph)
boundary.)
Assembly language has an ALIGN directive that you can use to align items on bound-
aries. For example, ALIGN 2 aligns on a word boundary, and ALIGN 4 aligns on a dou-
bleword boundary. Also, since the beginning of the data segment is always on a paragraph
boundary, you could organize your data first with doubleword values, then with word val-
ues, and, finally, with byte values. However, the 80386 and later processors execute at such
rapid speed that you’ll probably never notice the effects of forcing alignment.
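The effect of ALIGN can be sketched with a data segment that begins on a paragraph boundary. The field names are hypothetical, and the segment is coded for assembly only; the offsets in the comments follow from the lengths of the items.

```asm
DATASG  SEGMENT PARA 'Data'
FLDA    DB      ?               ;Offset 0000
        ALIGN   2               ;Skip to next word boundary
FLDB    DW      ?               ;Offset 0002 (offset 0001 is skipped)
FLDC    DB      ?               ;Offset 0004
        ALIGN   4               ;Skip to next doubleword boundary
FLDD    DD      ?               ;Offset 0008 (0005-0007 are skipped)
DATASG  ENDS
        END
```

Defining FLDD first, then FLDB, and then the byte items would give the same alignment without any skipped bytes.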
as the ES or, on the 386 and later, the FS or GS. A good example would be a large table of
data loaded from disk into memory.
You can use any instruction to process data in the other segment, but you must iden-
tify the appropriate segment register. Let’s say that the address of the other segment is in
the ES register, and the BX contains an offset address within that segment. Suppose the requirement
is to move two bytes (a word) from that location to the CX register:
MOV CX,ES:[BX]
The coding of ES: indicates an override operator that means “Replace the normal use of the
DS segment register with that of the ES.”
The next example moves a byte value from the AL into this other segment, at an offset
formed by the value in the DI plus 24:
MOV ES:[DI+24],AL
The assembler generates the machine language code with the override operator inserted as
a one-byte prefix (26H) immediately preceding the instruction, just as if you had coded the
instructions as
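As a sketch, the two overrides described in this section can be assembled by themselves to examine the generated prefix. The object code in the comments is what the standard 8086 encoding gives for these operands; the fragment is coded for assembly only, not for execution.

```asm
        .MODEL  SMALL
        .CODE
        MOV     CX,ES:[BX]      ;Object code: 26 8B 0F
        MOV     ES:[DI+24],AL   ;Object code: 26 88 45 18
        END
```

In both cases the first byte, 26H, is the ES override prefix, and the remaining bytes are the same as for the instruction coded without ES:.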
KEY POINTS
• An operand provides a source of data for an instruction. Some instructions do not require
an operand, whereas other instructions may have one or two operands.
• Where there are two operands, the second operand is the source, which contains either
immediate data or the address (of a register or of memory) of the data. The first
operand is the destination, which contains data in a register or in memory that is to
be processed.
• In immediate format, the second operand contains a constant value or an expression.
Immediate operands should match the size of a register: a one-byte constant with a one-byte
register (AL, BH) and a one-word constant with a one-word register (AX, BX).
• In direct memory format, one of the operands references a memory location, and the
other operand references a register.
• Indirect addressing makes use of the computer’s capability for segment:offset addressing.
The registers used are BX, DI, SI, and BP, coded within square brackets as
an index operator. The BX, DI, and SI are associated with the DS as DS:BX, DS:DI,
and DS:SI, respectively, for processing data in the data segment. The BP is associated
with the SS as SS:BP, for handling data in the stack.
• You may combine registers in an indirect address as [BX+SI], which means the address
in BX plus the address in the SI.
• The MOV instruction transfers (or copies) data referenced by the address in the second
operand to the address in the first operand.
• The LEA instruction is useful for initializing a register with an offset address.
• INC and DEC are convenient instructions for incrementing and decrementing by 1
the contents of registers and memory locations.
• The INT instruction interrupts processing of your program, transfers to DOS or BIOS
for specified action, and returns to your program to resume processing.
QUESTIONS
6-1. For an instruction with two operands, which is the source and which is the destination?
6-2. (a) In what significant way do the following instructions differ in execution?
(b) For the second MOV, one operand is in square brackets. What is the name of this feature?
6-3. (a) In what significant way do the following instructions differ in execution?
MOV BX,0
MOV [BX],0
(b) For the second MOV, what sort of addressing is involved with the first operand?
6-4. Explain the operation of the instruction
6-5. The following statement contains an error; that is, something is needed for the assembler to
translate it:
BYTE1 DB ?
BYTE2 DB ?
WORD1 DW ?
6-7. Code the following as instructions with immediate operands: (a) Store 320 in the AX; (b) com-
pare FLDB to zero; (c) add hex 40 to BX; (d) subtract hex 40 from CX; (e) shift FLDB one bit
left; (f) shift the CH one bit right.
6-8. Code one instruction that swaps the contents of a word named WORD1 with the CX.
6-9. Code the instruction to set the BX with the (offset) address of an item named TABLEX.
6-10. What, in general terms, is the purpose of the INT instruction?
6-11. (a) How does the INT instruction affect the stack? (b) How does the IRET instruction affect
the stack?
6-12. Code, assemble, link, and use DEBUG to test the following program:
• Define byte items named BYTEA and BYTEB (containing any values) and a word item named
WORDC (containing zero).
• Move the contents of BYTEA to the AL.
• Add the contents of BYTEB to the AL.
• Move the immediate value 25H to the BL.
• Exchange the contents of the AL and BL.
• Multiply the contents of the BL by the AL (MUL BL).
• Store the product in the AX into WORDC.
CHAPTER 7
Writing .COM Programs
OBJECTIVE
To explain the purpose and uses of .COM programs and how
to prepare an assembly language program for that format.
INTRODUCTION
Up to now, we have written, assembled, and executed only .EXE programs. The linker au-
tomatically generates a particular format for an .EXE program and, when storing it on disk,
precedes it with a special header block that is at least 512 bytes long. (Chapter 24 provides
details of header blocks.)
You can also generate a .COM program for execution. One example of a commonly
used .COM program is COMMAND.COM. The advantages of .COM programs are that
they are smaller than comparable .EXE programs and are more easily adapted to act as res-
ident programs. The .COM format has its roots in distant pre-DOS days, when program size
was limited to 64K.
Some significant differences between a program that is to execute as .EXE and one that is
to execute as .COM involve the program’s size, segmentation, and initialization.
Program Size
An .EXE program may be virtually any size, whereas a .COM program is restricted to one
segment and a maximum of 64K, including the PSP. The PSP is a 256-byte (100H) block
that DOS inserts immediately preceding a .COM and .EXE program when it loads them in
memory. The 64K limit is a general rule; you may get around it by coding additional SEG-
MENT AT statements, a feature that is outside the scope of this chapter. A .COM program
is always smaller than its counterpart .EXE program; one reason is that a 512-byte header
block that precedes an .EXE program on disk does not precede a .COM program. (Don’t
confuse the header block with the PSP.) A .COM program is an absolute image of the ex-
ecutable program, but with no relocatable address information.
Segments
The use of segments for .COM programs is significantly different from (and easier than) that
for .EXE programs.
Stack segment. You define an .EXE program with a stack segment, whereas a
.COM program automatically generates a stack. Thus, when you write an assembly language
program that is to be converted to .COM format, you omit the definition of the stack. If the
64K program size is not large enough, the assembler establishes the stack outside of the program,
in higher memory.
Data segment. An .EXE program usually defines a data segment and initializes
the DS register with the address of that segment. Since the data for a .COM program is de-
fined within the code segment, you don’t define the data segment either. As you'll see, there
are simple ways to handle this situation.
Code segment. An entire .COM program combines the PSP, stack, data segment,
and code segment into one code segment, in a maximum of 64K bytes.
Initialization
When DOS loads a .COM program for execution, it automatically initializes all segment
registers with the address of the PSP. Since the CS and DS registers will contain the cor-
rect initial segment address, your program does not have to load them.
Because addressing begins at an offset of 100H bytes from the beginning of the PSP,
code an ORG directive as ORG 100H immediately following the code SEGMENT or
.CODE statement. The ORG directive tells the assembler to begin generating the object code
at an offset of 100H bytes past the start of the PSP, where the actual .COM program begins.
Microsoft Conversion
For both .EXE and .COM programs under Microsoft MASM, you assemble and produce
an .OBJ file and then link the .OBJ file to produce an .EXE program. If you wrote the pro-
gram to run as an .EXE program, you can now execute it. If you wrote the program to run
as a .COM program, the linker produces a warning message that the program has no stack segment.
You may ignore this message, since there is supposed to be no defined stack. A program
named EXE2BIN converts Microsoft .EXE programs to .COM programs. (Actually, it con-
verts .EXE programs to a .BIN (binary) file; the program name means “convert EXE-to-
BIN,” but you should name your output file extension .COM.) Assuming that EXE2BIN is
in the default drive, and that a linked file named CALC.EXE is in drive D, type
EXE2BIN D:CALC D:CALC.COM
Since the first operand of the command always references an .EXE file, do not code the
.EXE extension. The second operand could be a name other than CALC.COM. If you omit
the extension, EXE2BIN assumes BIN, which you would have to rename subsequently as
.COM in order to execute the program. (Someone, somewhere, must have thought this was
a good idea.)
Borland Conversion
As long as your source program is coded according to .COM requirements, you can con-
vert your object program directly into a .COM program. Use the /T option for TLINK:
TLINK /T D:CALC
        page    60,132
TITLE   P07COM1 .COM program to move and add
CODESG  SEGMENT PARA 'Code'
        ASSUME  CS:CODESG,DS:CODESG,SS:CODESG,ES:CODESG
        ORG     100H            ;Start at end of PSP
BEGIN:  JMP     MAIN            ;Jump past data
FLDA    DW      250             ;Data definitions
FLDB    DW      125
FLDC    DW      ?
MAIN    PROC    NEAR
        MOV     AX,FLDA         ;Move 0250 to AX
        ADD     AX,FLDB         ;Add 0125 to AX
        MOV     FLDC,AX         ;Store sum in FLDC
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
MAIN    ENDP
CODESG  ENDS
        END     BEGIN
• INT 21H, function 4CH, ends processing and exits to DOS. You may also use the
RET instruction for this purpose.
Here are the steps to convert the program for MASM and TASM:
MASM TASM
MASM D:EXCOM1,D: TASM D:EXCOM1,D:
The .EXE and .COM programs are 792 bytes and 24 bytes in size, respectively. The
difference is largely caused by the 512-byte header block stored at the beginning of .EXE
modules. Type DEBUG D:EXCOM1.COM to trace the execution of the .COM program up
to (but not including) the last instruction.
You may also use simplified segment directives when coding a .COM program, as
shown in Figure 7-2. Once again, define only a code segment, not a stack or data segment.
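        page  60,132
TITLE   P07COM2 COM program to move and add data
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN:  JMP   MAIN
FLDA    DW    250
FLDB    DW    125
FLDC    DW    ?
MAIN    PROC  NEAR
        MOV   AX,FLDA           ;Move 0250 to AX
        ADD   AX,FLDB           ;Add 0125 to AX
        MOV   FLDC,AX           ;Store sum in FLDC
        MOV   AX,4C00H          ;Return to DOS
        INT   21H
MAIN    ENDP
        END   BEGIN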
an idea as to the space available for a stack. Most of the smaller programs in this book are
in .COM format, which should be easily distinguished from .EXE format.
DEBUGGING TIPS
The omission of only one .COM requirement may cause a program to fail. If EXE2BIN
finds an error, it simply notifies you that it cannot convert the file, but does not provide a
reason. Check the SEGMENT, ASSUME, and END statements. If you omit ORG 100H,
the program incorrectly references data in the PSP, with unpredictable results.
If you run a .COM program under DEBUG, use D CS:100 to view the data and
instructions. Do not follow the program through its termination; instead, use DEBUG’s
Q command.
An attempt to execute the .EXE module of a program written as .COM will fail.
KEY POINTS
QUESTIONS
7-1. What is the maximum size of a .COM program?
7-2. For a source program to be converted to .COM format, what segments can you define?
7-3. Why do you code ORG 100H at the beginning of a program to be converted to .COM format?
7-4. How does the system handle the fact that you do not define a stack for a .COM program?
7-5. A source program is named SAMPLE.ASM. Provide the commands to convert it to .COM
format under (a) MASM; (b) TASM.
7-6. Revise the program in Question 6-12 for .COM format. Assemble, link, and execute it under
DEBUG.
CHAPTER 8
Program Logic and Control
OBJECTIVES
INTRODUCTION
Up to this chapter, the programs we have examined have executed in a straight line,
with one instruction sequentially following another. Seldom, however, is a programmable
problem that simple. Most programs consist of a number of loops in which a series of steps
repeats until reaching a specific requirement and various tests to determine which of sev-
eral actions to take. A common practice is to test whether a program is to end execution.
Requirements such as these involve a transfer of control to the address of an instruc-
tion that does not immediately follow the one currently executing. A transfer of control may
be forward, to execute a new series of steps, or backward, to reexecute the same steps.
Certain instructions can transfer control outside the normal sequential flow by adding an
offset value to the IP. Following are the instructions introduced in this chapter, by category:
Jnnn OR RCR/ROR
INSTRUCTION LABELS
The JMP, Jnnn (conditional jump), and LOOP instructions require an operand that refers to
the label of an instruction. The following example jumps to A90, which is the label given
to a MOV instruction:
        JMP   A90
        ...
A90:    MOV   AH,00
        ...
The label of an instruction, such as A90:, is terminated by a colon to give it the near
attribute—that is, the label is inside a procedure in the same code segment. Watch out:
Omission of the colon is a common error. Note that an address label in an instruction
operand (such as JMP A90) does not have a colon.
You can also code a label on a separate line as
A90:
MOV AH, 00
In both cases, the address of A90 references the first byte of the MOV instruction.
A JMP operation within the same segment may be short or near (or, technically, far
if the destination is a procedure with the FAR attribute). On its first pass through a source
program, the assembler generates the length of each instruction. However, a JMP instruction
may be either two or three bytes long. A JMP operation to a label within -128 to +127
bytes is a short jump. The assembler generates one byte for the operation (EB) and one byte
for the operand. The operand acts as an offset value that the computer adds to the IP register
when executing the program. The operand is a signed byte: 00H through 7FH represent 0 to
+127, and 80H through FFH represent -128 to -1. The assembler
may have already encountered the designated operand (a backward jump) within
-128 bytes, as in
A50:
JMP A50
In this case, the assembler generates a two-byte machine instruction. A JMP that exceeds
-128 to +127 bytes becomes a near jump, for which the assembler generates different
machine code (E9) and a two-byte operand (8086/80286) or four-byte operand (80386 and
later). In a forward jump, the assembler has not yet encountered the designated operand:
JMP A90
A90:
Since some assembler versions don’t know at this point whether the jump is short or near,
they automatically generate a three-byte instruction. However, provided that the jump re-
ally is short, you can use the SHORT operator to force a short jump and a two-byte in-
struction by coding
JMP SHORT A90
A90:
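The three cases described above can be put side by side. The following is only a sketch: the labels A10, A20, and A30 and the NOP filler instructions are hypothetical, and the number of bytes reserved for the plain forward jump depends on the assembler version. The operand bytes in the comments follow from the rule that the operand is the distance from the end of the jump instruction to the target.

```asm
; Sketch only: labels and filler instructions are illustrative assumptions.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN   PROC  NEAR
        JMP   A20               ;Forward: some assemblers reserve 3 bytes
A10:
        NOP                     ;Filler instruction (1 byte)
        JMP   SHORT A30         ;Forward, forced short: 2 bytes (EB 03)
A20:
        NOP                     ;Filler instruction (1 byte)
        JMP   A10               ;Backward, within -128 bytes: 2 bytes (EB FA)
A30:
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
        END   BEGIN
```

In the backward jump, the target A10 is already known, so the assembler can choose the two-byte form by itself. The SHORT operator matters only for the forward reference to A30.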
        page  60,132
TITLE   P08JUMP (COM) Use of JMP for looping
        .MODEL SMALL
        .CODE
        ORG   100H
MAIN    PROC  NEAR
        MOV   AX,01             ;Initialize AX,
        MOV   BX,01             ; BX, and
        MOV   CX,01             ; CX to 01
A20:
        ADD   AX,01             ;Add 01 to AX
        ADD   BX,AX             ;Add AX to BX
        SHL   CX,1              ;Double CX
        JMP   A20               ;Jump to A20 instr'n
MAIN    ENDP
        END   MAIN
• Add 1 to AX
• Add AX to BX
• Double the value in CX
At the end of the loop, the instruction JMP A20 transfers control to the instruction labeled
A20. The effect of repeating the loop causes AX to increase as 1, 2, 3, 4, ...; BX to increase
according to the sum of the digits 1, 3, 6, 10, ...; and CX to double as 1, 2, 4, 8, .... Since
this loop has no exit, processing is endless—usually not a good idea.
In the program, A20 is -9 bytes from the JMP. You can confirm this distance by
examining the object code for the JMP: EBF7. EB is the machine code for a short JMP and
hex F7 is the two's complement notation for -9. The JMP operation adds the F7 (technically,
FFF7, since the IP is a word in size) to the IP, which contains the offset 0112H of the
instruction following the JMP:
                        DECIMAL   HEX
Instruction pointer:      274     0112
JMP operand:               -9     FFF7
Jump address:             265  (1)0109
The jump address is calculated to be 0109H, where the carry out of 1 is ignored (as a check
of the program listing for the offset address of A20 shows). The operation changes the off-
set value in the IP and flushes the instruction queue. Since this is a backward jump, the
operand FFF7 is negative, whereas the operand for a forward jump would be a positive value.
As a useful experience, key in the program, assemble it, link it, and convert it to .COM
format. No data definitions are required, since immediate operands generate all the data.
Use DEBUG to trace the .COM module for a number of iterations. Once the AX contains
08, the BX and CX will be incremented to 24H (decimal 36) and 80H (decimal 128), re-
spectively. Key in Q to quit DEBUG.
As used in Figure 8-1, the JMP instruction causes an endless loop. But a routine is more
likely to loop a specified number of times or until it reaches a particular condition. The
LOOP instruction, which serves this purpose, requires an initial value in the CX register.
For each iteration, LOOP automatically deducts 1 from the CX. If the value in the CX is
then zero, control drops through to the following instruction; if the value in the CX is nonzero,
control jumps to the operand address. The distance must be a short jump, within -128 to
+127 bytes. For an operation that exceeds this limit, the assembler issues a message such
as “relative jump out of range.” The general format for LOOP is
[label:] LOOP short-address
The program in Figure 8-2 illustrates the use of LOOP and performs the same operation
as the program in Figure 8-1, except that it terminates after 10 loops. A MOV instruction
initializes the CX with the value 10. Since LOOP uses the CX, this program now uses
the DX in place of CX for doubling the initial value 1. The LOOP instruction replaces JMP
A20 and, for faster processing, INC AX (increment the AX by 1) replaces ADD AX,01.
Just as for JMP, the machine code operand contains the distance from the end of the
LOOP instruction to the address of A20, which is added to the IP.
As a useful exercise, modify your copy of Figure 8-1 for these changes, and assemble,
link, and convert the program to .COM. Use DEBUG to trace through the entire 10
loops. Once the CX is reduced to zero, the contents of AX, BX, and DX are, respectively,
000BH, 0042H, and 0400H. Press Q to quit DEBUG.
There are two variations on the LOOP instruction, both of which also decrement the
CX by 1. LOOPE/LOOPZ (loop while equal or zero) continues looping as long as the value
in the CX is nonzero and the zero flag is set (ZF = 1). LOOPNE/LOOPNZ (loop while not
equal or not zero) continues looping as long as the value in the CX is nonzero and the zero
flag is clear (ZF = 0).
        page  60,132
TITLE   P08LOOP (COM) Illustration of LOOP
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN   PROC  NEAR
        MOV   AX,01             ;Initialize AX,
        MOV   BX,01             ; BX, and
        MOV   DX,01             ; DX to 01
        MOV   CX,10             ;Initialize
                                ; number of loops
A20:
        INC   AX                ;Add 01 to AX
        ADD   BX,AX             ;Add AX to BX
        SHL   DX,1              ;Double DX
        LOOP  A20               ;Decrement CX,
                                ; loop if nonzero
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
        END   BEGIN
Neither LOOP nor its LOOPxx variations affects any flags in the flags register, which
would be changed by other instructions within the loop routine. As a result, if the routine
contains no instructions that affect the ZF (zero) flag and the flag is clear on entry, then
using LOOPNE/LOOPNZ would be equivalent to using LOOP.
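As an illustration of LOOPNE, the sketch below scans a 10-byte field for the first blank. It is not from the original figures; the names NAMEFLD, B20, and B90 and the field contents are assumptions. The loop ends either because CMP found a blank (ZF = 1) or because the CX reached zero, and the JNE that follows tells the two cases apart because LOOPNE itself leaves the flags as CMP set them.

```asm
; Sketch only: NAMEFLD, B20, and B90 are hypothetical names.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN:  JMP   MAIN              ;Jump past data
NAMEFLD DB    'ASSEMBLER '      ;10-byte field ending in a blank
MAIN    PROC  NEAR
        MOV   CX,10             ;Maximum number of bytes to examine
        LEA   BX,NAMEFLD-1      ;Point one byte before the field
B20:
        INC   BX                ;Advance to the next byte
        CMP   BYTE PTR [BX],' ' ;Blank? CMP sets ZF to 1 if equal
        LOOPNE B20              ;Decrement CX; repeat if CX nonzero and ZF = 0
        JNE   B90               ;ZF = 0 here means no blank was found
        MOV   BYTE PTR [BX],'*' ;ZF = 1: BX addresses the first blank
B90:
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
MAIN    ENDP
        END   BEGIN
```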
FLAGS REGISTER
The remaining material in this chapter requires a more detailed knowledge of the flags reg-
ister. This register contains 16 bits, which various instructions set to indicate the status of
an operation. In all cases, a flag remains set until another instruction changes it. The flags
register for real mode contains the following commonly used bits:
CF (Carry flag). Contains a carry (0 or 1) from the high-order (leftmost) bit fol-
lowing arithmetic operations and some shift and rotate operations.
PF (Parity flag). Contains a check of the low-order eight bits of data operations.
The parity flag is not to be confused with the parity bit and is seldom of concern in con-
ventional programming. An odd number of 1-bits clears the flag to 0, and an even number
of 1-bits sets it to 1.
AF (Auxiliary carry flag). Is concerned with arithmetic on ASCII and BCD
packed fields. An arithmetic operation that causes a carry out of bit 3 (the fourth bit from
the right) of a register one-byte operation sets this flag.
SF (Sign flag). Set according to the sign (high-order or leftmost bit) after an
arithmetic operation: Positive clears the flag to 0, and negative sets it to 1. JG and JL test
this flag.
TF (Trap flag). When set, causes the processor to execute in single-step mode,
that is, one instruction at a time under user control. You already set this flag when you en-
tered the T command in DEBUG, and that’s about the only place where you’d expect to
find its use.
OF (Overflow flag). Indicates that the carry into the high-order (leftmost) sign bit
differs from the carry out of it following a signed arithmetic operation; that is, the
signed result is too large for the destination.
The result of a CMP operation affects the AF, CF, OF, PF, SF, and ZF flags, although
you do not have to test these flags individually. The following code tests the BX register
for a zero value:
        CMP   BX,00             ;Compare BX to zero
        JZ    B50               ;Jump if zero to B50
        ...                     ;(action if nonzero)
B50:    ...                     ;Jump point if BX is zero
If the BX contains zero, CMP sets the ZF to 1 and may or may not change the settings
of other flags. The JZ (Jump if Zero) instruction tests only the ZF flag. Since ZF contains
1 (meaning a zero condition), JZ transfers control (jumps) to the address indicated by
operand B50.
Note that the operation compares the first to the second operand; for example, is the
value of the first operand higher than, equal to, or lower than the value of the second
operand? The next section provides the various ways of transferring control based on tested
conditions.
        DEC   CX
        JNZ   A20
DEC and JNZ perform exactly what LOOP does. DEC decrements the CX by 1 and sets or
clears the zero flag in the flags register. JNZ then tests the setting of the zero flag; if the CX
is nonzero, control jumps to A20, and if the CX is zero, control drops through to the next
instruction. (The jump operation also flushes the processor’s prefetch instruction queue.)
Although LOOP has limited uses, in this example it is more efficient than using the DEC
and JNZ instructions.
Just as for JMP and LOOP, the machine code operand contains the distance from the
end of the JNZ instruction to the address of A20, which is added to the instruction pointer.
For the 8086/286, the distance must be a short jump, within -128 to +127 bytes. If an
operation exceeds this limit, the assembler issues a message “relative jump out of range.” The
80386 and later processors provide for 8-bit (short) or 32-bit (near) offsets that allow
reaching any address within a segment.
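On the 8086/286, a common way around the short-jump limit is to reverse the condition and jump over an unconditional JMP, which is not limited to the short range. The sketch below is an assumption-laden illustration, not one of the book's figures: the labels A20 and A30 are hypothetical, and 200 NOP bytes stand in for a loop body too long for a conditional jump to span.

```asm
; Sketch only: A20, A30, and the filler bytes are illustrative assumptions.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN   PROC  NEAR
        MOV   CX,03             ;Number of repetitions
A20:
        DB    200 DUP(90H)      ;200 NOP bytes stand in for a long loop body
        DEC   CX                ;Decrement the count; sets or clears ZF
        JZ    A30               ;Reversed test: leave the loop when CX is zero
        JMP   A20               ;Near JMP reaches back more than 128 bytes
A30:
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
        END   BEGIN
```

Coding JNZ A20 directly here would draw the "relative jump out of range" message, since A20 is more than 128 bytes behind the jump.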
compares the contents of the AX to the contents of the BX. For unsigned data, the AX value
is larger; for signed data, however, the AX value is smaller because of the negative sign.
You can express each of these tests in one of two symbolic operation codes. Choose
the one that is clearest and most descriptive. For example, although JB and JNAE generate
the same object code, the positive test JB is easier to understand than the negative test JNAE.
The jumps for testing equal or zero (JE/JZ) and for testing not equal or zero
(JNE/JNZ) are included in the lists for unsigned and signed data, since an equal or zero con-
dition occurs regardless of the presence of a sign.
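The difference between the unsigned and signed jumps shows up when the high-order bit of an operand is 1. The following sketch uses assumed byte values (11000110 and 00010110) and hypothetical labels; it records in the DL and DH which of the two tests considered the AL to be the larger value.

```asm
; Sketch only: the operand values and labels are illustrative assumptions.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN   PROC  NEAR
        MOV   AL,11000110B      ;198 as unsigned data, -58 as signed data
        MOV   BL,00010110B      ;22 either way
        MOV   DX,0              ;DL and DH record the outcome of each test
        CMP   AL,BL             ;Compare AL to BL
        JNA   C20               ;Unsigned test: skip if AL is not above BL
        MOV   DL,1              ;Executed: 198 is above 22
C20:
        CMP   AL,BL             ;Compare again for the signed test
        JNG   C30               ;Signed test: skip if AL is not greater than BL
        MOV   DH,1              ;Not executed: -58 is not greater than 22
C30:
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
        END   BEGIN
```

Tracing this under DEBUG should leave 01 in the DL and 00 in the DH just before the exit.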
JO Jump Overflow OF
JC and JNC are often used to test the success of disk operations. Another conditional
jump, JCXZ, tests the contents of the CX register for zero. This instruction need not be
placed immediately following an arithmetic or compare operation. One use for JCXZ could
be at the start of a loop, to ensure that the CX actually contains a nonzero value.
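A minimal sketch of JCXZ as a guard follows. The names COUNT, B20, and B90 are hypothetical. Because LOOP decrements the CX before testing it, entering the loop with zero in the CX would wrap the count to FFFFH and repeat the body 65,536 times; the JCXZ bypasses the loop in that case.

```asm
; Sketch only: COUNT, B20, and B90 are hypothetical names.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN:  JMP   MAIN              ;Jump past data
COUNT   DW    0                 ;Loop count, which may be zero
MAIN    PROC  NEAR
        MOV   AX,0              ;Clear the accumulator
        MOV   CX,COUNT          ;Load the loop count
        JCXZ  B90               ;Bypass the loop if CX is zero
B20:
        INC   AX                ;Loop body
        LOOP  B20               ;Decrement CX, loop if nonzero
B90:
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
MAIN    ENDP
        END   BEGIN
```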
Now, don’t expect to memorize all of these instructions. As a reminder, however, note
that a jump for unsigned data is equal, above, or below, whereas a jump for signed data is
equal, greater, or less. The jumps for testing the carry, overflow, and parity flags have
unique purposes. The assembler translates symbolic to object code, regardless of which
instruction you use, but, for example, JAE and JGE, although apparently similar, do not test
the same flags.
The 80386 and later processors permit far conditional jumps. You can indicate a short
or far jump as, for example,
CALLING PROCEDURES
BEGIN   PROC  FAR
        ...
BEGIN   ENDP
The FAR operand in this case informs the system that the indicated address is the entry point
for program execution, whereas the ENDP directive defines the end of the procedure. A
code segment, however, may contain any number of procedures, all distinguished by PROC
and ENDP. A called procedure (or subroutine) is a section of code that performs a clearly
defined task (such as set cursor or get keyboard input). Organizing a program into proce-
dures provides the following benefits:
• Reduces the amount of code, since a common procedure can be called from anywhere
in the code segment
• Encourages better program organization
• Facilitates debugging of the program, since bugs can be more clearly isolated
• Helps in the ongoing maintenance of programs because procedures are readily
identified for modification.
The particular object code that CALL and RET generate depends on whether the op-
eration involves a NEAR or FAR procedure.
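Near call and return. A CALL to a procedure within the same segment is
near and performs the following:
• Decrements the SP by 2 (one word).
• Pushes the IP (containing the offset of the instruction following the CALL) onto
the stack.
• Inserts the offset address of the called procedure into the IP (which also flushes the
processor’s prefetch instruction queue).
A RET that returns from a near procedure reverses these steps:
• Pops the old IP value from the stack into the IP (which also flushes the processor’s
prefetch instruction queue).
• Increments the SP by 2.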
The CS:IP now points to the instruction following the original CALL in the calling proce-
dure, where execution resumes.
Far call and return. A far CALL calls a procedure labeled FAR, possibly in a
separate code segment. A far CALL pushes both the CS and IP onto the stack, and RET
pops them from the stack. Far calls and returns are the subject of Chapter 23.
The program is divided into a far procedure, BEGIN, and two near procedures, B10
and C10. Each procedure has a unique name and contains its own ENDP for ending
its definition.
[Figure 8-3 listing: P08CALLP (EXE), Calling procedures. Surviving comments: Call B10; Exit to DOS; Call C10; Return to caller.]
The PROC directives for B10 and C10 contain the attribute NEAR to indicate that
these procedures are within the current code segment. Since omission of the attribute
causes the assembler to default to NEAR, many subsequent examples omit it.
In procedure BEGIN, the CALL instruction transfers program control to the proce-
dure B10 and begins its execution.
In procedure B10, the CALL instruction transfers control to the procedure C10 and
begins its execution.
In procedure C10, the RET instruction causes control to return to the instruction im-
mediately following CALL C10.
In procedure B10, the RET instruction causes control to return to the instruction im-
mediately following CALL B10.
Procedure BEGIN then resumes processing from that point.
RET always returns to the calling routine. If B10 did not end with a RET instruction,
instructions would execute through B10 and drop directly into C10. In fact, if C10
did not contain a RET, the program would execute past the end of C10 into whatever
instructions (if any) happened to be there, with unpredictable results.
Technically, you can transfer control to a near procedure by means of a jump in-
struction or even by normal in-line code. But for clarity and consistency, use CALL to trans-
fer control to a procedure, and use RET to end the execution of the procedure.
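The calling sequence described in this section can be sketched as follows. This is a reconstruction from the description in the text (a far procedure BEGIN and two near procedures B10 and C10), not the original listing, and the choice of simplified segment directives is an assumption.

```asm
; Sketch only: reconstructed from the text, not the original figure.
        .MODEL SMALL
        .STACK 64
        .CODE
BEGIN   PROC  FAR
        CALL  B10               ;Call B10
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
B10     PROC  NEAR
        CALL  C10               ;Call C10
        RET                     ;Return to caller (BEGIN)
B10     ENDP
C10     PROC  NEAR
        RET                     ;Return to caller (B10)
C10     ENDP
        END   BEGIN
```

With this layout, the instruction following CALL B10 is at offset 0003 and the instruction following CALL C10 is at offset 000B, since each near CALL occupies three bytes.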
• DS and ES: Address of the PSP, a 256-byte (100H) area that precedes an executable
program module in memory.
• CS: Address of the code segment, the entry point to your program.
• IP: Zero, if the first executable instruction is at the beginning of the code segment.
• SS: Address of the stack segment.
• SP: Offset to the top of the stack. For example, for a stack defined as STACK 64
(64 bytes or 32 words), the SP initially contains 64, or 40H.
Let’s trace the simple program in Figure 8-3 through its execution. In practice, called
procedures would contain any number of instructions.
The current available location for pushing or popping is the top of the stack. For this
example, the system loader would have set the SP to the size of the stack, 64 bytes (40H).
The program performs the following operations:
• CALL B10 decrements the SP by 2, from 40H to 3EH. It then pushes the IP (containing
0003) onto the top of the stack at offset 3EH. This is the offset of the instruction
following the CALL. The processor uses the address formed by CS:IP to transfer
control to B10. Words in memory contain bytes in reverse sequence; for example,
0003 becomes 0300.
CALL B10 (push 0003):  xxxx xxxx xxxx xxxx 0300   SP = 003EH
                         |    |    |    |    |
Stack offset:          0036 0038 003A 003C 003E
• In procedure B10, CALL C10 decrements the SP by 2, to 3CH. It then pushes the IP
(containing 000B) onto the top of the stack at offset 3CH. The processor uses the
CS:IP addresses to transfer control to C10.
CALL C10 (push 000B):  xxxx xxxx xxxx 0B00 0300   SP = 003CH
• To return from C10, the RET instruction pops the offset (000B) from the top of the
stack at 3CH, inserts it in the IP, and increments the SP by 2 to 3EH. This causes an
automatic return to offset 000BH in procedure B10.
• The RET at the end of procedure B10 pops the address (0003) from the top of the
stack at 3EH into the IP and increments the SP by 2 to 40H. This causes an automatic
return to offset 0003H, where the program ends its execution.
If you use DEBUG to view the stack, you may find harmless data left by a previously
executed program.
BOOLEAN OPERATIONS
Boolean logic is important in circuitry design and has a parallel in programming logic. The
instructions for Boolean logic are AND, OR, XOR, TEST, and NOT, which can be used to
clear and set bits and to handle ASCII data for arithmetic purposes (Chapter 13). The gen-
eral format for the Boolean operations is
The first operand references one byte or word in a register or memory and is the only
value that is changed. The second operand references a register, a memory location, or an
immediate value, although both operands may not be memory locations. The
operation matches the bits of the two referenced operands and sets the CF, OF, PF, SF, and
ZF flags accordingly (AF is undefined).
• AND. If matched bits are both 1, sets the result to 1. All other conditions result in 0.
• OR. If either (or both) of the matched bits is 1, sets the result to 1. If both bits are 0,
the result is 0.
• XOR. If one matched bit is 0 and the other 1, sets the result to 1. If matched bits are
the same (both 0 or both 1), the result is 0.
• TEST. Sets the flags as AND does, but does not change the bits.
The following AND, OR, and XOR operations illustrate the same bit values as
operands:
AND OR XOR
Here’s a useful rule to remember: ANDing bits with 0 clears them to 0, whereas
ORing bits with 1 sets them to 1.
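The rules for AND, OR, and XOR can be checked with a short sketch. The operand values 0101 and 0011 are illustrative choices, not taken from the surviving text; the results in the comments follow directly from the bit-by-bit rules stated above.

```asm
; Sketch only: the operand values are illustrative assumptions.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN   PROC  NEAR
        MOV   AL,00000101B      ;First operand
        AND   AL,00000011B      ;AL = 0000 0001
        MOV   BL,00000101B
        OR    BL,00000011B      ;BL = 0000 0111
        MOV   CL,00000101B
        XOR   CL,00000011B      ;CL = 0000 0110
        AND   BL,11110000B      ;ANDing with 0s clears the right four bits: BL = 0000 0000
        OR    CL,11110000B      ;ORing with 1s sets the left four bits: CL = 1111 0110
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
        END   BEGIN
```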
        OR    CL,CL             ;Sets SF and ZF
Examples 2 and 6 provide ways of clearing a register to zero. Example 3 zeros the left four
bits of the AL. Although the use of CMP may be clearer, you can use OR for the following
purposes:
1. OR   CX,CX           ;Test CX for zero
   JZ   ...             ;Jump if zero
2. OR   CX,CX           ;Test sign of CX
   JS   ...             ;Jump if negative
TEST acts like AND, but only sets flags. Here are some examples:
1. TEST BL,11110000B    ;Any of leftmost bits
   JNZ  ...             ; in BL nonzero?
   ...
   JZ   ...             ; a zero value?
The NOT instruction simply reverses the bits in a byte or word in a register or memory: 0s
become 1s and 1s become 0s. The general format for NOT is
[label:] NOT {register/memory}
For example, if the AL contains 1100 0101, the instruction NOT AL changes the AL to
0011 1010. (The effect is exactly the same as that of XOR AL,0FFH in Example 7 earlier.)
Flags are unaffected. NOT is not the same as NEG, which changes a binary value from
positive to negative and vice versa by reversing the bits and adding 1.
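The difference between NOT and NEG can be seen in a short sketch. The value 0000 0101 (+5) is an illustrative choice; the final two instructions show that reversing the bits and then adding 1 gives the same result as NEG.

```asm
; Sketch only: the operand values are illustrative assumptions.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN   PROC  NEAR
        MOV   AL,11000101B
        NOT   AL                ;AL = 0011 1010 (bits reversed)
        MOV   BL,00000101B      ;+5
        NEG   BL                ;BL = 1111 1011 (-5 in two's complement)
        MOV   CL,00000101B      ;+5 again
        NOT   CL                ;CL = 1111 1010
        INC   CL                ;CL = 1111 1011, the same result as NEG
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
        END   BEGIN
```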
UPPERCASE LOWERCASE
BEGIN
The .COM program in Figure 8-4 converts the contents of a data item, TITLEX, from
lowercase to uppercase, beginning at TITLEX+1. The program initializes the BX with the
address of TITLEX+1 and uses the address to move each character, starting at TITLEX+1,
to the AH. If the value is between 61H and 7AH, an AND instruction sets bit 5 to 0:
AND AH,11011111B
All characters other than a through z remain unchanged. The routine then moves the
changed character back to TITLEX, increments the BX for the next character, and loops.
Used this way, the BX register acts as an index register for addressing memory loca-
tions. You may also use the SI and DI for the same purpose.
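The conversion routine just described might be coded along the following lines. This is a sketch under stated assumptions rather than the original listing: the contents of TITLEX, the label names B20 and B30, and the use of an EQU to compute the count are all hypothetical.

```asm
; Sketch only: string contents, labels, and TLEN are illustrative assumptions.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN:  JMP   MAIN              ;Jump past data
TITLEX  DB    'Change to uppercase letters'
TLEN    EQU   $-TITLEX-1        ;Number of characters after the first
MAIN    PROC  NEAR
        LEA   BX,TITLEX+1       ;Address of first character to change
        MOV   CX,TLEN           ;Number of characters to examine
B20:
        MOV   AH,[BX]           ;Character from TITLEX
        CMP   AH,61H            ;Is it below
        JB    B30               ; lowercase a?
        CMP   AH,7AH            ;Is it above
        JA    B30               ; lowercase z?
        AND   AH,11011111B      ;Yes, lowercase letter: clear bit 5
        MOV   [BX],AH           ;Restore the character in TITLEX
B30:
        INC   BX                ;Set for next character
        LOOP  B20               ;Loop for remaining characters
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
MAIN    ENDP
        END   BEGIN
```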
SHIFTING BITS
The shift instructions, which are part of the computer’s logical capability, can perform the
following actions:
The second operand contains the shift value, which is a constant (an immediate value)
or a reference to the CL register. For the 8088/8086 processors, the immediate constant may
be only 1; a shift value greater than 1 must be contained in the CL register. Later
processors allow immediate shift constants up to 31. The general format for shift is
MOV CL,03
The first SHR shifts the contents of the AL one bit to the right. The shifted 1-bit now re-
sides in the carry flag, and a 0-bit is filled to the left in the AL. The second SHR shifts the
AL three more bits. The carry flag contains successively 1, 1, and 0, and three O-bits are
filled to the left in the AL.
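The two SHR operations described above can be traced with the following sketch. The starting value 10110111 is an assumption chosen to match the carry sequence in the description; the AL contents and carry flag in the comments follow from shifting that value.

```asm
; Sketch only: the starting value in AL is an illustrative assumption.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN   PROC  NEAR
        MOV   CL,03             ;Shift count for the second SHR
        MOV   AL,10110111B      ;AL = 1011 0111 (decimal 183)
        SHR   AL,01             ;AL = 0101 1011 (decimal 91), CF = 1
        SHR   AL,CL             ;AL = 0000 1011 (decimal 11), CF = 0
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
        END   BEGIN
```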
SAR differs from SHR in one important way: SAR uses the sign bit to fill leftmost
vacated bits. In this way, positive and negative values retain their signs. The following
related instructions illustrate SAR for signed data in which the sign is a 1-bit:
INSTRUCTION AL COMMENT
MOV CL, 03
Right shifts are especially useful for halving values and are significantly faster than
using a divide operation. In the examples of the shift right operation, the first right shift of
one bit effectively divides by 2, and the second and third right shifts of three bits effectively
divide by 8.
Halving odd numbers such as 5 and 7 generates 2 and 3, respectively, and sets the
carry flag to 1. Also, if you have to shift two bits, coding two shift instructions is more ef-
ficient than storing 2 in the CL and coding one shift.
You can use the JC (Jump if Carry) instruction to test the bit shifted into the carry flag
at the end of a shift operation.
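A short sketch of SAR on a negative value, and of halving an odd number, follows. The starting values are illustrative assumptions. Note that SAR rounds toward the more negative value, so halving -73 gives -37.

```asm
; Sketch only: the starting values are illustrative assumptions.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN   PROC  NEAR
        MOV   CL,03             ;Shift count for the second SAR
        MOV   AL,10110111B      ;AL = 1011 0111 (-73 as signed data)
        SAR   AL,01             ;AL = 1101 1011 (-37), CF = 1
        SAR   AL,CL             ;AL = 1111 1011 (-5), CF = 0
        MOV   BL,05             ;An odd unsigned value
        SHR   BL,01             ;BL = 02, CF = 1 (the shifted-off 1-bit)
        JC    B20               ;Jump taken, since the carry flag is set
        NOP                     ;Skipped
B20:
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
        END   BEGIN
```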
The following related instructions illustrate SHL for unsigned data:
INSTRUCTION           AL          COMMENT
MOV  CL,03
MOV  AL,10110111B   ; 10110111
SHL  AL,01          ; 01101110    Shift left 1
SHL  AL,CL          ; 01110000    Shift left 3 more
The first SHL shifts the contents of the AL one bit to the left. The shifted 1-bit now resides
in the carry flag, and a O-bit is filled to the right in the AL. The second SHL shifts the AL
three more bits. The carry flag contains successively 0, 1, and 1, and three 0-bits are filled
to the right in the AL.
Left shifts always fill 0-bits to the right. As a result, SHL and SAL are identical. Left
shifts are especially useful for doubling values and are significantly faster than using a mul-
tiply operation. In the examples of the shift left operation, the first left shift of one bit ef-
fectively multiplies by 2, and the second and third left shifts of three bits effectively
multiply by 8. Also, if you have to shift two bits, coding two shift instructions is more ef-
ficient than storing 2 in the CL and coding one shift.
You can use the JC (Jump if Carry) instruction to test the bit shifted into the carry flag
at the end of a shift operation.
ROTATING BITS
The rotate instructions, which are part of the computer’s logical capability, can perform the
following actions:
MOV CL, 03
The first ROR rotates the rightmost 1-bit of the BH to the leftmost vacated position. The
second and third ROR operations rotate the three rightmost bits.
RCR causes the carry flag to participate in the rotation. Each shifted-off bit on the
right moves into the CF, and the CF bit moves into the vacated bit position on the left.
INSTRUCTION BL COMMENT
MOV CL, 03
The first ROL rotates the leftmost 1-bit of the BL to the rightmost vacated position. The
second and third ROL operations rotate the three leftmost bits.
Similarly to RCR, RCL also causes the carry flag to participate in the rotation. Each
shifted-off bit on the left moves into the CF, and the CF bit moves into the vacated bit po-
sition on the right.
You can use the JC (Jump if Carry) instruction to test the bit rotated into the CF at
the end of a rotate operation.
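The rotate operations can be traced with the sketch below. The starting value 10110111 and the small RCR demonstration are illustrative assumptions; the register contents and carry flag in the comments follow from rotating those values.

```asm
; Sketch only: the starting values are illustrative assumptions.
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN   PROC  NEAR
        MOV   CL,03             ;Rotate count for the second operations
        MOV   BH,10110111B      ;BH = 1011 0111
        ROR   BH,01             ;BH = 1101 1011, CF = 1
        ROR   BH,CL             ;BH = 0111 1011, CF = 0
        MOV   BL,10110111B      ;BL = 1011 0111
        ROL   BL,01             ;BL = 0110 1111, CF = 1
        ROL   BL,CL             ;BL = 0111 1011, CF = 1
        CLC                     ;Clear the carry flag
        MOV   DL,00000001B      ;DL = 0000 0001
        RCR   DL,01             ;DL = 0000 0000, CF = 1 (old CF entered on left)
        RCR   DL,01             ;DL = 1000 0000, CF = 0 (the 1-bit returns from CF)
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
        END   BEGIN
```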
You can also use shift and rotate instructions to multiply and divide doubleword values
by powers of 2. Consider a 32-bit value of which the leftmost 16 bits are in the DX and the
rightmost 16 bits are in the AX, as DX:AX. Instructions to “multiply” the value by 2 could be:
        SHL   AX,1              ;Use left shift to multiply
        RCL   DX,1              ; DX:AX by 2
The SHL shifts all bits in the AX to the left, and the leftmost bit shifts into the carry flag.
The RCL shifts the DX left and inserts the bit from the CF into the rightmost vacated bit.
To multiply by 4, follow the SHL-RCL pair with an identical SHL-RCL pair.
For division, consider again a 32-bit value in the DX:AX. Instructions to “divide” the
value by 2 would be
        SAR   DX,1              ;Use right shift to divide
        RCR   AX,1              ; DX:AX by 2
JUMP TABLES
A program may have a routine for testing a number of related conditions, each requiring a
jump to another routine. Consider, for example, a system for a company that has established
special codes for customers based on their credit rating and sales volume. The codes indi-
cate the amount of discount to offer and other special processing that may be required for
the customer. Customer codes are 0, 1, 2, 3, and 4.
        CMP   CUSCODE,0         ;Code = 0?
        JE    D00DSCT
        CMP   CUSCODE,1         ;Code = 1?
        JE    D10DSCT
        CMP   CUSCODE,2         ;Code = 2?
        JE    D20DSCT
        CMP   CUSCODE,3         ;Code = 3?
        JE    D30DSCT
        CMP   CUSCODE,4         ;Code = 4?
        JE    D40DSCT
With this approach, the opportunity for errors is great: Just consider matching the correct
codes against their values and jumping to the correct routine. A more elegant solution
involves a table of jump addresses. As shown in the partial program in Figure 8-5,
CUSTTBL defines the five addresses successively in words (two bytes each). The routine
at D10JUMP loads the code (a hex value 00-04) into the BX register. The value is
doubled, so that 0 stays 0, 1 becomes 2, 2 becomes 4, and so forth. The doubled value pro-
vides an offset into the table: CUSTTBL+0 is the first address, CUSTTBL+2 is the sec-
ond, CUSTTBL+4 is the third, and so forth. The operand of the JMP instruction,
[CUSTTBL+BX], forms an address based on the start of the table plus an offset into the
table. The operation then jumps directly to the appropriate routine.
An important constraint in the program is that the codes may be only the hex values
00-04; any other value would cause dire results! If you use DEBUG to run this program,
enter valid hex values (00-04) into CUSCODE to check the effect of the logic.
For the 80386 and later processors, you could replace the two instructions at
D10JUMP, that is,
MOV BL, CUSCODE ;Get discount code
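The jump-table logic described above might be coded as in the following sketch. Only part of the original listing survives, so several details here are assumptions: the XOR that clears the BX before the code is loaded, the procedure name MAIN, and the stub routines, which all simply exit to DOS.

```asm
; Sketch only: XOR BX,BX, the name MAIN, and the stub routines are assumptions.
        .MODEL SMALL
        .STACK 64
        .DATA
CUSTTBL DW    D00NODSC          ;Table of addresses
        DW    D10DSCT
        DW    D20DSCT
        DW    D30DSCT
        DW    D40DSCT
CUSCODE DB    04                ;Discount code (must be 00-04)
        .CODE
MAIN    PROC  FAR
        MOV   AX,@data          ;Initialize
        MOV   DS,AX             ; segment register
D10JUMP:
        XOR   BX,BX             ;Clear BX (assumed first instruction)
        MOV   BL,CUSCODE        ;Get discount code
        SHL   BX,1              ;Double it for an offset into the table
        JMP   [CUSTTBL+BX]      ;Jump to the routine for this code
D00NODSC:                       ;Stub routines: in a real program each
D10DSCT:                        ; label would head its own discount
D20DSCT:                        ; processing
D30DSCT:
D40DSCT:
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
MAIN    ENDP
        END   MAIN
```

On the 80386 and later, a single MOVZX BX,CUSCODE (move with zero-extend) could replace the XOR and MOV pair, provided the program enables those instructions with a directive such as .386.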
PROGRAM ORGANIZATION
                PAGE  60,132
TITLE   P08JMPTB (EXE) Use of a jump table
                .MODEL SMALL
                .STACK 64
                .DATA
0000            CUSTTBL DW  D00NODSC    ;Table of addresses
0002                    DW  D10DSCT
0004                    DW  D20DSCT
0006                    DW  D30DSCT
0008                    DW  D40DSCT
000A 04         CUSCODE DB  04          ;Discount code
                .CODE
0000                    PROC FAR
0000 B8                 MOV  AX,@data   ;Initialize
0003 8E D8              MOV  DS,AX      ; segment
0005 8E C0              MOV  ES,AX      ; registers
Then plan the strategy for the instructions: routines for initialization, for using a con-
ditional jump, and for using a LOOP. The following, which shows the main logic, is
pseudocode that many programmers use to plan a program:
KEY POINTS
• Use CALL to access a procedure, and include RET at the end of the procedure for
returning. A called procedure may call other procedures, and if you follow the
conventions, RET causes the correct address in the stack to pop. The only examples in
this book that jump to a procedure are at the beginning of .COM programs.
• Use left shift to double a value and right shift to halve a value. Be sure to select the
appropriate shift instruction for unsigned and for signed data.
QUESTIONS
8-1. Explain these terms: (a) short address; (b) near address; (c) far address.
8-2. (a) What is the maximum number of bytes that a near JMP, a LOOP, and a conditional jump
instruction may jump? (b) What characteristic of the machine code operand causes this
limit?
8-3. A JMP instruction begins at offset location 0624H. Determine the transfer address, based on
the following object code for the JMP operand: (a) 27H; (b) 6BH; (c) C6H.
8-4. Code a routine using LOOP that calculates the Fibonacci series: 1, 1, 2, 3, 5, 8, 13,....
(Except for the first two numbers in the sequence, each number is the sum of the preceding
two numbers.) Set the limit for 12 loops. Assemble, link, and use DEBUG to trace through the
routine.
8-5. Assume that AX and BX contain signed data and that CX and DX contain unsigned data. De-
termine the CMP (where necessary) and conditional jump instructions for the following:
(a) Does the DX value exceed the CX? (b) Does the BX value exceed the AX? (c) Does the
CX contain zero? (d) Is there an overflow? (e) Is the BX equal to or smaller than the AX?
(f) Is the DX equal to or smaller than the CX?
8-6. In the following, what flags are affected, and what would they contain? (a) An overflow
occurred; (b) a result is negative; (c) a result is zero; (d) processing is in single-step mode; (e) a
string data transfer is to be right to left.
8-7. Refer to Figure 8-3. What would be the effect on program execution if the procedure B10 did
not contain a RET?
8-8. What is the difference between coding a PROC operand with FAR and with NEAR?
8-9. What are the ways in which a program can begin executing a procedure?
8-10. In an .EXE program, A10 calls B10, B10 calls C10, and C10 calls D10. As a result of these
calls, how many addresses does the stack contain?
8-11. Assume that the BL contains 1110 0011 and that a location named BOONO contains 0111
1001. Determine the effect on the BL for the following: (a) XOR BL,BOONO; (b) AND
BL,BOONO; (c) OR BL,BOONO; (d) XOR BL,11111111B; (e) AND BL,00000000B.
8-12. Revise the program in Figure 8-4 as follows: Define the contents of TITLEX as uppercase let-
ters, and code the instructions that convert uppercase to lowercase.
8-13. Assume that the DX contains binary 10111001 10111001 and the CL contains 03. Determine
the hex contents of the DX after execution of the following unrelated instructions: (a) SHR
DX,1; (b) SHR DX,CL; (c) SHL DX,CL; (d) SHL DL,1; (e) ROR DX,CL; (f) ROR DL,CL;
(g) SAL DH,1.
8-14. Use shift, move, and add instructions to multiply the contents of the AX by 10.
8-15. A routine at the end of the section entitled “Rotating Bits” multiplies the DX:AX by 2. Revise
the routine to (a) multiply by 4; (b) divide by 4; (c) multiply the 48 bits in the DX:AX:BX by 2.
PART C — Screen and Keyboard Operations
CHAPTER 9
Introduction to Screen and
Keyboard Processing
OBJECTIVE:
INTRODUCTION
Up to this point, our programs have defined data items either in the data area or as imme-
diate data within an instruction operand. However, most programs require input from a key-
board, disk, mouse, or modem and provide output in a useful format on a screen, printer, or
disk. This chapter covers the basic requirements for displaying information on a screen and
for accepting input from a keyboard.
There are various requirements for specifying a device to the system and for re-
questing an input or output operation. The INT (Interrupt) instruction handles input and out-
put for most purposes. The two types of interrupts covered in this chapter are BIOS INT
10H functions for screen handling and DOS INT 21H functions for displaying screen out-
put and accepting keyboard input. These functions (or services) request an action; you in-
sert a function value in the AH register to identify the type of operation the interrupt is to
perform.
Low-level BIOS operations such as INT 10H transfer control directly to BIOS. How-
ever, to facilitate some of the more complex operations, DOS INT 21H provides an inter-
rupt service that first transfers control to DOS. For example, input from a keyboard may
involve a count of characters entered and a check against a maximum number. The DOS
INT 21H operation handles much of this additional high-level processing and then trans-
fers control automatically to BIOS, which handles the low-level part of the operation.
As a convention, this book refers to the value 0DH as the Enter character for the
keyboard and as a Carriage Return for the screen and printer.
Operations introduced in this chapter are:
BIOS INT 10H FUNCTIONS        DOS INT 21H FUNCTIONS
02H  Set cursor               02H  Screen display
06H  Scroll screen            09H  Screen display
                              0AH  Keyboard input
                              3FH  Keyboard input
                              40H  Screen display
THE SCREEN
The screen is a grid of addressable locations at any one of which the cursor can be set. A
typical video monitor, for example, has 25 rows (numbered 0 to 24) and 80 columns (num-
bered 0 to 79). Here are some examples of cursor locations:
The system provides space in memory for a video display area, or buffer. The
monochrome display area begins at BIOS location B000[0]H and supports 4K bytes of memory,
2K of which are available for characters and 2K for an attribute for each character, such as
reverse video, blinking, high intensity, and underlining. The basic color graphics video dis-
play area supports 16K bytes, starting at BIOS location B800[0]H. You can process either
in text mode for normal character display or in graphics mode. For text mode, the display
area provides for screen “pages” numbered 0 through 3 for an 80-column screen, with bytes
for each character and its attribute.
The interrupts that handle screen displays transfer your data directly to a video dis-
play area, depending on the type of video adapter installed, such as EGA or VGA. Although
technically your programs may transfer data directly to a video display area, there is no as-
surance that the memory addresses will be the same on all models, so writing data directly
to a display area, although fast, can be risky. The recommended practice is to use the ap-
propriate interrupt instructions: INT 10H functions to display, to set the cursor at any loca-
tion, and to clear the screen and INT 21H functions for various types of display.
To set the row and column in the DX, you could also use one MOV instruction with an
immediate hex value: the high-order byte (DH) holds the row and the low-order byte (DL)
holds the column.
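As a minimal sketch of that single-MOV form, assuming BIOS INT 10H function 02H (set cursor) with page 0 in the BH; the row and column chosen here (row 12, column 40) are illustrative only:

```asm
        MOV   AH,02H            ;Request set cursor
        MOV   BH,00             ;Page number 0
        MOV   DX,0C28H          ;DH = row 12 (0CH), DL = column 40 (28H)
        INT   10H               ;Call BIOS
```

Coding the row and column as one word saves an instruction, at the cost of having to convert both values to hex yourself.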
BIOS INT 10H function 06H handles screen clearing or scrolling. You can clear all or part
of a display beginning at any screen location and ending at any higher numbered location.
For example, to clear the entire screen, specify the starting row:column as 00:00H and the
ending row:column as 18:4FH. Load these registers:
• AH = function 06H
• AL = 00H for full screen
• BH = attribute value
• CX = starting row:column
• DX = ending row:column
Attribute 71H in the following example sets the entire screen to white background (7)
with blue foreground (1):
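A sketch of the clear-screen request, using only the register settings listed above and the 00:00H through 18:4FH bounds given for a 25 x 80 screen:

```asm
        MOV   AX,0600H          ;AH = 06H (scroll), AL = 00H (full screen)
        MOV   BH,71H            ;White background (7), blue foreground (1)
        MOV   CX,0000H          ;Upper left row:column
        MOV   DX,184FH          ;Lower right row:column
        INT   10H               ;Call BIOS
```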
If you mistakenly set the lower right screen location higher than 184FH, the opera-
tion wraps around the screen and clears some locations twice. This may cause an error on
some systems. The next chapter describes scrolling in more detail.
A program often has to display messages to a user that request data or an action
the user must take. We’ll first examine the methods for original DOS versions, which
are useful for exercises and small programs, and later examine the methods that involve
file handles. The original DOS operations work under all versions and in some respects
are simpler and easier to use, although use of the newer operations for software production
is recommended.
You can code the dollar sign immediately following the display string as just shown, inside
the string as ‘Customer name?$’, or on the next line as DB ‘$’. The effect, however, is that
you can’t use this function to display a $ character on the screen.
Set function 09H in the AH register, use LEA to load the address of the display string
in the DX, and issue an INT 21H instruction. The operation displays the characters from
left to right and recognizes the end of data on encountering the dollar sign ($) delimiter. The
assembly language code is:
MOV AH, 09H ;Request display
The INT operation does not change the contents of the registers. A displayed string
that exceeds the rightmost screen column automatically continues on the next row and
scrolls the screen as necessary. If you omit the dollar sign at the end of the string, the op-
eration displays characters from memory until it finds one—if there is any.
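Putting the three steps together gives a fragment along the following lines. This is a sketch only: the name CUSTMSG is a hypothetical label for the display string, and the instructions are assumed to sit inside a program whose DS already addresses the data.

```asm
; In the data area:
CUSTMSG DB    'Customer name?', '$'   ;Display string with $ delimiter

; In the code segment:
        MOV   AH,09H            ;Request display
        LEA   DX,CUSTMSG        ;Load address of display string
        INT   21H               ;Call DOS
```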
        page  60,132
TITLE   P09DOSAS (COM)  Display ASCII characters 00H-FFH
        .MODEL SMALL
        .CODE
        ORG   100H
BEGIN:  JMP   SHORT MAIN
CHAR    DB    00,'$'
;               Main procedure:
MAIN    PROC  NEAR
        CALL  D10DISP           ;Display characters
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
MAIN    ENDP
;               Display ASCII characters:
D10DISP PROC
        MOV   CX,256            ;Initialize 256 iterations
        LEA   DX,CHAR           ;Initialize address of char
D20:
        MOV   AH,09H            ;Display ASCII char
        INT   21H
        INC   CHAR              ;Increment for next character
        LOOP  D20               ;Decrement CX, loop nonzero
        RET                     ;Return
D10DISP ENDP
        END   BEGIN
• D10DISP uses INT 21H, function 09H, to display the contents of CHAR, which is
initialized to 00H and is successively incremented by 1 to display each character
until reaching FFH.
The first displayed line begins with a blank (00H), two “happy faces” (01H and 02H),
and then a heart (03H), diamond (04H), and club (05H). Character 06H would have
displayed a spade, but is erased by later control characters. Character 07H causes the speaker
to sound, 08H causes a backspace, 09H causes a tab, 0AH causes a line feed, and 0DH
(Enter) causes a “carriage return” to the start of the next line. And, of course, under this
operation, the dollar symbol, 24H, is not displayed at all. (As you'll see in Chapter 10, BIOS
services can display proper symbols for these special characters.) The musical note is 0EH,
and 7FH through FFH are extended ASCII characters.
You can revise the program to bypass attempting to display the control characters.
The following instructions bypass all characters between 08H and 0DH; you may want to
experiment with bypassing, say, only 08H (Backspace) and 0DH (Carriage Return).
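D20:    CMP   CHAR,08H          ;Lower than 08H?
        JB    D30               ; yes, accept
        CMP   CHAR,0DH          ;Lower than or equal to 0DH?
        JBE   D40               ; yes, bypass
D30:    MOV   AH,09H            ;Display characters below 08H
        INT   21H               ; and above 0DH
D40:    INC   CHAR              ;Increment for next character
        LOOP  D20               ;Decrement CX, loop nonzero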
Although this exercise bypasses them, displaying the Backspace, Tab, Line Feed, and
Carriage Return characters is the normal way to perform these operations.
Suggestion: Reproduce the preceding program, assemble it, link it, and convert it to
a .COM file.
In the parameter list, the LABEL directive tells the assembler to align on a byte
boundary and gives the location the name NAMEPAR. Since LABEL takes no space,
NAMEPAR and MAXLEN refer to the same memory location.
To request input, set function OAH in the AH, load the address of the parameter list
(NAMEPAR in the example), into the DX, and issue INT 21H:
        MOV   AH,0AH            ;Request input function
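For reference, the parameter list and the complete request can be sketched together as follows. The item names are the ones this section uses (NAMEPAR, MAXLEN, ACTLEN, NAMEFLD); the 20-byte blank-filled input field is an assumption chosen to match the maximum of 20.

```asm
; In the data area:
NAMEPAR LABEL BYTE              ;Start of parameter list
MAXLEN  DB    20                ;Maximum number of input characters
ACTLEN  DB    ?                 ;Actual number of characters entered
NAMEFLD DB    20 DUP(' ')       ;Characters entered from keyboard

; In the code segment:
        MOV   AH,0AH            ;Request input function
        LEA   DX,NAMEPAR        ;Load address of parameter list
        INT   21H               ;Call DOS
```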
The INT operation waits for a user to enter characters and checks that they do not exceed
the maximum (20 in MAXLEN in the parameter list). The operation echoes each entered
character onto the screen and advances the cursor. The user presses the Enter key to signal
the end of an entry. The operation also transfers this Enter character (0DH) to the input field
(NAMEFLD in the example), but does not count its entry in the actual length. If you key in
a name such as BROWN (Enter), the parameter list appears like this:
ASCII: | 20 |  5 |  B |  R |  O |  W |  N |  # |    |    |    |    | ...
HEX:   | 14 | 05 | 42 | 52 | 4F | 57 | 4E | 0D | 20 | 20 | 20 | 20 | ...
The operation delivers the length of the input name, 05H, into the second byte of the
parameter list, named ACTLEN in the example. The Enter character (0DH) is at
NAMEFLD+5. (The # symbol here indicates this character, because 0DH has no printable
symbol.) Since the maximum length of 20 includes the 0DH, the actual entered name may
be up to only 19 characters.
The operation accepts and acts on the Backspace character, but doesn’t add it to
the count. Other than Backspace, the operation does not accept more than the maximum
number of characters. If in the preceding example a user keys in 20 characters without
pressing Enter, the operation causes the speaker to beep; at this point, it accepts only the
Enter character.
The operation bypasses extended function keys such as F1, Home, PgUp, and
Arrows. If you expect a user to enter any of them, use BIOS INT 16H or DOS INT 21H,
function 01H, both covered in Chapter 11.
The program in Figure 9-2 requests a user to enter a name, and then displays the name at
the center of the screen and sounds the speaker. If a user enters, for example, the name Pat
Brown, the program performs the following:
In F10CENT, the SHR instruction shifts the length 09 one bit to the right, effectively
dividing the length by 2: Bits 00001001 become 00000100, or 4. The NEG instruction
reverses the sign, changing +4 to -4. ADD adds the value 40, giving the starting position
for the column, 36, in the DL register. With the cursor set at row 12, column 36, the name
appears on the screen with the P of Pat in column 36 and the B of Brown in column 40.
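The centering arithmetic can be sketched as a procedure like the one below. It is a reconstruction from the description above rather than the program's own listing: it assumes the data names NAMELEN and NAMEFLD from Figure 9-2 and assumes that Q20CURS sets the cursor from the row:column in the DX, as the main procedure's call suggests.

```asm
F10CENT PROC  NEAR
        MOV   DL,NAMELEN        ;Length of name (09 for Pat Brown)
        SHR   DL,1              ;Divide length by 2 (gives 4)
        NEG   DL                ;Reverse sign (gives -4)
        ADD   DL,40             ;Add 40 (gives column 36)
        MOV   DH,12             ;Center row
        CALL  Q20CURS           ;Set cursor at row 12, column 36
        MOV   AH,09H            ;Request display
        LEA   DX,NAMEFLD        ;Address of name
        INT   21H               ;Call DOS
        RET
F10CENT ENDP
```

Because SHR discards the low-order bit, a name of odd length starts half a position left of true center, which is not noticeable on the screen.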
        page  60,132
TITLE   P09CTRNM (EXE)  Accept names, center on screen
        .MODEL SMALL
        .STACK 64
        .DATA
NAMEPAR LABEL BYTE              ;Name parameter list:
MAXNLEN DB    20                ; maximum length of name
NAMELEN DB    ?                 ; no. of characters entered
NAMEFLD DB    21 DUP(' ')       ; entered name
PROMPT  DB    'Name? ', '$'
        .CODE
BEGIN   PROC  FAR
        MOV   AX,@data          ;Initialize segment
        MOV   DS,AX             ; registers
        MOV   ES,AX
        CALL  Q10CLR            ;Clear screen
A20LOOP:
        MOV   DX,0000           ;Set cursor to 00,00
        CALL  Q20CURS
        CALL  B10PRMP           ;Display prompt
        CALL  D10INPT           ;Provide for input of name
        CALL  Q10CLR            ;Clear screen
        CMP   NAMELEN,00        ;Name entered?
        JE    A30               ; no, exit
        CALL  E10CODE           ;Set bell & '$'
        CALL  F10CENT           ;Center & display name
        JMP   A20LOOP
A30:
        MOV   AX,4C00H          ;Exit to DOS
        INT   21H
BEGIN   ENDP
;               Display prompt:
;               Set bell and '$' delimiter:
        END   BEGIN
Note the instructions in E10CODE that insert the Bell (07H) character in the input
area immediately following the name:
MOV BH, 00 ;Replace Enter character (0DH)
The first two MOVs set the BX with the length. The third MOV references an index
specifier in square brackets, which means that the BX is to act as a special index register to
facilitate extended addressing. The MOV combines the length in the BX with the address
of NAMEFLD and moves the 07H to the calculated address. Thus for a length of 05, the
instruction inserts 07H at NAMEFLD+05 (replacing the Enter character) following the
name. The last instruction in E10CODE inserts a ‘$’ delimiter following the 07H so that
DOS function 09H can display the name and sound the speaker.
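Based on that description, E10CODE might look like the following sketch. The data names come from Figure 9-2; the exact instruction sequence is an assumption reconstructed from the text.

```asm
E10CODE PROC  NEAR
        MOV   BH,00             ;Replace Enter character (0DH)
        MOV   BL,NAMELEN        ; BX = length of name
        MOV   NAMEFLD[BX],07    ;Insert Bell after the name
        MOV   NAMEFLD[BX+1],'$' ;Set display delimiter after the Bell
        RET
E10CODE ENDP
```

With the maximum of 19 entered characters, the Bell lands at NAMEFLD+19 and the delimiter at NAMEFLD+20, which is why NAMEFLD is defined as 21 bytes.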
If the length is zero, the program determines that input is ended, as shown by the
instruction CMP NAMELEN,00 in A20LOOP.
The first two MOV instructions set the BX with the length 05. The third MOV moves a
blank (20H) to the address specified in the first operand: the address of NAMEFLD plus
the contents of BX—in effect, NAMEFLD+5.
The name HAMILTON replaces the shorter name PAINE. But because the name ADAMS
is shorter than HAMILTON, it replaces HAMIL and the Enter character replaces the T. The
remaining letters, ON, still follow ADAMS. You may want to clear NAMEFLD prior to
prompting for a name, as follows:
B30:
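The loop labeled B30 can be sketched in full as follows. This is a reconstruction under the assumptions stated in the surrounding text: it clears one byte at a time, indexes NAMEFLD with the SI register, and loops 20 times (twice the 10 loops of the word version).

```asm
        MOV   CX,20             ;Initialize for 20 loops
        MOV   SI,0000           ;Start position for name
B30:
        MOV   NAMEFLD[SI],20H   ;Move one blank to name
        INC   SI                ;Increment for next character
        LOOP  B30               ;Repeat 20 times
```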
Instead of the SI register, you could use DI or BX. A more efficient method that
moves a word of two blanks requires only 10 loops. However, because NAMEFLD is
defined as DB (byte), you would have to override its length with a WORD and PTR (pointer)
operand, as the following indicates:
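        MOV   CX,10                  ;Initialize for 10 loops
        LEA   SI,NAMEFLD             ;Initialize start of name
B30:    MOV   WORD PTR [SI],2020H    ;Move two blanks to name
        INC   SI                     ;Increment two positions
        INC   SI                     ; in name
        LOOP  B30                    ;Repeat 10 times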
Interpret the MOV at B30 as “Move a blank word to the memory location where the ad-
dress in the SI register points.” This example uses LEA to initialize the clearing of NAME-
FLD and uses a slightly different method for the MOV at B30 because you cannot code an
instruction such as
        MOV   WORD PTR[NAMEFLD],2020H    ;Invalid
Clearing the input area solves the problem of short names being followed by previ-
ous data. A more efficient practice is to clear only positions to the right of the most recently
entered name.
Use these control characters for handling the cursor whenever you display output or accept
input. Here’s an example that displays the contents of a character string named MESSAGE,
followed by Carriage Return and Line Feed to set the cursor to the next line:
MESSAGE DB    09, 'PC Users Group Annual Report', 13, 10, '$'
Using EQU to redefine the control characters may make a program more readable:
CR      EQU   13      (or EQU 0DH)
LF      EQU   10      (or EQU 0AH)
TAB     EQU   09      (or EQU 09H)
MESSAGE DB    TAB, 'PC Users Group Annual Report', CR, LF, '$'
The following example shows how to use this service to display a string of charac-
ters. The string to display is defined in CONAME. The program loads the address of
CONAME in the DI register and its length in the CX. The loop involves incrementing the
DI (by INC) for each successive character and decrementing the CX (by LOOP) for the
number of characters to display. The code is as follows:
CONAME  DB    'Software Services', 13, 10
        ...                     ;Finished
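The loop described above can be sketched as follows, assuming that the service in question is DOS INT 21H function 02H, which displays the single character in the DL. The label B10 is hypothetical, and the length of 19 counts the 17 characters of the string plus Carriage Return and Line Feed.

```asm
; In the data area:
CONAME  DB    'Software Services', 13, 10   ;17 characters + CR + LF

; In the code segment:
        MOV   AH,02H            ;Request display character
        MOV   CX,19             ;Length of string
        LEA   DI,CONAME         ;Address of string
B10:
        MOV   DL,[DI]           ;Character to display
        INT   21H               ;Call DOS
        INC   DI                ;Increment for next character
        LOOP  B10               ;Decrement CX, repeat if nonzero
```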
FILE HANDLES
We’ll now examine the use of file handles for screen and keyboard operations, which is
more in the UNIX and OS/2 style. A file handle is simply a number that refers to a specific
device. Since the following standard file handles are preset, you do not have to define them:
HANDLE DEVICE
00 Input, normally keyboard (CON), but may be redirected
01 Output, normally display (CON), but may be redirected
02 Error output, display (CON), may not be redirected
03 Auxiliary device (AUX)
04 Printer (LPT1 or PRN)
As can be seen, the normal file handles are 00 for keyboard input and 01 for screen
display. Other file handles, such as those for disk devices, have to be set by your program.
You can also use these services for redirecting input and output to other devices, although
this feature doesn’t concern us here.
• AH = Function 40H
• BX = File handle 01
• CX = Number of characters to display
• DX = Address of the display area
A successful INT operation returns to the AX the number of bytes written and clears the
carry flag (which you may test).
An unsuccessful INT operation sets the carry flag and returns an error code in the AX:
05H = access denied (for an invalid or disconnected device) or 06H = invalid handle. Since
the AX could contain either a length or an error code, the only way to determine an error
condition is to test the carry flag, although display errors are rare.
The operation responds like DOS function 09H to control characters 07H (Beep),
08H (Backspace), 0AH (Line Feed), and 0DH (Carriage Return). The following
instructions illustrate this operation:
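A sketch of such a request, using the register settings listed above. The names DISAREA and E10ERR are hypothetical, and the string is only an example; its length of 16 counts 14 characters plus Carriage Return and Line Feed.

```asm
; In the data area:
DISAREA DB    'PC Users Group', 0DH, 0AH    ;14 characters + CR + LF

; In the code segment:
        MOV   AH,40H            ;Request display
        MOV   BX,01             ;File handle 01 (screen)
        MOV   CX,16             ;Number of characters to display
        LEA   DX,DISAREA        ;Address of display area
        INT   21H               ;Call DOS
        JC    E10ERR            ;Carry set: AX holds an error code
                                ;Otherwise AX = number of bytes written
E10ERR:                         ;Error handling would go here
```

Unlike function 09H, this operation needs no ‘$’ delimiter, because the CX supplies the length.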
10B INT 21
10D NOP
The program sets the AH to request a display and sets offset 10EH in the DX, the location
of the DB containing your name.
When you have keyed in the instructions, press Enter again. To unassemble the pro-
gram, use the U command (U 100,10D), and to trace execution, press R and then repeated
T commands. On reaching the INT instruction, use the P (Proceed) command to execute
the interrupt through to the NOP instruction. Your name should be displayed on the screen.
Use the Q command to quit DEBUG.
• AH = Function 3FH
• BX = File handle 00
• CX = Maximum number of characters to accept
• DX = Address of the data area for entering characters
A successful INT operation clears the carry flag (which you may test) and sets the
AX with the number of characters entered.
An unsuccessful INT operation could occur because of an invalid handle; the opera-
tion sets the carry flag and inserts an error code in the AX: 05H = access denied (for an in-
valid or disconnected device) or 06H = invalid handle. Since the AX could contain either
a length or an error code, the only way to determine an error condition is to test the carry
flag, although keyboard errors presumably are rare.
Like DOS function 0AH, function 3FH also acts on the Backspace, but ignores
extended function keys such as F1, Home, and PageUp.
The following instructions illustrate the use of DOS function 3FH:
INAREA  DB    20 DUP(' ')       ;Input area
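The request itself can be sketched from the register settings listed earlier, assuming the 20-byte INAREA just defined; the error label E10ERR is hypothetical.

```asm
        MOV   AH,3FH            ;Request keyboard input
        MOV   BX,00             ;File handle 00 (keyboard)
        MOV   CX,20             ;Maximum 20 characters
        LEA   DX,INAREA         ;Address of input area
        INT   21H               ;Call DOS
        JC    E10ERR            ;Carry set: AX holds an error code
                                ;Otherwise AX = number of characters delivered
E10ERR:                         ;Error handling would go here
```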
The INT operation waits for you to enter characters, but unfortunately does not check
whether the number of characters exceeds the maximum in the CX register (20 in the
example). Pressing the Enter key (0DH) signals the end of an entry. For example, typing the
characters “PC Users Group” enters the following in INAREA:
| PC Users Group | 0DH | 0AH |
The typed characters are immediately followed by Enter (0DH), which you typed, and Line
Feed (0AH), which you did not type. Because of this feature, the maximum number and the
length of the input area should provide for an additional two characters. If you type fewer
characters than the maximum, the locations in memory following the entered characters still
contain the previous contents.
A successful INT operation clears the carry flag and sets the AX with the number of
characters delivered. In the preceding example, this number is 14, plus 2 for the Enter and
Line Feed characters, or 16. Accordingly, a program can determine the actual number of
characters entered. Although this feature is trivial for YES and NO type of replies, it is use-
ful for replies with variable length, such as names.
If you key in a name that exceeds the maximum in the CX register, the operation ac-
tually accepts all the characters. Consider a situation in which the CX contains 08 and a user
enters the characters “PC Exchange”. The operation sets the first eight characters in the
input area to “PC Excha” with no Enter and Line Feed following and sets the AX with a length
of 08. Now, watch this—the next INT operation to execute does not accept a name directly
from the keyboard, because it still has the rest of the previous string in its buffer. It deliv-
ers “nge” followed by the Enter and Line Feed characters to the input area and sets the AX
to 05. Both operations are “normal” and clear the carry flag:
A program can tell whether a user has keyed in a “valid” number of characters if (a)
the number returned in the AX is less than the number in the CX or (b) the number returned
in the AX is equal to that in the CX, and the last two characters in the input area are 0DH
and 0AH. If neither condition is true, you'll have to issue additional INTs to accept the
remaining characters. After all this, you may well wonder what is the point of specifying a
maximum length in the CX at all!
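The two tests can be sketched as shown below. The names MAXCHR, G10, G20, and G30 are hypothetical; the three labels only mark where a program would branch, and their bodies are omitted. Because the 0DH precedes the 0AH in memory, the pair reads as the word 0A0DH.

```asm
MAXCHR  EQU   20                ;Hypothetical name for the maximum length

        MOV   AH,3FH            ;Request keyboard input
        MOV   BX,00             ;File handle 00 (keyboard)
        MOV   CX,MAXCHR         ;Maximum number of characters
        LEA   DX,INAREA         ;Address of input area
        INT   21H               ;Call DOS
        JC    G30               ;Carry set: AX holds an error code
        CMP   AX,MAXCHR         ;(a) Fewer characters than the maximum?
        JB    G20               ; yes, entry is complete
        MOV   BX,AX             ;(b) Equal: index past the last character
        CMP   WORD PTR INAREA[BX-2],0A0DH  ;Ends with 0DH, 0AH?
        JE    G20               ; yes, entry is complete
G10:                            ;Entry too long: issue further INTs here
G20:                            ;Entry complete: process INAREA here
G30:                            ;Error: examine the code in the AX here
```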
10B INT 21
10F DB 20 20 20 20 20 20 20 20 20 20 20 20
The program sets the AH and BX to request keyboard input and inserts the maximum
length in the CX. It also sets offset 10FH in the DX, the location of the DB, where the
entered characters are to begin.
When you have keyed in the instructions, press Enter again. Try the U command
(U 100,10E) to unassemble the program. Use R and repeated T commands to trace the
execution of the four MOV instructions. At location 10BH, use P (Proceed) to execute through
the interrupt. The operation waits for you to key in characters followed by Enter. Check the
contents of the AX register and the carry flag, and use D DS:10F to display the entered
characters in memory. You can continue looping indefinitely. Key in Q to quit DEBUG.
KEY POINTS
• The INT 10H instruction transfers control to BIOS for display operations. Two
common operations are function 02H (set cursor) and 06H (scroll screen).
• DOS INT 21H provides special functions to handle some of the complexity of
input/output.
• When using INT 21H, function 09H, for displaying, define a delimiter ($)
immediately following the display area. A missing delimiter can cause spectacular effects on
the screen.
• INT 21H, function 0AH, for keyboard input expects the first byte to contain a
maximum value and automatically inserts an actual value in the second byte.
• A file handle is a number that refers to a specific device. Some numbers for file
handles are preset, while others can be set by your program.
• For DOS function 40H to display, use handle 01 in the BX.
• For DOS function 3FH for keyboard input, use handle 00 in the BX. The operation
includes Enter and Line Feed characters following the typed characters in the input
area. It does not check for entries that exceed your specified maximum.
QUESTIONS
9-1. What are the hex values for (a) the top leftmost location and (b) the bottom rightmost location
on an 80-column screen?
9-2. Code the instructions to set the cursor to row 12, column 8.
9-3. Code the instructions to clear the screen, beginning at row 12, column 0, through row 22, col-
umn 79.
9-4. Code data items and DOS INT 21H, function 09H, to display the message “What is the date
(mm/dd/yy)?” Follow the message with a beep.
9-5. Code data items and DOS INT 21H, function 0AH, to accept input from the keyboard
according to the format in Question 9-4.
9-6. The section titled “Clearing the Input Area” shows how to clear to blank the entire keyboard
input area, defined as NAMEFLD. Change the example so that it clears only the characters im-
mediately to the right of the most recently entered name.
9-7. Key in the program in Figure 9-2 with the following changes: (a) Instead of row 12, set the
center at row 15; (b) instead of clearing the entire screen, clear only rows 0 through 15.
Assemble, link, and test the new program.
9-8. Identify the standard file handles for (a) keyboard input; (b) normal screen display; (c) the
printer.
9-9. Code data items and DOS INT 21H, function 40H, to display the message “What is the date
(mm/dd/yy)?” Follow the message with a beep.
9-10. Code data items and DOS INT 21H, function 3FH, to accept input from the keyboard
according to the format in Question 9-4.
9-11. Revise Figure 9-2 for use with DOS INT 21H, functions 3FH and 40H, for input and display.
Assemble, link, and test the new program.
CHAPTER 10
Advanced Screen Processing
OBJECTIVE:
To cover advanced features of screen handling, including
scrolling, reverse video, blinking, and the use of color
graphics.
INTRODUCTION
Chapter 9 introduced the basic features concerned with screen handling and keyboard in-
put. This chapter provides advanced features related to video adapters, setting modes (text
or graphics), and screen handling. The first section describes the common video adapters
and their associated video display areas.
The sections on text mode explain the use of the attribute byte for color, blinking, and
high intensity, as well as the instructions to set the cursor size and location, to scroll up or
down the screen, and to display characters. The last few sections explain the use of graph-
ics mode, together with the various instructions used for its display.
This chapter introduces the following services offered by BIOS INT 10H:
00H  Set video mode
01H  Set cursor size
02H  Set cursor position
03H  Read cursor position
04H  Read light pen position
05H  Select active page
VIDEO ADAPTERS
The VGA and its superVGA clones replaced the CGA and EGA video adapters. Software
written for a CGA or an EGA usually can run on a VGA system, although software
written specifically for a VGA doesn't run on a CGA or an EGA.
A video adapter consists of three basic units: the video controller, video BIOS, and
video display area.
1. The video controller, the workhorse unit, generates the monitor’s scan signals for the
selected text or graphics mode. The computer’s processor sends instructions to the
controller’s registers and reads status information from them.
2. The video BIOS, which acts as an interface to the video adapter, contains such rou-
tines as setting the cursor and displaying characters.
3. The video display area in memory contains the information that the monitor is to dis-
play. The interrupts that handle screen displays transfer your data directly to this area.
The locations of the video display area depend on the video modes in use. Following
are the beginning video display segment addresses for major video adapters:
• A000:[0]  Used for font descriptors when in text mode and for high-resolution
  graphics for EGA, MCGA, and VGA
• B000:[0]  Monochrome text mode for MDA, EGA, and VGA
The common RGB color graphics monitor accepts input signals that are sent to three
separate electron guns—red, green, and blue, for each of the primary additive colors.
If you write software for unknown video monitors, you can use INT 10H, function
0FH (covered later), which returns the current video mode in the AL. Another approach is
to use BIOS INT 11H to determine the device attached to the system, although the infor-
mation delivered is rather primitive. The operation returns a value to the AX, with bits 5
and 4 indicating video mode:
You can test the AX for the type of monitor and then set the mode accordingly.
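As a sketch of that test, the following isolates bits 5 and 4 of the equipment word and selects a text mode. It assumes the usual meaning of those bits (01 = 40 x 25 color, 10 = 80 x 25 color, 11 = 80 x 25 monochrome) and, as a simplification, treats every value other than 11 as 80-column color. The labels M10MONO and M20DONE are hypothetical.

```asm
        INT   11H               ;Request equipment list in the AX
        AND   AL,30H            ;Isolate bits 5 and 4
        CMP   AL,30H            ;Bits 11 = monochrome?
        JE    M10MONO           ; yes
        MOV   AX,0003H          ;AH = 00H (set mode), AL = 03 (color text)
        INT   10H               ;Call BIOS
        JMP   M20DONE
M10MONO:
        MOV   AX,0007H          ;AH = 00H (set mode), AL = 07 (mono text)
        INT   10H               ;Call BIOS
M20DONE:
```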
TEXT MODE
Text mode is used for the normal display of ASCII characters on the screen. Processing is
similar for both monochrome and color, except that color does not support the underline at-
tribute. Text mode provides access to the full extended ASCII 256-character set. Figure
10-1 shows common text modes, with the mode number on the left.
Text modes 00 (mono) and 01 (color). These modes provide 40-column for-
mat. Although originally designed for the CGA, they are upward compatible and also work
on EGA and VGA systems.
Text modes 02 (mono) and 03 (color). These modes provide conventional 80-column
format. Although originally designed for the CGA, they are upward compatible and
also work on EGA and VGA systems.
Text mode 07 (mono). This is the standard monochrome mode for MDA, EGA,
and VGA and offers respectable screen resolutions.
Attribute Byte
An attribute byte in text (not graphics) mode determines the characteristics of each dis-
played character. When a program sets an attribute, it remains set; that is, all subsequent
displayed characters have the same attribute until another operation changes it. You can use
INT 10H functions to generate a screen attribute and perform such actions as scroll up,
scroll down, read attribute or character, or display attribute or character. If you use DEBUG
to view the video display area of your system, you'll see each one-byte character,
immediately followed by its one-byte attribute.
The attribute byte has the following format, according to bit position:
                 Background      Foreground
Attribute:     BL   R   G   B    I   R   G   B
Bit number:     7   6   5   4    3   2   1   0
The letters R, G, and B indicate bit positions for red, green, and blue, respectively.
The RGB bits define a color—on both color and monochrome, 000 is black and 111
is white. For example, an attribute set with the value 0000 0111 means black background
with white foreground.
Monochrome Display
For a monochrome monitor, bit 0 sets the underline attribute. To specify attributes, you may
set combinations of bits as follows:
Color Display
For many color displays, the background can display 1 of 8 colors and the foreground
characters can display 1 of 16 colors. Blinking and intensity apply only to the foreground. You
can also select 1 of 16 colors for the border. Color monitors do not provide underlining;
instead, setting bit 0 selects the blue color as foreground.
The attribute byte is used the same way as was shown for a monochrome monitor.
The three basic colors are red, green, and blue. You can combine these in the attribute byte
to form a total of 8 colors (including black and white) and can set high intensity, for a total
of 16 colors:
COLOR      I R G B     COLOR                  I R G B
Black      0 0 0 0     Gray                   1 0 0 0
Blue       0 0 0 1     Light blue             1 0 0 1
Green      0 0 1 0     Light green            1 0 1 0
Cyan       0 0 1 1     Light cyan             1 0 1 1
Red        0 1 0 0     Light red              1 1 0 0
Magenta    0 1 0 1     Light magenta          1 1 0 1
Brown      0 1 1 0     Yellow                 1 1 1 0
White      0 1 1 1     High-intensity white   1 1 1 1
If the background and foreground colors are the same, the displayed character is in-
visible. You can also use the attribute byte to cause a foreground character to blink. Here
are some typical attributes:
Background    Foreground
BL R G B      I R G B      Hex
Black
Blue
Red
Cyan
Light magenta
Gray (blinking)
You can use INT 11H to determine the type of monitor installed. Then, for mono-
chrome, use 07H to set the normal attribute (black background, white foreground) and, for
color, use any of the color combinations described. The color stays set until another oper-
ation changes it. Text mode also supports screen pages 0-3, where page 0 is the normal
screen.
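One way to avoid working out attribute values by hand is to let the assembler combine the bits. The sketch below assumes MASM's SHL and OR operators in constant expressions; the EQU names are hypothetical and follow the color values described in this section.

```asm
BLUE    EQU   1                 ;Color values (RGB bits)
GREEN   EQU   2
MAGENTA EQU   5
WHITE   EQU   7
INTENS  EQU   08H               ;Bit 3: high intensity (foreground)
BLINK   EQU   80H               ;Bit 7: blinking

        MOV   BH,(WHITE SHL 4) OR BLUE                      ;71H
        MOV   BL,BLINK OR (MAGENTA SHL 4) OR INTENS OR GREEN ;0DAH
```

Shifting the background color left four bits places it in bits 6 through 4, leaving bits 2 through 0 for the foreground color.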
As an example, the following INT 10H operation (explained later) uses function 09H
to display five light green, blinking asterisks on a magenta background:
MOV AH,09H ;Request display
You can use DEBUG to check out this example, as well as trying other color
combinations.
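For reference, the complete request can be sketched as follows, assuming the standard register use for INT 10H function 09H (character in the AL, page in the BH, attribute in the BL, count in the CX). The attribute 0DAH is binary 1101 1010: blinking, magenta background, high intensity, green foreground.

```asm
        MOV   AH,09H            ;Request display
        MOV   AL,'*'            ;Asterisk
        MOV   BH,00             ;Page number 0
        MOV   BL,0DAH           ;Blinking light green on magenta
        MOV   CX,05             ;Five times
        INT   10H               ;Call BIOS
```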
SCREEN PAGES
Text modes allow you to store data in video memory in pages. Page numbers are 0 through
3 for normal 80-column mode (and 0 through 7 for the rarely used 40-column screen). In
80-column mode, page number 0 is the default and begins in the video display area at
B800[0], page 1 begins at B900[0], page 2 at BA00[0], and page 3 at BB00[0].
You may format any of the pages in memory, although you can display only one page
at a time. Each character to be displayed on the screen requires two bytes of memory—one
byte for the character and a second for its attribute. In this way, a full page of characters for
80 columns and 25 rows requires 80 × 25 × 2 = 4,000 bytes. The amount of memory
actually allocated for each page is 4K, or 4,096 bytes, so that 96 unused bytes immediately
follow each page.
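Since each page occupies 4,096 bytes, or 100H paragraphs, the segment address of page n in 80-column color text mode is B800H + (n × 0100H). The following sketch computes it; PAGENO is a hypothetical data item, and the fragment only forms the address and does not write to the display area.

```asm
PAGENO  DB    01                ;Hypothetical item: page number 0-3

        MOV   AH,PAGENO         ;Page in high byte, so that
        MOV   AL,00             ; AX = page x 0100H
        ADD   AX,0B800H         ;Add segment address of page 0
        MOV   ES,AX             ;ES = B800H, B900H, BA00H, or BB00H
```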
You can adjust the cursor size between the top and bottom—0:14 for VGA, 0:13 for
monochrome and EGA, and 0:7 for CGA. The following code enlarges the cursor from top
to bottom for a VGA:
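A minimal sketch of such a request, assuming INT 10H function 01H (set cursor size), which takes the top scan line in the CH and the bottom scan line in the CL:

```asm
        MOV   AH,01H      ;Request set cursor size
        MOV   CH,00       ;Top scan line
        MOV   CL,14       ;Bottom scan line (VGA)
        INT   10H         ;Call BIOS
```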
The cursor now blinks as a solid rectangle. You can adjust its size anywhere between the
stated bounds, for example 04:08 or 03:10. The cursor retains these attributes
until another operation changes them. Using 13:14 (VGA), 12:13 (monochrome or EGA), or
6:7 (CGA) resets the cursor to normal. If you are unsure of your monitor's bounds, first try
executing function 03H under DEBUG.
The cursor location on each page is independent of its location on the other pages. This code
sets row 5, column 20, for page 0:
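A minimal sketch, assuming INT 10H function 02H (set cursor position) with the page in the BH, the row in the DH, and the column in the DL:

```asm
        MOV   AH,02H      ;Request set cursor
        MOV   BH,00       ;Page number 0
        MOV   DH,05       ;Row 5
        MOV   DL,20       ;Column 20
        INT   10H         ;Call BIOS
```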
• AX and BX = Unchanged
• CH = Starting scan line of the cursor
• CL = Ending scan line
• DH = Row
• DL = Column
The following example uses function 03H to read the cursor and determine its lo-
cation and size and then uses function 02H to advance the cursor to the next column on
the screen:
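A minimal sketch, assuming page 0. It relies on function 03H leaving the BX unchanged and returning the row and column in the DH and DL, as listed above:

```asm
        MOV   AH,03H      ;Request read cursor
        MOV   BH,00       ;Page number 0
        INT   10H         ;Returns CH:CL = size, DH:DL = row:column
        MOV   AH,02H      ;Request set cursor
        INC   DL          ;Advance to the next column
        INT   10H         ;BH and DH still hold page and row
```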
The following code scrolls the full screen one line and sets a color attribute:
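A minimal sketch, assuming INT 10H function 06H (scroll up) on an 80 x 25 screen. The attribute 61H (blue on brown) is only an example value:

```asm
        MOV   AX,0601H    ;AH = 06H (scroll up), AL = 01 (one line)
        MOV   BH,61H      ;Attribute for the blanked line (example value)
        MOV   CX,0000     ;Upper left row:column 00:00
        MOV   DX,184FH    ;Lower right row:column 24:79
        INT   10H         ;Call BIOS
```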
1. Define an item named, for example, ROW, initialized to zero, for setting the row lo-
cation of the cursor.
2. Display a line and advance the cursor to the next line.
3. Test to see whether ROW is near the bottom of the screen (CMP ROW,22).
4. If no, increment ROW (INC ROW) and exit.
5. If yes, scroll one line, use ROW to set the cursor, and clear ROW to 00.
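The five steps can be sketched as follows. The labels C20 and C90, the normal attribute 07H, and column 0 for the cursor are assumptions made for illustration:

```asm
ROW     DB    00          ;Step 1: row location of the cursor

; ... step 2: display a line, then:
        CMP   ROW,22      ;Step 3: near bottom of screen?
        JAE   C20         ; yes, go scroll
        INC   ROW         ;Step 4: no, increment the row
        JMP   C90         ; and exit
C20:    MOV   AX,0601H    ;Step 5: scroll up one line
        MOV   BH,07H      ;Normal attribute (assumed)
        MOV   CX,0000     ;Full screen
        MOV   DX,184FH
        INT   10H
        MOV   AH,02H      ;Use ROW to set the cursor
        MOV   BH,00       ;Page 0
        MOV   DH,ROW
        MOV   DL,00       ;Column 0 (assumed)
        INT   10H
        MOV   ROW,00      ;Clear ROW, as step 5 states
C90:                      ;Exit point
```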
The CX and DX registers permit scrolling any portion of the screen. But be especially
careful to match the AL value with the distance in the CX:DX, especially when you refer-
ence a partial screen. The following instructions scroll five lines, in effect creating a win-
dow at the center of the screen with its own attributes:
MOV AX,0605H ;Scroll five lines
This example specifies scrolling five lines, which is the same value as the distance
between rows 10 and 14. Since the attribute for a window remains set until another opera-
tion changes it, you may set various windows to different attributes at the same time.
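A minimal sketch of the full window request. Rows 10 and 14 come from the text; the columns (28 through 52) and the attribute 61H are assumed values chosen to center the window:

```asm
        MOV   AX,0605H    ;Scroll up five lines
        MOV   BH,61H      ;Attribute for the window (example value)
        MOV   CX,0A1CH    ;Upper left: row 10, column 28 (column assumed)
        MOV   DX,0E34H    ;Lower right: row 14, column 52 (column assumed)
        INT   10H         ;Call BIOS
```

Rows 10 through 14 make five lines, which matches the 05 in the AL.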
Function 08H can read both a character and its attribute from the video display area in ei-
ther text or graphics mode. Load the page number, normally 0, in the BH, as the following
example shows:
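A minimal sketch of the read request at the current cursor position:

```asm
        MOV   AH,08H      ;Request read character/attribute
        MOV   BH,00       ;Page number 0 (normal)
        INT   10H         ;Returns AL = character, AH = attribute
```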
The operation returns the character in the AL and its attribute in the AH. In graphics mode,
the operation returns 00H for a non-ASCII character. Since only one character at a time is
read, you have to code a loop to read successive characters.
The operation does not advance the cursor or respond to the Bell, Carriage Return, Line
Feed, or Tab character; instead, it attempts to display them as ASCII characters. The fol-
lowing code displays five blinking hearts with reverse video:
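A minimal sketch, assuming INT 10H function 09H and page 0. The heart is ASCII 03H, and F0H combines the blink bit with reverse video (white background, black foreground):

```asm
        MOV   AH,09H      ;Request display
        MOV   AL,03H      ;Heart character
        MOV   BH,00       ;Page number 0 (assumed)
        MOV   BL,0F0H     ;Blink (1), reverse video (111 0000)
        MOV   CX,05       ;Five times
        INT   10H         ;Call BIOS
```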
Displaying different characters requires a loop. In text but not graphics mode, dis-
played characters automatically carry over from one line to the next. To display a prompt
or message, code a routine that sets the CX to 01 and loops to move one character at a time
from memory into the AL. (Since the CX is occupied, you can’t easily use the LOOP in-
struction.) Also, after displaying each character, use INT 10H, function 02H, to advance the
cursor to the next column.
You can use this operation to change any valid video page and then use function 05H
to display the page.
DOS INT 21H functions that can print a string of characters and respond to screen
control characters are often more convenient than BIOS operations.
The Backspace (08H), Bell (07H), Carriage Return (0DH), and Line Feed (0AH) control
characters act as commands for screen formatting. The operation automatically advances
the cursor, wraps characters onto the next line, scrolls the screen, and maintains the
present screen attributes.
JE... ; Jump
0AH Display each character, including control characters, at the current cursor
position.
The characters are displayed in 16 columns and 16 rows. This program, like others in
this book, is written for clarity rather than processing efficiency. You could revise the
program to make it more efficient, for example by using registers for the row, column, and
ASCII character generator. Also, since INT 10H destroys only the contents of the AX
register, the values in the other registers don't have to be reloaded. However, the program
won't run noticeably faster and it would lose some clarity.
F10READ PROC
MOV
INT
RET
F10READ ENDP
The following code uses INT 10H, function 09H, to draw a solid horizontal line 25
positions long:
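A minimal sketch, assuming the single-line box character C4H, page 0, and a high-intensity white attribute (all three are illustrative choices):

```asm
        MOV   AH,09H      ;Request display
        MOV   AL,0C4H     ;Horizontal line character (assumed)
        MOV   BH,00       ;Page number 0
        MOV   BL,0FH      ;High-intensity white on black (assumed)
        MOV   CX,25       ;25 repetitions
        INT   10H         ;Call BIOS
```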
INT 21H
In the next chapter, Figure 11—1 displays a similar menu in a double-line box. The
“dots on” characters for drop shadows are often used to the right or bottom of a box:
Value Character
To control the placement of the cursor, the program defines ROW for incrementing
the screen row and COL for advancing the cursor when displaying the prompt and name.
(INT 10H, function 09H, does not automatically advance the cursor.) The program displays
down the screen until it reaches row 20 and then begins scrolling up one line for each ad-
ditional prompt.
For keyboard input, the procedure D10INPT uses INT 21H, function 0AH.
        page  60,132
TITLE   P10NMSCR (EXE) Reverse video, blinking, scrolling
        .MODEL SMALL
        .STACK 64
        .DATA
NAMEPAR LABEL BYTE            ;Name parameter list:
MAXNLEN DB    20              ; maximum length of name
ACTNLEN DB    ?               ; no. of chars entered
NAMEFLD DB    20 DUP(' ')     ; name
COL     DB    00
COUNT   DB    4
PROMPT  DB    'Name? '
ROW     DB    00
        .CODE
BEGIN   PROC  FAR
        MOV   AX,@data        ;Initialize segment
        MOV   DS,AX           ; registers
        MOV   ES,AX
        MOV   AX,0600H
        CALL  Q10SCR          ;Clear screen
A20LOOP:
        MOV   COL,00          ;Set column to 0
        CALL  Q20CURS
        CALL  B10PRMP         ;Display prompt
        CALL  D10INPT         ;Provide for input of name
        CMP   ACTNLEN,00      ;No name? (indicates end)
        JNE   A30
        MOV   AX,0600H
        CALL  Q10SCR          ;If so, clear screen,
        MOV   AX,4C00H        ;Exit to DOS
        INT   21H
A30:
        CALL  E10NAME         ;Display name
        JMP   A20LOOP
BEGIN   ENDP
;               Display prompt:
B10PRMP PROC  NEAR
        LEA   SI,PROMPT       ;Set address of prompt
        MOV   COUNT,05
B20:
        MOV   BL,71H          ;Reverse video
        CALL  F10DISP         ;Display routine
        INC   SI              ;Next character in name
        INC   COL             ;Next column
        CALL  Q20CURS         ;Set cursor
        DEC   COUNT           ;Countdown
        JNZ   B20             ;Loop n times
        RET
B10PRMP ENDP
;               Accept input of name:
D10INPT PROC  NEAR
        MOV   AH,0AH          ;Request keyboard
        LEA   DX,NAMEPAR      ; input
        INT   21H
        RET
D10INPT ENDP
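;               Display name:
E10NAME PROC  NEAR
        LEA   SI,NAMEFLD      ;Initialize name
        MOV   COL,40          ;Set screen column
E20:
        CALL  Q20CURS         ;Set cursor
        MOV   BL,0F1H         ;Blink reverse video
        CALL  F10DISP         ;Display routine
        INC   SI              ;Next character in name
        INC   COL             ;Next screen column
        DEC   ACTNLEN         ;Countdown name length
        JNZ   E20             ;Loop n times
        CMP   ROW,20          ;Near bottom of screen?
        JAE   E30
        INC   ROW             ; no, next row
        RET
E30:
        MOV   AX,0601H        ; yes,
        CALL  Q10SCR          ; scroll screen
        RET
E10NAME ENDP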
;               Display character:
;               F10DISP
;               Q10SCR
;               Set cursor row/col:
;               Q20CURS
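A minimal sketch of the three helper procedures that the program calls. It assumes that F10DISP displays the character at [SI] with the attribute already in the BL, that Q10SCR expects the scroll request already in the AX, and that Q20CURS positions the cursor from ROW and COL. The attribute 17H in Q10SCR is an assumed value.

```asm
;               Display character (assumed body):
F10DISP PROC  NEAR            ;BL = attribute on entry
        MOV   AH,09H          ;Request display
        MOV   AL,[SI]         ;Get character
        MOV   BH,00           ;Page number 0
        MOV   CX,01           ;One character
        INT   10H
        RET
F10DISP ENDP
;               Scroll or clear screen (assumed body):
Q10SCR  PROC  NEAR            ;AX = 0600H or 0601H on entry
        MOV   BH,17H          ;Attribute (assumed value)
        MOV   CX,0000         ;Full screen
        MOV   DX,184FH
        INT   10H
        RET
Q10SCR  ENDP
;               Set cursor row/col (assumed body):
Q20CURS PROC  NEAR
        MOV   AH,02H          ;Request set cursor
        MOV   BH,00           ;Page number 0
        MOV   DH,ROW
        MOV   DL,COL
        INT   10H
        RET
Q20CURS ENDP
```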
D B800:00
The display shows what was on the screen at the time you typed the command, which is
usually a set of bytes containing 20 07H (for blank character, black background, and white
foreground). Note that DEBUG and you are both competing for the same display area and
screen. Try changing the screen with these commands to display happy faces on the top and
bottom rows:
E B800:000 01 25 02 36 03 47
E B800:F90 01 25 02 36 03 47
The program in Figure 10-4 gives an example of transferring data directly to the
video display area at B900[0]H—that is, page 1, rather than the default page 0. The pro-
gram uses the SEGMENT AT feature to define the BIOS video display area, in effect as a
dummy segment. (This is not a violation of the rule that a .COM program may have only
one segment.) VIDAREA identifies the location of page 01, at the start of the segment.
The program displays characters in rows 5 through 20 and columns 10 through 70.
The first row displays a string of the character A (41H) with an attribute of 01H, the sec-
ond row displays a string of the character B (42H) with an attribute of 02H, and so forth,
with the character:attribute incremented for each row.
        .STACK 64
        .CODE
BEGIN   PROC  FAR
        MOV   AX,VIDSEG       ;Addressability for
        MOV   ES,AX           ; video area
        ASSUME ES:VIDSEG
        MOV   AH,0FH          ;Request get
        INT   10H             ; and save
        PUSH  AX              ; current mode
        PUSH  BX              ; and page
        MOV   AH,00H          ;Request set
        MOV   AL,03           ; mode 03, clear screen
        INT   10H
        MOV   AH,05H          ;Request set
        MOV   AL,01H          ; page #01
        INT   10H
        CALL  C10PROC         ;Process display area
        CALL  E10INPT         ;Provide for input
        MOV   AH,05H          ;Restore
        POP   BX              ; original
        MOV   AL,BH           ; page number
        INT   10H
        POP   AX              ;Restore video
        MOV   AH,00H          ; mode (in AL)
        INT   10H
        MOV   AX,4C00H        ;Exit to DOS
        INT   21H
BEGIN   ENDP
The program establishes the starting position of a page in the video display area based
on the fact that a row of 80 columns occupies 80 × 2 = 160 bytes. The starting position, then, for
row 4 (the fifth row, counting from 0), column 10, is (160 × 4 rows) + (10 columns × 2) = 660. After displaying one
row, the program advances 40 positions in the display area for the start of the next line and
ends on reaching the letter Q (51H).
The video display segment for page 1 is defined as VIDSEG and the page as
VIDAREA. The program establishes the ES register as the segment register for VIDSEG.
At the start, the program saves the current mode and page and then sets mode 03 and
page 01.
In the procedure C10PROC, the starting character and attribute are initialized in
the AX and the starting video area offset in the DI. The instruction MOV WORD PTR
[VIDAREA+DI],AX moves the contents of the AL (the character) to the first byte of the
display area and the AH (the attribute) to the second byte. The LOOP routine executes this
instruction 60 times, displaying the character:attribute across the screen. It then increments
the character:attribute and adds 40 to the DI—20 for the end of the current row and 20 for
indenting the start of the next row (on the screen, 10 columns each). The routine then re-
peats the display of the next row of characters.
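The procedure described above can be sketched as follows. The labels C20 and C30 and the size of VIDAREA are assumptions; the starting offset 660, the count of 60, the increment of 40, and the ending letter Q (51H) are the values given in the text.

```asm
VIDSEG  SEGMENT AT 0B900H     ;Page 1 of the video display area
VIDAREA DB    1000H DUP(?)    ;4K page (size assumed)
VIDSEG  ENDS

; Assumes ES = VIDSEG and ASSUME ES:VIDSEG are in effect.
C10PROC PROC  NEAR
        MOV   AX,0141H        ;AH = attribute 01H, AL = character A (41H)
        MOV   DI,660          ;Starting offset in the display area
C20:
        MOV   CX,60           ;60 characters per row
C30:
        MOV   WORD PTR [VIDAREA+DI],AX  ;Character, then attribute
        ADD   DI,2            ;Next character:attribute position
        LOOP  C30             ;Repeat across the row
        INC   AH              ;Next attribute
        INC   AL              ;Next character
        ADD   DI,40           ;Skip to the indent of the next row
        CMP   AL,51H          ;Reached the letter Q?
        JNE   C20             ; no, display next row
        RET
C10PROC ENDP
```

Each row advances the DI by 120 bytes for the 60 characters plus 40 for the indent, which totals the 160 bytes of one screen row.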
On completion of the display, the procedure E10INPT waits for the user to press a
key and then the program restores the original mode and page.
GRAPHICS MODE
Graphics adapters have two basic modes of operation: text (the default) and graphics.
Use BIOS INT 10H, function 00H, to set graphics or text mode, as the following two
examples show:
1. Set graphics mode for VGA:
MOV AH, 00H ;Request set mode
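A minimal sketch of both requests. Mode 12H (640 x 480) is only one possible VGA graphics mode, and mode 03H is the standard 80-column color text mode:

```asm
        MOV   AH,00H      ;Request set mode
        MOV   AL,12H      ;VGA graphics, 640 x 480 (one possible choice)
        INT   10H         ;Call BIOS

        MOV   AH,00H      ;Request set mode
        MOV   AL,03H      ;Color text, 80 columns
        INT   10H         ;Call BIOS
```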
The EGA and the VGA provide significantly better resolution than the original CGA
and are compatible with it in many ways. Resolutions and modes for graphics adapters are
shown in Figure 10-5 and are as follows:
• Graphics modes 04H, 05H, and 06H. The address of the video display area for these
modes is B800[0]. These are the original CGA modes, which are also used by the
EGA and VGA for upward compatibility, so that programs written for the CGA can
often run on an EGA or VGA.
• Graphics modes 0DH, 0EH, 0FH, and 10H. The address of the video display area for
these modes is A000[0]. These are the original EGA modes, which are also used by
the VGA for upward compatibility, so that programs written for the EGA can usually
run on a VGA. These modes also support 8, 4, 2, and 2 pages of video display area,
respectively, with page 0 the default.
• Graphics modes 11H, 12H, and 13H. The address of the video display area for these
modes is A000[0]. These modes are specifically designed for the VGA (and the now
rare MCGA) and are not usable by other video adapters.
In graphics mode, ROM contains dot patterns for only the first (bottom) 128 charac-
ters. INT 1FH provides access to a 1K area in memory that defines the top 128 characters,
eight bytes per character.
Pixels
Graphics mode uses pixels (also, picture elements or pels) to generate color patterns. For
example, mode 04H for standard color graphics provides 200 rows of 320 pixels. Each byte
represents four pixels (that is, two bits per pixel), numbered 0 through 3, as follows:
byte:    C1 C0   C1 C0   C1 C0   C1 C0
pixel:     0       1       2       3
At any given time, there are four available colors, numbered 0 through 3. The limi-
tation of four colors is because a two-bit pixel provides four bit combinations: 00, 01, 10,
and 11. You can choose pixel 00 for any one of the 16 available colors for the background:
And you can choose pixels 01, 10, and 11 for any one of two three-color palettes:
Pixel    Palette 0     Palette 1
00       background    background
01       green         cyan
10       red           magenta
11       brown         white
Use INT 10H, function 0BH, to select a color palette and the background. Thus if you
choose background color yellow and palette 0, the available colors are yellow, green, red,
and brown. A byte consisting of the pixel value 10101010 would display as all red. If you
choose background color blue and palette 1, the available colors are blue, cyan, magenta,
and white. A byte consisting of pixel value 00011011 would display blue, cyan, magenta,
and white.
1. BH = 00. Select the background color, where the BL contains the color value in bits
0-3 (any of 16 colors):
2. BH = 01. Select the palette for graphics, where the BL contains the palette (0 or 1):
MOV AH,0BH ;Request color
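A minimal sketch of both requests. The background color 01 (blue) and palette 1 are example values:

```asm
        MOV   AH,0BH      ;Request color
        MOV   BH,00       ;Select background
        MOV   BL,01       ;Color value in bits 0-3: blue (example)
        INT   10H         ;Call BIOS

        MOV   AH,0BH      ;Request color
        MOV   BH,01       ;Select palette
        MOV   BL,01       ;Palette 1: cyan, magenta, white (example)
        INT   10H         ;Call BIOS
```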
Once you set a palette, it remains set. But once you change the palette, the whole
screen changes to that color combination. If you use function 0BH while in text mode, the
value set for color 0 for the palette determines the color of the border.
The minimum value for the column or row is 0, and the maximum value depends on the
video mode. The following example sets a pixel at column 50, row 70, on the screen:
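A minimal sketch, assuming INT 10H function 0CH (write pixel dot), which takes the color in the AL, the page in the BH, the column in the CX, and the row in the DX. The color 03 and page 0 are example values:

```asm
        MOV   AH,0CH      ;Request write dot
        MOV   AL,03       ;Color of the pixel (example)
        MOV   BH,00       ;Page number 0
        MOV   CX,50       ;Column 50
        MOV   DX,70       ;Row 70
        INT   10H         ;Call BIOS
```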
EGA/VGA modes 0DH, 0EH, 0FH, and 10H provide 8, 4, 2, and 2 pages of video
display area, respectively. The default page is number 0.
Other AL subfunction codes for the VGA under function 10H are 07H (read individual
palette register), 08H (read overscan register), 09H (read all palette registers and
overscan), 10H (set individual color register), 12H (set block of color registers), 13H (select
color page), 15H (read individual color register), 17H (read block of color registers), and
1AH (read color page state).
This function saves and restores the video state, including the status of color registers, BIOS
data area, and video hardware.
The program in Figure 10-6 uses a number of INT 10H functions, including the following,
for a display of graphics:
The actual screen displayed is 210 rows and 512 columns. Note that rows and
columns are in terms of dots, not characters.
The program increments the color for each row (so that bits 0000 become 0001, etc.)
and, since only the rightmost four bits are used, the colors repeat after every 16 rows. The
display begins 64 columns from the left of the screen and ends 64 columns from the right.
At the end, the program waits for the user to press a key, and then it resets the dis-
play to the original mode. For a VGA system, you could experiment by trying various
graphics modes.
Since video graphics adapters support various services, there may be times when you want
to know what type of adapter is installed in a system. A recommended way is to check first
for VGA, then for EGA, and last for CGA or MDA. Here are the steps:
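One common way to carry out such a check is sketched below. It is not necessarily the sequence this chapter lists: it assumes the standard BIOS services INT 10H function 1AH (supported by a VGA or MCGA BIOS, which returns 1AH in the AL), INT 10H function 12H with BL = 10H (an EGA BIOS changes the BL), and the INT 11H equipment word (bits 4-5 both on for monochrome). The result codes left in the DL and the label G90 are hypothetical.

```asm
        MOV   AX,1A00H    ;Function 1AH, subfunction 00
        INT   10H
        MOV   DL,03       ;Assume VGA (hypothetical code 3)
        CMP   AL,1AH      ;Function supported?
        JE    G90         ; yes, VGA (or MCGA)
        MOV   AH,12H      ;Function 12H
        MOV   BL,10H      ;Subfunction 10H: EGA information
        INT   10H
        MOV   DL,02       ;Assume EGA (hypothetical code 2)
        CMP   BL,10H      ;BL changed?
        JNE   G90         ; yes, EGA
        INT   11H         ;Equipment status in AX
        MOV   DL,00       ;Assume MDA (hypothetical code 0)
        AND   AL,30H      ;Isolate bits 4-5
        CMP   AL,30H      ;Both on?
        JE    G90         ; yes, monochrome
        MOV   DL,01       ;Otherwise CGA (hypothetical code 1)
G90:                      ;DL = adapter code
```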
Since an EGA may be installed along with an MDA or CGA, you may want to determine
whether the EGA is active. The BIOS data area at 40:0087 contains an EGA instruction
byte. Check bit 3, where 0 means that the EGA is active and 1 means that it is inactive.
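A minimal sketch of the test; the labels H20 and H90 are hypothetical:

```asm
        MOV   AX,0040H                ;Segment of the BIOS data area
        MOV   ES,AX
        TEST  BYTE PTR ES:[0087H],08H ;Bit 3 on?
        JZ    H20                     ; no (0), EGA is active
        JMP   H90                     ; yes (1), EGA is inactive
H20:                                  ;Handle the active EGA here
H90:
```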
KEY POINTS
The attribute byte for text mode provides for blinking, reverse video, and high inten-
sity. For color text, the RGB bits enable you to select colors, but not underlining.
BIOS INT 10H provides functions for full screen processing, such as setting the video
mode, setting the cursor location, scrolling the screen, reading characters from the screen, and
writing characters.
If your program displays lines down the screen, use BIOS INT 10H, function 06H, to
scroll up before the display reaches the bottom.
For INT 10H services that display a character, you have to advance the cursor and
possibly echo the character to the screen.
The 16K memory for color display permits storing additional “pages” or “screens.”
There are four pages per 80-column screen.
The fastest way to display screen characters (text or graphics) is to transfer them di-
rectly to the appropriate video display area.
A pixel (picture element) consists of a specified number of bits, depending on the
graphics adapter and resolution (low, medium, or high).
For graphics modes 04 and 05, you can select 4 colors, of which 1 is any of the 16
available colors and the other 3 are from a color palette.
QUESTIONS
10-1. Provide the attribute bytes, in binary, for monochrome screens for the following: (a) underline
only; (b) white on black, normal intensity; (c) reverse video, intense.
10-2. Provide the attribute bytes, in binary, for the following: (a) magenta on light cyan; (b) brown
on yellow; (c) red on gray, blinking.
10-3. Code the following routines: (a) Set the mode for 80-column monochrome; (b) set the cursor
size to start at line 5 and end at line 12; (c) scroll up the screen 10 lines; (d) display 10 blink-
ing “dots” with one-half dots (hex B1) on.
10-4. Under text mode 03, how many colors are available for background and for foreground?
10-5. Code the instructions for displaying five diamond characters in text mode with light green on
magenta.
10-6. What mode permits the use of screen pages?
10-7. Write a program that uses INT 21H, function 0AH, to accept data from the keyboard and
function 09H to display the characters. The program will clear the screen, set screen colors (your
choice), and accept a set of data from the keyboard beginning at the current position of the cur-
sor. The set of data could be four or five lines (say, any length up to 25 characters) entered
from the keyboard, each followed by Enter. You could use a variety of colors, reverse video,
or beeping as an experiment. Then set the cursor to a different row and column (you decide),
and display the entered data at that location. The program is to accept any number of sets of
data. It could terminate when the user presses Enter with no data. Write the program with a
short main logic routine and a series of called subroutines. Include some concise comments.
10-8. Revise the program in Question 10—7 so that it uses INT 16H for keyboard input and INT
10H, function 09H, for display.
10-9. Explain how the common attribute byte limits the number of available colors.
10-10. Code the instructions to set graphics mode for these resolutions: (a) 320 x 200; (b) 640 x
200; (c) 640 x 480.
10-11. Code the instructions for selecting the background color blue in graphics mode.
10-12. Code the instructions to read a dot from row 12, column 13, in graphics mode.
10-13. Revise the program in Figure 10-6 so that it provides for the following: (a) a suitable graph-
ics mode for your own monitor; (b) background color red; (c) row beginning at 10 and end-
ing at 30; (d) column beginning at 20 and ending at 300.
10-14. Based on the changes you made in Question 10-13, revise the program to display graphics
dots one column (instead of row) at a time. That is, display dots down the screen, then ad-
vance to the next column, and so forth.
CHAPTER 11
Advanced Keyboard Processing
OBJECTIVES:
INTRODUCTION
This chapter describes the many different operations for handling keyboard input, some of
which have specialized uses. Of these operations, INT 21H function 0AH (covered in
Chapter 9) and INT 16H (covered in this chapter) should provide almost all the keyboard
operations you'll require.
Other topics in the chapter include the keyboard shift status bytes, scan codes, and
the keyboard buffer area. The shift status bytes in the BIOS data area enable a program to
determine, for example, whether the Ctrl, Shift, or Alt keys have been pressed. The scan
code is a unique number assigned to each key on the keyboard that enables the system to
identify the source of a pressed key and enables a program to check for extended function
keys such as Home, PgUp, and Arrows. And the keyboard buffer area provides space in
memory for you to type ahead before a program actually requests input.
Operations introduced in this chapter are as follows:
DOS INT 21H FUNCTIONS
01H Keyboard input with echo
THE KEYBOARD
The keyboard provides three basic types of keys:
. Control keys for Alt, Ctrl, and Shift, which work in association with other keys. BIOS
treats these differently from other keys by updating their current state in the shift sta-
tus bytes in the BIOS data area. BIOS does not deliver them as ASCII characters to
your program.
The original PC with its 83 keys suffered from a short-sighted design decision that
caused keys on the so-called numeric keypad to perform two actions. Thus numbers shared
keys with the Home, End, Arrows, Del, Ins, PgUp, and PgDn keys, with the NumLock key
toggling between them. To overcome problems caused by this layout, designers produced
an enhanced keyboard with 101 keys. Of the 18 new keys, only two, F11 and F12, provide
a new function; the rest duplicate the function of keys on the original keyboard. If your pro-
grams allow users to press F11, F12, or any of the fancy new key combinations, the users
must have an enhanced keyboard and a computer with a BIOS that can process them. For
most other keyboard operations, your programs need not be concerned with the type of key-
board that is installed.
Action i Action
You may use INT 16H, function 02H (covered later), to check these values. Note that
“active” means that the user is currently holding down the key; releasing the key clears the
bit value. The 83-key keyboard requires only this shift status byte.
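A minimal sketch of such a check, assuming the standard BIOS layout of the byte at 40:17H in which bit 2 reports the Ctrl key. The label K20 is hypothetical:

```asm
        MOV   AH,02H       ;Request shift status
        INT   16H          ;Returns the byte at 40:17H in AL
        TEST  AL,00000100B ;Ctrl pressed? (bit 2, assumed layout)
        JZ    K20          ; no, bypass
        ; ... Ctrl is currently held down
K20:
```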
The enhanced 101-key keyboard has duplicate (left and right) Ctrl and Alt keys,
so that additional information is needed to test for them. The second byte of the key-
board status needed for the 101-key keyboard is at 40:18H, where a 1-bit indicates the
following:
Action i Action
Bits 0, 1, and 2 are associated with the enhanced (101-key) keyboard. You can now test, for
example, whether either Ctrl or Alt is pressed, or both.
Another keyboard status byte resides at 40:96H. The item of interest to us here is bit
4; when on, it indicates that a 101-key keyboard is installed.
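A minimal sketch of the test; the label K30 is hypothetical:

```asm
        MOV   AX,0040H                ;Segment of the BIOS data area
        MOV   ES,AX
        TEST  BYTE PTR ES:[0096H],10H ;Bit 4 on?
        JZ    K30                     ; no, 83-key keyboard
        ; ... enhanced 101-key keyboard is installed
K30:
```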
To see the effect of the Ctrl, Alt, and Shift keys on the shift status bytes, load DEBUG
for execution. Enter D 40:17 to view the contents of the status bytes. Press the Caps-
Lock, NumLock, and ScrollLock keys, and enter D 40:17 again to see the result on both
status bytes. The byte at 40:17H should show 70H (0111 0000B), and the byte at 40:18H
is probably 00H. The byte at 40:96H should show the presence (or absence) of a 101-key
keyboard.
Try changing the contents of the status byte at 40:17H: enter E 40:17 00. If your
keyboard Lock keys have indicator lights, they should turn off. Now try entering E 40:17 70 to
turn them on again.
You could try various combinations, although it’s difficult to type a valid DEBUG
command while holding down the Ctrl and Alt keys. Enter Q to quit DEBUG.
KEYBOARD BUFFER
An item of interest in the BIOS data area at 40:1EH is the keyboard buffer. This feature
allows you to type up to 15 characters before a program requests input. When you press a
key, the keyboard’s processor generates the key’s scan code (its unique assigned number)
and automatically requests INT 09H.
In simple terms, the BIOS INT 09H routine gets the scan code from the keyboard,
converts it to an ASCII character, and delivers it to the keyboard buffer area. Subsequently,
BIOS INT 16H (the lowest level keyboard operation) reads the character from the buffer
and delivers it to your program. Your program need never request INT 09H, because BIOS
performs it automatically when you press a key. A later section covers INT 09H and the
keyboard buffer in detail.
• AL = a nonzero value means that a standard ASCII character is present, such as a
letter or number, which the operation echoes on the screen.
• AL = zero means that the user has pressed an extended function key such as Home,
F1, or PgUp, and the AH still contains the original function. The operation handles
extended functions clumsily, attempting to echo them on the screen. And to get the
scan code for the function key in the AL, you immediately have to repeat the INT
21H operation. The operation also responds to a Ctrl+Break request.
For screen output, load the ASCII character (not 0FFH) into the DL.
The operation clears the keyboard buffer, executes the function in the AL, and accepts (or
waits for) a character, according to the function request in the AL. You could use this op-
eration for a program that does not allow a user to type ahead.
Key Pressed
The following code tests the AL for 00H to determine whether the user has pressed an
extended function key:
MOV AH,00H ;Request BIOS keyboard input
INT 16H
CMP AL,00H ;Extended function key?
JE G40 ; yes
Since the operation does not echo the character to the screen, you have to issue a screen dis-
play interrupt for that purpose.
JE XxXXX ; —yes
See function 11H for handling the shift status at location 418H for extended functions
on the enhanced keyboard.
You can test the AL for 00H or E0H to determine whether the user has pressed an extended
function key:
MOV AH,10H ;Request BIOS keyboard input
INT 16H
CMP AL,00H ;Extended function key?
JE G40 ; yes
CMP AL,0E0H ;Extended function key?
JE G40 ; yes
Since the operation does not echo the character to the screen, you have to issue a screen dis-
play interrupt for that purpose.
This operation is the same as function 01H, except that it recognizes the additional extended
functions from the enhanced keyboard, whereas 01H does not.
whether you press a character key or an extended function key. For a character, such as the
letter A, the operation delivers these two items:
The keyboard contains two keys each for such characters as —, +, and *. Pressing
the asterisk key, for example, sets the character code 2AH in the AL and one of two scan
codes in the AH, depending on which key was pressed: 09H for the asterisk above the num-
ber 8, or 29H for the asterisk by the numeric keypad.
The following logic tests the scan code to determine which asterisk was pressed:
CMP AL,2AH ;Asterisk?
JNE EXIT1 ; no, exit
CMP AH,09H ;Scan code for asterisk above 8?
JE EXIT2 ; yes
If you press an extended function key, such as Ins, the operation delivers these two
items:
1. In the AL register: Zero, or E0H for a new control key on the enhanced keyboard.
2. In the AH register: The scan code for Ins, 52H.
AH: 52   AL: 00
Thus after an INT 16H operation (and some INT 21H operations), you can test the AL. If
it contains 00H or E0H, the request is for an extended function; otherwise, the operation
has delivered a character. The following tests for an extended function key:
JZ exit ; yes-exit
In the following code, if a user presses the Home key (scan code 47H), the cursor is
set to row 0, column 0:
JE G30 ; yes-—bypass
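A minimal sketch of the Home key test, assuming INT 16H function 10H for input and INT 10H function 02H to set the cursor. The labels K40 and K90 are hypothetical:

```asm
        MOV   AH,10H      ;Request keyboard input
        INT   16H
        CMP   AL,00H      ;Extended function?
        JE    K40         ; yes
        CMP   AL,0E0H     ;Extended function (enhanced keyboard)?
        JNE   K90         ; no, bypass
K40:    CMP   AH,47H      ;Scan code for Home?
        JNE   K90         ; no, bypass
        MOV   AH,02H      ;Request set cursor
        MOV   BH,00       ;Page number 0
        MOV   DX,0000H    ;Row 0, column 0
        INT   10H
K90:
```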
Program function keys F1-F10 generate scan codes 3BH—44H, respectively, and F11
and F12 generate 85H and 86H. The following code tests for program function key F10:
CMP AH,44H ;Scan code for F10?
JE EXIT1 ; yes-exit
Keyboard Exercise
The following DEBUG exercise examines the effects of entering various keyboard charac-
ters. For an 83-key keyboard, use function 00H, and for a 101-key keyboard, use function
10H. Use the command A 100 to enter these instructions:
MOV AH,10 (or MOV AH,00 for an 83-key keyboard)
INT 16
JMP 100
Use the P (Proceed) command to execute the INT operation. Key in various characters, and
compare the results in the AX with the listing in Appendix F.
page 60,132
TITLE P11SELMU (EXE) Select item from menu
.MODEL SMALL
.STACK 64
.DATA
TOPROW EQU 00 ;Top row of menu
BOTROW EQU 07 ;Bottom row of menu
LEFCOL EQU 16 ;Left column of menu
COL DB 00 ;Screen column
ROW DB 00 ;Screen row
COUNT DB ? ;Characters per line
LINES DB ? ;Lines displayed
ATTRIB DB ? ;Screen attribute
NINTEEN DB 19 ;Width of menu
MENU DB 0C9H, 17 DUP(0CDH), 0BBH
DB 0BAH, ' Add records     ', 0BAH
DB 0BAH, ' Delete records  ', 0BAH
DB 0BAH, ' Enter orders    ', 0BAH
DB 0BAH, ' Print report    ', 0BAH
DB 0BAH, ' Update accounts ', 0BAH
DB 0BAH, ' View records    ', 0BAH
DB 0C8H, 17 DUP(0CDH), 0BCH
.CODE
BEGIN PROC FAR
MOV AX, @data ;Initialize segment
MOV DS, AX ; registers
MOV ES,AX
CALL Q10CLR ;Clear screen
MOV ROW,BOTROW+2
MOV COL, 00
CALL Q20CURS ;Set cursor
MOV AH, 40H ;Request display
MOV BX, 01 ;Handle for screen
MOV Cx, 715 ;Number of characters
LEA DX, PROMPT ; Prompt
INT 21H
A10LOOP:
CALL B10MENU ;Display menu
MOV COL, LEFCOL+1
CALL Q20CURS ;Set cursor
MOV ROW, TOPROW+1 ;Set row to top item
MOV ATTRIB,16H ;Set reverse video
CALL H10DISP ;Highlight current menu line
CALL D10INPT ;Provide for menu selection
CMP AL, ODH ;Enter pressed?
JE A10LOOP ; yes, continue
MOV AX, 0600H ;Esc pressed (indicates end)
CALL Q10CLR ;Clear screen
MOV AX,4C00H ;Exit to DOS
INT 21H
BEGIN ENDP
; Display full menu:
B10MENU PROC NEAR
MOV ROW, TOPROW ;Set top row
MOV LINES, 08 ;Number of lines
LEA SI,MENU
MOV ATTRIB, 71H ;Blue on white
B20:
MOV COL,LEFCOL ;Set left column of menu
MOV COUNT, 19
B30:
CALL Q20CURS ;Set cursor next column
MOV AH, 09H ;Request display
MOV AL,[SI] ;Get character from menu
MOV BH, 00 ;Page 0
MOV BL, 71H ;New attribute
MOV CX,01 ;One character
INT 10H
INC COL ;Next column
INC SI ;Set for next character
DEC COUNT ;Last character?
JNZ B30 ;No, repeat
INC ROW ;Next row
DEC LINES
JNZ B20 ;All lines printed?
RET ;If so, return
B10MENU ENDP
; Accept input for request:
H20:
CALL Q20CURS ;Set cursor next column
MOV AH, 09H ;Request display
MOV AL, [SI] ;Get character from menu
MOV BH, 00 ;Page 0
MOV BL,ATTRIB ;New attribute
MOV CX,01 ;One character
INT 10H
INC COL ;Next column
INC SI ;Set for next character
DEC COUNT ;Last character?
JNZ H20 ;No, repeat
MOV COL, LEFCOL+1 ;Reset column to left
CALL Q20CURS ;Set cursor
RET
H10DISP ENDP
; Clear screen:
Q10CLR PROC NEAR
MOV AX,0600H ;Request scroll of full screen
MOV BH,61H ;Blue on brown
MOV CX,0000
MOV DX,184FH
INT 10H ;Call BIOS
RET
Q10CLR ENDP
• BEGIN calls Q10CLR to clear the screen, calls B10MENU to display the menu items
and to set the first item to reverse video, and calls D10INPT to accept keyboard input.
• B10MENU displays the full set of menu selections.
• D10INPT uses INT 16H for input: the Down Arrow to move down the menu, the Up
Arrow to move up the menu, Enter to accept a menu item, and Esc to quit. All other
keyboard entries are ignored. The routine wraps the cursor around, so that trying to
move the cursor above the first menu line sets it to the last line, and vice versa. The
routine also calls H10DISP to reset the previous menu line to normal video and the
new (selected) menu line to reverse video.
• H10DISP displays the currently selected line according to an attribute (normal or
reverse video) that has been provided.
• Q10CLR clears the entire screen and sets it to blue foreground and brown background.
The program illustrates menu selection in a simple manner; a full program would exe-
cute a routine for each selected item. You’ll get a better understanding of this program by
typing it in and testing it.
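The input logic described for D10INPT can be sketched as follows. This is an illustration of the description, not the book's listing: it assumes that H10DISP redraws the menu line at ROW using ATTRIB, that 71H is the normal and 16H the reverse-video attribute (the values used elsewhere in the program), and it uses the standard scan codes 50H (Down Arrow) and 48H (Up Arrow). The labels D20 through D90 are hypothetical.

```asm
D10INPT PROC  NEAR
D20:    MOV   AH,10H          ;Request keyboard input
        INT   16H
        CMP   AH,50H          ;Down Arrow?
        JE    D30
        CMP   AH,48H          ;Up Arrow?
        JE    D40
        CMP   AL,0DH          ;Enter?
        JE    D90
        CMP   AL,1BH          ;Esc?
        JE    D90
        JMP   D20             ;Ignore all other keys
D30:    MOV   ATTRIB,71H      ;Reset previous line
        CALL  H10DISP         ; to normal video
        INC   ROW             ;Move down one line
        CMP   ROW,BOTROW-1    ;Past the last menu line?
        JBE   D50             ; no
        MOV   ROW,TOPROW+1    ; yes, wrap to the first line
        JMP   D50
D40:    MOV   ATTRIB,71H      ;Reset previous line
        CALL  H10DISP         ; to normal video
        DEC   ROW             ;Move up one line
        CMP   ROW,TOPROW+1    ;Above the first menu line?
        JAE   D50             ; no
        MOV   ROW,BOTROW-1    ; yes, wrap to the last line
D50:    CALL  Q20CURS         ;Set cursor on the new line
        MOV   ATTRIB,16H      ;Reverse video
        CALL  H10DISP         ;Highlight the new line
        JMP   D20
D90:    RET                   ;AL = 0DH (Enter) or 1BH (Esc)
D10INPT ENDP
```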
INTERRUPT 09H AND THE KEYBOARD BUFFER
When you press a key, the keyboard’s processor generates the key’s scan code and re-
quests INT 09H. This interrupt (at location 36 of the interrupt services table) points to an
interrupt-handling routine in ROM BIOS. The routine issues a request for input from port
96 (60H):
IN AL,60H
The BIOS routine reads the scan code and compares it with entries in a scan code table
for the associated ASCII character (if any). The routine combines the scan code with its
associated ASCII character and delivers the two bytes to the keyboard buffer. Figure 11-2
illustrates this procedure.
Note that INT 09H handles the keyboard status bytes at 40:17H, 40:18H, and 40:96H,
which record the state of the Shift, Alt, and Ctrl keys. Although pressing these keys generates
INT 09H, the interrupt routine only sets the appropriate bits in the status bytes and doesn't
deliver any characters to the keyboard buffer. Also, INT 09H ignores undefined keystroke
combinations.
When you press a key, the keyboard processor automatically generates a scan code
and INT 09H. When you release the key within one-half second, it generates a second scan
code [the value of the first code plus 128 (1000 0000B), which sets the leftmost bit] and
issues another INT 09H. The second scan code tells the interrupt routine that you have
released the key. If you hold the key for more than one-half second, the keyboard process
becomes typematic and automatically repeats the key operation.
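The relationship between the two scan codes can be checked with a few lines of Python. This is only a sketch of the arithmetic described above, not BIOS code; the function name decode_scan_code is invented for the illustration, and 1EH is used as the example because it is the scan code of the A key.

```python
def decode_scan_code(code):
    """Split a raw scan code byte into (key number, 'press' or 'release')."""
    released = bool(code & 0x80)          # leftmost bit set means the key was released
    return code & 0x7F, ("release" if released else "press")

make_code = 0x1E                          # scan code generated when A is pressed
break_code = make_code + 128              # second scan code, generated on release

pressed = decode_scan_code(make_code)
released = decode_scan_code(break_code)
```

Both calls report the same key number; only the leftmost bit distinguishes the press from the release.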
ADDRESS EXPLANATION
41AH Address of current head of the buffer, the next position for INT 16H
to read.
41CH Address of current tail of the buffer, the next position for INT 09H to
store an entered character.
41EH Address of the beginning of the keyboard buffer itself: 16 words (32
bytes), although it can be longer. The buffer holds keyboard characters
and scan codes as entered for later reading via INT 16H. Two bytes are
required for each character and its associated scan code:
[Figure 11-2 (diagram): BIOS INT 09H routine -> scan code and character -> DOS INT 21H routine -> scan code and character in the AX register]
When you type a character, INT 09H advances the tail. When INT 16H reads a char-
acter, it advances the head. In this way, the process is circular, with the head continually
chasing the tail.
When the buffer is empty, the head and tail are at the same address. In the following
example, a user has keyed ’abcd<Enter>’. INT 09H has stored the characters in the buffer
and has advanced the tail to 428H. (For simplicity, the example does not show the associated
scan codes.) The program has issued INT 16H five times to read all the characters and
has advanced the head to 428H, so that the buffer is now empty:
 a   b   c   d <0DH>
 |   |   |   |   |   |
41E 420 422 424 426 428
When the buffer is full, the tail is immediately behind the head. To see this, suppose
the user now types ’efghijklmnopqrs’. Then INT 09H stores the characters beginning
with the tail at 428H and circles around to store the ’s’ at 424H, leaving the tail at 426H,
immediately behind the head at 428H.
 p   q   r   s <0DH> e   f   g   h   i   j   k   l   m   n   o
 |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
41E 420 422 424 426 428 42A 42C 42E 430 432 434 436 438 43A 43C
At this point, INT 09H does not accept any more characters typed ahead and, indeed,
accepts only 15 at most, although the buffer holds 16. (Can you tell why?) If INT 09H were
to accept another character, it would advance the tail to the same address as the head, and
INT 16H would suppose that the buffer is empty.
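The head-and-tail bookkeeping is easy to model. The following Python sketch is not BIOS code: the class KeyboardBuffer and its method names are invented for the illustration, and it assumes the default 16-word buffer at 41EH through 43DH. It replays the example of typing 'abcd<Enter>', reading the five characters, and then typing ahead.

```python
BUF_START = 0x41E            # first word of the default buffer
BUF_END = 0x43E              # one past the last word (16 words = 32 bytes)

class KeyboardBuffer:
    """Toy model of the BIOS keyboard buffer (all names are hypothetical)."""

    def __init__(self):
        self.head = BUF_START    # next position for INT 16H to read
        self.tail = BUF_START    # next position for INT 09H to store
        self.slots = {}          # address -> (character, scan code)

    @staticmethod
    def _advance(addr):
        addr += 2                # each entry occupies one word
        return BUF_START if addr == BUF_END else addr

    def store(self, char, scan=0):
        """Model of INT 09H: refuse the key if the tail would catch the head."""
        new_tail = self._advance(self.tail)
        if new_tail == self.head:
            return False         # buffer is full
        self.slots[self.tail] = (char, scan)
        self.tail = new_tail
        return True

    def read(self):
        """Model of INT 16H: head == tail means the buffer is empty."""
        if self.head == self.tail:
            return None
        entry = self.slots.pop(self.head)
        self.head = self._advance(self.head)
        return entry

buf = KeyboardBuffer()
for ch in "abcd\r":              # user keys abcd<Enter>
    buf.store(ch)
while buf.read() is not None:    # program reads all five characters
    pass
accepted = [buf.store(ch) for ch in "efghijklmnopqrst"]   # 16 keys typed ahead
```

In this model the first 15 keys are accepted and the sixteenth is refused, because storing it would make the tail equal to the head, which is exactly the condition that means "empty."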
BIODATA  ENDS
         ASSUME CS:CODESG,DS:BIODATA
         ORG    100H
BEGIN:
The program uses the SEGMENT AT feature to define the BIOS data area as, in ef-
fect, a dummy segment. KBSTATE identifies the location of the keyboard status byte at
40:17H. The code segment initializes the address of BIODATA in the DS and stores the
keyboard status byte in the AL. An OR operation tests the byte for either Shift key pressed.
You could modify this code to test as well for the enhanced keyboard status bytes at
40:18H and 40:96H.
characters are not represented on it. You can, however, enter any of the codes 01 through
255 by holding down the Alt key and entering the appropriate code as a decimal value on
the numeric keypad. The system stores your entered value as two bytes in the keyboard
buffer, the first of which is the generated ASCII character and the second of which is zero.
For example, Alt+001 delivers 01H, and Alt+255 delivers FFH. You could use DEBUG
to examine the effect of entering various values:
102 INT 16
KEY POINTS
• The shift status bytes in the BIOS data area indicate the current status of Ctrl, Alt,
Shift, CapsLock, NumLock, and ScrollLock.
• DOS INT 21H keyboard operations provide a variety of services to echo or not echo
on the screen, to recognize or ignore Ctrl+Break, and to accept scan codes.
• BIOS INT 16H provides the basic BIOS keyboard operation for accepting characters
from the keyboard buffer. For a character key, the operation delivers the character to
the AL and the key’s scan code to the AH. For an extended function key, the operation
delivers zero to the AL and the key’s scan code to the AH.
• The scan code is a unique number assigned to each key that enables the system to
identify the source of a pressed key and enables a program to check for extended function
keys such as Home, PgUp, and Arrow.
• The BIOS data area at 40:1EH contains the keyboard buffer. This area allows you to
type up to 15 characters before a program requests input.
• When you press a key, the keyboard’s processor generates the key’s scan code (its
unique assigned number) and requests INT 09H. When you release the key, it generates
a second scan code (the first code plus 128; the leftmost bit is set) to tell INT
09H that the key is released.
• BIOS INT 09H gets a scan code from the keyboard, and either it generates an associated
ASCII character and delivers the scan code and character to the keyboard
buffer area or it sets the Ctrl, Alt, Shift status.
QUESTIONS
11-1. (a) What is the location of the first byte of the keyboard shift status in the BIOS data area? (b)
What do the contents 00001100 mean? (c) What do the contents 00000010 mean?
11-2. Explain the features of the following functions for INT 21H keyboard input: (a) 01H; (b) 07H;
(c) 08H; (d) 0AH.
11-3. Explain the differences among INT 16H functions 00H, 01H, and 10H.
11-4. Provide the scan codes for the following extended functions: (a) Up Arrow; (b) program func-
tion key F3; (c) Home; (d) PgUp.
11-5. Use DEBUG to examine the effects of entered keystrokes. To request entry of assembly lan-
guage statements, type A 100 and enter the following instructions:
INT 16
Use U 100,104 to unassemble the program, and use the P command to get DEBUG to exe-
cute through the INT. Execution stops, waiting for your input. Press any key and examine the
AH and AL registers. Continue entering a variety of keys. Press Q to quit DEBUG.
11-6. Code the instructions to enter a keystroke; if the key is PgDn, set the cursor to row 24,
column 0.
11-7. Revise Figure 11-1 to provide for the following features: (a) After the initial clearing of the
screen, display a prompt that asks users to press F1 for a menu screen. (b) When F1 is pressed,
display the menu. (c) Allow users to select menu items also by pressing the first character
(upper- or lowercase) of each item. (d) On request of an item, display a message for that par-
ticular selection, such as “Procedure to Delete Records.” (e) Allow users to press Esc to re-
turn to the main menu for the selected routine.
11-8. Under what circumstances does an INT 09H occur?
11-9. Explain in simple terms how INT 09H handles Ctrl and Shift keys differently from the way
it handles the standard keyboard keys.
11-10. (a) Where is the BIOS memory location of the keyboard buffer? (b) What is the buffer’s size,
in bytes? (c) How many keyboard characters can it contain?
11-11. (a) What does it mean when the address of the head and tail in the keyboard buffer are the
same? (b) What does it mean when the address of the tail immediately follows the head?
PART D — Data Manipulation
CHAPTER 12
String Operations
OBJECTIVE:
To explain the special instructions used to process string
data.
INTRODUCTION
To this point, the instructions presented have handled data defined as only one byte, word,
or doubleword. It is often necessary, however, to move or compare data fields that exceed
these lengths. For example, you may want to compare descriptions or names in order to sort
them into ascending sequence. Items in this format are known as string data and may be either
character or numeric. For processing string data, assembly language provides five
string instructions:
MOVS Moves one byte, word, or doubleword from one location to another in
memory.
LODS Loads from memory a byte into the AL, a word into the AX, or a double-
word into the EAX.
STOS Stores the contents of the AL, AX, or EAX registers into memory.
CMPS Compares byte, word, or doubleword memory locations.
SCAS Compares the contents of the AL, AX, or EAX with the contents of a
memory location.
An associated instruction, the REP prefix, causes a string instruction to perform repet-
itively a specified number of times.
The second way to code string instructions is the standard practice, as shown in the
fourth, fifth, and sixth columns. You load the addresses of the operands in the DI and SI
registers and code, for example, MOVSB, MOVSW, and MOVSD without operands.
The string instructions assume that the DI and SI contain valid offset addresses that
reference bytes in memory. The SI register is normally associated with the DS (data seg-
ment) register as DS:SI. The DI register is always associated with the ES (extra segment)
register as ES:DI. Consequently, MOVS, STOS, CMPS, and SCAS require that an .EXE
program initialize the ES register, usually, but not necessarily, with the same address as that
in the DS register:
         MOV    AX,@data           ;Get address of data segment
• For processing from left to right (the normal way of processing), use CLD to clear
the DF to zero.
• For processing from right to left, use STD to set the DF to 1.
The following example moves (or rather, copies) the 20 bytes of STRING1 to
STRING2 (assume that the DS and ES are both initialized with the address of the data segment,
as shown earlier):
STRING1  DB     20 DUP('*')
STRING2  DB     20 DUP(' ')
During execution, the CMPS and SCAS instructions also set status flags, so that the
operation can terminate immediately on finding a specified condition. The variations of
REP for this purpose are the following:
For the 80286 and more advanced processors, the use of word and doubleword oper-
ations can provide faster processing. We’ll now examine each string operation in detail.
MOVS combined with a REP prefix and a length in the CX can move any number of char-
acters. Although you don’t code the operands, the instruction looks like this:
[label:] REP MOVSn [ES:DI,DS:SI]
For the receiving string, the segment:offset registers are the ES:DI; for the sending
string, the segment:offset registers are the DS:SI. As a result, at the start of an .EXE pro-
gram, initialize the ES register along with the DS register, and prior to executing the MOVS,
use LEA to initialize the DI and SI registers. Depending on the direction flag, MOVS in-
crements or decrements the DI and SI registers by 1 for byte, 2 for word, and 4 for double-
word. The following code is illustrative:
LOOP LABEL1
LABEL2: ...
Earlier, Figure 6-2 illustrated moving a 9-byte field. The program could also
have used MOVSB for this purpose. In Figure 12-1, the procedure C10MVSB uses
MOVSB to move a 10-byte field, NAME1, 1 byte at a time to NAME2. The first instruction,
CLD, clears the direction flag to zero so that the MOVSB processes data from left to
right. The direction flag is normally zero at the start of execution, but CLD is coded here
as a precaution.
The two LEA instructions load the SI and DI registers with the offset addresses of
NAME1 and NAME2, respectively. Since the DOS loader for a .COM program automatically
initializes the DS and ES registers, the segment:offset addresses are correct for ES:DI
and DS:SI. A MOV instruction initializes the CX with 10 (the length of NAME1 and of
NAME2). The instruction REP MOVSB now performs the following:
• Moves the leftmost byte of NAME1 (addressed by DS:SI) to the leftmost byte of
NAME2 (addressed by ES:DI).
• Increments the DI and SI by 1 for the next bytes to the right.
• Decrements the CX by 1.
• Repeats this operation, 10 loops in all, until the CX becomes zero.
Because the direction flag is zero and MOVSB increments DI and SI, each iteration
processes one byte farther to the right, as NAME1+1 to NAME2+1, and so on. At the end
of execution, the CX contains 00, the DI contains the address of NAME2+10, and the SI
contains the address of NAME1+10, both 1 byte past the end of the name.
If the direction flag is 1, MOVSB would decrement DI and SI, causing processing to
occur from right to left. But in that case, to move the contents correctly, you would have to
initialize the SI with NAME1+9 and the DI with NAME2+9.
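The steps that REP MOVSB performs can be modeled in Python. The sketch below is an illustration only: the function rep_movs is invented here, memory is a single bytearray (so DS and ES are assumed to address the same segment), and NAME1 and NAME2 are simply offsets 0 and 10 in that array.

```python
def rep_movs(mem, si, di, cx, size=1, df=0):
    """Model REP MOVSB (size=1) or REP MOVSW (size=2); returns final (SI, DI, CX)."""
    step = -size if df else size              # DF=0 increments, DF=1 decrements
    while cx != 0:
        mem[di:di + size] = mem[si:si + size]  # copy one byte or word
        si += step
        di += step
        cx -= 1
    return si, di, cx

NAME1, NAME2 = 0, 10

# Left to right (CLD): start at the leftmost bytes.
mem = bytearray(b"Assemblers" + b" " * 10)
si, di, cx = rep_movs(mem, NAME1, NAME2, 10)
left_to_right = bytes(mem[NAME2:NAME2 + 10])

# Right to left (STD): start at NAME1+9 and NAME2+9.
mem = bytearray(b"Assemblers" + b" " * 10)
si_r, di_r, cx_r = rep_movs(mem, NAME1 + 9, NAME2 + 9, 10, df=1)
right_to_left = bytes(mem[NAME2:NAME2 + 10])
```

Both runs leave 'Assemblers' in NAME2. After the left-to-right run the model's SI and DI are 10 bytes past their starting offsets; after the right-to-left run they are 1 byte before the start of each field.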
The next procedure in Figure 12-1, D10MVSW, uses MOVSW to move five words
from NAME2 to NAME3. At the end of execution, the CX contains 00, the DI contains the
address of NAME3+10, and the SI contains the address of NAME2+10.
NAME1    DB     'Assemblers'
NAME2    DB     10 DUP(' ')
NAME3    DB     10 DUP(' ')
         ...                       ;Main procedure
         CALL   C10MVSB            ;MVSB subroutine
         CALL   D10MVSW            ;MVSW subroutine
         MOV    AX,4C00H           ;Exit to DOS
         INT    21H
;                Use of MOVSB:
C10MVSB  PROC
         CLD                       ;Left to right
         ...                       ;Move 10 bytes,
         ...                       ; NAME1 to NAME2
         REP MOVSB
         RET
C10MVSB  ENDP
;                Use of MOVSW:
D10MVSW  PROC
         ...
         END    BEGIN
Since MOVSW increments the DI and SI registers by 2, the operation requires only
five loops. For processing right to left, initialize the SI with NAME1+8 and the DI with
NAME2+8.
LODS loads the AL with a byte, the AX with a word, or the EAX with a doubleword from
memory. The memory address is subject to the DS:SI registers, although you can override
the DS segment. Depending on the direction flag, the operation also increments or decrements
the SI by 1 for byte, 2 for word, and 4 for doubleword.
Since one LODS operation fills the register, there is no practical reason to use the
REP prefix with it. For most purposes, a simple MOV instruction is adequate. But MOV
generates 3 bytes of machine code, whereas LODS generates only 1, although it requires
that you initialize the SI register. You could use LODS to step through a string 1 byte, word,
or doubleword at a time, examining successively for a particular value.
The instructions equivalent to LODSB are
         MOV    AL,[SI]            ;Transfer byte to AL
         INC    SI                 ;Increment SI for next byte
         ORG    100H
BEGIN:   JMP    SHORT MAIN
FIELDA   DB     'Assemblers'
FIELDB   DB     10 DUP(20H)
         ...                       ;Main procedure
         CLD                       ;Left to right
         ...
         END    BEGIN
In Figure 12-2, the data area defines a 10-byte field named FIELDA containing the
value “Assemblers” and another 10-byte field named FIELDB. The objective is to transfer
the bytes from FIELDA to FIELDB in reverse sequence, so that FIELDB contains
“srelbmessA.” LODSB is used to access 1 byte at a time from FIELDA into the AL, and
the instruction MOV [DI],AL transfers the bytes to FIELDB, from right to left.
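The reversal can be traced with a short Python model. This is a sketch of the technique just described rather than the program in Figure 12-2: si, di, and al are ordinary variables standing in for the registers, and the decrement of DI after each store is an assumption about how the figure steps from right to left.

```python
FIELDA = b"Assemblers"
fieldb = bytearray(b" " * 10)        # FIELDB, initially blank

si = 0                               # SI -> leftmost byte of FIELDA
di = len(fieldb) - 1                 # DI -> rightmost byte of FIELDB
for _ in range(len(FIELDA)):         # loop count in CX = 10
    al = FIELDA[si]                  # LODSB: load the byte at DS:SI into AL ...
    si += 1                          # ... and increment SI (direction flag = 0)
    fieldb[di] = al                  # MOV [DI],AL
    di -= 1                          # step DI one byte to the left (assumed)

result = bytes(fieldb)
```

Because SI moves right while DI moves left, the first byte loaded ('A') lands in the last byte of FIELDB.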
         LOOP   LABEL1
LABEL2:
The STOSW instruction in Figure 12-3 repeatedly stores a word containing 2020H
(blanks) five times through NAME1. The operation stores the AL in the first byte and the
AH in the next byte (that is, reversed). At the end, all of NAME1 is blank, the CX contains
00, and the DI contains the address of NAME1+10.
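A small Python model shows both the blanking of NAME1 and the byte order that STOSW uses. The function rep_stosw is invented for this sketch, and the second call, which stores 4142H, is an added example chosen only because it makes the reversal visible.

```python
def rep_stosw(mem, di, ax, cx):
    """Model REP STOSW with the direction flag clear; returns final (DI, CX)."""
    while cx != 0:
        mem[di] = ax & 0xFF          # AL goes to the lower address
        mem[di + 1] = ax >> 8        # AH goes to the next address
        di += 2
        cx -= 1
    return di, cx

name1 = bytearray(b"Assemblers")
di, cx = rep_stosw(name1, 0, 0x2020, 5)      # five words of blanks

demo = bytearray(2)
rep_stosw(demo, 0, 0x4142, 1)                # AH = 41H ('A'), AL = 42H ('B')
```

With 2020H both bytes are the same, so the reversal is invisible; with 4142H the model stores 'B' first and 'A' second.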
The program in Figure 12-4 illustrates the use of both the LODS and STOS instructions.
The example is similar to the program in Figure 10-4, which transfers characters and attributes
directly to the video display area, except that Figure 12-4 contains these differences:
• For the video area, it uses page number 02 rather than page 01.
• In C10PROC, it uses STOSW to store characters and associated attributes in the video
area, instead of this instruction and its accompanying two DEC instructions that
decrement the DI:
         MOV    WORD PTR [VIDAREA+DI],AX
• It defines an item named PROMPT in the data segment, prompting the user to “Press
any key ...”, to be used at the end of processing.
• On completion of processing, the procedure D10PROMPT transfers the defined
prompt to the video display area. To this end, it uses LODSB to access characters one
at a time from PROMPT into the AL and uses STOSW to transfer each character and
its associated attribute from the AX into the video area.
         ...                       ;Addressability for
         MOV    DS,AX              ; data segment
         MOV    AX,VIDSEG          ; and for
         MOV    ES,AX              ; video area
         ASSUME ES:VIDSEG
         MOV    AH,0FH             ;Request get
         INT    10H                ; and save
         PUSH   AX                 ; current mode
         PUSH   BX                 ; and page
         MOV    AH,00H             ;Request set
         MOV    AL,03              ; mode 03, clear screen
         INT    10H
         MOV    AH,05H             ;Request set
         MOV    AL,02H             ; page #02
         INT    10H
         CALL   C10PROC            ;Process display area
         CALL   D10PROMPT          ;Display user prompt
         CALL   E10INPT            ;Provide for input
         ...                       ;Restore
         ...                       ; original
         ...                       ; page number
         ...                       ;Restore video
         ...                       ; mode (in AL)
         ...
C10PROC  PROC
         ...                       ;Character to display
         ...                       ;Attribute
         ...                       ;Start of display area
C30:     ...                       ;Characters per row
C40:     ...                       ;AX in display area
         ...                       ;Repeat 60 times
         ...                       ;Next attribute
         ...                       ;Next character
         ...                       ;Indent for next row
         ...                       ;Last character to display?
         ...                       ; no, repeat
         ...                       ; yes, return
C10PROC  ENDP
;                Prompt user to press key
D10PROMPT PROC
         MOV    CX,16              ;Characters to display
         LEA    SI,PROMPT          ;Address of prompt
         ...
doubleword. The operation sets the AF, CF, OF, PF, SF, and ZF flags. When combined with
a REP prefix and a length in the CX, CMPS can successively compare strings of any length.
But note that CMPS provides an alphanumeric comparison, that is, a comparison according
to ASCII values. The operation is not suited to algebraic comparisons, which consist
of signed numeric values. Consider the comparison of two strings containing JEAN and
JOAN. A comparison from left to right, one byte at a time, results in the following:
    J:J    Equal
    E:O    Unequal (E is low)
    A:A    Equal
    N:N    Equal
A comparison of the entire four bytes ends with a comparison of N with N (equal). Now
since the two names are not identical, the operation should terminate as soon as the com-
parison is between two different characters. For this purpose, REP has a variation, REPE
(Repeat on Equal), which repeats the operation as long as the comparison is between equal
characters, or until the CX register equals zero. The coding for repeated one-byte compar-
isons is REPE CMPSB.
Figure 12-5 consists of two examples that use CMPSB. The first example compares
NAME1 with NAME2, which contain the same values. The CMPSB operation therefore
continues for the entire 10 bytes. At the end of execution, the CX contains 00, the DI contains
the address of NAME2+10, the SI contains the address of NAME1+10, the sign flag
is positive, and the zero flag indicates equal or zero.
The second example compares NAME2 with NAME3, which contain different values.
The CMPSB operation terminates after comparing the first byte and results in a high or unequal
condition: The CX contains 09, the DI contains the address of NAME3+1, the SI contains
the address of NAME2+1, the sign flag is positive, and the zero flag indicates unequal.
The first example results in equal or zero and (for illustrative reasons only) moves 01
to the BH register. The second example results in unequal and moves 02 to the BL register.
If you use DEBUG to trace the instructions, you’ll see 0102 in the BX at the end of execution.
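The two comparisons can be replayed with a Python model of REPE CMPSB. The function repe_cmpsb is invented for this sketch; it assumes the direction flag is clear, treats memory as one bytearray in which NAME1, NAME2, and NAME3 sit at offsets 0, 10, and 20, and reports only the zero and carry flags.

```python
def repe_cmpsb(mem, si, di, cx):
    """Model REPE CMPSB (DF=0). Returns (SI, DI, CX, ZF, CF)."""
    zf, cf = 1, 0        # placeholders; a real CPU leaves the flags unchanged if CX is 0
    while cx != 0:
        a, b = mem[si], mem[di]      # compare the byte at DS:SI with the byte at ES:DI
        si += 1
        di += 1
        cx -= 1
        zf = int(a == b)
        cf = int(a < b)              # borrow: first operand is lower
        if not zf:                   # REPE stops on the first unequal pair
            break
    return si, di, cx, zf, cf

NAME1, NAME2, NAME3 = 0, 10, 20
mem = bytearray(b"Assemblers" + b"Assemblers" + b" " * 10)

first = repe_cmpsb(mem, NAME1, NAME2, 10)    # equal strings
second = repe_cmpsb(mem, NAME2, NAME3, 10)   # 'A' against a blank
```

The first call runs all 10 bytes and ends equal; the second stops after one byte with 9 left in the count, and the first operand is high because 'A' (41H) is greater than a blank (20H).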
         ORG    100H
BEGIN:   JMP    SHORT MAIN
NAME1    DB     'Assemblers'
NAME2    DB     'Assemblers'
NAME3    DB     10 DUP(' ')
         ...                       ;Main procedure
         CLD                       ;Left to right
         MOV    CX,10              ;Initialize for 10 bytes
         ...
         END    BEGIN
Warning: These examples use CMPSB to compare data one byte at a time. If you
use CMPSW to compare data a word at a time, initialize CX to 5. But that’s not the problem.
When comparing words, CMPSW reverses the bytes. For example, let’s compare the
names SAMUEL and ARNOLD. For the initial comparison of words, instead of comparing
SA with AR, the operation compares AS with RA. So, instead of the name SAMUEL
indicating a higher value, it will be lower, which is incorrect. CMPSW works correctly only if
the compared strings contain unsigned numeric data defined as DW, DD, or DQ.
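The byte reversal is easy to confirm. The Python sketch below uses the standard struct module to read the first two characters of each name as a 16-bit word in the processor's low-byte-first order; the helper name first_word is invented for the illustration.

```python
import struct

def first_word(name):
    """Read the first two bytes of a string as the CPU reads a word (low byte first)."""
    return struct.unpack("<H", name[:2])[0]

samuel = first_word(b"SAMUEL")     # bytes 53H 41H are read as the word 4153H ("AS")
arnold = first_word(b"ARNOLD")     # bytes 41H 52H are read as the word 5241H ("RA")

byte_order_says_higher = b"SAMUEL" > b"ARNOLD"    # the comparison you wanted
word_compare_says_higher = samuel > arnold        # what a word comparison sees
```

A byte-by-byte comparison ranks SAMUEL above ARNOLD, but the word values rank it below, which is the trap described in the warning.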
         ORG    100H
BEGIN:   JMP    SHORT MAIN
         ...
         ...                       ;Main procedure
         CLD                       ;Left to right
         ...                       ;Scan NAME1
         ...                       ; for 'm'
         REPNE SCASB
         JNE    H20                ;If found,
         MOV    AL,03              ; store 03 in AL
H20:     MOV    AH,4CH
         INT    21H                ;Exit to DOS
         ...
         END    BEGIN
eration you will see that the zero flag shows zero, the CX is decremented to 05, and the DI
is incremented by 05. (The DI is incremented one byte past the actual location of the ‘m’.)
The program stores 03 in the AL register (for illustrative reasons) to indicate that an
'm' was found.
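The register values quoted above can be reproduced with a Python model of REPNE SCASB. The function repne_scasb is invented for this sketch; it assumes the direction flag is clear and that NAME1 contains 'Assemblers' starting at offset 0.

```python
def repne_scasb(mem, di, al, cx):
    """Model REPNE SCASB (DF=0). Returns (DI, CX, ZF)."""
    zf = 0
    while cx != 0:
        zf = int(mem[di] == al)      # compare AL with the byte at ES:DI
        di += 1                      # DI always advances, even on a match
        cx -= 1
        if zf:                       # REPNE stops on the first equal byte
            break
    return di, cx, zf

NAME1 = b"Assemblers"
di, cx, zf = repne_scasb(NAME1, 0, ord("m"), 10)
```

The 'm' is the fifth character, at offset 4, so the scan stops with the count reduced from 10 to 5 and the DI at offset 5, one byte past the match.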
SCASW scans for a word in memory that matches the word in the AX register. If you
used LODSW or MOV to transfer a word into the AX register, the first byte would be in
the AL and the second byte in the AH. Since SCASW compares the bytes in reversed se-
quence, the operation works correctly.
K20:
implies a repeated move of the byte beginning at FLDB to the byte beginning at FLDA. If
you load the DI and SI registers with the addresses of FLDA and FLDB, you can also code
the MOVS instruction as
Few programs are coded this way, and the format is covered here just for the record.
DUPLICATING A PATTERN
The STOS instruction is useful for setting an area according to a specific byte, word, or dou-
bleword value. However, for repeating a pattern that exceeds these lengths, you can use
MOVS with a minor modification. Let’s say that you want to set a display line to the fol-
lowing pattern:
***###***###***###***###***###
Rather than define the entire pattern repetitively, you need only define the first six bytes
that immediately precede the display line. Here is the required coding:
PATTERN  DB     '***###'
DISAREA DB 42 DUP(?)
On execution, MOVSW moves the first word of PATTERN (**) to the first word of DISAREA
and then moves the second (*#) and third (##) words:
***###***###
|     |
|     DISAREA
PATTERN
At this point, the DI contains the address of DISAREA+6, and the SI contains the address
of PATTERN+6, which is also the address of DISAREA. The operation now automatically
duplicates the pattern by moving the first word of DISAREA to DISAREA+6,
DISAREA+2 to DISAREA+8, DISAREA+4 to DISAREA+10, and so forth. Eventually the
pattern is duplicated through the end of DISAREA:
***###***###***###***### ... ***###
|           |     |                |
PATTERN     |     DISAREA+12       DISAREA+42
            DISAREA+6
You can use this technique to duplicate a pattern any number of times. The pattern itself
may be any length, but must immediately precede the target field.
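The self-feeding effect of the overlapping move can be seen in a Python model. The function rep_movsw is invented for this sketch, memory is a single bytearray with PATTERN at offset 0 and DISAREA at offset 6, and the count of 21 words is an assumption derived from the 42-byte length of DISAREA.

```python
def rep_movsw(mem, si, di, cx):
    """Model REP MOVSW with the direction flag clear; returns final (SI, DI)."""
    while cx != 0:
        mem[di:di + 2] = mem[si:si + 2]   # one word at a time, in order
        si += 2
        di += 2
        cx -= 1
    return si, di

PATTERN, DISAREA = 0, 6
mem = bytearray(b"***###" + b"?" * 42)     # DISAREA starts out undefined
si, di = rep_movsw(mem, PATTERN, DISAREA, 21)
display = bytes(mem[DISAREA:DISAREA + 42])
```

The technique depends on the words being moved one at a time from left to right: by the time the SI reaches DISAREA, the earlier moves have already filled it with the pattern.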
Babe Ruth
Mickey Mantle
Reggie Jackson
         .DATA
NAMEPAR  LABEL  BYTE               ;Name parameter list
         DB     31                 ;Maximum length
ACTNLEN  DB     ?                  ;No. of chars entered
NAMEFLD  DB     31 DUP(' ')        ;Name
         .CODE
BEGIN    PROC   FAR                ;Main procedure
         MOV    AX,@data           ;Initialize
         MOV    DS,AX              ; data segment
         MOV    ES,AX
         MOV    AX,0600H
         CALL   Q10SCR             ;Clear screen
         SUB    DX,DX              ;Set cursor 00,00
         CALL   Q20CURS
A10LOOP:
         CALL   B10INPT            ;Request input of name
         TEST   ACTNLEN,0FFH       ;No name? (indicates end)
         JZ     A90                ; yes, exit
         CALL   D10SCAS            ;Scan for asterisk
         CMP    AL,'*'             ;Found?
         JE     A10LOOP            ; yes, bypass
         CALL   E10RGHT            ;Right adjust name
         CALL   F10CLNM            ;Clear name
         JMP    A10LOOP
A90:     MOV    AX,4C00H           ;Exit to DOS
         INT    21H
BEGIN    ENDP
;                Prompt for input:
B10INPT  PROC
         MOV    AH,09H
         LEA    DX,PROMPT          ;Display prompt
         INT    21H
         MOV    AH,0AH
         LEA    DX,NAMEPAR         ;Accept input
         INT    21H
         RET
B10INPT  ENDP
;                Scan name for asterisk:
D10SCAS  PROC
         CLD                       ;Left to right
         MOV    AL,'*'             ;Character for scan
         MOV    CX,30              ;Set 30-byte scan
         LEA    DI,NAMEFLD
         REPNE SCASB               ;Asterisk found?
         JE     D20                ; yes, exit
         MOV    AL,20H             ; no, clear * in AL
D20:     RET
D10SCAS  ENDP
E10RGHT  PROC
         ...
F10CLNM  PROC
         CLD                       ;Left to right
         MOV    AX,2020H
         MOV    CX,15              ;Clear 15 words
         LEA    DI,NAMEDSP
         REP STOSW
         RET
F10CLNM  ENDP
;                Scroll screen:
KEY POINTS
• For the string instructions MOVS, STOS, CMPS, and SCAS, be sure that your .EXE
programs initialize the ES register.
• For string instructions, use the suffixes B, W, or D for handling byte, word, or doubleword
strings.
• Clear (CLD) or set (STD) the direction flag for the required direction of processing.
• Double check the initialization of the DI and SI registers. For example, MOVS implies
operands DI,SI, whereas CMPS implies operands SI,DI.
• Initialize the CX register for REP to process the required number of bytes, words, or
doublewords.
• For normal processing, use REP with MOVS and STOS, and use a conditional REP
(REPE or REPNE) with CMPS and SCAS.
• CMPSW and SCASW reverse the bytes in words that are compared.
• Where you want to process right to left, watch out for addressing beginning at the
rightmost byte of a field. For example, if the field is NAME1 and is 10 bytes long,
then for processing bytes, the load address for LEA is NAME1+9. For processing
words, however, the load address for LEA is NAME1+8 because the string operation
initially accesses NAME1+8 and NAME1+9.
QUESTIONS
12-1. The string operations assume that the operands relate to the DI or SI registers. Identify these
registers for the following: (a) MOVS (operands 1 and 2); (b) CMPS (operands 1 and 2);
(c) SCAS (operand 1).
12-2. For string operations using REP, how do you define the number of repetitions that are to occur?
12-3. For string operations using REP, how do you set processing right to left?
12-4. The chapter gives the instructions equivalent to (a) MOVSB, (b) LODSB, and (c) STOSB, each
with a REP prefix. For each case, provide equivalent code for processing words.
12-5. Revise the program in Figure 12-1. Convert the program from .COM to .EXE format, and be
sure to initialize the ES register. Change the MOVSB and MOVSW operations to move data
from right to left. Use DEBUG to trace through the procedures, and note the contents of the
data segment and registers.
12-6. Use the following data definitions and code string operations for parts (a)-(f):
PRLINE   DB     20 DUP(' ')
OBJECTIVE:
To cover the requirements for addition, subtraction, mul-
tiplication, and division of binary data.
INTRODUCTION
This chapter covers addition, subtraction, multiplication, and division and the use of un-
signed and signed data. The chapter also provides many examples and warnings of vari-
ous pitfalls for the unwary traveler in the realm of the microprocessor. Chapter 14
covers special requirements involved with conversion between binary and ASCII data
formats.
Although we are accustomed to performing arithmetic in decimal (base 10) format,
a microcomputer performs its arithmetic only in binary (base 2). Further, the limitation of
16-bit registers on pre-80386 processors involves special treatment for large values.
Instructions introduced in this chapter are:
218 Arithmetic: I—Processing Binary Data Chapter 13
ADD/SUB {register,register}
ADD/SUB {register,memory}
Figure 13-1 provides examples of ADD and SUB for processing byte and word values.
The procedure B10ADD uses ADD to process bytes, and the procedure C10SUB uses
SUB to process words.
Overflows
Be alert for overflows in arithmetic operations. Since a byte provides for only a sign bit
and seven data bits (from —128 to +127), an arithmetic operation can easily exceed the
capacity of a one-byte register. And a sum in the AL register that exceeds its capacity
may cause unexpected results. Suppose, for example, that the AL contains 60H. Then
the instruction
         ADD    AL,20H
generates a sum of 80H in the AL. Having added two positive values, we expect the sum to
be positive, but the operation sets the overflow flag to overflow and the sign flag to negative.
The reason? The value 80H, or binary 10000000, is a negative number; instead of
+128, the sum is -128. The problem is that the AL register is too small for the sum, which
should be in the full AX register, as shown in the next section.
         ORG    100H
BEGIN:   JMP    SHORT MAIN
         ...
         ...                       ;Main procedure:
         ...                       ;ADD routine
         ...                       ;SUB routine
         MOV    AX,4C00H           ;Exit to DOS
         INT    21H
         ...
         MOV    AL,BYTEA
         MOV    BL,BYTEB
         ADD    AL,BL              ;Register to register
         ADD    AL,BYTEC           ;Memory to register
         ADD    BYTEA,BL           ;Register to memory
         ADD    BL,10H             ;Immediate to register
         ADD    BYTEA,25H          ;Immediate to memory
         ...
B10ADD   ENDP
         ...
         MOV    BX,WORDB
         SUB    AX,BX              ;Register from register
         SUB    AX,WORDC           ;Memory from register
         SUB    WORDA,BX           ;Register from memory
         SUB    BX,1000H           ;Immediate from register
         SUB    WORDA,256H         ;Immediate from memory
         ...
         END    BEGIN
         AH   AL
AX:      00   60H
The numeric result in the second example is the same, but the operation on the AX
does not treat it as overflow or negative. Still, although a full word in the AX allows for a
sign bit and 15 data bits, the AX is limited to values from —32,768 to +32,767. The next
section examines how to handle numbers that exceed these limits.
MULTIWORD ARITHMETIC
As we have seen, large numeric values may exceed the capacity of a word, in effect requir-
ing multiword capacity. A major requirement in multiword arithmetic is reverse-byte and
reverse-word sequence. Recall that the assembler automatically converts the contents of de-
fined numeric words into reverse-byte sequence, so that, for example, a definition of 0134H
becomes 3401H. But for doubleword values, it is your responsibility to define the related
pair of words in reverse-word sequence. Let’s say that a doubleword pair looks like this:
Hex | 01 23 | BC 62 |
The assembler then converts these definitions into reverse-byte sequence, suitable for dou-
bleword arithmetic:
Hex | 62 BC | 23 01 |
Let’s examine two ways to perform multiword arithmetic. The first is simple and spe-
cific, whereas the second is more sophisticated and general.
In Figure 13-2, the procedure D10DWD illustrates adding one pair of words
(WORD1A and WORD1B) to a second pair (WORD2A and WORD2B) and storing the
sum in a third pair (WORD3A and WORD3B). In effect, the operation is to add values, such
as the following:
    Initial value:   0123 BC62H
    Add:             0012 553AH
    Total:           0136 119CH
Because of the reverse-byte sequence in memory, the program defines the values with the
words reversed: BC62 0123 and 553A 0012, respectively. The assembler then stores these
doubleword values in memory in proper reverse-byte sequence:
D10DWD   PROC
         MOV    AX,WORD1A          ;Add leftmost word
         ADD    AX,WORD2A
         MOV    WORD3A,AX
         MOV    AX,WORD1B          ;Add rightmost word
         ADC    AX,WORD2B          ; with carry
         MOV    WORD3B,AX
         RET
D10DWD   ENDP
;                Generalized add operation:
E10DWD   PROC
         CLC                       ;Clear carry flag
         MOV    CX,02              ;Set loop count
         LEA    SI,WORD1A          ;Leftmost word
         LEA    DI,WORD2A          ;Leftmost word
         LEA    BX,WORD3A          ;Leftmost word of sum
E20:
         MOV    AX,[SI]            ;Move word to AX
         ADC    AX,[DI]            ;Add with carry to AX
         MOV    [BX],AX            ;Store word
         INC    SI                 ;Adjust addresses for
         INC    SI                 ; next word to right
         INC    DI
         INC    DI
         INC    BX
         INC    BX
         LOOP   E20                ;Repeat for next word
         RET
E10DWD   ENDP
         END    BEGIN
The procedure first adds WORD2A to WORD1A in the AX (they are really the low-order
portions) and stores the sum in WORD3A. It next adds WORD2B to WORD1B (the high-order
portions) in the AX, along with the carry from the previous addition. It then stores the
sum in WORD3B. Let’s examine the operations in detail. The first MOV and ADD operations
reverse the bytes in the AX and add the leftmost words:
    WORD1A:       BC62H
    WORD2A:      +553AH
    Total:     (1)119CH   (stored in WORD3A as 9C11H)
Since the sum of WORD1A plus WORD2A exceeds the capacity of the AX, a carry occurs,
and the carry flag is set to 1. Next, the example adds the words at the right, but this time
using ADC (Add With Carry) instead of ADD. ADC adds the two values and, since the carry
flag is set, adds 1 to the sum:
    WORD1B        0123H
    WORD2B       +0012H
    Plus carry   +   1H
    Total         0136H   (stored in WORD3B as 3601H)
By using DEBUG to trace the arithmetic, you can see the sum 0136H in the AX and the reversed
values 9C11H in WORD3A and 3601H in WORD3B.
Also in Figure 13-2, the more sophisticated procedure E10DWD provides an approach
to adding values of any length, although here it adds the same pairs of words as before,
WORD1A:WORD1B and WORD2A:WORD2B. The procedure uses the SI, DI, and
BX as base registers for the addresses of WORD1A, WORD2A, and WORD3A, respectively.
It loops once through the instructions for each pair of words to be added (in this
case, two times). The first loop adds the leftmost words, and the second loop adds the right-
most words. Since the second loop is to process the words to the right, the addresses in the
SI, DI, and BX registers are incremented by 2. Two INC instructions perform this opera-
tion for each register. INC (rather than ADD) is used for a good reason: The instruction
ADD reg,02 would clear the carry flag and would cause an incorrect answer, whereas INC
does not affect the carry flag.
Because of the loop, there is only one add instruction, ADC. At the start, a CLC (Clear
Carry) instruction ensures that the carry flag is initially clear. To make this method work,
be sure to (1) define the words adjacent to each other, (2) process words from left to right,
and (3) initialize the CX to the number of words to be added.
For multiword subtraction, the instruction equivalent to ADC is SBB (Subtract With
Borrow). Simply replace ADC with SBB in the procedure E10DWD.
You could add quadwords using the technique covered earlier for adding multiwords.
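The generalized loop can be modeled in Python to check the arithmetic. This is a sketch of the technique used by the generalized procedure, not a translation of it: the function add_multiword is invented here, each number is a list of 16-bit words with the low-order word first (the order in which the words are defined), and the standard struct module is used only to show the reverse-byte storage.

```python
import struct

def add_multiword(a_words, b_words):
    """Add two multiword values given as lists of 16-bit words, low-order word first."""
    carry = 0                               # CLC
    total = []
    for a, b in zip(a_words, b_words):      # one loop per pair of words
        s = a + b + carry                   # ADC: add with carry
        total.append(s & 0xFFFF)            # the word that fits in the AX
        carry = s >> 16                     # carry into the next word
    return total, carry

# 0123BC62H + 0012553AH, defined with the words reversed
total, carry = add_multiword([0xBC62, 0x0123], [0x553A, 0x0012])

# How the two sum words appear in memory (each word in reverse-byte sequence)
memory = b"".join(struct.pack("<H", w) for w in total)
```

Because the loop length comes from the lists, the same function adds quadwords if you pass four words for each value.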
UNSIGNED AND SIGNED DATA
    11111011       251        -5       0    0
The binary result of the addition in this example is the same for both unsigned and signed
data. However, the bits in the unsigned field represent decimal 251, whereas the bits in the
signed field represent decimal —5. In effect, the contents of a field mean whatever you in-
tend them to mean.
Arithmetic Carry
An arithmetic operation that causes a carry out of the sign bit also sets the carry flag. Where
a carry occurs on unsigned data, the result is invalid. The following example of addition
causes a carry:
BINARY          UNSIGNED DECIMAL    SIGNED DECIMAL    OF    CF
 11111100            252                  -4
+00000101           +  5                  +5
(1)00000001            1                   1           0     1
                 (invalid)             (valid)
The operation on the unsigned data is invalid because of the carry out of a data bit, whereas
the operation on the signed data is valid.
Arithmetic Overflow
An arithmetic operation sets the overflow flag when a carry into the sign bit does not carry
out, or a carry out occurs with no carry in. Where an overflow occurs on signed data, the
result is invalid (because of an overflow into the sign bit), as this example shows:
224 Arithmetic: I—Processing Binary Data = Chapter 13
BINARY          UNSIGNED DECIMAL    SIGNED DECIMAL    OF    CF
 01111001            121                +121
+00001011           + 11                + 11
 10000100            132                -124           1     0
                  (valid)             (invalid)
An add operation may set both the carry and the overflow flag. In the next example,
the carry makes the unsigned operation invalid, and the overflow makes the signed opera-
tion invalid:
BINARY          UNSIGNED DECIMAL    SIGNED DECIMAL    OF    CF
 11110110            246                 -10
+10001001           +137                -119
(1)01111111          127                +127           1     1
                 (invalid)            (invalid)
The upshot of all this is that you must have a good idea as to the magnitude of the
numbers that your program will process, and you must define field sizes accordingly.
MULTIPLICATION
For multiplication, the MUL instruction handles unsigned data, and the IMUL (Integer
Multiplication) instruction handles signed data. Both instructions affect the carry and
overflow flags. As programmer, you have control over the format of the data you process, and
you have the responsibility of selecting the appropriate multiply instruction. The general
format for MUL and IMUL is
[label:] MUL/IMUL register/memory
The basic multiplication operations are byte times byte, word times word, and (80386 and
later processors) doubleword times doubleword.
Before multiplication:    AL = multiplicand
After multiplication:     AX = product
For multiplying two one-word values, the multiplicand is in the AX register and the multi-
plier is a word in memory or another register. For the instruction MUL DX, the operation
multiplies the contents of the AX by the contents of the DX. The generated product is a dou-
bleword that requires two registers: the high-order (leftmost) portion in the DX and the low-
order (rightmost) portion in the AX. The operation ignores and erases any data that may
already be in the DX.
Before multiplication:    AX = multiplicand
After multiplication:     DX:AX = product (high-order portion in DX, low-order portion in AX)
For multiplying two doubleword values, the multiplicand is in the EAX register and the
multiplier is a doubleword in memory or another register. The product is generated in the
EDX:EAX pair. The operation ignores and erases any data already in the EDX.
Before multiplication:    EAX = multiplicand
After multiplication:     EDX:EAX = product (high-order portion in EDX, low-order portion in EAX)
Field Sizes
The operand of MUL or IMUL references only the multiplier, which determines the field
sizes. In the following examples, the multiplier is in a register, which specifies the type of
operation:
BYTE1   DB ?
WORD1   DW ?
DWORD1  DD ?
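As a sketch of how the operand alone selects the operation, the following lines show each form once with a register multiplier and once with a memory multiplier. The register choices are illustrative assumptions, the data names simply repeat the three definitions above, and the 32-bit forms need an 80386 or later processor (with the corresponding assembler directive).

```asm
BYTE1   DB      ?
WORD1   DW      ?
DWORD1  DD      ?

        MUL     CL              ; byte multiplier: AL x CL, product in AX
        MUL     BX              ; word multiplier: AX x BX, product in DX:AX
        MUL     EBX             ; doubleword multiplier: EAX x EBX,
                                ;  product in EDX:EAX (80386 and later)
        MUL     BYTE1           ; AL x byte in memory, product in AX
        MUL     WORD1           ; AX x word in memory, product in DX:AX
        MUL     DWORD1          ; EAX x doubleword in memory,
                                ;  product in EDX:EAX (80386 and later)
```

In every case the multiplicand is implied: the assembler picks AL, AX, or EAX from the size of the single operand.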
MULTIWORD MULTIPLICATION
Conventional multiplication involves multiplying byte by byte, word by word, or double-
word by doubleword. As we have already seen, the maximum signed value in a word is
+32,767. Multiplying larger values on pre-80386 processors involves additional steps. The
approach on these processors is to multiply each word separately and then add each prod-
uct together. The following example multiplies a four-digit decimal number by a two-digit
number:
  1,365
x    12
 16,380
What if you could multiply only two-digit numbers? Then you could multiply the 13 and
the 65 by 12 separately, like this:
    13          65
  x 12        x 12
   156         780
Next, add the two products; but remember, since the 13 is in the hundreds position, its
product is actually 15,600:
  15,600
 +   780
  16,380
An assembly program can use this same technique, except that the data consists of
words (four digits) in hexadecimal format. Let’s now examine the requirements for multi-
plying doubleword by word and doubleword by doubleword.
Doubleword by Word
In Figure 13-4, E10XMUL multiplies a doubleword by a word. The multiplicand,
MULTCND, consists of two words containing 3206H and 2521H, respectively. The reason
for defining two DWs instead of a DD is to facilitate addressing for MOV instructions that
move words to the AX register. The values are defined in reverse-word sequence, and the
assembler stores each word in reverse-byte sequence. Thus MULTCND, which has a
defined value of 32062521H, is stored as 21250632H.
E10XMUL PROC
        MOV     AX,MULTCND      ;Multiply left word
        MUL     MULTPLR+2       ; of multiplicand
        MOV     PRODUCT,AX      ;Store product
        MOV     PRODUCT+2,DX
        ...
;               Doubleword x doubleword:
F10XMUL PROC
        MOV     AX,MULTCND      ;Multiplicand word 1
        MUL     MULTPLR         ; x multiplier word 1
        MOV     PRODUCT+0,AX    ;Store product
        MOV     PRODUCT+2,DX
        ...
Z10ZERO PROC
        MOV     PRODUCT,0000    ;Clear words
        MOV     PRODUCT+2,0000  ; left to right
        MOV     PRODUCT+4,0000
        MOV     PRODUCT+6,0000
        ...
Z10ZERO ENDP
        END     BEGIN
The multiplier, MULTPLR+2, contains 6400H. The field for the generated product,
PRODUCT, provides for three words. The first MUL operation multiplies MULTPLR+2
and the left word of MULTCND; the product is hex 0E80 E400H, stored in PRODUCT
and PRODUCT+2. The second MUL multiplies MULTPLR+2 and the right
word of MULTCND; the product is 138A 5800H. The routine then adds the two products,
like this:
Product 1:     0000 0E80 E400
Product 2:    +138A 5800
Total:         138A 6680 E400
Since the first ADD may cause a carry, the second add is ADC (Add with Carry). Because
numeric data is stored in reversed byte format, PRODUCT will actually contain 00E4 8066
8A13. The routine requires that the high-order word of PRODUCT (the leading 0000 above)
initially contain zero.
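The two multiplications and the ADD/ADC pair can be sketched as below. This is a hedged reconstruction from the description, not the Figure 13-4 listing: the data layout (low-order word first) is inferred from the stored byte sequences quoted in the text, and the fragment assumes that PRODUCT has been cleared beforehand.

```asm
MULTCND DW      2521H           ; 3206 2521H, low-order word first
        DW      3206H
MULTPLR DW      0A26H           ; not used by this routine
        DW      6400H           ; the one-word multiplier, MULTPLR+2
PRODUCT DW      4 DUP(0)        ; assumed zero on entry

        MOV     AX,MULTCND      ; first word of multiplicand (2521H)
        MUL     MULTPLR+2       ;  x 6400H gives 0E80 E400H in DX:AX
        MOV     PRODUCT,AX      ; store E400H
        MOV     PRODUCT+2,DX    ; store 0E80H
        MOV     AX,MULTCND+2    ; second word of multiplicand (3206H)
        MUL     MULTPLR+2       ;  x 6400H gives 138A 5800H in DX:AX
        ADD     PRODUCT+2,AX    ; 0E80H + 5800H = 6680H
        ADC     PRODUCT+4,DX    ; 0000H + 138AH + carry = 138AH
```

Read from high order to low order, PRODUCT+4, PRODUCT+2, and PRODUCT then hold 138A 6680 E400H.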
Doubleword by Doubleword
Multiplying two doublewords on pre-80386 processors involves four multiplications:
MULTIPLICAND        MULTIPLIER
word 2         x    word 2
word 2         x    word 1
word 1         x    word 2
word 1         x    word 1
You add each product in the DX and AX to the appropriate word in the final product. In
Figure 13-4, F10XMUL gives an example. MULTCND contains 3206 2521H, MULTPLR
contains 6400 0A26H, and PRODUCT provides for four words.
Although the logic is similar to multiplying doubleword by word, this problem re-
quires an additional feature. Following the ADD/ADC pair is another ADC that adds 0 to
PRODUCT. The first ADC itself could cause a carry, which subsequent instructions would
clear. The second ADC, therefore, adds 0 if there is no carry and adds 1 if there is a carry.
The final ADD/ADC pair does not require an additional ADC: Since PRODUCT is large
enough for the final generated answer, there is no carry.
The final product is 138A 687C 8E5C CCE6, stored in PRODUCT with the bytes re-
versed. Try using DEBUG to trace through this example.
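A minimal sketch of the four multiplications follows. It is reconstructed from the description rather than copied from Figure 13-4, so the exact order of the middle two multiplications is an assumption, as is the low-order-first data layout; PRODUCT is assumed to be zero on entry.

```asm
MULTCND DW      2521H,3206H     ; 3206 2521H, low-order word first
MULTPLR DW      0A26H,6400H     ; 6400 0A26H, low-order word first
PRODUCT DW      4 DUP(0)        ; assumed zero on entry

        MOV     AX,MULTCND      ; multiplicand word 1
        MUL     MULTPLR         ;  x multiplier word 1
        MOV     PRODUCT+0,AX    ; store product
        MOV     PRODUCT+2,DX

        MOV     AX,MULTCND      ; multiplicand word 1
        MUL     MULTPLR+2       ;  x multiplier word 2
        ADD     PRODUCT+2,AX    ; add to stored product
        ADC     PRODUCT+4,DX
        ADC     PRODUCT+6,00    ; add 0 or 1 for any carry

        MOV     AX,MULTCND+2    ; multiplicand word 2
        MUL     MULTPLR         ;  x multiplier word 1
        ADD     PRODUCT+2,AX    ; add to stored product
        ADC     PRODUCT+4,DX
        ADC     PRODUCT+6,00    ; add 0 or 1 for any carry

        MOV     AX,MULTCND+2    ; multiplicand word 2
        MUL     MULTPLR+2       ;  x multiplier word 2
        ADD     PRODUCT+4,AX    ; add to stored product
        ADC     PRODUCT+6,DX    ; no further carry is possible
```

Working the four partial products by hand (0178 CCE6H, 0E80 E400H, 01FB A8E4H, and 138A 5800H) gives the total 138A 687C 8E5C CCE6H quoted in the text.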
MULTIPLICATION BY SHIFTING
For multiplying by a power of 2 (2, 4, 8, etc.), it is more efficient simply to shift left
the necessary number of bits. For the 8088/8086, a shift greater than 1 requires that
you load the shift value in the CL register. In the following examples, the multiplicand is
in the AX:
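The following lines sketch the usual forms. They are independent alternatives rather than one sequence, and the shift counts are illustrative.

```asm
        SHL     AX,01           ; multiply by 2 (shift left 1)

        MOV     CL,03           ; multiply by 8 (shift left 3):
        SHL     AX,CL           ;  8088/8086 form, count in the CL

        SHL     AX,03           ; multiply by 8 (shift left 3):
                                ;  immediate count, 80286 and later
```

Any 1-bits shifted out of the left end of the AX are lost, so the technique suits only products that still fit in one word.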
The next method for left shifting requires an 80286 or later processor and does
not require looping. Although specific to a four-bit shift, it could be adapted to other
values:
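One way to code such a shift for a 32-bit value in the DX:AX pair is sketched here. The use of the BL as a work register is an assumption, and the immediate shift counts require an assembler directive that enables 80286 (or later) instructions.

```asm
        SHL     DX,04           ; make room in the low 4 bits of the DX
        MOV     BL,AH           ; copy the high byte of the AX
        SHL     AX,04           ; shift the AX left 4 bits
        SHR     BL,04           ; isolate the 4 bits shifted out of the AX
        OR      DL,BL           ; insert them into the low 4 bits of the DX
```

The effect is to multiply the doubleword in the DX:AX by 16; the four leftmost bits of the DX are shifted out and lost.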
DIVISION
For division, the DIV (Divide) instruction handles unsigned data, and IDIV (Integer
Divide) handles signed data. You are responsible for selecting the appropriate instruction. The
general format for DIV/IDIV is
[label:] DIV/IDIV register/memory
The basic divide operations are byte into word, word into doubleword, and (80386 and later)
doubleword into quadword.
Before division:    AX = dividend
After division:     AH = remainder, AL = quotient
For dividing a word into a doubleword, the dividend is in the DX:AX pair and the divisor
is a word in memory or another register. After division, the remainder is in the DX and the
quotient is in the AX. The quotient of one word allows a maximum of 65,535 (FFFFH) if
unsigned and +32,767 (7FFFH) if signed.
In dividing a doubleword into a quadword, the dividend is in the EDX:EAX pair and the
divisor is a doubleword in memory or another register. After division, the remainder is in
the EDX and the quotient is in the EAX.
Field Sizes
The operand of DIV or IDIV references the divisor, which specifies the field sizes. In the fol-
lowing DIV examples, the divisors are in a register, which determines the type of operation:
OPERATION    DIVISOR       DIVIDEND    QUOTIENT    REMAINDER
DIV CL       byte          AX          AL          AH
DIV CX       word          DX:AX       AX          DX
DIV EBX      doubleword    EDX:EAX     EAX         EDX
Remainder. If you divide 13 by 3, the result is 4 1/3, where the quotient is 4 and the
true remainder is 1. Note that a calculator (and a high-level programming language) would
deliver a quotient of 4.333..., which consists of an integer portion (4) and a fraction
portion (.333...). The values 1/3 and .333 are fractions, whereas the 1 is a remainder.
[Figure: program listing with a DIV routine (D10DIV) and an IDIV routine (E10IDIV).
Only fragments survive: each routine divides a word by a byte and a byte by a byte
(extending the dividend into the AH first), leaving the remainder and quotient in the AH:AL.]
Only Example 4 produces the same answer as did DIV. In effect, if the dividend and divi-
sor have the same sign bit, DIV and IDIV generate the same result. But if the dividend and
divisor have different sign bits, DIV generates a positive quotient, and IDIV generates a
negative quotient.
You may find it worthwhile to use DEBUG to trace through these examples.
In both cases, the generated quotient would exceed its available space. You may be wise to
include a test prior to a DIV or IDIV operation, as shown in the next two examples. In the
first, DIVBYTE is a one-byte divisor, and the dividend is already in the AX:
CMP AH, DIVBYTE ;Compare AH to divisor
In the second example, DIVWORD is a one-word divisor, and the dividend is in the
DX:AX:
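Both tests can be sketched as follows. The label OVFLOW stands for a hypothetical error routine, and the conditional jump shown is an assumption based on the rule that the divisor must be greater than the high-order part of the dividend.

```asm
; Byte divisor, dividend in the AX:
        CMP     AH,DIVBYTE      ; compare AH to divisor
        JNB     OVFLOW          ; AH not smaller: quotient will not fit in AL
        DIV     DIVBYTE         ; safe to divide word by byte

; Word divisor, dividend in the DX:AX:
        CMP     DX,DIVWORD      ; compare DX to divisor
        JNB     OVFLOW          ; DX not smaller: quotient will not fit in AX
        DIV     DIVWORD         ; safe to divide doubleword by word
```

Because the comparison is unsigned, a zero divisor also takes the branch, so the same test guards against division by zero.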
For IDIV, the logic should account for the fact that either the dividend or the divisor
could be negative. Since the absolute value of the divisor must be the smaller of the two,
you could use the NEG instruction to set a negative value temporarily to positive and re-
store the sign after the division.
Division by Subtraction
If a quotient is too large for the divisor, you could perform division by means of successive
subtraction. That is, subtract the divisor from the dividend, increment a quotient value by
1, and continue subtracting until the dividend is less than the divisor. In the following ex-
ample, the dividend is in the AX, the divisor is in the BX, and the quotient is developed in
the CX:
JB C30 ; exit
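The surviving line above belongs to a loop along the following lines. This is a hedged reconstruction: the label C20 and the surrounding instructions are assumptions based on the description, and the divisor in the BX is assumed to be nonzero (a zero divisor would loop indefinitely).

```asm
        SUB     CX,CX           ; clear the quotient
C20:
        CMP     AX,BX           ; if dividend < divisor,
        JB      C30             ;  exit
        SUB     AX,BX           ; subtract divisor from dividend
        INC     CX              ; add 1 to the quotient
        JMP     C20             ; repeat
C30:                            ; CX = quotient, AX = remainder
```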
At the end of the routine, the CX contains the quotient and the AX contains the remainder.
The example is intentionally primitive to demonstrate the technique. If the quotient is in the
DX:AX pair, include these two operations:
Note that a very large quotient and a small divisor may cause thousands of loops at a
cost of processing time.
DIVISION BY SHIFTING
For division by a power of 2 (2, 4, 8, and so on), it is more efficient simply to shift right the
required number of bits. For the 8088/8086, a shift greater than 1 requires a shift value in
the CL register. The following examples assume that the dividend is in the AX:
Divide by 2 (shift right 1):    SHR AX,01
Divide by 8 (shift right 3):    MOV CL,03      ;8088/8086
                                SHR AX,CL
Divide by 8 (shift right 3):    SHR AX,03      ;80286 and later
NEG BL ;8 bits
Reversing the sign of a 32-bit (or larger) value involves more steps. Assume that the
DX:AX pair contains a 32-bit binary number. NEG cannot act on the DX:AX pair concurrently,
and using it on both registers would mean adding 1 to both. Instead, use NOT to flip
the bits, and use ADD and ADC to add the 1 for two’s complement:
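A minimal sketch of that sequence, assuming the value to be negated is already in the DX:AX pair:

```asm
        NOT     DX              ; flip the bits of the high-order word
        NOT     AX              ; flip the bits of the low-order word
        ADD     AX,01           ; add 1 to the low-order word
        ADC     DX,00           ; carry any overflow into the high-order word
```

NOT does not change the flags, so the carry that ADC picks up comes only from the ADD.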
One minor problem remains: It is all very well to perform arithmetic on binary data
that the program itself defines or on data already in binary form on a disk file. However,
data that enters a program from a terminal is in ASCII format. Although ASCII data is suit-
able for displaying and printing, it requires special adjusting for arithmetic—a topic dis-
cussed in the next chapter.
The 8087 consists of eight 80-bit registers, R1-R8, in the following format:
Each register has an associated 2-bit tag that indicates its status:
00 Contains a valid number
01 Contains a zero value
10 Contains an invalid number
11 Is empty
The coprocessor recognizes seven types of numeric data:
1. Word integer: 16 bits of binary data.
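[Intervening entries and their format diagrams did not survive extraction.]
Short real: 32 bits of floating-point data.
[The remaining entries and format diagrams, including the layout of the significand,
did not survive extraction.]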
KEY POINTS
• The maximum signed values for one-byte accumulators are +127 and -128.
• For multiword addition, use ADC to account for any carry from a previous ADD. If
  the operation is performed in a loop, use CLC to initialize the carry flag to zero.
• Use MUL for unsigned data and IMUL for signed data.
• With MUL, if a multiplier is defined as a byte, the multiplicand is AL; if the
  multiplier is a word, the multiplicand is AX; if the multiplier is a doubleword, the
  multiplicand is EAX.
• Shift left (SHL or SAL) for multiplying by powers of 2.
• Use DIV for unsigned data and IDIV for signed data.
• For division, be especially careful of overflows. The divisor must be greater than the
  contents of the AH if the divisor is a byte, DX if the divisor is a word, or EDX if the
  divisor is a doubleword.
• With DIV, if a divisor is defined as a byte, the dividend is AX; if the divisor is
  a word, the dividend is DX:AX; if the divisor is a doubleword, the dividend is
  EDX:EAX.
• Shift right for dividing by powers of 2: SHR for unsigned fields and SAR for
  signed fields.
QUESTIONS
13-1. (a) What are the maximum values in a byte for signed data and for unsigned data? (b) What is
the maximum value in a word for signed data and for unsigned data?
13-2. Distinguish between a carry and an overflow.
Questions 13-3 through 13-7 refer to the following data, with words defined in reverse sequence:
DATAX DW 0148H
DW 2316H
DATAY DW 0237H
DW 4052H
DATAZ DW 0
DW 0
DW 0
13-3. Code the instructions to add the following: (a) the word DATAX to the word DATAY; (b) the
doubleword beginning at DATAX to the doubleword at DATAY.
13-4. Explain the effect of the following related instructions:
OBJECTIVE:
To examine ASCII and BCD data formats, to perform
arithmetic in these formats, and to cover conversions be-
tween these formats and binary.
INTRODUCTION
The natural data format for arithmetic on a computer is binary. As seen in Chapter 13, bi-
nary format causes no major problems, as long as the program itself defines the data. For
many purposes, however, numeric data enters a program from a keyboard as ASCII char-
acters, in base-10 format. Similarly, the display of numeric values on a screen is in ASCII
format.
A related format, binary-coded decimal (BCD), has occasional uses and appears as
unpacked and as packed. The PC provides a number of instructions that facilitate simple
arithmetic and conversion between formats. This chapter also covers techniques for con-
verting ASCII data into binary format to perform arithmetic, as well as techniques for con-
verting the binary results back into ASCII format for viewing. The program at the end of
the chapter combines much of the material covered in Chapters 1 through 13.
If you have programmed in a high-level language such as C, you are used to the com-
piler accounting for the radix (decimal or binary) point. However, the computer does not
recognize a radix point in an arithmetic field, so that you as the programmer have to ac-
count for its position.
242 Arithmetic: II—Processing ASCII and BCD Data _— Chapter 14
1. BCD permits proper rounding of numbers with no loss of precision, a feature that is
particularly useful for handling dollars and cents. (Rounding of binary numbers that
represent dollars and cents may well cause a loss of precision.)
2. It is often simpler to perform arithmetic on small values entered from a keyboard or
to be written on the screen or printer.
A BCD digit consists of four bits that can represent the decimal digits 0 through 9:
1. Unpacked BCD contains a single BCD digit in the lower four bits of each byte, with
zeros in the upper four bits. Note that although ASCII format is also “unpacked,” it
isn’t called that.
2. Packed BCD contains two BCD digits, one in the upper four bits and one in the lower
four bits. This format is commonly used for arithmetic using the numeric coproces-
sor, defined as 10 bytes with the DT directive.
Let’s examine the representation of the decimal number 1,527 in the three decimal
formats:
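As a sketch, the three representations of 1,527 could be defined as follows. The field names are hypothetical, and the comments show the hex contents of each byte.

```asm
ASCNUM  DB      '1527'          ; ASCII:        31 35 32 37  (four bytes)
UNPBCD  DB      01H,05H,02H,07H ; unpacked BCD: 01 05 02 07  (four bytes)
PAKBCD  DB      15H,27H         ; packed BCD:   15 27        (two bytes)
```

The ASCII and unpacked BCD fields need one byte per digit, whereas the packed BCD field needs only half as many bytes.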
The processor performs arithmetic on ASCII and BCD values one digit at a time. You
have to use special instructions for converting from one format to another.
Processing ASCII Data
These instructions are coded without operands and automatically adjust an ASCII value in
the AX register. The adjustment occurs because an ASCII value represents an unpacked
base-10 number, whereas the processor performs base-2 arithmetic.
ASCII Addition
Consider the effect of adding the ASCII numbers 8 (38H) and 4 (34H):
hex 38
hex 34
hex 6C
The sum 6CH is neither a correct ASCII nor a correct binary value. However, ignore the
leftmost 6, and add 6 to the rightmost hex C: Hex C plus 6 = hex 12, the correct answer in
terms of decimal numbers. Why add 6? Because that’s the difference between hexadecimal
(16) and decimal (10). This is a little oversimplified, but it does indicate the way in which
AAA performs its adjustment.
The AAA operation checks the rightmost hex digit (four bits) of the AL register. If
the digit is between A and F or the auxiliary carry flag is 1, the operation adds 6 to the AL
register, adds 1 to the AH register, and sets the carry and auxiliary carry flags to 1. In all
cases, AAA clears the leftmost hex digit of the AL to zero.
As an example, assume that the AX contains 0038H and the BX contains 0034H. The
38 in the AL and the 34 in the BL represent two ASCII bytes that are to be added. Addition
and adjustment are as follows:
ADD AL,BL ;Add 34H to 38H, equals 006CH
Since the rightmost hex digit of the AL is C, AAA adds 6 to the AL, adds 1 to the AH, sets
the carry and auxiliary carry flags, and clears to zero the leftmost hex digit of the AL. The
result in the AX is now 0102H.
To restore the ASCII representation, simply insert 3s in the leftmost hex digits of the
AH and AL to get 3132H, or decimal 12:
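The whole sequence, from the addition through the return to ASCII format, can be sketched as follows. The two MOV instructions are only an assumed way of setting up the register contents described above.

```asm
        MOV     AX,0038H        ; AL = ASCII 8, AH = 00
        MOV     BX,0034H        ; BL = ASCII 4
        ADD     AL,BL           ; add 34H to 38H: AX = 006CH
        AAA                     ; adjust for ASCII add: AX = 0102H
        OR      AX,3030H        ; insert ASCII 3s: AX = 3132H
```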
All that is very well for adding one-byte ASCII numbers. Adding multibyte ASCII
numbers, however, requires a loop that processes from right to left (low order to high
order) and accounts for carries. The code in Figure 14-1 adds two three-byte ASCII
numbers, ASC1 and ASC2, and produces a four-byte sum, ASCSUM. Note the following
points:
The routine did not use OR after AAA to insert leftmost 3s, because OR clears the carry
flag and so changes the effect of the following ADC instruction. A solution that saves the
flag settings is to push (PUSHF) the flags register, execute the OR, and then pop (POPF) the
flags to restore them:
ADC AL,[DI] ;Add with carry
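A complete loop built around that idea might look like the sketch below. It is not the Figure 14-1 listing: the data values and the label A20 are assumptions, and the fragment assumes that the DS addresses the data.

```asm
ASC1    DB      '578'           ; assumed values
ASC2    DB      '694'
ASCSUM  DB      '0000'          ; four-byte sum

        CLC                     ; no carry into the first addition
        LEA     SI,ASC1+2       ; rightmost bytes of the operands
        LEA     DI,ASC2+2
        LEA     BX,ASCSUM+3     ; rightmost byte of the sum
        MOV     CX,03           ; three digits to add
A20:
        MOV     AH,00           ; clear AH (MOV leaves the flags alone)
        MOV     AL,[SI]         ; get ASCII digit
        ADC     AL,[DI]         ; add with carry
        AAA                     ; adjust for ASCII
        PUSHF                   ; save the carry set by AAA
        OR      AL,30H          ; insert the ASCII 3
        POPF                    ; restore the carry for the next ADC
        MOV     [BX],AL         ; store digit of sum
        DEC     SI              ; DEC does not change the carry flag
        DEC     DI
        DEC     BX
        LOOP    A20
        OR      AH,30H          ; AH holds the final carry (0 or 1)
        MOV     [BX],AH         ; store it as the leftmost digit
```

With the assumed values, ASCSUM ends up as 31323732H, or decimal 1272.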
ASCII Subtraction
The AAS instruction works like AAA. AAS checks the rightmost hex digit (four bits) of
the AL. If the digit is between A and F or the auxiliary carry is 1, the operation subtracts 6
from the AL, subtracts 1 from the AH, and sets the auxiliary (AF) and carry (CF) flags. In
all cases, AAS clears the leftmost hex digit of the AL to zero.
The next two examples assume that ASC1 contains 38H and ASC2 contains 34H. The
first example subtracts ASC2 (34H) from ASC1 (38H). AAS does not need to make an ad-
justment, because the rightmost hex digit is less than hex A:
                        AX      AF
MOV  AL,ASC1           ;0038
SUB  AL,ASC2           ;0004     0
AAS                    ;0004     0
OR   AL,30H            ;0034
The second example subtracts ASC1 (38H) from ASC2 (34H). Since the rightmost
digit is hex C, AAS subtracts 6 from the AL, subtracts 1 from the AH, and sets the AF and
CF flags. The answer, which should be -4, is FF06H, its 10’s complement, which has
little value:
                        AX      AF
MOV  AL,ASC2           ;0034
SUB  AL,ASC1           ;00FC     1
AAS                    ;FF06     1
ASCII Multiplication
The AAM instruction corrects the result of multiplying ASCII data in the AX register. How-
ever, you must first clear the 3 in the leftmost hex digit of each byte, thus converting the
value to unpacked BCD. For example, the ASCII number 31323334 becomes 01020304 as
unpacked BCD. Also, because the adjustment is only one byte at a time, you can multiply
only one-byte fields and have to perform the operation repetitively in a loop. Use only the
MUL, not the IMUL, operation.
AAM divides the AL by 10 (0AH) and stores the quotient in the AH and the remainder
in the AL. For example, suppose that the AL contains 35H and the CL contains 39H.
The following code multiplies the contents of the AL by the CL and converts the result to
ASCII format:
INSTRUCTION        COMMENT                        AX      CL
AND  CL,0FH        ;Convert CL to 09              0035    09
AND  AL,0FH        ;Convert AL to 05              0005
MUL  CL            ;Multiply AL by CL             002D
AAM                ;Convert to unpacked BCD       0405
OR   AX,3030H      ;Convert to ASCII              3435
The MUL operation generates 45 (002DH) in the AX. AAM divides this value by 10, gen-
erating a quotient of 04 in the AH and a remainder of 05 in the AL. The OR instruction then
converts the unpacked BCD value to ASCII format.
Figure 14-2 depicts multiplying a four-byte multiplicand by a one-byte multiplier.
Since AAM can accommodate only one-byte operations, the routine steps through the mul-
tiplicand one byte at a time, from right to left. At the end, the unpacked BCD product is
0108090105, which a loop routine converts to true ASCII format as 3138393135, or deci-
mal 18,915.
If a multiplier is greater than one byte, you have to provide yet another loop that steps
through the multiplier. It may be simpler to convert the ASCII data to binary format, as cov-
ered in a later section.
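The byte-by-byte technique can be sketched as follows. This is not the Figure 14-2 listing: the data values are assumptions chosen to give the product quoted above, the labels A20 and A30 are hypothetical, and the DS is assumed to address the data.

```asm
MULTCND DB      '3783'          ; assumed four-byte multiplicand
MULTPLR DB      '5'             ; assumed one-byte multiplier
PRODUCT DB      5 DUP(0)        ; five-byte product, zero on entry

        MOV     CX,04           ; four digits in the multiplicand
        LEA     SI,MULTCND+3    ; rightmost digit of multiplicand
        LEA     DI,PRODUCT+4    ; rightmost byte of product
        AND     MULTPLR,0FH     ; clear the ASCII 3 from the multiplier
A20:
        MOV     AL,[SI]         ; get ASCII digit
        AND     AL,0FH          ; clear the ASCII 3
        MUL     MULTPLR         ; AX = digit x multiplier
        AAM                     ; AH = tens digit, AL = units digit
        ADD     AL,[DI]         ; add the carry digit stored last time
        AAA                     ; adjust, adding any carry to AH
        MOV     [DI],AL         ; store product digit
        DEC     DI
        MOV     [DI],AH         ; store carry digit to the left
        DEC     SI
        LOOP    A20
                                ; PRODUCT = 01 08 09 01 05 (unpacked BCD)
        LEA     BX,PRODUCT
        MOV     CX,05           ; five bytes to convert
A30:
        OR      BYTE PTR [BX],30H   ; insert the ASCII 3
        INC     BX
        LOOP    A30
```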
ASCII Division
The AAD instruction provides a correction of an ASCII dividend prior to dividing. Just as
with AAM, you first clear the leftmost 3s from the ASCII bytes to create unpacked BCD
format. AAD allows for a two-byte dividend in the AX. The divisor can be only a single
byte containing 01 to 09.
Assume that the AX contains the ASCII value 28 (3238H) and the CL contains
the divisor, ASCII 7 (37H). The following instructions perform the adjustment and
division:
INSTRUCTION        COMMENT                        AX      CL
AND  CL,0FH        ;Convert to unpacked BCD       3238    07
AND  AX,0F0FH      ;Convert to unpacked BCD       0208
AAD                ;Convert to binary             001C
DIV  CL            ;Divide by 7                   0004
AAD multiplies the AH by 10 (0AH), adds the product 20 (14H) to the AL, and clears the
AH. The result, 001CH, is the hex representation of decimal 28.
Figure 14-3 allows for dividing a one-byte divisor into a four-byte dividend. The routine
steps through the dividend from left to right. LODSB gets a byte from DIVDND into
the AL (via the SI), and STOSB stores bytes from the AL into QUOTNT (via the DI). The
remainder stays in the AH register so that AAD will adjust it in the AL. At the end, the quo-
tient, in unpacked BCD format, is 00090204, and the remainder in the AH is 02. Another
loop (not coded) could convert the quotient to ASCII format as 30393234.
If the divisor is greater than one byte, you have to provide yet another loop to step
through the divisor. Better yet, see the later section, “Conversion of ASCII to Binary Format.”
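The loop described for Figure 14-3 can be sketched as follows. The data values are assumptions chosen to give the quotient and remainder quoted above, the label A20 is hypothetical, and the fragment assumes that both the DS and the ES address the data (as in a COM program), since STOSB stores by way of ES:DI.

```asm
DIVDND  DB      '3698'          ; assumed four-byte dividend
DIVSOR  DB      '4'             ; assumed one-byte divisor
QUOTNT  DB      4 DUP(0)        ; four-byte quotient

        CLD                     ; process left to right
        MOV     CX,04           ; initialize 4 loops
        SUB     AH,AH           ; clear left byte of dividend
        AND     DIVSOR,0FH      ; clear divisor of ASCII 3
        LEA     SI,DIVDND
        LEA     DI,QUOTNT
A20:
        LODSB                   ; get ASCII byte, increment SI
        AND     AL,0FH          ; clear the ASCII 3
        AAD                     ; AL = (AH x 10) + AL, AH = 0
        DIV     DIVSOR          ; quotient in AL, remainder in AH
        STOSB                   ; store quotient digit, increment DI
        LOOP    A20
                                ; QUOTNT = 00 09 02 04, AH = 02
```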
[Figure 14-3: program listing for dividing ASCII data. Only fragments survive: the routine
initializes four loops, clears the left byte of the dividend, clears the ASCII 3 from DIVSOR,
and loads the addresses of DIVDND and QUOTNT into the SI and DI.]
In the preceding example of ASCII division, the quotient was 00090204. If you were to
compress this value, keeping only the right digit of each byte, the result would be 0924, now
in packed BCD format. You can also perform addition and subtraction on packed BCD data.
For this purpose, there are two adjustment instructions:
DAA corrects the result of adding two packed BCD values in the AL, and DAS cor-
rects the result of subtracting them. Once again, you have to process the fields one byte at
a time.
The program in Figure 14-4 illustrates BCD addition. The procedure B10CONV converts
the ASCII values ASC1 and ASC2 to packed BCD values BCD1 and BCD2, respectively.
Processing, which is from right to left, could just as easily be from left to right. Also,
processing words is easier than processing bytes because you need two ASCII bytes to
generate one packed BCD byte. However, the use of words does require an even number of
bytes in the ASCII field.
The procedure C10ADD performs a loop three times to add the packed BCD numbers
to BCDSUM. The final total is 00127263.
        ...
ASC1    DB      '057836'
ASC2    DB      '069427'
        ...
MAIN    PROC    NEAR
        LEA     SI,ASC1+4       ;Initialize for ASC1
        LEA     DI,BCD1+2
        CALL    B10CONV         ;Call convert routine
        LEA     SI,ASC2+4       ;Initialize for ASC2
        LEA     DI,BCD2+2
        CALL    B10CONV         ;Call convert routine
        CALL    C10ADD          ;Call add routine
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
MAIN    ENDP
;               Convert ASCII to BCD:
B10CONV PROC
        MOV     CL,04           ;Shift factor
        MOV     DX,03           ;No. of words to convert
B20:
        MOV     AX,[SI]         ;Get ASCII pair
        XCHG    AH,AL
        SHL     AL,CL           ;Shift off
        SHL     AX,CL           ; ASCII 3s
        MOV     [DI],AH         ;Store BCD digits
        DEC     SI
        DEC     SI
        DEC     DI
        DEC     DX
        JNZ     B20             ;Three times?
        RET                     ; yes, return
B10CONV ENDP
;               Add BCD numbers:
C10ADD  PROC
        XOR     AH,AH           ;Clear AH
        LEA     SI,BCD1+2       ;Initialize
        LEA     DI,BCD2+2       ; BCD
        LEA     BX,BCDSUM+3     ; addresses
        MOV     CX,03           ;3-byte fields
C20:
        MOV     AL,[SI]         ;Get BCD1 (or LODSB)
        ADC     AL,[DI]         ;Add BCD2
        DAA                     ;Decimal adjust
        MOV     [BX],AL         ;Store in BCDSUM
        DEC     SI
        DEC     DI
        DEC     BX
        LOOP    C20             ;Loop 3 times
        RET
C10ADD  ENDP
        END     BEGIN
Conversion of ASCII to Binary Format
1. Start with the rightmost byte of the ASCII number and process from right to left.
2. Strip the 3 from the left hex digit of each ASCII byte, thus forming an unpacked BCD
number.
3. Multiply the first BCD digit by 1, the second by 10 (0AH), the third by 100 (64H),
and so forth, and sum the products.
                 Decimal    Hexadecimal
4 x 1     =           4              4
3 x 10    =          30             1E
2 x 100   =         200             C8
1 x 1000  =        1000            3E8
Total:             1234           04D2
Try checking that the sum 04D2H really equals decimal 1234. In Figure 14-5, the program
converts ASCII number 1234 to its binary equivalent. An LEA instruction initializes the
address of the rightmost byte of the ASCII field, ASCVAL+3, in the SI register. The
instruction at B20 that moves the ASCII byte to the AL is
MOV AL,[SI]
The operation uses the address of ASCVAL+3 to copy the rightmost byte of ASCVAL into
the AL. Each iteration of the loop decrements the SI by 1 and references the next byte to
the left. The loop repeats for each of the four bytes of ASCVAL. Also, each iteration
multiplies MULT10 by 10 (0AH), giving multipliers of 1, 10 (0AH), 100 (64H), and so forth.
At the end, BINVAL contains the correct binary value, D204H, in reverse-byte sequence.
The routine is coded for clarity; for faster processing, the multiplier could be stored
in the DI register.
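The conversion loop can be sketched as follows. It follows the description of Figure 14-5 but is not the original listing: the use of the BX as a loop counter and of the CX for the factor 10 are assumptions, and the DS is assumed to address the data.

```asm
ASCVAL  DB      '1234'          ; ASCII number to convert
BINVAL  DW      0               ; binary result
MULT10  DW      1               ; current multiplier: 1, 10, 100, ...

        MOV     CX,10           ; factor for stepping MULT10
        MOV     BX,04           ; number of ASCII digits
        LEA     SI,ASCVAL+3     ; rightmost ASCII byte
B20:
        MOV     AL,[SI]         ; get ASCII character
        AND     AX,000FH        ; strip the 3 and clear the AH
        MUL     MULT10          ; multiply digit by current factor
        ADD     BINVAL,AX       ; add to binary total
        MOV     AX,MULT10       ; calculate next
        MUL     CX              ;  factor x 10
        MOV     MULT10,AX
        DEC     SI              ; next byte to the left
        DEC     BX              ; last digit processed?
        JNZ     B20             ;  no, continue
                                ; BINVAL = 04D2H (decimal 1234)
```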
To print or display the result of binary arithmetic, you have to convert it into ASCII format.
The operation involves reversing the previous step: Instead of multiplying, continue
dividing the binary number by 10 (0AH) until the quotient is less than 10. The remainders, which
can be only 0 through 9, successively generate the ASCII number. As an example, let’s
convert 4D2H back into decimal format:
DIVIDE BY 10    QUOTIENT    REMAINDER
A | 4D2            7B           4
A | 7B              C           3
A | C               1           2
Since the quotient (1) is now less than the divisor (0AH), the operation is complete. The
remainders, along with the last quotient, form the BCD result, from right to left: 1234. All
that remains is to store these digits in memory with ASCII 3s, as 31323334.
The program in Figure 14-6 converts binary number 04D2H to ASCII format. The
routine divides the binary number successively by 10, until the remaining quotient is less
than 10 (0AH), and stores the generated hex digits in ASCII format as 31323334. You may
find it useful, if not downright entertaining, to reproduce this program and trace its
execution step by step.
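A minimal sketch of such a routine follows. It is consistent with the description of Figure 14-6 but is not the original listing: the labels C20 and C30 are hypothetical, and the four-byte ASCVAL field is an assumption that suits values up to 9999 only.

```asm
ASCVAL  DB      4 DUP(' ')      ; receives the ASCII digits
BINVAL  DW      04D2H           ; binary value to convert

        MOV     CX,0010         ; division factor
        LEA     SI,ASCVAL+3     ; rightmost byte of ASCVAL
        MOV     AX,BINVAL       ; get binary field
C20:
        CMP     AX,CX           ; value less than 10?
        JB      C30             ;  yes, exit
        SUB     DX,DX           ; clear upper part of dividend
        DIV     CX              ; divide by 10
        OR      DL,30H          ; convert remainder to ASCII
        MOV     [SI],DL         ; store ASCII character
        DEC     SI              ; next byte to the left
        JMP     C20
C30:
        OR      AL,30H          ; convert last quotient to ASCII
        MOV     [SI],AL         ; store it as the leftmost digit
```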
[Figure 14-6: program listing for converting binary data to ASCII. Only fragments survive:
the routine loads a division factor of 10 into the CX, the address of ASCVAL+3 into the SI,
and BINVAL into the AX.]
Product:    12.345
Add 5:     +     5
            12.350 = 12.35
If the product is 12.3455, add 50 and shift two digits, and if the product is 12.34555,
add 500 and shift three digits:
 12.3455                12.34555
+     50               +     500
 12.3505 = 12.35        12.35055 = 12.35
Further, a number with six decimal places requires adding 5,000 and shifting four dig-
its, and so forth. Now, since a computer normally processes binary data, 12.345 appears as
3039H. Adding 5 to 3039H gives 303EH, or 12350 in decimal format. So far, so good. But
shifting one binary digit results in 181FH, or 6175—indeed, the shift simply halves the
value. We require a shift that is equivalent to shifting right one decimal digit. You can ac-
complish this shift by dividing the rounded value by 10, or hex A: Hex 303E divided by
hex A = 4D3H, or decimal 1235. Conversion of 4D3H to a decimal number gives 1235.
Now just insert a decimal point in the correct position, and you can display a rounded,
shifted value as 12.35.
In this fashion, you can round and shift any binary number. For three decimal places,
add 5 and divide by 10; for four decimal places, add 50 and divide by 100. Perhaps you
have noticed a pattern: The rounding factor (5, 50, 500, etc.) is always one-half of the value
of the shift factor (10, 100, 1,000, etc.).
Of course, the radix point in a binary number is implied and is not actually present.
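The round-and-shift step for a value with three implied decimal places can be sketched as follows. The register choices are assumptions; the value 12345 stands for 12.345.

```asm
        MOV     AX,12345        ; 12.345 with three implied decimal places
        ADD     AX,5            ; add rounding factor: 12350 (303EH)
        SUB     DX,DX           ; clear high-order part of dividend
        MOV     CX,10           ; shift factor
        DIV     CX              ; AX = 1235 (4D3H), or 12.35
```

For four decimal places, the same pattern would add 50 and divide by 100.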
        page    60,132
TITLE   P14SCREMP (EXE) Enter hours and rate, display wage
        .MODEL  SMALL
        .STACK  64
        .DATA
LEFCOL  EQU     28              ;Equates for screen
RITCOL  EQU     52
TOPROW  EQU     10
BOTROW  EQU     14
        ...
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize DS
        MOV     DS,AX           ; and ES registers
        MOV     ES,AX
        CALL    Q10SCR          ;Clear screen
A20LOOP:
        CALL    Q15WIN          ;Clear window
        CALL    Q20CURS         ;Set cursor
        CALL    B10INPT         ;Accept hours & rate
        CALL    D10HOUR         ;Convert hours to binary
        CALL    E10RATE         ;Convert rate to binary
        CALL    F10MULT         ;Calculate wage, round
        CALL    G10WAGE         ;Convert wage to ASCII
        CALL    K10DISP         ;Display wage
        CALL    L10PAUS         ;Pause for user
        CMP     AL,1BH          ;Esc pressed?
        JNE     A20LOOP         ; no, continue
                                ; yes, end of input
        CALL    Q10SCR          ;Clear screen
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
°
| Input hours and rate:
e
’
B1OINPT NEAR
ROW, TOPROW+1 ;Set cursor
COL, LEFCOL+3
Q20CURS
ROW
AH, 09H
DX, MESSG1 ;Prompt for hours
21H
AH, OAH
DX,HRSPAR ;Accept hours
21H
COL, LEFCOL+3 ;Set column
Q20CURS
ROW
AH, 09H
DX, MESSG2 ;Prompt for rate
21H
AH, OAH
DX, RATEPAR ;Accept rate
Zi
B1LOINPT
‘ Process hours:
D10HOUR NEAR
NODEC,00
CL, ACTHLEN
CH, CH
SI,HRSFLD-1 ;Set right position
SI,CX ; of hours
M10ASBI ;Convert to binary
AX, BINVAL
BINHRS , AX
D10HOUR
.
4 Process rate:
E1ORATE NEAR
CL, ACTRLEN
CH, CH
SI,RATEFLD-1 ;Set right position
SI,CX ; of rate
M1OASBI ;Convert to binary
AX, BINVAL
BINRATE,AX
E1LORATE ENDP
;               Multiply, round, and shift:
F10MULT PROC    NEAR
        MOV     SHIFT,10
        MOV     ADJUST,00
        MOV     CX,NODEC
        CMP     CL,06           ;If more than 6
        JA      F40             ; decimals, error
        DEC     CX
        DEC     CX
        JLE     F30             ;Bypass if 0, 1, 2 decs
        MOV     NODEC,02
        MOV     AX,01
F20:
        MUL     TENWD           ;Calculate shift factor
        LOOP    F20
        MOV     SHIFT,AX
        SHR     AX,1            ;Calculate round value
        MOV     ADJUST,AX
F30:
        MOV     AX,BINHRS
        MUL     BINRATE         ;Calculate wage
        ADD     AX,ADJUST       ;Round wage
        ADC     DX,00
        CMP     DX,SHIFT        ;Product too large
        JB      F50             ; for DIV?
F40:
        SUB     AX,AX
        JMP     F70
F50:
        CMP     ADJUST,00       ;Shift required?
        JZ      F80             ; no, bypass
        DIV     SHIFT           ;Shift wage
F70:    SUB     DX,DX           ;Clear remainder
F80:    RET
F10MULT ENDP
; Convert to ASCII:
G60:
AL, 30H ;Store last ASCII
(SI] , AL ; character
G1OWAGE
U Display wage:
es
K1i0ODISP NEAR
COL, LEFCOL+3 ;Set column
Q20CURS
CX, 09
SI, ASCWAGE
K20: ;Clear leading zeros
BYTE PTR[SI] ,30H
K30 ; to blanks
BYTE PTR[SI] ,20H
SI
K20
K30:
AH, 09H ;Request display
DX,MESSG3 ;Wage
240
K1ODISP
; Pause for user:
L10PAUS PROC NEAR
;Set cursor
MOV AH,09H
LEA DX,MESSG4 ;Display pause
INT 21H
MOV AH,10H ;Request reply
INT 16H
RET
L10PAUS ENDP
; Convert ASCII to binary:
M10ASBI PROC NEAR
MOV MULT10,0001
MOV BINVAL,00
MOV DECIND,00
SUB BX,BX
M20:
MOV AL,[SI] ;Get ASCII character
CMP AL,'.' ;Bypass if dec point
JNE M40
MOV DECIND,01
JMP M90
M40:
AND AX,000FH
MUL MULT10 ;Multiply by factor
ADD BINVAL,AX ;Add to binary
MOV AX,MULT10 ;Calculate next
MUL TENWD ; factor x 10
MOV MULT10,AX
CMP DECIND,00 ;Reached decimal point?
JNZ M90
INC BX ; yes, add to count
M90:
DEC SI
LOOP M20
CMP DECIND,00 ;End of loop
JZ M100 ;Any decimal point?
ADD NODEC,BX ; yes, add to total
M100: RET
M10ASBI ENDP
; Scroll whole screen:
Limitations. One limitation of this program is that it allows a total of only six
decimal places in hours and rate of pay combined. Another limitation is the magnitude
of the wage itself: shifting involves dividing by a power of 10, and converting to ASCII
involves dividing by 10. If hours and rate of pay together contain more than six decimal
places, or if the wage exceeds about 655,350, the program clears the wage to zero. In
practice, a program would print a warning message or would contain subroutines to overcome
these limitations.
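The conversion, rounding, and shifting steps can be modelled in a few lines of Python. This is only a sketch of the arithmetic in M10ASBI and F10MULT, not a replacement for the assembly: the function names are invented for illustration, and Python integers do not wrap at 16 bits, so the model assumes that hours and rate each fit in a word.

```python
def ascii_to_binary(field):
    """Convert an ASCII amount such as '12.255' to (binary value, decimal places).

    Works right to left, as M10ASBI does: each digit is multiplied by a
    factor that grows by 10, and the decimal point itself is skipped.
    """
    value, factor, decimals, seen_point = 0, 1, 0, False
    for ch in reversed(field):
        if ch == ".":
            seen_point = True
            continue
        value += (ord(ch) & 0x0F) * factor  # AND with 0FH strips the ASCII 3
        factor *= 10
        if not seen_point:
            decimals += 1                   # digit lies to the right of the point
    return value, (decimals if seen_point else 0)


def calculate_wage(hours, rate):
    """Return (wage, decimal places); the wage is 0 when a limit is exceeded."""
    bin_hours, dec_hours = ascii_to_binary(hours)
    bin_rate, dec_rate = ascii_to_binary(rate)
    nodec = dec_hours + dec_rate
    if nodec > 6:                           # more than six decimals: error
        return 0, nodec
    shift, adjust = 10, 0
    if nodec > 2:                           # round and shift to two decimals
        shift = 10 ** (nodec - 2)
        adjust = shift // 2
        nodec = 2
    product = bin_hours * bin_rate + adjust
    if (product >> 16) >= shift:            # CMP DX,SHIFT: too large for DIV
        return 0, nodec
    if adjust:
        product //= shift
    return product, nodec
```

For example, calculate_wage("40.5", "12.255") multiplies 405 by 12255, adds the rounding value 50, and divides by 100, which gives 49633 with two decimal places (496.33).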
Error checking. A program designed for users other than the programmer not
only should produce warning messages, but also should validate hours and rate of pay. The
only valid characters are the numbers 0 through 9 and one decimal point. For any other char-
acter, the program should display a message and return to the input prompt. A useful in-
struction for validating is XLAT, which Chapter 15 covers.
In practice, test your program thoroughly for all possible conditions, such as zero val-
ues, extremely high and low values, and negative values.
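The validation rule described above (digits 0 through 9 and at most one decimal point) can be sketched in Python as follows. The function name is hypothetical, and the sketch checks the rule only; it does not display a message or return to the prompt.

```python
def is_valid_amount(field):
    """Return True if field holds only digits 0-9 and at most one decimal point."""
    if field.count(".") > 1:
        return False
    digits = field.replace(".", "")
    # At least one digit is required, so "" and "." are rejected.
    return digits != "" and all("0" <= ch <= "9" for ch in digits)
```

A caller would repeat the input prompt until is_valid_amount returns True for both hours and rate of pay.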
KEY POINTS
• An ASCII field requires one byte for each character. For a numeric field, the
rightmost half-byte contains the digit, and the leftmost half-byte contains 3.
• Clearing the leftmost ASCII 3s to 0s converts the field to unpacked binary-coded
decimal (BCD) format.
• Compressing ASCII characters to two digits per byte converts the field to packed
binary-coded decimal (BCD) data.
• After an ASCII add, use AAA to adjust the answer; after an ASCII subtract, use AAS
to adjust the answer.
• Before an ASCII multiplication, convert the multiplicand and multiplier to unpacked
BCD by clearing the leftmost hex 3s to 0s. After the multiplication, use AAM to adjust
the product.
• Before an ASCII division, convert the dividend and divisor to unpacked BCD by
clearing the leftmost hex 3s, and use AAD to adjust the dividend.
• For most arithmetic purposes, convert ASCII numbers to binary. When converting
from ASCII to binary format, check that the ASCII characters are valid: 30 through
39, a decimal point, and possibly a minus sign.
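The relationship among the ASCII, unpacked BCD, and packed BCD formats in the points above can be illustrated with a short Python sketch. The function names are invented for this example; padding an odd number of digits with a leading zero is an assumption of the sketch.

```python
def ascii_to_unpacked_bcd(field):
    """Clear the leftmost half-byte (hex 3) of each ASCII digit."""
    return bytes(b & 0x0F for b in field)


def ascii_to_packed_bcd(field):
    """Compress ASCII digits to two digits per byte."""
    digits = [b & 0x0F for b in field]
    if len(digits) % 2:
        digits.insert(0, 0)  # assumed: pad to an even number of digits
    return bytes((hi << 4) | lo for hi, lo in zip(digits[0::2], digits[1::2]))
```

ASCII 31343735H therefore becomes unpacked 01040705H and packed 1475H.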
QUESTIONS
14-1. Suppose that the AX contains ASCII 9 (0039H) and the BX contains ASCII 7 (0037H). Ex-
plain the exact results of the following unrelated operations:
AAA AAA
AAS AAS
14-2. An unpacked BCD field named UNPAK contains 01040705H. Code a loop that causes its con-
tents to be proper ASCII 31343735H.
14-3. A field named ASCA contains the ASCII decimal value 173, and another field named ASCB
contains ASCII 5. Code the instructions to multiply the ASCII numbers together and to store
the product in ASCPRO.
14-4. Use the same fields as in Question 14-3 to divide ASCA by ASCB and store the quotient in
ASCQUO.
14-5. Provide the manual calculations for the following: (a) Convert ASCII decimal value 46328 to
binary, and show the result in hex format; (b) convert the hex value back to ASCII.
14-6. Code and run a program that determines a computer’s memory size (see INT 12H in Chapter
3), converts the size to ASCII format, and displays it on the screen as follows:
CHAPTER 15
Table Processing
OBJECTIVE:
To cover the requirements for defining tables, performing
searches of tables, and sorting table entries.
INTRODUCTION
Many program applications require tables containing such data as names, descriptions,
quantities, and prices. The definition and use of tables largely involves applying what you
have already learned. This chapter begins by defining some conventional tables and then
covers methods for searching through them. Techniques for searching tables are subject to
the way in which the tables are defined, and many methods of defining and searching other
than those given here are possible. Other commonly used features are the use of sorting,
which rearranges the sequence of data in a table, and the use of linked lists, which use point-
ers to locate items in a table.
The only instruction introduced in this chapter is XLAT (Translate).
DEFINING TABLES
To facilitate searching through them, most tables are arranged in a consistent manner, with
each entry defined with the same format (character or numeric), with the same length, and
in either ascending or descending order.
A table that you have been using throughout this book is the definition of the stack,
which in the following is a table of 64 uninitialized words (the name STACK refers to the
first word of the table):
STACK DW 64 DUP(?)
The following two tables, MONTAB and EMPTAB, initialize character and numeric
values, respectively. MONTAB defines alphabetic abbreviations of the months, whereas
EMPTAB defines a table of employee numbers:
All entries in MONTAB are three characters, and all entries in EMPTAB are three digits.
But note that the assembler converts the decimal numbers to binary format and, provided
that they don’t exceed the value 255, stores them each in a byte.
A table may also contain a mixture of numeric and character values, provided that
they are defined consistently. In the following table of stock items, each numeric entry
(stock number) is two digits (one byte), and each character entry (stock description) is nine
bytes. The four dots following the description “Paper” are to show that spaces should be
present; that is, spaces, not dots, are to be keyed in the description:
STOKTBL DB 12,'Computers',14,'Paper....',17,'Diskettes'
For clarity, you may also code table entries on separate lines:
STOKTBL DB 12,'Computers'
DB 14,'Paper....'
DB 17,'Diskettes'
The next example defines a table with 50 entries, each initialized to 20 blanks:
STORETAB DB 50 DUP(20 DUP(' '))
A program could use this table to store up to 50 values that it has generated internally, or it
could use it to store the contents of up to 50 entries that it reads from a disk file.
Tables on Disk
In real-world situations, many programs are table driven. Tables are stored as disk files,
which any number of programs may read into their data segment for processing. The reason
for this practice is that the contents of tables change over time. If each program defined
its own tables, any change would require redefining the tables in every program and
reassembling all of them. With table files on disk, you just need to change the contents of
the file. Chapter 17 gives an example of a table file.
Now let’s examine different ways to use tables in programs.
DIRECT TABLE ADDRESSING
Suppose that a user enters a numeric month such as 03 and that a program is to convert it
to alphabetic format—in this case, March. The routine to perform this conversion involves
defining a table of alphabetic months, all of equal length. The length of each entry should
be that of the longest name, September:
MONTAB DB 'January..'
DB 'February.'
DB 'March....'
DB 'December.'
The procedure D10LOC determines the actual location of entries in the table. For month 12,
for example, the steps are as follows:
ACTION                                  RESULT
Deduct 1 from month in the AX           000B (decimal 11)
Multiply by 9 (length of entries)       0063 (decimal 99)
Add address of table (MONTAB)           MONTAB+63H
One way to improve this program is to accept numeric months from the keyboard and
to verify that their values are between 01 and 12, inclusive.
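The address calculation, together with the suggested range check, can be sketched in Python. The names below are illustrative only; the table is built as one continuous string of 9-byte entries to mirror MONTAB.

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
ENTRY_LEN = 9  # length of the longest name, September

# One continuous table of fixed-length entries, padded with blanks.
MONTAB = "".join(name.ljust(ENTRY_LEN) for name in MONTHS)


def month_name(ascii_month):
    """Locate a month by direct table addressing."""
    month = int(ascii_month)
    if not 1 <= month <= 12:
        raise ValueError("month must be between 01 and 12")
    offset = (month - 1) * ENTRY_LEN  # deduct 1, multiply by entry length
    return MONTAB[offset:offset + ENTRY_LEN]
```

For month 12 the offset is 11 × 9 = 99 (63H), which matches the calculation shown for D10LOC.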
ORG 100H
BEGIN: JMP SHORT MAIN
ALFMON DB 9 DUP(20H), '$'
MONTAB DB 'January  ', 'February ', 'March    '
DB 'April    ', 'May      ', 'June     '
DB 'July     ', 'August   ', 'September'
DB 'October  ', 'November ', 'December '
MAIN PROC NEAR ;Main procedure
CALL C10CONV ;Convert to binary
CALL D10LOC ;Locate month
CALL F10DISP ;Display alpha month
MOV AX,4C00H ;Exit to DOS
INT 21H
MAIN ENDP
C10CONV PROC
MOV AH,MONIN ;Set up month
MOV AL,MONIN+1
XOR AX,3030H ;Clear ASCII 3s
CMP AH,00 ;Month 01-09?
JZ C20 ; yes, bypass
SUB AH,AH ; no, clear AH,
ADD AL,10 ; correct for binary
C20: RET
C10CONV ENDP
; Locate month in table:
D10LOC PROC
LEA SI,MONTAB
DEC AL ;Correct for table
MUL NINE ;Multiply AL by 9
ADD SI,AX
MOV CX,NINE ;Initialize 9-char move
LEA DI,ALFMON
REP MOVSB ;Move 9 characters
RET
D10LOC ENDP
; Display alpha month:
F10DISP PROC
MOV AH,09H ;Request display
LEA DX,ALFMON
INT 21H
RET
F10DISP ENDP
END BEGIN
.DATA
SAVEDAY DB ?
SAVEMON DB ?
TEN DB 10
ELEVEN DB 11
TWELVE DB 12
DAYSTAB DB 'Sunday, $   ', 'Monday, $   '
DB 'Tuesday, $  ', 'Wednesday, $'
DB 'Thursday, $ ', 'Friday, $   '
DB 'Saturday, $ '
MONTAB DB 'January $  ', 'February $ ', 'March $    '
DB 'April $    ', 'May $      ', 'June $     '
DB 'July $     ', 'August $   ', 'September $'
DB 'October $  ', 'November $ ', 'December $ '
.CODE
BEGIN PROC FAR
MOV AX,@data ;Initialize
MOV DS,AX ; segment registers
MOV ES,AX
MOV AX,0600H
CALL Q10SCR ;Clear screen
CALL Q20CURS ;Set cursor
MOV AH,2AH ;Get today's date
INT 21H
MOV SAVEMON,DH ;Save month
MOV SAVEDAY,DL ;Save day of month
CALL B10DAYWK ;Display day of week
CALL C10MONTH ;Display month
CALL D10DAYMO ;Display day
CALL E10INPT ;Wait for input
CALL Q10SCR ;Clear screen
MOV AX,4C00H ;Exit to DOS
INT 21H
BEGIN ENDP
The program uses these values to display the alphabetic day of the week and the month in
the form “Wednesday, September 12.” To this end, the program defines a table of days of
the week named DAYSTAB, beginning with Sunday, and a table of months named
MONTAB, beginning with January.
Entries in DAYSTAB are 12 bytes long, with each description followed by a comma, a
blank, and a $ sign and padded with blanks to the right. DOS INT 21H, function 09H,
displays all characters up to the $ sign; the comma and blank are followed on the screen by
the month. The procedure B10DAYWK multiplies the day of the week by 12 (the length of
each entry in DAYSTAB). The product is an offset into the table, where, for example,
Sunday is at DAYSTAB+0, Monday is at DAYSTAB+12, and so forth. The day is displayed
directly from the table.
Entries in MONTAB are 11 bytes long, with each description followed by a blank and
a $ sign and padded with blanks to the right. The procedure C10MONTH first decrements
the month by 1 so that, for example, month 01 becomes entry zero in MONTAB. It then
multiplies the month by 11 (the length of each entry in MONTAB). The product is an
offset into the table, where, for example, January is at MONTAB+0, February at
MONTAB+11, and so forth. The month is displayed directly from the table.
The procedure D10DAYMO divides the day of the month by 10 to convert it from
binary to ASCII format. Since the maximum value for day is 31, both the quotient and the
remainder can be only one digit. (For example, 31 divided by 10 gives a quotient of 3 and
a remainder of 1.) DOS function 02H displays each of the two characters, including the
leading zero for days less than 10; suppressing the leading zero involves some minor
program changes.
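The three steps (day of week, month, and day of month) can be modelled in Python as a rough sketch. The helper names are hypothetical; text_up_to_dollar stands in for the way function 09H stops at the $ sign.

```python
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# 12-byte day entries and 11-byte month entries, each ending in a $ sign
# and padded with blanks to the right.
DAYSTAB = "".join((day + ", $").ljust(12) for day in DAYS)
MONTAB = "".join((month + " $").ljust(11) for month in MONTHS)


def text_up_to_dollar(table, offset):
    """Return the characters from offset up to, but excluding, the $ sign."""
    return table[offset:table.index("$", offset)]


def format_date(weekday, month, day):
    """weekday: 0 = Sunday; month: 1-12; day: 1-31."""
    day_text = text_up_to_dollar(DAYSTAB, weekday * 12)
    month_text = text_up_to_dollar(MONTAB, (month - 1) * 11)
    tens, units = divmod(day, 10)  # divide by 10: quotient and remainder
    return day_text + month_text + chr(tens + 0x30) + chr(units + 0x30)
```

With weekday 3, month 9, and day 12, the result is "Wednesday, September 12"; for days less than 10 the leading zero is kept, as in the program.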
At the end, the program waits for the user to press a key before exiting to DOS.
Although direct table addressing is very efficient, it works best when entries are se-
quential and in a predictable order. Thus it would work well for entries that are in the order
1,2,3,..., or 106, 107, 108,..., or even 5, 10, 15,.... Unfortunately, few applications
provide such a neat arrangement of table values. A later section examines tables with val-
ues that are sequential, but not in any particular order.
SEARCHING A TABLE
Some tables consist of unique numbers with no apparent pattern. A typical example is a
table of stock items with nonconsecutive numbers such as 134, 138, 141, 239, and 245. An-
other type of table—such as an income tax table—contains ranges of values. The follow-
ing sections examine both of these types of tables and the requirements for searching them.
Each step in a search could increment the address of the first table by 2 (the length of each
entry in STOKNOS) and the address of the second table by 10 (the length of each entry in
STOKDESC). Or, a procedure could keep a count of the number of loops executed and, on
finding a match with a certain key stock number, multiply the count by 10 and use the prod-
uct as an offset to the address of STOKDESC.
On the other hand, it may be clearer to define stock numbers and descriptions in the
same table, with one line for each pair:
DB '12','Presses...'
The program in Figure 15-3 defines this table with six pairs of stock numbers and
descriptions. The search loop at A20 begins by comparing the first byte of the input stock
number, STOKNIN, with the first byte of a stock number in the table. If the comparison is
equal, the routine compares the second bytes. If these are equal, the stock number is found
and, at A50, the program copies the description from the table into DESCRN, where it is
displayed.
If the comparison of the first or second bytes is low, the stock number is known to be
not in the table and, at A40, the program could display an error message (not coded).
If the comparison of the first or second bytes is high, the program has to continue the
search; to compare the input stock number with the next stock number in the table, it in-
crements the SI, which contains the table address. The search loop performs a maximum of
six comparisons. If the loops exceed six, the stock number is known to be not in the table.
Let’s verify this logic by comparing entered stock numbers 01, 06, and 10 succes-
sively with items in the table:
• Stock number 01 with table item 05. The first byte is equal, but the second is low, so
the item is not in the table.
• Stock number 06 with table item 05. The first byte is equal, but the second is high, so
we compare the input with the next item in the table: stock number 06 with table item
10. The first byte is low, so the item is not in the table.
• Stock number 10 with table item 05. The first byte is high, so we compare the input
with the next item in the table: stock number 10 with table item 10. The first byte is
equal and the second is equal, so the item is found.
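The same logic can be checked with a small Python model. The table contents below are assumptions made for illustration (only stock numbers 05 and 10 come from the text); comparing two equal-length strings gives the same result as comparing them byte by byte.

```python
# Hypothetical table in ascending stock-number sequence.
STOKTAB = [
    ("05", "Excavators"),
    ("10", "Lifters   "),
    ("12", "Presses   "),
]


def find_stock(stock_no):
    """Search an ascending table, stopping as soon as the input is low."""
    for number, description in STOKTAB:
        if stock_no == number:
            return description      # equal: found
        if stock_no < number:
            return None             # low: cannot appear further on
        # high: continue with the next table entry
    return None                     # ran off the end of the table
```

Stock numbers 01 and 06 return None and stock number 10 returns its description, as in the three cases traced above.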
The table could also define unit prices. The user enters stock number and quantity
sold. The program could locate the stock item in the table, calculate amount of sale (quan-
tity sold times unit price), and display description and amount of sale.
In Figure 15-3, the item number is 2 characters and the description is 10. Program-
ming details would vary for different numbers of entries and different lengths of entries.
For example, to compare three-byte fields, you could use REPE CMPSB, although the in-
struction involves the CX register, which LOOP already uses.
ORG 100H
BEGIN: JMP SHORT MAIN
;Initialize compares
LEA SI,STOKTAB
MOV AL,STOKNIN
CMP AL,[SI] ;Stock# (1) : table
JNE A30 ;Not equal, exit
MOV AL,STOKNIN+1 ;Equal:
CMP AL,[SI+1] ; stock# (2) : table
JE A50 ; equal, found
;Extract description
; from table
MOV AH,09H ;Request display
LEA DX,DESCRN ; stock description
INT 21H
END BEGIN
In the tax table, rates increase as taxable income increases. The adjustment factor compen-
sates for our calculating tax at the high rate, whereas lower rates apply to lower levels of
income. Entries for taxable income contain the maximum income for each step:
To perform a search of the table, the program compares the taxpayer’s actual taxable income
with entries in the table and does the following, according to results of the comparison:
Field:        STOKNIN    STOKTAB entry 1    entry 2    entry 3
Hex offset:   000        003                010        01D
The last entry in the table contains ‘999’ to force the search to end, since REPE makes the
CX unavailable for the LOOP instruction. The search routine compares STOKNIN (arbi-
trarily defined to contain 123) with each table entry, as follows:
The program initializes the DI to the offset address of STOKTAB (003), the CX
to the length (03) of each stock number, and the SI to the offset of STOKNIN (000).
The CMPSB operation compares byte for byte, as long as the bytes contain equal values,
and automatically increments the DI and SI registers. A comparison with the first table
entry (123:035) causes termination after one byte; the DI contains 004, the SI contains
001, and the CX contains 02. For the next comparison, the DI should contain 010 and
the SI should contain 000. Correcting the SI simply involves reloading the address
of STOKNIN. For the address of the table entry that should be in the DI, however, the in-
crement depends on whether the comparison ends after one, two, or three bytes. The CX
contains the number of the remaining uncompared bytes, in this case, 02. Adding the
CX value plus the length of the stock description gives the offset of the next table item,
as follows:
Address in the DI after CMPSB:          004H
Add remaining length in the CX:        +02H
Add length of stock description:       +0AH
Offset address of next table item:      010H
page 60,132
TITLE P15STRSR (EXE) Search using CMPSB
.MODEL SMALL
.STACK 64
.DATA
0000 STOKNIN DB '123'
0003 STOKTAB DB '035','Excavators' ;Start table
0010 DB '038','Lifters   '
001D DB '102','Valves    '
0037 DB '123','Processors'
.CODE
0000 BEGIN PROC FAR
0000 MOV AX,@data ;Initialize
0003 MOV DS,AX ; segment
0005 MOV ES,AX ; registers
0007 CLD
0008 LEA DI,STOKTAB ;Initialize table
000C A20: ; address
000C MOV CX,03 ;Compare 3 bytes
000F LEA SI,STOKNIN ;Init stock# addr
0013 REPE CMPSB ;Stock# : table
0015 JE A30 ; equal, exit
0017 JB A40 ; low, not in table
0019 ADD DI,CX ;Add CX to offset
001B ADD DI,10 ;Next table item
001E JMP A20
0020 A30:
0020 MOV CX,05 ;Set for 5 words
0023 MOV SI,DI
0025 LEA DI,DESCRN ;Addr of descr'n
0029 REP MOVSW ;Get description
; from table
002B MOV AH,09H ;Request display
002D LEA DX,DESCRN ;Stock descrip'n
0031 INT 21H
0033 JMP A90 ;Go to exit
0036 A40:
; <Display error message>
0036 A90:
0036 MOV AX,4C00H ;Exit to DOS
0039 INT 21H
003B RET
003C BEGIN ENDP
END BEGIN
Since the CX contains the number of the remaining uncompared bytes (if any), the arith-
metic works for all cases and terminates after one, two, or three comparisons. On an equal
comparison, the CX contains 00, and the DI is already incremented to the address of the re-
quired description. A REP MOVSW operation then copies the description into DESCRN,
where it is displayed.
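The register arithmetic can be traced with a Python model of REPE CMPSB. This is a sketch under stated assumptions: the memory image holds only the four table entries visible in the listing plus the '999' end entry, so the offsets of the later entries differ from those in the figure, and the function names are invented for the illustration.

```python
KEY_LEN, DESC_LEN = 3, 10
STOKNIN, STOKTAB = 0, 3  # offsets of the key and of the table

TABLE = (b"035Excavators" b"038Lifters   " b"102Valves    "
         b"123Processors" b"999" + b" " * DESC_LEN)


def make_memory(key):
    """Lay out the 3-byte key followed immediately by the table."""
    return key + TABLE


def repe_cmpsb(mem, si, di, cx):
    """Compare mem[si] with mem[di] while the bytes are equal and cx > 0.

    Returns (si, di, cx, result); result is 0 for equal, negative if the
    byte at si is lower, and positive if it is higher.
    """
    result = 0
    while cx > 0:
        a, b = mem[si], mem[di]
        si, di, cx = si + 1, di + 1, cx - 1
        result = (a > b) - (a < b)
        if result != 0:
            break
    return si, di, cx, result


def search(mem):
    di = STOKTAB
    while di + KEY_LEN + DESC_LEN <= len(mem):
        _, di, cx, result = repe_cmpsb(mem, STOKNIN, di, KEY_LEN)
        if result == 0:
            return mem[di:di + DESC_LEN].decode("ascii")  # DI is at the description
        if result < 0:
            return None                                   # low: not in table
        di += cx + DESC_LEN                               # next table item
    return None
```

The first comparison of 123 with 035 ends after one byte with SI = 1, DI = 4, and CX = 2, and adding CX and the description length moves DI to 16 (010H), the second table entry.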
The following example converts ASCII numbers 0-9 into EBCDIC. Since the representation
in ASCII is 30-39 and in EBCDIC is F0-F9, you could use an OR operation to
make the change. However, let's also convert all other characters to a blank, EBCDIC 40H.
For XLAT, you define a translation table that accounts for all 256 possible characters, with
EBCDIC codes inserted in the ASCII positions:
XLTBL DB 48 DUP(40H) ;EBCDIC blanks
DB 0F0H,0F1H,0F2H,0F3H,0F4H,0F5H,0F6H,0F7H,0F8H,0F9H ;EBCDIC 0-9
DB 198 DUP(40H) ;EBCDIC blanks
XLAT expects that the address of the table is in the BX register and the byte to be trans-
lated (let’s name it ASCNO) is in the AL. The following performs the initialization and
translation:
LEA BX,XLTBL ;Load address of table
MOV AL,ASCNO ;Load character to translate
XLAT ;Translate to EBCDIC
XLAT uses the AL value as an offset address; in effect, the BX contains the starting ad-
dress of the table, and the AL contains an offset value within the table. If the AL value is
00, for example, the table address would be XLTBL+0 (the first byte of XLTBL contain-
ing 40H). XLAT would replace the 00 in the AL with 40H from the table.
Note that the first DB in XLTBL defines 48 bytes, addressed as XLTBL+00 through
XLTBL+47. The second DB in XLTBL defines data beginning at XLTBL+48. If the AL
value is 32H (decimal 50), the table address is XLTBL+50; this location contains F2
(EBCDIC 2), which XLAT would insert in the AL register.
The program in Figure 15—5 expands this example to convert ASCII minus sign (2D)
and decimal point (2E) to EBCDIC (60 and 4B, respectively) and to loop through a six-byte
field. Initially, ASCNO contains —31.5 followed by a blank, or hex 2D33312E3520. At the
end of the loop, EBCNO should contain hex 60F3F14BF540.
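The effect of the translation table can be verified with a short Python sketch. The table is built the same way as XLTBL (blanks everywhere, with EBCDIC codes placed in the ASCII positions); the function names are illustrative only.

```python
# 256-entry table: every position starts as an EBCDIC blank (40H).
table = bytearray([0x40]) * 256
for digit in range(10):
    table[0x30 + digit] = 0xF0 + digit  # ASCII 30-39 -> EBCDIC F0-F9
table[0x2D] = 0x60                      # minus sign
table[0x2E] = 0x4B                      # decimal point
XLTBL = bytes(table)


def xlat(al):
    """Use the AL value as an offset into the table, as XLAT does."""
    return XLTBL[al]


def translate_field(ascno):
    """Translate each byte of a field in turn."""
    return bytes(xlat(b) for b in ascno)
```

Translating hex 2D33312E3520 with this table gives hex 60F3F14BF540, the value expected in EBCNO.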
The program in Figure 15-6 displays all 256 hex values (00-FF), including most of their
related ASCII symbols. For example, the program displays both the ASCII symbol S and
its hex representation, 53. The full display appears on the screen as a 16-by-16 matrix:
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF
page 60,132
TITLE P15ASCHX (COM) Display ASCII and hex characters
.MODEL SMALL
.CODE
ORG 100H
BEGIN: JMP SHORT MAIN
D40: RET
D10DISP ENDP
table, exchanging where necessary. If you made any exchanges, repeat the entire process
from the start of the table, comparing entry 1 with entry 2 again. If you didn't make any
exchanges, the table is in sequence and you can end the sort.
In the following pseudocode, SWAP is an item that indicates whether an exchange
was made (YES) or not made (NO).
At end of table?
The program in Figure 15-7 allows a user to enter up to 30 names from the keyboard,
which the program stores successively in a table named NAMETAB. When all the names
are entered, the user just presses the Enter key, with no name. The program then sorts the
table of names into ascending sequence and displays them on the screen. Note that the table
entries are all fixed-length 20 bytes; a routine for sorting variable-length data would be
more complicated.
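The sorting method can be sketched in Python as follows. This is a model of the technique rather than of the program itself: the function name is invented, and names are padded or truncated to the fixed entry length of 20.

```python
ENTRY_LEN = 20  # fixed length of each table entry


def sort_names(names):
    """Exchange sort: repeat passes over the table until no swap occurs."""
    table = [name.ljust(ENTRY_LEN)[:ENTRY_LEN] for name in names]
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(table) - 1):
            if table[i] > table[i + 1]:  # compare entry with next entry
                table[i], table[i + 1] = table[i + 1], table[i]
                swapped = True           # signal that an exchange was made
    return table
```

The swapped flag plays the role of the SWAP item: a full pass with no exchange means the table is in sequence.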
LINKED LISTS
A linked list contains data in what are called cells, like entries in a table, but in no specified
sequence. Each cell contains a pointer to the next cell in the list to facilitate forward
searches. (A cell may also contain a pointer to the preceding cell so that searching may pro-
ceed in either direction.) The method facilitates additions and deletions to a list without the
need for expanding and contracting it.
For our purposes, the linked list contains cells with part number (four-byte
ASCII value), unit price (binary word), and a pointer (binary word) to the next cell in
the list, which contains the next part number in the sequence. Thus each entry is eight
bytes in length. The pointer is an offset from the start of the list. The linked list be-
gins at offset 0000, the second item in the series is at 0024, the third is at 0032, and
so forth:
page 60,132
TITLE P15NMSRT (EXE) Sort names entered from terminal
.MODEL SMALL
.STACK 64
.DATA
NAMEPAR LABEL BYTE ;Name parameter list:
DB 21 ; maximum length
NAMELEN DB ? ; no. of chars entered
NAMEFLD DB 21 DUP(' ') ; name
DB 00
.CODE
BEGIN PROC FAR
MOV AX,@data ;Initialize DS and
MOV DS , AX ; ES registers
MOV ES, AX
CLD
CALL Q10CLR ;Clear screen
CALL Q20CURS ;Set cursor
LEA DI, NAMETAB
A20LOOP:
CALL B10READ ;Accept name
CMP NAMELEN,00 ;Any more names?
JZ A30 ; no, go to sort
CMP NAMECTR,30 ;30 names entered?
JE A30 ; yes, go to sort
CALL D10STOR ;Store entered name in table
JMP A20LOOP
A30: ;End of input
CALL Q10CLR ;Clear screen
CALL Q20CURS ; and set cursor
CMP NAMECTR,01 ;One or no name entered?
JBE A40 ; yes, exit
CALL G10SORT ;Sort stored names
CALL K10DISP ;Display sorted names
A40: MOV AX,4C00H ;Exit to DOS
INT 21H
BEGIN ENDP
; Accept name as input:
B10READ PROC
MOV AH, 09H
LEA DX,MESSG1 ;Display prompt
INT 21H
MOV AH, OAH
LEA DX, NAMEPAR ;Accept name
INT 21H
MOV AH, 09H
LEA DX, CRLF ;Return/line feed
INT 21H
MOV BH,00 ;Clear characters
MOV BL,NAMELEN ; after name
MOV CX,21
SUB CX,BX
B20:
MOV NAMEFLD[BX],20H ;Set name to blank
INC BX
LOOP B20
RET
B10READ ENDP
; Store name in table:
D10STOR PROC
INC NAMECTR ;Add to number of names
CLD
LEA SI,NAMEFLD
MOV CX,10 ;Ten words
REP MOVSW ;Name (SI) to table (DI)
RET
D10STOR ENDP
; Sort names in table:
G10SORT PROC
SUB DI,40 ;Set up stop address
MOV ENDADDR,DI
G20:
MOV SWAPPED,00 ;Set up start
LEA SI,NAMETAB ; of table
G30:
MOV CX, 20 ;Length of compare
MOV DI,SI
ADD DI,20 ;Next name for compare
MOV AX,DI
MOV BX,SI
REPE CMPSB ;Compare name to next
JBE G40 ; no exchange
CALL H10XCHG ; exchange
G40:
MOV SI,AX
CMP SI, ENDADDR ;End of table?
JBE G30 ; no, continue
CMP SWAPPED,00 ;Any swaps?
JNZ G20 ; yes, continue
RET ; no, end of sort
G10SORT ENDP
H10XCHG PROC
MOV CX,10 ;Number of words
LEA DI,NAMESAV
MOV SI,BX
REP MOVSW ;Move lower item to save
MOV CX,10 ;Number of words
MOV DI,BX
REP MOVSW ;Move higher item to lower
MOV CX,10
LEA SI,NAMESAV
REP MOVSW ;Move save to higher item
MOV SWAPPED,01 ;Signal exchange made
RET
H10XCHG ENDP
; Display sorted names:
K10DISP PROC
LEA SI,NAMETAB
K20:
LEA DI,NAMESAV ;Init'ze start of table
MOV CX,10 ;Count for move (10 words)
REP MOVSW
MOV AH,09H ;Request display
LEA DX , NAMESAV
INT 21H
DEC NAMECTR ;Is this last one?
JNZ K20 ; no, loop
RET ; yes, exit
K10DISP ENDP
; Clear screen:
Q10CLR PROC
MOV AX, 0600H
MOV BH,61H ;Attribute
MOV CX, 00 ;Full screen
MOV DX,184FH
INT 10H
RET
Q10CLR ENDP
; Set cursor:
Q20CURS PROC
MOV AH, 02H ;Request set cursor
MOV BH, 00 ;Page 0
MOV DX, 00 ;Location 00:00
INT 10H
RET
Q20CURS ENDP
END BEGIN
The item at offset 0016 contains zero as the next address, either to indicate the end of the
list or to make the list circular.
The program in Figure 15-8 uses the contents of the defined linked list, LINKLST,
to locate a specified part number, in this case, 1720. The search begins with the first item
in the table. The logic for using CMPSB is similar to that in Figure 15—4. The program com-
pares the part number (1720) with each item in the table and does the following, according
to the results of the comparison:
A more complete program could allow a user at a keyboard to enter any part number
and could display the price as an ASCII value.
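Following the pointers through the list can be modelled in Python. The sketch below uses the cell values from LINKLST as they appear in Figure 15-8 and simply follows each pointer until it finds the part number or reaches a zero pointer; it omits any early exit on a low comparison. The helper names are assumptions made for the example.

```python
import struct


def cell(part, price, next_offset):
    """Build one 8-byte cell: 4-byte ASCII part number, price word, pointer word."""
    return part.encode("ascii") + struct.pack("<HH", price, next_offset)


LINKLST = (cell("0103", 1250, 24) + cell("1720", 895, 16) +
           cell("1627", 375, 0) + cell("0120", 1380, 32) +
           cell("0205", 2500, 8))


def find_price(part_no):
    """Follow the chain from offset 0; return the price or None."""
    offset = 0
    while True:
        part = LINKLST[offset:offset + 4].decode("ascii")
        price, next_offset = struct.unpack_from("<HH", LINKLST, offset + 4)
        if part == part_no:
            return price
        if next_offset == 0:  # zero pointer marks the end of the list
            return None
        offset = next_offset
```

A search for part 1720 visits the cells at offsets 0, 24, 32, and 8 and returns the price 895.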
The program can use the TYPE operator to determine the definition (DW in this case), the
LENGTH operator to determine the DUP factor (10), and the SIZE operator to determine
the number of bytes (10 × 2, or 20). The following examples illustrate the three operators:
MOV AX,TYPE TABLEX ;AX = 0002 (2 bytes)
MOV BX,LENGTH TABLEX ;BX = 000A (10 entries)
MOV CX,SIZE TABLEX ;CX = 0014 (20 bytes)
You may use the values that LENGTH and SIZE return to end a search or a sort of a
table. For example, if the SI register contains the incremented offset address of a search,
you may test this offset using
CMP SI,SIZE TABLEX
KEY POINTS
• For most purposes, define tables so that their entries are related and have the same
length and data format.
• Design tables based on their data format. For example, table entries may be character
or numeric and one, two, or more bytes each in length.
• Remember that the maximum numeric value for a DB is 255 and that numeric DW
and DD reverse the bytes. Also, CMP and CMPSW assume that words contain bytes
in reverse sequence.
• If a table is subject to frequent changes, or if several programs reference the table,
store it on disk. An updating program can handle changes to the table. Any program
can then load the table from disk, and the programs need not be changed.
• Under direct table addressing, the program calculates the address of a table entry and
accesses that entry directly.
.DATA
PARTNO DB '1720' ;Part number
LINKLST DB '0103' ;Linked list table
DW 1250, 24
DB '1720'
DW 0895, 16
DB '1627'
DW 0375, 00
DB '0120'
DW 1380, 32
DB '0205'
DW 2500, 08
A90:
MOV AX,4C00H ;Exit to DOS
INT 21H
BEGIN ENDP
END BEGIN
• When searching a table, a program successively compares a data item against each
entry in the table until it finds a match.
• The XLAT instruction facilitates translating data from one format to another.
QUESTIONS
15-1. Distinguish between processing a table by direct addressing and by searching.
15-2. Define a table named TABLEX with 50 words, initialized to blanks.
15-3. Define three separate related tables that contain the following data: (a) item numbers 06, 10,
14, 21, and 24; (b) item descriptions of videotape, receivers, modems, keyboards, and
diskettes; (c) item prices 93.95, 82.25, 90.67, 85.80, and 13.85.
15-4. Code a program that allows a user to enter item numbers (ITEMIN) and quantities (QTYIN)
from the keyboard. Use the tables defined in Question 15-3, and include a search routine that
uses ITEMIN to locate an item number in the table. Extract the descriptions and prices from
the table. Calculate the value (quantity × price) of each sale, and display description and value
on the screen.
15-5. Using the description table defined in Question 15—3, code the following: (a) a routine that
moves the contents of the table to another (empty) table; (b) a routine that sorts the contents
of this new table into ascending sequence by description.
15-6. A program is required to provide simple encryption of data. Define an 80-byte data area named
CRYPTEXT containing any ASCII data. Arrange a translation table to convert the data some-
what randomly, for example, A to X, B to E, C to R, and so forth. Provide for all possible byte
values. Arrange a second translation table that reverses (decrypts) the data. The program
should perform the following actions:
CHAPTER 16
Disk Storage Organization
OBJECTIVE:
To examine the basic formats for hard disk and diskette
storage, the boot record, directory, and file allocation table.
INTRODUCTION
At some point, a serious programmer has to be familiar with the technical details of disk
organization, particularly for developing utility programs that examine the contents of
diskettes and hard disks. Where a reference to a disk or diskette is required, this text uses
the general term disk.
This chapter explains the concepts of tracks, sectors, and cylinders and gives the ca-
pacities of some commonly used devices.
Also covered is the organization of important data recorded at the beginning of a disk,
including the boot record (which helps the system load the DOS programs from disk into
memory), the directory (which contains the name, location, and status of each file on the
disk), and the file allocation table (or FAT, which allocates disk space for files).
DISK CHARACTERISTICS
For processing records on disks, it is useful to be familiar with the terms and characteris-
tics of their organization. A diskette has two sides (or surfaces), whereas a hard disk con-
tains a number of two-sided disks.
Each side of a diskette or hard disk contains a number of concentric tracks, numbered be-
ginning with 00, the outermost track. Each track is formatted into sectors of 512 bytes,
where the data is stored.
Both diskettes and hard disk devices are run by a controller that handles the place-
ment of the read-write heads on the disk surface and the transfer of data between disk and
memory. There is a read-write head for each disk surface. For both diskette and hard disk,
a request for a read or a write causes the disk drive controller to move the read-write heads
(if necessary) to the required track. The controller then waits for the required sector on the
spinning surface to reach the head, at which point the read or write operation takes place.
Figure 16-1 illustrates these features.
There are two main differences between a hard disk and a diskette drive. For hard
disk, the read-write head rides just above the disk surface without ever touching it,
whereas for diskette, the read-write head actually touches the surface. Also, a hard disk
device is constantly spinning, whereas a diskette device starts and stops for each read/
write operation.
Cylinders
A cylinder is a vertical set of all of the tracks with the same number on each surface of a
diskette or hard disk. Thus cylinder 0 is the set of tracks numbered 0, cylinder 1 is the set
of tracks numbered 1, and so forth. For a diskette, then, cylinder 0 consists of track 0 on
side 1 and track 0 on side 2; cylinder 1 consists of track 1 on side 1 and track 1 on side 2;
and so forth. When writing a file, the system fills all the tracks on a cylinder and then ad-
vances the read-write heads to the next cylinder.
A reference to disk sides (heads), tracks, and sectors is by number. Side and track
numbers begin with 0, but sectors may be numbered one of two ways:
1. Cylinder-track address: Sector numbers on each track begin with 1, so that the first
sector on the disk is addressed as cylinder 0, track 0, sector 1.
[Figure 16-1: a disk surface, showing the sectors, the access arm, and the read/write head]
2. Relative sector number: Sectors may be numbered relative to the start of the disk, so
that the first sector on the disk, on cylinder 0, track 0, is addressed as relative sector 0.
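The two numbering schemes are related by simple arithmetic, sketched here in Python. The function names and the geometry parameters (number of heads and sectors per track) are assumptions for the illustration; the ordering assumes that all tracks of one cylinder are used before the next cylinder, as described above.

```python
def to_relative_sector(cylinder, head, sector, heads, sectors_per_track):
    """Convert a cylinder/head/sector address (sectors start at 1) to a
    relative sector number (starting at 0)."""
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


def from_relative_sector(relative, heads, sectors_per_track):
    """Convert a relative sector number back to (cylinder, head, sector)."""
    track_index, sector_index = divmod(relative, sectors_per_track)
    cylinder, head = divmod(track_index, heads)
    return cylinder, head, sector_index + 1
```

For a two-sided diskette with 9 sectors per track, cylinder 0, head 0, sector 1 is relative sector 0, and cylinder 1, head 0, sector 1 is relative sector 18.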
Disk Controller
The disk controller is located between the processor and the disk drive and handles all
communication between them. The controller accepts data from the processor and con-
verts the data into a form that is usable by the device. For example, the processor may
send a request for data from a specific cylinder, disk head, and sector. The role of the con-
troller is to provide the appropriate commands to move the access arm to the required
cylinder, select the read/write head, and accept the data from the sector when the data
reaches the read-write head.
The processor is freed for other tasks while the controller is performing its work. Un-
der this approach, the controller handles only one byte at a time. However, the controller
can also perform faster I/O by bypassing the processor entirely and transferring data directly
to and from memory. The method of transferring a large block of data in this manner is
known as direct memory access (DMA). To this end, the processor provides the controller
with the read or write command, the address of the I/O buffer in memory, the number of
sectors to transfer, and the numbers of the cylinder, head, and starting sector. With this
method, the processor has to wait until the DMA is complete, since only one component at
a time can use the memory path.
Clusters
A cluster is a group of sectors that DOS treats as a unit of storage space. A cluster size is
always a power of 2, such as 1, 2, 4, or 8 sectors. A hard disk typically has four sectors per
cluster. On a disk device that uses one sector per cluster, sector and cluster are the same. A
file begins on a cluster boundary and requires a minimum of one cluster even if the file oc-
cupies only one of four sectors. A cluster may also overlap from one track to another.
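A disk with two sectors per cluster would look like this:
| sector | sector | sector | sector | sector | sector | sector | sector |
|     cluster     |     cluster     |     cluster     |     cluster     |
And a disk with four sectors per cluster would look like this:
| sector | sector | sector | sector | sector | sector | sector | sector |
|              cluster              |              cluster              |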
A 100-byte file (small enough to occupy one sector) stored on disk with four sectors
per cluster uses 4 × 512 = 2,048 bytes of storage, although only one sector would contain
data. DOS stores clusters for files in ascending sequence, although a file may be fragmented
so that it resides, for example, in clusters 8, 9, 10, 14, 17, and 18.
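The rounding up to whole clusters can be sketched in a few instructions. The following is only an illustration, not the code that DOS itself uses; the names FILESIZ, SECCLUS, CLUSTRS, SPACELO, and SPACEHI are hypothetical, and the sketch assumes that the cluster size in bytes fits in one word.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
FILESIZ DW      100             ;File size in bytes (sample value)
SECCLUS DW      4               ;Sectors per cluster (assumed)
CLUSTRS DW      ?               ;Clusters required
SPACELO DW      ?               ;Disk space used, low word
SPACEHI DW      ?               ;Disk space used, high word
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize data
        MOV     DS,AX           ; segment
        MOV     AX,512          ;Bytes per sector
        MUL     SECCLUS         ;DX:AX = bytes per cluster
        MOV     BX,AX           ;Assumed to fit in one word
        MOV     AX,FILESIZ
        SUB     DX,DX           ;DX:AX = file size
        ADD     AX,BX           ;Add (cluster size - 1)
        ADC     DX,0            ; in order to
        SUB     AX,1            ; round up
        SBB     DX,0
        DIV     BX              ;AX = clusters required
        MOV     CLUSTRS,AX
        MUL     BX              ;DX:AX = bytes of disk space
        MOV     SPACELO,AX
        MOV     SPACEHI,DX
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

For the sample values, the division yields one cluster and the multiplication yields 2,048 bytes, which agrees with the 100-byte file described above.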
Disk Capacity
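Here are common diskette storage capacities:
               TRACKS     SECTORS     BYTES PER   TOTAL BYTES,
CAPACITY       PER SIDE   PER TRACK   SECTOR      TWO SIDES
5.25” 360KB       40          9          512         368,640
5.25” 1.2MB       80         15          512       1,228,800
3.5” 720KB        80          9          512         737,280
3.5” 1.44MB       80         18          512       1,474,560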
For hard disks, capacities vary considerably by device and by partition. Useful opera-
tions for determining the number of cylinders, sectors per track, or read-write heads include
INT 21H, functions 1FH and 440DH with minor code 60H, both covered in Chapter 18.
System Area
The system area is the first area of a disk, on the outermost track(s) beginning with side 0,
track 0, sector 1. The information that DOS stores and maintains in its system area is used
to determine, for example, the location of each file that is to be accessed. The three com-
ponents of the system area are:
1. Boot record
2. File allocation table (FAT)
3. Directory
The system area and the data area are arranged like this:
5.25” 360KB
5.25” 1.2MB
3.5” 720KB
3.5” 1.44MB
For hard disk, the locations of the boot record and the FAT are usually the same as
for diskette; the size of the FAT and the location of the directory vary by device.
Data Area
The data area for a bootable disk or diskette begins with two DOS system files named
IO.SYS and MSDOS.SYS (for MS-DOS) or IBMBIO.COM and IBMDOS.COM (for IBM
PC DOS). When you use FORMAT /S to format a disk, DOS copies its system files onto
the first sectors of the data area. User files either immediately follow the system files or, if
there are no system files, begin at the start of the data area.
A formatted two-sided diskette with nine sectors per track contains the following
information:
Records for data files begin on side 1, track 0, sectors 3 through 9. The system stores
records next on side 0, track 1, then side 1, track 1, then side 0, track 2, and so forth. This
feature of filling data on opposite tracks (in the same cylinder) before proceeding to the next
cylinder reduces the motion of the disk head and is the method used on both diskettes and
hard disks.
For other devices, the FAT and directory may be different lengths. The next sections
cover the boot record, directory, and FAT in detail.
BOOT RECORD
The boot record contains the instructions that load (or "boot") the system files IO.SYS,
MSDOS.SYS, and COMMAND.COM (if present) from disk into memory. All formatted disks
contain this record even if the system files are not stored on it. The boot record contains the
following information, in order of offset address:
00H   Short or far jump to the bootstrap routine at offset 1EH or 3EH in the boot
      record
03H   Manufacturer's name and DOS version number when boot was created
0BH   Bytes per sector, usually 200H (512)
0DH   Sectors per cluster (1, 2, 4, or 8)
0EH   Reserved sectors
DOS 4.0 extended the boot record with additional fields from 20H through 1FFH.
Thus the original boot record is 20H (32) bytes, whereas the extended version is 200H
(512) bytes.
DIRECTORY
All files on a disk begin on a cluster boundary, which is the first sector of the cluster. For
each file, DOS creates a 32-byte (20H) directory entry that describes the name of the file,
the date it was created, its size, and the location of its starting cluster. Directory entries have
the following format:
BYTE       PURPOSE
00H-07H    Filename, as defined in the program that created the file. The first byte
           of the filename can also indicate the file status:
             00H  File has never been used
             05H  First character of filename is actually E5H
             2EH  Entry is for a subdirectory
             E5H  File has been deleted
08H-0AH    Filename extension.
0BH        File attribute, defining the type of file:
             00H  Normal file
             01H  Read-only file
             02H  Hidden file
             04H  System file
             08H  Volume label (if this is a volume label record, the label itself is
                  in the filename and extension fields)
             10H  Subdirectory
             20H  Archive file, which indicates whether the file was rewritten
                  since the last update.
           (As an example, code 07H would mean a system file (04H) that is read
           only (01H) and hidden (02H).)
0CH-15H    Reserved for DOS.
16H-17H    Time of day when the file was created or last updated; stored as 16 bits
           in binary format as |hhhhhmmmmmmsssss|.
18H-19H    Date when the file was created or last updated, stored as 16 bits in
           binary format as |yyyyyyymmmmddddd|. The year can be 0-119 (assuming
           1980 as the starting point), the month can be 01-12, and the day
           can be 01-31.
1AH-1BH    Starting cluster of the file. The number is relative to the last two
           sectors of the directory. Where there are no DOS system files, the first data
           file begins at relative cluster 002. The actual side, track, and cluster
           depend on disk capacity. A zero entry means that the file has no space
           allocated to it.
1CH-1FH    Size of the file in bytes. When you create a file, DOS calculates and
           stores its size in this field.
For numeric fields that exceed one byte in the directory, the bytes are stored in re-
verse sequence.
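As an illustration of how the packed time and date fields can be taken apart, the following sketch isolates each group of bits with shifts and AND masks. The field values and all of the data names are hypothetical. The sketch also assumes the usual DOS convention that the five-bit seconds field counts in units of two seconds. Because the fields are defined with DW, a MOV into the AX delivers the bytes in the correct sequence.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
DIRTIME DW      6B28H           ;Sample time field (hypothetical)
DIRDATE DW      1A8FH           ;Sample date field (hypothetical)
HOURS   DB      ?
MINS    DB      ?
SECS    DB      ?
YEAR    DW      ?
MONTH   DB      ?
DAY     DB      ?
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize data
        MOV     DS,AX           ; segment
        MOV     AX,DIRTIME      ;hhhhhmmm mmmsssss
        MOV     BX,AX           ;Save a copy
        AND     AL,1FH          ;Low 5 bits = seconds / 2
        SHL     AL,1            ;Assumed 2-second units
        MOV     SECS,AL
        MOV     AX,BX
        MOV     CL,5
        SHR     AX,CL           ;Minutes to low-order bits
        AND     AL,3FH          ;Keep 6 bits
        MOV     MINS,AL
        MOV     AX,BX
        MOV     CL,11
        SHR     AX,CL           ;High 5 bits = hours
        MOV     HOURS,AL
        MOV     AX,DIRDATE      ;yyyyyyym mmmddddd
        MOV     BX,AX
        AND     AL,1FH          ;Low 5 bits = day
        MOV     DAY,AL
        MOV     AX,BX
        MOV     CL,5
        SHR     AX,CL           ;Month to low-order bits
        AND     AL,0FH          ;Keep 4 bits
        MOV     MONTH,AL
        MOV     AX,BX
        MOV     CL,9
        SHR     AX,CL           ;High 7 bits = years since 1980
        ADD     AX,1980
        MOV     YEAR,AX
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

For the sample values, 6B28H unpacks as 13 hours, 25 minutes, and 16 seconds, and 1A8FH unpacks as year 1993, month 04, day 15.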
FILE ALLOCATION TABLE
The original designers provided for two copies of the FAT (FAT1 and FAT2), presumably
because FAT2 could be used if FAT1 became corrupted. However, although FAT2 is still
maintained, its use has never been implemented. All discussions in this book concern FAT1.
Note that F0H and F9H each identify two different disk formats.
You would expect that the data area would be the starting point for clusters, but in-
stead, the first two cluster numbers (0 and 1) point to the directory, so that the data area for
stored data files begins with cluster number 2. The reason for this odd state of affairs will
soon be made clear.
Following the first two FAT entries are pointer entries that relate to every cluster in the data
area. The directory (at 1AH-1BH) contains the location of the first cluster for a file, and
the FAT contains a chain of pointer entries for each succeeding cluster.
Since DOS 3.0, the entry length for diskettes is still three hex digits (1½ bytes, or
12 bits), but for hard disk it is four hex digits (2 bytes, or 16 bits). Each FAT pointer entry
indicates the use of a particular cluster according to the following format:
The first two entries for a 1.44MB diskette (a 12-bit FAT) look like this:
The term “relative cluster” means the cluster to which the FAT entry points. In a
sense, the first two FAT entries (0 and 1) point to the last two clusters in the directory, which
have been assigned as the start of clusters; the directory indicates the size and starting clus-
ter for files.
The directory contains the starting cluster number for each file, and the FAT contains a
chain of pointer entries that indicate the location of the next cluster, if any, at which the
file continues. A pointer entry containing (F)FFFH indicates the last cluster for the file.
FAT entry:         | F0 | FFFF | 003 | 004 | FFF | ... |
Relative cluster:    0     1      2     3     4    ... end
For the first two FAT entries, F0 indicates a two-sided, 18-sectored (1.44MB) diskette,
followed by FFFFH. To read CUSTOMER.FIL from disk into memory, the system takes
the following steps:
* Searches the disk directory for the filename CUSTOMER and the extension FIL.
DOS extracts from the directory the location of the first relative cluster (2) of the file
and delivers its contents (data from the sectors) to the program in main memory.
* Accesses the FAT pointer entry that represents relative cluster 2. From the diagram,
this entry contains 003, meaning that the file continues on relative cluster 3. DOS de-
livers the contents of this cluster to the program.
* Accesses the FAT pointer entry that represents relative cluster 3. This entry contains
004, meaning that the file continues on relative cluster 4. DOS delivers the contents
of this cluster to the program.
The FAT entry for relative cluster 4 contains FFFH, to indicate that no more clusters
are allocated for the file. DOS has now delivered all the file's data, from clusters 2,
3, and 4.
We've just seen how FAT entries work in principle; now let’s see how they work in
terms of reversed-byte sequence, where a little more ingenuity is required.
But what's needed now to decipher the entries is to represent them according to relative
byte rather than cluster:
FAT entry:       F0  FF  FF  03  40  00  FF  0F  ...
Relative byte:    0   1   2   3   4   5   6   7  ...
* Multiply 2 (the file's first cluster) by 1.5 (the length of FAT entries) to get 3. (For
programming, multiply by 3 and shift right one bit.) Access the word at bytes 3 and 4 in
the FAT. These contain 03 40, which become, in reverse, 4003. Since cluster 2 was
an even number, use the last three digits, so that 003 is the second cluster for the file.
* For the third cluster, multiply cluster number 3 by 1.5 to get 4. Access FAT bytes 4
and 5. These contain 40 00, which become, in reverse, 0040. Since cluster 3 was an
odd number, use the first three digits, so that 004 is the third cluster for the file.
* For the fourth cluster, multiply 4 by 1.5 to get 6. Access FAT bytes 6 and 7. These
contain FF 0F, which become, in reverse, 0FFF. Since cluster 4 was an even number,
use the last three digits, FFF, which mean that this is the last entry. (Whew!)
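The steps above can be sketched as a short routine that follows a chain of 12-bit FAT entries. This is a minimal illustration under stated assumptions, not the method DOS uses internally: the names FATTBL, CHAIN, and B10NEXT are hypothetical, the table holds only the eight FAT bytes from the example, and any entry from FF8H upward is treated as the end of the chain (FFFH is the value shown in the text).

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
FATTBL  DB      0F0H,0FFH,0FFH,03H,40H,00H,0FFH,0FH ;FAT bytes 0-7
CHAIN   DW      8 DUP(0)        ;Clusters found for the file
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize data
        MOV     DS,AX           ; segment
        MOV     AX,2            ;First cluster (from directory)
        LEA     DI,CHAIN
A20:    MOV     [DI],AX         ;Record this cluster
        ADD     DI,2
        CALL    B10NEXT         ;Get FAT entry for cluster
        CMP     AX,0FF8H        ;End of chain? (assumed test)
        JB      A20             ; no, continue
        MOV     AX,4C00H        ; yes, exit to DOS
        INT     21H
BEGIN   ENDP
;               Return in AX the 12-bit FAT entry for cluster in AX:
B10NEXT PROC    NEAR
        MOV     BX,AX
        SHL     BX,1            ;Cluster x 2
        ADD     BX,AX           ;Cluster x 3
        SHR     BX,1            ;Cluster x 1.5 = relative byte
        MOV     DX,WORD PTR FATTBL[BX] ;Word arrives reversed
        TEST    AX,1            ;Odd-numbered cluster?
        JZ      B20
        MOV     CL,4            ; yes, use first
        SHR     DX,CL           ; three digits
        JMP     B30
B20:    AND     DX,0FFFH        ; no, use last three digits
B30:    MOV     AX,DX
        RET
B10NEXT ENDP
        END     BEGIN
```

With the sample bytes, the routine stores clusters 2, 3, and 4 in CHAIN and stops when the entry for cluster 4 returns FFFH.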
Relative cluster:  0  1  2  3  4  5
The FAT entry for relative cluster 2, 0300, reverses as 0003 for the next cluster. The FAT
entry for relative cluster 3, 0400, reverses as 0004 for the next cluster. Continue with the
chain of remaining entries in this fashion through to the entry for cluster number 5.
292 Disk Storage Organization | Chapter 16
If your program has to determine the type of disk that is installed, it can check the
media descriptor in the boot sector directly or, preferably, use DOS INT 21H,
function 1BH or 1CH.
720K Disk. First insert the 720K diskette in drive A (or B if necessary). Load DE-
BUG and enter the L (load) command (explained more fully in Appendix E):
You can now examine the boot record, directory, and FAT for this diskette. To dis-
play the boot record, enter the command D 100. Note some of the fields:
* Segment offset 103H shows the manufacturer's name and DOS version when the
FAT was created.
* 10BH shows the number of bytes per sector (where 0002H reverses as 0200H, or
512 bytes).
* 115H is the media descriptor, F9H for this diskette.
Check out the other fields.
* For the first file, multiply 2 (its first cluster) by 1.5 to get 3. Access offset bytes 3 and
4 in the FAT, which contain FF 4F, and reverse the bytes to get 4FFF. Because cluster 2
was an even number, use the last three digits, FFF, which tell you that there are
no more clusters for this file.
* For the second file, multiply 3 (its first cluster) by 1.5 to get 4. Access offset bytes 4
and 5 in the FAT, which contain 4F 00, and reverse the bytes to get 004F. Because
cluster 3 was an odd number, use the first three digits, 004, which identify the next
cluster in the series. Multiply cluster 4 by 1.5 to get 6. Access offset bytes 6 and 7 in
the FAT, which contain FF 0F, and reverse the bytes to get 0FFF. Because cluster 4
was an even number, use the last three digits, FFF, which indicate the end of the data.
1.44MB Disk. Now insert the 1.44MB diskette in drive A, and enter the DEBUG
command L 100 0 0 30. (Load 30H sectors because there’s more FAT on 1.44MB
diskettes.) Display the boot record for this disk, and note that the media descriptor byte at
115H is F0 and the number of sectors per cluster (at 10DH) is 1. The directories at 2700H
and 2720H should show that the starting cluster for the first file is 2 and for the second file
is 4. (The starting cluster for the second file on the 720K diskette was 3 because that for-
mat has two sectors per cluster.)
Display the FAT at 300H, which appears as
Since the first file starts at cluster 2, multiply 2 by 1.5 to get relative byte 3. Bytes 3
and 4 contain 03 F0, which reverse as F003. Because cluster 2 was an even number, use the
last three digits, 003. Cluster 3 × 1.5 is 4; relative bytes 4 and 5 contain F0 FF, which
reverse as FFF0. Because cluster 3 was an odd number, use the first three digits, FFF, which
indicate that the file does not continue. We now know that the file resides on clusters 2 and 3.
Use the same technique to trace through the chain for the second file, which begins
with cluster 4, or relative byte 6.
DOS provides some supporting services for programs to access information about the
directory and the FAT. Functions 47H (Get Current Directory) and 1BH and 1CH (Get FAT
Information) are described in Chapter 18.
Processing for files on hard disk is similar to that for diskette, and for both, you have
to supply a path name to access files in subdirectories.
KEY POINTS
* Each side of a diskette or hard disk contains a number of concentric tracks, starting
with track number 00. Each track is formatted into sectors of 512 bytes, starting with
sector number 1.
* A cylinder is the set of all tracks with the same number on each side.
* A cluster is a group of sectors that DOS treats as a unit of storage space. A cluster
size is always a power of 2, such as 1, 2, 4, or 8 sectors. A file begins on a cluster
boundary and requires a minimum of one cluster.
QUESTIONS
16-1. What is the length in bytes of a standard sector?
16-2. What is a cylinder?
16-3. What is the purpose of a disk controller?
16-4. (a) What is a cluster? (b) What is its purpose? (c) A file is 48 bytes long. What is the disk
space used for cluster sizes 1, 2, 4, and 8?
16-5. Show how to calculate the capacity of a diskette, based on the number of cylinders, sectors
per track, and bytes per sector, for (a) a 5.25”, 360KB diskette and (b) a 3.5”, 1.44MB diskette.
16-6. What does the disk system area contain?
16-7. (a) Where is the boot record located? (b) What is its purpose?
16-8. What is the indication in the directory for a deleted file?
16-9. What is the indication in the directory for (a) a normal file; (b) a hidden file?
16-10. What is the additional effect on a diskette or hard disk when you use FORMAT /S to format?
16-11. Consider a file with a size of 2,890 (decimal) bytes. (a) Where does the system store the size?
(b) What is the size in hexadecimal format? Show the value as the system stores it.
16-12. Where and how does the FAT indicate that the device on which it resides is on (a) hard disk;
(b) a 5.25”, 360KB diskette; (c) a 3.5”, 1.44MB diskette?
CHAPTER 17
Disk Processing:
I—Writing and Reading Files
OBJECTIVE:
To cover the use of file handles and the DOS functions for
writing and reading disk files sequentially and randomly.
INTRODUCTION
The original DOS services for processing disk files used a method called file control blocks
(FCBs). This method, although still supported by DOS, can address drives and filenames,
but not subdirectories. Succeeding DOS versions introduced a number of extended services
that are simpler than their original counterparts and are generally recommended. Some of
these operations involve the use of an ASCIIZ string to initially identify a drive, path, and
filename; a file handle for subsequent accessing of the file; and special return codes to iden-
tify errors. As a reminder, the term cluster denotes a group of one or more sectors of data,
depending on the device.
Although no new assembly language instructions are required, this chapter introduces
a number of DOS 21H services for processing disk files. Here they are, arranged by category:
OPERATIONS USING FILE HANDLES      OPERATIONS USING FCBS
3CH  Create file                   0FH  Open file
3DH  Open file                     10H  Close file
3EH  Close file                    14H  Read record
3FH  Read record                   15H  Write record
The chapter covers DOS services for writing and reading disk files. Chapter 18 cov-
ers the various support services required for handling disk drives, directories, and files.
ASCIIZ STRINGS
When using many of the extended services for disk processing, you first tell DOS the ad-
dress of an ASCIIZ string containing the location of the file: disk drive, directory path, and
filename (all optional and within apostrophes), followed by a byte of hex zeros; thus the
name ASCIIZ string. The maximum length of the string is 128 bytes.
The following code defines a drive and filename:
PATHNM1 DB      'D:\TEST.ASM', 00H
PATHNM2 DB      'D:\UTILITY\NU.EXE', 00H
The backslash, which may also be a forward slash, acts as a path separator. A byte of zeros
terminates the string. For interrupts that require an ASCIIZ string, load its offset address in
the DX register, for example, as
        LEA     DX,PATHNM1
FILE HANDLES
As discussed in Chapter 9, you may use file handles directly for certain standard devices:
00 = input, 01 = output, 02 = error output, 03 = auxiliary device, and 04 = printer. Many
DOS services also involve the use of a file handle for operations that access files, and you
have to request the file handle number from DOS. A disk file must first be opened; unlike
transferring data from the keyboard or to the screen, DOS has to address disk files through
its directory and FAT entries and must update these entries. During program execution,
each file referenced must be assigned its own unique file handle.
DOS delivers a file handle when you open a file for input or create a file for output.
The operations involve the use of an ASCIIZ string and DOS function 3CH or 3DH. The
file handle is a unique one-word number returned in the AX that you save in a word data
item and use for all subsequent requests to access the file. Typically, the first file handle re-
turned is 05, the second is 06, and so forth.
The PSP contains a default file handle table that provides for 20 handles (thus the
nominal limit for opened files), but INT 21H, function 67H, can be used to increase the
limit, as explained in Chapter 24.
FILE POINTERS
DOS maintains a separate file pointer for each file that a program is processing. The create
and open operations set the value of the file pointer to zero, the file’s starting location. The
file pointer subsequently accounts for the current offset location within the file.
Each read/write operation causes DOS to increment the file pointer by the number of
bytes transferred. The file pointer then points to the location of the next record to be ac-
cessed. File pointers facilitate both sequential and random processing. For random pro-
cessing, you can use DOS function 42H (covered in a later section) to set the file pointer to
any location in a file.
PATHNM1 DB      'D:\ACCOUNTS.FIL', 00H
HANDLE1 DW      ?
For a valid operation, DOS creates a directory entry with the given attribute, clears
the carry flag, and sets the handle for the file in the AX. Use this file handle for all subse-
quent disk operations. The named file is opened with its file pointer set to zero and is now
available for writing. If a file with the given name already exists in the path, the operation
sets up a zero length for overwriting the new file on the old one.
For error conditions, the operation sets the carry flag and returns a code in the AX:
03, 04, or 05 (see Figure 17-1). Code 05 means that either the directory is full or the
referenced filename has the read-only attribute. Be sure to check the carry flag first. For
example, creating a file probably delivers handle 05 to the AX, which could easily be confused
with error code 05, access denied. Related services for creating a file are 5AH and 5BH,
covered in Chapter 18.
HANDLE1 DW      ?
OUTREC  DB      256 DUP(' ')    ;Output area
A valid operation writes the record onto disk, increments the file pointer, clears the carry
flag, and sets the AX to the number of bytes actually written. A full disk may cause the num-
ber written to differ from the number requested, although DOS does not report this condi-
tion as an error. An invalid operation sets the carry flag and returns to the AX error code 05
(access denied) or 06 (invalid handle).
A successful close operation writes any remaining records still in the memory buffer and
updates the FAT and the directory with the date and file size. An unsuccessful operation
sets the carry flag and returns the only possible error code in the AX, 06 (invalid handle).
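The create, write, and close operations described above can be combined in one short program. The following is a sketch only, under these assumptions: drive D exists, the data item ERRCDE is a hypothetical indicator that is not part of the definitions above, and a short write is treated as an error even though DOS does not report it as one.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
PATHNM1 DB      'D:\ACCOUNTS.FIL', 00H
HANDLE1 DW      ?
OUTREC  DB      256 DUP(' ')    ;Output area
ERRCDE  DB      00              ;Error indicator (hypothetical)
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize data
        MOV     DS,AX           ; segment
        MOV     AH,3CH          ;Request create
        MOV     CX,00           ;Normal attribute
        LEA     DX,PATHNM1      ;ASCIIZ string
        INT     21H
        JC      A80             ;Error? yes, nothing to close
        MOV     HANDLE1,AX      ; no, save handle
        MOV     AH,40H          ;Request write
        MOV     BX,HANDLE1
        MOV     CX,256          ;Record length
        LEA     DX,OUTREC
        INT     21H
        JC      A20             ;Error on write?
        CMP     AX,256          ;All bytes written?
        JE      A30             ; yes, bypass
A20:    MOV     ERRCDE,01       ; no, note the error
A30:    MOV     AH,3EH          ;Request close
        MOV     BX,HANDLE1
        INT     21H
        JMP     A90
A80:    MOV     ERRCDE,01       ;Create failed
A90:    MOV     AH,4CH          ;Exit to DOS,
        MOV     AL,ERRCDE       ; indicator as return code
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

Note that each test of the carry flag comes before any use of the AX, for the reason given above: a handle of 05 and error code 05 are otherwise indistinguishable.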
* C10CREA Uses function 3CH to create the file and saves the handle in a data item
named HANDLE.
* D10PROC Accepts input from the keyboard and clears positions from the end of
the name to the end of the input area.
* F10WRIT Uses function 40H to write records.
* G10CLSE At the end of processing, uses function 3EH to close the file in order to
create a proper directory entry.
The input area is 30 bytes, followed by 2 bytes for the Enter (0DH) and Line Feed
(0AH) characters, for 32 bytes in all. The program writes the 32 bytes as a fixed-length
record. You could omit the Enter/Line Feed characters, but you should include them if you
want to sort the records in the file, since the DOS SORT program requires these characters
to indicate the end of records. For this example, the SORT command to sort the records
from NAMEFILE.DAT into ascending sequence in NAMEFILE.SRT could be
SORT D:<NAMEFILE.DAT >NAMEFILE.SRT
The program in Figure 17-3 reads and displays the contents of NAMEFILE.SRT. Note
two points: (1) The Enter/Line Feed characters are included after each record only to
facilitate the sort and could otherwise be omitted. (2) The records could be of variable
length, only up to the end of the names; this would involve some extra programming, as
you'll see later.
        .DATA
NAMEPAR LABEL   BYTE                    ;Parameter list:
MAXLEN  DB      30                      ;Maximum length
NAMELEN DB      ?                       ;Actual length
NAMEREC DB      30 DUP(' '), 0DH, 0AH   ;Entered name,
                                        ; CR/LF for writing
ERRCDE  DB      00                      ;Error indicator
HANDLE  DW      ?                       ;File handle
PATHNAM DB      'D:\NAMEFILE.DAT', 0
PROMPT  DB      'Name? '
ROW     DB      01
OPNMSG  DB      '*** Open error ***', 0DH, 0AH
WRTMSG  DB      '*** Write error ***', 0DH, 0AH
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data                ;Initialize data
        MOV     DS,AX                   ; segment
        MOV     ES,AX
        MOV     AX,0600H
        CALL    Q10SCR                  ;Clear screen
        CALL    Q20CURS                 ;Set cursor
        CALL    C10CREA                 ;Create file, set DTA
        CMP     ERRCDE,00               ;Create error?
        JZ      A20LOOP                 ; no, continue
        JMP     A90                     ; yes, exit
A20LOOP:
        CALL    D10PROC
        CMP     NAMELEN,00              ;End of input?
        JNE     A20LOOP                 ; no, continue
        CALL    G10CLSE                 ; yes, close,
A90:    MOV     AX,4C00H                ;Exit to DOS
        INT     21H
BEGIN   ENDP
;               Create disk file:
        INT     21H
        CMP     NAMELEN,00              ;Is there a name?
        JZ      D90                     ; no, exit
        MOV     AL,20H                  ;Blank for storing
        SUB     CH,CH
        MOV     CL,NAMELEN              ;Length
        LEA     DI,NAMEREC
        ADD     DI,CX                   ;Address + length
        NEG     CX                      ;Calculate remaining
        ADD     CX,30                   ; length
        REP STOSB                       ;Set to blank
        CALL    F10WRIT                 ;Write disk record
        CALL    E10SCRL                 ;Check for scroll
D90:
        RET
D10PROC ENDP
;               Check for scroll:
;               Set cursor:
In writing a file, be sure to use function 3CH to create the file, not function 3DH to
open it. The following example opens a file for reading:
If a file with the given name exists, the operation sets the record length to 1 (which
you can override), assumes the file’s current attribute, sets the file pointer to 0 (the start of
the file), clears the carry flag, and sets a handle for the file in the AX. Use this file handle
for all subsequent operations.
If the file does not exist, the operation sets the carry flag and returns an error code in
the AX: 02, 03, 04, 05, or 12 (see Figure 17-1). Be sure to check the carry flag first. For
example, opening a file probably delivers handle 05 to the AX, which could easily be
confused with error code 05, access denied.
HANDLE2 DW ?
INPREC  DB      512 DUP(' ')
A valid operation delivers the record to the program, clears the carry flag, and sets the AX
to the number of bytes actually read. Zero in the AX means an attempt to read from the end
of the file; this is a warning, not an error. An invalid read sets the carry flag and returns to
the AX error code 05 (access denied) or 06 (invalid handle).
Since DOS limits the number of files open at one time, a program that successively
reads a number of files should close them as soon as possible.
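As a sketch of the open and read operations working together, the following program opens a file for reading, reads it 512 bytes at a time until the AX returns zero, and closes the file as soon as reading ends. The path and the names PATHNM2 and RECCNT are hypothetical; the program merely counts the reads that delivered data and is not the program of Figure 17-3.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
PATHNM2 DB      'D:\ACCOUNTS.FIL', 00H  ;Hypothetical path
HANDLE2 DW      ?
INPREC  DB      512 DUP(' ')
RECCNT  DW      00              ;Reads that returned data
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize data
        MOV     DS,AX           ; segment
        MOV     AH,3DH          ;Request open
        MOV     AL,00           ;Access code: read only
        LEA     DX,PATHNM2      ;ASCIIZ string
        INT     21H
        JC      A90             ;Error? yes, exit
        MOV     HANDLE2,AX      ; no, save handle
A20:    MOV     AH,3FH          ;Request read
        MOV     BX,HANDLE2
        MOV     CX,512          ;Maximum bytes to read
        LEA     DX,INPREC
        INT     21H
        JC      A30             ;Error on read? yes, stop
        CMP     AX,00           ;End of file?
        JE      A30             ; yes, stop
        INC     RECCNT          ; no, count the read
        JMP     A20
A30:    MOV     AH,3EH          ;Request close
        MOV     BX,HANDLE2
        INT     21H
A90:    MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

The last read of a file may deliver fewer than 512 bytes; the AX then contains the actual number, and the following read returns zero.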
* E10OPEN Uses DOS function 3DH to open the file and saves the handle in a data
item named HANDLE.
IOAREA  DB      32 DUP(' ')
OPENMSG DB      '*** Open error ***', 0DH, 0AH
PATHNAM DB      'D:\NAMEFILE.SRT', 0
READMSG DB      '*** Read error ***', 0DH, 0AH
ROW     DB      00
; ------------------------------------------------------
BEGIN   PROC    FAR
        MOV     AX,@data                ;Initialize
        MOV     DS,AX                   ; segment
        MOV     ES,AX                   ; registers
        CALL    Q10SCR                  ;Clear screen
        CALL    Q20CURS                 ;Set cursor
        CALL    E10OPEN                 ;Open file, set DTA
        CMP     ENDCDE,00               ;Valid open?
        JNZ     A90                     ; no, exit
A20LOOP:
        CALL    F10READ                 ;Read disk record
        CMP     ENDCDE,00               ;Normal read?
        JNZ     A90                     ; no, exit
        CALL    G10DISP                 ; yes, display name,
        JMP     A20LOOP                 ; continue
A90:                                    ;End processing,
        MOV     AX,4C00H                ; exit to DOS
        INT     21H
BEGIN   ENDP
;               Open file:
E10OPEN PROC    NEAR
        MOV     AH,3DH                  ;Request open
        MOV     AL,00                   ;Normal file
        LEA     DX,PATHNAM
        INT     21H
        JC      E20                     ;Error?
        MOV     HANDLE,AX               ; no, save handle
        RET
E20:
        MOV     ENDCDE,01               ; yes,
        LEA     DX,OPENMSG              ; display
        CALL    X10ERR                  ; error message
        RET
E10OPEN ENDP
;               Read disk record:
F20:                                    ; no,
        LEA     DX,READMSG              ; invalid read
        CALL    X10ERR
F30:
        MOV     ENDCDE,01               ;Force end
F90:    RET
F10READ ENDP
;               Display name:
* F10READ Issues DOS function 3FH, which uses the handle to read the records.
* G10DISP Displays the records and scrolls the screen. Since Enter and Line Feed
characters already follow each record, the program does not have to advance the cur-
sor when displaying records.
<Tab>MOV<Tab>AH,09<Enter>
09 4D 4F 56 09 41 48 2C 30 39 0D 0A
where 09H is Tab, 0DH is Enter, and 0AH is Line Feed. When TYPE or an editor reads the
file, the Tab, Enter, and Line Feed characters automatically adjust the cursor on the screen.
Let’s now examine the program in Figure 17-4, which reads and displays the file
P17HANRD.ASM (from Figure 17-3), one sector at a time. The program performs much
the same functions as DOS TYPE, where each line displays everything up to the Enter/Line
Feed characters. Since lines in an ASCII file are of variable length, you have to scan for the
end of each line before displaying it. Scrolling can be a problem. If you perform no special
tests to determine whether you have reached the bottom of screen, the operation automati-
cally displays new lines over old and, if the old line is longer, old characters still appear to
the right. For proper scrolling, you have to count rows and test whether you are at the bot-
tom of the screen.
The program reads a full sector of data into SECTOR. The procedure G10XFER
transfers one byte at a time from SECTOR to DISAREA, where the characters are to be dis-
played. When a Line Feed is encountered, the routine displays the contents of DISAREA
up to and including the Line Feed. (The display screen accepts Tab characters (09H) and
automatically sets the cursor on the next location evenly divisible by eight.)
The program has to check for the end of a sector (to read another sector) and the end
of the display area. For conventional ASCII files, such as .ASM files, each line is relatively
short and is sure to end with Enter/Line Feed. Non-ASCII files, such as .EXE and .OBJ
files, do not have lines, so the program has to check for the end of DISAREA to avoid
crashing. The program is intended to display only ASCII files, but the test for the end is
insurance against unexpected files.
These are the steps in G10XFER:
        .DATA
DISAREA DB      120 DUP(' ')            ;Display area
ENDCDE  DW      00                      ;End process indicator
HANDLE  DW      0                       ;File handle
OPENMSG DB      '*** Open error ***'
PATHNAM DB      'D:\17HANRED.ASM', 0
ROW     DB      00
SECTOR  DB      512 DUP(' ')            ;Input area
        .CODE
BEGIN   PROC    FAR                     ;Main procedure
        MOV     AX,@data                ;Initialize
        MOV     DS,AX                   ; segment
        MOV     ES,AX                   ; registers
        MOV     AX,0600H
        CALL    Q10SCR                  ;Clear screen
        CALL    Q20CURS                 ;Set cursor
        CALL    E10OPEN                 ;Open file
        CMP     ENDCDE,00               ;Valid open?
        JNE     A90                     ; no, exit
A20LOOP:                                ; yes, continue
        CALL    R10READ                 ;Read 1st disk sector
        CMP     ENDCDE,00               ;End of file, no data?
        JE      A90                     ; yes, exit
        CALL    G10XFER                 ;Display and read
A90:
        MOV     AH,3EH                  ;Request close file
        MOV     BX,HANDLE
        INT     21H
        MOV     AX,4C00H                ;Exit to DOS
        INT     21H
BEGIN   ENDP
;               Open disk file:
H10DISP PROC    NEAR
        MOV     AH,40H                  ;Request display
        MOV     BX,01                   ;Handle
        LEA     CX,DISAREA              ;Calculate
        NEG     CX                      ; length of
        ADD     CX,DI                   ; line
        LEA     DX,DISAREA
        INT     21H
        CMP     ROW,22                  ;Bottom of screen?
        JAE     H20                     ; yes, scroll
        INC     ROW                     ; no, next row
        JMP     H90
H20:
        MOV     AX,0601H                ;Scroll
        CALL    Q10SCR
        CALL    Q20CURS
H90:    RET
H10DISP ENDP
;               Scroll screen:
;               Set cursor:
Q20CURS PROC    NEAR
        MOV     AH,02H                  ;Request set
        MOV     BH,00                   ; cursor
        MOV     DH,ROW
        MOV     DL,00
        INT     10H
        RET
Q20CURS ENDP
3. If at the end of DISAREA, force an Enter/Line Feed, display the line, and initialize
DISAREA.
4. Get a character from SECTOR and store it in DISAREA.
5. If the character is end-of-file (1AH), exit.
6. If the character is Line Feed (0AH), display the line and go to step 2; otherwise go to
step 3.
Try running this program under DEBUG with an appropriate drive number and
ASCII file. After each disk input, display the contents of the input area and see how DOS
has formatted your records. An enhancement to this program would be to prompt a user to
enter the filename and extension via the keyboard.
When a program first requests a random record, the operation uses the directory to
locate the sector in which the record resides, reads the entire sector from disk into a buffer,
and delivers the required record to the program.
In the next example, records are 128 bytes long and four to a sector. A request for
random record number 21 causes the following four records to be read into the buffer:
When the program requests the next random record—say number 23—the operation first
checks the buffer. Since the record is already there, it is transferred directly to the program.
If the program requests a record number that is not in the buffer, the operation uses the di-
rectory to locate the record, reads the entire sector into the buffer, and delivers the record
to the program. Accordingly, it is usually more efficient to request random record numbers
that are close together in the file.
The following example moves the pointer 1,024 bytes from the start of a file:
MOV CX,00
JC error
A valid operation clears the carry flag and delivers the new pointer location in the DX:AX.
You may then perform a read or write operation for random processing. An invalid operation
sets the carry flag and returns in the AX code 01 (invalid method code) or 06 (invalid handle).
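A complete request of this kind might look like the following sketch, which opens a file, moves the file pointer 1,024 bytes from the start, and reads one record from that location. The path, the 128-byte record length, and the names PATHNM1, HANDLE1, and IOAREA are assumptions made for the illustration. Function 42H expects the method code in the AL and the offset as a doubleword in the CX:DX.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
PATHNM1 DB      'D:\ACCOUNTS.FIL', 00H  ;Hypothetical path
HANDLE1 DW      ?
IOAREA  DB      128 DUP(' ')    ;Record area (assumed length)
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize data
        MOV     DS,AX           ; segment
        MOV     AH,3DH          ;Request open
        MOV     AL,00           ;Access code: read only
        LEA     DX,PATHNM1
        INT     21H
        JC      A90             ;Error? yes, exit
        MOV     HANDLE1,AX      ; no, save handle
        MOV     AH,42H          ;Request move pointer
        MOV     AL,00           ;Method: from start of file
        MOV     BX,HANDLE1
        MOV     CX,00           ;Upper portion of offset
        MOV     DX,1024         ;Lower portion of offset
        INT     21H
        JC      A30             ;Error? yes, close
        MOV     AH,3FH          ;Request read at new location
        MOV     BX,HANDLE1
        MOV     CX,128          ;Record length
        LEA     DX,IOAREA
        INT     21H
A30:    MOV     AH,3EH          ;Request close
        MOV     BX,HANDLE1
        INT     21H
A90:    MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

If the file is shorter than 1,024 bytes, the move itself is still valid, and the read that follows returns zero in the AX.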
Program: Reading a Disk File Randomly
The program in Figure 17-5 reads the file created in Figure 17-2. By keying in a relative
record number that is within the bounds of the file, a user can request any record in the file
to be displayed on the screen. If the file contains 24 records, then valid record numbers are
01 through 24. A number entered from the keyboard is in ASCII format and in this case
should be only one or two digits.
The program is organized as follows:
The procedure has to convert the ASCII number to binary. Since the value is in the
AX, the AAD instruction works well for this purpose. The system recognizes location
0 as the beginning of a file. The program deducts 1 from the actual number (so that a
user request, for example, for record 1 becomes record 0), multiplies the value by 16
(the length of records in the file), and stores the result in a field called RECINDX.
As an example, if the entered number is ASCII 12, the AX would contain 3132.
An AND instruction converts this value to 0102, AAD further converts it to 000C
(12), and SHL effectively multiplies the number by 16 to get C0 (192). An
improvement would be to validate the input number.
F10READ  Uses function 42H and the relative record location from RECINDX to
         set the file pointer and issues function 3FH to deliver the required
         record to the program in IOAREA.
G10DISP  Displays the retrieved record.
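The ASCII-to-binary conversion in the worked example above can be traced in a few instructions. This is a minimal sketch, not the code of Figure 17-5: it assumes that ASCII 12 is already in the AX, leaves out the deduct-1 step just as the worked example does, and stores the result in RECINDX.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
RECINDX DW      ?               ;Record index (offset in file)
        .CODE
MAIN    PROC    FAR
        MOV     AX,@data        ;Initialize
        MOV     DS,AX           ; data segment
        MOV     AX,3132H        ;ASCII 12: AH = 31H, AL = 32H
        AND     AX,0F0FH        ;Strip the ASCII 3s: AX = 0102H
        AAD                     ;AL = (AH x 10) + AL: AX = 000CH
        MOV     CL,04           ;Shift count for multiplying by 16
        SHL     AX,CL           ;AX = 00C0H (192)
        MOV     RECINDX,AX      ;Save offset for function 42H
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
MAIN    ENDP
        END     MAIN
```

Under DEBUG, tracing the AND, AAD, and SHL instructions one at a time shows the AX passing through each of the values quoted in the example.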
We now cover the DOS FCB services for creating disk files and processing them sequen-
tially and randomly. All of these services were introduced by the first version of DOS and
are available under all versions.
Disk processing for the DOS FCB services involves defining a file control block
(FCB) that defines the file and a disk transfer area (DTA) that defines records. You provide
DOS with the DTA address for all disk input/output operations. Note that FCBs do not use
file handles and do not use the error codes listed in Figure 17-1; they also do not clear or
set the carry flag to indicate success or failure. (FCBs also exist in the PSP, which DOS in-
stalls immediately preceding programs loaded into memory for execution.)
HANDLE  DW      ?               ;File handle
RECINDX DW      ?               ;Record index
ERRCDE  DB      00              ;Read error indicator
PROMPT  DB      'Record number? $'
IOAREA  DB      32 DUP(' ')     ;Disk record area
PATHNAM DB      'D:\NAMEFILE.SRT',0
OPENMSG DB      '*** Open error ***', 0DH, 0AH
READMSG DB      '*** Read error ***', 0DH, 0AH
        DB      00
        DB      00
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize
        MOV     DS,AX           ; segment
        MOV     ES,AX           ; registers
        MOV     AX,0600H
        CALL    Q10SCRN         ;Clear screen
        CALL    Q20CURS         ;Set cursor
        CALL    C10OPEN         ;Open file
        CMP     ERRCDE,00       ;Valid open?
        JNZ     A90             ; no, exit
A20LOOP:
        CALL    D10RECN         ;Request record #
        CMP     ACTLEN,00       ;Any more requests?
        JE      A90             ; no, exit
        CALL    F10READ         ;Read disk record
        CMP     ERRCDE,00       ;Normal read?
        JNZ     A30             ; no, bypass
        CALL    G10DISP         ; yes, display name,
A30:
        JMP     A20LOOP         ; continue
A90:
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
;               Open file:
C10OPEN PROC    NEAR
        MOV     AH,3DH          ;Request open
        MOV     AL,00           ;Normal file
        LEA     DX,PATHNAM
        INT     21H
        JC      C20             ;Error?
        MOV     HANDLE,AX       ; no, save handle
        RET
C20:
        MOV     ERRCDE,01       ; yes,
        LEA     DX,OPENMSG      ; display
        CALL    X10ERR          ; error message
        RET
C10OPEN ENDP
;               Get record number:
D10RECN PROC    NEAR
        MOV     AH,09H          ;Request display prompt
        LEA     DX,PROMPT
        INT     21H
D10RECN ENDP
;               Read disk record randomly:
F10READ PROC    NEAR
        MOV     AX,4200H        ;Request set file pointer
        MOV     AL,00           ;Start of file
        MOV     BX,HANDLE
        MOV     CX,00
        MOV     DX,RECINDX
        INT     21H
        JC      F20             ;Error condition?
                                ; yes, bypass
        MOV     AH,3FH          ;Request read
        MOV     BX,HANDLE
        MOV     CX,32           ;30 for name, 2 for CR/LF
        LEA     DX,IOAREA
        INT     21H
        JC      F20             ;Error on read?
        CMP     IOAREA,1AH      ;EOF marker?
        JE      F30             ; yes, exit
        JMP     F90
F20:                            ; no,
        LEA     DX,READMSG      ; invalid read
        CALL    X10ERR
F30:
        MOV     ERRCDE,01       ;Force end
F90:
        RET
F10READ ENDP
;               Display name:
G10DISP PROC    NEAR
;Request display
;Set handle
; and length
Since the FCB method does not support path names, its use is primarily for processing files
in the current directory. The FCB, which you define in the data area, contains the follow-
ing information about the file and its records (you initialize bytes 00-15 and 32-36, whereas
DOS sets bytes 16-31):
0 Disk drive. For most FCB operations, 00 is the default drive, 01 is drive
A, 02 is drive B, and so forth.
1-8 Filename. The name of the file, left adjusted with trailing blanks, if any.
9-11 Filename extension. A subdivision of filename for further identification,
such as .DOC or .ASM, left adjusted if fewer than three characters. When
you create a file, DOS stores its filename and extension in the directory.
12-13 Current block number. A block consists of 128 records. Read and write op-
erations use the current block number and current record number (byte 32)
to locate a particular record. The number is relative to the beginning of the
file, where the first block is 0, the second is 1, and so forth. An open oper-
ation sets this entry to zero. DOS handles the current block number auto-
matically, although you may change it for random processing.
14-15 Logical record size. An open operation initializes the record size to 128
(80H). After an open and before any read or write, you may change this en-
try to your own required record size.
16-19 File size. When a program creates a file, DOS calculates and stores its size
(number of records X record size) in the directory. An open operation sub-
sequently extracts the size from the directory and stores it in this field. Your
program may read the field, but should not change it.
20-21 Date. DOS records the date in the directory when the file was created or last
updated. An open operation extracts the date from the directory and stores
it in this field.
22-31 Reserved by DOS.
32 Current record number. This entry is the current record number (0-127)
within the current block. (See bytes 12-13.) The system uses the current
block and record to locate records in the file. Although open initializes the
record number to zero, you may set this field to begin sequential processing
at any number between 0 and 127.
33-36 Relative record number. For random read/write, this entry must contain a
relative record number. For example, to read record 25 (19H) randomly, set
the entry to 19000000H. For random processing, the system automatically
converts the relative record number to the current block and record. Because
of the limit on the maximum file size (1,073,741,824 bytes), a file with a
short record size can contain more records and may have a higher maximum
relative record number than a file with a longer record size. If the record size
is greater than 64, byte 36 always contains 00.
Preceding the FCB is an optional seven-byte extension, which may be used for pro-
cessing files with special attributes. To use the extension, code the first byte with FFH, the
second byte with the file attribute (described in Chapter 16), and the remaining five bytes
with hex zeros.
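The layout described above can be sketched as a data definition. The field names and the file NAMEFILE.DAT on drive D are assumptions made only for illustration; the sizes and order of the fields follow the description.

```asm
FCBREC  LABEL   BYTE            ;Start of FCB (37 bytes, 0-36)
FCBDRIV DB      04              ;0      Drive: 04 = drive D
FCBNAME DB      'NAMEFILE'      ;1-8    Filename, left adjusted
FCBEXT  DB      'DAT'           ;9-11   Extension
FCBBLK  DW      0000            ;12-13  Current block number
FCBRCSZ DW      ?               ;14-15  Logical record size
FCBFLSZ DD      ?               ;16-19  File size (set by DOS)
FCBDATE DW      ?               ;20-21  Date (set by DOS)
        DB      10 DUP(?)       ;22-31  Reserved by DOS
FCBSQRC DB      00              ;32     Current record number
FCBRNRC DD      ?               ;33-36  Relative record number
```

Defining the fields under one LABEL lets a program load the address of the whole FCB into the DX while still referring to individual entries, such as the record size, by name.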
For each disk file referenced, a program using original DOS disk services defines an FCB.
Disk operations require the address of the FCB in the DX register and use this address to
access fields within the FCB. Operations include create file, set disk transfer area (DTA),
write record, and close file.
DOS searches the directory for a filename that matches the entry in the FCB. If one is found,
DOS reuses the space in the directory, and if none is found, DOS searches for a vacant
entry. The operation then initializes the file size to zero and opens the file. The open
step checks for available disk space and sets one of the following return codes in the AL:
00H = space is available; FFH = no space is available. Open also initializes the FCB current
block number to zero and sets a default value in the FCB record size of 128 (80H) bytes.
Before writing a record, you may override this default with your own record size.
The Disk Transfer Area
The disk transfer area (DTA) is the start of the definition of your output record. Since the
FCB contains the record size, the DTA does not require a delimiter to indicate the end of
the record. Prior to a write operation, use FCB function 1AH to supply DOS with the ad-
dress of the DTA. Only one DTA may be active at any time. The following code initializes
the address of the DTA:
        MOV     AH,1AH          ;Request set DTA
        LEA     DX,DTAname      ;Address of DTA
        INT     21H             ;Call DOS
If a program processes only one disk file, it needs to initialize the DTA only once for
its entire execution. If a program processes more than one file, it must initialize the appro-
priate DTA immediately before each read or write.
INT 21H, Function 15H: Write Record
To write a disk record sequentially, use FCB function 15H:
        MOV     AH,15H          ;Request write record
        LEA     DX,FCBname      ;Address of FCB
        INT     21H             ;Call DOS
The write operation uses the information in the FCB and the address of the current DTA. If
the record is the size of a sector, the operation writes the record. Otherwise, the operation
fills records into a buffer area that is the length of a sector and writes the buffer when it is
full. For example, if each record is 128 bytes long, the operation fills the buffer with four
records (4 X 128 = 512) and then writes the buffer into an entire disk sector.
On a successful write, DOS increments the FCB file size field (by adding the record
size to it) and increments the current record number by 1. When the current record number
exceeds 127, the operation sets it to 0 and increments the FCB current block number. (You
could also change the current block and record number.) The write operation sets one of the
following return codes in the AL: 00H = write was successful; 01H = disk is full; 02H =
DTA is too small for the record.
The close operation writes on disk any partial data still in the DOS disk buffer and updates
the directory with the date and file size. One of the following codes is returned to the AL:
00H = close was successful; FFH = file was not in the correct position in the directory,
perhaps caused by a user changing a diskette.
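The create, set DTA, write, and close steps described above fit together roughly as follows. This is a minimal sketch under stated assumptions: the file name, field names, and record contents are hypothetical, and function numbers 16H (create) and 10H (close) are the standard DOS numbers for these FCB services. The write return code is not tested here.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
FCBREC  LABEL   BYTE            ;FCB for the output file
FCBDRIV DB      00              ;Default drive
FCBNAME DB      'NAMEFILE'      ;Filename (hypothetical)
FCBEXT  DB      'DAT'           ;Extension
FCBBLK  DW      0000            ;Current block number
FCBRCSZ DW      ?               ;Logical record size
        DB      16 DUP(?)       ;Bytes 16-31, set by DOS
FCBSQRC DB      00              ;Current record number
        DD      ?               ;Relative record number
NAMEDTA DB      'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123', 0DH, 0AH  ;32-byte record
        .CODE
MAIN    PROC    FAR
        MOV     AX,@data        ;Initialize
        MOV     DS,AX           ; data segment
        MOV     AH,16H          ;Request create file
        LEA     DX,FCBREC
        INT     21H
        CMP     AL,00           ;Space available?
        JNZ     A90             ; no, exit
        MOV     FCBRCSZ,32      ;Override default record size
        MOV     AH,1AH          ;Set address of DTA
        LEA     DX,NAMEDTA
        INT     21H
        MOV     AH,15H          ;Request sequential write
        LEA     DX,FCBREC
        INT     21H
        MOV     AH,10H          ;Request close file
        LEA     DX,FCBREC
        INT     21H
A90:
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
MAIN    ENDP
        END     MAIN
```

The record size is set after the create and before the write because the create resets that field to its default of 128.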
The open operation checks that the directory contains an entry with the filename and ex-
tension defined in the FCB. If the entry is not in the directory, the operation returns code
FFH in the AL. If the entry is present, the operation returns code 00 in the AL and sets the
actual file size, date, current block number (0), and record size (80H) in the FCB. After the
Open executes, you may override the default record size.
The DTA defines an area for the input record, according to the format used to create the
file. Use FCB function 1AH to set the address of the DTA, just as you do when you create
a disk file.
The operation sets one of the following return codes in the AL: 00 = successful read; 01 =
end of file, no data was read; 02 = DTA is too small for the record; 03 = end of file, record
was read partially and filled out with zeros.
For a successful read, the operation uses the information in the FCB to deliver the
disk record, beginning at the address of the DTA. An attempt to read past the last record of
the file causes the operation to signal an end-of-file condition that sets the AL to 01H, for
which you should test. It’s a recommended practice to close an input file after fully read-
ing it, because of the DOS limit on the number of files that may be open at one time.
The read operation returns one of the following codes in the AL: 00 = successful read;
01 = end of file, no more data available; 02 = DTA too small for the record; 03 = record
has been read partially and filled out with zeros.
A successful operation converts the relative record number to the current block and
record. It uses this value to locate the required disk record and delivers it to the DTA. Faulty
responses can be caused by an invalid relative record number or an incorrect address in the
DTA or FCB.
INT 21H, Function 22H: Write Record Randomly
The create operation and setting of the DTA are the same for both random and sequential
processing. With the relative record number initialized in the FCB, random write uses func-
tion 22H:
        MOV     AH,22H          ;Request random write
        LEA     DX,FCBname      ;Address of FCB
        INT     21H             ;Call DOS
The write operation returns one of the following codes in the AL: 00 = successful write;
01 = disk full; 02 = DTA too small for the record.
The operation converts the FCB relative record number to the current block and record. It
uses this value to determine the starting disk location and sets one of the following return
codes in the AL: 00 = successful write of all records; 01 = no records written because of
insufficient disk space; 02 = DTA too small for the record. The operation sets the FCB rel-
ative record field and the current block and record fields to the next record number.
INT 21H, Function 27H: Read Block Randomly
For a random block read, initialize the required number of records in the CX, and use FCB
function 27H:
        MOV     AH,27H          ;Request random block read
        MOV     CX,records      ;Number of records
        LEA     DX,FCBname      ;Address of FCB
        INT     21H             ;Call DOS
The read operation returns one of the following codes in the AL: 00 = successful read of
all records; 01 = has read to end of file, last record is complete; 02 = DTA too small for
the record, read not completed; 03 = end of file, has read a partial record.
The operation stores in the CX the actual number of records read and sets the FCB
relative record field and current block and record fields for the next record.
A convenient formula for determining the relative record number on diskettes with
nine sectors is
        (2 × 9) + (9 - 1) = 18 + 8 = 26
Here is the required coding for disk partitions that are less than 32 MBs:
JC [error]
Absolute disk read/write operations destroy all registers except the segment registers
and use the carry flag to indicate a successful (0) or unsuccessful (1) operation. An unsuc-
cessful operation returns one of the following nonzero codes to the AL:
The INT operation pushes the flags onto the stack. Because the original flags are still
on the stack upon returning from the operation, you should pop them after checking the
carry flag.
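The handling of the flags left on the stack can be sketched for an absolute read of one sector on a partition under 32 MB. The sketch assumes the standard INT 25H register usage: AL = drive (0 = A), CX = number of sectors, DX = starting relative sector, and DS:BX = address of the buffer. The buffer name and the choice of sector 26 are hypothetical.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
SECTBUF DB      512 DUP(?)      ;Buffer for one sector
        .CODE
MAIN    PROC    FAR
        MOV     AX,@data        ;Initialize
        MOV     DS,AX           ; data segment
        MOV     AL,00           ;Drive A
        LEA     BX,SECTBUF      ;Address of buffer (DS:BX)
        MOV     CX,01           ;Read one sector
        MOV     DX,26           ;Starting relative sector
        INT     25H             ;Absolute disk read
        JC      A50             ;Check the carry flag first
        POPF                    ;Valid read: discard saved flags
        MOV     AX,4C00H        ;Exit with return code 00
        JMP     A90
A50:
        POPF                    ;Error: discard saved flags
        MOV     AX,4C01H        ;Exit with return code 01
A90:
        INT     21H
MAIN    ENDP
        END     MAIN
```

The carry flag is tested before POPF on both paths because POPF replaces the current flags with the saved copy and would lose the result of the read.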
Since DOS 4.0, you can use INT 25H and 26H to access disk partitions that exceed
32 megabytes. The AL and CX are still used the same way. The DX is not used, and the BX
points to a 10-byte parameter block described as follows:
BYTES      DESCRIPTION
00H-03H    32-bit sector number
04H-05H    Number of sectors to read/write
06H-07H    Offset of buffer
08H-09H    Segment of buffer
KEY POINTS
• Many of the DOS disk services reference an ASCIIZ string that consists of a directory
  path followed by a byte of hex zeros.
• On errors, many of the DOS disk functions set the carry flag and return an error code
  in the AX.
• DOS maintains a file pointer for each file that a program is processing. The create
  and open operations set the value of the file pointer to zero, the file's starting
  location.
• The create and open functions return a file handle that you use for subsequent file
  accessing.
• Create function 3CH is used initially when writing a file and open function 3DH
  initially when reading a file.
• A program that has completed writing a file should close it so that DOS may update
  the directory.
• A program using original DOS INT 21H functions for disk I/O defines a file control
  block (FCB) for each file that it accesses.
• An FCB block consists of 128 records. The current block number, combined with the
  current record number, indicates the disk record to be processed. The entries in the
  FCB for the current block, record size, file size, and relative record number are stored
  in reversed-byte sequence.
• The disk transfer area (DTA) is the location of the record that is to be written or read.
  You have to initialize each DTA in a program prior to execution of a write or read
  operation.
• DOS INT 25H and 26H provide absolute disk read and write operations, but do not
  supply automatic directory handling, end-of-file operations, or record blocking and
  deblocking.
QUESTIONS
Of the following questions, the first 10 concern disk operations involving file handles, and
the remainder involve FCB disk operations.
17-1. What are the error return codes for (a) file not found; (b) invalid handle?
17-2. Define an ASCIIZ string named PATH1 for a file named CUST.LST on drive C.
17-3. For the file in Question 17-2, provide the instructions to (a) define an item named CUSTHAN
for the file handle; (b) create the file; (c) write a record from CUSTOUT (128 bytes); and (d)
close the file. Test for errors.
17-4. For the file in Question 17-3, code the instructions to (a) open the file and (b) read records into
CUSTIN. Test for errors.
17-5. Under what circumstances should you close a file that is used only for input?
17-6. Revise the code in Figure 17-4 so that a user at a keyboard can enter a filename, which the
program uses to locate the file and to display its contents. Provide for any number of requests
and for pressing only the Enter key to cause the input to end.
17-7. Write a program that allows a user to enter part numbers (3 characters), part descriptions (12
characters), and unit prices (xxx.xx) on a terminal. The program is to use file handles to cre-
ate a disk file containing this information. Remember to convert the price from ASCII to bi-
nary. Following is sample input data:
|027|Compilers   |00525|
|049|Compressors |00920|
|114|Extractors  |11250|
|117|Haulers     |00630|
|122|Lifters     |10520|
|127|Labelers    |00960|
|232|Bailers     |05635|
|999|            |00000|
17-8. Write a program that displays the contents of the file created in Question 17-7. It will have
to convert the binary value for the price to ASCII format.
17-9. Use the file created in Question 17-7 for the following requirements: (a) The program reads
the records into a table in memory; (b) a user can enter part number and quantity from the
keyboard; (c) the program searches the table for part number; (d) if the part number is found,
the program uses the table price to calculate the value of the part (quantity X price); (e) the
program displays description and calculated value.
17-10. Revise the program in Question 17-8 so that it does random processing. Define a table of the
valid part numbers. Allow a user to enter a part number, which the program locates in the
table. Use the offset in the table to calculate the offset in the file, and use function 42H to
move the file pointer. Display description and price. Allow the user to enter quantity sold;
calculate and display amount of sale (quantity X price).
17-11. Provide the full DOS function operations for the following FCB operations: (a) create; (b) set
DTA; (c) sequential write; (d) open; (e) sequential read.
17-12. A program uses the record size to which the FCB open operation defaults. (a) How many
records would a sector contain? (b) How many records would a diskette contain, assuming
three tracks with nine sectors per track? (c) If the file in part (b) is being read sequentially,
how many physical disk accesses will occur?
CHAPTER 18
Disk Processing:
II—DOS Operations for Supporting
Disks and Files
OBJECTIVE:
INTRODUCTION
This chapter introduces a number of useful operations involved in the handling of disk dri-
ves, the directory, the FAT, and disk files.
Error codes cited in this chapter refer to the list in Figure 17-1.
The operation returns the number of drives (all types, including RAM disks) to the AL. Be-
cause DOS requires at least two logical drives A and B, it returns the value 02 for a one-
drive system. (Use INT 11H for determining the actual number of drives.)
The operation returns a drive number in the AL, where 0 = A, 1 = B, and so forth. You
could move this number directly into your program for accessing a file from the default
drive, although some operations assume that 1 = drive A and 2 = drive B.
Since the operation changes the DS, you should PUSH it before the interrupt and POP it af-
ter. The operation has now been superseded by function 36H. A successful 1BH operation
returns the following information:
The product of the AL, CX, and DX gives the capacity of the disk. An unsuccessful
1BH operation returns FFH in the AL.
The operation is otherwise identical to function 1BH and is also superseded by function
36H.
PUSH the DS before issuing this function, and POP it on returning from the function. The
operation has no parameters. A valid operation clears the AL and returns an address in the
DS:BX that points to the DPB for the default drive. For an error, the AL is set to FFH. See
also function 32H.
The operation does not return any value, since it simply sets a switch. The system subse-
quently responds to invalid write operations. Since a disk drive rarely records data incor-
rectly and the verification causes some delay, the operation is most useful where recorded
data is especially critical. A related function, 54H, delivers the current setting of the verify
switch.
The product of AX, CX, and DX gives the capacity of the disk. For an invalid device num-
ber, the operation returns FFFFH in the AX. The operation does not set or clear the carry
flag.
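The capacity calculation can be sketched as two multiplications. This assumes that the passage describes function 36H, with the drive number in the DL (0 = default, 1 = A, and so forth). It also assumes that the first product fits in 16 bits and that the capacity is under 4 gigabytes. The field names are hypothetical.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
CAPLOW  DW      ?               ;Capacity in bytes, low word
CAPHIGH DW      ?               ;Capacity in bytes, high word
        .CODE
MAIN    PROC    FAR
        MOV     AX,@data        ;Initialize
        MOV     DS,AX           ; data segment
        MOV     AH,36H          ;Request disk space information
        MOV     DL,00           ;0 = default drive
        INT     21H
        CMP     AX,0FFFFH       ;Invalid drive?
        JE      A90             ; yes, exit
        MOV     BX,DX           ;Save DX, which MUL overwrites
        MUL     CX              ;DX:AX = AX x CX
        MUL     BX              ;DX:AX = low word x saved DX
        MOV     CAPLOW,AX       ;Store 32-bit result
        MOV     CAPHIGH,DX
A90:
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
MAIN    ENDP
        END     MAIN
```

The test of the AX for FFFFH takes the place of a carry-flag test, since this operation does not set or clear the carry flag.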
A valid operation clears the carry flag and returns a value in the DX, where bit 7 = 0 means
that the handle indicates a file, and bit 7 = 1 means a device. The other bits have this meaning:
An error sets the carry flag and returns code 01, 05, or 06 in the AX.
INT 21H, Function 4408H: Determine if Removable Media for Device
This service determines whether the device contains removable media, such as a diskette.
Load the BL with the drive number (0 = default, 1 = A, etc.). A valid operation clears the
carry flag and returns one of the following codes in the AX:
• 00H = removable device or 01H = fixed device
An error sets the carry flag and returns code 01 or 0FH (invalid drive number) in the AX.
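A minimal sketch of the request follows, checking the default drive and displaying one of three messages. The message names and wording are assumptions for illustration.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
REMVMSG DB      'Removable media', 0DH, 0AH, '$'
FIXDMSG DB      'Fixed device', 0DH, 0AH, '$'
ERRMSG  DB      'Invalid request', 0DH, 0AH, '$'
        .CODE
MAIN    PROC    FAR
        MOV     AX,@data        ;Initialize
        MOV     DS,AX           ; data segment
        MOV     AX,4408H        ;Request removable-media check
        MOV     BL,00           ;0 = default drive
        INT     21H
        LEA     DX,ERRMSG       ;Assume error (LEA leaves flags alone)
        JC      A50             ;Carry set? yes, display error
        LEA     DX,REMVMSG      ;Assume removable
        CMP     AX,00           ;00H = removable?
        JE      A50             ; yes
        LEA     DX,FIXDMSG      ; no, fixed device
A50:
        MOV     AH,09H          ;Display message
        INT     21H
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
MAIN    ENDP
        END     MAIN
```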
INT 21H, Function 440DH, Minor Code 41H:
Write Disk Sector
This operation writes data from a buffer to one or more sectors on disk. Load these registers:
MOV AX,440DH ;IOCTL for block device
The rwbuffr entry provides the address of the buffer in segment:offset (DS:DX) format, al-
though coded in reverse-word sequence. The SEG operator indicates the definition of a
segment, in this case the data segment, DATA. The buffer identifies the data area to be
written and should be the length of the number of sectors X 512, such as
WRBUFFER DB 1024 DUP (?) ;Output buffer
A successful operation clears the carry flag and writes the data. Otherwise, the operation
sets the carry flag and returns error code 01, 02, or 05 in the AX.
cylindr DW ? ;Cylinder
A successful operation clears the carry flag and formats the tracks. Otherwise, the op-
eration sets the carry flag and returns error code 01, 02, or 05 in the AX.
For this function to set the media ID, set these registers:
The filetyp field contains the ASCII value FAT12 or FAT16, with trailing blanks. A
successful operation clears the carry flag and sets the ID. Otherwise, the operation sets the
carry flag and returns error code 01, 02, or 05 in the AX. (See also function 440DH, minor
code 66H.)
If the specfun field is 0, the information is about the default medium in the drive; if 1, the
information is about the current medium. A successful operation clears the carry flag and
delivers the data. Otherwise, the operation sets the carry flag and returns error code 01, 02,
or 05 in the AX.
A successful operation clears the carry flag and sets the ID. The filetyp field contains ASCII
value FAT12 or FAT16, with trailing blanks. Otherwise, the operation sets the carry flag
and returns error code 01, 02, or 05 in the AX. (See also function 440DH, minor code 46H.)
INT 21H, Function 440DH, Minor Code 68H: Sense Media Type
To use this function to get the media type, set these registers:
MOV AX,440DH ;Request disk service
The DX points to a two-byte media block to receive data in the following format:
default DB ? ;01 for default value, 02 for other
A successful operation clears the carry flag and sets the type. Otherwise, the operation sets
the carry flag and returns error code 01 or 05 in the AX.
Other function 44H IOCTL operations concerned with file sharing are outside the
scope of this book.
Also, the operation clears the carry flag and—watch for this—destroys the contents of the
CL, DI, DS, DX, ES, and SI registers. PUSH all required registers prior to this interrupt,
and POP them afterward.
Extended Error Code (AX). Returns some 90 or more error codes; code 00
means that the previous INT 21H operation resulted in no error.
Error Class (BH). Provides the following information:
01H Out of resource, such as storage channel
02H Temporary situation (not an error), such as a locked file condition that should
go away
03H Lack of proper authorization
04H System software error, not this program
05H Hardware failure
1. IOBUFFR is the offset address of the input buffer, which provides for one sector
of data.
2. SEG _DATA uses the SEG operator to identify the address of the data segment for
the IOCTL operation.
        .DATA
        DB      00
        DB      00
        DB      30H,31H,32H,33H,34H,35H,36H,37H,38H,39H
        DB      41H,42H,43H,44H,45H,46H
READMSG DB      '*** Read error ***', 0DH, 0AH
RDBLOCK DB      0               ;Block
RDHEAD  DW      0               ; structure
RDCYLR  DW      0               ;
RDSECT  DW      8               ;
RDNOSEC DW      1               ;
RDBUFFR DW      IOBUFFR         ;
        DW      SEG _DATA       ;
IOBUFFR DB      512 DUP(' ')    ;Disk sector area
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data        ;Initialize
        MOV     DS,AX           ; segment
        MOV     ES,AX           ; registers
        CALL    Q10SCR          ;Clear screen
        CALL    Q20CURS         ;Set cursor
        CALL    B10READ         ;Get sector data
        JNC     A80             ;If valid read, bypass
        LEA     DX,READMSG      ; invalid read
        CALL    X10ERR
        JMP     A90
A80:
        CALL    C10CONV         ;Convert and display
A90:
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
You could enhance this program by allowing a user to request sectors via the keyboard.
INT 21H
A valid operation clears the carry flag; an error sets the carry flag and returns code 03 or 05
in the AX.
INT 21H, Function 3AH: Remove Subdirectory
This service deletes a subdirectory, just as the DOS command RMDIR does. Load the DX
with the address of an ASCIIZ string containing the drive and directory pathname (note that
you cannot delete the current directory or a subdirectory containing files):
ASCstrg DB      'd:\pathname',00H   ;ASCIIZ string
        MOV     AH,3AH              ;Request delete subdirectory
        LEA     DX,ASCstrg          ;Address of ASCIIZ string
        INT     21H                 ;Call DOS
A valid operation clears the carry flag; an error sets the carry flag and returns code 03, 05,
or 10H in the AX.
INT 21H, Function 3BH: Change Current Directory
This service changes the current directory to one that you specify, just as the DOS com-
mand CHDIR does. Load the DX with the address of an ASCIIZ string containing the new
drive and directory pathname:
ASCstrg DB      'd:\pathname',00H   ;ASCIIZ string
        MOV     AH,3BH              ;Request change directory
        LEA     DX,ASCstrg          ;Address of ASCIIZ string
        INT     21H                 ;Call DOS
A valid operation clears the carry flag; an error sets the carry flag and returns code 03 in
the AX.
INT 21H, Function 47H: Get Current Directory
DOS function 47H determines the current directory for any drive. Define a buffer space
large enough to contain the longest possible pathname (64 bytes), and load its address in
the SI. Identify the drive in the DL by 0 = default, 1 = A, 2 = B, and so forth:
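        MOV     AH,47H          ;Request get directory
        MOV     DL,drive        ;Drive number
        LEA     SI,buffer       ;Address of buffer (DS:SI)
        INT     21H             ;Call DOS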
A valid operation clears the carry flag and delivers the name of the current directory (but
not the drive) to the buffer as an ASCIIZ string, such as
        ASSEMBLE\EXAMPLES
A byte containing 00H identifies the end of the pathname. If the requested directory is the
root, the value returned is only a byte of 00H. In this way, you can get the current pathname
in order to access any file in a subdirectory. An invalid drive number sets the carry flag and
returns error code 0FH in the AX.
The program in Figure 18-2 illustrates the use of two of the functions described in the
preceding section. The procedures perform the following:
B10DRIV  Uses function 19H to get the default drive in the AL register. The
         drive is returned as 0 (for A), 1 (for B), and so forth. To adjust the
         number to its alphabetic equivalent, simply add 41H, so that 00
         becomes 41H (A), 01 becomes 42H (B), and so forth. The procedure
         then displays the drive letter followed by a colon and backslash (n:\).
C10PATH  Uses function 47H to get the current directory pathname. The
         procedure tests immediately for the 00H ASCIIZ delimiter, since a default
         to the root directory would deliver only that character. Otherwise, the
         routine displays each character up to the 00H.
The program intentionally contains only features necessary to get it to work; a full
program would include, for example, clearing the screen and setting colors.
        JMP     SHORT MAIN
B10DRIV PROC    NEAR
        MOV     AH,19H          ;Request default drive
        INT     21H
        ADD     AL,41H          ;Change hex no. to letter
        MOV     DL,AL           ; 0=A, 1=B, etc.
        CALL    Q10DISP         ;Display drive number,
        MOV     DL,':'
        CALL    Q10DISP         ; colon,
        MOV     DL,'\'
        CALL    Q10DISP         ; backslash
        RET
B10DRIV ENDP
C10PATH PROC    NEAR
        MOV     AH,47H          ;Request pathname
        MOV     DL,00
        LEA     SI,PATHNAM
        INT     21H
        CALL    Q10DISP
        END     BEGIN
Load the SI register (associated with the DS) with the address of the filespec to be
parsed, the DI (associated with the ES) with the address of an area where the operation is
to generate the FCB format, and the AL with a bit value that controls the parsing method:
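        MOV     AH,29H          ;Request parse filename
        MOV     AL,code         ;Parsing method
        LEA     DI,FCBname      ;Address of FCB (ES:DI)
        LEA     SI,filespec     ;Address of filespec (DS:SI)
        INT     21H             ;Call DOS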
For valid data, function 29H creates a standard FCB format for the filename and
extension, with an eight-character filename filled out with blanks if necessary, a three-
character extension filled out with blanks if necessary, and no period between them.
The operation recognizes standard punctuation and converts the wild cards * and ?
into a string of one or more characters. For example, PROG12.* becomes PROG12bb???.
The AL returns one of the following codes:
00H No wild cards encountered
01H Wild cards converted
FFH Invalid drive specified
After the operation, the DS:SI contains the address of the first byte after the parsed
filespec, and the ES:DI contains the address of the first byte of the FCB. For a failed oper-
ation, the byte at DI+1 is blank, although the operation attempts to convert almost anything
you throw at it.
For this operation to work with file handles, you have to edit the FCB further, to delete
blanks and enter the period between the filename and the extension.
A valid operation clears the carry flag, marks the filename in the directory as deleted, and
releases the file’s allocated disk space in the FAT. An error sets the carry flag and returns
code 02, 03, or 05 in the AX.
A valid operation clears the carry flag and returns the current attribute to the CX (CH = 00
and CL = attribute):
An error sets the carry flag and returns code 02 or 03 to the AX.
To set file attribute, load the AL with code 01, and set the attribute bit(s) in the CX.
You may change read-only, hidden, system, and archive files, but not the volume label or
subdirectory. The following example sets hidden and archive attributes for a file:
A valid operation clears the carry flag and sets the directory entry to the attribute in the CX.
An invalid operation sets the carry flag and returns code 02, 03, or 05 to the AX.
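Setting the hidden and archive attributes might look like the following minimal sketch. It assumes that the service is INT 21H function 43H and uses a hypothetical file, NAMEFILE.DAT, in the current directory. The attribute value 22H combines hidden (bit 1, 02H) and archive (bit 5, 20H).

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
PATHNAM DB      'NAMEFILE.DAT',00H  ;ASCIIZ string
        .CODE
MAIN    PROC    FAR
        MOV     AX,@data        ;Initialize
        MOV     DS,AX           ; data segment
        MOV     AH,43H          ;Request file attribute service
        MOV     AL,01           ; 01 = set attribute
        MOV     CX,22H          ;Hidden (02H) + archive (20H)
        LEA     DX,PATHNAM      ;Address of ASCIIZ string
        INT     21H
        MOV     AX,4C00H        ;Assume valid: return code 00
        JNC     A90             ;MOV does not change the carry flag
        MOV     AX,4C01H        ;Error: return code 01
A90:
        INT     21H             ;Exit to DOS
MAIN    ENDP
        END     MAIN
```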
INT 21H
A successful operation clears the carry flag and returns a new file handle (the next one avail-
able) in the AX. An error sets the carry flag and returns error code 04 or 06 to the AX. (See
also function 46H.)
buffer for the operation to return the located directory entry and issue function 1AH (set
DTA) before using this service. For beginning the search, set the CX with the file attribute
of the filename(s) to be returned— any combination of read only (bit 0), hidden (bit 1), sys-
tem (bit 2), volume label (bit 3), directory (bit 4), or archive (bit 5). Load the DX with the
address of an ASCIIZ string containing the pathname; the string may contain the wild-card
characters ? and *:
DTAname DB 43 DUP(?)
An operation that locates a match between attribute bits clears the carry flag and fills the
43-byte (2BH) DTA with the following:
An error sets the carry flag and returns code 02, 03, or 12H.
A unique use for function 4EH is to determine whether a reference is to a filename or
to a subdirectory. For example, if the returned attribute is 10H, the reference is to a subdi-
rectory. The operation also returns the size of the file. Thus you may use function 4EH to
determine the size of a file and function 36H to check the space available for writing it.
A successful operation clears the carry flag and returns to the AX codes 00 (filename found)
or 18 (no more files). An error sets the carry flag and returns code 02, 03, or 12H to the AX.
Figure 18-3 illustrates functions 4EH and 4FH.
INT 21H, Function 56H: Rename File or Directory
This service can rename a file or directory from within a program. Load the DX with the
address of an ASCIIZ string containing the old drive, path, and name of the file or direc-
tory to be renamed. Load the DI (actually, ES:DI) with the address of an ASCIIZ string con-
taining the new drive, path, and name, with no wild cards. Drive numbers, if used, must be
the same in both strings. Since the paths need not be the same, the operation can both re-
name a file and move it to another directory on the same drive:
oldstrg DB 'd:\oldpath\oldname', 00H
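The following is a sketch only of a complete rename request. The items oldfile and newfile and the label error_rtne are hypothetical; the PUSH/POP pair is one way of making the ES address the same segment as the DS so that ES:DI locates the new name.

```asm
        .DATA
oldfile DB   'A:\WORK\OLDNAME.DOC', 00H ;Hypothetical old pathname
newfile DB   'A:\SAVE\NEWNAME.DOC', 00H ;Hypothetical new pathname
        .CODE
        PUSH DS             ;Set ES = DS so that
        POP  ES             ; ES:DI addresses newfile
        MOV  AH,56H         ;Request rename
        LEA  DX,oldfile     ;DS:DX = old name
        LEA  DI,newfile     ;ES:DI = new name
        INT  21H
        JC   error_rtne     ;Carry set: code 02, 03, 05, or 11H in AX
```

Because both strings name drive A, the request satisfies the rule that the drive must be the same; the differing paths cause the file to move from one directory to the other.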
A successful operation clears the carry flag; an error sets the carry flag and returns in the
AX code 02, 03, 05, or 11H.
(Seconds are in the form of the number of 2-second increments, 0-29.) Load the request
(0 = get, 1 = set) in the AL and the file handle in the BX. For a set request, load the time
in the CX and the date in the DX. Following is an example:
MOV AH,57H ;Request date/time
INT 21H
A valid operation clears the carry flag; get returns the time in the CX and date in the DX,
whereas set changes the date and time entries for the file. An invalid operation sets the carry
flag and returns in the AX error code 01 or 06.
INT 21H, Function 5AH: Create a Temporary File
This service would be useful for a program that creates temporary files, especially in net-
works, where the names of other files may be unknown and the program is to avoid acci-
dentally overwriting them. The operation creates a file with a unique name within the path.
Load the CX with the required file attribute—any combination of read only (bit 0),
hidden (bit 1), system (bit 2), volume label (bit 3), directory (bit 4), and archive (bit 5). Load
the DX with the address of an ASCIIZ path—the drive (if necessary), the subdirectory (if
any), a backslash, and 00H, followed by 13 bytes for the new filename:
ASCpath DB 'd:\pathname\', 00H, 13 DUP(20H)
INT 21H
A successful operation clears the carry flag, delivers the file handle to the AX, and appends
the new filename to the ASCIIZ string, beginning at the 00H byte. An invalid operation sets
the carry flag and returns code 03, 04, or 05 in the AX.
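A sketch of the full request, assuming a normal attribute (00) and using the hypothetical names TMPPATH, TMPHAND, and error_rtne:

```asm
        .DATA
TMPPATH DB   'A:\WORK\', 00H, 13 DUP(20H) ;Hypothetical path + room for name
TMPHAND DW   ?              ;Hypothetical save area for handle
        .CODE
        MOV  AH,5AH         ;Request create temporary file
        MOV  CX,00          ;Normal attribute (assumed)
        LEA  DX,TMPPATH     ;DS:DX = ASCIIZ path
        INT  21H
        JC   error_rtne     ;Carry set: code 03, 04, or 05 in AX
        MOV  TMPHAND,AX     ;Save the file handle
```

On a successful return, TMPPATH holds the path followed by the generated filename, which the program would need if it later deletes the file.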
TAB      EQU  09
LF       EQU  10
CR       EQU  13
CRLF     DB   CR, LF, '$'
PATHNAM  DB   'R:\*.*', 00H
DELMSG   DB   TAB, 'Erase ', '$'
ENDMSG   DB   CR, LF, 'No more directory entries', CR, LF, '$'
         DB   'Invalid path/file', '$'
ERRMSG2  DB   'Write-protected disk', '$'
PROMPT   DB   'Y = Erase, N = Keep, Ent = Exit', CR, LF, '$'
DISKAREA DB   43 DUP (20H)
         PUSH AX
         LEA  DX,ENDMSG      ;Display ending
         CALL Q30LINE        ; message
         POP  AX
C90:     RET
C10NEXT  ENDP
KEY POINTS
• Operations involved in handling disk drives include reset, select default, get drive information,
get free disk space, and the extensive operation I/O control for devices.
• Operations involved in handling the directory and FAT include create subdirectory,
remove subdirectory, change current directory, and get current directory.
• Operations involved in handling disk files (other than create, open, read, and write)
include rename file, get/set attribute, find matching file, and get/set date/time.
QUESTIONS
Use DEBUG for the first three questions. Key in the A 100 command and the required in-
structions. Examine any values returned in the registers.
18-1. Operations involving disk drives:
(a) Function 19H to determine the current default disk drive.
(b) Function 1BH for information about the current default disk drive.
(c) Function 1FH for information about the default DPB.
(d) Function 36H to determine the amount of free disk space.
(e) Function 4400H to get information on the device in use.
(f) Function 4408H to determine whether any media in use are removable.
(g) Function 440DH, minor code 60H, to get the device parameters.
(h) Function 440DH, minor code 66H, to get the media ID.
18-2. Operations involving directories:
(a) Function 39H to create a subdirectory. For safety, you could create it on a RAM disk or
diskette. Use any name.
(b) Function 56H to rename the subdirectory.
(c) Function 3AH to remove the subdirectory.
18-3. Operations involving disk files:
(a) Function 43H to get the attribute from a file on a diskette. (Use a copied file for this
exercise.)
(b) Function 56H to rename the file.
(c) Function 43H to set the attribute to hidden.
(d) Function 57H to get the file’s date and time.
(e) Function 41H to delete the file.
18-4. Write a small program from within DEBUG that simply executes DOS function 29H, parse
filename. Provide for the filespec at 81H and the FCB at 5CH; both are in the PSP immedi-
ately before the program. Enter various filespecs, such as D:PROGA.DOC, PROGB,
PROGC.*, and C:*.ASM. Check the results at offset 5CH after each execution of the
CHAPTER 19
Disk Processing:
III—BIOS Disk Operations
OBJECTIVE
INTRODUCTION
In Chapters 17 and 18, we examined the use of the DOS services for disk processing. You
can also code directly at the BIOS level for disk processing, although BIOS supplies no au-
tomatic use of the directory or blocking and deblocking of records. BIOS disk operation
INT 13H treats data as the size of a sector and handles disk addressing in terms of actual
track and sector numbers. BIOS disk operations involve resetting, reading from, writing to,
verifying, and formatting the drive.
Most of the BIOS operations are for experienced software developers who are aware
of the potential danger in their misuse. Also, BIOS versions may vary according to the
processor used and even by computer model.
This chapter introduces the following BIOS INT 13H functions:
DISKETTE FUNCTIONS              HARD DISK FUNCTIONS
00H  Reset diskette system      00H  Reset disk system
01H  Read diskette status       01H  Read disk status
02H  Read sectors               02H  Read sectors
BIOS Status Byte
Code  Status
00H   No error
01H   Bad command, not recognized by the controller
02H   Address mark on disk not found
03H   Writing on protected disk attempted
04H   Invalid track/sector
05H   Reset operation failed
06H   Diskette removed since last access
07H   Drive parameters wrong
08H   Direct memory access (DMA) overrun
      (data accessed too fast to enter)
09H   DMA across a 64K boundary attempted on read/write
10H   Bad CRC on a read encountered
      (error check indicated corrupted data)
20H   Controller failed (hardware failure)
40H   Seek operation failed (hardware failure)
80H   Device failed to respond (diskette: drive door open
      or no diskette; hard disk: time out)
AAH   Drive not ready
BBH   Undefined error
CCH   Write fault
A valid operation clears the carry flag. An error sets the carry flag and returns a status code
in the AH. Function 0DH is a related operation.
On return from a valid operation, the carry flag is cleared, and the AL contains the number
of sectors that the operation has actually read. The contents of the DS, BX, CX, and DX
registers are preserved. An error sets the carry flag and returns the status code in the AH;
reset the drive (function 00H) and retry the operation.
For most situations, you specify only one sector or all sectors for a track. Initialize
the CH and CL, and increment them to read the sectors sequentially. Once the sector num-
ber exceeds the maximum for a track, you have to reset it to 01 and either increment the
track number on the same side of the disk or increment the head number for the next side.
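The reset-and-retry practice can be sketched as follows. This is only an illustration: the retry count of 3, the disk address (drive A, head 0, track 5, sector 3), and the names SECTBUF, R20, R90, and error_rtne are all assumptions, and the ES is assumed to address the data segment so that ES:BX locates the buffer.

```asm
        .DATA
SECTBUF DB   512 DUP(?)     ;Hypothetical one-sector input area
        .CODE
        MOV  SI,03          ;Retry count (assumed)
R20:    MOV  AH,02H         ;Request read
        MOV  AL,01          ;One sector
        LEA  BX,SECTBUF     ;ES:BX = address of buffer
        MOV  CH,05          ;Track 05
        MOV  CL,03          ;Sector 03
        MOV  DH,00          ;Head 00
        MOV  DL,00          ;Drive A
        INT  13H
        JNC  R90            ;Carry clear: valid read
        MOV  AH,00H         ;Request reset drive
        MOV  DL,00          ;Drive A
        INT  13H
        DEC  SI             ;Any retries left?
        JNZ  R20            ; yes, try again
        JMP  error_rtne     ; no, give up (hypothetical label)
R90:                        ;Sector is now in SECTBUF
```

Note that the reset request overwrites the status code in the AH; a program that needs to report the original error should save the AH before resetting.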
0118 NOP
Now let’s examine the program in Figure 19-2, which uses BIOS INT 13H to read sectors
from disk into memory. Note that there is no open operation or file handle. The major sec-
tions are:
CURADR   Contains the beginning track and sector (which the program increments).
ENDADR   Contains the ending track and sector. One way to enhance the program
         would be to prompt the user for the starting and ending track and sector.
C10ADDR  Calculates each disk address in terms of side, track, and sector. When
         the sector number reaches 10, the routine resets the sector to 01. If
         the side is 1, the program increments the track number; the side
         number is then changed, from 0 to 1 or from 1 to 0. This process works
         only for diskettes (because they are two sided) that contain nine sectors
         per track.
F10READ  Reads a sector and increments the sector number for a valid read
         operation.
G10DISP  Displays the currently read sector.
Try running this program under DEBUG. Trace through the instructions that initial-
ize the segment registers. For the input operation, adjust the starting and ending sectors to
the location of the disk’s FAT. (See Chapter 16.) Use G (Go) to execute the program, and
examine the FAT and directory entries in the input area.
As an alternative to DEBUG, your program could convert the ASCII characters in the
input area to their hex equivalents and display the hex values just as DEBUG does. (See
also the program in Figure 15-6.) In this way, you could examine the contents of any
sector—even hidden ones—and could allow a user to enter changes and write the changed
sector back onto disk.
Note that when DOS creates a file, it inserts records in available clusters, which may
not be contiguous on disk. Thus, you can’t expect BIOS INT 13H to read a file sequentially,
although you could access the FAT entries for the location of the next cluster.
The following describes additional BIOS INT 13H services for diskette and hard disk.
         .DATA
CURADR   DW   0304H          ;Beginning track/sector
ENDADR   DW   0501H          ;Ending track/sector
ENDCDE   DB   00             ;End process indicator
         DB   '*** Read error ***$'
RECDIN   DB   512 DUP(' ')   ;Input area
SIDE     DB   00
         .CODE
BEGIN    PROC FAR
         MOV  AX,@data       ;Initialize
         MOV  DS,AX          ; segment
         MOV  ES,AX          ; registers
         MOV  AX,0600H       ;Request scroll
A20LOOP:
         CALL Q10SCRN        ;Clear screen
         CALL Q20CURS        ;Set cursor
         CALL C10ADDR        ;Calculate disk address
         MOV  CX,CURADR
         MOV  DX,ENDADR
         CMP  CX,DX          ;At ending sector?
         JE   A90            ; yes, exit
         CALL F10READ        ;Read disk record
         CMP  ENDCDE,00      ;Normal read?
         JNZ  A90            ; no, exit
         CALL G10DISP        ;Display sector
         JMP  A20LOOP        ;Repeat
A90:     MOV  AX,4C00H
         INT  21H            ;Exit to DOS
BEGIN    ENDP
;        Calculate next disk address:
C10ADDR  PROC NEAR
         MOV  CX,CURADR      ;Get track/sector
         CMP  CL,10          ;Past last sector?
         JNE  C90            ; no, exit
         MOV  CL,01          ;Set sector to 1
         CMP  SIDE,00        ;Bypass if side 0
         JE   C20
         INC  CH             ;Increment track
C20:
         XOR  SIDE,01        ;Change side
         MOV  CURADR,CX
C90:     RET
C10ADDR  ENDP
;        Read disk sector:
F10READ  PROC NEAR
         MOV  AH,02H         ;Request read
         MOV  AL,01          ;Number of sectors
         LEA  BX,RECDIN      ;Address of buffer
         MOV  CX,CURADR      ;Track/sector
         MOV  DH,SIDE        ;Side
         MOV  DL,01          ;Drive B
         INT  13H
         CMP  AH,00          ;Normal read?
         JZ   F90            ; yes, exit
         MOV  ENDCDE,01      ; no,
         CALL X10ERR         ; invalid read
F90:
         INC  CURADR         ;Increment sector
         RET
F10READ  ENDP
;        Display sector:
turning from loading, the carry flag is cleared and the AL contains the number of sectors ac-
tually verified. The contents of the DS, BX, CX, and DX registers are preserved. An error
sets the carry flag and returns a status code in the AH; reset the drive and retry the operation.
For example, if you format track 03, head 00, and 512 bytes per sector, the first entry for
the track is hex 03000102, followed by one entry for each remaining sector.
The operation clears or sets the carry flag and returns the status code in the AH.
BL Diskette type (01H = 360K, 02H = 1.2M, 03H = 720K, 04H = 1.44M)
CH High cylinder/track number
CL Bits 0-5 = high sector number
Bits 6—7 = high-order two bits of cylinder number
DH High head number
DL Number of drives attached to the controller
ES:DI  For diskettes, segment:offset of an 11-byte diskette drive parameter table.
       Two relevant fields are:
       Offset 3: bytes per sector (00H = 128, 01H = 256, 02H = 512,
       03H = 1024)
       Offset 4: sectors per track
You can use the DEBUG command D ES:offset (the offset in the DI) to display the
values. The operation clears or sets the carry flag and returns the status code in the AH.
A successful operation returns to the AL the number of sectors transferred. The op-
eration clears or sets the carry flag and returns a status code in the AH.
The operation clears or sets the carry flag and returns a status code in the AH.
For AH return code 03, the CX:DX pair contains the total number of disk sectors on the
drive. The operation clears or sets the carry flag, and error codes are returned in the AH.
INT 13H, Function 16H: Change of Diskette Status
This function checks for a change of diskette for systems that can sense a change. Load the
DL with the drive number (0 = A, etc.). The operation returns one of the following codes
in the AH:
00H No change of diskette (carry flag = 0)
01H Invalid diskette parameter (carry flag = 1)
06H Diskette changed (carry flag = 1)
80H Diskette drive not ready (carry flag = 1)
Status codes 01H and 80H are errors that set the carry flag, whereas 06H is a valid status
that also sets the carry flag. This is a potential source of confusion.
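One way to keep the two cases apart is to test the AH after the carry flag, as in this sketch for drive A. The labels C50, C90, and error_rtne are hypothetical.

```asm
        MOV  AH,16H         ;Request change-of-diskette status
        MOV  DL,00          ;Drive A
        INT  13H
        JNC  C90            ;Carry clear: AH = 00H, no change
        CMP  AH,06H         ;Diskette changed?
        JE   C50            ; yes, valid status despite carry
        JMP  error_rtne     ; no, 01H or 80H is an error
C50:                        ;Handle the changed diskette here
C90:                        ;Continue processing
```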
The operation clears or sets the carry flag and returns the status in the AH.
A valid operation returns in the ES:DI a pointer to an 11-byte diskette parameter table. (See
function 08H.) The operation clears or sets the carry flag and returns the status in the AH.
KEY POINTS
QUESTIONS
19-1. What are the two major disadvantages of using BIOS INT 13H? That is, why is the use of DOS
interrupts usually preferred?
19-2. Under what circumstances would a programmer use BIOS INT 13H?
19-3. Most INT 13H operations return a status code. (a) Where is the code returned? (b) What does
code 00H mean? (c) What does code 03H mean?
19-4. What is the standard procedure for an error returned by INT 13H?
19-5. Code the instructions to reset the diskette controller.
19-6. Code the instructions to read the diskette status.
19-7. Using memory address INDSK, drive A, head 0, track 6, and sector 3, code the instructions for
BIOS INT 13H to read one sector.
19-8. Using memory address OUTDSK, drive B, head 0, track 8, and sector 1, code the instructions
for BIOS INT 13H to write three sectors.
19-9. After the write in Question 19-8, how would you check for an attempt to write on a pro-
tected disk?
19-10. Based on Question 19-8, code the instructions to verify the write operation.
CHAPTER 20
Printing
OBJECTIVE:
INTRODUCTION
Compared to screen and disk handling, printing appears to be a relatively simple process.
There are only a few operations involved, all done either through DOS INT 21H or through
BIOS INT 17H. The special commands to the printer include Form Feed, Line Feed, and
Carriage Return.
A printer must understand a signal from the processor—for example, to eject to a new
page, to feed one line down a page, or to tab across a page. The processor also must un-
derstand a signal from a printer indicating that it is busy or out of paper. Unfortunately,
many types of printers respond differently to signals from a processor, and one of the more
difficult tasks for software specialists is to interface their programs to such printers.
This chapter introduces the following interrupt operations:
DOS INT 21H FUNCTIONS       BIOS 17H FUNCTIONS
40H  Print characters       00H  Print character
05H  Print character        01H  Initialize port
                            02H  Get printer port status
Horizontal Tab
Line Feed (advance one line)
Form Feed (advance to next page)
Carriage Return (return to left margin)
Horizontal Tab. The Horizontal Tab (09H) control character causes the printer to
place the current character at the next tab stop (usually, if set at all, every eight positions).
The command works only on printers that have the feature and only when the printer tabs
are set up. You can print blank spaces to get around a printer’s inability to tab.
Line Feed. The Line Feed (0AH) control character advances a single line, and two
successive line feeds cause a double space.
Form Feed. Initializing the paper when you power up a printer determines the
starting position for the top of a page. The default length for a page is 11 inches, which pro-
vides 66 lines at 6 lines per inch. Neither the processor nor the printer automatically checks
for the bottom of a page. On continuous forms, if your program continues printing down a
page, it eventually prints over the perforation at the bottom of the page and onto the top of
the next page. To control paging, count the lines as they print, and on reaching the maximum
for a page (such as 60 lines), issue a Form Feed (0CH) command, and then reset the
line count to 0 or 1.
At the end of printing, deliver a Line Feed or Form Feed command to force the printer
to print the last line still in its buffer. Issuing a form feed at the end of printing also facili-
tates the user’s tearing off the last page.
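The line-counting logic might be sketched as follows, using DOS function 40H with printer handle 04 (covered in the section on function 40H). The maximum of 60 lines comes from the example above; the names LINECTR, FFEED, and P30 are hypothetical, and error checking is omitted for brevity.

```asm
        .DATA
LINECTR DB   00             ;Hypothetical count of lines printed
FFEED   DB   0CH            ;Form Feed character
        .CODE
        CMP  LINECTR,60     ;Reached maximum for page?
        JB   P30            ; no, bypass
        MOV  AH,40H         ;Request print
        MOV  BX,04          ;Printer handle
        MOV  CX,01          ;One character
        LEA  DX,FFEED       ;Form Feed
        INT  21H
        MOV  LINECTR,00     ;Reset line count
P30:    INC  LINECTR        ;Count the line about to print
```

A program would execute this test immediately before printing each line.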
Carriage Return. The Carriage Return (0DH) control character resets the printer
to its leftmost margin; programs normally accompany it with a Line Feed. On the keyboard,
this character is known as Enter or Return.
We have already used file handles in the chapters on screen handling and disk processing.
For printing with DOS INT 21H, function 40H, load these registers:
AH Function 40H
BX File handle 04
CX Number of characters to print
DX Address of the text
The following example prints 25 characters from a data item named HEADING, beginning
at the leftmost margin. The Carriage Return (0DH) and Line Feed (0AH) characters
immediately following the text in HEADING cause the printer to reset the carriage and
advance one line:
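A sketch of such a request follows. The heading text is an assumption borrowed from a later example in this chapter (23 characters of text plus Carriage Return and Line Feed, for 25 in all), and error_rtne is a hypothetical label.

```asm
        .DATA
HEADING DB   'Industrial Bicycle Mfrs', 0DH, 0AH ;23 characters + CR + LF
        .CODE
        MOV  AH,40H         ;Request print
        MOV  BX,04          ;Printer handle
        MOV  CX,25          ;25 characters
        LEA  DX,HEADING     ;DS:DX = address of text
        INT  21H
        JC   error_rtne     ;Carry set: code 05 or 06 in AX
```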
A successful operation prints the text, clears the carry flag, and returns in the AX the num-
ber of characters printed. An unsuccessful operation sets the carry flag and returns in the
AX error code 05 (access denied) or 06 (invalid handle). An end-of-file marker (Ctrl-Z or
1AH) in the data also causes the operation to end.
Clancy Alderson
Janet Brown
David Christie
The program counts each line printed and, on nearing the bottom of a page, ejects the form
to the top of the next page. The major procedures are the following:
         .DATA
NAMEPAR  LABEL BYTE          ;Keyboard parameter list:
MAXNLEN  DB   20             ; maximum length of name
NAMELEN  DB   ?              ; actual length entered
NAMEFLD  DB   20 DUP(' ')    ; name entered
                             ;Heading line:
HEADG    DB   'List of Employee Names Page '
PAGECTR  DB   '01', 0AH,
         .CODE
BEGIN    PROC FAR
         MOV  AX,@data       ;Initialize
         MOV  DS,AX          ; segment
         MOV  ES,AX          ; registers
         CALL Q10CLR         ;Clear screen
         CALL M10PAGE        ;Page heading
A20LOOP:
         MOV  DX,0000        ;Set cursor to 00,00
         CALL Q20CURS
         CALL D10INPT        ;Provide input of name
         CALL Q10CLR
         CMP  NAMELEN,00     ;No name entered?
         JE   A30            ; no name, exit
         CALL E10PRNT        ; name, prepare printing
         JMP  A20LOOP
A30:
         MOV  CX,01          ;End of processing:
         LEA  DX,FFEED       ; one character
         CALL P10OUT         ; for form feed,
         MOV  AX,4C00H       ; exit to DOS
         INT  21H
BEGIN    ENDP
;        Accept input of name:
PAGECTR  DB   '01'
Printing ASCII Files and Handling Tabs
A common procedure, performed, for example, by the video adapter, is to replace a Tab
character (09H) with blanks through to the next location evenly divisible by 8. Thus tab
stops could be at locations 8, 16, 24, and so forth, so that all locations between 0 and 7 tab
to 8, those between 8 and 15 tab to 16, and so forth. Some printers, however, ignore Tab
characters. DOS PRINT, for example, which prints ASCII files (such as assembly source
programs), has to check each character that it sends to the printer. If the character is a Tab,
the program inserts blanks to the next tab position.
The program in Figure 20—2 requests a user to enter the name of a file and prints the
contents of the file. The program is similar to the one in Figure 17-3 that displays records,
but goes a step further in replacing tab stops for the printer with blanks. You'll find the logic
in G10XFER, after label G60. Following are three examples of tab stops, for print positions
1, 9, and 21, and the logic for setting the next tab position:
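One way to express the logic, consistent with tab stops at locations 8, 16, 24, and so forth, is to clear the rightmost three bits of the present position and add 8, so that positions 1, 9, and 21 become 8, 16, and 24. The sketch below is not the code from the figure; the item PRTPOS is a hypothetical name.

```asm
        .DATA
PRTPOS  DW   21             ;Hypothetical present print position
        .CODE
        MOV  AX,PRTPOS      ;Present position (e.g., 21 = 00010101B)
        AND  AX,0FFF8H      ;Clear rightmost 3 bits (16 = 00010000B)
        ADD  AX,08          ;Add 8 (24 = 00011000B)
        MOV  PRTPOS,AX      ;New tab position
```

The number of blanks to insert is the difference between the new and the old positions.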
C10PRMP  Requests the user to enter a filename. Pressing only the Enter key indicates
         that the user is finished.
E10OPEN  Opens the requested disk file for input.
G10XFER  Checks the input data for end of sector, end of file, end of display area,
         Line Feed, and Tab. Basically, sends regular characters to the display
         area.
P10PRNT  Prints the display line and clears it to blanks.
R10READ  Reads a sector from the file.
Carriage Return, Line Feed, and Form Feed characters should work on all printers.
You could modify the preceding program to count the lines printed and force a form feed
when near the bottom of a page, at line 60 or so. (Some users prefer to use an editor pro-
gram to embed Form Feed characters directly in their ASCII files, at the exact location
where they want a page break, such as at the end of a procedure. The usual method is to
COUNT 00
DISAREA 120 DUP(' ') ;Display area
ENDCDE 00 ;End process indicator
FFEED 0CH
HANDLE 0
OPENMSG '*** Open error ***'
PROMPT 'Name of file?'
512 DUP(' ') ;Input area for file
;Main procedure
; Initialize
; segment
; registers
Q10SCR ;Clear screen
Q20CURS ;Set cursor
A10LOOP:
ENDCDE,00 ;Initialize
C10PRMP ;Request filename
NAMELEN,00 ;Any request?
A90 ; no, exit
E10OPEN ;Open file, get handle
ENDCDE,00 ;Valid open?
A80 ; no, request again
R10READ ;Read 1st disk sector
ENDCDE,00 ;End of file, no data?
A80 ; yes, request next
G10XFER ;Print/read
A10LOOP ;Repeat
AX, 4C00H ;Exit to DOS
21H
C10PRMP
;Prompt for filename
CX,13
DX, PROMPT
21H
AH, 0AH ;Accept filename
DX, PATHPAR
21H
BL, NAMELEN ;Insert
BH, 00 ; zero at end of
FILENAM [BX] , 0 ; filename
C90:
C10PRMP
; Open disk file:
E10OPEN
;Request open
;Read only
; Print line:
hold down the Alt key and press numbers on the numeric keypad—for example, 012 for
Form Feed.)
You could revise the program for DOS function 05H to send each character directly
to the printer, thereby eliminating the definition and use of the display area.
These instructions are adequate for sending a single character to the printer. However, print-
ing typically involves a full or partial line of text and requires stepping through a line for-
matted in the data area.
The following example illustrates printing a full line. It first initializes the address of
HEADING in the SI register and sets the CX to the length of HEADING. The loop at P20
then extracts each character successively from HEADING and sends it to the printer. Since
the first character in HEADING is a Form Feed and the last two characters are Line Feeds,
the heading prints at the top of a new page and is followed by a double space. The code is
as follows:
HEADING DB 0CH, 'Industrial Bicycle Mfrs', 0DH, 0AH, 0AH
P20:
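The loop described above can be sketched as follows, using DOS function 05H with the character in the DL. The length of 27 counts the Form Feed, the 23 characters of text, the Carriage Return, and the two Line Feeds; the definition of HEADING is repeated so that the fragment stands alone.

```asm
        .DATA
HEADING DB   0CH, 'Industrial Bicycle Mfrs', 0DH, 0AH, 0AH
        .CODE
        LEA  SI,HEADING     ;Initialize address of heading
        MOV  CX,27          ; and its length
P20:    MOV  AH,05H         ;Request print character
        MOV  DL,[SI]        ;Character from heading
        INT  21H
        INC  SI             ;Next character
        LOOP P20            ;Repeat for each character
```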
If the printer is not on, DOS returns a message, “out of paper,” repetitively. If you
turn on the power, the program begins printing correctly. You can also press Ctrl+Break to
cancel execution of the print operation.
Some commands require a preceding Esc (escape) character (1BH). Some of these
commands, depending on the printer, are:
1. Define commands in the data area. The following sets condensed mode, sets 8 lines
per inch, prints a title, and causes a carriage return and line feed:
HEADING DB 0FH, 1BH, 30H, 'Title ...', 0DH, 0AH
All subsequent characters print in condensed mode until the program sends a command that
resets the mode.
The foregoing commands do not necessarily work for all printer models. Check your
manual for the printer’s specific commands.
1. Issue function 02H first to determine the printer’s status, via a selected port number.
Include this status test before every attempt to print. If the printer is available, then
2. Issue function 01H to initialize the printer port, and
3. Issue function 00H operations to send characters to the printer.
The operations return the printer status to the AH, with one or more bits set to 1:
BIT CAUSE
0 Time out
3 Input/output error
4 Selected
If the printer is already switched on and ready, the operation returns 90H (binary
10010000): the printer is not busy, but is selected, a valid condition. Printer errors are bit 5
(out of paper) and bit 3 (output error). If the printer is not switched on, the operation returns
B0H, or binary 10110000, indicating “out of paper.”
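The three-step sequence can be sketched as follows. The sketch assumes printer port 0 (LPT1) in the DX and the character in the AL for function 00H; it tests only bits 5 and 3, the errors named above, and error_rtne is a hypothetical label for a routine that displays a message.

```asm
        MOV  AH,02H         ;Request printer port status
        MOV  DX,00          ;Printer port 0 (assumed)
        INT  17H
        TEST AH,00101000B   ;Out of paper (bit 5) or I/O error (bit 3)?
        JNZ  error_rtne     ; yes, display a message
        MOV  AH,01H         ;Request initialize port
        MOV  DX,00
        INT  17H
        MOV  AH,00H         ;Request print character
        MOV  AL,'A'         ;Character to print
        MOV  DX,00
        INT  17H            ;Status is returned in AH
```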
The operation returns the status to the AH register. The recommended practice is to use
function 02H first to check the printer status.
Since the operation sends a Form Feed character, you can use it to set the printer to the top-
of-page position, although most printers do this automatically when turned on. The opera-
tion returns a status code in the AH.
The operation returns the same printer port status as function 01H. When the program
runs, if the printer is not initially turned on, BIOS is unable to return a message automatically;
your program is supposed to test and act upon the printer status. If your program
does not check the status, your only indication is the cursor blinking. If you turn on the
printer at this point, some of the output data is lost. Consequently, before executing any
BIOS print operations, check the port status; if there is an error, display a message. (The
DOS operation performs this checking automatically, although its message, “out of paper,”
applies to various conditions.) When the printer is switched on, the message no longer ap-
pears, and printing begins normally with no loss of data.
At any time, a printer may run out of forms or may be inadvertently switched off.
If you are writing a program for others to use, include a status test before every attempt
to print.
KEY POINTS
• After printing is completed, use a Line Feed or Form Feed command to clear the
printer buffer.
• DOS function 40H (the preferred choice) prints strings of characters, whereas DOS
function 05H and BIOS function 17H print a single character at a time.
• DOS provides a message if there is a printer error; BIOS returns only a status code.
When using BIOS INT 17H, check the printer status before printing.
QUESTIONS
20-1. Provide the printer control characters for (a) Horizontal Tab; (b) Form Feed; (c) Backspace;
(d) Carriage Return.
20-2. Code a program using DOS function 40H for the following requirements: (a) Eject the forms
to the next page; (b) print your name; (c) perform a carriage return and a line feed, and print
your address; (d) perform a carriage return and a line feed, and print your city and state;
(e) eject the forms.
20-3. Revise Question 20-2 to use DOS function 05H.
20-4. Code a heading line that sets condensed mode, defines a title (any name), provides for carriage
return and form feed operations, and turns off condensed mode.
20-5. BIOS INT 17H for printing returns an error code. (a) Where is the code returned? (b) What
does code 08H mean? (c) What does code 90H mean?
20-6. Revise Question 20-2 to use BIOS INT 17H. Include a test for the printer status.
20-7. Revise Question 20-2 so that the program performs parts (b), (c), and (d) five times.
20-8. Revise Figure 20-1 to run under DOS function 05H.
20-9. Revise Figure 20-2 so that it also displays printed lines.
CHAPTER 21
OBJECTIVE
To describe programming for the mouse, the IN and OUT
instructions, ports, and generating sound.
INTRODUCTION
This chapter describes the use of the mouse, accessing the PC’s ports, and generating
sound through the PC’s speaker. The instructions that are introduced are:
MOUSE FEATURES
The mouse is a commonly used pointing device, basically controlled by a driver that is nor-
mally installed by an entry in the CONFIG.SYS or AUTOEXEC.BAT file. The driver must
be installed for a program to respond to the mouse’s actions.
All mouse operations within a program are performed by standard INT 33H functions
of the form
378 Other Input/Output Facilities Chapter 21
Note that unlike other INT operations that use the AH register, INT 33H functions are
loaded in the full AX register.
The first mouse instruction that a program issues is function 00H, which simply initializes
the mouse driver for the program. Typically, you need issue this command just once,
at the start of the program. The next instruction following function 00H should be function
01H, which causes the mouse pointer to appear on the screen. After that, you have a choice
of a wide range of mouse operations.
MOUSE FUNCTIONS
The following are the mouse functions available for INT 33H; relatively few of them are
commonly used:
This is the first command for handling a mouse that a program issues; it needs to be issued
only once. Simply load the AX with function 00H, and issue INT 33H. The operation requires
no input parameters, but returns these values:
• AX = 0000H if no mouse support is available or FFFFH if support is available
• BX = number of mouse buttons (if support is available)
If mouse support is available, the operation initializes the mouse driver as follows:
The standard practice is to issue this function at the end of a program’s execution, to
cause the pointer to be concealed. The operation requires no input parameters and returns
no values.
The pointer is displayed when the pointer flag contains a zero value and is concealed for any
other value. This function decrements the flag to force the pointer to be concealed.
This function returns useful information about the mouse. It requires no input parameters,
but returns these values:
The horizontal and vertical coordinates are expressed in terms of pixels, even in text mode
(8 per byte for video mode 03). The values are always within the minimum and maximum
limits for the pointer.
Use this operation to set the horizontal and vertical coordinates for the mouse pointer on
the screen (the values for the location are in terms of pixels—8 per byte for video mode 03):
The operation sets the pointer at the new location, adjusted as necessary if outside the min-
imum and maximum limits.
Illustrative Code
The following code illustrates the use of the mouse instructions covered to this point:
MOV AX, 00H ;Request initialize mouse
INT 33H
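A fuller sketch of the sequence, under stated assumptions, follows. It initializes the driver (00H), shows the pointer (01H), sets its location (04H, with CX = horizontal and DX = vertical pixel values), gets the button status and location (03H), and hides the pointer (02H). The screen position chosen (column 40, row 12 in text mode, at 8 pixels per position) and the label M90 are illustrative only.

```asm
        MOV  AX,00H         ;Request initialize mouse
        INT  33H
        CMP  AX,00          ;Mouse support available?
        JE   M90            ; no, bypass
        MOV  AX,01H         ;Request show pointer
        INT  33H
        MOV  AX,04H         ;Request set pointer location
        MOV  CX,320         ;Horizontal: column 40 x 8 pixels
        MOV  DX,96          ;Vertical: row 12 x 8 pixels
        INT  33H
        MOV  AX,03H         ;Request button status and location
        INT  33H            ;BX = buttons, CX = horizontal, DX = vertical
        MOV  AX,02H         ;Request hide pointer
        INT  33H
M90:                        ;Continue processing
```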
To use this function to return information about button presses, set the BX with the button
number, where 0 = left, 1 = right, and 2 = center:
MOV AX,05H ;Request press information
The operation returns the up-down status of all buttons and the press count and location of
the requested button:
• AX = Status of buttons, according to bit location, as follows:
Bit 0 Left button, where 0 = up, 1 = down
Bit 1 Right button, where 0 = up, 1 = down
To use this function to return information about button releases, set the BX with the button
number (0 = left, 1 = right, and 2 = center):
The operation returns the up-down status of all buttons and the release count and location
of the requested button, as follows:
This operation sets the minimum and maximum horizontal limits for the pointer:
If the minimum value is greater than the maximum, the operation exchanges the values. The
operation also moves the pointer to within the new area if necessary. See also functions 08H
and 10H.
This operation sets the minimum and maximum vertical limits for the pointer:
MOV AX,08H ;Request set vertical limit
If the minimum value is greater than the maximum, the operation exchanges the values. The
operation also moves the pointer to within the new area if necessary. See also functions 07H
and 10H.
This operation returns the horizontal and vertical mickey count since the last call to the
function (within the range -32,768 to +32,767). Returned values are:
• CX = Horizontal count (a positive value means travel to the right, negative means to
the left)
• DX = Vertical count (a positive value means travel downwards, negative means
upwards)
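A short sketch of reading the motion counters follows. It assumes that the function code for this operation is 0BH (the read-motion-counters function) and that XMICKEY and YMICKEY are hypothetical words defined in your data segment. Because the counts are signed, the test uses a sign check rather than an unsigned comparison:

```asm
        MOV     AX,0BH          ;Request read motion counters (assumed 0BH)
        INT     33H
        MOV     XMICKEY,CX      ;Save horizontal count (signed)
        MOV     YMICKEY,DX      ;Save vertical count (signed)
        OR      CX,CX           ;Set flags from horizontal count
        JS      M20             ;Negative means travel to the left
                                ;Zero or positive: no travel or to the right
M20:
```

The label M20 is illustrative; your own handling of left and right travel would replace the comment lines.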
Define the interrupt handler as a FAR procedure. The mouse driver uses a far call to
enter the interrupt handler with these registers set:
• AX = The event mask as defined, except that bits are set only if the condition occurred
• BX = Button state, where, if set, the bits mean the following:
Bit 0 Left button down
Bit 1 Right button down
Bit 2 Center button down
• CX = Horizontal (x) coordinate
• DX = Vertical (y) coordinate
• SI = Last vertical mickey count
• DI = Last horizontal mickey count
• DS = Data segment for the mouse driver
On the program’s entry into the interrupt handler, push all registers and initialize the
DS register to the address of your data segment. Within the handler, use only BIOS, not
DOS, interrupts. On exit, pop all registers.
This operation defines a screen area in which the pointer is not displayed:
MOV AX,10H ;Request set exclusion area
To replace the exclusion area, call the function again with different parameters, or reissue
function 00H or 01H.
MOUSE PROGRAM
The program in Figure 21-1 illustrates the use of a mouse. The screen displays the horizon-
tal and vertical positions of the pointer as a user moves the mouse. The main procedures are:
YMSG DB + = * ;Y message
YASCII DW ? ;Y ASCII value
MOV YASCII,AX
CALL Q30DISP ;Display X and Y values
JMP Al10 ;Repeat
A80:
CALL H10HIDE ;Hide mouse pointer
A90:
CALL Q10CLEAR ;Clear screen
        MOV     AX,4C00H        ;Exit to DOS
        INT     21H
BEGIN   ENDP
B10INIT PROC    NEAR
        MOV     AX,00H          ;Initialize mouse
        INT     33H
        CMP     AX,00           ;Mouse installed?
        JE      B90             ; no, exit
        MOV     AX,01H          ;Show pointer
        INT     33H
B90:
        RET                     ;Return to caller
B10INIT ENDP
.286
D10PTR  PROC    NEAR
D20:    MOV     AX,03H          ;Get pointer location
        INT     33H
        CMP     BX,01           ;Right button pressed?
        JE      D90             ; yes, means exit
        SHR     CX,03           ;Divide pixel value
        SHR     DX,03           ; by 8
        CMP     CX,XBINARY      ;Has pointer
        JNE     D30             ; location
        CMP     DX,YBINARY      ; changed?
        JE      D20             ; no, repeat
D30:    MOV     XBINARY,CX      ; yes, save
        MOV     YBINARY,DX      ; new location
D90:
        RET                     ;Return to caller
D10PTR  ENDP
Q10CLEAR PROC   NEAR
        MOV     AX,0600H        ;Request clear screen
        MOV     BH,30H          ;Colors
        MOV     CX,00           ;Full
        MOV     DX,184FH        ; screen
        INT     10H
        RET                     ;Return to caller
Q10CLEAR ENDP
Q20CURS PROC    NEAR
        MOV     AH,02H          ;Set cursor
        MOV     BH,0            ;Page 0
        MOV     DH,0            ;Row
        MOV     DL,25           ;Column
        INT     10H
        RET                     ;Return to caller
Q20CURS ENDP
Q30DISP PROC    NEAR
        MOV     AH,40H          ;Request display
        MOV     BX,01           ;Screen
        MOV     CX,14           ;Number of characters
        LEA     DX,DISPDATA     ;Display area
        INT     21H
        RET                     ;Return to caller
Q30DISP ENDP
        END     BEGIN
One way to improve this program would be to issue function 0CH to set an interrupt
handler. In this way, the required instructions are automatically invoked whenever the
mouse is active.
PORTS
A port is a device that connects a processor to the external world. Through a port, a
processor receives a signal from an input device and sends a signal to an output device.
Ports are identified by their addresses, in the range of 0H-3FFH, or 1,024 ports in all.
Note that these addresses are not conventional memory addresses. You can use the IN and
OUT instructions to handle I/O directly at the port level:
IN transfers data from an input port to the AL if a byte and to the AX if a word. The
general format is
IN accum-reg,port
OUT transfers data to an output port from the AL if a byte and from the AX if a word.
The general format is
OUT port,accum-reg
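The two ways of naming a port can be sketched as follows. A port number from 0 through 0FFH may be coded directly as an immediate operand, whereas a larger port number has to be loaded into the DX first. Ports 61H and 201H are used here only because the text lists them among the ports that are safe to access directly:

```asm
        IN      AL,61H          ;Read a byte from port 61H into the AL
        OUT     61H,AL          ;Write the same byte back to port 61H
        MOV     DX,201H         ;Port number above 0FFH goes in the DX
        IN      AL,DX           ;Read a byte from port 201H into the AL
```

The fragment changes only the AL and DX registers and belongs inside a procedure.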
Although the recommended practice is to use DOS and BIOS interrupts, you may
safely bypass BIOS when you access ports 21H, 40-42H, 60H, 61H, and 201H. For example,
on bootup, a ROM BIOS routine scans the system for the addresses of the serial and
parallel port adapters. If serial port addresses are found, BIOS places them in its data
area, beginning at memory location 40:00H; if parallel port addresses are found, BIOS
places them in its data area, beginning at location 40:08H. Each location has space for
four one-word entries. The BIOS table for a system with two serial ports and two parallel
ports could look like this:
To use BIOS INT 17H to print a character, insert the printer port number in the DX
register:
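A minimal sketch of such a request follows. It assumes that function 00H in the AH requests printing the character in the AL, that printer port 0 in the DX means LPT1, and that the operation returns the printer status in the AH; the character 'A' is arbitrary:

```asm
        MOV     AH,00H          ;Request print character
        MOV     AL,'A'          ;Character to print (arbitrary)
        MOV     DX,00           ;Printer port 0 = LPT1 (assumed numbering)
        INT     17H             ;Call BIOS; status is returned in the AH
```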
Some programs allow for printing only via LPT1. If you have two printers attached,
as LPT1 and LPT2, you could use the program in Figure 21-2 to reverse (toggle) their ad-
dresses in the BIOS table.
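The idea of exchanging two table entries can be sketched as follows. This is not the program in Figure 21-2, only an illustration of the technique; it assumes that the LPT1 address is the word at 40:08H and the LPT2 address is the next word, at 40:0AH:

```asm
        MOV     AX,40H          ;Address of the BIOS
        MOV     ES,AX           ; data area
        MOV     AX,WORD PTR ES:[08H]    ;Get LPT1 address
        XCHG    AX,WORD PTR ES:[0AH]    ;Exchange with LPT2 address
        MOV     WORD PTR ES:[08H],AX    ;Store old LPT2 address as LPT1
```

The fragment alters the ES register, so save and restore it if the rest of your program depends on its contents. Executing the fragment a second time restores the original order.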
GENERATING SOUND
The PC generates sound by means of a built-in permanent magnet speaker. You can select
one of two ways to drive the speaker or combine both ways: (1) Use bit 1 of port 61H to
activate the Intel 8255A-5 Programmable Peripheral Interface (PPI) chip, or (2) use the
gating of the Intel 8253-5 Programmable Interval Timer (PIT). The clock generates a
1.19318-MHz signal. The PPI controls gate 2 at bit 0 of port 61H.
The program in Figure 21-3 generates a series of notes in ascending frequency. DURTION
provides the length of each note, and TONE determines the frequency. The program
initially accesses port 61H and saves the value that the operation delivers. A CLI instruc-
tion clears the interrupt flag to enable a constant tone. The interval timer generates a clock
tick of 18.2 ticks per second that (unless you code CLI) interrupts execution of your pro-
gram and causes the tone to wobble.
The contents of TONE determine its frequency; high values cause low frequencies
and low values cause high frequencies. After the routine B10SPKR plays each note,
it increases the frequency of TONE by means of a right shift of 1 bit (effectively
halving its value). Since decreasing TONE in this example reduces how long it plays, the
routine also increases DURTION by means of a left shift of 1 bit (effectively doubling
its value).
The program terminates when TONE is reduced to 0. The initial values in DURTION
and TONE have no technical significance. You can experiment with other values and try
executing the program without the CLI instruction.
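The core of the first method, toggling bit 1 of port 61H, can be sketched as follows. This is an illustration under stated assumptions rather than the listing in Figure 21-3: DURTION and TONE are words in the data segment as described above, the labels S20, S30, and S40 are hypothetical, and the delay loops depend on processor speed:

```asm
        IN      AL,61H          ;Get value from port 61H
        PUSH    AX              ; and save it
        CLI                     ;Disable interrupts for a steady tone
        MOV     DX,DURTION      ;Number of off/on cycles
S20:    AND     AL,11111100B    ;Clear bits 0 and 1 (speaker off)
        OUT     61H,AL
        MOV     CX,TONE         ;Delay for half a cycle
S30:    LOOP    S30
        OR      AL,00000010B    ;Set bit 1 (speaker on)
        OUT     61H,AL
        MOV     CX,TONE         ;Delay for the other half
S40:    LOOP    S40
        DEC     DX              ;Count down the cycles
        JNZ     S20
        POP     AX              ;Restore the original
        OUT     61H,AL          ; port value
        STI                     ;Enable interrupts again
```

A larger value in TONE lengthens each delay and so lowers the pitch. Note that a value of 0 in TONE makes each LOOP repeat 65,536 times, and a value of 0 in DURTION makes the outer loop do the same.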
;Length of tone
; Frequency
MAIN
B10SPKR PROC    NEAR
B20:    MOV     DX,DURTION      ;Set duration of sound
B30:
        AND     AL,11111100B    ;Clear bits 0 & 1
        OUT     61H,AL          ;Transmit to speaker
        MOV     CX,TONE         ;Set length
You could use any variation of the logic to play a sequence of notes, in order, for ex-
ample, to draw a user’s attention. You could also revise the program as per Question 21-7.
KEY POINTS
• In text mode, the mouse pointer is a flashing block, in reverse video; in graphics
mode, the pointer is an arrowhead.
• Mouse operations use INT 33H, with a function code loaded in the AX.
• The first mouse operation to execute is function 00H, which initializes the mouse
driver.
• Function 01H is required to display the mouse pointer, 03H to get the button status
and pointer location, and 04H to set the pointer location.
• Through a port, a processor receives a signal from an input device and sends a signal
to an output device. Ports are identified by their addresses, in the range 0H-3FFH, or
1,024 in all.
• The PC generates sound by means of a built-in permanent magnet speaker. You can
select one of two ways to drive the speaker or combine both ways.
QUESTIONS
21-1. Explain these terms: (a) mickey; (b) mickey count; (c) mouse pointer.
21-2. Provide the INT 33H function for each of the following mouse operations:
(a) Read mouse-motion counters
(b) Get button-press information
(c) Conceal the mouse pointer
(d) Set pointer location
(e) Get button-release information
(f) Install interrupt handler for mouse events
21-3. What is the purpose of the mouse pointer flag?
21-4. Code the instructions for the following requirements:
(a) Initialize the mouse
(b) Display the mouse pointer
(c) Get mouse information
(d) Set the mouse pointer on the center row, to the far right
(e) Get mouse sensitivity
(f) Get button status and pointer location
(g) Conceal the mouse pointer
21-5. Combine the requirements in Question 21-4 into a full program. You can run the program un-
der DEBUG, although at times DEBUG may scroll the pointer off the screen.
21-6. Refer to Figure 21-2, and code the instructions to reverse the addresses for COM1 and COM2.
21-7. Revise the program in Figure 21-3 for the following requirements: Generate notes that de-
crease in frequency; initialize TONE to 01 and DURTION to a high value. On each loop, in-
crease the value in TONE, decrease the value in DURTION, and end the program when
DURTION equals 0.
PART F — Advanced Programming
CHAPTER 22
Writing Macros
OBJECTIVE:
To explain the definition and use of macro instructions.
INTRODUCTION
For each symbolic instruction that you code, the assembler generates one machine-language
instruction. But for each coded statement in a high-level language such as C or Pascal, the
compiler may generate many machine-language instructions. In this regard, you can think
of a high-level language as consisting of macro statements.
The assembler has facilities that programmers can use to define macros. You define
a specific name for the macro, along with the set of assembly language instructions
that the macro is to generate. Then, wherever you need to code the set of instructions,
simply code the name of the macro, and the assembler automatically generates your defined
instructions.
Macros are useful for the following purposes:
For macros that you want to include with your program, you first must define them (or copy
them from a macro library). A macro definition appears before any defined segment. Let’s
examine a simple macro definition that initializes the segment registers for an EXE program:
INITZ   MACRO                   ;Define macro
        MOV     AX,@data        ; } Body of
        MOV     DS,AX           ; } macro
        MOV     ES,AX           ; } definition
        ENDM                    ;End of macro
The name of this macro is INITZ, although any other unique valid name is acceptable. The
MACRO directive on the first line tells the assembler that the instructions that follow, up to
ENDM (“end macro”), are to be part of a macro definition. The ENDM directive ends the
macro definition. The instructions between MACRO and ENDM comprise the body of the
macro definition.
The names referenced in the macro definition (@data, AX, DS, and ES) must be defined
elsewhere in the program or must otherwise be known to the assembler. You may
subsequently use the macro instruction INITZ in the code segment where you want to
initialize the registers. When the assembler encounters the macro instruction INITZ, it
scans a table of symbolic instructions and, failing to find an entry, checks for macro
instructions. Since the program contains a definition of the macro INITZ, the assembler
substitutes the body of the definition, generating the instructions known as the macro
expansion. A program would use the macro instruction INITZ only once, although other
macros are designed to be used any number of times, and each time the assembler generates
the same macro expansion.
Figure 22-1 provides a listing of the assembled program. This particular assembler
version lists the macro expansion with the number 1 to the left of each instruction to
indicate that a macro instruction generated it. A macro expansion indicates only
instructions for which object code is generated, so that directives like ASSUME or PAGE
would not appear.
It’s hardly worth bothering to define a macro that is to be used only once, but you
could catalog such a macro in a library for use with all programs. A later section explains
how to catalog macros in a library and how to include them automatically in any program.
                                page    60,132
                        TITLE   P22MACR1 (EXE) Macro to initialize
                                .MODEL  SMALL
                                .STACK  64
                                .DATA
0000 54 65 73 74 20 6F  MESSGE  DB      'Test of macro instruction',13,10,'$'
     66 20 6D 61 63 72
     6F 20 69 6E 73 74
     72 75 63 74 69 6F
     6E 0D 0A 24
                                .CODE
0000                    BEGIN   PROC    FAR
                                INITZ                   ;Macro instruction
0000 B8 ---- R        1         MOV     AX,@data
0003 8E D8            1         MOV     DS,AX
0005 8E C0            1         MOV     ES,AX
0007 B4 09                      MOV     AH,09H          ;Request display
0009 8D 16 0000 R               LEA     DX,MESSGE       ;Message
000D CD 21                      INT     21H
000F B8 4C00                    MOV     AX,4C00H        ;Exit to DOS
0012 CD 21                      INT     21H
0014                    BEGIN   ENDP
                                END     BEGIN
Macros:
        Name            Lines
INITZ                   3
Symbols:
        Name            Type    Value   Attr
BEGIN                   F PROC  0000    _TEXT   Length = 0014
MESSGE                  L BYTE  0000    _DATA
@CODE                   TEXT    _TEXT
@FILENAME               TEXT    p22macr1
message. When using the macro instruction, the programmer has to supply the name of the
message, which references a data area terminated by a dollar sign.
PROMPT  MACRO   MESSGE          ;Dummy argument
        MOV     AH,09H
        LEA     DX,MESSGE
        INT     21H
        ENDM                    ;End of macro
A dummy argument in a macro definition tells the assembler to match its name with any oc-
currence of the same name in the macro body. For example, the dummy argument MESSGE
also occurs in the LEA instruction.
When using the macro instruction PROMPT, you would supply as a parameter the
actual name of the message to be displayed, for example,
PROMPT MESSAGE2
In this case, MESSAGE2 has to be properly defined in the data segment. The parameter in
the macro instruction matches the dummy argument in the original macro definition:
The assembler has already matched the argument in the original macro definition with the
LEA statement in the body of the macro. It now substitutes the parameter of the macro
instruction, MESSAGE2, for the dummy argument in the macro definition, MESSGE. The
assembler substitutes MESSAGE2 for the occurrence of MESSGE in the LEA instruction
and would substitute it for any other occurrence of MESSGE.
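The substitution can be traced in a short sketch. The macro definition repeats PROMPT as described above; MESSAGE2 is assumed to be defined in the data segment as a string terminated by a dollar sign, and the comment lines show the instructions that the assembler is expected to generate:

```asm
PROMPT  MACRO   MESSGE          ;MESSGE is the dummy argument
        MOV     AH,09H          ;Request display
        LEA     DX,MESSGE       ;Dummy argument is replaced here
        INT     21H
        ENDM

        PROMPT  MESSAGE2        ;Macro instruction with its parameter
;       Expected macro expansion:
;               MOV     AH,09H
;               LEA     DX,MESSAGE2
;               INT     21H
```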
The macro definition and macro expansion are shown in full in Figure 22-2. The pro-
gram also defines the macro INITZ at the start and uses it in the code segment.
A dummy argument may contain any valid name, including a register name such as
CX. You may define a macro with any number of dummy arguments, separated by com-
mas, up to column 120 of a line. The assembler substitutes parameters of the macro in-
struction for dummy arguments in the macro definition, entry for entry, from left to right.
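As a sketch of a macro with more than one dummy argument, the following hypothetical macro SETCURS sets the cursor by means of INT 10H function 02H. The macro name and the argument names ROW and COL are assumptions for illustration; the assembler matches the first parameter with ROW and the second with COL:

```asm
SETCURS MACRO   ROW,COL         ;Two dummy arguments
        MOV     AH,02H          ;Request set cursor
        MOV     BH,00           ;Page 0
        MOV     DH,ROW          ;First parameter replaces ROW
        MOV     DL,COL          ;Second parameter replaces COL
        INT     10H
        ENDM

        SETCURS 12,40           ;Generates MOV DH,12 and MOV DL,40
```

Reversing the parameters, as in SETCURS 40,12, would assemble without error but set row 40 and column 12, because the matching is strictly by position.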
COMMENTS
You may code comments in a macro definition to clarify its purpose. A COMMENT di-
rective or a semicolon indicates a comment line. The following example uses a semicolon
to indicate a comment:
Because the default is to list only instructions that generate object code, the assembler does
not automatically display a comment when it expands a macro definition. If you want a
comment to appear within an expansion, use the listing directive .LALL (“list all,” includ-
ing the leading period) prior to requesting the macro instruction:
        .LALL
        PROMPT  MESSAGE1
                                page    60,132
                        TITLE   P22MACR2 (EXE) Use of parameters
                        INITZ   MACRO                   ;Define macro
                                MOV     AX,@data
                                MOV     DS,AX
                                MOV     ES,AX
                                ENDM                    ;End macro
                        PROMPT  MACRO   MESSGE          ;Define macro
                                MOV     AH,09H
                                LEA     DX,MESSGE
                                INT     21H
                                ENDM                    ;End macro
                                .MODEL  SMALL
                                .STACK  64
                                .DATA
0000 43 75 73 74 6F 6D  MESSG1  DB      'Customer name?', '$'
     65 72 20 6E 61 6D
     65 3F 24
000F 43 75 73 74 6F 6D  MESSG2  DB      'Customer address?', '$'
     65 72 20 61 64 64
     72 65 73 73 3F 24
                                .CODE
0000                    BEGIN   PROC    FAR
                                INITZ
0000 B8 ---- R        1         MOV     AX,@data
0003 8E D8            1         MOV     DS,AX
0005 8E C0            1         MOV     ES,AX
                                PROMPT  MESSG2
0007 B4 09            1         MOV     AH,09H
0009 8D 16 000F R     1         LEA     DX,MESSG2
000D CD 21            1         INT     21H
000F B8 4C00                    MOV     AX,4C00H        ;Exit to DOS
0012 CD 21                      INT     21H
0014                    BEGIN   ENDP
                                END     BEGIN
A macro definition could contain a number of comments, some of which you may want to
list and some to suppress. Still use .LALL to list them, but code double semicolons (;;)
before comments that are always to be suppressed. (The assembler default is .XALL, which
causes a listing only of instructions that generate object code.) On the other hand, you
may not want to list any of the source code of a macro expansion, especially if the macro
instruction is used several times in a program. In that case, code the listing directive
.SALL (“suppress all”), which reduces the size of the printed program, although it has no
effect on the size of the generated object module.
A listing directive holds effect throughout a program until another listing directive is
encountered. You can place them in a program to cause some macros to list only the gen-
erated object code (.XALL), some to list both object code and comments (.LALL), and
some to suppress listing both object code and comments (.SALL).
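The following sketch puts the two kinds of comments and the listing directives together. It assumes that MESSG1 and MESSG2 are defined in the data segment; the wording of the comments is illustrative:

```asm
PROMPT  MACRO   MESSGE
;       This macro displays any message
;;      Generates code that calls INT 21H function 09H
        MOV     AH,09H          ;Request display
        LEA     DX,MESSGE
        INT     21H
        ENDM

        .SALL                   ;Suppress listing of the expansion
        PROMPT  MESSG1
        .LALL                   ;List expansion and single-semicolon comments
        PROMPT  MESSG2
```

Under .LALL, the expansion of PROMPT MESSG2 should list the comment that begins with one semicolon, whereas the comment that begins with two semicolons never appears in a listing. In every case, the generated object code is the same.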
                                page    60,132
                        TITLE   P22MACR3 (EXE) Use of .LALL & .SALL
                                .MODEL  SMALL
                                .STACK  64
                                .DATA
0000                    MESSG1  DB      'Customer name?', 13, 10, '$'
                                .CODE
0000                    BEGIN   PROC    FAR
                                .SALL
                                INITZ
                                PROMPT  MESSG1
                                .LALL
                                PROMPT  MESSG2
                      1 ;       This macro displays any message
The program in Figure 22-3 illustrates the preceding features. It defines the two macros,
INITZ and PROMPT, described earlier. The code segment contains the listing directive
.SALL to suppress listing the expansion of INITZ and the first expansion of PROMPT. For
the second use of PROMPT, the listing directive .LALL causes the assembler to list the com-
ment and the expansion of the macro. But note that in the macro definition for PROMPT,
the comment in the macro expansion containing a double semicolon (;;) is not listed.
MASM 6.0 introduced the terms .LISTMACROALL, .LISTMACRO, and .NOLISTMACRO
for .LALL, .XALL, and .SALL, respectively.
A macro definition may contain a reference to another defined macro. Consider a simple
macro named DOS21 that loads a function in the AH register and issues INT 21H:
        INT     21H
        ENDM
To use this DOS21 macro to accept input from the keyboard, code
        LEA     DX,NAMEPAR
        DOS21   0AH
The generated code for DOS21 would load function 0AH into the AH and issue INT 21H
for keyboard input. Now suppose you have another macro, named DISP, that loads
function 02H in the AH register and issues INT 21H to display a character:
DISP    MACRO   CHAR
        MOV     AH,02H
        MOV     DL,CHAR
        INT     21H
        ENDM
To display a question mark, for example, code the macro as DISP '?'. You could
change DISP to take advantage of the DOS21 macro by referring to DOS21 within DISP’s
macro definition:
DISP    MACRO   CHAR
        MOV     DL,CHAR
        DOS21   02H
        ENDM
Now if you code the DISP macro as DISP '?', the assembler generates
        MOV     DL,'?'
        MOV     AH,02H
        INT     21H
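For reference, the two macros can be sketched side by side as complete definitions. The dummy argument name DOSFUNC is an assumption for illustration; any valid name would serve. Because DISP refers to DOS21, DOS21 has to be defined before the first use of DISP:

```asm
DOS21   MACRO   DOSFUNC         ;DOSFUNC is a hypothetical argument name
        MOV     AH,DOSFUNC      ;Load the requested function
        INT     21H
        ENDM

DISP    MACRO   CHAR
        MOV     DL,CHAR         ;Character to display
        DOS21   02H             ;Nested macro instruction
        ENDM

        DISP    '?'             ;Generates MOV DL,'?' / MOV AH,02H / INT 21H
```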
Figure 22-4 illustrates the use of LOCAL. The purpose of the program is to perform
division by successive subtraction. The routine subtracts the divisor from the dividend and
adds 1 to the quotient until the dividend is less than the divisor. The procedure requires two
labels: COMP for the loop address and OUT for exiting the procedure on completion. Both
COMP and OUT are defined as LOCAL and may have any valid names.
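A sketch of such a macro definition, reconstructed from the expansion that the assembled listing shows, follows. The dummy argument names DIVIDEND, DIVISOR, and QUOTIENT are assumptions, and the exit label is named DONE here instead of OUT, only to avoid any clash with the OUT instruction mnemonic in assemblers that reserve it:

```asm
DIVIDE  MACRO   DIVIDEND,DIVISOR,QUOTIENT
        LOCAL   COMP,DONE       ;Must directly follow the MACRO statement
;       AX = dividend, BX = divisor, CX = quotient
        MOV     AX,DIVIDEND     ;Set dividend
        MOV     BX,DIVISOR      ;Set divisor
        SUB     CX,CX           ;Clear quotient
COMP:
        CMP     AX,BX           ;Dividend < divisor?
        JB      DONE            ; yes, exit
        SUB     AX,BX           ;Dividend - divisor
        INC     CX              ;Add to quotient
        JMP     COMP
DONE:
        MOV     QUOTIENT,CX     ;Store quotient
        ENDM
```

For a dividend of 150 and a divisor of 27, the loop subtracts five times (150 - 135 = 15), so the stored quotient is 5. A divisor of 0 would cause the loop to repeat until the CX wraps around, which is why a later refinement tests the divisor.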
                        TITLE   P22MACR4 (EXE) Use of LOCAL
                                .MODEL  SMALL
                                .STACK  64
                                .DATA
0000 0096               DIVDND  DW      150             ;Dividend
0002 001B               DIVSOR  DW      27              ;Divisor
0004 0000               QUOTNT  DW      ?               ;Quotient
                                .CODE
0000                    BEGIN   PROC    FAR
                                .LALL
                                INITZ
0000 B8 ---- R        1         MOV     AX,@data
0003 8E D8            1         MOV     DS,AX
0005 8E C0            1         MOV     ES,AX
                                DIVIDE  DIVDND,DIVSOR,QUOTNT
                      1 ;       AX = div'd, BX = divisor, CX = quotient
0007 A1 0000 R        1         MOV     AX,DIVDND       ;Set dividend
000A 8B 1E 0002 R     1         MOV     BX,DIVSOR       ;Set divisor
000E 2B C9            1         SUB     CX,CX           ;Clear quotient
0010                  1 ??0000:
0010 3B C3            1         CMP     AX,BX           ;Div'd < divisor?
0012 72 05            1         JB      ??0001          ; yes, exit
0014 2B C3            1         SUB     AX,BX           ;Div'd - divisor
0016 41               1         INC     CX              ;Add to quotient
0017 EB F7            1         JMP     ??0000
0019                  1 ??0001:
0019 89 0E 0004 R     1         MOV     QUOTNT,CX       ;Store quotient
001D B8 4C00                    MOV     AX,4C00H        ;Exit to DOS
0020 CD 21                      INT     21H
0022                    BEGIN   ENDP
                                END     BEGIN
In the macro expansion, the generated symbolic label for COMP is ??0000 and for
OUT is ??0001. If you use the DIVIDE macro instruction again in the same program, the
symbolic labels for the next macro expansion would become ??0002 and ??0003, respec-
tively. In this way, the feature ensures that labels generated within a program are unique.
INITZ   MACRO
        MOV     AX,@data
        MOV     DS,AX
        MOV     ES,AX
        ENDM
PROMPT  MACRO   MESSGE
        MOV     AH,09H
        LEA     DX,MESSGE
        INT     21H
        ENDM
To use any of the cataloged macros, instead of coding MACRO definitions at the start of
the program, use an INCLUDE directive like this:
INCLUDE D:\MACRO.LIB
INITZ
The assembler accesses the file named MACRO.LIB on drive D and includes both macro
definitions, INITZ and PROMPT, into the program. In this example, only INITZ is actu-
ally required. The assembled listing will contain a copy of the macro definitions, indicated
by the letter C in column 30 of the LST file. Following each macro instruction will be the
expansion of the macro, along with its generated object code, indicated by a plus (+) in col-
umn 31.
Since a MASM assembly (up to and including version 5.1) is a two-pass operation,
you can use the following statements to cause INCLUDE to occur only on pass 1 (instead
of both passes):
        IF1
        INCLUDE D:\MACRO.LIB
        ENDIF
IF1 and ENDIF are conditional directives. IF1 tells the assembler to access the named li-
brary only on pass 1 of the assembly. ENDIF terminates the IF logic. A copy of the macro
definition no longer appears on the listing—a saving of both time and space. (MASM ver-
sions 6.0 and on do not need directives that refer to two passes.)
The program in Figure 22-5 contains the previously described IF1, INCLUDE, and
ENDIF statements, although the assembler lists only the ENDIF in the LST file. The two
macro instructions used in the code segment, INITZ and PROMPT, are both cataloged in
                                page    60,132
                        TITLE   P22MACR5 (EXE) Test of INCLUDE
                                .MODEL  SMALL
                                .STACK  64
                                .DATA
0000 54 65 73 74 20 6F  MESSGE  DB      'Test of macro', '$'
     66 20 6D 61 63 72
     6F 24
                                .CODE
0000                    BEGIN   PROC    FAR
                                INITZ
0000 B8 ---- R        1         MOV     AX,@data
0003 8E D8            1         MOV     DS,AX
0005 8E C0            1         MOV     ES,AX
                                PROMPT  MESSGE
0007 B4 09            1         MOV     AH,09           ;Request display
0009 8D 16 0000 R     1         LEA     DX,MESSGE
000D CD 21            1         INT     21H
000F B8 4C00                    MOV     AX,4C00H        ;Exit to DOS
0012 CD 21                      INT     21H
0014                    BEGIN   ENDP
                                END     BEGIN
MACRO.LIB. They were simply stored together as a disk file under that name by means
of an editor program.
The placement of INCLUDE is not critical, but the directive must appear before any
macro instruction that references the library entry.
Execution of an INCLUDE statement causes the assembler to include all the macro defini-
tions that are in the specified library. Suppose, however, that a library contains the macros
INITZ, PROMPT, and DIVIDE, but a program requires only INITZ. The PURGE direc-
tive enables you to “delete” the unwanted macros PROMPT and DIVIDE from the cur-
rent assembly:
        IF1
        INCLUDE D:\MACRO.LIB
        ENDIF
        PURGE   PROMPT,DIVIDE
A PURGE operation facilitates only the assembly of a program and has no effect on
macros stored in the library.
CONCATENATION
The ampersand (&) character tells the assembler to join (concatenate) text or symbols. The
following MOVE macro provides for generating the MOVSB, MOVSW, or MOVSD
instruction:
MOVE    MACRO   TAG
        REP MOVS&TAG
        ENDM
A user could code this macro instruction as MOVE B, MOVE W, or MOVE D. The as-
sembler will concatenate the parameter with the MOVS instruction, to produce REP
MOVSB, REP MOVSW, or REP MOVSD, respectively. (This example is somewhat triv-
ial and is for illustrative purposes only.)
REPETITION DIRECTIVES
The repetition directives REPT, IRP, and IRPC cause the assembler to repeat a block of
statements terminated by ENDM. (MASM 6.0 introduced the terms REPEAT, FOR, and
FORC for REPT, IRP, and IRPC, respectively.) These directives do not have to be
contained in a MACRO definition, but if they are, one ENDM is required to end the
repetition and a second ENDM to end the MACRO definition.
REPT: Repeat
The REPT directive causes repetition of a block of statements up to ENDM according to
the number of times in the expression entry:
REPT expression
The following example initializes the value N to 0 and then repeats the generation of
DB N five times:
N = 0
        REPT    5
N = N + 1
        DB      N
        ENDM
The result is five generated DB statements, DB 1 through DB 5. A use for REPT could be
to define a table or part of a table. The next example defines a macro that uses REPT for
beeping the speaker five times:
BEEPSPKR MACRO
        MOV     AH,02H          ;Request output
        MOV     DL,07           ;Beep character
        REPT    5               ;Repeat five times
        INT     21H             ;Call DOS
        ENDM                    ;End of REPT
        ENDM                    ;End of MACRO
The arguments, contained in angle brackets, are any number of valid symbols, including
string, numeric, or arithmetic constants. The assembler generates a block of code for each
argument. In the following example, the assembler generates DB 3, DB 9, DB 17, DB 25,
and DB 28:
        IRP     N,<3,9,17,25,28>
        DB      N
        ENDM
The assembler generates a block of code for each character in the string. In the following
example, the assembler generates DW 3 through DW 8:
IRPC N, 345678
DW N
ENDM
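Repetition directives are not limited to defining data. As a sketch, the following hypothetical macros use IRP to generate a PUSH or a POP for each register in a list; the macro names SAVEREGS and RESTREGS and the choice of registers are assumptions for illustration:

```asm
SAVEREGS MACRO
        IRP     REG,<AX,BX,CX,DX>
        PUSH    REG             ;Generates PUSH AX through PUSH DX
        ENDM                    ;End of IRP
        ENDM                    ;End of MACRO

RESTREGS MACRO
        IRP     REG,<DX,CX,BX,AX>
        POP     REG             ;Generates POP DX through POP AX
        ENDM                    ;End of IRP
        ENDM                    ;End of MACRO
```

The second list is in reverse order because the stack returns values last in, first out. Note the two ENDM directives in each definition, one for the repetition and one for the macro.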
CONDITIONAL DIRECTIVES
Assembly language supports a number of conditional directives. We used IF1 earlier to
include a library entry only during pass 1 of an assembly. Conditional directives are most
useful within a macro definition, but are not limited to that purpose. Every IF directive
must have a matching ENDIF to terminate a tested condition. One optional ELSE may provide
an alternative action. Here is the general format for the IF family of conditional directives:
        IFxx    (condition)
          . . .         (conditional block)
        ELSE    (optional)
          . . .         (conditional block)
        ENDIF   (end of IF)
IF and IFE can use the relational operators EQ (equal), NE (not equal), LT (less than),
LE (less than or equal), GT (greater than), and GE (greater than or equal) as, for example,
in the statement
IF expressionl EQ expression2
Here’s a simple example of the use of IFNB (if not blank). All INT 21H requests re-
quire a function in the AH register, and some requests also require a value in the DX. The
macro DOS21 uses IFNB to test for a nonblank argument for the DX; if the result is true
(the argument is nonblank), the assembler generates the MOV instruction that loads the DX:
Using DOS21 for simple keyboard input requires only loading the AH with a value,
in this case, function 01H:
DOS21 01
The assembler generates MOV AH,01 and INT 21H. Input of a character string requires
function 0AH in the AH and the input address in the DX. You could code the DOS21
macro as
        DOS21   0AH,IPFIELD
The assembler then generates both the MOV and the INT 21H instructions.
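A sketch of a DOS21 definition that behaves as described follows. The dummy argument names DOSFUNC and DXADDRES are assumptions for illustration, and IPFIELD is assumed to be an input parameter list defined in the data segment:

```asm
DOS21   MACRO   DOSFUNC,DXADDRES
        MOV     AH,DOSFUNC              ;Load the function
        IFNB    <DXADDRES>              ;Second parameter supplied?
        MOV     DX,OFFSET DXADDRES      ; yes, load its address
        ENDIF
        INT     21H
        ENDM

        DOS21   01              ;Generates MOV AH,01 and INT 21H
        DOS21   0AH,IPFIELD     ;Also generates MOV DX,OFFSET IPFIELD
```

The test occurs at assembly time, so the first macro instruction generates no code at all for the DX, not a skipped instruction.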
        ENDIF
        IF      CNTR
        EXITM
        ENDIF
If CNTR has been set to a nonzero value, the assembler generates the comment and exits
(EXITM) from any further macro expansion. Note that an initial instruction clears CNTR
to 0 and also that the IFNDEF blocks need only to set CNTR to 1 rather than increment it.
If the assembler passes all the tests safely, it generates the macro expansion. In the
code segment, the second DIVIDE macro instruction contains an invalid dividend and quo-
tient and generates only comments. A way to improve the macro would be to test whether
the divisor is nonzero and whether the dividend and divisor have the same sign; for these
purposes, use assembly instructions rather than conditional directives.
                                page    60,132
                        TITLE   P22MACR6 (EXE) Test of IF and IFNDEF
                                .MODEL  SMALL
                                .STACK  64
                                .DATA
0000 0096               DIVDND  DW      150             ;Dividend
0002 001B               DIVSOR  DW      27              ;Divisor
0004 0000               QUOTNT  DW      ?               ;Quotient
                                .CODE
0000                    BEGIN   PROC    FAR
                                .LALL
                                INITZ
0000 B8 ---- R        1         MOV     AX,@data        ;Initialize
0003 8E D8            1         MOV     DS,AX           ; segment
0005 8E C0            1         MOV     ES,AX           ; registers
                                DIVIDE  DIVDND,DIVSOR,QUOTNT
                      1 CNTR    = 0
                      1 ;       AX = div'nd, BX = div'r, CX = quot't
0007 A1 0000 R        1         MOV     AX,DIVDND       ;Set dividend
000A 8B 1E 0002 R     1         MOV     BX,DIVSOR       ;Set divisor
000E 2B C9            1         SUB     CX,CX           ;Clear quotient
0010                  1 ??0000:
0010 3B C3            1         CMP     AX,BX           ;Div'd < divisor?
0012 72 05            1         JB      ??0001          ; yes, exit
IFIDN <&TAG>,<B>
In the definition, the first IFIDN generates REP MOVSB if you code MOVIF B as a macro
instruction. The second IFIDN generates REP MOVSW if you code MOVIF W as a macro
instruction. If a user does not supply B or W, the assembler generates a comment and
defaults to MOVSB. (The normal use of the ampersand (&) operator is for concatenation.)
The three examples in the code segment of MOVIF test for B, for W, and for an invalid
condition. Don’t attempt to execute the program as it stands, since the CX, SI, and DI
registers need to contain proper values for the MOVS instructions. Admittedly, this macro
is not very useful, since its purpose is to illustrate the use of conditional directives in a
simple manner. By now, however, you should be able to develop some meaningful macros.
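A sketch of the MOVIF definition, reconstructed from the expansions that the assembled listing shows, follows. The dummy argument name TAG comes from the IFIDN statement; the exact layout is an assumption:

```asm
MOVIF   MACRO   TAG
        IFIDN   <&TAG>,<B>      ;Parameter is B?
        REP MOVSB               ; yes, move bytes
        EXITM                   ; and leave the macro
        ENDIF
        IFIDN   <&TAG>,<W>      ;Parameter is W?
        REP MOVSW               ; yes, move words
        ELSE
;       No B or W tag, default to B
        REP MOVSB
        ENDIF
        ENDM
```

The EXITM in the first block matters: without it, a parameter of B would fail the second IFIDN, fall into the ELSE block, and generate a second REP MOVSB.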
                                .STACK  64
                                .CODE
0000                    BEGIN   PROC    FAR
                                .LALL
                                INITZ
0000 B8 ---- R        1         MOV     AX,@data
0003 8E D8            1         MOV     DS,AX
0005 8E C0            1         MOV     ES,AX
                                MOVIF   B
                      1         IFIDN   <B>,<B>
0007 F3/ A4           1         REP MOVSB
                      1         EXITM
                                MOVIF   W
                      1         IFIDN   <W>,<W>
0009 F3/ A5           1         REP MOVSW
                      1         ENDIF
                                MOVIF
                      1         ELSE
                      1 ;       No B or W tag, default to B
000B F3/ A4           1         REP MOVSB
                      1         ENDIF
000D B8 4C00                    MOV     AX,4C00H        ;Exit to DOS
0010 CD 21                      INT     21H
0012                    BEGIN   ENDP
                                END     BEGIN
KEY POINTS
• A macro instruction is the use of the macro in a program. The code that a macro
instruction generates is the macro expansion.
• The .SALL, .LALL, and .XALL directives control the listing of comments and the
object code generated in a macro expansion.
• The LOCAL directive facilitates using names within a macro definition and must
appear immediately after the macro statement.
• The use of dummy arguments in a macro definition allows a user to code parameters
for more flexibility.
• A macro library makes macros available to other programs.
• Conditional directives enable you to validate macro parameters.
QUESTIONS
22-1. Under what circumstances would the use of macros be recommended?
22-2. Code the first and last lines for a simple macro named SETUP.
22-3. Distinguish between the body of a macro definition and the macro expansion.
22-4. What is a dummy argument?
22-5. Code the following statements: (a) Suppress all instructions that a macro generates; (b) list only
instructions that generate object code.
22-6. Code two macro definitions that perform multiplication: (a) MULTBY is to generate code that
multiplies a byte by a byte; (b) MULTWD is to generate code that multiplies a word by a word.
Include the multiplicands and multipliers as dummy arguments in the macro definition. Test
the execution of the macros with a small program that also defines the required data fields.
22-7. Store the macros defined in Question 22-6 in a macro library. Revise the program to
INCLUDE the library entries during pass 1 of the assembly.
22-8. Write a macro named BIPRINT that uses BIOS INT 17H to print. The macro should include
a test for the status of the printer and should provide for any defined print line with any length.
22-9. Revise the macro in Figure 22-6 so that it bypasses the division if the divisor is zero.
CHAPTER 23
Linking to Subprograms
OBJECTIVE:
To cover the programming techniques involved in link-
ing and executing separately assembled programs.
INTRODUCTION
Up to this chapter, the programs we have presented have consisted of one stand-alone as-
sembled module. It is possible, however, to develop a program that consists of a main pro-
gram linked with one or more separately assembled subprograms. The following are
reasons for organizing a program into subprograms:
• To link between languages, for example, to combine the computing power of a
high-level language with the processing efficiency of assembly language.
• To facilitate the development of large projects, in which different teams produce their
modules separately.
• To overlay parts of a program during execution because of the program’s large size.
Each program is assembled separately and generates its own unique object (.OBJ)
module. The linker then links the object modules into one combined executable (.EXE)
module. Typically, the main program is the one that begins execution, and it calls one or
more subprograms. Subprograms in turn may call other subprograms.
[Figure 23-1: Hierarchy of a main program and three subprograms, parts (a) and (b)]
Figure 23-1 shows two examples of a hierarchy of a main program and three
subprograms. In part (a), the main program calls subprograms 1, 2, and 3. In part (b), the
main program calls subprograms 1 and 2, and only subprogram 1 calls subprogram 3.
There are numerous ways to organize subprograms, but the organization has to make
sense to the assembler, to the linker, and for execution. You also have to watch out for sit-
uations in which, for example, subprogram 1 calls subprogram 2, which calls subprogram
3, which in turn calls subprogram 1. This process, known as recursion, can be made to work,
but, if not handled carefully, can cause interesting execution bugs.
SEGMENTS
This section covers a number of options used for segments. The general format for the full
SEGMENT directive is
Align Type
The align operator tells the assembler to align the named segment beginning on a particu-
lar storage boundary:
BYTE Byte boundary, for a segment of a subprogram that is to be combined
with that of another program. Byte alignment is generally suitable for
programs run on an 8088 processor.
WORD Word boundary, for a segment of a subprogram that is to be combined
with that of another program. Word alignment is generally suitable for
programs run on 8086/80286 processors.
DWORD Doubleword boundary, normally for the 80386 and later processors.
PARA Paragraph boundary (divisible by 16, or 10H), the default and the most
commonly used alignment for both main programs and subprograms.
PAGE Page boundary (divisible by 256, or 100H).
Omitting the align operator from the first segment causes a default to PARA. Omit-
ting it from succeeding segments causes a default to PARA if the name is unique; if it is not
unique, the default is the alignment type of the previously defined segment of the same name.
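The effect of each align type can be checked with a little arithmetic. The following Python sketch is only an illustration: the names ALIGN and align_up are assumptions made for this example and are not part of any assembler or linker. It rounds an address up to the next boundary of the requested type.

```python
# Boundary size in bytes for each align type described above.
ALIGN = {"BYTE": 1, "WORD": 2, "DWORD": 4, "PARA": 16, "PAGE": 256}

def align_up(address, align_type="PARA"):
    """Return the first address >= address that lies on the given boundary."""
    size = ALIGN[align_type]
    return (address + size - 1) // size * size

# Where would a segment start if the next free location were offset 16H?
for name in ALIGN:
    print(f"{name:5s} -> {align_up(0x16, name):05X}H")
```

With PARA, a next free location of 16H is rounded up to 20H, which is why a paragraph-aligned segment address always ends in a zero digit.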
Combine Type
The combine operator tells the assembler and linker whether to combine segments or to
keep them separate. We have already used the STACK combine type. Other combine types
relevant to this chapter are NONE, PUBLIC, and COMMON:
NONE The segment is to be logically separate from other segments, although
they may all end up physically adjacent. This type is the default
for full segment directives.
PUBLIC The linker is to combine the segment with all other segments that are
defined as PUBLIC and have the same segment name and class. The
assembler calculates offsets from the beginning of the first segment.
In effect, the combined segment contains a number of sections, each
beginning with a SEGMENT directive and ending with ENDS. This
type is the default for simplified segment directives.
COMMON If COMMON segments have the same name and class, the linker gives
them the same base address. During execution, the second segment
overlays the first one. The largest segment determines the length of the
common area.
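The difference between PUBLIC and COMMON can be pictured as two layout rules. The sketch below is a simplified model, not a description of how any particular linker is written: the function names are invented for the illustration, and it assumes that every PUBLIC section is aligned on a paragraph boundary.

```python
def align_up(address, boundary=16):
    """Round address up to the next multiple of boundary."""
    return (address + boundary - 1) // boundary * boundary

def combine_public(lengths, boundary=16):
    """PUBLIC: sections follow one another; return (start offsets, total length)."""
    offsets, next_free = [], 0
    for length in lengths:
        start = align_up(next_free, boundary)
        offsets.append(start)
        next_free = start + length
    return offsets, next_free

def combine_common(lengths):
    """COMMON: every section starts at offset 0; the largest sets the length."""
    return [0] * len(lengths), max(lengths)

# Two sections of 16H and 11H bytes with the same name and class.
pub_offsets, pub_length = combine_public([0x16, 0x11])
com_offsets, com_length = combine_common([0x16, 0x11])
print("PUBLIC:", [f"{o:04X}H" for o in pub_offsets], f"length {pub_length:04X}H")
print("COMMON:", [f"{o:04X}H" for o in com_offsets], f"length {com_length:04X}H")
```

Under PUBLIC the second section receives its own offset within the combined segment; under COMMON both sections share offset zero and the longer one fixes the size.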
Class Type
We have already used the class names ‘Stack,’ ‘Data,’ and ‘Code.’ You can assign the same
class name to related segments so that the assembler and linker group them together. That
is, they are to appear as segments one after the other, but not combined into one segment
unless the PUBLIC combine option is also coded. The class entry may contain any valid
name, contained in single quotes, although the name ‘Code’ is recommended for the code
segment.
The following two unrelated SEGMENT statements generate identical results,
namely, an independent code segment aligned on a paragraph boundary (the second
statement relies on the defaults for align and combine):
CODESEG SEGMENT PARA NONE 'Code'
CODESEG SEGMENT 'Code'
We explained fully defined segment directives in Chapter 4, but have used the sim-
plified segment directives in subsequent chapters. Since full segment directives can provide
tighter control when assembling and linking subprograms, most examples in this chapter
use them.
Program examples in this and later chapters illustrate many of the Align, Combine,
and Class options.
INTRASEGMENT CALLS
The CALL instructions used to this point have been intrasegment calls; that is, the called
procedure is in the same code segment as that of the calling procedure. An intrasegment
CALL is near if the called procedure is defined as or defaults to NEAR (that is, within 32K).
The CALL operation pushes the IP register onto the stack and replaces the IP with the
offset of the destination address. Thus a near CALL references a (near) procedure within
the same segment.
Now consider an intrasegment CALL that consists of object code E8 2000, where E8
is the operation code and 2000 is the operand, stored in reversed-byte sequence for the
value 0020H. The operation pushes the IP onto the stack and uses the 0020H to set the
offset of the called procedure in the IP. (Strictly, the operand of E8 is a displacement
that the processor adds to the IP of the instruction following the CALL.) The processor
then combines the current address in the CS with the offset in the IP for the next
instruction to execute. On exit from the called procedure, a (near) RET pops the stored
IP off the stack and into the IP and returns to the instruction following the CALL:
        CALL    nearproc        ;Push IP, link to nearproc
        ...
nearproc PROC   NEAR
        ...
        RET                     ;Pop IP, return to caller
nearproc ENDP
An intrasegment call may be near, as described, or far if the call is to a procedure de-
fined as far within the same segment. RET is near if it appears in a NEAR procedure and
far if it appears in a FAR procedure.
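The near CALL and RET sequence can be traced with a few lines of Python. This is a sketch under stated assumptions: the function names are invented, the return offset 0103H is an arbitrary value chosen for the illustration, and the operand of E8 is treated as a displacement added to the offset of the instruction that follows the CALL, which is how the processor handles it.

```python
def near_call(ip_after_call, displacement, stack):
    """Push the return offset, then add the 16-bit displacement to the IP."""
    stack.append(ip_after_call)
    return (ip_after_call + displacement) & 0xFFFF

def near_ret(stack):
    """Pop the saved offset back into the IP."""
    return stack.pop()

operand = bytes.fromhex("2000")             # operand bytes as stored after E8
disp = int.from_bytes(operand, "little")    # reversed-byte sequence -> 0020H
stack = []
ip = near_call(0x0103, disp, stack)         # 0103H: assumed offset of the next instruction
print(f"IP in procedure: {ip:04X}H")
ip = near_ret(stack)
print(f"IP after RET:    {ip:04X}H")
```

Only the IP is saved and restored; the CS never changes, which is what makes the call near.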
INTERSEGMENT CALLS
A CALL is classed as far if the called procedure is defined as FAR or as EXTRN, often in
another segment. The CALL operation first pushes the contents of the CS register onto the
stack and inserts a new segment address in the CS. It then pushes the IP onto the stack and
inserts a new offset address in the IP. (The pushed CS and IP values provide the address of
the instruction immediately following the CALL.) In this way, both addresses of the code
segment and the offset are saved for the return from the called procedure. A call to another
segment is always an intersegment far call:
Consider an intersegment CALL that consists of object code 9A 0002 AF04. Hex 9A is the
operation code for an intersegment CALL. The operation pushes the current CS onto the
stack and stores the new segment address AF04 as 04AF in the CS. It then pushes the IP
onto the stack and stores the new offset 0002 as 0200 in the IP. The CS and IP values
combine to establish the address of the first instruction to execute in the called subprogram:
Code segment:        04AF0H
Offset in IP:      + 0200H
Effective address:   04CF0H
MAINPROG
        CALL    SUBPROG

        PUBLIC  SUBPROG
SUBPROG PROC    FAR
        RET
SUBPROG ENDP
Figure 23-2 Intersegment Call
On exit from the called procedure, an intersegment (far) RET reverses the CALL operation,
popping both the original IP and CS addresses back into their respective registers. The
CS:IP pair now points to the address of the instruction following the original CALL, where
execution resumes.
The difference then between a near and a far CALL is basically that a near CALL re-
places only the IP offset, whereas a far CALL replaces both the CS segment address and
the IP offset.
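The far CALL just described can be followed step by step in Python. The sketch below is illustrative only: the function names are invented, and the return address 1000H:0105H is an assumed value. The operand bytes 0002 AF04 are taken from the example above.

```python
def physical(segment, offset):
    """20-bit real-mode address: segment * 10H + offset."""
    return ((segment << 4) + offset) & 0xFFFFF

def far_call(cs, ip_after_call, operand, stack):
    """operand: the four bytes after 9AH (offset word, then segment word)."""
    new_ip = int.from_bytes(operand[0:2], "little")
    new_cs = int.from_bytes(operand[2:4], "little")
    stack.append(cs)               # CS is pushed first ...
    stack.append(ip_after_call)    # ... then the IP
    return new_cs, new_ip

def far_ret(stack):
    """Pop the IP, then the CS, reversing the far CALL."""
    ip = stack.pop()
    cs = stack.pop()
    return cs, ip

stack = []
cs, ip = far_call(0x1000, 0x0105, bytes.fromhex("0002AF04"), stack)
print(f"Called:   {cs:04X}:{ip:04X} -> {physical(cs, ip):05X}H")
cs, ip = far_ret(stack)
print(f"Returned: {cs:04X}:{ip:04X}")
```

A near CALL would have pushed and replaced only the IP; here both registers are saved, replaced, and later restored.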
EXTRN AND PUBLIC ATTRIBUTES
SUBPROG in its turn contains a PUBLIC directive that tells the assembler and linker
that another module has to know the address of SUBPROG. In a later step, when both
MAINPROG and SUBPROG are successfully assembled into object modules, they may be
linked as follows:
The linker matches EXTRNs in one object module with PUBLICs in the other and inserts
any required offset addresses. It then combines the two object modules into one executable
module. If unable to match references, the linker supplies error messages; watch for these
before attempting to execute the module.
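The matching step can be modeled as a simple table lookup. The following Python sketch is a toy model and not the algorithm of any real linker: the dictionary layout and the function name resolve are assumptions made for the illustration.

```python
def resolve(modules):
    """modules: {name: {"public": {symbol: offset}, "extrn": [symbols]}}.
    Return (resolved, unresolved) for every EXTRN reference."""
    publics = {}
    for mod, info in modules.items():
        for sym, offset in info["public"].items():
            publics[sym] = (mod, offset)
    resolved, unresolved = {}, []
    for mod, info in modules.items():
        for sym in info["extrn"]:
            if sym in publics:
                resolved[(mod, sym)] = publics[sym]
            else:
                unresolved.append((mod, sym))
    return resolved, unresolved

modules = {
    "MAINPROG": {"public": {}, "extrn": ["SUBPROG"]},
    "SUBPROG":  {"public": {"SUBPROG": 0x0000}, "extrn": []},
}
resolved, unresolved = resolve(modules)
print("resolved:  ", resolved)
print("unresolved:", unresolved)
```

An EXTRN name with no matching PUBLIC lands in the unresolved list, which corresponds to the linker error messages mentioned above.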
The EXTRN directive tells the assembler that the named item—a data item, procedure,
or label—is defined in another assembly. (MASM 6.0 introduced the term EXTERN.)
EXTRN has the following format:
    EXTRN name:type [, ... ]
You can define more than one name up to the end of the line or code additional EXTRN
statements. The other assembly module in its turn must define the name and identify it as
PUBLIC. The type entry may be ABS (a constant), BYTE, DWORD, FAR, NEAR,
WORD, or a name defined by an EQU and must be valid in terms of the actual definition
of a name:
• BYTE, WORD, and DWORD identify data items that one module references but
another module defines.
• NEAR and FAR identify a procedure or instruction label that one module references
but another module defines.
The PUBLIC directive tells the assembler and linker that the address of a specified symbol
defined in the current assembly is to be available to other modules. The general format for
PUBLIC is
    PUBLIC symbol [, ... ]
You can define more than one symbol up to the end of the line or code additional PUBLIC
statements. The symbol entry can be a label (including PROC labels), a variable, or a num-
ber. Invalid entries include register names and EQU symbols that define values greater than
two bytes.
The calling of far procedures and the use of EXTRN and PUBLIC should offer little
difficulty, although considerable care is required for making data defined in one module
known in other modules.
Use of EXTRN and PUBLIC for a Label 417
Let’s now examine three different ways of making data known between programs: us-
ing EXTRN and PUBLIC, defining common data in subprograms, and passing parameters.
The machine code for an intersegment CALL is 9AH. The operation pushes the CS containing
0F20[0] onto the stack and loads 0F22[0] (from the CALL operand) in the CS. It then pushes
the IP register onto the stack and loads 0000 in the IP. (We’ll show the register con-
tents in normal, not reversed, byte order.)
The next instruction to execute is CS:IP, or 0F22[0] plus 0000. What is at 0F220? It’s
the entry point to P23SUB1 at its first executable instruction, which you can calculate. The
main program began with the CS register containing 0F20[0]. According to the map, the
main code segment begins at offset 00090H and the subprogram code segment begins at
offset 000B0H, 20H bytes apart. Adding 20H to the address in the main program’s CS
supplies the effective address of the subprogram’s code segment:
CS address for P23MAIN1:       0F200H
Distance between segments:   + 00020H
CS address for P23SUB1:        0F220H
[Link map for object modules P23MAIN1+P23SUB1; listing not reproduced]
The program loader determines this address just as we have and substitutes it in the CALL
operand. P23SUB1 multiplies the two values in the AX and BX, with the product in the
DX:AX, and makes a far return to P23MAIN1 (because RET is in a FAR procedure).
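The loader's calculation can be reproduced from the map offsets alone. In this Python sketch the function name is invented, and the load segment is inferred from the figures in the text (a CS of 0F20H for a code segment whose map start is 00090H); it is an illustration of the arithmetic, not of the loader's actual code.

```python
def segment_address(load_segment, map_start):
    """Segment value for a segment whose link-map start is map_start (in bytes)."""
    assert map_start % 16 == 0, "segments here begin on paragraph boundaries"
    return load_segment + (map_start >> 4)

# Load segment inferred from the text: CS = 0F20H when the map start is 00090H.
load_segment = 0x0F20 - (0x00090 >> 4)
main_cs = segment_address(load_segment, 0x00090)
sub_cs = segment_address(load_segment, 0x000B0)
print(f"load segment:  {load_segment:04X}H")
print(f"P23MAIN1 code: {main_cs:04X}H")
print(f"P23SUB1 code:  {sub_cs:04X}H")
```

The two code segments differ by two paragraphs (20H bytes), so the CALL operand receives a segment value two greater than the main program's CS.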
Interesting results appear in the link map and the CALL object code. In the symbol table
following each assembly, the combine type for CODESG is PUBLIC, whereas in Figure
23-3 it was NONE. Also, the link map at the end now shows only one code segment. The
fact that both segments have the same name (CODESG), class (‘Code’), and PUBLIC at-
tribute caused the linker to combine the two logical code segments into one physical code
segment. Further, a trace of machine execution showed that the CALL is far; that is, even
though the call is within the same segment, it is to a FAR procedure:
9A 2000 200F
This far CALL stores 2000H in the IP as 0020H and 200FH in the CS register as 0F20[0].
Because the subprogram shares a common code segment with the main program, the CS
register is set to the same starting address, 0F20[0]. But the CS:IP for P23SUB2 now pro-
vide the following:
CS address for P23MAIN2 and P23SUB2:   0F200H
IP offset for P23SUB2:               + 0020H
Effective address of P23SUB2:          0F220H
The code segment of the subprogram therefore presumably begins at 0F220H. Is this cor-
rect? The link map doesn’t make the point clear, but you can infer the address from the
listing of the main program, which ends at offset 0015H. (The map shows 16H, which
is the next available location.) Since the code segment for the subprogram is defined as
PARA, it begins on a paragraph boundary (evenly divisible by 10H, so that the rightmost
digit is 0):
0F200H  Start of the main program
0F220H  Start of the subprogram (next paragraph boundary)
The linker sets the subprogram at the first paragraph boundary immediately following the
main program, at offset 0020H. Therefore, the code segment of the subprogram begins at
0F200H plus 0020H, or 0F220H.
[Link map for object modules P23MAIN2+P23SUB2; listing not reproduced]
Now let’s examine this same program defined with simplified segment directives.
This time, the new offset value is 16H, and the segment address is 0F17H. Because the sub-
program shares a common code segment with the main program, the CS register is set to
the same starting address, 0F17[0], for both. The address of P23SUB3 may therefore be cal-
culated as follows:
CS address for P23MAIN3 and P23SUB3:   0F170H
IP offset for P23SUB3:               + 0016H
Effective address of P23SUB3:          0F186H
You can infer the address from the listing of the main program, which ends at offset 0015H.
(The map shows 16H, which is the next available location.) Since the map shows the main
code segment beginning at 00000H, the next word boundary following 0015H is at 00016H,
where P23SUB3 begins.
                        .DATA
0000  0140              QTY     DW      0140H
0002  2500              PRICE   DW      2500H
                        .CODE
0000                    BEGIN   PROC    FAR
0000  B8 ---- R                 MOV     AX,@data
0003  8E D8                     MOV     DS,AX
0005  A1 0002 R                 MOV     AX,PRICE        ;Set up price
0008  8B 1E 0000 R              MOV     BX,QTY          ; and quantity
000C  9A 0000 ---- E            CALL    P23SUB3         ;Call subprogram
0011  B8 4C00                   MOV     AX,4C00H        ;Exit to DOS
0014  CD 21                     INT     21H
0016                    BEGIN   ENDP
                                END     BEGIN
Link Map
Object Modules: P23MAIN3+P23SUB3
Start   Stop    Length  Name    Class
00000H  00018H  00019H  _TEXT   CODE    <-- code segment
0001AH  0001DH  00004H  _DATA   DATA
00020H  0005FH  00040H  STACK   STACK
Object code A1 means move a word from memory to the AX, whereas 8B means move a
word from memory to the BX. (AX operations often require fewer bytes.) For P23SUB4,
the assembler has no way of knowing the locations of QTY and PRICE, so it has stored ze-
ros in the operands for both MOVs. Tracing through program execution reveals that the
linker has completed the object code operands as follows:
A1 0200
8B 1E 0000
The object code is now identical to that generated for the three preceding programs, where
the MOV instructions are in the calling program. This is a logical result because the
operands in all three programs reference the same data segment address in the DS register
and the same offset values.
The main program and the subprogram may define other data items, but only those
defined as PUBLIC and EXTRN are known in common to them.
[Link map for object modules P23MAIN4+P23SUB4; listing not reproduced]
        EXTRN   QTY:WORD
        ASSUME  CS:CODESG,DS:DATASG
        PUBLIC  P23SUB5
        RET
Symbols
Name      Type   Value  Attr
P23SUB5   PROC   0000   CODESG  Global  Length=0011
PRICE     WORD   0000   DATASG
QTY       WORD   0000   External
[Link map for object modules P23MAIN5+P23SUB5; listing not reproduced]
PASSING PARAMETERS
The stack frame is the portion of the stack that the calling program uses to pass parameters
and that the called subprogram uses for accessing the parameters. The called subprogram
may also use the stack frame for temporary storage of local data. The BP register acts as a
frame pointer. For passing parameters, we’ll make use of both the BP and SP registers.
In Figure 23-8, the calling program P23MAIN6 pushes both PRICE and QTY prior
to calling the subprogram P23SUB6. Initially, the SP contained the size of the stack, 80H.
Each word pushed onto the stack decrements the SP by 2. After the CALL, the stack frame
appears as follows:
Offset:    78H     7AH     7CH     7EH
Contents:  IP      CS      QTY     PRICE
1. A PUSH loaded PRICE (2500H) onto the stack frame at offset 7EH.
2. A PUSH loaded QTY (0140H) onto the stack frame at offset 7CH.
3. CALL pushed the contents of the CS (0F20H for this execution) onto the stack frame
at 7AH. Since the subprogram is PUBLIC, the linker combines the two code seg-
ments, and the CS address is the same for both.
4. CALL also pushed the contents of the IP register, 0012H, onto the stack frame at 78H.
The called program requires the use of the BP to access the parameters in the stack
frame. Its first action is to save the contents of the BP for the calling program, so it pushes
the BP onto the stack. In this example, the BP happens to contain zero, which PUSH stores
in the stack at offset 76H:
[Link map for object modules P23MAIN6+P23SUB6; listing not reproduced]
Before returning to the calling program, the routine pops the BP (returning the zero
address to the BP), which increments the SP by 2, from 76H to 78H.
The last instruction, RET, is a far return to the calling program, which performs the
following:
• Pops the word now at the top of the stack frame (0012H, stored in memory as 1200) to
the IP and increments the SP by 2, from 78H to 7AH.
• Pops the word now at the top (0F20H) into the CS and increments the SP by 2, from
7AH to 7CH.
Because of the two passed parameters at offsets 7CH and 7EH, the RET instruction
is coded as
RET 4
The 4, known as a pop-value, contains the number of bytes in the passed parameters (two
one-word parameters in this case). The RET operation adds the pop-value to the SP, cor-
recting it to 80H. In effect, because the parameters in the stack are no longer required, the
operation discards them and returns correctly to the calling program. Note that the POP and
RET operations increment the SP, but don’t actually erase the contents of the stack.
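The whole sequence, from the two PUSHes to RET 4, can be traced with a small stack model. The Python sketch below is an illustration only: the class Stack is invented for the purpose, the values are those quoted in the text, and the accesses at BP+6 and BP+8 are an assumption about how the subprogram reaches QTY and PRICE once BP holds 76H.

```python
class Stack:
    """Word stack in a small bytearray; SP starts at the stack size, 80H."""
    def __init__(self, size=0x80):
        self.mem = bytearray(size)
        self.sp = size
    def push(self, word):
        self.sp -= 2
        self.mem[self.sp:self.sp + 2] = word.to_bytes(2, "little")
    def pop(self):
        word = self.word_at(self.sp)
        self.sp += 2
        return word
    def word_at(self, offset):
        return int.from_bytes(self.mem[offset:offset + 2], "little")

s = Stack()
s.push(0x2500)              # PUSH PRICE       -> offset 7EH
s.push(0x0140)              # PUSH QTY         -> offset 7CH
s.push(0x0F20)              # CALL pushes CS   -> offset 7AH
s.push(0x0012)              # CALL pushes IP   -> offset 78H
s.push(0x0000)              # PUSH BP          -> offset 76H
bp = s.sp                   # MOV BP,SP
price = s.word_at(bp + 8)   # [BP+8] = PRICE (assumed access)
qty = s.word_at(bp + 6)     # [BP+6] = QTY (assumed access)
product = price * qty       # stands in for MUL; the result would be in DX:AX
s.pop()                     # POP BP           -> SP = 78H
ip = s.pop()                # RET pops IP      -> SP = 7AH
cs = s.pop()                # RET pops CS      -> SP = 7CH
s.sp += 4                   # pop-value of 4   -> SP = 80H
print(f"SP={s.sp:02X}H  CS:IP={cs:04X}:{ip:04X}  product={product:08X}H")
```

Note that the parameter words are still present in the stack memory after RET 4; only the SP has moved past them.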
If you follow the general rules discussed in this chapter, you should be able to link a
program consisting of more than two assembly modules and to make data known in all the
modules. But watch out for the size of the stack: For large programs, defining 64 words
could be a wise precaution, because of the many PUSH and CALL operations.
Chapter 24 covers some important concepts on memory management and executing
overlay programs. Chapter 26 provides additional features of segments, including defining
more than one code or data segment in the same assembly module and the use of GROUP
to combine these into a common segment.
This section explains how to link a Pascal program to an assembly language subprogram.
The simple Pascal program in Figure 23-9 links to an assembly language subprogram
whose purpose is just to set the cursor. The Pascal program is compiled to produce an .OBJ
module, and the assembly language program is assembled to produce an .OBJ module. The
linker then combines these two .OBJ modules into one .EXE executable module.
The Pascal program defines two items named temp_row and temp_col and accepts
entries for row and column from the keyboard into these variables. The program defines the
name of the assembly language subprogram as set_curs and defines the two parameters as
extern. It sends the addresses of temp_row and temp_col as parameters to the subprogram
to set the cursor to that location.
Figure 23-9 (excerpt of the Pascal program):
begin
  write( 'Enter cursor row: ' );
  readln( temp_row );
  write( 'Enter cursor column: ' );
  readln( temp_col );
  set_curs( temp_row, temp_col );
  write( 'New cursor location' );
The Pascal statement that “calls” the name of the subprogram and passes the parameters is
  set_curs( temp_row, temp_col );
Values pushed onto the stack are the calling program’s stack pointer, the return seg-
ment pointer, the return offset, and the addresses of the two passed parameters. The fol-
lowing shows the offsets for each entry in the stack:
Linking C and Assembly Language Programs 431
Since the assembly language subprogram has to use the BP register, you have to push
the BP onto the stack to save its address for the return to the Pascal calling program. Note
that the steps in the called subprogram are similar to those in the program in Figure 23-7.
The SP register normally addresses entries in the stack. But since you cannot use the
SP to act as an index register, the step after pushing the BP is to move the address in the SP
to the BP. This step enables you to use the BP as an index register to access entries in the
stack frame.
The next step is to access the addresses of the two parameters in the stack frame. The
first passed parameter, the row, is at offset 08H in the stack frame and can be accessed by
BP + 08H. The second passed parameter, the column, is at offset 06H and can be accessed
by BP + 06H.
Each of the two addresses in the stack frame has to be transferred to one of the available
index registers: BX, DI, or SI. This example uses [BP +08] to move the address of the row
to the SI and then uses [SI] to move the contents of the passed parameter to the DH register.
The column is transferred to the DL register in a similar way. Then the subprogram
uses the row and column in the DX register for INT 10H to set the cursor. On exit, the sub-
program pops the BP. The RET instruction requires an operand value that is two times the
number of parameters, in this case 2 × 2, or 4. Values are automatically popped off the
stack and control transfers back to the calling program.
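Because the Pascal caller passes addresses rather than values, the subprogram needs two steps to reach each parameter. The following Python sketch models that double lookup. The memory offsets 0010H and 0012H and the values 12 and 40 are made up for the illustration; only the frame offsets 08H and 06H come from the text.

```python
# Hypothetical data-segment contents: the offsets of temp_row and temp_col are invented.
memory = {0x0010: 12, 0x0012: 40}        # temp_row = 12, temp_col = 40

# Stack frame after PUSH BP / MOV BP,SP, as described in the text.
frame = {0x08: 0x0010, 0x06: 0x0012}     # [BP+08H] = address of row, [BP+06H] = address of column

si = frame[0x08]            # MOV SI,[BP+08]  address of the row
dh = memory[si] & 0xFF      # MOV DH,[SI]     contents of the row
si = frame[0x06]            # MOV SI,[BP+06]  address of the column
dl = memory[si] & 0xFF      # MOV DL,[SI]     contents of the column
dx = (dh << 8) | dl         # DH:DL as INT 10H sees them in the DX
print(f"DX = {dx:04X}H")
```

Had the caller passed the values themselves, the first lookup alone would have been enough.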
If you change a segment register, be sure to PUSH it on entry into and POP it on exit
from the subprogram. The recommended practice for a Pascal call is to preserve the DI, SI,
BP, DS, and SS registers. You can also use the stack to pass values from a subprogram to
a calling program. Although the subprogram in Figure 23-9 doesn’t return values, Pascal
would expect a subprogram to return them as a single word in the AX or as a pair of words
in the DX:AX.
This trivial program produces a module larger than 20K bytes. A compiler language
typically generates considerable overhead regardless of the size of the source program.
Do not assume that other Pascal versions necessarily follow the conventions we have
used here. The appropriate standard is that described in the compiler manual, usually in a
section whose title begins with “Interfacing ...” or “Mixed Languages ...”.
• For versions of C that are sensitive to uppercase and lowercase, the name of the as-
sembly language module should be in the same case as the C program’s reference.
• Most versions of C pass parameters onto the stack in a sequence that is the reverse of
that of other languages. Consider, for example, the C statement
    Adds (m, n);
The statement pushes n and then m onto the stack in that order and calls Adds.
On return from the called module, the C module (not the assembly language module)
adds 4 to the SP to discard the passed parameters. The typical procedure in the called
assembly language module for accessing the two passed parameters is as follows:
    PUSH BP          ;Save caller's BP
    MOV  BP,SP       ;Point BP to the stack frame
    ...              ;Access the parameters by way of the BP
    POP  BP          ;Restore caller's BP
    RET              ;Return; the C caller discards the parameters
• Some versions of C require that an assembly language module that changes the DI
and SI registers should push them on entry into and pop them on exit from the as-
sembly subprogram.
• The assembly language module should return values, if required, as one word in the
AX or two words in the DX:AX pair.
• For some versions of C, an assembly language program that sets the DF flag should
clear it (CLD) before returning.
Registers. The assembly language module must preserve the original values in the
BP, SP, CS, DS, SS, DI, and SI registers.
1. By reference, either as near (an offset in the default segment) or as far (an offset in
another segment). The called assembly module can directly alter the value defined in
the C module.
2. By value, in which the C caller passes a copy of the variable on the stack. The called
assembly module can alter the passed value, but has no access to the original C value.
If there is more than one parameter, C pushes them onto the stack from right to left.
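The two conventions met so far differ in the order of the pushes and in who discards the parameters. The Python sketch below contrasts them by tracking only the SP. It is a simplified model: the function names are invented, the C call is assumed to be near (as in a small memory model), and the other call is assumed to be far with RET 4, as in the earlier parameter-passing example.

```python
def call_c(args, sp=0x80):
    """C convention: push right to left; the caller adjusts the SP afterward."""
    pushed = list(reversed(args))
    sp -= 2 * len(pushed)        # parameters
    sp -= 2                      # near CALL pushes the return IP
    sp += 2                      # plain RET pops only the IP
    sp_after_ret = sp
    sp += 2 * len(pushed)        # caller: add 4 to the SP for two word parameters
    return pushed, sp_after_ret, sp

def call_pascal(args, sp=0x80):
    """Convention used earlier: push left to right; RET n discards the parameters."""
    pushed = list(args)
    sp -= 2 * len(pushed)        # parameters
    sp -= 4                      # far CALL pushes CS and IP
    sp += 4 + 2 * len(pushed)    # RET 4 pops CS:IP and adds the pop-value
    return pushed, sp

order, after_ret, after_cleanup = call_c(["m", "n"])
print("C:     ", order, f"SP after RET {after_ret:02X}H, after cleanup {after_cleanup:02X}H")
order2, final_sp = call_pascal(["row", "col"])
print("Pascal:", order2, f"SP after RET 4 {final_sp:02X}H")
```

In the C case the SP is still 4 short of its starting value when control returns, which is why the called module must issue RET with no pop-value.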
Returned values. The called assembly module uses the following registers for
any returned values:
On return from the called module, issue RET with no pop value.
Compiling and Assembling. Use the same memory model for both languages.
The assembly .MODEL statement indicates the C convention, such as .MODEL
SMALL,C. Also, use the appropriate assembly switch to preserve the case of (nonlocal)
names.
1. Separate modules. For this conventional method, you code the C and assembly pro-
grams separately. Use TCC to compile the C module, TASM to assemble the assem-
bly module, and TLINK to link them.
2. Inline Assembly Code. To compile the C module, you request TCC.EXE (the com-
mand version of Turbo C). Simply insert assembly statements, preceded by the key-
word asm, in the source code, as, for example,
Segments. The code segment must be named _TEXT. The data segments (two if
required) are named _DATA for data that is to be initialized on entry to a block and _BSS
for uninitialized data.
Naming conventions. The Turbo Assembler modules must use a naming con-
vention for segments and variables that is compatible with that of Turbo C. All assembler
references to functions and variables in the C module must begin with an underscore (_).
Further, since C is case sensitive, the assembly module should use the same case (upper or
lower) for any variable names in common with the C module.
Registers. The assembly module may freely use the AX, BX, CX, DX, ES, and
flags registers. It may also use the BP, SP, CS, DS, SS, DI, and SI registers, provided that
it saves (pushes) and restores (pops) them.
Return. The assembly program simply uses RET (with no pop-value) to return to
the C module. The C module pops the stack on reentry to it.
Example of a C Program
The program in Figure 23-10 illustrates linking a Turbo C program with an assembly mod-
ule. The program performs the same actions as the Pascal program in the previous section:
The C program accepts values from the keyboard for row and column and passes them to
the assembler subprogram. The assembler subprogram in its turn sets the cursor and returns
to the C module.
KEY POINTS
• The align operator tells the assembler to align the named segment, beginning on a
particular storage boundary.
• The combine operator tells the assembler and linker whether to combine segments or
to keep them separate.
• You can assign the same class name to related segments so that the assembler and
linker group them together.
• An intrasegment CALL is near if the called procedure is defined as or defaults to
NEAR (within 32K). An intrasegment call may be far if the call is to a far procedure
within the same segment.
• An intersegment CALL calls a procedure in another segment and is defined as FAR
or as EXTRN.
• In a main program that calls a subprogram, define the entry point as EXTRN; in the
subprogram, define the entry point as PUBLIC.
• If two code segments are to be linked into one segment, define them with the same
name, the same class, and the PUBLIC combine type.
• It is generally easier (but not necessary) to define common data in the main program.
The main program defines the common data as PUBLIC, and the subprogram (or sub-
programs) defines the common data as EXTRN.
#include <stdio.h>
        ...
        PUBLIC  _set_curs
_set_curs PROC  NEAR
        PUSH    BP              ;Caller's BP register
        MOV     BP,SP           ;Point to parameters
        ...
        POP     BP              ;Restore BP
        RET                     ;Return to caller
_set_curs ENDP
_TEXT   ENDS
        END
QUESTIONS
23-1. Provide four reasons for organizing a program into subprograms.
The next three questions refer to the general format for the SEGMENT directive:
23-2. (a) For the SEGMENT directive’s align option, what is the default? (b) What is the effect of
the BYTE option? (That is, what action does the assembler take?)
23-3. (a) For the SEGMENT directive’s combine option, what is the default? (b) When would you
use the PUBLIC option? (c) When would you use the COMMON option?
23-4. (a) What should the code segment’s class option be for the SEGMENT directive? (b) Two
segments have the same class, but not the PUBLIC combine option. What is the effect? (c)
Two segments have the same class, and both have the PUBLIC combine option. What is the
effect?
23-5. Distinguish between an intrasegment call and an intersegment call.
23-6. A program named MAINPRO is to call a subprogram named SUBPRO. (a) What statement
in MAINPRO informs the assembler that the name SUBPRO is defined outside its own as-
sembly? (b) What statement in SUBPRO is required to make its name known to MAINPRO?
23-7. Assume that MAINPRO in Question 23-6 has defined variables named QTY as DB, VALUE
as DW, and PRICE as DW. SUBPRO is to divide VALUE by QTY and is to store the quo-
tient in PRICE. (a) How does MAINPRO inform the assembler that the three variables are to
be known outside this assembly? (b) How does SUBPRO inform the assembler that the three
variables are defined in another assembly?
23-8. Combine Questions 23-6 and 23-7 into a working program and test it.
23-9. Revise Question 23-8 so that MAINPRO passes all three variables as parameters. Note, how-
ever, that SUBPRO is to return the calculated price intact in its parameter.
23-10. Expand Question 23-9 so that MAINPRO accepts quantity and value from the keyboard, sub-
program SUBCONV converts the ASCII amounts to binary, subprogram SUBCALC calcu-
lates the price, and subprogram SUBDISP converts the binary price to ASCII and displays
the result.
CHAPTER 24
DOS Memory Management
OBJECTIVE:
To describe the boot procedure, DOS initialization, the
program segment prefix, the environment, memory con-
trol, the program loader, and resident programs.
INTRODUCTION
This chapter describes DOS organization in detail. The operations introduced are DOS INT
2FH, function 4A01H, multiplex interrupt; and these INT 21H functions:
The four major DOS programs are the boot record, IO.SYS, MSDOS.SYS, and
COMMAND.COM:
1. The boot record is on track 0, sector 1, of any disk that you format with FORMAT
/S. When you initiate the computer, the system automatically loads the boot record
from disk into memory. The boot record, in turn, loads IO.SYS from disk into
memory.
2. IO.SYS is a low-level interface to the BIOS routines in ROM. On initiation, it deter-
mines the status of the devices and equipment associated with the computer and sets
interrupt table addresses for interrupts up to 20H. IO.SYS also handles input/output
between memory and external devices such as a video monitor or disk. It then loads
MSDOS.SYS.
3. MSDOS.SYS is a high-level interface to programs that sets interrupt table addresses
for interrupts 20H through 3FH. It manages the directory and files on disk, blocking
and deblocking of disk records, INT 21H functions, and a number of other services.
It then loads COMMAND.COM.
4. COMMAND.COM handles the various commands such as DIR and CHKDSK and
runs all requested .COM, .EXE, and .BAT programs. It is responsible for loading ex-
ecutable programs from disk into memory.
Figure 24—1 shows a map of memory after the DOS system programs have been
loaded. Details vary by system.
Beginning
Address     Contents
            User programs
            Resident programs (if any)
xxxx0H      Resident portion of COMMAND.COM
xxxx0H      MSDOS.SYS and IO.SYS
00500H      DOS communication area
00400H      BIOS data area
00000H      Interrupt address table
HIGH-MEMORY AREA
The processor uses a number of address lines to access memory. For the 80286 and later,
line number A20 can address a 64K space known as the high-memory area (HMA), from
FFFF:10H through FFFF:FFFFH, just above the DOS limit of one megabyte.
When the computer runs in real (8086) mode, it normally disables the A20 line so that
addresses that exceed this limit wrap around to the beginning of memory. Enabling the A20
line permits addressing locations in the HMA. Since DOS 5.0, you can ask CONFIG.SYS
to relocate DOS from low memory to the HMA, thereby freeing space for user programs.
You can use INT 21H, function 3306H (Get DOS version), to determine the presence of
DOS in the HMA:
MOV AX,3306H ;Request DOS version
INT 21H ;DH returns version flags (10H = DOS in HMA)
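The wraparound that the A20 line controls is easy to see by computing a few addresses at the top of memory. This Python sketch is an arithmetic illustration only; the function name and its a20_enabled parameter are invented for the example.

```python
def physical(segment, offset, a20_enabled):
    """Real-mode address; without A20 the result wraps at one megabyte."""
    address = (segment << 4) + offset
    return address if a20_enabled else address & 0xFFFFF

for off in (0x000F, 0x0010, 0xFFFF):
    on = physical(0xFFFF, off, True)
    wrapped = physical(0xFFFF, off, False)
    print(f"FFFF:{off:04X}  A20 on: {on:06X}H  A20 off: {wrapped:06X}H")
```

FFFF:000FH is the last byte below one megabyte. FFFF:0010H is the first byte of the HMA when A20 is enabled, and wraps to address zero when it is not.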
COMMAND.COM
The system loads the three portions of COMMAND.COM into memory either permanently
during a session or temporarily as required. The following describes the three parts:
1. The resident portion of COMMAND.COM immediately follows MSDOS.SYS (and its
data areas), where it resides during processing. The resident portion handles errors
for disk I/O and the following interrupts:
INT 22H   Terminate address
INT 23H   Ctrl+Break handler
INT 24H   Error detection on disk read/write or bad memory image of the FAT
INT 27H   Terminate but stay resident (TSR)
Each byte in the 20-byte default file handle table refers to an entry in a DOS table that de-
fines the related device or driver. Initially, the table contains 0101010002FF ... FF, where
the first 01 refers to the keyboard, the second 01 to the screen, and so forth:
The table of 20 handles explains why DOS allows a maximum of 20 files open at one time.
Normally, the word at PSP offset 32H contains the length of the table (14H, or 20), and 34H
contains its segment address in the form IP:CS, where the IP is 18H (the offset in the PSP)
and the CS is the segment address of the PSP.
Programs that need more than 20 open files have to release memory (INT 21H, func-
tion 4AH) and use function 67H (set maximum handle count):
The amount of memory required is one byte per handle, rounded up to the next para-
graph, plus 16 bytes. The operation creates the new handle table outside the PSP and up-
dates PSP locations 32H and 34H. An invalid operation sets the carry flag and sets an error
code in the AX.
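The memory requirement can be worked out for a few handle counts. The Python sketch below reflects one reading of the rule just stated (one byte per handle, rounded up to a whole paragraph, plus 16 bytes); the function name is invented, and the exact allocation made by a given DOS version may differ.

```python
def handle_table_bytes(handles):
    """One byte per handle, rounded up to a whole paragraph, plus 16 bytes."""
    paragraphs = (handles + 15) // 16
    return paragraphs * 16 + 16

for n in (21, 32, 40, 255):
    print(f"{n:3d} handles -> {handle_table_bytes(n)} bytes")
```

A request for 21 handles and a request for 32 handles therefore need the same amount of memory, since both round up to two paragraphs.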
This portion of the PSP is called a default buffer for the DTA. DOS initializes this area with
the full text (if any) that a user keys in following the requested program name. The first byte
contains the number of keys (if any) pressed immediately after the program name,
followed by the actual characters entered. After that is any “garbage” left
in memory from a previous program.
The following four examples should clarify the contents and purpose of FCB #1, FCB
#2, and the DTA.
80H DTA: 00 0D ...
FCB #1 and FCB #2: These are both dummy FCBs. Their first byte, 00H, refers
to the default drive number. The subsequent bytes for filename and extension are blank,
since the user entered no text following the keyed program name.
DTA: The first byte contains the number of bytes keyed in after the name CALCIT,
not including the Enter character. Since no keys other than Enter were pressed, the number
is zero. The second byte contains the Enter character, 0DH, that was pressed.
Example 2: Command with Text Operand. Suppose that a user wants to execute
a program named COLOR and passes a parameter “BY” that tells the program to set the
color to blue (B) on a yellow (Y) background. The user types the program name followed
by the parameter: COLOR BY. DOS then sets the following in the PSP:
5CH FCB #1: 00 42 59 20 20 20 20 20 20 20 20 20 ...
C A L C I T     O B J
D : C A L C I T . O B J
FCB #1: The first character indicates the drive number (04 = D), followed by the
name of the file, CALCIT, that the program is to reference. Then come two blanks that com-
plete the eight-character filename and, finally, the extension, OBJ.
DTA: The length of 13 (0DH) is followed by exactly what was typed, including the
Enter character.
A : F I L E A . A S M   D : F I L E B . A S M
FCB #1: The first byte, 01, refers to drive A, followed by the filename.
FCB #2: The first byte, 04, refers to drive D, followed by the filename.
DTA: The bytes contain the number of characters entered (18H), a space (20H),
A:FILEA.ASM D:FILEB.ASM, and the Enter character (0DH).
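The layouts in these examples can be reproduced with a short Python sketch. It is a simplified model, not the DOS parser: it assumes a plain operand with an optional drive letter, does not truncate long names, and ignores wildcards. The names command_tail, default_fcb, and hexdump are hypothetical helpers.

```python
def command_tail(text):
    """Bytes stored at PSP offset 80H: a count, the typed text, then Enter (0DH)."""
    data = text.encode("ascii")
    return bytes([len(data)]) + data + b"\r"

def default_fcb(operand):
    """First 12 bytes of a default FCB: drive number, 8-character name, 3-character extension."""
    drive = 0                                  # 00 = default drive
    if len(operand) >= 2 and operand[1] == ":":
        drive = ord(operand[0].upper()) - ord("A") + 1   # 01 = A, 04 = D
        operand = operand[2:]
    name, _, ext = operand.upper().partition(".")
    return bytes([drive]) + name.ljust(8).encode("ascii") + ext.ljust(3).encode("ascii")

def hexdump(data):
    """Format bytes the way DEBUG shows them: two hex digits per byte."""
    return " ".join(f"{b:02X}" for b in data)

print("FCB #1:", hexdump(default_fcb("BY")))
print("DTA:   ", hexdump(command_tail(" BY")))
print("FCB #1:", hexdump(default_fcb("D:CALCIT.OBJ")))
print("DTA:   ", hexdump(command_tail(" A:FILEA.ASM D:FILEB.ASM")))
```

Note that the count byte includes the space that separates the program name from its operand, but not the Enter character.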
To locate the DTA for a .COM program, simply set 80H in the SI, DI, or BX regis-
ter, and access the contents:
CODESG SEGMENT PARA 'Code'
       ASSUME CS:CODESG
       ORG 100H
BEGIN: MOV AL,0DH ;Search character (Enter)
       MOV CX,21 ;Number of bytes
       MOV DI,82H ;Start address in PSP
       REPNZ SCASB ;Scan for Enter
       JNZ error ;Not found, error
       DEC DI ;Found:
       MOV BYTE PTR [DI],0 ; replace with 00H
       MOV AH,43H ;Request
       MOV AL,01 ; set attribute
       MOV CX,00 ; to normal
       MOV DX,82H ;ASCIIZ string in PSP
       INT 21H ;Call DOS
       JC error ;Write error?...
CODESG ENDS
       END BEGIN
MEMORY BLOCKS
DOS allows any number of programs to be loaded and to stay resident. Examples in-
clude RAMDISK, MOUSE, and SIDEKICK. DOS sets up one or two memory blocks for
each loaded program. Immediately preceding each memory block is an arena header (or
memory control record) beginning on a paragraph boundary and containing the follow-
ing fields:
00H Code, where 4DH (‘M’) means more blocks to follow and 5AH (‘Z’)
means zero blocks to follow (the last block). (This is a useful interpretation,
but not necessarily the original intention.)
01-02H Segment address of the owner’s PSP. 0008H means that the segment
belongs to MSDOS.SYS, and 0000H means that it is released and available.
03-04H Length of the memory block, in paragraphs
05-07H Reserved
08-0FH Filename of owner, in ASCIIZ format (since DOS 4.0).
A forward linked list connects memory blocks. The first memory block, set up and owned
by MSDOS.SYS, contains DOS file buffers, FCBs used by file handle functions, and de-
vice drivers loaded by DEVICE commands in CONFIG.SYS.
The second memory block is the resident portion of COMMAND.COM with its own
PSP. A few special programs such as FASTOPEN and SHARE may be loaded before
COMMAND.COM.
The third memory block is the master environment containing the COMSPEC com-
mand, PROMPT commands, PATH commands, and any strings set by SET.
Succeeding blocks include any resident (TSR) programs and the currently executing
program. Each of these programs has two blocks; the first is a copy of the environment, and
the second is a program segment with the PSP and the executable module.
Function 52H returns the segment address of the list of DOS file tables (the second entry)
in the ES and an offset in the BX. ES:[BX-4] therefore points to the preceding entry, a
doubleword in IP:CS format that contains the address of the first arena header.
To find subsequent memory blocks in the chain:
1. Use the address of the arena header for the memory block.
2. Add 1 to the segment address of the arena header to get the start of its memory block.
(The arena header is 10H bytes long.)
3. Add the length of the memory block from offsets 03-04H of the arena header. You
now have the segment address of the next arena header.
To determine the paragraphs of memory available to DOS for the last program, find
the arena header containing “Z” in byte 0, and perform the preceding calculations. The last
block has available to it all remaining higher memory.
H value1,value2
The H command returns the sum and the difference of the two values.
For the following example, DEBUG displayed the required memory contents. Watch
out for reversed-byte sequence. The trace proceeded as follows:
1. Function 52H returned 02CC[0] in the ES and 0026H in the BX. Since we want the
four bytes to the left at 0022H, use D 02CC:22 to display the address of the arena
header for the first memory block in IP:CS format. This turns out to be 00 00 56 0B.
The address is therefore 0B56[0].
2. Use D B56:0 to display the first arena header:
4D 08 00 AE 05 ...
The 4D (‘M’) means more memory blocks follow, 0800 (0008H) tells us that the
memory block belongs to MSDOS.SYS, and AE05 (05AEH) is the length of the
memory block.
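The chain-following rule (header segment, plus 1 for the header, plus the block length) can be sketched in Python against a simulated memory image. This is only an illustration: the first header uses the values from the trace above, while the second header and the function names read_arena and walk_arena_chain are invented for the example.

```python
import struct

def read_arena(memory, seg):
    """Decode the arena header that starts at paragraph `seg` of `memory`."""
    code, owner, length = struct.unpack_from("<BHH", memory, seg * 16)
    return chr(code), owner, length

def walk_arena_chain(memory, first_seg):
    """Yield (header segment, code, owner PSP, length in paragraphs) for each block."""
    seg = first_seg
    while True:
        code, owner, length = read_arena(memory, seg)
        if code not in "MZ":
            raise ValueError(f"memory block destroyed at {seg:04X}")
        yield seg, code, owner, length
        if code == "Z":          # last block in the chain
            return
        seg = seg + 1 + length   # skip the 10H-byte header, then the block itself

# Simulated memory: the first header comes from the trace, the second is invented.
memory = bytearray(0x20000)
struct.pack_into("<BHH", memory, 0x0B56 * 16, 0x4D, 0x0008, 0x05AE)
struct.pack_into("<BHH", memory, 0x1105 * 16, 0x5A, 0x0000, 0x0100)

for seg, code, owner, length in walk_arena_chain(memory, 0x0B56):
    print(f"{seg:04X} {code} owner={owner:04X} length={length:04X}")
```

The little-endian unpacking ("<BHH") takes care of the reversed-byte sequence, so the stored bytes 08 00 and AE 05 come back as 0008H and 05AEH.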
The operation clears the carry flag and returns the strategy in the AX:
• 00H = First fit (the default): Search from the lowest address in conventional memory
for the first available block that is large enough to load the program.
• 01H = Best fit: Search for the smallest available block in conventional memory that
is large enough to load the program.
• 02H = Last fit: Search from the highest address in conventional memory for the first
available block.
• 40H = First fit, high only: Search from the lowest address in upper memory for the
first available block.
• 41H = Best fit, high only: Search for the smallest available block in upper memory.
• 42H = Last fit, high only: Search from the highest address in upper memory for the
first available block.
• 80H = First fit, high: Search from the lowest address in upper memory for the first
available block. If none is found, search conventional memory.
• 81H = Best fit, high: Search for the smallest available block in upper memory. If
none is found, search conventional memory.
• 82H = Last fit, high: Search from the highest address in upper memory for the first
available block. If none is found, search conventional memory.
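The nine codes follow a regular pattern, which the Python sketch below decodes. The bit interpretation (low bits select the fit, 40H means upper memory only, 80H means upper memory first) is inferred from the list itself and is an assumption, as is the function name describe_strategy.

```python
FIT = {0: "first fit", 1: "best fit", 2: "last fit"}

def describe_strategy(code):
    """Split a strategy code into (fit, search area), as inferred from the list of codes."""
    fit = FIT[code & 0x3F]           # low bits: 0, 1, or 2
    if code & 0x80:
        area = "upper memory, then conventional memory"
    elif code & 0x40:
        area = "upper memory only"
    else:
        area = "conventional memory"
    return fit, area

for code in (0x00, 0x01, 0x02, 0x40, 0x41, 0x42, 0x80, 0x81, 0x82):
    fit, area = describe_strategy(code)
    print(f"{code:02X}H = {fit}, {area}")
```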
Best fit and last fit strategies are appropriate to multitasking systems, which could
have fragmented memory because of programs running concurrently. When a program fin-
ishes processing, its memory is released to the system.
A successful operation clears the carry flag and allows a program to allocate memory from
it. An error sets the carry flag and returns in the AX error code 01 (CONFIG.SYS did not
contain DOS=UMB) or 07 (memory links damaged).
PROGRAM LOADER
1. Sets up memory blocks for the program’s environment and for the program segment
2. Creates a program segment prefix at location 00H of the program segment and loads
the program at 100H.
Other than these steps, the load and execute steps differ for .COM and .EXE pro-
grams. A major difference is that the linker inserts a special header record in an .EXE file
when storing it on disk, and the DOS loader uses this record for loading.
Figure 24-2 Initialization of a .COM Program (labels: IP offset (100H), .COM program, SP offset)
.EXE programs loaded in memory. The first two bytes of the PSP contain the INT 20H in-
struction (return to DOS). On loading a .COM program, DOS
• Sets the four segment registers with the address of the first byte of the PSP.
• Sets the stack pointer (SP) to the end of the 64K segment, offset FFFEH (or to the end
of memory if the segment is not large enough), and pushes a zero word on the stack.
• Sets the instruction pointer to 100H (the size of the PSP) and allows control to proceed
to the address generated by CS:IP, the first location immediately following the
PSP. This is the first byte of your program, and it should contain an executable
instruction. Figure 24-2 illustrates this initialization.
10-11H Offset that the loader is to insert in the SP register when transferring control
to the executable module. The value is the defined size of the stack.
12-13H Checksum value: the sum of all the words in the file (ignoring overflows),
used as a validation check for possible lost data.
14-15H Offset (usually, but not necessarily, 00H) that the loader is to insert in the
IP register when transferring control to the executable module.
16-17H Offset in the executable module of the code segment. The loader inserts
the offset in the CS register. If the code segment is first, the offset would be zero.
18-19H Offset of the relocation table (see the item at 1CH).
1A-1BH Overlay number: zero (the usual) means that the .EXE file contains the
main program.
1CH-end Relocation table containing a variable number of relocation items, as
identified at offset 06-07H. Positions 06-07H of the header indicate the number
of items in the executable module that are to be relocated. Each relocation item,
beginning at header 1CH, consists of a two-byte offset value and a two-byte segment
value.
The system constructs memory blocks for the environment and the program segment.
Following are the steps that DOS performs when loading and initializing an .EXE program:
Figure 24-3 Initialization of an .EXE Program (labels: Stack segment, SP offset)
After the preceding, DOS is finished with the .EXE header and discards it. The CS
and SS registers are set correctly, but your program has to set the DS (and ES) for its own
data segment:
MOV AX,datasegname ;Set DS and ES registers
MOV DS,AX ; to address
MOV ES,AX ; of data segment
The map provides the relative (not actual) location of each of the three segments. Note that
some linkers arrange these segments in alphabetic sequence by name. According to the map,
the code segment (CSEG) is to start at 00000H; its relative location is the beginning of the
executable module, and its length is 003BH bytes. The data segment, DSEG, begins at
00040H and has a length of 001BH. This is the first address following CSEG that aligns on
a paragraph boundary (a boundary evenly divisible by 10H). The stack segment, STACK,
begins at 00060H, the first address following DSEG that aligns on a paragraph boundary.
DEBUG can’t display a header record after a program is loaded for execution, because
DOS replaces the header record with the PSP. However, there are various utility programs
on the market (or you can write your own) that allow you to view the hex contents of any
disk sector. The header for the program we are examining contains the following relevant
information, according to hex location (the contents of fields are in reverse-byte sequence):
When DEBUG loaded this program, the registers contained the following values:
For .EXE modules, the loader sets the DS and ES to the address of the PSP and sets
the CS, IP, SS, and SP to values from the header record. Let’s now see how the loader ini-
tializes these registers.
CS Register
According to the DS register, when the program loaded, the address of the PSP was
138F[0]H. Since the PSP is 100H bytes long, the executable module follows immediately
at 139F[0]H, which the loader inserts in the CS register:
The CS provides the starting address of the code portion (CSEG) of the program. You can
use the DEBUG display command D CS:0000 to view the machine code of a program in
memory. The code is identical to the hex portion of the assembler .LST printout, other than
operands that .LST tags as R.
SS Register
The loader used the value 60H in the header (at 0EH) for setting the address of the stack in
the SS register:
Start address of PSP (see DS): 138F0H
Length of PSP: + 100H
Offset of stack (see location 0EH in header): + 60H
Address of stack: 13A50H
SP Register
The loader used 20H from the header (at 10H) to initialize the stack pointer to the length
of the stack. In this example, the stack was defined as 16 DUP(?), that is, 16 two-byte
fields = 32, or 20H. The SP points to the current top of the stack.
DS Register
The loader uses the DS register to establish the starting point for the PSP at 138F[0]. Be-
cause the header does not contain a starting address for the DS, your program has to ini-
tialize it:
The assembler left unfilled the machine address of DSEG, which becomes an entry in
the relocation table in the header, discussed earlier. DEBUG shows the completed instruc-
tion as
B8 A313
DS address: 13A30H
DS 13A3[0]H 40H
SS 13A5[0]H 60H
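The register values in this walk-through all come from the same calculation: the address of the PSP, plus 100H for the PSP itself, plus the segment's offset from the link map. A small Python sketch of that arithmetic follows; the function name segment_register is illustrative, and the sketch assumes each offset is aligned on a paragraph boundary.

```python
PSP_LENGTH = 0x100   # the PSP occupies 100H bytes

def segment_register(psp_seg, module_offset):
    """Segment value for a segment that starts `module_offset` bytes into the module."""
    physical = psp_seg * 16 + PSP_LENGTH + module_offset
    return physical >> 4          # drop the rightmost hex digit, shown as [0] in the text

psp = 0x138F                      # address of the PSP from the DS register at load time
for name, offset in (("CS", 0x00), ("DS", 0x40), ("SS", 0x60)):
    print(f"{name} {segment_register(psp, offset):04X}[0]H  map offset {offset:02X}H")
```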
As an exercise, trace any of your linked .EXE programs with DEBUG, and note the
changed values in the registers:
The DS now contains the correct address of the data segment. You can use D DS:00 to view
the contents of DSEG and use D SS:00 to view the contents of the stack.
A successful operation clears the carry flag and returns in the AX the segment ad-
dress of the allocated memory block. The operation begins at the first memory block and
steps through each block until it locates a space large enough for the request, usually at the
high end of memory.
An unsuccessful operation sets the carry flag and returns in the AX an error code
(07 = memory block destroyed or 08 = insufficient memory) and in the BX the size, in
paragraphs, of the largest block available. A memory block destroyed means that the
operation found a block in which the first byte was not ‘M’ or ‘Z’.
A successful operation clears the carry flag and stores 00H in the second and third bytes of
the memory block, meaning that it is no longer in use. An unsuccessful operation sets the
carry flag and returns in the AX an error code (07 = memory block destroyed or 09 = invalid
memory block address).
A program can calculate its own size by subtracting the address of the PSP from the end
of the last segment. You’ll have to ensure that you use the last segment if your linker
rearranges segments in alphabetic sequence.
A successful operation clears the carry flag. An unsuccessful operation sets the carry
flag and returns in the AX an error code (07 = memory block destroyed, 08 = insufficient
memory, and 09 = invalid memory block address) and returns in the BX the maximum pos-
sible size (if an attempt to increase the size was made). A wrong address in the ES can cause
error 07.
An invalid operation sets the carry flag and returns an error code in the AX.
This operation loads an .EXE or .COM program into memory, establishes a program seg-
ment prefix for it, and transfers control to it for execution. Since all registers, including the
stack, are changed, the operation is not for novices. The parameter block addressed by the
ES:BX has the following format:
OFFSET PURPOSE
00H Address of environment-block segment to be passed at PSP+2CH. A zero
address means that the loaded program is to inherit the environment of its
parent.
02H Doubleword pointer to command line for placing at PSP+80H.
06H Doubleword pointer to default FCB #1 for passing at PSP+5CH.
0AH Doubleword pointer to default FCB #2 for passing at PSP+6CH.
OFFSET PURPOSE
00H Address of environment-block segment to be passed at PSP+2CH. If the
address is zero, the loaded program is to inherit the environment of its
parent.
OFFSET PURPOSE
00H Word segment address where file is to be loaded
02H Word relocation factor to apply to the image
An error sets the carry flag and returns an error code in the AX, described in Figure
18-1.
PROGRAM OVERLAYS
The program in Figure 24-5 uses the same service as that in Figure 24-4, but this time just
to load a program into memory without executing it. The process consists of a main program,
P24CALLV, and two subprograms, P24SUB1 and P24SUB2.
P24CALLV is the main program, with these segments:
STACKSG SEGMENT PARA STACK ‘Stack1’
DATASG SEGMENT PARA ‘Data1’
CODESG SEGMENT PARA ‘Code1’
ZENDSG SEGMENT ;Dummy (empty) segment
P24CALLV’s segments are linked first; that’s why their class names differ: ‘Data1’,
‘Data2’, ‘Code1’, ‘Code2’, and so forth. Here’s the link map for P24CALLV+P24SUB1:
P24SUB2 is also called by P24CALLV, but is linked separately. Its segments are:
DATASG SEGMENT PARA ‘Data’
CODESG SEGMENT PARA ‘Code’
A5SOERR:
CALL Q20SET ;Set cursor
LEA DX, ERRMSG3
CALL Q30DISP ;Display message
JMP A90
A90:
MOV AH, 4CH ;Exit
INT 21H
BEGIN ENDP
; Video screen services:
Called subprogram
but since P24SUB1 has its own data segment, it has to push P24CALLV’s DS and establish
its own DS address. P24SUB1 sets the cursor, displays a message, pops the DS, and
returns to P24CALLV.
To overlay P24SUB2 on P24SUB1, P24CALLV has to shrink its own memory space,
since DOS has given it all available memory. P24CALLV’s highest segment is ZENDSG,
which is empty. P24CALLV subtracts the address of its PSP (still in the ES) from the
address of ZENDSG. The difference is 270H (27H paragraphs), calculated as the size of
the PSP (100H) plus the offset of ZENDSG (170H), which is delivered to DOS by func-
tion 4AH.
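The size calculation can be checked with a few lines of Python. The PSP address used here is hypothetical; only the 100H length of the PSP and the 170H offset of ZENDSG come from the text, and the function name paragraphs_to_keep is illustrative.

```python
def paragraphs_to_keep(psp_seg, end_seg):
    """Program size in paragraphs: the last (empty) segment minus the PSP segment."""
    return end_seg - psp_seg

psp_seg = 0x2000                              # hypothetical load address of the PSP
zendsg = psp_seg + (0x100 >> 4) + (0x170 >> 4)  # PSP (100H bytes) + offset of ZENDSG (170H)

size = paragraphs_to_keep(psp_seg, zendsg)
print(f"{size:X}H paragraphs = {size * 16:X}H bytes")
```

Because both addresses are segment values, the difference is already in paragraphs, which is the unit that the memory functions expect.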
DOS function 48H then allocates memory to allow space for P24SUB2 to be loaded
(overlaid) on top of P24SUB1, arbitrarily set to 40H paragraphs. The operation returns the
loading address in the AX register, which P24CALLV stores in PARABLK. This is the first
word of a parameter block to be used by function 4BH.
Function 4BH with code 03 in the AL loads P24SUB2 into memory. Note the defi-
nition in the data segment: F:\P24SUB2.EXE,0. Function 4BH references CS and
PARABLK—the first word contains the segment address where the overlay is to be loaded
and the second word is an offset, in this case, zero. A diagram may help make these steps
clearer:
The far CALL to P24SUB2 requires a reference defined as IP:CS, but PARABLK is
in the form CS:IP. The CS value is therefore moved to the second word, and 20H is stored
in the first word for the IP, since the link map shows that value as the offset of P24SUB2’s
code segment. The next instructions load the address of PARABLK in the BX and call
P24SUB2:
LEA BX, PARABLK ;Address of PARABLK
Note that P24CALLV doesn’t reference P24SUB2 by name in its code segment and so
doesn’t require an EXTRN statement specifying P24SUB2. Since P24SUB2 has its own
data segment, it first pushes the DS onto the stack and initializes its own address. But
P24SUB2 wasn’t linked with P24CALLV. As a result, the instruction MOV AX,DATASG
would set the AX only with the offset address of DATASG, 0[0]H, and not its segment
address. We do know that CALL set the CS with the address of the first segment, which
(according to the map) happens to be the address of the data segment. Moving the CS to the
DS gives the correct address in the DS. Note that if P24SUB2’s code and data segments
were in a different sequence, the coding would have to be somewhat different.
P24SUB2 sets the cursor, displays a message, pops the DS, and returns to
P24CALLV. DEBUG was indispensable in developing this program.
RESIDENT PROGRAMS
A number of popular commercial and shareware programs are designed to reside in mem-
ory while other programs run, and you can activate their services through special key-
strokes. You load resident programs after DOS is loaded and before activating other normal
processing programs. They are almost always .COM programs and are also known as
“terminate but stay resident” (TSR) programs.
The easy part of writing a resident program is getting it to reside. Instead of normal
termination, you exit by means of INT 21H, function 31H (keep program). The operation
requires the size of the program in the DX register:
MOV AH, 31H ;Request TSR
INT 21H
When you execute the initialization routine, DOS reserves the memory block where
the program resides and loads subsequent programs higher in memory.
The not-so-easy part of writing a resident program involves activating it after it is res-
ident, since it is not a program internal to DOS, as are CLS, COPY, and DIR. A common
approach is to modify the interrupt services table so that the resident program intercepts all
keystrokes, acts on a special keystroke or combination, and passes on all other keystrokes.
The effect is that a resident program typically, but not necessarily, consists of the follow-
ing parts:
In effect, the initialization procedure sets up all the conditions to make the resident
program work and then allows itself to be erased. The organization of memory now appears
as follows:
A resident program may use two INT 21H functions for accessing the interrupt ser-
vices table, since there is no assurance that more advanced computers will have the inter-
rupt table located in the same memory locations.
INT 21H
The operation returns the address of the interrupt in the ES:BX as segment:offset. For
conventional memory, a request for the address of INT 09H returns 00H in the ES and 24H (36)
in the BX.
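The value 24H is simply the interrupt number multiplied by 4, since each table entry is four bytes. The Python sketch below shows that arithmetic and reads one entry from a simulated table; the handler address F000:E987 is a made-up value, and the names vector_offset and read_vector are illustrative.

```python
import struct

def vector_offset(int_no):
    """Offset of an interrupt's entry in the interrupt services table."""
    return int_no * 4

def read_vector(table, int_no):
    """Return (segment, offset) of the routine; the table stores the offset word first."""
    offset, segment = struct.unpack_from("<HH", table, vector_offset(int_no))
    return segment, offset

table = bytearray(1024)                       # 256 entries of 4 bytes each
struct.pack_into("<HH", table, vector_offset(0x09), 0xE987, 0xF000)  # hypothetical F000:E987

segment, offset = read_vector(table, 0x09)
print(f"INT 09H entry at {vector_offset(0x09):02X}H -> {segment:04X}:{offset:04X}")
```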
INT 21H
The operation replaces the present address of the interrupt with the new address. In effect,
then, when the specified interrupt occurs, processing links to your (resident) program,
rather than to the normal interrupt address.
        ;Turn on speaker
        OUT 61H,AL
        MOV CX,5000 ;Set duration
PAUSE:
        LOOP PAUSE
        MOV AL,AH ;Turn off speaker
        OUT 61H,AL
EXIT:
        POP DS ;Restore registers
        POP CX
        POP AX
        JMP CS:SAVINT9 ;Resume INT 09H
; Initialization routine
INITZE:
        CLI ;Prevent further interrupts
        MOV AH,35H ;Get address of INT 09H
        MOV AL,09 ; in ES:BX
        INT 21H
        MOV WORD PTR SAVINT9,BX ; and save it
        MOV WORD PTR SAVINT9+2,ES
        MOV AH,25H
        MOV AL,09 ;Set new address for INT 09H
        MOV DX,OFFSET TESTNUM ; in TESTNUM
        INT 21H
to SAVINT9, which contains the original INT 09H address. We now release control back
to the interrupt.
The next example should help make the procedure clear. First we explain a conven-
tional operation without a TSR intercepting the interrupt:
1. A user presses a key, and the keyboard sends interrupt 09H to BIOS.
2. BIOS uses the address of INT 09H in the interrupt services table to locate its BIOS
routine.
3. Control then transfers to the BIOS routine.
4. The routine gets the character and (if it’s a standard character) delivers it to the key-
board buffer.
Next is the procedure for the resident program:
1. A user presses a key, and the keyboard sends INT 09H to BIOS.
2. BIOS uses the address of INT 09H in the interrupt services table to locate its BIOS
routine.
3. But the table now contains the address of TESTNUM, the resident program, to which
control transfers.
4. If NumLock is on and the character is a numeric keypad number, TESTNUM beeps
the speaker.
5. TESTNUM exits by jumping to the original saved INT 09H address, which transfers
control to the BIOS routine.
6. The BIOS routine gets the character and (if it’s a standard character) delivers it to the
keyboard buffer.
Since this program is intended to be illustrative, you can modify or expand it for your
own purposes. A few commercial programs that also replace the table address of interrupt
09H do not allow concurrent use of a resident program such as this one.
The service returns the address of inDOS in the ES:BX. The flag contains the number
of DOS functions currently active, where 0 means none. You may enter DOS only if
inDOS is 0.
KEY POINTS
The boot record is on track 0, sector 1, of any disk that you use FORMAT /S to format.
When you initiate the system, it automatically loads the boot record from disk
into memory. The boot record then loads IO.SYS from disk into memory.
IO.SYS is a low-level interface to the BIOS routines in ROM. On initiation, IO.SYS
determines the status of all devices and equipment associated with the computer and
sets interrupt table addresses for interrupts up to 20H. IO.SYS also handles I/O be-
tween memory and external devices.
MSDOS.SYS is a high-level interface to programs that is loaded into memory after
IO.SYS. Its operations include setting interrupt table addresses for interrupts 20H
through 3FH, managing the directory and files on disk, handling blocking and de-
blocking of disk records, and handling INT 21H functions.
COMMAND.COM handles the various DOS commands and runs requested .COM,
.EXE, and .BAT files. It consists of a small resident portion, an initialization portion,
and a transient portion. COMMAND.COM is responsible for loading executable pro-
grams from disk into memory.
The .EXE module that the linker creates consists of a header record containing con-
trol and relocation information and the actual load module.
On loading either a .COM or an .EXE program, DOS sets up memory blocks for the
program’s environment and for the program segment. Preceding each memory block
is a 16-byte arena header beginning on a paragraph boundary. DOS also creates a PSP
at location 00H of the program segment and loads the program at 100H.
On loading a .COM program, DOS sets the segment registers with the address of the
PSP, sets the stack pointer to the end of the segment, pushes a zero word onto the
stack, and sets the instruction pointer to 100H (the size of the PSP). Control then
proceeds to the address generated by CS:IP, the first location immediately following the
PSP.
On loading an .EXE program, DOS reads the header record into memory, calculates
the size of the executable module, and reads the module into memory at the start seg-
ment. It adds the value of each relocation table item to the start segment value. It sets
the DS and ES to the segment address of the PSP; sets the SS to the address of the
PSP, plus 100H, plus the SS offset value; sets the SP to the size of the stack, and sets
the CS to the address of the PSP, plus 100H, plus the CS offset value in the header.
DOS also sets the IP with the offset at 14H. The CS:IP pair provides the starting
address of the code segment for program execution.
Useful fields within the PSP include parameter area 1 at 5CH, parameter area 2 at
6CH, and default disk transfer area at 80H.
Load resident programs before activating other normal processing programs. Exit by
means of INT 21H, function 31H, which requires the size of the program in the DX.
QUESTIONS
24-1. (a) Where is the boot record located? (b) What is its purpose?
24-2. What is the purpose of IO.SYS (IBMBIO.COM)?
24-3. What is the purpose of MSDOS.SYS (IBMDOS.COM)?
24-4. Where, generally, are the following portions of COMMAND.COM located in memory and
what is their purpose? (a) Resident; (b) transient.
24-5. (a) Where is the program segment prefix located? (b) What is its size?
24-6. A user types in the instruction FUDGE C:ALF.DOC to request execution of a FUDGE
program. Show the hex contents in the program’s PSP at (a) 5CH, parameter area 1 (FCB #1),
and (b) 80H, the default DTA.
24-7. Your program has to determine what PATH commands are set for its environment. Explain
where the program may find its own environment. (Note: The request is for the program’s en-
vironment, not the DOS master environment.)
24-8. A .COM program is loaded for execution with its PSP beginning at location 2BA1[0]H. What
address does DOS store in each of the following registers (ignore reverse-byte notation): (a)
CS; (b) DS; (c) ES; (d) SS.
24-9. A link map for an .EXE program shows the following:
DOS loads the program with the PSP beginning at location 1A25[0]H. Showing calculations
where appropriate, state the contents of each of the registers at the time of loading (ignore
reverse-byte notation): (a) CS; (b) DS; (c) ES; (d) SS; (e) SP.
24-10. An arena header begins at location EB6[0] and contains the following: 4D C00E 0A00 ... .
(a) What does the 4D (M) mean to DOS? (b) How would the contents differ if this were the
last memory block? (c) What is the memory location of the next arena header? Show calcu-
lations.
24-11. (a) Resident programs commonly intercept keyboard input. Where and what exactly is this
intercepted address? (b) In what two significant ways does the coding for terminating a resi-
dent program differ from the coding for terminating a normal program?
PART G —Reference Chapters
CHAPTER 25
BIOS Data Areas and Interrupts
OBJECTIVE:
To describe the BIOS data areas and interrupt services.
INTRODUCTION
BIOS contains an extensive set of input/output routines and tables that indicate the status
of the system’s devices. DOS and user programs can request BIOS routines for communi-
cation with devices attached to the system. The method of interfacing with BIOS is soft-
ware interrupts. This chapter examines the data areas (or tables) that BIOS supports, the
interrupt procedure, and the various interrupt services.
The chapter covers the following BIOS interrupts:
470 BIOS Data Areas and Interrupts Chapter 25
On the PC, ROM resides beginning at location FFFF0H. Turning on the power causes a
“cold boot.” The processor enters a reset state, sets all memory locations to zero, performs
a parity check of memory, and sets the CS register to FFFF[0]H and the IP register to zero.
The first instruction to execute is therefore at FFFF:0, the entry point to BIOS. BIOS also
stores the value 1234H at 40[0]:72H to signal a subsequent Ctrl+Alt+Del (“warm reboot”)
not to perform the preceding power-on self-test.
BIOS checks the various ports to identify and initialize devices that are attached, in-
cluding INT 11H (equipment determination) and INT 12H (memory size determination).
Then, beginning at location 0 of memory, BIOS establishes the interrupt service table that
contains addresses of interrupt routines.
Next, BIOS determines whether a disk containing DOS is present and, if so, it exe-
cutes INT 19H to access the first disk sector containing the bootstrap loader. This program
is a temporary operating system to which the BIOS routine transfers control after loading
it into memory. The bootstrap has only one task: to load the first part of the real operating
system into memory. The DOS files IO.SYS, MSDOS.SYS, and COMMAND.COM are
then loaded from disk into memory.
DEVICE
BIT ACTION                 BIT ACTION
7 Insert active            3 Alt pressed
6 CapsLock active          2 Ctrl pressed
5 NumLock active           1 Left shift pressed
4 Scroll Lock active       0 Right shift pressed
BIT ACTION                 BIT ACTION
7 Insert pressed           3 Ctrl/NumLock pressed
6 CapsLock pressed         2 SysReq pressed
5 NumLock pressed          1 Left Alt pressed
4 Scroll Lock pressed      0 Left Ctrl pressed
3EH Disk seek status. Bit number 0 refers to drive A, 1 to B, 2 to C, and 3 to D. A
bit value of 0 means that the next seek is to reposition to cylinder 0 to recalibrate
the drive.
3FH Disk motor status. If bit 7 = 1, a write operation is in progress. Bit number 0
refers to drive A, 1 to B, 2 to C, and 3 to D; a bit value of 1 means that the
motor is on.
40H Motor count for time-out until motor is turned off
41H Disk status, indicating an error on the last diskette drive operation:
00H No error                     09H Attempt to make DMA across 64K boundary
01H Invalid drive parameter      0CH Media type not found
02H Address mark not found       10H CRC error on read
03H Write-protect error          20H Controller error
04H Sector not found             40H Seek failed
06H Diskette change line active  80H Drive not ready
08H DMA overrun
42H-48H Diskette drive controller status
MODE                        MODE
07H Monochrome              03H 80 X 25 color
06H 640 X 200 monochrome    02H 80 X 25 monochrome
05H 320 X 200 monochrome    01H 40 X 25 color
04H 320 X 200 color         00H 40 X 25 monochrome
BIT ACTION                          BIT ACTION
7 Read ID in progress               3 Right Alt pressed
6 Last code was ACK                 2 Right Ctrl pressed
5 Force NumLock if read ID and KBX  1 Last scan code was E0
4 101/102 keyboard installed        0 Last scan code was E1
INTERRUPT SERVICES
An interrupt is an operation that suspends execution of a program so that the system can
take special action. We have already used a number of interrupts for video display, disk I/O,
printing, and resident programs. The interrupt routine executes and normally returns con-
trol to the interrupted procedure, which then resumes execution. BIOS handles interrupts
00H through 1FH, and DOS handles interrupts 20H through 3FH.
Executing an Interrupt
An interrupt pushes onto the stack the contents of the flags register, the CS, and the IP. For
example, the table address of INT 05H (which prints the screen when a user presses
Ctrl+PrtSc) is 0014H (05H X 4 = 14H). The operation extracts the four-byte address from
location 0014H and stores two bytes in the IP and two in the CS. The address in the CS:IP
then points to the start of a routine in the BIOS area, which now executes. The interrupt re-
turns via an IRET (Interrupt Return) instruction, which pops the IP, CS, and flags from the
stack and returns control to the instruction following the INT.
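The table lookup can be sketched in a few instructions. This fragment is only an illustration, not part of the BIOS: it fetches the four-byte entry for INT 05H from the interrupt table in segment 0000H, and the choice of registers is an assumption.

```asm
        XOR   AX,AX            ;Interrupt table is in
        MOV   ES,AX            ;  segment 0000H
        MOV   BX,05H*4         ;Table offset 0014H for INT 05H
        MOV   AX,ES:[BX]       ;First two bytes: offset (for the IP)
        MOV   DX,ES:[BX+2]     ;Next two bytes: segment (for the CS)
```

On return, DX:AX holds the segment:offset address of the routine that INT 05H would enter.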
rupt request (INTR) line. The NMI line reports memory and I/O parity errors. The proces-
sor always acts on this interrupt, even if you issue CLI to clear the interrupt flag in an at-
tempt to disable external interrupts. The INTR line reports requests from external devices,
namely, interrupts 05H through 0FH, for the timer, keyboard, serial ports, fixed disk,
diskette drives, and parallel ports.
An internal interrupt occurs as a result of the execution of an INT instruction or a di-
vide operation that causes an overflow, execution in single-step mode, or a request for an
external interrupt, such as disk I/O. Programs commonly use internal interrupts, which are
nonmaskable, to access BIOS and DOS procedures.
BIOS INTERRUPTS
This section covers BIOS interrupts 00H through 1BH. There are other operations not cov-
ered that can be executed only by BIOS.
INT 01H: Single Step. Used by DEBUG and other debuggers to enable single-
stepping through program execution.
INT 02H: Nonmaskable Interrupt. Used for serious hardware conditions, such as
parity errors, that are always enabled. Thus a program issuing a CLI (clear interrupt) in-
struction does not affect these conditions.
INT 03H: Break Point. Used by debugging programs to stop execution. DEBUG’s
Go and Proceed commands set this interrupt at the appropriate stopping point in the pro-
gram; DEBUG undoes single-step mode and allows the program to execute normally up to
INT 03H, whereupon DEBUG resets single-step mode.
INT 05H: Print Screen. Causes the contents of the screen to print. Issuing INT
05H activates the interrupt internally, and pressing the Ctrl+PrtSc keys activates it
externally. The operation enables interrupts and saves the cursor position. No registers are
affected. Address 50:00 in the BIOS data area contains the status of the operation.
INT 08H: System Timer. A hardware interrupt that updates the system time and
(if necessary) date. A programmable timer chip generates an interrupt every 54.9254 mil-
liseconds, about 18.2 times a second.
INT 09H: Keyboard Interrupt. Caused by pressing or releasing a key on the key-
board; described in detail in Chapter 11.
INT 0BH, INT 0CH: Serial Device Control. Control the COM1 and COM2
ports, respectively.
INT 0DH, INT 0FH: Parallel Device Control. Control the LPT2 and LPT1 ports,
respectively.
INT 0EH: Diskette Control. Signals diskette activity, such as completion of an I/O
operation.
INT 10H: Video Display. Accepts a number of functions in the AH for screen
mode, setting the cursor, scrolling, and displaying; described in detail in Chapter 10.
INT 12H: Memory Size Determination. Returns in the AX the size of memory on
the system board, in terms of contiguous kilobytes such that 640K memory is 0280H, as
determined during power-on.
INT 13H: Disk Input/Output. Accepts a number of functions in the AH for disk
status, read sectors, write sectors, verify, format, and get diagnostics; covered in Chapter 19.
INT 14H: Communications Input/Output. Provides byte stream I/O (that is, one
bit at a time) to the RS232 communication port. The DX should contain the number of the
RS232 adapter (0-3 for COM1, 2, 3, and 4, respectively). A number of functions are
established through the AH register.
Function 00H: Initialize Communications Port. Set the following parameters in
the AL, according to bit number:
The operation returns the status of the communications port in the AX. (See function 03H
for details.) Here’s an example that sets COM1 to 1,200 baud, no parity, one stop bit, and
eight-bit data length:
        MOV   AH,00H           ;Request initialize port
        MOV   AL,10000011B     ;Parameters
        MOV   DX,00            ;COM1 serial port
        INT   14H              ;Call BIOS
Function 01H: Transmit Character. Load the AL with the character that the rou-
tine is to transmit and the DX with the port number. On return, the operation sets the port
status in the AH. (See function 03H.) If the operation is unable to transmit the byte, it also
sets bit 7 of the AH, although the normal purpose of this bit is to report a time-out error.
Make sure to execute function 00H before using this service.
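A transmit request with a check of bit 7 might look like the following fragment. It is a sketch only: the character 'A' is an arbitrary choice, and SENDERR is a hypothetical label for an error routine that the program would have to supply.

```asm
        MOV   AH,01H           ;Request transmit character
        MOV   AL,'A'           ;Character to send (arbitrary)
        MOV   DX,00            ;COM1 serial port
        INT   14H              ;Call BIOS
        TEST  AH,80H           ;Bit 7 of status set?
        JNZ   SENDERR          ;  yes, byte was not transmitted
                               ;  (SENDERR is a hypothetical label)
```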
Function 02H: Receive Character. Load the port number in the DX. The opera-
tion accepts a character from the communications line into the AL. It also sets the AH with
the port status (see function 03) for error bits 7, 4, 3, 2, and 1. Thus a nonzero value in the
AX indicates an input error. Make sure to execute function 00H before using this service.
Function 03H: Return Status of Communications Port. Load the port number in
the DX. The operation returns the line status in the AH and modem status in the AL:
Other INT 14H functions are 04H (extended initialize) and 05H (extended
communications port control).
INT 15H: System Services. This rather elaborate operation provides for a large
number of functions in the AH, such as the following:
21H Power-on self-testing
43H Read system status
84H Joystick support
88H Determine extended memory size
89H Switch the processor to protected mode
C2H Mouse interface
For example, with function code 88H in the AH, the operation returns in the AX the num-
ber of kilobytes of extended memory. (For example, 0580H means 1408K bytes.) Since the
operation exits without resetting interrupts, use it like this:
        MOV   AH,88H           ;Request extended memory
        INT   15H              ;Call BIOS
        STI                    ;Reset interrupts
INT 16H: Keyboard Input. Accepts a number of functions in the AH for basic
keyboard input; covered in Chapter 10.
INT 17H: Printer Output. Provides a number of functions for printing via BIOS;
discussed in Chapter 20.
INT 18H: ROM BASIC Entry. Called by BIOS if the system starts up with no
disk containing the DOS system programs.
INT 19H: Bootstrap Loader. If a disk(ette) device is available with the DOS
system programs, reads track 0, sector 1, into the boot location in memory at 7C00H and
transfers control to this location. If there is no disk drive, transfers to the ROM BASIC entry
point via INT 18H. It is possible to use this operation as a software interrupt; it does not
clear the screen or initialize data in ROM BIOS.
INT 1AH: Read and Set Time. Reads or sets the time of day according to a func-
tion code in the AH:
• 00H = Read system timer clock. Returns the high portion of the count in the CX and
  the low portion in the DX. If the time has passed 24 hours since the last read, the
  operation sets the AL to a nonzero value.
• 01H = Set system timer clock. Load the high portion of the count in the CX and the
  low portion in the DX.
• 02H-07H. These functions handle the time and date for real-time clock services.
To determine how long a routine executes, you could set the clock to zero and then
read it at the end of processing.
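As a rough sketch of that timing technique, the fragment below clears the count with function 01H, calls the routine, and reads the count with function 00H. B10WORK is a hypothetical name for the routine being timed. Note that clearing the count also changes the system's time of day.

```asm
        MOV   AH,01H           ;Request set timer count
        MOV   CX,00            ;High portion = 0
        MOV   DX,00            ;Low portion = 0
        INT   1AH              ;Call BIOS
        CALL  B10WORK          ;Routine to be timed (hypothetical name)
        MOV   AH,00H           ;Request read timer count
        INT   1AH              ;Call BIOS
                               ;CX:DX = ticks elapsed,
                               ;  about 18.2 per second
```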
INT 1BH: Get Control on Keyboard Break. When the Ctrl+Break keys are pressed,
causes ROM BIOS to transfer control to its interrupt address, where a flag is set.
KEY POINTS
• ROM resides beginning at location FFFF0H. Turning on the power causes a “cold
  boot.” The processor enters a reset state, sets all memory locations to zero, performs
  a parity check of memory, and sets the CS register to FFFF[0]H and the IP register to
  zero. The first instruction to execute is therefore at FFFF:0, or FFFF0H, the entry point
  to BIOS.
• On boot-up, BIOS checks the various ports to identify and initialize devices that are
  attached. BIOS then establishes an interrupt service table, beginning at location 0 of
  memory, that contains addresses for interrupts that occur. Two operations that BIOS
  performs are equipment and memory size determination. If a disk containing DOS is
  present, BIOS accesses the first disk sector containing the bootstrap loader. This
  program loads DOS files IO.SYS, MSDOS.SYS, and COMMAND.COM from disk into
  memory.
• BIOS maintains its own data area in lower memory, beginning at segment address
  40[0]H. Relevant data areas include those of the serial port, parallel port, system
  equipment, keyboard, diskette drive, video control, hard disk, and real-time clock.
• The operand of an interrupt instruction such as INT 12H identifies the type of request.
  For each of the 256 possible types, the system maintains a four-byte address in the
  interrupt services table at locations 0000H through 3FFH. Thus bytes 0-3 contain the
  address for interrupt 0, bytes 4-7 for interrupt 1, and so forth.
• BIOS interrupts range from 00H through 1FH and include divide by zero, print
  screen, timer, video control, diskette control, video display I/O, equipment and
  memory size determination, disk I/O, communications I/O, keyboard input, printer output,
  and bootstrap loader.
QUESTIONS
25-1. Distinguish between an external and an internal interrupt.
25-2. Distinguish between an NMI line and an INTR line.
25-3. (a) What is the memory location of the entry point to BIOS? (b) On power-up, how does the
system direct itself to this address?
25-4. On bootup, BIOS performs interrupts 11H, 12H, and 19H. What is their purpose?
25-5. What is the beginning location of the BIOS data area?
25-6. The following binary values were noted in the BIOS data area. For each item, identify the field
and explain the significance of the 1-bits.
(a) 10-11H: 10000010 00100101 (b) 17H: 11100001
(c) 18H: 00000011 (d) 96H: 00001100
25-7. The following hex values were noted in the BIOS data area. For each item, identify the field
and explain the significance of the value.
(a) 00-03H: F8 03 F8 02 (b) 08-0BH: 78 03 00 00
(c) 13-14H: 80 02 (d) 15-16H: 00 08
(e) 4A-4BH: 50 00 (f) 60-61H: 0E 0D
(g) 84H: 18
25-8. Identify the following BIOS interrupts: (a) Divide by zero; (b) print screen; (c) keyboard in-
terrupt; (d) video display; (e) disk I/O; (f) keyboard input; (g) printer output; (h) get equipment
status; (i) memory size determination; (j) communications I/O.
CHAPTER 26
DOS Interrupts
OBJECTIVE:
To describe the various DOS interrupt functions.
INTRODUCTION
The two DOS modules, IO.SYS and MSDOS.SYS, facilitate using BIOS. Since these mod-
ules provide much of the additional required testing, the DOS operations are generally eas-
ier to use than their BIOS counterparts and are generally more machine independent.
IO.SYS is a low-level interface to BIOS that facilitates reading data from external
devices into memory and writing data from memory onto external devices.
MSDOS.SYS contains a file manager and provides a number of services. For exam-
ple, when a user program requests INT 21H, the operation delivers information to MS-
DOS.SYS via the contents of registers. To complete the request, MSDOS.SYS may
translate the information into one or more calls to IO.SYS, which in turn calls BIOS. The
following shows the relationships involved:
DOS INTERRUPTS
Interrupts 20H through 3FH are reserved for DOS operations, as described in the follow-
ing sections.
INT 20H: Terminate Program. Ends execution of a .COM program, restores ad-
dresses for Ctrl+Break and critical errors, flushes register buffers, and returns control to
DOS. This function would normally be placed in the main procedure and, on exit from it,
the CS should contain the address of the PSP. The preferred termination is INT 21H, func-
tion 4CH.
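For comparison, the preferred termination through INT 21H, function 4CH, can be sketched as follows. The return code of 00 in the AL is an assumption here; a program may pass any code that its caller expects.

```asm
        MOV   AH,4CH           ;Request terminate program
        MOV   AL,00            ;Return code (assumed 0 for normal end)
        INT   21H              ;Call DOS
```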
INT 21H: DOS Function Request. The main DOS operation, which requires a
function in the AH and is described in detail later.
INT 22H: Terminate Address. Copies the address of this interrupt into the pro-
gram’s PSP (at offset 0AH) when DOS loads a program for execution. On program termi-
nation, DOS transfers control to the address of the interrupt. Your programs should not issue
this interrupt.
INT 24H: Critical-Error Handler. Used by DOS to transfer control (via PSP
offset 12H) when it recognizes a critical error (often in a disk or printer operation). Your
programs should not issue this interrupt.
INT 25H: Absolute Disk Read. Reads the contents of one or more disk sectors;
covered in Chapter 17, but superseded by INT 21H, function 440DH, minor code 61H.
INT 26H: Absolute Disk Write. Writes data from memory to one or more disk sec-
tors; covered in Chapter 17, but superseded by INT 21H, function 440DH, minor code 41H.
INT 27H: Terminate but Stay Resident. Causes a .COM program on exit to re-
main in memory; superseded by INT 21H, function 31H.
INT 33H: Mouse Handler. Provides services for handling a mouse. (See Chapter 21.)
DOS INT 21H SERVICES
00H Terminate program. Basically the same as INT 20H and also superseded by
    INT 21H, function 4CH.
01H Keyboard input with echo. (See Chapter 11.)
02H Display character. (See Chapter 9.)
03H Communications input. Reads a character from the serial port into the AL.
    This is a primitive service, and BIOS INT 14H is preferred.
04H Communications output. The DL contains the character to transmit. BIOS
    INT 14H is preferred.
05H Printer output. (See Chapter 20.)
06H Direct keyboard and display. (See Chapter 11.)
07H Direct keyboard input without echo. (See Chapter 11.)
08H Keyboard input without echo. (See Chapter 11.)
09H Display string. (See Chapter 9.)
0AH Buffered keyboard input. (See Chapter 11.)
0BH Check keyboard status. (See Chapter 11.)
0CH Clear keyboard buffer and invoke input. (See Chapter 11.)
0DH Reset disk drive. (See Chapter 18.)
0EH Select default disk drive. (See Chapter 18.)
0FH Open FCB file. (See Chapter 17.)
10H Close FCB file. (See Chapter 17.)
11H Search for first matching disk entry. Obsolete and superseded by function 4EH.
12H Search for next matching disk entry. Obsolete and superseded by function 4FH.
13H Delete FCB file. Obsolete and superseded by function 41H.
14H Read FCB sequential record. (See Chapter 17.)
15H Write FCB sequential record. (See Chapter 17.)
16H Create FCB file. (See Chapter 17.)
17H Rename FCB file. Obsolete and superseded by function 56H.
19H Determine default disk drive. (See Chapter 18.)
1AH Set disk transfer area. (See Chapter 17.)
1BH Get information for default drive. (See Chapter 18.)
1CH Get information for specific drive. (See Chapter 18.)
1FH Get default drive parameter block. (See Chapter 18.)
21H Read FCB record randomly. (See Chapter 17.)
22H Write FCB record randomly. (See Chapter 17.)
23H Get FCB file size. Obsolete and superseded by function 42H.
24H Set random FCB record field. (See Chapter 17.)
25H Set interrupt table address. (See Chapter 24.) The example that follows
    illustrates the use of this function. When a user presses the Ctrl+Break or Ctrl+C
    keys, the normal procedure is for the program to terminate and return to DOS.
    You may want your program to provide its own routine to handle this
    situation. The example uses INT 21H, function 25H, to set the address for
    Ctrl+Break in the interrupt table (INT 23H) for its own routine, CIOBRK.
    The routine could reinitialize the program or do whatever is necessary. The
    code is as follows:
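A minimal sketch of such a setup follows. It is an illustration under stated assumptions rather than the original listing: function 25H expects the interrupt number in the AL and the address of the new routine in DS:DX, and the body of CIOBRK is left as a comment because the action it takes depends on the program.

```asm
        PUSH  DS                   ;Save program's data segment
        MOV   AX,SEG CIOBRK        ;Segment address of routine
        MOV   DS,AX                ;  to DS
        MOV   DX,OFFSET CIOBRK     ;Offset address of routine to DX
        MOV   AH,25H               ;Request set interrupt address
        MOV   AL,23H               ;  for INT 23H (Ctrl+Break)
        INT   21H                  ;Call DOS
        POP   DS                   ;Restore data segment
;       (the main program continues here)

CIOBRK  PROC  FAR                  ;Entered on Ctrl+Break or Ctrl+C
;       (reinitialize the program or take other action here)
        IRET                       ;Resume the interrupted operation
CIOBRK  ENDP
```

Returning with IRET lets the program carry on instead of terminating, which is the point of supplying the routine.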
44H I/O control for devices. Supports an extensive set of subfunctions for
checking devices and reading and writing data, listed in the following
functions:
4400H Get device information. (See Chapter 18.)
4401H Set device information. (See Chapter 18.)
4404H Read control data from drive. (See Chapter 18.)
4405H Write control data to drive. (See Chapter 18.)
4406H Check input status. (See Chapter 18.)
4407H Check output status. (See Chapter 18.)
4408H Determine if removable media for device. (See Chapter 18.)
440DH, Minor Code 41H Write disk sector. (See Chapter 18.)
440DH, Minor Code 61H Read disk sector. (See Chapter 18.)
440DH, Minor Code 42H Format track. (See Chapter 18.)
440DH, Minor Code 46H Set media ID. (See Chapter 18.)
440DH, Minor Code 60H Get device parameters. (See Chapter 18.)
440DH, Minor Code 66H Get media ID. (See Chapter 18.)
440DH, Minor Code 68H Sense media type. (See Chapter 18.)
45H Duplicate a file handle. (See Chapter 18.)
46H Force duplicate of handle. (See Chapter 18.)
47H Get current directory. (See Chapter 18.)
48H Allocate memory block. (See Chapter 24.)
49H Free allocated memory block. (See Chapter 24.)
4AH Set allocated memory block size. (See Chapter 24.)
4BH Load/execute a program. (See Chapter 24.)
4CH Terminate program. (See Chapter 4.) This is the standard operation for ter-
minating a program.
4DH Retrieve return code of a subprocess. (See Chapter 24.)
4EH Find first matching directory entry. (See Chapter 18.)
4FH Find next matching directory entry. (See Chapter 18.)
50H Set address of program segment prefix (PSP). Load the BX with the off-
set address of the PSP for the current program. No values are returned.
51H Get address of program segment prefix (PSP). Returns the offset address
of the PSP for the current program. (See Chapter 24.)
52H Get address of internal DOS list (undocumented, see Chapter 24).
54H Get verify state. (See Chapter 18.)
56H Rename a file. (See Chapter 18.)
57H Get/set file date and time. (See Chapter 18.)
5800H Get memory allocation strategy. (See Chapter 24.)
5801H Set memory allocation strategy. (See Chapter 24.)
5802H Get upper memory link. (See Chapter 24.)
5803H Set upper memory link. (See Chapter 24.)
59H Get extended error code. (See Chapter 18.)
5AH Create a temporary file. (See Chapter 18.)
5BH Create a new file. (See Chapter 18.)
5CH Lock/unlock file access. Used for networking and multitasking environments.
5DH Set extended error. Load the DX with the offset address of a table of
    information on errors. The table is to be retrieved by the next execution of function
    59H (get extended error code; see function 59H in Chapter 18 for details).
5EH Local area network services. A subfunction in the AL specifies the service:
    00H Get machine name
    02H Set printer setup
    03H Get printer setup
5FH Local area network services. A subfunction in the AL specifies the service:
    02H Get assign-list entry
    03H Make network connection
    04H Cancel network connection
62H Get address of PSP. (See function 51H for an identical operation.)
65H Get extended country information. Supports a number of subfunctions con-
cerning information specific to various countries.
66H Get/set global code page.
67H Set maximum handle count. (See Chapter 24.)
68H Commit file. (See Chapter 18.)
6CH Extended open file. Combines functions 3CH (create file), 3DH (open file),
and 5BH (create unique file). (See Chapter 18.)
KEY POINTS
QUESTIONS
26-1. What interrupts are reserved for DOS?
26-2. Identify the functions for the following DOS INT 21H services: (a) communications input;
(b) get system time; (c) get DOS version; (d) terminate but stay resident; (e) get address of
interrupt table; (f) create subdirectory; (g) get free disk space; (h) get address of PSP.
26-3. Identify the following INT 21H functions: (a) 05H; (b) 0AH; (c) 0FH; (d) 16H; (e) 35H;
(f) 3CH; (g) 3DH; (h) 3FH; (i) 40H.
CHAPTER 27
Operators and Directives
OBJECTIVE:
To describe in detail the assembly language operators and
directives.
INTRODUCTION
The various assembly language features at first tend to be somewhat overwhelming.
But once you have become familiar with the simpler and more common features de-
scribed in earlier chapters, you should find the descriptions in this chapter more easily
understood and a handy reference. Here, we describe the various type specifiers, opera-
tors, and directives. The assembly language manual contains a few other marginally use-
ful features.
TYPE SPECIFIERS
Type specifiers can provide the size of a data variable or the relative distance of an in-
struction label. Type specifiers that give the size of a data variable are BYTE, WORD,
DWORD, FWORD, QWORD, and TBYTE. Those that give the distance of an instruction
label are NEAR, FAR, and PROC. A near address, which is simply an offset, is assumed to
be in the current segment; a far address, which consists of a segment:offset address, can be
used to access another segment.
The PTR and THIS operators, as well as the COMM, EXTRN, LABEL, and PROC
directives, use type specifiers.
OPERATORS
An operator provides a facility for changing or analyzing operands during an assembly. Op-
erators are divided into various categories:
Calculation operators: Arithmetic, index, logical, shift, and structure field name.
Macro operators: Various types, covered in Chapter 22.
Record operators: MASK and WIDTH, covered later in this chapter under the
RECORD directive.
Relational operators: EQ, GE, GT, LE, LT, and NE.
Segment operators: OFFSET, SEG, and segment override.
Type (or attribute) operators: HIGH, HIGHWORD, LENGTH, LOW, LOWWORD,
PTR, SHORT, SIZE, THIS, and TYPE.
Since a knowledge of these categories is not necessary, we’ll simply cover the oper-
ators in alphabetic sequence.
Arithmetic Operators
Arithmetic operators include the familiar arithmetic signs and perform arithmetic during an
assembly. In most cases, you could perform the calculation yourself, although the advan-
tage of using these operators is that every time you change the program and reassemble it,
the assembler automatically recalculates the values of the arithmetic operators. Following
is a list of the operators, together with an example of their use and the effect obtained:
Except for addition (+) and subtraction (-), all operands must be integer constants.
The following related examples of integer expressions are illustrative:
value1 = 12 * 4 ;48
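A few further expressions in the same vein are sketched below. The names value2 through value4 are hypothetical, and each result shown is what the assembler would calculate at assembly time.

```asm
value1  =     12 * 4               ;48
value2  =     value1 / 6           ;8
value3  =     value2 MOD 5         ;3 (remainder of 8 / 5)
value4  =     value3 + value1      ;51
```

If value1 is later changed, reassembling the program recalculates the other three values automatically.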
INDEX Operators
For a direct memory reference, one operand of an instruction specifies the name of a de-
fined variable, as shown by COUNTER in the instruction ADD CX,COUNTER. During
execution, the processor locates the specified variable in memory by combining the offset
value of the variable with the data segment address in the DS.
For indirect addressing of memory, an operand references a base or index register,
constants, offset variables, and variables. The index operator, which uses square brackets,
acts like a plus (+) sign. A typical use of indexing is to reference data items in tables. You
can use the following operations to reference indexed memory:
• [Constant], i.e., an immediate number or name in square brackets. For example, load
  the fifth entry of TABLEA into the CL (note that TABLEA[0] is the first entry):
TABLEA  DB    25 DUP(?)        ;Defined table
        MOV   CL,TABLEA[4]     ;Fifth entry
• Base register BX as [BX] in association with the DS segment register, and base
  register BP as [BP] in association with the SS segment register. For example, use the
  offset address in the BX (combined with the segment address in the DS register), and
  move the referenced item to the DX:
        MOV   DX,[BX]          ;Base register DS:BX
• Index register DI as [DI] and index register SI as [SI], both in association with the DS
  segment register. For example, combine the address in the DS with the offset address
  in the SI, and move the referenced item to the AX:
        MOV   AX,[SI]          ;Index register DS:SI
• Combined index registers. For example, move the contents of the AX to the address
  determined by adding the DS address, the BX offset, the SI offset, and the constant 4:
        MOV   [BX+SI+4],AX     ;Base + index + constant
The preceding example could also be coded as [BX+SI]+4. You may combine these
operands in any sequence, but don’t combine two base registers [BX+BP] or two index
registers [DI+SI]. Only the index registers must be in square brackets.
LENGTH Operator
The LENGTH operator returns the number of entries defined by a DUP operator. The
following MOV instruction returns the length 10 to the DX:
TABLEA  DW    10 DUP(?)
        MOV   DX,LENGTH TABLEA     ;Return length 10
If the referenced operand does not contain a DUP entry, the operator returns the value
01. (See also the SIZE and TYPE operators.)
Logical Operators
The logical operators perform logical operations on the bits in an expression:
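The operators are AND, OR, XOR, and NOT. The following fragment is a sketch of how each might appear in an operand; the assembler evaluates the expression and generates only the resulting immediate value. The bit patterns are arbitrary, and the NOT result is masked to eight bits here so that it fits a byte register under any assembler version.

```asm
        MOV   CL,00111100B AND 01010101B       ;CL = 00010100B
        MOV   DH,00111100B OR 01010101B        ;DH = 01111101B
        MOV   BX,00111100B XOR 01010101B       ;BX = 0069H (01101001B)
        MOV   DL,(NOT 01010101B) AND 0FFH      ;DL = 10101010B
```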
OFFSET Operator
The OFFSET operator returns the offset address (that is, the relative address within the data
segment or code segment) of a variable or label. The general format is
OFFSET variable or label
Note that LEA doesn’t require OFFSET to return the same value:
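The two forms can be compared in a short sketch, assuming that TABLEA is a variable defined in the data segment:

```asm
        MOV   DX,OFFSET TABLEA     ;Load offset address of TABLEA
        LEA   DX,TABLEA            ;Load the same offset address
```

Both instructions leave the offset of TABLEA in the DX. The MOV form has the assembler supply the offset as an immediate value, whereas LEA computes the address during execution.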
MASK Operator
See “RECORD directive” in the section entitled “Directives.”
PTR Operator
The PTR operator can be used on data variables and instruction labels. It uses the type
specifiers BYTE, WORD, FWORD, DWORD, QWORD, and TBYTE to specify a size in an
ambiguous operand or to override the defined type (DB, DW, DF, DD, DQ, or DT) for
variables. It also uses the type specifiers NEAR, FAR, and PROC to override the implied
distance of labels. The general format for PTR is
type PTR expression
The type is the new attribute, such as BYTE. The expression is a variable or constant. Fol-
lowing are examples of the PTR operator (watch out for FLDW, where the bytes are in re-
verse sequence):
FLDB DB PAPAs|
DB 35H
FLDW DW 2672H ;Stored as 7226
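Instructions that apply PTR to these fields might look like the following sketch. It assumes the definitions of FLDB and FLDW just given; the choice of registers is arbitrary.

```asm
        MOV   AH,BYTE PTR FLDW     ;Move first byte of FLDW (72)
        ADD   BL,BYTE PTR FLDW+1   ;Add second byte of FLDW (26)
        MOV   BYTE PTR FLDB,05     ;Move 05 to the byte at FLDB
        MOV   AX,WORD PTR FLDB     ;Move two bytes, starting at FLDB
```

The last instruction overrides the DB definition of FLDB so that a full word can be loaded into the AX.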
A feature that performs a similar function to PTR is the LABEL directive, described
later.
SEG Operator
The SEG operator returns the address of the segment in which a specified variable or label
is placed. Programs that combine separately assembled segments would most likely use this
operator. The general format is
SEG variable or label
The following MOV instructions return the address of the segment in which the ref-
erenced names are defined:
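A sketch of two such instructions follows. FLDW is assumed to be a data item and A20LOOP an instruction label; both names are hypothetical.

```asm
        MOV   AX,SEG FLDW          ;Address of segment containing FLDW
        MOV   DX,SEG A20LOOP       ;Address of segment containing A20LOOP
```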
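Segment Override Operator
This operator, coded as a colon (:), calculates the address of a label or variable relative
to a particular segment. The general format is
segment:expression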
The named segment can be any of the segment registers or a segment or group name. The
expression can be a constant, an expression, or a SEG expression. These next examples
override the default DS segment register:
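Two overrides of this kind can be sketched as follows. The offset 10H and the registers used are arbitrary choices for illustration.

```asm
        MOV   BH,ES:[10H]          ;Byte at offset 10H in the ES segment
        MOV   CX,SS:[BX]           ;Word at the offset in BX, in the SS segment
```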
An instruction may have a segment override operator apply to only one operand.
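SHL and SHR Operators
The SHL and SHR operators shift an expression to the left or right during an assembly.
In the following example, the SHR operator shifts the bit constant three bits to the right: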
Most likely, the expression would reference a symbolic name rather than a constant value.
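Both forms are sketched below, first with a constant and then with a symbolic name. BITVAL is a hypothetical name defined here only for the illustration.

```asm
        MOV   BL,01011101B SHR 3   ;Load 00001011B
BITVAL  EQU   01011101B            ;Hypothetical symbolic name
        MOV   BL,BITVAL SHR 3      ;Load 00001011B
```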
SHORT Operator
The purpose of the SHORT operator is to modify the NEAR attribute of a JMP destination
that is within +127 and -128 bytes. The format is
JMP SHORT label
The assembler reduces the machine code operand from two bytes to one. This feature is
useful for near jumps that branch forward, since otherwise the assembler initially doesn’t know
the distance of the jump address and may assume a two-byte operand.
SIZE Operator
The SIZE operator returns the product of LENGTH times TYPE and is useful only if the
referenced variable contains the DUP entry. The general format is
SIZE variable
THIS Operator
The THIS operator creates an operand with segment and offset values that are equal to those
of the current location counter. The general format is
THIS type
The type specifier can be BYTE, WORD, DWORD, FWORD, QWORD, or TBYTE for
variables and NEAR, FAR, or PROC for labels. You typically use THIS with the EQU or
equals sign (=) directive. The following example defines FLDA:
FLDA EQU THIS BYTE
TYPE Operator
The TYPE operator returns the number of bytes, according to the definition of the referenced
variable. However, the operation always returns 1 for a string variable and 0 for a constant.
The following examples illustrate the TYPE, LENGTH, and SIZE operators:
FLDB DB ? ;Define one byte
TABLEA DW 10 DUP(?) ;Define 10 words
Since TABLEA is defined as DW, TYPE returns 0002H, LENGTH returns 000AH based
on the DUP entry, and SIZE returns type times length, or 14H (20).
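The three operators can be seen side by side in the following sketch. To keep the fragment self-contained, it defines its own items under the hypothetical names BYTEFLD and WORDTAB, with a table of ten words.

```asm
BYTEFLD DB    ?                    ;One byte
WORDTAB DW    10 DUP(?)            ;Ten words
        MOV   AX,TYPE BYTEFLD      ;AX = 0001H
        MOV   AX,TYPE WORDTAB      ;AX = 0002H
        MOV   CX,LENGTH WORDTAB    ;CX = 000AH (10)
        MOV   DX,SIZE WORDTAB      ;DX = 0014H (20)
```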
WIDTH Operator
See “RECORD Directive” in the following section.
DIRECTIVES
This section describes most of the assembly language directives. Chapter 4 covered in de-
tail the directives for defining data (DB, DW, etc.), and Chapter 22 covered the directives
for macro instructions, so they aren’t repeated here. Directives are divided into various
categories:
Since a knowledge of these categories is not necessary, we'll cover the directives
(other than macro-related ones) in alphabetic sequence.
ALIGN Directive
MASM 5.0 introduced the ALIGN directive to force the assembler to align the next data
item or instruction according to a given value. The general format is
ALIGN number
The number must be a power of 2, such as 2, 4, 8, or 16. For the statement ALIGN 4, the
assembler advances its location counter to the next address that is evenly divisible by 4. If
the location counter is already at the required address, it is not advanced. The assembler
fills unused bytes with zeros for data and NOPs for instructions. Note that ALIGN 2 has the
same effect as EVEN.
Alignment is no advantage on the 8088 processor, which accesses only one byte at a
time, but can speed up more advanced processors.
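As a sketch of the effect, the following data segment forces a doubleword onto a four-byte boundary. The segment and field names are hypothetical, and the segment is given PARA alignment on the assumption that the assembler requires the segment itself to be aligned at least as strictly as the ALIGN value.

```asm
DATASG  SEGMENT PARA 'Data'        ;Paragraph-aligned segment
FLAGB   DB    ?                    ;Occupies offset 0000
        ALIGN 4                    ;Advance location counter to 0004
TOTAL   DD    0                    ;Doubleword begins at offset 0004
DATASG  ENDS
```

The three bytes skipped between FLAGB and TOTAL are the unused bytes that the assembler fills.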
.ALPHA Directive
The .ALPHA directive, placed at or near the start of a program, tells the assembler to
arrange segments in alphabetic sequence. It overrides the assembler option /S. (See also the
.SEQ directive.)
ASSUME Directive
ASSUME tells the assembler to associate segment names with the CS, DS, ES, and SS seg-
ment registers. The general format is
ASSUME seg-reg:seg-name [, ... ]
Valid segment register entries are CS, DS, ES, and SS, plus FS and GS on the 80386 and
later processors. Valid segment names are those of segment registers, NOTHING,
GROUPs, and a SEG expression. One ASSUME statement may assign up to four segment
registers, in any sequence. The simplified segment directives automatically generate an
ASSUME.
In the following ASSUME statement, CODESG, DATASG, and STACK are the
names the program has used to define the segments:
ASSUME CS:CODESG,DS:DATASG,SS:STACK,ES:DATASG
Omission of a segment reference is the same as coding NOTHING. Use of the key-
word NOTHING also cancels any previous ASSUME for a specified segment register:
ASSUME ES:NOTHING
Suppose that you neither assign the ES register nor use NOTHING to cancel it. Then,
to reference a data item in the data segment, an instruction operand may use the segment
override operator (:) to reference the ES register, which must contain a valid address:
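Such references might be coded as in this sketch, which assumes that the ES already contains the address of the data segment. WORDFLD is a hypothetical word defined in that segment.

```asm
        MOV   AX,ES:[BX]           ;Use indirect address via BX
        MOV   AX,ES:WORDFLD        ;Use direct address of WORDFLD
```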
.CODE Directive
This simplified segment directive defines the code segment. Its general format is
.CODE [name]
All executable code must be placed in this segment. For TINY, SMALL, and COMPACT
models, the default segment name is _TEXT. The MEDIUM and LARGE memory models
permit multiple code segments, which you distinguish by means of the name operand. (See
also the .MODEL directive.)
COMM Directive
Defining a variable as COMM gives it both the PUBLIC and EXTRN attributes. In this
way, you would not have to define the variable as PUBLIC in one module and EXTRN in
another. The general format is
COMM [NEAR/FAR] label:size[:count]
COMMENT Directive
This directive is useful for multiple lines of comments. The general format is
COMMENT delimiter [comments]
[comments ]
delimiter [comments]
The delimiter is the first nonblank character, such as % or +, following COMMENT. The
comments terminate on the line on which the second delimiter appears. This next example
uses a plus sign as a delimiter:
COMMENT + This routine scans
the input stream
for invalid
+ characters.
.CONST Directive
This simplified segment directive defines a data (or constant-data) segment with the ‘const’
class. (See also the .MODEL directive.)
.CREF Directive
This directive (the default) tells the assembler to generate a cross-reference table. It would
be used following an .XCREF directive that caused suppression of the table.
.DATA and .DATA? Directives
These simplified segment directives define data segments. .DATA defines a segment for
initialized near data; .DATA? defines a segment for uninitialized near data, usually used
when linking to a high-level language. For a stand-alone assembly program, you may also
define uninitialized near data in a .DATA segment. (See, in addition, the .FARDATA and
.MODEL directives.)
DOSSEG Directive
There are a number of ways to control the sequence in which the assembler arranges seg-
ments. (Some versions arrange them alphabetically.) You may code the .SEQ or .ALPHA
directives at the start of a program, or you may enter the /S or /A assembler options at as-
sembly time. The DOSSEG (.DOSSEG since MASM 6.0) directive tells the assembler to
ignore all other requests and to adopt the DOS segment sequence—basically, code, data,
and stack. Code this directive at or near the start of the program, primarily to facilitate the
use of CODEVIEW for stand-alone programs.
END Directive
The END directive is placed at the end of a source program. The general format is
END [start-address]
The optional start-address indicates the location in the code segment (usually the first in-
struction) where execution is to begin. The system loader uses this address to initialize the
CS register. If your program consists of only one module, define a start-address. If it con-
sists of a number of modules, only one (usually the first) has a start-address.
ENDP Directive
This directive indicates the end of a procedure, defined by PROC. The general format is
label ENDP
The label is the same as the one that defines the procedure.
ENDS Directive
This directive indicates the end of a segment (defined as SEGMENT) or a structure. Its gen-
eral format is
label ENDS
The label is the same as the one that defines the segment or structure.
EQU Directive
The EQU directive is used to redefine a data name or variable with another data name, vari-
able, or immediate value. The directive should be defined in a program before it is refer-
enced. The formats for numeric and string data differ:
Numeric equate: name EQU expression
String equate: name EQU <string>
The assembler replaces each occurrence of the name with the operand. Since EQU is used
for simple replacement, it takes no additional storage in the generated object program.
COUNTER DW 0
SUM EQU COUNTER ;Another name for COUNTER
TEN EQU 10 ;Numeric value
.ERR Directives
These conditional error directives can be used to help test for errors during an assembly:
You could use the preceding directives in macros and in conditional assembly statements.
In the following conditional assembly statements, the assembler displays a message if the
condition is not true:
IF condition
ELSE
.ERR
%OUT [message]
ENDIF
Since MASM 6.0, it is no longer necessary to refer to pass 1 (.ERR1) or pass 2 (.ERR2) of
an assembly.
EVEN Directive
EVEN tells the assembler to advance its location counter if necessary so that the next de-
fined data item or label is aligned on an even storage boundary. This feature makes pro-
cessing more efficient on processors that access 16 or 32 bits at a time. (See also the ALIGN
directive.)
In the following example, BYTELOCN is a one-byte field on an even boundary, at offset
0016, so the assembler’s location counter then stands at 0017. EVEN causes the assembler
to advance the location counter one byte to 0018:
0016 BYTELOCN DB ?
0017 EVEN [0017 NOP]
0018 WORDLOCN DW ?
EXTRN Directive
The EXTRN (or EXTERN since MASM 6.0) directive informs the assembler and linker
about data variables and labels that the current assembly references, but that another mod-
ule (linked to the current one) defines. The general format is
EXTRN name:type [, ... ]
The name entry is an item defined in another assembly and declared in it as PUBLIC. The
type specifier can refer to either of the following:
• Data items: ABS (a constant), BYTE, WORD, DWORD, FWORD, QWORD,
TBYTE. Code the EXTRN in the segment in which the item occurs.
• Distance: NEAR or FAR. Code NEAR in the segment in which the item occurs, and
code FAR anywhere.
In the next example, the calling program defines CONVAL as PUBLIC and as a DW.
The called subprogram identifies CONVAL (in another segment) as EXTRN and FAR. The
code is as follows:
Calling program:
DSEG1 SEGMENT
PUBLIC CONVAL
CONVAL DW 4
DSEG1 ENDS
Called subprogram:
EXTRN CONVAL:FAR
DSEG2 SEGMENT
DSEG2 ENDS
See Chapter 23 for examples of EXTRN.
GROUP Directive
A program may contain several segments of the same type (code, data, or stack). The pur-
pose of the GROUP directive is to collect them all under one name, so that they reside
within one segment, usually a data segment. The general format is
name GROUP seg-name [, ... ]
The following GROUP combines SEG1 and SEG2 in the same assembly module:
SEG1 ENDS
ASSUME DS:GROUPX
SEG2 ENDS
The effect of using GROUP is similar to giving the segments the same name and the
PUBLIC attribute.
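The following is a minimal sketch of a complete GROUP definition, not the original listing. The data items FLDA and FLDB and the PARA 'Data' operands are assumptions added for illustration:

```asm
GROUPX  GROUP   SEG1,SEG2           ;Collect both segments under GROUPX
SEG1    SEGMENT PARA 'Data'
        ASSUME  DS:GROUPX
FLDA    DW      0                   ;Hypothetical data item
SEG1    ENDS
SEG2    SEGMENT PARA 'Data'
        ASSUME  DS:GROUPX
FLDB    DW      0                   ;Hypothetical data item
SEG2    ENDS
```

A program that uses such a group would typically load the DS register with the group address (MOV AX,GROUPX followed by MOV DS,AX), so that offsets of items in both segments are relative to the start of the group.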
INCLUDE Directive
You may have sections of assembly code or macro instructions that various programs use.
If so, you may store these in separate disk files available for use by any program. Suppose
that a routine that converts ASCII code to binary is stored on drive D in a file named
CONVERT.LIB. To access the file, insert an INCLUDE statement such as
INCLUDE D:CONVERT.LIB
at the location in the source program where you would normally code the ASCII conver-
sion routine. The assembler then locates the file on disk and includes the statements in your
own program. (If the assembler cannot find the file, it issues an error message and ignores
the INCLUDE.)
For each included line, the assembler prints a C in column 30 of the .LST file and be-
gins the source code in column 33.
Chapter 22 gives a practical example of INCLUDE and explains how to use the
directive for only pass 1 of an assembly.
LABEL Directive
The LABEL directive enables you to redefine the attribute of a data variable or instruction
label. The general format is
name LABEL type-specifier
For labels, you may use LABEL to redefine executable code as NEAR, FAR, or PROC,
such as for a secondary entry point into a procedure. For variables, you may use the type
specifiers BYTE, WORD, DWORD, FWORD, QWORD, or TBYTE, or a structure name,
to redefine data items and the names of structures, respectively. For example, LABEL en-
ables you to define a field as both DB and DW. The following illustrates the use of BYTE
and WORD types:
REDEFB LABEL BYTE
FIELDW DW 2532H
FIELDB DB 25H
DB 32H
The first MOV instruction moves only the first byte of FIELDW. The second MOV moves
the two bytes beginning at FIELDB. The PTR operator performs a similar function.
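The two MOV instructions described above can be sketched as follows. This is an illustration rather than the original listing: the name REDEFW and the value 2532H for FIELDW are assumptions.

```asm
REDEFB  LABEL   BYTE            ;Byte view of FIELDW
FIELDW  DW      2532H           ;Stored in memory as 32 25 (value assumed)
REDEFW  LABEL   WORD            ;Word view of FIELDB (hypothetical name)
FIELDB  DB      25H
        DB      32H
        ;The following instructions belong in the code segment:
        MOV     AL,REDEFB       ;Moves first byte of FIELDW (32H)
        MOV     BX,REDEFW       ;Moves two bytes from FIELDB (BX = 3225H)
```

Because a word is stored with its low-order byte first, the byte at REDEFB is 32H, and the word that begins at FIELDB loads as 3225H.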
.LIST Directive
The .LIST directive (the default) causes the assembler to list the source program. You may
use the .XLIST directive anywhere in an assembly source program to discontinue listing it.
A typical situation is where statements are common to other programs and you don’t need
another listing. .LIST resumes the listing. Code both of these directives with no operand.
.MODEL Directive
This simplified segment directive creates default segments and the required ASSUME and
GROUP statements. Its general format is
.MODEL memory-model
The .STACK directive defines the stack, .CODE defines the code segment, and any
or all of .DATA, .DATA?, .FARDATA, and .FARDATA? may define data segments. Here
is an example:
.MODEL SMALL
.STACK 120
.DATA
[data items]
.CODE
[instructions]
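To show the skeleton filled in, here is a minimal sketch of a complete small-model program. The data item MESSAGE, the procedure name BEGIN, and the stack size are assumptions chosen for illustration:

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
MESSAGE DB      'Hello',13,10,'$'   ;Hypothetical data item
        .CODE
BEGIN   PROC    FAR
        MOV     AX,@data            ;Address of the data segment
        MOV     DS,AX
        MOV     AH,09H              ;DOS function 09H: display string
        LEA     DX,MESSAGE
        INT     21H
        MOV     AX,4C00H            ;DOS function 4CH: end processing
        INT     21H
BEGIN   ENDP
        END     BEGIN
```

The predefined equate @data supplies the data segment address, so no explicit ASSUME or SEGMENT statements are needed.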
ORG Directive
The assembler uses a location counter to account for its relative position in a data or code
segment. Consider a data segment with the following definitions:
OFFSET NAME OPERATION OPERAND   LOCATION COUNTER
02     FLDB DB        36H       03
03     FLDC DW        212EH     05
05     FLDD DD        00000705H 09
Initially, the location counter is set to 00. Since FLDA is two bytes, the location counter is
incremented to 02 for the location of the next item. Since FLDB is one byte, the location
counter is incremented to 03, and so forth. You may use the ORG directive to change the
contents of the location counter and, accordingly, the location of the next defined items.
The general format is
ORG expression
The expression must form a two-byte absolute number and must not be a symbolic name. Sup-
pose the following data items are defined immediately after FLDD in the previous definition:
OFFSET NAME OPERATION OPERAND LOCATION COUNTER
            ORG       0       00
00     FLDX DB        ?       01
01     FLDY DW        ?       03
03     FLDZ DB        ?       04
            ORG       $+5     09
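The first ORG resets the location counter to 00. The variables that follow (FLDX, FLDY,
and FLDZ) redefine the memory locations first defined as FLDA, FLDB, and FLDC:
Offset  Original definition  Redefinition
00      FLDA (first byte)    FLDX
01      FLDA (second byte)   FLDY (first byte)
02      FLDB                 FLDY (second byte)
03      FLDC (first byte)    FLDZ
04      FLDC (second byte)   (not redefined)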
An operand containing a dollar symbol ($), as in the second ORG, refers to the cur-
rent value in the location counter. The operand $+5 therefore sets the location counter to
04 + 5, or 09, which is the same setting as after the definition of FLDD.
A reference to FLDC is to a one-word field at offset 03, and a reference to FLDZ is
to a one-byte field at offset 03:
MOV AX,FLDC ;One word
MOV AL,FLDZ ;One byte
You may use ORG to redefine memory locations in the preceding manner. But be
sure that you reset the location counter to the correct value and that you account for all re-
defined memory locations. Also, the redefined variables should not contain defined con-
stants—these would overlay constants on top of the original ones. ORG cannot appear
within a STRUC definition.
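As a sketch of how the complete set of definitions might look in a source program, consider the following data segment. The segment name DATASG and the initial value of FLDA are assumptions; the remaining values follow the text:

```asm
DATASG  SEGMENT PARA 'Data'
FLDA    DW      2542H           ;Offset 00 (initial value assumed)
FLDB    DB      36H             ;Offset 02
FLDC    DW      212EH           ;Offset 03
FLDD    DD      00000705H       ;Offset 05
        ORG     0               ;Reset location counter to 00
FLDX    DB      ?               ;Offset 00
FLDY    DW      ?               ;Offset 01
FLDZ    DB      ?               ;Offset 03
        ORG     $+5             ;04 + 5 = 09, the first byte after FLDD
DATASG  ENDS
```

The redefined items use ? rather than constants, so the values originally defined at offsets 00 through 03 are left intact.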
%OUT Directive
This directive tells the assembler to direct a message to the standard output device (usually
the screen). (Since MASM 6.0, the name is ECHO.) The general format is
%OUT message
PAGE Directive
The PAGE directive at the start of a source program specifies the maximum number of lines
to list on a page and the maximum number of characters on a line. Its general format is
PAGE [[length],width]
The following example sets 60 lines per page and 132 characters per line:
PAGE 60,132
The number of lines per page may range from 10 to 255, and the number of characters per
line may range from 60 to 132. Omission of a PAGE statement causes the assembler to as-
sume PAGE 50,80. To force a page to eject at a specific line, such as at the end of a seg-
ment, code PAGE with no operand.
PROC Directive
A procedure is a block of code that begins with the PROC directive and terminates with
ENDP. A typical use is for a subroutine within the code segment. Although technically, you
may enter a procedure in line or by a JMP instruction, the normal practice is to use CALL
to enter and RET to exit. The CALL operand may be a NEAR or FAR type specifier, and
RET assumes the same type.
A procedure that is in the same segment as the calling procedure is a NEAR proce-
dure and is accessed by an offset:
proc-name PROC [NEAR]
An omitted operand defaults to NEAR. If a called procedure is external to the calling seg-
ment, it must be declared as PUBLIC, and you should use CALL to enter it.
For an .EXE program, the main PROC that is the entry point for execution must be
FAR. Also, a called procedure under a different ASSUME CS value must have the FAR
attribute:
PUBLIC proc-name
proc-name PROC FAR
A far label may be in another segment, which CALL accesses by a segment address
and offset.
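The following minimal sketch shows a FAR main procedure that calls a NEAR procedure in the same code segment. All of the names (STACKSG, CODESG, BEGIN, and B10CALC) are hypothetical, and the body of B10CALC is only a placeholder:

```asm
STACKSG SEGMENT PARA STACK 'Stack'
        DW      32 DUP(0)
STACKSG ENDS
CODESG  SEGMENT PARA 'Code'
        ASSUME  CS:CODESG,SS:STACKSG
BEGIN   PROC    FAR             ;Entry point of an .EXE program
        CALL    B10CALC         ;Near call: pushes only the IP
        MOV     AX,4C00H        ;DOS function 4CH: end processing
        INT     21H
BEGIN   ENDP
B10CALC PROC    NEAR            ;Same segment as the caller
        INC     BX              ;Placeholder instruction
        RET                     ;Near return: pops the IP
B10CALC ENDP
CODESG  ENDS
        END     BEGIN
```

Because B10CALC is declared NEAR, the assembler generates a near CALL and a near RET; declaring it FAR would cause both to save and restore the CS as well.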
Processor Directives
These directives define the processors that the assembler is to recognize. The normal place-
ment of processor directives is at the start of a source program, although you could code
them inside a program at a point where you want a processor enabled or disabled. A refer-
ence to the 8086 also assumes the 8088, and .486 was introduced by MASM 6.0.
• .8086 enables the 8086 and its 8087 coprocessor (the default mode).
• .186, .286, .386, and .486 enable all the instruction sets up to and including the named
processor and its associated coprocessor. That is, the directive permits instructions of
earlier processors. (For example, .386 enables .387, .286, .186, and .8086.)
• .186P, .286P, .386P, and .486P enable all the instruction sets just cited, plus the
processor’s privileged instructions.
PUBLIC Directive
The purpose of the PUBLIC directive is to inform the assembler and linker that the identi-
fied symbols in an assembly are to be referenced by other modules linked with the current
one. The general format is
PUBLIC symbol [, ... ]
The symbol can be a label, a number (up to two bytes), or a variable. See the “EXTRN Di-
rective” section and Chapter 23 for examples.
RECORD Directive
The RECORD directive enables you to define patterns of bits. One purpose is to define
switch indicators either as one bit or as multibit. The general format is
record-name RECORD field-name:width[=exp] [, ... ]
The record name and the field names may be any unique valid identifiers. Following each
field name is a colon (:) and a width—the number of bits. The range of the width entry is 1
to 16 bits:
Any length up to 8 becomes 8 bits, and lengths 9 to 16 become 16 bits, with the con-
tents right adjusted if necessary. The following example defines RECORD:
BITREC RECORD BIT1:3,BIT2:7,BIT3:6
BIT1 defines the first 3 bits of BITREC, BIT2 defines the next 7, and BIT3 defines the last
6. The total is 16 bits, or one word. You may initialize values in RECORD as follows:
BITREC2 RECORD BIT1:3=101B, BIT2:7=0110110B, BIT3:6=011010B
Suppose that a definition of RECORD is at the start of the data segment. Within the
data segment, there should be another statement that allocates storage for the record. De-
fine a unique valid name, the record name, and an operand consisting of angle brackets (the
less-than and greater-than symbols):
DEFBITS BITREC2 <>
The allocation for DEFBITS generates object code AD9AH (stored as 9AAD) in the data
segment. The angle brackets may also contain entries that redefine the field values of the record.
The program in Figure 27-1 defines BITREC as RECORD, but without initial val-
ues in the record fields. In this case, an allocation statement in the data segment initializes
each field as shown within angle brackets.
Record-specific operators are WIDTH, shift count, and MASK. The use of these op-
erators permits you to change a RECORD definition without having to change the instruc-
tions that reference it.
WIDTH operator. The WIDTH operator returns a width as the number of bits in
a RECORD or in a RECORD field. For example, in Figure 27-1, following A10 are two
examples of WIDTH. The first MOV returns the width of the entire RECORD BITREC (16
bits); the second MOV returns the width of the record field BIT2 (7 bits). In both cases, the
assembler has generated an immediate operand for WIDTH.
Shift count. A direct reference to a RECORD field, such as
MOV CL,BIT2
does not refer to the contents of BIT2. (Indeed, that would be rather difficult.) Instead, the
assembler generates an immediate operand that contains a shift count to help you isolate the
field. The immediate value represents the number of bits that you would have to shift BIT2
to right adjust it. In Figure 27-1, the three examples following B10 return the shift count
for BIT1, BIT2, and BIT3.
MASK operator. The MASK operator returns a mask of 1-bits representing the
specified field and, in effect, defines the bit positions that the field occupies. For example,
the MASK for each of the fields defined in BITREC is
FIELD BINARY           HEX
BIT1  1110000000000000 E000
BIT2  0001111111000000 1FC0
BIT3  0000000000111111 003F
In Figure 27-1, the three instructions following C10 return the MASK values for
BIT1, BIT2, and BIT3. The instructions following D10 and E10 isolate BIT2 and BIT1, re-
spectively, from BITREC. D10 gets the record into the AX register and uses a MASK of
BIT2 to AND it:
Record:        101 0110110 011010
AND MASK BIT2: 000 1111111 000000
Result:        000 0110110 000000
The effect is to clear all bits except those of BIT2. The next two instructions cause
the AX to shift six bits so that BIT2 is right-adjusted:
0000000000110110 (0036H)
The example following E10 gets the record into the AX, and because BIT1 is the left-
most field, the routine simply uses its shift factor to shift right 13 bits:
0000000000000101 (0005H)
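The isolation of BIT2 and BIT1 described above can be sketched as follows. This is not the listing from Figure 27-1; the initial field values in the allocation of DEFBITS are assumed to match the bit pattern used in the text:

```asm
BITREC  RECORD  BIT1:3,BIT2:7,BIT3:6        ;Record layout from the text
DEFBITS BITREC  <101B,0110110B,011010B>     ;Allocates AD9AH (values assumed)
        ;The following instructions belong in the code segment:
        MOV     AX,DEFBITS      ;AX = 1010110110011010 (AD9AH)
        AND     AX,MASK BIT2    ;AX = 0000110110000000 (0D80H)
        MOV     CL,BIT2         ;CL = 6, the shift count for BIT2
        SHR     AX,CL           ;AX = 0000000000110110 (0036H)

        MOV     AX,DEFBITS      ;Get the record again
        MOV     CL,BIT1         ;CL = 13, the shift count for BIT1
        SHR     AX,CL           ;AX = 0000000000000101 (0005H)
```

Because the instructions refer to the fields only through MASK and the shift count, a change to the widths in the RECORD definition requires no change to this code.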
SEGMENT Directive
An assembly module consists of one or more segments, part of a segment, or even parts of
several segments. The general format for a segment is
seg-name SEGMENT [align] [combine] [‘class’]
seg-name ENDS
All operands are optional. The following subsections describe the entries for align, com-
bine, and class.
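Align. The align operand indicates the starting boundary for a segment:
BYTE  Next address
WORD  Next even address (divisible by 2)
DWORD Next doubleword address (divisible by 4)
PARA  Next paragraph (divisible by 16, or 10H)
PAGE  Next page address (divisible by 256, or 100H)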
PARA is commonly used for all types of segments. BYTE and WORD can be used for seg-
ments that are to be combined within another segment, usually a data segment. DWORD is
normally used with 80386 and later processors.
Combine. The combine operands NONE, PUBLIC, STACK, and COMMON in-
dicate the way the linker is to handle a segment:
NONE (default): The segment is to be logically separate from other segments, al-
though it may end up physically adjacent to them. The segment is presumed to have
its own base address.
PUBLIC: LINK loads PUBLIC segments of the same name and class adjacent to one
another. One base address is presumed for all such PUBLIC segments.
STACK: LINK treats STACK the same as PUBLIC. There must be at least one
STACK defined in a linked .EXE program. If there is more than one stack, the SP is
set to the start of the first stack.
COMMON: If COMMON segments have the same name and class, the linker gives
them the same base address. During execution, the second segment overlays the first
one. The largest segment determines the length of the common area.
AT paragraph-address: The paragraph must be defined previously. The entry facili-
tates defining labels and variables at fixed offsets within fixed areas of memory, such
as the interrupt table in low memory or the BIOS data area at 40[0]H. For example,
the code in ROM defines the location of the video display buffer as
The assembler creates a dummy segment that provides, in effect, an image of memory
locations.
‘class’. The class entry can help the linker associate segments with different
names, identify segments, and control their order. Class may contain any valid name, con-
tained in single quotes. The linker uses the name to relate segments that have the same name
and class. Typical examples are ‘Data’ and ‘Code’. If you define a class as ‘Code’, the linker
expects that segment to contain instruction code. Also, the CODEVIEW debugger expects
that class for the code segment.
The linker combines the following two segments with the same name (CSEG) and
class (‘Code’) into one physical segment under the same segment register:
Since you may want to control the ordering of segments within a program, it is use-
ful to understand how the linker handles the process. The original order of the segment
names provides the basic sequence, which you may override by means of the PUBLIC at-
tribute and class names. The following example links two object modules (both modules
contain a segment named DSEG1 with the PUBLIC attribute and identical class names):
Before linking the .OBJ modules:
module 1 SSEG SEGMENT PARA STACK
You may nest segments, provided that one nested segment is completely contained
within the other. In the following example, SEG2 is completely contained within SEG1:
SEG1 SEGMENT
SEG1 begins
SEG2 SEGMENT
SEG2 area
SEG2 ENDS
SEG1 resumes
SEG1 ENDS
The .ALPHA, .SEQ, and DOSSEG directives and the assembler options /A and /S
can also control the order of segments. (To combine segments into groups, see the GROUP
directive.)
.SEQ Directive
This directive (the default), placed at or near the start of a program, tells the assembler to
leave segments in their original sequence. It overrides the assembler option /A. (See also
the .ALPHA directive.)
.STACK Directive
This simplified segment directive defines the stack. Its general format is
.STACK [size]
The default stack size is 1,024 bytes, which you may override. (See also the .MODEL
directive.)
STRUC Directive
The STRUC directive (STRUCT since MASM 6.0) facilitates defining related fields within
a structure. Its general format is
struc-name STRUC
[ defined fields ]
struc-name ENDS
A structure begins with its name and the directive STRUC and terminates with the name and
the directive ENDS. The assembler stores the defined fields one after the other from the start of
the structure. Valid entries are DB, DW, DD, DQ, and DT definitions with optional field names.
In Figure 27-2, STRUC defines a parameter list named PARLIST for use with DOS
INT 21H, function 0AH, to input a name via the keyboard. A subsequent statement
allocates storage for the structure, making it addressable within the program:
PARAMS PARLIST <>
The angle brackets (less-than and greater-than symbols) in the operand are empty in this
example, but you may use them to redefine (or override) data within a structure.
Instructions may reference a structure directly by its name. To reference fields within
a structure, instructions must qualify them by using the allocate name of the structure
(PARAMS in the example), followed by a period that connects it with the field name, as,
for example,
MOV AL, PARAMS.ACTLEN
You may also use the allocate statement (PARAMS in Figure 27-2) to redefine the
contents of fields within a structure.
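Since Figure 27-2 is not reproduced here, the following is a minimal sketch of what such a structure might look like. Only the names PARLIST, PARAMS, and ACTLEN come from the text; the field names MAXLEN and NAMEIN and the length of 25 are assumptions:

```asm
PARLIST STRUC                   ;Parameter list for INT 21H function 0AH
MAXLEN  DB      25              ;Maximum input length (assumed value)
ACTLEN  DB      ?               ;Actual length, filled in by DOS
NAMEIN  DB      25 DUP(' ')     ;Input area
PARLIST ENDS
PARAMS  PARLIST <>              ;Allocate storage for the structure
        ;The following instructions belong in the code segment:
        MOV     AH,0AH          ;Request keyboard input
        LEA     DX,PARAMS       ;Address of the parameter list
        INT     21H
        MOV     AL,PARAMS.ACTLEN ;Number of characters entered
```

The STRUC definition itself reserves no storage; only the PARAMS statement does.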
SUBTTL Directive
The SUBTTL directive (SUBTITLE since MASM 6.0) causes a subtitle of up to 60 char-
acters to print on line 3 of each page of an assembly source listing. You may code SUBTTL
any number of times. The general format is
SUBTTL text
TEXTEQU Directive
The general format for this directive (introduced by MASM 6.0) is
name TEXTEQU [text-item]
The operand text-item can be a literal string, a constant preceded by %, or a string that a
macro function has returned.
TITLE Directive
The TITLE directive causes a title of up to 60 characters to print on line 2 of each page of
a source listing. You may code TITLE once, at the start. The general format is
TITLE text
.XCREF Directive
This directive (named .NOCREF since MASM 6.0) tells the assembler to suppress the
cross-reference table. The general format is
.XCREF [name [, ... ]]
Omitting the operand causes suppression of all entries in the table. You may also suppress
the cross-reference of particular items. Here are examples of .XCREF and .CREF:
.XLIST Directive
You may use the .XLIST directive (named .NOLIST since MASM 6.0) anywhere in a
source program to discontinue printing an assembled program. A typical situation would
be where the statements are common to other programs and you don’t need another listing.
The .LIST directive (the default) resumes the listing. Code both of these directives with no
operand.
CHAPTER 28
The PC Instruction Set
OBJECTIVE:
To explain machine code and to provide a description of
the PC instruction set.
INTRODUCTION
This chapter explains machine code and provides a list of symbolic instructions with an ex-
planation of their purpose.
Many instructions have a specific purpose, so that a one-byte machine language in-
struction code is adequate. The following are examples:
MACHINE SYMBOLIC
CODE INSTRUCTION COMMENT
40 INC AX ;Increment AX
50 PUSH AX ;Push AX
C3 RET (near) ;Near return from procedure
CB RET (far) ;Far return from procedure
FD STD ;Set direction flag
None of these instructions makes a direct reference to memory. Instructions that specify an
immediate operand, an eight-bit register, two registers, or a reference to memory are more
complex and require two or more bytes of machine code.
Machine code has a special provision for indicating a particular register and another
provision for referencing memory by means of an addressing mode byte.
REGISTER NOTATION
Instructions that reference a register may contain three bits that indicate the particular
register and a w-bit that indicates whether the width is a byte (0) or a word (1). Also,
only certain instructions may access the segment registers. Figure 28-1 shows the
complete register notations. For example, the bit value 000 means AL if the w bit is 0 and
AX if it is 1.
Here’s the symbolic and machine code for a MOV instruction with a one-byte im-
mediate operand:
In this case, the first byte of machine code indicates a width of one byte (w = 0) and refers
to the AH register (100). Here’s a MOV instruction that contains a one-word immediate
operand, along with its generated machine code:
The first byte of machine code indicates a width of one word (w = 1) and refers to the AX
register (000). For other instructions, w and reg may occupy different positions.
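As a sketch of the two cases just described, the following instructions are shown with the machine code that the 1011wreg format produces. The immediate values are assumptions chosen for illustration:

```asm
        MOV     AH,00           ;B4 00    = 1011 0 100, then one immediate byte
        MOV     AX,0000         ;B8 00 00 = 1011 1 000, then two immediate bytes
```

In the first byte, the bit after 1011 is the w bit, and the last three bits identify the register: 100 for the AH when w = 0 and 000 for the AX when w = 1.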
ADDRESSING MODE BYTE
The addressing mode byte, when present, is the second byte of machine code and consists
of three elements:
mod  A two-bit mode, where the values 00, 01, and 10 refer to memory locations
     and 11 refers to a register
reg  A three-bit reference to a register
r/m  A three-bit reference to a register or memory, where r specifies which register
     and m indicates a memory address
Also, the first byte of machine code may contain a d-bit that indicates the direction
of flow. Here’s an example of adding the AX to the BX:
In the example, d = 1 means that mod (11) and reg (011) describe the first operand and r/m
(000) describes the second operand. Since w = 1, the width is a word. Therefore, the
instruction is to add the AX (000) to the BX (011).
The second byte of the object code indicates most modes of addressing memory. The
next section examines the addressing mode in more detail.
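The encoding just described can be sketched as follows, with the two bytes of machine code shown in the comments:

```asm
        ADD     BX,AX           ;03 D8 = 000000 d=1 w=1 | mod=11 reg=011 r/m=000
```

This assumes that the assembler selects the d = 1 form. The same operation may also be encoded with d = 0 as 01 C3, in which case reg (000) names the source AX and r/m (011) names the destination BX.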
Mod Bits
The two mod bits distinguish between addressing of registers and memory. The following
explains their purpose:
00 r/m bits give the exact addressing option; there is no offset byte.
01 r/m bits give the exact addressing option; there is one offset byte.
10 r/m bits give the exact addressing option; there are two offset bytes.
11 r/m specifies a register. The w-bit (in the operation code byte) determines
whether a reference is to an 8-, 16-, or 32-bit register.
Reg Bits
The three reg bits, in association with the w-bit, determine the actual 8- or 16-bit register.
R/M Bits
The three r/m (register/memory) bits, in association with the mod bits, determine the ad-
dressing mode, as shown in Figure 28-2.
TWO-BYTE INSTRUCTIONS
The following example of a two-byte instruction adds the BX to the AX:
d = 1      reg plus w describe the first operand, and mod plus r/m plus w describe the
           second operand
w = 1      The width is a word
mod = 11   The second operand is a register
reg = 000  The first operand is the AX register
r/m = 011  The second operand is the BX register
The processor assumes that the multiplicand is in the AL if it is a byte, the AX if it is a word,
and the EAX if it is a doubleword. The width (w = 0) is a byte, mod (11) references a reg-
ister, and the register (r/m = 011) is the BL (011). Reg = 100 is not meaningful here.
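The two-byte encodings discussed in this section can be sketched as follows. The instructions are inferred from the field values given in the text, and the machine code appears in the comments:

```asm
        ADD     AX,BX           ;03 C3 = 000000 d=1 w=1 | mod=11 reg=000 r/m=011
        MUL     BL              ;F6 E3 = 1111011 w=0    | mod=11 reg=100 r/m=011
```

For MUL, the reg bits (100) do not name a register; together with the first byte, they identify the operation.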
THREE-BYTE INSTRUCTIONS
The following MOV generates three bytes of machine code:
A move from the accumulator (AX or AL) needs to know only whether the operation is byte
or word. In this example, w = 1 means a word, and the 16-bit AX is understood. (The use
of AL in the second operand would cause the w bit to be zero.) Bytes 2 and 3 contain the
offset to the memory location. The use of the accumulator register is often more efficient
(because of the shorter instruction length required for it and its faster execution) than the
use of other registers.
FOUR-BYTE INSTRUCTIONS
For this instruction, although reg is 100, the multiplicand is assumed to be in the AL.
Mod = 00 indicates a memory reference, and r/m = 110 means a direct reference to
memory. The machine instruction also contains two subsequent bytes that provide the
offset to the memory location.
The next example illustrates the LEA instruction, which specifies a word address:
Reg = 010 designates the DX register. Mod = 00 and r/m = 110 indicate a direct refer-
ence to a memory address. The two subsequent bytes provide the offset to this location.
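The three-byte and four-byte forms described above can be sketched as follows. MEMBYTE and MEMWORD are hypothetical data items, and "lo hi" stands for the two offset bytes, stored low-order byte first:

```asm
MEMBYTE DB      ?               ;Hypothetical byte in the data segment
MEMWORD DW      ?               ;Hypothetical word in the data segment
        ;The following instructions belong in the code segment:
        MOV     MEMWORD,AX      ;A3 lo hi    = 1010001 w=1 | offset
        MUL     MEMBYTE         ;F6 26 lo hi = 1111011 w=0 | mod=00 reg=100 r/m=110 | offset
        LEA     DX,MEMWORD      ;8D 16 lo hi = 10001101    | mod=00 reg=010 r/m=110 | offset
```

The accumulator form of MOV needs no mode byte, which is why it is one byte shorter than the other two instructions.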
INSTRUCTION SET
This section covers the instruction set in alphabetic sequence, although closely related in-
structions are grouped together for convenience. In addition to the preceding discussion of
mode byte and width bit, the following abbreviations are relevant:
The 80286 and later processors support a number of specialized instructions not cov-
ered here: ARPL, BOUND, CLTS, ENTER, LAR, LEAVE, LGDT, LIDT, LLDT, LMSW,
LSL, LTR, SGDT, SIDT, SLDT, SMSW, STR, VERR, and VERW. Instructions unique to
the 80486 and later are BSWAP, XADD, CMPXCHG, INVD, WBINVD, and INVLPG,
also not covered here.
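AAD: ASCII Adjust Before Division
Operation. Adjusts an unpacked BCD value in the AX (one digit in the AH and one in the
AL) into a binary value in the AL for a subsequent binary divide. It multiplies the AH by
10, adds the product to the AL, and clears the AH.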
Flags. Affects PF, SF, and ZF. (AF, CF, and OF are undefined.)
Source code. AAD (no operand)
Object code. |11010101|00001010|
CALL: Call a Procedure
Operation. Calls a near or far procedure. The assembler generates a near CALL if the called
procedure is NEAR and a far CALL if the called procedure is FAR. For near, CALL pushes
the IP (the address of the next instruction) onto the stack. It then loads the IP with the
destination offset address. For far, CALL pushes the CS onto the stack and loads the
segment address from an intersegment pointer into the CS. It then pushes the IP onto the
stack and loads the IP with the destination offset address. A subsequent near or far RET
reverses these steps on return.
Flags. Affects none.
Source code. CALL {register/memory}
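CMC: Complement Carry Flag
Operation. Complements the carry flag: reverses the CF bit value from 0 to 1 or from 1 to 0.
Flags. CF (reversed).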
Source code. CMC (no operand)
Object code. 11110101
CMP: Compare
Operation. Compares the contents of two data fields. CMP internally subtracts the second
operand from the first and sets or clears flags, but does not store the result. Both operands
are byte, word, or doubleword (80386 and later). CMP may compare register, memory, or
immediate to a register or compare register or immediate to memory. (See also CMPS.)
Flags. Affects AF, CF, OF, PF, SF, and ZF.
Source code. CMP {register/memory},{register/memory/immediate}
Object code. Three formats:
• Reg/mem with reg: |001110dw|modregr/m|
• Immed to accumulator: |0011110w|---data---|data if w=1|
• Immed to reg/mem: |100000sw|mod111r/m|---data---|data if sw=01|
CMPS/CMPSB/CMPSW/CMPSD: Compare String
Operation. Compares strings of any length in memory. A REPn prefix normally precedes
these instructions, along with a maximum value in the CX. CMPSB compares bytes,
CMPSW compares words, and CMPSD (80386 and later) compares doublewords. The
DS:SI registers address the first operand, and the ES:DI registers address the second. If the
DF flag is 0, the operation compares from left to right and increments the SI and DI; if the
DF is 1, it compares from right to left and decrements the SI and DI. REPn decrements the
CX by 1 for each repetition. The operation terminates when the compared value is found
(REPNE), when it is not found (REPE), or when the CX is decremented to 0; the DI and SI
are advanced past the byte that caused termination. The last compare sets/clears the flags,
not the contents of the CX.
Flags. Affects AF, CF, OF, PF, SF, and ZF.
Source code. [REPnn] CMPSB/CMPSW/CMPSD (no operand)
Object code. 1010011w
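A minimal sketch of a string compare follows. It assumes that DS and ES both address the data segment; the names NAME1, NAME2, and H20 and the use of BH as an indicator are hypothetical:

```asm
NAME1   DB      'ASSEMBLERS'    ;Hypothetical 10-byte strings
NAME2   DB      'ASSEMBLERS'
        ;The following instructions belong in the code segment:
        MOV     BH,00           ;Assume the strings are unequal
        CLD                     ;DF = 0: compare left to right
        MOV     CX,10           ;Length of each string
        LEA     SI,NAME1        ;DS:SI addresses the first operand
        LEA     DI,NAME2        ;ES:DI addresses the second operand
        REPE    CMPSB           ;Repeat while bytes are equal and CX is not 0
        JNE     H20             ;ZF = 0: a mismatch ended the compare
        MOV     BH,01           ;All 10 bytes matched
H20:
```

The test after the compare relies on the zero flag set by the last CMPSB, not on the contents of the CX.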
DEC: Decrement by 1
Operation. Subtracts 1 from a byte, word, or doubleword (80386 and later) in a register
or memory. (See also INC.)
Flags. Affects AF, OF, PF, SF, and ZF.
Source code. DEC {register/memory}
Object code. Two formats:
• Register: |01001reg|
• Reg/memory: |1111111w|mod001r/m|
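DIV: Unsigned Divide
Operation. Divides an unsigned dividend by an unsigned divisor. The operands are as follows:
Size    Dividend (Operand 1)  Divisor (Operand 2)  Quotient  Remainder  Example
16-bit  AX                    8-bit reg/memory     AL        AH         DIV BH
32-bit  DX:AX                 16-bit reg/memory    AX        DX         DIV CX
64-bit  EDX:EAX               32-bit reg/memory    EAX       EDX        DIV ECX
Flags. Affects AF, CF, OF, PF, SF, and ZF. (All undefined.)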
Source code. DIV {register/memory}
Object code. |1111011w|mod110r/m|
ESC: Escape
Operation. Facilitates the use of coprocessors such as the 80x87 to perform special opera-
tions. ESC provides the coprocessor with an instruction and operand for execution. Note
that as of version 6.1, MASM no longer supports ESC; instead, it generates the full required
object code for coprocessor instructions.
Flags. Affects none.
Source code. ESC immediate,{register/memory}
Object code. |11011xxx|modxxxr/m| (x-bits are not important)
HLT: Enter Halt State
Operation. Causes the processor to enter a halt state while waiting for an interrupt. HLT
terminates with the CS and IP registers pointing to the instruction following the HLT. When
an interrupt occurs, the processor pushes the CS and IP onto the stack and executes the in-
terrupt routine. On return, an IRET instruction pops the stack, and processing resumes fol-
lowing the original HLT.
Flags. Affects none.
Source code. HLT (no operand)
Object code. 11110100
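IDIV: Signed (Integer) Divide
Size    Dividend (Operand 1)  Divisor (Operand 2)  Quotient  Remainder  Example
16-bit  AX                    8-bit reg/memory     AL        AH         IDIV BH
32-bit  DX:AX                 16-bit reg/memory    AX        DX         IDIV CX
64-bit  EDX:EAX               32-bit reg/memory    EAX       EDX        IDIV ECX
IMUL: Signed (Integer) Multiply
Size    Multiplicand (Operand 1)  Multiplier (Operand 2)  Product   Example
8-bit   AL                        8-bit reg/memory        AX        IMUL BL
16-bit  AX                        16-bit reg/memory       DX:AX     IMUL BX
32-bit  EAX                       32-bit reg/memory       EDX:EAX   IMUL ECX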
Flags. Affects CF and OF. (AF, PF, SF, and ZF are undefined.)
Source code. IMUL {register/memory} (all processors)
Object code. |1111011w|mod101r/m| (first format)
INC: Increment by 1
Operation. Increments by 1 a byte, word, or doubleword (80386 and later) in a register or
memory, coded, for example, as INC CX. (See also DEC.)
Flags. Affects AF, OF, PF, SF, and ZF.
Source code. INC {register/memory}
Object code. Two formats:
• Register:   |01000reg|
• Reg/memory: |1111111w|mod000r/m|
INT: Interrupt
Operation. Interrupts processing and transfers control to one of the 256 interrupt (vector)
addresses beginning at segment 0, offset 0. INT performs the following: (1) pushes the flags
onto the stack and resets the IF and TF flags; (2) pushes the CS onto the stack and places
the high-order word of the interrupt address in the CS; and (3) pushes the IP onto the stack
and fills the IP with the low-order word of the interrupt address. For the 80386 and later,
INT pushes a 16-bit IP for 16-bit segments and a 32-bit IP for 32-bit segments. IRET re-
turns from the interrupt routine.
Flags. Clears IF and TF.
Source code. INT number
Object code. |1100110v|--type--| (if v = 0 type is 3)
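As a rough illustration of the vector lookup just described, the following Python sketch models how the CS and IP values could be fetched for an interrupt number. It assumes that each vector occupies four bytes (the low-order word for the IP first, then the high-order word for the CS); the function name read_interrupt_vector and the sample table contents are hypothetical.

```python
def read_interrupt_vector(memory, number):
    """Return (cs, ip) for an interrupt number, modeling the vector table."""
    if not 0 <= number <= 0xFF:
        raise ValueError("interrupt number must be in the range 0-255")
    base = number * 4  # assumption: four bytes per vector, table at 0000:0000
    ip = int.from_bytes(memory[base:base + 2], "little")      # low-order word
    cs = int.from_bytes(memory[base + 2:base + 4], "little")  # high-order word
    return cs, ip


# Hypothetical table in which vector 21H points to 1234:5678.
table = bytearray(1024)
table[0x21 * 4:0x21 * 4 + 4] = bytes([0x78, 0x56, 0x34, 0x12])
cs, ip = read_interrupt_vector(table, 0x21)
print(f"{cs:04X}:{ip:04X}")
```

The sketch models only the address lookup; it does not model pushing the flags, CS, and IP onto the stack.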
Operation. Jumps if an operation set the sign to negative. If the SF flag is 1 (negative), JS
adds the operand offset to the IP and performs the jump. The jump must be short (-128 to
127 bytes), except for the 80386 and later, on which it may be near. (See also JNS.)
Flags. Affects none.
Source code. JS label
Object code. |01111000|--disp--|
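The way a short JS arrives at its target can be sketched in Python. This is a simplified model, assuming that the IP already points to the instruction following the jump and that the displacement is a single signed byte; the function name js_target is hypothetical.

```python
def js_target(ip_after, disp, sf):
    """Return the next IP for a short JS, given the displacement byte and SF."""
    if sf != 1:
        return ip_after & 0xFFFF                     # SF clear: no jump
    signed = disp - 0x100 if disp & 0x80 else disp   # sign-extend the byte
    return (ip_after + signed) & 0xFFFF              # 16-bit IP wraps around


# With SF = 1, a displacement of FBH (-5) jumps back from 0105H to 0100H.
print(f"{js_target(0x0105, 0xFB, 1):04X}")
```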
Operation. Prevents 80x87 or other coprocessors from changing a data item at the same
time as the processor. LOCK is a one-byte prefix that you may code immediately before
any instruction. The operation sends a signal to the coprocessor to prevent it from using the
data until the next instruction is completed.
Flags. Affects none.
Source code. LOCK instruction
Object code. 11110000
Operation. Transfers data between two registers or between a register and memory, and
transfers immediate data to a register or memory. The referenced data defines the number
of bytes (1, 2, or 4) moved; the operands must agree in size. MOV cannot transfer between
two memory locations (use MOVS), from immediate data to a segment register, or from a
segment register to a segment register. (See also MOVSX/MOVZX.)
Flags. Affects none.
Source code. MOV {register/memory},{register/memory/immediate}
Object code. Seven formats:
• Reg/mem to/from reg: |100010dw|modregr/m|
• Immed to reg/mem:    |1100011w|mod000r/m|---data---|data if w=1|
• Immed to register:   |1011wreg|---data--|data if w=1|
• Mem to accumulator:  |1010000w|addr-low|addr-high|
• Accumulator to mem:  |1010001w|addr-low|addr-high|
• Reg/mem to seg reg:  |10001110|mod0sgr/m| (sg = seg reg)
• Seg reg to reg/mem:  |10001100|mod0sgr/m| (sg = seg reg)
tion moves data from right to left and decrements the DI and SI. REP decrements the CX
by 1 for each repetition. The operation terminates when the CX is decremented to 0; the DI
and SI are advanced past the last byte moved.
Flags. Affects none.
Source code. [REP] MOVSB/MOVSW/MOVSD (no operand)
Object code. 1010010w
Multiplicand Multiplier
Size (Operand 1) (Operand 2) Product Example
Flags. Affects CF and OF. (AF, PF, SF, and ZF are undefined.)
Source code. MUL {register/memory}
Object code. |1111011w|mod100r/m|
NEG: Negate
Operation. Reverses a binary value from positive to negative and from negative to positive.
NEG provides the two’s complement of the specified operand by subtracting the operand
from zero, which is equivalent to reversing the bits and adding 1. Operands may be a byte,
word, or doubleword (80386 and later) in a register or memory. (See also NOT.)
Flags. Affects AF, CF, OF, PF, SF, and ZF.
Source code. NEG {register/memory}
Object code. |1111011w|mod011r/m|
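A small Python model may help to show that negation is a subtraction from zero and that it gives the same result as reversing the bits and adding 1. The function names and the bits parameter are hypothetical, and the model ignores the flags.

```python
def neg(value, bits=16):
    """Model NEG: subtract the operand from zero, keeping only 'bits' bits."""
    mask = (1 << bits) - 1
    return (0 - value) & mask


def neg_by_inversion(value, bits=16):
    """Same result the long way: reverse every bit, then add 1."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask


# A byte containing 01H becomes FFH (-1); negating again restores 01H.
print(f"{neg(0x01, 8):02X} {neg(neg(0x01, 8), 8):02X}")
```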
NOP: No Operation
Operation. Used to delete or insert machine code or to delay execution for purposes of tim-
ing. NOP simply performs a null operation by executing XCHG AX,AX.
OR: Logical OR
Operation. Performs a logical OR operation on bits of two operands. Both operands are
bytes, words, or doublewords (80386 and later), which OR matches bit for bit. If either
matched bit is 1, the bit in the first operand is set to 1; otherwise the bit is unchanged. (See
also AND and XOR.)
Flags. Affects CF (0), OF (0), PF, SF, and ZF. (AF is undefined.)
Source code. OR {register/memory},{register/memory/immediate}
Object code. Three formats:
• Reg/mem with register: |000010dw|modregr/m|
• Immed to accumulator:  |0000110w|---data--|data if w=1|
• Immed to reg/mem:      |100000sw|mod001r/m|---data----|data if w=1|
• Register:    |01011reg|
• Segment reg: |000sg111| (sg implies segment reg)
Operation. Pushes a word or doubleword (80386 and later) onto the stack for later use. The
SP register points to the current (double)word at the top of the stack. PUSH decrements
the SP by 2 or 4 and transfers a (double)word from the specified operand to the new top
of the stack. The source may be a general register, segment register, or memory. (See also
POP and PUSHF.)
Flags. Affects none.
Source code. PUSH {register/memory} (all processors)
PUSH immediate (80286 and later)
Object code. Three formats:
• Register:    |01010reg|
• Segment reg: |000sg110| (sg implies segment reg)
• Reg/memory:  |11111111|mod110r/m|
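The effect of PUSH on the SP can be modeled in a few lines of Python. This is only a sketch for 16-bit words, assuming a small stack segment held in a bytearray; the class name Stack16 and its methods are hypothetical.

```python
class Stack16:
    """Minimal model of a word stack that grows downward in memory."""

    def __init__(self, size=0x100):
        self.memory = bytearray(size)
        self.sp = size                      # empty stack: SP just past the end

    def push(self, word):
        self.sp -= 2                        # PUSH first decrements the SP by 2
        self.memory[self.sp:self.sp + 2] = (word & 0xFFFF).to_bytes(2, "little")

    def pop(self):
        word = int.from_bytes(self.memory[self.sp:self.sp + 2], "little")
        self.sp += 2                        # POP increments the SP by 2
        return word


stack = Stack16()
stack.push(0x1234)
print(f"SP={stack.sp:04X}")
```

The model does not check for stack overflow or underflow.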
Operation. Typically used in multiword binary subtraction to carry an overflowed 1-bit into
the next stage of arithmetic. SBB first subtracts the contents of the CF from the first operand
and then subtracts the second operand from the first, just like SUB. (See also ADC.)
Flags. Affects AF, CF, OF, PF, SF, and ZF.
Source code. SBB {register/memory},{register/memory/immediate}
Object code. Three formats:
• Reg/mem with register:  |000110dw|modregr/m|
• Immed from accumulator: |0001110w|---data--|data if w=1|
• Immed from reg/mem:     |100000sw|mod011r/m|---data---|data if sw=01|
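To show how SBB carries a borrow into the next stage of a multiword subtraction, here is a hedged Python model of a 32-bit subtraction done as two 16-bit steps. The function names sub16 and sub32 are hypothetical, and only the CF is modeled.

```python
def sub16(a, b, borrow_in=0):
    """Model SUB (borrow_in=0) or SBB (borrow_in=CF) on 16-bit words."""
    result = a - b - borrow_in
    carry = 1 if result < 0 else 0          # CF is set when a borrow is needed
    return result & 0xFFFF, carry


def sub32(a_hi, a_lo, b_hi, b_lo):
    """Subtract two doublewords held as (high word, low word) pairs."""
    lo, cf = sub16(a_lo, b_lo)              # SUB on the low-order words
    hi, cf = sub16(a_hi, b_hi, cf)          # SBB on the high-order words
    return hi, lo, cf


# 00010000H - 00000001H = 0000FFFFH
hi, lo, cf = sub32(0x0001, 0x0000, 0x0000, 0x0001)
print(f"{hi:04X}{lo:04X} CF={cf}")
```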
If a tested condition is true, the operation sets the byte operand to 1, otherwise to 0. An
example is
CMP AX, BX ;Compare contents of AX to BX
Operation. Tests a field for a specific bit configuration, as AND does, but does not change
the destination operand. Both operands are bytes, words, or doublewords (80386 and later)
in a register or memory; the second operand may be immediate. TEST uses AND logic to
set flags, which you may test with JE or JNE.
Flags. Clears CF and OF and affects PF, SF, and ZF. (AF is undefined.)
Source code. TEST {register/memory},{register/memory/immediate}
Object code. Three formats:
Operation. Allows the main processor to remain in a wait state until an external inter-
rupt occurs, in order to synchronize it with a coprocessor. The main processor waits until
the coprocessor finishes executing and resumes processing on receiving a signal in the
TEST pin.
Flags. Affects none.
Source code. WAIT (no operand)
Object code. 10011011
XCHG: Exchange
Operation. Exchanges data between two registers (as XCHG AH,BL) or between a regis-
ter and memory (as XCHG CX,word).
Flags. Affects none.
Source code. XCHG {register/memory},{register/memory}
Object code. Two formats:
• Reg with accumulator: |10010reg|
• Reg/mem with reg:     |1000011w|mod reg r/m|
XLAT/XLATB: Translate
Operation. Translates bytes into a different format, such as ASCII to EBCDIC. You define
a table, load its address in the BX, and then load the AL with a value that is to be translated.
The operation uses the AL value as an offset into the table, selects the byte from the table,
and stores it in the AL. (XLATB is a synonym for XLAT.)
Flags. Affects none.
Source code. XLAT [AL] (AL operand is optional)
Object code. 11010111
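The table lookup that XLAT performs can be modeled as a simple index operation. The following Python sketch is an illustration only: the 256-byte table, which converts lowercase letters to uppercase, is a hypothetical example rather than one of the book's tables.

```python
def xlat(table, al):
    """Model XLAT: use the AL value as an offset into a 256-byte table."""
    return table[al & 0xFF]


# Hypothetical translation table: lowercase letters become uppercase,
# and every other byte translates to itself.
table = bytes(code - 0x20 if 0x61 <= code <= 0x7A else code
              for code in range(256))

al = xlat(table, ord("a"))
print(f"{al:02X}")
```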
XOR: Exclusive OR
Operation. Performs a logical exclusive OR on bits of two operands. Both operands are
bytes, words, or doublewords (80386 and later), which XOR matches bit for bit. If both
matched bits are the same, the bit in the first operand is cleared to 0; if the matched bits are
different the bit in the first operand is set to 1. (See also AND and OR.)
Flags. Affects CF (0), OF (0), PF, SF, and ZF. (AF is undefined.)
Source code. XOR {register/memory},{register/memory/immediate}
Object code. Three formats:
• Reg/mem with register: |001100dw|mod reg r/m|
• Immed to reg/mem:      |1000000w|mod 110 r/m|---data----|data if w=1|
• Immed to accumulator:  |0011010w|---data----|data if w=1|
APPENDIX A
This appendix provides the steps in converting between hexadecimal and decimal formats.
The first section shows how to convert hex A7B8 to decimal 42,936, and the second sec-
tion shows how to convert 42,936 back to hex A7B8.
Multiply by 16            x 16
                        42,928
Add next digit, 8          + 8
Decimal value           42,936
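The multiply-and-add procedure can be sketched in Python as follows. Starting with the leftmost hex digit, the running value is multiplied by 16 and the next digit is added; the function name hex_to_decimal is hypothetical.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_decimal(hex_digits):
    """Convert a hex string to an integer by repeated multiply-and-add."""
    value = 0
    for digit in hex_digits.upper():
        # Multiply the running value by 16, then add the next digit.
        value = value * 16 + HEX_DIGITS.index(digit)
    return value


print(hex_to_decimal("A7B8"))
```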
You can also use a conversion table. For A7B8H, think of the rightmost digit (8) as
position 1, the next digit to the left (B) as position 2, the next digit (7) as position 3, and the
leftmost digit (A) as position 4. Refer to Table A-1, and locate the value for each hex digit:
You can also use Table A-1 to convert decimal to hexadecimal. For decimal number
42,936, locate the number that is equal to or next smaller than it. Note the equivalent hex
number and its position in the table. Subtract the decimal value of that hex digit from
42,936, and locate the difference in the table. The procedure works as follows:
                                DECIMAL    HEX
Starting decimal value           42,936
Subtract next smaller number    -40,960   A000
Difference                        1,976
Subtract next smaller number     -1,792    700
Difference                          184
Subtract next smaller number       -176     B0
Difference                            8      8
Final hex number                          A7B8
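The subtraction procedure above can also be expressed in Python. In this sketch, the value of a digit in a given position is computed as the digit times a power of 16 instead of being looked up in a table; the function name decimal_to_hex and the limit of eight positions are assumptions.

```python
HEX_DIGITS = "0123456789ABCDEF"


def decimal_to_hex(value, positions=8):
    """Convert a non-negative integer to hex by subtracting positional values."""
    if not 0 <= value < 16 ** positions:
        raise ValueError("value does not fit in the given number of positions")
    digits = []
    for position in range(positions, 0, -1):
        weight = 16 ** (position - 1)   # value of hex digit 1 in this position
        digit = value // weight         # largest digit whose entry is <= value
        value -= digit * weight         # the difference carried to the next step
        digits.append(HEX_DIGITS[digit])
    return "".join(digits).lstrip("0") or "0"


print(decimal_to_hex(42936))
```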
Table A-1 Hexadecimal-Decimal Conversion
APPENDIX B
ASCII Character Codes
The term ASCII stands for “American Standard Code for Information Interchange.” Table
B-1 lists the representations of the entire 256 ASCII character codes (00H through FFH),
along with their hexadecimal representations. The categories are:
00-1FH   Control codes for screens, printers, and data transmission, that are intended
         to cause an action.
20-7FH   Character codes for numbers, letters, and punctuation. Note that 20H is the
         standard space or blank.
80-FFH   Extended ASCII codes, foreign characters, Greek and mathematic symbols, and
         graphic characters for drawing boxes.
Here are the control codes from 00H through 1FH; those in parentheses do not print:
HEX  CHARACTER            HEX  CHARACTER            HEX  CHARACTER
00   (Null)               01   Happy face           02   Happy face
03   Heart                04   Diamond              05   Club
06   Spade                07   (Beep)               08   (Back space)
09   (Tab)                0A   (Line feed)          0B   (Vertical tab)
0C   (Form Feed)          0D   (Return)             0E   (Shift out)
0F   (Shift in)           10   (Data line esc)      11   (Dev ctl 1)
12   (Dev ctl 2)          13   (Dev ctl 3)          14   (Dev ctl 4)
15   (Neg acknowledge)    16   (Synch idle)         17   (End tran block)
Table B-1 ASCII Character Codes
00     20     40 @   60 `   80 Ç   A0 á   C0 └   E0 α
01 ☺   21 !   41 A   61 a   81 ü   A1 í   C1 ┴   E1 ß
02 ☻   22 "   42 B   62 b   82 é   A2 ó   C2 ┬   E2 Γ
03 ♥   23 #   43 C   63 c   83 â   A3 ú   C3 ├   E3 π
04 ♦   24 $   44 D   64 d   84 ä   A4 ñ   C4 ─   E4 Σ
05 ♣   25 %   45 E   65 e   85 à   A5 Ñ   C5 ┼   E5 σ
06 ♠   26 &   46 F   66 f   86 å   A6 ª   C6 ╞   E6 µ
07 •   27 '   47 G   67 g   87 ç   A7 º   C7 ╟   E7 τ
08 ◘   28 (   48 H   68 h   88 ê   A8 ¿   C8 ╚   E8 Φ
09 ○   29 )   49 I   69 i   89 ë   A9 ⌐   C9 ╔   E9 Θ
0A ◙   2A *   4A J   6A j   8A è   AA ¬   CA ╩   EA Ω
0B ♂   2B +   4B K   6B k   8B ï   AB ½   CB ╦   EB δ
0C ♀   2C ,   4C L   6C l   8C î   AC ¼   CC ╠   EC ∞
0D ♪   2D -   4D M   6D m   8D ì   AD ¡   CD ═   ED φ
0E ♫   2E .   4E N   6E n   8E Ä   AE «   CE ╬   EE ε
0F ☼   2F /   4F O   6F o   8F Å   AF »   CF ╧   EF ∩
10 ►   30 0   50 P   70 p   90 É   B0 ░   D0 ╨   F0 ≡
11 ◄   31 1   51 Q   71 q   91 æ   B1 ▒   D1 ╤   F1 ±
12 ↕   32 2   52 R   72 r   92 Æ   B2 ▓   D2 ╥   F2 ≥
13 ‼   33 3   53 S   73 s   93 ô   B3 │   D3 ╙   F3 ≤
14 ¶   34 4   54 T   74 t   94 ö   B4 ┤   D4 ╘   F4 ⌠
15 §   35 5   55 U   75 u   95 ò   B5 ╡   D5 ╒   F5 ⌡
16 ▬   36 6   56 V   76 v   96 û   B6 ╢   D6 ╓   F6 ÷
17 ↨   37 7   57 W   77 w   97 ù   B7 ╖   D7 ╫   F7 ≈
18 ↑   38 8   58 X   78 x   98 ÿ   B8 ╕   D8 ╪   F8 °
19 ↓   39 9   59 Y   79 y   99 Ö   B9 ╣   D9 ┘   F9 ∙
1A →   3A :   5A Z   7A z   9A Ü   BA ║   DA ┌   FA ·
1B ←   3B ;   5B [   7B {   9B ¢   BB ╗   DB █   FB √
1C ∟   3C <   5C \   7C |   9C £   BC ╝   DC ▄   FC ⁿ
1D ↔   3D =   5D ]   7D }   9D ¥   BD ╜   DD ▌   FD ²
1E ▲   3E >   5E ^   7E ~   9E ₧   BE ╛   DE ▐   FE ■
1F ▼   3F ?   5F _   7F ⌂   9F ƒ   BF ┐   DF ▀   FF
APPENDIX C
Reserved Words
The assembler recognizes some words as having a specific meaning; you may use these
words only under prescribed conditions. Words that the assembler reserves may be classed
into four categories:
If used to define a data item, many of the reserved words that follow may confuse the as-
sembler or cause an assembly error.
Register Names
AH, AL, AX, BH, BL, BP, BX, CH, CL, CS, CX, DH, DI, DL, DS, DX, EAX, EBP,
EBX, ECX, EDI, EDX, EIP, ES, ESI, FS, GS, IP, SI, SP, SS
Symbolic Instructions
AAA, AAD, AAM, AAS, ADC, ADD, AND, ARPL, BOUND, BSF, BSR, BTn, CALL,
CBW, CDQ, CLC, CLD, CLI, CLTS, CMC, CMP, CMPSn, CWDn, DAA, DAS, DEC,
DIV, ENTER, ESC, HLT, IDIV, IMUL, IN, INC, INSw, INT, INTO, IRET, JA, JAE, JB,
JBE, JCXZ, JE, JECXZ, JG, JGE, JL, JLE, JMP, JNA, JNAE, JNB, JNBE, JNE, JNG,
JNGE, JNL, JNLE, JNO, JNP, JNS, JNZ, JO, JP, JPE, JPO, JS, JZ, LAHF, LAR, LDS,
LEA, LEAVE, LES, LFS, LGDT, LGS, LIDT, LLDT, LMSW, LOCK, LODSn, LOOP,
LOOPE, LOOPNE, LOOPNZ, LOOPZ, LSL, LSS, LTR, MOV, MOVSn, MOVSX,
MOVZX, MUL, NEG, NOP, NOT, OR, OUTn, POP, POPA, POPAD, POPF, POPFD,
PUSH, PUSHAD, PUSHF, PUSHFD, RCL, RCR, REN, REP, REPE, REPNE, REPNZ,
REPZ, RET, RETF, ROL, ROR, SAHF, SAL, SAR, SBB, SCASn, SETnn, SGDT, SHL,
SHLD, SHR, SHRD, SIDT, SLDT, SMSW, STC, STD, STI, STOSn, STR, SUB, TEST,
VERR, VERW, WAIT, XCHG, XLAT, XOR
Directives
Operators
AND, BYTE, COMMENT, CON, DUP, EQ, FAR, GE, GT, HIGH, LE, LENGTH, LINE,
LOW, LT, MASK, MOD, NE, NEAR, NOT, NOTHING, OFFSET, OR, PTR, SEG, SHL,
SHORT, SHR, SIZE, STACK, THIS, TYPE, WHILE, WIDTH, WORD, XOR
APPENDIX D
Assembler and Link Options
This appendix covers the rules for assembling, linking, generating cross-reference files, and
converting .EXE programs to .COM. The Microsoft assembler version is MASM, Bor-
land’s is TASM, and SLR System’s is OPTASM, all of which are similar. Since version
6.0, the Microsoft assembler uses the ML command, which can perform an assembly and
link in one command. Examples in this appendix use disk drive D; users of other drives can
substitute the appropriate letter and path.
ASSEMBLING A PROGRAM
You can use a command line to request an assembly, although MASM also provides for
prompts.
• Object provides for a generated .OBJ file. The drive or path and the filename may be
the same as or different from the source.
• Listing provides for a generated .LST file that contains the source and object
code. The drive or path and the filename may be the same as or different from
the source.
• Crossref provides for a generated file containing symbols for a cross-reference listing.
The extension is .CRF for MASM and .XRF for TASM. The drive or path and
the filename may be the same or different.
The following shortcut command allows for defaults for the object, listing, and cross-
reference files, all with the same name:
MASM D:filename,D:,D:,D:
MASM D:filename,D:,,D:
Cross-reference [NUL.CRF]:
Source filename identifies the name of the source file. Key in the drive or path (if it’s
not the default) and the name of the source file, without the extension .ASM.
Object filename provides for the object file. The prompt assumes the same file-
name, although you could change it. To get an object file on drive D, type D: and
press Enter.
Source listing provides for an assembler listing, although the prompt assumes that
you do not want one. To get a listing on drive D, type D: and press Enter.
Cross-reference provides for a cross-reference listing, although the prompt assumes
that you do not want one. To get one on drive D, type D: and press Enter.
For the last three prompts, just press Enter if you want to accept the default.
Assembler Options
Assembler options for MASM, TASM, and OPTASM include the following:
/A    Arrange source segments in alphabetic sequence.
/C    Create a cross-reference file.
/D    MASM: Produce listing files on both pass 1 and pass 2 to locate phase errors.
      For TASM, /Dsymbol means define a symbol.
/E    Accept 80x87 coprocessor instructions and generate a linkage to BASIC, C,
      or FORTRAN for emulated floating-point instructions.
/H    Display assembler options with a brief explanation. Enter /H (for help) with
      no filenames or other options.
/L    Create a normal listing file.
/ML   Make all names case sensitive.
/MU   Convert all names to uppercase.
/MX   Make public and external names case sensitive.
/N    Suppress generation of the symbol table.
/R    Provide real math coprocessor support.
/S    Leave source segments in original sequence.
/T    (Terse) Display diagnostics at the end of the assembly only if an error is
      encountered.
/V    (Verbose) At the end of the assembly, display the number of lines and
      symbols processed. (Not with OPTASM.)
/Wn   Set the level of warning messages: 0 = display only severe errors; 1 = display
      severe errors and serious warnings (the default); 2 = display severe errors,
      serious warnings, and advisory warnings.
/Z    Display source lines on the screen for errors.
/ZD   Include information on line numbers in the object file for CodeView,
      TurboDebugger, or SYMDEB.
/ZI   Include information on line numbers and symbolic information in the object
      file for CodeView, TurboDebugger, or SYMDEB.
You may request options in either prompt or command-line mode. For prompts, you
could code MASM/A/V [Enter], for example, and then key in the usual filename. Or you
may key in options in any prompt line—for example, as
The /A/V options tell the assembler to write segments in alphabetic sequence and to dis-
play additional diagnostics at the end of the assembly.
Turbo Assembler lets you assemble multiple files, each with its own options, in one com-
mand line. You can also use DOS wild cards (* and ?). To assemble all source programs
in the current directory, key in TASM *. To assemble all source programs named
PROG1.ASM, PROG2.ASM, and so on, key in TASM PROG?. You can key in groups (or
sets) of filenames, with each group separated by a semicolon. The following command as-
sembles PROGA and PROGB with the /C option and PROGC with the /A option:
TASM /C PROGA PROGB; /A PROGC
The assembler allows you to assemble any number of programs into one executable
module. One useful option is ML -?, which displays the complete command-line syntax
and options.
Tables
Following an assembler .LST listing are a segments and groups table and a symbols table.
Segment and Group Table. This table has the following heading:
Name Length Align Combine Class
The name column gives the names of all segments and groups, in alphabetic sequence. The
length column gives the size, in hex, of each segment. The align column gives the alignment
type, such as BYTE, WORD, or PARA. Combine lists the defined combine type, such as
STACK for a stack, NONE where no type is coded, PUBLIC for external definitions, or a
hex address for AT types. The class column lists the segment class names, as coded in the
SEGMENT statement.
The name column lists the names of all defined items, in alphabetic sequence. The type col-
umn gives the type, as follows:
The value column gives the hex offset from the beginning of a segment for names, labels,
and procedures. The attribute column lists a symbol’s attributes, including its segment and
length.
CROSS-REFERENCE FILE
A .CRF or .XRF file is used to produce a cross-reference listing of a program’s labels,
symbols, and variables. However, you have to use CREF for Microsoft or TCREF for Borland
to convert the listing to a sorted cross-reference file. You can key in CREF or TCREF with
a command line or use prompts.
CREF/TCREF xreffile,reffile
The command line contains references to the original cross-reference file (.CRF or .XRF)
and to a generated .REF file. The following example using CREF writes a cross-reference
file named ASMPROG.REF on drive D:
CREF D:ASMPROG,D:
Using Prompts
You can key in just CREF or TCREF with no command line. TCREF simply displays the
general format for the command and an explanation of its options, whereas CREF displays
these prompts:
Cref filename [.CRF]:
For the first prompt, key in the name of the file, without a .CRF extension. For the second
prompt, you can key in the drive and/or path only and accept the default file name.
LINKING A PROGRAM
Microsoft’s linker is LINK, and Borland’s is TLINK. LINK and TLINK accept a command
line to request linking; LINK also provides for prompts.
• Mapfile provides for generating a file with an extension .MAP that indicates the relative
location and the size of each segment and any errors that LINK has found. A
typical error is the failure to define a stack segment. Entering CON tells the linker to
display the map on the screen (instead of writing it on disk) so that you can view it
immediately for errors.
• Libraryfile provides for the libraries option.
To link more than one object file into an executable module, combine them in one
line like this:
• Object Modules asks for the name(s) of the object module(s) to be linked; it defaults
to .OBJ if you omit the extension.
• Run File requests the name of the file that is to execute and allows a default to the object
module filename. You just need to key in the drive and/or path.
• List File provides for the map file, although the default is NUL.MAP (that is, no map).
The reply CON tells the linker to display the map on the screen, a convenient choice.
• Libraries asks for the library option, which is outside the scope of this text.
For the last three prompts, just press Enter to accept the default. The following ex-
ample tells the linker to produce .EXE and .CON files:
Debugging Options
If you intend to use CodeView, TurboDebugger, or SYMDEB, use the assembler’s /ZI op-
tion for assembling. For linking, use DOS LINK’s /CO option, in either command-line or
prompt mode, or Turbo TLINK’s /V option:
CROSS-REFERENCE LISTING
The assembler generates an optional .CRF or .XRF file that you can use to produce a cross-
reference listing of a program’s labels, symbols, and variables. The program that performs
this function is CREF for Microsoft or TCREF for Borland. You can key in CREF or
TCREF with a command line or by means of prompts.
CREF/TCREF d:xreffile,d:reffile
• Xreffile identifies the cross-reference file generated by the assembler. The program
assumes the extension, so you need not enter it.
• Reffile provides for generating a .REF file. The drive, subdirectory, and filename may
be the same as or different from those of the source.
Use of a Prompt
You can key in TCREF or CREF with no command line, although they respond differently.
TCREF displays the general format for the command and an explanation of options,
whereas CREF displays prompts. Here are the CREF prompts to which you reply:
Cross-reference [.CRF]:
For the first prompt, key in the name of the .CRF file, such as D:EXASM1. For the second
prompt, you can key in drive number only and accept the default file name. This choice
causes CREF to write a cross-reference file named EXASM1.REF on drive D.
EXE2BIN OPTIONS
The DOS EXE2BIN program converts .EXE modules generated by MASM into .COM
modules, provided that the source program was originally coded according to .COM re-
quirements. Enter the following command:
The first operand is the name of the .EXE file, which you key in without an extension. The
second operand is the name of the .COM file; you may change the name, but be sure to code
a .COM extension. Delete the .OBJ and .EXE files
APPENDIX E
The DOS Debug Program
The DEBUG program on the DOS disk is useful for writing very small programs, for de-
bugging assembly language programs, and for examining the contents of a file or memory.
You may enter one of two commands to start DEBUG:
DOS loads DEBUG into memory, and DEBUG displays a hyphen (-) as a prompt.
The memory area for your program is known as a program segment. The CS, DS, ES, and
SS registers are initialized with the address of the program segment prefix (PSP), and your
work area begins at PSP + 100H.
A reference to a memory address may be in terms of a segment and offset, such
as DS:120, or an offset only, such as 120. You may also make direct references to
memory addresses, such as 40:417, where 40[0]H is the segment and 417H is the off-
set. DEBUG assumes that all numbers entered are hexadecimal, so you do not key in
the trailing H. The F1 and F3 keys work for DEBUG just as they do for DOS; that is,
F1 duplicates the previous command one key at a time, and F3 duplicates the entire
previous command. Also, DEBUG does not distinguish between uppercase and lower-
case letters.
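The relationship between a segment:offset reference and the actual memory location can be sketched in Python. The sketch assumes real-mode addressing, in which the segment value is shifted left four bits (the implied trailing zero digit) and the result wraps at 20 bits; the function name physical_address is hypothetical.

```python
def physical_address(segment, offset):
    """Combine a segment and an offset into a real-mode address."""
    # Shifting left 4 bits appends the implied hex 0 to the segment value.
    return ((segment << 4) + offset) & 0xFFFFF   # assumption: 20-bit wraparound


# The reference 40:417 means segment 40[0]H plus offset 417H.
print(f"{physical_address(0x40, 0x417):05X}")
```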
Following is a description of each DEBUG command, in alphabetic sequence.
A (Assemble). Translates assembly source statements into machine code. The operation
is especially useful for writing small assembly language programs and for examining
small segments of code. The default starting address for code is CS:0100H, and the
general format for the command is
A [address]
Since DEBUG sets the IP to 100H because of the size of the PSP, the statements begin at
100H. The last Enter key (that’s two in a row) tells DEBUG to end the program. You can
now use the U (unassemble) command to see the machine code and the T (trace) command
to execute it.
You may change any of the preceding instructions, provided that the length of the
new instruction is the same as that of the old one. For example, to change the ADD at 104H
to SUB, enter
A 104 [Enter]
When you reexecute the program, the IP is still incremented. Use the register (R) command
to reset it to 100H. Use Q to quit.
Note that you can use DB and DW to define data items.
C (Compare). Compares the contents of two blocks of memory. The default reg-
ister is the DS, and the general format is
C [range] [address]
You may code the command one of two ways: (1) a starting address (compare from), a
length, and a starting address (compare to); or (2) a starting address and an ending address
(compare from) and a starting address (compare to). These examples compare bytes be-
ginning at DS:050 to bytes beginning at DS:300:
D [address] or D [range]
You may specify a starting address or a starting address with a range. Omission of a range
or length causes a default to 80H. Examples of the D command are:
E (Enter). Enters data or machine instructions. The default register is the DS, and
the general format is
E address [list]
The operation allows two options: to replace bytes with those in a list or to provide se-
quential editing of bytes. Examples of the first option follow:
For the second option, key in the address that you want displayed:
The operation waits for your input. Enter one or more bytes of hex values, separated by a
space, beginning at DS:12CH. Character strings accept either single or double quotes.
F (Fill). Fills a range of memory locations with values in a list. The default regis-
ter is the DS. The general format is
F range list
These examples fill locations in memory beginning at DS:214H with bytes containing rep-
etitions of ‘SAM’:
G (Go). Executes a machine language program that you are debugging through to
a specified breakpoint. Be sure to examine the machine code listing for valid IP addresses,
because an invalid address may cause unpredictable results. Also, set break points only in
your own program, not in DOS or BIOS. The operation executes through interrupts and
pauses, if necessary, to wait for keyboard input. The default register is the CS. The general
format is
G [=address] address [address ...]
The entry =address provides an optional starting address. The other entries provide up to
10 break-point addresses. The following example tells DEBUG to execute through location
11A:
G 11A
H (Hexadecimal). Shows the sum and difference of two hex values, coded as H
value value. The maximum length is four hex digits. For example, H 14F 22 displays the
result 171 (sum) and 12D (difference).
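The arithmetic behind the H command is easy to check with a short Python sketch. It assumes that both results are kept to four hex digits, so that a negative difference wraps around; the function name debug_h is hypothetical.

```python
def debug_h(first, second):
    """Return the (sum, difference) of two hex values, limited to 16 bits."""
    total = (first + second) & 0xFFFF
    difference = (first - second) & 0xFFFF   # assumption: wraps to four digits
    return total, difference


total, difference = debug_h(0x14F, 0x22)
print(f"{total:04X} {difference:04X}")
```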
I (Input). Inputs and displays one byte from a port. Code this as I portaddress.
L (Load). Loads a file or disk sectors into memory. There are two general formats:
Use the address parameter to cause L to load beginning at a specific location. Omis-
sion of the address causes L to load at CS:100. To load a file, note that it should be
already named (see N):
• Address provides the memory location for loading the data. (The default is CS:100.)
• Drive identifies the disk drive, where 0 = A, 1 = B, etc.
• Start specifies the hex number of the first sector to load. (This is a relative number,
where cylinder 0, track 0, sector 1, is relative sector 0.)
• Number gives the hex number of consecutive sectors to load.
The following example loads beginning at CS:100 from drive 0 (A), starting at sec-
tor 20H for 15H sectors:
L 100 0 20 15
The L operation returns to the BX:CX the number of bytes loaded. For an .EXE file,
DEBUG ignores the address parameter (if any) and uses the load address in the .EXE
header. It also strips off the header; to preserve it, first rename the file with a different
extension.
M (Move). Moves (or copies) the contents of memory locations. The default reg-
ister is the DS, and the general format is
M range address
These examples copy the bytes beginning at DS:050H through 150H into the address be-
ginning at DS:400H:
M DS:50 L100 DS:400 Use a length
N (Name). Names a program or a file that you intend to read from or write onto
disk. Code the command as N filespec, such as
N D:SAM.COM
The operation sets the name at CS:80 in the PSP. The first byte at CS:80 contains the length
(0AH), followed by the space and the filespec. You may then use L (Load) or W (Write) to
read or write the file.
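The bytes that the N command stores at CS:80 can be modeled in Python. This sketch builds only the length byte, the space, and the filespec as described above; the function name name_buffer is hypothetical, and any bytes that follow the filespec in a real PSP are not modeled.

```python
def name_buffer(filespec):
    """Build the length byte, the leading space, and the filespec."""
    text = " " + filespec                   # the space precedes the filespec
    if len(text) > 0x7F:
        raise ValueError("filespec is too long for the area at offset 80H")
    return bytes([len(text)]) + text.encode("ascii")


buffer = name_buffer("D:SAM.COM")
print(f"{buffer[0]:02X}", buffer[1:].decode("ascii"))
```

For D:SAM.COM, the space and the nine characters of the filespec give a length of 0AH.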
where =address is an optional starting address and value is an optional number of instruc-
tions to proceed through. Omission of =address causes a default to the CS:IP register pair.
For example, if your trace of execution is at an INT 21H operation, just key in P to execute
through that operation.
Q (Quit). Exits DEBUG. The operation does not save files; use W for that purpose.
R (Register). Displays the contents of registers and the next instruction. The gen-
eral format is
R [registername]
S (Search). Searches memory for characters in a list. The default register is the DS,
and the general format is
S range list
If the characters are found, the operation delivers their addresses; otherwise it does not re-
spond. The following example searches for the word “VIRUS” beginning at DS:300 for
2000H bytes:
S 300 L 2000 “VIRUS”
This example searches from CS:100 through CS:400 for a byte containing 51H:
S CS:100 400 51
T (Trace). Executes a program in single-step mode. Note that you should normally
use P (Proceed) to trace through INT instructions. The default register is the CS:IP pair, and
the general format is
T [=address] [value]
The optional entry =address tells DEBUG where to begin the trace, and the optional value
gives the number of instructions to trace. Omission of the operands causes DEBUG to ex-
ecute the next instruction and to display the registers. Here are two examples:
U [address] or U [range]
The area specified should contain valid machine code, which the operation displays as sym-
bolic instructions. Here are three examples:
U 0100 Unassemble 32 bytes beginning at CS:100
Note that DEBUG does not properly translate some conditional jump instructions, al-
though they still execute correctly.
W (Write). Writes a file from DEBUG. The file should first be named if it wasn’t
already loaded. The default register is the CS, and the general format is
W [address [drive start-sector number-of-sectors] ]
Write program files only with a .COM extension, since W does not support the .EXE for-
mat. (To modify an .EXE program, you may change the extension temporarily.) The fol-
lowing example uses W with no operands and has to set the size of the file in the BX:CX
pair (first ensure that the BX is zero):
R CX Request CX register
If you modify a file and make no change to its length or name, DEBUG can still cor-
rectly write the file back to its original disk location. You may also write the file directly to
disk sectors, although this practice requires considerable care.
See the DOS manual for these commands:
In the following lists, keys are grouped rather arbitrarily into categories. For each category,
the columns show the format for a normal key (not combined with another key) and the for-
mats when the key is combined with the Shift, Ctrl, and Alt keys. Under the columns headed
“Normal,” “Shift,” “Ctrl,” and “Alt” are two hex bytes as they appear when a keyboard operation
delivers them to the AH and AL registers. For example, pressing the letter “a” in the normal
format delivers 1EH in the AH for the scan code and 61H in the AL for the ASCII character.
When shifted to uppercase (“A”), the letter delivers 1EH and 41H, respectively. Scan
codes 85H and higher are for the enhanced 101-key keyboard.
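As a minimal sketch of how a program might receive these two bytes, the following assumes INT 16H function 00H (BIOS keyboard read), which delivers the scan code in the AH and the ASCII character in the AL. The label A30 and the decision to echo the character with INT 21H function 02H are illustrative choices, not part of the tables.

```asm
        .MODEL SMALL
        .STACK 64
        .CODE
MAIN    PROC FAR
        MOV  AH,00H         ;Request BIOS keyboard read
        INT  16H            ;Scan code returns in AH, ASCII character in AL
        CMP  AL,00H         ;ASCII 00 means a function key or Alt combination
        JE   A30            ;  yes, AH alone identifies the key; bypass display
        MOV  DL,AL          ;Ordinary character:
        MOV  AH,02H         ;  display it with DOS function 02H
        INT  21H
A30:
        MOV  AX,4C00H       ;End processing
        INT  21H
MAIN    ENDP
        END  MAIN
```

Pressing “a” would reach the display path with 61H in the AL, whereas pressing F1 delivers 3BH in the AH and 00H in the AL and takes the bypass.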
Keyboard Scan Codes and ASCII Codes (Appendix F)
LETTERS           NORMAL   SHIFT    CTRL     ALT
k and K           25 6B    25 4B    25 0B    25 00
l and L           26 6C    26 4C    26 0C    26 00
m and M           32 6D    32 4D    32 0D    32 00
n and N           31 6E    31 4E    31 0E    31 00
o and O           18 6F    18 4F    18 0F    18 00
p and P           19 70    19 50    19 10    19 00
q and Q           10 71    10 51    10 11    10 00
r and R           13 72    13 52    13 12    13 00
s and S           1F 73    1F 53    1F 13    1F 00
t and T           14 74    14 54    14 14    14 00
u and U           16 75    16 55    16 15    16 00
v and V           2F 76    2F 56    2F 16    2F 00
w and W           11 77    11 57    11 17    11 00
x and X           2D 78    2D 58    2D 18    2D 00
y and Y           15 79    15 59    15 19    15 00
z and Z           2C 7A    2C 5A    2C 1A    2C 00
Spacebar          39 20    39 20    39 20    39 20
FUNCTION KEYS     NORMAL   SHIFT    CTRL     ALT
F1                3B 00    54 00    5E 00    68 00
F2                3C 00    55 00    5F 00    69 00
F3                3D 00    56 00    60 00    6A 00
F4                3E 00    57 00    61 00    6B 00
F5                3F 00    58 00    62 00    6C 00
F6                40 00    59 00    63 00    6D 00
F7                41 00    5A 00    64 00    6E 00
F8                42 00    5B 00    65 00    6F 00
F9                43 00    5C 00    66 00    70 00
F10               44 00    5D 00    67 00    71 00
F11               85 00    87 00    89 00    8B 00
F12               86 00    88 00    8A 00    8C 00
NUMERIC KEYPAD    NORMAL   SHIFT    CTRL     ALT
Ins and 0         52 00    52 30    92 00
End and 1         4F 00    4F 31    75 00    00 01
Dn Arrow and 2    50 00    50 32    91 00    00 02
PgDn and 3        51 00    51 33    76 00    00 03
Lt Arrow and 4    4B 00    4B 34    73 00    00 04
5 (keypad)        4C 00    4C 35    8F 00    00 05
Rt Arrow and 6    4D 00    4D 36    74 00    00 06
Home and 7        47 00    47 37    77 00    00 07
Up Arrow and 8    48 00    48 38    8D 00    00 08
PgUp and 9        49 00    49 39    84 00    00 09
+ (gray)          4E 2B    4E 2B    90 00    4E 00
- (gray)          4A 2D    4A 2D    8E 00    4A 00
Del and .         53 00    53 2E    93 00
* (gray)          37 2A    37 2A    96 00    37 00
TOP ROW           NORMAL   SHIFT    CTRL     ALT
` and ~           29 60    29 7E             29 00
1 and !           02 31    02 21             78 00
2 and @           03 32    03 40    03 00    79 00
3 and #           04 33    04 23             7A 00
4 and $           05 34    05 24             7B 00
5 and %           06 35    06 25             7C 00
6 and ^           07 36    07 5E    07 1E    7D 00
7 and &           08 37    08 26             7E 00
8 and *           09 38    09 2A             7F 00
9 and (           0A 39    0A 28             80 00
0 and )           0B 30    0B 29             81 00
- and _           0C 2D    0C 5F    0C 1F    82 00
= and +           0D 3D    0D 2B             83 00
Following are the duplicate keys for the enhanced keyboard (the first two entries are ASCII
characters, and the rest are cursor keys):
Control keys also have identifying scan codes, although BIOS doesn’t deliver them to the
keyboard buffer. Here are their scan codes:
CapsLock 3A
NumLock 45
ScrollLock 46
Shift (Left) 2A
Shift (Right) 36
Alt 38
Ctrl 1D
PrtScreen 37
Answers to Selected Questions
CHAPTER 1
1-1. (a) 0110; (c) 10110.
1-2. (a) 00100010; (c) 00100000.
1-3. (a) 11101010; (c) 11000100.
1-4. (a) 00111000; (c) 00000010.
1-5. (a) 51; (c) 5D.
1-6. (a) 23C8; (c) 8000.
1-7. (a) 13; (c) 59; (e) FFF.
1-8. (a) 01010000; (c) 00100011.
1-10. ROM (read-only memory) is permanent, performs startup procedures, and handles input/
output. RAM (random-access memory) is temporary and is the area where programs and
data reside when executing.
1-12. (a) A section of a program, up to 64K in size, containing code, data, or the stack.
1-13. (a) Stack, data, and code.
1-15. (a) AX, BX, CX, DX, DI, SI; (c) AX and DX; (e) flags.
1-17. (a) MOV CH,?25.
CHAPTER 2
CHAPTER 3
3-1. The commands are identified at the beginning of the chapter.
3-2. (a) D DS:264; (c) E DS:200 A8 B3 64.
. (a) B82946.
» E CS:101 54.
- (a) MOV AX, 3004
ADD AX, 3000
NOP
CHAPTER 4
4-3. Name (of a data item) and label (of an instruction).
4-4. (d) Invalid because it starts with a number; (e) valid only if it refers to the AX register.
4-6. (a) TITLE.
4-8. (a) Causes alignment of a segment on a boundary, such as a paragraph.
4-9. (a) Provides a section of related code, such as a subroutine.
4-10. (a) END; (c) ENDS.
4-11. The END directive tells the assembler that there are no more instructions to assemble;
instructions to cause control to return to the operating system are MOV AX,4C00H and INT 21H.
4-12. ASSUME SS:STKSEG,DS:DATSEG,CS:CDSEG.
4-15. (a) 4; (c) 10; (e) 1.
4-16. TITLE1 DB 'RGB Electronics'
4-17. (a) FLDA DD 73H
(c) FLDC DW ?
(e) FLDE DW 17, 19, 21, 26, 31
4-18. (a) ASCII 3238; (b) hex 1C.
4-19. (a) 28; (c) 3A732800.
CHAPTER 5
MUL BL ; by 22H
FIELDA DB 40H
FIELDB DB 22H
FIELDC DW ?
CHAPTER 6
6-2. (a) The first MOV moves immediate value 325AH; the second MOV moves the contents of
locations 325AH and 325BH into the AX.
6-4. Move to the CX the contents of the memory location addressed by the sum of the offset
addresses in the BX and the SI, plus 4 (technically, DS:[BX+SI+4]).
6-5. (a) The processor cannot move data directly from one memory location to another.
6-7. (a) MOV AX, 320
(c) ADD BX, 40H
(e) SHL FLDB,1 (or SAL)
6-8. Use XCHG.
6-9. Use LEA.
6-11. (a) Pushes the flags, CS, and IP onto the stack, clears the IF and TF flags, and stores the
interrupt address in the CS:IP.
CHAPTER 7
7-1. 64K.
7-4. It uses the high area of the .COM program or, if insufficient space there, uses the end of
memory.
7-5. (a) EXE2BIN SAMPLE SAMPLE.COM.
CHAPTER 8
8-1. (a) Within -128 and +127 bytes.
8-2. (a) Within -128 and +127 bytes. (b) The operand is a one-byte value allowing for 00H
through 7FH (0 through +127) and 80H through FFH (-128 through -1).
8-3. (a) 64B; (c) 5EA.
8-4. Here is one of many possible solutions:
        MOV  AX,00
        MOV  BX,01
        MOV  CX,12          ;Set 12 loops
        MOV  DX,00
B20:
        ADD  AX,BX          ;Generate the next number
        MOV  BX,DX
        MOV  DX,AX
        LOOP B20
8-5. (a) CMP DX,CX        (c) JCXZ address      (e) CMP BX,AX
         JA  address          or CMP CX,0           JLE or JNG
                                 JZ  address
8-6. (a) OF (1); (c) ZF (1); (e) DF (1).
8-8. The first (main) PROC must be FAR because DOS links to its address for execution. A NEAR
attribute means that the address is within this particular segment.
8-10. Three (one for each CALL).
8-11. (a) 1001 1010; (c) 1111 1011; (e) 0000 0000.
8-13. (a) 5CDCH; (c) CDC8H; (e) 3737H; (g) 72B9H.
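The one-byte operand described in answer 8-2 can be seen in the machine code of a short backward and a short forward jump. The following is only an illustrative sketch: the labels A20 and A30 and the loop count are arbitrary, and the machine code shown in the comments assumes that the assembler generates the two-byte short forms.

```asm
        .MODEL SMALL
        .STACK 64
        .CODE
MAIN    PROC FAR
        MOV  CX,03          ;Loop count
A20:
        DEC  CX             ;Machine code 49 (1 byte)
        JNZ  A20            ;Machine code 75 FD: operand FDH = -3 (backward)
        JMP  SHORT A30      ;Machine code EB 01: operand 01H = +1 (forward)
        NOP                 ;Machine code 90 (1 byte), bypassed
A30:
        MOV  AX,4C00H       ;End processing
        INT  21H
MAIN    ENDP
        END  MAIN
```

The processor adds the operand to the offset of the instruction that follows the jump, which is why the backward jump over three bytes (DEC plus the two-byte JNZ) is -3 and the forward jump over the one-byte NOP is +1.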
CHAPTER 9
9-1. (a) Row = 00 and column = 00.
9-3.    MOV  AX,060BH       ;Request
        MOV  BH,attribute   ;  clear
        MOV  CX,0C00H       ;  screen
        MOV  DX,164FH
        INT  10H
ACTLEN DB ?
DATEFLD DB 9 DUP(' ')
INT 21H
9-8. (a) 00.
CHAPTER 10
CHAPTER 11
CHAPTER 12
Then use INT 21H, function 09H, to display the variable DISPLAY.
CHAPTER 13
13-1. (a) 127 and 255.
13-3. (a) MOV AX, DATAY
ADD AX, DATAX ;Add DATAX
MOV DATAY,AX ; to DATAY
(b) See Figure 13-2 for multiword addition.
13-4. STC sets the carry flag. The sum is 0148H, plus 0237H, plus 1.
13-5. (a) MOV AX, DATAX
MUL DATAY ;Product is in the DX:AX
(c) See Figure 13-4 for multiplying a doubleword by a word.
13-7. (a) MOV AX, DATAX
MOV BL,25 ;Divide DATAX
DIV BL ; by 23
CHAPTER 14
14-1. (a) ADD generates 6CH, and AAA generates 0102H.
(c) SUB generates 02H, and AAS has no effect.
14-2.   LEA  SI,UNPAK           ;Initialize address
        MOV  CX,04              ;  and 4 loops
B20:
        OR   BYTE PTR [SI],30H  ;Insert ASCII 3
        INC  SI                 ;Increment for next byte
        LOOP B20                ;Loop 4 times
14-3. Use Figure 14-2 as a guide, but initialize the CX to 03.
14-4. Use Figure 14-3 as a guide, but initialize the CX to 03.
14-5. (a) Convert decimal 46,328 to binary:
                      Decimal     Hex
      8 x     1 =           8       8
      2 x    10 =          20      14
      3 x   100 =         300     12C
      6 x  1000 =        6000    1770
      4 x 10000 =       40000    9C40
      Total             46328    B4F8
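Answer 14-5(a) multiplies each digit by its power of ten and adds the products. An equivalent way to code the conversion works from left to right: multiply the running total by 10 and add the next digit. The following is a hedged sketch of that alternative, not the book's figure; the names ASCVAL and BINVAL are hypothetical, and the routine assumes an unsigned value that fits in one word.

```asm
        .MODEL SMALL
        .STACK 64
        .DATA
ASCVAL  DB   '46328'        ;ASCII decimal digits (hypothetical name)
BINVAL  DW   0              ;Binary result (hypothetical name)
        .CODE
MAIN    PROC FAR
        MOV  AX,@data       ;Initialize DS
        MOV  DS,AX
        MOV  CX,5           ;Number of ASCII digits
        LEA  SI,ASCVAL      ;Address of leftmost digit
        MOV  AX,0           ;Clear running total
        MOV  BX,10          ;Multiplier
A20:
        MUL  BX             ;DX:AX = running total x 10
        MOV  DL,[SI]        ;Get next ASCII digit
        AND  DX,000FH       ;Strip the ASCII 3 and clear the DH
        ADD  AX,DX          ;Add digit to running total
        INC  SI             ;Next digit to the right
        LOOP A20            ;Repeat for 5 digits
        MOV  BINVAL,AX      ;AX = B4F8H (decimal 46,328)
        MOV  AX,4C00H       ;End processing
        INT  21H
MAIN    ENDP
        END  MAIN
```

The running total passes through 4, 46, 463, 4632, and 46328, so no intermediate product exceeds a word.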
CHAPTER 15
15-2. TABLEX DW 50 DUP (' ').
15-3. (a) ITEMNO DB '06','10','14','21','24'
(c) ITPRICE DW 9395,8225,9067,8580, 1385
15-4. A possible organization is into the following procedures:
SUBROUTINE   PURPOSE
B10READ      Display prompt, accept item number.
C10SRCH      Search table, display message if invalid item.
D10MOVE      Extract description and price from table.
E10CONV      Convert quantity from ASCII to binary.
F10CALC      Calculate value (quantity x price).
G10CONV      Convert value from binary to ASCII.
K10DISP      Display description and value on screen.
15-5. The following routine copies the table. Refer to Figure 15-7 for sorting table entries.
CHAPTER 16
16-1. 512.
16-4. (a) A group of sectors (1, 2, 4, or 8) that DOS treats as a unit of storage space on a disk.
16-5. (a) 40 cylinders X 9 sectors X 2 sides X 512 bytes = 368,640.
16-7. (a) Side 0, track 0, sector 1.
16-8. In the directory, the first byte of filename is set to E5H.
16-11. (a) Positions 28-31 of the directory; (b) 0B4AH, stored as 4A0B.
16-12. (a) The first byte (media descriptor) contains F8H.
CHAPTER 17
CHAPTER 18
All the questions for this chapter are exercises involving the use of DEBUG.
CHAPTER 19
19-2. Most likely as a developer of disk utility programs.
19-3. (a) In the AH.
19-5. Use INT 13H and function 00H.
19-6. Use INT 13H and function 01H.
19-8.   MOV  AH,03H         ;Request write
        MOV  AL,03          ;3 sectors
        LEA  BX,OUTDSK      ;Output area
        MOV  CH,08          ;Track 08
        MOV  CL,01          ;Sector 01
        MOV  DH,00          ;Head #0
        MOV  DL,01          ;Drive B
        INT  13H
CHAPTER 20
20-1. (a) 09.
20-3. (a) MOV AH,05H ;Request print
INT 21H
(c) You could code a line feed (0AH) in front of the address. The solution is similar to part (b).
(e) Issue another form feed (0CH).
20-4. HEADNG DB 13, 10, 15, 'Title', 12
20-5. (a) In the AH.
20-7. The CX is not available for looping because the loop that prints the name uses the CX. You
could use the BX like this:
CHAPTER 21
21-1. (a) Unit of measure for mouse movement in increments of 1/200 of an inch.
21-2. All these functions are identified near the beginning of the chapter.
21-3. Note the effect of functions 01H and 02H on the flag.
21-6. Note that the figure reverses the parallel ports, LPT1 and LPT2.
CHAPTER 22
22-1. The introduction to this chapter gives three reasons.
22-2. The statements include MACRO and ENDM.
22-5. (a) SALL.
22-6. (a) MULTBY MACRO MULTPR,MULTCD
MOV AL,MULTCD
MUL MULTPR
ENDM
22-7. To include the macro in pass 1, code the following:
IF1
INCLUDE library-name
ENDIF
        CMP  DIVISOR,00     ;Zero divisor?
        JNZ  (bypass)       ;No, bypass
        CALL (error message routine)
CHAPTER 23
23-1. The introduction to this chapter gives reasons.
23-2. (a) PARA.
23-3. (a) NONE.
23-4. (a) 'code'.
23-6. (a) EXTRN SUBPRO:FAR
23-7. (a) PUBLIC QTY,VALUE,PRICE
23-8. Use Figure 23-6 as a guide.
23-9. Use Figure 23-8 as a guide for passing parameters. However, this question involves pushing
three variables onto the stack. The called program therefore has to access [BP+10] for the
third entry (PRICE) in the stack. You can define your own standard for returning PRICE
through the stack. Watch also for the pop value in the RET operand.
23-10. This program involves material from Chapters 9 (screen I/O), 13 (binary multiplication),
14 (conversion between ASCII and binary), and 23 (linkage to subprograms). Be careful
of the stack.
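The stack layout mentioned in answer 23-9 can be sketched in a single module as follows. This is one possible convention under stated assumptions, not the solution from Figure 23-8: the procedure name Q10MULT, the data values, and the choice to pass the address of VALUE as the third parameter are all hypothetical. After a far CALL and PUSH BP, the old BP is at [BP], the return IP at [BP+2], the return CS at [BP+4], and the parameters begin at [BP+6].

```asm
        .MODEL SMALL
        .STACK 64
        .DATA
QTY     DW   0140H          ;Illustrative values only
PRICE   DW   2500H
VALUE   DW   0,0            ;Doubleword product
        .CODE
MAIN    PROC FAR
        MOV  AX,@data       ;Initialize DS
        MOV  DS,AX
        PUSH PRICE          ;First push:  found at [BP+10]
        PUSH QTY            ;Second push: found at [BP+8]
        LEA  AX,VALUE
        PUSH AX             ;Third push:  found at [BP+6] (address of VALUE)
        CALL FAR PTR Q10MULT
        MOV  AX,4C00H       ;End processing
        INT  21H
MAIN    ENDP

Q10MULT PROC FAR
        PUSH BP             ;Save BP
        MOV  BP,SP          ;Address the stack frame
        MOV  AX,[BP+10]     ;PRICE, the third entry from the top
        MUL  WORD PTR [BP+8] ;DX:AX = PRICE x QTY
        MOV  BX,[BP+6]      ;Address of VALUE
        MOV  [BX],AX        ;Store low-order word
        MOV  [BX+2],DX      ;Store high-order word
        POP  BP             ;Restore BP
        RET  6              ;Far return; pop 3 words of parameters
Q10MULT ENDP
        END  MAIN
```

The pop value of 6 in the RET operand discards the three one-word parameters, so the caller's SP is the same after the call as it was before the first PUSH.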
CHAPTER 24
CHAPTER 25
25-1. The section on interrupts at the start of this chapter discusses these types.
25-2. The section on interrupts at the start of this chapter discusses these lines.
25-3. (a) FFFF[0]H.
25-5. At 40[0]H.
25-6. (a) Equipment status; (c) second byte of shift status.
25-7. (a) The addresses (in reverse-byte sequence) of COM1 and COM2.
25-8. (a) INT 00H.
CHAPTER 26
26-1. Interrupts 20H through 3FH.
26-2. (a) 03H; (c) 30H or 3306H.
26-3. (a) Printer output; (c) buffered keyboard input.
Index
581
582 Index
INT 10H Video display functions 11H determine if character 22H write FCB record randomly,
(cont.) present, 189 319
OEH write teletype, 164 12H return current shift status, 189 23H get FCB file size, 482
OFH get current video mode, 164 described/listed, 187, 477 24H set random FCB record field,
10H set palette registers, 177 INT 17H BIOS print functions 482
11H character generator, 164 OOH print a character, 375, 389 25H set interrupt table address,
12H select alternative screen 01H initialize printer port, 375 464, 482
routine, 164 02H get printer port status, 375 26H create new PSP, 483
13H display character string, 165 INT 18H ROM BASIC entry, 478 27H read disk block randomly,
1AH read/write display combina- INT 19H Bootstrap loader, 478 320
tion code, 177 INT 1AH Read and set time, 478 28H write disk block randomly,
1BH return functionality/state INT 1BH Get control on keyboard 320
information, 177 break, 478 29H parse filename, 341
described/listed, 137, 476 INT 20H Terminate program, 481 2AH get system date, 42, 483
INT 11H Equipment determination, INT 21H DOS functions 2BH set system date, 483
239, 470, 476 OOH terminate program, 482 2CH get system time, 483
INT 12H Memory size determina- 01H keyboard input with echo, 2DH set system time, 483
tion, 42, 470, 476 185 2EH set/reset disk verification,
INT 13H BIOS disk I/O functions 02H display character, 147 328
OOH reset disk system, 354 03H communications input, 482 30H get DOS version number, 483
01H read disk status, 354 04H communications output, 482 31H terminate but stay resident,
02H read sectors, 354 OSH printer output, 373 462
03H write sectors, 356 06H direct keyboard and display, 32H get drive parameter block,
04H verify sectors, 358 186 329
O5H format tracks, 359 07H direct keyboard input without 3300H get/check Ctrl+C state, 484
O8H get drive parameters, 359 echo, 186 3305H get startup drive, 484
O9H initialize drive, 360 08H keyboard input without echo, 3306H get DOS version number,
OAH read extended sector buffer, 186 439, 484
360 O9H display string, 139 34H get DOS busy flag, 466
OBH write extended sector buffer, OAH buffered keyboard input, 35H get interrupt table address,
360 141, 186 463
OCH seek cylinder, 360 OBH check keyboard status, 186 36H get free disk space, 329
ODH alternate disk reset, 360 OCH clear buffer and invoke input, 38H get/set country-dependent in-
OEH read sector buffer, 361 187 formation, 484
OFH write sector buffer, 361 ODH reset disk drive, 326 39H create subdirectory, 338
10H test for drive ready, 361 OEFH select default disk drive, 326 3AH remove subdirectory, 339
11H recalibrate hard drive, 361 OFH open FCB file, 318 3BH change current directory, 339
12H ROM diagnostics, 361 10H close FCB file, 318 3CH create file with handle, 299
13H drive diagnostics, 361 11H search for first matching disk 3DH open file with handle, 303
14H controller diagnostics, 361 entry, 482 3EH close file with handle, 300
15H get disk type, 361 12H search for next matching disk 3FH read file/device, 149, 304
16H change of diskette status, 361 entry, 482 40H write file/device with handle,
17H set diskette type, 361 13H delete FCB file, 482 148, 299, 365
18H set media type for format, 14H read FCB sequential record, 41H delete file from directory, 343
362, 362 319 42H move file pointer, 311
19H park disk heads, 362 15H write FCB sequential record, 43H check/change file attribute,
described/listed, 354, 476 317 343
status codes, 353 16H create FCB file, 317 44H I/O control for devices,
INT 14H Communications I/O, 476 19H determine default disk drive, 329, 485
INT 15H System services, 477 327 4400H get device information,
INT 16H keyboard input functions 1BH get information for default 329
OOH read a character, 187 drive, 327 4401H set device information, 330
01H determine if character 1CH get information for specific 4404H read control data from
present, 188 drive, 327 drive, 330
Q2H return current shift status, 1FH get default drive parameter 4405H write control data to drive,
188 block, 328 330
O5H keyboard write, 188 21H read FCB record randomly, 4406H check input status, 330
10H read a character, 188 319 4407H check output status, 330
Index 587
4408H determine if removable 67H set maximum handle count, services table, 20, 463, 470, 474
media, 331 441 Intersegment call, 414
440DH minor code 41H write disk 6CH extended open file, 486 Interval timer, 390
sector, 331 described/listed, 42, 137, 481 INTO instruction, 526
440DH minor code 42H format INT 22H Terminate address, 481 Intrasegment call, 413
track, 332 INT 23H Ctrl/break address, 481 Invalid file handle, 148, 150
440DH minor code 46H set media INT 24H Critical error handler, 481 IO.SYS, 20, 286, 438, 480
I) ,332 INT 25H Absolute disk read, IOCTL (I/O control for devices), 329
440DH minor code 60H get de- 321, 481 EP register, 13, 14, 22, 25, 35, 115,
vice parameters, 333 INT 26H Absolute disk write, 124, 449
440DH minor code 61H read disk 321, 481 IRET instruction, 474, 526
sector, 334 INT 27H Terminate but stay resi- IRETD instruction, 526
440DH minor code 66H get media dent, 481 IRP directive, 403, 404
ID, 334 INT 2FH Multiplex interrupt, 481 IRPC directive, 403, 404
440DH minor code 68H sense me- INT 33H Mouse functions
dia type, 334 OOH initialize mouse, 379 JA/JAE instructions, 119, 526
45H duplicate a file handle, 344 01H display mouse pointer, 380 JB/JBE instructions, 119, 527
46H force duplicate of handle, 344 02H conceal mouse pointer, 380 JC instruction, 120, 131, 527
47H get current directory, 339 03H get button status and pointer JCXZ/JECXZ instructions, 527
48H allocate memory block, 453 location, 380 JE instruction, 119, 120, 527
49H free allocated memory 04H set pointer location, 380 JG/JGE instructions, 120, 527
block, 454 OSH get button-press informa- JL/JLE instructions, 120, 528
4AH set allocated memory block tion, 381 JMP instruction, 113, 114, 528
size, 454 06H get button-release informa- JNA/JNAE instructions, 119, 527
4BH load/execute a program, 454 tion, 382 JNB/JNBE instructions, 119, 526
4CH terminate program, 57, 109 07H set horizontal limits for JNC instruction, 120, 528
4DH retrieve return code of a sub- pointer, 382 JNE instruction, 101, 119, 120, 529
process, 456 O8H set vertical limits for pointer, JNG/JNGE instructions, 120, 528
4EH find first matching directory 383 JNL/JNLE instructions, 120, 528
entry, 344 O9H set graphics pointer type, 378 JNO/INP/INS instructions, 120, 529
4FH find next matching directory OAH set text pointer type, 378 JNZ instruction, 119, 120, 529
entry, 345 OBH read mouse-motion counters, JO/JPO instructions, 120, 529
50H set address of PSP, 485 383 JP/JPE instructions, 120, 530
51H get address of PSP, 443, 485 OCH install interrupt handler for JS instruction, 120, 530
52H get address of DOS list, 445 mouse events, 383 Jump
54H get verify state, 335 10H set pointer exclusion area, based on signed data, 120
56H rename a file, 346 384, 384 based on unsigned data, 119
57H get/set file date and time, 346 1AH set mouse sensitivity, 384 instruction, 113, 528
5800H get memory allocation 1BH get mouse sensitivity, 385 address, 115
strategy, 447 1DH select display page for tables, 131
5801H set memory allocation pointer, 385 JZ instruction, 119, 120, 527
strategy, 448 1EH get display page for pointer,
5802H get upper memory link, 385 Keyboard, 141, 182
448 24H get mouse information, 385 buffer, 185, 188, 194, 196
5803H set upper memory link, 448 INT instruction, 41, 101, 136, 525 data area 1, 471
59H get extended error code, 335 Intensity (screen), 157 data area 2, 473
5AH create a temporary file, 347 Internal data area 3, 473
5BH create a new file, 347 DOS list, 445 input, 141, 149, 182, 185, 186
5CH lock/unlock file access, 486 DOS tables, 21 interrupt, 475
5DH set extended error, 486 interrupt, 475 LED flags, 473
5EH local area network services, memory. See Memory mode state, 473
486 Interrupt scan codes, 564
5FH local area network services, address, 463 shift status. See Shift status
486 DOS, 480 Kilobyte, 3
62H get address of PSP, 486 execution, 474
65H get extended country infor- flag (IF), 16, 101, 117, 540 L (Load) DEBUG command,
mation, 486 handling, 20 292, 560
66H get/set global code page, 486 instruction, 525 Label, 113
588 Index
LABEL directive, 141, 501 LOOPD instruction, 531 Mod bits, 516
LAHF instruction, 530 LOOPE/LOOPZ instructions, Mode (screen), 155, 164, 175
.LALL directive, 396 116, 531 Mode byte, 515
LARGE memory model, 59, 502 LOOPNE/LOOPNZ instructions, MODE command, 75
Last fit, 447 116, 532 -MODEL directive, 502
LDS instruction, 530 LOOPW instruction, 531 Model ID, 32
LEA instruction, 99, 100, 530 LOW operator, 490 Modify allocated memory block, 454
Least significant byte, 10 Low portion of a register, 14 Monochrome display, 137, 157
Left shift key, 129 Low-level BIOS, 136 Monochrome display adapter
LENGTH operator, 279, 490, 493 Low-level language, 49 (MDA), 154
LES instruction, 530 Lowest level of disk processing, 294 Most significant byte, 10
LFS instruction, 530 LOWWORD operator, 490 Mouse
LGS instruction, 530 LPT 1/LPT2 ports, 390 driver, 379
Libraries option, 81 LSS instruction, 530 features, 377
Light pen, 175 .LST file, 74, 550 pointer, 378, 380, 382, 383
Line feed character, 140, 146, sensitivity, 384
148, 365 M (Move) DEBUG command, 561 MOV instruction, 26, 32, 35, 77,
Line spacing, 374 Machine 95, 532
Link code, 24, 33,37, 515 Move file pointer, 311
C and Assembler, 431 language example, 32 Move string. See MOVS instruction
.COM program, 108 language instruction, 49 Move-and-fill instructions, 96
.EXE program, 73 Macro MOVS instruction, 200, 202,
map, 82, 451 comments, 396 211,532
Pascal and Assembler, 429 definition, 394 MOVSB/MOVSW instructions,
program, 49, 81 expansion, 394 203;552
to subprograms, 411 library, 401 MOVSD instruction, 532
with a command line, 553 statements, 393 MOVSX/MOVZX instructions,
with prompts, 554 writing, 393 96, 533
LINK command, 81, 553 Main program, 412, 458 MSDOS.SYS, 20, 286, 438, 480
Linked list, 275 .MAP file, 81, 82 MUL instruction, 224, 226, 533
LIST directive, 501 Map of memory, 9, 21, 438 Multiplex interrupt, 481
Listing directives, 52, 494 MASK operator, 507 Multiplication
Load MASM command, 73, 549 ASCII data, 246
.COM program, 448 MCGA (video adapter), 154, 157 binary data, 224, 226, 524, 533
-EXE program, 449 MDA (video adapter), 154, 157, 180 by shifting, 231
module, 449 Media Multiword
or execute a program func- block, 334, 335 arithmetic, 220
tion, 454 descriptor byte, 287, 288, 292 multiplication, 226
overlay, 456 ID, 332
program, 455 type, 362 N (Name) DEBUG command,
program for execution, 19, 22, 73 MEDIUM model, 59, 502 29, 561
segment register, 530 Megabyte, 3 Name (of data item), 50, 61
string. See LODS instruction Memory, 1, 9, 10, 25 Near
LOCAL directive, 399 allocation strategy, 447 address, 102, 113, 114
Local area network services, 486 blocks, 444 call, 121, 413
Location counter, 502 control record, 444 procedure, 414
LOCK instruction, 531 management, 20, 437 RET, 414
Lock/unlock file access, 486 model, 59, 502 NEAR operator, 54, 413
LODS instruction, 200, 204, 531 references, 26 NEG instruction, 143, 237, 533
LODSB instruction, 204, 531 size data area, 471 Negative numbers, 4, 65, 258
LODSD instruction, 531 size determination, 31, 476 Nested segment, 510
LODSW instruction, 205, 531 Menu, 191 NMI line, 474
Logical operator, 490 Mickey, 378 NONE combine type, 413, 508
Logical record size, 316 Mickey count, 378, 383 Nonmaskable interrupt (NMI), 474
Long integer, 238 Mickey-to-pixel ratio, 380 NOP instruction, 36, 43, 533
Long real data format, 238 Microsoft assembler, 552 NOT instruction, 126, 534
Loop (example), 114 Microsoft C, 432 NOTHING operand (in ASSUME),
LOOP instruction, 113, 116, 531 ML command, 81 495
Index 589
Number of read-write heads, 287 in subprograms, 425, 430, 432 assembly diagnostics, 85
Numeric constant, 63, 64 Parity blinking, reverse video, and
Numeric data processor, 237 bit, 2 scrolling, 169
NumLock key, 184, 189 check of memory, 20, 470 calling a subprogram and over-
flag (PF), 16, 117, 120 lay, 459
O (Output) DEBUG command, Park disk heads, 362 changing lowercase to uppercase,
561 Parse filename, 341 t27
OBJ (object) file, 73, 81, 108, Pascal, 429 code segment defined as PUBLIC,
550, 553 Passing parameters, 425, 430, 432 420
Odd parity, 2 Path separator, 297 color graphics display, 179
OF (flag). See Overflow flag Pentium processor, 8 common data in subprograms, 424
Offset PF (flag). See Parity flag conversion of ASCII to EBCDIC,
in a file, 311 Phase error between passes, 80, 86 pA:
in a segment, 12, 25, 26, 93, 99, Physical drive number, 287 defining data in two programs,
102, 115 Pipeline structure, 9 426
OFFSET operator, 490 Pixel, 174, 177, 378, 380 direct table addressing, 263
Open a file, 294, 297, 303, 318 Pointer direct video display, 172, 208
Operand, 33, 51, 52, 92 entries in the FAT, 289, 293 displaying ASCII and hex, 274
Operating system. See DOS exclusion area, 384 displaying employee wages, 253
Operation, 51 registers, 14 displaying the directory, 341
Operators (listed), 487, 488 to acell, 275 DOS function to display ASCII
OR instruction, 125, 534 to the keyboard buffer, 471 characters, 140
ORG directive, 107, 502 POP instruction, 23, 123, 429, 534 execution of DIR from within a
%OUT directive, 503 Pop value, 429 program, 457
OUT instruction, 388, 534 POPA instruction, 24, 535 extended move operations, 100
Output POPF instruction, 24, 535 generating sound, 391
device, 148 Port, 388 linked list, 280
status, 330 Preparing a program for execution, linking C to Assembler, 435
Overflow linking Pascal to Assembler, 430
arithmetic, 218, 223 Print characters, 365, 373, 375 listing and suppression of macro
flag, 16, 118, 120, 218, 224 Print screen interrupt, 475 expansion, 398
from division, 235 Printer passing parameters, 428
Overlay, 456 control characters, 365, 373 printing with page overflow and
Overscan register, 177 port, 375, 389 headings, 367
status, 374 reading a disk file randomly, 312
P (Proceed) DEBUG command, PROC directive, 54, 77, 123, 504 reading disk sectors, 337
29, 561 Procedure, 54, 55, 121, 504 resident program, 465
Packed Process a file randomly, 310 right adjusting data on the screen,
BCD format, 248 Processor, 7 213
data, 241, 242 control instructions, 91 select item from menu, 194
decimal, 239 directives, 504 selectively deleting files, 347
Page (screen), 137, 138, 158, 160, Program simplified assembled macro in-
162, 170, 176 addressing, 24 struction, 395
PAGE .ASM, 74 sorting a table of names, 278
align type, 412, 508 .COM, 21, 22, 106, 448 table search using CMP, 268
directive, 52, 55, 504 entry point, 82, 419 table search using CMPSB, 271
Palette EXE, 21, 22, 55, 73, 449 using a file handle to create a
color, 175, 176, 177 execution, 22 file, 300
register, 177 function keys, 191 using a file handle to read a
PARA align type, 53, 412, 508 hierarchy, 412 file, 304
Paragraph, 3 loading, 20, 73, 448, 454 using a structure, 512
Paragraph boundary, 10, 11, 22, OBJ. See .OBJ file using BIOS to display ASCII
53,412 organization, 132 characters, 165
Parallel device control, 476 overlay, 458 using EXTRN and PUBLIC, 418
Parallel port data area, 470 size, 43, 107 using IF and IFNDEF, 407
Parameter termination, 43, 57, 107, 109, 481 using INT 13H to read disk
for keyboard input, 141 Program examples sectors, 357
in macros, 394 accept and display names, 143 using LOCAL in a macro, 400
590 Index
Program examples (cont.) Read-only memory. See ROM Rotate bits, 129, 536, 537
using macro parameters, 397 Read-write head, 283 Rounding data, 251
using simplified segment direc- Real mode, 7, 12, 439 Row on screen, 137, 138
tives, 422 Real value, 63
using the IFIDN macro, 409 Receive character, 477 S (Search) DEBUG command, 562
using the library INCLUDE, 402 Record, 294 SAHF instruction, 537
using the mouse, 386 RECORD directive, 505 SAL instruction, 129, 538
using the RECORD directive, 506 Record operators, 488 SALL directive, 397
Program segment prefix. See PSP Recursion, 412 SAR instruction, 128, 538
Protected mode, 59 Reexecute instructions, 36 Saving a program in DEBUG, 43
PSP (program segment prefix), 22, .REF file, 84, 553 SBB instruction, 538
57, 73, 123, 297, 440, 443, reg bits, 516 Scan codes, 185, 187, 188, 189, 190,
455, 485 Register, 1, 7, 8, 10, 11, 13, 92 195, 194, 564
PTR operator, 44, 146, 491 notation, 515 Scan line (screen), 159, 160
PUBLIC references, 26 Scan string. See SCAS instruction
combine type, 53, 413, 508 Relational operator, 488 SCAS instruction, 200, 209, 538
directive, 415, 416, 417, 419, 505 Relative SCASB instruction, 209, 538
PURGE directive, 402 byte, 291 SCASD instruction, 538
PUSH instruction, 23, 123, 427, 535 cluster number, 289, 290 SCASW instruction, 210, 538
PUSHA instruction, 24, 535 record number, 316, 321 Screen, 137
PUSHF instruction, 24, 536 sector, 321 Screen display, 147, 148
sector number, 284 Screen page. See Page
Q (Quit) DEBUG command, Relocation table, 450 Scroll
29, 561 Remainder, 232, 233 down the screen, 162
Quadword, 3, 66, 232 Removable media, 331 on the screen, 138, 307
Question mark (?) in expression, 61 Remove subdirectory, 339 up the screen, 161, 169
Quotient, 232 Rename file/directory, 346 ScrollLock key, 184, 189
QWORD directive. See DQ directive REP instruction, 201, 202, 536 Search a table, 266, 269
REPE/REPZ instructions, 202, 536 Sector, 283
R (register) DEBUG command, Repeat string, 536 Sector buffer, 360, 361
29, 561 Repetition directives, 403 Sectors
r/m bits, 516 REPNE/REPNZ instructions, per cluster, 286, 329
Radix 202, 536 per track, 287
point, 241, 253 REPT directive, 403 Seek cylinder, 360
specifier, 63 Reserved sectors, 286 SEG operator, 331, 336, 491
RAM (random access memory), Reserved words, 50, 547 Segment, 10, 53, 412, 458
9,10 Reset disk drive, 326, 354 address, 24, 35, 491
Random Resident portion of address of environment, 441
block, 320 COMMAND.COM, 439 address of PSP, 443
processing, 310, 319 Resident program, 462 boundary, 11, 53
RCL/RCR instructions, 130, 536 RET instruction, 121, 123, 124, 414, code. See Code segment
Read 429, 537 data. See Data segment
attribute or character, 162 RETF/RETN instructions, 537 directives, 494
block randomly, 320 Return code, 57 for .COM program, 107
control data from drive, 330 Reverse video, 169 offset. See Offset
cursor position, 160 Reverse the sign, 237 override operator, 492
disk file, 303 Reversed-byte sequence, 10, 25, 220 override prefix, 102
disk sector, 334 Reversed-word sequence, 220 register, 13
disk status, 354 RGB SEGMENT AT directive, 171, 197
extended sector buffer, 360 bits, 157 SEGMENT directive, 76, 53, 412,
graphics character, 175 monitor, 155 419, 508
keyboard character, 187, 188 Right-adjusting on the screen, 212 Segments and Groups table, 78, 552
light pen position, 175 Right shifting, 128 Select
mouse-motion counters, 383 Rightmost zero for segment address, active page, 160
pixel dot, 177 11 alternative screen routine, 164
record, 304, 319 ROL instruction, 130, 537 default disk drive, 326
sector buffer, 361 ROM (read-only memory), 9 display page for pointer, 385
sectors, 354 ROM BIOS. See BIOS Semicolon (;) for comment, 49
time, 478 ROR instruction, 130, 537 Sense media type, 334
Index 591
SEQ directive, 510 SHORT operator, 114, 492 Store string. See STOS instruction
Sequence of segments, 57 SHR STOS instruction, 200, 205, 211, 540
Sequential reading, 303, 318 instruction, 128, 539 STOSB/STOSW instructions, 205,
Serial operator, 492 540
device control, 475 SHRD instruction, 539 STOSD instruction, 540
number (in ROM), 31 SI register, 15, 100 String
port data area, 470 Sign bit, 5, 128, 218 compares, 200, 206, 269
Service. See Function Sign flag, 16, 117, 120 data, 62, 139, 200
Set Signed operations, 200
address of PSP, 485 data, 119, 120, 223 STRUC directive, 510
color palette, 176 division, 233 SUB instruction, 218, 540
cursor, 138, 159 multiplication, 226 Subdirectory, 288, 338
cursor size, 159 Simplified segment directives, 59, Subprogram, 411, 456, 458
date, 483 78, 109, 421 Subtraction
device information, 330 Single quotes (in string), 62 ASCII data, 245
direction flag. See STD instruction Single-step mode, 29, 117, 475 binary data, 218
diskette type, 361 Size SUBTTL directive, 511
double threshold speed, 384 of .COM program, 107 Switch printer ports, 390
extended error, 486 of file in bytes, 288 Symbolic
file attribute, 343 of memory, 42 code, 24, 32
file date and time, 346 SIZE operator, 279, 492, 493 instructions, 49
graphics mode, 175, 178 Skeleton of an .EXE program, 56 program, 40
horizontal limits for pointer, 382 SMALL model, 59, 502 Symbols table, 78, 552
interrupt address, 464 Software environment, 19 System
media ID, 332 SORT command (DOS), 300 area. See disk system area
media type, 362 Sort table entries, 274 data area, 473
memory allocation strategy, 448 Sound, 390 date, 483
mouse sensitivity, 384 Source index register. See SI register equipment, 30, 470
palette registers, 177 Source program, 58, 72, 74 file, 287
pointer exclusion area, 384 SP (stack pointer) register, 13, 14, loader, 123
pointer location (for mouse), 380 22, 23, 101, 124, 449 program loader, 21
time, 478, 483 Space on a disk, 329 time, 475, 478, 483
upper memory link, 448 Square brackets. See Index operator
vertical limits for pointer, 383 SS register, 11, 13, 22, 23, 452 T (Trace) DEBUG command,
video mode, 155, 159, 175 SS:BP pair, 94 29, 562
Set/reset disk write verification, 328 SS:SP pair, 22 Tab
SETnn instruction, 538 Stack, 11, 22, 55, 58, 77, 123, 427 character, 140, 146
SF (flag). See Sign flag for .COM program, 107, 109 stops, 369
Shift frame, 425 Table, 260
and rotate doubleword, 131 pointer register. See SP register of months, 262
and round data, 251 segment register. See SS register of months and days, 263
bits, 127, 538, 539 STACK combine type, 53, 508 on disk, 261
bits left, 129 STACK directive, 61, 510 sort, 274
bits right, 128 Standard unopened FCB, 441 with ranges, 267
count, 507 Start row:column, 161, 162 with unique entries, 266
key, 183, 184, 197 Start scan line, 159, 160 Tail of the buffer, 195
status, 184, 188, 189, 471 Starting cluster, 288 TASM command, 73, 549
to divide, 236 STARTUP directive, 61 TBYTE directive. See DT directive
to multiply, 231 Statement, 51 TCREF command, 84, 553
value, 127 Status Temporary real data, 238
SHL byte, 353, 354 Terminate
instruction, 129, 539 of communications port, 477 address, 481
operator, 492 of flags, 16 but stay resident (TSR), 462
SHLD instruction, 539 of printer, 375 program execution, 57, 109, 481
Short STC instruction, 539 TEST instruction, 125, 540
address, 113, 114 STD instruction, 202, 539 Test if diskette is ready, 355
integer data, 238 Steps in assembly, link, and Text mode, 137, 155, 156
jump, 116 execute, 74 _TEXT segment name, 60, 79
real data, 238 STI instruction, 540 TEXTEQU directive, 68, 511