Assembler 1
Source Program -> Assembler -> Object Code
Introduction to Assemblers
Fundamental functions of SIC Assembler
Convert mnemonic operation codes to their machine language equivalents
e.g., STL to 14
Convert symbolic operands to their equivalent machine addresses
e.g., RETADR to 1033
Build the machine instructions in the proper format
Write the object program
Assembler Directives
Pseudo-instructions
Not translated into machine instructions
Provide information to the assembler itself
Basic assembler directives
START: Specify the name and starting address of the program.
END: Indicate the end of the source program and (optionally) specify the first executable instruction in the program.
BYTE: Generate a character or hexadecimal constant, occupying as many bytes as needed to represent the constant.
WORD: Generate a one-word integer constant.
RESB: Reserve the indicated number of bytes for a data area.
RESW: Reserve the indicated number of words for a data area.
Difficulties: Forward Reference
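Forward reference: a reference to a label that is defined later in the program.
e.g., STL RETADR in the first line of the program refers to RETADR, which is defined only later in the source.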
Two Pass Assembler
Pass 1
Assign addresses to all statements in the program
Save the addresses assigned to all labels in SYMTAB (the symbol table) for use in Pass 2
Validate the mnemonic operation codes
Pass 2
Assemble instructions: translate operation codes and look up symbol addresses (the actual translation is done in this pass)
Write the object program and the assembly listing
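The Pass 1 steps above can be sketched roughly as follows. This is a minimal illustration for a SIC-like machine, not a complete assembler: the tuple-based statement format, the helper names `pass1` and `constant_length`, and the tiny OPTAB excerpt are assumptions made for the example.

```python
# Sketch of Pass 1: assign addresses and build SYMTAB (assumed statement format).
OPTAB = {"STL": 0x14, "JSUB": 0x48}  # tiny excerpt, enough for a demo


def constant_length(operand):
    """Length in bytes of a BYTE operand such as C'EOF' or X'F1'."""
    body = operand[2:-1]
    return len(body) if operand[0] == "C" else (len(body) + 1) // 2


def pass1(statements):
    """statements: iterable of (label, opcode, operand) tuples.

    Returns (symtab, intermediate, program_length)."""
    symtab, intermediate = {}, []
    locctr = start = 0
    for label, opcode, operand in statements:
        if opcode == "START":
            locctr = start = int(operand, 16)
        # The intermediate "file" keeps each statement with its address.
        intermediate.append((locctr, label, opcode, operand))
        if opcode == "END":
            break
        if label:
            if label in symtab:
                raise ValueError(f"duplicate symbol: {label}")
            symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += constant_length(operand)
        elif opcode != "START":
            raise ValueError(f"invalid operation code: {opcode}")
    return symtab, intermediate, locctr - start
```

Pass 2 would then read the intermediate list, so that every symbol, including forward references, already has an address in SYMTAB.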
Two Pass Assembler
Working procedure of a two-pass assembler
Source program -> Pass 1 -> Intermediate file -> Pass 2 -> Object code
Data Structures: OPTAB (operation code table)
Content
Mnemonic operation code, machine equivalent number
Characteristic
static table
Implementation
array or hash table, easy to search
Mnemonic operation code   Machine equivalent number
STL                       14
JSUB                      48
...
SYMTAB (symbol table)
Content
symbol (label) name, value (address)
Characteristic
dynamic table
Implementation
hash table
Symbol   Value
COPY     1000
FIRST    1000
CLOOP    1003
ENDFIL   1015
EOF      1024
THREE    102D
ZERO     1030
RETADR   1033
LENGTH   1036
BUFFER   1039
RDREC    2039
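As a rough illustration of how the two tables work together, the sketch below assembles a single instruction by looking up the mnemonic in OPTAB and the operand in SYMTAB. It assumes the standard SIC instruction layout (8-bit opcode, one index bit, 15-bit address), which these slides do not spell out; the function name `assemble_sic` is hypothetical.

```python
# Excerpts of the two tables, using values that appear in the slides.
OPTAB = {"STL": 0x14, "JSUB": 0x48, "STCH": 0x54}
SYMTAB = {"RETADR": 0x1033, "BUFFER": 0x1039, "RDREC": 0x2039}


def assemble_sic(mnemonic, operand):
    """Return the 6-hex-digit object code of one standard SIC instruction."""
    indexed = operand.endswith(",X")
    symbol = operand[:-2] if indexed else operand
    if mnemonic not in OPTAB:
        raise ValueError(f"invalid operation code: {mnemonic}")
    if symbol not in SYMTAB:
        raise ValueError(f"undefined symbol: {symbol}")
    # opcode in the top 8 bits, x bit next, 15-bit address in the rest
    word = (OPTAB[mnemonic] << 16) | (0x8000 if indexed else 0) | SYMTAB[symbol]
    return f"{word:06X}"
```

With these tables, `assemble_sic("STL", "RETADR")` yields `141033`, which is the first instruction in the object program of this example.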
Object Program
Header
Col. 1 H
Col. 2-7 Program name
Col. 8-13 Starting address
Col.14-19 Length of object program in bytes
Text
Col. 1 T
Col. 2-7 Starting address in this record
Col. 8-9 Length of object code in this record in bytes
Col.10-69 Object code
End
Col. 1 E
Col. 2-7 Address of first executable instruction
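The column layouts above map directly onto fixed-width string formatting. The following sketch shows one way to produce the three record types; the function names are hypothetical, and the records are emitted without the separating spaces that the slides add for readability.

```python
def header_record(name, start, length):
    """H record: program name (6 columns), start address, length in bytes."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start, object_codes):
    """T record: start address, byte count, then up to 30 bytes of object code."""
    body = "".join(object_codes)
    nbytes = len(body) // 2
    if nbytes > 30:  # columns 10-69 hold at most 60 hex digits
        raise ValueError("a text record holds at most 30 bytes")
    return f"T{start:06X}{nbytes:02X}{body}"


def end_record(first_executable):
    """E record: address of the first executable instruction."""
    return f"E{first_executable:06X}"
```

For example, `text_record(0x2073, ["382064", "4C0000", "05"])` gives `T002073073820644C000005`, the last text record of the sample program without its spaces.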
Object program format :
H COPY 001000 00107A
T 001000 1E 141033 482039 001036 281030 301015 482061 ...
T 00101E 15 0C1036 482061 081033 4C0000 454F46 000003 000000
T 002039 1E 041030 001030 E0205D 30203F D8205D 281030 …
T 002057 1C 101036 4C0000 F1 001000 041030 E02079 302064 …
T 002073 07 382064 4C0000 05
E 001000
Assembler Design
Machine Dependent Assembler Features
instruction formats and addressing modes
program relocation
Machine Independent Assembler Features
literals
symbol-defining statements(EQU)
expressions
program blocks
control sections and program linking
Instruction Format and Addressing Mode
SIC/XE (general formats for addressing modes)
PC-relative or base-relative addressing: opcode m
Indirect addressing: opcode @m
Immediate addressing: opcode #m
Extended format: +opcode m
Indexed addressing: opcode m,X
Register-to-register instructions: opcode r1,r2
m is the operand
X is the index register
r1, r2 are registers
Translation
Register translation
register names (A, X, L, B, S, T, F, PC, SW) and their numbers (0, 1, 2, 3, 4, 5, 6, 8, 9)
are preloaded in SYMTAB
Address translation
Most instructions use program counter relative or base
relative addressing
Format 3: 12-bit address field
Format 4: 20-bit address field
PC-Relative Addressing Modes
PC-relative
10 0000 FIRST STL RETADR 17202D
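The object code 17202D in the line above can be reproduced with a short calculation. The sketch below assumes RETADR is at address 0030 (the value listed for it in the symbol table of this SIC/XE example) and that the program counter already points to the next instruction when the displacement is applied; the function name is hypothetical.

```python
def format3_pc_relative(opcode, locctr, target, n=1, i=1, x=0):
    """Assemble a format 3 instruction using PC-relative addressing."""
    pc = locctr + 3                      # PC holds the address of the next instruction
    disp = target - pc
    if not -2048 <= disp <= 2047:        # 12-bit signed displacement
        raise ValueError("target out of range for PC-relative addressing")
    first_byte = (opcode & 0xFC) | (n << 1) | i
    xbpe = (x << 3) | 0b0010             # b = 0, p = 1, e = 0
    word = (first_byte << 16) | (xbpe << 12) | (disp & 0xFFF)
    return f"{word:06X}"
```

Here STL (opcode 14) at location 0000 with target 0030 gives displacement 0030 - 0003 = 02D, and setting n = i = 1 turns the first byte into 17.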
Base-Relative Addressing Modes
Base-relative
base register is under the control of the programmer
12        LDB   #LENGTH
13        BASE  LENGTH
160 104E  STCH  BUFFER,X   57C003
op(6)   n i x b p e   disp(12)
(54)16  1 1 1 1 0 0   (003)16
PC-relative is not possible here: displacement = BUFFER - PC = 0036 - 1051 = -(101B)16, which does not fit in 12 bits
base-relative displacement = BUFFER - (B) = 0036 - 0033 = 3
NOBASE informs the assembler that the contents of the base register can no longer be relied upon for addressing
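A sketch of the selection logic: try PC-relative first, and fall back to base-relative only when the displacement does not fit and a BASE value is known. The function name and the `base=None` convention for "no BASE in effect" are assumptions for illustration.

```python
def format3(opcode, locctr, target, base=None, indexed=False):
    """Assemble a format 3 instruction with simple addressing (n = i = 1)."""
    pc_disp = target - (locctr + 3)
    if -2048 <= pc_disp <= 2047:
        b, p, disp = 0, 1, pc_disp & 0xFFF          # PC-relative
    elif base is not None and 0 <= target - base <= 4095:
        b, p, disp = 1, 0, target - base            # base-relative
    else:
        raise ValueError("operand not reachable with format 3")
    xbpe = (int(indexed) << 3) | (b << 2) | (p << 1)
    word = ((opcode | 0b11) << 16) | (xbpe << 12) | disp
    return f"{word:06X}"
```

For STCH BUFFER,X at 104E with B = 0033, the PC-relative displacement is out of range, so the sketch selects b = 1 with displacement 003 and produces 57C003.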
Immediate Address Translation
Immediate addressing
55 0020 LDA #3 010003
op(6)   n i x b p e   disp(12)
(00)16  0 1 0 0 0 0   (003)16
Immediate Address Translation (Cont.)
Immediate addressing
12 0003 LDB #LENGTH 69202D
op(6)   n i x b p e   disp(12)
(68)16  0 1 0 0 1 0   (02D)16   -> 69202D (PC-relative)
(68)16  0 1 0 0 0 0   (033)16   -> 690033 (direct, without PC-relative)
the immediate operand is the symbol LENGTH
the address of the symbol LENGTH is loaded into register B
LENGTH = 0033 = PC + displacement = 0006 + 02D
if immediate mode is specified, the target address becomes the operand
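The two immediate cases (a numeric constant versus a symbol) can be sketched as one function. This is an illustrative sketch only: the function name, the `symtab` argument, and the choice to always use PC-relative for symbols are assumptions.

```python
def assemble_immediate(opcode, locctr, operand, symtab):
    """Assemble a format 3 instruction whose operand is written as #value or #SYMBOL."""
    first_byte = (opcode & 0xFC) | 0b01          # n = 0, i = 1
    text = operand.lstrip("#")
    if text.isdigit():
        value = int(text)
        if value > 4095:
            raise ValueError("constant does not fit in 12 bits; format 4 needed")
        xbpe, disp = 0, value                    # constant goes straight into disp
    else:
        signed = symtab[text] - (locctr + 3)     # address of the symbol, PC-relative
        if not -2048 <= signed <= 2047:
            raise ValueError("symbol out of PC-relative range")
        xbpe, disp = 0b0010, signed & 0xFFF
    word = (first_byte << 16) | (xbpe << 12) | disp
    return f"{word:06X}"
```

With LENGTH at 0033, `assemble_immediate(0x68, 0x0003, "#LENGTH", {"LENGTH": 0x33})` reproduces 69202D, and `#3` with opcode 00 reproduces 010003.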
Indirect Address Translation
Indirect addressing
the target address is computed as usual (PC-relative or base-relative)
of the n and i bits, only n is set to 1 (n = 1, i = 0)
70 002A J @RETADR 3E2003
Program Relocation
Relocatable Program
Modification record
Col. 1 M
Col. 2-7 Starting location of the address field to be modified
Col. 8-9 Length of the address field to be modified, in half-bytes
Machine-Independent Assembler Features
Literals
Symbol-Defining Statements
Expressions
Program Blocks
Control Sections and Program Linking
Literals
Design idea
Let programmers write the value of a constant operand as part of the instruction that uses it.
This avoids having to define the constant elsewhere in the program and make up a label for it.
Example: LDA =C'EOF'
Literals vs. Immediate Operands
Immediate Operands
The operand value is assembled as part of the machine instruction
e.g. 55 0020 LDA #3 010003
Literals
The assembler generates the specified value as a constant at some other memory location
e.g. 45 001A ENDFIL LDA =C'EOF' 032010
Compare (Fig. 2.6)
e.g. 45 001A ENDFIL LDA EOF 032010
80 002D EOF BYTE C'EOF' 454F46
Literal - Implementation (1/3)
Literal pools
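Normally, literals are placed into a pool at the end of the program
In some cases, it is desirable to place literals into a pool at some other location in the object program
assembler directive LTORG: creates a literal pool containing all literal operands used since the previous LTORG (or the beginning of the program)
reason: keep the literal operand close to the instruction that uses it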
Literal - Implementation (2/3)
Duplicate literals
e.g. 215 1062 WLOOP TD =X'05'
e.g. 230 106B WD =X'05'
The assembler should recognize duplicate literals and store only one copy of the specified data value
Comparison of the defining expression
• The same literal name can have different values, e.g. =* (the current value of LOCCTR)
Comparison of the generated data value
• The benefits of comparing generated data values are usually not great enough to justify the additional complexity in the assembler
Data structures : LITTAB & Two Pass
LITTAB
literal name, operand value and length, and the address assigned to the operand
Pass 1
build LITTAB with the literal name, operand value, and length, leaving the address unassigned
when an LTORG statement (or the end of the program) is encountered, assign an address to each literal not yet assigned an address
Pass 2
search LITTAB for each literal operand encountered
generate data values as if by BYTE or WORD statements
generate a Modification record for literals that represent an address in the program
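The Pass 1 handling of LITTAB can be sketched as below. Duplicates are detected by comparing the literal name (the defining expression), so the special case `=*` is not handled. The class and method names are hypothetical.

```python
def literal_value(name):
    """Hex string of the data for a literal such as =C'EOF' or =X'05'."""
    kind, body = name[1], name[3:-1]
    if kind == "C":
        return body.encode("ascii").hex().upper()
    if kind == "X":
        return body.upper()
    raise ValueError(f"unsupported literal: {name}")


class LiteralTable:
    def __init__(self):
        self.entries = {}  # name -> {"value", "length", "address"}

    def add(self, name):
        """Record a literal operand; a duplicate name is stored only once."""
        if name not in self.entries:
            value = literal_value(name)
            self.entries[name] = {"value": value, "length": len(value) // 2, "address": None}

    def place_pool(self, locctr):
        """At LTORG or END: give addresses to unplaced literals, return new LOCCTR."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr
```

Calling `place_pool` at each LTORG and once more at END leaves earlier pools untouched, because only entries without an address are placed.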
Symbol-Defining Statements
Labels on instructions or data areas
the value of such a label is the address assigned to the statement
Defining symbols
symbol EQU value
value can be a constant, another symbol, or an expression
makes the source program easier to understand
no forward references: every symbol used in value must already be defined
Symbol-Defining Statements
Example 1
MAXLEN EQU 4096
+LDT #MAXLEN     (instead of +LDT #4096)
Example 2
BASE EQU R1
COUNT EQU R2
INDEX EQU R3
Example 3
MAXLEN EQU BUFEND-BUFFER
ORG (origin)
Indirectly assign values to symbols
Reset the location counter to the specified value
ORG value
value can be a constant, another symbol, or an expression
No forward references
Example
STAB (100 entries), each entry with three fields:
SYMBOL: 6 bytes
VALUE: 1 word
FLAGS: 2 bytes
LDA VALUE,X
ORG Example
Using EQU statements
STAB    RESB  1100
SYMBOL  EQU   STAB
VALUE   EQU   STAB+6
FLAGS   EQU   STAB+9
Using ORG statements
STAB    RESB  1100
        ORG   STAB
SYMBOL  RESB  6
VALUE   RESW  1
FLAGS   RESB  2
        ORG   STAB+1100
Expressions
Expressions can be classified as absolute expressions or relative expressions
MAXLEN EQU BUFEND-BUFFER
BUFEND and BUFFER are both relative terms, representing addresses within the program
However, the expression BUFEND-BUFFER represents an absolute value
When relative terms are paired with opposite signs, the dependency on the program starting address is canceled out; the result is an absolute value
SYMTAB
No relative term may enter into a multiplication or division operation
Errors:
BUFEND+BUFFER
100-BUFFER
3*BUFFER
The type of an expression
keep track of the types of all symbols defined in the program
Symbol   Type  Value
RETADR   R     30
BUFFER   R     36
BUFEND   R     1036
MAXLEN   A     1000
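One way to apply these rules is to count relative terms with their signs: a net count of 0 means the expression is absolute, +1 means relative, and anything else is an error. The sketch below is illustrative only; it handles unparenthesized expressions, treats bare numbers as decimal, and uses hypothetical function names.

```python
import re

# Symbol table with types, as in the table above (values in hex).
SYMTAB = {"RETADR": ("R", 0x30), "BUFFER": ("R", 0x36),
          "BUFEND": ("R", 0x1036), "MAXLEN": ("A", 0x1000)}


def operand_info(token, symtab):
    """Return (type, value) for a decimal constant or a symbol."""
    return ("A", int(token)) if token.isdigit() else symtab[token]


def evaluate(expr, symtab):
    """Return (type, value) of an expression, or raise ValueError if illegal."""
    relative_count, value = 0, 0
    for sign, term in re.findall(r"([+-]?)([^+-]+)", expr.replace(" ", "")):
        parts = re.split(r"([*/])", term)
        kind, term_value = operand_info(parts[0], symtab)
        for op, factor in zip(parts[1::2], parts[2::2]):
            factor_kind, factor_value = operand_info(factor, symtab)
            if "R" in (kind, factor_kind):
                raise ValueError("relative term in multiplication or division")
            term_value = term_value * factor_value if op == "*" else term_value // factor_value
        weight = -1 if sign == "-" else 1
        value += weight * term_value
        if kind == "R":
            relative_count += weight
    if relative_count == 0:
        return "A", value
    if relative_count == 1:
        return "R", value
    raise ValueError("relative terms are not properly paired")


def is_legal(expr, symtab):
    """True if the expression is a legal absolute or relative expression."""
    try:
        evaluate(expr, symtab)
        return True
    except ValueError:
        return False
```

With the table above, BUFEND-BUFFER evaluates to an absolute 1000 (hex), matching the MAXLEN entry, while the three error examples are all rejected.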
Example 2.9
SYMTAB
Name     Value
COPY     0
FIRST    0
CLOOP    6
ENDFIL   1A
RETADR   30
LENGTH   33
BUFFER   36
BUFEND   1036
MAXLEN   1000
RDREC    1036
RLOOP    1040
EXIT     1056
INPUT    105C
WREC     105D
WLOOP    1062
LITTAB
Literal  Value   Length  Address
C'EOF'   454F46  3       002D
X'05'    05      1       1076
Program Blocks
Program blocks
refer to segments of code that are rearranged within a single object program unit
USE [block name]
At the beginning, statements are assumed to be part of the unnamed (default) block
If no USE statements are included, the entire program belongs to this single block
Each program block may actually contain several separate segments of the source program
Program Blocks - Implementation
Pass 1
each program block has a separate location counter
each label is assigned an address that is relative to the start of the block that contains it
at the end of Pass 1, the latest value of the location counter for each block indicates the length of that block
the assembler can then assign to each block a starting address in the object program
Pass 2
The address of each symbol can be computed by adding the assigned block starting address and the relative address of the symbol within that block
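The address computation described above can be sketched as follows. The block names, block lengths, and the relative address used for LENGTH in the usage note are assumed values for illustration; they are not given in these slides.

```python
def assign_block_starts(block_lengths):
    """block_lengths: dict of block name -> length, in block-number order.

    Returns a dict of block name -> starting address in the object program."""
    starts, next_start = {}, 0
    for name, length in block_lengths.items():
        starts[name] = next_start
        next_start += length
    return starts


def absolute_address(symbol, symtab, starts):
    """symtab maps symbol -> (block name, address relative to that block)."""
    block, relative = symtab[symbol]
    return starts[block] + relative
```

For instance, if the default block is 0066 bytes long and LENGTH sits at relative address 0003 in the next block, its address becomes 0066 + 0003 = 0069. An instruction at 0006 that refers to it PC-relatively would then use displacement 0069 - 0009 = 060.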
Block Table
Example
20 0006 0 LDA LENGTH 032060
Control Sections and Program Linking
External definition
EXTDEF names symbols that are defined in this control section and may be used by other sections
External reference
EXTREF names symbols that are used in this control section and are defined elsewhere
Define record
Col. 1 D
Col. 2-7 Name of external symbol defined in this control section
Col. 8-13 Relative address
Col. 14-73 Repeat information in Col. 2-13 for other external symbols
Refer record
Col. 1 R
Col. 2-7 Name of external symbol referred to in this control section
Col. 8-73 Names of other external reference symbols
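Both record types are fixed-width, so they can be generated with simple string formatting. The sketch below is a minimal illustration; the function names are hypothetical and the symbol addresses in the usage note are made-up sample values.

```python
def define_record(symbols):
    """D record from a list of (name, relative address) pairs, 12 columns each."""
    return "D" + "".join(f"{name:<6.6}{address:06X}" for name, address in symbols)


def refer_record(names):
    """R record from a list of external symbol names, 6 columns each."""
    return "R" + "".join(f"{name:<6.6}" for name in names)
```

For example, `define_record([("BUFFER", 0x33), ("LENGTH", 0x2D)])` returns `DBUFFER000033LENGTH00002D`, and names shorter than six characters are padded with blanks.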
Modification Record
Modification record (revised)
Col. 1 M
Col. 2-7 Starting address of the field to be modified
Col. 8-9 Length of the field to be modified, in half-bytes
Col. 10 Modification flag (+ or -)
Col. 11-16 External symbol whose value is to be added to or subtracted from the indicated field
Example
M00000405+RDREC
M00000705+COPY
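A small formatter covers both the original and the revised form of the record. This is a sketch under the column layout given above; the function name is hypothetical, and the symbol is written without trailing blanks to match the examples.

```python
def modification_record(start, half_bytes, flag=None, symbol=None):
    """M record; flag and symbol are only used in the revised format."""
    record = f"M{start:06X}{half_bytes:02X}"
    if symbol is not None:
        record += f"{flag or '+'}{symbol}"
    return record
```

A format 4 instruction at location 0003 has its 20-bit address field starting in the byte at 0004, with a length of 5 half-bytes, so `modification_record(0x4, 5, "+", "RDREC")` yields the first example above.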
One-Pass Assemblers
Figure: the source program is processed in a single pass, using OPTAB and SYMTAB
A one-pass assembler avoids the overhead of an additional pass over the source program.
It is useful when external working-storage devices are unavailable or too slow, because it does not require an intermediate file.
Two types of one-pass assembler
load-and-go
produces object code directly in memory for immediate execution
the other
produces usual kind of object code for later execution
Load-and-go Assembler
Characteristics
Useful for program development and testing
Avoids the overhead of writing the object program out and reading it back
No object program is written out.
No loader is needed.
No intermediate file is needed.
Algorithm : One-pass Assembler
For any symbol that has not yet been defined
1. omit the address translation
2. insert the symbol into SYMTAB and mark it as undefined
3. add the address of the operand field that refers to the undefined symbol to a list of forward references associated with the symbol table entry
4. when the definition of the symbol is encountered, insert the proper address into every previously generated instruction on the forward reference list
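The four steps above can be sketched for a load-and-go assembler that writes object code straight into a memory image. This is a simplified illustration assuming 3-byte standard SIC instructions without indexing; the class and method names are hypothetical.

```python
class OnePassAssembler:
    def __init__(self, start):
        self.memory = {}   # address -> byte value (the in-memory object code)
        self.symtab = {}   # name -> {"addr": int or None, "refs": [addresses]}
        self.locctr = start

    def _entry(self, name):
        return self.symtab.setdefault(name, {"addr": None, "refs": []})

    def _store_address(self, where, addr):
        self.memory[where] = (addr >> 8) & 0x7F   # x bit left as 0
        self.memory[where + 1] = addr & 0xFF

    def define(self, label):
        """Step 4: a definition is encountered; patch all earlier references."""
        entry = self._entry(label)
        if entry["addr"] is not None:
            raise ValueError(f"duplicate symbol: {label}")
        entry["addr"] = self.locctr
        for ref in entry["refs"]:
            self._store_address(ref, self.locctr)
        entry["refs"].clear()

    def emit(self, opcode, operand):
        """Steps 1-3: assemble one instruction, deferring unknown addresses."""
        entry = self._entry(operand)
        self.memory[self.locctr] = opcode
        if entry["addr"] is None:
            entry["refs"].append(self.locctr + 1)   # remember the address field
            self._store_address(self.locctr + 1, 0)
        else:
            self._store_address(self.locctr + 1, entry["addr"])
        self.locctr += 3

    def reserve(self, nbytes):
        self.locctr += nbytes

    def undefined_symbols(self):
        return [name for name, e in self.symtab.items() if e["addr"] is None]
```

Any name still returned by `undefined_symbols()` at the end of the program indicates an undefined-symbol error.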