Computer Organization & Assembly Languages
Introduction
Pu-Jen Cheng
2008/09/15
Course Administration
Instructor: Pu-Jen Cheng (CSIE R323)
pjcheng@[Link]
[Link]
Class Hours: 2:00pm-5:00pm, Monday
Classroom: CSIE R102
TA(s): 戴瑋彥 b93705014@[Link]
Course Information:
Announce: [Link]
Q&A: bbs://[Link] → CSIE_ASM
Textbook
Assembly Language for Intel-Based Computers, 5th Edition,
by Kip Irvine, Prentice-Hall, 2006
[Link]
References
Computer Systems: A Programmer's Perspective
By Randal E. Bryant and David R. O'Hallaron,
Prentice Hall
[Link]
The Art of Assembly Language
By Randall Hyde,
[Link]
[Link]
System Software: An Introduction to Systems
Programming
By Leland L. Beck
Addison-Wesley
Prerequisite
Experience writing programs in a high-level
language such as C, C++, or Java
Course Grading (tentative)
Assignments (55%)
Class participation (5%)
Midterm exam (20%)
Final exam (20%)
Materials
Some materials used in this course are adapted from
- The slides prepared by Kip Irvine for the book Assembly Language
  for Intel-Based Computers, 5th Ed.
- The slides prepared by S. Dandamudi for the book Introduction to
  Assembly Language Programming, 2nd Ed.
- Introduction to Computer Systems, CMU
  ([Link]15213-f05/www/)
- Assembly Language & Computer Organization, NTU
  ([Link]) ([Link])
What is Assembly Language
First Glance at Assembly Language
Translating Languages
English: Display the sum of A times B plus C.
C++: cout << (A * B + C);
Assembly Language:    Intel Machine Language:
mov eax,A             A1 00000000
mul B                 F7 25 00000004
add eax,C             03 05 00000008
call WriteInt         E8 00500000
A Simple Example in VC++
View/Debug Windows/Disassembly
gcc -S prog.c (writes the generated assembly to prog.s)
The Compilation System
First Glance at Assembly Language
Low-level language
- Each instruction performs a much lower-level task
  than a high-level language instruction
- Most high-level language instructions need more than
  one assembly instruction
One-to-one correspondence between assembly
language and machine language instructions
- For most assembly language instructions, there is a
  machine language equivalent
Directly influenced by the instruction set and
architecture of the processor (CPU)
Comparisons with High-level Languages
Advantages of Assembly Languages
- Space efficiency
  (e.g., software for hand-held devices)
- Time efficiency
  (e.g., real-time applications)
- Access to system hardware
  (e.g., network interfaces, device drivers, video games)
Advantages of High-level Languages
- Development
- Maintenance (readability)
- Portability (compiler, virtual machine)
Comparisons with High-level Languages (cont.)
Why Take This Course?
This course: basic concepts of computer organization, and assembly language
It is a foundation for:
- Computer design, computer organization, and computer architecture
- System software: assembler, linker, loader, compiler, operating system, …
“I really don’t think that you can write a book for
serious computer programmers unless you are
able to discuss low-level details.”
Donald Knuth (高德納)
The Art of Computer Programming
[Link]
Course Coverage
Basic Concepts
IA-32 Processor Architecture
Assembly Language Fundamentals
Data Transfers, Addressing, and Arithmetic
Procedures
Conditional Processing
Integer Arithmetic
Advanced Procedures
Strings and Arrays
Structures and Macros
High-Level Language Interface
Assembler, Linker, and Loader
Other Advanced Topics (optional)
What You Will Learn
Basic principles of computer architecture
IA-32 processors and memory management
Basic assembly programming skills
How a high-level language is translated to assembly
How assembly is translated to machine code
How an application program communicates with the OS
The interface between assembly and high-level languages
Performance: Multiword Arithmetic
Longhand multiplication (figure: longhand multiplication of 0101 by 1101)
- Final 128-bit result in P:A

P := 0; count := 64
A := multiplier; B := multiplicand
while (count > 0)
    if (LSB of A = 1)
    then P := P+B
         CF := carry generated by P+B
    else CF := 0
    end if
    shift right CF:P:A by one bit position
    count := count-1
end while
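The shift-and-add loop above can be sketched in Python. This is a minimal illustration, not the course's assembly version: the function name `shift_add_multiply` and the `bits` parameter are assumptions added here so the same loop can be tried with 4-bit registers as well as the 64-bit registers of the pseudocode.

```python
def shift_add_multiply(multiplier: int, multiplicand: int, bits: int = 64) -> int:
    """Multiply two unsigned bits-wide integers; the result is the pair P:A."""
    mask = (1 << bits) - 1
    a = multiplier & mask      # A := multiplier
    b = multiplicand & mask    # B := multiplicand
    p = 0                      # P := 0
    for _ in range(bits):      # count := bits, decremented once per pass
        if a & 1:              # LSB of A = 1
            total = p + b
            cf = total >> bits     # carry generated by P+B
            p = total & mask
        else:
            cf = 0
        # Shift CF:P:A right by one bit: P's low bit moves into A's high bit,
        # and CF moves into P's high bit.
        a = (a >> 1) | ((p & 1) << (bits - 1))
        p = (p >> 1) | (cf << (bits - 1))
    return (p << bits) | a


print(shift_add_multiply(13, 5, bits=4))  # 65
```

With `bits=4` the intermediate values of P and A can be compared against a hand trace of the algorithm.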
Example
A = 1101 (binary) = 13
B = 0101 (binary) = 5
                 After P+B          After the shift
                 CF  P     A        CF  P     A
Initial state    ?   0000  1101     --  ----  ----
Iteration 1      0   0101  1101     ?   0010  1110
Iteration 2      0   0010  1110     ?   0001  0111
Iteration 3      0   0110  0111     ?   0011  0011
Iteration 4      0   1000  0011     ?   0100  0001
Time Comparison
[Figure: time in seconds (0 to 5) versus number of calls in millions (0 to 100), with one curve for the C version and one for the ASM version]
Multiplication time comparison on a 2.4-GHz Pentium 4 system
Chapter 1: Basic Concepts
Virtual Machine Concept
Data Representation
Boolean Operations
Virtual Machines
Abstractions for computers
Machine-independent:
  High-Level Language             Level 5
Machine-specific:
  Assembly Language               Level 4
  Operating System                Level 3
  Instruction Set Architecture    Level 2
  Microarchitecture               Level 1
  Digital Logic                   Level 0
High-Level Language
Level 5
Application-oriented languages
- C++, Java, Pascal, Visual Basic, ...
Programs compile into assembly language
(Level 4)
Assembly Language
Level 4
Instruction mnemonics that have a one-to-one
correspondence to machine language
Calls functions written at the operating
system level (Level 3)
Programs are translated into machine
language (Level 2)
Operating System
Level 3
Provides services to Level 4 programs
Translated and run at the instruction set
architecture level (Level 2)
Instruction Set Architecture
Level 2
Also known as conventional machine
language
Executed by Level 1 (microarchitecture)
program
Microarchitecture
Level 1
Interprets conventional machine
instructions (Level 2)
Executed by digital hardware (Level 0)
Digital Logic
Level 0
CPU, constructed from digital logic gates
System bus
Memory
Data Representation
Binary Numbers
- Translating between binary and decimal
Binary Addition
Integer Storage Sizes
Hexadecimal Integers
- Translating between decimal and hexadecimal
- Hexadecimal subtraction
Signed Integers
- Binary subtraction
Fractional Binary Numbers
Character Storage
Machine Words
Binary Representation
Electronic Implementation
- Easy to store with bistable elements
- Reliably transmitted on noisy and inaccurate wires
Binary Numbers
Digits are 1 and 0
- 1 = true
- 0 = false
MSB – most significant bit
LSB – least significant bit
                MSB          LSB
Bit numbering:  1011001010011100
                15             0
Binary Numbers
Each digit (bit) is either 1 or 0:   1   1   1   1   1   1   1   1
Each bit represents a power of 2:   2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
Every binary number is a sum of powers of 2
Translating Binary to Decimal
Weighted positional notation shows how to
calculate the decimal value of each binary bit:
dec = (D_{n-1} × 2^{n-1}) + (D_{n-2} × 2^{n-2}) + ... + (D_1 × 2^1) + (D_0 × 2^0)
D = binary digit
binary 00001001 = decimal 9:
(1 × 2^3) + (1 × 2^0) = 9
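The weighted positional sum can be written out directly. A minimal Python sketch follows; the function name `binary_to_decimal` is an illustrative choice, and the input is assumed to be a string of 0 and 1 characters with the MSB first.

```python
def binary_to_decimal(bits: str) -> int:
    """Sum D_k * 2^k over all bit positions k (position 0 is the rightmost bit)."""
    total = 0
    for k, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** k
    return total


print(binary_to_decimal("00001001"))  # 9
```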
Translating Unsigned Decimal to Binary
Repeatedly divide the decimal integer by 2.
Each remainder is a binary digit in the translated value:
37 = 100101
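The repeated-division method can be sketched as below. This is a minimal Python illustration for non-negative integers; the function name `decimal_to_binary` is an assumption made for the example.

```python
def decimal_to_binary(n: int) -> str:
    """Divide by 2 repeatedly; the remainders, read in reverse, are the bits."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 2)
        digits.append(str(remainder))  # the first remainder is the LSB
    return "".join(reversed(digits))


print(decimal_to_binary(37))  # 100101
```

For 37 the remainders come out as 1, 0, 1, 0, 0, 1 (LSB first), which reversed gives 100101.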
Binary Addition
Starting with the LSB, add each pair of digits, include the
carry if present.
carry:                1
              0 0 0 0 0 1 0 0  (4)
            + 0 0 0 0 0 1 1 1  (7)
              ---------------
              0 0 0 0 1 0 1 1  (11)
bit position: 7 6 5 4 3 2 1 0
Integer Storage Sizes
Standard sizes (in bits):
byte         8
word        16
doubleword  32
quadword    64
What is the largest unsigned integer that may be stored in 20 bits?
Large Measurements
Kilobyte (KB), 2^10 bytes
Megabyte (MB), 2^20 bytes
Gigabyte (GB), 2^30 bytes
Terabyte (TB), 2^40 bytes
Petabyte, 2^50 bytes
Exabyte, 2^60 bytes
Zettabyte, 2^70 bytes
Yottabyte, 2^80 bytes
Googol, 10^100
Hexadecimal Integers
Binary values are represented in hexadecimal.
Translating Binary to Hexadecimal
Each hexadecimal digit corresponds to 4 binary
bits.
Example: Translate the binary integer
000101101010011110010100 to hexadecimal:
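Because each hexadecimal digit covers exactly 4 bits, the translation is a matter of grouping. A minimal Python sketch, with `binary_to_hex` as an illustrative name and the input assumed to be a string of bits, MSB first:

```python
def binary_to_hex(bits: str) -> str:
    """Group the bits in fours from the right and map each group to a hex digit."""
    # Pad on the left so the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join("0123456789ABCDEF"[int(group, 2)] for group in groups)


print(binary_to_hex("000101101010011110010100"))  # 16A794
```

The groups for the slide's example are 0001 0110 1010 0111 1001 0100, giving the digits 1, 6, A, 7, 9, 4.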
Converting Hexadecimal to Decimal
Multiply each digit by its corresponding power of
16:
dec = (D_3 × 16^3) + (D_2 × 16^2) + (D_1 × 16^1) + (D_0 × 16^0)
Hex 1234 equals (1 × 16^3) + (2 × 16^2) + (3 × 16^1) + (4 × 16^0),
or decimal 4,660.
Hex 3BA4 equals (3 × 16^3) + (11 × 16^2) + (10 × 16^1) + (4 × 16^0),
or decimal 15,268.
Powers of 16
Used when calculating hexadecimal values up to 8
digits long:
Converting Decimal to Hexadecimal
decimal 422 = 1A6 hexadecimal
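One way to carry out this conversion is repeated division by 16, mirroring the divide-by-2 method used for binary. The sketch below assumes that method; the function name `decimal_to_hex` is illustrative.

```python
def decimal_to_hex(n: int) -> str:
    """Divide by 16 repeatedly; the remainders, read in reverse, are the hex digits."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 16)
        digits.append("0123456789ABCDEF"[remainder])  # first remainder is the lowest digit
    return "".join(reversed(digits))


print(decimal_to_hex(422))  # 1A6
```

For 422 the steps are 422 / 16 = 26 rem 6, 26 / 16 = 1 rem 10 (A), and 1 / 16 = 0 rem 1.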
Hexadecimal Addition
Divide the sum of two digits by the number base (16).
The quotient becomes the carry value, and the remainder
is the sum digit.
                  1      1
    36     28     28     6A
  + 42   + 45   + 58   + 4B
    --     --     --     --
    78     6D     80     B5
                         21 / 16 = 1, rem 5
Important skill: Programmers frequently add and
subtract the addresses of variables and instructions.
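The digit-by-digit rule (quotient is the carry, remainder is the sum digit) can be sketched in Python. The function name `hex_add` and the string-based interface are assumptions made for this illustration; in practice `hex(0x6A + 0x4B)` gives the same answer in one step.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_add(x: str, y: str) -> str:
    """Add two hexadecimal strings one digit at a time, propagating the carry."""
    width = max(len(x), len(y))
    x, y = x.upper().zfill(width), y.upper().zfill(width)
    carry, digits = 0, []
    for dx, dy in zip(reversed(x), reversed(y)):
        # Quotient becomes the carry, remainder becomes the sum digit.
        carry, remainder = divmod(HEX_DIGITS.index(dx) + HEX_DIGITS.index(dy) + carry, 16)
        digits.append(HEX_DIGITS[remainder])
    if carry:
        digits.append(HEX_DIGITS[carry])
    return "".join(reversed(digits))


print(hex_add("6A", "4B"))  # B5
```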
Hexadecimal Subtraction
When a borrow is required from the digit to the left,
add 16 (decimal) to the current digit's value:
             -1
    C6       75       16 + 5 = 21
  - A2     - 47
    --       --
    24       2E
Practice: The address of var1 is 00400020. The
address of the next variable after var1 is 0040006A.
How many bytes are used by var1?
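The practice question is an address subtraction. A short Python check follows, under the assumption (implied by the question) that var1 occupies every byte up to the start of the next variable; the variable names are illustrative.

```python
var1_address = 0x00400020
next_address = 0x0040006A

# The size of var1 is the distance between the two addresses.
size = next_address - var1_address
print(hex(size), size)  # 0x4a 74
```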
Signed Integers
The highest bit indicates the sign.
1 = negative, 0 = positive
sign bit
   |
   1 1 1 1 0 1 1 0   Negative
   0 0 0 0 1 0 1 0   Positive
If the highest digit of a hexadecimal integer is > 7, the
value is negative. Examples: 8A, C5, A2, 9D
Forming the Two's Complement
Take the bitwise NOT of the number, then add 1
Note that 00000001 + 11111111 = 00000000 (the carry out of the highest bit is discarded)
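The invert-and-add-one rule can be sketched in Python for a fixed register width. The function name `twos_complement` and the default width of 8 bits are assumptions for this example; the mask models the fixed width, since Python integers are unbounded.

```python
def twos_complement(value: int, bits: int = 8) -> int:
    """Return the two's complement of value within a bits-wide register."""
    mask = (1 << bits) - 1
    inverted = ~value & mask        # bitwise NOT, kept to the register width
    return (inverted + 1) & mask    # add 1, discarding any carry out


print(format(twos_complement(0b00000001), "08b"))  # 11111111
```

Adding a value to its two's complement and keeping only the low 8 bits gives zero, as in the note above.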
8-bit Two's Complement Integers
Binary Subtraction
When subtracting A – B, convert B to its two's
complement
Add A to (–B)
    00001100          00001100
  – 00000011        + 11111101
                      --------
                      00001001
Advantages of two's complement:
- Only one representation of zero
- The highest bit still indicates the sign
- No need for separate addition and subtraction circuits
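Subtraction by adding the two's complement can be sketched as follows. This is a minimal Python illustration for 8-bit values; the function name `subtract` and the `bits` parameter are assumptions for the example.

```python
def subtract(a: int, b: int, bits: int = 8) -> int:
    """Compute a - b within a bits-wide register by adding the two's complement of b."""
    mask = (1 << bits) - 1
    negated_b = ((~b & mask) + 1) & mask   # two's complement of b
    return (a + negated_b) & mask          # carry out of the top bit is discarded


print(format(subtract(0b00001100, 0b00000011), "08b"))  # 00001001
```

Only an adder and an inverter are involved, which is the point of the last advantage listed above.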
Ranges of Signed Integers
The highest bit is reserved for the sign. This limits
the range:
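Assuming two's complement storage, an n-bit signed integer spans -2^(n-1) to 2^(n-1) - 1. The Python sketch below computes that range for the standard storage sizes; the function name `signed_range` is illustrative.

```python
def signed_range(bits: int) -> tuple:
    """Smallest and largest values of a bits-wide two's complement integer."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


for bits in (8, 16, 32, 64):
    low, high = signed_range(bits)
    print(f"{bits:2d} bits: {low} to {high}")
```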
Fractional Binary Numbers
Bit positions: b_i  b_{i-1}  ...  b_2  b_1  b_0 . b_{-1}  b_{-2}  b_{-3}  ...  b_{-j}
Weights:       2^i  2^{i-1}  ...  4    2    1   . 1/2     1/4     1/8     ...  2^{-j}
Representation
- Bits to right of "binary point" represent fractional
  powers of 2
- Represents rational number: \sum_{k=-j}^{i} b_k \cdot 2^k
Examples of Fractional Binary Numbers
Value      Representation
5 3/4      101.11_2
2 7/8      10.111_2
63/64      0.111111_2
Observations
- Divide by 2 by shifting right
- Multiply by 2 by shifting left
- Numbers of the form 0.111111…_2 are just below 1.0
  1/2 + 1/4 + 1/8 + … + 1/2^i + … → 1.0
  Use notation 1.0 – ε
Representable Numbers
Limitation
- Can only exactly represent numbers of the form x/2^k
- Other numbers have repeating bit representations
Value      Representation
1/3        0.0101010101[01]…_2
1/5        0.001100110011[0011]…_2
1/10       0.0001100110011[0011]…_2
Converting Real Numbers
Binary real to decimal real
Decimal real to binary real
4.5625 = 100.1001_2
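A common way to convert a decimal real to binary is to convert the integer part as usual and then repeatedly multiply the fractional part by 2, taking the integer part of each product as the next bit. The sketch below assumes that method and a non-negative input; the function name `real_to_binary` and the bit limit `max_fraction_bits` are illustrative.

```python
def real_to_binary(value: float, max_fraction_bits: int = 16) -> str:
    """Convert a non-negative real number to a binary string with a binary point."""
    whole = int(value)
    fraction = value - whole
    bits = []
    while fraction > 0 and len(bits) < max_fraction_bits:
        fraction *= 2
        bit = int(fraction)        # the integer part is the next fractional bit
        bits.append(str(bit))
        fraction -= bit
    return format(whole, "b") + ("." + "".join(bits) if bits else "")


print(real_to_binary(4.5625))  # 100.1001
```

Values such as 1/10 never terminate, so the loop stops at `max_fraction_bits` and the result is a truncation.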
True or False
If x > 0 then x + 1 > 0
If x < 0 then x * 2 < 0
If x > y then -x < -y
If x >= 0 then -x <= 0
If x < 0 then -x > 0
If x >= 0 then (( !x - 1 ) & x ) == x
If x < 0 && y > 0 then x * y < 0
If x < 0 then ((x ^ x >> 31) + 1) > 0
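Several of these statements can be probed at the extremes of the integer range. The Python sketch below assumes x is a 32-bit two's complement integer whose arithmetic wraps around on overflow; that models typical hardware behavior rather than a guarantee of the C language, where signed overflow is undefined. The helper `to_int32` and the constants are illustrative names.

```python
def to_int32(value: int) -> int:
    """Wrap an integer to a 32-bit two's complement value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


INT_MAX = 2**31 - 1
INT_MIN = -2**31

# "If x > 0 then x + 1 > 0" fails at x = INT_MAX: the sum wraps to INT_MIN.
print(to_int32(INT_MAX + 1) > 0)   # False
# "If x < 0 then x * 2 < 0" fails at x = INT_MIN: the product wraps to 0.
print(to_int32(INT_MIN * 2) < 0)   # False
# "If x < 0 then -x > 0" fails at x = INT_MIN: negation wraps back to INT_MIN.
print(to_int32(-INT_MIN) > 0)      # False
```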
Character Storage
Character sets
- Standard ASCII (0 – 127)
- Extended ASCII (0 – 255)
- ANSI (0 – 255)
- Unicode (0 – 65,535)
Null-terminated String
- Array of characters followed by a null byte
Using the ASCII table
- back inside cover of book
Machine Words
Machine Has "Word Size"
- Nominal size of integer-valued data
    Including addresses
- Most current machines use 32-bit (4-byte) words
    Limits addresses to 4GB
    Users can access 3GB
    Becoming too small for memory-intensive applications
- High-end systems use 64-bit (8-byte) words
    Potential address space ≈ 1.8 × 10^19 bytes
    x86-64 machines support 48-bit addresses: 256 Terabytes
- Machines support multiple data formats
    Fractions or multiples of word size
    Always integral number of bytes
Word-Oriented Memory Organization
Addresses Specify Byte Locations
- Address of first byte in word
- Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)
Bytes Addr.    32-bit Words    64-bit Words
0000 - 0003    Addr = 0000     Addr = 0000
0004 - 0007    Addr = 0004
0008 - 0011    Addr = 0008     Addr = 0008
0012 - 0015    Addr = 0012
Data Representations
Sizes of C Objects (in Bytes)
C Data Type    Typical 32-bit    Intel IA32    x86-64
unsigned       4                 4             4
int            4                 4             4
long int       4                 4             8
char           1                 1             1
short          2                 2             2
float          4                 4             4
double         8                 8             8
char *         4                 4             8
(or any other pointer)
Byte Ordering
How should bytes within a multi-byte word be
ordered in memory?
Conventions
- Big Endian: Sun, PPC Mac
    Least significant byte has highest address
- Little Endian: x86
    Least significant byte has lowest address
Byte Ordering Example
Big Endian
- Least significant byte has highest address
Little Endian
- Least significant byte has lowest address
Example
- Variable x has 4-byte representation 0x01234567
- Address given by &x is 0x100
Address          0x100  0x101  0x102  0x103
Big Endian        01     23     45     67
Little Endian     67     45     23     01
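The two layouts can be reproduced with Python's built-in `int.to_bytes`, which lists the bytes in order of increasing address. This is only an illustration of the ordering, not of how any particular machine stores x; the variable names are assumptions.

```python
x = 0x01234567

big = x.to_bytes(4, byteorder="big")        # most significant byte first
little = x.to_bytes(4, byteorder="little")  # least significant byte first

print(" ".join(f"{b:02X}" for b in big))     # 01 23 45 67
print(" ".join(f"{b:02X}" for b in little))  # 67 45 23 01
```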
Representing Integers
int A = 15213;          Decimal: 15213
int B = -15213;         Binary: 0011 1011 0110 1101
long int C = 15213;     Hex: 3 B 6 D
Bytes in order of increasing address:
Value   Machine         Bytes
A       IA32, x86-64    6D 3B 00 00
A       Sun             00 00 3B 6D
B       IA32, x86-64    93 C4 FF FF
B       Sun             FF FF C4 93
C       IA32            6D 3B 00 00
C       x86-64          6D 3B 00 00 00 00 00 00
C       Sun             00 00 3B 6D
B uses two's complement representation
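The byte sequences for 15213 and -15213 can be checked with Python's `int.to_bytes`, using `signed=True` to get the two's complement encoding. The helper name `int_bytes` is illustrative, and "little" and "big" stand in for the x86 and Sun orderings respectively.

```python
def int_bytes(value: int, size: int, byteorder: str) -> str:
    """Two's complement bytes of value, listed in order of increasing address."""
    raw = value.to_bytes(size, byteorder, signed=True)
    return " ".join(f"{b:02X}" for b in raw)


print(int_bytes(15213, 4, "little"))   # 6D 3B 00 00
print(int_bytes(-15213, 4, "little"))  # 93 C4 FF FF
print(int_bytes(-15213, 4, "big"))     # FF FF C4 93
print(int_bytes(15213, 8, "little"))   # 6D 3B 00 00 00 00 00 00
```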
Representing Strings
char S[6] = "15213";
Strings in C
- Represented by array of characters
- Each character encoded in ASCII format
    Standard 7-bit encoding of character set
    Character "0" has code 0x30
    Digit i has code 0x30+i
- String should be null-terminated
    Final character = 0
Compatibility
- Byte ordering not an issue
Linux/Alpha S    Sun S
31               31
35               35
32               32
31               31
33               33
00               00
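The bytes of the string can be reproduced in Python by encoding the text as ASCII and appending the null terminator that C adds to a string literal. The variable names are illustrative.

```python
s = "15213"
encoded = s.encode("ascii") + b"\x00"   # C stores a terminating null byte

print(" ".join(f"{b:02X}" for b in encoded))  # 31 35 32 31 33 00
```

Each character is a single byte, so the sequence is the same on big-endian and little-endian machines.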
Boolean Operations
NOT
AND
OR
Operator Precedence
Truth Tables
Boolean Algebra
Based on symbolic logic, designed by George Boole
Boolean expressions created from:
- NOT, AND, OR
NOT
Inverts (reverses) a boolean value
Truth table for Boolean NOT operator:
X    ¬X
F    T
T    F
Digital gate diagram for NOT: [figure: NOT gate]
AND
Truth table for Boolean AND operator:
X    Y    X ∧ Y
F    F    F
F    T    F
T    F    F
T    T    T
Digital gate diagram for AND: [figure: AND gate]
OR
Truth table for Boolean OR operator:
X    Y    X ∨ Y
F    F    F
F    T    T
T    F    T
T    T    T
Digital gate diagram for OR: [figure: OR gate]
Operator Precedence
NOT > AND > OR
Examples showing the order of operations:
Use parentheses to avoid ambiguity
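Python's `not`, `and`, and `or` operators follow the same NOT > AND > OR ordering, so the rule can be checked by comparing an unparenthesized expression with its fully parenthesized form. The expression ¬X ∧ Y ∨ Z used here is an illustrative choice, as are the function names.

```python
from itertools import product


def implicit(x: bool, y: bool, z: bool) -> bool:
    return not x and y or z          # relies on operator precedence


def explicit(x: bool, y: bool, z: bool) -> bool:
    return ((not x) and y) or z      # NOT first, then AND, then OR


def other_grouping(x: bool, y: bool, z: bool) -> bool:
    return not (x and (y or z))      # a different reading of the same symbols


inputs = list(product([False, True], repeat=3))
print(all(implicit(*v) == explicit(*v) for v in inputs))        # True
print(all(implicit(*v) == other_grouping(*v) for v in inputs))  # False
```

The second result shows why parentheses matter: a different grouping yields a different function.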
Truth Tables (1 of 3)
A Boolean function has one or more Boolean
inputs, and returns a single Boolean output.
A truth table shows all the inputs and outputs
of a Boolean function
Example: ¬X ∨ Y
Truth Tables (2 of 3)
Example: X ∧ ¬Y
Truth Tables (3 of 3)
Example: (Y ∧ S) ∨ (X ∧ ¬S)
Two-input multiplexer: inputs X and Y, select input S, output Z
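The multiplexer expression can be tabulated over all eight input combinations. A minimal Python sketch, with the function name `mux` chosen for illustration:

```python
from itertools import product


def mux(x: bool, y: bool, s: bool) -> bool:
    """Z = (Y AND S) OR (X AND NOT S)."""
    return (y and s) or (x and not s)


print("X      Y      S      Z")
for x, y, s in product([False, True], repeat=3):
    print(f"{x!s:6} {y!s:6} {s!s:6} {mux(x, y, s)!s}")
```

The table shows that Z follows X when S is false and follows Y when S is true.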