CH7 - Instructions - Language of The Computer
Language of the
Computer
Computer Organization
502044
Acknowledgement
This slide show is intended for use in class, and is not a complete document.
Students need to refer to the book to read more lessons and exercises. Students
have the right to download and store lecture slides for reference purposes; do not
redistribute them or use them for purposes outside of the course.
[2]. David A. Patterson, John L. Hennessy, [2014], Computer Organization and Design: The
Hardware/Software Interface, 5th edition, Elsevier, Amsterdam.
[3]. John L. Hennessy, David A. Patterson, [2012], Computer Architecture: A Quantitative
Approach, 5th edition, Elsevier, Amsterdam.
📧 trantrungtin.tdtu.edu.vn
Syllabus
7.1 Introduction
7.2 Operations of the Computer Hardware
7.3 Operands of the Computer Hardware
7.6 Instructions for Making Decisions
7.7 Supporting Procedures in Computer Hardware
Instruction Set
■ The collection of instructions of a computer
■ Different computers have different instruction
sets
■ But with many aspects in common
■ Early computers had very simple instruction sets
■ Simplified implementation
■ Many modern computers also have simple instruction
sets
The MIPS Instruction Set
■ Used as the example throughout the course
■ Stanford MIPS commercialized by MIPS Technologies
(www.mips.com)
■ Large share of embedded core market
■ Applications in consumer electronics, network/storage equipment,
cameras, printers, …
■ Typical of many modern ISAs
■ See MIPS Reference Data tear-out card, and Appendices B and E
Arithmetic Operations
■ Add and subtract, three operands
■ Two sources and one destination
● add a,b,c # a gets b + c
■ All arithmetic operations have this form
■ Design Principle 1: Simplicity favors regularity
■ Regularity makes implementation simpler
■ Simplicity enables higher performance at lower cost
Arithmetic Example
■ C code:
● f = (g + h) - (i + j);
■ Compiled MIPS code:
add t0, g, h # temp t0 = g + h
add t1, i, j # temp t1 = i + j
sub f, t0, t1 # f = t0 - t1
Register Operands
● Arithmetic instructions use register operands
● MIPS has a 32 by 32-bit register file
○ Used for frequently accessed data
○ Numbered 0 to 31
○ 32-bit data called a “word”
● Assembler names
○ $t0, $t1, …, $t9 for temporary values
○ $s0, $s1, …, $s7 for saved variables
● Design Principle 2: Smaller is faster
○ c.f. main memory: millions of locations
Register Operand Example
■ C code:
f = (g + h) - (i + j);
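■ f, …, j in $s0, …, $s4
■ Compiled MIPS code:
add $t0, $s1, $s2 # $t0 = g + h
add $t1, $s3, $s4 # $t1 = i + j
sub $s0, $t0, $t1 # f = $t0 - $t1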
Memory Operands (1)
● Main memory used for composite data
○ Arrays, structures, dynamic data
● To apply arithmetic operations
○ Load values from memory into registers
○ Store result from register to memory
Memory Operands (2)
● Memory is byte addressed
○ Each address identifies an 8-bit byte
● Words are aligned in memory
○ Address must be a multiple of 4
● MIPS is Big Endian
○ Most-significant byte at least address of a word
○ c.f. Little Endian: least-significant byte at least address
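The byte-ordering rule can be made concrete with a small C sketch. The helper names `store_big_endian` and `store_little_endian` are hypothetical, chosen only for this illustration; index 0 of the array stands for the lowest address of the word.

```c
#include <stdint.h>

/* Big Endian: the most-significant byte goes to the lowest address (index 0). */
void store_big_endian(uint32_t word, uint8_t bytes[4]) {
    bytes[0] = (uint8_t)(word >> 24);
    bytes[1] = (uint8_t)(word >> 16);
    bytes[2] = (uint8_t)(word >> 8);
    bytes[3] = (uint8_t)word;
}

/* Little Endian, for comparison: the least-significant byte goes to the lowest address. */
void store_little_endian(uint32_t word, uint8_t bytes[4]) {
    bytes[0] = (uint8_t)word;
    bytes[1] = (uint8_t)(word >> 8);
    bytes[2] = (uint8_t)(word >> 16);
    bytes[3] = (uint8_t)(word >> 24);
}
```

For the word 0x12345678, the big-endian layout places 0x12 at the lowest address, while the little-endian layout places 0x78 there.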
Memory Operands (3)
● Data is transferred between memory and register using data transfer
instructions: lw and sw
Memory Operand Example (1)
■ C code:
g = h + A[8];
■ g in $s1, h in $s2, base address of A in $s3
■ Compiled MIPS code:
■ Index 8 requires offset of 32
■ 4 bytes per word
lw $t0, 32($s3) # load word (offset 32, base register $s3)
add $s1, $s2, $t0
Memory Operand Example (2)
■ C code:
A[12] = h + A[8];
■ h in $s2, base address of A in $s3
■ Compiled MIPS code:
■ Index 8 requires offset of 32
lw $t0, 32($s3) # load word
add $t0, $s2, $t0
sw $t0, 48($s3) # store word
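The offsets 32 and 48 in the code above follow from the 4-byte word size. A minimal C sketch of that arithmetic, assuming an array of 32-bit words; the names `word_offset` and `element_address` are hypothetical and not part of any MIPS toolchain:

```c
#include <stdint.h>

#define WORD_BYTES 4u  /* one MIPS word is 4 bytes */

/* Byte offset of element `index` from the base address of a word array. */
uint32_t word_offset(uint32_t index) {
    return index * WORD_BYTES;
}

/* Effective address formed by lw/sw: base register value plus byte offset. */
uint32_t element_address(uint32_t base, uint32_t index) {
    return base + word_offset(index);
}
```

With this, `word_offset(8)` gives the 32 used by `lw` and `word_offset(12)` gives the 48 used by `sw`.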
Registers vs. Memory
● Registers are faster to access than memory
● Operating on memory data requires loads and stores
○ More instructions to be executed
● Compiler must use registers for variables as much as possible
○ Only spill to memory for less frequently used variables
○ Register optimization is important!
Immediate Operands
■ Constant data specified in an instruction
● addi $s3, $s3, 4
Translation and Startup
[Figure: translation hierarchy. Annotations: "Many compilers produce object modules directly"; "Static linking".]
● UNIX: C source files are named x.c, assembly files are x.s, object files are x.o, statically linked library routines are
x.a, dynamically linked library routines are x.so, and executable files are called a.out by default.
● MS-DOS uses the suffixes .C, .ASM, .OBJ, .LIB, .DLL, and .EXE to the same effect.
Translation
■ Assembler (or compiler) translates program into machine instructions
SPIM Simulator
● SPIM is a software simulator that runs assembly language programs
● SPIM is just MIPS spelled backwards
● SPIM can read and immediately execute assembly language files
● Two versions for different machines
○ Unix: xspim (used in lab), spim
○ PC/Mac: QtSpim
● Resources and Download
○ http://spimsimulator.sourceforge.net
System Calls in SPIM
● SPIM provides a small set of system-like services through the system call
(syscall) instruction.
● Format for system calls
○ Place value of input argument in $a0
○ Place value of system-call-code in $v0
○ Execute the syscall instruction
System Calls
.data
str:
.asciiz "answer is:"
.text
addi $v0, $zero, 4 # system call code 4 (print_string in SPIM)
la $a0, str # address of the string to print
syscall # print the string
Assembler Pseudoinstructions
■ Most assembler instructions represent machine instructions one-to-one
■ Pseudoinstructions: figments of the assembler’s imagination
move $t0, $t1 → add $t0, $zero, $t1
blt $t0, $t1, L → slt $at, $t0, $t1
bne $at, $zero, L
■ $at (register 1): assembler temporary
Assembler Pseudoinstructions (2)
● Pseudoinstructions give MIPS a richer set of assembly language
instructions than those implemented by the hardware.
● Register $at (assembler temporary) is reserved for use by the assembler.
● For productivity, use pseudoinstructions to write assembly programs.
● For performance, use real MIPS instructions.
Reading
■ Read Appendix A.9 for SPIM
■ List of Pseudoinstructions can be found on page 235
Producing an Object Module
● Assembler (or compiler) translates program into machine instructions
● Provides information for building a complete program from the pieces
○ Header: contains size and position of pieces of object module
○ Text segment: translated machine instructions
○ Static data segment: data allocated for the life of the program
○ Relocation info: for instructions and data words that depend on absolute location of
loaded program
○ Symbol table: global definitions and external refs
○ Debug info: for associating with source code
Linking Object Modules
● Produces an executable file
○ Merges segments
○ Resolves labels (determine their addresses)
○ Patches location-dependent and external refs
● Could leave location dependencies for fixing by a relocating loader
● But with virtual memory, no need to do this
● Program can be loaded into absolute location in virtual memory space
Loading a Program
● Load from file on disk into memory
○ Read header to determine segment sizes
○ Create address space for text and data
○ Copy text and initialized data into memory
○ Set up arguments on stack
○ Initialize registers (including $sp, $fp, $gp)
○ Jump to startup routine
■ Copies arguments to $a0, … and calls main
■ When main returns, do exit syscall
Dynamic Linking
● Only link/load library procedure when it is called
○ Requires procedure code to be relocatable
○ Avoids the image bloat caused by static linking of all (transitively) referenced libraries
○ Automatically picks up new library versions
Starting Java Applications
[Figure: starting a Java application. Annotations: "Simple portable instruction set for the JVM"; "Interprets bytecodes"; "Compiles bytecodes of 'hot' methods into native code for host machine".]
An Example MIPS Program
# Functional Description: Find the sum of the integers from 1 to N where
# N is a value input from the keyboard.
#########################################################
# Register Usage: $t0 is used to accumulate the sum
#                 $v0 the loop counter, counts down to zero
##########################################################
# Algorithmic Description in Pseudocode:
# main: v0 << value read from the keyboard (syscall 4)
#       if (v0 <= 0) stop
#       t0 = 0; # t0 is used to accumulate the sum
#       While (v0 > 0) { t0 = t0 + v0; v0 = v0 - 1 }
#       Output to monitor syscall(1) << t0; goto main
##########################################################
.data
prompt: .asciiz "\n\n Please Input a value for N = "
Four Important Number Systems
System   Why?                   Remarks
Decimal  Base 10 (10 fingers)   Most used system
…        …                      … working with binary
Hex      Base 16                4 times fewer digits than …
§2.5 Representing Instructions in the Computer
Representing Instructions
● Instructions are encoded in binary
○ Called machine code
● MIPS instructions
○ Encoded as 32-bit instruction words
○ Small number of formats encoding operation code (opcode), register numbers, …
○ Regularity!
● Register numbers
○ $t0 – $t7 are reg’s 8 – 15
○ $t8 – $t9 are reg’s 24 – 25
○ $s0 – $s7 are reg’s 16 – 23
MIPS R-format Instructions
op      rs      rt      rd      shamt   funct
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
■ Instruction fields
■ op: operation code (opcode)
■ rs: first source register number
■ rt: second source register number
■ rd: destination register number
■ shamt: shift amount (00000 for now)
■ funct: function code (extends opcode)
R-format Example
op      rs      rt      rd      shamt   funct
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
0       17      18      8       0       32
000000  10001   10010   01000   00000   100000
00000010001100100100000000100000₂ = 02324020₁₆
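The packing of the six R-format fields into one 32-bit word can be sketched in C. The function name `encode_r` is hypothetical; the field widths are those of the R-format layout above, and the field values 0, 17, 18, 8, 0, 32 come from the example (by the register numbers listed earlier, rs = $s1, rt = $s2, rd = $t0).

```c
#include <stdint.h>

/* Pack R-format fields: op(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6). */
uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct) {
    return ((op    & 0x3Fu) << 26) |
           ((rs    & 0x1Fu) << 21) |
           ((rt    & 0x1Fu) << 16) |
           ((rd    & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) << 6)  |
            (funct & 0x3Fu);
}
```

Calling `encode_r(0, 17, 18, 8, 0, 32)` reproduces the word 0x02324020 shown in the example.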
Hexadecimal
■ Base 16
■ Compact representation of bit
strings
■ 4 bits per hex digit
0 0000 4 0100 8 1000 c 1100
1 0001 5 0101 9 1001 d 1101
2 0010 6 0110 a 1010 e 1110
3 0011 7 0111 b 1011 f 1111
■ I-format fields (op, rs, rt, 16-bit constant or address):
op      rs      rt      constant or address
6 bits  5 bits  5 bits  16 bits
35      19      8       32
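A matching C sketch for the I-format layout, under the assumption that the four values above are op, rs, rt, and a 16-bit immediate (they agree with the earlier `lw $t0, 32($s3)`, since $s3 is register 19 and $t0 is register 8). The name `encode_i` is hypothetical.

```c
#include <stdint.h>

/* Pack I-format fields: op(6) | rs(5) | rt(5) | immediate(16).
   The immediate is truncated to its low 16 bits, so negative
   constants are stored in two's complement. */
uint32_t encode_i(uint32_t op, uint32_t rs, uint32_t rt, int32_t imm) {
    return ((op & 0x3Fu) << 26) |
           ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) |
           ((uint32_t)imm & 0xFFFFu);
}
```

`encode_i(35, 19, 8, 32)` yields 0x8E680020, and a negative immediate such as the -4 in `addi $sp, $sp, -4` (op 8, register 29) yields 0x23BDFFFC.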
OR Operations
■ Useful to include bits in a word
■ Set some bits to 1, leave others
unchanged
or $t0, $t1, $t2
$t2  0000 0000 0000 0000 0000 1101 1100 0000
$t1  0000 0000 0000 0000 0011 1100 0000 0000
$t0  0000 0000 0000 0000 0011 1101 1100 0000
NOT Operations
■ Useful to invert bits in a word
■ Change 0 to 1, and 1 to 0
■ MIPS has NOR 3-operand instruction
■ a NOR b == NOT ( a OR b )
nor $t0, $t1, $zero # register 0 ($zero) is always read as zero
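Why NOR with $zero acts as NOT can be checked with a short C sketch. The names `mips_nor` and `mips_not` are hypothetical; they model the operation on 32-bit values, not the hardware.

```c
#include <stdint.h>

/* a NOR b == NOT (a OR b), on 32-bit words. */
uint32_t mips_nor(uint32_t a, uint32_t b) {
    return ~(a | b);
}

/* NOT a, obtained the MIPS way: NOR with a zero operand, since a OR 0 == a. */
uint32_t mips_not(uint32_t a) {
    return mips_nor(a, 0u);
}
```

For example, `mips_not(0x00003C00)` gives 0xFFFFC3FF: every bit of the input is inverted.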
Conditional Operations
■ Branch to a labeled instruction if
a condition is true
■ Otherwise, continue sequentially
■ beq rs, rt, L1
■ if (rs == rt) branch to instruction labeled L1;
■ bne rs, rt, L1
■ if (rs != rt) branch to instruction labeled L1;
■ j L1
■ unconditional jump to instruction labeled
L1
Compiling If Statements
■ C code:
if (i==j) f = g+h; else f = g-h;
■ f, g, h in $s0, $s1, $s2; i, j in $s3, $s4
■ Compiled MIPS code:
bne $s3, $s4, Else
add $s0, $s1, $s2
j Exit
Else: sub $s0, $s1, $s2
Exit: …
■ Assembler calculates addresses
Compiling Loop Statements
■ C code:
while (save[i] == k) i += 1;
■ i in $s3, k in $s5, address of save in $s6
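■ Compiled MIPS code:
Loop: sll $t1, $s3, 2
add $t1, $t1, $s6
lw $t0, 0($t1)
bne $t0, $s5, Exit
addi $s3, $s3, 1
j Loop
Exit: …
■ A compiler identifies basic blocks for optimization
■ An advanced processor can accelerate execution of basic blocks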
More Conditional Operations
■ Set result to 1 if a condition is
true
■ Otherwise, set to 0
■ slt rd, rs, rt
■ if (rs < rt) rd = 1; else rd = 0;
■ slti rt, rs, constant
■ if (rs < constant) rt = 1; else rt = 0;
■ Use in combination with beq, bne
slt $t0, $s1, $s2 # if ($s1 < $s2)
bne $t0, $zero, L # branch to L
Branch Instruction Design
■ Why not blt, bge, etc?
■ Hardware for <, ≥, … slower than =, ≠
■ Combining with branch involves more
work per instruction, requiring a slower
clock
■ All instructions penalized!
■ beq and bne are the common case
■ This is a good design compromise
Signed vs. Unsigned
■ Signed comparison: slt, slti
■ Unsigned comparison: sltu, sltiu
■ Example
■ $s0 = 1111 1111 1111 1111 1111 1111 1111 1111
■ $s1 = 0000 0000 0000 0000 0000 0000 0000 0001
■ slt $t0, $s0, $s1 # signed
■ –1 < +1 ⇒ $t0 = 1
■ sltu $t0, $s0, $s1 # unsigned
■ +4,294,967,295 > +1 ⇒ $t0 = 0
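The same bit pattern compares differently depending on interpretation, which a C sketch can show. The functions `slt` and `sltu` here are hypothetical models of the two instructions; the cast to `int32_t` assumes the usual two's-complement conversion, which is implementation-defined in C but holds on mainstream compilers.

```c
#include <stdint.h>

/* Signed set-on-less-than: operands are interpreted as two's-complement.
   Assumption: converting uint32_t to int32_t wraps (true on common compilers). */
uint32_t slt(uint32_t rs, uint32_t rt) {
    return ((int32_t)rs < (int32_t)rt) ? 1u : 0u;
}

/* Unsigned set-on-less-than: operands are interpreted as plain magnitudes. */
uint32_t sltu(uint32_t rs, uint32_t rt) {
    return (rs < rt) ? 1u : 0u;
}
```

With rs = 0xFFFFFFFF and rt = 1, `slt` returns 1 (–1 < +1) and `sltu` returns 0 (4,294,967,295 > 1), matching the example.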
Procedure Calling
■ Procedure (function) performs a
specific task and returns results to
caller.
Procedure Calling
■ Calling program
■ Place parameters in registers $a0 - $a3
■ Called procedure
■ Acquire storage for procedure, save values
of required register(s) on stack $sp
■ Perform procedure’s operations, restore
the values of registers that it used
■ Place result in register for caller $v0 - $v1
■ Return to place of call by returning to
instruction whose address is saved in
$ra
Register Usage
■ $a0 – $a3: arguments (reg’s 4 – 7)
■ $v0, $v1: result values (reg’s 2 and 3)
■ $t0 – $t9: temporaries
■ Can be overwritten by callee
■ $s0 – $s7: saved
■ Must be saved/restored by callee
■ $gp: global pointer for static data (reg 28)
■ $sp: stack pointer for dynamic data (reg
29)
■ $fp: frame pointer (reg 30)
■ $ra: return address (reg 31)
Procedure Call Instructions
■ Procedure call: jump and link
jal ProcedureLabel
■ Address of following instruction put in
$ra
■ Jumps to target address
■ Result in $v0
Leaf Procedure Example (2)
■ MIPS code:
leaf_example:
addi $sp, $sp, -4 # Save $s0 on stack
main:
…
jal leaf_example
…
Non-Leaf Procedures
■ Procedures that call other procedures
■ For nested call, caller needs to save on
the stack:
■ Its return address
■ Any arguments and temporaries needed
after the call
■ Restore from the stack after the call
Non-Leaf Procedure Example (2)
■ C code:
int fact (int n)
{
if (n < 1) return 1;
else return n * fact(n - 1);
}
■ Argument n in $a0
■ Result in $v0
Non-Leaf Procedure Example (3)
■ MIPS code:
fact:
addi $sp, $sp, -8 # adjust stack for 2 items
sw $ra, 4($sp) # save return address
sw $a0, 0($sp) # save argument
slti $t0, $a0, 1 # test for n < 1
beq $t0, $zero, L1
addi $v0, $zero, 1 # if so, result is 1
addi $sp, $sp, 8 # pop 2 items from stack
jr $ra # and return
L1: addi $a0, $a0, -1 # else decrement n
Local Data on the Stack
Character Data
■ Byte-encoded character sets
■ ASCII: (7-bit) 128 characters
■ 95 graphic, 33 control
■ Latin-1: (8-bit) 256 characters
■ ASCII, +96 more graphic characters
ASCII Characters
■ American Standard Code for Information Interchange (ASCII).
■ Most computers use 8 bits to represent each character. (Java uses
Unicode, which is 16-bit.)
■ Signs are combinations of characters.
■ How to load a byte?
■ lb, lbu, sb for byte (ASCII)
■ lh, lhu, sh for half-word instruction (Unicode)
Byte/Halfword Operations
■ Could use bitwise operations
■ MIPS byte/halfword load/store
■ String processing is a common case
● lb rt, offset(rs)    lh rt, offset(rs)
■ Sign extend to 32 bits in rt
● lbu rt, offset(rs)   lhu rt, offset(rs)
■ Zero extend to 32 bits in rt
● sb rt, offset(rs)    sh rt, offset(rs)
■ Store just rightmost byte/halfword
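The difference between sign extension (lb) and zero extension (lbu) can be sketched in C. The function names are hypothetical, and the sketch models only the value written to the 32-bit register for a given byte.

```c
#include <stdint.h>

/* lb-style load: replicate bit 7 of the byte into the upper 24 bits.
   XOR-then-subtract performs the sign extension without relying on
   implementation-defined narrowing casts. */
int32_t load_byte_signed(uint8_t byte) {
    return (int32_t)(byte ^ 0x80) - 0x80;
}

/* lbu-style load: the upper 24 bits are filled with zeros. */
uint32_t load_byte_unsigned(uint8_t byte) {
    return (uint32_t)byte;
}
```

The byte 0x80 loads as -128 with the signed variant and as 128 with the unsigned one; bytes below 0x80, such as ASCII characters, load identically either way.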
String Copy Example
■ C code:
■ Null-terminated string
void strcpy (char x[], char y[])
{ int i;
  i = 0;
  while ((x[i]=y[i])!='\0')
    i += 1;
}
■ Addresses of x, y in $a0, $a1
■ i in $s0
String Copy Example
■ MIPS code:
strcpy:
addi $sp, $sp, -4 # adjust stack for 1 item
sw $s0, 0($sp) # save $s0
add $s0, $zero, $zero # i=0
L1: add $t1, $s0, $a1 # addr of y[i] in $t1
lbu $t2, 0($t1) # $t2 = y[i]
add $t3, $s0, $a0 # addr of x[i] in $t3
sb $t2, 0($t3) # x[i] = y[i]
beq $t2, $zero, L2 # exit loop if y[i] == 0
addi $s0, $s0, 1 # i=i+1
j L1 # next iteration of loop
L2: lw $s0, 0($sp) # restore saved $s0
addi $sp, $sp, 4 # pop 1 item from stack
jr $ra # and return
32-bit Constants
■ Most constants are small
■ 16-bit immediate is sufficient
■ For the occasional 32-bit constant
lui rt, constant
■ Copies 16-bit constant to left 16 bits of rt
■ Clears right 16 bits of rt to 0
lui $s0, 61          0000 0000 0011 1101 0000 0000 0000 0000
ori $s0, $s0, 2304   0000 0000 0011 1101 0000 1001 0000 0000
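The two-step construction of a 32-bit constant can be modeled in C. The function names `lui`, `ori`, and `load_constant` are hypothetical models of the register results, using the constants 61 and 2304 from the example.

```c
#include <stdint.h>

/* lui: place the 16-bit constant in the upper half, clear the lower half. */
uint32_t lui(uint32_t imm16) {
    return (imm16 & 0xFFFFu) << 16;
}

/* ori: OR the zero-extended 16-bit constant into the register value. */
uint32_t ori(uint32_t rs, uint32_t imm16) {
    return rs | (imm16 & 0xFFFFu);
}

/* The two-instruction sequence: lui for the upper half, ori for the lower half. */
uint32_t load_constant(uint32_t upper16, uint32_t lower16) {
    return ori(lui(upper16), lower16);
}
```

`load_constant(61, 2304)` produces 0x003D0900, which is 4,000,000 in decimal.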
Branch Addressing
■ Branch instructions specify
■ Opcode, two registers, target address
■ Most branch targets are near branch
■ Forward or backward
op      rs      rt      constant or address
6 bits  5 bits  5 bits  16 bits
■ PC-relative addressing
■ Target address = PC + offset × 4
■ PC already incremented by 4 by this time
Jump Addressing
■ Jump (j and jal) targets could be
anywhere in text segment
■ Encode full address in instruction
op      address
6 bits  26 bits
Target Addressing Example
■ Loop code from earlier
example
■ Assume Loop at location 80000
Loop: sll  $t1, $s3, 2      80000    0    0   19    9    4    0
      add  $t1, $t1, $s6    80004    0    9   22    9    0   32
      lw   $t0, 0($t1)      80008   35    9    8         0
      bne  $t0, $s5, Exit   80012    5    8   21         2
      addi $s3, $s3, 1      80016    8   19   19         1
      j    Loop             80020    2            20000
Exit: …                     80024
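The address fields in the table (2 for bne, 20000 for j) can be checked with a C sketch. `branch_target` follows the PC-relative rule given earlier (PC already incremented by 4). `jump_target` assumes the standard MIPS rule that the 26-bit field is multiplied by 4 and combined with the upper 4 bits of the incremented PC; that rule is not spelled out in these slides. Both function names are hypothetical.

```c
#include <stdint.h>

/* PC-relative branch: target = (address of branch + 4) + offset * 4.
   The offset is a signed count of instructions, so it may be negative. */
uint32_t branch_target(uint32_t branch_addr, int16_t offset) {
    return branch_addr + 4u + (uint32_t)((int32_t)offset * 4);
}

/* Jump: target = upper 4 bits of (address of jump + 4), then address field * 4.
   Assumption: standard MIPS pseudo-direct addressing. */
uint32_t jump_target(uint32_t jump_addr, uint32_t addr26) {
    uint32_t pc_next = jump_addr + 4u;
    return (pc_next & 0xF0000000u) | ((addr26 & 0x03FFFFFFu) << 2);
}
```

The bne at 80012 with offset 2 reaches 80024 (Exit), and the j at 80020 with address field 20000 reaches 80000 (Loop).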
Exercise
■ Translate the 6 instructions into machine code: binary and
hexadecimal
Branching Far Away
■ If branch target is too far to encode with
16-bit offset, assembler rewrites the
code
■ Example
beq $s0,$s1, L1
is rewritten as
bne $s0,$s1, L2
j L1
L2: …
Addressing Mode Summary
Synchronization (Parallelism)
■ Two processors sharing an area of memory
■ P1 writes, then P2 reads
■ Data race if P1 and P2 don’t synchronize
■ Result depends on order of accesses
■ Hardware support required
■ Atomic read/write memory operation
■ No other access to the location allowed between
the read and write
■ Could be a single instruction
■ E.g., atomic swap of register ↔ memory
■ Or an atomic pair of instructions
Synchronization in MIPS
■ Load linked: ll rt, offset(rs)
■ Store conditional: sc rt, offset(rs)
■ Succeeds if location not changed since the ll
■ Returns 1 in rt
■ Fails if location is changed
■ Returns 0 in rt
■ Example: atomic swap (to test/set lock variable)
try: add $t0,$zero,$s4 ;copy exchange value
ll $t1,0($s1) ;load linked
sc $t0,0($s1) ;store conditional
beq $t0,$zero,try ;branch store fails
add $s4,$zero,$t1 ;put load value in $s4
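The retry structure of the ll/sc loop has a rough analogue in portable C11. This is a sketch under stated assumptions, not a translation: it uses `atomic_compare_exchange_weak` from `<stdatomic.h>`, which fails when the stored value differs from the expected one, whereas sc fails on any intervening write to the location. The name `atomic_swap` is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically replace *location with new_value and return the old value.
   The loop mirrors the "try:" label: read, attempt the store, retry on failure. */
uint32_t atomic_swap(_Atomic uint32_t *location, uint32_t new_value) {
    uint32_t old = atomic_load(location);              /* plays the role of ll */
    while (!atomic_compare_exchange_weak(location, &old, new_value)) {
        /* Store failed: `old` now holds the current value, so try again. */
    }
    return old;                                        /* the loaded value */
}
```

Used on a lock variable, swapping in 1 and getting back 0 means the lock was free and is now held; getting back 1 means another processor already holds it.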
C Sort Example
■ Illustrates use of assembly instructions for a C bubble sort function
■ Swap procedure (leaf)
void swap(int v[], int k)
{
  int temp;
  temp = v[k];
  v[k] = v[k+1];
  v[k+1] = temp;
}
■ v in $a0, k in $a1, temp in $t0
The Procedure Swap
swap: sll $t1, $a1, 2 # $t1 = k * 4
add $t1, $a0, $t1 # $t1 = v+(k*4)
# (address of v[k])
lw $t0, 0($t1) # $t0 (temp) = v[k]
lw $t2, 4($t1) # $t2 = v[k+1]
sw $t2, 0($t1) # v[k] = $t2 (v[k+1])
sw $t0, 4($t1) # v[k+1] = $t0 (temp)
jr $ra # return to calling routine
Concluding Remarks
■ Design principles
1. Simplicity favors regularity
2. Smaller is faster
3. Make the common case fast
4. Good design demands good compromises
■ Layers of software/hardware
■ Compiler, assembler, hardware
■ MIPS: typical of RISC ISAs
■ c.f. x86