Computer Architecture Lec 10
CS 283
Instructor: Sameen Fatima
Lecture 10
MIPS Arithmetic
All MIPS arithmetic instructions have 3 operands
Operand order is fixed (e.g., destination first)
Example: add $t0, $s1, $s2 (computes $t0 = $s1 + $s2; the destination $t0 comes first)
[Figure: components of a computer: Processor (Control and Datapath), Memory, and I/O (Input and Output)]
Memory Organization
Viewed as a large single-dimension array with access by
address
A memory address is an index into the memory array
Byte addressing means that the index points to a byte of
memory, and that the unit of memory accessed by a load/store
is a byte
0 8 bits of data
1 8 bits of data
2 8 bits of data
3 8 bits of data
4 8 bits of data
5 8 bits of data
6 8 bits of data
...
Memory Organization
Bytes are load/store units, but most data items use larger words
For MIPS, a word is 32 bits or 4 bytes.
Registers correspondingly hold 32 bits of data
0 32 bits of data
4 32 bits of data
8 32 bits of data
12 32 bits of data
...
2^32 bytes with byte addresses from 0 to 2^32 - 1
2^30 words with byte addresses 0, 4, 8, ..., 2^32 - 4
i.e., words are aligned
What are the 2 least significant bits of a word address?
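The alignment rule can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `is_word_aligned` and `word_index` are hypothetical and are not part of MIPS or SPIM.

```python
WORD_BYTES = 4  # a MIPS word is 32 bits = 4 bytes


def is_word_aligned(byte_address: int) -> bool:
    """A word address is aligned when its two least significant bits are 00."""
    return (byte_address & 0b11) == 0


def word_index(byte_address: int) -> int:
    """Index of the word that contains this byte (byte address divided by 4)."""
    return byte_address >> 2


# The first few word addresses, as in the table above: 0, 4, 8, 12
word_addresses = [i * WORD_BYTES for i in range(4)]
```

Every address in `word_addresses` is a multiple of 4, so its two low-order bits are always zero.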
Load/Store Instructions
Load and store instructions
Example:
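swap:
    sll $2, $5, 2      # $2 = index * 4 (byte offset of the element)
    add $2, $4, $2     # $2 = base address + byte offset
    lw  $15, 0($2)     # $15 = first word
    lw  $16, 4($2)     # $16 = next word
    sw  $16, 0($2)     # store the words back in swapped order
    sw  $15, 4($2)
    jr  $31            # return to caller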
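To make the data movement in the swap routine concrete, here is a small Python model of it. It is a sketch under stated assumptions: memory is modelled as a dictionary from byte addresses to words, `$4` is taken to hold the array base address and `$5` the element index, and the array contents and the address 0x1000 are made up for illustration.

```python
def swap(memory: dict, v: int, k: int) -> None:
    """Exchange the words v[k] and v[k+1], mirroring the MIPS routine.

    memory maps byte addresses to 32-bit words; v is the base byte
    address of the array ($4) and k is the element index ($5).
    """
    addr = v + k * 4            # index scaled by 4, then added to the base
    first = memory[addr]        # lw $15, 0($2)
    second = memory[addr + 4]   # lw $16, 4($2)
    memory[addr] = second       # sw $16, 0($2)
    memory[addr + 4] = first    # sw $15, 4($2)


# Hypothetical array of three words starting at byte address 0x1000
mem = {0x1000: 10, 0x1004: 20, 0x1008: 30}
swap(mem, 0x1000, 1)
```

Note that the index is scaled by 4 because memory is byte addressed while the array elements are words.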
So far we’ve learned:
MIPS
loading words but addressing bytes
arithmetic on registers only
Instruction Meaning
Instructions, like registers and words of data, are also 32 bits long
Example: add $t0, $s1, $s2
registers are numbered, e.g., $t0 is 8, $s1 is 17, $s2 is 18
I-format field widths: op (6 bits), rs (5 bits), rt (5 bits), offset (16 bits)
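The way register numbers and field widths combine into one 32-bit instruction word can be sketched in Python. The functions `encode_r` and `encode_i` are hypothetical helpers written for this illustration. The `add` example uses the register numbers given above; the `lw $t0, 32($s2)` example is an additional instruction chosen only to show the I-format.

```python
def encode_r(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack R-format fields: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op: int, rs: int, rt: int, imm: int) -> int:
    """Pack I-format fields: op(6) rs(5) rt(5) immediate/offset(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


T0, S1, S2 = 8, 17, 18  # register numbers of $t0, $s1, $s2

# add $t0, $s1, $s2: op = 0, funct = 32; the destination $t0 goes in the rd field
add_word = encode_r(0, S1, S2, T0, 0, 0x20)

# lw $t0, 32($s2): op = 35; the base register goes in rs, the loaded register in rt
lw_word = encode_i(35, S2, T0, 32)
```

Printing `format(add_word, "032b")` shows the six fields laid out side by side in one 32-bit word.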
Stored Program Concept
Instructions are bit sequences, just like data
Programs are stored in memory
to be read or written just like data
Memory Organization:
Big/Little Endian Byte Order
Bytes within each memory word, shown from bit 31 (left) to bit 0 (right):
           Big-endian                     Little-endian
Word 0     Byte 0 Byte 1 Byte 2 Byte 3    Byte 3 Byte 2 Byte 1 Byte 0
Word 1     Byte 4 Byte 5 Byte 6 Byte 7    Byte 7 Byte 6 Byte 5 Byte 4
SPIM’s memory storage depends on that of the underlying
machine
Intel 80x86 processors are little-endian
Because SPIM always shows words from left to right, a “mental
adjustment” has to be made for little-endian memory, as on the Intel PCs
in our labs: start at the right of the first word and go left, then start
at the right of the next word and go left, …
Word placement in memory (from the .data area of code) and word
access (lw, sw) are the same in big and little endian
Byte placement and byte access (lb, lbu, sb) depend on big or
little endian because of the different numbering of bytes within a
word
Character placement in memory (from the .data area of code)
depends on big or little endian because it is equivalent to byte
placement after ASCII encoding
Run storeWords.asm from SPIM examples!!
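The byte-order difference described above can also be observed outside SPIM. The following Python sketch uses the standard `struct` module to lay out the same 32-bit word in both orders. The value 0x41424344 is an arbitrary example chosen because its bytes are the ASCII codes of 'A', 'B', 'C', 'D'.

```python
import struct

word = 0x41424344  # ASCII codes of 'A', 'B', 'C', 'D' packed into one word

# Bytes in increasing address order for each convention
big = struct.pack(">I", word)     # big-endian: most significant byte at the lowest address
little = struct.pack("<I", word)  # little-endian: least significant byte at the lowest address

# Reading the word back with the matching convention gives the same value,
# which is why word access behaves the same on both kinds of machine
same_word = struct.unpack("<I", little)[0]
```

Only byte-level access sees the difference: the byte at the lowest address is 'A' in the big-endian layout and 'D' in the little-endian one.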
Control: Conditional Branch
Decision making instructions
alter the control flow,
i.e., change the next instruction to be executed
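Format: I op rs rt 16 bit offset
Problem: a 16 bit field used directly as the address could only reach 2^16 memory locations
Solution: specify a register (as for lw and sw) and add the offset to it
Use the PC (= program counter) as that register; this is called PC-relative
addressing and is based on the principle of locality: most branches are to
instructions near the current instruction (e.g., loops and if statements)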
Addresses in Branch
Further extend the reach of a branch by observing that all MIPS
instructions are one word (= 4 bytes) long, so the offset can count
words rather than bytes (word-relative addressing):
MIPS branch destination address = (PC + 4) + (4 * offset)
The base is PC + 4 because hardware typically increments the PC early
in the execute cycle so that it points to the next instruction
Jump format: J op 26 bit address
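The branch destination formula can be tried out with a short Python sketch. The function names and the example PC value 0x00400000 are assumptions made for illustration. The sketch treats the 16-bit offset as a signed number, which is what allows backward branches such as loops.

```python
def sign_extend_16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc: int, offset16: int) -> int:
    """Branch destination = (PC + 4) + (4 * offset), kept to 32 bits."""
    return (pc + 4 + 4 * sign_extend_16(offset16)) & 0xFFFFFFFF


forward = branch_target(0x00400000, 25)       # 25 words past the next instruction
backward = branch_target(0x00400010, 0xFFFF)  # offset -1: branch back to itself
```

With an offset of 25 the target lies 100 bytes beyond PC + 4, and an offset of -1 lands on the branch instruction itself.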
MIPS Instructions:
addi $29, $29, 4
slti $8, $18, 10
andi $29, $29, 6
ori $29, $29, 4
op rs rt 16 bit number
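How about larger constants?
First we need to load a 32 bit constant into a register
Must use two instructions for this: first the new load upper immediate
(lui) instruction for the upper 16 bits
    lui $t0, 1010101010101010      (lower 16 bits are filled with zeros)
    $t0 = 1010101010101010 0000000000000000
Then set the lower 16 bits with ori
    ori $t0, $t0, 1010101010101010
         1010101010101010 0000000000000000    ($t0 after lui)
    ori  0000000000000000 1010101010101010    (immediate)
      =  1010101010101010 1010101010101010    ($t0 after ori)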
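The two-instruction sequence for building a 32 bit constant can be modelled in Python. This is a minimal sketch: `lui`, `ori`, and `load_constant` are illustrative functions that mimic what the instructions do to a register value, and the constant 0x12345678 is an arbitrary extra example.

```python
MASK32 = 0xFFFFFFFF


def lui(imm16: int) -> int:
    """Load upper immediate: imm16 goes in the upper 16 bits, lower 16 bits are zero."""
    return ((imm16 & 0xFFFF) << 16) & MASK32


def ori(reg: int, imm16: int) -> int:
    """OR immediate: the 16-bit immediate is zero-extended, then ORed with reg."""
    return (reg | (imm16 & 0xFFFF)) & MASK32


def load_constant(value: int) -> int:
    """Build any 32-bit constant from its upper and lower halves."""
    upper = (value >> 16) & 0xFFFF
    lower = value & 0xFFFF
    return ori(lui(upper), lower)


# The bit pattern used on the slide
after_lui = lui(0b1010101010101010)
t0 = ori(after_lui, 0b1010101010101010)
```

Because `lui` leaves the lower half zero, the following `ori` fills in the low 16 bits without disturbing the upper half.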
Formats:
R op rs rt rd shamt funct
I op rs rt 16 bit address
J op 26 bit address
Control Flow
We have: beq, bne. What about branch-if-less-than?
New instruction: slt $t0, $s1, $s2
    if $s1 < $s2 then $t0 = 1
    else $t0 = 0
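A short Python sketch shows what slt computes and how it can be combined with bne to answer the branch-if-less-than question. The helper names are hypothetical. Register contents are modelled as 32-bit patterns that slt compares as signed numbers.

```python
def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a signed two's-complement number."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs: int, rt: int) -> int:
    """Set on less than: 1 if rs < rt (signed comparison), else 0."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0


def branch_if_less_than(rs: int, rt: int) -> bool:
    """Model of slt followed by bne against $zero: taken when slt produced 1."""
    return slt(rs, rt) != 0
```

For example, `slt(0xFFFFFFFF, 0)` yields 1 because the bit pattern 0xFFFFFFFF represents -1 when read as a signed word.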
Procedures
C code:
    int add10(int i)
    { return (i + 10); }
Translated MIPS assembly
Note: more efficient use of registers is possible!
        .text
        .globl main
main:
        addi $s0, $0, 5
        add  $a0, $s0, $0       # argument to callee
        jal  add10              # jump and link
        add  $s1, $v0, $0       # control returns here; result to caller
        add  $s0, $s1, $0
        li   $v0, 10            # system code & call to exit
        syscall
add10:
        addi $sp, $sp, -4       # save register in stack, see figure below
        sw   $s0, 0($sp)
        addi $s0, $a0, 10
        add  $v0, $s0, $0       # result to caller
        lw   $s0, 0($sp)        # restore values
        addi $sp, $sp, 4
        jr   $ra                # return
MEMORY (stack while add10 runs):
    High address
    $sp -> Content of $s0
    Low address
Run this code with PCSpim: procCallsProg1.asm
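The save and restore of $s0 around the body of add10 can be traced with a small Python model. This is a sketch, not a simulator: registers are a dictionary, the stack is a dictionary keyed by byte address, and the starting stack pointer 0x7fffeffc is just an example value.

```python
def add10(regs: dict, stack: dict) -> None:
    """Model of the add10 procedure: saves $s0, uses it, then restores it."""
    regs["sp"] -= 4                  # addi $sp, $sp, -4  (make room on the stack)
    stack[regs["sp"]] = regs["s0"]   # sw   $s0, 0($sp)   (save the caller's $s0)
    regs["s0"] = regs["a0"] + 10     # addi $s0, $a0, 10
    regs["v0"] = regs["s0"]          # add  $v0, $s0, $0  (result to caller)
    regs["s0"] = stack[regs["sp"]]   # lw   $s0, 0($sp)   (restore the caller's $s0)
    regs["sp"] += 4                  # addi $sp, $sp, 4   (release the stack space)


# State set up the way main does it: $s0 = 5, argument copied into $a0
regs = {"sp": 0x7FFFEFFC, "s0": 5, "a0": 5, "v0": 0}
stack = {}
add10(regs, stack)
```

After the call, `$v0` holds the result while `$s0` and `$sp` have the values they had before it, which is the point of saving the register on the stack.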
MIPS: Software Conventions for Registers
Number   Name    Use
0        zero    constant 0
1        at      reserved for assembler
2-3      v0-v1   results from callee, returned to caller
4-7      a0-a3   arguments to callee from caller: caller saves
8-15     t0-t7   temporary: caller saves (callee can clobber)
16-23    s0-s7   callee saves (callee must restore them before returning)
24-25    t8-t9   temporary (cont’d)
26-27    k0-k1   reserved for OS kernel
28       gp      pointer to global area
29       sp      stack pointer
30       fp      frame pointer
31       ra      return address (HW): caller saves
Procedures (recursive)
Example C code – recursive factorial subroutine:
int fact(int n);

int main()
{ int i, j;
  i = 4;
  j = fact(i);
  return 0; }

int fact(int n)
{ if (n < 1) return (1);
  else return ( n*fact(n-1) ); }
Procedures (recursive)
Translated MIPS assembly:
.text
.globl main
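[Figure: stack layout (a) before, (b) during, and (c) after a procedure call. During the call the frame holds, from high to low address: saved argument registers (if any) and saved saved registers (if any), with $fp at the top of the frame and $sp at the bottom (low address).]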
Variables that are local to a procedure but do not fit into registers (e.g., local arrays,
structures, etc.) are also stored in the stack. This area of the stack is the frame. The frame
pointer $fp points to the top of the frame and the stack pointer to the bottom. The frame
pointer does not change during procedure execution, unlike the stack pointer, so it is a
stable base register from which to compute offsets to local variables.
Use of the frame pointer is optional. If there are no local variables to store in the stack it is
not efficient to use a frame pointer.
Using a Frame Pointer
Example: procCallsProg1Modified.asm
This program shows code where it may be better to use $fp
Because the stack size is changing, the offset of variables stored in
the stack w.r.t. the stack pointer $sp changes as well. However, the
offset w.r.t. $fp would remain constant.
Why would this be better?
The compiler, when generating assembly, typically maintains a table
of program variables and their locations. If these locations are
offsets w.r.t. $sp, then every entry must be updated every time the
stack size changes!
Exercise:
Modify procCallsProg1Modified.asm to use a frame pointer
Observe that SPIM names register 30 as s8 rather than fp. Of
course, you can use it as fp, but make sure to initialize it with the
same value as sp, i.e., 0x7fffeffc.
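The difference between offsets relative to $sp and offsets relative to $fp can be seen in a small Python model. Everything here is an illustrative assumption: the `Frame` class, the variable names `a` and `b`, and the convention that each push moves the stack pointer down by one word.

```python
class Frame:
    """Tracks where variables live on a downward-growing stack."""

    def __init__(self, sp: int) -> None:
        self.sp = sp
        self.fp = sp            # $fp is initialized with the same value as $sp
        self.address_of = {}    # variable name -> byte address on the stack

    def push(self, name: str) -> None:
        """Allocate one word for a variable; only $sp moves."""
        self.sp -= 4
        self.address_of[name] = self.sp

    def offset_from_sp(self, name: str) -> int:
        return self.address_of[name] - self.sp

    def offset_from_fp(self, name: str) -> int:
        return self.address_of[name] - self.fp


frame = Frame(0x7FFFEFFC)
frame.push("a")
a_sp_before, a_fp_before = frame.offset_from_sp("a"), frame.offset_from_fp("a")
frame.push("b")  # the stack grows, so $sp moves again
a_sp_after, a_fp_after = frame.offset_from_sp("a"), frame.offset_from_fp("a")
```

After the second push the offset of `a` from `$sp` has changed, while its offset from `$fp` is the same as before, so a table of `$fp`-relative locations would not need updating.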
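MIPS Addressing Modes
1. Immediate addressing
   op rs rt Immediate
   (the operand is the Immediate field itself)
2. Register addressing
   op rs rt rd ... funct
   (the operand is in a register)
3. Base addressing
   op rs rt Address
   (the operand is in Memory at Register + Address)
4. PC-relative addressing
   op rs rt Address
   (the target Word is in Memory at PC + Address)
5. Pseudodirect addressing
   op Address
   (the target Word is in Memory at the Address field joined to the upper bits of the PC)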
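The three modes that produce a memory address can be written out as arithmetic. The Python below is a sketch with hypothetical function names and made-up example addresses. It assumes the 16-bit fields are signed and that the word-based fields are scaled by 4.

```python
MASK32 = 0xFFFFFFFF


def sign_extend_16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def base_address(reg_value: int, offset16: int) -> int:
    """Base addressing (lw, sw): register contents plus the signed byte offset."""
    return (reg_value + sign_extend_16(offset16)) & MASK32


def pc_relative(pc: int, offset16: int) -> int:
    """PC-relative addressing (beq, bne): PC + 4 plus the word offset times 4."""
    return (pc + 4 + 4 * sign_extend_16(offset16)) & MASK32


def pseudodirect(pc: int, addr26: int) -> int:
    """Pseudodirect addressing (j, jal): upper 4 bits of PC + 4 joined to addr26 * 4."""
    return ((pc + 4) & 0xF0000000) | ((addr26 & 0x3FFFFFF) << 2)
```

For instance, `base_address(0x10010000, 100)` gives 0x10010064, the address that an instruction such as `lw $s1, 100($s2)` would access if `$s2` held 0x10010000.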
Overview of MIPS
Simple instructions – all 32 bits wide
Very structured – no unnecessary baggage
Only three instruction formats
R op rs rt rd shamt funct
I op rs rt 16 bit address
J op 26 bit address
Rely on compiler to achieve performance
what are the compiler's goals?
Help compiler where we can
Summarize MIPS:
MIPS operands
Name: 32 registers
    Example: $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at
    Comments: Fast locations for data. In MIPS, data must be in registers to perform
    arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the
    assembler to handle large constants.
Name: 2^30 memory words
    Example: Memory[0], Memory[4], ..., Memory[4294967292]
    Comments: Accessed only by data transfer instructions. MIPS uses byte addresses, so
    sequential words differ by 4. Memory holds data structures, such as arrays, and
    spilled registers, such as those saved on procedure calls.
MIPS instructions
Category            Instruction               Example              Meaning                                   Comments
Arithmetic          subtract                  sub $s1, $s2, $s3    $s1 = $s2 - $s3                           Three operands; data in registers
Arithmetic          add immediate             addi $s1, $s2, 100   $s1 = $s2 + 100                           Used to add constants
Data transfer       load word                 lw $s1, 100($s2)     $s1 = Memory[$s2 + 100]                   Word from memory to register
Data transfer       store word                sw $s1, 100($s2)     Memory[$s2 + 100] = $s1                   Word from register to memory
Data transfer       load byte                 lb $s1, 100($s2)     $s1 = Memory[$s2 + 100]                   Byte from memory to register
Data transfer       store byte                sb $s1, 100($s2)     Memory[$s2 + 100] = $s1                   Byte from register to memory
Data transfer       load upper immediate      lui $s1, 100         $s1 = 100 * 2^16                          Loads constant in upper 16 bits
Conditional branch  branch on equal           beq $s1, $s2, 25     if ($s1 == $s2) go to PC + 4 + 100        Equal test; PC-relative branch
Conditional branch  branch on not equal       bne $s1, $s2, 25     if ($s1 != $s2) go to PC + 4 + 100        Not equal test; PC-relative
Conditional branch  set on less than          slt $s1, $s2, $s3    if ($s2 < $s3) $s1 = 1; else $s1 = 0      Compare less than; for beq, bne
Conditional branch  set less than immediate   slti $s1, $s2, 100   if ($s2 < 100) $s1 = 1; else $s1 = 0      Compare less than constant
Saving grace:
the most frequently used instructions are not too difficult to build
compilers avoid the portions of the architecture that are slow
“what the 80x86 lacks in style is made up in quantity, making it beautiful from
the right perspective”
Summary
Instruction complexity is only one variable
lower instruction count vs. higher CPI / lower clock rate
Design Principles:
simplicity favors regularity
smaller is faster
good design demands compromise
make the common case fast