CA04 2022S2 New
❑ Quantitative design
➢ How many GPRs & trade-off? 32
➢ What is the GPR width & trade-off? 32 bits
5. The role of compilers
RV32I storage model
❑ Program Counter
➢ 32-bit memory address of the current instruction
❑ General Purpose Register File
➢ 32 32-bit words named x0...x31 (x1, x2, ...)
➢ **Note** x0=0
❑ Memory
➢ Words M[0], M[1], M[2], ..., M[N-1] at word addresses 0, 4, 8, ...
➢ Byte addressable
➢ 32-bit address (4 GB)
➢ Originally little-endian
➢ Does not require word alignment
❖ Note: textbook uses the 64-bit variant RV64I
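The storage model above can be sketched as a small Python class. This is only an illustrative toy: the name `RV32IState`, its methods, and the 64-byte memory size are assumptions made for this example, not part of the ISA.

```python
MASK32 = 0xFFFFFFFF

class RV32IState:
    """Toy model of the RV32I programmer-visible state (illustrative only)."""

    def __init__(self, mem_bytes=64):
        self.pc = 0                      # 32-bit address of the current instruction
        self.regs = [0] * 32             # x0..x31, each 32 bits wide
        self.mem = bytearray(mem_bytes)  # byte-addressable memory

    def read_reg(self, n):
        return 0 if n == 0 else self.regs[n]   # x0 always reads as 0

    def write_reg(self, n, value):
        if n != 0:                       # writes to x0 are discarded
            self.regs[n] = value & MASK32

    def store_word(self, addr, value):
        # Little-endian: the least significant byte goes to the lowest address.
        self.mem[addr:addr + 4] = (value & MASK32).to_bytes(4, "little")

    def load_word(self, addr):
        return int.from_bytes(self.mem[addr:addr + 4], "little")


state = RV32IState()
state.write_reg(0, 123)             # ignored, x0 stays 0
state.write_reg(5, 0x1_0000_0001)   # truncated to 32 bits
state.store_word(4, 0xDEADBEEF)     # word M[1] lives at byte address 4
```

Word M[i] starts at byte address 4*i, which is why the word addresses in the diagram advance by 4.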
1. Data Storage
❖ Design principles #2, #3: “smaller is faster”, “make the common case fast”
❑ Architectural representative
➢ Interface between HLL and machine code
➢ Human-readable format of instructions
➢ Keep the format as similar as possible to that of the R-type: same bits =
same meaning (op, rd, funct3, rs1) → simplicity favors regularity.
➢ The constant ranges from -2^11 to 2^11 - 1, i.e. [-2048, 2047].
❑ Semantics:
➢ [rd] ← [rs1] op [sign-extend(imm)]
➢ Pseudo-instruction: mv (move) = addi instruction with zero immediate, i.e. mv rd, rs is addi rd, rs, 0
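A minimal Python sketch of these semantics for addi, assuming register values are held as unsigned 32-bit integers. The helper names `sign_extend` and `addi` are hypothetical and exist only for illustration.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value

def addi(rs1_value, imm12):
    """[rd] <- [rs1] + sign-extend(imm), wrapped to 32 bits."""
    return (rs1_value + sign_extend(imm12, 12)) & MASK32

print(hex(addi(10, 0xFFF)))   # 0xFFF is -1 as a 12-bit immediate -> 0x9
print(hex(addi(0, 0x800)))    # 0x800 is -2048 -> 0xfffff800
print(addi(42, 0))            # mv behaves like addi with a zero immediate -> 42
```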
❑ New perspective:
➢ View a register as 32 raw bits rather than as a single 32-bit number (as
arithmetic instructions do) → operate on individual bits or bytes within a word
➢ Share the same encoding with arithmetic instructions (R- & I- types).
Logical shift operations
❑ Semantics: move all the bits in a word to the left/right by a
number of positions; fill the emptied positions with zeroes.
❑ R-type encoding: shift amount is in (lower 5 bits of) a register
➢ sll (shift left logical): sll t0, t1, t2 # t0 = t1 << t2
➢ srl (shift right logical): srl t0, t1, t2 # t0 = t1 >> t2
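The two R-type shifts can be modelled in a few lines of Python, assuming 32-bit register values stored as non-negative integers. The function names mirror the mnemonics but are otherwise just illustrative helpers.

```python
MASK32 = 0xFFFFFFFF

def sll(rs1_value, rs2_value):
    """Shift left logical: only the lower 5 bits of rs2 give the shift amount."""
    shamt = rs2_value & 0x1F
    return (rs1_value << shamt) & MASK32   # bits shifted past bit 31 are lost

def srl(rs1_value, rs2_value):
    """Shift right logical: vacated upper bits are filled with zeroes."""
    shamt = rs2_value & 0x1F
    return (rs1_value & MASK32) >> shamt

print(sll(1, 4))                  # 16
print(sll(1, 33))                 # shift amount is 33 & 31 = 1 -> 2
print(hex(srl(0xFFFFFFE7, 4)))    # 0xffffffe
```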
x11 1111 1111 1111 1111 1111 1111 1110 0111 (-25 in decimal)
x10 1111 1111 1111 1111 1111 1111 1111 1110 (-2 in decimal)
Unfortunately, this is NOT the same as dividing by 2^4
▪ Fails for odd negative numbers
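The values above are consistent with an arithmetic right shift of x11 by 4 into x10 (for example srai x10, x11, 4; the instruction itself is not shown here, so that is an assumption). The Python sketch below compares such a shift with division that rounds toward zero, which is how C and the RISC-V div instruction round. The helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Reinterpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def sra(rs1_value, shamt):
    """Arithmetic right shift: vacated upper bits copy the sign bit."""
    # Python's >> on a negative int already shifts arithmetically.
    return (to_signed32(rs1_value) >> (shamt & 0x1F)) & MASK32

def div_trunc(a, b):
    """Integer division rounding toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

x11 = 0xFFFFFFE7                      # -25
print(to_signed32(sra(x11, 4)))       # -2  (rounds toward minus infinity)
print(div_trunc(-25, 16))             # -1  (rounds toward zero)
print(to_signed32(sra(0xFFFFFFE0, 4)), div_trunc(-32, 16))  # -2 -2, exact multiples agree
```

The shift rounds toward minus infinity, so the two results differ whenever a negative value is not an exact multiple of the divisor.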
Logical Operations: Bitwise AND
Notes:
The and instruction has an immediate version, andi
Logical Operations: Bitwise OR
Mnemonic: or ( bitwise OR )
Bitwise operation that places a 1 in the result if
either operand bit is 1
Example: or x9, x10, x11
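A short Python illustration of and, or, and andi on 32-bit values. The register contents below are sample values chosen for this sketch; they are not given in the slides.

```python
MASK32 = 0xFFFFFFFF

x10 = 0x00000DC0   # assumed sample value: ... 0000 1101 1100 0000
x11 = 0x00003C00   # assumed sample value: ... 0011 1100 0000 0000

x9_and = (x10 & x11) & MASK32    # and x9, x10, x11: 1 only where both bits are 1
x9_or = (x10 | x11) & MASK32     # or  x9, x10, x11: 1 where either bit is 1
low_byte = x10 & 0xFF            # andi x9, x10, 0xFF: mask off all but the low byte

print(hex(x9_and))    # 0xc00
print(hex(x9_or))     # 0x3dc0
print(hex(low_byte))  # 0xc0
```

and is typically used to clear or isolate bits (masking), while or is used to set bits.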
❑ Strange Fact 2:
➢ There is no nori, but there is xori in RISC-V
➢ Why?
Load instructions
❑ Encoding: I-type
12 bits     5 bits   3 bits    5 bits   7 bits
imm[11:0]   rs1      funct3    rd       op
                     010: lw            0000011
❑ Semantics
➢ lw rd, imm(rs1) #rd ← Mem[rs1+imm], load a word from Memory
➢ Address calculation is like addi: base (stored in rs1) + offset (stored in
instruction) → base addressing, a special case of displacement addressing.
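As a sketch of how the I-type fields pack into one 32-bit word, the Python below encodes a lw instruction using the funct3 and opcode values listed above. The function names `encode_i` and `encode_lw` are hypothetical helpers.

```python
def encode_i(imm, rs1, funct3, rd, opcode):
    """Pack I-type fields: imm[11:0] | rs1 | funct3 | rd | opcode."""
    imm12 = imm & 0xFFF                       # keep the 12-bit two's-complement pattern
    return (imm12 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

def encode_lw(rd, imm, rs1):
    """lw rd, imm(rs1): funct3 = 010, opcode = 0000011."""
    if not -2048 <= imm <= 2047:
        raise ValueError("offset does not fit in a signed 12-bit immediate")
    return encode_i(imm, rs1, 0b010, rd, 0b0000011)

word = encode_lw(rd=14, imm=8, rs1=2)         # lw x14, 8(x2)
print(f"{word:#010x}")                        # 0x00812703
```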
❑ Encoding: S-type
➢ Keep the rs1 and rs2 fields in the same place, as register names are more
critical than immediate bits in hardware design.
7 bits      5 bits   5 bits   3 bits    5 bits     7 bits
imm[11:5]   rs2      rs1      funct3    imm[4:0]   op
                              010: sw              0100011
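The split immediate of the S-type format can be sketched in Python as follows, using the funct3 and opcode values listed above. `encode_sw` is a hypothetical helper name.

```python
def encode_sw(rs2, imm, rs1):
    """sw rs2, imm(rs1): imm[11:5] | rs2 | rs1 | 010 | imm[4:0] | 0100011."""
    if not -2048 <= imm <= 2047:
        raise ValueError("offset does not fit in a signed 12-bit immediate")
    imm12 = imm & 0xFFF
    imm_hi = imm12 >> 5        # imm[11:5] goes where funct7 sits in R-type
    imm_lo = imm12 & 0x1F      # imm[4:0] goes where rd sits in R-type
    return ((imm_hi << 25) | (rs2 << 20) | (rs1 << 15)
            | (0b010 << 12) | (imm_lo << 7) | 0b0100011)

word = encode_sw(rs2=14, imm=8, rs1=2)   # sw x14, 8(x2)
print(f"{word:#010x}")                   # 0x00e12423
```

Because rs1 and rs2 stay in bits 19:15 and 24:20, the register file can be read before the instruction type is fully decoded; only the immediate has to be reassembled.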
❑ Make common cases fast
➢ sh rs2, imm(rs1) # Mem[rs1+imm] ← rs2[15:0], store a halfword
➢ sb rs2, imm(rs1) # Mem[rs1+imm] ← rs2[7:0], store a byte
➢ Why are there no unsigned store instructions shu, sbu?
A load byte example
❑ Suppose x10 initially contains 0x23456789. After the following
code runs, what is the value of x14
sw x10, 0(x0)
lb x14, 1(x0)
a. in a big-endian system?
b. in a little-endian system?
❑ Solution:
a. 0x00000045
b. 0x00000067
              Big-Endian            Little-Endian
Byte Address  0   1   2   3         3   2   1   0
Data Value    23  45  67  89        23  45  67  89
              MSB         LSB       MSB         LSB
Word Address  0                     0
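The sw/lb exercise can be checked with a few lines of Python. This is a sketch that models memory as a 4-byte sequence; `lb_after_sw` is a hypothetical helper, and `byteorder` takes Python's standard "big" or "little" strings.

```python
def lb_after_sw(word, byte_offset, byteorder):
    """Store a 32-bit word at address 0, then load one byte with sign extension."""
    mem = word.to_bytes(4, byteorder)        # sw: lay out the four bytes in memory
    byte = mem[byte_offset]                  # lb: fetch the byte at the given address
    signed = byte - 0x100 if byte & 0x80 else byte
    return signed & 0xFFFFFFFF               # sign-extended to 32 bits

print(hex(lb_after_sw(0x23456789, 1, "big")))     # 0x45
print(hex(lb_after_sw(0x23456789, 1, "little")))  # 0x67
print(hex(lb_after_sw(0x23456789, 0, "little")))  # 0xffffff89, byte 0x89 is negative
```

The last line shows where lb and lbu would differ: a byte with its top bit set is sign-extended by lb.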
Instructions for 32-bit constants
❑ Most constants are small
➢ 12-bit immediate is sufficient
➢ Occasionally need 32-bit constants → upper-immediate instructions
❑ Encoding: U-type
20 bits      5 bits   7 bits
imm[31:12]   rd       op
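lui x8, 976        0000 0000 0011 1101 0000 0000 0000 0000
ori x8, x8, 2304   0000 0000 0000 0000 0000 1001 0000 0000
❖ Note: 2304 lies outside the signed 12-bit immediate range [-2048, 2047], so RV32I cannot encode this ori directly; lui x8, 977 followed by addi x8, x8, -1792 produces the same value, 0x003D0900.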
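One subtlety when building a 32-bit constant is that the 12-bit immediate of the second instruction is sign-extended, so a low part of 2048 or more (such as 2304) cannot be used directly. The Python sketch below shows one common way to split a constant into a lui part and an addi part; `split_constant` and `materialize` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF

def split_constant(value):
    """Return (hi20, lo12) such that lui hi20 followed by addi lo12 yields value."""
    value &= MASK32
    lo = value & 0xFFF
    if lo >= 0x800:            # addi would sign-extend this, so treat it as lo - 4096
        lo -= 0x1000
    hi = ((value - lo) >> 12) & 0xFFFFF   # the upper part absorbs the borrowed 4096
    return hi, lo

def materialize(hi, lo):
    """What the lui/addi pair computes, wrapped to 32 bits."""
    return ((hi << 12) + lo) & MASK32

print(split_constant(0x003D0900))   # (977, -1792)
print(split_constant(0x003D0500))   # (976, 1280)
print(hex(materialize(977, -1792))) # 0x3d0900
```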
Branch Instructions
❑ Change control flow depending on outcome of comparison
➢ E.g., beq x1, x2, Label # branch to Label if x1 equals x2
➢ Branches read two registers but don’t write a register (similar to stores)
❑ Semantics
➢ Label ← PC + sign-extend(imm) × 2
if [rs1] == [rs2] then PC ← Label else PC ← PC + 4
➢ Variations: bne, blt, bge, bltu, bgeu. Why isn’t there a ble or bgt?
Branch encoding example
Loop: beq x19,x10,End
      add x18,x18,x10      # 1
      addi x19,x19,-1      # 2
      j Loop               # 3
End:                       # 4, target instruction
Count instructions from the branch: End is 4 instructions after beq.
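To make the example concrete, the Python sketch below encodes beq x19, x10, End with a byte offset of 4 instructions × 4 bytes = 16. The branch (B-type) field layout is not shown in this section, so the layout used here, imm[12|10:5] rs2 rs1 funct3 imm[4:1|11] with opcode 1100011 and funct3 000 for beq, is taken from the standard RV32I encoding. `encode_b` is a hypothetical helper name.

```python
def encode_b(offset, rs2, rs1, funct3, opcode=0b1100011):
    """Pack a B-type branch; offset is a byte offset relative to the branch's PC."""
    if offset % 2 or not -4096 <= offset <= 4094:
        raise ValueError("branch offset must be even and fit in 13 signed bits")
    imm = offset & 0x1FFF                      # 13-bit pattern; bit 0 is never stored
    return (((imm >> 12) & 0x1) << 31          # imm[12]
            | ((imm >> 5) & 0x3F) << 25        # imm[10:5]
            | rs2 << 20
            | rs1 << 15
            | funct3 << 12
            | ((imm >> 1) & 0xF) << 8          # imm[4:1]
            | ((imm >> 11) & 0x1) << 7         # imm[11]
            | opcode)

offset = 4 * 4                                 # End is 4 instructions after beq
word = encode_b(offset, rs2=10, rs1=19, funct3=0b000)   # beq x19, x10, End
print(f"{word:#010x}")                         # 0x00a98863
```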
❑ Encoding: J-type
1 bit     10 bits     1 bit     8 bits       5 bits   7 bits
imm[20]   imm[10:1]   imm[11]   imm[19:12]   rd       op
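The scrambled J-type immediate is easiest to follow in code. The Python sketch below packs and unpacks a jal offset according to the field order above. The jal opcode 1101111 comes from the standard RV32I encoding rather than from this section, and the function names are hypothetical.

```python
def encode_jal(rd, offset):
    """Pack jal rd, offset: imm[20] | imm[10:1] | imm[11] | imm[19:12] | rd | opcode."""
    if offset % 2 or not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("jump offset must be even and fit in 21 signed bits")
    imm = offset & 0x1FFFFF                    # 21-bit pattern; bit 0 is never stored
    return (((imm >> 20) & 0x1) << 31
            | ((imm >> 1) & 0x3FF) << 21
            | ((imm >> 11) & 0x1) << 20
            | ((imm >> 12) & 0xFF) << 12
            | rd << 7
            | 0b1101111)

def decode_jal_offset(word):
    """Reassemble and sign-extend the 21-bit offset from a jal instruction word."""
    imm = (((word >> 31) & 0x1) << 20
           | ((word >> 12) & 0xFF) << 12
           | ((word >> 20) & 0x1) << 11
           | ((word >> 21) & 0x3FF) << 1)
    return imm - (1 << 21) if imm & (1 << 20) else imm

word = encode_jal(rd=0, offset=-12)            # j Loop, 3 instructions back
print(f"{word:#010x}")                         # 0xff5ff06f
print(decode_jal_offset(word))                 # -12
```

Here j Loop is treated as jal x0, Loop, with the offset of -12 bytes taken from the loop example (Loop is three instructions before the jump).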