Unit 1b
■ Introduction
Architecture
Programmers Model
Instruction Set
History of ARM
• ARM (Acorn RISC Machine) started as a new, powerful, CPU design for the
replacement of the 8-bit 6502 in Acorn Computers (Cambridge, UK, 1985)
• First models had only a 26-bit program counter, limiting the memory space
to 64 MB (not much by today's standards, but a lot at that time).
• 1990 spin-off: ARM renamed Advanced RISC Machines
• ARM now focuses on Embedded CPU cores
• IP licensing: Almost every silicon manufacturer sells some microcontroller
with an ARM core. Some even compete with their own designs.
• Processing power with low current consumption
• Good MIPS/Watt figure
• Ideal for portable devices
• Compact memories: 16-bit opcodes (Thumb)
• New cores with added features
• Harvard architecture (ARM9, ARM11, Cortex)
• Floating point arithmetic
• Vector computing (VFP, NEON)
• Java bytecode execution (Jazelle)
Facts
• 32-bit CPU
• 3-operand instructions (typical): ADD Rd,Rn,Operand2
• RISC design…
• Few, simple, instructions
• Load/store architecture (instructions operate on registers, not memory)
• Large register set
• Pipelined execution
• … And some very specific details
• No stack. Link register instead
• PC as a regular register
• Conditional execution of all instructions
• Flags altered or not by data processing instructions (selectable)
• Concurrent shifts/rotations (performed in the same instruction as other processing)
• …
Agenda
Introduction
■ Architecture
Programmers Model
Instruction Set
ARM7TDMI
Block Diagram
ARM Pipelining examples
ARM7TDMI Pipelining (I)
Agenda
Introduction
Architecture
■ Programmers Model
Instruction Set
Data Sizes and Instruction Sets
Processor Modes
The Registers
The ARM Register Set
Special Registers
■ SP (R13): Stack Pointer. The ARM architecture has no hardware-managed stack. Even so,
R13 is by convention reserved as the pointer for the program-managed stack
■ CPSR: Current Program Status Register. Holds the currently visible status
(condition flags, interrupt masks, and processor mode)
■ SPSR: Saved Program Status Register. Holds a copy of the previous status
register while executing exception or interrupt routines
- It is copied back to CPSR on return from the exception or interrupt
- No SPSR is available in User or System modes
Register Organization Summary
Mode        Uses User-mode registers    Own banked registers
User, SYS   r0-r15, cpsr                none (no spsr)
FIQ         r0-r7, r15, cpsr            r8-r12, r13 (sp), r14 (lr), spsr
IRQ         r0-r12, r15, cpsr           r13 (sp), r14 (lr), spsr
SVC         r0-r12, r15, cpsr           r13 (sp), r14 (lr), spsr
Undef       r0-r12, r15, cpsr           r13 (sp), r14 (lr), spsr
Abort       r0-r12, r15, cpsr           r13 (sp), r14 (lr), spsr
In User and SYS modes, r13 is sp, r14 is lr and r15 is pc.
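The register banking summarized above can be sketched as a small lookup. This is only an illustrative model: the mode abbreviations (usr, sys, fiq, irq, svc, und, abt) and the `rN_mode` labels are naming assumptions made for this example, not something defined on the slide.

```python
# Registers that each mode banks (keeps a private copy of),
# following the register organization summary.
BANKED = {
    "usr": set(),
    "sys": set(),
    "fiq": {8, 9, 10, 11, 12, 13, 14},
    "irq": {13, 14},
    "svc": {13, 14},
    "und": {13, 14},
    "abt": {13, 14},
}


def physical_register(mode, n):
    """Return a label for the physical register behind rN in the given mode."""
    if not 0 <= n <= 15:
        raise ValueError("register number must be in 0..15")
    if n in BANKED[mode]:
        return f"r{n}_{mode}"   # private copy for this mode
    return f"r{n}"              # shared with User mode


def has_spsr(mode):
    """User and System modes have no SPSR; every exception mode has one."""
    return mode not in ("usr", "sys")


print(physical_register("fiq", 8))   # r8_fiq
print(physical_register("irq", 8))   # r8
```

The model shows why FIQ handlers can often avoid saving registers: r8-r12 are private to that mode.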
Program Status Registers
Program Counter (R15)
Agenda
Introduction
Architecture
Programmers Model
■ Instruction Set (for ARM state)
Conditional Execution and Flags
Condition Codes
Suffix   Description               Flags tested
EQ       Equal                     Z=1
NE       Not equal                 Z=0
CS/HS    Unsigned higher or same   C=1
CC/LO    Unsigned lower            C=0
MI       Minus                     N=1
PL       Positive or Zero          N=0
VS       Overflow                  V=1
VC       No overflow               V=0
HI       Unsigned higher           C=1 & Z=0
LS       Unsigned lower or same    C=0 or Z=1
GE       Greater or equal          N=V
LT       Less than                 N!=V
GT       Greater than              Z=0 & N=V
LE       Less than or equal        Z=1 or N!=V
AL       Always
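As a rough illustration of how the condition table is evaluated, the sketch below maps each suffix to the flag test listed above. The function name `condition_passed` and the use of Python booleans for the N, Z, C, V flags are assumptions made for this example.

```python
def condition_passed(cond, n, z, c, v):
    """Return True if an instruction with this condition suffix would execute."""
    table = {
        "EQ": z,
        "NE": not z,
        "CS": c, "HS": c,
        "CC": not c, "LO": not c,
        "MI": n,
        "PL": not n,
        "VS": v,
        "VC": not v,
        "HI": c and not z,
        "LS": (not c) or z,
        "GE": n == v,
        "LT": n != v,
        "GT": (not z) and n == v,
        "LE": z or n != v,
        "AL": True,
    }
    return bool(table[cond.upper()])


# After comparing two equal values, Z=1 and C=1 (no borrow):
print(condition_passed("EQ", n=False, z=True, c=True, v=False))  # True
print(condition_passed("HI", n=False, z=True, c=True, v=False))  # False
```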
Advantages of conditional execution
Examples of conditional execution
■ Use a sequence of several conditional instructions
if (a==0) func(1);
CMP r0,#0
MOVEQ r0,#1
BLEQ func
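The three-instruction sequence above can be traced with a small model. It is a sketch under simple assumptions: `a` lives in r0, only the Z flag is tracked, and `func` is any Python callable standing in for the subroutine. The key point it shows is that MOVEQ (no S suffix) leaves the flags alone, so BLEQ still sees the result of the CMP.

```python
def run_sequence(a, func):
    """Model CMP r0,#0 / MOVEQ r0,#1 / BLEQ func with a held in r0."""
    r0 = a & 0xFFFFFFFF
    z = (r0 == 0)      # CMP r0,#0 sets Z when r0 equals zero
    if z:              # MOVEQ r0,#1 executes only when Z=1, flags unchanged
        r0 = 1
    if z:              # BLEQ func executes only when Z=1
        func(r0)
    return r0


calls = []
run_sequence(0, calls.append)   # a == 0: func(1) is called
run_sequence(7, calls.append)   # a != 0: both instructions are skipped
print(calls)  # [1]
```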
Data processing Instructions
■ Consist of :
■ Arithmetic: ADD ADC SUB SBC RSB RSC
■ Logical: AND ORR EOR BIC
■ Comparisons: CMP CMN TST TEQ
■ Data movement: MOV MVN
Immediate value
■ 8-bit number, with a range of 0-255
■ Rotated right through an even number of positions
■ Allows an increased range of 32-bit constants to be loaded directly into
registers
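To make the immediate rule concrete, the sketch below searches for an 8-bit value and an even right-rotation that reproduce a given 32-bit constant. The helper names `ror32` and `encode_immediate` are hypothetical, chosen for this example only.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def encode_immediate(value):
    """Return (imm8, rotate) with ror32(imm8, rotate) == value, or None."""
    value &= 0xFFFFFFFF
    for rotate in range(0, 32, 2):          # only even rotations are allowed
        imm8 = ror32(value, 32 - rotate)    # rotating left undoes the ROR
        if imm8 <= 0xFF:
            return imm8, rotate
    return None


print(encode_immediate(0xFF000000))  # (255, 8)
print(encode_immediate(0x55555555))  # None
```

A constant such as 0x101 fails because its set bits span nine positions, which no 8-bit window can cover.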
The Barrel Shifter
■ LSL: Logical Shift Left
CF <- Destination <- 0
Multiplication by a power of 2
■ ASR: Arithmetic Shift Right
sign bit -> Destination -> CF
Division by a power of 2, preserving the sign bit
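The two shifts can be modelled as follows. This is a minimal sketch that covers only shift amounts from 1 to 31 and returns the carry-out alongside the result; the function names are chosen for illustration.

```python
MASK = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left; returns (result, carry_out) for 1 <= amount <= 31."""
    if not 1 <= amount <= 31:
        raise ValueError("this sketch only handles shifts of 1..31")
    value &= MASK
    carry = (value >> (32 - amount)) & 1    # last bit shifted out at the top
    return (value << amount) & MASK, carry


def asr(value, amount):
    """Arithmetic shift right; returns (result, carry_out) for 1 <= amount <= 31."""
    if not 1 <= amount <= 31:
        raise ValueError("this sketch only handles shifts of 1..31")
    value &= MASK
    carry = (value >> (amount - 1)) & 1     # last bit shifted out at the bottom
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> amount) & MASK, carry  # sign bit is replicated


print(lsl(3, 4))                 # (48, 0): 3 * 16
print(hex(asr(0xFFFFFFF8, 2)[0]))  # 0xfffffffe: -8 / 4 = -2
```

Note that ASR rounds toward minus infinity, so it matches division by a power of 2 exactly only when no set bits are shifted out.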
Loading 32 bit constants
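■ The LDR rd,=const pseudo-instruction lets the assembler pick the encoding:
a MOV when the constant fits an immediate value, or
■ An LDR instruction with a PC-relative address that reads the constant
from a literal pool (a constant data area embedded in the code).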
■ For example
■ LDR r0,=0xFF => MOV r0,#0xFF
■ LDR r0,=0x55555555 => LDR r0,[PC,#Imm12]
…
…
DCD 0x55555555
■ This is the recommended way of loading constants into a register
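The choice an assembler makes for `LDR rd,=const` might look roughly like the sketch below. It is a simplified model, not any particular assembler's algorithm: the function names are hypothetical, the MVN branch is an assumption added here (the examples above only show MOV and LDR), and the PC-relative offset is left symbolic.

```python
MASK = 0xFFFFFFFF


def is_arm_immediate(value):
    """True if value is an 8-bit number rotated right by an even amount."""
    value &= MASK
    for rot in range(0, 32, 2):
        rotated_left = ((value << rot) | (value >> (32 - rot))) & MASK
        if rotated_left <= 0xFF:
            return True
    return False


def expand_ldr_pseudo(rd, value):
    """Return the instruction an assembler could emit for LDR rd,=value."""
    value &= MASK
    if is_arm_immediate(value):
        return f"MOV {rd},#0x{value:X}"
    inverted = ~value & MASK
    if is_arm_immediate(inverted):          # assumption: MVN as a second option
        return f"MVN {rd},#0x{inverted:X}"
    # Otherwise the constant goes into a literal pool near the code.
    return f"LDR {rd},[PC,#offset]  ; literal pool: DCD 0x{value:X}"


print(expand_ldr_pseudo("r0", 0xFF))        # MOV r0,#0xFF
print(expand_ldr_pseudo("r0", 0x55555555))
```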
Data Processing Instructions: Flags
■ Flags are changed only if the S bit of the opcode is set:
mnemonics ending in “S”, like “MOVS”, and the comparisons CMP, CMN, TST, TEQ
■ N and Z have the expected meaning for all instructions
■ N: bit 31 (sign) of the result
■ Z: set if result is zero
■ Logical instructions (AND, EOR, TST, TEQ, ORR, MOV, BIC, MVN)
■ V: unchanged
■ C: from barrel shifter if shift ≠ 0. Unchanged otherwise
■ Arithmetic instructions (SUB, RSB, ADD, ADC, SBC, RSC, CMP, CMN)
■ V: Signed overflow from ALU
■ C: Carry (bit 32 of result) from ALU. For subtractions, C=1 means no borrow
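The flag rules for arithmetic instructions can be checked with a short model. This sketch assumes the usual two's-complement formulation in which a subtraction is performed as a + NOT(b) + 1; the function names and the (N, Z, C, V) tuple layout are choices made for this example.

```python
MASK = 0xFFFFFFFF


def add_with_flags(a, b, carry_in=0):
    """Return (result, (N, Z, C, V)) for a 32-bit addition."""
    a &= MASK
    b &= MASK
    total = a + b + carry_in
    result = total & MASK
    n = result >> 31                 # bit 31 of the result
    z = int(result == 0)
    c = total >> 32                  # bit 32 of the unsigned sum
    # Signed overflow: operands have equal signs, result sign differs.
    v = ((~(a ^ b) & (a ^ result)) >> 31) & 1
    return result, (n, z, c, v)


def sub_with_flags(a, b):
    """SUB modelled as a + NOT(b) + 1, so C=1 means no borrow occurred."""
    return add_with_flags(a, ~b & MASK, 1)


print(add_with_flags(0x7FFFFFFF, 1))  # (2147483648, (1, 0, 0, 1))
print(sub_with_flags(5, 5))           # (0, (0, 1, 1, 0))
```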
Arithmetic Operations
Operations are:
ADD operand1 + operand2
ADC operand1 + operand2 + carry
SUB operand1 - operand2
SBC operand1 - operand2 + carry - 1
RSB operand2 - operand1
RSC operand2 - operand1 + carry - 1
Syntax:
<Operation>{<cond>}{S} Rd, Rn, Operand2
Examples
ADD r0, r1, r2
SUBGT r3, r3, #1
RSBLES r4, r5, #5
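The six formulas above translate directly into a small model with 32-bit wrap-around. The `arithmetic` and `add64` helpers are hypothetical names for this sketch; `add64` illustrates one typical use of ADC, chaining the carry from the low word into the high word of a 64-bit sum.

```python
MASK = 0xFFFFFFFF


def arithmetic(op, operand1, operand2, carry=0):
    """Evaluate one arithmetic operation, wrapping the result to 32 bits."""
    formulas = {
        "ADD": operand1 + operand2,
        "ADC": operand1 + operand2 + carry,
        "SUB": operand1 - operand2,
        "SBC": operand1 - operand2 + carry - 1,
        "RSB": operand2 - operand1,
        "RSC": operand2 - operand1 + carry - 1,
    }
    return formulas[op] & MASK


def add64(a_lo, a_hi, b_lo, b_hi):
    """64-bit add from 32-bit halves: ADDS on the low words, ADC on the high."""
    total_lo = (a_lo & MASK) + (b_lo & MASK)
    carry = total_lo >> 32                   # C flag produced by ADDS
    return total_lo & MASK, arithmetic("ADC", a_hi, b_hi, carry)


print(arithmetic("RSB", 2, 5))       # 3, i.e. 5 - 2
print(add64(0xFFFFFFFF, 0, 1, 0))    # (0, 1)
```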
Comparisons
Logical Operations
Operations are:
AND operand1 AND operand2
EOR operand1 EOR operand2
ORR operand1 OR operand2
BIC operand1 AND NOT operand2 [i.e. bit clear]
Syntax:
<Operation>{<cond>}{S} Rd, Rn, Operand2
Examples:
AND r0, r1, r2
BICEQ r2, r3, #7
EORS r1, r3, r0
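For reference, the four logical operations can be modelled in a few lines. The `logical` helper is a hypothetical name; the BIC case mirrors the `BICEQ r2, r3, #7` example, which clears the three lowest bits.

```python
MASK = 0xFFFFFFFF


def logical(op, operand1, operand2):
    """Evaluate one logical operation on 32-bit values."""
    results = {
        "AND": operand1 & operand2,
        "EOR": operand1 ^ operand2,
        "ORR": operand1 | operand2,
        "BIC": operand1 & ~operand2,   # clear the bits that are set in operand2
    }
    return results[op] & MASK


print(hex(logical("BIC", 0xFF, 7)))   # 0xf8
```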
Data Movement
Operations are:
MOV operand2
MVN NOT operand2
Note that these make no use of operand1.
Syntax:
<Operation>{<cond>}{S} Rd, Operand2
Examples:
MOV r0, r1
MOVS r2, #10
MVNEQ r1,#0
Multiply
■ Syntax:
■ MUL{<cond>}{S} Rd, Rm, Rs                    Rd = Rm * Rs
■ MLA{<cond>}{S} Rd, Rm, Rs, Rn                Rd = (Rm * Rs) + Rn
■ [U|S]MULL{<cond>}{S} RdLo, RdHi, Rm, Rs      RdHi,RdLo = Rm * Rs
■ [U|S]MLAL{<cond>}{S} RdLo, RdHi, Rm, Rs      RdHi,RdLo = (Rm * Rs) + RdHi,RdLo
■ Cycle time
■ Basic MUL instruction
■ 2-5 cycles on ARM7TDMI
■ 1-3 cycles on StrongARM/XScale
■ 2 cycles on ARM9E/ARM102xE
■ +1 cycle for ARM9TDMI (over ARM7TDMI)
■ +1 cycle for accumulate (not on 9E though result delay is one cycle longer)
■ +1 cycle for “long”
■ Above are “general rules” - refer to the TRM for the core you are using for
the exact details
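The results of the multiply instructions can be sketched as below, with the long forms returning the 64-bit product as a (RdLo, RdHi) pair. The function names follow the mnemonics but are otherwise just illustrative helpers, and cycle counts are not modelled.

```python
MASK = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a signed two's-complement value."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def mul(rm, rs):
    return (rm * rs) & MASK                 # only the low 32 bits are kept


def mla(rm, rs, rn):
    return (rm * rs + rn) & MASK


def umull(rm, rs):
    product = (rm & MASK) * (rs & MASK)
    return product & MASK, product >> 32    # (RdLo, RdHi)


def smull(rm, rs):
    product = (to_signed(rm) * to_signed(rs)) & MASK64
    return product & MASK, product >> 32    # (RdLo, RdHi)


def umlal(rd_lo, rd_hi, rm, rs):
    acc = ((rd_hi & MASK) << 32) | (rd_lo & MASK)
    total = (acc + (rm & MASK) * (rs & MASK)) & MASK64
    return total & MASK, total >> 32        # (RdLo, RdHi)


print([hex(x) for x in umull(0xFFFFFFFF, 0xFFFFFFFF)])  # ['0x1', '0xfffffffe']
print([hex(x) for x in smull(0xFFFFFFFF, 2)])           # ['0xfffffffe', '0xffffffff']
```

The same bit pattern 0xFFFFFFFF gives different high words because UMULL treats it as 4294967295 and SMULL treats it as -1.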
Branch instructions
Bits:    31-28   27-25   24   23-0
Field:   Cond    1 0 1   L    Offset
■ The processor core shifts the offset field left by 2 positions, sign-extends
it and adds it to the PC
■ ± 32 Mbyte range
■ How to perform longer branches or absolute address branches?
solution: LDR PC,…
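The target computation described above can be sketched as a small decoder. It assumes ARM state, where the PC reads as the address of the branch instruction plus 8 because of the pipeline; the function names are hypothetical.

```python
MASK = 0xFFFFFFFF


def branch_target(instruction, address):
    """Compute the destination of a B/BL instruction located at address."""
    offset = instruction & 0x00FFFFFF       # bits 23-0
    if offset & 0x00800000:                 # sign-extend the 24-bit field
        offset -= 1 << 24
    pc = address + 8                        # assumption: PC is two instructions ahead
    return (pc + (offset << 2)) & MASK      # offset counts words, so shift left by 2


def is_link(instruction):
    """True for BL (L bit, bit 24, set), which also saves a return address."""
    return bool(instruction & (1 << 24))


print(hex(branch_target(0xEAFFFFFE, 0x8000)))  # 0x8000: a branch to itself
print(hex(branch_target(0xEA7FFFFF, 0)))       # 0x2000004: furthest forward target
```

The second call shows where the ± 32 Mbyte range comes from: a 24-bit signed word offset spans 2^25 bytes in each direction.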
Single register data transfer
■ Syntax:
■ LDR{<cond>}{<size>} Rd, <address>
■ STR{<cond>}{<size>} Rd, <address>
e.g. LDREQB
Address accessed
Pre or Post Indexed Addressing?
■ Pre-indexed: STR r0,[r1,#12]
Offset: 12
r0 (source register for STR): 0x5
r1 (base register): 0x200
The value 0x5 is stored at address 0x200 + 12 = 0x20c; r1 is still 0x200 afterwards
■ Base-update possible:
LDM r10!,{r0-r6}
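The difference between the addressing forms can be sketched with a dictionary standing in for memory. This is an illustrative model only: the function names are hypothetical, memory is keyed by address, and the post-indexed form (`STR r0,[r1],#12`) and the writeback form (`STR r0,[r1,#12]!`) are standard ARM syntax added here for comparison with the pre-indexed example above.

```python
MASK = 0xFFFFFFFF


def str_pre_indexed(memory, regs, rd, rn, offset, writeback=False):
    """STR rd,[rn,#offset] and, with writeback, STR rd,[rn,#offset]!"""
    address = (regs[rn] + offset) & MASK    # offset applied before the access
    memory[address] = regs[rd]
    if writeback:
        regs[rn] = address                  # base register is updated


def str_post_indexed(memory, regs, rd, rn, offset):
    """STR rd,[rn],#offset: access at the base, then update the base."""
    address = regs[rn]
    memory[address] = regs[rd]
    regs[rn] = (address + offset) & MASK


regs = {"r0": 0x5, "r1": 0x200}
memory = {}
str_pre_indexed(memory, regs, "r0", "r1", 12)
print({hex(k): hex(v) for k, v in memory.items()}, hex(regs["r1"]))
# {'0x20c': '0x5'} 0x200
```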