Embedded Lecture 4 ARM

ARM microcontrollers are based on a RISC architecture developed in the 1980s. They use a reduced instruction set in which most instructions execute in a single cycle for high efficiency. ARM processors share a common instruction set and use a load/store architecture that focuses on register-based operations. The ARM architecture has evolved over time with processors like ARM7, ARM9, and ARM11 that added pipeline stages for improved performance while maintaining compatibility. Key aspects of the ARM design include its simple yet powerful instructions, register-based operations, and focus on low power consumption suitable for battery-powered devices.

Uploaded by

msmukeshsinghms

ARM Microcontrollers

• History:
– Architectural sketch developed in 1983 by Acorn Computers
• To replace the 8-bit 6502 microprocessor in BBC computers
– The company was founded in 1990
• Advanced RISC Machines (ARM)
• Initially owned by Acorn, Apple, and VLSI
– E.g., ARM7: iPod
ARM9: BenQ, Sony Ericsson
ARM11: Apple iPhone, Nokia
– As of 2010, about 90% of embedded applications used ARM processors
– Widely used in low-power, battery-operated devices
ARM Processors
• A simple RISC-based architecture with a powerful design

• Share a common instruction set to maintain compatibility

• Design philosophy:
– Small processor with low power
– High code density for limited memory and physical size restrictions
– Can interface with slow and low-cost memory systems
– Reduced die size for the processor, to accommodate more peripherals
ARM Processors
• ARM7
– 3 pipeline stages (fetch/decode/execute)
– High code density and low power consumption
– Most widely used for low-end systems

• ARM9
– Compatible with ARM7
– 5 stages (fetch/decode/execute/memory/write)
– Separate instruction and data cache

• ARM10
– 6 stages (fetch/issue/decode/execute/memory/write)
ARM Family Comparison

➢ ARM is based on the RISC architecture: RISC supports simple yet powerful instructions, most of which execute in a single cycle at a high clock frequency
ARM Design
Features:
• Instructions: reduced set/single cycle/fixed length
• Pipeline: single stage decoding/no microprogramming
• Registers: Large numbers of GPRs
• Load/Store Architecture: data processing done only on registers and
only load/store instructions access memory

➢ The ARM architecture differs from pure RISC:

✓ A few instructions require a variable number of cycles (e.g. multiple-register load/store)
✓ Use of a barrel shifter (increases code density and performance)
✓ Thumb: 16-bit instruction set
✓ Conditional execution
✓ Improved DSP instructions
ARM Design

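[Figure: ARM7 and older cores connect to memory and I/O directly over the AHB bus; ARM9 and later cores add separate instruction (I) and data caches and a bus interface in front of the AHB bus to memory and I/O.]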
ARM 7 Architecture
ARM: Pipelining
• Pipelining: overlaps the execution of several tasks by dividing a computation into a set of k sub-computations (stages)
• Slight increase in cost but significant speed improvement (ideally, k times)

• Usage:
– Instruction execution
– Arithmetic computations
– Memory access

Example: Washing Machine (Wash, Dry, and Iron)

Without pipelining, each cloth takes time T; with pipelining, each of the three stages takes T/3.

For N clothes without pipelining, total time = N·T
For N clothes with pipelining, total time = (2+N)·T/3
Use of Pipelining
[Figure: timing diagram of the pipelined wash/dry/iron example, with time slots of T/3.]


Use of Pipelining: Processor
• For hardware, either:
– Buy k copies of the CPU and run k instructions in parallel
✓ Cost also goes up k times
OR
– Divide the execution into a k-stage pipeline
✓ Very nominal increase in cost

• Buffer:
– In the washing example, a buffer holds a cloth until the next machine can accept it
– Similarly, hardware needs latches or registers between stages to hold pipelined instructions
Synchronous k-stage pipelining

➢ Latches are built using master-slave flip-flops.

➢ Pipeline stages are implemented using combinational circuits.

➢ The clock transfers the data into the next stage for processing.


Impact of pipelining on speed and efficiency
• Notations:
$T$: clock period of the pipeline
$\tau_i$: time delay of the circuit in stage $S_i$
$d$: delay due to the latch

Maximum stage delay: $\tau_m = \max\{\tau_i\}$

Then, $T = \tau_m + d$
Pipeline frequency: $f = 1/T$
➢ If one result comes out in every clock cycle, then $f$ denotes the maximum throughput of the pipeline.

The total time to process N data sets in k stages: $T_k = [(k-1) + N]\,T$

Equivalent non-pipelined processor, total time: $T_1 = N k T$ (ignoring latch delay)

Increased speed: $S_k = \frac{T_1}{T_k} = \frac{N k T}{k T + (N-1) T} = \frac{N k}{k + (N-1)}$. As $N \to \infty$, $S_k \to k$.
Pipeline Efficiency and Throughput
• Pipeline efficiency: $E_k = \frac{S_k}{k} = \frac{N}{k + (N-1)}$
• Pipeline throughput: $H_k = \frac{N}{T_k} = \frac{N}{[k + (N-1)]\,T}$
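
The speedup, efficiency, and throughput formulas can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `pipeline_metrics` and its dictionary keys are illustrative assumptions.

```python
def pipeline_metrics(k: int, n: int, clock_period: float) -> dict:
    """Evaluate the k-stage pipeline formulas for n data sets.

    clock_period is T (maximum stage delay plus latch delay).
    """
    if k < 1 or n < 1 or clock_period <= 0:
        raise ValueError("k and n must be >= 1 and clock_period must be > 0")
    t_pipelined = (k - 1 + n) * clock_period   # T_k = [(k-1) + N] * T
    t_sequential = n * k * clock_period        # T_1 = N * k * T
    speedup = t_sequential / t_pipelined       # S_k = T_1 / T_k
    return {
        "t_pipelined": t_pipelined,
        "t_sequential": t_sequential,
        "speedup": speedup,
        "efficiency": speedup / k,             # E_k = S_k / k
        "throughput": n / t_pipelined,         # H_k = N / T_k
    }


if __name__ == "__main__":
    # Speedup approaches k = 3 as the number of data sets grows.
    for n in (1, 10, 1000):
        m = pipeline_metrics(k=3, n=n, clock_period=1.0)
        print(n, round(m["speedup"], 3), round(m["efficiency"], 3))
```

With k = 3 and a clock period of T/3, `t_pipelined` reproduces the washing-machine result (2+N)·T/3.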
[Figure: plot of speedup versus number of tasks.]
ARM Pipelines

[Figure: ARM7TDMI and ARM9TDMI pipeline stages, each stage taking 1 clock cycle.]
ARM 7 Pipelining
[Figure: pipeline diagram, instructions versus time.]
Stall Cycles in Pipelining
[Figure: pipeline diagram (instructions versus time) for the sequence 1. ADD, 2. STR, 3. ADD, 4. ADD, 5. ADD.]
Pipelining: PC
➢ During execution, the program counter is always 8 bytes ahead of the instruction being executed.
Processor Modes
Registers
• ARM has 37 registers, each 32 bits long: 1 PC, 1 current program status register (CPSR), 5 dedicated saved program status registers (SPSRs), and 30 general-purpose registers (GPRs)

• Only 16 registers (r0-r15) are visible in any given mode of operation:

– r0-r12: general-purpose registers (the particular set depends on the mode)
– r13: stack pointer (SP), for a software-managed stack
– r14: link register (LR), saves the return address on a BL instruction, exception entry, or interrupt handling
– r15: program counter (PC)
– CPSR: current status
✓ Exception modes can also access their SPSR

• All ARM operations are 32-bit, but data transfer operations support shorter data types
Current Program Status Register

Mode Bits:
10000: User
10001: FIQ
10010: IRQ
10011: Supervisor
10111: Abort
11011: Undefined
11111: System
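
As an illustration of the mode-bit table, the sketch below decodes the processor mode from a CPSR value in Python. It assumes the mode field occupies the five least significant bits of the CPSR; the names `CPSR_MODES` and `decode_mode` are hypothetical helpers, not part of the slides.

```python
# Mode-bit encodings copied from the table above.
CPSR_MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_mode(cpsr: int) -> str:
    """Return the mode name encoded in CPSR[4:0], or 'Reserved' if unknown."""
    return CPSR_MODES.get(cpsr & 0x1F, "Reserved")


if __name__ == "__main__":
    print(decode_mode(0x000000D3))  # low five bits are 10011
```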
Program Counter
• Two modes of execution: ARM mode and Thumb mode
– ARM mode:
o All instructions are 32 bits wide and must be word aligned
o The last two bits of the PC are not used (always zero)
o Because of pipelining, the PC reads as the current instruction address + 8, or + 12 in the case of a register-specified shift

– Thumb mode: all instructions are 16 bits wide and half-word aligned
Register Organization
[Figure: register banks by mode: User/System, FIQ, IRQ, SVC, Undefined, Abort.]
Exception Handling
• When an exception occurs, the processor:
➢ Copies the CPSR contents into SPSR_<mode>
➢ Sets the appropriate bits in the CPSR
➢ Changes to ARM state
➢ Changes to the related exception mode
➢ Disables interrupts (IRQ, and FIQ where appropriate)
➢ Stores the return address in LR_<mode>
➢ Sets the PC to the vector address

• When returning, the exception handler needs to:
➢ Restore the CPSR from SPSR_<mode>
➢ Restore the PC from LR_<mode>

Note: the vector table contains ARM instructions, NOT addresses.
ARM and Thumb Mode

Conditional execution: the CPU executes an instruction only if its condition is satisfied by the flags in the CPSR

e.g. MOVEQ r1, #0 (if the zero flag is set in the CPSR, then r1=0)
ARM Instruction Set
• Data processing instructions: work on values in registers

• Data transfer instructions: transfer values between registers and memory

• Control flow instructions: change the value of the PC
Data Processing Instructions
• All operands are 32 bits in size: either registers or literals (immediate values)

• The result is also 32 bits and goes into a specified register
– Exception: long multiply, which generates a 64-bit result

• All operand and output registers are specified in the instruction.
Arithmetic Instructions

ADD r0, r1, r2 ; r0=r1+r2
ADC r0, r1, r2 ; r0=r1+r2+C (carry bit)
SUB r0, r1, r2 ; r0=r1-r2
SBC r0, r1, r2 ; r0=r1-r2+C-1
RSB r0, r1, r2 ; r0=r2-r1
RSC r0, r1, r2 ; r0=r2-r1+C-1

➢ All instructions can be viewed as operating on either unsigned or 2's complement signed values.
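
The arithmetic instructions above can be modelled on 32-bit values as follows. This is a rough Python sketch for illustration only: the helper names are assumptions, flag updates are not modelled except for the carry-out used in the 64-bit addition example, and `add64` shows the usual reason ADC exists (propagating the carry from a low word to a high word).

```python
MASK32 = 0xFFFFFFFF


def sub(r1: int, r2: int) -> int:          # SUB r0, r1, r2
    return (r1 - r2) & MASK32


def sbc(r1: int, r2: int, c: int) -> int:  # SBC r0, r1, r2
    return (r1 - r2 + c - 1) & MASK32


def rsb(r1: int, r2: int) -> int:          # RSB r0, r1, r2
    return (r2 - r1) & MASK32


def rsc(r1: int, r2: int, c: int) -> int:  # RSC r0, r1, r2
    return (r2 - r1 + c - 1) & MASK32


def add_with_carry(r1: int, r2: int, c: int = 0) -> tuple:
    """ADD (c=0) or ADC: return the 32-bit result and the carry-out."""
    total = (r1 & MASK32) + (r2 & MASK32) + c
    return total & MASK32, total >> 32


def add64(a: int, b: int) -> int:
    """64-bit addition built from a 32-bit ADD followed by ADC."""
    lo, carry = add_with_carry(a & MASK32, b & MASK32)       # ADD on low words
    hi, _ = add_with_carry(a >> 32, b >> 32, carry)          # ADC on high words
    return (hi << 32) | lo


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a 2's complement signed value."""
    return x - (1 << 32) if x & 0x80000000 else x
```

The same bit pattern can be read either way: `sub(1, 2)` gives 0xFFFFFFFF as an unsigned value, and `to_signed` reads it as -1.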
Bit Wise Logical Instructions

AND r0, r1, r2 ; r0=r1 and r2
ORR r0, r1, r2 ; r0=r1 or r2
EOR r0, r1, r2 ; r0=r1 xor r2
BIC r0, r1, r2 ; r0=r1 and not r2 (bit clear)
Register-Register Move and Comparison Instructions
Register-Register Move Instructions:

MOV r0, r2 ; r0=r2
MVN r0, r2 ; r0=not r2

Comparison Instructions:

CMP r1, r2 ; set CC on (r1-r2)
CMN r1, r2 ; set CC on (r1+r2)
TST r1, r2 ; set CC on (r1 and r2)
TEQ r1, r2 ; set CC on (r1 xor r2)

All of these comparison instructions update the condition code (CC) flags in the CPSR, which later instructions in the program can test.
Instructions: Operands
• Specifying immediate operands (#: immediate, &: hexadecimal number):
ADD r1, r2, #1 ; r1=r2+1
SUB r1, r2, #1 ; r1=r2-1
AND r1, r2, #&0F ; r1=r2[3:0]
• Shifted register operands:
ADD r1, r2, r3, LSL #2 ; r1=r2+(r3<<2)
ADD r1, r2, r3, LSL r5 ; r1=r2+(r3<<r5)

– Various shift and rotate operations:
• LSL: logical shift left
• LSR: logical shift right
• ROR: rotate right
• RRX: rotate right extended by 1 bit
• ASL: arithmetic shift left (same as LSL)
• ASR: arithmetic shift right
Shifted register operands
• Multiplication Instruction:
MUL r1, r2, r3 ; r1=(r2 × r3)[31:0]

➢ Does not support immediate operands

• Multiply-Accumulate Instruction:
MLA r1, r2, r3, r4 ; r1=(r2 × r3 + r4)[31:0]

➢ Required in DSP applications
➢ Multiplication with a 64-bit result is also supported
Data Transfer Instructions
• Single register load and store: flexible, supports byte, half-word, and word transfers

• Multiple register load and store: less flexible, multiple words, high transfer rate

• Single register memory swap: mostly for system use and implementing locks (semaphores)
• All ARM data transfer instructions use register-indirect addressing.
Example:
o Before a data transfer, a register must be initialized with the memory address
ADRL r1, Table ; r1=memory address of Table

Then,
LDR r0, [r1] ; r0=mem[r1]
STR r0, [r1] ; mem[r1]=r0

Data transfer with offset

LDR r0, [r1, #4] ; r0=mem[r1+4]
STR r0, [r1, #10] ; mem[r1+10]=r0

Data transfer with auto indexing (pre-indexed with write-back)

LDR r0, [r1, #4]! ; r0=mem[r1+4], r1=r1+4
STR r0, [r1, #10]! ; mem[r1+10]=r0, r1=r1+10

Data transfer with post indexing

LDR r0, [r1], #4 ; r0=mem[r1], r1=r1+4
STR r0, [r1], #10 ; mem[r1]=r0, r1=r1+10
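
The difference between the three addressing forms lies in which address is accessed and whether the base register is updated. The Python sketch below models this for LDR only; it is an illustration under simplifying assumptions (memory is a dictionary keyed by byte address, registers are a dictionary, and the function names are hypothetical).

```python
def ldr_offset(regs: dict, mem: dict, rd: str, rn: str, offset: int) -> None:
    """LDR rd, [rn, #offset]: access rn+offset, base register unchanged."""
    regs[rd] = mem[regs[rn] + offset]


def ldr_pre_indexed(regs: dict, mem: dict, rd: str, rn: str, offset: int) -> None:
    """LDR rd, [rn, #offset]!: access rn+offset and write that address back to rn."""
    regs[rn] += offset
    regs[rd] = mem[regs[rn]]


def ldr_post_indexed(regs: dict, mem: dict, rd: str, rn: str, offset: int) -> None:
    """LDR rd, [rn], #offset: access rn, then add offset to rn."""
    regs[rd] = mem[regs[rn]]
    regs[rn] += offset


if __name__ == "__main__":
    memory = {0x100: 11, 0x104: 22}
    registers = {"r0": 0, "r1": 0x100}
    ldr_post_indexed(registers, memory, "r0", "r1", 4)
    print(registers)  # r0 holds mem[0x100]; r1 has advanced by 4
```

Post-indexing is the form typically used to walk through a table, since each load leaves the base register pointing at the next element.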
• A byte or half-word transfer can also be specified:
LDRB r0, [r1] ; r0=mem8[r1]
STRB r0, [r1] ; mem8[r1]=r0
LDRSH r0, [r1] ; r0=mem16[r1] (sign-extended)
STRH r0, [r1] ; mem16[r1]=r0

• Multiple register loads and stores:

LDMIA r1, {r3,r5,r6} ; r3=mem[r1]; r5=mem[r1+4]; r6=mem[r1+8]

In LDMIB, the addresses will be [r1+4], [r1+8], and [r1+12]

The other two variants are LDMDA and LDMDB (decrement after and decrement before)
• Examples:

➢ ARM has no dedicated hardware stack. Hence, LDM and STM can be used to implement a software stack
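Reducing the number of loops in ARM instructions:

AIM: Transfer an aligned block of 128 bytes of memory

– r9: address of the source
– r10: address of the destination
– r11: end address of the source

[Figure: memory map showing the source block from r9 to r11 and the destination at r10.]

➢ These 128 bytes can be transferred in 4 loop iterations (32 bytes per iteration using multiple-register load/store), as opposed to the word-by-word transfer of a general computing system.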
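
The loop count can be illustrated with a small simulation. The Python sketch below assumes that each iteration moves eight words (32 bytes) through eight scratch registers, as a load-multiple followed by a store-multiple with write-back would; the register list r0-r7 and the instruction sequence in the comments are assumptions, since the slide's own listing did not survive extraction.

```python
WORD_BYTES = 4
REGS_PER_TRANSFER = 8  # assumption: r0-r7 are used as scratch registers


def block_copy(mem: bytearray, src: int, dst: int, end: int) -> int:
    """Copy mem[src:end] to dst in 32-byte chunks; return the loop count."""
    chunk = WORD_BYTES * REGS_PER_TRANSFER
    if end < src or (end - src) % chunk:
        raise ValueError("block length must be a non-negative multiple of 32 bytes")
    r9, r10, r11 = src, dst, end
    loops = 0
    while r9 != r11:                      # CMP r9, r11 / BNE loop
        scratch = mem[r9:r9 + chunk]      # LDMIA r9!, {r0-r7}
        r9 += chunk
        mem[r10:r10 + chunk] = scratch    # STMIA r10!, {r0-r7}
        r10 += chunk
        loops += 1
    return loops


if __name__ == "__main__":
    memory = bytearray(range(128)) + bytearray(128)
    print(block_copy(memory, src=0, dst=128, end=128))  # number of iterations
```

A word-by-word copy of the same block would need 32 iterations, which is the saving the slide refers to.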
Memory-Mapped I/O in ARM
[Figure: the data and address buses connect to the data memory and to each I/O port; an address decoder drives the select lines.]

➢ No separate instructions for I/O: I/O ports are selected through the same address decoder as data memory, but at their own specific addresses
Control Flow Instructions:
• These instructions change the PC explicitly, instead of the default sequential increment PC=PC+4

– Unconditional Branch: B, BAL

– Conditional Branch:

• Instructions:
– BEQ, BNE: equal or not equal (to zero)
– BPL, BMI: result positive or negative
– BCC, BCS: carry clear or set
– BVC, BVS: overflow clear or set
– BGT, BGE: greater than, greater or equal
– BLT, BLE: less than, less or equal
– Branch and Link (BL):
• Used for subroutines in ARM
• The return address (the address of the instruction after BL) is saved in r14 (link register)
• To return from the subroutine, jump to the address in r14

• Cannot be used directly for nested subroutines: a nested BL overwrites r14, so r14 must be saved first

• Conditional Execution:
• A special feature of ARM: instructions are executed based on a condition
• Removes the need for many short branch instructions
• E.g. if (r2!=10) then r5=r5+10-r3

• Similar suffixes:
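
The condition suffixes used by the branch and conditional-execution instructions above can be sketched as tests on the N, Z, C, and V flags. The Python model below is an illustration, not the slide's own listing: `cmp_flags`, `condition_passed`, and `example` are hypothetical helpers, and the ARM sequence shown in the comments (CMP, ADDNE, SUBNE) is one assumed way to code the r2/r5 example without branches.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a: int, b: int) -> tuple:
    """Return (N, Z, C, V) as set by CMP a, b, i.e. by computing a - b."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = bool(result >> 31)
    z = result == 0
    c = a >= b                                      # carry set means no borrow
    v = bool(((a ^ b) & (a ^ result)) >> 31)        # signed overflow
    return n, z, c, v


def condition_passed(cond: str, n: bool, z: bool, c: bool, v: bool) -> bool:
    """Evaluate a condition suffix against the flags."""
    table = {
        "EQ": z, "NE": not z,
        "CS": c, "CC": not c,
        "MI": n, "PL": not n,
        "VS": v, "VC": not v,
        "GE": n == v, "LT": n != v,
        "GT": (not z) and (n == v), "LE": z or (n != v),
        "AL": True,
    }
    return table[cond]


def example(r2: int, r3: int, r5: int) -> int:
    """if (r2 != 10) then r5 = r5 + 10 - r3, using conditional execution."""
    flags = cmp_flags(r2, 10)                # CMP   r2, #10
    if condition_passed("NE", *flags):       # ADDNE r5, r5, #10
        r5 = (r5 + 10) & MASK32
    if condition_passed("NE", *flags):       # SUBNE r5, r5, r3
        r5 = (r5 - r3) & MASK32
    return r5
```

Both conditional instructions test the same flags because ADD and SUB without the S suffix leave the flags unchanged.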
Pulse Width Modulation (PWM)
– It is a mechanism for controlling an analog variable through a rectangular digital signal.
– The on-off behavior in PWM changes the average value of the signal, i.e., $\frac{1}{T}\int_0^T f(t)\,dt = t_{ON} V_H + (1 - t_{ON}) V_L$, where $t_{ON}$ is the on-time as a fraction of the period $T$ (the duty cycle)

– It is used in different applications including communication, automatic control, data encoding for transmission, etc.
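
The average-value relation can be checked with a short calculation. This Python sketch is illustrative only; it assumes the on-time is expressed as a fraction of the period (the duty cycle), and the function names are hypothetical.

```python
def pwm_average(duty: float, v_high: float, v_low: float = 0.0) -> float:
    """Average of an ideal PWM signal: duty * V_H + (1 - duty) * V_L."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0 and 1")
    return duty * v_high + (1.0 - duty) * v_low


def sampled_average(duty: float, v_high: float, v_low: float = 0.0,
                    samples: int = 1000) -> float:
    """Average one period sampled at `samples` equally spaced points."""
    on_samples = round(duty * samples)
    total = on_samples * v_high + (samples - on_samples) * v_low
    return total / samples


if __name__ == "__main__":
    # A 5 V signal at 25% duty cycle averages to a quarter of 5 V.
    print(pwm_average(0.25, 5.0), sampled_average(0.25, 5.0))
```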
PWM: Extraction of average value
• Using Low-pass filter

• Used as a digital to analog converter


• An LPF is not always necessary, since some embedded systems (ES) have a built-in LPF
• Applications: control of DC motors, heaters, lamps, etc.
• Different PWM functions are available on different ES development boards.
Description of Analog World
Physical variable → Transducer → ADC → Digital system → DAC → Actuator → Control of physical variable
Digital to Analog Conversion


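[Figure: DAC with reference voltage $V_R$, digital inputs D3 to D0 (D), and analog output $V_o$.]

$V_o \propto D$, i.e. $V_o = kD$
k = resolution or step size: how much the output voltage changes with respect to D

Resolution $\Delta = V_R / (2^N - 1)$ for an N-bit DAC

% resolution $= (\Delta / V_R) \times 100$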
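
A quick numeric check of the resolution formulas, written as a Python sketch. The function names are illustrative, and `dac_output` assumes the ideal relation Vo = kD with k equal to the step size.

```python
def dac_resolution(v_ref: float, n_bits: int) -> float:
    """Step size: V_R / (2^N - 1) for an N-bit DAC."""
    if n_bits < 1:
        raise ValueError("n_bits must be at least 1")
    return v_ref / (2 ** n_bits - 1)


def dac_percent_resolution(v_ref: float, n_bits: int) -> float:
    """Step size as a percentage of the reference voltage."""
    return dac_resolution(v_ref, n_bits) / v_ref * 100.0


def dac_output(code: int, v_ref: float, n_bits: int) -> float:
    """Ideal output Vo = k * D for digital input D = code."""
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range for this DAC")
    return code * dac_resolution(v_ref, n_bits)


if __name__ == "__main__":
    # 4-bit DAC with a 15 V reference: 15 steps of 1 V each.
    print(dac_resolution(15.0, 4), dac_output(0b1000, 15.0, 4))
```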
• Example of DAC:
Clock → Binary counter → D/A converter → CRO

• Types of DAC
– Weighted resistor:
• Easier to implement
• Issues

– Resistive ladder:
• More practical and easier to understand
• Complex circuitry
Weighted resistor type DAC
• For an n-bit DAC, the resistor values are R, 2R, 4R, …, $2^{n-1}$R.
• The current through each resistor is inversely proportional to its resistance.
• Finally, an op-amp adds all the currents and converts the sum to a voltage at the output.

• Drawback: demands precision resistors
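
The current-summing idea can be sketched in Python as follows. Several details are assumptions not stated on the slide: the most significant bit is taken to drive the smallest resistor R, each set bit switches its resistor to the reference voltage, and the op-amp is modelled as an ideal inverting summing amplifier with feedback resistor `r_feedback`.

```python
def weighted_resistor_dac(bits: list, v_ref: float, r: float,
                          r_feedback: float) -> float:
    """Ideal output of a weighted-resistor DAC; bits are given MSB first.

    Bit i (0 = MSB) drives a resistor of value r * 2**i, so its current
    is v_ref / (r * 2**i) when the bit is 1 and zero otherwise.
    """
    total_current = 0.0
    for i, bit in enumerate(bits):
        if bit not in (0, 1):
            raise ValueError("bits must be 0 or 1")
        total_current += bit * v_ref / (r * 2 ** i)
    # The inverting summing amplifier converts the summed current to a voltage.
    return -r_feedback * total_current


if __name__ == "__main__":
    print(weighted_resistor_dac([1, 0, 0, 0], v_ref=5.0, r=1000.0, r_feedback=1000.0))
```

Because each bit's contribution depends directly on its resistor value, any resistor tolerance error shows up in the output, which is the precision drawback noted above.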


Resistive ladder type DAC

[Figure: resistive ladder network; the output goes to an op-amp configured as a voltage follower.]

Case I: Input is 1000:
