ICT1012 Computer Systems
Lecture 6: Inner Workings of the Central Processing Unit
Lakshman Jayaratne
Learning Objectives
Computer architecture
Components of a simple central processing unit:
o registers, ALU, control unit and buses
Other hardware components of a computer:
o buses, clocks, peripheral devices, memory
Features of computers
o Speed and reliability
Components and CPU registers
Memory organization
Fetch-decode-execute cycle and its use to execute instructions in a simple computer
Hardware Components of a Typical Computer
Central Processing Unit (CPU)
Peripheral Devices
Memory
Buses allow components to pass data to each other
Hardware Components of a Typical Computer - CPU
Central Processing Unit (CPU)
Performs the basic operations
Consists of two parts:
o Arithmetic / Logic Unit (ALU): data manipulation
o Control Unit: coordinates the machine's activities
Central Processing Unit (CPU)
Fetches, decodes and executes program instructions
Two principal parts of the CPU:
Arithmetic-Logic Unit (ALU)
o Connected to registers and memory by a data bus
o All three comprise the datapath
Control unit
o Sends signals to CPU components to perform sequenced operations
CPU: Registers, ALU and Control Unit
Registers
Hold data that can be readily accessed by the CPU
Implemented using D flip-flops
o A 32-bit register requires 32 D flip-flops
Arithmetic-logic unit (ALU)
Carries out logical and arithmetic operations
Often affects the status register (e.g., overflow, carry)
Operations are controlled by the control unit
Control unit (CU)
Policeman or traffic manager
Determines which actions to carry out according to the values in a program counter register and a status register
Hardware Components of a Typical Computer - Memory
Main Memory
Holds programs and data
Stores bits in fixed-sized chunks: word (8, 16, 32 or 64 bits)
Each word has a unique address
The words can be accessed in any order: random-access memory or RAM
Memory
Consists of a linear array of addressable storage cells
A memory address is represented by an unsigned integer
Can be byte-addressable or word-addressable
o Byte-addressable: each byte has a unique address
o Word-addressable: a word (e.g., 4 bytes) has a unique address
Memory: Example
The memory word size of a machine is 16 bits
A 4M x 16 RAM chip gives us 4 mega (2^22) 16-bit memory locations
o 4M = 2^2 * 2^20 = 2^22 = 4,194,304 unique locations (each location contains a 16-bit word)
o Memory locations range from 0 to 4,194,303 in unsigned integers
2^N addressable units of memory require N bits to address each location
o Thus, the memory bus of this system requires at least 22 address lines
o The address lines count from 0 to 2^22 - 1 in binary
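The address-line count in this example can be checked with a few lines of Python. This is a minimal sketch; the helper name `address_bits` is an assumption made for illustration and does not come from the lecture.

```python
def address_bits(num_locations: int) -> int:
    """Return the smallest N such that 2**N >= num_locations."""
    return (num_locations - 1).bit_length()


locations = 4 * 2**20               # 4M locations = 2^2 * 2^20
print(locations)                    # 4194304
print(address_bits(locations))      # 22
print(locations - 1)                # highest address: 4194303
```

The same helper gives 8 address bits for a 256-cell memory, which is the size used in the example machine later in the lecture.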
Hardware Components of a Typical Computer Peripheral Devices that Communicate with the Outside World
Input/Output (I/O)
Input: keyboard, mouse, microphone, scanner, sensors (camera, infra-red), punch-cards
Output: video, printer, audio speakers, etc.
Communication
modem, ethernet card
Hardware Components of a Typical Computer Peripheral Devices that Store Data Long Term
Secondary (mass) storage
Stores information for long periods of time as files
Examples: hard drive, floppy disk, tape, CD-ROM (Compact Disk Read-Only Memory), flash drive, DVD (Digital Video/Versatile Disk)
Hardware Components of a Typical Computer Buses
Buses
Used to share data between system components inside and outside the CPU
Set of wires (lines) that act as a shared path and allow parallel movement of bits
[Figure: buses connecting the peripheral devices, the Central Processing Unit (CPU) and memory]
Typical Bus Transactions
Sending an address (for performing a read or write)
Transferring data from memory to register and vice versa
Transferring data for I/O reads and writes from peripheral devices
Buses
Physically a bus is a group of conductors that allows all the bits in a binary word to be copied from a source component to a destination component
Buses move binary values inside the CPU between registers and other components
Buses are also used outside the CPU, to copy values between the CPU registers and main memory, and between the CPU registers and the I/O sub-system
Types of Buses: Source and Destination
Point-to-point: connects two specific components
Multi-point: a shared resource that connects several components
o access to it is controlled through protocols, which are built into the hardware
Types of Buses: Contents
Data bus: conveys bits from one device to another
Control bus: determines the direction of data flow and when each device can access the bus
Address bus: determines the location of the source or destination of the data
Clock
Every computer contains at least one clock that synchronizes the activities of its components
A fixed number of clock cycles are required to carry out each data movement or computational operation
The clock frequency determines the speed of all operations
o Measured in megahertz or gigahertz
Generally the term clock refers to the CPU (master) clock
Buses can have their own clocks which are usually slower
Most machines are synchronous
Controlled by a master clock signal
Registers must wait for the clock to tick before loading new data
Clock Speed (I)
Clock cycle time is the reciprocal of clock frequency
Example: an 800 MHz clock has a cycle time of 1.25 ns
o 1/800,000,000 = 0.00000000125 = 1.25 * 10^-9 seconds
Clock speed and CPU performance
The CPU time required to run a program is given by the general performance equation:
o CPU time = seconds/program = (instructions/program) * (average cycles/instruction) * (seconds/cycle)
Clock Speed (II)
Therefore, we can improve CPU throughput when we reduce
o the number of instructions in a program
o the number of cycles per instruction
o the number of nanoseconds per clock cycle
But, in general
o Multiplication takes longer than addition
o Floating point operations require more cycles than integer operations
o Accessing memory takes longer than accessing registers
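The three factors listed above multiply together to give the CPU time. The sketch below is illustrative only: the function names and the workload figures (2 million instructions at 4 cycles each) are assumptions, not values from the lecture. Only the 800 MHz clock comes from the slides.

```python
def cycle_time_ns(frequency_hz: float) -> float:
    """Clock cycle time in nanoseconds: the reciprocal of the clock frequency."""
    return 1e9 / frequency_hz


def cpu_time_s(instructions: int, cycles_per_instruction: float,
               frequency_hz: float) -> float:
    """CPU time = instructions * cycles per instruction * seconds per cycle."""
    return instructions * cycles_per_instruction / frequency_hz


print(cycle_time_ns(800e6))               # 1.25
# Hypothetical workload: 2 million instructions, 4 cycles each, at 800 MHz.
print(cpu_time_s(2_000_000, 4, 800e6))    # 0.01
```

Halving any one of the three factors halves the CPU time, which is why each appears in the list of things to reduce.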
Features of Computers: Speed and Reliability
Speed
CPU speed
System-clock / Bus speed
Memory-access speed
Peripheral device speed
Reliability
CPU Speed
CPU clock speed: in cycles per second ("hertz")
Example: 700MHz Pentium III, 3GHz Pentium IV
but different CPU designs do different amounts of work in one clock cycle
Other measures of speed
o flops (floating-point operations per second)
o mips (million instructions per second)
System-Clock / Bus Speed
Speed of communication between CPU, memory and peripheral devices
Depends on main board design
Example:
o Intel 1.50GHz Pentium-4 works on a 400MHz bus speed
Memory-Access Speed
RAM
about 60ns (1 nanosecond = a billionth of a second), and getting faster
may be rated with respect to bus speed (e.g., PC-100)
Cache memory
faster than main memory (about 20ns access speed), but more expensive
contains data which the CPU is likely to use next
Peripheral Device Speed
Mass storage
Examples:
o 3.5in 1.4MB floppy disk: about 200kb/sec at 300 rpm (revolutions per minute)
o Hard drive: up to 160 GB of storage, average seek time about 6 milliseconds, and 7,200 rpm
Communications
Examples: modems at 56 kilobits per second, and network cards at 10 or 100 megabits per second
I/O
Examples: ISA, PCI, IDE, SCSI, ATA, USB, etc....
Cache Memory and Virtual Memory
Cache memory: random access memory that a processor can access more quickly than regular RAM
Virtual memory: an extension of RAM using the hard disk
o allows the computer to behave as though it has more memory than what is physically available
Interrupts and Exceptions
Events that alter the normal execution of a program
Exceptions are triggered within the processor
o Arithmetic errors, overflow or underflow
o Invalid instructions
o User-defined break points
Interrupts are triggered outside the processor
I/O requests
Each type of interrupt or exception is associated with a procedure that directs the actions of the CPU
Fetch-decode-execute Cycle
A computer runs programs by performing fetch-decode-execute cycles
o fetch next instruction from memory (word pointed to by PC) and place in IR
o decode instruction in the IR to determine type
o execute instruction
o go to the next instruction (next word in memory)
Example: instruction word at mem[PC] is 0x20A9FFFD
o 001000 00101 01001 1111111111111101
o Opcode 8 is add immediate, source reg is $5, target is reg $9, add amount is -3
o Send reg $5 and -3 to ALU, add them, put result in reg $9
o PC = PC + 4
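The field split in this example can be reproduced in code. The sketch below assumes the 6/5/5/16-bit layout shown in the binary line (opcode, source register, target register, 16-bit immediate); the function name `decode_itype` is illustrative. The immediate 0xFFFD is sign-extended, which is why the add amount is -3 rather than 65533.

```python
def decode_itype(word: int):
    """Split a 32-bit instruction word into opcode, source, target, immediate."""
    opcode = (word >> 26) & 0x3F     # top 6 bits
    source = (word >> 21) & 0x1F     # next 5 bits
    target = (word >> 16) & 0x1F     # next 5 bits
    immediate = word & 0xFFFF        # low 16 bits
    if immediate & 0x8000:           # sign-extend the 16-bit immediate
        immediate -= 0x10000
    return opcode, source, target, immediate


print(decode_itype(0x20A9FFFD))      # (8, 5, 9, -3)
```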
Accessing Memory (I)
Every memory access needs an address word to be sent from CPU to memory
Address range is 0x00000000 to 0xFFFFFFFF
o about 4 billion bytes of addressable space
Addresses output by the CPU go to the Memory Address Register (MAR)
During a fetch access, the PC value is copied to MAR
During a load/store access, a computed address from the ALU is copied to MAR
Accessing Memory (II)
Why compute load/store addresses?
32 (instruction bits) - 6 (opcode bits) = 26 (available bits)
o insufficient to hold a full memory address
Solution: register based addressing
o use the 26 bits to specify a base address GPR, a target GPR, plus a 16-bit signed offset
o ALU computes memory reference address on the fly as: MAR = base GPR + offset
o target GPR receives/supplies memory data
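The address computation MAR = base GPR + offset can be sketched as follows. The function name and the sample register values are assumptions for illustration. The 16-bit offset is treated as signed, so it can point below the base address as well as above it.

```python
def effective_address(base: int, offset16: int) -> int:
    """Return base + sign-extended 16-bit offset, wrapped to 32 bits."""
    if offset16 & 0x8000:            # negative offset in two's complement
        offset16 -= 0x10000
    return (base + offset16) & 0xFFFFFFFF


# Hypothetical base register value at the start of the data segment.
print(hex(effective_address(0x10000000, 0x0008)))   # 0x10000008
print(hex(effective_address(0x10000000, 0xFFFC)))   # 0xffffffc
```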
Memory Segments
Memory is organized into segments, each with its own purpose

memory address   segment                purpose
0x00000000       reserved for OS        reserved for the Operating System (OS) kernel
0x00400000       text segment           user's code
0x10000000       data segment (heap)
                 (free space)           grows and shrinks as stack/data segments change
                 stack segment
0x80000000       reserved for OS        kernel code and data
0xFFFFFFFF       (end of address space)
Text Segment
Starts at memory address 0x00400000
runs up to address 0x0FFFFFFF
Contains user's executable program code (often called the code segment)
PC register value is a CPU reference into this memory segment
Data Segment
Starts at memory address 0x10000000
expands upwards towards stack
Contains program's static data, i.e., data and variables whose location in memory is fixed (and known to the assembler)
o In C: global variables, string constants
o In Java: public, static objects
Stack Segment
Starts at memory address 0x7FFFFFFF
grows in the direction of decreasing memory addresses ( i.e., towards the data segment)
Contains system stack
Used for temporary storage of:
o local variables of functions
o function parameter values
o return addresses of functions
o saved register values
Heap
Technically part of data segment
located at end of data segment, after all static data
Empty at start of program execution
Dynamically allocated memory is taken from heap for program to use
Freed memory (by user or garbage collection) is returned to heap
Block Diagram of the System
A von Neumann Machine
[Figure: input and output connect through a bus to the central processing unit (Control Unit and Arithmetic Logic Unit) and to memory (Code Segment and Data Segment)]
Arithmetic Logic Unit
ALU
The part of a computer that performs all arithmetic computations, such as addition and multiplication, and all comparison operations
A typical schematic symbol for an ALU: A & B are operands; R is the output; F is the input from the Control Unit; D is an output status
Arithmetic Logic Unit
The component where data is held temporarily
Calculations occur here
It knows how to perform operations such as ADD, SUB, LOAD, STORE, SHIFT
It knows the commands that make up the machine language of the CPU
It is the calculator
Control Unit
A computer's control unit keeps things synchronized
Makes sure that the correct components are activated as the components are needed
Sends bits down control lines to trigger events
o E.g., when Add is performed, the control signal tells the ALU to Add
How do these control lines become asserted?
o Hardwired control: controllers implement this program using digital logic components
o Microprogrammed control: a small program is placed into read-only memory in the microcontroller
Control Unit: Hardwired Control
Physically connect all of the control lines to the actual machine instruction
Instructions are divided into fields and different bits are combined with various digital logic components (which drive the control line)
The control unit is implemented using hardware
The digital circuit uses inputs to generate the control signal to drive various components
Advantage: very fast
Disadvantage: instruction set and digital logic are locked
Control Unit: Microprogrammed Control
Microprogram: software stored in the CPU control unit
Converts machine instructions (binary) into control signals
One subroutine for each machine instruction
Advantage: very flexible
Disadvantage: additional layer of interpretation
Registers
A register is a single, permanent storage location within the CPU used for a PARTICULAR, defined purpose
A register is used to hold a binary value temporarily for storage, for manipulation, and/or for simple calculations
Registers have special addresses
Von Neumann Machine Model
[Figure: input data and instructions enter main memory; the CPU (Program Counter, Control Unit and ALU) is connected to main memory by a bus; output data leaves the machine]
CPU Cycle
o Fetch an instruction from the memory cell where the PC points
o Decode the instruction
o Execute the instruction
o Increment the PC
Registers
[Figure: CPU (Arithmetic/Logic Unit, Control Unit, registers R0, R1, ..., Rn) connected by a bus to main memory, secondary storage, input devices and output devices]
Registers are used to hold the data immediately applicable to the operation at hand
Main memory is used to hold the data that will be needed in the near future
Secondary storage is used to hold data that will likely not be needed in the near future
Example: Machine Architecture
Consider a machine with
256 byte Main Memory: 00-FF
16 General Purpose Registers: 0-F
16 Bit Instruction
8 Bit Integer Format (2's Complement)
8 Bit Floating Point Format
o 1 Sign Bit
o 3 Exponent Bits
o 4 Bit Mantissa
16 Instructions: 1-F
Example: Addition Operation
Memory cells: A = 1110 0111 (-25), B = 0110 1101 (109)

LOAD  R1, A
LOAD  R2, B
ADD   R0, R1, R2
STORE R0, X

Registers after execution: R1 = 11100111, R2 = 01101101, R0 = 01010100 (A + B = 84)

Load the first number from memory cell A into register R1
Load the second number from memory cell B into register R2
Add the numbers in these two registers and put the result in register R0
Store the result in R0 into the memory cell X
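The 8-bit addition can be checked with a short sketch. It assumes the operands are -25 and 109 in 8-bit two's complement (11100111 and 01101101), and the helper names are illustrative. The carry out of the top bit is discarded, as it would be in an 8-bit register.

```python
def to_signed8(value: int) -> int:
    """Interpret an 8-bit pattern as a two's complement integer."""
    return value - 256 if value & 0x80 else value


def add8(a: int, b: int) -> int:
    """Add two 8-bit patterns and keep only the low 8 bits of the sum."""
    return (a + b) & 0xFF


a, b = 0b11100111, 0b01101101        # -25 and 109
result = add8(a, b)
print(format(result, "08b"))         # 01010100
print(to_signed8(a), to_signed8(b), to_signed8(result))   # -25 109 84
```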
Block Diagram of the CPU
CPU - Central Processing Unit
MAR - Memory Address Register
IR - Instruction Register
MDR - Memory Data Register
PC - Program Counter
ALU - Arithmetic Logic Unit
Instruction Fetch
The address in the Program Counter is placed in MAR
The addressed instruction is read from memory (through the MDR) and placed into the Instruction Register
Instruction Execute
The Instruction Decoder examines the instruction in the Instruction Register and sends appropriate signals to other parts of the CPU to carry out the actions specified by the instruction. This may include:
o Reading operands from memory or registers into the Arithmetic Logic Unit,
o Enabling the circuits of the Arithmetic Logic Unit to perform arithmetic or other computations,
o Storing data values into memory or registers,
o Changing the value of the Program Counter
The CPU Cycle
The processor endlessly repeats the cycle:
fetch, execute, fetch, execute, fetch, execute, fetch, execute, fetch, execute, fetch, execute, fetch, execute, fetch, execute, fetch, execute, fetch ...
Fetch and Execute Cycle
At the beginning of each cycle the CPU presents the value of the program counter on the address bus
The CPU then fetches the instruction from main memory (possibly via a cache and/or a pipeline) via the data bus into the instruction register
Fetch and Execute Cycle
From the instruction register, the data forming the instruction is decoded and passed to the control unit
The control unit sends a sequence of control signals to the relevant function units of the CPU to perform the actions required by the instruction, such as reading values from registers, passing them to the ALU to add them together, and writing the result back to a register
Fetch and Execute Cycle
The program counter is then incremented to address the next instruction and the cycle is repeated
Instruction Set Architecture (ISA)
Instruction sets: definition and features
o Instruction types
o Operand organization
o Number of operands and instruction length
o Addressing
o Instruction execution: pipelining
Features of two machine instruction sets (CISC and RISC)
Instruction format
Instruction Set Architecture (ISA)
Machine instructions
Opcodes and operands
High level languages
o Hide detail of the architecture from the programmer
o Easier to program
Why learn computer architectures and assembly language?
o To understand how the computer works
o To write more efficient programs
Instruction Set Architecture (ISA)
Instruction sets are differentiated by
Instructions
o types of instructions
o instruction length and number of operands
Operands
o type (addresses, numbers, characters) and access mode
o location (CPU or memory)
o organization (stack or register based)
o number of addressable registers
Memory organization
o byte- or word-addressable
CPU instruction execution
o with/without pipelining
Instruction Set Architecture (ISA)
The instruction set format is critical to the machine's architecture
Performance of instruction set architectures is measured by
o Main memory space occupied by a program
o Instruction complexity
o Instruction length (in bits)
o Total number of instructions
Instruction Set Architecture (ISA)
Instruction types
Operand organization
Number of operands and instruction length
Addressing
Instruction execution: pipelining
Instruction Set Architecture (ISA)
An instruction set, or instruction set architecture (ISA), describes the aspects of a computer architecture visible to a programmer, including the native datatypes, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O (if any)
An ISA includes a specification of the set of all binary codes (opcodes) that are the native form of commands implemented by a particular CPU design
The set of opcodes for a particular ISA is also known as the machine language for the ISA
Instruction Set Architecture (ISA)
ISAs commonly implemented in hardware
o Alpha AXP (DEC Alpha)
o ARM (Acorn RISC Machine) (Advanced RISC Machine, now ARM Ltd)
o IA-64 (Itanium)
o MIPS
o Motorola 68k
o PA-RISC (HP Precision Architecture)
o IBM POWER
o PowerPC
o SPARC
o SuperH
o VAX (Digital Equipment Corporation)
o x86 (IA-32, Pentium, Athlon) (AMD64, EM64T)
Machine Instructions
Data Transfer: transfer data between registers and memory cells
Arithmetic/Logic Operations: perform addition, AND, OR, XOR, etc.
Control Operations: control the execution of the program
Data Transfer Instructions
1. L R, A: LOAD the register R with the content of memory cell A
2. LI R, I: LOAD the register R with I (I is called an immediate number)
3. ST R, A: STORE the content of the register R to the memory cell whose address is A
4. LR R1, R2: LOAD the register R1 with the content of the register R2
Example: Data Transfer Instructions
Swap the content of two memory cells 30(16) and 40(16)

L  1, 30    /* Load R1 with the content in memory cell 30 */
L  2, 40    /* Load R2 with the content in memory cell 40 */
ST 1, 40    /* Store R1 to 40 */
ST 2, 30    /* Store R2 to 30 */

Before: cell 30 = 01101101, cell 40 = 10011010
After the two loads: R1 = 01101101, R2 = 10011010
Example: Data Transfer Instructions
After the swap program (L 1, 30; L 2, 40; ST 1, 40; ST 2, 30) has run:
o cell 30 = 10011010, cell 40 = 01101101
o R1 = 01101101, R2 = 10011010
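The swap can be traced in a few lines of Python. This is a minimal sketch: memory and registers are modelled as dictionaries, and the helper names `load` and `store` are illustrative stand-ins for the L and ST instructions.

```python
memory = {0x30: 0b01101101, 0x40: 0b10011010}
registers = {}


def load(register: int, address: int) -> None:
    """L R, A: copy the content of memory cell A into register R."""
    registers[register] = memory[address]


def store(register: int, address: int) -> None:
    """ST R, A: copy the content of register R into memory cell A."""
    memory[address] = registers[register]


load(1, 0x30)
load(2, 0x40)
store(1, 0x40)
store(2, 0x30)
print(format(memory[0x30], "08b"), format(memory[0x40], "08b"))   # 10011010 01101101
```

Two registers are needed because each store overwrites a cell; loading both values first keeps a copy of each.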
Arithmetic/Logic Instructions (I)
Arithmetic Instructions
5. ADD R0, R1, R2
ADD the numbers in R1 and R2, represented in 2's complement, and place the result in R0
6. AFP R0, R1, R2
ADD the numbers in R1 and R2, represented in floating-point, and place the result in R0
Arithmetic/Logic Instructions (I)
Example: Addition

Memory before: A0 = 11100111 (= -25), A1 = 01101101 (= 109)

L   1, A0
L   2, A1
ADD 0, 1, 2
ST  0, X0

Memory after: X0 = 01010100 (= 84)
Registers: R0 = 01010100, R1 = 11100111, R2 = 01101101
Arithmetic/Logic Instructions (II)
Logic Instructions
7. OR R0, R1, R2
OR the bit patterns in R1 and R2 and place the result in R0
8. AND R0, R1, R2
AND the bit patterns in R1 and R2 and place the result in R0
9. XOR R0, R1, R2
XOR the bit patterns in R1 and R2 and place the result in R0
Arithmetic/Logic Instructions (II)
Example: Mask the first 4 bits of the binary string in memory A0

Memory: A0 = 10011011

L   1, A0
LI  2, 0F
AND 0, 1, 2
ST  0, X0

Registers: R1 = 10011011, R2 = 00001111, R0 = 00001011
Memory after: X0 = 00001011
Arithmetic/Logic Instructions (II)
Example: Masking

Memory: A0 = 10011001, A1 = 11011011

L   1, A0
L   2, A1
LI  3, 0F
LI  4, F0
AND 1, 1, 3
AND 2, 2, 4
OR  0, 1, 2
ST  0, X0

Registers before the AND instructions: R1 = 10011001, R2 = 11011011, R3 = 00001111, R4 = 11110000
Registers after: R1 = 00001001, R2 = 11010000, R0 = 11011001
Memory after: X0 = 11011001
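Both masking examples map directly onto Python's bitwise operators. The sketch below uses the bit patterns from the slides; the variable names are illustrative.

```python
# Keep only the low 4 bits of a byte (mask 0F).
print(format(0b10011011 & 0x0F, "08b"))   # 00001011

# Combine the low 4 bits of A0 with the high 4 bits of A1.
a0, a1 = 0b10011001, 0b11011011
low_part = a0 & 0x0F                      # 00001001
high_part = a1 & 0xF0                     # 11010000
combined = low_part | high_part
print(format(combined, "08b"))            # 11011001
```

AND with a mask clears every bit where the mask holds 0, and OR then merges the two halves because their set bits cannot overlap.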
Arithmetic/Logic Instructions (III)
Bit String Operating Instructions
B. RR R, I
ROTATE the bit pattern in R to the right I times. Each time, place the bit that started at the low-order end at the high-order end
Example: RR 0, 02
[Figure: the original string and the resulting string after the rotation]
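An 8-bit rotate right can be sketched as below. The function name and the sample bit pattern 11011101 are assumptions chosen for illustration, not values taken from the slide.

```python
def rotate_right8(value: int, count: int) -> int:
    """Rotate an 8-bit pattern right; low-order bits re-enter at the high end."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF


# Hypothetical register content, rotated right twice as in RR 0, 02.
print(format(rotate_right8(0b11011101, 2), "08b"))   # 01110111
```

Unlike a shift, a rotation loses no bits: rotating eight times returns the original pattern.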
Control Instructions
E. JMP R, A
JUMP to the instruction located in the memory cell A if the bit pattern in R is equal to the one in R0
F. HALT
HALT the execution
Example: Control Instructions
Address  Instruction
30       LI  0, 0A
32       LI  1, 00
34       LI  2, 01
36       ADD 3, 1, 2
38       JMP 3, 3E
3A       LR  1, 3
3C       JMP 0, 36
3E       HALT

Registers after the first ADD: R0 = 00001010, R1 = 00000000, R2 = 00000001, R3 = 00000001

Flowchart: R0 = 0A, R1 = 00, R2 = 01; then R3 = R1 + R2; if R3 = R0 (Yes), halt; otherwise (No), R1 = R3 and repeat from the addition
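The loop in this program can be mirrored in Python to see how many times the addition runs. This is a sketch of the control flow only; the variable names follow the registers, and the `iterations` counter is added for illustration.

```python
r0, r1, r2 = 0x0A, 0x00, 0x01        # LI 0, 0A / LI 1, 00 / LI 2, 01
iterations = 0
while True:
    r3 = (r1 + r2) & 0xFF            # ADD 3, 1, 2
    iterations += 1
    if r3 == r0:                     # JMP 3, 3E: jump to HALT when R3 equals R0
        break
    r1 = r3                          # LR 1, 3
    # JMP 0, 36: R0 always equals itself, so this jump is unconditional

print(iterations, r1, r3)            # 10 9 10
```

The program counts from 1 up to the value in R0, so the addition at address 36 executes ten times before HALT is reached.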
The CPU Cycle
[Figure: the CPU (Program Counter, Instruction Register, Control Unit, circuits, General Purpose Registers and ALU) connected by an 8 bit bus to main memory, which holds the Code Segment starting at address 30 and the Data Segment]
Operand Organization
Three choices
o Accumulator architecture
o General Purpose Register (GPR) architecture
o Stack architecture
Operand Organization Accumulator Architecture
One operand of a binary operation is implicitly in the accumulator
Advantage
o Minimizes the internal complexity of the machine
o Allows for very short instructions
Disadvantage
o Memory traffic is very high
o Programming is cumbersome
Operand Organization General Purpose Register (GPR) Architecture
Uses sets of general purpose registers
Advantage
o Register sets are faster than memory
o Easy for compilers to deal with
o Due to low costs, large numbers of these registers are being added
Disadvantage
o Results in longer instructions (longer fetch and decode times)
Operand Organization General Purpose Register (GPR) Architecture
Three types
Memory-memory
o may have two or three operands in memory
o an instruction may perform an operation without requiring any operand to be in a register
Register-memory
o at least one operand must be in a register and one in memory
Load-store
o requires data to be moved into registers before any operation is performed
Operand Organization Stack Architecture
Uses a stack to execute instructions Operations:
o PUSH: put a value on top of the stack
o POP: read top value and move down the stack pointer
Example:
POP PUSH 9
5 9 7 2
Operand Organization Stack Architecture
Instructions implicitly refer to values at the top of the stack
o data can be accessed only from the top of the stack, one word at a time
Advantage
o Good code density
o Simple model for evaluation of expressions
Disadvantage
o Restricts the sequence of operand processing
o Execution bottleneck (the stack is located in memory)
Operand Organization Stack Architecture
Stack architecture requires us to think about arithmetic expressions in a new way
We are used to infix notation
o E.g., Z = X + Y
Stack arithmetic requires postfix notation:
o E.g., Z = XY+
o Postfix notation is also known as Reverse Polish Notation
Stack Architecture Postfix Notation
Postfix notation doesn't need parentheses
E.g., the infix expression Z = (X * Y) + (W * U) is the postfix expression Z = X Y * W U * +
Calculating Z = X Y * W U * + in a stack ISA:

PUSH X
PUSH Y
MULT
PUSH W
PUSH U
MULT
ADD
POP  Z

Binary operators pop the two operands on the stack top, and push the result on the stack
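The stack program for Z = X Y * W U * + can be run on a tiny interpreter. This is a sketch under stated assumptions: the function `run_stack_program`, the tuple encoding of instructions, and the sample values for X, Y, W and U are all made up for illustration.

```python
def run_stack_program(program, variables):
    """Execute PUSH/POP/MULT/ADD instructions against a Python list used as a stack."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(variables[args[0]])
        elif op == "POP":
            variables[args[0]] = stack.pop()
        elif op in ("MULT", "ADD"):
            right = stack.pop()          # binary operators pop two operands...
            left = stack.pop()
            stack.append(left * right if op == "MULT" else left + right)  # ...and push the result
        else:
            raise ValueError(f"unknown instruction {op}")
    return variables


program = [("PUSH", "X"), ("PUSH", "Y"), ("MULT",),
           ("PUSH", "W"), ("PUSH", "U"), ("MULT",),
           ("ADD",), ("POP", "Z")]
values = run_stack_program(program, {"X": 2, "Y": 3, "W": 4, "U": 5})
print(values["Z"])                       # 26
```

MULT and ADD carry no operands in the program: their inputs are always the top two stack entries, which is what keeps stack-machine instructions short.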
Number of Operands and Instruction Length
The number of operands in each instruction affects the length of the instruction Instruction length can be
o Fixed: quick to decode but wastes space
o Variable: more complex to decode but saves space
All architectures limit the number of operands allowed per instruction
o Stack architecture has 0 or 1 explicit operand
o Accumulator architecture has 0 or 1 explicit operand
o GPR architecture has 1, 2 or 3 operands
Number of Operands - Example
Calculating the infix expression Z = X * Y + W * U
One operand (the accumulator is the destination for the result of the instruction)

LOAD  X
MULT  Y
STORE TEMP
LOAD  W
MULT  U
ADD   TEMP
STORE Z

Two operands (the first operand is often the destination for the result of the instruction)

LOAD  R1,X
MULT  R1,Y
LOAD  R2,W
MULT  R2,U
ADD   R1,R2
STORE Z,R1

Three operands

MULT R1,X,Y
MULT R2,W,U
ADD  Z,R1,R2
Coding Instruction
16 bit Instruction (2 bytes): High-Order Byte, Low-Order Byte
o Bits 0-3: OpCode
o Bits 4-15: Operands
Example: LI 4, 7C
o OpCode 2 (LI) = 0010, register 4 = 0100, immediate value 7C = 0111 1100
The machine code 0010010001111100 represents the instruction LI 4, 7C
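Packing and unpacking the 16-bit instruction word can be sketched as follows. The function names are illustrative, and the layout assumed is the one in the LI example: the opcode in the leftmost 4 bits, then a 4-bit register, then an 8-bit value in the low-order byte.

```python
def encode_format1(opcode: int, register: int, immediate: int) -> int:
    """Pack opcode (4 bits), register (4 bits) and immediate (8 bits) into 16 bits."""
    return ((opcode & 0xF) << 12) | ((register & 0xF) << 8) | (immediate & 0xFF)


def decode_fields(word: int):
    """Split a 16-bit instruction into (opcode, register, low-order byte)."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, word & 0xFF


word = encode_format1(0x2, 0x4, 0x7C)    # LI 4, 7C
print(format(word, "016b"))              # 0010010001111100
print(decode_fields(word))               # (2, 4, 124)
```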
Instruction Formats
16 bit Instruction (2 bytes)
Format 1: Opcode | Register | Immediate Value
Format 2: Opcode | Register | Memory Address
Format 3: Opcode | Register | Register | Register
Format 4: Opcode | Unused (zero) | Register | Register
Format 1 Instruction
Format 1: Register | Immediate Value

Opcode  Instruction  Meaning
2       LI R, I      Load Immediate
A       RL R, I      Rotate Left
B       RR R, I      Rotate Right
C       SL R, I      Shift Left
D       SR R, I      Shift Right

1. Copy the bit pattern in the low-order byte into the specified register, or
2. Shift/rotate the bits in the specified register the number of places specified in the low-order byte.
Format 2 Instruction
Format 2: Register | Memory Address

Opcode  Instruction  Meaning
1       L R, A       Load from Memory
3       ST R, A      Store to Memory
E       JMP R, A     Conditional Jump

1. Load - Copy the value stored at the Memory Address into the specified register
2. Store - Copy the value in the specified register to the Memory Address
3. Jump - Compare the contents of the specified register and the contents of Register 0. If equal, reset the Program Counter to the Memory Address
Format 3 Instruction
Format 3: Register | Register | Register

Opcode  Instruction       Meaning
5       ADD R0, R1, R2    Add (2's complement)
6       AFP R0, R1, R2    Add (floating-point)
7       OR  R0, R1, R2    OR
8       AND R0, R1, R2    AND
9       XOR R0, R1, R2    XOR

Apply the operation to the two values in the registers specified in the Low-Order byte and store the result in the register specified in the High-Order byte
Format 4 Instruction
Format 4: Unused (zero) | Register | Register

Opcode  Instruction  Meaning
4       LR R1, R2    Load Register

Copy the value in the second register specified in the Low-Order byte to the first register specified in the Low-Order byte
Full Instruction Set
1. L   R, A
2. LI  R, I
3. ST  R, A
4. LR  R1, R2
5. ADD R0, R1, R2
6. AFP R0, R1, R2
7. OR  R0, R1, R2
8. AND R0, R1, R2
9. XOR R0, R1, R2
A. RL  R, I
B. RR  R, I
C. SL  R, I
D. SR  R, I
E. JMP R, A
F. HALT
Examples of OpCode
Name  Comment              Syntax
TRANSFER
MOV   Move (copy)          MOV Dest,Source
PUSH  Push onto stack      PUSH Source
POP   Pop from stack       POP Dest
IN    Input                IN Dest, Port
OUT   Output               OUT Port, Source
ARITHMETIC
ADD   Add                  ADD Dest,Source
SUB   Subtract             SUB Dest,Source
DIV   Divide (unsigned)    DIV Op
MUL   Multiply (unsigned)  MUL Op
INC   Increment            INC Op
DEC   Decrement            DEC Op
CMP   Compare              CMP Op1,Op2
Examples of OpCode
Name  Comment                    Syntax
LOGIC
NEG   Negate (two's complement)  NEG Op
NOT   Invert each bit            NOT Op
AND   Logical and                AND Dest,Source
OR    Logical or                 OR Dest,Source
XOR   Logical exclusive or       XOR Dest,Source
JUMPS
CALL  Call subroutine            CALL Proc
JMP   Jump                       JMP Dest
JE    Jump if Equal              JE Dest
JZ    Jump if Zero               JZ Dest
RET   Return from subroutine     RET
JNE   Jump if not Equal          JNE Dest
JNZ   Jump if not Zero           JNZ Dest
Coding Program: Example
Assembler   Machine Code          Hexa
L  1, 30    0001 0001 0011 0000   1130
L  2, 40    0001 0010 0100 0000   1240
ST 1, 40    0011 0001 0100 0000   3140
ST 2, 30    0011 0010 0011 0000   3230

Memory: the machine code occupies cells 10-17, one byte per cell
Data: cell 30 = 0110 1101, cell 40 = 1001 1001
Registers after the two loads: R1 = 0110 1101, R2 = 1001 1001
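The translation from assembler text to hexadecimal machine code can be sketched for the two instructions used here. This is not a full assembler: the `OPCODES` table covers only L and ST, and the function name `assemble` and the parsing rules (hexadecimal operands separated by a comma) are assumptions for illustration.

```python
OPCODES = {"L": 0x1, "ST": 0x3}      # only the two Format 2 instructions used above


def assemble(line: str) -> int:
    """Translate a line such as 'L 1, 30' into a 16-bit machine word."""
    mnemonic, operands = line.split(maxsplit=1)
    register, address = (int(field.strip(), 16) for field in operands.split(","))
    return (OPCODES[mnemonic] << 12) | (register << 8) | address


program = ["L 1, 30", "L 2, 40", "ST 1, 40", "ST 2, 30"]
print([format(assemble(line), "04X") for line in program])
# ['1130', '1240', '3140', '3230']
```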
CPU Cycle (Machine Cycle)
FETCH, DECODE, EXECUTE
1. Fetch: retrieve the next instruction from memory (as indicated by the program counter) and then increment the program counter
2. Decode: decode the bit pattern in the instruction register
3. Execute: perform the action requested by the instruction in the instruction register
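The three steps of the machine cycle can be sketched as a loop over a 256-byte memory. This is a minimal sketch, not the lecture's implementation: it handles only opcodes 1 (L), 2 (LI), 3 (ST) and F (HALT), and it assumes that HALT is encoded as F000 and appended after the swap program so that the loop terminates.

```python
def run(memory: bytearray, pc: int) -> list:
    """Run fetch-decode-execute cycles until HALT; return the 16 registers."""
    registers = [0] * 16
    while True:
        # Fetch: read the two instruction bytes at PC, then increment the PC.
        high, low = memory[pc], memory[pc + 1]
        pc += 2
        # Decode: opcode and register from the high byte, operand from the low byte.
        opcode, reg = high >> 4, high & 0x0F
        # Execute.
        if opcode == 0x1:                # L R, A
            registers[reg] = memory[low]
        elif opcode == 0x2:              # LI R, I
            registers[reg] = low
        elif opcode == 0x3:              # ST R, A
            memory[low] = registers[reg]
        elif opcode == 0xF:              # HALT
            return registers
        else:
            raise ValueError(f"opcode {opcode:X} is not handled in this sketch")


memory = bytearray(256)
memory[0x10:0x1A] = bytes.fromhex("1130124031403230F000")   # swap program + assumed HALT
memory[0x30], memory[0x40] = 0b01101101, 0b10011001
registers = run(memory, pc=0x10)
print(format(memory[0x30], "08b"), format(memory[0x40], "08b"))   # 10011001 01101101
```

Because the PC is incremented during the fetch step, a jump instruction would only need to overwrite the PC during execute.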
Program Execution: Swap Example
PC
FETCH DECODE EXECUTE
10 11 12 13 14 15 16 17 0001 0001 0011 0000 0001 0010 0100 0000 0011 0001 0100 0000 0011 0010 0011 0000
R0 R1 R2 .. RF
L L L L
1 30 1 ,, 30 2 40 2 ,, 40
1130 1130 1240 1240 3140 3140 3230 3230
40 30 0110 1101 1001 1001
ST 1 40 ST 1 ,, 40 ST 2 30 ST 2 ,, 30
49
Execute a Program - L 1,30
PC = 10
FETCH: Instruction: 0001 0001 0011 0000 (cells 10-11)
DECODE: Operation-code: 0001, Register: 0001, Memory address: 0011 0000
EXECUTE: R1 receives the contents of address 30, so R1 = 0110 1101
PC = 12
Memory: address 30 = 0110 1101, address 40 = 1001 1001
Execute a Program - L 2,40
PC = 12
FETCH: Instruction: 0001 0010 0100 0000 (cells 12-13)
DECODE: Operation-code: 0001, Register: 0010, Memory address: 0100 0000
EXECUTE: R2 receives the contents of address 40, so R2 = 1001 1001
PC = 14
Registers: R1 = 0110 1101, R2 = 1001 1001
Memory: address 30 = 0110 1101, address 40 = 1001 1001
Execute a Program - ST 1,40
PC = 14
FETCH: Instruction: 0011 0001 0100 0000 (cells 14-15)
DECODE: Operation-code: 0011, Register: 0001, Memory address: 0100 0000
EXECUTE: The contents of R1 are stored at address 40, so address 40 = 0110 1101
PC = 16
Registers: R1 = 0110 1101, R2 = 1001 1001
Memory: address 30 = 0110 1101, address 40 = 0110 1101
Execute a Program - ST 2,30
PC = 16
FETCH: Instruction: 0011 0010 0011 0000 (cells 16-17)
DECODE: Operation-code: 0011, Register: 0010, Memory address: 0011 0000
EXECUTE: The contents of R2 are stored at address 30, so address 30 = 1001 1001
Registers: R1 = 0110 1101, R2 = 1001 1001
Memory: address 30 = 1001 1001, address 40 = 0110 1101
Coding Program: An Example
Assembler    Machine Code           Hexa
L  1,30      0001 0001 0011 0000    1130
L  2,40      0001 0010 0100 0000    1240
ST 1,40      0011 0001 0100 0000    3140
ST 2,30      0011 0010 0011 0000    3230

Registers: R1 = 0110 1101, R2 = 1001 1001
Memory after the program has run: address 30 = 1001 1001, address 40 = 0110 1101
Assembler Code for A:=23, B:=-11;
LI 1,17    LOAD 23 IN HEX INTO R1
ST 1,A     STORE VALUE AT A
LI 1,F5    LOAD -11 IN HEX INTO R1
ST 1,B     STORE VALUE AT B
Machine Code for A:=23, B:=-11;
Assembler    Hexa    Machine Code
LI 1,17      2117    00100001 00010111
ST 1,A       3180    00110001 10000000
LI 1,F5      21F5    00100001 11110101
ST 1,B       3181    00110001 10000001
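The constants 17 and F5 are the 8-bit two's complement encodings of 23 and -11. A short Python check, with hypothetical helper names `to_byte` and `from_byte`, illustrates the encoding:

```python
def to_byte(value: int) -> int:
    """Encode a signed integer in -128..127 as an 8-bit two's complement byte."""
    if not -128 <= value <= 127:
        raise ValueError("value does not fit in 8 bits")
    return value & 0xFF

def from_byte(byte: int) -> int:
    """Decode an 8-bit two's complement byte back to a signed integer."""
    return byte - 0x100 if byte & 0x80 else byte

print(f"{to_byte(23):02X} {to_byte(-11):02X}")   # prints "17 F5"
```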
Assembler Code for C:=A-B;
L 1,A        LOAD A INTO R1
L 2,B        LOAD B INTO R2
LI 3,FF      SET MASK TO FLIP B
XOR 4,2,3    FLIP B
LI 3,01      LOAD 1 INTO R3
ADD 2,3,4    ADD 1 TO FLIPPED B
ADD 3,1,2    NOW DO R3 = A + (-B)
ST 3,C       STORE R3 AT C
Machine Code for C:=A-B;
Assembler    Hexa    Machine Code
L 1,A        1180    00010001 10000000
L 2,B        1281    00010010 10000001
LI 3,FF      23FF    00100011 11111111
XOR 4,2,3    9423    10010100 00100011
LI 3,01      2301    00100011 00000001
ADD 2,3,4    5234    01010010 00110100
ADD 3,1,2    5312    01010011 00010010
ST 3,C       3382    00110011 10000010
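The subtraction is done by negating B (flip all bits, add 1) and then adding. The same steps can be followed in Python. This sketch assumes that ADD discards any carry out of the 8 bits, and the function name `subtract` is hypothetical.

```python
def subtract(a: int, b: int) -> int:
    """Compute a - b on 8-bit values using XOR and ADD only."""
    mask = 0xFF                        # LI 3,FF   : mask to flip every bit
    flipped = b ^ mask                 # XOR 4,2,3 : one's complement of B
    negated = (flipped + 1) & 0xFF     # LI 3,01 ; ADD 2,3,4 : two's complement of B
    return (a + negated) & 0xFF        # ADD 3,1,2 : A + (-B), carry discarded (assumed)

c = subtract(0x17, 0xF5)               # A = 23, B = -11
print(f"{c:02X}")                      # prints "22", which is 34 = 23 - (-11)
```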
Example Program
PROGRAM Sort;
VAR A,B,C : INTEGER;
PROCEDURE Swap (VAR X,Y : INTEGER);
VAR Temp : INTEGER;
BEGIN {Swap}
  Temp := X;
  X := Y;
  Y := Temp;
END {Swap};
BEGIN {Sort}
  C := A-B;
  IF C = 0 THEN Swap (A,B);
END {Sort}.
Assembler and Machine Code
Address  Assembler    Hexa      Address  Assembler    Hexa
30       LI 1,17      2117      48       L 1,C        1182
32       ST 1,A       3180      4A       LI 2,80      2280
34       LI 1,F5      21F5      4C       AND 3,1,2    8312
36       ST 1,B       3181      4E       LI 0,00      2000
38       L 1,A        1180      50       JMP 3,5E     E35E
3A       L 2,B        1281      52       L 1,A        1180
3C       LI 3,FF      23FF      54       L 2,B        1281
3E       XOR 4,2,3    9423      56       ST 1,TEMP    317F
40       LI 3,01      2301      58       ST 2,A       3180
42       ADD 2,3,4    5234      5A       L 2,TEMP     127F
44       ADD 3,1,2    5312      5C       ST 2,B       3281
46       ST 3,C       3382      5E       HALT         F000
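The whole listing can be executed by a small simulator of the fetch-decode-execute cycle. The sketch below is illustrative only and rests on assumptions the slides do not spell out: ADD wraps around at 8 bits, the three-register instructions use the form destination, source, source, and JMP r,xy jumps to xy when register r equals R0. The names `load` and `run` are hypothetical.

```python
WORDS = [0x2117, 0x3180, 0x21F5, 0x3181, 0x1180, 0x1281, 0x23FF, 0x9423,
         0x2301, 0x5234, 0x5312, 0x3382, 0x1182, 0x2280, 0x8312, 0x2000,
         0xE35E, 0x1180, 0x1281, 0x317F, 0x3180, 0x127F, 0x3281, 0xF000]

def load(words, start=0x30):
    """Place each 16-bit word in two consecutive memory cells."""
    memory = [0] * 256
    for i, word in enumerate(words):
        memory[start + 2 * i] = word >> 8
        memory[start + 2 * i + 1] = word & 0xFF
    return memory

def run(memory, pc=0x30, max_cycles=1000):
    """Repeat fetch, decode, execute until HALT; returns the registers."""
    reg = [0] * 16
    for _ in range(max_cycles):
        word = (memory[pc] << 8) | memory[pc + 1]        # FETCH
        pc += 2
        op, r = word >> 12, (word >> 8) & 0xF            # DECODE
        s, t, xy = (word >> 4) & 0xF, word & 0xF, word & 0xFF
        if op == 0x1:                                    # L r,xy
            reg[r] = memory[xy]
        elif op == 0x2:                                  # LI r,xy
            reg[r] = xy
        elif op == 0x3:                                  # ST r,xy
            memory[xy] = reg[r]
        elif op == 0x5:                                  # ADD r,s,t (8-bit wrap assumed)
            reg[r] = (reg[s] + reg[t]) & 0xFF
        elif op == 0x8:                                  # AND r,s,t
            reg[r] = reg[s] & reg[t]
        elif op == 0x9:                                  # XOR r,s,t
            reg[r] = reg[s] ^ reg[t]
        elif op == 0xE:                                  # JMP r,xy (assumed: jump if reg[r] == R0)
            if reg[r] == reg[0]:
                pc = xy
        elif op == 0xF:                                  # HALT
            return reg
        else:
            raise ValueError(f"unknown opcode {op:X}")
    raise RuntimeError("program did not halt")

memory = load(WORDS)
registers = run(memory)
print(f"A={memory[0x80]:02X} B={memory[0x81]:02X} C={memory[0x82]:02X}")
```

Under these assumptions the run stores 17 at A (address 80), F5 at B (address 81) and 22 at C (address 82). The sign bit of C is 0, so the AND with 80 yields 00 and the jump goes straight to HALT at 5E without executing the swap instructions.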
Code Loaded in Memory
Address  Contents (12 cells per row)
30       21 17 31 80 21 F5 31 81 11 80 12 81
3C       23 FF 94 23 23 01 52 34 53 12 33 82
48       11 82 22 80 83 12 20 00 E3 5E 11 80
54       12 81 31 7F 31 80 12 7F 32 81 F0 00
74, 80, 8C, 98   (data rows, shown empty)
The CPU Cycle
[Figure: the Control Unit (cycle status FETCH, DECODE, EXECUTE, for illustration only), the Program Counter, the Instruction Register, the General Purpose Registers and the ALU circuits are connected by an 8 bit bus to Main Memory. Main Memory holds the Code Segment (rows 30 to 54) and the Data Segment (rows 74 to 98).]
The CPU Cycle - LI 1,17
FETCH: Program Counter = 30. The bytes 21 and 17 travel over the bus, one at a time, into the Instruction Register (21 17). The Program Counter then becomes 32.
DECODE: The Control Unit decodes the instruction as LI.
EXECUTE: The value 17 is placed in register 1.
The CPU Cycle - ST 1,A
FETCH: Program Counter = 32. The bytes 31 and 80 travel over the bus, one at a time, into the Instruction Register (31 80). The Program Counter then becomes 34.
DECODE: The Control Unit decodes the instruction as ST.
EXECUTE: The value 17 travels from register 1 over the bus and is stored at address 80 in Main Memory.
The CPU Cycle - LI 1,F5
FETCH: Program Counter = 34. The bytes 21 and F5 travel over the bus, one at a time, into the Instruction Register (21 F5). The Program Counter then becomes 36.
DECODE: The Control Unit decodes the instruction as LI.
EXECUTE: The value F5 replaces 17 in register 1.
Data Segment: address 80 = 17
The CPU Cycle - ST 1,B
FETCH: Program Counter = 36. The bytes 31 and 81 travel over the bus, one at a time, into the Instruction Register (31 81). The Program Counter then becomes 38.
DECODE: The Control Unit decodes the instruction as ST.
EXECUTE: The value F5 travels from register 1 over the bus and is stored at address 81 in Main Memory.
Data Segment: address 80 = 17
The CPU Cycle - L 1,A
FETCH: Program Counter = 38. The bytes 11 and 80 travel over the bus, one at a time, into the Instruction Register (11 80). The Program Counter then becomes 3A.
DECODE: The Control Unit decodes the instruction as L.
EXECUTE: The value 17 is read from address 80 and travels over the bus to register 1.
Data Segment: address 80 = 17, address 81 = F5
The CPU Cycle and so on
The same FETCH, DECODE, EXECUTE steps repeat for the remaining instructions.
Instruction Execution - Pipelining
Some CPUs divide the fetch-decode-execute cycle into smaller steps
Instruction Level Pipelining overlaps these smaller steps for consecutive instructions in order to increase throughput
Need to balance the time taken by each pipeline stage
Instruction Level Pipelining - Example
Suppose a fetch-decode-execute cycle were broken into the following smaller steps:
1. Fetch instruction
2. Decode opcode
3. Calculate the address of operands
4. Fetch operands
5. Execute instruction
6. Store result
For every clock cycle, one small step is carried out, and the stages are overlapped
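The overlap of stages can be made visible with a short Python sketch. It assumes an ideal pipeline with no conflicts, in which instruction i enters the first stage in clock cycle i. The names `STAGES` and `schedule` are hypothetical.

```python
from typing import List, Optional

STAGES = ["Fetch instruction", "Decode opcode", "Calculate operand address",
          "Fetch operands", "Execute instruction", "Store result"]

def schedule(n_instructions: int, k: int = len(STAGES)) -> List[List[Optional[int]]]:
    """For each clock cycle, give the stage index occupied by each instruction.

    None means the instruction is not in the pipeline during that cycle.
    """
    table = []
    for cycle in range(k + n_instructions - 1):
        row = []
        for instr in range(n_instructions):
            stage = cycle - instr              # instruction i enters at cycle i
            row.append(stage if 0 <= stage < k else None)
        table.append(row)
    return table

# One column per instruction; S1..S6 are the six steps listed above.
for cycle, row in enumerate(schedule(3), start=1):
    cells = " ".join(f"S{s + 1}" if s is not None else "--" for s in row)
    print(f"cycle {cycle}: {cells}")
```

With three instructions and six stages the table has eight rows: from cycle 3 onward all three instructions are in the pipeline at once, each in a different stage.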
Instruction Level Pipelining - Speed
There are n instructions
There are k stages in the pipeline, and the time per stage is $t_p$
The first instruction requires $k \times t_p$ time to complete
The remaining $(n - 1)$ instructions emerge from the pipeline one per clock cycle
The total time to complete the remaining instructions is $(n - 1)\,t_p$
Thus, the time required to complete n instructions using a k-stage pipeline is
$(k \times t_p) + (n - 1)\,t_p = (k + n - 1)\,t_p$
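A quick numeric check of this formula in Python, using illustrative values (100 instructions, 6 stages, 1 time unit per stage) that are not from the lecture:

```python
def pipelined_time(n: int, k: int, tp: int) -> int:
    """Time for n instructions on an ideal k-stage pipeline with stage time tp."""
    first = k * tp              # the first instruction passes through all k stages
    rest = (n - 1) * tp         # each remaining instruction completes one stage time later
    return first + rest         # equals (k + n - 1) * tp

n, k, tp = 100, 6, 1            # illustrative values only
print(pipelined_time(n, k, tp)) # prints 105
```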
Instruction Level Pipelining - Speed
Speedup gained by using a pipeline:
$$\text{Speedup} = \frac{\text{time without pipeline}}{\text{time with pipeline}} = \frac{n\,k\,t_p}{(k + n - 1)\,t_p}$$
As n approaches infinity, $(k + n - 1)$ approaches n, which results in a theoretical speedup of
$$\text{Speedup} = \frac{n\,k\,t_p}{n\,t_p} = k$$
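The limit can be observed numerically. The sketch below evaluates the speedup for a 6-stage pipeline (an illustrative choice of k) as n grows. The stage time cancels out of the ratio, so it is not a parameter.

```python
def speedup(n: int, k: int) -> float:
    """Ideal speedup n*k*tp / ((k + n - 1)*tp); the stage time tp cancels."""
    return (n * k) / (k + n - 1)

for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, round(speedup(n, 6), 3))
```

For a single instruction the speedup is 1 (no gain), and for large n it approaches but never reaches k = 6.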
Instruction Level Pipelining - Issues
Assumptions
o the architecture supports fetching instructions and data in parallel
o the pipeline can be kept filled at all times (this is not always the case due to pipeline conflicts)
It may appear that more stages imply faster performance, but
o the amount of control logic increases with the number of stages
o pipeline conflicts affect the execution of instructions
Instruction Level Pipelining - Pipeline Conflicts
Resource conflicts
o One instruction is storing a value to memory while another instruction is being fetched from memory
Data dependencies
o When the not-yet-available result of one instruction is the operand of a subsequent instruction
Conditional branch statements
o Several instructions can be fetched and decoded before the execution of a preceding branch instruction is finished
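A data dependency can be spotted by comparing the register one instruction writes with the registers a later instruction reads. The Python sketch below applies this to four instructions from the C:=A-B listing. The tuple format (text, destination register, source registers), the `window` parameter and the name `raw_hazards` are hypothetical, and whether a given dependency actually stalls a pipeline depends on the pipeline design.

```python
def raw_hazards(program, window=1):
    """List (producer, consumer, register) triples.

    A triple is reported when an instruction at most `window` positions
    after the producer reads the register that the producer writes.
    """
    hazards = []
    for i, (_, dest, _) in enumerate(program):
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if dest in program[j][2]:
                hazards.append((i, j, dest))
    return hazards

# (text, destination register, source registers)
program = [
    ("XOR 4,2,3", 4, (2, 3)),
    ("LI 3,01",   3, ()),
    ("ADD 2,3,4", 2, (3, 4)),
    ("ADD 3,1,2", 3, (1, 2)),
]

for producer, consumer, register in raw_hazards(program):
    print(f"{program[consumer][0]} needs R{register} from {program[producer][0]}")
```

With the default window of 1, the sketch reports that ADD 2,3,4 needs R3 from LI 3,01 and that ADD 3,1,2 needs R2 from ADD 2,3,4.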
Thank You
SCS1003 - Computer Systems