
SAP1

The document discusses the CPU fetch-decode-execute cycle and components. The CPU fetches instructions from memory, decodes them, and executes them by performing operations with the arithmetic logic unit and registers. The CPU communicates with memory and other components via buses that transmit address, data, and control signals. The fetch-decode-execute cycle repeats as the CPU processes instructions sequentially from the program counter.

Computer System Architecture

(CPET428)

LECTURE#3

Professor: Michael C. Olivo


CPU fetches, decodes, and executes

 The computer’s CPU fetches, decodes, and executes
program instructions.
◦ This is called the Fetch/Decode/Execute Cycle, and the
CPU repeats this cycle ad nauseam.
 The principal parts of the CPU are the datapath, registers,
and the control unit.
◦ The datapath consists of an arithmetic-logic unit and
storage units (registers) that are interconnected by a data
bus that is also connected to main memory.
◦ Various CPU components perform sequenced operations
according to signals provided by its control unit.

 Registers hold data that can be readily accessed by
the CPU.
 They can be implemented using D flip-flops.
◦ A 32-bit register requires 32 D flip-flops.
 The arithmetic-logic unit (ALU) carries out logical
and arithmetic operations as directed by the control
unit.
 The control unit determines which actions to carry
out according to the values in a program counter
register and a status register.
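The flip-flop claim above can be sketched in a few lines of Python. This is only an illustrative model, not a hardware description: the class names `DFlipFlop` and `Register` are assumptions introduced here, and a real register would capture all bits on a shared clock edge rather than in a loop.

```python
class DFlipFlop:
    """One bit of storage: captures its D input on a clock edge."""

    def __init__(self) -> None:
        self.q = 0

    def clock(self, d: int) -> None:
        self.q = d & 1


class Register:
    """An n-bit register modeled as n D flip-flops (index 0 = least significant bit)."""

    def __init__(self, width: int) -> None:
        self.flops = [DFlipFlop() for _ in range(width)]

    def load(self, value: int) -> None:
        # One clock edge: each flip-flop captures its own bit of the value.
        for i, ff in enumerate(self.flops):
            ff.clock((value >> i) & 1)

    def read(self) -> int:
        # Reassemble the stored word from the individual Q outputs.
        return sum(ff.q << i for i, ff in enumerate(self.flops))


reg = Register(32)
reg.load(0xDEADBEEF)
print(len(reg.flops), hex(reg.read()))  # 32 0xdeadbeef
```

Bits beyond the register width are simply not stored, which mirrors how a 32-bit register has nowhere to hold a 33rd bit.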

 The CPU shares data with other system components
by way of a data bus.
◦ A bus is a set of wires that simultaneously convey a
single bit along each line.
 Two types of buses are commonly found in computer
systems: point-to-point, and multipoint buses.

[Figure: a point-to-point bus configuration]
 Buses consist of data lines, control lines, and address
lines.
 While the data lines convey bits from one device to
another, control lines determine the direction of data
flow, and when each device can access the bus.
 Address lines determine the location of the source or
destination of the data.

Model of Bus Configuration
 A multipoint bus is shown below.
 Because a multipoint bus is a shared resource,
access to it is controlled through protocols, which
are built into the hardware.

Single Bus Problems
 Lots of devices on one bus leads to:
◦ Propagation delays
 Long data paths mean that co-ordination of bus use can adversely
affect performance: with bus skew, data arrives at slightly different times
 The bus may become a bottleneck if aggregate data transfer approaches
bus capacity. Bus width could be increased, but that is expensive
◦ Device speed
 Bus can’t transmit data faster than the slowest device
 Slowest device may determine bus speed!
 Consider a high-speed network module and a slow serial port on the
same bus; the bus must run at the slow serial port's speed so the port
can process data directed to it
◦ Power problems
 Most systems use multiple buses to overcome these
problems

Traditional (ISA) with cache
[Figure: traditional bus architecture with cache; a labeled component buffers data transfers between the system bus and the expansion bus]
This approach breaks down as I/O devices need higher performance


High Performance Bus – Mezzanine Architecture

Addresses higher speed I/O devices by moving up in the hierarchy
MAR and MBR
 To get data from memory to the CPU
◦ The address to read from is copied into the MAR
◦ The MAR places its value on the address bus to memory
◦ The control unit signals memory via the control bus that this is a “read”
operation
◦ Memory places the data stored at that address on the data bus, where it
is copied into the MBR
 To store data from the CPU to memory
◦ The address to write to is copied into the MAR
◦ The data to write is copied into the MBR
◦ The MAR places its value on the address bus and the MBR places its
value on the data bus
◦ The control unit signals memory via the control bus that this is a “write”
operation
◦ Memory stores the data from the data bus at the address received from
the address bus
 Transparent to the programmer
◦ Since the MBR and MAR are intermediate steps to fetching and storing
data, we will often leave off these details and just talk about writing
directly from a register to memory, or from memory to a register
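The read and write sequences above can be sketched as a small Python model. It is a simplification under stated assumptions: the `Memory` and `CPU` classes, the `cycle` method, and the string-valued control signal are all hypothetical names chosen for illustration, and each bus is reduced to a plain function argument.

```python
class Memory:
    def __init__(self, size: int) -> None:
        self.cells = [0] * size

    def cycle(self, address_bus: int, control: str, data_bus: int = 0) -> int:
        """Respond to one bus transaction and return the value on the data bus."""
        if control == "write":
            self.cells[address_bus] = data_bus  # store data bus value at the address
            return data_bus
        return self.cells[address_bus]  # "read": drive the stored word onto the data bus


class CPU:
    def __init__(self, memory: Memory) -> None:
        self.memory = memory
        self.mar = 0  # memory address register
        self.mbr = 0  # memory buffer register

    def read(self, address: int) -> int:
        self.mar = address                              # address copied into the MAR
        self.mbr = self.memory.cycle(self.mar, "read")  # data arrives in the MBR
        return self.mbr

    def write(self, address: int, value: int) -> None:
        self.mar = address                              # address copied into the MAR
        self.mbr = value                                # data copied into the MBR
        self.memory.cycle(self.mar, "write", self.mbr)


mem = Memory(16)
mem.cells[0] = 0b00000001
cpu = CPU(mem)
value = cpu.read(0)          # read from address 0
cpu.write(15, 0b00110101)    # write to address 15
print(value, format(mem.cells[15], "08b"))  # 1 00110101
```

Callers of `read` and `write` never touch `mar` or `mbr` directly, which is the sense in which these registers are transparent to the programmer.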

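Bus Communications
[Figure: the CPU's MAR connects to memory over the address bus and its MBR over the data bus; a control bus also links the CPU and memory. Memory contents shown: Address 0 = 00000001, Address 1 = 00000000, Address 15 = 00110101]
Example: Read from address 0, write to address 15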
Instruction Cycle
 The CPU repetitively performs the instruction
cycle:
◦ Fetch
 The PC holds the address in memory of the next instruction
to execute
 The instruction at that address is fetched from memory and stored in the IR
 The PC is incremented to fetch the next instruction (unless
told otherwise)
◦ Decode
 The CPU determines what instruction is in the IR
◦ Execute
 Circuitry interprets the opcode and executes the instruction
 Moving data, performing an operation in the ALU, etc.
 May need to fetch operands from memory or store data
back to memory

Fetch/Execute Example (1)
Fetch

Memory  Instruction  Meaning
300     1940         Load data at address 940 into AC
301     3941         Set AC = AC + data at address 941
302     2941         Store AC to address 941
940     0003
941     0002

CPU Registers: PC 300, AC 0000, IR 1940

Execute

Memory  Instruction  Meaning
300     1940         Load data at address 940 into AC
301     3941         Set AC = AC + data at address 941
302     2941         Store AC to address 941
940     0003
941     0002

CPU Registers: PC 300, AC 0003, IR 1940

Fetch/Execute Example (2)
Fetch

Memory  Instruction  Meaning
300     1940         Load data at address 940 into AC
301     3941         Set AC = AC + data at address 941
302     2941         Store AC to address 941
940     0003
941     0002

CPU Registers: PC 301, AC 0003, IR 3941

Execute

Memory  Instruction  Meaning
300     1940         Load data at address 940 into AC
301     3941         Set AC = AC + data at address 941
302     2941         Store AC to address 941
940     0003
941     0002

CPU Registers: PC 301, AC 0005 (3 + 2 = 5), IR 3941

Fetch/Execute Example (3)
Fetch

Memory  Instruction  Meaning
300     1940         Load data at address 940 into AC
301     3941         Set AC = AC + data at address 941
302     2941         Store AC to address 941
940     0003
941     0002

CPU Registers: PC 302, AC 0005, IR 2941

Execute

Memory  Instruction  Meaning
300     1940         Load data at address 940 into AC
301     3941         Set AC = AC + data at address 941
302     2941         Store AC to address 941
940     0003
941     0005

CPU Registers: PC 302, AC 0005, IR 2941
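The three fetch/execute examples can be replayed with a short Python sketch. Several things here are assumptions made for illustration: addresses and words are treated as hexadecimal, each word is split into a one-digit opcode and a three-digit address, and the function name `run` is hypothetical. The sketch increments the PC as part of the fetch, so its PC value runs one ahead of the snapshots shown on the slides.

```python
def run(memory, pc, steps):
    """Execute `steps` instructions starting at `pc`; return the final (pc, ac)."""
    ac = 0
    for _ in range(steps):
        ir = memory[pc]                  # fetch: the instruction goes into the IR
        pc += 1                          # the PC now points at the next instruction
        opcode = ir >> 12                # decode: top hex digit is the opcode
        address = ir & 0xFFF             # low three hex digits are the address
        if opcode == 0x1:                # load AC from memory
            ac = memory[address]
        elif opcode == 0x3:              # add memory word to AC
            ac = (ac + memory[address]) & 0xFFFF
        elif opcode == 0x2:              # store AC to memory
            memory[address] = ac
        else:
            raise ValueError(f"unknown opcode {opcode:X}")
    return pc, ac


memory = {
    0x300: 0x1940,
    0x301: 0x3941,
    0x302: 0x2941,
    0x940: 0x0003,
    0x941: 0x0002,
}
pc, ac = run(memory, 0x300, 3)
print(f"AC={ac:04X} mem[941]={memory[0x941]:04X} PC={pc:03X}")
# AC=0005 mem[941]=0005 PC=303
```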

Modifications to Instruction Cycle

 Simple Example
◦ Always added one to PC
◦ Entire operand fetched with instruction
 More complex examples
◦ Might need more complex instruction address
calculation
 Consider a 64 bit processor, variable length instructions
◦ Instruction set design might require repeat trip to
memory to fetch operand
 In particular, if memory address range exceeds word size
◦ Operand store might require many trips to memory
 Vector calculation

Instruction Cycle (with Interrupts) - State Diagram
[Figure: state diagram covering the Fetch, Decode, and Execute phases]
Real World Architectures

 The classic Intel architecture, the 8086, was born in
1978. It is a CISC architecture.
◦ Its 8088 variant was adopted by IBM for its famed PC, which was
released in 1981.
◦ The 8086 operated on 16-bit data words and supported
20-bit memory addresses.
 In 1979, to lower costs, the 8088 was introduced.
Like the 8086, it used 20-bit memory addresses but
used an 8-bit data bus instead of a 16-bit data bus.
To get 16 bits of data, the CPU made two trips to
memory.
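The arithmetic behind these figures is easy to check. The following Python sketch uses two hypothetical helper functions, `addressable_bytes` and `bus_transfers`, and assumes byte-addressable memory.

```python
import math


def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


def bus_transfers(word_bits, bus_bits):
    """Trips to memory needed to move one word over a data bus of the given width."""
    return math.ceil(word_bits / bus_bits)


print(addressable_bytes(20))   # 1048576 bytes, i.e. 1 MiB with 20-bit addresses
print(bus_transfers(16, 16))   # 1 trip on the 8086's 16-bit data bus
print(bus_transfers(16, 8))    # 2 trips on the 8088's 8-bit data bus
```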


 The 8086 had four 16-bit general-purpose registers
that could be accessed by the half-word.
 It also had a flags register, an instruction register,
and a stack accessed through the values in two
other registers, the base pointer and the stack
pointer.
 The 8086 had no built-in floating-point processing.
 In 1980, Intel released the 8087 numeric
coprocessor, but few users elected to install one
because of its cost.


 80286
◦ Used in IBM AT
◦ 24 bit address bus (16 MB of RAM), 16 bit data bus
◦ Protected mode – OS could protect programs in
separate memory segments
 In 1985, Intel introduced the 32-bit 80386.
◦ It also had no built-in floating-point unit.
◦ 32 bit registers, 32 bit address bus (24 bit on the 80386 SX)
◦ 80386 DX 32 bit data bus
◦ 80386 SX 16 bit data bus
◦ Supported virtual mode memory, paging


 The 80486, introduced in 1989, was an
80386 that had built-in floating-point
processing and cache memory.
◦ The 80386 and 80486 offered downward
compatibility with the 8086 and 8088.
◦ Software written for the smaller word systems was
directed to use the lower 16 bits of the 32-bit
registers.
◦ Could decode/execute 5 instructions at once with
pipelining
◦ 8K level-1 cache for both instructions and data


 Pentium
◦ Named Pentium rather than 586 for legal reasons (a number
could not be trademarked)
◦ Separate 8K caches for data, instructions
◦ Branch prediction
◦ 32 bit address bus
◦ 64 bit internal data bus
◦ MMX - perform integer operations on vectors of 8,
16, or 32 bit words
◦ Superscalar – two parallel execution pipelines


 Pentium Pro
◦ multiple branch prediction
◦ speculative execution
◦ register renaming
◦ “P6” core

 Pentium II (1997)
◦ P6 core with MMX instructions
◦ Processor card (SEC) instead of IC package
 Higher frequency components, fewer pins
 Marketing reasons?

 Celeron
◦ Pentium II with no (or smaller) L2 cache
◦ Positioning for low-end market


 Pentium III
◦ Streaming SIMD Extensions (SSE)
 Perform float operations on vectors of up to 32 bit words
 Eight 128-bit registers to contain four 32-bit ints or floats
◦ On-die cache
 Pentium IV
◦ Multiple ALUs
◦ Trace cache
◦ SSE2
◦ Redesign to allow higher clock rate
 Itanium
◦ EPIC - Explicit Parallel Instruction Computing
◦ 128 bit registers, data bus
 41-bit instructions in 128 bit bundles of three plus five "template bits" which
indicate dependencies or types
◦ Marrying ideas of RISC with CISC


 The MIPS family of CPUs has been one of the most
successful in its class.
 In 1986 the first MIPS CPU was announced.
 It had a 32-bit word size and could address 4GB of
memory.
 Over the years, MIPS processors have been used in
general purpose computers as well as in games.
 The MIPS architecture now offers 32- and 64-bit
versions.


 MIPS was one of the first RISC microprocessors.


 The original MIPS architecture had only 55 different
instructions, as compared with the 8086 which had
over 100.
 MIPS was designed with performance in mind: It is
a load/store architecture, meaning that only the load
and store instructions can access memory.
 The large number of registers in the MIPS
architecture keeps bus traffic to a minimum.

How does this design affect performance?

SAP-1
 The Simple-As-Possible (SAP)-1 computer is a
very basic model of a microprocessor.
 The SAP-1 design contains the basic necessities
for a functional microprocessor. Its primary
purpose is to develop a basic understanding of
how a microprocessor works and how it interacts with
memory and other parts of the system, such as input
and output.
 The instruction set is very limited and simple.
SAP1 Architecture
 1. Program Counter
 The program is stored at the beginning of the memory, with the first
instruction at binary address 0000, the second instruction at 0001, the
third at address 0010, and so on. The program counter, which is part of
the control unit, counts from 0000 to 1111. Its job is to send to the
memory the address of the next instruction to be fetched and executed,
as described in the next paragraph.
 The program counter is reset to 0000 before each computer run. When
the computer run begins, the program counter sends the address 0000 to
the memory. The program counter is then incremented to get 0001. After
the first instruction is fetched and executed, the program counter sends
address 0001 to the memory. Again the program counter is incremented.
After the second instruction is fetched and executed, the program
counter sends address 0010 to the memory. So this way, the program
counter keeps track of the next instruction to be fetched and executed.
 The program counter is like someone pointing a finger at a list of
instructions saying do this first, do this second, do this third, etc. This is
why the program counter is called a pointer; it points to an address in
memory where the instruction or data is being stored.
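The counting behaviour described above can be sketched in Python. The class name `ProgramCounter` is an assumption made for illustration, and the wrap from 1111 back to 0000 is simply what masking to 4 bits produces; the slides do not describe what happens past 1111.

```python
class ProgramCounter:
    """A 4-bit counter that holds the address of the next instruction."""

    def __init__(self) -> None:
        self.value = 0

    def reset(self) -> None:
        self.value = 0  # done before each computer run

    def increment(self) -> None:
        self.value = (self.value + 1) & 0b1111  # keep only 4 bits


pc = ProgramCounter()
pc.reset()
sent = []
for _ in range(3):
    sent.append(format(pc.value, "04b"))  # address sent to the memory
    pc.increment()                        # then point at the next instruction
print(sent)  # ['0000', '0001', '0010']
```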
 2. Input & MAR
 The Input and MAR block includes the address and data switch
registers. These switch registers, which are part of the input unit,
allow us to send 4 address bits and 8 data bits to the RAM.
 The memory address register (MAR) is part of the SAP-1 memory.
During a computer run, the address in the program counter is
latched into the MAR. A bit later, the MAR applies this 4-bit
address to the RAM, where a read operation is performed.
 3. The RAM
 The RAM is a 16 × 8 static TTL RAM. We can program the RAM by
means of the address and data switch registers. This allows us
to store a program and data in the memory before a computer
run.
 During a computer run, the RAM receives 4-bit addresses from
the MAR and a read operation is performed. In this way, the
instruction or data word stored in the RAM is placed on the W bus
for use in some other part of the computer.
 4. Instruction Register
 The instruction register is part of the control unit. To
fetch an instruction from the memory, the computer does a
memory read operation. This places the contents of the
addressed memory location on the W bus. At the same time,
the instruction register is set up for loading on the next
positive clock edge. The contents of the instruction register are
split into two nibbles. The upper nibble goes directly to the
block “Controller – Sequencer”. The lower nibble is read onto
the W bus when needed.
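The nibble split can be illustrated with a few lines of Python. The function name `split_instruction` is hypothetical, and the 8-bit word 0010 1001 is used only as a sample value.

```python
def split_instruction(word):
    """Split an 8-bit instruction word into its upper and lower nibbles."""
    upper = (word >> 4) & 0xF  # goes to the controller-sequencer
    lower = word & 0xF         # placed on the W bus when needed
    return upper, lower


upper, lower = split_instruction(0b00101001)
print(format(upper, "04b"), format(lower, "04b"))  # 0010 1001
```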
 5. Controller-Sequencer
 It generates the control signals for each block so that actions
occur in the desired sequence. The CLK signal is used to synchronize
the overall operation of the SAP-1 computer. A 12-bit word
comes out of the Controller-Sequencer block. This control
word determines how the registers will react to the next
positive CLK edge.
 6. Accumulator
 The accumulator (A) is a buffer register that stores
intermediate answers during a computer run. The accumulator
has two outputs: one goes directly to the adder-subtractor
and the other goes to the W bus.
 7. The Adder – Subtractor
 SAP-1 uses a 2’s complement adder-subtractor. When
SU is low, the sum out of the adder-subtractor is S = A + B.
When SU is high, the difference appears as S = A + B′,
where B′ is the 2’s complement of B.
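A minimal Python sketch of this behaviour follows, assuming 8-bit operands to match the 8 data bits of SAP-1. The function name `adder_subtractor` is hypothetical, and carry and overflow are ignored.

```python
def adder_subtractor(a, b, su):
    """Return A + B when su is 0, or A plus the 2's complement of B when su is 1."""
    a &= 0xFF
    b &= 0xFF
    if su:
        b = (~b + 1) & 0xFF  # 2's complement: invert every bit, then add 1
    return (a + b) & 0xFF    # the result is truncated to 8 bits


print(hex(adder_subtractor(0x21, 0x02, 0)))  # 0x23
print(hex(adder_subtractor(0x21, 0x02, 1)))  # 0x1f
```

Subtraction therefore needs no separate subtractor circuit: the same adder is reused once B has been complemented.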
 8. B Register
 The B register is another buffer register. It is used in
arithmetic operations. A low LB and a positive clock edge
load the word on the W bus into the B register. The
two-state output of the B register drives the adder-subtractor,
supplying the number to be added to or subtracted from the
contents of the accumulator.
 9. Output Register
 At the end of the computer run, the accumulator contains
the answer to the problem being solved. At this point, we
need to transfer the answer to the outside world. This is
where the output register is used.
 When EA is high and LO is low, the next positive clock edge
loads the accumulator contents into the output register. The
output register is often called an output port because the
processed data can leave the computer through this register.
 10. Binary Display
 The binary display is a row of eight light-emitting diodes
(LEDs). Because each LED connects to one flip-flop of the
output port, the binary display shows us the contents of the
output port. Therefore, after we have transferred an answer from
the accumulator to the output port, we can see the answer in
binary form.
Instruction Set
Mnemonic   Opcode
LDA        0000
ADD        0001
SUB        0010
OUT        1110
HLT        1111
Code assembling in memory
 Example of an arithmetic problem in assembly
language translated into equivalent machine
code using the opcodes given in the instruction
table above.
 Example: 21H – 02H + 0AH
Memory Address   Assembly code   Machine code
0H               LDA 8H          0000 1000
1H               SUB 9H          0010 1001
2H               ADD AH          0001 1010
3H               OUT             1110 xxxx
4H               HLT             1111 xxxx
5H to 7H         (empty)
8H               21H             0010 0001
9H               02H             0000 0010
AH               0AH             0000 1010
BH to FH         (empty)
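The assembly and execution of this example can be checked with a small Python sketch. It is only a behavioural model under stated assumptions: the names `OPCODES`, `assemble`, and `run` are hypothetical, the don't-care bits (xxxx) are filled with zeros, empty memory locations hold zero, and a step limit guards against programs that never reach HLT.

```python
OPCODES = {"LDA": 0x0, "ADD": 0x1, "SUB": 0x2, "OUT": 0xE, "HLT": 0xF}


def assemble(lines):
    """Turn {address: text} into a 16-word RAM image of 8-bit words."""
    ram = [0] * 16
    for address, text in lines.items():
        parts = text.split()
        if parts[0] in OPCODES:
            # Instruction: opcode in the upper nibble, operand address in the lower.
            operand = int(parts[1].rstrip("H"), 16) if len(parts) > 1 else 0
            ram[address] = (OPCODES[parts[0]] << 4) | operand
        else:
            # Data word written in hexadecimal, e.g. "21H".
            ram[address] = int(parts[0].rstrip("H"), 16)
    return ram


def run(ram, max_steps=1000):
    """Execute from address 0 until HLT; return the values sent to the output register."""
    pc = a = 0
    outputs = []
    for _ in range(max_steps):
        word = ram[pc]                # fetch
        pc = (pc + 1) & 0xF
        opcode, operand = word >> 4, word & 0xF
        if opcode == 0x0:             # LDA
            a = ram[operand]
        elif opcode == 0x1:           # ADD
            a = (a + ram[operand]) & 0xFF
        elif opcode == 0x2:           # SUB
            a = (a - ram[operand]) & 0xFF
        elif opcode == 0xE:           # OUT
            outputs.append(a)
        elif opcode == 0xF:           # HLT
            return outputs
        else:
            raise ValueError(f"unknown opcode {opcode:04b}")
    raise RuntimeError("step limit reached without HLT")


program = {
    0x0: "LDA 8H", 0x1: "SUB 9H", 0x2: "ADD AH", 0x3: "OUT", 0x4: "HLT",
    0x8: "21H", 0x9: "02H", 0xA: "0AH",
}
ram = assemble(program)
outputs = run(ram)
print(format(ram[0x2], "08b"))          # 00011010 (ADD with operand address AH = 1010)
print(format(outputs[0], "02X") + "H")  # 29H, since 21H - 02H + 0AH = 29H
```

The practice problem can be tried the same way by adding another ADD instruction and data word to `program`.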
Practice:
11H - 02H + 05H + 0BH
Assignment:
1. SAP-2 and SAP-3.
For each, give the architecture, instruction set, and
a sample program.
