CISC Vs RISC and I286, I386, I486
Instruction set architecture (ISA) is the set of processor design techniques used to implement the instruction
workflow in hardware. In more practical terms, the ISA determines how the processor will process program
instructions.
A reduced instruction set computer (RISC, pronounced "risk") is a computer that uses only simple instructions,
each performing one low-level operation within a single clock cycle, so a complex task is divided into multiple
instructions. As its name suggests: REDUCED INSTRUCTION SET.
A complex instruction set computer (CISC, pronounced "sisk") is a computer in which a single instruction can
execute several low-level operations (such as a load from memory, an arithmetic operation, and a memory store)
or is capable of multi-step operations or addressing modes. As its name suggests:
COMPLEX INSTRUCTION SET.
Example
4.1 CISC Approach
The primary goal of CISC architecture is to complete a task in as few lines of assembly as possible. This is
achieved by building processor hardware that is capable of understanding and executing a series of operations.
For the task of multiplying two numbers stored in memory, a CISC processor would come prepared with a
specific instruction (we'll call it MULT). When executed, this instruction
Loads the two values into separate registers
Multiplies the operands in the execution unit
Stores the product in the appropriate register
Thus, the entire task of multiplying two numbers can be completed with one instruction:
MULT A, B ; a single assembly statement
MULT is what is known as a complex instruction. It operates directly on the computer's memory banks and
does not require the programmer to explicitly call any loading or storing functions.
Advantages
The compiler has to do very little work to translate a high-level language statement into assembly.
The code is relatively short.
Very little RAM is required to store instructions.
The emphasis is put on building complex instructions directly into the hardware.
4.2 RISC Approach
RISC processors only use simple instructions that can be executed within one clock cycle. Thus, the MULT
command described above could be divided into three separate commands:
LOAD which moves data from the memory bank to a register
PROD which finds the product of two operands located within the registers
STORE which moves data from a register to the memory banks.
In order to perform the exact series of steps described in the CISC approach, a programmer would need to
code four lines of assembly:
; These are all assembly statements
LOAD R1, A     ; R1 <- memory[A]
LOAD R2, B     ; R2 <- memory[B]
PROD R1, R2    ; R1 <- R1 * R2
STORE A, R1    ; memory[A] <- R1
At first, this may seem like a much less efficient way of completing the operation. Because there are more
lines of code, more RAM is needed to store the assembly level instructions. The compiler must also perform
more work to convert a high-level language statement into code of this form.
Advantages
Because each instruction requires only one clock cycle to execute, the entire program will execute in approximately
the same amount of time as the multi-cycle MULT command.
These reduced instructions require fewer transistors than the complex instructions,
leaving more room for general-purpose registers.
Because all of the instructions execute in a uniform amount of time (i.e. one clock cycle),
pipelining is possible.
LOAD/STORE Mechanism: Separating the LOAD and STORE instructions actually reduces the amount
of work that the computer must perform. After a CISC-style MULT command is executed, the processor
automatically erases the registers. If one of the operands needs to be used for another computation, the
processor must re-load the data from the memory bank into a register. In RISC, the operand will remain in the
register until another value is loaded in its place.
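The difference between the two approaches can be sketched with a toy machine in Python. This is only an illustration under stated assumptions: memory and registers are modelled as dictionaries, the function names are hypothetical, and the product is assumed to be written back to location A.

```python
# Toy model only: memory and registers are plain dictionaries.
# All function names here are hypothetical, chosen for illustration.

def mult_cisc(memory, a, b):
    """One complex instruction: load both operands, multiply, store the result."""
    memory[a] = memory[a] * memory[b]


def load(regs, memory, reg, addr):
    """LOAD: move data from memory into a register."""
    regs[reg] = memory[addr]


def prod(regs, dst, src):
    """PROD: multiply two registers, leaving the product in dst."""
    regs[dst] = regs[dst] * regs[src]


def store(regs, memory, addr, reg):
    """STORE: move data from a register back to memory."""
    memory[addr] = regs[reg]


def mult_risc(memory, a, b):
    """The same task as four simple instructions."""
    regs = {}
    load(regs, memory, "R1", a)
    load(regs, memory, "R2", b)
    prod(regs, "R1", "R2")
    store(regs, memory, a, "R1")
    return regs  # operands remain in registers and can be reused


cisc_mem = {"A": 6, "B": 7}
mult_cisc(cisc_mem, "A", "B")

risc_mem = {"A": 6, "B": 7}
risc_regs = mult_risc(risc_mem, "A", "B")
```

Both versions leave the same product in memory, but in the RISC version the second operand is still available in R2 afterwards, which is the reuse benefit described above.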
RISC | CISC
Emphasis on software | Emphasis on hardware
Single-clock | Multi-clock
Reduced instructions only | Complex instructions
Register-to-register: LOAD and STORE are independent instructions | Memory-to-memory: LOAD and STORE incorporated in instructions
Low cycles per second, large code sizes | High cycles per second, small code sizes
Spends more transistors on memory registers | Transistors used for storing complex instructions
CISC: Examples of CISC instruction set architectures are the PDP-11, VAX, Motorola 68k, and desktop PCs
based on Intel's x86 architecture.
RISC: Examples of RISC families include DEC Alpha, AMD 29k, ARC, Atmel AVR, Blackfin, Intel i860 and
i960, MIPS, Motorola 88000, PA-RISC, Power (including PowerPC), SuperH, SPARC, and ARM.
8.0.1 Features at a glance
8.1
Most operations are performed on operands in CPU registers rather than in memory. All arithmetic,
comparison, branching, and bitwise operations are performed with registers and literals (5-bit floating
point).
Only LOAD and STORE are memory reference instructions.
8.2
Large internal register sets featuring 32 32-bit general-purpose and specific-function registers are divided into two
types:
Global
Local
Both of these types can be used for general storage of operands.
Table 2: Global v Local
Global | Retain contents across procedure boundaries
Local | Processor allocates a new set of local registers each time a procedure is called
8.3
8.4
This is accomplished through a register scoreboarding scheme which enhances execution speed.
Register Scoreboarding Scheme
fetched from memory.
Procedure: When a LOAD instruction is executed, the processor sets one or more scoreboard bits to
indicate the target registers to be loaded. After the target registers are loaded, the scoreboard bits are
cleared. While the target registers are being loaded, the processor is allowed to execute other instructions,
called independent instructions, that do not use these registers.
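The procedure above can be sketched as a small Python model. It is a simplified illustration, not the actual hardware design: one busy bit per register is assumed, and the class and method names are hypothetical.

```python
class Scoreboard:
    """Toy register scoreboard: one busy bit per register (hypothetical model)."""

    def __init__(self, num_registers=32):
        self.busy = [False] * num_registers

    def start_load(self, target):
        # A LOAD was issued: set the scoreboard bit of the target register.
        self.busy[target] = True

    def finish_load(self, target):
        # The data arrived from memory: clear the scoreboard bit.
        self.busy[target] = False

    def can_issue(self, registers_used):
        # An instruction is independent if none of its registers is still loading.
        return not any(self.busy[r] for r in registers_used)


sb = Scoreboard()
sb.start_load(4)                        # a LOAD into r4 is in flight
independent = sb.can_issue([1, 2, 3])   # uses only r1, r2, r3: may proceed
dependent = sb.can_issue([4, 5, 6])     # touches r4: must wait
sb.finish_load(4)
after_load = sb.can_issue([4, 5, 6])    # r4 is ready, so this may now issue
```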
8.5
Most of the commonly used instructions are executed in a minimum number of clock cycles (usually one
clock).
Example: Instructions, either 32 or 64 bits long, are aligned on 32-bit boundaries, allowing instructions
to be decoded in one clock cycle. This eliminates the need for an instruction alignment stage in the pipeline,
resulting in over 50 instructions that can be executed in a single clock cycle.
8.6
Interrupt Model
To handle interrupts, the processor maintains an interrupt table of 248 interrupt vectors,
of which 240 are available for general use.
When an interrupt is generated, the processor uses a pointer from the interrupt table to perform an
implicit call to an interrupt handler procedure.
The processor automatically saves the state it had prior to receiving the interrupt, performs the
interrupt routine, and then restores its previous state.
A separate interrupt stack is also provided to segregate interrupt handling from application programs.
Interrupt handling facilities also feature prioritization of pending interrupts.
8.7
Each time a call instruction is issued, the processor automatically saves the current set of local registers
and allocates a new set of local registers for the called procedure.
Likewise, on the return from a procedure, the current set of local registers is deallocated and the local
registers for the procedure being returned to are restored.
Thus, on a procedure call, the program never has to explicitly save and restore those local
variables.
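The call and return behaviour described above can be sketched as a stack of local register sets. This is an illustrative model only: the split into 16 global and 16 local registers is an assumption (the text gives only the total of 32), and the class and method names are hypothetical.

```python
class RegisterFile:
    """Toy model: shared global registers plus a stack of local register sets."""

    def __init__(self, num_globals=16, num_locals=16):
        # Assumption: 16 global + 16 local registers; the split is illustrative.
        self.num_locals = num_locals
        self.globals = [0] * num_globals
        self.frames = [[0] * num_locals]  # local set of the running procedure

    @property
    def locals(self):
        # The local registers visible to the currently running procedure.
        return self.frames[-1]

    def call(self):
        # On a call, the caller's locals are kept and a fresh set is allocated.
        self.frames.append([0] * self.num_locals)

    def ret(self):
        # On return, the callee's locals are dropped and the caller's reappear.
        self.frames.pop()


rf = RegisterFile()
rf.locals[0] = 11        # caller's local value
rf.globals[0] = 99       # global value

rf.call()
callee_view = rf.locals[0]   # the callee starts with its own fresh local set
rf.locals[0] = 22            # does not disturb the caller's local
rf.globals[0] = 100          # globals are shared across the call
rf.ret()

caller_local = rf.locals[0]
shared_global = rf.globals[0]
```

The caller's local value survives the call without any explicit save or restore, while the change to the global register remains visible after the return.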
8.8
The processor offers a full set of load, store, move, arithmetic, comparison, and branch instructions with operations on both integer and ordinal data types.
It also provides a complete set of Boolean and bit-field instructions to simplify operations on
bits and bit strings.
8.9
The on-chip floating-point unit includes a full set of floating-point operations including add, subtract,
multiply, divide, trigonometric functions, and logarithmic functions.
Data Bus
Instruction address bus
Data Address bus
Instruction bus (fixed instruction length)
10
10.0.1
10.1
In most earlier microcomputers, the RTC (real-time clock) was normally an I/O device completely outside the CPU.
This is the first time the RTC has been implemented inside a top-of-the-line microprocessor such as the PowerPC.
Modern multitasking operating systems require timekeeping for task switching as well as for keeping the
calendar date.
The 601 RTC provides a measure of real time in terms of time of day and date, with a calendar range of
136.19 years.
The RTC contains two registers, RTC Upper (RTCU) and RTC Lower (RTCL).
RTCU
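The 136.19-year calendar range is consistent with a 32-bit counter of whole seconds. The short check below rests on two assumptions not stated in the text: that the upper register counts seconds in 32 bits, and that a year is taken as 365 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: 365-day year


def rtc_range_years(counter_bits=32):
    """Years covered by a seconds counter of the given width (assumed 32 bits)."""
    return (2 ** counter_bits) / SECONDS_PER_YEAR


range_years = rtc_range_years()
rounded_range = round(range_years, 2)
```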
10.2 Instruction Unit
The 601 instruction unit computes the address of the next instruction to be fetched. The instruction unit
includes an instruction queue and a Branch Processing Unit (BPU).
The instruction queue holds up to eight instructions and can be filled from the cache during a single cycle.
The BPU searches through the instruction queue for a conditional branch instruction and tries to resolve it
early in order to achieve a zero-cycle branch in many instances.
10.3 Execution Unit
BPU It contains an adder to compute branch target addresses and three special purpose, user-control
registers namely.
10.4
10.5 Cache Unit
10.6 Memory Unit
It consists of read and write queues that buffer operation between the external interface and the cache.
10.7 System Interface
It includes a 32-bit address bus, a 64-bit data bus, and 52 control and information signals.
The 601 control and information signals allow for functions such as address arbitration, address start, address
termination, address transfer, data arbitration, data start, and data termination.
11 80286 Features
12 80386 Features
13 80486 Features
14
Table 3: Comparison of the 80286, 80386, and 80486
Topic | 80286 | 80386 | 80486
Date | 1982 | 1985 | 1989
CPU speed | 6 - 25 MHz | 12 - 40 MHz | 16 - 100 MHz
Cores | 1 | 1 | 1
Registers (Programmer) | 8, 15 total | 16, ? | 16, ?
RAM | 16 MB | 4 GB | 4 GB
Functional Units | 4 | 6 | 9
Pipeline Stages | 3 | 3 | 5
Cache off chip | 0 | Yes | Yes
Cache on chip | 0 | 0 | 8 KB
Transistors | 134 000 | 275 000 | >1 000 000
15 Definitions
15.1 Real Memory
Real memory refers to the actual memory chips that are installed.
15.2 Virtual Memory
All programs actually run in this physical memory. However, it is often useful to allow the computer to think
that it has memory that isn't actually there, in order to permit the use of programs that are larger than will
physically fit in memory, or to allow multitasking (multiple programs running at once). This concept is called
virtual memory.
How virtual memory works: Let's suppose the operating system needs 80 MB of memory to hold all the
programs that are running, but there are only 32 MB of RAM chips installed in the computer. The operating
system sets up 80 MB of virtual memory and employs a virtual memory manager, a program designed to control
virtual memory, to manage the 80 MB. The virtual memory manager sets up a file on the hard disk that is
48 MB in size (80 minus 32). The operating system then proceeds to use 80 MB worth of memory addresses.
To the operating system, it appears as if 80 MB of memory exists. It lets the virtual memory manager worry
about how to handle the fact that we only have 32 MB of real memory.
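The sizing arithmetic in this example can be written as a one-line rule. The function name is hypothetical, and the sketch assumes the disk file simply covers whatever does not fit in installed RAM.

```python
def swap_file_size_mb(required_mb, installed_mb):
    """Size of the disk file backing the memory that does not fit in RAM.

    Assumption: the file covers exactly the shortfall, and is empty when
    the installed RAM already holds everything.
    """
    return max(0, required_mb - installed_mb)


swap_mb = swap_file_size_mb(required_mb=80, installed_mb=32)
no_swap_mb = swap_file_size_mb(required_mb=16, installed_mb=32)
```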
16
17
If an 80286-based system is running an operating system such as Microsoft's OS/2, which uses protected
mode, real mode will be used to:
Initialize peripheral devices
Load the main part of the OS from disk into memory
Load some registers
Enable interrupts
Set up descriptor tables
Switch the processor to protected mode
The first step in switching an 80286 to protected mode is to set the protection enable bit in
the machine status word (MSW) register.
Figure 2: (a) 80286 machine status word bits, (b) flag register bits
In the figure, bits 1, 2, and 3 of the MSW are for the most part used to indicate whether a coprocessor is present
in the system.
Bit 0 of the MSW is used to switch the 80286 into protected mode. To change bits in the MSW, we
load the desired word into a register or memory location and execute the load machine status word (LMSW) instruction.
The final step to get the 80286 in protected mode is to execute an intersegment jump to the start of the
main system program.
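The MSW bit manipulation can be sketched in Python as plain bit masks. The constant and function names are hypothetical; the bit positions follow the usual 80286 layout (PE in bit 0, with MP, EM, and TS in bits 1 to 3). The real operation is performed by the LMSW instruction, so this only models the word that would be loaded.

```python
# Bit masks for the 80286 machine status word (names are illustrative).
MSW_PE = 1 << 0  # protection enable
MSW_MP = 1 << 1  # monitor processor extension (coprocessor)
MSW_EM = 1 << 2  # emulate processor extension
MSW_TS = 1 << 3  # task switched


def enable_protected_mode(msw):
    """Return the word to load with LMSW: the current MSW with PE set."""
    return msw | MSW_PE


def in_protected_mode(msw):
    """True if the protection enable bit is set in the given MSW value."""
    return bool(msw & MSW_PE)


msw = MSW_MP                        # example: real mode, coprocessor bit set
new_msw = enable_protected_mode(msw)
```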
Switching an 80286 to protected mode enables the integrated MMU to provide virtual memory and protection.
A 286 virtual address consists of a 16-bit selector and a 16-bit offset.
The MMU uses 14 bits of the selector to access a descriptor for the desired segment in a table
of descriptors.
The descriptor contains the 24-bit physical base address, the privilege level, and some control bits for the
segment.
If the privilege level contained in the selector is as high as or higher than the privilege level contained in the
descriptor, then access to the segment will be allowed.
If not, an exception will be generated. The MMU also checks the P bit in the descriptor to determine if the
segment is present in physical memory. If not, the MMU will generate a segment-not-present exception.
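The translation steps above can be sketched in Python. This is a simplified model under stated assumptions: the 14 selector bits are taken as a 13-bit index plus a 1-bit table indicator, the low 2 bits as the requested privilege level, and a numerically smaller level means higher privilege. The class and function names are hypothetical, and checks the text does not mention (such as a segment limit check) are omitted.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    base: int       # 24-bit physical base address of the segment
    dpl: int        # privilege level, 0 (most privileged) to 3 (least)
    present: bool   # P bit: segment is in physical memory


class ProtectionFault(Exception):
    pass


class SegmentNotPresent(Exception):
    pass


def split_selector(selector):
    """Split a 16-bit selector into (index, table indicator, privilege level)."""
    index = selector >> 3          # upper 13 bits: index into a descriptor table
    table = (selector >> 2) & 1    # 0 = global table, 1 = local table
    rpl = selector & 0b11          # low 2 bits: requested privilege level
    return index, table, rpl


def translate(selector, offset, gdt, ldt):
    """Return the 24-bit physical address for a selector:offset pair."""
    index, table, rpl = split_selector(selector)
    descriptor = (ldt if table else gdt)[index]
    if rpl > descriptor.dpl:       # numerically larger means less privileged
        raise ProtectionFault("insufficient privilege")
    if not descriptor.present:
        raise SegmentNotPresent("segment not in physical memory")
    return (descriptor.base + offset) & 0xFFFFFF


gdt = [
    Descriptor(base=0x000000, dpl=0, present=True),
    Descriptor(base=0x123400, dpl=3, present=True),
    Descriptor(base=0x200000, dpl=0, present=True),
    Descriptor(base=0x300000, dpl=3, present=False),
]
physical = translate(0x000B, 0x0010, gdt, [])  # index 1, global table, level 3
```

With 14 selector bits choosing a segment and a 16-bit offset inside it, this scheme spans 2^14 segments of up to 2^16 bytes each.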
To be remembered: In protected mode the 80286 uses all 24 address lines, so it can address 16 MB of
memory instead of just the 1 MB addressable in real mode.
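These two sizes follow directly from the number of address lines. The check below assumes the standard figure of 20 address lines in real mode, which the text implies but does not state; the function name is hypothetical.

```python
MB = 2 ** 20  # bytes per megabyte


def addressable_bytes(address_lines):
    """Number of distinct byte addresses reachable with the given address lines."""
    return 2 ** address_lines


real_mode_mb = addressable_bytes(20) // MB        # assumption: 20 lines in real mode
protected_mode_mb = addressable_bytes(24) // MB   # all 24 lines in protected mode
```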
How to switch back to real mode: Once an 80286 has been switched into protected mode by executing the LMSW
instruction, the only way to get it back to real address mode is by resetting the system.
18
Support for RAMBUS memory technology
DDR (Double Data Rate): data is transferred on both clock edges
Interconnects change from aluminum to copper
Copper: better conductor, increases clock speed
Bus speed rises from the current maximum of 133 MHz to 200 MHz or higher