How Microprocessors Work 23
The computer you are using to read this page uses a microprocessor to do its
work. The microprocessor is the heart of any normal computer, whether it is
a desktop machine, a server or a laptop. The microprocessor you are using might be
a Pentium, a K6, a PowerPC, a Sparc or any of the many other brands and types of
microprocessors, but they all do approximately the same thing in approximately the
same way.
A microprocessor -- also known as a CPU or central processing unit -- is a complete
computation engine that is fabricated on a single chip. The first microprocessor was
the Intel 4004, introduced in 1971. The 4004 was not very powerful -- all it could do
was add and subtract, and it could only do that 4 bits at a time. But it was amazing
that everything was on one chip. Prior to the 4004, engineers built computers either
from collections of chips or from discrete components (transistors wired one at a
time). The 4004 powered one of the first portable electronic calculators.
If you have ever wondered what the microprocessor in your computer is doing, or if
you have ever wondered about the differences between types of microprocessors,
then read on. In this article, you will learn how fairly simple digital logic techniques
allow a computer to do its job, whether it's playing a game or spell-checking a
document!
Here are the differences between the different processors that Intel has
introduced over the years.
Compiled from The Intel Microprocessor Quick Reference Guide and TSCP
Benchmark Scores
Additional information about the table on this page:
The date is the year that the processor was first introduced. Many processors
are re-introduced at higher clock speeds for many years after the original
release date.
Transistors is the number of transistors on the chip. You can see that the
number of transistors on a single chip has risen steadily over the years.
Microns is the width, in microns, of the smallest wire on the chip. For
comparison, a human hair is 100 microns thick. As the feature size on the
chip goes down, the number of transistors rises.
Clock speed is the maximum rate that the chip can be clocked at. Clock
speed will make more sense in the next section.
MIPS stands for "millions of instructions per second" and is a rough measure
of the performance of a CPU. Modern CPUs can do so many different things
that MIPS ratings lose a lot of their meaning, but you can get a general sense
of the relative power of the CPUs from this column.
From this table you can see that, in general, there is a relationship between clock
speed and MIPS. The maximum clock speed is a function of the manufacturing
process and delays within the chip. There is also a relationship between the number
of transistors and MIPS. For example, the 8088 clocked at 5 MHz but only executed
at 0.33 MIPS (about one instruction per 15 clock cycles). Modern processors can
often execute at a rate of two instructions per clock cycle. That improvement is
directly related to the number of transistors on the chip and will make more sense in
the next section.
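The arithmetic behind the 8088 figure above can be written out as a short C sketch. The function name cycles_per_instruction is made up for this illustration, and the result is only an average, in the same rough spirit as MIPS ratings themselves.

```c
#include <assert.h>

/* Average clock cycles spent per instruction.
   clock_mhz is the clock rate in millions of cycles per second and
   mips is the throughput in millions of instructions per second, so
   the "millions per second" cancel and the ratio is cycles per
   instruction. */
double cycles_per_instruction(double clock_mhz, double mips)
{
    assert(mips > 0.0);   /* a throughput of zero has no meaningful ratio */
    return clock_mhz / mips;
}
```

Calling cycles_per_instruction(5.0, 0.33) gives a little over 15, which is the "one instruction per 15 clock cycles" quoted for the 8088. A processor that completes two instructions per clock cycle would come out at 0.5.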
WHAT'S A CHIP?
A chip is also called an integrated circuit. Generally it is a small, thin piece of
silicon onto which the transistors making up the microprocessor have been etched.
A chip might be as large as an inch on a side and can contain tens of millions of
transistors. Simpler processors might consist of a few thousand transistors etched
onto a chip just a few millimeters square.
Microprocessor Logic
To understand how a microprocessor works, it is helpful to look inside and learn
about the logic used to create one. In the process you can also learn
about assembly language -- the native language of a microprocessor -- and many
of the things that engineers can do to boost the speed of a processor.
A microprocessor executes a collection of machine instructions that tell the
processor what to do. Based on the instructions, a microprocessor does three basic
things:
There may be very sophisticated things that a microprocessor does, but those are
its three basic activities. The following diagram shows an extremely simple
microprocessor capable of doing those three things:
A data bus (that may be 8, 16 or 32 bits wide) that can send data to
memory or receive data from memory
An RD (read) and WR (write) line to tell the memory whether it wants to set
or get the addressed location
A reset line that resets the program counter to zero (or whatever) and
restarts execution
Let's assume that both the address and data buses are 8 bits wide in this example.
Here are the components of this simple microprocessor:
Registers A, B and C are simply latches made out of flip-flops. (See the
section on "edge-triggered latches" in How Boolean Logic Works for details.)
The program counter is a latch with the extra ability to increment by 1 when
told to do so, and also to reset to zero when told to do so.
The ALU could be as simple as an 8-bit adder (see the section on adders
in How Boolean Logic Works for details), or it might be able to add, subtract,
multiply and divide 8-bit values. Let's assume the latter here.
The test register is a special latch that can hold values from comparisons
performed in the ALU. An ALU can normally compare two numbers and
determine if they are equal, if one is greater than the other, etc. The test
register can also normally hold a carry bit from the last stage of the adder. It
stores these values in flip-flops and then the instruction decoder can use the
values to make decisions.
There are six boxes marked "3-State" in the diagram. These are tri-state
buffers. A tri-state buffer can pass a 1, a 0 or it can essentially disconnect its
output (imagine a switch that totally disconnects the output line from the
wire that the output is heading toward). A tri-state buffer allows multiple
outputs to connect to a wire, but only one of them to actually drive a 1 or a 0
onto the line.
The instruction register and instruction decoder are responsible for controlling
all of the other components.
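The tri-state buffers described above can be modeled in a few lines of C. This is only a software sketch of the idea, not a description of real circuitry. The names TriStateBuffer, BusStatus and resolve_bus are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One tri-state buffer. When 'enabled' is false the output is
   disconnected from the wire and 'value' is ignored. */
typedef struct {
    bool    enabled;
    uint8_t value;
} TriStateBuffer;

typedef enum { BUS_FLOATING, BUS_DRIVEN, BUS_CONFLICT } BusStatus;

/* Work out what an 8-bit bus carries when several buffers share it.
   *out is meaningful only when the result is BUS_DRIVEN. */
BusStatus resolve_bus(const TriStateBuffer *bufs, size_t count, uint8_t *out)
{
    assert(out != NULL);
    size_t drivers = 0;
    for (size_t i = 0; i < count; i++) {
        if (bufs[i].enabled) {
            *out = bufs[i].value;
            drivers++;
        }
    }
    if (drivers == 0) return BUS_FLOATING;   /* nobody is driving the wire */
    if (drivers > 1)  return BUS_CONFLICT;   /* two outputs fighting       */
    return BUS_DRIVEN;
}
```

The instruction decoder's job, in these terms, is to make sure that exactly one buffer on each shared wire is enabled at any moment.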
Although they are not shown in this diagram, there would be control lines from the
instruction decoder that would:
Tell the A register to latch the value currently on the data bus
Tell the B register to latch the value currently on the data bus
Tell the C register to latch the value currently output by the ALU
Tell the program counter register to latch the value currently on the data bus
Tell the address register to latch the value currently on the data bus
Tell the instruction register to latch the value currently on the data bus
Coming into the instruction decoder are the bits from the test register and the clock
line, as well as the bits from the instruction register.
Microprocessor Memory
The previous section talked about the address and data buses, as well as the RD
and WR lines. These buses and lines connect either to RAM or ROM -- generally
both. In our sample microprocessor, we have an address bus 8 bits wide and a data
bus 8 bits wide. That means that the microprocessor can address 2^8 = 256 bytes of
memory, and it can read or write 8 bits of the memory at a time. Let's assume that
this simple microprocessor has 128 bytes of ROM starting at address 0 and 128
bytes of RAM starting at address 128.
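Here is a minimal C sketch of that memory map, assuming the 128-byte ROM and 128-byte RAM layout just described. The Memory type and the mem_read and mem_write helpers are hypothetical names. They stand in for what the RD and WR lines do in hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ROM_SIZE 128   /* addresses 0 through 127   */
#define RAM_SIZE 128   /* addresses 128 through 255 */

typedef struct {
    uint8_t rom[ROM_SIZE];
    uint8_t ram[RAM_SIZE];
} Memory;

/* RD: return the byte at an 8-bit address. */
uint8_t mem_read(const Memory *m, uint8_t addr)
{
    assert(m != NULL);
    return (addr < ROM_SIZE) ? m->rom[addr] : m->ram[addr - ROM_SIZE];
}

/* WR: store a byte. Writes aimed at ROM are ignored and reported
   by returning false. */
bool mem_write(Memory *m, uint8_t addr, uint8_t value)
{
    assert(m != NULL);
    if (addr < ROM_SIZE) return false;
    m->ram[addr - ROM_SIZE] = value;
    return true;
}
```

Because the address is a single 8-bit value, it can never name more than 256 locations, which is exactly the limit the 8-bit address bus imposes.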
ROM stands for read-only memory. A ROM chip is programmed with a permanent
collection of pre-set bytes. The address bus tells the ROM chip which byte to get
and place on the data bus. When the RD line changes state, the ROM chip presents
the selected byte onto the data bus.
RAM stands for random-access memory. RAM contains bytes of information, and the
microprocessor can read or write to those bytes depending on whether the RD or
WR line is signaled. One problem with today's RAM chips is that they forget
everything once the power goes off. That is why the computer needs ROM.
RAM chip
By the way, nearly all computers contain some amount of ROM (it is possible to
create a simple computer that contains no RAM -- many microcontrollers do this by
placing a handful of RAM bytes on the processor chip itself -- but generally
impossible to create one that contains no ROM). On a PC, the ROM is called
the BIOS (Basic Input/Output System). When the microprocessor starts, it begins
executing instructions it finds in the BIOS. The BIOS instructions do things like test
the hardware in the machine, and then they go to the hard disk to fetch the boot
sector (see How Hard Disks Work for details). This boot sector is another small
program, and the BIOS stores it in RAM after reading it off the disk. The
microprocessor then begins executing the boot sector's instructions from RAM. The
boot sector program will tell the microprocessor to fetch something else from the
hard disk into RAM, which the microprocessor then executes, and so on. This is how
the microprocessor loads and executes the entire operating system.
Microprocessor Instructions
Even the incredibly simple microprocessor shown in the previous example will have
a fairly large set of instructions that it can perform. The collection of instructions is
implemented as bit patterns, each one of which has a different meaning when
loaded into the instruction register. Humans are not particularly good at
remembering bit patterns, so a set of short words is defined to represent the
different bit patterns. This collection of words is called the assembly language of
the processor. An assembler can translate the words into their bit patterns very
easily, and then the output of the assembler is placed in memory for the
microprocessor to execute.
Here's the set of assembly language instructions that the designer might create for
the simple microprocessor in our example:
If you have read How C Programming Works, then you know that this simple piece of
C code will calculate the factorial of 5 (where the factorial of 5 = 5! = 5 * 4 * 3 * 2 *
1 = 120):
a=1;
f=1;
while (a <= 5)
{
    f = f * a;
    a = a + 1;
}
At the end of the program's execution, the variable f contains the factorial of 5.
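To try the loop on a real compiler, it can be wrapped in a function. This is a sketch, not part of the original example: the function name factorial and the parameter n (which replaces the constant 5) are additions.

```c
#include <assert.h>

/* The same loop as above, with the limit passed in as n. */
unsigned long factorial(unsigned int n)
{
    /* unsigned long is only guaranteed to hold 32 bits, and 13! no
       longer fits in 32 bits, so larger inputs are rejected. */
    assert(n <= 12);

    unsigned long a = 1;
    unsigned long f = 1;
    while (a <= n) {
        f = f * a;
        a = a + 1;
    }
    return f;
}
```

Calling factorial(5) returns 120, matching the hand calculation.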
Assembly Language
A C compiler translates this C code into assembly language. Assuming that RAM
starts at address 128 in this processor, and ROM (which contains the assembly
language program) starts at address 0, then for our simple microprocessor the
assembly language might look like this:
// Assume a is at address 128
// Assume f is at address 129
0   CONB 1      // a=1;
1   SAVEB 128
2   CONB 1      // f=1;
3   SAVEB 129
4   LOADA 128   // if a > 5 then jump to 17
5   CONB 5
6   COM
7   JG 17
8   LOADA 129   // f=f*a;
9   LOADB 128
10  MUL
11  SAVEC 129
12  LOADA 128   // a=a+1;
13  CONB 1
14  ADD
15  SAVEC 128
16  JUMP 4      // loop back to if
17  STOP
ROM
So now the question is, "How do all of these instructions look in ROM?" Each of
these assembly language instructions must be represented by a binary number. For
the sake of simplicity, let's assume each assembly language instruction is given a
unique number, like this:
LOADA mem - 1
LOADB mem - 2
CONB con - 3
SAVEB mem - 4
SAVEC mem - 5
ADD - 6
SUB - 7
MUL - 8
DIV - 9
COM - 10
JUMP addr - 11
JEQ addr - 12
JNEQ addr - 13
JG addr - 14
JGE addr - 15
JL addr - 16
JLE addr - 17
STOP - 18
The numbers are known as opcodes. In ROM, our little program would look like this:
// Assume a is at address 128
// Assume f is at address 129
Addr  opcode/value
0     3     // CONB 1
1     1
2     4     // SAVEB 128
3     128
4     3     // CONB 1
5     1
6     4     // SAVEB 129
7     129
8     1     // LOADA 128
9     128
10    3     // CONB 5
11    5
12    10    // COM
13    14    // JG 17
14    31
15    1     // LOADA 129
16    129
17    2     // LOADB 128
18    128
19    8     // MUL
20    5     // SAVEC 129
21    129
22    1     // LOADA 128
23    128
24    3     // CONB 1
25    1
26    6     // ADD
27    5     // SAVEC 128
28    128
29    11    // JUMP 4
30    8
31    18    // STOP
You can see that seven lines of C code became 18 lines of assembly language, and
that became 32 bytes in ROM.
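One way to check the ROM listing above is to run it through a small interpreter. The C sketch below does that. The opcode numbers and the 32 program bytes come straight from the listing, but what each instruction does is an assumption inferred from its name (for example, that MUL multiplies A by B and puts the result in C, and that COM compares A with B). The Cpu type and the run and run_factorial_program functions are invented for the example. Note that the operands of JG and JUMP are ROM byte addresses (31 and 8), not the assembly line numbers (17 and 4).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Opcode numbers from the table above: LOADA = 1 ... STOP = 18. */
enum {
    LOADA = 1, LOADB, CONB, SAVEB, SAVEC, ADD, SUB, MUL, DIV, COM,
    JUMP, JEQ, JNEQ, JG, JGE, JL, JLE, STOP
};

/* The 32 ROM bytes from the listing above. */
static const uint8_t PROGRAM[32] = {
    CONB, 1, SAVEB, 128,                      /* a = 1            */
    CONB, 1, SAVEB, 129,                      /* f = 1            */
    LOADA, 128, CONB, 5, COM, JG, 31,         /* if a > 5, stop   */
    LOADA, 129, LOADB, 128, MUL, SAVEC, 129,  /* f = f * a        */
    LOADA, 128, CONB, 1, ADD, SAVEC, 128,     /* a = a + 1        */
    JUMP, 8,                                  /* back to the test */
    STOP
};

typedef struct {
    uint8_t mem[256];  /* ROM at 0..127, RAM at 128..255 */
    uint8_t a, b, c;   /* registers A, B and C */
    uint8_t pc;        /* program counter */
    int     test;      /* last COM result: negative, zero or positive */
} Cpu;

/* Assumed semantics, inferred from the mnemonics. Runs until STOP and
   returns the number of instructions executed, or -1 on an unknown
   opcode, a division by zero, or when max_steps runs out. */
int run(Cpu *cpu, int max_steps)
{
    for (int steps = 1; steps <= max_steps; steps++) {
        uint8_t op  = cpu->mem[cpu->pc++];
        uint8_t arg = 0;
        /* Loads, saves and jumps carry one operand byte. */
        if ((op >= LOADA && op <= SAVEC) || (op >= JUMP && op <= JLE))
            arg = cpu->mem[cpu->pc++];

        switch (op) {
        case LOADA: cpu->a = cpu->mem[arg]; break;
        case LOADB: cpu->b = cpu->mem[arg]; break;
        case CONB:  cpu->b = arg; break;
        /* Writes to ROM addresses (below 128) are silently ignored. */
        case SAVEB: if (arg >= 128) cpu->mem[arg] = cpu->b; break;
        case SAVEC: if (arg >= 128) cpu->mem[arg] = cpu->c; break;
        case ADD:   cpu->c = (uint8_t)(cpu->a + cpu->b); break;
        case SUB:   cpu->c = (uint8_t)(cpu->a - cpu->b); break;
        case MUL:   cpu->c = (uint8_t)(cpu->a * cpu->b); break;
        case DIV:
            if (cpu->b == 0) return -1;
            cpu->c = (uint8_t)(cpu->a / cpu->b);
            break;
        case COM:   cpu->test = (cpu->a > cpu->b) - (cpu->a < cpu->b); break;
        case JUMP:  cpu->pc = arg; break;
        case JEQ:   if (cpu->test == 0) cpu->pc = arg; break;
        case JNEQ:  if (cpu->test != 0) cpu->pc = arg; break;
        case JG:    if (cpu->test >  0) cpu->pc = arg; break;
        case JGE:   if (cpu->test >= 0) cpu->pc = arg; break;
        case JL:    if (cpu->test <  0) cpu->pc = arg; break;
        case JLE:   if (cpu->test <= 0) cpu->pc = arg; break;
        case STOP:  return steps;
        default:    return -1;
        }
    }
    return -1;
}

/* Load the program into ROM, run it, and return f from address 129. */
uint8_t run_factorial_program(void)
{
    Cpu cpu;
    memset(&cpu, 0, sizeof cpu);
    memcpy(cpu.mem, PROGRAM, sizeof PROGRAM);
    int steps = run(&cpu, 1000);
    assert(steps > 0);  /* the program is expected to reach STOP */
    (void)steps;
    return cpu.mem[129];
}
```

Under these assumptions, run_factorial_program() returns 120. The registers here are 8 bits wide, so the result fits only because 5! is less than 256. A factorial of 6 would overflow.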
Decoding
The instruction decoder needs to turn each of the opcodes into a set of signals that
drive the different components inside the microprocessor. Let's take the ADD
instruction as an example and look at what it needs to do:
1. During the first clock cycle, we need to actually load the instruction.
Therefore the instruction decoder needs to: activate the tri-state buffer for
the program counter; activate the RD line; activate the data-in tri-state
buffer; and latch the instruction into the instruction register.
2. During the second clock cycle, the ADD instruction is decoded. It needs to do
very little: set the operation of the ALU to addition, and latch the output of
the ALU into the C register.
3. During the third clock cycle, the program counter is incremented (in theory
this could be overlapped into the second clock cycle).
Every instruction can be broken down as a set of sequenced operations like these
that manipulate the components of the microprocessor in the proper order. Some
instructions, like this ADD instruction, might take two or three clock cycles. Others
might take five or six clock cycles.
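The three clock cycles of the ADD instruction can be sketched as a tiny state machine in C. This is an illustration only. The Machine type, the Phase names and the clock_tick function are hypothetical, the opcode value 6 is the one assigned to ADD earlier, and the control signals are reduced to comments.

```c
#include <assert.h>
#include <stdint.h>

#define OP_ADD 6   /* the opcode assigned to ADD earlier */

typedef enum { FETCH = 0, EXECUTE, INCREMENT } Phase;

typedef struct {
    uint8_t mem[256];
    uint8_t pc;        /* program counter */
    uint8_t ir;        /* instruction register */
    uint8_t a, b, c;   /* registers A, B and C */
    Phase   phase;     /* which clock cycle of the instruction is next */
} Machine;

/* Advance by one clock cycle. Returns 1 when the ADD has completed. */
int clock_tick(Machine *m)
{
    switch (m->phase) {
    case FETCH:
        /* Cycle 1: the program counter drives the address bus, RD is
           active, the data-in buffer is enabled and the instruction
           register latches the byte on the data bus. */
        m->ir = m->mem[m->pc];
        m->phase = EXECUTE;
        return 0;
    case EXECUTE:
        /* Cycle 2: the ALU is set to addition and C latches its output. */
        assert(m->ir == OP_ADD);   /* this sketch decodes only ADD */
        m->c = (uint8_t)(m->a + m->b);
        m->phase = INCREMENT;
        return 0;
    default:
        /* Cycle 3: the program counter is incremented. */
        m->pc++;
        m->phase = FETCH;
        return 1;
    }
}
```

With A holding 2 and B holding 3, three calls to clock_tick leave 5 in C and move the program counter forward by one.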
Microprocessor Performance and Trends
The number of transistors available has a huge effect on the performance of a
processor. As seen earlier, a typical instruction in a processor like an 8088 took 15
clock cycles to execute. Because of the design of the multiplier, it took
approximately 80 cycles just to do one 16-bit multiplication on the 8088. With
more transistors, much more powerful multipliers capable of single-cycle speeds
become possible.
More transistors also allow for a technology called pipelining. In a pipelined
architecture, instruction execution overlaps. So even though it might take five clock
cycles to execute each instruction, there can be five instructions in various stages of
execution simultaneously. That way it looks like one instruction completes every
clock cycle.
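The effect of pipelining on total run time can be worked out with two small C functions. This is an idealized sketch that assumes the pipeline never stalls. Real processors lose cycles to branches and data dependencies. Both function names are made up for the example.

```c
#include <assert.h>

/* Cycles needed when each instruction must finish before the next
   one starts. */
unsigned long cycles_unpipelined(unsigned long n, unsigned long stages)
{
    assert(stages > 0);
    return n * stages;
}

/* Cycles needed with an ideal pipeline: the first instruction takes
   'stages' cycles to pass through, and after that one instruction
   completes on every cycle. */
unsigned long cycles_pipelined(unsigned long n, unsigned long stages)
{
    assert(stages > 0);
    if (n == 0) return 0;
    return stages + (n - 1);
}
```

For 100 instructions that each take five cycles, the unpipelined count is 500 cycles and the ideal pipelined count is 104, which is why it looks like one instruction completes every clock cycle.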
Many modern processors have multiple instruction decoders, each with its own
pipeline. This allows for multiple instruction streams, which means that more than
one instruction can complete during each clock cycle. This technique can be quite
complex to implement, so it takes lots of transistors.
Trends
The trend in processor design has primarily been toward full 32-bit ALUs with fast
floating point processors built in and pipelined execution with multiple instruction
streams. The newest thing in processor design is 64-bit ALUs, and people are
expected to have these processors in their home PCs in the next decade. There has
also been a tendency toward special instructions (like the MMX instructions) that
make certain operations particularly efficient, and the addition of hardware virtual
memory support and L1 caching on the processor chip. All of these trends push up
the transistor count, leading to the multi-million transistor powerhouses available
today. These processors can execute about one billion instructions per second!
of 64-bit features. But the average user who is reading e-mail, browsing the Web
and editing Word documents is not really using the processor in that way.