Ch1 - Basic Concepts and Computer Evolution
1.1 Organization and Architecture
1.2 Structure and Function
Function
Structure
2 CHAPTER 1 / BASIC CONCEPTS AND COMPUTER EVOLUTION
LEARNING OBJECTIVES
After studying this chapter, you should be able to:
♦ Explain the general functions and structure of a digital computer.
♦ Present an overview of the evolution of computer technology from early
digital computers to the latest microprocessors.
♦ Present an overview of the evolution of the x86 architecture.
♦ Define embedded systems and list some of the requirements and constraints
that various embedded systems must meet.
included a number of models. The customer with modest requirements could buy a
cheaper, slower model and, if demand increased, later upgrade to a more expensive,
faster model without having to abandon software that had already been developed.
Over the years, IBM has introduced many new models with improved technology
to replace older models, offering the customer greater speed, lower cost, or both.
These newer models retained the same architecture so that the customer's software
investment was protected. Remarkably, the System/370 architecture, with a
few enhancements, has survived to this day as the architecture of IBM's mainframe
product line.
In a class of computers called microcomputers, the relationship between architecture
and organization is very close. Changes in technology not only influence
organization but also result in the introduction of more powerful and more complex
architectures. Generally, there is less of a requirement for generation-to-generation
compatibility for these smaller machines. Thus, there is more interplay between
organizational and architectural design decisions. An intriguing example of this is
the reduced instruction set computer (RISC), which we examine in Chapter 15.
This book examines both computer organization and computer architecture.
The emphasis is perhaps more on the side of organization. However, because a
computer organization must be designed to implement a particular architectural
specification, a thorough treatment of organization requires a detailed examination
of architecture as well.
lower layers of the hierarchy. The remainder of this section provides a very brief
overview of this plan of attack.
Function
Both the structure and functioning of a computer are, in essence, simple. In general
terms, there are only four basic functions that a computer can perform:
■ Data processing: Data may take a wide variety of forms, and the range of processing
requirements is broad. However, we shall see that there are only a few
fundamental methods or types of data processing.
■ Data storage: Even if the computer is processing data on the fly (i.e., data
come in and get processed, and the results go out immediately), the computer
must temporarily store at least those pieces of data that are being worked on
at any given moment. Thus, there is at least a short-term data storage function.
Equally important, the computer performs a long-term data storage function.
Files of data are stored on the computer for subsequent retrieval and update.
■ Data movement: The computer's operating environment consists of devices
that serve as either sources or destinations of data. When data are received
from or delivered to a device that is directly connected to the computer, the
process is known as input-output (I/O), and the device is referred to as a
peripheral. When data are moved over longer distances, to or from a remote
device, the process is known as data communications.
■ Control: Within the computer, a control unit manages the computer's
resources and orchestrates the performance of its functional parts in response
to instructions.
The preceding discussion may seem absurdly generalized. It is certainly
possible, even at a top level of computer structure, to differentiate a variety of functions,
but to quote [SIEW82]:
Structure
We now look in a general way at the internal structure of a computer. We begin with
a traditional computer with a single processor that employs a microprogrammed
control unit, then examine a typical multicore structure.
SIMPLE SINGLE-PROCESSOR COMPUTER Figure 1.1 provides a hierarchical view
of the internal structure of a traditional single-processor computer. There are four
main structural components:
■ Central processing unit (CPU): Controls the operation of the computer and
performs its data processing functions; often simply referred to as processor.
■ Main memory: Stores data.
■ I/O: Moves data between the computer and its external environment.
■ System interconnection: Some mechanism that provides for communication
among CPU, main memory, and I/O. A common example of system interconnection
is by means of a system bus, consisting of a number of conducting
wires to which all the other components attach.
There may be one or more of each of the aforementioned components. Traditionally,
there has been just a single processor. In recent years, there has been
increasing use of multiple processors in a single computer. Some design issues relating
to multiple processors crop up and are discussed as the text proceeds; Part Five
focuses on such computers.
Each of these components will be examined in some detail in Part Two. However,
for our purposes, the most interesting and in some ways the most complex
component is the CPU. Its major structural components are as follows:
■ Control unit: Controls the operation of the CPU and hence the computer.
■ Arithmetic and logic unit (ALU): Performs the computer's data processing
functions.
■ Registers: Provide storage internal to the CPU.
■ CPU interconnection: Some mechanism that provides for communication
among the control unit, ALU, and registers.
Part Three covers these components, where we will see that complexity is added by
the use of parallel and pipelined organizational techniques. Finally, there are several
approaches to the implementation of the control unit; one common approach is
a microprogrammed implementation. In essence, a microprogrammed control unit
operates by executing microinstructions that define the functionality of the control
unit. With this approach, the structure of the control unit can be depicted, as in
Figure 1.1. This structure is examined in Part Four.
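To make the roles of the CPU's components concrete, here is a minimal sketch of a toy processor in Python. It is not drawn from the text: the instruction format, the opcode names (LOAD, STORE, ADD, SUB, HALT), the register names, and the functions `alu` and `run` are all invented for illustration. The control logic is coded directly as a loop, not microprogrammed, so it shows only the division of labor among control unit, ALU, registers, and main memory.

```python
# Toy single-processor model. The instruction set and all names here are
# hypothetical; they illustrate the component roles, not a real machine.

def alu(op, a, b):
    """ALU: performs the data processing operation chosen by the control unit."""
    if op == "ADD":
        return a + b
    if op == "SUB":
        return a - b
    raise ValueError(f"unknown ALU operation: {op}")


def run(program, data):
    """Control unit: fetch, decode, and execute instructions until HALT."""
    registers = {"R0": 0, "R1": 0}   # storage internal to the CPU
    memory = dict(data)              # main memory (data only, for simplicity)
    pc = 0                           # program counter: index of next instruction
    while True:
        opcode, *operands = program[pc]   # fetch and decode
        pc += 1
        if opcode == "HALT":
            return registers, memory
        if opcode == "LOAD":              # main memory -> register
            reg, addr = operands
            registers[reg] = memory[addr]
        elif opcode == "STORE":           # register -> main memory
            reg, addr = operands
            memory[addr] = registers[reg]
        else:                             # arithmetic is delegated to the ALU
            dst, src = operands
            registers[dst] = alu(opcode, registers[dst], registers[src])


program = [
    ("LOAD", "R0", 100),
    ("LOAD", "R1", 101),
    ("ADD", "R0", "R1"),
    ("STORE", "R0", 102),
    ("HALT",),
]
registers, memory = run(program, {100: 7, 101: 5})
print(memory[102])  # prints 12
```

The dictionary lookups between `run` and `alu` stand in for the CPU interconnection. A real design would also keep instructions in main memory, which this sketch separates out for brevity.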
MULTICORE COMPUTER STRUCTURE As was mentioned, contemporary
computers generally have multiple processors. When these processors all reside
on a single chip, the term multicore computer is used, and each processing unit
(consisting of a control unit, ALU, registers, and perhaps cache) is called a core. To
clarify the terminology, this text will use the following definitions.
■ Central processing unit (CPU): That portion of a computer that fetches and
executes instructions. It consists of an ALU, a control unit, and registers.
In a system with a single processing unit, it is often simply referred to as a
processor.
■ Core: An individual processing unit on a processor chip. A core may be equivalent
in functionality to a CPU on a single-CPU system. Other specialized processing
units, such as one optimized for vector and matrix operations, are also
referred to as cores.
■ Processor: A physical piece of silicon containing one or more cores. The
processor is the computer component that interprets and executes instructions.
If a processor contains multiple cores, it is referred to as a multicore
processor.
After about a decade of discussion, there is broad industry consensus on this usage.
Another prominent feature of contemporary computers is the use of multiple
layers of memory, called cache memory, between the processor and main memory.
Chapter 4 is devoted to the topic of cache memory. For our purposes in this section,
we simply note that a cache memory is smaller and faster than main memory and is
used to speed up memory access by holding data from main memory that is likely
to be used in the near future. A greater performance improvement may
be obtained by using multiple levels of cache, with level 1 (L1) closest to the core
and additional levels (L2, L3, and so on) progressively farther from the core. In this
scheme, level n is smaller and faster than level n + 1.
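The multilevel scheme just described can be sketched as a lookup that probes L1 first and falls through to larger, slower levels. The Python model below is an illustration under stated assumptions, not a description of any real processor: the class `CacheLevel`, the function `access`, the capacities, the cycle counts, and the least-recently-used replacement policy are all hypothetical, and each address is treated as a whole block.

```python
from collections import OrderedDict


class CacheLevel:
    """One cache level with LRU replacement; capacity is counted in blocks."""

    def __init__(self, name, capacity, latency):
        self.name = name
        self.capacity = capacity
        self.latency = latency          # cycles to probe this level (assumed)
        self.blocks = OrderedDict()     # insertion order tracks recency

    def lookup(self, addr):
        if addr in self.blocks:
            self.blocks.move_to_end(addr)    # mark as most recently used
            return True
        return False

    def fill(self, addr):
        self.blocks[addr] = True
        self.blocks.move_to_end(addr)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used block


def access(levels, addr, memory_latency=100):
    """Probe each level in order; return (cycles spent, where the data was found)."""
    cycles = 0
    missed = []
    source = "main memory"
    for level in levels:
        cycles += level.latency
        if level.lookup(addr):
            source = level.name
            break
        missed.append(level)
    else:
        cycles += memory_latency    # no cache level held the block
    for level in missed:
        level.fill(addr)            # copy the block into every level that missed
    return cycles, source


# Level n is smaller and faster than level n + 1.
levels = [CacheLevel("L1", 2, 1), CacheLevel("L2", 4, 10), CacheLevel("L3", 8, 40)]

first = access(levels, 0x40)    # cold miss at every level
second = access(levels, 0x40)   # now resident in L1
access(levels, 0x80)
access(levels, 0xC0)            # pushes 0x40 out of the two-block L1
third = access(levels, 0x40)    # still resident in L2
print(first, second, third)
```

With these assumed latencies, the first access costs 151 cycles, the repeat costs 1, and the access after eviction from L1 costs 11. That difference is the payoff of keeping recently used data close to the core.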
Figure 1.2 Multicore computer structure, shown at three levels: the motherboard (main memory chips, I/O chips, processor chip); the processor chip (cores and L3 cache); and a single core (instruction logic, load/store logic, L1 I-cache, L1 data cache, L2 instruction cache, L2 data cache).
The motherboard contains a slot or socket for the processor chip, which typically
contains multiple individual cores, in what is known as a multicore processor.
There are also slots for memory chips, I/O controller chips, and other key computer
components. For desktop computers, expansion slots enable the inclusion of more
components on expansion boards. Thus, a modern motherboard connects only a
few individual chip components, with each chip containing from a few thousand up
to hundreds of millions of transistors.
Figure 1.2 shows a processor chip that contains eight cores and an L3 cache.
Not shown is the logic required to control operations between the cores and the
cache and between the cores and the external circuitry on the motherboard. The
figure indicates that the L3 cache occupies two distinct portions of the chip surface.
However, typically, all cores have access to the entire L3 cache via the aforementioned
control circuits. The processor chip shown in Figure 1.2 does not represent
any specific product, but provides a general idea of how such chips are laid out.
Next, we zoom in on the structure of a single core, which occupies a portion of
the processor chip. In general terms, the functional elements of a core are:
■ Instruction logic: This includes the tasks involved in fetching instructions,
and decoding each instruction to determine the instruction operation and the
memory locations of any operands.
■ Arithmetic and logic unit (ALU): Performs the operation specified by an
instruction.
■ Load/store logic: Manages the transfer of data to and from main memory via
cache.
The core also contains an L1 cache, split between an instruction cache
(I-cache) that is used for the transfer of instructions to and from main memory, and
an L1 data cache, for the transfer of operands and results. Typically, today's processor
chips also include an L2 cache as part of the core. In many cases, this cache
is also split between instruction and data caches, although a combined, single L2
cache is also used.
Keep in mind that this representation of the layout of the core is only intended
to give a general idea of internal core structure. In a given product, the functional
elements may not be laid out as the three distinct elements shown in Figure 1.2,
especially if some or all of these functions are implemented as part of a microprogrammed
control unit.
EXAMPLES It will be instructive to look at some real-world examples that
illustrate the hierarchical structure of computers. Figure 1.3 is a photograph of the
motherboard for a computer built around two Intel Quad-Core Xeon processor
chips. Many of the elements labeled on the photograph are discussed subsequently
in this book. Here, we mention the most important, in addition to the processor
sockets:
■ PCI-Express slots for a high-end display adapter and for additional peripherals
(Section 3.6 describes PCIe).
■ Ethernet controller and Ethernet ports for network connections.
■ USB sockets for peripheral devices.
Figure 1.3 Motherboard with two Intel Quad-Core Xeon processors. Labels: Intel 3420 chipset; six-channel DDR3-1333 memory interfaces (up to 48 GB); 2x quad-core Intel Xeon processors with integrated memory controllers; 2x USB 2.0 internal; 2x USB 2.0 external; BIOS; Ethernet controller.
■ Serial ATA (SATA) sockets for connection to disk memory (Section 7.7
discusses Ethernet, USB, and SATA).
■ Interfaces for DDR (double data rate) main memory chips (Section 5.3
discusses DDR).
■ The Intel 3420 chipset is an I/O controller for direct memory access (DMA) operations
between peripheral devices and main memory (Section 7.5 discusses DMA).
Following our top-down strategy, as illustrated in Figures 1.1 and 1.2, we can
now zoom in and look at the internal structure of a processor chip. For variety, we
look at an IBM chip instead of the Intel processor chip. Figure 1.4 is a photograph
of the processor chip for the IBM zEnterprise EC12 mainframe computer. This chip
has 2.75 billion transistors. The superimposed labels indicate how the silicon real
estate of the chip is allocated. We see that this chip has six cores, or processors.
In addition, there are two large areas labeled L3 cache, which are shared by all six
processors. The L3 control logic controls traffic between the L3 cache and the cores
and between the L3 cache and the external environment. Additionally, there is storage
control (SC) logic between the cores and the L3 cache. The memory controller
(MC) function controls access to memory external to the chip. The GX I/O bus
controls the interface to the channel adapters accessing the I/O.
Going down one level deeper, we examine the internal structure of a single
core, as shown in the photograph of Figure 1.5. Keep in mind that this is a portion
of the silicon surface area making up a single processor chip. The main sub-areas
within this core area are the following:
■ ISU (instruction sequence unit): Determines the sequence in which instructions
are executed in what is referred to as a superscalar architecture (Chapter 16).
■ IFU (instruction fetch unit): Logic for fetching instructions.
■ IDU (instruction decode unit): The IDU is fed from the IFU buffers, and is
responsible for the parsing and decoding of all z/Architecture operation codes.
■ LSU (load-store unit): The LSU contains the 96-kB L1 data cache,¹ and manages
data traffic between the L2 data cache and the functional execution
units. It is responsible for handling all types of operand accesses of all lengths,
modes, and formats as defined in the z/Architecture.
■ XU (translation unit): This unit translates logical addresses from instructions
into physical addresses in main memory. The XU also contains a translation
lookaside buffer (TLB) used to speed up memory access. TLBs are discussed
in Chapter 8.
■ FXU (fixed-point unit): The FXU executes fixed-point arithmetic operations.
■ BFU (binary floating-point unit): The BFU handles all binary and hexadecimal
floating-point operations, as well as fixed-point multiplication operations.
■ DFU (decimal floating-point unit): The DFU handles both fixed-point and
floating-point operations on numbers that are stored as decimal digits.
■ RU (recovery unit): The RU keeps a copy of the complete state of the system
that includes all registers, collects hardware fault signals, and manages the
hardware recovery actions.
¹ kB = kilobyte = 1024 bytes. Numerical prefixes are explained in a document under the "Other Useful"
tab at ComputerScienceStudent.com.
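The translation unit's job, mapping a logical address to a physical address with a TLB as a shortcut, can be sketched in a few lines of Python. This is a generic single-level paging model under loud assumptions: the 4096-byte page size, the flat `page_table` dictionary, the two-entry TLB, the LRU eviction policy, and the class name `TranslationUnit` are all hypothetical. It does not reproduce the actual z/Architecture translation mechanism.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes, chosen only for the sketch


class TranslationUnit:
    """Translates logical to physical addresses, caching recent pages in a TLB."""

    def __init__(self, page_table, tlb_entries=2):
        self.page_table = page_table    # logical page number -> physical frame number
        self.tlb = OrderedDict()        # small cache of recent translations
        self.tlb_entries = tlb_entries
        self.hits = 0
        self.misses = 0

    def translate(self, logical_addr):
        page, offset = divmod(logical_addr, PAGE_SIZE)
        if page in self.tlb:
            self.hits += 1
            self.tlb.move_to_end(page)           # mark as most recently used
            frame = self.tlb[page]
        else:
            self.misses += 1
            frame = self.page_table[page]        # slow path: consult the page table
            self.tlb[page] = frame
            if len(self.tlb) > self.tlb_entries:
                self.tlb.popitem(last=False)     # evict least recently used entry
        return frame * PAGE_SIZE + offset


xu = TranslationUnit({0: 5, 1: 9})
a = xu.translate(0x0010)   # TLB miss: page 0 maps to frame 5
b = xu.translate(0x1004)   # TLB miss: page 1 maps to frame 9
c = xu.translate(0x0020)   # TLB hit: page 0 is already cached
print(hex(a), hex(b), hex(c), xu.hits, xu.misses)
```

The offset within the page passes through unchanged; only the page number is replaced. A TLB hit skips the page table lookup, which in real hardware means skipping one or more accesses to main memory.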
1.3 / A BRIEF HISTORY OF COMPUTERS 11