CIT 314
COURSE GUIDE
COMPUTER ARCHITECTURE AND ORGANIZATION II
Lagos Office
14/16 Ahmadu Bello Way
Victoria Island, Lagos
Departmental email: [email protected]
NOUN e-mail: [email protected]
URL: www.nou.edu.ng
First Printed 2022
ISBN: 978-058-557-5
CIT 314 COURSE GUIDE
CONTENTS
Introduction
Summary
COURSE GUIDE
INTRODUCTION
This book is about the structure and function of computers. Its purpose
is to present, as clearly and completely as possible, the nature and
characteristics of modern-day computer systems.
In spite of the variety and pace of change in the computer field, certain
fundamental concepts apply consistently throughout. The application of
these concepts depends on the current state of the technology and the
price/performance objectives of the designer. The intent of this book is
to provide a thorough discussion of the fundamentals of computer
organization and architecture and to relate these to contemporary design
issues.
This course is divided into four modules. The first module deals with an
overview of the memory system. The second module covers memory
addressing, elements of the memory hierarchy, and virtual memory control
systems. The third module discusses various forms of control, including
hardwired, microprogrammed, and asynchronous forms.
The fourth and last module takes on fault-tolerant computing and
methods for fault-tolerant computing.
This course guide gives you a brief overview of the course contents,
course duration, and course materials.
COURSE COMPETENCIES
First, students will learn about the basics of computers and what they are
made up of. Second, they will be able to judge certain functionalities of
computer systems depending on the type of architecture they are
operating on. This, in turn, will give them a deeper understanding of how
to manage computer faults.
COURSE OBJECTIVES
Certain objectives have been set out for the achievement of the course
aims. Apart from the course objectives, each unit of this course has
its own objectives, which you should confirm are met at the end of each
unit. So, upon the completion of this course, you should be able to:
STUDY UNITS
MODULE ONE
MODULE TWO
MODULE THREE
MODULE FOUR
You should make use of the course materials, and do the exercises to
enhance your learning.
PRESENTATION SCHEDULE
ASSESSMENT
There are two aspects to the assessment of the course. First are the
tutor-marked assignments; second is a written examination. In tackling the
assignments, you are expected to apply the information and knowledge
you acquired during this course. The assignments must be submitted to
your tutor for formal assessment in accordance with the deadlines stated
in the Assignment File. The work you submit to your tutor for
assessment will count for 30% of your total course mark. At the end of
the course, you will need to sit for a final three-hour examination.
This will count for 70% of your total course mark.
TUTOR-MARKED ASSIGNMENT
There are eight tutor- marked assignments in this course. You need to
submit all the assignments. The total marks for the best four (4)
assignments will be 30% of your total course mark.
Assignment questions for the units in this course are contained in the
Assignment
When you have completed each assignment, send it, together with a
tutor-marked assignment form, to your tutor. Make sure that each
assignment reaches your tutor on or before the deadline given. If,
however, you cannot complete your work on time, contact your tutor
before the assignment is due to discuss the possibility of an extension.
The final examination for the course will carry 70% of the
total mark available for this course. The examination will cover every
aspect of the course, so you are advised to revise all your corrected
assignments before the examination.
This course endows you with the status of a teacher and that of a learner.
This means that you teach yourself and that you learn, as your learning
capabilities allow. It also means that you are in a better position
to determine and to ascertain the what, the how, and the when of your
learning. No teacher imposes any method of learning on you.
The course units are similarly designed with the introduction following
the table of contents, then a set of objectives and then the discourse and
so on. The objectives guide you as you go through the units to ascertain
your knowledge of the required terms and expressions.
Assessment           Marks
Assignments 1-4      Four assignments; the best three of the four count for 30% of course marks
Final examination    70% of overall course marks
Total                100% of course marks
In distance learning the study units replace the university lecturer. This is
one of the great advantages of distance learning; you can read and work
through specially designed study materials at your own pace, and at a time
and place that suit you best. Think of it as reading the lecture instead of
listening to a lecturer. In the same way that a lecturer might set you some
reading to do, the study units tell you when to read your set books or other
material. Just as a lecturer might give you an in-class exercise, your study
units provide exercises for you to do at appropriate points.
Each of the study units follows a common format. The first item is an
introduction to the subject matter of the unit and how a particular unit is
integrated with the other units and the course as a whole. Next is a set of
learning objectives. These objectives enable you to know what you should be
able to do by the time you have completed the unit. You should use these
objectives to guide your study. When you have finished the units you
must go back and check whether you have achieved the objectives. If you
make a habit of doing this, you will significantly improve your chances of
passing the course.
Remember that your tutor’s job is to assist you. When you need help, don’t
hesitate to call and ask your tutor to provide it.
5. Assemble the study materials. Information about what you need for
a unit is given in the overview at the beginning of each unit. You
will almost always need both the study unit you are working on and
one of your set books on your desk at the same time.
6. Work through the unit. The content of the unit itself has been
arranged to provide a sequence for you to follow. As you work
through the unit you will be instructed to read sections from your
set books or other articles. Use the unit to guide your reading.
7. Review the objectives for each study unit to confirm that you have
achieved them. If you feel unsure about any of the objectives,
review the study material or consult your tutor.
8. When you are confident that you have achieved a unit’s
objectives, you can then start on the next unit. Proceed unit by
unit through the course and try to pace your study so that you
keep yourself on schedule.
9. When you have submitted an assignment to your tutor for marking,
do not wait for its return before starting on the next unit. Keep to
your schedule. When the assignment is returned, pay particular
attention to your tutor’s comments, both on the tutor-marked
assignment form and on the assignment. Consult your tutor as
soon as possible if you have any questions or problems.
10. After completing the last unit, review the course and prepare
yourself for the final examination. Check that you have achieved
the unit objectives (listed at the beginning of each unit) and the
course objectives (listed in this Course Guide).
Your tutor will mark and comment on your assignments, keep a close
watch on your progress and on any difficulties you might encounter and
provide assistance for you during the course. You must mail or submit
your tutor-marked assignments to your tutor well before the due date (at
least two working days are required). They will be marked by your tutor
and returned to you as soon as possible.
Contact your tutor if:
• you do not understand any part of the study units or the assigned
readings,
You should try your best to attend the tutorials. This is the only chance to
have face-to-face contact with your tutor and to ask questions which are
answered instantly. You can raise any problem encountered in the course
of your study. To gain the maximum benefit from course tutorials, prepare
a question list before attending them. You will learn a lot from
participating actively in discussions.
SUMMARY
We wish you success with the course and hope that you will find it
interesting and useful.
MAIN COURSE
CONTENTS
MODULE 1
MODULE 2
MODULE 3
CIT314 COMPUTER ARCHITECTURE AND ORGANIZATION II
CONTENTS
1.0 Introduction
2.0 Intended Learning Outcomes (ILOs)
3.0 Main Contents
UNIT ONE: Memory Systems
3.1 Main Memories
3.2 Auxiliary Memories
3.3 Memory Access Methods
3.4 Memory Mapping and Virtual Memories
3.5 Replacement Algorithms
3.6 Data Transfer Modes
3.7 Parallel Processing
3.8 Pipelining
4.0 Self-Assessment Exercises
5.0 Conclusion
6.0 Summary
7.0 References/Further Reading
1.0 INTRODUCTION
A related concept is the unit of transfer. For internal memory, the unit
of transfer is equal to the number of data lines into and out of the
memory module. This may be equal to the word length, but is often
larger, such as 64, 128, or 256 bits. From a user's point of view, the two
most important characteristics of memory are capacity and
performance. Three performance parameters are used: access time,
memory cycle time, and transfer rate. Memory cycle time is the access
time plus any additional time required before a second access
can commence. This additional time may be required for transients to
die out on signal lines or to regenerate data if they are read destructively.
Note that memory cycle time is concerned with the system bus, not the
processor.
At the end of this module, you should be able to discuss:
Magnetic tape
Magnetic disks
Floppy disks
Hard disks and drives
These high-speed storage devices are very expensive and hence the cost
per bit of storage is also very high. Again, the storage capacity of the
main memory is also very limited. Often it is necessary to store
hundreds of millions of bytes of data for the CPU to process. Therefore,
additional memory is required in all the computer systems. This memory
is called auxiliary memory or secondary storage. In this type of memory,
the cost per bit of storage is low. However, the operating speed is slower
than that of the primary memory. Most widely used secondary storage
devices are magnetic tapes, magnetic disks and floppy disks.
Magnetic tape is wound on reels (or spools). These may be used on their
own, as open-reel tape, or they may be contained in some sort of
magnetic tape cartridge for protection and ease of handling. Early
computers used open-reel tape, and this is still sometimes used on large
computer systems although it has been widely superseded by cartridge
tape. On smaller systems, if tape is used at all it is normally cartridge
tape.
Magnetic tape is used in a tape transport (also called a tape drive, tape
deck, tape unit, or MTU), a device that moves the tape over one or more
magnetic heads. An electrical signal is applied to the write head to
record data as a magnetic pattern on the tape; as the recorded tape passes
over the read head it generates an electrical signal from which the stored
data can be reconstructed. The two heads may be combined into a single
read/write head. There may also be a separate erase head to erase the
magnetic pattern remaining from previous use of the tape. Most
magnetic-tape formats have several separate data tracks running the
length of the tape. These may be recorded simultaneously, in which
case, for example, a byte of data may be recorded with one bit in each
track (parallel recording); alternatively, tracks may be recorded one at a
time (serial recording) with the byte written serially along one track.
Magnetic tape has been used for offline data storage, backup,
archiving, data interchange, and software distribution, and in the
early days (before disk storage was available) also as online
backing store. For many of these purposes it has been superseded
by magnetic or optical disk or by online communications. For
example, although tape is a non-volatile medium, it tends to
deteriorate in long-term storage and so needs regular attention
(typically an annual rewinding and inspection) as well as a
controlled environment. It is therefore being superseded for
archival purposes by optical disk.
Magnetic tape is still extensively used for backup; for this
purpose, interchange standards are of minor importance, so
proprietary cartridge-tape formats are widely used.
Magnetic tapes are used for large computers like mainframe
computers where large volume of data is stored for a longer time.
In PCs also you can use tapes in the form of cassettes.
The cost of storing data on tape is low. Tapes consist of
magnetic materials that store data permanently. A tape is a plastic
film, typically 12.5 mm to 25 mm wide and 500 m to 1200 m long,
coated with magnetic material. The tape deck is connected to the
central processor, and information is fed into or read from the tape
through the processor, much as in a cassette tape recorder.
You might have seen a gramophone record, which is circular like a
disk and coated with magnetic material. Magnetic disks used in
computers are made on the same principle. A disk rotates at very high
speed inside the disk drive, and data are stored on both surfaces of the
disk. Magnetic disks are the most popular medium for direct-access
storage. Each disk consists of a number of invisible concentric circles
called tracks. Information is recorded on the tracks of a disk surface in
the form of tiny magnetic spots. The presence of a magnetic spot
represents a one bit (1) and its absence represents a zero bit (0). The
information stored in a disk can be read many times without affecting
the stored data, so the read operation is non-destructive. But to write
new data, the existing data must first be erased from the disk and the
new data recorded.
The data capacity of magnetic disk memories ranges from several tens
of thousands up to several billion bits, and the average access time is 10-
100 milliseconds. The two main types are the hard disk and the floppy
disk.
individual bits due to the variability of motor speed. High-speed disks
have an access time of 28 milliseconds or less, and low-speed disks
65 milliseconds or more. The higher-speed disks also transfer their data
faster than the slower units.
The disks are usually aluminum with a magnetic coating. The heads
"float" just above the disk's surface on a current of air, sometimes at
lower than atmospheric pressure in an airtight enclosure. The head has
an aerodynamic shape, so the current pushes it away from the disk,
while a small spring pushes the head towards the disk, keeping the
head a constant distance from the disk (about two microns). Disk
drives are commonly characterized by the kind of interface used to
connect to the computer.
These are small removable disks that are plastic coated with magnetic
recording material. Floppy disks are typically 3.5″ in size (diameter) and
can hold 1.44 MB of data. This portable storage device is a rewritable
media and can be reused a number of times. Floppy disks are commonly
used to move files between different computers. The main disadvantage
of floppy disks is that they can be damaged easily and, therefore, are not
very reliable. The following figure shows an example of the floppy disk.
It is similar to magnetic
Tunnel erasure: As the track is laid down by the R/W heads, the
trailing tunnel-erasure heads force the data to be present only
within a specified narrow tunnel on each track. This process
prevents the signals from reaching adjacent tracks and causing
cross-talk.
Straddle erasure: In this method, the R/W and erasure heads
do recording and erasing at the same time. The erasure head is
not used to erase data stored on the diskette; it trims the top and
bottom fringes of the recorded flux reversals. The erasure heads
reduce the effect of cross-talk between tracks and minimize the
errors induced by minor run-out problems on the diskette or
diskette drive.
Head alignment: Alignment is the process of positioning the
heads with respect to the track that they must read and write.
Head alignment can be checked only against some sort of
reference: a standard disk recorded by a perfectly aligned machine.
Such disks are available, and one can be used to check
the drive alignment.
A hard disk drive (HDD), hard disk, hard drive or fixed disk is a data
storage device that uses magnetic storage to store and retrieve digital
information using one or more rigid rapidly rotating disks (platters)
coated with magnetic material. The platters are paired with magnetic
heads, usually arranged on a moving actuator arm, which read and write
data to the platter surfaces. Data is accessed in a random-access manner,
meaning that individual blocks of data can be stored or retrieved in any
order and not only sequentially.
As can be seen in the picture below, the desktop hard drive consists of
the following components: the head actuator, read/write actuator arm,
read/write head, spindle, and platter. On the back of a hard drive is a
circuit board called the disk controller or interface board and is what
allows the hard drive to communicate with the computer.
Although most hard drives are internal, there are also stand-alone
devices called external hard drives, which can back up data on computers
and expand the available disk space. External drives are often housed in
an enclosure that helps protect the drive and allows it to interface with
the computer, usually over USB or eSATA.
The first hard drive was introduced to the market by IBM on September
13, 1956. The hard drive was first used in the RAMAC 305 system, with
a storage capacity of 5 MB and a cost of about $50,000 ($10,000 per
megabyte). The hard drive was built into the computer and was not
removable. The first hard drive to have a storage capacity of one
gigabyte was also developed by IBM in 1980. It weighed 550 pounds
and cost $40,000. 1983 marked the introduction of the first 3.5-inch size
hard drive, developed by Rodime. It had a storage capacity of 10 MB.
Seagate was the first company to introduce a 7200 RPM hard drive in
1992. Seagate also introduced the first 10,000 RPM hard drive in 1996
and the first 15,000 RPM hard drive in 2000. The first solid-state drive
(SSD) as we know it today was developed by SanDisk Corporation
in 1991, with a storage capacity of 20 MB. However, this was not a
flash-based SSD; flash-based SSDs were introduced later, in 1995, by
M-Systems. These drives did not require a battery to keep data stored on
the memory chips, making them a non-volatile storage medium.
Advantages
Disadvantages
metal coating along the tracks. When data stored on the optical disk is to
be read, a less powerful laser beam is focused on the disk surface. The
storage capacity of these devices is tremendous, and optical disk access
time is relatively fast. The biggest drawback of the optical disk is that it
is a permanent storage device: data once written cannot be erased.
Therefore, it is a read-only storage medium. A typical example of the
optical disk is the CD-ROM.
1. Read-only memory (ROM) disks, like the audio CD, are used
for the distribution of standard program and data files. These are
mass-produced by mechanical pressing from a master die. The
information is actually stored as physical indentations on the
surface of the CD. Recently low-cost equipment has been
introduced in the market to make one-off CD-ROMs, putting
them into the next category.
2. Write-once read-many (WORM) disks: Some optical disks can
be recorded once. The information stored on the disk cannot be
changed or erased. Generally the disk has a thin reflective film
deposited on the surface. A strong laser beam is focused on
selected spots on the surface and pulsed. The energy melts the
film at that point, producing a nonreflective void. In the read
mode, a low power laser is directed at the disk and the bit
information is recovered by sensing the presence or absence of a
reflected beam from the disk.
3. Re-writeable, write-many read-many (WMRM) disks, just like
magnetic storage disks, allow information to be recorded and
erased many times. Usually, there is a separate erase cycle,
although this may be transparent to the user. Some modern
devices accomplish this with one over-write cycle. These
devices are also called direct read-after-write (DRAW) disks.
4. WORM (write once, read many) is a data storage technology
that allows information to be written to a disc a single time and
prevents the drive from erasing the data. The discs are
intentionally not rewritable, because they are especially intended
to store data that the user does not want to erase accidentally.
Because of this feature, WORM devices have long been used for
the archival purposes of organizations such as government
agencies or large enterprises. A type of optical media, WORM
devices were developed in the late 1970s and have been adapted
to a number of different media. The discs have varied in size
from 5.25 to 14 inches wide, in varying formats ranging from
140MB to more than 3 GB per side of the (usually) double-sided
medium. Data is written to a WORM disc with a low-powered
laser that makes permanent marks on the surface. WORM (Write
Once, Read Many) storage emerged in the late 1980s and
was popular with large institutions for archival storage.
Data need to be accessed from memory for various purposes. There
are several methods of accessing memory, as listed below:
Sequential access
Direct access
Random access
Associative access
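To make the contrast between these methods concrete, here is a small illustrative sketch (not from the course text; the cost model is an assumption for illustration) comparing sequential, tape-like access with constant-time random access:

```python
# Toy cost model: sequential access must pass over every record between
# the current head position and the target; random access is one step.

def sequential_access_cost(target_index, head_position=0):
    """Tape-like access: cost grows with the distance to the target record."""
    return abs(target_index - head_position)

def random_access_cost(target_index):
    """RAM-like access: constant cost regardless of the address."""
    return 1

# Reaching record 1000 sequentially from the start costs 1000 steps;
# random access reaches any record in a single step.
seq = sequential_access_cost(1000)
rnd = random_access_cost(1000)
```

Direct access (as on a disk) sits between the two: a jump to the general vicinity followed by a short sequential search.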
Accessing files via a memory map is faster than using I/O functions such
as fread and fwrite. Data are read and written using the virtual memory
capabilities that are built into the operating system, rather than by
allocating, copying into, and then deallocating data buffers owned by the
process. The system does not access data from the disk when the map is
first constructed; it only reads or writes the file on disk when a specified
part of the memory map is accessed, and then it reads or writes only that
specific part. This provides faster random access to the mapped data.
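Although the text describes the mechanism in general terms, the same facility is exposed in most languages; here is a minimal sketch using Python's standard mmap module (the temporary file and offsets are illustrative):

```python
import mmap
import os
import tempfile

# Create a small file to map.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"\x00" * 4096)              # one 4 KiB page of zeros

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mm: # map the whole file into memory
        mm[100:104] = b"DATA"            # write through the map; no fwrite needed
        chunk = bytes(mm[100:104])       # read back; only the touched page is loaded

os.remove(path)
```

The slice assignment and slice read go through the operating system's paging machinery, so only the pages actually touched are brought in from disk.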
3.4.1.2 Efficiency
Mapping a file into memory allows access to data in the file as if that
data had been read into an array in the application's address space.
Initially, MATLAB only allocates address space for the array; it
does not actually read data from the file until you access the mapped
region. As a result, memory-mapped files provide a mechanism by which
applications can access data segments in an extremely large file without
having to read the entire file into memory first.
Efficient coding style: memory-mapping in your MATLAB application
enables you to access file data using standard MATLAB indexing
operations.
Processes in a system share the CPU and main memory with other
processes. However, sharing the main memory poses some special
challenges. As demand on the CPU increases, processes slow down in
some reasonably smooth way. But if too many processes need too much
memory, then some of them will simply not be able to run. When a
program is out of space, it is out of luck. Memory is also vulnerable to
corruption. If some process inadvertently writes to the memory used by
another process, that process might fail in some bewildering fashion
totally unrelated to the program logic. In order to manage memory more
efficiently and with fewer errors, modern systems provide an abstraction
of main memory known as virtual memory (VM). Virtual memory is an
elegant interaction of hardware exceptions, hardware address translation,
main memory, disk files, and kernel software that provides each process
with a large, uniform, and private address space. With one clean
mechanism, virtual memory provides three important capabilities.
As with any cache, the VM system must have some way to determine if
a virtual page is cached somewhere in DRAM. If so, the system must
determine which physical page it is cached in. If there is a miss, the
system must determine where the virtual page is stored on disk, select a
victim page in physical memory, and copy the virtual page from disk to
DRAM, replacing the victim page. These capabilities are provided by a
combination of operating system software, address translation hardware
in the MMU (memory management unit), and a data structure stored in
physical memory known as a page table that maps virtual pages to
physical pages. The address translation hardware reads the page table
each time it converts a virtual address to a physical address. The
operating system is responsible for maintaining the contents of the page
table and for transferring pages back and forth between disk and DRAM.
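The translation the MMU performs can be sketched in a few lines. This is an illustrative model, not real hardware: the page size, the table contents, and the use of an exception to stand in for a page fault are all assumptions for the example.

```python
PAGE_SIZE = 4096  # 4 KiB pages (illustrative)

# Toy page table: virtual page number -> (present bit, physical frame number)
page_table = {0: (True, 7), 1: (False, None), 2: (True, 3)}

def translate(vaddr):
    """Split a virtual address into (page number, offset) and map it."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    present, frame = page_table.get(vpn, (False, None))
    if not present:
        # In a real system this traps to the OS, which fetches the page
        # from disk, updates the table, and restarts the instruction.
        raise LookupError(f"page fault at vpn {vpn}")
    return frame * PAGE_SIZE + offset

paddr = translate(2 * PAGE_SIZE + 10)   # vpn 2 maps to frame 3
```

The offset passes through unchanged; only the page number is translated, which is exactly why page sizes are powers of two.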
Virtual memory was invented in the early 1960s, long before the
widening CPU-memory gap spawned SRAM caches. As a result, virtual
memory systems use a different terminology from SRAM caches, even
though many of the ideas are similar. In virtual memory parlance, blocks
are known as pages. The activity of transferring a page between disk and
memory is known as swapping or paging. Pages are swapped in (paged
in) from disk to DRAM, and swapped out (paged out) from DRAM to
disk. The strategy of waiting until the last moment to swap in a page,
when a miss occurs, is known as demand paging. Other approaches,
such as trying to predict misses and swap pages in before they are
actually referenced, are possible. However, all modern systems use
demand paging.
Any modern computer system must provide the means for the operating
system to control access to the memory system. A user process should
not be allowed to modify its read-only text section. Nor should it be
allowed to read or modify any of the code and data structures in the
kernel. It should not be allowed to read or write the private memory of
other processes, and it should not be allowed to modify any virtual pages
that are shared with other processes, unless all parties explicitly allow it
(via calls to explicit inter-process communication system calls).
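The protection checks described above amount to consulting per-page permission bits during translation. The sketch below is illustrative only; the field names and the two-page table are assumptions, not a real page-table entry layout.

```python
# Toy per-page permission bits, checked before any access is allowed.
perms = {
    0: {"read": True, "write": False, "user": True},   # read-only text page
    1: {"read": True, "write": True,  "user": False},  # kernel data page
}

def check_access(vpn, op, user_mode):
    """Return True if the access is permitted, False if it would fault."""
    p = perms[vpn]
    if user_mode and not p["user"]:
        return False          # user process touching a kernel page: fault
    return p.get(op, False)   # op is "read" or "write"

ok = check_access(0, "read", user_mode=True)    # reading text: allowed
bad = check_access(0, "write", user_mode=True)  # writing read-only text: denied
```

Because these bits live in the page table, the hardware can enforce them on every reference at no extra cost beyond the translation it already performs.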
In any system that uses both virtual memory and SRAM caches, there is
the issue of whether to use virtual or physical addresses to access the
SRAM cache. Although a detailed discussion of the trade-offs is beyond
our scope here, most systems opt for physical addressing. With physical
addressing, it is straightforward for multiple processes to have blocks in
the cache at the same time and to share blocks from the same virtual
pages. Further, the cache does not have to deal with protection issues
because access rights are checked as part of the address translation
process.
As we have seen, every time the CPU generates a virtual address, the
MMU must refer to a page table entry (PTE) in order to translate the
virtual address into a physical address. In the worst case, this requires an
additional fetch from memory, at a cost of tens to hundreds of cycles. If
the PTE happens to be cached in L1, then the cost goes down to one or
two cycles. However, many systems try to eliminate even this cost by
including a small cache of PTEs in the MMU called a translation
lookaside buffer (TLB). A TLB is a small, virtually addressed cache
where each line holds a block consisting of a single PTE. A TLB usually
has a high degree of associativity.
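A TLB can be modeled as a tiny fully associative cache consulted before the page table. This sketch is illustrative: the capacity, the toy page-table mapping, and the use of an LRU-ordered dictionary are assumptions for the example, not a description of real TLB hardware.

```python
from collections import OrderedDict

# Toy page table: vpn -> frame number (the +100 mapping is arbitrary).
page_table = {vpn: vpn + 100 for vpn in range(8)}

class TLB:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()   # vpn -> frame, in LRU order
        self.hits = self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)       # refresh LRU position
            return self.entries[vpn]
        self.misses += 1                        # TLB miss: walk the page table
        frame = page_table[vpn]
        self.entries[vpn] = frame
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict least recently used PTE
        return frame

tlb = TLB()
frames = [tlb.lookup(v) for v in (0, 1, 0, 2)]  # the second access to vpn 0 hits
```

Because programs revisit the same pages repeatedly, even a four-entry TLB turns most translations into hits.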
When a page fault occurs, the operating system has to choose a page to
remove from memory to make room for the page that has to be brought
in. If the page to be removed has been modified while in memory, it
must be rewritten to the disk to bring the disk copy up to date. If,
however, the page has not been changed (e.g., it contains program text),
the disk copy is already up to date, so no rewrite is needed. The page to
be read in just overwrites the page being evicted. While it would be
possible to pick a random page to evict at each page fault, system
performance is much better if a page that is not heavily used is chosen.
Most computers with virtual memory have two status bits associated
with each page. R is
set whenever the page is referenced (read or written). M is set when the
page is written to (i.e., modified). The bits are contained in each page
table entry. It is important to realize that these bits must be updated on
every memory reference, so it is essential that they be set by the
hardware. Once a bit has been set to 1, it stays 1 until the operating
system resets it to 0 in software. If the hardware does not have these
bits, they can be simulated as follows. When a process is started up, all
of its page table entries are marked as not in memory. As soon as any
page is referenced, a page fault will occur. The operating system then
sets the R bit (in its internal tables), changes the page table entry to point
to the correct page, with mode READ ONLY, and restarts the
instruction. If the page is subsequently written on, another page fault
will occur, allowing the operating system to set the M bit and change the
page’s mode to READ/WRITE. The R and M bits can be used to build a
simple paging algorithm as follows. When a process is started up, both
page bits for all its pages are set to 0 by the operating system.
Periodically (e.g., on each clock interrupt), the R bit is cleared, to
distinguish pages that have not been referenced recently from those that
have been. When a page fault occurs, the operating system inspects all
the pages and divides them into four categories based on the current
values of their R and M bits:
Class 0: not referenced, not modified.
Class 1: not referenced, modified.
Class 2: referenced, not modified.
Class 3: referenced, modified.
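The classification by R and M bits can be written directly in code. This is a hedged sketch of the idea (the dictionary layout and page names are assumptions): each page's class is 2R + M, and the eviction candidate comes from the lowest non-empty class.

```python
def classify(pages):
    """pages: dict of page -> (R, M) bits.
    Returns class number 0-3 per page, where class = 2*R + M,
    so class 0 (not referenced, not modified) is the cheapest to evict."""
    return {p: 2 * r + m for p, (r, m) in pages.items()}

pages = {"a": (0, 0), "b": (0, 1), "c": (1, 0), "d": (1, 1)}
classes = classify(pages)
victim = min(classes, key=classes.get)   # evict from the lowest class first
```

Evicting from class 0 is cheapest: the page is unused lately and clean, so it needs no write-back to disk.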
A low-overhead paging algorithm is first-in, first-out (FIFO). The
operating system maintains a list of all pages currently in memory, with
the oldest page at the head of the list and the most recent arrival at the
tail. On a page fault, the page at the head is removed
and the new page added to the tail of the list. When applied to stores,
FIFO might remove mustache wax, but it might also remove flour, salt,
or butter. When applied to computers the same problem arises. For this
reason, FIFO in its pure form is rarely used.
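A short sketch of FIFO replacement makes its weakness visible: the oldest page is evicted even if it is still in heavy use (the reference string below is an arbitrary example):

```python
from collections import deque

def fifo_replace(reference_string, num_frames):
    """Count page faults under FIFO replacement."""
    frames, queue, faults = set(), deque(), 0
    for page in reference_string:
        if page not in frames:
            faults += 1
            if len(frames) == num_frames:
                frames.discard(queue.popleft())   # evict the oldest page
            frames.add(page)
            queue.append(page)
    return faults

# Page 1 is used constantly, yet FIFO evicts it first when a fault occurs.
faults = fifo_replace([1, 2, 3, 1, 4, 1], num_frames=3)
```

Here the heavily used page 1 is evicted to make room for page 4, immediately causing another fault when 1 is referenced again.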
A better approach keeps all the page frames on a circular list in the form
of a clock, with a hand pointing to the oldest page. When a page fault
occurs, the page being pointed to by the hand is
inspected. If its R bit is 0, the page is evicted, the new page is inserted
into the clock in its place, and the hand is advanced one position. If R is
1, it is cleared and the hand is advanced to the next page. This process is
repeated until a page is found with R = 0. Not surprisingly, this
algorithm is called clock. It differs from second chance only in the
implementation.
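The hand movement just described can be sketched in a few lines (the page names and R-bit values below are an arbitrary example):

```python
def clock_evict(pages, hand):
    """pages: circular list of [name, R] entries; hand: current index.
    Clears R bits it passes and returns (victim_index, new_hand)."""
    while True:
        name, r = pages[hand]
        if r == 0:
            return hand, (hand + 1) % len(pages)   # evict here, advance hand
        pages[hand][1] = 0                         # second chance: clear R
        hand = (hand + 1) % len(pages)

pages = [["a", 1], ["b", 0], ["c", 1]]
victim, hand = clock_evict(pages, 0)   # "a" gets a second chance; "b" is evicted
```

Page "a" had its R bit set, so the hand clears it and moves on; "b", with R = 0, becomes the victim, and "a" will only be evicted if it is not referenced again before the hand returns.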
The IOP can fetch and execute its own instructions, which are specifically
designed to characterize I/O transfers. In addition to I/O-related
tasks, it can perform other processing tasks such as arithmetic, logic,
branching, and code translation. The main memory unit takes the pivotal
role: the IOP communicates with the processor by means of DMA.
3.6.1 Advantages
• The I/O devices can directly access the main memory without
intervention by the processor in I/O processor-based systems.
• It is used to address the problems that arise in the direct memory
access (DMA) method.
Data transfer to and from the peripherals may be done in any of the
three possible ways.
Programmed I/O.
Interrupt- initiated I/O.
Direct memory access (DMA).
Example of Programmed I/O: In this case, the I/O device does not
have direct access to the memory unit. A transfer from I/O device to
memory requires the execution of several instructions by the CPU,
including an input instruction to transfer the data from the device to
the CPU and a store instruction to transfer the data from the CPU to
memory. In programmed I/O, the CPU stays in the program loop until the
I/O unit indicates that it is ready for data transfer. This keeps the
processor busy needlessly.
• The I/O transfer rate is limited by the speed with which the
processor can test and service a device.
Example
Consider the problem of constructing the list of all prime numbers in the
interval [1, n] for a given integer n > 0. A simple algorithm that can be
used for this computation is the sieve of Eratosthenes. Start with the list
of numbers 1, 2, 3, 4, ... , n represented as a “mark” bit-vector initialized
to 1000 . . . 00. In each step, the next unmarked number m (associated
with a 0 in element m of the mark bit-vector) is a prime. Find this
element m and mark all multiples of m beginning with m². When m² > n,
the computation stops and all unmarked elements are prime numbers.
The computation steps for n = 30 are shown in the figure below
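The sieve procedure just described can be rendered as a short program. This is a minimal sketch: the mark bit-vector is modelled as a Python list, with element 1 pre-marked as in the 1000...00 initialization above:

```python
def sieve_primes(n):
    """Sieve of Eratosthenes using a 'mark' bit-vector.

    mark[i] == 1 means i is marked (not prime); only 1 is marked
    at the start, matching the 1000...00 initialization.
    """
    mark = [0] * (n + 1)
    mark[1] = 1                               # 1 is pre-marked
    m = 2
    while m * m <= n:                         # stop once m^2 > n
        if mark[m] == 0:                      # next unmarked number is prime
            for k in range(m * m, n + 1, m):  # mark multiples from m^2 up
                mark[k] = 1
        m += 1
    return [i for i in range(2, n + 1) if mark[i] == 0]
```

For n = 30 this yields the ten primes 2, 3, 5, 7, 11, 13, 17, 19, 23 and 29, matching the computation described above.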
Imagine a large hall like a theater. The walls of this chamber are
painted to form a map of the globe. A myriad of computers at work upon
the weather on the part of the map where each sits, but each computer
attends to only one equation or part of an equation. The work of each
region is coordinated by an official of higher rank. Numerous little
‘night signs’ display the instantaneous values so that neighbouring
computers can read them. One of [the conductor’s] duties is to maintain
a uniform speed of progress in all parts of the globe. But instead of
waving a baton, he turns a beam of rosy light upon any region that is
running ahead of the rest, and a beam of blue light upon those that are
behindhand.
The MIMD category includes a wide class of computers. For this reason,
in 1988, E. E. Johnson proposed a further classification of such
machines based on their memory structure (global or distributed) and the
mechanism used for communication/synchronization (shared variables
or message passing). Again, one of the four categories (GMMP) is not
widely used. The GMSV class is what is loosely referred to as (shared-
memory) multiprocessors.
3.8 PIPELINING
A pipeline system is like the modern-day assembly line set up in
factories. For example, in a car manufacturing plant, huge assembly
lines are set up with robotic arms performing a certain task at each
station, after which the car moves on ahead to the next arm.
Types of Pipeline:
It is divided into 2 categories:
1. Arithmetic pipeline
2. Instruction pipeline
Pipeline Conflicts
There are some factors that cause the pipeline to deviate from its
normal performance. Some of these factors are given below:
Advantages of Pipelining
Disadvantages of Pipelining
It is clear from the figure that the total time required to process three
instructions (I1, I2, I3) is only six time units if four-stage pipelining is
used as compared to 12 time units if sequential processing is used. A
possible saving of up to 50% in the execution time of these three
instructions is obtained. In order to formulate some performance
measures for the goodness of a pipeline in processing a series of tasks,
a space-time chart (called a Gantt chart) is used.
As can be seen from the figure 3.20, 13 time units are needed to finish
executing 10 instructions (I1 to I10). This is to be compared to 40 time
units if sequential processing is used (ten instructions each requiring
four time units).
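The timing argument above generalizes: an ideal k-stage pipeline finishes n instructions in k + (n - 1) time units, against n x k units for sequential processing. A small sketch (the function names are illustrative, not from the text):

```python
def pipeline_time(n_instructions, n_stages):
    """Time units for an ideal pipeline: the first instruction takes
    n_stages units to flow through, then one completes per unit."""
    return n_stages + (n_instructions - 1)

def sequential_time(n_instructions, n_stages):
    """Time units without pipelining: every instruction takes
    n_stages units, one after another."""
    return n_instructions * n_stages
```

This reproduces both figures quoted above: 3 instructions on a 4-stage pipeline take 6 units versus 12 sequentially, and 10 instructions take 13 units versus 40.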
4.0 CONCLUSION
5.0 SUMMARY
This unit studied the memory system of a computer, starting with the
organisation of its main memory, which, in some simple systems, is the
only form of data storage, and moving on to more complex systems and
the additional components they carry. Cache systems, which aim at
speeding up access to primary storage, were also studied, with a
greater focus on virtual memory systems, which let the processor use
secondary storage transparently as if it were main memory.
2.1 INTRODUCTION
This module is divided into three units. The first unit explains memory
addressing and the various modes available. Unit two explains the
elements of memory hierarchy while the last unit takes on virtual
memory control systems. All these are given below.
CONTENTS
1.0 Introduction
2.0 Intended Learning Outcomes (ILOS)
3.0 Main Contents
3.1.1 What is memory addressing mode?
3.1.2 Modes of addressing
3.1.3 Number of addressing modes
3.1.4 Advantages of addressing modes
3.1.5 Uses of addressing modes
4.0 Self-Assessment Exercises
5.0 Conclusion
6.0 Summary
7.0 References/Further Reading
1.0 INTRODUCTION
Memory addressing modes determine the method used within a program to
access data, either from the cache or from RAM.
However, there are basic requirements for an operation to take effect.
First, there must be an operator to indicate what action to take, and
second, there must be an operand that portrays the data to be operated
on. For instance, if the numbers 5 and 2 are to be added to produce a
result, both the operator (+) and the operands (5 and 2) must be
specified.
CIT 314 MODULE 2
There are many methods for defining or obtaining the effective address
of an operand directly from a register. Such approaches are known as
modes of addressing. Programs are usually written in a high-level
language, as it is a simple way for the programmer to describe the
variables and the operations to be performed on them. The following
are the modes of addressing:
Note:
← = assignment
M = the name for memory: M[R1] refers to the contents of the memory
location whose address is given by the contents of R1
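Using that notation, the common modes can be modelled as a toy interpreter. The register and memory contents below are invented purely for illustration:

```python
# Toy machine state (illustrative values, not from the text).
R = {"R1": 100}            # register file
M = {100: 55, 108: 99}     # main memory, addressed by integer

def operand(mode, **arg):
    """Return the operand value under a given addressing mode."""
    if mode == "immediate":          # operand is in the instruction itself
        return arg["value"]
    if mode == "register":           # operand is the register contents
        return R[arg["reg"]]
    if mode == "register_indirect":  # operand is M[R1]: memory at the
        return M[R[arg["reg"]]]      # address held in the register
    if mode == "displacement":       # operand is M[R1 + offset]
        return M[R[arg["reg"]] + arg["offset"]]
    raise ValueError("unknown mode: %s" % mode)
```

So `operand("register_indirect", reg="R1")` fetches M[R1], the memory word at address 100, exactly as the note above defines it.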
Some instruction set architectures, for instance Intel x86 and its
successors, have a load effective address instruction. This performs a
calculation of the effective operand address but, rather than acting on
that memory location, it loads into a register the address that would
have been accessed. This can be useful when passing the address of an
array element to a subroutine. It can also be a slightly tricky way of
achieving more additions than usual in one instruction; for example,
using such an instruction with the addressing mode "base + index +
displacement" allows one to add two registers and a constant into a
single register in one instruction.
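A load-effective-address instruction can be modelled as returning the computed address itself rather than the memory contents at that address. This is a hedged sketch: the `scale` parameter mirrors x86-style base + index x scale + displacement addressing, and the values used are invented for illustration:

```python
def lea(base, index, displacement, scale=1):
    """Model of a 'load effective address' instruction: compute the
    effective address and return it (as if loading it into a register)
    instead of reading memory at that address."""
    return base + index * scale + displacement
```

For example, `lea(1000, 3, 16, scale=4)` yields 1028, the address of element 3 of a 4-byte-element array at base 1000 with a 16-byte header; and `lea(r1, r2, 10)` adds two register values and a constant in one "instruction".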
4.0 CONCLUSION
A computer register is a small set of data-holding places with
fast-access memory in the central processing unit. Different types of
registers are used for the execution of computer instructions and
programs. Numerous categories of computer registers are available for
the execution of instructions, and they can be categorized by their
size, functions, or names. These registers store data temporarily,
support the execution of computer instructions, and can also hold
results. The processing speed of registers is the fastest of any
storage in the system.
6.0 SUMMARY
For the central processing unit, various types of computer registers
are defined, each with a specific role during the execution of an
instruction. All these registers have particular roles, such as
data-related operations, fetching or storing of data, and many more.
The instructions stored in a register are executed by the processor of
the central processing unit.
CONTENTS
1.0 Introduction
2.0 Intended Learning Outcomes (ILOS)
3.0 Main Contents
3.2 What is memory hierarchy?
3.2.1 Memory hierarchy diagram
3.2.2 Characteristics of memory diagram
3.2.3 Memory hierarchy design
3.2.4 Advantages of memory hierarchy
1.0 INTRODUCTION
Memory is one of the important units in any computer system. It serves
as storage for all the processed and unprocessed data and programs in a
computer system. However, because most computer users store large
numbers of files on their memory devices, the use of a single memory
device in a computer system has become inefficient and unsatisfactory:
one memory cannot contain all the files needed by computer users, and
when a single memory is made large, it decreases the speed of the
processor and the general performance of the computer system.
Memory stall cycles = IC × Mem_Refs × Miss_Rate × Miss_Penalty
Where,
IC = Instruction Count
Mem_Refs = Memory References per Instruction
Miss_Rate = Fraction of accesses that are not in the cache
Miss_Penalty = Additional time to service the miss
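Putting the stall-cycle formula to work is a single multiplication; the operand values below are invented for illustration:

```python
def memory_stall_cycles(ic, mem_refs, miss_rate, miss_penalty):
    """Memory stall cycles = IC x Mem_Refs x Miss_Rate x Miss_Penalty."""
    return ic * mem_refs * miss_rate * miss_penalty

# E.g. one million instructions, 1.5 memory references each,
# a 2% miss rate and a 100-cycle miss penalty (assumed numbers):
stalls = memory_stall_cycles(1_000_000, 1.5, 0.02, 100)
```

With these assumed figures the program loses about three million cycles to cache misses, which shows how a small miss rate can still dominate execution time when the penalty is large.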
The memory hierarchy system encompasses all the storage devices used
in a computer system. It ranges from the cache memory, which is smaller
in size but faster in speed, to the auxiliary memory, which is larger
in size but slower in speed. The smaller and faster the memory, the
costlier it becomes.
a. Cache memory,
b. Main memory and
c. Auxiliary memory
The memory hierarchy system encompasses all the storage devices used
in a computer system. It ranges from the fastest but smallest (cache
memory), through the relatively fast and larger (main memory), to the
slowest but largest (auxiliary memory). The cache memory is the
smallest and fastest storage device; it is placed close to the CPU for
easy access by the processor logic. Cache memory helps enhance the
processing speed of the system by making currently needed programs and
data available to the CPU at very high speed. It stores the segments
of programs currently processed by the CPU as well as the temporary
data frequently needed in the current calculation.
The main memory communicates directly with the CPU. It is also fast in
speed, though smaller in size than auxiliary memory, and it
communicates with the auxiliary memories through the input/output
processor. The main memory provides a communication link between the
other storage devices and contains the currently accessed data and
programs. Unwanted data are transferred to the auxiliary memories to
create more space in the main memory for the currently needed data. If
the CPU needs a program that is outside the main memory, the main
memory calls in the program from the auxiliary memories via the
input/output processor. The main difference between cache and main
memory is access time: the processor logic is often faster than the
main memory access time.
The auxiliary memory is made up of magnetic tapes and magnetic disks.
They are employed in the system to store and back up large volumes of
data.
[Figure: interconnection of the CPU, cache memory, main memory and
I/O processor.]
a. Access type,
b. Capacity,
c. Cycle time,
d. Latency,
e. Bandwidth, and
f. Cost
c. Cycle time: is defined as the time elapsed from the start of a read
operation to the start of a subsequent read.
will be mounted to access the data; once the data access is complete,
it will be unmounted. The access time of such memory is slower with
magnetic tape, as it can take a few minutes to access a strip.
4.0 CONCLUSION
5.0 SUMMARY
Ailamaki AG, DeWitt DJ., Hill MD, Wood DA. DBMSs on a modern
processor: where does time go? In: Proceedings of the 25th
International Conference on Very Large Data Bases; 1999. p.
266–77. URL: https://www.semanticscholar.org/paper/DBMSs-
on-a-Modern-Processor%3A-Where-Does-Time-Go-Ailamaki-
DeWitt/54b92179ede08158e2cf605f5e9f264ca06c01ff
Denning PJ. The working set model for program behaviour. Commun
ACM. 1968;11(5):323–33. URL:
https://denninginstitute.com/pjd/PUBS/WSModel_1968.pdf
CONTENTS
1.0 Introduction
2.0 Intended Learning Outcomes (ILOS)
3.0 Main Contents
3.3.1 Memory management systems
3.3.2 Paging
3.3.3 Address mapping using paging
3.3.4 Address mapping using segments
3.3.5 Address mapping using segmented paging
3.3.6 Multi-programming
3.3.7 Virtual machines/memory and protection
3.3.8 Hierarchical memory systems
3.3.9 Drawbacks that occur in virtual memories
4.0 Self-Assessment Exercises
5.0 Conclusion
6.0 Summary
7.0 References/Further Reading
1.0 INTRODUCTION
3.3.2 Paging
From the above diagram you can see that A2 and A4 are moved to the
waiting state after some time. Therefore, eight frames become empty,
and other pages can be loaded into those empty blocks. The process A5,
of size 8 pages (8 KB), is waiting in the ready queue.
Advantages
Disadvantages
[Figure: address mapping using paging — the virtual address in the
address register is split into a page number and a line number; the
memory page table, with a presence bit per entry, maps resident pages
to main-memory blocks (Block 0 to Block 3), and the block number
concatenated with the line number forms the physical address. A
segment table plays the corresponding role for address mapping using
segments, where the address is split into block and word fields.]
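The translation depicted in the paging figure can be sketched in a few lines. The page size, page-table contents and the choice of exception are assumptions made for this example:

```python
PAGE_SIZE = 1024  # assumed 1 KB pages

# Page table: virtual page number -> (presence bit, main-memory block).
# These entries are illustrative only.
page_table = {0: (1, 3), 1: (0, None), 2: (1, 0)}

def translate(virtual_addr):
    """Map a virtual address to a physical address via the page table.
    A cleared presence bit means the page is not in main memory,
    so a page fault must be raised instead."""
    page, offset = divmod(virtual_addr, PAGE_SIZE)  # page no. + line no.
    present, block = page_table[page]
    if not present:
        raise LookupError("page fault: page %d not in memory" % page)
    return block * PAGE_SIZE + offset  # block no. concatenated with line
```

Address 5 lands in page 0, which is resident in block 3, giving physical address 3077; an address in page 1 raises a page fault because its presence bit is 0.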
3.3.6 Multi-programming
Disadvantages of Multiprogramming:
Inclusion
Coherence and
Locality.
SELF-ASSESSMENT EXERCISE
4.0 CONCLUSION
The memory hierarchy system encompasses all the storage devices used
in a computer system. It ranges from the fastest but smallest (cache
memory), through the relatively fast and larger (main memory), to the
slower but largest (auxiliary memory). A memory element is a set of
storage devices that stores binary data in bits. They include
registers, cache memory, main memory, magnetic disks and magnetic
tape. These storage devices can be classified into two categories:
primary memory and secondary memory.
5.0 SUMMARY
Memory addresses act just like the indexes of a normal array. The
computer can access any address in memory at any time (hence
the name "random access memory").
It can also group bytes together as it needs to form larger
variables, arrays, and structures.
Memory hierarchy is the hierarchy of memory and storage
devices found in a computer system.
It ranges from the slowest but high capacity auxiliary memory to
the fastest but low capacity cache memory.
Memory hierarchy is employed to balance this trade-off.
CIT 314 MODULE 3
CONTENTS
1.0 Introduction
2.0 Intended Learning Outcomes (ILOS)
3.0 Main Contents
3.1.1 Hardwired Control Unit
3.1.2 Design of a hardwired Control Unit
4.0 Self-Assessment Exercises
5.0 Conclusion
6.0 Summary
7.0 References/Further Reading
1.0 INTRODUCTION
Control Unit is the part of the computer’s central processing unit (CPU),
which directs the operation of the processor. It was included as part of
the Von Neumann Architecture by John von Neumann. It is the
responsibility of the Control Unit to tell the computer’s memory,
arithmetic/logic unit and input and output devices how to respond to the
instructions that have been sent to the processor. It fetches internal
instructions of the programs from the main memory into the processor's
instruction register and, based on the register contents, generates
the control signals that supervise the execution of those instructions.
4.0 CONCLUSION
5.0 SUMMARY
CONTENTS
1.0 Introduction
2.0 Intended Learning Outcomes (ILOS)
3.0 Main Contents
3.2.1 Design of a Micro-Programmed Control Unit
3.2.2 Differences between Hardwired and Microprogrammed
Control
3.2.3 Organization of Micro-Programmed Control Unit
3.2.4 Types of Micro-programmed Control Unit
4.0 Self-Assessment Exercises
5.0 Conclusion
6.0 Summary
7.0 References/Further Reading
1.0 INTRODUCTION
Each bit that forms the microinstruction is linked to one control signal.
When the bit is set, the control signal is active. When it is cleared the
control signal turns inactive. These microinstructions in a sequence can
be saved in the internal ’control’ memory. The control unit of a
microprogram-controlled computer is a computer inside a computer.
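The one-bit-per-signal idea can be shown directly in code. The signal names and bit layout below are invented for illustration; real control words differ per machine:

```python
# Assumed control-signal layout: one bit of the microinstruction
# word per signal, bit 0 first.
SIGNALS = ["MEM_READ", "MEM_WRITE", "ALU_ADD", "REG_WRITE", "PC_INC"]

def active_signals(microinstruction):
    """Decode a microinstruction word: a set bit makes its control
    signal active; a cleared bit leaves it inactive."""
    return [name for i, name in enumerate(SIGNALS)
            if microinstruction & (1 << i)]
```

For instance, the word 0b10001 activates MEM_READ (bit 0) and PC_INC (bit 4) while all other signals stay inactive, and a sequence of such words stored in the control memory forms the microprogram.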
It makes the design of the control unit much simpler; hence, it
is inexpensive and less error-prone.
It allows an orderly and systematic design process.
It is used to control functions implemented in software rather
than hardware.
It is more flexible.
Complex functions can be carried out easily.
4.0 CONCLUSION
5.0 SUMMARY
CONTENTS
1.0 Introduction
2.0 Intended Learning Outcomes (ILOS)
3.0 Main Contents
3.3.1 Clock limitations
3.3.2 Basic Concepts
3.3.3 Benefits of Asynchronous Control
3.3.4 Asynchronous Communication
3.3.5 Asynchronous Transmission
3.3.6 Synchronous vs. Asynchronous Transmission
3.3.7 Emerging application areas
3.3.8 Asynchronous Datapaths and Data Transfer
3.3.9 Handshaking
4.0 Self-Assessment Exercises
5.0 Conclusion
6.0 Summary
7.0 References/Further Reading
1.0 INTRODUCTION
For some time now it has been difficult to sustain the synchronous
framework from chip to chip at maximum clock rates. On-chip phase-
locked loops help compensate for chip-to-chip tolerances, but above
about 50MHz even this is not enough. Building the complete CPU on a
single chip avoids inter-chip skew, as the highest clock rates are only
used for processor-MMU-cache transactions. However, even on a single
chip, clock skew is becoming a problem. High-performance processors
must dedicate increasing proportions of their silicon area to the clock
drivers to achieve acceptable skew, and clearly there is a limit to how
much further this proportion can increase. Electrical signals travel on
chips at a fraction of the speed of light; as the tracks get thinner, the
chips get bigger and the clocks get faster, the skew problem gets worse.
Perhaps the clock could be injected optically to avoid the wire delays,
but the signals which are issued as a result of the clock still have to
propagate along wires in time for the next pulse, so a similar problem
remains.
Even more urgent than the physical limitation of clock distribution is the
problem of heat. CMOS is a good technology for low power as gates
only dissipate energy when they are switching. Normally this should
correspond to the gate doing useful work, but unfortunately in a
synchronous circuit this is not always the case. Many gates switch
because they are connected to the clock, not because they have new
inputs to process. The biggest gate of all is the clock driver, and it must
switch all the time to provide the timing reference even if only a small
part of the chip has anything useful to do. Often it will switch when
none of the chip has anything to do, because stopping and starting a
high-speed clock is not easy.
Early CMOS devices were very low power, but as process rules have
shrunk CMOS has become faster and denser, and today's high-
performance CMOS processors can dissipate 20 or 30 watts.
Furthermore there is evidence that the trend towards higher power will
continue. Process rules have at least another order of magnitude to
shrink, leading directly to two orders of magnitude increase in
dissipation for a maximum performance chip. Whilst a reduction in the
power supply voltage helps reduce the dissipation (by a factor of 3 for 3
Volt operation and a factor of 6 for 2 Volt operation, relative to a 5 Volt
norm in both cases), the end result is still a chip with an increasing
thermal problem. Processors which dissipate several hundred watts are
clearly no use in battery powered equipment, and even on the desktop
they impose difficulties because they require water cooling or similar
costly heat-removal technology.
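The reduction factors quoted above follow from dynamic CMOS power scaling with the square of the supply voltage: (5/3)² ≈ 2.8 ≈ 3 and (5/2)² = 6.25 ≈ 6. A one-line check:

```python
def cmos_power_factor(v_new, v_ref=5.0):
    """Dynamic CMOS power scales with the square of the supply voltage,
    so the reduction factor relative to v_ref is (v_ref / v_new)^2."""
    return (v_ref / v_new) ** 2
```

Evaluating `cmos_power_factor(3.0)` gives about 2.8 and `cmos_power_factor(2.0)` gives 6.25, matching the rounded factors of 3 and 6 in the text.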
3.3.2.2 Mode
Two operating modes are commonly distinguished for asynchronous
circuits. In the first, fundamental mode, no new inputs are applied
until all outputs have settled in response to a previous input. The
second, input/output mode, allows inputs to change again once the
relevant outputs have responded.
Two-phase
Four-phase
Single-rail encoding
Dual-rail encoding
Two major assumptions guide the design of today’s logic; all signals
are binary, and time is discrete. Both of these assumptions are made in
order to simplify logic design. By assuming binary values on signals,
simple Boolean logic can be used to describe and manipulate logic
constructs. By assuming time is discrete, hazards and feedback can
largely be ignored. However, as with many simplifying assumptions, a
system that can operate without these assumptions has the potential to
generate better results.
Asynchronous circuits keep the assumption that signals are binary, but
remove the assumption that time is discrete.
That means the two devices do not share a dedicated clock signal (a
separate clock exists on each device). Each device must set up ahead
of time a matching bit rate and how many bits to expect in a given
transaction.
Emails
Forums
Letters
Radios
Televisions
These two methods can achieve this asynchronous way of data transfer:
The strobe control method of data transfer uses a single control
signal for each transfer. This control line, known as the strobe, may
be activated by either the source unit or the destination unit,
depending on which one initiates the transfer.
The strobe is a single line that informs the destination unit when a valid
data word is available in the bus.
First, the destination unit activates the strobe pulse,
informing the source to provide the data.
The source unit responds by placing the requested binary
information on the bus for the destination unit to accept.
The data must be valid and remain in the bus long enough for the
destination unit to accept it.
The falling edge of the strobe pulse can be used again to trigger a
destination register.
The destination unit then disables the strobe.
The source removes the data from the bus after a predetermined
time interval.
3.3.9 HANDSHAKING
The strobe method has the disadvantage that the source unit that initiates
the transfer has no way of knowing whether the destination has received
the data that was placed in the bus. Similarly, a destination unit that
initiates the transfer has no way of knowing whether the source unit has
placed data on the bus.
The handshaking scheme provides a high degree of flexibility and
reliability, because the successful completion of a data transfer
relies on active participation by both units.
The source unit initiates the transfer by placing the data on the
bus and enabling its data valid signal.
The data accepted signal is activated by the destination unit
after it accepts the data from the bus.
The source unit then disables its data valid signal, which
invalidates the data on the bus.
The destination unit then disables its data accepted signal, and
the system returns to its initial state.
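The four steps above can be traced in code. This is purely a sequencing sketch: the signal and function names are invented, and real hardware would run the two units concurrently rather than in a single thread:

```python
def source_initiated_handshake(data):
    """Trace the source-initiated two-wire handshake: data_valid is
    driven by the source, data_accepted by the destination."""
    bus = None
    trace = []
    # 1. Source places the data on the bus and enables data_valid.
    bus = data
    trace.append(("data_valid", 1))
    # 2. Destination accepts the data and activates data_accepted.
    received = bus
    trace.append(("data_accepted", 1))
    # 3. Source disables data_valid, invalidating the data on the bus.
    bus = None
    trace.append(("data_valid", 0))
    # 4. Destination disables data_accepted; initial state is restored.
    trace.append(("data_accepted", 0))
    return received, trace
```

Running the trace on a sample word shows the destination receives the data and both control lines return to 0, so a new transfer can begin.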
In this case the name of the signal generated by the destination unit
is ready for data.
The source unit does not place the data on the bus until it receives
the ready for data signal from the destination unit.
4.0 CONCLUSION
1. Strobe
2. Handshaking
5.0 SUMMARY
If the registers in the I/O interface share a common clock with CPU
registers, then transfer between the two units is said to be synchronous.
But in most cases, the internal timing in each unit is independent of each
other, so each uses its private clock for its internal registers. In this case,
the two units are said to be asynchronous to each other, and if data
transfer occurs between them, this data transfer is called Asynchronous
Data Transfer.
INTRODUCTION
Fault tolerance refers to the property that enables the system to continue
to function correctly even when some of its components fail. In other
words, fault tolerance describes how an operating system (OS) responds
to and accommodates hardware or software malfunctions and failures.
CONTENTS
1.0 Introduction
2.0 Intended Learning Outcomes (ILOS)
3.0 Main Contents
3.0.1.1 What is Fault Tolerance
3.0.1.2 Fault Tolerant Systems
3.0.1.3 Hardware and Software Fault Tolerant Issues
3.0.1.4 Fault Tolerance VS High Availability
3.0.1.5 Redundancy
3.0.1.6 Relationship Between Security and Fault Tolerance
INTRODUCTION
CIT 314 MODULE 4
Fault tolerance has been part of the computing community for quite a
long time. To build our understanding of it, we should know that fault
tolerance is the art and science of building computing systems that
continue to operate satisfactorily in the presence of faults. An
operating system that handles faults soundly cannot be disrupted by a
single point of failure. It ensures business continuity and the high
availability of crucial applications and systems regardless of any
failures.
Fault tolerance can be built into a system to remove the risk of it having
a single point of failure. To do so, the system must have no single
component that, if it were to stop working effectively, would result in
the entire system failing. Fault tolerance is reliant on aspects like load
balancing and failover, which remove the risk of a single point of
failure. It will typically be part of the operating system’s interface,
which enables programmers to check the performance of data
throughout a transaction.
Three central terms in fault-tolerant design are fault, error, and failure.
There is a cause effect relationship between faults, errors, and failures.
Specifically, faults are the cause of errors, and errors are the cause of
failures. Often the term failure is used interchangeably with the term
malfunction, however, the term failure is rapidly becoming the more
commonly accepted one. A fault is a physical defect, imperfection, or
flaw that occurs within some hardware or software component.
Essentially, the definition of a fault, as used in the fault tolerance
community, agrees with the definition found in the dictionary. A fault is
a blemish, weakness, or shortcoming of a particular hardware or
software component. An error is the manifestation of a fault.
Specifically, an error is a deviation from accuracy or correctness.
Finally, if the error results in the system performing one of its
functions incorrectly, a system failure has occurred.
The concepts of faults, errors, and failures can be best presented by the
use of a three-universe model that is an adaptation of the four-universe
models;
Characteristics of Faults
a) Causes/Source of Faults
b) Nature of Faults
c) Fault Duration
d) Extent of Faults
e) Value of faults
Fault Duration. The duration specifies the length of time that a fault is
active.
Safety. Safety is the probability, S(t), that a system will either perform
its functions correctly or will discontinue its functions in a manner that
does not disrupt the operation of other systems or compromise the safety
of any people associated with the system. Safety is a measure of the
failsafe capability of a system; if the system does not operate correctly,
it is desired to have the system fail in a safe manner. Safety and
reliability differ because reliability is the probability that a system will
perform its functions correctly, while safety is the probability that a
system will either perform its functions correctly or will discontinue the
functions in a manner that causes no harm.
An extensive methodology has been developed in this field over the past
thirty years, and a number of fault-tolerant machines have been
developed, most dealing with random hardware faults, while a smaller
number deal with software, design and operator faults to varying
degrees. A large amount of supporting research has been reported and
efforts to attain software that can tolerate software design faults
(programming errors) have made use of static and dynamic redundancy
approaches similar to those used for hardware faults. One such
approach, N-version programming, uses static redundancy in the form of
independently written programs (versions) that perform the same
functions, and their outputs are voted at special checkpoints. Here, of
course, the data being voted may not be exactly the same, and a criterion
must be used to identify and reject faulty versions and to determine a
consistent value (through inexact voting) that all good versions can use.
An alternative dynamic approach is based on the concept of recovery
blocks. Programs are partitioned into blocks and acceptance tests are
executed after each block. If an acceptance test fails, a redundant code
block is executed.
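The recovery-block scheme can be sketched generically: run the primary block, apply the acceptance test, and fall back to a redundant block when the test fails. The function name, acceptance test and alternate blocks below are invented placeholders:

```python
def recovery_block(acceptance_test, *blocks):
    """Sketch of the recovery-block scheme: try each code block in
    order and return the first result that passes the acceptance
    test; raise if every alternate fails."""
    for block in blocks:
        result = block()
        if acceptance_test(result):   # acceptance test after the block
            return result             # passed: no fallback needed
    raise RuntimeError("all alternates failed the acceptance test")
```

For example, a buggy primary square-root routine that returns -1 fails the test r >= 0 and r*r == 9, so the redundant alternate is executed and its answer, 3.0, is returned.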
In everyday language, the terms fault, failure, and error are used
interchangeably. In fault-tolerant computing parlance, however, they
have distinctive meanings. A fault (or failure) can be either a hardware
defect or a software (i.e., programming) mistake, a bug. In contrast,
an error is a manifestation of the fault, failure, or bug. As an
example, consider an adder circuit with an output line stuck at 1; it
always carries the value 1 independently of the values of the input
operands. This is a fault, but not (yet) an error. This fault causes
an error when the adder is used and the result on that line is
supposed to have been a 0, rather than a 1.
Both faults and errors can spread through the system. For example, if a
chip shorts out power to ground, it may cause nearby chips to fail as
well. Errors can spread because the output of one unit is used as input by
other units. To return to our previous examples, the erroneous results of
either the faulty adder or the sin(x) subroutine can be fed into further
calculations, thus propagating the error.
A fault that just causes a unit to go dead is called benign. Such faults are
the easiest to deal with. Far more insidious are the faults that cause a unit
to produce reasonable-looking, but incorrect, output, or that make a
component “act maliciously” and send differently valued outputs to
different receivers. Think of an altitude sensor in an airplane that reports
a 1000-foot altitude to one unit and an 8000-foot altitude to another unit.
These are called malicious (or Byzantine) faults.
If an FT system and an HA cluster have the same fault rate, but the FT
system can recover in 3 seconds and the HA cluster takes 5 minutes (300
seconds) to recover from the same fault, then the HA cluster will be
down 100 times as long as the FT system and will have an availability
which is two 9s less. That glorious five 9s claim becomes three 9s (as
reported in several industry studies), at least so far as software faults are
concerned.
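The arithmetic behind "two 9s less" can be checked with the standard availability formula A = MTBF / (MTBF + MTTR). Only the 3-second and 300-second recovery times come from the text; the 1000-hour MTBF below is an invented illustration:

```python
import math

def availability(mtbf_hours, recovery_seconds):
    """Availability = MTBF / (MTBF + MTTR), with MTTR converted
    from seconds to hours to match the MTBF units."""
    mttr_hours = recovery_seconds / 3600.0
    return mtbf_hours / (mtbf_hours + mttr_hours)

def nines(a):
    """Count the leading 9s in an availability figure, e.g. 0.99999 -> 5."""
    return int(-math.log10(1 - a))
```

With the same assumed fault rate (MTBF of 1000 hours), a 3-second recovery yields roughly six 9s while a 300-second recovery yields roughly four 9s: the 100x longer downtime costs exactly two 9s, as the text states.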
So, the secret to high availability is in the recovery time. This is what
the Tandem folks worked so hard on for two decades before becoming
the Nonstop people. Nobody else has done it. Today, Nonstop servers
3.0.1.5 Redundancy
Techniques of Redundancy
Hardware Redundancy
The primary challenge with TMR is obviously the voter; if the voter
fails, the complete system fails. In other words, the reliability of the
simplest form of TMR, as shown in Figure 4.1, can be no better than the
reliability of the voter. Any single component within a system whose
failure will lead to a failure of the system is called a single-point-of-
failure. Several techniques can be used to overcome the effects of voter
failure. One approach is to triplicate the voters and provide three
independent outputs, as illustrated in Figure 4.2. In Figure 4.2, each of
three memories receives
data from a voter which has received its inputs from the three separate
processors. If one processor fails, each memory will continue to receive
a correct value because its voter will correct the corrupted value. A
TMR system with triplicated voters is commonly called a restoring
organ because the configuration will produce three correct outputs even
if one input is faulty. In essence, the TMR with triplicated voters
restores the error-free signal. A generalization of the TMR approach is
the N-modular redundancy (NMR) technique. NMR applies the same
principle as TMR but uses N of a given module as opposed to only
three. In most cases, N is selected as an odd number so that a majority
voting arrangement can be used. The advantage of using N modules
rather than three is that more module faults can often be tolerated.
Voting within NMR systems can occur at several points. For example,
an industrial controller can sample the temperature of a chemical
process from three independent sensors, perform a vote to determine
which of the three sensor values to use, calculate the amount of heat or
cooling to provide to the process (the calculations being performed by
three or more separate modules), and then vote on the calculations to
determine a result. The voting can be performed on both analog and
digital data. The alternative, in this example, might be to sample the
temperature from three independent sensors, perform the calculations,
and then provide a single vote on the final result. The primary difference
between the two approaches is fault containment. If voting is not
performed on the temperature values from the sensors, then the effect of
a sensor fault is allowed to propagate beyond the sensors and into the
primary calculations. Voting at the sensors, however, will mask, and
contain, the effects of a sensor fault. Providing several levels of voting,
however, does require additional redundancy, and the benefits of fault
containment must be compared to the cost of the extra redundancy.
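The majority voting described above can be sketched in a few lines (the module functions below are hypothetical stand-ins for independent hardware copies):

```python
from collections import Counter

def nmr_vote(outputs):
    """Majority vote over N module outputs (N is normally odd)."""
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise RuntimeError("no majority: too many module faults")
    return value

def tmr(modules, x):
    """Triple modular redundancy: feed the same input to three
    independent module copies and vote on their outputs."""
    return nmr_vote([m(x) for m in modules])

good = lambda x: x * x        # two healthy copies of the module
faulty = lambda x: x * x + 1  # one copy producing plausible but wrong output
print(tmr([good, good, faulty], 4))   # -> 16: the single fault is outvoted
```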
Information Redundancy
This is the basic concept of error-detecting codes. The basic concept
of an error-correcting code is that the code word is structured such that
it is possible to determine the correct code word from the corrupted, or
erroneous, code word. Typically, the code is described by the number of
bit errors that can be corrected. For example, a code that can correct
single-bit errors is called a single-error-correcting code; a code that can
correct two-bit errors is called a double-error-correcting code, and so
on.
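As a concrete sketch of a single-error-correcting code, the classic Hamming(7,4) code protects four data bits with three parity bits; recomputing the parity checks at the receiver yields a syndrome that names the position of any single flipped bit:

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into the 7-bit code word
    [p1, p2, d1, p3, d2, d3, d4] (bit positions 1..7)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4    # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4    # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4    # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(word):
    """Recompute the checks; the syndrome is the 1-based position of a
    single-bit error (0 means clean). Return the corrected data bits."""
    c = list(word)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]]

word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                          # flip one bit in transit
print(hamming74_correct(word))        # -> [1, 0, 1, 1]
```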
4.0 CONCLUSION
5.0 SUMMARY
CONTENTS
1.0 Introduction
2.0 Intended Learning Outcomes (ILOs)
3.0 Main Contents
3.0.2.0 Fault Tree Analysis
3.0.2.1 Fault Detection Methods
3.0.2.2 Fault Tolerance Architecture
3.0.2.3 Fault Models
3.0.2.4 Fault Tolerance Methods
3.0.2.5 Major Issues in Modelling and Evaluation
3.0.2.6 Fault Tolerance for Web Applications
4.0 Self-Assessment Exercises
5.0 Conclusion
6.0 Summary
7.0 References/Further Reading
1.0 INTRODUCTION
Benefits of FTA
With every product, there are numerous ways it can fail, some more
likely than others. FTA permits a team to think through and organize
the sequences or patterns of faults that must occur to cause a specific
top-level fault. The top-level fault may be a specific type of failure,
say the car will not start; or it may be focused on a serious
safety-related failure, such as the starter motor overheating and
starting a fire. A complex system may have numerous FTAs, each
exploring a different failure mode.
Focus on one fault at a time. The FTA can start with an overall
failure mode, like the car not starting, or it can focus on one
element of the vehicle failing, like the airbag not inflating as
expected within a vehicle. The team chooses the area for focus at
the start of the analysis.
Expose system behavior and possible interactions. FTA allows
the examination of the many ways a fault may occur and may
expose non-obvious paths to failure that other analysis
approaches miss.
Account for human error. FTA includes hardware, software, and
human factors in the analysis as needed. The FTA approach
includes the full range of causes for a failure.
Just another tool in the reliability engineering toolbox. For
complex systems and with many possible ways that a significant
fault may occur, FTA provides a great way to organize and
manage the exploration of the causes. The value comes from the
insights created that lead to design changes to avoid or minimize
the fault.
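Once a fault tree is drawn, it can also be evaluated numerically: basic-event probabilities propagate up through AND gates (all inputs must fail) and OR gates (any one input failing suffices), assuming independent events. A small sketch using an invented "car will not start" tree with illustrative probabilities:

```python
def and_gate(probs):
    """AND gate: the output fails only if every input fails."""
    p = 1.0
    for q in probs:
        p *= q
    return p

def or_gate(probs):
    """OR gate: the output fails if any input fails (1 - all survive)."""
    survive = 1.0
    for q in probs:
        survive *= 1.0 - q
    return 1.0 - survive

# Invented top event: the car will not start if the battery is dead,
# OR the starter is worn AND the solenoid sticks.
p_top = or_gate([0.01, and_gate([0.05, 0.02])])
print(round(p_top, 6))   # -> 0.01099
```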
Several overlapping taxonomies of the field exist. Some are oriented
more toward a control-engineering approach, others toward mathematical,
statistical, and AI approaches. One useful division of fault detection
methods is given below:
Parity equations
State observers
Parameter estimation
Nonlinear models (neural nets)
Expert systems
Fuzzy logic
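The model-based entries in this list (parity equations, state observers, parameter estimation) share one idea: compare measured behaviour with a model's prediction and flag the residual when it crosses a threshold. A minimal residual-check sketch, with an invented model, signal, and threshold:

```python
def detect_fault(measured, predicted, threshold):
    """Residual-based detection: flag sample indices where the gap
    between measurement and model prediction exceeds the threshold."""
    return [i for i, (m, p) in enumerate(zip(measured, predicted))
            if abs(m - p) > threshold]

model = [2 * t for t in range(6)]    # invented process model: y = 2t
sensor = [0, 2, 4, 9, 8, 10]         # measured signal; sample 3 is off
print(detect_fault(sensor, model, threshold=1.5))   # -> [3]
```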
PAIR-AND-A-SPARE
RAZOR
STEM
CPipe
TMR
DARA-TMR
DARA-TMR triplicates the entire pipeline but uses only two pipeline
copies to run identical process threads in Dual Modular Redundancy
(DMR) mode. The third pipeline copy is disabled using power gating and
is only engaged for diagnosis purposes in case of very frequent errors reported
register FFs. The comparison takes place after every clock cycle; thus,
error detection can invoke a reconfiguration and a rollback cycle,
confining the error and preventing it from affecting the computation in
the following cycles. The comparison takes place only during brief
intervals of time referred to as the comparison window. The timing of
the comparison window is defined by the high phase of a delayed clock
signal DC, which is generated from CLK using a delay element. These
brief comparisons keep the switching activity in the OR-tree of the
comparator to a minimum, offering a 30% power reduction compared
with a static comparator. The functioning of the pseudo-dynamic
comparator requires specific timing constraints to be applied during
synthesis of the CL blocks, as defined below.
Timing Constraints: In typical pipeline circuits, the contamination delay
of the CL should respect the hold time of the pipeline register latches.
However, in the HyFT architecture, as the CL also feeds the
pseudo-dynamic comparator, the CL outputs need to remain stable
during the comparison. Since the comparison takes place just after a
clock edge, any short paths in the CL can cause the input signals of the
comparator to start changing before the lapse of the comparison
window. Thus, the CL copies have to be synthesized with minimum
delay constraints governed by:
A fault model attempts to identify and categorize the faults that may
occur in a system, in order to provide clues as to how to fine-tune the
software development environment and how to improve error detection
and recovery. A question that needs to be asked is: is the traditional
distributed systems fault model appropriate for Grid computing, or are
refinements necessary?
A fault in any software system usually happens due to gaps left
unnoticed during the design phase. Based on this, fault tolerance
techniques are divided into two groups: the Single-Version Technique
and the Multi-Version Technique. Plenty of techniques can be
implemented under each of these categories; a few of those often used
by programmers are:
2. Error Detection
3. Exception Handling
5. Process Pairs
6. Data Diversity
7. Recovery Blocks
8. N-Version Programming
9. N Self-Checking Programming
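As one concrete illustration, the recovery block technique (item 7 above) runs a primary alternate, checks its result with an acceptance test, and falls back to the next alternate on failure. A hedged sketch with hypothetical alternates and acceptance test:

```python
def recovery_block(alternates, acceptance_test, x):
    """Run each alternate in turn; return the first result that passes
    the acceptance test. Raise if every alternate fails."""
    for alt in alternates:
        try:
            result = alt(x)
        except Exception:
            continue                  # treat a crash as a failed alternate
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")

# Hypothetical square-root service: buggy fast primary, correct backup.
primary = lambda x: x ** 0.5 - 1
backup = lambda x: x ** 0.5
accept = lambda x, r: abs(r * r - x) < 1e-9   # acceptance test
print(recovery_block([primary, backup], accept, 9.0))   # -> 3.0
```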
When a fault occurs in a web service, it passes through various stages.
When an error occurs, the system should trace the error or fault through
various fault detection mechanisms to determine the causes of failure,
so that failed components can be repaired or recovered. The flow of
web service failure responses is shown in Figure 4.11.
In web services, there are many fault-tolerant techniques that can be
applied, such as replication. Replication is an efficient technique for
handling exceptions in a distributed application: services can resume
more effectively by maintaining the global state of the application. For
instance, suppose one service needs the assistance of another service to
provide the desired result to the customer; the first service then needs
to communicate with the other. If, at some point during this
communication, a fault occurs in a service, there is no need to continue
the service with faults. Instead, the state manager has to roll back the
state of the application to the point where the fault occurred, so that
the service can resume without the fault and a response can be given
to the consumer more effectively.
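The rollback described above can be sketched as a tiny checkpointing state manager (the class and method names here are hypothetical):

```python
import copy

class StateManager:
    """Keeps checkpoints of application state; on a fault, rolls the
    state back to the last checkpoint so service can resume cleanly."""
    def __init__(self, state):
        self.state = state
        self.checkpoints = []

    def checkpoint(self):
        self.checkpoints.append(copy.deepcopy(self.state))

    def rollback(self):
        self.state = self.checkpoints.pop()

mgr = StateManager({"orders": 1})
mgr.checkpoint()                 # save state before calling another service
mgr.state["orders"] = 999        # faulty update from a failed service call
mgr.rollback()                   # discard it and resume from the checkpoint
print(mgr.state)                 # -> {'orders': 1}
```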
Servers – The physical machines that act as host machines for one
or more virtual machines.
Virtualization – Technology that abstracts physical components
such as servers, storage, and networking and provides these as
logical resources.
Storage – In the form of Storage Area Networks (SAN), network-attached
storage (NAS), disk drives, etc., along with facilities such as
archiving and backup.
Network – To provide interconnections between physical servers
and storage.
Management – Various software for the configuration, management, and
monitoring of cloud infrastructure, including servers, network, and
storage devices.
Security – Components that provide integrity, availability, and
confidentiality of data and security of information, in general.
Backup and recovery services.
4.0 CONCLUSION
5.0 SUMMARY