CSC 306 - Operating Systems 2023/2024
The hardware—the central processing unit (CPU), the memory, and the input/output (I/O)
devices—provides the basic computing resources for the system. The application programs—
such as MS Word, MS Excel, compilers, and Google Chrome—define the ways in which users
accomplish their computing tasks. The operating system controls the hardware and coordinates its use among
the various application programs for the various users. An operating system is similar to a
government: by itself it performs no useful function; it simply provides an environment within
which other programs can do useful work.
School of Computing and Information Sciences, CKT-UTAS Operating System: CSC 306
User View
The user’s view of the computer varies according to the interface being used. Most computer
users sit in front of a PC, consisting of a monitor, keyboard, mouse, and system unit. Such a
system is designed for one user. The goal is to maximize the work (or play) that the user is
performing. In this case, the operating system is designed mostly for ease of use, with some
attention paid to performance and none paid to resource utilization—how various hardware and
software resources are shared. Performance is, of course, important to the user; but such
systems are optimized for the single-user experience rather than the requirements of multiple
users.
In other cases, a user sits at a terminal connected to a mainframe or a minicomputer. Other users
are accessing the same computer through other terminals. These users share resources and may
exchange information. The operating system in such cases is designed to maximize resource
utilization— to assure that all available CPU time, memory, and I/O are used efficiently and that
no individual user takes more than her fair share.
In still other cases, users sit at workstations connected to networks of other workstations and
servers. These users have dedicated resources at their disposal, but they also share resources
such as networking and servers, including file, compute, and print servers. Therefore, their
operating system is designed to strike a compromise between individual usability and resource utilization.
Recently, many varieties of mobile computers, such as smartphones and tablets, have come into
fashion. Most mobile computers are standalone units for individual users. Quite often, they are
connected to networks through cellular or other wireless technologies. Increasingly, these mobile
devices are replacing desktop and laptop computers for people who are primarily interested in
using computers for e-mail and web browsing. The user interface for mobile computers generally
features a touch screen, where the user interacts with the system by pressing and swiping fingers
across the screen rather than using a physical keyboard and mouse. Some computers have little
or no user view. For example, embedded computers in devices found in homes and automobiles
may have numeric keypads and may turn indicator lights on or off to show status, but they and
their operating systems are designed primarily to run without user intervention.
System View
From the system’s (computer) point of view, the operating system is the program most intimately
involved with the hardware. In this context, we can view an operating system as a resource
allocator. A computer system has many resources that may be required to solve a problem: CPU
time, memory space, file-storage space, I/O devices, and so on. The operating system acts as the
manager of these resources. Facing numerous and possibly conflicting requests for resources,
the operating system must decide how to allocate them to specific programs and users so that it
can operate the computer system efficiently and fairly. As we have seen, resource allocation is
especially important where many users access the same mainframe or minicomputer. A slightly
different view of an operating system emphasizes the need to control the various I/O devices and
user programs. An operating system is a control program. A control program manages the
execution of user programs to prevent errors and improper use of the computer. It is especially
concerned with the operation and control of I/O devices.
In summary, the functions of an OS include, but are not limited to, the following:
1. Booting: OS manages the startup of a device.
2. Memory management: The OS coordinates applications and allocates space to different
programs installed on a computer.
3. Security management: OS protects user data from unauthorized access. It also protects
the computer hardware from abuse.
4. Loading and execution: OS starts and executes programs controlling resource allocation
and program instruction flow.
5. Drive/disk management: An OS manages secondary storage devices like drives and disks.
6. Device control: An OS enables you to allow or block access to devices. The devices most
commonly controlled by the OS are I/O devices.
7. User Interface: This part of the OS, known as the UI (or GUI when graphical), allows users to
enter and receive information by interacting with icons and menus.
8. Process management: The OS creates, schedules, and terminates processes, enabling tasks
such as computing, storing, and sharing information.
Operating systems have been evolving through the years since their inception. In the following sections, we will
briefly look at a few of the highlights. Since operating systems have historically been closely tied
to the architecture of the computers on which they run, we will look at successive generations of
computers to see what their operating systems were like. This mapping of operating system
generations to computer generations is crude, but it does provide some structure where there
would otherwise be none.
The first-generation computers were enormous, filling up entire rooms with tens of thousands of vacuum tubes, but they were still
millions of times slower than even the cheapest personal computers available today. In these
early days, a single group of people designed, built, programmed, operated, and maintained each
machine. All programming was done in absolute machine language, often by wiring up
plugboards to control the machine’s basic functions. Programming languages were unknown
(even assembly language was unknown). Operating systems were unheard of. The usual mode of
operation was for the programmer to sign up for a block of time on the signup sheet on the wall,
then come down to the machine room, insert his or her plugboard into the computer, and spend
the next few hours hoping that none of the 20,000 or so vacuum tubes would burn out during
the run. Virtually all the problems were straightforward numerical calculations, such as grinding
out tables of sines, cosines, and logarithms. By the early 1950s, the routine had improved
somewhat with the introduction of punched cards. It was now possible to write programs on
cards and read them instead of using plugboards; otherwise, the procedure was the same.
When the computer finished whatever job it was currently running, an operator would deliver the
output or result in the form of a printout to the output room, so that the programmer could
collect it later. The operator would then take one of the card decks that had been brought from
the input room and read it in. If the FORTRAN compiler was needed, the operator would have to
get it from a file cabinet and read it in. Much computer time was wasted while operators were
walking around the machine room. Given the high cost of the equipment, it is not surprising that
people quickly looked for ways to reduce the wasted time. The solution generally adopted was
the batch system. The idea behind it was to collect a tray full of jobs in the input room and then
read them onto a magnetic tape using a small (relatively) inexpensive computer like the IBM
1401, which was very good at reading cards, copying tapes, and printing output, but not at all
good at numerical calculations. Other, much more expensive machines, such as the IBM 7094,
were used for real computing.
After about an hour of collecting a batch of jobs, the tape was rewound and brought into the
machine room, where it was mounted on a tape drive. The operator then loaded a special
program (the ancestor of today’s operating systems; this marks the beginning of the OS), which read
the first job from tape and ran it. The output was written onto a second tape, instead of being
printed. After each job was finished, the operating system automatically read the next job from
the tape and began running it. When the whole batch was done, the operator removed the input
and output tapes, replaced the input tape with the next batch, and brought the output tape to
the 1401 for offline printing.
The execution of these different programs and tasks grouped on an input tape is carried out as
follows:
• It starts with a $JOB card, specifying the maximum run time in minutes, the account
number to be charged, and the programmer’s name.
• Then came a $FORTRAN card, telling the operating system to load the FORTRAN compiler
from the system tape.
• It was followed by the program to be compiled, and then a $LOAD card, directing the
operating system to load the object program just compiled.
• Next came the $RUN card, telling the operating system to run the program with the data
following it.
• Finally, the $END card marked the end of the job. These primitive control cards were the
forerunners of modern job control languages and command interpreters. The diagram
below illustrates the control structure.
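A deck following this structure might be sketched as follows (the run time in minutes, the account number, and the programmer's name are invented for illustration):

```
$JOB, 10,429754,JOHN DOE
$FORTRAN
    ... FORTRAN source program ...
$LOAD
$RUN
    ... data for the program ...
$END
```

Everything between $RUN and $END was handed to the running program as its input; everything between $FORTRAN and $LOAD was handed to the compiler as its source.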
Organizations that started with a small machine often outgrew it and wanted a bigger machine with the
same architecture as their current one, so that it could run all their old programs, but faster. IBM
attempted to solve this problem in a single stroke by introducing the System/360. The
360 was a series of software-compatible machines ranging from 1401-sized to much more
powerful than the 7094. The machines differed only in price and performance (maximum
memory, processor speed, number of I/O devices permitted, and so forth). Since all the machines
had the same architecture and instruction set, programs written for one machine could run on
all the others, at least in theory. Furthermore, the 360 was designed to handle both scientific and
commercial computing. Thus a single family of machines could satisfy the needs of all customers.
In subsequent years, IBM has come out with compatible successors to the 360 line, using more
modern technology, known as the 370, 4300, 3080, 3090, and Z series. The 360 was the first
major computer line to use (small-scale) Integrated Circuits (ICs), thus providing a major
price/performance advantage over the second-generation machines, which were built up from
individual transistors. It was an immediate success, and the idea of a family of compatible
computers was soon adopted by all the other major manufacturers. The descendants are often
used for managing huge databases (e.g., for airline reservation systems) or as servers for World
Wide Web sites that must process thousands of requests per second. The greatest strength of
the "one family" idea was simultaneously its greatest weakness. The intention was that all
software, including the operating system, OS/360, had to work on all models. It had to work in
commercial environments and scientific environments. There was no way that IBM (or anybody
else) could write a piece of software to meet all those conflicting requirements. The result was
an enormous and extraordinarily complex operating system, probably two to three orders of
magnitude larger than FMS (the Fortran Monitor System, a typical second-generation batch system). It consisted of millions of lines of assembly language written by
thousands of programmers and contained thousands upon thousands of bugs, which
necessitated a continuous stream of new releases in an attempt to correct them.
Despite its enormous size and problems, OS/360 and the similar third-generation operating
systems produced by other computer manufacturers satisfied most of their customers
reasonably well. They also popularized several key techniques absent in second-generation
operating systems. Probably the most important of these was multiprogramming. On the 7094,
when the current job paused to wait for a tape or other I/O operation to complete, the CPU
simply sat idle until the I/O finished. With heavily CPU-bound scientific calculations, I/O is
infrequent, so this wasted time is not significant. With commercial data processing, the I/O wait
time can often be 80% or 90% of the total time, so something had to be done to avoid having the
(expensive) CPU idle so much. The solution that evolved was to partition memory into several
pieces, with a different job in each partition, as shown below.
A multiprogramming system with three jobs in memory:

    +-----------+
    |   Job 3   |
    +-----------+
    |   Job 2   |
    +-----------+
    |   Job 1   |
    +-----------+
    |    OS     |
    +-----------+
While one job was waiting for I/O to complete, another job could be using the CPU. If enough
jobs could be held in the main memory at once, the CPU could be kept busy nearly 100% of the
time. Having multiple jobs safely in memory at once requires special hardware to protect each
job against snooping and mischief by the other ones, but the 360 and other third-generation
systems were equipped with this hardware.
Another major feature present in third-generation OS was the ability to read jobs from cards onto
the disk as soon as they were brought to the computer room. Then, whenever a running job
finished, the OS could load a new job from the disk into the now-empty partition and run it. This
technique is called spooling (from Simultaneous Peripheral Operation on Line) and was also used
for output. With spooling, the 1401s were no longer needed, and much carrying of tapes
disappeared.
Although third-generation operating systems were well suited to big scientific calculations and massive
commercial data-processing runs, they were still basically batch systems with turnaround times
of an hour or more. The desire of many programmers for quick response time paved the way for
timesharing, a variant of multiprogramming, in which each user has an online terminal. In a
timesharing system, if 20 users are logged in and 17 of them are thinking or talking or drinking
coffee, the CPU can be allocated in turn to the three jobs that want service. The first serious
timesharing system, CTSS (Compatible Time-Sharing System), was developed at M.I.T. on a
specially modified 7094. However, timesharing did not become popular until the necessary
protection hardware became widespread during the third generation.
Batch operating systems are used for tasks such as managing payroll systems, data entry and
bank statements.
Distributed operating systems are used for tasks such as telecommunication networks, airline
reservation controls and peer-to-peer networks.
Examples of network operating systems include Microsoft Windows Server, UNIX, Linux and
macOS.
Real-time operating systems may either be hard real-time systems or soft real-time systems.
Hard real-time systems are installed in applications with strict time constraints. The system
guarantees the completion of time-sensitive tasks on schedule. Hard real-time systems typically do not
use virtual memory, since paging delays would make timing unpredictable. Soft real-time systems do not have equally rigid time requirements. A critical task gets
priority over other tasks.
Real-time operating systems are used for tasks such as scientific experiments, medical imaging,
robotics and air traffic control operations.
Examples of mobile operating systems include Android OS, Apple iOS and Windows mobile OS.
Mainframe OS
At the high end are the operating systems for mainframes. Mainframe operating systems are heavily oriented
toward processing many jobs at once, most of which need prodigious amounts of I/O. They
typically offer three kinds of services: batch, transaction processing, and timesharing. Claims
processing in an insurance company or sales reporting for a chain of stores is typically done in
batch mode. Transaction processing systems handle large numbers of small requests, for
example, check processing at a bank or airline reservations. Each unit is small, but the system
must handle hundreds or thousands per second. However, mainframe OS are gradually being
replaced by UNIX variants such as Linux.
Server OS
They run on servers, which are either very large PCs, workstations, or even mainframes. They
serve multiple users at once over a network and allow the users to share hardware and
software resources. Servers can provide print services, file services, or Web services. Typical
server operating systems include Solaris, FreeBSD, Linux, and Windows Server.
Microsoft Windows
Created by Microsoft, Microsoft Windows is one of the most popular proprietary operating
systems for computers in the world. Most personal computers come preloaded with a version of
Microsoft Windows. One downside of Windows is that compatibility with mobile phones has
been problematic.
Apple iOS
Apple iOS from Apple is used on smartphones and tablets manufactured by the same company.
Users of this system have access to hundreds of applications. The operating system offers strong
encryption capabilities to control unauthorized access to users' private data.
Google Android
Android from Google is the most popular operating system in the world. It is used mainly on
smartphones and tablets made by many different manufacturers. Users have access
to numerous mobile applications available on the Google Play Store.
Apple macOS
Developed by Apple, this proprietary operating system runs on the manufacturer's personal
computers and desktops. All Apple and Macintosh computers come equipped with the latest
version of macOS, previously known as OS X. Its reputation for stability and for fending off
hackers makes Apple operating systems popular with their users.
Linux
Created by the Finnish programmer Linus Torvalds, Linux is today developed by collaborating
programmers across the world who submit tweaks to the central kernel software. Linux is popular
with programmers and is widely used on corporate servers. It is available for free online.
Interrupt
Interrupts are essential because they give the OS a reliable mechanism for communicating and
reacting to its surroundings. An interrupt is a signal sent to the CPU, either by a program that
needs some resource from the OS or by a hardware component that has finished servicing a job.
Whenever an interrupt signal is received, the CPU automatically puts on hold whatever program
is currently running, saves its status, and runs the program previously associated with that
interrupt.
Modern operating systems are interrupt-driven. If there are no processes to execute, no I/O
devices to service, and no users to whom to respond, an operating system will sit quietly, waiting
for something to happen. Events are almost always signalled by the occurrence of an interrupt
or a trap. A trap (or an exception) is a software-generated interrupt caused either by an error (for
example, division by zero or invalid memory access) or by a specific request from a user program
that an operating-system service be performed. The interrupt-driven nature of an operating
system defines that system’s general structure. For each type of interrupt, separate segments of
code in the operating system determine what action should be taken. An interrupt service routine
is provided to deal with the interrupt.
1. The state of the currently running routine (its program counter and registers) is pushed onto the stack.
2. The appropriate instructions to handle the specific interrupt are loaded into memory.
3. Once the interrupt instructions are done executing, they are taken out of memory.
4. The saved state of the routine that was running before the interrupt is popped off the
stack, and the process continues.
This is the surface explanation of how interrupts are handled within a running operating system.
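These steps can be mimicked in user space with POSIX signals, which are a software analogue of interrupts: a handler is registered, the normal flow is suspended when the signal arrives, the handler runs, and control returns to the interrupted code. A minimal sketch in Python, assuming a Unix-like system (the handler name and the choice of SIGUSR1 are arbitrary):

```python
import os
import signal

handled = []  # records that the "interrupt service routine" ran

def on_signal(signum, frame):
    # Plays the role of an interrupt service routine: the normal
    # flow is suspended, this code runs, then the flow resumes.
    handled.append(signum)

# Register the handler (analogous to installing an ISR).
signal.signal(signal.SIGUSR1, on_signal)

# Deliver the "interrupt" to our own process.
os.kill(os.getpid(), signal.SIGUSR1)

# Execution resumes here after the handler returns.
print("handler ran:", handled == [signal.SIGUSR1])
```

On real hardware, saving and restoring processor state is done by the CPU and the kernel; here the Python runtime performs the equivalent bookkeeping.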
System Calls
System calls provide a channel for user programs to access the services made available by an
operating system. These calls are generally available as routines written in C and C++, although
certain low-level tasks (for example, tasks where hardware must be accessed directly) may have
to be written in assembly language.
Frequently, systems execute thousands of system calls per second. However, most programmers
never get to deal with this level of detail. Typically, application developers design programs
according to an application programming interface (API). The API specifies a set of functions that
are available to an application programmer, including the parameters that are passed to each
function and the return values the programmer can expect. Three of the most common APIs
available to application programmers are the Windows API for Windows systems, the POSIX API
for POSIX-based systems (which include virtually all versions of UNIX, Linux, and Mac OS X), and
the Java API for programs that run on the Java virtual machine. A programmer accesses an API
via a library of code provided by the operating system. In the case of UNIX and Linux for programs
written in the C language, the library is called libc.
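To illustrate the layering, Python's os module plays a role similar to libc here: the program calls a documented API function, and the library issues the underlying system call on its behalf. A small sketch, assuming a Unix-like system:

```python
import os

# os.getpid() wraps the getpid() system call: the call traps into
# the kernel, which returns the identifier of the calling process.
pid = os.getpid()
print("pid:", pid)

# os.write() wraps the write() system call. File descriptor 1 is
# standard output; the kernel performs the actual I/O and returns
# the number of bytes written.
msg = b"written via a system call\n"
n = os.write(1, msg)
print("bytes written:", n)
```

The programmer never issues the trap instruction directly; the library hides the calling convention, which is one reason to program against an API rather than raw system calls.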
So why would an application programmer prefer programming according to an API and not by
actual system calls? There are several reasons for doing so.
System calls can be grouped roughly into six major categories: process control, file manipulation,
device manipulation, information maintenance, communications, and protection.
In a typical modern operating system, the basic structure that runs behind the scenes consists of
three layers, from the innermost out:

    Kernel (Core) -> Supervisor -> Programs/users
The kernel
The kernel in the OS provides the basic level program that directly manipulates the hardware. In
the operating system, the kernel is an essential component that loads first and remains within
the main memory throughout the period for which the computer system is active. It remains
resident so that it can manage memory for the programs in RAM, arrange for programs to access
the hardware resources, and set the operating state of the CPU for the best operation at all
times.
The kernel relies on several classic data structures. After arrays, lists are perhaps the most fundamental data structures in computer science.
Whereas each item in an array can be accessed directly, the items in a list must be accessed in a
particular order. That is, a list represents a collection of data values as a sequence. The most
common method for implementing this structure is a linked list, in which items are linked to one
another. Linked lists are of several types:
• In a singly linked list, each item points to its successor
• In a doubly linked list, a given item can refer to both its predecessor and its successor
• In a circularly linked list, the last element in the list refers to the first element, rather than to null
Linked lists accommodate items of varying sizes and allow easy insertion and deletion of items.
One potential disadvantage of using a list is that performance for retrieving a specified item in a
list of size n is linear — O(n), as it requires potentially traversing all n elements in the worst case.
Lists are sometimes used directly by kernel algorithms.
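A minimal singly linked list sketch in Python (class names invented for illustration), showing why retrieving an item is O(n): the search must walk node to node from the head.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None          # link to the successor node

class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        # Insertion at the head is O(1): no traversal is needed.
        node = Node(value)
        node.next = self.head
        self.head = node

    def find(self, value):
        # Retrieval is O(n): in the worst case every node is visited.
        node = self.head
        while node is not None:
            if node.value == value:
                return node
            node = node.next
        return None

lst = LinkedList()
for v in (3, 2, 1):
    lst.push_front(v)
# The list order is now 1 -> 2 -> 3.
print(lst.find(3) is not None, lst.find(99) is None)
```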
Another data structure used is stacks and queues. A stack is a sequentially ordered data structure
that uses the last in, first out (LIFO) principle for adding and removing items, meaning that the
last item placed onto a stack is the first item removed. The operations for inserting and removing
items from a stack are known as push and pop, respectively. An operating system often uses a
stack when invoking function calls. Parameters, local variables, and the return address are
pushed onto the stack when a function is called; returning from the function call pops those items
off the stack. A queue, in contrast, is a sequentially ordered data structure that uses the first in,
first out (FIFO) principle: items are removed from a queue in the order in which they were
inserted. There are many everyday examples of queues, including shoppers waiting in a checkout
line at a store and cars waiting in line at a traffic signal. Queues are also quite common in
operating systems—jobs that are sent to a printer are typically printed in the order in which they
were submitted, for example. As we shall see later, tasks that are waiting to be run on an
available CPU are often organized in queues.
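The two disciplines can be sketched with a Python list (stack) and collections.deque (queue); the operation names mirror the text:

```python
from collections import deque

# Stack: last in, first out (LIFO).
stack = []
stack.append("frame A")    # push
stack.append("frame B")    # push
top = stack.pop()          # pop removes the most recently pushed item

# Queue: first in, first out (FIFO), like jobs sent to a printer.
jobs = deque()
jobs.append("job 1")       # enqueue
jobs.append("job 2")       # enqueue
first = jobs.popleft()     # dequeue removes the oldest item

print(top, "|", first)     # frame B | job 1
```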
The Supervisor
The supervisor surrounds the kernel and is the interface between the user job/program and
system resources. It controls and coordinates system resources. It accepts user commands and
translates them into kernel instructions to manipulate the hardware.
The Program
This consists of instructions given by the users. The programs referred to here are those that
have been loaded into, and reside in, RAM.
To ensure the proper execution of the operating system, we must be able to distinguish between
the execution of operating-system code (supervisor domain) and user-defined code (program
domain). At this point, we need two separate modes of operation: user mode and kernel mode
(also called supervisor mode, system mode, or privileged mode).
A bit, called the mode bit, is added to the hardware of the computer to indicate the current
mode: kernel (0) or user (1). With the mode bit, we can distinguish between a task that is
executed on behalf of the operating system and one that is executed on behalf of the user. When
the computer system is executing on behalf of a user application, the system is in user mode.
However, when a user application requests a service from the operating system (via a system
call), the system must transition from user to kernel mode to fulfil the request as seen in the
diagram below.
At system boot time, the hardware starts in kernel mode. The operating system is then loaded
and starts user applications in user mode. Whenever a trap or interrupt occurs, the hardware
switches from user mode to kernel mode (that is, changes the state of the mode bit to 0). Thus,
whenever the operating system gains control of the computer, it is in kernel mode. The system
always switches to user mode (by setting the mode bit to 1) before passing control to a user
program.
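The transitions above can be sketched as a toy model. Real hardware keeps the mode bit in a CPU register and switches it in silicon, so this sketch is purely illustrative:

```python
KERNEL, USER = 0, 1   # mode-bit values from the text

class CPU:
    def __init__(self):
        # At system boot time the hardware starts in kernel mode.
        self.mode = KERNEL

    def dispatch_user_program(self):
        # The OS sets the mode bit to 1 before passing control
        # to a user program.
        self.mode = USER

    def trap(self):
        # A trap or interrupt switches the hardware to kernel mode;
        # the OS services the event, then user mode is restored.
        self.mode = KERNEL
        serviced = True            # stand-in for kernel work
        self.mode = USER
        return serviced

cpu = CPU()
assert cpu.mode == KERNEL          # at boot
cpu.dispatch_user_program()
assert cpu.mode == USER            # running application code
cpu.trap()                         # e.g. a system call occurs
assert cpu.mode == USER            # the user program resumes afterwards
print("mode transitions ok")
```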
The dual mode of operation provides us with the means for protecting the operating system. We
accomplish this protection by designating some of the machine instructions that may cause harm
as privileged instructions. The hardware allows privileged instructions to be executed only in
kernel mode. If an attempt is made to execute a privileged instruction in user mode, the hardware
does not execute the instruction but rather treats it as illegal and traps it in the operating system.
Some earlier approaches to operating-system design include MS-DOS and UNIX (monolithic systems).
In this type of OS, the kernel is designed to operate in one space with no separation between
user space and kernel or supervisor space. With this type of OS kernel design, the user
applications can have direct access to control hardware as seen in the diagram above. This
monolithic structure was difficult to implement and maintain. It had a distinct performance
advantage, however: there is very little overhead in the system call interface or communication
within the kernel.
We still see evidence of this simple, monolithic structure in the UNIX, Linux, and Windows
operating systems.
Later came the microkernel approach. This method structures the operating system by removing all
nonessential components from the kernel and implementing them in the user-space
environment.
The main function of the microkernel is to provide communication between the client program
and the various services that are also running in user space.
In practice, very few operating systems adopt a single, strictly defined structure. Instead, they
combine different structures, resulting in a hybrid that addresses performance, security, and
usability issues. For example, both Linux and Solaris are monolithic, because having the operating
system in a single address space provides very efficient performance. However, they are also
modular, so that new functionality can be dynamically added to the kernel. Windows is largely
monolithic as well (again primarily for performance reasons), but it retains some behaviour
typical of microkernel systems, including providing support for separate subsystems (known as
operating-system personalities) that run as user-mode processes. Windows systems also provide
support for dynamically loadable kernel modules.
PROCESS MANAGEMENT
Early computers allowed only one program to be executed at a time. This meant that a program
had complete control of the system and had access to all the system’s resources, but as we
saw while walking through the generations of computers, this was a waste of computing
resources, so further improvements were made, leading to batch jobs and multiprogramming,
which are the bedrock of modern operating systems. These improvements enabled modern
computers to execute multiple programs at once. With modern operating systems
came the requirement for stricter control of the various programs in execution, so that they do not
interfere with one another, leading to the birth of processes in operating systems.
A process can be thought of as a program in execution. A process is the unit of work in most
systems. A process will need certain resources—such as CPU time, memory, files, and I/O
devices — to accomplish a task. These resources are allocated to the process either when it is
created or while it is executing. Systems consist of a collection of processes:
➢ operating-system processes execute system code, whilst
➢ user processes execute user code. All these processes may execute concurrently.
Process
In simple terms, a process is a program in execution as seen earlier. A process is more than just
the program code. A process generally includes the process stack (which contains temporary
data such as function parameters, return addresses, and local variables). A process may also
include a heap, which is memory that is dynamically allocated while the process is
running. The structure of a process in memory is shown in the diagram below.
[Diagram: the layout of a process in memory, consisting of the stack, heap, data, and text (program code) sections]
We emphasize that a program by itself is not a process. A program in itself is a passive entity
like a file containing a list of instructions stored on a disk (known as an executable file) but in
contrast, a process is an active entity with a program counter and a set of associated resources.
A program counter is a register that specifies the next instruction to execute. A program
becomes a process when an executable file is loaded into memory.
Although two processes may be associated with the same program, they are not considered
one process. For instance, several users may be running copies of the same program, or the same user may invoke many copies of the web browser program. Each of these is a separate process; and although the text sections are equivalent, the data, heap, and stack sections vary. There are also situations where a process spawns further processes as it runs.
Process State
As a process executes, it changes state. The state of a process is defined in part by the current
activity of that process. A process may be in one of the following states:
• New: The process is being created.
• Running: Instructions are being executed.
• Waiting: The process is waiting for some event to occur (such as an I/O completion or
reception of a signal).
• Ready: The process is waiting to be assigned to a processor.
• Terminated: The process has finished execution.
The names of the states are not fixed; they vary across operating systems. It is important to
realize that only one process can be running on any processor at any instant. Many processes
may be ready and waiting, however.
The diagram above shows some portions of a process control block (PCB). A PCB contains many pieces of information associated with a specific process, including:
• Process state: The state may be new, ready, running, waiting, halted, and so on.
• Program counter: The counter indicates the address of the next instruction to be
executed for this process.
• CPU registers: The registers vary in number and type, depending on the hardware
architecture of your computer. They may include index registers, stack pointers, and
general-purpose registers, plus any condition-code information. Along with the program
counter, the state information of a process must be saved when an interrupt occurs to
allow the process to be continued correctly afterwards as seen in the diagram below.
• I/O status information: This information includes the list of I/O devices allocated to the
process, a list of open files, and so on.
Threads
A thread is the smallest unit of a process that can exist in an OS. So we say a thread exists
within a process - that is, a single process may contain multiple threads. A thread is a basic unit
of CPU utilization. It comprises a thread ID, a program counter, a register set, and a stack. It
shares with other threads belonging to the same process its code section, data section, and
other operating-system resources, such as open files and signals.
You can imagine multitasking as something that allows processes to run concurrently, while
multithreading allows sub-processes to run concurrently.
When multiple threads are running concurrently, this is known as multithreading, which is
similar to multitasking. An operating system with multitasking capabilities allows programs (or
processes) to run seemingly at the same time. On the other hand, a single program with
multithreading capabilities allows individual sub-processes (or threads) to run seemingly at the
same time.
One example of multithreading is downloading a video while playing it at the same time.
Another example is a word processor, which may have one thread for displaying graphics, another thread for responding to keystrokes from the user, and a third thread for performing spelling and grammar checking in the background.
Among the widely-used programming languages that allow developers to work on threads in
their program source code are Java, Python, and .NET.
If a process has multiple threads of control, it can perform more than one task at a time. The
diagram above illustrates single-threaded and multithreaded processes.
The benefits of multithreaded programming can be broken down into four major categories: responsiveness, resource sharing, economy, and scalability.
Process Scheduling
To maximize CPU utilization, multiprogramming aims to have some process running at all times. Time-sharing aims to switch the CPU among processes frequently enough that users can interact with each program while it is running. To achieve these goals, the process scheduler selects one available process (potentially from a set of several such processes) for execution on the CPU. On a single-processor system there can never be more than one running process; if there are other processes, they must wait until the CPU is free and can be rescheduled.
Scheduling Queues
All processes in the system are gathered into a job queue, which receives new processes as
they come into the system. The ready queue is a list that contains the processes that are stored
in the main memory and are prepared and waiting to be executed. In most cases, a linked list is
used to hold this queue. Pointers to the first and last PCBs in the list are contained in the ready-
queue header. Each PCB includes a pointer field that points to the next PCB in the ready queue.
The system also includes other queues. When a process is allocated to the CPU, it executes for a
while and eventually quits, is interrupted, or waits for the occurrence of a particular event, such
as the completion of an I/O request. Suppose the process makes an I/O request to a shared
device, such as a disk. Since there are many processes in the system, the disk may be busy with
the I/O request of some other process. The process therefore may have to wait for the disk. The
list of processes waiting for a particular I/O device is called a device queue. Each device has its
device queue.
Schedulers
A process migrates among the various scheduling queues throughout its lifetime. The operating
system must select, for scheduling purposes, processes from these queues in some fashion. The
selection process is carried out by the appropriate scheduler. Often, in a batch system, more
processes are submitted than can be executed immediately. These processes are spooled to a
mass-storage device (typically a disk), where they are kept for later execution. The long-term
scheduler, or job scheduler, selects processes from this pool and loads them into memory for
execution. The short-term scheduler, or CPU scheduler, selects from among the processes that
are ready to execute and allocates the CPU to one of them. The primary distinction between
these two schedulers lies in the frequency of execution. The short-term scheduler must select a
new process for the CPU frequently. Often, the short-term scheduler executes at least once
every 100 milliseconds. Because of the short time between executions, the short-term
scheduler must be fast. The long-term scheduler executes much less frequently; minutes may
separate the creation of one new process and the next. Because of the long interval between
executions, the long-term scheduler can afford to take more time to decide which process
should be selected for execution.
In general, most processes can be described as either I/O bound or CPU bound. An I/O-bound
process spends more of its time doing I/O than it spends doing computations. A CPU-bound
process, in contrast, generates I/O requests infrequently, using more of its time doing
computations. It is important that the long-term scheduler selects a good process mix of I/O-
bound and CPU-bound processes so that these resources are used to the maximum.
The key idea behind a medium-term scheduler is that sometimes it can be advantageous to
remove a process from memory (and from active contention for the CPU) and thus reduce the
degree of multiprogramming. Later, the process can be reintroduced into memory, and its
execution can be continued where it left off. This scheme is called swapping. The process is
swapped out and is later swapped in, by the medium-term scheduler. Swapping may be
necessary to improve the process mix or because a change in memory requirements has
overcommitted available memory, requiring memory to be freed up.
Context Switch
As seen previously, interrupts cause the operating system to switch the CPU from its present task and run a kernel routine. Such switches occur frequently on general-purpose systems. When an interrupt occurs, the system must save the current context of the process running on the CPU so that it can restore that context once the interrupt processing is complete, effectively suspending the process and then resuming it. Generically, we perform a state save of the current state of the CPU, be it in kernel or user mode, and then a state restore to resume operations.
Switching the CPU to another process requires performing a state save of the current process
and a state restore of a different process. This task is known as a context switch. When a
context switch occurs, the kernel saves the context of the old process in its PCB and loads the
saved context of the new process scheduled to run.
Process Creation
During execution, a process may create several new processes. As mentioned in our previous
sessions, the creating process is called a parent process, and the new processes are called the
children of that process. Each of these new processes may in turn create other processes,
forming a tree of processes as seen in the diagram below.
Most operating systems (including UNIX, Linux, and Windows) identify processes according to a
unique process identifier (or pid), which is typically an integer number. The pid provides a
unique value for each process in the system, and it can be used as an index to access various
attributes of a process within the kernel.
The diagram above shows how a process tree may look. The process tree for the Linux
operating system is depicted in this figure. Here, we can see that each process is identified by
its pid and name. Although both are presented, transactions involving a process always utilize
the pid, not the name. The init process (which always has a pid of 1) serves as the root parent
process for all processes (kernel and user). The init has pid = 1 because it was created at the
very start-up of the OS. The init process can also create various user processes, such as a web or
print server, an ssh server, and the like as seen in the diagram. We also see two children of
init—kthreadd and sshd. The kthreadd process is responsible for creating additional processes
that perform tasks on behalf of the kernel (in this situation, khelper and pdflush). Clients that
connect to this system via ssh must be managed by the sshd process which is the short form for
secure shell. The login process is responsible for managing clients that directly log onto this
system. In this example, a client has logged on and is using the bash shell, which has been
assigned pid 8416. Using the bash command-line interface, this user has created the process ps
as well as the emacs editor. The more processes are created, the larger the tree grows.
When a child process is created, it will need certain resources (CPU time, memory, files, I/O devices, and so on) to accomplish its task, just as its parent does. A child process may be able to obtain its resources directly from the operating system, or it may be constrained to share the resources of its parent process. Where resources are shared with the parent, the parent may have to partition its available resources among its children. Resource sharing between parent and child may also be partial, where only some resources, such as memory or files, are shared among several children.
Restricting a child process to a subset of the parent’s resources prevents any process from
overloading the system by creating too many child processes. In addition to supplying various
physical and logical resources, the parent process may pass along initialization data as input to
the child process. For example, consider a process whose function is to display the contents of a
file —say image.jpg—on the screen. When the process is created, it will get, as an input from its
parent process, the name of the file image.jpg that it is supposed to display. Now using that file
name, it will open the file and write the contents out. It may also get the name of the output
device. However, some operating systems pass resources to child processes at their creation.
When a process creates a new process, two possibilities for execution exist:
1. The parent continues to execute concurrently with its children.
2. The parent waits until some or all of its children have terminated.
There are also two address-space possibilities for the new process:
1. The child process is a duplicate of the parent process; that is, it has the same program and data as the parent.
2. The child process has a new program loaded into it.
We take the case of the UNIX operating system to give some clarity to this concept. In UNIX, as we've seen in the diagram above, each process is identified by its process identifier, which is a unique integer. A new process is created by the fork() system call. The new process consists of a copy of the address space of the original process. This mechanism allows the parent process to communicate easily with its child process. Both processes (the parent and the child) continue execution at the instruction after the fork(), with one difference: the return code for the fork() is zero for the new (child) process, whereas the (nonzero) process identifier of the child is returned to the parent.
After a fork() system call, one of the two processes typically uses the exec() system call to replace its memory space with a new program. In this manner, the two processes can communicate and then go their separate ways. The parent can then create more children; or, if it has nothing else to do while the child runs, it can issue a wait() system call to move itself off the ready queue until the child terminates. The exec() system call overlays the process's address space with a new program, so the call to exec() does not return control unless an error occurs. The child process inherits privileges and scheduling attributes from the parent, as well as certain resources.
Processes are created in the Windows OS using the CreateProcess() system call, which is similar to fork() in UNIX. However, whereas fork() has the child process inherit the address space of its parent, CreateProcess() requires loading a specified program into the address space of the child process at process creation. Furthermore, whereas fork() is passed no parameters, CreateProcess() expects no fewer than ten parameters.
Process Termination
A process terminates when it finishes executing its final statement and asks the operating system to delete it by using the exit() system call. The operating system then deallocates all of the process's resources, including physical and virtual memory, open files, and I/O buffers.
Other situations may also result in termination. Using the proper system call (such as TerminateProcess() in Windows), a process can cause another process to terminate. Usually, such a system call can be invoked only by the parent of the process that is to be terminated; otherwise, users could arbitrarily kill each other's jobs. A parent needs to know the identities of its children if it is to terminate them. Thus, when one process creates a new process, the identity of the newly created process is passed to the parent.
A parent may terminate the execution of one of its children for a variety of reasons, such as:
• The child has exceeded its usage of some of the resources that it has been allocated. (To determine whether this has occurred, the parent must have a mechanism to inspect the state of its children.)
• The task assigned to the child is no longer needed.
• The parent is exiting, and the operating system does not allow a child to continue without a parent.
In certain systems, a child cannot continue to exist once its parent has terminated. In such systems, every child of a process that ends (normally or abnormally) must likewise be terminated; this is referred to as cascading termination. It is typically initiated by the operating system.
We can have a process that has completed execution and made the exit system call but still has
an entry in the process table. This type of process is referred to as a zombie process or defunct
process. Because the process still has an entry in the process table, it is seen not to have
completely exited from the system. There are also instances where a process is still executing but its parent process has terminated. This is called an orphan process. An orphan process is then adopted by another process, in most cases the init process (pid 1).
Interprocess communication
Processes executing concurrently in the operating system may be either independent processes
or cooperating processes. A process is independent if it cannot affect or be affected by the
other processes executing in the system. Any process that does not share data with any other
process is independent. On the other hand, a process is cooperating if it can affect or be
affected by the other processes executing in the system. Any process that shares data with
other processes is a cooperating process.
There are several reasons for providing an environment that allows process cooperation:
1. Information sharing: Since several users may be interested in the same piece of
information (for instance, a shared file), we must provide an environment to allow
concurrent access to such information.
2. Computation speedup: If we want a particular task to run faster, we must break it into
subtasks, each of which will be executed in parallel with the others. Notice that such a
speedup can be achieved only if the computer has multiple processing cores.
3. Modularity: We may want to construct the system in a modular fashion, dividing the
system functions into separate processes or threads.
4. Convenience: Even an individual user may work on many tasks at the same time. For
instance, a user may be editing, listening to music, and compiling in parallel.
Cooperating processes require an interprocess communication (IPC) mechanism that will allow
them to exchange data and information. There are two fundamental models of interprocess
communication: shared memory and message passing. In the shared-memory model, a region
of memory that is shared by cooperating processes is established. Processes can then exchange
information by reading and writing data to the shared region. In the message-passing model,
communication takes place using messages exchanged between the cooperating processes.
Both of the models just mentioned are common in operating systems, and many systems
implement both. Message passing is useful for exchanging smaller amounts of data because no
conflicts need to be avoided. Message passing is also easier to implement in a distributed
system than in shared memory.
Shared Memory
Shared memory can be viewed as a form of indirect communication in which messages are left in a shared region (a "mailbox") by a poster and can then be retrieved at will by a receiver. Since we typically limit the size of this region, it is known as a bounded buffer, in which there is a maximum number of messages that can be stored at any time.
The idea of a mailbox is abstract and can accommodate any sort of data structure in which to
store shared information.
Shared memory works as shown in the diagram above: process A and process B are executing simultaneously and share a region of memory. Process A generates data based on computations in its code and stores this data in the shared memory. When process B
needs to use the shared data, it will check in the shared memory segment and use the data that
process A placed there. Processes can use shared memory for extracting information as a
record from another process as well as for delivering any specific information to other
processes.
Message-Passing
In this method of process communication, processes can communicate with each other without
using any kind of shared memory. For instance, if two processes process A and process B want
to communicate with each other, they proceed as follows:
• Establish a communication link (if a link already exists, no need to establish it again.)
• Start exchanging messages using a system's library functions send() and receive().
We need at least two primitives:
– send(message, destination) or send(message)
– receive(message, host) or receive(message)
To send a message, process A uses the send() function to place the message on the communication link that has been opened between the two processes. Process B, which is monitoring the communication link, uses the receive() function to pick up the message and performs the necessary processing based on the message it has received.
Messages can be of fixed or variable size. Fixed-size messages are easy for the OS designer but complicated for the programmer; variable-size messages are easy for the programmer but complicated for the OS designer.
Synchronization
Communication between processes takes place through calls to send() and receive() primitives.
There are different design options for implementing each primitive. Message passing may be
either blocking or nonblocking— also known as synchronous and asynchronous.
• Blocking send: The sending process is blocked until the message is received by the
receiving process or by the mailbox.
• Nonblocking send: The sending process sends the message and resumes operation.
• Blocking receive: The receiver blocks until a message is available.
• Nonblocking receive: The receiver retrieves either a valid message or a null.
Different combinations of send() and receive() are possible. When both send() and receive() are blocking, we have a rendezvous between the sender and the receiver. A problem arises when blocking send() and receive() are used together: both the sender and the receiver may stall, because when the sender invokes the blocking send() call it waits until the message is delivered to either the receiver or the mailbox, and likewise, when the receiver invokes receive(), it blocks until a message is available.
Buffering
Whether process communication is direct or indirect, messages exchanged by communicating
processes reside in a temporary queue. Such queues can be implemented in three ways:
1. Zero capacity: The queue has a maximum length of zero; thus, the link cannot have any
messages waiting in it. In this case, the sender must block until the recipient receives
the message.
2. Bounded capacity: The queue has a finite length of n; thus, at most n messages can
reside in it. If the queue is not full when a new message is sent, the message is placed in
the queue (either the message is copied or a pointer to the message is kept), and the
sender can continue execution without waiting. The link’s capacity is finite, however. If
the link is full, the sender must block until space is available in the queue.
3. Unbounded capacity: The queue’s length is potentially infinite; thus, any number of
messages can wait in it. The sender never blocks.
The zero-capacity case is sometimes referred to as a message system with no buffering. The
other cases are referred to as systems with automatic buffering.
Deadlocks
In a multiprogramming environment, several processes may compete for a finite number of
resources. A process requests resources; if the resources are not available at that time, the
process enters a waiting state. Sometimes, a waiting process is never again able to change
state, because the resources it has requested are held by other waiting processes. This situation
is called a deadlock.
Under the normal mode of operation, a process may utilize a resource in only the following
sequence:
1. Request: The process requests the resource. If the request cannot be
granted immediately (for example, if the resource is being used by another
process), then the requesting process must wait until it can acquire the
resource.
2. Use: The process can operate on the resource (for example, if the resource
is a printer, the process can print on the printer).
3. Release: The process releases the resource.
The request and release of resources are done using system calls. Examples are the request()
and release() device, open() and close() file, and allocate() and free() memory system calls.
Indefinite Postponement
In a system that keeps processes waiting while it makes resource-allocation and process-scheduling decisions, it is possible to delay a process indefinitely while other processes receive the system's attention. This is known as indefinite postponement, indefinite blocking, or starvation. Indefinite postponement may occur because of biases in the system's resource-scheduling policies: a given process can wait for a resource indefinitely as processes with higher priority continue to arrive. Systems should be designed to manage waiting processes fairly as well as efficiently. Aging has traditionally been used to prevent indefinite postponement (starvation).
Deadlock Prevention
There are three major ways of preventing deadlocks:
1. Each process must request all its required resources at once and cannot proceed until all have been granted.
2. If a process holding certain resources is denied a further request, it must release its original resources and, if necessary, request them again together with the additional resources.
3. Impose a linear ordering on resource types for all processes. That is, once a process has been allocated resources of a given type, it may subsequently request only resources of types that come later in the ordering.
CPU SCHEDULING
CPU scheduling is the basis of multiprogrammed operating systems. By switching the CPU among
processes, the operating system can make the computer more productive.
In a single-processor system, only one process can run at a time. Others must wait until the CPU
is free and can be rescheduled. The objective of multiprogramming is to have some process
running at all times, to maximize CPU utilization. The idea is relatively simple. A process is
executed until it must wait, typically for the completion of some I/O request. In a simple
computer system, the CPU then just sits idle. All this waiting time is wasted; no useful work is
accomplished. With multiprogramming, we try to use this time productively. Several processes
are kept in memory at one time. When one process has to wait, the operating system takes the
CPU away from that process and gives the CPU to another process. Every time one process has
to wait, another process can take over the use of the CPU. Scheduling of this kind is a fundamental
operating-system function. Almost all computer resources are scheduled before use. The CPU is,
of course, one of the primary computer resources. Thus, its scheduling is central to the operating-
system design.
The success of CPU scheduling depends on an observed property of processes: process execution
consists of a cycle of CPU execution and I/O wait. Processes alternate between these two states.
Process execution begins with a CPU burst. That is followed by an I/O burst, which is followed by
another CPU burst, then another I/O burst, and so on. Eventually, the final CPU burst ends with
a system request to terminate execution as seen in the diagram below.
An I/O-bound program typically has many short CPU bursts. A CPU-bound program might have a
few long CPU bursts. This distribution can be important in the selection of an appropriate CPU-
scheduling algorithm.
When scheduling takes place only under circumstances such as when a process switches from
the running state to the waiting state (for example, as the result of an I/O request or an invocation
of wait() for the termination of a child process) or when a process terminates, we say that the
scheduling scheme is nonpreemptive or cooperative. Otherwise, it is preemptive. Preemptive
scheduling could occur when a process switches from the running state to the ready state (for
example, when an interrupt occurs) or when a process switches from the waiting state to the
ready state (for example, after I/O). Under nonpreemptive scheduling, once the CPU has been
allocated to a process, the process keeps the CPU until it releases the CPU either by terminating
or by switching to the waiting state.
Cooperative scheduling is the only method that can be used on certain hardware platforms
because it does not require the special hardware (for example, a timer) needed for preemptive
scheduling.
Unfortunately, preemptive scheduling can result in race conditions when data are shared among
several processes. Consider the case of two processes that share data. While one process is
updating the data, it is preempted so that the second process can run. The second process then
tries to read the data, which are in an inconsistent state.
Dispatcher
Another component involved in the CPU-scheduling function is the dispatcher. The dispatcher is
the module that gives control of the CPU to the process selected by the short-term scheduler.
This function involves the following:
• Switching context
• Switching to user mode
• Jumping to the proper location in the user program to restart that program.
The dispatcher should be as fast as possible since it is invoked during every process switch. The
time it takes for the dispatcher to stop one process and start another running is known as the
dispatch latency.
Scheduling Criteria
Different CPU-scheduling algorithms have different properties, and the choice of a particular
algorithm may favour one class of processes over another. In choosing which algorithm to use in
a particular situation, we must consider the properties of the various algorithms. Many criteria
have been suggested for comparing CPU-scheduling algorithms. Which characteristics are used
for comparison can make a substantial difference in which algorithm is judged to be best. The
criteria include the following:
➢ CPU utilization: We want to keep the CPU as busy as possible. Conceptually, CPU
utilization can range from 0 to 100%. In a real system, it should range from 40% (for a
lightly loaded system) to 90% (for a heavily loaded system).
➢ Throughput: If the CPU is busy executing processes, then work is being done. One
measure of work is the number of processes that are completed per time unit, called
throughput. For long processes, this rate may be one process per hour; for short
transactions, it may be ten processes per second.
➢ Turnaround time: From the point of view of a particular process, the important criterion
is how long it takes to execute that process. The interval from the time of submission of
a process to the time of completion is the turnaround time. Turnaround time is the sum
of the periods spent waiting to get into memory, waiting in the ready queue, executing
on the CPU, and doing I/O.
➢ Waiting time: The CPU-scheduling algorithm does not affect the amount of time during
which a process executes or does I/O. It affects only the amount of time that a process
spends waiting in the ready queue. Waiting time is the sum of the periods spent waiting
in the ready queue.
➢ Response time: In an interactive system, turnaround time may not be the best criterion.
Often, a process can produce some output fairly early and can continue computing new
results while the previous results are output to the user. Thus, another measure is the
time from the submission of a request until the first response is produced. This measure,
called response time, is the time it takes to start responding, not the time it takes to
output the response. The turnaround time is generally limited by the speed of the output
device.
It is desirable to maximize CPU utilization and throughput and to minimize turnaround time,
waiting time, and response time. In most cases, we optimize the average measure. However,
under some circumstances, we prefer to optimize the minimum or maximum values rather than
the average. For example, to guarantee that all users get good service, we may want to minimize
the maximum response time.
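As a concrete illustration, the criteria above can be computed for a tiny hypothetical workload. In the sketch below, the burst lengths are invented for illustration, all processes are assumed to arrive at time 0, and they run to completion in first-come, first-served order:

```python
# Sketch: turnaround and waiting time under FCFS for processes
# that all arrive at time 0 (an illustrative simplification).

def fcfs_metrics(bursts):
    """Given CPU burst lengths, return (turnaround_times, waiting_times)."""
    turnaround, waiting = [], []
    clock = 0
    for burst in bursts:
        waiting.append(clock)      # time spent waiting in the ready queue
        clock += burst             # process runs to completion
        turnaround.append(clock)   # submission (t = 0) to completion
    return turnaround, waiting

# Three processes with bursts of 24, 3, and 3 time units:
t, w = fcfs_metrics([24, 3, 3])
print(t)  # [24, 27, 30]
print(w)  # [0, 24, 27]  -> average waiting time 17
```

Note how the long first process inflates the average waiting time, which is exactly the behaviour the criteria above are meant to expose when comparing algorithms.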
Scheduling Algorithms
CPU scheduling deals with the problem of deciding which of the processes in the ready queue is
to receive CPU allocation. There are many different CPU-scheduling algorithms. Below are some
scheduling algorithms.
1. First-Come, First-Served Scheduling
2. Shortest-Job-First Scheduling
3. Priority Scheduling
4. Round-Robin Scheduling
5. Multilevel Queue Scheduling
6. Multilevel Feedback Queue Scheduling
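To give a flavour of how one of these algorithms behaves, here is a minimal round-robin simulation. The burst lengths and time quantum are illustrative, and all processes are assumed to arrive at time 0:

```python
from collections import deque

def round_robin_completion(bursts, quantum):
    """Simulate round-robin scheduling and return each process's
    completion time. All processes are assumed to arrive at time 0."""
    queue = deque((i, b) for i, b in enumerate(bursts))
    done = [0] * len(bursts)
    clock = 0
    while queue:
        i, remaining = queue.popleft()
        run = min(quantum, remaining)
        clock += run
        if remaining > run:
            queue.append((i, remaining - run))  # back of the ready queue
        else:
            done[i] = clock
    return done

print(round_robin_completion([24, 3, 3], quantum=4))  # [30, 7, 10]
```

The short processes finish early because no process may hold the CPU for more than one quantum at a time; the long process is repeatedly preempted and requeued.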
MEMORY MANAGEMENT
Through CPU scheduling, we can improve both the utilization of the CPU and the
speed of the computer's response to its users. To realize this increase in performance, however,
we must keep several processes in memory—that is, we must share memory.
Memory is central to the operation of a modern computer system. Memory consists of a large
array of bytes, each with its own address. The CPU fetches instructions from memory according to the
value of the program counter. These instructions may cause additional loading from and storing
to specific memory addresses.
Main memory and the registers built into the processor itself are the only general-purpose
storage that the CPU can access directly. There are machine instructions that take memory
addresses as arguments, but none that take disk addresses. Therefore, any instructions in
execution, and any data being used by the instructions, must be in one of these direct-access
storage devices. If the data are not in memory, they must be moved there before the CPU can
operate on them.
Registers that are built into the CPU are generally accessible within one cycle of the CPU clock.
Most CPUs can decode instructions and perform simple operations on register contents at the
rate of one or more operations per clock tick. The same cannot be said of the main memory,
which is accessed via a transaction on the memory bus. Completing memory access may take
many cycles of the CPU clock. In such cases, the processor normally needs to stall, since it does
not have the data required to complete the instruction that it is executing. This situation is
intolerable because of the frequency of memory access. The remedy is to add fast memory
between the CPU and main memory, typically placed on the CPU chip for fast access; this fast memory is called a cache.
Not only are we concerned with the relative speed of accessing physical memory, but we also
must ensure correct operation. For proper system operation, we must protect the operating
system from access by user processes. On multiuser systems, we must additionally protect user
processes from one another. This protection must be provided by the hardware because the
operating system doesn’t usually intervene between the CPU and its memory accesses (because
of the resulting performance penalty).
We first need to make sure that each process has a separate memory space. Separate per-process
memory space protects the processes from each other and is fundamental to having multiple
processes loaded in memory for concurrent execution. To separate memory spaces, we need the
ability to determine the range of legal addresses that the process may access and to ensure that
the process can access only these legal addresses. We can provide this protection by using two
registers, usually a base and a limit, as illustrated in the diagram below.
The base register holds the smallest legal physical memory address; the limit register specifies
the size of the range.
For example, if the base register holds 300040 and the limit register is 120900, then the program
can legally access all addresses from 300040 through 420939 (inclusive). Protection of memory
space is accomplished by having the CPU hardware compare every address generated in user
mode with the registers. Any attempt by a program executing in user mode to access operating-
system memory or other users’ memory results in a trap to the operating system, which treats
the attempt as a fatal error. This scheme prevents a user program from (accidentally or
deliberately) modifying the code or data structures of either the operating system or other users.
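The hardware check just described can be sketched as a simple predicate. The register values mirror the example in the text; in real hardware this comparison happens on every user-mode memory access, and a failing access traps to the operating system:

```python
# Sketch of the base/limit protection check. The base register holds the
# smallest legal address; the limit register holds the size of the range.

def legal_access(address, base, limit):
    """Return True if a user-mode address is within [base, base + limit)."""
    return base <= address < base + limit

BASE, LIMIT = 300040, 120900
print(legal_access(300040, BASE, LIMIT))  # True  (first legal address)
print(legal_access(420939, BASE, LIMIT))  # True  (last legal address)
print(legal_access(420940, BASE, LIMIT))  # False (would trap to the OS)
```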
The base and limit registers can be loaded only by the operating system, which uses a special
privileged instruction. Since privileged instructions can be executed only in kernel mode, and
since only the operating system executes in kernel mode, only the operating system can load the
base and limit registers.
The operating system, executing in kernel mode, is given unrestricted access to both operating-
system memory and users’ memory. This provision allows the operating system to load users’
programs into users’ memory, dump out those programs in case of errors, access and modify
parameters of system calls, perform I/O to and from user memory, and provide many other
services.
Address Binding
Usually, a program resides on a disk as a binary executable file. To be executed, the program
must be brought into memory and placed within a process. Depending on the memory
management in use, the process may be moved between disk and memory during its execution.
The processes on the disk that are waiting to be brought into memory for execution form the
input queue. The normal single-tasking procedure is to select one of the processes in the input
queue and load that process into memory. As the process is executed, it accesses instructions
and data from memory. Eventually, the process terminates, and its memory space is declared
available.
In most cases, a user program goes through several steps—some of which may be optional—
before being executed (Figure 8.3). Addresses may be represented in different ways during these
steps. Addresses in the source program are generally symbolic (such as the variable count). A
compiler typically binds these symbolic addresses to relocatable addresses (such as “14 bytes
from the beginning of this module”). The linkage editor or loader in turn binds the relocatable
addresses to absolute addresses (such as 74014). Each binding is a mapping from one address
space to another. Classically, the binding of instructions and data to memory addresses can be
done at any step along the way:
❖ Compile time: If you know at compile time where the process will reside in memory, then
absolute code can be generated. For example, if you know that a user process will reside
starting at location R, then the generated compiler code will start at that location and
extend up from there. If the starting location changes later, it will then be necessary to
recompile this code.
❖ Load time: If it is not known at compile time where the process will reside in memory,
then the compiler must generate relocatable code. In this case, the final binding is delayed
until load time. If the starting address changes, we need only reload the user code to
incorporate this changed value.
❖ Execution time: If the process can be moved during its execution from one memory
segment to another, then binding must be delayed until run time. Special hardware must
be available for this scheme to work. Most general-purpose operating systems use this
method.
The set of all logical addresses generated by a program is the logical address space. The set of all
physical addresses corresponding to these logical addresses is the physical address space. The run-
time mapping from virtual to physical addresses is done by a hardware device called the memory-
management unit (MMU).
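A minimal sketch of the MMU's run-time mapping, using a single relocation (base) register together with a limit check. The register values are illustrative, and the hardware trap is modeled here as a Python exception:

```python
# Sketch of dynamic relocation: every logical address is checked against
# the limit register, then added to the relocation register.

def mmu_translate(logical_address, relocation_register, limit):
    """Map a logical address to a physical one, trapping on overflow."""
    if logical_address >= limit:
        raise MemoryError("trap: address beyond limit register")
    return logical_address + relocation_register

# A logical address of 346 with a relocation register of 14000:
print(mmu_translate(346, 14000, 16000))  # 14346
```

The user program deals only with logical addresses (0 to the limit); it never sees the physical address 14346, which is why execution-time binding allows the process to be moved in memory.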
Swapping
A process must be in memory to be executed. A process, however, can be swapped temporarily
out of memory to a backing store (a dedicated portion of disk) and then brought back into
memory for continued execution.
Swapping makes it possible for the total physical address space of all processes to exceed the
real physical memory of the system, thus increasing the degree of multiprogramming in a system.
The idea of the backing store gives rise to the concept of virtual memory where some portion of
a disk is reserved to support the activities of the main memory thereby virtually giving the
memory an increase in size.
Standard swapping involves moving processes between the main memory and a backing store.
The backing store is commonly a fast disk. It must be large enough to accommodate copies of all
memory images for all users, and it must provide direct access to these memory images. The
system maintains a ready queue consisting of all processes whose memory images are on the
backing store or in memory and are ready to run. Whenever the CPU scheduler decides to
execute a process, it calls the dispatcher. The dispatcher checks to see whether the next process
in the queue is in memory. If it is not, and if there is no free memory region, the dispatcher swaps
out a process currently in memory and swaps in the desired process. It then reloads registers and
transfers control to the selected process. The context-switch time in such a swapping system is
fairly high.
The major part of the swap time is transfer time. The total transfer time is directly proportional
to the amount of memory swapped. If we have a computer system with 4 GB of main memory
and a resident operating system taking 1 GB, the maximum size of the user process is 3 GB.
However, many user processes may be much smaller than this—say, 100 MB. A 100-MB process
could be swapped out in 2 seconds, compared with the 60 seconds required for swapping 3 GB.
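The arithmetic in this paragraph can be reproduced directly. The sketch below assumes a transfer rate of about 50 MB per second, which is what the figures in the text imply:

```python
# Back-of-the-envelope swap-time calculation: transfer time is directly
# proportional to the amount of memory swapped. The 50 MB/s rate is an
# assumption inferred from the 100 MB / 2 s figure above.

RATE_MB_PER_S = 50

def swap_time_seconds(size_mb):
    return size_mb / RATE_MB_PER_S

print(swap_time_seconds(100))       # 2.0 seconds for a 100-MB process
print(swap_time_seconds(3 * 1024))  # 61.44 (the text rounds this to ~60 s)
```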
It would be useful to know exactly how much memory a user process is using, not simply how
much it might be using. Then we would need to swap only what is used, reducing swap time. For
this method to be effective, the user must keep the system informed of any changes in memory
requirements. Thus, a process with dynamic memory requirements will need to issue system calls
(request_memory() and release_memory()) to inform the operating system of its changing
memory needs.
Swapping is constrained by other factors as well. If we want to swap a process, we must be sure
that it is completely idle. Of particular concern is any pending I/O. A process may be waiting for
an I/O operation when we want to swap that process to free up memory. However, if the I/O is
asynchronously accessing the user memory for I/O buffers, then the process cannot be swapped.
Assume that the I/O operation is queued because the device is busy. If we were to swap out
process P1 and swap in process P2, the I/O operation might then attempt to use memory that
now belongs to process P2. There are two main solutions to this problem: never swap a process
with pending I/O, or execute I/O operations only into operating-system buffers. Transfers
between operating-system buffers and process memory then occur only when the process is
swapped in. Note that this double buffering itself adds overhead. We now need to copy the data
again, from kernel memory to user memory, before the user process can access it.
Standard swapping is not used in modern operating systems. It requires too much swapping time
and provides too little execution time to be a reasonable memory-management solution.
Modified versions of swapping, however, are found on many systems, including UNIX, Linux, and
Windows. In one common variation, swapping is normally disabled but will start if the amount of
free memory (unused memory available for the operating system or processes to use) falls below
a threshold amount. Swapping is halted when the amount of free memory increases. Another
variation involves swapping portions of processes—rather than entire processes—to decrease
swap time. Typically, these modified forms of swapping work in conjunction with virtual memory.
The main memory must accommodate both the operating system and the various user processes.
We, therefore, need to allocate the main memory in the most efficient way possible. This section
explains one early method, contiguous memory allocation.
We usually want several user processes to reside in memory at the same time. We, therefore,
need to consider how to allocate available memory to the processes that are in the input queue
waiting to be brought into memory. In contiguous memory allocation, each process is contained
in a single section of memory that is contiguous to the section containing the next process.
Before discussing memory allocation further, we must discuss the issue of memory protection.
We can prevent a process from accessing memory it does not own by combining two ideas
previously discussed, that is, by applying relocation (base) and limit registers. When the CPU scheduler selects a
process for execution, the dispatcher loads the relocation and limit registers with the correct
values as part of the context switch. Because every address generated by a CPU is checked against
these registers, we can protect both the operating system and the other users’ programs and
data from being modified by this running process.
One of the simplest methods for memory allocation is to divide memory into several fixed-sized
partitions. Each partition may contain exactly one process. Thus, the degree of
multiprogramming is bound by the number of partitions. In this multiple partition method, when
a partition is free, a process is selected from the input queue and is loaded into the free partition.
When the process terminates, the partition becomes available for another process.
The method described next, a generalization of the fixed-partition scheme, is called MVT; it is
used primarily in batch environments. Many of the ideas presented here are also applicable to a
time-sharing environment in which pure segmentation is used for memory management.
In the variable-partition scheme, the operating system keeps a table indicating which parts of
memory are available and which are occupied. Initially, all memory is available for user processes
and is considered one large block of available memory, a hole. Eventually, as you will see, the
memory contains a set of holes of various sizes.
As processes enter the system, they are put into an input queue. The operating system takes into
account the memory requirements of each process and the amount of available memory space
in determining which processes are allocated memory. When a process is allocated space, it is
loaded into memory, and it can then compete for CPU time. When a process terminates, it
releases its memory, which the operating system may then fill with another process from the
input queue.
At any given time, then, we have a list of available block sizes and an input queue. The operating
system can order the input queue according to a scheduling algorithm. Memory is allocated to
processes until, finally, the memory requirements of the next process cannot be satisfied—that
is, no available block of memory (or hole) is large enough to hold that process. The operating
system can then wait until a large enough block is available, or it can skip down the input queue
to see whether the smaller memory requirements of some other process can be met.
In general, as mentioned, the memory blocks available comprise a set of holes of various sizes
scattered throughout memory. When a process arrives and needs memory, the system searches
the set for a hole that is large enough for this process. If the hole is too large, it is split into two
parts. One part is allocated to the arriving process; the other is returned to the set of holes. When
a process terminates, it releases its block of memory, which is then placed back in the set of
holes. If the new hole is adjacent to other holes, these adjacent holes are merged to form one
larger hole. At this point, the system may need to check whether processes are waiting for
memory and whether this newly freed and recombined memory could satisfy the demands of
any of these waiting processes.
There are many ways to allocate memory in a system that allocates space dynamically, where
allocation is based on the size needed by a process rather than on fixed partitions. The first-fit,
best-fit, and worst-fit strategies are the ones most commonly used to select a free hole from the
set of available holes.
❖ First fit. Allocate the first hole that is big enough. Searching can start either at the
beginning of the set of holes or at the location where the previous first-fit search ended.
We can stop searching as soon as we find a free hole that is large enough.
❖ Best fit. Allocate the smallest hole that is big enough. We must search the entire list unless
the list is ordered by size. This strategy produces the smallest leftover hole.
❖ Worst fit. Allocate the largest hole. Again, we must search the entire list, unless it is sorted
by size. This strategy produces the largest leftover hole, which may be more useful than
the smaller leftover hole from a best-fit approach.
Experiments have shown that both first fit and best fit are better than worst fit in terms
of decreasing time and storage utilization. Neither first fit nor best fit is better than the other in
terms of storage utilization, but first fit is generally faster.
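The three strategies can be sketched over a list of free hole sizes. The hole sizes and the request below are invented for illustration; each function returns the index of the chosen hole, or None when no hole is large enough:

```python
# Sketch of the first-fit, best-fit, and worst-fit placement strategies.

def first_fit(holes, request):
    for i, size in enumerate(holes):
        if size >= request:
            return i          # stop at the first hole that is big enough
    return None

def best_fit(holes, request):
    candidates = [(size, i) for i, size in enumerate(holes) if size >= request]
    return min(candidates)[1] if candidates else None   # smallest that fits

def worst_fit(holes, request):
    candidates = [(size, i) for i, size in enumerate(holes) if size >= request]
    return max(candidates)[1] if candidates else None   # largest available

holes = [100, 500, 200, 300, 600]
print(first_fit(holes, 212))  # 1 (the 500-unit hole)
print(best_fit(holes, 212))   # 3 (the 300-unit hole, smallest that fits)
print(worst_fit(holes, 212))  # 4 (the 600-unit hole, largest available)
```

Note that best fit and worst fit must examine every hole unless the list is kept sorted by size, which is why first fit is generally faster in practice.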
Segmentation
Segmentation provides a mechanism for dealing with memory in terms of the programmer's view
rather than its physical properties. This addresses the inconvenience caused by the fact that the
operating system's and the programmer's views of memory differ. Segmentation is a hardware-supported
solution that maps the programmer's view onto the actual physical memory, allowing the system
more freedom to manage memory while giving the programmer a more natural
programming environment.
In segmentation, the logical address space is broken down into a collection of segments. Each
segment has a name and a length. An address therefore specifies both a segment name and an offset
within that segment (the offset gives the location of a piece of data relative to the beginning of
its segment).
For simplicity of implementation, segments are numbered and are referred to by a segment
number, rather than by a segment name. Thus, a logical address consists of a two-tuple:
(segment-number, offset)
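Translation of such a two-tuple can be sketched with a small segment table of (base, limit) pairs. The table contents below are illustrative; the hardware trap on an out-of-range offset is modeled as an exception:

```python
# Sketch of segmented address translation: each entry maps a segment
# number to a (base, limit) pair in physical memory.

segment_table = {0: (1400, 1000), 1: (6300, 400), 2: (4300, 400)}

def translate(segment, offset):
    base, limit = segment_table[segment]
    if offset >= limit:
        raise MemoryError("trap: offset beyond segment limit")
    return base + offset

print(translate(2, 53))   # 4353 (segment 2 starts at 4300)
print(translate(0, 222))  # 1622 (segment 0 starts at 1400)
```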
Paging
Segmentation permits the physical address space of a process to be noncontiguous. Paging is
another memory-management scheme that offers this advantage. However, paging avoids
external fragmentation and the need for compaction, whereas segmentation does not. External
fragmentation occurs when total unused memory space is enough to answer all the allocation
requests. Here the memory is non-contiguous. Therefore, the memory has empty blocks
scattered all over, which are insufficient to be allocated to other programs.
The basic method for implementing paging involves breaking physical memory into fixed-sized
blocks called frames and breaking logical memory into blocks of the same size called pages. When
a process is to be executed, its pages are loaded into any available memory frames from their
source (a file system or the backing store). The backing store is divided into fixed-sized blocks
that are the same size as the memory frames or clusters of multiple frames. This rather simple
idea has great functionality and wide ramifications. For example, the logical address space is now
totally separate from the physical address space, so a process can have a logical 64-bit address
space even though the system has less than 2^64 bytes of physical memory.
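Paged translation can be sketched in a few lines. The page size of 4 bytes and the page table below are toy values chosen to keep the arithmetic visible; real systems use page sizes such as 4 KB:

```python
# Sketch of paged address translation: a logical address splits into a
# page number (address // page_size) and an offset (address % page_size);
# the page table maps page numbers to frame numbers.

PAGE_SIZE = 4
page_table = {0: 5, 1: 6, 2: 1, 3: 2}  # page -> frame (illustrative)

def translate(logical_address):
    page, offset = divmod(logical_address, PAGE_SIZE)
    frame = page_table[page]
    return frame * PAGE_SIZE + offset

# Logical address 3 is page 0, offset 3 -> frame 5 -> physical 23:
print(translate(3))  # 23
print(translate(4))  # 24 (page 1, offset 0 -> frame 6)
```

Because consecutive pages may map to non-consecutive frames, the process's physical address space is noncontiguous even though its logical address space appears contiguous.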
FILE MANAGEMENT
The file system is the most visible aspect of an operating system. It provides the mechanism for
online storage of and access to both data and programs of the operating system and all the users
of the computer system. The file system consists of two distinct parts: a collection of files, each
storing related data, and a directory structure, which organizes and provides information about
all the files in the system.
Files may be free form, such as text files, or may be formatted rigidly. In general, a file is
a sequence of bits, bytes, lines, or records, the meaning of which is defined by the file’s creator
and user. The concept of a file is thus extremely general.
The information in a file is defined by its creator. Many different types of information may be
stored in a file—source or executable programs, numeric or text data, photos, music, video, and
so on. A file has a certain defined structure, which depends on its type. A text file is a sequence
of characters organized into lines (and possibly pages). A source file is a sequence of functions,
each of which is further organized as declarations followed by executable statements. An
executable file is a series of code sections that the loader can bring into memory and execute.
File Attributes
A file is named, for the convenience of its human users, and is referred to by its name. A name is
usually a string of characters, such as example.c. Some systems differentiate between uppercase
and lowercase characters in names, whereas other systems do not. When a file is named, it
becomes independent of the process, the user, and even the system that created it. A file’s
attributes vary from one operating system to another but typically consist of these:
❖ Name: The symbolic file name is the only information kept in a human-readable form.
❖ Identifier: This unique tag, usually a number, identifies the file within the file system; it is
the non-human-readable name for the file.
❖ Type: This information is needed for systems that support different types of files.
❖ Location: This information is a pointer to a device and to the location of the file on that
device.
❖ Size: The current size of the file (in bytes, words, or blocks) and possibly the maximum
allowed size are included in this attribute.
❖ Protection: Access-control information determines who can do reading, writing,
executing, and so on.
❖ Time, date, and user identification: This information may be kept for creation, last
modification, and last use. These data can be useful for protection, security, and usage
monitoring.
The information about all files is kept in the directory structure, which also resides in secondary
storage. Typically, a directory entry consists of the file’s name and its unique identifier. The
identifier in turn locates the other file attributes.
File Operations
A file is an abstract data type. To define a file properly, we need to consider the operations that
can be performed on files. The operating system can provide system calls to create, write, read,
reposition, delete, and truncate files.
❖ Creating a file: Two steps are necessary to create a file. First, space in the file system must
be found for the file. Second, an entry for the new file must be made in the
directory.
❖ Writing a file: To write a file, we make a system call specifying both the name of the file
and the information to be written to the file. Given the name of the file, the system
searches the directory to find the file’s location. The system must keep a write pointer to
the location in the file where the next write is to take place. The write pointer must be
updated whenever a write occurs.
❖ Reading a file: To read from a file, we use a system call that specifies the name of the file
and where (in memory) the next block of the file should be put. Again, the directory is
searched for the associated entry, and the system needs to keep a read pointer to the
location in the file where the next read is to take place. Once a read operation has taken
place, the read pointer is updated. Because a process is usually either reading from or
writing to a file, the current operation location can be kept as a per-process current file-position pointer. Both the read and write operations use this same pointer, saving space
and reducing system complexity.
❖ Repositioning within a file: The directory is searched for the appropriate entry, and the
current-file-position pointer is repositioned to a given value. Repositioning within a file
need not involve any actual I/O. This file operation is also known as a file seek.
❖ Deleting a file: To delete a file, we search the directory for the named file. Having found
the associated directory entry, we release all file space, so that it can be reused by other
files, and erase the directory entry.
❖ Truncating a file: The user may want to erase the contents of a file but keep its attributes.
Rather than forcing the user to delete the file and then recreate it, this function allows all
attributes to remain unchanged—except for file length—but lets the file be reset to length
zero and its file space released.
Most of the file operations mentioned involve searching the directory for the entry associated
with the named file. To avoid this constant searching, many systems require that an open()
system call be made before a file is first used. The operating system keeps a table, called the
open-file table, containing information about all open files. When a file operation is requested,
the file is specified via an index in this table, so no searching is required. When the file is no longer
being actively used, it is closed by the process, and the operating system removes its entry from
the open-file table. create() and delete() are system calls that work with closed rather than open
files.
The open() operation takes a file name and searches the directory, copying the directory entry
into the open-file table. The open() call can also accept access mode information—create, read-
only, read-write, append-only, and so on.
Typically, the open-file table also has an open count associated with each file to indicate how
many processes have the file open. Each close() decreases this open count, and when the open
count reaches zero, the file is no longer in use, and the file’s entry is removed from the open-file
table. In summary, several pieces of information are associated with an open file.
➢ File pointer: On systems that do not include a file offset as part of the read() and write()
system calls, the system must track the last read-write location as a current file-position
pointer. This pointer is unique to each process operating on the file and therefore must
be kept separate from the on-disk file attributes.
➢ File-open count: As files are closed, the operating system must reuse its open-file table
entries, or it could run out of space in the table. Multiple processes may have opened a
file, and the system must wait for the last process to close the file before removing the
open-file table entry. The file-open count tracks the number of opens and closes and
reaches zero on the last close. The system can then remove the entry.
➢ Disk location of the file: Most file operations require the system to modify data within the
file. The information needed to locate the file on disk is kept in memory so that the system
does not have to read it from the disk for each operation.
➢ Access rights: Each process opens a file in an access mode. This information is stored in
the per-process table so the operating system can allow or deny subsequent I/O requests.
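The open-count bookkeeping described above can be sketched as follows. The table layout and file name are invented for illustration; a real open-file table also records the disk location, access rights, and file pointers:

```python
# Minimal sketch of a system-wide open-file table with per-file open
# counts: the entry is removed only on the last close.

open_file_table = {}  # file name -> {"count": int}

def do_open(name):
    entry = open_file_table.setdefault(name, {"count": 0})
    entry["count"] += 1
    return name  # real systems return a table index/handle, not the name

def do_close(name):
    entry = open_file_table[name]
    entry["count"] -= 1
    if entry["count"] == 0:        # last close removes the entry
        del open_file_table[name]

do_open("report.txt")
do_open("report.txt")              # a second process opens the same file
do_close("report.txt")
print("report.txt" in open_file_table)  # True  (one open remains)
do_close("report.txt")
print("report.txt" in open_file_table)  # False (entry removed)
```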
Some operating systems provide facilities for locking an open file (or sections of a file). File locks
allow one process to lock a file and prevent other processes from gaining access to it. File locks
are useful for files that are shared by several processes. Below are some examples of file types.
Access Methods
Files store information. When this information is used, it must be accessed and read into
computer memory. The information in the file can be accessed in several ways. Some systems
provide only one access method for files, while others support many access methods. Choosing
the right one for a particular application is a major design problem.
Sequential Access: The simplest access method is sequential access. Information in the file is
processed in order, one record after the other. This mode of access is by far the most common.
Reads and writes make up the bulk of the operations on a file. A read operation—read_next()—
reads the next portion of the file and automatically advances a file pointer, which tracks the I/O
location. Similarly, the write operation—write_next()—appends to the end of the file and
advances to the end of the newly written material (the new end of file).
Direct Access: Another method is direct access (or relative access). Here, a file is made up of
fixed-length logical records that allow programs to read and write records rapidly in no particular
order. The direct-access method is based on a disk model of a file since disks allow random access
to any file block. For direct access, the file is viewed as a numbered sequence of blocks or records.
Thus, we may read block 14, then read block 53, and then write block 7. There are no restrictions
on the order of reading or writing for a direct-access file. Direct-access files are of great use for
immediate access to large amounts of information. Databases are often of this type. When a
query concerning a particular subject arrives, we compute which block contains the answer and
then read that block directly to provide the desired information.
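The two access methods can be contrasted on a file modeled as a list of fixed-length records; the block numbers echo the example above:

```python
# Sketch: sequential vs. direct access on a file of fixed-length records.

blocks = [f"block-{i}" for i in range(60)]

# Sequential access: an implicit file pointer advances on every read.
pointer = 0
def read_next():
    global pointer
    record = blocks[pointer]
    pointer += 1
    return record

# Direct access: any block can be read by number, in any order.
def read_block(n):
    return blocks[n]

print(read_next())     # block-0
print(read_next())     # block-1
print(read_block(53))  # block-53 (no need to read blocks 2..52 first)
```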
Other access methods can be built on top of a direct-access method. These methods generally
involve the construction of an index for the file. The index, like an index in the back of a book,
contains pointers to the various blocks.
File-System Implementation
Numerous on-disk and in-memory configurations and structures are used for implementing file
systems. These structures differ based on the operating system and the file system but some
general principles are applied. These general principles include:
• A boot control block usually contains the information required by the system for booting
an operating system from that volume. When the disks do not contain any operating
system, this block can be treated as empty. This is typically the first chunk of a volume. In
UFS, this is termed as the boot block; in NTFS, it is the partition boot sector.
• A volume control block holds volume (or partition) details, such as the number of blocks in the partition, the block size, a free-block count, and free-block pointers. In UFS, this is called the superblock; in NTFS, it is stored in the master file table.
• A directory structure per file system is used to organize the files. In UFS, this includes file names and associated inode numbers. In NTFS, it is stored in the master file table.
• A per-file file-control block (FCB) contains many details about the file. It has a unique identifier number to allow association with a directory entry. In NTFS, this information is stored in the master file table, which uses a relational database structure, with a row per file.
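The four structures above can be sketched as simple records; every field name here is illustrative, chosen to mirror the descriptions in the list rather than any real file system's on-disk format.

```python
# Minimal sketches of the on-disk file-system structures described above.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class BootControlBlock:           # UFS: boot block; NTFS: partition boot sector
    boot_code: bytes = b''        # empty if the volume holds no operating system

@dataclass
class VolumeControlBlock:         # UFS: superblock
    block_count: int              # number of blocks in the partition
    block_size: int               # size of each block
    free_block_count: int
    free_block_list: List[int] = field(default_factory=list)

@dataclass
class FCB:                        # per-file control block
    fcb_id: int                   # unique id linking it to a directory entry
    size: int = 0
    block_pointers: List[int] = field(default_factory=list)

@dataclass
class Directory:                  # per-file-system directory structure
    entries: Dict[str, int] = field(default_factory=dict)  # name -> FCB id
```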
The in-memory information is used for both file-system management and performance
improvement via caching. The data are loaded at mount time, updated during file-system
operations, and discarded at dismount. Several types of structures may be included.
To create a new file, an application program calls the logical file system. The logical file system
knows the format of the directory structures. To create a new file, it allocates a new FCB.
(Alternatively, if the file-system implementation creates all FCBs at file-system creation time, an
FCB is allocated from the set of free FCBs.) The system then reads the appropriate directory into
memory, updates it with the new file name and FCB, and writes it back to the disk.
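The creation sequence just described can be sketched as follows, assuming the variant in which FCBs are preallocated and drawn from a free list; the function and variable names are invented for the example.

```python
# File creation: allocate an FCB from the free set, read the directory into
# memory, add the (name, FCB) entry, and write the directory back to disk.
def create_file(name, free_fcbs, on_disk_directory):
    fcb_id = free_fcbs.pop()             # allocate an FCB from the free set
    directory = dict(on_disk_directory)  # "read the directory into memory"
    if name in directory:
        free_fcbs.append(fcb_id)         # roll back the allocation
        raise FileExistsError(name)
    directory[name] = fcb_id             # update with the new name and FCB
    on_disk_directory.clear()            # "write it back to the disk"
    on_disk_directory.update(directory)
    return fcb_id

free = [3, 2, 1]        # preallocated free FCBs
disk_dir = {}           # the on-disk directory structure
create_file('notes.txt', free, disk_dir)
```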
PROTECTION
Protection refers to a mechanism for controlling the access of programs, processes, or users to the
resources defined by a computer system. This mechanism provides a means for specifying the controls to
be imposed, together with a means of enforcement.
As computer systems have become more sophisticated and pervasive in their applications, the need to
protect their integrity has also grown. Protection was originally conceived as an adjunct to
multiprogramming operating systems, whereby untrustworthy users might safely share a common logical namespace, such as a directory of files, or a common physical namespace, such as memory. Modern
protection concepts have evolved to increase the reliability of any complex system that makes use of
shared resources. There are several reasons for the need for protection in computer systems. The most
obvious is the need to prevent the mischievous and intentional violation of an access restriction by a user.
Protection can improve reliability by detecting latent errors at the interfaces between component
subsystems. Early detection of interface errors can often prevent contamination of a healthy subsystem
by a malfunctioning subsystem. An unprotected resource cannot defend against use (or misuse) by an
unauthorized or incompetent user. A protection-oriented system provides means to distinguish between
authorized and unauthorized usage.
The role of protection in a computer system is to provide a mechanism for the enforcement of the policies
governing resource use. These policies can be established in a variety of ways. Some are fixed in the design
of the system, while others are formulated by the management of a system. Others are defined by the
individual users to protect their own files and programs.
Policies for resource use may vary by application, and they may change over time. For these reasons,
protection is no longer the concern solely of the designer of an operating system. The application
programmer needs to use protection mechanisms as well, to guard resources created and supported by
an application subsystem against misuse. Note that mechanisms are distinct from policies. Mechanisms
determine how something will be done; policies decide what will be done.
Principles of Protection
A key time-tested guiding principle for protection is the principle of least privilege. It dictates that
programs, users, and even systems be given just enough privileges to perform their tasks. An operating
system following the principle of least privilege implements its features, programs, system calls, and data
structures so that failure or compromise of a component does the minimum damage and allows the
minimum damage to be done.
Also beneficial is the creation of audit trails for all privileged function access. The audit trail allows the
programmer, system administrator, or law-enforcement officer to trace all protection and security
activities on the system.
Managing users with the principle of least privilege entails creating a separate account for each user, with
just the privileges that the user needs. Some systems implement role-based access control (RBAC) to
provide this functionality.
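A toy role-based access check in the spirit of least privilege can be sketched as follows; the roles and privilege names are invented for illustration and do not correspond to any particular system's RBAC implementation.

```python
# Role-based access control: each role grants only the privileges its tasks
# require, and any operation not explicitly granted is denied by default.
ROLE_PRIVILEGES = {
    'reader': {'read'},
    'editor': {'read', 'write'},
    'admin':  {'read', 'write', 'delete'},
}

def is_allowed(role, operation):
    """Permit an operation only if the role explicitly grants it."""
    return operation in ROLE_PRIVILEGES.get(role, set())

print(is_allowed('reader', 'write'))  # a reader holds no write privilege
```

Denying by default (an unknown role or operation yields no privileges) is the least-privilege stance: nothing is permitted unless explicitly granted.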