OS Module-5-1
OPERATING SYSTEMS
5.1.2 Solid-State Disks
• An SSD is non-volatile memory that is used like a hard-drive.
• Variations of this technology range from DRAM with a battery (to maintain its state in a
power-failure) through to flash-memory technologies.
• Advantages compared to Hard-disks:
1) More reliable: SSDs have no moving parts.
2) Faster: SSDs have no seek-time or rotational latency.
3) Less power consumption.
• Disadvantages:
1) More expensive per megabyte.
2) Less capacity, and possibly shorter life spans than hard-disks, so their uses are somewhat limited.
• Applications:
1) One use for SSDs is in storage-arrays, where they hold file-system metadata that require
high performance.
2) SSDs are also used in laptops to make them smaller, faster, and more energy-efficient.
5.3 Disk Attachment
• Computers access disk storage in two ways.
1) via I/O ports (or host-attached storage); this is common on small systems.
2) via a remote host in a distributed file system; this is referred to as network-attached
storage.
5.3.3 Storage-Area Network
• A storage-area network (SAN) is a private network connecting servers and storage units (Figure 5.3).
• The power of a SAN lies in its flexibility.
1) Multiple hosts and multiple storage-arrays can attach to the same SAN.
2) Storage can be dynamically allocated to hosts.
3) A SAN switch allows or prohibits access between the hosts and the storage.
4) SANs make it possible for clusters of servers to share the same storage and for storage
arrays to include multiple direct host connections.
5) SANs typically have more ports than storage-arrays.
• FC is the most common SAN interconnect.
• Another SAN interconnect is InfiniBand — a special-purpose bus architecture that provides hardware
and software support for high-speed interconnection networks for servers and storage units.
5.4 Disk Scheduling
• Access time = Seek-time + Rotational-latency
1) Seek-time: The seek-time is the time for the disk-arm to move the heads to the cylinder
containing the desired sector.
2) Rotational-latency: The Rotational-latency is the additional time for the disk to rotate the
desired sector to the disk-head.
• The disk bandwidth is the total number of bytes transferred, divided by the total time between the
first request for service and the completion of the last transfer.
• We can improve both the access time and the bandwidth by managing the order in which disk I/O
requests are serviced.
• Whenever a process needs I/O to or from the disk, it issues a system call to the operating system.
• The request specifies several pieces of information:
1) Whether this operation is input or output
2) What the disk address for the transfer is
3) What the memory address for the transfer is
4) What the number of sectors to be transferred is
• If the desired disk-drive and controller are available, the request can be serviced immediately.
• If the drive or controller is busy, any new requests for service will be placed in the queue of pending
requests for that drive.
• For a multiprogramming system with many processes, the disk queue may often have several
pending requests.
• Thus, when one request is completed, the operating system chooses which pending request to
service next.
• Any one of several disk-scheduling algorithms can be used.
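• Example (FCFS, first-come first-served): requests are serviced in arrival order. With the queue
98, 183, 37, 122, 14, 124, 65, 67 and the head starting at cylinder 53, the disk-head will first move
from 53 to 98, then to 183, 37, 122, 14, 124, 65, and finally to 67, as shown in Figure 5.4.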
Head movement from 53 to 98 = 45
Head movement from 98 to 183 = 85
Head movement from 183 to 37 = 146
Head movement from 37 to 122 = 85
Head movement from 122 to 14 = 108
Head movement from 14 to 124 = 110
Head movement from 124 to 65 = 59
Head movement from 65 to 67 = 2
Total head movement = 640
• Advantage: This algorithm is simple & fair.
• Disadvantage: Generally, this algorithm does not provide the fastest service.
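The FCFS total above can be checked with a few lines of Python. This is a minimal sketch; the function name `fcfs_head_movement` is only illustrative, and the queue is the request order read off the trace above.

```python
def fcfs_head_movement(start, requests):
    """Total cylinders traversed when requests are served in arrival order."""
    total, position = 0, start
    for cylinder in requests:
        total += abs(cylinder - position)  # distance of this single move
        position = cylinder
    return total


QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]  # arrival order from the trace above
print(fcfs_head_movement(53, QUEUE))  # 640
```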
5.4.2 SSTF Scheduling
• SSTF stands for Shortest Seek-time First.
• SSTF selects the request with the minimum seek-time from the current head-position.
• Since seek-time increases with the number of cylinders traversed by the head, SSTF chooses the
pending request closest to the current head-position.
• For example, with the request queue 98, 183, 37, 122, 14, 124, 65, 67:
• The closest request to the initial head position 53 is at cylinder 65. Once we are at cylinder 65, the
next closest request is at cylinder 67.
• From there, the request at cylinder 37 is closer than 98, so 37 is served next. Continuing, we service
the request at cylinder 14, then 98, 122, 124, and finally 183. It is shown in Figure 5.5.
Head movement from 53 to 65 = 12
Head movement from 65 to 67 = 2
Head movement from 67 to 37 = 30
Head movement from 37 to 14 = 23
Head movement from 14 to 98 = 84
Head movement from 98 to 122 = 24
Head movement from 122 to 124 = 2
Head movement from 124 to 183 = 59
Total head movement = 236
• Advantage: SSTF is a substantial improvement over FCFS, although it is not optimal.
• Disadvantage: Essentially, SSTF is a form of SJF (shortest-job-first) scheduling, and like SJF it may
cause starvation of some requests.
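The SSTF selection rule can be sketched as a greedy loop that always picks the pending request nearest to the current head position. The helper names `sstf_order` and `total_movement` are illustrative, not part of any library.

```python
def sstf_order(start, requests):
    """Service order under SSTF: always pick the closest pending request."""
    pending = list(requests)
    position, order = start, []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


def total_movement(start, order):
    """Sum of the distances between consecutive head positions."""
    return sum(abs(b - a) for a, b in zip([start] + order, order))


QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]
order = sstf_order(53, QUEUE)
print(order)                      # [65, 67, 37, 14, 98, 122, 124, 183]
print(total_movement(53, order))  # 236
```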
5.4.3 SCAN Scheduling
• The SCAN algorithm is sometimes called the elevator algorithm, since the disk-arm behaves just like
an elevator in a building.
• Here is how it works:
1. The disk-arm starts at one end of the disk.
2. Then, the disk-arm moves towards the other end, servicing the request as it reaches each
cylinder.
3. At the other end, the direction of the head movement is reversed and servicing continues.
• The head continuously scans back and forth across the disk.
• For example:
• Before applying the SCAN algorithm, we need to know the current direction of head movement.
• Assuming the disk-arm is moving toward 0, the head will service 37 and then 14.
• At cylinder 0, the arm will reverse and move toward the other end of the disk, servicing the
requests at 65, 67, 98, 122, 124, and 183. It is shown in Figure 5.6.
Head movement from 53 to 37 = 16
Head movement from 37 to 14 = 23
Head movement from 14 to 0 = 14
Head movement from 0 to 65 = 65
Head movement from 65 to 67 = 2
Head movement from 67 to 98 = 31
Head movement from 98 to 122 = 24
Head movement from 122 to 124 = 2
Head movement from 124 to 183 = 59
Total head movement = 236
• Disadvantage: Wait times are uneven. If a request arrives just in front of the head, it will be
serviced almost immediately. On the other hand, if a request arrives just behind the head, it will
have to wait until the arm reaches the other end, reverses direction, and comes back.
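The SCAN trace above can be reproduced by building the list of stops, including the end of the disk where the arm turns around. This is a sketch under assumptions: the cylinder range 0 to 199 and the function names are illustrative choices, and the arm is assumed to travel to the end even when no request lies there.

```python
def scan_path(start, requests, direction="down", first=0, last=199):
    """Stops visited by SCAN, including the end where the arm reverses."""
    below = sorted((c for c in requests if c < start), reverse=True)
    above = sorted(c for c in requests if c >= start)
    if direction == "down":
        return below + [first] + above
    return above + [last] + below


def path_length(start, path):
    """Total head movement along a list of stops."""
    total, position = 0, start
    for stop in path:
        total += abs(stop - position)
        position = stop
    return total


QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]
print(scan_path(53, QUEUE))                   # [37, 14, 0, 65, 67, 98, 122, 124, 183]
print(path_length(53, scan_path(53, QUEUE)))  # 236
```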
5.4.4 C-SCAN Scheduling
• Circular SCAN (C-SCAN) scheduling is a variant of SCAN designed to provide a more uniform wait
time.
• Like SCAN, C-SCAN moves the head from one end of the disk to the other, servicing requests along
the way.
• When the head reaches the other end, however, it immediately returns to the beginning of the disk,
without servicing any requests on the return trip (Figure 5.7).
• The C-SCAN scheduling algorithm essentially treats the cylinders as a circular list that wraps around
from the final cylinder to the first one.
• Before applying the C-SCAN algorithm, we need to know the current direction of head movement.
• Assuming the disk-arm is moving toward 199, the head will service 65, 67, 98, 122, 124, and 183.
• Then it will move to 199, and the arm will reverse and move toward 0.
• While moving toward 0, it will not service any requests. After reaching 0, it will reverse again
and then serve 14 and 37. It is shown in Figure 5.7.
Head movement from 53 to 65 = 12
Head movement from 65 to 67 = 2
Head movement from 67 to 98 = 31
Head movement from 98 to 122 = 24
Head movement from 122 to 124 = 2
Head movement from 124 to 183 = 59
Head movement from 183 to 199 = 16
Head movement from 199 to 0 = 199
Head movement from 0 to 14 = 14
Head movement from 14 to 37 = 23
Total head movement = 382
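The C-SCAN trace can be sketched the same way. The function names and the 0 to 199 cylinder range are assumptions for illustration, and the return sweep from 199 to 0 is counted as head movement, matching the table above.

```python
def cscan_path(start, requests, first=0, last=199):
    """Stops visited by C-SCAN when the arm is moving toward the last cylinder."""
    above = sorted(c for c in requests if c >= start)
    below = sorted(c for c in requests if c < start)
    # Serve upward, run to the end, sweep back without serving, serve upward again.
    return above + [last, first] + below


def path_length(start, path):
    """Total head movement along a list of stops."""
    total, position = 0, start
    for stop in path:
        total += abs(stop - position)
        position = stop
    return total


QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]
print(path_length(53, cscan_path(53, QUEUE)))  # 382
```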
5.4.5 LOOK Scheduling
• The SCAN algorithm, as described, moves the disk-arm across the full width of the disk.
In practice, the algorithm is not implemented in this way.
• The arm goes only as far as the final request in each direction.
Then, the arm reverses, without going all the way to the end of the disk.
• Versions of SCAN and C-SCAN that follow this pattern are called LOOK and C-LOOK scheduling,
because they look for a request before continuing to move in a given direction.
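• For example (this trace follows the circular variant, C-LOOK, which services requests in one
direction only):
• Assuming the disk-arm is moving toward 199, the head will service 65, 67, 98, 122, 124, and 183.
• Then the arm returns directly to the lowest pending request, 14, without servicing requests on
the way, and then serves 37. It is shown in Figure 5.8.
Head movement from 53 to 65 = 12
Head movement from 65 to 67 = 2
Head movement from 67 to 98 = 31
Head movement from 98 to 122 = 24
Head movement from 122 to 124 = 2
Head movement from 124 to 183 = 59
Head movement from 183 to 14 = 169
Head movement from 14 to 37 = 23
Total head movement = 322
• With plain LOOK, the arm would instead reverse at 183 and serve 37 and then 14 on the way back,
for a total head movement of 130 + 146 + 23 = 299.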
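The difference between reversing at the last request and wrapping around to the lowest request can be sketched as follows. The trace above, which jumps from 183 directly to 14 and then serves 37, matches the wrap-around (C-LOOK) order; the function names are illustrative.

```python
def look_path(start, requests):
    """LOOK, moving up first: reverse at the last request and serve on the way back."""
    above = sorted(c for c in requests if c >= start)
    below = sorted((c for c in requests if c < start), reverse=True)
    return above + below


def clook_path(start, requests):
    """C-LOOK, moving up first: jump to the lowest request, then serve upward again."""
    above = sorted(c for c in requests if c >= start)
    below = sorted(c for c in requests if c < start)
    return above + below


def path_length(start, path):
    """Total head movement along a list of stops."""
    total, position = 0, start
    for stop in path:
        total += abs(stop - position)
        position = stop
    return total


QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]
print(path_length(53, look_path(53, QUEUE)))   # 299
print(path_length(53, clook_path(53, QUEUE)))  # 322
```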
5.5 Disk Management
• The operating system is responsible for several other aspects of disk management.
• For example:
1) disk initialization
2) booting from disk
3) bad-block recovery.
5.5.3 Bad Blocks
• Because disks have moving parts and small tolerances, they are prone to failure.
• Sometimes the failure is complete. In this case,
→ The disk needs to be replaced.
→ The disk-contents need to be restored from backup media to the new disk.
• More frequently, one or more sectors become defective.
• Most disks even come from the manufacturer with bad-blocks.
• How to handle bad-blocks?
On simple disks, bad-blocks are handled manually.
One strategy is to scan the disk to find bad-blocks while the disk is being formatted.
Any bad-blocks that are discovered are flagged as unusable. Thus, the file system does not
allocate them.
If blocks go bad during normal operation, a special program (such as the Linux badblocks
command) must be run manually
→ to search for the bad-blocks and
→ to lock the bad-blocks.
Usually, data that resided on the bad-blocks are lost.
• On disks whose controller replaces bad sectors with spares (sector sparing), a typical bad-sector
transaction might be as follows:
1) The operating system tries to read logical block 87.
2) The controller calculates the ECC and finds that the sector is bad. It reports this finding to
the operating system.
3) The next time the system is rebooted, a special command is run to tell the controller to
replace the bad sector with a spare.
4) After that, whenever the system requests logical block 87, the request is translated into the
replacement sector’s address by the controller.
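The four steps above can be modelled with a small remapping table. This is a toy sketch, not how any real controller firmware is written: the class `DiskController`, its fields, and the spare sector address 2000 are all hypothetical.

```python
class DiskController:
    """Toy controller that remaps bad sectors to spares (sector sparing)."""

    def __init__(self, spare_sectors):
        self.spares = list(spare_sectors)  # unused spare sector addresses
        self.bad = set()                   # sectors whose ECC check fails
        self.remap = {}                    # logical block -> replacement sector

    def read(self, logical_block):
        """Return the sector actually accessed, or fail if it is bad."""
        sector = self.remap.get(logical_block, logical_block)
        if sector in self.bad:
            raise IOError(f"ECC error on sector {sector}")
        return sector

    def spare_out(self, logical_block):
        """Run at reboot: replace the bad sector with the next free spare."""
        self.remap[logical_block] = self.spares.pop(0)


controller = DiskController(spare_sectors=[2000])  # 2000 is a made-up spare address
controller.bad.add(87)                             # block 87 has gone bad
controller.spare_out(87)                           # step 3: remap to a spare
print(controller.read(87))                         # 2000 (step 4: translated address)
```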
5.6 Swap Space Management
• Swap-space management is a low-level task of the operating system.
• Virtual memory uses disk space as an extension of main memory.
• Main goal of swap space: to provide the best throughput for the virtual memory system.
• Here, we discuss 1) Swap space use and 2) Swap space location.
Exercise Problems
1) Suppose that the disk-drive has 5000 cylinders numbered from 0 to 4999. The drive is currently
serving a request at cylinder 143, and the previous request was at cylinder 125. The queue of pending
requests in FIFO order is 86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130. Starting from the current
(location) head position, what is the total distance (in cylinders) that the disk-arm moves to satisfy all
the pending requests, for each of the following disk-scheduling algorithms?
(i) FCFS
(ii) SSTF
(iii) SCAN
(iv) LOOK
(v) C-SCAN
Solution:
(i) FCFS
(ii) SSTF
(iii) SCAN
(iv) LOOK
(v) C-SCAN
2) Suppose that a disk has 50 cylinders numbered 0 to 49. The R/W head is currently serving at cylinder
15. The queue of pending requests, in order, is: 4, 40, 11, 35, 7, 14. Starting from the current head
position, what is the total distance traveled (in cylinders) by the disk-arm to satisfy the requests using
the following algorithms?
i) FCFS
ii) SSTF and
iii) LOOK.
Illustrate with figure in each case.
Solution:
(i) FCFS
Queue: 4 40 11 35 7 14
Head starts at 15
(ii) SSTF
Queue: 4 40 11 35 7 14
Head starts at 15
(iii) LOOK
Queue: 4 40 11 35 7 14
Head starts at 15
3) Given the following queue: 95, 180, 34, 119, 11, 123, 62, 64, with the head initially at track 50 and
the tracks ending at track 199, calculate the number of head moves using
i) FCFS
ii) SSTF
iii) Elevator and
iv) C-LOOK.
Solution:
(i) FCFS
(ii) SSTF
(iv) C-LOOK
5.9 Principles of Protection
• A key principle for protection is the principle of least privilege.
• Principle of Least Privilege:
"Programs, users, and even systems are given just enough privileges to perform their tasks."
• The principle of least privilege can help produce a more secure computing environment.
• An operating system which follows the principle of least privilege implements its features, programs,
system-calls, and data structures so that failure or compromise of a component does the minimum
damage.
• An operating system also provides system-calls and services that allow applications to be written with
fine-grained access controls.
• Access Control provides mechanisms
→ to enable privileges when they are needed.
→ to disable privileges when they are not needed.
• Audit-trails for all privileged function-access can be created.
• Audit-trail can be used to trace all protection/security activities on the system.
• The audit-trail can be used by
→ Programmer
→ System administrator or
→ Law-enforcement officer.
• Managing users with the principle of least privilege requires creating a separate account for each
user, with just the privileges that the user needs.
• Computers implemented in a computing facility under the principle of least privilege can be limited to
→ running specific services
→ accessing specific remote hosts via specific services
→ accessing during specific times.
• Typically, these restrictions are implemented through enabling or disabling each service and through
using Access Control Lists.
5.10 Domain of Protection
• A process operates within a protection domain.
• Protection domain specifies the resources that the process may access.
• Each domain defines
→ set of objects and
→ types of operations that may be invoked on each object.
• The ability to execute an operation on an object is an access-right.
• A domain is a collection of access-rights.
• Each access-right is an ordered pair <object-name, rights-set>.
• For example:
If domain D has the access-right <file F, {read,write}>;
Then a process executing in domain D can both read and write on file F.
• As shown in Figure 5.9, domains may share access-rights. The access-right <O4, {print}> is shared
by D2 and D3.
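A domain as a collection of <object-name, rights-set> pairs maps naturally onto a nested dictionary. The sketch below encodes only the access-rights named in the text above; the structure `DOMAINS` and the helper `can_invoke` are illustrative names.

```python
# Each domain maps object names to the set of operations allowed on them.
DOMAINS = {
    "D":  {"F": {"read", "write"}},  # <file F, {read, write}>
    "D2": {"O4": {"print"}},         # <O4, {print}> is shared ...
    "D3": {"O4": {"print"}},         # ... by D2 and D3
}


def can_invoke(domain, obj, operation):
    """True if a process executing in `domain` may perform `operation` on `obj`."""
    return operation in DOMAINS.get(domain, {}).get(obj, set())


print(can_invoke("D", "F", "write"))    # True
print(can_invoke("D2", "O4", "print"))  # True
print(can_invoke("D2", "F", "read"))    # False
```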
• The association between a process and a domain may be either static or dynamic.
1) Static: the set of resources available to the process is fixed throughout the process’s
lifetime. If the association between processes and domains is static, then a mechanism must be
available to change the content of a domain.
2) Dynamic: if the association between processes and domains is dynamic, then a mechanism is
available to allow domain switching.
Domain switching allows the process to switch from one domain to another.
• A domain can be realized in a variety of ways:
1) Each user may be a domain.
2) Each process may be a domain.
3) Each procedure may be a domain.
5.11 Access Matrix
• Access-matrix provides mechanism for specifying a variety of policies.
• The access matrix is used to implement policy decisions concerning protection.
• In the matrix, 1) Rows represent domains.
2) Columns represent objects.
3) Each entry consists of a set of access-rights (such as read, write or execute).
• In general, Access(i, j) is the set of operations that a process executing in domain Di can invoke on
object Oj.
• Example: Consider the access matrix shown in Figure 5.10.
There are
1) Four domains: D1, D2, D3, and D4
2) Three objects: F1, F2 and F3
A process executing in domain D1 can read files F1 and F3.
• Domain switching allows the process to switch from one domain to another.
• When we switch a process from one domain to another, we are executing an operation (switch) on an
object (the domain)
• We can include domains in the matrix to control domain switching.
• Consider the access matrix shown in Figure 5.11.
A process executing in domain D2 can switch to domain D3 or to domain D4.
• Allowing controlled change in the contents of the access-matrix entries requires 3 additional
operations (Figure 5.12):
1) Copy(*) denotes ability for one domain to copy the access right to another domain.
2) Owner denotes the process executing in that domain can add/delete rights in that column.
3) Control in access(D2,D4) means: A process executing in domain D2 can modify row D4.
Figure 5.12 Access matrix with Copy rights, Owner rights & Control rights
• The problem of guaranteeing that no information initially held in an object can migrate outside of its
execution environments is called the confinement problem.
5.12 Implementation of Access Matrix
5.12.1 Global Table
• A global table consists of a set of ordered triples <domain, object, rights-set>.
• Here is how it works:
Whenever an operation M is executed on an object Oj within domain Di, the global table is
searched for a triple <Di, Oj, Rk>, with M ∈ Rk.
If this triple is found,
Then, we allow the access operation;
Otherwise, access is denied, and an exception condition occurs.
• Disadvantages:
1) The table is usually large and can't be kept in main memory.
2) It is difficult to take advantage of groupings, e.g. if every domain may read an object, the
object must still have a separate entry in each domain.
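The global-table lookup can be sketched as a linear search over <domain, object, rights-set> triples. The first two triples reflect the earlier statement that D1 can read F1 and F3; the third triple and the names `GLOBAL_TABLE` and `check_access` are hypothetical.

```python
GLOBAL_TABLE = [
    ("D1", "F1", {"read"}),
    ("D1", "F3", {"read"}),
    ("D2", "F2", {"write"}),  # hypothetical entry, for illustration only
]


def check_access(domain, obj, operation):
    """Search the table for <domain, obj, R> with operation in R."""
    for d, o, rights in GLOBAL_TABLE:
        if d == domain and o == obj and operation in rights:
            return True  # triple found: allow the operation
    # No matching triple: deny access and signal an exception condition.
    raise PermissionError(f"{operation} on {obj} denied in {domain}")


print(check_access("D1", "F1", "read"))  # True
```

The linear scan also makes the first disadvantage visible: every access check walks a table that grows with the number of domains times the number of objects.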
5.13 Access Control
• Protection can be applied to non-file resources (Figure 5.13).
• Solaris 10 provides role-based access control (RBAC) to implement least privilege.
• A privilege is the right to execute a system call or to use an option within a system call.
• Privileges can be assigned to processes.
• Users are assigned roles, which grant access to privileges and programs.
5.15.2 Linux-System
• Linux uses many tools developed as part of
→ Berkeley’s BSD OS
→ MIT’s X Window-System and
→ Free Software Foundation’s GNU project.
• Main system-libraries are created by GNU project.
• Linux networking-administration tools are derived from 4.3BSD code.
• Linux-System is maintained by many developers collaborating over the Internet.
• Small groups or individuals are responsible for maintaining the integrity of specific components.
• Linux community is responsible for maintaining the File-system Hierarchy Standard.
• This standard ensures compatibility across the various system-components.
5.15.3 Linux-Distributions
• Linux-Distributions include
→ system-installation and management utilities
→ ready-to-install packages of common UNIX tools (ex: text-processing, web browser).
• The first distributions managed these packages by simply providing a means of unpacking all the files
into the appropriate places.
• Early distributions included SLS and Slackware.
• RedHat and Debian are popular distributions from commercial and non-commercial sources,
respectively.
• RPM Package file format permits compatibility among the various Linux-Distributions.
5.16 Design Principles
• Linux is a multiuser multitasking-system with a full set of UNIX-compatible tools.
• Linux’s file-system follows traditional UNIX semantics.
• The standard UNIX networking model is fully implemented.
• Main design-goals are speed, efficiency, and standardization.
• Linux is designed to be compliant with the relevant POSIX documents; at least two Linux-
Distributions have achieved official POSIX certification.
5.17 Kernel Modules
• A kernel module can implement a device-driver, a file-system, or a networking protocol.
• The kernel’s module interface allows third parties to write and distribute, on their own terms, device-
drivers or file-systems that could not be distributed under the GPL.
• Kernel modules allow a Linux-System to be set up with a standard minimal kernel, without any extra
device-drivers built in.
• The module support has 3 components:
1) Module Management
2) Driver Registration
3) Conflict Resolution
5.17.3 Conflict Resolution
• Allows different device-drivers to
→ reserve hardware resources and
→ protect the resources from accidental use by another driver.
• Its aims are as follows:
1) To prevent modules from clashing over access to hardware resources.
2) To prevent autoprobes from interfering with existing device-drivers.
3) To resolve conflicts among multiple drivers trying to access the same hardware.
5.18 Process management
5.18.1 The fork() and exec() Process Model
• UNIX process management separates the creation of processes and the running of a new program
into two distinct operations.
• A new process is created by the fork() system-call. A new program is run after a call to exec().
• Process properties fall into 3 groups: 1) Process identity 2) Environment and 3) Context.
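The separation between creating a process and running a new program can be seen from user space with Python's `os.fork()` and `os.execvp()`, which wrap the corresponding system calls on UNIX-like systems. This is a minimal sketch; the helper name `run_program` is illustrative, and it will not work on platforms without fork().

```python
import os
import sys


def run_program(argv):
    """Create a child with fork(), run a new program in it with exec(), return its exit code."""
    pid = os.fork()                   # step 1: create a new process
    if pid == 0:                      # child branch
        try:
            os.execvp(argv[0], argv)  # step 2: replace the child's program image
        finally:
            os._exit(127)             # reached only if exec() itself failed
    _, status = os.waitpid(pid, 0)    # parent waits for the child to finish
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


# The child runs a second Python interpreter that exits with status 7.
print(run_program([sys.executable, "-c", "raise SystemExit(7)"]))  # 7
```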
5.18.2 Processes and Threads
• Linux provides the ability to create threads via the clone() system-call.
• The clone() system-call behaves like fork(), except that it accepts as arguments a set of
flags.
• The flags dictate what resources are shared between the parent and child.
• The flags include:
→ CLONE_FS (file-system information is shared)
→ CLONE_VM (the same memory space is shared)
→ CLONE_SIGHAND (signal handlers are shared)
→ CLONE_FILES (the set of open files is shared)
• If clone() is passed the above flags, the parent and child tasks will share
→ same file-system information (such as the current working directory)
→ same memory space
→ same signal handlers and
→ same set of open files.
However, if none of these flags is set when clone() is invoked, the associated resources are not
shared.
• Separate data-structures are used to hold a process's context information. This information includes:
→ file-system context
→ file-descriptor table
→ signal-handler table and
→ virtual-memory context
• The process data-structure contains pointers to these other structures.
• So any number of processes can easily share a sub-context by
→ pointing to the same sub-context and
→ incrementing a reference count.
• The arguments to the clone() system-call tell it
→ which sub-contexts to copy and
→ which sub-contexts to share.
• The new process is always given a new identity and a new scheduling context.
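The idea of sharing a sub-context by pointing at it and bumping a reference count can be modelled in a few lines. This is a conceptual sketch only: the classes `SubContext` and `Task`, the `share` argument, and the two sub-contexts shown are hypothetical and do not mirror the kernel's actual structures.

```python
import itertools

_pids = itertools.count(1)  # every task gets a fresh identity


class SubContext:
    """A shareable piece of process context with a reference count."""

    def __init__(self, **fields):
        self.fields = dict(fields)
        self.refcount = 1


class Task:
    def __init__(self, fs, files):
        self.pid = next(_pids)  # new identity for every task
        self.fs, self.files = fs, files


def clone(parent, share=()):
    """Create a child task, sharing the named sub-contexts and copying the rest."""
    def inherit(name):
        ctx = getattr(parent, name)
        if name in share:
            ctx.refcount += 1             # share: same object, one more reference
            return ctx
        return SubContext(**ctx.fields)   # copy: private duplicate
    return Task(inherit("fs"), inherit("files"))


parent = Task(SubContext(cwd="/"), SubContext(open_fds=3))
thread = clone(parent, share=("fs", "files"))  # thread-like: everything shared
child = clone(parent)                          # fork-like: everything copied
print(thread.fs is parent.fs, parent.fs.refcount)  # True 2
print(child.fs is parent.fs)                       # False
```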
5.19 Scheduling
• Scheduling is a process of allocating CPU-time to different tasks within an OS.
• Like all UNIX systems, Linux supports preemptive multitasking.
• In such a system, the process-scheduler decides which process runs and when.
5.19.3 Kernel Synchronization
• There are two ways in which kernel-mode execution can be requested:
1) A running program may request an OS service, either
→ explicitly via a system-call or
→ implicitly when a page-fault occurs
2) A device-driver may deliver a hardware-interrupt.
The interrupt causes the CPU to start executing a kernel-defined handler.
• Two kinds of locks protect critical-sections in the kernel: 1) spinlocks and 2) semaphores.
1) Spinlocks are used in the kernel only when the lock is held for short durations.
i) On SMP machines, spinlocks are the main locking mechanism used.
ii) On single-processor machines, spinlocks are not used; instead, kernel preemption is
disabled and re-enabled.
2) Semaphores are used in the kernel only when a lock must be held for longer periods.
• A second protection technique applies to critical-sections that occur in ISRs (interrupt
service routines).
The basic tool is the processor’s interrupt-control hardware.
By disabling interrupts during a critical-section, the kernel guarantees that it can proceed
without the risk of concurrent-access to shared data-structures.
• The kernel uses a synchronization architecture that allows long critical-sections to run for their
entire duration without having interrupts disabled.
• ISRs are separated into a top half and a bottom half (Figure 5.15):
1) The top half is a normal ISR, and runs with recursive interrupts disabled.
2) The bottom half is run, with all interrupts enabled, by a miniature-scheduler that ensures
that bottom halves never interrupt themselves.
• This architecture is completed by a mechanism for disabling selected bottom halves while executing
normal, foreground kernel-code.
• Each level may be interrupted by code running at a higher level, but will never be interrupted by
code running at the same or a lower level.
• User-processes can always be preempted by another process when a time-sharing scheduling
interrupt occurs.
5.20 Memory-Management
• Memory-management has 2 components:
1) The first component is used for allocating and freeing physical-memory such as
→ groups of pages and
→ small blocks of RAM.
2) The second component is used for handling virtual-memory.
Virtual-memory is memory mapped into the address-space of running processes.
• Page-allocator is used to
→ allocate and free all physical-pages.
→ allocate ranges of physically-contiguous pages on-demand.
• Page-allocator uses a buddy-heap algorithm to keep track of available physical-pages (Figure 5.17).
• Each allocatable memory-region is paired with an adjacent partner (hence, the name buddy-heap).
1) When 2 allocated partner regions are both freed up, they are combined to form a larger
region (a buddy heap).
2) Conversely, if a small memory-request cannot be satisfied by allocation of an existing small
free region, then a larger free region will be subdivided into two partners to satisfy the request.
• Memory allocations occur either
→ statically (drivers reserve a contiguous area of memory during system boot time) or
→ dynamically (via the page-allocator).
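The split-and-merge behaviour of a buddy-heap can be sketched with one free list per block size. This is an illustrative model, not the kernel's allocator: the class `BuddyAllocator` is hypothetical, sizes are expressed as orders (a block of order k spans 2^k pages), and addresses are page numbers.

```python
class BuddyAllocator:
    """Toy buddy allocator over 2**max_order pages."""

    def __init__(self, max_order):
        self.max_order = max_order
        self.free = {order: set() for order in range(max_order + 1)}
        self.free[max_order].add(0)  # initially one big free region

    def alloc(self, order):
        """Return the start page of a free block of 2**order pages."""
        current = order
        while current <= self.max_order and not self.free[current]:
            current += 1             # look for a larger region to subdivide
        if current > self.max_order:
            raise MemoryError("no free region large enough")
        start = min(self.free[current])
        self.free[current].remove(start)
        while current > order:       # split into two partners, keep the lower one
            current -= 1
            self.free[current].add(start + (1 << current))
        return start

    def release(self, start, order):
        """Free a block and merge it with its partner whenever both are free."""
        while order < self.max_order:
            buddy = start ^ (1 << order)  # partner differs in exactly one bit
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free[order].add(start)


heap = BuddyAllocator(max_order=4)   # 16 pages
a = heap.alloc(0)                    # splits 16 -> 8 -> 4 -> 2 -> 1
b = heap.alloc(0)                    # takes the partner of a
heap.release(a, 0)
heap.release(b, 0)                   # partners merge all the way back up
print(a, b, heap.free[4])            # 0 1 {0}
```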
• The slab-allocator first attempts to satisfy the request with a free object in a partial slab (Figure 5.18).
1) If none exists, a free object is assigned from an empty slab.
2) If no empty slabs are available, a new slab is allocated from contiguous physical-pages and
assigned to a cache; memory for the object is allocated from this slab.
5.20.2 Virtual-memory
• VM system
→ maintains the address-space visible to each process.
→ creates pages of virtual-memory on-demand and
→ loads those pages from disk & swaps them back out to disk as required.
• The VM manager maintains 2 separate views of a process’s address-space:
1) Logical-view and 2) Physical-view
1) Logical-View
• Logical-view of an address-space refers to a set of separate regions.
• The address-space consists of a set of non-overlapping regions.
• Each region represents a contiguous, page-aligned subset of the address-space.
• The regions are linked into a balanced binary-tree to allow fast lookup of the region.
2) Physical-View
• Physical-view of an address-space refers to a set of pages.
• This view is stored in the hardware page-tables for the process.
• The page-table entries identify the exact current location of each page of virtual-memory.
• Each page of virtual-memory may be on disk or in physical-memory.
• A set of routines manages the Physical-view.
• The routines are invoked whenever a process tries to access a page that is not currently present in
the page-tables.
5.20.2.1 Virtual-Memory-Regions
• Virtual-memory-regions can be classified by backing-store.
• Backing-store defines from where the pages for the region come.
• Most memory-regions are backed either 1) by a file or 2) by nothing.
1) By Nothing
Here, a region is backed by nothing.
The region represents demand-zero memory.
When a process reads a page in such a region, the process is returned a page-of-memory filled
with zeros.
2) By File
A region backed by a file acts as a viewport onto a section of that file.
When the process tries to access a page within that region, the page-table is filled with the
address of a page within the kernel’s page-cache.
The same page of physical-memory is used by both the page-cache and the process’s page
tables.
• A virtual-memory-region can also be classified by its reaction to writes. 1) Private or 2) Shared.
1) If a process writes to a private-region, then the pager detects that a copy-on-write is
necessary to keep the changes local to the process.
2) If a process writes to a shared-region, the object mapped into that region is updated.
Thus, the change will be visible immediately to any other process that is mapping that
object.
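The private/shared distinction is visible from user space through Python's `mmap` module: `ACCESS_COPY` gives a copy-on-write mapping whose changes stay local, while `ACCESS_WRITE` writes through to the underlying file. This is a sketch for illustration; the helper `write_through_mapping` and the five-byte temporary file are assumptions.

```python
import mmap
import os
import tempfile


def write_through_mapping(access):
    """Map a small file, change its first byte, and report (mapping view, file contents)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello")
        with mmap.mmap(fd, 5, access=access) as region:
            region[0:1] = b"J"            # write into the mapped region
            seen_in_mapping = region[:5]
            if access == mmap.ACCESS_WRITE:
                region.flush()            # push the shared change to the file
        with open(path, "rb") as f:
            on_disk = f.read()
        return seen_in_mapping, on_disk
    finally:
        os.close(fd)
        os.remove(path)


print(write_through_mapping(mmap.ACCESS_COPY))   # (b'Jello', b'hello')  private
print(write_through_mapping(mmap.ACCESS_WRITE))  # (b'Jello', b'Jello')  shared
```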
5.20.2.3 Swapping and Paging
• A VM system relocates pages of memory from physical-memory out to disk when that memory is
needed for something else.
• Paging refers to movement of individual pages of virtual-memory between physical-memory & disk.
• Paging-system is divided into 2 sections:
1) Policy algorithm decides
→ which pages to write out to disk and
→ when to write those pages.
2) Paging mechanism
→ carries out the transfer and
→ pages data back into physical-memory when they are needed again.
• Linux’s pageout policy uses a modified version of the standard clock algorithm.
• A multiple pass clock is used, and every page has an age that is adjusted on each pass of the clock.
• The age is a measure of the page’s youthfulness, or how much activity the page has seen recently.
• Frequently accessed pages will attain a higher age value, but the age of infrequently accessed pages
will drop toward zero with each pass.
• This age valuing allows the pager to select pages to page out based on a least frequently used (LFU)
policy.
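The aging idea can be sketched as follows. The constants (`boost`, `max_age`, halving on an idle pass) and the dictionary layout are assumptions chosen for illustration; they are not taken from the Linux source.

```python
def clock_pass(pages, boost=3, max_age=20):
    """One pass of the clock: age up referenced pages, decay the others."""
    for page in pages:
        if page["referenced"]:
            page["age"] = min(page["age"] + boost, max_age)
            page["referenced"] = False  # clear the bit for the next pass
        else:
            page["age"] //= 2           # idle pages drift toward zero


def pick_victim(pages):
    """Page out the page with the lowest age (least frequently used)."""
    return min(pages, key=lambda p: p["age"])


pages = [
    {"name": "hot", "age": 4, "referenced": False},
    {"name": "cold", "age": 4, "referenced": False},
]
for _ in range(3):
    pages[0]["referenced"] = True       # "hot" is touched before every pass
    clock_pass(pages)
print([(p["name"], p["age"]) for p in pages])  # [('hot', 13), ('cold', 0)]
print(pick_victim(pages)["name"])              # cold
```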
• The paging mechanism supports paging both to
1) dedicated swap devices and partitions and
2) normal files
• Blocks are allocated from the swap devices according to a bitmap of used blocks, which is maintained
in physical-memory at all times.
• The allocator uses a next-fit algorithm to try to write out pages to contiguous runs of disk blocks for
improved performance.
5.20.3.1 Mapping of Programs into Memory
• Initially, the pages of the binary-file are mapped into regions of virtual-memory.
• A page is loaded only when the program tries to access it: the access causes a page-fault.
• The page-fault results in loading the requested page into physical-memory.
• An ELF-format binary-file consists of a header followed by several page-aligned sections.
• The ELF loader
→ reads the header and
→ maps the sections of the file into separate regions of virtual-memory.
• As shown in Figure 5.19:
Kernel VM is not accessible to normal user-mode programs.
The job of the loader is to set up the initial memory mapping so that execution of the program can begin.
The regions to be initialized include 1) stack and 2) program’s text/data regions.
The stack is created at the top of the user-mode virtual-memory.
The stack includes copies of the arguments given to the program.
In the binary-file,
¤ Firstly, program-text or read-only data are mapped into a write-protected region.
¤ Then, writable initialized data are mapped.
¤ Then, any uninitialized data are mapped in as a private demand-zero region.
¤ Finally, we have a variable-sized region that programs can expand as needed to hold
data allocated at run time.
Each process has a pointer, brk, that points to the current extent of this data region; the process
can extend or contract the region with the sbrk() system call.
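Loading pages only on first access can be illustrated with a small simulation. This is a sketch under stated assumptions: the page size of 4096 bytes and the class name MappedRegion are chosen for the example, and a bytes object stands in for the binary-file.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

class MappedRegion:
    """A region backed by file data whose pages are loaded on first access."""

    def __init__(self, backing: bytes):
        self.backing = backing
        self.resident = {}      # page index -> bytes held in "physical memory"
        self.faults = 0

    def read_byte(self, offset: int) -> int:
        page = offset // PAGE_SIZE
        if page not in self.resident:           # page fault on first touch
            self.faults += 1
            start = page * PAGE_SIZE
            data = self.backing[start:start + PAGE_SIZE]
            # Pad a short final page with zeros, as a demand-zero fill would.
            self.resident[page] = data.ljust(PAGE_SIZE, b"\x00")
        return self.resident[page][offset % PAGE_SIZE]
```

Creating the region costs nothing; each page is brought in by exactly one fault, and later reads of the same page are served from memory.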
5.21 File-Systems
5.21.1 Virtual File-System
• The Linux VFS is designed around object-oriented principles.
• It has two components:
1) A set of definitions that specify the file-system objects.
2) A layer of software to manipulate the objects.
• The VFS defines 4 main object types:
1) An inode object represents an individual file.
2) A file-object represents an open file.
3) A superblock object represents an entire file-system.
4) A dentry object represents an individual directory entry.
• For each object type, the VFS defines a set of operations.
• Each object contains a pointer to a function-table.
• The function-table lists the addresses of the actual functions that implement the defined operations
for that object.
• Examples of the file-object’s operations include:
int open(. . .) - Open a file.
ssize_t read(. . .) - Read from a file.
ssize_t write(. . .) - Write to a file.
int mmap(. . .) - Memory-map a file.
• The complete definition of the file-object is specified in struct file_operations, which is located in
the file /usr/include/linux/fs.h.
• An implementation of the file-object is required to implement each function specified in the definition
of the file-object.
• The VFS software layer can perform an operation on the file-objects by calling the appropriate
function from the object’s function-table.
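The function-table idea can be sketched outside the kernel. In the sketch below a dictionary plays the role of the function-table; the two file-systems (an in-memory one and one that returns zeros) and all names are hypothetical and only illustrate the dispatch.

```python
class FileObject:
    """An open file: private state plus a pointer to a table of operations."""

    def __init__(self, ops, data=b""):
        self.ops = ops                  # function table supplied by the file-system
        self.data = bytearray(data)
        self.pos = 0                    # current read/write position

# Operations for a hypothetical in-memory file-system.
def ram_read(f, count):
    chunk = bytes(f.data[f.pos:f.pos + count])
    f.pos += len(chunk)
    return chunk

def ram_write(f, buf):
    f.data[f.pos:f.pos + len(buf)] = buf
    f.pos += len(buf)
    return len(buf)

RAM_OPS = {"read": ram_read, "write": ram_write}

# Operations for a hypothetical read-only file-system that always returns zeros.
def zero_read(f, count):
    return b"\x00" * count

def zero_write(f, buf):
    raise PermissionError("read-only file-system")

ZERO_OPS = {"read": zero_read, "write": zero_write}

def vfs_read(f, count):
    """Generic layer: dispatch through the table without knowing the file type."""
    return f.ops["read"](f, count)

def vfs_write(f, buf):
    return f.ops["write"](f, buf)
```

The generic functions vfs_read and vfs_write never inspect which file-system they are talking to; supporting a new one only means supplying another table.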
• The VFS does not know whether an inode represents
→ a networked file
→ a disk file
→ a network socket, or
→ a directory file.
• The inode and file-objects are the mechanisms used to access files.
• An inode object is a data-structure containing pointers to the disk blocks that contain the actual file
contents.
• The inode also maintains standard information about each file, such as
→ owner
→ size and
→ time most recently modified.
• A file-object represents a point of access to the data in an open file.
• A process cannot access an inode’s contents without first obtaining a file-object pointing to the inode.
• The file-object keeps track of where in the file the process is currently reading/writing.
• File-objects typically belong to a single process, but inode objects do not.
• There is one file-object for every instance of an open file, but always only a single inode object.
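The relationship between file-objects and inodes is visible from user space. The sketch below, which assumes a POSIX-like system, opens the same temporary file twice; each descriptor refers to its own open file with its own position, while both report the same inode number. The function name is hypothetical.

```python
import os
import tempfile

def two_opens_one_inode():
    """Open one file twice and compare positions and inode numbers."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"abcdef")
        os.close(fd)
        a = os.open(path, os.O_RDONLY)
        b = os.open(path, os.O_RDONLY)
        try:
            first = os.read(a, 3)     # advances only a's position
            second = os.read(b, 2)    # b still starts at offset 0
            same_inode = os.fstat(a).st_ino == os.fstat(b).st_ino
            offsets = (os.lseek(a, 0, os.SEEK_CUR), os.lseek(b, 0, os.SEEK_CUR))
            return first, second, same_inode, offsets
        finally:
            os.close(a)
            os.close(b)
    finally:
        os.unlink(path)
```

Reading through one descriptor does not move the other, which matches the statement that the file-object, not the inode, tracks the current position.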
• Directory files are dealt with slightly differently from other files.
• The UNIX programming interface defines a number of operations on directories, such as
→ creating a file
→ deleting a file and
→ renaming a file.
5.21.2 Linux ext3 File-system
• ext3 has much in common with the BSD Fast File-system (FFS); it uses a similar mechanism to locate
the data blocks belonging to a specific file.
• The main differences between ext3 and FFS lie in their disk-allocation policies.
1) In FFS, the disk is allocated to files in blocks of 8 KB.
The 8KB-blocks are further subdivided into fragments of 1 KB for storage of small files.
2) In ext3, fragments are not used.
Allocations are performed in smaller units.
Supported block sizes are 1, 2, 4, and 8 KB.
• ext3 uses allocation policies designed to place logically adjacent blocks of a file into physically
adjacent blocks on disk.
• Thus, ext3 can submit an I/O request for several disk blocks as a single operation.
• The allocation-policy works as follows (Figure 5.20):
An ext3 file-system is divided into multiple segments. These are called block-groups.
When allocating a file, ext3 first selects the block-group for that file.
Within a block-group, ext3 keeps the allocations physically contiguous to reduce
fragmentation.
ext3 maintains a bitmap of all free blocks in a block-group.
i) When allocating the first blocks for a new file, ext3 starts searching for a free block
from the beginning of the block-group.
ii) When extending a file, ext3 continues the search from the block most recently
allocated to the file. The search is performed in 2 stages:
1) First, ext3 searches for an entire free byte in the bitmap; if it fails to
find one, it looks for any free bit.
¤ The search for free bytes aims to allocate disk-space in chunks of at
least 8 blocks.
2) After a free block is found, the search is extended backward until an allocated
block is encountered.
¤ The backward extension prevents ext3 from leaving a hole.
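3) Once a block has been found, ext3 extends the allocation forward by up to 8 blocks
and preallocates these extra blocks to the file.
The preallocated blocks that remain unused are returned to the free-space bitmap when the file is closed.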
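The two-stage search can be sketched over a list of booleans. This is a simplified illustration that follows the description above literally, not the ext3 source: the function name find_block is hypothetical, one list element stands for one bit of the bitmap, and the backward extension is not bounded.

```python
def find_block(bitmap, goal):
    """Return the index of the block to allocate, or None if the group is full.

    bitmap[i] is True when block i is allocated; goal is where to start looking.
    """
    n = len(bitmap)
    # Stage 1: look for a whole free byte (8 consecutive free blocks on a
    # byte boundary) at or after the goal.
    first_byte = (goal + 7) // 8 * 8
    for start in range(first_byte, n - 7, 8):
        if not any(bitmap[start:start + 8]):
            # Extend backward until an allocated block is met, so that no
            # hole is left just before the free byte.
            while start > 0 and not bitmap[start - 1]:
                start -= 1
            return start
    # Stage 2: fall back to any free bit, scanning from the goal and wrapping.
    for step in range(n):
        block = (goal + step) % n
        if not bitmap[block]:
            return block
    return None
```

When a free byte exists, the result is the start of a run of at least 8 free blocks; otherwise the search settles for the nearest single free block.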
5.21.3 Journaling
• ext3 file-system supports a popular feature called journaling.
• Here, modifications to the file-system are written sequentially to a journal.
• A set of operations that performs a specific task is a transaction.
• Once a transaction is written to the journal, it is considered to be committed.
• The journal entries relating to the transaction are replayed across the actual file-system structures.
• When an entire committed transaction is completed, it is removed from the journal.
• If the system crashes, some transactions may remain in the journal.
• If those transactions were never completed, then they must be completed once the system recovers.
• The only problem occurs when a transaction has been aborted, i.e., it was not committed before the
system crashed.
• Any changes from those transactions that were applied to the file-system must be undone, again
preserving the consistency of the file-system.
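The commit-then-replay cycle can be sketched with a toy key-value store. This is an illustration only: the class JournaledStore is hypothetical, a dictionary stands in for the file-system structures, and nothing is applied before commit, so recovery only has to discard uncommitted transactions rather than undo them.

```python
class JournaledStore:
    """Toy key-value 'file-system' with a sequential journal (illustrative only)."""

    def __init__(self):
        self.disk = {}       # stands in for the actual file-system structures
        self.journal = []    # transactions in the order they were written

    def log(self, ops, commit=True):
        """Append a transaction to the journal; commit=False models a crash
        that happens before the transaction is committed."""
        self.journal.append({"ops": list(ops), "committed": commit})

    def checkpoint(self):
        """Replay committed transactions onto the disk and drop them from the journal."""
        remaining = []
        for txn in self.journal:
            if txn["committed"]:
                for key, value in txn["ops"]:
                    self.disk[key] = value
            else:
                remaining.append(txn)
        self.journal = remaining

    def recover(self):
        """After a crash: complete committed transactions, discard the rest."""
        self.checkpoint()
        self.journal.clear()     # uncommitted transactions are abandoned
```

After recovery the store reflects every committed transaction and none of the uncommitted ones, so it is left in a consistent state.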
5.22 Input and Output
• Three types of devices (Figure 5.21):
1) Block devices
2) Character devices and
3) Network devices.
1) Block Devices
• Block devices allow random access to completely independent, fixed-sized blocks of data.
• For example: hard disks and floppy disks, CD-ROMs and Blu-ray discs, and flash memory.
• Block devices are typically used to store file-systems.
2) Character Devices
• A character-device-driver does not offer random access to fixed blocks of data.
• For example: mice and keyboards.
3) Network Devices
• Users cannot directly transfer data to network devices.
• Instead, they must communicate indirectly by opening a connection to the kernel’s networking
subsystem.
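Block and character devices appear as special files, so their type can be read from the file's mode. The sketch below uses the standard os and stat modules; the function name device_kind is hypothetical. Network devices have no such file and are reached through sockets instead.

```python
import os
import stat

def device_kind(path):
    """Classify a path by the file type recorded in its mode bits."""
    mode = os.stat(path).st_mode
    if stat.S_ISBLK(mode):
        return "block"
    if stat.S_ISCHR(mode):
        return "character"
    return "other"
```

On a typical Linux system, /dev/null is reported as a character device, while a disk node is reported as a block device; the exact device names vary between systems.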
5.23 Inter-Process Communication
• In some situations, one process needs to communicate with another process.
• Three methods for IPC:
1) Synchronization and signals
2) Passing data between processes (message passing)
3) Shared-memory objects
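As an illustration of the second method, the sketch below passes data to a child process and back over pipes. It assumes a Python interpreter is available to run as the child; the names CHILD_CODE and ask_child are hypothetical. Signals and shared memory are not shown.

```python
import subprocess
import sys

# Program run by the child: read everything from its input pipe and
# write the upper-cased text to its output pipe.
CHILD_CODE = "import sys; sys.stdout.write(sys.stdin.read().upper())"

def ask_child(message: str) -> str:
    """Send a message to a child process over a pipe and return its reply."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=message,          # written to the child's standard input pipe
        capture_output=True,    # child's output is collected through a pipe
        text=True,
        check=True,
    )
    return result.stdout
```

The parent and child share no memory here; all data crosses the process boundary through the kernel's pipe buffers.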