
OPERATING SYSTEM

Chapter 6: File Management

MARCH 21, 2020


BOSE CUTTACK
Nishita Kindo
CHAPTER 6: FILE MANAGEMENT
6.0 FILE - A file is a collection of related information that is recorded on secondary storage.
From the user's point of view, a file is the smallest allotment of logical secondary storage.
Alternatively, a file is a collection of logically related entities.

6.1 FILE ORGANIZATION refers to the way data is stored in a file. File organization is
important because it determines the access methods, efficiency, flexibility, and the storage
devices to use. In other words, file organization is the logical structuring of files as well as
their access method(s).
Common file organization schemes are:

• sequential,
• random,
• serial and
• indexed-sequential

6.1.1 FILE DIRECTORIES:


A collection of files is a file directory. The directory contains information about the files,
including attributes, location, and ownership. Much of this information, especially that
concerned with storage, is managed by the operating system. The directory is itself a file,
accessible by various file management routines.

Information contained in a device directory includes:


• Name
• Type
• Address
• Current length
• Maximum length
• Date last accessed
• Date last updated
• Owner id
• Protection information

Operations performed on a directory are:


• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• Traverse the file system

Advantages of maintaining directories are:


• Efficiency: A file can be located more quickly.
• Naming: It becomes convenient for users, as two users can have the same name for
different files, or different names for the same file.
• Grouping: Files can be grouped logically by properties, e.g. all Java programs, all
games, etc.

1. SINGLE-LEVEL DIRECTORY
A single directory is maintained for all users.
• Naming problem: No two files can have the same name.
• Grouping problem: Users cannot group files according to their needs.

2. TWO-LEVEL DIRECTORY
A separate directory is maintained for each user.
• Path name: Because of the two levels, every file has a path name used to locate it.
• Different users can now use the same file name.
• Searching is efficient in this method.

3. TREE-STRUCTURED DIRECTORY:
The directory is maintained in the form of a tree. Searching is efficient and there is
grouping capability. A file is identified by an absolute or relative path name.
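The tree-structured scheme can be sketched as a tree of directories with files at the leaves. Below is a minimal illustration (the directory contents and file names are made up for the example): each directory is a dict mapping entry names to either a subdirectory (another dict) or file contents (a string), and an absolute path is resolved by walking down from the root.

```python
# Minimal sketch of a tree-structured directory: dicts are directories,
# strings are file contents. Names below are purely illustrative.

def resolve(root, path):
    """Walk an absolute path like '/home/alice/a.txt' down the tree."""
    node = root
    for part in path.strip("/").split("/"):
        if part:                  # skip empty components from the leading '/'
            node = node[part]     # KeyError if a component does not exist
    return node

root = {
    "home": {
        "alice": {"a.txt": "hello"},
        "bob":   {"a.txt": "world"},   # same file name under a different user
    }
}

print(resolve(root, "/home/alice/a.txt"))  # -> hello
print(resolve(root, "/home/bob/a.txt"))    # -> world
```

Note how the two users can each have a file named `a.txt`, which is exactly the naming benefit the two-level and tree-structured schemes provide.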

6.1.2 FILE STRUCTURE

A file structure must follow a format that the operating system can understand.

• A file has a certain defined structure according to its type.


• A text file is a sequence of characters organized into lines.
• A source file is a sequence of procedures and functions.
• An object file is a sequence of bytes organized into blocks that are
understandable by the machine.
• When an operating system defines different file structures, it must also contain the
code to support them. UNIX and MS-DOS support a minimal number of file
structures.

6.1.3 FILE SHARING

Allowing users to share files raises a major issue: protection. A general approach is to
provide controlled access to files through a set of operations such as read, write, delete,
list, and append, and then permit users to perform one or more of these operations.
One popular protection mechanism is a condensed version of an access list, where the
system recognizes three classifications of users for each file and directory:
• user
• group
• other
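The user/group/other scheme can be sketched with the 9-bit permission mode used by UNIX-like systems: three read/write/execute bits each for the owner, the owning group, and everyone else. The function and the example IDs below are illustrative, not part of any real API.

```python
# Hypothetical sketch of the user/group/other protection check:
# a 9-bit mode holds rwx bits for owner, group, and others.

R, W, X = 4, 2, 1  # permission bits within one 3-bit field

def may_access(mode, want, uid, gid, owner, group):
    """Pick the owner, group, or other field, then test the wanted bits."""
    if uid == owner:
        bits = (mode >> 6) & 7       # owner field
    elif gid == group:
        bits = (mode >> 3) & 7       # group field
    else:
        bits = mode & 7              # other field
    return (bits & want) == want

mode = 0o754  # rwx for owner, r-x for group, r-- for others
print(may_access(mode, R | W, uid=1, gid=10, owner=1, group=10))  # owner: True
print(may_access(mode, W, uid=2, gid=10, owner=1, group=10))      # group: False
```

The check deliberately stops at the first matching classification, which mirrors how UNIX evaluates permissions: a group member is judged by the group bits even if the "other" bits are more permissive.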

6.2.1 FILE ACCESS METHODS


File access methods refer to the manner in which the records of a file may be
accessed. There are several ways to access files:

• Sequential access
• Direct/Random access
• Indexed sequential access
SEQUENTIAL ACCESS
• In sequential access, records are accessed in sequence, i.e. the information in the
file is processed in order, one record after the other.
• This is the most primitive access method.
• Example: compilers usually access files in this fashion.

DIRECT/RANDOM ACCESS
• Records are stored randomly but accessed directly.
• Each record has its own address in the file, by means of which it can be directly
accessed for reading or writing.
• The records need not be in any sequence within the file, and they need not be in
adjacent locations on the storage medium.
• Magnetic and optical disks allow data to be stored and accessed randomly.
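With fixed-size records, direct access reduces to arithmetic: the byte offset of record n is n times the record size, so any record can be reached without reading the ones before it. A small sketch, using an in-memory byte stream and an assumed record size of 8 bytes:

```python
# Sketch of direct (random) access with fixed-size records: seek to
# n * RECORD_SIZE and read one record. RECORD_SIZE is an assumption
# for the example, and the "file" is an in-memory stand-in.

import io

RECORD_SIZE = 8  # bytes per record

def read_record(f, n):
    f.seek(n * RECORD_SIZE)          # jump straight to record n
    return f.read(RECORD_SIZE)

# An in-memory "file" holding records 0..3, each filled with its number
f = io.BytesIO(b"".join(i.to_bytes(1, "big") * RECORD_SIZE for i in range(4)))
print(read_record(f, 2))  # -> b'\x02\x02\x02\x02\x02\x02\x02\x02'
```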

INDEXED SEQUENTIAL ACCESS


• This mechanism is built on top of sequential access.
• An index is created for each file, containing pointers to the various blocks.
• The index is searched sequentially, and its pointer is then used to access the file
directly.
• For example, on a magnetic drum, records are stored sequentially on the tracks;
however, each record is assigned an index that can be used to access it directly.
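The idea can be sketched as follows: records sit in key order inside fixed blocks, and a small index holds the first key of each block. A lookup searches the index to pick the right block, then scans only that block sequentially. The data below is illustrative.

```python
# Sketch of indexed sequential access: search the index, then do a
# short sequential scan inside one block. Keys/values are made up.

import bisect

blocks = [
    [(10, "a"), (20, "b")],
    [(30, "c"), (40, "d")],
    [(50, "e"), (60, "f")],
]
index = [blk[0][0] for blk in blocks]  # first key of each block: [10, 30, 50]

def lookup(key):
    i = bisect.bisect_right(index, key) - 1  # block that may hold the key
    if i < 0:
        return None
    for k, v in blocks[i]:                   # sequential scan inside the block
        if k == key:
            return v
    return None

print(lookup(40))  # -> d
print(lookup(35))  # -> None
```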

6.2.2 FILE SYSTEMS


A file system provides a mapping between the logical and physical views of a file,
through a set of services and an interface. Simply put, the file system hides all the
device-specific aspects of file manipulation from users.
The basic services of a file system include:
• keeping track of files (knowing location),
• I/O support, especially the transmission mechanism to and from main memory,
• management of secondary storage,
• sharing of I/O devices,
• providing protection mechanisms for information held on the system.

6.2.3 RELIABILITY

Regular file operations often involve changing several disk blocks. What can happen
if the disk loses power or the machine crashes partway through?

• Some operations in progress may complete
• Some operations in progress may be lost
• An overwrite of a block may only partially complete

If the computer crashes between two of these operations, the file system can enter an
inconsistent state, confusing the OS to the point that the file system cannot be used at
all. To make a file system reliable we need to keep backups for restore (disaster
scenarios) and run consistency checks on the file system (e.g., UNIX fsck).

A file system wants durability:

– Data previously stored can be retrieved (perhaps after some recovery step), regardless
of failure.

6.3 FILE ALLOCATION METHODS

The allocation methods define how the files are stored in the disk blocks. There are
three main disk space or file allocation methods.
• Contiguous Allocation
• Linked Allocation
• Indexed Allocation
The main idea behind these methods is to provide:
• Efficient disk space utilization.
• Fast access to the file blocks.

1. CONTIGUOUS ALLOCATION:
• Each file occupies a contiguous address space on disk.
• Assigned disk address is in linear order.

Advantages
• It is simple and easy to implement.
• It gives excellent read performance.
• It supports random access into files.

Disadvantage
• External fragmentation is a major issue with this allocation technique.
• It may be difficult for a file to grow.
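A contiguous allocator can be sketched as a first-fit search over a free-block bitmap. The example also shows why external fragmentation hurts: enough free blocks may exist in total, yet no single run of them is long enough. The bitmap below is illustrative.

```python
# Sketch of contiguous allocation: first-fit search for a run of n
# consecutive free blocks in a free-block bitmap (True = free).

def first_fit(free, n):
    """Return the start index of the first run of n free blocks, or None."""
    run = 0
    for i, f in enumerate(free):
        run = run + 1 if f else 0
        if run == n:
            return i - n + 1     # run ends at i, so it starts n-1 earlier

free = [True, True, False, True, True, True, False, True]
print(first_fit(free, 3))  # blocks 3..5 are free -> 3
print(first_fit(free, 4))  # 5 free blocks in total, but no run of 4 -> None
```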

2. LINKED ALLOCATION (NON-CONTIGUOUS ALLOCATION):


• This method solves the problems associated with contiguous allocation. Here the
blocks of a single file can be scattered anywhere on the disk.
• The entire file is implemented as a linked list: each file is a linked list of disk
blocks.
• The directory maintained by the operating system contains the starting block
address.
• Each block of a file contains a pointer to the next block in the list.
• To create a new file, we need only create a new entry in the directory; there is no
need to search for sufficient contiguous space.

Advantages
• No external fragmentation
• Effectively used in sequential access file.
• Any free block can be utilized in order to satisfy the file block requests.
• File can continue to grow as long as the free blocks are available.

Disadvantage:
• There is the overhead of maintaining a pointer in every disk block.
• If the pointer of any disk block is lost, the file is truncated.
• It does not support direct access to the file.
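Linked allocation can be sketched with a pointer table: each disk block records the next block of the file, and the directory keeps only the starting block. The block numbers below are illustrative. Note that reaching block k of the file requires following k pointers, which is why direct access is not supported.

```python
# Sketch of linked allocation. next_block[b] is the pointer stored in
# block b (None marks end of file); the directory holds the start block.

next_block = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}
directory = {"myfile": 9}          # file name -> starting block

def file_blocks(name):
    """Follow the chain of pointers from the starting block."""
    b, out = directory[name], []
    while b is not None:
        out.append(b)
        b = next_block[b]
    return out

print(file_blocks("myfile"))  # -> [9, 16, 1, 10, 25]
```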

3. INDEXED ALLOCATION:
• Provides solutions to the problems of contiguous and linked allocation.
• An index block is created holding all the pointers to the file's blocks.
• Each file has its own index block, which stores the addresses of the disk space
occupied by the file.
• The directory contains the addresses of the index blocks of the files.

Advantages
• Supports direct access
• A bad data block causes the loss of only that block.

Disadvantages
• A bad index block could cause the loss of the entire file.
• The maximum size of a file depends on the number of pointers an index block can
hold.
• Having an index block for a small file is wasteful.
• There is more pointer overhead.
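In contrast to the linked scheme, indexed allocation finds logical block k with a single lookup, since the index block lists every data block of the file. A minimal sketch (the block numbers and file name are illustrative):

```python
# Sketch of indexed allocation: the directory points at an index block,
# and the index block lists all data blocks, enabling direct access.

index_blocks = {19: [9, 16, 1, 10, 25]}  # index block 19 -> data blocks
directory = {"jeep": 19}                 # directory entry -> index block

def data_block(name, k):
    """Return the disk address of logical block k of the file."""
    return index_blocks[directory[name]][k]

print(data_block("jeep", 0))  # -> 9
print(data_block("jeep", 3))  # -> 10
```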

6.4.1 FILE PROTECTION


File protection is provided by the operating system, which can designate files as read-only.
This allows both regular (read/write) and read-only files to be stored on the same disk
volume. Files can also be designated as hidden files, which makes them invisible to most
software.

A computer file needs protection, which is of two types:


1. Reliability
• Protection from physical damage. File systems can be damaged by
o Hardware problems (errors in read/write)
o Power surges or failures
o Head crashes
o Dirt and temperature
o Bugs in file system software
Reliability can be provided by
• keeping duplicate copies of files
• taking backups at regular intervals (daily/weekly/monthly)

2. Security
• Protection from improper access
▪ Protecting files from unauthorized access
▪ More important in a multi user system
▪ Provided by controlling access to files

NEED FOR PROTECTING FILES


• The need for protection arises from the ability to access files.
• There are two extreme ways to tackle the problem:
1. Prohibit access, providing complete protection.
2. Provide free access with no protection.
• Both approaches are too extreme for general use.

6.4.2 SECONDARY STORAGE MANAGEMENT


Brief:
Secondary storage devices are those whose memory is non-volatile, i.e. the stored data
remains intact even if the system is turned off. Here are a few things worth noting about
secondary storage.
• Secondary storage is also called auxiliary storage.
• Secondary storage is less expensive than primary memory such as RAM.
• The speed of secondary storage is also lower than that of primary storage.
• Hence, data that is less frequently accessed is kept in secondary storage.
• A few examples are magnetic disks, magnetic tapes, removable thumb drives, etc.

MAGNETIC DISK STRUCTURE
In modern computers, most of the secondary storage is in the form of magnetic disks.
Hence, knowing the structure of a magnetic disk is necessary to understand how the data in
the disk is accessed by the computer.

STRUCTURE OF A MAGNETIC DISK


A magnetic disk contains several platters. Each platter is divided into circular shaped tracks.
The length of the tracks near the centre is less than the length of the tracks farther from the
centre. Each track is further divided into sectors, as shown in the figure.
Tracks of the same distance from centre form a cylinder. A read-write head is used to read data
from a sector of the magnetic disk.
The speed of the disk is measured as two parts:

• Transfer rate: This is the rate at which the data moves from disk to the computer.

• Random access time: It is the sum of the seek time and rotational latency.

Seek time is the time taken by the arm to move to the required track.
Rotational latency is the time taken for the required sector to rotate under the read-write
head.
Even though the disk is physically arranged as sectors and tracks, the data is logically
arranged and addressed as an array of fixed-size blocks. The size of a block is typically 512
or 1024 bytes. Each logical block is mapped to a sector on the disk, sequentially; in this
way, each sector on the disk has a logical address.
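The sequential mapping from a logical block address (LBA) to a physical (cylinder, head, sector) position can be sketched with simple integer arithmetic, assuming a fixed geometry. Real disk controllers hide this mapping; the geometry numbers below are illustrative.

```python
# Sketch of LBA -> (cylinder, head, sector) mapping for an assumed
# geometry: HEADS tracks per cylinder, SECTORS sectors per track.

HEADS, SECTORS = 2, 4  # illustrative geometry

def lba_to_chs(lba):
    cylinder = lba // (HEADS * SECTORS)   # blocks per cylinder
    head = (lba // SECTORS) % HEADS       # track within the cylinder
    sector = lba % SECTORS                # sector within the track
    return cylinder, head, sector

print(lba_to_chs(0))   # -> (0, 0, 0)
print(lba_to_chs(11))  # -> (1, 0, 3)
```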

DISK SCHEDULING ALGORITHMS
On a typical multiprogramming system, there will usually be multiple disk access
requests at any point of time. So those requests must be scheduled to achieve good efficiency.
Disk scheduling is similar to process scheduling. Some of the disk scheduling algorithms are:
• First Come First Served (FCFS) or FIFO
• Shortest Service Time First (SSTF)
• SCAN—back and forth over disk
• CSCAN—circular SCAN or one way SCAN and fast return
• LOOK—look for a request before moving in that direction
• CLOOK—circular LOOK
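The effect of the scheduling policy can be seen by comparing the total head movement of FCFS, SSTF, and SCAN on the same request queue. The queue, starting head position (cylinder 53), and disk size (200 cylinders) below are illustrative, not from the text.

```python
# Sketch of three disk scheduling policies, measured by total head movement.

def fcfs(start, reqs):
    total, pos = 0, start
    for r in reqs:                      # serve strictly in arrival order
        total += abs(r - pos); pos = r
    return total

def sstf(start, reqs):
    total, pos, pending = 0, start, list(reqs)
    while pending:                      # always serve the closest request next
        r = min(pending, key=lambda x: abs(x - pos))
        pending.remove(r)
        total += abs(r - pos); pos = r
    return total

def scan(start, reqs, max_cyl=199):
    up = sorted(r for r in reqs if r >= start)
    down = sorted((r for r in reqs if r < start), reverse=True)
    total, pos = 0, start               # sweep up to the edge, then back
    for r in up + ([max_cyl] if down else []) + down:
        total += abs(r - pos); pos = r
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs(53, queue))  # -> 640 cylinders of head movement
print(sstf(53, queue))  # -> 236
print(scan(53, queue))  # -> 331
```

SSTF clearly beats FCFS here, but it can starve far-away requests; SCAN bounds the waiting time by sweeping the whole surface.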

SECONDARY STORAGE MANAGEMENT ISSUES


• Formatting
➢ Physical: divide the blank slate into sectors identified by headers containing such
information as the sector number; sector interleaving
➢ Logical: marking bad blocks; partitioning (optional) and writing a blank directory on
disk; installing file allocation tables and other relevant information (file system
initialization)
• Reliability
➢ disk interleaving or striping
➢ RAIDs (Redundant Array of Inexpensive Disks): various levels (e.g., level 0 is disk
striping)
• Controller caches
➢ newer disks have on-disk caches (128 KB to 512 KB)

OPERATING SYSTEM
CHAPTER 5: DEADLOCK

MARCH 14, 2020


BOSE, CUTTACK
BY NISHITA KINDO
CHAPTER 5: DEADLOCK
5.1 DEFINITION: A deadlock happens in an operating system when two or more processes
each need, to complete their execution, a resource that is held by another of the processes.

EXPLANATION: In the diagram above, process 1 holds resource 1 and needs to acquire
resource 2. Similarly, process 2 holds resource 2 and needs to acquire resource 1. Processes
1 and 2 are in deadlock, as each of them needs the other's resource to complete its execution,
but neither of them is willing to relinquish its resource.

5.1.1 NECESSARY CONDITIONS AND PREVENTIONS FOR DEADLOCK
(i.e. COFFMAN CONDITIONS)

A deadlock occurs if all four Coffman conditions hold simultaneously. These conditions
are not mutually exclusive.
The Coffman conditions are as follows:
• Mutual Exclusion
There should be a resource that can only be held by one process at a time. In the diagram
below, there is a single instance of Resource 1 and it is held by Process 1 only.

• Hold and Wait


A process can hold multiple resources and still request more resources that are held
by other processes. In this diagram, Process 2 holds Resource 2 and Resource 3 and
is requesting Resource 1, which is held by Process 1.

• No Pre-emption
A resource cannot be pre-empted from a process by force. A process can only release a
resource voluntarily. In the diagram below, Process 2 cannot pre-empt Resource 1 from
Process 1. It will only be released when Process 1 relinquishes it voluntarily after its
execution is complete.

• Circular Wait

A process is waiting for the resource held by the second process, which is waiting for
the resource held by the third process and so on, till the last process is waiting for a
resource held by the first process. This forms a circular chain. For example: Process 1
is allocated Resource2 and it is requesting Resource 1. Similarly, Process 2 is allocated
Resource 1 and it is requesting Resource 2. This forms a circular wait loop.

5.2 SYSTEM MODEL


• For the purposes of deadlock discussion, a system can be modelled as a collection of
limited resources(r1, r2,r3,….rn) which can be partitioned into different categories, to
be allocated to a number of processes(P1, P2 ,P3 ,...Pn ) each having different needs.
• Resource categories may include memory, printers, CPUs, open files, tape drives, CD-
ROMS, etc.
• By definition, all the resources within a category are equivalent, and a request of this
category can be equally satisfied by any one of the resources in that category. If this is
not the case (i.e. if there is some difference between the resources within a category),
then that category needs to be further divided into separate categories. For example,
"printers" may need to be separated into "laser printers" and "colour inkjet printers".

• Some categories may have a single resource.
• In normal operation a process must request a resource before using it, and release it
when it is done, in the following sequence:
1. Request - If the request cannot be immediately granted, the process must wait
until the resource(s) it needs become available. Examples: the system calls
open(), malloc(), new(), and request().
2. Use - The process uses the resource, e.g. prints to the printer or reads from the
file.
3. Release - The process relinquishes the resource, so that it becomes available
for other processes. Examples: close(), free(), delete(), and release().

5.3 DEADLOCK DETECTION


A deadlock can be detected by a resource scheduler as it keeps track of all the resources that
are allocated to different processes. After a deadlock is detected, it can be resolved using the
following methods:

1. All the processes that are involved in the deadlock are terminated. This is not a good
approach as all the progress made by the processes is destroyed.
2. Resources can be pre-empted from some processes and given to others till the
deadlock is resolved.
3. If resources have single instance:
In this case for Deadlock detection we can run an algorithm to check for cycle in the
Resource Allocation Graph. Presence of cycle in the graph is the sufficient condition
for deadlock.

In the above diagram, resource 1 and resource 2 have single instances. There is a cycle
R1 → P1 → R2 → P2. So, Deadlock is Confirmed.
4. If there are multiple instances of resources:
Detection of a cycle is a necessary but not sufficient condition for deadlock; in this
case the system may or may not be in deadlock, depending on the situation.
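For single-instance resources, the detection step above amounts to cycle detection in a directed graph. A minimal sketch with a depth-first search, using the R1 → P1 → R2 → P2 → R1 example from the text:

```python
# Sketch of deadlock detection on a resource-allocation graph with
# single-instance resources: DFS looks for a back edge (a cycle).

def has_cycle(graph):
    """graph: node -> list of successors (assign and request edges)."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}
    def dfs(n):
        color[n] = GREY
        for m in graph.get(n, []):
            if color.get(m, WHITE) == GREY:        # back edge -> cycle
                return True
            if color.get(m, WHITE) == WHITE and dfs(m):
                return True
        color[n] = BLACK
        return False
    return any(color[n] == WHITE and dfs(n) for n in graph)

# R1 -> P1 -> R2 -> P2 -> R1: deadlock confirmed
g = {"R1": ["P1"], "P1": ["R2"], "R2": ["P2"], "P2": ["R1"]}
print(has_cycle(g))                              # -> True
print(has_cycle({"R1": ["P1"], "P1": []}))       # -> False
```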
5.4 RESOURCE ALLOCATION GRAPH (RAG)
A resource allocation graph shows the state of the system in terms of processes and
resources.

We know that any graph consists of vertices and edges, i.e. G = (V, E). A RAG likewise
contains vertices and edges. In a RAG, vertices are of two types:
1. Process vertex - Every process is represented as a process vertex, generally drawn as
a circle.
2. Resource vertex - Every resource is represented as a resource vertex. It is of two
types:
• Single-instance resource - Represented as a box containing one dot; the number of
dots indicates how many instances of that resource type are present.
• Multi-instance resource - Also represented as a box, containing several dots.

Now coming to the edges of a RAG, there are two types:
1. Assign edge - An edge from a resource to a process, indicating that the resource is
already assigned to that process.
2. Request edge - An edge from a process to a resource, indicating that the process is
requesting that resource to complete its execution.

5.5 METHODS FOR HANDLING DEADLOCKS

• Generally speaking, there are three ways of handling deadlocks:


1. Deadlock prevention or avoidance - Do not allow the system to get into a
deadlocked state.
2. Deadlock detection and recovery - Abort a process or preempt some resources
when deadlocks are detected.
3. Ignore the problem altogether - If deadlocks only occur once a year or so, it
may be better to simply let them happen and reboot as necessary than to incur
the constant overhead and performance penalties associated with deadlock
prevention or detection. This is the approach that both Windows and UNIX
take.
• In order to avoid deadlocks, the system must have additional information about all
processes. In particular, the system must know what resources a process will or may
request in the future (ranging from a simple worst-case maximum to a complete
resource request and release plan for each process, depending on the particular
algorithm).
• Deadlock detection is fairly straightforward, but deadlock recovery requires either
aborting processes or preempting resources, neither of which is an attractive alternative.
• If deadlocks are neither prevented nor detected, then when a deadlock occurs the system
will gradually slow down, as more and more processes become stuck waiting for
resources currently held by the deadlock and by other waiting processes. Unfortunately,
this slowdown can be indistinguishable from a general system slowdown when a real-
time process has heavy computing needs.

5.6 DEADLOCK PREVENTION


We can prevent Deadlock by eliminating any of the above four conditions.
1. Eliminate Mutual Exclusion
It is not possible to violate mutual exclusion, because some resources, such as tape drives
and printers, are inherently non-shareable.
2. Eliminate Hold and Wait
Allocate all required resources to a process before it starts executing; this eliminates the
hold-and-wait condition, but it leads to low device utilization. For example, if a process
requires a printer only at a later time, but the printer is allocated before the process starts,
the printer remains blocked until the process has completed its execution.
Alternatively, a process must release its current set of resources before making a new
request. This solution may lead to starvation.

3. Eliminate No Pre-emption
Resources are pre-empted from a process when they are required by other, higher-priority
processes.

4. Eliminate Circular Wait
Each resource is assigned a numerical value, and a process may request resources only in
increasing order of numbering.
For example, if process P1 has been allocated resource R5, a later request by P1 for R4 or
R3 (numbered lower than R5) will not be granted; only requests for resources numbered
higher than R5 will be granted.
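The ordering rule can be sketched as a tiny request checker: a process may only request a resource numbered higher than everything it already holds, so no circular chain of waits can form. The resource numbers are illustrative.

```python
# Sketch of eliminating circular wait via a total ordering of resources.

def request(held, resource):
    """Grant only if `resource` is numbered above everything in `held`."""
    if held and resource <= max(held):
        return False          # would violate the ordering -> refused
    held.add(resource)
    return True

held = set()
print(request(held, 5))  # R5 granted -> True
print(request(held, 3))  # R3 < R5, refused -> False
print(request(held, 7))  # R7 > R5, granted -> True
```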

5.6.1 DEADLOCK AVOIDANCE


The general idea behind deadlock avoidance is to prevent deadlocks from ever happening,
by granting resources only when the resulting allocation state is known to be safe.
• This requires more information about each process, and tends to lead to low device
utilization. (i.e. it is a conservative approach.)
• In some algorithms the scheduler only needs to know the maximum number of each
resource that a process might potentially use. In more complex algorithms the scheduler
can also take advantage of the schedule of exactly what resources may be needed in
what order.
• When a scheduler sees that starting a process or granting resource requests may lead to
future deadlocks, then that process is just not started or the request is not granted.
• A resource allocation state is defined by the number of available and allocated
resources, and the maximum requirements of all processes in the system.
Safe State - A state is safe if the system can allocate all resources requested by all processes
(up to their stated maximums) without entering a deadlock state. If the system cannot fulfill
the requests of all processes, the state of the system is called unsafe.
More formally, a state is safe if there exists a safe sequence (an order of process
termination) of processes {P0, P1, P2, ..., PN} such that all of the resource requests for Pi
can be granted using the resources currently allocated to Pi and all processes Pj where j < i
(i.e. if all the processes prior to Pi finish and free up their resources, then Pi will be able to
finish as well, using the resources they have freed).

• If a safe sequence does not exist, then the system is in an unsafe state, which may lead
to deadlock. (All safe states are deadlock-free, but not all unsafe states lead to
deadlocks.)

Figure 7.6 - Safe, unsafe, and deadlocked state spaces.

5.6.2 BANKER’S ALGORITHM
Deadlock avoidance can be done with the Banker's Algorithm.
The Banker's Algorithm is a resource allocation and deadlock avoidance algorithm that tests
every request made by a process for resources. It checks for a safe state: if the system
remains in a safe state after granting the request, the request is allowed; if no safe state
would result, the request is not granted.
Banker's Algorithm Notations
Following Data structures are used to implement the Banker’s Algorithm:
Let ‘n’ be the number of processes in the system and ‘m’ be the number of resources types.
Available :
• It is a 1-d array of size ‘m’ indicating the number of available resources of each type.
• Available[ j ] = k means there are ‘k’ instances of resource type Rj available.
Max :
• It is a 2-d array of size ‘n*m’ that defines the maximum demand of each process in a
system.
• Max[ i, j ] = k means process Pi may request at most ‘k’ instances of resource type Rj.
Allocation :
• It is a 2-d array of size ‘n*m’ that defines the number of resources of each type
currently allocated to each process.
• Allocation[ i, j ] = k means process Pi is currently allocated ‘k’ instances of resource
type Rj
Need :
• It is a 2-d array of size ‘n*m’ that indicates the remaining resource need of each
process.
• Need[ i, j ] = k means process Pi may still need ‘k’ instances of resource type Rj
to complete its execution.
• Need[ i, j ] = Max[ i, j ] – Allocation[ i, j ]
Allocationi specifies the resources currently allocated to process Pi and Needi specifies the
additional resources that process Pi may still request to complete its task.
Banker’s algorithm consists of Safety algorithm and Resource request algorithm

BANKER’S ALGORITHM(Resource Request Algorithm)
Resource request algorithm enables you to represent the system behaviour when a specific
process makes a resource request.
Let us understand this through the following steps:
Step 1) If the requested instances of each resource type are not greater than the process's
declared Need, move to step 2; otherwise the request is in error, since the process has
exceeded its maximum claim.
Step 2) If the requested instances of each resource type are not greater than the available
resources of each type, move to the next step. Otherwise, the process must wait, because
sufficient resources are not available.
Step 3) The resources are tentatively allocated, as shown in the pseudocode below.
Available = Available - Request(x)
Allocation(x) = Allocation(x) + Request(x)
Need(x) = Need(x) - Request(x)
This final step is performed because the system must assume the resources have been
allocated, so that fewer resources are available after allocation. If the resulting state turns
out to be unsafe, the allocation is rolled back and the process waits.
Characteristics of Banker's Algorithm
Here are important characteristics of the Banker's algorithm:
• The system keeps enough resources to satisfy the requirement of at least one client.
• Whenever a process acquires all its resources, it must return them within a finite
period.
• When a process requests a resource that cannot safely be granted, it must wait.
• The system has a limited number of resources.
• Each process declares its maximum resource needs in advance.
Disadvantages of Banker's Algorithm
Here are the cons/drawbacks of using the Banker's algorithm:
• It does not allow a process to change its maximum need while executing.
• It guarantees that all requests are granted within a finite time, but that bound may
be long.
• All processes must know and state their maximum resource needs in advance.
• All processes must know and state their maximum resource needs in advance.
Summary:
• The Banker's algorithm is modelled on a banker deciding whether a loan can safely
be granted; in an operating system it is used to avoid deadlock.
• The notations used in the Banker's algorithm are 1) Available 2) Max 3) Allocation
4) Need.
• The resource request algorithm represents the system behaviour when a specific
process makes a resource request.
• The Banker's algorithm keeps enough resources available to satisfy the requirement
of at least one client.

• The biggest drawback of the Banker's algorithm is that it does not allow a process
to change its maximum need while executing.
Note: Deadlock prevention is stricter than deadlock avoidance.

5.6.3 SAFETY ALGORITHM


The algorithm for finding out whether or not a system is in a safe state can be described as
follows:
1) Let Work and Finish be vectors of length ‘m’ and ‘n’ respectively.
Initialize: Work = Available
Finish[i] = false for i = 1, 2, 3, ..., n
2) Find an i such that both
a) Finish[i] = false
b) Need[i] <= Work
If no such i exists, go to step (4).
3) Work = Work + Allocation[i]
Finish[i] = true
Go to step (2).
4) If Finish[i] = true for all i,
then the system is in a safe state.
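The safety algorithm above can be sketched directly in code, using the standard notation (Available, Allocation, Need = Max - Allocation). The five-process instance below is illustrative; the function returns whether the state is safe along with the safe sequence it found.

```python
# Sketch of the Banker's safety algorithm: repeatedly find an unfinished
# process whose Need fits in Work, let it finish, and reclaim its resources.

def is_safe(available, allocation, need):
    work = list(available)
    n, m = len(allocation), len(available)
    finish = [False] * n
    order = []                       # safe sequence discovered
    while True:
        for i in range(n):           # step 2: find a runnable process
            if not finish[i] and all(need[i][j] <= work[j] for j in range(m)):
                # step 3: it finishes and releases its allocation
                work = [work[j] + allocation[i][j] for j in range(m)]
                finish[i] = True
                order.append(i)
                break
        else:
            break                    # step 4: no such process remains
    return all(finish), order

allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
maximum    = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
need = [[maximum[i][j] - allocation[i][j] for j in range(3)] for i in range(5)]
safe, seq = is_safe([3, 3, 2], allocation, need)
print(safe, seq)  # -> True [1, 3, 0, 2, 4]
```

Note that more than one safe sequence may exist; this lowest-index-first scan finds one of them, which is all the algorithm needs.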

5.6.4 DEADLOCK RECOVERY


A traditional operating system such as Windows doesn't deal with deadlock recovery, as
it is a time- and space-consuming process. Real-time operating systems use deadlock
recovery.
Recovery methods:
• Killing processes: kill all the processes involved in the deadlock, or kill them one
by one, checking for deadlock again after each kill and repeating until the system
recovers from the deadlock.
• Resource pre-emption: resources are pre-empted from the processes involved in the
deadlock and allocated to other processes, so that the system may recover from the
deadlock. In this case, the pre-empted processes may starve.

Reference: https://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/7_Deadlocks.html

