UNIT-5 File System Interface and Operations
A file is a collection of similar records.
A record is a collection of related fields that can be treated as a unit
by application programs.
A field is a basic element of data. Any individual field contains a
single value.
FILE ATTRIBUTES:
Name: A file is named for the convenience of the user and is referred to
by its name. A name is usually a string of characters.
Identifier: This unique tag, usually a number, identifies the file within
the file system.
Type: Files are of many types. The type depends on the extension of the
file. Details of the different file types are given in the table below.
Location: This is a pointer to the location of the file on the storage
device.
Size: The current size of the file (in bytes, words, or blocks).
Protection: Access control information determines who can read, write,
execute, and so on.
Time, Date, User identification: This information may be kept for
creation, last modification, and last use.
FILE OPERATIONS
Creating a file: The OS first checks whether free space is available. If
there is no free space, the file cannot be created. If space is
available, the file is created and an entry for the new file is added to
the directory. The entry includes file attributes such as file name, file
location, size, etc.
Writing a file: The OS searches for the given file. If the file is not
found, a new file is created with the given name. If the file is found,
the existing file is opened. The system sets a write pointer to the
location in the file where the next write is to take place. After each
write operation, the write pointer is updated.
Reading a file: To read a file, the system first searches the directories
for the given file. If the file is found, the system keeps a read pointer
to the location in the file where the next read is to take place. After
each read operation, the read pointer is updated.
Repositioning within a file: This operation is also called file seek.
The current file position pointer is changed to a given value.
Deleting a file: To delete a file, the system first searches the
directory for the given file name, then releases the file space and
erases the directory entry.
Truncating a file: When a file is truncated, its entire content is
erased but the file itself continues to exist, with its other attributes
unchanged.
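The operations above map closely onto POSIX system calls. The following is a minimal sketch, assuming a POSIX system; the function name file_ops_demo and the message text are hypothetical and chosen only for illustration.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Runs create, write, reposition, read, truncate and delete on `path`.
 * `out` receives the bytes read back as a NUL-terminated string.
 * Returns the file size seen after truncation (0 on success), -1 on error. */
int file_ops_demo(const char *path, char *out, size_t out_size)
{
    const char msg[] = "hello, file";
    ssize_t n;
    off_t size;

    /* Create: adds a directory entry; O_TRUNC empties an existing file. */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    /* Write: the file position advances past the bytes written. */
    if (write(fd, msg, strlen(msg)) != (ssize_t)strlen(msg))
        goto fail;

    /* Reposition (seek): move the position to byte offset 7. */
    if (lseek(fd, 7, SEEK_SET) == (off_t)-1)
        goto fail;

    /* Read: starts at the current position, which then advances. */
    n = read(fd, out, out_size - 1);
    if (n == -1)
        goto fail;
    out[n] = '\0';

    /* Truncate: the content is erased but the file still exists. */
    if (ftruncate(fd, 0) == -1)
        goto fail;
    size = lseek(fd, 0, SEEK_END);

    close(fd);
    /* Delete: removes the directory entry and releases the space. */
    if (unlink(path) == -1)
        return -1;
    return (int)size;

fail:
    close(fd);
    unlink(path);
    return -1;
}
```

Calling file_ops_demo("/tmp/demo.txt", buf, sizeof buf) should leave the text from offset 7 onward in buf and remove the file again before returning.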
FILE TYPES
The name of a file consists of two parts: the name and the extension.
The file type depends on the extension. The extension defines what type
of file it is and which application software is used to open or run it.
Archive (.arc, .zip): Related files grouped into one file, sometimes
compressed for archiving or storage.
For example, consider a file consisting of 100 records. Let the current
position of the read/write head be at the 45th record. If we next want to
read the 75th record, the file is accessed sequentially: 45, 46, 47, ...,
74, 75. Even though only the 75th record is needed after the 45th, the
read/write head traverses all the records from 45 to 75. So, it is a
time-consuming method.
ii. Direct Access:
Direct access is also called relative access. Here records can be read
or written randomly, in no particular order. The direct access method is
based on a disk model of a file, because disks allow random access to
any file block. For example, consider a disk containing 512 blocks. Let
the position of the read/write head be at the 124th block. If the next
block to be read or written is the 256th block, we can jump from the
124th block to the 256th block directly, without any restrictions.
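The difference between the two access methods can be sketched with lseek() and read(). This is an illustration only: it assumes fixed-length records of RECORD_SIZE bytes and 0-based record numbers, and the function names are hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 16   /* assumed fixed record length in bytes */

/* Direct access: seek straight to record n (0-based) and read it. */
ssize_t read_record_direct(int fd, long n, char *rec)
{
    if (lseek(fd, (off_t)n * RECORD_SIZE, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, rec, RECORD_SIZE);
}

/* Sequential access: starting at record `from`, read one record after
 * another until record `to` has been read. Returns the number of records
 * that had to be read, or -1 on error. */
long read_record_sequential(int fd, long from, long to, char *rec)
{
    long reads = 0;

    if (lseek(fd, (off_t)from * RECORD_SIZE, SEEK_SET) == (off_t)-1)
        return -1;
    for (long i = from; i <= to; i++) {
        if (read(fd, rec, RECORD_SIZE) != RECORD_SIZE)
            return -1;
        reads++;
    }
    return reads;
}
```

Moving from record 45 to record 75 costs 31 reads in the sequential version but a single seek and read in the direct version.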
DIRECTORY STRUCTURE
A computer system may hold thousands to millions of files, which makes
them very hard to manage. The directory concept was introduced to manage
these files: the files are grouped, and each group is placed in a
container called a directory. In Windows, directories are also called
folders. A directory structure provides a mechanism for organizing many
files in the file system. A directory can contain multiple files, and it
can also contain other directories. The directory holds information
about file attributes such as location, ownership, size, etc.
The directory structures supported by operating systems are:
i. Single-level directory:
This directory system contains only one directory, called the root
directory. All files are saved in this directory. When the number of
files increases, or when the system has more than one user, a
single-level directory is not useful. Since all the files are in the same
directory, they must have unique names. For example, if user-1 creates a
file called sample and user-2 later also creates a file called sample,
then user-2's file will overwrite user-1's file.
ii. Two-level directory:
In the two-level directory structure, the root directory is called the
master file directory (MFD). Each user has their own user file directory
(UFD). The MFD is indexed by user name, and each entry points to the UFD
for that user. The files of a particular user are stored in that user's
UFD. In this model, the root directory is the MFD; user1, user2, user3
and user4 are user-level directories; and f1, f2, ..., f8 are files.
Different users can have files with the same name. This is shown in the
diagram below.
iii. Tree-structured directory:
A two-level directory eliminates name conflicts among users, but it is
not satisfactory for users with a large number (hundreds to millions) of
files. To handle this, each user can create sub-directories and place
files of the same type in a sub-directory. A sub-directory can itself
have another sub-directory, and so on. The result can be viewed as a
tree-like structure, in which each user can have as many directories as
needed. The user can change the current directory whenever desired. If a
needed file is not in the current directory, the user must either specify
a path name or change the current directory. Paths can be of two types:
a) Absolute path: It begins at the root and follows a path down to the
specified file.
Ex: \user2\programs\a1.java
b) Relative path: It defines a path from the current directory.
Ex: programs\a1.java if user2 is the current directory.
iv. Acyclic graph directory:
When multiple users are working on the same project, the project files
can be stored in a common sub-directory and shared among those users.
This type of directory is called an acyclic graph directory. The common
directory is declared as a shared directory. The graph contains no
cycles. With shared files, changes made by one user are visible to the
other users. A file may now have multiple absolute paths. When a shared
directory or shared file is deleted, all pointers to the directory or
file must also be removed. In the example, user1 and user2 share the
same directory called Programs. Similarly, user3 and user4 share the
same file called t3.txt. This is shown in the diagram below.
v. General graph directory:
When links are added to an existing tree-structured directory, the tree
structure is destroyed, resulting in a simple graph structure. Cycles
are allowed within the directory structure, and a directory can have
more than one parent directory. The advantage of this type of directory
is that sharing is possible and the structure is flexible. Traversal,
however, must take care not to follow a cycle indefinitely.
PROTECTION
The information stored in a computer system should be protected from
improper access. Protection mechanisms provide controlled access by
limiting the types of file access that can be made. Access is permitted
or denied depending on several factors, such as the type of user and the
type of access requested. The most common approach to the protection
problem is to make access dependent on the identity of the user.
Different users need different types of access to a file. An access
control list (ACL), specifying user names and the types of access allowed
for each user, is associated with each file. When a user requests access,
the OS checks the ACL associated with that file. If that user is listed
for the requested access, the access is allowed. Otherwise a protection
violation occurs, and the user process is denied access to the file.
Access can be provided to the following classes of users: owner, group,
and universe (all other users).
The types of access can be: read (r), write (w), and execute (x).
Example: file_name rwxrw-r--
On the given file_name, the owner can read, write and execute, the group
can read and write, and the universe (other users) can only read.
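The rwxrw-r-- notation can be derived from the permission bits that POSIX keeps for each file. A minimal sketch, assuming the standard S_I* constants from sys/stat.h; the helper name mode_to_string is hypothetical.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Fills `out` with a 9-character string such as "rwxrw-r--". */
void mode_to_string(mode_t mode, char out[10])
{
    static const mode_t bits[9] = {
        S_IRUSR, S_IWUSR, S_IXUSR,   /* owner */
        S_IRGRP, S_IWGRP, S_IXGRP,   /* group */
        S_IROTH, S_IWOTH, S_IXOTH    /* universe (other users) */
    };
    static const char flags[] = "rwxrwxrwx";

    for (int i = 0; i < 9; i++)
        out[i] = (mode & bits[i]) ? flags[i] : '-';
    out[9] = '\0';
}
```

For the octal mode 0764, the owner digit 7 gives rwx, the group digit 6 gives rw-, and the universe digit 4 gives r--.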
Other protection approaches: Maintain a password for each file.
Disadvantages
The number of passwords that a user needs to remember may become large
if different passwords are set for different files.
If only one password is used for all files, then once it is discovered,
all files are accessible.
Encryption is an important tool in protection, security and
authentication. The process involves two steps:
Encryption is the process of converting a normal message (plaintext)
into a meaningless message (ciphertext).
Decryption is the process of converting the meaningless message
(ciphertext) back into its original form (plaintext).
FILE SYSTEM STRUCTURE
A file system must provide an efficient mechanism to store, locate and
retrieve files in a convenient way. Most operating systems use a layered
approach for every task, including file systems. Every layer of the file
system is responsible for some activities. The image shown below
illustrates how the file system is divided into different layers, and
also the functionality of each layer.
When an application program asks for a file, the first request is
directed to the logical file system. The logical file system contains
the metadata of the file and the directory structure. It maintains the
file structure via file control blocks. A file control block (inode in
Unix file systems) contains information about the file, such as
ownership, permissions, and the location of the file contents. If the
application program does not have the required permissions for the file,
this layer reports an error. The logical file system also verifies the
path to the file.
Files are stored on and retrieved from the hard disk. The hard disk is
divided into tracks, and each track is divided into sectors; the file
system groups one or more sectors into a disk block. The file content is
divided into logical blocks, and each logical block is mapped to and
stored in a hard disk block. Therefore, in order to store and retrieve
files, the logical blocks need to be mapped to physical blocks. This
mapping is done by the file organization module. It is also responsible
for free space management.
Boot Control Block
The boot control block contains all the information needed to boot an
operating system from the hard disk. It is called the boot block in the
UNIX file system and the partition boot sector in NTFS (Windows).
Volume Control Block
The volume control block contains all the information regarding that
volume, such as the number of blocks, the size of each block, the
partition table, and pointers to free blocks and free FCB blocks. In the
UNIX file system it is known as the superblock. In NTFS, this
information is stored inside the master file table.
Directory Structure (per file system)
A directory structure (per file system) contains file names and pointers
to the corresponding File Control Blocks (FCBs). In UNIX, it includes
the inode numbers associated with file names.
File Control Block
The file control block contains all the details about the file, such as
ownership details, permission details, file size, etc. In UFS, this
detail is stored in the inode. In NTFS, this information is stored
inside the master file table as a relational database structure. A
typical file control block is shown in the image below.
File permissions
File dates (create, access, write)
File owner, group, ACL
File size
File data blocks
Figure: File Control Block
Next, an entry is made in the per-process open file table, with a
pointer to the entry in the system-wide open file table and some other
fields. These other fields include a pointer to the current location in
the file (for the next read/write operation) and the access mode in
which the file is open. The open() call returns a pointer to the
appropriate entry in the per-process file system table. All file
operations are performed via this pointer. When a process closes the
file, the per-process table entry is removed and the open count of the
system-wide entry is decremented. When all users that have opened the
file have closed it, any updated metadata is copied back to the
disk-based directory structure and the system-wide open file table entry
is removed.
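The interplay of the two tables can be sketched in a few lines of C. This is a toy model under stated assumptions, not kernel code: the table sizes, the integer file_id standing in for an FCB, and the names sketch_open and sketch_close are all hypothetical, and the sketch returns a table index rather than a pointer.

```c
#include <stddef.h>

#define MAX_SYS  8   /* assumed size of the system-wide table */
#define MAX_PROC 8   /* assumed size of each per-process table */

struct sys_entry {           /* system-wide open file table entry */
    int in_use;
    int file_id;             /* stands in for the FCB / inode */
    int open_count;          /* number of opens that refer to this file */
};

struct proc_entry {          /* per-process open file table entry */
    struct sys_entry *sys;   /* pointer to the system-wide entry */
    long position;           /* current location for the next read/write */
    int mode;                /* access mode in which the file is open */
};

static struct sys_entry sys_table[MAX_SYS];

/* Returns an index into the per-process table, or -1 if a table is full. */
int sketch_open(struct proc_entry *ptab, int file_id, int mode)
{
    struct sys_entry *se = NULL, *free_se = NULL;

    /* Reuse the system-wide entry if the file is already open. */
    for (int i = 0; i < MAX_SYS; i++) {
        if (sys_table[i].in_use && sys_table[i].file_id == file_id)
            se = &sys_table[i];
        else if (!sys_table[i].in_use && free_se == NULL)
            free_se = &sys_table[i];
    }
    if (se == NULL) {
        if (free_se == NULL)
            return -1;
        se = free_se;
        se->in_use = 1;
        se->file_id = file_id;
        se->open_count = 0;
    }
    /* Add the per-process entry that points at the system-wide entry. */
    for (int fd = 0; fd < MAX_PROC; fd++) {
        if (ptab[fd].sys == NULL) {
            ptab[fd].sys = se;
            ptab[fd].position = 0;
            ptab[fd].mode = mode;
            se->open_count++;
            return fd;
        }
    }
    if (se->open_count == 0)
        se->in_use = 0;      /* undo a freshly created system-wide entry */
    return -1;
}

/* Removes the per-process entry and decrements the open count. */
int sketch_close(struct proc_entry *ptab, int fd)
{
    if (fd < 0 || fd >= MAX_PROC || ptab[fd].sys == NULL)
        return -1;
    struct sys_entry *se = ptab[fd].sys;
    ptab[fd].sys = NULL;
    if (--se->open_count == 0)
        se->in_use = 0;      /* last close: system-wide entry is removed */
    return 0;
}
```

Two processes that open the same file each get their own position and mode, but both point at one shared system-wide entry whose open count is 2.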
FILE ALLOCATION METHODS
An allocation method refers to how disk blocks are allocated for files:
Contiguous allocation:
A single continuous set of blocks is allocated to a file at the time of
file creation.
Each file occupies a set of contiguous blocks.
It is simple and gives the best performance in most cases.
Non-contiguous allocation:
i. Linked allocation
Allocation is on an individual block basis.
Each block contains a pointer to the next block in the chain, and the
last block contains a NULL pointer.
The directory maintains the file names with their starting and ending
blocks, as shown in the diagram.
This allocation method utilizes the free blocks effectively.
ii. Indexed allocation
Each file has its own index block(s) of pointers to its data blocks.
If we need data that is stored in a particular block number, there is no
need to traverse from the starting block of the file as in linked
allocation.
We can get all the block numbers allocated to a file from the index
block.
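How each method locates the k-th block of a file can be sketched as follows. The arrays stand in for on-disk structures, and the names and the NIL marker are assumptions made for illustration.

```c
#define NIL (-1)   /* marks "no block", e.g. the NULL pointer in a chain */

/* Contiguous: the k-th block is a fixed offset from the start block. */
int contiguous_block(int start, int length, int k)
{
    return (k >= 0 && k < length) ? start + k : NIL;
}

/* Linked: next[b] is the pointer stored in block b; follow k links. */
int linked_block(const int *next, int start, int k)
{
    int b = start;

    if (k < 0)
        return NIL;
    while (k-- > 0 && b != NIL)
        b = next[b];
    return b;
}

/* Indexed: the index block lists the data blocks; one lookup suffices. */
int indexed_block(const int *index_block, int count, int k)
{
    return (k >= 0 && k < count) ? index_block[k] : NIL;
}
```

Linked allocation needs k pointer reads to reach block k, whereas contiguous and indexed allocation reach it in one step.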
FREE SPACE MANAGEMENT
Storage space on the hard disk is limited, so the space of deleted files
must be reused for the allocation of new files. The system maintains a
free space list to keep track of the free disk blocks. These free blocks
can be allocated to a new file or directory. When we want to create a
file, free space, if available, is allocated to the new file; otherwise
the file is not created. The process of finding and managing the free
blocks of the disk is called free space management. The methods to
implement a free space list are:
Bitmap
Linked list
Grouping
Counting
i. Bitmap or Bit Vector
A bitmap or bit vector is a series of binary bits (0 and 1) where each
bit corresponds to one disk block.
The bit 0 indicates the block is allocated.
The bit 1 indicates the block is free.
Consider the disk blocks shown in Figure 1, where white blocks are
allocated to files and grey blocks are free. They can be represented by
a bitmap of 32 bits as:
Disk block no.   Bit vector
 0 -  7          0 0 1 1 1 1 0 0
 8 - 15          1 1 1 1 1 1 0 0
16 - 23          0 1 1 0 0 0 0 0
24 - 31          0 1 1 1 0 0 0 0
The bit vector: 00111100111111000110000001110000
The main advantage of the bitmap is that it is simple to understand and
efficient in finding the free blocks on the disk.
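The 32-bit vector above fits in a single machine word, which makes the search for a free block easy to sketch. The code assumes block 0 is the leftmost (most significant) bit, so the vector reads 0x3CFC6070 in hexadecimal; the function names are hypothetical.

```c
#include <stdint.h>

/* The example bit vector: 1 = free, 0 = allocated, block 0 leftmost. */
#define EXAMPLE_BITMAP UINT32_C(0x3CFC6070)

/* Returns 1 if `block` (0..31) is free, 0 if it is allocated. */
int block_is_free(uint32_t bitmap, int block)
{
    return (int)((bitmap >> (31 - block)) & 1u);
}

/* Returns the lowest-numbered free block, or -1 if none is free. */
int first_free_block(uint32_t bitmap)
{
    for (int b = 0; b < 32; b++)
        if (block_is_free(bitmap, b))
            return b;
    return -1;
}

/* Returns the bitmap with `block` marked as allocated (bit cleared). */
uint32_t allocate_block(uint32_t bitmap, int block)
{
    return bitmap & ~(UINT32_C(1) << (31 - block));
}

/* Counts the free blocks recorded in the bitmap. */
int count_free_blocks(uint32_t bitmap)
{
    int n = 0;

    for (int b = 0; b < 32; b++)
        n += block_is_free(bitmap, b);
    return n;
}
```

For the example vector, the first free block is block 2 and there are 15 free blocks in total.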
ii. Linked List:
In this approach, the free disk blocks are linked together in a linked
list.
Free blocks are linked with each other.
A free block contains a pointer to the next free block.
The block number of the very first free disk block is stored at a
separate location on disk and is called the free list head.
The last free block contains a null pointer indicating the end of the
free list.
Figure 2: Linked list method of free disk blocks
In Figure 2, the free space list head points to Block 2, which points to
Block 3, the next free block, and so on. A drawback of this method is
the extra I/O required to traverse the free space list.
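Allocation from a linked free list always takes the block at the head. A small in-memory sketch, in which the next array stands in for the pointer stored inside each free block on disk; the struct and function names are hypothetical.

```c
#define END_OF_LIST (-1)   /* plays the role of the null pointer */

struct free_list {
    int head;    /* block number of the first free block */
    int *next;   /* next[b]: pointer stored inside free block b */
};

/* Removes and returns the first free block, or END_OF_LIST if none. */
int alloc_block(struct free_list *fl)
{
    int b = fl->head;

    if (b == END_OF_LIST)
        return END_OF_LIST;
    fl->head = fl->next[b];   /* on a real disk this costs a block read */
    return b;
}

/* Returns block b to the front of the free list. */
void free_block(struct free_list *fl, int b)
{
    fl->next[b] = fl->head;
    fl->head = b;
}
```

Both operations touch only the head, but visiting every free block requires following the chain one block at a time, which is the I/O cost noted above.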
iii. Grouping
This approach forms groups based on contiguous free blocks.
The first free block stores the address of the first group of contiguous
free blocks.
The last free block in the first group stores the address of the second
group of contiguous free blocks, and so on.
An advantage of this approach is that the addresses of a group of free
disk blocks can be found easily.
Figure 3: Grouping method of free disk blocks
iv. Counting
This approach stores the address of the first free disk block and the
number of free contiguous disk blocks that follow the first block. The
free space list contains the address of the first free block and the
count for each group of contiguous free disk blocks. This method of free
space management is similar to the method of allocating blocks. We can
store these entries in a B-tree in place of the linked list.
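The (address, count) entries can be derived from the 32-bit vector used in the bitmap section. The sketch below assumes the same convention (1 = free, block 0 is the leftmost bit); struct free_run and bitmap_to_runs are hypothetical names.

```c
#include <stdint.h>

struct free_run {
    int start;   /* address of the first free block in the run */
    int count;   /* number of contiguous free blocks in the run */
};

/* Converts a 32-bit free-space bitmap into (start, count) entries.
 * Returns the number of entries written, at most max_runs. */
int bitmap_to_runs(uint32_t bitmap, struct free_run *runs, int max_runs)
{
    int n = 0;
    int b = 0;

    while (b < 32 && n < max_runs) {
        if (!((bitmap >> (31 - b)) & 1u)) {   /* allocated: skip it */
            b++;
            continue;
        }
        int start = b;
        while (b < 32 && ((bitmap >> (31 - b)) & 1u))
            b++;                              /* extend the free run */
        runs[n].start = start;
        runs[n].count = b - start;
        n++;
    }
    return n;
}
```

For the vector 00111100111111000110000001110000 this yields four entries: (2, 4), (8, 6), (17, 2) and (25, 3), which is far shorter than listing all 15 free blocks individually.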
USAGE OF OPEN, CREATE, READ, WRITE, CLOSE, LSEEK, STAT, IOCTL SYSTEM
CALLS
i. open() system call:
The system call open() is used to open or create a file. The syntax is:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *path, int flags, [mode_t mod]);
ii. creat() system call:
The argument "path" specifies the name of the file, while "mod" defines
the access rights. The access rights are given in the above topic
(open() system call). If the file to be created does not exist, a new
i-node is allocated and a link is added to the directory. If the file
exists, it loses its contents and is opened for writing. In this case,
the second argument is ignored, and the old ownership and access
permissions are not modified. The function returns the smallest
available file descriptor, or -1 in case of an error. The system call
creat() is equivalent to:
open(path, O_WRONLY | O_CREAT | O_TRUNC, mod);
iii. read() system call:
When we want to read a certain number of bytes starting from the current
position in a file, we use the read system call. The syntax is:
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t noct);
It reads up to "noct" bytes from the open file referred to by the file
descriptor "fd" and puts them into the buffer "buf". The file pointer
(current position) is advanced automatically by the number of bytes
read. The function returns the number of bytes read, 0 at end of file
(EOF), and -1 in case an error occurred.
iv. write() system call:
The write() system call writes "noct" bytes from the buffer "buf" into
the open file referred to by the file descriptor "fd". The function
returns the number of bytes written, or -1 in case of an error.
v. close() system call:
This system call is used to close a file and release the assigned file
descriptor "fd".
#include <unistd.h>
int close(int fd);
The function returns 0 if the file is closed successfully and -1 in case
of an error. When the process terminates, all the files opened by it are
closed automatically.
vi. lseek() system call:
#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fd, off_t offset, int ref);
The first argument "fd" is the file descriptor of an open file. The
second argument "offset" is the number of positions to move. The third
argument "ref" gives the position from which the displacement of the
file pointer is measured.
If "ref" is set to SEEK_SET, the positioning is done from the beginning
of the file.
If "ref" is set to SEEK_CUR, the positioning is done from the current
position.
If "ref" is set to SEEK_END, the positioning is done from the end of the
file.
The function returns the new current position, measured from the
beginning of the file, or -1 in case of an error.
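The three reference points can be seen in a short sketch, assuming a seekable regular file. The helper names file_size_by_seek and read_tail are hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/* Returns the size of the open file by seeking to its end, then restores
 * the previous position. Returns -1 on error. */
off_t file_size_by_seek(int fd)
{
    off_t saved = lseek(fd, 0, SEEK_CUR);   /* remember current position */
    if (saved == (off_t)-1)
        return -1;
    off_t end = lseek(fd, 0, SEEK_END);     /* offset 0 from end = size */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, saved, SEEK_SET) == (off_t)-1)
        return -1;
    return end;
}

/* Reads the last n bytes of the file into buf. */
ssize_t read_tail(int fd, void *buf, size_t n)
{
    if (lseek(fd, -(off_t)n, SEEK_END) == (off_t)-1)
        return -1;
    return read(fd, buf, n);
}
```

A negative offset with SEEK_END moves backwards from the end of the file, which is how read_tail reaches the final bytes without reading what precedes them.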
vii. stat() system call:
The system call stat() is used to read the attributes of a file. The
syntax is:
#include <sys/types.h>
#include <sys/stat.h>
int stat(const char *path, struct stat *buf);
The first argument "path" gives the file name. The second argument "buf"
is used to store the file attributes read from the i-node of the given
file. The file attributes include the file access types, owner, file
size, last access time, last modification time, etc. On success, the
function returns zero, and on error, -1 is returned. The structure
struct stat is described in the sys/stat.h header and has the following
fields:
struct stat {
    mode_t  st_mode;   /* file access types and rights */
    ino_t   st_ino;    /* i-node */
    dev_t   st_dev;    /* identifier of device containing file */
    nlink_t st_nlink;  /* nr of links */
    uid_t   st_uid;    /* owner ID */
    gid_t   st_gid;    /* group ID */
    off_t   st_size;   /* ordinary file size */
    time_t  st_atime;  /* last time it was accessed */
    time_t  st_mtime;  /* last time it was modified */
    time_t  st_ctime;  /* last time settings were changed */
};
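A short sketch of reading two of these fields with stat(). The helper names file_size and is_regular_file are hypothetical; S_ISREG is the standard macro for testing the file type stored in st_mode.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Returns the size of the file in bytes, or -1 if stat() fails. */
off_t file_size(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return sb.st_size;
}

/* Returns 1 if path names a regular file, 0 otherwise (or on error). */
int is_regular_file(const char *path)
{
    struct stat sb;

    return stat(path, &sb) == 0 && S_ISREG(sb.st_mode);
}
```

Because stat() takes a path rather than a file descriptor, the file does not need to be opened first.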
viii. ioctl() system call:
IOCTL stands for Input and Output Control. The system call ioctl() is
used to interact with device driver files. Its major use is to handle
device-specific operations for which the kernel does not provide a
dedicated system call. It manipulates the underlying device parameters
of device driver files. Some practical uses of ioctl() are:
Ejecting the media from a CD drive
Changing the baud rate of a serial port
Adjusting the volume
Reading or writing device registers, etc.
The syntax is:
#include <sys/ioctl.h>
int ioctl(int fd, int request, ... /* arguments */);
The first argument "fd" is the file descriptor of an open file. The
ioctl command is executed on this open file, which is generally a device
file. The second argument "request" is a device-dependent request code.
The request code varies from device to device. The ioctl command
implements the task associated with the request code to achieve the
desired functionality. The third argument is an untyped pointer to
memory. It is traditionally char *argp. An ioctl() request has encoded
in it whether the argument is an in parameter or an out parameter, and
the size of the argument argp in bytes. Macros and defines used in
specifying an ioctl() request are located in the file <sys/ioctl.h>.
Usually, on success zero or a positive value is returned. On error, -1
is returned.
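As one concrete request code, FIONREAD asks the driver how many bytes are waiting to be read. This sketch assumes a platform where <sys/ioctl.h> defines FIONREAD (Linux and the BSDs do) and where the descriptor type supports it, for example a pipe or terminal; the helper name bytes_available is hypothetical.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Returns the number of bytes waiting to be read on fd, or -1 on error.
 * The third ioctl argument is an "out" parameter: the driver fills n. */
int bytes_available(int fd)
{
    int n = 0;

    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}
```

After 5 bytes are written into a pipe, calling bytes_available() on the read end should report 5 without consuming the data.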