Unit 5-Part2 (File Systems)
File Concept
File Operations
o Creating a file
o Writing a file
o Reading a file
o Deleting a file
o Truncating a file.
File types:
Access Methods
Direct Access
Jump to any record and read that record. Operations supported include:
o read n - read record number n. ( Note an argument is now
required. )
o write n - write record number n. ( Note an argument is now
required. )
o jump to record n - could be 0 or the end of file.
o Query current record - used to return back to this record later.
Sequential access can be easily emulated using direct access. The
inverse is complicated and inefficient.
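The emulation of sequential access on top of direct access can be sketched as follows. This is a minimal illustration, assuming fixed-length records of 16 bytes; the class name `DirectAccessFile`, the `cp` (current position) counter, and the record size are hypothetical choices, not part of any particular system.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes


class DirectAccessFile:
    """Fixed-length records addressed by record number (hypothetical class)."""

    def __init__(self, stream):
        self.stream = stream
        self.cp = 0  # current position, used only for sequential emulation

    def read(self, n):
        """read n: return record number n."""
        self.stream.seek(n * RECORD_SIZE)
        return self.stream.read(RECORD_SIZE)

    def write(self, n, data):
        """write n: overwrite record number n, padding to the record size."""
        self.stream.seek(n * RECORD_SIZE)
        self.stream.write(data.ljust(RECORD_SIZE, b" ")[:RECORD_SIZE])

    def read_next(self):
        """Sequential read emulated as: read cp, then cp = cp + 1."""
        record = self.read(self.cp)
        self.cp += 1
        return record


f = DirectAccessFile(io.BytesIO())
for i, name in enumerate([b"alpha", b"beta", b"gamma"]):
    f.write(i, name)

second = f.read(1).rstrip()  # direct: jump straight to record 1
in_order = [f.read_next().rstrip() for _ in range(3)]  # sequential emulation
```

Going the other way, a purely sequential file would have to be rewound and read from the start to reach record n, which is why the inverse emulation is inefficient.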
Index access
Directory Structure
Storage Structure
File Sharing
Multiple Users
o The owner ( user ) who owns the file, and who can control its access.
o The group of other user IDs that may have some special access to the
file.
o What access rights are afforded to the owner ( User ), the Group, and
to the rest of the world ( the universe, a.k.a. Others. )
o Various forms of distributed file systems allow remote file systems to be mounted
onto a local directory structure, and accessed using normal file access commands.
( The actual files are still transported across the network as needed, possibly using
ftp as the underlying transport mechanism. )
o The WWW has made it easy once again to access files on remote systems without
mounting their filesystems, generally using HTTP or ( anonymous ) ftp as the
underlying file transport mechanism.
Client-Server Model
When one computer system remotely mounts a filesystem that is physically located on
another system, the system which physically owns the files acts as a server, and the
system which mounts them is the client.
User IDs and group IDs must be consistent across both systems for the system to work
properly. ( I.e. this is most applicable across multiple computers managed by the same
organization, shared by a common group of users. )
The same computer can be both a client and a server. ( E.g. cross-linked file systems. )
The Domain Name System, DNS, provides for a unique naming system
across all of the Internet.
Files must be kept safe for reliability ( against accidental damage ), and
protection ( against deliberate malicious access. ) The former is usually
managed with backup copies. This section discusses the latter.
One simple protection scheme is to remove all access to a file. However this
makes the file unusable, so some sort of controlled access must be arranged.
Types of Access
Access Control
File-System Structure
Hard disks have two important properties that make them suitable for
secondary storage of files in file systems: (1) Blocks of data can be rewritten in
place, and (2) they are direct access, allowing any block of data to be accessed
with only ( relatively ) minor movements of the disk heads and rotational
latency. ( See Chapter 12 )
Disks are usually accessed in physical blocks, rather than a byte at a time.
Block sizes may range from 512 bytes to 4K or larger.
File systems organize storage on disk drives, and can be viewed as a layered
design:
o At the lowest layer are the physical devices, consisting of the magnetic
media, motors & controls, and the electronics connected to them and
controlling them. Modern disks put more and more of the electronic
controls directly on the disk drive itself, leaving relatively little work for
the disk controller card to perform.
o I/O Control consists of device drivers, special software programs ( often
written in assembly ) which communicate with the devices by reading
and writing special codes directly to and from memory addresses
corresponding to the controller card's registers. Each controller card
( device ) on a system has a different set of addresses ( registers,
a.k.a. ports ) that it listens to, and a unique set of command codes and
result codes that it understands.
o The basic file system level works directly with the device drivers in
terms of retrieving and storing raw blocks of data, without any
consideration for what is in each block. Depending on the system, blocks
may be referred to with a single block number, ( e.g. block # 234234 ),
or with head-sector-cylinder combinations.
o The file organization module knows about files and their logical blocks,
and how they map to physical blocks on the disk. In addition to
translating from logical to physical blocks, the file organization module
also maintains the list of free blocks, and allocates free blocks to files as
needed.
o The logical file system deals with all of the metadata associated with a
file ( UID, GID, mode, dates, etc ), i.e. everything about the file except
the data itself. This level manages the directory structure and the
mapping of file names to file control blocks, FCBs, which contain all of
the metadata as well as block number information for finding the data
on the disk.
Figure - Layered file system.
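The two ways of naming a block at the basic file system level can be related by the conventional cylinder/head/sector arithmetic. The sketch below assumes a geometry of 16 heads per cylinder and 63 sectors per track; those constants and the function names are illustrative assumptions, and real drives typically hide their true geometry behind a single block number.

```python
HEADS_PER_CYLINDER = 16  # assumed geometry
SECTORS_PER_TRACK = 63   # assumed geometry


def chs_to_block(cylinder, head, sector):
    """Map a cylinder/head/sector triple to a single linear block number.

    Sectors are conventionally numbered from 1; cylinders and heads from 0.
    """
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1)


def block_to_chs(block):
    """Inverse mapping from a linear block number back to a triple."""
    cylinder, rest = divmod(block, HEADS_PER_CYLINDER * SECTORS_PER_TRACK)
    head, sector_index = divmod(rest, SECTORS_PER_TRACK)
    return cylinder, head, sector_index + 1
```

Under this assumed geometry, `block_to_chs(234234)` and `chs_to_block` are exact inverses of each other, so either form identifies the same raw block.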
Directory Implementation
Linear List
A linear list is the simplest and easiest directory structure to set up, but it
does have some drawbacks.
Finding a file ( or verifying one does not already exist upon creation )
requires a linear search.
Deletions can be done by shifting all later entries up one slot, flagging
an entry as deleted, or by moving the last entry into the newly vacant position.
Sorting the list makes searches faster, at the expense of more complex
insertions and deletions.
A linked list makes insertions and deletions into a sorted list easier, with
overhead for the links.
More complex data structures, such as B-trees, could also be considered.
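A linear-list directory with the "move the last entry into the vacant position" style of deletion might look like the sketch below. The class name `LinearDirectory` and the use of a small integer to stand in for a file control block are assumptions made for illustration only.

```python
class LinearDirectory:
    """Directory stored as an unsorted list of (name, fcb) entries (hypothetical)."""

    def __init__(self):
        self.entries = []

    def find(self, name):
        """Linear search; returns the entry index or -1."""
        for i, (entry_name, _) in enumerate(self.entries):
            if entry_name == name:
                return i
        return -1

    def create(self, name, fcb):
        """Creation must first verify that the name is not already present."""
        if self.find(name) != -1:
            raise FileExistsError(name)
        self.entries.append((name, fcb))

    def delete(self, name):
        """Delete by moving the last entry into the vacated slot."""
        i = self.find(name)
        if i == -1:
            raise FileNotFoundError(name)
        last = self.entries.pop()
        if i < len(self.entries):
            self.entries[i] = last


d = LinearDirectory()
for n, fcb in [("a.txt", 10), ("b.txt", 11), ("c.txt", 12)]:
    d.create(n, fcb)
d.delete("a.txt")
```

Both `create` and `delete` pay for a linear search, which is the main drawback noted above; the deletion itself touches only one other entry.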
Hash Table
Allocation Methods
There are three major methods of storing files on disks: contiguous, linked, and
indexed.
Contiguous Allocation
Linked Allocation
Disk files can be stored as linked lists, with the expense of the storage
space consumed by each link. ( E.g. a block may be 508 bytes instead of
512. )
Linked allocation involves no external fragmentation, does not require
pre-known file sizes, and allows files to grow dynamically at any time.
Beyond the space overhead of the links, another big problem with linked
allocation is reliability if a pointer is lost or damaged. Doubly linked
lists provide some protection, at the cost of additional overhead and
wasted space.
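The 508-versus-512 arithmetic and the pointer-chasing involved in linked allocation can be illustrated with a toy in-memory "disk". This is a sketch under stated assumptions: a 4-byte link at the start of each 512-byte block, an end-of-chain marker of `0xFFFFFFFF`, and a Python dict standing in for the disk. None of these details come from a specific file system.

```python
import struct

BLOCK_SIZE = 512
POINTER_SIZE = 4                 # assumed 4-byte link, leaving 508 data bytes
DATA_SIZE = BLOCK_SIZE - POINTER_SIZE
END_OF_FILE = 0xFFFFFFFF         # assumed end-of-chain marker


def write_file(disk, free_blocks, data):
    """Store data as a chain of blocks; returns the first block number."""
    chunks = [data[i:i + DATA_SIZE] for i in range(0, len(data), DATA_SIZE)] or [b""]
    blocks = [free_blocks.pop(0) for _ in chunks]
    for i, (block, chunk) in enumerate(zip(blocks, chunks)):
        next_block = blocks[i + 1] if i + 1 < len(blocks) else END_OF_FILE
        disk[block] = struct.pack("<I", next_block) + chunk.ljust(DATA_SIZE, b"\0")
    return blocks[0]


def read_file(disk, first_block, length):
    """Follow the links from the first block, collecting data bytes."""
    out, block = bytearray(), first_block
    while block != END_OF_FILE:
        (next_block,) = struct.unpack("<I", disk[block][:POINTER_SIZE])
        out += disk[block][POINTER_SIZE:]
        block = next_block
    return bytes(out[:length])


disk = {}
free_blocks = [9, 16, 1, 10, 25]   # scattered blocks: no contiguity needed
payload = bytes(range(256)) * 5    # 1280 bytes -> 3 blocks of 508 data bytes
start = write_file(disk, free_blocks, payload)
```

Reading the file back with `read_file(disk, start, len(payload))` has to visit every block in order, and a single corrupted link would cut off the rest of the chain, which is the reliability concern described above.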
Indexed Allocation
Indexed Allocation combines all of the indexes for accessing each file
into a common block ( for that file ), as opposed to spreading them all
over the disk or storing them in a FAT table.
Figure - Indexed allocation of disk space.
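With an index block, reaching an arbitrary offset takes one table lookup rather than a walk along a chain. The following is a minimal sketch, assuming 512-byte blocks and a dict as a stand-in for the disk; the block numbers and the helper name `read_byte` are made up for the example.

```python
BLOCK_SIZE = 512


def read_byte(disk, index_block, offset):
    """Read one byte at a file offset through the file's index block."""
    logical_block, within = divmod(offset, BLOCK_SIZE)
    physical_block = disk[index_block][logical_block]  # one lookup, no chain walk
    return disk[physical_block][within]


# Toy disk: block 19 is the index block; the data blocks are scattered.
disk = {
    19: [9, 16, 1],
    9: bytes([1]) * BLOCK_SIZE,
    16: bytes([2]) * BLOCK_SIZE,
    1: bytes([3]) * BLOCK_SIZE,
}
```

Here offset 512 falls in logical block 1, which the index maps to physical block 16, so `read_byte(disk, 19, 512)` reads from that block directly.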
Free-Space Management
1. Bit Vector
One simple approach is to use a bit vector, in which each bit represents a
disk block, set to 1 if free or 0 if allocated.
Fast algorithms exist for quickly finding contiguous blocks of a given
size.
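A bit vector and two basic queries on it can be sketched as below, using the convention stated above ( 1 = free, 0 = allocated ). This is a plain linear scan for clarity, not one of the fast word-at-a-time algorithms; the function names and the example vector are assumptions for illustration.

```python
def first_free(bits):
    """Return the number of the first free block (bit == 1), or -1."""
    for block, bit in enumerate(bits):
        if bit == 1:
            return block
    return -1


def find_contiguous(bits, count):
    """Return the start of the first run of `count` free blocks, or -1."""
    run_start, run_length = 0, 0
    for block, bit in enumerate(bits):
        if bit == 1:
            if run_length == 0:
                run_start = block
            run_length += 1
            if run_length == count:
                return run_start
        else:
            run_length = 0
    return -1


# Blocks 2-5 and 8-13 are free in this example.
bits = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1]
```

A real implementation would typically pack the bits into machine words and skip words that are entirely 0 ( fully allocated ) before examining individual bits.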
2. Linked List
A linked list can also be used to keep track of all free blocks.
Traversing the list and/or finding a contiguous block of a given size are
not easy, but fortunately are not frequently needed operations. Generally
the system just adds and removes single blocks from the beginning of the
list.
The FAT table keeps track of the free list as just one more linked list on
the table.
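The "add and remove single blocks from the beginning of the list" behaviour can be sketched with a table of next-pointers, loosely in the spirit of a FAT-style table. The class name `FreeList`, the `-1` end marker, and the initial ordering of the list are assumptions for this example only.

```python
END = -1  # assumed end-of-list marker


class FreeList:
    """Free blocks chained through a next-pointer table (hypothetical class)."""

    def __init__(self, num_blocks):
        # Initially every block is free and points to the following block.
        self.next = [b + 1 for b in range(num_blocks)]
        self.next[-1] = END
        self.head = 0

    def allocate(self):
        """Remove and return the block at the head of the list."""
        if self.head == END:
            raise MemoryError("no free blocks")
        block = self.head
        self.head = self.next[block]
        self.next[block] = END
        return block

    def release(self, block):
        """Return a block by pushing it onto the head of the list."""
        self.next[block] = self.head
        self.head = block


fl = FreeList(4)
a = fl.allocate()
b = fl.allocate()
fl.release(a)
c = fl.allocate()
```

Both operations touch only the head of the list, so they take constant time; finding a contiguous run, by contrast, would require traversing the whole chain.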
3. Grouping
4. Counting
When there are multiple contiguous blocks of free space then the system can
keep track of the starting address of the group and the number of contiguous
free blocks. As long as the average length of a contiguous group of free blocks
is greater than two, this offers a savings in space needed for the free list.
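The counting representation can be illustrated by compressing a sorted list of free block numbers into ( start, count ) pairs. This is a minimal sketch; the function name `to_runs` and the example block numbers are assumptions.

```python
def to_runs(free_blocks):
    """Compress a sorted list of free block numbers into (start, count) pairs."""
    runs = []
    for block in free_blocks:
        if runs and runs[-1][0] + runs[-1][1] == block:
            start, count = runs[-1]
            runs[-1] = (start, count + 1)  # extend the current run
        else:
            runs.append((block, 1))        # start a new run
    return runs


free_blocks = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17]
runs = to_runs(free_blocks)
```

In this example, 11 individual block numbers collapse into 3 pairs, i.e. 6 stored numbers. Each pair costs two numbers, which is why the scheme only saves space when the average run is longer than two blocks.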