OS Lecture-14 (File Systems)
File Systems
File Concept
• The operating system abstracts from the physical properties of its
storage to define a logical storage unit, the file.
• File types:
– Data, free form or formatted
• numeric
• character
• binary
– Program
• source
• object
File Structure
• None - sequence of words, bytes
• Complex Structures
– Formatted document
– Relocatable load file
– Executable
• Who decides:
– Operating system
– Program
File Attributes
• Name – only information kept in human-readable form
• Other operations
– append, rename, copy, get/set file attributes
Open and Close Files
• Most file operations involve searching the directory for a file.
– open (Fi) – search the directory structure on disk for entry Fi,
and move the content of entry to memory.
• To find a record in the file, we first search the index and then use the
pointer to access the file directly and to find the desired record.
A Typical File-System Organization
• Each volume that contains a file system must also contain information
about the files in the system.
[Figure: a directory whose entries refer to the files F1, F2, F3, F4, ..., Fn]
Information in a Device Directory
• Name
• Type
• Address
• Current length
• Maximum length
• Create a file
• Delete a file
• List a directory
• Rename a file
• Directory structure:
– Single-level: A single directory of all users
– Two-level: Separate directory for each user
– Tree-structured: Most common
Single-Level Directory
• A single directory for all users.
• Advantages:
─ Easy to support and understand
─ Simple
• Disadvantages:
─ All files must have unique names, which becomes hard to manage as the number of files increases
Two-Level Directory
• One master file directory.
• Each user has their own user file directory.
• Advantages:
─ Separate directory for each user
─ Different users can have files with the same name
• Disadvantages:
─ The structure isolates one user from another, which makes sharing files difficult
Tree-Structured Directories
• The directory structure is a tree with arbitrary height.
• Relative path: defines a path from the current directory, e.g. prt/first
when the current directory is root/spell/mail.
• Advantages:
─ Efficient searching
─ Allows users to create their own subdirectory
─ Users can access files of other users also by specifying pathnames
• Disadvantages:
─ Users have to remember long path names
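The difference between relative and absolute paths can be sketched in a few lines. This is only an illustration: the helper name resolve is hypothetical, and the root directory from the slide is written as "/" here.

```python
import posixpath

def resolve(current_dir: str, path: str) -> str:
    """Return the absolute path that `path` names when used from `current_dir`."""
    # posixpath.join ignores current_dir when `path` is already absolute.
    return posixpath.normpath(posixpath.join(current_dir, path))

# Relative path, interpreted from the current directory /spell/mail.
relative_result = resolve("/spell/mail", "prt/first")
# Absolute path, independent of the current directory.
absolute_result = resolve("/spell/mail", "/spell/words")

print(relative_result)   # /spell/mail/prt/first
print(absolute_result)   # /spell/words
```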
Acyclic-Graph Directories
• Allows the same directory or file to appear in several places in the directory structure.
• Only one file exists. Any changes made by one person are immediately
visible to the other.
Acyclic-Graph Directories (cont.)
• There are several ways to implement shared files and directories:
• Example – Unix:
• Symbolic links:
─ A different type of directory entry (other than a file or a directory).
─ Specifies the name of the file that this link is pointing to.
─ Can be relative or absolute.
─ What happens when one deletes the original file?
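The question about deleting the original file can be explored with a small experiment. The sketch below assumes a Unix-like system where unprivileged users may create symbolic links; the file names original.txt and alias.txt are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "original.txt")
    link = os.path.join(d, "alias.txt")
    with open(target, "w") as f:
        f.write("hello")

    # The link stores a name (here a relative one), not the file's blocks.
    os.symlink("original.txt", link)
    link_target = os.readlink(link)

    readable_before = os.path.exists(link)      # follows the link
    os.remove(target)                           # delete the original file
    link_still_present = os.path.islink(link)   # the link entry itself remains
    readable_after = os.path.exists(link)       # but it now points nowhere

print(link_target, readable_before, link_still_present, readable_after)
```

On such a system the link survives the deletion but dangles: it still names original.txt, and following it fails until a file with that name exists again.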
• If multi-user system:
– User IDs identify users, allowing permissions and protections to be per-user.
Group IDs allow users to be in groups, permitting group access rights.
– Owner of a file / directory.
– Group of a file / directory.
File Sharing – Remote File Systems
• Uses networking to allow file system access between systems.
– Manually via programs like FTP
– Automatically, seamlessly using distributed file systems
– Semi-automatically via the World Wide Web
• Remote file systems add new failure modes, caused by network failure or
server failure
• Types of access:
– Read
– Write
– Execute
– Append
– Delete
– List
Access Lists and Groups
• Mode of access: read, write, execute.
• Three classes of users on Unix / Linux:
                      RWX
a) owner access   7 ⇒ 111
b) group access   6 ⇒ 110
c) public access  1 ⇒ 001
• Ask manager to create a group (unique name), say G, and add some users
to the group.
• For a particular file (say game) or subdirectory, define an appropriate
access.
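The three octal digits above can be decoded mechanically. The following is a minimal sketch; the function name mode_to_string is hypothetical, and 0o761 is simply the owner/group/public example from the slide written as one octal number.

```python
def mode_to_string(mode: int) -> str:
    """Render a 9-bit Unix permission mode such as 0o761 as 'rwxrw---x'."""
    out = []
    for shift in (6, 3, 0):                  # owner, group, public
        digit = (mode >> shift) & 0b111      # one octal digit = three RWX bits
        for bit, letter in ((4, "r"), (2, "w"), (1, "x")):
            out.append(letter if digit & bit else "-")
    return "".join(out)

print(mode_to_string(0o761))   # rwxrw---x
```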
• Two aspects
– User’s view
• Define files/attributes, operations, directory
– Implementing file system
• Data structures and algorithms to map logical view to physical one
Layered File System
• Application programs
• Logical file system
– Manages file metadata information: directory structure, inodes
• File-organization module
– Translates logical view (blocks) to physical view (cylinder/track)
– Manages free space
• Basic file system
– Generic read/write to device
– Buffers/caches for data blocks
• I/O control
– Interrupt handling, low-level device I/O, DMA management
File System Layers
• Device drivers manage I/O devices at the I/O control layer
– Given commands like “read drive1, cylinder 72, track 2, sector 10, into
memory location 1060”, a driver outputs low-level, hardware-specific
commands to the hardware controller
• Linux supports more than 40 file-system types, with the extended file systems ext2 and
ext3 leading.
[Figure: file metadata fields, including file permissions and file size]
(a) Existing file system (b) Unmounted partition (c) After it was mounted
Mount Point
1. The OS is first given the name of the device and the mount point.
• VFS allows the same system call interface (the API) to be used for
different types of file systems.
• The API is to the VFS interface, rather than any specific type of file
system.
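The idea of one interface in front of several file-system types can be sketched as follows. All class and method names here (FileSystem, LocalFS, RemoteFS, VFS, mount, read) are hypothetical and chosen for illustration; a real VFS layer is far richer, and the "remote" file system below only pretends to use a network.

```python
from abc import ABC, abstractmethod

class FileSystem(ABC):
    """Hypothetical interface that every concrete file-system type implements."""
    @abstractmethod
    def read(self, path: str) -> bytes: ...

class LocalFS(FileSystem):
    def __init__(self, files):
        self.files = files
    def read(self, path):
        return self.files[path]

class RemoteFS(FileSystem):
    def __init__(self, server_files):
        self.server_files = server_files
    def read(self, path):
        # A real implementation would send a request over the network here.
        return self.server_files[path]

class VFS:
    """Routes each call to the file system mounted at the longest matching mount point."""
    def __init__(self):
        self.mounts = {}
    def mount(self, mount_point, fs):
        self.mounts[mount_point] = fs
    def read(self, path):
        matches = [m for m in self.mounts
                   if path == m or path.startswith(m.rstrip("/") + "/")]
        best = max(matches, key=len)
        inner = "/" + path[len(best):].lstrip("/")   # path inside that file system
        return self.mounts[best].read(inner)

vfs = VFS()
vfs.mount("/", LocalFS({"/home/notes.txt": b"local data"}))
vfs.mount("/mnt/net", RemoteFS({"/report.txt": b"remote data"}))
print(vfs.read("/home/notes.txt"), vfs.read("/mnt/net/report.txt"))
```

The caller uses the same read call in both cases and never needs to know which file-system type serves the request.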
Schematic View of Virtual File System
[Figure: the file-system interface sits on top of the VFS interface, which in turn reaches file systems that may be accessed over the network]
Directory Implementation
1. Linear list of file names with pointer to the data blocks
– Simple to program
– Time-consuming to search for file name in (long) lists
[Figure: list of directory entries, each with pointers to data blocks]
Directory Implementation (cont.)
2. Hash Table – linear list with hash data structure
– Decreases directory search time
– Collisions – situations where two file names hash to the same location
• Enlarge the hash table
• Or use fixed size with overflow chains
[Figure: hash table of key/value pairs (Key 1 → Value 1, Key 2 → Value 2), with pointers to data blocks]
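A directory implemented as a fixed-size hash table with overflow chains might look roughly like this. The class HashedDirectory and its deliberately weak hash function (sum of the name's bytes) are assumptions made so that a collision is easy to show.

```python
class HashedDirectory:
    """Fixed-size hash table; each bucket is an overflow chain of (name, first_block)."""
    def __init__(self, num_buckets: int = 8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, name: str):
        # Deterministic toy hash: names with the same letters collide.
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def add(self, name: str, first_block: int) -> None:
        bucket = self._bucket(name)
        for i, (existing, _) in enumerate(bucket):
            if existing == name:
                bucket[i] = (name, first_block)   # replace an existing entry
                return
        bucket.append((name, first_block))        # chain on collision

    def lookup(self, name: str):
        for existing, first_block in self._bucket(name):
            if existing == name:
                return first_block
        return None

d = HashedDirectory()
d.add("ab", 10)
d.add("ba", 20)          # hashes to the same bucket as "ab"
print(d.lookup("ab"), d.lookup("ba"), d.lookup("zz"))   # 10 20 None
```

Only the chain of one bucket is scanned per lookup, instead of the whole list of names.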
• Pros:
– Simple – only starting location (block #) and length (number of blocks) are required
– Random access
• Cons:
– Wastage of space (dynamic storage-allocation problem)
• External fragmentation: may need to compact space
– Files may not be allowed to grow
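Random access is cheap with contiguous allocation because the physical block follows from the start and length alone. A minimal sketch, with a hypothetical helper name and made-up numbers (a file starting at block 14 with length 3):

```python
def contiguous_block(start: int, length: int, logical_block: int) -> int:
    """Map the i-th block of a contiguously allocated file to a disk block number."""
    if not 0 <= logical_block < length:
        raise IndexError("logical block lies outside the file")
    return start + logical_block   # one addition, no disk reads needed

print(contiguous_block(start=14, length=3, logical_block=2))   # 16
```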
Contiguous File Allocation Example
Linked Allocation
• Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.
– Each block holds a pointer to the next block of the file
• Pros:
− Simple – need only starting address
− Free-space management system – no waste of space
• The table has one entry for each disk block and is indexed by block number.
– Similar to the linked list
– Each entry contains the block number of the next block in the file
• Cons:
– Each disk block needs a FAT entry.
– The FAT may be very large, depending on the number of FAT entries.
– The number of FAT entries can be reduced by increasing the block size, but
this also increases internal fragmentation.
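Following a file through the table can be sketched as below. The sentinel values END_OF_FILE and FREE, the table size, and the block numbers are all made up for illustration; real FAT variants use their own reserved values.

```python
END_OF_FILE = -1   # assumed marker for the last block of a file
FREE = -2          # assumed marker for an unused block

def file_blocks(fat, start):
    """Return the disk blocks of a file by following next-block entries in the FAT."""
    blocks = []
    block = start
    while block != END_OF_FILE:
        blocks.append(block)
        block = fat[block]      # the table is indexed by block number
    return blocks

fat = [FREE] * 10               # one entry per disk block
fat[2] = 7                      # block 2 is followed by block 7
fat[7] = 4                      # block 7 is followed by block 4
fat[4] = END_OF_FILE            # block 4 is the last block

print(file_blocks(fat, start=2))   # [2, 7, 4]
```

The directory entry only needs to store the starting block (2 in this example); the rest of the chain lives in the table.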
File-Allocation Table (FAT) Example
Indexed Allocation
• Each file has its own index block(s) of pointers to its data blocks.
– An array of disk-block addresses
– The ith entry points to the ith block of the file
– The directory contains the address of the index block
– Similar to the paging scheme for memory management
[Figure: index table]
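Direct access through an index block works much like a page-table lookup. In this sketch the block size of 512 bytes, the function name locate, and the index-block contents are assumptions for illustration.

```python
BLOCK_SIZE = 512   # assumed block size in bytes

def locate(index_block, byte_offset):
    """Return (disk block, offset within that block) for a byte offset in the file."""
    i, offset = divmod(byte_offset, BLOCK_SIZE)
    return index_block[i], offset    # the i-th entry points to the i-th file block

index_block = [9, 16, 1, 10, 25]     # hypothetical disk-block addresses
print(locate(index_block, 1030))     # (1, 6): third file block, 6 bytes in
```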
Indexed Allocation (cont.)
• Pros:
– Supports direct access.
– A bad data block causes the loss of only that block.
• Cons:
– A bad index block could cause the loss of the entire file.
– The maximum size of a file depends on the number of pointers an index block can
hold.
– An index block for a small file is mostly wasted space.
– More pointer overhead.
Indexed Allocation Example
Performance
• Best method depends on file access type.
– Contiguous allocation is great for both sequential and random access
• Therefore it has to keep track of all the free blocks present in the
disk.
bit[i] = 1  if block[i] is free
bit[i] = 0  if block[i] is occupied
• Simple and efficient to find the first free block or n consecutive free blocks.
[Figure: example bit vector 0 0 0 0 0 0 1 0, stored in machine words]
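Both searches can be sketched directly on a list of bits, using the convention that 1 means free. The function names are hypothetical, and the sample vector extends the one in the figure with a few extra bits so that a run of free blocks exists.

```python
def first_free(bits):
    """Index of the first free block (bit == 1), or None if the disk is full."""
    for i, b in enumerate(bits):
        if b == 1:
            return i
    return None

def first_free_run(bits, n):
    """Index of the first block in a run of n consecutive free blocks, or None."""
    run = 0
    for i, b in enumerate(bits):
        run = run + 1 if b == 1 else 0   # length of the free run ending at i
        if run == n:
            return i - n + 1
    return None

bits = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0]
print(first_free(bits), first_free_run(bits, 3))   # 6 8
```

A real implementation would typically skip whole words that are 0 before inspecting individual bits.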
Bit Vector (cont.)
• Pros:
– Easy to find space to allocate contiguous files.
• Cons:
– Bit map requires extra space, which can be huge.
– Example:
block size = 2^9 bytes
disk size = 2^32 bytes (4G bytes)
n = 2^32/2^9 = 2^23 bits, requiring 2^20 bytes (1M bytes)
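The arithmetic in this example can be checked with a short calculation, reading the figures as a block size of 2^9 bytes and a disk size of 2^32 bytes. The function name is hypothetical.

```python
def bitmap_size_bytes(disk_bytes: int, block_bytes: int) -> int:
    """Bytes needed for a bit vector with one bit per disk block."""
    num_blocks = disk_bytes // block_bytes   # n = one bit per block
    return (num_blocks + 7) // 8             # 8 bits per byte, rounded up

n_bits = 2**32 // 2**9
print(n_bits == 2**23, bitmap_size_bytes(2**32, 2**9) == 2**20)   # True True
```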
Linked List
• Pros:
– No wastage of space.
• Cons:
– Difficult to allocate contiguous blocks.
– Not efficient: the list must be traversed on disk to find free space
• Counting:
– Space is frequently used and freed contiguously, for example with
contiguous allocation, extents, or clustering.
• Keep address of first free block and count of following free blocks
• Free space list then has entries containing addresses and counts
Example of Grouping And Counting
Counting:
First free block   Count
        2            4
        8            6
       17            2
       25            3
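The counting representation can be derived from a plain list of free blocks. The sketch below assumes that each count includes the first block of the run, and the set of free blocks is an assumption chosen so that the result matches the entries above.

```python
def to_counts(free_blocks):
    """Compress free block numbers into (first free block, count) entries."""
    entries = []
    for b in sorted(free_blocks):
        if entries and b == entries[-1][0] + entries[-1][1]:
            first, count = entries[-1]
            entries[-1] = (first, count + 1)   # b extends the current run
        else:
            entries.append((b, 1))             # b starts a new run
    return entries

free = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]
print(to_counts(free))   # [(2, 4), (8, 6), (17, 2), (25, 3)]
```

Fifteen free blocks are described by four entries, which is why the list stays short when free space is mostly contiguous.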