OS Lecture-14 (File Systems)

The document provides an overview of file systems, detailing the concepts of files, their structure, attributes, and operations. It discusses directory structures, file types, access methods, and file sharing, including protection mechanisms. Additionally, it covers file system implementation, mounting, and various allocation methods, highlighting the importance of efficient data access and organization.

Uploaded by

ekam3886
OS Lecture-14

File Systems
File Concept
• The operating system abstracts from the physical properties of its storage to define a logical storage unit, the file.

• Files are mapped by the OS onto physical, usually nonvolatile, devices.

• File types:
– Data, free form or formatted
• numeric
• character
• binary
– Program
• source
• object
File Structure
• None – sequence of words or bytes

• Simple record structure
– Lines and pages
– Fixed length
– Variable length

• Complex Structures
– Formatted document
– Relocatable load file
– Executable

• Who decides:
– Operating system
– Program
File Attributes
• Name – only information kept in human-readable form

• Identifier – unique tag (number) that identifies the file within the file system

• Type – needed for systems that support different file types

• Location – pointer to the file location on device

• Size – current file size

• Protection – controls who can do reading, writing, executing

• Time, date, and user identification – data for protection, security, and usage monitoring

• Information about files is kept in the directory structure, which is maintained on secondary storage, such as a disk.
File Operations
• File is an abstract data type with the following basic operations:
– create
– write
• The system must keep a write pointer. The file's information in the directory is also updated.
– read
• The system must keep a read pointer.
– reposition within file (known as file seek)
– delete
– truncate

• Other operations
– append, rename, copy, get/set file attributes
Open and Close Files
• Most file operations involve searching the directory for a file.
– open(Fi) – search the directory structure on disk for entry Fi, and move the content of the entry to memory.
– close(Fi) – move the content of entry Fi in memory to the directory structure on disk.

• The OS normally maintains two levels of open-file tables: per-process and system-wide.
Open and Close Files (cont.)
• Several pieces of data are needed to manage open files:
– File pointer: pointer to the last read/write location, kept per process that has the file open.
– File-open count: counter of the number of times a file is open, to allow removal of data from the open-file table when the last process closes it.
– Disk location of the file: cache of data-access information.
– Access rights: per-process access-mode information.
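The bookkeeping above can be sketched as a toy in-memory model. This is only an illustration: the class and method names (`SystemOpenFileTable`, `Process`, `open_file`, `close_file`) are hypothetical and do not correspond to any real kernel interface.

```python
class SystemOpenFileTable:
    """System-wide table: one entry per open file, shared by all processes."""

    def __init__(self):
        self.entries = {}  # file name -> {"open_count": int, "disk_location": int}

    def open_file(self, name, disk_location):
        entry = self.entries.setdefault(
            name, {"open_count": 0, "disk_location": disk_location}
        )
        entry["open_count"] += 1
        return entry

    def close_file(self, name):
        entry = self.entries[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            # The last process closed the file: remove its entry.
            del self.entries[name]


class Process:
    """Per-process table: file pointer and access mode for each open file."""

    def __init__(self, system_table):
        self.system_table = system_table
        self.open_files = {}  # file name -> {"file_pointer": int, "mode": str}

    def open_file(self, name, mode, disk_location=0):
        self.system_table.open_file(name, disk_location)
        self.open_files[name] = {"file_pointer": 0, "mode": mode}

    def close_file(self, name):
        del self.open_files[name]
        self.system_table.close_file(name)


table = SystemOpenFileTable()
p1, p2 = Process(table), Process(table)
p1.open_file("notes.txt", "r", disk_location=42)
p2.open_file("notes.txt", "rw", disk_location=42)
count_while_shared = table.entries["notes.txt"]["open_count"]
p1.close_file("notes.txt")
p2.close_file("notes.txt")
entry_removed = "notes.txt" not in table.entries
```

Each process keeps its own file pointer and access mode, while the open count and disk location live once in the shared table.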


File Types – Name, Extension
Internal File Structure
• All disk I/O is performed in units of one block (physical record).

• Logical records may vary in length.

• Packing a number of logical records into physical blocks is the common solution.
– Example: UNIX defines all files to be streams of bytes. Its logical record size is 1 byte.

• Packing can be done either by the user's application or by the operating system.
– Internal fragmentation problem
Access Methods
• Sequential access (based on the tape model)
– read next
– write next
– reset to the beginning
– no read after last write

• Direct access (or relative access)
– read n
– write n
– position to n
– read next
– write next
– rewrite n
(n = relative block number)
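A minimal sketch of how the sequential operations can be simulated on top of direct access by keeping a current position. The class `DirectAccessFile` and its method names are assumptions made for illustration, and the "no read after last write" rule of the tape model is not enforced here.

```python
class DirectAccessFile:
    """Toy file made of blocks addressed by relative block number."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.position = 0  # current position, a relative block number

    # Direct access: the caller names the block.
    def read(self, n):
        return self.blocks[n]

    def write(self, n, data):
        self.blocks[n] = data

    def position_to(self, n):
        self.position = n

    # Sequential access, simulated with the direct-access operations.
    def read_next(self):
        data = self.read(self.position)
        self.position += 1
        return data

    def write_next(self, data):
        if self.position == len(self.blocks):
            self.blocks.append(data)  # extend the file at the end
        else:
            self.write(self.position, data)
        self.position += 1

    def reset(self):
        self.position = 0


f = DirectAccessFile(["b0", "b1", "b2"])
first_two = [f.read_next(), f.read_next()]
f.position_to(0)
f.write_next("B0")
f.reset()
after_rewrite = f.read_next()
```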
Other Access Methods - Index and Relative Files

• The index contains pointers to various blocks.

• To find a record in the file, we first search the index and then use the
pointer to access the file directly and to find the desired record.
A Typical File-System Organization

• Each volume that contains a file system must also contain information
about the files in the system.

• This information is kept in entries in a device directory, which records name, location, size, and type.
Directory Concept
Directory Structure
• A collection of nodes containing information about all files.

• Both the directory structure and the files reside on disk.

[Figure: a directory whose entries refer to the files F1, F2, F3, F4, …, Fn]
Information in a Device Directory
• Name

• Type

• Address

• Current length

• Maximum length

• Date last accessed (for archival)

• Date last updated (for dump)

• Owner ID (who pays)

• Protection information (discuss later)


Operations Performed on Directory
• Search for a file

• Create a file

• Delete a file

• List a directory

• Rename a file

• Traverse the file system


Actual File Size
• On Windows, a typical disk block (cluster) is 4 KB.
• On Unix, the traditional disk block is 512 B.

• “Size” of a file reflects the actual size of its data.
• “Size on disk” is the space allocated to the file, which is a multiple of the block size.

• Example: a small text file created with Notepad may show a Size of 431 bytes, while “Size on disk” shows 4 KB.
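The relation between the two sizes is a rounding-up to whole blocks. The sketch below uses hypothetical helper names (`size_on_disk`, `internal_fragmentation`) and ignores file-system details such as very small files stored directly in metadata.

```python
import math


def size_on_disk(file_size, block_size):
    """Space occupied on disk: the file size rounded up to whole blocks."""
    return math.ceil(file_size / block_size) * block_size


def internal_fragmentation(file_size, block_size):
    """Bytes allocated to the file that hold no file data."""
    return size_on_disk(file_size, block_size) - file_size


on_disk = size_on_disk(431, 4096)            # one 4 KB block
wasted = internal_fragmentation(431, 4096)   # unused tail of that block
```

With the numbers from the slide, a 431-byte file occupies one 4096-byte block, so 3665 bytes of that block are lost to internal fragmentation.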
Organize the Directory (Logically) to Obtain
• Efficiency – locating a file quickly.

• Naming – convenient to users.
– Two users can have the same name for different files.
– The same file can have several different names.

• Grouping – logical grouping of files by properties (e.g., all Python programs, all games, …)

• Directory structure:
– Single-level: A single directory of all users
– Two-level: Separate directory for each user
– Tree-structured: Most common
Single-Level Directory
• A single directory for all users.

• Advantages:
─ Easy to support and understand
─ Simple

• Disadvantages:
─ All files must have different names, which becomes hard to manage as the number of files increases
Two-Level Directory
• One master file directory.
• Each user has their own user file directory.

• Advantages:
─ Separate directory for each user
─ Different users can use the same file name

• Disadvantages:
─ The structure isolates one user from another
Tree-Structured Directories
• The directory structure is a tree with arbitrary height.

• Users may create their own sub-directories.


Tree-Structured Directories (cont.)
• Absolute path: begins at the root and follows a path down to the specified file, e.g. root/spell/mail/prt/first

• Relative path: defines a path from the current directory, e.g. prt/first when root/spell/mail is the current directory.

• Advantages:
─ Efficient searching
─ Allows users to create their own subdirectory
─ Users can access files of other users also by specifying pathnames

• Disadvantages:
─ Users have to remember long path names
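Resolving a relative path against the current directory can be sketched with Python's standard `posixpath` module. The path `../games/chess` is a made-up example, not taken from the slides.

```python
import posixpath

current_directory = "root/spell/mail"
relative_path = "prt/first"

# Resolve the relative path against the current directory.
resolved = posixpath.normpath(posixpath.join(current_directory, relative_path))

# ".." climbs one level toward the root before descending again.
sibling = posixpath.normpath(posixpath.join(current_directory, "../games/chess"))
```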
Acyclic-Graph Directories
• Allows the same directory or file to appear in several places in the directory structure.

• Only one copy of the file exists. Any changes made by one person are immediately visible to the others.
Acyclic-Graph Directories (cont.)
• There are several ways to implement shared files and directories. Example from Unix:

• Symbolic links:
─ A different type of directory entry (other than a file or a directory).
─ Specifies the name of the file that the link points to.
─ Can be relative or absolute.
─ What happens when one deletes the original file?

• Duplicate directory entries (also called hard links):
─ The original and the copy entries are the same.
─ What happens when one deletes the original or the copy entry?
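The two questions above can be explored with a small experiment. This sketch assumes a POSIX system whose temporary directory supports both hard and symbolic links; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "original.txt")
    hard = os.path.join(d, "hard.txt")
    soft = os.path.join(d, "soft.txt")

    with open(original, "w") as f:
        f.write("shared data")

    os.link(original, hard)      # second directory entry for the same file
    os.symlink(original, soft)   # separate entry that stores the target's name

    links_before = os.stat(original).st_nlink
    os.remove(original)          # delete the "original" name

    with open(hard) as f:
        hard_link_content = f.read()   # the data is still reachable
    # The symbolic link still exists but now points at a missing name.
    symlink_dangling = os.path.islink(soft) and not os.path.exists(soft)
```

On such a system the hard link keeps the data alive until the last entry is removed, while the symbolic link is left dangling.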
General Graph Directory
File Sharing and Protection
File Sharing
• Sharing of files on multi-user systems is desirable.

• Sharing may be done through a protection scheme.

• On distributed systems, files may be shared across a network.

• Network File System (NFS) is a common distributed file-sharing method.

• On a multi-user system:
– User IDs identify users, allowing permissions and protections to be per-user.
– Group IDs allow users to be in groups, permitting group access rights.
– Each file / directory has an owner.
– Each file / directory has a group.
File Sharing – Remote File Systems
• Uses networking to allow file system access between systems.
– Manually via programs like FTP
– Automatically, seamlessly using distributed file systems
– Semi-automatically via the World Wide Web

• Client-server model allows clients to mount remote file systems from servers.
– Server can serve multiple clients
– Client and user-on-client identification is insecure or complicated
– NFS is the standard UNIX client-server file-sharing protocol
– CIFS is the standard Windows protocol
– Standard operating system file calls are translated into remote calls

• Distributed information systems (distributed naming services) such as LDAP, DNS, NIS, and Active Directory implement unified access to information needed for remote computing.
File Sharing – Failure Modes
• All file systems have failure modes
– For example, corruption of directory structures or other non-user data, called metadata

• Remote file systems add new failure modes, due to network failure or server failure.

• Recovery from failure can involve state information about the status of each remote request.

• Stateless protocols such as NFS v3 include all information in each request, allowing easy recovery but less security.
Protection
• File owner/creator should be able to control:
– what can be done
– by whom

• Types of access:
– Read
– Write
– Execute
– Append
– Delete
– List
Access Lists and Groups
• Mode of access: read, write, execute.
• Three classes of users on Unix / Linux:
                        RWX
a) owner access    7 ⇒ 111
b) group access    6 ⇒ 110
c) public access   1 ⇒ 001

• Ask manager to create a group (unique name), say G, and add some users
to the group.
• For a particular file (say game) or subdirectory, define an appropriate
access.

• Attach a group to a file: chgrp G game
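The octal digits 7, 6, 1 above encode one RWX triple per class. A small sketch of decoding such a mode follows; the helper names `mode_to_string` and `may_access` are hypothetical.

```python
def mode_to_string(mode):
    """Render a three-digit octal mode such as 0o761 as rwx triples."""
    result = ""
    for shift in (6, 3, 0):          # owner, group, public
        bits = (mode >> shift) & 0b111
        result += "r" if bits & 0b100 else "-"
        result += "w" if bits & 0b010 else "-"
        result += "x" if bits & 0b001 else "-"
    return result


def may_access(mode, user_class, right):
    """Check one right ('r', 'w' or 'x') for 'owner', 'group' or 'public'."""
    shift = {"owner": 6, "group": 3, "public": 0}[user_class]
    bit = {"r": 0b100, "w": 0b010, "x": 0b001}[right]
    return bool((mode >> shift) & bit)


permissions = mode_to_string(0o761)
```

Here `permissions` is the familiar nine-character form: full access for the owner, read and write for the group, execute only for everyone else.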


File System Structure
• File system
– Provide efficient and convenient access to disk
– Easy access to the data (store, locate and retrieve)

• Two aspects
– User’s view
• Define files/attributes, operations, directory
– Implementing file system
• Data structures and algorithms to map logical view to physical one
Layered File System
Layers, from top to bottom:
• Application programs
• Logical file system – manages file metadata information: directory structure, inodes
• File-organization module – translates the logical view (blocks) to the physical view (cylinder/track); manages free space
• Basic file system – generic read/write to the device; buffers/caches for data blocks
• I/O control – interrupt handling, low-level device I/O, DMA management
File System Layers
• Device drivers manage I/O devices at the I/O control layer.
– Given a command like “read drive1, cylinder 72, track 2, sector 10, into memory location 1060”, the driver outputs low-level, hardware-specific commands to the hardware controller.

• The basic file system takes a command like “retrieve block 123” and translates it into a request to the device driver.
– Also manages memory buffers and caches (allocation, freeing, replacement)
• Buffers hold data in transit
• Caches hold frequently used data

• The file-organization module understands files, logical addresses, and physical blocks.
– Translates logical block # to physical block #
– Manages free space, disk allocation
File System Layers (cont.)
• Logical file system manages metadata information
– Manages the directory structure to provide the file-organization
module with the information the latter needs, given a symbolic file
name
– Maintains file control blocks (inodes in Unix), which contain information about the file, including
• ownership,
• permissions, and
• location of the file contents
– Responsible for protection
Advantage & Disadvantages of Layered File System
• Layering is useful for reducing complexity and redundancy.
– The I/O control and sometimes the basic file-system code can be used by multiple file systems.

• But layering adds overhead and can decrease performance. Design decisions include:
– whether to use layering
– how many layers to use
– what each layer should do
Examples of File System
• CD-ROMs are written in the ISO 9660 format.

• Unix has Unix File System (UFS).

• Windows has FAT, FAT32, NTFS.

• Linux supports more than 40 file-system types, with the extended file systems ext2 and ext3 leading.

• Distributed file systems.


File-Control Block
• The logical file system maintains a structure holding information about each file, the file-control block (FCB):
– file permissions
– file dates (create, access, write)
– file owner, group, ACL
– file size
– file data blocks or pointers to file data blocks


File System Mounting
• A file system must be mounted before it can be accessed.

• An unmounted file system is mounted at a mount point (an empty directory).

• E.g., on UNIX a mount point might be /home

(a) Existing file system (b) Unmounted partition (c) After it was mounted
Mount Point
1. The OS is first given the name of the device and the mount point.

2. The OS verifies that the device contains a valid file system.
– Read the device directory and verify the directory format

3. The OS notes in the directory structure that a file system is mounted at the specified mount point.

4. If the volume is unmounted, the file system is restored to the situation before mounting.
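Steps 1, 3, and 4 can be sketched with a toy mount table that maps a path to the device responsible for it. The class `MountTable`, the device names `disk0` / `disk1`, and the example path are all hypothetical, and the validity check of step 2 is omitted.

```python
class MountTable:
    """Toy mount table mapping mount points to device names."""

    def __init__(self, root_device):
        self.mounts = {"/": root_device}

    def mount(self, device, mount_point):
        # Step 3: note that a file system is mounted at this mount point.
        self.mounts[mount_point] = device

    def unmount(self, mount_point):
        # Step 4: restore the situation before mounting.
        del self.mounts[mount_point]

    def resolve(self, path):
        """Return (device, path within that device's file system)."""
        best = "/"
        for point in self.mounts:
            prefix = point.rstrip("/") + "/"
            if (path == point or path.startswith(prefix)) and len(point) > len(best):
                best = point  # the longest matching mount point wins
        remainder = path[len(best):].lstrip("/")
        return self.mounts[best], "/" + remainder


table = MountTable("disk0")
before = table.resolve("/home/jane/notes.txt")
table.mount("disk1", "/home")
during = table.resolve("/home/jane/notes.txt")
table.unmount("/home")
after = table.resolve("/home/jane/notes.txt")
```

While `disk1` is mounted at /home, paths under /home are served by it; after unmounting, the same path resolves to the original file system again.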
File System Implementation
On-Disk Structures
• Boot control block: information needed by the system to boot an OS from that partition
– UFS: boot block; NTFS: partition boot sector

• Partition control block: partition details
– Number of blocks, size of the blocks, free-block count and free-block pointers, free FCB count and FCB pointers
– UFS: superblock; NTFS: Master File Table

• A directory structure is used to organize the files
– UFS: file names and inode numbers; NTFS: Master File Table

• File control block: many of the file’s details
– File permissions, ownership, size, location of the data blocks
– UFS: inode; NTFS: within the Master File Table
Virtual File Systems
• Many different file systems are available on most operating systems.
– Windows: NTFS, FAT, FAT32
– Linux: ext2/ext3, ufs, vfat, ramfs, tmpfs, reiserfs, xfs ...

• Virtual File Systems (VFS) provide an object-oriented way of implementing file systems.

• VFS allows the same system call interface (the API) to be used for different types of file systems.

• The API is to the VFS interface, rather than to any specific type of file system.
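The object-oriented idea can be sketched as one abstract interface with several implementations behind it. This is a toy model only: the classes `FileSystem`, `LocalFS`, `RemoteFS`, and `VFS` are hypothetical and much simpler than a real kernel VFS, and the "remote" file system just reads from a dictionary.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Operations every concrete file system must provide to the VFS layer."""

    @abstractmethod
    def read(self, path):
        ...


class LocalFS(FileSystem):
    def __init__(self, files):
        self.files = files

    def read(self, path):
        return self.files[path]


class RemoteFS(FileSystem):
    def __init__(self, server_files):
        self.server_files = server_files

    def read(self, path):
        # A real implementation would send a request over the network.
        return self.server_files[path]


class VFS:
    """Single entry point that dispatches to the right file system."""

    def __init__(self):
        self.mounted = {}

    def mount(self, prefix, fs):
        self.mounted[prefix] = fs

    def read(self, path):
        for prefix, fs in self.mounted.items():
            if path.startswith(prefix + "/"):
                return fs.read(path[len(prefix):])
        raise FileNotFoundError(path)


vfs = VFS()
vfs.mount("/local", LocalFS({"/a.txt": "local data"}))
vfs.mount("/remote", RemoteFS({"/b.txt": "remote data"}))
local_result = vfs.read("/local/a.txt")
remote_result = vfs.read("/remote/b.txt")
```

The caller uses the same `read` call in both cases and never needs to know which kind of file system answered.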
Schematic View of Virtual File System

[Figure: the file-system interface sits above the VFS interface, which dispatches to local file system type 1, local file system type 2, and remote file system type 3 (reached over the network)]
Directory Implementation
1. Linear list of file names with pointers to the data blocks
– Simple to program
– Time-consuming to search for a file name in (long) lists

[Figure: linear list of entries (File name 1, File name 2, …), each with pointers to its data blocks]
Directory Implementation (cont.)
2. Hash table – a linear list combined with a hash data structure
– Decreases directory search time
– Collisions – situations where two file names hash to the same location
• Enlarge the hash table
• Or use fixed size with overflow chains

[Figure: Hash_Func(file_name) = key; a table of key/value pairs (Key 1/Value 1, Key 2/Value 2, …, Key n/Value n) in which each value leads to the file's entry (file name and pointers to data blocks)]
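A minimal sketch of a directory kept as a fixed-size hash table with overflow chains, one of the two collision strategies listed above. The class `HashedDirectory` and its byte-sum hash function are assumptions chosen to make a collision easy to demonstrate, not what any particular file system uses.

```python
class HashedDirectory:
    """Directory as a fixed-size hash table with overflow chains."""

    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, file_name):
        # Simple deterministic hash (an assumption for this sketch).
        return sum(file_name.encode()) % len(self.buckets)

    def add(self, file_name, first_block):
        chain = self.buckets[self._bucket(file_name)]
        for i, (name, _) in enumerate(chain):
            if name == file_name:
                chain[i] = (file_name, first_block)  # update existing entry
                return
        chain.append((file_name, first_block))       # new entry or collision

    def lookup(self, file_name):
        # Only one chain is searched, not the whole directory.
        for name, first_block in self.buckets[self._bucket(file_name)]:
            if name == file_name:
                return first_block
        return None


directory = HashedDirectory(size=8)
directory.add("ab", 17)
directory.add("ba", 42)   # same byte sum as "ab", so the two names collide
same_bucket = directory._bucket("ab") == directory._bucket("ba")
```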
Allocation Methods
• An allocation method refers to how disk blocks are allocated for
files:
1. Contiguous allocation
2. Linked allocation
3. Indexed allocation

• For these approaches we regard the file system blocks as numbered sequentially 0 ... n.
– Mapping to track and sector # is done at a lower level
Contiguous Allocation
• Each file occupies a set of contiguous blocks on the disk.

• Pros:
– Simple – only the starting location (block #) and length (number of blocks) are required
– Random access

• Cons:
– Wastage of space (dynamic storage-allocation problem)
• External fragmentation: may need to compact space
– Files may not be allowed to grow
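Both the simple address mapping and the dynamic storage-allocation problem can be sketched in a few lines. The function names `physical_block` and `first_fit` are hypothetical, and first fit is only one of several possible placement strategies.

```python
def physical_block(start, length, logical_block):
    """Map a logical block of a contiguous file to its disk block."""
    if not 0 <= logical_block < length:
        raise IndexError("logical block outside the file")
    return start + logical_block


def first_fit(free_map, length):
    """Return the start of the first run of `length` free blocks, or None.

    free_map[i] is True when block i is free.
    """
    run_start, run_length = None, 0
    for i, free in enumerate(free_map):
        if free:
            if run_length == 0:
                run_start = i
            run_length += 1
            if run_length == length:
                return run_start
        else:
            run_length = 0
    return None


free_map = [False, True, True, False, True, True, True, False]
start = first_fit(free_map, 3)     # a run of three free blocks exists
too_big = first_fit(free_map, 4)   # five blocks are free, but not contiguously
```

The second call fails even though enough blocks are free in total, which is exactly the external fragmentation listed under the cons.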
Contiguous File Allocation Example
Linked Allocation
• Each file is a linked list of disk blocks; the blocks may be scattered anywhere on the disk.

• Each block holds data plus a pointer to the next block.

• Pros:
− Simple – need only the starting address
− No waste of space – the free-space management system can hand out any free block

• Cons:
− No random access

Linked Allocation Example
Linked List Allocation Using a Table in Memory
Physical block    Next block
0
1
2                 10
3                 11
4                 7     (File A starts here)
5
6                 3     (File B starts here)
7                 2
8
9
10                12
11                14
12                -1
13
14                -1
15
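Following the chains in the table above can be sketched as below. The dictionary reproduces the table entries, with -1 taken as the end-of-file marker; the function names `file_blocks` and `nth_block` are hypothetical.

```python
# Next-block table from the slide; -1 marks the last block of a file.
# Blocks missing from the dictionary are not part of file A or file B.
next_block = {2: 10, 3: 11, 4: 7, 6: 3, 7: 2, 10: 12, 11: 14, 12: -1, 14: -1}


def file_blocks(table, start):
    """Follow the chain from the starting block to the end-of-file marker."""
    blocks = []
    current = start
    while current != -1:
        blocks.append(current)
        current = table[current]
    return blocks


def nth_block(table, start, n):
    """Random access: n table lookups in memory, no data blocks read."""
    current = start
    for _ in range(n):
        current = table[current]
    return current


file_a = file_blocks(next_block, 4)
file_b = file_blocks(next_block, 6)
```

File A occupies blocks 4, 7, 2, 10, 12 in that order, and file B occupies 6, 3, 11, 14.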
File-Allocation Table
• FAT (File Allocation Table)
– Removes the link pointers from the blocks themselves

• The table has one entry for each disk block and is indexed by block number.
– Similar to the linked list
– Each entry contains the block number of the next block in the file

• Can cause a significant number of disk head seeks.
– One for the FAT, one for the data
– Improved by caching the FAT

• Random access time is improved.
– Can find a block quickly by traversing the table
– Don’t need to access all the blocks on the way

• Easy to find empty blocks, and to extend files.


File-Allocation Table (cont.)
• Pros:
– Uses the whole disk block for data.
– A bad disk block does not cause all successive blocks to be lost.
– Random access is provided, although it is not very fast.
– Only the FAT needs to be traversed in each file operation.

• Cons:
– Each disk block needs a FAT entry.
– The FAT may be very big, depending on the number of FAT entries.
– The number of FAT entries can be reduced by increasing the block size, but this also increases internal fragmentation.
File-Allocation Table (FAT) Example
Indexed Allocation
• Each file has its own index block(s) of pointers to its data blocks.
– An array of disk-block addresses
– The ith entry points to the ith block of the file
– The directory contains the address of the index block
– Similar to the paging scheme for memory management

[Figure: index table]
Indexed Allocation (cont.)
• Pros:
– Supports direct access.
– A bad data block causes the loss of only that block.

• Cons:
– A bad index block could cause the loss of the entire file.
– The maximum size of a file depends on the number of pointers an index block can hold.
– Having an index block for a small file wastes space.
– More pointer overhead.
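The file-size limit and the direct lookup can be sketched for a single-level index. The block size of 512 bytes, the 4-byte pointer size, and the block numbers in `index_block` are assumptions chosen for illustration.

```python
def max_file_size(block_size, pointer_size):
    """Largest file a single index block can describe, in bytes."""
    pointers_per_block = block_size // pointer_size
    return pointers_per_block * block_size


def locate(index_block, block_size, byte_offset):
    """Translate a byte offset to (disk block, offset inside that block)."""
    i, offset = divmod(byte_offset, block_size)
    return index_block[i], offset


# Hypothetical file whose index block lists four scattered data blocks.
index_block = [9, 16, 1, 25]
limit = max_file_size(block_size=512, pointer_size=4)
where = locate(index_block, 512, 1300)
```

With these assumed sizes, one index block holds 128 pointers, so the file can be at most 64 KB; byte 1300 falls in the third data block, found with one array lookup.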
Indexed Allocation Example
Performance
• Best method depends on file access type.
– Contiguous is great for both sequential and random access.
– Linked is good for sequential access, not random.

• Declare the access type at creation, then select either contiguous or linked.

• Indexed is more complex.
– A single block access could require two index-block reads and then a data-block read.
– Clustering can help improve throughput and reduce CPU overhead.


Free-Space Management
• The file system is responsible for allocating free blocks to files.

• Therefore it has to keep track of all the free blocks on the disk.

• There are four main approaches:
1. Bit vector or bit map
2. Linked list
3. Grouping
4. Counting
Bit Vector / Bit Map
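• Bit vector (for n blocks, numbered 0, 1, 2, …, n-1):

$$\text{bit}[i] = \begin{cases} 1 & \text{block}[i]\ \text{free} \\ 0 & \text{block}[i]\ \text{occupied} \end{cases}$$

• Simple and efficient to find the first free block or n consecutive free blocks.

• Block number calculation:

$$\text{block number} = (\text{number of bits per word}) \times (\text{number of 0-value words}) + \text{offset of first 1 bit}$$

[Figure: bit vector 0 0 0 0 0 0 1 0, divided into words]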
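The block number calculation can be sketched by scanning whole words first and only then looking inside the first non-zero word. The function name `first_free_block` is hypothetical, and the split of the example vector into two 4-bit words is an assumption.

```python
def first_free_block(words):
    """Return the number of the first free block, or None if the disk is full.

    Each word is a list of bits; bit value 1 means the block is free.
    """
    bits_per_word = len(words[0])
    for zero_words, word in enumerate(words):
        if any(word):
            # All earlier words were 0-valued, so skip their blocks.
            return bits_per_word * zero_words + word.index(1)
    return None


# The vector 0 0 0 0 0 0 1 0, assumed here to be split into two 4-bit words.
vector = [[0, 0, 0, 0], [0, 0, 1, 0]]
block = first_free_block(vector)
```

One 0-valued word is skipped and the first 1 bit sits at offset 2 in the next word, giving block 4 × 1 + 2 = 6.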
Bit Vector (cont.)
• Pros:
– Easy to find space to allocate contiguous files.

• Cons:
– Bit map requires extra space, which can be huge.
– Example:
block size = $2^{9}$ bytes
disk size = $2^{32}$ bytes (4 GB)
n = $2^{32} / 2^{9} = 2^{23}$ bits, requiring $2^{20}$ bytes (1 MB)
Linked List

[Figure: free-space list head]

• Link together all the free disk blocks.

• Keep a pointer to the first free block.

• Pros:
– No wastage of space.
• Cons:
– Difficult to allocate contiguous blocks.
– Not efficient: have to traverse the disk to find free space.

• Note: FAT incorporates the linked-list mechanism.
Grouping And Counting
• Grouping:
– Modify the linked list to store the addresses of the next n - 1 free blocks in the first free block, plus a pointer to the next block that contains free-block pointers.

• Counting:
– Space is frequently used and freed contiguously, for example with contiguous allocation, extents, or clustering.
– Keep the address of the first free block and a count of the contiguous free blocks starting there.
– The free-space list then has entries containing addresses and counts.
Example of Grouping And Counting

Grouping (n = 3), starting at the free-space list head:
Block 2 → 3, 4, 5
Block 5 → 8, 9, 10
Block 10 → 11, 12, 13
Block 13 → 17, 18, 25
Block 25 → 26, 27

Counting (first free block, number of contiguous free blocks):
2, 4
8, 6
17, 2
25, 3
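The counting entries in this example can be derived from the plain list of free blocks. The function name `to_counting` is hypothetical, and the input is assumed to be sorted.

```python
def to_counting(free_blocks):
    """Compress a sorted list of free block numbers into (start, count) runs."""
    runs = []
    for block in free_blocks:
        if runs and block == runs[-1][0] + runs[-1][1]:
            # The block directly follows the current run: extend it.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            runs.append((block, 1))
    return runs


free_blocks = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]
runs = to_counting(free_blocks)
```

Fifteen free blocks collapse into four entries, which is why counting pays off when free space tends to be contiguous.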