File Systems
Operating Systems
File systems
● Way to organize and (persistently) store
information
● Abstraction over storage devices
○ Hard disk, SSD, network, …
● Organized in files and (typically) directories.
● Examples:
○ FAT12/FAT16: MS-DOS
○ NTFS: Windows
○ Ext4: Linux
○ APFS: macOS/iOS
Overview
[Diagram: three layers: Storage, Operating System, User program]
Overview
[Diagram: layered view of the file system stack]
● Storage: HDD, SSD, network, RAM, …
● Operating system: file system drivers (FAT, NTFS, Ext4, ...), Virtual File System, page cache
● Syscall interface: open, read, write, readdir, ...
● User program
Overview
[Diagram: the same stack, annotated with where each part is covered]
● Storage (HDD, SSD, network, RAM, …): Chapter 5 (I/O), next week
● ...
● Syscall interface and user program: this lecture (files, directories, user operations)
Overview
● Files and directories + user operations
● File system implementation
Files
● Abstract storage nodes
● File access:
○ Sequential vs. random access
● File types:
○ Regular files, directories, soft links
○ Special files (e.g., device files, metadata files)
● File structure:
○ OS’ perspective: Files as streams of bytes
○ Program’s perspective: Archives, Executables, etc.
○ Is the OS ever aware of the file structure?
File Naming
● Different file systems have different
limitations/conventions for file names
● File extensions
● File name length
○ FAT12: 8.3 format, i.e., 8-character name plus 3-character extension (later extended to 255 characters)
○ Ext4: 255 characters
● Special characters in file names
○ FAT12: No "*/:<>?\| and more
○ Ext4: No '\0' and '/' or the special names "." and ".."
● Case sensitivity
Possible File Attributes
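On Unix, many such attributes can be queried with stat(). The following is a minimal sketch; the helper name print_attributes is hypothetical, and the selection of fields is only an illustration of what struct stat offers.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: print a few attributes of `path`.
 * Returns 0 on success, -1 if stat() fails. */
int print_attributes(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1) {
        perror("stat");
        return -1;
    }

    printf("size:        %lld bytes\n", (long long)st.st_size);
    printf("permissions: %o\n", (unsigned)(st.st_mode & 07777));
    printf("owner uid:   %ld\n", (long)st.st_uid);
    printf("hard links:  %ld\n", (long)st.st_nlink);
    printf("modified:    %s", ctime(&st.st_mtime)); /* ctime() adds '\n' */
    printf("type:        %s\n",
           S_ISDIR(st.st_mode) ? "directory" :
           S_ISREG(st.st_mode) ? "regular file" : "other");
    return 0;
}
```

Calling print_attributes("foo.txt") prints one line per attribute; which attributes are actually stored depends on the file system.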
File Operations
● Create/Delete
● Open/Close
● Read/Write
● Append
● Seek
● GetAttributes/SetAttributes
● Rename
File Operations - Unix
● Opening and reading file:
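○ int fd = open("foo.txt", O_RDONLY);
char buf[512];
/* leave room for the '\0' that printf's %s needs */
ssize_t bytes_read = read(fd, buf, sizeof(buf) - 1);
close(fd);
if (bytes_read >= 0) {
buf[bytes_read] = '\0';
printf("read %zd: %s\n", bytes_read, buf);
}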
● Opening a file returns a handle (file descriptor)
for future operations.
● Any function may fail; it then returns -1 and sets errno, e.g.:
○ ENOENT: File does not exist
○ EBADF: Bad file descriptor
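A sketch of the same open/read sequence with error checks added. From user space, the libc wrappers signal failure by returning -1 and storing the error code in errno. The helper name read_start is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read up to cap-1 bytes from the start of `path`
 * into buf and NUL-terminate the result.
 * Returns the number of bytes read, or -1 with errno set. */
ssize_t read_start(const char *path, char *buf, size_t cap)
{
    if (cap == 0) {
        errno = EINVAL;
        return -1;
    }

    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int saved = errno;               /* fprintf may change errno */
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        errno = saved;
        return -1;
    }

    ssize_t n = read(fd, buf, cap - 1);
    if (n == -1) {
        int saved = errno;
        fprintf(stderr, "read %s: %s\n", path, strerror(saved));
        close(fd);
        errno = saved;
        return -1;
    }

    buf[n] = '\0';
    close(fd);
    return n;
}
```

If foo.txt does not exist, read_start("foo.txt", buf, sizeof buf) returns -1 and errno is ENOENT.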
File Operations - Unix
● Seeking in files:
○ int fd = open("foo.txt", O_RDONLY);
lseek(fd, 128, SEEK_CUR);
char buf[8];
read(fd, buf, 8);
close(fd);
● Move current position in file forwards or
backwards.
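Seeking is what makes random access possible. A minimal sketch with two hypothetical helpers: read_at uses SEEK_SET (absolute offset), and file_size uses SEEK_END to find the end of the file.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read `len` bytes starting at absolute offset `off`.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_at(const char *path, off_t off, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    if (lseek(fd, off, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}

/* Hypothetical helper: file size in bytes, found by seeking to the end.
 * Returns -1 on error. */
off_t file_size(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    off_t end = lseek(fd, 0, SEEK_END); /* lseek returns the new position */
    close(fd);
    return end;
}
```

SEEK_CUR, as used in the slide example, moves relative to the current position instead; directly after open() the position is 0, so both variants end up at the same offset.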
File Operations - Unix
● Writing files:
○ int fd = open("foo.txt", O_WRONLY |
O_CREAT |
O_TRUNC, 0644); /* O_CREAT needs a mode argument */
char buf[] = "Hi there";
write(fd, buf, strlen(buf));
close(fd);
● O_CREAT: Create file if it does not exist
● O_TRUNC: “Truncate” file to size 0 if it exists
(i.e., throw away current file contents)
File Operations - Unix
● Removing file:
○ unlink("foo.txt");
● Renaming file:
○ rename("foo.txt", "bar.txt");
● Change file permission attribute:
○ chmod("foo.txt", 0755);
● Change file owner attribute:
○ chown("foo.txt", uid, gid);
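A sketch that strings these calls together with error checks. The helper name file_lifecycle is hypothetical. chown() is left out because changing the owner normally requires elevated privileges.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create old_path, change its permissions,
 * rename it to new_path, then remove it again.
 * Returns 0 on success, -1 on the first failure. */
int file_lifecycle(const char *old_path, const char *new_path)
{
    struct stat st;

    /* O_EXCL makes open() fail if old_path already exists. */
    int fd = open(old_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd == -1) {
        perror("open");
        return -1;
    }
    close(fd);

    if (chmod(old_path, 0644) == -1) {
        perror("chmod");
        unlink(old_path);
        return -1;
    }

    if (rename(old_path, new_path) == -1) {
        perror("rename");
        unlink(old_path);
        return -1;
    }

    if (stat(new_path, &st) == 0)
        printf("mode after chmod: %o\n", (unsigned)(st.st_mode & 0777));

    if (unlink(new_path) == -1) {
        perror("unlink");
        return -1;
    }
    return 0;
}
```

After a successful run neither path exists any more: rename() removed the old name and unlink() removed the new one.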
Directories
● Data structures organizing and maintaining
information about files
● Often stored as file entries with special
attributes
● Path components are separated by “/” (Unix) or “\”
(Windows)
● Special directory entries:
○ . Current directory
○ .. Parent directory
Directory Hierarchies
Directory Operations
● Create/Delete
● Opendir/Closedir
● Readdir
● Rename
● Link/Unlink (hard and soft links)
Directory Operations - Unix
● Reading current directory contents:
○ DIR *dirp = opendir(".");
closedir(dirp);
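Between opendir() and closedir(), entries are fetched one at a time with readdir(). A minimal sketch; the helper name list_dir is hypothetical.

```c
#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: print every entry name in `path`.
 * Returns the number of entries, or -1 if the directory
 * cannot be opened. */
int list_dir(const char *path)
{
    DIR *dirp = opendir(path);
    if (dirp == NULL) {
        perror("opendir");
        return -1;
    }

    int count = 0;
    struct dirent *entry;
    /* readdir() returns NULL once all entries have been returned. */
    while ((entry = readdir(dirp)) != NULL) {
        printf("%s\n", entry->d_name);
        count++;
    }

    closedir(dirp);
    return count;
}
```

On most Unix file systems the output of list_dir(".") includes the special entries "." and "..".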
Virtual File System (VFS)
● Interface between user programs and
underlying file systems
● Programs only see a single file system, e.g.:
○ /: Partition 1 (Ext3)
○ /home: Partition 2 (Ext4)
○ /mnt/usb: USB stick partition 1 (FAT)
○ /mnt/share: Remote network share (NFS)
○ /tmp: RAM (non-persistent) storage (tmpfs)
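Because of the VFS, the same stat() call works on every path, whichever driver serves it. A small sketch: the st_dev field identifies the device holding a file, so comparing it for two paths shows whether they are on the same mounted file system. The helper name same_filesystem is hypothetical.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both paths are on the same mounted
 * file system, 0 if not, and -1 if either path cannot be stat()ed. */
int same_filesystem(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) == -1 || stat(b, &sb) == -1) {
        perror("stat");
        return -1;
    }
    /* st_dev identifies the device containing the file. */
    return sa.st_dev == sb.st_dev;
}
```

With the mount layout listed above, same_filesystem("/", "/home") would be expected to return 0, since the two paths are on different partitions.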
Virtual File System (VFS)
VFS Data Structures
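Overview
[Diagram: the same stack, annotated with where each part is covered]
● Storage (HDD, SSD, network, RAM, …): Chapter 5 (I/O), next week
● Operating system (FAT, NTFS, and Ext4 drivers, Virtual File System, page cache, ...): this/next lecture (FS implementation)
● Syscall interface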
○ Block-based strategies:
■ Linked List
■ File Allocation Table
■ i-nodes
File Storage: Linked List
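A minimal in-memory sketch of linked-list allocation. Each block stores the number of the next block of the file next to its data. Block size, disk size, and all names are assumptions for illustration, not an on-disk format from the lecture.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE  16   /* toy value; real blocks are far larger */
#define NUM_BLOCKS  8
#define END_OF_FILE (-1)

/* One disk block: a "next" pointer followed by the payload. */
struct block {
    int32_t next;                              /* next block or END_OF_FILE */
    char    data[BLOCK_SIZE - sizeof(int32_t)];
};

/* Sequential read: follow the chain from `first`, copying payloads to out.
 * Returns the number of bytes copied (stops early if out is full). */
size_t read_chain(const struct block *disk, int32_t first,
                  char *out, size_t cap)
{
    size_t n = 0;
    for (int32_t b = first; b != END_OF_FILE; b = disk[b].next) {
        if (n + sizeof disk[b].data > cap)
            break;
        memcpy(out + n, disk[b].data, sizeof disk[b].data);
        n += sizeof disk[b].data;
    }
    return n;
}

/* Random access: finding the k-th block means following k pointers,
 * i.e. k block reads on a real disk. Returns END_OF_FILE if the
 * file has fewer blocks. */
int32_t nth_block(const struct block *disk, int32_t first, unsigned k)
{
    int32_t b = first;
    while (k-- > 0 && b != END_OF_FILE)
        b = disk[b].next;
    return b;
}
```

The sketch shows the main weakness of this strategy: sequential reads are cheap, but random access costs one block read per step. A File Allocation Table keeps the same chain but moves the next pointers out of the blocks into one table.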
RAID 1
RAID 5
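RAID 5 protects data with parity: the parity block is the XOR of the data blocks in a stripe, so any single lost block can be recomputed from the remaining ones. The sketch below shows only this arithmetic for one stripe. Function names are hypothetical, and the way RAID 5 distributes parity across disks is not modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute the parity block of one stripe: byte-wise XOR of all data blocks. */
void raid5_parity(const uint8_t *const *data, size_t ndisks, size_t len,
                  uint8_t *parity)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t p = 0;
        for (size_t d = 0; d < ndisks; d++)
            p ^= data[d][i];
        parity[i] = p;
    }
}

/* Rebuild the block of disk `lost` from the parity and the surviving
 * data blocks. data[lost] is never read. */
void raid5_rebuild(const uint8_t *const *data, size_t ndisks, size_t lost,
                   const uint8_t *parity, size_t len, uint8_t *out)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t v = parity[i];
        for (size_t d = 0; d < ndisks; d++)
            if (d != lost)
                v ^= data[d][i];
        out[i] = v;
    }
}
```

RAID 1 needs no such computation: every block is written to all mirrors, and a read can be served by any surviving copy.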
[Diagram: FUSE architecture]
● Buffer cache
● OS: VFS, page cache, file system drivers (FAT, NTFS, Ext4, FUSE)
● Syscall interface: read, readdir, ...
● Userspace: user program (via libc), FUSE driver (via libfuse and libc)
FUSE
● The driver is just a normal userspace
process
● libfuse talks with the kernel module
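[Diagram: FUSE request flow]
● Storage, buffer cache
● OS: VFS, page cache, file system drivers (FAT, NTFS, Ext4, FUSE)
● Syscall interface: normal syscalls (read, readdir, ...) from the user program; FUSE syscalls (read/write on /dev/fuse) from the FUSE driver
● Userspace: user program (via libc), FUSE driver (via libfuse and libc)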
FUSE
● Makes driver development much easier
● No special (root) permissions required for new FS drivers
● But, slower than in-kernel drivers
Source: https://github.com/libfuse/libfuse/blob/master/example/hello.c
FUSE callbacks
● Most important callback: getattr
● Called before any operation on a file
● Retrieves file(/directory) attributes
via struct stat - see man 2 stat
● Directory example: stbuf->st_mode = S_IFDIR | 0755;
● File example: stbuf->st_mode = S_IFREG | 0755;
● Non-existing entry: return -ENOENT
FUSE callbacks
● Familiar from syscalls:
● readdir: List directory contents
● read: Read file contents
● mkdir: Create directory
● rmdir: Remove directory
● unlink: Remove file
● create: Create file
● truncate: Resize file
● write: Write data to file
● rename: Rename/move file
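A sketch of the logic inside a read callback for a hypothetical file system with a single file, /hello. As with any standalone illustration, it leaves out the trailing struct fuse_file_info * parameter of the real libfuse callback; names and contents are assumptions. The callback must honour the requested offset and size and return the number of bytes it actually copied.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical contents of the toy file system. */
static const char sketch_path[] = "/hello";
static const char sketch_data[] = "Hello World!\n";

/* Copy up to `size` bytes of the file, starting at `offset`, into buf.
 * Returns the number of bytes copied (0 at or past end of file)
 * or a negative errno value. */
int sketch_read(const char *path, char *buf, size_t size, off_t offset)
{
    if (strcmp(path, sketch_path) != 0)
        return -ENOENT;

    size_t len = strlen(sketch_data);
    if (offset < 0 || (size_t)offset >= len)
        return 0;                          /* nothing left to read */

    if (size > len - (size_t)offset)
        size = len - (size_t)offset;       /* clamp to end of file */

    memcpy(buf, sketch_data + offset, size);
    return (int)size;
}
```

A read() of 5 bytes at offset 6 by a user program would reach this callback as sketch_read("/hello", buf, 5, 6) and yield the bytes "World".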