Requirements For Distributed File Systems
File Service Design Options
• Stateful
– server holds information on open files, current position, file locks
– open before access, close after
– better performance - shorter message, read-ahead possible
– server failure - lose state
– client failure - tables fill up
– can provide file locks
• Stateless
– no state information held by server
– file operations idempotent, must contain all information needed (longer message)
– simpler file server design
– can recover easily from client or server crash
– locking requires extra lock server to hold state
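The contrast between the two designs can be sketched in a few lines of Python. This is a minimal illustration, not an actual file server: the class names `StatefulServer` and `StatelessServer` and the helper `demo` are hypothetical, and local file reads stand in for remote requests.

```python
import os
import tempfile


class StatefulServer:
    """Keeps an open-file table: handle -> [path, current position]."""

    def __init__(self):
        self.open_files = {}
        self.next_handle = 0

    def open(self, path):
        handle = self.next_handle
        self.next_handle += 1
        self.open_files[handle] = [path, 0]
        return handle

    def read(self, handle, count):
        # Short request, but the reply depends on the server-side position.
        path, pos = self.open_files[handle]
        with open(path, "rb") as f:
            f.seek(pos)
            data = f.read(count)
        self.open_files[handle][1] = pos + len(data)
        return data

    def close(self, handle):
        del self.open_files[handle]


class StatelessServer:
    """Holds no per-client state; each request is self-contained."""

    def read(self, path, offset, count):
        # Longer request, but repeating it returns the same bytes (idempotent).
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(count)


def demo():
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"abcdefgh")
        path = tmp.name
    try:
        stateful = StatefulServer()
        h = stateful.open(path)
        first = stateful.read(h, 4)
        retried = stateful.read(h, 4)   # a retry advances the position
        stateful.close(h)

        stateless = StatelessServer()
        a = stateless.read(path, 0, 4)
        b = stateless.read(path, 0, 4)  # a retry returns the same bytes
        return first, retried, a, b
    finally:
        os.remove(path)
```

If the stateful server crashed between the two reads, its open-file table (and so the current position) would be lost, whereas the stateless request carries everything needed to repeat it.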
File names
Directory structure
• Hierarchical
– tree-like, pathnames from root
– (in UNIX) several names per file (link operation)
• Naming system
– implemented by client module, using directory service
– root has well-known UFID
– locate file following path from root
Text name (=directory pathname+file name)
• hostname:local name
– not mobility transparent
• uniform name structure (same name space for all clients)
• remote mount (e.g. Sun NFS)
– remote directory inserted into local directory
– relies on clients maintaining consistent naming conventions across all clients
  • all clients must implement same local tree
  • must mount remote directory into the same local directory
Distributed Systems 13 Distributed Systems 14
Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1;
the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.
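The remote-mount mapping in the note above can be sketched as a client-side lookup table. This is an illustrative assumption about how a client module might translate names, not the NFS implementation; `MOUNTS` and `resolve` are hypothetical names, and the example file path is invented.

```python
# Hypothetical client mount table, using the mount points from the note above.
MOUNTS = {
    "/usr/students": ("Server 1", "/export/people"),
    "/usr/staff": ("Server 2", "/nfs/users"),
}


def resolve(path, mounts=MOUNTS):
    """Return (server, server_path); server is None for a local file."""
    # Prefer the longest matching mount point so nested mounts would work.
    for mount_point in sorted(mounts, key=len, reverse=True):
        if path == mount_point or path.startswith(mount_point + "/"):
            server, remote_root = mounts[mount_point]
            return server, remote_root + path[len(mount_point):]
    return None, path


print(resolve("/usr/students/jon/notes.txt"))
print(resolve("/etc/passwd"))
```

Because the table lives in each client, two clients see the same names only if they mount the same remote directories at the same local mount points.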
NFS architecture
[Figure: NFS architecture. Client computer: application programs, UNIX system calls, UNIX kernel, virtual file system, local UNIX file system, NFS client. Server computer: UNIX kernel, virtual file system, NFS server, UNIX file system. NFS client and NFS server communicate using the NFS protocol over RPC (UDP or TCP).]

File identifier (FileId)
• Simple Solution
– FileId fields: Server address (IP address.socket) | Index (i-node number)
– i-node (number identifying file within file system)
– file migration requires finding and changing all FileIds
– UNIX reuses i-node numbers after file deleted (i-node gen. no)
• Virtual file system uses i-node if local, file handle if remote.
• File handle
– fields: File system identifier | i-node no. | i-node gener. no.
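Why the file handle carries an i-node generation number can be shown with a toy model. The sketch below assumes a simplified file system that reuses freed i-node numbers and increments a per-i-node generation on each reuse; `FileHandle`, `ToyFileSystem` and `StaleHandle` are hypothetical names, not NFS data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    fs_id: int        # file system identifier
    inode: int        # i-node number
    generation: int   # i-node generation number


class StaleHandle(Exception):
    pass


class ToyFileSystem:
    """Toy model: reuses freed i-node numbers and bumps the generation."""

    def __init__(self, fs_id):
        self.fs_id = fs_id
        self.generation = {}   # i-node number -> current generation
        self.data = {}         # i-node number -> contents of the live file
        self.free = []         # i-node numbers available for reuse

    def create(self, contents):
        inode = self.free.pop() if self.free else len(self.generation)
        self.generation[inode] = self.generation.get(inode, 0) + 1
        self.data[inode] = contents
        return FileHandle(self.fs_id, inode, self.generation[inode])

    def delete(self, handle):
        self.check(handle)
        del self.data[handle.inode]
        self.free.append(handle.inode)

    def check(self, handle):
        live = handle.inode in self.data
        if (handle.fs_id != self.fs_id or not live
                or self.generation[handle.inode] != handle.generation):
            raise StaleHandle(handle)

    def read(self, handle):
        self.check(handle)
        return self.data[handle.inode]
```

A client holding a handle to a deleted file gets `StaleHandle` rather than silently reading the new file that now occupies the same i-node number.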
Client caching
• Potential consistency problems!
– different versions, portions of files, check if copy still valid
• Timestamp method
– tag with latest time of validity check and modification time
– copy valid if time since last check less than freshness interval, or modification time on server the same
– choose freshness interval adaptively
• Reads
– perform validity check, if not valid, request data from server, optimisations
• Writes
– After modification, marked as dirty and flushed
• Not truly one-copy update semantics...

Summary
• File service
– crucial to the running of a distributed system
– performance, consistency and easy recovery essential
• Design issues
– separate flat file service from directory service and client module
– stateless for performance and fault-tolerance
– caching for performance
– concurrent updates difficult with caching
– approximation of one-copy update semantics
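The timestamp validity check from the Client caching slide can be sketched as follows. This is a minimal illustration under stated assumptions: `CacheEntry` and `is_valid` are hypothetical names, times are plain numbers, and `server_mtime` is a callable standing in for a request to the server for the file's modification time.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    data: bytes
    t_checked: float    # time of the latest validity check
    t_modified: float   # modification time recorded for this cached copy


def is_valid(entry, now, freshness, server_mtime):
    """Timestamp check; server_mtime is a callable standing in for an RPC."""
    if now - entry.t_checked < freshness:
        return True                      # recently checked: no server contact
    if server_mtime() == entry.t_modified:
        entry.t_checked = now            # unchanged on server: revalidate
        return True
    return False                         # stale: caller must refetch the data
```

A shorter freshness interval narrows the window in which a stale copy can be read but costs more requests to the server, which is why the interval is chosen adaptively, and why the result is only an approximation of one-copy update semantics.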