DISTRIBUTED COMPUTING
MODULE 5
• The agreement condition is satisfied because in the f + 1 rounds, there must be at least one round in which
no process failed.
• In this round, say round r, all the processes that have not failed so far succeed in broadcasting their values,
and all these processes take the minimum of the values broadcast and received in that round.
• Thus, at the end of round r, the local value x_i is the same, say x, for all non-failed processes i.
• In further rounds, only this value may be sent by each process at most once, and no process i will update its
value x_i.
• The validity condition is satisfied because processes do not send fictitious values in this failure model.
• For all i, if the initial values are identical, then the only value that can be sent by any process is that
common initial value, which is therefore the value agreed upon as per the agreement condition.
• The termination condition is seen to be satisfied.
Complexity: The algorithm requires f + 1 rounds, where f < n. In each round, O(n^2) messages are sent, and each
message carries a single integer. Hence the total number of messages over all rounds is O((f + 1) · n^2).
Lower bound on the number of rounds: At least f + 1 rounds are required, where f < n. In the worst-case scenario,
one process may fail in each round; with f + 1 rounds, there is at least one round in which no process fails. In that
guaranteed failure-free round, all messages broadcast can be delivered reliably, and all processes that have not failed
can compute the common function of the received values to reach an agreement value.
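The (f + 1)-round minimum-value algorithm described above can be sketched as a simulation. This is a hypothetical single-machine model (process ids, the `crashes` schedule, and the function name are illustrative): broadcasts are atomic here, whereas in the real message-passing setting a crashing process may deliver to only some receivers, which is exactly why f + 1 rounds are needed.

```python
# Sketch of (f + 1)-round min-value consensus under crash failures.
# Simplifying assumption: a broadcast in this simulation is atomic
# (all-or-nothing), unlike the real partial-delivery failure model.

def min_consensus(initial, f, crashes=None):
    """initial: list of starting integer values, one per process.
    f: maximum number of crash failures tolerated (f < len(initial)).
    crashes: optional dict {round_index: set of process ids that crash
             before broadcasting in that round}.
    Returns {process id: decided value} for the non-failed processes."""
    crashes = crashes or {}
    n = len(initial)
    values = list(initial)
    alive = set(range(n))
    for r in range(f + 1):                  # rounds 0 .. f
        alive -= crashes.get(r, set())
        # every surviving process broadcasts its value;
        # each receiver takes the minimum of the values it sees
        m = min(values[i] for i in alive)
        for i in alive:
            values[i] = m
    return {i: values[i] for i in alive}

# Even with one crash per round, all surviving processes decide the
# same value, and that value is one of the initial values (validity).
decided = min_consensus([7, 3, 9, 5], f=2, crashes={0: {2}, 1: {0}})
```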
3. File replication In a file service that supports replication, a file may be represented by several copies of
its contents at different locations. This has two benefits: it enables multiple servers to share the load of
providing a service to clients accessing the same set of files, enhancing the scalability of the service, and it
enhances fault tolerance. Few file services support replication fully, but most support the caching of files or
portions of files locally, a limited form of replication.
4. Hardware and operating system heterogeneity The service interfaces should be defined so that client and
server software can be implemented for different operating systems and computers. This requirement is an
important aspect of openness.
5. Fault tolerance The central role of the file service in distributed systems makes it essential that the service
continue to operate in the face of client and server failures. To cope with transient communication failures,
the design can be based on at-most-once invocation semantics or it can use the simpler at-least-once
semantics. Tolerance of disconnection or server failures requires file replication.
6. Consistency Conventional file systems such as that provided in UNIX offer one-copy update semantics. If
changes are made to one copy of a file, those changes must be propagated to all other replicated copies.
7. Security In distributed file systems, there is a need to authenticate client requests so that access control at
the server is based on correct user identities and to protect the contents of request and reply messages with
digital signatures and (optionally) encryption of secret data.
8. Efficiency A distributed file service should offer facilities that are of at least the same power and generality
as those found in conventional file systems and should achieve a comparable level of performance.
Client module A client module runs in each client computer, integrating and extending the operations of the flat file
service and the directory service under a single application programming interface that is available to user-level
programs in client computers.
Directory service The directory service provides a mapping between text names for files and their UFIDs. Clients may
obtain the UFID of a file by quoting its text name to the directory service.
Flat file service The flat file service is concerned with implementing operations on the contents of files. Unique file
identifiers (UFIDs) are used to refer to files in all requests for flat file service operations. UFIDs are long sequences of
bits chosen so that each file has a UFID that is unique among all of the files in a distributed system. When the flat file
service receives a request to create a file, it generates a new UFID for it and returns the UFID to the requester.
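The flat file service and directory service described above can be sketched as in-memory Python classes. The class and method names, the 128-bit random UFIDs, and the in-memory storage are illustrative assumptions; a real service would expose these operations over RPC.

```python
import secrets

class FlatFileService:
    """Sketch of a flat file service: files are referred to only by
    UFID; text names do not exist at this level."""

    def __init__(self):
        self.files = {}                      # UFID -> file contents (bytes)

    def create(self):
        ufid = secrets.randbits(128)         # long random bit sequence:
        self.files[ufid] = b""               # unique with high probability
        return ufid                          # UFID returned to requester

    def read(self, ufid, offset, n):
        return self.files[ufid][offset:offset + n]

    def write(self, ufid, offset, data):
        old = self.files[ufid]
        self.files[ufid] = old[:offset] + data + old[offset + len(data):]

    def delete(self, ufid):
        del self.files[ufid]

class DirectoryService:
    """Sketch of the directory service: maps text names to UFIDs."""

    def __init__(self):
        self.names = {}                      # text name -> UFID

    def add_name(self, name, ufid):
        self.names[name] = ufid

    def lookup(self, name):                  # quote a text name,
        return self.names[name]              # obtain the file's UFID

# Illustrative use: create a file, name it, write and read it back.
ffs = FlatFileService()
ds = DirectoryService()
ufid = ffs.create()
ds.add_name("notes.txt", ufid)
ffs.write(ufid, 0, b"hello")
```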
Access control: An access check is made whenever a file name is converted to a UFID. A user identity is submitted
with every client request, and access checks are performed by the server for every file operation.
File groups • A file group is a collection of files located on a given server. A server may hold several file groups, and
groups can be moved between servers, but a file cannot change the group to which it belongs. In a distributed file
service, file groups support the allocation of files to file servers.
File group identifiers must be unique throughout a distributed system. Since file groups can be moved and distributed
systems that are initially separate can be merged to form a single system, the only way to ensure that file group
identifiers will always be distinct in a given system is to generate them with an algorithm that ensures global
uniqueness. For example, whenever a new file group is created, a unique identifier can be generated by
concatenating the 32-bit IP address of the host creating the new group with a 16-bit integer derived from the date,
producing a unique 48-bit integer.
file group identifier:
    | 32 bits    | 16 bits |
    | IP address | date    |
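The identifier scheme above can be sketched as follows. Taking the low 16 bits of a Unix timestamp as the "16-bit integer derived from the date" is an assumption for illustration; the text does not specify the derivation.

```python
import ipaddress
import time

def file_group_id(host_ip: str, when=None) -> int:
    """Sketch: build a 48-bit file group identifier by concatenating
    the creating host's 32-bit IPv4 address with a 16-bit value
    derived from the date (assumed here: low 16 bits of the Unix
    timestamp)."""
    ip32 = int(ipaddress.IPv4Address(host_ip))   # 32-bit IP address
    when = time.time() if when is None else when
    date16 = int(when) & 0xFFFF                  # 16 bits from the date
    return (ip32 << 16) | date16                 # 48-bit identifier
```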
• The file system identifier field is a unique number that is allocated to each file system when it is created.
• The i-node number is needed to locate the file within its file system and to access its attributes; i-node
numbers are reused after a file is removed.
• The i-node generation number is needed because i-node numbers are reused; it is incremented each time an
i-node is reused after a file is removed.
• The virtual file system layer has one VFS structure for each mounted file system and one v-node per open
file. The v-node contains an indicator to show whether a file is local or remote.
Client Integration
• The NFS client module cooperates with the virtual file system in each client machine.
• If the file is local, the local UNIX file system is accessed.
• If the file is remote, the NFS client sends a request to the NFS server.
• It operates in a similar manner to the conventional UNIX file system, transferring blocks of files to and from
the server and caching the blocks in the local memory whenever possible.
Mount services:
• Mount the remote directories to the local directories.
• Mounting makes a group of files in a file system structure accessible to a user or user group.
• Mount operation: mount(remotehost, remotedirectory, localdirectory)
• Client with two remotely mounted file stores. The nodes people and users in file systems at Server 1 and
Server 2 are mounted over nodes students and staff in Client’s local file store. The meaning of this is that
programs running at Client can access files at Server 1 and Server 2 by using pathnames such as
/usr/students/jon and /usr/staff/ann.
• Mount can be of 3 types:
  1. Soft mount: a time bound applies; if the server does not respond in time, a failure message is returned.
  2. Hard mount: no time bound; the request is retried until it is satisfied.
  3. Auto mount: the mount operation is performed on demand, when the remote directory is first accessed.
Server caching
The use of the server’s cache to hold recently read disk blocks does not raise any consistency problems but when a
server performs write operations, extra measures are needed to ensure that clients can be confident that the results
of the write operations are persistent, even when server crashes occur. The write operation offers two options
for this:
• Data in write operations received from clients is stored in the memory cache at the server and written to
disk before a reply is sent to the client. This is called write-through caching.
• Data in write operations is stored only in the memory cache. It will be written to disk when a commit
operation is received for the relevant file.
Client caching • The NFS client module caches the results of read, write, getattr, lookup and readdir operations in
order to reduce the number of requests transmitted to servers. A timestamp-based method is used to validate
cached blocks before they are used. Each data or metadata item in the cache is tagged with two timestamps:
• Tc is the time when the cache entry was last validated.
• Tm is the time when the block was last modified at the server.
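The timestamp check using Tc and Tm can be sketched as below. The freshness interval t, within which a cached entry is used without contacting the server, is a standard NFS client parameter assumed here; the source lists only the two timestamps.

```python
def cache_entry_valid(T, Tc, Tm_client, Tm_server, t=3.0):
    """Sketch of NFS client-cache validation at current time T.
    Tc: time the cache entry was last validated.
    Tm_client: last-modification time recorded with the cached block.
    Tm_server: last-modification time of the block at the server.
    t: freshness interval (assumed parameter, e.g. a few seconds)."""
    if T - Tc < t:
        # recently validated: use the cached entry without a server call
        return True
    # otherwise revalidate: the entry is still valid only if the block
    # has not been modified at the server since it was cached
    return Tm_client == Tm_server
```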
• A subsequent request for the same data is satisfied from the local cache if the cached entry is still valid.
• Stateful servers in AFS allow the server to inform all clients with open files about any updates made to that
file by another client, through what is known as callback.
• The server issues a callback promise to each client when it requests a copy of a file; this ensures that
callbacks can later be delivered to every client holding a copy of that file.
Cache consistency
• When Vice supplies a copy of a file to a Venus process it also provides a callback promise – a token issued
by the Vice server that is the custodian of the file, guaranteeing that it will notify the Venus process when
any other client modifies the file.
• Callback promises are stored with the cached files on the workstation disks and have two states: valid or
cancelled.
• Whenever Venus handles an open on behalf of a client, it checks the cache. If the required file is found in
the cache, then its token is checked. If its value is cancelled, then a fresh copy of the file must be fetched
from the Vice server, but if the token is valid, then the cached copy can be opened and used without
reference to Vice.
• When a server handles a request to update a file, it notifies all of the Venus processes to which it has issued
callback promises by sending a callback to each – a callback is a remote procedure call from a server to a
Venus process.
• When the Venus process receives a callback, it sets the callback promise token for the relevant file to
cancelled.
• When a workstation is restarted after a failure or a shutdown, Venus generates a cache validation request
containing the file modification timestamp to the server.
• If the timestamp is current, the server responds with valid and the token is reinstated. If the timestamp
shows that the file is out of date, then the server responds with cancelled and the token is set to cancelled.
• Whole-file caching: Once a copy of a file or a chunk has been transferred to a client computer it is stored in
a cache on the local disk. The cache contains the several hundred files most recently used on that
computer. The cache is permanent, surviving reboots of the client computer. Local copies of files are used
to satisfy clients’ open requests in preference to remote copies whenever possible.
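The callback-promise mechanism above can be sketched as a client-side cache. All names here (`VenusCache`, `fetch_from_vice`, the token strings) are illustrative stand-ins, not actual AFS code.

```python
VALID, CANCELLED = "valid", "cancelled"

class VenusCache:
    """Sketch of Venus's open-time handling of callback-promise tokens,
    following the steps described above."""

    def __init__(self, fetch_from_vice):
        self.fetch = fetch_from_vice   # callable: file name -> contents
        self.cache = {}                # file name -> (contents, token)

    def open(self, name):
        if name in self.cache:
            contents, token = self.cache[name]
            if token == VALID:
                # valid token: use the cached copy without contacting Vice
                return contents
        # cache miss or cancelled token: fetch a fresh copy; the server
        # issues a new callback promise along with it
        contents = self.fetch(name)
        self.cache[name] = (contents, VALID)
        return contents

    def callback(self, name):
        """Callback RPC from the Vice server: another client has
        modified the file, so cancel the promise token."""
        if name in self.cache:
            contents, _ = self.cache[name]
            self.cache[name] = (contents, CANCELLED)
```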
Operation of AFS
1. When a user process in a client computer issues an open system call for a file in the shared file space and
there is not a current copy of the file in the local cache, the server holding the file is located and is sent a
request for a copy of the file.
2. The copy is stored in the local UNIX file system in the client computer. The copy is then opened and the
resulting UNIX file descriptor is returned to the client.
3. Subsequent read, write and other operations on the file by processes in the client computer are applied to
the local copy.
4. When the process in the client issues a close system call, if the local copy has been updated its contents are
sent back to the server. The server updates the file contents and the timestamps on the file. The copy on
the client’s local disk is retained in case it is needed again by a user-level process on the same workstation.
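The four steps above can be sketched as a client-side file handle. The server interface (`fetch`/`store`), the class names, and the demo file are hypothetical stand-ins for the Vice protocol.

```python
class AFSClientFile:
    """Sketch of AFS whole-file operation: open fetches a complete
    copy, reads and writes apply to the local copy, and close writes
    the copy back to the server only if it was modified."""

    def __init__(self, server, name):
        self.server, self.name = server, name
        self.data = bytearray(server.fetch(name))  # steps 1-2: local copy
        self.modified = False

    def read(self, offset, n):                     # step 3: local access
        return bytes(self.data[offset:offset + n])

    def write(self, offset, data):                 # step 3: local update
        self.data[offset:offset + len(data)] = data
        self.modified = True

    def close(self):                               # step 4: write back
        if self.modified:
            self.server.store(self.name, bytes(self.data))

class _DemoServer:
    """Stand-in for a Vice file server (illustrative only)."""
    def __init__(self):
        self.files = {"doc": b"hello"}
    def fetch(self, name):
        return self.files[name]
    def store(self, name, data):
        self.files[name] = data

srv = _DemoServer()
fh = AFSClientFile(srv, "doc")
fh.write(0, b"H")
fh.close()   # updated contents sent back to the server
```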