
Design and Implementation Issues of Distributed Shared Memory

Distributed shared memory (DSM) is a mechanism that manages memory across multiple nodes and makes
inter-process communication transparent to end users. Designing and implementing a DSM system
involves addressing the following issues.

1. Granularity:
Granularity refers to the size of the memory block that can be shared or moved across the
network. It can be small (like a word) or large (like a whole page). The block size should
match the needs of the application and network.
2. Structure of Shared Memory Space:
This is about how the shared memory is organized. The way memory is structured depends
on the type of applications the DSM system is designed to support.
3. Memory Coherence and Access Synchronization:
Multiple nodes can access the same data at the same time, which can lead to
inconsistencies. To prevent this, DSM uses synchronization techniques such as locks and
semaphores to ensure data consistency (a minimal lock-based sketch follows this list).
4. Data Location and Access:
The system needs a way to find and retrieve data blocks when requested by processors.
This requires a mechanism to locate the data and ensure it follows the memory consistency
rules.
5. Replacement Strategy:
When the local memory is full, the system must decide which old data block to remove to
make space for new data. This decision is crucial for efficiency.
6. Thrashing:
If two nodes repeatedly move the same data block back and forth because they both need
it, it can slow down the system. DSM should have strategies to reduce such unnecessary
data movement.
7. Heterogeneity:
If the DSM system operates in an environment with machines of different architectures
(heterogeneous systems), it should be designed to handle these differences and still work
correctly.
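
To make issue 3 concrete, below is a minimal, single-process sketch of lock-based access synchronization, using Python threads to stand in for nodes. The SharedCounter class and its methods are illustrative names only, not part of any particular DSM system.

```python
import threading

class SharedCounter:
    """Illustrative shared object: a lock serializes concurrent updates
    so that two nodes (threads here) cannot interleave read-modify-write."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()   # stands in for a DSM-level lock

    def increment(self):
        with self._lock:                # acquire before touching shared data
            self._value += 1            # critical section

    def read(self):
        with self._lock:
            return self._value

counter = SharedCounter()
threads = [threading.Thread(target=lambda: [counter.increment() for _ in range(1000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.read())  # always 4000, because updates are synchronized
```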

Algorithms for Implementing Distributed Shared Memory

A distributed shared memory (DSM) system is a resource-management component of a
distributed operating system that implements the shared-memory model in a distributed
system that has no physically shared memory. The shared-memory model provides a virtual
address space that is shared by all nodes in the distributed system.
The central issues in implementing DSM are:
 how to keep track of the location of remote data;
 how to overcome the communication overheads and delays involved in executing
communication protocols for accessing remote data;
 how to make shared data concurrently accessible at several nodes to improve
performance.

Algorithms to implement DSM

1. Central Server Algorithm:


 In this approach, a central server maintains all the shared data. It services read requests
from other nodes by returning the requested data items, and write requests by
updating the data and returning acknowledgement messages.
 Time-outs can be used to handle failed acknowledgements, while sequence
numbers can be used to avoid duplicate write requests.
 It is simple to implement, but the central server can become a bottleneck. To
overcome this, the shared data can be distributed among several servers; the
distribution can be done by address or by using a mapping function to locate the
appropriate server.
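
A minimal, single-process sketch of the central server algorithm is given below. The CentralServer class, its read/write methods, and the duplicate-suppression bookkeeping are illustrative names and simplifications, not a specific implementation.

```python
class CentralServer:
    """All shared data lives at one server; clients send read/write requests."""

    def __init__(self):
        self.store = {}           # shared data items, keyed by address
        self.last_seq = {}        # last sequence number seen per client

    def read(self, addr):
        # Read request: return the current value of the item.
        return self.store.get(addr)

    def write(self, client_id, seq, addr, value):
        # Duplicate suppression: a retransmitted request carries the same
        # sequence number and is acknowledged without re-applying the write.
        if self.last_seq.get(client_id) == seq:
            return "ACK (duplicate ignored)"
        self.store[addr] = value
        self.last_seq[client_id] = seq
        return "ACK"

server = CentralServer()
print(server.write("node1", seq=1, addr=0x10, value=42))  # ACK
print(server.write("node1", seq=1, addr=0x10, value=42))  # duplicate, ignored
print(server.read(0x10))                                  # 42
```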

2. Migration Algorithm:
 In contrast to the central server algorithm, where every data access request is
forwarded to the location of the data, here the data is shipped to the location of the
data access request, which allows subsequent accesses to be performed locally.
 It allows only one node to access a shared data item at a time, and the whole block
containing the data item migrates instead of just the individual item requested.
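
The following is a toy sketch of the migration idea, assuming a simple directory that records which node currently owns each block. The Node, Block, and directory names are illustrative; a real DSM would migrate fixed-size pages over the network rather than Python objects.

```python
class Block:
    """A block of consecutive items; the whole block migrates, not one item."""
    def __init__(self, items):
        self.items = items

class Node:
    def __init__(self, name):
        self.name = name
        self.local_blocks = {}    # blocks currently resident at this node

    def access(self, block_id, key, directory):
        owner = directory[block_id]
        if owner is not self:
            # Block is remote: migrate it here (single-owner rule), so
            # subsequent accesses to any item in the block are local.
            self.local_blocks[block_id] = owner.local_blocks.pop(block_id)
            directory[block_id] = self
        return self.local_blocks[block_id].items[key]

directory = {}                    # maps block id -> current owner node
a, b = Node("A"), Node("B")
a.local_blocks["blk0"] = Block({"x": 1, "y": 2})
directory["blk0"] = a

print(b.access("blk0", "x", directory))  # migrates blk0 from A to B, prints 1
print(b.access("blk0", "y", directory))  # now local at B, prints 2
```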
3. Read Replication Algorithm:
o This extends the migration algorithm by replicating data blocks, allowing
multiple nodes to have read access or one node to have read-write access.
o It improves system performance by allowing multiple nodes to access
data concurrently.
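
A small sketch of read replication under a write-invalidate policy is given below: many nodes may hold read-only copies, and a write first invalidates every other copy. The ReplicatedBlock class and its methods are illustrative names.

```python
class ReplicatedBlock:
    """One shared block with a set of reader copies and at most one writer."""

    def __init__(self, value):
        self.value = value
        self.readers = set()      # nodes currently holding a read-only copy

    def read(self, node):
        self.readers.add(node)    # replicate the block to the reading node
        return self.value

    def write(self, node, value):
        # Invalidate every other copy before the write, so no node keeps
        # a stale replica (write-invalidate protocol).
        self.readers = {node}
        self.value = value

blk = ReplicatedBlock(10)
print(blk.read("A"), blk.read("B"))   # A and B both hold read copies
blk.write("A", 20)                    # B's copy is invalidated
print(blk.read("B"))                  # B re-fetches the block and sees 20
```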

4. Full Replication Algorithm:


o It is an extension of the read replication algorithm that allows multiple
nodes to have both read and write access to shared data blocks.
o Since many nodes can write to the shared data concurrently, access to the
shared data must be controlled to maintain its consistency.
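
Below is a sketch of one common way to control concurrent writes under full replication: a single sequencer orders every write and broadcasts it to all copies, so every replica applies the same updates in the same order. The Sequencer and Replica names are illustrative, and other consistency-control schemes are possible.

```python
class Replica:
    """A full copy of the shared data kept at one node."""
    def __init__(self):
        self.data = {}

    def apply(self, addr, value):
        self.data[addr] = value

class Sequencer:
    """Orders all writes and broadcasts them so every replica applies
    the same updates in the same order, keeping full replicas consistent."""
    def __init__(self, replicas):
        self.replicas = replicas
        self.seq = 0

    def write(self, addr, value):
        self.seq += 1                     # global ordering of writes
        for r in self.replicas:
            r.apply(addr, value)          # broadcast to every copy
        return self.seq

nodes = [Replica() for _ in range(3)]
seq = Sequencer(nodes)
seq.write(0x10, "a")
seq.write(0x10, "b")
print([n.data[0x10] for n in nodes])      # ['b', 'b', 'b'] on all replicas
```
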
Importance of Effective Recovery in Distributed Systems
Effective recovery in distributed systems is crucial for ensuring system
reliability, availability, and fault tolerance. When a component fails or an error occurs, the
system must recover quickly and correctly to minimize downtime and data loss. Effective
recovery mechanisms, such as checkpointing, rollback, and forward recovery, help
maintain system consistency, prevent cascading failures, and ensure that the system can
continue to function even in the presence of faults.

Recovery in Distributed Systems


Recovery ensures the system remains reliable and operational after failures.

1. Checkpointing:
 Periodically save the system’s state to storage.
 After a failure, the system can revert to the last saved state.
2. Rollback Recovery:
 Returns the system to a previous checkpoint after an error is detected.
 Useful for undoing the effects of faults (a combined checkpoint-and-rollback sketch
follows this list).
3. Forward Recovery:
 Instead of reverting, the system fixes the error and continues.
 Requires anticipating and handling potential errors dynamically.
4. Logging and Replay:
 Records system actions in logs to reconstruct the state after a failure.
5. Replication:
 Keeps multiple copies of data or components across nodes.
 If one fails, another takes over to ensure uninterrupted service.
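
As a concrete illustration of items 1 and 2, here is a minimal single-process sketch of checkpointing and rollback: the state is saved to a file with pickle, and after a simulated fault the last checkpoint is restored. The file name and state layout are illustrative.

```python
import pickle

CHECKPOINT_FILE = "state.ckpt"   # illustrative location for the saved state

def checkpoint(state):
    """Persist the current state so a later failure can roll back to it."""
    with open(CHECKPOINT_FILE, "wb") as f:
        pickle.dump(state, f)

def rollback():
    """Return the system to the last saved checkpoint."""
    with open(CHECKPOINT_FILE, "rb") as f:
        return pickle.load(f)

state = {"balance": 100}
checkpoint(state)                # periodic save

state["balance"] = -999          # a fault corrupts the in-memory state
state = rollback()               # undo the damage by restoring the checkpoint
print(state)                     # {'balance': 100}
```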

Stable Storage for Recovery


Stable storage ensures data remains safe even during failures. It uses redundant disks to
keep backups. For example:

 Updates are written to one disk first, then the second.


 If one disk fails, the data from the other disk is used for recovery.
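
A small sketch of this two-disk scheme, with ordinary files standing in for the two disks, is shown below. The file names are illustrative, and a real stable-storage layer would also checksum each copy.

```python
import json, os

DISK_A, DISK_B = "disk_a.json", "disk_b.json"   # stand-ins for two disks

def stable_write(record):
    # Write to the first disk and only then to the second, so at any
    # instant at least one disk holds a complete, valid copy.
    with open(DISK_A, "w") as f:
        json.dump(record, f)
    with open(DISK_B, "w") as f:
        json.dump(record, f)

def stable_read():
    # Prefer disk A; if it is missing or corrupted, recover from disk B.
    for path in (DISK_A, DISK_B):
        try:
            with open(path) as f:
                return json.load(f)
        except (OSError, json.JSONDecodeError):
            continue
    raise RuntimeError("both disks failed")

stable_write({"account": 7, "balance": 50})
os.remove(DISK_A)                 # simulate losing one disk
print(stable_read())              # recovered from the surviving disk
```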

Checkpointing and Recovery Techniques


1. Coordinated Checkpointing:
 All processes save their states together to ensure consistency.
 This prevents errors from spreading across nodes.
2. Message Logging:
 Logs messages between nodes.
 If a failure occurs, the messages can be replayed to restore the system’s state.
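
The sketch below illustrates message logging and replay in a single process: each incoming message is appended to a log before it is applied, and recovery rebuilds the state by replaying the log from the initial state. The LoggedProcess class is an illustrative name; in practice the log would be kept on stable storage and replay would start from the last checkpoint.

```python
class LoggedProcess:
    """Applies messages to its state and records them in a log so the
    state can be reconstructed after a failure."""

    def __init__(self):
        self.state = {}
        self.log = []             # message log (on stable storage in practice)

    def receive(self, msg):
        self.log.append(msg)      # log the message before applying it
        self._apply(msg)

    def _apply(self, msg):
        key, value = msg
        self.state[key] = value

    def recover(self):
        # Rebuild the state by replaying every logged message in order.
        self.state = {}
        for msg in self.log:
            self._apply(msg)

p = LoggedProcess()
p.receive(("x", 1))
p.receive(("y", 2))

p.state = {}          # simulate a crash that loses the in-memory state
p.recover()           # replay the log
print(p.state)        # {'x': 1, 'y': 2}
```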
