Distributed 5
Memory
Distributed shared memory (DSM) is a mechanism that manages memory across
multiple nodes and makes inter-process communication transparent to end users.
Designing a DSM system means dealing with the following design issues.
1. Granularity:
Granularity refers to the size of the memory block that can be shared or moved across the
network. It can be small (like a word) or large (like a whole page). The block size should
match the needs of the application and network.
2. Structure of Shared Memory Space:
This is about how the shared memory is organized. The way memory is structured depends
on the type of applications the DSM system is designed to support.
3. Memory Coherence and Access Synchronization:
Multiple nodes can access the same data at the same time, which can lead to
inconsistencies. To prevent this, DSM uses techniques like locks, semaphores, or other
synchronization methods to ensure data consistency.
4. Data Location and Access:
The system needs a way to find and retrieve data blocks when requested by processors.
This requires a mechanism to locate the data and ensure it follows the memory consistency
rules.
5. Replacement Strategy:
When the local memory is full, the system must decide which old data block to remove to
make space for new data. This decision is crucial for efficiency.
6. Thrashing:
If two nodes repeatedly move the same data block back and forth because they both need
it, it can slow down the system. DSM should have strategies to reduce such unnecessary
data movement.
7. Heterogeneity:
If the DSM system operates in an environment with machines of different architectures
(heterogeneous systems), it should be designed to handle these differences and still work
correctly.
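Two of the issues above, granularity and replacement strategy, can be illustrated with a small sketch. It is not taken from any particular DSM system: the 4 KiB block size, the names `block_id` and `LocalBlockCache`, and the choice of least-recently-used (LRU) eviction are all assumptions made for illustration.

```python
from collections import OrderedDict

BLOCK_SIZE = 4096  # assumed granularity: one 4 KiB page per block


def block_id(address: int) -> int:
    """Map a byte address to the id of the block that contains it."""
    return address // BLOCK_SIZE


class LocalBlockCache:
    """A node's local memory, holding at most `capacity` blocks (hypothetical)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block id -> contents, least recently used first
        self.evicted = []            # ids of evicted blocks, in eviction order

    def access(self, address: int) -> int:
        """Touch the block containing `address` and return its block id."""
        bid = block_id(address)
        if bid in self.blocks:
            # Hit: mark the block as most recently used.
            self.blocks.move_to_end(bid)
        else:
            if len(self.blocks) >= self.capacity:
                # Local memory is full: evict the least recently used block.
                victim, _ = self.blocks.popitem(last=False)
                self.evicted.append(victim)
            # Stand-in for fetching the block from a remote node.
            self.blocks[bid] = bytearray(BLOCK_SIZE)
        return bid
```

With a capacity of two blocks, accessing addresses 0, 5000, 100 and 9000 in that order touches blocks 0, 1, 0 and 2, so block 1 is the one evicted. A larger `BLOCK_SIZE` would mean fewer fetches for sequential access but more data moved per fetch, which is the trade-off the granularity item describes.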
2. Migration Algorithm:
In the central server algorithm, every data access request is forwarded to
the location of the data. In the migration algorithm, by contrast, the data is
shipped to the node that issued the access request, which allows subsequent
accesses to be performed locally.
Only one node can access a shared data item at a time, and the whole block
containing the data item migrates rather than the individual item requested.
3. Read Replication Algorithm:
This extends the migration algorithm by replicating data blocks, so that
either multiple nodes have read access to a block or one node has both
read and write access.
It improves system performance by allowing multiple nodes to read
data concurrently.
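The two algorithms can be sketched as simple bookkeeping classes. This is a minimal illustration, not a real DSM implementation: the class and attribute names are hypothetical, no data is actually transferred, and invalidating the other copies on a write is an assumption (the notes only say that either many nodes read or one node reads and writes).

```python
class MigrationDirectory:
    """Migration algorithm: each block lives on exactly one node at a time."""

    def __init__(self):
        self.location = {}   # block id -> node currently holding the block
        self.migrations = 0  # number of times a block moved between nodes

    def access(self, node, block):
        """Ship the whole block to `node` if it currently lives elsewhere."""
        holder = self.location.get(block)
        if holder is not None and holder != node:
            self.migrations += 1
        self.location[block] = node
        return node


class ReadReplicationDirectory:
    """Read replication: many read-only copies, or one read-write copy."""

    def __init__(self):
        self.copies = {}         # block id -> set of nodes holding a copy
        self.writer = {}         # block id -> node with read-write access, if any
        self.invalidations = []  # (block, node) pairs, in the order they were sent

    def read(self, node, block):
        """Give `node` a read copy; another node's write access ends."""
        if self.writer.get(block) != node:
            self.writer.pop(block, None)
        self.copies.setdefault(block, set()).add(node)

    def write(self, node, block):
        """Assumed write-invalidate policy: drop every other copy first."""
        for other in sorted(self.copies.get(block, set()) - {node}):
            self.invalidations.append((block, other))
        self.copies[block] = {node}
        self.writer[block] = node
```

In the migration sketch, two nodes that alternate accesses to the same block cause one migration per access, which shows why this algorithm is sensitive to access patterns. In the read replication sketch, reads by several nodes are free after the first copy, while a write pays for one invalidation per remote copy.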
1. Checkpointing:
Periodically save the system’s state to storage.
After a failure, the system can revert to the last saved state.
2. Rollback Recovery:
Returns the system to a previous checkpoint after detecting an error.
Useful for undoing issues caused by faults.
3. Forward Recovery:
Instead of reverting, the system fixes the error and continues.
Requires anticipating and handling potential errors dynamically.
4. Logging and Replay:
Records system actions in logs to reconstruct the state after a failure.
5. Replication:
Keeps multiple copies of data or components across nodes.
If one fails, another takes over to ensure uninterrupted service.
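Checkpointing, rollback recovery, and logging with replay can be combined in one small sketch. The class below is hypothetical and keeps everything in memory; a real system would write the checkpoint and the log to stable storage, which is assumed away here. The state is a dictionary of counters chosen purely for illustration.

```python
import copy


class RecoverableStore:
    """Toy key/counter store with a checkpoint and an operation log."""

    def __init__(self):
        self.state = {}             # live (volatile) state
        self.checkpoint_state = {}  # last saved state
        self.log = []               # operations applied since the last checkpoint

    def apply(self, key, delta):
        """Log the operation first, then apply it to the live state."""
        self.log.append((key, delta))
        self.state[key] = self.state.get(key, 0) + delta

    def checkpoint(self):
        """Save the current state; earlier log entries are no longer needed."""
        self.checkpoint_state = copy.deepcopy(self.state)
        self.log.clear()

    def rollback(self):
        """Rollback recovery: return to the checkpoint, discarding later work."""
        self.state = copy.deepcopy(self.checkpoint_state)
        self.log.clear()

    def recover_by_replay(self):
        """Logging and replay: rebuild the state from checkpoint plus log."""
        state = copy.deepcopy(self.checkpoint_state)
        for key, delta in self.log:
            state[key] = state.get(key, 0) + delta
        self.state = state
```

The difference between the two recovery paths is what happens to work done after the checkpoint: `rollback` throws it away, while `recover_by_replay` reconstructs it from the log, so a crash that wipes `state` loses nothing that was logged.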