Distributed Process Management
Introduction
A process may be migrated because:-
the node does not have the required resources,
the node has to be shut down, or
the expected turnaround time will be better elsewhere.
It is the responsibility of the distributed operating system (DOS) to control the assignment of resources to
processes and to route each process to a suitable node of the system once a user submits a job.
In this chapter we will consider a resource to be a node/processor. A resource manager schedules processes, and
the methodologies for scheduling can be broadly classified thus:-
i. Task assignment approach – each process submitted by a user is viewed as a collection of related tasks,
which are then scheduled to suitable nodes to improve performance.
ii. Load-balancing approach – all submitted processes are distributed among the nodes so as to spread the workload equally.
iii. Load-sharing approach – attempts to ensure that no node is idle while there are processes waiting to be processed.
2. Load Balancing Approach:- Maximizes total system throughput. Various types of load-balancing algorithms
include:
Static Vs Dynamic:- Static algorithms use only information about the average behavior of the system, ignoring
the current system state, while dynamic algorithms react to changes in system state.
Fig: A taxonomy of load-balancing algorithms
Static algorithms are simple because there is no need to maintain state information, but they cannot react to
changes in system state; dynamic algorithms are the opposite: they adapt to the current state at the cost of
collecting and maintaining state information.
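The contrast can be illustrated with a minimal Python sketch (the scheduler names and the load dictionary are illustrative assumptions, not part of any real system):

```python
import itertools

def static_scheduler(nodes):
    """Static: a fixed round-robin over the nodes, derived from average
    behavior alone; it never consults the current load."""
    ring = itertools.cycle(nodes)
    return lambda: next(ring)

def dynamic_scheduler(loads):
    """Dynamic: consult the current state and pick the least loaded node."""
    return lambda: min(loads, key=loads.get)

# The static scheduler keeps cycling even if node "a" becomes overloaded;
# the dynamic one changes its answer as the loads dictionary changes.
loads = {"a": 0.9, "b": 0.1}
pick_static = static_scheduler(["a", "b"])
pick_dynamic = dynamic_scheduler(loads)
```

Because `pick_dynamic` reads `loads` on every call, updating a node's load immediately changes its decisions, which is exactly the state-maintenance cost the static variant avoids.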
Deterministic Vs Probabilistic:- Deterministic algorithms use information about the properties of the nodes and the
characteristics of the processes to be scheduled to allocate processes to nodes deterministically. Probabilistic
algorithms use information regarding static attributes of the system (e.g. number of nodes, processing capability of each
node, network topology) to formulate simple process-placement rules. Deterministic algorithms are difficult to
optimize and more expensive to implement, while probabilistic ones are easier to implement but give poorer performance.
Centralized Vs Distributed:- Decisions are made at one node in centralized algorithms and are distributed
among the nodes in distributed algorithms. A centralized algorithm can make effective process-assignment
decisions because it knows both the load at each node and the number of processes needing service; each node is
responsible for updating the central server with its state information. However, it suffers from poor reliability, so
replication becomes necessary, with the added cost of maintaining information consistency. In distributed
algorithms there is no master entity: each entity is responsible for making scheduling decisions for the processes
of its own node, either transferring local processes or accepting remote ones.
i. Load estimation policy:- Determines how to estimate the workload of a node. Since simple parameters
(e.g. the number of processes at a node) may not necessarily work well, an acceptable method is to measure the
CPU utilization of the nodes.
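A common way to turn raw CPU-utilization samples into a load estimate is an exponentially weighted moving average; the sketch below is illustrative (the function name and the smoothing factor are assumptions, not from the text):

```python
def smoothed_utilization(samples, alpha=0.5):
    """Estimate a node's load as an exponentially weighted moving average
    of periodic CPU-utilization samples (each between 0.0 and 1.0).
    A larger alpha weights recent samples more heavily."""
    estimate = samples[0]
    for s in samples[1:]:
        estimate = alpha * s + (1 - alpha) * estimate
    return estimate

# A node whose recent samples trend upward is reported busier than its
# long-run average alone would suggest.
round(smoothed_utilization([0.2, 0.4, 0.9]), 3)  # → 0.6
```

Smoothing matters because a single instantaneous sample can misclassify a node that is only momentarily busy or idle.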
ii. Process transfer policy:- Determines whether to execute a process locally or remotely. This policy
observes a "threshold policy", i.e. a limit on a node's workload.
Two techniques, single-threshold or double-threshold, can be used.
Fig a: single-threshold policy – a node's load region is either overloaded or under-loaded.
Fig b: double-threshold policy – a node's load region is overloaded, normal, or under-loaded.
In instance b:
Overloaded region – new local processes are sent for remote execution; new remote requests are rejected.
Normal region – new local processes run locally; requests from remote processes are rejected.
Under-loaded region – new local processes run locally; new remote processes are accepted.
iii. Location policies:- Select the destination node for a process's execution, the main policies being:
a. Threshold:- selects a destination node at random and checks whether the process transfer would place that
node in a state that prohibits it from accepting remote processes.
b. Shortest:- a number of nodes are chosen at random and polled in turn to determine their load, and the one
with the least load is given the job.
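The "shortest" policy can be sketched as follows (the poll count and data layout are assumptions for illustration):

```python
import random

def shortest(loads, poll_count=3, rng=random):
    """'Shortest' location policy: poll a few randomly chosen nodes and
    pick the one reporting the least load.
    loads: dict mapping node id -> current load."""
    polled = rng.sample(list(loads), min(poll_count, len(loads)))
    return min(polled, key=lambda n: loads[n])

# With three nodes and poll_count=3, every node is polled and the
# least loaded one wins.
shortest({"a": 0.9, "b": 0.1, "c": 0.5})
```

In a real system each poll would be a network message, so `poll_count` trades decision quality against messaging overhead.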
c. Bidding:- each node can take on the roles of both "manager" and "contractor": a manager broadcasts a
request and contractors bid for it, the best bidder winning. A contractor may, however, win more bids than
it can handle, so the contractor must send a message back to the manager accepting or rejecting the award,
leading either to process transfer or to a new broadcast.
d. Pairing:- pairs nodes and reduces the variance of loads only between the pairs. A node
randomly chooses another node and requests pairing, i.e. load sharing. While waiting, the requesting
node does not accept pairing requests from other nodes until it gets a response from the requested node.
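A minimal sketch of the pairing idea, assuming the partner always agrees (real pairing needs the request/response handshake described above, which is not modelled here):

```python
import random

def request_pairing(loads, requester, rng=random):
    """Pairing sketch: the requester randomly picks a partner, and the
    pair equalize load between themselves only, reducing the load
    variance within the pair."""
    partner = rng.choice([n for n in loads if n != requester])
    avg = (loads[requester] + loads[partner]) / 2
    loads[requester] = loads[partner] = avg
    return partner
```

Because only the two paired nodes exchange load, the policy reduces variance locally without any global state.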
iv. State information exchange policies:- Proper selection of the state-information exchange policy is essential,
and one of the following policies may be used for this purpose:
a. Periodic broadcast – each node broadcasts its state after every time period t.
Not good, due to heavy network traffic and the possible existence of "fruitless" messages (i.e.
messages from nodes whose state has not changed within time t).
Scalability is hence poor.
b. Broadcast when state changes – avoids the problem of fruitless messages.
Small state changes need not be reported to all nodes, i.e. broadcasting occurs only when a
node switches load regions.
Works only with the two-threshold transfer policy.
c. On-demand exchange – a node broadcasts its state only when it becomes either under-loaded or overloaded,
and only the nodes concerned reply. It too works only with the two-threshold transfer policy.
d. Exchange by polling:- an affected node searches for a suitable partner by randomly polling other
nodes one by one. The polling process stops either when a suitable partner is found or when a predefined poll limit
is reached, e.g. m nodes polled out of n, where m < n.
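The polling loop can be sketched directly (the suitability test and poll limit are parameters the caller supplies; names are illustrative):

```python
import random

def find_partner(nodes, is_suitable, poll_limit, rng=random):
    """Exchange by polling: poll randomly chosen nodes one by one,
    stopping at the first suitable partner or when the poll limit
    (m polls out of n nodes, m < n) is reached."""
    order = list(nodes)
    rng.shuffle(order)
    for node in order[:poll_limit]:
        if is_suitable(node):
            return node
    return None  # give up; handle the process locally
```

Capping the number of polls bounds the messaging cost per decision, at the price of sometimes failing to find a partner that does exist.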
v. Priority assignment policies:- Determine the relative priorities of local and remote processes at a node:
a. Selfish:- local processes are given higher priority than remote processes.
b. Altruistic:- remote processes are given higher priority than local processes.
c. Intermediate:- weighs the numbers of local and remote processes, i.e. if the number of local processes
is greater than or equal to the number of remote processes, local processes get higher priority, and vice
versa. This treats local processes better than remote ones, with an overall response time close to that of (b) above.
vi. Migration-limiting policies:- Decide the total number of times a process should be allowed to
migrate, using one of the following policies:
a. Uncontrolled:- the number of migrations is unlimited, which can cause instability.
b. Controlled:- sets a limit on how many times a process may be migrated.
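The controlled policy amounts to carrying a migration counter with each process; a minimal sketch (the class, limit value, and exception are illustrative assumptions):

```python
class Process:
    """Controlled migration-limiting: each process carries a migration
    counter and may move at most `limit` times."""
    def __init__(self, limit=2):
        self.limit = limit       # assumed default limit
        self.migrations = 0

    def may_migrate(self):
        return self.migrations < self.limit

    def migrate(self):
        if not self.may_migrate():
            raise RuntimeError("migration limit reached")
        self.migrations += 1
```

Once the counter hits the limit, the process must finish wherever it is, which prevents the endless back-and-forth migration that the uncontrolled policy allows.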
3. Load-Sharing Approach:- It is necessary and sufficient to prevent some nodes from being idle while others are
busy; this is the essence of load sharing. Policies for load sharing include:
a. Load estimation policies:- try only to check whether a node is idle or busy, e.g. by measuring CPU
utilization.
b. Process transfer policies:- may use single- or double-threshold policies, i.e. with a single threshold, a node
already executing one process may transfer further processes; with a double threshold, anticipation is a
factor, i.e. nodes about to become under-utilized are given more processes.
c. Location policies:- which may take either of the forms:
Sender-initiated policy – the sending node decides where to send a process.
Receiver-initiated policy – the receiving node decides from where to get a process.
d. State information exchange policies:- a node normally exchanges state information only when its state
changes, which takes either of the forms below:
i. Broadcast when state changes.
ii. Poll when state changes.
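The two location policies above can be contrasted in a small sketch (threshold values and function names are illustrative assumptions):

```python
def sender_initiated(loads, threshold=0.7):
    """Sender-initiated: each overloaded node picks the least loaded
    other node as the destination for one of its processes."""
    return {s: min((n for n in loads if n != s), key=lambda n: loads[n])
            for s, load in loads.items() if load > threshold}

def receiver_initiated(loads, threshold=0.2):
    """Receiver-initiated: each under-loaded node picks the most loaded
    other node as the source from which to fetch a process."""
    return {r: max((n for n in loads if n != r), key=lambda n: loads[n])
            for r, load in loads.items() if load < threshold}
```

The returned dictionaries map each initiating node to its chosen partner; in a real system each lookup would be replaced by polling, since no node knows every other node's load exactly.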
PROCESS MIGRATION
Distributed process management is concerned with making the best possible use of the processing resources of the
entire system by sharing them among all processes. Three important concepts are used to achieve this goal:
i. Processor allocation:- decides which process is assigned to which processor.
ii. Process migration:- the movement of a process from its current location to the processor to
which it has been assigned.
iii. Threads:- deal with fine-grained parallelism for better utilization of the processing
capability of the system.
Process migration
A process may be migrated before it starts executing (non-preemptive migration) or after it has started (preemptive
migration). The latter is more costly, since the process's execution environment must also accompany the process.
Migration involves:
i. Selecting a process to move.
ii. Selecting the destination node.
iii. Transferring the process to the new node.
The following are used as address-space transfer mechanisms:-
Total freezing:- the process's execution is stopped while its address space is transferred.
Pre-transferring:- the address space is transferred while the process still runs on the source node.
Transfer on reference:- assumes a process uses only a small part of its address space while
executing, so the address space is left at the source node. As the relocated process executes
on the destination node, a page of the migrant process's address space is transferred from source to
destination only when referenced.
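Transfer on reference behaves like demand paging across nodes; a minimal simulation (class and attribute names are illustrative assumptions):

```python
class MigratedProcess:
    """Transfer-on-reference sketch: the address space stays at the
    source node; a page is copied to the destination the first time
    the migrated process references it."""
    def __init__(self, source_pages):
        self.source = source_pages   # pages left behind at the source node
        self.local = {}              # pages copied to the destination on demand

    def read(self, page):
        if page not in self.local:   # "page fault": fetch from the source node
            self.local[page] = self.source[page]
        return self.local[page]
```

The migration itself is cheap because nothing is copied up front, but every first reference to a page pays a remote-fetch cost, and the source node must stay reachable.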
Threads:
Threads may be regarded as lightweight processes that help improve application performance through parallelism. In
an OS with a threads facility, the basic unit of CPU utilization is a thread. A process, and hence all its threads, is
owned by a single user. Threads share the CPU just as processes do, on a time-sharing basis. The motivations for
using threads are several. (Read more about threads.)
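A small Python example of the idea that all threads of a process share one address space (the worker function and list are illustrative; CPython's GIL limits true parallelism, but the sharing semantics hold):

```python
import threading

# All threads of a process share its address space, so the workers can
# append to the same list without any message passing.
results = []
lock = threading.Lock()

def worker(n):
    with lock:               # protect the shared structure from races
        results.append(n * n)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                 # wait for every thread to finish

print(sorted(results))       # [0, 1, 4, 9]
```

Contrast this with separate processes, which would need explicit inter-process communication to collect the same results.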