
DISTRIBUTED PROCESS MANAGEMENT

Introduction
A process may be migrated because:-
• the node does not have the required resources,
• the node has to be shut down, or
• the expected turnaround time will be better elsewhere.

It is the responsibility of the distributed operating system (DOS) to control the assignment of resources to processes and to route each process to a suitable node of the system once a user submits a job.

In this chapter we will consider a resource to be a node/processor. A resource manager schedules processes and
methodologies for scheduling can be broadly classified thus:-
i. Task assignment approach – where each process submitted by a user is viewed as a collection of related
tasks which are then scheduled to suitable nodes to improve performance.
ii. Load-balancing Approach – all processes submitted are distributed to spread workload equally.
iii. Load-sharing Approach – attempts to ensure that no node is idle while processes wait to be processed.

Desirable Features of a Good Global Scheduling Algorithm.


i. No a priori knowledge about the processes - A good scheduling algorithm should operate with no a priori knowledge about the processes to be executed, so as not to place an extra burden on users to specify this information during job submission.
ii. Dynamic nature - should be able to take care of the dynamically changing load (or status) of the
various nodes of the system. (requires preemptive process migration facility)
iii. Quick decision-making capability - must make quick decisions about the assignment of processes to processors; e.g. heuristic methods requiring less computational effort while providing near-optimal results are normally preferable to exhaustive (optimal) solution methods.
iv. Balanced system performance and scheduling overhead - algorithms that provide near-optimal system performance with minimal overhead for gathering global state information are desirable (aged state information leads to poor decisions).
v. Stability - an algorithm is unstable if it can enter a state in which all the nodes spend all their time migrating processes, without accomplishing any useful work, in an attempt to properly schedule the processes for better performance; this is termed processor thrashing.
vi. Scalability - the algorithm should be capable of handling small as well as large networks e.g.
probe only M of N nodes for selecting host.
vii. Fault Tolerance - should not be disabled by the crash of one or more nodes.
viii. Fairness of service – e.g. where some loads are larger than others, a node should be able to share
some of its resources as long as its users are not significantly affected.
Explanations:
1. Task Assignment Approach:- Here a process is considered to be made up of multiple tasks. The goal is to find an optimal assignment policy for the tasks of an individual process. Under suitable assumptions, we seek to achieve:-
• Minimization of IPC costs
• Quick turnaround time
• A high degree of parallelism
• Efficient utilization of system resources in general

However, the approach is deterministic and lacks the ability to adapt to dynamically changing workloads.
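As a toy illustration of the task assignment approach, the sketch below exhaustively searches task-to-node placements, charging each task an execution cost on its node plus an IPC cost whenever two communicating tasks land on different nodes. All task/node names and cost values are hypothetical, and real systems use graph-theoretic or heuristic methods rather than brute force:

```python
from itertools import product

def best_assignment(exec_cost, comm_cost, nodes):
    """Exhaustively search task-to-node assignments, minimising total
    execution cost plus IPC cost (IPC is charged only when two
    communicating tasks are placed on different nodes)."""
    tasks = list(exec_cost)
    best, best_cost = None, float("inf")
    for placement in product(nodes, repeat=len(tasks)):
        assign = dict(zip(tasks, placement))
        cost = sum(exec_cost[t][assign[t]] for t in tasks)
        cost += sum(c for (t1, t2), c in comm_cost.items()
                    if assign[t1] != assign[t2])
        if cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost
```

For two tasks that communicate heavily, the search tends to co-locate them even when one node is slightly slower, which is exactly the IPC-minimisation goal listed above.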

2. Load Balancing Approach:- Maximizes total system throughput. Various types of load-balancing algorithms
include:
• Static vs. dynamic:- Static algorithms use only information about the average behaviour of the system, ignoring the current system state, while dynamic algorithms react to system state changes.

Fig: A taxonomy of load-balancing algorithms

Static algorithms are simple because no state information needs to be maintained, but they cannot react to changes in system state; for dynamic algorithms it is the other way around.

• Deterministic vs. probabilistic:- Deterministic algorithms use information about the properties of the nodes and the characteristics of the processes to be scheduled to deterministically allocate processes to nodes. Probabilistic algorithms use information about static attributes of the system, e.g. the number of nodes, the processing capability of each node, the network topology etc., to formulate simple process placement rules. Deterministic algorithms are difficult to optimize and more expensive to implement, while probabilistic ones are easier but give poorer performance.

• Centralized vs. distributed:- Decisions are made at one node in centralized algorithms and are distributed among the nodes in distributed algorithms. A centralized algorithm can make effective process assignment decisions because it knows both the load at each node and the number of processes needing service; each node is responsible for updating the central server with its state information. However, it suffers from poor reliability, so replication is necessary, at the added cost of maintaining information consistency. In distributed algorithms there is no master entity, i.e. each entity is responsible for making scheduling decisions for the processes of its own node, by either transferring local processes or accepting remote ones.

• Cooperative vs. non-cooperative:- in non-cooperative algorithms, entities act autonomously, making scheduling decisions independently of others; in cooperative algorithms they act in concert. Cooperative algorithms are more complex and incur larger overheads, but display better stability.

Issues In Designing Load-Balancing Algorithms.


The following should be considered:-
i. Load estimation policy:- That determines how to estimate the workload of a particular node of the system. Measurable parameters used may include:
• Total number of processes on the node at the time of load estimation.
• Resource demands of these processes.
• Instruction mixes of these processes.

Since the above parameters may not necessarily work well, an acceptable method is to measure the CPU utilization
of the nodes.
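A minimal sketch of CPU-utilization-based load estimation, using an exponentially weighted average so that recent samples dominate; the class name and the smoothing factor `alpha` are illustrative assumptions, not from any particular system:

```python
class LoadEstimator:
    """Estimate a node's load as an exponentially smoothed CPU
    utilisation; recent samples are weighted by `alpha`."""
    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.estimate = 0.0

    def sample(self, utilization):
        # Blend the newest utilisation reading into the running estimate.
        self.estimate = self.alpha * utilization + (1 - self.alpha) * self.estimate
        return self.estimate
```

Smoothing matters here because a single instantaneous CPU reading fluctuates too much to be a reliable basis for transfer decisions.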
ii. Process transfer policy:- That determines whether to execute a process locally or remotely. This policy
observes the “threshold policy” i.e. the limiting of a node’s workload.

2|Page
Two techniques, single-threshold or double-threshold, can be used.

Fig a: single-threshold policy - one threshold divides a node's load scale into an overloaded and an under-loaded region.

Fig b: double-threshold policy - two thresholds divide the load scale into overloaded, normal and under-loaded regions.

In instance b:
• Overloaded region – new local processes are sent for remote execution; new remote requests are rejected.
• Normal region – new local processes run locally; requests for remote execution are rejected.
• Under-loaded region – new local processes run locally; new remote processes are accepted.
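The double-threshold behaviour described above can be sketched as follows; the two threshold values are illustrative assumptions:

```python
LOW, HIGH = 0.3, 0.8  # assumed double-threshold values

def region(load):
    """Classify a node's load into one of the three regions."""
    if load > HIGH:
        return "overloaded"
    if load < LOW:
        return "underloaded"
    return "normal"

def place_new_process(load):
    """New local processes run locally unless the node is overloaded."""
    return "remote" if region(load) == "overloaded" else "local"

def accept_remote(load):
    """Remote processes are accepted only while under-loaded."""
    return region(load) == "underloaded"
```

The middle "normal" band is what distinguishes this from a single-threshold policy: a node stops accepting remote work well before it would itself need to export work.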

iii. Location policies:- Select the destination node for a process’s execution, the main policies being:
a. Threshold:- selects a destination node at random and checks whether the process transfer would place that node in a state that prohibits it from accepting remote processes.
b. Shortest:- a number of nodes are chosen at random and polled in turn to determine their loads, and the one with the least load is given the job.
c. Bidding:- each node can take on the role of "manager" or "contractor": the manager broadcasts a request and contractors bid for it. The best bidder wins, but a contractor that wins too many bids may itself become overloaded; hence the contractor must send an acceptance or rejection message back to the manager, leading either to process transfer or to a new broadcast.
d. Pairing:- pairs nodes and reduces the variance of loads only between the pairs. A node randomly chooses another and requests pairing, i.e. load sharing. While waiting, the requesting node does not accept communication from other nodes until it gets a response from the requested node.
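The "shortest" policy above might be sketched as follows, assuming each poll simply returns the polled node's current load; `poll_limit` plays the role of the M-of-N probe limit mentioned earlier and is an illustrative parameter:

```python
import random

def shortest(node_loads, poll_limit=3, rng=random):
    """'Shortest' location policy: poll up to `poll_limit` randomly
    chosen nodes and hand the job to the least loaded one polled."""
    polled = rng.sample(list(node_loads), min(poll_limit, len(node_loads)))
    return min(polled, key=node_loads.get)
```

Bounding the number of polls is what keeps the policy scalable: the cost of a placement decision stays constant even as the number of nodes grows.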

iv. State info exchange policies:- Proper selection of the state-information exchange policy is essential, and one of the following policies may be used for this purpose:
a. Periodic broadcast – each node broadcasts after every time period t.
• Not good, due to heavy network traffic and the possible existence of "fruitless" messages (i.e. messages from nodes whose state has not changed within time t).
• Scalability is hence poor.
b. Broadcast when state changes – avoids the problem of fruitless messages.
• Small state changes are not necessarily reported to all nodes, i.e. broadcasting occurs only when a node switches load regions.
• Works only with a two-threshold transfer policy.
c. On-demand exchange – node broadcasts when either under/overloaded and only those nodes in need
communicate. It works only with two-threshold transfer policy.
d. Exchange by polling:- an affected node searches for a suitable partner by randomly polling other nodes one by one. The polling process stops either when a suitable partner is found or when a predefined poll limit is reached, e.g. m out of n nodes, where m < n.

v. Priority Assignment Policies:- Could take any of the three forms:-


a. Selfish:- Local processes have higher priority than remote ones, but this yields the worst response-time performance among the three policies, i.e. poor performance for remote processes but the best response time for local processes. Beneficial for processes arriving at a lightly loaded node.
b. Altruistic:- Remote processes have higher priority than local ones. Achieves the best response-time performance of the three policies, but remote processes experience lower delays than local ones, which is unfair.

c. Intermediate:- Balances between the numbers of local and remote processes, i.e. if the number of local processes exceeds the number of remote processes, local processes are given higher priority; otherwise, remote processes are. Treats local processes better than remote ones, with overall response time close to that of (b) above.

vi. Migration-Limiting Policies:- Decides on the total number of times a process should be allowed to
migrate and may use one of the following policies:
a. Uncontrolled:- The number of migrations is unlimited, which can cause instability.
b. Controlled:- Sets a limit as to how many times a process may be migrated.
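A controlled migration-limiting policy can be sketched as a simple per-process counter; the class and the limit of 2 are arbitrary illustrative choices:

```python
class Process:
    """Toy process record carrying a migration counter."""
    def __init__(self, pid, migration_limit=2):
        self.pid = pid
        self.migration_limit = migration_limit  # illustrative limit
        self.migrations = 0

def try_migrate(proc):
    """Controlled policy: refuse migration once the limit is reached,
    preventing the instability of uncontrolled migration."""
    if proc.migrations >= proc.migration_limit:
        return False
    proc.migrations += 1
    return True
```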

3. Load-sharing Approach:- Load balancing has been criticized because:
• The overhead involved in gathering state information is very large.
• True balancing is not achievable, because the number of processes at a node is always fluctuating.

Thus it is necessary and sufficient to prevent some nodes from being idle while others are busy, which is the essence of load sharing. Policies for load sharing include:
a. Load estimation policies:- try only to check if a node is idle or busy, e.g. via measuring CPU
utilization.
b. Process transfer policies:- May use single- or double-threshold policies, i.e. with a single threshold, a node already executing one process may transfer further processes; with double thresholds, anticipation is a factor, i.e. nodes about to become underutilized are given more processes.
c. Location Policies:- which may take the forms of:-
• Sender-initiated policy - in which the sender node decides where to send the process.
• Receiver-initiated policy - in which the receiver node decides from where to get a process.
d. State information exchange policies:- A node normally exchanges state information only when its state changes, taking either of the forms below:
i. Broadcast when state changes.
ii. Poll when state changes.
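The sender-initiated and receiver-initiated location policies might be sketched as follows, assuming each node holds a (possibly stale) view of the other nodes' loads; the threshold values and the "first suitable node" rule are illustrative assumptions:

```python
def sender_initiated(local_load, node_loads, threshold=0.8):
    """An overloaded sender picks the first node it believes is not
    busy and transfers work there; returns None if no transfer."""
    if local_load <= threshold:
        return None  # sender is not overloaded, keep work local
    for node, load in node_loads.items():
        if load < threshold:
            return node
    return None

def receiver_initiated(local_load, node_loads, threshold=0.2):
    """An idle receiver asks the first node it believes is busy for
    work; returns None if the receiver is not idle or none is busy."""
    if local_load > threshold:
        return None  # receiver is not idle, no need to pull work
    for node, load in node_loads.items():
        if load > threshold:
            return node
    return None
```

The classic trade-off follows directly from the sketch: sender-initiated transfers generate probing traffic exactly when the system is busiest, while receiver-initiated probing costs something only on nodes that are idle anyway.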

PROCESS MIGRATION

Process migration is concerned with making the best possible use of the processing resources of the entire system by sharing them among all processes. Three important concepts are used to achieve this goal:
i. Processor allocation: - decides on which process is assigned to which processor.
ii. Process migration: - which is the movement of a process from its current allocation to the
processor to which it can be assigned.
iii. Threads: - which deals with fine-grained parallelism for better utilization of the processing
capability of the system.

Process migration
A process may be migrated before (non-preemptive migration) or after (preemptive migration) it starts executing. The latter is more costly, since the process's environment must also accompany the process. Migration involves:
i. Selecting a process to move.
ii. Selecting the destination node.
iii. Transferring the process to new node.

Desirable Features Of A Good Process Migration Mechanism.


i. Transparency:- which occurs at two levels namely:-
• Object access level - allows access to objects, e.g. files and devices, in a location-independent manner.
• System call and interprocess communication level - i.e. these must be location-independent; e.g. once a message is sent, it should reach its receiver process without resending, even if the receiver moves to another node before the message arrives.
ii. Minimal interference:- Migration must cause only minimal interference to the progress of the process involved and to the system as a whole, e.g. by minimizing "freezing time".
iii. Minimal Residual Dependencies:- No residual dependency should remain on the previous node, or else:
• The migrated process continues to impose a load on its previous node, diminishing the benefits of migration.
• A failure/reboot of the previous node causes the process to fail.
iv. Efficiency:- involves minimizing:
• Freezing time.
• The cost of locating an object.
• The cost of supporting remote execution once the process is migrated.
v. Robustness:- i.e. failure of a node other than the one on which a process is currently running
should not affect the accessibility/execution of that process.
vi. Communication between co-processes of a job: - It is vital that co-processes can communicate directly with each other irrespective of their locations.

Process Migration Mechanisms


Migration involves the following activities: - freezing the process, transferring the process, forwarding messages meant for the migrant process, and handling communication between cooperating processes. Freezing a process means that the execution of the process is suspended and all external interactions with the process are deferred.
Freezing and restarting operations normally do differ from system to system, but some general issues involved in
these operations are described as follows.
i. Immediate and Delayed Blocking of the process: - before a process is frozen, its execution is blocked either immediately or once the process reaches a state suitable for blocking. After blocking, it is advisable to wait for the completion of fast I/O operations (e.g. disk I/O) associated with the process. Information about open files must be maintained as well. Finally, the process is restarted on its destination node in whatever state it was in before migration.
ii. Address Space Transfer Mechanisms: - Migration involves transfer of the following information types from source to destination node:
• Process state, e.g. execution status, scheduling information, I/O state, accessible objects etc.
• Process address space, i.e. the code, data and stack of the program.

The following are used as address-space transfer mechanisms:-
• Total freezing: - the process's execution is stopped while its address space is transferred.
• Pre-transferring: - the address space is transferred while the process still runs on the source node.
• Transfer on reference: - assumes a process uses only a small part of its address space while executing, so the address space is left at the source node; as the relocated process executes on the destination node, a page of its address space is transferred from source to destination only when it is referenced.
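Transfer-on-reference can be sketched as a lazy page fetch, where a hypothetical page table keeps every page on the source node until its first access at the destination; the class and field names are illustrative:

```python
class MigratedAddressSpace:
    """Pages remain on the source node; each page is fetched to the
    destination only on its first reference (transfer on reference)."""
    def __init__(self, source_pages):
        self.source = source_pages  # pages left behind on the source node
        self.local = {}             # pages fetched to the destination so far
        self.fetches = 0            # number of remote page fetches performed

    def read(self, page_no):
        if page_no not in self.local:
            self.local[page_no] = self.source[page_no]  # simulated remote fetch
            self.fetches += 1
        return self.local[page_no]
```

The residual-dependency drawback discussed earlier is visible here: every miss still touches the source node, so a source failure breaks the migrated process.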

Mechanisms for handling co processes:


These mechanisms provide efficient communication between a process and its sub-processes, which might have been migrated to different nodes. Two mechanisms are:
i. Disallowing separation of co-processes:- This can be achieved via:-
• Disallowing the migration of processes that wait for one or more of their children to complete.
• Ensuring that when a parent process migrates, its child processes migrate along with it.
ii. Home node or origin site concept:- The home node concept means that communication between a process and its sub-processes occurs via the home node. It allows complete freedom to migrate a process or its sub-processes independently and execute them on different nodes of the system. However, message traffic and communication costs increase considerably, since all communication between a parent process and its child processes takes place via the home node.
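The home-node concept can be sketched as a relay that tracks each process's current node and forwards every message through itself; all names and the `network` dictionary (node name to inbox list) are hypothetical:

```python
class HomeNode:
    """Relays all messages to a process via its current location,
    which the home node tracks across migrations."""
    def __init__(self):
        self.location = {}  # pid -> node currently hosting the process
        self.relayed = 0    # count of messages relayed via the home node

    def migrate(self, pid, node):
        self.location[pid] = node  # record initial placement or a migration

    def send(self, dst_pid, msg, network):
        """Deliver `msg` into `network` at the destination's current node."""
        self.relayed += 1
        network.setdefault(self.location[dst_pid], []).append((dst_pid, msg))
```

The `relayed` counter makes the stated drawback concrete: every parent-child message costs an extra hop through the home node, however far apart the two processes migrate.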

Advantages of Process Migration


• Reducing the average response time of processes:- by redistributing load from heavily loaded to idle or lightly utilized nodes.
• Speeding up individual jobs:- by redistributing a job's tasks to execute concurrently, by using a node with a faster CPU, or by migrating the job to whichever node offers the minimum turnaround time.
• Gaining higher throughput:- by applying a suitable load-balancing policy.
• Utilizing resources effectively:- i.e. a process can be migrated to the most suitable node so that system resources are used in the most effective manner.
• Reducing network traffic:- i.e. migrate a process closer to the resources it uses most heavily, or migrate and cluster two or more processes that communicate frequently onto the same node.
• Improving system reliability:- i.e. migrate a critical process to a node of higher reliability; migrate a copy of a critical process and execute the original and the copy concurrently; or migrate the processes of a failing node to another node before the node fails completely.
• Improving system security:- i.e. run a sensitive process on a more secure node.

Threads:
May be referred to as lightweight processes that help improve application performance through parallelism. In an OS with a threads facility, the basic unit of CPU utilization is a thread. A process, and hence all its threads, is owned by a single user. Threads share the CPU just as processes do, on a time-sharing basis. The motivations for using threads are several. (Read more about threads.)
