An Operating System Framework For Large
Abstract
Little work has been done on operating systems for massively parallel computing. This paper proposes a framework for such an operating system. It is assumed that there are multiple jobs executing
on a large MIMD computer. Each job is assumed to be data parallel, using as many virtual processors
as necessary to exploit its inherent parallelism. We view the notion of virtual processors as playing
a unifying role in the conceptual design of the operating system. Our main thesis is that the vari-
ous functions performed by the operating system may be viewed as operations on the set of virtual
processors.
In the context of the above framework, several open theoretical problems are identified, and in particular, the twin problems of spatial and temporal scheduling are addressed. Preliminary analysis indicates the viability of horizontal spatial schedules and periodic temporal schedules.
This research was supported in part by the Air Force Office of Scientific Research under grant numbers F49620-92-J-0126 and AFOSR-90-0144, by NASA under grant number NAG-5-1897, and by the NSF under grant numbers MIP-9106949 and MIP-9205737.
† On leave from the Institute of Informatics, Warsaw University.
‡ Supported by CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico), Brazilian Government, under grant number 200358-92.8.
1 Introduction
When programming a uniprocessor machine, the user has to know very little about the machine to write efficient programs. With current massively parallel processors (MPPs), however, the user needs to know machine-dependent details such as the number of processors available and the amount of memory at each node. In addition, users often analyze the machine's communication delays and the application's communication patterns in order to speed up programs [1]. Current parallel programs are, therefore, highly machine-specific, and difficult to port across machines. One reason for this situation is the lack of operating systems that can efficiently bridge the gap between parallel machines and high-level parallel programming models.
Moreover, the high cost of current MPPs demands that they be utilized efficiently and to the fullest. (In the past, the same economic considerations applied to mainframes.) Typically, this requires the accommodation of multiple users, and an operating system to manage the sharing of machine resources among these users.
There are, in fact, several MPP operating systems on the market, for example, the CM-5's CMost [25] and the T3D's UNICOS [17]. Unfortunately, the convenience afforded by these commercial operating systems is nowhere near what we have grown accustomed to with uniprocessor operating systems. Duties that properly belong to the operating system, such as processor virtualization and virtual memory management, are often ignored and dubbed "programmer responsibility." The inability of these operating systems to offer programming convenience without compromising performance may stem from the fact that they are largely extensions of uniprocessor operating systems.
This paper attempts to rethink, ab initio, the role of operating systems in massively parallel computing. We envisage several jobs executing on a large MIMD machine; each job is assumed to be data parallel, using as many virtual processors as necessary to exploit its inherent parallelism. We believe that the notion of virtual processors unifies the conceptual design of the operating system, in much the same way as the filesystem does in UNIX™. Our main thesis is that most activities of the operating system may be viewed as operations or manipulations on the set of virtual processors. Viewing the activities of the operating system in this context facilitates the examination of the merits and demerits of various operating system policies and reveals the underlying basic theoretical problems.
Section 2 presents a framework to view operating system issues in terms of virtual processors. In the context of this framework, the sections that follow take a closer look at two complementary problems: spatial and temporal scheduling. In Section 4, we consider spatial schedules, and make a qualitative comparison between two simple spatial scheduling policies. In Section 5, a simplified, discrete model for temporal scheduling is proposed, and metrics for evaluating temporal schedules are described. Section 6 finds the optimal temporal scheduling policy in terms of these metrics.
Figure 1. The function of the operating system is to emulate several virtual machines (corresponding to multiple programs) on a single physical machine; each virtual machine (VM) is a user job.
The combination of the data parallel programming model and a MIMD machine model is called an SPMD execution model [18, page 606]. SPMD stands for Single Program Multiple Data, indicating that all processors execute the same program, but may be at different instructions at a given time, owing to asynchronous execution. Throughout this paper, our discussion of operating systems is predicated on the SPMD execution model.
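To make the model concrete, here is a minimal SPMD sketch (our illustration, written with the mpi4py library, which is not part of the paper): every processor runs this same program, but the rank-dependent branch sends different processors down different paths, and nothing forces them to reach the same instruction at the same time.

    # Minimal SPMD sketch (illustrative): one program, run on every processor;
    # behavior is parameterized by the processor's rank.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()              # this processor's identity
    size = comm.Get_size()              # total number of processors

    data = rank * rank                  # each rank computes on its own data
    if rank != 0:
        comm.send(data, dest=0)         # workers proceed asynchronously...
    else:
        total = data
        for src in range(1, size):
            total += comm.recv(source=src)   # ...rank 0 gathers the results
        print("sum of squares:", total)

Run as, for example, "mpiexec -n 4 python spmd.py": all four processes execute the identical file, which is precisely the SPMD discipline.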
2.2 Virtual Processors as a Basis for Operating Systems
As pointed out before, a program is nothing more than a description of a virtual machine. In the case of a data parallel program, the virtual machine consists of a (typically large) number of identical virtual processors (VPs), communicating through an interconnection network. For instance, the standard data parallel program to multiply two N × N matrices [10] might be viewed as a virtual machine consisting of N² VPs communicating in a mesh.
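To illustrate (the code and names below are ours, not from [10]): each of the N² VPs is an identical little program that owns one row of A, one column of B, and one output element, so the number of VPs is fixed by the problem size rather than by the physical machine.

    # Sketch: the N x N matrix product C = A x B viewed as N^2 virtual
    # processors, VP (i, j) owning row i of A, column j of B, and C[i][j].
    class VP:
        def __init__(self, row, col):
            self.row, self.col = row, col   # this VP's slice of the data

        def step(self):                     # the identical per-VP program
            return sum(a * b for a, b in zip(self.row, self.col))

    def matmul(A, B):
        n = len(A)
        cols = list(zip(*B))                # column j of B as a tuple
        vps = [[VP(A[i], cols[j]) for j in range(n)] for i in range(n)]
        return [[vps[i][j].step() for j in range(n)] for i in range(n)]

    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]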
For many years now, the concept of virtual processors has been relegated to the status of a mere logical aid to programmers [24, 8]. In our view, the notion of virtual processors should form the fundamental basis of an MPP operating system. We claim two advantages to this approach.

It is our thesis that most of the functions of an MPP operating system can be viewed as operations on the set of VPs. Thus, the notion of virtual processors provides a unified framework within which several operating system issues may be considered and evaluated. We briefly illustrate our thesis below, by phrasing various well-known operating system issues in terms of VP manipulations:
- Spatial scheduling: When a job enters the system, the spatial scheduling (or space sharing) policy defines which processor each VP is allocated to.
- Temporal scheduling: The temporal scheduling (or time sharing) policy dictates how each processor switches between the execution of the VPs allocated to it.
- Load balancing: Once a VP is allocated to a processor, it usually does not get reallocated, since moving VPs between processors is quite expensive. However, there are situations when VPs do indeed move between processors. For example, if a job spawns and kills VPs dynamically in an unpredictable fashion, it is periodically necessary to load balance the VPs among processors.
- Memory and I/O problems: Memory limitations and I/O bottlenecks may also be phrased in terms of VPs. Memory limitations occur when a processor does not have enough local memory to hold all the VPs allocated to it. I/O bottlenecks are most pronounced while loading (roll-in) the VPs of a job from the disk into the local memories of various processors.¹
The above problems are not independent of each other: any policy on one issue has subtle repercussions on the other issues. The VP model enables us to understand and tackle these complex interactions. For example, suppose each processor schedules the VPs assigned to it in FIFO order. A VP that is at the end of a processor's queue may work its way towards the head, only to be bumped off to the end of another processor's queue because of an ill-timed load balancing. If this continues, the VP in question might never get a chance to be executed.
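As a sketch of what "operations on the set of VPs" might look like in code (entirely our illustration; every name is invented), the three policies above become three methods on one central data structure:

    # Sketch: the set of VPs as the OS's one central data structure; spatial
    # scheduling, temporal scheduling, and load balancing operate on it.
    from collections import deque

    class VPSet:
        def __init__(self, num_pes):
            self.queues = [deque() for _ in range(num_pes)]  # VPs on each PE

        def spatial_schedule(self, job, num_vps):
            # Horizontal allocation: spread the job's VPs over all PEs.
            for vp in range(num_vps):
                self.queues[vp % len(self.queues)].append((job, vp))

        def temporal_step(self, pe):
            # One FIFO/round-robin time slice on processor pe.
            if self.queues[pe]:
                job, vp = self.queues[pe].popleft()
                # ... execute (job, vp) for one time slice ...
                self.queues[pe].append((job, vp))

        def load_balance(self):
            # Move one VP from the most- to the least-loaded processor.
            src = max(self.queues, key=len)
            dst = min(self.queues, key=len)
            if len(src) > len(dst) + 1:
                dst.append(src.pop())   # lands at the END of dst's queue

The last comment is exactly the FIFO pitfall described above: a migrated VP re-enters at the tail of its new queue, so repeated migrations can starve it.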
An operating system based on virtual processors alleviates several constraints imposed by current commercial MPP operating systems, such as those of Thinking Machines' CM-5 and Cray's T3D. Since these operating systems do not support the virtual processor abstraction, it is the programmer's (or compiler's) responsibility to manage the virtual processors. This involves grouping logical VPs together into coarse-grain processes, forcing the number of such processes to fit a legal partition size. If all partitions of that size happen to be in use, then the programmer must either wait, or re-group the VPs (perhaps compiling the code with different options) to fit another partition size.
¹ For example, it takes the Cray T3D (128 processors with 64 MB per processor, and 2 I/O gateways at 50 MB/sec) about 1.3 minutes to load the whole machine: 128 × 64 MB = 8 GB, moved at an aggregate 100 MB/sec.
Furthermore, if some jobs terminate, leaving the machine lightly loaded, then current MPP operating systems are unable to load balance the VPs of still-running jobs over a larger number of processors.²

² This information was obtained from discussions with engineers from MasPar, Cray Research, and Thinking Machines [1, 9, 19].
On the flip side, there are, conceivably, performance overheads in designing an operating system based on virtual processors. Whether the convenience won is worth the performance lost cannot be settled by rhetoric: extensive experimentation and analysis are required before anything can be said on the matter.
As an application of the virtual processor framework, the following sections take a preliminary look at two of the problems mentioned above, namely spatial and temporal scheduling. As the names suggest, these two issues are complementary: the first determines where VPs must be executed, while the second determines when VPs must be executed.
4 Spatial Scheduling
Whenever a job enters the system, the spatial scheduling policy must specify the processor that each VP of the job is allocated to. For simplicity, we restrict ourselves to the static case, wherein a set of jobs presents itself for allocation initially, and no jobs arrive or leave the system thereafter. Moreover, we assume that jobs do not spawn and kill VPs dynamically, which would otherwise necessitate load balancing.
In the static case, a spatial schedule is simply a mapping from the set of VPs to the set of processors.
Figure 2 shows an example of such a spatial schedule.
Spatial scheduling can be done in many ways; two policies suggest themselves immediately. A vertical spatial schedule is one in which the VPs of every job are granted exclusive access to a subset of the processors. This scheme is also called partitioning. At the other extreme, a horizontal spatial schedule is one in which the VPs of every job are spread evenly over all the processors in the system (or over as many processors as possible, if the number of VPs is smaller than the number of processors).
[Figure 2: each column is one of processors 1-5; the numbers stacked above a processor are the jobs whose VPs it holds.]
Figure 2. A spatial schedule, or allocation. In this toy example, 5 jobs need to be allocated on a machine with 5 processors. The jobs have 1, 4, 1, 5, and 1 VPs each.
Processor utilization.
  Vertical: Wasted if the number of idle PEs is not enough to satisfy the minimum requirements of any queued job. If jobs leave the system, freeing many PEs, then the VPs of running jobs cannot exploit the free PEs unless load balancing is performed (with large OS overhead).
  Horizontal: Fully utilized. When a job terminates and leaves the system, its resources are automatically shared among the remaining jobs in the system: no load balancing required.

Memory utilization.
  Vertical: As in the processor utilization case, memory resources may not be fully utilized. On the other hand, since many VPs of the same job are on each PE, the number of copies of the program can be minimized by having just one per PE. Memory fragmentation does not occur.
  Horizontal: Since each job uses all PEs, a copy of each program has to be on each PE, increasing code memory. Also, since VPs of different jobs exist on the same PE, memory fragmentation may occur.

Interprocessor communication.
  Vertical: Reduced, as VPs on the same PE can communicate by local memory access; however, these communications will be sequential.
  Horizontal: Many communications performed concurrently. Although the network delay may be significant, horizontal allocation makes the communication patterns random, which makes networks behave well [11].

Roll-in/roll-out time.
  Vertical: When only one job is allocated on a processor, time lost in roll-in/out is unavoidable, and usually significant.
  Horizontal: Can be fully masked: while the PEs are executing the jobs allocated to them, the new job is loaded into the system by a DMA controller. Once the job has been loaded, the local schedule is augmented with the VPs of the new job. A similar procedure masks the roll-out time.

Table 1. A qualitative comparison of vertical and horizontal spatial schedules.
Table 1 gives a qualitative comparison of vertical and horizontal spatial schedules in terms of system
performance metrics such as processor utilization, memory utilization, interprocessor communication, and
roll-in/roll-out time. These preliminary considerations indicate several advantages of horizontal allocations
over the conventionally adopted vertical/partitioning schemes.
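The two policies are easy to state as code (our sketch; a job is represented only by its VP count, and the staggering in the horizontal case is one of many reasonable choices):

    # Sketch: vertical vs. horizontal spatial schedules. job_sizes[j] is the
    # number of VPs of job j; each returns, per PE, a list of (job, vp) pairs.
    def vertical(job_sizes, num_pes):
        pes = [[] for _ in range(num_pes)]
        next_pe = 0
        for job, size in enumerate(job_sizes):
            # Exclusive partition, one VP per PE (the simplest legal partition).
            assert next_pe + size <= num_pes, "not enough idle PEs: job waits"
            for vp in range(size):
                pes[next_pe].append((job, vp))
                next_pe += 1
        return pes

    def horizontal(job_sizes, num_pes):
        pes = [[] for _ in range(num_pes)]
        offset = 0
        for job, size in enumerate(job_sizes):
            for vp in range(size):              # deal the VPs across all PEs
                pes[(offset + vp) % num_pes].append((job, vp))
            offset += size                      # stagger jobs for balance
        return pes

    sizes = [1, 4, 1, 5, 1]         # the toy jobs of Figure 2
    print(vertical(sizes, 12))      # this partitioning needs 12 PEs
    print(horizontal(sizes, 5))     # fits on 5 PEs, each job spread out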
5 Temporal Scheduling
Once spatial scheduling is done, the problem is more local in nature. Several VPs, possibly belonging to different jobs, may have been allocated to the same processor. The temporal scheduling policy specifies how each processor multiplexes the execution of the VPs that are allocated to it.

[Figure (trace diagram): processors PE 1 through PE 6 run along the horizontal axis and time slices along the vertical axis; the slices granted to each job's VPs (e.g., Job B) are marked, and the schedule repeats in periods (Period 1, Period 2).]
model this fact in the trace diagram, it is postulated that a trace diagram is legal if all VPs belonging to the same job receive the same number of time slices "on average". More precisely: over any period of time, the numbers of time slices devoted to the various VPs of a job differ by no more than a constant.
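Stated symbolically (a formalization we supply; the notation s_v(t_1, t_2), the number of time slices granted to VP v during the interval [t_1, t_2], is ours): a trace diagram is legal if there is a constant C such that, for every pair of VPs u and v belonging to the same job,

\[
\bigl| \, s_u(t_1, t_2) - s_v(t_1, t_2) \, \bigr| \;\le\; C \qquad \text{for all intervals } [t_1, t_2].
\]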
extension of a round robin schedule, wherein a processor may return to a VP many times within the same round.

An optimal temporal schedule is defined as one with the least idling ratio that satisfies the happiness requirements of all the users. The following theorem proves that for any temporal schedule, there exists a periodic schedule with an idling ratio no higher, in which each job is at least as happy. This implies that the search for an optimal temporal schedule may be restricted to the class of periodic schedules.
Theorem 1. For every temporal schedule S, there exists a periodic schedule S_p, such that the idling ratio of S_p is at most that of S, and every job's happiness in S_p is at least as much as in S.
Proof: Define the progress of a job at a particular time as the number of time slices granted to each of its VPs up to that time. Thus, if a job has V VPs, its progress at time slice t may be represented by a progress vector of V components, where each component is an integer less than or equal to t.

By the rules of legal execution, no VP may lag behind another VP of the same job by more than a constant number C of time slices. Therefore, no two elements in the progress vector can differ by more than C.
Define the differential progress of a job at a particular time as the number of time slices by which each VP leads the slowest VP of the job. Thus, the differential progress vector at time t is also a vector of V components, where each component is an integer less than or equal to C. The differential progress vector is obtained by subtracting the minimum component of the progress vector from each component of the progress vector.
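In symbols (our restatement, with s_v(t) denoting component v of the job's progress vector at time t and J the set of the job's VPs):

\[
d_v(t) \;=\; s_v(t) - \min_{u \in J} s_u(t), \qquad 0 \le d_v(t) \le C.
\]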
The system's differential progress vector (SDPV) at time t is the concatenation of all jobs' differential progress vectors at time t. The key is to note that the SDPV can only assume a finite number of values. Therefore, there exists an infinite sequence of times t_{i_1}, t_{i_2}, ... such that the SDPVs at these times are identical.
Consider any time interval [t_{i_k}, t_{i_{k'}}], with k < k'. One may construct a periodic schedule by cutting out the portion of the trace diagram between t_{i_k} and t_{i_{k'}}, and replicating it infinitely in the vertical direction.
First of all, we claim that such a periodic schedule is legal. From the equality of the SDPVs at t_{i_k} and t_{i_{k'}}, it follows that all VPs belonging to the same job receive the same number of time slices during each period. In other words, at the end of each period, all the VPs belonging to the same job have made equal progress. Therefore, no VP lags behind another VP of the same job by more than a constant number of time slices.
Secondly, observe that it is possible to choose a time interval [t_{i_k}, t_{i_{k'}}] such that the happiness of each job during this interval is at least as much as in the complete trace diagram. This implies that the happiness of each job in the constructed periodic schedule is greater than or equal to its happiness in the original temporal schedule.
Finally, the idling ratio of the constructed periodic schedule must be less than or equal to the idling ratio of the original temporal schedule: since the fraction of the area of the trace diagram covered by each job does not decrease, the fraction covered by the holes cannot increase. This concludes the proof. □
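The proof is constructive and easy to mechanize. The sketch below (ours; the trace representation and all names are assumptions) computes the SDPV after every slice of a finite trace prefix and reports the first interval between two occurrences of the same SDPV; replicating that interval forever is exactly the construction in the proof.

    # Sketch: find a repeating SDPV state in a finite prefix of a trace.
    # A trace is a list of time slices; each slice maps PE -> VP id, and
    # vp_job maps VP id -> job id (both representations are assumptions).
    from collections import defaultdict

    def find_period(trace, vp_job):
        progress = defaultdict(int)        # VP id -> slices received so far
        seen = {}                          # SDPV -> earliest time observed
        for t, time_slice in enumerate(trace):
            for vp in time_slice.values():
                progress[vp] += 1          # grant one slice per running VP
            per_job = defaultdict(list)
            for vp, job in vp_job.items():
                per_job[job].append((vp, progress[vp]))
            # Differential progress: each VP's lead over its job's slowest VP.
            sdpv = tuple(
                (vp, count - min(c for _, c in members))
                for job, members in sorted(per_job.items())
                for vp, count in sorted(members)
            )
            if sdpv in seen:               # same SDPV twice: a period exists
                return seen[sdpv] + 1, t + 1   # replicate trace[start:end]
            seen[sdpv] = t
        return None                        # no repetition within this prefix

    # Toy usage: job A has VPs 0 and 1, job B has VP 2, two PEs.
    vp_job = {0: "A", 1: "A", 2: "B"}
    trace = [{0: 0, 1: 2}, {0: 1, 1: 2}] * 4   # alternate A's VPs on PE 0
    print(find_period(trace, vp_job))          # -> (1, 3)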
We assume that the spatial scheduling is done, and VPs have been allocated to processors somehow. The spatial schedule can be summarized in the form of an m × n matrix A, called the allocation matrix, where A_{j,p} gives the number of VPs of job J_j on processor p. For example, Figure 4 gives the allocation matrix
any VP of job J_j in one period of the periodic schedule. (Recall that within each period of a periodic schedule, all VPs belonging to the same job must receive exactly the same number of time slices.) Of course, T and the R_j's are not yet known.
To summarize, the input data available to us are the happiness requirements and the A_{j,p}'s; the output data to be computed are T and the R_j's. Without loss of generality, let us normalize the period T to 1, and redefine the R_j's to be the old R_j's divided by T. (Thus, while the old R_j's were integer variables, the new R_j's are rational numbers.)
The minimization of the idling ratio may be written algebraically as:
\[
\min \left( 1 - \frac{\sum_{p} \sum_{j} A_{j,p} R_j}{n} \right) \qquad (1)
\]
The objective function (1), along with the constraints (2) and (3), forms a linear program, which may be solved using standard techniques. An apparent complication is that we seek not just any R_j's, but R_j's that are rational. This is really not a problem: as long as the happiness requirements are rational, the R_j's will automatically be rational, since a linear program with rational coefficients has a rational optimal basic solution.
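For illustration, the LP is small enough to hand to an off-the-shelf solver. In the sketch below (ours), constraints (2) and (3) are not reproduced in the text above, so we substitute a plausible reading: no processor may be oversubscribed within the unit period, and each job has a minimum rate r_j standing in for its happiness requirement.

    # Sketch: solving the idling-ratio LP with scipy (the allocation matrix
    # and the rates r_j below are our toy data, not the paper's Figure 4).
    import numpy as np
    from scipy.optimize import linprog

    A = np.array([[0, 0, 0, 0, 1],     # A[j, p]: VPs of job j on PE p
                  [1, 1, 1, 0, 1],
                  [1, 0, 0, 0, 0],
                  [0, 2, 0, 3, 0],
                  [0, 0, 1, 0, 0]])
    m, n = A.shape
    r = np.full(m, 0.1)                # assumed per-job minimum rates

    # Minimizing 1 - (sum_p sum_j A[j,p] R_j) / n is the same as
    # minimizing the negated weighted sum of the R_j's.
    c = -A.sum(axis=1) / n
    A_ub = A.T                         # one capacity row per processor:
    b_ub = np.ones(n)                  #   sum_j A[j,p] R_j <= 1   (T = 1)
    bounds = [(r[j], 1.0) for j in range(m)]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    print("rates R_j:", res.x)
    print("idling ratio:", 1 + c @ res.x)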
7 Conclusions
This paper presents the notion of virtual processors as the unifying concept in the design of operating sys-
tems for massively parallel computing. We propose that several well-recognized activities of the operating
system can be viewed as operations or manipulations on the set of virtual processors. To illustrate the applicability of the virtual processor framework, we present preliminary analyses of spatial and temporal scheduling.
We believe that the many conceptual benefits of founding an MPP operating system on virtual processors will outweigh the overheads in terms of performance. However, a definitive answer will require extensive experimentation and analysis.
References
[1] Tom Blank. Personal communications, 1993. MasPar Computer Corporation, Sunnyvale, CA.
[2] Guy E. Blelloch. Vector Models for Data-parallel Computing. MIT Press, Cambridge, MA, 1990.
[3] Walter S. Brainerd, Charles H. Goldberg, and Jeanne C. Adams. Programmer's Guide to Fortran 90. McGraw-Hill Book Co., 1990.
[4] M. Crovella et al. Multiprogramming on multiprocessors. In Proceedings of the Third IEEE Symposium on Parallel and Distributed Processing, pages 590–597, December 1991.
[5] R. Cytron, J. Lipkis, and E. Schonberg. A computer-assisted approach to SPMD execution. In Proceedings of Supercomputing '90, pages 398–406, November 1990.
[6] Hesham El-Rewini, Theodore G. Lewis, and Hesham H. Ali. Task Scheduling in Parallel and Distributed Systems. Prentice Hall, Englewood Cliffs, NJ, 1994.
[7] Philip J. Hatcher and Michael J. Quinn. Data-parallel programming on MIMD computers. MIT Press,
Cambridge, MA, 1991.
[8] W. D. Hillis. The Connection Machine. MIT Press, Cambridge, Mass., 1985.
[9] Kent K. Koeninger. Personal communications, 1994. Cray Research Corp., Minneapolis, MN.
[10] T. Leighton. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Mor-
gan Kaufmann Publishers, Inc., San Mateo, CA 94403, 1992.
[11] Tom Leighton. Methods for message routing in parallel machines. In 24th Annual ACM Symposium on Theory of Computing, pages 77–95, 1992.
[12] David B. Loveman. High Performance Fortran. IEEE Parallel and Distributed Technology, 1(1):25–42, 1993.
[13] S. T. Leutenegger and M. K. Vernon. The performance of multiprogrammed multiprocessor scheduling policies. In Performance Evaluation Review, pages 226–236, May 1990.
[14] C. McCann, R. Vaswani, and J. Zahorjan. A dynamic processor allocation policy for multiprogrammed shared-memory multiprocessors. ACM Transactions on Computer Systems, 11(2):146–178, 1993.
[15] C. McCann and J. Zahorjan. Processor allocation policies for message-passing parallel computers. In Performance Evaluation Review, pages 19–32, May 1994.
[16] Michael Metcalf and John Reid. Fortran 90 Explained. Oxford University Press, New York, 1990.
[17] Wilfried Oed. The Cray Research Massively Parallel Processor System CRAY T3D. Available by anonymous ftp from ftp.cray.com, November 1993.
[18] David A. Patterson and John L. Hennessy. Computer Organization and Design: The Hard-
ware/Software Interface. Morgan Kaufmann Publishers, San Mateo, CA, 1994.
[19] David M. Ray. Personal communications, 1994. Thinking Machines Corporation, Cambridge, MA.
[20] Gary Sabot. The Paralation Model: Architecture-Independent Parallel Programming. MIT Press,
Cambridge, MA, 1988.
[21] Vivek Sarkar. Partitioning and Scheduling Parallel Programs for Multiprocessors. Pitman Publishing, London, 1989.
[22] S. K. Setia, M. S. Squillante, and S. K. Tripathi. Analysis of processor allocation in multiprogrammed, distributed-memory parallel processing systems. IEEE Transactions on Parallel and Distributed Systems, 5(4):401–430, April 1994.
[23] K. C. Sevcik. Characterizations of parallelism in applications and their use in scheduling. In Performance Evaluation Review, pages 171–180, May 1989.
[24] Thinking Machines Corporation, Cambridge, MA. *Lisp release notes, 1987.
[25] Thinking Machines Corporation, Cambridge, MA. The Connection Machine CM-5 Technical Sum-
mary, October 1991.