ospp-chap07-part3
Thrashing
If the number of frames allocated to a low-priority process falls below the minimum number
required by the computer architecture, we must suspend that process’s execution. We should
then page out its remaining pages, freeing all its allocated frames. This provision introduces a
swap-in, swap-out level of intermediate CPU scheduling.
If the process does not have the number of frames it needs to support pages in active use, it
will quickly page-fault. At this point, it must replace some page. However, since all its pages
are in active use, it must replace a page that will be needed again right away. Consequently, it
quickly faults again, and again, and again, replacing pages that it must bring back in
immediately. This high paging activity is called thrashing. A process is thrashing if it is
spending more time paging than executing.
Thrashing results in severe performance problems. Consider the following scenario, which is
based on the actual behaviour of early paging systems. The operating system monitors CPU
utilization. If CPU utilization is too low, we increase the degree of multiprogramming by
introducing a new process to the system. A global page-replacement algorithm is used; it
replaces pages without regard to the process to which they belong. Now suppose that a
process enters a new phase in its execution and needs more frames. It starts faulting and
taking frames away from other processes. These processes need those pages, however, and so
they also fault, taking frames from other processes. These faulting processes must use the
paging device to swap pages in and out. As they queue up for the paging device, the ready
queue empties. As processes wait for the paging device, CPU utilization decreases.
The CPU scheduler sees the decreasing CPU utilization and increases the degree of
multiprogramming as a result. The new process tries to get started by taking frames from
running processes, causing more page faults and a longer queue for the paging device. As a
result, CPU utilization drops even further, and the CPU scheduler tries to increase the degree
of multiprogramming even more. Thrashing has occurred, and system throughput plunges.
The page-fault rate increases tremendously. As a result, the effective memory-access time
increases. No work is getting done, because the processes are spending all their time paging.
This behaviour is demonstrated in the figure below, which plots CPU utilization against the degree of multiprogramming.
As the degree of multiprogramming increases, CPU utilization also increases, although more
slowly, until a maximum is reached. If the degree of multiprogramming is increased even
further, thrashing sets in, and CPU utilization drops sharply. At this point, to increase CPU
utilization and stop thrashing, we must decrease the degree of multiprogramming.
We can limit the effects of thrashing by using a local replacement algorithm (or priority
replacement algorithm). With local replacement, if one process starts thrashing, it cannot
steal frames from another process and cause the latter to thrash as well. However, the
problem is not entirely solved. If processes are thrashing, they will be in the queue for the
paging device most of the time. The average service time for a page fault will increase
because of the longer average queue for the paging device. Thus, the effective access time
will increase even for a process that is not thrashing.
To prevent thrashing, we must provide a process with as many frames as it needs. But how do
we know how many frames it “needs”? There are several techniques. The working-set
strategy starts by looking at how many frames a process is actually using. This approach
defines the locality model of process execution.
The locality model states that, as a process executes, it moves from locality to locality. A
locality is a set of pages that are actively used together. A program is generally composed of
several different localities, which may overlap. For example, when a function is called, it
defines a new locality. In this locality, memory references are made to the instructions of the
function call, its local variables, and a subset of the global variables. When we exit the
function, the process leaves this locality, since the local variables and instructions of the
function are no longer in active use. We may return to this locality later. Thus, we see that
localities are defined by the program structure and its data structures. The locality model
states that all programs will exhibit this basic memory reference structure.
Suppose we allocate enough frames to a process to accommodate its current locality. It will
fault for the pages in its locality until all these pages are in memory; then, it will not fault
again until it changes localities. If we do not allocate enough frames to accommodate the size
of the current locality, the process will thrash, since it cannot keep in memory all the pages
that it is actively using.
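The effect of allocating fewer frames than the current locality needs can be sketched with a small simulation. This is a hypothetical illustration, assuming LRU replacement and a process that cycles through a locality of five pages; the page numbers and frame counts are not from the text. With enough frames, the process faults only while loading its locality; one frame short, every reference faults.

```python
from collections import OrderedDict

def count_faults(refs, frames):
    """Count page faults under LRU replacement with a fixed frame allocation."""
    memory = OrderedDict()                  # page -> None, ordered by recency
    faults = 0
    for page in refs:
        if page in memory:
            memory.move_to_end(page)        # hit: refresh recency
        else:
            faults += 1                     # miss: page fault
            if len(memory) >= frames:
                memory.popitem(last=False)  # evict least recently used page
            memory[page] = None
    return faults

# A locality of 5 pages (0..4), referenced round-robin 100 times.
locality = [p for _ in range(100) for p in range(5)]

print(count_faults(locality, frames=5))  # 5: only the initial compulsory faults
print(count_faults(locality, frames=4))  # 500: every single reference faults
```

One frame fewer than the locality size turns a handful of compulsory faults into a fault on every reference, which is exactly the thrashing behaviour described above.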
For example, given the sequence of memory references shown in the figure below, and if Δ (the working-set window) = 10 memory references, then the working set at time t1 is {1, 2, 5, 6, 7}. By time t2, the working set has changed to {3, 4}.
The accuracy of the working set depends on the selection of Δ. If Δ is too small, it will not encompass the entire locality; if Δ is too large, it may overlap several localities. In the extreme, if Δ is infinite, the working set is the set of pages touched during the process's execution.
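The working-set computation can be sketched directly. Since the figure with the reference string is not reproduced here, the string below is a hypothetical one constructed so that the working sets stated in the text ({1, 2, 5, 6, 7} at t1, {3, 4} at t2) emerge with Δ = 10; the times t1 = 10 and t2 = 26 are likewise illustrative.

```python
def working_set(refs, t, delta):
    """Pages referenced in the most recent `delta` references ending at time t."""
    return set(refs[max(0, t - delta):t])

# Hypothetical reference string chosen so the stated working sets emerge.
refs = [2, 6, 1, 5, 7, 7, 7, 7, 5, 1,
        6, 2, 3, 4, 1, 2, 3, 4, 4, 4,
        3, 4, 3, 4, 4, 4, 1, 3, 2, 3,
        4, 4, 4, 3, 4, 4, 4]

print(working_set(refs, t=10, delta=10))  # {1, 2, 5, 6, 7}
print(working_set(refs, t=26, delta=10))  # {3, 4}
```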
The most important property of the working set, then, is its size. If we compute the working-set size, WSSi, for each process in the system, we can then consider that

D = Σ WSSi,

where D is the total demand for frames. Each process is actively using the pages in its working set. Thus, process i needs WSSi frames. If the total demand is greater than the total number of available frames (D > m), thrashing will occur, because some processes will not have enough frames.
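The demand check D > m can be expressed in a few lines. The working-set sizes and frame count below are made-up numbers for illustration only.

```python
def total_demand(working_set_sizes):
    """D: the sum of the working-set sizes WSS_i over all processes."""
    return sum(working_set_sizes)

def thrashing_likely(working_set_sizes, m):
    """True if total frame demand D exceeds the available frames m."""
    return total_demand(working_set_sizes) > m

wss = [12, 30, 25, 18]             # hypothetical WSS_i for four processes
print(total_demand(wss))           # D = 85
print(thrashing_likely(wss, 100))  # False: D <= m, every working set fits
print(thrashing_likely(wss, 64))   # True: some process will be short of frames
```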
Once Δ has been selected, use of the working-set model is simple. The operating system
monitors the working set of each process and allocates to that working set enough frames to
provide it with its working-set size. If there are enough extra frames, another process can be
initiated. If the sum of the working-set sizes increases, exceeding the total number of
available frames, the operating system selects a process to suspend. The process’s pages are
written out (swapped), and its frames are reallocated to other processes.
The suspended process can be restarted later. This working-set strategy prevents thrashing
while keeping the degree of multiprogramming as high as possible. Thus, it optimizes CPU
utilization. The difficulty with the working-set model is keeping track of the working set. The
working-set window is a moving window. At each memory reference, a new reference
appears at one end, and the oldest reference drops off the other end. A page is in the working
set if it is referenced anywhere in the working-set window.
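The moving-window bookkeeping described above can be sketched as follows. This is a minimal Python sketch, not a real kernel mechanism: real systems must approximate the window with reference bits and timer interrupts, since trapping every memory reference is far too expensive. The window size and reference sequence are illustrative.

```python
from collections import Counter, deque

class WorkingSetTracker:
    """Track the working set over a sliding window of `delta` references."""

    def __init__(self, delta):
        self.delta = delta
        self.window = deque()    # the last delta references, in order
        self.counts = Counter()  # how often each page appears in the window

    def reference(self, page):
        # A new reference appears at one end of the window...
        self.window.append(page)
        self.counts[page] += 1
        if len(self.window) > self.delta:
            # ...and the oldest reference drops off the other end.
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]  # page has left the working set

    def working_set(self):
        return set(self.counts)

tracker = WorkingSetTracker(delta=4)
for page in [1, 2, 1, 3, 4, 4]:
    tracker.reference(page)
print(tracker.working_set())  # {1, 3, 4}: page 2 has aged out of the window
```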
The specific problem is how to prevent thrashing. Thrashing has a high page-fault rate.
Thus, we want to control the page-fault rate. When it is too high, we know that the process
needs more frames. Conversely, if the page-fault rate is too low, then the process may have
too many frames. We can establish
upper and lower bounds on the desired page-fault rate. If the actual page-fault rate exceeds
the upper limit, we allocate the process another frame. If the page-fault rate falls below the
lower limit, we remove a frame from the process. Thus, we can directly measure and control
the page-fault rate to prevent thrashing.
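The control policy above can be sketched as a simple per-process adjustment rule. This is a hypothetical controller: the bounds, the fault-rate units (faults per reference), and the minimum allocation are all illustrative choices, not values from the text.

```python
def adjust_frames(frames, fault_rate, lower=0.02, upper=0.10, minimum=4):
    """Page-fault-rate control: grow the allocation when the rate is too
    high, shrink it when the rate is too low, otherwise leave it alone."""
    if fault_rate > upper:
        return frames + 1   # above the upper bound: the process needs a frame
    if fault_rate < lower and frames > minimum:
        return frames - 1   # below the lower bound: the process can spare one
    return frames           # within bounds: allocation is about right

print(adjust_frames(8, 0.15))  # 9: fault rate above the upper bound
print(adjust_frames(8, 0.01))  # 7: fault rate below the lower bound
print(adjust_frames(8, 0.05))  # 8: within bounds, unchanged
```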
As with the working-set strategy, we may have to swap out a process. If the page-fault rate
increases and no free frames are available, we must select some process and swap it out to backing
store. The freed frames are then distributed to processes with high page-fault rates.
Practically speaking, thrashing and the resulting swapping have a disagreeably large impact
on performance. The current best practice in implementing a computer facility is to include
enough physical memory, whenever possible, to avoid thrashing and swapping. From
smartphones through mainframes, providing enough memory to keep all working sets in
memory concurrently, except under extreme conditions, gives the best user experience.