Memory Management
• This scheme prevents a user program from accidentally modifying the code or data
structures of either the OS or other users.
Address Binding
• Program resides on a disk as a binary executable file. To be executed, the
program must be brought into memory and placed within a process. Depending
on the memory management in use, the process may be moved between disk
and memory during its execution. The processes on the disk that are waiting to
be brought into memory for execution form an input queue.
• The normal procedure is to select one of the processes in the input queue and
load that process into memory. As the process is executed, it accesses instructions
and data from memory. Eventually, the process terminates, and its memory
space is declared available.
• Addresses are represented in different ways at different stages of a program’s life:
❖ Source code addresses usually symbolic (such as count)
❖ A compiler will bind these symbolic addresses to relocatable addresses
❖ Linker or loader will in turn bind relocatable addresses to absolute addresses
❖ Each binding is a mapping from one address space to another and is called
address binding.
Binding of Instructions and Data to Memory
• Address binding of instructions and data to memory addresses can happen at
three different stages
➢ Compile time: If the memory location of the process is known at compile time, then
absolute code can be generated. Ex: if a user process will reside starting at
location R, then the generated compiler code will start at that location and
extend up from there.
❖ If starting location changes then it is necessary to recompile this code
➢ Load time: If the memory location of the process is not known at compile time,
then the compiler must generate relocatable code. In this case, final binding is
delayed until load time.
❖ If the starting location changes, then the user code need only be reloaded to
incorporate the changed value.
➢ Execution time: Binding delayed until run time if the process can be moved
during its execution from one memory segment to another
❖ Need hardware support for address maps (e.g., base and limit registers)
Multistep Processing of a User Program
Logical vs Physical Address Space
• The concept of a logical address space that is bound to a separate physical
address space is central to proper memory management
➢ Logical address – generated by the CPU; also referred to as virtual address
➢ Physical address – the address seen by the memory unit, i.e., the one loaded into
the memory-address register of the memory.
• Logical and physical addresses are the same in compile-time and load-time
address-binding schemes; logical (virtual) and physical addresses differ in
execution-time address-binding scheme
• Logical address space is the set of all logical addresses generated by a program
• Physical address space is the set of all physical addresses corresponding to these
logical addresses.
• Memory-Management Unit (MMU): the run-time mapping from virtual to physical
addresses is done by a hardware device called the memory-management unit (MMU).
• Mapping: the base register is now called a relocation register. The value in the
relocation register is added to every address generated by a user process at the
time the address is sent to memory.
➢ The user program deals with logical addresses; it never sees the real
physical addresses
Dynamic relocation using a relocation register
• The memory mapping hardware converts logical addresses into physical
addresses.
Two types of addresses:
• Logical addresses (0 to max)
• Physical addresses (R+0 to R+max for a base value R)
• The user generates only logical addresses and thinks that the process runs in
locations 0 to max.
• The user program supplies logical addresses; these must be mapped to physical
addresses before they are used.
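As a rough illustration, here is a minimal C sketch of what the relocation hardware does, assuming a base value R = 14000 and a limit of 3000; the register values and the translate() helper are hypothetical, chosen only to mirror the 0-to-max / R+0-to-R+max description above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical register contents; real hardware holds these in the MMU
 * and the OS reloads them on every context switch. */
#define RELOCATION_REG 14000UL  /* base value R */
#define LIMIT_REG       3000UL  /* process size: legal logical range is 0..2999 */

/* Map a logical address the way the MMU does: check against the limit
 * register, then add the relocation register. */
unsigned long translate(unsigned long logical)
{
    if (logical >= LIMIT_REG) {
        fprintf(stderr, "trap: addressing error (logical %lu)\n", logical);
        exit(EXIT_FAILURE);
    }
    return RELOCATION_REG + logical;  /* physical = R + logical */
}

int main(void)
{
    /* The user program "sees" address 346; memory sees 14346. */
    printf("logical 346 -> physical %lu\n", translate(346));
    return 0;
}
```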
Dynamic Loading
• With dynamic loading, a routine is not loaded until it is called.
• All routines are kept on disk in a relocatable load format.
• The main program is loaded into memory and executed. When a routine
needs to call another routine, the calling routine first checks to see whether the
other routine has been loaded.
• If it has not, the relocatable linking loader is called to load the desired routine
into memory and control is passed to the newly loaded routine.
Advantages:
1. An unused routine is never loaded.
2. Provides better memory utilization.
• The OS provides only library routines to implement dynamic loading.
Dynamic Linking
Static linking – copies all library modules used in the program into the final executable
file at the final step of compilation.
• If any of the libraries has changed, then both the program and the libraries have to be
recompiled and re-linked; otherwise the changes will not be reflected in the existing
executable file.
Dynamic linking – occurs at run time, when the executable file and the libraries are
placed in memory.
• Similar to dynamic loading, linking is postponed until execution time.
• With dynamic linking, a stub is included in the executable file for each library-routine
reference.
• The stub is a small piece of code that indicates how to locate the appropriate memory-
resident library routine, or how to load the library if the routine is not already present.
• Either way, the stub replaces itself with the address of the routine and executes the
routine.
• Dynamic linking is particularly useful for library updates. A library may be replaced by
a new version, and all programs that reference the library will automatically use the
new version. This scheme is also known as shared libraries.
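On POSIX systems, this run-time resolution can be observed from user code through the dl interface; a minimal sketch, assuming a Linux-style library name libm.so.6, loads the math library and resolves cos only when it is first needed, much as a stub would.

```c
/* Build with: cc dldemo.c -ldl */
#include <stdio.h>
#include <dlfcn.h>

int main(void)
{
    /* Load the shared math library at run time; the file name is
     * platform-dependent (libm.so.6 is common on Linux). */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }

    /* Locate the memory-resident routine by name, as a stub would. */
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    if (!cosine) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return 1;
    }

    printf("cos(0.0) = %f\n", cosine(0.0));  /* prints 1.000000 */
    dlclose(handle);
    return 0;
}
```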
Swapping
Swapping is a memory-management scheme in which a process can be
temporarily swapped out of main memory to secondary storage (a backing store) so
that main memory can be made available for other processes; the process is later
brought back into memory for continued execution.
• It is used to improve main-memory utilization and availability.
Ex: a multiprogramming environment with a round-robin or priority-based
CPU-scheduling algorithm.
• Roll out, roll in – swapping variant
used for priority-based scheduling
algorithms; lower-priority process is
swapped out so higher-priority process
can be loaded and executed
• Backing store – fast disk large enough
to accommodate copies of all memory
images for all users; must provide
direct access to these memory images.
• The system maintains a ready queue consisting of all processes whose memory
images are on the backing store or in memory and that are ready to run. Whenever the
CPU scheduler decides to execute a process, it calls the dispatcher.
• The dispatcher checks to see whether the next process in the queue is in
memory. If it is not and if there is no free memory region, the dispatcher swaps
out a process currently in memory and swaps in the desired process.
• Context switch time in such a swapping system is fairly high.
Context Switch Time including Swapping
Ex.1: a 100 MB process swapping to a hard disk with a transfer rate of 50 MB/sec,
assuming an average latency of 8 ms.
▪ Time = process size / transfer rate = 100/50 = 2 sec = 2000 ms (1 sec = 1000 ms)
▪ Swap-out time = 2000 + 8 = 2008 ms
▪ Plus the swap-in of a same-sized process
▪ Total context-switch swapping component = 2 × 2008 = 4016 ms (about 4 seconds)
Ex.2: the user process size is 2048 KB and the data transfer rate is 1 MB/sec = 1024 KB/sec.
Time = process size / transfer rate = 2048 / 1024 = 2 seconds = 2000 milliseconds
Taking both swap-in and swap-out time, the process will take 4000 milliseconds.
• The major part of the swap time is transfer time. The total transfer time is directly
proportional to the amount of memory swapped.
• It would be useful to know exactly how much memory a user process is using. Then
we need to swap only what is actually used, reducing swap time.
• For this, a process with dynamic memory requirements will need to issue the system
calls request_memory() and release_memory() to inform the OS of its changing memory
needs.
Swapping is constrained by other factors as well:
• A process must be completely idle to be swapped.
• Pending I/O – a process may be waiting for an I/O operation when we want to
swap that process to free up memory. If the I/O operation is asynchronously accessing
the user memory for I/O buffers, then the process cannot be swapped.
Contiguous Memory Allocation
Contiguous memory allocation is a memory management technique used by
operating systems to allocate memory to processes in contiguous blocks.
• Main memory must accommodate both OS and user processes
• So main memory is usually divided into two partitions:
➢ One for resident operating system, usually held in low memory with interrupt
vector
➢ One for user processes then held in high memory
➢ Each process contained in single contiguous section of memory
Memory mapping and protection:
• The relocation register contains the value of the smallest physical address; the limit
register contains the range of logical addresses.
• With relocation and limit registers, each logical address must be less than the limit
register; the MMU maps the logical address dynamically by adding the value in the
relocation register. This mapped address is sent to memory.
• When the CPU scheduler selects a process for execution, the dispatcher loads the
relocation and limit registers with the correct values as part of the context switch.
• Because every address generated by the CPU is checked against these registers,
both the OS and other users’ programs and data are protected from being modified by
the running process.
Memory allocation:
• One of the simplest methods for allocating memory is to divide memory into
several partitions.
• Each partition may contain exactly one process.
• Thus, the degree of multiprogramming (the number of processes that can reside in
memory concurrently) is limited by the number of partitions. This is the
multiple-partition method.
• In this multiple partition method, when a partition is free, a process is selected
from the input queue and is loaded into a free partition. When the process
terminates, the partition becomes available for another process.
Two different schemes:
1. Fixed partition scheme
2. Variable partition scheme
Multiple-Partition Allocation
1. Fixed partition scheme
• The entire memory will be partitioned into contiguous blocks of fixed size.
• Each time a process enters the system, it will be given one of the available
blocks.
• Each process receives a block of memory space that is the same size, regardless
of the size of the process.
• Static partitioning is another name for this approach.
Advantages:
1. This strategy is easy to employ because each block is the same size.
2. It is simple to keep track of how many memory blocks are still available, which
determines how many further processes can be allocated memory, because the OS
keeps a table indicating which parts of memory are available and which are
occupied.
Disadvantages:
1. Space cannot be allocated to a process whose size exceeds the block size, since the
size of the blocks is fixed.
2. If a process is assigned to a block whose size is greater than that of the
process, a lot of free space is left over in the block. This causes internal
fragmentation.
2. Variable partition scheme
• No fixed blocks are created; instead, memory partitions are created according to the
needs of each process.
• Each process is given a variable-sized block. This means that, if space is
available, the exact amount of memory needed is allocated to a new process whenever it
requests it. As a result, each block's size is determined by the needs and
specifications of the process that uses it.
• Initially, all memory is available for user processes and is considered one large
block of available memory, a hole.
• Holes of various size are scattered throughout memory
• When a process arrives, memory is allocated from a hole large enough to
accommodate it.
• Process exiting frees its partition, adjacent free partitions are merged.
• Operating system maintains information about:
a) allocated partitions b) free partitions (hole)
• Dynamic partitioning is another name for this approach.
Advantages:
1. There is no internal fragmentation because the processes are given blocks of
space according to their needs. Therefore, this technique does not waste RAM.
2. Even a large process can be given space because there are no blocks with set
sizes.
Disadvantage:
1. Because memory is allocated dynamically, a variable-size partition scheme is
challenging to implement.
2. It is challenging to keep a record of processes and the available memory space.
• The variable partition scheme is an instance of the dynamic storage-allocation problem,
which concerns how to satisfy a request of size n from a list of free holes.
• There are many solutions to this problem. The following strategies are most commonly
used to select a free hole from the set of available holes (a sketch of all three
appears after this list).
First fit: Allocate the first hole that is big enough.
• Searching starts from the beginning of the set of holes and stops as soon
as a free hole that is large enough is found.
Best fit: Allocate the smallest hole that is big enough; must search the entire list,
unless it is ordered by size.
• Produces the smallest leftover hole.
Worst fit: Allocate the largest hole; must also search the entire list, unless it is
ordered by size.
• Produces the largest leftover hole, which may be more useful than the smaller leftover
hole from the best-fit approach.
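A minimal sketch of the three strategies in C, assuming an unsorted array of free holes; the struct hole type and the function names are illustrative, not taken from any particular OS.

```c
#include <stddef.h>

/* A free hole: starting address and size, for illustration. */
struct hole { size_t start; size_t size; };

/* First fit: return the index of the first hole large enough, or -1. */
int first_fit(const struct hole *h, int n, size_t req)
{
    for (int i = 0; i < n; i++)
        if (h[i].size >= req)
            return i;          /* stop at the first hole that fits */
    return -1;
}

/* Best fit: return the index of the smallest hole that is big enough. */
int best_fit(const struct hole *h, int n, size_t req)
{
    int best = -1;
    for (int i = 0; i < n; i++)
        if (h[i].size >= req && (best < 0 || h[i].size < h[best].size))
            best = i;          /* must scan the whole (unsorted) list */
    return best;
}

/* Worst fit: return the index of the largest hole that is big enough. */
int worst_fit(const struct hole *h, int n, size_t req)
{
    int worst = -1;
    for (int i = 0; i < n; i++)
        if (h[i].size >= req && (worst < 0 || h[i].size > h[worst].size))
            worst = i;         /* also a full scan unless sorted by size */
    return worst;
}
```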
Fragmentation
Internal Fragmentation – allocated memory may be slightly larger than requested
memory; this size difference is memory internal to a partition, but not being used
External Fragmentation – total memory space exists to satisfy a request, but it is
not contiguous; storage is fragmented into a large number of small holes.
• Both first fit and best fit suffer from external fragmentation.
• One solution to the problem of external fragmentation is compaction.
Compaction is a technique to collect all the free memory, present in the form of
fragments, into one large chunk of free memory, which can then be used to run other
processes.
➢ It does that by moving all the processes towards one end of the memory and
all the available free space towards the other end of the memory so that it
becomes contiguous.
➢ It is not always easy to do compaction. Compaction is possible only if relocation
is dynamic and is done at execution time.
• Another possible solution to the external-fragmentation problem is to permit
the physical address space of a process to be non-contiguous, thus
allowing a process to be allocated physical memory wherever such memory
is available.
• Two complementary techniques achieve this solution:
– paging
– segmentation
Paging example (figure): n = 2 and m = 4; page size of 4 bytes, a 16-byte (4-page)
logical address space, and a 32-byte (8-frame) physical memory.
Address Translation Hardware
The address generated by the CPU is divided into:
Page number (p) – used as an index into a page table. The page table contains the base
address of each page in physical memory
Page offset (d) – combined with base address to define the physical memory address
that is sent to the memory unit
Address format: | page number p (m − n bits) | page offset d (n bits) |
where the logical address is m bits long and the page size is 2^n bytes.
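A minimal C sketch of this translation, assuming a 12-bit offset (4 KB pages) and a small hypothetical page table; real hardware performs the same shift, mask, and table lookup in the MMU.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative parameters: n = 12 offset bits, so the page size is
 * 2^12 = 4096 bytes; the remaining high-order bits are the page number. */
#define OFFSET_BITS 12
#define PAGE_SIZE   (1u << OFFSET_BITS)

/* Hypothetical page table: page_table[p] holds the frame number of page p. */
static uint32_t page_table[16] = { 5, 6, 1, 2 };

uint32_t translate(uint32_t logical)
{
    uint32_t p = logical >> OFFSET_BITS;     /* page number */
    uint32_t d = logical & (PAGE_SIZE - 1);  /* page offset */
    uint32_t frame = page_table[p];          /* one access to the page table... */
    return (frame << OFFSET_BITS) | d;       /* ...then the frame base + offset */
}

int main(void)
{
    /* Logical 0x1004 = page 1, offset 4 -> frame 6 -> physical 0x6004. */
    printf("0x1004 -> 0x%X\n", (unsigned)translate(0x1004));
    return 0;
}
```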
Paging fragmentation
• No external fragmentation with paging; any free frame can be allocated to a process
that needs it.
• But there is internal fragmentation. Calculating internal fragmentation:
➢ Page size = 2,048 bytes
➢ Process size = 72,766 bytes
➢ 35 pages + 1,086 bytes
➢ Internal fragmentation of 2,048 - 1,086 = 962 bytes
• In the worst case, a process may need n pages plus 1 byte, which requires n + 1 frames,
resulting in internal fragmentation of almost an entire frame (one frame minus 1 byte).
• This consideration suggests that small page sizes are desirable. However, smaller pages
increase the overhead of page-table entries, and this overhead is reduced as the size of
pages increases.
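The internal-fragmentation calculation above can be expressed directly; a small sketch using the same numbers:

```c
#include <stdio.h>

int main(void)
{
    /* Numbers from the example above. */
    unsigned long page_size = 2048, process_size = 72766;

    unsigned long full_pages = process_size / page_size;  /* 35   */
    unsigned long remainder  = process_size % page_size;  /* 1086 */
    unsigned long frames     = full_pages + (remainder ? 1 : 0);
    unsigned long internal   = remainder ? page_size - remainder : 0;

    /* Prints: 36 frames, internal fragmentation = 962 bytes */
    printf("%lu frames, internal fragmentation = %lu bytes\n",
           frames, internal);
    return 0;
}
```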
• When a process is to be executed, its size, expressed in pages, is examined. Each
page of the process needs one frame. Thus, if a process requires n pages, at least n
frames must be available in memory. If n frames are available, they are allocated to
this arriving process.
Free Frames
• The first page of the process is loaded into available frame 14, and the frame number,
14, is put in the page table. The next page is loaded into available frame 13, and the
frame number, 13, is put in the page table, and so on.
Paging
• An important aspect of paging is the clear separation between the user’s view of
memory and the actual physical memory.
• The user program views memory as one single space, containing only this one
program. In fact, the user program is scattered throughout physical memory.
• The logical addresses are translated into physical addresses by the address-
translation hardware. This mapping is hidden from the user and is controlled by the
OS.
• The page table includes only those pages that the process owns.
• Since the OS is managing physical memory, it must be aware of the allocation
details of physical memory: which frames are allocated, which frames are available,
how many total frames there are, and so on.
• This information is generally kept in a data structure called a frame table. The
frame table has one entry for each physical page frame, indicating whether the frame
is free or allocated and, if it is allocated, to which page of which process or processes.
Implementation of Page Table
• Page table is kept in main memory
• Page-table base register (PTBR) points to the page table
• Page-table length register (PTLR) indicates size of the page table
• In this scheme every data/instruction access requires two memory accesses
• One for the page table to find frame number and
• One to go to the address specified by frame number for the data / instruction
• The two-memory-access problem can be solved by the use of a special fast-
lookup hardware cache called associative memory or translation look-aside
buffer (TLB).
• The translation look-aside buffer (TLB) is a special cache, closer to the
CPU, that the CPU can access in less time than main memory; it is used to keep
track of recently used translations.
• TLB contains page table entries that have been most recently used (recent
translations of virtual memory to physical memory).
• Given a logical address, the processor examines the TLB; if the page-table entry is
present (TLB hit), the frame number is retrieved and the real address is formed.
• If the page-table entry is not found in the TLB (TLB miss), the page number is
used as an index into the page table in memory.
• If the page itself is not in main memory, a page fault is issued; the TLB is then
updated to include the new page entry.
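A simplified software model of this lookup path; the TLB size, the trivial direct-mapped replacement policy, and the page_table_lookup() helper are illustrative assumptions, not a description of real MMU hardware.

```c
#include <stdint.h>
#include <stdio.h>

#define OFFSET_BITS 12
#define TLB_SIZE    8

/* One TLB entry caches a recent page -> frame translation. */
struct tlb_entry { uint32_t page; uint32_t frame; int valid; };
static struct tlb_entry tlb[TLB_SIZE];

/* Hypothetical in-memory page table (assumes page numbers < 256 here). */
static uint32_t page_table[256];

/* Consulting the page table stands in for the extra main-memory access. */
static uint32_t page_table_lookup(uint32_t page)
{
    return page_table[page];
}

uint32_t translate(uint32_t logical)
{
    uint32_t p = logical >> OFFSET_BITS;
    uint32_t d = logical & ((1u << OFFSET_BITS) - 1);

    /* TLB hit: frame number retrieved without touching main memory. */
    for (int i = 0; i < TLB_SIZE; i++)
        if (tlb[i].valid && tlb[i].page == p)
            return (tlb[i].frame << OFFSET_BITS) | d;

    /* TLB miss: read the page table in memory, then cache the entry
     * (trivial policy: overwrite slot p % TLB_SIZE). */
    uint32_t frame = page_table_lookup(p);
    tlb[p % TLB_SIZE] = (struct tlb_entry){ .page = p, .frame = frame, .valid = 1 };
    return (frame << OFFSET_BITS) | d;
}

int main(void)
{
    page_table[1] = 6;  /* pretend page 1 resides in frame 6 */
    printf("0x%X (miss)\n", (unsigned)translate(0x1004));  /* 0x6004 */
    printf("0x%X (hit)\n",  (unsigned)translate(0x1008));  /* 0x6008 */
    return 0;
}
```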
Paging Hardware With TLB
Effective Access Time
Hit ratio (α): The percentage of times that a particular page number is found in the TLB.
Miss ratio: The percentage of times that a particular page number is not found in the
TLB.
Ex: 80% hit ratio means that we find the desired page number in the TLB 80% of the
time. Then miss ratio is 20%.
• If it takes 20 ns for a TLB search and 100 ns for a memory access,
• when the page number is in the TLB, a mapped memory access takes 120 ns.
• If we fail to find the page number in the TLB, a mapped memory access takes 220 ns
(TLB search: 20 ns, page-table access for the frame number: 100 ns, access to the
desired byte: 100 ns).
Effective Access Time (EAT): To find it, weight each case by its probability:
• EAT = 0.80 × 120 + 0.20 × 220 = 140 ns => a 40 percent slowdown in memory access
time (from 100 ns to 140 ns)
• For a 98% hit ratio: EAT = 0.98 × 120 + 0.02 × 220 = 122 ns => a 22 percent slowdown
in access time
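The same weighting as a small helper function; eat() is a hypothetical name, and the timings are those used in the example.

```c
#include <stdio.h>

/* EAT for the TLB scheme above:
 *   hit  -> TLB search + one memory access   (120 ns in the example)
 *   miss -> TLB search + two memory accesses (220 ns in the example) */
double eat(double hit_ratio, double tlb_ns, double mem_ns)
{
    return hit_ratio * (tlb_ns + mem_ns)
         + (1.0 - hit_ratio) * (tlb_ns + 2.0 * mem_ns);
}

int main(void)
{
    printf("80%% hits: %.0f ns\n", eat(0.80, 20.0, 100.0));  /* 140 ns */
    printf("98%% hits: %.0f ns\n", eat(0.98, 20.0, 100.0));  /* 122 ns */
    return 0;
}
```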
Protection
• Memory protection in a paged environment is implemented by associating a
protection bit with each frame to indicate whether read-only or read-write access is
allowed.
• At the time of each reference to memory through the page table, the
protection bits can be checked to verify that no writes are being made to a
read-only page.
• An attempt to write to a read-only page causes a hardware trap to the
operating system (a memory-protection violation).
• One additional bit is generally attached to each entry in the page table: a valid-
invalid bit.
– When this bit is set to “valid”, the associated page is in the process’s
address space.
Valid (v) or Invalid (i) Bit In A Page Table
• Illegal addresses are trapped by use of the valid-invalid bit. The OS sets this bit
for each page to allow or disallow access to the page.
Structure of Page Table
• Memory structures for paging can get huge using straightforward methods
➢ Consider a 32-bit logical address space as on modern computers
➢ Page size of 4 KB (2^12)
➢ Page table would have 1 million entries (2^32 / 2^12 = 2^20)
➢ If each entry is 4 bytes -> 4 MB of physical memory for the page table
alone
• That amount of memory used to cost a lot
• Don't want to allocate that contiguously in main memory
Most common methods for structuring page table:
• Hierarchical Paging
• Hashed Page Tables
• Inverted Page Tables
Hierarchical Paging
Another name for Hierarchical Paging is multilevel paging.
• There might be a case where the page table is too big to fit in a contiguous space;
the page table can then be structured in a hierarchy with several levels.
• So in this type of paging, the logical address space is broken up into multiple page
tables.
• Hierarchical paging is one of the simplest techniques; for this purpose, a two-level
page table or a three-level page table can be used.
Two-level page table: one in which the page table itself is also paged.
Two-Level Paging Example
• A logical address (on a 32-bit machine with a 1 KB page size) is divided into:
• a page number consisting of 22 bits (page-table entries = 2^22 = 4 million;
page-table size = 4 M entries × 4 bytes = 16 MB)
• a page offset consisting of 10 bits
• Since the page table is paged, the page number is further divided into:
• a 12-bit page number (p1)
• a 10-bit page offset (p2)
where p1 is an index into the outer page table, and p2 is the displacement within
the page of the inner page table
• Known as forward-mapped page table
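A sketch of how such an address is decomposed, using the 12/10/10 split from this example; only the field widths are assumed.

```c
#include <stdint.h>
#include <stdio.h>

/* Field widths from this example: 32-bit address, p1 = 12 bits (outer
 * page-table index), p2 = 10 bits (inner page-table index), d = 10 bits. */
#define P2_BITS 10
#define D_BITS  10

void split(uint32_t logical)
{
    uint32_t d  = logical & ((1u << D_BITS) - 1);
    uint32_t p2 = (logical >> D_BITS) & ((1u << P2_BITS) - 1);
    uint32_t p1 = logical >> (D_BITS + P2_BITS);

    /* p1 selects an entry in the outer page table; that entry points to an
     * inner page table, which p2 indexes to obtain the frame number. */
    printf("p1 = %u, p2 = %u, d = %u\n",
           (unsigned)p1, (unsigned)p2, (unsigned)d);
}

int main(void)
{
    split(0x00ABCDEFu);
    return 0;
}
```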
Address-Translation Scheme
Three-Level Paging Example
For a system with a 64-bit logical address space, a two-level paging scheme is not
sufficient.
Ex: if the page size is 4 KB (2^12)
• Then the page table has 2^52 entries
• In a two-level scheme, the inner page tables could each hold 2^10 4-byte entries
• The address would then look like: outer page p1 (42 bits) | inner page p2 (10 bits) |
offset d (12 bits)
Segmentation
• The CPU generates a logical address which contains two parts: 1. Segment Number
2. Offset
• The segment number is used as an index into the segment table. The limit of the
respective segment is compared with the offset. If the offset is less than the limit,
the address is valid; otherwise a trap is raised, as the address is invalid.
• In the case of a valid address, the base address of the segment is added to the
offset to get the physical address of the actual word in main memory.
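A minimal sketch of this check-and-add translation; the segment-table contents are illustrative values only.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* One segment-table entry: base (start in physical memory) and limit (length). */
struct segment { uint32_t base; uint32_t limit; };

/* Hypothetical segment table, for illustration only. */
static struct segment seg_table[] = {
    { 1400, 1000 },  /* segment 0 */
    { 6300,  400 },  /* segment 1 */
    { 4300, 1100 },  /* segment 2 */
};

uint32_t translate(uint32_t segno, uint32_t offset)
{
    /* The offset must be less than the segment's limit, else trap. */
    if (offset >= seg_table[segno].limit) {
        fprintf(stderr, "trap: offset %u beyond limit of segment %u\n",
                (unsigned)offset, (unsigned)segno);
        exit(EXIT_FAILURE);
    }
    return seg_table[segno].base + offset;  /* valid: base + offset */
}

int main(void)
{
    /* Offset 53 in segment 2 -> 4300 + 53 = 4353. */
    printf("(segment 2, offset 53) -> %u\n", (unsigned)translate(2, 53));
    return 0;
}
```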