
University of Washington

Roadmap

C:
car *c = malloc(sizeof(car));
c->miles = 100;
c->gals = 17;
float mpg = get_mpg(c);
free(c);

Java:
Car c = new Car();
c.setMiles(100);
c.setGals(17);
float mpg = c.getMPG();

Assembly language:
get_mpg:
    pushq %rbp
    movq %rsp, %rbp
    ...
    popq %rbp
    ret

Machine code:
0111010000011000
100011010000010000000010
1000100111000010
110000011111101000011111

OS:

Computer system:

Course topics: Memory & data; Integers & floats; Machine code & C; x86 assembly;
Procedures & stacks; Arrays & structs; Memory & caches; Processes; Virtual memory;
Memory allocation; Java vs. C

Autumn 2013 Memory Allocation 1



Memory Allocation Topics


¢ Dynamic memory allocation
§ Size/number of data structures may only be known at run time
§ Need to allocate space on the heap
§ Need to de-allocate (free) unused memory so it can be re-allocated
¢ Implementation
§ Implicit free lists
§ Explicit free lists – subject of next programming assignment
§ Segregated free lists
¢ Garbage collection
¢ Common memory-related bugs in C programs


Dynamic Memory Allocation


¢ Programmers use dynamic memory allocators (such as
malloc) to acquire virtual memory at run time.
§ For data structures whose size is only known at runtime.
¢ Dynamic memory allocators manage an area of process virtual
memory known as the heap.
User stack

Top of heap
(brk ptr)
Heap (via malloc)

Uninitialized data (.bss)


Initialized data (.data)
Program text (.text)

0

Dynamic Memory Allocation


¢ Allocator maintains heap as collection of variable sized
blocks, which are either allocated or free
§ Allocator requests pages in heap region; virtual memory hardware and
OS kernel allocate these pages to the process.
§ Application objects are typically smaller than pages, so the allocator
manages blocks within pages. (Sometimes larger)
¢ Types of allocators
§ Explicit allocator: application allocates and frees space
§ E.g. malloc and free in C
§ Implicit allocator: application allocates, but does not free space
§ E.g. garbage collection in Java, ML, and Lisp


The malloc Package


#include <stdlib.h>
void* malloc(size_t size)
§ Successful:
§ Returns a pointer to a memory block of at least size bytes,
(typically) aligned to an 8-byte boundary
§ If size == 0, the result is implementation-defined: either NULL or a
unique pointer that may be passed to free
§ Unsuccessful: returns NULL and sets errno
void free(void* p)
§ Returns the block pointed at by p to pool of available memory
§ p must come from a previous call to malloc, calloc, or realloc
(free(NULL) does nothing)
Other functions
§ calloc: Version of malloc that initializes allocated block to zero.
§ realloc: Changes the size of a previously allocated block.
§ sbrk: Used internally by allocators to grow or shrink the heap.
§ historical naming from before virtual memory was common…

Malloc Example
#include <stdio.h>
#include <stdlib.h>

void foo(int n, int m) {
    int i, *p, *q;

    /* allocate a block of n ints */
    p = (int *)malloc(n * sizeof(int));
    if (p == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    for (i=0; i<n; i++) p[i] = i;

    /* add space for m ints to end of p block; keep p valid if realloc fails */
    q = (int *)realloc(p, (n+m) * sizeof(int));
    if (q == NULL) {
        perror("realloc");
        free(p);
        exit(EXIT_FAILURE);
    }
    p = q;
    for (i=n; i < n+m; i++) p[i] = i;

    /* print new array */
    for (i=0; i<n+m; i++)
        printf("%d\n", p[i]);

    free(p); /* return p to available memory pool */
}

Assumptions
¢ Memory is word addressed (each word can hold a pointer)
§ block size is a multiple of words

Allocated block (4 words)    Free block (3 words)

Legend: free word, allocated word


Allocation Example (32-bit)


p1 = malloc(16)

p2 = malloc(20)

p3 = malloc(12)

free(p2)

p4 = malloc(8)


Constraints
¢ Applications
§ Can issue arbitrary sequence of malloc() and free() requests
§ free() requests must be made only for a previously malloc()’d block

¢ Allocators
§ Can’t control number or size of allocated blocks
§ Must respond immediately to malloc() requests
§ i.e., can’t reorder or buffer requests
§ Must allocate blocks from free memory
§ i.e., blocks can’t overlap. Why not?
§ Must align blocks so they satisfy all alignment requirements
§ 8-byte alignment for GNU malloc (libc malloc) on 32-bit Linux (16-byte on x86-64)
§ Can’t move the allocated blocks once they are malloc()’d
§ i.e., compaction is not allowed. Why not?


Performance Goal: Throughput


¢ Given some sequence of malloc and free requests:
§ $R_0, R_1, \ldots, R_k, \ldots, R_{n-1}$

¢ Goals: maximize throughput and peak memory utilization


§ These goals are often conflicting

¢ Throughput:
§ Number of completed requests per unit time
§ Example:
§ 5,000 malloc() calls and 5,000 free() calls in 10 seconds
§ Throughput is 1,000 operations/second


Performance Goal: Peak Memory Utilization


¢ Given some sequence of malloc and free requests:
§ $R_0, R_1, \ldots, R_k, \ldots, R_{n-1}$
¢ Def: Aggregate payload $P_k$
§ malloc(p) results in a block with a payload of p bytes
§ After request $R_k$ has completed, the aggregate payload $P_k$ is the sum of
currently allocated payloads

¢ Def: Current heap size = $H_k$
§ Assume $H_k$ is monotonically nondecreasing
§ Allocator can increase size of heap using sbrk()

¢ Def: Peak memory utilization after k requests
§ $U_k = (\max_{i \le k} P_i) / H_k$
§ Goal: maximize utilization for a sequence of requests.
§ Why is this hard? And what happens to throughput?
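
The definition above can be worked through on a small trace. The following is a minimal sketch, assuming a hypothetical trace encoding in which a positive entry is a malloc of that many payload bytes and a negative entry is a free of a block of that size; the function name `peak_utilization` is illustrative only.

```c
#include <stddef.h>

/* Peak memory utilization for a trace of n requests.
 * deltas[i] > 0: malloc of that many payload bytes (hypothetical encoding).
 * deltas[i] < 0: free of a block with that payload size.
 * heap_size is the heap size after the last request, in bytes. */
double peak_utilization(const long *deltas, size_t n, size_t heap_size)
{
    long payload = 0;      /* aggregate payload after request i */
    long max_payload = 0;  /* largest aggregate payload seen so far */

    for (size_t i = 0; i < n; i++) {
        payload += deltas[i];
        if (payload > max_payload)
            max_payload = payload;
    }
    return heap_size ? (double)max_payload / (double)heap_size : 0.0;
}
```

For the trace malloc(16), malloc(20), malloc(12), free of the 20-byte block, malloc(8), the aggregate payload peaks at 48 bytes, so a 64-byte heap gives a utilization of 0.75.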


Fragmentation
¢ Poor memory utilization is caused by fragmentation.
¢ Sections of memory are not used to store anything useful, but
cannot be allocated.
¢ internal fragmentation
¢ external fragmentation


Internal Fragmentation
¢ For a given block, internal fragmentation occurs if payload is smaller than
block size

[Figure: a block laid out as internal fragmentation, payload, internal fragmentation]

¢ Caused by
§ overhead of maintaining heap data structures (inside block, outside payload)
§ padding for alignment purposes
§ explicit policy decisions (e.g., to return a big block to satisfy a small request)
why would anyone do that?
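
The first two causes can be quantified with a little arithmetic. This is a sketch only, assuming one 4-byte header word per block and 8-byte alignment; the names `HEADER_SIZE`, `ALIGNMENT`, `block_size_for`, and `internal_fragmentation` are hypothetical.

```c
#include <stddef.h>

#define HEADER_SIZE 4u   /* assumed: one 4-byte header word per block */
#define ALIGNMENT   8u   /* assumed: block sizes are multiples of 8 */

/* Smallest block (header + payload + padding) that holds `payload` bytes. */
size_t block_size_for(size_t payload)
{
    size_t raw = payload + HEADER_SIZE;
    return (raw + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
}

/* Bytes inside the block that do not hold application data. */
size_t internal_fragmentation(size_t payload)
{
    return block_size_for(payload) - payload;
}
```

Under these assumptions a 5-byte request occupies a 16-byte block, so 11 of the 16 bytes are internal fragmentation.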


External Fragmentation (32-bit)


¢ Occurs when there is enough aggregate heap memory, but no
single free block is large enough

p1 = malloc(16)

p2 = malloc(20)

p3 = malloc(24)

free(p2)

p4 = malloc(24) Oops! (what would happen now?)

¢ Depends on the pattern of future requests


§ Thus, difficult to measure

Implementation Issues
¢ How do we know how much memory to free given just a
pointer?

¢ How do we keep track of the free blocks?

¢ How do we pick a block to use for allocation (when many might fit)?

¢ What do we do with the extra space when allocating a structure that is
smaller than the free block it is placed in?

¢ How do we reinsert freed block into the heap?



Knowing How Much to Free


¢ Standard method
§ Keep the length of a block in the word preceding the block
§This word is often called the header field or header
§ Requires an extra word for every allocated block

p0 = malloc(16)

[Figure: p0 points to the data; the word before it holds the block size, 20]

free(p0)


Keeping Track of Free Blocks


¢ Method 1: Implicit list using length—links all blocks

20 16 24 8

¢ Method 2: Explicit list among the free blocks using pointers

20 16 24 8

¢ Method 3: Segregated free list


§ Different free lists for different size classes

¢ Method 4: Blocks sorted by size


§ Can use a balanced binary tree (e.g. red-black tree) with pointers
within each free block, and the length used as a key

Implicit Free Lists

¢ For each block we need: size, is-allocated?
§ Could store this information in two words: wasteful!

¢ Standard trick
§ If blocks are aligned, some low-order size bits are always 0
§ e.g. with 8-byte alignment, sizes look like: 00000000, 00001000, 00010000, 00011000
§ Instead of storing an always-0 bit, use it as an allocated/free flag
§ When reading size, must remember to mask out this bit

Format of allocated and free blocks:
  size | a            (1 word)
  payload
  optional padding

a = 1: allocated block
a = 0: free block
size: block size
payload: application data (allocated blocks only)
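
The standard trick can be sketched in a few lines of C. This assumes a 4-byte header word; the type `word_t` and the helpers `pack`, `block_size`, and `is_allocated` are hypothetical names used only for illustration.

```c
#include <stdint.h>

typedef uint32_t word_t;  /* assumed: headers are 4-byte words */

/* Combine a block size (a multiple of 8) with the allocated flag in bit 0. */
static inline word_t pack(word_t size, int allocated)
{
    return size | (allocated ? 1u : 0u);
}

/* Mask out the flag bit to recover the size. */
static inline word_t block_size(word_t header)
{
    return header & ~(word_t)1;
}

/* Extract the allocated flag: 1 for allocated, 0 for free. */
static inline int is_allocated(word_t header)
{
    return (int)(header & 1u);
}
```

A 16-byte allocated block is stored as the header value 17, and masking recovers the size 16.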

Implicit Free List Example (32-bit)


Sequence of blocks in heap (size|allocated): 8|0, 16|1, 32|0, 16|1

Start of heap
[unused word] 8|0 16|1 32|0 16|1 0|1

Legend: free word, allocated word, allocated word (unused)
8 bytes = 2-word alignment

¢ 8-byte alignment
§ May require initial unused word
§ Causes some internal fragmentation
¢ Special one-word marker (0|1) marks end of list
§ zero size is distinguishable from all real sizes

Implicit List: Finding a Free Block

¢ First fit:
§ Search list from beginning, choose first free block that fits:
p = heap_start;
while ((p < end) &&            // not past end
       ((*p & 1) ||            // already allocated
        (*p <= len))) {        // too small
    p = p + (*p & -2);         // go to next block (UNSCALED +)
}                              // p points to selected block or end

§ *p gets the block header
§ *p & 1 extracts the allocated bit
§ *p & -2 masks the allocated bit, gets just the size

§ Can take time linear in total number of blocks (allocated and free)
§ In practice it can cause “splinters” at beginning of list
¢ Next fit:
§ Like first-fit, but search list starting where previous search finished
§ Should often be faster than first-fit: avoids re-scanning unhelpful blocks
§ Some research suggests that fragmentation is worse
¢ Best fit:
§ Search the list, choose the best free block: fits, with fewest bytes left over
§ Keeps fragments small—usually helps fragmentation
§ Will typically run slower than first-fit
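
For comparison with the first-fit loop, here is a minimal best-fit sketch. It assumes the heap is modelled as an array of 4-byte words, each header holds the block size in bytes with the allocated flag in bit 0, and a header of size 0 ends the list; `best_fit` and `NOT_FOUND` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define NOT_FOUND ((size_t)-1)

/* Scan the whole implicit list and return the word index of the free
 * block that fits `len` bytes with the fewest bytes left over, or
 * NOT_FOUND if no free block is large enough. */
size_t best_fit(const uint32_t *heap, uint32_t len)
{
    size_t best = NOT_FOUND;
    uint32_t best_size = 0;

    /* A header whose size field is 0 marks the end of the list. */
    for (size_t i = 0; (heap[i] & ~1u) != 0; i += (heap[i] & ~1u) / 4) {
        uint32_t size = heap[i] & ~1u;       /* mask out the flag bit */
        int allocated = (int)(heap[i] & 1u);

        if (!allocated && size >= len &&
            (best == NOT_FOUND || size < best_size)) {
            best = i;
            best_size = size;
        }
    }
    return best;
}
```

Unlike first fit, the loop never stops early, which is why best fit typically costs more time per request.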

Implicit List: Allocating in Free Block


¢ Allocating in a free block: splitting
§ Since allocated space might be smaller than free space, we might want
to split the block

8 24 8

b
malloc(12) → split(b, 16)

8 16 8 8

// assume ptr points to word and has unscaled pointer arithmetic
void split(ptr b, int bytes) {              // bytes = desired block size
    int newsize = ((bytes + 7) >> 3) << 3;  // round up to multiple of 8
    int oldsize = *b;                       // why not mask out low bit?
    *b = newsize;                           // initially unallocated
    if (newsize < oldsize)
        *(b+newsize) = oldsize - newsize;   // set length in remaining
}                                           // part of block (UNSCALED +)

Implicit List: Freeing a Block


¢ Simplest implementation:
§ Need only clear the “allocated” flag
void free(ptr p) { ptr b = p - WORD; *b = *b & -2; }
§ But can lead to “false fragmentation”

8 16 8 8

free(p) p

8 16 8 8

malloc(20) Oops!

There is enough free space, but the allocator won’t be able to find it


Implicit List: Coalescing


¢ Join (coalesce) with next/previous blocks, if they are free
§ Coalescing with next block

Before:         16 | 16 (p) | 8 | 8
After free(p):  16 | 24 | 8 (logically gone) | 8

void free(ptr p) {            // p points to data
    ptr b = p - WORD;         // b points to block
    *b = *b & -2;             // clear allocated bit
    ptr next = b + *b;        // find next block (UNSCALED +)
    if ((*next & 1) == 0)
        *b = *b + *next;      // add to this block if
}                             // not allocated

§ But how do we coalesce with the previous block?



Implicit List: Bidirectional Coalescing


¢ Boundary tags [Knuth73]
§ Replicate size/allocated word at “bottom” (end) of free blocks
§ Allows us to traverse the “list” backwards, but requires extra space
§ Important and general technique!

Example heap (header and footer of each block): 16/16, 16/16, 24/24, 16/16

Format of allocated and free blocks:
  Header:                 size | a
  payload and padding
  Boundary tag (footer):  size | a

a = 1: allocated block
a = 0: free block
size: total block size
payload: application data (allocated blocks only)


Constant Time Coalescing

                     Case 1      Case 2      Case 3      Case 4
previous block       allocated   allocated   free        free
block being freed
next block           allocated   free        allocated   free


Constant Time Coalescing


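Header/footer tags (size|allocated) before and after freeing the middle block
of size n, where m1 is the size of the previous block and m2 the size of the
next block:

Case 1:  m1|1, n|1, m2|1  →  m1|1, n|0, m2|1
Case 2:  m1|1, n|1, m2|0  →  m1|1, n+m2|0
Case 3:  m1|0, n|1, m2|1  →  n+m1|0, m2|1
Case 4:  m1|0, n|1, m2|0  →  n+m1+m2|0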
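
The four cases can be handled with two independent checks, one per neighbor. The sketch below assumes the heap is an array of 4-byte words, every block carries both a header and a footer holding its size in bytes with the allocated flag in bit 0, and an allocated sentinel word sits before the first block and after the last one. The names `SIZE`, `ALLOC`, and `free_and_coalesce` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SIZE(w)  ((w) & ~1u)   /* block size in bytes */
#define ALLOC(w) ((w) & 1u)    /* allocated flag */

/* Free the block whose header is at word index h and merge it with any
 * free neighbor. Returns the header index of the resulting free block.
 * Assumes sentinels exist, so heap[h - 1] and the next header are valid. */
size_t free_and_coalesce(uint32_t *heap, size_t h)
{
    uint32_t size = SIZE(heap[h]);
    size_t next = h + size / 4;        /* header of the next block */
    uint32_t prev_tag = heap[h - 1];   /* footer of the previous block */

    if (!ALLOC(heap[next]))            /* cases 2 and 4: absorb next */
        size += SIZE(heap[next]);
    if (!ALLOC(prev_tag)) {            /* cases 3 and 4: absorb previous */
        size += SIZE(prev_tag);
        h -= SIZE(prev_tag) / 4;
    }
    heap[h] = size;                    /* new header, flag cleared */
    heap[h + size / 4 - 1] = size;     /* new footer, flag cleared */
    return h;
}
```

Each step reads or writes a fixed number of words, which is why freeing stays constant time even with coalescing.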

Implicit Free Lists: Summary


¢ Implementation: very simple
¢ Allocate cost:
§ linear time (in total number of heap blocks) worst case
¢ Free cost:
§ constant time worst case
§ even with coalescing
¢ Memory utilization:
§ will depend on placement policy
§ First-fit, next-fit or best-fit

¢ Not used in practice for malloc()/free() because of linear-time allocation
§ used in some special purpose applications

¢ The concepts of splitting and boundary tag coalescing are general to all
allocators

Keeping Track of Free Blocks


¢ Method 1: Implicit free list using length—links all blocks

20 16 24 8

¢ Method 2: Explicit free list among the free blocks using pointers

20 16 24 8

¢ Method 3: Segregated free list


§ Different free lists for different size classes

¢ Method 4: Blocks sorted by size


§ Can use a balanced tree (e.g. Red-Black tree) with pointers within each
free block, and the length used as a key

Explicit Free Lists


Allocated block:                Free block:
  size | a                        size | a
  payload and padding             next
                                  prev
  size | a                        size | a
  (same as implicit free list)

¢ Maintain list(s) of free blocks, rather than implicit list of all blocks
§ The “next” free block could be anywhere in the heap
§ So we need to store forward/back pointers, not just sizes
§ Luckily we track only free blocks, so we can use payload area for pointers
§ Still need boundary tags for coalescing

Explicit Free Lists


¢ Logically (doubly-linked lists):

A B C

¢ Physically: blocks can be in any order

[Figure: blocks A, B, and C at arbitrary heap positions among blocks of sizes 16, 16, 24, 16, 16, connected by forward (next) links and back (prev) links]


Allocating From Explicit Free Lists


[Conceptual graphic: the free list before the call, and after = malloc(…) with splitting]

Freeing With Explicit Free Lists


¢ Insertion policy: Where in the free list do you put a newly
freed block?
§ LIFO (last-in-first-out) policy
§ Insert freed block at the beginning of the free list
§ Pro: simple and constant time
§ Con: studies suggest fragmentation is worse than address ordered

§ Address-ordered policy
§ Insert freed blocks so that free list blocks are always in address
order:
addr(prev) < addr(curr) < addr(next)
§ Con: requires linear-time search when blocks are freed
§ Pro: studies suggest fragmentation is lower than LIFO
¢ Cache effects?
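
The two insertion policies can be compared side by side. This is a sketch under the assumption that the links live in the payload area of each free block; `struct free_block`, `insert_lifo`, `insert_by_address`, and `unlink_block` are hypothetical names, and size headers and coalescing are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Links stored in the payload area of a free block (assumed layout). */
struct free_block {
    struct free_block *next;
    struct free_block *prev;
};

/* LIFO policy: push the freed block at the root. Constant time. */
void insert_lifo(struct free_block **root, struct free_block *b)
{
    b->prev = NULL;
    b->next = *root;
    if (*root)
        (*root)->prev = b;
    *root = b;
}

/* Address-ordered policy: walk until the next block has a higher
 * address. Linear in the number of free blocks. */
void insert_by_address(struct free_block **root, struct free_block *b)
{
    struct free_block *prev = NULL, *cur = *root;

    while (cur && (uintptr_t)cur < (uintptr_t)b) {
        prev = cur;
        cur = cur->next;
    }
    b->prev = prev;
    b->next = cur;
    if (cur)
        cur->prev = b;
    if (prev)
        prev->next = b;
    else
        *root = b;
}

/* Splice a block out of the list, e.g. before coalescing it. */
void unlink_block(struct free_block **root, struct free_block *b)
{
    if (b->prev)
        b->prev->next = b->next;
    else
        *root = b->next;
    if (b->next)
        b->next->prev = b->prev;
}
```

The LIFO insert touches only the root and the new block, while the address-ordered insert may walk the whole list, which matches the constant-time versus linear-time trade-off listed above.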

Freeing With a LIFO Policy (Case 1)


¢ Insert the freed block at the root of the list

[Conceptual graphic: free list starting at Root, before and after free( )]


Freeing With a LIFO Policy (Case 2)


¢ Splice out predecessor block, coalesce both memory blocks, and insert the
new block at the root of the list

[Conceptual graphic: free list starting at Root, before and after free( )]


Freeing With a LIFO Policy (Case 3)


¢ Splice out successor block, coalesce both memory blocks and insert the new
block at the root of the list

[Conceptual graphic: free list starting at Root, before and after free( )]


Freeing With a LIFO Policy (Case 4)


¢ Splice out predecessor and successor blocks, coalesce all 3 memory blocks
and insert the new block at the root of the list

[Conceptual graphic: free list starting at Root, before and after free( )]


Do we always need the boundary tag?


Allocated block:                Free block:
  size | a                        size | a
  payload and padding             next
                                  prev
  size | a                        size | a

¢ Lab 5 suggests no…


Explicit List Summary


¢ Comparison to implicit list:
§ Allocate is linear time in number of free blocks instead of all blocks
§ Much faster when most of the memory is full
§ Slightly more complicated allocate and free since needs to splice blocks
in and out of the list
§ Some extra space for the links (2 extra words needed for each free block)
§ Possibly increases minimum block size, leading to more internal
fragmentation

¢ Most common use of explicit lists is in conjunction with segregated free lists
§ Keep multiple linked lists of different size classes, or possibly for
different types of objects


Keeping Track of Free Blocks


¢ Method 1: Implicit list using length—links all blocks

20 16 24 8

¢ Method 2: Explicit list among the free blocks using pointers

20 16 24 8

¢ Method 3: Segregated free list


§ Different free lists for different size classes

¢ Method 4: Blocks sorted by size


§ Can use a balanced tree (e.g. Red-Black tree) with pointers within each
free block, and the length used as a key

Segregated List (Seglist) Allocators


¢ Each size class of blocks has its own free list
8

16

24-32

40-inf

¢ Often have separate classes for each small size
¢ For larger sizes: One class for each power-of-two size
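
One possible mapping from block size to free-list index is sketched below. It assumes exact classes for every multiple of 8 up to a cutoff of 64 bytes and one class per power-of-two range above that; the cutoff `SMALL_LIMIT` and the function `size_class` are hypothetical and not taken from any particular allocator.

```c
#include <stddef.h>

#define SMALL_LIMIT 64u  /* assumed cutoff between exact and power-of-two classes */

/* Map a block size in bytes (a nonzero multiple of 8) to a list index. */
unsigned size_class(size_t size)
{
    if (size <= SMALL_LIMIT)
        return (unsigned)(size / 8 - 1);   /* 8 -> 0, 16 -> 1, ..., 64 -> 7 */

    unsigned cls = 8;                      /* sizes in (64, 128] -> 8 */
    size_t upper = 2 * SMALL_LIMIT;

    /* Double the range until it covers size; the second test stops the
     * loop before `upper` could overflow. */
    while (size > upper && upper <= (size_t)-1 / 2) {
        upper *= 2;
        cls++;
    }
    return cls;
}
```

With power-of-two ranges the number of classes grows with the logarithm of the largest block size, which is where the log-time search bound for seglists comes from.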


Seglist Allocator
¢ Given an array of free lists, each one for some size class

¢ To allocate a block of size n:


§ Search appropriate free list for block of size m ≥ n
§ If an appropriate block is found:
§ Split block and place fragment on appropriate list (optional)
§ If no block is found, try next larger class
§ Repeat until block is found

¢ If no block is found:
§ Request additional heap memory from OS (using sbrk())
§ Allocate block of n bytes from this new memory
§ Place remainder as a single free block in largest size class


Seglist Allocator
¢ To free a block:
§ Coalesce and place on appropriate list (optional)

¢ Advantages of seglist allocators


§ Higher throughput
§ log time for power-of-two size classes
§ Better memory utilization
§ First-fit search of segregated free list approximates a best-fit
search of entire heap.
§ Extreme case: Giving each block its own size class is equivalent to
best-fit.


Summary of Key Allocator Policies


¢ Placement policy:
§ First-fit, next-fit, best-fit, etc.
§ Trades off lower throughput for less fragmentation
§ Observation: segregated free lists approximate a best fit placement
policy without having to search entire free list

¢ Splitting policy:
§ When do we go ahead and split free blocks?
§ How much internal fragmentation are we willing to tolerate?
¢ Coalescing policy:
§ Immediate coalescing: coalesce each time free() is called
§ Deferred coalescing: try to improve performance of free() by
deferring coalescing until needed. Examples:
§ Coalesce as you scan the free list for malloc()
§ Coalesce when the amount of external fragmentation reaches
some threshold

More Info on Allocators


¢ D. Knuth, “The Art of Computer Programming, Vol. 1: Fundamental Algorithms”,
2nd edition, Addison Wesley, 1973
§ The classic reference on dynamic storage allocation

¢ Wilson et al, “Dynamic Storage Allocation: A Survey and Critical Review”,
Proc. 1995 Int’l Workshop on Memory Management, Kinross, Scotland, Sept, 1995.
§ Comprehensive survey
§ Available from CS:APP student site (csapp.cs.cmu.edu)


Wouldn’t it be nice…
¢ If we never had to free memory?
¢ Do you free objects in Java?


Garbage Collection (GC)


(Automatic Memory Management / Implicit Memory Allocation)

¢ Garbage collection: automatic reclamation of heap-allocated storage; the
application never explicitly frees memory.
void foo() {
int* p = (int *)malloc(128);
return; /* p block is now garbage */
}

¢ Common in implementations of functional languages, scripting languages, and
modern object oriented languages:
§ Lisp, Racket, Erlang, ML, Haskell, Scala, Java, C#, Perl, Ruby, Python, Lua,
JavaScript, Dart, Mathematica, MATLAB, many more…
¢ Variants (“conservative” garbage collectors) exist for C and C++
§ However, cannot necessarily collect all garbage


Garbage Collection
¢ How does the memory allocator know when memory can be
freed?
§ In general, we cannot know what is going to be used in the future since it
depends on conditionals (halting problem, etc.)
§ But, we can tell that certain blocks cannot be used if there are no
pointers to them

¢ So the memory allocator needs to know what is a pointer and what is not.
How can it do this?

¢ We’ll make some assumptions about pointers:
§ Memory allocator can distinguish pointers from non-pointers
§ All pointers point to the start of a block in the heap
§ Application cannot hide pointers
(e.g., by coercing them to an int, and then back again)

Classical GC Algorithms
¢ Mark-and-sweep collection (McCarthy, 1960)
§ Does not move blocks (unless you also “compact”)
¢ Reference counting (Collins, 1960)
§ Does not move blocks (not discussed)
¢ Copying collection (Minsky, 1963)
§ Moves blocks (not discussed)
¢ Generational Collectors (Lieberman and Hewitt, 1983)
§ Most allocations become garbage very soon, so
focus reclamation work on zones of memory recently allocated.
¢ For more information:
§ Jones, Hosking, and Moss, The Garbage Collection Handbook: The Art
of Automatic Memory Management, CRC Press, 2012.
§ Jones and Lin, Garbage Collection: Algorithms for Automatic Dynamic
Memory, John Wiley & Sons, 1996.


Memory as a Graph
¢ We view memory as a directed graph
§ Each allocated heap block is a node in the graph
§ Each pointer is an edge in the graph
§ Locations not in the heap that contain pointers into the heap are called
root nodes (e.g. registers, locations on the stack, global variables)

[Figure: directed graph of root nodes, reachable heap nodes, and not-reachable (garbage) heap nodes]

A node (block) is reachable if there is a path from any root to that node
Non-reachable nodes are garbage (cannot be needed by the application)

Mark and Sweep Collecting


¢ Can build on top of malloc/free package
§ Allocate using malloc until you “run out of space”
¢ When out of space:
§ Use extra mark bit in the head of each block
§ Mark: Start at roots and set mark bit on each reachable block
§ Sweep: Scan all blocks and free blocks that are not marked
[Figure: heap blocks reachable from root, shown before mark, after mark (mark bit set), and after sweep (unmarked blocks free)]



Assumptions For a Simple Implementation


¢ Application can use functions to allocate memory:
§ b = new(n) : returns pointer, b, to new block with all locations cleared
§ b[i] : read location i of block b into register
§ b[i] = v : write v into location i of block b
¢ Each block will have a header word
§ b[-1]

¢ Functions used by the garbage collector:


§ is_ptr(p): determines whether p is a pointer to a block
§ length(p): returns length of block pointed to by p, not including header
§ get_roots(): returns all the roots


Mark
Mark using depth-first traversal of the memory graph
void mark(ptr p) {                      // p: some word in a heap block
    if (!is_ptr(p)) return;             // do nothing if not pointer
    if (markBitSet(p)) return;          // check if already marked
    setMarkBit(p);                      // set the mark bit
    for (int i=0; i < length(p); i++)   // recursively call mark on
        mark(p[i]);                     // all words in the block
    return;
}

[Figure: blocks reachable from root, before mark and after mark (mark bit set)]


Sweep

[Figure: heap after mark (mark bit set) and after sweep (unmarked blocks free)]

Sweep using lengths to find next block


void sweep(ptr p, ptr end) {            // ptrs to start & end of heap
    while (p < end) {                   // while not at end of heap
        if (markBitSet(p))              // check if block is marked
            clearMarkBit(p);            // if so, reset mark bit
        else if (allocateBitSet(p))     // if not marked, but allocated
            free(p);                    // free the block
        p += length(p);                 // adjust pointer to next block
    }
}
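
The mark and sweep phases can be tried out on a toy heap. This is a simplified model rather than a real collector: blocks live in a fixed array and array indices stand in for pointers. The names `struct block`, `MAX_CHILDREN`, and `NO_CHILD` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2
#define NO_CHILD (-1)

/* A toy heap block: indices of other blocks stand in for pointers. */
struct block {
    bool allocated;
    bool marked;
    int child[MAX_CHILDREN];
};

/* Mark: depth-first traversal starting from block b. */
void mark(struct block *heap, int b)
{
    if (b == NO_CHILD || heap[b].marked)
        return;
    heap[b].marked = true;
    for (int i = 0; i < MAX_CHILDREN; i++)
        mark(heap, heap[b].child[i]);
}

/* Sweep: free every allocated block left unmarked and reset mark bits.
 * Returns the number of blocks reclaimed. */
size_t sweep(struct block *heap, size_t n)
{
    size_t freed = 0;

    for (size_t i = 0; i < n; i++) {
        if (heap[i].marked) {
            heap[i].marked = false;
        } else if (heap[i].allocated) {
            heap[i].allocated = false;
            freed++;
        }
    }
    return freed;
}
```

Calling `mark` once per root and then `sweep` over the whole array reclaims every unreachable block, including blocks that only point at each other in a cycle.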


Conservative Mark & Sweep in C


¢ Would mark & sweep work in C?
§ is_ptr() (previous slide) determines if a word is a pointer by
checking if it points to an allocated block of memory
§ But in C, pointers can point into the middle of allocated blocks (not so
in Java)
§ Makes it tricky to find all allocated blocks in mark phase
[Figure: ptr pointing into the middle of a block, past its header]

§ There are ways to solve/avoid this problem in C, but the resulting garbage
collector is conservative:
§ Every reachable node correctly identified as reachable, but some
unreachable nodes might be incorrectly marked as reachable
§ In Java, all pointers (i.e., references) point to the starting address of an
object structure – the start of an allocated block


Memory-Related Perils and Pitfalls in C


¢ Dereferencing bad pointers
¢ Reading uninitialized memory
¢ Overwriting memory
¢ Referencing nonexistent variables
¢ Freeing blocks multiple times
¢ Referencing freed blocks
¢ Failing to free blocks


Dereferencing Bad Pointers


¢ The classic scanf bug

int val;

...

scanf("%d", val);

¢ Will cause scanf to interpret contents of val as an address!
§ Best case: program terminates immediately due to segmentation fault
§ Worst case: contents of val correspond to some valid read/write area
of virtual memory, causing scanf to overwrite that memory, with
disastrous and baffling consequences much later in program execution


Reading Uninitialized Memory


¢ Assuming that heap data is initialized to zero

/* return y = Ax */
int *matvec(int **A, int *x) {
int *y = (int *)malloc( N * sizeof(int) );
int i, j;

for (i=0; i<N; i++) {


for (j=0; j<N; j++) {
y[i] += A[i][j] * x[j];
}
}
return y;
}
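
One way to remove the bug is to allocate with calloc, which zero-fills the block. The sketch below passes the dimension as a parameter `n` instead of the slide's `N`, which is an assumption made to keep the example self-contained.

```c
#include <stdlib.h>

/* Return y = Ax for an n-by-n matrix, or NULL if allocation fails.
 * calloc zero-fills y, so the += accumulation starts from 0.
 * The caller frees the result. */
int *matvec(int **A, const int *x, size_t n)
{
    int *y = calloc(n, sizeof *y);
    if (y == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            y[i] += A[i][j] * x[j];
    return y;
}
```

Keeping malloc and adding an explicit `y[i] = 0;` at the top of the outer loop would work equally well.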


Overwriting Memory
¢ Allocating the (possibly) wrong sized object

int **p;

p = (int **)malloc( N * sizeof(int) );

for (i=0; i<N; i++) {


p[i] = (int *)malloc( M * sizeof(int) );
}
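
The outer array holds pointers, so its element size is sizeof(int *), which may be larger than sizeof(int). A sketch of a safer pattern follows: writing `sizeof *p` ties the element size to the pointer's own type. The function name `alloc_table` and its parameters are hypothetical.

```c
#include <stdlib.h>

/* Allocate an n-by-m table of ints; returns NULL on failure. */
int **alloc_table(size_t n, size_t m)
{
    int **p = malloc(n * sizeof *p);        /* sizeof(int *), not sizeof(int) */
    if (p == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        p[i] = malloc(m * sizeof *p[i]);    /* sizeof(int) */
        if (p[i] == NULL) {                 /* undo the partial work */
            while (i > 0)
                free(p[--i]);
            free(p);
            return NULL;
        }
    }
    return p;
}
```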


Overwriting Memory
¢ Off-by-one error

int **p;

p = (int **)malloc( N * sizeof(int *) );

for (i=0; i<=N; i++) {


p[i] = (int *)malloc( M * sizeof(int) );
}


Overwriting Memory
¢ Not checking the max string size

char s[8];
int i;

gets(s); /* reads “123456789” from stdin */

¢ Basis for classic buffer overflow attacks


§ Your lab assignment #3
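
A bounded read avoids the overflow. The sketch below uses fgets, which never writes more than the given number of bytes including the terminating null; the wrapper name `read_line` is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Read one line into buf without writing more than size bytes.
 * Returns buf, or NULL on end of file or error. */
char *read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline, if any */
    return buf;
}
```

With `char s[8]` and the input 123456789, the call `read_line(stdin, s, sizeof s)` stores only the first seven characters and leaves the rest in the stream.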


Overwriting Memory
¢ Misunderstanding pointer arithmetic

int *search(int *p, int val) {

while (p && *p != val)


p += sizeof(int);

return p;
}
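
Pointer arithmetic is already scaled by the element size, so `p += sizeof(int)` skips several ints per step. A corrected sketch follows; the explicit element count `n` is an added assumption, used so the search has a defined end.

```c
#include <stddef.h>

/* Return a pointer to the first of n elements equal to val, or NULL if
 * none match. p++ already advances by sizeof(int) bytes. */
int *search(int *p, size_t n, int val)
{
    for (size_t i = 0; i < n; i++, p++)
        if (*p == val)
            return p;
    return NULL;
}
```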


Overwriting Memory
¢ Referencing a pointer instead of the object it points to

int *getPacket(int **packets, int *size) {


int *packet;
packet = packets[0];
packets[0] = packets[*size - 1];
*size--; // what is happening here?
reorderPackets(packets, *size);
return(packet);
}
¢ Postfix ‘--’ has higher precedence than unary ‘*’, so *size-- parses as
*(size--): the pointer is decremented, not the value it points to!
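
Parentheses make the intent explicit. The sketch below keeps the logic of the example but omits the call to reorderPackets, which is not defined in these slides; the name `take_packet` is hypothetical, and the caller is assumed to pass a count of at least 1.

```c
/* Remove and return the first packet, moving the last one into its slot.
 * The parentheses make the decrement apply to the count itself. */
int *take_packet(int **packets, int *size)
{
    int *packet = packets[0];

    packets[0] = packets[*size - 1];
    (*size)--;
    return packet;
}
```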


Referencing Nonexistent Variables


¢ Forgetting that local variables disappear when a function
returns

int *foo () {
int val;

return &val;
}


Freeing Blocks Multiple Times


¢ Nasty!
x = (int *)malloc( N * sizeof(int) );
<manipulate x>
free(x);
...

y = (int *)malloc( M * sizeof(int) );


free(x);
<manipulate y>

¢ What does the free list look like?

x = (int *)malloc( N * sizeof(int) );


<manipulate x>
free(x);
free(x);
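
A common defensive habit is to clear the pointer when the block is freed, since free(NULL) is defined to do nothing. The helper below is a sketch with a hypothetical name, `free_and_clear`; it protects only the one pointer variable passed in, not other copies of the same address.

```c
#include <stdlib.h>

/* Free *pp and clear the caller's pointer, so a second call becomes a
 * harmless free(NULL) instead of a double free. */
void free_and_clear(int **pp)
{
    free(*pp);
    *pp = NULL;
}
```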

Referencing Freed Blocks


¢ Evil!

x = (int *)malloc( N * sizeof(int) );


<manipulate x>
free(x);
...
y = (int *)malloc( M * sizeof(int) );
for (i=0; i<M; i++)
y[i] = x[i]++;


Failing to Free Blocks (Memory Leaks)


¢ Slow, silent, long-term killer!

foo() {
int *x = (int *)malloc(N*sizeof(int));
...
return;
}


Failing to Free Blocks (Memory Leaks)


¢ Freeing only part of a data structure

struct list {
int val;
struct list *next;
};

foo() {
struct list *head =
(struct list *)malloc( sizeof(struct list) );
head->val = 0;
head->next = NULL;
<create and manipulate the rest of the list>
...
free(head);
return;
}
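
Freeing only the head leaks every other node. A minimal sketch of releasing the whole list follows, using the same struct list; the function name `free_list` and its return value are hypothetical additions.

```c
#include <stdlib.h>

struct list {
    int val;
    struct list *next;
};

/* Free every node, reading next before the node is released.
 * Returns the number of nodes freed. */
size_t free_list(struct list *head)
{
    size_t count = 0;

    while (head != NULL) {
        struct list *next = head->next;
        free(head);
        head = next;
        count++;
    }
    return count;
}
```

Saving `next` first matters: reading `head->next` after `free(head)` would be a reference to a freed block.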

Dealing With Memory Bugs


¢ Conventional debugger (gdb)
§ Good for finding bad pointer dereferences
§ Hard to detect the other memory bugs

¢ Debugging malloc (UToronto CSRI malloc)
§ Wrapper around conventional malloc
§ Detects memory bugs at malloc and free boundaries
§ Memory overwrites that corrupt heap structures
§ Some instances of freeing blocks multiple times
§ Memory leaks
§ Cannot detect all memory bugs
§ Overwrites into the middle of allocated blocks
§ Freeing block twice that has been reallocated in the interim
§ Referencing freed blocks


Dealing With Memory Bugs (cont.)


¢ Some malloc implementations contain checking code
§ Linux glibc malloc: setenv MALLOC_CHECK_ 2
§ FreeBSD: setenv MALLOC_OPTIONS AJR
¢ Binary translator: valgrind (Linux), Purify
§ Powerful debugging and analysis technique
§ Rewrites text section of executable object file
§ Can detect all the errors that a debugging malloc can
§ Can also check each individual reference at runtime
§ Bad pointers
§ Overwriting
§ Referencing outside of allocated block


What about Java or ML or Python or …?


¢ In memory-safe languages, most of these bugs are impossible
§ Cannot perform arbitrary pointer manipulation
§ Cannot get around the type system
§ Array bounds checking, null pointer checking
§ Automatic memory management
¢ But one of the bugs we saw earlier is possible. Which one?


Memory Leaks with GC


¢ Not because of forgotten free() -- we have GC!
¢ Unneeded “leftover” roots keep objects reachable
¢ Sometimes nullifying a variable is not needed for correctness
but is for performance
¢ Bigger issue with reference counting GC

[Figure: memory graph with root nodes, reachable heap nodes, and not-reachable (garbage) heap nodes]
