Converted-7da7c

The document provides an overview of storage management and garbage collection terminology, including concepts like stack, heap, roots, and garbage. It discusses memory types, allocation strategies, and garbage collection algorithms such as reference counting and mark-and-sweep. Additionally, it highlights the importance of garbage collection in software engineering and the challenges associated with different garbage collection methods.

Uploaded by

qudus4060

Storage Management and Garbage Collection
Terminology
• Stack: a memory area onto which activation records (frames) are pushed when a procedure is called and from which they are popped when it returns.
• Heap: a memory area where data structures can be allocated and deallocated in any order.

Terminology
(Continued)

• Roots: values that a program can manipulate directly (i.e., values held in registers, on the program stack, and in global variables).
• Node/Cell/Object: an individually
allocated piece of data in the heap.
• Children Nodes: the list of pointers that a
given node contains.
• Live Node: a node whose address is held
in a root or is the child of a live node.
Terminology
(Continued)

• Garbage: nodes that are not live, but are not free either.
• Garbage collection: the task of
recovering (freeing) garbage nodes.
• Mutator: the user program, which runs alongside the garbage collection system and changes which nodes are reachable.
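The definitions of root, live node, and garbage can be made concrete with a small sketch. It assumes a toy heap modelled as a dictionary from node names to lists of children; the names `heap`, `roots`, `free`, and `live_nodes` are illustrative and do not come from the slides.

```python
def live_nodes(heap, roots):
    """Return the set of nodes reachable from the roots."""
    live = set()
    pending = list(roots)
    while pending:
        node = pending.pop()
        if node in live:
            continue
        live.add(node)
        pending.extend(heap[node])   # children of a live node are live
    return live

# Toy heap: each node maps to the list of nodes it points to.
heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "e": []}
roots = ["a"]      # e.g. a pointer held in a register or on the stack
free = {"e"}       # assumption: "e" is already on the free list

live = live_nodes(heap, roots)
garbage = set(heap) - live - free   # not live, but not free either
print(sorted(live), sorted(garbage))
```

Here `c` and `d` point to each other, yet neither is reachable from a root, so both count as garbage.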
Three kinds of memory
• Fixed address memory
• Stack memory
• Heap memory
Fixed address memory
• Executable code
• Global variables
• Constant structures that don’t fit inside a machine instruction (constant arrays, strings, floating-point numbers, long integers, etc.).
• Static variables.
• Subroutine local variables in non-recursive languages (e.g. early FORTRAN).
Stack memory
• Local variables for functions, whose size can be
determined at call time.
• Information saved at function call and restored at
function return:
– Values of callee arguments
– Register values:
• Return address (value of PC)
• Frame pointer (value of FP)
• Other registers
– Static link (to be discussed)
Heap memory
• Structures whose size varies dynamically
(e.g. variable length arrays or strings).
• Structures that are allocated dynamically
(e.g. records in a linked list).
• Structures created by a function call that
must survive after the call returns.
Issues:
• Allocation and free space management
• Deallocation / garbage collection
Stack of Activation Records
main() { A(); }

int A()
{ int I; … B(); … }

int B()
{ int J; … C(); A(); … }

int C()
{ int K; … B(); … }

main calls A calls B calls C calls B calls A.
Local variable
The address of a local variable in the active
function is a known offset from the FP.
Calling protocol: A calls B(X,Y)
• A pushes values of X, Y onto stack
• A pushes values of registers (including FP)
onto stack
• PUSHJ B --- Machine instruction pushes
PC++ (address of next instruction in A)
onto stack, jumps to starting address in B.
• FP = SP – 4
• SP += size of B’s local variables
• B begins execution
Function return protocol
• B stores value to be returned in a register.
• SP -= size of B’s local variables
(deallocate local variables)
• POPJ (PC = pop stack --- next address in A)
• Pop values from stack to registers (including FP)
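The call and return protocols above can be traced with a small simulation. This is a sketch under several assumptions: the stack is word-addressed and grows upward, SP is the index of the next free slot (so "FP = SP – 4" becomes `sp - 1` in word units), and FP is the only saved register. The class `Machine` and its method names are hypothetical.

```python
class Machine:
    def __init__(self):
        self.stack = [0] * 64   # word-addressed stack that grows upward
        self.sp = 0             # index of the next free slot
        self.fp = 0             # frame pointer of the active function
        self.rv = None          # register that holds a return value

    def push(self, value):
        self.stack[self.sp] = value
        self.sp += 1

    def pop(self):
        self.sp -= 1
        return self.stack[self.sp]

    def call(self, args, return_address, n_locals):
        for a in args:               # caller pushes the argument values
            self.push(a)
        self.push(self.fp)           # caller saves registers (only FP here)
        self.push(return_address)    # PUSHJ: push address of next instruction
        self.fp = self.sp - 1        # FP points at the return-address slot
        self.sp += n_locals          # reserve space for the callee's locals

    def local_address(self, k):
        return self.fp + 1 + k       # local k sits at a known offset from FP

    def ret(self, value, n_locals):
        self.rv = value              # result goes in a register
        self.sp -= n_locals          # deallocate local variables
        return_address = self.pop()  # POPJ: pop the return address
        self.fp = self.pop()         # restore saved registers (only FP here)
        return return_address

m = Machine()
m.call(args=[10, 20], return_address=100, n_locals=2)   # A calls B(X, Y)
m.stack[m.local_address(0)] = 7                         # B writes a local
b_fp = m.fp
m.call(args=[], return_address=200, n_locals=1)         # B calls C
pc = m.ret(value=42, n_locals=1)                        # C returns to B
```

After the return, FP again identifies B's frame, so B's local is found at the same offset as before. The slides do not say who removes the pushed arguments, so this sketch leaves them on the stack.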
Semi-dynamic arrays
In Ada and some other languages, one can
have an array local to a procedure whose
size is determined when the procedure is
entered. (Scott p. 353-355)

procedure foo(N : in Integer) is
   M1, M2 : array (1 .. N) of Integer;
Resolving reference
To resolve a reference to M2[I]:
The pointer to M2 is stored at a known offset from the FP.
Address of M2[I] == value of pointer + I, with I measured from the array's lower bound and scaled by the element size.
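One possible frame layout for `foo` is sketched below. It assumes one word per integer, a frame whose first two slots hold the pointers to M1 and M2, and array storage placed directly after those slots; the helper names `enter_foo` and `address_of_m2` are invented for illustration.

```python
def enter_foo(stack, sp, n):
    """Lay out a frame for foo(N): two pointer slots, then M1 and M2."""
    fp = sp
    m1_slot, m2_slot = fp + 0, fp + 1   # fixed offsets, known at compile time
    sp = fp + 2
    stack[m1_slot] = sp                 # M1 occupies the next n words
    sp += n
    stack[m2_slot] = sp                 # M2 occupies the n words after M1
    sp += n
    return fp, sp

def address_of_m2(stack, fp, i):
    pointer = stack[fp + 1]             # load the pointer at its known offset
    return pointer + (i - 1)            # array is declared 1 .. N

stack = [0] * 32
fp, sp = enter_foo(stack, sp=4, n=5)
first = address_of_m2(stack, fp, 1)
last = address_of_m2(stack, fp, 5)
```

Only the pointer slots have offsets fixed at compile time; where M2 starts depends on N, which is why the extra indirection is needed.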
Dynamic memory
Resolving reference:
If a local variable is a dynamic entity, then
the actual entity is allocated from the
heap, and a pointer to the entity is stored
in the activation record on the stack.
Dynamic memory allocation
• Free list: List of free blocks
• Allocation algorithm: On request for
allocation, choose a free block.
• Fragmentation: Disconnected blocks of
memory, all too small to satisfy the
request.
Fixed size dynamic allocation
In LISP (at least old versions), all dynamic allocations were 2-word records (cons cells).

In that case, things are easy:

Free list: linked list of records.

Allocation: pop the first record off the free list.
No fragmentation.
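A minimal sketch of such a fixed-size allocator follows. It assumes cells are stored in a Python list and that a free cell reuses its second word as the link to the next free cell; `CellHeap`, `allocate`, and `release` are hypothetical names.

```python
class CellHeap:
    """Heap of fixed-size two-word cells threaded onto a free list."""

    def __init__(self, n_cells):
        # A free cell keeps the index of the next free cell in its second word.
        self.cells = [[None, i + 1] for i in range(n_cells)]
        self.cells[-1][1] = None          # end of the free list
        self.free_head = 0

    def allocate(self, first, second):
        if self.free_head is None:
            raise MemoryError("free list is empty")
        index = self.free_head
        self.free_head = self.cells[index][1]   # pop the first record
        self.cells[index] = [first, second]
        return index

    def release(self, index):
        self.cells[index] = [None, self.free_head]   # push onto the free list
        self.free_head = index

heap = CellHeap(3)
a = heap.allocate(1, 2)
b = heap.allocate(3, 4)
heap.release(a)
c = heap.allocate(5, 6)    # reuses the cell that `a` occupied
```

Because every request has the same size, any free cell satisfies any request, which is why no fragmentation arises.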
Variable sized dynamic allocation
Free list: linked list of contiguous blocks of free space, each labelled with its size.
Allocation algorithms:
First fit: Go through the free list and take the first block that is large enough.
Best fit: Go through the free list and take the smallest block that is large enough.
Best fit requires more search and sometimes leads to more fragmentation, because the leftover pieces tend to be too small to reuse.
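The two policies can be compared side by side in a short sketch. It assumes the free list is a Python list of `(start, size)` pairs; the function names are illustrative.

```python
def first_fit(free_list, size):
    """Index of the first block that is large enough, or None."""
    for i, (_, block_size) in enumerate(free_list):
        if block_size >= size:
            return i
    return None

def best_fit(free_list, size):
    """Index of the smallest block that is large enough, or None."""
    best = None
    for i, (_, block_size) in enumerate(free_list):
        if block_size >= size and (best is None or block_size < free_list[best][1]):
            best = i
    return best

def allocate(free_list, size, choose):
    """Carve `size` units out of the block picked by `choose`."""
    i = choose(free_list, size)
    if i is None:
        return None                      # no single block is big enough
    start, block_size = free_list[i]
    if block_size == size:
        del free_list[i]                 # block is used up exactly
    else:
        free_list[i] = (start + size, block_size - size)   # keep the remainder
    return start

ff = [(0, 50), (100, 12), (200, 30)]
bf = [(0, 50), (100, 12), (200, 30)]
ff_start = allocate(ff, 10, first_fit)   # takes from the 50-unit block
bf_start = allocate(bf, 10, best_fit)    # takes from the 12-unit block
too_big = allocate(bf, 60, first_fit)    # 82 units are free, but in pieces
```

Best fit leaves a 2-unit remainder that few later requests can use, and the final request fails even though the total free space exceeds it, which is the fragmentation described earlier.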
Multiple free lists
• Keep several free lists. The blocks on a single free list all have the same size. Different free lists have blocks of different sizes.
– powers of two: blocks of sizes 1, 2, 4, 8, 16 …
– Fibonacci numbers: blocks of sizes 1, 2, 3, 5, 8 …
• On a request for a structure of size N, find the next standard size >= N and allocate the first block on that list.
• Rapid allocation, but lots of internal fragmentation.
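A sketch of the power-of-two variant is given below. It assumes each free list is a Python list of block addresses keyed by block size, and it simply fails when the matching list is empty; `size_class` and `SegregatedAllocator` are hypothetical names.

```python
def size_class(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    size = 1
    while size < n:
        size *= 2
    return size

class SegregatedAllocator:
    def __init__(self, free_lists):
        # free_lists maps a block size to a list of free block addresses.
        self.free_lists = free_lists

    def allocate(self, n):
        size = size_class(n)
        blocks = self.free_lists.get(size)
        if not blocks:
            return None              # this sketch does not split larger blocks
        address = blocks.pop(0)      # take the first block on that list
        wasted = size - n            # internal fragmentation for this request
        return address, size, wasted

alloc = SegregatedAllocator({8: [0, 8], 16: [64]})
r1 = alloc.allocate(5)     # rounded up to 8, wasting 3 units
r2 = alloc.allocate(9)     # rounded up to 16, wasting 7 units
r3 = alloc.allocate(16)    # the 16-unit list is now empty
```

Allocation needs no search, at the price of the unused space inside each rounded-up block.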
Deallocation
• Explicit deallocation (e.g. C, C++).
Very error prone. If structure S is
allocated, then deallocated, then the
space is used for structure T, then the
program accesses the space as S,
the resulting error can be catastrophic
and very hard to debug.
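The S-then-T scenario can be imitated safely in a simulation. The sketch assumes a pool of slots in which a "pointer" is just a slot index; `Pool`, `malloc`, and `free_slot` are invented names and not a real allocator API.

```python
class Pool:
    def __init__(self, n_slots):
        self.slots = [None] * n_slots
        self.free = list(range(n_slots))   # free list of slot indices

    def malloc(self, value):
        slot = self.free.pop()             # reuse the most recently freed slot
        self.slots[slot] = value
        return slot                        # the "pointer" is a slot index

    def free_slot(self, slot):
        self.free.append(slot)             # nothing invalidates old copies of `slot`

pool = Pool(4)
s = pool.malloc({"kind": "S", "balance": 100})
pool.free_slot(s)                                # S is deallocated ...
t = pool.malloc({"kind": "T", "name": "temp"})   # ... and its space reused for T
stale = pool.slots[s]                            # program still uses the space as S
```

The stale pointer `s` now silently yields T's data. In C or C++ the same mistake is undefined behavior, so it may produce no immediate error at all.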
Garbage collection
Structures are deallocated when the runtime system determines that the program can no longer access them.
• Reference counts
• Mark and sweep
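Precise garbage collection techniques require disciplined creation of pointers and unambiguous typing (e.g. not as in C: “p = &a + 40;”). Conservative collectors relax this by treating any word that could be a heap address as a pointer.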
Why Garbage Collect?
• Language requirements
– In some situations it may be impossible to
know when a shared data structure is no
longer in use.

Why Garbage Collect?
(Continued)

• Software Engineering
– Garbage collection increases abstraction
level of software development.
– It simplifies interfaces and decreases coupling between modules.
– Studies have shown a significant amount
of development time is spent on memory
management bugs [Rovner, 1985].
Comparing Garbage Collection Algorithms
• Directly comparing garbage collection algorithms
is difficult – there are many factors to consider.
• Some factors to consider:
– Cost of reclaiming cells
– Cost of allocating cells
– Storage overhead
– How does the algorithm scale with residency?
– Will the user program be suspended during garbage collection?
– Does an upper bound exist on the pause time?
– Is locality of data structures maintained (or maybe even improved)?
Classes of Garbage Collection Algorithms
• Direct Garbage Collectors: a record is
associated with each node in the heap. The
record for node N indicates how many other
nodes or roots point to N.
• Indirect/Tracing Garbage Collectors: usually
invoked when a user’s request for memory fails
because the free list is exhausted. The garbage
collector visits all live nodes, and returns all
other memory to the free list. If sufficient
memory has been recovered from this process,
the user’s request for memory is satisfied.
Reference count
With each dynamic structure there is a record of
how many pointers exist to that structure.
Deallocate when the reference count falls to 0.
When a structure is deallocated, decrement the counts of the structures it points to, and iterate.
Advantage: Happens incrementally. Low cost per operation.
Disadvantage: Doesn’t work with circular structures.
Can use:
– with structures that can’t contain pointers (e.g.
dynamic strings)
– In languages that can’t create circular structures (e.g.
restricted forms of LISP)
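Both the cascading deallocation and the failure on circular structures show up in a small simulation. It assumes counts are maintained by explicit helper calls rather than by the language runtime; `Node`, `add_pointer`, `drop_pointer`, `link`, and `freed` are illustrative names.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.count = 0          # number of pointers to this node
        self.children = []

freed = []                      # names of deallocated nodes, in order

def add_pointer(target):
    target.count += 1

def drop_pointer(target):
    target.count -= 1
    if target.count == 0:
        freed.append(target.name)        # deallocate ...
        for child in target.children:    # ... and release what it pointed to
            drop_pointer(child)

def link(parent, child):
    parent.children.append(child)
    add_pointer(child)

# Acyclic case: dropping the root pointer frees a, then b.
a, b = Node("a"), Node("b")
add_pointer(a)                  # pointer from a root
link(a, b)
drop_pointer(a)
freed_after_chain = list(freed)

# Circular case: x and y keep each other's counts above zero.
x, y = Node("x"), Node("y")
add_pointer(x)                  # pointer from a root
link(x, y)
link(y, x)
drop_pointer(x)                 # root pointer removed; the cycle is now garbage
```

After the last call, neither `x` nor `y` is reachable, yet each still has a count of 1, so neither is ever freed.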
Mark and sweep
Any accessible structure is accessible via some expression in terms of variables on the stack (or in the symbol table, etc.; in any case, from some known entity).
Therefore:
• Unmark every structure in the heap.
• Follow every pointer on the stack to a structure in the heap and mark it. Follow the pointers in each marked structure and mark their targets. Iterate until nothing new is marked.
• Go through the heap and deallocate any unmarked structure.
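The three steps can be sketched as follows, assuming the heap is a list of objects that each carry a mark bit and a list of child pointers. `Obj` and `mark_and_sweep` are hypothetical names, and "deallocation" is modelled by returning the names of the reclaimed objects.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.marked = False

def mark_and_sweep(heap, stack_roots):
    for obj in heap:                     # 1. unmark every structure
        obj.marked = False
    worklist = list(stack_roots)         # 2. start from pointers on the stack
    while worklist:
        obj = worklist.pop()
        if not obj.marked:
            obj.marked = True
            worklist.extend(obj.children)    # follow its pointers, iterate
    survivors = [o for o in heap if o.marked]            # 3. sweep the heap
    reclaimed = [o.name for o in heap if not o.marked]
    return survivors, reclaimed

a, b, x, y = Obj("a"), Obj("b"), Obj("x"), Obj("y")
a.children = [b]
x.children = [y]
y.children = [x]                         # unreachable circular structure
heap = [a, b, x, y]
survivors, reclaimed = mark_and_sweep(heap, stack_roots=[a])
```

Unlike reference counting, the circular pair `x` and `y` is reclaimed, because marking depends only on reachability from the roots.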
Problem with Garbage Collection
(other than reference count)
Large CPU overhead.
Generally, program execution has to halt during GC. This is annoying for interactive programs and dangerous for real-time, safety-critical programs (e.g. a program to detect and respond to meltdown in a nuclear reactor).
Hybrid approach
Use reference counts to deallocate.

When memory runs out, use mark and sweep (or other GC) to collect inaccessible circular structures.
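CPython is one real system built along these lines: objects are freed by reference counting, and a separate cycle collector, exposed through the standard `gc` module, reclaims unreachable circular structures. The sketch below assumes CPython specifically. Note also that CPython's collector is triggered by allocation thresholds rather than by memory running out, and it is not a textbook mark and sweep. The class `Node` is illustrative.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

gc.disable()                 # leave only reference counting active for now

x, y = Node(), Node()
x.other, y.other = y, x      # circular structure
probe = weakref.ref(x)       # observes x without keeping it alive

del x, y                     # each count is still 1, so nothing is freed
alive_after_del = probe() is not None

gc.collect()                 # the cycle collector finds the unreachable pair
alive_after_collect = probe() is not None

gc.enable()
```

With the cycle collector disabled, the pair survives the `del`; the explicit `gc.collect()` call then reclaims it.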
