Basic Garbage Collection

Garbage Collection (GC) is the automatic reclamation of heap records that will never again be accessed by the program.

GC is universally used for languages with closures and complex data structures that are implicitly heap-allocated.

GC may be useful for any language that supports heap allocation, because it obviates the need for explicit deallocation, which is tedious, error-prone, and often non-modular.

GC technology is increasingly interesting for "conventional" language implementation, especially as users discover that free isn't free: explicit memory management can be costly too.

We view GC as part of an allocation service provided by the runtime environment to the user program, usually called the mutator. When the mutator needs heap space, it calls an allocation routine, which in turn performs garbage collection activities if needed.

Simple Heap Model

For simplicity, consider a heap containing "cons" cells.

The heap consists of two-word cells, and each element of a cell is a pointer to another cell. (We'll deal with distinguishing pointers from non-pointers later.)

There may also be pointers into the heap from the stack and global variables; these constitute the root set.

At any given moment, the system's live data are the heap cells that can be reached by some series of pointer traversals starting from a member of the root set.

Garbage is the heap memory containing non-live cells. (Note that this is a slightly conservative definition: a cell that is still reachable may nevertheless never be accessed again.)

Reference Counting

The most straightforward way to recognize garbage and make its space reusable for new cells is to use reference counts.

We augment each heap cell with a count field that records the total number of pointers in the system that point to the cell. Each time we create or copy a pointer to the cell, we increment the count; each time we destroy a pointer, we decrement the count.

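If the reference count ever goes to 0, we can reuse the cell by placing it on a free list. Freeing a cell destroys the pointers it contains, so the counts of the cells it points to must be decremented in turn.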

When allocating a new cell, we first try the free list (before extending the heap).

Pros:
Conceptually simple;
Immediate reclamation of storage

Cons:
Extra space;
Extra time (every pointer assignment has to change/check count);
Can't collect "cyclic garbage."
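
A minimal sketch of the scheme, assuming cells are modeled as Python objects rather than raw memory. The names Cell, Heap, incref, decref and set_field are hypothetical, chosen for illustration only. It shows the free list being reused and a two-cell cycle that is never reclaimed.

```python
class Cell:
    """A two-word cons cell augmented with a count field (hypothetical layout)."""
    def __init__(self):
        self.car = None
        self.cdr = None
        self.count = 0


class Heap:
    def __init__(self):
        self.free_list = []   # cells whose count has dropped to 0
        self.allocated = 0    # how many times the heap had to be extended

    def alloc(self):
        # Try the free list first, before extending the heap.
        if self.free_list:
            cell = self.free_list.pop()
        else:
            cell = Cell()
            self.allocated += 1
        cell.car = cell.cdr = None
        cell.count = 0
        return cell

    def incref(self, cell):
        if cell is not None:
            cell.count += 1

    def decref(self, cell):
        if cell is None:
            return
        cell.count -= 1
        if cell.count == 0:
            # Freeing a cell destroys the pointers it holds.
            self.decref(cell.car)
            self.decref(cell.cdr)
            cell.car = cell.cdr = None
            self.free_list.append(cell)

    def set_field(self, cell, field, target):
        """Pointer assignment: count the new target before dropping the old one."""
        old = getattr(cell, field)
        self.incref(target)
        setattr(cell, field, target)
        self.decref(old)


heap = Heap()

# Acyclic case: root -> a -> b. Dropping the root frees both cells.
a = heap.alloc()
heap.incref(a)                  # a root pointer to a
b = heap.alloc()
heap.set_field(a, "car", b)
heap.decref(a)                  # destroy the root pointer
acyclic_freed = len(heap.free_list)

# Cyclic case: x <-> y. Dropping the root leaves both counts at 1.
x = heap.alloc()                # reuses a cell from the free list
heap.incref(x)
y = heap.alloc()
heap.set_field(x, "car", y)
heap.set_field(y, "car", x)
heap.decref(x)                  # x and y are now unreachable, but never freed
```

In the cyclic case each cell keeps the other's count above zero, so neither ever reaches the free list even though no root can reach them.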

Mark and Sweep

There's no real need to remove garbage as long as unused memory is available, so GC is typically deferred until the allocator fails due to lack of memory. The collector then takes control of the processor, performs a collection (hopefully freeing enough memory to satisfy the allocation request), and returns control to the mutator. This approach is known generically as "stop and collect."

There are several options for the collection algorithm. Perhaps the simplest is called mark and sweep, which operates in two phases:

• First, mark each live data cell by tracing all pointers starting with the root set.

• Then, sweep all unmarked cells onto the free list (also unmarking the marked cells).

The mark phase typically traverses the live data graph in depth-first order, and potentially uses lots of stack! A standard trick called pointer reversal can be used to avoid needing extra space during the traversal.
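
The two phases can be sketched as follows. This is an illustrative reconstruction, not the original listing: the heap is assumed to be a Python list of all cells, the mark bit is a field on each cell, and the names Cell, mark, sweep and collect are hypothetical. The recursive mark uses the Python call stack, which is exactly the stack-space problem noted above.

```python
class Cell:
    """A two-word cons cell with a mark bit (hypothetical layout)."""
    def __init__(self):
        self.car = None
        self.cdr = None
        self.marked = False


def mark(cell):
    """Depth-first trace from one pointer, marking every reachable cell."""
    if cell is None or cell.marked:
        return
    cell.marked = True
    mark(cell.car)
    mark(cell.cdr)


def sweep(heap):
    """Rebuild the free list from unmarked cells; unmark the marked ones."""
    free_list = []
    for cell in heap:
        if cell.marked:
            cell.marked = False
        else:
            cell.car = cell.cdr = None
            free_list.append(cell)
    return free_list


def collect(roots, heap):
    for r in roots:
        mark(r)
    return sweep(heap)


# a -> b is reachable from the root set; c <-> d is cyclic garbage.
a, b, c, d = Cell(), Cell(), Cell(), Cell()
a.car = b
c.car = d
d.car = c
heap = [a, b, c, d]
free_list = collect([a], heap)
```

Unlike reference counting, the cycle c <-> d is reclaimed, because liveness is decided by reachability from the roots rather than by counts.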

Copying Collection

Mark and sweep has several problems:

• It does work proportional to the size of the entire heap.

• It leaves memory fragmented.

• It doesn't cope well with non-uniform cell sizes.

An alternative that solves these problems is copying collection. The idea is to divide the available heap into two semi-spaces. Initially, the allocator uses just one space; when it fills up, the collector copies the live data (only) into the other space, and reverses the roles of the spaces.

Copying collection must fix up all pointers to copied data. To do this, it leaves a forwarding pointer in the "from" space after the copy is made.

A copying collector typically traverses the live data graph breadth first, using "to" space itself as the search "queue."

Copying compacts live data, which improves locality and may be good for virtual memory and caches.

Copying Collection Details
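
A sketch of a breadth-first copying collector in the style usually attributed to Cheney, under simplifying assumptions: each semi-space is a Python list, a pointer is an index into the current space (or None), and a cell is a two-element list. The names Forward, forward and collect are hypothetical, and this is not the original listing.

```python
class Forward:
    """Forwarding pointer left in from-space once a cell has been copied."""
    def __init__(self, new_addr):
        self.new_addr = new_addr


def forward(addr, from_space, to_space):
    """Return the to-space address for from-space address addr, copying on first visit."""
    if addr is None:
        return None
    cell = from_space[addr]
    if isinstance(cell, Forward):
        return cell.new_addr           # already copied: follow the forwarding pointer
    new_addr = len(to_space)
    to_space.append(list(cell))        # copy both words verbatim
    from_space[addr] = Forward(new_addr)
    return new_addr


def collect(roots, from_space):
    to_space = []
    new_roots = [forward(r, from_space, to_space) for r in roots]
    scan = 0
    # Cells between scan and the end of to_space form the breadth-first queue.
    while scan < len(to_space):
        cell = to_space[scan]
        cell[0] = forward(cell[0], from_space, to_space)
        cell[1] = forward(cell[1], from_space, to_space)
        scan += 1
    return new_roots, to_space


from_space = [
    [None, None],   # 0: garbage
    [3, 2],         # 1: pointed to by the root
    [1, None],      # 2: points back to 1 (a live cycle)
    [None, None],   # 3: live leaf
    [4, None],      # 4: garbage pointing to itself
]
new_roots, to_space = collect([1], from_space)
```

Only the three live cells are visited, and they end up contiguous at the bottom of to-space; the garbage cells at addresses 0 and 4 are never touched.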

Comparison

A copying collector does work proportional to the amount of live data. Asymptotically, this means it does less work than mark and sweep. Let

$A$ = amount of live data
$M$ = total memory size before a collection.

After the collection, there is $M - A$ space left for allocation before the next collection (only $M/2 - A$ for a copying collector, which can allocate in just one semi-space). We can calculate the amortized cost per allocated byte as follows:

$C_{M\&S} = \frac{c_1 A + c_2 M}{M - A}$ for some $c_1, c_2$

$C_{copy} = \frac{c_3 A}{M/2 - A}$ for some $c_3$

As $M \rightarrow \infty$, $C_{M\&S} \rightarrow c_2$, while $C_{copy} \rightarrow 0$.

Of course, real memories aren't infinite, so the values of $c_1, c_2, c_3$ matter, especially if a significant percentage of data are live at collection (since generally $c_3 > c_1$).
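
A small numeric illustration, assuming cost formulas of the usual textbook form: (c1*A + c2*M)/(M - A) for mark and sweep, and c3*A/(M/2 - A) for copying. The constants c1 = 10, c2 = 3, c3 = 12 are hypothetical values chosen only to show the trend, not measurements.

```python
def cost_mark_sweep(A, M, c1, c2):
    """Amortized cost per allocated byte: mark live data (c1*A), sweep the heap (c2*M)."""
    return (c1 * A + c2 * M) / (M - A)


def cost_copying(A, M, c3):
    """Amortized cost per allocated byte: copy live data (c3*A); only M/2 is usable."""
    return (c3 * A) / (M / 2 - A)


A = 100                       # live data, in arbitrary units
c1, c2, c3 = 10, 3, 12        # hypothetical per-unit costs

# Small memory (M = 4A): half of the usable semi-space is live data.
small_ms = cost_mark_sweep(A, 400, c1, c2)
small_copy = cost_copying(A, 400, c3)

# Very large memory: mark and sweep approaches c2, copying approaches 0.
large_ms = cost_mark_sweep(A, 10**9, c1, c2)
large_copy = cost_copying(A, 10**9, c3)
```

With these assumed constants, copying is the more expensive of the two when memory is tight and by far the cheaper when memory is plentiful.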

Further Issues

• Distinguishing pointers from integers.

• Handling records of variable size.

• Finding the root set.

• Avoiding repeated copying of permanently live data.

• Avoiding nasty pauses during collection.

These concerns lead to the study of three important varieties of collectors:

• Conservative collectors.

• Generational collectors.

• Incremental and concurrent collectors.

Conservative Collection

Standard GC algorithms rely on precise identification of pointers.

This is hard in "uncooperative" environments, i.e., when the mutator (and its compiler) are not aware that GC will be performed. This is the normal case for C/C++ programs.

(Hence this is an issue for portable Java implementations based on C, and for native functions.)

Basic problem: the mutator and collector can no longer communicate a root set.

Idea: for any scanning collector to be correct, it's essential that every pointer be found. But for non-moving collectors, it's OK to mistake a non-pointer for a pointer; the worst that happens is that some garbage doesn't get collected.

Conservative collectors scan the entire register set and stack of the mutator, and assume that anything that might be a pointer really is a pointer.

Issues in Conservative Collection

• Some bit patterns that are actually integers, reals, chars, etc. will be mistaken for pointers, so the "records" they "point" to will be treated as live data.

• Accidental pointer identifications can be greatly decreased by careful tests: e.g., the value must lie on a page known to be in the heap, at an appropriate alignment for objects on that page, and the data at the "pointed-to" location must look like a heap header.

• False identifications can be further reduced by not allocating on pages whose addresses correspond to data values known to be in use.
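
The tests listed above can be sketched as a filter over candidate words. This is a toy model under stated assumptions: addresses are plain integers, PAGE_SIZE and the per-page object size are hypothetical, and a set of addresses stands in for "looks like a heap header" (a real collector would inspect memory at that address instead).

```python
PAGE_SIZE = 4096  # hypothetical page size


class ConservativeHeap:
    def __init__(self):
        self.object_size = {}   # heap page number -> object size used on that page
        self.headers = set()    # addresses currently holding a valid object header

    def add_page(self, page_no, size):
        self.object_size[page_no] = size

    def allocate_at(self, addr):
        self.headers.add(addr)

    def might_be_pointer(self, word):
        page_no, offset = divmod(word, PAGE_SIZE)
        size = self.object_size.get(page_no)
        if size is None:
            return False            # not on a page known to be in the heap
        if offset % size != 0:
            return False            # wrong alignment for objects on that page
        return word in self.headers # pointed-to data must look like a header


def scan(heap, words):
    """Treat every word that passes the tests as if it really were a pointer."""
    return [w for w in words if heap.might_be_pointer(w)]


heap = ConservativeHeap()
heap.add_page(16, 8)                # page 16 holds 8-byte objects
obj = 16 * PAGE_SIZE + 24
heap.allocate_at(obj)

# Candidate words from the stack: a real pointer, a misaligned value,
# a small integer, and an aligned address with no object allocated there.
stack_words = [obj, obj + 1, 42, obj + 8]
found = scan(heap, stack_words)
```

An integer that happens to equal obj would pass all three tests too; that is the residual false identification a conservative collector has to tolerate.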

Major problems:

• Collector must be able to find registers and stack frames.

• Pointers must not be kept in "hidden" form by mutator code.

Object Lifetimes

Major problem with tracing GC: long-lived data get traced (scanned and/or copied) repeatedly, without producing free space.

(Weak) Generational Hypothesis: "Most data die young."

I.e., most records become garbage a short time after they are allocated.

If the "age" of an object O is equated with the amount of heap allocated since O was allocated, this says that most records become garbage after a small number of other records have been allocated.

Moreover, the longer an object stays live, the more likely it is to remain live in the future.

These are empirical properties of many (not necessarily all) languages/programs.

Implication: if you're looking for garbage, it's more likely to be found in recently-allocated data, e.g., in data allocated since the last garbage collection.

Generational Collection

Idea: Segregate data by age into generations.

• Arrange that the younger generations can be collected independently of the older ones.

• When space is needed, collect the youngest generation first.

• Only collect older generation(s) if space is still needed.

• Should make GC more efficient overall, since less total tracing is performed.

• Should shorten pause times (at least for young generation GCs).

Some variant of generational collection is almost universally used in serious implementations of heavily-allocating languages (LISP, functional languages, Smalltalk, ...).

Most generational systems are copying collectors, although mark and sweep variants are possible.

In a generational copying collector, data in generation n that are still live after a certain number of GCs (the promotion threshold) are copied into generation n+1 (possibly triggering a collection there).

Key problem: finding all the roots that point into generation n without scanning higher generations.

Example

Assume 2 generations, promotion threshold = 1. Initial memory configuration after allocation of R:

Suppose a GC is now needed:

Note that S is now tenured (uncollected garbage).

Example (continued)

Now we allocate a new cell T pointed to by R, fill T with pointers to A and B, and zero the root set pointers to A and B.

If a further GC is needed, we must follow the inter-generational pointer from R to T.

Design Issues

Tracking pointers from older generations to younger ones.

• This is the primary added cost of a generational system.

• Hope there are not too many!

• Maintain a remembered set of updated memory chunks ("cards"), where chunk size can range from a single address to an entire page.

• Different tradeoffs in mutator overhead vs. scan time.
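
A sketch of a remembered set maintained by a write barrier, for two generations with promotion threshold 1. Several simplifications are assumptions of this sketch: the set records whole objects rather than cards, promotion relabels an object instead of copying it, and the names Obj, GenHeap, write and minor_collect are hypothetical.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.gen = 0            # 0 = young, 1 = old
        self.fields = []


class GenHeap:
    def __init__(self):
        self.young = []          # all generation-0 objects
        self.remembered = set()  # old objects that may point into generation 0

    def alloc(self, name):
        obj = Obj(name)
        self.young.append(obj)
        return obj

    def write(self, src, target):
        """Write barrier: store the pointer and record old-to-young stores."""
        src.fields.append(target)
        if src.gen > target.gen:
            self.remembered.add(src)

    def minor_collect(self, roots):
        """Collect generation 0 only, without scanning the old generation."""
        live = set()
        stack = [r for r in roots if r.gen == 0]
        for old in self.remembered:          # extra roots from the remembered set
            stack.extend(f for f in old.fields if f.gen == 0)
        while stack:
            obj = stack.pop()
            if obj in live:
                continue
            live.add(obj)
            stack.extend(f for f in obj.fields if f.gen == 0)
        freed = [o.name for o in self.young if o not in live]
        for obj in live:
            obj.gen = 1                      # threshold = 1: survivors are promoted
        self.young = []
        self.remembered.clear()              # no old-to-young pointers remain
        return freed


heap = GenHeap()
r = heap.alloc("R")
heap.minor_collect([r])      # R survives one GC and is promoted

t = heap.alloc("T")
heap.write(r, t)             # old R now points to young T
s = heap.alloc("S")          # never referenced
freed = heap.minor_collect([r])
```

T is reachable only through the old object R, so it survives the second collection solely because the barrier recorded R in the remembered set.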

Promotion policy?

• Threshold = 1 gives simpler implementation, since no need to record object age, but promotes very young objects.

How many generations?

• Two-generation systems give simpler implementation, but multiple generations are useful if there is a spread of object lifetimes (especially if threshold = 1).

May want separate areas for large, pointer-free, or "immortal" objects.

Garbage Collection in Java

Sun's JVM uses a "mark and compact" collector:

• Compromise between M&S and copying collectors.

• Live data cells are marked.

• Then the heap is scanned and live data are slid down to a compact region at the bottom of the heap.

• Extra space costs for forwarding pointers; extra time costs for added traversals.

• Object pointers actually point to a handle which in turn points to the real object data. Handles are allocated in their own space, managed by M&S, and never moved. Object data records can be moved just by changing the handle's contents, without altering the object pointer.
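
A sketch of the handle indirection only, not of Sun's implementation. It assumes marking has already produced the set of live handles, models the data area as a Python list, and uses hypothetical names (HandleHeap, deref, compact). Handle reuse via a free list is omitted.

```python
class HandleHeap:
    def __init__(self):
        self.data = []      # object data area; records may be moved
        self.handles = []   # handle -> address in data (None once freed); never moved

    def alloc(self, value):
        self.data.append(value)
        self.handles.append(len(self.data) - 1)
        return len(self.handles) - 1        # the object pointer is the handle

    def deref(self, handle):
        return self.data[self.handles[handle]]

    def compact(self, live):
        """Slide live data down in address order; only handle contents change."""
        in_address_order = sorted(live, key=lambda h: self.handles[h])
        new_data = []
        for h in in_address_order:
            new_data.append(self.data[self.handles[h]])
            self.handles[h] = len(new_data) - 1
        for h in range(len(self.handles)):
            if h not in live:
                self.handles[h] = None      # dead object: handle is released
        self.data = new_data


heap = HandleHeap()
a = heap.alloc("a")
b = heap.alloc("b")
c = heap.alloc("c")
before = heap.handles[c]
heap.compact({a, c})        # b is garbage; c's data slides down
after = heap.handles[c]
```

The mutator's pointer c is unchanged by the move; only the handle's contents differ, which is the extra level of indirection described above.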

In general, copying collection works fine for Java even without handles, since it's easy for the interpreter to provide the root set and notice when it's been changed.

Garbage Collection and Native Code under Java

Interfacing to native code is problematic:

Java objects referenced by native code must not be collected by Java GC until native code is done with them.

Solutions:

• GC can implicitly register all arguments passed to native code (or returned to native code by callbacks into Java), and not collect them until native code finishes.
• References that need to live longer must be explicitly registered (and later unregistered) by native code programmer.

Java objects referenced directly by native code can't be moved.

Solutions:

• Use non-moving collector.
• Pass unmovable, indirect object references to native code.
• Pin objects passed to native code.
• Make pinned copies of objects passed to native code.

Native code data objects pointed to by Java objects that get GC'ed should be freed.

Solution:

• Use finalization routines.



Andrew P. Tolmach
Fri Feb 13 10:54:45 PST 1998