COMS W4115
Programming Languages and Translators
Lecture 20: Run-time Memory Management
April 8, 2015
Lecture Outline
- Run-time memory and its management
- Procedure activations and activation trees
- Activation records
- Calling sequences
- Allocating space for arrays
- Heap memory management
1. Run-time Memory and its Management
- The execution of the target program is initially under the control of the operating system.
- When a program is invoked, the OS allocates space to the program, loads the target code into the
code section, and transfers control to the entry point ("main") of the target program.
- The compiler is responsible for generating the target code and orchestrating the use of the run-time
memory.
- Structure of run-time memory:
Code
Static data
Heap
free memory
Stack
The size of the generated target code is usually known at compile time and
can be placed into the statically determined area Code
.
Globals and other static items like constants are stored in the static data area.
The heap is used to store objects whose lifetimes exceed the lifetimes of the
procedures that created them.
The run-time stack is used to manage the data associated with procedure activations.
A pointer called the stack pointer (sp) points to the top of the run-time stack.
2. Procedure Activations and Activation Trees
- The compiler should generate target code that is correct and efficient.
- We assume that programs do not have concurrency and that when a procedure gets called,
control returns to the program point immediately after the call.
We also assume that when procedure p calls procedure q, then q returns before p returns.
- These assumptions imply that the lifetimes of procedure activations are nested,
and that the order of procedure activations can be drawn as a tree, called an
activation tree.
- Example 1: Consider the following C-like program:
int f(void) {
return;
}
int g(void) {
f();
}
int main(void) {
f();
g();
}
The activation tree for this program is
main
/ \
f g
|
f
Example 2: Consider the following C-like program:
int f(int x) {
if (x == 0)
return g();
else
return f(x-1);
}
int g(void) {
return 0;
}
int main(void) {
f(3);
}
The activation tree for this program is
main
|
f(3)
|
f(2)
|
f(1)
|
f(0)
|
g
Note that an activation tree depends on the run-time behavior
and may have different shapes depending on the program input.
A stack can be used to keep track of those procedures that are active at a given time
during the execution of the program.
3. Activation Records
- An activation record (AR), sometimes called a frame, is a data structure
that contains information needed to manage a procedure activation.
- If procedure p calls q, then q's activation record contains a mix of information about
p and q. The execution of p is suspended until q completes at which time p resumes
running from the program point in p immediately after the call to q.
The activation record for q has information to complete the execution of q and
where to resume the execution of p.
- The structure of an activation record is dependent on the operating system, the source
language, and the execution environment.
Let's assume procedure p calls procedure q. Here is a simple example of the fields that might
be found in q's activation record:
result
argument
control link
return address
- The result field is used to store the value computed by this activation of q.
- The argument field is used to store the value of the actual parameter
passed to q.
- The control link is a pointer to the AR of the calling procedure p.
- The return address is the location in the code section of the run-time memory
where execution should resume in p after the call to q.
We will assume there is a pointer called the frame pointer (fp) that points
to a known location in an activation record.
4. Calling Sequences
- Procedure calls are implemented by calling sequences, code that allocates
an activation record on the run-time stack and enters information into its
fields.
- A return sequence is code invoked after the call to restore the state
of the machine so the calling procedure can continue its execution after
the call.
- The code in a calling sequence is usually divided between the calling
procedure (the "caller") and the procedure it calls (the "callee").
- When a procedure p calls a procedure q, we might do something like the following:
- Evaluate the parameters of q and store them in a new AR for q.
- Store the frame pointer (fp) for p as the control link in the AR for q.
- Update fp to point to the AR of q.
- Store the return address in the AR for q.
- Jump to the code for q.
- Have q allocate and initialize its local data and temporaries on the stack.
- When q exits:
- Copy the fp of the AR for q into sp (the top of stack pointer).
- Load the control link of the AR for q into the fp.
- Jump to the return address.
- Change the stack pointer to pop the parameters of q off the run-time stack.
5. Allocating Space for Arrays
- The compiler may need to allocate space for array parameters on the run-time
stack. Depending on the situation at hand, the size of the array may be fixed or variable.
- Fixed-size arrays
void f(int x) {
int a;
int b[4];
int c;
...
}
Space for the array b
can be allocated on the runtime stack
between the space for a
and c
.
Variable-size arrays
void f(int x) {
int a;
int b[n];
int c;
...
}
A pointer to the space for b
can be allocated on the runtime stack
between the space for a
and c
so variables remain
a constant offset from the frame pointer.
The actual space for b
can be allocated after c
.
6. Heap Memory Management
- The runtime heap is used for data objects whose lifetimes may exist long after
the procedure that created them.
- The heap memory manager is the subsystem that allocates and deallocates space
within the heap.
- Garbage collection is the process of finding memory within the heap that is no
longer used by the program and can therefore be reallocated to house other
data objects. In some programming languages like C allocation and deallocation
needs to be done manually using library functions like
malloc
and free
. In other languages like Java it is the built-in garbage collector
that deallocates memory.
- There are many different garbage collection algorithms that vary in performance
metrics such as execution time, space usage, pause time, and program locality.
- Mark-and-sweep garbage collection
- A basic mark-and-sweep (M&S) garbage collection algorithm is straightforward to
implement and works well for languages like C because it does not move
data objects during garbage collection.
- A M&S collector firsts visits and marks all reachable data objects in the heap,
setting the reached-bit of each object to 1.
- It then sweeps the entire heap, freeing up unreachable objects.
- See Algorithm 7.12 (ALSU, p. 471) for details.
7. Practice Problem
- Consider the following program written in a hypothetical
statically scoped
language that allows nested functions. In this program,
main calls f which calls g2 which calls h
which calls g1.
function main() {
int i;
function f() {
int a, b, c;
function g1() {
int a, d;
a = b + c; // point 1
}; // end of g1
function g2(int x) {
int b, e;
function h() {
int c, e;
g1();
e = b + a; // point 2
}; // end of h
h();
a = d + e; // point 3
}; // end of g2
g2(1);
}; //end of f
// execution of main begins here
f();
}; // end of main
- Suppose we have activation records with the following fields:
- Parameters
- Control Link
- Access Link
- Return Address
- Local data
- If function p is nested immediately within function q, then the
access link in any AR for p points to the most recent AR for q.
- Show the activation records on the run-time stack
when execution first arrives at point 1 in the program above.
- To which declarations are the references to variables a, b, c at position 1?
- To which declarations are the references to variables a, b, e at position 2?
- To which declarations are the references to variables a, d, e at position 3?
8. Reading
- ALSU, Sections 7.1, 7.2, 7.6.1
aho@cs.columbia.edu