5CS4-CD-Unit-4_ppt @zammers

Uploaded by

MAYANK SAINI
Copyright
© © All Rights Reserved
Compiler Design

Unit IV

Storage organization
By Sourabh Banga
Asst. Prof
ACERC
Contents
• Runtime environment
• Storage organization
• Storage allocation strategies
• Activation records
• Accessing local and non local names
• Parameter passing
• Symbol table organization
• Data structures used in symbol table
Runtime Environment
• The compiler must implement various abstractions in the
source language definition such as
• Names used in a program
• Scope of variables
• Data types
• Operators
• Procedures
• Parameters and
• Flow of control constructs.

Definition: The compiler must cooperate with the operating system and other
system software to support the implementation of these abstractions on the
target machine. It does this by creating and managing a run-time
environment.
Runtime Environment
• A lot has to happen at run time to get your program
running.
• At run time, we need a system to map NAMES (in the
source program) to STORAGE on the machine.
• Allocation and deallocation of memory is handled by a
RUNTIME SUPPORT SYSTEM typically linked and
loaded along with the compiled target code.
• One of the primary responsibilities of the run-time
system is to manage ACTIVATIONS of procedures.
Storage organization
• Suppose that the compiler obtains memory from
the OS so that it can execute the compiled
program
• The program is loaded into a newly created process
• This runtime storage must hold
• Generated target code
• Data objects
• A counterpart of the control stack to keep track of
procedure activations
Runtime Memory
• Pascal and C use extensions of the control stack to manage activations of
procedures.
• The stack contains information about register values, the value of the
program counter, and data objects whose lifetimes are contained in that of an
activation.
• The heap holds all other information, for example activations that cannot
be represented as a tree.
• By convention, the stack grows down and the top of the stack is drawn
towards the bottom of this slide (the value of top is usually kept in a
register).
Memory layout, from top to bottom:
code for function 1
code for function 2
...
code for function n
global / static area
stack
free space
heap
Runtime Memory
Memory layout, from top to bottom, as the stack grows:
code for function 1
code for function 2
...
code for function n
global / static area
stack (grows into the free space)
free space
heap
Activation Record
• Information needed by a single execution of a procedure is managed using
an activation record or frame
– Not all compilers use all of the fields
– Pascal and C push the activation record on the runtime stack when a
procedure is called and pop the activation record off the stack when
control returns to the caller
Fields of an activation record:
returned value
actual parameters
optional control link
optional access link
saved machine status
local data
temporaries
Activation Record
1) Temporary values
e.g. those arising in the evaluation of expressions
2) Local data
Data that is local to an execution of the procedure
3) Saved machine status
State of the machine just before the procedure is called: values of the
program counter and machine registers that have to be restored when control
returns from the procedure
Activation Record
4) Access link
refers to non-local data held in other activation records
5) Control link
points to the activation record of the caller
6) Actual parameters
used by the calling procedure to supply parameters to the called procedure
(in practice these are passed in registers)
7) Returned value
used by the called procedure to return a value to the calling procedure
(in practice it is returned in a register)
Storage allocation strategies
• The various storage allocation strategies to allocate
storage in different data areas of memory are:
1. Static Allocation
• Storage is allocated for all data objects at compile time
2. Stack allocation
• The storage is managed as a stack.
3. Heap Allocation (a form of dynamic storage allocation)
• The storage is allocated and deallocated at runtime
from a data area known as heap
Static allocation
• There are two different approaches for run time storage
allocation.
• Static allocation.
• Dynamic allocation.

Static allocation: uses neither a stack nor a heap.

• Activation record in the static data area, one per procedure.
• Names are bound to locations at compile time.
• Every time a procedure is called, its names refer to the
same preassigned locations.
Static allocation
Disadvantages:
• No recursion.
• Wastes a lot of space when procedures are inactive.
• No dynamic allocation.
Advantages:
• No stack manipulation or indirect access to names, i.e.,
faster in accessing variables.
• Values are retained from one procedure call to the next.
For example: static variables in C.
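The retained-value advantage can be seen with a static local in C. The following is a minimal sketch; the function names next_ticket and always_one are made up for this illustration.

```c
/* Hypothetical helper: returns 1, 2, 3, ... on successive calls. */
int next_ticket(void)
{
    static int count = 0;   /* lives in the global/static area, initialized once */
    count = count + 1;
    return count;
}

/* Contrast: an automatic local is re-created in every activation. */
int always_one(void)
{
    int count = 0;          /* lives in this call's activation record */
    count = count + 1;
    return count;
}
```

Calling next_ticket() twice yields 1 and then 2, because count keeps its location and value between calls; always_one() yields 1 every time.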
Static Allocation
• In a static environment (Fortran 77) there are a
number of restrictions:
– Sizes of data objects must be known at compile time
– No recursive procedures
– No dynamic memory allocation
• Only one copy of each procedure activation
record exists at time t
– We can allocate storage at compile time
• Bindings do not change at runtime
• Every time a procedure is called, the same bindings occur
Static Allocation
int i = 10;
int f(int j)
{
    int k;
    int m;
    …
}

main()
{
    int k;
    f(k);
}

Resulting static layout:
code for main()
code for f()
i (int)
activation record for main(): k (int)
activation record for f(): k (int), m (int)
Stack-based Allocation
• In a stack-based allocation, the previous
restrictions are lifted (Pascal, C, etc)
– procedures are allowed to be called recursively
• Need to hold multiple activation records for the same
procedure
• Created as required and placed on the stack
– Each record will maintain a pointer to the record that activated it
– On completion, the current record will be deleted from the stack
and control is passed to the calling record
– Dynamic memory allocation is allowed
– Pointers to data locations are allowed
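The points above can be mimicked with an explicit stack of records. This is only a sketch under simplifying assumptions: struct frame, the fixed MAX_FRAMES limit and the helper names are invented here, and a real activation record holds more fields than one parameter and a control link.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 64

/* Hypothetical, simplified activation record. */
struct frame {
    int n;                   /* actual parameter */
    struct frame *control;   /* control link: record of the caller */
};

static struct frame stack[MAX_FRAMES];
static int top = 0;          /* number of live records */
static int max_depth = 0;    /* deepest the stack has been */

static struct frame *push_frame(int n)
{
    assert(top < MAX_FRAMES);
    struct frame *f = &stack[top];
    f->n = n;
    f->control = (top > 0) ? &stack[top - 1] : NULL;
    top++;
    if (top > max_depth)
        max_depth = top;
    return f;
}

static void pop_frame(void)
{
    assert(top > 0);
    top--;
}

/* Recursive factorial that mirrors every call with an explicit record. */
int fact(int n)
{
    struct frame *f = push_frame(n);
    int result = (f->n <= 1) ? 1 : f->n * fact(f->n - 1);
    pop_frame();             /* record deleted on completion */
    return result;
}

int live_frames(void)   { return top; }
int deepest_stack(void) { return max_depth; }
```

After fact(5) returns 120, live_frames() is 0 again, while deepest_stack() reports 5: five records for the same procedure were live at once, which static allocation cannot provide.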
Stack-based Allocation
PROGRAM sort(input,output);
  VAR a : array[0..10] of Integer;
  PROCEDURE readarray;
    VAR i : Integer;
  BEGIN
    for i:= 1 to 9 do read(a[i]);
  END;
  FUNCTION partition(y,z : Integer): Integer;
    VAR i,j,x,v : Integer;
  BEGIN
    …
  END;
  PROCEDURE quicksort(m,n : Integer);
    VAR i : Integer;
  BEGIN
    if (n > m) then BEGIN
      i := partition(m,n);
      quicksort(m, i-1);
      quicksort(i+1,n)
    END
  END;
BEGIN { of main }
  a[0] := -9999; a[10] := 9999;
  readarray;
  quicksort(1,9)
END.

Position in activation tree: s
Activation records on stack: s, holding a (array)
Stack-based Allocation
The same program, once control has reached quicksort(1,3)
(s = sort, r = readarray, p = partition, q = quicksort):
Position in activation tree:
s
  r
  q(1,9)
    p(1,9)
    q(1,3)
Activation records on stack:
s: a (array)
q(1,9): i (integer)
q(1,3): i (integer)
Calling Sequences
• Procedure calls are implemented by generating calling sequences in the
target code
– Call sequence: allocates an activation record and enters information into
its fields
– Return sequence: restores the state of the machine so that the calling
procedure can continue execution
Calling Sequences
• Why place the returned value and actual parameters next to the activation
record of the caller?
– The caller can access these values using offsets from its own activation
record
– No need to know the middle part of the callee’s activation record
Stack layout: the caller's activation record (returned value, actual
parameters, optional control link, optional access link, saved machine
status, local data, temporaries) is followed by the callee's activation
record with the same fields.

Copyright (c) 2011 Ioanna Dionysiou
Calling Sequences
• How do we calculate offsets?
– Maintain a register that points to the end of the machine status field in
an activation record
– top_sp is known to the caller, so it can be responsible for setting it
before control flows to the called procedure
– The callee can access its temporaries and local data using offsets from
top_sp
In the diagram, top_sp points to the end of the saved machine status field
of the callee's activation record.
Call Sequence
• The caller evaluates actuals
• The caller
– stores a return address and the old value of top_sp
into the callee’s activation record
– increments top_sp; that is, top_sp is moved past the caller’s
local data and temporaries and the callee’s
parameter and status fields
• The callee
– saves register values and other status information
• The callee
– initializes its local data and begins execution
Return Sequence
• The callee places a return value next to the
activation record of the caller
• Using the information in the status field, the
callee
– restores top_sp and other registers
– branches to a return address in the caller’s code
• Although top_sp has been decremented, the
caller can copy the returned value into its own
activation record and use it to evaluate an
expression
Dangling References
• Whenever storage is deallocated, the
problem of dangling references arises
– Occurs when there is a reference to storage
that has been deallocated
– Logical error
• Mysterious bugs can appear
Dangling References
What’s the problem?

int *dangle()
{
    int i = 23;
    return &i;
}

main()
{
    int *p;
    p = dangle();
}
Dangling References
• Local variable i only exists in dangle()
• When the procedure completes execution and control is transferred to
main(), the space for i does not exist anymore (the activation record for
dangle is popped off the stack)
• Pointer p is a dangling reference
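Two common ways to avoid the dangling pointer in C are sketched below: allocate the value on the heap, or let the caller own the storage. The function names are hypothetical and the sketch is not the only possible fix.

```c
#include <assert.h>
#include <stdlib.h>

/* Option 1: allocate on the heap; the caller must free() the result. */
int *make_on_heap(void)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = 23;
    return p;             /* the storage outlives this activation */
}

/* Option 2: the caller owns the storage and passes its address in. */
void fill_in(int *out)
{
    assert(out != NULL);
    *out = 23;
}

/* Hypothetical driver: returns 46 if both options work, -1 if malloc fails. */
int dangle_free_demo(void)
{
    int local;
    int *heap = make_on_heap();
    if (heap == NULL)
        return -1;
    fill_in(&local);
    int sum = *heap + local;
    free(heap);
    return sum;
}
```

In both variants the integer lives in storage that is still valid when the caller reads it, so no pointer refers to a popped activation record.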


Dynamic Allocation
• Returning the address of a local variable is defined to be
a logical error (e.g. in C)
• In a dynamic environment there is no such restriction
– All variables and activation records must be maintained for as
long as there are references to them
• Callee outlives the caller
– It is also possible to return pointers to local functions
– Must deallocate space when procedures and variables are no
longer needed (garbage collection)
Accessing local and non local variables
Parameter Passing
• Parameters are the most common way for a calling
procedure to communicate with the callee.
• Different languages have different parameter semantics.
• Mostly, the differences lie in whether an l-value, an r-value,
or the text of the actual parameter is passed.
• We consider four protocols:
– Call by value
– Call by reference
– Copy-restore
– Call by name

Call by value
• This is the simplest parameter passing method.
• The caller computes r-values for the actuals.
• The caller places the resulting values on the
stack, in the AR of the callee.
• The callee may change the parameters, but this
has no effect on the caller.
• This is the default protocol in Pascal, and the
ONLY protocol in C.
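A minimal C sketch of call by value follows; the function names are made up for the illustration.

```c
/* The callee receives a copy of the r-value in its own activation record. */
void try_to_change(int x)
{
    x = x + 100;          /* modifies only the callee's copy */
}

/* Returns the caller's variable after the call. */
int call_by_value_demo(void)
{
    int a = 1;
    try_to_change(a);
    return a;             /* the caller's a is unchanged */
}
```

call_by_value_demo() returns 1: the assignment inside try_to_change never reaches the caller's a.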

Parameter passing example
program reference( input, output );
var a, b: integer;
procedure swap( var x, y: integer );   { "var" specifies call-by-reference }
    var temp : integer;
begin
    temp := x;
    x := y;
    y := temp;
end;
begin
    a := 1; b := 2;
    swap( a, b );
    writeln( 'a = ', a ); writeln( 'b = ', b )
end.
Call by reference
• The caller passes the called procedure a
POINTER to the storage address of the actual
parameter.
• If the actual has an l-value, it is used.
• If the actual is an expression, we place the result
of the expression in a temporary and pass a
pointer to the temporary.
• Pascal uses call by reference if the “var”
keyword is used.
• C++ uses call by reference if the parameter is declared with
“&”.
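Since C itself passes everything by value, call by reference is usually simulated there by passing addresses explicitly. The sketch below mirrors the Pascal swap example; swap_demo and its encoding of the result are assumptions made for the illustration.

```c
/* The formals are pointers to the actuals' storage (their l-values). */
void swap(int *x, int *y)
{
    int temp = *x;
    *x = *y;
    *y = temp;
}

/* Mirrors a := 1; b := 2; swap(a, b). The result is encoded as 10*a + b. */
int swap_demo(void)
{
    int a = 1, b = 2;
    swap(&a, &b);
    return 10 * a + b;
}
```

swap_demo() returns 21, i.e. a = 2 and b = 1 after the call: the callee changed the caller's variables through the passed addresses.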
Copy restore
• This is a hybrid between call-by-value and call-by-reference.
• Before the callee is activated, we evaluate the
actuals and put their r-values in the AR for the
callee.
• But we also compute and save the l-values of
the actuals.
• In the return sequence, we copy the updated
r-values from the callee’s AR to the locations given by
the saved l-values. Some FORTRAN implementations use this
approach.
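Copy-restore and call by reference differ when two formals alias the same actual. The sketch below simulates copy-restore by hand in C; the function names and the left-to-right copy-out order are assumptions of this illustration.

```c
/* Call by reference: both formals alias the same storage. */
void bump_ref(int *x, int *y)
{
    *x = *x + 1;
    *y = *y + 10;
}

/* Copy-restore, simulated: copy in, run the body on copies, copy out. */
void bump_copy_restore(int *x_lvalue, int *y_lvalue)
{
    int x = *x_lvalue;    /* copy in the r-values */
    int y = *y_lvalue;
    x = x + 1;            /* the body works on the callee's copies */
    y = y + 10;
    *x_lvalue = x;        /* copy out through the saved l-values */
    *y_lvalue = y;
}

int by_reference_result(void)
{
    int a = 0;
    bump_ref(&a, &a);
    return a;
}

int by_copy_restore_result(void)
{
    int a = 0;
    bump_copy_restore(&a, &a);
    return a;
}
```

With the same variable passed for both parameters, by_reference_result() gives 11 (both updates hit a), while by_copy_restore_result() gives 10 (each copy starts from 0 and the last copy-out wins).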
Call by name (macro expansion)
• In this method, we just substitute the body
of the procedure for the procedure call.
• In the copied body, the formal parameters
are replaced by the text of the actuals.
• #define macros in C/C++ use this
technique.
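The textual substitution is visible when the actual has a side effect. In this sketch, next_value, DOUBLE_MACRO and double_func are hypothetical names; the macro only approximates call by name, since it substitutes text before compilation.

```c
static int calls = 0;

/* Hypothetical function with a side effect: counts its own evaluations. */
int next_value(void)
{
    calls = calls + 1;
    return calls;
}

/* Textual substitution: the actual's text appears twice in the expansion. */
#define DOUBLE_MACRO(e) ((e) + (e))

/* Call by value: the actual is evaluated once, before the call. */
int double_func(int e)
{
    return e + e;
}

int calls_so_far(void) { return calls; }
```

Starting from a fresh program, double_func(next_value()) evaluates the actual once and yields 2, whereas a following DOUBLE_MACRO(next_value()) expands to two calls and yields 2 + 3 = 5.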

SYMBOL TABLE

Definition: Symbol tables are data structures that are used by compilers to
hold information about source-program constructs.

A symbol table is a necessary component because
• The declaration of an identifier appears once in a program
• Uses of an identifier may appear in many places in the program text

11/28/2020 35
INFORMATION PROVIDED BY SYMBOL TABLE
• Given an identifier, which name is it?
• What information is to be associated with a name?
• How do we access this information?
SYMBOL TABLE - NAMES
A NAME can denote:
• Variable and labels
• Parameter
• Constant
• Record
• Record field
• Procedure
• Array and files
SYMBOL TABLE-ATTRIBUTES
• Each piece of information associated with a name is
called an attribute.
• Attributes are language dependent.
• Different classes of symbols have different attributes:
– Variables, constants: type, line number where declared, lines where
referenced, scope
– Procedure or function: number of parameters, the parameters themselves,
result type
– Array: number of dimensions, array bounds
WHO CREATES SYMBOL TABLE??
• Identifiers and attributes are entered by the analysis phases when
processing a definition (declaration) of an identifier
• In simple languages with only global variables and implicit declarations:
– The scanner can enter an identifier into the symbol table if it is not
already there
• In block-structured languages with scopes and explicit declarations:
– The parser and/or semantic analyzer enter identifiers and corresponding
attributes
USE OF SYMBOL TABLE
• Symbol table information is used by the analysis and
synthesis phases
• To verify that used identifiers have been defined
(declared)
• To verify that expressions and assignments are
semantically correct – type checking
• To generate intermediate or target code

IMPLEMENTATION OF SYMBOL TABLE
• Each entry in the symbol table can be implemented as a record consisting
of several fields.
• These fields depend on the information to be saved about the name.
• But since the information about a name depends on how the name is used,
the records in the symbol table would not be uniform.
• Hence, to keep the symbol table records uniform, some information is kept
outside the symbol table and a pointer to this information is stored in the
symbol table record.
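SYMBOL TABLE
(Figure: the symbol table record for a holds the type int and a pointer to
the bounds LB1 and UB1, which are stored outside the table.)
A pointer steers the symbol table to remotely stored information
for array a.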
WHERE SHOULD NAMES BE HELD??
• If there is a modest upper bound on the length of a name, then the name
can be stored in the symbol table record itself.
• But if there is no such limit, or the limit is already reached, then an
indirect scheme of storing names is used.
• A separate array of characters called a ‘string table’ is used to store
the name, and a pointer to the name is kept in the symbol table record.
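The indirect scheme can be sketched in C as follows. The table size, the struct layout and the function names are assumptions of this illustration; an offset into the character array plays the role of the pointer.

```c
#include <assert.h>
#include <string.h>

#define STRTAB_SIZE 256

static char string_table[STRTAB_SIZE];  /* all names, each NUL-terminated */
static int string_top = 0;              /* next free position */

/* Fixed-size symbol record: the name is stored as an offset, not inline. */
struct symbol {
    int name_offset;    /* index of the first character in string_table */
    int type_code;      /* hypothetical attribute */
};

/* Appends a name to the string table; returns its offset, or -1 if full. */
int intern_name(const char *name)
{
    int len = (int)strlen(name) + 1;     /* include the terminator */
    if (string_top + len > STRTAB_SIZE)
        return -1;
    int offset = string_top;
    memcpy(&string_table[offset], name, (size_t)len);
    string_top += len;
    return offset;
}

/* Recovers the name stored at a given offset. */
const char *name_at(int offset)
{
    assert(offset >= 0 && offset < string_top);
    return &string_table[offset];
}
```

Every struct symbol has the same size regardless of name length; a record would be filled with, for example, name_offset = intern_name("a").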
SYMBOL TABLE
(Figure: symbol table records with type int and bounds LB1, UB1; the name
fields point into the STRING TABLE, which holds the names A and B.)
SYMBOL TABLE AND SCOPE
• Symbol tables typically need to support multiple
declarations of the same identifier within a program.

The scope of a declaration is the portion of a program to which the
declaration applies.

• We shall implement scopes by setting up a separate symbol table for each
scope.
SYMBOL TABLE ORGANIZATION
Var x,y : integer
Procedure P:
    Var x,a : boolean;
    Procedure q:
        Var x,y,z : real;
        begin
        ……
        end
    begin
    …..
    End

Symbol tables, innermost scope on top (TOP):
Symbol table for block q: z Real, y Real, x Real
Symbol table for p: q Proc, a Boolean, x Boolean
Symbol table for main: P Proc, y Integer, x Integer
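One table per scope, each linked to the table of the enclosing scope, can be sketched in C as below. The struct names, the fixed table size and the helper type_seen_from_q are invented for this illustration; only the variable declarations of the program above are entered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 8

struct entry { const char *name; const char *type; };

/* One table per scope; outer links to the enclosing scope's table. */
struct scope {
    struct entry syms[MAX_SYMS];
    int count;
    struct scope *outer;
};

void declare(struct scope *s, const char *name, const char *type)
{
    assert(s->count < MAX_SYMS);
    s->syms[s->count].name = name;
    s->syms[s->count].type = type;
    s->count++;
}

/* Searches the innermost scope first, then each enclosing scope. */
const char *lookup(const struct scope *s, const char *name)
{
    for (; s != NULL; s = s->outer)
        for (int i = 0; i < s->count; i++)
            if (strcmp(s->syms[i].name, name) == 0)
                return s->syms[i].type;
    return NULL;    /* undeclared */
}

/* Builds the three tables and returns the type seen for name inside q. */
const char *type_seen_from_q(const char *name)
{
    struct scope main_scope = { .count = 0, .outer = NULL };
    struct scope p_scope    = { .count = 0, .outer = &main_scope };
    struct scope q_scope    = { .count = 0, .outer = &p_scope };
    declare(&main_scope, "x", "integer");
    declare(&main_scope, "y", "integer");
    declare(&p_scope, "x", "boolean");
    declare(&p_scope, "a", "boolean");
    declare(&q_scope, "x", "real");
    declare(&q_scope, "y", "real");
    declare(&q_scope, "z", "real");
    return lookup(&q_scope, name);
}
```

From inside q, x resolves to the real declared in q, a resolves to the boolean declared in P, and an undeclared name such as b yields NULL.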
SYMBOL TABLE DATA STRUCTURES
• Issues to consider: operations required
– Insert: add a symbol to the symbol table
– Look up: find a symbol in the symbol table (and get its attributes)
• Insertion is done only once
• Look up is done many times
• Need fast look up
• The data structure should be designed to allow the
compiler to find the record for each name quickly and to
store or retrieve data from that record quickly.
LINKED LIST
• A linear list of records is the easiest way to implement a
symbol table.
• New names are added to the symbol table in the
order they arrive.
• Whenever a new name is to be added, the table is first searched linearly
(sequentially) to check whether the name is already present; if not, it
is added.
• Time complexity – O(n)
• Advantages – less space, additions are simple
• Disadvantage – higher access time
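A linear-list symbol table with search-before-insert might look like the following C sketch. The record layout and function names are assumptions; records are never freed here, and the caller is assumed to keep the name strings alive.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct node {
    const char *name;
    int line_declared;
    struct node *next;
};

static struct node *head = NULL;
static struct node *tail = NULL;

/* Linear search from the front: O(n) comparisons. */
struct node *list_lookup(const char *name)
{
    for (struct node *p = head; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}

/* Searches first and appends only if absent. Returns 1 if a record was
   added, 0 if the name was already present or allocation failed. */
int list_insert(const char *name, int line_declared)
{
    if (list_lookup(name) != NULL)
        return 0;
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->name = name;
    n->line_declared = line_declared;
    n->next = NULL;
    if (tail == NULL)
        head = n;          /* first record */
    else
        tail->next = n;    /* keep arrival order */
    tail = n;
    return 1;
}
```

Inserting a name a second time is rejected by the preliminary search, so each insertion costs a full O(n) scan in the worst case.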
UNSORTED LIST
01 PROGRAM Main
02 GLOBAL a,b
03 PROCEDURE P (PARAMETER x)
04 LOCAL a
05 BEGIN {P}
06 …a…
07 …b…
08 …x…
09 END {P}
10 BEGIN{Main}
11 Call P(a)
12 END {Main}

Look up complexity: O(n)

Name   Characteristic class   Scope   Declared   Referenced   Other
Main   Program                0       Line 1
a      Variable               0       Line 2     Line 11
b      Variable               0       Line 2     Line 7
P      Procedure              0       Line 3     Line 11      1, parameter, x
x      Parameter              1       Line 3     Line 8
a      Variable               1       Line 4     Line 6
SORTED LIST
(Same program as under UNSORTED LIST.)

Look up complexity: O(log₂ n) if stored as an array (complex insertion)
Look up complexity: O(n) if stored as a linked list (easy insertion)

Name   Characteristic class   Scope   Declared   Referenced   Other
a      Variable               0       Line 2     Line 11
a      Variable               1       Line 4     Line 6
b      Variable               0       Line 2     Line 7
Main   Program                0       Line 1
P      Procedure              0       Line 3     Line 11      1, parameter, x
x      Parameter              1       Line 3     Line 8
SEARCH TREES
• An efficient approach to symbol table organisation
• We add two links, left and right, to each record in the
search tree.
• Whenever a name is to be added, the name is first
searched for in the tree.
• If it does not exist, a record for the new name is
created and added at the proper position.
• This has alphabetical accessibility.
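Search-then-insert on a binary search tree can be sketched in C as below. The names are hypothetical, the tree is not rebalanced, and ordering follows strcmp, which in ASCII places uppercase letters before lowercase ones rather than being purely alphabetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct tnode {
    const char *name;
    struct tnode *left;
    struct tnode *right;
};

static struct tnode *table = NULL;    /* root of the search tree */

/* Returns how many nodes are visited to reach name, or 0 if it is absent. */
int steps_to_find(const char *name)
{
    int steps = 0;
    for (const struct tnode *p = table; p != NULL; ) {
        int cmp = strcmp(name, p->name);
        steps++;
        if (cmp == 0)
            return steps;
        p = (cmp < 0) ? p->left : p->right;
    }
    return 0;
}

/* Searches first; if absent, links a new record where the search failed.
   Returns the depth of the record (root = 1), or 0 if allocation failed. */
int add_name(const char *name)
{
    struct tnode **link = &table;
    int depth = 1;
    while (*link != NULL) {
        int cmp = strcmp(name, (*link)->name);
        if (cmp == 0)
            return depth;             /* already present */
        link = (cmp < 0) ? &(*link)->left : &(*link)->right;
        depth++;
    }
    struct tnode *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->name = name;
    n->left = NULL;
    n->right = NULL;
    *link = n;
    return depth;
}
```

Adding Main, a, b, P, x in that order puts x at depth 4 of a five-node tree: insertion order alone can unbalance the tree and push look up toward O(n).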
BINARY TREE
(Figure: the records below arranged as nodes of a binary search tree.)
Main   Program     0   Line 1
P      Procedure   1   Line 3   Line 11
x      Parameter   1   Line 3   Line 8
a      Variable    0   Line 2   Line 11
b      Variable    0   Line 2   Line 7
a      Variable    1   Line 4   Line 6
BINARY TREE
Lookup complexity if tree balanced: O(log₂ n)
Lookup complexity if tree unbalanced: O(n)
HASH TABLE
• A table of k pointers, numbered from zero to k-1, each pointing to a
record within the symbol table.
• To enter a name into the symbol table, we compute the hash value of the
name by applying a suitable hash function.
• The hash function maps the name to an integer between zero and k-1, and
this value is used as an index into the hash table.
HASH TABLE - EXAMPLE
Character codes: M = 77, n = 110, a = 97, b = 98, P = 80, x = 120

PROGRAM Main
GLOBAL a,b
PROCEDURE P(PARAMETER x)
LOCAL a
BEGIN (P)
…a…
…b…
…x…
END (P)
BEGIN (Main)
Call P(a)
End (Main)

Bucket   Records (chained)
0        Main Program 0 Line 1
1
2
3
4
5
6        P Procedure 1 Line 3
7        a Variable 1 Line 4 -> a Variable 0 Line 2
8
9        x Parameter 1 Line 3 -> b Variable 0 Line 2
10

H(Id) = (# of first letter + # of last letter) mod 11
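The hash function H above and a chained table can be sketched in C as follows. The sketch assumes ASCII character codes; the record layout, the fixed pool of 32 records and the function names are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define K 11            /* number of buckets, as in the example */
#define POOL_SIZE 32

/* H(id) = (code of first letter + code of last letter) mod 11 */
unsigned hash_name(const char *id)
{
    size_t len = strlen(id);
    assert(len > 0);
    return (unsigned)(((unsigned char)id[0] + (unsigned char)id[len - 1]) % K);
}

struct hnode {
    const char *name;
    int scope;
    struct hnode *next;   /* next record in the same bucket */
};

static struct hnode *buckets[K];
static struct hnode pool[POOL_SIZE];
static int used = 0;

/* Puts a record at the front of its bucket's chain.
   Returns the bucket index, or -1 if the pool is exhausted. */
int hash_insert(const char *name, int scope)
{
    if (used == POOL_SIZE)
        return -1;
    unsigned h = hash_name(name);
    struct hnode *n = &pool[used++];
    n->name = name;
    n->scope = scope;
    n->next = buckets[h];
    buckets[h] = n;
    return (int)h;
}

/* Returns the most recently inserted record for name, or NULL. */
const struct hnode *hash_lookup(const char *name)
{
    for (const struct hnode *p = buckets[hash_name(name)]; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}
```

hash_name reproduces the bucket numbers of the example (Main 0, P 6, a 7, b and x 9). Inserting at the front of the chain means the local a (scope 1) is found before the global a (scope 0), matching the order of the chain in bucket 7.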