UNIT IV CD Mam Notes
A syntax-directed definition (SDD) is a context-free grammar together with attributes and semantic rules. Attributes are associated with grammar symbols, which is why an SDD is also called an attribute grammar. A synthesized attribute at a parse-tree node N is defined by a semantic rule associated with the production used at N.
S-attributed Definitions
A syntax-directed definition that involves only synthesized attributes is called S-attributed. The attribute value for the non-terminal at the head of a production is computed from the attribute values of the symbols in the body of that production.
The attributes of an S-attributed SDD can be evaluated in bottom-up order over the nodes of the parse tree, i.e., by performing a post-order traversal of the parse tree and evaluating the attributes at a node when the traversal leaves that node for the last time.
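This post-order evaluation can be sketched in Python (an illustration added for these notes, not part of the original; the Node class and attribute names are assumptions):

```python
# Bottom-up evaluation of a synthesized attribute 'val': each node's
# attribute is computed only after all of its children have been evaluated
# (post-order traversal of the parse tree).

class Node:
    def __init__(self, op, *children, val=None):
        self.op = op              # production label: '+', '*', or 'num'
        self.children = children
        self.val = val            # synthesized attribute (lexval for leaves)

def evaluate(node):
    """Post-order traversal: children first, then the node itself."""
    for child in node.children:
        evaluate(child)
    if node.op == '+':
        node.val = node.children[0].val + node.children[1].val
    elif node.op == '*':
        node.val = node.children[0].val * node.children[1].val
    return node.val

# Parse tree for 2 + 3 * 4, with * binding tighter than +:
tree = Node('+', Node('num', val=2),
                 Node('*', Node('num', val=3), Node('num', val=4)))
print(evaluate(tree))   # -> 14
```

Evaluating this tree annotates every node with its val, which is exactly the annotated parse tree for 2+3*4.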
L-attributed Definitions
A syntax-directed definition is L-attributed if, for the attributes in each production body, the edges of the dependency graph go only from left to right, never from right to left. The attributes of an L-attributed definition may be either synthesized or inherited, but each inherited attribute must be computed using only:
• inherited or synthesized attributes associated with symbols located to the left of the attribute being computed, or
• attributes associated with the head of the production, or with the symbol itself, in such a way that no cycles are formed in the dependency graph.
In production 1, the inherited attribute of T' is computed from the value of F, which is to its left. In production 2, the inherited attribute of T1' is computed from T'.inh, associated with the head, and from the value of F, which appears to its left in the production. That is, an inherited attribute must be computed either from above (the head) or from the left in the SDD.
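The two productions referred to above are the classic left-factored expression grammar. A sketch in Python (added for illustration; the function names and token encoding are assumptions) shows the inherited attribute flowing from head and left siblings during a left-to-right parse:

```python
# L-attributed evaluation during a predictive parse of the grammar
#   T  -> F T'        { T'.inh = F.val;           T.val  = T'.syn }
#   T' -> * F T1'     { T1'.inh = T'.inh * F.val; T'.syn = T1'.syn }
#   T' -> epsilon     { T'.syn = T'.inh }
# Inherited attributes are passed down as a parameter ('from above or
# from the left'); no explicit parse tree is built.

def parse_T(tokens):
    f_val = tokens.pop(0)          # F -> digit: synthesized lexval
    return parse_Tprime(tokens, inh=f_val)

def parse_Tprime(tokens, inh):
    if tokens and tokens[0] == '*':
        tokens.pop(0)              # consume '*'
        f_val = tokens.pop(0)      # next factor F
        return parse_Tprime(tokens, inh=inh * f_val)  # T1'.inh from the left
    return inh                     # T' -> epsilon: syn = inh

print(parse_T([3, '*', 5, '*', 2]))   # -> 30
```

Translation happens during parsing itself, with no explicit tree construction, which is the point of L-attributed translation.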
Types of translation
• L-attributed translation
o It performs translation during parsing itself.
o No need of explicit tree construction.
o L represents 'left to right'.
• S-attributed translation
o It is performed in connection with bottom up parsing.
o 'S' represents synthesized.
Types of attributes
• Inherited attributes
o It is defined by a semantic rule associated with the production at the parent of the node.
o Its value is computed from attributes of the node's parent, its siblings, and the node itself.
o The non-terminal concerned must be in the body of the production.
Example: inherited attributes
• Synthesized attributes
o It is defined by a semantic rule associated with the production at the node itself.
o Its value is computed from attributes of the node's children and the node itself.
o The non-terminal concerned must be in the head of the production.
o Terminals have synthesized attributes, which are the lexical values (denoted lexval) supplied by the lexical analyzer.
Example: Syntax-directed definition of a simple desk calculator.
Example: Construct an annotated parse tree for the expression S = 2+3*4.
--------------------------------------------------------------------------------------------------------------
Example
------------------------------------------------------------------------------------------------------------------
TYPE CHECKING
A compiler must check that the source program follows both the syntactic and the semantic conventions of the source language. This checking, called static checking, ensures that certain kinds of programming errors are detected and reported. Examples of static checks include:
1. Type Checks: A compiler should report an error if an operator is applied to an incompatible operand; for example, if an array variable and a function variable are added together.
2. Flow-of-Control Checks: Statements that cause flow of control to leave a construct must have some place to which control can be transferred; for example, a break statement in C must be enclosed within a while, for, or switch statement.
3. Uniqueness Checks: There are situations in which an object must be defined exactly once. For example, in Pascal an identifier must be declared uniquely, and the labels in a case statement must be distinct.
4. Name-Related Checks: Sometimes the same name must appear two or more times. For example, in Ada a loop or block may have a name that appears at both the beginning and the end of the construct. The compiler must check that the same name is used at both places.
A type checker verifies that the type of a construct matches that expected by its context.
For example, the built-in arithmetic operator mod in Pascal requires integer operands, so a type
checker must verify that the operands of mod have type integer. Type information gathered by
the type checker may be needed when code is generated.
The design of a type checker for a language is based on information about the syntactic constructs in the language, the notion of types, and the rules for assigning types to language constructs. The following are examples of information that a compiler writer might have to start with:
1. A basic type is a type expression. Among the basic types are Boolean, char, integer, and
real. A special basic type, type_error, will signal an error during type checking. Finally
the basic type void denoting “the absence of a value”
2. Since type expressions may be named, a type name is a type expression. An example of the use of type names appears below in 3(c).
3. A type constructor applied to type expressions is a type expression:
a) Arrays: If T is a type expression, then array(I, T) is a type expression denoting the type of an array with elements of type T and index set I. I is often a range of integers. For example, the declaration var a : array[1..10] of integer associates the type expression array(1..10, integer) with a.
b) Products: If T1 and T2 are type expressions, then their Cartesian product T1 × T2 is a type expression.
c) Records: A record differs from a product in that its fields have names. For example, the declaration
type row = record
address : integer;
lexeme : array[1..15] of char
end;
associates the type name row with the type expression formed from the field names and their types.
d) Pointers: If T is a type expression, then pointer(T) is a type expression denoting the type "pointer to an object of type T".
e) Functions: A function maps elements of its domain type D to its range type R; its type is denoted D -> R.
4. Type expressions may contain variables whose values are type expressions.
A convenient way to represent a type expression is to use a graph: a tree or a DAG. For example, the figure shows the tree and DAG, respectively, for the type expression char × char -> pointer(integer).
Type systems:
A type system is a collection of rules for assigning type expressions to the various parts of a program. A type checker implements a type system. Different type systems may be used by different compilers or processors of the same language.
Checking done by a compiler is said to be static, while checking done when the target
program runs is termed dynamic. A sound type system eliminates the need for dynamic checking
for type errors because it allows us to determine statically that these errors cannot occur when the
target program runs. A language is strongly typed if its compiler can guarantee that the programs it accepts will execute without type errors.
In practice, some checks can only be done dynamically. For example, if we first declare
table : array[0..255] of char;
i : integer;
and then compute table[i], a compiler cannot in general guarantee that during execution the value of i will lie in the range 0..255.
Error Recovery:
The inclusion of error handling may result in a type system that goes beyond the one
needed to specify correct programs. For example, once an error has occurred, we may not know
the type of the incorrectly formed program fragment.
Syntax trees are useful for representing programming language constructs like expressions and
statements.
• They help compiler design by decoupling parsing from translation.
• Each node of a syntax tree represents a construct; the children of the node represent the meaningful
components of the construct.
• e.g. a syntax-tree node representing an expression E1 + E2 has label + and two children representing the
sub expressions E1 and E2
• Each node is implemented by an object with a suitable number of fields; each object has an op field that is the label of the node, with additional fields as follows:
If the node is a leaf, an additional field holds the lexical value for the leaf. Such a node is created by a function Leaf(op, val).
If the node is an interior node, there are as many additional fields as the node has children in the syntax tree. Such a node is created by a function Node(op, c1, c2, ..., ck).
One of the chief uses of SDDs is the construction of syntax trees. A syntax tree is a condensed form of the parse tree.
Example: The S-attributed definition in figure below constructs syntax trees for a simple
expression grammar involving only the binary operators + and -. As usual, these operators are at
the same precedence level and are jointly left associative. All nonterminals have one synthesized
attribute node, which represents a node of the syntax
tree.
If the rules are evaluated during a post order traversal of the parse tree, or with reductions during
a bottom-up parse, then the sequence of steps shown below ends with p5 pointing to the root of
the constructed syntax tree.
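The step sequence p1, ..., p5 for the expression a-4+c can be sketched in Python (an illustrative addition; the SyntaxNode class and the 'entry-a'/'entry-c' symbol-table stand-ins are assumptions):

```python
# Leaf(op, val) builds a leaf holding a lexical value; Node(op, c1, ..., ck)
# builds an interior node whose fields point to its children. The sequence
# below mirrors the reductions for a - 4 + c, ending with p5 at the root.

class SyntaxNode:
    def __init__(self, op, *fields):
        self.op = op          # label of the node
        self.fields = fields  # lexical value (leaf) or child pointers

def Leaf(op, val):
    return SyntaxNode(op, val)

def Node(op, *children):
    return SyntaxNode(op, *children)

p1 = Leaf('id', 'entry-a')    # leaf for a (pointer to symbol-table entry)
p2 = Leaf('num', 4)           # leaf for 4
p3 = Node('-', p1, p2)        # subtree for a - 4
p4 = Leaf('id', 'entry-c')    # leaf for c
p5 = Node('+', p3, p4)        # root of the syntax tree for a - 4 + c

print(p5.op)                  # -> +
```

Because the operators are left associative, the left operand of + is the subtree for a-4, matching the syntax tree shown below.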
With a grammar designed for top-down parsing, the same syntax trees are constructed,
using the same sequence of steps, even though the structure of the parse trees differs significantly
from that of syntax trees. The L-attributed definition below performs the same translation as the
S-attributed definition shown before.
Syntax tree for a-4+c using the above SDD is shown below.
---------------------------------------------------------------------------------------------------------------------
SPECIFICATION OF A SIMPLE TYPE CHECKER
Let us consider a simple language in which the type of each identifier must be declared
before the identifier is used. The type checker can handle arrays, pointers, statements and
functions.
A Simple Language:
P -> D ; E
D -> D ; D | id : T
For example, the declaration key : integer; associates the type integer with the identifier key. The translation scheme is:
P -> D ; E
D -> D ; D
D -> id : T { addtype(id.entry, T.type) }
T -> char { T.type := char }
T -> integer { T.type := integer }
T -> array [ num ] of T1 { T.type := array(1..num.val, T1.type) }
T -> ^ T1 { T.type := pointer(T1.type) }
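A sketch of these declaration rules in Python (added for illustration; the tuple encoding of type expressions and the helper names are assumptions, with addtype taken from the scheme above):

```python
# Type expressions are encoded as nested tuples; addtype records each
# declared identifier's type expression in a symbol table.

symtab = {}

def addtype(name, t):             # D -> id : T
    symtab[name] = t

def array_type(size, elem):       # T -> array [ num ] of T1
    return ('array', (1, size), elem)

def pointer_type(t):              # T -> ^ T1
    return ('pointer', t)

# key : integer;
addtype('key', 'integer')
# row : array [10] of ^ char;
addtype('row', array_type(10, pointer_type('char')))

print(symtab['row'])   # -> ('array', (1, 10), ('pointer', 'char'))
```

The nesting of tuples mirrors the nesting of type constructors in the type expression array(1..10, pointer(char)).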
Type Checking of Expressions:
The expression formed by applying the mod operator to two subexpressions of type integer has type integer; otherwise, its type is type_error. The rules are:
E -> E1 mod E2 { E.type := if E1.type = integer and E2.type = integer then integer
else type_error }
E -> E1 [ E2 ] { E.type := if E2.type = integer and E1.type = array(s, t) then t
else type_error }
E -> E1 ^ { E.type := if E1.type = pointer(t) then t
else type_error }
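These three expression rules can be sketched directly in Python (an illustrative addition; the tuple encoding of array and pointer types is an assumption carried over from the declaration sketch):

```python
# Expression type checking: integer mod, array indexing, and pointer
# dereference, with type_error produced on any mismatch.

type_error = 'type_error'

def check_mod(t1, t2):            # E -> E1 mod E2
    return 'integer' if t1 == 'integer' and t2 == 'integer' else type_error

def check_index(t1, t2):          # E -> E1 [ E2 ]
    if t2 == 'integer' and isinstance(t1, tuple) and t1[0] == 'array':
        return t1[2]              # element type t of array(s, t)
    return type_error

def check_deref(t1):              # E -> E1 ^
    if isinstance(t1, tuple) and t1[0] == 'pointer':
        return t1[1]              # pointed-to type t of pointer(t)
    return type_error

print(check_mod('integer', 'integer'))                     # -> integer
print(check_index(('array', (1, 10), 'char'), 'integer'))  # -> char
print(check_deref('integer'))                              # -> type_error
```

Note how type_error, once produced, simply propagates: any rule whose operand already has type_error fails its test and yields type_error again.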
Type Checking of Statements:
Since language constructs like statements typically do not have values, the special basic type void can be assigned to them. If an error is detected within a statement, the type assigned to the statement is type_error. The rules are:
S -> id := E { S.type := if id.type = E.type then void
else type_error }
S -> if E then S1 { S.type := if E.type = boolean then S1.type
else type_error }
S -> while E do S1 { S.type := if E.type = boolean then S1.type
else type_error }
S -> S1 ; S2 { S.type := if S1.type = void and S2.type = void then void
else type_error }
The first rule checks that the left and right sides of an assignment statement have the same type. The second and third rules specify that the expressions in conditional and while statements must have type boolean. Errors are propagated by the last rule, because a sequence of statements has type void only if each sub-statement has type void.
Type Checking of Functions:
Consider the production
E -> E1 ( E2 )
in which an expression is the application of one expression to another. The rule for checking the type of a function application is
E -> E1 ( E2 ) { E.type := if E2.type = s and E1.type = s -> t then t
else type_error }
This rule says that in an expression formed by applying E1 to E2, the type of E1 must be a function s -> t from the type s of E2 to some range type t; the type of E1(E2) is then t.
Types of EQUIVALENCE
NAME EQUIVALENCE
Two type expressions are name equivalent if and only if they have the same name.
STRUCTURAL EQUIVALENCE
Two type expressions are structurally equivalent if they are the same basic type, or they are formed by applying the same constructor to structurally equivalent type expressions. The check can be written as a recursive function:
function sequiv(s, t) : boolean
if s and t are the same basic type then return true
else if s = array(s1, s2) and t = array(t1, t2) then return sequiv(s1, t1) and sequiv(s2, t2)
else if s = pointer(s1) and t = pointer(t1) then return sequiv(s1, t1)
else if s = s1 -> s2 and t = t1 -> t2 then return sequiv(s1, t1) and sequiv(s2, t2)
else return false
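An executable sketch of sequiv in Python (illustrative; it assumes the tuple encoding of type expressions used in the earlier sketches, where the first element names the constructor):

```python
# Structural equivalence: same basic type, or same constructor applied
# to structurally equivalent parts, checked recursively.

def sequiv(s, t):
    if not isinstance(s, tuple) or not isinstance(t, tuple):
        return s == t                        # basic types / atoms
    if s[0] != t[0] or len(s) != len(t):
        return False                         # different constructors
    return all(sequiv(a, b) for a, b in zip(s[1:], t[1:]))

a = ('array', (1, 10), 'integer')
b = ('array', (1, 10), 'integer')
c = ('pointer', 'integer')
print(sequiv(a, b))   # -> True
print(sequiv(a, c))   # -> False
```

Because the encoding makes the constructor explicit, one comparison of the first element replaces the chain of constructor tests in the pseudocode.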
BOTTOM-UP EVALUATION OF S-ATTRIBUTED DEFINITIONS
An S-attributed definition can be evaluated by a bottom-up parser as the input is being parsed. The parser keeps the values of the synthesized attributes associated with the grammar symbols on its stack.
When a reduction is made, the values of the new synthesized attributes are computed from the attributes appearing on the stack for the grammar symbols on the right side of the reducing production.
Basic idea
1) A translator for an S-attributed definition can often be implemented with an LR parser, e.g., for L -> E n { print(E.val) }.
2) The stack is used to hold information about sub-trees that have been parsed.
3) We can use extra fields in the parser stack to hold the values of synthesized attributes, so that each stack entry pairs a grammar symbol with its attribute value:
symbol   val
...      ...
X        X.val
Y        Y.val
...      ...
When the parser reduces, it pops the entries for the body symbols, evaluates the synthesized attributes of the head from their val fields, and pushes the head with its new value.
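A sketch of this mechanism in Python (added for illustration; the shift/reduce sequence is hand-simulated for the input 3*5+4 rather than driven by real LR tables):

```python
# The parser stack holds (symbol, val) pairs. On a reduction, the entries
# for the body are popped and the head's synthesized val is computed from
# their val fields, as an LR parser would do for E -> E + T etc.

stack = []   # entries are (symbol, val)

def shift(symbol, val=None):
    stack.append((symbol, val))

def reduce_by(head, n, rule):
    """Pop n RHS entries, compute the head's synthesized val, push it."""
    vals = [v for _, v in stack[-n:]]
    del stack[-n:]
    stack.append((head, rule(vals)))

# Hand-simulated reduction sequence for 3 * 5 + 4:
shift('digit', 3); reduce_by('F', 1, lambda v: v[0]); reduce_by('T', 1, lambda v: v[0])
shift('*')
shift('digit', 5); reduce_by('F', 1, lambda v: v[0])
reduce_by('T', 3, lambda v: v[0] * v[2])   # T -> T * F
reduce_by('E', 1, lambda v: v[0])
shift('+')
shift('digit', 4); reduce_by('F', 1, lambda v: v[0]); reduce_by('T', 1, lambda v: v[0])
reduce_by('E', 3, lambda v: v[0] + v[2])   # E -> E + T
print(stack[-1])   # -> ('E', 19)
```

At the final reduction L -> E n, the val on top of the stack (19) would be printed by the semantic action print(E.val).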
--------------------------------------------------------------------------------------------------------------------
RUN-TIME ENVIRONMENTS
The static source text of a program must be related to the actions that must occur at run time to implement the program. As execution proceeds, the same name in the source text can denote different data objects in the target machine. The allocation and deallocation of data objects is managed by the run-time support package, consisting of routines loaded with the generated target code. The design of the run-time support package is influenced by the semantics of procedures. Each execution of a procedure is referred to as an activation of the procedure. If the procedure is recursive, several of its activations may be alive at the same time. The representation of a data object at run time is determined by its type. Often, elementary data types, such as characters, integers, and reals, can be represented by equivalent data objects in the target machine. However, aggregates, such as arrays, strings, and structures, are usually represented by collections of primitive objects.
Procedures
When a procedure name appears within an executable statement, we say that the
procedure is called at that point. The basic idea is that a procedure call executes the procedure
body. The main program in lines 21-25 calls the procedure readarray at line 23 and then calls
quicksort at line 24. Note that procedure calls can also occur within expressions, as on line 16.
Some of the identifiers appearing in a procedure definition are special, and are called
formal parameters of the procedure. The identifiers m and n on line 12 are formal parameters of
quicksort. Arguments, known as actual parameters may be passed to a called procedure; they
are substituted for the formals in the body. Line 18 is a call of quicksort with actual parameters
i+ 1 and n.
(5) begin
(10) begin …
(11) end;
(14) begin
(16) i := partition(m,n);
(17) quicksort(m,i-1);
(19) end
(20) end;
(21) begin
(23) readarray;
(25) end
Activation Trees
We make the following assumptions about the flow of control among procedures during
the execution of a program:
1. Control flows sequentially; that is, the execution of a program consists of a sequence of steps,
with control being at some specific point in the program at each step.
2. Each execution of a procedure starts at the beginning of the procedure body and eventually
returns control to the point immediately following the place where the procedure was called.
This means the flow of control between procedures can be depicted using trees.
In languages like Pascal, each time control enters a procedure q from procedure p, it
eventually returns to p (in the absence of a fatal error). More precisely, each time control flows
from an activation of a procedure p to an activation of a procedure q, it returns to the same
activation of p.
If a and b are procedure activations, then their lifetimes are either-non-overlapping or are
nested. That is, if b is entered before a is left, then control must leave b before it leaves a.
This nested property of activation lifetimes can be illustrated by inserting two print
statements in each procedure, one before the first statement of the procedure body and the other
after the last. The first statement prints enter followed by the name of the procedure and the
values of the actual parameters; the last statement prints leave followed by the same information.
One execution of the program shown above with these print statements produced the output
shown below. The lifetime of the activation quicksort (1,9) is the sequence of steps executed
between printing enter quicksort (1,9) and printing leave quicksort (1,9).
A procedure is recursive if a new activation can begin before an earlier activation of the
same procedure has ended.
enter readarray
leave readarray
enter partition(1,9)
leave partition(1,9)
enter quicksort(1,3)
...
leave quicksort(1,3)
enter quicksort(5,9)
...
leave quicksort(5,9)
leave quicksort(1,9)
Execution terminated
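The nesting of activation lifetimes shown in this trace can be reproduced with a small sketch (added for illustration; the shape of the activation tree below is an assumption mimicking the quicksort example, not computed by a real sort):

```python
# Printing 'enter' on procedure entry and 'leave' on exit yields a properly
# nested trace, because activation lifetimes are either disjoint or nested.

trace = []

def activate(name, children=()):
    trace.append('enter ' + name)
    for c in children:
        activate(*c)              # each child activation runs to completion
    trace.append('leave ' + name)

# Activation tree: s calls r, then q(1,9); q(1,9) calls p(1,9), q(1,3), q(5,9).
activate('s', [('r',),
               ('q(1,9)', [('p(1,9)',),
                           ('q(1,3)',),
                           ('q(5,9)',)])])

print(trace[0], '...', trace[-1])   # -> enter s ... leave s
```

Every 'leave' matches the most recent unmatched 'enter', which is exactly the last-in first-out behaviour that justifies using a control stack.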
A recursive procedure p need not call itself directly; p may call another procedure q, which may then call p through some sequence of procedure calls. We can use a tree, called an activation tree, to depict the way control enters and leaves activations. In an activation tree,
1. each node represents an activation of a procedure,
2. the root represents the activation of the main program,
3. the node for a is the parent of the node for b if and only if control flows from activation a to b, and
4. the node for a is to the left of the node for b if and only if the lifetime of a occurs before the lifetime of b.
Control Stacks
The flow of control in a program corresponds to a depth-first traversal of the activation tree
that starts at the root, visits a node before its children, and recursively visits children at each node
in a left-to-right order. The output in Fig. 2.2 can therefore be reconstructed by traversing the
activation tree in Fig. 2.3, printing enter when the node for an activation is reached for the first time and printing leave after the entire subtree of the node has been visited during the traversal.
We can use a stack, called a control stack, to keep track of live procedure activations. The idea is to push the node for an activation onto the control stack as the activation begins and to pop
the node when the activation ends. Then the contents of the control stack are related to paths to
the root of the activation tree. When node n is at the top of the control stack, the stack contains
the nodes along the path from n to the root.
Example. Figure 2.4 shows nodes from the activation tree of Fig.2.3 that have been reached
when control enters the activation represented by q(2,3) . Activations with labels r, p(1,9), p(1,3),
and q(1,0) have executed to completion, so the figure contains dashed lines to their nodes. The
solid lines mark the path from q(2,3) to the root.
Fig 2.4. The control stack contains nodes along a path to the root
At this point the control stack contains the following nodes along this path: s, q(1,9), q(1,3), q(2,3), with s at the bottom.
The Scope of a Declaration
The declarations in a language may be explicit, as in the Pascal declaration
var i : integer;
or they may be implicit. For example, any variable name starting with I is assumed to denote an integer in a Fortran program unless otherwise declared. There may be independent declarations of the same name in different parts of a program. The scope rules of a language determine which declaration of a name applies when the name appears in the text of a program. In the Pascal
program in Fig. 2.1, i is declared thrice, on lines 4, 9, and 13, and the uses of the name i in
procedures read array, partition, and quicksort are independent of each other. The declaration on
line 4 applies to the uses of i on line 6. That is, the two occurrences of i on line 6 are in the scope
of the declaration on line 4. The three occurrences of i on lines 16-18 are in the scope of the
declaration of i on line 13.
The portion of the program to which a declaration applies is called the scope of that
declaration. An occurrence of a name in a procedure is said to be local to the procedure if it is in
the scope of a declaration within the procedure; Otherwise, the occurrence is said to be nonlocal.
At compile time, the symbol table can be used to find the declaration that applies to an
occurrence of a name. When a declaration is seen, a symbol table entry is created for it. As long
as we are in the scope of the declaration; its entry is returned when the name in it is looked up.
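A chained symbol table that implements this behaviour can be sketched in Python (an illustrative addition; the class and the line-number tags in the example are assumptions):

```python
# Entering a scope pushes a new table; lookup searches innermost-out, so
# the declaration that applies to a use of a name is always the one in the
# nearest enclosing scope.

class SymbolTable:
    def __init__(self):
        self.scopes = [{}]                    # outermost (global) scope

    def enter_scope(self):
        self.scopes.append({})                # new declarations go here

    def exit_scope(self):
        self.scopes.pop()                     # its entries become invisible

    def declare(self, name, info):
        self.scopes[-1][name] = info

    def lookup(self, name):
        for scope in reversed(self.scopes):   # innermost scope first
            if name in scope:
                return scope[name]
        return None

st = SymbolTable()
st.declare('i', 'integer@line4')
st.enter_scope()                 # e.g. entering a nested procedure
st.declare('i', 'integer@line13')
print(st.lookup('i'))            # -> integer@line13  (local declaration wins)
st.exit_scope()
print(st.lookup('i'))            # -> integer@line4   (outer one applies again)
```

This matches the behaviour described above: while we are in the scope of a declaration, its entry is the one returned by lookup.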
Bindings of Names
Even if each name is declared once in a program, the same name may denote different
data objects at run time. The informal term "data object" corresponds to a storage location that
can hold values.
In programming language semantics, the term environment refers to a function that maps a name to a storage location, and the term state refers to a function that maps a storage location to the value held there, as in Fig. 2.5. An environment maps a name to an l-value, and a state maps the l-value to an r-value.
Environments and states are different; an assignment changes the state, but not the environment. For example, suppose that storage address 100, associated with variable pi, holds 0. After the assignment pi := 3.14, the same storage address is still associated with pi, but the value held there is 3.14.
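The two mappings can be sketched directly (an illustration added to these notes; address 100 is taken from the example above):

```python
# environment: name -> l-value (storage location)
# state:       l-value -> r-value (the value held there)

environment = {'pi': 100}
state       = {100: 0}

# The assignment pi := 3.14 changes the state only:
state[environment['pi']] = 3.14

print(environment['pi'])   # -> 100   (binding unchanged by the assignment)
print(state[100])          # -> 3.14
```

A binding, by contrast, changes the environment: a fresh activation of a procedure would map the local name to a new location, leaving old locations (and their values) untouched.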
A binding is the dynamic counterpart of a declaration, as shown in Fig. 2.6. More than one activation of a recursive procedure can be alive at the same time. In Pascal, a local variable name in a procedure is bound to a different storage location in each activation of the procedure.
---------------------------------------------------------------------------------------------------
STORAGE ORGANIZATION
The organization of run-time storage described in this section can be used for languages such as Fortran, Pascal, and C.
Suppose that the compiler obtains a block of storage from the operating system for the compiled program to run in. From the discussion in the last section, this run-time storage might be subdivided to hold:
1. the generated target code,
2. data objects, and
3. a counterpart of the control stack to keep track of procedure activations.
Implementations of languages like Pascal and C use extensions of the control stack to
manage activations of procedures. When a call occurs, execution of an activation is interrupted
and information about the status of the machine, such as the value of the program counter and
machine registers, is saved on the stack. When control returns from the call, this activation can
be restarted after restoring the values of relevant registers and setting the program counter to the
point immediately after the call. Data objects whose life times are contained in that of activation
can be allocated on the stack, along with other information associated with the activation.
A separate area of run-time memory, called a heap, holds all other information. The sizes of the stack and the heap can change as the program executes, so
we show these at opposite ends of memory in Fig. 2.7, where they can grow toward each other as
needed. Pascal and C need both a run-time stack and heap, but not all languages do.
Activation Records
Information needed by a single execution of a procedure is managed using a contiguous block of storage called an activation record, consisting of the collection of fields shown in Fig. 2.8. The purpose of the fields of an activation record is as follows, starting from the field for temporaries.
1. Temporary values, such as those arising in the evaluation of expressions, are stored in
the field for temporaries.
2. The field for local data holds data that is local to an execution of a procedure.
3. The field for saved machine status holds information about the state of the machine
just before, the procedure is called. This formation includes the values of the program counter
and machine registers that have to be restored when control returns from the procedure.
4. The optional access link refers to nonlocal data held in other activation records.
5. The optional control link points to the activation record of the caller.
6. The field for actual parameters is used by the calling procedure to supply parameters to the called procedure.
7. The field for the returned value is used by the called procedure to return a value to the calling procedure.
Activation Record
The storage layout for data objects is strongly influenced by the addressing constraints of the target machine. For example, instructions to add integers may expect the integers to be aligned, that is, placed at certain positions in memory, such as an address divisible by 4. Although an array of ten characters needs only enough bytes to hold ten characters, a compiler may therefore allocate 12 bytes, leaving 2 bytes unused. Space left unused due to alignment considerations is referred to as padding. When space is at a premium, a compiler may pack data so that no padding is left; additional instructions may then need to be executed at run time to position packed data so that it can be operated on as if it were properly aligned.
Example: Figure 2.9 is a simplification of the data layout used by C compilers for two machines that we call Machine 1 and Machine 2.
The memory of Machine 1 is organized into bytes consisting of 8 bits each. In Machine 2, each word consists of 64 bits, and 24 bits are allowed for the address of a word.
The three standard storage-allocation strategies are:
1. Static allocation lays out storage for all data objects at compile time.
2. Stack allocation manages the run-time storage as a stack.
3. Heap allocation allocates and deallocates storage as needed at run time from a data area
known as a heap.
Static Allocation
In static allocation, names are bound to storage as the program is compiled, so there is no
need for a run time support package. Since the bindings do not change at run time, every time a
procedure is activated, its names are bound to the same storage locations. When control returns
to a procedure, the values of the locals are the same as they were when control left the last time.
From the type of a name, the compiler determines the amount of storage. The address of this
storage consists of an offset from an end of the activation record for the procedure. The compiler
must eventually decide where the activation records go. Once this decision is made, the position
of each activation record and the storage for each name in the record is fixed. At compile time
we can fill in the addresses at which the target code can find the data it operates on. Similarly,
the addresses at which information is to be saved when a procedure call occurs are also known at
compile time.
However, static allocation has the following limitations:
1. The size of a data object and constraints on its position in memory must be known at compile time.
2. Recursive procedures are restricted, because all activations of a procedure use the same
bindings for local names.
3. Data structures cannot be created dynamically, since there is no mechanism for storage
allocation at run time.
Fortran was designed to permit static storage allocation. A Fortran program consists of a main program, subroutines, and functions, as in Fig. 2.10. Using the memory organization of Fig. 2.7, the layout of the code and the activation records is shown in Fig. 2.11.
A Fortran Program
Within the activation record for CNSUME, there is space for the locals BUF, NEXT, and C.
The storage bound to BUF holds a string of fifty characters. It is followed by space for holding
an integer value for NEXT and a character value for C. The fact that NEXT is also declared in
PRDUCE presents no problem, because the locals of the two procedures get space in their
respective activation records.
STACK ALLOCATION :
Stack allocation is based on the idea of a control stack; storage is organized as a stack,
and activation records are pushed and popped as activations begin and end, respectively. Storage
for the locals in each call of a procedure is contained in the activation record for that call. Thus
locals are bound to fresh storage in each activation, because a new activation record is pushed
onto the stack when a call is made. The values of locals are deleted when the activation ends;
because the storage for locals disappears when the activation record is popped.
We first describe a form of stack allocation in which the sizes of all activation records are
known at compile time. Situations in which incomplete information about sizes is available at
compile time are considered below.
Suppose that register top marks the top of the stack. At run time, an activation record can be allocated and deallocated by incrementing and decrementing top, respectively, by the size of the record. If procedure q has an activation record of size a, then top is incremented by a just before the target code of q is executed. When control returns from q, top is decremented by a.
Fig 2.12 Downward growing stack allocation of activation records
The above figure shows the activation records that are pushed and popped from the run-
time stack.
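This push/pop discipline can be sketched in Python (an illustration added for these notes; the procedure names and record sizes are made-up numbers mimicking Fig. 2.12):

```python
# Allocation of activation records on a run-time stack: 'top' is bumped by
# the record size on a call and decremented by the same amount on return.

class RuntimeStack:
    def __init__(self):
        self.top = 0
        self.records = []            # (procedure, size) of live activations

    def call(self, proc, size):
        self.records.append((proc, size))
        self.top += size             # top incremented by a before q runs

    def ret(self):
        proc, size = self.records.pop()
        self.top -= size             # top decremented by a on return

rt = RuntimeStack()
rt.call('s', 40)                     # activation of s
rt.call('r', 24); rt.ret()           # r pushed, then popped: only s remains
rt.call('q(1,9)', 32)                # q(1,9) allocated on top of s
print(rt.top, [p for p, _ in rt.records])   # -> 72 ['s', 'q(1,9)']
```

Because r's record was popped before q(1,9) was pushed, q's record reuses the same storage r occupied, which a heap allocator would not guarantee.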
Execution begins with an activation of procedure s. When control reaches the first call in
the body of s, procedure r is activated and its activation record is pushed onto the stack. When
control returns from this activation, the record is popped leaving just the record for s in the stack.
In the activation of s, control then reaches a call of q with actuals 1 and 9, and an activation
record is allocated on top of the stack for an activation of q.
Calling Sequences
Procedure calls are implemented by generating what are known as calling sequences in
the target code. A call sequence allocates an activation record and enters information into its
fields. A return sequence restores the state of the machine so the calling procedure can continue
execution.
A principle that aids the design of calling sequences and activation records is that fields
whose sizes are fixed are placed in the middle. For example, the control link, access link and
machine status fields appear in the middle.
Division of tasks between caller and callee
Since each call has its own actual parameters, the caller usually evaluates
actual parameters and communicates them to the activation record of the callee. In the run-time
stack, the activation record of the caller is just below that for the callee, as in Fig. 2.13. The
caller can access the fields of the callee using offsets from the end of its own activation record,
without knowing the complete layout of the record for the callee.
As in Fig. 2.13, register top_sp points to the end of the machine status field in an
activation record. This position is known to the caller, so it can be made responsible for setting
top_sp before control flows to the called procedure.
The code for the callee can access its temporaries and local data using offsets from top_sp. The call sequence is:
1. The caller evaluates the actual parameters.
2. The caller stores a return address and the old value of top_sp into the callee's activation record. The caller then increments top_sp to the position shown in Fig. 2.13. That is, top_sp is moved past the caller's local data and temporaries and the callee's parameter and status fields.
3. The callee saves the register values and other status information.
4. The callee initializes its local data and begins execution.
A corresponding return sequence is:
1. The callee places a return value next to the activation record of the caller.
2. Using the information in the status field, the callee restores top_sp and other registers
and branches to a return address in the caller's code.
3. Although top_sp has been decremented, the caller can copy the returned value into its
own activation record and use it to evaluate an expression.
The above calling sequences allow the number of arguments of the called procedure to
depend on the call. Note that, at compile time, the target code of the caller knows the number of
arguments it is supplying to the callee. Hence the caller knows the size of the parameter field.
However, the target code of the callee must be prepared to handle other calls as well, so it waits
until it is called, and then examines the parameter field.
Variable-Length Data
A common strategy for handling variable-length data is suggested in Fig. 2.14, where
procedure p has three local arrays. The storage for these arrays is not part of the activation record
for p; only a pointer to the beginning of each array appears in the activation record. The relative
addresses of these pointers are known at compile time, so the target code can access array
elements through the pointers.
Also shown in Fig. 2.14 is a procedure q called by p. The activation record for q begins
after the arrays of p, and the variable-length arrays of q begin beyond that.
Access to data on the stack is through two pointers, top and top_sp. The first of these
marks the actual top of the stack; it points to the position at which the next activation record will
begin. The second is used to find local data. Within the machine-status field is a control link to the previous value of top_sp, saved when control was in the calling activation of p.
The code to reposition top and top_sp can be generated at compile time, using the sizes of
the fields in the activation records. When q returns, the new value of top is top_sp minus the
length of the machine-status and parameter fields in q's activation record. This length is known at
compile time, at least to the caller. After adjusting top, the new value of top_sp can be copied
from the control link of q.
Dangling References:
Example: Procedure dangle in the C program shown below returns a pointer to the storage bound to the local name i. The pointer is created by the operator & applied to i. When control returns to main from dangle, the storage for locals is freed and can be used for other purposes. Since p in main refers to this storage, the use of p is a dangling reference.
int *dangle()
{
    int i = 23;
    return &i;      /* address of a local: dangling after return */
}
main()
{
    int *p;
    p = dangle();   /* p now refers to deallocated stack storage */
}
HEAP ALLOCATION
The stack allocation strategy discussed above cannot be used if either of the following is
possible.
1. The values of local names must be retained when an activation ends.
2. A called activation outlives the caller. This possibility cannot occur for those
languages where activation trees correctly depict the flow of control between procedures.
In each of the above cases, the deallocation of activation records need not occur in a last-
in first-out fashion, so storage cannot be organized as a stack.
Heap allocation parcels out pieces of contiguous storage, as needed for activation records
or other objects. Pieces may be deallocated in any order, so over time the heap will consist of
alternate areas that are free and in use.
The difference between heap and stack allocation of activation records can be seen from
Fig. 2.15 and 2.12. In Fig.2.15, the record for an activation of procedure r is retained when the
activation ends. The record for the new activation q(1,9) therefore cannot follow that for s
physically, as it did in Fig.2.12. Now if the retained activation record for r is deallocated, there
will be free space in the heap between the activation records for s and q(1,9). It is left to the
heap manager to make use of this space.
There is generally some time and space overhead associated with using a heap manager.
For efficiency reasons, it may be helpful to handle small activation records or records of a
predictable size as a special case, as follows:
1. For each size of interest, keep a linked list of free blocks of that size.
2. If possible, fill a request for size s with a block of size s', where s' is the smallest size
greater than or equal to s. When the block is eventually deallocated, it is returned to the linked
list it came from.
3. For large blocks of storage use the heap manager.
This approach results in fast allocation and deallocation of small amounts of storage,
since taking and returning a block from a linked list are efficient operations. For large amounts
of storage we expect the computation to take some time to use up the storage.
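The three rules above can be sketched as follows. The size classes, the list representation (a next pointer stored inside the free block itself), and the function names are all assumptions for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One free list per "size of interest"; larger requests fall
 * through to the general heap manager (malloc here). */
#define NSIZES 4
static void *free_list[NSIZES];
static size_t class_size[NSIZES] = {16, 32, 64, 128};

static int size_class(size_t s) {   /* smallest class >= s, or -1 */
    for (int i = 0; i < NSIZES; i++)
        if (s <= class_size[i]) return i;
    return -1;
}

void *alloc_block(size_t s) {
    int c = size_class(s);
    if (c < 0) return malloc(s);        /* large: heap manager      */
    if (free_list[c]) {                 /* reuse a cached block     */
        void *b = free_list[c];
        free_list[c] = *(void **)b;     /* pop: next ptr in block   */
        return b;
    }
    return malloc(class_size[c]);       /* list empty: fresh block  */
}

void free_block(void *b, size_t s) {
    int c = size_class(s);
    if (c < 0) { free(b); return; }
    *(void **)b = free_list[c];         /* push onto class's list   */
    free_list[c] = b;
}
```

Popping and pushing a list head are constant-time, which is the source of the fast allocation and deallocation claimed above.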
free free
---------------------------------------------------------------------------------------------------------------------
PARAMETER PASSING
The communication medium among procedures is known as parameter passing. The values of
the variables from a calling procedure are transferred to the called procedure by some
mechanism. Before moving ahead, first go through some basic terminologies pertaining to the
values in a program.
r-value
The value of an expression is called its r-value. The value contained in a single variable also
becomes an r-value if it appears on the right-hand side of the assignment operator. r-values can
always be assigned to some other variable.
l-value
The location of memory (address) where an expression is stored is known as the l-value of that
expression. An l-value appears on the left-hand side of an assignment operator.
Example: 7 = x + y;
This is an l-value error, as the constant 7 does not represent any memory location.
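A small C sketch of the distinction (the function name lvalue_demo is invented for illustration):

```c
#include <assert.h>

/* Demonstrates l-values vs r-values in C. */
int lvalue_demo(void) {
    int x = 2, y = 3;
    int z = x + y;   /* x + y is an r-value: a value with no address */
    int *p = &z;     /* z is an l-value, so its address can be taken */
    *p = 7;          /* assigning through the l-value updates z      */
    /* 7 = x + y; would be rejected: 7 is not an l-value */
    return z;
}
```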
Types of Parameters
Formal Parameters
Variables that take the information passed by the caller procedure are called formal parameters.
These variables are declared in the definition of the called function.
Actual Parameters
Variables whose values or addresses are being passed to the called procedure are called actual
parameters. These variables are specified in the function call as arguments.
Example:

#include <stdio.h>

void fun_two(int formal_parameter);

void fun_one(void)
{
    int actual_parameter = 10;
    fun_two(actual_parameter);       /* actual parameter in the call       */
}

void fun_two(int formal_parameter)   /* formal parameter in the definition */
{
    printf("%d\n", formal_parameter);
}
Formal parameters hold the information of the actual parameter, depending upon the parameter
passing technique used. It may be a value or an address.
Pass by Value
In pass by value mechanism, the calling procedure passes the r-value of actual parameters and
the compiler puts that into the called procedure’s activation record. Formal parameters then hold
the values passed by the calling procedure. If the values held by the formal parameters are
changed, it should have no impact on the actual parameters.
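A minimal C sketch of this behavior (the function names are invented for illustration):

```c
#include <assert.h>

/* The callee receives a copy of the actual parameter; changes to
 * the formal parameter never reach the caller. */
void by_value(int formal) {
    formal = 99;             /* modifies only the local copy */
}
```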
Pass by Reference
In pass by reference mechanism, the l-value of the actual parameter is copied to the activation
record of the called procedure. This way, the called procedure now has the address (memory
location) of the actual parameter and the formal parameter refers to the same memory location.
Therefore, if the value pointed by the formal parameter is changed, the impact should be seen on
the actual parameter as they should also point to the same value.
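C has no built-in pass-by-reference, but passing the l-value (the address) gives the same effect; a minimal sketch with invented names:

```c
#include <assert.h>

/* The formal parameter and the actual parameter now name the
 * same memory location, so writes are visible to the caller. */
void by_reference(int *formal) {
    *formal = 99;            /* writes through to the caller's variable */
}
```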
Pass by Copy-restore
This parameter passing mechanism works similarly to pass-by-reference, except that the changes
to actual parameters take effect only when the called procedure ends. Upon function call, the
values of actual parameters are copied into the activation record of the called procedure.
Manipulating the formal parameters has no immediate effect on the actual parameters (only the
copies change), but when the called procedure ends, the values of the formal parameters are
copied back to the l-values of the actual parameters.
Example (pseudocode, since C itself has no copy-restore):

int y;
calling_procedure()
{
    y = 10;
    copy_restore(y);   /* l-value of y is passed */
    print y;           /* prints 99 */
}
copy_restore(int x)
{
    x = 99;            /* y still has value 10 (unaffected) */
    y = 0;             /* y is now 0 */
}
When this function ends, the l-value of formal parameter x is copied to the actual parameter y.
Even if the value of y is changed before the procedure ends, the l-value of x is copied to the l-
value of y making it behave like call by reference.
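Since C itself passes by value, the copy-in/copy-out steps of the example above can be written out by hand (the helper name copy_restore_sim is invented for illustration):

```c
#include <assert.h>

int y;   /* the global from the notes' example */

/* Copy-restore by hand: copy the actual in on entry, work on the
 * copy, then copy the result back out on exit. */
void copy_restore_sim(int *actual) {
    int x = *actual;   /* copy-in                              */
    x = 99;            /* y is still 10: only the copy changes */
    y = 0;             /* direct write to the global           */
    *actual = x;       /* copy-out: overwrites y with 99       */
}
```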
Pass by Name
Languages like Algol provide a new kind of parameter passing mechanism that works like
preprocessor in C language. In pass by name mechanism, the name of the procedure being called
is replaced by its actual body. Pass-by-name textually substitutes the argument expressions in a
procedure call for the corresponding parameters in the body of the procedure so that it can now
work on actual parameters, much like pass-by-reference.
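A C macro imitates this textual substitution; a minimal sketch (the macro and function names are invented for illustration):

```c
#include <assert.h>

/* Like pass-by-name, the argument expression is substituted
 * textually and re-evaluated at every use of the parameter. */
#define SQUARE(e) ((e) * (e))

int square_demo(void) {
    int x = 3;
    int r = SQUARE(x + 1);   /* expands to ((x + 1) * (x + 1)) */
    return r;                /* uses the current value of x    */
}
```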
Advantage of call-by-value
1. No aliasing.
2. Arguments are unchanged by the procedure call.
3. Easier static optimization analysis, for both programmers and the compiler.
Example:
x = 0;
Y(x); /* call-by-value */
z = x+1; /* can be replaced by z=1 for optimization */
Compared with call-by-reference, code in the called function is faster because there is no need to
dereference pointers.
Advantage of call-by-reference
1. Efficiency in passing large objects.
2. Only need to copy addresses.
Advantage of call-by-value-result
1. More efficient than call-by-value for small objects.
2. If there is no aliasing, can implement call-by-value-result using call-by-reference for
large objects.
Advantage of call-by-name
1. More efficient when passing parameters that are never used.
Example:
P(Ackerman(5), 0, 3)
/* Ackerman's function takes enormous time to compute */
function P(int a, int b, int c)
{
    if (odd(c)) {
        return a;
    } else {
        return b;
    }
}
Note: if the condition is false, then using call-by-name it is never necessary to evaluate the first
actual at all. This saves a lot of time, because evaluating a takes a long time.
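The same saving can be sketched in C by passing a thunk, a function that is evaluated only when the parameter is actually used; the names expensive and calls below are invented stand-ins for Ackerman(5):

```c
#include <assert.h>

int calls;                       /* counts evaluations of the costly argument */

int expensive(void) {            /* stands in for Ackerman(5) */
    calls++;
    return 42;
}

/* Pass-by-name simulated with a thunk: the first argument is a
 * function that runs only if the parameter is actually used. */
int P(int (*a)(void), int b, int c) {
    if (c % 2 != 0)
        return a();              /* odd(c): the first actual is evaluated */
    return b;                    /* even: a is never evaluated            */
}
```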
---------------------------------------------------------------------------------------------------------------------
SYMBOL TABLES
Symbol table is an important data structure created and maintained by compilers in order to store
information about the occurrence of various entities such as variable names, function names,
objects, classes, interfaces, etc. Symbol table is used by both the analysis and the synthesis parts
of a compiler.
A symbol table may serve the following purposes depending upon the language in hand:
• To store the names of all entities in a structured form at one place.
• To verify if a variable has been declared.
• To implement type checking, by verifying that assignments and expressions are semantically
correct.
• To determine the scope of a name (scope resolution).
A symbol table is simply a table, which can be either linear or a hash table. It maintains an entry
for each name in the following format:
<symbol name, type, attribute>
Operations
A symbol table, either linear or hash, should provide the following operations.
1. insert()
This operation is used more frequently by the analysis phase, i.e., the first half of the compiler, where
tokens are identified and names are stored in the table. This operation is used to add information
in the symbol table about unique names occurring in the source code. The format or structure in
which the names are stored depends upon the compiler in hand.
For example: int a;
should be processed by the compiler as: insert(a, int);
2. lookup()
The lookup() operation is used to search a name in the symbol table to determine:
• if the symbol exists in the table,
• if it is declared before it is being used,
• if the name is used in the scope,
• if the symbol is initialized,
• if the symbol is declared multiple times.
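A minimal sketch of both operations over a linear table (the names insert_sym and lookup_sym, and the fixed table size, are assumptions for illustration; a real compiler would likely hash and check for duplicates):

```c
#include <assert.h>
#include <string.h>

/* insert_sym() records a name with its type; lookup_sym() returns
 * the stored type, or NULL if the name was never declared. */
#define MAXSYMS 64
struct symbol { const char *name; const char *type; };
static struct symbol table[MAXSYMS];
static int nsyms;

void insert_sym(const char *name, const char *type) {
    table[nsyms].name = name;    /* no duplicate check in this sketch */
    table[nsyms].type = type;
    nsyms++;
}

const char *lookup_sym(const char *name) {
    for (int i = 0; i < nsyms; i++)      /* linear search */
        if (strcmp(table[i].name, name) == 0)
            return table[i].type;
    return 0;                            /* name not declared */
}
```

For "int a;" the compiler would perform insert_sym("a", "int"), and a later use of a would be checked with lookup_sym("a").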
On procedure calls,
• the calling procedure:
o may save some registers (in its own activation record),
o may set the optional access link (push it onto the stack),
o pushes parameters onto the stack,
o jumps and links: jumps to the first instruction of the called procedure and puts the address of
the next instruction into register RA.
• the called procedure:
o pushes the return address in RA,
o pushes the old FP (control link),
o sets the new FP to the old SP,
o sets the new SP to the old SP + (size of parameters) + (size of RA) + (size of FP); these sizes
are computed at compile time,
o may save some registers,
o pushes local data (maybe the actual data if initialized, or maybe just their sizes added to SP).
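The called procedure's pointer arithmetic can be sketched with plain integers; the word counts below are assumptions made only for illustration:

```c
#include <assert.h>

/* Hypothetical field sizes (in words), assumed for illustration. */
enum { PARAM_WORDS = 2, RA_WORDS = 1, FP_WORDS = 1 };

int sp, fp;   /* stack pointer and frame pointer, as plain indices */

/* Callee side of the calling sequence: the new FP is the old SP,
 * and SP advances past the parameters, the return address, and the
 * saved FP. All the sizes are known at compile time. */
void callee_prologue(void) {
    int old_sp = sp;
    fp = old_sp;                                     /* new FP = old SP */
    sp = old_sp + PARAM_WORDS + RA_WORDS + FP_WORDS;
}
```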
---------------------------------------------------------------------------------------------------------------------