
RkCD-Chapter 6 - Intermediate Code Generation

Uploaded by

tilayefkadie
Copyright © All Rights Reserved

Chapter - 6

Intermediate Code Generation


Intermediate Language:
There are three kinds of intermediate representation:
1. Syntax trees
2. Postfix notation
3. Three-address code
1. Syntax trees:
A syntax tree depicts the natural hierarchical structure of a source program. The
syntax tree for the assignment statement a := b * -c + b * -c has assign at the root, with a as its
left child and a + node as its right child. Each operand of + is a * node whose children are b and a
uminus node applied to c.

2. Postfix notation
• Postfix notation is a linearized representation of a syntax tree: it is a list of the nodes of the tree in
which a node appears immediately after its children.
• The postfix notation for the syntax tree of a := b * -c + b * -c is:
a b c uminus * b c uminus * + assign
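The rule "a node appears immediately after its children" is a post-order walk of the tree. The following is a minimal sketch, assuming a hypothetical encoding in which an interior node is a tuple (operator, child, ...) and a leaf is a plain string; the function name postfix is illustrative and not taken from the chapter.

```python
# Hypothetical tuple encoding of the syntax tree for a := b * -c + b * -c.
tree = ("assign", "a",
        ("+",
         ("*", "b", ("uminus", "c")),
         ("*", "b", ("uminus", "c"))))

def postfix(node):
    """List the nodes so that every node comes right after its children."""
    if isinstance(node, str):      # leaf: an identifier
        return [node]
    op, *children = node
    out = []
    for child in children:         # children first, left to right
        out.extend(postfix(child))
    out.append(op)                 # then the operator itself
    return out

print(" ".join(postfix(tree)))     # a b c uminus * b c uminus * + assign
```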

3. Three-Address Code

• Three-address code is a sequence of statements of the general form x := y op z, where x, y, and z are names,
constants, or compiler-generated temporaries.
• op stands for any operator, such as a fixed- or floating-point arithmetic operator, or a logical operator
on Boolean-valued data.
• The reason for the term "three-address code" is that each statement usually contains three addresses,
two for the operands and one for the result.
Example:
Three-address code for the syntax tree of a := b * -c + b * -c is:
t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
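One way such code could be produced is a bottom-up walk that gives every interior node a fresh temporary. This is only a sketch under the same assumed tuple encoding of the syntax tree; gen_tac and the temporary names t1, t2, ... are illustrative choices.

```python
import itertools

# Hypothetical tuple encoding of the syntax tree for a := b * -c + b * -c.
tree = ("assign", "a",
        ("+",
         ("*", "b", ("uminus", "c")),
         ("*", "b", ("uminus", "c"))))

def gen_tac(assign_node):
    """Return the three-address statements for an assignment tree."""
    code = []
    counter = itertools.count(1)

    def newtemp():
        return f"t{next(counter)}"

    def walk(node):
        """Emit code for node and return the name holding its value."""
        if isinstance(node, str):
            return node
        op, *kids = node
        places = [walk(k) for k in kids]   # operands are evaluated first
        t = newtemp()
        if op == "uminus":
            code.append(f"{t} := -{places[0]}")
        else:
            code.append(f"{t} := {places[0]} {op} {places[1]}")
        return t

    _, target, expr = assign_node
    code.append(f"{target} := {walk(expr)}")
    return code

print("\n".join(gen_tac(tree)))
```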

Three-Address Code Rules:


A three-address statement is an abstract form of intermediate code. In a compiler, these statements
can be represented in three ways:
1. Quadruples
2. Triples
3. Indirect triples
1. Quadruples
• A quadruple is a record structure with four fields, which we call op, arg1, arg2, and result. The
op field contains an internal code for the operator.
• The three-address statement x := y op z is represented by placing y in arg1, z in arg2, and x in
result.
• Statements with unary operators like x := -y or x := y do not use arg2.
• Operators like param use neither arg2 nor result.
• Conditional and unconditional jumps put the target label in result.

Quadruples for the statement a := b * -c + b * -c:

      op      arg1   arg2   result
(0)   uminus  c             t1
(1)   *       b      t1     t2
(2)   uminus  c             t3
(3)   *       b      t3     t4
(4)   +       t2     t4     t5
(5)   :=      t5            a

• The contents of the fields arg1, arg2, and result are normally pointers to the symbol-table entries for
the names represented by these fields. If so, temporary names must be entered into the symbol
table as they are created.
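The field conventions above can be sketched with a small record type. This is an illustration only: names are stored as plain strings rather than symbol-table pointers, unused fields are None, and the helper to_statement is hypothetical.

```python
from collections import namedtuple

# One record per three-address statement; unused fields are None.
Quad = namedtuple("Quad", ["op", "arg1", "arg2", "result"])

def to_statement(q):
    """Render a quadruple back into the statement it encodes."""
    if q.op == ":=":                 # copy x := y, arg2 unused
        return f"{q.result} := {q.arg1}"
    if q.op == "uminus":             # unary operator, arg2 unused
        return f"{q.result} := -{q.arg1}"
    if q.op == "param":              # uses neither arg2 nor result
        return f"param {q.arg1}"
    if q.op == "goto":               # jump target is kept in result
        return f"goto {q.result}"
    return f"{q.result} := {q.arg1} {q.op} {q.arg2}"

quads = [
    Quad("uminus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("uminus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad(":=", "t5", None, "a"),
]

for i, q in enumerate(quads):
    print(f"({i}) {to_statement(q)}")
```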

2. Triples
• Three-address statements can be represented by records with only three fields: op, arg1 and
arg2. Since three fields are used, this intermediate code format is known as triples.
• The fields arg1 and arg2, for the arguments of op, are either pointers to the symbol table or
pointers into the triple structure.
• To avoid entering temporary names into the symbol table, we might refer to a temporary value
by the position of the statement that computes it.

The triple representation of x[i] := y

The triple representation of x := y[i]
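As a rough sketch of the idea, a triple can be modelled as a record whose arguments are either a name (a string standing in for a symbol-table pointer) or an integer position of an earlier triple. The operator spellings "[]=" and "=[]" for indexed store and indexed load are an assumed notation, as is the helper show; the point is that an indexed assignment takes two triples.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["op", "arg1", "arg2"])

# a := b * -c + b * -c; an int argument points at another triple.
expr = [
    Triple("uminus", "c", None),   # (0)
    Triple("*", "b", 0),           # (1) b * value of (0)
    Triple("uminus", "c", None),   # (2)
    Triple("*", "b", 2),           # (3)
    Triple("+", 1, 3),             # (4)
    Triple("assign", "a", 4),      # (5)
]

# x[i] := y (assumed operator name "[]=")
store_indexed = [
    Triple("[]=", "x", "i"),       # (0) the location x[i]
    Triple("assign", 0, "y"),      # (1) store y there
]

# x := y[i] (assumed operator name "=[]")
load_indexed = [
    Triple("=[]", "y", "i"),       # (0) the value y[i]
    Triple("assign", "x", 0),      # (1) copy it into x
]

def show(triples):
    """Format triples, writing references to other triples as (n)."""
    def fmt(arg):
        if arg is None:
            return ""
        return f"({arg})" if isinstance(arg, int) else arg
    return [f"({i}) {t.op} {fmt(t.arg1)} {fmt(t.arg2)}".rstrip()
            for i, t in enumerate(triples)]

print("\n".join(show(store_indexed)))
```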

3. Indirect triples
Another implementation lists pointers to triples rather than the triples themselves. This
representation is called indirect triples.
Indirect triple representation of a := b * -c + b * -c

Advantage: Indirect triples can save some space compared with quadruples, if the same
temporary value is used more than once.
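A minimal sketch of the indirection, assuming triples are stored as plain tuples and the statement list holds indices into that store (the names statements, reordered and in_order are illustrative):

```python
# Triples for a := b * -c + b * -c; an int argument refers to a triple by position.
triples = [
    ("uminus", "c", None),   # (0)
    ("*", "b", 0),           # (1)
    ("uminus", "c", None),   # (2)
    ("*", "b", 2),           # (3)
    ("+", 1, 3),             # (4)
    ("assign", "a", 4),      # (5)
]

# The statement list holds pointers (here: indices) to triples, in execution order.
statements = [0, 1, 2, 3, 4, 5]

def in_order(statements, triples):
    """Follow the pointers to get the triples in execution order."""
    return [triples[i] for i in statements]

# Reordering touches only the statement list: compute the second -c
# before the first product. The triples themselves are not renumbered.
reordered = [0, 2, 1, 3, 4, 5]

for t in in_order(reordered, triples):
    print(t)
```

With plain triples, moving a statement changes its position, so every triple that refers to it would have to be rewritten; with the extra level of indirection only the statement list changes.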

Assignment Declarations (or) Statements:


• Expressions can be of type integer, real, array, and record.
• Translation scheme to produce three-address code for assignments

• The translation scheme shows how such symbol-table entries can be found. The lexeme for the
name represented by id is given by the attribute id.name.
• The operation lookup(id.name) checks whether there is an entry for this occurrence of the name in the
symbol table. If so, a pointer to the entry is returned; otherwise, lookup returns nil to indicate that
no entry was found.
• E.place - the name that will hold the value of E.
• newtemp generates a new temporary name each time a temporary is needed.
Type Conversions within Assignment
• In practice, there would be many different types of variables and constants, so the compiler must
either reject certain mixed-type operations or generate appropriate type-conversion
instructions.
• Consider the grammar for assignment statements as above. Assume there are two types, real and
integer, with integers converted to real when necessary.
• The semantic rule for E.type associated with the production E → E1 + E2 is:
E.type := if E1.type = integer and E2.type = integer then integer else real

The complete semantic action for a production of the form E → E1 + E2 is
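The original figure is not reproduced here. The following sketch shows the general shape such an action could take, under several assumptions: the symbol table is a plain dictionary from names to types, E.type is read from the entry of E.place, the conversion operator is spelled inttoreal, and typed operators are written with an int or real prefix. None of these names are confirmed by the text above.

```python
import itertools

# Hypothetical symbol table: name -> type.
symtab = {"x": "real", "y": "real", "i": "integer", "j": "integer"}
code = []
_temps = itertools.count(1)

def lookup(name):
    """Return the entry (here: the name) if present, else None for nil."""
    return name if name in symtab else None

def newtemp(typ):
    t = f"t{next(_temps)}"
    symtab[t] = typ              # temporaries are entered into the table
    return t

def to_real(place):
    """Emit a conversion if place holds an integer; return the real place."""
    if symtab[place] == "real":
        return place
    t = newtemp("real")
    code.append(f"{t} := inttoreal {place}")
    return t

def translate(node):
    """Emit code for E and return E.place."""
    if isinstance(node, str):
        if lookup(node) is None:
            raise NameError(f"undeclared name: {node}")
        return node
    op, left, right = node
    p1, p2 = translate(left), translate(right)
    if symtab[p1] == symtab[p2] == "integer":
        typ, prefix = "integer", "int"
    else:                        # mixed or real operands: compute in real
        typ, prefix = "real", "real"
        p1, p2 = to_real(p1), to_real(p2)
    t = newtemp(typ)
    code.append(f"{t} := {p1} {prefix}{op} {p2}")
    return t

def assign(name, expr):
    place = translate(expr)
    code.append(f"{name} := {place}")

assign("x", ("+", "y", ("*", "i", "j")))   # x := y + i * j
print("\n".join(code))
```

For x := y + i * j this prints t1 := i int* j, t2 := inttoreal t1, t3 := y real+ t2, x := t3. The sketch does not check the type of the assignment target.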


Flow Control Statement:
• The following grammar generates the flow-of-control statements if-then, if-then-else, and
while-do:
S → if E then S1 | if E then S1 else S2 | while E do S1

• The syntax-directed definition for flow-of-control statements uses the following:


o newlabel - returns a new symbolic label each time it is called.
o E.true - the label to which control flows if E is true.
o E.false - the label to which control flows if E is false.
o S.next - a label that is attached to the first three-address instruction to be
executed after the code for S.
If-Then
• In translating the if-then statement S → if E then S1, a new label E.true is created and attached to
the first three-address instruction generated for the statement S1, as in Fig. 8.22(a).
• The code for E generates a jump to E.true if E is true and a jump to S.next if E is false. We
therefore set E.false to S.next.

If-then-else
In translating the if-then-else statement S → if E then S1 else S2, the code for the boolean
expression E has jumps out of it to the first instruction of the code for S1 if E is true, and to
the first instruction of the code for S2 if E is false, as illustrated in Fig. 8.22(b).
While-do statement
• In translating S → while E do S1, a new label S.begin is created and attached to the first
instruction generated for E. Another new label E.true is attached to the first instruction for
S1.
• The code for E generates a jump to E.true if E is true and a jump to S.next if E is false, so
E.false is set to S.next. After the code for S1 we place the instruction goto S.begin, which
causes a jump back to the beginning of the code for the boolean expression.
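The three translations can be sketched as one recursive generator. This is a simplified illustration, not the chapter's syntax-directed definition: statements are hypothetical tuples, a boolean expression is limited to a single relational test given as text, and the argument nxt plays the role of S.next.

```python
import itertools

_labels = itertools.count(1)

def newlabel():
    return f"L{next(_labels)}"

def gen_cond(e, true, false):
    """Code for a relational test E: jump to E.true or to E.false."""
    return [f"if {e} goto {true}", f"goto {false}"]

def gen_stmt(s, nxt):
    """Return the code for statement s; nxt plays the role of S.next."""
    kind = s[0]
    if kind == "assign":
        return [s[1]]
    if kind == "if":                       # S -> if E then S1
        _, e, s1 = s
        e_true = newlabel()
        return gen_cond(e, e_true, nxt) + [f"{e_true}:"] + gen_stmt(s1, nxt)
    if kind == "ifelse":                   # S -> if E then S1 else S2
        _, e, s1, s2 = s
        e_true, e_false = newlabel(), newlabel()
        return (gen_cond(e, e_true, e_false)
                + [f"{e_true}:"] + gen_stmt(s1, nxt) + [f"goto {nxt}"]
                + [f"{e_false}:"] + gen_stmt(s2, nxt))
    if kind == "while":                    # S -> while E do S1
        _, e, s1 = s
        begin, e_true = newlabel(), newlabel()
        return ([f"{begin}:"] + gen_cond(e, e_true, nxt)
                + [f"{e_true}:"] + gen_stmt(s1, begin) + [f"goto {begin}"])
    raise ValueError(f"unknown statement kind: {kind}")

program = ("while", "a < b",
           ("ifelse", "c < d",
            ("assign", "x := y + z"),
            ("assign", "x := y - z")))

result = gen_stmt(program, "Lnext")
print("\n".join(result))
```

Note that the body of the while statement is generated with S.begin as its S.next, so control returns to the test once the body finishes.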
Example

Case Statements
The "switch" or "case" statement is available in a variety of languages. Switch-
statement syntax is shown in Fig.

The intended translation of a switch is code to:


1. Evaluate the expression.
2. Find which value in the list of cases is the same as the value of the expression. Recall that the default
value matches the expression if none of the values explicitly mentioned in cases does.
3. Execute the statement associated with the value found.
Syntax-Directed Translation of Case Statements

Translation of case statements

To translate into the form of Fig. 8.27, when we see the keyword switch, we generate two new
labels, test and next, and a new temporary t. Then, as we parse the expression E, we generate code to evaluate
E into t. After processing E, we generate the jump goto test.
Then, as we see each case keyword, we create a new label Li and enter it into the symbol table.
We place on a stack, used only to store cases, a pointer to this symbol-table entry and the value Vi of the
case constant.
We process each statement case Vi : Si by emitting the newly created label Li, followed by the
code for Si, followed by the jump goto next. Then, when the keyword end terminating the body of the
switch is found, we are ready to generate the code for the n-way branch. Reading the pointer-value
pairs on the case stack, we emit after the label test one conditional jump of the form
if t = Vi goto Li per case, followed by a jump to the label of the default statement.
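The procedure described above can be sketched as follows. The function translate_switch, the label names L1, L2, ... and the representation of each case body as a single line of text are assumptions made for illustration.

```python
import itertools

def translate_switch(expr, cases, default=None):
    """cases is a list of (value, statement_text) pairs.

    Returns the three-address code as a list of lines.
    """
    labels = (f"L{i}" for i in itertools.count(1))
    code = [f"t := {expr}", "goto test"]   # evaluate E into t, jump to the tests
    stack = []                             # (value, label) pairs, one per case
    for value, stmt in cases:
        label = next(labels)
        stack.append((value, label))
        code += [f"{label}:", stmt, "goto next"]
    dlabel = None
    if default is not None:
        dlabel = next(labels)
        code += [f"{dlabel}:", default, "goto next"]
    code.append("test:")
    for value, label in stack:             # the n-way branch
        code.append(f"if t = {value} goto {label}")
    if dlabel is not None:
        code.append(f"goto {dlabel}")
    code.append("next:")
    return code

lines = translate_switch("i + 1",
                         [(1, "x := 10"), (2, "x := 20")],
                         default="x := 0")
print("\n".join(lines))
```

Placing the tests after the case bodies lets the compiler emit each body as soon as it is parsed, before the full list of case values is known.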

Backpatching:
• Backpatching is a technique for replacing the symbolic names in goto statements
with the actual target addresses.
• This problem comes up because some languages do not allow symbolic names in branches.
• Backpatching can be used to generate code for Boolean expressions and control statements in one
pass.
• During one pass, the following steps are executed:
1) Construct the syntax tree.
2) Generate jumps with unspecified targets.
3) Keep a list of such statements.
4) Subsequently fill in the labels.
backpatch(p, i) - inserts i as the target label for each of the statements on the list pointed to by p.
Example:
Consider the expression a < b or c < d and e < f. The six
statements generated so far are thus:

The corresponding semantic action calls backpatch({102}, 104), where {102} as an argument
denotes a pointer to the list containing only 102, that list being the one pointed to by E1.truelist. This call
to backpatch fills in 104 in statement 102. The six statements generated so far are thus:
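The original listings are not reproduced here. The sketch below replays the example under stated assumptions: statements are numbered from 100, each relational test emits a conditional jump followed by an unconditional jump, and the helpers makelist, merge, nextquad and emit are named after common textbook usage rather than anything given above. An unfilled target is printed as "_".

```python
BASE = 100
code = []                      # code[k] is statement number BASE + k

def nextquad():
    return BASE + len(code)

def emit(text):
    code.append([text, None])  # jump target not yet known

def makelist(i):
    return [i]

def merge(p1, p2):
    return p1 + p2

def backpatch(p, i):
    """Insert i as the target of every statement on list p."""
    for n in p:
        code[n - BASE][1] = i

def relop(a, op, b):
    """E -> id1 relop id2: returns (truelist, falselist)."""
    truelist = makelist(nextquad())
    emit(f"if {a} {op} {b} goto")
    falselist = makelist(nextquad())
    emit("goto")
    return truelist, falselist

# a < b or c < d and e < f, reduced in parse order
t1, f1 = relop("a", "<", "b")          # statements 100, 101
m1 = nextquad()                        # 102: start of the right operand of or
t2, f2 = relop("c", "<", "d")          # statements 102, 103
m2 = nextquad()                        # 104: start of the right operand of and
t3, f3 = relop("e", "<", "f")          # statements 104, 105

backpatch(t2, m2)                      # and: if c < d holds, go on to test e < f
and_true, and_false = t3, merge(f2, f3)

backpatch(f1, m1)                      # or: if a < b fails, try the right operand
e_true, e_false = merge(t1, and_true), and_false

for k, (text, target) in enumerate(code):
    print(f"{BASE + k}: {text} {'_' if target is None else target}")
```

After both calls, statement 101 jumps to 102 and statement 102 jumps to 104. The jumps on the lists e_true (100 and 104) and e_false (103 and 105) stay unfilled until the enclosing statement supplies their targets.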

Symbol Table:
A compiler uses a symbol table to keep track of scope and binding information about names. The
symbol table is searched every time a name is encountered in the source text. Changes to the table occur
if a new name or new information about an existing name is discovered.
A symbol-table mechanism must allow us to add new entries and find existing entries efficiently.
The two symbol-table mechanisms presented in this section are linear lists and hash tables.
A linear list is the simplest to implement, but its performance is poor when the number of entries
is large.
Hashing schemes provide better performance for somewhat greater programming effort and
space overhead.
Symbol-Table Entries
Each entry in the symbol table is for the declaration of a name. The format of entries does not
have to be uniform, because the information saved about a name depends on the usage of the name.
Each entry can be implemented as a record consisting of a sequence of consecutive words of
memory. To keep symbol-table records uniform, it may be convenient for some of the information about
a name to be kept outside the table entry, with only a pointer to this information stored in the record.
If there is a modest upper bound on the length of a name, then the characters in the name can be
stored in the symbol-table entry, as in Fig. 7.32(a).
If there is no limit on the length of a name, or if the limit is rarely reached, the indirect scheme of
Fig. 7.32(b) can be used.
The data structures used for implementing the symbol table are:
1. Linear list
2. Hash table
1. Linear list
• The simplest and easiest-to-implement data structure for a symbol table is a linear list of
records. New names are added to the list in the order in which they are encountered.
• The position of the end of the array is marked by the pointer available.
• The search for a name proceeds backwards from the end of the array to the beginning. When the
name is located, the associated information can be found in the words following next.
• If we reach the beginning of the array without finding the name, a fault occurs: an expected
name is not in the table.
A successful search for a name in a table of N entries takes N/2 comparisons on average.
Advantages:
It takes less space.
Insertion into the table is simple.
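A minimal sketch of the linear-list scheme, assuming each record is a (name, info) pair kept in a Python list; the class and method names are illustrative, and a failed lookup returns None here instead of raising a fault.

```python
class LinearSymbolTable:
    """Symbol table kept as a list of (name, info) records."""

    def __init__(self):
        self.entries = []      # the end of the list plays the role of `available`

    def insert(self, name, info):
        # New names are appended in the order they are encountered.
        self.entries.append((name, info))

    def lookup(self, name):
        # Search backwards, so the most recent declaration is found first.
        for entry_name, info in reversed(self.entries):
            if entry_name == name:
                return info
        return None            # reached the beginning without a match

table = LinearSymbolTable()
table.insert("x", "integer")
table.insert("y", "real")
table.insert("x", "real")      # a later declaration of x
print(table.lookup("x"))       # real
```

Insertion is a single append, but every lookup may scan the whole list, which is why the cost grows with the number of entries.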
2. Hash Tables:
The basic hashing scheme consists of two parts
1. A hash table consisting of a fixed array of m pointers to table entries.
2. Table entries organized into m separate linked lists, called buckets. Each record in the symbol
table appears on exactly one of these lists.

[Figure: Hash Table. An array of list headers indexed from 0, each pointing to a linked list of (Name, Info) records.]
To enter a name into the symbol table, we compute the hash value of the name by applying a suitable
hash function, which maps the name to an integer between 0 and m-1. Using this value as an index
into the hash table, we search the list of symbol-table records built on that hash
index. If the name is not present in that list, we create a record for the name and insert it at the head of
the list built on that hash index.
The retrieval of the information associated with a name is done as follows.
First, the hash value of the name is obtained. Then the list built on this hash value is searched to
get the information about the name.
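The entry and retrieval steps can be sketched as below. The table size m = 211, the simple multiplicative hash function, and the use of Python lists in place of linked lists are all assumptions chosen for brevity, not details from the text.

```python
class HashSymbolTable:
    """Hash table of m buckets; each bucket is a list of [name, info] records."""

    def __init__(self, m=211):
        self.m = m
        self.buckets = [[] for _ in range(m)]

    def _hash(self, name):
        # Hypothetical hash function: maps a name to an integer in 0..m-1.
        h = 0
        for ch in name:
            h = (h * 31 + ord(ch)) % self.m
        return h

    def enter(self, name, info=None):
        bucket = self.buckets[self._hash(name)]
        for record in bucket:            # search the list for this hash index
            if record[0] == name:
                return record            # already present
        record = [name, info]
        bucket.insert(0, record)         # new record goes at the head
        return record

    def lookup(self, name):
        for record in self.buckets[self._hash(name)]:
            if record[0] == name:
                return record[1]
        return None

table = HashSymbolTable(m=7)
table.enter("count", "integer")
table.enter("rate", "real")
print(table.lookup("rate"))              # real
```

Only the records in one bucket are examined per operation, so with a reasonable hash function the average search cost is roughly the number of entries divided by m.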
