Instruction Selection in Code Generation

The document discusses code generation during compilation. It describes the Aho-Johnson algorithm for unified instruction selection and register allocation. The algorithm generates optimal code for expression trees by selecting instructions from a general machine model that allows operators with any arity and operands in any order. It considers expression trees and generates code with linear complexity without using algebraic properties of operators.

Code Generation: Integrated Instruction Selection and Register Allocation Algorithms

Amitabha Sanyal
(www.cse.iitb.ac.in/~as)

Department of Computer Science and Engineering,

Indian Institute of Technology, Bombay

September 2007
College of Engineering, Pune Code Generation: Instruction Selection: 2/50

Place of Code Generator in a Compiler

Textbook stuff . . .

Front End: Lexical Analyser → Syntax Analyser → Semantic Analyser → Intermediate Code Gen

Back End: Mc. Independent Optimizer → Code Generator → Mc. Dependent Optimizer

Amitabha Sanyal IIT Bombay

Code Generation - Issues

• Expressions and Assignments:
  - Instruction selection. Selection of the best instruction for the computation.
    - The instruction should be able to perform the computation.
    - It should be the fastest of the possible choices.
    - It should combine well with the instructions of its surrounding computations.
  - Register allocation. To hold the results of computations in registers for as long as possible.
    - What computations will be held in registers?
    - In which regions of the program?
• Control Constructs:
  - Lazy evaluation of boolean expressions.
  - Avoiding jump statements to jump statements.
• Procedure Calls:
  - Activation record building:
    - Division of work between caller and callee.
    - Using special instructions for creation and destruction of activation records.
  - Saving and restoring of registers across procedure calls.


Outline of Lecture

• Unified algorithms for instruction selection and code generation.
  - Aho-Johnson Algorithm
  - Optimal code generation for realistic expression and machine models.

Most code generators are variations of this.


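Expression Trees

• Here is the expression a/(b + c) − c ∗ (d + e) represented as a tree:

−
├── /
│   ├── a
│   └── +
│       ├── b
│       └── c
└── ∗
    ├── c
    └── +
        ├── d
        └── e

• We have not identified common sub-expressions; else we would have a directed acyclic graph (DAG), in which the two occurrences of c are a single shared node:

−
├── /
│   ├── a
│   └── +
│       ├── b
│       └── c  (shared)
└── ∗
    ├── c  (the same node as above)
    └── +
        ├── d
        └── e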


Aho-Johnson Algorithm – Introduction

• Considers expression trees.
• The target machine model is general enough to generate code for a large class of machines. Represented as a tree, an instruction
  - can have a root of any arity.
  - can have as leaves registers or memory locations appearing in any order.
  - can be of any height.
• Does not use algebraic properties of operators.
  - If e1 ∗ e2 has to be evaluated using r ← r ∗ m, and
  - e1 and e2 are in m and r respectively,
  then the code sequence has to be:

      m1 ← r
      r ← m
      r ← r ∗ m1

  and not simply: r ← r ∗ m
• Generates optimal code, where, once again, the cost measure is the number of instructions in the code. This measure can be modified.
• Complexity is linear in the size of the expression tree.
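
The point about algebraic properties can be checked with a small simulation. The sketch below is only an illustration under assumptions: the tuple encoding of instructions and the `run` helper are hypothetical, and subtraction stands in for the operator so that swapping the operands visibly changes the result (with ∗ both sequences happen to give the same value, but the algorithm does not rely on that).

```python
import operator

def run(program, regs, mem, op):
    """Execute instructions of the forms ('store', m, r), ('load', r, m), ('op', r, m)."""
    regs, mem = dict(regs), dict(mem)      # work on copies
    for kind, x, y in program:
        if kind == "store":                # memory x <- register y
            mem[x] = regs[y]
        elif kind == "load":               # register x <- memory y
            regs[x] = mem[y]
        else:                              # register x <- x op memory y
            regs[x] = op(regs[x], mem[y])
    return regs

# e1 is in memory location m, e2 is in register r; we want e1 op e2.
regs, mem = {"r": 3}, {"m": 10}            # e1 = 10, e2 = 3 (hypothetical values)
three_step = [("store", "m1", "r"), ("load", "r", "m"), ("op", "r", "m1")]
one_step = [("op", "r", "m")]

print(run(three_step, regs, mem, operator.sub)["r"])  # 7, which is e1 - e2
print(run(one_step, regs, mem, operator.sub)["r"])    # -7, which is e2 - e1
```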

Aho-Johnson Algorithm

Let Θ be a finite set of operators. Then,
1. A single vertex labeled by a variable name or constant is an expression tree.
2. If T1, T2, . . . , Tk are expression trees and θ is a k-ary operator in Θ, then the tree with root θ and subtrees T1, T2, . . . , Tk (in that order) is an expression tree.

An example of an expression tree is

+
├── ind
│   └── +
│       ├── addr_a
│       └── ∗
│           ├── 4
│           └── i
└── ∗
    ├── i
    └── b
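
The inductive definition maps directly onto a small data type. The following is a minimal sketch, not part of the lecture: the names `Node`, `leaf` and `op` are assumptions chosen for illustration, and the tree built at the end is the example tree for a[i] + i ∗ b.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Node:
    """An expression tree: a leaf (no children) or an operator applied to k subtrees."""
    label: str
    children: Tuple["Node", ...] = ()

    def is_leaf(self) -> bool:
        return not self.children

    def size(self) -> int:
        """Number of nodes in the tree."""
        return 1 + sum(child.size() for child in self.children)

    def show(self) -> str:
        """Prefix notation, e.g. *(4, i)."""
        if self.is_leaf():
            return self.label
        return f"{self.label}({', '.join(child.show() for child in self.children)})"

def leaf(name: str) -> Node:
    """Rule 1: a single vertex labeled by a variable name or constant."""
    return Node(name)

def op(label: str, *subtrees: Node) -> Node:
    """Rule 2: a k-ary operator applied to k expression trees."""
    return Node(label, tuple(subtrees))

# The example tree: ind(addr_a + 4 * i) + i * b
tree = op("+",
          op("ind", op("+", leaf("addr_a"), op("*", leaf("4"), leaf("i")))),
          op("*", leaf("i"), leaf("b")))

print(tree.show())  # +(ind(+(addr_a, *(4, i))), *(i, b))
print(tree.size())  # 10
```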


The Machine Model

Considers machines which have

1. n general purpose registers (no special registers),
2. a sequence of memory locations,
3. instructions of the form
   a. r ← E, where r is a register and E is an expression tree whose operators are from Θ and whose operands are registers, memory locations or constants. Further, r should be one of the register names occurring (if any) in E.
   b. m ← r, a store instruction.


The Machine Model

• Here is an example of a machine.

    r ← c              {MOV #c, r}
    r ← m              {MOV m, r}
    m ← r              {MOV r, m}
    r ← ind(r + m)     {MOV m(r), r}
    r1 ← r1 op r2      {op r2, r1}


Machine Program

• A machine program consists of a finite sequence of instructions P = I1 I2 . . . Iq.
• The machine program below evaluates a[i] + i ∗ b

    r1 ← 4
    r1 ← r1 ∗ i
    r2 ← addr_a
    r2 ← r2 + r1
    r2 ← ind(r2)
    r3 ← i
    r3 ← r3 ∗ b
    r2 ← r2 + r3

• A machine program computing an expression tree will have at most one use for each definition.
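
To make the semantics of such a program concrete, here is a minimal interpreter sketch. The encoding of instructions as (destination, expression) pairs, the helper names, and the sample memory contents (an array a of 4-byte elements at address 1000) are all assumptions made for illustration. The second helper counts how often each register definition is read, which checks the "at most one use for each definition" property on this program.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}
REGS = {"r1", "r2", "r3"}

def run(program, mem, consts):
    """Execute (dest, expr) instructions and return the final register contents."""
    regs = {}

    def ev(e):
        if isinstance(e, tuple):
            if e[0] == "ind":                  # ind(x): contents of the cell at address x
                return mem[ev(e[1])]
            return OPS[e[0]](ev(e[1]), ev(e[2]))
        if isinstance(e, int):                 # literal constant
            return e
        if e in REGS:
            return regs[e]
        return consts[e] if e in consts else mem[e]

    for dest, expr in program:
        result = ev(expr)
        if dest in REGS:
            regs[dest] = result
        else:
            mem[dest] = result                 # a store instruction m <- r
    return regs

def registers_read(e):
    """All register operands occurring in an expression."""
    if isinstance(e, tuple):
        return [r for sub in e[1:] for r in registers_read(sub)]
    return [e] if e in REGS else []

def max_uses_per_definition(program):
    uses, worst = {}, 0                        # register -> reads of its current definition
    for dest, expr in program:
        for r in registers_read(expr):
            uses[r] = uses.get(r, 0) + 1
            worst = max(worst, uses[r])
        if dest in REGS:
            uses[dest] = 0                     # a new definition starts with no uses
    return worst

program = [
    ("r1", 4),
    ("r1", ("*", "r1", "i")),
    ("r2", "addr_a"),
    ("r2", ("+", "r2", "r1")),
    ("r2", ("ind", "r2")),
    ("r3", "i"),
    ("r3", ("*", "r3", "b")),
    ("r2", ("+", "r2", "r3")),
]

# Hypothetical data: a = [10, 20, 30, 40] stored at addresses 1000, 1004, 1008, 1012.
consts = {"addr_a": 1000}
mem = {"i": 2, "b": 5, 1000: 10, 1004: 20, 1008: 30, 1012: 40}
regs = run(program, mem, consts)
print(regs["r2"])                        # a[2] + 2 * 5 = 40
print(max_uses_per_definition(program))  # 1
```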


Rearrangeability of Programs

• We shall show that any program can be rearranged to obtain an equivalent program of the same length in strong normal form.
• Aho-Johnson’s algorithm searches for the optimal program only among strong normal form programs.
• The above result assures us that by doing so, we shall not miss an optimal solution.


Width

• The width of a program is the maximum number of registers live at any instruction.
• A program of width w (but possibly using more than w registers) can always be rearranged into an equivalent program using exactly w registers.
• In the example below, the first program has width 2 but uses 3 registers. By suitable renaming, the number of registers in the second program has been brought down to 2.

    Width 2, 3 registers      After renaming, 2 registers
    r1 ← a                    r1 ← a
    r2 ← b                    r2 ← b
    r1 ← r1 + r2              r1 ← r1 + r2
    r3 ← c                    r2 ← c
    r3 ← r3 + d               r2 ← r2 + d
    r1 ← r1 ∗ r3              r1 ← r1 ∗ r2
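
The width and the renaming can both be computed mechanically. The sketch below is one possible way to do it for storeless programs, not the construction used in the lecture: it assumes instructions encoded as (dest, operator, operands) triples, treats the register written by the last instruction as the live result, and reuses the lowest-numbered free register at each load. All function names are hypothetical.

```python
import re

def is_reg(x):
    return re.fullmatch(r"r\d+", x) is not None

def live_after(program):
    """live_after(program)[k] = registers whose value is still needed after instruction k."""
    live = {program[-1][0]}                  # assumption: the final result is needed
    out = [None] * len(program)
    for k in range(len(program) - 1, -1, -1):
        dest, _, operands = program[k]
        out[k] = set(live)
        live = (live - {dest}) | {x for x in operands if is_reg(x)}
    return out

def width(program):
    return max(len(live) for live in live_after(program))

def rename(program):
    """Rewrite the program so that it uses only registers r1 .. r<width>."""
    free = list(range(1, width(program) + 1))
    name, result = {}, []
    for (dest, op, operands), live in zip(program, live_after(program)):
        new_operands = [name[x] if is_reg(x) else x for x in operands]
        for x in set(operands):              # release operand registers that die here
            if is_reg(x) and x != dest and x not in live:
                free.append(int(name.pop(x)[1:]))
        if dest not in name:                 # a load: take the lowest free register
            free.sort()
            name[dest] = f"r{free.pop(0)}"
        result.append((name[dest], op, new_operands))
    return result

program = [
    ("r1", None, ["a"]),
    ("r2", None, ["b"]),
    ("r1", "+", ["r1", "r2"]),
    ("r3", None, ["c"]),
    ("r3", "+", ["r3", "d"]),
    ("r1", "*", ["r1", "r3"]),
]

print(width(program))        # 2
for instruction in rename(program):
    print(instruction)       # the last line is ('r1', '*', ['r1', 'r2'])
```

When an instruction overwrites one of its own operand registers, the renamed destination keeps that operand's new name, so the machine model's requirement that the result register occurs among the operands is preserved.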


Contiguity and Strong Contiguity

• Can one decrease the width of a program? For storeless programs, there is an arrangement which has minimum width.
• All three programs P1, P2, and P3 compute the expression tree shown below:

∗
├── +
│   ├── +
│   │   ├── a
│   │   └── b
│   └── ∗
│       ├── c
│       └── d
└── /
    ├── e
    └── f

    P1                P2                P3
    r1 ← a            r1 ← a            r1 ← a
    r2 ← b            r2 ← b            r2 ← b
    r3 ← c            r3 ← c            r1 ← r1 + r2
    r4 ← d            r4 ← d            r2 ← c
    r5 ← e            r1 ← r1 + r2      r3 ← d
    r6 ← f            r3 ← r3 ∗ r4      r2 ← r2 ∗ r3
    r5 ← r5 / r6      r1 ← r1 + r3      r1 ← r1 + r2
    r3 ← r3 ∗ r4      r2 ← e            r2 ← e
    r1 ← r1 + r2      r3 ← f            r3 ← f
    r1 ← r1 + r3      r2 ← r2 / r3      r2 ← r2 / r3
    r1 ← r1 ∗ r5      r1 ← r1 ∗ r2      r1 ← r1 ∗ r2

• The program P2 has a smaller width than P1 (4 against 6), whereas P3 has the least width of all three programs (3). P2 is a contiguous program whereas P3 is a strongly contiguous (SC) program.
• Every program without stores can be transformed into SC form.


Strong Normal Form Programs

• Programs requiring stores can also be cast in a certain form called strong normal form.

[Figure: an expression tree rooted at op containing three marked subtrees T1, T2 and T3. The marked nodes T1, T2 and T3 require stores.]

  - Compute T1 using an SC program P1. Store the result in m1.
  - Compute T2 using an SC program P2. Store the result in m2.
  - Compute T3 using an SC program P3. Store the result in m3.
  - Compute the resulting tree using an SC program P4.

The resultant program has the form P1 J1 P2 J2 P3 J3 P4. The Ji's are stores.


Strong Normal Form Programs

[Figure: the same tree rooted at op, with the marked subtrees T1, T2 and T3 replaced by the memory locations m1, m2 and m3.]

• A program in such a form is called a normal form program. A normal form program looks like P1 J1 P2 J2 . . . Ps−1 Js−1 Ps.
• Further, P is in strong normal form if each Pi is strongly contiguous.
• THEOREM: Let P be a program of width w. We can transform P into an equivalent program Q such that:
  - P and Q have the same cost,
  - Q has width at most w, and
  - Q is in strong normal form.

The Algorithm

• The algorithm makes three passes over the expression tree.

  Pass 1: Computes an array of costs for each node. This helps to select an instruction to evaluate the node, and the order in which to evaluate the subtrees of the node.
  Pass 2: Identifies the subtrees which must be evaluated into memory locations.
  Pass 3: Actually generates code.


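Cover

• An instruction covers a node in an expression tree if it can be used to evaluate the node.
• For the tree a + ind(4 ∗ i), three instructions cover the root. For each, regset and memset name (by their roots) the subtrees that must first be evaluated into registers and into memory locations respectively:

    Instruction             regset       memset
    r ← r + m               { a }        { ind }
    r1 ← r1 + r2            { ind, a }   { }
    r1 ← r1 + ind(r2)       { ∗, a }     { }

  Here ind is the subtree ind(4 ∗ i) and ∗ is the subtree 4 ∗ i.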
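
Covering amounts to matching an instruction's tree pattern against the top of an expression tree. The following sketch shows one way to compute regset and memset; the nested-tuple encoding of trees and patterns and the function names are assumptions for illustration. In a pattern, "r" and "m" stand for any subtree that has to be evaluated into a register or into memory.

```python
def root(tree):
    """Label at the root of a tree given as a string leaf or (label, child, ...) tuple."""
    return tree if isinstance(tree, str) else tree[0]

def cover(pattern, tree):
    """Return (regset, memset) if the pattern covers the root of tree, else None."""
    if pattern == "r":
        return [tree], []
    if pattern == "m":
        return [], [tree]
    if isinstance(tree, str) or pattern[0] != tree[0] or len(pattern) != len(tree):
        return None
    regset, memset = [], []
    for sub_pattern, subtree in zip(pattern[1:], tree[1:]):
        match = cover(sub_pattern, subtree)
        if match is None:
            return None
        regset += match[0]
        memset += match[1]
    return regset, memset

# a + ind(4 * i)
tree = ("+", "a", ("ind", ("*", "4", "i")))

patterns = {
    "r <- r + m":        ("+", "r", "m"),
    "r1 <- r1 + r2":     ("+", "r", "r"),
    "r1 <- r1 + ind(r2)": ("+", "r", ("ind", "r")),
}

for text, pattern in patterns.items():
    regset, memset = cover(pattern, tree)
    print(text, [root(t) for t in regset], [root(t) for t in memset])
```

A pattern whose operator differs from the node's operator, such as `("*", "r", "r")` here, yields `None`, meaning the instruction does not cover the node.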


The Algorithm – Pass 1

• Pass 1: Calculates an array of costs Cj(S) for every subtree S of T, whose meaning is to be interpreted as follows:
  - Cj(S), j ≠ 0: the minimum cost of evaluating S with a strong normal form program using j registers.
  - C0(S): the minimum cost of evaluating S into a memory location with a strong normal form program.
• Consider a machine with the instructions shown below.

    r ← c              {MOV #c, r}
    r ← m              {MOV m, r}
    m ← r              {MOV r, m}
    r ← ind(r + m)     {MOV m(r), r}
    r1 ← r1 op r2      {op r2, r1}

Note that there are no instructions of the form r ← r op m.


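The Algorithm – Pass 1

• We show an example expression tree, along with the cost array at each node. Each triple is (C0, C1, C2): the cost with 0 registers (result in memory), 1 register and 2 registers.

+  (11, 12, 10)
├── ind  (7, 6, 6)
│   └── +  (6, 7, 5)
│       ├── addr_a  (2, 1, 1)
│       └── ∗  (4, 5, 3)
│           ├── 4  (2, 1, 1)
│           └── i  (0, 1, 1)
└── ∗  (4, 5, 3)
    ├── i  (0, 1, 1)
    └── b  (0, 1, 1)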
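
The cost arrays of this example can be reproduced with a short dynamic program. What follows is a sketch under stated assumptions rather than the lecture's own formulation: trees and patterns are nested tuples, leaves starting with "#" are constants, every instruction costs 1, the generic "op" instruction is instantiated for + and ∗, and the recurrence used is that the i-th register operand (in the best evaluation order) gets j − i + 1 registers, that C0 is the n-register cost plus a store, and that any Cj can fall back on C0 plus a load.

```python
from itertools import permutations

INF = float("inf")
N_REGS = 2
LOAD_CONST = LOAD = STORE = 1          # r <- c, r <- m, m <- r
# (pattern, cost) for the remaining instructions of the example machine.
INSTRUCTIONS = [
    (("ind", ("+", "r", "m")), 1),     # r <- ind(r + m)   {MOV m(r), r}
    (("+", "r", "r"), 1),              # r1 <- r1 + r2
    (("*", "r", "r"), 1),              # r1 <- r1 * r2
]

def cover(pattern, tree):
    """(regset, memset) if the pattern matches the top of tree, else None."""
    if pattern == "r":
        return [tree], []
    if pattern == "m":
        return [], [tree]
    if isinstance(tree, str) or pattern[0] != tree[0] or len(pattern) != len(tree):
        return None
    regset, memset = [], []
    for p, t in zip(pattern[1:], tree[1:]):
        match = cover(p, t)
        if match is None:
            return None
        regset, memset = regset + match[0], memset + match[1]
    return regset, memset

def costs(tree, table):
    """Return [C0, C1, ..., Cn] for tree, recording every subtree's array in table."""
    if tree in table:
        return table[tree]
    c = [INF] * (N_REGS + 1)
    if isinstance(tree, str):
        if tree.startswith("#"):
            c[1:] = [LOAD_CONST] * N_REGS      # a constant is loaded with r <- c
        else:
            c[0] = 0                           # a variable already lives in memory
    else:
        for child in tree[1:]:
            costs(child, table)
        for pattern, cost in INSTRUCTIONS:
            match = cover(pattern, tree)
            if match is None:
                continue
            regset, memset = match
            mem_cost = sum(costs(t, table)[0] for t in memset)
            for j in range(max(1, len(regset)), N_REGS + 1):
                best = min(sum(costs(t, table)[j - i] for i, t in enumerate(order))
                           for order in permutations(regset))
                c[j] = min(c[j], best + mem_cost + cost)
    c[0] = min(c[0], c[N_REGS] + STORE)        # evaluate with all registers, then store
    for j in range(1, N_REGS + 1):
        c[j] = min(c[j], c[0] + LOAD)          # evaluate into memory, then load
    table[tree] = c
    return c

mul4 = ("*", "#4", "i")
ind = ("ind", ("+", "#addr_a", mul4))
tree = ("+", ind, ("*", "i", "b"))
table = {}
print(costs(tree, table))   # [11, 12, 10]
print(table[ind])           # [7, 6, 6]
print(table[mul4])          # [4, 5, 3]
```

Trying every permutation of the register operands is what makes the per-node work grow quickly with the arity of the instructions; for the binary instructions here there are only two orders to compare.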


The Algorithm – Pass 2

• This pass marks the nodes which have to be evaluated into memory. It returns a sequence of nodes x1, . . . , xs, the nodes to be evaluated into memory.

+1  (11, 12, 10)
├── ind  (7, 6, 6)
│   └── +2  (6, 7, 5)
│       ├── addr_a  (2, 1, 1)
│       └── ∗2  (4, 5, 3)
│           ├── 4  (2, 1, 1)
│           └── i2  (0, 1, 1)
└── ∗1  (4, 5, 3)
    ├── i1  (0, 1, 1)
    └── b  (0, 1, 1)

• The node ∗2 has to be stored in memory.


The Algorithm – Pass 3

• The algorithm generates code for the subtrees rooted at x1, . . . , xs, in that order.
• After generating code for xi, the algorithm replaces the node with a distinct memory location mi.
• For the example, the code generated is:

    r1 ← #4            (evaluate 4 ∗ i first, since it is to be stored)
    r2 ← i
    r1 ← r1 ∗ r2
    m1 ← r1
    r1 ← i             (evaluate i ∗ b next, since it requires 2 registers)
    r2 ← b
    r1 ← r1 ∗ r2
    r2 ← #addr_a
    r2 ← m1(r2)        (evaluate the ind node)
    r2 ← r2 + r1       (evaluate the root)
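
The store-and-replace skeleton of this pass can be sketched independently of instruction selection. The code below is a deliberately naive illustration, not the algorithm's real emitter: `gen` is a hypothetical left-to-right generator that only knows loads, unary operators and register-register operators, ignores the Pass 1 cost arrays, and places no limit on the number of registers. Only the outer loop in `pass3` mirrors the two bullets above.

```python
def gen(tree, reg, code):
    """Emit code leaving the value of tree in register r<reg>, using r<reg>, r<reg+1>, ..."""
    if isinstance(tree, str):                  # leaf: load a constant or memory location
        code.append(f"r{reg} <- {tree}")
    elif len(tree) == 2:                       # unary operator such as ind
        gen(tree[1], reg, code)
        code.append(f"r{reg} <- {tree[0]}(r{reg})")
    else:                                      # binary operator
        gen(tree[1], reg, code)
        gen(tree[2], reg + 1, code)
        code.append(f"r{reg} <- r{reg} {tree[0]} r{reg + 1}")

def replace(tree, target, leaf):
    """Copy of tree with every occurrence of the subtree target replaced by leaf."""
    if tree == target:
        return leaf
    if isinstance(tree, str):
        return tree
    return (tree[0],) + tuple(replace(t, target, leaf) for t in tree[1:])

def pass3(tree, marked):
    """Generate code for the marked subtrees in order, storing each, then for the rest."""
    code, marked = [], list(marked)
    for k in range(len(marked)):
        x, leaf = marked[k], f"m{k + 1}"
        gen(x, 1, code)
        code.append(f"{leaf} <- r1")           # the store J_k
        tree = replace(tree, x, leaf)
        marked = [replace(y, x, leaf) for y in marked]
    gen(tree, 1, code)
    return code

mul4 = ("*", "#4", "i")
tree = ("+", ("ind", ("+", "#addr_a", mul4)), ("*", "i", "b"))
code = pass3(tree, [mul4])
print("\n".join(code))
print(len(code))   # 12
```

The first four emitted instructions coincide with the listing above (evaluate 4 ∗ i, then store it in m1). The remainder is longer than the optimal code, 12 instructions against 10, and uses a third register, because this sketch never selects the r ← ind(r + m) instruction or the evaluation order chosen in Pass 1.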


Complexity of the Algorithm

1. The time required by Pass 1 is a·n, where a is a constant depending
   - linearly on the size of the instruction set,
   - exponentially on the arity of the machine, and
   - linearly on the number of registers in the machine,
   and n is the number of nodes in the expression tree.
2. The time required by Passes 2 and 3 is proportional to n.

Therefore the complexity of the algorithm is O(n).


Example

• Consider a machine model with 2 general purpose registers and the instructions shown below with their costs.

    Instruction            Cost
    Ri ← Ri op Rj          2
    R ← c                  1
    R ← m                  1
    R ← ind(R)             1
    R ← ind(R + m)         4
    m ← R                  1

Now consider the expression

((a ∗ b) ∗ (ind(1 + 2))) ∗ (ind(c + d)).


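The Cost Array

∗  (22, 23, 21)
├── ∗  (14, 15, 13)
│   ├── ∗  (5, 6, 4)  [store]
│   │   ├── a  (0, 1, 1)
│   │   └── b  (0, 1, 1)
│   └── ind  (6, 7, 5)
│       └── +  (5, 6, 4)
│           ├── 1  (2, 1, 1)
│           └── 2  (2, 1, 1)
└── ind  (6, 6, 5)
    └── +  (5, 6, 4)
        ├── c  (0, 1, 1)
        └── d  (0, 1, 1)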


Generated code

The code generated is:

    R1 ← a                 code for the subtree to be stored
    R2 ← b
    R1 ← R1 ∗ R2
    m ← R1
    R1 ← 1                 code for ind(1 + 2)
    R2 ← 2
    R1 ← R1 + R2
    R1 ← ind(R1)
    R2 ← m                 code for (a ∗ b) ∗ ind(1 + 2)
    R2 ← R2 ∗ R1
    R1 ← c                 code for ind(c + d)
    R1 ← ind(R1 + d)
    R2 ← R2 ∗ R1           code for the root
