Code Optimization
Introduction
Position of code optimizer:
source program -> front end -> intermediate code -> code optimizer -> intermediate code -> code generator -> target program
The code optimizer performs control-flow analysis, data-flow analysis, and transformations.
begin
prod := 0;
i := 1;
do begin
prod := prod + a[i] * b[i];
i := i+1;
end
while i<= 20
end
Basic Blocks and flow graphs
Three-address code of the program to compute the dot product:
(1) prod := 0
(2) i := 1
(3) t1 := 4*i
(4) t2 := a[t1]
(5) t3 := 4*i
(6) t4 := b[t3]
(7) t5 := t2*t4
(8) t6 := prod+t5
(9) prod := t6
(10) t7 := i+1
(11) i := t7
(12) if i<=20 goto (3)
Flow graph of the dot-product program, with basic blocks B1 and B2:
B1:
  prod := 0
  i := 1
B2:
  t1 := 4*i
  t2 := addr(a)-4
  t3 := t2[t1]
  t4 := addr(b)-4
  t5 := t4[t1]
  t6 := t3*t5
  prod := prod+t6
  i := i+1
  if i<=20 goto B2
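The split into B1 and B2 follows the usual "leader" rule: the first statement, every jump target, and every statement that follows a jump start a new basic block. The sketch below is illustrative only; the function name `basic_blocks` and the `(text, target)` encoding of statements are assumptions made for this example, not part of the slides.

```python
def basic_blocks(code):
    """Partition numbered three-address statements into basic blocks.

    code: list of (text, target) pairs (hypothetical encoding); target is the
    1-based number of the statement jumped to, or None for a non-jump.
    Returns a list of blocks, each a list of statement numbers.
    """
    leaders = {1}                          # the first statement is a leader
    for number, (_text, target) in enumerate(code, start=1):
        if target is not None:
            leaders.add(target)            # the target of a jump is a leader
            if number < len(code):
                leaders.add(number + 1)    # so is the statement after a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(code) + 1]
    return [list(range(s, e)) for s, e in zip(starts, ends)]


# The twelve statements of the dot-product program; only (12) jumps.
dot_product = [
    ("prod := 0", None), ("i := 1", None), ("t1 := 4*i", None),
    ("t2 := a[t1]", None), ("t3 := 4*i", None), ("t4 := b[t3]", None),
    ("t5 := t2*t4", None), ("t6 := prod+t5", None), ("prod := t6", None),
    ("t7 := i+1", None), ("i := t7", None), ("if i<=20 goto (3)", 3),
]
blocks = basic_blocks(dot_product)
```

Here `blocks` holds two blocks, statements (1) to (2) and statements (3) to (12), matching B1 and B2.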
Example 2: exp(x)
With a for loop:
exp(x)
{
  int p=1;
  for(i=2;i<=x;i++)
    p=p*i;
  r=p+1;
}
With the loop changed to do ... while:
exp(x)
{
  int p=1;
  i=2;
  do {
    p=p*i;
    i++;
  } while(i<=x);
  r=p+1;
}
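Three-address code of the do ... while version:
1. p=1
2. i=2
3. t1=p*i
4. p=t1
5. t2=i+1
6. i=t2
7. if (i<=x) goto 3
8. t3=p+1
9. r=t3
Three-address code of the for version (the loop needs a jump back to the test, added here as statement 9):
1. p=1
2. i=2
3. if (i<=x) goto 5
4. goto 10
5. t1=p*i
6. p=t1
7. t2=i+1
8. i=t2
9. goto 3
10. t3=p+1
11. r=t3
Flow graph of the do ... while version:
B1:
  p=1
  i=2
B2:
  t1=p*i
  p=t1
  t2=i+1
  i=t2
  if (i<=x) goto B2
B3:
  t3=p+1
  r=t3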
The Principal Sources of Optimization
A compiler can improve a program with the following frequently applied transformation techniques:
1. Function-preserving transformations:
Common subexpression elimination;
Copy propagation;
Dead-code elimination;
Constant folding.
2. Loop optimisations:
Code motion, induction variables and reduction in strength.
• A code-improving transformation is local if it can be performed by looking only at the statements within one basic block
• A code-improving transformation is global if it requires looking at statements in more than one basic block
Quick Sort Example
/* a[] is a global array of int, declared elsewhere */
void quicksort( int m, int n )
{
    int i, j;
    int v, x;
    if ( n <= m ) return;
    i = m - 1;
    j = n;
    v = a[ n ];
    while ( 1 )
    {
        do i = i + 1; while ( a[ i ] < v );
        do j = j - 1; while ( a[ j ] > v );
        if ( i >= j ) break;
        x = a[ i ];
        a[ i ] = a[ j ];
        a[ j ] = x;
    }
    x = a[ i ];
    a[ i ] = a[ n ];
    a[ n ] = x;
    quicksort( m, j );
    quicksort( i+1, n );
}
Quick Sort TAC
 1) i := m - 1                    16) t7 := 4 * i
 2) j := n                        17) t8 := 4 * j
 3) t1 := 4 * n                   18) t9 := a[ t8 ]
 4) v := a[ t1 ]                  19) a[ t7 ] := t9
 5) i := i + 1                    20) t10 := 4 * j
 6) t2 := 4 * i                   21) a[ t10 ] := x
 7) t3 := a[ t2 ]                 22) goto (5)
 8) if ( t3 < v ) goto (5)        23) t11 := 4 * i
 9) j := j - 1                    24) x := a[ t11 ]
10) t4 := 4 * j                   25) t12 := 4 * i
11) t5 := a[ t4 ]                 26) t13 := 4 * n
12) if ( t5 > v ) goto (9)        27) t14 := a[ t13 ]
13) if ( i >= j ) goto (23)       28) a[ t12 ] := t14
14) t6 := 4 * i                   29) t15 := 4 * n
15) x := a[ t6 ]                  30) a[ t15 ] := x
Flowgraph
B1:
  i := m - 1
  j := n
  t1 := 4 * n
  v := a[ t1 ]
B2:
  i := i + 1
  t2 := 4 * i
  t3 := a[ t2 ]
  if ( t3 < v ) goto B2
B3:
  j := j - 1
  t4 := 4 * j
  t5 := a[ t4 ]
  if ( t5 > v ) goto B3
B4:
  if ( i >= j ) goto B6
B5:
  t6 := 4 * i
  x := a[ t6 ]
  t7 := 4 * i
  t8 := 4 * j
  t9 := a[ t8 ]
  a[ t7 ] := t9
  t10 := 4 * j
  a[ t10 ] := x
  goto B2
B6:
  t11 := 4 * i
  x := a[ t11 ]
  t12 := 4 * i
  t13 := 4 * n
  t14 := a[ t13 ]
  a[ t12 ] := t14
  t15 := 4 * n
  a[ t15 ] := x
Common subexpressions Elimination
An occurrence of an expression E is called a common subexpression if E was previously computed and the values of the variables in E have not changed since the previous computation.
Note: we can avoid recomputing the expression if we can use the previously computed value.
Example: Local common subexpressions
B5 before:
  t6 := 4 * i
  x := a[ t6 ]
  t7 := 4 * i
  t8 := 4 * j
  t9 := a[ t8 ]
  a[ t7 ] := t9
  t10 := 4 * j
  a[ t10 ] := x
  goto B2
B5 after:
  t6 := 4 * i
  x := a[ t6 ]
  t8 := 4 * j
  t9 := a[ t8 ]
  a[ t6 ] := t9
  a[ t8 ] := x
  goto B2
B6 before:
  t11 := 4 * i
  x := a[ t11 ]
  t12 := 4 * i
  t13 := 4 * n
  t14 := a[ t13 ]
  a[ t12 ] := t14
  t15 := 4 * n
  a[ t15 ] := x
B6 after:
  t11 := 4 * i
  x := a[ t11 ]
  t13 := 4 * n
  t14 := a[ t13 ]
  a[ t11 ] := t14
  a[ t13 ] := x
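The B5 rewrite can be reproduced with a small table of available expressions. The sketch below is a minimal illustration under stated assumptions: statements are encoded as hypothetical `(dst, op, lhs, rhs)` tuples, only pure arithmetic is reused, array loads are never reused (a conservative choice), and each temporary is assigned at most once in the block, as in the slide.

```python
PURE_OPS = {"+", "-", "*"}


def local_cse(block):
    """Remove repeated pure computations inside one basic block.

    block: list of (dst, op, lhs, rhs) tuples (hypothetical encoding).
      ("t6", "*", "4", "i")        means  t6 := 4 * i
      ("x", "load", "a", "t6")     means  x := a[t6]
      ("a", "store", "t7", "t9")   means  a[t7] := t9
    Assumes every temporary is assigned at most once in the block.
    """
    available = {}   # (op, lhs, rhs) -> name that already holds the value
    alias = {}       # eliminated temporary -> surviving temporary
    result = []
    for dst, op, lhs, rhs in block:
        lhs, rhs = alias.get(lhs, lhs), alias.get(rhs, rhs)
        key = (op, lhs, rhs)
        if op in PURE_OPS and key in available:
            alias[dst] = available[key]     # reuse the earlier value
            continue                        # and drop this statement
        if op != "store":
            # dst gets a new value: expressions reading dst are stale
            available = {k: v for k, v in available.items()
                         if dst not in (k[1], k[2])}
        result.append((dst, op, lhs, rhs))
        if op in PURE_OPS and dst not in (lhs, rhs):
            available[key] = dst
    return result


b5 = [
    ("t6", "*", "4", "i"), ("x", "load", "a", "t6"),
    ("t7", "*", "4", "i"), ("t8", "*", "4", "j"),
    ("t9", "load", "a", "t8"), ("a", "store", "t7", "t9"),
    ("t10", "*", "4", "j"), ("a", "store", "t10", "x"),
]
optimized_b5 = local_cse(b5)
```

On this input the two repeated multiplications disappear and the stores use t6 and t8, which is the "after" form of B5 (without the final goto).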
Example: Global common subexpressions
Before elimination (B5 and B6 are shown after local common subexpression elimination; B1 to B4 are as in the flowgraph of the quicksort fragment):
B5:
  t6 := 4 * i
  x := a[ t6 ]
  t8 := 4 * j
  t9 := a[ t8 ]
  a[ t6 ] := t9
  a[ t8 ] := x
  goto B2
B6:
  t11 := 4 * i
  x := a[ t11 ]
  t13 := 4 * n
  t14 := a[ t13 ]
  a[ t11 ] := t14
  a[ t13 ] := x
After global common subexpression elimination, B5 and B6 reuse the values t1 to t5 computed in B1, B2 and B3 (B1 to B4 are unchanged):
B5:
  x := t3
  a[ t2 ] := t5
  a[ t4 ] := x
  goto B2
B6:
  x := t3
  t14 := a[ t1 ]
  a[ t2 ] := t14
  a[ t1 ] := x
After copy propagation and dead-code elimination, t3 is used in place of x and the copy x := t3 is removed (B1 to B4 are unchanged):
B5:
  a[ t2 ] := t5
  a[ t4 ] := t3
  goto B2
B6:
  t14 := a[ t1 ]
  a[ t2 ] := t14
  a[ t1 ] := t3
Constant Folding
Constant-folding is what allows a language to accept
constant expressions where a constant is required.
Example :
int arr[20 * 4 + 3];
switch (i) {
case 10 * 5: ...
}
• The expression can be resolved to an integer constant at compile time
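As a rough illustration of how a compiler might fold such expressions, the sketch below evaluates every operator node whose operands are both integer constants. The tuple encoding `(op, left, right)` and the function name `fold` are assumptions made for this example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(expr):
    """Fold constant subexpressions of a small expression tree.

    expr is an int (constant), a str (variable name), or a tuple
    (op, left, right); this encoding is hypothetical.
    """
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)      # both operands known: evaluate now
    return (op, left, right)             # otherwise keep the operator node


array_size = fold(("+", ("*", 20, 4), 3))      # 20 * 4 + 3
case_label = fold(("*", 10, 5))                # 10 * 5
partial = fold(("+", ("*", 10, 5), "i"))       # 10 * 5 + i
```

The first two expressions fold to single integers, which is what lets them appear as an array size and a case label; the third folds only its constant part.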
Code Motion
• An important modification that decreases the amount of code in a loop
• A loop-invariant computation is a statement or expression that evaluates to the same value on every iteration of the loop
• Code motion moves each loop-invariant computation to a point just before the loop
Example (valid when the loop body does not change limit):
Before:
while (i <= limit-2)
After:
t = limit-2
while (i <= t)
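One simple way to find candidates for code motion is to mark a statement as invariant when all of its operands are constants, are not assigned in the loop, or are themselves produced by invariant statements. The sketch below covers only this detection step; a real compiler must also check that moving the statement is safe. The tuple encoding and the name `invariant_statements` are assumptions for illustration.

```python
def invariant_statements(body):
    """Return the destinations of loop-invariant statements.

    body: list of (dst, op, a, b) tuples of strings for one loop body
    (hypothetical encoding). A destination assigned more than once in the
    loop is never treated as invariant.
    """
    definitions = {}
    for dst, _op, _a, _b in body:
        definitions[dst] = definitions.get(dst, 0) + 1

    invariant = []
    changed = True
    while changed:                         # repeat until no new statement is marked
        changed = False
        for dst, _op, a, b in body:
            if dst in invariant or definitions[dst] != 1:
                continue
            if all(o.isdigit() or o not in definitions or o in invariant
                   for o in (a, b)):
                invariant.append(dst)
                changed = True
    return invariant


# Loop test "i <= limit-2" written as three-address statements,
# plus a hypothetical increment of i in the loop body.
loop_body = [
    ("t", "-", "limit", "2"),
    ("c", "<=", "i", "t"),
    ("i", "+", "i", "1"),
]
hoistable = invariant_statements(loop_body)
```

Only `t` qualifies here, so `t = limit-2` is the computation that moves in front of the loop.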
Induction Variables and Reduction in Strength
• Induction variable
– A variable x is an induction variable if there is a positive or negative constant c such that each time x is assigned, its value changes by c
– Induction variables can be computed with a single increment (addition or subtraction) per loop iteration
• Strength reduction
– The transformation of replacing an expensive operation, such as multiplication, by a cheaper one, such as addition
• Induction variables lead to
– strength reduction
– elimination of computation
Example: let j = 5 on entry to B3. On successive iterations j takes the values 4, 3, 2, 1 and t4 takes the values 16, 12, 8, 4.
B3:
  j := j - 1
  t4 := 4 * j
  t5 := a[ t4 ]
  if ( t5 > v ) goto B3
Here j and t4 are induction variables. As they are used later, in B3 and B4, we cannot eliminate either.
After strength reduction (B5 and B6 are as after copy propagation and dead-code elimination):
B1:
  i := m - 1
  j := n
  t1 := 4 * n
  v := a[ t1 ]
  t2 := 4 * i
  t4 := 4 * j
B2:
  i := i + 1
  t2 := t2 + 4
  t3 := a[ t2 ]
  if ( t3 < v ) goto B2
B3:
  j := j - 1
  t4 := t4 - 4
  t5 := a[ t4 ]
  if ( t5 > v ) goto B3
B4:
  if ( i >= j ) goto B6
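A quick way to convince oneself that the transformation preserves values is to run both forms of the B3 loop side by side. The two functions below are a hypothetical test harness, not compiler code; they record the value of t4 on each iteration, starting from j = 5 as in the earlier example.

```python
def t4_with_multiplication(j, iterations):
    """Original B3: t4 is recomputed as 4 * j on every iteration."""
    values = []
    for _ in range(iterations):
        j = j - 1
        t4 = 4 * j
        values.append(t4)
    return values


def t4_with_strength_reduction(j, iterations):
    """Transformed B3: t4 is initialised once and then decremented by 4."""
    t4 = 4 * j                 # done once, before the loop (in B1)
    values = []
    for _ in range(iterations):
        j = j - 1
        t4 = t4 - 4            # the addition replaces the multiplication
        values.append(t4)
    return values


original = t4_with_multiplication(5, 4)
reduced = t4_with_strength_reduction(5, 4)
```

Both lists contain 16, 12, 8, 4, the sequence given for t4 when j starts at 5.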
After induction-variable elimination: the test in B4 is rewritten in terms of t2 and t4, so the updates of i and j in the loops become dead code and are eliminated.
B1:
  i := m - 1
  j := n
  t1 := 4 * n
  v := a[ t1 ]
  t2 := 4 * i
  t4 := 4 * j
B2:
  t2 := t2 + 4
  t3 := a[ t2 ]
  if ( t3 < v ) goto B2
B3:
  t4 := t4 - 4
  t5 := a[ t4 ]
  if ( t5 > v ) goto B3
B4:
  if ( t2 >= t4 ) goto B6
B5:
  a[ t2 ] := t5
  a[ t4 ] := t3
  goto B2
B6:
  t14 := a[ t1 ]
  a[ t2 ] := t14
  a[ t1 ] := t3
Directed Acyclic Graph
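• A DAG for an expression identifies the common subexpressions in the expression
• It is like a syntax tree, except that the node for a common subexpression is shared instead of duplicated
Examples:
i = i + 10
  DAG: a + node labelled i1 with children i0 and 10 (i0 is the value of i on entry)
t1 = 4*i
t2 = a[t1]
  DAG: a [] node with children a and a * node; the * node has children 4 and i
Basic block used for DAG construction:
a = b + c
b = a - d
c = b + c
d = a - d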
Construction: [figure not recovered]
Example: [figure not recovered]
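One common way to build a DAG for a basic block is to look up each (operator, left child, right child) triple before creating a node, so that a repeated computation maps to the node that already exists. The sketch below applies this idea to the four-statement block a = b + c, b = a - d, c = b + c, d = a - d. The function name `build_dag` and the tuple encoding are assumptions for illustration, and the sketch handles variable operands only.

```python
def build_dag(block):
    """Build a DAG for a basic block of (dst, op, a, b) statements.

    Returns (nodes, current): nodes[i] is (label, left_id, right_id), with
    None children for leaves; current maps each name to the node id that
    holds its latest value. The encoding is hypothetical.
    """
    nodes = []
    lookup = {}      # (label, left_id, right_id) -> node id
    current = {}     # name -> node id of its current value

    def node_for(key):
        if key not in lookup:            # create a node only if none matches
            lookup[key] = len(nodes)
            nodes.append(key)
        return lookup[key]

    def operand(name):
        if name not in current:          # first use: leaf for the initial value
            current[name] = node_for((name + "0", None, None))
        return current[name]

    for dst, op, a, b in block:
        left, right = operand(a), operand(b)
        current[dst] = node_for((op, left, right))
    return nodes, current


block = [
    ("a", "+", "b", "c"),
    ("b", "-", "a", "d"),
    ("c", "+", "b", "c"),
    ("d", "-", "a", "d"),
]
nodes, current = build_dag(block)
```

In the result, `b` and `d` point at the same node, because d = a - d recomputes a - d with unchanged operands, while the second b + c gets its own node because b has been reassigned in between.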
1. Reaching Definitions
A definition d reaches a point p if there is a path from d to p along which d is not killed.
Unambiguous definition: X = ...;
Ambiguous definition: *p = ...; (p may point to X)
Computing Reaching Definitions
At each program point p, we compute the set of definitions that reach point p.
Reaching definitions are computed by solving a system of equations (data flow equations).
Example: definitions d2: X = ... and d3: X = ... occur outside block B, and B contains d1: X = ...
GEN[B] = {d1}
KILL[B] = {d2, d3}
Data Flow Equations
IN[B]: Definitions that reach B’s entry.
OUT[B]: Definitions that reach B’s exit.
Reaching Definitions Contd.
• Forward problem: information flows forward, in the direction of the edges.
• May problem: a definition reaches a point if it does so along at least one path; it need not reach the point along every path.
Therefore, as in every May problem, the meet operator is the Union operator.
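The usual formulation is IN[B] = union of OUT[P] over all predecessors P of B, and OUT[B] = GEN[B] ∪ (IN[B] − KILL[B]), solved by iterating until nothing changes. The sketch below is a minimal round-robin solver under that formulation. The three-block layout (P1 and P2 both flowing into B) is an assumption chosen to match GEN[B] = {d1} and KILL[B] = {d2, d3}; it is not taken from the original figure.

```python
def reaching_definitions(blocks, preds, gen, kill):
    """Iteratively solve the reaching-definitions equations.

    blocks: list of block names; preds: block -> list of predecessors;
    gen, kill: block -> set of definition names.
    """
    IN = {b: set() for b in blocks}
    OUT = {b: set(gen[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # meet operator: union over all predecessors (May problem)
            IN[b] = set().union(*(OUT[p] for p in preds[b]))
            new_out = gen[b] | (IN[b] - kill[b])
            if new_out != OUT[b]:
                OUT[b] = new_out
                changed = True
    return IN, OUT


# Assumed layout: P1 holds d2, P2 holds d3, both are predecessors of B (d1).
blocks = ["P1", "P2", "B"]
preds = {"P1": [], "P2": [], "B": ["P1", "P2"]}
gen = {"P1": {"d2"}, "P2": {"d3"}, "B": {"d1"}}
kill = {"P1": {"d1", "d3"}, "P2": {"d1", "d2"}, "B": {"d2", "d3"}}
IN, OUT = reaching_definitions(blocks, preds, gen, kill)
```

Both d2 and d3 reach the entry of B, one along each incoming edge, but only d1 leaves it, because B kills the other definitions of X.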
Applications of Reaching Definitions
• Constant propagation/folding
• Copy propagation
2. Available Expressions
An expression is generated at a point if it is computed at that
point.
An expression is killed by redefinitions of operands of the
expression.
Data Flow Equations
IN[B]: Expressions available at B’s entry.
OUT[B]: Expressions available at B’s exit.
Available Expressions Contd.
• Forward problem: information flows forward, in the direction of the edges.
• Must problem: an expression is available at a point only if it is available along all paths to that point.
Therefore, as in every Must problem, the meet operator is the Intersection operator.
• Application: [figure not recovered]
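With intersection as the meet, the usual equations are IN[B] = intersection of OUT[P] over all predecessors P, and OUT[B] = e_gen[B] ∪ (IN[B] − e_kill[B]), with every OUT set except the entry's initialised to the set of all expressions. The sketch below follows that scheme. The diamond-shaped graph and the names `e_gen` and `e_kill` are assumptions for illustration, and the code assumes every non-entry block has at least one predecessor.

```python
def available_expressions(blocks, entry, preds, e_gen, e_kill):
    """Iteratively solve the available-expressions equations."""
    universe = set().union(*e_gen.values())
    IN = {b: set() for b in blocks}
    OUT = {b: set(universe) for b in blocks}   # optimistic start for a Must problem
    OUT[entry] = set(e_gen[entry])             # nothing is available on entry
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            # meet operator: intersection over all predecessors
            IN[b] = set.intersection(*(OUT[p] for p in preds[b]))
            new_out = e_gen[b] | (IN[b] - e_kill[b])
            if new_out != OUT[b]:
                OUT[b] = new_out
                changed = True
    return IN, OUT


# Assumed diamond: B1 computes a+b, B3 redefines a, B2 and B3 join in B4.
blocks = ["B1", "B2", "B3", "B4"]
preds = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3"]}
e_gen = {"B1": {"a+b"}, "B2": set(), "B3": set(), "B4": set()}
e_kill = {"B1": set(), "B2": set(), "B3": {"a+b"}, "B4": set()}
IN, OUT = available_expressions(blocks, "B1", preds, e_gen, e_kill)
```

The expression a+b is available on entry to B2 and B3 but not on entry to B4, because the path through B3 kills it.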
3. Live Variable Analysis
A path is X-clear if it contains no definition of X.
A variable X is live at point p if there exists an X-clear path from p to a use of X; otherwise X is dead at p.
Data Flow Equations
IN[B]: Variables live at B’s entry.
OUT[B]: Variables live at B’s exit.
Live Variables Contd.
• Backward problem: information flows backward, against the direction of the edges.
• May problem: a variable is live if there exists at least one path along which a use is encountered.
Therefore, as in every May problem, the meet operator is the Union operator.
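The usual backward equations are OUT[B] = union of IN[S] over all successors S, and IN[B] = USE[B] ∪ (OUT[B] − DEF[B]), where USE[B] holds variables read in B before any assignment to them in B. The sketch below applies them to the flow graph of the do ... while version of exp(x); the block names B1 to B3, the USE and DEF sets, and the assumption that nothing is live at program exit are worked out here for illustration.

```python
def live_variables(blocks, succs, use, defs):
    """Iteratively solve the live-variable equations (backward, union)."""
    IN = {b: set() for b in blocks}
    OUT = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in reversed(blocks):           # backward problem: visit in reverse
            OUT[b] = set().union(*(IN[s] for s in succs[b]))
            new_in = use[b] | (OUT[b] - defs[b])
            if new_in != IN[b]:
                IN[b] = new_in
                changed = True
    return IN, OUT


# B1: p=1; i=2
# B2: t1=p*i; p=t1; t2=i+1; i=t2; if (i<=x) goto B2
# B3: t3=p+1; r=t3
blocks = ["B1", "B2", "B3"]
succs = {"B1": ["B2"], "B2": ["B2", "B3"], "B3": []}
use = {"B1": set(), "B2": {"p", "i", "x"}, "B3": {"p"}}
defs = {"B1": {"p", "i"}, "B2": {"t1", "p", "t2", "i"}, "B3": {"t3", "r"}}
IN, OUT = live_variables(blocks, succs, use, defs)
```

Only x is live on entry to B1, since p and i are assigned there before any use; p, i and x are all live around the loop in B2.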
Applications of Live Variables
• Register allocation
• Dead code elimination
• Code motion out of loops
4. Very Busy Expressions
An expression A+B is very busy at point p if, on every path from p to the end of the program, an evaluation of A+B appears before any definition of A or B.
Application:
Code size reduction
Very Busy Expressions Contd.
• Backward problem: information flows backward, against the direction of the edges.
• Must problem: the expression must be computed along all paths.
Therefore, as in every Must problem, the meet operator is the Intersection operator.
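A backward Must analysis can be sketched in the same style: OUT[B] is the intersection of IN[S] over all successors S (empty for an exit block), and IN[B] = USED[B] ∪ (OUT[B] − KILLED[B]), where USED[B] holds expressions evaluated in B before any of their operands is redefined. The graph below, in which both branches evaluate a+b, and the names `used` and `killed` are assumptions for illustration.

```python
def very_busy_expressions(blocks, succs, used, killed):
    """Iteratively solve the very-busy-expressions equations."""
    universe = set().union(*used.values())
    IN = {b: set(universe) for b in blocks}   # optimistic start for a Must problem
    OUT = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in reversed(blocks):
            if succs[b]:
                # meet operator: intersection over all successors
                OUT[b] = set.intersection(*(IN[s] for s in succs[b]))
            else:
                OUT[b] = set()                # nothing is evaluated after the exit
            new_in = used[b] | (OUT[b] - killed[b])
            if new_in != IN[b]:
                IN[b] = new_in
                changed = True
    return IN, OUT


# Assumed graph: B1 branches to B2 and B3, which both evaluate a+b, then B4.
blocks = ["B1", "B2", "B3", "B4"]
succs = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
used = {"B1": set(), "B2": {"a+b"}, "B3": {"a+b"}, "B4": set()}
killed = {b: set() for b in blocks}
IN, OUT = very_busy_expressions(blocks, succs, used, killed)
```

Since a+b is very busy at the end of B1, it could be evaluated once there instead of once in each branch, which is where the code size reduction comes from.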
Summary
            May/Union               Must/Intersection
Forward     Reaching Definitions    Available Expressions
Backward    Live Variables          Very Busy Expressions
Use-Def & Def-Use Chains
Loops in flow graphs
Dominators:
We say node d of a flow graph dominates node n, written d dom n, if every path from the initial node of the flow graph to n goes through d.
Example (flow graph with nodes 1 to 10; dom(d) here lists the nodes that d dominates):
dom(1) = {1,2,3,4,5,6,7,8,9,10} (node 1 dominates every node)
dom(2) = {2}
dom(3) = {3,4,5,6,7,8,9,10}
dom(4) = {4,5,6,7,8,9,10}
dom(5) = {5}
dom(6) = {6}
dom(7) = {7,8,9,10}
dom(8) = {8,9,10}
dom(9) = {9}
dom(10) = {10}
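Dominators can be computed iteratively: the entry node is dominated only by itself, and every other node n is dominated by n plus whatever dominates all of its predecessors. The sketch below runs this on an assumed edge list; the original figure is not recoverable, so `EDGES` is a guess chosen to be consistent with the dom sets and back edges listed in the text, and should be treated as hypothetical.

```python
# Assumed edges of the ten-node flow graph (not recoverable from the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 3), (4, 5), (4, 6), (5, 7),
         (6, 7), (7, 4), (7, 8), (8, 3), (8, 9), (8, 10), (9, 1), (10, 7)]


def dominators(nodes, edges, entry):
    """Return dom, where dom[n] is the set of nodes that dominate n.

    Assumes every node other than the entry has at least one predecessor.
    """
    preds = {n: set() for n in nodes}
    for src, dst in edges:
        preds[dst].add(src)
    dom = {n: set(nodes) for n in nodes}      # start from "everything dominates n"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


nodes = list(range(1, 11))
dom = dominators(nodes, EDGES, entry=1)
# Invert the relation to get the form used above: nodes dominated by d.
dominated_by = {d: {n for n in nodes if d in dom[n]} for d in nodes}
```

On the assumed graph, `dominated_by` reproduces the sets listed above, for example `dominated_by[7]` is {7, 8, 9, 10}.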
Example: Dominator tree [figure not recovered]
Back edge:
An edge whose head dominates its tail.
If there is an edge (b, a) in the CFG where a dominates b, then (b, a) is a back edge.
Example: Find the natural loops in the CFG
The back edges are (10, 7), (9, 1), (8, 3), (7, 4), (4, 3).
Dom(1) = {2,3}
Dom(2) = {2}
Dom(3) = {3}
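Under the usual definition, the natural loop of a back edge (n, d) consists of d together with every node that can reach n without passing through d. The sketch below computes it by walking predecessors backward from n. The edge list is an assumption, since the figure is not recoverable; it was chosen so that the five back edges listed above are exactly its back edges.

```python
# Assumed edges of the ten-node flow graph (not recoverable from the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 3), (4, 5), (4, 6), (5, 7),
         (6, 7), (7, 4), (7, 8), (8, 3), (8, 9), (8, 10), (9, 1), (10, 7)]


def natural_loop(edges, tail, head):
    """Return the natural loop of the back edge (tail, head)."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(dst, set()).add(src)
    loop = {head, tail}
    stack = [tail] if tail != head else []
    while stack:
        node = stack.pop()
        for p in preds.get(node, ()):
            if p not in loop:          # head is already in loop, so the walk
                loop.add(p)            # never continues past it
                stack.append(p)
    return loop


back_edges = [(10, 7), (9, 1), (8, 3), (7, 4), (4, 3)]
loops = {edge: natural_loop(EDGES, *edge) for edge in back_edges}
```

On the assumed graph, the back edges (8, 3) and (4, 3) share the header 3 and yield the same set of nodes, while (9, 1) yields the whole graph.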