
Lecture Notes on

Loop Optimizations

15-411: Compiler Design


Frank Pfenning

Lecture 17
October 22, 2013

1 Introduction
Optimizing loops is particularly important in compilation, since loops (and in
particular the inner loops) account for much of the execution time of many programs.
Since tail-recursive functions are usually also turned into loops, the importance of
loop optimizations is further magnified. In this lecture we will discuss two main
ones: hoisting loop-invariant computation out of a loop, and optimizations based
on induction variables.

2 What Is a Loop?
Before we discuss loop optimizations, we should discuss what we identify as a
loop. In our source language, this is rather straightforward, since loops are formed
with while or for, where it is convenient to just elaborate a for loop into its corre-
sponding while form.
The key to a loop is a back edge in the control-flow graph from a node l to a
node h that dominates l. We call h the header node of the loop. The loop itself then
consists of the nodes on a path from h to l. It is convenient to organize the code so
that a loop can be identified with its header node. We then write loop(h, l) if line l
is in the loop with header h.
When loops are nested, we generally optimize the inner loops before the outer
loops. For one, inner loops are likely to be executed more often. For another,
optimizing an inner loop may move computation into the outer loop, from which
it can be hoisted further when the outer loop is optimized, and so on.
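These definitions can be made concrete with a short program. The following Python sketch (the successor-map encoding of the CFG and the function names are my own, not from the notes) computes dominators by the standard iterative fixpoint and then reports back edges l → h where h dominates l:

```python
# Sketch of back-edge detection via iterative dominator analysis.
# The CFG encoding (dict of successor lists) is an illustrative choice.

def dominators(succ, entry):
    """Map each node to the set of nodes that dominate it."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, ss in succ.items():
        for s in ss:
            pred[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            ps = [dom[p] for p in pred[n]]
            new = ({n} | set.intersection(*ps)) if ps else {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def back_edges(succ, entry):
    """Edges l -> h where the target h dominates the source l."""
    dom = dominators(succ, entry)
    return [(l, h) for l, ss in succ.items() for h in ss if h in dom[l]]
```

On a small example CFG, the single back edge body → loop identifies loop as a header node in the sense defined above.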

LECTURE NOTES OCTOBER 22, 2013



3 Hoisting Loop-Invariant Computation

A pure expression is loop invariant if its value does not change throughout the
loop. We can then define the predicate inv(h, p), where p is a pure expression, by
the following rules:

  c constant        def(l, x)   ¬loop(h, l)        inv(h, s1)   inv(h, s2)
  ----------        -----------------------        -------------------------
  inv(h, c)               inv(h, x)                     inv(h, s1 ⊕ s2)

Since we are concerned only with programs in SSA form, it is easy to see that vari-
ables are loop invariant if they are not parameters of the header label. However, the
definition above does not quite capture this for definitions t ← p where p is loop-
invariant but t is not part of the label parameters. So we add a second propagation
rule.

  l : t ← p    inv(h, p)    loop(h, l)
  ------------------------------------
               inv(h, t)
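These rules can be read as a fixpoint computation: start with nothing marked invariant and apply the rules until nothing changes. A minimal Python sketch, under a hypothetical instruction encoding (dest, args) of my own for pure definitions inside the loop:

```python
# Fixpoint sketch of the inv(h, -) rules. The tuple encoding of
# instructions (dest, args) is hypothetical, not from the notes.

def loop_invariant(loop_body, loop_params):
    """loop_body: list of (dest, args) pure definitions inside the loop;
    loop_params: parameters of the header label (may change per iteration).
    Returns the set of dests proven loop invariant."""
    inv = set()
    changed = True
    while changed:
        changed = False
        for dest, args in loop_body:
            if dest in inv:
                continue
            def arg_inv(a):
                # constants are invariant; a variable is invariant if it
                # is defined outside the loop or already marked invariant
                if isinstance(a, int):
                    return True
                defined_in_loop = any(d == a for d, _ in loop_body)
                return (not defined_in_loop and a not in loop_params) or a in inv
            if all(arg_inv(a) for a in args):
                inv.add(dest)
                changed = True
    return inv
```

For a body computing t ← width * height and u ← t + 1 with loop parameter i1, both t and u are found invariant, while i2 ← i1 + 1 is not.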
Note that we do not consider memory references or function calls to be loop invari-
ant, although under some additional conditions they may be hoisted as well.
In order to hoist loop-invariant computations out of a loop we should have
a loop preheader in the control-flow graph, which immediately dominates the loop
header. We then move all the loop-invariant computations to the preheader, in
order.
Some care must be taken with this optimization. For example, when the loop
body is never executed, the hoisted code could make the program significantly
slower. Another problem arises if we have conditionals in the body of the loop:
values computed on only one branch or the other will be loop invariant, but
depending on the boolean condition one or the other may never be executed.
In some cases, when the loop guard is inexpensive and effect-free but the loop-
invariant code is expensive, we might consider duplicating the test so that instead
of
seq(pre, while(e, s))
we generate code for

seq(if(e, seq(pre, while(e, s)), nop))

where pre is the hoisted computation in the loop pre-header.
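The same transformation can be illustrated at the source level. In this Python sketch (the function names and the scale computation are mine, chosen for illustration), the guarded version duplicates the test so that the hoisted pre runs only when the loop body executes at least once:

```python
# Source-level sketch of seq(if(e, seq(pre, while(e, s)), nop)):
# the hoisted computation `pre` runs only if the loop will execute.

def sum_scaled_naive(xs, a, b):
    total, i = 0, 0
    while i < len(xs):
        scale = a * b          # loop-invariant, recomputed each iteration
        total += scale * xs[i]
        i += 1
    return total

def sum_scaled_hoisted(xs, a, b):
    total, i = 0, 0
    if i < len(xs):            # duplicated loop guard
        scale = a * b          # `pre`: hoisted into the guarded preheader
        while i < len(xs):
            total += scale * xs[i]
            i += 1
    return total
```

Both versions agree on all inputs, and the hoisted version pays for the multiplication only when the loop actually runs.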


A typical example of hoisting loop invariant computation would be a loop to
initialize all elements of a two-dimensional array:

for (int i = 0; i < width * height; i++)
  A[i] = 1;


We show the relevant part of the abstract assembly before and after hoisting the
multiplication, which is enabled because both width and height are loop invariant
and therefore their product is as well.

Before hoisting:

  i0 ← 0
  goto loop(i0)
loop(i1):
  t ← width * height
  if (i1 ≥ t) goto exit
  ...
  i2 ← i1 + 1
  goto loop(i2)
exit:

After hoisting:

  i0 ← 0
  t ← width * height
  goto loop(i0)
loop(i1):
  if (i1 ≥ t) goto exit
  ...
  i2 ← i1 + 1
  goto loop(i2)
exit:

4 Induction Variables
Hoisting loop invariant computation is significant; optimizing computation which
changes by a constant amount each time around the loop is probably even more
important. We call such variables basic induction variables. The opportunity for op-
timization arises from derived induction variables, that is, variables that are computed
from basic induction variables.
As an example we will use a function that checks whether a given array is sorted
in ascending order.

bool is_sorted(int[] A, int n)
//@requires 0 <= n && n <= \length(A);
{
  for (int i = 0; i < n-1; i++)
    //@loop_invariant 0 <= i && i <= n-1;
    if (A[i] > A[i+1]) return false;
  return true;
}

Below is a possible compiled SSA version of this code, assuming that we do not
perform array bounds checks (or have eliminated them).

is_sorted(A, n):
  i0 ← 0
  goto loop(i0)
loop(i1):
  t0 ← n - 1
  if (i1 ≥ t0) goto rtrue
  t1 ← 4 * i1
  t2 ← A + t1
  t3 ← M[t2]
  t4 ← i1 + 1
  t5 ← 4 * t4
  t6 ← A + t5
  t7 ← M[t6]
  if (t3 > t7) goto rfalse
  i2 ← i1 + 1
  goto loop(i2)
rtrue:
  return 1
rfalse:
  return 0

Here, i1 is the basic induction variable, and t1 = 4 * i1 and t4 = i1 + 1 are the
derived induction variables. In general, we consider a variable a derived induction
variable if it has the form a * i + b, where a and b are loop invariant.
Let's consider t4 first. We see that common subexpression elimination applies.
However, we would like to preserve the basic induction variable i1 and its version
i2, so we apply code motion and then eliminate the second occurrence of i1 + 1.

Before:

is_sorted(A, n):
  i0 ← 0
  goto loop(i0)
loop(i1):
  t0 ← n - 1
  if (i1 ≥ t0) goto rtrue
  t1 ← 4 * i1
  t2 ← A + t1
  t3 ← M[t2]
  t4 ← i1 + 1
  t5 ← 4 * t4
  t6 ← A + t5
  t7 ← M[t6]
  if (t3 > t7) goto rfalse
  i2 ← i1 + 1
  goto loop(i2)

After:

is_sorted(A, n):
  i0 ← 0
  goto loop(i0)
loop(i1):
  t0 ← n - 1
  if (i1 ≥ t0) goto rtrue
  t1 ← 4 * i1
  t2 ← A + t1
  t3 ← M[t2]
  i2 ← i1 + 1
  t4 ← i2
  t5 ← 4 * t4
  t6 ← A + t5
  t7 ← M[t6]
  if (t3 > t7) goto rfalse
  goto loop(i2)
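Eliminating the second occurrence of i1 + 1 is an instance of common subexpression elimination, which within a block can be done by value numbering. A tiny sketch over a hypothetical (dest, op, lhs, rhs) encoding of my own:

```python
# Minimal local value numbering sketch: reuse the first name computed
# for a repeated pure expression. Instruction encoding is illustrative.

def value_number(block):
    """block: list of (dest, op, lhs, rhs) pure instructions. Returns the
    block with repeated computations replaced by copies ("copy" op)."""
    seen = {}
    out = []
    for dest, op, lhs, rhs in block:
        key = (op, lhs, rhs)
        if key in seen:
            # same pure expression computed before: copy the old name
            out.append((dest, "copy", seen[key], None))
        else:
            seen[key] = dest
            out.append((dest, op, lhs, rhs))
    return out
```

On the example above, the second i1 + 1 (defining t4) becomes a copy of i2, which copy propagation can then remove.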

Next we look at the derived induction variable t1 ← 4 * i1. The idea is to see how
we can calculate t1 at a subsequent iteration from t1 at a prior iteration. In order to
achieve this effect, we add a new induction variable to represent 4 * i1. We call this
j and add it to our loop variables in SSA form.

is_sorted(A, n):
  i0 ← 0
  j0 ← 4 * i0                  @ensures j0 = 4 * i0
  goto loop(i0, j0)
loop(i1, j1):                  @requires j1 = 4 * i1
  t0 ← n - 1
  if (i1 ≥ t0) goto rtrue
  t1 ← j1                      @assert j1 = 4 * i1
  t2 ← A + t1
  t3 ← M[t2]
  i2 ← i1 + 1
  j2 ← 4 * i2                  @ensures j2 = 4 * i2
  t4 ← i2
  t5 ← 4 * t4
  t6 ← A + t5
  t7 ← M[t6]
  if (t3 > t7) goto rfalse
  goto loop(i2, j2)


Crucial here is the invariant that j1 = 4 * i1 when label loop(i1, j1) is reached. Now
we calculate

  j2 = 4 * i2 = 4 * (i1 + 1) = 4 * i1 + 4 = j1 + 4

so we can express j2 in terms of j1 without multiplication. This is an example of
strength reduction, since addition is faster than multiplication. Recall that all the laws
we used are valid for modular arithmetic. Similarly:

  j0 = 4 * i0 = 0

since i0 = 0, which is an example of constant propagation followed by constant
folding.
is_sorted(A, n):
  i0 ← 0
  j0 ← 0                       @ensures j0 = 4 * i0
  goto loop(i0, j0)
loop(i1, j1):                  @requires j1 = 4 * i1
  t0 ← n - 1
  if (i1 ≥ t0) goto rtrue
  t1 ← j1                      @assert j1 = 4 * i1
  t2 ← A + t1
  t3 ← M[t2]
  i2 ← i1 + 1
  j2 ← j1 + 4                  @ensures j2 = 4 * i2
  t4 ← i2
  t5 ← 4 * t4
  t6 ← A + t5
  t7 ← M[t6]
  if (t3 > t7) goto rfalse
  goto loop(i2, j2)
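The effect of replacing j2 ← 4 * i2 by j2 ← j1 + 4 can be checked directly: the additive recurrence produces exactly the sequence of values 4 * i. A small Python sketch (function names mine):

```python
# Strength reduction sketch: replace 4 * i recomputed per iteration
# by an induction variable j maintained with j += 4.

def multiplied(n):
    # one multiplication per iteration
    return [4 * i for i in range(n)]

def strength_reduced(n):
    out, j = [], 0          # j0 = 4 * i0 = 0 (constant folding)
    for _ in range(n):
        out.append(j)
        j += 4              # j2 = j1 + 4 instead of j2 = 4 * i2
    return out
```

Both functions produce the same sequence; the second performs only additions inside the loop.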


With some copy propagation, and noticing that n - 1 is loop invariant, we next get:

is_sorted(A, n):
  i0 ← 0
  j0 ← 0                       @ensures j0 = 4 * i0
  t0 ← n - 1
  goto loop(i0, j0)
loop(i1, j1):                  @requires j1 = 4 * i1
  if (i1 ≥ t0) goto rtrue
  t2 ← A + j1
  t3 ← M[t2]
  i2 ← i1 + 1
  j2 ← j1 + 4                  @ensures j2 = 4 * i2
  t5 ← 4 * i2
  t6 ← A + t5
  t7 ← M[t6]
  if (t3 > t7) goto rfalse
  goto loop(i2, j2)

With common subexpression elimination (noting the additional assertions we are
aware of), we can replace 4 * i2 by j2. We combine this with copy propagation.

is_sorted(A, n):
  i0 ← 0
  j0 ← 0                       @ensures j0 = 4 * i0
  t0 ← n - 1
  goto loop(i0, j0)
loop(i1, j1):                  @requires j1 = 4 * i1
  if (i1 ≥ t0) goto rtrue
  t2 ← A + j1
  t3 ← M[t2]
  i2 ← i1 + 1
  j2 ← j1 + 4                  @ensures j2 = 4 * i2
  t6 ← A + j2
  t7 ← M[t6]
  if (t3 > t7) goto rfalse
  goto loop(i2, j2)

We observe another derived induction variable, namely t2 = A + j1. We give this a
new name (k1 = A + j1) and introduce it into our function. Again we just calculate:


k2 = A + j2 = (A + j1) + 4 = k1 + 4, and k0 = A + j0 = A.

is_sorted(A, n):
  i0 ← 0
  j0 ← 0                       @ensures j0 = 4 * i0
  k0 ← A + j0                  @ensures k0 = A + j0
  t0 ← n - 1
  goto loop(i0, j0, k0)
loop(i1, j1, k1):              @requires j1 = 4 * i1 ∧ k1 = A + j1
  if (i1 ≥ t0) goto rtrue
  t2 ← k1
  t3 ← M[t2]
  i2 ← i1 + 1
  j2 ← j1 + 4                  @ensures j2 = 4 * i2
  k2 ← k1 + 4                  @ensures k2 = A + j2
  t6 ← A + j2
  t7 ← M[t6]
  if (t3 > t7) goto rfalse
  goto loop(i2, j2, k2)

After one more round of constant propagation, common subexpression elimination,
and dead code elimination we get:

is_sorted(A, n):
  i0 ← 0
  j0 ← 0                       @ensures j0 = 4 * i0
  k0 ← A                       @ensures k0 = A + j0
  t0 ← n - 1
  goto loop(i0, j0, k0)
loop(i1, j1, k1):              @requires j1 = 4 * i1 ∧ k1 = A + j1
  if (i1 ≥ t0) goto rtrue
  t3 ← M[k1]
  i2 ← i1 + 1
  j2 ← j1 + 4                  @ensures j2 = 4 * i2
  k2 ← k1 + 4                  @ensures k2 = A + j2
  t7 ← M[k2]
  if (t3 > t7) goto rfalse
  goto loop(i2, j2, k2)

With neededness analysis we can say that j0, j1, and j2 are no longer needed and
can be eliminated.
is_sorted(A, n):
  i0 ← 0
  k0 ← A                       @ensures k0 = A + 4 * i0
  t0 ← n - 1
  goto loop(i0, k0)
loop(i1, k1):                  @requires k1 = A + 4 * i1
  if (i1 ≥ t0) goto rtrue
  t3 ← M[k1]
  i2 ← i1 + 1
  k2 ← k1 + 4                  @ensures k2 = A + 4 * i2
  t7 ← M[k2]
  if (t3 > t7) goto rfalse
  goto loop(i2, k2)
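For straight-line pure code, the neededness analysis used here can be sketched as a single backward pass (real neededness analysis iterates over the whole CFG; this simplification and the (dest, args) encoding are my own):

```python
# Sketch of neededness-based elimination: a pure definition whose dest
# is never used (and does not feed a branch) is dead and can be dropped.

def eliminate_dead(block, live_out):
    """block: list of (dest, args) pure definitions; live_out: names
    needed after the block (branch operands, loop parameters, ...).
    Single backward pass over straight-line code."""
    needed = set(live_out)
    kept = []
    for dest, args in reversed(block):
        if dest in needed:
            kept.append((dest, args))
            # operands of a needed definition become needed themselves
            needed |= {a for a in args if isinstance(a, str)}
    kept.reverse()
    return kept
```

With only k2 needed after the block (it is passed to the loop label), the definition of j2 is recognized as dead.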

Unfortunately, i1 is still needed, since it governs a conditional jump. In order to
eliminate it we would have to observe that

  i1 ≥ t0  iff  A + 4 * i1 ≥ A + 4 * t0

This holds because the addition here is on 64-bit quantities where the second
operand is 32 bits, so no overflow can occur. If we exploit this we obtain:

is_sorted(A, n):
  i0 ← 0
  k0 ← A                       @ensures k0 = A + 4 * i0
  t0 ← n - 1
  goto loop(i0, k0)
loop(i1, k1):                  @requires k1 = A + 4 * i1
  if (k1 ≥ A + 4 * t0) goto rtrue
  t3 ← M[k1]
  i2 ← i1 + 1
  k2 ← k1 + 4                  @ensures k2 = A + 4 * i2
  t7 ← M[k2]
  if (t3 > t7) goto rfalse
  goto loop(i2, k2)
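The no-overflow side condition can be sanity-checked numerically. A Python sketch of the claim under 64-bit wraparound arithmetic (the ranges are illustrative choices of mine):

```python
# Check that i >= t iff A + 4*i >= A + 4*t under 64-bit wraparound,
# provided A + 4*i and A + 4*t do not overflow.

MASK = (1 << 64) - 1

def cmp_preserved(A, i, t):
    # compare the original guard with the pointer-level guard,
    # simulating 64-bit modular addition with the mask
    return (i >= t) == (((A + 4 * i) & MASK) >= ((A + 4 * t) & MASK))

# With a base below 2**63 and small non-negative 32-bit offsets, A + 4*i
# stays below 2**64, so no wraparound can flip the comparison. With a
# base near 2**64 the addition wraps and the equivalence fails.
```

This illustrates why the transformation is only valid under the stated overflow assumption.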

Now i0, i1, and i2 are no longer needed and can be eliminated. Moreover, A + 4 * t0
is loop invariant and can be hoisted.

is_sorted(A, n):
  k0 ← A
  t0 ← n - 1
  t8 ← 4 * t0
  t9 ← A + t8
  goto loop(k0)
loop(k1):
  if (k1 ≥ t9) goto rtrue
  t3 ← M[k1]
  k2 ← k1 + 4
  t7 ← M[k2]
  if (t3 > t7) goto rfalse
  goto loop(k2)
rtrue:
  return 1
rfalse:
  return 0
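The structure of the final loop can be mirrored in Python, with an index standing in for the pointer k (a sketch for intuition, not the compiler's actual output):

```python
# Python mirror of the optimized is_sorted: one running index stands in
# for the pointer k; each iteration reads A[k] and A[k+1], like M[k1]
# and M[k2] in the abstract assembly.

def is_sorted(A):
    k, end = 0, len(A) - 1      # end plays the role of t9 = A + 4 * (n-1)
    while k < end:              # if (k1 >= t9) goto rtrue
        if A[k] > A[k + 1]:     # if (t3 > t7) goto rfalse
            return False
        k += 1                  # k2 = k1 + 4, scaled to element size
    return True
```

Note that, like the optimized assembly, the loop maintains only a single running position and a precomputed bound.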

It was suggested that we can avoid two memory accesses per iteration by unrolling
the loop once. This makes sense, but this optimization is beyond the scope of this
lecture.
We have carried out the optimizations here on concrete programs and values,
but it is straightforward to generalize them to arbitrary induction variables x that
are updated with x2 ← x1 + c for a constant c, and derived variables that arise from
constant multiplication with or addition to a basic induction variable.
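As a sketch of that generalization, a recognizer for basic induction variables updated with x2 ← x1 + c, over a toy three-address encoding that is entirely my own, might look like:

```python
# Rough sketch: recognize basic induction variables x2 <- x1 + c
# in a toy three-address format (dest, op, lhs, rhs).

def basic_induction_vars(body, phi_pairs):
    """phi_pairs: {header_var: updated_var}, e.g. {"i1": "i2"} meaning
    the loop is re-entered as loop(i2). Returns {header_var: step}
    for each variable updated by adding a constant."""
    ivs = {}
    for dest, op, lhs, rhs in body:
        for base, updated in phi_pairs.items():
            if dest == updated and op == "+" and lhs == base and isinstance(rhs, int):
                ivs[base] = rhs
    return ivs
```

A fuller implementation would then scan for derived induction variables of the form a * x + b with a and b loop invariant, as described above.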
