
Code Generation

Midterm Statistics

• Max: 54, Min: 28, Median: 45
• Questions (best score):
  1-5: 8/10
  6: 5/5
  7: 5/5
  8: 5/5
  9: 10/10
  11: 4/5
  12: 10/10
Code Generation: What to do?

• Input: intermediate representation (IR) and symbol table
• Output: semantically equivalent target code for a given machine architecture
• Compiler: translates IR to assembly code; assembler: generates binary code; linker: links the binary code (the assembler and linker are part of the compiler tool chain)

Three main tasks:
• Instruction selection: choosing appropriate target machine instructions to implement the IR statements
• Register allocation and assignment: deciding what values to keep in which registers (interface with the hardware)
• Instruction ordering: deciding in what order to schedule the execution of instructions
Understanding Code Generation

Basic code generation (and then code generation for object-oriented programs)
• MIPS assembly language
• Stack machine (architecture type)
• Simple source language
• Assembly programs make assumptions about the architecture:
  • which architecture type: stack machine? RISC? CISC?
  • how does the stack grow?
  • how many registers? what do the registers store?
Stack Machines

• A simple evaluation model

• No variables or registers

• A stack of values for intermediate results


Example of a Stack Machine Program

• Consider two instructions


– push i - place the integer i on top of the stack
– add - pop two elements, add them and put
the result back on the stack
• A program to compute 7 + 5
push 7
push 5
add
Stack Machine. Example

push 7:   stack = 7, …
push 5:   stack = 5, 7, …
add:      stack = 12, …


• Each instruction
– Takes its operands from the top of the stack
– Removes those operands from the stack
– Computes the required operation on them
– Pushes the result on the stack
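
As a rough sketch, this model can be written down directly in Python (the encoding and function name are my own, not from the slides):

def run(program):
    stack = []
    for instr in program:
        if instr[0] == "push":       # push i: place the integer i on top
            stack.append(instr[1])
        elif instr[0] == "add":      # pop two elements, push their sum
            stack.append(stack.pop() + stack.pop())
    return stack

# run([("push", 7), ("push", 5), ("add",)]) returns [12]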
Why Use a Stack Machine?

• Each operation takes operands from the same place and puts results in the same place

• This means a uniform compilation scheme

• And therefore a simpler compiler

Why Use a Stack Machine?

• Location of the operands is implicit


– Always on the top of the stack
• No need to specify operands explicitly
• No need to specify the location of the result
• Instruction “add” as opposed to “add r1, r2”
⇒ Smaller encoding of instructions
⇒ More compact programs
• This is one reason why Java Bytecodes use a
stack evaluation model
Optimizing the Stack Machine

• The add instruction does 3 memory operations


– Two reads and one write to the stack
– The top of the stack is frequently accessed
• Idea: keep the top of the stack in a register
(called accumulator)
– Register accesses are faster
• The “add” instruction is now
acc ← acc + top_of_stack
– Only one memory operation!
Stack Machine with Accumulator

Invariants
• The result of computing an expression is
always in the accumulator
• For an operation op(e1,…,en) push the
accumulator on the stack after computing
each of e1,…,en-1
– The result of en is in the accumulator before op
– After the operation pop n-1 values
• After computing an expression the stack is as
before
Stack Machine with Accumulator: Example

• Compute 7 + 5 using an accumulator

acc ← 7                    acc = 7    stack = …
push acc                   acc = 7    stack = 7, …
acc ← 5                    acc = 5    stack = 7, …
acc ← acc + top_of_stack   acc = 12   stack = 7, …
pop                        acc = 12   stack = …
A Bigger Example: 3 + (7 + 5)

Code                        Acc   Stack
acc ← 3                     3     <init>
push acc                    3     3, <init>
acc ← 7                     7     3, <init>
push acc                    7     7, 3, <init>
acc ← 5                     5     7, 3, <init>
acc ← acc + top_of_stack    12    7, 3, <init>
pop                         12    3, <init>
acc ← acc + top_of_stack    15    3, <init>
pop                         15    <init>
Notes

• It is very important that the stack is preserved across the evaluation of a subexpression
– Stack before the evaluation of 7 + 5 is 3, <init>
– Stack after the evaluation of 7 + 5 is 3, <init>
– The first operand is on top of the stack
From Stack Machines to MIPS

• The compiler generates code for a stack machine with accumulator

• We want to run the resulting code on the MIPS processor (or simulator)

• We implement stack machine instructions using MIPS instructions and registers
Simulating a Stack Machine…

• The accumulator is kept in MIPS register $a0


• The stack is kept in memory
• The stack grows towards lower addresses
– Standard convention on the MIPS architecture
• The address of the next location on the stack
is kept in MIPS register $sp
– The top of the stack is at address $sp + 4
MIPS Assembly

MIPS architecture
– Prototypical Reduced Instruction Set Computer
(RISC) architecture
– Arithmetic operations use registers for operands
and results
– Must use load and store instructions to use
operands and results in memory
– 32 general purpose registers (32 bits each)
• We will use $sp, $a0 and $t1 (a temporary register)
• Read the SPIM handout for more details
A Sample of MIPS Instructions

– lw reg1 offset(reg2)
• Load 32-bit word from address reg2 + offset into reg1
– add reg1 reg2 reg3
• reg1 ← reg2 + reg3
– sw reg1 offset(reg2)
• Store 32-bit word in reg1 at address reg2 + offset
– addiu reg1 reg2 imm
• reg1 ← reg2 + imm
• “u” means overflow is not checked
– li reg imm
• reg ← imm
MIPS Assembly: Example

• The stack-machine code for 7 + 5 in MIPS


acc ← 7 li $a0 7
push acc sw $a0 0($sp)
addiu $sp $sp -4
acc ← 5 li $a0 5
acc ← acc + top_of_stack lw $t1 4($sp)
add $a0 $a0 $t1
pop addiu $sp $sp 4

• We now generalize this to a simple language…


A Small Language

• A language with integers and integer operations

P → D; P | D
D → def id(ARGS) = E;
ARGS → id, ARGS | id
E → int | id | if E1 = E2 then E3 else E4
  | E1 + E2 | E1 - E2 | id(E1,…,En)
A Small Language (Cont.)

• The first function definition f is the “main” routine
• Running the program on input i means computing f(i)
• Program for computing the Fibonacci numbers

def fib(x) = if x = 1 then 0 else
             if x = 2 then 1 else
             fib(x - 1) + fib(x - 2)
Code Generation Strategy

• For each expression e we generate MIPS code that
– Computes the value of e in $a0
– Preserves $sp and the contents of the stack

• We define a code generation function cgen(e) whose result is the code generated for e
Code Generation for Constants

• The code to evaluate a constant simply copies it into the accumulator

cgen(i) = li $a0 i

• Note that this also preserves the stack, as required
Code Generation for Add

cgen(e1 + e2) =
cgen(e1)
sw $a0 0($sp)
addiu $sp $sp -4
cgen(e2)
lw $t1 4($sp)
add $a0 $t1 $a0
addiu $sp $sp 4
• Possible optimization: Put the result of e1 directly in register $t1?
Code Generation for Add. Wrong!

• Optimization: Put the result of e1 directly in $t1?

cgen(e1 + e2) =
cgen(e1)
move $t1 $a0
cgen(e2)
add $a0 $t1 $a0

• Try to generate code for: 3 + (7 + 5)
– The code for the inner + also uses $t1, clobbering the saved 3; the outer add then computes 7 + 12 = 19 instead of 15

Code Generation Notes

• The code for + is a template with “holes” for the code that evaluates e1 and e2
• Stack-machine code generation is recursive
• Code for e1 + e2 consists of code for e1 and code for e2 glued together
• Code generation can be written as a recursive descent of the AST
– At least for expressions
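
A minimal sketch of this recursion in Python, assuming ASTs are encoded as nested tuples like ("+", 3, ("+", 7, 5)) (the encoding and names are illustrative, not from the slides):

def cgen(e):
    if isinstance(e, int):                      # constant: load into accumulator
        return [f"li $a0 {e}"]
    op, e1, e2 = e                              # ("+", e1, e2) or ("-", e1, e2)
    mips_op = "add" if op == "+" else "sub"
    return (cgen(e1)                            # value of e1 ends up in $a0
            + ["sw $a0 0($sp)", "addiu $sp $sp -4"]   # push it
            + cgen(e2)                          # value of e2 ends up in $a0
            + ["lw $t1 4($sp)",                 # reload e1's value
               f"{mips_op} $a0 $t1 $a0",
               "addiu $sp $sp 4"])              # pop: the stack is preserved

# print("\n".join(cgen(("+", 3, ("+", 7, 5)))))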
Code Generation for Sub and Constants

• New instruction: sub reg1 reg2 reg3


– Implements reg1 ← reg2 - reg3
cgen(e1 - e2) =
cgen(e1)
sw $a0 0($sp)
addiu $sp $sp -4
cgen(e2)
lw $t1 4($sp)
sub $a0 $t1 $a0
addiu $sp $sp 4
Code Generation for Conditional

• We need flow control instructions

• New instruction: beq reg1 reg2 label


– Branch to label if reg1 = reg2

• New instruction: b label


– Unconditional jump to label
Code Generation for If (Cont.)

cgen(if e1 = e2 then e3 else e4) =
cgen(e1)
sw $a0 0($sp)
addiu $sp $sp -4
cgen(e2)
lw $t1 4($sp)
addiu $sp $sp 4
beq $a0 $t1 true_branch
false_branch:
cgen(e4)
b end_if
true_branch:
cgen(e3)
end_if:
The Activation Record

• Code for function calls and function definitions depends on the layout of the activation record
• A very simple AR suffices for this language:
– The result is always in the accumulator
  • No need to store the result in the AR
– The activation record holds actual parameters
  • For f(x1,…,xn) push xn,…,x1 on the stack
  • These are the only variables in this language
The Activation Record (Cont.)

• The stack discipline guarantees that on function exit $sp is the same as it was on function entry
– No need to save $sp
• We need the return address
• It’s handy to have a pointer to the start of the current activation
– This pointer lives in register $fp (frame pointer)
– The reason for the frame pointer will be clear shortly
The Activation Record

• Summary: For this language, an AR with the caller’s frame pointer, the actual parameters, and the return address suffices
• Picture: Consider a call to f(x,y). The AR will be (the stack grows toward lower addresses):

FP →    high addresses
        old FP   \
        y         |  AR of f
        x        /
SP →    low addresses
Code Generation for Function Call

• The calling sequence is the instructions (of both caller and callee) to set up a function invocation
• New instruction: jal label
– Jump to label, save address of next instruction in
$ra
– On other architectures the return address is
stored on the stack by the “call” instruction
Code Generation for Function Call (Cont.)

cgen(f(e1,…,en)) =
sw $fp 0($sp)
addiu $sp $sp -4
cgen(en)
sw $a0 0($sp)
addiu $sp $sp -4
…
cgen(e1)
sw $a0 0($sp)
addiu $sp $sp -4
jal f_entry

• The caller saves its value of the frame pointer
• Then it saves the actual parameters in reverse order
• The caller saves the return address in register $ra
Code Generation for Function Definition

• New instruction: jr reg
– Jump to the address in register reg

cgen(def f(x1,…,xn) = e) =
move $fp $sp
sw $ra 0($sp)
addiu $sp $sp -4
cgen(e)
lw $ra 4($sp)
addiu $sp $sp z        where z = 4*n + 8
lw $fp 0($sp)
jr $ra

• Note: The frame pointer points to the top, not the bottom, of the frame
• The callee pops the return address, the actual arguments, and the saved value of the frame pointer
Calling Sequence: Example for f(x,y)

• Before call: FP and SP are at the caller’s frame; nothing for f is on the stack yet
• On entry: the caller has pushed old FP, y, and x; SP points just below x; FP is unchanged
• Before exit: the callee has set FP (just below x) and stored the return address there; the frame is old FP, y, x, return
• After call: the callee has popped the whole frame; SP and FP are exactly as before the call
Code Generation for Variables

• Variable references are the last construct


• The “variables” of a function are just its
parameters
– They are all in the AR
– Pushed by the caller

• Problem: Because the stack grows when intermediate results are saved, the variables are not at a fixed offset from $sp
Code Generation for Variables (Cont.)

• Solution: use a frame pointer


– Always points to the return address on the stack
– Since it does not move it can be used to find the
variables
• Let xi be the ith (i = 1,…,n) formal parameter of
the function for which code is being generated
Code Generation for Variables (Cont.)

• Example: For a function def f(x1,x2) = e the activation record and frame pointer are set up as follows:

        old FP
        x2          x2 is at fp + 8
        x1          x1 is at fp + 4
FP →    return
SP →

• Thus: cgen(xi) = lw $a0 z($fp)   where z = 4*i
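
In the recursive generator sketched earlier, the variable case is then a single load through $fp (the symbol-table lookup that yields the index i is assumed):

def cgen_var(i):
    # xi, the i-th formal parameter (1-based), lives at offset 4*i from $fp
    return [f"lw $a0 {4 * i}($fp)"]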
Summary

• The activation record must be designed together with the code generator

• Code generation can be done by recursive traversal of the AST

• Production compilers do things differently
– Emphasis is on keeping values (esp. the current stack frame) in registers
– Intermediate results are laid out in the AR, not pushed and popped from the stack
Allocating Temporaries in the AR

• Idea: Keep temporaries in the AR

• The code generator must assign a location in the AR for each temporary
How Many Temporaries?

• Let NT(e) = # of temps needed to evaluate e

• NT(e1 + e2)
– Needs at least as many temporaries as NT(e1)
– Needs at least as many temporaries as NT(e2) + 1

• Space used for temporaries in e1 can be reused for temporaries in e2
The Equations

NT(e1 + e2) = max(NT(e1), 1 + NT(e2))
NT(e1 - e2) = max(NT(e1), 1 + NT(e2))
NT(if e1 = e2 then e3 else e4) = max(NT(e1), 1 + NT(e2), NT(e3), NT(e4))
NT(id(e1,…,en)) = max(NT(e1),…,NT(en))
NT(int) = 0
NT(id) = 0

Is this bottom-up or top-down?


What is NT(…code for fib…)?
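
The equations are computed bottom-up: NT of a node depends only on NT of its children. A direct transcription in Python, reusing the tuple encoding from the earlier sketch plus ("if", e1, e2, e3, e4) and ("call", f, args) (illustrative):

def NT(e):
    if isinstance(e, (int, str)):        # NT(int) = NT(id) = 0
        return 0
    if e[0] in ("+", "-"):
        return max(NT(e[1]), 1 + NT(e[2]))
    if e[0] == "if":
        return max(NT(e[1]), 1 + NT(e[2]), NT(e[3]), NT(e[4]))
    if e[0] == "call":
        return max((NT(a) for a in e[2]), default=0)

# Working the equations through the body of fib gives NT = 2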
The Revised AR

• For a function definition f(x1,…,xn) = e the AR has 2 + n + NT(e) elements
– Return address
– Frame pointer
– n arguments
– NT(e) locations for intermediate results
Picture

Old FP
xn
...
x1
Return Addr.
Temp NT(e)
...
Temp 1
Revised Code Generation

• Code generation must know how many temporaries are in use at each point

• Add a new argument to code generation: the position of the next available temporary
Code Generation for + (original)

cgen(e1 + e2) =
cgen(e1)
sw $a0 0($sp)
addiu $sp $sp -4
cgen(e2)
lw $t1 4($sp)
add $a0 $t1 $a0
addiu $sp $sp 4
Code Generation for + (revised)

cgen(e1 + e2, nt) =
cgen(e1, nt)
sw $a0 nt($fp)
cgen(e2, nt + 4)
lw $t1 nt($fp)
add $a0 $t1 $a0
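
A sketch of the revised generator in Python (same tuple encoding as before; nt is the offset of the next free temporary under the slides' convention):

def cgen2(e, nt):
    if isinstance(e, int):
        return [f"li $a0 {e}"]
    op, e1, e2 = e
    mips_op = "add" if op == "+" else "sub"
    return (cgen2(e1, nt)
            + [f"sw $a0 {nt}($fp)"]       # save e1 in the temp at offset nt
            + cgen2(e2, nt + 4)           # e2 may use temps beyond nt
            + [f"lw $t1 {nt}($fp)",
               f"{mips_op} $a0 $t1 $a0"]) # no $sp traffic at all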
Notes

• The temporary area is used like a small, fixed-size stack

• Exercise: Write out cgen for other constructs
Code Generation for OO Languages
Outline

• Code generation for OO languages


– Object layout
– Dynamic dispatch

• Parameter passing mechanisms


Object Layout

• OO Slogan: If B is a subclass of A, then an object of class B can be used wherever an object of class A is expected

• This means that code in class A works unmodified for an object of class B
Two Issues

• How are objects represented in memory?

• How is dynamic dispatch implemented?


Object Layout Example

Class A {
  a: Int <- 0;
  d: Int <- 1;
  f(): Int { a <- a + d };
};

Class B inherits A {
  b: Int <- 2;
  f(): Int { a };          // Override
  g(): Int { a <- a - b };
};

Class C inherits A {
  c: Int <- 3;
  h(): Int { a <- a * c };
};
Object Layout (Cont.)

• Attributes a and d are inherited by classes B and C

• All methods in all classes refer to a

• For A’s methods to work correctly in A, B, and C objects, attribute a must be in the same “place” in each object
Object Layout (Cont.)

• An object is like a struct in C. The reference

foo.field

is an index into a foo struct at an offset corresponding to field

• Objects in Cool are implemented similarly
– Objects are laid out in contiguous memory
– Each attribute is stored at a fixed offset in the object
– When a method is invoked, the object is self and the fields are the object’s attributes
Cool Object Layout

• The first 3 words of Cool objects contain header information:

                 Offset
Class Tag        0
Object Size      4
Dispatch Ptr     8
Attribute 1      12
Attribute 2      16
...
Cool Object Layout (Cont.)

• Class tag is an integer


– Identifies class of the object
• Object size is an integer
– Size of the object in words
• Dispatch ptr is a pointer to a table of methods
– More later
• Attributes in subsequent slots

• Laid out in contiguous memory
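
As a toy model, such an object can be pictured as a flat array of words, one slot per 4-byte offset (this Python sketch is illustrative only; real objects live in contiguous memory):

CLASS_TAG, OBJ_SIZE, DISPATCH_PTR = 0, 1, 2   # header slots (offsets 0, 4, 8)

def make_object(tag, dispatch_table, attrs):
    # [class tag, size in words, dispatch ptr, attribute 1, attribute 2, ...]
    return [tag, 3 + len(attrs), dispatch_table] + list(attrs)

a_obj = make_object("Atag", [], [0, 1])   # an A: attribute a = 0, d = 1
# attribute a sits at index 3 (offset 12) in A and in every subclass of A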


Subclasses

Observation: Given a layout for class A, a layout for subclass B can be defined by extending the layout of A with additional slots for the additional attributes of B

Leaves the layout of A unchanged (B is an extension)
Layout Picture

Offset   0      4    8    12   16   20

Class
A        Atag   5    *    a    d
B        Btag   6    *    a    d    b
C        Ctag   6    *    a    d    c

(the * at offset 8 is the dispatch pointer)
Subclasses (Cont.)

• The offset for an attribute is the same in a class and all of its subclasses
– Any method for an A1 can be used on a subclass A2
• Consider the layout for An < … < A3 < A2 < A1:

Header
A1 attrs.     ← A1 object ends here
A2 attrs      ← A2 object ends here
A3 attrs      ← A3 object ends here
...
Dynamic Dispatch

• Consider again our example

Class A {
  a: Int <- 0;
  d: Int <- 1;
  f(): Int { a <- a + d };
};

Class B inherits A {
  b: Int <- 2;
  f(): Int { a };
  g(): Int { a <- a - b };
};

Class C inherits A {
  c: Int <- 3;
  h(): Int { a <- a * c };
};
Dynamic Dispatch Example

• e.g()
– g refers to the method in B if e is a B
• e.f()
– f refers to the method in A if e is an A or a C (inherited in the case of C)
– f refers to the method in B for a B object

• The implementation of methods and dynamic dispatch strongly resembles the implementation of attributes
Dispatch Tables

• Every class has a fixed set of methods (including inherited methods)

• A dispatch table indexes these methods
– An array of method entry points
– A method f lives at a fixed offset in the dispatch table for a class and all of its subclasses
Dispatch Table Example

• The dispatch table for class A has only 1 method
• The tables for B and C extend the table for A to the right
• Because methods can be overridden, the method for f is not the same in every class, but it is always at the same offset

Offset   0     4

Class
A        fA
B        fB    g
C        fA    h
Using Dispatch Tables

• The dispatch pointer in an object of class X points to the dispatch table for class X

• Every method f of class X is assigned an offset Of in the dispatch table at compile time
Using Dispatch Tables (Cont.)

• Every method must know what object is “self”
– “self” is passed as the first argument to all methods
• To implement a dynamic dispatch e.f() we
– Evaluate e, obtaining an object x
– Find D by reading the dispatch-table field of x
– Call D[Of](x)
• D is the dispatch table for x
• In the call, self is bound to x
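
Continuing the toy list-of-words sketch from the object-layout slide, dispatch is one load plus an indexed call (names here are hypothetical):

OFFSET_OF_F = 0                 # Of: f's slot in the dispatch table

def f_A(self):                  # A.f reads attribute a (offset 12 = index 3)
    return self[3]

A_table = [f_A]                 # dispatch table for class A

def dispatch(x, method_offset):
    D = x[DISPATCH_PTR]         # read the dispatch-table field of x
    return D[method_offset](x)  # call D[Of](x); self is bound to x

# dispatch(make_object("Atag", A_table, [0, 1]), OFFSET_OF_F) returns 0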
Parameter Passing Mechanisms
Parameter Passing Mechanisms

• There are many semantic issues in programming languages centering on when values are computed and the scope of names

• We’ll focus on parameter passing
– When are arguments of function calls evaluated?
– To what are formal parameters bound?
Call-By-Value

• To evaluate f(e)
– Evaluate e to a value v
– Bind v to the formal parameter in the function body
• Example
g(x) = x <- x + 1;
f(y) = { g(y); print(y); }

• Under call-by-value, f(0) prints 0
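
For this example, plain Python variables behave the same way (rebinding the formal parameter never affects the caller's variable), a direct transliteration:

def g(x):
    x = x + 1        # rebinds the formal x only

def f(y):
    g(y)
    print(y)         # prints 0: y is untouched

f(0)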


The Stack Under Call-By-Value

f(y)              y = 0
  g(y)            y = 0, x = 0
  x <- x + 1      y = 0, x = 1
  print(y)        y = 0
Call-By-Value Discussion

• Under call-by-value, g(y) does not affect the value of y

• y’s value, not y itself, is passed to g
– The formal parameter is stored in a different location from the actual parameter

• Call-by-value is widely used
– C is a prominent example
Call-By-Reference

• To evaluate f(e)
– e is evaluated
– A pointer to e is passed as the argument
– f’s code accesses the argument through the
pointer

• If e is already a stored value (a variable), a pointer to that location is used
• Otherwise, e is evaluated and stored in a fresh location first
The Stack Under Call-By-Reference

f(y)              y = 0
  g(y)            y = 0, x → y
  x <- x + 1      y = 1, x → y
  print(y)        y = 1
Call-By-Reference Discussion

• Under call-by-reference, only the address is passed
– References to the value dereference the pointer

• In the example, because x and y refer to the same value, changes to x also change y

• Many languages pass large data structures (e.g., arrays) by reference
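
Python has no call-by-reference, but a one-element list can play the role of the shared location, a rough simulation of the example:

def g(x):
    x[0] = x[0] + 1      # write through the shared location

def f(y):
    g(y)
    print(y[0])          # prints 1: x and y named the same location

f([0])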
Call-By-Reference Discussion (Cont.)

• IMPORTANT: There is a difference between passing a value by reference and passing a pointer by value!

• In C, one often does not pass structures themselves as arguments; one passes pointers to them (by value)
Call-By-Name

• Consider a definition f(x) = e

• To evaluate f(e’)
– e’ is passed unevaluated to f
– e’ is evaluated each time e refers to x
– The expression e’ is always evaluated in its original
environment
The Stack Under Call-By-Name

Example
j(y) = { print(y); print(y); }
{ z <- 0; j(z++); }

z <- 0            z = 0
j(z++)            y bound to the unevaluated expression z++
  print(y)        evaluates z++: prints 0, z = 1
  print(y)        evaluates z++ again: prints 1, z = 2
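
A rough simulation of this trace in Python: pass z++ as a thunk (a zero-argument function) and re-evaluate it at every use of y (the post_increment helper is my own stand-in for z++):

z = 0

def post_increment():    # plays the role of z++: returns z, then increments
    global z
    old = z
    z = z + 1
    return old

def j(y):
    print(y())           # first use of y: prints 0, z becomes 1
    print(y())           # second use: prints 1, z becomes 2

j(post_increment)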
Call-By-Name Discussion

• Note: a function argument may be evaluated 0 or more times
– Evaluated 0 times if the argument is never used

• Call-by-name is used in lazy functional languages

• There are other, less important, parameter passing mechanisms
