
Lecture 1 - CSC 303

The document outlines the course objectives and structure for a Compiler Construction course, covering essential topics such as lexical analysis, syntax analysis, and code generation. It details the phases of a compiler, including analysis and synthesis, and discusses the importance of compiler design in computer science. Additionally, it provides information on course materials, grading, and an assignment due date.

Uploaded by

adamilola2019

COMPILER CONSTRUCTION (CSC 303)
LECTURE 1
Course Objectives
This course introduces students to the fundamental principles and techniques involved in designing and implementing compilers.
Topics include lexical analysis, syntax analysis, semantic analysis, intermediate code generation, optimization, and code generation.

A.O. AGBEYANGI - CHRISLAND UNIVERSITY
COURSE OUTLINE
Introduction to Compilers
Lexical Analysis
Syntax Analysis
Semantic Analysis
Intermediate Code Generation
Code Optimization and Generation

COURSE INFORMATION
Textbooks:
Compilers: Principles, Techniques and Tools (2nd Ed)
by Aho, Lam, Sethi & Ullman.
Modern Compiler Implementation in Java (2nd Ed)
by Andrew W. Appel

Grading:
Assignment – 15%
Mid-semester exam – 15%
Exam – 70%
Compilers, Interpreters …
Compilers and Interpreters
“Compilation”
◦ Translation of a program written in a source language into a semantically equivalent program written in a target language
◦ Oversimplified view:
  Source Program → Compiler → Target Program
  Compiler → Error messages
  Input → Target Program → Output

Compilers and Interpreters (cont’d)

“Interpretation”
◦ Performing the operations implied by the source program
◦ Oversimplified view:

  Source Program → Interpreter → Output
  Input → Interpreter
  Interpreter → Error messages

The Analysis-Synthesis Model of Compilation
There are two parts to compilation:
◦ Analysis determines the operations implied by the source program, which are recorded in a tree structure
◦ Synthesis takes the tree structure and translates the operations therein into the target program
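
The two parts can be sketched in a few lines of Python. This is only an illustration: the analysis step borrows Python's own parser from the standard `ast` module, and the stack-machine instruction names (`PUSH`, `LOAD`, `ADD`, ...) are made up for the example.

```python
import ast

def analyze(source):
    """Analysis: record the operations implied by the source text in a tree."""
    return ast.parse(source, mode="eval").body

def synthesize(node):
    """Synthesis: walk the tree and emit code for a hypothetical stack machine."""
    ops = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}
    if isinstance(node, ast.Constant):
        return [f"PUSH {node.value}"]
    if isinstance(node, ast.Name):
        return [f"LOAD {node.id}"]
    if isinstance(node, ast.BinOp) and type(node.op) in ops:
        # Post-order walk: operands first, then the operator.
        return synthesize(node.left) + synthesize(node.right) + [ops[type(node.op)]]
    raise ValueError(f"unsupported construct: {ast.dump(node)}")

target = synthesize(analyze("b + c * 2"))
print(target)  # ['LOAD b', 'LOAD c', 'PUSH 2', 'MUL', 'ADD']
```

Changing the target only means rewriting `synthesize`; the analysis half is untouched.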

Other Tools that Use the Analysis-Synthesis Model
Editors (syntax highlighting)
Pretty printers (e.g. Doxygen)
Static checkers (e.g. Lint and Splint)
Interpreters
Text formatters (e.g. TeX and LaTeX)
Silicon compilers (e.g. VHDL)
Query interpreters/compilers (Databases)

Why do we care?

Compiler construction is a microcosm of computer science:
◦ artificial intelligence: greedy algorithms, learning algorithms
◦ algorithms: graph algorithms, union-find, dynamic programming
◦ theory: DFAs for scanning, parser generators, lattice theory for analysis
◦ systems: allocation and naming, locality, synchronization
◦ architecture: pipeline management, hierarchy management, instruction set use

Inside a compiler, all these things come together


Preprocessors, Compilers, Assemblers, and Linkers
Skeletal Source Program
↓ Preprocessor
Source Program
↓ Compiler
Target Assembly Program
↓ Assembler
Relocatable Object Code
↓ Linker (together with Libraries and Relocatable Object Files)
Absolute Machine Code


The Phases of a Compiler

Phase: Programmer (source code producer)
Output: Source string
Sample: A=B+C;

Phase: Scanner (performs lexical analysis)
Output: Token string and symbol table with names
Sample: ‘A’, ‘=’, ‘B’, ‘+’, ‘C’, ‘;’

Phase: Parser (performs syntax analysis based on the grammar of the programming language)
Output: Parse tree or abstract syntax tree
Sample:
      ;
      |
      =
     / \
    A   +
       / \
      B   C

Phase: Semantic analyzer (type checking, etc)
Output: Annotated parse tree or abstract syntax tree

Phase: Intermediate code generator
Output: Three-address code, quads, or RTL
Sample:
    int2fp B t1
    + t1 C t2
    := t2 A

Phase: Optimizer
Output: Three-address code, quads, or RTL
Sample:
    int2fp B t1
    + t1 #2.3 A

Phase: Code generator
Output: Assembly code
Sample:
    MOVF #2.3,r1
    ADDF2 r1,r2
    MOVF r2,A

Phase: Peephole optimizer
Output: Assembly code
Sample:
    ADDF2 #2.3,r2
    MOVF r2,A
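
As a rough illustration of the intermediate code generator row, the sketch below emits quads in the same "op arg1 arg2 result" layout for A=B+C. The type table is an assumption read off the int2fp conversion in the sample (B an integer, A and C floats), and `gen_assign` is a hypothetical helper, not part of any real compiler.

```python
from itertools import count

# Assumed declarations: int2fp in the sample implies B is an int, A and C are floats.
TYPES = {"A": "float", "B": "int", "C": "float"}

def gen_assign(target, left, op, right, types=TYPES):
    """Emit quads for `target = left op right`, converting ints where needed."""
    temps = (f"t{i}" for i in count(1))  # fresh temporary names t1, t2, ...
    code = []
    operands = []
    for name in (left, right):
        if types[name] == "int" and types[target] == "float":
            temp = next(temps)
            code.append(f"int2fp {name} {temp}")  # widen the integer operand
            name = temp
        operands.append(name)
    result = next(temps)
    code.append(f"{op} {operands[0]} {operands[1]} {result}")
    code.append(f":= {result} {target}")
    return code

quads = gen_assign("A", "B", "+", "C")
for quad in quads:
    print(quad)
```

Under these assumptions the three quads match the intermediate code sample for this phase.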
The Grouping of Phases
Compiler front and back ends:
◦ Front end: analysis (machine independent)
◦ Back end: synthesis (machine dependent)
Compiler passes:
◦ A collection of phases is done only once (single pass) or multiple times (multi pass)
◦ Single pass: usually requires everything to be defined before being used in source program
◦ Multi pass: compiler may have to keep entire program representation in memory

Compiler-Construction Tools

Software development tools are available to implement one or more compiler phases
◦ Scanner generators
◦ Parser generators
◦ Syntax-directed translation engines
◦ Automatic code generators
◦ Data-flow engines

What qualities are important in a compiler?
1. Correct code
2. Output runs fast
3. Compiler runs fast
4. Compile time proportional to program size
5. Support for separate compilation
6. Good diagnostics for syntax errors
7. Works well with the debugger
8. Good diagnostics for flow anomalies
9. Cross language calls
10. Consistent, predictable optimization

A bit of history

1952: First compiler (linker/loader) written by Grace Hopper for A-0


programming language

1957: First complete compiler for FORTRAN by John Backus and team

1960: COBOL compilers for multiple architectures

1962: First self-hosting compiler for LISP

A compiler was originally a program that
“compiled” subroutines [a link-loader].
When in 1954 the combination “algebraic
compiler” came into use, or rather into
misuse, the meaning of the term had already
shifted into the present one.
— Bauer and Eickel [1975]

Abstract view

◦ recognize legal (and illegal) programs
◦ generate correct code
◦ manage storage of all variables and code
◦ agree on format for object (or assembly) code

Big step up from assembler: higher-level notations

Traditional two pass compiler

◦ intermediate representation (IR)
◦ front end maps legal code into IR
◦ back end maps IR onto target machine
◦ simplify retargeting
◦ allows multiple front ends
◦ multiple passes → better code

Front end

◦ recognize legal code
◦ report errors
◦ produce IR
◦ preliminary storage map
◦ shape code for the back end

Much of front end construction can be automated


Scanner

◦ map characters to tokens
◦ character string value for a token is a lexeme
◦ eliminate white space

x = x + y becomes <id,x> = <id,x> + <id,y>
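
A minimal scanner along these lines can be written with Python's standard `re` module. The token classes and the set of operator characters are assumptions chosen for this example; a real language would define its own.

```python
import re

# Assumed token classes for illustration; order matters (numbers before identifiers).
TOKEN_SPEC = [
    ("num", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[=+\-*/;()]"),
    ("ws", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(text):
    """Map a character string to a list of tokens, dropping white space."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at position {pos}")
        kind, lexeme = match.lastgroup, match.group()
        if kind == "op":
            tokens.append(lexeme)             # operators stand for themselves
        elif kind != "ws":
            tokens.append(f"<{kind},{lexeme}>")  # token class paired with its lexeme
        pos = match.end()
    return tokens

print(" ".join(scan("x = x + y")))  # <id,x> = <id,x> + <id,y>
```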

Parser

◦ recognize context-free syntax
◦ guide context-sensitive analysis
◦ construct IR(s)
◦ produce meaningful error messages
◦ attempt error correction

Parser generators mechanize much of the work


Context-free grammars
Context-free syntax is specified with a grammar, usually in Backus-Naur form (BNF)

1. <goal> := <expr>
2. <expr> := <expr> <op> <term>
3.         | <term>
4. <term> := number
5.         | id
6. <op>   := +
7.         | -

A grammar G = (S,N,T,P)
◦ S is the start-symbol
◦ N is a set of non-terminal symbols
◦ T is a set of terminal symbols
◦ P is a set of productions, P: N → (N ∪ T)*
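
One possible way to hold the grammar G = (S,N,T,P) as plain data is sketched below. The representation (sets of strings, productions as pairs) and the `well_formed` helper are assumptions for illustration; the check simply restates the condition P: N → (N ∪ T)*.

```python
# The seven productions of the expression grammar, as (left side, right side) pairs.
S = "goal"
N = {"goal", "expr", "term", "op"}
T = {"number", "id", "+", "-"}
P = [
    ("goal", ["expr"]),
    ("expr", ["expr", "op", "term"]),
    ("expr", ["term"]),
    ("term", ["number"]),
    ("term", ["id"]),
    ("op", ["+"]),
    ("op", ["-"]),
]

def well_formed(start, nonterminals, terminals, productions):
    """Check that each production maps a non-terminal to a string over N ∪ T."""
    symbols = nonterminals | terminals
    return (
        start in nonterminals
        and nonterminals.isdisjoint(terminals)
        and all(lhs in nonterminals and all(sym in symbols for sym in rhs)
                for lhs, rhs in productions)
    )

print(well_formed(S, N, T, P))  # True
```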
Deriving valid sentences
Given a grammar, valid sentences can be derived by repeated substitution.

Production   Result
             <goal>
1            <expr>
2            <expr> <op> <term>
5            <expr> <op> y
7            <expr> - y
2            <expr> <op> <term> - y
4            <expr> <op> 2 - y
6            <expr> + 2 - y
3            <term> + 2 - y
5            x + 2 - y

To recognize a valid sentence in some CFG, we reverse this process and build up a parse.
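
The substitution steps can be replayed mechanically. The sketch below assumes, as the table suggests, that each step rewrites the rightmost non-terminal, and that productions 4 and 5 are shown with the concrete lexeme (2, x, y) in place of the token class; the `derive` helper and its step encoding are hypothetical.

```python
# Productions numbered as in the expression grammar.
PRODUCTIONS = {
    1: ("<goal>", ["<expr>"]),
    2: ("<expr>", ["<expr>", "<op>", "<term>"]),
    3: ("<expr>", ["<term>"]),
    4: ("<term>", ["number"]),
    5: ("<term>", ["id"]),
    6: ("<op>", ["+"]),
    7: ("<op>", ["-"]),
}

def derive(steps):
    """Apply (production number, lexeme or None) steps to the rightmost non-terminal."""
    form = ["<goal>"]
    history = [" ".join(form)]
    for number, lexeme in steps:
        lhs, rhs = PRODUCTIONS[number]
        index = max(i for i, sym in enumerate(form) if sym.startswith("<"))
        if form[index] != lhs:
            raise ValueError(f"production {number} does not apply to {form[index]}")
        # For number/id, show the lexeme instead of the token class.
        form[index:index + 1] = [lexeme] if lexeme is not None else rhs
        history.append(" ".join(form))
    return history

steps = [(1, None), (2, None), (5, "y"), (7, None), (2, None),
         (4, "2"), (6, None), (3, None), (5, "x")]
history = derive(steps)
print(history[-1])  # x + 2 - y
```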

Parse trees

A parse can be represented by a tree called a parse or syntax tree.

Obviously, this contains a lot of unnecessary information

© Oscar Nierstrasz
Abstract syntax trees

So, compilers often use an abstract syntax tree (AST).

ASTs are often used as an IR.
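
To make the idea concrete, here is a small sketch that parses x + 2 - y straight into an AST made of nested tuples, with no nodes for <goal>, <expr>, <term>, or <op>. The tuple layout `(operator, left, right)` and the loop-based handling of the left-recursive <expr> rule are choices made for this example only.

```python
def parse_expr(tokens):
    """Parse `term (op term)*` and return a nested-tuple AST."""
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        if tok in ("+", "-"):
            raise SyntaxError(f"expected a term, found {tok!r}")
        pos += 1
        return int(tok) if tok.isdigit() else tok

    tree = term()
    while pos < len(tokens):
        op = tokens[pos]
        if op not in ("+", "-"):
            raise SyntaxError(f"expected an operator, found {op!r}")
        pos += 1
        tree = (op, tree, term())  # grow to the left: operators are left-associative
    return tree

tree = parse_expr("x + 2 - y".split())
print(tree)  # ('-', ('+', 'x', 2), 'y')
```

Only operators and operands survive, which is what makes the tree "abstract" compared with the full parse tree.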

Roadmap
Overview
Front end
Back end
Multi-pass compilers

Back end

◦ translate IR into target machine code
◦ choose instructions for each IR operation
◦ decide what to keep in registers at each point
◦ ensure conformance with system interfaces

Automation has been less successful here

Instruction selection

◦ produce compact, fast code
◦ use available addressing modes
◦ pattern matching problem
  - ad hoc techniques
  - tree pattern matching
  - string pattern matching
  - dynamic programming

Register allocation

◦ have value in a register when used
◦ limited resources
◦ changes instruction choices
◦ can move loads and stores
◦ optimal allocation is difficult

Modern allocators often use an analogy to graph coloring
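
The graph coloring analogy can be sketched as follows: variables are nodes, an edge joins two variables that are live at the same time, and registers are colors. The greedy heuristic below is a simplification chosen for brevity, not the algorithm of any particular allocator, and the interference graph is invented for the example.

```python
def color(interference, k):
    """Greedy coloring: map each variable to a register 0..k-1, or None (spill)."""
    assignment = {}
    # Visit the most constrained variables first; ties are broken by name.
    for var in sorted(interference, key=lambda v: (-len(interference[v]), v)):
        taken = {assignment[n] for n in interference[var]
                 if assignment.get(n) is not None}
        free = [reg for reg in range(k) if reg not in taken]
        assignment[var] = free[0] if free else None
    return assignment

# Hypothetical interference graph: a, b, c are mutually live; d overlaps only with c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
regs = color(graph, 3)
print(regs)
```

With three registers every variable gets one and d shares a register with a non-neighbour; with only two, one of the mutually interfering variables has to be spilled to memory.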



Traditional three-pass compiler

◦ analyzes and changes IR
◦ goal is to reduce runtime
◦ must preserve values

Optimizer (middle end)
Modern optimizers are usually built as a set of passes

◦ constant propagation and folding
◦ code motion
◦ reduction of operator strength
◦ common sub-expression elimination
◦ redundant store elimination
◦ dead code elimination
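
As an example of one such pass, the sketch below performs constant folding on an expression tree. The nested-tuple IR `(operator, left, right)` is an assumption made for this illustration; real optimizers work on richer representations.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(tree):
    """Constant folding: evaluate sub-expressions whose operands are all constants."""
    if not isinstance(tree, tuple):
        return tree  # a leaf: variable name or constant
    op, left, right = tree
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)  # both operands known at compile time
    return (op, left, right)

folded = fold(("+", "x", ("*", 2, ("+", 3, 4))))
print(folded)  # ('+', 'x', 14)
```

The pass changes the IR but preserves the value the expression computes, which is the constraint every optimization must respect.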
The MiniJava compiler

Compiler phases
Lex: Break source file into individual words, or tokens
Parse: Analyse the phrase structure of program
Parsing Actions: Build a piece of abstract syntax tree for each phrase
Semantic Analysis: Determine what each phrase means, relate uses of variables to their definitions, check types of expressions, request translation of each phrase
Frame Layout: Place variables, function parameters, etc., into activation records (stack frames) in a machine-dependent way
Translate: Produce intermediate representation trees (IR trees), a notation that is not tied to any particular source language or target machine
Canonicalize: Hoist side effects out of expressions, and clean up conditional branches, for convenience of later phases
Instruction Selection: Group IR-tree nodes into clumps that correspond to actions of target-machine instructions
Control Flow Analysis: Analyse sequence of instructions into control flow graph showing all possible flows of control program might follow when it runs
Data Flow Analysis: Gather information about flow of data through variables of program; e.g., liveness analysis calculates places where each variable holds a still-needed (live) value
Register Allocation: Choose registers for variables and temporary values; variables not simultaneously live can share same register
Code Emission: Replace temporary names in each machine instruction with registers
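
The liveness analysis mentioned under Data Flow Analysis can be sketched for the simplest case, a straight-line instruction sequence with no branches. The `(defined, used)` encoding of instructions and the three-instruction example are assumptions for illustration; with branches the same equation has to be iterated over the control flow graph.

```python
def liveness(instrs, live_out=frozenset()):
    """Return the set of live variables just before each instruction.

    Each instruction is a (defined, used) pair of variable-name sets.
    """
    live = set(live_out)
    result = []
    for defined, used in reversed(instrs):
        # A variable is live before an instruction if the instruction uses it,
        # or if it is live afterwards and not overwritten here.
        live = (live - set(defined)) | set(used)
        result.append(frozenset(live))
    result.reverse()
    return result

# Hypothetical code: t1 = a + b; t2 = t1 * c; d = t2 - a   (d is needed afterwards)
code = [({"t1"}, {"a", "b"}), ({"t2"}, {"t1", "c"}), ({"d"}, {"t2", "a"})]
live_in = liveness(code, live_out={"d"})
for step in live_in:
    print(sorted(step))
```

Here t1 and t2 are never live at the same point, so a register allocator could place them in the same register.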

A straight-line programming language (no loops or conditionals):
Stm → Stm ; Stm             CompoundStm
Stm → id := Exp             AssignStm
Stm → print ( ExpList )     PrintStm
Exp → id                    IdExp
Exp → num                   NumExp
Exp → Exp Binop Exp         OpExp
Exp → ( Stm , Exp )         EseqExp
ExpList → Exp , ExpList     PairExpList
ExpList → Exp               LastExpList
Binop → +                   Plus
Binop → -                   Minus
Binop → *                   Times
Binop → /                   Div

a := 5 + 3; b := (print(a, a-1), 10*a); print(b)
prints:
8 7
80
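
A small interpreter for this language shows how the sample program produces its output. The tuple encoding of the syntax tree (tags such as "compound", "assign", "eseq") is invented for this sketch, and treating / as integer division is an assumption.

```python
import operator

# Assumption: / is integer division in this sketch.
BINOPS = {"+": operator.add, "-": operator.sub,
          "*": operator.mul, "/": operator.floordiv}

def interp_stm(stm, env, out):
    """Execute a statement, updating variables in env and appending printed lines to out."""
    kind = stm[0]
    if kind == "compound":
        interp_stm(stm[1], env, out)
        interp_stm(stm[2], env, out)
    elif kind == "assign":
        env[stm[1]] = interp_exp(stm[2], env, out)
    elif kind == "print":
        out.append(" ".join(str(interp_exp(e, env, out)) for e in stm[1]))
    else:
        raise ValueError(f"unknown statement {kind!r}")

def interp_exp(exp, env, out):
    """Evaluate an expression; an eseq runs its statement first, then its expression."""
    kind = exp[0]
    if kind == "id":
        return env[exp[1]]
    if kind == "num":
        return exp[1]
    if kind == "op":
        left = interp_exp(exp[1], env, out)
        right = interp_exp(exp[3], env, out)
        return BINOPS[exp[2]](left, right)
    if kind == "eseq":
        interp_stm(exp[1], env, out)
        return interp_exp(exp[2], env, out)
    raise ValueError(f"unknown expression {kind!r}")

# a := 5 + 3; b := (print(a, a-1), 10*a); print(b)
prog = ("compound",
        ("assign", "a", ("op", ("num", 5), "+", ("num", 3))),
        ("compound",
         ("assign", "b",
          ("eseq",
           ("print", [("id", "a"), ("op", ("id", "a"), "-", ("num", 1))]),
           ("op", ("num", 10), "*", ("id", "a")))),
         ("print", [("id", "b")])))

output = []
interp_stm(prog, {}, output)
print("\n".join(output))
```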
Tree representation
a := 5 + 3; b := (print(a, a-1), 10*a); print(b)

SUMMARY: What you should know!
◦ What is the difference between a compiler and an interpreter?
◦ What are the important qualities of compilers?
◦ Why are compilers commonly split into multiple passes?
◦ What are the typical responsibilities of the different parts of a modern compiler?
◦ How are context-free grammars specified?
◦ What is “abstract” about an abstract syntax tree?
◦ What is an intermediate representation and what is it for?
◦ Why is optimisation a separate activity?
◦ Is Java compiled or interpreted? What about Python? Ruby? PHP? Are you sure?
◦ What are the key differences between modern compilers and compilers written in the 1970s?
◦ Why is it hard for compilers to generate good error messages?
◦ Discuss in detail the differences between a one-pass and a multi-pass compiler.
Assignment 1

Answer all the questions in the previous slide.

◦ Submission deadline: Sunday 21st January, 2024

◦ Send to: [email protected]

Questions

