AT_Module6_Compiler and its phases_PS
Topic
Introduction to Compiler and its phases
Course Outcome
Develop understanding of applications of various automata.
Compiler phases 1
Recommended Reading
Language Processing System
Language Processors.
Assembler: source program (in assembly language) → Assembler → object program (in machine language)
Language Processors.
⚫ A compiler is a program that reads a program in one language (the source
language) and translates it into an equivalent program in another language (the
target language).
⚫ An important role of the compiler is to report any errors in the source program that
it detects during the translation process.
Interpreter
Overview of Compilers
Compilers and Interpreters
Why Interpretation
❖ A higher degree of machine independence: high
portability.
❖ Dynamic execution: modification or addition to user
programs as execution proceeds.
❖ Dynamic data type: type of object may change at
runtime
❖ Easier to write: there is no synthesis part.
Compilation Process:
[Figure: compilation process, with labels Source program, Compiler, Data, Result]
Example Of Combining Both Interpreter and Compiler
⚫ A Java source program may first be compiled into an intermediate form called
bytecodes.
⚫ The bytecodes are then interpreted by a virtual machine. A benefit of this
arrangement is that bytecodes compiled on one machine can be interpreted on
another machine, perhaps across a network.
⚫ In order to achieve faster processing of inputs to outputs, some Java compilers,
called just-in-time compilers, translate the bytecodes into machine language
immediately before they run the intermediate program to process the input.
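The interpretation step can be illustrated with a toy stack machine. This is only a sketch: the instruction names (push, load, add, mul, store) and the run function are invented for illustration and are not real Java bytecodes or any JVM API.

```python
def run(bytecode, variables):
    """Interpret a list of toy stack-machine instructions (hypothetical format)."""
    stack = []
    for instruction in bytecode:
        op = instruction[0]
        if op == "push":        # push a constant
            stack.append(instruction[1])
        elif op == "load":      # push the value of a variable
            stack.append(variables[instruction[1]])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "store":     # pop the top of the stack into a variable
            variables[instruction[1]] = stack.pop()
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return variables


# Intermediate form for: position = initial + rate * 60
program = [
    ("load", "initial"),
    ("load", "rate"),
    ("push", 60),
    ("mul",),
    ("add",),
    ("store", "position"),
]
result = run(program, {"initial": 10.0, "rate": 2.0})
print(result["position"])
```

The same program list could be shipped to any machine that has a run loop like this one, which is the portability benefit described above.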
Model of A Compiler
[Figure: the source program passes through Analysis (Lexical Analysis, Syntactic Analysis, Semantic Analysis) and Synthesis (Code Generator, Code Optimizer) to yield the object program; the phases share the Tables]
Tasks of Compilation Process and Its Output
Error handler
Dividing into tokens (symbols: variable name, keyword or number)
Lexical Analysis (scanner): The first phase of a compiler
Example: position = initial + rate * 60
1. "position" is a lexeme mapped into the token (id, 1), where id is an abstract symbol
standing for identifier and 1 points to the symbol-table entry for position. The
symbol-table entry for an identifier holds information about the identifier, such as its
name and type.
2. = is a lexeme that is mapped into the token (=). Since this token needs no
attribute value, the second component is omitted. For notational convenience, the
lexeme itself is used as the name of the abstract symbol.
3. "initial" is a lexeme mapped into the token (id, 2), where 2 points to the
symbol-table entry for initial.
4. + is a lexeme that is mapped into the token (+).
5. "rate" is a lexeme mapped into the token (id, 3), where 3 points to the symbol-table
entry for rate.
6. * is a lexeme that is mapped into the token (*).
7. 60 is a lexeme that is mapped into the token (60).
Blanks separating the lexemes would be discarded by the lexical analyzer.
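The mapping from lexemes to tokens above can be sketched with a small regular-expression scanner. The names TOKEN_SPEC and tokenize are assumptions of this sketch, and the symbol table is simplified to a list of names (entry numbers are 1-based list positions).

```python
import re

# Token patterns, tried in order; whitespace is matched so it can be discarded.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[=+*]"),
    ("skip", r"\s+"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table) for a one-line assignment statement."""
    symbol_table = []  # entry i (1-based) holds the name of identifier i
    tokens = []
    pos = 0
    while pos < len(source):
        match = PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue  # blanks separating the lexemes are discarded
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        else:
            # operators and numbers: the lexeme itself names the token
            tokens.append((lexeme,))
    return tokens, symbol_table


tokens, table = tokenize("position = initial + rate * 60")
print(tokens)  # [('id', 1), ('=',), ('id', 2), ('+',), ('id', 3), ('*',), ('60',)]
print(table)   # ['position', 'initial', 'rate']
```

A real symbol table would also record each identifier's type; only the name is kept here to keep the sketch short.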
Syntax Analysis (parser): The second phase of the compiler
⚫ The parser uses the first components of the tokens produced by the lexical
analyzer to create a tree-like intermediate representation that depicts the
grammatical structure of the token stream.
⚫ A typical representation is a syntax tree in which each interior node
represents an operation and the children of the node represent the
arguments of the operation.
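A minimal recursive-descent sketch of this step for the statement position = initial + rate * 60 follows. The grammar (stmt → id = expr, expr → term (+ term)*, term → factor (* factor)*, factor → id | number), the tuple-based tree shape, and the name parse_assignment are assumptions chosen for illustration. Note that the parser only looks at the first component of each token.

```python
def parse_assignment(tokens):
    """Build a syntax tree from tokens such as ("id", 1), ("=",), ("60",)."""
    pos = 0

    def peek():
        # Only the first component of a token (its name) guides the parser.
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        # factor -> id | number
        name = peek()
        if name == "id" or (name is not None and name.isdigit()):
            return advance()
        raise SyntaxError(f"expected identifier or number, got {name!r}")

    def term():
        # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    # stmt -> id '=' expr
    if peek() != "id":
        raise SyntaxError("expected identifier on the left of '='")
    target = advance()
    if peek() != "=":
        raise SyntaxError("expected '='")
    advance()
    tree = ("=", target, expr())
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return tree


tokens = [("id", 1), ("=",), ("id", 2), ("+",), ("id", 3), ("*",), ("60",)]
tree = parse_assignment(tokens)
print(tree)
# ('=', ('id', 1), ('+', ('id', 2), ('*', ('id', 3), ('60',))))
```

Each interior node is an operator with its two arguments as children, and * binds tighter than + because term is parsed below expr.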
Syntax Analysis Example
[Figure: syntax tree for an assignment statement, with children identifier, =, and expression]
Semantic Analysis: The third phase of the compiler
⚫ The semantic analyzer uses the syntax tree and the information in the
symbol table to check the source program for semantic consistency with the
language definition.
⚫ Gathers type information and saves it in either the syntax tree or the symbol
table, for subsequent use during intermediate-code generation.
⚫ An important part of semantic analysis is type checking, where the
compiler checks that each operator has matching operands. For example,
many programming language definitions require an array index to be an
integer; the compiler must report an error if a floating-point number is used
to index an array.
⚫ The language specification may permit some type conversions called
coercions. For example, a binary arithmetic operator may be applied to
either a pair of integers or to a pair of floating-point numbers. If the
operator is applied to a floating-point number and an integer, the compiler
may convert or coerce the integer into a floating-point number.
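Type checking with coercion can be sketched as a walk over the syntax tree. Everything named here is an assumption of the sketch: the tuple tree shape, the check function, the node label inttofloat, the declaration of all three identifiers as float, and the rule that assigning a float to an int variable is an error.

```python
def check(node, id_types):
    """Return (node, type), wrapping int operands of mixed operations in a coercion."""
    head = node[0]
    if head == "id":
        return node, id_types[node[1]]   # type recorded in the symbol table
    if head.isdigit():
        return node, "int"               # integer literal
    left, left_type = check(node[1], id_types)
    right, right_type = check(node[2], id_types)
    if head == "=":
        # Assumption of this sketch: narrowing float -> int is rejected.
        if left_type == "int" and right_type == "float":
            raise TypeError("cannot assign a float value to an int variable")
        if left_type == "float" and right_type == "int":
            right = ("inttofloat", right)
        return (head, left, right), left_type
    # Binary arithmetic: coerce the int operand when the types are mixed.
    if left_type == "int" and right_type == "float":
        left = ("inttofloat", left)
    elif left_type == "float" and right_type == "int":
        right = ("inttofloat", right)
    result_type = "float" if "float" in (left_type, right_type) else "int"
    return (head, left, right), result_type


# position, initial and rate are assumed to be declared as float.
id_types = {1: "float", 2: "float", 3: "float"}
tree = ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("60",))))
checked, result_type = check(tree, id_types)
print(checked)
# ('=', ('id', 1), ('+', ('id', 2), ('*', ('id', 3), ('inttofloat', ('60',)))))
```

The integer 60 is the only operand whose type differs from its sibling, so it is the only node that receives a conversion.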
Intermediate Code Generation: three-address code
After syntax and semantic analysis of the source program, many compilers
generate an explicit low-level or machine-like intermediate representation
(a program for an abstract machine). This intermediate representation
should have two important properties:
⚫ it should be easy to produce and
⚫ it should be easy to translate into the target machine.
The intermediate form considered here is three-address code, which consists of
a sequence of assembly-like instructions with three operands per
instruction. Each operand can act like a register.
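One way to produce three-address code is a post-order walk of the type-checked syntax tree that introduces a fresh temporary for each interior node. The tree shape, the gen_three_address function, and the temporary names t1, t2, ... are assumptions of this sketch.

```python
def gen_three_address(tree):
    """Translate an assignment tree into a list of three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def walk(node):
        head = node[0]
        if head == "id":
            return f"id{node[1]}"      # refer to the symbol-table entry
        if head.isdigit():
            return head                # integer literal
        if head == "inttofloat":
            operand = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttofloat({operand})")
            return temp
        # Binary operator: evaluate both children, then combine them.
        left = walk(node[1])
        right = walk(node[2])
        temp = new_temp()
        code.append(f"{temp} = {left} {head} {right}")
        return temp

    if tree[0] != "=":
        raise ValueError("expected an assignment at the root")
    target = walk(tree[1])
    value = walk(tree[2])
    code.append(f"{target} = {value}")
    return code


tree = ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("inttofloat", ("60",)))))
code = gen_three_address(tree)
print("\n".join(code))
# t1 = inttofloat(60)
# t2 = id3 * t1
# t3 = id2 + t2
# id1 = t3
```

Each instruction has at most one operator on its right-hand side, and the order of the instructions fixes the order of evaluation.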
Code Optimization: to generate better target code
⚫ The optimizer can deduce that the conversion of 60 from integer to floating
point can be done once and for all at compile time, so the int-to-float
operation can be eliminated by replacing the integer 60 with the floating-point
number 60.0. Moreover, t3 is used only once, to transmit its value to id1, so it can be eliminated as well.
⚫ There are simple optimizations that significantly improve the running time
of the target program without slowing down compilation too much.
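The two simplifications described above (folding the conversion of a constant at compile time, and removing a temporary that is only copied once) might be sketched as follows. The instruction format (dest, op, arg1, arg2), the op name copy, and the optimize function are assumptions of this sketch; temporaries are not renumbered afterwards.

```python
def optimize(code):
    """code is a list of (dest, op, arg1, arg2) three-address instructions."""
    # Pass 1: fold inttofloat applied to an integer literal at compile time.
    constants = {}
    folded = []
    for dest, op, a1, a2 in code:
        a1 = constants.get(a1, a1)
        a2 = constants.get(a2, a2)
        if op == "inttofloat" and a1.isdigit():
            constants[dest] = str(float(int(a1)))  # "60" -> "60.0"
            continue
        folded.append((dest, op, a1, a2))

    # Pass 2: remove a copy whose source temporary is defined by the
    # instruction just before it and is used nowhere else.
    result = []
    for instr in folded:
        dest, op, a1, a2 = instr
        if op == "copy" and result and result[-1][0] == a1 and a1.startswith("t"):
            uses = sum(a1 in (i[2], i[3]) for i in folded)
            if uses == 1:
                _, prev_op, p1, p2 = result.pop()
                result.append((dest, prev_op, p1, p2))
                continue
        result.append(instr)
    return result


code = [
    ("t1", "inttofloat", "60", None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
optimized = optimize(code)
for instr in optimized:
    print(instr)
# ('t2', '*', 'id3', '60.0')
# ('id1', '+', 'id2', 't2')
```

Four instructions shrink to two, and both passes are single linear scans, so compilation time is barely affected.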
Code Generation: takes as input an intermediate representation of the
source program and maps it into the target language
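As a rough illustration, the sketch below maps optimized three-address instructions onto a hypothetical register machine. The mnemonics LDF, MULF, ADDF and STF, the register names R1, R2, ..., and the generate function are assumptions for illustration and do not describe a real instruction set. The sketch also assumes the first operand of every instruction is a variable or an already computed temporary, and it never reuses a register.

```python
import itertools


def generate(code):
    """Translate (dest, op, arg1, arg2) instructions into assembly-like text."""
    opcodes = {"+": "ADDF", "*": "MULF"}
    registers = (f"R{n}" for n in itertools.count(1))  # never reused here
    holds = {}    # name -> register that currently holds its value
    output = []

    def in_register(name):
        # Emit a load the first time a variable is needed in a register.
        if name not in holds:
            holds[name] = next(registers)
            output.append(f"LDF {holds[name]}, {name}")
        return holds[name]

    for dest, op, arg1, arg2 in code:
        left = in_register(arg1)
        if arg2.startswith(("id", "t")):
            right = in_register(arg2)
        else:
            right = f"#{arg2}"           # immediate constant
        output.append(f"{opcodes[op]} {left}, {left}, {right}")
        del holds[arg1]                  # the left register now holds the result
        if dest.startswith("id"):
            output.append(f"STF {dest}, {left}")  # write back to the variable
        else:
            holds[dest] = left           # keep the temporary in its register
    return output


optimized = [("t1", "*", "id3", "60.0"), ("id1", "+", "id2", "t1")]
assembly = generate(optimized)
print("\n".join(assembly))
# LDF R1, id3
# MULF R1, R1, #60.0
# LDF R2, id2
# ADDF R2, R2, R1
# STF id1, R2
```

A crucial part of real code generation, the judicious assignment of a limited set of registers to variables, is sidestepped here by handing out a new register every time.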
Symbol-Table Management:
Translation of an assignment statement
Grouping of Compiler Phases
⚫ Front end
❖ Consists of those phases that depend on the source language but are
largely independent of the target machine.
⚫ Back end
❖ Consists of those phases that are usually target-machine dependent.
Common Back-end Compiling System
Optimizer
Compiling Passes
Cousins of Compilers
⚫ Preprocessors
⚫ Assemblers
❖ A compiler may produce assembly code instead of
generating relocatable machine code directly.
⚫ Loaders and Linkers
❖ The loader copies code and data into memory, allocates
storage, sets protection bits, maps virtual
addresses, etc.
❖ The linker handles relocation and resolves symbol
references.
⚫ Debugger
Tasks of Compilation Process and Its Output
Parse Tree and Symbol Table
⚫ Questions