1 Compiler Design Lect1

The document provides an overview of compiler design, detailing the processes involved in translating source programs into target machine code, including lexical analysis, syntax analysis, semantic analysis, intermediate code generation, optimization, and code generation. It distinguishes between compilers and interpreters, highlighting their respective roles in error detection and execution speed. Additionally, it discusses the structure of compilers, including the front-end and back-end phases, and the importance of symbol tables in managing variable information.

Uploaded by

Ayad
Compiler Design

Introduction

Compiler Construction

Language Processing System (figure):
Source Program -> Preprocessor -> Modified Source Program -> Compiler -> Target Assembly Program -> Assembler -> Relocatable Machine Code -> Linker (with Library Files and Relocatable Object Files) -> Target Machine Code -> Loader -> Results

Language Processors: Translators
• A translator takes a source program as input and converts it into an object (target) program.
• The source program is written in a source language.
• The object program is written in an object language.
• A translator can be an assembler, a compiler, or an interpreter.

Assembler:
source program (in assembly language) -> Assembler -> object program (in machine language)

Language Processors: Compiler
• A compiler is a program that can read a program in one language (the source language) and translate it into an equivalent program in another language (the target language).
• An important role of the compiler is to report any errors in the source program that it detects during the translation process.
• If the target program is an executable machine-language program, it can then be called by the user to process inputs and produce outputs.

Language Processors: Interpreter
An interpreter is another common kind of language processor. Instead of producing a target program as a translation, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user.

The machine-language target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs. An interpreter, however, can usually give better error diagnostics than a compiler, because it executes the source program statement by statement.

Compilers and Interpreters

Why Interpretation?
❖ A higher degree of machine independence: high portability.
❖ Dynamic execution: user programs can be modified or extended as execution proceeds.
❖ Dynamic data types: the type of an object may change at run time.
❖ Easier to write: there is no synthesis part.
❖ Better diagnostics: more source-text information is available.

Compilers: an Overview
- Compiler: translates a source program written in a High-Level Language (HLL), such as Pascal or C++, into the computer's machine language (a Low-Level Language (LLL)).
* The time at which the source program is converted into the object program is called compile time.
* The object program is executed at run time.

- Interpreter: processes an internal form of the source program and its data at the same time (at run time); no object program is generated.

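The difference between the two approaches can be sketched in Python. This is a minimal illustration under stated assumptions, not a real language processor: the nested-tuple "source program", the tiny stack-machine instruction set, and the names `interpret`, `compile_expr`, and `run` are all invented for the example, and the stack machine merely stands in for machine language.

```python
def interpret(tree, data):
    """Interpreter: process the source form and the data together at run time."""
    if isinstance(tree, str):                 # variable: read its value from the data
        return data[tree]
    if isinstance(tree, (int, float)):        # numeric literal
        return tree
    op, left, right = tree
    l, r = interpret(left, data), interpret(right, data)
    return l + r if op == "+" else l * r


def compile_expr(tree):
    """Compiler: translate once (compile time) into an object program."""
    if isinstance(tree, str):
        return [("load", tree)]
    if isinstance(tree, (int, float)):
        return [("push", tree)]
    op, left, right = tree
    last = ("add",) if op == "+" else ("mul",)
    return compile_expr(left) + compile_expr(right) + [last]


def run(program, data):
    """Run time: execute the object program on the data; the source is no longer needed."""
    stack = []
    for instr in program:
        if instr[0] == "load":
            stack.append(data[instr[1]])
        elif instr[0] == "push":
            stack.append(instr[1])
        else:
            r, l = stack.pop(), stack.pop()
            stack.append(l + r if instr[0] == "add" else l * r)
    return stack.pop()


source = ("+", "base", ("*", "rate", 60))     # base + rate * 60
obj = compile_expr(source)                     # done once
print(run(obj, {"base": 100, "rate": 2}))      # can be repeated with new data
print(interpret(source, {"base": 100, "rate": 2}))
```

The compiled path pays the translation cost once and can then be run on many inputs, while the interpreter revisits the source form on every execution.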
Compilers: an Overview (cont.)

Compilation Process:
Source program -> Compiler -> Object program (compile time)
Object program + Data -> Executing Computer -> Results (run time)

Interpretation Process:
Source program + Data -> Interpreter -> Result

Combining Both Interpreter and Compiler
Example
• Java language processors combine compilation and interpretation.
• A Java source program may first be compiled into an intermediate form called bytecodes.
• The bytecodes are then interpreted by a virtual machine. A benefit of this arrangement is that bytecodes compiled on one machine can be interpreted on another machine, perhaps across a network.
• In order to achieve faster processing of inputs to outputs, some Java compilers, called just-in-time compilers, translate the bytecodes into machine language immediately before they run the intermediate program to process the input.

Compiler Model
• A compiler must perform two tasks:
• Analysis of the source program: the analysis part breaks up the source program into constituent pieces and imposes a grammatical structure on them. It then uses this structure to create an intermediate representation of the source program.
• Synthesis of the target program: the synthesis part constructs the desired target program from the intermediate representation and the information in the symbol table.
• The analysis part is often called the front end of the compiler; the synthesis part is the back end.

Compiler Model (cont.)
Figure: compiler model
Input source program -> Analysis -> Synthesis -> Output object program
Analysis: Lexical Analysis, Syntactic Analysis, Semantic Analysis
Synthesis: Code Generator, Code Optimizer
Symbol Tables

Tasks of Compilation Process & Output

Figure: compiler phases and the error handler

First Phase: Lexical Analysis (Scanner):
• The lexical analyzer reads the stream of characters making up the source program and groups the characters into meaningful sequences called lexemes.
• For each lexeme, the lexical analyzer produces a token of the form
(token-name, attribute-value)
that it passes on to the subsequent phase, syntax analysis.
• Token-name: an abstract symbol that is used during syntax analysis.
• Attribute-value: points to an entry in the symbol table for this token.

Example:
position = initial + rate * 60
1. "position" is a lexeme mapped into the token (id, 1), where id is an abstract symbol standing for identifier and 1 points to the symbol-table entry for position. The symbol-table entry for an identifier holds information about the identifier, such as its name and type.
2. = is a lexeme that is mapped into the token (=). Since this token needs no attribute-value, we have omitted the second component. For notational convenience, the lexeme itself is used as the name of the abstract symbol.
3. "initial" is a lexeme that is mapped into the token (id, 2), where 2 points to the symbol-table entry for initial.
4. + is a lexeme that is mapped into the token (+).
5. "rate" is a lexeme mapped into the token (id, 3), where 3 points to the symbol-table entry for rate.
6. * is a lexeme that is mapped into the token (*).
7. 60 is a lexeme that is mapped into the token (60).

Blanks separating the lexemes would be discarded by the lexical analyzer.
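
The mapping from lexemes to tokens described above can be sketched in Python. This is a minimal illustration rather than a production scanner: the regular expression, the function name `tokenize`, and the list used as a symbol table are assumptions made for the example, and only identifiers, integers, and the operators `=`, `+`, `*` are recognized.

```python
import re

# One named alternative per lexeme class; "ws" matches the blanks to discard.
TOKEN_RE = re.compile(
    r"(?P<ws>\s+)|(?P<id>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[=+*])|(?P<bad>.)"
)


def tokenize(source):
    """Return (tokens, symbol_table) for one line of source text."""
    symbol_table = []          # entry i (1-based) holds the name for token (id, i)
    tokens = []
    for m in TOKEN_RE.finditer(source):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "ws":
            continue                                   # blanks are discarded
        if kind == "bad":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)            # first occurrence: new entry
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        else:
            tokens.append((lexeme,))                   # lexeme doubles as token name
    return tokens, symbol_table


tokens, table = tokenize("position = initial + rate * 60")
print(tokens)
print(table)
```

For the example statement this yields the seven tokens listed above, with `position`, `initial`, and `rate` stored as symbol-table entries 1, 2, and 3.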
Second Phase: Syntax Analysis (parser)

• The parser uses the first components of the tokens produced by the lexical analyzer to create a tree-like intermediate representation that depicts the grammatical structure of the token stream.
• A typical representation is a syntax tree in which each interior node represents an operation and the children of the node represent the arguments of the operation.

Syntax Analysis: Example
Pay := Base + Rate * 60
❖ The seven tokens are grouped into a parse tree:

Assignment stmt
    identifier: pay
    :=
    expression
        expression
            identifier: base
        +
        expression
            Rate * 60

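One way to build such a tree is a small recursive-descent parser. The sketch below is an illustration under assumptions, not the slide author's parser: it assumes a grammar in which `*` binds tighter than `+`, takes the tokens as plain strings, and produces a compact syntax tree of nested tuples (operator first) instead of the full parse tree shown in the figure. The name `parse_assignment` is hypothetical.

```python
def parse_assignment(tokens):
    """Parse `id := expr` into a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():                       # factor -> identifier | number
        tok = peek()
        if tok is None or tok in (":=", "+", "*"):
            raise SyntaxError(f"expected identifier or number, got {tok!r}")
        return advance()

    def term():                         # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                         # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = factor()
    if peek() != ":=":
        raise SyntaxError("expected ':='")
    advance()
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse_assignment(["Pay", ":=", "Base", "+", "Rate", "*", "60"]))
```

Each interior node of the result is an operation and its children are the operands, so `Rate * 60` ends up as the right child of the `+` node.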
Third phase: Semantic Analysis
• The semantic analyzer uses the syntax tree and the information in the symbol table to check the source program for semantic consistency with the language definition.

• It also gathers type information and saves it in either the syntax tree or the symbol table, for subsequent use during intermediate-code generation.

• An important part of semantic analysis is type checking, where the compiler checks that each operator has matching operands. For example, many programming language definitions require an array index to be an integer; the compiler must report an error if a floating-point number is used to index an array.

• The language specification may permit some type conversions called coercions. For example, a binary arithmetic operator may be applied to either a pair of integers or to a pair of floating-point numbers. If the operator is applied to a floating-point number and an integer, the compiler may convert or coerce the integer into a floating-point number.

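Type checking with coercion can be sketched as a recursive walk over the syntax tree. This is a minimal illustration under assumptions: expressions are nested tuples, the only types are `"int"` and `"float"`, the `types` dictionary stands in for the symbol table, and the node name `inttofloat` and the functions `check` and `check_index` are invented for the example.

```python
def check(node, types):
    """Return (typed_tree, type), inserting an inttofloat node for mixed operands."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "float"
    if isinstance(node, str):                    # identifier: use its declared type
        if node not in types:
            raise NameError(f"undeclared identifier {node!r}")
        return node, types[node]
    op, left, right = node
    left, left_type = check(left, types)
    right, right_type = check(right, types)
    if left_type == right_type:                  # operands already match
        return (op, left, right), left_type
    if left_type == "int":                       # coerce whichever side is the integer
        left = ("inttofloat", left)
    else:
        right = ("inttofloat", right)
    return (op, left, right), "float"


def check_index(index, types):
    """Report an error if an array index is not an integer."""
    _, index_type = check(index, types)
    if index_type != "int":
        raise TypeError("array index must be an integer")


declared = {"initial": "float", "rate": "float", "i": "int"}
print(check(("+", "initial", ("*", "rate", 60)), declared))
```

With `rate` declared as a floating-point variable, the integer `60` is wrapped in a conversion node, and the whole expression is typed as `float`.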
Phase Four: Intermediate Code Generation
Intermediate code generation produces three-address code.
After syntax and semantic analysis of the source program, many compilers generate an explicit low-level or machine-like intermediate representation (a program for an abstract machine). This intermediate representation should have two important properties:
• it should be easy to produce, and
• it should be easy to translate into the target machine.
The intermediate form considered here is called three-address code, which consists of a sequence of assembly-like instructions with up to three operands per instruction. Each operand can act like a register.
Pay := Base + Rate * 60

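A syntax tree can be flattened into three-address code by giving every interior node its own temporary. The sketch below is an illustration under assumptions: the tree is a nested tuple, `inttofloat` is an assumed name for the integer-to-float conversion inserted by semantic analysis, the temporaries are named `t1`, `t2`, and so on, and `gen_tac` is a hypothetical function name.

```python
import itertools


def gen_tac(tree):
    """Flatten (':=', target, expr) into a list of three-address instructions."""
    code = []
    temps = (f"t{i}" for i in itertools.count(1))    # t1, t2, t3, ...

    def walk(node):
        if not isinstance(node, tuple):              # leaf: a name or a constant
            return str(node)
        if len(node) == 2:                           # unary node, e.g. a conversion
            op, arg = node
            value = walk(arg)
            temp = next(temps)
            code.append(f"{temp} = {op}({value})")
            return temp
        op, left, right = node                       # binary operator
        l = walk(left)
        r = walk(right)
        temp = next(temps)
        code.append(f"{temp} = {l} {op} {r}")
        return temp

    _, target, expr = tree
    code.append(f"{target} = {walk(expr)}")
    return code


tree = (":=", "Pay", ("+", "Base", ("*", "Rate", ("inttofloat", 60))))
for line in gen_tac(tree):
    print(line)
```

Every generated instruction has at most one operator on its right-hand side, and the order of the instructions fixes the order in which the operations are performed.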
Code Optimization
Code optimization is applied to generate better target code.
• The machine-independent code-optimization phase attempts to improve the intermediate code so that better target code will result.
• Usually better means faster, but it can also mean shorter code or target code that consumes less power.
• For example, the optimizer can deduce that the conversion of 60 from integer to floating point can be done once and for all at compile time, so the int-to-float operation can be eliminated by replacing the integer 60 by the floating-point number 60.0. Moreover, t3 is used only once, so the optimizer can eliminate it as well.

• There are simple optimizations that significantly improve the running time of the target program without slowing down compilation too much.

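The two improvements mentioned for the running example can be sketched as a small pass over three-address code. This is an illustration under assumptions, not a general optimizer: instructions are `(dest, op, arg1, arg2)` tuples, `inttofloat` and `copy` are assumed operation names, each temporary is assumed to be assigned once, and the input sequence is an assumed rendering of the example with `id1`, `id2`, `id3` standing for the three identifiers.

```python
def optimize(code):
    """Fold constant conversions, then remove a final copy from a temporary."""
    # Pass 1: perform inttofloat on an integer constant at compile time.
    consts = {}
    folded = []
    for dest, op, a, b in code:
        a = consts.get(a, a)                 # substitute known constants
        b = consts.get(b, b)
        if op == "inttofloat" and isinstance(a, int):
            consts[dest] = float(a)          # remember the value, drop the instruction
            continue
        folded.append((dest, op, a, b))
    # Pass 2: if the last instruction only copies the temporary computed just
    # before it, compute directly into the copy's destination instead.
    if len(folded) >= 2:
        (d2, op2, a2, _), (d1, op1, a1, b1) = folded[-1], folded[-2]
        if op2 == "copy" and a2 == d1:
            folded[-2:] = [(d2, op1, a1, b1)]
    return folded


before = [
    ("t1", "inttofloat", 60, None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
for instr in optimize(before):
    print(instr)
```

The four instructions shrink to two: the conversion disappears because `60.0` is substituted directly, and `t3` disappears because its value goes straight into `id1`.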
Code Generation
Code generation takes as input an intermediate representation of the source program and maps it into the target (machine / object) language.
• If the target language is machine code, registers or memory locations are selected for each of the variables used by the program.
• The intermediate instructions are translated into sequences of machine instructions that perform the same task.
• A crucial aspect of code generation is the judicious assignment of registers to hold variables.

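The translation step can be sketched for a hypothetical load/store machine. Everything target-specific here is an assumption made for illustration: the mnemonics `LDF`, `MULF`, `ADDF`, and `STF`, the `#` prefix for immediate constants, the unlimited supply of registers, and the function name `gen_code`. A real code generator would have to allocate a fixed set of registers carefully.

```python
def gen_code(code, temporaries):
    """Map (dest, op, arg1, arg2) float instructions to hypothetical assembly."""
    opcodes = {"+": "ADDF", "*": "MULF"}
    asm = []
    reg_of = {}                      # name -> register currently holding its value
    count = 0

    def new_reg():
        nonlocal count
        count += 1
        return f"R{count}"

    def operand(x):
        if isinstance(x, float):
            return f"#{x}"           # floating-point constant used as an immediate
        if x not in reg_of:          # variable still in memory: load it first
            reg_of[x] = new_reg()
            asm.append(f"LDF {reg_of[x]}, {x}")
        return reg_of[x]

    for dest, op, a, b in code:
        ra, rb = operand(a), operand(b)
        rd = new_reg()
        asm.append(f"{opcodes[op]} {rd}, {ra}, {rb}")
        reg_of[dest] = rd
        if dest not in temporaries:  # program variables live in memory: store them
            asm.append(f"STF {dest}, {rd}")
    return asm


intermediate = [("t2", "*", "id3", 60.0), ("id1", "+", "id2", "t2")]
for line in gen_code(intermediate, temporaries={"t2"}):
    print(line)
```

Temporaries stay in registers, while program variables are loaded from and stored to memory, which is the register-versus-memory choice the bullets above describe.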
Symbol-Table Management
• The symbol table is a data structure containing a record for each variable name, with fields for the attributes of the name.
• The data structure should be designed to allow the compiler to find the record for each name quickly and to store or retrieve data from that record quickly.
• These attributes may provide information about the storage allocated for a name, its type, its scope (where in the program its value may be used), and in the case of method names, such things as the number and types of its arguments, the method of passing each argument (for example, by value or by reference), and the type returned.

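A common way to get fast lookup is one hash table per scope. The sketch below is a minimal illustration under assumptions: the `Entry` fields, the `SymbolTable` class, and its method names are invented for the example and cover only a few of the attributes listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    name: str
    type: str                                    # declared type or return type
    kind: str = "variable"                       # "variable" or "method"
    params: list = field(default_factory=list)   # (type, passing mode) per argument


class SymbolTable:
    def __init__(self):
        self.scopes = [{}]                       # innermost scope is last

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, entry):
        scope = self.scopes[-1]
        if entry.name in scope:
            raise NameError(f"{entry.name!r} already declared in this scope")
        scope[entry.name] = entry

    def lookup(self, name):
        for scope in reversed(self.scopes):      # innermost declaration wins
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not declared")


table = SymbolTable()
table.declare(Entry("rate", "float"))
table.declare(Entry("scale", "float", kind="method",
                    params=[("float", "by value"), ("int", "by reference")]))
table.enter_scope()
table.declare(Entry("rate", "int"))              # shadows the outer rate
print(table.lookup("rate").type)
table.exit_scope()
print(table.lookup("rate").type)
```

Dictionary lookups give near constant-time access to a record, and the stack of dictionaries models scope: a name declared in an inner scope hides the outer one until that scope is left.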
Compiler Phases: Example with Output
Figure: translation of an assignment statement

Compiler Phases: Grouping
• Front end
❖ Consists of those phases that depend on the source language but are largely independent of the target machine.

• Back end
❖ Consists of those phases that are usually target-machine dependent, such as optimization and code generation.

Common Back-end Compiling System

Fortran | C/C++ | Pascal | Cobol -> Common IR (e.g., Ucode) -> Optimizer -> Target Machine Code Gen

Compiling Passes
• Several phases can be implemented as a single pass, which consists of reading an input file and writing an output file.

• A typical multi-pass compiler looks like this:
• First pass: preprocessing, macro expansion
• Second pass: syntax-directed translation, IR code generation
• Third pass: optimization
• Last pass: target machine code generation

Cousins of Compilers
• Preprocessors
• Assemblers
• A compiler may produce assembly code instead of generating relocatable machine code directly.
• Loaders and Linkers
• The loader copies code and data into memory, allocates storage, sets protection bits, maps virtual addresses, etc.
• The linker handles relocation and resolves symbol references.
• Debuggers
Tasks of Compilation Process & Output
• Each task is assigned to a phase, e.g., the Lexical Analyzer phase, the Syntax Analyzer phase, and so on.
• Each task has an input and an output.
• Anything between brackets in the last figure is the output of a phase.
• The compiler first analyzes the program; the result is a set of representations suitable for later translation:
- Parse tree
- Symbol table
Parse Tree & Symbol Table
• The parse tree (syntax tree) defines the program structure: how parts of the program are combined to produce larger parts, and so on.

• The symbol table provides
- the associations between all occurrences of each identifier name in the program, and
- a link between each identifier name and its declaration.