The document summarizes the key phases and components of a compiler. It discusses the roles of the lexical analyzer, parser, semantic analyzer, code generator, symbol table, and intermediate representations. It also briefly outlines some compiler construction tools and areas of research related to compilers and computer architectures.

Dakshina Ranjan Kisku

Associate Professor
Department of Computer Science and Engineering
National Institute of Technology Durgapur
[email protected]
1) The majority of the text, diagrams, and tables in these slides is based
on the textbook Compilers: Principles, Techniques, and
Tools by Aho, Lam, Sethi, and Ullman.
2) Some text and diagrams are based on the MIT slides for the
Compilers course 6.035.
3) Some text and diagrams are taken from the IIT Kanpur slides of Prof.
S. K. Aggarwal.
• The lexical analyzer reads the stream of characters making up the source
program and groups the characters into meaningful sequences called
lexemes.
• For each lexeme, the lexical analyzer produces as output a token of the
form
<Token-name, attribute-value>
• For example, suppose a source program contains the assignment
statement
position = initial + rate * 60
• The characters in the above assignment could be grouped into lexemes
and mapped into the following tokens, which are passed on to the syntax
analyzer (here 1, 2, and 3 are indices into the symbol-table entries for
position, initial, and rate, respectively):
<id, 1> <=> <id, 2> <+> <id, 3> <*> <60>
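As an illustration (not from the slides), a hand-written lexical analyzer for the running example can be sketched in a few lines of Python. The token names and the convention that an identifier's attribute value is its symbol-table index are assumptions chosen to match the example above.

```python
import re

# Token patterns, tried in order (numbers before identifiers).
TOKEN_SPEC = [
    ("NUM",  r"\d+"),
    ("ID",   r"[A-Za-z_]\w*"),
    ("OP",   r"[=+\-*/]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokenize(source):
    """Yield <token-name, attribute-value> pairs; identifiers are
    entered into a symbol table and their index used as the attribute."""
    symtab = []
    for m in MASTER.finditer(source):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "SKIP":
            continue                      # whitespace separates lexemes
        if kind == "ID":
            if lexeme not in symtab:
                symtab.append(lexeme)     # first occurrence: new entry
            yield ("id", symtab.index(lexeme) + 1)
        elif kind == "NUM":
            yield ("num", int(lexeme))
        else:
            yield (lexeme, None)          # operators carry no attribute

print(list(tokenize("position = initial + rate * 60")))
```

Running it prints `[('id', 1), ('=', None), ('id', 2), ('+', None), ('id', 3), ('*', None), ('num', 60)]`, matching the token stream above.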
• The parser uses the first components of the tokens produced by the
lexical analyzer to create a tree-like intermediate representation that
depicts the grammatical structure of the token stream.
• A typical representation is a syntax tree in which each interior node
represents an operation and the children of the node represent the
arguments of the operation.
• Context-free grammars are used to specify the grammatical structure of
programming languages, and efficient syntax analyzers can be constructed
automatically from certain classes of grammars.
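A minimal sketch of such a syntax tree for the running example, using nested Python tuples as an illustrative encoding (the slides do not prescribe a representation):

```python
# A syntax tree as nested tuples: (operator, left, right); leaves are tokens.
# Tree for: position = initial + rate * 60
# '*' binds tighter than '+', so rate * 60 is a subtree of the '+' node.
tree = ("=",
        ("id", 1),                         # position
        ("+",
         ("id", 2),                        # initial
         ("*", ("id", 3), ("num", 60))))   # rate * 60

def show(node, depth=0):
    """Print the tree with indentation reflecting nesting."""
    if node[0] in ("id", "num"):
        print("  " * depth + f"{node[0]},{node[1]}")
    else:
        print("  " * depth + node[0])
        for child in node[1:]:
            show(child, depth + 1)

show(tree)
```

Each interior node (`=`, `+`, `*`) is an operation whose children are its arguments, exactly as described above.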
• The semantic analyzer uses the syntax tree and the information in the
symbol table to check the source program for semantic consistency with
the language definition.
• It also gathers type information and saves it in either the syntax tree or
the symbol table, for subsequent use during intermediate-code
generation.
• An important part of semantic analysis is type checking, where the
compiler checks that each operator has matching operands.
• The language specification may permit some type conversions called
coercions.
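A toy type checker over a tuple-encoded syntax tree can illustrate both checking and coercion. The tuple encoding, the symbol-table indices, and the assumption that position, initial, and rate are all declared float come from the running example, where the integer constant 60 must be coerced to float.

```python
# Minimal type checker: returns the type of an expression node, applying
# the coercion rule  int op float -> float.

def typecheck(node, symtypes):
    kind = node[0]
    if kind == "id":
        return symtypes[node[1]]          # look up the declared type
    if kind == "num":
        return "int" if isinstance(node[1], int) else "float"
    if kind in ("+", "-", "*", "/"):
        lt = typecheck(node[1], symtypes)
        rt = typecheck(node[2], symtypes)
        # coercion: if either operand is float, the result is float
        return "float" if "float" in (lt, rt) else "int"
    if kind == "=":
        return typecheck(node[2], symtypes)
    raise ValueError(f"unknown node {kind}")

# position, initial, rate are all declared float in the running example
symtypes = {1: "float", 2: "float", 3: "float"}
expr = ("+", ("id", 2), ("*", ("id", 3), ("num", 60)))
print(typecheck(expr, symtypes))
```

The int constant 60 is coerced when multiplied by the float rate, so the whole expression checks as float.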
• In the process of translating a source program into target code, a
compiler may construct one or more intermediate representations,
which can have a variety of forms.
• Syntax trees are a form of intermediate representation; they are
commonly used during syntax and semantic analysis.
• After syntax and semantic analysis, many compilers generate an explicit
low-level or machine-like intermediate representation.
• This intermediate representation should have two important properties:
it should be easy to produce and it should be easy to translate into the
target machine.
• Consider an intermediate form called three-address code: a sequence of
assembly-like instructions, each with at most one operator on the right side.
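For the running example, a generator that flattens a tuple-encoded syntax tree into three-address instructions might look like this. It is a sketch: the tn temporary-naming convention follows the textbook, and the inttofloat coercion step is omitted for brevity.

```python
def three_address(tree):
    """Emit three-address instructions for a tuple-shaped syntax tree,
    where leaves are ('id', n) or ('num', v) and interior nodes are
    (op, left, right)."""
    code, counter = [], iter(range(1, 1000))

    def gen(node):
        kind = node[0]
        if kind == "id":
            return f"id{node[1]}"
        if kind == "num":
            return str(node[1])
        if kind == "=":
            code.append(f"{gen(node[1])} = {gen(node[2])}")
            return None
        # interior operator node: evaluate children, store into a temporary
        left, right = gen(node[1]), gen(node[2])
        t = f"t{next(counter)}"
        code.append(f"{t} = {left} {kind} {right}")
        return t

    gen(tree)
    return code

for line in three_address(
        ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("num", 60))))):
    print(line)
```

For `position = initial + rate * 60` this emits `t1 = id3 * 60`, `t2 = id2 + t1`, `id1 = t2` — each instruction has at most one operator on its right side.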
• The machine-independent code-optimization phase attempts to improve
the intermediate code so that better target code will result.
• A simple intermediate code generation algorithm followed by code
optimization is a reasonable way to generate good target code.
• There is a great variation in the amount of code optimization different
compilers perform.
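One simple machine-independent optimization is constant folding: evaluating at compile time any operation whose operands are all constants. A sketch over three-address instructions written as strings like `"t1 = 8 * 60"` (an assumed textual format, for illustration only):

```python
def fold_constants(code):
    """Replace 'dest = const op const' instructions with 'dest = value',
    leaving all other instructions unchanged."""
    out = []
    for instr in code:
        dest, rhs = instr.split(" = ")
        parts = rhs.split()
        if (len(parts) == 3
                and parts[0].lstrip("-").isdigit()
                and parts[2].lstrip("-").isdigit()):
            a, op, b = int(parts[0]), parts[1], int(parts[2])
            val = {"+": a + b, "-": a - b, "*": a * b}[op]
            out.append(f"{dest} = {val}")   # computed at compile time
        else:
            out.append(instr)
    return out

print(fold_constants(["t1 = 8 * 60", "t2 = id2 + t1", "id1 = t2"]))
```

Real optimizers do far more (common-subexpression elimination, strength reduction, loop transformations), which is why the amount of optimization varies so much between compilers.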
• The code generator takes as input an intermediate representation of the
source program and maps it into the target language.
• If the target language is machine code, registers or memory locations are
selected for each of the variables used by the program.
• Then, the intermediate instructions are translated into sequences of
machine instructions that perform the same task.
• A crucial aspect of code generation is the judicious assignment of
registers to hold variables.
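A sketch of this phase for straight-line three-address code, targeting a hypothetical three-operand register machine with LD/ST/ADD/SUB/MUL instructions. The instruction set and the naive one-register-per-value assignment are illustrative assumptions; real register allocators reuse and spill registers.

```python
def to_target(three_addr):
    """Translate three-address strings into instructions for a
    hypothetical register machine."""
    ops = {"+": "ADD", "-": "SUB", "*": "MUL"}
    out, loc, regs = [], {}, iter(f"R{i}" for i in range(1, 16))

    def value(operand):
        # constants become immediates; names are loaded from memory once
        if operand.isdigit():
            return f"#{operand}"
        if operand not in loc:
            loc[operand] = next(regs)
            out.append(f"LD {loc[operand]}, {operand}")
        return loc[operand]

    for instr in three_addr:
        dest, rhs = instr.split(" = ")
        parts = rhs.split()
        if len(parts) == 3:                 # dest = a op b
            a, op, b = parts
            loc[dest] = next(regs)
            out.append(f"{ops[op]} {loc[dest]}, {value(a)}, {value(b)}")
        else:                               # plain copy: store to memory
            out.append(f"ST {dest}, {value(rhs)}")
    return out

for line in to_target(["t1 = id3 * 60", "t2 = id2 + t1", "id1 = t2"]):
    print(line)
```

Each intermediate instruction maps to a short machine-instruction sequence that performs the same task, with registers chosen to hold the intermediate values.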
• An essential function of a compiler is to record the variable names used
in the source program and collect information about various attributes
of each name.
• These attributes may provide information about the storage allocated
for a name, its type, its scope and in the case of procedure names, such
things as the number and types of its arguments, the method of passing
each argument (for example, by value or by reference), and the type
returned.
• The symbol table is a data structure containing a record for each variable
name, with fields for the attributes of the name.
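A symbol table can be sketched as a dictionary of records keyed by name; the attribute field names below (type, scope, storage) are illustrative, not a prescribed layout.

```python
class SymbolTable:
    """One record per name, with arbitrary attribute fields."""

    def __init__(self):
        self.records = {}

    def insert(self, name, **attrs):
        """Create or update the record for a name."""
        self.records.setdefault(name, {}).update(attrs)

    def lookup(self, name):
        """Return the record for a name, or None if it is unknown."""
        return self.records.get(name)

st = SymbolTable()
st.insert("rate", type="float", scope="global", storage=8)
st.insert("initial", type="float", scope="global", storage=8)
print(st.lookup("rate"))
```

A production table would add nested scopes (a chain of tables, searched innermost-out) and, for procedure names, fields for parameter counts, parameter types, passing conventions, and the return type.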
Translation of an Assignment Statement
[Figure omitted: the assignment position = initial + rate * 60 traced through each phase of the compiler]
• Phases deal with the logical organization of a compiler.
• In an implementation, activities from several phases may be grouped
together into a pass that reads an input file and writes an output.
• For example, the front-end phases of lexical analysis, syntax analysis,
semantic analysis, and intermediate code generation might be grouped
together into one pass.
• Code optimization might be an optional pass.
• Then there could be a back-end pass consisting of code generation for a
particular target machine.
• Compiler-construction tools*** use specialized languages for specifying and
implementing specific components. Some commonly used compiler-construction
tools include:
• Parser generators that automatically produce syntax analyzers from a grammatical
description of a programming language.
• Scanner generators that produce lexical analyzers from a regular-expression description of
the tokens of a language.
• Syntax-directed translation engines that produce collections of routines for walking a parse
tree and generating intermediate code.
• Code-generator generators that produce a code generator from a collection of rules for
translating each operation of the intermediate language into the machine language for a
target machine.
• Data-flow analysis engines that facilitate the gathering of information about how values
are transmitted from one part of a program to each other part. Data-flow analysis is a key
part of code optimization.
• Compiler-construction toolkits that provide an integrated set of routines for constructing
various phases of a compiler.

***(language editors, debuggers, version managers, profilers, test harnesses)
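As a taste of what a data-flow analysis engine computes, liveness for straight-line three-address code can be found with a single backward scan (a sketch; real engines solve data-flow equations iteratively over whole control-flow graphs). The string instruction format is an illustrative assumption.

```python
def live_after(code):
    """Return, for each instruction, the set of names live after it:
    a name is live if some later instruction still uses it."""
    live, result = set(), []
    for instr in reversed(code):          # backward scan
        result.append(set(live))          # live-out of this instruction
        dest, rhs = instr.split(" = ")
        live.discard(dest)                # the definition kills the name
        # every name mentioned on the right-hand side becomes live
        live |= {tok for tok in rhs.split() if tok[0].isalpha()}
    return list(reversed(result))

code = ["t1 = id3 * 60", "t2 = id2 + t1", "id1 = t2"]
for instr, l in zip(code, live_after(code)):
    print(f"{instr:20} live after: {sorted(l)}")
```

Liveness is exactly the kind of fact an optimizer needs, for example to decide when a register holding a value can be reused.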


• Implementation of High-Level Programming Languages
• Optimizations for Computer Architectures
• Parallelism
• Memory hierarchies
• Design of New Computer Architectures
• RISC, CISC
• Specialized architectures – VLIW, SIMD
• Program Translations
• Binary translation
• Hardware synthesis
• Database query interpreters
• Compiled simulations
• Software Productivity Tools
• Type checking
• Bounds checking
• Memory-management tools (e.g., Purify), garbage collection
