Advanced Computer Systems: Compiler Design & Implementation
5th Computer
at Modern Academy
1. Compiler : Introduction
Programming languages are notations for describing
computations to people and to machines. The world as we
know it depends on programming languages, because all the
software running on all the computers was written in some
programming language. But, before a program can be run, it first
must be translated into a form in which it can be executed by a
computer.
The software systems that do this translation are called
compilers.
This course is about how to design and implement compilers.
We shall discover that a few basic ideas can be used to
construct translators for a wide variety of languages and
machines. Besides compilers, the principles and techniques for
compiler design are applicable to so many other domains that
they are likely to be reused many times in the career of a
computer scientist. The study of compiler writing touches upon
programming languages, machine architecture, language theory,
algorithms, and software engineering.
Fundamentals of Compiling
The compiler must preserve the
meaning of the program.
The compiler must improve the
source code in some noticeable
manner.
Desirable Properties
Correct output code
Compile-time efficiency
Better code (Optimized)
Minimum Space
Feedback
Debugging
1.1 Language Processors
Simply stated, a compiler is a program that can read a program in one
language - the source language - and translate it into an equivalent
program in another language - the target language; see Fig. 1.1. An
important role of the compiler is to report any errors in the source
program that it detects during the translation process.
An interpreter is another common kind of language
processor. Instead of producing a target program as a
translation, an interpreter appears to directly execute the
operations specified in the source program on inputs supplied
by the user, as shown in Fig. 1.3.
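The difference between the two kinds of language processor can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy expression language (numbers, variable names, and tuples of the form (op, left, right)), the stack-machine instructions PUSH, LOAD, ADD and MUL, and all function names are hypothetical and do not come from the course text.

```python
# Hypothetical toy language: an expression is a number, a variable name,
# or a tuple (op, left, right) with op being "+" or "*".

def interpret(expr, env):
    """Interpreter: execute the source expression directly on the inputs."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    a, b = interpret(left, env), interpret(right, env)
    return a + b if op == "+" else a * b

def compile_expr(expr):
    """Compiler: translate the expression into a target program,
    here a list of instructions for an assumed stack machine."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    if isinstance(expr, str):
        return [("LOAD", expr)]
    op, left, right = expr
    operation = ("ADD",) if op == "+" else ("MUL",)
    return compile_expr(left) + compile_expr(right) + [operation]

def run(target, env):
    """The assumed machine that later executes the target program."""
    stack = []
    for instr in target:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        elif instr[0] == "LOAD":
            stack.append(env[instr[1]])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack.pop()

source = ("+", "x", ("*", 2, "y"))   # x + 2*y
target = compile_expr(source)        # translation happens once, before any input is seen
```

Calling interpret(source, inputs) and run(target, inputs) with the same inputs gives the same value; the compiler produces the target program once, and that program can then be run on many different inputs.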
1.2 The Structure of a Compiler
Up to this point we have treated a compiler as a single box
that maps a source program into a semantically equivalent
target program. If we open up this box a little, we see that
there are two parts to this mapping: analysis and synthesis.
The analysis part breaks up the source program into
constituent pieces and imposes a grammatical structure on
them. It then uses this structure to create an intermediate
representation of the source program. If the analysis part
detects that the source program is either syntactically ill-formed
or semantically unsound, then it must provide
informative messages, so the user can take corrective action.
The analysis part also collects information about the source
program and stores it in a data structure called a symbol
table, which is passed along with the intermediate
representation to the synthesis part.
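A minimal sketch of what the analysis part hands over, assuming a hypothetical input of already-parsed statements of the form ("declare", name, type) or ("use", name). The statement format, the dictionary-based symbol table, and the function name analyze are illustrative assumptions, not a real compiler's interface.

```python
def analyze(statements):
    """Toy analysis pass.

    Returns the intermediate representation, the symbol table,
    and a list of informative error messages.
    """
    symbol_table = {}   # name -> information collected about that name
    ir = []             # intermediate representation passed to synthesis
    errors = []
    for lineno, stmt in enumerate(statements, start=1):
        if stmt[0] == "declare":
            _, name, type_name = stmt
            if name in symbol_table:
                errors.append(f"line {lineno}: '{name}' is already declared")
            else:
                symbol_table[name] = {"type": type_name, "declared_at": lineno}
        elif stmt[0] == "use":
            name = stmt[1]
            if name not in symbol_table:
                errors.append(f"line {lineno}: '{name}' is used but never declared")
            else:
                # The IR records the use together with the type found in the table.
                ir.append(("ref", name, symbol_table[name]["type"]))
    return ir, symbol_table, errors

ir, table, errors = analyze([
    ("declare", "x", "int"),
    ("use", "x"),
    ("use", "y"),       # semantically unsound: y was never declared
])
```

Here the use of y produces a message that names the line and the problem, so the user can take corrective action, while the valid use of x reaches the IR with its type attached.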
The synthesis part constructs the desired target program
from the intermediate representation and the information in
the symbol table. The analysis part is often called the front
end of the compiler; the synthesis part is the back end.
If we examine the compilation process in more detail, we
see that it operates as a sequence of phases, each of which
transforms one representation of the source program to
another. A typical decomposition of a compiler into phases is
shown in Fig. 1.6. In practice, several phases may be grouped
together, and the intermediate representations between the
grouped phases need not be constructed explicitly. The
symbol table, which stores information about the entire
source program, is used by all phases of the compiler.
Jobs for a Compiler…
Determine whether the input is a well-constructed sentence in
the language.
Lexical analysis or Scanning
Break the input into tokens (words)
Syntax analysis or Parsing
Analyze the phrase or sentence structure of the
input
Determine whether the program (input) has a well-defined
meaning.
Semantic or context-sensitive analysis
Perform type checking, analyze the hierarchical
structure of statements and report errors
W ← W*2*X*Y*Z
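As a rough sketch of the scanning job, the following Python breaks an assignment such as W = W*2*X*Y*Z into tokens. The token names (NUMBER, IDENT, ASSIGN, OP), the use of "=" as the assignment symbol, and the function name scan are assumptions made for this illustration.

```python
import re

# Each token class is described by a regular expression (assumed token set).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("OP",     r"[*+\-/]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(text):
    """Break the input string into a list of (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":      # whitespace separates words but is not a word
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = scan("W = W*2*X*Y*Z")
```

The parser then works on this list of tokens rather than on individual characters, which keeps the later phases simpler.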
Improve the code.
Optimization
Improve some important aspect of the code, e.g.,
speed or size.
W ← W*2*2 becomes W ← W*4
Generate an executable program.
Code Generation
Select the instructions to implement the transformed
code on the target machine.
Allocate program values to the limited set of registers.
Order or schedule the selected instructions to
minimize the execution time.
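The improvement of W*2*2 into W*4 can be sketched as a small constant-folding pass. This is a simplified illustration, not how a production optimizer is organized: it assumes expressions are nested tuples ("*", left, right), handles only chains of multiplication, and relies on reordering the operands, which is safe for mathematical integers but can change results for floating-point values. The function names are hypothetical.

```python
def flatten_product(expr):
    """Collect the operands of a chain of multiplications into a flat list."""
    if isinstance(expr, tuple) and expr[0] == "*":
        return flatten_product(expr[1]) + flatten_product(expr[2])
    return [expr]

def fold_constants(expr):
    """Combine all integer constants of a product into one constant."""
    constant = 1
    others = []
    for operand in flatten_product(expr):
        if isinstance(operand, int):
            constant *= operand          # evaluated at compile time
        else:
            others.append(operand)       # must still be computed at run time
    result = None
    for operand in others:
        result = operand if result is None else ("*", result, operand)
    if result is None:
        return constant                  # the whole product was constant
    if constant == 1:
        return result                    # multiplying by 1 is unnecessary
    return ("*", result, constant)

before = ("*", ("*", "W", 2), 2)         # W*2*2
after = fold_constants(before)           # W*4
```

The folded expression needs one multiplication at run time instead of two, which is an improvement in both speed and code size.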
Simplified Modern Compiler Organization
Lexical Analysis
Determine the words (tokens) in the input.
What are words in computer programs?
How do we separate the input string into words?
How does this facilitate compilation?
Syntactic Analysis
Determine the sentences in the input.
What are sentences in computer programs?
How do we group the words into sentences?
How does this facilitate compilation?
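One common way to group words into sentences is a recursive-descent parser. The sketch below assumes a hypothetical grammar (statement -> IDENT "=" expr, expr -> term {"+" term}, term -> factor {"*" factor}, factor -> NUMBER | IDENT | "(" expr ")") and represents each sentence as a nested tuple. The grammar, the class name Parser, and the tuple format are illustrative choices rather than anything fixed by the course text.

```python
import re

def tokenize(text):
    """Minimal word splitter: identifiers, numbers, and single-character symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[=+*()]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token=None):
        current = self.peek()
        if current is None or (token is not None and current != token):
            raise SyntaxError(f"expected {token or 'a token'}, found {current!r}")
        self.pos += 1
        return current

    def statement(self):                 # statement -> IDENT "=" expr
        name = self.expect()
        if not (name[0].isalpha() or name[0] == "_"):
            raise SyntaxError(f"expected an identifier, found {name!r}")
        self.expect("=")
        value = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return ("assign", name, value)

    def expr(self):                      # expr -> term {"+" term}
        node = self.term()
        while self.peek() == "+":
            self.expect("+")
            node = ("+", node, self.term())
        return node

    def term(self):                      # term -> factor {"*" factor}
        node = self.factor()
        while self.peek() == "*":
            self.expect("*")
            node = ("*", node, self.factor())
        return node

    def factor(self):                    # factor -> NUMBER | IDENT | "(" expr ")"
        token = self.expect()
        if token == "(":
            node = self.expr()
            self.expect(")")
            return node
        if token.isdigit():
            return int(token)
        if token[0].isalpha() or token[0] == "_":
            return token
        raise SyntaxError(f"unexpected token {token!r}")

def parse(text):
    return Parser(tokenize(text)).statement()

tree = parse("W = W * 2 + X")
```

Because term is called from inside expr, multiplication binds more tightly than addition in the resulting tree, so the structure of the sentence already encodes operator precedence for the later phases.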
Intermediate Representations
Represent the program in an easily transformable manner.
The representation is strongly tied to the algorithms that process it:
trees ➫ recursion
linear ➫ iteration
How do IRs support compilation?
Abstract Syntax Trees
Form of intermediate representation.
A tree-based representation of the essence of the computer
program being translated.
Form reflects the form of the program.
Representation for each programming structure mimics the
language.
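The link between the shape of an IR and the algorithms that process it can be illustrated with one expression in two forms. The sketch assumes a tuple-based tree, (op, left, right), and a linear form made of three-address instructions (dest, op, a, b); the temporary names t0, t1, ... and all function names are hypothetical, and the sketch assumes no program variable uses those temporary names.

```python
def eval_tree(node, env):
    """Tree IR: processed naturally by recursion over the children."""
    if isinstance(node, int):
        return node
    if isinstance(node, str):
        return env[node]
    op, left, right = node
    a, b = eval_tree(left, env), eval_tree(right, env)
    return a * b if op == "*" else a + b

def linearize(node, code):
    """Flatten the tree into a list of (dest, op, a, b) instructions.

    Returns the name or constant that holds the value of `node`.
    """
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    a = linearize(left, code)
    b = linearize(right, code)
    temp = f"t{len(code)}"               # fresh temporary for this result
    code.append((temp, op, a, b))
    return temp

def eval_linear(code, env):
    """Linear IR: processed by a simple loop, one instruction at a time."""
    values = dict(env)
    result = None
    for dest, op, a, b in code:
        x = a if isinstance(a, int) else values[a]
        y = b if isinstance(b, int) else values[b]
        values[dest] = x * y if op == "*" else x + y
        result = values[dest]
    return result

tree = ("+", ("*", "W", 2), "X")         # W*2 + X
code = []
linearize(tree, code)
```

Both forms describe the same computation; the tree keeps the nesting of the source program visible, while the linear form is closer to the instruction sequence a machine executes.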
Semantic Analysis
Determine the meaning of the input.
What are the sentences in a computer program supposed to do?
Are all of the interactions valid?
How does this support compilation?
Code Generation
Translate the internal representation into a form that can be
executed by a machine.
Optimization
Improve the generated code in some measurable way (speed,
size, …)
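A very small code generator can be sketched as follows. The target is a hypothetical register machine with the instructions LOADI, LOAD, ADD, MUL and STORE, and the sketch assumes an unlimited supply of registers r0, r1, ...; a real code generator must fit values into a limited register set. The tuple-based tree format and the function names are also assumptions.

```python
def generate(node, out, next_reg=0):
    """Emit instructions that compute `node`; return the register holding the result."""
    reg = f"r{next_reg}"
    if isinstance(node, int):
        out.append(f"LOADI {reg}, {node}")      # load an immediate constant
        return reg
    if isinstance(node, str):
        out.append(f"LOAD {reg}, {node}")       # load a variable from memory
        return reg
    op, left, right = node
    left_reg = generate(left, out, next_reg)
    # The right operand uses higher-numbered registers, so it cannot
    # overwrite the value already computed for the left operand.
    right_reg = generate(right, out, next_reg + 1)
    mnemonic = {"+": "ADD", "*": "MUL"}[op]
    out.append(f"{mnemonic} {left_reg}, {left_reg}, {right_reg}")
    return left_reg

def generate_assignment(name, expr):
    """Generate code for `name = expr` and store the result back to memory."""
    out = []
    reg = generate(expr, out)
    out.append(f"STORE {name}, {reg}")
    return out

program = generate_assignment("W", ("*", "W", 4))
for line in program:
    print(line)
```

Each tree node maps to one instruction here, which is the simplest possible form of instruction selection; choosing better instructions and ordering them well is where the later back-end work lies.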