0% found this document useful (0 votes)
101 views

Compilers PDF

The document discusses compiler design and history. It covers: 1) The origins of compilers in the late 1950s to translate programs from assembly languages to machine code as software reuse grew more important. The first FORTRAN compilers took 18 person-years to build. 2) The basic functions of a compiler in translating a human-oriented programming language into a machine-oriented language by ignoring machine-dependent details. Compilers can generate pure machine code, augmented machine code, or virtual machine code. 3) The typical structure of a compiler including lexical analysis, parsing, semantic analysis, code optimization, and code generation stages to transform the source code into executable target code.

Uploaded by

Nina Cavcaliuc
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
101 views

Compilers PDF

The document discusses compiler design and history. It covers: 1) The origins of compilers in the late 1950s to translate programs from assembly languages to machine code as software reuse grew more important. The first FORTRAN compilers took 18 person-years to build. 2) The basic functions of a compiler in translating a human-oriented programming language into a machine-oriented language by ignoring machine-dependent details. Compilers can generate pure machine code, augmented machine code, or virtual machine code. 3) The typical structure of a compiler including lexical analysis, parsing, semantic analysis, code optimization, and code generation stages to transform the source code into executable target code.

Uploaded by

Nina Cavcaliuc
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 20

Compiler

design
Overview and History
• Cause
– Software for early computers was written in assembly language
– The benefits of reusing software on different CPUs started to
become significantly greater than the cost of writing a compiler

• The first real compiler


– FORTRAN compilers of the late 1950s
– 18 person-years to build

2
Overview and History
• Compiler technology
– is more broadly applicable and has been employed in
rather unexpected areas.
• Text-formatting languages,
like nroff and troff; preprocessor packages like eqn,
tbl, pic
• Silicon compiler for the creation of VLSI circuits
• Command languages of OS
• Query languages of Database systems

3
What Do Compilers Do
• A compiler acts as a translator,
transforming human-oriented programming
languages
into computer-oriented machine languages.
– Ignore machine-dependent details for programmer

Programming Machine
Language Compiler Language
(Source) (Target)

4
What Do Compilers Do
• Compilers may generate three types of code:
– Pure Machine Code
• Machine instruction set without assuming the existence
of any operating system or library.
• Mostly being OS or embedded applications.
– Augmented Machine Code
• Code with OS routines and runtime support routines.
• More often
– Virtual Machine Code
• Virtual instructions, can be run on any architecture with
a virtual machine interpreter or a just-in-time compiler
• Ex. Java 5
The Structure of a Compiler
• Any compiler must perform two major tasks
Compiler

Analysis Synthesis

– Analysis of the source program


– Synthesis of a machine-language program
6
Compiler Architecture

In more detail:
Intermediate
Language
Source Target Language
Front End – Back End –
Language
language specific machine specific

•Separation of Concerns
•Retargeting

7
Compiler Architecture
Intermediate
Intermediate
Language
Language

Scanner Parser Semantic Code


Source Code Target
(lexical (syntax Analysis Generator
language Optimizer language
analysis) analysis) (IC generator)
tokens Syntactic
structure

Symbol
Table

8
Lexical Analysis - Scanning

Scanner tokens Parser Semantic Code


Source (lexical (syntax Generator
Analysis
languag analysis) analysis) (IC generator)
e

Code
Optimizer

• Tokens described formally


• Breaks input into tokens
Symbol
• Remove white space
Table

9
Input: result = a + b * c / d

• Tokens:
‘result’, ‘=‘, ‘a’, ‘+’, ‘b’, ‘*’, ‘c’, ‘/’, ‘d’

identifiers
operators

10
Static Analysis - Parsing

tokens Syntactic
Scanner Parser structure Semantic Code
Source Target
language
(lexical (syntax Analysis Generator
language
analysis) analysis) (IC generator)

Code
Optimizer

• Syntax described formally


• Tokens organized into syntax tree
that describes structure Symbol
• Error checking (syntax) Table

11
Input: result = a + b * c / d

Exp ::= Exp ‘+’ Exp Assign


| Exp ‘-’ Exp
| Exp ‘*’ Exp ID ‘=‘ Exp
| Exp ‘/’ Exp
| ID Exp ‘+’ Exp

Assign ::= ID ‘=‘ Exp ID Exp ‘*’ Exp

ID Exp ‘/’ Exp

ID ID

12
Semantic Analysis

Syntactic/semantic
Syntactic structure
Scanner Parser structure Semantic Code
Source Target
language
(lexical (syntax Analysis Generator
language
analysis) analysis) (IC generator)

Syntactic/semantic
structure
Code
Optimizer

• “Meaning”
• Type/Error Checking
• Intermediate Code Generation –
abstract machine Symbol
Table

13
Optimization

Scanner Parser Semantic Code


Source Target
language
(lexical (syntax Analysis Generator
language
analysis) analysis) (IC generator)
Syntactic/semantic
structure
Syntactic/semantic
Code structure
Optimizer

• Improving efficiency (machine


independent)
• Finding optimal code Symbol
Table

14
Code Generation

Syntactic/semantic
structure
Scanner Parser Semantic Code
Source Target
language
(lexical (syntax Analysis Generator language
analysis) analysis) (IC generator)

Syntactic/semantic
Code structure
Optimizer

• Intermediate code to real


machine code
• Memory management, register Symbol
allocation, instruction selection, Table
instruction scheduling, …

15
The Role of Lexical Analyzer
• Lexical analyzer is the first phase of a compiler.
• Its main task is to read input characters and produce as
output a sequence of tokens that parser uses for syntax
analysis.

16
Tokens, Patterns, Lexemes
• A lexeme is a sequence of characters in the source
program that is matched by the pattern for a token.
• A lexeme is a basic lexical unit of a language comprising
one or several words, the elements of which do not
separately convey the meaning of the whole.
• The lexemes of a programming language include its identifier,
literals, operators, and special words.
• A token of a language is a category of its lexemes.
• A pattern is a rule describing the set of lexemes that can
represent as particular token in source program.
17
Examples of Tokens
const pi = 3.1416;
The substring pi is a lexeme for the token “identifier.”

18
Lexeme and Token
Index = 2 * count +17;

Lexemes Tokens
Index Identifier
= equal_sign
2 int_literal
* multi_op
Count identifier
+ plus_op
17 int_literal
; semicolon
19
Lexical Errors
• Few errors are discernible at the lexical level
alone, because a lexical analyzer has a very
localized view of a source program.
• Let some other phase of compiler handle any
error.
• Panic mode
• Error recovery

20

You might also like