1 Intro

Uploaded by

huzafa zaheer
Introduction

Chapter 1 Introduction 1
What is a Compiler?
• A compiler is a computer
program that translates a
program in a source language
into an equivalent program in a
target language.
• A source program/code is a
program/code written in the
source language, which is
usually a high-level language.
• A target program/code is a
program/code written in the
target language, which often is
a machine language or an
intermediate code.
[Figure: source program → compiler → target program; the compiler also emits error messages]

Process of Compiling
Stream of characters
scanner
Stream of tokens
parser
Parse/syntax tree
Semantic analyzer
Annotated tree
Intermediate code generator
Intermediate code
Code optimization
Intermediate code
Code generator
Target code
Code optimization
Target code
Some Data Structures
• Symbol table
• Literal table
• Parse tree

Symbol Table
• Identifiers are names of variables, constants, functions, data types, etc.
• The symbol table stores information associated with identifiers
– Information associated with different types of identifiers can be different
• Information associated with variables includes name, type,
address, size (for arrays), etc.
• Information associated with functions includes name, type
of return value, parameters, address, etc.

Symbol Table (cont’d)
• Accessed in every phase of a compiler
– The scanner, parser, and semantic analyzer put
names of identifiers in the symbol table.
– The semantic analyzer stores more information
(e.g. data types) in the table.
– The intermediate code generator, code optimizer,
and code generator use information in the symbol
table to generate appropriate code.
• Usually implemented as a hash table for efficiency.
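The symbol table described above can be sketched as follows. This is only an illustration, assuming a single global scope; the names `SymbolInfo`, `SymbolTable`, `insert`, and `lookup` are hypothetical and not taken from the slides. A Python `dict` plays the role of the hash table.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SymbolInfo:
    """Attributes a compiler might record for one identifier (assumed set)."""
    name: str
    kind: str                           # e.g. "variable" or "function"
    type: Optional[str] = None          # filled in by the semantic analyzer
    address: Optional[int] = None       # filled in by a later phase
    params: List[str] = field(default_factory=list)  # functions only


class SymbolTable:
    """Symbol table backed by a dict, which is a hash table."""

    def __init__(self):
        self._entries = {}

    def insert(self, name, kind):
        # Early phases: record the identifier the first time it is seen;
        # a repeated insert returns the existing entry unchanged.
        return self._entries.setdefault(name, SymbolInfo(name, kind))

    def lookup(self, name):
        # Any phase: average-case constant-time retrieval by name.
        return self._entries.get(name)


table = SymbolTable()
table.insert("count", "variable")       # scanner/parser sees the name
table.lookup("count").type = "int"      # semantic analyzer adds the type
table.lookup("count").address = 0x10    # a later phase assigns an address
```

A real compiler would usually add nested scopes on top of this, for example a stack of such tables.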

Literal table
• Stores constants and strings used in a program
– reduces memory size by reusing constants and strings
• Can be combined with the symbol table
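The reuse of constants and strings can be illustrated with a small interning table. This is a sketch under assumptions: the class name `LiteralTable` and the method `intern` are hypothetical, and literals are identified by slot number.

```python
class LiteralTable:
    """Stores each distinct literal once and hands out its slot number."""

    def __init__(self):
        self._index = {}   # (type name, value) -> slot number
        self.values = []   # slot number -> literal value

    def intern(self, value):
        # Include the type name in the key so that 1 and 1.0 stay distinct.
        key = (type(value).__name__, value)
        if key not in self._index:
            self._index[key] = len(self.values)
            self.values.append(value)
        return self._index[key]


literals = LiteralTable()
first = literals.intern("hello")
second = literals.intern(3.14)
again = literals.intern("hello")   # reuses the slot of the first "hello"
```

Here `first` and `again` refer to the same slot, so the string is stored only once.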

Parse tree
• Dynamically allocated, pointer-based
structure
• Information for the different types of data
related to parse trees needs to be stored
somewhere.
– Nodes are variant records, storing
information for different types of data
– Nodes store pointers to information stored
in other data structures, e.g. the symbol table
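One way to picture nodes as variant records is a small set of node classes, one per variant. The sketch below is an assumption for illustration: the class names `Num`, `Var`, and `BinOp`, the dict used as a stand-in symbol table, and the helper `count_nodes` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int            # a literal stores its value directly


@dataclass
class Var:
    name: str
    symbol: dict          # reference to an entry in a stand-in symbol table


@dataclass
class BinOp:
    op: str
    left: "Node"          # references to child nodes
    right: "Node"


Node = Union[Num, Var, BinOp]   # the "variant record": one of several shapes

symbols = {"x": {"type": "int"}}
# Tree for the expression x + 2 * 3
tree = BinOp("+", Var("x", symbols["x"]), BinOp("*", Num(2), Num(3)))


def count_nodes(node):
    """Walk the tree by following child references."""
    if isinstance(node, BinOp):
        return 1 + count_nodes(node.left) + count_nodes(node.right)
    return 1
```

The `Var` node does not copy the type of `x`; it only refers to the symbol table entry, as the last bullet above describes.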

Scanning
• A scanner reads a stream of characters and
puts them together into some meaningful
(with respect to the source language) units
called tokens.
• It produces a stream of tokens for the next
phase of the compiler.
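A minimal sketch of a scanner, assuming a toy source language with integers, identifiers, and a few one-character operators. The token names (`NUMBER`, `IDENT`, `OP`) and the function `scan` are illustrative choices, not part of the slides.

```python
import re

# Each pair is (token kind, regular expression); earlier entries win.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),      # whitespace separates tokens but is not a token
    ("ERROR", r"."),       # any other character is a lexical error
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC)
)


def scan(source):
    """Group a stream of characters into a list of (kind, text) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {text!r} at offset {match.start()}"
            )
        tokens.append((kind, text))
    return tokens


tokens = scan("x = y + 42")
```

For the input `x = y + 42` this yields five tokens: two identifiers, two operators, and one number.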

Parsing
• A parser gets a stream of tokens from the
scanner, and determines if the syntax
(structure) of the program is correct
according to the (context-free) grammar of
the source language.
• Then, it produces a data structure, called a
parse tree or an abstract syntax tree, which
describes the syntactic structure of the
program.
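As an illustration, here is a small recursive-descent parser for arithmetic expressions. It is a sketch under assumptions: the grammar (expr → term {(+|-) term}, term → factor {(*|/) factor}, factor → NUMBER | IDENT | "(" expr ")"), the `(kind, text)` token format, and the nested-tuple tree shape are all chosen for this example.

```python
def parse(tokens):
    """Check the token list against the grammar and return a syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():                      # expr -> term {(+|-) term}
        node = term()
        while peek()[1] in ("+", "-"):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():                      # term -> factor {(*|/) factor}
        node = factor()
        while peek()[1] in ("*", "/"):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():                    # factor -> NUMBER | IDENT | "(" expr ")"
        kind, text = advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if kind == "IDENT":
            return ("var", text)
        if text == "(":
            node = expr()
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    tree = expr()
    if peek()[0] != "EOF":           # leftover tokens mean a syntax error
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree


tokens = [("IDENT", "a"), ("OP", "+"), ("NUMBER", "2"), ("OP", "*"), ("IDENT", "b")]
tree = parse(tokens)
# tree == ("+", ("var", "a"), ("*", ("num", 2), ("var", "b")))
```

The tree shape reflects the grammar: `*` binds tighter than `+` because `term` is nested inside `expr`.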

Semantic analysis
• It gets the parse tree from the parser together with
information about some syntactic elements.
• It determines if the semantics or meaning of the
program is correct.
• This part deals with static semantics.
• Mostly, a semantic analyzer does type checking.
• It modifies the parse tree so that it represents
(statically) semantically correct code.
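A toy type checker gives a feel for what static semantic checking involves. The sketch assumes a nested-tuple tree with `("num", n)`, `("str", s)`, `("var", name)`, and `(op, left, right)` nodes, a dict mapping identifiers to type names, and one invented rule: both operands of an operator must have the same type. None of these details come from the slides.

```python
def check(node, symbols):
    """Return the static type of an expression tree, or raise TypeError."""
    tag = node[0]
    if tag == "num":
        return "int"
    if tag == "str":
        return "string"
    if tag == "var":
        name = node[1]
        if name not in symbols:
            raise TypeError(f"undeclared identifier {name!r}")
        return symbols[name]
    # Otherwise the node is a binary operator: (op, left, right).
    op, left, right = node
    left_type = check(left, symbols)
    right_type = check(right, symbols)
    if left_type != right_type:
        raise TypeError(
            f"operands of {op!r} have types {left_type} and {right_type}"
        )
    return left_type


symbols = {"a": "int", "s": "string"}
result = check(("+", ("var", "a"), ("num", 2)), symbols)   # "int"
```

Checking `("+", ("var", "a"), ("var", "s"))` against the same table raises `TypeError`, which is a static error reported before the program runs. This sketch only computes types; a fuller analyzer would also record them on the tree nodes.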

Intermediate code generation
• An intermediate code generator
– takes a parse tree from the semantic analyzer
– generates a program in the intermediate
language.
• In some compilers, a source program is
translated into an intermediate code first
and then the intermediate code is translated
into the target language.
• In other compilers, a source program is
translated directly into the target language.
Intermediate code generation (cont’d)
• Using intermediate code is beneficial when
compilers that translate a single source
language to many target languages are
required.
– The front-end of a compiler (scanner to
intermediate code generator) can be reused for
every such compiler.
– A different back-end (code optimizer and code
generator) is required for each target language.
• One popular intermediate code is
three-address code. A three-address code
instruction has the form x = y op z.
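The form x = y op z can be made concrete with a small translator from an expression tree to three-address code. This is a sketch under assumptions: the nested-tuple tree shape, the temporary names `t1`, `t2`, ..., and the function `to_three_address` are all hypothetical.

```python
def to_three_address(tree):
    """Flatten an expression tree into a list of 'x = y op z' instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        # Leaves need no instruction; they are used directly as operands.
        if node[0] in ("num", "var"):
            return str(node[1])
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = new_temp()            # each operator result gets a fresh name
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(tree)
    return code, result


# Tree for a + b * c
tree = ("+", ("var", "a"), ("*", ("var", "b"), ("var", "c")))
code, result = to_three_address(tree)
# code == ["t1 = b * c", "t2 = a + t1"], result == "t2"
```

Each instruction has at most one operator and three addresses, which is what makes the code easy to optimize and to map onto different target machines.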
Code optimization
• Replacing an inefficient sequence of
instructions with a better sequence of
instructions.
• Sometimes called code improvement.
• Code optimization can be done:
– after semantic analysis
• performed on a parse tree
– after intermediate code generation
• performed on intermediate code
– after code generation
• performed on target code
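Constant folding is one simple example of an optimization performed on a parse tree. The sketch below is an assumption for illustration; the slides do not name a specific optimization, and the tree shape and the function `fold_constants` are hypothetical.

```python
import operator

# Operators that are safe to evaluate at compile time in this sketch.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(node):
    """Replace operator nodes whose operands are both constants."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if left[0] == "num" and right[0] == "num" and op in FOLDABLE:
        # Compute the result now instead of at run time.
        return ("num", FOLDABLE[op](left[1], right[1]))
    return (op, left, right)


# x + 2 * 3 becomes x + 6
before = ("+", ("var", "x"), ("*", ("num", 2), ("num", 3)))
after = fold_constants(before)
# after == ("+", ("var", "x"), ("num", 6))
```

The optimized tree computes the same value with one fewer operation at run time.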
Code generation
• A code generator
– takes either an intermediate code or a parse
tree
– produces a target program.
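A code generator that works directly from a parse tree might look like the sketch below. The target here is an invented stack machine; the instruction names `PUSH`, `LOAD`, `ADD`, `SUB`, and `MUL` and the function `generate` are assumptions, not a real instruction set.

```python
# Mapping from source operators to (hypothetical) stack-machine opcodes.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def generate(node, out=None):
    """Emit stack-machine instructions for an expression tree (post-order)."""
    out = [] if out is None else out
    if node[0] == "num":
        out.append(f"PUSH {node[1]}")     # push a constant
    elif node[0] == "var":
        out.append(f"LOAD {node[1]}")     # push the value of a variable
    else:
        op, left, right = node
        generate(left, out)               # operands first ...
        generate(right, out)
        out.append(OPCODES[op])           # ... then the operator
    return out


# a + 2 * b
program = generate(("+", ("var", "a"), ("*", ("num", 2), ("var", "b"))))
# program == ["LOAD a", "PUSH 2", "LOAD b", "MUL", "ADD"]
```

A generator that starts from intermediate code instead would walk a list of instructions rather than a tree, but the idea of mapping each construct to target instructions is the same.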

Error Handling
• Errors can be found in every phase of compilation.
– Errors found during compilation are called static
(or compile-time) errors.
– Errors found during execution are called dynamic
(or run-time) errors.
• Compilers need to detect, report, and recover
from errors found in source programs.
• Error handlers are different in different phases
of a compiler.
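Detect, report, and recover can be shown at the smallest scale with a scanner that does not stop at the first bad character. This is only a sketch: the toy language (single digits separated by whitespace) and the function `scan_digits` are invented for the example.

```python
def scan_digits(source):
    """Scan single-digit tokens, collecting errors instead of stopping."""
    tokens, errors = [], []
    for offset, ch in enumerate(source):
        if ch.isdigit():
            tokens.append(ch)
        elif not ch.isspace():
            # Detect and report the error, then recover by skipping
            # the offending character and continuing the scan.
            errors.append(f"offset {offset}: unexpected character {ch!r}")
    return tokens, errors


tokens, errors = scan_digits("1 a 2 ? 3")
# tokens == ["1", "2", "3"]; errors lists the characters at offsets 2 and 6
```

Recovering this way lets the compiler report several errors in one run rather than only the first one.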
Cross Compiler
• A cross compiler is a compiler which generates target code for
a machine different from the one on which the
compiler runs.
• A host language is a language in which the
compiler is written.
– T-diagram (S = source language, T = target language, H = host language)
S   T
  H
• Cross compilers are used very often in practice.
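The T-diagram can be modelled as a triple (S, T, H). The sketch below is an illustration under assumptions: the class `Compiler`, its methods `is_cross` and `compiled_by`, and the language names in the example are all hypothetical. `compiled_by` expresses the usual way T-diagrams are combined: translating a compiler's own source with a second compiler changes its host language to that second compiler's target.

```python
from typing import NamedTuple


class Compiler(NamedTuple):
    source: str   # S: language the compiler accepts
    target: str   # T: language the compiler produces
    host: str     # H: language the compiler itself is written in

    def is_cross(self):
        # Meaningful when the host is a machine language: the compiler
        # then runs on one machine and emits code for another.
        return self.target != self.host

    def compiled_by(self, other):
        # 'other' must accept the language this compiler is written in.
        if other.source != self.host:
            raise ValueError("other compiler cannot translate the host language")
        return Compiler(self.source, self.target, other.target)


c_to_arm_in_c = Compiler("C", "ARM", "C")       # written in C
native_c = Compiler("C", "x86", "x86")          # runs on x86
cross = c_to_arm_in_c.compiled_by(native_c)
# cross == Compiler("C", "ARM", "x86"): runs on x86, emits ARM code
```

The result runs on one machine (x86) and generates code for another (ARM), which is the definition of a cross compiler given above.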
Bootstrapping
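• If we have to implement, from scratch, a compiler from a
high-level language A to a machine language H,
which is also the host language, there are two approaches:
– direct method
– bootstrapping
[Figure: T-diagrams for the compiler from A to H hosted in H, and for the bootstrapping stages labelled A1, A2, A3, each targeting H]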
Cousins of Compilers
• Linkers
• Loaders
• Interpreters
• Assemblers

History (1930’s -40’s)
• 1930’s
– John von Neumann invented the concept of the
stored-program computer.
– Alan Turing defined the Turing machine and
computability.
• 1940’s
– Many electro-mechanical, stored-program
computers were constructed.
• ABC (Atanasoff Berry Computer) at Iowa
• Z1-4 (by Zuse) in Germany
• ENIAC (programmed by a plug board)
History (1950’s)
• Many electronic, stored-program computers were
designed.
– EDVAC (by von Neumann)
– ACE (by Turing)
• Programs were written in machine languages.
    0A 1F 83 90 4B    (op code, address, ...)
• Later, programs were written in assembly languages
instead.
– Assemblers translate symbolic code and memory
addresses to machine code.
    LDI B, 4
    LDI C, 3
    LDI A, 0
ST: ADI A, C
    DEC B
    JNZ B, ST
    STO 0XF0, A
• John Backus developed FORTRAN (no recursive
calls) and the FORTRAN compiler.
• Noam Chomsky studied the structure of languages
and classified them into classes called the Chomsky
hierarchy.
History (1960’s)
• Recursive-descent parsing was introduced.
• Naur designed Algol60, Pascal’s ancestor,
which allows recursive calls.
• Backus-Naur form (BNF) was used to
describe Algol60.
• LL(1) parsing was proposed by Lewis and
Stearns.
• General LR parsing was invented by Knuth.
• SLR parsing was developed by DeRemer.

History (1970’s)
• LALR was developed by DeRemer.
• Aho and Ullman founded the theory of LR
parsing techniques.
• Yacc (Yet Another Compiler Compiler) was
developed by Johnson.
• Type inference was studied by Milner.
