
CH2 1

This document describes a simple one-pass compiler. A one-pass compiler reads source code only once and immediately translates it into machine code without requiring additional passes or processes. Building a one-pass compiler involves defining a language's syntax, developing a parser, implementing syntax-directed translation to generate intermediate code, and optimizing the generated code. Context-free grammars are used to specify a language's syntax and allow for efficient parser construction. Parsing involves replacing non-terminals with production rules from left to right. Parse trees represent the structure of parsed code. Ambiguous grammars allow for multiple parse trees of the same input.

Uploaded by

sam negro

A Simple One-Pass Compiler

2
Introduction
 In computer programming, a one-pass compiler is a compiler that passes through each compilation unit only once, immediately translating each part into its final machine code.

 A one-pass compiler reads the source code only once and translates it as it reads.

 A one-pass compiler is fast because all of the compiler code is loaded into memory at once.

 It can process the source text without the overhead of the operating system having to shut down one process and start another.

3
Introduction
 Building a one-pass compiler involves:
 Defining the syntax of a programming language (CFG/BNF)

 Developing a source code parser (top-down parser)

 Implementing syntax-directed translation to generate intermediate code

 Generating

 Optimize

4
Structure of Compiler
[Figure: character stream → lexical analyzer → token stream → syntax-directed translator → intermediate representation. The syntax definition (CFG) is used to develop the parser and code generator for the translator.]

5
Syntax Definition
 To specify the syntax of a language, we use CFG and BNF notation.

 Example: an if-else statement in C has the form
statement → if ( expression ) statement else statement

 An alphabet of a language is a set of symbols.

 Examples:
{0,1} for the binary number language = {0, 1, 100, 101, ...}
{a,b,c} for the language = {a, b, c, ac, abcc, ...}
{if, (, ), else, ...} for the language of if statements = {if(a==1)goto10, if--}


6
Syntax Definition
 A context-free grammar (CFG) is used to describe the syntactic structure of a language.

 A CFG is a set of recursive rules used to generate patterns of strings.

 In a CFG, a string is derived from the start symbol by repeatedly replacing a nonterminal with the right-hand side of one of its productions, until all nonterminals have been replaced by terminal symbols.

 CFGs can describe the syntax of most programming languages.

 If the grammar is properly designed, an efficient parser can be constructed from it automatically.

7
CFG
 A CFG recursively defines several sets of strings.

 Each set is denoted by a name, which is called a nonterminal.

 One of the nonterminals is chosen to denote the language described by the grammar. This is called the start symbol of the grammar.

 Each production describes some of the possible strings that are contained in the set denoted by a nonterminal.

 A production has the form N → X1 … Xn, where N is a nonterminal and X1 … Xn are zero or more symbols, each of which is either a terminal or a nonterminal.

8
CFG
 Some examples:

A → a

 says that the set denoted by the nonterminal A contains the one-character string a.

A → aA

 says that the set denoted by A contains all strings formed by putting an a in front of a string taken from the set denoted by A.
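
The two example productions can be read together as one small grammar, A → a | aA. Combining them is an assumption made here for illustration. The sketch below enumerates the strings in the set denoted by A up to a length bound; the function name `strings_of_A` and the bound `max_len` are hypothetical.

```python
def strings_of_A(max_len):
    """Return all strings of length <= max_len in the set denoted by A.

    Assumed grammar: A -> a | aA
    """
    result = set()
    # Base case: A -> a contributes the one-character string "a".
    current = {"a"} if max_len >= 1 else set()
    while current:
        result |= current
        # Recursive case: A -> aA puts an 'a' in front of a string already in A.
        current = {"a" + s for s in current if len(s) + 1 <= max_len}
    return sorted(result, key=len)


print(strings_of_A(4))  # ['a', 'aa', 'aaa', 'aaaa']
```

Without the length bound the set is infinite: it contains every non-empty sequence of a's.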

9
CFG

 From regular expressions to context free grammars

10
CFG
 Common syntactic categories in programming languages are:
 Expressions: used to express the calculation of values.
 Statements: express actions that occur in a particular sequence.
 Declarations: express properties of names used in other parts of the program.

11
CFG
 A CFG is characterized by a 4-tuple:

1. A set of tokens (terminal symbols)

2. A set of nonterminals

3. A set of production rules

Each rule has the form NT → {T, NT}*

4. A designated start symbol

12
Example CFG
 Context-free grammar for simple expressions:

G = <{list, digit}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, list> with productions P =

list → list + digit

list → list - digit

list → digit

digit → 0|1|2|3|4|5|6|7|8|9

(the “|” means OR)

(so we could have written list → list + digit | list - digit | digit)
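
As a rough illustration, the grammar G above can be written down as a data structure with the four components of the tuple. The class name `Grammar`, its field names, and the `is_well_formed` check are assumptions made for this sketch; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset
    terminals: frozenset
    productions: tuple  # pairs (head, body); body is a tuple of symbols
    start: str

    def is_well_formed(self):
        """Check that every production only uses declared symbols."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and not (self.nonterminals & self.terminals)
            and all(
                head in self.nonterminals and all(s in symbols for s in body)
                for head, body in self.productions
            )
        )


DIGITS = "0123456789"
G = Grammar(
    nonterminals=frozenset({"list", "digit"}),
    terminals=frozenset({"+", "-", *DIGITS}),
    productions=(
        ("list", ("list", "+", "digit")),
        ("list", ("list", "-", "digit")),
        ("list", ("digit",)),
        # digit -> 0 | 1 | ... | 9 written out as ten separate productions
        *(("digit", (d,)) for d in DIGITS),
    ),
    start="list",
)

print(G.is_well_formed())  # True
print(len(G.productions))  # 13
```

Writing each alternative of "|" as its own production keeps the structure uniform: every production has exactly one head and one body.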

13
Derivation
 Given a CFG, we can determine the set of all strings (sequences of tokens) generated by the grammar using derivation.

 The basic idea of derivation is to consider productions as rewrite rules: whenever we have a nonterminal, we can replace it by the right-hand side of any production in which the nonterminal appears on the left-hand side.

 During parsing we have to make two decisions:

 We have to decide which nonterminal is to be replaced.
 We have to decide which production rule is used to replace it.

14
Derivation
 We begin with the start symbol.

 In each step, we replace one nonterminal in the current sentential form with one of the right-hand sides of the productions for that nonterminal.

 Formally, we define the derivation relation ⇒ by the three rules

1: αNβ ⇒ αγβ if there is a production N → γ

2: α ⇒ α

3: α ⇒ γ if there is a β such that α ⇒ β and β ⇒ γ

where α, β and γ are (possibly empty) sequences of grammar symbols.


15
Derivation

generates the string aabbbcc by the derivation

16
Left-most Derivation
 In a leftmost derivation, the leftmost nonterminal of the current sentential form is replaced in every step, so the sentential form is rewritten from left to right. Example:

 Production rules:
S → S + S
S → S - S
S → a | b | c

 Input: a - b + c

17
Right-most Derivation
 In a rightmost derivation, the rightmost nonterminal of the current sentential form is replaced in every step, so the sentential form is rewritten from right to left. Example:

 Production rules:
S → S + S
S → S - S
S → a | b | c

 Input: a - b + c
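
The derivation steps for the example input are not shown in the text, so the following is a small sketch of one way to trace them. The helper `derive` and the chosen order of productions are assumptions for illustration. Because this grammar is ambiguous, other production choices also lead to a - b + c.

```python
NONTERMINAL = "S"


def derive(choices, leftmost=True):
    """Apply the given production bodies one by one, starting from S.

    Each step rewrites the leftmost (or rightmost) S in the current
    sentential form and records the result.
    """
    form = [NONTERMINAL]
    steps = [" ".join(form)]
    for body in choices:
        positions = [i for i, sym in enumerate(form) if sym == NONTERMINAL]
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = body.split()
        steps.append(" ".join(form))
    return steps


left = derive(["S + S", "S - S", "a", "b", "c"], leftmost=True)
right = derive(["S + S", "c", "S - S", "b", "a"], leftmost=False)
print(" => ".join(left))
# S => S + S => S - S + S => a - S + S => a - b + S => a - b + c
print(" => ".join(right))
# S => S + S => S + c => S - S + c => S - b + c => a - b + c
```

Both derivations end in the same string and correspond to the same parse tree; only the order in which nonterminals are replaced differs.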

18
Grammars are Used to Derive Strings:
 We can derive the string 9 - 5 + 2 as follows:

list ⇒ list + digit            P1: list → list + digit
     ⇒ list - digit + digit    P2: list → list - digit
     ⇒ digit - digit + digit   P3: list → digit
     ⇒ 9 - digit + digit       P4: digit → 9
     ⇒ 9 - 5 + digit           P4: digit → 5
     ⇒ 9 - 5 + 2               P4: digit → 2

This is an example of a leftmost derivation, because we replaced the leftmost nonterminal in each step.
19
Defining Parse tree

➢ More formally, a parse tree for a CFG has the following properties:
➢ The root of the tree is labeled by the start symbol.
➢ Each leaf of the tree is labeled by a terminal (token) or ε.
➢ Each interior node (non-leaf) is labeled by a nonterminal.
➢ If A labels an interior node and x1, x2, …, xn label its children from left to right, then A → x1 x2 … xn is a production. The children may be nonterminals or tokens.

20
Parse Tree for the Example Grammar

➢ Parse tree of the string 9-5+2 using grammar G
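
Since the tree itself is not reproduced in the text, here is a minimal sketch that builds it for the grammar list → list + digit | list - digit | digit. The node representation (a pair of label and children) and the function names are assumptions for illustration, and the input is assumed to use single-digit operands.

```python
def parse_list(text):
    """Build a parse tree for a string such as '9-5+2'.

    A node is a pair (label, children); a leaf has an empty children list.
    """
    tokens = [ch for ch in text if not ch.isspace()]

    def digit(tok):
        if tok not in "0123456789":
            raise SyntaxError(f"expected a digit, got {tok!r}")
        return ("digit", [(tok, [])])

    if not tokens:
        raise SyntaxError("empty input")
    # list -> digit
    tree = ("list", [digit(tokens[0])])
    pos = 1
    while pos < len(tokens):
        op = tokens[pos]
        if op not in "+-" or pos + 1 >= len(tokens):
            raise SyntaxError(f"unexpected input at position {pos}")
        # list -> list + digit | list - digit: the tree built so far
        # becomes the left child of the new root.
        tree = ("list", [tree, (op, []), digit(tokens[pos + 1])])
        pos += 2
    return tree


def leaves(node):
    """Read the leaves from left to right (the yield of the tree)."""
    label, children = node
    if not children:
        return [label]
    return [leaf for child in children for leaf in leaves(child)]


def show(node, depth=0):
    """Print the tree with one level of indentation per depth."""
    label, children = node
    print("  " * depth + label)
    for child in children:
        show(child, depth + 1)


tree = parse_list("9-5+2")
show(tree)
print("".join(leaves(tree)))  # 9-5+2
```

The root is labeled with the start symbol list, every interior node is a nonterminal, and reading the leaves from left to right gives back the input string.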

21
Ambiguity
 A grammar is said to be ambiguous if some input string has more than one leftmost derivation, more than one rightmost derivation, or more than one parse tree.

 Consider the following context-free grammar:

G = <{string}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, string>

with productions P =

string → string + string | string - string | 0 | 1 | … | 9

 This grammar is ambiguous, because the string 9-5+2 has more than one parse tree.

22
Ambiguity
 Two derivations (Parse Trees) for the same token string.
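
To make the ambiguity concrete, the sketch below enumerates every parse tree of 9-5+2 under string → string + string | string - string | 0 | … | 9 and evaluates each one. The tuple representation of trees and the function names are assumptions for illustration.

```python
def parse_trees(tokens):
    """Return every parse tree for tokens under the ambiguous grammar.

    Trees are nested tuples (left, op, right); a single digit is a leaf.
    """
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0].isdigit() else []
    trees = []
    # Try every operator as the root of the tree (the one applied last).
    for i in range(1, len(tokens) - 1):
        if tokens[i] in ("+", "-"):
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((left, tokens[i], right))
    return trees


def value_of(tree):
    """Evaluate a tree bottom-up."""
    if isinstance(tree, str):
        return int(tree)
    left, op, right = tree
    if op == "+":
        return value_of(left) + value_of(right)
    return value_of(left) - value_of(right)


trees = parse_trees(list("9-5+2"))
for t in trees:
    print(t, "=", value_of(t))
# ('9', '-', ('5', '+', '2')) = 2
# (('9', '-', '5'), '+', '2') = 6
```

The two trees give different values, which is why an ambiguous grammar is a problem for a compiler: the meaning of the program would depend on which tree the parser happens to build.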

23
Associativity of Operators
➢ An operator ⊕ is left-associative if the expression a ⊕ b ⊕ c must be evaluated from left to right, i.e., as (a ⊕ b) ⊕ c.

➢ An operator ⊕ is right-associative if the expression a ⊕ b ⊕ c must be evaluated from right to left, i.e., as a ⊕ (b ⊕ c).

➢ An operator ⊕ is non-associative if expressions of the form a ⊕ b ⊕ c are illegal.

➢ Left-associative operators have left-recursive productions; right-associative operators have right-recursive productions.

e.g. 9+5+2 ≡ (9+5)+2, a=b=c ≡ a=(b=c)

• Left-associative grammar
list → list + digit | list - digit | digit
digit → 0|1|…|9

• Right-associative grammar
right → letter = right | letter
letter → a|b|…|z
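
The link between the direction of recursion and the grouping can be sketched in a few lines. The functions `group_left` and `group_right` are hypothetical helpers that add explicit parentheses; they assume well-formed input with single-character tokens and do no error checking.

```python
def group_left(tokens):
    """list -> list op digit | digit: fold operands in from the left."""
    result = tokens[0]
    # Left recursion becomes a loop: each operand joins the result so far.
    for i in range(1, len(tokens), 2):
        result = f"({result}{tokens[i]}{tokens[i + 1]})"
    return result


def group_right(tokens):
    """right -> letter = right | letter: recurse on everything after '='."""
    if len(tokens) == 1:
        return tokens[0]
    return f"({tokens[0]}{tokens[1]}{group_right(tokens[2:])})"


print(group_left(list("9+5+2")))   # ((9+5)+2)
print(group_right(list("a=b=c")))  # (a=(b=c))
```

In the left-recursive case the earlier operands end up deepest in the grouping, while in the right-recursive case the later ones do.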

25
Precedence of Operator
➢ A possible way of resolving the ambiguity is to use precedence rules during syntax analysis to select among the possible syntax trees.

➢ We say that an operator (*) has higher precedence than another operator (+) if * takes its operands before + does.
• e.g. 9+5*2 ≡ 9+(5*2), 9*5+2 ≡ (9*5)+2
• left-associative operators: +, -, *, /
• right-associative operators: =, **

26
Precedence of Operator

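expr → expr + term | term
term → term * factor | factor
factor → number | ( expr )

The string 2+3*5 has the same meaning as 2+(3*5).

[Figure: parse tree for 2 + 3 * 5. The root expr expands to expr + term; the left expr derives 2 through term, factor and number; the right term expands to term * factor, deriving 3 and 5.]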
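
A minimal sketch of how this grammar enforces precedence is a recursive-descent evaluator with one function per nonterminal. It is not taken from the slides: the function `eval_expr` and its helpers are assumptions, and the left-recursive productions are written as loops, a common rewrite because a top-down parser cannot call itself on left recursion directly.

```python
import re


def eval_expr(text):
    """Evaluate text according to the expr / term / factor grammar."""
    tokens = re.findall(r"\d+|\S", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        # expr -> expr + term | term
        value = term()
        while peek() == "+":
            advance()
            value += term()
        return value

    def term():
        # term -> term * factor | factor
        value = factor()
        while peek() == "*":
            advance()
            value *= factor()
        return value

    def factor():
        # factor -> number | ( expr )
        tok = advance()
        if tok is not None and tok.isdigit():
            return int(tok)
        if tok == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {tok!r}")

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


print(eval_expr("2+3*5"))    # 17
print(eval_expr("(2+3)*5"))  # 25
```

Because term sits below expr in the grammar, every multiplication is finished inside `term()` before `expr()` gets to add, which is exactly the grouping 2+(3*5).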
