
CS372 Formal Languages & The Theory of Computation

Chap 4

Uploaded by

Chol Ngủyên

Unit 4

Context Free Grammars

Dr. Nguyen Thi Thu Huong


Mobi: +84 903253796
Context-Free Grammars

• Informal description
• Context-Free grammars
• Designing context-free grammars
• Ambiguity
• Chomsky normal form
Imagine a tiny set of syntax rules of English
<sentence> ::= <noun phrase> <verb phrase>
<noun phrase> ::= <noun>
<verb phrase> ::= <verb> <verb phrase>
<verb phrase> ::= <verb> <prepositional phrase>
<prepositional phrase> ::= <preposition> <noun phrase>
<noun> ::= Boeing
<noun> ::= Seattle
<verb> ::= is
<verb> ::= located
<preposition> ::= in
Grammar for arithmetic expressions

<expression> --> number


<expression> --> ( <expression> )
<expression> --> <expression> + <expression>
<expression> --> <expression> - <expression>
<expression> --> <expression> * <expression>
<expression> --> <expression> / <expression>

Context Free Grammars (CFG)

A context-free grammar G has:
• A set of terminal symbols, Σ
• A set of nonterminal symbols (or variables), V
• A start symbol, S, which is a member of V
• A set R of production rules of the form A → w,
where A is a variable and w is a string of terminal
and nonterminal symbols, or ε.
Formal definition of a context free grammar

A context-free grammar is a 4-tuple (V, Σ, R, S), where
1. V is a finite set called the variables,
2. Σ is a finite set, disjoint from V, called the terminals,
3. R is a finite set of rules, with each rule being a
variable and a string of variables and terminals (the form of
a rule is A → α, where A ∈ V and α ∈ (V ∪ Σ)*),
4. S ∈ V is the start variable.
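The 4-tuple can be written down directly as a data structure. The following is a minimal sketch in Python, not part of the original slides: the class name `CFG` and its field names are assumptions chosen for illustration, and the checks mirror conditions 2 to 4 of the definition. The grammar used at the end is the nested-parentheses grammar G1 from these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """A context-free grammar (V, Σ, R, S); names are illustrative only."""
    variables: frozenset   # V
    terminals: frozenset   # Σ
    rules: tuple           # R: pairs (A, body); body is a tuple of symbols, () stands for ε
    start: str             # S

    def __post_init__(self):
        # Condition 2: Σ is disjoint from V.
        if self.variables & self.terminals:
            raise ValueError("V and Σ must be disjoint")
        # Condition 4: S is a member of V.
        if self.start not in self.variables:
            raise ValueError("the start symbol must be in V")
        # Condition 3: every rule is A → α with A in V and α in (V ∪ Σ)*.
        symbols = self.variables | self.terminals
        for head, body in self.rules:
            if head not in self.variables:
                raise ValueError(f"rule head {head!r} is not a variable")
            if any(symbol not in symbols for symbol in body):
                raise ValueError(f"rule body {body!r} uses an unknown symbol")


# G1: S → (S) | SS | ε
G1 = CFG(
    variables=frozenset({"S"}),
    terminals=frozenset({"(", ")"}),
    rules=(("S", ("(", "S", ")")), ("S", ("S", "S")), ("S", ())),
    start="S",
)
```

Representing ε as the empty tuple keeps every rule body a plain sequence of symbols, so no special ε symbol has to be added to Σ.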
Context Free Grammar Examples

• Grammar of nested parentheses
G1 = (V, Σ, R, S) where
V = {S}
Σ = { (, ) }
R = { S → (S), S → SS, S → ε }
Context Free Grammar Examples
The grammar of decimal numbers
G2 = (V, Σ, R, S), V = {S, A, B, C},
Σ = {+, -, ., 0, 1, 2, ..., 9}
R: S → +A | -A | A
A → B.B | B
B → BC | C
C → 0 | 1 | 2 | ... | 9
How to create a sentence by using a grammar?
• Start with the start symbol S and repeatedly replace a
variable with one of its right-hand sides (RHS) until the
string contains only terminal symbols.
• Derivation of -1.23 in G2:
S ⇒ -A
⇒ -B.B
⇒ -B.BC
⇒ -C.BC
⇒ -1.BC
⇒ -1.CC
⇒ -1.2C
⇒ -1.23
Derivations
• When the current string X consists only of terminal
symbols, it is a string of the language generated by the
grammar.
• Each replacement of a variable by one of its right-hand
sides is a derivation step.
• If the string contains several nonterminals at some
point, the rules of derivation allow any of them to be
replaced next.
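The derivation of -1.23 in G2 can be replayed step by step. The sketch below is an illustration and not part of the original slides: `RULES`, `apply_rule` and `STEPS` are hypothetical names, and every symbol is assumed to be a single character so that a sentential form can be stored as a plain string.

```python
# Rules of G2; each right-hand side is a string of one-character symbols.
RULES = {
    "S": ["+A", "-A", "A"],
    "A": ["B.B", "B"],
    "B": ["BC", "C"],
    "C": list("0123456789"),
}


def apply_rule(form, position, rhs):
    """Perform one derivation step: replace the variable at `position` by `rhs`."""
    variable = form[position]
    if rhs not in RULES.get(variable, []):
        raise ValueError(f"{variable} -> {rhs} is not a rule of G2")
    return form[:position] + rhs + form[position + 1:]


# (position of the replaced variable, chosen right-hand side) for each step.
STEPS = [(0, "-A"), (1, "B.B"), (3, "BC"), (1, "C"),
         (1, "1"), (3, "C"), (3, "2"), (4, "3")]

form = "S"
trace = [form]
for position, rhs in STEPS:
    form = apply_rule(form, position, rhs)
    trace.append(form)

print(" => ".join(trace))
```

The third step replaces the second B rather than the leftmost variable, which is allowed: at each step any variable in the string may be chosen.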
Definitions

• If u, v, and w are strings of variables and
terminals, and A → w is a rule of the
grammar, we say that uAv yields uwv,
written uAv ⇒ uwv.
• Say that u derives v, written u ⇒* v,
– if u = v or
– if a sequence u1, u2, . . . , uk exists for k ≥ 1
and
u = u1 ⇒ u2 ⇒ . . . ⇒ uk = v.
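The two relations can be checked mechanically. The following sketch, an illustration that is not part of the original slides, tests "yields" for a pair of strings and checks whether a given sequence of strings is a valid derivation. The function names are assumptions, symbols are assumed to be single characters, and the rules are those of the nested-parentheses grammar G1.

```python
# Rules of G1 as (variable, right-hand side); "" stands for ε.
RULES = [("S", "(S)"), ("S", "SS"), ("S", "")]


def yields(u, v, rules=RULES):
    """True if u ⇒ v: some rule A → w rewrites one occurrence of A in u to give v."""
    for head, body in rules:
        for i, symbol in enumerate(u):
            if symbol == head and u[:i] + body + u[i + 1:] == v:
                return True
    return False


def is_derivation(forms, rules=RULES):
    """True if forms[0] ⇒ forms[1] ⇒ ... ⇒ forms[-1]; a single form covers u = v."""
    return all(yields(a, b, rules) for a, b in zip(forms, forms[1:]))
```

For example, `yields("S", "(S)")` holds, while `yields("S", "()")` does not, because reaching `()` from `S` takes two steps; `is_derivation(["S", "(S)", "()"])` confirms that `S` derives `()`.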
Example
• Consider grammar G1 = ({S}, {(, )}, {S → (S), S → SS, S → ε}, S)
• A derivation that generates the string (()()) is
S ⇒ (S)        (sentential form)
⇒ (SS)
⇒ ((S)S)
⇒ (()S)
⇒ (()(S))
⇒ (()())       (sentence)
Derivation Tree (Parse Tree)
A derivation tree is constructed as follows:
1) Each tree vertex is a variable
(nonterminal), a terminal, or ε
2) The root vertex is S
3) Interior vertices are from V; leaf
vertices are from Σ or ε
4) An interior vertex A has children,
in order, left to right,
X1, X2, ... , Xk when there is a rule
in R of the form A → X1 X2 ... Xk
5) A leaf can be ε only when there is
a production A → ε, and that leaf is
then the only child of its parent.
[Figure: derivation tree of string (()())]
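A derivation tree for (()()) in grammar G1 can be built and checked in a few lines. This is a minimal sketch, not part of the original slides: the `Node` class and the helper names are hypothetical, and ε leaves are represented by the label "ε".

```python
from dataclasses import dataclass, field

VARIABLES = {"S"}
# Rules of G1; the ε-rule is written with a single ε child, as in condition 5.
RULES = {("S", ("(", "S", ")")), ("S", ("S", "S")), ("S", ("ε",))}


@dataclass
class Node:
    label: str                                   # a variable, a terminal, or "ε"
    children: list = field(default_factory=list)


def tree_yield(node):
    """Concatenate the leaf labels from left to right, skipping ε leaves."""
    if not node.children:
        return "" if node.label == "ε" else node.label
    return "".join(tree_yield(child) for child in node.children)


def respects_rules(node):
    """Leaves must not be variables; each interior vertex must match a rule A → X1...Xk."""
    if not node.children:
        return node.label not in VARIABLES
    body = tuple(child.label for child in node.children)
    return (node.label, body) in RULES and all(respects_rules(c) for c in node.children)


def empty():                 # subtree for S → ε
    return Node("S", [Node("ε")])


def wrap(inner):             # subtree for S → (S)
    return Node("S", [Node("("), inner, Node(")")])


def concat(left, right):     # subtree for S → SS
    return Node("S", [left, right])


tree = wrap(concat(wrap(empty()), wrap(empty())))
print(tree_yield(tree))
```

Reading the leaves from left to right gives back the derived string, which is why the tree and the derivation describe the same sentence.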
Designing context free grammars
• Many CFLs are the union of simpler CFLs. To construct
a CFG for such a CFL, break it into simpler pieces,
construct an individual grammar for each piece, and
then merge the grammars: combine their rules and add a
new start rule S → S1 | S2 | ... | Sk, where S1, ..., Sk are
the start variables of the individual grammars.
• Certain CFLs contain strings with two substrings that are
“linked”. For these, construct a CFG with a recursive rule
of the form R → uRv.
• In more complex languages, the strings may contain
certain structures that appear recursively as part of
other (or the same) structures.
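The first two design techniques can be sketched together. The example languages {0^n 1^n} and {1^n 0^n} below are assumptions chosen for illustration and do not appear in the slides; each uses a "linked" rule of the form R → uRv, and the hypothetical helper `union_grammar` merges the two grammars by adding a new start rule.

```python
def union_grammar(rules1, start1, rules2, start2, new_start="S"):
    """Merge two rule lists and add new_start → start1 | start2.

    Assumes the two grammars use distinct variable names and that
    new_start is not already a variable of either grammar.
    """
    heads1 = {head for head, _ in rules1}
    heads2 = {head for head, _ in rules2}
    if heads1 & heads2 or new_start in heads1 | heads2:
        raise ValueError("variable names must be distinct; rename before merging")
    return [(new_start, (start1,)), (new_start, (start2,))] + list(rules1) + list(rules2)


# Linked substrings: every 0 added on the left is matched by a 1 on the right.
ZEROS_THEN_ONES = [("S1", ("0", "S1", "1")), ("S1", ())]   # {0^n 1^n | n >= 0}
ONES_THEN_ZEROS = [("S2", ("1", "S2", "0")), ("S2", ())]   # {1^n 0^n | n >= 0}

merged = union_grammar(ZEROS_THEN_ONES, "S1", ONES_THEN_ZEROS, "S2")
for head, body in merged:
    print(head, "→", " ".join(body) or "ε")
```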
Ambiguity

• A string w is derived ambiguously in
context-free grammar G if it has two or more
different leftmost derivations.
• Grammar G is ambiguous if it generates
some string ambiguously.
Example of an ambiguous grammar
Consider grammar G3
<EXPR> → <EXPR> + <EXPR> | <EXPR> * <EXPR> |
(<EXPR>) | a

This grammar generates 2 different parse trees
for the string a + a * a, so grammar G3 is ambiguous.
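The ambiguity of G3 can be confirmed by counting parse trees. The sketch below is an illustration written specifically for the four rules of G3, not a general algorithm from the slides; the function name `count_parse_trees` is an assumption. It counts, for every span of the input, the number of parse trees rooted at <EXPR>.

```python
from functools import lru_cache


def count_parse_trees(text):
    """Count the parse trees that G3 assigns to `text` (spaces are ignored)."""
    tokens = tuple(text.replace(" ", ""))

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving tokens[i:j] from <EXPR>.
        total = 0
        if j - i == 1 and tokens[i] == "a":
            total += 1                                   # <EXPR> → a
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                 # <EXPR> → ( <EXPR> )
        for k in range(i + 1, j - 1):                    # k is the operator position
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)   # <EXPR> → <EXPR> op <EXPR>
        return total

    return count(0, len(tokens)) if tokens else 0


print(count_parse_trees("a + a * a"))
```

A count greater than 1 means the string is derived ambiguously; for a + a * a the two trees correspond to grouping the sum first or the product first.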
Disambiguation
• Find an unambiguous grammar that generates
the same language.
<EXPR> → <EXPR> + <TERM> | <TERM>
<TERM> → <TERM> * <FACTOR> | <FACTOR>
<FACTOR> → ( <EXPR> ) | a
• Some context-free languages are inherently
ambiguous: they can be generated only
by ambiguous grammars.
Example: {a^i b^j c^k | i = j or j = k}.
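With the unambiguous grammar, each string has exactly one parse. The following sketch is an illustration, not the slides' method: it is a small hand-written parser in which the left-recursive rules for <EXPR> and <TERM> are realized as loops, and it returns a condensed tree as nested tuples rather than the full parse tree. All function names are assumptions.

```python
def parse(text):
    """Return the parse of `text` as nested tuples, e.g. ('+', 'a', ('*', 'a', 'a'))."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(token):
        nonlocal pos
        if peek() != token:
            raise SyntaxError(f"expected {token!r} at position {pos}")
        pos += 1

    def factor():                       # <FACTOR> → ( <EXPR> ) | a
        if peek() == "(":
            expect("(")
            inner = expr()
            expect(")")
            return inner
        expect("a")
        return "a"

    def term():                         # <TERM> → <TERM> * <FACTOR> | <FACTOR>
        left = factor()
        while peek() == "*":
            expect("*")
            left = ("*", left, factor())
        return left

    def expr():                         # <EXPR> → <EXPR> + <TERM> | <TERM>
        left = term()
        while peek() == "+":
            expect("+")
            left = ("+", left, term())
        return left

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree


print(parse("a + a * a"))
```

Because * is introduced only at the <TERM> level, the product a * a is always grouped before the sum, so the second parse tree allowed by G3 can no longer occur.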
Chomsky Normal Form – CNF
A grammar is in Chomsky normal form if every production is of the form
A → BC or A → c
where A, B, C are arbitrary variables (nonterminals) and c is an arbitrary terminal.
If the language contains ε, then we also allow S → ε,
where S is the start symbol, and forbid S on the RHS of any rule.

Example
CNF:      S → AS,  S → a,    A → SA,  A → b
Non-CNF:  S → AS,  S → AAS,  A → SA,  A → aa
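The CNF conditions are easy to test rule by rule. The sketch below is an illustration that is not part of the original slides; `is_cnf` is a hypothetical helper, rules are (head, body) pairs with the body stored as a tuple of symbols, and the empty tuple stands for ε. The two grammars are the CNF and non-CNF examples above.

```python
def is_cnf(rules, variables, start):
    """Check the CNF conditions stated above for a list of (head, body) rules."""
    for head, body in rules:
        if len(body) == 2 and all(symbol in variables for symbol in body):
            continue                      # A → BC
        if len(body) == 1 and body[0] not in variables:
            continue                      # A → c
        if body == () and head == start:
            continue                      # S → ε
        return False
    # When S → ε is present, S must not occur on any right-hand side.
    if (start, ()) in rules:
        return all(start not in body for _, body in rules)
    return True


VARIABLES = {"S", "A"}
CNF_EXAMPLE = [("S", ("A", "S")), ("S", ("a",)), ("A", ("S", "A")), ("A", ("b",))]
NON_CNF_EXAMPLE = [("S", ("A", "S")), ("S", ("A", "A", "S")),
                   ("A", ("S", "A")), ("A", ("a", "a"))]

print(is_cnf(CNF_EXAMPLE, VARIABLES, "S"), is_cnf(NON_CNF_EXAMPLE, VARIABLES, "S"))
```

In the second grammar, S → AAS is too long and A → aa has two terminals on the right-hand side, so either rule alone is enough to violate the form.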
Theorem
• Any context-free language is generated by
a context-free grammar in Chomsky
normal form
• PROOF IDEA
– Convert a grammar G into CNF
Method
• The conversion to Chomsky Normal Form has
four main steps:
1. Get rid of all ε-rules.
2. Get rid of all rules whose RHS is a single
variable (unit rules).
3. Replace every rule that is too long by
shorter rules.
4. Move all terminals to rules whose RHS is
a single terminal.
Remove ε-rules
ε-rule: a rule of the form X → ε
Nullable variable: a variable Y with Y ⇒* ε

Example: S → aMb
M → aMb
M → ε

ε-rule: M → ε
Nullable variable: M
Remove ε-rules
Before:
S → aMb
M → aMb
M → ε
After replacing M → ε:
S → aMb | ab
M → aMb | ab
• Determine the nullable variables (those that generate ε).
• Go through all productions and, for each, add the versions
obtained by omitting every possible subset of its nullable variables.
• Delete all productions with empty RHS.
• If the start variable is nullable, then create a new start
variable S’ and add the rules S’ → S | ε to the grammar.
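The first three bullets can be sketched as follows. This is an illustration under stated assumptions, not the slides' own code: rules are (head, body) pairs, every symbol is a single character, the empty string stands for ε, and the function names are hypothetical. The last bullet (adding S’ → S | ε when the start variable is nullable) is left out of the sketch.

```python
from itertools import combinations


def nullable_variables(rules):
    """Variables that can derive ε, found by repeating until nothing changes."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in nullable and all(symbol in nullable for symbol in body):
                nullable.add(head)
                changed = True
    return nullable


def remove_epsilon_rules(rules):
    """Omit every subset of nullable occurrences, then drop rules with empty RHS."""
    nullable = nullable_variables(rules)
    result = set()
    for head, body in rules:
        positions = [i for i, symbol in enumerate(body) if symbol in nullable]
        for size in range(len(positions) + 1):
            for omitted in combinations(positions, size):
                new_body = "".join(s for i, s in enumerate(body) if i not in omitted)
                if new_body:                 # rules with empty RHS are deleted
                    result.add((head, new_body))
    return result


RULES = [("S", "aMb"), ("M", "aMb"), ("M", "")]
for head, body in sorted(remove_epsilon_rules(RULES)):
    print(head, "→", body)
```

On the example grammar this yields S → aMb | ab and M → aMb | ab, matching the slide.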
Remove unit rules
A unit rule is a rule of the form
X → Y
where both X and Y are nonterminals.
Example: A → B is a unit rule.
Remove unit rules
• Because A ⇒* B, for all rules of the form B → u,
add the rule A → u, unless this was a unit rule
previously removed.
• Example: the unit rules are A → B and B → A.
S → aA | aB
A → a | B
B → A | bb
• Replace A → B: since B → bb, add A → bb.
S → aA | aB
A → a | bb
B → A | bb
• Similarly, replace B → A: since A → a | bb, add B → a
(B → bb is already present).
• Remove duplicate rules. Result:
S → aA | aB
A → a | bb
B → bb | a
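Unit-rule removal can be sketched by first collecting all pairs (A, B) with A ⇒* B through unit rules only, and then copying the non-unit rules of B to A. The code is an illustration, not the slides' method verbatim: the function name is hypothetical, symbols are assumed to be single characters, and the starting grammar S → aA | aB, A → a | B, B → A | bb is an assumption consistent with the result shown on the slide.

```python
def remove_unit_rules(rules, variables):
    """rules: set of (head, body) pairs; a body that is a single variable is a unit rule."""
    # unit_pairs holds (A, B) whenever A ⇒* B using unit rules only.
    unit_pairs = {(v, v) for v in variables}
    changed = True
    while changed:
        changed = False
        for a, b in list(unit_pairs):
            for head, body in rules:
                if head == b and body in variables and (a, body) not in unit_pairs:
                    unit_pairs.add((a, body))
                    changed = True
    # For every pair (A, B) and every non-unit rule B → u, keep A → u.
    return {(a, body)
            for a, b in unit_pairs
            for head, body in rules
            if head == b and body not in variables}


VARIABLES = {"S", "A", "B"}
RULES = {("S", "aA"), ("S", "aB"), ("A", "a"), ("A", "B"), ("B", "A"), ("B", "bb")}

for head, body in sorted(remove_unit_rules(RULES, VARIABLES)):
    print(head, "→", body)
```

Storing the result as a set removes duplicate rules automatically, which corresponds to the last step on the slide.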
Convert a CFG to Chomsky Normal Form

• Consider grammar G:
S → ABa
A → aab
B → Ac
G is not in CNF. We will convert grammar G to CNF.
Replace Long Productions by
Shorter Ones
• Replace each rule A → u1u2 · · · uk, where k ≥ 3
and each ui is a variable or terminal symbol, with
the set of rules
o A → u1A1
o A1 → u2A2
o A2 → u3A3, . . . ,
o Ak−2 → uk−1uk.
The Ai’s are new variables.
• Replace any terminal ui in the preceding rule(s)
with the new variable Ui and add the rule Ui → ui.
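Both replacement steps can be applied to the grammar G with rules S → ABa, A → aab, B → Ac. The sketch below is an illustration, not the slides' own code: the function names and the generated variable names `X1`, `X2`, ... and `U_a`, `U_b`, ... are assumptions, chosen so that they do not clash with the variables of G.

```python
def split_long_rules(rules, variables):
    """Replace A → u1 u2 ... uk (k ≥ 3) by a chain of rules with two symbols each."""
    variables = set(variables)
    out, counter = [], 0
    for head, body in rules:
        body = tuple(body)
        while len(body) >= 3:
            counter += 1
            fresh = f"X{counter}"            # new variable, assumed unused so far
            variables.add(fresh)
            out.append((head, (body[0], fresh)))
            head, body = fresh, body[1:]
        out.append((head, body))
    return out, variables


def lift_terminals(rules, variables):
    """In two-symbol bodies, replace each terminal u by a variable U_u and add U_u → u."""
    variables = set(variables)
    out, added = [], {}
    for head, body in rules:
        if len(body) == 2:
            new_body = []
            for symbol in body:
                if symbol not in variables:  # a terminal
                    added.setdefault(symbol, f"U_{symbol}")
                    symbol = added[symbol]
                new_body.append(symbol)
            body = tuple(new_body)
        out.append((head, body))
    out.extend((variable, (terminal,)) for terminal, variable in added.items())
    variables.update(added.values())
    return out, variables


G = [("S", "ABa"), ("A", "aab"), ("B", "Ac")]
binary, variables = split_long_rules(G, {"S", "A", "B"})
cnf, variables = lift_terminals(binary, variables)
for head, body in cnf:
    print(head, "→", " ".join(body))
```

G has no ε-rules or unit rules, so these two steps are all that is needed here; after them every rule has either two variables or a single terminal on its right-hand side.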
Example

• Convert G to CNF
S → aSb | ε
• Remove the ε production
S → aSb | ab
Make new start variable

• The new start variable S’ will not appear on
the RHS of any rule
S’ → S | ε
S → aSb | ab
Remove unit rules
• Grammar before removing S’ → S
S’ → S| ε
S → aSb | ab
Grammar after removing S’ → S
S’ → aSb | ab | ε
S → aSb | ab
Replace long rules and terminals
• Before replacing long rules
S’ → aSb | ab | ε
S → aSb | ab
• Replace S’ → aSb and S → aSb using new variable A1
S’ → aA1 | ab | ε
S → aA1 | ab
A1 → Sb
• Replace a with U, b with V and add rules U → a, V → b
S’ → UA1 | UV | ε
A1 → SV
S → UA1 | UV
U → a, V → b
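As a sanity check, the original grammar S → aSb | ε and the final CNF grammar can be compared on all short strings. The sketch below is an illustration and not part of the slides: `language` is a hypothetical helper that explores leftmost derivations breadth-first and discards any form with more than `max_len` terminals. It terminates here because, in both grammars, every cycle of derivation steps adds a terminal; that is an assumption about these two grammars, not a general guarantee.

```python
from collections import deque


def language(rules, start, variables, max_len):
    """Terminal strings of length ≤ max_len reachable by leftmost derivations."""
    found, seen, queue = set(), {(start,)}, deque([(start,)])
    while queue:
        form = queue.popleft()
        index = next((i for i, s in enumerate(form) if s in variables), None)
        if index is None:                    # no variable left: a sentence
            found.add("".join(form))
            continue
        for head, body in rules:
            if head != form[index]:
                continue
            new = form[:index] + body + form[index + 1:]
            terminals = sum(1 for s in new if s not in variables)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


ORIGINAL = [("S", ("a", "S", "b")), ("S", ())]
CNF = [("S'", ("U", "A1")), ("S'", ("U", "V")), ("S'", ()),
       ("A1", ("S", "V")),
       ("S", ("U", "A1")), ("S", ("U", "V")),
       ("U", ("a",)), ("V", ("b",))]

before = language(ORIGINAL, "S", {"S"}, 6)
after = language(CNF, "S'", {"S'", "S", "A1", "U", "V"}, 6)
print(sorted(before, key=len), before == after)
```

Agreement on strings up to a fixed length does not prove the two grammars equivalent, but a mismatch would reveal a mistake in one of the conversion steps.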
