CS372 Formal Languages & The Theory of Computation
• Informal description
• Context-Free grammars
• Designing context-free grammars
• Ambiguity
• Chomsky normal form
Imagine a tiny set of syntax rules of English
<sentence> ::= <noun phrase> <verb phrase>
<noun phrase> ::= <noun>
<verb phrase> ::= <verb> <verb phrase>
<verb phrase> ::= <verb> <prepositional phrase>
<prepositional phrase> ::= <preposition> <noun phrase>
<noun> ::= Boeing
<noun> ::= Seattle
<verb> ::= is
<verb> ::= located
<preposition> ::= in
Grammar for arithmetic expressions
Context Free Grammars (CFG)
Formal definition of a context free grammar
Context Free Grammar Examples
The grammar of decimal numbers
Last steps of a sample derivation:
-1.BC ⇒ -1.CC ⇒ -1.2C ⇒ -1.23
Derivations
• When X consists only of terminal symbols, it is
a string of the language denoted by the
grammar.
• Each iteration of the loop is a derivation step.
• If an iteration has several nonterminals to
choose from, the rules of derivation allow
any of them to be rewritten.
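The loop described above can be sketched in Python. This is a minimal illustration, not the course's own pseudocode: the dictionary encoding of the grammar, the function name derive and the random choice of which nonterminal to rewrite are all assumptions made for the example. The grammar S → aSb | ab is borrowed from the last example in these notes.

```python
import random

# Assumed encoding: the dict keys are the nonterminals, every other symbol
# is a terminal, and each right-hand side is a tuple of symbols.
GRAMMAR = {
    "S": [("a", "S", "b"), ("a", "b")],
}

def derive(grammar, start, rng):
    """Rewrite nonterminals until only terminals remain.

    Returns the list of sentential forms, one per derivation step.
    """
    x = [start]
    steps = [list(x)]
    while any(sym in grammar for sym in x):
        # Several nonterminals may be present; any one of them may be chosen.
        positions = [i for i, sym in enumerate(x) if sym in grammar]
        i = rng.choice(positions)
        rhs = rng.choice(grammar[x[i]])
        x[i:i + 1] = rhs  # one derivation step: replace the nonterminal by a RHS
        steps.append(list(x))
    return steps

if __name__ == "__main__":
    for form in derive(GRAMMAR, "S", random.Random(0)):
        print("".join(form))
```

Each printed line is one sentential form; the last one contains only terminals and is therefore a string of the language.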
Definitions
Grammar G3 is ambiguous
Disambiguation
• Find an unambiguous grammar that generates
the same language.
<EXPR> → <EXPR> + <TERM> | <TERM>
<TERM> → <TERM> * <FACTOR> | <FACTOR>
<FACTOR> → ( <EXPR> ) | a
• Some context-free languages are inherently
ambiguous: they can be generated only
by ambiguous grammars.
Example: {a^i b^j c^k | i = j or j = k}.
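To see how the EXPR / TERM / FACTOR grammar fixes a single structure for each string, here is a small hand-written parser sketch. It is an illustration only: the function name parse and the nested-tuple tree format are assumptions, and the left-recursive rules are implemented with loops, which is an implementation choice and not part of the grammar itself.

```python
def parse(tokens):
    """Parse tokens drawn from 'a', '+', '*', '(', ')' into a nested tuple.

    A tree is either the leaf "a" or (operator, left_subtree, right_subtree).
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at position {pos}")
        pos += 1

    def factor():
        # <FACTOR> -> ( <EXPR> ) | a
        if peek() == "(":
            expect("(")
            tree = expr()
            expect(")")
            return tree
        expect("a")
        return "a"

    def term():
        # <TERM> -> <TERM> * <FACTOR> | <FACTOR>, left-associative
        tree = factor()
        while peek() == "*":
            expect("*")
            tree = ("*", tree, factor())
        return tree

    def expr():
        # <EXPR> -> <EXPR> + <TERM> | <TERM>, left-associative
        tree = term()
        while peek() == "+":
            expect("+")
            tree = ("+", tree, term())
        return tree

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree

if __name__ == "__main__":
    print(parse(list("a+a*a")))
```

For a+a*a the multiplication ends up below the addition, so only one of the two groupings allowed by an ambiguous expression grammar survives.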
Chomsky Normal Form – CNF
A grammar where every production is either of the form
A → BC or A → c
where A, B, C are arbitrary variables and c is an arbitrary terminal symbol.
If the language contains ε, then we allow S → ε,
where S is the start symbol, and forbid S on the RHS of any rule.
A → BC or A → a
CNF          Non-CNF
S → AS       S → AS
S → a        S → AAS
A → SA       A → SA
A → b        A → aa
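The definition can be checked mechanically. The sketch below is one possible encoding, not a standard API: a grammar is assumed to be a dict from variables to lists of right-hand sides (tuples of symbols), the empty tuple stands for ε, and the name is_cnf is made up for this example. The two test grammars are the CNF and non-CNF examples above.

```python
def is_cnf(grammar, start):
    """Return True if every rule has the form A -> BC, A -> c, or start -> ε.

    If start -> ε is present, the start variable may not occur on any RHS.
    """
    variables = set(grammar)
    has_start_epsilon = () in grammar.get(start, [])
    for head, bodies in grammar.items():
        for rhs in bodies:
            if has_start_epsilon and start in rhs:
                return False  # S -> ε is present, so S is forbidden on the RHS
            if len(rhs) == 2 and all(sym in variables for sym in rhs):
                continue  # A -> BC
            if len(rhs) == 1 and rhs[0] not in variables:
                continue  # A -> c
            if len(rhs) == 0 and head == start:
                continue  # S -> ε
            return False
    return True

cnf = {"S": [("A", "S"), ("a",)], "A": [("S", "A"), ("b",)]}
non_cnf = {"S": [("A", "S"), ("A", "A", "S")], "A": [("S", "A"), ("a", "a")]}

if __name__ == "__main__":
    print(is_cnf(cnf, "S"), is_cnf(non_cnf, "S"))
```

The second grammar fails because S → AAS has three symbols on the right and A → aa has two terminals.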
Theorem
• Any context-free language is generated by
a context-free grammar in Chomsky
normal form
• PROOF IDEA
– Convert a grammar G into CNF
Method
• The conversion to Chomsky Normal Form has
four main steps:
1. Get rid of all ε rules.
2. Get rid of all rules where RHS is one
variable.
3. Replace every rule that is too long by
shorter rules.
4. Move all terminals to rules whose RHS is a
single terminal.
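Steps 3 and 4 can be sketched together as follows. This is an illustrative sketch under stated assumptions: ε-rules and unit rules are taken to be already removed, the grammar is a dict from variables to lists of right-hand-side tuples, and the helper names split_and_lift, T_<terminal> and X are invented here for the fresh variables.

```python
def split_and_lift(grammar):
    """Steps 3 and 4 only: shorten long rules and isolate terminals.

    Assumes ε-rules and unit rules have already been removed.
    """
    variables = set(grammar)
    used = set(grammar) | {s for bodies in grammar.values() for rhs in bodies for s in rhs}
    result = {head: [] for head in grammar}
    lifted = {}  # terminal -> fresh variable that produces exactly that terminal

    def fresh(base):
        # Pick a variable name that clashes with no existing symbol.
        name, n = base, 1
        while name in used:
            n += 1
            name = f"{base}_{n}"
        used.add(name)
        result[name] = []
        return name

    def lift(sym):
        # Step 4: inside a long RHS, replace terminal c by a variable T_c -> c.
        if sym in variables:
            return sym
        if sym not in lifted:
            lifted[sym] = fresh(f"T_{sym}")
            result[lifted[sym]].append((sym,))
        return lifted[sym]

    for head, bodies in grammar.items():
        for rhs in bodies:
            if len(rhs) < 2:
                result[head].append(tuple(rhs))  # A -> c is already fine
                continue
            symbols = [lift(s) for s in rhs]
            current = head
            # Step 3: A -> B1 B2 ... Bk becomes A -> B1 X, X -> B2 ..., and so on.
            while len(symbols) > 2:
                nxt = fresh("X")
                result[current].append((symbols[0], nxt))
                current = nxt
                symbols = symbols[1:]
            result[current].append(tuple(symbols))
    return result

if __name__ == "__main__":
    for head, bodies in split_and_lift({"S": [("a", "S", "b"), ("a", "b")]}).items():
        print(head, "->", " | ".join(" ".join(rhs) for rhs in bodies))
```

Applied to S → aSb | ab, the sketch yields S → T_a X | T_a T_b, X → S T_b, T_a → a, T_b → b.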
Remove ε-rules
ε-rule: X → ε
Nullable variable: Y ⇒* ε
Example: S → aMb
         M → aMb
         M → ε    (an ε-rule; M is a nullable variable)
Remove ε-rules
Replace M → ε:
Before:  S → aMb          After:  S → aMb | ab
         M → aMb                  M → aMb | ab
         M → ε
• Determine the nullable variables (those that generate ε)
• Go through all productions, and for each, omit every
possible subset of nullable variables.
• Delete all productions with empty RHS.
• If the start variable is nullable then create a new start
variable S', and add the rules S' → S | ε to the grammar.
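The bullets above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the dict encoding (variables as keys, right-hand sides as tuples, the empty tuple for ε) and the function names are assumptions, and the new start variable is simply named by appending an apostrophe, assuming that name is not already in use.

```python
from itertools import product

def nullable_variables(grammar):
    """Return the set of variables that can derive ε."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A variable is nullable if some RHS consists only of nullable symbols
            # (this includes the empty RHS).
            if any(all(sym in nullable for sym in rhs) for rhs in bodies):
                nullable.add(head)
                changed = True
    return nullable

def remove_epsilon_rules(grammar, start):
    """Return (new_grammar, new_start) with all ε-rules removed.

    Only a fresh start variable keeps an ε-rule, and only if the start is nullable.
    """
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for rhs in bodies:
            # For each nullable symbol choose keep (True) or omit (False).
            options = [(True, False) if sym in nullable else (True,) for sym in rhs]
            for keep in product(*options):
                candidate = tuple(s for s, k in zip(rhs, keep) if k)
                if candidate and candidate not in new_bodies:  # drop empty RHS
                    new_bodies.append(candidate)
        result[head] = new_bodies
    if start in nullable:
        new_start = start + "'"  # assumed not to clash with an existing variable
        result[new_start] = [(start,), ()]
        return result, new_start
    return result, start

if __name__ == "__main__":
    g = {"S": [("a", "M", "b")], "M": [("a", "M", "b"), ()]}
    print(remove_epsilon_rules(g, "S"))
```

On the example S → aMb, M → aMb | ε this gives S → aMb | ab and M → aMb | ab, and the start variable stays S because S is not nullable.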
Remove unit rules
Unit rule is the rule of the form:
X → Y
where both X and Y are nonterminals
Example:
Unit rule
Remove unit rules
Whenever A ⇒* B using unit rules only, then for every
rule of the form B → u, add the rule A → u, unless this
was a unit rule previously removed.
After replacing A → B:
S → aA | aB
A → a | bb
B → A
B → bb
Remove unit rules
Similarly, we replace the rule B → A.
Remove unit rules
Remove duplicate rules. Result:
S → aA | aB
A → a | bb
B → bb | a
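The unit-rule removal above can be sketched as follows. It is an illustration under assumptions: the grammar is a dict from variables to lists of right-hand-side tuples, the function name remove_unit_rules is invented here, and the starting grammar (S → aA | aB, A → a | B, B → A | bb) is inferred from the result shown above.

```python
def remove_unit_rules(grammar):
    """Return an equivalent grammar with no rules of the form X -> Y."""
    variables = set(grammar)

    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in variables

    result = {}
    for head in grammar:
        # Collect every B with head =>* B using unit rules only.
        # The list grows while it is being traversed (a breadth-first search).
        reachable = [head]
        for v in reachable:
            for rhs in grammar[v]:
                if is_unit(rhs) and rhs[0] not in reachable:
                    reachable.append(rhs[0])
        # For every non-unit rule B -> u, add head -> u, skipping duplicates.
        bodies = []
        for v in reachable:
            for rhs in grammar[v]:
                if not is_unit(rhs) and rhs not in bodies:
                    bodies.append(rhs)
        result[head] = bodies
    return result

if __name__ == "__main__":
    g = {
        "S": [("a", "A"), ("a", "B")],
        "A": [("a",), ("B",)],
        "B": [("A",), ("b", "b")],
    }
    for head, bodies in remove_unit_rules(g).items():
        print(head, "->", " | ".join(" ".join(rhs) for rhs in bodies))
```

Computing the unit-reachable variables first handles the cycle A → B, B → A without re-adding a unit rule that was already removed.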
Convert a CFG to Chomsky Normal Form
• Consider grammar G:
S → aSb | ε
• Convert G to CNF
• Remove ε production
S → aSb | ab
Make new start variable