Context Free Grammar (CFG)
Formalism
Derivations
Left- and Rightmost Derivations
Informal Comments
A context-free grammar is a notation
for describing languages.
It is more powerful than finite automata
or RE’s, but still cannot define all
possible languages.
Useful for nested structures, e.g.,
parentheses in programming languages.
Example: CFG for { 0^n 1^n | n ≥ 1 }
Productions:
S -> 01
S -> 0S1
Basis: 01 is in the language.
Induction: if w is in the language, then
so is 0w1.
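The basis and induction steps translate directly into a recursive membership test. The following is only a minimal sketch; the function name `in_language` is an illustrative choice, not part of the slides.

```python
def in_language(w: str) -> bool:
    """Return True if w is in { 0^n 1^n | n >= 1 }, mirroring the two productions."""
    if w == "01":
        # Basis: S -> 01
        return True
    if len(w) >= 4 and w[0] == "0" and w[-1] == "1":
        # Induction: w = 0 w' 1 is in the language if w' is (S -> 0S1)
        return in_language(w[1:-1])
    return False


candidates = ["01", "0011", "000111", "", "10", "0101", "001"]
print([w for w in candidates if in_language(w)])
# prints ['01', '0011', '000111']
```

The recursion peels one 0 from the front and one 1 from the back at each step, which is exactly the inverse of applying S -> 0S1.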
CFG Formalism
Terminals = symbols of the alphabet of
the language being defined.
Variables = nonterminals = a finite
set of other symbols, each of which
represents a language.
Start symbol = the variable whose
language is the one being defined.
Productions
A production has the form variable ->
string of variables and terminals.
Convention:
A, B, C,… are variables.
a, b, c,… are terminals.
…, X, Y, Z are either terminals or variables.
…, w, x, y, z are strings of terminals only.
α, β, γ,… are strings of terminals and/or
variables.
Example: Formal CFG
Here is a formal CFG for { 0^n 1^n | n ≥ 1 }.
Terminals = {0, 1}.
Variables = {S}.
Start symbol = S.
Productions =
S -> 01
S -> 0S1
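One possible way to write the four components down as data is sketched below. The `CFG` type and the `is_well_formed` helper are hypothetical names introduced for illustration, and the sketch assumes every symbol is a single character.

```python
from typing import NamedTuple


class CFG(NamedTuple):
    terminals: frozenset
    variables: frozenset
    start: str
    productions: dict  # variable -> list of right-hand sides (strings of symbols)


G = CFG(
    terminals=frozenset("01"),
    variables=frozenset("S"),
    start="S",
    productions={"S": ["01", "0S1"]},
)


def is_well_formed(g: CFG) -> bool:
    """Check the basic structural requirements on the four components."""
    symbols = g.terminals | g.variables
    return (
        g.start in g.variables                      # start symbol is a variable
        and g.terminals.isdisjoint(g.variables)     # no symbol plays both roles
        and all(head in g.variables for head in g.productions)
        and all(
            sym in symbols
            for bodies in g.productions.values()
            for body in bodies
            for sym in body
        )
    )


print(is_well_formed(G))  # prints True
```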
Derivations – Intuition
We derive strings in the language of a
CFG by starting with the start symbol,
and repeatedly replacing some variable
A by the right side of one of its
productions.
That is, the “productions for A” are those
that have A on the left side of the ->.
Derivations – Formalism
We say αAβ => αγβ if A -> γ is a
production.
Example: S -> 01; S -> 0S1.
S => 0S1 => 00S11 => 000111.
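A single derivation step can be sketched as a function that replaces one occurrence of a variable by one of its production bodies. The names `PRODUCTIONS` and `one_step` are illustrative assumptions, and symbols are assumed to be single characters.

```python
PRODUCTIONS = {"S": ["01", "0S1"]}


def one_step(form: str, productions=PRODUCTIONS) -> set:
    """Return every string reachable from `form` by exactly one derivation step."""
    results = set()
    for i, symbol in enumerate(form):
        # Only variables have entries in the productions table.
        for body in productions.get(symbol, []):
            results.add(form[:i] + body + form[i + 1:])
    return results


print(sorted(one_step("S")))       # prints ['01', '0S1']
print(sorted(one_step("0S1")))     # prints ['0011', '00S11']
print(sorted(one_step("000111")))  # prints []
```

A string with no variables left has no successors, which is why the last call returns an empty set.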
Example: Iterated Derivation
S -> 01; S -> 0S1.
S => 0S1 => 00S11 => 000111.
Writing =>* for "zero or more derivation steps":
S =>* S; S =>* 0S1; S =>* 00S11;
S =>* 000111.
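The relation =>* can be checked by a breadth-first search over sentential forms. This is a sketch under one stated assumption: no production body is empty, so strings never shrink and any form longer than the target can be discarded. The name `derives` is hypothetical.

```python
from collections import deque

PRODUCTIONS = {"S": ["01", "0S1"]}


def derives(source: str, target: str, productions=PRODUCTIONS) -> bool:
    """Return True if source =>* target (zero or more steps).

    Assumes no production has an empty body, so forms longer than
    the target can never lead back to it and are pruned.
    """
    seen = {source}
    queue = deque([source])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for i, symbol in enumerate(form):
            for body in productions.get(symbol, []):
                nxt = form[:i] + body + form[i + 1:]
                if len(nxt) <= len(target) and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


print(derives("S", "S"))       # prints True (zero steps)
print(derives("S", "00S11"))   # prints True
print(derives("S", "000111"))  # prints True
print(derives("S", "0101"))    # prints False
```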
Context-Free Languages
A language that is defined by some CFG
is called a context-free language.
There are CFL’s that are not regular
languages, such as the example just
given.
But not all languages are CFL’s.
Intuitively: CFL’s can count two things,
not three. For example, { 0^n 1^n 2^n | n ≥ 1 } is not a CFL.
Derivations
A derivation tree or parse tree is an ordered rooted
tree that graphically represents the syntactic
structure of a string derived from a context-free
grammar.
For example, consider the grammar G = (V, T, P, S) with productions
S -> 0B, A -> 1AA | ε,
B -> 0AA
Example
Root vertex: must be labelled by the start symbol.
Interior vertex: labelled by a nonterminal symbol.
Leaves: labelled by terminal symbols or ε.
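The labelling rules can be checked mechanically. The sketch below uses the grammar S -> 0B, A -> 1AA | ε, B -> 0AA, with ε denoting the empty string, and builds one parse tree by hand. The tuple representation and the helper names (`node`, `leaf`, `is_valid`, `tree_yield`) are assumptions made for illustration; the tree shown is one valid example, not necessarily the one drawn on the original slide.

```python
EPS = "ε"
START = "S"
# The empty string stands for an ε-body.
PRODUCTIONS = {"S": ["0B"], "A": ["1AA", ""], "B": ["0AA"]}


def leaf(symbol):
    return (symbol, [])


def node(variable, *children):
    return (variable, list(children))


def is_valid(tree, productions=PRODUCTIONS):
    """Interior vertices must expand by a production; leaves must be terminals or ε."""
    label, children = tree
    if not children:
        return label not in productions
    if label not in productions:
        return False
    body = "".join(child[0] for child in children)
    if body == EPS:
        body = ""
    return body in productions[label] and all(
        is_valid(child, productions) for child in children
    )


def tree_yield(tree):
    """Concatenate the leaf labels from left to right, dropping ε."""
    label, children = tree
    if not children:
        return "" if label == EPS else label
    return "".join(tree_yield(child) for child in children)


a_eps = node("A", leaf(EPS))
tree = node("S", leaf("0"),
            node("B", leaf("0"), node("A", leaf("1"), a_eps, a_eps), a_eps))

print(tree[0] == START, is_valid(tree), tree_yield(tree))
# prints True True 001
```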
Leftmost and Rightmost
Derivations
Derivations allow us to replace any of the
variables in a string.
This leads to many different derivations of the same
string.
By forcing the leftmost variable (or alternatively,
the rightmost variable) to be replaced, we avoid
these “distinctions without a difference.”
For example, consider generating the string aabaa
from the grammar S -> aAS | aSS | ε, A -> SbA | ba.
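To make the leftmost/rightmost restriction concrete, the sketch below replays a list of production bodies against either the leftmost or the rightmost variable. The helper `derive` and the two particular sequences of choices are illustrative assumptions; other derivations of aabaa exist in this grammar, and these are not claimed to be the ones on the original slide.

```python
# The empty string stands for an ε-body.
PRODUCTIONS = {"S": ["aAS", "aSS", ""], "A": ["SbA", "ba"]}


def derive(start, bodies, leftmost=True, productions=PRODUCTIONS):
    """Apply each body in turn to the leftmost (or rightmost) variable."""
    form = start
    steps = [form]
    for body in bodies:
        positions = [i for i, sym in enumerate(form) if sym in productions]
        if not positions:
            raise ValueError("no variable left to replace")
        i = positions[0] if leftmost else positions[-1]
        if body not in productions[form[i]]:
            raise ValueError(f"{form[i]} -> {body!r} is not a production")
        form = form[:i] + body + form[i + 1:]
        steps.append(form)
    return steps


left = derive("S", ["aSS", "aAS", "ba", "", "aSS", "", ""], leftmost=True)
right = derive("S", ["aSS", "aSS", "", "", "aAS", "", "ba"], leftmost=False)

print(" => ".join(left))
# prints S => aSS => aaASS => aabaSS => aabaS => aabaaSS => aabaaS => aabaa
print(" => ".join(right))
# prints S => aSS => aSaSS => aSaS => aSa => aaASa => aaAa => aabaa
```

Both sequences correspond to the same parse tree; only the order in which variables are expanded differs.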
Ambiguous Grammar
A grammar is ambiguous if it can generate
two different parse trees for one string.
Ambiguous grammars can cause inconsistency
in parsing.
Example
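S -> S+S | S*S | a | b
The string a+a*b has two different leftmost derivations:
S => S+S => a+S => a+S*S => a+a*S => a+a*b
S => S*S => S+S*S => a+S*S => a+a*S => a+a*b
The first groups the string as a+(a*b), the second as (a+a)*b,
so the grammar is ambiguous.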
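Ambiguity of this grammar on a given string can be checked by counting parse trees. The following is a sketch specific to S -> S+S | S*S | a | b, not a general parser; the name `count_parse_trees` is hypothetical and the input is assumed to contain no spaces.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parse_trees(w: str) -> int:
    """Count parse trees for w under S -> S+S | S*S | a | b."""
    if w in ("a", "b"):
        return 1  # S -> a or S -> b
    total = 0
    for i, ch in enumerate(w):
        if ch in "+*":
            # Treat this operator as the root: S -> S+S or S -> S*S.
            total += count_parse_trees(w[:i]) * count_parse_trees(w[i + 1:])
    return total


print(count_parse_trees("a+a*b"))  # prints 2
print(count_parse_trees("a"))      # prints 1
print(count_parse_trees("a+"))     # prints 0
```

A count above 1 means the string has more than one parse tree, which is the definition of ambiguity used here.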
Example: Ambiguous Grammar
E -> E+E
E -> E-E
E -> E*E
E -> E/E
E -> id
[Figure: parse trees built from E, *, +, id1, and id3]