CFGModule PDF
1 Mathematical definition
A Context Free Grammar is described by a 4-tuple (V, T, P, S), where V is the set of variables,
T is the set of terminals, P is the set of productions, and S is the start symbol
of the grammar.
A production is basically a substitution rule. It consists of a symbol and a string separated
by an arrow. The symbol is called a variable. The string on the right side of the arrow
consists of some variables and some other symbols that are called terminals. A terminal is
not allowed to be on the left side of an arrow in any of the rules. The terminals in a context
free grammar are analogous to the input alphabet of an automaton.
One variable is called the start variable. That variable is usually denoted by S and,
when the production rules are listed, a rule with that variable on the left side is usually listed
first.
In the example, you will see the top line has the production rule S → 0S1. S is a variable;
0 and 1 are terminals. Note how 0 and 1 never appear on the left side of the
rules.
So in our example V = {S}, T = {0, 1} and the start symbol is S.
This is clearly indicated in JFLAP as well.
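The 4-tuple can also be written down directly as data. The following is a minimal Python sketch, not part of JFLAP: the names V, T, P, S mirror the definition above, the function is_valid_cfg is a hypothetical helper, every symbol is assumed to be a single character, and the empty string ε is written as "".

```python
# Hypothetical encoding of the example grammar as a 4-tuple (V, T, P, S).
# Assumption: every symbol is one character; "" stands for the empty string.
V = {"S"}                      # variables
T = {"0", "1"}                 # terminals
P = [("S", "0S1"), ("S", "")]  # productions as (left side, right side)
S = "S"                        # start symbol


def is_valid_cfg(variables, terminals, productions, start):
    """Check the structural conditions stated in the definition."""
    if variables & terminals:        # a symbol cannot be both kinds
        return False
    if start not in variables:       # the start symbol must be a variable
        return False
    for left, right in productions:
        if left not in variables:    # a terminal may not be a left side
            return False
        if any(sym not in variables | terminals for sym in right):
            return False             # right side uses an unknown symbol
    return True


print(is_valid_cfg(V, T, P, S))
```

A rule such as ("0", "S") would be rejected by this check, because the terminal 0 appears on the left side.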
2 How a context free grammar produces strings
As mentioned before, a production rule is basically a substitution rule. The variable on the
left side can be substituted with the string that is found on the right side.
To produce a string, you always begin with the start symbol and then follow production
rules.
In our example, if we use rule 2 (S → ε), it is easy to see how this grammar produces the empty
string ε.
For a more interesting string, let us use the other rule. Again, we begin with S. S gets
replaced with 0S1 by using the first rule. If we apply this rule three times in total, we get the string
000S111. To stop the production process, we use S → ε and end up with the string
000111.
The notation used for this derivation is S ⇒ 0S1 ⇒ 00S11 ⇒ 000S111 ⇒ 000111.
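The same sequence of substitutions can be replayed in a few lines of Python. This is only an illustrative sketch: apply_rule and derive are hypothetical helper names, rules are (variable, replacement) pairs, and the empty string ε is written as "".

```python
def apply_rule(sentential_form, left, right):
    """Replace the leftmost occurrence of the variable `left` with `right`."""
    if left not in sentential_form:
        raise ValueError(f"{left} does not occur in {sentential_form!r}")
    return sentential_form.replace(left, right, 1)


def derive(start, steps):
    """Return every string visited while applying `steps` in order."""
    forms = [start]
    for left, right in steps:
        forms.append(apply_rule(forms[-1], left, right))
    return forms


# Rule S -> 0S1 three times, then S -> epsilon (written as "").
steps = [("S", "0S1")] * 3 + [("S", "")]
forms = derive("S", steps)
print(" => ".join(forms))
# S => 0S1 => 00S11 => 000S111 => 000111
```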
By trying out a few more examples of such substitutions, it should be clear that this CFG
produces strings of the form: a certain number of 0s followed by the same number of 1s, which
is exactly the language L1. We say that the context free grammar generates the language
L1.
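Trying out more substitutions can be automated. The sketch below assumes the example grammar S → 0S1, S → ε (with ε written as "") and a hypothetical helper generated_strings; it collects every terminal string up to a given length and compares the result with the strings made of n 0s followed by n 1s.

```python
from collections import deque

# Example grammar; "" stands for the empty string.
RULES = [("S", "0S1"), ("S", "")]


def generated_strings(max_len):
    """Collect every terminal string of length <= max_len derivable from S."""
    results = set()
    queue = deque(["S"])
    seen = {"S"}
    while queue:
        form = queue.popleft()
        if "S" not in form:          # no variable left: a finished string
            results.add(form)
            continue
        for left, right in RULES:
            new_form = form.replace(left, right, 1)
            # Terminals never disappear, so overly long forms can be dropped.
            too_long = len(new_form.replace("S", "")) > max_len
            if not too_long and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return results


strings = generated_strings(6)
expected = {"0" * n + "1" * n for n in range(4)}
print(strings == expected)
```

This is evidence for small lengths only, not a proof that the grammar generates L1.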
Click on step again and see that we get 00S11 as a result of using the rule S → 0S1.
Finally, clicking step once more gives us the string 0011 by using the rule S → ε.
On the other hand, if we try a string like 00111, we find that at some point the current derivations
become the empty collection [], meaning that the input string cannot possibly be
derived.
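The idea behind a brute force parse can be sketched as a level-by-level search. The code below is an assumption-laden illustration, not JFLAP's actual implementation: brute_force_parse is a hypothetical function, symbols are single characters, ε is written as "", and the pruning is only guaranteed to terminate for grammars like the example, where every growing rule adds terminals.

```python
def brute_force_parse(rules, variables, start, target):
    """Return True if `target` can be derived from `start`, else False."""
    frontier = [start]               # the current derivations
    seen = {start}
    while frontier:                  # empty frontier: nothing left to try
        next_frontier = []
        for form in frontier:
            if form == target:
                return True
            # Position of the leftmost variable, if any.
            idx = next((i for i, ch in enumerate(form) if ch in variables), None)
            if idx is None:
                continue             # finished string that is not the target
            for left, right in rules:
                if left != form[idx]:
                    continue
                new_form = form[:idx] + right + form[idx + 1:]
                terminal_count = sum(ch not in variables for ch in new_form)
                if terminal_count > len(target):
                    continue         # already longer than the target
                if new_form not in seen:
                    seen.add(new_form)
                    next_frontier.append(new_form)
        frontier = next_frontier
    return False


RULES = [("S", "0S1"), ("S", "")]
print(brute_force_parse(RULES, {"S"}, "S", "0011"))   # True
print(brute_force_parse(RULES, {"S"}, "S", "00111"))  # False
```

For 00111 the frontier shrinks to an empty list after three rounds, which corresponds to the empty collection [] described above.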
generating strings of the form 1^n 0^n.
Try and complete the rules yourself and then enter them into JFLAP.
The complete solution is provided in CFGModuleExample2.jflap.
To convince yourself that this solution actually works, try using the brute force parse to
see whether or not each of the following strings is accepted: 0011, 1100, 10, 101, 010011.
3. Given two context free grammars generating languages L1 and L2, can you always
make a context free grammar that generates the union of the two languages?
7 Answers
1. If you forget S → B, there is no way to generate the strings that begin with a block of
1s followed by 0s. In this case, the rules that have the variable B on the left side are
essentially unusable (unreachable).
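3. Yes. To create a CFG that generates the union of the two languages, keep all the rules of
both grammars (renaming variables if necessary so that the two grammars share none), create a
new start symbol S, and add two rules S → A and S → B, assuming that A and B are the start
symbols of the two grammars.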
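The union construction in answer 3 is mechanical enough to sketch in Python. Everything here is illustrative: union_grammar is a hypothetical function, and the two input grammars (A → 0A1, A → ε and B → 1B0, B → ε, with ε written as "") are assumed stand-ins for grammars generating strings of 0s followed by 1s and strings of 1s followed by 0s.

```python
def union_grammar(rules_a, start_a, rules_b, start_b, new_start="S"):
    """Build rules for the union: new_start -> start_a and new_start -> start_b."""
    vars_a = {left for left, _ in rules_a}
    vars_b = {left for left, _ in rules_b}
    # The construction only works if no variable name is shared.
    if vars_a & vars_b or new_start in vars_a | vars_b:
        raise ValueError("rename variables so that the grammars share none")
    return [(new_start, start_a), (new_start, start_b)] + rules_a + rules_b


# Assumed example grammars; "" stands for the empty string.
rules_a = [("A", "0A1"), ("A", "")]
rules_b = [("B", "1B0"), ("B", "")]

combined = union_grammar(rules_a, "A", rules_b, "B")
for left, right in combined:
    print(left, "->", right if right else "ε")
```

If the two grammars used the same variable name, the check raises an error instead of silently mixing their rules, which is why the renaming step matters.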