3-5 Simplified Note
Some RE Examples
Regular Expression | Regular Set
(0 + 10*) | L = {0, 1, 10, 100, 1000, 10000, …}
(0*10*) | L = {1, 01, 10, 010, 0010, …}
(0 + ε)(1 + ε) | L = {ε, 0, 1, 01}
(a+b)* | Set of strings of a’s and b’s of any length, including the null string. So L = {ε, a, b, aa, ab, bb, ba, aaa, …}
(a+b)*abb | Set of strings of a’s and b’s ending with the string abb. So L = {abb, aabb, babb, aaabb, ababb, …}
(11)* | Set of strings consisting of an even number of 1’s, including the empty string. So L = {ε, 11, 1111, 111111, …}
(aa)*(bb)*b | Set of strings consisting of an even number of a’s followed by an odd number of b’s. So L = {b, aab, aabbb, aabbbbb, aaaab, aaaabbb, …}
(aa + ab + ba + bb)* | Strings of a’s and b’s of even length, obtained by concatenating any combination of the strings aa, ab, ba and bb, including null. So L = {ε, aa, ab, ba, bb, aaab, aaba, …}
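The examples above can be spot-checked with Python's re module. This is only a sketch: the Python patterns are hand translations of the textbook notation (the union '+' becomes '|', and a factor such as (0 + ε) becomes an optional symbol), and the helper name in_language is made up for illustration.

```python
import re

# Textbook expression -> hand-translated Python pattern (an assumption of this sketch).
EXAMPLES = {
    "(0 + 10*)": r"0|10*",
    "(0*10*)": r"0*10*",
    "(0 + ε)(1 + ε)": r"0?1?",
    "(a+b)*abb": r"(a|b)*abb",
    "(11)*": r"(11)*",
    "(aa)*(bb)*b": r"(aa)*(bb)*b",
    "(aa + ab + ba + bb)*": r"(aa|ab|ba|bb)*",
}


def in_language(expr: str, s: str) -> bool:
    """Return True if the whole string s is matched by the pattern for expr."""
    return re.fullmatch(EXAMPLES[expr], s) is not None


if __name__ == "__main__":
    for expr in EXAMPLES:
        print(expr, in_language(expr, "aabbb"))
```

re.fullmatch is used rather than re.match so that the entire string, not just a prefix, has to belong to the regular set.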
Regular Sets
Any set that represents the value of the Regular Expression is called a Regular Set.
Properties of Regular Sets
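Property 1. The union of two regular sets is regular.
Proof −
Let us take two regular expressions
RE1 = a(aa)* and RE2 = (aa)*
So, L1 = {a, aaa, aaaaa, ....} (Strings of odd length excluding Null)
and L2 = {ε, aa, aaaa, aaaaaa, ....} (Strings of even length including Null)
L1 ∪ L2 = {ε, a, aa, aaa, aaaa, aaaaa, ....} (Strings of all possible lengths including Null)
RE (L1 ∪ L2) = a* which is a regular expression itself.
Hence, proved.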
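Property 2. The intersection of two regular sets is regular.
Proof −
Let us take two regular expressions
RE1 = a(a*) and RE2 = (aa)*
So, L1 = {a, aa, aaa, aaaa, ....} (Strings of all possible lengths excluding Null)
and L2 = {ε, aa, aaaa, aaaaaa, ....} (Strings of even length including Null)
L1 ∩ L2 = {aa, aaaa, aaaaaa, ....} (Strings of even length excluding Null)
RE (L1 ∩ L2) = aa(aa)* which is a regular expression itself.
Hence, proved.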
Property 3. The complement of a regular set is regular.
Proof −
Let us take a regular expression −
RE = (aa)*
So, L = {ε, aa, aaaa, aaaaaa, .......} (Strings of even length including Null)
The complement of L is the set of all strings over {a} that are not in L.
So, L’ = {a, aaa, aaaaa, .....} (Strings of odd length excluding Null)
RE (L’) = a(aa)* which is a regular expression itself.
Hence, proved.
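Property 4. The difference of two regular sets is regular.
Proof −
Let us take two regular expressions −
RE1 = a (a*) and RE2 = (aa)*
So, L1 = {a, aa, aaa, aaaa, ....} (Strings of all possible lengths excluding Null)
and L2 = {ε, aa, aaaa, aaaaaa, ....} (Strings of even length including Null)
L1 − L2 = {a, aaa, aaaaa, ....} (Strings of odd length excluding Null)
RE (L1 − L2) = a(aa)* which is a regular expression itself.
Hence, proved.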
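Property 5. The reversal of a regular set is regular.
The concatenation of two regular sets is also regular.
Proof −
Let us take two regular expressions
RE1 = (0 + 1)*0 and RE2 = 01(0 + 1)*
Here, L1 = {0, 00, 10, 000, 010, ......} (Set of strings ending in 0)
and L2 = {01, 010, 011, ......} (Set of strings beginning with 01)
Then, L1 L2 = {001,0010,0011,0001,00010,00011,1001,10010,.............}
Set of strings containing 001 as a substring, which can be represented by an RE − (0 + 1)*001(0 + 1)*
Hence, proved.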
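The union, intersection, complement and difference examples over the alphabet {a} can be checked mechanically on a bounded universe. The sketch below is not a proof: it assumes a length bound N (a name chosen here) and compares finite sets of strings only.

```python
N = 8  # assumption: only strings of length <= N are compared

# All strings over {a} up to length N.
universe = {"a" * k for k in range(N + 1)}

odd = {s for s in universe if len(s) % 2 == 1}    # a(aa)*
even = {s for s in universe if len(s) % 2 == 0}   # (aa)*
nonempty = {s for s in universe if len(s) >= 1}   # a(a*)

union = odd | even               # expected: a*
intersection = nonempty & even   # expected: aa(aa)*
complement = universe - even     # expected: a(aa)*
difference = nonempty - even     # expected: a(aa)*

if __name__ == "__main__":
    print(sorted(intersection, key=len))
```

Within the bound, each result coincides with the set described by the regular expression named in the comment, which is what the examples claim.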
Arden's Theorem
In order to find out a regular expression of a Finite Automaton, we use Arden’s Theorem along with
the properties of regular expressions.
Statement −
Let P and Q be two regular expressions.
If P does not contain null string, then R = Q + RP has a unique solution that is R = QP*
Proof −
R = Q + (Q + RP)P [After putting the value R = Q + RP]
= Q + QP + RP^2
When we put the value of R recursively again and again, we get the following equation −
R = Q + QP + QP^2 + QP^3 + …
R = Q (ε + P + P^2 + P^3 + …)
R = QP* [As P* represents (ε + P + P^2 + P^3 + …)]
Hence, proved.
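As a sanity check of the statement, the identity R = Q + RP can be tested for R = QP* on finite, length-bounded languages. The sketch below is an illustration, not a proof; the sets Q and P, the bound N, and the helper names concat and star are all assumptions made for this example, and P is chosen so that it does not contain the null string.

```python
def concat(A, B, n):
    """Concatenation of two finite languages, keeping strings of length <= n."""
    return {x + y for x in A for y in B if len(x + y) <= n}


def star(P, n):
    """All strings of P* of length <= n (terminates because the bound is finite)."""
    result = {""}
    frontier = {""}
    while frontier:
        frontier = concat(frontier, P, n) - result
        result |= frontier
    return result


N = 6               # length bound (assumption)
Q = {"b"}           # example language for Q
P = {"a", "ab"}     # example language for P; it does not contain the null string

R = concat(Q, star(P, N), N)   # R = QP*, truncated to length <= N
rhs = Q | concat(R, P, N)      # Q + RP, truncated to length <= N

if __name__ == "__main__":
    print(R == rhs)
```

Because P has no null string, every string of QP* other than those in Q is a strictly shorter string of QP* followed by a string of P, so the two bounded sets agree exactly.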
Assumptions for Applying Arden’s Theorem
• The transition diagram must not have NULL transitions
• It must have only one initial state
Method
Step 1 − Create equations of the following form for all the states of the DFA having n states with initial state q1.
q1 = q1R11 + q2R21 + … + qnRn1 + ε
q2 = q1R12 + q2R22 + … + qnRn2
…………………………
qn = q1R1n + q2R2n + … + qnRnn
Rij represents the set of labels of edges from qi to qj; if no such edge exists, then Rij = ∅
Step 2 − Solve these equations to get the equation for the final state in terms of Rij
Problem
Construct a regular expression corresponding to the automata given below −
Solution −
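Here the initial state and final state is q1.
The equations for the three states q1, q2, and q3 are as follows −
q1 = q1a + q3a + ε (the ε term is because q1 is the initial state)
q2 = q1b + q2b + q3b
q3 = q2a
Now, we will solve these three equations −
q2 = q1b + q2b + q3b
= q1b + q2b + (q2a)b (substituting the value of q3)
= q1b + q2(b + ab)
= q1b(b + ab)* (applying Arden’s Theorem)
q1 = q1a + q3a + ε
= q1a + q2aa + ε (substituting the value of q3)
= q1a + q1b(b + ab)*aa + ε (substituting the value of q2)
= q1(a + b(b + ab)*aa) + ε
= ε(a + b(b + ab)*aa)* (applying Arden’s Theorem)
= (a + b(b + ab)*aa)*
Hence, the regular expression is (a + b(b + ab)*aa)*.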
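Solution −
Here the initial state is q1 and the final state is q2.
The equations for the states are −
q1 = q10 + ε
q2 = q11 + q20
q3 = q21 + q30 + q31
Applying Arden’s Theorem to the first equation, q1 = ε0*
So, q1 = 0*
q2 = 0*1 + q20
So, q2 = 0*1(0)* [By Arden’s Theorem]
Hence, the regular expression is 0*10*.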
Construction of an FA from an RE
We can use Thompson's Construction to find out a Finite Automaton from a Regular Expression.
We will reduce the regular expression into the smallest regular expressions, convert these to an NFA,
and finally to a DFA.
Some basic regular expressions are the following −
Case 1 − For a regular expression ‘a’, we can construct the following FA −
Method
Step 1 Construct an NFA with Null moves from the given regular expression.
Step 2 Remove Null transition from the NFA and convert it into its equivalent DFA.
Problem
Convert the following RE into its equivalent DFA − 1 (0 + 1)* 0
Solution
We will concatenate three expressions "1", "(0 + 1)*" and "0"
Now we will remove the ε transitions. After we remove the ε transitions from the NDFA, we get the
following −
It is an NDFA corresponding to the RE − 1 (0 + 1)* 0. If you want to convert it into a DFA, simply
apply the method of converting NDFA to DFA discussed in Chapter 1.
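The NDFA-to-DFA step can be sketched with the subset construction. The diagram is not reproduced in these notes, so the three-state NDFA below (q0, q1, qf) is an assumed machine for 1 (0 + 1)* 0 and not necessarily the one in the figure; the function names are also chosen for illustration.

```python
from collections import deque

# Assumed ε-free NDFA for 1 (0 + 1)* 0: (state, symbol) -> set of next states.
NFA = {
    ("q0", "1"): {"q1"},
    ("q1", "0"): {"q1", "qf"},
    ("q1", "1"): {"q1"},
}
START, FINALS, ALPHABET = "q0", {"qf"}, "01"


def subset_construction():
    """Build a DFA whose states are sets of NDFA states."""
    start = frozenset({START})
    dfa, queue = {}, deque([start])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue
        dfa[current] = {}
        for symbol in ALPHABET:
            target = frozenset(t for q in current for t in NFA.get((q, symbol), ()))
            dfa[current][symbol] = target   # the empty set acts as the dead state
            queue.append(target)
    return start, dfa


def accepts(s: str) -> bool:
    """Run the constructed DFA on a string over {0, 1}."""
    state, dfa = subset_construction()
    for symbol in s:
        state = dfa[state][symbol]
    return bool(state & FINALS)


if __name__ == "__main__":
    print(accepts("1010"), accepts("101"))
```

A DFA state is accepting when it contains at least one accepting NDFA state, which is why accepts tests the intersection with FINALS.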
Solution
Step 1 −
Here the ε transition is between q1 and q2, so let q1 be X and qf be Y.
Step 2 −
Now we will Copy all these edges from q1 without changing the edges from qf and get the
following FA −
Step 3 −
Here q1 is an initial state, so we make qf also an initial state.
So the FA becomes −
Step 4 −
Here qf is a final state, so we make q1 also a final state.
So the FA becomes −
Ambiguity in Context-Free Grammars
If a context free grammar G has more than one derivation tree for some string w ∈ L(G), it is
called an ambiguous grammar. There exist multiple right-most or left-most derivations for some
string generated from that grammar.
Problem
Check whether the grammar G with production rules −
X → X+X | X*X |X| a
is ambiguous or not.
Solution
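Let’s find out the derivation tree for the string "a+a*a". It has two leftmost derivations.
Derivation 1 − X → X+X → a+X → a+X*X → a+a*X → a+a*a
Derivation 2 − X → X*X → X+X*X → a+X*X → a+a*X → a+a*a
Since there are two parse trees for the single string "a+a*a", the grammar G is ambiguous.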
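The ambiguity can also be confirmed by counting parse trees. The sketch below is specific to the productions X → X+X | X*X | a; it deliberately ignores the unit production X → X, which would add infinitely many trivial trees, and the function name count_trees is made up for this example.

```python
from functools import lru_cache


def count_trees(s: str) -> int:
    """Count parse trees of s under X -> X+X | X*X | a (X -> X is ignored)."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of trees deriving the substring s[i:j] from X.
        total = 1 if s[i:j] == "a" else 0
        for k in range(i + 1, j - 1):
            if s[k] in "+*":
                # The operator at position k is the root of the tree.
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(s))


if __name__ == "__main__":
    print(count_trees("a+a*a"))
```

Any string with a count greater than one witnesses the ambiguity; for "a+a*a" the two trees correspond to taking '+' or '*' as the root operator.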
In a CFG, it may happen that not all the production rules and symbols are needed for the derivation
of strings. Besides, there may be some null productions and unit productions. Elimination of these
productions and symbols is called simplification of CFGs. Simplification essentially comprises
the following steps −
• Reduction of CFG
• Removal of Unit Productions
• Removal of Null Productions
Reduction of CFG
CFGs are reduced in two phases −
Phase 1 − Derivation of an equivalent grammar, G’, from the CFG, G, such that each variable
derives some terminal string.
Derivation Procedure −
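Step 1 − Include all symbols, W1, that derive some terminal and initialize i=1.
Step 2 − Include all symbols, Wi+1, that derive Wi.
Step 3 − Increment i and repeat Step 2, until Wi+1 = Wi.
Step 4 − Include all production rules that have Wi in it.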
Phase 2 − Derivation of an equivalent grammar, G”, from the CFG, G’, such that each symbol
appears in a sentential form.
Derivation Procedure −
Step 1 − Include the start symbol in Y1 and initialize i = 1.
Step 2 − Include all symbols, Yi+1, that can be derived from Yi and include all production rules that
have been applied.
Step 3 − Increment i and repeat Step 2, until Yi+1 = Yi.
Problem
Find a reduced grammar equivalent to the grammar G, having production rules, P: S → AC | B, A
→ a, C → c | BC, E → aA | e
Solution
Phase 1 −
T = { a, c, e }
W1 = { A, C, E } from rules A → a, C → c and E → aA
W2 = { A, C, E } U { S } from rule S → AC
W3 = { A, C, E, S } U ∅
G’ = { { A, C, E, S }, { a, c, e }, P, {S}}
where P: S → AC, A → a, C → c , E → aA | e
Phase 2 −
Y1 = { S }
Y2 = { S, A, C } from rule S → AC
Y3 = { S, A, C, a, c } from rules A → a and C → c
Y4 = { S, A, C, a, c }
Since Y3 = Y4, the procedure stops.
G” = { { A, C, S }, { a, c }, P, {S}}
where P: S → AC, A → a, C → c
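The two phases can be sketched in a few lines. The encoding is an assumption of this sketch: a grammar is a dict from a variable to a list of bodies, upper-case letters are variables, lower-case letters are terminals, and the function name reduce_grammar is chosen here.

```python
GRAMMAR = {
    "S": ["AC", "B"],
    "A": ["a"],
    "C": ["c", "BC"],
    "E": ["aA", "e"],
}


def reduce_grammar(grammar, start="S"):
    # Phase 1: keep only variables that derive some terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in generating:
                continue
            if any(all(s.islower() or s in generating for s in b) for b in bodies):
                generating.add(head)
                changed = True
    phase1 = {
        head: [b for b in bodies if all(s.islower() or s in generating for s in b)]
        for head, bodies in grammar.items()
        if head in generating
    }
    # Phase 2: keep only variables reachable from the start symbol.
    reachable, stack = {start}, [start]
    while stack:
        head = stack.pop()
        for body in phase1.get(head, []):
            for sym in body:
                if sym.isupper() and sym not in reachable:
                    reachable.add(sym)
                    stack.append(sym)
    return {h: b for h, b in phase1.items() if h in reachable}


if __name__ == "__main__":
    print(reduce_grammar(GRAMMAR))
```

On the grammar of the problem, Phase 1 drops B (it has no productions) together with every body that mentions it, and Phase 2 drops E because it cannot be reached from S.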
Removal Procedure −
Step 1 − To remove A → B, add production A → x to the grammar rule whenever B → x occurs in
the grammar. [x ∈ Terminal, x can be Null]
Step 2 − Delete A → B from the grammar.
Step 3 − Repeat from step 1 until all unit productions are removed.
Problem
Remove unit production from the following −
S → XY, X → a, Y → Z | b, Z → M, M → N, N → a
Solution −
There are 3 unit productions in the grammar −
Y → Z, Z → M, and M → N
At first, we will remove M → N.
As N → a, we add M → a, and M → N is removed.
The production set becomes
S → XY, X → a, Y → Z | b, Z → M, M → a, N → a
Now we will remove Z → M.
As M → a, we add Z→ a, and Z → M is removed.
The production set becomes
S → XY, X → a, Y → Z | b, Z → a, M → a, N → a
Now we will remove Y → Z.
As Z → a, we add Y→ a, and Y → Z is removed.
The production set becomes
S → XY, X → a, Y → a | b, Z → a, M → a, N → a
Now Z, M, and N are unreachable, hence we can remove those.
The final CFG is unit production free −
S → XY, X → a, Y → a | b
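The same result can be computed programmatically. Note that the sketch below uses a slightly different but equivalent technique from the step-by-step substitution above: for each variable it collects every variable reachable through unit productions alone and copies their non-unit bodies. The dict encoding and the function name remove_unit_productions are assumptions of this sketch.

```python
GRAMMAR = {
    "S": ["XY"],
    "X": ["a"],
    "Y": ["Z", "b"],
    "Z": ["M"],
    "M": ["N"],
    "N": ["a"],
}


def remove_unit_productions(grammar, start="S"):
    variables = set(grammar)
    result = {}
    for head in grammar:
        # Variables reachable from head using unit productions only.
        closure, stack = {head}, [head]
        while stack:
            v = stack.pop()
            for body in grammar[v]:
                if body in variables and body not in closure:
                    closure.add(body)
                    stack.append(body)
        # Copy every non-unit body of the variables in the closure.
        bodies = []
        for v in sorted(closure):
            for body in grammar[v]:
                if body not in variables and body not in bodies:
                    bodies.append(body)
        result[head] = bodies
    # Drop variables that are no longer reachable from the start symbol.
    reachable, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for body in result[v]:
            for sym in body:
                if sym in variables and sym not in reachable:
                    reachable.add(sym)
                    stack.append(sym)
    return {h: b for h, b in result.items() if h in reachable}


if __name__ == "__main__":
    print(remove_unit_productions(GRAMMAR))
```

A body counts as a unit production here when the whole body is a single variable name, which matches the single-letter variables used in the problem.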
Removal Procedure
Step 1 − Find out nullable non-terminal variables which derive ε.
Step 2 − For each production A → a, construct all productions A → x where x is obtained from ‘a’
by removing one or multiple non-terminals from Step 1.
Step 3 − Combine the original productions with the result of step 2 and remove ε - productions.
Problem
Remove null production from the following −
S → ASA | aB | b, A → B, B → b | ε
Solution −
There are two nullable variables − A and B
At first, we will remove B → ε.
After removing B → ε, the production set becomes −
S → ASA | aB | b | a, A → B | b | ε, B → b
Now we will remove A → ε.
After removing A → ε, the production set becomes −
S→ASA | aB | b | a | SA | AS | S, A → B| b, B → b
This is the final production set without null productions.
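The removal procedure can be sketched as follows. The encoding is an assumption of this sketch: bodies are strings of single-character symbols, the empty string stands for ε, and the function name remove_null_productions is chosen here. The sketch yields A → B only; the extra A → b listed above is already derivable through A → B and B → b.

```python
from itertools import product

GRAMMAR = {
    "S": ["ASA", "aB", "b"],
    "A": ["B"],
    "B": ["b", ""],   # "" stands for ε
}


def remove_null_productions(grammar):
    # Step 1: find the nullable variables.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in nullable and any(
                all(s in nullable for s in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    # Steps 2 and 3: drop nullable symbols in every combination, discard ε bodies.
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate:
                    new_bodies.add(candidate)
        result[head] = new_bodies
    return nullable, result


if __name__ == "__main__":
    print(remove_null_productions(GRAMMAR))
```

The unit production S → S survives this step, just as in the production set above; it is removed later, when unit productions are eliminated.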
Problem
Convert the following CFG into CNF
S → ASA | aB, A → B | S, B → b | ε
Solution
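(1) Since S appears on the R.H.S., we add a new start symbol S0 with the production S0 → S, and the
production set becomes −
S0 → S, S → ASA | aB, A → B | S, B → b | ε
(2) Removing the null productions B → ε and A → ε, the production set becomes −
S0 → S, S → ASA | aB | a | AS | SA | S, A → B | S, B → b
(3) Now we will remove the unit productions. After removing S → S and S0 → S, the production set
becomes −
S0 → ASA | aB | a | AS | SA, S → ASA | aB | a | AS | SA
A → B | S, B → b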
After removing A→ B, the production set becomes −
S0 → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA
A→S|b
B→b
After removing A→ S, the production set becomes −
S0 → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA
A → b |ASA | aB | a | AS | SA, B → b
(4) Now we will find the productions with more than two variables on the R.H.S.
Here, S0 → ASA, S → ASA, and A → ASA have more than two non-terminals on the R.H.S.
Hence we introduce a new variable X with X → SA and get −
S0 → AX | aB | a | AS | SA
S → AX | aB | a | AS | SA
A → b | AX | aB | a | AS | SA
B → b
X → SA
(5) We have to change the productions S0 → aB, S → aB, A → aB, where a terminal appears together
with a non-terminal. Introducing Y → a, we get the following final production set, which is in CNF −
S0 → AX | YB | a | AS | SA
S → AX | YB | a | AS | SA
A → b | AX | YB | a | AS | SA
B → b
X → SA
Y → a
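A CFG is in Greibach Normal Form (GNF) if its productions are of the following forms −
A → b
A → bD1…Dn
S → ε
where A, D1,....,Dn are non-terminals and b is a terminal.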
Problem
Convert the following CFG into GNF
S → XY | Xn | p
X → mX | m
Y → Xn | o
Solution
Here, S does not appear on the right side of any production and there are no unit or null productions
in the production rule set. So, we can skip Step 1 to Step 3.
Step 4
Now after replacing
X in S → XY | Xn | p
with
mX | m
we obtain
S → mXY | mY | mXn | mn | p.
And after replacing
X in Y → Xn | o
with
mX | m
we obtain
Y → mXn | mn | o.
Union
Let L1 and L2 be two context free languages. Then L1 ∪ L2 is also context free.
Example
Let L1 = { a^n b^n , n > 0}. Corresponding grammar G1 will have P: S1 → aAb|ab
Concatenation
If L1 and L2 are context free languages, then L1L2 is also context free.
Example
Concatenation of the languages L1 and L2, L = L1L2 = { a^n b^n c^m d^m }
Kleene Star
If L is a context free language, then L* is also context free.
Example
Let L = { a^n b^n , n ≥ 0}. Corresponding grammar G will have P: S → aAb| ε
This means at state q1, if we encounter an input string ‘a’ and top symbol of the stack is ‘b’, then
we pop ‘b’, push ‘c’ on top of the stack and move to state q2.
Turnstile Notation
The "turnstile" notation is used for connecting pairs of ID's that represent one or many moves of a
PDA. The process of transition is denoted by the turnstile symbol "⊢".
Consider a PDA (Q, ∑, S, δ, q0, I, F). A transition can be mathematically represented by the
following turnstile notation −
(p, aw, Tβ) ⊢ (q, w, αβ)
This implies that while taking a transition from state p to state q, the input symbol ‘a’ is consumed,
and the top of the stack ‘T’ is replaced by a new string ‘α’.
Note − If we want zero or more moves of a PDA, we have to use the symbol (⊢*) for it.
Example
Construct a PDA that accepts L = {0^n 1^n | n ≥ 0}
Solution
Example
Construct a PDA that accepts L = { ww^R | w ∈ (a+b)* }
Solution
Initially we put a special symbol ‘$’ into the empty stack. At state q2, w is being read and pushed
onto the stack. In state q3, each symbol is popped when it matches the input. If any other input is
given, the PDA will go to a dead state. When we reach the special symbol ‘$’, we go to the accepting
state q4.
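The behaviour described above can be imitated with an explicit stack. A real PDA guesses the midpoint of the input nondeterministically; the sketch below simulates that guess by trying every split point, and the function name accepts_wwr is made up for this example.

```python
def accepts_wwr(s: str) -> bool:
    """Return True if s has the form w followed by the reverse of w."""
    # Nondeterminism is simulated by trying every possible midpoint.
    for mid in range(len(s) + 1):
        stack = ["$"]                # special bottom-of-stack symbol
        stack.extend(s[:mid])        # push phase: w is read and pushed (state q2)
        matched = True
        for ch in s[mid:]:           # pop phase: pop on a match (state q3)
            if stack[-1] == ch:
                stack.pop()
            else:
                matched = False      # any other input leads to the dead state
                break
        if matched and stack == ["$"]:
            return True              # only '$' is left: accepting state q4
    return False


if __name__ == "__main__":
    print(accepts_wwr("abba"), accepts_wwr("aba"))
```

Odd-length strings such as "aba" are rejected because no split point leaves the stack holding only the ‘$’ symbol at the end of the input.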
Solution
Let the equivalent PDA,
P = ({q}, {a, b}, {a, b, X, S}, δ, q, S)
where δ −
δ(q, ε , S) = {(q, XS), (q, ε )}
δ(q, ε , X) = {(q, aXb), (q, Xb), (q, ab)}
δ(q, a, a) = {(q, ε )}
δ(q, b, b) = {(q, ε )}
Step 2 − For every w, x, y, z ∈ Q, add the production rule Xwx → XwyXyx in grammar G.
Example
Design a top-down parser for the expression "x+y*z" for the grammar G with the following
production rules −
P: S → S+X | X, X → X*Y | Y, Y → (S) | id
Solution
If the PDA is (Q, ∑, S, δ, q0, I, F), then the top-down parsing is −
Example
Design a bottom-up parser for the expression "x+y*z" for the grammar G with the following
production rules −
P: S → S+X | X, X → X*Y | Y, Y → (S) | id
Solution
If the PDA is (Q, ∑, S, δ, q0, I, F), then the bottom-up parsing is −
(x+y*z, I) ⊢ (+y*z, xI) ⊢ (+y*z, YI) ⊢ (+y*z, XI) ⊢ (+y*z, SI)
⊢(y*z, +SI) ⊢ (*z, y+SI) ⊢ (*z, Y+SI) ⊢ (*z, X+SI) ⊢ (z, *X+SI)
⊢ (ε, z*X+SI) ⊢ (ε, Y*X+SI) ⊢ (ε, X+SI) ⊢ (ε, SI)
Definition
A Turing Machine (TM) is a mathematical model which consists of an infinite length tape divided
into cells on which input is given. It consists of a head which reads the input tape. A state register
stores the state of the Turing machine. After reading an input symbol, it is replaced with another
symbol, its internal state is changed, and it moves from one cell to the right or left. If the TM
reaches the final state, the input string is accepted, otherwise rejected.
A TM can be formally described as a 7-tuple (Q, X, ∑, δ, q0, B, F) where −
δ is given by −
Tape alphabet symbol Present State ‘q0’ Present State ‘q1’ Present State ‘q2’
a 1Rq1 1Lq0 1Lqf
b 1Lq2 1Rq1 1Rqf
Here the transition 1Rq1 implies that the write symbol is 1, the tape moves right, and the next state
is q1. Similarly, the transition 1Lq2 implies that the write symbol is 1, the tape moves left, and the
next state is q2.
A TM decides a language if it accepts it and enters into a rejecting state for any input not in the
language. A language is recursive if it is decided by a Turing machine.
There may be some cases where a TM does not stop. Such TM accepts the language, but it does not
decide it.
• From the above moves, we can see that M enters the state q1 if it scans an even number of
α’s, and it enters the state q2 if it scans an odd number of α’s. Hence q2 is the only accepting
state.
Hence,
M = {{q1, q2}, {1}, {1, B}, δ, q1, B, {q2}}
where δ is given by −
Example 2
Design a Turing Machine that reads a string representing a binary number and erases all leading 0’s
in the string. However, if the string comprises of only 0’s, it keeps one 0.
Solution
Let us assume that the input string is terminated by a blank symbol, B, at each end of the string.
The Turing Machine, M, can be constructed by the following moves −
• Let q0 be the initial state.
• If M is in q0, on reading 0, it moves right, enters the state q1 and erases 0. On reading 1, it
enters the state q2 and moves right.
• If M is in q1, on reading 0, it moves right and erases 0, i.e., it replaces 0’s by B’s. On
reaching the leftmost 1, it enters q2 and moves right. If it reaches B, i.e., the string comprises
of only 0’s, it moves left and enters the state q3.
• If M is in q2, on reading either 0 or 1, it moves right. On reaching B, it moves left and enters
the state q4. This validates that the string comprises only of 0’s and 1’s.
• If M is in q3, it replaces B by 0, moves left and reaches the final state qf.
• If M is in q4, on reading either 0 or 1, it moves left. On reaching the beginning of the string,
i.e., when it reads B, it reaches the final state qf.
Hence,
M = {{q0, q1, q2, q3, q4, qf}, {0, 1, B}, {0, 1}, δ, q0, B, {qf}}
where δ is given by −
Tape alphabet symbol | Present State ‘q0’ | Present State ‘q1’ | Present State ‘q2’ | Present State ‘q3’ | Present State ‘q4’
0 | BRq1 | BRq1 | 0Rq2 | - | 0Lq4
1 | 1Rq2 | 1Rq2 | 1Rq2 | - | 1Lq4
B | BRq1 | BLq3 | BLq4 | 0Lqf | BRqf
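The transition table can be exercised with a small simulator. This is a sketch under stated assumptions: the written symbol in the q2, q3 and q4 columns is taken to be the digit 0, the head starts on the first input symbol, the tape is a dictionary that returns the blank B for unwritten cells, and the names DELTA and strip_leading_zeros are chosen for illustration.

```python
from collections import defaultdict

# (state, symbol read) -> (symbol written, head move, next state)
DELTA = {
    ("q0", "0"): ("B", "R", "q1"),
    ("q0", "1"): ("1", "R", "q2"),
    ("q0", "B"): ("B", "R", "q1"),
    ("q1", "0"): ("B", "R", "q1"),
    ("q1", "1"): ("1", "R", "q2"),
    ("q1", "B"): ("B", "L", "q3"),
    ("q2", "0"): ("0", "R", "q2"),
    ("q2", "1"): ("1", "R", "q2"),
    ("q2", "B"): ("B", "L", "q4"),
    ("q3", "B"): ("0", "L", "qf"),
    ("q4", "0"): ("0", "L", "q4"),
    ("q4", "1"): ("1", "L", "q4"),
    ("q4", "B"): ("B", "R", "qf"),
}


def strip_leading_zeros(s: str, max_steps: int = 10_000) -> str:
    """Run the machine on s and return the non-blank tape contents."""
    tape = defaultdict(lambda: "B", enumerate(s))
    state, head = "q0", 0
    for _ in range(max_steps):
        if state == "qf":
            break
        write, move, state = DELTA[(state, tape[head])]
        tape[head] = write
        head += 1 if move == "R" else -1
    cells = [tape[i] for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip("B")


if __name__ == "__main__":
    print(strip_leading_zeros("0010"), strip_leading_zeros("000"))
```

The step limit is only a safety net for experimenting with modified tables; with the table above the machine halts in qf on every binary input.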
A Multi-tape Turing machine can be formally described as a 6-tuple (Q, X, B, δ, q0, F) where −
A Multi-track Turing machine can be formally described as a 6-tuple (Q, X, ∑, δ, q0, F) where −