Unit 2-Theory of Computation

Module II covers Regular Expressions (RE), including their formal definitions, construction, and identities. It discusses the relationship between finite automata (FA) and RE, methods for converting between them, and applications of RE in language construction. The module also highlights the limitations of RE and FA, emphasizing their inability to recognize non-regular languages.

Uploaded by

Eshan Jinabade

Module II

Regular Expression

Syllabus and Planner

Lecture | Topic | Book
1 | Formal definition and construction of a regular expression for a given language | T1, T2
2 | Identities of regular expressions; construction of a regular expression for a given language | T1, T2
3 | Construction of a language from an RE | T2
4 | FA and RE; DFA to RE using Arden’s Theorem; closure properties of RLs; applications of regular expressions | T2, R3
5 | RE to DFA (RE to ε-NFA to DFA, and RE to DFA direct method); DFA to RE (observation, Arden’s Theorem) | T2, R3
6 | Pumping Lemma for regular languages | T1, T2
7 | Closure properties of RLs; applications of regular expressions | T1, T2

Tutorial No | Topic
1 | Construction of a regular expression for a given language; construction of a language from an RE; DFA to RE conversion using Arden’s Theorem
Text Books & Reference Books
• Text Books
1. Michael Sipser, “Introduction to the Theory of Computation”, CENGAGE Learning, 3rd Edition, ISBN-13: 978-81-315-2529-6
2. Vivek Kulkarni, “Theory of Computation”, Oxford University Press, ISBN-13: 978-0-19-808458-7

• Reference Books
1. Hopcroft, Ullman, “Introduction to Automata Theory, Languages and Computation”, Pearson Education Asia, 2nd Edition
2. Daniel I. A. Cohen, “Introduction to Computer Theory”, Wiley-India, ISBN: 978-81-265-1334-5
3. K.L.P. Mishra, N. Chandrasekaran, “Theory of Computer Science (Automata, Languages and Computation)”, Prentice Hall India, 2nd Edition
4. John C. Martin, “Introduction to Languages and the Theory of Computation”, TMH, 3rd Edition, ISBN: 978-0-07-066048-9
5. Kavi Mahesh, “Theory of Computation: A Problem Solving Approach”, Wiley-India, ISBN: 978-81-265-3311-4
What is a Regular Expression

• Regular expressions are short notations that can denote complex and infinite regular languages.

• In arithmetic, we can use the operations + and × to build up expressions such as (5 + 3) × 4.

• The value of this arithmetic expression is the number 32.

• Similarly, we can use the regular operations to build up expressions describing languages, which are called regular expressions.

• Example: (0 ∪ 1)0∗.

• The value of a regular expression is a language. Here it is the language consisting of all strings that start with a 0 or a 1, followed by zero or more 0s.
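The example (0 ∪ 1)0∗ can be tried out with Python's `re` module. This is only a sketch: Python writes union as `|` instead of ∪, and the helper name `in_language` is made up for illustration.

```python
import re

# (0 ∪ 1)0* in Python syntax: union is written "|", star stays "*".
PATTERN = re.compile(r"(0|1)0*")

def in_language(s: str) -> bool:
    """Return True if the whole string s belongs to L((0 ∪ 1)0*)."""
    return PATTERN.fullmatch(s) is not None

# A few sample strings: the first four are in the language, the last two are not.
for s in ["0", "1", "100", "000", "", "101"]:
    print(repr(s), in_language(s))
```

`fullmatch` is used so that the whole string must match, which corresponds to membership in the language rather than a substring search.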
Definition of a Regular Expression
1. ∅ (the empty set) and ε (the empty string, of length zero) are regular expressions over Σ.

2. Every symbol a ∈ Σ is a regular expression over Σ.

3. If R1 and R2 are regular expressions over Σ, then so are (R1+R2), (R1.R2) and (R1)*
[where ‘+’ indicates union,
‘.’ indicates concatenation,
‘*’ indicates closure, or repeated concatenation]

4. Regular expressions are only those that are obtained using rules 1-3.
Operators of RE

Regular expression operators:

1. The star operator: closure
if r = a* then L(r) = {ε, a, aa, aaa, aaaa, …}

2. The dot operator: concatenation
if r = a.b then L(r) = {ab}

3. The plus operator: union
if r = a+b then L(r) = {a, b}
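Continued…
1. “a+b” stands for either a or b (parallel paths)

2. “a.b” stands for a followed by b (paths in series)

3. “a*” stands for any number of occurrences (zero or more) of a (closure).

[Figure: transition diagrams for the three cases. (1) a+b: two parallel edges, a and b, from q0 to q1. (2) a.b: q0 to q1 on a, then q1 to q2 on b. (3) a*: a self-loop on a at q0.]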
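Continued…
Concatenation of two sets:
U.V = {x | x = uv, u ∈ U and v ∈ V}
UV ≠ VU in general (concatenation is not commutative)
U(VW) = (UV)W (concatenation is associative)
E.g. U = {000, 111}, V = {101, 010}
UV = {000101, 000010, 111101, 111010}
VU = {101000, 101111, 010000, 010111}

Closure of a set:
S* = S^0 ∪ S^1 ∪ S^2 ∪ …
where S^0 = {ε} and S^i = S^(i-1).S for i > 0
e.g. S = {01, 11}
S^1 = S^0.S = {ε}.{01, 11} = {01, 11}
S^2 = S^1.S = {01, 11}.{01, 11} = {0101, 0111, 1101, 1111}
…
S* = {ε, 01, 11, 0101, 0111, 1101, 1111, …}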
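The two set operations above can be sketched with Python sets. The function names `concat` and `closure_upto` are illustrative only, and since S* is infinite the closure is cut off after a chosen number of powers.

```python
def concat(U, V):
    """Concatenation of two sets of strings: {uv | u in U, v in V}."""
    return {u + v for u in U for v in V}

def closure_upto(S, k):
    """Union of S^0, S^1, ..., S^k, i.e. a finite slice of S*."""
    power = {""}              # S^0 = {ε}
    result = set(power)
    for _ in range(k):
        power = concat(power, S)   # S^i = S^(i-1).S
        result |= power
    return result

U = {"000", "111"}
V = {"101", "010"}
print(sorted(concat(U, V)))   # UV
print(sorted(concat(V, U)))   # VU, a different set
print(sorted(closure_upto({"01", "11"}, 2), key=lambda w: (len(w), w)))
```

The empty Python string `""` plays the role of ε here.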
Precedence of Regular Expression operators
From highest to lowest precedence:
1. The star operator: closure
2. The dot operator: concatenation
3. The plus operator: union
E.g.
01*+1
Set of all strings over {0,1} consisting of either the single string 1, or a 0 followed by zero or more 1s. e.g. 0, 01, 011, 011111, 1, …

(01)*+1
Set of all strings over {0,1} consisting of either the single string 1, or zero or more repetitions of 01. e.g. ε, 01, 0101, 010101, 1, …

0(1*+1)
Set of all strings over {0,1} starting with 0 followed by zero or more 1s (the alternative "a single 1" is already covered by 1*). e.g. 0, 01, 011, 01111, …
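The effect of precedence and parentheses on these three expressions can be checked with Python's `re` module, which uses the same precedence order (star, then concatenation, then union). This is a sketch: the union operator is written `|` in Python, and the dictionary `patterns` and helper `accepts` are names chosen for illustration.

```python
import re

# Slide notation -> Python notation: "+" (union) becomes "|".
patterns = {
    "01*+1":   r"01*|1",
    "(01)*+1": r"(01)*|1",
    "0(1*+1)": r"0(1*|1)",
}

def accepts(slide_re: str, s: str) -> bool:
    """True if the whole string s is in the language of the given expression."""
    return re.fullmatch(patterns[slide_re], s) is not None

# The same string can be accepted by one expression and rejected by another.
for s in ["0111", "0101", "1", ""]:
    print(repr(s), {name: accepts(name, s) for name in patterns})
```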
Identities of regular expressions
1. R + ∅ = R
Adding the empty language to any other language does not change it.

2. ∅.R = R.∅ = ∅
e.g. if R = 0, then L(R) = {0} but L(R.∅) = ∅.

3. ε.R = R.ε = R
Joining the empty string to any string does not change it.

4. ε* = ε and ∅* = ε
5. R + R = R
6. R*R* = R*
7. RR* = R*R
8. (R*)* = R*
9. ε + RR* = R* = ε + R*R
10. (PQ)*P = P(QP)*
11. (P + Q)* = (P*Q*)* = (P* + Q*)*
12. (P + Q)R = PR + QR and R(P + Q) = RP + RQ
Examples

True or False?

Let R and S be two regular expressions. Then:

1. ((R*)*)* = R* ?

2. (R+S)* = R* + S* ?

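One way to explore such questions is to compare both sides on all strings up to a small length. The sketch below does this for the particular choice R = a and S = b, with languages represented as Python sets cut off at length N; the helper names `concat` and `star` are illustrative. Agreement up to a bound is only evidence, not a proof, while a single string that lies on one side and not the other does settle a question as false.

```python
def concat(A, B, n):
    """Concatenation A.B, keeping only strings of length <= n."""
    return {a + b for a in A for b in B if len(a) + len(b) <= n}

def star(A, n):
    """All strings of length <= n in A*."""
    result = {""}
    frontier = {""}
    while frontier:
        # extend by one more factor from A, keep only strings not seen yet
        frontier = concat(frontier, A, n) - result
        result |= frontier
    return result

N = 4
R, S = {"a"}, {"b"}

# 1. ((R*)*)* versus R*
triple_star = star(star(star(R, N), N), N)
single_star = star(R, N)
print(triple_star == single_star)

# 2. (R+S)* versus R* + S*: list a few strings found only on the left side
star_of_union = star(R | S, N)
union_of_stars = star(R, N) | star(S, N)
print(sorted(star_of_union - union_of_stars, key=len)[:3])
```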
Construction of RE from regular Language
Write an RE to represent each language L over ∑ = {0,1}:

1. Set of all strings of 0’s and 1’s
Ans: ∑ = {0,1},
L = {ε, 0, 1, 00, 01, 10, 11, …}
R = (0 + 1)*

2. Set of all strings consisting of a single 0 followed by any number of 1’s
Ans: ∑ = {0,1},
L = {0, 01, 011, 0111, …}
R = 01*

3. L = {01, 10}
R = 01 + 10

4. The language of all binary strings over ∑ = {0,1} starting with 01
R = 01(0+1)*

5. The language of all binary strings ending with 01
R = (0+1)*01

6. The language of all binary strings having the substring 01
R = (0+1)*01(0+1)*
Continued…
7. If L(r) = {aaa, aab, aba, abb, baa, bab, bba, bbb}, find r.
R = (a+b)(a+b)(a+b)

8. If L(r) = {a, c, ab, cb, abb, cbb, abbb, …}, find r.
R = (a+c).b*

9. If L(r) = {ε, x, xx, xxx, xxxx, xxxxx}, find r.
R = (ε+x)(ε+x)(ε+x)(ε+x)(ε+x)

10. The set of all strings of 0’s and 1’s such that the tenth symbol from the right is 1.
R = (0+1)*.1.(0+1)(0+1)(0+1)(0+1)(0+1)(0+1)(0+1)(0+1)(0+1)
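The answer to item 10 can be sanity-checked by comparing the RE with a direct test of the tenth symbol from the right on all short binary strings. This is a sketch under assumptions: the pattern is rewritten in Python syntax (`[01]` for (0+1), `{9}` for nine repetitions), and the function names are made up for illustration.

```python
import re
from itertools import product

# (0+1)*.1.(0+1)^9 in Python syntax
TENTH_FROM_RIGHT = re.compile(r"[01]*1[01]{9}")

def tenth_from_right_is_one(s: str) -> bool:
    """Direct definition: the string has at least 10 symbols and s[-10] is '1'."""
    return len(s) >= 10 and s[-10] == "1"

def agree_up_to(n: int) -> bool:
    """Compare the RE with the direct definition on all binary strings of length <= n."""
    for k in range(n + 1):
        for bits in product("01", repeat=k):
            s = "".join(bits)
            if (TENTH_FROM_RIGHT.fullmatch(s) is not None) != tenth_from_right_is_one(s):
                return False
    return True

print(agree_up_to(12))
```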
Construction of Regular Language from RE
Describe the language represented by each of the following REs.
1) r = (0|1)*011 (here | denotes union, the same as +)
Ans: ∑ = {0,1},
L(r) = Set of all strings over {0,1} that end with 011
2) r = 0*1*2*
Ans: ∑ = {0,1,2},
L(r) = Set of strings over {0,1,2} with zero or more 0’s,
followed by zero or more 1’s,
followed by zero or more 2’s

3) r = 00*11*22*
Ans: ∑ = {0,1,2},
L(r) = Set of all strings over {0,1,2} consisting of at least one 0,
followed by at least one 1, followed by at least one 2.

4) r = (0*1*)*000(0+1)*
L = Set of all strings over {0,1} containing three consecutive 0’s.

5) r = (0 ∪ ε)(1 ∪ ε)
L = {ε, 0, 1, 01}
Continued…

6) R = (∑∑)*
L = {w | w ∈ ∑* and w has even length}

7) R = (∑∑∑)*
L = {w | w ∈ ∑* and the length of w is a multiple of 3}

8) ∅* = ?
L = {ε}

9) R.ε = ?
R.ε = R, so the language is L(R).
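Equivalence of FA and RE
• Kleene’s Theorem states that:
1) Any regular language (that is, any language described by a regular expression) can be recognized (accepted) by a finite automaton.
2) Every language accepted by a finite automaton is regular (it can be described by a regular expression).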
RE to NFA (with ε-moves) Conversion
1) a.b
[Figure: NFA for a.b. 0 = start state, 2 = final state]
[Figure: second NFA for a.b. 0 = start state, 3 = final state]

2) a+b
[Figure: NFA for a+b. 0 = start state, 5 = final state]

3) a*
[Figure: NFA for a*. 0 = start state, 3 = final state]
Example
Construct an NFA for the regular expression (a + b)*ab. Convert the NFA to its equivalent DFA.

Solution:
We are expected to construct a DFA that recognizes the regular set R = (a+b)*·a·b.

Let us first build the NFA with ε-moves and then convert it to a DFA.

The TG for the NFA with ε-moves is as follows:
[Figure: TG of the NFA with ε-moves]

Let us convert this NFA with ε-moves to its equivalent DFA using a direct method.
[Figure: equivalent DFA]

We have relabelled the states as well.

Let us see if we can minimize it.
The STF for the DFA looks like:
[Figure: STF of the DFA]
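Since the diagrams for this example are not reproduced here, the sketch below illustrates the NFA-to-DFA step on an assumed three-state NFA for (a+b)*ab without ε-moves. This is not the ε-NFA from the slides, and the state numbers are illustrative. It uses the standard subset construction, in which each DFA state is a set of NFA states.

```python
from collections import deque

# Assumed NFA for (a+b)*ab (no ε-moves; state names are illustrative):
#   0 --a,b--> 0,   0 --a--> 1,   1 --b--> 2 (final)
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
START, FINALS, ALPHABET = 0, {2}, "ab"

def subset_construction():
    """Build a DFA whose states are frozensets of NFA states."""
    start = frozenset({START})
    delta, seen, queue = {}, {start}, deque([start])
    while queue:
        S = queue.popleft()
        for c in ALPHABET:
            # all NFA states reachable from some state in S on symbol c
            T = frozenset(q for p in S for q in NFA.get((p, c), ()))
            delta[(S, c)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    finals = {S for S in seen if S & FINALS}
    return start, delta, finals

def dfa_accepts(s: str) -> bool:
    """Run the constructed DFA on s."""
    start, delta, finals = subset_construction()
    state = start
    for c in s:
        state = delta[(state, c)]
    return state in finals

for s in ["ab", "babab", "aba", ""]:
    print(repr(s), dfa_accepts(s))
```

For this particular NFA only three subsets are reachable, so the resulting DFA has three states.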
DFA to RE: Observation
Derive the equivalent RE for the DFA.
Strategy:
1. Explore parallel paths between two states, or from the start state to the final state, individually: “+” operator
2. Find self-loops: “*” operator
3. Find out the trap (or dead) state.

Which is correct?
1. R = a.(a+b)*
2. R = (a+b).(a+b)*
DFA to RE: Arden's Theorem
Statement:
Let P and Q be two regular expressions.
If P does not contain the null string (ε),
then R = Q + RP has a unique solution, R = QP*.

Proof:
R = Q + RP
= Q + (Q + RP)P [substituting R = Q + RP on the right-hand side]
= Q + QP + RP^2
Substituting for R again and again, we get
R = Q + QP + QP^2 + QP^3 + …
R = Q(ε + P + P^2 + P^3 + …)
R = QP* [as P* represents (ε + P + P^2 + P^3 + …)]
Hence, proved.
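The expansion above shows where QP* comes from. As a complementary check (a sketch that uses only the identities listed earlier, in particular ε + R*R = R* and distributivity), substituting R = QP* back into the right-hand side of R = Q + RP returns QP* again:

```latex
\begin{aligned}
Q + RP &= Q + (QP^{*})P \\
       &= Q(\varepsilon + P^{*}P) \\
       &= QP^{*} = R .
\end{aligned}
```

This confirms that QP* is a solution; the condition that P does not contain ε is what the theorem needs for the solution to be unique, which this check does not show.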
Arden's Theorem Example 1

Solution:
The state equations for the given DFA are:
q0 = q0 b + q2 a + ε
q1 = q0 a
q2 = q1 a + q1 b + q2 b
Substituting for q1 in q2,
q2 = q0 aa + q0 ab + q2 b
q2 = q0 a(a + b) + q2 b
q2 = q0 a(a + b)b* ... using Arden's Theorem (R = Q + RP has the unique solution R = QP*)
Substituting for q2 in q0,
q0 = q0 b + q2 a + ε
q0 = q0 b + q0 a(a + b)b*a + ε
q0 = q0 (b + a(a + b)b*a) + ε
q0 = ε(b + a(a + b)b*a)* ... using Arden's Theorem

Hence, q0 = (b + a(a + b)b*a)*
q0 being the only final state of the DFA,
R = (b + a(a + b)b*a)*
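The result of Example 1 can be cross-checked by brute force. In the sketch below, the transition table is read off the state equations (a term "qi x" in the equation for qj is taken to mean an edge from qi to qj on x), since the DFA diagram itself is not reproduced here; the names `DELTA`, `dfa_accepts` and `agree_up_to` are illustrative. The DFA and the derived RE are then compared on every string over {a, b} up to a given length.

```python
import re
from itertools import product

# Transitions reconstructed from the state equations:
#   q0 = q0 b + q2 a + ε,  q1 = q0 a,  q2 = q1 a + q1 b + q2 b
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q2", ("q1", "b"): "q2",
    ("q2", "a"): "q0", ("q2", "b"): "q2",
}

def dfa_accepts(s: str) -> bool:
    state = "q0"                  # q0 is the start state (it carries the ε term)
    for c in s:
        state = DELTA[(state, c)]
    return state == "q0"          # q0 is the only final state

# (b + a(a+b)b*a)* in Python syntax
ARDEN_RE = re.compile(r"(b|a(a|b)b*a)*")

def agree_up_to(n: int) -> bool:
    """True if the DFA and the RE agree on all strings of length <= n."""
    for k in range(n + 1):
        for chars in product("ab", repeat=k):
            s = "".join(chars)
            if dfa_accepts(s) != (ARDEN_RE.fullmatch(s) is not None):
                return False
    return True

print(agree_up_to(8))
```

A bounded comparison like this cannot replace the derivation, but it catches algebra slips quickly.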
Example 2.

Construct the RE for the given FA.

q1 = q1.a + q2.b + ε
q2 = q1.a + q2.b + q3.a
q3 = q2.a

Substituting q3 = q2.a in q2:
q2 = q1.a + q2.b + q3.a
q2 = q1.a + q2.b + q2.a.a
q2 = q1.a + q2(b+aa)
Applying Arden’s Theorem (R = Q + RP has the unique solution R = QP*),
q2 = q1.a(b+aa)*

Substituting q2 = q1.a(b+aa)* in q1:
q1 = q1.a + q2.b + ε
q1 = q1.a + q1.a(b+aa)*.b + ε
q1 = ε + q1[a + a(b+aa)*.b]
Applying Arden’s Theorem,
q1 = ε.[a + a(b+aa)*.b]*
q1 = [a + a(b+aa)*.b]*

q2 = [a + a(b+aa)*.b]*.a(b+aa)* {since q2 = q1.a(b+aa)*}

q3 = [a + a(b+aa)*.b]*.a(b+aa)*.a {since q3 = q2.a}
which is the RE for the given FA.
Limitations of FA and RE

1. An FA has only a finite number of states, so it cannot remember an unbounded amount of information.

2. The head cannot move in the reverse direction.

3. It cannot recognize languages that are not regular, e.g. the language of palindromes.

4. An FA cannot multiply numbers.
Closure properties of Regular Languages

Regular Grammar: A grammar is regular if all its rules are of the form A -> a, A -> aB or A -> ε, where ε denotes the empty (null) string.

Regular Languages: A language is regular if it can be expressed by a regular expression.
Continued…
Union: If L1 and L2 are two regular languages, their union L1 ∪ L2 is also regular.

For example,
L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0}
L3 = L1 ∪ L2 = {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} is also regular.

Intersection: If L1 and L2 are two regular languages, their intersection L1 ∩ L2 is also regular.

For example,
L1 = {a^m b^n | n ≥ 0 and m ≥ 0} and L2 = {a^m b^n | n ≥ 0 and m ≥ 0} ∪ {b^n a^m | n ≥ 0 and m ≥ 0}
L3 = L1 ∩ L2 = {a^m b^n | n ≥ 0 and m ≥ 0} is also regular.
Continued…
Concatenation: If L1 and L2 are two regular languages, their concatenation L1.L2 is also regular.

For example,
L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0}
L3 = L1.L2 = {a^m b^n | m ≥ 0 and n ≥ 0} is also regular.

Kleene Closure: If L1 is a regular language, its Kleene closure L1* is also regular.

For example,
L1 = L(a ∪ b) = {a, b}
L1* = L((a ∪ b)*), the set of all strings over {a, b}
Continued…
Complement: If L(G) is a regular language, its complement L’(G) is also regular.

The complement of a language is found by removing the strings that are in L(G) from the set of all possible strings, Σ*.

For example, over Σ = {a},
L(G) = {a^n | n > 3}
L’(G) = {a^n | n ≤ 3}
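For the complement example, closure can be illustrated with a DFA: keep the same states and transitions and swap final and non-final states. The sketch below assumes the alphabet Σ = {a} and a five-state counting DFA whose state numbers are chosen for illustration; it is one possible automaton for this language, not one given in the slides.

```python
# DFA over Σ = {a} for L = {a^n | n > 3}: state i counts the a's seen so far,
# capped at 4 (state 4 loops on a). State names are illustrative.
def step(state: int) -> int:
    return min(state + 1, 4)

FINALS = {4}                        # accepts a^n with n > 3
COMPLEMENT_FINALS = {0, 1, 2, 3}    # same DFA, final and non-final states swapped

def run(word: str, finals: set) -> bool:
    """Run the counting DFA on word and test acceptance against the given final states."""
    state = 0
    for ch in word:
        if ch != "a":
            raise ValueError("the alphabet is {a}")
        state = step(state)
    return state in finals

for n in range(6):
    w = "a" * n
    print(n, run(w, FINALS), run(w, COMPLEMENT_FINALS))
```

Swapping final states works here because the automaton is a complete DFA, with exactly one transition for every state and symbol.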
Application of Regular Expression

• Application in Linux
Example: text file search with the Unix tool egrep
Editing commands, e.g. cw (change word)

• Application in search engines

• Web applications

• Regular expressions in lexical analysis
Example: C programming language
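As an illustration of regular expressions in lexical analysis, the sketch below splits a C-like statement into tokens with Python's `re` module. The token classes in `TOKEN_SPEC` are a small illustrative subset chosen for this example, not the lexical grammar of C; for instance, keywords such as `int` are simply reported as identifiers.

```python
import re

# Illustrative token classes only; a real C lexer has many more.
TOKEN_SPEC = [
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OPERATOR",   r"[+\-*/=<>!]=?"),
    ("PUNCT",      r"[;(){},]"),
    ("SKIP",       r"\s+"),
]
# One combined pattern: a union of named groups, one per token class.
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(source: str):
    """Return a list of (token class, lexeme) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("int x = 42;"))
```

Each token class is itself a regular expression, and the lexer is their union, which is the same "+" operation used throughout this module.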
Pumping lemma for Regular Languages
The pumping lemma is used to prove that a language is not regular.

Let L be a regular language and M = (Q, Σ, δ, q1, F) be a finite automaton with n states that accepts L.
Let x ∈ L with |x| ≥ n. Then x can be written as uvw, where
(i) |v| > 0

(ii) |uv| ≤ n

(iii) uv^i w ∈ L for all i ≥ 0, where v^i denotes that v is repeated (pumped) i times.
Pumping lemma Application

Ex 1. Show that the language L = {a^m b^m} is not regular.
Answer:
Step 1:
Let us assume that L is regular and is accepted by an FA with n states.

Step 2: Let us choose a string x = a^m b^m with m ≥ n, so that |x| = 2m ≥ n.
Let us write x as uvw, with |v| > 0 and |uv| ≤ n.
Since |uv| ≤ n ≤ m, both u and v lie within the leading a's:
u must be of the form a^s, and v must be of the form a^r with r > 0.
Now, a^m b^m can be written as a^s a^r a^(m-s-r) b^m, where
a^s is u, a^r is v, and a^(m-s-r) b^m is w.

Step 3: Let us check whether uv^i w for i = 2 belongs to L.
uv^2w = a^s (a^r)^2 a^(m-s-r) b^m = a^s a^(2r) a^(m-s-r) b^m = a^(s+2r+m-s-r) b^m = a^(m+r) b^m
Since r > 0, the number of a's in a^(m+r) b^m is greater than the number of b's.
Therefore, uv^2w ∉ L.
Hence, by contradiction, the given language is not regular.
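The argument of Ex 1 can be replayed mechanically for a concrete, hypothetical number of states n. The sketch below takes x = a^n b^n, tries every split x = uvw allowed by the lemma (|v| > 0 and |uv| ≤ n), and checks whether pumping v once more keeps the string in L. The function names are illustrative, and a check for a few values of n is an illustration of the proof, not a substitute for it.

```python
def in_L(s: str) -> bool:
    """Membership in L = {a^m b^m}."""
    m = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * m + "b" * m

def some_split_survives(x: str, n: int) -> bool:
    """True if some split x = uvw with |v| > 0 and |uv| <= n keeps uv^2w in L."""
    for uv_len in range(1, n + 1):
        for u_len in range(uv_len):
            u, v, w = x[:u_len], x[u_len:uv_len], x[uv_len:]
            if in_L(u + v * 2 + w):
                return True
    return False

n = 5                        # hypothetical number of states
x = "a" * n + "b" * n        # the chosen string, with m = n
print(some_split_survives(x, n))
```

If L were regular, the lemma would guarantee at least one split for which every pumped string stays in L, so the function would have to return True.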
Q. Show that the language L = {xx | x ∈ {a,b}*} is not regular.
Answer:
Step 1:
Let us assume that L is regular and is accepted by an FA with n states.

Step 2: Let us choose x = a^m b with m ≥ n, so that xx = a^m b a^m b and
|xx| = m + 1 + m + 1 = 2m + 2 ≥ n.
Let us write xx as uvw, with |v| > 0 and |uv| ≤ n.
Since |uv| ≤ n ≤ m, both u and v lie within the first block of a's:
u must be of the form a^s, and v must be of the form a^r with r > 0.

Now, xx = a^m b a^m b can be written as a^s a^r a^(m-s-r) b a^m b, where
a^s is u, a^r is v, and a^(m-s-r) b a^m b is w.

Step 3: Let us check whether uv^i w for i = 2 belongs to L.
uv^2w = a^s (a^r)^2 a^(m-s-r) b a^m b = a^s a^(2r) a^(m-s-r) b a^m b
= a^(s+2r+m-s-r) b a^m b = a^(m+r) b a^m b
A string of the form xx that contains exactly two b's and ends in b must be a^k b a^k b for some k, but here the two blocks of a's have different lengths (m + r ≠ m, since r > 0).
Therefore, uv^2w ∉ L.
Hence, by contradiction, the given language is not regular.
