ACD Module - 2 Notes

SDCS

Uploaded by

anilkali2004
Copyright
© © All Rights Reserved
Automata Theory & Compiler Design 21CS51 Module 2

REGULAR EXPRESSIONS AND LANGUAGES:


Introduction
Instead of focusing on the power of a computing device, let's look at the task that we need to
perform. Let's consider problems in which our goal is to match finite or repeating patterns.
For example, regular expressions are used as a pattern description language in:
• Lexical analysis in compilers.
• Filtering email for spam.
• Sorting email into appropriate mailboxes based on sender and/or content words and
phrases.
• Searching a complex directory structure by specifying patterns that are known to occur in
the file we want.
A regular expression is a pattern description language, which is used to describe particular
patterns of interest. A regular expression provides a concise and flexible means for
"matching" strings of text, such as particular characters, words, or patterns of characters.
Example: [ ] : A character class which matches any character within the brackets
[^ \t\n] matches any character except space, tab and newline character.
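The character class above can be tried directly in Python's `re` module. This is a minimal sketch; the helper name `split_words` and the sample input are made up for illustration. Note that in Python's syntax a trailing `+` means "one or more repetitions", which is different from the union operator + used in the rest of these notes.

```python
import re

# [^ \t\n] matches one character that is not a space, tab or newline;
# the trailing + (one or more repetitions) picks out whole runs of them.
NON_BLANK_RUN = re.compile(r"[^ \t\n]+")

def split_words(text):
    """Return the maximal runs of non-blank characters in text."""
    return NON_BLANK_RUN.findall(text)

print(split_words("if x\t==\n10"))  # ['if', 'x', '==', '10']
```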
Regular expression
What are regular languages?
A language accepted by a finite automaton is called a regular language.
Regular languages can be described using regular expressions: algebraic notation built from the
symbols of the alphabet Σ and the operators +, . and *, where + denotes union, . denotes
concatenation and * denotes closure.
Thus regular expressions are a structural representation of finite automata in algebraic
notation, and they serve as the input language for many systems that process strings, for
example Lex programs and UNIX grep.
Definition of Regular expression
Define Regular expression.
A regular expression is defined as follows:
Φ is a regular expression denoting the empty language.
ε is a regular expression denoting the language containing only the empty string, {ε}.
a is a regular expression denoting the language {a}, for each symbol a in Σ.

vtucode.in Page 1
If R is a regular expression denoting the language LR and S is a regular expression denoting the
language LS, then:
R + S is a regular expression corresponding to the language LR U LS.
R.S is a regular expression corresponding to the language LR.LS.
R* is a regular expression corresponding to the language LR*.
Thus the expressions obtained by applying any of the above rules are regular expressions.
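The inductive definition can be read as a recipe for computing the language of an expression. The sketch below is an illustration only: the tuple encoding (`"empty"`, `"eps"`, `"sym"`, `"union"`, `"concat"`, `"star"`) and the function name `language` are hypothetical, and because R* is infinite the function lists only strings up to a chosen length.

```python
def language(regex, max_len):
    """Return all strings of length <= max_len in the language of regex.

    regex uses a hypothetical tuple encoding:
    ("empty",), ("eps",), ("sym", "a"),
    ("union", R, S), ("concat", R, S), ("star", R)
    """
    kind = regex[0]
    if kind == "empty":                      # Φ: the empty language
        return set()
    if kind == "eps":                        # ε: only the empty string
        return {""}
    if kind == "sym":                        # a: the single string "a"
        return {regex[1]} if max_len >= 1 else set()
    if kind == "union":                      # R + S
        return language(regex[1], max_len) | language(regex[2], max_len)
    if kind == "concat":                     # R.S
        left = language(regex[1], max_len)
        right = language(regex[2], max_len)
        return {x + y for x in left for y in right
                if len(x) + len(y) <= max_len}
    if kind == "star":                       # R*
        base = language(regex[1], max_len) - {""}
        result, frontier = {""}, {""}
        while frontier:                      # stops: strings only get longer
            frontier = {x + y for x in frontier for y in base
                        if len(x) + len(y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node: {kind}")

# a*b, i.e. any number of a's followed by one b
a_star_b = ("concat", ("star", ("sym", "a")), ("sym", "b"))
print(sorted(language(a_star_b, 3)))  # ['aab', 'ab', 'b']
```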
Examples of Regular expressions
Regular expression      Meaning
a*                      Strings consisting of any number of a’s (zero or more a’s)
a⁺                      Strings consisting of at least one a (one or more a’s)
(a + b)                 Either a or b
(a + b)*                Strings consisting of any number of a’s and b’s, including ε
(a + b)* ab             Strings of a’s and b’s ending with ab
ab (a + b)*             Strings of a’s and b’s starting with ab
(a + b)* ab (a + b)*    Strings of a’s and b’s containing the substring ab

Write the regular expressions for the following languages:


a. Strings of a’s and b’s having length 2:
Regular expression = (a + b) ( a+ b).
b. Strings of a’s and b’s of length ≤ 10:
Regular expression = (ε + a + b)¹⁰.
c. Strings of a’s and b’s of even length.
Regular expression = [(a + b) ( a + b) ]*
d. Strings of a’s and b’s of odd length
Regular expression = ( a + b) [(a + b) ( a + b)]*
e. Strings of a’s of even length
Regular expression = (aa)*
f. Strings of a’s of odd length
Regular expression = a(aa)*
g. Strings of a’s and b’s with alternate a’s and b’s.
Alternating a’s and b’s can be obtained by concatenating the string ab zero or more times, i.e. (ab)*,
then adding an optional b at the front, i.e. (ε + b), and an optional a at the end, i.e. (ε + a).
The regular expression = (ε + b) (ab)* (ε + a)
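An answer like this can be checked by brute force. The sketch below is an illustration, not part of the original solution: it rewrites (ε + b)(ab)*(ε + a) in Python `re` syntax, where an optional symbol (ε + x) becomes `x?`, and compares it with a direct test on every string over {a, b} up to length 8. The helper names are made up for the example.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# (ε + b)(ab)*(ε + a) in Python syntax
alternating = re.compile(r"b?(ab)*a?")

def alternates(w):
    """True if no two neighbouring symbols of w are equal."""
    return all(x != y for x, y in zip(w, w[1:]))

mismatches = [w for w in all_strings("ab", 8)
              if bool(alternating.fullmatch(w)) != alternates(w)]
print(mismatches)  # an empty list means the two descriptions agree
```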


Obtain regular expression to accept the language containing at least one a and one b over Σ = { a,
b, c}. OR
Obtain regular expression to accept the language containing at least one 0 and one 1 over Σ = {
0, 1, 2}.
The string should contain at least one a and one b, in either order, so the skeleton of the
regular expression is ab + ba.
There is no restriction on c’s, and further a’s and b’s may appear anywhere. So insert any number
of a’s, b’s and c’s, i.e. (a+b+c)*, before, between and after the symbols of the skeleton.
So the final regular expression = (a+b+c)* a (a+b+c)* b (a+b+c)* + (a+b+c)* b (a+b+c)* a (a+b+c)*
Obtain regular expression to accept the language containing at least 3 consecutive zeros.
Regular expression for string containing 3 consecutive 0’s = 000
The above regular expression can be preceded or followed by any number of 0’s and 1’s, ie:
(0+1)*
Regular expression = (0+1)*000(0+1)*
Obtain regular expression to accept the language containing strings of a’s and b’s ending with b
and has no substring aa.
Regular expression for strings of a’s and b’s ending with b and has no substring aa is nothing but
the string containing any combinations of either b or ab without ε.
Regular expression = ( b + ab) (b +ab)*
Obtain regular expression to accept the language containing strings of a’s and b’s such that
L = { a²ⁿ b²ᵐ | n, m ≥ 0 }
a²ⁿ means even number of a’s, regular expression = (aa)*
b²ᵐ means even number of b’s, regular expression = (bb)*
The regular expression for the given language = (aa)* (bb)*
Obtain regular expression to accept the language containing strings of a’s and b’s such that
L = { a²ⁿ⁺¹ b²ᵐ | n, m ≥ 0 }.
a²ⁿ⁺¹ means odd number of a’s, regular expression = a(aa)*
b²ᵐ means even number of b’s, regular expression = (bb)*
The regular expression for the given language = a(aa)* (bb)*

Obtain regular expression to accept the language containing strings of a’s and b’s such that
L = { a²ⁿ⁺¹ b²ᵐ⁺¹ | n, m ≥ 0 }.
a²ⁿ⁺¹ means odd number of a’s, regular expression = a(aa)*
b²ᵐ⁺¹ means odd number of b’s, regular expression = b(bb)*
The regular expression for the given language = a(aa)*b(bb)*
Obtain regular expression to accept the language containing strings of 0’s and 1’s with exactly
one 1 and an even number of 0’s.
Regular expression for exactly one 1 = 1
Even number of 0’s = (00)*
So here 1 can be preceded and followed by an even number of 0’s, or 1 can be preceded and followed
by an odd number of 0’s. In both cases the total number of 0’s is even.
The regular expression for the given language = (00)* 1 (00)* + 0(00)* 1 0(00)*
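The two-case argument can be confirmed mechanically. In this sketch (an illustration with made-up names) the expression is written in Python `re` syntax, with union + replaced by `|`, and compared with the specification "exactly one 1 and an even number of 0's" on all binary strings of length up to 9.

```python
import re
from itertools import product

# (00)* 1 (00)*  +  0(00)* 1 0(00)*   with union written as |
pattern = re.compile(r"(00)*1(00)*|0(00)*10(00)*")

def meets_spec(w):
    """Exactly one 1 and an even number of 0's."""
    return w.count("1") == 1 and w.count("0") % 2 == 0

words = ["".join(t) for n in range(10) for t in product("01", repeat=n)]
disagreements = [w for w in words
                 if bool(pattern.fullmatch(w)) != meets_spec(w)]
print(disagreements)  # an empty list means the expression fits the spec
```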
Obtain regular expression to accept the language containing strings of 0’s and 1’s having no two
consecutive 0’s. OR
Obtain regular expression to accept the language containing strings of 0’s and 1’s with no pair of
consecutive 0’s.
Whenever a 0 occurs it should be followed by 1. But there is no restriction on number of 1’s. So
it is a string consisting of any combinations of 1’s and 01’s, ie regular expression = (1+01)*
If the string ends with 0, that final 0 is not followed by a 1, so the above regular expression is
extended with (0 + ε) at the end.
Regular expression for the given language = (1+01)* (0 + ε )
Obtain regular expression to accept the language containing strings of 0’s and 1’s having no two
consecutive 1’s. OR
Obtain regular expression to accept the language containing strings of 0’s and 1’s with no pair of
consecutive 1’s.
Whenever a 1 occurs it should be followed by 0. But there is no restriction on number of 0’s. So
it is a string consisting of any combinations of 0’s and 10’s, ie regular expression = (0+10)*
If the string ends with 1, that final 1 is not followed by a 0, so the above regular expression is
extended with (1 + ε) at the end.
Regular expression for the given language = (0+10)* (1 + ε )
Obtain regular expression to accept the following languages over Σ = { a, b}.
i. Strings of a’s and b’s with substring aab.
Regular expression = (a+b)* aab(a+b)*


ii. Strings of a’s and b’s such that 4th symbol from right end is b and the 5th symbol from
right end is a.
Here the 4th symbol from the right end is b and the 5th symbol from the right end is a, so the
corresponding regular expression = ab(a+b)(a+b)(a+b).
But the above regular expression can be preceded by any number of a’s and b’s.
Therefore the regular expression for the given language = (a+b)*ab(a+b)(a+b)(a+b).
iii. Strings of a’s and b’s such that 10th symbol from right end is b.
The regular expression for the given language = (a+b)*b(a+b)⁹.
iv. Strings of a’s and b’s whose lengths are multiple of 3.
OR
L = { w : |w| mod 3 = 0 }, where w is a string over Σ = { a, b }
Length of string w is multiple of 3, the regular expression = [(a+b) (a+b) (a+b)]*

v. Strings of a’s and b’s whose lengths are multiple of 5.


OR
L = { w : |w| mod 5 = 0 }, where w is a string over Σ = { a, b }
Length of string w is multiple of 5,
The regular expression = [(a+b) (a+b) (a+b) (a+b)(a+b)]*
vi. Strings of a’s and b’s not more than 3 a’s:
Not more than 3 a’s, regular expression= (ε+a) (ε+a) (ε+a).
But there is no restriction on b’s, so we can include b* in between the above regular
expression.
The regular expression for the given language = b*(ε+a) b*(ε+a) b* (ε+a) b*
vii. Obtain the regular expression to accept the words with two or more letters but
beginning and ending with the same letter. Σ = { a, b}
Regular expression beginning and ending with same letter is = a a + b b. In between
include any number of a’s and b’s.
Therefore the regular expression = a (a+b)* a + b (a+b)* b
viii. Strings of a’s and b’s whose length is either even or a multiple of 3.
Length is a multiple of 3, regular expression = [(a+b) (a+b) (a+b)]*
Length is even, regular expression = [(a+b) (a+b)]*
So the regular expression for the given language = [(a+b) (a+b) (a+b)]* + [(a+b) (a+b)]*
ix. Obtain the regular expression to accept the language L = { aⁿbᵐ | m+n is even }
Here n represents number of a’s and m represents number of b’s.
m+n is even results in two possible cases;
case i. when even number of a’s followed by even number of b’s.
regular expression : (aa)*(bb)*
case ii. Odd number of a’s followed by odd number of b’s.
regular expression = a(aa)* b(bb)*.
So the regular expression for the given language = (aa)*(bb)* + a(aa)* b(bb)*
x. Obtain the regular expression to accept the language L = { aⁿbᵐ | n ≥ 4 and m ≤ 3 }.
Here n ≥ 4 means at least 4 a’s, the regular expression for this = aaaa(a)*
m ≤ 3 means at most 3 b’s, regular expression for this = (ε+b) (ε+b) (ε+b).
So the regular expression for the given language = aaaa(a)* (ε+b) (ε+b) (ε+b).
xi. Obtain the regular expression to accept the language L = { aⁿbᵐcᵖ | n ≥ 4, m ≤ 3 and p ≤ 2 }.
Here n ≥ 4 means at least 4 a’s, the regular expression for this = aaaa(a)*
m ≤ 3 means at most 3 b’s, regular expression for this = (ε+b) (ε+b) (ε+b).
p ≤ 2 means at most 2 c’s, regular expression for this = (ε+c) (ε+c)
So the regular expression for the given language = aaaa(a)* (ε+b) (ε+b) (ε+b) (ε+c) (ε+c).
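xii. All strings of a’s and b’s that do not end with ab.
Strings of length 2 that do not end with ab are ba, aa and bb, so every string of length 2 or
more that does not end with ab is described by (a+b)*(aa + ba + bb).
Strings of length less than 2, namely ε, a and b, also do not end with ab and must be added.
So the regular expression = ε + a + b + (a+b)*(aa + ba + bb)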
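A brute-force comparison shows which strings the part (a+b)*(aa + ba + bb) covers on its own. The sketch below is an illustration with made-up names: it lists the strings that do not end with ab but are not matched by that part alone, then checks the expression with ε, a and b added. In Python syntax the union + is written `|` and (ε + a + b) becomes `(a|b)?`.

```python
import re
from itertools import product

# (a+b)*(aa + ba + bb): only strings of length 2 or more
long_part = re.compile(r"[ab]*(aa|ba|bb)")
# ε + a + b + (a+b)*(aa + ba + bb)
with_short = re.compile(r"(a|b)?|[ab]*(aa|ba|bb)")

words = ["".join(t) for n in range(7) for t in product("ab", repeat=n)]

# Strings that should be accepted but are missed by long_part alone
missed = [w for w in words
          if not w.endswith("ab") and not long_part.fullmatch(w)]
print(missed)  # ['', 'a', 'b']

# With the three short strings added, expression and spec agree
disagreements = [w for w in words
                 if bool(with_short.fullmatch(w)) == w.endswith("ab")]
print(disagreements)  # []
```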
xiii. All strings of a’s, b’s and c’s with exactly one a.
The regular expression = (b+c)* a (b+c)*
xiv. All strings of a’s and b’s with at least one occurrence of each symbol in Σ = {a, b}.
At least one occurrence of each of a and b gives the skeleton ab + ba; any number of a’s and
b’s may appear before, between and after these two symbols.
So the regular expression =(a+b)* a (a+b)* b(a+b)* +(a+b)* b(a+b)* a(a+b)*

Obtain the regular expression for the language L = { aⁿbᵐ | m ≥ 1, n ≥ 1, nm ≥ 3 }.

Solution:
Case i. Since nm ≥ 3, if n = 1 then m should be ≥ 3. The equivalent regular expression is given
by: RE = a bbb(b)*
Case ii. Since nm ≥ 3, if m = 1 then n should be ≥ 3. The equivalent regular expression is given
by: RE = aaa(a)* b
Case iii. If m ≥ 2 and n ≥ 2 then nm ≥ 4, so nm ≥ 3 always holds. The equivalent regular expression
is given by: RE = aa(a)* bb(b)*
So the final regular expression is obtained by taking the union of all the above regular expressions.
Regular expression = abbb(b)* + aaa(a)*b + aa(a)*bb(b)*
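The three cases can be checked by enumerating exponents rather than strings. This is a minimal sketch with assumed names and an arbitrary bound of 7 on n and m. It builds aⁿbᵐ for each pair and checks that the expression accepts it exactly when n ≥ 1, m ≥ 1 and nm ≥ 3.

```python
import re

# abbb(b)* + aaa(a)*b + aa(a)*bb(b)*   with union written as |
pattern = re.compile(r"abbbb*|aaaa*b|aaa*bbb*")

def in_language(n, m):
    """Membership condition from the problem statement."""
    return n >= 1 and m >= 1 and n * m >= 3

wrong = [(n, m) for n in range(8) for m in range(8)
         if bool(pattern.fullmatch("a" * n + "b" * m)) != in_language(n, m)]
print(wrong)  # an empty list means all three cases together are exact
```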
Application of Regular expression:
1. Regular expressions are used in UNIX tools such as grep.
2. Regular expressions are extensively used in the design of Lexical analyzer phase.
3. Regular expressions are used to search patterns in text.
FINITE AUTOMATA AND REGULAR EXPRESSIONS
1. Converting Regular Expressions to Automata:
Prove that every language defined by a regular expression is also defined by a finite automaton.
Proof:
Suppose L = L(R) for a regular expression R. We show that L = L(E) for some ε-NFA E with:
a. Exactly one accepting state.
b. No arcs into the initial state.
c. No arcs out of the accepting state.
The proof is by structural induction on R, using the following transition diagrams as the basis
of the construction of the automaton.

By definition of regular expression, if R is a RE and S is a RE then R+S is also a RE
corresponding to the language L(R+S); its automaton is given by:

Starting at new start state, we can go to the start state of either the automaton for R or S. We then
reach the accepting state of one of these automata R or S. We can follow one of the ε- arcs to the
accepting state of the new automaton.
Automaton for R.S is given by:

The start state of the first (R) automaton becomes the start state of the whole, and the final
state of the second (S) automaton becomes the final state of the whole.
Automaton for R* is given by:

From the new start state we can either go directly to the final state along an arc labeled ε
(for ε in R*), or go to the start state of the automaton for R, through that automaton one or
more times, and then to the final state.
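The three constructions can be sketched in code. The representation below is an assumption made for illustration: an automaton is a tuple (start, accept, edges), each edge is (source, label, target), and the label `None` stands for ε. Every fragment keeps the three properties stated in the proof: one accepting state, no arcs into the start state and no arcs out of the accepting state.

```python
from itertools import count

_fresh = count()  # supplies unused state numbers

def symbol(a):
    """Basis: two states joined by one arc labeled a."""
    s, f = next(_fresh), next(_fresh)
    return (s, f, [(s, a, f)])

def union(r, s):
    """R + S: new start and accept states, joined to R and S by ε-arcs."""
    start, accept = next(_fresh), next(_fresh)
    extra = [(start, None, r[0]), (start, None, s[0]),
             (r[1], None, accept), (s[1], None, accept)]
    return (start, accept, r[2] + s[2] + extra)

def concat(r, s):
    """R.S: ε-arc from R's accepting state to S's start state."""
    return (r[0], s[1], r[2] + s[2] + [(r[1], None, s[0])])

def star(r):
    """R*: bypass arc for ε, plus an arc back into R for repetition."""
    start, accept = next(_fresh), next(_fresh)
    extra = [(start, None, accept), (start, None, r[0]),
             (r[1], None, r[0]), (r[1], None, accept)]
    return (start, accept, r[2] + extra)

def eps_closure(states, edges):
    """All states reachable from states using ε-arcs only."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, word):
    """Simulate the ε-NFA on word by tracking the set of current states."""
    start, accept, edges = nfa
    current = eps_closure({start}, edges)
    for ch in word:
        moved = {dst for src, label, dst in edges
                 if src in current and label == ch}
        current = eps_closure(moved, edges)
    return accept in current

# (0 + 1)* 1 (0 + 1): strings whose second-last symbol is 1
zero_or_one = lambda: union(symbol("0"), symbol("1"))
nfa = concat(concat(star(zero_or_one()), symbol("1")), zero_or_one())
print(accepts(nfa, "0110"), accepts(nfa, "100"))  # True False
```

The ε-arc used in `concat` keeps the start state of R and the accepting state of S as the start and accepting states of the whole, as described above. Some textbooks merge the two middle states instead, which accepts the same language.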

Convert the regular expression (0 + 1)* 1 ( 0 + 1) to an ε- NFA.


The automaton for L = 0 is given by:

The automaton for L = 1 is given by:

The automaton for L = 0+1 is given by:

The automaton for L = (0+1)* is given by:

The automaton for L = (0+1)* 1 is given by:

Finally the ε-NFA for the regular expression: (0+1)*1(0+1) is given by:

Convert the regular expression (01+ 1)* to an ε- NFA.


The automaton for L = 01 is given by:

The automaton for L = 1 is given by:

The automaton for L = 01+1 is given by:

Finally ε- NFA for L = (01+1)* is given by:

Convert the regular expression (01+ 101) to an ε- NFA.


The automaton for L = 01 is given by:

The automaton for L = 101 is given by:

The automaton for L = (01+101) is given by:

Convert the regular expression (0+ 1)*01 to an ε- NFA.

The automaton for L = 01 is given by:

The automaton for L = (0+ 1)*is given by:

Epsilon-NFA for the regular expression (0+ 1)*01 is given by:

Convert the regular expression 0* + 1* + 2* to an ε- NFA.


The automaton for L = 0 is given by:

The automaton for L = 1 is given by:

The automaton for L = 2 is given by:

The automaton for L = 0* is given by:

The automaton for L = 1* is given by:

The automaton for L = 2* is given by:

Epsilon-NFA for the regular expression 0* + 1* + 2* is given by:

CONVERTING DFA’s TO REGULAR EXPRESSION USING STATE ELIMINATION TECHNIQUE
To build a regular expression for an FSM, instead of limiting the labels on the transitions of
the FSM to a single character or ε, we allow entire regular expressions as labels.
• For a given input FSM/FA M, we will construct a machine M’ such that M and M’ are
equivalent and M’ has only two states, start state and a single accepting state.
• M’ will also have just one transition, which will go from its start state to its accepting
state. The label on that transition will be a regular expression that describes all the strings
that could have driven the original machine M from its start state to some accepting state
Algorithm to create a regular expression from FSM: (State elimination)
1. Remove any states from the given FSM M that are unreachable from the start state.
2. If FSM M has no accepting states then halt and return the simple regular expression Ø.
3. If the start state of FSM M is part of a loop (i.e. it has any transitions coming into it), then
create a new start state s and connect s to M’s start state via an ε-transition. This new
start state s will have no transitions into it.
4. If FSM M has more than one accepting state, or if there is just one but there are
transitions out of it, create a new accepting state and connect each of M’s accepting states
to it via an ε-transition. Remove the old accepting states from the set of accepting states.
Note that the new accepting state will have no transitions out of it.
5. At this point, if M has only one state, then that state is both the start state and the
accepting state and M has no transitions. So L(M) = {ε}. Halt and return the simple
regular expression ε.
6. Until only the start state and the accepting state remain do:
6.1 Select some state s of M other than the start state or the accepting state.
6.2 Remove that state s from M.
6.3 Modify the transitions among the remaining states so that M accepts the same
strings. The labels on the rewritten transitions may be any regular expressions.
7. Return the regular expression that labels the one remaining transition from the start state
to the accepting state.
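Step 6 can be sketched as follows. This is an illustration under stated assumptions, not the notes' own procedure: the machine is a dictionary mapping (p, q) to a label in Python `re` syntax (so union is `|` and ε is the empty string), steps 3 and 4 are assumed to have been applied already, and the names `eliminate_states` and `union` are made up. For step 6.3 the sketch uses the usual rule: when s is removed, every path p → s → q with labels R1 and R3 and a loop R2 on s adds R1 R2* R3 to the label on p → q.

```python
import re

def union(x, y):
    """Union of two labels; None stands for 'no transition'."""
    if x is None:
        return y
    if y is None:
        return x
    return f"(?:{x}|{y})"

def eliminate_states(edges, start, accept):
    """Remove every state except start and accept (step 6).

    Assumes start has no incoming arcs and accept has no outgoing arcs.
    Returns the final label, or None for the empty language.
    """
    edges = dict(edges)
    states = {p for p, _ in edges} | {q for _, q in edges}
    for s in states - {start, accept}:
        loop = edges.pop((s, s), None)
        loop_part = f"(?:{loop})*" if loop else ""
        incoming = {p: r for (p, q), r in edges.items() if q == s}
        outgoing = {q: r for (p, q), r in edges.items() if p == s}
        for p in incoming:
            del edges[(p, s)]
        for q in outgoing:
            del edges[(s, q)]
        # Step 6.3: reroute every path that went through s
        for p, r_in in incoming.items():
            for q, r_out in outgoing.items():
                bypass = f"(?:{r_in}){loop_part}(?:{r_out})"
                edges[(p, q)] = union(edges.get((p, q)), bypass)
    return edges.get((start, accept))

# A three-state machine: q1 -a-> q2, q2 -b-> q2, q2 -a-> q3
m1 = {("q1", "q2"): "a", ("q2", "q2"): "b", ("q2", "q3"): "a"}
r1 = eliminate_states(m1, "q1", "q3")
print(r1)  # (?:a)(?:b)*(?:a), i.e. ab*a

# Hypothetical DFA for "even number of 1's", with new start S and accept F
m2 = {("S", "A"): "", ("A", "A"): "0", ("A", "B"): "1",
      ("B", "B"): "0", ("B", "A"): "1", ("A", "F"): ""}
r2 = eliminate_states(m2, "S", "F")
print(bool(re.fullmatch(r2, "0110")), bool(re.fullmatch(r2, "10")))
```

The order in which states are removed can change the shape of the resulting expression, but not the language it describes.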
Consider the following FSM M: Show a regular expression for L(M).
OR
Obtain the regular expression for the following finite automata using state elimination method.

We can build an equivalent machine M' by eliminating state q2 and replacing it by a transition
from q1 to q3 labeled with the regular expression ab*a.
So M' is:

Regular Expression = ab*a


Obtain the regular expression for the following finite automata using state elimination method.

There is no incoming edge into the initial state and no outgoing edge from the final state, and
there are only two states, initial and final, so no state needs to be eliminated.

Regular expression = (a+b+c) or (a U b U c)


Obtain the regular expression for the following finite automata using state elimination method.

There is no incoming edge into the initial state as well as no outgoing edge from final state.
After eliminating the state B:

Regular expression = ab
Obtain the regular expression for the following finite automata using state elimination method.

There is no incoming edge into the initial state as well as no outgoing edge from final state.
After eliminating the state B:

Regular expression = ab*c

Obtain the regular expression for the following finite automata using state elimination method.

Since the initial state has an incoming edge and the final state has an outgoing edge, we have to
create new initial and final states, connecting the new initial state to the old initial state
through ε and the old final state to the new final state through ε. The old final state becomes
a non-final state.

After removing state A:

After removing state B:

Regular expression: 0(10)*


Obtain the regular expression for the following finite automata using state elimination method.

Since there are multiple final states, we have to create a new final state.

After removing states C, D and E:

After removing state B:

Regular Expression: a(b+c+d)


Obtain the regular expression for the following finite automata using state elimination method.

After inserting new start state:

After removing state A:

After removing state B:

Regular expression: b(c +ab)*d

Obtain the regular expression for the following finite automata using state elimination method.

By creating new start and final states:

After removing state B:

After removing state A:

Regular expression: (0+10*1)*


Obtain the regular expression for the following finite automata using state elimination method.

By creating new start and final states:

After removing state q1:

After removing state q2:

After removing state q3:

Regular expression: 1*00*1(0+10*1)*

Obtain the regular expression for the following finite automata using state elimination method.

By creating new start state and final state:

After removing q1 state:

After removing q2 state:

After removing q3 state:


After removing q0 state:

Regular expression: (01+10)*


Consider the following FSM M: Show a regular expression for L(M).
OR
Obtain the regular expression for the following finite automata using state elimination method.

Since start state 1 has incoming transitions, we create a new start state and link that state to state
1 through ε.

Since accepting states 1 and 2 have outgoing transitions, we create a new accepting state and link
state 1 and state 2 to it through ε. Remove the old accepting states from the set of
accepting states (i.e. consider 1 and 2 as non-final states).

Consider the state 3 and remove that state:

Consider the state 2 and remove that state:

Consider the state 1 and remove that state:

Finally we have only the start and final states, with one transition from the start state to the
final state. The label on that transition is the regular expression.
Regular Expression = (ab U aaa* b)* (a U ε )

Consider the following FSM M: Show a regular expression for L(M).

After creating new start and final states:

After removing q2 state:

After removing q1 state:

After removing q0 state:

Regular expression: 0* (ε + 1⁺) = 0* 1*

Consider the following FSM M: Show a regular expression for L(M). OR


Construct regular expression for the following FSM using state elimination method.

By creating new start and final states:

After removing D state:

After removing E state:

After removing A state:

After removing B state:

After removing C state:

Regular expression = (00)*11(11)*

Consider the following FSM M: Show a regular expression for L(M).


OR
Construct regular expression for the following FSM using state elimination method.

By creating final state.

After removing q1 state:

After removing q2 state:

After removing q3 state:

Regular expression= 01*01*

Consider the following FSM M: Show a regular expression for L(M). OR


Construct regular expression for the following FSM using state elimination method.

By creating new start and final states:

After removing q0 state:

After removing q1 state:

After removing q2 state:

After removing q3 state:

Regular expression: (0+1)*1(0+1) +(0+1)*1(0+1)(0+1)


CONVERTING DFA’s TO REGULAR EXPRESSION USING KLEENE’S THEOREM
This method constructs regular expressions that describe sets of strings labeling certain
paths in the DFA’s transition diagram. However, the paths are allowed to pass through only a
limited subset of the states. We start with the simplest expressions, which describe paths that
are not allowed to pass through any intermediate state (i.e. they are a single state or a single
arc), and inductively build the expressions that let the paths go through progressively larger
sets of states. Finally the paths are allowed to go through any state, and at the end these
expressions represent all possible paths.
Let us consider a DFA with n states, numbered 1 to n, and use Rij(k) as the name of the regular
expression whose language is the set of strings w such that w is the label of a path from state i
to state j in the given DFA and that path has no intermediate node whose number is greater than k.
Note that the beginning and end points of the path are not intermediate, so there is no constraint
that i and/or j be less than or equal to k.
To construct the regular expressions Rij(k) we use an inductive definition, starting at k = 0 and
finally reaching k = n, which is the number of states in the DFA.
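For reference, the inductive step usually given for this construction is shown below. It is stated here as an assumption, being the standard textbook form, with + denoting union as in the rest of these notes. A path from i to j that uses no intermediate state numbered above k either avoids state k entirely, or goes from i to k, loops from k back to k zero or more times, and then goes from k to j:

```latex
R_{ij}^{(k)} \;=\; R_{ij}^{(k-1)} \;+\; R_{ik}^{(k-1)} \left( R_{kk}^{(k-1)} \right)^{*} R_{kj}^{(k-1)}
```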
When k = 0
The regular expressions describe paths that go through no intermediate state at all. There are
2 kinds of paths that meet this condition:
1. An arc from node (state) i to node j