0% found this document useful (0 votes)
68 views

Regular Expression

The document defines regular expressions and describes how to construct non-deterministic finite automata (NFAs) from regular expressions using Thompson's construction method. Key points include: - Regular expressions can be used to denote languages and are composed of symbols, empty string, concatenation, union, closure - Thompson's construction method builds an NFA from a regular expression recursively by applying theorems for union, concatenation, closure, and parentheses - The method shows that any language with a regular expression representation can be recognized by an NFA, proving their equivalence in expressive power

Uploaded by

qwdfgh
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
68 views

Regular Expression

The document defines regular expressions and describes how to construct non-deterministic finite automata (NFAs) from regular expressions using Thompson's construction method. Key points include: - Regular expressions can be used to denote languages and are composed of symbols, empty string, concatenation, union, closure - Thompson's construction method builds an NFA from a regular expression recursively by applying theorems for union, concatenation, closure, and parentheses - The method shows that any language with a regular expression representation can be recognized by an NFA, proving their equivalence in expressive power

Uploaded by

qwdfgh
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 23

Regular Expression

• ∑ be an alphabet which is used to denote the


input set. The regular expression over ∑ can
be defined as follows.
• Ø is a regular expression which denotes the
empty set.
• Ε is a regular expression, standing for the
language {ε} and it is a null string.
• a for some a in the alphabet ∑, standing for
the language {a}
• If r and s are regular expressions denoting the
languages L1 and L2 respectively, then
• r+s is equivalent to L1 U L2 (Union)
• rs is equivalent to L1L2 (Concatenation)
• r* is equivalent to L1* (closure)
• The r* is known as kleen closure or closure
which indicates occurrence of r for number of
times.
• For example if ∑={a} and we have regular
expression R=a*, then R is a set denoted by
R={є,a,aa,aaa,aaaa….}

• That is R includes any number of a’s as well as


empty string which indicates zero number of
a’s appearing, denoted by є character.
• Positive closure L+ denotes set of all strings
except є or null string .
• If ∑={a} and if we have regular expression R=a+
then R is a set denoted by
• R={a,aa,aaa,aaaa…}
• We can construct L*= є. L+
PRECEDENCE
• Tightest ( )
• Star (“*”)
• Concatenation (“.”, “”)

• Loosest Union (“∪”, “+”, “|”)


• EXAMPLE
• R1*R2 ∪ R3 = ((R1*)R2)UR3
Write RE for the given Language

• Beginning with 1: 1(0+1)*


• Ends with 1 : (0+1)* 1
• Substring 1: (0+1)* 1 (0+1)*
• Even number of a’s: (b+ab*a)*
• Write r.e for the language accepting the
strings which are starting with 1 and ending
with 0, over the set ∑={0,1}.

• the r.e to denote the language L over ∑={a,b}


such that all the strings do not contain the
substring “ab”.
• Construct r.e for the language L over the set
∑={a,b} in which the total number of a’s are
divisible by 3.
• Solution:
– The total number of a’s are divisible by 3. So valid
strings can be

L={baaa,babaa,bababab,ababab,…..}

The r.e can be,

R=(b*.a.b*.a.b*.a.b*)*
• Write r.e to denote a language L which accepts
all the strings which begin or end with either
00 or 11.
Solution:
The r.e can be categorized into two subparts.
R=L1+L2
L1= The strings which begin with 00 or 11.
L2= The strings which end with 00 or 11.
Let us find out L1 and L2:
L1=(00+11)(any number of 0’s and 1’s)
L1=(00+11)(0+1)*
• Similarly,
L2=(any number of 0’s and 1’s)(00+11)
L2=(0+1)*(00+11)

Hence

R=[(00+11)(0+1)*]+[(0+1)*(00+11)]
EQUIVALENCE
• L can be represented by a regexp
• ⇔
• L is a regular language

• Given regular expression R, we show there


exists NFA N such that R represents L(N)
Given regular expression R, we show there
exists NFA N such that R represents L(N)
Induction on the length of R:
Base Cases (R has length 1):
R=a a
(matches a single symbol)

R=ε ε
(matches the empty string)

R=∅
(matches nothing)
THOMPSON’S CONSTRUCTION
Inductive Step:
Assume R has length k > 1 and that any regular
expression of length < k represents a language
that can be recognized by an NFA

Four possibilities for R:


R=R∪S (Union Theorem)
R = RS (Concatenation Theorem)
R = R* (Closure Theorem)
R = ( R) (Parenthesized R)
THOMPSON’S CONSTRUCTION
UNION THEOREM - (R+S)

ε R ε

ε S ε
THOMPSON’S CONSTRUCTION

CONCATENATION THEOREM - (R·S)

ε
R S
THOMPSON’S CONSTRUCTION
CLOSURE OPERATION OF R- R*

R
ε ε

ε
THOMPSON’S CONSTRUCTION
PARENTHESIZED R- ( R)

R
Have Shown

L can be represented by a regexp



L is a regular language
Transform (1(0 ∪ 1))* to an NFA
ε 0 ε

1
ε ε

ε 0 ε
ε 1 5 6
ε ε
1 2 3 4 9 10
1
ε 7 8 ε

ε
THOMPSON’S CONSTRUCTION METHOD

RE to ε-NFA Example
• Convert R= (ab+a)* to an NFA
– We proceed in stages, starting from simple
elements and working our way up
a
a
b
b
a ε b
ab
THOMPSON’S CONSTRUCTION

RE to ε-NFA Example (2)


ab+a
a ε b
ε ε

a
ε ε

(ab+a)* a ε b
ε ε
ε ε

a
ε ε

You might also like