Extended RE's: Character Classes
Closure Properties
Not every language is a regular language.
However, there are some rules that say "if
these languages are regular, so is this one
derived from them."
There is also a powerful technique, the
pumping lemma, that helps us prove a
language not to be regular.
Key tool: Since we know RE's, DFA's,
NFA's, ε-NFA's all define exactly the
regular languages, we can use whichever
representation suits us when proving
something about a regular language.
Pumping Lemma
If L is a regular language, then there exists a
constant n such that every string w in L, of length
n or more, can be written as w = xyz, where:
1. 0 < |y|.
2. |xy| ≤ n.
3. For all i ≥ 0, xy^i z is also in L.
✦ Note y^i = y repeated i times; y^0 = ε.
The alternating quantifiers in the logical
statement of the PL make it very complex:
(∀L)(∃n)(∀w)(∃x, y, z)(∀i).
Proof of Pumping Lemma
Since we claim L is regular, there must be a
DFA A such that L = L(A).
Let A have n states; choose this n for the
pumping lemma.
Let w be a string of length n or more in L, say w =
a_1 a_2 ... a_m, where m ≥ n.
Let q_i be the state A is in after reading the
first i symbols of w.
✦ q_0 = start state, q_1 = δ(q_0, a_1), q_2 =
δ̂(q_0, a_1 a_2), etc.
Since there are only n different states, two of
q_0, q_1, ..., q_n must be the same; say q_i = q_j,
where 0 ≤ i < j ≤ n.
Let x = a_1 ... a_i; y = a_(i+1) ... a_j; z =
a_(j+1) ... a_m.
Then by repeating the loop from q_i to q_i with
label a_(i+1) ... a_j zero times, once, or more, we
can show that xy^k z is accepted by A for every k ≥ 0.
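The pigeonhole step of the proof can be sketched in code. The following is only an illustration, assuming a DFA given as a Python dictionary; the names `pump_split` and `accepts` and the two-state example DFA are hypothetical and do not come from the notes.

```python
def pump_split(delta, start, w):
    """Return (x, y, z) with w == x + y + z, y nonempty, and the DFA in the
    same state before and after reading y. Returns None if no state repeats."""
    seen = {start: 0}  # state -> shortest prefix length that reaches it
    state = start
    for j, symbol in enumerate(w, start=1):
        state = delta[(state, symbol)]
        if state in seen:  # q_i == q_j with i < j
            i = seen[state]
            return w[:i], w[i:j], w[j:]
        seen[state] = j
    return None


def accepts(delta, start, finals, w):
    """Run the DFA on w and report whether it ends in an accepting state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals


# Hypothetical DFA with n = 2 states: strings over {0, 1} with an even number of 1's.
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
x, y, z = pump_split(delta, "even", "1001")
pumped_ok = all(accepts(delta, "even", {"even"}, x + y * k + z) for k in range(5))
```

Because the search stops at the first repeated state, the repeat is found within the first n symbols, which is where the condition |xy| ≤ n comes from.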
PL Use
We use the PL to show a language L is not
regular.
Start by assuming L is regular.
Then there must be some n that serves as the
PL constant.
✦ We may not know what n is, but we can
work the rest of the "game" with n as a
parameter.
We choose some w that is known to be in L.
✦ Typically, w depends on n.
Applying the PL, we know w can be broken
into xyz, satisfying the PL properties.
✦ Again, we may not know how to break w,
so we use x, y, z as parameters.
We derive a contradiction by picking i (which
might depend on n, x, y, and/or z) such that
xy^i z is not in L.
Example
Consider the set of strings of 0's whose length is a
perfect square; formally L = {0^i | i is a square}.
We claim L is not regular.
Suppose L is regular. Then there is a constant
n satisfying the PL conditions.
Consider w = 0^(n^2), which is surely in L.
Then w = xyz, where |xy| ≤ n and y ≠ ε.
By PL, xyyz is in L. But the length of xyyz
is greater than n^2 and no greater than n^2 + n.
However, the next perfect square after n^2 is
(n + 1)^2 = n^2 + 2n + 1.
Thus, xyyz is not of square length and is not
in L.
Since we have derived a contradiction, the
only unproved assumption (that L is
regular) must be at fault, and we have a
"proof by contradiction" that L is not regular.
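The length argument can be checked numerically for small n. This is a sanity check rather than a proof, and the helper names `is_square` and `pumped_twice_lengths` are made up for this sketch. Since every string in the example consists only of 0's, it is enough to track lengths.

```python
import math


def is_square(m):
    """True if m is a perfect square."""
    r = math.isqrt(m)
    return r * r == m


def pumped_twice_lengths(n):
    """Lengths of xyyz over every split of w = 0^(n^2) with |xy| <= n and y nonempty."""
    w_len = n * n
    lengths = set()
    for xy_len in range(1, n + 1):
        for y_len in range(1, xy_len + 1):
            lengths.add(w_len + y_len)  # |xyyz| = |w| + |y|
    return lengths


# For each n tried, no way of splitting w yields a square length after pumping once.
no_square_found = all(
    not is_square(m) for n in range(1, 50) for m in pumped_twice_lengths(n)
)
```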
Closure Properties
Certain operations on regular languages are
guaranteed to produce regular languages.
Example: the union of regular languages is
regular; start with RE's, and apply + to get
an RE for the union.
Substitution
Take a regular language L over some alphabet
Σ.
For each a in Σ, let L_a be a regular language.
Let s be the substitution defined by s(a) = L_a
for each a.
✦ Extend s to strings by s(a_1 a_2 ... a_n) =
s(a_1)s(a_2) ... s(a_n); i.e., concatenate the
languages L_(a_1) L_(a_2) ... L_(a_n).
✦ Extend s to languages by s(M) = the union,
over all w in M, of s(w).
Then s(L) is regular.
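To make the two extensions of s concrete, here is a small sketch restricted to finite languages, represented as Python sets of strings. That restriction is an assumption of the example only (the theorem covers arbitrary regular languages), and the names `subst_string` and `subst_language` are hypothetical.

```python
def subst_string(s, w):
    """s(a_1 ... a_n): concatenate the languages s(a_1), ..., s(a_n)."""
    result = {""}  # s of the empty string is the language containing only ε
    for a in w:
        result = {u + v for u in result for v in s[a]}
    return result


def subst_language(s, M):
    """s(M): the union of s(w) over all w in M."""
    out = set()
    for w in M:
        out |= subst_string(s, w)
    return out


# Hypothetical substitution: s(a) = {0, 00}, s(b) = {1}.
s = {"a": {"0", "00"}, "b": {"1"}}
image = subst_language(s, {"ab", "b"})
```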
Proof That Substitution of Regular
Languages Into a Regular Language is
Regular
Let R be a regular expression for language L.
Let R_a be a regular expression for language
s(a) = L_a, for all symbols a in Σ.
Construct an RE E for s(L) by starting with R
and replacing each symbol a by the RE R_a.
Proof that L(E) = s(L) is an induction on the
height of (the expression tree for) RE R.
Basis: R is a single symbol, a. Then E = R_a,
L = {a}, and s(L) = s({a}) = L(R_a).
Cases where R is ε or ∅ are easy.
Induction: There are three cases, depending on
whether R = R1 + R2, R = R1R2, or R = R1*.
We'll do only R = R1R2.
L = L1L2 , where L1 = L(R1) and L2 =
L(R2 ).
Let E1 be R1, with each a replaced by Ra,
and E2 similarly.
By the IH, L(E1 ) = s(L1 ) and L(E2 ) = s(L2 ).
Thus, since E = E1E2, L(E) = L(E1)L(E2) = s(L1)s(L2) = s(L).
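The construction of E from R is a simple recursion on the expression tree. The sketch below assumes REs are represented as nested tuples such as ("concat", r1, r2); that representation and the function names are choices made for this example, not part of the notes. Python's `re` module is used only to test membership, so union is written | there instead of +.

```python
import re


def substitute(r, table):
    """Replace each symbol a in the RE tree r by the RE tree table[a]."""
    kind = r[0]
    if kind == "sym":
        return table[r[1]]
    if kind in ("eps", "empty"):
        return r
    if kind == "star":
        return ("star", substitute(r[1], table))
    # "union" or "concat": substitute in both operands
    return (kind, substitute(r[1], table), substitute(r[2], table))


def to_python_regex(r):
    """Translate an RE tree into Python `re` syntax."""
    kind = r[0]
    if kind == "sym":
        return re.escape(r[1])
    if kind == "eps":
        return "(?:)"   # matches only the empty string
    if kind == "empty":
        return "(?!)"   # never matches
    if kind == "star":
        return "(?:" + to_python_regex(r[1]) + ")*"
    op = "|" if kind == "union" else ""
    return "(?:" + to_python_regex(r[1]) + op + to_python_regex(r[2]) + ")"


# Hypothetical example: R = ab, with R_a = 0* and R_b = 1 + 2, so E = 0*(1 + 2).
R = ("concat", ("sym", "a"), ("sym", "b"))
table = {"a": ("star", ("sym", "0")),
         "b": ("union", ("sym", "1"), ("sym", "2"))}
E = substitute(R, table)
pattern = to_python_regex(E)
```

Here `re.fullmatch(pattern, "001")` succeeds because 00 is in s(a) and 1 is in s(b), while `re.fullmatch(pattern, "10")` fails.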
Applications of the Substitution Theorem
If L1 and L2 are regular, so is L1L2.
✦ Let s(a) = L1 and s(b) = L2. Substitute
into the regular language {ab}.
So is L1 ∪ L2.
✦ Substitute into {a, b}.
Ditto L1*.
✦ Substitute into L(a*).
Closure under homomorphism = substitution
of one string for each symbol.
✦ Special case of a substitution.
Example: Homomorphism
Let L = L(0*1*), and let h be a homomorphism
defined by h(0) = aa and h(1) = ε.
Then h(L) = L((aa)*) = all strings of an even
number of a's.
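This example can be checked on short strings. The sketch below enumerates the members of L(0*1*) of length at most 6 and applies h to each; the length bound and the name `apply_hom` are assumptions of the illustration.

```python
import re
from itertools import product


def apply_hom(h, w):
    """Apply homomorphism h to string w, one symbol at a time."""
    return "".join(h[c] for c in w)


h = {"0": "aa", "1": ""}  # h(0) = aa, h(1) = ε

images = set()
for length in range(7):
    for letters in product("01", repeat=length):
        w = "".join(letters)
        if re.fullmatch(r"0*1*", w):  # keep only members of L
            images.add(apply_hom(h, w))
```

Every image collected this way is a string of a's of even length, as the example claims.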
Closure Under Inverse Homomorphism
h^-1(L) = {w | h(w) is in L}.
See argument in course reader. Briefly:
✦ Given homomorphism h and regular
language L, start with a DFA A for L.
✦ Construct DFA B for h^-1(L), by having
B go from state q to state p on input a if
δ̂(q, h(a)) = p, where δ̂ is the extended
transition function of A.
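The construction of B can be sketched as follows. The example assumes that B keeps the states, start state, and accepting states of A, which is the usual form of this construction but is not spelled out in the summary above. The function names and the example DFA and homomorphism are hypothetical.

```python
from itertools import product


def extended_delta(delta, q, s):
    """Extended transition function: the state reached from q after reading s."""
    for c in s:
        q = delta[(q, c)]
    return q


def inverse_hom_dfa(states, delta, h):
    """Transition table of B: on input a, state q goes to delta-hat_A(q, h(a))."""
    return {(q, a): extended_delta(delta, q, image)
            for q in states for a, image in h.items()}


# Hypothetical A: strings over {a} with an even number of a's.
states_a = {"even", "odd"}
delta_a = {("even", "a"): "odd", ("odd", "a"): "even"}
start, finals = "even", {"even"}  # assumed to be shared by A and B

h = {"0": "aa", "1": "a"}
delta_b = inverse_hom_dfa(states_a, delta_a, h)


def accepts(delta, w):
    return extended_delta(delta, start, w) in finals


# B should accept w exactly when A accepts h(w).
agree = all(
    accepts(delta_b, w) == accepts(delta_a, "".join(h[c] for c in w))
    for n in range(6)
    for w in map("".join, product("01", repeat=n))
)
```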
Closure Under Reversal
The reverse of a string w = a_1 a_2 ... a_n is
a_n ... a_2 a_1.
✦ Denoted w^R.
✦ Note ε^R = ε.
The reverse of a language L is the set
containing the reverse of each string in L.
If L is regular, so is L^R.
✦ Proof: use RE's, recursive reversal as in
course reader.
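The course reader's proof is not reproduced here, but the standard recursive rules are (E + F)^R = E^R + F^R, (EF)^R = F^R E^R, and (E*)^R = (E^R)*, with symbols, ε, and ∅ left unchanged. A minimal sketch of that recursion follows, assuming REs are represented as nested tuples; the representation and the names `reverse_re` and `to_python_regex` are choices made for this example.

```python
import re
from itertools import product


def reverse_re(r):
    """Return an RE tree for the reverse of the language of r."""
    kind = r[0]
    if kind in ("sym", "eps", "empty"):
        return r
    if kind == "star":
        return ("star", reverse_re(r[1]))
    if kind == "union":
        return ("union", reverse_re(r[1]), reverse_re(r[2]))
    # concat: reverse both operands and swap their order
    return ("concat", reverse_re(r[2]), reverse_re(r[1]))


def to_python_regex(r):
    """Translate an RE tree into Python `re` syntax (union is written |)."""
    kind = r[0]
    if kind == "sym":
        return re.escape(r[1])
    if kind == "eps":
        return "(?:)"
    if kind == "empty":
        return "(?!)"
    if kind == "star":
        return "(?:" + to_python_regex(r[1]) + ")*"
    op = "|" if kind == "union" else ""
    return "(?:" + to_python_regex(r[1]) + op + to_python_regex(r[2]) + ")"


# Hypothetical example: R = 0(1 + 01)*, whose reversal is (1 + 10)*0.
R = ("concat", ("sym", "0"),
     ("star", ("union", ("sym", "1"), ("concat", ("sym", "0"), ("sym", "1")))))
R_rev = reverse_re(R)

# Check on all strings up to length 6 that w^R matches R^R exactly when w matches R.
p, p_rev = to_python_regex(R), to_python_regex(R_rev)
agrees = all(
    bool(re.fullmatch(p, w)) == bool(re.fullmatch(p_rev, w[::-1]))
    for n in range(7)
    for w in map("".join, product("01", repeat=n))
)
```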