Auto Chapter2 -1 (1)

Uploaded by

Shafi Esa
Copyright
© © All Rights Reserved

Chapter 2

Regular Expression and Regular languages


2.1. Regular expression
 We construct regular expressions from primitive constituents by
repeatedly applying certain recursive rules.
 Let Σ be a given alphabet. Then
1. ∅, λ, and a ∈ Σ are all regular expressions. These are called
primitive regular expressions.
2. If r1 and r2 are regular expressions, so are r1 + r2, r1 · r2, r1*, and (r1).
We use + to denote union, · for concatenation, and * for star-closure.
3. A string is a regular expression if and only if it can be derived from
the primitive regular expressions by a finite number of applications
of the rules in (2).
 Example: For Σ = {a, b, c}, the string (a + b · c)* · (c + ∅) is a regular
expression, since it is constructed by application of the above rules.
For example, if we take r1 = c and r2 = ∅, we find that c + ∅ and (c +
∅) are also regular expressions. Repeating this, we eventually generate
the whole string. On the other hand, (a + b +) is not a regular
expression, since there is no way it can be constructed from the
primitive regular expressions.
 Regular expressions can be used to describe some simple languages. If
r is a regular expression, L(r) denotes the language associated with r.
 The language L (r) denoted by any regular expression r is defined by
the following rules.
 ∅ is a regular expression denoting the empty set,
 λ is a regular expression denoting {λ},
 For every a ∈ Σ, a is a regular expression denoting {a}.
 If r1 and r2 are regular expressions, then
 L (r1 + r2 ) = L (r1 ) ∪ L (r2 ),
 L (r1 · r2 ) = L (r1 ) L (r2 ),
 L ((r1 )) = L (r1 ),
 L (r1 * ) = (L (r1 ))*.
 Example: Exhibit the language L (a* · (a + b)) in set notation.
L (a* · (a + b)) = L (a*) L (a + b)
= (L (a))* (L (a) ∪ L (b))
= {λ, a, aa, aaa, ...}{a, b}
= {a, aa, aaa, ..., b, ab, aab, ...}.
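The set computation above can be reproduced mechanically. The following is a minimal sketch, not part of the original slides: it represents λ by the empty string and, as an assumption needed to keep the sets finite, truncates every language to strings of length at most MAX_LEN. The helper names union, concat, and star are illustrative.

```python
from itertools import product

MAX_LEN = 3  # assumption: keep only strings of length <= MAX_LEN


def union(l1, l2):
    """L(r1 + r2) = L(r1) ∪ L(r2)."""
    return l1 | l2


def concat(l1, l2):
    """L(r1 · r2) = L(r1) L(r2), truncated to MAX_LEN."""
    return {u + v for u, v in product(l1, l2) if len(u + v) <= MAX_LEN}


def star(l):
    """L(r1*) = (L(r1))*, truncated to MAX_LEN."""
    result = {""}      # "" plays the role of λ
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more factor from l.
        frontier = concat(frontier, l) - result
        result |= frontier
    return result


A, B = {"a"}, {"b"}
language = concat(star(A), union(A, B))
print(sorted(language, key=lambda s: (len(s), s)))
# prints ['a', 'b', 'aa', 'ab', 'aaa', 'aab']
```

Raising MAX_LEN lists more members of the same two families a, aa, aaa, ... and b, ab, aab, ....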
2.2. Connection between Regular Expressions and Regular Languages
 For every regular language there is a regular expression that denotes it,
and every regular expression denotes a regular language.
 Theorem – Let r be a regular expression. Then there exists some
nondeterministic finite automaton that accepts L (r). Consequently,
L (r) is a regular language.
 Theorem - Let L be a regular language. Then there exists a regular
expression r such that L = L(r).
2.3. Regular Grammar
Grammars
 A grammar for the English language tells us whether a particular
sentence is well formed or not.
 A typical rule of English grammar is “a sentence can consist of a noun
phrase followed by a predicate.”
〈 sentence 〉 → 〈 noun phrase 〉 〈 predicate 〉
〈 noun phrase 〉 → 〈 article 〉 〈 noun 〉
〈 predicate 〉 → 〈 verb 〉
 If we associate the actual words “a” and “the” with 〈 article 〉 , “boy”
and “dog” with 〈 noun 〉 , and “runs” and “walks” with 〈 verb 〉 ,
then the grammar tells us that the sentences “a boy runs” and “the dog
walks” are properly formed.
 We start with the top-level concept, here 〈 sentence 〉 , and
successively reduce it to the irreducible building blocks of the
language.
 The generalization of these ideas leads us to formal grammars.
 A grammar G is defined as a quadruple G = (V, T, S, P), where V is a
finite set of objects called variables, T is a finite set of objects called
terminal symbols, S ∈ V is a special symbol called the start variable, and P
is a finite set of productions.
 The sets V and T are nonempty and disjoint.
 The production rules specify how the grammar transforms one string
into another, and through this they define a language associated with
the grammar.
 All production rules are of the form x → y, where x is an element of
(V ∪ T)+ and y is in (V ∪ T)*.
 Given a string w of the form w = uxv, we say the production x → y is
applicable to this string, and we may use it to replace x with y, thereby
obtaining a new string z = uyv.
 This is written as w ⇒ z. We say that w derives z or that z is derived
from w.
 If w1 ⇒ w2 ⇒ ··· ⇒ wn, we say that w1 derives wn and write w1 ⇒* wn.
 The * indicates that an unspecified number of steps (including zero)
can be taken to derive wn from w1.
 By applying the production rules in a different order, a given grammar
can normally generate many strings.
 The set of all such terminal strings is the language defined or
generated by the grammar.
 Let G = (V, T, S, P) be a grammar. Then the set
L(G) = {w ∈ T* : S ⇒* w} is the language generated by G.
 If w ∈ L (G), then the sequence
S ⇒ w1 ⇒ w2 ⇒ ··· ⇒ wn ⇒ w is a derivation of the sentence w.
 The strings S, w1, w2, ..., wn, which contain variables as well as
terminals, are called sentential forms of the derivation.
 Example 1: Consider the grammar G = ({S}, {a, b}, S, P), with P given
by
S → aSb,
S → λ.
 Then S ⇒ aSb ⇒ aaSbb ⇒ aabb, so we can write S ⇒* aabb.
 The string aabb is a sentence in the language generated by G, while aaSbb is a
sentential form.
 Example 2: Find a grammar that generates
L = {aⁿbⁿ⁺¹ : n ≥ 0}.
 The idea of the previous example extends to this case: all we need is one
extra b. A production S → Ab supplies it, with the remaining productions
chosen so that A can derive the language in the previous example.
Hence, the grammar is G = ({S, A}, {a, b}, S, P), with productions
S → Ab,
A → aAb,
A → λ.
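The two example grammars can be checked by enumerating the sentences they derive. The sketch below is illustrative only and makes several assumptions that are not in the slides: variables are single uppercase letters, terminals are lowercase letters, λ is the empty string, and every production has a single variable on its left side. The function name generate is hypothetical. Sentential forms are pruned once they contain more than max_len terminals, which is enough for these grammars to terminate but not for arbitrary ones.

```python
from collections import deque


def generate(productions, start, max_len):
    """Return all sentences with at most max_len symbols derivable from start."""
    seen = {start}
    queue = deque([start])
    sentences = set()
    while queue:
        form = queue.popleft()
        if not any(c.isupper() for c in form):
            sentences.add(form)   # no variables left: this is a sentence
            continue
        # Apply every applicable production x -> y at every position.
        for i, symbol in enumerate(form):
            for body in productions.get(symbol, []):
                new_form = form[:i] + body + form[i + 1:]
                terminals = sum(1 for c in new_form if c.islower())
                if terminals <= max_len and new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
    return sentences


example1 = {"S": ["aSb", ""]}               # S -> aSb | λ
example2 = {"S": ["Ab"], "A": ["aAb", ""]}  # S -> Ab, A -> aAb | λ

print(sorted(generate(example1, "S", 4), key=len))  # ['', 'ab', 'aabb']
print(sorted(generate(example2, "S", 5), key=len))  # ['b', 'abb', 'aabbb']
```

Every sentence of the second grammar has exactly one more b than a, as the definition of L requires.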
 Two grammars G1 and G2 are equivalent if they generate the same
language, that is, if L (G1) = L (G2).
Right- and Left-Linear Grammars
 A grammar G = (V, T, S, P) is said to be right-linear if all productions are of
the form
A → xB, or
A → x,
where A, B ∈ V, and x ∈ T*.
 A grammar is said to be left-linear if all productions are of the form
A → Bx, or
A → x.
 A regular grammar is one that is either right-linear or left-linear.
 Example 1: The grammar G1 = ({S}, {a, b}, S, P1), with P1 given
as S → abS|a, is right-linear. The grammar G2 = ({S, S1, S2}, {a,
b}, S, P2), with productions S → S1ab, S1 → S1ab|S2, S2 → a, is
left-linear. Therefore, both G1 and G2 are regular grammars.
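 Example 2: The grammar G = ({S, A, B}, {a, b}, S, P) with
productions
S → A,
A → aB|λ,
B → Ab,
is not regular. Each production is individually in right-linear or
left-linear form, but A → aB is right-linear only and B → Ab is
left-linear only, so the grammar as a whole is neither right-linear
nor left-linear.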
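The definitions of right-linear and left-linear grammars translate directly into a check on the position of variables inside each production body. The following is a rough sketch under assumed conventions that are not from the slides: a grammar is a dictionary mapping each variable to a list of bodies, and each body is a tuple of symbols (so that multi-character names such as S1 are single symbols). The function names are illustrative.

```python
def is_right_linear(productions, variables):
    """True if every body is x or xB: a variable may only be the last symbol."""
    return all(
        not any(sym in variables for sym in body[:-1])
        for bodies in productions.values()
        for body in bodies
    )


def is_left_linear(productions, variables):
    """True if every body is x or Bx: a variable may only be the first symbol."""
    return all(
        not any(sym in variables for sym in body[1:])
        for bodies in productions.values()
        for body in bodies
    )


def is_regular(productions, variables):
    return (is_right_linear(productions, variables)
            or is_left_linear(productions, variables))


# G1: S -> abS | a
g1 = {"S": [("a", "b", "S"), ("a",)]}
# G2: S -> S1ab, S1 -> S1ab | S2, S2 -> a
g2 = {"S": [("S1", "a", "b")], "S1": [("S1", "a", "b"), ("S2",)], "S2": [("a",)]}
# G: S -> A, A -> aB | λ, B -> Ab   (the empty tuple stands for λ)
g3 = {"S": [("A",)], "A": [("a", "B"), ()], "B": [("A", "b")]}

print(is_regular(g1, {"S"}))              # True (right-linear)
print(is_regular(g2, {"S", "S1", "S2"}))  # True (left-linear)
print(is_regular(g3, {"S", "A", "B"}))    # False
```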
Right-Linear Grammars Generate Regular Languages
 Theorem: Let G = (V, T, S, P) be a right-linear grammar. Then L (G) is a
regular language.
 Example: Construct a finite automaton that accepts the language generated by
the grammar
V0 → aV1,
V1 → abV0|b,
where V0 is the start variable.
 We start the transition graph with vertices V0, V1, and Vf.
 The first production rule creates an edge labeled a between V0 and V1.
 For the second rule, we need to introduce an additional vertex so that there
is a path labeled ab between V1 and V0.
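 Finally, we need to add an edge labeled b between V1 and Vf, giving the
following automaton.
[Figure: transition graph with an edge labeled a from V0 to V1, an edge labeled a from V1 to the additional vertex, an edge labeled b from the additional vertex to V0, and an edge labeled b from V1 to Vf.]
 The language generated by the grammar and accepted by the automaton is
the regular language L ((aab)*ab).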
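The construction in this example can be sketched in code. This is an illustrative sketch rather than the construction used in the proof of the theorem: it assumes every production has the form A → xB or A → x with x nonempty (true for this grammar, but it excludes λ-productions and unit productions), and the names grammar_to_nfa, accepts, and the intermediate vertex names "_1", "_2", ... are hypothetical.

```python
def grammar_to_nfa(productions, final="Vf"):
    """Build NFA transitions from productions (head, terminal_string, target).

    target is the variable at the end of the body, or None for A -> x.
    """
    transitions = {}   # (state, symbol) -> set of next states
    fresh = 0
    for head, terminals, target in productions:
        dest = target if target is not None else final
        state = head
        for i, symbol in enumerate(terminals):
            if i == len(terminals) - 1:
                nxt = dest
            else:
                fresh += 1
                nxt = f"_{fresh}"   # additional vertex on a multi-symbol path
            transitions.setdefault((state, symbol), set()).add(nxt)
            state = nxt
    return transitions


def accepts(transitions, start, final, word):
    """Simulate the NFA on word by tracking the set of reachable states."""
    current = {start}
    for symbol in word:
        current = set().union(
            *(transitions.get((q, symbol), set()) for q in current)
        )
    return final in current


# V0 -> aV1, V1 -> abV0 | b
PRODUCTIONS = [("V0", "a", "V1"), ("V1", "ab", "V0"), ("V1", "b", None)]
nfa = grammar_to_nfa(PRODUCTIONS)

for word in ["ab", "aabab", "aab", "abab"]:
    print(word, accepts(nfa, "V0", "Vf", word))
# ab True, aabab True, aab False, abab False
```

The accepted words are exactly those of the form (aab)*ab among the samples tried, matching the language stated above.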
Right-Linear Grammars for Regular Languages
 Theorem: If L is a regular language on the alphabet Σ, then there exists a right-
linear grammar G = (V, Σ, S, P) such that L = L (G).
 To show that every regular language can be generated by some right-linear
grammar, we start from the DFA for the language and construct a right-linear
grammar from it.
 The states of the DFA now become the variables of the grammar, and the
symbols causing the transitions become the terminals in the productions.
 Example: Construct a right-linear grammar for L (aab*a).
 The transition function for a DFA, together with the corresponding
grammar productions, is
δ(q0, a) = q1    q0 → aq1
δ(q1, a) = q2    q1 → aq2
δ(q2, b) = q2    q2 → bq2
δ(q2, a) = qf    q2 → aqf
qf ∈ F           qf → λ
 The string aaba can be derived with the constructed grammar by
q0 ⇒ aq1 ⇒ aaq2 ⇒ aabq2 ⇒ aabaqf ⇒ aaba.
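The step from automaton to grammar can be sketched as follows. The transition table used here is an assumption: it is the one implied by the derivation q0 ⇒ aq1 ⇒ aaq2 ⇒ aabq2 ⇒ aabaqf ⇒ aaba, with missing transitions understood as leading to a trap state. The function names dfa_to_grammar and derive are illustrative.

```python
def dfa_to_grammar(delta, finals):
    """Each transition δ(p, a) = q gives p -> aq; each final state gives q -> λ."""
    productions = {}
    for (state, symbol), target in delta.items():
        productions.setdefault(state, []).append((symbol, target))
    for state in finals:
        productions.setdefault(state, []).append(("", None))  # q -> λ
    return productions


def derive(productions, start, word):
    """Return the sentential forms of a derivation of word, or None if none exists."""
    forms = [start]
    prefix, state = "", start
    for symbol in word:
        targets = [t for s, t in productions.get(state, []) if s == symbol]
        if not targets:
            return None
        prefix, state = prefix + symbol, targets[0]
        forms.append(prefix + state)
    if ("", None) not in productions.get(state, []):
        return None   # the last variable cannot be erased
    forms.append(prefix)
    return forms


# Assumed DFA for L(aab*a), states named as in the derivation above.
DELTA = {("q0", "a"): "q1", ("q1", "a"): "q2",
         ("q2", "b"): "q2", ("q2", "a"): "qf"}
grammar = dfa_to_grammar(DELTA, {"qf"})

print(" => ".join(derive(grammar, "q0", "aaba")))
# q0 => aq1 => aaq2 => aabq2 => aabaqf => aaba
```

Because the automaton is deterministic, each step of the derivation is forced, which is why derive can simply follow the input word.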
 Theorem – A language L is regular if and only if there exists a left-linear
grammar G such that L = L (G).
 Theorem – A language L is regular if and only if there exists a regular
grammar G such that L = L (G).
2.4. Pumping lemma and non-regular language
Using the Pigeonhole Principle
 The term “pigeonhole principle” is used by mathematicians to refer to the
following simple observation. If we put n objects into m boxes (pigeonholes),
and if n > m, then at least one box must have more than one item in it.
A Pumping Lemma
 The pumping lemma for regular languages uses the pigeonhole principle in
another form.
 Theorem - Let L be an infinite regular language. Then there exists some
positive integer m such that any w ∈ L with |w| ≥ m can be decomposed as
w = xyz
with
|xy| ≤ m
and
|y| ≥ 1,
such that
wᵢ = xyⁱz
is also in L for all i = 0, 1, 2, ....
 Example: Use the pumping lemma to show that L = {aⁿbⁿ : n ≥ 0} is not
regular.
 Assume that L is regular, so that the pumping lemma must hold. We do not
know the value of m, but whatever it is, we can always choose n = m, that
is, w = aᵐbᵐ. Because |xy| ≤ m, the substring y must consist entirely of
a's. Suppose |y| = k, with k ≥ 1. Then the string obtained by using i = 0
is w0 = aᵐ⁻ᵏbᵐ, which has fewer a's than b's and is clearly not in L.
This contradicts the pumping lemma and thereby indicates that the
assumption that L is regular must be false.
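The argument can be illustrated, though not proved, by brute force for small values of m. The sketch below takes L = {a^n b^n : n ≥ 0} and w = a^m b^m, tries every decomposition w = xyz with |xy| ≤ m and |y| ≥ 1, and records those whose pumped strings stay in L. The function names and the bound max_i on the pumping exponent are assumptions made for this example.

```python
def in_language(s):
    """Membership test for L = {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def pumpable_decompositions(w, m, max_i=3):
    """Decompositions (x, y, z) of w with |xy| <= m and |y| >= 1
    such that x y^i z is in L for every i in 0..max_i."""
    good = []
    for xy_len in range(1, m + 1):
        for y_len in range(1, xy_len + 1):
            x = w[:xy_len - y_len]
            y = w[xy_len - y_len:xy_len]
            z = w[xy_len:]
            if all(in_language(x + y * i + z) for i in range(max_i + 1)):
                good.append((x, y, z))
    return good


for m in range(1, 6):
    w = "a" * m + "b" * m
    print(m, pumpable_decompositions(w, m))   # the list is empty for every m
```

No decomposition survives because y always lies inside the leading block of a's, so already i = 0 removes at least one a and leaves the b's unchanged.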
