Automata Chapter 2
The languages accepted by finite automata can be described by simple expressions
called regular expressions. They are a compact way to represent such languages.
A language that can be denoted by some regular expression is called a regular language.
A regular expression can also be described as a pattern that defines a set of strings.
Regular expressions are used to match character combinations in strings, and string-searching
algorithms use such patterns to find or manipulate text.
Regular expressions are an algebraic way to describe regular languages.
They define exactly the languages that the various forms of finite automata can describe.
Regular expressions also offer something that finite automata do not: a declarative way
to express the strings that we want to accept.
ε is a regular expression denoting the language that contains only the empty string: L(ε) = {ε}
Some RE Examples
(a+b)*: the set of strings of a's and b's of any length, including the null string. So
L = {ε, a, b, aa, ab, ba, bb, aaa, …}
(a+b)*abb: the set of strings of a's and b's ending with the string abb. So
L = {abb, aabb, babb, aaabb, ababb, …}
(11)*: the set of strings consisting of an even number of 1's, including the empty string. So
L = {ε, 11, 1111, 111111, …}
(aa + ab + ba + bb)*: the strings of a's and b's of even length, obtained by concatenating
any combination of the strings aa, ab, ba and bb, including the null string. So
L = {ε, aa, ab, ba, bb, aaab, aaba, …}
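The expressions above can be tried out with Python's re module. This is only a minimal sketch and not part of the original notes: the formal union + is written | in Python, re.fullmatch is used so that the whole string has to match, and the helper name accepts is made up for illustration.

```python
import re

# Formal notation -> Python regex (the union '+' becomes '|').
EXAMPLES = {
    "(a+b)*": r"(a|b)*",
    "(a+b)*abb": r"(a|b)*abb",
    "(11)*": r"(11)*",
    "(aa+ab+ba+bb)*": r"(aa|ab|ba|bb)*",
}

def accepts(formal, s):
    """Return True if the whole string s matches the given expression."""
    return re.fullmatch(EXAMPLES[formal], s) is not None

print(accepts("(a+b)*abb", "ababb"))   # True: the string ends with abb
print(accepts("(11)*", "111"))         # False: odd number of 1's
print(accepts("(aa+ab+ba+bb)*", ""))   # True: the null string has even length
```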
Example 1
Write the regular expression for the language accepting all strings of a's over the set ∑ =
{a}
Solution:
All combinations of a's means that a may appear zero, one, two or more times. If a appears zero
times, we get the null string. So the expected set is {ε, a, aa, aaa, ....}, and the
regular expression for it is
R = a*
Example 2. Write the regular expression for the language accepting all strings containing any
number of a's and b's.
Solution:
R = (a + b)*
This gives the set L = {ε, a, aa, b, bb, ab, ba, aba, bab, .....}, that is, any combination of a's and b's.
The expression (a + b)* denotes every string over a and b, including the null string.
Example 3. Write the regular expression for the language accepting all strings that
start with 1 and end with 0, over ∑ = {0, 1}.
Solution:
In the regular expression, the first symbol must be 1 and the last symbol must be 0. The regular
expression is as follows:
R = 1 (0+1)* 0
Solution:
The language can be predicted from the regular expression by finding the meaning of it. We will
first split the regular expression as:
L = {The language consists of the strings in which the a's appear in triples; there is no
restriction on the number of b's}
Example 5. Write the regular expression for the language of strings starting and ending with a
and having any number of b's in between.
Solution:
R = a b* a
Example 6. Write the regular expression for the language starting with a but not having
consecutive b's.
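R = (a + ab)*
Note that (a + ab)* also generates the null string. If the null string is to be excluded, use
(a + ab)(a + ab)* instead.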
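The claim of Example 6 can be checked by brute force. The sketch below is an illustration only and not part of the original notes: it compares the Python pattern (a|ab)* with a direct description of the language (no substring bb and no leading b) on every string up to length 8. The helper names are made up, and a bounded check of this kind is evidence, not a proof.

```python
import re
from itertools import product

PATTERN = re.compile(r"(a|ab)*")  # Python spelling of (a + ab)*

def no_bb_and_no_leading_b(s):
    """Direct description: no 'bb' and the string does not start with 'b'."""
    return "bb" not in s and not s.startswith("b")

def all_strings(max_len, alphabet="ab"):
    """Yield every string over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

mismatches = [
    s for s in all_strings(8)
    if (PATTERN.fullmatch(s) is not None) != no_bb_and_no_leading_b(s)
]
print(mismatches)  # [] means both descriptions agree up to length 8
```

The null string satisfies both descriptions, so a language that must really start with a needs one mandatory copy of (a|ab) in front of the starred part.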
Regular Sets
Any set of strings denoted by a regular expression is called a regular set.
Property 4. The difference of two regular sets is regular.
Proof −
Let us take two regular expressions −
RE1 = a (a*) and RE2 = (aa)*
So, L1 = {a, aa, aaa, aaaa, ....} (Strings of all possible lengths excluding Null)
L2 = { ε, aa, aaaa, aaaaaa,.......} (Strings of even length including Null)
L1 – L2 = {a, aaa, aaaaa, aaaaaaa, ....}
(Strings of all odd lengths excluding Null)
RE (L1 – L2) = a (aa)* which is a regular expression.
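The example for Property 4 can be reproduced with ordinary Python sets. This is a minimal sketch under the assumption that it is enough to look at strings of a's up to a fixed length (MAX_LEN is an arbitrary choice); it illustrates the example and is not a proof of the property.

```python
import re

MAX_LEN = 12
# All strings over {a} with length 0..MAX_LEN.
strings = ["a" * n for n in range(MAX_LEN + 1)]

L1 = {s for s in strings if re.fullmatch(r"aa*", s)}            # RE1 = a(a*)
L2 = {s for s in strings if re.fullmatch(r"(aa)*", s)}          # RE2 = (aa)*
expected = {s for s in strings if re.fullmatch(r"a(aa)*", s)}   # a(aa)*

print(L1 - L2 == expected)  # True: only the odd-length strings remain
```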
Property 5. The reversal of a regular set is regular.
Proof −
We have to prove that L^R is also regular if L is a regular set.
Let L = {01, 10, 11}
RE (L) = 01 + 10 + 11
L^R = {10, 01, 11}
RE (L^R) = 10 + 01 + 11, which is regular
Property 6. The closure of a regular set is regular.
Proof −
If L = {a, aaa, aaaaa, .......} (Strings of odd length excluding Null)
i.e., RE (L) = a (aa)*
L* = {ε, a, aa, aaa, aaaa, aaaaa, ……………} (Strings of all lengths including Null)
RE (L*) = a*
Property 7. The concatenation of two regular sets is regular.
Proof −
Let RE1 = (0+1)*0 and RE2 = 01(0+1)*
Here, L1 = {0, 00, 10, 000, 010, ......} (Set of strings ending in 0)
and L2 = {01, 010,011,.....} (Set of strings beginning with 01)
Then, L1 L2 = {001, 0010, 0011, 0001, 00010, 00011, 1001, 10010,.............}
This is the set of strings containing 001 as a substring, which can be represented by the RE
(0 + 1)*001(0 + 1)*
Hence, proved.
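The concatenation example of Property 7 can also be checked on a finite sample. The sketch below is an illustration under stated assumptions: only binary strings up to length MAX_LEN (an arbitrary bound) are considered, and the Python patterns are direct translations of the expressions above.

```python
import re
from itertools import product

MAX_LEN = 8
# Every binary string with length 0..MAX_LEN.
words = ["".join(p) for n in range(MAX_LEN + 1) for p in product("01", repeat=n)]

L1 = [w for w in words if re.fullmatch(r"(0|1)*0", w)]    # strings ending in 0
L2 = [w for w in words if re.fullmatch(r"01(0|1)*", w)]   # strings beginning with 01

# Concatenation L1 L2, truncated to the same length bound.
concat = {x + y for x in L1 for y in L2 if len(x + y) <= MAX_LEN}
expected = {w for w in words if re.fullmatch(r"(0|1)*001(0|1)*", w)}

print(concat == expected)  # True: exactly the strings containing 001
```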
2.3 Identities Related to Regular Expressions
∅* = ε
ε* = ε
RR* = R*R
R*R* = R*
(R*)* = R*
(PQ)*P =P(QP)*
(a+b)* = (a*b*)* = (a*+b*)* = (a+b*)* = a*(ba*)*
R + ∅ = ∅ + R = R (The identity for union)
R ε = ε R = R (The identity for concatenation)
∅ L = L ∅ = ∅ (The annihilator for concatenation)
R + R = R (Idempotent law)
L (M + N) = LM + LN (Left distributive law)
(M + N) L = ML + NL (Right distributive law)
ε + RR* = ε + R*R = R*
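Some of these identities can be sanity-checked by comparing both sides on all short strings. The sketch below is not from the original notes: same_language is a hypothetical helper, the concrete choices P = a, Q = b and R = ab are assumptions made for the test, and agreement up to a bounded length is supporting evidence rather than a proof.

```python
import re
from itertools import product

def same_language(p, q, alphabet="ab", max_len=6):
    """Compare two Python regexes on every string of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if (re.fullmatch(p, s) is None) != (re.fullmatch(q, s) is None):
                return False
    return True

IDENTITIES = [
    (r"(ab)*a", r"a(ba)*"),        # (PQ)*P = P(QP)* with P = a, Q = b
    (r"(a|b)*", r"(a*b*)*"),       # (a+b)* = (a*b*)*
    (r"(a|b)*", r"a*(ba*)*"),      # (a+b)* = a*(ba*)*
    (r"(ab)*(ab)*", r"(ab)*"),     # R*R* = R* with R = ab
    (r"(ab(ab)*)?", r"(ab)*"),     # ε + RR* = R* with R = ab
]

for left, right in IDENTITIES:
    print(left, "=", right, same_language(left, right))
```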
To find the regular expression of a finite automaton, we use Arden's Theorem along
with the properties of regular expressions.
Let P and Q be two regular expressions.
If P does not contain the null string, then the equation R = Q + RP has the unique solution R = QP*
Proof −
R = Q + (Q + RP)P [After substituting R = Q + RP]
= Q + QP + RP^2
When we substitute the value of R recursively again and again, we get the following equation −
R = Q + QP + QP^2 + QP^3 + …
R = Q (ε + P + P^2 + P^3 + …)
R = QP* [As P* represents (ε + P + P^2 + P^3 + …)]
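The statement R = QP* can be checked on small finite languages. This is a minimal sketch under stated assumptions: languages are represented as Python sets of strings, every set is truncated to strings of length at most MAX_LEN, and the particular P and Q below are arbitrary choices made for illustration (P must not contain the null string).

```python
MAX_LEN = 6

def concat(A, B):
    """Concatenation of two languages, truncated to length MAX_LEN."""
    return {x + y for x in A for y in B if len(x + y) <= MAX_LEN}

def star(A):
    """Kleene closure of A, truncated to length MAX_LEN."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more factor from A.
        frontier = concat(frontier, A) - result
        result |= frontier
    return result

P = {"a", "bb"}   # does not contain the null string
Q = {"b", ""}

R = concat(Q, star(P))          # candidate solution R = QP*
print(R == Q | concat(R, P))    # True: R satisfies R = Q + RP up to MAX_LEN
```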
Assumptions for Applying Arden’s Theorem
Method
Step 1 − Create equations of the following form for all the states of the DFA having n states with
initial state q1.
q1 = q1 R11 + q2 R21 + … + qn Rn1 + ε
q2 = q1 R12 + q2 R22 + … + qn Rn2
...
qn = q1 R1n + q2 R2n + … + qn Rnn
Rij represents the set of labels of edges from qi to qj. If no such edge exists, then Rij = ∅.
Step 2 − Solve these equations to get the equation for the final state in terms of Rij.
Example 1
The equations for the three states q1, q2, and q3 are as follows −
q3 = q2a
q2 = q1b (b + ab)* (Applying Arden’s Theorem)
q1 = q1a + q3a + ε
= (a + b(b + ab)*aa)*
Example 2
Construct a regular expression corresponding to the automaton given below −
Solution −
Here the initial state is q1 and the final state is q2
Now we write down the equations −
q1 = q1 0 + ε
q2 = q1 1 + q2 0
q3 = q2 1 + q3 0 + q3 1
Now, we will solve these three equations −
q1 = ε 0* [By Arden’s theorem]
So, q1 = 0* [As, εR = R]
q2 = 0*1 + q2 0
So, q2 = 0*1(0)* [By Arden’s theorem]
Hence, the regular expression is 0*10*.
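The result can be cross-checked by simulating the automaton. Because the figure is not reproduced here, the transition table in the sketch below is an assumption read off the three state equations (each term qi x on the right-hand side of qj is taken as an edge from qi to qj labelled x). The names DELTA and dfa_accepts are made up for illustration.

```python
import re
from itertools import product

# Transitions inferred from the state equations (assumed, not from the figure).
DELTA = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q3",
    ("q3", "0"): "q3", ("q3", "1"): "q3",
}

def dfa_accepts(s, start="q1", finals=("q2",)):
    """Run the DFA on s and report whether it stops in a final state."""
    state = start
    for ch in s:
        state = DELTA[(state, ch)]
    return state in finals

# Compare the DFA with the derived expression on all strings up to length 8.
ok = all(
    dfa_accepts(s) == (re.fullmatch(r"0*10*", s) is not None)
    for n in range(9)
    for s in ("".join(p) for p in product("01", repeat=n))
)
print(ok)  # True: the DFA and 0*10* agree on every tested string
```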
2.5 Regular grammar
In the literary sense of the term, grammars denote syntactical rules for conversation in natural
languages. Linguists have attempted to define grammars since the inception of natural
languages like English, Sanskrit, Mandarin, etc.
The theory of formal languages is applied extensively in computer science. Noam Chomsky gave a
mathematical model of grammar in 1956, which is effective for writing computer languages.
Grammar
P is the set of production rules over terminals and non-terminals. A production rule has the form α
→ β, where α and β are strings over VN ∪ ∑ and at least one symbol of α belongs to VN.
Example
Grammar G1 −
Here,
Productions, P : S → AB, A → a, B → b
N = {q1, q2, q3}, T = {a, b}, S = {q1}
P = {q1 → aq1, q1 → bq2, q2 → aq3, q2 → bq2, q3 → aq1, q3 → bq2}
Derivations from a Grammar
Strings may be derived from other strings using the productions in a grammar.
Example
As another example, consider the sentence The peacock is a beautiful bird. Starting with the
sentence symbol S, the sentence is divided into noun phrase one NP1 and verb phrase VP.
Noun phrase one is then split into article one and noun one, where article one is The and
noun one is Peacock. VP is divided into verb V, article two, adjective and
noun two N2, where the verb is is, article two is a, the adjective is beautiful and noun two is
bird.
<Art1> → The
<N1> → Peacock
<VP> → <V> <Art2> <Adj> <N2>
<V> → is
<Art2> → a
<Adj> → beautiful
<N2> → Bird
The single arrow (→) is read "can be rewritten as", and the
double arrow (⇒) is read "directly derives".
S ⇒ <NP1> <VP>
⇒ <NP1> <V> <Art2> <Adj> <N2>
⇒ <NP1> <V> a <Adj> <N2>
⇒ <NP1> <V> a beautiful <N2>
⇒ <Art1> <N1> <V> a beautiful <N2>
⇒ The <N1> <V> a beautiful <N2>
⇒ The <N1> is a beautiful <N2>
⇒ The Peacock is a beautiful <N2>
⇒ The Peacock is a beautiful Bird.
Example
If there is a grammar
G: N = {S, A, B} T = {a, b} P = {S → AB, A → a, B → b}
Here S produces AB, and we can replace A by a, and B by b. The only string that can be
generated is ab, i.e.,
L(G) = {ab}
Example
Suppose we have the following grammar −
G: N = {S, A, B} T = {a, b} P = {S → AB, A → aA|a, B → bB|b}
The language generated by this grammar −
L(G) = {ab, a^2b, ab^2, a^2b^2, ………}
= {a^m b^n | m ≥ 1 and n ≥ 1}
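The strings generated by such a grammar can be listed mechanically by expanding the leftmost non-terminal breadth first. The sketch below is an illustration, not part of the original notes. It assumes that non-terminals are single uppercase letters and that no production shortens a sentential form, which holds for the grammar above and is what makes the length cut-off safe. The function name generate is made up.

```python
from collections import deque

def generate(productions, start="S", max_len=4):
    """Return all terminal strings of length <= max_len derivable from start."""
    seen, results = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal (uppercase letter), if any.
        idx = next((i for i, c in enumerate(form) if c.isupper()), None)
        if idx is None:
            results.add(form)          # only terminals left: a generated string
            continue
        for rhs in productions[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # Forms never shrink here, so anything too long can be dropped.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return results

G = {"S": ["AB"], "A": ["aA", "a"], "B": ["bB", "b"]}
print(sorted(generate(G), key=lambda s: (len(s), s)))
# ['ab', 'aab', 'abb', 'aaab', 'aabb', 'abbb']
```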
Construction of a Grammar Generating a Language
We’ll consider some languages and, for each one, construct a grammar G which produces that
language.
Example 1
Problem − Suppose L(G) = {a^m b^n | m ≥ 0 and n > 0}. We have to find the
grammar G which produces L(G).
Solution
Since L(G) = {a^m b^n | m ≥ 0 and n > 0},
the set of strings accepted can be rewritten as −
L(G) = {b, ab, bb, aab, abb, …….}
Here, the start symbol has to take at least one ‘b’ preceded by any number of ‘a’ including null.
To accept the string set {b, ab, bb, aab, abb, …….}, we have taken the productions −
S → aS , S → B, B → b and B → bB
S → B → b (Accepted)
S → B → bB → bb (Accepted)
S → aS → aB → ab (Accepted)
S → aS → aaS → aaB → aab(Accepted)
S → aS → aB → abB → abb (Accepted)
Thus, every string in L(G) can be derived using this
production set.
Example 2
Problem − Suppose L(G) = {a^m b^n | m > 0 and n ≥ 0}. We have to find the grammar G
which produces L(G).
Solution −
Since L(G) = {a^m b^n | m > 0 and n ≥ 0}, the set of strings accepted can be rewritten as −
L(G) = {a, aa, ab, aaa, aab ,abb, …….}
Here, the start symbol has to take at least one ‘a’ followed by any number of ‘b’ including null.
To accept the string set {a, aa, ab, aaa, aab, abb, …….}, we have taken the productions −
S → aA, A → aA , A → B, B → bB ,B → λ
S → aA → aB → aλ → a (Accepted)
S → aA → aaA → aaB → aaλ → aa (Accepted)
S → aA → aB → abB → abλ → ab (Accepted)
S → aA → aaA → aaaA → aaaB → aaaλ → aaa (Accepted)
S → aA → aaA → aaB → aabB → aabλ → aab (Accepted)
S → aA → aB → abB → abbB → abbλ → abb (Accepted)
Thus, every string in L(G) can be derived using this
production set.
Hence the grammar −
G: ({S, A, B}, {a, b}, S, {S → aA, A → aA | B, B → λ | bB })
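Both constructed grammars can be compared against the languages they are meant to produce. The sketch below is an illustration under stated assumptions: every right-hand side is a run of terminals optionally followed by one non-terminal (true for both production sets above), non-terminals are uppercase letters, λ is written as the empty Python string, and the languages a^m b^n with the two sets of bounds are spelled as the Python patterns a*b+ and a+b*. The helper names generate and matching are made up.

```python
import re
from itertools import product

def generate(productions, start="S", max_len=5):
    """Enumerate generated strings of length <= max_len."""
    results, seen = set(), set()
    stack = [("", start)]              # (terminals so far, pending non-terminal)
    while stack:
        prefix, nt = stack.pop()
        for rhs in productions[nt]:
            if rhs and rhs[-1].isupper():
                item = (prefix + rhs[:-1], rhs[-1])
                if len(item[0]) <= max_len and item not in seen:
                    seen.add(item)
                    stack.append(item)
            elif len(prefix + rhs) <= max_len:
                results.add(prefix + rhs)   # no non-terminal left
    return results

def matching(pattern, max_len=5):
    """All strings over {a, b} up to max_len that match the pattern."""
    words = ("".join(p) for n in range(max_len + 1) for p in product("ab", repeat=n))
    return {w for w in words if re.fullmatch(pattern, w)}

G1 = {"S": ["aS", "B"], "B": ["b", "bB"]}               # m >= 0, n > 0
G2 = {"S": ["aA"], "A": ["aA", "B"], "B": ["bB", ""]}   # m > 0, n >= 0

print(generate(G1) == matching(r"a*b+"))  # True
print(generate(G2) == matching(r"a+b*"))  # True
```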
2.6 Chomsky Classification of Grammars
Type 3 Regular grammar Regular language Finite state automaton
The productions can be in the form of α → β where α is a string of terminals and non-terminals
with at least one non-terminal and α cannot be null. β is a string of terminals and non-terminals.
Example
S → ACaB
Bc → acB
CB → DB
aD → Db
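Type-1 grammars: generate context-sensitive languages. The productions must be of the form
α A β → α γ β
where A ∈ N (Non-terminal), α, β and γ are strings of terminals and non-terminals, and γ is
non-empty, so the left-hand side of a production is never longer than its right-hand side.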
The rule S → ε is allowed if S does not appear on the right side of any rule. The languages generated by
these grammars are recognized by a linear bounded automaton.
Example
AB → AbBc , A → bcA , B → b
Example
S → Xa, X → a
X → aX
X → abc
X → ε
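Type-3 grammars: generate regular languages.
Type-3 grammars must have a single non-terminal on the left-hand side and a right-hand side
consisting of a single terminal or a single terminal followed by a single non-terminal.
The productions must be of the form X → a or X → aY, where X, Y ∈ N (Non-terminal)
and a ∈ T (Terminal)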
The rule S → ε is allowed if S does not appear on the right side of any rule.
Example
X → ε
X → a | aY
Y → b
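The Type-3 conditions stated above can be turned into a small checker. This is a minimal sketch, not part of the original notes: the function name is_type3 and the representation of a production as a pair (left-hand side, tuple of right-hand-side symbols) are assumptions made for illustration, and X is assumed to be the start symbol of the example.

```python
def is_type3(productions, nonterminals, terminals, start):
    """Check the Type-3 form: X -> a, X -> aY, and start -> ε under the stated condition."""
    start_on_rhs = any(start in rhs for _, rhs in productions)
    for lhs, rhs in productions:
        if lhs not in nonterminals:
            return False                      # left side must be one non-terminal
        if len(rhs) == 0:
            # ε is allowed only for the start symbol, and only if the start
            # symbol never appears on a right-hand side.
            if lhs != start or start_on_rhs:
                return False
        elif len(rhs) == 1:
            if rhs[0] not in terminals:
                return False                  # X -> a
        elif len(rhs) == 2:
            if rhs[0] not in terminals or rhs[1] not in nonterminals:
                return False                  # X -> aY
        else:
            return False
    return True

example = [("X", ()), ("X", ("a",)), ("X", ("a", "Y")), ("Y", ("b",))]
print(is_type3(example, {"X", "Y"}, {"a", "b"}, start="X"))   # True

other = [("S", ("X", "a")), ("X", ("a",)), ("X", ("a", "X"))]
print(is_type3(other, {"S", "X"}, {"a"}, start="S"))          # False: S -> Xa
```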