Regular Grammar

Regular Language

• Any language accepted by an FSM is a regular language.

• Regular languages can be described by using:

  - An FSM
  - Regular expressions
  - A regular grammar (right-linear grammar)


Regular language Example:

Language L = { strings of a’s and b’s with at least one a }

L = { a, ab, ba, aab, baa, abb, aba, … }

• Described by using a regular expression as: (a+b)*a(a+b)*

• Described by using FSM as:

• Also described by using Regular grammar. How?


Regular Grammar
A regular grammar G is a quadruple (V, ∑, R, S)
where:
• V is the rule alphabet, which contains non-terminal symbols and terminal symbols.
• ∑ is the set of terminal symbols (a subset of V).

• R is a finite set of rules of the form X → Y.

• S is the start symbol, which is a non-terminal symbol.


Rules R of any Regular Grammar
Every rule in R has the form X → Y and must satisfy the following two conditions:

1. The left-hand side is a single symbol, which must be a non-terminal.

2. The right-hand side is ε, a single terminal, or a single terminal followed by a single non-terminal.

Example: A → ε, A → b and A → aC are legal rules.

BA → ε and A → aSa are not legal rules.


Non-Terminal and Terminal Symbols

• Non-terminal symbols: symbols that are used in the grammar but that do not appear in strings in the language.

• Terminal symbols: symbols that can appear in strings generated by G.

Example: for L = { a, ab, ba, aab, baa, abb, aba, … }, the terminal symbols are a and b.


Language generated by a Grammar

The language generated by a grammar G = (V, ∑, R, S), denoted L(G), is the set of all strings w in ∑* such that it is possible to start with S, apply some finite sequence of rules in R, and derive w.

Regular Grammar
Language L = { strings of a’s and b’s with at least one a }

L = { a, ab, ba, aab, baa, abb, aba, … }

Regular grammar G = ({S, A, a, b}, {a, b}, R, S), where S is the start non-terminal symbol of G and R is the set of rules:

S → aA, S → bS, A → aA, A → bA, A → ε

• To generate a string using grammar G, start with S and replace the non-terminal symbol in each derivation step until the required string is generated.
• To generate the string abbabaa:
S ⇒ aA ⇒ abA ⇒ abbA ⇒ abbaA ⇒ abbabA ⇒ abbabaA ⇒ abbabaaA ⇒ abbabaa
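
The derivation procedure can be mechanised. The following is a minimal sketch, not a definitive implementation: the dictionary encoding of the rules and the function name derive are assumptions made for illustration, and the search assumes every rule has one of the legal forms listed earlier (ε, a terminal, or a terminal followed by a non-terminal).

```python
# Rules of the grammar G from the text: S → aA | bS, A → aA | bA | ε.
# Each right-hand side is (terminal, next_nonterminal); ("", None) encodes ε.
RULES = {
    "S": [("a", "A"), ("b", "S")],
    "A": [("a", "A"), ("b", "A"), ("", None)],
}


def derive(target, start="S", rules=RULES):
    """Return the sentential forms of one derivation of target, or None."""

    def search(pos, nonterminal):
        # target[:pos] is already generated; nonterminal is still pending.
        for terminal, nxt in rules[nonterminal]:
            if nxt is None:
                # Rule X → ε or X → a: it must finish the string exactly.
                if target[pos:] == terminal:
                    return [target]
            elif target.startswith(terminal, pos):
                rest = search(pos + len(terminal), nxt)
                if rest is not None:
                    prefix = target[:pos + len(terminal)]
                    return [prefix + nxt] + rest
        return None

    steps = search(0, start)
    return None if steps is None else [start] + steps


print(" => ".join(derive("bab")))  # S => bS => baA => babA => bab
print(derive("bbb"))               # None: bbb contains no a
```

Each recursive call consumes one terminal, so the search depth is bounded by the length of the target string.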
Write a regular grammar for the language
L = { w ∈ {a, b}* : |w| is even }

Regular expression = ((aa) U (ab) U (ba) U (bb))*

FSM                      Regular Grammar G

δ(S, a) = T              S → aT
δ(S, b) = T              S → bT
δ(T, a) = S              T → aS
δ(T, b) = S              T → bS
S is the accepting state S → ε
Regular Grammars and Regular Languages

Theorem: The regular grammars define exactly the regular languages.

To prove this theorem, one must show both directions: for any given regular grammar it is possible to construct an equivalent FSM, and from any FSM it is possible to construct an equivalent regular grammar.
Method for conversion from Regular Grammar to FSM
• Create in M a separate state for each non-terminal in V.
• Make the state corresponding to S the start state.
• If there are any rules in R of the form X → w, for some w ∈ ∑, then create an additional state labelled #.
• For each rule of the form X → wY, add a transition from X to Y labelled w.
• For each rule of the form X → w, add a transition from X to # labelled w.
• For each rule of the form X → ε, mark state X as accepting.
• Mark state # as accepting.
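
The steps above can be sketched in code. This is an illustrative sketch only: the triple encoding of rules and the names grammar_to_fsm and accepts are assumptions, not part of the original slides. Because two rules may share a left-hand side and terminal, the resulting machine may be nondeterministic, so the simulator tracks a set of current states.

```python
def grammar_to_fsm(rules, start):
    """Build a (possibly nondeterministic) FSM from right-linear rules.

    Each rule is a triple (X, w, Y): (X, "a", "Y") encodes X → aY,
    (X, "a", None) encodes X → a, and (X, "", None) encodes X → ε.
    """
    transitions = {}   # (state, symbol) -> set of successor states
    accepting = set()
    for lhs, symbol, rhs in rules:
        if symbol == "":
            accepting.add(lhs)            # X → ε: X itself is accepting
        elif rhs is None:
            accepting.add("#")            # X → w: go to the extra accepting state #
            transitions.setdefault((lhs, symbol), set()).add("#")
        else:
            transitions.setdefault((lhs, symbol), set()).add(rhs)  # X → wY
    return transitions, start, accepting


def accepts(fsm, word):
    """Simulate the FSM on word, tracking every state it could be in."""
    transitions, start, accepting = fsm
    current = {start}
    for symbol in word:
        current = {nxt for state in current
                   for nxt in transitions.get((state, symbol), ())}
    return bool(current & accepting)


# Even-length grammar from the text: S → aT | bT | ε, T → aS | bS
EVEN = [("S", "a", "T"), ("S", "b", "T"), ("S", "", None),
        ("T", "a", "S"), ("T", "b", "S")]
fsm = grammar_to_fsm(EVEN, "S")
print(accepts(fsm, "abba"), accepts(fsm, "aba"))  # True False
```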
Method for conversion from FSM to Regular Grammar
• The regular grammar constructed from an FSM is

G = ({A0, A1, …, An}, ∑, R, A0)

where A0, A1, …, An are the states of the FSM, A0 is the start state, and R is the set of production rules defined as follows:

• Ai → aAj is a rule if δ(Ai, a) = Aj, where Aj is any final or non-final state other than the final state #.

• Ai → a is a rule if δ(Ai, a) = Aj, where Aj is the final state #.

• Ai → ε is a rule if Ai is any final state other than the final state #.
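
A minimal sketch of this conversion follows. It is an illustration under stated assumptions: the function name fsm_to_grammar and the dictionary encoding of δ are hypothetical, and the sketch handles only machines that have no special state #, so it needs only the first and third kinds of rule.

```python
def fsm_to_grammar(delta, accepting):
    """Return the production rules (as strings) for a DFSM.

    delta maps (state, symbol) to the next state; accepting is the
    set of final states. No state named # is assumed to exist.
    """
    rules = []
    for (state, symbol), target in sorted(delta.items()):
        rules.append(f"{state} → {symbol}{target}")   # Ai → aAj
    for state in sorted(accepting):
        rules.append(f"{state} → ε")                  # Ai → ε for final Ai
    return rules


# The even-length FSM: δ(S, a) = T, δ(S, b) = T, δ(T, a) = S, δ(T, b) = S
DELTA = {("S", "a"): "T", ("S", "b"): "T",
         ("T", "a"): "S", ("T", "b"): "S"}
for rule in fsm_to_grammar(DELTA, {"S"}):
    print(rule)
```

For the even-length machine this reproduces the five rules S → aT, S → bT, T → aS, T → bS and S → ε.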


Write a regular grammar for the language

L = { w ∈ {a, b}* : w ends with the pattern aaaa }

Regular expression = (a + b)*aaaa
FSM (by omitting the dead state)
Show a regular grammar for the language
L = { w ∈ {a, b}* : w contains an even number of a’s and an odd number of b’s }
Show a regular grammar for the language

L = { w ∈ {a, b}* : w contains an odd number of a’s and w ends in a }
Generate an FSM for the following regular grammar G:
Regular and Non regular Languages
• The language L = a*b* is a regular language.
• The language L = { a^n b^n | n ≥ 0 } is not a regular language.
• Regular languages can be modeled by an FSM.
• Regular languages are normally denoted by some regular expression.
How Many Regular Languages Are There?
There is a countably infinite number of regular languages.
• There are infinitely many regular languages, since each of the following is a distinct regular language:
Example: {a}, {aa}, {aaa}, {aaaa}, {aaaaa}, {aaaaaa}, …
• There cannot be more regular languages than there are DFSMs, and there are only countably many DFSMs (up to the naming of states).
• So there are at most a countably infinite number of regular languages.
• The correspondence between regular languages and DFSMs is not one-to-one, since there is an infinite number of machines that accept any given language.
Showing That a Language Is Regular
Theorem: Every finite language is regular.
Proof:
• If L is the empty language (no strings), L = { }, then the regular expression corresponding to L is Ø, so L is regular.
• If L is any finite language composed of the strings s1, s2, …, sn for some positive integer n, then it is defined by the regular expression s1 U s2 U … U sn, so it too is regular.
The intersection of two infinite, non-regular languages can be finite, and therefore regular.
Example: L1 = { a^n b^n | n ≥ 0 } and L2 = { b^n a^n | n ≥ 0 }
Both L1 and L2 are non-regular.
L1 ∩ L2 = { ε }, which is finite and is therefore a regular language.
Determine whether each language L is regular.
i. L = { a^i b^j | i, j ≥ 0 and i + j = 5 }

Regular. L is finite (it contains exactly six strings), so a simple FSM only has to count the characters up to a total of five.

ii. L = { a^i b^j | i, j ≥ 0 and i − j = 5 }

Not regular. L consists of all strings of the form a*b* where the number of a’s is five more than the number of b’s. An FSM would need an infinite number of states to keep track of this difference.
Closure Properties of Regular Languages

• If certain languages are regular and a language L is formed from them by certain operations such as union, concatenation, difference or star closure, then L is also regular. (These properties are called closure properties.)
• Closure properties are a useful tool for building complex automata.
1. Regular languages are closed under union, concatenation and star (closure).
2. The complement of a regular language is regular.
3. The intersection of two regular languages is regular.
4. The difference of two regular languages is regular.
5. The reversal of a regular language is regular.
6. A homomorphic image of a regular language is regular.
7. The inverse homomorphic image of a regular language is regular.
Regular languages are closed under Union, concatenation and star (closure) operation

If L1 and L2 are regular languages, then prove that L1 U L2, L1.L2 and L1* are also regular languages.

Proof: Since L1 and L2 are regular languages, they have regular expressions, say R1 and R2, such that L1 = L(R1) and L2 = L(R2).
By the definition of regular expressions:

R1 U R2 is a regular expression denoting the language L1 U L2.

R1.R2 is a regular expression denoting the language L1.L2.

R1* is a regular expression denoting the language L1*.

So L1 U L2, L1.L2 and L1* are also regular languages.


Closure under complementation

If L is a regular language over alphabet Σ, then ¬L is also a regular language.

• Let L = L(M1) for some DFSM M1 = (K, Σ, δ, s, A). Then ¬L = L(M2), where M2 is the DFSM (K, Σ, δ, s, K − A). That is, M2 is exactly like M1, but the accepting states of M1 have become non-accepting states of M2, and vice versa.
• So every string rejected by M1 is accepted by M2, and vice versa. Thus M2 accepts exactly those strings w that are rejected by M1, which are the strings in ¬L. So the complement of a regular language L is also regular.
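
The construction can be sketched as follows. This is an illustrative sketch: the tuple layout (K, Σ, δ, s, A), the function names, and the example machine M1 (for strings with at least one a) are assumptions. It relies on δ being total, i.e. defined for every state and symbol, which holds for a DFSM.

```python
from itertools import product


def accepts(dfsm, word):
    """Run a DFSM (K, sigma, delta, s, A) on word; delta must be total."""
    states, alphabet, delta, start, accepting = dfsm
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


def complement(dfsm):
    """Swap accepting and non-accepting states: A becomes K - A."""
    states, alphabet, delta, start, accepting = dfsm
    return (states, alphabet, delta, start, states - accepting)


# Hypothetical M1 for "strings of a's and b's with at least one a".
K = frozenset({"q0", "q1"})
DELTA = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q1", ("q1", "b"): "q1"}
M1 = (K, {"a", "b"}, DELTA, "q0", frozenset({"q1"}))
M2 = complement(M1)

# Every string up to length 4 is accepted by exactly one of M1 and M2.
words = ["".join(p) for n in range(5) for p in product("ab", repeat=n)]
print(all(accepts(M1, w) != accepts(M2, w) for w in words))  # True
```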
Closure under intersection
If L and M are regular languages, then show that L ∩ M is also regular.

Let L and M be regular languages.

Then the complements ¬L and ¬M are also regular languages.
The union of two regular languages is a regular language, so ¬L U ¬M is also regular.
The complement of ¬L U ¬M, that is ¬(¬L U ¬M), is therefore also regular.

According to De Morgan’s law, L ∩ M = ¬(¬L U ¬M), which is a regular language.

Therefore L ∩ M is also regular.


The difference of two regular languages is regular
If L and M are regular languages, then so is L – M.

The complement of M, that is ¬M, is regular.

The intersection of two regular languages is regular, so L ∩ ¬M is also regular.

By the definition of set difference, L – M = L ∩ ¬M, which is a regular language.

Therefore L – M is a regular language.


The reversal of a regular language is regular
Homomorphism (Letter substitution)

A homomorphism is a substitution that replaces each symbol of a string by a string over some other alphabet.

Example: the string aabb is written as 0011 under the homomorphism that replaces a with 0 and b with 1.

Let ∑ be the input alphabet and Г be the set of substitution symbols. Then h: ∑* → Г* is a homomorphism if, for every string

w = a1 a2 a3 … an

we have h(w) = h(a1) h(a2) h(a3) … h(an).

If L is a language over ∑, then the homomorphic image of L is defined as:

h(L) = { h(w) : w ∈ L }
A homomorphism of regular languages is regular
Illustration by example:
Let L be a regular language; we have to show that h(L) is also regular.
∑ = {a, b} and string w = abab
h(a) = 00
h(b) = 11

By definition, h(w) = h(a1) h(a2) h(a3) … h(an), so

h(w) = h(a) h(b) h(a) h(b) = 00110011

A homomorphism is applied to a language by applying it to each string of the language.

Let L = ab*b (a regular language)

L = { ab, abb, abbb, abbbb, … }

h(L) = { 0011, 001111, 00111111, 0011111111, … }

h(L) = 00(11)+ ; h(L) can be represented by a regular expression, so it is a regular language.

In general, replacing each symbol a in a regular expression for L by h(a) gives a regular expression for h(L), so h(L) is regular whenever L is.
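
The example can be checked mechanically. The sketch below is illustrative only: the function name apply_homomorphism and the dictionary encoding of h are assumptions, and checking a handful of strings against the regular expression 00(11)+ is a sanity check, not a proof.

```python
import re


def apply_homomorphism(h, word):
    """Compute h(w) = h(a1) h(a2) ... h(an) symbol by symbol."""
    return "".join(h[symbol] for symbol in word)


h = {"a": "00", "b": "11"}
print(apply_homomorphism(h, "abab"))  # 00110011

# A few strings of L = ab*b and their images under h.
sample = ["a" + "b" * n for n in range(1, 6)]
images = [apply_homomorphism(h, w) for w in sample]
print(images[:2])                     # ['0011', '001111']

# Every image matches the regular expression 00(11)+ given in the text.
print(all(re.fullmatch(r"00(11)+", image) for image in images))
```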
The inverse homomorphism of regular language is regular

Proof: Let h: ∑* → Г* be a homomorphism, where ∑ is the input alphabet and Г is the set of substitution symbols, and let L be a regular language over Г.

The inverse homomorphic image of L is written h⁻¹(L) and is defined as:

h⁻¹(L) = { w ∈ ∑* : h(w) ∈ L }
• Since L is regular, there exists some DFSM M = (K, Г, δ, s, A) which accepts L.
• Construct a DFSM M’ = (K, ∑, δ’, s, A) with the same states, start state and accepting states as M, where δ’(q, a) is the state that M reaches from q after reading the string h(a).
• On input w, M’ ends in the same state that M ends in on input h(w).
• So M’ accepts w exactly when M accepts h(w), that is, exactly when h(w) ∈ L.
• So M’ accepts h⁻¹(L), and h⁻¹(L) is also regular.
Pumping Lemma For Regular Languages
(To show languages are not regular)
To show that a language is regular, give one of the following:

• A regular expression.
• An FSM.
• A finite list of the equivalence classes of ~L.
• A regular grammar.
• A construction using the closure properties that we have proved hold for the regular languages.
To show that languages are not regular:
• Every regular language L can be accepted by an FSM M with a finite number of states.
• If L is infinite, then there must be at least one loop in M.
• All sufficiently long strings in L must be characterized by one or more repeating patterns corresponding to the substrings that drive M through its loops.
Consider the following 5 state FSM:
• It can accept an infinite number of strings.

• The longest string that can be accepted by the FSM without going through any loops has length 4.
• The total number of states in the FSM is 5, i.e. |K| = 5.

• Take any string w in L whose length is at least the number of states in the FSM.
• If we take the string w = babbab, then |w| = 6, which is greater than the number of states in the FSM. Here the second b in w drove the FSM through its loop; we call it y.
• If we remove this y (pump y out), the result is the new string babab, which is also accepted by the FSM.
• We can also pump in as many copies of b as we like, generating strings such as babbbab, babbbbab and so forth. The FSM also accepts all of them.
• In the original string w = babbab, the third b also drove the FSM through its loop. We could also pump it in or out and get a similar result.
Consider FSM

• This FSM accepts only one string, aab.

• The only string that can drive the FSM through its loop is ε.
• It cannot accept any longer strings.
• Therefore the length of the pumping string y must be greater than 0. It must not be empty.
• This property of FSMs and the languages that they can accept is the basis for a powerful tool for showing that a language is not regular.

• If a language contains even one long string that cannot be pumped in the fashion that we have just described, then it is not accepted by any FSM and so is not regular.
Pumping Theorem for Regular languages

Theorem:
Let L be a regular language. Then there exists a constant k (which depends on L; the number of states of an FSM that accepts L suffices) such that every string w in L with |w| ≥ k can be broken into three strings, w = xyz, such that:
• |y| > 0, i.e. y ≠ ε
• |xy| ≤ k
• For all q ≥ 0, the string x y^q z is also in L
Proof
Suppose L is a regular language and L = L(M) for some DFSM M.
Suppose M has k states.
Consider any string w = a1 a2 a3 … am in L of length m, where m ≥ k.
• Let q0, q1, …, qm be the states that M passes through while reading w, where q0 is the start state and qi is the state reached after reading a1 … ai.
• Since |w| ≥ k, the states q0, q1, …, qk are k + 1 states drawn from only k different states, so by the pigeonhole principle they cannot all be distinct. So M must go around a loop.
• That is, there are different integers i and j with 0 ≤ i < j ≤ k such that qi = qj.
Now we can break the string w = xyz as follows:
x = a1 a2 a3 … ai
y = ai+1 ai+2 … aj (the loop string, which takes M from qi back to qj = qi)
z = aj+1 aj+2 … am
• x may be empty in the case that i = 0. Also z may be empty if j = k = m. However, y cannot be empty, since i is strictly less than j. Also |xy| = j ≤ k.
• Since y takes M from state qi back to the same state, M ends in the same state on x y^q z as on w for any integer q ≥ 0. Thus x y^q z is also accepted by M; that is, x y^q z is also in L for all q ≥ 0.
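
The decomposition used in the proof can be sketched in code by running the machine and stopping at the first repeated state. This is a sketch under assumptions: the function names and the example DFSM (the two-state even-length machine, so k = 2) are chosen for illustration.

```python
def run(delta, state, word):
    """Return the state reached from state after reading word."""
    for symbol in word:
        state = delta[(state, symbol)]
    return state


def pumping_split(delta, start, word):
    """Split word into (x, y, z) at the first repeated state of the run."""
    seen = {start: 0}   # state -> number of symbols read when first visited
    state = start
    for position, symbol in enumerate(word, start=1):
        state = delta[(state, symbol)]
        if state in seen:
            i, j = seen[state], position
            return word[:i], word[i:j], word[j:]
        seen[state] = position
    return None  # no state repeated: word is shorter than the number of states


# Even-length DFSM with states S and T; S is the start and accepting state.
DELTA = {("S", "a"): "T", ("S", "b"): "T",
         ("T", "a"): "S", ("T", "b"): "S"}
x, y, z = pumping_split(DELTA, "S", "abab")
print((x, y, z))  # ('', 'ab', 'ab')

# Pumping y any number of times keeps the string in the language.
print(all(run(DELTA, "S", x + y * q + z) == "S" for q in range(6)))  # True
```

Because the split is taken at the first repetition, j never exceeds the number of states, which gives |xy| ≤ k.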
Show that L = { a^n b^n | n ≥ 0 } is not regular
• Assume that the given language L is regular. Then the Pumping Theorem applies for some constant k.
• Let w = a^k b^k. Since |w| = k + k = 2k ≥ k, we can split w into xyz such that |xy| ≤ k and |y| ≥ 1.
• Since |xy| ≤ k, y must occur within the first k characters, and so y = a^p for some p. Since y ≠ ε, p must be greater than 0. Writing i = |x|, we have
x = a^i    y = a^p    z = a^(k − i − p) b^k
• According to the Pumping Theorem, if L is regular then x y^q z ∈ L for all q ≥ 0.
• Let q = 2. The resulting string x y^2 z = a^i (a^p)^2 a^(k − i − p) b^k = a^(k + p) b^k, where p ≥ 1, must be in L.
• But it is not, since it has more a’s than b’s.
• Thus there exists at least one long string w in L that fails to satisfy the conditions of the Pumping Theorem. So { a^n b^n | n ≥ 0 } is not regular.
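
The argument can be sanity-checked by brute force for small values of k. The sketch below is an illustration, not a proof: the function names are assumptions, and it only confirms that for each tested k, every legal split of a^k b^k leaves the language when y is pumped.

```python
def in_language(word):
    """Membership test for L = { a^n b^n | n >= 0 }."""
    n = len(word) // 2
    return word == "a" * n + "b" * n


def unpumpable(k, q=2):
    """True if no split of a^k b^k with |xy| <= k and |y| >= 1 stays in L
    after y is repeated q times."""
    w = "a" * k + "b" * k
    for i in range(k):                  # i = |x|
        for j in range(i + 1, k + 1):   # j = |xy|, so |y| >= 1 and |xy| <= k
            x, y, z = w[:i], w[i:j], w[j:]
            if in_language(x + y * q + z):
                return False
    return True


print(all(unpumpable(k) for k in range(1, 8)))        # True
print(all(unpumpable(k, q=0) for k in range(1, 8)))   # True: pumping out fails too
```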
