chapter twoRegular_anguage

Chapter 2 discusses regular expressions and regular languages, explaining that every regular language can be represented by a finite automaton and has a corresponding regular expression. It outlines the properties of regular expressions, including operations like union, concatenation, and Kleene star, and introduces regular grammars and their equivalence to regular languages. The chapter also covers types of automata, including deterministic and nondeterministic finite automata, and explores closure properties of regular languages.

Uploaded by

hirpaadugna1

Chapter 2

Regular Expression and Regular Language


Regular Languages
If a language is regular, there exists a finite
accepter for it. Therefore every regular
language can be described by some DFA or
NFA. For every regular language there also
exists a corresponding regular expression.
Theorem:
Let L be a regular language. Then there exists a
regular expression r such that L = L(r).
The set of regular expressions over an alphabet Σ
is defined as follows:
• Every symbol of Σ is a regular expression.
• ε is a regular expression denoting the set {ε}.
• Ø is a regular expression denoting the empty set.
• If r1 and r2 are regular expressions, then so are
• r1+r2, with L(r1+r2) = L(r1) ∪ L(r2) ---- union
• r1.r2, with L(r1.r2) = L(r1)L(r2) ----- concatenation
• r1*, with L(r1*) = (L(r1))* -------- Kleene star
The Kleene star R* of a set of strings R is the smallest
superset of R that contains ε and is
closed under string concatenation. This is the set of
all strings that can be made by concatenating any
finite number (including zero) of strings from R.
For example, {"0","1"}* is the set of
all finite binary strings (including the empty string),
and {"ab", "c"}* = {ε, "ab", "c", "abab", "abc", "cab",
"cc", "ababab", "abcab", ... }.
Priority
• * has the highest priority
• . has the next priority
• + has the least priority
• Parentheses are used to override the above priorities
For example:
• (a+(b.c))* stands for the star closure of {a} ∪ {bc}, that is, the
language {ε, a, bc, aa, abc, bca, bcbc, aaa, aabc, ...}
 In other words we can define the regular expressions by using
the following terms.
r = ε
r = a
r = r1 + r2
r = r1 r2
r = r1*
r = (r1)
The language represented or generated by a regular expression
is a Regular Language, denoted L(r).
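The star-closure example above can be checked mechanically by enumerating the closure up to a length bound. The following is a minimal sketch; the helper name `star_up_to` and the bound of 4 are assumptions made for illustration only.

```python
def star_up_to(base, max_len):
    """Return every string of base* whose length is at most max_len."""
    result = {""}        # the empty string ε always belongs to the star closure
    frontier = {""}      # strings found in the previous round
    while frontier:
        next_frontier = set()
        for prefix in frontier:
            for piece in base:
                candidate = prefix + piece
                if len(candidate) <= max_len and candidate not in result:
                    result.add(candidate)
                    next_frontier.add(candidate)
        frontier = next_frontier
    return result


# Star closure of {a} ∪ {bc}, i.e. the language of (a+(b.c))*, up to length 4.
words = star_up_to({"a", "bc"}, 4)
print(sorted(words, key=lambda w: (len(w), w)))
```

Every string listed in the example (ε, a, bc, aa, abc, bca, bcbc, aaa, aabc) appears in the printed set, while a string such as "b" or "cb" does not.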
• Example 1:
• ab* specifies the strings starting with a followed by zero or
more b’s,
• (ab)* specifies zero or more repetitions of ab
Example 2:
• For Σ = {a,b}, the expression
r = (a+b)*(a+bb) is a regular expression.
It denotes the language L(r) = {a, bb, aa, abb, ba, bbb, ...}. So
L(r) is the set of all strings on {a,b} terminated by either
an a or a bb.
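Examples 1 and 2 can be tried out with Python's `re` module. This is a sketch under one assumption: the textbook union `+` is written `|` in Python syntax, and `re.fullmatch` is used so that the whole string must belong to the language. The dictionary `EXAMPLES` and the helper `in_language` are names invented for this illustration.

```python
import re

# Textbook notation -> Python `re` syntax: union "+" becomes "|",
# concatenation is written by juxtaposition, "*" is unchanged.
EXAMPLES = {
    "ab*": r"ab*",
    "(ab)*": r"(ab)*",
    "(a+b)*(a+bb)": r"(a|b)*(a|bb)",
}


def in_language(textbook_expr, word):
    """Return True if `word` is in the language of the textbook expression."""
    return re.fullmatch(EXAMPLES[textbook_expr], word) is not None


print(in_language("ab*", "abbb"))          # a followed by three b's
print(in_language("(ab)*", ""))            # zero repetitions of ab
print(in_language("(a+b)*(a+bb)", "abb"))  # ends in bb
print(in_language("(a+b)*(a+bb)", "ab"))   # ends in a single b
```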
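Example 3:
• r = (aa)*(bb)*b denotes the set of all strings with an even
number of a’s followed by an odd number of b’s, that is
• L(r) = {a^(2n) b^(2m+1) : n ≥ 0, m ≥ 0}
Example 4:
• r = (0+1)*00(0+1)* denotes the set of all strings on {0,1} that
contain at least one pair of consecutive 0’s.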
Algebra of regular expressions
Identity laws
a. ε.R = R.ε = R
b. Ø + R = R + Ø = R
Idempotent laws
R+R=R
(R*)*=R*
Distributive laws
 A.(B+C)=A.B+A.C
Associative laws
 A.(B.C)=(A.B).C
 A+(B+C)=(A+B)+C
Regular Grammars
• A language is regular if and only if it can be generated
by a regular grammar. Regular grammars are the type 3
grammars of the Chomsky hierarchy.
 The Linear Grammars are either left or right:
Right Linear Grammars:
 Rules of the forms
A → ε
A → a
 A → aB
Left Linear Grammars:
 Rules of the forms
A → ε
A → a
 A → Ba
Transform the following right linear
grammar into an equivalent ε-NFA.
S → aS | bA
A → cA | ε
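One common way to carry out this transformation is to use the non-terminals as states and add a single accepting state. The sketch below assumes that every non-terminal and terminal is a single character, that ε is written as the empty string, and that the added accepting state is called `qf`; all function names are hypothetical.

```python
FINAL = "qf"  # hypothetical name for the single added accepting state


def grammar_to_nfa(rules):
    """Build ε-NFA transitions from right linear rules.

    `rules` maps a non-terminal to a list of right-hand sides, each one
    "" (for ε), "a" (one terminal) or "aB" (terminal then non-terminal).
    The result maps (state, symbol) to a set of target states, where the
    symbol "" stands for an ε-move.
    """
    delta = {}
    for head, bodies in rules.items():
        for body in bodies:
            symbol = body[:1]            # "" when the body is ε
            target = body[1:] or FINAL   # A → a and A → ε both lead to FINAL
            delta.setdefault((head, symbol), set()).add(target)
    return delta


def eps_closure(states, delta):
    """Return all states reachable from `states` by ε-moves alone."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def accepts(delta, start, word):
    """Simulate the ε-NFA on `word`, tracking the set of current states."""
    current = eps_closure({start}, delta)
    for ch in word:
        moved = set()
        for q in current:
            moved |= delta.get((q, ch), set())
        current = eps_closure(moved, delta)
    return FINAL in current


RULES = {"S": ["aS", "bA"], "A": ["cA", ""]}
NFA = grammar_to_nfa(RULES)
print(accepts(NFA, "S", "aabcc"), accepts(NFA, "S", "a"))
```

With these rules the resulting automaton accepts exactly the strings of the form a*bc*.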
Right linear Grammar
A -> aB
1. A is a single symbol (corresponding to a state) called a ‘non-terminal symbol’
2. a corresponds to a lexical item
3. B is a single non-terminal symbol.
Formal definition of Right Linear Grammars
A right linear grammar is a 4-tuple <T, N, S, R>, where:
1. N is a finite set of non-terminals
2. T is a finite set of terminals, including the empty string
3. S is the start symbol
4. R is a finite set of rewriting rules of the form A-> xB or A-> x, where A and B
stand for non-terminals and x stands for a terminal.
Formal example:
G1 = <T, N, S, R>, where T = {a, b}, N = {S, A, B}, and
R=
S -> aA
A -> aA
A -> bB
B -> bB
In a left regular grammar (also called
left linear grammar), all rules obey the forms
A → a - where A is a non-terminal in N and a is a terminal
in Σ
A → Ba - where A and B are in N and a is in Σ
A → ε - where A is in N and ε is the empty string.
An example of a left regular grammar G with N = {S, A},
Σ = {a, b, c}, where P consists of the following rules
S → Sc
S → Ab
A → ε
A → Aa
and S is the start symbol. This grammar describes the
same language as the regular expression a*bc*.
A regular grammar is a left or right regular grammar.
Relation between regular
language and Regular expression
They are equivalent:
With every regular expression we can associate a
regular language.
Conversely, every regular language can be obtained
from a regular expression.
Examples:
–Regular expression = ab*c
–Regular language = {ac, abc, abbc, ….}
Let Σ be an alphabet. The regular expressions over Σ
are:
Ø Represents the empty set { }
ε Represents the set {ε}
a Represents the set {a}, for any symbol a in Σ
[Figures: diagrams for Ø, ε, and a]
Types of automata
There are four basic types of automata,
distinguished by the following
characteristics:
• FSA: have no memory; correspond to regular grammars
• Pushdown automata: in addition to the tape, they use
a stack to read from and write to; correspond to
context-free grammars
• Linear-bounded automata: read and write on a tape of finite
length in both directions; correspond to
context-sensitive grammars
• Turing machine: reads and writes on an infinite tape
in both directions; corresponds to unrestricted (type 0) grammars
Finite Automata
 An abstract machine which can be used to implement
regular expressions (etc.).
 Has a finite number of states, and a finite amount of
memory (i.e., the current state).
 Can be represented by directed graphs(state transition
diagrams) or transition tables
Representation
 An FSA may be represented as a directed graph; each
node (or vertex) represents a state, and the edges (or
arcs) connecting the nodes represent transitions.
 Each state is labeled.
 Each transition is labeled with a symbol from the
alphabet over which the regular language represented
by the FSA is defined, or with ε, the empty string.
 Among the FSA’s states, there is a start state and at least
one final state (or accepting state).
 Given an input string, an FSA will either accept or reject the
input.
 If the FSA is in a final (or accepting) state after all input
symbols have been consumed, then the string is accepted
(or recognized).
 Otherwise (including the case in which an input symbol
cannot be consumed), the string is rejected.
• Informally, an FSA is a state diagram that comprehensively captures
all possible states and transitions that a machine can take
while responding to a stream or sequence of input symbols.
• An FSA is a recognizer for regular languages.
Deterministic Finite Accepters
• The first type of automaton we study in detail is the finite
accepter that is deterministic in its operation. We start
with a precise formal definition of deterministic accepters. A
deterministic accepter has internal states, rules for
transitions from one state to another, some input, and ways
of making decisions.
 Definition:
 A DFA is defined by the quintuple
M = (Q, Σ, δ, q0, F)
Q    A finite set of states
Σ    A finite input alphabet
q0   The initial/starting state, q0 is in Q
F    A set of final/accepting states, which is a subset of Q
δ    A transition function, which is a total function from Q x Σ to Q
 A deterministic finite accepter operates in the following manner. At
the initial time, it is assumed to be in the initial state q0, with its input
mechanism on the leftmost symbol of the input string. During each
move of the automaton, the input mechanism advances one position
to the right, so each move consumes one input symbol. When the end
of the string is reached, the string is accepted if the automaton is in one
of its final states. Otherwise the string is rejected. The input
mechanism can move only from left to right and reads exactly one
symbol on each step. The transitions from one internal state to another
are governed by the transition function δ. For example,
 δ(q0,a)= q1. If the dfa is in state q0 and the current input symbol is a,
the dfa will go into state q1.
The graph below represents the dfa
 M = ({q0, q1, q2} , {0,1}, δ, q0,{ q1})
 Where δ is given by
 δ(q0,0) = q0,
 δ(q0,1) = q1
 δ(q1,0) = q0,
 δ(q1,1) = q2,
 δ(q2,0) = q2,
 δ(q2,1) = q1,
 The string 01 is accepted. The dfa does not accept the
string 00, since after reading two consecutive 0’s, it will be in
state q0. By similar reasoning, we see that the automaton
will accept the strings 101, 0111, and 11001, but not 100 or
1100.
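The acceptance claims above can be reproduced by simulating the dfa directly. The sketch below encodes δ as a Python dictionary exactly as listed; the names `DELTA`, `run_dfa` and `dfa_accepts` are assumptions made for this illustration.

```python
# Transition function δ of M = ({q0, q1, q2}, {0, 1}, δ, q0, {q1}).
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
START, FINALS = "q0", {"q1"}


def run_dfa(word):
    """Return the state reached from the start state after reading `word`."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]   # exactly one move per input symbol
    return state


def dfa_accepts(word):
    """A string is accepted if the dfa ends in a final state."""
    return run_dfa(word) in FINALS


for w in ["01", "101", "0111", "11001", "00", "100", "1100"]:
    print(w, "accepted" if dfa_accepts(w) else "rejected")
```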
The language accepted by a dfa M = (Q, Σ, δ, q0, F) is the set
of all strings on Σ accepted by M. In formal notation,
L(M) = {w ∈ Σ* : δ*(q0, w) ∈ F},
where δ* is the extended transition function, defined by δ*(q, ε) = q
and δ*(q, wa) = δ(δ*(q, w), a).
• A dfa will process every string in Σ* and either accept it or not accept it.
Nonacceptance means that the dfa stops in a nonfinal state.
Theorem
Let M = (Q, Σ, δ, q0, F) be a deterministic finite accepter, and
let G_M be its associated transition graph. Then for every qi, qj
∈ Q and w ∈ Σ*, δ*(qi, w) = qj if and only if there is in G_M a walk
with label w from qi to qj.
The following automaton is an example of a trap state (or
dead state): a state is a trap state if it is not an
accepting state and has no outgoing transitions except to itself.
Nondeterministic Finite State Automata (NFA)
• An NFA is an automaton whose states may have zero, one, or more outgoing
arrows for a given symbol.
• An NFA is a five-tuple:
M = (Q, Σ, δ, q0, F)
Q    A finite set of states
Σ    A finite input alphabet
q0   The initial/starting state, q0 is in Q
F    A set of final/accepting states, which is a subset of Q
δ    A transition function, which is a total function from Q x Σ to 2^Q
δ: Q x Σ → 2^Q, or δ: Q x (Σ ∪ {ε}) → 2^Q when ε-transitions are allowed
• Example #1: some 0’s followed by some 1’s
• Q = {q0, q1, q2}
• Σ = {0, 1}
• Start state is q0
• F = {q2}
• δ:
        0           1
q0      {q0, q1}    {}
q1      {}          {q1, q2}
q2      {q2}        {q2}
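An NFA such as this one can be simulated by tracking the set of states it could be in after each symbol. The following is a minimal sketch using the table of Example #1 as transcribed; the names `NFA_DELTA` and `nfa_accepts` are assumptions. Note that, with the self-loops on q2 as listed, a string stays accepted once it has read at least one 0 followed by at least one 1.

```python
# Transition table of Example #1: each entry is a set of possible next states.
NFA_DELTA = {
    ("q0", "0"): {"q0", "q1"}, ("q0", "1"): set(),
    ("q1", "0"): set(),        ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q2"},       ("q2", "1"): {"q2"},
}


def nfa_accepts(word, start="q0", finals=frozenset({"q2"})):
    """Accept if at least one possible run ends in a final state."""
    current = {start}
    for symbol in word:
        following = set()
        for q in current:
            following |= NFA_DELTA.get((q, symbol), set())
        current = following      # an empty set means every run has died
    return bool(current & finals)


for w in ["01", "0011", "0", "10", ""]:
    print(repr(w), nfa_accepts(w))
```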
 An NFA for the language of all strings over {a,b} that contain ababb
Example
Transition table:
State       a   b
A           B   C
B           B   D
C           B   C
D           B   E
E           B   C
Transition table (minimized):
State       a   b
Start A     B   A
B           B   D
D           B   E
Accept E    B   A
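The step from the first table to the second can be reproduced by partition refinement, which repeatedly splits groups of states until no two states in a group can be told apart. This is a sketch of that general technique, not necessarily the method used on the slides. It assumes that the two table columns are the inputs a and b, and that A is the start state and E the only accepting state in the first table as well (as marked in the second).

```python
def minimize(states, alphabet, delta, finals):
    """Group indistinguishable states by repeated partition refinement.

    Returns a set of frozensets; each frozenset is one block of
    equivalent states in the minimized DFA.
    """
    partition = {frozenset(finals), frozenset(states - finals)}
    partition.discard(frozenset())
    while True:
        def block_of(q):
            return next(b for b in partition if q in b)

        refined = set()
        for block in partition:
            groups = {}
            for q in block:
                # States stay together only if each symbol leads to the same block.
                signature = tuple(block_of(delta[(q, a)]) for a in alphabet)
                groups.setdefault(signature, set()).add(q)
            refined |= {frozenset(g) for g in groups.values()}
        if refined == partition:
            return partition
        partition = refined


ALPHABET = ("a", "b")
TABLE = {"A": ("B", "C"), "B": ("B", "D"), "C": ("B", "C"),
         "D": ("B", "E"), "E": ("B", "C")}
DELTA = {(q, a): t for q, row in TABLE.items() for a, t in zip(ALPHABET, row)}

blocks = minimize(set(TABLE), ALPHABET, DELTA, {"E"})
print(sorted(sorted(b) for b in blocks))
```

Under these assumptions the states A and C end up in the same block, which matches the four-state table in which C has been merged into A.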
 The transition diagram for the optimized or minimized DFA
Closure properties of regular languages
We have already seen the union, concatenation, and Kleene star
properties. Now let us move on to complement.
Given a DFA for a language L, we make the accepting states non-accepting
and the non-accepting states accepting.
In terms of the 5-tuple M = (Q, Σ, δ, q0, F), all we do is
replace F with Q - F.
Using this construction, we have a proof that the complement
of any regular language is another regular language:
the regular languages
are closed under complement.
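The complement construction is short enough to write out. The sketch below represents a DFA as a Python tuple (Q, Σ, δ, q0, F) and reuses the three-state dfa from the earlier example as input; the names `complement` and `accepts` are assumptions. The construction relies on δ being a total function, as required by the definition of a DFA.

```python
def complement(dfa):
    """Return a DFA for the complement language by replacing F with Q - F."""
    states, alphabet, delta, start, finals = dfa
    return (states, alphabet, delta, start, states - finals)


def accepts(dfa, word):
    """Run the DFA on `word` and report whether it ends in a final state."""
    _, _, delta, state, finals = dfa
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals


M = (
    {"q0", "q1", "q2"},
    {"0", "1"},
    {("q0", "0"): "q0", ("q0", "1"): "q1",
     ("q1", "0"): "q0", ("q1", "1"): "q2",
     ("q2", "0"): "q2", ("q2", "1"): "q1"},
    "q0",
    {"q1"},
)
M_COMPLEMENT = complement(M)

for w in ["01", "00", "1100"]:
    print(w, accepts(M, w), accepts(M_COMPLEMENT, w))
```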
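Intersection
Given DFAs A_L = (Q_L, Σ, δ_L, q_L, F_L) and A_M = (Q_M, Σ, δ_M, q_M, F_M),
we can take their cross product:
• A_L∩M = (Q_L x Q_M, Σ, δ_L∩M, (q_L, q_M), F_L x F_M),
where δ_L∩M((p, q), a) = (δ_L(p, a), δ_M(q, a)).
De Morgan’s law
L ∩ M = (L^c ∪ M^c)^c, where ^c denotes complement, so closure under
union and complement also gives closure under intersection.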

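Closed under difference
L - M = L ∩ M^c, where M^c is the complement of M, so closure under
intersection and complement gives closure under difference.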
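The cross-product construction can be sketched as follows. The two input DFAs (an even number of 0's, and strings ending in 1) are invented for this illustration, as are the names `product_dfa` and `accepts`. The `keep` argument decides which pairs of states are accepting, so the same code covers intersection and difference.

```python
from itertools import product


def product_dfa(dfa_l, dfa_m, keep):
    """Run two DFAs in parallel; `keep` decides which state pairs accept."""
    q_l, sigma, d_l, s_l, f_l = dfa_l
    q_m, _, d_m, s_m, f_m = dfa_m
    states = set(product(q_l, q_m))
    delta = {((p, q), a): (d_l[(p, a)], d_m[(q, a)])
             for (p, q) in states for a in sigma}
    finals = {(p, q) for (p, q) in states if keep(p in f_l, q in f_m)}
    return states, sigma, delta, (s_l, s_m), finals


def accepts(dfa, word):
    """Run the DFA on `word` and report whether it ends in a final state."""
    _, _, delta, state, finals = dfa
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals


SIGMA = {"0", "1"}
# Hypothetical example DFAs: L = even number of 0's, M = strings ending in 1.
EVEN_ZEROS = ({"e", "o"}, SIGMA,
              {("e", "0"): "o", ("e", "1"): "e",
               ("o", "0"): "e", ("o", "1"): "o"}, "e", {"e"})
ENDS_IN_ONE = ({"n", "y"}, SIGMA,
               {("n", "0"): "n", ("n", "1"): "y",
                ("y", "0"): "n", ("y", "1"): "y"}, "n", {"y"})

BOTH = product_dfa(EVEN_ZEROS, ENDS_IN_ONE, lambda in_l, in_m: in_l and in_m)
DIFF = product_dfa(EVEN_ZEROS, ENDS_IN_ONE, lambda in_l, in_m: in_l and not in_m)
print(accepts(BOTH, "001"), accepts(DIFF, "001"), accepts(DIFF, "00"))
```

Choosing `in_l and in_m` gives the accepting set F_L x F_M of the intersection automaton; choosing `in_l and not in_m` gives an automaton for L - M.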


Reversal
Given a language L, L^R is the set of strings whose reversal is in L.
Example: L = {0, 01, 100};
L^R = {0, 10, 001}.
Proof: Let E be a regular expression for L. We show how to reverse E
to provide a regular expression E^R for L^R.
Example: Reversal of a regular expression:
Let E = 01* + 10*.
E^R = (01* + 10*)^R
= (01*)^R + (10*)^R
= (1*)^R 0^R + (0*)^R 1^R
= (1^R)* 0 + (0^R)* 1
= 1*0 + 0*1.
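The derivation above applies three rules recursively: (F+G)^R = F^R + G^R, (FG)^R = G^R F^R, and (F*)^R = (F^R)*. The sketch below implements them on a regular expression stored as nested tuples; this tuple encoding and the name `reverse` are assumptions made for illustration.

```python
def reverse(expr):
    """Reverse a regular expression given as a nested tuple.

    Forms: a one-symbol string, ("+", e1, e2), (".", e1, e2), ("*", e1).
    """
    if isinstance(expr, str):
        return expr                                       # a^R = a
    op = expr[0]
    if op == "+":
        return ("+", reverse(expr[1]), reverse(expr[2]))  # (F+G)^R = F^R + G^R
    if op == ".":
        return (".", reverse(expr[2]), reverse(expr[1]))  # (FG)^R = G^R F^R
    if op == "*":
        return ("*", reverse(expr[1]))                    # (F*)^R = (F^R)*
    raise ValueError(f"unknown operator: {op!r}")


# E = 01* + 10*
E = ("+", (".", "0", ("*", "1")), (".", "1", ("*", "0")))
print(reverse(E))   # the tuple form of 1*0 + 0*1
```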
Homomorphisms
 A homomorphism on an alphabet is a function that
gives a string for each symbol in that alphabet.
Example: h(0) = ab; h(1) = ε.
Extend to strings by h(a1…an) = h(a1)…h(an).
Example: h(01010) = ababab.
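The extension of a homomorphism from symbols to strings is a symbol-by-symbol substitution. The sketch below stores h as a Python dictionary; the function name `apply_homomorphism` is an assumption.

```python
def apply_homomorphism(h, word):
    """Extend h to strings: h(a1...an) = h(a1)...h(an)."""
    return "".join(h[symbol] for symbol in word)


h = {"0": "ab", "1": ""}   # h(0) = ab, h(1) = ε
print(apply_homomorphism(h, "01010"))
```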
