DFA Minimization Based On Hopcroft's Algorithm Along With Slides On Brzozowski's Algorithm Also Found in Lecture 5


COMP 412

FALL 2010

Lexical Analysis:
DFA Minimization
Comp 412

Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.
Students enrolled in Comp 412 at Rice University have explicit permission to make copies
of these materials for their personal use.
Faculty from other educational institutions may use these materials for nonprofit
educational purposes, provided this copyright notice is preserved.
Automating Scanner Construction
The Cycle of Constructions: RE → NFA → DFA → minimal DFA
RE → NFA (Thompson's construction)
• Build an NFA for each term
• Combine them with ε-moves
NFA → DFA (subset construction)
• Build the simulation
DFA → Minimal DFA
• Brzozowski's Algorithm
• Hopcroft's algorithm (today)
DFA → RE (not really part of scanner construction)
• All pairs, all paths problem
• Union together paths from s0 to a final state

Comp 412, Fall 2010 1


DFA Minimization
The Big Picture
• Discover sets of equivalent states in the DFA
• Represent each such set with a single state
Two states are equivalent if and only if:
• The sets of paths leading to them are equivalent, and
• ∀ α ∈ Σ, transitions on α lead to equivalent states (DFA)
⇒ Must split a state that has α-transitions to distinct sets
A partition P of S
• A collection of sets P such that each s ∈ S is in exactly one pi ∈ P
• The algorithm iteratively constructs partitions of the DFA's states


DFA Minimization
Details of the algorithm
• Group states into maximally sized initial sets, optimistically
• Iteratively subdivide those sets, based on transition graph
• States that remain grouped together are equivalent
(Maximally sized sets ⇒ minimal number of sets)

Initial partition, P0, has two sets: {F} & {S-F}, where D = (S, Σ, δ, s0, F)
({F} holds the final states, {S-F} the others)

Splitting a set ("partitioning a set by a")
• Assume sa & sb ∈ pi, and δ(sa,a) = sx, & δ(sb,a) = sy
• If sx & sy are not in the same set pj, then pi must be split
  - sa has transition on a, sb does not ⇒ a splits pi
• One state in the final DFA cannot have two transitions on a
• The algorithm works backward, from a pair (p,a) to the subset of the states in some other set q that reach p on a
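The splitting step can be sketched in Python. This is a minimal illustration rather than the course's own code: the name `split_by`, the dictionary encoding of δ, and the choice of the ( a | b )* abb DFA as test data are assumptions made for this sketch.

```python
def split_by(p, a, delta, partition):
    """Split set p by symbol a: states stay together only if their
    a-transitions land in the same set of the current partition."""
    def target_set(state):
        target = delta.get((state, a))      # None means "no transition on a"
        for index, q in enumerate(partition):
            if target in q:
                return index
        return None

    groups = {}
    for state in p:
        groups.setdefault(target_set(state), set()).add(state)
    return [frozenset(g) for g in groups.values()]


# Assumed encoding of the DFA for (a|b)*abb: (state, symbol) -> state
delta = {
    ("s0", "a"): "s1", ("s0", "b"): "s2",
    ("s1", "a"): "s1", ("s1", "b"): "s3",
    ("s2", "a"): "s1", ("s2", "b"): "s2",
    ("s3", "a"): "s1", ("s3", "b"): "s4",
    ("s4", "a"): "s1", ("s4", "b"): "s2",
}
partition = [frozenset({"s4"}), frozenset({"s0", "s1", "s2", "s3"})]
pieces_on_a = split_by(partition[1], "a", delta, partition)  # a does not split
pieces_on_b = split_by(partition[1], "b", delta, partition)  # b splits off s3
```

On `a` every state moves to s1, so the set stays whole. On `b` only s3 reaches the set {s4}, so it is separated from s0, s1, and s2.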
Key Idea: Splitting S around α
[Figure: original set Q, with α-transitions from states in Q into three different sets R, S, and T]
Q has transitions on α to R, S, & T
The algorithm partitions Q around α


Key Idea: Splitting Q around (S,α)
Find maximal partition I that has an α-transition into S
[Figure: set Q containing the subset I, whose α-transitions lead into S]
Think of I as the image of S under the inverse of the transition function: I ← δ^-1( S, α )
The rest of Q must have an α-transition to one or more other states in one or more other partitions. Otherwise, it does not split!
Hopcroft's Algorithm
W ← {F, S-F}; P ← {F, S-F};    // W is the worklist, P the current partition
while ( W is not empty ) do begin
    select and remove s from W;    // s is a set of states
    for each α in Σ do begin
        let Iα ← δα^-1( s );    // Iα is set of all states that can reach s on α
        for each p ∈ P such that p ∩ Iα is not empty
                and p is not contained in Iα do begin
            partition p into p1 and p2 such that p1 ← p ∩ Iα; p2 ← p – p1;
            P ← (P – p) ∪ p1 ∪ p2;
            if p ∈ W
                then W ← (W – p) ∪ p1 ∪ p2;
                else if |p1| ≤ |p2|
                    then W ← W ∪ p1;
                    else W ← W ∪ p2;
        end
    end
end
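A direct Python transcription of the worklist pseudocode may help when tracing it by hand. It is a sketch under stated assumptions: the function name `hopcroft`, the dictionary encoding of δ, and the precomputed inverse map are choices made here, not part of the slides.

```python
def hopcroft(states, alphabet, delta, finals):
    """Return the final partition as a set of frozensets of states."""
    finals = frozenset(finals)
    others = frozenset(states) - finals
    partition = {block for block in (finals, others) if block}
    worklist = set(partition)

    # inverse[(t, a)] = all states that reach t on symbol a
    inverse = {}
    for (src, sym), dst in delta.items():
        inverse.setdefault((dst, sym), set()).add(src)

    while worklist:
        s = worklist.pop()                      # s is a set of states
        for sym in alphabet:
            image = set()                       # states that reach s on sym
            for q in s:
                image |= inverse.get((q, sym), set())
            for p in list(partition):
                p1, p2 = p & image, p - image
                if not p1 or not p2:
                    continue                    # p is not split by (s, sym)
                partition.remove(p)
                partition |= {p1, p2}
                if p in worklist:
                    worklist.remove(p)
                    worklist |= {p1, p2}
                else:
                    worklist.add(p1 if len(p1) <= len(p2) else p2)
    return partition


# Assumed encoding of the DFA for (a|b)*abb used in the detailed example
delta = {
    ("s0", "a"): "s1", ("s0", "b"): "s2",
    ("s1", "a"): "s1", ("s1", "b"): "s3",
    ("s2", "a"): "s1", ("s2", "b"): "s2",
    ("s3", "a"): "s1", ("s3", "b"): "s4",
    ("s4", "a"): "s1", ("s4", "b"): "s2",
}
result = hopcroft({"s0", "s1", "s2", "s3", "s4"}, "ab", delta, {"s4"})
```

The order in which sets leave the worklist is arbitrary here (`set.pop`), so intermediate partitions may differ from a hand trace, but the final partition does not depend on that order.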


Key Idea: Splitting pi around α
[Figure: original set pi, split into pj, whose α-transitions lead into S, and pk, whose α-transitions lead into R and T]
pj is the image of S under the inverse of δ
pk is everything in pi - pj
How does the worklist algorithm ensure that it splits pk around R & T?
Subtle point: either R or T (or both) must already be on the worklist. (R & T have split from {S-F}.)
Thus, it can split pi around one state (S) & add either pj or pk to the worklist.
A Detailed Example
Remember ( a | b )* abb ? (from last lecture)
Our first NFA
[Figure: NFA with states q0 through q4: q0 -ε-> q1, a loop on q1 labeled a|b, then q1 -a-> q2 -b-> q3 -b-> q4]

Applying the subset construction:

       State                   ε-closure(move(si,α))
Iter.  DFA   NFA      a        b
0      s0    q0,q1    q1,q2    q1
1      s1    q1,q2    q1,q2    q1,q3
       s2    q1       q1,q2    q1
2      s3    q1,q3    q1,q2    q1,q4
3      s4    q1,q4    q1,q2    q1

Iteration 3 adds nothing to S, so the algorithm halts
s4 contains q4 (final state)
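The table can be reproduced with a small subset-construction sketch. The NFA encoding below (an ε-edge from q0 to q1 and an a|b loop on q1) is read off the table rather than given explicitly in the slides, and the function names are illustrative assumptions.

```python
from collections import deque


def eps_closure(states, eps):
    """All NFA states reachable from `states` using only ε-moves."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def subset_construction(start, alphabet, moves, eps):
    """Build DFA states (sets of NFA states) in breadth-first order."""
    s0 = eps_closure({start}, eps)
    dfa_states, transitions = [s0], {}
    work = deque([s0])
    while work:
        s = work.popleft()
        for a in alphabet:
            moved = set()
            for q in s:
                moved |= moves.get((q, a), set())
            t = eps_closure(moved, eps)
            if not t:
                continue                      # no transition on a
            transitions[(s, a)] = t
            if t not in dfa_states:
                dfa_states.append(t)
                work.append(t)
    return dfa_states, transitions


# Assumed NFA for (a|b)*abb, matching the rows of the table
eps = {"q0": {"q1"}}
moves = {
    ("q1", "a"): {"q1", "q2"}, ("q1", "b"): {"q1"},
    ("q2", "b"): {"q3"}, ("q3", "b"): {"q4"},
}
dfa_states, transitions = subset_construction("q0", "ab", moves, eps)
```

With breadth-first discovery, `dfa_states` lists the sets in the same order as s0 through s4 in the table.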
A Detailed Example
The DFA for ( a | b )* abb
[Figure: five-state DFA, s0 through s4, with the transitions tabulated below]

        Character
State   a     b
s0      s1    s2
s1      s1    s3
s2      s1    s2
s3      s1    s4
s4      s1    s2

• Not much expansion from NFA (we feared exponential blowup)
• Deterministic transitions
• Use same code skeleton as before
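A table-driven recognizer for this DFA might look like the following. It is only a sketch: the code skeleton from the earlier lecture is not reproduced in these slides, so the names `DELTA` and `accepts` and the dictionary layout are assumptions.

```python
# Transition table for the (a|b)*abb DFA: (state, character) -> next state
DELTA = {
    ("s0", "a"): "s1", ("s0", "b"): "s2",
    ("s1", "a"): "s1", ("s1", "b"): "s3",
    ("s2", "a"): "s1", ("s2", "b"): "s2",
    ("s3", "a"): "s1", ("s3", "b"): "s4",
    ("s4", "a"): "s1", ("s4", "b"): "s2",
}


def accepts(word, delta=DELTA, start="s0", finals=("s4",)):
    """Run the DFA over word; reject on any character with no transition."""
    state = start
    for ch in word:
        state = delta.get((state, ch))
        if state is None:
            return False
    return state in finals
```

For example, `accepts("aababb")` is true because the input ends in abb, while `accepts("abba")` is false.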


A Detailed Example (DFA Minimization)

Current Partition              Worklist              s         Split on a   Split on b
P0  {s4} {s0,s1,s2,s3}         {s4} {s0,s1,s2,s3}    {s4}      none         {s3} {s0,s1,s2}
P1  {s4} {s3} {s0,s1,s2}       {s3} {s0,s1,s2}       {s3}      none         {s1} {s0,s2}
P2  {s4} {s3} {s1} {s0,s2}     {s1} {s0,s2}          {s1}      none         none
P2  {s4} {s3} {s1} {s0,s2}     {s0,s2}               {s0,s2}   none         none

Empty worklist ⇒ done!

[Figure: the original five-state DFA (left) and the minimized four-state DFA (right), in which s0 and s2 are merged into the single state {s0, s2}]

20% reduction in number of states

(For the record, example was right in 1999, broken in 2000.)
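Once the final partition is known, the minimized DFA is obtained by representing each set with a single state. A minimal sketch of that last step follows; the function name `quotient` and the use of frozensets as state names are assumptions made for illustration.

```python
def quotient(delta, start, finals, partition):
    """Collapse each set of the partition into one state of the new DFA."""
    block_of = {s: frozenset(p) for p in partition for s in p}
    new_delta = {(block_of[s], a): block_of[t] for (s, a), t in delta.items()}
    new_finals = {block_of[f] for f in finals}
    return new_delta, block_of[start], new_finals


# DFA for (a|b)*abb and the final partition from the worked example
delta = {
    ("s0", "a"): "s1", ("s0", "b"): "s2",
    ("s1", "a"): "s1", ("s1", "b"): "s3",
    ("s2", "a"): "s1", ("s2", "b"): "s2",
    ("s3", "a"): "s1", ("s3", "b"): "s4",
    ("s4", "a"): "s1", ("s4", "b"): "s2",
}
partition = [{"s4"}, {"s3"}, {"s1"}, {"s0", "s2"}]
min_delta, min_start, min_finals = quotient(delta, "s0", {"s4"}, partition)
min_states = {state for (state, _symbol) in min_delta}
```

Because all members of a set agree on where each symbol leads, the dictionary comprehension never writes conflicting entries for the same (set, symbol) key.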
DFA Minimization
What about a ( b | c )* ?
[Figure: Thompson's-construction NFA for a ( b | c )*, with states q0 through q9]

First, the subset construction (from last lecture):

States                             ε-closure(Move(s,*))
DFA   NFA                          a      b      c
s0    q0                           s1     none   none
s1    q1, q2, q3, q4, q6, q9       none   s2     s3
s2    q5, q8, q9, q3, q4, q6       none   s2     s3
s3    q7, q8, q9, q3, q4, q6       none   s2     s3

[Figure: the resulting DFA. s0 goes to s1 on a; each of s1, s2, and s3 goes to s2 on b and to s3 on c]
DFA Minimization
Then, apply the minimization algorithm

                               Split on
Current Partition              a      b      c
P0  {s1,s2,s3} {s0}            none   none   none

It splits no states after the initial partition
⇒ The minimal DFA has two states
• One for {s0}
• One for {s1,s2,s3}


DFA Minimization
Then, apply the minimization algorithm

                               Split on
Current Partition              a      b      c
P0  {s1,s2,s3} {s0}            none   none   none

It produces this DFA
[Figure: two-state DFA. s0 goes to s1 on a; s1 loops to itself on b|c]

In lecture 5, we observed that a human would design a simpler automaton than Thompson's construction & the subset construction did.
Minimizing that DFA produces the one that a human would design!


Abbreviated Register Specification
Start with a regular expression
r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7 | r8 | r9
(Register names from zero to nine)
Abbreviated Register Specification
Thompson's construction produces
[Figure: NFA in which s0 has ε-transitions to ten parallel paths, one per register name (r 0, r 1, r 2, …, r 8, r 9), each joined by an ε-transition to sf]
To make the example fit, we have eliminated some of the ε-transitions, e.g., between r and 0
Abbreviated Register Specification
The subset construction builds
[Figure: DFA in which s0 has a transition on r, followed by transitions on 0, 1, 2, …, 8, 9 to ten separate final states sf0, sf1, sf2, …, sf8, sf9]
This is a DFA, but it has a lot of states …
Abbreviated Register Specification
The DFA minimization algorithm builds
[Figure: DFA in which s0 has a transition on r, followed by a single transition labeled 0,1,2,3,4,5,6,7,8,9 to sf]
This looks like what a skilled compiler writer would do!
Automating Scanner Construction
RE → NFA (Thompson's construction)
• Build an NFA for each term
• Combine them with ε-moves
NFA → DFA (subset construction)
• Build the simulation
DFA → Minimal DFA
• Brzozowski's Algorithm
• Hopcroft's algorithm
DFA → RE (not really part of scanner construction)
• All pairs, all paths problem
• Union together paths from s0 to a final state


RE Back to DFA
Kleene's Construction
for i ← 0 to |D| - 1;                  // label each immediate path
    for j ← 0 to |D| - 1;
        R^{-1}_{ij} ← { a | δ(d_i, a) = d_j };
        if (i = j) then
            R^{-1}_{ii} ← R^{-1}_{ii} | {ε};
for k ← 0 to |D| - 1;                  // label nontrivial paths
    for i ← 0 to |D| - 1;
        for j ← 0 to |D| - 1;
            R^{k}_{ij} ← R^{k-1}_{ik} (R^{k-1}_{kk})* R^{k-1}_{kj} | R^{k-1}_{ij}
L ← {}                                 // union labels of paths from
for each final state s_i               // s0 to a final state s_i
    L ← L | R^{|D|-1}_{0i}

R^{k}_{ij} is the set of paths from i to j that include no state higher than k. The initial labels therefore carry superscript -1 (no intermediate states at all), so that the first pass of the k loop, with k = 0, has R^{k-1} defined.
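Kleene's construction can be prototyped with regular expressions held as Python strings. The sketch below assumes the DFA states are numbered 0 to |D| - 1 with 0 as the start state, uses `None` for the empty set and `""` for ε, and emits Python `re` syntax. The helper names are illustrative, and no attempt is made to simplify the resulting expression.

```python
import re


def union(x, y):
    if x is None:
        return y
    if y is None:
        return x
    return x if x == y else f"(?:{x}|{y})"


def concat(x, y):
    return None if x is None or y is None else x + y


def star(x):
    return "" if x in (None, "") else f"(?:{x})*"


def kleene(n, delta, finals):
    """delta maps (i, symbol) -> j over states 0..n-1; state 0 is the start."""
    # R[i][j] starts as the labels of direct edges (no intermediate states)
    R = [[None] * n for _ in range(n)]
    for (i, a), j in delta.items():
        R[i][j] = union(R[i][j], re.escape(a))
    for i in range(n):
        R[i][i] = union(R[i][i], "")           # the empty path from i to i
    # Allow paths through state k, for k = 0 .. n-1
    for k in range(n):
        R = [[union(concat(concat(R[i][k], star(R[k][k])), R[k][j]), R[i][j])
              for j in range(n)] for i in range(n)]
    result = None
    for f in finals:                           # union over final states
        result = union(result, R[0][f])
    return result


# Two-state minimal DFA for a(b|c)*: 0 -a-> 1, and 1 loops on b and c
pattern = kleene(2, {(0, "a"): 1, (1, "b"): 1, (1, "c"): 1}, finals=[1])
```

The string it produces is much longer than a ( b | c )*, which illustrates why this direction is described as not really part of scanner construction, but it accepts the same strings.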
Limits of Regular Languages
Not all languages are regular
RL's ⊂ CFL's ⊂ CSL's
You cannot construct DFA's to recognize these languages
• L = { p^k q^k } (parenthesis languages)
• L = { w c w^r | w ∈ Σ* }
Neither of these is a regular language (nor an RE)

But, this is a little subtle. You can construct DFA's for
• Strings with alternating 0's and 1's
  ( ε | 1 ) ( 01 )* ( ε | 0 )
• Strings with an even number of 0's and 1's
RE's can count bounded sets and bounded differences
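Both of the recognizable examples need only a fixed amount of memory, which a short Python sketch makes concrete. The regular expression is the one from the slide written in Python `re` syntax; the function `even_zeros_and_ones` is an illustrative name, and it reads "an even number of 0's and 1's" as "an even number of each", which is an assumption about the intended meaning.

```python
import re

# ( ε | 1 ) ( 01 )* ( ε | 0 ) in Python syntax
ALTERNATING = re.compile(r"1?(?:01)*0?")


def even_zeros_and_ones(word):
    """Four-state DFA: the state is the pair of parities seen so far."""
    zeros_odd = ones_odd = False
    for ch in word:
        if ch == "0":
            zeros_odd = not zeros_odd
        elif ch == "1":
            ones_odd = not ones_odd
        else:
            return False                # not a string over {0, 1}
    return not zeros_odd and not ones_odd
```

Two booleans give four states in total, regardless of input length. Matching p^k q^k would instead require remembering k itself, which no fixed number of states can do.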


Limits of Regular Languages
Advantages of Regular Expressions
• Simple & powerful notation for specifying patterns
• Automatic construction of fast recognizers
• Many kinds of syntax can be specified with REs
Example: an expression grammar
Term → [a-zA-Z] ([a-zA-Z] | [0-9])*
Op → + | - | * | /
Expr → ( Term Op )* Term
Of course, this would generate a DFA …

If REs are so useful …
Why not use them for everything?
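The expression grammar translates almost symbol for symbol into Python's `re` syntax. The names `TERM`, `OP`, and `EXPR` below are assumptions made for this sketch.

```python
import re

TERM = r"[a-zA-Z][a-zA-Z0-9]*"       # Term → [a-zA-Z] ([a-zA-Z] | [0-9])*
OP = r"[-+*/]"                       # Op → + | - | * | /
EXPR = re.compile(rf"(?:{TERM}{OP})*{TERM}")   # Expr → ( Term Op )* Term


def is_expr(text):
    """True when the whole string matches Expr."""
    return EXPR.fullmatch(text) is not None
```

For instance, `is_expr("rate*hours+bonus1")` holds, while `is_expr("a+")` does not. Note what the pattern cannot say: it has no way to require balanced parentheses, which is the kind of limit the closing question points toward.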


EXTRA SLIDES START HERE

Comp 412, Fall 2010 33


DFA Minimization
The algorithm

T ← { F, {S-F} }
P ← { }
while ( P ≠ T )
    P ← T
    T ← { }
    for each set pi ∈ P
        T ← T ∪ Split(pi)

Split(S)
    for each c ∈ Σ
        if c splits S into s1 & s2
            then return {s1, s2}
    return S

(mild abuse of notation)

This is a fixed-point algorithm!

Why does this work?
• Partition P ⊆ 2^S
• Start off with 2 subsets of S: {F} and {S-F}
• The while loop takes Pi → Pi+1 by splitting 1 or more sets
• Pi+1 is at least one step closer to the partition with |S| sets
• Maximum of |S| splits
Note that
• Partitions are never combined
• Initial partition ensures that final states remain final states
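The fixed-point formulation is short enough to run directly. The sketch below follows the pseudocode's shape, with a few assumptions: the names `minimize_fixed_point` and `split_once` are invented here, δ is a dictionary, and a missing transition is treated as leading to no set at all.

```python
def split_once(p, partition, alphabet, delta):
    """Return {p1, p2} for the first symbol that splits p, else {p}."""
    def target_set(state, c):
        target = delta.get((state, c))
        return next((q for q in partition if target in q), None)

    for c in alphabet:
        groups = {}
        for state in p:
            groups.setdefault(target_set(state, c), set()).add(state)
        if len(groups) > 1:
            first = frozenset(next(iter(groups.values())))
            return {first, p - first}
    return {p}


def minimize_fixed_point(states, alphabet, delta, finals):
    finals = frozenset(finals)
    T = {b for b in (finals, frozenset(states) - finals) if b}
    P = set()
    while P != T:                      # stop when a pass changes nothing
        P = T
        T = set()
        for p in P:
            T |= split_once(p, P, alphabet, delta)
    return P


# Assumed encoding of the subset-construction DFA for a(b|c)*
delta = {
    ("s0", "a"): "s1",
    ("s1", "b"): "s2", ("s1", "c"): "s3",
    ("s2", "b"): "s2", ("s2", "c"): "s3",
    ("s3", "b"): "s2", ("s3", "c"): "s3",
}
result = minimize_fixed_point({"s0", "s1", "s2", "s3"}, "abc", delta,
                              {"s1", "s2", "s3"})
```

Every pass re-examines every set, including those that cannot have changed. That repeated work is what the worklist version avoids.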
DFA Minimization
Refining the algorithm
• As written, it examines every pi ∈ P on each iteration
  - This strategy entails a lot of unnecessary work
  - Only need to examine pi if some T, reachable from pi, has split
• Reformulate the algorithm using a "worklist"
  - Start worklist with initial partition, F and {S-F}
  - When it splits pi into p1 and p2, place p2 on worklist

This version looks at each pi ∈ P far fewer times
• Well-known, widely used algorithm due to John Hopcroft


Alternative Approach to DFA Minimization
The Intuition
• The subset construction merges prefixes in the NFA

[Figure: NFA for abc | bc | ad. s0 has ε-transitions to three chains: s1 -a-> s2 -b-> s3 -c-> s4, s5 -b-> s6 -c-> s7, and s8 -a-> s9 -d-> s10]
Thompson's construction would leave ε-transitions between each single-character automaton

[Figure: DFA after the subset construction: s0 -a-> s1 -b-> s2 -c-> s3, s1 -d-> s6, and s0 -b-> s4 -c-> s5]
Subset construction eliminates ε-transitions and merges the paths for a. It leaves duplicate tails, such as bc.


Alternative Approach to DFA Minimization
Idea: use the subset construction twice
• For an NFA N
  - Let reverse(N) be the NFA constructed by making initial states final (& vice-versa) and reversing the edges
  - Let subset(N) be the DFA that results from applying the subset construction to N
  - Let reachable(N) be N after removing all states that are not reachable from the initial state
• Then,
  reachable(subset(reverse[reachable(subset(reverse(N)))]))
  is the minimal DFA that implements N [Brzozowski, 1962]

This result is not intuitive, but it is true.
Neither algorithm dominates the other.
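Brzozowski's formula is compact enough to try out in Python. The sketch below makes several simplifying assumptions that are not in the slides: an automaton is a triple (start states, final states, set of (source, symbol, target) edges), there are no ε-transitions, and `subset` builds only reachable states, so a separate `reachable` step is unnecessary.

```python
def reverse(automaton):
    """Swap initial and final states and reverse every edge."""
    starts, finals, edges = automaton
    return set(finals), set(starts), {(t, a, s) for (s, a, t) in edges}


def subset(automaton, alphabet):
    """Subset construction; only states reachable from the start are built."""
    starts, finals, edges = automaton
    s0 = frozenset(starts)
    seen, work, dfa_edges = {s0}, [s0], set()
    while work:
        s = work.pop()
        for a in alphabet:
            t = frozenset(y for (x, sym, y) in edges if sym == a and x in s)
            if not t:
                continue
            dfa_edges.add((s, a, t))
            if t not in seen:
                seen.add(t)
                work.append(t)
    dfa_finals = {s for s in seen if s & frozenset(finals)}
    return {s0}, dfa_finals, dfa_edges


def brzozowski(automaton, alphabet):
    return subset(reverse(subset(reverse(automaton), alphabet)), alphabet)


# DFA for abc | bc | ad with duplicate tails, as left by one subset pass
edges = {
    ("s0", "a", "s1"), ("s1", "b", "s2"), ("s2", "c", "s3"),
    ("s1", "d", "s6"), ("s0", "b", "s4"), ("s4", "c", "s5"),
}
starts, finals, min_edges = brzozowski(({"s0"}, {"s3", "s5", "s6"}, edges), "abcd")
min_states = set(starts) | {s for (s, _, _) in min_edges} | {t for (_, _, t) in min_edges}
```

The seven-state input collapses to four states: the three final states merge, and the two states that precede the tail c merge.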


Alternative Approach to DFA Minimization
Step 1
• The subset construction on reverse(NFA) merges suffixes in original NFA

[Figure: Reversed NFA. The three chains s1 to s4 (a b c), s5 to s7 (b c), and s8 to s10 (a d), with ε-transitions joining them to s0 and to a new state s11]

[Figure: subset(reverse(NFA)), with states s1, s2, s3, s8, s9, and s11 and edges labeled a, b, c, a, d]


Alternative Approach to DFA Minimization
Step 2
• Reverse it again & use subset to merge prefixes …

[Figure: "Reverse it, again". The automaton from step 1 with its edges reversed, with states s0, s1, s2, s3, s8, s9, and s11]

[Figure: "And subset it, again". The minimal DFA, with states s0, s2, s3, and s11 and edges labeled a, b, b, c, d]