Lect 07
Finite State Machines(cont’d)
Finite Automata and Lexical Analysis
Two Kinds of FSA
Implementing the Scanner
Three methods
– Hand-coded approach:
Draw the DFSM, then implement it with a loop and a case statement
– Hybrid approach:
Define tokens using regular expressions, convert to an NFSM,
apply an algorithm to obtain the minimal DFSM
Hand-code the resulting DFSM
– Automated approach:
Use a regular grammar as input to a lexical scanner
generator (e.g. LEX)
Hand-coding
Branch depending on first character:
– If digit, scan numeric literal
– If letter, scan identifier or keyword
– If operator, check next character (++, etc.)
Return token found
Write aggressively efficient code: gotos, global variables
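The branching scheme above can be sketched in Python. This is a minimal illustration, not a production scanner: the keyword set, the token kind names, and the function names `next_token` and `scan` are assumptions made for the example, and only the `+` / `++` operators are handled.

```python
KEYWORDS = {"if", "while", "return"}  # hypothetical keyword set for the example


def next_token(text, pos):
    """Return (kind, lexeme, new_pos) for the token starting at text[pos]."""
    ch = text[pos]
    if ch.isdigit():
        # Digit: scan a numeric literal
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return ("num", text[pos:end], end)
    if ch.isalpha():
        # Letter: scan an identifier, then check whether it is a keyword
        end = pos
        while end < len(text) and text[end].isalnum():
            end += 1
        lexeme = text[pos:end]
        kind = "keyword" if lexeme in KEYWORDS else "id"
        return (kind, lexeme, end)
    if ch == "+":
        # Operator: check the next character to tell + from ++
        if pos + 1 < len(text) and text[pos + 1] == "+":
            return ("op", "++", pos + 2)
        return ("op", "+", pos + 1)
    raise ValueError(f"unexpected character {ch!r} at position {pos}")


def scan(text):
    """Collect all tokens in text, skipping whitespace between them."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        kind, lexeme, pos = next_token(text, pos)
        tokens.append((kind, lexeme))
    return tokens
```

For example, `scan("while x1 ++ 42")` yields a keyword, an identifier, the `++` operator, and a numeric literal.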
NFAs & DFAs
Deterministic Finite Automata (DFAs)
require more complexity (typically more states)
to represent regular expressions, but offer more
precision: a state has at most one transition
per input character.
Non-Deterministic Finite Automata
Representing NFAs
Transition Tables: more suitable to representation within a computer
Example NFA
S = { 0, 1, 2, 3 }
s0 = 0
F = { 3 }
Σ = { a, b }
Transitions: 0 –a→ 0, 0 –b→ 0, 0 –a→ 1, 1 –b→ 2, 2 –b→ 3
What language is defined? (a|b)*abb
How Does An NFA Work?
Given an input string, we trace moves
If no more input & in final state, ACCEPT
EXAMPLE:
Input: ababb
First attempt:
move(0, a) = 1
move(1, b) = 2
move(2, a) = ? (undefined)
REJECT!
-OR-
Second attempt:
move(0, a) = 0
move(0, b) = 0
move(0, a) = 1
move(1, b) = 2
move(2, b) = 3
ACCEPT!
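Rather than guessing one path at a time, a program can track every state the NFA might be in after each character. The sketch below does this for the example NFA; the names `MOVE`, `START`, `FINAL` and `nfa_accepts` are chosen for this illustration only.

```python
# Transition relation of the example NFA for (a|b)*abb:
# (state, symbol) -> set of possible next states
MOVE = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
FINAL = {3}


def nfa_accepts(text):
    """Accept if at least one sequence of moves ends in a final state."""
    current = {START}
    for ch in text:
        nxt = set()
        for state in current:
            # An undefined move contributes no next states
            nxt |= MOVE.get((state, ch), set())
        current = nxt
    return bool(current & FINAL)
```

On ababb the tracked sets are {0}, {0,1}, {0,2}, {0,1}, {0,2}, {0,3}, so the string is accepted without ever committing to the failing first attempt.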
How Does An NFA Work ?(cont’d)
The transition table representation has the
advantage that it provides fast access to the
transitions of a given state on a given
character; its disadvantage is that it can take
up a lot of space when the input alphabet is
large and most transitions are to the empty
set.
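The trade-off can be made concrete with the (a|b)*abb NFA. The sketch below stores the same transitions twice: as a dense table with one cell per (state, symbol) pair, and as a sparse dictionary that keeps only non-empty entries. The sparse form is one possible alternative chosen for this illustration; the slides do not prescribe it.

```python
STATES = [0, 1, 2, 3]
ALPHABET = ["a", "b"]
COL = {sym: i for i, sym in enumerate(ALPHABET)}  # symbol -> column index

# Dense table: one row per state, one column per symbol.
# Empty sets still occupy a cell.
DENSE = [
    [{0, 1}, {0}],    # state 0
    [set(), {2}],     # state 1
    [set(), {3}],     # state 2
    [set(), set()],   # state 3
]


def move_dense(state, sym):
    """Direct indexing: fast access to any (state, symbol) entry."""
    return DENSE[state][COL[sym]]


# Sparse form: only the non-empty transitions are stored.
SPARSE = {
    (s, sym): DENSE[s][COL[sym]]
    for s in STATES
    for sym in ALPHABET
    if DENSE[s][COL[sym]]
}


def move_sparse(state, sym):
    """Missing entries stand for the empty set."""
    return SPARSE.get((state, sym), set())
```

Here the dense table has 8 cells while the sparse dictionary has 4 entries; the gap widens as the alphabet grows.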
Handling Undefined Transitions
[Figure: transition graph with states 0, 1, 2, 3 on the path a, b, b, plus an additional state 4 with a loop on a, b]
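One common way to handle undefined transitions is to route them to an extra non-accepting state that loops on every symbol, like state 4 in the figure. The sketch below applies that idea to a hypothetical partial automaton that accepts only the string abb; that automaton and the names `make_total` and `run` are assumptions for this example and are not taken from the slide.

```python
def make_total(move, states, alphabet, dead):
    """Return a copy of move where every undefined (state, symbol)
    pair leads to the dead state, which loops on all symbols."""
    total = dict(move)
    for s in list(states) + [dead]:
        for sym in alphabet:
            total.setdefault((s, sym), dead)
    return total


def run(total, start, text):
    """Follow the total transition function and return the final state."""
    s = start
    for ch in text:
        s = total[(s, ch)]
    return s


# Hypothetical partial automaton: 0 -a-> 1 -b-> 2 -b-> 3, nothing else defined
PARTIAL = {(0, "a"): 1, (1, "b"): 2, (2, "b"): 3}
TOTAL = make_total(PARTIAL, states=[0, 1, 2, 3], alphabet=["a", "b"], dead=4)
```

With the completed table, `run(TOTAL, 0, "aba")` ends in state 4 instead of hitting an undefined move.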
NFA: Regular Expressions & Compilation
[Figure: NFA transition graph; visible labels are start, states 0, 1, 3, 5 and edges a, b, c]
The string abbc can be accepted.
Alternative Solution Strategy
a (b*c): states 1, 2, 3
a (b | c+)?: states 4, 5, 7
[Figure: one transition graph per expression, with edges labeled a, b, c]
Using Null Transitions to “OR” NFAs
[Figure: the two NFAs (states 1, 2, 3 and 4, 5, 7) combined into a single graph that adds states 0 and 6; edges labeled a, b, c]
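A sketch of the idea, assuming a new start state 0 with null transitions into the NFAs for a(b*c) and a(b|c+)?. The edges below are derived from the two regular expressions, not read off the figure. In particular, using state 6 as the target of the b edge from state 5 is a guess, and the empty string as the null-transition marker is an encoding chosen for this example.

```python
EPS = ""  # marker for a null transition (assumed encoding)

MOVE = {
    (0, EPS): {1, 4},   # new start state branches into both NFAs
    # a(b*c)
    (1, "a"): {2},
    (2, "b"): {2},
    (2, "c"): {3},
    # a(b|c+)?
    (4, "a"): {5},
    (5, "b"): {6},
    (5, "c"): {7},
    (7, "c"): {7},
}
START = 0
FINAL = {3, 5, 6, 7}


def eps_closure(states):
    """All states reachable from the given states using only null transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in MOVE.get((s, EPS), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def accepts(text):
    """Simulate the combined NFA on text, tracking all possible states."""
    current = eps_closure({START})
    for ch in text:
        moved = set()
        for s in current:
            moved |= MOVE.get((s, ch), set())
        current = eps_closure(moved)
    return bool(current & FINAL)
```

The combined machine accepts a string if either original expression matches it, for example abbc through the first NFA and acc through the second.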
Other Concepts
aabb is accepted along the path: 0 –a→ 0 –a→ 1 –b→ 2 –b→ 3
Recognizes: aa* | b | ab
[Figure: NFA transition graph with start state 0 and states 1, 2, 3, 4, 5; edges labeled a and b]
Can represent an FA with either a graph or a transition table:
State   ε        a       b
0       1,2,3    -       -
1       -        4       Error
2       -        Error   5
3       -        2       Error
4       -        4       Error
5       -        Error   Error
Deterministic Finite Automata
INPUT:
An input string x terminated by an end-of-file
character eof (or any other delimiter). A DFA
‘d’ with start state s0 and a set of accepting
states F.
OUTPUT:
The answer “yes” if ‘d’ accepts x, “no”
otherwise.
Simulating a DFA
METHOD:
Apply the algorithm to the input string x. The
function move(s,c) gives the state to which
there is a transition from state s on input
character c. The function nextchar returns
the next character of the input string x.
Simulating a DFA (cont’d)
s = s0;
c = nextchar;
while c ≠ eof do
    s = move(s,c);
    c = nextchar;
end;
if s is in F then return “yes”
else return “no”
The following transition graph shows a DFA
accepting the same language, (a|b)*abb, as that
accepted by the NFA. With this DFA and the
input string ababb, the algorithm follows the
state sequence 0, 1, 2, 1, 2, 3 and returns
“yes”.
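The simulation algorithm translates almost line for line into Python. The transition table below is an assumption: it is the usual four-state DFA for (a|b)*abb, written out here because the graph is not available as text. The `trace` list is added only so the visited states can be inspected.

```python
# Assumed transition table for a DFA accepting (a|b)*abb
MOVE = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
START = 0
FINAL = {3}


def simulate(text):
    """Run the DFA over text; return the answer and the visited states."""
    s = START
    trace = [s]
    for c in text:          # the end of the string plays the role of eof
        s = MOVE[(s, c)]    # s = move(s, c)
        trace.append(s)
    answer = "yes" if s in FINAL else "no"
    return answer, trace
```

Calling `simulate("ababb")` visits the states 0, 1, 2, 1, 2, 3 and answers "yes", matching the sequence stated above.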
String ababb
[Figure: transition graph of the DFA for (a|b)*abb with states 0, 1, 2, 3 and edges labeled a and b, shown for the string ababb]
DFA Example
Recognizes: aa* | b | ab
[Figure: DFA transition graph with start state 0 and states 1, 2, 3; edges labeled a and b]
Finite State Machines (cont’d)
[Figure: identifier FSM with states 1 and 2; edges labeled Letter, Letter, Digit; output 23 ID]
Example
Where Are The Missing Transitions? (cont’d)
[Figure: states Start, In_id (2) and Error; edges labeled Letter, Letter, Digit, Other, Other, Any]
Where Are The Missing Transitions? (cont’d)
[Figure: states Start, In_id (2) and Finish; edges labeled Letter, Letter, Digit, {Other}; Finish returns ID]
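The identifier machine can be sketched as a small table-driven recognizer. The state names follow the figure labels (Start, In_id, Finish, Error). The function names, the treatment of end of input as an Other character, and the choice not to consume the Other character are assumptions made for this illustration.

```python
def classify(ch):
    """Map a character to the input class used by the machine."""
    if ch.isalpha():
        return "Letter"
    if ch.isdigit():
        return "Digit"
    return "Other"


# Defined transitions; every missing entry leads to Error
TABLE = {
    ("Start", "Letter"): "In_id",
    ("In_id", "Letter"): "In_id",
    ("In_id", "Digit"): "In_id",
    ("In_id", "Other"): "Finish",
}


def scan_identifier(text, pos=0):
    """Return ("ID", lexeme, new_pos), or ("Error", None, pos) on failure."""
    state = "Start"
    start = pos
    while state not in ("Finish", "Error"):
        # End of input is treated like an Other character (assumption)
        cls = classify(text[pos]) if pos < len(text) else "Other"
        state = TABLE.get((state, cls), "Error")
        if state == "In_id":
            pos += 1  # consume only characters that belong to the identifier
    if state == "Finish":
        return ("ID", text[start:pos], pos)
    return ("Error", None, start)
```

Because the Other character is left unconsumed, the scanner can resume from `new_pos` to read the next token.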
Structure of a Scanner Automaton
How much should we match?
In general, find the longest match possible.
E.g., on input 123.45, match this as
num_const(123.45)
rather than
num_const(123), “.”, num_const(45).
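One way to get the longest match is to keep running the automaton and remember the last position at which it was in an accepting state, then back up to that position when no further move is possible. The sketch below assumes num_const means digits with an optional fractional part; that definition, the state numbering, and the function names are assumptions for this example.

```python
def char_class(ch):
    """Input classes for the assumed num_const automaton."""
    if ch.isdigit():
        return "digit"
    if ch == ".":
        return "dot"
    return "other"


# 0: start, 1: integer part, 2: just after the dot, 3: fractional part
MOVE = {
    (0, "digit"): 1,
    (1, "digit"): 1,
    (1, "dot"): 2,
    (2, "digit"): 3,
    (3, "digit"): 3,
}
ACCEPTING = {1, 3}


def longest_match(text, pos=0):
    """Return the longest num_const starting at pos, or None if there is none."""
    state = 0
    last_accept = None
    i = pos
    while i < len(text):
        state = MOVE.get((state, char_class(text[i])))
        if state is None:
            break  # no transition: stop and fall back to the last accept
        i += 1
        if state in ACCEPTING:
            last_accept = i
    if last_accept is None:
        return None
    return text[pos:last_accept]
```

On 123.45 the whole string is returned as one token; on an input such as 123.x the scanner backs up past the dot and returns 123.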
ASSIGNMENT
THANKS