Answer Fo Auomata
WS15/2016
Woche 12
Tim Conrad
AG Medical Bioinformatics
Institut für Mathematik & Informatik, Freie Universität Berlin
Automata in Bioinformatics
• Main idea:
given a text T, check for occurrences of a pattern
• Example I: find an exact string (KMP: Knuth-Morris-Pratt)
• Example II: find an inexact string (motif, correlations), e.g. serine carboxypeptidases, histidine active site
• Example III: find a gene (long-range correlations)
• Example IV: find a structure (long-range correlations)
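Example I names KMP for exact matching. The following is a minimal sketch in Python, assuming the text and pattern are plain strings; the function names `failure_table` and `kmp_find_all` are illustrative and do not come from the lecture.

```python
def failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to a shorter border
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find_all(text, pattern):
    """Return the start index of every (possibly overlapping) occurrence."""
    if not pattern:
        return []
    fail = failure_table(pattern)
    hits = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return hits


print(kmp_find_all("GATTACAGATTACA", "ATTA"))
```

The variable `k` plays the role of the automaton state: it records how much of the pattern has been matched, and the failure table says where to go on a mismatch without re-reading the text.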
• Parsing regular language
– DFA <-> NDFA <-> RE
• Context-free languages
– Long-range correlation
• NOT covered:
– Chomsky normal form
– Parsing CFG
(push-down-automaton,
CYK algorithm)
Recall:
Finite Automata
Example of a finite automaton
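[Figure: automaton with two states, off and on]
[Figure: automaton with states q0, q1, q2 over {0, 1}: q0 loops on 0 and goes to q1 on 1; q1 loops on 1 and goes to q2 on 0; q2 loops on 0, 1]
[Table: transition table indexed by state; surviving row: from q2, both inputs lead to q2]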
Language of a DFA
[Figure: the automaton M with states off and on]
[Figures: three example DFAs over {0, 1}: two with states q0, q1 and one with states q0, q1, q2]
Examples
L = {010, 1} ( Σ = {0, 1} )
• Answer
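[Figure: DFA with states qε, q0, q01, q010, q1, qdie over {0, 1}: qε →0 q0 →1 q01 →0 q010 and qε →1 q1; every other transition leads to qdie, which loops on 0, 1]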
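The answer DFA for L = {010, 1} can be simulated with a transition table. This is a sketch under two assumptions that the extracted slide does not state explicitly: the accepting states are q010 and q1, and every transition not listed leads to the dead state qdie. The identifiers (`DELTA`, `dfa_accepts`, the state names) are illustrative.

```python
DEAD = "q_die"  # trap state: once entered, the word is rejected

# Only the transitions that stay on a path towards acceptance are listed;
# every missing (state, symbol) pair is assumed to lead to the dead state.
DELTA = {
    ("q_eps", "0"): "q_0",
    ("q_eps", "1"): "q_1",
    ("q_0", "1"): "q_01",
    ("q_01", "0"): "q_010",
}
ACCEPTING = {"q_010", "q_1"}  # assumption: one state per word of L


def dfa_accepts(word):
    """Run the DFA on word (a string over {0, 1}) and report acceptance."""
    state = "q_eps"
    for symbol in word:
        if symbol not in ("0", "1"):
            raise ValueError("symbol outside the alphabet: %r" % symbol)
        state = DELTA.get((state, symbol), DEAD)
    return state in ACCEPTING


for w in ["010", "1", "", "01", "0100", "11"]:
    print(repr(w), dfa_accepts(w))
```

Each state name records the prefix read so far, which is why a finite language like this one always has a DFA: the automaton only needs to remember prefixes of the words in L.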
Examples
[Figure: tree-shaped automaton whose states record the symbols read so far: qε, q1, q10, q101, q11, q111, …]
Nondeterminism
Would be easier if…
[Figure: automaton sketch annotated "3 symbols left", followed by transitions on 1, 0, 1, with a state qdie]
This is not a DFA!
Nondeterminism
[Figure: NFA with states q0, q1, q2, q3: q0 loops on 0, 1; q0 →1 q1 →0 q2 →1 q3]
• Example
[Figure: the NFA q0 →1 q1 →0 q2 →1 q3, with a loop on 0, 1 at q0]
NO!
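An NFA can be simulated by tracking the set of all states it could currently be in. The sketch below encodes the four-state NFA from the figure, assuming that the loop on 0, 1 sits at q0 and that q3 is the only accepting state (so the NFA accepts the strings ending in 101); the names `NFA_DELTA` and `nfa_accepts` are illustrative.

```python
# (state, symbol) -> set of possible next states; missing pairs mean "stuck".
NFA_DELTA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},  # nondeterministic choice: stay, or guess "101 starts here"
    ("q1", "0"): {"q2"},
    ("q2", "1"): {"q3"},
}
NFA_START = "q0"
NFA_ACCEPTING = {"q3"}  # assumption: q3 is accepting


def nfa_accepts(word):
    """Accept iff at least one run of the NFA on word ends in an accepting state."""
    current = {NFA_START}
    for symbol in word:
        following = set()
        for state in current:
            following |= NFA_DELTA.get((state, symbol), set())
        current = following
    return bool(current & NFA_ACCEPTING)


for w in ["101", "0101", "1010", ""]:
    print(repr(w), nfa_accepts(w))
```

The set `current` is exactly the information a DFA state has to carry in the subset construction.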
• Theorem
[Figure: NFA with states q0, q1, q2: q0 loops on 0, 1; q0 →1 q1 →0 q2]
[Figure: equivalent DFA with states "q0", "q0 or q1", "q0 or q2"]
General method
• States. NFA: q0, q1, …, qn. DFA: ∅, {q0}, {q1}, {q0,q1}, …, {q0,…,qn} (one for each subset of states in the NFA).
• Transitions. NFA: δ. DFA: δ’({qi1,…,qik}, a) = δ(qi1, a) ∪ … ∪ δ(qik, a).
• Accepting states. NFA: F ⊆ Q. DFA: F’ = {S : S contains some state in F}.
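The general method can be sketched in Python as follows. Two points are assumptions of this sketch rather than part of the slide: it builds only the subsets reachable from {q0} instead of all 2^(n+1) subsets, and the example NFA (q0 loops on 0, 1; q0 →1 q1 →0 q2) is taken to have q2 as its accepting state. The function names are illustrative.

```python
def subset_construction(delta, start, accepting, alphabet):
    """Build the reachable part of the subset DFA of an NFA without ε-moves.

    delta maps (state, symbol) to a set of NFA states; a missing key means
    "no move". DFA states are frozensets of NFA states.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        subset = todo.pop()
        for a in alphabet:
            # δ'(S, a) is the union of δ(q, a) over all q in S.
            target = frozenset(r for q in subset for r in delta.get((q, a), ()))
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # F' contains every subset that holds some accepting NFA state.
    dfa_accepting = {s for s in seen if s & accepting}
    return dfa_delta, start_set, dfa_accepting


def run_dfa(dfa_delta, start_set, dfa_accepting, word):
    state = start_set
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_accepting


nfa_delta = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): {"q2"},
}
dfa_delta, dfa_start, dfa_final = subset_construction(nfa_delta, "q0", {"q2"}, "01")
dfa_states = {subset for subset, _ in dfa_delta}
print(sorted(sorted(s) for s in dfa_states))
```

For this NFA the reachable subsets are {q0}, {q0, q1}, and {q0, q2}, which correspond to the DFA states labeled "q0", "q0 or q1", and "q0 or q2".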
Proof of correctness
• Lemma
After reading n symbols, the DFA is in state
{qi1,…,qik} if and only if qi1,…,qik are exactly the
states the NFA can be in
• Proof by induction on n
• At the end, the DFA accepts iff it is in a state that
contains some accepting state of the NFA
• By the lemma, this is true iff the NFA can reach an
accepting state
Regular Expressions
Operations on strings
L* = L^0 ∪ L^1 ∪ L^2 ∪ …
{0}({0}∪{1})* is written 0(0+1)*: all strings that start with 0
({0}{1}*)∪({1}{0}*) is written 01*+10*
Regular expressions
(0+1)*00(0+1)*
(1*01)*1* + (1*01)*1*0
(1*01*01*)*
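Expressions like these can be checked by brute force over short strings. The sketch below uses Python's `re` module, where the union written `+` on the slides becomes `|`. The two descriptions being tested ("contains 00" and "does not contain 00") are this sketch's reading of the first two expressions, not text from the slide; the helper names are illustrative.

```python
import re
from itertools import product

# Slide notation -> Python syntax: '+' (union) becomes '|'.
CONTAINS_00 = re.compile(r"(0|1)*00(0|1)*")
NO_00 = re.compile(r"(1*01)*1*|(1*01)*1*0")


def binary_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            yield "".join(letters)


def agrees(pattern, predicate, max_len=8):
    """True iff pattern matches exactly the strings satisfying predicate."""
    return all(
        (pattern.fullmatch(w) is not None) == predicate(w)
        for w in binary_strings(max_len)
    )


print(agrees(CONTAINS_00, lambda w: "00" in w))
print(agrees(NO_00, lambda w: "00" not in w))
```

`fullmatch` is used because a regular expression in the formal sense describes whole strings, whereas `re.search` would look for a match anywhere inside the string.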
Main theorem for regular languages
• Theorem
DFA, NFA, and regular expression all describe the same class: the regular languages
Proof plan
regular expression → εNFA → NFA → DFA
[Figure: example εNFA over Σ = {a, b} with states q0, q1, q2; transitions labeled a, b, and ε, including q0 →ε,b q1]
• R1 = 0
[Figure: q0 →0 q1]
• R2 = 0 + 1
[Figure: M2 with q0 →ε q2 →0 q3 →ε q1 and q0 →ε q4 →1 q5 →ε q1]
• R3 = (0 + 1)*
[Figure: q’0 →ε M2 →ε q’1, plus two further ε-transitions]
General method
• ∅: single state q0 (not accepting)
• ε: single state q0 (accepting)
• symbol a: q0 →a q1
• RS: q0 →ε MR →ε MS →ε q1
Convention
• R + S: q0 →ε MR →ε q1 and q0 →ε MS →ε q1
• R*: q0 →ε MR →ε q1, plus two further ε-transitions
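The construction can be sketched in Python in the style usually called Thompson's construction. This is a variant, not the slide's exact machines: every fragment gets its own fresh start and accept state (so ∅ and ε use two states instead of one), and regular expressions are given as nested tuples such as `("star", ("alt", ("sym", "0"), ("sym", "1")))`. The tuple format and the names `Builder`, `closure`, and `accepts` are assumptions of this sketch.

```python
from itertools import count


class Builder:
    def __init__(self):
        self.edges = {}  # state -> list of (label, target); label None means ε
        self._ids = count()

    def new_state(self):
        s = next(self._ids)
        self.edges[s] = []
        return s

    def link(self, src, label, dst):
        self.edges[src].append((label, dst))

    def build(self, rx):
        """Return (start, accept) of an εNFA fragment for the tuple rx."""
        start, accept = self.new_state(), self.new_state()
        kind = rx[0]
        if kind == "empty":      # ∅: accept stays unreachable
            pass
        elif kind == "eps":      # ε
            self.link(start, None, accept)
        elif kind == "sym":      # single symbol a
            self.link(start, rx[1], accept)
        elif kind == "cat":      # RS: run MR, then MS
            r0, r1 = self.build(rx[1])
            s0, s1 = self.build(rx[2])
            self.link(start, None, r0)
            self.link(r1, None, s0)
            self.link(s1, None, accept)
        elif kind == "alt":      # R + S: branch into MR or MS
            for sub in rx[1:]:
                a, b = self.build(sub)
                self.link(start, None, a)
                self.link(b, None, accept)
        elif kind == "star":     # R*: skip MR, or run it and repeat
            r0, r1 = self.build(rx[1])
            self.link(start, None, r0)
            self.link(r1, None, accept)
            self.link(start, None, accept)
            self.link(accept, None, start)
        else:
            raise ValueError("unknown node: %r" % (kind,))
        return start, accept


def closure(edges, states):
    """All states reachable from states using ε-transitions only."""
    seen, stack = set(states), list(states)
    while stack:
        for label, target in edges[stack.pop()]:
            if label is None and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def accepts(rx, word):
    builder = Builder()
    start, accept = builder.build(rx)
    current = closure(builder.edges, {start})
    for ch in word:
        moved = {t for s in current for label, t in builder.edges[s] if label == ch}
        current = closure(builder.edges, moved)
    return accept in current


R3 = ("star", ("alt", ("sym", "0"), ("sym", "1")))  # (0 + 1)*
print(accepts(R3, "0110"), accepts(("cat", ("sym", "0"), R3), "10"))
```

Giving every fragment exactly one entry and one exit keeps each case local: a larger machine only ever attaches ε-transitions to the start and accept states of its parts.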
Road map
[Diagram: conversions between regular expression, εNFA, NFA, and DFA]
Example of εNFA to NFA conversion
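[Figure: εNFA with states q0, q1, q2; transitions labeled a, b, and ε, including q0 →ε,b q1]
[Figure: resulting NFA with the same states q0, q1, q2; transitions labeled a and a, b, with no ε-transitions]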
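One standard way to remove ε-transitions uses ε-closures; it may differ in detail from the method presented in the lecture. Because the transitions of the a/b example are not fully recoverable here, the sketch below instead uses the machine for 0 + 1 (q0 →ε q2 →0 q3 →ε q1 and q0 →ε q4 →1 q5 →ε q1), assuming q1 is its accepting state. All function and variable names are illustrative.

```python
def eps_closure(eps, state):
    """States reachable from state via zero or more ε-transitions."""
    seen, stack = {state}, [state]
    while stack:
        for target in eps.get(stack.pop(), ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def remove_epsilons(states, alphabet, delta, eps, accepting):
    """Return (new_delta, new_accepting) of an equivalent NFA without ε-moves.

    delta maps (state, symbol) to a set of states; eps maps a state to the
    set of states reachable by one ε-transition.
    """
    new_delta = {}
    for q in states:
        reach = eps_closure(eps, q)
        for a in alphabet:
            targets = set()
            for p in reach:                      # ε-moves before the symbol
                for r in delta.get((p, a), ()):
                    targets |= eps_closure(eps, r)  # ε-moves after the symbol
            if targets:
                new_delta[(q, a)] = targets
    # A state accepts if it can reach an accepting state by ε-moves alone.
    new_accepting = {q for q in states if eps_closure(eps, q) & accepting}
    return new_delta, new_accepting


states = ["q0", "q1", "q2", "q3", "q4", "q5"]
eps = {"q0": {"q2", "q4"}, "q3": {"q1"}, "q5": {"q1"}}
delta = {("q2", "0"): {"q3"}, ("q4", "1"): {"q5"}}
new_delta, new_accepting = remove_epsilons(states, "01", delta, eps, {"q1"})
print(new_delta[("q0", "0")], new_accepting)
```

The state set stays the same; only the transitions and the accepting states change, which matches the example above where the NFA keeps the states q0, q1, q2.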
General method