Algoritmen & Datastructuren 2012 - 2013 Substring Search (Slides by Sedgewick)
Philip Dutré & Benedict Brown
Dept. of Computer Science, K.U.Leuven
Goal: find a pattern of length M in a text of length N. Usually M << N.
Applications
Parsers
Spam filters
Digital libraries
Word processors
Web search engines
Natural language processing
Computational molecular biology
Feature detection in digitized images
PROFITS
L0SE WE1GHT
herbal V1AGRA
There is no catch
L0W M0RTGAGE RATES
This is a one-time mailing.
This message is sent in
compliance with spam
regulations
Detonate Bomb
Attack Belgians
I will throw a pie in Leonard's face
Brute force
Check all possible starting positions,
check entire length of pattern
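The brute-force idea above can be sketched in Java as follows. This is a minimal illustration, not necessarily the code from the slides: the class name BruteForceSearch is hypothetical, and returning the text length when there is no match is a convention assumed here.

```java
final class BruteForceSearch {
    // Returns the index of the first occurrence of pat in txt,
    // or txt.length() if there is none (assumed convention).
    static int search(String pat, String txt) {
        int m = pat.length();
        int n = txt.length();
        // Try every possible starting position i in the text.
        for (int i = 0; i <= n - m; i++) {
            int j = 0;
            // Compare the pattern against the text, character by character.
            while (j < m && txt.charAt(i + j) == pat.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i; // all m characters matched
            }
        }
        return n; // not found
    }
}
```

In the worst case every starting position needs close to M compares, so the running time is proportional to M N.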
Challenges
Knuth-Morris-Pratt (KMP)
KMP trace
KMP Implementation
Running time
Mismatch transition
To compute dfa[c][j] for a mismatch: simulate pat[1..j-1] on the DFA, then take transition c from the resulting state.
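A possible Java sketch of the DFA construction and the KMP search loop is given here. It assumes an alphabet of R = 256 characters (an assumption, so text with larger char values would need a bigger R), and the names KmpSearch and buildDfa are hypothetical. The variable x tracks the restart state, i.e. the state reached by simulating pat[1..j-1], so each mismatch column is a copy of column x.

```java
final class KmpSearch {
    private static final int R = 256; // assumed alphabet size

    // Builds dfa[c][j]: the next state when reading character c in state j.
    static int[][] buildDfa(String pat) {
        int m = pat.length();
        int[][] dfa = new int[R][m];
        dfa[pat.charAt(0)][0] = 1;
        for (int x = 0, j = 1; j < m; j++) {
            for (int c = 0; c < R; c++) {
                dfa[c][j] = dfa[c][x];     // mismatch: copy from restart state x
            }
            dfa[pat.charAt(j)][j] = j + 1; // match: advance to the next state
            x = dfa[pat.charAt(j)][x];     // update the restart state
        }
        return dfa;
    }

    // Returns the index of the first match, or txt.length() if none.
    static int search(String pat, String txt) {
        int m = pat.length();
        int n = txt.length();
        if (m == 0) {
            return 0; // empty pattern matches at the start
        }
        int[][] dfa = buildDfa(pat);
        int i = 0;
        int j = 0;
        // Each text character is read exactly once; there is no backup.
        for (; i < n && j < m; i++) {
            j = dfa[txt.charAt(i)][j];
        }
        return (j == m) ? i - m : n;
    }
}
```

The search loop touches each text character at most once, while building the table takes time and space proportional to R M.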
KMP Analysis
R = size of alphabet
Donald Knuth
Boyer-Moore
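Intuition: scan the pattern from right to left; on a mismatch with a text character that does not occur in the pattern, skip ahead by up to M characters.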
Boyer-Moore Analysis
Worst case: ~ M N
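A minimal Java sketch of Boyer-Moore using only the mismatched-character heuristic is shown here. It assumes an alphabet of R = 256 characters, and the class name BoyerMooreSearch and the table name right are illustrative choices. The table stores the rightmost position of each character in the pattern, or -1 if the character does not occur.

```java
import java.util.Arrays;

final class BoyerMooreSearch {
    private static final int R = 256; // assumed alphabet size

    // Returns the index of the first match, or txt.length() if none.
    static int search(String pat, String txt) {
        int m = pat.length();
        int n = txt.length();

        // right[c] = rightmost index of c in pat, or -1 if c is absent.
        int[] right = new int[R];
        Arrays.fill(right, -1);
        for (int j = 0; j < m; j++) {
            right[pat.charAt(j)] = j;
        }

        int skip;
        for (int i = 0; i <= n - m; i += skip) {
            skip = 0;
            // Scan the pattern from right to left.
            for (int j = m - 1; j >= 0; j--) {
                char t = txt.charAt(i + j);
                if (pat.charAt(j) != t) {
                    // Align the text character with its rightmost occurrence
                    // in the pattern, but always move forward by at least 1.
                    skip = Math.max(1, j - right[t]);
                    break;
                }
            }
            if (skip == 0) {
                return i; // no mismatch: match found at i
            }
        }
        return n; // not found
    }
}
```

With this heuristic alone, an input such as the pattern ABBBB in a text of all B characters forces a shift of 1 after about M compares at every position, which is where the ~ M N worst case comes from.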
Rabin-Karp
t_i = txt.charAt(i)
Horner's method
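A hedged Java sketch of Rabin-Karp with a Horner-style hash and a rolling update is given here. The radix R = 256 and the prime Q = 1000000007 are assumptions chosen for illustration, as are the names RabinKarpSearch, hash and rm. This variant verifies each hash hit against the text, so it never reports a false match.

```java
final class RabinKarpSearch {
    private static final int R = 256;            // assumed radix
    private static final long Q = 1_000_000_007L; // assumed prime modulus

    // Horner's method: hash of the first m characters of key, modulo Q.
    static long hash(String key, int m) {
        long h = 0;
        for (int j = 0; j < m; j++) {
            h = (R * h + key.charAt(j)) % Q;
        }
        return h;
    }

    // Returns the index of the first match, or txt.length() if none.
    static int search(String pat, String txt) {
        int m = pat.length();
        int n = txt.length();
        if (m > n) {
            return n;
        }
        long rm = 1; // R^(m-1) mod Q, used to remove the leading character
        for (int i = 1; i < m; i++) {
            rm = (R * rm) % Q;
        }
        long patHash = hash(pat, m);
        long txtHash = hash(txt, m);
        if (patHash == txtHash && txt.startsWith(pat)) {
            return 0;
        }
        for (int i = m; i < n; i++) {
            // Remove the leading character, then append the trailing one.
            txtHash = (txtHash + Q - rm * txt.charAt(i - m) % Q) % Q;
            txtHash = (txtHash * R + txt.charAt(i)) % Q;
            int offset = i - m + 1;
            // Verify on a hash hit to rule out collisions.
            if (patHash == txtHash && txt.startsWith(pat, offset)) {
                return offset;
            }
        }
        return n; // not found
    }
}
```

Each rolling update costs constant time, so the hash of every length-M window of the text is obtained in time proportional to N overall.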