Boyer Moore Algorithm
Given a text T of length n and a pattern P of length m, find all occurrences of P in T
Right to Left
Matching the pattern from right to left. For a pattern abc:
T: bbacdcbaabcddcdaddaaabcbcb
P: abc
Sublinearity!
T: ddbbacdcbaabcddcdaddaaabcbcb
P: acabc
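The right-to-left comparison at a single alignment can be sketched as follows. This is a minimal illustration, not code from the slides: the function name mismatch_position and the 1-based convention (k is the text index under the last pattern character) are assumptions chosen to match the notation used later.

```python
def mismatch_position(text, pattern, k):
    """Compare pattern with text[k-m+1..k] (1-based) from right to left.

    Returns the 1-based pattern position of the first mismatch found,
    or 0 if every character matched (an occurrence ending at k).
    """
    i = len(pattern)   # current 1-based position in the pattern
    h = k              # current 1-based position in the text
    while i >= 1 and pattern[i - 1] == text[h - 1]:
        i -= 1
        h -= 1
    return i


T = "bbacdcbaabcddcdaddaaabcbcb"
print(mismatch_position(T, "abc", 3))   # mismatch at the last pattern character
print(mismatch_position(T, "abc", 11))  # full match ending at text position 11
```

Because the scan starts at the right end, a mismatch on the very first comparison lets the pattern move without ever looking at the text characters to its left, which is where the sublinear behaviour comes from.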
BCR Preprocessing
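A table holding, for each position in the pattern and each character, the information needed to compute the size of the shift.
O(m |Σ|) space. O(1) access time.
In the example below, each entry is the rightmost position at or before the given position that holds the character (- : no such position).

     1 2 3 4 5
P:   a b a c b

     1 2 3 4 5
a    1 1 3 3 3
b    - 2 2 2 5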
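A table of this kind can be filled in one left-to-right pass over the pattern. The sketch below assumes the entries are positions of the rightmost occurrence at or before each pattern position, with 0 standing for "no occurrence"; the name rightmost_table is illustrative and the alphabet is taken to be the characters of the pattern only.

```python
def rightmost_table(pattern):
    """table[c][i - 1] = largest 1-based j <= i with pattern[j] == c, or 0 if none."""
    alphabet = sorted(set(pattern))
    table = {c: [] for c in alphabet}
    last = {c: 0 for c in alphabet}     # rightmost position seen so far, per character
    for i, ch in enumerate(pattern, start=1):
        last[ch] = i
        for c in alphabet:              # one column of the table per pattern position
            table[c].append(last[c])
    return table


table = rightmost_table("abacb")
print(table["a"])  # [1, 1, 3, 3, 3]
print(table["b"])  # [0, 2, 2, 2, 5]
```

On a mismatch at pattern position i against text character x, the shift would then be i minus the entry for (x, i), or i itself when the entry is 0. Since x differs from P[i], the entry always points strictly to the left of i.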
BCR - Summary
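On a mismatch, shift the pattern to the right until the rightmost occurrence of the mismatched text character to the left of the mismatch position in P is aligned with it (or past it, if there is no such occurrence).
T: aaaaaaaaaaaaaaaaaaaaaaaaa
P: abaaaa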
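Good suffix rule (GSR): when a mismatch occurs after a suffix S of P has been matched, find (if it exists) the smallest shift of P that will align another sub-string of P with the same S characters.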
GSR (Case 1)
Example 1: how much to move
GSR (Case 2)
Example 2: what if there is no alignment
GSR - Detailed
We mark the matched sub-string in T with t
k := m
while (k ≤ n) do
    Match P and T from right to left, starting at k.
    If a mismatch occurs:
        shift P right (advance k) by max(good suffix rule, bad char rule).
    else:
        print the occurrence and shift P right (advance k) by the good suffix rule.
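The loop above can be turned into a small runnable sketch. To keep it short, both shifts are computed by brute force at the time of the mismatch rather than from preprocessed tables, so this shows the control flow and not the efficient preprocessing. The function names are illustrative, positions are 1-based as in the pseudocode, and the good suffix shift assumes the strong form of the rule (the character brought under the mismatch must differ from the one that just failed).

```python
def bad_char_shift(pattern, i, x):
    """Shift that aligns the rightmost x strictly left of position i (1-based) with the text."""
    idx = pattern.rfind(x, 0, i - 1)    # 0-based index, or -1 if x does not occur there
    return i - 1 - idx                  # equals i when there is no occurrence


def good_suffix_shift(pattern, i):
    """Smallest shift consistent with the matched suffix P[i+1..m]; i == 0 means full match."""
    m = len(pattern)
    for s in range(1, m):
        # The shifted copy of P must agree with the matched suffix wherever they overlap.
        suffix_ok = all(pattern[q - s - 1] == pattern[q - 1]
                        for q in range(max(i, s) + 1, m + 1))
        # It must also bring a different character under the mismatch position.
        differs = i - s < 1 or pattern[i - s - 1] != pattern[i - 1]
        if suffix_ok and differs:
            return s
    return m


def boyer_moore(text, pattern):
    """Return the 1-based start positions of all occurrences of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    occurrences = []
    k = m                               # text position under the last pattern character
    while k <= n:
        i, h = m, k
        while i >= 1 and pattern[i - 1] == text[h - 1]:
            i -= 1
            h -= 1
        if i >= 1:                      # mismatch at pattern position i
            k += max(good_suffix_shift(pattern, i),
                     bad_char_shift(pattern, i, text[h - 1]))
        else:                           # occurrence found
            occurrences.append(k - m + 1)
            k += good_suffix_shift(pattern, 0)
    return occurrences


print(boyer_moore("bbacdcbaabcddcdaddaaabcbcb", "abc"))  # [9, 21]
```

Taking the maximum of the two shifts is safe because each rule on its own rules out every smaller shift.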
Algorithm Correctness
The bad character rule shift never misses a match
The good suffix rule shift never misses a match
L(i): the largest index j < m such that prefix P[1..j] contains suffix P[i..m] as a suffix but not suffix P[i-1..m]; L(i) = 0 if there is no such j.
    1  2  3  4  5  6  7  8  9 10 11 12 13
P:  b  b  a  b  b  a  a  b  b  c  a  b  b
L:  0  0  0  0  0  0  0  0  0  0  9  0 12

P:  b  b  a  b  b  a  a  b  b  c  a  b  b
l:     2  2  2  2  2  2  2  2  2  2  2  1
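Both tables can be derived from one auxiliary array. The sketch below makes two assumptions that go beyond the slides: N[j] is the length of the longest suffix of P[1..j] that is also a suffix of P, and l(i) is the length of the longest suffix of P[i..m] that is also a prefix of P. N is computed naively here for readability, and the names n_values, good_suffix_tables, big_l and small_l are illustrative.

```python
def n_values(pattern):
    """N[j] (1-based) = length of the longest suffix of P[1..j] that is also a suffix of P."""
    m = len(pattern)
    N = [0] * (m + 1)
    for j in range(1, m + 1):
        length = 0
        while length < j and pattern[j - 1 - length] == pattern[m - 1 - length]:
            length += 1
        N[j] = length
    return N


def good_suffix_tables(pattern):
    """Return (big_l, small_l) as 0-based lists; entry i - 1 holds the value for position i."""
    m = len(pattern)
    N = n_values(pattern)
    big_l = [0] * (m + 2)
    for j in range(1, m):               # increasing j, so the largest j is kept
        if N[j] > 0:
            big_l[m - N[j] + 1] = j     # P[i..m] ends at j, preceded by a different character
    small_l = [0] * (m + 2)
    longest = 0
    for i in range(m, 0, -1):
        j = m - i + 1                   # length of the suffix P[i..m]
        if N[j] == j:                   # the prefix P[1..j] is also a suffix of P
            longest = j
        small_l[i] = longest
    return big_l[1:m + 1], small_l[1:m + 1]


big_l, small_l = good_suffix_tables("bbabbaabbcabb")
print(big_l)
print(small_l)
```

For the pattern above this gives 9 and 12 at positions 11 and 13 of L, and the l values 2, ..., 2, 1 at positions 2 to 13. At position 12 this reading of the definition yields L = 2 rather than the 0 shown in the table, because the copy of bb at the very start of P has no preceding character. The resulting shift is the same either way, since l(12) = 2 as well.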
T: aaaaaaaaaaaaaaaaaaaaaaaaa
P: aaaaaa
Boyer Moore Algorithm runs in Θ(mn) when the pattern occurs at (almost) every position of the text, as in the example above.