Lect 8 Phrase Query
Without looking at the documents themselves, we cannot verify that the docs
matching the above Boolean query actually contain the phrase.
Biword indexes are not the standard solution (for all biwords) but can be
part of a compound strategy
Solution 2: Positional indexes
Positional indexes are a more efficient alternative to biword indexes.
Postings lists in a nonpositional index: each posting is just a docID
Postings lists in a positional index: each posting is a docID and a list of positions
In the postings, store, for each term, the position(s) at which its tokens
appear in each document.
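As a rough illustration of the difference between the two kinds of postings, the sketch below builds a small positional index in Python. The function name build_positional_index, the whitespace tokenizer, the 1-based positions, and the two toy documents are all assumptions made for this example; they are not taken from the lecture.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]}.

    Positions are 1-based token offsets within the document.
    Tokenization is plain lowercase whitespace splitting (an assumption).
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.lower().split(), start=1):
            index[token].setdefault(doc_id, []).append(pos)
    return dict(index)


# Hypothetical toy collection, for illustration only.
docs = {1: "to be or not to be", 2: "not to worry"}
index = build_positional_index(docs)

# A nonpositional posting list keeps only the docIDs;
# a positional posting list keeps docID plus positions.
nonpositional_to = sorted(index["to"])
positional_to = index["to"]
```

Here nonpositional_to holds just the docIDs for "to", while positional_to also records where in each document the term occurs, which is the extra information a phrase query needs.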
Document 4 is a match!
Positional index example
<be: 993427;
1: 7, 18, 33, 72, 86, 231;
2: 3, 149;
4: 17, 191, 291, 430, 434;
Which of docs 1, 2, 4, 5 could contain “to be or not to be”?
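A question like this can be answered mechanically once every query term has positional postings. The following is a minimal sketch, not the lecture's algorithm as stated: it first keeps only documents that contain every term, then checks whether some occurrence of the first term is followed by the remaining terms at consecutive positions. The function phrase_docs and the contents of toy_index are hypothetical and invented for this example; they are not the postings shown above.

```python
def phrase_docs(index, phrase):
    """Return sorted docIDs in which the terms of `phrase` occur consecutively.

    `index` maps term -> {doc_id: [positions]} (a positional index).
    """
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []

    # Step 1: candidate docs must contain every term (Boolean AND).
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])

    # Step 2: verify adjacency using positions.
    matches = []
    for doc_id in sorted(candidates):
        later = [set(index[t][doc_id]) for t in terms[1:]]
        for start in index[terms[0]][doc_id]:
            # Term k of the phrase must sit at position start + k.
            if all(start + k in positions
                   for k, positions in enumerate(later, start=1)):
                matches.append(doc_id)
                break
    return matches


# Hypothetical positional index (term -> {docID: positions}).
toy_index = {
    "to": {1: [1, 5], 2: [2], 3: [3]},
    "be": {1: [2, 6], 3: [1]},
    "or": {1: [3]},
    "not": {1: [4], 2: [1]},
}

full_phrase = phrase_docs(toy_index, "to be or not to be")
short_phrase = phrase_docs(toy_index, "to be")
```

In this toy index, document 3 contains both "to" and "be" but never with "be" immediately after "to", so the Boolean step alone would accept it while the positional check rejects it.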