IR Slides
and Organisation
Dell Zhang
Birkbeck, University of London
IR Chapter 03
Dictionaries and Tolerant Retrieval
Dictionaries
• Dictionary: the data structure for storing the term vocabulary
Brutus     →  1 2 4 11 31 45 173 174
Calpurnia  →  2 31 54 101
...
(left of the arrow: dictionary; right of the arrow: postings)
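As a rough illustration of the picture above, the sketch below uses a Python dict as a stand-in for the dictionary and plain lists for the postings. The terms and docIDs are the ones on the slide; the helper names `postings` and `document_frequency` are made up for this example, and a real system would not necessarily keep everything in one in-memory dict.

```python
# Toy dictionary -> postings mapping, using the two terms shown on the slide.
# Keys play the role of the dictionary; values are sorted postings lists of docIDs.
index = {
    "Brutus": [1, 2, 4, 11, 31, 45, 173, 174],
    "Calpurnia": [2, 31, 54, 101],
}


def postings(term):
    """Return the postings list for a term (hypothetical helper).

    A term that is not in the vocabulary yields an empty list.
    """
    return index.get(term, [])


def document_frequency(term):
    """Number of documents containing the term, i.e. the postings list length."""
    return len(postings(term))


if __name__ == "__main__":
    print(postings("Brutus"))
    print(document_frequency("Calpurnia"))
```

Looking up a term first consults the dictionary and then follows the arrow to its postings list; the document frequency is simply the length of that list.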
Storing Dictionaries
• For each term, we need to store a couple of items:
    • document frequency
    • pointer to postings list
    • ...
• Assume for the time being that
    • we can store this information in a fixed-length entry
    • we store these entries in an array
Storing Dictionaries
term      document frequency    pointer to postings list
a         656,265               →
aachen    65                    →
...       ...                   ...
zulu      221                   →
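One possible way to realise the table above as an array of fixed-length entries is sketched below with Python's `struct` module. The field widths (20 bytes for the term, 4 bytes each for document frequency and postings pointer) are an assumption made for this example, not something the slide specifies, and the pointer values are placeholder offsets. The binary-search `lookup` is likewise only one option that a sorted fixed-length array makes possible.

```python
import struct

# Assumed entry layout (not given on the slide):
#   20-byte term (null-padded), 4-byte document frequency, 4-byte postings pointer.
ENTRY = struct.Struct("<20sII")


def pack_entries(rows):
    """Pack (term, df, pointer) rows, sorted by term, into one contiguous array."""
    buf = bytearray()
    for term, df, ptr in sorted(rows):
        buf += ENTRY.pack(term.encode("ascii"), df, ptr)
    return bytes(buf)


def read_entry(buf, i):
    """Decode the i-th fixed-length entry; its offset is simply i * ENTRY.size."""
    term, df, ptr = ENTRY.unpack_from(buf, i * ENTRY.size)
    return term.rstrip(b"\x00").decode("ascii"), df, ptr


def lookup(buf, term):
    """Binary search over the sorted entries; returns (df, pointer) or None."""
    lo, hi = 0, len(buf) // ENTRY.size
    while lo < hi:
        mid = (lo + hi) // 2
        t, df, ptr = read_entry(buf, mid)
        if t == term:
            return df, ptr
        if t < term:
            lo = mid + 1
        else:
            hi = mid
    return None


if __name__ == "__main__":
    # Document frequencies are from the slide; pointers 0, 1, 2 are placeholders.
    table = pack_entries([("a", 656265, 0), ("aachen", 65, 1), ("zulu", 221, 2)])
    print(lookup(table, "aachen"))
```

Because every entry has the same size, the i-th entry can be found by arithmetic alone. The cost of that convenience is wasted space for short terms and truncation of terms longer than the assumed 20 bytes.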