syllabus
COURSE OBJECTIVES:
• To describe the differences between repositories such as Database Management
Systems, Information Retrieval systems, and data warehouses
• To discover various pre-processing techniques that can be applied to text documents
to outline the structure of queries and documents
• To articulate fundamental functions used in information retrieval such as
automatic indexing, abstracting, and clustering
• To learn the important concepts, algorithms, and data/file structures that are
necessary to specify, design, and implement Information Retrieval (IR) systems
COURSE OUTCOMES: After completion of the course, the student should be able to
CO-1: Identify and understand the relationships between various repository systems
CO-2: Apply knowledge of data structures and indexing methods in information
retrieval systems
CO-3: Implement various clustering and searching techniques and algorithms on
information systems
CO-4: Analyze clustering techniques and algorithms using evaluation measures
UNIT-I:
Introduction: Definition, Objectives, Functional Overview, Relationship to DBMS, Digital
libraries and Data Warehouses, Information Retrieval System Capabilities – Search,
Browse, Miscellaneous.
UNIT-II:
Cataloging and Indexing: Objectives, Indexing Process, Automatic Indexing,
Information Extraction
Data Structures: Introduction, Stemming Algorithms, Inverted file structures, N-gram
data structure, PAT data structure, Signature file structure, Hypertext data structure.
UNIT-III:
Automatic Indexing: Classes of automatic indexing, Statistical Indexing, Natural
language, Concept indexing, Hypertext linkages.
Document and Term Clustering: Introduction, Thesaurus generation, Item clustering,
Hierarchy of clusters.
UNIT-IV:
User Search Techniques: Search statements and binding, Similarity measures and
ranking, Relevance feedback, Selective dissemination of information search,
Weighted searches of Boolean systems, Searching the Internet and hypertext.
UNIT-V:
Text Search Algorithms: Introduction, Software text search algorithms, Hardware text
search systems.
UNIT-VI:
Information System Evaluation: Introduction, Measures used in system evaluation,
Measurement example – TREC results.
TEXT BOOKS:
1. Information Storage and Retrieval Systems: Theory and Implementation, Gerald
Kowalski and Mark T. Maybury, Springer
2. Modern Information Retrieval, Ricardo Baeza-Yates, Pearson Education, 2007
REFERENCES:
1. Information Retrieval: Algorithms and Heuristics, David A. Grossman and Ophir
Frieder, 2nd Edition, Springer
2. Information Retrieval: Data Structures and Algorithms, W. B. Frakes and Ricardo
Baeza-Yates, Prentice Hall, 1992