Week 1
By
Dr Syed Khaldoon Khurshid
Course Objectives:
1. Understand the fundamentals of information retrieval, including
retrieval models, indexing, and ranking algorithms.
2. Explore advanced topics such as user modeling, query
expansion, and result diversification.
3. Analyze real-world case studies to identify challenges and
solutions in information retrieval.
4. Develop skills in implementing and evaluating information
retrieval systems.
5. Foster critical thinking and problem-solving abilities through
assignment-based learning.
Basic Concepts:
• Data 🡪 Raw fact
• Example of Data
3520179534535
• Information 🡪 Processed Data
• Example of Information
CNIC: 35201-7953453-5
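The step from data to information in the example above can be sketched in a few lines of Python. This is only an illustration: the function name format_cnic is hypothetical, and the 5-7-1 digit grouping is read off the example on this slide.

```python
def format_cnic(raw: str) -> str:
    """Turn a raw 13-digit string (data) into a labelled CNIC (information)."""
    # Assumption: the input is exactly 13 digits, as in the slide's example.
    if len(raw) != 13 or not raw.isdigit():
        raise ValueError("expected exactly 13 digits")
    # Group the digits as 5-7-1 and attach the label that gives them meaning.
    return f"CNIC: {raw[:5]}-{raw[5:12]}-{raw[12]}"


print(format_cnic("3520179534535"))  # prints: CNIC: 35201-7953453-5
```

The digits are unchanged; only the structure and the label are added, and that is what turns the raw fact into something a reader can interpret.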
MOTIVATION
INFORMATION RETRIEVAL (IR)
Representation, storage, organization of,
and access to information items.
• User information need:
- Find all documents containing information on
computer systems that (1) are
designed by IBM and (2) were designed after
2020.
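If each document carried structured metadata, the information need above could be written as a filter over two constraints. The sketch below assumes exactly that: the Doc class, its fields, and the four records are made up for illustration. A real IR system usually has to infer such properties from the document text.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    topic: str
    designer: str
    year: int


# Made-up records, for illustration only.
DOCS = [
    Doc("d1", "computer systems", "IBM", 2022),
    Doc("d2", "computer systems", "IBM", 2018),
    Doc("d3", "computer systems", "Other Vendor", 2023),
    Doc("d4", "databases", "IBM", 2021),
]


def satisfies_need(doc: Doc) -> bool:
    """Check the need: about computer systems, (1) by IBM, (2) after 2020."""
    return (
        doc.topic == "computer systems"
        and doc.designer == "IBM"
        and doc.year > 2020
    )


matching = [doc.doc_id for doc in DOCS if satisfies_need(doc)]
print(matching)  # prints: ['d1']
```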
Information Retrieval Systems
An information retrieval system (IRS) has two basic aspects:
(i) How to store information, and
(ii) How to retrieve information.
Text-based Information Retrieval:
• An information retrieval system is designed to analyze, process, and
store sources of information and to retrieve those that match a
particular user's requirements. Modern information retrieval
systems can retrieve either bibliographic items or the exact text that
matches a user's search criteria from a stored database of documents.
• IRS originally meant text retrieval systems, since they dealt
with textual documents.
Information Retrieval in the Library
⮚ Libraries were among the first institutions to adopt
systems for retrieving information.
Information Retrieval vs. Data Retrieval
1) Information retrieval: Software that deals with the organization, storage,
   retrieval, and evaluation of information from documents, particularly
   textual information.
   Data retrieval: Deals with obtaining data from a database management
   system such as an RDBMS.
2) Information retrieval: Retrieves information about a subject.
   Data retrieval: Determines the keywords in the user query and retrieves
   the data.
3) Information retrieval: Small errors are likely to go unnoticed.
   Data retrieval: A single error object means total failure.
4) Information retrieval: Not always well structured and is semantically
   ambiguous.
   Data retrieval: Has a well-defined structure and semantics.
5) Information retrieval: Does not provide a solution to the user of the
   database system.
   Data retrieval: Provides solutions to the user of the database system.
6) Information retrieval: The results obtained are approximate matches.
   Data retrieval: The results obtained are exact matches.
7) Information retrieval: Results are ordered by relevance.
   Data retrieval: Results are not ordered by relevance.
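Points 6 and 7 of the comparison can be illustrated with a small Python sketch. It is not a standard algorithm: the three documents are made up, and the scoring rule (the fraction of query words found in a document) is a deliberately crude stand-in for a real ranking function.

```python
# Made-up documents, for illustration only.
DOCS = {
    "d1": "ibm designs computer systems",
    "d2": "history of computer hardware",
    "d3": "cooking with seasonal vegetables",
}


def exact_match(docs, text):
    """Data-retrieval style: an item either equals the query or it does not."""
    return {doc_id for doc_id, body in docs.items() if body == text}


def ranked_match(docs, query):
    """IR style: keep partial matches and order them by a relevance score."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, body in docs.items():
        overlap = len(terms & set(body.lower().split()))
        if overlap:
            # Score = fraction of query terms that appear in the document.
            scored.append((overlap / len(terms), doc_id))
    # Highest score first; ties are broken by document id.
    return [doc_id for score, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


print(exact_match(DOCS, "ibm computer systems"))   # prints: set()
print(ranked_match(DOCS, "ibm computer systems"))  # prints: ['d1', 'd2']
```

The exact comparison returns nothing because no document equals the query string, while the ranked version still returns two approximate matches, with the better one first.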
Information Retrieval was the main focus:
⮚ PAST 30-40 YEARS
• Information Retrieval has grown well beyond its primary goals of
indexing text and searching for useful documents in a collection.
• Research in IR includes modeling, document classification,
user interfaces, languages, etc.
⮚ BEGINNING OF 1990s
• Introduction of the World Wide Web.
• Its success is based on the conception of a standard user
interface which is always the same.
• Any user can create his own Web documents without any
restriction.
User Task
• The user of a retrieval system has to translate his information need into a
query.
• With an information retrieval system, this implies specifying a set of words.
• With a data retrieval system, a query expression is used to convey the
constraints that must be satisfied by objects in the answer set.
[Figure: user tasks on the database: retrieval and browsing]
⮚ Two other very important issues are copyright and patent rights. It is far
from clear how the wide spread of data on the Web affects copyright and
patent laws in the various countries.
Retrieval Process
⮚ To describe the retrieval process, we use a simple and generic
software architecture.
⮚ Given that the document database is indexed, the retrieval
process can be initiated.
⮚ The user first specifies a user need which is then parsed
and transformed by the same text operations applied to
the text.
⮚ Then, query operations might be applied before the actual
query, which provides a system representation for the user
need, is generated.
⮚ The query is then processed to obtain the retrieved
documents. Fast query processing is made possible by
the index structure previously built.
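The steps above (text operations, indexing, query formulation, searching, ranking) can be sketched end to end in Python. This is a minimal sketch under stated assumptions, not the architecture from the textbook: the stopword list, the three documents, and the scoring rule (sum of term frequencies) are all chosen here for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumption: a tiny hand-picked stopword list.
STOPWORDS = {"a", "an", "and", "by", "in", "is", "of", "on", "the", "to"}

# Made-up documents, for illustration only.
DOCS = {
    "d1": "IBM designed the computer system.",
    "d2": "The index of a library catalogue.",
    "d3": "Computer systems and the computer index.",
}


def text_operations(text):
    """Lowercase, split into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(docs):
    """Build an inverted file: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, freq in Counter(text_operations(text)).items():
            index[term][doc_id] = freq
    return index


def search(index, user_need):
    """Apply the same text operations to the user need, then search and rank."""
    query = text_operations(user_need)
    scores = Counter()
    for term in query:
        # Only the postings of the query terms are read, not every document.
        for doc_id, freq in index.get(term, {}).items():
            scores[doc_id] += freq
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


index = build_index(DOCS)
print(search(index, "the computer index"))  # prints: ['d3', 'd1', 'd2']
```

The index is built once, before any query arrives, which is why query processing can be fast: a lookup touches only the entries for the query terms.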
[Figure: the retrieval process. Components: user need, text operations,
logical view, query operations, query, indexing, searching, ranking,
user feedback, DB manager module, index (inverted file), text database,
retrieved docs, ranked docs]
Assignment 1:
• Design a simple document search engine. First step:
search for a given word in a document.
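One possible starting point for the first step is a plain linear scan with no index. The sketch below makes several assumptions that the assignment does not state: documents are plain-text .txt files in one folder, matching is case-insensitive on whole words, and the function names find_word and search_folder are hypothetical.

```python
import re
from pathlib import Path


def find_word(text, word):
    """Return (line_number, line) pairs where `word` occurs as a whole word."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


def search_folder(folder, word):
    """Scan every .txt file in `folder`; map file name -> matching lines."""
    results = {}
    for path in sorted(Path(folder).glob("*.txt")):
        hits = find_word(path.read_text(encoding="utf-8"), word)
        if hits:
            results[path.name] = hits
    return results


sample = "Information retrieval systems\nstore documents.\nRetrieval is ranked."
print(find_word(sample, "retrieval"))
# prints: [(1, 'Information retrieval systems'), (3, 'Retrieval is ranked.')]
```

A scan like this reads every document for every query, so it is a baseline only.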
Resources
• Chapter 1 (Introduction), Book: Modern Information Retrieval
• https://www.flexiprep.com/NIOS-Notes/Senior-Secondary/Library-Science/NIOS-Library-Science-Unit-16-Information-Retrieval-System-Part-1-4.html [Accessed: 27-Aug-2023]