Information Search and Retrieval

The document discusses different techniques for information search and retrieval including information searching and retrieval, electronic directories and catalogs, and information filtering. It also describes the goals, methods, and challenges of information search and retrieval.

Uploaded by

Chandra Kali
Copyright
© Attribution Non-Commercial (BY-NC)
Information Search and Retrieval

There are three different techniques in the search and resource discovery paradigm:

• Information searching & retrieval
• Electronic directories & catalogs
• Information filtering
Information search & retrieval

• Information search and retrieval is the process of finding and extracting information according to the specification provided by a user.
• The main purpose of developing this process is to support naïve users in areas like electronic shopping and home banking.
The goals of information search and retrieval:

• To satisfy customers to the maximum extent.
• To reduce cost.
• To execute the requested query quickly.

Computer methods that are used to execute the query are:
• A method for finding exact matches based on keywords.
• A method for finding nearest neighbors.

• Information search and retrieval is used in areas like libraries, where customers concentrate on information-seeking behavior.
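
The two query-execution methods named above, exact keyword match and nearest neighbors, can be sketched roughly as follows. This is only an illustration: the function names, the sample catalog, and the use of Jaccard word overlap as the similarity measure are assumptions made for the sketch, since the slides do not name a specific measure.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().split())


def exact_match(keyword, documents):
    """Return the documents that contain the keyword as a whole word."""
    key = keyword.lower()
    return [doc for doc in documents if key in tokenize(doc)]


def nearest_neighbors(query, documents, k=2):
    """Return the k documents most similar to the query.

    Similarity is Jaccard overlap of word sets, an assumption of this
    sketch; other distance measures would work the same way.
    """
    q = tokenize(query)

    def jaccard(doc):
        d = tokenize(doc)
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    return sorted(documents, key=jaccard, reverse=True)[:k]


# Hypothetical product catalog used only for illustration.
catalog = [
    "red cotton shirt",
    "blue cotton trousers",
    "home banking guide",
]
print(exact_match("cotton", catalog))
print(nearest_neighbors("red shirt", catalog, k=1))
```

Exact match either finds the keyword or returns nothing, while the nearest-neighbor method always returns the closest items, even when no document matches the query perfectly.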
Electronic Directories and Catalogs

• Directories and catalogs are used for the following tasks:
• 1) Information organizing
• 2) Information browsing
Information Organizing:

• Organizing refers to the way information is arranged so that decisions can be made about how to interrelate it.
• Organizing the information in a static way is useful for some people but a hindrance to others.
• How people apprehend the organization of information is intuitive and personal: what one person finds easy to browse may be difficult for others, depending on their requirements.
Information Browsing:
• It is defined as a human-guided activity for analyzing the organization and identifying the details of a resource space.
• Two major problems occur while browsing: navigation issues and user disorientation.
• These problems can be addressed by using a system that supports different representations of similar information.
Information Filtering:

• The objective of information filtering is to provide access to relevant and valuable information when a user requests it.
• Information filtering is the process of selecting only the information that matches the user's request.
• This process is not responsible for performing any kind of search; its only objective is to filter out inconsistent data.
Software filters are used to provide access control. There are two kinds:
• Local filter: processes an incoming stream of data.
• Remote filter: a software agent that performs tasks on behalf of users. Remote filters help users perform daily tasks, search for and retrieve information, and support decision-making.
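
A local filter that processes an incoming stream of data might look something like the sketch below. The function name `local_filter`, the keyword-based user profile, and the sample headlines are all hypothetical. The slides do not say how a filter decides what is relevant, so simple keyword overlap is assumed here.

```python
def local_filter(stream, profile_keywords):
    """Yield only the incoming items that mention a profile keyword.

    Matching on whole words is an assumption of this sketch.
    """
    wanted = {word.lower() for word in profile_keywords}
    for item in stream:
        words = set(item.lower().split())
        if words & wanted:
            yield item


# Hypothetical incoming stream of headlines.
incoming = [
    "Mortgage rates fall again",
    "Local team wins final",
    "New online banking rates announced",
]
kept = list(local_filter(incoming, ["rates", "banking"]))
print(kept)
```

Because the filter is a generator, it handles items one at a time as they arrive and never searches a stored collection, which matches the point that filtering selects from a stream rather than performing a search.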
INFORMATION SEARCH AND RETRIEVAL
• Searching is the process of finding the required information in a massive amount of stored semi-structured information. This is in contrast to database applications, which deal with structured data that follows certain standards and syntaxes and uses data types with specific meanings.
• Examples: a student database (structured) and email messages (semi-structured).
The search process is accomplished in two phases:

• End-user retrieval phase
• Publisher indexing phase
End-user retrieval phase:
• The following steps are followed in the end-user retrieval phase:
• The user constructs a query, which specifies the search method to be used.
• The query is then sent to the server, which examines the query, processes it, and initiates the search. The result is a table containing a list of matching documents, called the hit list. Finally, the generated hit list is passed back to the user.
• The user then selects the pertinent documents according to their requirements, scans them, and prints only the desired parts.
Publisher indexing phase:
• This phase is responsible for:
• Making an entry for a document in the database.
• Creating and updating the indexes and pointers that are used when a search is performed.
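
The two phases can be illustrated with a toy inverted index. This is a minimal sketch under stated assumptions: the class `SearchServer`, its methods `publish` and `search`, and the rule that a hit must contain every query word are all hypothetical choices made for illustration, not something the slides prescribe.

```python
from collections import defaultdict


class SearchServer:
    """Toy server with a publisher indexing phase and a retrieval phase."""

    def __init__(self):
        self.documents = {}            # doc_id -> text (the "database")
        self.index = defaultdict(set)  # word -> ids of documents containing it

    def publish(self, doc_id, text):
        """Indexing phase: store the document and update the index."""
        self.documents[doc_id] = text
        for word in text.lower().split():
            self.index[word].add(doc_id)

    def search(self, query):
        """Retrieval phase: return the hit list for a keyword query.

        A document is a hit only if it contains every query word
        (an assumption of this sketch).
        """
        words = query.lower().split()
        if not words:
            return []
        hits = set(self.index.get(words[0], set()))
        for word in words[1:]:
            hits &= self.index.get(word, set())
        return sorted(hits)


server = SearchServer()
server.publish(1, "home banking made easy")
server.publish(2, "electronic shopping guide")
server.publish(3, "home shopping catalog")
hit_list = server.search("home shopping")
print(hit_list)
```

The index is built once, when a document is published, so that each later query only has to intersect a few small sets instead of scanning every document.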
• What is a Document?
• Examples: web pages, email, books, news stories, scholarly papers, text messages, Word, PowerPoint, and PDF files, forum postings, patents, IM sessions, etc.

• Common properties
• Significant text content
• Some structure (e.g., title, author, date for papers; subject,
sender, destination for email)
Documents vs. Database Records
Database records (or tuples in relational databases) are typically made up of well-defined fields (or attributes), e.g., bank records with account numbers, balances, names, addresses, social security numbers, dates of birth, etc.

Easy to compare fields with well-defined semantics to queries in order to find matches

Text is more difficult
Dimensions of IR

Content        Applications        Tasks
Text           Web search          Ad hoc search
Images         Vertical search     Filtering
Video          Enterprise search   Classification
Scanned docs   Desktop search      Question answering
Audio          Forum search
Music          P2P search
               Literature search
Information Retrieval Tasks

• Ad hoc search: Find relevant documents for an arbitrary text query
• Filtering: Identify relevant user profiles for a new document
• Classification: Identify relevant labels for documents
• Question answering: Give a specific answer to a question
Models of Information Retrieval:

• Three models are used for retrieving information from the database efficiently:
• Boolean information retrieval model
• Vector space information retrieval model
• Probabilistic information retrieval model
Boolean information retrieval model:

• Boolean refers to query specifications that are formed from words or phrases combined using the standard operators AND, OR, and NOT.
• This model fetches matching text files irrespective of their locations.
• The drawback of this model is that it does not give any preference or priority to the fetched documents.
• The model is effective when a query exactly matches the documents to be retrieved, and ineffective when the desired result is not definite and exact.
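
Boolean retrieval maps naturally onto set operations. The following is a rough sketch, assuming a small hypothetical collection and helper names (`build_index`, `term`) chosen for illustration: AND becomes set intersection, OR becomes union, and NOT becomes the complement with respect to all document ids.

```python
def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


def term(index, word):
    """Return the ids of the documents containing the word."""
    return index.get(word.lower(), set())


# Hypothetical collection used only for illustration.
docs = {
    1: "online banking security",
    2: "online shopping catalog",
    3: "bank branch directory",
}
index = build_index(docs)
all_ids = set(docs)

# online AND shopping
and_hits = term(index, "online") & term(index, "shopping")
# banking OR shopping
or_hits = term(index, "banking") | term(index, "shopping")
# online AND NOT shopping
not_hits = term(index, "online") & (all_ids - term(index, "shopping"))

print(and_hits, or_hits, not_hits)
```

Each result is an unordered set: a document either satisfies the query or it does not, which is the lack of priority among fetched documents noted above. The query "banking" also misses document 3, which only contains "bank", showing how the model depends on exact matches.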
Vector space information retrieval model

• The vector space model was developed to overcome the problems of the Boolean model.
• It compares vectors using the cosine correlation similarity measure.
• According to this method, a text matches the query when the text vector is similar to the query vector.
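
Cosine similarity between a query vector q and a text vector d is the dot product of the two vectors divided by the product of their lengths, so it approaches 1 as the vectors point in the same direction. The sketch below assumes raw word counts as the vector components. That weighting is an assumption made for illustration (practical systems often use weights such as TF-IDF), and the function names are hypothetical.

```python
import math
from collections import Counter


def to_vector(text):
    """Represent a text as a bag-of-words vector of raw word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, document) pairs, most similar first."""
    q = to_vector(query)
    scored = [(cosine(q, to_vector(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


ranked = rank("home banking", ["electronic shopping guide", "home banking guide"])
for score, doc in ranked:
    print(round(score, 3), doc)
```

Unlike the Boolean model, every document receives a score, so partial matches can still be returned and ordered by how close they are to the query.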
Probabilistic information retrieval model:

• This model is based on the probability ranking criterion.
• According to this criterion, every text in the database is given a priority according to its estimated probability of being relevant to the query.
• Both the vector space model and the probabilistic model can be combined with Boolean queries.
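
The slides do not name a particular probabilistic scoring function. As one possible illustration, the sketch below uses BM25, a widely used ranking function derived from the probabilistic model. The choice of BM25, the function name `bm25_rank`, the sample documents, and the parameter defaults `k1=1.5` and `b=0.75` are assumptions of this example, not part of the source.

```python
import math
from collections import Counter


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Rank documents for a query with the BM25 scoring function."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))

    scored = []
    for doc, tokens in zip(documents, tokenized):
        tf = Counter(tokens)
        score = 0.0
        for word in query.lower().split():
            if tf[word] == 0:
                continue
            # Rare words get a higher weight than common ones.
            idf = math.log((n_docs - df[word] + 0.5) / (df[word] + 0.5) + 1)
            # Term frequency is dampened and normalized by document length.
            norm = tf[word] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[word] * (k1 + 1) / norm
        scored.append((score, doc))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical collection used only for illustration.
collection = [
    "home banking guide",
    "electronic shopping guide",
    "home shopping catalog",
]
for score, doc in bm25_rank("home banking", collection):
    print(round(score, 3), doc)
```

The output is an ordering of the whole collection by estimated relevance, which is the priority assignment that the probability ranking criterion calls for.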
CHALLENGES AND PROBLEMS ENCOUNTERED IN INFORMATION SEARCH
• The following are the various challenges that arise while searching for information online.
• Information is being uploaded to the internet at a high rate.
• Since the turnover of information is rapid, traditional information search tools are not sufficient for consumers. The challenge, therefore, is to design and implement advanced searching, filtering, and data mining tools that optimize an individual's search process in terms of time, cost, and information needs.
• Sometimes the search results are overloaded, often confusing consumers.
• Consumers learn the environment slowly by coming to know "what is where". Additionally, directories and catalogs may be provided to help consumers navigate and browse the product information of their choice.
• Today, the focus is on human-technology interfaces: information about a customer's preferences is collected, and intelligent, useful information is provided to the consumer in return. The challenge here is how to represent this useful information on the screen. Work is under way on using virtual reality to display such information to the user.
