Implementation of a Mini Search Engine: A Data Structures Project

This document describes the implementation of a mini search engine that indexes documents stored as files using hashing functions and allows searching through the indexed documents using trees. The search engine takes queries with Boolean operators like AND and OR and returns documents that contain the keywords specified in the query. Indexing is done to map words in documents to the files using hashing while searching is performed using balanced binary search trees or AVL trees to efficiently return matching documents.

Uploaded by

Eshwari Jay
Copyright
© Attribution Non-Commercial (BY-NC)

IMPLEMENTATION OF A MINI SEARCH ENGINE

A Data Structures Project

Submitted By

MEGHA PALERI (CB.EN.P3MCA10026)

YOGESHWARI J (CB.EN.P3MCA10063)

NIRANJANA DEVI S (CB.EN.P3MCA10033)


ABSTRACT

In this project, we will design and implement a mini search engine for searching
through a collection of documents. The data structures used are files for storing
the documents, hash tables for indexing them and trees for searching them. Given a
set of text documents and a query, the search engine will locate all the documents
that contain the keywords in that query.

The purpose of this project is to provide an overview of how a search engine works
and to gain hands-on experience in using hash tables, files and trees.

MODULES

Indexing

The documents stored as files will be indexed by their words (tokens) using
hashing functions, so that each word is mapped to the documents that contain it.
This makes it easier to retrieve the required documents.
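As an illustration of this module, the following is a minimal Python sketch of a
hash table with separate chaining that maps each word to the set of documents
containing it. The hash function, the tokenization rule, the bucket count and all
names (hash_word, HashIndex, build_index) are assumptions made for this example;
the project does not specify them. For brevity the documents are passed as
in-memory strings rather than read from files.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (an assumed tokenization rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def hash_word(word, num_buckets):
    """Polynomial string hash; an assumed stand-in for the project's function."""
    h = 0
    for ch in word:
        h = (h * 31 + ord(ch)) % num_buckets
    return h


class HashIndex:
    """Hash table with separate chaining: word -> set of document names."""

    def __init__(self, num_buckets=101):
        # Each bucket is a list of (word, set_of_document_names) pairs.
        self.buckets = [[] for _ in range(num_buckets)]

    def add(self, word, doc_name):
        bucket = self.buckets[hash_word(word, len(self.buckets))]
        for entry_word, docs in bucket:
            if entry_word == word:
                docs.add(doc_name)
                return
        bucket.append((word, {doc_name}))

    def lookup(self, word):
        bucket = self.buckets[hash_word(word, len(self.buckets))]
        for entry_word, docs in bucket:
            if entry_word == word:
                return set(docs)  # return a copy so callers cannot alter the index
        return set()


def build_index(documents):
    """Index a dict of {hypothetical file name: text content}."""
    index = HashIndex()
    for doc_name, text in documents.items():
        for word in tokenize(text):
            index.add(word, doc_name)
    return index
```

Calling build_index on a dictionary of file names and their contents returns an
index whose lookup(word) gives the names of the documents containing that word, or
an empty set if the word was never indexed.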

Searching

Searching will be done using trees. Depending on the efficiency and complexity of
the algorithm, we will use AVL trees or another kind of balanced binary search
tree. To allow efficient searching, a list of the documents in which it occurs
will be stored for every word.
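One way this module could look is sketched below in Python: an AVL tree keyed by
word, where each node holds the set of documents in which the word occurs. This
assumes the AVL option is chosen; the names Node, insert and search are
hypothetical and not taken from the project.

```python
class Node:
    """AVL tree node keyed by a word, holding the documents that contain it."""

    def __init__(self, word):
        self.word = word
        self.docs = set()
        self.left = None
        self.right = None
        self.height = 1


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update(y)
    update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def insert(node, word, doc_name):
    """Record that doc_name contains word; return the new subtree root."""
    if node is None:
        node = Node(word)
        node.docs.add(doc_name)
        return node
    if word < node.word:
        node.left = insert(node.left, word, doc_name)
    elif word > node.word:
        node.right = insert(node.right, word, doc_name)
    else:
        node.docs.add(doc_name)  # word already present: extend its list
        return node
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:
        if word > node.left.word:  # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:
        if word < node.right.word:  # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def search(node, word):
    """Return the set of documents containing word (empty set if absent)."""
    while node is not None:
        if word == node.word:
            return node.docs
        node = node.left if word < node.word else node.right
    return set()
```

Because the tree is rebalanced on every insertion, its height stays logarithmic in
the number of distinct words, so a lookup visits only a logarithmic number of
nodes even when the words are inserted in sorted order.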

The queries may contain the simple Boolean operators AND and OR, which behave like
the corresponding logical operators. For each such query, the documents that
satisfy it will be displayed.

For instance:

Keyword1 AND Keyword2 -- retrieves all documents that contain both keywords.
Keyword1 OR Keyword2 -- retrieves all documents that contain at least one of the
two keywords.
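The two query forms correspond to set intersection and set union over the
per-word document lists. The Python sketch below assumes those lists are available
as a dictionary from word to set of document names (the hypothetical postings
argument) and evaluates longer queries strictly from left to right, without
operator precedence or parentheses; the project does not say how such queries
should be handled.

```python
def evaluate_query(query, postings):
    """Evaluate 'w1 AND w2 OR w3 ...' strictly left to right.

    postings: dict mapping a lowercase word to the set of documents
    containing it (an assumed representation of the index).
    """
    tokens = query.split()
    if not tokens:
        return set()
    # Start with the documents of the first keyword (copied, so the
    # index itself is never modified).
    result = set(postings.get(tokens[0].lower(), set()))
    i = 1
    while i + 1 < len(tokens):
        operator = tokens[i].upper()
        docs = postings.get(tokens[i + 1].lower(), set())
        if operator == "AND":
            result &= docs  # documents containing both
        elif operator == "OR":
            result |= docs  # documents containing at least one
        else:
            raise ValueError("unsupported operator: " + tokens[i])
        i += 2
    return result
```

With this sketch, a keyword that appears in no document contributes an empty set,
so an AND with it yields no results and an OR leaves the other operand unchanged.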
