The Question Answering System Using NLP and AI
ISSN 2229-5518
Abstract:
This paper aims at an intelligent learning system that takes a text file as input and gains knowledge from the given text. Using this knowledge, the system tries to answer questions posed to it by the user. The main goal of the Question Answering System (QAS) is to encourage research into systems that return answers, because a large number of users prefer direct answers, and to bring the benefits of large-scale evaluation to the QA task.

Keywords:
Question Answering System (QAS), Artificial Intelligence (AI), Natural Language Processing (NLP)

1. INTRODUCTION

Question Answering (QA) is a research area that combines work from several fields with a common subject: Information Retrieval (IR), Information Extraction (IE) and Natural Language Processing (NLP). Current search engines only perform "document retrieval", i.e. given some keywords they return a ranked list of relevant documents that contain these keywords. They do not provide a precise answer to the query. Hence a QAS is designed to help people find specific answers to specific questions in a restricted domain.

QA systems are classified into two main categories, namely open-domain QA systems and closed-domain QA systems. Open-domain question answering deals with questions about nearly anything, drawing on sources such as the World Wide Web. Closed-domain question answering, on the other hand, deals with questions within a specific domain (music, weather forecasting, etc.). A domain-specific QA system makes heavy use of natural language processing.

QA differs from search engines in two aspects:
(i) In QA, the query is a question, not a set of keywords.
(ii) QA responds with a specific answer to a specific question instead of a list of documents.

1.1. APPROACHES IN QA

There are three major approaches to Question Answering Systems: the Linguistic Approach, the Statistical Approach and the Pattern Matching Approach.

A. Linguistic Approach

This approach seeks to understand natural language text using linguistic techniques such as tokenization, POS tagging and parsing [1]. These techniques are applied to reformulate the user's question into a precise query that extracts the relevant answers from a structured database. The questions handled by this approach are of factoid type, and the approach relies on deep semantic understanding.

B. Statistical Approach

The availability of huge amounts of data on the World Wide Web has increased the importance of the statistical approach. Statistical approaches work over online text repositories, do not depend on structured query languages, and can formulate queries in natural language. Typical statistical techniques include Support Vector Machine classifiers, Bayesian classifiers, etc.

C. Pattern Matching Approach

This approach exploits the expressive power of text patterns and replaces the sophisticated processing involved in the other approaches. It is best suited to small and medium-sized websites. The questions handled by this approach are mainly factoid based.
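As an illustration of the pattern matching approach, the following minimal Python sketch pairs a question shape with a surface pattern that locates the answer in a passage. The two patterns, the name SURFACE_PATTERNS and the function extract_answer are hypothetical and chosen only for this example; the paper does not specify concrete patterns, and a practical system would need many more of them.

```python
import re

# Hypothetical surface patterns (not taken from the paper). Each entry pairs
# a regular expression for the question with a template for the answer
# sentence; {x} is replaced by the entity captured from the question.
SURFACE_PATTERNS = [
    (re.compile(r"^when was (?P<x>.+?) born\??$", re.IGNORECASE),
     r"{x} was born (?:on |in )?(?P<answer>[^.,;]+)"),
    (re.compile(r"^where is (?P<x>.+?) located\??$", re.IGNORECASE),
     r"{x} is located in (?P<answer>[^.,;]+)"),
]


def extract_answer(question, text):
    """Return the first answer found by a matching surface pattern, else None."""
    cleaned = question.strip()
    for question_re, answer_template in SURFACE_PATTERNS:
        match = question_re.match(cleaned)
        if match is None:
            continue
        # Escape the entity so that it is matched literally in the passage.
        entity = re.escape(match.group("x"))
        answer_re = re.compile(answer_template.format(x=entity), re.IGNORECASE)
        found = answer_re.search(text)
        if found:
            return found.group("answer").strip()
    return None


if __name__ == "__main__":
    passage = ("Alan Turing was born in 1912. "
               "The Eiffel Tower is located in Paris, France.")
    print(extract_answer("When was Alan Turing born?", passage))
    print(extract_answer("Where is the Eiffel Tower located?", passage))
```

No tagging or parsing is involved here, which is why the approach is cheap, but a question that does not fit one of the listed shapes simply receives no answer.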
IJSER © 2016
http://www.ijser.org
International Journal of Scientific & Engineering Research Volume 7, Issue 12, December-2016
1.2. QAS COMPONENTS

The questions that the system receives can be divided into two major categories: FACTUAL and EXPERT. Factual questions are those which contain words like what, where, when, who, etc. Expert questions are those which contain words like how, why, etc.

The Document Processing Module

This module takes the user's choice of a particular passage from the displayed list. It then tags the tokens of the passage using a POS tagger and, with the help of the tags, finds the verbs in the passage. Using a list of irregular verbs and the rules for regular verbs, it creates a data structure (an array) that contains the verbs along with their tenses and -ing forms.

The Question Processing Module

This module takes a question from the user, tokenizes it using StringTokenizer, stores the tokens in another data structure, and returns it for further use in the program.

The Question-Answering Module

This module first finds the verb in the question and matches it against the tokens created in the document processing stage. According to the case selected for the type of factual question (what, when, etc.), it then tries to extract and formulate the answer.

The user is first asked to select a passage and then the type of question. The Question Processing module processes the question and passes it to the Question-Answering module, which makes use of the extractions received from the Document Processing phase, along with the processed documents containing the tagged form of the original input document. After applying the required algorithms, this module passes its result to the Formulation module to obtain the desired answer.

Fig. 1 Architecture of QAS (labels surviving from the figure: Question, Processing, Document, Answer)

2. LITERATURE SURVEY

QA systems, as explained before, have a backbone consisting of three main parts: question categorization, information retrieval, and answer extraction. Each of these three components has therefore attracted the attention of QA researchers.

Question Classification

Question   Question class                                    Answer Type
WHAT       basic-what / what-who / what-when / what-where    Money / No. / Definition / Title / NNP / Undefined
WHO                                                          Person
HOW        basic-how                                         Manner
           how-many                                          Number
           how-long                                          Time / Distance
           how-much                                          Money / Price
           how-much                                          Undefined
           how-far                                           Distance
           how-tall                                          Number
           how-rich                                          Undefined
           how-large
WHERE                                                        Location
WHEN                                                         Date
WHICH      which-who                                         Person
           which-where                                       Location
           which-when                                        Date
           which-what                                        NNP
NAME       name-who                                          Person / ORG.
           name-where                                        Location
           name-what                                         Title / NNP
WHY                                                          Reason
WHOM                                                         Person
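The mapping from question word to expected answer type listed above can be sketched as a simple lookup. The Python sketch below is only an illustration under stated assumptions: it covers the rows whose pairing is unambiguous in the table (WHO, WHOM, WHERE, WHEN, WHY and five of the HOW classes), it reuses the FACTUAL/EXPERT word lists from the QAS COMPONENTS section, and the names tokenize and classify are hypothetical. The regular-expression tokenizer merely stands in for the StringTokenizer step mentioned in the paper.

```python
import re

# Word lists from the FACTUAL / EXPERT distinction in the paper.
FACTUAL_WORDS = {"what", "where", "when", "who"}
EXPERT_WORDS = {"how", "why"}

# Subset of the classification table (assumption: only unambiguous rows).
SIMPLE_TYPES = {
    "who": "Person",
    "whom": "Person",
    "where": "Location",
    "when": "Date",
    "why": "Reason",
}
HOW_TYPES = {
    "many": "Number",
    "long": "Time/Distance",
    "much": "Money/Price",
    "far": "Distance",
}


def tokenize(question):
    """Lower-case the question and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", question.lower())


def classify(question):
    """Return (category, question_class, answer_type) for a question."""
    tokens = tokenize(question)
    if not tokens:
        return ("unknown", "unknown", "Undefined")
    first = tokens[0]
    if first in FACTUAL_WORDS:
        category = "factual"
    elif first in EXPERT_WORDS:
        category = "expert"
    else:
        category = "unknown"
    if first == "how":
        # The word after "how" selects the sub-class, e.g. how-many.
        modifier = tokens[1] if len(tokens) > 1 else ""
        if modifier in HOW_TYPES:
            return (category, "how-" + modifier, HOW_TYPES[modifier])
        return (category, "basic-how", "Manner")
    if first in SIMPLE_TYPES:
        return (category, first, SIMPLE_TYPES[first])
    # Question words without a single answer type in this subset.
    return (category, first, "Undefined")


if __name__ == "__main__":
    for q in ("Who wrote Hamlet?", "How many moons does Mars have?"):
        print(q, classify(q))
```

A lookup of this kind only inspects the first one or two tokens; the WHAT, WHICH and NAME classes in the table need the following noun as well and are left out of the sketch.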
Question taxonomies are either flat or hierarchical. Flat taxonomies have only one level of classes without sub-classes, whereas hierarchical taxonomies have multi-level classes. Lehnert [2] proposed "QUALM".

A QA system used a flat taxonomy with seventeen classes: PERSON, PLACE, DATE, NUMBER, DEFINITION, ORGANIZATION, DESCRIPTION, ABBREVIATION, KNOWNFOR, RATE, LENGTH, MONEY, REASON, DURATION, PURPOSE, NOMINAL, OTHER.

Zhang and Lee [3] compared various machine learning classifiers on a previously proposed hierarchical taxonomy, including Support Vector Machines (SVM), Nearest Neighbors (NN), Naïve Bayes (NB) and Decision Trees (DT).

Information Retrieval

Work on this component evaluates the use of named entities and of noun, verb, and prepositional phrases as exact-match phrases in a document retrieval query. Gaizauskas and Humphreys [4] described an approach to question answering.

Answer Extraction

Answers can be found by exploiting surface text information using manually constructed surface patterns. To improve the poor recall of hand-crafted patterns, many researchers, such as Xu et al. [5], acquired text patterns automatically.

Bing is Microsoft's answer to Google and was launched in 2009. Bing is the default search engine in Microsoft's web browser. It is available in 40 languages and provides different services including image, web and video search along with maps.

Stoyanchev et al. (StoQA), 2008: In their research, they presented a document retrieval experiment on a question answering system. They used exact phrases as constituents of search queries. The phrases were extracted with the aid of named-entity (NE) recognition, stop-word lists, and part-of-speech taggers.

Wolfram|Alpha is a computational knowledge engine that introduces a fundamentally new way to get knowledge and answers, not by searching Web sites, but by dynamic computations based on a vast collection of built-in data, algorithms, and methods [7]. Wolfram|Alpha returns an answer in the form of a table in which the information relevant to a query is separated into categories (e.g. an answer to a query about a person usually contains categories such as image, timeline, notable facts, familial relationships and others).

Answerbag is a question answering website where users can get answers to their questions, whether they are looking for facts, opinions or simply entertainment. Questions are answered by Answerbag's professional researchers and community members [8], so Answerbag can also be considered an expert community question answering website.
by traditional keyword searching. The current Ask.com still supports this, with added support for math, dictionary, and conversion questions. This system tries to "understand" any user's query and gives three forms of answer at once: a direct answer, a list of links to webpages on related topics, and a list of similar questions with answers from other question answering websites.

Table 1

QA System       Domain   Description                                                                 Year
ELIZA           Closed   Attempt to mimic basic human interaction Q&A exchanges.                     1964
EVI             Open     Specializes in knowledge base & semantic search                             2007
Quora           Open     Knowledge-based answering ability                                           2009
Bing            Open     Provides different services like image, web and video search                2009
Stoyanchev      Closed   Extracts phrases                                                            2008
Wolfram|Alpha   Closed   Computational search engine                                                 2009
Answerbag       Open     Web-based QA system                                                         2003
Blurtit         Open     People ask queries; regular users provide answers based on their opinion.   2006
Kangavari       Closed   Depends on previously asked questions.                                      2008
ASK.com         Open     Web-based QA                                                                1996

and analyzing information; can naturally be produced as QA queries.
Can be used to make online lectures more like the old-school type, by allowing a lecture to proceed only when questions related to the previous lecture are answered correctly.

CONCLUSION

This paper describes a Question Answering System for the English language, i.e. one that receives a query from the user and selects the most appropriate answer. A QAS is an approach to finding the correct answer to a question asked by the user. This paper also describes different QAS approaches and different types of QAS. QA systems help in improving interaction with the system. In this paper we also concentrated on finding solutions to some problems: the answer is restricted to a precise domain, the user has to follow a particular path while entering a question, and the correct answer must be extracted. The solution consists of a semantic representation for natural language, effective logic to be performed on it, and the development of a formalism to represent answer verification and specific answer extraction. Thus there is great potential for exploring the challenges in the QA domain.

ACKNOWLEDGMENT

We sincerely thank Prof. Poonam Tanwar, Computer Department, Lingaya's University, Faridabad, for her valuable guidance and motivation for this work.
http://www.answerbag.com/about-us/