Recent Trends in Deep Learning Based Open-Domain Textual Question Answering Systems
June 2, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.2988903
ABSTRACT Open-domain textual question answering (QA), which aims to answer questions from large data sources like Wikipedia or the web, has gained wide attention in recent years. Recent advancements in open-domain textual QA are mainly due to significant developments in deep learning techniques, especially machine reading comprehension and neural-network-based information retrieval, which allow the models to continuously refresh state-of-the-art performance. However, a comprehensive review of existing approaches and recent trends is lacking in this field. To address this issue, we present a thorough survey that explicitly delimits the task scope of open-domain textual QA, overviews recent key advancements in deep learning based open-domain textual QA, illustrates the models and acceleration methods in detail, and introduces open-domain textual QA datasets and evaluation metrics. Finally, we summarize the models, discuss the limitations of existing works, and point out potential future research directions.
INDEX TERMS Open-domain textual question answering, deep learning, machine reading comprehension,
information retrieval.
I. INTRODUCTION
A. BACKGROUND
Question answering (QA) systems have long drawn attention from both academia and industry [1]–[3], and the concept of a QA system can be traced back to the emergence of artificial intelligence, namely the famous Turing test [4]. Technologies related to QA have been constantly evolving over almost the last 60 years in the field of Natural Language Processing (NLP) [5]. Early works on QA mainly relied on manually-designed syntactic rules to answer simple questions due to constrained computing resources [6], such as Baseball in 1961, Lunar in 1977, Janus in 1989, and so on [5]. Around 2000, several conferences such as TREC QA [1] and QA@CLEF [7] greatly promoted the development of QA, and a large number of systems that utilize information retrieval (IR) techniques were proposed at that time. Then around 2007, with the development of knowledge bases (KBs) such as Freebase [8] and DBpedia [9], and especially with the emergence of the open-domain datasets WebQuestions [10] and SimpleQuestions [11], KBQA technologies evolved quickly. In 2011, IBM Watson [12] won the Jeopardy! game show, which received a great deal of attention. Recently, due to the release of several large-scale benchmark datasets [13]–[15] and the fast development of deep learning techniques, large advancements have been made in the QA field. In particular, recent years have witnessed a research renaissance in deep learning based open-domain textual QA, an important QA branch that focuses on answering questions from large knowledge sources like Wikipedia and the web.

B. MOTIVATION
Despite the flourishing research on open-domain textual QA, there remains a lack of a comprehensive survey that summarizes existing approaches and datasets as well as systematically analyzes the trends behind these successes. Although several surveys [16]–[19] discuss the broad
picture of QA, none of them focuses on the specific deep learning based open-domain textual QA branch. Moreover, there are several surveys [20]–[23] that illustrate recent advancements in machine reading comprehension (MRC) by introducing several classic neural MRC models. However, they only report approaches in closed-domain single-paragraph settings and fail to present the latest achievements in open-domain scenarios. We therefore write this paper to summarize the recent literature on deep learning based open-domain textual QA for the researchers, practitioners, and educators who are interested in this area.

TABLE 1. Question-answer pairs with sample excerpts from TriviaQA [14], which requires reasoning from multiple paragraphs.
C. TASK SCOPE
In this paper, we conduct a thorough literature review on recent progress in open-domain textual QA. To achieve this goal, we first categorize previous works based on the five characteristics described below, then give an exact definition of open-domain textual QA that explicitly constrains its scope.
1) Source: In terms of data sources, QA systems can be classified into structured, semi-structured, and unstructured categories. On the one hand, structured data are mainly organized in the form of a knowledge graph (KG) [9], [24], [25], while semi-structured data are usually viewed as lists or tables [26]–[28]. On the other hand, unstructured data are typically plain text composed of natural language.
2) Question: The question type is defined as a certain semantic category characterized by some common properties. The major types include factoid, list, definition, hypothetical, causal, relationship, procedural, and confirmation questions [17]. Typically, a factoid question is a question that starts with a Wh-interrogative word (What, When, Where, etc.) and requires an answer expressed as a fact in the text [17]. The form of a question can be a full question [14], a keyword/phrase [15], or an (item, property, answer) triple [29].
3) Answer: Based on how the answer is produced, QA systems can be roughly classified into extractive-based QA and generative-based QA. Extractive-based QA selects a span of text [13], [15], [30], a word [31], [32], or an entity [10], [11] as the answer. Generative-based QA may rewrite the answer if it does not (i) include proper grammar to make it a full sentence, (ii) make sense without the context of either the query or the passage, or (iii) have a high overlap with exact portions of the context [33], [34].
4) Domain: A closed-domain QA system deals with questions under a specific field [35], [36] (e.g., law, education, and medicine), and can exploit domain-specific knowledge frequently formalized in ontologies. Besides, closed-domain QA usually refers to a situation where only a limited type of question is asked and a small amount of context is provided. An open-domain QA system, on the other hand, deals with questions from a broad range of domains, and relies only on general text and knowledge bases. Moreover, such systems are usually required to find answers from large open-domain knowledge sources (e.g., Wikipedia, the web), instead of a given document [37], [38].
5) Methodology: As for the involved methodologies, QA systems can be categorized into IR based [39]–[41], NLP based [31], and KB based [42] approaches [5]. IR based models mainly return the final answer as a text snippet that is most relevant to the question. NLP based models aim to extract candidate answer strings from the context document and re-rank them by semantic matching. KBQA systems build a semantic representation of the query and transform it into a full predicate calculus statement for the knowledge graph.

Following the above categories, open-domain textual QA can be defined as: (1) unstructured textual data sources, (2) factoid questions or keywords/phrases as inputs, (3) extractive-based answers, (4) open-domain, and (5) NLP based technologies with auxiliary IR technologies. Table 1 shows an example of deep learning based open-domain textual QA.

D. CONTRIBUTIONS
The purpose of this survey is to review the recent research progress of open-domain textual QA based on deep learning. It provides the reader with a panoramic view that allows the reader to establish a general understanding of open-domain textual QA and to know how to build a QA model with deep learning techniques. In conclusion, the main contributions of this survey are as follows: (1) we conducted a systematic review of open-domain textual QA systems based on deep learning techniques; (2) we introduced the recent models, discussed the pros and cons of each method, summarized the methods used in each component of the models, and compared the models' performance on each dataset; (3) we discussed the current challenges and problems to be
solved, and explored new trends and future directions in the research on open-domain textual QA systems based on deep learning.

E. ORGANIZATION
After making the definition clear, we further give an overview of open-domain textual QA systems, including presenting a brief history, explaining the motivation for using deep learning techniques, and introducing a general open-domain textual QA architecture (Section II). Next, we illustrate several key components of open-domain textual QA including the ranking module, answer extraction, and answer selection, and summarize recent trends in acceleration techniques as well as public datasets and metrics (Section III). Last, we conclude the work with discussions on the limitations of existing works and some future research directions (Section IV).

II. OVERVIEW OF OPEN-DOMAIN TEXTUAL QA SYSTEMS
Before we dive into the details of this survey, we start with an introduction to the history of open-domain textual QA systems, the reasons why deep learning based methods emerged, and the technical architecture of deep learning based open-domain textual QA.

A. HISTORY OF OPEN-DOMAIN TEXTUAL QA
In 1993, START became the first knowledge-based question-answering system on the Web [43], and it has since answered millions of questions from Web users all over the world. In 1999, the 8th TREC competition [44] began to run the QA track. In the following year, at the 38th ACL conference, a special discussion topic ''Open-domain Question Answering'' was opened up. Since then, open-domain QA systems have become a hot topic in the research community. With the development of structured KBs like Freebase [8], many works have proposed to construct QA systems with KBs, driven by datasets such as WebQuestions [10] and SimpleQuestions [11]. These approaches usually achieve high precision and nearly solve the task on simple questions [45], but their scope is limited to the ontology of the KBs. There are also some pipelined QA approaches that use a large number of data resources, including unstructured text collections and structured KBs. The landmark approaches are ASKMSR [3], DEEPQA [12], and YODAQA [2]. A landmark event in this field is the success of IBM Watson [12], which won the Jeopardy! game show in 2011. This complicated system adopted a hybrid scheme including technologies brought from IR, NLP, and KBs. In recent years, with the development of deep learning, NLP based QA systems have emerged, which can directly carry out end-to-end processing of unstructured text sequences at the semantic level through neural network models [46]. Specifically, DrQA [37] was the first neural-network-based model for the task of open-domain textual QA. Based on this framework, some end-to-end textual QA models have been proposed, such as R3 [47], DS-QA [48], DocumentQA [49], and RE3QA [38].

B. WHY DEEP LEARNING FOR OPEN-DOMAIN TEXTUAL QA
It is beneficial to understand the motivation behind these approaches for open-domain textual QA. Specifically, why do we need to use deep learning techniques to build open-domain textual QA systems? What are the advantages of neural-network-based architectures? In this section, we would like to answer the above questions to show the strengths of deep learning based QA models, which are listed as below:
1) Automatically learn complex representations: Using neural networks to learn representations has two advantages: (1) it reduces the effort spent on hand-crafted feature design. Feature engineering is labor-intensive work, while deep learning enables automatic feature learning from raw data in unsupervised or supervised ways [50]. (2) Contrary to linear models, neural networks are capable of modeling the non-linearity in data with activation functions such as ReLU, Sigmoid, Tanh, etc. This property makes it possible to capture complex and intricate interaction patterns in the data [50].
2) End-to-end processing: Many early years' QA systems heavily relied on question and answer templates, which were mostly manually constructed and time-consuming. Later, most QA research adopted a pipeline of conventional linguistically-based NLP techniques, such as semantic parsing, part-of-speech tagging, and coreference resolution, which could cause error propagation through the entire process. On the other hand, neural networks have the advantage that multiple building blocks can be composed into a single (gigantic) differentiable function and trained end-to-end. Besides, models of different stages can share learned representations and benefit from multi-task learning [51].
3) Data-driven paradigm: Deep learning is essentially a science based on statistics; one intrinsic property of deep learning is that it follows a data-driven paradigm. That is, neural networks can learn statistical distributions of features from massive data, and the performance of a model can be constantly improved as more data are used [52]. This is important for open-domain textual QA as it usually involves a wide range of domains and large text corpora.

C. DEEP LEARNING BASED TECHNICAL ARCHITECTURE OF OPEN-DOMAIN TEXTUAL QA SYSTEMS
As shown in Table 1, given a question, the QA system needs to retrieve several relevant documents, read and gather information across multiple text snippets, then extract the answer from the raw text. Notably, not all given paragraphs contain the correct answer, and the exact location of the ground-truth answer is unknown. Such a setting is usually referred to as distant supervision, which brings difficulties in designing supervised training signals. In summary, open-domain textual QA poses great challenges as it requires to: 1) filter out irrelevant noise context, 2) reason across
multiple evidence snippets, and 3) train with distantly-supervised objectives.

FIGURE 1. The technical architecture of deep learning based open-domain textual QA systems. The paragraph index&ranking module first retrieves several related documents and then selects a few top-ranked paragraphs relevant to the question, from which the extractive reading comprehension module extracts multiple candidate answers. Finally, the system picks the most promising prediction as the answer. Besides, to boost the processing speed while ensuring accuracy, several acceleration techniques are adopted.
In recent years, with the rapid development of deep learning technologies, significant technical advancements have been made in the field of open-domain textual QA. Specifically, Chen et al. proposed the DrQA system [37], which splits the task into two subtasks: paragraph retrieval and answer extraction. The paragraph retrieval module selects and ranks the candidate paragraphs according to the relevance between paragraph and question, while the answer extraction module predicts the start and end positions of candidate answers in the context. Later, Clark and Gardner [49] proposed a shared-normalization mechanism to deal with the distant-supervision problem in open-domain textual QA. Wang et al. [47] adopted reinforcement learning to jointly train the ranker and the answer-extraction reader. Based on this work, Wang et al. [53] further proposed evidence aggregation for answer re-ranking. Recently, Hu et al. [38] presented an end-to-end open-domain textual QA architecture to jointly perform context retrieval, reading comprehension, and answer re-ranking.

To summarize these works, we propose a general technical architecture of open-domain textual QA systems in Fig. 1. The architecture mainly consists of three modules: paragraph index&ranking, candidate answer extraction, and final answer selection. Specifically, the paragraph index&ranking module first retrieves the top-k paragraphs related to the question. Then these paragraphs are sent into the answer extraction module to locate multiple candidate answers. Finally, the answer selection module predicts the final answer. Moreover, in order to improve the efficiency of QA systems, some acceleration techniques, such as jump reading [54] and skim reading [55], can be applied in the system.
III. MODELS AND HOT TOPICS
In this section, we illustrate the individual components of the generalized open-domain textual QA system described in Fig. 1. Specifically, we introduce: (i) the paragraph index&ranking module in subsection III-A, (ii) the candidate answer extraction module in subsection III-B, (iii) the final answer selection module in subsection III-C, and (iv) the acceleration techniques in subsection III-D. Finally, we give a brief introduction of recent open-domain textual QA datasets in subsection III-E, as well as experimental evaluation and model performance in subsection III-F.

A. PARAGRAPH INDEX AND RANKING
The first step of open-domain textual QA is to retrieve several top-ranked paragraphs that are relevant to the question. There are two sub-stages here: retrieving documents through indexing, and ranking the context fragments (paragraphs) in these documents. The paragraph-index module builds a light-weight index for the original documents. During processing, the index dictionary is loaded into memory, while the original documents are stored in file systems. This method can effectively reduce memory overhead, as well as accelerate the retrieval process. The paragraph-ranking module analyzes the relevance between the query and paragraphs and selects top-ranked paragraphs to feed into the reading comprehension module. In recent years, along with the development of information retrieval and NLP, a large number of new technologies for indexing and ranking have been proposed. Here we mainly focus on the deep learning based approaches.
1) PARAGRAPH INDEX
Paragraph indices can be classified into query-dependent indices and query-independent indices. The query-dependent index mainly includes the dependence model and pseudo relevance feedback (PRF) [56], [57], which considers approximation between query and document terms. However, due to the index's dependence on queries, the corresponding ranking models are difficult to scale and generalize. The query-independent index mainly includes TF-IDF, BM25, and language modeling [56], [57], which rely on relatively simple index features with low computational complexity for matching. IBM Watson adopted a search method that combines the query-dependent similarity score with the query-independent score to determine the overall search score for each passage [58]. Although those index features are relatively efficient and scalable in processing, they are mainly based on terms without contextual semantic information.

Recently, several deep learning based methods have been proposed. These approaches usually embed terms or phrases into dense vectors and use them as indices. Kato et al. [59] constructed a demo to compare the efficiency and effectiveness of LSTM and BM25. Seo et al. proposed Phrase-indexed Question Answering (PIQA) [60], which employed bi-directional LSTMs and a self-attention mechanism to obtain the representation vectors for both query and paragraph. Lee et al. leveraged a BERT encoder [61] to pre-train the retrieval module [62]; unlike previous works that retrieve candidate paragraphs, the evidence passage retrieved from Wikipedia was treated as a latent variable.
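As a concrete reference point for the query-independent features above, the sketch below ranks paragraphs by TF-IDF cosine similarity with scikit-learn; the toy corpus is ours, and the sketch illustrates the classical sparse baseline that the dense, learned indices (PIQA, BERT-based retrievers) aim to improve on.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

paragraphs = [
    "IBM Watson won the Jeopardy! game show in 2011.",
    "BM25 is a classical query-independent ranking function.",
    "GloVe embeds words using a co-occurrence matrix.",
]
question = ["Which system won Jeopardy! in 2011?"]

# Build a term-based (query-independent) index over the paragraphs.
vectorizer = TfidfVectorizer()
index = vectorizer.fit_transform(paragraphs)        # sparse TF-IDF matrix
scores = cosine_similarity(vectorizer.transform(question), index)[0]

# Retrieve the top-ranked paragraphs for the question.
for rank, i in enumerate(scores.argsort()[::-1], 1):
    print(rank, round(float(scores[i]), 3), paragraphs[i])
```

A dense index replaces the sparse term rows above with learned phrase or passage vectors, searched by approximate nearest-neighbor tools such as aLSH [73] or Faiss [74].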
2) PARAGRAPH RANKING
Traditional ranking technologies are based on manually-designed features [63], but in recent years, learning to rank (L2R) approaches have become a hot spot. L2R refers to ranking methods based on supervised learning; it can be classified into Pointwise, Pairwise, and Listwise [64]. Pointwise (e.g., McRank [65], Prank [66]) converts the document into feature vectors, then gives out relevance scores according to the classification or regression function learned from the training data, from which the ranking results are derived. Pointwise focuses on the relevance between the query and documents, ignoring the information interaction inside the documents. Hence, Pairwise (e.g., RankNet [67], FRank [68]) estimates whether the order of document pairs is reasonable. However, the number of relevant documents varies greatly between queries, so the generalization ability of Pairwise is difficult to estimate. Unlike the above two methods, Listwise (e.g., LambdaRank [69], SoftRank [70]) trains the optimization scoring function with a list of all search results for each query as a training sample. Since the aim of paragraph ranking is to filter out irrelevant paragraphs, Pointwise seems to be adequate in most cases. However, the scores between queries and paragraphs can also be helpful for predictions on the final answer, as we discuss in subsection III-C. Consequently, Listwise ranking methods are also important to the open-domain textual QA task.

Moreover, paragraph ranking models trained with deep neural networks mainly fall into four categories [56]: (i) learning the ranking model through manual features, and only using the neural network to match the query and document; (ii) estimating relevance based on the query-document exact matching pattern; (iii) learning the embedded representations of queries and documents, and evaluating them by a simple function, such as cosine similarity or dot-product; (iv) conducting query expansion with neural network embeddings, and calculating the query expectation.

Similar to (ii), Wang et al. [47] proposed the Reinforced Ranker-Reader (R3) model, which is also a kind of Pointwise method. It consisted of: (1) a Ranker to select the paragraph most relevant to the query, and (2) a Reader to extract the answer from the paragraph selected by the Ranker. The deep learning based Ranker model was trained using reinforcement learning, where the accuracy of the answer extracted by the Reader determined the reward. Both the Ranker and Reader leveraged the Match-LSTM [71] model to match the query and passages. Similar to (iii), Tan et al. [72] studied several representation learning models and found that an attentive LSTM can be very effective in the Pairwise training mode. And PIQA [60] employed similarity clustering to retrieve the indexed phrase vector nearest to the query vector by asymmetric locality-sensitive hashing (aLSH) [73] or Faiss [74].

There are also combinations of the above categories. Htut et al. [75] combined (i) and (iii), taking the embedded representations to train the ranking model, and proposed two kinds of ranking models: the InferSent ranker and the Relation-Networks ranker. The rankers leveraged the Listwise ranking method and were trained by minimizing the margin ranking loss, so as to obtain the optimal score:
$$\sum_{i=1}^{k} \max\left(0,\; 1 - f(q, p_{pos}) + f(q, p_{neg}^{i})\right) \tag{1}$$
Here $f$ is the scoring function, $p_{pos}$ is a paragraph that contains the ground-truth answer, $p_{neg}$ is a negative paragraph that does not contain the ground-truth answer, and $k$ is the number of negative samples. The InferSent ranker leveraged sentence-embedded representations [76] and evaluated the semantic similarity in ranking for QA, employing a feed-forward neural network as the scoring function:
$$x_{classifier} = \left[\,q;\; p;\; q - p;\; q \odot p\,\right] \tag{2}$$
$$z = W^{(1)} x_{classifier} + b^{(1)} \tag{3}$$
$$score = W^{(2)}\,\mathrm{ReLU}(z) + b^{(2)} \tag{4}$$
The Relation-Networks ranker focused on measuring the relevance between words in the question and words in the paragraph, where the word pairs were the inputs of the Relation-Network, which is formulated as follows:
$$RN(q, p) = f_{\phi}\left(\sum_{i,j} g_{\theta}\left([E(q_i);\, E(p_j)]\right)\right) \tag{5}$$
Here $E(\cdot)$ is a 300-dimensional GloVe embedding [77], and $f_{\phi}$ and $g_{\theta}$ are 3-layer feed-forward neural networks with ReLU activation functions. The experimental results showed that the performance of the QA part [75] even exceeded that of the reinforcing feedback ranking model [47].
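A minimal PyTorch sketch of the InferSent-style scorer and margin ranking loss of Eqs. (1)–(4); the sentence encodings q and p are assumed to be precomputed, and the layer sizes are illustrative rather than those of [75].

```python
import torch
import torch.nn as nn

class InferSentRanker(nn.Module):
    """Feed-forward scorer over [q; p; q - p; q * p], as in Eqs. (2)-(4)."""
    def __init__(self, dim: int = 128, hidden: int = 256):
        super().__init__()
        self.w1 = nn.Linear(4 * dim, hidden)   # W^(1), b^(1)
        self.w2 = nn.Linear(hidden, 1)         # W^(2), b^(2)

    def forward(self, q: torch.Tensor, p: torch.Tensor) -> torch.Tensor:
        x = torch.cat([q, p, q - p, q * p], dim=-1)   # x_classifier
        return self.w2(torch.relu(self.w1(x))).squeeze(-1)

def margin_ranking_loss(f_pos, f_neg, margin: float = 1.0):
    """Eq. (1): sum over k negatives of max(0, 1 - f(q,p_pos) + f(q,p_neg^i))."""
    return torch.clamp(margin - f_pos.unsqueeze(-1) + f_neg, min=0).sum(-1).mean()

ranker = InferSentRanker()
q = torch.randn(8, 128)            # batch of question encodings
p_pos = torch.randn(8, 128)        # positive paragraph encodings
p_neg = torch.randn(8, 5, 128)     # k = 5 negative paragraphs per question
f_pos = ranker(q, p_pos)
f_neg = ranker(q.unsqueeze(1).expand_as(p_neg), p_neg)
margin_ranking_loss(f_pos, f_neg).backward()
```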
FIGURE 2. Differences between BERT, GPT, and ELMo. BERT uses a bi-directional Transformer. OpenAI GPT uses a left-to-right Transformer. ELMo uses the concatenation of independently trained left-to-right and right-to-left LSTMs to generate features for downstream tasks. (Figure source: Devlin et al. [61])

B. CANDIDATE ANSWER EXTRACTION
With the candidate paragraphs filtered by the index&ranking module, QA systems can locate candidate answers (the start and end positions of answer spans in the document or paragraph) through the reading comprehension model. With the release of datasets and test standards [13]–[15], [30], many works have been proposed in the past three years, attracting great attention from academia and industry. In this subsection, we illustrate the reading extraction model from three hierarchies: (i) word embeddings and pre-training models for feature encoding in subsection III-B1, (ii) interaction of questions and paragraphs using attention mechanisms in subsection III-B2, and (iii) feature aggregation for predicting the candidate answers in subsection III-B3.

1) FEATURE ENCODING LAYER
In this layer, the original text tokens are transformed into vectors that can be computed by the deep neural networks, through word embeddings or manual features. Word embeddings can be obtained through a dictionary or by fine-tuning pre-trained language models, while manual textual features are usually implemented by part-of-speech (POS) tagging and named entity recognition (NER). Manual features can be constructed by tools such as CoreNLP [78], AllenNLP [79], and NLTK [80]. Generally, the features mentioned above will be fused with the embedding vectors.

Embedding vectors can be constructed by pre-trained language models. GloVe [77] transfers word-level information to word vectors through the co-occurrence matrix, but cannot distinguish polysemous words. ELMo [81] leveraged a deep bi-directional language model, concatenated from two unidirectional language models, to yield word embeddings that can vary across different context sentences. OpenAI GPT [82] used the left-to-right Transformer decoder [83], whereas BERT [61] used the bi-directional Transformer encoder [83] for pre-training; both of them adapt to downstream tasks through fine-tuning. Fig. 2 shows the differences between ELMo, GPT, and BERT. Specifically, the pre-trained BERT model has been proven to be a powerful context-dependent representation and has made significant improvements on open-domain textual QA tasks; some works based on BERT, such as RE3QA [38], ORQA [62], and DFGN [84], have achieved state-of-the-art results.
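For the contextual encoders above, a few lines with the HuggingFace transformers library yield the token-level features; a minimal sketch, where the checkpoint name and the joint question-paragraph packing are our choices rather than something prescribed by the surveyed models:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

question = "Who won the Jeopardy! game show in 2011?"
paragraph = "IBM Watson won the Jeopardy! game show in 2011."

# BERT encodes the question/paragraph pair jointly, so every token
# vector is already conditioned on both sequences.
inputs = tokenizer(question, paragraph, return_tensors="pt")
with torch.no_grad():
    outputs = encoder(**inputs)

token_features = outputs.last_hidden_state   # (1, seq_len, 768)
print(token_features.shape)
```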
2) INTERACTIVE ATTENTION LAYER
The interactive attention layer constructs representations on top of the original features of the question or paragraph by using attention mechanisms. It can be mainly divided into two types:
(i) Interactive alignment between the question and paragraph, namely co-attention, which allows the model to focus on the most relevant question features with respect to paragraph words, and breaks through the limited encoding-extraction ability of a single model. Wang and Jiang [71] leveraged a textual entailment model, Match-LSTM [85], to construct the attention processing. Xiong et al. [86] used a co-attention encoder to build co-dependent representations of the question and the document, and a dynamic pointer decoder to predict the answer span. Seo et al. proposed a six-layer model, BiDAF [87], along with a memory-less attention mechanism to yield representations of the context paragraph at the character level, word level, and contextual level. Gong and Bowman [88] added a multi-hop attention mechanism to BiDAF to solve the problem that a single-pass model cannot reflect on what it has already read.
(ii) Self alignment inside the paragraph to generate self-aware features, namely self-attention, which allows non-adjacent words in the same paragraph to attend to each other, thus alleviating the long-term dependency problem. For example, Wang et al. [89] proposed a self-attention mechanism to refine the question-aware passage representation by matching the passage against itself.

We can find two trends in recent works: (1) the combination of co-attention and self-attention; e.g., DCN+ [90] improved DCN by extending the deep residual co-attention encoder with self-attention, and Yu et al. leveraged the combination of convolutions and self-attention in the embedding and modeling encoders, with a context-query attention layer after the embedding encoder layer [91]. (2) The fusion of features at different levels; e.g., Huang et al. adopted a three-layer fully-aware-attention mechanism to further enhance the feature representation ability of the models [92]. Wang et al. combined the co-attention and self-attention mechanisms, and applied a fusion function to incorporate different levels of features [93]. Hu et al. proposed a re-attending mechanism inside a multi-layer attention architecture, where prior co-attention and self-attention were both considered to fine-tune the current attention [94].
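The core of the co-attention mechanisms above is a question-paragraph similarity matrix that is normalized in both directions. Below is a minimal, BiDAF-flavored sketch; it uses plain dot-product similarity instead of the trilinear function of [87], and the dimensions are illustrative.

```python
import torch
import torch.nn.functional as F

def co_attention(P: torch.Tensor, Q: torch.Tensor) -> torch.Tensor:
    """P: (batch, m, d) paragraph states; Q: (batch, n, d) question states.
    Returns question-aware paragraph representations."""
    S = torch.bmm(P, Q.transpose(1, 2))          # (batch, m, n) similarities
    # Paragraph-to-question: each paragraph word attends over question words.
    p2q = torch.bmm(F.softmax(S, dim=-1), Q)     # (batch, m, d)
    # Question-to-paragraph: attend over paragraph words via max similarity.
    q2p = torch.bmm(F.softmax(S.max(dim=-1).values, dim=-1).unsqueeze(1), P)
    q2p = q2p.expand_as(P)                       # broadcast to every position
    # Fuse, following BiDAF's G = [P; p2q; P * p2q; P * q2p].
    return torch.cat([P, p2q, P * p2q, P * q2p], dim=-1)  # (batch, m, 4d)

G = co_attention(torch.randn(2, 50, 64), torch.randn(2, 12, 64))
print(G.shape)  # torch.Size([2, 50, 256])
```

A self-attention variant simply sets Q = P (masking each position against itself), which is what lets non-adjacent paragraph words exchange information.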
3) AGGREGATION PREDICTION LAYER
In this layer, aggregation vectors are generated to predict candidate answers. We mainly focus on the following parts.
• Aggregation strategies. Aggregation strategies vary across network frameworks. BiDAF [87] and Multi-Perspective Matching [95] leveraged Bi-LSTMs for semantic information aggregation. FastQAExt [96] adopted two feed-forward neural networks to generate the probability distributions of the start and end positions of the answers, then used beam search to determine the range of the answers.
• Iterative prediction strategies. DCN [86] consisted of a co-attentive encoder and a dynamic pointing decoder, which adopted a multi-round iteration mechanism. In each round of iteration, the decoder estimated the start and end of the answer span. Based on the prediction of the previous iteration, an LSTM and a Highway Maxout Network were used to update the prediction of the answer span in the next iteration. ReasoNet [97] and Mnemonic Reader [94] used the memory network framework to perform iterative prediction. DCN+ [90] and Reinforced Mnemonic Reader [94] iteratively predicted the start and end positions by reinforcement learning.
• Interference discarding strategies. Discarding interference items dynamically during the prediction process can improve the accuracy and generalization of models, as in DSDR [98] and SAN [99].
• Loss function. Based on the extracted answer span, the loss function is generally defined as the sum of the negative log-probabilities of the start and end positions of the gold answers [49], which can be formulated as follows (a code sketch of both losses is given at the end of subsection III-C):
$$L = -\log \frac{e^{s_a}}{\sum_{i=1}^{n} e^{s_i}} - \log \frac{e^{g_b}}{\sum_{j=1}^{n} e^{g_j}} \tag{6}$$
Here $s_j$ and $g_j$ are the scores for the start and end bounds produced by the model for token $j$, and $a$ and $b$ are the start and end tokens. In multi-paragraph reading comprehension tasks, the reading comprehension model is employed on both negative paragraphs and positive paragraphs, so a no-answer prediction term needs to be added to the loss function as in [49], [100]:
$$L = -\log \frac{(1 - \delta)\,e^{z} + \delta\, e^{s_a} e^{g_b}}{e^{z} + \sum_{i=1}^{n} \sum_{j=1}^{n} e^{s_i} e^{g_j}} \tag{7}$$
Here $\delta$ is 1 if an answer exists and 0 otherwise, and $z$ represents the weight given to a ''no-answer'' possibility.

C. FINAL ANSWER SELECTION
Final answer selection mainly selects the final answer from multiple candidate answers using feature aggregation. Aggregation methods can be divided into the following types.
• Evidence Aggregation. Wang et al. proposed a method of candidate answer re-ranking, mainly based on two types of evidence [53]: (i) replicated evidence: a candidate answer that appears more times in different passages may have a higher probability of being the correct answer; (ii) complementary evidence: aggregating multiple passages can entail multiple aspects of the question, so as to ensure the completeness of the answer. In the inference part, Wang et al. leveraged a classical textual entailment model, Match-LSTM [71], to infer the relevance of the answer spans [53]. Moreover, Lin et al. and Zhong et al. adopted a coarse-to-fine strategy to select related paragraphs and aggregated evidence from them to predict the final answer [48], [101].
• Multi-stage Aggregation. Wang et al. divided the open-domain textual QA task into two stages [47]: candidate paragraph ranking and answer extraction, and jointly optimized the expected losses of the two stages through reinforcement learning. Wang et al. divided reading comprehension into candidate extraction and answer selection, jointly trained the two-stage process in an end-to-end model, and made improvements on the final prediction [102]. Pang et al. and Wang et al. divided the open-domain textual QA task into reading extraction and answer selection; they leveraged a beam search strategy to find the final answer with maximum probability considering both stages [103], [104]. Hu et al. proposed an end-to-end open-domain textual QA model, which contains retrieving, reading, and reranking modules [38].
• Fusion of Knowledge Bases and Text. Recently, several works have attempted to incorporate external knowledge to improve performance on a variety of tasks, such as [105] for natural language inference, [106] for the cloze-style QA task, and [107] for the multi-hop QA task. Sun et al. proposed a method to fuse multi-source information at an early stage to improve the overall QA task [108]. Weissenborn et al. proposed an architecture to dynamically integrate explicit background knowledge into Natural Language Understanding models [109].
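Returning to the loss functions of subsection III-B3, the following is a minimal PyTorch rendering of Eqs. (6) and (7): Eq. (6) is two cross-entropies over start/end scores, and Eq. (7) adds a learned ''no-answer'' score z. The tensor shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def span_loss(s, g, a, b):
    """Eq. (6): s, g are (n,) start/end scores over tokens; a, b are the
    gold start/end positions. Two independent softmax cross-entropies."""
    return -F.log_softmax(s, dim=-1)[a] - F.log_softmax(g, dim=-1)[b]

def span_loss_no_answer(s, g, z, a, b, has_answer):
    """Eq. (7): z is a scalar 'no-answer' score and delta = 1 iff an answer
    exists; the denominator sums e^{s_i} e^{g_j} over all (i, j) span pairs
    plus e^z. (A production version would use logsumexp for stability.)"""
    delta = 1.0 if has_answer else 0.0
    numer = (1 - delta) * torch.exp(z) + delta * torch.exp(s[a] + g[b])
    denom = torch.exp(z) + torch.exp(s[:, None] + g[None, :]).sum()
    return -torch.log(numer / denom)

s, g, z = torch.randn(20), torch.randn(20), torch.randn(())
print(span_loss(s, g, a=3, b=5))
print(span_loss_no_answer(s, g, z, a=3, b=5, has_answer=True))
```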
FIGURE 3. A synthetic example of the LSTM-Jump model. In this example, the maximum jump size is 5, the number of tokens read before a jump is 2, and the number of jumps allowed is 10. The green softmax is used for jump predictions. (Figure source: Yu et al. [54])
D. ACCELERATION METHODS
Although current open-domain textual QA systems have achieved significant advancements, these models become slow and cumbersome [110] with multi-layer [111] and multi-stage [53], [102] architectures along with various features [81], [87], [137]. Moreover, ensemble models are employed to further improve performance, which requires a large amount of computational resources. Open-domain textual QA systems, however, are required to be fast in paragraph index&ranking as well as accurate in answer extraction. Therefore, we would like to discuss some hot topics regarding acceleration methods in this section.

1) MODEL ACCELERATION
Because deep learning models are complex and computationally expensive, automated machine learning (AutoML) technologies have aroused widespread interest in hyperparameter optimization and neural architecture search methods [112]–[114]. However, there is little research on AutoML acceleration for open-domain textual QA systems. In order to reduce complexity while guaranteeing quality, many models have been proposed to accelerate reading processing, namely model acceleration. Hu et al. [115] proposed a knowledge distillation method, which transferred knowledge from an ensemble model to a single model with little loss in performance. In addition, it is known that LSTMs, which are widely used in open-domain textual QA systems [110], are difficult to parallelize and scale due to their sequential nature. Consequently, some researchers replace the recurrent structures [110] or the attention layer [96] with more efficient components, such as the Transformer [83] and SRU [111], or limit the range of co-attention [116].
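As one concrete instance of model acceleration, the sketch below shows the generic knowledge-distillation objective (soft teacher targets with a temperature). It illustrates the idea of transferring an ensemble's knowledge into a single model, not the exact procedure of [115]; the temperature and mixing weight are conventional defaults.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with KL divergence to the teacher's
    softened distribution. T is the temperature, alpha the mixing weight."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# E.g., start-position logits from an ensemble (teacher) and a single reader.
teacher = torch.randn(4, 100)            # averaged ensemble logits
student = torch.randn(4, 100, requires_grad=True)
gold = torch.randint(0, 100, (4,))
distillation_loss(student, teacher, gold).backward()
```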
2) ACTION ACCELERATIONS
There are some works that boost the sequence reading speed while maintaining performance, namely action acceleration. These approaches can dynamically employ actions to speed up reading, such as jumping, skipping, skimming, and early stopping. We illustrate the details from the following perspectives.
• Jump reading determines, from the current word, how many words should be skipped before the next reading. For example, Yu and Liu [54] proposed LSTM-Jump, which was built upon the basics of LSTM networks and reinforcement learning, to determine the number of tokens or sentences to jump. As shown in Fig. 3, the softmax gives out a distribution over the jumping steps between 1 and the maximum jump size. This method can greatly improve reading efficiency, but the decision action can only jump forward, which may be ineffective in complex reasoning tasks. Therefore, Yu et al. [117] proposed an approach to decide whether to skip tokens, re-read the current sentence, or stop reading and output the answer, and LSTM-shuttle [118] proposed a method to either read forward or read back to increase accuracy during speed reading.
• Skim reading determines, according to the current word, whether to skim a token rather than read it fully (see the sketch following this list). Unlike previous methods that use reinforcement learning to make action decisions, skip-rnn [119] adjusted the RNN module to determine whether each step's input is skipped, directly copying the state of the previous hidden layer. However, these earlier methods mainly target sequence reading and classification tasks, with experiments mostly on the cloze-style QA task [31]. Skim-RNN [55] then conducted comparative experiments on reading comprehension tasks. Specifically, Skim-RNN updates only the first few dimensions of the hidden state through a small RNN, trading off between the amount of computation and the discard rate. Moreover, Hansen et al. [120] proposed the first speed reading model including both jump and skip actions.
• Other speed reading applications: JUMPER [36] provided fast reading feedback for legal texts, and Johansen and Socher [121] focused on sentiment classification tasks. Choi et al. [122] tackled long document-oriented QA tasks with CNN-based sentence selection and reading. Hu et al. [38] proposed an early-stopping mechanism to efficiently terminate the encoding process of unrelated paragraphs.
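To make the skim-reading idea concrete, the sketch below updates only the first few hidden dimensions with a small RNN when a token is deemed skimmable; the externally supplied skim mask stands in for the learned discrete decision of Skim-RNN [55], and all sizes are illustrative.

```python
import torch
import torch.nn as nn

class SkimReader(nn.Module):
    """Per token, either the big RNN updates the full hidden state or a
    small RNN updates only its first `small` dimensions (a skim step)."""
    def __init__(self, dim=100, hidden=100, small=20):
        super().__init__()
        self.big = nn.GRUCell(dim, hidden)
        self.small = nn.GRUCell(dim, small)
        self.k = small

    def forward(self, tokens, skim_mask):
        h = tokens.new_zeros(tokens.size(0), self.big.hidden_size)
        for t in range(tokens.size(1)):
            x = tokens[:, t]
            full = self.big(x, h)
            # Skim: cheap update of the first k dims, rest of state unchanged.
            part = torch.cat([self.small(x, h[:, :self.k]), h[:, self.k:]], -1)
            m = skim_mask[:, t].unsqueeze(-1)    # 1 -> skim, 0 -> full read
            h = m * part + (1 - m) * full
        return h

reader = SkimReader()
x = torch.randn(2, 8, 100)                     # (batch, seq, dim)
mask = torch.randint(0, 2, (2, 8)).float()     # stand-in skim decisions
print(reader(x, mask).shape)                   # torch.Size([2, 100])
```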
E. DATASETS
In this subsection, we introduce several datasets related to open-domain textual QA. Owing to the release of these datasets, the development of open-domain textual QA has been greatly promoted.
IV. DISCUSSION
In this section, we summarize the surveyed models, discuss the challenges and limitations of existing works, and point out several promising prospective research directions, which we believe are critical to the present state of the field.

A. SUMMARY OF MODELS
We summarize current hot topics in Figure 4 and categorize the structure of some models in Table 4 according to the technologies illustrated in Section III. Some works are designed for single-document QA settings, such as BiDAF [87], QAnet [91], and SLQA [93], where the ranking stage is not needed. On the other hand, other works need to search and filter the paragraphs from multiple documents in open-domain textual QA settings. So we divide Table 4 into two parts: the upper part for the MRC models and the lower part for the open-domain textual QA models.

TABLE 4. The structure of some models. The top half of the table contains the MRC models, and the bottom half contains the open-domain textual QA models, which include the paragraph ranking stage.

As can be seen from Table 4, most works use IR methods such as TF-IDF and BM25 in the ranking stage. Recently, some works such as ORQA [62] and DFGN [84] adopt BERT [61] to select paragraphs. In the extractive reading stage, most works utilize GloVe embeddings [77], while recent models tend to use pre-trained language models such as ELMo [81] or BERT [61] for text feature encoding. As for the attention mechanism, most works adopt either co-attention or self-attention, or combine both of them to better exchange information between questions and documents. For the aggregation prediction, most works adopt RNN-based approaches (LSTM or GRU), while some recent works leverage BERT [61]. In the final answer selection, multi-stage aggregation is the main solution, while few works adopt the evidence aggregation strategy.

B. CHALLENGES AND LIMITATIONS
We first present the challenges and limitations of open-domain textual QA systems due to the use of deep learning techniques. There are several common limitations
of deep learning techniques [126], which also affect deep learning based open-domain textual QA systems.
• Interpretability. It is well known that the process of deep learning is like a black box. Due to the activation functions and backward derivation, it is hard to analyze the neural network function, which makes the final answer theoretically unpredictable.
• Data Hungry. As mentioned in subsection II-B, deep learning is data-driven, which also brings some challenges [126]. We can also see this in subsection III-E, where the total number of samples in each dataset is larger than 10k. It is very expensive to build large-scale datasets for open-domain textual QA even when annotation tools are provided. For example, the public dataset released by Google [127] consists of 307,373 training examples with single annotations, 7,830 examples with 5-way annotations for development data, and a further 7,842 examples 5-way annotated and sequestered as test data.
• Computing Resource Reliance. In addition to large-scale data, large-scale neural network models are generally employed for the complex processing in deep learning based open-domain textual QA, as mentioned in subsection III-D. However, it is very resource-consuming to train such complex models, while real-time feedback is often required by users of QA systems. In such cases, large-scale computing resources are the basic configuration for training or inference.

We then present several problems from the following three parts, corresponding to Section III.
• Index & Ranking. Recent works usually adopt interactive attention mechanisms to improve the accuracy of ranking. However, this is not beneficial for either efficiency or scalability, since each passage needs to be encoded along with individual questions. Although using BERT [61] or other self-attention pre-training models [82] to extract text features can improve the scalability, running these models over hundreds of paragraphs is computationally costly since these models usually have large sizes and consist of numerous layers. Moreover, using indexable query-agnostic phrase representations can reduce the computational cost while ensuring accuracy in reading comprehension, whereas the accuracy is still low in open-domain textual QA [128].
• Machine Reading Comprehension. Existing extractive reading technology has made great progress, and several reading comprehension models even surpass human performance. However, these MRC models are complex and lack interpretability, which makes it difficult to evaluate the performance and analyze the generalization ability of each neural module. With the improvement of performance along with the increase of model size, it is also a problem that running these models consumes a lot of energy [129]. Moreover, existing models are vulnerable to adversarial attacks [130], making it difficult to deploy them in real-world QA applications.
• Aggregation Prediction. Existing predictive reasoning usually supposes that the answer span only appears in a single paragraph, or that the answer text is short [37]. However, in the real world, the answer span usually appears in several paragraphs or even requires multi-hop inference. How to aggregate evidence across multiple mentioned text snippets to find the answer remains a great challenge.

C. RECENT TRENDS
We summarize several recent trends regarding open-domain textual QA, which are listed as follows.
1) Complex Reasoning. As the datasets get larger, reasoning becomes more complex, and open-domain textual QA systems are increasingly required to reason across multiple paragraphs.
[25] F. M. Suchanek, G. Kasneci, and G. Weikum, ‘‘Yago: A core of semantic knowledge,’’ in Proc. 16th Int. Conf. World Wide Web, New York, NY, USA, 2007, pp. 697–706.
[26] S. Sarawagi and S. Chakrabarti, ‘‘Open-domain quantity queries on Web tables: Annotation, response, and consensus models,’’ in Proc. 20th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2014, pp. 711–720.
[27] D. Downey, S. Dumais, D. Liebling, and E. Horvitz, ‘‘Understanding the relationship between searchers’ queries and information goals,’’ in Proc. 17th ACM Conf. Inf. Knowl. Mining (CIKM), 2008, pp. 449–458.
[28] P. Pasupat and P. Liang, ‘‘Compositional semantic parsing on semi-structured tables,’’ in Proc. Int. Conf. World Wide Web, Aug. 2014, pp. 1–11.
[29] J. Welbl, P. Stenetorp, and S. Riedel, ‘‘Constructing datasets for multi-hop reading comprehension across documents,’’ Trans. Assoc. Comput. Linguistics, vol. 6, pp. 287–302, Dec. 2018.
[30] B. Dhingra, K. Mazaitis, and W. W. Cohen, ‘‘Quasar: Datasets for question answering by search and reading,’’ 2017, arXiv:1707.03904. [Online]. Available: http://arxiv.org/abs/1707.03904
[31] K. M. Hermann, T. Kočiský, E. Grefenstette, L. Espeholt, W. Kay, M. Suleyman, and P. Blunsom, ‘‘Teaching machines to read and comprehend,’’ in Proc. Adv. Neural Inf. Process. Syst., Montreal, QC, Canada, Jan. 2015, pp. 1693–1701.
[32] F. Hill, A. Bordes, S. Chopra, and J. Weston, ‘‘The goldilocks principle: Reading children’s books with explicit memory representations,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2016, pp. 1–13.
[33] T. Nguyen, M. Rosenberg, X. Song, J. Gao, S. Tiwary, R. Majumder, and L. Deng, ‘‘MS MARCO: A human generated machine reading comprehension dataset,’’ in Proc. Workshop Cognit. Comput., Integrating Neural Symbolic Approaches Co-Located 30th Annu. Conf. Neural Inf. Process. Syst. (NIPS), 2016, pp. 1–10.
[34] T. Kočiský, J. Schwarz, P. Blunsom, C. Dyer, K. M. Hermann, G. Melis, and E. Grefenstette, ‘‘The NarrativeQA reading comprehension challenge,’’ Trans. Assoc. Comput. Linguistics, vol. 6, pp. 317–328, Dec. 2018.
[35] G. Balikas, A. Krithara, I. Partalas, and G. Paliouras, ‘‘BioASQ: A challenge on large-scale biomedical semantic indexing and question answering,’’ in Multimodal Retrieval in the Medical Domain. Cham, Switzerland: Springer, 2015, pp. 26–39.
[36] X. Liu, L. Mou, H. Cui, Z. Lu, and S. Song, ‘‘JUMPER: Learning when to make classification decisions in reading,’’ in Proc. 27th Int. Joint Conf. Artif. Intell., Stockholm, Sweden, Jul. 2018, pp. 4237–4243.
[37] D. Chen, A. Fisch, J. Weston, and A. Bordes, ‘‘Reading Wikipedia to answer open-domain questions,’’ in Proc. Annu. Meeting Assoc. Comput. Linguistics (ACL), vol. 1, Vancouver, BC, Canada, 2017, pp. 1870–1879.
[38] M. Hu, Y. Peng, Z. Huang, and D. Li, ‘‘Retrieve, read, rerank: Towards end-to-end multi-document reading comprehension,’’ in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, Florence, Italy, 2019, pp. 2285–2295.
[39] B. Hu, Z. Lu, H. Li, and Q. Chen, ‘‘Convolutional neural network architectures for matching natural language sentences,’’ in Proc. Adv. Neural Inf. Process. Syst., Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, Eds. Cambridge, MA, USA: Curran Associates, 2014, pp. 2042–2050.
[40] T. Kenter, A. Borisov, C. Van Gysel, M. Dehghani, M. de Rijke, and B. Mitra, ‘‘Neural networks for information retrieval,’’ in Proc. 11th ACM Int. Conf. Web Search Data Mining, New York, NY, USA, 2018, pp. 779–780.
[41] L. Yunjuan, Z. Lijun, M. Lijuan, and M. Qinglin, ‘‘Research and application of information retrieval techniques in intelligent question answering system,’’ in Proc. 3rd Int. Conf. Comput. Res. Develop., Mar. 2011, pp. 188–190.
[42] Y. Hao, Y. Zhang, K. Liu, S. He, Z. Liu, H. Wu, and J. Zhao, ‘‘An end-to-end model for question answering over knowledge base with cross-attention combining global knowledge,’’ in Proc. 55th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, Vancouver, BC, Canada, 2017, pp. 221–231.
[43] B. Katz, S. Felshin, J. J. Lin, and G. Marton, ‘‘Viewing the Web as a virtual database for question answering,’’ in New Directions in Question Answering. Palo Alto, CA, USA: AAAI Press, 2004, ch. 17, pp. 215–226.
[44] E. M. Voorhees, ‘‘The TREC-8 question answering track report,’’ in Proc. Text Retr. Conf. (TREC), 1999, pp. 77–82.
[45] M. Petrochuk and L. Zettlemoyer, ‘‘SimpleQuestions nearly solved: A new upperbound and baseline approach,’’ in Proc. Conf. Empirical Methods Natural Lang. Process., 2018, pp. 554–558.
[46] J. Cheng, L. Dong, and M. Lapata, ‘‘Long short-term memory-networks for machine reading,’’ in Proc. Conf. Empirical Methods Natural Lang. Process., 2016, pp. 551–561.
[47] S. Wang, M. Yu, X. Guo, Z. Wang, T. Klinger, W. Zhang, S. Chang, G. Tesauro, B. Zhou, and J. Jiang, ‘‘R3: Reinforced ranker-reader for open-domain question answering,’’ in Proc. 32nd AAAI Conf. Artif. Intell., New Orleans, LA, USA, 2018, pp. 5981–5988.
[48] Y. Lin, H. Ji, Z. Liu, and M. Sun, ‘‘Denoising distantly supervised open-domain question answering,’’ in Proc. 56th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, 2018, pp. 1736–1745.
[49] C. Clark and M. Gardner, ‘‘Simple and effective multi-paragraph reading comprehension,’’ in Proc. 56th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, 2018, pp. 845–855.
[50] S. Zhang, L. Yao, A. Sun, and Y. Tay, ‘‘Deep learning based recommender system: A survey and new perspectives,’’ ACM Comput. Surv., vol. 52, no. 1, pp. 1–38, Feb. 2019.
[51] J. Chen, X. Qiu, P. Liu, and X. Huang, ‘‘Meta multi-task learning for sequence modeling,’’ in Proc. 32nd AAAI Conf. Artif. Intell., 2018, pp. 5070–5077.
[52] C. Sun, A. Shrivastava, S. Singh, and A. Gupta, ‘‘Revisiting unreasonable effectiveness of data in deep learning era,’’ in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 843–852.
[53] S. Wang, M. Yu, J. Jiang, W. Zhang, X. Guo, S. Chang, Z. Wang, T. Klinger, G. Tesauro, and M. Campbell, ‘‘Evidence aggregation for answer re-ranking in open-domain question answering,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2018, pp. 1–14.
[54] A. W. Yu, H. Lee, and Q. Le, ‘‘Learning to skim text,’’ in Proc. 55th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, 2017, pp. 1880–1890.
[55] M. Seo, S. Min, A. Farhadi, and H. Hajishirzi, ‘‘Neural speed reading via skim-RNN,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2018, pp. 1–14.
[56] B. Mitra and N. Craswell, ‘‘An introduction to neural information retrieval,’’ Found. Trends Inf. Retr., vol. 13, no. 1, pp. 1–126, Dec. 2018.
[57] K. D. Onal, Y. Zhang, I. S. Altingovde, M. M. Rahman, P. Karagoz, A. Braylan, B. Dang, H.-L. Chang, H. Kim, Q. McNamara, A. Angert, E. Banner, V. Khetan, T. McDonnell, A. T. Nguyen, D. Xu, B. C. Wallace, M. de Rijke, and M. Lease, ‘‘Neural information retrieval: At the end of the early years,’’ Inf. Retr. J., vol. 21, nos. 2–3, pp. 111–182, Jun. 2018.
[58] J. Chu-Carroll, J. Fan, B. K. Boguraev, D. Carmel, D. Sheinwald, and C. Welty, ‘‘Finding needles in the haystack: Search and candidate generation,’’ IBM J. Res. Develop., vol. 56, nos. 3–4, p. 6, May 2012.
[59] S. Kato, R. Togashi, H. Maeda, S. Fujita, and T. Sakai, ‘‘LSTM vs. BM25 for open-domain QA: A hands-on comparison of effectiveness and efficiency,’’ in Proc. 40th Int. ACM SIGIR Conf. Res. Develop. Inf. Retr., New York, NY, USA, 2017, pp. 1309–1312.
[60] M. Seo, T. Kwiatkowski, A. Parikh, A. Farhadi, and H. Hajishirzi, ‘‘Phrase-indexed question answering: A new challenge for scalable document comprehension,’’ in Proc. Conf. Empirical Methods Natural Lang. Process., 2018, pp. 559–564.
[61] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, ‘‘BERT: Pre-training of deep bidirectional transformers for language understanding,’’ in Proc. ACL Conf. NAACL HLT, Minneapolis, MN, USA, Jun. 2019, pp. 4171–4186.
[62] K. Lee, M.-W. Chang, and K. Toutanova, ‘‘Latent retrieval for weakly supervised open domain question answering,’’ in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, Florence, Italy, 2019, pp. 6086–6096.
[63] E. H. Hovy, L. Gerber, U. Hermjakob, M. Junk, and C. Lin, ‘‘Question answering in Webclopedia,’’ in Proc. 9th Text Retr. Conf. (TREC), 2000, pp. 1–10.
[64] T.-Y. Liu, ‘‘Learning to rank for information retrieval,’’ Found. Trends Inf. Retr., vol. 3, no. 3, pp. 225–331, 2007.
[65] P. Li, C. J. C. Burges, and Q. Wu, ‘‘McRank: Learning to rank using multiple classification and gradient boosting,’’ in Proc. 20th Int. Conf. Neural Inf. Process. Syst., Jul. 2007, pp. 897–904.
[66] K. Crammer and Y. Singer, ‘‘Pranking with ranking,’’ in Advances in Neural Information Processing Systems. Cambridge, MA, USA: MIT Press, 2001, pp. 641–647.
[67] C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender, ‘‘Learning to rank using gradient descent,’’ in Proc. 22nd Int. Conf. Mach. Learn. (ICML), 2005, pp. 89–96.
[68] M.-F. Tsai, T.-Y. Liu, T. Qin, H.-H. Chen, and W.-Y. Ma, ‘‘FRank: A ranking method with fidelity loss,’’ in Proc. 30th Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. Retr., Jul. 2007, pp. 383–390.
[69] C. J. C. Burges, R. Ragno, and Q. V. Le, ‘‘Learning to rank with nonsmooth cost functions,’’ in Proc. 19th Int. Conf. Neural Inf. Process. Syst., Jun. 2006, pp. 193–200.
[70] M. Taylor, J. Guiver, S. Robertson, and T. Minka, ‘‘SoftRank: Optimizing non-smooth rank metrics,’’ in Proc. Int. Conf. Web Search Data Mining, Aug. 2008, pp. 77–86.
[71] S. Wang and J. Jiang, ‘‘Machine comprehension using match-LSTM and answer pointer,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2017, pp. 1–3.
[72] M. Tan, C. dos Santos, B. Xiang, and B. Zhou, ‘‘Improved representation learning for question answer matching,’’ in Proc. 54th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, 2016.
[73] A. Shrivastava and P. Li, ‘‘Improved asymmetric locality sensitive hashing (ALSH) for maximum inner product search (MIPS),’’ in Proc. 31st Conf. Uncertainty Artif. Intell., Arlington, VA, USA, 2015, pp. 812–821.
[74] J. Johnson, M. Douze, and H. Jégou, ‘‘Billion-scale similarity search with GPUs,’’ 2017, arXiv:1702.08734. [Online]. Available: http://arxiv.org/abs/1702.08734
[75] P. M. Htut, S. Bowman, and K. Cho, ‘‘Training a ranking function for open-domain question answering,’’ in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Student Res. Workshop, 2018, pp. 120–127.
[76] A. Conneau, D. Kiela, H. Schwenk, L. Barrault, and A. Bordes, ‘‘Supervised learning of universal sentence representations from natural language inference data,’’ in Proc. Conf. Empirical Methods Natural Lang. Process., 2017, pp. 670–680.
[77] J. Pennington, R. Socher, and C. Manning, ‘‘GloVe: Global vectors for word representation,’’ in Proc. Conf. Empirical Methods Natural Lang. Process. (EMNLP), 2014, pp. 1532–1543.
[78] C. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. Bethard, and D. McClosky, ‘‘The Stanford CoreNLP natural language processing toolkit,’’ in Proc. 52nd Annu. Meeting Assoc. Comput. Linguistics: Syst. Demonstrations, 2014, pp. 55–60.
[79] M. Gardner, J. Grus, M. Neumann, O. Tafjord, P. Dasigi, N. F. Liu, M. Peters, M. Schmitz, and L. Zettlemoyer, ‘‘AllenNLP: A deep semantic natural language processing platform,’’ in Proc. Workshop NLP Open Source Softw. (NLP-OSS), 2018, pp. 1–6.
[80] S. Bird and E. Loper, ‘‘NLTK: The natural language toolkit,’’ in Proc. ACL Interact. Poster Demonstration Sessions, 2004, pp. 1–5.
[81] M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, ‘‘Deep contextualized word representations,’’ in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Hum. Lang. Technol., vol. 1, 2018, pp. 2227–2237.
[82] A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, ‘‘Improving language understanding with unsupervised learning,’’ OpenAI, San Francisco, CA, USA, Tech. Rep., 2018. [Online]. Available: https://openai.com/blog/language-unsupervised/
[83] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, ‘‘Attention is all you need,’’ in Proc. Adv. Neural Inf. Process. Syst. (NIPS), vol. 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., 2017, pp. 5998–6008.
[84] L. Qiu, Y. Xiao, Y. Qu, H. Zhou, L. Li, W. Zhang, and Y. Yu, ‘‘Dynamically fused graph network for multi-hop reasoning,’’ in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, 2019, pp. 6140–6150.
[85] S. Wang and J. Jiang, ‘‘Learning natural language inference with LSTM,’’ in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Hum. Lang. Technol., 2016, pp. 1442–1451.
[86] C. Xiong, V. Zhong, and R. Socher, ‘‘Dynamic coattention networks for question answering,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2017, pp. 1–11.
[87] M. J. Seo, A. Kembhavi, A. Farhadi, and H. Hajishirzi, ‘‘Bidirectional attention flow for machine comprehension,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2017.
[92] H.-Y. Huang, C. Zhu, Y. Shen, and W. Chen, ‘‘FusionNet: Fusing via fully-aware attention with application to machine comprehension,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2018, pp. 1–20.
[93] W. Wang, M. Yan, and C. Wu, ‘‘Multi-granularity hierarchical attention fusion networks for reading comprehension and question answering,’’ in Proc. 56th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, 2018, pp. 1705–1714.
[94] M. Hu, Y. Peng, Z. Huang, X. Qiu, F. Wei, and M. Zhou, ‘‘Reinforced mnemonic reader for machine reading comprehension,’’ in Proc. 27th Int. Joint Conf. Artif. Intell., Jul. 2018, pp. 4099–4106.
[95] Z. Wang, H. Mi, W. Hamza, and R. Florian, ‘‘Multi-perspective context matching for machine comprehension,’’ 2016, arXiv:1612.04211. [Online]. Available: http://arxiv.org/abs/1612.04211
[96] D. Weissenborn, G. Wiese, and L. Seiffe, ‘‘Making neural QA as simple as possible but not simpler,’’ in Proc. 21st Conf. Comput. Natural Lang. Learn. (CoNLL), 2017, pp. 271–280.
[97] Y. Shen, P.-S. Huang, J. Gao, and W. Chen, ‘‘ReasoNet: Learning to stop reading in machine comprehension,’’ in Proc. 23rd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, New York, NY, USA, 2017, pp. 1047–1055.
[98] X. Wang, Z. Huang, Y. Zhang, L. Tan, and Y. Liu, ‘‘DSDR: Dynamic semantic discard reader for open-domain question answering,’’ in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2018, pp. 1–7.
[99] X. Liu, Y. Shen, K. Duh, and J. Gao, ‘‘Stochastic answer networks for machine reading comprehension,’’ in Proc. 56th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, 2018, pp. 1694–1704.
[100] M. Hu, F. Wei, Y. Peng, Z. Huang, N. Yang, and D. Li, ‘‘Read + verify: Machine reading comprehension with unanswerable questions,’’ Proc. AAAI Conf. Artif. Intell., vol. 33, pp. 6529–6537, Jul. 2019.
[101] V. Zhong, C. Xiong, N. Keskar, and R. Socher, ‘‘Coarse-grain fine-grain coattention network for multi-evidence question answering,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2019, pp. 1–27.
[102] Z. Wang, J. Liu, X. Xiao, Y. Lyu, and T. Wu, ‘‘Joint training of candidate extraction and answer selection for reading comprehension,’’ in Proc. 56th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, 2018, pp. 1715–1724.
[103] L. Pang, Y. Lan, J. Guo, J. Xu, L. Su, and X. Cheng, ‘‘HAS-QA: Hierarchical answer spans model for open-domain question answering,’’ in Proc. AAAI Conf. Artif. Intell., vol. 33, Jul. 2019, pp. 6875–6882.
[104] Y. Wang, K. Liu, J. Liu, W. He, Y. Lyu, H. Wu, S. Li, and H. Wang, ‘‘Multi-passage machine reading comprehension with cross-passage answer verification,’’ in Proc. 56th Annu. Meeting Assoc. Comput. Linguistics, 2018, pp. 1918–1927.
[105] Q. Chen, X. Zhu, Z.-H. Ling, D. Inkpen, and S. Wei, ‘‘Neural natural language inference models enhanced with external knowledge,’’ in Proc. 56th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, Melbourne, NSW, Australia, 2018, pp. 2406–2417.
[106] T. Mihaylov and A. Frank, ‘‘Knowledgeable reader: Enhancing cloze-style reading comprehension with external commonsense knowledge,’’ in Proc. 56th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, Melbourne, NSW, Australia, 2018, pp. 821–832.
[107] L. Bauer, Y. Wang, and M. Bansal, ‘‘Commonsense for generative multi-hop question answering tasks,’’ in Proc. Conf. Empirical Methods Natural Lang. Process., Brussels, Belgium, 2018, pp. 1–32.
[108] H. Sun, B. Dhingra, M. Zaheer, K. Mazaitis, R. Salakhutdinov, and W. Cohen, ‘‘Open domain question answering using early fusion of knowledge bases and text,’’ in Proc. Conf. Empirical Methods Natural Lang. Process., Brussels, Belgium, 2018.
Process., 2018, pp. 4231–4242.
Represent. (ICLR), Toulon, France, 2017, pp. 1–14.
[88] Y. Gong and S. Bowman, ‘‘Ruminating reader: Reasoning with gated [109] D. Weissenborn, T. Kocišký, and C. Dyer, ‘‘Dynamic integration of
multi-hop attention,’’ in Proc. Workshop Mach. Reading Question Answer- background knowledge in neural NLU systems,’’ 2017, arXiv:1706.02596.
ing, 2018, pp. 1–11. [Online]. Available: http://arxiv.org/abs/1706.02596
[89] W. Wang, N. Yang, F. Wei, B. Chang, and M. Zhou, ‘‘Gated self- [110] D. Chen, ‘‘Neural reading comprehension and beyond,’’ Ph.D. disserta-
matching networks for reading comprehension and question answering,’’ tion, Dept. Comput. Sci., Stanford Univ., Stanford, CA, USA, 2018.
in Proc. 55th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, 2017, [111] T. Lei, Y. Zhang, S. I. Wang, H. Dai, and Y. Artzi, ‘‘Simple recurrent units
pp. 189–198. for highly parallelizable recurrence,’’ in Proc. Conf. Empirical Methods
[90] C. Xiong, V. Zhong, and R. Socher, ‘‘DCN+: Mixed objective and deep Natural Lang. Process., 2018, pp. 4470–4481.
residual coattention for question answering,’’ in Proc. Int. Conf. Learn. [112] M. Feurer, A. Klein, K. Eggensperger, J. T. Springenberg, M. Blum, and
Represent. (ICLR), 2018, pp. 1–10. F. Hutter, ‘‘Efficient and robust automated machine learning,’’ in Proc. Adv.
[91] A. W. Yu, D. Dohan, Q. Le, T. Luong, R. Zhao, and K. Chen, Neural Inf. Process. Syst. Annu. Conf. Neural Inf. Process. Syst., 2015,
‘‘QANet: Combining local convolution with global self-attention for pp. 2962–2970.
reading comprehension,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), [113] F. Hutter, L. Kotthoff, and J. Vanschoren, Eds., Efficient and Robust
2018, pp. 1–16. Automated Machine Learning. Berlin, Germany: Springer, 2018.
[114] S. Estevez-Velarde, Y. Gutiérrez, A. Montoyo, and Y. Almeida-Cruz, ‘‘AutoML strategy based on grammatical evolution: A case study about knowledge discovery from text,’’ in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, Florence, Italy, 2019, pp. 4356–4365.
[115] M. Hu, Y. Peng, F. Wei, Z. Huang, D. Li, N. Yang, and M. Zhou, ‘‘Attention-guided answer distillation for machine reading comprehension,’’ in Proc. Conf. Empirical Methods Natural Lang. Process., 2018, pp. 2077–2086.
[116] S. Min, V. Zhong, R. Socher, and C. Xiong, ‘‘Efficient and robust question answering from minimal context over documents,’’ in Proc. 56th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, 2018, pp. 1725–1735.
[117] K. Yu, Y. Liu, A. G. Schwing, and J. Peng, ‘‘Fast and accurate text classification: Skimming, rereading and early stopping,’’ in Proc. ICLR Workshop, 2018, pp. 1–12.
[118] T.-J. Fu and W.-Y. Ma, ‘‘Speed reading: Learning to read ForBackward via shuttle,’’ in Proc. Conf. Empirical Methods Natural Lang. Process., 2018, pp. 4439–4448.
[119] V. Campos, B. Jou, X. Giró-i-Nieto, J. Torres, and S.-F. Chang, ‘‘Skip RNN: Learning to skip state updates in recurrent neural networks,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2018, pp. 1–17.
[120] C. Hansen, C. Hansen, S. Alstrup, J. G. Simonsen, and C. Lioma, ‘‘Neural speed reading with structural-JUMP-LSTM,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2019, pp. 1–10.
[121] A. Johansen and R. Socher, ‘‘Learning when to skim and when to read,’’ in Proc. 2nd Workshop Represent. Learn. NLP, 2017, pp. 257–264.
[122] E. Choi, D. Hewlett, J. Uszkoreit, I. Polosukhin, A. Lacoste, and J. Berant, ‘‘Coarse-to-fine question answering for long documents,’’ in Proc. 55th Annu. Meeting Assoc. Comput. Linguistics, vol. 1, 2017, pp. 209–220.
[123] R. Das, S. Dhuliawala, M. Zaheer, and A. McCallum, ‘‘Multi-step retriever-reader interaction for scalable open-domain question answering,’’ in Proc. Int. Conf. Learn. Represent. (ICLR), 2019, pp. 1–13.
[124] S. Back, S. Yu, S. R. Indurthi, J. Kim, and J. Choo, ‘‘MemoReader: Large-scale reading comprehension through neural memory controller,’’ in Proc. Conf. Empirical Methods Natural Lang. Process., 2018, pp. 2131–2140.
[125] Y. Zhuang and H. Wang, ‘‘Token-level dynamic self-attention network for multi-passage reading comprehension,’’ in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, Florence, Italy, 2019, pp. 2252–2262.
[126] G. Marcus, ‘‘Deep learning: A critical appraisal,’’ 2018, arXiv:1801.00631. [Online]. Available: http://arxiv.org/abs/1801.00631
[127] T. Kwiatkowski, J. Palomaki, O. Redfield, M. Collins, A. Parikh, C. Alberti, D. Epstein, I. Polosukhin, M. Kelcey, J. Devlin, K. Lee, K. N. Toutanova, L. Jones, M.-W. Chang, A. Dai, J. Uszkoreit, Q. Le, and S. Petrov, ‘‘Natural questions: A benchmark for question answering research,’’ Trans. Assoc. Comput. Linguistics, vol. 7, pp. 453–466, Aug. 2019.
[128] M. Seo, J. Lee, T. Kwiatkowski, A. Parikh, A. Farhadi, and H. Hajishirzi, ‘‘Real-time open-domain question answering with dense-sparse phrase index,’’ in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, Florence, Italy, 2019, pp. 4430–4441.
[129] E. Strubell, A. Ganesh, and A. McCallum, ‘‘Energy and policy considerations for deep learning in NLP,’’ in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, Florence, Italy, 2019, pp. 1–6.
[130] E. Wallace, S. Feng, N. Kandpal, M. Gardner, and S. Singh, ‘‘Universal adversarial triggers for attacking and analyzing NLP,’’ in Proc. Conf. Empirical Methods Natural Lang. Process. 9th Int. Joint Conf. Natural Lang. Process. (EMNLP-IJCNLP), 2019, pp. 2153–2162.
[131] Z. Yang, P. Qi, S. Zhang, Y. Bengio, W. Cohen, R. Salakhutdinov, and C. D. Manning, ‘‘HotpotQA: A dataset for diverse, explainable multi-hop question answering,’’ in Proc. Conf. Empirical Methods Natural Lang. Process., 2018, pp. 1–12.
[132] D. Dua, Y. Wang, P. Dasigi, G. Stanovsky, S. Singh, and M. Gardner, ‘‘DROP: A reading comprehension benchmark requiring discrete reasoning over paragraphs,’’ in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics (NAACL), 2019, pp. 1–12.
[133] M. Ding, C. Zhou, Q. Chen, H. Yang, and J. Tang, ‘‘Cognitive graph for multi-hop reading comprehension at scale,’’ in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, Florence, Italy, 2019, pp. 2694–2703.
[134] M. Tu, G. Wang, J. Huang, Y. Tang, X. He, and B. Zhou, ‘‘Multi-hop reading comprehension across multiple documents by reasoning over heterogeneous graphs,’’ in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, Florence, Italy, 2019, pp. 2704–2713.
[135] M. Hu, Y. Peng, Z. Huang, and D. Li, ‘‘A multi-type multi-span network for reading comprehension that requires discrete reasoning,’’ in Proc. Conf. Empirical Methods Natural Lang. Process. 9th Int. Joint Conf. Natural Lang. Process. (EMNLP-IJCNLP), 2019, pp. 1596–1606.
[136] S. Yu, S. R. Indurthi, S. Back, and H. Lee, ‘‘A multi-stage memory augmented neural network for machine reading comprehension,’’ in Proc. Workshop Mach. Reading Question Answering, 2018, pp. 21–30.
[137] T. Mikolov, E. Grave, P. Bojanowski, C. Puhrsch, and A. Joulin, ‘‘Advances in pre-training distributed word representations,’’ in Proc. 11th Int. Conf. Lang. Resour. Eval. (LREC). Miyazaki, Japan: European Language Resources Association (ELRA), May 2018. [Online]. Available: https://www.aclweb.org/anthology/L18-1008
ZHEN HUANG was born in Hunan, China, in 1984. He received the B.S. and Ph.D. degrees from the National University of Defense Technology (NUDT), in 2006 and 2012, respectively. He was a visiting student with Eurecom, in 2009. From 2012 to 2016, he was an Assistant Professor with the Science and Technology on Parallel and Distributed Laboratory (PDL), NUDT, where he is currently an Associate Professor. He is the author of more than 40 articles. His research interests include natural language processing, distributed storage, and artificial intelligence. His Ph.D. thesis received the Excellent Doctoral Thesis Award of Hunan Province. He also received the Best Paper Award at ICCCT, in 2011.

SHIYI XU was born in Hubei, China, in 1991. She received the B.E. degree from Minnan Normal University, China, in 2013. She is currently pursuing the master's degree with the Science and Technology on Parallel and Distributed Laboratory (PDL), National University of Defense Technology (NUDT), Changsha, China. Her research interests include natural language processing and artificial intelligence.

MINGHAO HU received the M.S. degree from the National University of Defense Technology (NUDT), in 2016, where he is currently pursuing the Ph.D. degree. He has published articles in top-tier conferences, such as ACL, EMNLP, AAAI, and IJCAI. His research interests include natural language processing and machine reading comprehension.

XINYI WANG was born in China, in 1995. She received the B.E. degree from the National University of Defense Technology (NUDT), where she is currently pursuing the master's degree with the Science and Technology on Parallel and Distributed Laboratory (PDL). Her research interests include natural language processing.

JINYAN QIU received the M.S. degree in computer science and technology from the National University of Defense Technology (NUDT), in 2008. He is currently an Assistant Engineer with the H.R. Support Center. His research interests include deep learning and big data.

YONGQUAN FU received the M.S. and Ph.D. degrees in computer science and technology from the National University of Defense Technology (NUDT), in 2007 and 2012, respectively. He is currently an Associate Professor with NUDT. His research interests include network machine learning and distributed systems.

YUXING PENG was born in 1963. He received the bachelor's degree in computer science from the Beijing University of Aeronautics and Astronautics, and the M.S. and Ph.D. degrees from the National University of Defense Technology (NUDT). He was a Head Coach of the school's ACM programming contest teams. He is currently a Researcher in computer science and a Ph.D. Supervisor with the Science and Technology on Parallel and Distributed Laboratory (PDL), NUDT. He has trained more than 50 gold medal winners and more than 70 silver medal winners in international contests. His research interests include distributed computing, virtual computing environments, cloud computing, big data, and intelligent computing. He received the Gold Medal of the Military Academy Talents Cultivation Award, in 2010, the Excellent Doctoral Thesis Mentor Award of Hunan Province, in 2013, and the ACM ICPC World Finals Outstanding Coach Award, in 2015.

YUNCAI ZHAO was born in Hunan, China, in 1975. He received the bachelor's degree from the Naval University of Engineering, in 1994. He is currently a Senior Engineer with Unit 31011, PLA. His research interests include artificial intelligence, international relations, and international strategy.

CHANGJIAN WANG received the B.S., M.S., and Ph.D. degrees in computer science and technology from the National University of Defense Technology (NUDT), Changsha, China. He is currently an Associate Professor with NUDT. His research interests include databases, distributed computing, cloud computing, big data, and machine learning.