Proposal Guid
(SUMMAQA)
By
G126/0606/2018
A Research Project Report Submitted to the School of Science and Computing in partial fulfillment
of the Requirements for the award of the Degree of the Bachelor of Science in Computer Science
November, 2021
DECLARATION
I hereby solemnly declare that the proposal entitled “Information Retrieval Using Transformers” is
based on my own work carried out during the course of my study under the supervision of
Mr. Casper Shikali.
The work has not been submitted for any other degree, diploma or certificate in this university or
any other institution.
DEDICATION & ACKNOWLEDGEMENTS
This project is especially dedicated to the lecturers who have continually helped and guided me to
successfully complete this project work. I would also like to give special thanks to my supervisor,
Mr. Casper Shikali, who spared his valuable time to direct me and ensured that I had the materials I
needed to carry out my research, continue building my project and improve on it.
I would also like to dedicate this project to my family and friends, who have continuously supported
me and given me ideas. It has been a pleasure.
TABLE OF CONTENT
DECLARATION
DEDICATION & ACKNOWLEDGEMENTS
ABSTRACT
CHAPTER ONE
INTRODUCTION
1.1 Background of the Study
1.1.1 Complexity of information in the current times
1.2 Statement of the Problem
1.3 Objectives of the Study
1.4 Research Questions
1.5 Justification of the Study
1.6 Scope of the Study
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
2.2 Question Answering
2.3 Summarization
2.4 BERT
2.5 Conceptual Framework
2.6 Related Work
2.6.1 Related methods
Transformers
2.7 Gaps in the Literature
2.8 Approach
CHAPTER THREE
RESEARCH DESIGN AND METHODOLOGY
3.1 Introduction
3.2 Research Design
3.3 Target Population
3.4 Sampling Design
3.5 Data collection Techniques
3.6 Data analysis methods
3.7 Development Methodology
3.8 Technology for Development
3.8.1 Front-end / Client-Side
3.8.2 React
3.8.3 CSS
3.8.4 Server-Side / Back-end
3.8.5 FastAPI
CHAPTER 4
CHAPTER 5
LIST OF FIGURES
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
LIST OF TABLES
Table 1
LIST OF ABBREVIATIONS
INTRODUCTION
1.1 Background of the Study
In this section, I detail the context and background of the project. I explain why computerized
solutions to text and Natural Language Processing are needed, how such solutions were achieved
historically, and some of the present techniques.
2. Will the system be able to retrieve the documents that the user has uploaded?
3. Will the text which has been summarized by the system retain its meaning?
4. Will the system be able to respond with the correct answers phrased in the right way?
2.1 Introduction
While there is a large body of scholarly work on both Question Answering systems and automatic text
summarization, there is still a wide area left to research, especially in Natural Language
Understanding, since it is an emerging field. In particular, there is little research that combines
text summarization with question answering systems, which is what this project explores.
2.2 Question Answering
Question answering (QA) systems are platforms that automatically answer questions posed in natural
language. Chatbots are a familiar example: the user talks to a computer much as they would to a human.
Information comes in large amounts, and when reading a document we often skim it looking for a direct
answer. This becomes tedious when the document is lengthy or uses complex syntax. Question answering
systems are used for this purpose: they scan through a corpus of documents and return the relevant
answer or paragraph. QA belongs to the fields of information retrieval and NLP within computer
science, which focus on building systems that automatically extract answers to questions posed by
humans or machines in a natural language.
Two of the earliest question answering systems, BASEBALL and LUNAR, were popular because of their core
database or information system. BASEBALL was built to answer questions about American League baseball
over a one-year cycle. LUNAR was built to answer questions about the geological analysis of lunar
rocks, based on data collected from the Apollo moon missions. Some of the advanced question answering
systems of the modern world are Apple Siri, Amazon Alexa, and Google Assistant, which most of us
interact with.
To understand the subject of Question Answering, we need to define the associated terms. A Question
Phrase is the part of the question that says what is being searched for. The Question Type is the
categorization of the question according to its purpose. In the literature, the Answer Type refers to
the class of objects sought by the question. The Question Focus is the property or entity being
searched for by the question. The Question Topic is the object or event that the question is about. A
Candidate Passage can broadly be defined as anything from a sentence to a document retrieved by a
search engine in response to a question. A Candidate Answer is text ranked according to its
suitability as an answer. Previous studies mostly define the architecture of Question Answering
systems in three macro modules: Question Processing, Document Processing and Answer Processing, as
shown below:
Figure 1
Question Processing receives the input from the user, a question in natural language, and classifies
and analyzes it. The analysis determines the type of question and its focus, which is necessary to
avoid ambiguities in the answer produced.
Types of QA systems
Question answering (QA) systems are broadly divided into two categories: open-domain question
answering (ODQA) systems and closed-domain question answering (CDQA) systems.
Open domain - in open-domain systems, questions can come from any domain, such as healthcare, sports,
IT and many more. The key concept is that the system is not restricted to any one field. An example
of such a system is DeepPavlov, which uses a large dataset from Wikipedia as its source of knowledge.
Closed domain - a closed-domain system deals with questions in one particular domain. For example, a
healthcare QA system cannot answer IT-related questions.
2.3 Summarization
The Internet is a vast source of electronic information, and retrieving the relevant information from
it is a laborious task for humans. Automatic summarization came into use for this reason: it
automatically extracts the key content from documents and in the process saves precious time. Luhn
was the first to propose automatic summarization of text, in 1958, and summarization has since become
a subfield within the NLP community. According to Radev, one or more documents are processed and a
short summary is produced that is smaller than the original documents. The requirements of
summarization include:
This brings us to the question: what is automatic text summarization? Automatic text summarization is
the process of taking a sequence of words and reducing the number of words while retaining the most
essential information from the original context. Approaches to summarization are either
extraction-based or abstraction-based:
• In extractive summarization, the aim is to summarize the corpus using only words and sentences that
appear in the body of the text.
• In abstractive summarization, the model's goal is to learn the inherent language representation so
that it can summarize more like a human would, that is, using its own choice of words.
Extractive summarization has historically been the more extensively researched of the two, since it is
considered a simpler problem to solve.
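As a rough illustration of the extractive approach, the sketch below scores each sentence by how often
its words occur in the whole text and keeps the top-scoring sentences in their original order. It is a
minimal frequency-based heuristic in the spirit of Luhn's idea, not the summarization model used in
this project; the function name `extractive_summary` and the tiny stop-word list are assumptions made
for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "it", "on"}


def extractive_summary(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    # Split into sentences on '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Count how often each non-stop word occurs in the whole text.
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    freq = Counter(words)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens if t not in STOP_WORDS)

    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


doc = ("BERT is a language model. "
       "BERT answers questions about a passage with a language model head. "
       "The weather was nice.")
summary = extractive_summary(doc, max_sentences=1)
print(summary)
```

Because the output reuses sentences from the input verbatim, it can never introduce wording that was
not in the source, which is the defining property of extractive summarization.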
2.4 BERT
In this project I will use a language representation model BERT which stands for Bidirectional Encoder
Representations from Transformers. BERT is designed to pretrain deep bidirectional representations from
unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained
BERT model can be finetuned with just one additional output layer to create state-of-the-art models for a
wide range of tasks, such as question answering and language inference, without substantial task specific
architecture modifications.
Since the introduction of this language representation model, the field of NLP has been transformed:
at the time of its release, BERT surpassed previous models on a wide variety of tasks.
BERT builds on a model architecture called the Transformer, which allows the model to process entire
sentences at a time instead of word by word. It also allows the network to learn how sentences and
languages are constructed, based on the context of the surrounding words. The Transformer is based on
an encoder-decoder structure instead of a recurrent structure and uses a concept called attention;
BERT itself uses only the encoder part of this architecture.
Attention assigns a weight to each word in a sentence. For each prediction, the weight represents how
important each word is and which words should be emphasized more than others. This allows transformer
models such as BERT to learn the meaning of words and sentences from the surrounding words.
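To make the idea of attention weights concrete, the following is a minimal sketch of scaled
dot-product attention for a single query vector, written in plain Python. The vectors are made-up toy
numbers and the function names are chosen for this example; BERT computes the same kind of weights
with learned projections and many attention heads in parallel.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    # Similarity of the query to every key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, output


# Toy 2-dimensional vectors for three "words" (made-up numbers).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attention([1.0, 0.0], keys, values)
print(weights, output)
```

The words whose keys are most similar to the query receive the largest weights, so they contribute
most to the output vector.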
BERT tokenizes the question and the passage and combines them into a single input. The input starts
with a [CLS] token that marks the start of the sequence and uses a [SEP] separator between the
question and the passage. Along with the [SEP] token, BERT uses segment embeddings to differentiate
between the question and the passage that contains the answer: one segment embedding for the question
and another for the passage. These segment embeddings are added to the token embeddings so that the
model can tell question tokens from passage tokens, as shown.
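The input layout described here can be sketched in a few lines. The example assumes the question and
passage have already been split into tokens by hand (the real BERT tokenizer uses WordPiece), and
`build_bert_input` is a hypothetical helper name used only for illustration.

```python
def build_bert_input(question_tokens, passage_tokens):
    """Pack question and passage into one sequence with segment ids."""
    tokens = ["[CLS]"] + question_tokens + ["[SEP]"] + passage_tokens + ["[SEP]"]
    # Segment 0 covers [CLS], the question and the first [SEP];
    # segment 1 covers the passage and the final [SEP].
    segment_ids = [0] * (len(question_tokens) + 2) + [1] * (len(passage_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_bert_input(
    ["when", "was", "queen", "formed", "?"],
    ["queen", "formed", "in", "london", "in", "1970", "."],
)
for token, segment in zip(tokens, segment_ids):
    print(segment, token)
```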
Next, the combined embedded representation of the question and passage is passed as input to the BERT
model. A span-prediction layer placed on top of BERT's last hidden layer uses softmax to generate
probability distributions for the start index and the end index over the input tokens. The two indices
define a substring of the passage, which is the answer, as shown.
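The step from start and end scores to an answer string can be sketched as follows. The logits are
made-up numbers standing in for the model's output, and `best_span` is a hypothetical helper: it
applies softmax to both score lists and picks the (start, end) pair with the highest joint
probability, subject to the end not preceding the start.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_logits, end_logits, max_answer_len=30):
    """Pick the (start, end) pair with the highest joint probability."""
    start_probs = softmax(start_logits)
    end_probs = softmax(end_logits)
    best = (0, 0, 0.0)
    for i, p_start in enumerate(start_probs):
        # The end index may not come before the start index.
        for j in range(i, min(i + max_answer_len, len(end_probs))):
            prob = p_start * end_probs[j]
            if prob > best[2]:
                best = (i, j, prob)
    return best


tokens = ["queen", "formed", "in", "london", "in", "1970", "."]
start_logits = [0.1, 0.2, 0.0, 1.0, 0.3, 4.0, 0.0]  # made-up scores
end_logits = [0.0, 0.1, 0.0, 0.5, 0.2, 5.0, 0.1]    # made-up scores
start, end, prob = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer, round(prob, 3))
```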
QA system diagram
Figure 2
Figure 3
2.6 Related Work
In this section I discuss some related projects and works and how they compare to the application
that I am developing. I also discuss some of the technologies that my application is based on.
In my research I found that there are many applications related to my project. A good example is
Watson, which was developed and built by IBM. It was developed to answer questions on the popular TV
show Jeopardy but has grown into a general-purpose QA machine used in many fields, such as economics
and healthcare. However, for an application to take on similar but lower-level tasks, the QA system
needs to be scaled down. A major drawback is that the speed, size and complexity of IBM Watson are not
achievable under my resource limitations. I also could not use Watson directly in my application,
since it uses proprietary components that would not allow me to make modifications to it. This
project could potentially serve as an open-source alternative to Watson in the future.
2.6.1 Related methods
In this section, I go over related deep learning approaches and architectures for handling specific
problems such as question answering and summarization. I start with the fundamental architectures and
models, then show how these models are typically used to solve NLP problems.
Transformers
In natural language processing, the Transformer is an architecture designed to solve
sequence-to-sequence tasks while also handling long-range dependencies. It was introduced in the paper
Attention Is All You Need by Ashish Vaswani et al. and brought about significant improvements,
including the following:
1. It allows learning entire sentences instead of sequences of words.
2. It allows models to be trained in parallel.
3. The model learns to distinguish the words in a sentence based on their context.
This has led to transformers being widely used, especially in the field of NLP.
Hugging Face has released a Python library with various pre-trained transformer models taken from a
number of recent research papers. These transformer models are deep learning models designed for
general language understanding that can be retrained via transfer learning to handle specific
challenges. I customized parts of Hugging Face's pre-trained NLP models, in particular the BERT model.
2.7 Gaps in the Literature
1. Deviation from the question asked by the user, hence returning an answer that makes no sense
according to the question posed
2. The user may not know the appropriate question to ask the model which decreases usability
2.8 Approach
1. The BERT-based model is trained on the SQuAD v1 dataset and is therefore conversant with natural
language.
SQuAD dataset - SQuAD is a dataset that tests a model's ability to read a passage of text and answer
questions about it.
2. The user is guided through the process. When the user imports the text file that he/she wants to
ask about, the application first summarizes the text. In the process the user gets a better
understanding of the context and of the type of questions he/she can ask to get the most out of the
system.
CHAPTER THREE
RESEARCH DESIGN AND METHODOLOGY
3.1 Introduction
This chapter outlines the many stages of the research. It explains the research design, the target
population, the sampling procedures used to obtain a representative sample size, the data collection
procedures and instruments, how the validity and reliability of the research instruments of data collection
were tested, and finally how the data was analyzed.
3.2 Research Design
Before anything else, we must first understand what research design is. It is the overarching method
adopted to combine the various components of the study in a logical and cohesive manner, ensuring
that the research problem is addressed effectively; it is the blueprint for data collection,
measurement and analysis. The type of design chosen should be determined by the research problem. My
research on question answering and automatic text summarization requires a qualitative research
design.
A qualitative research design is concerned with establishing answers to the whys and hows of the
phenomenon in question: How can we make machines understand natural language? How can we make them
reply in a way that is indistinguishable from a human? How can we make them better? Because of this,
qualitative research is often described as subjective rather than objective, and its findings are
gathered in written rather than numerical form. This means that, unlike quantitative data, the data
collected in qualitative research cannot usually be analyzed in a quantifiable way using statistical
techniques.
3.3 Target Population
The target population is the group of individuals or items that the research intends to study and
draw conclusions from.
The target population here is the SQuAD dataset on which the fine-tuned BERT model has been trained.
The SQuAD dataset is a collection of question-answer pairs derived from Wikipedia articles. In SQuAD,
the correct answer to a question can be any sequence of tokens in the given text. Because the
questions and answers are produced by humans through crowdsourcing, it is more diverse than some
other question-answering datasets.
3.4 Sampling Design
In research terms, a sample is a group of people, objects or items taken from a larger population for
measurement. The sample should be representative of the population to ensure that the findings from
the research sample can be generalized to the population as a whole.
I will use the SQuAD 2.0 dataset, which is representative of SQuAD v1, to evaluate the model and test
its performance.
3.5 Data collection Techniques
In my area of research, Natural Language Processing, there is a vast amount of data and many
datasets. Since I will not be training the deep learning model from scratch, I will use a model that
has been pre-trained on the SQuAD v1 dataset. Because the pre-trained model may not be enough on its
own, I will use transfer learning to train it further and attain higher accuracy.
Why a pre-trained BERT model? I will be using an already available fine-tuned BERT model from the
Hugging Face Transformers library. BERT can better understand longer queries and as a result surface
more appropriate results.
Transfer learning - Transfer learning is a machine learning technique in which a previously trained
model is reused as the basis for a new model on a different task. Simply put, a model trained on one
task is repurposed for a second, similar task as an optimization that allows faster progress on the
second task. With just a small amount of additional data, higher accuracy can be achieved.
3.6 Data analysis methods
Data analysis is the process of working on data with the purpose of arranging it correctly,
explaining it, making it presentable, and drawing a conclusion from it. It is done to find useful
information in data in order to make rational decisions.
It involves the systematic application of statistical and logical techniques to describe the scope of
the data, modularize the data structure, condense the data representation, illustrate it via images,
tables and graphs, and evaluate statistical tendencies and probabilities in order to derive
meaningful conclusions.
Data analysis tools are a series of charts and diagrams designed to interpret and present data for a
wide range of applications. However, my project does not entail any data collection or data analysis,
since I am using a pre-trained model from the Hugging Face library.
3.8.2 React
React is an open-source JavaScript library that was developed by Facebook and is currently maintained
by Facebook, Instagram, and community developers. To display views, which are rendered to the screen,
React employs a component-based system. The components are specified as custom HTML-like tags, which
makes them easy to use because they can be reused not only in different views but also within other
components.
React is also efficient when it comes to updating the HTML document with new data: the content is
re-rendered as the state changes. This enables React to display dynamic content on a web page without
requiring or changing anything on the server side. React is currently one of the most popular
front-end web libraries. Other popular choices include Angular and Vue, both of which are open-source
JavaScript frameworks. Google maintains Angular, while Vue is maintained by its creator and a smaller
team.
All three are component-based, which means the front end is built by assembling various components
into a finished product. They share similarities but also differ, and the difference that mattered
most for this project is the learning curve. Angular has the steepest learning curve, while Vue has
the flattest. Given this and the amount of time I had to work on this project, I decided not to use
Angular. Even though Angular is a powerful framework, the time required to get a front end up and
running was not worth it for this project.
Angular would be a better choice for larger applications with a longer time frame.
3.8.3 CSS
I considered using a CSS framework for the styling of my web application at the start of the project,
and I chose Bootstrap over the others. Bootstrap is one of the most popular CSS frameworks today.
It was chosen because of its popularity, its element styling, and my previous experience with the
framework. After some careful consideration, I decided to redesign my application and found that
Bootstrap was no longer useful for how I wanted my app to look. Instead, I used plain CSS to customize
the look of the web application.
3.8.5 FastAPI
FastAPI is described as a modern, fast web framework for creating Python APIs.
It is a simple framework for defining web endpoints to which requests can be made.
FastAPI's documentation describes its performance as on par with Node.js. Even though it is still
relatively new, it is gaining popularity and developers are beginning to use it for projects,
particularly those involving machine learning, such as SpaCy.
I also had Flask and Node.js in mind. Flask is a Python-based web framework with extensive
documentation. Node.js is an open-source server environment that allows JavaScript to be run on the
server.
When I first started this project, I considered whether to use Node.js or a Python-based framework. I
reasoned that because the NLP models were written in Python, keeping the server, the logic and the
models in one language would be easier than using two. I therefore chose a Python framework and began
with Flask. After comparing speeds, I switched to FastAPI, which I believed to be better in terms of
performance.
CHAPTER 4
This chapter presents the descriptive statistical analysis of the data, the interpretation of the
findings and the system design of the study. The work process includes an in-depth view of the data
preprocessing and feature engineering activities. The results of mathematical procedures are used to
assess the performance of the machine learning model on the dataset.
ii) Tokenization
Tokenizing is the process of splitting strings into a list of words. We will make use of Regular
Expressions or regex to do the splitting. Regex can be used to describe a search pattern.
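A minimal sketch of regex-based tokenization with Python's standard `re` module is shown below. The
pattern `\w+` and the lowercasing step are assumptions chosen for the example, not necessarily the
exact pattern used in this project; here `re.findall` collects the word matches directly rather than
splitting on separators.

```python
import re


def tokenize(text):
    """Split a string into lowercase word tokens using a regular expression."""
    # \w+ matches runs of letters, digits and underscores; punctuation is dropped.
    return re.findall(r"\w+", text.lower())


tokens = tokenize("Queen are a British rock band, formed in London in 1970.")
print(tokens)
```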
This section visualizes the words in the dataset using a word cloud and a count plot. These visual
representations make it possible to explore the dataset and gain insight into it.
Figure 4
The word cloud above shows the most frequent words in the dataset. I generated a word cloud for every
paragraph containing an article, so there were more than 50 word clouds in total. Here is another
word cloud.
Figure 5
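A word cloud is essentially a picture of word frequencies. The sketch below computes those
frequencies with the standard library only; the drawing step is left out, and the stop-word list and
the helper name `top_words` are assumptions made for this example.

```python
import re
from collections import Counter


def top_words(text, n=3, stop_words=frozenset({"the", "a", "an", "in", "of", "is", "are"})):
    """Return the n most frequent non-stop words with their counts."""
    words = [w for w in re.findall(r"\w+", text.lower()) if w not in stop_words]
    return Counter(words).most_common(n)


sample = "Rock band. The band played rock in London. The band toured."
counts = top_words(sample, n=2)
print(counts)
```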
4.3 Summary of Findings
Many question answering prediction models have been built, but most have not been deployed. Previous
systems were also less flexible: unlike this system, they did not accept various types of documents
and let the user ask any question about the uploaded document. They also lacked a summarization model
within the system, so this combination is something new.
The SummaQA system is a two-in-one system: a summarizer and a closed-domain QA model. I applied
transfer learning to the BERT QA model and trained it with SQuAD 2.0, which made it somewhat better
than other models trained on SQuAD 1.1. The system provides the user with a UI where he/she can upload
documents and ask questions about the uploaded document.
1. Usability - The model must be easy to use, accessible and easy to navigate by the user.
2. Reliability - The deployed model must consistently give accurate predictions to the user.
3. Performance - The prediction must not take long.
1. Accuracy & Performance - Most ML work reports on algorithm accuracy (often precision and recall),
i.e. how “correct” the output is compared to reality.
2. Security considerations - Efforts have been made to address the privacy concerns that arise when
personal data is used to facilitate ML.
3. Reliability - The SummaQA system should be reliable when it comes to ML predictions.
● Any of the following operating systems: Windows 7 SP1 32/64-bit, Windows 8.1 32/64-bit,
Windows 10 32/64-bit, Ubuntu 14.04 or later, or macOS Sierra or later.
● Browser: Google Chrome or Mozilla Firefox.
● Anaconda
● Python 3.6 or later
● cdQA == 1.3.9 - Closed Domain Question Answering Pipeline
● Matplotlib, offering many numerical computation and visualization tools.
Figure 6
In general, the structure works as follows.
The pool of articles in the diagram above is what the user uploads: when a user uploads a document,
it becomes the knowledge base of the system and is fitted to the model. The user then queries the
model. The query goes to the retriever, and the system outputs the prediction, which is what the
model thinks the answer is, together with the paragraph in which it found that answer.
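The retriever step can be illustrated with a small TF-IDF-style ranking over paragraphs. This is a
simplified sketch written for this report, not the retriever implementation inside cdQA; the function
names and the sample paragraphs are assumptions. In the full pipeline, the best-ranked paragraphs
would then be handed to the BERT reader, which extracts the answer span.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def retrieve(question, paragraphs):
    """Return the index of the paragraph with the highest TF-IDF overlap score."""
    docs = [Counter(tokenize(p)) for p in paragraphs]
    n_docs = len(docs)

    def idf(term):
        # Terms found in fewer paragraphs get a larger weight.
        df = sum(1 for d in docs if term in d)
        return math.log((1 + n_docs) / (1 + df)) + 1

    q_terms = set(tokenize(question))
    scores = [sum(d[t] * idf(t) for t in q_terms) for d in docs]
    return max(range(n_docs), key=lambda i: scores[i])


paragraphs = [
    "FastAPI is a web framework for building APIs with Python.",
    "Queen are a British rock band formed in London in 1970.",
    "React is a JavaScript library for building user interfaces.",
]
best = retrieve("When was the band Queen formed?", paragraphs)
print(paragraphs[best])
```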
Figure 7
4.5System Implementation and Testing
4.5.1Creating, Testing and Training Datasets
Figure 8
Data Fields
plain_text
● id: a string feature.
● title: a string feature.
● context: a string feature.
● question: a string feature.
● answers: a dictionary feature containing:
○ text: a string feature.
○ answer_start: an int32 feature.
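To show how these fields fit together, here is a hand-written record that follows the layout above.
It is an illustrative example, not an actual SQuAD entry, and the `id` and `title` values are made up.
The answers are stored as parallel lists, as in the Hugging Face `datasets` version of SQuAD, and
`answer_start` is the character offset of the answer inside `context`.

```python
CONTEXT = "Queen are a British rock band formed in London in 1970."

# Hand-written record following the field layout above (not taken from SQuAD).
example = {
    "id": "example-0001",
    "title": "Queen_(band)",
    "context": CONTEXT,
    "question": "When was Queen formed?",
    "answers": {"text": ["1970"], "answer_start": [CONTEXT.index("1970")]},
}


def answer_matches_context(record):
    """Check that each answer_start offset really points at the answer text."""
    context = record["context"]
    pairs = zip(record["answers"]["text"], record["answers"]["answer_start"])
    return all(context[start:start + len(text)] == text for text, start in pairs)


is_valid = answer_matches_context(example)
print(is_valid)
```

A check like this is useful before training, because a wrong offset would teach the model to point
at the wrong span.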
SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions
written adversarially by crowdworkers to look similar to answerable ones. And in order for the model to
perform well it must not only answer questions when possible, but also determine when no answer is
supported by the paragraph and abstain from answering.
Here I load the saved QA model using the following code:
import joblib

def load_model():
    # Load the serialized QA model from Google Drive and return it.
    model = joblib.load('/content/drive/MyDrive/coding/models/bert_qa.joblib')
    return model
The pipeline covers everything from tokenization and text cleaning to prediction. Once I fitted the
model with a document I had uploaded to the system, this is how it answered questions:
Figure 9
Since the pre-trained model was trained on SQuAD 1.1, I took the initiative of performing transfer
learning using SQuAD 2.0. I faced many challenges due to limited computing power: it took more than
10 hours to train the pre-trained model, and at times the runtime disconnected, which made the
process tiresome.
The code below extracts the data from the dict and appends it to lists to train the model with; I do
the same with the test data. With the data left as a dict, it would be hard to separate it into
questions and answers.
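import json
from pathlib import Path

path = Path('squad/train-v2.0.json')
# Load the SQuAD JSON file into a nested dict.
with open(path, 'rb') as f:
    squad_dict = json.load(f)
texts = []
queries = []
answers = []
# Flatten articles -> paragraphs -> questions -> answers into parallel lists.
for group in squad_dict['data']:
    for passage in group['paragraphs']:
        context = passage['context']
        for qa in passage['qas']:
            question = qa['question']
            for answer in qa['answers']:
                texts.append(context)
                queries.append(question)
                answers.append(answer)
train_texts, train_queries, train_answers = texts, queries, answers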
print(len(train_texts))
print(len(train_queries))
print(len(train_answers))
86821
86821
86821
print(len(val_texts))
print(len(val_queries))
print(len(val_answers))
20302
20302
20302
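from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Tokenize passage/question pairs (variable name and truncation=True are assumed here).
train_encodings = tokenizer(train_texts, train_queries, truncation=True, padding=True)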
I created a SquadDataset class (which inherits from torch.utils.data.Dataset) that converts the
encodings into datasets and made it easier to train and validate on the data prepared above.
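import torch
from transformers import BertForQuestionAnswering

class SquadDataset(torch.utils.data.Dataset):
    def __init__(self, encodings):
        self.encodings = encodings

    def __getitem__(self, idx):
        # Return one example as a dict of tensors.
        return {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}

    def __len__(self):
        return len(self.encodings.input_ids)

# Use the GPU when one is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = BertForQuestionAnswering.from_pretrained('bert-base-uncased').to(device)
epochs = 3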
EVALUATING FINE TUNED MODEL
Here I give my model some examples to see how well it was trained. I started with easier examples.
For extractive textual QA tasks, two evaluation metrics are usually adopted: exact match (EM) and F1
score.
● Exact Match: measures whether the predicted answer exactly matches a ground-truth answer. If it
does, the score is 1.0; otherwise it is 0.0.
● F1 Score: measures the word overlap between the predicted and ground-truth answers as the harmonic
mean of precision and recall, so a high score requires both to be high.
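The two metrics can be sketched in plain Python as follows. The normalization steps (lowercasing,
removing punctuation and the articles a/an/the) follow the common SQuAD convention; the function
names are chosen for this example, and this is an illustrative re-implementation rather than the
exact evaluation code used in the project.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    return int(normalize(prediction) == normalize(truth))


def f1_score(prediction, truth):
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    if not pred_tokens or not truth_tokens:
        # Both empty counts as a match; only one empty counts as a miss.
        return float(pred_tokens == truth_tokens)
    # Number of tokens shared by prediction and ground truth.
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


em = exact_match("British rock", "rock")
f1 = f1_score("British rock", "rock")
print(em, f1)
```

For the prediction "British rock" against the true answer "rock", precision is 1/2 and recall is 1,
so EM is 0 while F1 is 2/3: the answer is penalized for the extra word but still gets partial credit.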
Here is an example:
Here I took some content from Wikipedia pages to test my model. I observed that for questions whose
answer consists of more than one entity, separated by commas in the context, the model returns only
part of the list (as in the question about the members of the band). Moreover, when I asked what kind
of band they are, the model answered "British rock", although I did not ask about the origin of the
band.
context = """ Queen are a British rock band formed in London in 1970. Their
Brian May (guitar, vocals), Roger Taylor (drums, vocals) and John
rock. """
answers = ["1970",
"rock"]
give_an_answer(context,q,a)
Prediction: 1970
EM: 1
F1: 1.0
Prediction: freddie mercury ( lead vocals, piano ), brian may ( guitar, vocals ),
True Answer: Freddie Mercury, Brian May, Roger Taylor and John Deacon
EM: 0
F1: 0.6923076923076924
EM: 0
F1: 0.6666666666666666
CHAPTER 5
5.1 Introduction
This chapter covers the summary of the findings of the processes undertaken during the system
implementation and testing.
5.2Summary of analysis
As mentioned earlier, the BERT model for this research project addresses an NLP and Natural Language
Understanding challenge. The key metrics were accuracy, recall, precision and F1-score.
5.3 QA Metrics
The accuracy, recall and F1 score were computed for each question answered by the model on text it
had never seen before, as shown in the screenshots below.
There are two dominant metrics used by many question answering datasets, including SQuAD: exact match
(EM) and F1 score. These scores are computed on individual question+answer pairs. When multiple
correct answers are possible for a given question, the maximum score over all possible correct answers is
computed. Overall EM and F1 scores are computed for a model by averaging over the individual example
scores.
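The aggregation rule described here (take the best score over all acceptable answers for each
question, then average over questions) can be sketched as below. For brevity the example plugs in a
simplified exact-match function; `dataset_score` is a hypothetical helper name, and any per-pair
metric, such as a token-level F1, could be passed in instead.

```python
def exact_match(prediction, truth):
    """1.0 if the two answers match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == truth.strip().lower())


def dataset_score(predictions, references, metric=exact_match):
    """Average, over questions, of the best score against any reference answer."""
    per_question = [
        max(metric(pred, ref) for ref in refs)
        for pred, refs in zip(predictions, references)
    ]
    return sum(per_question) / len(per_question)


predictions = ["1970", "Brian May"]
references = [["1970", "in 1970"], ["Freddie Mercury"]]
overall_em = dataset_score(predictions, references)
print(overall_em)
```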
Results
Prediction: 21
True Answer: 21
EM: 1
F1: 1.0
F1: 1.0
EM: 1
F1: 1.0
context = """ Harry Potter is a series of seven fantasy novels written by British
Harry Potter, and his friends Hermione Granger and Ron Weasley, all
The main story arc concerns Harry's struggle against Lord Voldemort,
wizards and Muggles (non-magical people). Since the release of the first novel,
Harry Potter and the Philosopher's Stone, on 26 June 1997, the books
have found immense popularity, positive reviews, and commercial success worldwide.
As of February 2018, the books have sold more than 500 million
copies worldwide, making them the best-selling book series in history, and have
been translated
selling roughly
"How many languages Harry Potter has been translated into? "
answers = [
"J. K. Rowling",
"Lord Voldemort",
"non-magical people",
"eighty"
give_an_answer(context,q,a)
Model Output:
Prediction: j. k. rowling
EM: 1
F1: 1.0
Question: Who are Harry Potter's friends?
EM: 1
F1: 1.0
EM: 1
F1: 1.0
EM: 0
F1: 0.4
EM: 1
F1: 1.0
F1: 1.0
EM: 1
F1: 0.875
Question: How many languages Harry Potter has been translated into?
Prediction: eighty
EM: 1
F1: 1.0
am using google colab to create a remote local tunnel through which I now run my model over because it
5.5 Conclusion
In this project I have shown how this tool can help people with digital text comprehension, question
answering and text summarization. I made this contribution to test whether such a tool can be built
with currently existing technologies and limited hardware. I have also demonstrated what difficulties
lie ahead in turning it into a real-world application. The application managed to solve basic
problems with short to medium-sized texts. Longer texts and passages yielded inconsistent results,
and how the user formats his/her context played a big role in getting an accurate response, since one
can ask a question that has no answer in the text. The application also proved to be reasonably
reliable when processing texts with simpler content, while being inaccurate with more complex texts.
The results presented were accomplished with minimal hardware resources and free-to-use libraries,
frameworks and tools. Test groups were presented with the application, with a mixed reception. Many
users saw the potential use of the tool but did not express trust in the current implementation. The
reason many participants gave was that the time gained would not compensate for the number of
mistakes made by the algorithm. In summary, the application showed that solving the problems of text
comprehension is indeed feasible, even for smaller systems. However, it also showed that a more
sophisticated central model is needed to deal with text analysis. We also need a better way to communicate
While the application serves as an optimistic prototype, there are a couple of improvements that would be
necessary to make this a usable application in the real world. In this section, we outline these steps as a
GPU Support
The first step towards making the application more usable as a product is changing the hardware it runs on.
Instead of simply adding more CPU cores, moving the application to a hosting service with dedicated
CUDA enabled GPU support would make the NLP models much faster.
File Support
The system in its current state supports the file extensions TXT, PDF and ... For future improvements, we
would like to extend the capabilities of the file formats and types of documents that can be parsed, such as
DOC, DOCX, ODT, MD, and CSV. We also considered support for URLs, which could parse the text
content from a website. Furthermore, the ability to include and parse the content from an image, such as the
REFERENCES
C, L. (n.d.). An Introduction To Question Answering Systems. Section Engineering Education (EngEd)
Program. https://www.section.io/engineering-education/question-answering/
Hofesmann, E. (2021, January 21). The Machine Learning Lifecycle In 2021. Towards Data Science.
https://towardsdatascience.com/the-machine-learning-lifecycle-in-2021-473717c633bc
Dwivedi, S. (2013, December 28). Research And Reviews In Question Answering System. ScienceDirect.
https://www.sciencedirect.com/science/article/pii/S2212017313005409
Sabharwal, N., & Agrawal, A. (2021). Hands-On Question Answering Systems with BERT. Apress.
Available: https://streamlit.io
Available: https://www.machinelearningmastery.com
APPENDICES
Table 1
ACTIVITY BUDGET
Internet 1000
Transport 500
Gantt chart
Figure 9
PERT CHART
Figure 10
APPENDIX C: HARDWARE REQUIREMENTS
Hardware requirements
● Processor: i5 or better
● Memory: 8GB RAM
● Storage: 10 GB available space.
● Mouse/Keyboard
● Monitor (LCD, LED)
● Printer
APPENDIX D: SOFTWARE REQUIREMENTS
● Any of the following operating systems: Windows 7 SP1 32/64-bit, Windows 8.1 32/64-bit,
Windows 10 32/64-bit, Ubuntu 14.04 or later, or macOS Sierra or later.
● Browser: Google Chrome or Mozilla Firefox.
● Anaconda
● Python 3.6 or later
● cdQA == 1.3.9 - Closed Domain Question Answering Pipeline
● Matplotlib, offering many numerical computation and visualization tools.