A Multilingual Semantic Search Chatbot Framework
Corresponding Author:
Vinay R.
Department of Computer Science and Engineering, RV College of Engineering
RV Vidyanikethan Post, Mysore Road, Bengaluru 560059, Karnataka, India
Email: [email protected]
1. INTRODUCTION
Chatbots play a major part in enhancing user experience on websites due to ease of access and round
the clock availability. They can be useful in fast website navigation by directing the user to relevant parts of
the website based on the user’s query. Chatbots which answer queries about products on a website can also
significantly reduce the time and effort needed by the user to contact customer support. There are two broadly
used approaches to designing chatbots for this purpose. One approach involves comparing the user's query to the questions stored in a frequently asked questions (FAQ) database and retrieving the response associated with the stored question most similar to the user's query. This is achieved by generating vectorized representations of the questions
using techniques such as bag of words [1], term frequency-inverse document frequency (TF-IDF) [2],
Word2Vec [3], [4], global vectors for word representation (GloVe) [5], FastText [6] and then using a similarity
metric such as cosine distance or Euclidean distance to find the most similar question. The other approach
involves training a deep learning model by using the question answer pairs and then using this model to answer
user queries based on the relationships it has learnt between the questions and answers in the training set.
Recent trends in implementing this approach involve the use of recurrent neural networks (RNNs) [7], [8],
sequence to sequence (seq2seq) neural models [8], [9] and long short-term memory networks (LSTMs) [10],
[11]. Both the approaches require a very large set of FAQs to cover the entire site’s contents. In the second
approach, a large FAQ set is also necessary for training the model effectively since deep neural networks
(DNN) require large amounts of data to provide accurate results. Modern enterprises have websites which
contain many products with detailed information about each of them. Defining FAQs and their answers for each product is a laborious process and may still not cover all the contents of the site. Another drawback is that if the website is updated frequently, additional FAQs need to be created for each update. In the second approach, the model would also need to be retrained every time new FAQs are added. Both these tasks result in frequent updates to the chatbot's modules, which requires extra human and computational resources.
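For concreteness, a minimal sketch of the first (FAQ-matching) approach is given below. It assumes scikit-learn's TF-IDF vectorizer purely for illustration; the sample questions and answers are placeholders, and any of the embedding techniques cited above could be substituted for the vectorizer.

```python
# Minimal sketch of FAQ matching with TF-IDF vectors and cosine similarity.
# Requires scikit-learn; faq_questions/faq_answers are placeholder data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

faq_questions = ["How do I reset my password?", "What payment methods are accepted?"]
faq_answers = ["Use the 'Forgot password' link on the login page.", "We accept cards and UPI."]

vectorizer = TfidfVectorizer()
faq_matrix = vectorizer.fit_transform(faq_questions)  # one TF-IDF vector per stored question

def answer(query: str) -> str:
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, faq_matrix)[0]  # similarity to every stored question
    return faq_answers[scores.argmax()]                   # answer of the most similar question

print(answer("How can I change my password?"))
```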
To address the drawbacks outlined above, we propose a framework which combines the first approach (using vectorized
representations of questions to find the most similar question in FAQ database) with a natural language search
mechanism. The chatbot answers queries by first going through a knowledge base consisting of question
answer pairs, and if a satisfactory answer is not retrieved, it would then perform a search on the entire website’s
contents to give a relevant answer. The user is also presented with alternative answers at every stage. We have
implemented a feedback system wherein if the user finds the response satisfactory after the search phase, the
user query and answer are added to the knowledge base. This ensures that the computationally intensive search
phase is used less frequently as usage increases, giving the system a self-improving nature. Another objective of the framework is to support multiple languages, addressing the use case where enterprises maintain websites in native languages for users in countries where English is not the primary language. The system is built entirely from open-source technologies such as TensorFlow Hub, the Hugging Face Transformers library, and MongoDB.
The concept of a chatbot was first realised when ELIZA [12], a rule-based chatbot which used pattern matching, was introduced in 1966. It was used to simulate conversations with a psychotherapist. ALICE was another rule-based chatbot which improved over ELIZA by using a simple pattern matching algorithm [13]. Since both chatbots were rule based, they would not work well for any query that fell outside their rules. To build a chatbot without training from scratch, we would have to make use of transfer
learning. The advantages of using sentence-level embeddings instead of word-level embeddings for transfer learning tasks have been shown in recent works [14]. The Universal Sentence Encoder [15] was introduced to construct such sentence-level embeddings. Two variants were introduced: one uses a transformer architecture, the other a deep averaging network (DAN). The transformer architecture [16] is superior to RNNs and LSTMs in machine translation tasks and reduces training time. It relies mostly on an attention mechanism that lets the architecture model relationships between words in an input sequence, dispensing with recurrence and convolutions. The sentence encoder model that uses the transformer architecture generates sentence embeddings built from context-aware word representations which take into consideration the identity and order of the other words. The DAN [17] based sentence encoder produces sentence embeddings by passing averaged word and bi-gram embeddings through a feed-forward DNN.
The transformer-based model showcases higher accuracy whereas the DAN based model is faster. The addition
of multilingual convolutional neural network (CNN) and transformer-based variants of the Universal Sentence
Encoder [18], [19] which have robust performance in tasks like cross-lingual semantic retrieval makes it a very
versatile model. All the variants generate a vector of length 512 for a variable length input text sequence.
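As an illustration of how such embeddings can be obtained, the following is a minimal sketch using the multilingual Universal Sentence Encoder from TensorFlow Hub. The model URL and example sentences are shown only for demonstration; tensorflow, tensorflow_hub, and tensorflow_text are assumed to be installed.

```python
# Minimal sketch: 512-d sentence embeddings from the multilingual Universal Sentence Encoder.
import numpy as np
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401  (registers the ops the multilingual model needs)

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual/3")

sentences = ["How do I track my order?", "¿Cómo puedo rastrear mi pedido?"]
vectors = embed(sentences).numpy()  # shape (2, 512): one vector per sentence

# Cosine similarity between the English and Spanish questions
a, b = vectors
similarity = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(similarity)
```

Because the same encoder maps sentences from different languages into one vector space, a single threshold on this similarity score can be reused across languages.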
The text-to-text transfer transformer (T5) framework was introduced in [20]; it converts every natural language processing (NLP) task to a simplified common format which takes text strings as input and generates target text as output. The Colossal Clean Crawled Corpus (C4) text dataset, which is many times the size of Wikipedia, was used to train the T5 model. The model produced class-leading results on tasks such as machine translation and text classification. This common task format allows for very simple application of the framework to every problem, with the hyperparameters and model format remaining unchanged irrespective of the type of NLP task. This aspect of T5 enabled it to be used for question generation (QG). As stated earlier, the process of manually defining FAQs can be very time consuming. Therefore, QG can form a basis for chatbot data augmentation. Bidirectional encoder representations from transformers (BERT) [21] and generative pre-trained transformer 2 (GPT-2) [22] are some of the latest models which have been successful at generating questions given a context. The Stanford question answering dataset (SQuAD) [23] is a prominent dataset used for fine tuning models for this task. The dataset consists of paragraphs (contexts), questions about the paragraph content, and the answer to each question as a span within the context. It contains over 500 articles and more than 100,000 question-answer pairs. To fine-tune a model for QG with this format, the paragraph (context) is given as input and the questions based on that paragraph serve as the target output. Reference [24] proposes a T5-based QG model using this procedure, which produced results similar to previous class-leading QG models with far less training. To augment a chatbot's training data we require not only QG but question-answer pair generation. This can be achieved by using T5 in a multi-task setup, where it first selects spans of text as answers, then generates questions based on each answer, and finally acts as a question answering system by finding the answer to each generated question and evaluating its correctness by comparing the obtained answer with the actual answer [25].
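As a rough illustration of how a T5 model can be driven for QG with the Hugging Face Transformers library, the sketch below uses the generic t5-base checkpoint as a stand-in. The checkpoint name and the "generate question:" prompt prefix are assumptions for demonstration; in practice a checkpoint fine-tuned for QG on SQuAD-style data would be substituted.

```python
# Sketch of T5-based question generation (checkpoint and prompt format are placeholders).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5-base"  # placeholder: a QG fine-tuned checkpoint would be used in practice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

context = ("The framework stores question-answer pairs in a knowledge base and "
           "falls back to a full-text search when no stored question matches.")
inputs = tokenizer("generate question: " + context, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```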
The ability of a chatbot to search through text and not just rely on FAQ pairs can be implemented using advancements made in machine reading comprehension (MRC), which involves a machine combing through a set of paragraphs to answer questions about them. SQuAD [23] is one of the benchmark datasets used in MRC; its structure is described in the previous paragraph. It was an improvement over earlier datasets [26]-[29] because none of them combined high quality with large size, as SQuAD does. Several works describe class-leading techniques for MRC on SQuAD [21], [30]-[33]. Out of these, BERT [21] had the best performance. BERT utilizes the transformer architecture proposed in [16] together with a bidirectional training mechanism that uses masked language modelling (MLM) to randomly mask words and predict them from the words present before and after each masked position. This simultaneous consideration of the left and right surroundings of a word leads to better contextualised representations. Another training strategy used is next sentence prediction, where two statements are provided as input and the model predicts whether the second statement follows immediately after the first in the original text. This leads to the model learning relationships at the sentence level. The pretrained model is then fine tuned for
different tasks. The details of fine-tuning BERT on SQuAD for MRC are described in [21]. Pires et al. [34]
showcase that multilingual BERT [21] (pre-trained on text from 104 different languages) performs very well
when fine tuned in one language for a given task and evaluated on another language (zero shot cross lingual model transfer). The tasks used were part-of-speech (POS) tagging and named entity recognition (NER). Konovalov et al. [35] show multilingual BERT's promising performance in zero shot cross lingual model transfer for the MRC task.
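A minimal sketch of extractive question answering with the Hugging Face pipeline API is shown below. The English BERT checkpoint fine-tuned on SQuAD is used for illustration; for cross-lingual use, a multilingual BERT checkpoint fine-tuned the same way would be dropped in.

```python
# Sketch of extractive QA (MRC): the model returns an answer span from the context.
from transformers import pipeline

qa = pipeline("question-answering",
              model="bert-large-uncased-whole-word-masking-finetuned-squad")

result = qa(question="When was ELIZA introduced?",
            context="ELIZA, a rule-based chatbot that used pattern matching, "
                    "was introduced in 1966 to simulate a psychotherapist.")
print(result["answer"], result["score"])  # extracted span plus a confidence score
```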
This survey of chatbot history, benchmark datasets in the textual domain, and class-leading approaches for a variety of NLP tasks has been crucial in building a chatbot framework that can be used on enterprise websites. The chatbot framework we propose leverages sentence-level embeddings, question-answer pair generation, and MRC techniques. The survey has also helped us build our framework using only open-source technologies and ensure multilingual support. The related works were especially useful in modelling the requirements for an enterprise website chatbot in terms of different NLP tasks.
2. METHODOLOGY
In this section, a two-stage chatbot framework is proposed that can evolve with experience and works well for many languages. The discussion covers the design of each stage and how each contributes to answering the questions posed by the user.
After calculating the similarity scores, the answer for the question with the highest similarity score is selected as the answer to the query. A limiting threshold is applied, below which the question is declared not present in the database and the paragraph search is invoked. The limiting threshold is set to 0.75; it was found by testing the model on the Quora question pairs dataset, where values ranging from 0.70 to 0.80 were checked and the model performed best at 0.75. If the response satisfies the user, the user proceeds with the next question; otherwise, the user asks for other answers. The system then presents the user with the three next most similar questions stored in the data structure to determine whether the asked question matches any of them. If the user finds any of the questions similar, the system retrieves the answer for the chosen question. If the user is still dissatisfied with the answer, the paragraph search stage is invoked.
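A simplified sketch of this knowledge-base lookup stage is given below, assuming precomputed question embeddings. The embed() and paragraph_search() helpers are hypothetical placeholders standing in for the sentence encoder and the paragraph search stage.

```python
# Sketch of the knowledge-base lookup stage with the 0.75 similarity threshold.
import numpy as np

SIMILARITY_THRESHOLD = 0.75  # found empirically on the Quora question pairs dataset

def respond(query, kb_questions, kb_answers, kb_vectors, embed, paragraph_search):
    query_vec = np.asarray(embed([query]))[0]
    # Cosine similarity between the query and every stored knowledge-base question
    scores = kb_vectors @ query_vec / (
        np.linalg.norm(kb_vectors, axis=1) * np.linalg.norm(query_vec))
    best = int(np.argmax(scores))
    if scores[best] >= SIMILARITY_THRESHOLD:
        # Keep the next three most similar questions as alternatives for the user
        alternatives = [kb_questions[i] for i in np.argsort(scores)[::-1][1:4]]
        return kb_answers[best], alternatives
    # Below the threshold: treat the question as absent and fall back to paragraph search
    return paragraph_search(query), []
```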
question and paragraph. Here, zero shot cross lingual transfer [34], [35] is utilized for response prediction when the paragraph or queries are in languages other than English. The answer obtained is then sent to the user for feedback. If the user is satisfied, the user asks the next question; otherwise, the system returns the next best answers. If the user is still dissatisfied, the system may not have the required information and requests the user to reframe the question. Based on user feedback, the answer to a given query is either stored in the database or discarded completely.
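A minimal sketch of this feedback step using pymongo is given below; the database, collection, and field names are assumptions, and a running MongoDB instance is required.

```python
# Sketch of the feedback step: an accepted answer from the paragraph search is
# written back into the knowledge base so similar future queries skip the search.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
knowledge_base = client["chatbot"]["knowledge_base"]

def store_feedback(question, answer, embedding, accepted):
    if accepted:  # persist only pairs the user marked as satisfactory
        knowledge_base.insert_one({
            "question": question,
            "answer": answer,
            "embedding": [float(x) for x in embedding],  # store the 512-d vector for reuse
        })
    # otherwise the pair is discarded
```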
3.2. Dataset
Since the BERT model used is pre-trained and already fine-tuned on SQuAD 1.1, we did not retrain the model further on other datasets. Instead, the model is directly tested on SQuAD 2.0 and the cross-lingual question answering dataset (XQuAD). The model was fine tuned on English alone, but due to BERT's inherent capacity to generalise (i.e., zero shot learning), it was decided to test it on XQuAD without further training. The Quora question pairs dataset is used to evaluate how well the Universal Sentence Encoder can identify questions as similar based on a predefined threshold.
Table 2 gives a performance analysis of the approach adopted in this framework versus other models; the results in the table are taken from [40]. Given its zero-shot nature, the approach performs fairly well, although its accuracy is lower than that of other transformer models [20], [41], [42]. Although the results obtained are not the best, the model is retained because of its multilingual capacity and DAN architecture, which allow it to understand and correlate complex sentences. This is discussed further in the next section, where the results of the paragraph search are evaluated.
The F1 score and answer exact match are used to assess the answers obtained from the multilingual BERT model. The evaluation results are presented in Tables 4 and 5 for the XQuAD and SQuAD 2.0 datasets respectively. Here, four different F1 scores correspond to the responses obtained from the best paragraph selected and to the answers retrieved through the feedback mechanism. The multilingual BERT model achieves an 81.73 percent exact match score and an 89.009 F1 score on the SQuAD 1.1 dataset without the paragraph searching technique. This result is comparable to many class-leading models trained and tested on the SQuAD 1.1 dataset; these results are compared in Table 6, with values taken from [21]. Although the multilingual BERT model is not fine tuned on SQuAD 2.0 or XQuAD, it achieves reasonably good results, comparable to some pretrained models [43].
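For reference, a simplified version of the exact match and token-level F1 metrics is sketched below. It uses whitespace tokenisation and lighter normalisation than the official SQuAD evaluation script, so scores may differ slightly from the official values.

```python
# Sketch of SQuAD-style evaluation: exact match and token-level F1 between a
# predicted answer span and a gold answer (simplified normalisation).
from collections import Counter

def normalize(text):
    return " ".join(text.lower().split())

def exact_match(pred, gold):
    return float(normalize(pred) == normalize(gold))

def f1(pred, gold):
    pred_tokens, gold_tokens = normalize(pred).split(), normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)  # overlapping tokens
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("in 1966", "1966"), f1("in 1966", "1966"))
```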
4. CONCLUSION
In this paper, a chatbot framework is introduced which can be used on enterprise sites containing large amounts of information. The ability of our framework to perform natural language search over text such as the entire website content is useful in the above-mentioned use case because FAQs are limited in how much of the content they can cover. The use of QA pair generation as a data augmentation technique aims to eliminate the effort required for manual FAQ creation. The feedback-based update facilitates continuous improvement of the chatbot's performance by storing new QA pairs back into the database, thereby reducing the usage of the computationally intensive paragraph search phase. The paragraph search phase relies on embeddings obtained from the Universal Sentence Encoder for comparing similarities; if the embeddings are not stored in advance, the time taken to compute them for the available data increases. To reduce this overhead, the data can be clustered by similarity so that, at query time, comparing against a representative of a cluster yields a similarity score close to that of most of the data in the cluster. Using TF-IDF and other document reranking techniques to query the data can also speed up the process and reduce the dependency on the Universal Sentence Encoder for obtaining similarities. Fine tuning multilingual variants of the Universal Sentence Encoder and BERT on datasets such as XQuAD can further improve the performance of the proposed framework.
REFERENCES
[1] Y. Zhang, R. Jin, and Z. H. Zhou, “Understanding bag-of-words model: A statistical framework,” International Journal of Machine
Learning and Cybernetics, vol. 1, no. 1–4, pp. 43–52, 2010, doi: 10.1007/s13042-010-0001-0.
[2] S. Qaiser and R. Ali, “Text Mining: Use of TF-IDF to Examine the Relevance of Words to Documents,” International Journal of
Computer Applications, vol. 181, no. 1, pp. 25–29, 2018, doi: 10.5120/ijca2018917395.
[3] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” 1st International
Conference on Learning Representations, ICLR 2013 - Workshop Track Proceedings, pp. 1-12, 2013.
[4] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean, “Distributed Representations of Words and Phrases and their
Compositionality. arXiv e-prints, page,” Advances in neural information processing systems, vol. 26, pp. 1–9, 2013.
[5] J. Pennington, R. Socher, and C. D. Manning, “GloVe: Global vectors for word representation,” EMNLP 2014 - 2014 Conference on
Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp. 1532–1543, 2014, doi: 10.3115/v1/d14-1162.
[6] P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, “Enriching Word Vectors with Subword Information,” Transactions of the
Association for Computational Linguistics, vol. 5, pp. 135–146, 2017, doi: 10.1162/tacl_a_00051.
[7] T. Young, D. Hazarika, S. Poria, and E. Cambria, “Recent Trends in Deep Learning Based Natural Language Processing [Review
Article],” IEEE Computational Intelligence Magazine, vol. 13, no. 3, pp. 55–75, Aug. 2018, doi: 10.1109/MCI.2018.2840738.
[8] J. Cahn, “CHATBOT: Architecture, Design, and Development,” Senior Thesis, Department of Computer and Information Science,
University of Pennsylvania, Pennsylvania, USA, 2017.
[9] K. Cho et al., “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” EMNLP 2014-2014
Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp. 1724–1734, 2014, doi:
10.3115/v1/d14-1179.
[10] K. Ramesh, S. Ravi shankaran, A. Joshi, and K. Chandrasekaran, “A survey of design techniques for conversational agents,”
Communications in Computer and Information Science, vol. 750, pp. 336–350, 2017, doi: 10.1007/978-981-10-6544-6_31.
[11] H. Sak, A. Senior, and F. Beaufays, “Long short-term memory recurrent neural network architectures for large scale acoustic
modelling,” Fifteenth Annual Conference of the International Speech Communication Association, pp. 338–342, 2014.
[12] J. Weizenbaum, “ELIZA—a computer program for the study of natural language communication between man and machine,”
Communications of the ACM, vol. 9, no. 1, pp. 36-45, 1966.
[13] B. A. Shawar and E. Atwell, “ALICE chatbot: Trials and outputs,” Computacion y Sistemas, vol. 19, no. 4, pp. 625–632, 2015, doi:
10.13053/CyS-19-4-2326.
[14] A. Conneau, D. Kiela, H. Schwenk, L. Barrault, and A. Bordes, “Supervised learning of universal sentence representations from
natural language inference data,” EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings,
pp. 670–680, 2017, doi: 10.18653/v1/d17-1070.
[15] D. Cer et al., “Universal Sentence Encoder,” EMNLP demonstration, Association for Computational Linguistics, pp. 1–7, 2018,
doi: 10.48550/arXiv.1803.11175.
[16] A. Vaswani et al., “Attention is all you need,” in Advances in Neural Information Processing Systems, 2017, pp. 5999–6009.
[17] M. Iyyer, V. Manjunatha, J. Boyd-Graber, and H. Daumé, “Deep unordered composition rivals syntactic methods for text
classification,” Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International
Joint Conference on Natural Language Processing, vol. 1, pp. 1681–1691, 2015, doi: 10.3115/v1/p15-1162.
[18] Y. Yang et al., “Multilingual universal sentence encoder for semantic retrieval,” Proceedings of the Annual Meeting of the
Association for Computational Linguistics, pp. 87–94, 2020, doi: 10.18653/v1/2020.acl-demos.12.
[19] M. Chidambaram et al., “Learning cross-lingual sentence representations via a multi-task dual-encoder model,” ACL 2019 - 4th Workshop
on Representation Learning for NLP, RepL4NLP 2019-Proceedings of the Workshop, pp. 250–259, 2019, doi: 10.18653/v1/w19-4330.
[20] C. Raffel et al., “Exploring the limits of transfer learning with a unified text-to-text transformer,” Journal of Machine Learning
Research, vol. 21, no. 1, pp. 5485–5551, 2020.
[21] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language
understanding,” NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies - Proceedings of the Conference, vol. 1, pp. 4171–4186, 2019.
[22] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, “Language Models are Unsupervised Multitask Learners,”
Technical Report, OpenAI, pp. 1-24, 2019.
[23] P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang, “SQuAD: 100,000+ questions for machine comprehension of text,” EMNLP 2016-
Conference on Empirical Methods in Natural Language Processing, Proceedings, pp. 2383–2392, 2016, doi: 10.18653/v1/d16-1264.
[24] K. Grover, K. Kaur, K. Tiwari, Rupali, and P. Kumar, “Deep learning based question generation using T5 transformer,”
Communications in Computer and Information Science, vol. 1367, pp. 243–255, 2021, doi: 10.1007/978-981-16-0401-0_18.
[25] F. Ç. Akyön, D. Çavuşoğlu, C. Cengiz, S. O. Altinuç, and A. Temizel, “Automated question generation and question answering
from Turkish texts,” Turkish Journal of Electrical Engineering and Computer Sciences, vol. 30, no. 5, pp. 1931–1940, 2022, doi:
10.55730/1300-0632.3914.
[26] M. Richardson, C. J. C. Burges, and E. Renshaw, “MCTest: A challenge dataset for the open-domain machine comprehension of text,” EMNLP
2013-2013 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp. 193–203, 2013.
[27] J. Berant et al., “Modeling biological processes for reading comprehension,” EMNLP 2014 - 2014 Conference on Empirical
Methods in Natural Language Processing, Proceedings of the Conference, pp. 1499–1510, 2014, doi: 10.3115/v1/d14-1159.
[28] K. M. Hermann et al., “Teaching machines to read and comprehend,” Advances in Neural Information Processing Systems, vol.
2015, pp. 1693–1701, 2015.
[29] F. Hill, A. Bordes, S. Chopra, and J. Weston, “The Goldilocks principle: Reading children’s books with explicit memory representations,”
4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings, pp. 1-13, 2016.
[30] M. Seo, A. Kembhavi, A. Farhadi, and H. Hajishirzi, “Bi-directional attention flow for machine comprehension,” 5th International
Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings, pp. 1-13, 2017.
[31] C. Clark and M. Gardner, “Simple and effective multi-paragraph reading comprehension,” ACL 2018 - 56th Annual Meeting of the
Association for Computational Linguistics, Proceedings of the Conference (Long Papers), vol. 1, pp. 845–855, 2017.
[32] M. E. Peters et al., “Deep contextualized word representations,” NAACL HLT 2018 - 2018 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, vol. 1,
pp. 2227–2237, 2018, doi: 10.18653/v1/n18-1202.
[33] M. Hu, Y. Peng, Z. Huang, X. Qiu, F. Wei, and M. Zhou, “Reinforced Mnemonic reader for machine reading comprehension,”
IJCAI International Joint Conference on Artificial Intelligence, pp. 4099–4106, 2018, doi: 10.24963/ijcai.2018/570.
[34] T. Pires, E. Schlinger, and D. Garrette, “How multilingual is multilingual BERT?,” ACL 2019 - 57th Annual Meeting of the
Association for Computational Linguistics, Proceedings of the Conference, pp. 4996–5001, 2020, doi: 10.18653/v1/p19-1493.
[35] V. P. Konovalov, P. A. Gulyaev, A. A. Sorokin, Y. M. Kuratov, and M. S. Burtsev, “Exploring the BERT cross-lingual transfer for reading
comprehension,” in Proceedings of the International Conference, pp. 1–9, 2020, doi: 10.28995/2075-7182-2020-19-445-453.
[36] A. Asai, K. Hashimoto, H. Hajishirzi, R. Socher, and C. Xiong, “Learning to retrieve reasoning paths over Wikipedia graph for
question answering,” 8th International Conference on Learning Representations (ICLR), pp. 1-20, 2020.
[37] Y. Nie, S. Wang, and M. Bansal, “Revealing the importance of semantic retrieval for machine reading at scale,” EMNLP-IJCNLP
2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural
Language Processing, Proceedings of the Conference, pp. 2553–2566, 2019, doi: 10.18653/v1/d19-1258.
[38] K. Nishida et al., “Answering while summarizing: Multi-task learning for multi-hop QA with evidence extraction,” ACL 2019 - 57th Annual
Meeting of the Association for Computational Linguistics, Proceedings of the Conference, pp. 2335–2345, 2020, doi: 10.18653/v1/p19-1225.
[39] Y. Zhang, P. Nie, A. Ramamurthy, and L. Song, “Answering any-hop open-domain questions with iterative document reranking,”
SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval,
pp. 481–490, 2021, doi: 10.1145/3404835.3462853.
[40] L. Sharma, L. Graesser, N. Nangia, and U. Evci, “Natural language understanding with the quora question pairs dataset,” arXiv-Computer
Science, pp. 1-10, 2019.
[41] P. He, X. Liu, J. Gao, and W. Chen, “DeBERTa: Decoding-enhanced BERT with disentangled attention,” ICLR 2021 - 9th International
Conference on Learning Representations, pp. 1-21, 2021.
[42] Z. Lan et al., “ALBERT: A lite BERT for self-supervised learning of language representations,” ICLR 2020 - 8th International Conference on Learning Representations, pp. 1-17, 2020.
[43] Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le, “XLNet: Generalized autoregressive pretraining for
language understanding,” Advances in Neural Information Processing Systems, vol. 32, pp. 1-11, 2019.
BIOGRAPHIES OF AUTHORS
Vinay R. graduated from RVCE, Bangalore in 2022. His areas of interest include
NLP and software engineering applications. He is currently working at Akamai
Technologies. He can be contacted at email: [email protected].