Proposing a task-oriented chatbot system for EFL learners speaking practice
To cite this article: Mei-Hua Hsu, Pei-Shih Chen & Chi-Shun Yu (2023) Proposing a task-
oriented chatbot system for EFL learners speaking practice, Interactive Learning Environments,
31:7, 4297-4308, DOI: 10.1080/10494820.2021.1960864
ABSTRACT
Many learners of English as a foreign language often feel that learning spoken English is frustrating and quite difficult, especially when they have to talk to English-speaking foreigners. In general, because they are unfamiliar with the spoken mode of English and are worried about making grammatical errors, they often feel very scared to speak English. Furthermore, spoken language and reading differ in many ways. For instance, when seeing text containing new words, learners can stop and look up the words in a dictionary. Additionally, a passage can be read multiple times by learners to understand it. Conversely, spoken language must be understood immediately in order to communicate effectively. The purpose of this study is to propose an interactive chatbot system named TPBOT, which stands for "TOEIC Practice Chatbot", for EFL learners to eliminate their fear of speaking English and enable them to chat with online chatbots to practice spoken English at any time. The TPBOT is intended to help eliminate learners' anxiety about speaking a foreign language with foreigners. The participants in this study are Taiwanese students whose oral scores on the TOEIC® test are below 100 and who hope to improve their oral English ability through the four-month experiment. The results show that students are satisfied with using the TPBOT and believe that it has indeed helped them improve their English speaking skills; the TPBOT was therefore effective for its intended purpose. For educators, the TPBOT is a useful platform on which to build learning content, provide learners with interactive exercises, and improve learning outcomes.
KEYWORDS
Distance education and online learning; Human-computer interface; Improving classroom teaching; Informal learning; Mobile learning
1. Introduction
Learning English is usually divided into four skills: listening, speaking, reading, and writing. For non-native English speakers, oral English skills are needed the moment they communicate face-to-face with native English speakers. In general, Taiwanese students commonly feel shy and anxious when they are asked to speak English in class. This is the so-called foreign language phobia, also known as xenoglossophobia: the feeling of panic, worry, tension, and anxiety experienced when using a second or foreign language. Young (1990) pointed out that it is not speaking a foreign language per se that provokes student anxiety, but speaking a foreign language in front of the class. Therefore, Young (1991) tried to change the classroom environment to reduce students' anxiety. Minghe and Yuan (2013) likewise identified speaking in front of others as a key factor that causes anxiety. Fortunately, EFL learners can use computer-assisted methods to reduce oral anxiety (Tallon, 2009).
Now that smartphones are easy to carry and courses can easily be customized for learners, mobile learning has become commonplace. Learners can therefore learn anytime and anywhere, and gaining knowledge is becoming increasingly convenient. In addition, Miangah and Nezarat (2012) stated that the oral function of mobile learning is very important. If mobile learning can cover interactive oral practice, it would therefore be of great help to language learners. For these reasons, the AI chatbot designed in this study aims to be an effective learning tool that alleviates students' anxiety about speaking foreign languages. In particular, using an AI chatbot as a speaking practice tool is an effective way to improve oral English ability because it can minimize the problems encountered in language classes and reduce oral English anxiety.
Computer-based training is an effective way for learners to improve their English language skills. Wang and Munro (2004) proposed computer-based training for learning English vowel contrasts, and Hsu (2008a, 2008b) proposed a personalized English learning recommender system for ESL students.
Since 2011, Facebook Inc. has expanded Facebook Messenger (commonly known as Messenger) from a pure messaging application into a multi-purpose platform that allows users to send messages and exchange photos, audio, and files. At that time, more than 40 standalone applications, including video editing and animation, were provided. A more recent addition is the Messenger bot program: users can react to other users' messages and even interact with bots.
A chatbot, short for chat robot, is a software application designed to interact with users using text or text-to-speech functionality. Designed to make users feel as if they are talking to a live human agent, most chatbots use artificial intelligence (AI) algorithms or natural language processing to generate the necessary responses. Early chatbots created only an illusion of intelligence, using simple pattern matching and string processing techniques to interact with users based on rule-based and generation-based models. However, with the advent of new technologies, more intelligent systems have emerged that use models based on complex knowledge.
There are many successful computer applications for spoken-language training, and chatbots have been developed to support learning and teaching in different disciplines. For instance, Freudbot was developed to investigate chatbot technology with psychology students in distance education (Heller et al., 2005). A basic analysis of the chat records showed a high proportion of task execution, and the results suggested that chatbot technology can become a teaching and learning tool in distance learning and online education (Heller et al., 2005). In science lectures, the use of chatbots has also been compared with humanoid robots, and it was reported that visualization using chatbots helps students to follow lectures smoothly (Matsuura & Ishimura, 2017). However, research on the use and development of chatbots to enhance language learning remains challenging.
Project LISTEN’s virtual reading instructor (Mostow et al., 2013) uses speech recognition to monitor
children’s speech when reading aloud, thereby supporting elementary school students in improving
their reading skills in their first language, English. Systems such as Robo-Sensei (Nagata, 2009) have
implemented natural language processing algorithms to provide adult learners with semantic and syn-
tactic feedback and improve foreign language grammar skills. Physical robots have also been shown to
be effective in helping children to learn second-language vocabulary (Vogt et al., 2019). Hsu (2008a) pro-
posed that robots can help young children to understand and create stories.
Without a doubt, the development of educational technology has combined innovative teaching
methods and new technologies to enhance the learning experience of students. Virtual assistant or
chatbot technology simplifies and enhances the learning process of students by integrating teaching
methods and innovative technologies.
In addition, Hussain et al. (2019) noted the need to discuss the classification of chatbots, the
design techniques used in early and modern chatbots, and the ways in which the two main cat-
egories of chatbots handle conversation contexts. Winkler and Soellner (2018) proposed that chat-
bots are in the early stages of use. Several studies have shown the potential of chatbots to improve
the learning process and learning outcomes. However, past research has shown that the effective-
ness of chatbots in education is complex and depends on many factors.
Kozma (2001) pointed out that specific properties of computers are needed to provide learners
with real models and simulations; therefore, the medium does affect learning. However, what stu-
dents learn is not the computer itself, but the design of real models and simulations, and students
interact with these models and simulations. In addition, AutoTutor is an intelligent tutoring system that helps students learn Newtonian physics, computer literacy, and critical thinking through natural language tutorial dialogues (Graesser et al., 2005). In this study, the TPBOT system is a task-oriented learning system that focuses on TOEIC speaking training, and its development was inspired by the learning methods of Kozma (2001) and AutoTutor (Graesser et al., 2005). The difference between the TPBOT and AutoTutor is that the TPBOT system was written in Python and aims to let learners easily engage in interactive conversations that encourage them to improve continuously. For instance, when the learner selects a topic and speaks to the TPBOT, the TPBOT employs speech recognition to determine the appropriate response so that the learner can improve speaking through continuous imitation and practice.
The research questions of this study are as follows:
(1) How can non-native English speakers improve their spoken English and eliminate their anxiety when talking to English-speaking foreigners?
(2) At present, many Taiwanese enterprises require their employees to submit TOEIC test scores to prove their English ability, and many of these companies especially value their employees' spoken English. Given this, how can college students improve their spoken English ability to benefit their future employment?
(3) How can the AI chatbot proposed in this study, called TPBOT, help learners achieve these goals through self-learning, eliminating foreign language phobia, and improving TOEIC speaking scores?
2. Method
2.1. Study design
This study used an experimental method to determine whether the use of the TPBOT can improve students' TOEIC speaking scores. The design comprised an experimental group and a control group, and both groups of students took a TOEIC speaking simulation pretest and posttest. After taking the pretest, students in the experimental group were given the TPBOT as a self-learning tool and had to use it for at least one hour a week during a four-month oral training experiment; part of the exercise content is shown in Figure 1. The students in the control group did not use the TPBOT, so that the differences between the two groups could be compared in the experimental results.
2.2. Participants
The participants in this study were 100 students from a university in northern Taiwan. They took part in a TOEIC speaking simulation test, and the results showed that 48 students scored less than 100 out of 200 points on the test.
In this study, a simple random sampling (SRS) method was used to divide the participants into an experimental group and a control group. Each of the 48 students was assigned a unique number. The RANDBETWEEN function in MS Excel was then used to generate random numbers between 1 and 48 to select 24 students; if a number was drawn more than once, it was discarded and a new number was generated until 24 different numbers had been obtained. These 24 students were assigned to the experimental group, and the remaining 24
students were assigned to the control group. The students in the experimental group received the
TPBOT to help them with speaking practice. In contrast, the control group did not; they used only a TOEIC textbook with MP3 files or audio CDs to practice English speaking and listening.
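For illustration, the random assignment described above can also be reproduced programmatically. The following short sketch uses Python's random module instead of the Excel RANDBETWEEN procedure and is only an equivalent of the sampling logic, not the authors' actual method.

# Illustrative equivalent of the simple random sampling step (not the authors' Excel procedure).
import random

students = list(range(1, 49))                 # each of the 48 students gets a unique number
experimental = random.sample(students, 24)    # 24 distinct numbers, so no repeated draws
control = [s for s in students if s not in experimental]

print(sorted(experimental))
print(sorted(control))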
On the server side, the system requires Python version 3.7, Ngrok, and a LINE Developers account. The other packages employed include line-bot-sdk version 1.8.0, pydub version 0.23.1, and speech_recognition version 3.8.1; pydub quickly converts audio files into the required format, and speech_recognition allows the TPBOT to recognize speech in many languages through Google's recognition service.
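As a rough illustration of how these components fit together, the following is a minimal webhook sketch in the style of line-bot-sdk 1.x. It assumes a Flask application exposed through Ngrok; Flask itself, the credentials, and the handler names are placeholders rather than the actual TPBOT code.

# Minimal sketch of a LINE webhook server (assumed Flask app exposed via Ngrok).
from flask import Flask, request, abort
from linebot import LineBotApi, WebhookHandler
from linebot.exceptions import InvalidSignatureError
from linebot.models import MessageEvent, TextMessage, TextSendMessage

app = Flask(__name__)
line_bot_api = LineBotApi("CHANNEL_ACCESS_TOKEN")   # placeholder credential
handler = WebhookHandler("CHANNEL_SECRET")          # placeholder credential

@app.route("/callback", methods=["POST"])
def callback():
    # LINE signs every request; reject anything that fails signature verification.
    signature = request.headers.get("X-Line-Signature", "")
    body = request.get_data(as_text=True)
    try:
        handler.handle(body, signature)
    except InvalidSignatureError:
        abort(400)
    return "OK"

@handler.add(MessageEvent, message=TextMessage)
def on_text(event):
    # For example, reply with the practice menu when the user opens the conversation.
    line_bot_api.reply_message(
        event.reply_token,
        TextSendMessage(text="Choose an exercise: listening / speaking / reading"))

if __name__ == "__main__":
    app.run(port=8000)  # expose this port with: ngrok http 8000

The steps below describe how the TPBOT handles a practice session once such a webhook is in place.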
(1) When the user logs in to the dialogue window with the TPBOT, a practice menu will appear to
allow the user to choose the type of exercise she/he wants. In Figure 5, listening exercises, speak-
ing exercises, and reading exercises are shown in sequence.
(2) When receiving an exercise request from a user, the system first queries the database to check whether the user is new or returning. If the user is new, the system stores the new user's data in the User's Data Sheet of the database, records the exercise currently selected by the user, and then sends the user the first exercise dialogue, as shown in Figure 6.
(3) After the user returns a voice message for the dialogue prompted by the system, the system converts the voice message with the pydub package through the FFmpeg software (.m4a to .wav) and saves it.
(4) The system then uses the speech_recognition package to read the converted voice message and sends it to Google for speech recognition (a sketch of steps (3), (4), and (8) is given after this list).
(5) The system retrieves the current user's exercise and the dialogue code from the database, uses the code to look up the corresponding dialogue text in the Questions data table, and temporarily stores it in the system.
(6) The system compares the text returned by Google speech recognition with the user's current dialogue text temporarily stored in step (5).
(7) The system determines whether the sentence spoken by the user completely matches the standard answer in the database. If there is no complete match, the fuzzywuzzy package is used to check whether the similarity between the recognized text and the sentence the user is guided to say is acceptable; fuzzywuzzy calculates the difference between sequences using the Levenshtein (1966) distance. If the similarity is acceptable, the response message is converted into a voice reply with the Google text-to-speech service, and the sentence the user is guided to say next is appended to complete this turn of the conversation. If the system does not accept the user's speech, this indicates that the user's grasp of the sentence is very low; the system then sends a voice message demonstrating the correct pronunciation of the sentence and a text message highlighting the words the user needs to pay attention to, based on the comparison (a second sketch of this matching-and-response step is given after the list).
(8) When the system fails to interpret the user's answer, the reason may be that the user is silent, the environment is too noisy for recognition, or the network connection is unstable. The system replies "Sorry, I didn't quite catch that." and prompts the user to send the voice message again to facilitate recognition, as shown in Figure 7.
(9) If the server is disconnected during an exercise or the exercise is interrupted, the server records the position of the user's last dialogue (for example, the third sentence of exercise 2). When the user enters the system again, she/he can choose to continue the previous exercise or start a new one.
Figure 7. Response when the system cannot recognize the user’s answer
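Steps (3), (4), and (8) above can be summarized in a short sketch. It relies on the packages listed earlier (pydub with FFmpeg installed, speech_recognition); the file names are placeholders, and the sketch illustrates the flow rather than reproducing the TPBOT's exact code.

# Sketch of steps (3), (4) and (8): convert the LINE voice message and transcribe it.
# Assumes FFmpeg is installed for pydub; file names are placeholders.
import speech_recognition as sr
from pydub import AudioSegment

def transcribe_voice_message(m4a_path):
    # Step (3): convert the received .m4a voice message to .wav with pydub/FFmpeg.
    wav_path = m4a_path.rsplit(".", 1)[0] + ".wav"
    AudioSegment.from_file(m4a_path, format="m4a").export(wav_path, format="wav")

    # Step (4): read the converted file and send it to Google for speech recognition.
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    try:
        return recognizer.recognize_google(audio, language="en-US")
    except (sr.UnknownValueError, sr.RequestError):
        # Step (8): silence, background noise, or network problems; ask the user to repeat.
        return None

text = transcribe_voice_message("reply.m4a")  # hypothetical file name
if text is None:
    print("Sorry, I didn't quite catch that.")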
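Step (7) can likewise be sketched with fuzzywuzzy and a text-to-speech call. The acceptance threshold, the gTTS package used for synthesis, and the expected-sentence lookup are assumptions made only for this illustration; the paper itself states only that fuzzywuzzy's Levenshtein-based similarity and the Google text-to-speech service are used.

# Sketch of step (7): compare the recognized text with the expected sentence and build a spoken reply.
from fuzzywuzzy import fuzz
from gtts import gTTS  # assumed package; the paper only mentions "Google text-to-speech service"

ACCEPT_THRESHOLD = 80  # assumed similarity cutoff on fuzzywuzzy's 0-100 scale

def check_answer(recognized, expected, next_prompt):
    # An exact match, or a Levenshtein-based similarity above the threshold, is accepted.
    is_exact = recognized.strip().lower() == expected.strip().lower()
    similarity = fuzz.ratio(recognized.strip().lower(), expected.strip().lower())
    if is_exact or similarity >= ACCEPT_THRESHOLD:
        reply = "Well done. Now try to say: " + next_prompt
    else:
        # Low similarity: demonstrate the correct sentence so the learner can imitate it.
        reply = "Please listen and repeat: " + expected
    # Convert the textual reply into a voice message for the learner.
    gTTS(reply, lang="en").save("reply.mp3")
    return reply

print(check_answer("how much is this ticket", "How much is this ticket?", "Where is the bus stop?"))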
3.3. Result
In this study, the experimental results showed that students who took part in oral training with the TPBOT made significant improvements in their oral English ability. In contrast, students who did not use the TPBOT made limited or no progress at all. Without a doubt, a chatbot such as the TPBOT that provides interactive functions is a very useful tool for helping learners improve their spoken English.
The experimental results showed that the students in the experimental group improved by an average of 65 points, but the standard deviation was 17.46, which means that the degree of improvement varied substantially among students. An explanation can be found in the results of the questionnaire survey. Most students found it very interesting to learn spoken English by using an interactive online chatbot; in particular, online practice on a smartphone is very convenient. However, as shown in Figure 8, there were still a small number of students whose practice frequency was relatively low and whose learning progress was correspondingly limited. By contrast, the average score of the students in the control group dropped by 2 points, with no difference in the standard deviation, indicating that their degree of progress did not change much. The results of this study and the report of Ruan et al. (2019) both confirm that chatbots play an important role in users' learning.
In addition, a satisfaction survey was conducted at the end of the training. The scale ranged from 1 (not helpful) to 5 (very helpful). According to the results of the satisfaction questionnaire, the average scores given by the experimental group students all exceeded 4.0. Most students found the TPBOT helpful and felt that their TOEIC performance had improved (see Table 5).
3.4. Discussion
The results of the satisfaction survey of the experimental group students were favorable, as shown in Table 5. It is obvious from this result that the TPBOT is indeed a good self-learning tool, which answers the research questions. More importantly, when EFL learners practice oral English with the TPBOT, it can help them overcome their deep-rooted fear of speaking English, and because of the convenience of mobile phones they can practice oral English with the TPBOT at any time. It can be seen that the TPBOT is a helpful assisted self-learning tool for reducing learners' anxiety about speaking foreign languages with foreigners. This result is consistent with the findings on web-based language learning and speaking anxiety reported by Bashori et al. (2020). As pointed out by Minghe and Yuan (2013), "effective learning and practicing out of class is also important, which not only contributes to improving students' oral proficiency, but also helps them keep strong motivation and reduce the anxiety of learning oral English." The results of the analysis indicate that there is a significant difference between the means of the control group and the experimental group, with the experimental group having a significantly higher mean than the control group.
The test results showed that the students in the experimental group made significant progress. In
contrast, the students in the control group did not. After the students in the control group learned
that the students in the experimental group had greatly improved their oral English after using
the TPBOT, their motivation also increased and they expected to use the TPBOT for oral practice.
Notes on contributor
Mei-Hua Hsu received the Ed.D. degree in Instructional Technology from Nova Southeastern University, FL, USA. She is
now an associate professor at the Center of General Education of Chang Gung University of Science and Technology,
responsible for teaching English for Nursing students. From 2014 to 2019, she signed the industry-academia
cooperation contract with Chang Gung Memorial Hospital, one of Taiwan’s largest and most well-known hospitals,
and was responsible for training the hospital’s staff in English courses. Her recent research interest is applying
Chatbot in education and helping EFL students learn English.
ORCID
Mei-Hua Hsu http://orcid.org/0000-0003-2689-9774
References
Bashori, M., van Hout, R., Strik, H., & Cucchiarini, C. (2020). Web-based language learning and speaking anxiety. Computer
Assisted Language Learning, 1–32. doi:10.1080/09588221.2020.1770293
Drake, J. D., & Worsley, J. C. (2002). Practical PostgreSQL. Boston, MA: O’Reilly Media, Inc.
Gao, J., Galley, M., & Li, L. (2019). Neural approaches to conversational AI: Question answering, task-oriented dialogues
and social chatbots. Now Foundations and Trends, 56–88. doi:10.1561/1500000074
Graesser, A. C., Chipman, P., Haynes, B. C., & Olney, A. (2005). AutoTutor: An intelligent tutoring system with mixed-
initiative dialogue. IEEE Transactions on Education, 48(4), 612–618. doi:10.1109/TE.2005.856149
Heller, B., Proctor, M., Mah, D., Jewell, L., & Cheung, B. (2005, June). Freudbot: An investigation of chatbot technology in
distance education. In P. Kommers & G. Richards (Eds.), EdMedia + innovate learning (pp. 3913–3918). Montreal,
Canada: Association for the Advancement of Computing in Education (AACE).
Hsu, M. H. (2008a). A personalized English learning recommender system for ESL students. Expert Systems with
Applications, 34(1), 683–688. doi:10.1016/j.eswa.2006.10.004
Hsu, M. H. (2008b). Proposing an ESL recommender teaching and learning system. Expert Systems with Applications, 34
(3), 2102–2110. doi:10.1016/j.eswa.2007.02.041
Hussain, S., Sianaki, O. A., & Ababneh, N. (2019, March). A survey on conversational agents/chatbots classification and
design techniques. In L. Barolli, M. Takizawa, F. Xhafa & T. Enokido (Eds.), Web, artificial intelligence and network appli-
cations. WAINA 2019. Advances in Intelligent Systems and Computing (pp. 946–956). Matsue, Japan: Springer, Cham.
doi:10.1007/978-3-030-15035-8_93
Kozma, R. B. (2001). Counterpoint theory of “learning with media.” In R. E. Clark (Ed.), Learning from media: Arguments,
analysis, and evidence (pp. 137–178). Greenwich, CT: Information Age Publishing Inc.
Levenshtein, V. I. (1966, February). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics
Doklady, 10(8), 707–710.
Matsuura, S., & Ishimura, R. (2017, July). Chatbot and dialogue demonstration with a humanoid robot in the lecture class.
In M. Antona, & C. Stephanidis (Eds.), International Conference on Universal Access in Human-Computer Interaction (pp.
233–246). Vancouver, Canada.
Miangah, T. M., & Nezarat, A. (2012). Mobile-assisted language learning. International Journal of Distributed and Parallel
Systems, 3(1), 309–319. doi:10.5121/ijdps.2012.3126
Minghe, G., & Yuan, W. (2013). Affective factors in oral English teaching and learning. Higher Education of Social Science, 5(3), 57–61. doi:10.3968/j.hess.1927024020130503.2956
Mostow, J., Nelson-Taylor, J., & Beck, J. E. (2013). Computer-guided oral reading versus independent practice:
Comparison of sustained silent reading to an automated reading tutor that listens. Journal of Educational
Computing Research, 49(2), 249–276. doi:10.2190/EC.49.2.g
Nagata, N. (2009). Robo-Sensei’s NLP-based error detection and feedback generation. Calico Journal, 26(3), 562–579.
doi:10.1558/cj.v26i3.562-579
Ruan, S., Willis, A., Xu, Q., Davis, G. M., Jiang, L., Brunskill, E., & Landay, J. A. (2019, June). Bookbuddy: Turning digital
materials into interactive foreign language lessons through a voice chatbot. In Proceedings of the Sixth (2019)
ACM Conference on Learning@ Scale (pp. 1–4). Chicago, IL, USA. doi:10.1145/3330430.3333643
Tallon, M. (2009). The effects of computer-mediated communication on foreign language anxiety in heritage and non-
heritage students of Spanish: A preliminary investigation. TPFLE, 13(1), 39–66. doi:10.1111/j.1944-9720.2009.01011.x
Vogt, P., van den Berghe, R., de Haas, M., Hoffman, L., Kanero, J., Mamus, E., … Pandey, A. K. (2019, March). Second
language tutoring using social robots: a large-scale study. 14th ACM/IEEE International Conference on Human-
Robot Interaction (HRI) (pp. 497–505). Daegu, Korea: IEEE.
Wang, X., & Munro, M. J. (2004). Computer-based training for learning English vowel contrasts. System, 32(4), 539–552.
doi:10.1016/j.system.2004.09.011
Winkler, R., & Soellner, M. (2018). Unleashing the potential of chatbots in education: A state-of-the-art analysis.
Young, D. J. (1990). An investigation of students’ perspectives on anxiety and speaking. Foreign Language Annals, 23(6),
539–553. doi:10.1111/j.1944-9720.1990.tb00424.x
Young, D. J. (1991). Creating a low-anxiety classroom environment: What does language anxiety research suggest? The
Modern Language Journal, 75(4), 426–437. doi:10.1111/j.1540-4781.1991.tb05378.x