LLM Intelligent Agent Tutoring in Higher Education Courses using a RAG Approach
Horia Modran *, Ioana Corina Bogdan, Doru Ursuțiu, Cornel Samoila, Paul Livius Modran
doi: 10.20944/preprints202407.0519.v1
Copyright: This is an open access article distributed under the Creative Commons
Attribution License which permits unrestricted use, distribution, and reproduction in any
medium, provided the original work is properly cited.
Preprints.org (www.preprints.org) | NOT PEER-REVIEWED | Posted: 5 July 2024 doi:10.20944/preprints202407.0519.v1
Disclaimer/Publisher’s Note: The statements, opinions, and data contained in all publications are solely those of the individual author(s) and
contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting
from any ideas, methods, instructions, or products referred to in the content.
Article
Abstract. Conventional tutoring approaches are confronted with limitations such as restricted availability,
inconsistent pedagogical quality, and scalability constraints. Furthermore, the exclusive use of Large Language
Models (LLMs) like ChatGPT in education has its shortcomings, including the potential for incorrect responses
and the lack of customized guidance aligned with specific course content. This research proposes an innovative
intelligent chatbot tutoring system, integrating the Retrieval Augmented Generation (RAG) approach with a
custom LLM. The developed system aims to overcome the limitations of traditional tutoring and general-
purpose LLMs by providing accurate, contextually relevant, and personalized assistance, thereby enhancing
student understanding and engagement. The system, powered by an intelligent agent, retrieves relevant
information from curated academic sources, incorporates interactive features for user feedback, and utilizes
machine learning algorithms for ongoing performance enhancement, ensuring a robust and effective tutoring
experience. The anticipated outcome is an enriched educational experience for university students,
advancement in personalized learning, and improved student engagement, retention, and academic
performance. Through continuous research and refinement, it is expected that the intelligent chatbot tutor will
assume an important role in enhancing and supporting the educational journey of students.
1. Introduction
Traditional methods of student tutoring often face challenges such as limited availability,
inconsistency in teaching quality, and scalability issues. Moreover, relying solely on general-purpose
Large Language Models (LLMs) like ChatGPT for educational purposes poses several drawbacks, including
inaccurate responses and the inability to provide tailored guidance based on specific course content.
Taking into consideration these challenges, there is a pressing need to develop innovative solutions
that leverage the power of Artificial Intelligence (AI) to enhance the learning experience for students.
A comprehensive review of relevant research articles was conducted to explore the state of the art
in the fields of Retrieval Augmented Generation (RAG) in Large Language Models (LLMs),
Intelligent Chatbot Tutoring, and their application in Higher Education. This review emphasizes the
variety of strategies by which RAG can be utilized to overcome the inherent constraints of
general-purpose LLMs when performing tasks that require comprehensive knowledge [1].
The review conducted by M. Ashfaque et al. [2] delves into the integration of AI and NLP in
intelligent tutoring systems, with a focus on Chatbots as virtual tutors. It discusses the diverse
applications of Chatbots in education and various sectors, highlighting the need for continuous
improvement in Chatbot design and development for enhanced functionality and user experience.
Research paper [3] investigates tensions arising from the integration of large language model-
based chatbots in higher education, emphasizing the need for clear guidelines and collaborative rule-
making. Students and teachers express concerns about the quality of learning, the value of diplomas,
and the evolving roles in education due to LLM-based chatbots. The study highlights the importance
of understanding the changing human-technology relationship and the implications for learning
objectives and division of labor within educational settings.
The paper [4] presents a hybrid model integrating Large Language Models (LLMs) and Chatbots
for efficient access to educational materials in higher education. It emphasizes the importance of
prompt engineering and content knowledge in maximizing the potential of LLMs. Practical
implementations, such as question-answering chatbots, demonstrate the effectiveness of the
proposed programming environment. Another study [5] highlights the transformative capabilities
of LLMs in education and underscores the role of critical thinking and iterative design for optimal
performance.
The article [6] outlines the creation of TutorBot+, a ChatGPT-based feedback tool for
programming students at Universidad Católica de la Santísima Concepción, aiming to enhance
learning and computational thinking skills. Preliminary results show successful integration with an
LMS and potential for improving the educational experience. N. Bakas et al. [7] propose using
ChatGPT’s API for interactive tutoring, highlighting the transformative impact of LLMs in education
by enabling dynamic, customized learning experiences.
The study on Student Interaction with NewtBot [8] demonstrates positive student perceptions
of using generative AI for schoolwork and the effectiveness of internal prompt engineering in
customizing LLM chatbot behaviors for improved academic engagement. S. Siriwardhana et al. [9]
introduce RAG-end2end for domain adaptation in Open Domain Question Answering (ODQA),
enhancing RAG models with auxiliary signals for improved factual consistency and reduced
hallucinations. Results demonstrate the effectiveness of joint retriever-generator training in
specialized domains, suggesting potential applications in educational chatbot tutoring systems.
In prior research, the same group of researchers proposed an instructional methodology for
training engineers in all necessary procedures for the creation, validation, and implementation of
machine learning-based systems [10], as well as strategies for incorporating Artificial Intelligence and
ChatGPT into Higher Engineering Education [11]. The current study seeks to transcend the
limitations of traditional tutoring and general-purpose LLMs by providing accurate, contextually
relevant, and personalized assistance, thereby enhancing student understanding and engagement.
This system, guided by an intelligent agent and utilizing a RAG approach, collects relevant
information from meticulously curated academic sources, incorporates interactive components for
user feedback, and applies machine learning methods for ongoing performance optimization,
ensuring a proficient tutoring experience.
for the retrieval component, ensuring that the information fed into the LLM is both accurate and
contextually relevant. Next, prompt engineering techniques are used to refine the chatbot’s
responses, ensuring they are aligned with educational objectives and learning outcomes.
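To make the prompt engineering step more concrete, the following is a minimal sketch of how retrieved passages might be combined with a student question into a single prompt. The function name buildTutorPrompt, the RetrievedChunk shape, and the instruction wording are assumptions introduced for illustration; they are not taken from the authors' implementation.

```typescript
// Hypothetical shape of a passage returned by the retrieval component.
interface RetrievedChunk {
  source: string; // label of the document the passage came from
  text: string;   // the passage itself
}

// Assemble a grounded prompt: instructions, numbered context, then the question.
// The instruction wording is an illustrative assumption, not the paper's prompt.
function buildTutorPrompt(question: string, chunks: RetrievedChunk[]): string {
  const context =
    chunks.length === 0
      ? "(no context retrieved)"
      : chunks.map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`).join("\n");
  return [
    "You are a tutor for a university course.",
    "Answer using only the context below; if the context is insufficient, say so.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Illustrative placeholder passage, not quoted from the course materials.
const examplePrompt = buildTutorPrompt("How do I create a while loop?", [
  { source: "lab-guide", text: "A While Loop repeats its subdiagram until a stop condition is met." },
]);
console.log(examplePrompt);
```

Numbering the passages lets the model, and the student, refer back to the specific source that supports an answer.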
Interactive components are crucial for effective tutoring. A RAG-based chatbot can incorporate
features such as quizzes, practice problems, and instant feedback mechanisms. These components
not only engage students actively but also help in reinforcing their understanding of the material.
Additionally, the chatbot can adapt to individual learning paces and styles, providing a personalized
learning experience. By analyzing student interactions and feedback, the system can continuously
improve its performance through iterative design and machine learning techniques.
Practical implementations of such systems demonstrate their potential to transform the
educational landscape. For instance, question-answering chatbots developed using RAG approaches
have shown significant improvements in delivering precise and comprehensive answers compared
to traditional LLMs. These chatbots can assist students with homework, clarify complex concepts,
and even offer guidance on research projects, thereby enhancing the overall learning experience.
Moreover, the integration of RAG-based chatbots in Learning Management Systems (LMS)
provides a seamless and accessible tutoring solution. Students can interact with the chatbot anytime,
ensuring consistent support outside the classroom. This continuous availability helps address the
issue of limited tutoring resources and offers scalable solutions for educational institutions.
The vector database is a critical component in the RAG framework, acting as a repository for the
high-dimensional vectors representing the indexed educational content. LlamaIndex leverages state-of-the-art
algorithms to create and manage this vector store, ensuring that the vectors are organized
in a manner that allows for efficient retrieval. This setup not only speeds up the search process but
also enhances the accuracy of the retrieved information by reducing the likelihood of retrieving
irrelevant or incorrect data.
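As an illustration of how retrieval over such vectors can work, the sketch below ranks stored embeddings by cosine similarity to a query embedding and returns the top k matches. It is a simplified stand-in for a vector store, not LlamaIndex's internal algorithm; the StoredVector shape, the function names, and the three-dimensional toy embeddings are assumptions.

```typescript
// Hypothetical record in a toy vector store.
interface StoredVector {
  id: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal dimension.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored vector against the query and keep the k best.
function topK(query: number[], store: StoredVector[], k: number): { id: string; score: number }[] {
  return store
    .map((v) => ({ id: v.id, score: cosineSimilarity(query, v.embedding) }))
    .sort((p, q) => q.score - p.score)
    .slice(0, k);
}

// Toy embeddings chosen by hand for illustration only.
const toyStore: StoredVector[] = [
  { id: "labview-loops", embedding: [1, 0, 0] },
  { id: "arduino-setup", embedding: [0, 1, 0] },
  { id: "daq-basics", embedding: [0.7, 0.7, 0] },
];
const hits = topK([0.9, 0.1, 0], toyStore, 2);
console.log(hits.map((h) => h.id));
```

This exhaustive scan is linear in the number of stored vectors; production vector stores typically use indexing structures to avoid comparing the query with every entry.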
By indexing a comprehensive knowledge base, the system ensures that the information retrieved
is always relevant and reliable, thereby enhancing the quality of tutoring. The use of the vector store
allows the system to scale effortlessly, accommodating a growing repository of educational materials
without sacrificing performance. This scalability is particularly beneficial for educational institutions
looking to implement AI-driven tutoring solutions across multiple courses and disciplines.
granularity and coherence of the indexed content. A chunk size of 512 ensures that each segment of
the document is of manageable length, allowing the model to capture sufficient context within each
vector. The chunk overlap of 20 provides a slight overlap between consecutive chunks, preserving
the continuity of information and improving the system’s ability to retrieve contextually relevant
data.
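The effect of these two parameters can be sketched with a simple sliding window. The example below is a hedged illustration rather than the splitter used by the system: it assumes that chunk size and overlap are counted in tokens and uses placeholder strings in place of real tokens; the function name chunkTokens is hypothetical.

```typescript
// Split a token sequence into windows of `chunkSize` tokens, where each window
// shares its first `chunkOverlap` tokens with the end of the previous window.
// Counting in tokens is an assumption; the text does not state the unit.
function chunkTokens(tokens: string[], chunkSize: number, chunkOverlap: number): string[][] {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("require chunkSize > 0 and 0 <= chunkOverlap < chunkSize");
  }
  const step = chunkSize - chunkOverlap; // how far the window advances each time
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize));
    if (start + chunkSize >= tokens.length) break; // this window reached the end
  }
  return chunks;
}

// 1000 placeholder tokens stand in for a tokenized document.
const demoTokens: string[] = Array.from({ length: 1000 }, (_, i) => `t${i}`);
const demoChunks = chunkTokens(demoTokens, 512, 20);
console.log(demoChunks.map((c) => c.length));
```

With a size of 512 and an overlap of 20, the window advances by 492 tokens, so the second chunk begins 20 tokens before the first one ends.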
Following the creation of the storage context, documents are loaded into the system using the
SimpleDirectoryReader().loadData() function. This function is designed to handle entire directories,
allowing for the bulk import of educational content. Given the directory containing the
educational materials as a parameter, the function reads and processes the files, converting them into
a format suitable for indexing. The loaded documents are then used to create a VectorStoreIndex
using the VectorStoreIndex.fromDocuments() function, which converts the loaded documents
into embedding vectors suitable for storage in the vector store. These vectors encapsulate the semantic
information contained in the documents, enabling efficient and accurate retrieval based on the
content’s meaning rather than just keyword matching.
The TypeScript implementation responsible for the instantiation and local persistence of the
vector store is presented in Figure 2.
The complete process for handling the documents and creating the local vector store is
illustrated in Figure 3. The local vector store consists of three key files, each serving a distinct purpose:
• doc_store.json: this file contains the raw documents that have been loaded into the system. It
serves as a repository of the original educational content, preserving the text and metadata
associated with each document.
• index_store.json: this file maintains the indexing information for the documents stored in the system.
It includes the structures and mappings that allow for efficient searching and retrieval of
documents based on their content.
• vector_store.json: this file stores the high-dimensional vectors generated from the indexed documents.
These vectors are created using an embedding technique and capture the semantic
meaning of the documents. They are used for tasks such as similarity search and clustering.
The application is currently in the testing phase, incorporating educational materials relevant to
the Virtual Instrumentation course at the Transilvania University of Brasov. To construct the
knowledge base for the application, two primary documents were employed:
• Virtual Instrumentation: Laboratory Guide [14]—this document provides comprehensive
laboratory exercises and practical guidance on virtual instrumentation, serving as a foundational
resource for hands-on learning and experimentation.
• Introduction to LabVIEW Graphic Programming with Applications in Electronics,
Telecommunications, and Information Technologies [15]—it covers fundamental concepts and
practical applications in electronics, telecommunications, and IT, making it a critical resource for
understanding the software tools used in virtual instrumentation.
To evaluate the effectiveness of the application, it is currently being tested to determine how
well it responds to queries related to the documents integrated into its knowledge base. This testing
phase involves asking various questions about the content covered in those two documents.
Preliminary results have been promising, indicating that the chatbot can accurately retrieve and
generate contextually appropriate responses based on these resources. Furthermore, the application
stores previous conversations locally, allowing it to maintain memory and context over the course of
an interaction. Figure 5 illustrates a sample question regarding the steps needed to build an Arduino
Application in LabVIEW, demonstrating the application’s capability to handle specific queries
effectively.
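The text does not describe how conversations are stored, so the following is only a minimal sketch of one possible approach: a bounded history of chat turns that can be serialized to a JSON string and restored later. The class ConversationMemory, its methods, and the turn limit are hypothetical, and writing the string to disk is omitted.

```typescript
type ChatRole = "user" | "assistant";

interface ChatTurn {
  role: ChatRole;
  content: string;
}

// Keeps only the most recent `maxTurns` turns so the prompt stays bounded.
class ConversationMemory {
  private turns: ChatTurn[] = [];
  private readonly maxTurns: number;

  constructor(maxTurns: number) {
    if (maxTurns < 1) throw new Error("maxTurns must be at least 1");
    this.maxTurns = maxTurns;
  }

  add(role: ChatRole, content: string): void {
    this.turns.push({ role, content });
    if (this.turns.length > this.maxTurns) {
      this.turns = this.turns.slice(-this.maxTurns); // drop the oldest turns
    }
  }

  history(): ChatTurn[] {
    return [...this.turns]; // copy, so callers cannot mutate internal state
  }

  // JSON string that could be written to local storage (assumed format).
  serialize(): string {
    return JSON.stringify(this.turns);
  }

  static restore(json: string, maxTurns: number): ConversationMemory {
    const memory = new ConversationMemory(maxTurns);
    for (const turn of JSON.parse(json) as ChatTurn[]) {
      memory.add(turn.role, turn.content);
    }
    return memory;
  }
}

// Illustrative exchange; the answers are placeholders, not system output.
const tutorMemory = new ConversationMemory(2);
tutorMemory.add("user", "What is a VI?");
tutorMemory.add("assistant", "A VI is a LabVIEW program.");
tutorMemory.add("user", "How do I run it?");
const restoredMemory = ConversationMemory.restore(tutorMemory.serialize(), 2);
console.log(restoredMemory.history());
```

Bounding the history is a design choice that trades long-range context for a predictable prompt length; a real deployment might summarize older turns instead of discarding them.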
References
1. Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W.,
Rocktäschel, T., Riedel, S., Kiela, D.: Retrieval-augmented generation for knowledge-intensive NLP tasks.
In Proceedings of the 34th International Conference on Neural Information Processing Systems (NIPS ‘20).
Curran Associates Inc., Red Hook, NY, USA, Article 793, 9459–9474, DOI:
https://doi.org/10.48550/arXiv.2005.11401 (2020).
2. Ashfaque, M. W., Tharewal, S., Iqhbal, S., Kayte, C. N.: A Review on Techniques, Characteristics and
approaches of an intelligent tutoring Chatbot system, 2020 International Conference on Smart Innovations
in Design, Environment, Management, Planning and Computing (ICSIDEMPC), Aurangabad, India, 2020,
pp. 258-262, DOI: https://doi.org/10.1109/ICSIDEMPC49020.2020.9299583.
3. Carbonel, H., Jullien, J.-M.: Emerging tensions around learning with LLM-based chatbots: A CHAT
approach, Networked Learning Conference, 14(1), https://journals.aau.dk/index.php/nlc/article/view/8084.
4. Bratić, D., Šapina, M., Jurečić, D., Žiljak Gršić, J.: Centralized Database Access: Transformer Framework and
LLM/Chatbot Integration-Based Hybrid Model. Appl. Syst. Innov. 2024, 7, 17, DOI:
https://doi.org/10.3390/asi7010017 (2024).
5. Makharia, R. et al.: AI Tutor Enhanced with Prompt Engineering and Deep Knowledge Tracing, 2024 IEEE
International Conference on Interdisciplinary Approaches in Technology and Management for Social
Innovation (IATMSI), Gwalior, India, 2024, pp. 1-6, DOI:
https://doi.org/10.1109/IATMSI60426.2024.10503187.
6. Martinez-Araneda, C., Gutiérrez, M., Maldonado, D., Gómez, P., Segura, A., Vidal-Castro, C.: Designing a
Chatbot to support problem-solving in a programming course, INTED2024 Proceedings, pp. 966-975, DOI:
https://doi.org/10.21125/inted.2024.0317 (2024).
7. Bakas, N.P., Papadaki, M., Vagianou, E., Christou, I., Chatzichristofis, S.A.: Integrating LLMs in Higher
Education, Through Interactive Problem Solving and Tutoring: Algorithmic Approach and Use Cases, In:
Information Systems. EMCIS 2023. Lecture Notes in Business Information Processing, vol 501. Springer,
Cham, DOI: https://doi.org/10.1007/978-3-031-56478-9_21.
8. Lieb, A., Goel, T.: Student Interaction with NewtBot: An LLM-as-tutor Chatbot for Secondary Physics
Education. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI
EA ‘24), Association for Computing Machinery, New York, NY, USA, Article 614, pp. 1–8, DOI:
https://doi.org/10.1145/3613905.3647957.
9. Siriwardhana, S., Weerasekera, R., Wen, E., Kaluarachchi, T., Rana, R., Nanayakkara, S.: Improving the
Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question
Answering. Transactions of the Association for Computational Linguistics 2023; 11: 1–17, DOI:
https://doi.org/10.1162/tacl_a_00530.
10. Modran, H.A., Ursutiu, D., Samoila, C., Chamunorwa, T.: Learning Methods Based on Artificial Intelligence
in Educating Engineers for the New Jobs of the 5th Industrial Revolution. In: Educating Engineers for
Future Industrial Revolutions. ICL 2020. Advances in Intelligent Systems and Computing, vol 1329.
Springer, Cham, DOI: https://doi.org/10.1007/978-3-030-68201-9_55 (2021).
11. Modran, H.A., Chamunorwa, T., Ursuțiu, D., Samoilă, C.: Integrating Artificial Intelligence and ChatGPT
into Higher Engineering Education. In: Towards a Hybrid, Flexible and Socially Engaged Higher
Education. ICL 2023. Lecture Notes in Networks and Systems, vol 899. Springer, Cham, DOI:
https://doi.org/10.1007/978-3-031-51979-6_52.
12. ChatGPT API Reference, https://platform.openai.com/docs/api-reference/introduction, last accessed
2024/05/25.
13. LlamaIndex Documentation, https://www.llamaindex.ai/, last accessed 2024/05/28.
14. Modran, H. A., Ursuțiu, D.: Instrumentație Virtuală: Îndrumar de laborator, Transilvania University of
Brasov Publishing House, ISBN 9786061915460 (2022).
15. Modran, H. A., Bogdan, I. C., Ursuțiu, D., Samoilă, C.: Introducere în Programarea Grafică LabVIEW cu
Aplicații în Electronică, Telecomunicații și Tehnologii Informaționale, Transilvania University of Brasov
Publishing House, ISBN 978-606-19-1709-9 (2024).