126930
126930
a
Ashish Garg and Ramkumar Rajendran
IDP in Educational Technology, Indian Institute of Technology Bombay, Mumbai, India
Abstract: This paper investigates the use of Generative AI chatbots, especially large language models like ChatGPT, in
enhancing data analysis skills through structured prompts in an educational setting. The study addresses the
challenge of deploying AI tools for learners new to programming and data analysis, focusing on the role of
structured prompt engineering as a facilitator. In this study Engineering students were trained to adeptly use
structured prompts in conjunction with Generative AI, to improve their data analysis skills. The t-test
comparing pre-test and post-test scores on programming and data analysis shows a significant difference,
indicating learning progress. Additionally, the task completion rate reveals that 45% of novice participants
completed tasks using Generative AI and structured prompts. This finding highlights the transformative
impact of Generative AI in education, indicating a shift in learning experiences and outcomes. The integration
of structured prompts with Generative AI not only aids skill development but also marks a new direction in
educational methodologies.
a
https://orcid.org/0000-0002-0411-1782
270
Garg, A. and Rajendran, R.
The Impact of Structured Prompt-Driven Generative AI on Learning Data Analysis in Engineering Students.
DOI: 10.5220/0012693000003693
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 16th International Conference on Computer Supported Education (CSEDU 2024) - Volume 2, pages 270-277
ISBN: 978-989-758-697-2; ISSN: 2184-5026
Proceedings Copyright © 2024 by SCITEPRESS – Science and Technology Publications, Lda.
The Impact of Structured Prompt-Driven Generative AI on Learning Data Analysis in Engineering Students
the paired sample t-test on programming knowledge summaries, and computer code (Rajabi, P.,
and data analysis concepts. These findings suggest Taghipour, P., Cukierman, D., & Doleck, T., 2023).
the benefits of incorporating Generative AI and Transitioning from the applications of generative
structured prompts in education, especially in AI to its specific role in programming, recent studies
domains like programming and data analysis. have explored the impactful role of generative AI. An
experimental study showcased how the programming
tool Codex, powered by generative AI, outperformed
2 BACKGROUNDS AND learners in a CS1 class on a rainfall problem, ranking
in the top quartile (Denny, P. et al., 2023). Another
LITERATURE REVIEW investigation used the flake8 tool to assess code
generated by AI against the PEP8 coding style,
The literature review examines the application of revealing a minimal syntax error rate of 2.88% (Feng,
Generative AI in educational settings, with a focus on Y. et al., 2023). A notable study involved Github's
ChatGPT's contributions to various learning generative AI platform, which initially failed to solve
environments. This section highlights the critical role 87 Python problems; however, applying prompt
of structured prompt engineering and evaluates the engineering techniques enabled it to resolve
limited existing studies on Generative AI's use in approximately 60.9% of them successfully (Finnie-
programming and data analysis. This section Ansley, J. et al., 2022).
identifies areas that require further investigation to The Above findings collectively highlight the
fully understand the educational benefits and efficacy of generative AI in code generation. This
opportunities of Generative AI. transition into the realm of prompt engineering, a
critical aspect of maximizing the potential of
2.1 Generative AI in Education generative AI in educational contexts, leads us to the
next section of the literature review, focusing on the
Generative AI tools, such as ChatGPT, have become nuances of prompt engineering.
popular for their capacity to generate responses and
content that mimic human interaction, using 2.2 Prompt Engineering
advanced deep learning algorithms and vast amounts
of text data (Dai, Y. et al., 2023). To explore the Maximizing Generative AI's benefits in education
applications and potential advantages of these AI relies on the proficient use of prompt engineering, a
tools in education, a systematic literature review was key skill that significantly affects AI model
conducted, The applications of Generative AI are interactions (Kohnke, L., Moorhouse, B. L., & Zou,
summarized as follows D., 2023). Effective prompt engineering requires
Personalized Tutoring: AI provides customized understanding AI's operational principles, and
tutoring and feedback, tailoring the learning ensuring prompts are clear and precise to improve
experience to each student's needs and progress tokenization and response accuracy. Including
(Bahrini, A. et al., 2023). detailed context in prompts also enhances the AI's
Automated Essay Grading: AI systems are trained ability to form relevant connections, boosting
to grade essays by identifying characteristics of response quality.
effective writing and offering feedback (De Silva, D. The art of prompt engineering also involves
et al., 2023). specifying the desired format of the AI's responses,
Language Translation: AI translates educational ensuring they align with user expectations in terms of
materials into multiple languages, ensuring accurate structure and style. Controlling verbosity within
and understandable translations (Kohnke, L., prompts is another key aspect, allowing users to
Moorhouse, B. L., & Zou, D., 2023). manage the level of detail in the AI's responses, thus
Interactive Learning: AI creates dynamic learning tailoring the information density to suit specific
environments with a virtual tutor that responds to needs.
student inquiries (Bahrini, A. et al., 2023). However, understanding these principles is just
Adaptive Learning: AI adjusts teaching strategies the beginning. Practical application demands a
based on student performance, customizing the structured approach, embodied in the CLEAR
difficulty of problems and materials (Bahrini, A. et framework. This framework provides a systematic
al., 2023). strategy for crafting prompts that effectively harness
Content Creation: AI generates a variety of the capabilities of AI language models. It's a synthesis
content, including articles, stories, poems, essays, of clarity, context, formatting, and verbosity control,
271
CSEDU 2024 - 16th International Conference on Computer Supported Education
all working in concert to elevate the communication 2.3 Gaps in Existing Literature
process with AI, making it efficient and impactful in
educational settings (Lo, L. S., 2023). The framework The existing research focuses on generative AI's
elements are presented below as ability in code generation and problem-solving but
Concise: Prompts should be succinct, clear, and often misses its wider educational effects and
focused on the task's core elements to guide AI interactions with learners. Key areas needing further
towards relevant and precise responses. exploration include:
Logical: Prompts need a logical flow of ideas, Educational Impact in Data Analysis: The
helping AI understand context and relationships educational benefits of using tools like ChatGPT in
between concepts, resulting in coherent outputs. teaching data analysis are unclear. Their influence on
Explicit: Prompts must specify the expected output student motivation, comprehension of data analysis
format, content, or scope to avoid irrelevant principles, and lasting skill acquisition needs
responses. examination, especially for newcomers to data
Adaptive: Prompts should be flexible, allowing analysis.
experimentation with different structures and settings Prompt Engineering in Education: The role of
to balance creativity and specificity. prompt engineering in educational settings is largely
Reflective: Continuous evaluation and refinement of unexplored. Recognized for enhancing AI
prompts are crucial, using insights from previous performance, it has the potential to help learners
responses to improve future interactions (Lo, L. S., articulate data analysis problems, think critically, and
2023). engage creatively with tasks that need exploration.
Additionally, some strategies mentioned in open The literature review highlights the need for this
AI documentation for writing prompts are presented study, focusing on prompt engineering's unexplored
below potential in education, especially in data analysis. It
identifies a gap in understanding how Generative AI,
Strategy: Write Clear Instructions with structured prompts, affects analytical skills and
self-directed learning. This research aims to bridge
I. Include details in your query to get relevant this gap, providing insights into integrating
answers. Generative AI effectively in education and advancing
II. Ask the model to adopt a persona using the data analysis practices.
system message.
III. Use delimiters to indicate distinct parts of the
input.
3 STUDY DESIGN
Strategy: Provide Reference Text
In this part, the research method is explained,
I. Instruct the model to answer using a reference including the tasks created, the data used for these
text. tasks, and the tests conducted before and after to
II. Instruct the model to answer with citations assess the results. The discussion also covers the SUS
from a reference text. survey used to evaluate user satisfaction and the
training provided for effective prompt crafting.
Strategy: Split Complex Tasks into Simpler
Subtasks 3.1 Selection of Concept of Data
Analysis for the Task
I. Use intent classification to identify the most
relevant instructions. The study focuses on two essential data analysis
II. Summarize long documents piecewise and skills: data aggregation and merging. Aggregation
construct a full summary recursively (OpenAI, simplifies data, revealing trends and easing novices
2023). into complex tasks, much like learning the alphabet
The strategic application of prompt engineering before forming sentences (McKinney, W., 2022).
tactics significantly refines AI interactions, ensuring Merging integrates diverse datasets, essential in the
responses are both precise and contextually relevant. data landscape, offering a unified view (McKinney,
W., 2022).
272
The Impact of Structured Prompt-Driven Generative AI on Learning Data Analysis in Engineering Students
3.2 Dataset for the Task and Problem concepts through Python programming. These
Statement for the Task Questions are designed across three levels of Bloom's
taxonomy, the test comprises five questions each
3.2.1 Dataset ● Understanding (L1)
● Applying (L2)
The dataset includes data from September to January, ● Analyzing (L3)
capturing attributes such as student video usage, These questions emphasized practical application
student ID, school ID, view count, and last access also over mere rote learning and ensured relevance to
date and time. Each month's dataset contains over the task at hand. These questions originated from the
10,000 observations, providing a comprehensive official panda's documentation and underwent
view of student engagement with video content. This multiple validations by industry experts, ensuring
dataset is created for the task and the discussed their efficacy in gauging participant performance.
concept of data analysis. However, this dataset draws
inspiration from the school education program where 3.3.2 SUS Survey
students are provided tablets to enhance learning.
The System Usability Scale (SUS) was adapted to
3.2.2 Task gather feedback on the Generative AI tool in data
analysis, providing a reliable evaluation of its
Based on the given dataset, the task is designed that effectiveness and user experience (Brooke, J., 1995).
way so that it cannot be completed with no Combining SUS scores with task performance and
programming software like Excel and Tableau, etc., learning metrics allows for a detailed assessment of
The following are the problem statements of the task the tool's impact on user proficiency.
T1: Calculate the total daily video usage for each
student across all months. 3.4 Design of Structured Prompt
T2: Given the unique data capture cycle of student Training
video usage (the 26th of one month to the 25th of the
next), compute the monthly total video usage for each A one-hour training session has been designed to
student, for example, compute the student total video introduce participants to prompt engineering,
usage for October (1st October to 31st October). employing an example-based approach. Initially, the
T3: Calculate the monthly video usage for each CLEAR framework and strategies from OpenAI's
school over all the months. documentation are explained to lay the foundation.
For the completion of the tasks, Python Subsequently, two examples of structured prompts
programming is selected because it is preferred in are presented to illustrate the concepts in practice.
data analytics due to its simplicity, extensive libraries The first example is non-contextual, featuring a
like pandas and NumPy, and compatibility with big question from the Union Public Service Commission
data and machine learning. Its strong community and ethics exam, a well-known civil service recruitment
relevance in real-world applications, along with examination in India as shown in Figure 1. This
market demand as highlighted by Zheng, Y. (2019), example is chosen for its general nature, ensuring that
make it superior to alternatives like R and Weka, even students with limited programming or data
justifying its choice for this study. analysis experience can grasp the concept of
structured prompts. The structured prompt for this
3.3 Instrument Designed question is crafted to build trust and provide an easy
introduction to the topic. However, the prompt for
The study selects specific tools to ensure the findings this case is written this way “At the beginning of the
are valid and reliable. It aims to closely examine the prompt, the full question is clearly stated, followed by
experiences, needs, and performance of participants a detailed context explaining that this is a UPSC exam
to fully understand the research goals. question, the nature of the exam, the selection
process, and the roles of those selected. The
3.3.1 Pre and Post-Test requirement to limit the response to 250 words is
specifically highlighted. The prompt then instructs to
A set of 15 MCQs was designed, focusing adopt the persona of an evaluator, whose profile is
predominantly on the concepts of aggregation and clearly outlined, to ensure the response meets the
merging. These questions aimed to assess the evaluator's expectations. Additionally, a strategy is
participants' comprehensive understanding of the provided to organize the answer logically, enhancing
273
CSEDU 2024 - 16th International Conference on Computer Supported Education
the overall response utilizing concepts from the a relevant context but also prepares students for the
CLEAR framework and Open AI strategies. types of tasks they will encounter. This careful, step-
by-step approach ensures that all participants,
regardless of their background, can effectively engage
with and understand the principles of prompt
engineering, setting a solid foundation for their
subsequent tasks in data analysis
4 USER STUDY
This study examined the impact of Generative AI and
structured prompt engineering on novices learning
Figure 1: Example 1- Structured prompt for answering
UPSC exam question. data analysis, using a four-hour session with Python
and AI tools. Participants independently completed
The second example is contextual and directly related tasks, with researcher guidance and continuous AI
to the field of study. It involves showing students an access, to assess how structured prompts affect
Excel file with a dataset different from the one used learning in a condensed time frame.
in the tasks. For this dataset, a structured prompt is
written based on a specific problem statement as 4.1 Participants
shown in Figure 2,
This study involved 20 graduate-level participants, all
familiar with ChatGPT or similar AI tools but without
formal training in programming or data analytics.
This selection ensured a uniform baseline of
understanding across the 12 male and 8 female
participants, aged 24 to 29. Each participant had
access to ChatGPT 3.5 and shared English as their
formal education language, minimizing language
barriers. Their lack of prior prompt engineering
experience set a consistent starting point for all,
crucial for examining the impact of structured prompt
training on their data analysis skills using Generative
Figure 2: Example 2-Structured Prompt for writing Python AI. This Purposive sampling was important for
code for the comparison of scores for the given training maintaining a controlled study environment and
dataset. focusing on the specific research objectives.
In this case, the prompt is structured using the CLEAR 4.2 Study Procedure
framework and Open AI strategies. It starts by clearly
defining the problem statement. Next, it specifies the In this study, as shown in Figure 3 participants were
requirement for a Python code, it begins by stating the initially briefed on the impact of Generative AI in
file path of the dataset, followed by explicitly naming data analysis and consented to ethical data collection
the columns in this dataset. It then instructs to focus and privacy practices. A pre-test then assessed their
only on the columns relevant to the problem. Based on existing knowledge, establishing a baseline for
the problem statement, it first outlines the comparison subsequent phases. During the training phase, they
metrics and then provides instructions for advanced engaged in a one-hour session on structured prompt
analysis, including checking assumptions. It also writing, essential for effective interaction with
clearly states what actions to take if the assumptions Generative AI tools like OpenAI's ChatGPT, and
are not met. This is all organized in a logical sequence. practiced crafting prompts through contextual and
At the end of the prompt, there's a request to provide non-contextual examples. In the task phase, they
the code and explain each part of the syntax step by applied prompting skills over three hours, tackling
step. This helps the user understand the process and various data analysis tasks and refining their
learn in segments. This approach not only proficiency. Post-intervention, their skills were
demonstrates the application of structured prompts in reassessed to quantify the training's effectiveness.
274
The Impact of Structured Prompt-Driven Generative AI on Learning Data Analysis in Engineering Students
275
CSEDU 2024 - 16th International Conference on Computer Supported Education
participants' performance across all levels of Bloom's detailed learner interactions and perceptions. A key
taxonomy post-intervention, indicating an improved focus will be evaluating the quality of participants'
understanding and application of programming prompts to enhance critical thinking and refine
concepts crucial for data analysis. training methods. Expected to enrich learning
The substantial effect sizes reported in the t-test theories and Human-Computer Interaction
results underscore the profound impact of the frameworks, this research will help explore how
intervention on learners' ability to grasp and apply Generative AI can innovate pedagogy and create
programming principles within the context of data personalized, accessible educational experiences
analysis. Additionally, task completion rates after worldwide.
structured prompt training with Generative AI
suggest the significant influence of the intervention
on participants' ability to understand and execute data REFERENCES
analysis tasks. This progress goes beyond mere
memorization, indicating a shift towards the Bahrini, A. et al. (2023). ChatGPT: Applications,
comprehension of the underlying principles. opportunities, and threats. 2023 Systems and
Furthermore, the System Usability Scale (SUS) Information Engineering Design Symposium (SIEDS),
survey results, indicating a positive reception of the Charlottesville, VA, USA, 274-279.
Generative AI tool's usability, complement the https://doi.org/10.1109/SIEDS58326.2023.10137850
study's findings. A user-friendly and effective tool is Brooke, J. (1995, November 30). SUS: A quick and dirty
crucial in an educational setting, as it can significantly usability scale. Usability Evaluation in Industry, 189.
reduce the cognitive load on learners, allowing them Dai, Y., et al. (2023). Reconceptualizing ChatGPT and
generative AI as a student-driven innovation in higher
to concentrate on understanding and applying the
education. Procedia CIRP, 119, 84-90.
concepts rather than navigating the tool itself. https://doi.org/10.1016/j.procir.2023.05.002
This study sheds light on the potential of De Silva, D., Mills, N., El-Ayoubi, M., Manic, M., &
Generative AI and structured prompt engineering to Alahakoon, D. (2023). ChatGPT and generative AI
transform educational methods. The results of this guidelines for addressing academic integrity and
research suggest that Generative AI can play a crucial augmenting pre-existing chatbots. Proceedings of the
role in helping learners understand complex subjects IEEE International Conference on Industrial
like programming and data analysis. Moreover, the Technology, 2023-April.
usability of structured prompts has been instrumental https://doi.org/10.1109/ICIT58465.2023.10143123
Denny, P., Kumar, V., & Giacaman, N. (2023). Conversing
in providing students with clear, actionable guidance
with Copilot: Exploring prompt engineering for solving
through intricate learning tasks, enhancing their CS1 problems using natural language. In Proceedings
engagement and help them to master skills. of the 54th ACM Technical Symposium on Computer
However, the study acknowledges its limitations, Science Education (SIGCSE 2023) (pp. 1-7). ACM.
including the absence of log data analysis and https://doi.org/10.1145/3545945.356982
qualitative data like interviews which could provide Dhoni, P. (2023, August 29). Exploring the synergy
deeper insights into the behavioral patterns of high between generative AI, data, and analytics in the
and low performers. The relatively small sample size modern age. TechRxiv.
also restricts the generalizability of the findings. https://doi.org/10.36227/techrxiv.24045792.v1
Feng, Y., Vanam, S., Cherukupally, M., Zheng, W., Qiu,
M., & Chen, H. (2023). Investigating code generation
performance of ChatGPT with crowdsourcing social
7 FUTURE WORK data. In 2023 IEEE 47th Annual Computers, Software,
and Applications Conference (COMPSAC) (pp. TBD).
Future research on integrating Generative AI and IEEE.
Finnie-Ansley, J., Denny, P., Becker, B.A., Luxton-Reilly,
structured prompt engineering in education, A., & Prather, J. (2022). The robots are coming:
especially in programming and data analysis, is set to Exploring the implications of OpenAI Codex on
deepen our understanding of its effects on learning. introductory programming. In Proceedings of the 24th
Planned comparative studies can examine the Australasian Computing Education Conference (pp.
learning outcomes of groups with varying levels of TBD). Virtual Event, February 14–18, 2022.
access to ChatGPT and prompt training, aiming to Firaina, R., & Sulisworo, D. (2023). Exploring the usage of
understand the role of Generative AI in learner ChatGPT in higher education: Frequency and impact on
engagement and educational processes. These studies productivity. Buletin Edukasi Indonesia, 2(01), 39–46.
can expand participant diversity and employ methods https://doi.org/10.56741/bei.v2i01.310
like structured interviews and task analysis to capture
276
The Impact of Structured Prompt-Driven Generative AI on Learning Data Analysis in Engineering Students
277