Emotions and Gesture Recognition Using Affective Computing Assessment With Deep Learning
Corresponding Author:
Fatchul Arifin
Department of Electronics and Informatics Engineering Education, Faculty of Engineering, Universitas
Negeri Yogyakarta
1st Colombo Street, Karangmalang Campus, Yogyakarta 55281, Indonesia
Email: [email protected]
1. INTRODUCTION
Emotion is an important aspect that influences attitudes, decision making, and human
communication [1]. In education, emotions affect students' learning processes [2] and shape how
receptive students are to participating in learning [3]. Research also shows that emotions affect
student responses when students face difficult learning materials [4] and can increase students'
environmental awareness [5]. Beyond their impact on students, emotions also influence teachers' job
satisfaction [6]. A teacher must therefore be able to conduct an affective assessment that measures
students' emotions and behavior in order to determine learning achievement [7].
Affective assessment is important because the affective domain supports students' cognitive
processes. Research by Morshead [8] found that the affective domain affects the cognitive domain.
However, assessing the affective domain is more difficult than assessing the cognitive domain because it
requires converting naturally occurring feelings and attitudes into cognitively measurable form [9]. In
Indonesia, affective assessment is regulated by the Ministry of Education through the national
educational assessment standards, which translate it into an attitude assessment using the observation
method. In practice, however, assessment remains dominated by the cognitive domain [10]. Rimland [11]
reinforces this, noting that affective-domain assessment is quite difficult to apply, and several
factors make the affective domain hard to measure directly [12]. In addition, research specifically on
affective assessment is still rare because the domain is difficult to evaluate [10]. Research and
development of affective assessment must therefore address these problems.
Several studies have been conducted to measure students' affective states. Textual affective
assessment instruments can be developed using the achievement emotions questionnaire (AEQ) [13].
Moreira et al. [14] used the AEQ-Mathematics to show the relationship between achievement emotions and
personality. However, according to Pekrun [13], the self-report method may introduce bias due to
student subjectivity and needs to be complemented with other approaches to measuring student
psychology. Along with the development of technology, a new approach emerged, namely affective
computing, initiated by Picard [15]. Affective computing allows human emotions and affect to be
detected through signals obtained from facial activity, posture, gesture, hand movement, vocal,
textual, and electrodermal or skin activity [16].
Research on detecting human emotions using technology has been widely carried out. D’Mello and
Graesser [17] detected students' boredom, confusion, and frustration through dialogue features and body
postures captured by a camera while students learned with AutoTutor, as did Hussain et al. [18].
Furthermore, D’Mello and Graesser [19] used a cognitive disequilibrium model to explain the dynamics
of students' emotions. Arguel et al. [20] detected confusion by capturing facial expressions and
conversational cues in interactive digital learning environments (IDLEs). Thomas and Mathew [21] and
Lyons et al. [22] identified facial expressions to detect human emotions. Ko [23] utilized several
artificial intelligence algorithms to detect emotion from visual information (faces), while Kratzwald
et al. [24] did so from text. Chen and Lee [25] detected students' nervousness, excitement, and
calmness using a human pulse sensor. Behoora and Tucker [26] and Sun et al. [27] detected and
classified emotional states through human body gesture patterns. Emotion detection through voice was
also carried out by Davis et al. [28].
Other studies apply emotion detection predominantly in learning environments, such as in
intelligent tutoring systems (ITS) [29]–[37]. These previous studies did not focus specifically on
affective assessment but rather on ITS performance [1], [2]. Putra and Arifin [38] developed a system
for monitoring students' mood during classroom learning so that a teacher can minimize student stress.
Emotion detection during learning is thus feasible. According to Nasuha et al. [39], some basic
emotions, namely anger, sadness, joy, neutrality, fear, disgust, and surprise, can be classified
through facial expressions. In addition, emotions such as interest, joy, frustration, and boredom can
be detected through body gestures [26].
In the development of deep learning, PoseNet is a model that can be used to predict body
gestures [40], [41]. It detects body gestures based on 17 joint points in the human body. Since body
gestures reflect human emotions, PoseNet can be used for this task: it detects the positions of the
human joints that make up a body gesture, and the pattern of joint positions is then classified into
recognizable emotions such as interest, neutral, bored, and frustrated. The classifier recognizes these
patterns using a convolutional neural network (CNN) architecture; MobileNet is a lightweight CNN
architecture suitable for mobile use [42]–[47]. A sketch of this keypoint-to-emotion pipeline is given
below.
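The paper does not publish its classifier head in detail, so the following is only a minimal sketch: it assumes PoseNet's 17 (x, y) joint positions are flattened into a 34-value feature vector and fed to a small Keras network that stands in for the CNN classifier described above. The names `build_emotion_classifier`, `NUM_KEYPOINTS`, and `EMOTION_CLASSES` are illustrative, not from the paper.

```python
import numpy as np
from tensorflow import keras

# Illustrative constants: PoseNet outputs 17 (x, y) joint positions per frame.
NUM_KEYPOINTS = 17
EMOTION_CLASSES = ["interest", "neutral", "bored", "frustrated"]  # classes named in the text

def build_emotion_classifier() -> keras.Model:
    """Small stand-in network mapping flattened joint positions to emotion classes."""
    model = keras.Sequential([
        keras.layers.Input(shape=(NUM_KEYPOINTS * 2,)),  # x and y for each joint
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(len(EMOTION_CLASSES), activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Dummy batch of poses; real inputs would come from PoseNet keypoint detection.
poses = np.random.rand(8, NUM_KEYPOINTS * 2).astype("float32")
model = build_emotion_classifier()
predictions = model.predict(poses)
print(EMOTION_CLASSES[int(np.argmax(predictions[0]))])
```

In practice the network would first be trained on labeled pose data; the random inputs here only demonstrate the shapes involved.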
Emotions strongly influence students' cognitive abilities [15]. Therefore, the learning
process must manage the affective domain well. The affective computing approach has been shown to
reveal students' authentic emotional states. Data generated from PoseNet detection can therefore serve
as a basis for conducting affective assessments once it is processed into information. The information
system developed here is expected to help teachers manage their classes better by paying attention to
their students' emotional conditions. The purpose of this study was to produce a product that provides
authentic student emotional data so that teachers are assisted in implementing affective assessment.
2. METHOD
This research uses research and development methods to produce a product. The development
procedure follows the V-model, a software development approach derived from the waterfall model. This
procedure was chosen because it prioritizes the testing activities that assist software development
[48]. The development stages of the model, shown in Figure 1, are: 1) requirements analysis, 2)
system design, 3) architectural design, 4) module design, 5) coding, 6) unit testing, 7) integration
testing, 8) system testing, and 9) acceptance testing.
In the requirements analysis stage, the foundations of system development are established.
Because this stage pairs with the acceptance testing stage, the acceptance test is designed using the
usability aspect of the International Organization for Standardization (ISO) 25010 standard. After the
system concept is defined, the system design stage details all system components, including the product
description, use case diagrams, scenarios, activity diagrams, and the selection of PoseNet as the deep
learning model. At this stage, a system test is also designed using all aspects of ISO 25010 except
usability and functional suitability, because usability is used for acceptance testing and functional
suitability for integration testing. The next stage, architectural design, details the system
components by producing entity relationship diagrams, class diagrams, sequence diagrams, and user
interface (UI) mockups, and by selecting the MobileNet architecture for PoseNet. The specific modules
used in the system are then designed. Once the entire design is complete, it is implemented in program
code to become an information system. The resulting product is then tested successively: unit testing
(module tests), integration testing with functional suitability and deep learning model testing, system
testing against the ISO 25010 aspects, and finally acceptance and emotion detection accuracy testing.
3. RESULTS AND DISCUSSION
Figure 2. Comparison of emotion and gesture recognition for (a) happy and (b) bored emotions
The technique shown in Figure 2 allows affective assessment to be carried out by assessing the
tendency of students' emotion detection data during learning. The sequence of the affective assessment
process using this system is shown in Figure 3. The assessment activity begins with the teacher opening
the class and students marking their attendance in the class they have enrolled in. After students
receive confirmation of their attendance, emotion detection runs. During the learning process,
students' body gestures are recognized according to the predetermined emotion classes. When learning is
complete, the teacher closes the class, emotion detection stops, and all student data are saved. The
detection results are then stored in the system to obtain trend information, as sketched below.
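As a rough illustration of this session flow (not the authors' implementation; every class and method name here is hypothetical), the lifecycle from opening the class to saving per-student detections could look like:

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class ClassSession:
    """Hypothetical sketch of the assessment session flow described above."""
    detections: dict = field(default_factory=dict)  # student id -> Counter of emotions
    is_open: bool = False

    def open_class(self) -> None:
        self.is_open = True

    def mark_attendance(self, student_id: str) -> None:
        self.detections.setdefault(student_id, Counter())

    def record_detection(self, student_id: str, emotion: str) -> None:
        # Called for every emotion classified from a student's body gesture.
        if self.is_open and student_id in self.detections:
            self.detections[student_id][emotion] += 1

    def close_class(self) -> dict:
        # Stops detection and returns per-student emotion counts for storage.
        self.is_open = False
        return self.detections

session = ClassSession()
session.open_class()
session.mark_attendance("student-01")
session.record_detection("student-01", "interest")
session.record_detection("student-01", "bored")
print(session.close_class())  # {'student-01': Counter({'interest': 1, 'bored': 1})}
```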
Affective assessment can be done by following the steps above. The emotional tendencies
obtained from emotion detection become the material for students' affective assessment. The detection
results are displayed in graphs, as in Figure 4. An individual student's detected emotions and their
assessment are shown on the left in Figure 4(a); a radar chart indicates which emotion class is more
dominant. Emotion detection across a classroom can also be used to evaluate teaching style based on the
data: the dominant emotion class in the classroom is shown on the right in Figure 4(b). A sketch of
this aggregation follows Figure 4.
Figure 4. Comparing the presentation of emotion detection data in (a) a student view and (b) a classroom view
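A minimal sketch of turning one student's stored detections into radar-chart values and a dominant class; the counts below are invented for illustration, not study data.

```python
from collections import Counter

# Invented counts for one student's session.
detections = Counter({"interest": 120, "neutral": 85, "bored": 40, "frustrated": 15})

total = sum(detections.values())
# Percentage per emotion class: the values plotted on the radar chart axes.
radar_values = {emotion: round(100 * count / total, 2)
                for emotion, count in detections.items()}
dominant = max(detections, key=detections.get)  # the emotional tendency to report

print(radar_values)  # {'interest': 46.15, 'neutral': 32.69, 'bored': 15.38, 'frustrated': 5.77}
print(dominant)      # interest
```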
Based on the detection results above, the quality of the emotion detection needs to be tested.
Testing is carried out in two ways: with a confusion matrix, and with an accuracy test that uses the
AEQ instrument as the actual class. The confusion matrix test is shown in Figure 5, which presents the
detection accuracy for each predetermined emotion class. This test uses 250 detection samples, 50 for
each emotion class. The system correctly detected 42 instances of the happy class, 50 of neutral, 41 of
no emotion, 34 of bored, and 44 of disappointed.
From this confusion matrix, the detection accuracy can be calculated. The values needed are the
number of true detections and the number of false detections, where false detections divide into false
positives (FP) and false negatives (FN). Of the 250 test samples, the happy emotion was detected
correctly in 42 of its 50 test samples; neutral in 50 out of 50, no emotion in 41 out of 50, bored in
34 out of 50, and disappointed in 44 out of 50. Accuracy is calculated by dividing the total number of
correct detections by the total number of samples. Table 1 shows this calculation, and a minimal sketch
follows.
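The overall figure follows directly from the per-class correct counts above; a minimal sketch of the calculation, using the values reported from the confusion matrix:

```python
# Correct detections per class from the confusion matrix; 50 test samples per class.
correct = {"happy": 42, "neutral": 50, "no emotion": 41, "bored": 34, "disappointed": 44}
samples_per_class = 50

total_correct = sum(correct.values())             # 211
total_samples = samples_per_class * len(correct)  # 250
accuracy = 100 * total_correct / total_samples

per_class_recall = {cls: n / samples_per_class for cls, n in correct.items()}
print(f"overall accuracy: {accuracy:.1f}%")       # 84.4%
print(per_class_recall)                           # e.g. neutral: 1.0, bored: 0.68
```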
Based on the table, the number of correct detections is 211 out of a total of 250 samples,
giving a detection accuracy of 84.4%. After establishing the system's accuracy, further testing
compares the system's detection results with students' actual emotional conditions. To confirm the
students' actual emotional states, the AEQ questionnaire was administered. This accuracy test divides
the emotion categories into positive and negative. Figure 6 shows the difference between the positive
and negative emotion test values from the system and from the AEQ.
Figure 6. Results on positive and negative emotions by the system and the AEQ questionnaire
Figure 6 shows the difference between the detection results of the system and the AEQ; this
difference indicates the accuracy of detecting students' emotions. Positive emotions comprise interest
and neutral, which are classified from students' poses; negative emotions comprise bored and
disappointed. In tests conducted on 34 students, the positive emotion class showed a smaller average
difference (17.17) than the negative emotion class (20.93). These values are calculated from the
difference between the actual condition (AEQ) and the system's detection results. The system's ability
to detect positive versus negative emotions can be seen from the differences shown in Figure 7, which
indicates that detection of positive emotions needs further improvement, since the actual condition is
higher than the detected one. Conversely, the fluctuating values for the negative emotion class suggest
that detection is too sensitive to changes in body pattern.
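As a check on the overall figure reported next, it is simply the mean of the two per-category differences; a minimal sketch using the values above:

```python
# Average absolute difference between system detection and AEQ self-report (34 students).
diff_positive = 17.17  # interest and neutral
diff_negative = 20.93  # bored and disappointed

average_difference = (diff_positive + diff_negative) / 2  # 19.05
actual_accuracy = 100 - average_difference                # 80.95
print(f"actual accuracy against AEQ: {actual_accuracy:.2f}%")
```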
Based on the per-category differences, the average difference is 19.05%, which means the actual
system accuracy is 80.95%. In addition to accuracy, the system was also tested for user acceptance. The
test was carried out using the USE questionnaire to measure the usability of the developed product.
The usability aspects comprise usefulness, ease of use, ease of learning, and satisfaction. Usability
testing by teachers obtained an average final score of 77.56%, consisting of usefulness 74.17%, ease of
use 80%, ease of learning 85%, and satisfaction 77.33%. According to the teachers, and based on these
results, the system is easy to learn; its overall usability score of 77.56% means it is quite useful to
apply. Teachers assessed the system on all four aspects, while the students evaluated only ease of use,
ease of learning, and satisfaction. The students gave the system an average of 79.58%: their
satisfaction with the system is 83.38%, which is quite high, while ease of use and ease of learning are
79.89% and 78.82%, respectively. According to the students' usability results, the system is not only
quite easy to use and learn but also satisfying. Figure 8 shows a diagram of the average scores in
usability testing.
4. CONCLUSION
This study aims to develop a product that can assist the implementation of affective assessment
in schools. Based on the research conducted, it can be concluded that this affective computing
assessment system: i) is able to detect emotions through students' body gestures; ii) is able to
provide student emotion detection data as a basis for affective assessment; iii) achieves an emotion
detection accuracy of 84.4%; iv) achieves an actual emotion detection accuracy of 80.95%; v) obtained
77.56% for teacher acceptance on the usability aspect; and vi) obtained 79.58% for student acceptance
on the usability aspect. Based on these conclusions, the system can be used in the school setting. This
research can also be developed further: future work is expected to improve the quality of emotion
detection by improving the quality of the dataset and the model used.
ACKNOWLEDGEMENTS
The authors thank the Biomedical, Electronics and Artificial Intelligence System (BEAIS)
Research Group, Faculty of Engineering, Universitas Negeri Yogyakarta, for supporting this work with
both financial aid and knowledge.
REFERENCES
[1] E. Yadegaridehkordi, N. F. B. M. Noor, M. N. Bin Ayub, H. B. Affal, and N. B. Hussin, “Affective computing in education: A
systematic review and future research,” Computers and Education, vol. 142, 2019, doi: 10.1016/j.compedu.2019.103649.
[2] C. H. Wu, Y. M. Huang, and J. P. Hwang, “Review of affective computing in education/learning: Trends and challenges,” British
Journal of Educational Technology, vol. 47, no. 6, pp. 1304–1323, 2016, doi: 10.1111/bjet.12324.
[3] D. Handayani, H. Yaacob, A. W. Abdul Rahman, W. Sediono, and A. Shah, “Systematic review of computational modeling of
mood and emotion,” 2014 the 5th International Conference on Information and Communication Technology for the Muslim
World, ICT4M 2014, 2014, doi: 10.1109/ICT4M.2014.7020611.
[4] J. Goebel and S. Maistry, “Recounting the role of emotions in learning economics: Using the threshold concepts framework to
explore affective dimensions of students’ learning,” International Review of Economics Education, vol. 30, 2019,
doi: 10.1016/j.iree.2018.08.001.
[5] R. Robina-Ramírez, J. A. Medina Merodio, and S. McCallum, “What role do emotions play in transforming students’
environmental behaviour at school?,” Journal of Cleaner Production, vol. 258, 2020, doi: 10.1016/j.jclepro.2020.120638.
[6] Ç. Atmaca, F. Rızaoğlu, T. Türkdoğan, and D. Yaylı, “An emotion focused approach in predicting teacher burnout and job
satisfaction,” Teaching and Teacher Education, vol. 90, 2020, doi: 10.1016/j.tate.2020.103025.
[7] Rabiudin, E. Taruh, and Mursalin, “Development of authentic affective assessment instrument in high school physics learning,”
Journal of Physics: Conference Series, vol. 1028, no. 1, 2018, doi: 10.1088/1742-6596/1028/1/012201.
[8] R. W. Morshead, “Taxonomy of educational objectives handbook II: Affective domain,” Studies in Philosophy and Education,
vol. 4, no. 1, pp. 164–170, 1965, doi: 10.1007/BF00373956.
[9] J. A. Santee, J. M. Marszalek, and K. L. Hardinger, “A critique of validity analysis from instruments assessing the affective
domain,” Currents in Pharmacy Teaching and Learning, vol. 11, no. 2, pp. 218–229, 2019, doi: 10.1016/j.cptl.2018.11.010.
[10] Khaerudin, S. Munadi, and Supianto, “Affective assessment using social media,” Universal Journal of Educational Research,
vol. 8, no. 7, pp. 2921–2928, 2020, doi: 10.13189/ujer.2020.080720.
[11] E. Rimland, “Assessing affective learning using a student response system,” Portal, vol. 13, no. 4, pp. 385–401, 2013,
doi: 10.1353/pla.2013.0037.
[12] F. A. Bachtiar, E. W. Cooper, G. H. Sulistyo, and K. Kamei, “Student assessment based on affective factors in English learning
using fuzzy inference,” International Journal of Affective Engineering, vol. 15, no. 2, pp. 101–108, 2016, doi: 10.5057/ijae.ijae-d-
15-00037.
[13] R. Pekrun, “The control-value theory of achievement emotions: Assumptions, corollaries, and implications for educational
research and practice,” Educational Psychology Review, vol. 18, no. 4, pp. 315–341, 2006, doi: 10.1007/s10648-006-9029-9.
[14] P. Moreira, D. Cunha, and R. A. Inman, “Achievement emotions questionnaire-mathematics (AEQ-M) in adolescents: Factorial
structure, measurement invariance and convergent validity with personality,” European Journal of Developmental Psychology,
vol. 16, no. 6, pp. 750–762, 2019, doi: 10.1080/17405629.2018.1548349.
[15] R. W. Picard, “Affective computing,” Affective Computing, 2019, doi: 10.7551/mitpress/1140.001.0001.
[16] S. B. Daily et al., “Affective computing: Historical foundations, current applications, and future trends,” Emotions and Affect in
Human Factors and Human-Computer Interaction, pp. 213–231, 2017, doi: 10.1016/B978-0-12-801851-4.00009-4.
[17] S. D’Mello and A. Graesser, “Mind and body: dialogue and posture for affect detection in learning environments,” Frontiers in
Artificial Intelligence and Applications, vol. 158, pp. 161–168, 2007.
[18] M. S. Hussain, O. Alzoubi, R. A. Calvo, and S. K. D’Mello, “Affect detection from multichannel physiology during learning
sessions with autotutor,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and
Lecture Notes in Bioinformatics), vol. 6738 LNAI, pp. 131–138, 2011, doi: 10.1007/978-3-642-21869-9_19.
[19] S. D’Mello and A. Graesser, “Dynamics of affective states during complex learning,” Learning and Instruction, vol. 22, no. 2,
pp. 145–157, 2012, doi: 10.1016/j.learninstruc.2011.10.001.
[20] A. Arguel, L. Lockyer, O. V. Lipp, J. M. Lodge, and G. Kennedy, “Inside out: Detecting learners’ confusion to improve
interactive digital learning environments,” Journal of Educational Computing Research, vol. 55, no. 4, pp. 526–551, 2017,
doi: 10.1177/0735633116674732.
[21] N. Thomas and M. Mathew, “Facial expression recognition system using neural network and MATLAB,” 2012 International
Conference on Computing, Communication and Applications, ICCCA 2012, 2012, doi: 10.1109/ICCCA.2012.6179169.
[22] M. Lyons, S. Akamatsu, M. Kamachi, and J. Gyoba, “Coding facial expressions with Gabor wavelets,” Proceedings - 3rd IEEE
International Conference on Automatic Face and Gesture Recognition, FG 1998, pp. 200–205, 1998,
doi: 10.1109/AFGR.1998.670949.
[23] B. C. Ko, “A brief review of facial emotion recognition based on visual information,” Sensors (Switzerland), vol. 18, no. 2, 2018,
doi: 10.3390/s18020401.
[24] B. Kratzwald, S. Ilić, M. Kraus, S. Feuerriegel, and H. Prendinger, “Deep learning for affective computing: Text-based emotion
recognition in decision support,” Decision Support Systems, vol. 115, pp. 24–35, 2018, doi: 10.1016/j.dss.2018.09.002.
[25] C. M. Chen and T. H. Lee, “Emotion recognition and communication for reducing second-language speaking anxiety in a web-
based one-to-one synchronous learning environment,” British Journal of Educational Technology, vol. 42, no. 3, pp. 417–440,
2011, doi: 10.1111/j.1467-8535.2009.01035.x.
[26] I. Behoora and C. S. Tucker, “Machine learning classification of design team members’ body language patterns for real time
emotional state detection,” Design Studies, vol. 39, pp. 100–127, 2015, doi: 10.1016/j.destud.2015.04.003.
[27] B. Sun, S. Cao, J. He, and L. Yu, “Affect recognition from facial movements and body gestures by hierarchical deep spatio-
temporal features and fusion strategy,” Neural Networks, vol. 105, pp. 36–51, 2018, doi: 10.1016/j.neunet.2017.11.021.
[28] S. K. Davis, M. Morningstar, M. A. Dirks, and P. Qualter, “Ability emotional intelligence: What about recognition of emotion in
voices?,” Personality and Individual Differences, vol. 160, 2020, doi: 10.1016/j.paid.2020.109938.
[29] I. Arroyo, B. P. Woolf, W. Burelson, K. Muldner, D. Rai, and M. Tai, “A multimedia adaptive tutoring system for mathematics
that addresses cognition, metacognition and affect,” International Journal of Artificial Intelligence in Education, vol. 24, no. 4,
pp. 387–426, 2014, doi: 10.1007/s40593-014-0023-y.
[30] N. Banda and P. Robinson, “Multimodal affect recognition in intelligent tutoring systems,” Lecture Notes in Computer Science
(including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 6975 LNCS, no. PART 2,
pp. 200–207, 2011, doi: 10.1007/978-3-642-24571-8_21.
[31] P. A. Jaques and R. M. Vicari, “A BDI approach to infer student’s emotions in an intelligent learning environment,” Computers
and Education, vol. 49, no. 2, pp. 360–384, 2007, doi: 10.1016/j.compedu.2005.09.002.
[32] R. C. D. Reis, S. Isotani, C. L. Rodriguez, K. T. Lyra, P. A. Jaques, and I. I. Bittencourt, “Affective states in computer-supported
collaborative learning: Studying the past to drive the future,” Computers and Education, vol. 120, pp. 29–50, 2018,
doi: 10.1016/j.compedu.2018.01.015.
[33] H. C. K. Lin, C. H. Wu, and Y. P. Hsueh, “The influence of using affective tutoring system in accounting remedial instruction on
learning performance and usability,” Computers in Human Behavior, vol. 41, pp. 514–522, 2014, doi: 10.1016/j.chb.2014.09.052.
[34] J. M. Harley, F. Bouchet, and R. Azevedo, “Aligning and comparing data on emotions experienced during learning with
metatutor,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in
Bioinformatics), vol. 7926 LNAI, pp. 61–70, 2013, doi: 10.1007/978-3-642-39112-5_7.
[35] H. C. K. Lin, C. H. Wang, C. J. Chao, and M. K. Chien, “Employing textual and facial emotion recognition to design an affective
tutoring system,” Turkish Online Journal of Educational Technology, vol. 11, no. 4, pp. 418–426, 2012.
[36] S. Caballe, “Towards a multi-modal emotion-awareness e-learning system,” Proceedings - 2015 International Conference on
Intelligent Networking and Collaborative Systems, IEEE INCoS 2015, pp. 280–287, 2015, doi: 10.1109/INCoS.2015.88.
[37] S. Afzal and P. Robinson, “Designing for automatic affect inference in learning environments,” Educational Technology and