Article
Facial Expression Recognition: A Survey
Yunxin Huang , Fei Chen, Shaohe Lv and Xiaodong Wang *
College of Computer, National University of Defense Technology, Changsha 410073, China;
[email protected] (Y.H.); [email protected] (F.C.); [email protected] (S.L.)
* Correspondence: [email protected]
Received: 2 September 2019; Accepted: 16 September 2019; Published: 20 September 2019
Abstract: Facial Expression Recognition (FER), as the primary processing method for non-verbal
intentions, is an important and promising field of computer vision and artificial intelligence, and one
of the subject areas of symmetry. This survey is a comprehensive and structured overview of recent
advances in FER. We first categorise the existing FER methods into two main groups, i.e., conventional
approaches and deep learning-based approaches. Methodologically, to highlight the differences
and similarities, we propose a general framework of a conventional FER approach and review the
possible technologies that can be employed in each component. As for deep learning-based methods,
four kinds of neural network-based state-of-the-art FER approaches are presented and analysed.
Besides, we introduce seventeen commonly used FER datasets and summarise four FER-related
elements of datasets that may influence the choice and processing of FER approaches. Evaluation
methods and metrics are given in the later part to show how to assess FER algorithms, along with
subsequent performance comparisons of different FER approaches on the benchmark datasets. At the
end of the survey, we present some challenges and opportunities that need to be addressed in the future.
1. Introduction
Facial expression is a major non-verbal means of expressing intentions in human communication.
The work of Mehrabian [1] in 1974 shows that 55% of a message pertaining to feelings and attitudes is
conveyed by facial expression, 7% by the words that are spoken, and the rest by paralinguistic cues
(the way that the words are said). Facial expression has thus proven to play a vital role in the entire
information exchange process in Mehrabian’s findings. With the rapid development of artificial
intelligence, automatic recognition of facial expressions has been intensively studied in recent years.
The study of Facial Expression Recognition (FER) has received extensive attention in the fields of
psychology, computer vision, and pattern recognition. FER has broad applications in multiple domains,
including human–computer interaction [2,3], virtual reality [4], augmented reality [5], advanced driver
assistance systems [6,7], education [8], and entertainment [9].
Various kinds of data can be used as the input of an emotion recognition system. Expression
recognition and emotion recognition are related but different. The human facial image is the mainstream
and promising input type, because it can provide abundant information for expression recognition
research. Besides facial images taken by a camera, physiological signals [10], e.g., electromyograph
(EMG), electrocardiogram (ECG), electroencephalograph (EEG), can also be employed as the auxiliary
data source in some real-world FER applications.
Considering that this survey only focuses on FER of visible facial expressions, the following
introduction and discussion are based on the AUs. Studies can be divided into two groups according
to whether the features are manually extracted or generated through the output of neural networks,
i.e., the conventional FER approaches and the deep learning-based FER approaches.
The conventional FER approach is composed of three major steps, i.e., image preprocessing,
feature extraction, and expression classification. Such methods, based on manual feature extraction, are
less dependent on data and hardware, which gives them an advantage when analysing small data samples.
Deep learning-based FER approaches greatly reduce the dependence on feature extraction by
employing “end-to-end” learning directly from the input data to the classification result. Note that massive
annotated datasets are the cornerstone of deep learning algorithms; otherwise, overfitting can
easily occur.
According to the shooting environment, FER-related data can be approximately divided into
laboratory type and wild type, and several publicly available FER datasets are described in detail in
Section 4. Most of the existing studies are based on laboratory datasets, such as JAFFE [14] and CK+ [15],
whose data come from volunteers who make the corresponding expressions under particular instructions.
Experiments based on this kind of data accelerate the advancement of FER algorithms. However,
with the development of artificial intelligence technology and the wide demand for applications in the
era of big data, studies on FER will increasingly focus on spontaneous expressions in the wild. New solutions
to FER in complex environments, e.g., occlusion, multi-view, and multi-objective settings, need to
be proposed.
1.2. Terminologies
Before reviewing the approaches of FER, we first present some related terminologies to
supplement the theoretical basis of FER technology. Facial Landmarks (FLs), Facial Action Units
(AUs), and Facial Action Coding System (FACS) are about how to convert facial action into expression.
Basic Emotions (BEs), Compound Emotions (CEs), and Micro Expressions (MEs) are different definition
criteria for expression categories. Existing studies on FER are based on the setting of these concepts
and terms.
Figure 2. Example of facial landmarks (face images are taken from JAFFE dataset [14]).
Figure 3. Some examples of AUs (images are taken from CK+ dataset [15]).
Table 1. Prototypical AUs seen in basic and compound emotion category, proposed in [19].
• Noise reduction is the first preprocessing step. Average Filter (AF), Gaussian Filter (GF), Median
Filter (MF), Adaptive Median Filter (AMF), and Bilateral Filter (BF) are frequently used image
processing filters.
• Face detection has developed into an independent field [34,35]. It is an essential pre-step in FER
systems, with the purpose of localising and extracting the face region.
• Normalisation of the scale and grayscale is to normalise size and colour of input images,
the purpose of which is to reduce calculation complexity under the premise of ensuring the
key features of the face [36–38].
• Histogram equalisation is applied to enhance the image effect [39].
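Putting the steps above together, the following is a minimal preprocessing sketch using OpenCV; the Haar cascade detector, crop size, and filter parameters are illustrative assumptions rather than the specific choices of any cited work.

```python
# A minimal preprocessing sketch (assumed parameters; OpenCV's stock Haar cascade
# stands in for whichever face detector a given FER system uses).
import cv2

def preprocess(image_path, out_size=96):
    img = cv2.imread(image_path)                       # load BGR image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)       # grayscale normalisation
    gray = cv2.GaussianBlur(gray, (3, 3), 0)           # noise reduction (Gaussian filter)

    # Face detection: localise and extract the face region.
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    face = gray[y:y + h, x:x + w]

    face = cv2.resize(face, (out_size, out_size))      # scale normalisation
    face = cv2.equalizeHist(face)                      # histogram equalisation
    return face
```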
Feng et al. [44] establish local LBP histograms and concatenate them for FER. An improved LBP-based
algorithm, i.e., Complete Local Binary Pattern (CLBP), is proposed in [45], which performs better
than the original LBP algorithm but suffers from the curse of dimensionality. Jabid et al. [46]
introduce the LBP-based Local Directional Pattern (LDP) algorithm, which is robust to illumination and
has relatively low computational complexity. In addition, Local Phase Quantisation (LPQ) [47] is
mainly based on the short-time Fourier Transform and is stable in feature extraction. In [48], the improved
es-LBP (expression-specific LBP) feature is proposed to extract spatial information, and the cr-LPP
(class-regularised Locality Preserving Projection) method is proposed to simultaneously maximise
class independence and preserve local feature similarity.
Compared with Gabor wavelet, the LBP operator requires less storage space and has higher
computational efficiency. However, the LBP operator performs poorly on noisy images. It may also
lose useful feature information, since it only compares the centre pixel of a patch with its
neighbourhood and ignores the differences in amplitude.
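As an illustration of the block-wise LBP histogram idea behind [43,44], the sketch below extracts uniform LBP codes with scikit-image and concatenates per-block histograms; the grid size and LBP radius are assumed values.

```python
# A sketch of block-wise LBP histogram features (uniform LBP, assumed 7x7 grid).
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_feature(face, P=8, R=1, grid=(7, 7)):
    codes = local_binary_pattern(face, P, R, method="uniform")   # per-pixel LBP codes
    n_bins = P + 2                                                # uniform patterns + "other"
    gh, gw = grid
    h, w = codes.shape
    feats = []
    for i in range(gh):
        for j in range(gw):
            block = codes[i * h // gh:(i + 1) * h // gh,
                          j * w // gw:(j + 1) * w // gw]
            hist, _ = np.histogram(block, bins=n_bins, range=(0, n_bins), density=True)
            feats.append(hist)
    return np.concatenate(feats)   # concatenated local histograms -> feature vector
```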
2.2.3. ASM/AAM
The Active Shape Model (ASM) proposed in [49] is based on statistical models and is generally
used to extract feature points on expression contours. This model mainly uses the global shape model
to match the initial shape of the human face, and then establish a local texture model to obtain the
contour features of the target more accurately. The Active Appearance Model (AAM) [50] is developed
on the basis of ASM by incorporating local texture features.
Cristinacce [51] fuses PRFR (Pairwise Reinforcement of Feature Responses) with AAM to detect
feature points of local edges such as facial organs. Saatci [52] subtly cascades AAM with the SVM
classifier to improve the recognition rates.
Although ASM is more efficient than AAM, AAM can obtain a higher recognition rate than ASM as it
better fits texture features.
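For a concrete picture of landmark-based shape fitting, the sketch below uses dlib's pre-trained 68-point regressor; note that this is an ensemble-of-regression-trees fitter standing in for ASM/AAM in the same role (extracting facial contour points), and the model file name is an assumption that must be downloaded separately.

```python
# Not ASM/AAM themselves, but a sketch of the same role: fitting facial contour
# points with dlib's 68-point landmark regressor. The model path is an assumption.
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmarks(gray):
    rects = detector(gray, 1)                 # detect face bounding boxes
    if not rects:
        return None
    shape = predictor(gray, rects[0])         # fit 68 landmark points
    return np.array([[shape.part(i).x, shape.part(i).y] for i in range(68)])
```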
In [60], the SIFT algorithm is combined with the KLT matching algorithm: SIFT extraction keeps
the feature points evenly distributed without aggregation, and the KLT matching algorithm is
designed hierarchically and iteratively so that the target can be tracked quickly even under obvious
pose and size changes.
Feature extraction is a decisive phase in FER. Note that some classifiers may pose the problem of
curse of dimensionality (i.e., the phenomenon that data becomes sparser in high-dimensional space)
and over-fitting when the number of extracted features is overwhelming. Dimensionality reduction
methods are frequently embedded in this scenario, which can improve learning performance,
increase computational efficiency, and decrease memory storage. Traditional dimensionality reduction
methods, including PCA and LDA, have broad applications in many machine learning tasks. A feature
value selection method designed for outlier detection or object recognition is proposed in [61], which can
be employed on data with binary or nominal features.
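A hedged sketch of PCA-based dimensionality reduction applied to hand-crafted feature vectors is given below; the 95% explained-variance threshold and the placeholder feature matrix are illustrative.

```python
# PCA-based dimensionality reduction before classification; values are placeholders.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(200, 2450)          # placeholder: 200 samples of LBP-style features
pca = PCA(n_components=0.95)           # keep enough components for 95% of the variance
X_reduced = pca.fit_transform(X)       # lower-dimensional, decorrelated features
print(X_reduced.shape)
```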
The conventional FER approaches are obviously less dependent on data and hardware
compared to deep learning-based approaches. However, feature extraction and classification have
to be designed manually and separately, which means these two phases cannot be optimised
simultaneously. The effectiveness of conventional FER methods is bound by the performance of
each individual component.
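To make the separation of stages concrete, the following sketch chains hand-crafted features, PCA, and an SVM (one common classifier choice) with scikit-learn; the feature matrix, labels, and hyper-parameters are placeholders, and each stage is designed and tuned independently of the others.

```python
# A sketch of the conventional pipeline: pre-extracted features -> PCA -> SVM.
# X_feat and y stand in for features and labels produced by earlier steps.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X_feat = np.random.rand(300, 490)          # placeholder feature matrix
y = np.random.randint(0, 7, size=300)      # placeholder labels for 7 basic expressions

clf = make_pipeline(PCA(n_components=50), SVC(kernel="rbf", C=10.0))
print(cross_val_score(clf, X_feat, y, cv=5).mean())   # separately optimised stages
```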
CNN is directly adapted for AU detection in most of the deep learning-based FER approaches.
The CNN-based FER method proposed in [84] employs Facial Action Coding System (FACS) feature
representation, which shows a generalisation capability of networks both cross-data and cross-task
related to FER. A good recognition rate is obtained when applying the model to the task of
micro-expression detection.
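For reference, a minimal CNN classifier for expression labels might look like the PyTorch sketch below; this is not the architecture of [84], and the layer sizes, input resolution, and seven-class output are assumptions.

```python
# A minimal CNN classifier sketch for facial expression labels (illustrative only).
import torch
import torch.nn as nn

class SimpleFERCNN(nn.Module):
    def __init__(self, n_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 96 -> 48
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 48 -> 24
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 24 -> 12
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(128 * 12 * 12, 256), nn.ReLU(),
            nn.Dropout(0.5), nn.Linear(256, n_classes),
        )

    def forward(self, x):                 # x: (batch, 1, 96, 96) grayscale faces
        return self.classifier(self.features(x))

logits = SimpleFERCNN()(torch.randn(4, 1, 96, 96))
print(logits.shape)                       # torch.Size([4, 7])
```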
Liu et al. [85] proposed a deformable facial action parts model for dynamic expression analysis.
A deformable facial parts learning module is incorporated into the 3D CNN that can detect a particular
facial action part under structured spatial constraint, and simultaneously obtain a representation based
on the discriminating part.
In order to take advantage of temporal features for recognising facial expression, a new integration
method named Deep Temporal Appearance-Geometry Network (DTAGN) is proposed in [86].
The Deep Temporal Appearance Network (DTAN) extracts appearance features from image sequences.
The Deep Temporal Geometry Network (DTGN) extracts geometry features from temporal FL points.
The joint DTAGN boosts the performance of FER by making full use of temporal information.
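The sketch below is not the DTAGN architecture itself, but illustrates one simple way to consume temporal geometry features: an LSTM over per-frame landmark coordinates. The sequence length, landmark count, and hidden size are assumptions.

```python
# A hedged sketch of a temporal model over facial-landmark sequences (not DTAGN).
import torch
import torch.nn as nn

class LandmarkSequenceNet(nn.Module):
    def __init__(self, n_landmarks=68, hidden=128, n_classes=7):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2 * n_landmarks, hidden_size=hidden,
                            batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, seq):               # seq: (batch, frames, 2 * n_landmarks)
        _, (h_n, _) = self.lstm(seq)      # final hidden state summarises the sequence
        return self.fc(h_n[-1])

print(LandmarkSequenceNet()(torch.randn(4, 16, 136)).shape)   # torch.Size([4, 7])
```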
The proposed network in [87] employs two convolutional layers and four inception layers to
extend both the depth and width of the network and keep the computational budget constant as well.
Compared with conventional CNN approaches, the proposed approach has shallower networks and
a clear advantage in classification accuracy in cross-database evaluation scenarios.
The work [88] is about occlusion aware facial expression recognition. Considering different
facial ROIs (Regions of Interest), Li et al. introduce two versions of ACNN (Convolutional Neural
Network with Attention mechanism), i.e., pACNN (patch-based ACNN) and gACNN (global–local
based ACNN). The pACNN focuses on local facial patches, while the gACNN combines local patches
with global images. Experimental results show that ACNNs are able to improve the FER accuracy on
occluded facial images by replacing the occluded patches with other related but non-occluded patches.
and applies this information to classify the sequences. Facial landmarks, visual highlights of facial
components, are used as inputs to the network.
For multi-view FER, Lai et al. propose a multi-task GAN-based learning approach [99], where
the generator synthesises a frontal face image from the input non-frontal image while retaining
the identity and expression information, and the discriminator is trained to distinguish the generated
images and to recognise the expression. This face frontalisation scheme is shown to be effective for
FER under visible head pose variations.
An end-to-end GAN-based model is presented in [100]. The encoder–decoder structure of the
generator first learns an identity representation for face images, which is then explicitly controlled
through the expression and pose codes. In addition, the model is able to enlarge the FER training set
by automatically generating face images with arbitrary expressions and head poses.
In [101], GANs are employed to train the generator to generate six basic expressions from a face
image, while a CNN is fine-tuned for expression classification in each single-identity sub-space. This model
can alleviate the effect of inter-subject variations, and can be integrated with other CNN
frameworks for FER.
To protect the user’s privacy, Chen et al. [102] present a Privacy-Preserving Representation-Learning
Variational Generative Adversarial Network (PPRL-VGAN) to learn an image representation that is
explicitly disentangled from the identity information. A DeRL (De-expression Residue Learning)
procedure is proposed in [103] to extract information about the expressive component. A cGAN
(conditional Generative Adversarial Network) generates the corresponding neutral face image for any
input so as to filter out the expressive information.
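The GAN-based methods above differ in detail, but share the pattern of a generator conditioned on an expression code and a discriminator judging the result. The schematic pair below illustrates that pattern only; it is not any of the cited models, and all layer sizes are assumptions.

```python
# A schematic conditional-GAN pair (illustrative only, not a cited architecture).
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim=100, n_expr=6, img_dim=64 * 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + n_expr, 512), nn.ReLU(),
            nn.Linear(512, img_dim), nn.Tanh(),
        )

    def forward(self, z, expr_onehot):             # condition on the expression code
        return self.net(torch.cat([z, expr_onehot], dim=1))

class Discriminator(nn.Module):
    def __init__(self, n_expr=6, img_dim=64 * 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim + n_expr, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1),                      # real/fake score
        )

    def forward(self, img, expr_onehot):
        return self.net(torch.cat([img, expr_onehot], dim=1))

fake = Generator()(torch.randn(2, 100), torch.eye(6)[:2])   # two conditioned samples
print(fake.shape)                                            # torch.Size([2, 4096])
```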
4. Datasets
Training and testing on existing datasets is a frequently used method of expression recognition.
In this section, some FER-related datasets are provided and discussed with data augmentation methods.
Figure 10. Examples of six representative datasets related to FER. (a) JAFFE [14]; (b) KDEF [104];
(c) CK+ [15]; (d) MMI [105]; (e) MPI [106]; (f) UNBC [107].
are manually scored with AU intensities (on a 0–5 scale). A total of 66 facial landmarks in each
image of the dataset are also labelled.
in four recording sessions for a total of more than 750,000 images. The labels are AAM-style with
between 39 and 68 feature points.
4.1.14. GEMEP-FERA
The GEneva Multimodal Emotion Portrayals (GEMEP) [115] is a collection of audio and video
recordings featuring ten actors portraying 18 emotional states, with different verbal contents and
different modes of expression. This corpus consists of more than 7000 audio-video emotion portrayals,
representing 18 emotions (including rarely studied subtle emotions), portrayed by ten professional
actors who are coached by a professional director.
4.2. Discussion
According to the peculiarity of human facial expression, a dataset has four notable elements,
namely image dimension, shooting environment, labelling method, and elicitation method.
An overview of FER-related datasets is presented in Table 2.
• 2D-type: The traditional 2D laboratory dataset usually has good separability of different categories,
due to its exaggerated expression and limited variables. The JAFFE dataset [14] specially uses
Japanese females as the subjects. CE [21] consists of 22 categories of emotions of 230 subjects
with facial occlusion minimised. This type of dataset is useful for understanding the procedure of
expression recognition and comparing the performances of different experimental methods.
• 3D-type: The establishment of 3D-based facial expression datasets accelerates the examination of 3D
spatiotemporal features in subtle facial expressions, revealing the connection between pose and
motion dynamics in AUs and providing a better understanding of spontaneous facial actions. The BU-3DFE [109]
and BP4D-Spontaneous [110] are published to accelerate the research on the facial behaviour and
3D structure of facial expressions.
• Unique condition: Frontal face images under a single specific condition can provide accurate
feature information for expression recognition, but the trained model is very limited in scope,
e.g., JAFFE [14] and CK+ [15].
• Complex condition: In order to improve the application scope and processing ability of the FER
model, some datasets selectively collect expression images of various illumination conditions,
face directions and head postures, e.g., MPI [106] and Oulu-CASIA [113].
• Wild condition: Facial expressions in the wild are close to real-world environments, e.g.,
AFEW [116] extracted from movies and RAF-DB [118,119] downloaded from the Internet. Such
datasets are more challenging.
Figure 11. An example of data augmentation (face images are taken from JAFFE dataset [14]).
Jeon et al. [122] apply data augmentation in both the training and testing phases. They use
randomly cropped 42 × 42 pixel images and their mirrored images to attain eight times more
data for the training input. An averaging method is then used in the testing phase to reduce outliers:
the probabilities of the cropped and mirrored images are averaged as the final output.
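A hedged sketch of this crop-and-mirror augmentation with test-time averaging, using torchvision, is shown below; the 42 × 42 crop size follows [122], while the model and the remaining transform choices are placeholders.

```python
# Crop/mirror augmentation for training, and ten-view probability averaging at test time.
import torch
from torchvision import transforms

train_tf = transforms.Compose([
    transforms.Grayscale(),
    transforms.RandomCrop(42),                # random 42x42 crop
    transforms.RandomHorizontalFlip(),        # mirrored images
    transforms.ToTensor(),
])

test_tf = transforms.Compose([
    transforms.Grayscale(),
    transforms.TenCrop(42),                   # 5 crops + their mirrors = 10 views
    transforms.Lambda(lambda crops: torch.stack(
        [transforms.ToTensor()(c) for c in crops])),
])

def predict(model, img):
    views = test_tf(img)                      # (10, 1, 42, 42)
    with torch.no_grad():
        probs = torch.softmax(model(views), dim=1)
    return probs.mean(dim=0)                  # average probabilities over all views
```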
5. Performance Metrics
In practical tasks, multiple learning algorithms can be chosen, and even for the same learning
algorithm, different parameters lead to a variety of results. Evaluation metrics are critical to identify
the merits of a method because they provide a standard for quantitative comparison. In this section,
we present the evaluation methods and evaluation metrics that are publicly available in the FER
studies. The recognition rate of different methods is also compared with the FER typical classification
method introduced in the previous section.
Acc = (TP + TN) / (TP + TN + FP + FN), (1)
where TP, TN, FP, and FN represent true positive, true negative, false positive, and false
negative, respectively.
Some metrics for binary classification can be extended for multi-class classification
evaluations [123] (e.g., Precision P, Recall R, and F-measure). Apart from the metrics related to
effectiveness, other evaluation methods for measuring the efficiency and scalability of the classifier,
e.g., execution time, training time, and resource occupancy, also need to be considered in practice.
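As a concrete example, the listed effectiveness metrics can be computed for a multi-class FER output with scikit-learn as sketched below; the label arrays are placeholders.

```python
# Accuracy (Equation (1)), macro-averaged precision/recall/F-measure, and the
# confusion matrix for a multi-class prediction; labels are placeholder values.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support, confusion_matrix

y_true = [0, 1, 2, 2, 1, 0, 3]
y_pred = [0, 1, 2, 1, 1, 0, 2]

acc = accuracy_score(y_true, y_pred)
p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred,
                                              average="macro", zero_division=0)
print(acc, p, r, f1)
print(confusion_matrix(y_true, y_pred))        # per-class error structure
```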
• A high accuracy of over 90% can be obtained by state-of-the-art FER approaches on JAFFE [14]
and CK+ [15] datasets. As early collections of pose-invariant human facial expressions with
minimised facial occlusion, JAFFE [14] and CK+ [15] are the datasets on which the majority of FER
approaches are evaluated, and they are widely used as benchmarks for FER comparison.
• The accuracy of FER approaches on MMI [105] dataset is generally less than 80%. There are
significant inter-personal variations because of the subjects’ non-uniform performances and
accessories (e.g., glasses, moustaches). Experiments with eight FER methods illustrate that deep
learning-based methods perform better on the MMI dataset, with about 10% higher accuracy.
• Unlike the three datasets above, the accuracy on Multi-PIE [112] and Oulu-CASIA [113] is
about 80%. These two datasets are collected under complex conditions, which support FER
research on illumination variation. FER approaches perform better on Multi-PIE even though its
shooting conditions are more complicated (15 viewpoints with 19 illumination conditions in
Multi-PIE, versus the frontal direction with three illumination conditions in Oulu-CASIA). That is
because Multi-PIE is labelled with feature points as a priori knowledge.
• The latest FER approaches, especially the GAN-based approaches, can achieve about 80% accuracy
on the BU-3DFE [109] dataset, which has different properties from the other datasets. As a 3D-based
facial expression dataset, BU-3DFE reveals the connection between pose and motion dynamics in
facial AUs, providing sufficient feature information for the study of multi-pose FER. Meanwhile,
detailed 3D information and facial landmark annotations are applicable to GAN-based algorithms
for image generation.
• The accuracy rates of FER on FERA [115], FER2013 [114], and SFEW [117] are approximately
70%, 60%, and 40%, respectively. These three datasets have more challenging conditions, i.e.,
subjects perform spontaneous expressions in wild circumstances. Neither the conventional
approaches nor the deep learning-based approaches achieve high accuracy in dealing with FER
in the wild, which remains one of the challenges in this field.
protecting the visual privacy of their users. Hence, more trustworthy and accurate privacy protection
methods are needed so as to strike a balance between privacy and data utility for FER systems.
7. Conclusions
Facial Expression Recognition (FER) has attracted increasing attention in recent years. The past
decade has witnessed the development of many new FER algorithms. This paper provides a
comprehensive review about recent advances in FER technology. We first introduce some related
terminology and review the research background of FER. Then, we classify the existing FER
methods into conventional methods and deep learning-based methods. In particular, we divide
the conventional methods into three major steps, i.e., image preprocessing, feature extraction,
and expression classification. In each step, various possible methods are introduced and discussed.
In terms of deep learning-based methods, four kinds of popular deep learning networks are presented,
and some related FER algorithms are reviewed and analysed. Besides, seventeen FER datasets are
introduced. Four FER-related elements of datasets are subsequently summarised. In addition, some
methods and metrics are given on how to evaluate these FER algorithms. At the end of the survey,
we present some challenges and opportunities of FER that require future research. This survey
aims to provide an organised and detailed study of the work done in the area of FER and further
promote the research in this field.
Author Contributions: Conceptualisation, Y.H.; writing–original draft preparation, Y.H.; writing–review and
editing, Y.H., F.C. and X.W.; supervision, S.L. and X.W.; project administration, S.L.
Funding: This research was funded by the National Natural Science Foundation of China (61572513).
Acknowledgments: We would like to thank the anonymous reviewers for their constructive comments.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Mehrabian, A.; Russell, J.A. An Approach to Environmental Psychology; The MIT Press: Cambridge, MA,
USA, 1974.
2. Cowie, R.; Douglas-Cowie, E.; Tsapatsoulis, N.; Votsis, G.; Kollias, S.; Fellenz, W.; Taylor, J.G. Emotion
recognition in human-computer interaction. IEEE Signal Process. Mag. 2001, 18, 32–80. [CrossRef]
3. Bartlett, M.S.; Littlewort, G.; Fasel, I.; Movellan, J.R. Real Time Face Detection and Facial Expression
Recognition: Development and Applications to Human Computer Interaction. In Proceedings of the
Conference on Computer Vision and Pattern Recognition Workshop, Madison, WI, USA, 16–22 June 2003;
Volume 5, pp. 53–53.
4. Bekele, E.; Zheng, Z.; Swanson, A.; Crittendon, J.; Warren, Z.; Sarkar, N. Understanding how adolescents
with autism respond to facial expressions in virtual reality environments. IEEE Trans. Vis. Comput. Graph.
2013, 19, 711–720. [CrossRef] [PubMed]
5. Chen, C.H.; Lee, I.J.; Lin, L.Y. Augmented reality-based self-facial modeling to promote the emotional
expression and social skills of adolescents with autism spectrum disorders. Res. Dev. Disabil. 2015,
36, 396–403. [CrossRef] [PubMed]
6. Assari, M.A.; Rahmati, M. Driver drowsiness detection using face expression recognition. In Proceedings of
the IEEE International Conference on Signal and Image Processing Applications, Kuala Lumpur, Malaysia,
16–18 November 2011; pp. 337–341.
7. Jabon, M.; Bailenson, J.; Pontikakis, E.; Takayama, L.; Nass, C. Facial expression analysis for predicting
unsafe driving behavior. IEEE Perv. Comput. 2011, 10, 84–95. [CrossRef]
8. Kapoor, A.; Burleson, W.; Picard, R.W. Automatic prediction of frustration. Int. J. Hum.-Comput. Stud. 2007,
65, 724–736. [CrossRef]
9. Lankes, M.; Riegler, S.; Weiss, A.; Mirlacher, T.; Pirker, M.; Tscheligi, M. Facial expressions as game input
with different emotional feedback conditions. In Proceedings of the 2008 International Conference on
Advances in Computer Entertainment Technology, Yokohama, Japan, 3–5 December 2008; pp. 253–256.
10. Jerritta, S.; Murugappan, M.; Nagarajan, R.; Wan, K. Physiological signals based human emotion recognition:
A review. In Proceedings of the IEEE 7th International Colloquium on Signal Processing and its Applications,
Penang, Malaysia, 4–6 March 2011; pp. 410–415.
11. Tian, Y.I.; Kanade, T.; Cohn, J.F. Recognizing action units for facial expression analysis. IEEE Trans. Pattern
Anal. Mach. Intell. 2001, 23, 97–115. [CrossRef]
12. Russell, J.A. A circumplex model of affect. J. Pers. Soc. Psychol. 1980, 39, 1161. [CrossRef]
13. Chang, W.Y.; Hsu, S.H.; Chien, J.H. FATAUVA-Net: An integrated deep learning framework for facial
attribute recognition, action unit detection, and valence-arousal estimation. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Honolulu, HI, USA, 21–26
July 2017; pp. 17–25.
14. Lyons, M.; Akamatsu, S.; Kamachi, M.; Gyoba, J. Coding facial expressions with gabor wavelets.
In Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition,
Nara, Japan, 14–16 April 1998; pp. 200–205.
15. Lucey, P.; Cohn, J.F.; Kanade, T.; Saragih, J.; Ambadar, Z.; Matthews, I. The extended cohn-kanade dataset
(ck+): A complete dataset for action unit and emotion-specified expression. In Proceedings of the IEEE
Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, San Francisco, CA,
USA, 13–18 June 2010; pp. 94–101.
16. Zhang, Z.; Luo, P.; Loy, C.C.; Tang, X. Facial landmark detection by deep multi-task learning. In Proceedings
of the European Conference on Computer Vision; Springer: Berlin, Germany, 2014; pp. 94–108.
17. Wu, Y.; Ji, Q. Facial landmark detection: A literature survey. Int. J. Comput. Vis. 2019, 127, 115–142.
[CrossRef]
18. Ekman, P.; Friesen, W.V. Facial Action Coding System: Investigator’s Guide; Consulting Psychologists Press:
Palo Alto, CA, USA, 1978.
19. Benitez-Quiroz, C.F.; Srinivasan, R.; Martinez, A.M. Emotionet: An accurate, real-time algorithm for the
automatic annotation of a million facial expressions in the wild. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 5562–5570.
20. Ekman, P. An argument for basic emotions. Cogn. Emot. 1992, 6, 169–200. [CrossRef]
21. Du, S.; Tao, Y.; Martinez, A.M. Compound facial expressions of emotion. Proc. Natl. Acad. Sci. USA 2014,
111, E1454–E1462. [CrossRef]
22. Ekman, P. Darwin, deception, and facial expression. Ann. N. Y. Acad. Sci. 2003, 1000, 205–221. [CrossRef]
[PubMed]
23. Samal, A.; Iyengar, P.A. Automatic recognition and analysis of human faces and facial expressions: A survey.
Pattern Recognit. 1992, 25, 65–77. [CrossRef]
24. Fasel, B.; Luettin, J. Automatic facial expression analysis: A survey. Pattern Recognit. 2003, 36, 259–275.
[CrossRef]
25. Sandbach, G.; Zafeiriou, S.; Pantic, M.; Yin, L. Static and dynamic 3D facial expression recognition:
A comprehensive survey. Image Vis. Comput. 2012, 30, 683–697. [CrossRef]
26. Danelakis, A.; Theoharis, T.; Pratikakis, I. A survey on facial expression recognition in 3D video sequences.
Multimed. Tools Appl. 2015, 74, 5577–5615. [CrossRef]
27. Takalkar, M.; Xu, M.; Wu, Q.; Chaczko, Z. A survey: Facial micro-expression recognition. Multimed. Tools Appl.
2018, 77, 19301–19325. [CrossRef]
28. Kumari, J.; Rajesh, R.; Pooja, K. Facial expression recognition: A survey. Procedia Comput. Sci. 2015,
58, 486–491. [CrossRef]
29. Huang, D.; Shan, C.; Ardabilian, M.; Wang, Y.; Chen, L. Local binary patterns and its application to facial
image analysis: A survey. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2011, 41, 765–781. [CrossRef]
30. Zhang, L.; Verma, B.; Tjondronegoro, D.; Chandran, V. Facial Expression Analysis under Partial Occlusion:
A Survey. ACM Comput. Surv. (CSUR) 2018, 51, 25. [CrossRef]
31. Deshmukh, S.; Patwardhan, M.; Mahajan, A. Survey on real-time facial expression recognition techniques.
IET Biometr. 2016, 5, 155–163. [CrossRef]
32. Goyal, S.J.; Upadhyay, A.K.; Jadon, R.; Goyal, R. Real-Life Facial Expression Recognition Systems: A Review.
In Smart Computing and Informatics; Springer: Berlin, Germany, 2018; pp. 311–331.
33. Khan, S.A.; Hussain, A.; Usman, M. Facial expression recognition on real world face images using intelligent
techniques: A survey. Opt.-Int. J. Light Electron Opt. 2016, 127, 6195–6203. [CrossRef]
34. Viola, P.; Jones, M.J. Robust real-time face detection. Int. J. Comput. Vis. 2004, 57, 137–154. [CrossRef]
35. Hsu, R.L.; Abdel-Mottaleb, M.; Jain, A.K. Face detection in color images. IEEE Trans. Pattern Anal.
Mach. Intell. 2002, 24, 696–706.
36. Shan, S.; Gao, W.; Cao, B.; Zhao, D. Illumination normalization for robust face recognition against varying
lighting conditions. In Proceedings of the 2003 IEEE International SOI Conference, Nice, France, 17 October
2003; pp. 157–164.
37. Chen, W.; Er, M.J.; Wu, S. Illumination compensation and normalization for robust face recognition using
discrete cosine transform in logarithm domain. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2006,
36, 458–466. [CrossRef]
38. Du, S.; Ward, R. Wavelet-based illumination normalization for face recognition. In Proceedings of the IEEE
International Conference on Image Processing 2005, Genova, Italy, 14 September 2005; Volume 2, pp. II–954.
39. Tan, T.; Sim, K.; Tso, C.P. Image enhancement using background brightness preserving histogram
equalisation. Electron. Lett. 2012, 48, 155–157. [CrossRef]
40. Zhang, S.; Li, L.; Zhao, Z. Facial expression recognition based on Gabor wavelets and sparse representation.
In Proceedings of the IEEE 11th International Conference on Signal Processing, Beijing, China, 21–25 October
2012; Volume 2, pp. 816–819.
41. Yu, J.; Bhanu, B. Evolutionary feature synthesis for facial expression recognition. Pattern Recognit. Lett. 2006,
27, 1289–1298. [CrossRef]
42. Mattela, G.; Gupta, S.K. Facial Expression Recognition Using Gabor-Mean-DWT Feature Extraction
Technique. In Proceedings of the 5th International Conference on Signal Processing and Integrated Networks
(SPIN), Noida, India, 22–23 February 2018; pp. 575–580.
43. Ahonen, T.; Hadid, A.; Pietikäinen, M. Face recognition with local binary patterns. In European Conference on
Computer Vision; Springer: Berlin, Germany, 2004; pp. 469–481.
44. Feng, X.; Pietikäinen, M.; Hadid, A. Facial expression recognition based on local binary patterns.
Pattern Recognit. Image Anal. 2007, 17, 592–598. [CrossRef]
45. Guo, Z.; Zhang, L.; Zhang, D. A completed modeling of local binary pattern operator for texture classification.
IEEE Trans. Image Process. 2010, 19, 1657–1663.
46. Jabid, T.; Kabir, M.H.; Chae, O. Robust facial expression recognition based on local directional pattern.
ETRI J. 2010, 32, 784–794. [CrossRef]
47. Wang, Z.; Ying, Z. Facial expression recognition based on local phase quantization and sparse representation.
In Proceedings of the 2012 8th International Conference on Natural Computation, Chongqing, China, 29–31
May 2012; pp. 222–225.
48. Chao, W.L.; Ding, J.J.; Liu, J.Z. Facial expression recognition based on improved local binary pattern and
class-regularized locality preserving projection. Signal Process. 2015, 117, 1–10. [CrossRef]
49. Cootes, T.F.; Taylor, C.J.; Cooper, D.H.; Graham, J. Active shape models-their training and application.
Comput. Vis. Image Underst. 1995, 61, 38–59. [CrossRef]
50. Cootes, T.F.; Edwards, G.J.; Taylor, C.J. Active appearance models. IEEE Trans. Pattern Anal. Mach. Intell.
2001, 6, 681–685. [CrossRef]
51. Cristinacce, D.; Cootes, T.F.; Scott, I.M. A multi-stage approach to facial feature detection. In Proceedings of
the British Machine Vision Conference (BMVC), Kingston, UK, 7–9 September 2004; Volume 1, pp. 277–286.
52. Saatci, Y.; Town, C. Cascaded classification of gender and facial expression using active appearance models.
In Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition (FGR06),
Southampton, UK, 10–12 April 2006; pp. 393–398.
53. Horn, B.K.; Schunck, B.G. Determining optical flow. Artif. Intell. 1981, 17, 185–203. [CrossRef]
54. Yacoob, Y.; Davis, L.S. Recognizing human facial expressions from long image sequences using optical flow.
IEEE Trans. Pattern Anal. Mach. Intell. 1996, 18, 636–642. [CrossRef]
55. Cohn, J.F.; Zlochower, A.J.; Lien, J.J.; Kanade, T. Feature-point tracking by optical flow discriminates subtle
differences in facial expression. In Proceedings of the Third IEEE International Conference on Automatic
Face and Gesture Recognition, Nara, Japan, 14–16 April 1998; p. 396.
56. Sánchez, A.; Ruiz, J.V.; Moreno, A.B.; Montemayor, A.S.; Hernández, J.; Pantrigo, J.J. Differential optical flow
applied to automatic facial expression recognition. Neurocomputing 2011, 74, 1272–1282. [CrossRef]
57. Viola, P.; Jones, M. Rapid object detection using a boosted cascade of simple features. In Proceedings of the
CVPR, Kauai, HI, USA, 8–14 December 2001; Volume 1, pp. 511–518.
58. Yang, P.; Liu, Q.; Metaxas, D.N. Boosting encoded dynamic features for facial expression recognition.
Pattern Recognit. Lett. 2009, 30, 132–139. [CrossRef]
59. Tie, Y.; Guan, L. A deformable 3-D facial expression model for dynamic human emotional state recognition.
IEEE Trans. Circuits Syst. Video Technol. 2013, 23, 142–157. [CrossRef]
60. Liu, Y.; Wang, J.D.; Li, P. A Feature Point Tracking Method Based on The Combination of SIFT Algorithm
and KLT Matching Algorithm. J. Astronaut. 2011, 7, 028.
61. Xu, H.; Wang, Y.; Cheng, L.; Wang, Y.; Ma, X. Exploring a High-quality Outlying Feature Value Set for
Noise-Resilient Outlier Detection in Categorical Data. In Proceedings of the Conference on Information and
Knowledge Management (CIKM), Turin, Italy, 22–26 October 2018; pp. 17–26.
62. Sohail, A.S.M.; Bhattacharya, P. Classification of facial expressions using k-nearest neighbor classifier.
In Proceedings of the International Conference on Computer Vision/Computer Graphics Collaboration Techniques and
Applications; Springer: Berlin, Germany, 2007; pp. 555–566.
63. Wang, X.H.; Liu, A.; Zhang, S.Q. New facial expression recognition based on FSVM and KNN. Optik 2015,
126, 3132–3134. [CrossRef]
64. Valstar, M.; Patras, I.; Pantic, M. Facial action unit recognition using temporal templates. In Proceedings
of the 13th IEEE International Workshop on Robot and Human Interactive Communication, Kurashiki,
Okayama, Japan, 22 September 2004; pp. 253–258.
65. Michel, P.; El Kaliouby, R. Real time facial expression recognition in video using support vector machines.
In Proceedings of the 5th International Conference on Multimodal Interfaces, Vancouver, BC, Canada,
5–7 November 2003; pp. 258–264.
66. Tsai, H.H.; Chang, Y.C. Facial expression recognition using a combination of multiple facial features and
support vector machine. Soft Comput. 2018, 22, 4389–4405. [CrossRef]
67. Hsieh, C.C.; Hsih, M.H.; Jiang, M.K.; Cheng, Y.M.; Liang, E.H. Effective semantic features for facial
expressions recognition using SVM. Multimed. Tools Appl. 2016, 75, 6663–6682. [CrossRef]
68. Saeed, S.; Baber, J.; Bakhtyar, M.; Ullah, I.; Sheikh, N.; Dad, I.; Sanjrani, A.A. Empirical Evaluation of SVM
for Facial Expression Recognition. Int. J. Adv. Comput. Sci. Appl. 2018, 9, 670–673. [CrossRef]
69. Shah, J.H.; Sharif, M.; Yasmin, M.; Fernandes, S.L. Facial expressions classification and false label reduction
using LDA and threefold SVM. Pattern Recognit. Lett. 2017. [CrossRef]
70. Wang, Y.; Ai, H.; Wu, B.; Huang, C. Real time facial expression recognition with adaboost. In Proceedings
of the 17th International Conference on Pattern Recognition, Cambridge, UK, 26 August 2004; Volume 3,
pp. 926–929.
71. Liew, C.F.; Yairi, T. Facial expression recognition and analysis: A comparison study of feature descriptors.
IPSJ Trans. Comput. Vis. Appl. 2015, 7, 104–120. [CrossRef]
72. Gudipati, V.K.; Barman, O.R.; Gaffoor, M.; Abuzneid, A. Efficient facial expression recognition using
adaboost and haar cascade classifiers. In Proceedings of the Annual Connecticut Conference on Industrial
Electronics, Technology & Automation (CT-IETA), Bridgeport, CT, USA, 14–15 October 2016; pp. 1–4.
73. Zhang, S.; Hu, B.; Li, T.; Zheng, X. A Study on Emotion Recognition Based on Hierarchical Adaboost
Multi-class Algorithm. In Proceedings of the International Conference on Algorithms and Architectures for Parallel
Processing; Springer: Berlin, Germany, 2018; pp. 105–113.
74. Moghaddam, B.; Jebara, T.; Pentland, A. Bayesian face recognition. Pattern Recognit. 2000, 33, 1771–1782.
[CrossRef]
75. Mao, Q.; Rao, Q.; Yu, Y.; Dong, M. Hierarchical Bayesian theme models for multipose facial expression
recognition. IEEE Trans. Multimed. 2017, 19, 861–873. [CrossRef]
76. Surace, L.; Patacchiola, M.; Battini Sönmez, E.; Spataro, W.; Cangelosi, A. Emotion recognition in the
wild using deep neural networks and Bayesian classifiers. In Proceedings of the 19th ACM International
Conference on Multimodal Interaction, Glasgow, UK, 13–17 November 2017; pp. 593–597.
77. Huang, M.W.; Wang, Z.W.; Ying, Z.L. A new method for facial expression recognition based on sparse
representation plus LBP. In Proceedings of the 3rd International Congress on Image and Signal Processing,
Yantai, China, 16–18 October 2010; Volume 4, pp. 1750–1754.
78. Zhang, S.; Zhao, X.; Lei, B. Facial expression recognition using sparse representation. WSEAS Trans. Syst.
2012, 11, 440–452.
79. El Emary, I.M.; Ramakrishnan, S. On the application of various probabilistic neural networks in solving
different pattern classification problems. World Appl. Sci. J. 2008, 4, 772–780.
80. Kusy, M.; Zajdel, R. Application of reinforcement learning algorithms for the adaptive computation of
the smoothing parameter for probabilistic neural network. IEEE Trans. Neural Netw. Learn. Syst. 2015,
26, 2163–2175. [CrossRef]
81. Neggaz, N.; Besnassi, M.; Benyettou, A. Application of improved AAM and probabilistic neural network to
facial expression recognition. J. Appl. Sci. (Faisalabad) 2010, 10, 1572–1579. [CrossRef]
82. Fazli, S.; Afrouzian, R.; Seyedarabi, H. High-performance facial expression recognition using Gabor filter and
probabilistic neural network. In Proceedings of the IEEE International Conference on Intelligent Computing
and Intelligent Systems, Shanghai, China, 20–22 November 2009; Volume 4, pp. 93–96.
83. Walecki, R.; Rudovic, O.; Pavlovic, V.; Schuller, B.; Pantic, M. Deep structured learning for facial expression
intensity estimation. Image Vis. Comput 2017, 259, 143–154.
84. Breuer, R.; Kimmel, R. A deep learning perspective on the origin of facial expressions. arXiv 2017,
arXiv:1705.01842.
85. Liu, M.; Li, S.; Shan, S.; Wang, R.; Chen, X. Deeply learning deformable facial action parts model for dynamic
expression analysis. In Asian Conference on Computer Vision; Springer: Berlin, Germany, 2014; pp. 143–157.
86. Jung, H.; Lee, S.; Yim, J.; Park, S.; Kim, J. Joint fine-tuning in deep neural networks for facial expression
recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago,
Chile, 7–13 December 2015; pp. 2983–2991.
87. Mollahosseini, A.; Chan, D.; Mahoor, M.H. Going deeper in facial expression recognition using deep neural
networks. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV),
Lake Placid, NY, USA, 7–10 March 2016; pp. 1–10.
88. Li, Y.; Zeng, J.; Shan, S.; Chen, X. Occlusion Aware Facial Expression Recognition Using CNN with Attention
Mechanism. IEEE Trans. Image Process. 2019, 28, 2439–2450. [CrossRef]
89. Hinton, G.E.; Osindero, S.; Teh, Y.W. A fast learning algorithm for deep belief nets. Neural Comput. 2006,
18, 1527–1554. [CrossRef]
90. Hinton, G.E.; Sejnowski, T.J. Learning and relearning in Boltzmann machines. In Parallel Distributed
Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations; MIT Press: Cambridge, MA,
USA, 1986.
91. Liu, P.; Han, S.; Meng, Z.; Tong, Y. Facial expression recognition via a boosted deep belief network.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus,
OH, USA, 24–27 June 2014; pp. 1805–1812.
92. Zhao, X.; Shi, X.; Zhang, S. Facial expression recognition via deep learning. IETE Tech. Rev. 2015, 32, 347–355.
[CrossRef]
93. He, J.; Cai, J.; Fang, L.; He, Z.; Amp, D.E. Facial expression recognition based on LBP/VAR and DBN model.
Appl. Res. Comput. 2016.
94. Uddin, M.Z.; Hassan, M.M.; Almogren, A.; Alamri, A.; Alrubaian, M.; Fortino, G. Facial expression
recognition utilizing local direction-based robust features and deep belief network. IEEE Access 2017,
5, 4525–4536. [CrossRef]
95. Wöllmer, M.; Kaiser, M.; Eyben, F.; Schuller, B.; Rigoll, G. LSTM-Modeling of continuous emotions in an
audiovisual affect recognition framework. Image Vis. Comput. 2013, 31, 153–163. [CrossRef]
96. Kim, D.H.; Baddar, W.; Jang, J.; Ro, Y.M. Multi-objective based spatio-temporal feature representation
learning robust to expression intensity variations for facial expression recognition. IEEE Trans. Affect. Comput.
2017. [CrossRef]
97. Hasani, B.; Mahoor, M.H. Facial expression recognition using enhanced deep 3D convolutional neural
networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 2278–2288.
98. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y.
Generative adversarial nets. In Proceedings of the Conference on Neural Information Processing Systems
(NIPS), Montreal, QC, Canada, 8–13 December 2014; pp. 2672–2680.
99. Lai, Y.H.; Lai, S.H. Emotion-preserving representation learning via generative adversarial network for
multi-view facial expression recognition. In Proceedings of the 13th IEEE International Conference on
Automatic Face & Gesture Recognition (FG 2018), Xi’an, China, 15–19 May 2018; pp. 263–270.
100. Zhang, F.; Zhang, T.; Mao, Q.; Xu, C. Joint pose and expression modeling for facial expression recognition.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City,
UT, USA, 19–21 June 2018; pp. 3359–3368.
101. Yang, H.; Zhang, Z.; Yin, L. Identity-adaptive facial expression recognition through expression regeneration
using conditional generative adversarial networks. In Proceedings of the 13th IEEE International Conference
on Automatic Face & Gesture Recognition (FG 2018), Xi’an, China, 15–19 May 2018; pp. 294–301.
102. Chen, J.; Konrad, J.; Ishwar, P. Vgan-based image representation learning for privacy-preserving facial
expression recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR) Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1570–1579.
103. Yang, H.; Ciftci, U.; Yin, L. Facial expression recognition by de-expression residue learning. In Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA,
18 June 2018; pp. 2168–2177.
104. Lundqvist, D.; Flykt, A.; Öhman, A. The Karolinska directed emotional faces (KDEF). CD ROM Dep. Clin.
Neurosci. Psychol. Sect. Karolinska Inst. 1998, 91, 630.
105. Pantic, M.; Valstar, M.; Rademaker, R.; Maat, L. Web-based database for facial expression analysis.
In Proceedings of the IEEE International Conference on Multimedia and Expo, Amsterdam, The Netherlands,
6 July 2005; p. 5.
106. Kaulard, K.; Cunningham, D.W.; Bülthoff, H.H.; Wallraven, C. The MPI facial expression database—A
validated database of emotional and conversational facial expressions. PLoS ONE 2012, 7, e32321. [CrossRef]
107. Lucey, P.; Cohn, J.F.; Prkachin, K.M.; Solomon, P.E.; Matthews, I. Painful data: The UNBC-McMaster shoulder
pain expression archive database. In Proceedings of the Face and Gesture 2011, Santa Barbara, CA, USA,
21–25 March 2011; pp. 57–64.
108. Mavadati, S.M.; Mahoor, M.H.; Bartlett, K.; Trinh, P.; Cohn, J.F. Disfa: A spontaneous facial action intensity
database. IEEE Trans. Affect. Comput. 2013, 4, 151–160. [CrossRef]
109. Yin, L.; Wei, X.; Sun, Y.; Wang, J.; Rosato, M.J. A 3D facial expression database for facial behavior research.
In Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition (FGR06),
Southampton, UK, 10–12 April 2006; pp. 211–216.
110. Zhang, X.; Yin, L.; Cohn, J.F.; Canavan, S.; Reale, M.; Horowitz, A.; Liu, P.; Girard, J.M. Bp4d-spontaneous:
A high-resolution spontaneous 3d dynamic facial expression database. Image Vis. Comput. 2014, 32, 692–706.
[CrossRef]
111. Wang, S.; Liu, Z.; Lv, S.; Lv, Y.; Wu, G.; Peng, P.; Chen, F.; Wang, X. A natural visible and infrared
facial expression database for expression recognition and emotion inference. IEEE Trans. Multimed. 2010,
12, 682–691. [CrossRef]
112. Gross, R.; Matthews, I.; Cohn, J.; Kanade, T.; Baker, S. Multi-PIE. Image Vis. Comput. 2010, 28, 807–813.
[CrossRef]
113. Zhao, G.; Huang, X.; Taini, M.; Li, S.Z.; Pietikäinen, M. Facial expression recognition from near-infrared
videos. Image Vis. Comput. 2011, 29, 607–619. [CrossRef]
114. Carrier, P.L.; Courville, A.; Goodfellow, I.J.; Mirza, M.; Bengio, Y. FER-2013 Face Database; Université de
Montréal: Montreal, QC, Canada, 2013.
115. Valstar, M.F.; Mehu, M.; Jiang, B.; Pantic, M.; Scherer, K. Meta-analysis of the first facial expression recognition
challenge. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2012, 42, 966–979. [CrossRef]
116. Dhall, A.; Goecke, R.; Lucey, S.; Gedeon, T. Collecting large, richly annotated facial-expression databases
from movies. IEEE Multimed. 2012, 19, 34–41. [CrossRef]
117. Dhall, A.; Goecke, R.; Lucey, S.; Gedeon, T. Static facial expression analysis in tough conditions: Data,
evaluation protocol and benchmark. In Proceedings of the IEEE International Conference on Computer
Vision Workshops (ICCV Workshops), Barcelona, Spain, 6–13 November 2011; pp. 2106–2112.
118. Li, S.; Deng, W.; Du, J. Reliable crowdsourcing and deep locality-preserving learning for expression
recognition in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), Honolulu, HI, USA, 22–25 July 2017; pp. 2852–2861.
119. Li, S.; Deng, W. Reliable crowdsourcing and deep locality-preserving learning for unconstrained facial
expression recognition. IEEE Trans. Image Process. 2019, 28, 356–370. [CrossRef]
120. Li, S.; Deng, W. Blended Emotion in-the-Wild: Multi-label Facial Expression Recognition Using
Crowdsourced Annotations and Deep Locality Feature Learning. Int. J. Comput. Vis. 2018, 1–23. [CrossRef]
121. Whitehill, J.; Littlewort, G.; Fasel, I.; Bartlett, M.; Movellan, J. Toward practical smile detection. IEEE Trans.
Pattern Anal. Mach. Intell. 2009, 31, 2106–2111. [CrossRef]
122. Jeon, J.; Park, J.C.; Jo, Y.; Nam, C.; Bae, K.H.; Hwang, Y.; Kim, D.S. A Real-time Facial Expression Recognizer
using Deep Neural Network. In Proceedings of the 10th International Conference on Ubiquitous Information
Management and Communication, Danang, Vietnam, 4–6 January 2016; p. 94.
123. Hossin, M.; Sulaiman, M. A review on evaluation metrics for data classification evaluations. Int. J. Data Min.
Knowl. Manag. Process 2015, 5, 1.
124. Li, S.; Deng, W. Deep Emotion Transfer Network for Cross-database Facial Expression Recognition.
In Proceedings of the 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24
August 2018; pp. 3092–3099.
125. Li, S.; Deng, W. A Deeper Look at Facial Expression Dataset Bias. arXiv 2019, arXiv:1904.11150.
126. Mollahosseini, A.; Hasani, B.; Mahoor, M.H. AffectNet: A Database for Facial Expression, Valence, and
Arousal Computing in the Wild. IEEE Trans. Affect. Comput. 2017. [CrossRef]
127. Ramakrishnan, S.; El Emary, I.M. Speech emotion recognition approaches in human computer interaction.
Telecommun. Syst. 2013, 52, 1467–1478. [CrossRef]
128. Chang, J.; Scherer, S. Learning representations of emotional speech with deep convolutional generative
adversarial networks. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017; pp. 2746–2750.
129. Chao, L.; Tao, J.; Yang, M.; Li, Y.; Wen, Z. Multi-scale temporal modeling for dimensional emotion recognition
in video. In Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge, Orlando,
FL, USA, 7 November 2014; pp. 11–18.
130. Chen, S.; Jin, Q. Multi-modal dimensional emotion recognition using recurrent neural networks.
In Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge Brisbane, Australia,
26 October 2015; pp. 49–56.
131. He, L.; Jiang, D.; Yang, L.; Pei, E.; Wu, P.; Sahli, H. Multimodal affective dimension prediction using deep
bidirectional long short-term memory recurrent neural networks. In Proceedings of the 5th International
Workshop on Audio/Visual Emotion Challenge, Brisbane, Australia, 26 October 2015; pp. 73–80.
132. Newton, E.M.; Sweeney, L.; Malin, B. Preserving privacy by de-identifying face images. IEEE Trans. Knowl.
Data Eng. 2005, 17, 232–243. [CrossRef]
133. Rahulamathavan, Y.; Rajarajan, M. Efficient privacy-preserving facial expression classification. IEEE Trans.
Dependable Secur. Comput. 2017, 14, 326–338. [CrossRef]
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).