Emotion Rec ML
Abstract—Starting a business is comparatively easy, but making it established and reliable is challenging. A business can grow if its customers are satisfied, and satisfaction can be investigated through the emotions or reviews that customers express about goods and services. This motivates the development of an emotion recognition system for business reviews using the social computing paradigm. Considerable work has already been performed in this direction for resource-rich languages such as English. However, there is a need, and a gap in the literature, for such a system in Urdu, a resource-poor language that is the national language of Pakistan and is widely spoken in other countries such as India and in other parts of the world. This work aims at developing an emotion detection system for online business reviews (tweets) in the Urdu language using supervised machine learning techniques. We applied different machine learning classifiers, namely Support Vector Classifier (SVC), Random Forest (RF), Naïve Bayes (NB), and K-Nearest Neighbors (KNN), to classify the tweets with respect to Urdu emotions. Results show that, compared with the other classifiers, SVC achieved the best results, with an accuracy of 80.5% on the smartphone dataset and 81.09% on the sports dataset.

Keywords—Emotion recognition, emotion detection, Business-related, Urdu Language, Machine Learning, business intelligence, Social computing

I. INTRODUCTION
Social computing is a multidisciplinary area of research that addresses different processes and mental states, including learning, thinking, perception, remembering, and emotions. Among these, emotions play a pivotal role in identifying human social behavior [1].

It is pertinent for an advisor, blogger, website owner, or businessman to evoke certain emotions in the audience or customers in order to increase the demand for a product. It is the emotionally motivated customer who takes a bold decision to buy a product. Emotional marketing is claimed to be more fruitful than other forms of marketing: many acclaimed brands and marketing campaigns have succeeded by invoking emotions in their target audience, because human beings are emotional and tend to go where their heart is [2]. By learning the nature and emotions of customers from their reviews and tweets, owners and businessmen can promptly introduce products that are in demand, or new features in a product, in order to increase the sales or profit rate [3].

Business-related emotion classification from online business reviews in the Urdu language is an area that poses several challenges for researchers in Business Intelligence (BI). In this area, a lot of work has been performed in English and other resource-rich languages, but much more can be investigated in resource-poor languages such as Urdu [4].

The business community has been attracted to social media sites as a means to capture and analyze customers' emotions, so as to improve the quality of manufactured goods and grow in the market [5]. This gives rise to the development of an emotion-based BI system to sustain a business in the current digital world of business competition.

Most of the recent works [5] on emotion recognition focus on English and other resource-rich languages, whereas limited work has been carried out [4] in the Urdu language. Ren and Quan [5], in their work on emotion recognition from business reviews, conducted experiments on an emotion dataset in English, whereas the work performed by [4] focuses on detecting Roman Urdu emotions using an ontology-based approach.

The major limitation of the aforementioned works is that none of them fulfills the need for emotion recognition from business-related online reviews/tweets posted in the Urdu language. Therefore, we use machine learning classifiers on emotion-based Urdu datasets acquired from Twitter and propose the Business-Related Emotion Recognition System (BERS) in the Urdu language using Machine Learning (ML) classifiers.

The proposed work differs from that of Nargis and Jamil [4] in that it uses business reviews in pure Urdu text and detects emotions by applying ML classifiers. Furthermore, we recommend the classifier with the best results with respect to different emotion combinations.

Our work is focused on answering the following research questions:
RQ1: What is the efficiency of different ML algorithms for emotion detection from Urdu-based business tweets?
validation were performed for conducting the experiments. With split validation, SVM-SMO attained the highest accuracy of 86%, Naïve Bayes ranked second with 79.5%, J48 ranked third with 78%, and KNN had the lowest accuracy of 70.80%. The accuracy of SVM-SMO could be increased further by combining it with optimization methods, particularly particle swarm optimization and genetic algorithms.

Xing et al. [15] applied learning-based and word-based techniques for detecting emotions in textual data. Satisfactory results were obtained relative to the compared methods. The work could be extended by adding further emotions, and its performance could be improved by applying better feature extraction techniques.

In their work on emotion detection from Roman Urdu text, Nargis and Jamil [4] used a knowledge-based approach, developing an emotion ontology and an emotion detector algorithm. A set of Roman Urdu text documents is used as the dataset for evaluating the effectiveness of the system. The experimental results show an average recall of 85.40% and an average precision of 92.81%.

III. METHODOLOGY
The proposed system (BERS) comprises the following modules, shown in Fig. 1: (i) Gathering Urdu Emotion Dataset, (ii) Classifying Business-Related Emotions, (iii) Performance Evaluation/Assessment, and (iv) Recommending the most significant classification results. Furthermore, we adopted Ekman's basic emotion model [9] and obtained the equivalent Urdu translations using the Google Translate API (https://translate.google.com/).

Fig. 1. Proposed System

A. Gathering Urdu Emotion Dataset
As Urdu is a resource-poor language, there is a lack of publicly available lexical resources for developing an emotion-based Sentiment Analysis (SA) system in the business domain [16, 17]. For this purpose, we acquired two datasets from Twitter using the Tweepy API [18] for different domains, namely Smartphones and Sports. Each dataset is described in Table I.

TABLE I. DATA SETS
Dataset (Products)    No. of Tweets
Smartphones           1000
Sports                1200

B. Classifying Business-Related Emotions using ML Algorithms
We applied different supervised machine learning algorithms, namely Support Vector Classifier (SVC), Random Forest (RF), Naïve Bayes (NB), and K-Nearest Neighbors (KNN), implementing them on the Anaconda-based Python platform supported by NLTK [19].

a. Support Vector Machine (SVM)
The SVM/SVC performs binary classification: it finds the maximum-margin hyperplane that separates the feature vectors and assigns each item to one of the two classes [13].

b. Random Forest Classifier
This classifier constructs decision trees from randomly selected samples of the training set. The votes of the individual decision trees are then aggregated to predict the class of test data.

c. Naïve Bayes (NB)
NB estimates, from training data, the probability that a feature belongs to a particular class and then predicts the class label of unlabeled (unseen/test) data. In our case, an emotion label is predicted if the computed probability is greater than the given limit [8].

d. K-Nearest Neighbor (KNN)
KNN computes the distance between a data point and the other points in a training data set. The k nearest data points are then selected, and finally the data point is assigned the class held by the majority of those points [14].

As shown in Fig. 2, the input Urdu text is processed through multiple ML classifiers and is classified into different emotion categories, namely: (i) خوشی (Khushi, Joy), (ii) خوف (Khauf, Fear), (iii) غصہ (Ghussa, Anger), (iv) غمگینی (Ghamgeeni, Sadness), (v) نفرت (Nafrat, Disgust), and (vi) ندامت (Nidamat, Shame). The training phase provides the model with both labels and predictor variables [20]. In the testing phase, we evaluate the performance of the system by detecting emotions and classifying them into the aforementioned six categories with each ML classifier.
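The KNN procedure described in this section (compute distances, select the k nearest points, take the majority class) can be sketched in plain Python. This is a minimal illustration and not the implementation used in the experiments: the whitespace tokenizer, the bag-of-words features, the Euclidean distance, the value k=3, and the English placeholder texts standing in for labelled Urdu tweets are all assumptions, since the paper does not state its feature representation or parameter settings.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Map a whitespace-tokenised text to a term-frequency Counter."""
    return Counter(text.split())


def euclidean(a, b):
    """Euclidean distance between two sparse term-frequency vectors."""
    keys = set(a) | set(b)
    # Counter returns 0 for a missing term, so absent words count as zero.
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in keys))


def knn_predict(train, text, k=3):
    """Label `text` with the majority label among its k nearest training texts."""
    query = bag_of_words(text)
    # Rank training examples by distance to the query (sorted() is stable,
    # so equally distant examples keep their training order).
    ranked = sorted(train, key=lambda ex: euclidean(bag_of_words(ex[0]), query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: English placeholders standing in for labelled Urdu tweets.
TRAIN = [
    ("love this phone great camera", "joy"),
    ("great battery love it", "joy"),
    ("happy with this great phone", "joy"),
    ("hate this phone broken screen", "anger"),
    ("broken again hate it", "anger"),
    ("angry about broken charger", "anger"),
]

print(knn_predict(TRAIN, "love the great camera"))
print(knn_predict(TRAIN, "hate the broken screen"))
```

With real Urdu tweets, the tokenizer and feature weighting would need to be replaced by whatever preprocessing the dataset requires; only the distance, selection, and voting steps above correspond to the description in the text.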
Experiment #1 (RQ1: What is the efficiency of different ML algorithms for emotion detection from Urdu-based business reviews?)
We conducted various experiments on the two datasets, applying different classifiers, to answer RQ1. The results are given in Table IV.
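As an illustration of how such a comparison can be scored, the sketch below holds out part of a labelled dataset and computes the accuracy of each classifier on the held-out part. It is only a sketch under stated assumptions: the 80/20 split ratio, the function names, and the two stand-in classifiers are hypothetical and are not taken from the paper, which does not describe its validation protocol in this section.

```python
import random


def train_test_split(examples, test_ratio=0.2, seed=0):
    """Shuffle labelled examples reproducibly and split them into train and test parts."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - round(len(shuffled) * test_ratio)
    return shuffled[:cut], shuffled[cut:]


def accuracy(true_labels, predicted_labels):
    """Fraction of test items whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def evaluate(classifiers, test_set):
    """Return {name: accuracy} for each predict function on the held-out test set."""
    truth = [label for _, label in test_set]
    return {
        name: accuracy(truth, [predict(text) for text, _ in test_set])
        for name, predict in classifiers.items()
    }


# Hypothetical held-out examples (English placeholders for labelled Urdu tweets).
TEST = [
    ("great phone", "joy"),
    ("broken screen", "anger"),
    ("love it", "joy"),
    ("hate it", "anger"),
]

# Hypothetical stand-in classifiers; real ones would be trained models.
CLASSIFIERS = {
    "always_joy": lambda text: "joy",
    "keyword": lambda text: "anger" if ("broken" in text or "hate" in text) else "joy",
}

scores = evaluate(CLASSIFIERS, TEST)
best = max(scores, key=scores.get)
print(scores, best)
```

The same `evaluate` call can be repeated per dataset (for example, once for Smartphones and once for Sports) to fill one accuracy column per dataset.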
Experiment #2 (RQ2: What is the best ML classifier for emotion detection from Urdu-based business tweets?)
To answer RQ2, we compared the results of the different ML classifiers with respect to their emotion detection efficiency for business-related Urdu reviews. The results reported in Table IV show that SVC is more effective than the other classifiers, attaining accuracies of 80.50% and 81.09% on the two datasets.

V. CONCLUSION AND FUTURE WORK
This work presents the Business-Related Emotion Recognition System (BERS) in the Urdu language for classifying and recognizing emotions in Urdu tweets posted on a social media website in the business domain. We applied different machine learning classifiers on training data to check the validity of the proposed system. The system performs well with respect to classification and detection of emotions in the Urdu language. Among the applied classifiers, the Support Vector Classifier (SVC) gave the best accuracy, 80.50% and 81.09% on the two datasets of business-related tweets, and outperformed the other classifiers used.

The study may be extended by working on emotion labels other than the six addressed here: (i) خوشی (Khushi, Joy), (ii) خوف (Khauf, Fear), (iii) غصہ (Ghussa, Anger), (iv) غمگینی (Ghamgeeni, Sadness), (v) نفرت (Nafrat, Disgust), and (vi) ندامت (Nidamat, Shame). Emoticon features could also be added, which could make the system more visually appealing. Increasing the size of the datasets, covering further domains, and experimenting with other classifiers may increase the efficiency of the proposed system and yield more significant results.

REFERENCES
[1] N. H. Frijda, The emotions, Cambridge University Press, 1986.
[2] M. N. Khuong and V. N. B. Tram, "The effects of emotional marketing on consumer product perception, brand awareness and purchase decision (A study in Ho Chi Minh City, Vietnam)," Journal of Economics, Business and Management 3.5 (2015): 524.
[3] H. Öğüt and B. K. O. Taş, "The influence of internet customer reviews on the online sales and prices in the hotel industry," The Service Industries Journal 32.2 (2012): 197-214.
[4] G. Z. Nargis and N. Jamil, (2016), "Generating an emotion ontology for Roman Urdu text."
[5] F. Ren and C. Quan, "Linguistic-based emotion analysis and recognition for measuring consumer satisfaction: an application of affective computing," Information Technology and Management 13.4 (2012): 321-332.
[6] S. S. Magdum and J. V. Megha, "Mining online reviews and tweets for predicting sales performance and success of movies," Intelligent Computing and Control Systems (ICICCS), 2017 International Conference on. IEEE, 2017.
[7] M. Z. Asghar, A. Khan, K. Khan, H. Ahmad, and I. A. Khan, (2017), "COGEMO: Cognitive-based emotion detection from patient generated health reviews," Journal of Medical Imaging and Health Informatics, 7(6), 1436-1444.
[8] J. Kaur and J. R. Saini, "Emotion detection and sentiment analysis in text corpus: a differential study with informal and formal writing styles," International Journal of Computer Applications 101.9 (2014).
[9] M. Z. Asghar, A. Khan, A. Bibi, F. M. Kundi, and H. Ahmad, (2017), "Sentence-level emotion detection framework using rule-based classification," Cognitive Computation, 9(6), 868-894.
[10] S. Feng, D. Wang, G. Yu, W. Gao, and K. F. Wong, (2011), "Extracting common emotions from blogs based on fine-grained sentiment clustering," Knowledge and Information Systems, 27(2), 281-302.
[11] O. Bruna, H. Avetisyan, and J. Holub, "Emotion models for textual emotion classification," Journal of Physics: Conference Series. Vol. 772. No. 1. IOP Publishing, 2016.
[12] F. Calefato, F. Lanubile, and N. Novielli, "EmoTxt: a toolkit for emotion recognition from text," arXiv preprint arXiv:1708.03892 (2017).
[13] H. Binali, C. Wu, and V. Potdar, "Computational approaches for emotion detection in text," Digital Ecosystems and Technologies (DEST), 2010 4th IEEE International Conference on. IEEE, 2010.
[14] N. A. S. Winarsih and C. Supriyanto, "Evaluation of classification methods for Indonesian text emotion detection," Technology of Information and Communication (ISemantic), International Seminar on Application for. IEEE, 2016.
[15] Y. Xing, C. Chen, and L. L. Liu, "Classification influence of features on given emotions and its application in feature selection," Journal of Physics: Conference Series. Vol. 1004. No. 1. IOP Publishing, 2018.
[16] https://www.kaggle.com/datasets
[17] https://en.wikipedia.org/wiki/List_of_datasets_for_machine_learning_research#Twitter_and_tweets
[18] Tweepy API, available at https://pypi.org/project/tweepy/
[19] Anaconda, available at https://anaconda.org/
[20] M. Z. Asghar, A. Khan, F. Khan, and F. M. Kundi, "RIFT: A rule induction framework for Twitter sentiment analysis," Arabian Journal for Science and Engineering 43, no. 2 (2018): 857-877.