Article
GBSVM: Sentiment Classification from Unstructured
Reviews Using Ensemble Classifier
Madiha Khalid 1,† , Imran Ashraf 2,† , Arif Mehmood 3 , Saleem Ullah 1 , Maqsood Ahmad 1 and
Gyu Sang Choi 2,∗
1 Department of Computer Science, Khawaja Fareed University of Engineering and Information Technology,
Rahim Yar Khan 64200, Pakistan; [email protected] (M.K.); [email protected] (S.U.);
[email protected] (M.A.)
2 Department of Information and Communication Engineering, Yeungnam University, Gyeongsan 38541,
Korea; [email protected]
3 Department of Computer Science & Information Technology, The Islamia University of Bahawalpur,
Bahawalpur 63100, Pakistan; [email protected]
* Correspondence: [email protected]
† These authors contributed equally to this work.
Received: 9 March 2020; Accepted: 14 April 2020; Published: 17 April 2020
Abstract: User reviews on social networking platforms such as Twitter, Facebook, and Google+ have
been attracting growing interest because of their wide use in sentiment analysis, which serves
as feedback to public and private companies as well as governments. The analysis of
such reviews not only plays a noteworthy role in improving the quality of services and products
but also helps to devise marketing and financial strategies that increase companies' profit and
customer satisfaction. Although many analysis models have been proposed, there is still room
for improving the processing, classification, and analysis of user reviews, which can assist managers
in interpreting customers' feedback and elevating the quality of products. This study first evaluates the
performance of a few of the most widely used machine learning models
and then presents a voting classifier, Gradient Boosted Support Vector Machine (GBSVM), which
is composed of gradient boosting and support vector machines. The proposed model has been
evaluated on two different datasets with term frequency and three variants of term frequency-inverse
document frequency (uni-, bi-, and tri-gram) as features. The performance is compared with
other state-of-the-art techniques, and the results show that GBSVM outperforms these models.
Keywords: educational data mining; ensemble classifier; sentiment analysis; term frequency;
bi-gram; tri-gram
1. Introduction
With the wide proliferation of smartphones, recent years have seen the inception and expansion
of social platforms such as Twitter, Facebook, and Google+, which are collectively referred to as
‘social media’. With its growth, social media has become one of the interactive technologies that
encourage users to create ideas and share information and opinions in the form of expressions or reviews.
Such expressions are full of information that serves as the users’ feedback and can be utilized to revise
and devise policies to improve the quality of both products and services. However, the extraction
of users’ opinions from the reviews is not a trivial task. A specialized field called ‘sentiment
analysis’ [1] offers a variety of techniques and tools that are used for identifying and extracting
subjective information which represents users’ opinions. These techniques and tools are categorized
under natural language processing [2]. The mining of users’ opinions from user reviews is called
opinion mining and is practiced within text mining. Opinion mining aims to devise a system that
extracts and classifies reviews and identifies opinions within the text.
Traditionally, sentiment analysis classifies the polarity of opinions into positive, negative, and neutral
classes [3]. The polarity is based on the emotions or specialized words that are present within users’
reviews. These reviews carry significant meaning for public as well as private companies
because of the information they contain. The reviews contain the likes and dislikes about a particular
product or service and so hold information that companies can use to improve the quality
of their products. It can further help to devise or revise policies about particular products. The textual
comments are much more significant than the numeric score because they represent what people
exactly say about a particular product [4].
Even though this content holds meaningful information for governments,
businesses, and individuals, this bulk of user-generated content has to be processed using text mining
techniques and sentiment analysis. However, this process is not a trivial task, and sentiment analysis
faces several challenges that hinder the accurate interpretation of the meaning of
sentiments and the detection of the correct sentiment polarity [5]. The first challenge is to tackle the nature
of review text, which can be either semi-structured or unstructured. Since most of the users writing reviews
are novice, non-expert, and non-professional writers, they do not follow any set rules to express
their views, which results in semi-structured and unstructured data [6]. Similarly, domain dependence,
review structure, and language semantics make the analysis more challenging. For a more detailed list
of sentiment analysis challenges, readers are referred to [5].
Supervised as well as unsupervised machine learning techniques have been applied for
sentiment analysis, which extracts meaningful information from structured and unstructured
text data to aid decision-makers. Supervised techniques have proven to be more effective in
determining the polarity of sentiments; however, they require large amounts of labeled data,
which is not easy to get [7]. Unsupervised techniques, in contrast, though not superior, are still
advantageous as they can work without labeled data. Support Vector Machine (SVM), Naive Bayes
(NB), and tree-based approaches are reported to show good performance in sentiment analysis [8].
The selection of an appropriate feature set from the data is as important a step as the choice of classification
model. Term Frequency (TF), Term Frequency-Inverse Document Frequency (TF-IDF), word2vec,
and parts of speech are among the most widely applied features in sentiment analysis. A
specific feature gives different results with different classification models, so an appropriate
strategy is to investigate the use of different features with a variety of classifiers and analyze
their performance. Research works [3,7] report superior performance of uni- and bi-gram features
in sentiment classification. Even so, a single classifier is often not suitable for tweets, and a
combination of multiple classifiers, also called a voting or ensemble classifier, often shows excellent performance
for sentiment analysis [9].
This paper first investigates the performance of various supervised machine learning models and
then devises a voting classifier to perform sentiment analysis on a Twitter dataset. The major contributions
of this study are summarized as follows:
• Performance analysis of SVM, Gradient Boosting Machine (GBM), Logistic Regression (LR),
and Random Forest (RF) is carried out for sentiment analysis. The polarity of the Google apps dataset
is divided into positive, negative, and neutral classes for this purpose.
• A voting classifier, called Gradient Boosted Support Vector Machine (GBSVM), is devised to
perform sentiment analysis; it is based on Gradient Boosting (GB) and SVM. Its performance is
compared with four state-of-the-art ensemble methods.
• The use of TF and TF-IDF is investigated, whereby uni-gram, bi-gram, and tri-gram features
are used with the selected classifiers, as well as GBSVM, to analyze the impact on the sentiment
classification accuracy.
Appl. Sci. 2020, 10, 2788 3 of 20
The rest of the paper is organized in the following manner. Section 2 discusses important research
works related to the current study. Section 3 outlines the details of the proposed voting classifier and
describes the dataset used for the experiment. Section 4 presents the results of the experiment and
discussions. A performance comparison of the proposed GBSVM with other state-of-the-art
models is also made in the same section. Finally, the conclusion and future work are given in Section 5.
2. Literature Review
Data on the internet is growing constantly, and so is people’s willingness to express their opinions.
With the expansion of social platforms, the tools for obtaining people’s opinions have changed, and fields like
opinion mining and sentiment analysis are in increased demand. Owing to the potential
impact that users’ opinions can have on businesses, the importance of online reviews cannot be
underestimated. Consequently, a large number of researchers are building systems that
can extract information from such reviews to benefit marketing analysis, drive public opinion,
and increase customer satisfaction. As a result, sentiment analysis has been adopted and deployed in
a large number of research areas and businesses.
the correlation between the Twitter sentiment and a particular event. The model used is a Bayesian
logistic regression that uses lexicons, uni-, and bi-gram features to detect subjective or objective tweets.
The tweets are processed to represent the public sentiments towards unexpected events.
Another research [14] proposes a novel approach based on SentiWordNet to carry out opinion
mining using web data. The count of the score falls under seven categories: strong-positive,
positive, weak-positive, neutral, weak-negative, negative, and strong-negative to test the efficacy
of NB, SVM, and multi-layer perceptron. The proposed approach is evaluated on movie and product
web domains and results are compared against the performance of the selected classifiers. Results
demonstrate that the proposed approach outperforms the selected machine learning classifiers.
that is used for text classification. Results show that DFS gives competitive results to other feature
selection techniques.
Despite the various methods proposed and used for sentiment analysis, there is still room for
improvement, as no method is suitable for all kinds of data. Specifically, machine learning models
show varying performance depending on the pre-processing strategy and feature
selection method used. Voting classifiers have been shown to outperform single classification
models, as in [9], where a voting classifier based on LR and Stochastic Gradient Descent Classifier
(SGDC) is used for tweet classification. Consequently, this research aims at devising a Voting
Classifier (VC) which can perform better than previously proposed models at predicting sentiment polarity
from unstructured text.
where α represents the learning rate, which can be varied between 0 and 1 to tune the performance of
GBM, and $\sum (y_i - y_i^{p})$ is the sum of the residuals. The loss function is a measure that indicates
how well the coefficients of a model fit the underlying data. A logical understanding of the loss
function depends upon what we are trying to optimize [32]. The architecture of a GBM is given
in Figure 1.
Figure 1. The architecture of gradient boosting machine. Rectangles in the figure show the split criteria
while the end node is called ‘leaf’.
GBM is not only useful in practical applications, but it also has significant usage in data mining
challenges [33,34]. The functionality of GBM is described in Algorithm 1.
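The role of the learning rate and the residuals can be illustrated with a minimal sketch. It assumes a squared-error loss, a one-dimensional feature, and one-split regression stumps as weak learners; the function names `fit_stump` and `gradient_boost` and the toy data are hypothetical and are not taken from the paper's Algorithm 1.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression stump to the residuals.

    Returns (threshold, left_value, right_value). Assumes x holds at
    least two distinct values, so at least one valid split exists.
    """
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    _, t, lv, rv = best
    return t, lv, rv


def gradient_boost(x, y, alpha=0.4, n_rounds=10):
    """Boost stumps on the residuals y_i - y_i^p with learning rate alpha."""
    pred = [sum(y) / len(y)] * len(y)  # start from the mean prediction
    losses = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        losses.append(sum(r * r for r in residuals) / len(y))  # mean squared error
        t, lv, rv = fit_stump(x, residuals)
        # Move each prediction a fraction alpha towards the stump's output.
        pred = [pi + alpha * (lv if xi <= t else rv) for xi, pi in zip(x, pred)]
    return pred, losses


# Hypothetical toy data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
pred, losses = gradient_boost(x, y, alpha=0.4, n_rounds=10)
```

A smaller `alpha` shrinks each correction, so more boosting rounds are needed to reach the same training loss. A classifier would replace the squared-error loss with a classification loss, which this sketch does not cover.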
Figure 2. The architecture of the support vector machines. The dashed lines in the figure show
the best dividing margins, while the thick black line represents the dividing hyperplane as drawn by
a support vector machine (SVM).
The class separability depends upon the distance between the samples of the classes. In other
words, the larger the distance between the support vectors (margins), the more distinguishable the
classes are. The hyperplanes are obtained by solving a quadratic programming optimization problem [39].
The decision function of an SVM is related not only to the number of support vectors (SVs) and their weights but
also to the a priori chosen kernel, which is called the support vector kernel [40,41]. For this purpose,
various kernels such as radial, polynomial, and neural kernels are used with SVM [42]. The working
principle of SVM is given in Algorithm 2 [43,44].
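For the linear case, the relation between the dividing hyperplane and the margin can be sketched as follows. The sketch assumes an already trained linear SVM with weight vector `w` and bias `b`; the values used here are hypothetical, and the quadratic programming step that would produce them is not shown.

```python
import math


def decision_value(w, b, x):
    """Signed score of sample x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Assign class +1 or -1 according to the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the two margin lines of a linear SVM: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical trained parameters, for illustration only.
w, b = (3.0, 4.0), -5.0
label_far = classify(w, b, (3.0, 4.0))
label_origin = classify(w, b, (0.0, 0.0))
width = margin_width(w)
```

A smaller norm of `w` gives a wider margin, which is why the optimization problem minimizes that norm subject to the training samples being classified correctly.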
[Figure 3 flow: Play Store app user reviews; pre-processing (remove missing values, remove numbers, remove special characters, remove punctuation, remove stop words, convert to lower case, perform stemming); TF/IDF features (uni-, bi-, tri-gram); training data and testing data; GBSVM classifier; trained model; performance metrics (accuracy, precision, recall, F1-score).]
Figure 3. Diagram of the proposed methodology. Initially, pre-processing is carried out to remove
noise and unnecessary data. Term Frequency-Inverse Document Frequency (TF-IDF) features are then
extracted to train the classifiers. For evaluation, a 70–30 train-test split is used.
As a first step, all the reviews with missing values are identified and removed, as missing
data can degrade the performance of the classifiers. Next, numerical values are removed from the
text, as they do not contribute towards the learning of the classifiers. Removing them decreases the complexity of
training the classifiers. Occasionally, reviews contain special symbols such as a heart sign or thumb sign
that need to be removed to reduce the feature dimension and improve performance. After that, the
following punctuation []() /|, ; . ’ is removed from the reviews, because it does not
contribute to the text analysis. If left in place, it impairs the model’s ability to discriminate between punctuation and
other characters.
As a next step, words are converted to lowercase because text analysis is case sensitive. If this
step is not carried out, the machine learning models will count, for example, ‘Excellent’ and ‘excellent’ as
two different words, which will ultimately affect the classifier’s performance [46]. In the end, stemming
is performed. It is a very important pre-processing step that removes the affixes from words and
transforms inflected words into their base forms. For example, ‘loves’, ‘loved’, and ‘loving’ are
modified forms of ‘love’. Stemming changes these words into the original/root form and helps to
increase the performance of a classifier [47].
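The pre-processing steps described above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word list is a tiny hypothetical sample, and `crude_stem` is a hypothetical suffix stripper standing in for a real stemmer such as Porter's.

```python
import re

# Punctuation listed in the text: [ ] ( ) / | , ; . '
PUNCTUATION = r"[\[\]()/|,;.'’]"
# Assumption: a tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "it", "this", "and", "i"}


def crude_stem(word):
    """Hypothetical stand-in for a real stemmer: strip a few common suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(review):
    """Return a list of cleaned tokens, or None when the review is missing."""
    if review is None or not review.strip():
        return None                                  # missing value: drop the record
    text = re.sub(r"\d+", " ", review)               # remove numbers
    text = re.sub(PUNCTUATION, " ", text)            # remove the listed punctuation
    text = re.sub(r"[^A-Za-z\s]", " ", text)         # remove remaining special symbols
    tokens = [t for t in text.split() if t.lower() not in STOP_WORDS]
    tokens = [t.lower() for t in tokens]             # convert to lower case
    return [crude_stem(t) for t in tokens]           # stemming
```

With this sketch, `preprocess("Loves, loved, LOVING!")` maps all three words to the same stem, which is the effect stemming is meant to achieve. The stem produced by this crude stripper is `lov` rather than the dictionary form `love`; a real stemmer may produce a different root.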
For example, if the predictions from c1, c2, and c3 are ‘positive’, ‘negative’, and ‘positive’,
respectively, then the final prediction will be ‘positive’ by the majority vote.
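The majority vote in this example can be written in a few lines. The function name `hard_vote` is hypothetical; the sketch simply counts the labels and returns the most frequent one.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Predictions of the three classifiers c1, c2, and c3 from the example.
final_label = hard_vote(["positive", "negative", "positive"])
```

With an odd number of classifiers and two candidate labels, a tie cannot occur; with three classes or an even number of classifiers, a tie-breaking rule would have to be chosen, which this sketch leaves to the ordering used by `Counter`.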
Soft voting, on the other hand, considers the probability score that each classifier assigns to each
class for the current sample. The soft voting criterion then selects the class with the
highest probability, which is obtained by averaging the individual values of the classifiers [49].
The proposed GBSVM takes advantage of the strengths of both GBM and SVM and combines
their predicted probabilities for each class to make the final decision. The models M_GBM and M_SVM are trained
on the training data set and then used to predict the probability for the positive, neutral, and negative
classes separately. Using the predicted probabilities from the two classifiers, an average probability
for each class is computed for a given review. The decision function then assigns the
final prediction/label of the review based on the highest average probability for a class.
The working mechanism of the GBSVM is given in Algorithm 3.
for i = 1 to M do
    if M_GBM ≠ 0 & M_SVM ≠ 0 & training_set ≠ 0 then
        ProbSVM-Pos = M_SVM.probability(Pos-class)
        ProbSVM-Neu = M_SVM.probability(Neu-class)
        ProbSVM-Neg = M_SVM.probability(Neg-class)
In this study, the soft voting technique is used. The VC in the current study is expressed as:
$$\hat{p} = \operatorname{argmax}\left\{ \sum_{i}^{n} GBM_i,\ \sum_{i}^{n} SVM_i \right\}. \qquad (4)$$
Here $\sum_{i}^{n} GBM_i$ and $\sum_{i}^{n} SVM_i$ will both give prediction probabilities for each test example.
After that, the probabilities for each test example from both GBM and SVM pass through the soft voting
criteria, as shown in Figure 5.
When a given sample passes through the SVM and GBM, they give the probability score for
each class (positive, neutral, negative). Let GBM’s probability scores be 0.96696002, 0.02426578,
and 0.0087742 for the ProbGBM-Pos, ProbGBM-Neu, and ProbGBM-Neg classes and SVM’s
predicted probabilities be 0.997757249, 0.00206882765, and 0.000173923303 for ProbSVM-Pos,
ProbSVM-Neu, and ProbSVM-Neg, respectively. Then the average probability for the three
classes can be calculated as the mean of the two scores for each class, which gives approximately
0.9824 for Avg-Pos, 0.0132 for Avg-Neu, and 0.0045 for Avg-Neg.
Since the final prediction is MaxProb(Avg-Pos, Avg-Neu, Avg-Neg), which in this case
is the positive class, the final predicted class is ‘positive’, and the actual class of the sample review
is also positive in the dataset.
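The worked example can be reproduced with a short sketch of the soft voting rule. The probability values are the ones quoted in the text; the function name `soft_vote` and the dictionary layout are assumptions made for illustration.

```python
def soft_vote(*classifier_probs):
    """Average the per-class probabilities and return (label, averages)."""
    classes = list(classifier_probs[0].keys())
    averaged = {
        c: sum(p[c] for p in classifier_probs) / len(classifier_probs)
        for c in classes
    }
    label = max(averaged, key=averaged.get)  # class with the highest average
    return label, averaged


# Probability scores quoted in the text for one sample review.
prob_gbm = {"positive": 0.96696002, "neutral": 0.02426578, "negative": 0.0087742}
prob_svm = {"positive": 0.997757249, "neutral": 0.00206882765, "negative": 0.000173923303}

label, averaged = soft_vote(prob_gbm, prob_svm)
```

Because the scores are averaged rather than counted, a classifier that is very confident can outweigh one that is only slightly in favor of another class, which is the main difference from the majority vote.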
Avg-Pos = (ProbSVM-Pos+ProbGBM-Pos) / 2
Avg-Neu = (ProbSVM-Neu+ProbGBM-Neu) / 2
Avg-Neg = (ProbSVM-Neg+ProbGBM-Neg) / 2
Figure 5. Architecture of the proposed voting classifier (GBSVM). ProbSVM-Pos represents the probability
score given by SVM for a specific class, while ProbGBM-Pos is the GBM score for that class.
3.3. Dataset
The dataset plays a very important role in sentiment analysis. This study utilizes a
dataset that contains mobile application reviews for Google apps. The dataset used in the current
study has been downloaded from, and is freely available at, [50]. The dataset contains user reviews
about Google apps in the English language. It contains 64,295 records consisting of attributes including
‘App’, ‘Translated_Reviews’, and ‘Sentiments’. The description of the dataset attributes is given in
Table 1.
Attribute            Description
App                  The actual name of the app on the Google Play Store.
Translated_Reviews   The reviews given by individual users.
Sentiments           The sentiment label: positive, negative, or neutral.
The dataset contains the names of different apps, and the ‘Translated_Reviews’ attribute holds users’
reviews of the individual app. There are three classes in the ‘Sentiments’ attribute, namely positive,
negative, and neutral. Figure 6 shows the distribution of positive, negative, and neutral reviews.
There are 23,998 positive, 8271 negative, and 5158 neutral reviews in the dataset.
evaluation metrics are True Positive (TP), False Positive (FP), True Negative (TN), and False Negative
(FN) [58]. Accuracy determines the performance of a classifier in terms of the percentage of reviews
that are predicted correctly. Using the above-mentioned terms, accuracy can be calculated using:
$$\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}. \qquad (5)$$
Recall is often referred to as the completeness of a classifier. It gives the proportion of actual
positives that is identified correctly. It is also called the sensitivity and can be calculated using the
following formula:
$$\text{Recall} = \frac{TP}{TP + FN}. \qquad (6)$$
Precision shows the exactness of a classifier. It shows what proportion of the samples labeled
positive are actually positive. It is calculated with the following equation:
$$\text{Precision} = \frac{TP}{TP + FP}. \qquad (7)$$
The F1 score is a statistical measure that takes both precision and recall into account
and yields a score between 0 and 1. The closer the value is to 1, the better the
classifier performs. F1 is calculated as:
$$F_1 = 2 \times \frac{\text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}}. \qquad (8)$$
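Equations (5) to (8) can be checked with a small sketch that starts from the four confusion-matrix counts. The counts used below are hypothetical and only serve as an arithmetic example; for the three-class problem in this study, the counts would be taken per class in a one-vs-rest fashion, which is an assumption of this sketch.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, recall, precision, and F1 from confusion counts.

    Assumes tp + fn > 0 and tp + fp > 0, so that no denominator is zero.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # correct / total predictions
    recall = tp / (tp + fn)                      # Equation (6)
    precision = tp / (tp + fp)                   # Equation (7)
    f1 = 2 * recall * precision / (recall + precision)  # Equation (8)
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fp=20, tn=90, fn=10)
```

For these counts, accuracy is 170/200 = 0.85, recall is 80/90, precision is 80/100 = 0.8, and F1 is 160/190, which lies between the precision and the recall.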
RF is executed with two control parameters: max_depth and random_state. The former sets the
maximum depth of the trees that will be created. It can also be taken as the longest path from the root node
to a leaf. Learning an optimal decision tree (DT) is known to be NP-complete under several aspects of optimality. So practical
DT learners are heuristic, taking locally optimal decisions at each node. Hence a globally optimal
decision tree is not guaranteed. So multiple trees are trained in an ensemble classifier and features
are sampled randomly. The latter parameter of RF controls the random choices for such training.
The parameter C defines how much we want to avoid misclassification of each training example. For smaller
values, more examples are misclassified, and vice versa. Maximum iterations defines the maximum
number of iterations that the optimization process is allowed to carry out. The parameter ‘tol’ refers
to the tolerance for the stopping criterion. Penalty specifies the regularization technique used for the model.
L2 represents ‘Ridge Regression’, which adds the ‘squared magnitude’ of the coefficients as a penalty to the loss
function of the model. The parameter ‘fit_intercept’ is set to ‘True’, which includes the intercept value in
the regression model. The ‘solver’ parameter defines the algorithm to be used for the optimization problem;
it is set to ‘lbfgs’, which supports the ‘L2’ penalty and the multinomial loss.
For GBM, the maximum depth is set to 10 while the learning rate is 0.4. The parameter ‘n_estimators’
is set to 100; it defines the number of boosting stages. Setting a larger number of estimators
usually gives better performance. The parameter ‘cache_size’ defines the size of the kernel cache (in MB).
The ‘decision_function_shape’ parameter defines whether to return ‘ovr’ (one-vs-rest) or ‘ovo’ (one-vs-one);
we set it to ‘ovr’. For ‘degree’, we used the default value, i.e., 3, and for gamma the default setting, which uses
1/(n_features * X.var()) as its value. Kernel methods enable the mapping of non-linear observations
into a higher-dimensional space to make them separable. Various kernels are used in machine learning
models, including linear, Gaussian, and neural kernels. For SVM, we used a ‘linear’ kernel, which is suitable when
the data is linearly separable. The number of iterations is set to −1, which means that there is ‘no limit’
on iterations, and the ‘shrinking’ heuristic is set to ‘True’.
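The parameter names in this paragraph match those of scikit-learn, so the settings can be sketched with that library. This is an assumption: the paper does not show its code, and the sketch below is one plausible way to assemble a soft-voting ensemble from the stated values, not the authors' implementation. The `probability=True` flag is an addition needed for soft voting and is not mentioned in the text.

```python
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.svm import SVC

# GBM settings as stated in this paragraph.
gbm = GradientBoostingClassifier(n_estimators=100, max_depth=10, learning_rate=0.4)

# SVM settings as stated in this paragraph.
svm = SVC(
    kernel="linear",
    probability=True,               # assumption: required so SVC can output class probabilities
    decision_function_shape="ovr",
    degree=3,                       # default; only used by the polynomial kernel
    gamma="scale",                  # default; 1 / (n_features * X.var())
    max_iter=-1,                    # no limit on iterations
    shrinking=True,
)

# Soft voting averages the class probabilities of the two base models.
gbsvm = VotingClassifier(estimators=[("gbm", gbm), ("svm", svm)], voting="soft")
```

Training would then follow the usual pattern of calling `gbsvm.fit` on the feature matrix and labels of the training split and `gbsvm.predict` on the test split; the feature extraction step is not shown here.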
Experiment results demonstrate that the proposed GBSVM performs well when used with TF features.
The underlying reason is the combination of GBM and SVM, where GBM combines weak learners
into a strong learner for prediction. The learning rate applied is 0.1, which gives fairly accurate results,
and SVM is used with a linear kernel and so performs faster and more accurately on the categorical
data. As the dataset contains a higher number of positive class instances, the precision for the
positive class is quite good.
Similarly, GBSVM outperforms the other machine learning classifiers when TF-IDF features are used
for sentiment classification and gives an accuracy of 92%. As before, precision results for the positive
class are comparatively higher than those of the negative and neutral classes, primarily on account of the
larger number of training samples for the positive class. The F1 score, which considers both precision and recall, is
also higher for GBSVM than for the other classifiers. The results shown in Table 4 are for uni-gram TF-IDF;
bi-gram TF-IDF has also been used, and the results are shown in Table 5.
Results with bi-gram TF-IDF features reveal that the performance of all classifiers is considerably
degraded. Theoretically, a higher-order n-gram model contains more information on a word’s context,
which can lead to overfitting. This happens when the data is sparse, i.e., when we have a relatively
large number of tokens but the frequency of each token is low. In such scenarios, a low-order n-gram
model can perform better than a high-order n-gram model, which is the case with the current dataset.
This is further corroborated by the results when TF-IDF tri-gram features are used for sentiment
classification. Results with tri-gram features are shown in Table 6.
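The sparsity argument can be made concrete with a toy count of n-grams. The six reviews below are hypothetical and merely imitate short app reviews; the function names are also hypothetical. For each n, the sketch reports the vocabulary size, the number of n-grams seen more than once, and the number of reviews that yield no n-gram at all.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (joined by spaces) in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_stats(reviews, n):
    """Return (vocabulary size, n-grams seen more than once, reviews with no n-gram)."""
    per_review = [ngrams(r.lower().split(), n) for r in reviews]
    counts = Counter(g for grams in per_review for g in grams)
    repeated = sum(1 for c in counts.values() if c > 1)
    empty = sum(1 for grams in per_review if not grams)
    return len(counts), repeated, empty


# Hypothetical short reviews, for illustration only.
reviews = ["great app", "nice", "good", "great", "good app works great", "nice app"]
stats = {n: ngram_stats(reviews, n) for n in (1, 2, 3)}
```

In this toy corpus, four of the five uni-grams occur more than once, whereas no bi-gram or tri-gram repeats, and most reviews are too short to produce any tri-gram. Features that never repeat give a classifier little to generalize from.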
As we can see, the accuracy values obtained from the different classifiers decrease further
with tri-gram features. The most probable reason is the nature of the data that has been used during
the training phase. In many cases, bi- and tri-grams perform worse than uni-grams, particularly
because adding extra features may lead to overfitting. Another reason is the small sample
of training data. It is highly probable that classifiers encounter unseen tri-grams, which can
reduce the performance on the test data. Often the data contains only single words, which leads to better
performance of uni-gram than of bi- and tri-gram models. The selected data mostly consists of
single words such as ‘great’, ‘nice’, and ‘good’, so training on these results in higher accuracy for
classifiers when uni-grams are used, and the results become gradually poorer as we move from bi-gram to
tri-gram. That is why the performance of the selected classifiers is degraded; even
so, GBSVM performs better than the other classifiers.
Reference    Approach                   Accuracy
[9]          SGDC+LR                    88%
[60]         Rocchio+Naïve Bayes+KNN    73%
Proposed     GBSVM                      93%
Reference    Approach                   Accuracy
[62]         GCN                        87.9%
[62]         SGCN                       88.5%
[63]         NABoE                      86.0%
Proposed     GBSVM with TF-IDF          90.0%
5. Conclusions
The rise and widespread use of social media have opened new ways of expressing opinions and
sentiments on social platforms like Twitter, Facebook, etc. This has fueled interest in sentiment
analysis, as finding correct sentiments from text has become an important tool for individuals and
companies to devise and revise products and services for increased customer satisfaction. In this
paper, a sentiment analysis approach is devised which performs voting over two base models,
GBM and SVM. The performance is tested against four machine learning models:
GBM, SVM, LR, and RF. Experiment results on the Google app dataset show that the proposed
GBSVM outperforms these machine learning classifiers. Additionally, TF and three variants of TF-IDF (uni-,
bi-, and tri-gram) are investigated for their suitability as classification features, which reveals that
uni-gram TF-IDF performs better than TF and bi- and tri-gram TF-IDF. However, these results are not
conclusive, as a larger dataset may affect the results, and bi-gram and tri-gram features may perform better with
a larger dataset; investigating this is intended as future work. Further refinement in accuracy is possible with
more balanced data, where the numbers of training samples for the positive, negative, and neutral classes are approximately
equal. A performance comparison of GBSVM with four similar models shows that it performs better
and achieves higher accuracy.
Author Contributions: Conceptualization, M.K. and I.A.; data curation, M.A.; formal analysis, M.K. and A.M.;
funding acquisition, G.S.C.; investigation, M.K.; methodology, A.M.; project administration, S.U.; software, M.A.;
validation, S.U.; writing—original draft, I.A.; writing—review and editing, G.S.C. All authors have read and
agreed to the published version of the manuscript.
Funding: This research was supported by Basic Science Research Program through the National Research
Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2019R1A2C1006159), MSIT(Ministry
of Science and ICT), Korea, under the ITRC(Information Technology Research Center) support program
(IITP-2020-2016-0-00313) supervised by the IITP(Institute for Information & communications Technology
Promotion), and the Brain Korea 21 Plus Program(No. 22A20130012814) funded by the National Research
Foundation of Korea (NRF).
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design of the
study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to
publish the results.
References
1. Chen, L.; Wang, W.; Nagarajan, M.; Wang, S.; Sheth, A.P. Extracting diverse sentiment expressions with
target-dependent polarity from twitter. In Proceedings of the Sixth International AAAI Conference on
Weblogs and Social Media, Dublin, Ireland, 4–7 June 2012.
2. Liu, B. Sentiment Analysis and Subjectivity. In Handbook of Natural Language Processing;
Marcel Dekker, Inc.: New York, NY, USA, 2009.
3. Dave, K.; Lawrence, S.; Pennock, D.M. Mining the peanut gallery: Opinion extraction and semantic
classification of product reviews. In Proceedings of the 12th International Conference on World Wide Web,
Budapest, Hungary, 20–24 May 2003; pp. 519–528.
4. Kasper, W.; Vela, M. Sentiment analysis for hotel reviews. In Proceedings of the Computational
Linguistics-Applications Conference, Jachranka, Poland, 17–19 October 2011; Volume 231527, pp. 45–52.
5. Hussein, D.M.E.D.M. A survey on sentiment analysis challenges. J. King Saud Univ.-Eng. Sci. 2018,
30, 330–338. [CrossRef]
6. Mukherjee, A.; Venkataraman, V.; Liu, B.; Glance, N. What yelp fake review filter might be doing?
In Proceedings of the Seventh International AAAI Conference on Weblogs and Social Media, Cambridge,
MA, USA, 8–11 July 2013.
7. Pang, B.; Lee, L.; Vaithyanathan, S. Thumbs up? Sentiment classification using machine learning
techniques. In Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing,
Stroudsburg, Philadelphia, PA, USA, 6–7 July 2002; Association for Computational Linguistics: Shumen,
Bulgaria, 2002; Volume 10, pp. 79–86.
8. Pang, B.; Lee, L. Opinion mining and sentiment analysis. Found. Trends. Inf. Retr. 2008, 2, 1–135. [CrossRef]
9. Rustam, F.; Ashraf, I.; Mehmood, A.; Ullah, S.; Choi, G.S. Tweets Classification on the Base of Sentiments for
US Airline Companies. Entropy 2019, 21, 1078. [CrossRef]
10. Neethu, M.; Rajasree, R. Sentiment analysis in twitter using machine learning techniques. In Proceedings of
the 2013 Fourth International Conference on Computing, Communications and Networking Technologies
(ICCCNT), Tiruchengode, India, 4–6 July 2013; pp. 1–5.
11. Ortigosa, A.; Martín, J.M.; Carro, R.M. Sentiment analysis in Facebook and its application to e-learning.
Comput. Hum. Behav. 2014, 31, 527–541. [CrossRef]
12. Bakshi, R.K.; Kaur, N.; Kaur, R.; Kaur, G. Opinion mining and sentiment analysis. In Proceedings of the 2016
3rd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi,
India, 16–18 March 2016; pp. 452–455.
13. Barnaghi, P.; Ghaffari, P.; Breslin, J.G. Opinion mining and sentiment polarity on twitter and correlation
between events and sentiment. In Proceedings of the 2016 IEEE Second International Conference on Big
Data Computing Service and Applications (BigDataService), Oxford, UK, 29 March–1 April 2016; pp. 52–57.
14. Ahmed, S.; Danti, A. A novel approach for Sentimental Analysis and Opinion Mining based on
SentiWordNet using web data. In Proceedings of the 2015 International Conference on Trends in Automation,
Communications and Computing Technology (I-TACT-15), Bangalore, India, 21–22 December 2015; pp. 1–5.
15. Duwairi, R.M.; Qarqaz, I. Arabic sentiment analysis using supervised classification. In Proceedings of the
2014 International Conference on Future Internet of Things and Cloud, Barcelona, Spain, 27–29 August 2014;
pp. 579–583.
16. Le, H.S.; Van Le, T.; Pham, T.V. Aspect analysis for opinion mining of Vietnamese text. In Proceedings of the
2015 International Conference on Advanced Computing and Applications (ACOMP), Ho Chi Minh City,
Vietnam, 23–25 November 2015; pp. 118–123.
17. Chumwatana, T. Using sentiment analysis technique for analyzing Thai customer satisfaction from social
media. In Proceedings of the 5th International Conference on Computing and Informatics (ICOCI), Istanbul,
Turkey, 11–13 August 2015.
18. Boiy, E.; Moens, M.F. A machine learning approach to sentiment analysis in multilingual Web texts. Inf. Retr.
2009, 12, 526–558. [CrossRef]
19. Duwairi, R.; El-Orfali, M. A study of the effects of preprocessing strategies on sentiment analysis for Arabic
text. J. Inf. Sci. 2014, 40, 501–513. [CrossRef]
20. Uysal, A.K.; Gunal, S. The impact of preprocessing on text classification. Inf. Process. Manag. 2014,
50, 104–112. [CrossRef]
21. Kalra, V.; Aggarwal, R. Importance of Text Data Preprocessing & Implementation in RapidMiner.
In Proceedings of the First International Conference on Information Technology and Knowledge
Management, New Delhi, India, 22–23 December 2017; Volume 14, pp. 71–75.
22. Uysal, A.K.; Gunal, S. A novel probabilistic feature selection method for text classification. Knowl.-Based
Syst. 2012, 36, 226–235. [CrossRef]
23. Hackeling, G. Mastering Machine Learning with Scikit-Learn; Packt Publishing Ltd.: Birmingham, UK, 2017.
24. Wang, G.; Sun, J.; Ma, J.; Xu, K.; Gu, J. Sentiment classification: The contribution of ensemble learning. Decis.
Support Syst. 2014, 57, 77–93. [CrossRef]
25. Whitehead, M.; Yaeger, L. Sentiment mining using ensemble classification models. In Innovations and
Advances in Computer Sciences and Engineering; Springer: Dordrecht, The Netherlands, 2010; pp. 509–514.
26. Zhou, Z.H. Ensemble Learning. Encycl. Biom. 2009, 1, 270–273.
27. Deng, X.B.; Ye, Y.M.; Li, H.B.; Huang, J.Z. An improved random forest approach for detection of hidden web
search interfaces. In Proceedings of the 2008 International Conference on Machine Learning and Cybernetics,
Kunming, China, 12–15 July 2008; Volume 3, pp. 1586–1591.
28. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [CrossRef]
29. Korkmaz, M.; Güney, S.; Yiğiter, Ş. The importance of logistic regression implementations in the Turkish
livestock sector and logistic regression implementations/fields. Harran Tarım ve Gıda Bilimleri Dergisi 2012,
16, 25–36.
30. Johnson, R.; Zhang, T. Learning nonlinear functions using regularized greedy forest. IEEE Trans. Pattern
Anal. Mach. Intell. 2013, 36, 942–954. [CrossRef]
31. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
[CrossRef]
32. Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobot. 2013, 7, 21. [CrossRef]
33. Bissacco, A.; Yang, M.H.; Soatto, S. Fast human pose estimation using appearance and motion via
multi-dimensional boosting regression. In Proceedings of the 2007 IEEE Conference on Computer Vision
and Pattern Recognition, Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8.
34. Hutchinson, R.A.; Liu, L.P.; Dietterich, T.G. Incorporating boosted regression trees into ecological latent
variable models. In Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, San Francisco,
CA, USA, 7–11 August 2011.
35. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [CrossRef]
36. Aslam, S.; Ashraf, I. Data mining algorithms and their applications in education data mining. Int. J. Adv.
Res. Comp. Sci. Manag. Stud. 2014, 2, 50–56.
37. Byun, H.; Lee, S.W. A survey on pattern recognition applications of support vector machines. Int. J. Pattern
Recognit. Artif. Intell. 2003, 17, 459–486. [CrossRef]
38. Burges, C.J. Geometry and Invariance in Kernel Based Methods, Advances in Kernel Methods: Support Vector
Learning; MIT Press: Cambridge, MA, USA, 1999.
39. Shmilovici, A. Support vector machines. In Data Mining and Knowledge Discovery Handbook; Springer: Berlin,
Germany, 2009; pp. 231–247.
40. Smola, A.; Schölkopf, B. A Tutorial on Support Vector Regression; NeuroCOLT Technical Report
NC-TR-98-030; Royal Holloway Coll. Univ.: London, UK, 1998. Available online: http://www.kernel-machines
(accessed on 7 April 2020).
41. Smola, A.J.; Schölkopf, B.; Müller, K.R. The connection between regularization operators and support vector
kernels. Neural Netw. 1998, 11, 637–649. [CrossRef]
42. Epanechnikov, V.A. Non-parametric estimation of a multivariate probability density. Theory Probab. Its Appl.
1969, 14, 153–158. [CrossRef]
43. Zhang, Y. Support vector machine classification algorithm and its application. In International Conference on
Information Computing and Applications; Springer: Berlin/Heidelberg, Germany, 2012; pp. 179–186.
44. Shevade, S.K.; Keerthi, S.S.; Bhattacharyya, C.; Murthy, K.R.K. Improvements to the SMO algorithm for
SVM regression. IEEE Trans. Neural Netw. 2000, 11, 1188–1193. [CrossRef]
45. Vijayarani, S.; Janani, R. Text mining: Open source tokenization tools-an analysis. Adv. Comput. Intell. Int. J.
2016, 3, 37–47.
46. Yang, S.; Zhang, H. Text mining of Twitter data using a latent Dirichlet allocation topic model and sentiment
analysis. Int. J. Comput. Inf. Eng. 2018, 12, 525–529.
47. Anandarajan, M.; Nolan, T. Practical Text Analytics. Maximizing the Value of Text Data. In Advances in
Analytics and Data Science; Springer Nature Switzerland AG: Cham, Switzerland, 2019; Volume 2.
48. Bennett, K.P.; Campbell, C. Support vector machines: Hype or hallelujah? ACM SIGKDD Explor. Newsl.
2000, 2, 1–13. [CrossRef]
49. Agnihotri, D.; Verma, K.; Tripathi, P.; Singh, B.K. Soft voting technique to improve the performance of global
filter based feature selection in text corpus. Appl. Intell. 2019, 49, 1597–1619. [CrossRef]
50. Kaggle. Google Play Store Apps. 2019. Available online: https://www.kaggle.com/lava18/google-play-store-apps
(accessed on 21 November 2019).
51. Tellex, S.; Katz, B.; Lin, J.; Fernandes, A.; Marton, G. Quantitative evaluation of passage retrieval algorithms
for question answering. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research
and Development in Information Retrieval, Toronto, ON, Canada, 28 July–1 August 2003; pp. 41–47.
52. Zhao, R.; Mao, K. Fuzzy bag-of-words model for document representation. IEEE Trans. Fuzzy Syst. 2017,
26, 794–804. [CrossRef]
53. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space.
arXiv 2013, arXiv:1301.3781.
54. Huang, L. Measuring Similarity Between Texts in Python. 2017. Available online:
https://sites.temple.edu/tudsc/2017/03/30/measuring-similarity-between-texts-in-python/ (accessed on 21 November 2019).
55. Joulin, A.; Grave, E.; Bojanowski, P.; Mikolov, T. Bag of tricks for efficient text classification. arXiv 2016,
arXiv:1607.01759.
56. Sisodia, D.S.; Nikhil, S.; Kiran, G.S.; Shrawgi, H. Performance Evaluation of Learners for Analyzing the
Hotel Customer Sentiments Based on Text Reviews. In Performance Management of Integrated Systems and its
Applications in Software Engineering; Springer: Singapore, 2020; pp. 199–209.
57. Oprea, C. Performance evaluation of the data mining classification methods. Inf. Soc. Sustain. Dev. 2014,
2344, 249–253.
58. Han, J.; Pei, J.; Kamber, M. Data Mining: Concepts and Techniques; Elsevier: Waltham, MA, USA, 2011.
59. Shalev-Shwartz, S.; Ben-David, S. Understanding Machine Learning: From Theory to Algorithms; Cambridge
University Press: Cambridge, UK, 2014.
60. Danesh, A.; Moshiri, B.; Fatemi, O. Improve text classification accuracy based on classifier fusion methods.
In Proceedings of the 2007 10th International Conference on Information Fusion, Quebec, QC, Canada,
9–12 July 2007; pp. 1–6.
61. Kaggle. 20 Newsgroups. 2017. Available online: https://www.kaggle.com/crawford/20-newsgroups
(accessed on 14 January 2020).
62. Wu, F.; Zhang, T.; Souza, A.H.D., Jr.; Fifty, C.; Yu, T.; Weinberger, K.Q. Simplifying graph convolutional
networks. arXiv 2019, arXiv:1902.07153.
63. Yamada, I.; Shindo, H. Neural Attentive Bag-of-Entities Model for Text Classification. arXiv 2019,
arXiv:1909.01259.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).