0% found this document useful (0 votes)
31 views

Building Fake Review Detection Model Based On Sentiment Intensity and PU Learning

Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
31 views

Building Fake Review Detection Model Based On Sentiment Intensity and PU Learning

Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 14

6926 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 34, NO.

10, OCTOBER 2023

Building Fake Review Detection Model Based on


Sentiment Intensity and PU Learning
Zhang Shunxiang , Zhu Aoqiang , Zhu Guangli, Wei Zhongliang , and Li KuanChing , Senior Member, IEEE

Abstract— Fake review detection has the characteristics of for customers to understand the product information before
huge stream data processing scale, unlimited data increment, making a purchase decision [2]. Driven by interests, marketing
dynamic change, and so on. However, the existing fake review managers forged many fake reviews to boost their products.
detection methods mainly target limited and static review data.
In addition, deceptive fake reviews have always been a difficult The number of fake reviews is increasing exponentially [3],
point in fake review detection due to their hidden and diverse spreading as fast as real reviews [4]. These widespread fake
characteristics. To solve the above problems, this article proposes reviews will seriously mislead users to make correct purchase
a fake review detection model based on sentiment intensity decisions and endanger the e-commerce economy’s healthy
and PU learning (SIPUL), which can continuously learn the development.
prediction model from the constantly arriving streaming data.
First, when the streaming data arrive, the sentiment intensity is The previous work of text sentiment computing provides
introduced to divide the reviews into different subsets (i.e., strong a theoretical basis for fake review detection in this article.
sentiment set and weak sentiment set). Then, the initial positive Regarding the research on text sentiment computing, previous
and negative samples are extracted from the subset using the work proposed a microblog word extraction algorithm to
marking mechanism of selection completely at random (SCAR) analyze the sentiment tendency of microblog reviews [5].
and Spy technology. Second, building a semi-supervised positive-
unlabeled (PU) learning detector based on the initial sample to An associated semantic representation model is proposed,
detect fake reviews in the data stream iteratively. According to which solves the problem that ultrashort reviews are
the detection results, the data of initial samples and the PU challenging to understand because of data sparseness and
learning detector are continuously updated. Finally, the old data content fragmentation [6]. Aiming at the problem of complex
are continually deleted according to the historical record points, sentence structure and incomprehensibility, a sentiment
so that the training sample data are within a manageable size
and prevent overfitting. Experimental results show that the model classification model for microblog reviews is proposed [7].
can effectively detect fake reviews, especially deceptive reviews. Different from the previous work, this research of user
reviews detects fake reviews in product reviews based on
Index Terms— Fake reviews, positive-unlabeled (PU) learning,
semi-supervised learning, sentiment analysis. sentiment intensity and PU learning (SIPUL) to build a
detection model for fake reviews.
I. I NTRODUCTION The traditional work on detecting fake reviews mostly tar-
gets limited and static review data. The main method focuses
W ITH the online purchase market developing rapidly,
everyone can become a participator, purchaser, and
reviewer of online product. However, while online shop-
on heuristic strategies [8], fully supervised machine learning
[9], [10], and deep learning [11], [12]. Most of these methods
ping brings us convenience, there are also inherent chal- need large-scale labeled datasets. However, the low accuracy
lenges that consumers cannot distinguish the quality and of manually identifying fake reviews, which is only 53.1%–
performance of products as they do in physical stores [1]. 61.9%, makes it difficult to obtain large-scale labeled datasets
Reading the existing reviews is one of the crucial ways in practical research [13]. Some researchers combine many
features of reviews to detect fake reviews based on semi-
Manuscript received 7 July 2021; revised 5 January 2022, 14 July 2022, and supervised methods, such as two-view [14], self-training [15],
14 October 2022; accepted 1 January 2023. Date of publication 12 January and positive-unlabeled (PU) learning [16], [17]. To a certain
2023; date of current version 6 October 2023. This work was supported in part
by the National Natural Science Foundation of China under Grant 62076006 extent, these methods solve the problem of no large-scale
and in part by the University Synergy Innovation Program of Anhui Province annotations in the fake review dataset. However, previous
under Grant GXXT-2021-008. (Corresponding author: Zhu Aoqiang.) studies did not consider the streaming data characteristics of
Zhang Shunxiang is with the School of Computer Science and Engineering,
Anhui University of Science and Technology, Huainan 231001, China, and false comment detection. In addition, deceptive fake reviews
also with the Artificial Intelligence Research Institute, Hefei Compre- are written deliberately to mislead readers, which are difficult
hensive National Science Center, Hefei 230000, P. R. China (e-mail: to be detected by the previous methods. Generally, deceptive
[email protected]).
Zhu Aoqiang, Zhu Guangli, and Wei Zhongliang are with the School fake reviews are written in imitation of real ones, so they have
of Computer Science and Engineering, Anhui University of Science the features of hidden and diversity [18].
and Technology, Huainan 231001, China (e-mail: [email protected]; By analyzing the characteristics of fake reviews and pre-
[email protected]; [email protected]).
Li KuanChing is with the Department of Computer Science and Information decessors’ work, several crucial questions about fake review
Engineering (CSIE), Providence University, Taichung 43301, Taiwan (e-mail: detection have been presented.
[email protected]). 1) The large-scale annotated fake reviews dataset is difficult
Color versions of one or more figures in this article are available at
https://doi.org/10.1109/TNNLS.2023.3234427. to be built. How to make full use of the unlabeled
Digital Object Identifier 10.1109/TNNLS.2023.3234427 datasets in fake reviews research?
2162-237X © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
SHUNXIANG et al.: BUILDING FAKE REVIEW DETECTION MODEL BASED ON SIPUL 6927

Fig. 1. Framework of fake review detection model SIPUL. Step a: Division of review according to different sentiment intensities. Step b: Construction of
fake review detection model SIPUL.

2) The fake review with the deceptive feature is hard to be the prediction model from the constantly arriving streaming
detected in the given product dataset. How to effectively data. In particular, this model has a better effect than previous
detect deceptive fake reviews? methods in detecting deceptive fake reviews with a hidden
3) How to treat review data as streaming data in the fake feature. The framework of the model includes the following
review detection model? two aspects.
We propose a semi-supervised fake review detection model 1) The division of review according to different sen-
(SIPUL) based on the above three problems. Unlike traditional
timent intensities: First, the received reviews data
work, the method of SIPUL is mainly dedicated to overcome
are preprocessed, such as data cleaning, stop words
the problem that deceptive fake reviews are difficult to detect
removal, stemming, lemmatization, and so on. Second,
in streaming data. The innovation of this article is to divide
SC-CMC-KS [7] algorithm is used to calculate the sen-
reviews according to different sentiment intensities before fake timent value of reviews. Finally, the product review
review detection to enhance the characteristics of deceptive
is divided into two subsets with different sentiment
fake reviews. This article improves the semi-supervised PU
intensities.
algorithm for processing static data. The continuous iteration 2) The construction of fake review detection model SIPUL:
of the model and the update of training sample data are
First, the selection completely at random (SCAR) is used
used to deal with review streaming data’s increasing and
to select the positive example from subsets and put them
changing characteristics. The specific method is that we divide
into set P. Then, the credible negative examples RN are
the review into two subsets: strong sentiment set and weak
extracted from subsets by the Spy Technique [52]. Next,
sentiment set according to the different sentiment intensities. P and RN were used to form the initial model training
Based on a sentiment feature of consensus, fake reviews tend
sample, and the fake review detector is trained iteratively
to have higher sentiment intensity than real reviews [10].
by constantly updating the set P and RN. At last, the old
In dividing the review, most of the fake reviews tend to flow data are continuously deleted according to the historical
into the strong sentiment set, while the real reviews tend to
record points, so that the training sample data are within
flow into the weak sentiment set. The advantage of sentiment
a manageable size.
intensity is that more interleaved reviews are separated. Before
the streaming data arrive, the classifier is trained based on the The rest of this article is structured as follows: Section II
labeled samples. The trained classifier is continuously used to briefly reviews the existing work. Section III introduces the
detect fake reviews in the unlabeled set. The training samples principle of the method proposed and the determination of
and classifiers are updated according to the detection results. related parameters. Section IV introduces the construction
The construction of the model SIPUL is shown in Fig. 1. process of the model proposed. We give the experimental
The advantage of this model is that it does not need a large results and analysis in Section V. Finally, conclusions and
amount of annotated corpus, which can continuously learn future work are made in Section VI.

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
6928 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 34, NO. 10, OCTOBER 2023

II. R ELATED W ORKS but the detection effect is poor. Therefore, many scholars have
This section briefly reviews three aspects of current theo- focused on semi-supervised learning [26]. Wu et al. [27] used
retical works, including text sentiment analysis, fake reviews the collaborative learning method to identify spammers and
detection, and PU learning algorithm. social spammers. Yuan et al. [28] combined users, reviews,
and products to get a review representation for fake reviews
detection. Li et al. [29] presented a deep social network
A. Text Sentiment Analysis detection model for spammers. Yelundur et al. [30] proposed
This section reviews two of our previous works, which are a binary multiobjective method to detect review abuse. Their
the theoretical reference basis for this article’s calculation of experimental results show that the model achieves higher
sentiment value. accuracy/recall than unsupervised technology. Wang et al. [31]
Affective computing involves multiple modalities of text constructed a multiple feature fusion and rolling collaborative
[19], pictures, and videos [20]. The main task of sentiment training model for fake reviews. Imam et al. [32] considered
analysis is to analyze user reviews’ implicit sentiment states, that the characteristics of fake reviews might change over
attitudes, and opinions. The foundation of text sentiment time, so they proposed a semi-supervision model for the drift
analysis based on semantics is constructing a sentiment dic- problem.
tionary. Zhang et al. [6] proposed a method of making a
sentiment dictionary based on microblog topics. The annual C. PU Learning Algorithm
network words are counted and annotated manually to con- PU learning algorithm is a semi-supervised binary classifi-
struct the network words dictionary. Six sentiment dictionar- cation model. Different from the traditional semi-supervised
ies are constructed to expand the existing essential emotion classification model, the training of PU learning can be done
dictionaries, including the negative dictionary, the network with only a small amount of real reviews [33].
term dictionary, the degree adverb dictionary, and other related Kiryo et al. [34] propose a nonnegative risk estimator
dictionaries, which enriches the essential sentiment dictionary. for PU learning that solves the overfitting problem caused
Subsequently, the sentiment value of the microblog review is by introducing deep learning models in the PU learning
calculated based on the constructed dictionaries. field. Niu et al. [35] proved that PU and NU learning given
The analysis based on syntactic rules is also one of the infinite U data will almost always improve on pn learning.
main ways of sentiment analysis. Zhang et al. [7] used Zayd et al. [36] propose a general simplified method to solve
the three attributes of sentiment, location, and keywords to the overfitting problem of PU risk estimation. Fang et al. [37],
identify critical sentences in reviews and proposed a sentiment [38] studied the learning world problem of open set learning
partition model of user microblog reviews based on imper- and unsupervised open set domain adaptation (UOSDA) in the
ative sentences (SC-CMC-KS). They integrate dependency PU algorithm.
relationships and multiple rules to calculate the sentiment PU learning learns through a small number of labeled
value of reviews text. Finally, the sentiment value of the whole positive samples, achieving better classification results. There-
review is obtained by weighting the sentiment value of crucial fore, it has wide applications in the field of fake reviews
sentences and emoticons. detection. Fusilier et al. [39] divided fake reviews into pos-
itive and negative to use PU learning arithmetic focus on
B. Detection of Fake Review deceptive opinions. Li et al. [40] compared multiple groups
of experiments, and the final results showed that the PU
Fake reviews refer to the reviews that some users make
learning outperformed the supervised learning in fake review
up fake consumption experience and advocate or slander
detection. Chen et al. [41] effectively detected spammers in
the quality of the product for commercial or other bad
Sina Weibo based on the semi-supervised algorithm. He et al.
motives [21].
[42] combined PU learning and behavior density to study the
The traditional work of fake review detection mainly
detection of fake reviews.
focused on the full-supervised algorithm. Li et al. [22] built
a cross-domain dataset and studied the detection of cross-
domain fake reviews. Melling et al. [23] used review repre- D. Streaming Data
sentation combining emotion and sentiments to detect fake Compared with static and limited data stored in a database
reviews. Liu et al. [24] mined multiple levels of implicit or file, streaming data have the following characteristics:
expression patterns in reviews and integrated four dimensions the scale is usually massive, unlimited increments, dynamic
of the user, comment text, product, and fine-grained aspects changes, and timely decision-making [43]. These features are
into review expression. Fang et al. [25] proposed a fake very consistent with the features and requirements of false
review detection method based on a dynamic knowledge comment detection.
graph. Petr et al. [11] utilized two neural network models, Liu et al. [44] proposed a recurrent neural network
which combine the bag-of-words and the word’s context with model based on LSTM and LSTM+, which can correct the
the sentiment of consumers. abnormal data through the anomalous data in the stream-
Fully supervised detection of fake reviews has achieved ing data. Martín et al. [45] propose Kafka-ML, a novel and
good results, but the lack of large-scale labeled datasets limits open-source framework that enables the management of
its research. Unsupervised learning does not need labeled data, ML/AI pipelines through streaming data. Sun et al. [46]

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
SHUNXIANG et al.: BUILDING FAKE REVIEW DETECTION MODEL BASED ON SIPUL 6929

proposed a data stream cleaning system using edge intel- in the original environment will relatively lose hidden
ligence. Ning et al. [47] proposed a new high-dimensional in the new environment.
online learning method for online high-dimensional regression
and classification problems through efficient polar coordinate Most of the traditional research work started from the
decomposition. Liang et al. [48] present an anomaly detection first point. However, it is challenging to extract significantly
aided budget online weighted learning method to identify useful features from the dataset due to the hidden feature of
positive and negative instances from imbalanced data streams. deceptive reviews. Therefore, this article studies the second
Singh et al. [49] propose a self-adaptive density summarizing point. Our motivation is to change the environment of the fake
incremental natural outlier detection in the data stream with review placed by dividing the dataset according to different
skipping scheme and without skipping technique, which solves sentiment intensities. Therefore, the original deceptive reviews
the problem of not detecting the memory. with the hidden feature will no longer be hidden in the new
Based on the above-related works, detecting fake reviews set environment (subset). The detailed principle is shown in
has made significant progress. However, fake reviews detection Fig. 2.
remains a challenge because of deceptive fake reviews’ hidden Model explanation: In Fig. 2, ri and f i denote the review
and diverse features. Based on previous research, a detection text and review feature. r1 , r3 , r5 , r7 and r9 are assumed as
model for fake reviews is proposed to improve the accuracy of real reviews, while r2 , r4 , r6 , r8 and r10 are fake reviews.
deceptive fake reviews by changing the dataset environment According to previous work, we know that different review
of reviews placed. The different sentiment values of reviews features have different influences on fake reviews. To explain
are used to divide the review into two subsets, and we use the principle of the model better, it is assumed that the impact
an improved PU algorithm, which can continuously learn the of the ten fake review features is the same, so that the review
prediction model from the constantly arriving streaming data. with more fake features is more likely to be fake. Based on
In addition, we continue to detect fake reviews from the the above assumptions, the comparison and analysis of the
unmarked set through the iteration of the PU algorithm and previous working principle and our working principle are in
update the training samples in time based on the detection the following.
results, so that the model can continue to learn and predict Previous working principle: In Fig. 2, reviews contain
from the fast-arriving streaming data. at most eight fake features and at least one fake feature.
Therefore, the reviews with fake features greater than 4 can
III. M ODEL P RINCIPLE AND PARAMETERS OF SIPUL be classified as negative examples (fake reviews) and less
This section introduces the principle of detecting deceptive than or equal to 4 are classified as positive examples (real
reviews in detail. It mainly includes the principle of SIPUL reviews). The results are shown in Fig. 2. The positive
in detecting deceptive fake reviews, the criterion of review examples include r1 , r3 , r6 , r7 and r10 , and the negative exam-
division, and the calculation of sentiment threshold. ples include r2 , r4 , r5 , r8 , and r9 . According to the previous
assumptions, we found that r5 , r6 , r9 , and r10 are misclassified.
The comparison with the results of this article is shown
A. Principle of Detecting Deceptive Reviews in Table I.
Researchers have studied the detection of fake reviews for Our working principle: Under the same assumption, our
many years. Dianping website (a Chinese commodity review working principle adds a feature partition layer to the previous
website) built a system for fake review detection, and they working principle. The layer added is designed to improve
are confident about the system’s accuracy, but they do not the detection of deceptive reviews with the hidden feature
know the recall [40]. The rest of the reviews may hide many by dividing the given dataset into two parts (i.e., strong
fake reviews that have not been detected. Usually, the fake feature matrix and weak feature matrix). In this case, the
reviews that are not detected are mostly deceptive. If deceptive number of fake features in the strong feature matrix is 5–8
reviews can be effectively detected, it will greatly improve the and in the weak feature matrix is 1–4. In the strong feature
efficiency of detecting fake reviews. matrix, the reviews with fake features greater than 6 are
Deceptive reviews are difficult to detect because they are divided into negative examples, and the reviews less than
similar to the real ones and the fake features are not obvious or equal to 6 are divided into real reviews. In the weak
in the given product dataset. As we all know, data and feature matrix, reviews with fake features greater than 2 are
features determine the ceiling of machine learning, while mod- classified as negative examples, and reviews less than or equal
els and algorithms only approach the ceiling. Consequently, to 2 are classified as real reviews. The detection result is
a binary machine learning classification problem can get better shown in Fig. 2. The positive examples include r1 , r2 , r5 , r7 ,
experimental results if significant and effective features can and r9 , and the negative examples include r3 , r4 , r6 , r8 , and r10 .
be extracted from the given data. To detect the deceptive According to the hypothesis, it can be known that r2 and r3 are
fake reviews butter, we can explore from the following two misclassified.
points. The result analysis of model principles: In Table I, the result
1) Feature: Extracting representative fake review features showed that r1 , r4 , r7 , and r8 are accurately detected in the two
from a given dataset to distinguish real reviews. model principles. The reason is that r1 and r7 contain fewer
2) Data: Changing the set environment (dataset) of the fake fake features, and r4 and r8 contain more fake features. r1
reviews placed. The fake reviews with hidden features and r7 contain fewer fake features, which is more obvious

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
6930 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 34, NO. 10, OCTOBER 2023

Fig. 2. Comparison of model principles. The yellow dashed line is the dataset division operation (i.e., the calculation of sentiment value) that the work of
this article has increased compared with the previous work. ri f i to denote the review text and review feature. r1 , r3 , r5 , r7 , and r9 are assumed as real reviews,
while r2 , r4 , r6 , r8 , and r10 are fake reviews.
TABLE I
C OMPARISON R ESULTS OF M ODEL P RINCIPLES

than other reviews. On the contrary, r4 and r8 contain more reviews as much as possible according to different sentiment
fake features, which is also more obvious than other reviews. values before the model is trained. In dividing the review,
These prominent features make these reviews easy to detect most of the fake reviews tended to be classified into the
correctly. strong sentiment set, while real reviews tended to be clas-
However, the reviews containing three to six fake features sified into the weak sentiment set. This division indirectly
have a higher overall similarity between real reviews and changes the set environment of the fake reviews placed,
fake reviews. The reviews in this part are intertwined, which which makes deceptive reviews with hidden features no longer
is equivalent to hide the fake reviews in the real reviews. have hidden features in the subset. Therefore, it is easier
It is difficult to detect deceptive reviews with hidden features to detect deceptive fake reviews in the divided subset. The
because of the single feature extracted directly from the given effectiveness of this model is verified through experiments
dataset. Our method is to separate the intertwined parts of in Section V.

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
SHUNXIANG et al.: BUILDING FAKE REVIEW DETECTION MODEL BASED ON SIPUL 6931

TABLE II TABLE III


D ISTRIBUTION OF THE D ATASET OR VALUE

B. How to Select Criterion of Review Division


According to the analysis in Section III-A, it is very crucial
to choose a representative criterion for review division. In this
article, sentiment intensity is selected as the criterion for the
classification of the review.
In general, many features may affect fake reviews. We ana-
lyzed ten common features based on [50]. The ten features
are text length (F1), text complexity (F2), relevance between existing research results (SC-CMC-KS) as the theoretical algo-
reviews and products (F3), consistency of reviews and ratings rithm of sentiment computing, and the theory of the method
(F4), sentiment intensity (F5), whether to include transitional comes from [7]. This article divides the reviews according
words (F6), copy text (F7), user reputation (F8), consistency of to the intensity of the sentiment and does not consider the
evaluation (F9), and attached advertising picture (F10). They positive or negative. Therefore, the absolute value operation is
use the stepwise regression to calculate which feature has a performed on the calculation result of the algorithm
severe influence on fake reviews, and the advantage of the
FS = |λ × TScore(total) + (1 − λ) × SE(emotion)| (4)
feature is calculated by the odds ratio (OR) [51]. The OR is a
ratio index to measure the effect of risk factors in epidemics, where FS is the sentiment value of the review, TScore(total)
and its calculation formula is as follows: is the total sentiment value of sentiment words in the review,
P1 /(1 − P1 ) SE(emotion) is the sentiment value of the emoticon, and λ is
OR j = (1) the ratio of text, and the detailed calculation process is referred
P0 /(1 − P0 )
to [7].
where P1 and P0 denote the probability of a false attack when
the independent variable X j denotes different values. OR j
denotes the multivariate-adjusted OR. Next, on the dataset IV. C ONSTRUCTION OF THE M ODEL SIPUL
collected by Otter et al. [20] (shown in Table II), we used the This section will introduce the construction of the model
length of the review as an example to expound the calculation SIPUL in detail. It mainly includes the introduction of the
of the OR PU learning for fake review detection and three steps of the
2536/3994 construction process of SIPUL.
Text length ≥ 50 advantage = ≈ 1.74 (2)
1458/3394
876/9252 A. PU Learning
Text length < 50 advantage = ≈ 0.10. (3)
8376/9250 The PU learning algorithm learns from a small number of
Therefore, O R = (1.74/0.10) = 17.4, and the effect of text positive examples [46]. First, it identifies reliable negative
length on fake reviews detection is very significant. Based on examples with higher credibility from the unlabeled dataset.
this principle, the ORs of ten features are calculated, and the Then, iterative training classification models until a specific
result is shown in Table III. stop condition are reached according to the extracted samples.
As shown in Table III, we can find that the three largest ORs Generally speaking, there are three types of PU learning
are 17.4 of the length of the text, 15.6 of whether it contains algorithms: two-step PU learning, biased PU learning, and
turning words, and 7.68 of the sentiment intensities. incorporation of the class prior. Based on the two PU algo-
rithms, this article controls the iteration of the model and the
update of training sample data to deal with the increasing and
C. Calculation of Sentiment Value changing characteristics of review streaming data.
Considering the influence of different fields and language This article uses two-step PU learning. The workflow of the
habits, the definition of strong or weak sentiment is not a algorithm is shown in Fig. 3.
definite value. The sentiment threshold needs to be determined Fig. 3 shows the process of using the PU learning algorithm
according to the field represented by the review and the to construct a classifier to detect fake reviews. There are only
collection of datasets. In addition, the sentiment value obtained positive and no negative examples given a training set P.
by using different sentiment calculation methods for the same The streaming data after sentiment division are put into the
reviews is not necessarily the same. This article adopts our unlabeled set U for detection. The U contains both positive

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
6932 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 34, NO. 10, OCTOBER 2023

Fig. 3. Construction process of a two-stage PU learning algorithm for


streaming data. The blue arrow indicates the extraction of negative examples,
and the red arrow indicates model training and classification.
Fig. 4. Strategy for collecting positive examples. U, P, and I represent the
unlabeled text, positive example, and temporary collection, respectively.
and negative examples. PU learning algorithm mainly includes
the following two steps.
Step3. Updating training samples and historical record
1) Extracting reliable negative examples: According to the points.
positive sample P and unlabeled dataset U, reliable The following sections will describe each step in detail.
negative samples (RN) are extracted from U. The set 1) Extracting Reliable Negative Examples: First, the SIPUL
U is divided into RN and U-RN. algorithm needs to extract some negative examples from the
2) Training the classifier iteratively: The set P and RN form unlabeled dataset. In previous work, there are mainly four
the initial training set, and a traditional binary classifier algorithms to achieve this step: Spy Technique [52], Rocchio
(e.g., SVM) is trained iteratively by constantly updating [53], NB [54], and 1-DNF [55] technology. Experiments show
set P and RN. The iteration does not stop until a stop that the combined performance of Spy and SVM is the best.
condition is reached. Therefore, this article uses Spy to extract reliable negative
The main feature of the PU learning algorithm is that it does examples. The algorithm of the Spy Technique is shown in
not require negative training examples. Our goal is to build a Algorithm 1.
binary classifier using P and U for fake review detection that
continuously detects reviews in unlabeled reviews. Algorithm 1 Extracting Reliable Negative Examples
Input: Example set P and U , sampling rate s
Output: Example set R N
B. Construction Process of SIPUL 1: Set RN = ∅, s = P × S%
2: Set Ps = P−S, Us = U ∪ S, the label of Ps to 1, the label of Us to −1
The problem of fake reviews detection is defined in the 3: According to Ps and Us to train a binary classifier g
PU learning algorithm. According to the previous description, 4: Using g to classify example in set U
a real sample review dataset is required first. The positive 5: Determine the threshold ϕ
6: f or a ∈ U d o
and negative examples of initial training need to be obtained 7: if Pr (1|a) ≤ ϕ
before the model trainer and are not needed in subsequent 8: RN = RN ∪ a
model iterations. Although people cannot effectively detect 9: end f or
10: Output Example set R N
fake reviews in the current Internet environment, they can
collect a small number of real reviews through previous
knowledge and heuristic rules. This article shows the strategy Algorithm 1 describes the main process of extracting reli-
for collecting real reviews in Fig. 4. able negative examples.
P represents the positive sample set, and U means the unla- 1) Steps 1–3: Select some samples from the set P randomly
beled reviews. To get a sufficiently reliable positive example and put them into the set S. Then, put the samples from
set P, we require only samples with sufficient confidence to be set S into U as Spy samples. At this time, the sample
put into set P. For the uncertain samples, the traditional way set becomes P-S and U + S. Generally, the number of
is to put them back into the set U directly, so that they might subsets S divided from P is 15%.
be picked up again. We introduce a temporary set I, which is 2) Steps 4–6: P-S and U + S are taken as positive and
used to temporarily store the uncertain samples, to ensure that negative samples, respectively, and then combined with
every time the samples extracted from U are not extracted. the iterative EM algorithm for classification. All unla-
Based on the datasets P and U, this article presents a fake beled samples are trained as negative examples in the
review detection model based on SIPUL, and the PU learning initialization process to train the classifier.
mainly includes three steps. 3) Steps 7–10: Taking the minimum value of Spy sample
Step1. Extracting reliable negative examples. distribution as the threshold, all samples in U below this
Step2. Training the model iteratively. threshold are considered as RN.

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
SHUNXIANG et al.: BUILDING FAKE REVIEW DETECTION MODEL BASED ON SIPUL 6933

Algorithm 2 The Iterative Training of the Model


Input: Example setPandU , the negative example set RN
Output: model
1: Set model-list=Ø, D = U − RN
2: Set the label of P to 1, and the label of RN to -1
3: i = 1
4: Loop
5: Train the SVM classifier Si according to P and RN
6: Save Si to model-list
7: Use Si to classify D
8: Set the negative review into M in D
9: if M = Ø
10: exit loop Fig. 5. Updating training samples and historical record points. Ti represents
11: else different times of receiving streaming data.
12: D = D − M
13: RN = RN ∪ M
14: i++ Algorithm 3 Updating Training Samples and Historical
15: Output model Record Points
Input: Product Reviews, Sentiment threshold T
Output: Classification result
1: Index= i, Current-model=model(i)
2) Training the Model: According to the discussion in 2: Product review preprocessing
Section IV-A and IV-B, positive samples and negative samples 3: Calculating the sentiment value FS of the review
4: if FS >= T
have been obtained. Then, we need to train a classifier based 5: Enter the strong sentiment set
on P and RN. The commonly used binary classification models 6: else
are SVM [56] and EM [57]. In this article, SVM is selected as 7: Enter the weak sentiment set
8: Use model(i) to detect fake reviews in the set
the training model. We aim to train an SVM binary classifier 9: The result is added to the training sample
through set P and set RN. The specific algorithm process is 10: Record the current data, and delete the old data.
shown in Algorithm 2. 11: Train the model(i+1) based on sample data
12: Output: Classification result
Algorithm 2 design description is given as follows.
1) Steps (1–6): The set P and set RN to form the initial
training set, which is used to build an initial SVM the sample are deleted according to the model capacity.
classifier Si . In our model, the last three batches of data are retained. The
2) Steps (7–8): The reviews in D are classified by using persistence task will continuously delete old data, keeping
the trained classifier Si , and the negative examples are the database at a manageable size. We currently support file-
stored in the set M. based implementation (file streaming), where we simulate data
3) Steps (9–10): This is a judgment statement. If the set M in a streaming manner. We plan to support network-based
is empty, it means that no negative examples are found streaming in the near future. The process of updating training
in D. In other words, all negative examples have been samples and historical record points is shown in Algorithm 3.
classified, and the algorithm ends. On the contrary, if the Algorithm 3 as a whole is an iterative loop operation:
M set is not empty, continue to Line 11.
1) Step 1 is the current model cycle record.
4) Steps (11–13): The identified negative sample is
2) Steps 2–7 are used to calculate the sentiment value of
removed from set D, and the negative sample set W is
the review and the sentiment division of the review.
merged into the reliable negative sample set RN. After
3) Sept 8 is used to complete the review detection opera-
reassigning D and RN, return to step 4.
tion.
5) Steps (4–13): This is a process of iteratively training
4) Steps 9 and 10 are used to update training samples and
the classifier. The classifier after each training cycle is
record points.
stored in the model.
5) Step10 is used for iterative training based on updated
3) Updating Training Samples and Historical Record samples that are used to judge the current model error
Points: The set P and set RN form the initial training set, and the existing minimum error and judge whether to
by constantly updating set P and set RN, and the SVM update the final model.
fake review detector is trained. However, compared to static
and limited data stored in a database or file, streaming data
V. E XPERIMENT
have the characteristics of a huge processing scale, unlimited
data increment, and dynamic changes. The design principle is A. Experimental Datasets
shown in Fig. 5. The dataset reviews include two types: real reviews and
As shown in Fig. 5, product reviews continue to enter the fake reviews. To evaluate the effectiveness of our model,
model. When the data arrive, the model trains and predicts we conducted experiments with four public datasets. The
based on the sample data in time and records the training information of the experimental datasets is shown in Table IV.
sample data and the trained model at this time. The predicted Ott et al. [13] contain deceptive reviews of hotels from
result is treated as sample data for the next model training. crowdsourcing platforms and the same hotel in Chicago by
As the sample data continue to increase, the past data in TripAdvisor. YelpChi [58] contains real business reviews of

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
6934 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 34, NO. 10, OCTOBER 2023

TABLE IV TABLE V
E XPERIMENTAL D ATASETS C OMBINATION D ESIGN OF D IFFERENT F EATURES

Combination 1(C1) is the template, regardless of the divi-


sion of reviews. Combination 2–4 (C2–C4), respectively, indi-
cate that text length, sentiment intensity, and whether there
restaurants and hotels from Chicago on the Yelp website. are transitional words are used as the basis for the division of
YelpZIP [59] contains only restaurants reviews from different reviews.
regions of Yelp. At the end of experiment 3, in order to prove the effective-
In this article, we only use the public dataset stored in ness of the SIPUL in detecting fake reviews better, we com-
files to imitate streaming data’s increasing and changing pared it with several advanced methods.
characteristics. We plan to support network-based streaming 4) Experiments to Simulate the Characteristics of Streaming
soon. Data:
One of the characteristics of streaming data is a large
B. Experimental Method amount of data, so we do simulation experiments on the two
Product reviews often appear in the form of unlim- datasets: “YelpZIP” and “YelpChi.” For each dataset, 8% of
ited streaming data. This article uses the existing pub- the positive example documents are randomly selected as the
lic datasets to simulate the characteristics of streaming positive set. The rest and all negative example documents are
data. We currently support file-based implementation (file taken as unlabeled data. The unlabeled data are divided into
streaming), and we plan to support network-based streaming ten parts to simulate the change of streaming data.
soon. To evaluate the experimental results, accuracy (Acc), area
In the experiment, P, RN, and U represent the positive under the ROC (receiver operating characteristic) curve
examples, negative examples, and unlabeled samples, respec- (AUC), and F-score are applied for the performance evaluation
tively. To analyze the performance of the model, different measures.
experimental indicators are used to analyze the performance of
the model in this article on different datasets. We design three
C. Experimental Results and Analysis
basic experiments (Experiments 1–3) and an experiment that
simulates the characteristics of streaming data (Experiment 4). We have done the following experiments to verify the
1) Analyzing the Influence of Different Combinations of performance of the model in this article.
Two-Part PU Algorithms: 1) Experiment 1: Analyzing the Influence of Different Com-
Step 1: select Spy, Rocchio, NB, and 1-DNF technology. binations of Two-Part PU Algorithms:
Step 2: select SVM, EM, and NB. We designed eight groups According to Table VI, we can make the following analysis.
of different combinations of two technologies (as shown 1) With the combination of Spy and SVM, the average F
in Table VI) and conducted a comparative experiment on value performs best: The first step is to adopt 1-DNF,
“YelpZIP”; 30% of the data are randomly selected as test which is very poor performance. The reason is that
documents for each dataset. The rest are used to create training 1-DNF can go wrong without many positive documents.
sets. γ percentage of the data is first selected as the positive On the contrary, training Spy does not require a lot
set, and we range γ from 1% to 10%. of data, so the performance is better. The other six
2) Analyzing the Influence of Different Sentiment Values on combinations have relatively good performance, but Spy
the Experiment: and SVM have the best F-score performance. The reason
This experiment is mainly used to analyze the impact of is that SVM is a stronger classifier than NB, and EM
different sentiment values on the experiment’s performance uses NB.
and find the sentiment threshold for dividing comments. 2) The average F-score increases with R but gradually
3) Analyzing the Influence of Different Review Division becomes flat when it reaches a certain value: When the
Indexes on the Model: positive set is very small, the number of spies put into
We designed four kinds of feature combinations (as shown the unlabeled set is also very small, so the generated
in Table V) and conducted a comparative experiment on three RN set is unreliable. However, when the number of
datasets: “Ott,” “YelpZIP,” and “YelpChi.” The combination positive sets reaches a certain level (γ = 8%), it can
of PU and other representative fake reviews shows that it is be ensured that the generated RN set is reliable, and
reasonable to choose emotional intensity as the classification with the increase of γ , the performance of F-score will
of comments. not increase.

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
SHUNXIANG et al.: BUILDING FAKE REVIEW DETECTION MODEL BASED ON SIPUL 6935

TABLE VI
AVERAGE F-S CORES ON Y ELP ZIP

Fig. 6. Sentiment value of different reviews. The red curve representing the fake review is above the blue curve representing the true review overall.

2) Experiment 2: Analyzing the Influence of Different Sen-


timent Values on the Experiment:
According to the sentiment calculation method provided
in Section III-C, the sentiment values of all reviews were
calculated, respectively. Because of the large scale of the
dataset, we randomly selected 400 real reviews and fake
reviews, and the results are shown in Fig. 6.
From the sentiment calculation results of reviews in Fig. 6,
it can be found that most of the sentiment value mainly
fluctuates between 15 and 25 and mainly concentrates 20.
Therefore, we select sentiment values of 15, 20, and 25 as
sentiment thresholds to divide the review.
We designed three groups of comparative experiments on
three datasets. For each dataset, we set γ = 8%, 30% of the
data are as test set, and the rest (70%) are used to create
training. The average results of the experiment are shown in
Table VII and Fig. 7.
According to the experimental results of the three thresh- Fig. 7. Influence of sentiment value on experimental performance. Ott ACC,
Ott AUC, and Ott F-score represent the evaluation index values of ACC, AUC,
olds, we summarize the following two important results. and F-score on the dataset Ott. YelpZIP and YelpChi are the same as above.
1) With the increase of the threshold, the experimental
index values are roughly in an inverted U-shaped distri-
bution: We can find from Fig. 7 that when the sentiment is that the threshold is too large or too small to separate
threshold is set to 20, the accuracy, precision, recall, and the real reviews and fake reviews.
F1 are higher than those of 15 or 25. The overall trend 2) The sentiment intensity threshold is selected where the
is that the selected index value first increases and then most concentrated sentiment value distribution: The sen-
decreases with the threshold value increase. The reason timent value of most reviews is mainly concentrated on

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
6936 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 34, NO. 10, OCTOBER 2023

TABLE VII
I NFLUENCE OF S ENTIMENT VALUE ON E XPERIMENTAL P ERFORMANCE

Fig. 8. Influence of different review division indicators on the model. C1 is the template, regardless of the division of reviews. C2–C4, respectively, indicates
that text length, sentiment intensity, and whether there are transitional words are used as the basis for the division of reviews.

the upper and lower 20. When the sentiment is divided be directly represented by text length or whether there
by the sentiment threshold of 20, the intertwined reviews are transitional words.
are separated as much as possible. Compared with the
original set environment, the subset environment of fake To prove the effectiveness of the SIPUL in detecting fake
reviews is changed dramatically. The features of fake reviews better, we compared it with several advanced methods,
reviews are shown as much as possible in the subset. and the results are shown in Table VIII.
On the Ott dataset, our method does not perform very well.
3) Experiment 3: Analyzing the Influence of Different Review The reason is that the number of Ott dataset is small, and less
Division Indexes on the Model: data are used for training after sentiment division. However,
In this experiment, the PU algorithm uses a combination of it performs well on the YelpZIP and YelpChi datasets, but
Spy and SVM, choosing γ = 8%. According to Fig. 8, the the time cost is greater than other methods. The reason is
following two conclusions can be drawn. that we added the sentiment division step before the review
detection. Although the accuracy is improved, the training time
1) The classification of reviews can effectively improve the is increased.
performance of the model in detecting fake reviews: 4) Experiment 4: Experiments to Simulate the Characteris-
Compared with the other three groups (C2–C4), the tics of Streaming Data:
performance of C1 is poor. The main reason is that C1 From Fig. 9, it can be found that when the amount of
uses the most primitive dataset. The deceptive reviews data is increased in batches over time, the model performance
that originally have hidden features are very similar to improves in the first three times but then begins to decline
real reviews and are difficult to identify. After C2–C4 slowly. The reason is that the model learned more features
uses different features to divide the reviews, the reviews when adding data for the first three times. However, when the
are divided into strong and weak feature sets, and then, amount of data increases, the model does not learn more new
they are trained and detected by different models. features. Instead, the results of each detection (uncertain and
2) Sentiment intensity can achieve a better division of 100% correct) are added to the training sample. The unreliable
reviews: According to the knowledge, we know that the reviews in the training sample are increasing, resulting in the
length of the text and whether the text contains transi- decline of model performance.
tional words are essential features that may affect senti- To further prove the effectiveness of our experiment, after
ment intensity. However, the text with strong sentiment simulating the stream data, we only take the data saved in
intensity is not necessarily long or contains transitional the first three recording points for experimental simulation.
words. That is to say, text length and whether there As shown in Fig. 10, it can be found that only the latest three
are transitional words can be represented by sentiment recording points are used to train the model each time, and the
intensity, but the strength of sentiment intensity cannot performance of the model gradually stabilizes after the third

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
SHUNXIANG et al.: BUILDING FAKE REVIEW DETECTION MODEL BASED ON SIPUL 6937

TABLE VIII
E XPERIMENTAL R ESULTS OF D IFFERENT FAKE R EVIEW D ETECTION M ODELS IN THE D ATASETS : O TT, Y ELP ZIP, AND Y ELP C HI

experiments on different datasets, the models proposed in


this article all show good performance in the comparison of
advanced models, and it also shows good performance of fake
reviews detection in the simulated streaming data experiment.

VI. C ONCLUSION
To support consumers’ understanding of the true quality,
performance, and user’s evaluation of the product, a new
detection model for fake reviews based on SIPUL is proposed,
where natural language processing and PU learning algorithm
techniques are applied to fake reviews detection. The three
contributions are summarized as follows.
1) Selecting the sentiment intensity as the representative
review feature: It is verified that sentiment intensity
better represents many fake features. Therefore, in the
Fig. 9. Experiments simulating streaming data on datasets: YelpZIP and process of review division, more fake reviews are
YelpChi. B1–B10 represent the ten parts of the dataset and the amount of
data added to the model over time.
obtained in the strong sentiment set, and more real
reviews are obtained in the weak sentiment set.
2) A method to detect deceptive fake reviews is proposed:
This article creatively proposes that by changing the
dataset environment where the fake reviews are located,
the fake reviews with hidden features will lose their
hidden in the new set environment, so that they can be
better detected.
3) Fake review detection model SIPUL is successfully con-
structed: Experiments show that the model is effective
in detecting fake reviews. In addition, by controlling the
iteration of the model and the update of training data,
the model can continuously learn and predict from the
continuously arriving streaming data. The characteristics
of simulated streaming data preliminarily prove the
effectiveness of this method.
The model currently only supports file-based implemen-
tation (file streaming), and it needs to update the training
Fig. 10. Experiments simulating streaming data on datasets: YelpZIP and data according to the record constantly; otherwise, the per-
YelpChi. formance will be affected. We plan to support network-based
streaming soon.
simulation data. However, compared to the best case, there is
still a slow decline. It shows that the test results cannot be R EFERENCES
guaranteed to be completely correct, and these uncertain data [1] L. Li, B. Qin, and T. Liu, “Survey on fake review detection re-search,”
still have a certain impact on the model. Chin. J. Comput., vol. 41, no. 4, pp. 946–968, 2018.
[2] N. Jindal and B. Liu, “Opinion spam and analysis,” in Proc. Int. Conf.
In the comparison of the results of different experimen- Web Search Web Data Mining (WSDM), Stanford, CA, USA, 2008,
tal indicators, although there are slight fluctuations in the pp. 219–230.

Authorized licensed use limited to: Zhejiang University. Downloaded on April 16,2024 at 10:21:49 UTC from IEEE Xplore. Restrictions apply.
6938 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 34, NO. 10, OCTOBER 2023

[3] A. Bondielli and F. Marcelloni, "A survey on fake news and rumour detection techniques," Inf. Sci., vol. 497, pp. 38–55, Sep. 2019.
[4] S. Vosoughi, D. Roy, and S. Aral, "The spread of true and false news online," Science, vol. 359, pp. 1146–1151, May 2018.
[5] S. Zhang, Z. Wei, Y. Wang, and T. Liao, "Sentiment analysis of Chinese micro-blog text based on extended sentiment dictionary," Future Gener. Comput. Syst., vol. 81, pp. 395–403, Apr. 2018.
[6] S. Zhang, Y. Wang, S. Zhang, and G. Zhu, "Building associated semantic representation model for the ultra-short microblog text jumping in big data," Cluster Comput., vol. 19, no. 3, pp. 1399–1410, Sep. 2016.
[7] S. Zhang, Z. Hu, G. Zhu, M. Jin, and K.-C. Li, "Sentiment classification model for Chinese micro-blog comments based on key sentences extraction," Soft Comput., vol. 25, no. 1, pp. 463–476, Jan. 2021.
[8] K. S. Sanjay and A. Danti, "Online fake review identification based on decision rules," Int. J. Adv. Trends Comput. Sci. Eng., vol. 8, no. 2, pp. 140–143, Apr. 2019.
[9] X. Wang, K. Liu, S. He, and J. Zhao, "Learning to represent review with tensor decomposition for spam detection," in Proc. Conf. Empirical Methods Natural Lang. Process., 2016, pp. 866–875.
[10] X. Wang, K. Liu, and J. Zhao, "Detecting deceptive review spam via attention-based neural networks," in Proc. Nat. CCF Conf. Natural Lang. Process. Chin. Comput., 2017, pp. 866–876.
[11] P. Hajek, A. Barushka, and M. Munk, "Fake consumer review detection using deep neural networks integrating word embeddings and emotion mining," Neural Comput. Appl., vol. 32, no. 23, pp. 17259–17274, Dec. 2020.
[12] G. Jain, M. Sharma, and B. Agarwal, "Optimizing semantic LSTM for spam detection," Int. J. Inf. Technol., vol. 11, no. 2, pp. 239–250, Jun. 2019.
[13] M. Ott, C. Cardie, and J. Hancock, "Estimating the prevalence of deception in online review communities," in Proc. 21st Int. Conf. World Wide Web, Lyon, France, Apr. 2012, pp. 201–210.
[14] S. D. Bhattacharjee, W. J. Tolone, and V. S. Paranjape, "Identifying malicious social media contents using multi-view context-aware active learning," Future Gener. Comput. Syst., vol. 100, pp. 365–379, Nov. 2019.
[15] M. Pavlinek and V. Podgorelec, "Text classification method based on self-training and LDA topic models," Exp. Syst. Appl., vol. 80, pp. 83–93, Sep. 2017.
[16] R. Narayan, J. K. Rout, and S. K. Jena, "Review spam detection using semi-supervised technique," in Proc. Prog. Intell. Comput. Techn., Theory, Pract., Appl. Singapore: Springer, 2018, pp. 281–286.
[17] E. Sansone, F. G. B. De Natale, and Z.-H. Zhou, "Efficient training for positive unlabeled learning," IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 11, pp. 2584–2598, Nov. 2019.
[18] Y. Liu, X. Wang, T. Zhu, and A. Zhou, "Survey on quality evaluation and control of online reviews," J. Softw., vol. 25, no. 3, pp. 506–527, 2014.
[19] M. Muszynski et al., "Recognizing induced emotions of movie audiences from multimodal information," IEEE Trans. Affect. Comput., vol. 12, no. 1, pp. 36–52, Jan. 2021.
[20] D. W. Otter, J. R. Medina, and J. K. Kalita, "A survey of the usages of deep learning for natural language processing," IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 2, pp. 604–624, Feb. 2020.
[21] Y. Ren, D. Ji, H. Zhang, and L. Yin, "Deceptive reviews detection based on positive and unlabeled learning," J. Comput. Res. Develop., vol. 52, no. 3, pp. 639–648, 2015.
[22] J. Li, M. Ott, C. Cardie, and E. Hovy, "Towards a general rule for identifying deceptive opinion spam," in Proc. 52nd Annu. Meeting Assoc. Comput. Linguistics, Baltimore, MD, USA, 2014, pp. 1566–1576.
[23] A. Melleng, A. Jurek-Loughrey, and P. Deepak, "Sentiment and emotion based on text representation for fake reviews detection," in Proc. Int. Conf. Recent Adv. Natural Lang. Process. (RANLP), Oct. 2019, pp. 750–757.
[24] M. Liu, Y. Shang, Q. Yue, and J. Zhou, "Detecting fake reviews using multidimensional representations with fine-grained aspects plan," IEEE Access, vol. 9, pp. 3765–3773, 2021.
[25] Y. Fang, H. Wang, L. Zhao, F. Yu, and C. Wang, "Dynamic knowledge graph-based fake-review detection," Appl. Intell., vol. 50, no. 12, pp. 4281–4295, 2020.
[26] A. Ligthart, C. Catal, and B. Tekinerdogan, "Analyzing the effectiveness of semi-supervised learning approaches for opinion spam classification," Appl. Soft Comput., vol. 101, Mar. 2021, Art. no. 107023.
[27] F. Wu, C. Wu, and J. Liu, "Semi-supervised collaborative learning for social spammer and spam message detection in microblogging," in Proc. 27th ACM Int. Conf. Inf. Knowl. Manage., Oct. 2018, pp. 1791–1794.
[28] C. Yuan, W. Zhou, Q. Ma, S. Lv, J. Han, and S. Hu, "Learning review representations from user and product level information for spam detection," in Proc. IEEE Int. Conf. Data Mining (ICDM), Nov. 2019, pp. 1444–1449.
[29] C. Li, S. Wang, L. He, P. S. Yu, Y. Liang, and Z. Li, "SSDMV: Semi-supervised deep social spammer detection by multi-view data fusion," in Proc. IEEE Int. Conf. Data Mining (ICDM), Nov. 2018, pp. 247–256.
[30] A. R. Yelundur, V. Chaoji, and B. Mishra, "Detection of review abuse via semi-supervised binary multi-target tensor decomposition," in Proc. 25th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Jul. 2019, pp. 2134–2144.
[31] J. Wang, H. Kan, F. Meng, Q. Mu, G. Shi, and X. Xiao, "Fake review detection based on multiple feature fusion and rolling collaborative training," IEEE Access, vol. 8, pp. 182625–182639, 2020.
[32] N. Imam, B. Issac, and S. M. Jacob, "A semi-supervised learning approach for tackling Twitter spam drift," Int. J. Comput. Intell. Appl., vol. 18, no. 2, Jun. 2019, Art. no. 1950010.
[33] C. Gong, T. Liu, J. Yang, and D. Tao, "Large-margin label-calibrated support vector machines for positive and unlabeled learning," IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 11, pp. 3471–3483, Nov. 2019.
[34] R. Kiryo, G. Niu, M. C. Plessis, and M. Sugiyama, "Positive-unlabeled learning with non-negative risk estimator," in Proc. NIPS, 2017, pp. 1675–1685.
[35] G. Niu, M. C. Plessis, T. Sakai, Y. Ma, and M. Sugiyama, "Theoretical comparisons of positive-unlabeled learning against positive-negative learning," in Proc. NIPS, 2016, pp. 1199–1207.
[36] H. Zayd and L. Daniel, "Learning from positive and unlabeled data with arbitrary positive shift," in Proc. NIPS, 2020, pp. 13088–13099.
[37] Z. Fang, J. Lu, A. Liu, F. Liu, and G. Zhang, "Learning bounds for open-set learning," in Proc. ICML, 2021, pp. 3122–3132.
[38] Z. Fang, J. Lu, F. Liu, J. Xuan, and G. Zhang, "Open set domain adaptation: Theoretical bound and algorithm," IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 10, pp. 4309–4322, Oct. 2021.
[39] D. H. Fusilier, M. Montes-y-Gómez, P. Rosso, and R. G. Cabrera, "Detecting positive and negative deceptive opinions using PU-learning," Inf. Process. Manage., vol. 51, no. 4, pp. 433–443, Jul. 2015.
[40] H. Li, B. Liu, A. Mukherjee, and J. Shao, "Spotting fake reviews using positive-unlabeled learning," Computación y Sistemas, vol. 18, no. 3, pp. 467–475, Sep. 2014.
[41] H. Chen, J. Liu, Y. Lv, M. H. Li, M. Liu, and Q. Zheng, "Semi-supervised clue fusion for spammer detection in Sina Weibo," Inf. Fusion, vol. 44, pp. 22–32, Nov. 2018.
[42] D. He et al., "Fake review detection based on PU learning and behavior density," IEEE Netw., vol. 34, no. 4, pp. 298–303, Jul. 2020.
[43] T. Zhai, Y. Gao, and J. W. Zhu, "Survey of online learning algorithms for streaming data classification," J. Softw., vol. 31, no. 4, pp. 912–931, 2020.
[44] J. Liu, J. Bai, H. Li, and B. Sun, "Improved LSTM-based abnormal stream data detection and correction system for Internet of Things," IEEE Trans. Ind. Informat., vol. 18, no. 2, pp. 1282–1290, Feb. 2022.
[45] C. Martín, P. Langendoerfer, P. S. Zarrin, M. Díaz, and B. Rubio, "Kafka-ML: Connecting the data stream with ML/AI frameworks," Future Gener. Comput. Syst., vol. 126, pp. 15–33, Jan. 2022.
[46] D. Sun, S. Xue, H. Wu, and J. Wu, "A data stream cleaning system using edge intelligence for smart city industrial environments," IEEE Trans. Ind. Informat., vol. 18, no. 2, pp. 1165–1174, Feb. 2022.
[47] H. Ning, J. Zhang, T. Feng, E. Chu, and T. Tian, "Control-based algorithms for high dimensional online learning," J. Franklin Inst., vol. 357, no. 3, pp. 1909–1942, 2020.
[48] X. Liang, X. Song, K. Qi, J. Li, J. Liu, and L. Jian, "Anomaly detection aided budget online classification for imbalanced data streams," IEEE Intell. Syst., vol. 36, no. 3, pp. 14–22, May 2021.
[49] M. Singh and R. Pamula, "ADINOF: Adaptive density summarizing incremental natural outlier detection in data stream," Neural Comput. Appl., vol. 33, no. 15, pp. 9607–9623, Aug. 2021.
[50] J. Zhao and H. Wang, "Detection of fake reviews based on emotional orientation and logistic regression," CAAI Trans. Intell. Syst., vol. 11, no. 3, pp. 336–342, 2016.
[51] K. Min, "Fast calculation method of OR value in case-control study," South China Preventive Med., vol. 43, no. 5, pp. 492–494, 2017.
[52] B. Liu, Y. Dai, X. Li, W. S. Lee, and P. S. Yu, "Building text classifiers using positive and unlabeled examples," in Proc. 3rd IEEE Int. Conf. Data Mining, Nov. 2003, pp. 179–188.
[53] J. Rocchio, "Relevance feedback in information retrieval," Comput. Sci., vol. 27, no. 7, pp. 313–323, 2000.
[54] A. McCallum and K. Nigam, "A comparison of event models for Naive Bayes text classification," in Proc. AAAI Workshop Learn. Text Categorization, 1998, pp. 41–48.
[55] H. Yu, J. Han, and C. K. Chen, "PEBL: Positive example based learning for web page classification using SVM," in Proc. 8th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2002, pp. 239–248.
[56] V. Vapnik, The Nature of Statistical Learning Theory. New York, NY, USA: Springer-Verlag, 1995.
[57] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Roy. Stat. Soc., vol. 39, no. 1, pp. 1–22, 1977.
[58] A. Mukherjee, V. Venkataraman, B. Liu, and N. S. Glance, "What Yelp fake review filter might be doing?" in Proc. ICWSM, 2013, pp. 409–418.
[59] S. Rayana and L. Akoglu, "Collective opinion spam detection: Bridging review networks and metadata," in Proc. 21st ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2015, pp. 985–994.

Zhang Shunxiang received the Ph.D. degree from the School of Computing Engineering and Science, Shanghai University, Shanghai, China, in 2012. He is a Professor at the Anhui University of Science and Technology, Huainan, China. His current research interests include web mining, semantic search, and complex networks.

Zhu Aoqiang received the bachelor's degree in engineering from the Anhui University of Science and Technology, Huainan, China, in 2018, where he is currently pursuing the M.S. degree in computer science and technology. His current research interests are in natural language processing, affective computing, and fake review detection.

Zhu Guangli received the M.S. degree from the School of Computing Engineering and Science, Anhui University of Science and Technology, Huainan, China, in 2005. She is an Associate Professor at the Anhui University of Science and Technology. Her current research interests include web mining, semantic search, and calculation theory.

Wei Zhongliang received the M.S. degree from the Anhui University of Science and Technology, Huainan, China, in 2011. He has been with the School of Computer Science and Engineering, Anhui University of Science and Technology, since 2004, and became an Associate Professor in 2021. His current research interests include data mining, machine learning, and artificial intelligence applications.

Li KuanChing (Senior Member, IEEE) is currently a Distinguished Professor at Providence University, Taichung, Taiwan, where he also serves as the Director of the High-Performance Computing and Networking Center. He has published more than 250 scientific papers and articles and is the coauthor or coeditor of more than 25 books published by Taylor & Francis, Springer, and McGraw-Hill. His research interests include parallel and distributed computing, big data, and emerging technologies. Prof. KuanChing is the Editor-in-Chief of Connection Science and serves as an associate editor for several leading journals. He is a fellow of the IET.