
Accepted Manuscript

Cross-domain aspect extraction for sentiment analysis: a transductive learning approach

Ricardo Marcondes Marcacini, Rafael Geraldeli Rossi, Ivone Penque Matsuno, Solange Oliveira Rezende

PII: S0167-9236(18)30138-6
DOI: doi:10.1016/j.dss.2018.08.009
Reference: DECSUP 12983
To appear in: Decision Support Systems
Received date: 10 April 2018
Revised date: 27 July 2018
Accepted date: 21 August 2018

Please cite this article as: Ricardo Marcondes Marcacini, Rafael Geraldeli Rossi, Ivone Penque Matsuno, Solange Oliveira Rezende, Cross-domain aspect extraction for sentiment analysis: a transductive learning approach, Decision Support Systems (2018), doi:10.1016/j.dss.2018.08.009

This is a PDF file of an unedited manuscript that has been accepted for publication. As
a service to our customers we are providing this early version of the manuscript. The
manuscript will undergo copyediting, typesetting, and review of the resulting proof before
it is published in its final form. Please note that during the production process errors may
be discovered which could affect the content, and all legal disclaimers that apply to the
journal pertain.
Cross-domain aspect extraction for sentiment analysis: a transductive learning approach

Ricardo Marcondes Marcacini b,∗, Rafael Geraldeli Rossi b, Ivone Penque Matsuno a, Solange Oliveira Rezende a

a Institute of Mathematics and Computer Sciences (ICMC), University of São Paulo (USP), Av. Trabalhador São Carlense, 400, 13566-590, São Carlos, SP, Brazil
b Federal University of Mato Grosso do Sul (UFMS), Av. Ranulpho Marques Leal, 3484, 79613-000, Três Lagoas, MS, Brazil

Abstract

Aspect-Based Sentiment Analysis (ABSA) is a promising approach to analyze consumer reviews at a high level of detail, where the opinion about each feature of the product or service is considered. ABSA usually explores supervised inductive learning algorithms, which require intense human effort for the labeling process. In this paper, we investigate Cross-Domain Transfer Learning approaches, in which aspects already labeled in some domains can be used to support the aspect extraction of another domain where there are no labeled aspects. Existing cross-domain transfer learning approaches learn classifiers from labeled aspects in the source domain and then apply these classifiers in the target domain, i.e., two separate stages that may cause inconsistency due to different feature spaces. To overcome this drawback, we present an innovative approach called CD-ALPHN (Cross-Domain Aspect Label Propagation through Heterogeneous Networks). First, we propose a heterogeneous network-based representation that combines different features (labeled aspects, unlabeled aspects, and linguistic features) from the source and target domains as nodes in a single network. Second, we propose a label propagation algorithm for aspect extraction from heterogeneous networks, where the linguistic features are used as a bridge for this propagation. Our algorithm is based on a transductive learning process, where we explore both labeled and unlabeled aspects during the label propagation. Experimental results show that CD-ALPHN outperforms the state-of-the-art methods in scenarios where there is a high level of inconsistency between the source and target domains — the most common scenario in real-world applications.

Keywords: Cross Domain, Opinion Mining, Aspect Extraction

∗ Corresponding author. Email addresses: [email protected] (Ricardo Marcondes Marcacini), [email protected] (Rafael Geraldeli Rossi), [email protected] (Ivone Penque Matsuno), [email protected] (Solange Oliveira Rezende)

Preprint submitted to Decision Support Systems, August 28, 2018

1. Introduction

Opinion Mining and Sentiment Analysis have become very popular for automating the knowledge extraction from user reviews about products and services (Liu, 2012; Cambria, 2016; Breck & Cardie, 2017). In general, the goal is to determine the consumer's opinion on some topic through the sentiment polarity expressed in textual reviews (e.g., positive, negative or neutral) (Liu & Zhang, 2012). While the first studies on sentiment analysis attempted to extract the polarity from the entire text review (document-level sentiment analysis) or from the text sentences (sentence-level sentiment analysis), more recent studies investigate aspect-based sentiment analysis (ABSA) (Feldman, 2013; Lau et al., 2014; Schouten & Frasincar, 2016; Akhtar et al., 2017). In this case, consumer reviews are analyzed at a high level of detail, where the opinion about each feature of the product or service is considered, such as the quality of a camera's photos, notebook weight, and food price. In fact, ABSA tasks are considered more complex (Mukherjee & Liu, 2012; Wang et al., 2014; Matsuno et al., 2017) because we need to deal with a challenging question during the sentiment analysis process: how to automatically extract the aspects of each product or service from consumer reviews?

Existing works for aspect extraction are based on linguistic feature patterns (Rana & Cheah, 2016), where natural language processing techniques are used to identify grammatical classes and syntactic relations of the text reviews. In this case, a set of previously labeled aspects (training set) is used in a supervised inductive learning task, which induces a classification model considering only labeled examples. Next, aspects are extracted from the new text reviews according to the linguistic features presented to the classifier. For example, a supervised rule-based classifier that learned the pattern "IF a word h is a noun AND h is in a relationship with a verb t THEN h is extracted as an aspect" is able to identify the aspect "camera" in the review presented in Figure 1, where h="camera" and t="is". The relationship between h and t is defined by the adjective "nice".

Figure 1: Part-of-Speech (PoS) and Syntactic Relations extracted from the sentence "The camera is nice".
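As an illustration of how such a linguistic pattern can be applied, the following minimal Python sketch (not the authors' code) implements the noun-verb rule above. The token tuples are hand-annotated stand-ins for the output of a PoS tagger and dependency parser, and follow the paper's example in which "camera" relates to the verb "is":

```python
# Minimal sketch: apply the rule "IF a word h is a noun AND h is in a
# relationship with a verb t THEN h is extracted as an aspect".
# Token: (word, pos_tag, head_index, relation); head_index is 0-based.

def extract_aspects(tokens):
    """Return nouns whose syntactic head is a verb."""
    aspects = []
    for word, pos, head, rel in tokens:
        if pos.startswith("NN") and head is not None:
            head_pos = tokens[head][1]   # PoS tag of the head word
            if head_pos.startswith("VB"):
                aspects.append(word)
    return aspects

# "The camera is nice": hand-annotated parse for illustration only; a
# real system would obtain these tuples from a parser.
sentence = [
    ("The",    "DT",  1,    "det"),    # determiner of "camera"
    ("camera", "NN",  2,    "nsubj"),  # subject, related to the verb "is"
    ("is",     "VBZ", None, "root"),
    ("nice",   "JJ",  2,    "acomp"),
]

print(extract_aspects(sentence))  # ['camera']
```

The rule fires only for "camera": it is the only noun whose head is a verb, matching the example discussed above.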
Aspect-based sentiment analysis usually explores inductive supervised learning algorithms, which require intense human effort for the labeling process in order to obtain satisfactory classification performances (Rana & Cheah, 2016; Schouten & Frasincar, 2016). However, frequent data labeling is not feasible in practical situations. To address this challenge, transductive learning is widely used when there are few labeled data, while unlabeled data are easy to collect (Kong et al., 2013; Chapelle et al., 2006; Belkin et al., 2006; Joachims, 1999). Transductive learning directly classifies unlabeled data and makes use of this known unlabeled data to improve classification performance (Kong et al., 2013).

Another challenge related to aspect extraction for sentiment analysis is to deal with specific characteristics and limitations of the application domain (Duric & Song, 2012; Deng et al., 2017). For example, reviews collected from social networks may contain short texts with informal language (e.g., neologisms, slang, and jargon terms), while specialized forums may contain more technical language and longer texts. Therefore, an important research question is how we can effectively exploit the already labeled aspects in some domain to aid the aspect extraction in another domain. Studies addressing this question are known as Cross-Domain Transfer Learning and have been reported as a promising solution to problems where frequent data labeling is impossible or expensive (Pan & Yang, 2010; Schouten & Frasincar, 2016; Zhang et al., 2016; Rana & Cheah, 2017).
The problems mentioned above can be attenuated with the use of transductive semi-supervised learning. While supervised inductive learning obtains a classification model to classify unknown examples with possibly different feature spaces and distributions, transductive semi-supervised learning already knows the predictive space and can make use of it to improve the classification performance. Besides, when cross-domain transfer learning is applied, the examples to be classified usually are already collected. Thus, transductive semi-supervised learning is more adequate for this type of situation. However, it is necessary to deal with two important research questions for transductive learning in the transfer learning scenario: (1) How to properly structure information from different domains into a unified representation? (2) How to effectively exploit this unified representation to transfer knowledge from the source domain to the target domain? These questions motivated us to investigate and propose a specific transductive learning solution for aspect extraction considering the cross-domain transfer learning scenario.

In this paper, we present a cross-domain aspect extraction approach based on transductive learning called CD-ALPHN (Cross-Domain Aspect Label Propagation through Heterogeneous Networks). While existing cross-domain approaches learn classifiers from labeled aspects in the source domain and then apply these classifiers in the target domain, i.e., two separate stages that may cause inconsistency due to different feature spaces or different marginal probability distributions, we propose a unified transductive learning process from both source and target feature spaces. The main contributions are two-fold:

• We present a heterogeneous network-based model representation where labeled aspects of the source domain, linguistic features, and unlabeled aspects of the target domain are all mapped as nodes in a heterogeneous network — which allows different feature types in a unified representation model. Our proposed heterogeneous network model is an effective (precision and recall measures) and efficient (computation time) way to fuse knowledge from different domains for cross-domain transfer learning.

• We present a transductive learning algorithm for heterogeneous networks, where labeled and unlabeled nodes are exploited in a label propagation process. In this case, the source domain label information (labeled aspects) is propagated to the linguistic feature nodes. Next, linguistic feature labels are propagated to the target domain (aspect candidate nodes) — which then propagates label information back to the network. This propagation process continues until convergence, i.e., when the label information of the network nodes no longer changes significantly. Thus, the cross-domain transfer learning process is carried out in a more natural way, with the advantage of using a mathematical formalization framework and theoretical guarantees of convergence.

We carried out an experimental evaluation with seven benchmark datasets to compare our proposed CD-ALPHN with other state-of-the-art methods, such as Multilayer Perceptron Neural Networks (MLP) and Support Vector Machines (SVM). Experimental evaluation results show that our approach is very competitive, outperforming the MLP and SVM methods in scenarios where there is a high level of inconsistency between the source and target domains — the most common scenario in real-world applications. In addition, we also provide an overview of how our approach can be used in real-world applications by a Data Analytics System to support the decision-making process.

The remainder of this paper is organized as follows. Section 2 presents the main concepts about cross-domain transfer learning and transductive learning, as well as the related work about aspect extraction for sentiment analysis. Section 3 describes in detail our CD-ALPHN approach, especially our proposed heterogeneous network model and the proposed label propagation algorithm for transductive learning. The experimental evaluation results comparing our CD-ALPHN and five other state-of-the-art methods for cross-domain transfer learning are presented and discussed in Section 4. Finally, Section 5 discusses the conclusions of this work and directions for future work.
2. Related Work

Cross-Domain Transfer Learning aims to utilize labeled data from other domains to help the current learning task (Pan & Yang, 2010). The domain with labeled data is called the source domain, while the domain without labeled data is called the target domain. In real-world applications, different domains may have different feature spaces as well as different underlying data distributions — thereby requiring appropriate learning approaches to address these drawbacks (Long et al., 2014a).

Most of the existing studies for cross-domain transfer learning are known as inductive transfer learning (Pan & Yang, 2010; Lu et al., 2015). In this case, the goal is to identify features that are useful in both the target domain and the source domain, thereby obtaining a shared feature space for the cross-domain transfer learning problem. In some studies, the best features for both source and target domains are selected and reweighted to improve the final classification (Chen et al., 2014; Lu et al., 2015). Inductive transfer learning is also presented as a two-stage cross-domain transfer learning (Wu & Tan, 2011; Rana & Cheah, 2017). In the first stage, a set of features that are common to both domains is extracted, and in the second stage, useful features specific to the target domain are selected to learn a classifier (Wu & Tan, 2011; Rana & Cheah, 2017).

Different strategies to learn classifier models have been proposed for cross-domain transfer learning (Lu et al., 2015). The most common strategy (with promising results) is to train well-known classifiers such as Multilayer Perceptron, SVM, Naive-Bayes, and kNN, considering the feature space that is common to both domains. Other strategies are based on the consensus of several classifiers with different data sampling from both the source and target domains (Luo et al., 2008; Zhuang et al., 2010). Moreover, there are strategies that use active learning to improve the transfer learning process, i.e., they require human feedback when a set of classifiers disagree about the class of some instance in the target domain (Li et al., 2013; Wu et al., 2017).

The vast majority of existing studies focus primarily on the transfer of sentiment polarity between different domains, while the use of cross-domain transfer learning for aspect extraction is underexplored (Al-Moslmi et al., 2017). Some recent initiatives explore linguistic features extracted from the aspects to obtain a common feature space between the two domains (Zhang et al., 2016; Rana & Cheah, 2017). However, these studies also use the two-stage inductive transfer learning approach, which works well only under the following assumption: the source and target domain data are drawn from the same feature space and the same distribution (Pan & Yang, 2010). In other words, inductive learning approaches will fail when there is a high level of inconsistency between the source and target domains — which is the most common case in real-world applications (Pan & Yang, 2010; Long et al., 2015).

In cross-domain transfer learning problems, recent studies claim that unifying different domains using graphs as an intermediate representation yields more satisfactory results (Rohrbach et al., 2013; Long et al., 2014b; Chang et al., 2017). Thus, graph nodes represent examples of both source and target domains and the edges represent the relations between examples. Some nodes are labeled and the learning process involves identifying the labels of unlabeled nodes according to the topological properties of the graph. Both the graph construction and the graph-based learning algorithm are challenging tasks, especially in relation to the scalability problem of graph-based transductive learning (Ryan & Michailidis, 2017). For example, a common technique for graph building is the nearest neighbor-based algorithm, which presents quadratic time and space complexity.

Transductive learning is a promising approach that explores both labeled and unlabeled data (e.g., source and target domains) during the training process and is generally used in semi-supervised learning scenarios (Joachims, 2003). Unlike supervised inductive classification, which aims to create a classification model to approximate a real class assignment function, the goal of transductive learning is to find an admissible function in which the unlabeled data are used to improve classification performance (Zhou et al., 2004). In practice, transductive learning assigns weights or relevance scores to examples for each one of the classes, and the examples are classified considering these weights.

It is worth mentioning that transductive learning has been shown to be helpful for cross-domain transfer learning problems (Wu & Tan, 2011; Al-Moslmi et al., 2017; Chang et al., 2017; Ryan & Michailidis, 2017). This observation has motivated us to employ related approaches for aspect extraction in sentiment analysis, which in this particular setting is a challenge not addressed in the literature.

3. Cross-Domain Aspect Label Propagation through Heterogeneous Networks

Our CD-ALPHN (Cross-Domain Aspect Label Propagation through Heterogeneous Networks) approach is presented in two steps. In the first step, we describe the heterogeneous network construction and its use as a unified representation between the source and target domains. In the second step, we discuss our proposed algorithm for transductive learning, which uses label propagation in the heterogeneous network to transfer label information of the labeled aspects from the source domain to the "candidate" aspects of the target domain, by using linguistic feature patterns as a bridge.

3.1. Cross-Domain Data Representation using Heterogeneous Networks

Network-based algorithms came up to avoid the drawbacks of algorithms based on the vector-space model and to improve transductive classification (Gupta et al., 2015; Rossi et al., 2016). Networks are mostly used for label propagation, in which some labeled objects propagate their labels to other objects through the network connections to perform transductive classification (Zhu & Goldberg, 2009; Rossi et al., 2014; Subramanya & Bilmes, 2008; Zhou et al., 2004). Moreover, the use of networks to model data allows extracting patterns which are not extracted by algorithms based on the vector-space model (Breve et al., 2012).

In this paper, we propose a cross-domain unified representation based on a heterogeneous network, i.e., a network composed of different types of nodes and relations. Formally, our proposed network-based representation is defined by G = (V, E, W), where V represents the set of nodes, E represents the edges connecting the nodes, and W represents the weights of the edges. The set V is composed of aspects from the source domain (A_S), aspects from the target domain (A_T), and linguistic features (L) extracted by a Part-of-Speech (PoS) process as shown in Figure 1. Thus, V = A_S ∪ A_T ∪ L.
Figure 2 illustrates the general scheme of the heterogeneous network proposed in this work. The set of edges E connects the linguistic features L to the nodes in the sets A_S and A_T. Thus, if two aspects participate in the same syntactic relationship, then the two nodes that represent these aspects will be connected to the node representing the linguistic feature of this syntactic relation. The words of an aspect usually appear in many text reviews, and thus the set W indicates the weights of the edges — calculated from the frequency of co-occurrence between the (candidate) aspect nodes and the linguistic feature nodes. Linguistic features are common to both source and target domains and can be interpreted as a bridge between the two domains.

We argue that this type of representation is an intuitive strategy for combining information obtained from different domains. This unified representation also eliminates several inconsistencies of existing transfer learning approaches, since the feature space is shared by both source and target domains. Unlike most of the existing approaches, our proposed representation does not require the computation of the k nearest neighbors of an instance for the construction of the graph relations. In fact, the construction of our heterogeneous network can be done in linear time, since this representation directly maps the attribute-value table extracted from the text preprocessing.

Figure 2: A general scheme of the heterogeneous network proposed for a unified representation of the feature spaces between the source domain (labeled aspect nodes) and the target domain (candidate aspect nodes), in which linguistic features are used as bridge nodes. (In the figure, labeled aspects from the source domain carry Aspect = YES or Aspect = NO labels, linguistic feature nodes from both domains form the bridge, and candidate aspects from the target domain are unlabeled.)

In order to exemplify the network generation through the proposed approach, we present a step-by-step example. In Table 1 we present, respectively, two sentences from the computers domain, their PoS tags and syntactic relations, and the generated network. Rectangular forms represent the terms, and circular forms represent the linguistic features. Linguistic features are composed of the PoS tag, the type of syntactic relation, and whether it is an input relation (in) or an output relation (out).

For instance, considering Sentence #1 in Table 1, the term battery (noun - NN) contains the input relation nsubj, and the term nice (adjective - JJ) contains the output relation nsubj. Thus, two nodes with their respective linguistic features are generated: NN:nsubj:in and JJ:nsubj:out. In the same example, we can notice that the weight of the edge between the term the and the linguistic feature DT:det:in is 2, since the term the occurs two times in Sentence #1 and both times belongs to the relation DT:det:in.
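The step above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the dependency triples are hand-coded, and the convention that the dependent side of a relation receives ":in" while the head side receives ":out" is inferred from the example:

```python
from collections import Counter

def build_network(dependencies, pos):
    """Build weighted edges (term, linguistic_feature) from dependency
    triples (head, relation, dependent). Assumed convention: the
    dependent side is marked ':in', the head side ':out'."""
    weights = Counter()
    for head, rel, dep in dependencies:
        weights[(dep,  f"{pos[dep]}:{rel}:in")]  += 1
        weights[(head, f"{pos[head]}:{rel}:out")] += 1
    return weights

# Sentence #1: "The battery of the computer is nice."
pos = {"the": "DT", "battery": "NN", "of": "IN",
       "computer": "NN", "is": "VBZ", "nice": "JJ"}
deps = [
    ("battery",  "det",   "the"),       # det(battery, the)
    ("nice",     "nsubj", "battery"),   # nsubj(nice, battery)
    ("computer", "case",  "of"),        # case(computer, of)
    ("battery",  "nmod",  "computer"),  # nmod(battery, computer)
    ("computer", "det",   "the"),       # det(computer, the)
    ("nice",     "cop",   "is"),        # cop(nice, is)
]

w = build_network(deps, pos)
print(w[("the", "DT:det:in")])        # 2: 'the' is a det dependent twice
print(w[("battery", "NN:nsubj:in")])  # 1
print(w[("nice", "JJ:nsubj:out")])    # 1
```

As in the text, the edge between the and DT:det:in receives weight 2, while the nsubj relation yields the NN:nsubj:in and JJ:nsubj:out feature nodes.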
In Table 2 we present four sentences about wine reviews that we consider as the target domain in order to build a complete network considering aspects and non-aspects from the source domain, aspect candidates from the target domain, and linguistic features from both domains. The complete network with labeled aspects from the source domain is illustrated in Figure 3. To better visualize the network, we disregard prepositions, pronouns, and network components which contain nodes from a single domain.

Table 1: Examples of the network generation for each review about computers - source domain. (Sentence #1, "The battery of the computer is nice.", links its terms to linguistic feature nodes such as DT:det:in (weight 2), NN:det:out, NN:nsubj:in, IN:case:in, NN:nmod:in, NN:case:out, VBZ:cop:in, and JJ:nsubj:out. Sentence #2, "The video resolution is low.", links its terms to features such as DT:det:in, NN:compound:in, NN:nsubj:in, NN:det:out, VBZ:cop:in, and JJ:nsubj:out. All remaining edge weights are 1.)

Figure 3: Illustration of the proposed network for cross-domain aspect extraction considering the sentences presented in Tables 1 and 2. (The figure shows three layers: labeled aspects from the source domain (computers): resolution, battery, video, computer, is, low, nice, each marked Aspect = YES or Aspect = NO; linguistic features from both source and target domains, such as NN:nsubj:in, NN:det:out, NN:compound:out, NN:compound:in, NN:nmod:in, NN:case:out, VBZ:cop:in, JJ:nsubj:out, JJ:cop:out, JJ:advmod:out, JJ:amod:in, and JJ:neg:out; and unlabeled aspect candidates from the target domain (wines): acidity, palate, texture, aroma, tomatoey, oak, is, tasteful, expressive, pleasant, good.)

Table 2: Examples of the network generation for each review about wines - target domain. (Sentence #1: "The palate is not overly expressive."; Sentence #2: "The tomatoey acidity is tasteful."; Sentence #3: "It has a very pleasant texture."; Sentence #4: "Oak aroma is not so good." Each row shows the sentence, its dependencies, and the generated network fragment linking terms to linguistic feature nodes such as NN:nsubj:in, JJ:cop:out, RB:advmod:in, and JJ:neg:out, with all edge weights equal to 1.)
In summary, our proposal for cross-domain data representation unifies features (and their relations) of the source and target domains. The proposed heterogeneous network contains (1) nodes with aspects labeled as "YES" and "NO" for source domain data, (2) nodes with aspect candidates that are terms (e.g., nouns, verbs, adjectives, and adverbs) extracted from the target domain, and (3) nodes with linguistic features extracted from the text reviews of both domains — which we use as a bridge to propagate the labels of the source domain to the nodes of the target domain. Moreover, all nodes in the network contain an information vector. For labeled nodes, this information vector represents the labels of the source domain. For all other nodes, this vector is randomly initialized (or initialized with zeros), and the network regularization function (described in detail in the next section) represents the propagation of the label information to all the unlabeled nodes, considering the network topology. Thus, even nodes with linguistic features will receive a vector of label information, although these nodes are only used as a bridge to transfer such information to the target domain.

3.2. Aspect Label Propagation through Linguistic Features

We present a transductive learning algorithm based on graph regularization to propagate label information from the source domain to the target domain, where the linguistic features are used as a bridge for this propagation. Our graph regularization algorithm was inspired by the successful use of a similar algorithm introduced in Zhu et al. (2003) and has the basic premise that nodes that share the same neighbor nodes tend to be classified with the same label.

The regularization function to be minimized, representing the aspect label propagation, is defined in Equation 1, where a_s is a node from the set A_S representing labeled aspects of the source domain, a_t is a node from the set A_T representing candidate aspects of the target domain, and l is a node from the set L representing linguistic features. The edge weights between linguistic features and aspects are represented by w_{a_s,l} (for labeled aspects) and w_{a_t,l} (for candidate aspects). The matrix W stores the edge weights of each network node.
Each node of the network has a vector of labels f, which represents the pertinence level of the node to each one of the classes c_l ∈ C. In the aspect extraction scenario, we have C = {Aspect = Yes, Aspect = No}. The matrix F summarizes the pertinence levels of all the nodes and represents the solution of the regularization function. In the case of labeled nodes, there is also a real label vector y. Such a vector has the value 1 in the position corresponding to the class and 0 in the others. In our proposal, only nodes in the set A_S are previously labeled and therefore have a y vector. The matrix Y stores all the y vectors of the labeled aspects from the source domain.

Q(F) = \frac{1}{2} \sum_{a_s \in A_S} \sum_{l \in L} w_{a_s,l} \, (f_{a_s} - f_l)^2 + \frac{1}{2} \sum_{a_t \in A_T} \sum_{l \in L} w_{a_t,l} \, (f_{a_t} - f_l)^2 + \lim_{\mu \to \infty} \mu \sum_{a_s \in A_S} (f_{a_s} - y_{a_s})^2    (1)

While the first two terms of Equation 1 determine that nearby nodes share similar label information, the last term determines that the f information of the labeled aspects is not modified during the label propagation process. This means that the node information of the labeled aspects must always be close to the true label information of the aspect nodes from the source domain (y).

The minimization of the regularization function in Equation 1 can be obtained via an iterative algorithm, since the use of solvers for quadratic programming optimization is computationally expensive and intractable for large numbers of nodes. In this way, we present an iterative algorithm for the cross-domain aspect label propagation through a heterogeneous network (CD-ALPHN). The first two terms of the function Q(F), presented in Equation 1, can be iteratively minimized by making the label vector of an object o_i (f_{o_i}) equal to the harmonic average of the label vectors of the neighboring objects. Thus, this step of the iterative solution can be seen as an iterative computation of F = PF, in which P = D^{-1}W. Also, there is a reset step F_{A_S} = Y_{A_S} in order to minimize the last term of Equation 1.

Considering that the matrices F and P can be subdivided according to the three distinct types of network objects (A_S, A_T, and L), the label propagation can be seen as:

\begin{pmatrix} F_{A_S} \\ F_{A_T} \\ F_L \end{pmatrix} = \begin{pmatrix} P_{A_S A_S} & P_{A_S A_T} & P_{A_S L} \\ P_{A_T A_S} & P_{A_T A_T} & P_{A_T L} \\ P_{L A_S} & P_{L A_T} & P_{L L} \end{pmatrix} \begin{pmatrix} F_{A_S} \\ F_{A_T} \\ F_L \end{pmatrix}.    (2)
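On a toy network (edge weights chosen arbitrarily for illustration), the row normalization P = D^{-1}W and the block structure of P can be checked numerically:

```python
import numpy as np

# Toy heterogeneous network: nodes ordered [a_s, a_t, l1, l2]
# (one labeled source aspect, one target candidate, two linguistic
# features). Edges connect aspects only to linguistic features.
W = np.array([
    [0, 0, 2, 1],   # a_s -- l1 (weight 2), a_s -- l2 (weight 1)
    [0, 0, 1, 1],   # a_t -- l1, a_t -- l2
    [2, 1, 0, 0],   # l1
    [1, 1, 0, 0],   # l2
], dtype=float)

D_inv = np.diag(1.0 / W.sum(axis=1))   # D: diagonal degree matrix
P = D_inv @ W                           # P = D^{-1} W

print(np.allclose(P.sum(axis=1), 1.0))  # True: each row of P sums to 1
print(P[:2, :2])                         # zero block: no aspect-aspect edges
```

Because aspects never connect directly to other aspects (and features never connect to features), the aspect-aspect and feature-feature submatrices of P are zero, as stated below Equation 2.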

In this case, the values of the submatrices P_{A_S A_S}, P_{A_S A_T}, P_{A_T A_S}, P_{A_T A_T}, and P_{LL} are 0, since there are no relations among objects of the same type. Hence, CD-ALPHN performs the transductive classification as presented in Algorithm 1, in which the input is composed of the node sets A_S, A_T, and L of the heterogeneous network, a matrix Y_{A_S} with the real label information of the aspects from the source domain, a matrix W of edge weights, and a diagonal matrix D with the degree (sum of the edge weights) of each node, i.e., d_{v_i,v_i} = Σ_{v_j ∈ V} w_{v_i,v_j}.

These steps are repeated until the stopping criteria are reached. We adopted as stopping criteria the maximum number of iterations and the minimum mean squared difference between the values of the matrix F in consecutive iterations. Both stopping criteria are possible since the differences of the values of the matrix F in consecutive iterations decrease, i.e., the values of the matrix F converge, as presented in the following paragraphs.
Through the iterative solution presented in Algorithm 1, F_L and F_{A_T} at the n-th iteration can be computed by Equations 3 and 4, respectively:

F_L^{(n)} = \sum_{i=0}^{n-1} (P_{L A_T} P_{A_T L})^i P_{L A_S} Y_{A_S} + (P_{L A_T} P_{A_T L})^{n-1} P_{L A_T} F_{A_T}^{(0)}  (3)

F_{A_T}^{(n)} = \sum_{i=0}^{n-1} (P_{A_T L} P_{L A_T})^i P_{A_T L} P_{L A_S} Y_{A_S} + (P_{A_T L} P_{L A_T})^n F_{A_T}^{(0)}  (4)

Since each row of the matrix P is row-normalized, i.e., the sum of the values of each row is 1, the sum of the values of a row of the resulting matrix (P_{A_T L} P_{L A_T}) or (P_{L A_T} P_{A_T L}) will always be less than 1.

Algorithm 1: Cross-Domain Aspect Label Propagation through Heterogeneous Network (CD-ALPHN)

Input: A_S, A_T, L, Y, W, D
1   begin
2       P <- (D^{-1}) * W
3       repeat
4           foreach A_S, A_T, L ⊂ V do
5               F_{A_S} <- Y_{A_S}                                  /* set the f vectors of the labeled aspects from the source domain to the real labels */
6               F_L <- P_{L,A_S} * F_{A_S} + P_{L,A_T} * F_{A_T}    /* propagate labels from aspects to linguistic features */
7               F_{A_T} <- P_{A_T,L} * F_L                          /* propagate labels from linguistic features to aspect candidates from the target domain */
8               F_{A_S} <- P_{A_S,L} * F_L                          /* propagate labels from linguistic features to aspects from the source domain */
9           end
10      until stopping criteria
11  end
12  return F(A_T)

Thus, there exists a γ such that

\sum_{j=1}^{|A_T|} (P_{A_T L} P_{L A_T})[i, j] ≤ γ < 1 ,

in which [i, j] denotes the value in the i-th row and j-th column of a matrix.
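The propagation loop of Algorithm 1 can be sketched compactly with numpy; the block matrices and the toy network below are illustrative assumptions (the published implementation was in Java), and the update of F_{A_S} from the features is omitted since F_{A_S} is clamped back to Y_{A_S} at every iteration:

```python
import numpy as np

def cd_alphn(P_L_AS, P_L_AT, P_AT_L, Y_AS, max_iter=1000, tol=1e-9):
    """Sketch of the CD-ALPHN propagation loop (Algorithm 1).

    P_L_AS : |L| x |A_S| block (linguistic features <- source aspects)
    P_L_AT : |L| x |A_T| block (linguistic features <- target aspects)
    P_AT_L : |A_T| x |L| block (target aspects <- linguistic features)
    Y_AS   : |A_S| x |C| true label matrix of the source-domain aspects
    """
    F_AT = np.zeros((P_AT_L.shape[0], Y_AS.shape[1]))
    for _ in range(max_iter):
        F_AS = Y_AS                              # reset step: clamp source labels
        F_L = P_L_AS @ F_AS + P_L_AT @ F_AT      # aspects -> linguistic features
        F_AT_new = P_AT_L @ F_L                  # linguistic features -> target aspects
        if np.mean((F_AT_new - F_AT) ** 2) < tol:  # minimum mean squared difference
            return F_AT_new
        F_AT = F_AT_new
    return F_AT

# Toy network (hypothetical values): two labeled source aspects, two
# linguistic features, one target aspect candidate linked to the feature
# shared with the class-0 source aspect.
P_L_AS = np.array([[0.5, 0.0],
                   [0.0, 1.0]])
P_L_AT = np.array([[0.5],
                   [0.0]])
P_AT_L = np.array([[1.0, 0.0]])
Y_AS = np.array([[1.0, 0.0],
                 [0.0, 1.0]])

F_AT = cd_alphn(P_L_AS, P_L_AT, P_AT_L, Y_AS)
print(F_AT.argmax(axis=1))   # prints [0]
```

The target candidate inherits the class of the source aspect reachable through shared linguistic features, which is the intended behavior of the label propagation.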


At the n-th iteration we have:

\sum_{j=1}^{|A_T|} (P_{A_T L} P_{L A_T})^n [i, j] ≤ γ^n .  (5)

For n → ∞,

\lim_{n→∞} (P_{A_T L} P_{L A_T})^n = 0 ,  (6)

since for each real number 0 ≤ γ < 1, γ^n tends to 0 when n tends to infinity. Thus, the proposed solutions for F_L^{(n)} and F_{A_T}^{(n)} converge, since both contain only powers of (P_{A_T L} P_{L A_T}) (or, equivalently, of (P_{L A_T} P_{A_T L})).
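The geometric decay above can be checked numerically for any row-substochastic matrix; the toy matrix here is an assumption standing in for (P_{A_T L} P_{L A_T}):

```python
import numpy as np

# Hypothetical row-substochastic matrix: each row sums to at most gamma < 1
# (illustrative values only).
M = np.array([[0.3, 0.4],
              [0.2, 0.5]])
gamma = M.sum(axis=1).max()          # 0.7 for this toy matrix

# Row sums of M^n are bounded by gamma^n, so M^n vanishes as n grows.
Mn = np.linalg.matrix_power(M, 50)
assert gamma < 1
assert np.allclose(Mn, 0.0, atol=1e-7)
```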
The final classification of the network objects uses the Class Mass Normalization (CMN) concept. Hence, the class of an aspect candidate a_i ∈ A_T using CMN is given by Zhu et al. (2003):

class(a_i) = \arg\max_{c_l ∈ C} Pr(c_l) · \frac{f_{a_i,c_l}}{\sum_{a_j ∈ A} f_{a_j,c_l}}  (7)

in which Pr(c_l) is the prior probability of the class c_l.
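A minimal sketch of the CMN decision rule in Equation 7; the score matrix and the uniform priors below are hypothetical values for illustration:

```python
import numpy as np

def cmn_classes(F, priors):
    """Class Mass Normalization (Equation 7): divide each class column of F
    by its total mass over all aspect nodes, weight by the class prior
    Pr(c_l), and take the argmax per aspect candidate."""
    normalized = F / F.sum(axis=0)
    return (priors * normalized).argmax(axis=1)

# Toy propagated scores for two aspect candidates and two classes,
# with uniform priors (illustrative values only).
F = np.array([[0.6, 0.1],
              [0.2, 0.3]])
print(cmn_classes(F, np.array([0.5, 0.5])))   # prints [0 1]
```

Normalizing by the column mass keeps a class with large total propagated mass from dominating every candidate's decision.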

In order to highlight the benefits of the proposed network representation and the network label propagation algorithm, in Figure 4 we illustrate the final labels of the target domain nodes after applying CD-ALPHN to the network presented in Figure 3. Through this illustration, we can notice that some features that do not occur in the source domain are useful to classify aspect candidates in the target domain. For instance, no aspect in the source domain is linked to the linguistic node NN:compound:out. However, some nodes in the target domain, acidity and palate, propagate their labels to the node NN:compound:out, which subsequently propagates its label to the node aroma.

We highlight that the node aroma would not be classified as an aspect in this example if only the linguistic nodes that appear in the source domain were considered. The same occurs for the target domain nodes pleasant and good. Those nodes do not have any relation to the linguistic nodes of the source domain, and they were classified through the label propagation from the target node tasteful to some linguistic nodes connected to pleasant and good in the target domain.

[Figure 4 depicts the heterogeneous network in three layers: labeled aspects from the source domain (computers), linguistic features from both the source and target domains, and aspect candidates from the target domain (wines), with each target node marked as Aspect = YES or Aspect = NO.]

Figure 4: Illustration of the label propagation through the CD-ALPHN algorithm, considering the network presented in Figure 3.


4. Experimental Evaluation

We demonstrate the performance of our proposed CD-ALPHN by using seven benchmark datasets (product and service reviews), as presented in the next section. In addition, an example of a data analytics system called “Websensors-SentimentAnalysis” is also presented to demonstrate the practical relevance of our proposal in real-world applications.
Our objective is to evaluate the effectiveness of CD-ALPHN for cross-domain aspect extraction in sentiment analysis tasks. In the benchmark datasets, all sentences with aspects in the source domain have a sentiment word related to the polarity of the aspect (negative, positive or neutral). Although our CD-ALPHN approach does not directly use sentiment words to support the extraction of aspects, the grammatical and syntactic structure of this type of sentence has been mapped into our heterogeneous network.

4.1. Datasets

We used seven text review benchmark datasets that are widely cited in the aspect-based sentiment analysis literature. Table 3 presents an overview of the datasets, with the number of reviews (#Reviews), the total number of aspects (#Aspects), the total number of unique terms of the dataset after text preprocessing (#TotalTerms), and the average number of terms per document (#AvgTerms). Datasets D1 to D5 were obtained from Hu & Liu (2004), and datasets D6 and D7 were obtained from Pontiki et al. (2014).

Table 3: Description of the text review datasets used in the experimental evaluation.

ID  Dataset           #Reviews  #Aspects  #TotalTerms  #AvgTerms
D1  Digital Camera 1        45        68         1045      54.46
D2  Digital Camera 2        34        47          734      48.74
D3  Cellular Phone          41        78          924      51.70
D4  MP3 Player              95       126         1614      63.95
D5  DVD Player              99        82         1074      33.02
D6  Laptop                3045       955         2559       5.26
D7  Restaurant            3045      1219         3083       5.34

All texts were preprocessed using the Stanford CoreNLP (Manning et al.,
2014) natural language processing tool to extract the linguistic features.

4.2. Experiment Setup

We configured the experiments to compare our proposed CD-ALPHN approach with state-of-the-art cross-domain approaches for aspect extraction. Existing cross-domain aspect extraction approaches identify features that are common to both source and target domains and learn an inductive classifier from the labeled aspects in the source domain. The learned classifier is then applied to classify aspects in the target domain. We used five popular classifiers, and we selected the best configurations considering the following parameters:

• J48 (Decision tree-based classifier): we analyzed the values {0.15, 0.20, 0.25} for the confidence parameter.

• kNN (Instance-based classifier): we analyzed the values {1, 3, 5, 7, 9, 11, 13, 15} for the number of nearest neighbors (k parameter), with cosine similarity.

• MLP (Neural network-based classifier using a Multilayer Perceptron and the Backpropagation training algorithm): we analyzed the values {8, 16, 32} for the number of neurons in the hidden layer, with the sigmoid function as the activation function.

• NB (Naive Bayes classifier): this classifier is parameter free.

• SVM (Support Vector Machine): we analyzed the values {10^-3, 10^-2, 10^-1, 10^0, 10^1, 10^2, 10^3} for the complexity parameter C, with a polynomial kernel.

• CD-ALPHN (Cross-Domain Aspect Label Propagation through Heterogeneous Networks): our proposed approach is parameter free.

The simulation of a cross-domain process for aspect extraction was performed as follows: given a dataset to represent the target domain, all other datasets are combined to compose the source domain. For example, if dataset D7 represents the target domain, then the source domain is represented by the labeled aspects (and their linguistic features) extracted from datasets D1, D2, D3, D4, D5, and D6. Thus, the proposed experimental evaluation is similar to a real-world scenario where labeled aspects from different source domains can be used to extract aspects of a new target domain.
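The leave-one-domain-out protocol above can be sketched as follows (dataset identifiers only; the review data itself is not reproduced here):

```python
# Leave-one-domain-out protocol: each dataset in turn is the target domain,
# and the remaining six are merged into the source domain.
datasets = ["D1", "D2", "D3", "D4", "D5", "D6", "D7"]

splits = [([d for d in datasets if d != target], target) for target in datasets]

# For example, when D7 (Restaurant) is the target domain:
sources, target = splits[-1]
print(sources)   # prints ['D1', 'D2', 'D3', 'D4', 'D5', 'D6']
```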
Our experimental setup also considers the level of inconsistency between the source and target domains. In this case, we consider the number of aspects of the target domain (A_T) that occur in the source domain (A_S) to compute the level of inconsistency (β parameter), as defined in Equation 8:

β = 100 × \frac{|A_T ∩ A_S|}{|A_T|}  (8)
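Equation 8 can be computed directly from the two aspect sets; the aspect strings in this sketch are hypothetical:

```python
def inconsistency(aspects_target, aspects_source):
    """beta = 100 * |A_T intersect A_S| / |A_T| (Equation 8)."""
    a_t, a_s = set(aspects_target), set(aspects_source)
    return 100.0 * len(a_t & a_s) / len(a_t)

# Hypothetical aspect sets: 2 of the 4 target-domain aspects also occur
# in the source domain, so beta = 50%.
beta = inconsistency({"battery", "screen", "aroma", "palate"},
                     {"battery", "screen", "resolution"})
print(beta)   # prints 50.0
```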

When the cross-domain transfer learning process is performed with a low level of inconsistency, the target domain feature space is well represented in the source domain (there is a significant number of shared aspects between both domains). For example, we have a low level of inconsistency when the source domain contains aspects about some digital camera models (dataset D1) and the target domain contains aspects about other digital camera models (dataset D2). Although they are different models of digital camera, the aspects of this type of product are very similar. On the other hand, when the target domain consists of a completely new product or service with different aspects, we have a high level of inconsistency.
M

Table 4 presents the experimental setup according to four levels of inconsis-


tency (β): Very Low-Level (β > 70%), Low-Level (50% < β ≤ 70%), Mid-Level
ED

(20% < β ≤ 50%), and High-Level (β < 20%). We underlined the source
domain datasets that have aspects shared with the target domain.
PT

4.3. Evaluation Criteria


We use the F1-Measure (Equation 9) to evaluate the aspect extraction performance, which is the harmonic mean between the Precision (Equation 10) and Recall (Equation 11) measures, where:

• TP (True Positive) indicates the number of terms inferred as aspects that were correctly extracted;

• FP (False Positive) indicates the number of terms inferred as aspects that are not real aspects; and

• FN (False Negative) indicates the number of terms that are aspects and were not inferred as aspects.

Table 4: Level of inconsistency (β) considering the number of aspects of the target domain that occur in the source domain.

Source Domains            Target Domain          Level of Inconsistency
D2, D3, D4, D5, D6, D7    Digital Camera 1 (D1)  Very Low-Level
D1, D3, D4, D5, D6, D7    Digital Camera 2 (D2)  Very Low-Level
D1, D2, D4, D5, D6, D7    Cellular Phone (D3)    Mid-Level
D1, D2, D3, D5, D6, D7    MP3 Player (D4)        Low-Level
D1, D2, D3, D4, D6, D7    DVD Player (D5)        Low-Level
D1, D2, D3, D4, D5, D7    Laptop (D6)            High-Level
D1, D2, D3, D4, D5, D6    Restaurant (D7)        High-Level

F1 = 2 × \frac{Precision × Recall}{Precision + Recall}  (9)

Precision = \frac{TP}{TP + FP}  (10)

Recall = \frac{TP}{TP + FN}  (11)
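Equations 9–11 reduce to the following computation; the example counts are hypothetical:

```python
def f1_measure(tp, fp, fn):
    """F1 (Equation 9) as the harmonic mean of Precision (Eq. 10)
    and Recall (Eq. 11), computed from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 60 aspects correctly extracted, 20 spurious, 40 missed.
print(round(f1_measure(60, 20, 40), 3))   # prints 0.667
```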
4.4. Results and Discussion

Table 5 presents the experimental results (F1-Measure) of the aspect extraction process. We highlighted the best algorithm for each dataset in bold. Below, we discuss the experimental results according to the level of inconsistency of the cross-domain process.
Table 5: F1-Measure results of the aspect extraction process for each approach.

Target Domain  CD-ALPHN  J48    kNN    MLP    NB     SVM
D1             0.560     0.554  0.550  0.570  0.559  0.566
D2             0.579     0.579  0.572  0.581  0.572  0.574
D3             0.600     0.585  0.589  0.598  0.591  0.598
D4             0.589     0.583  0.581  0.587  0.585  0.595
D5             0.570     0.573  0.578  0.581  0.565  0.583
D6             0.610     0.480  0.547  0.581  0.570  0.504
D7             0.683     0.543  0.583  0.668  0.649  0.569

Very Low-Level: Datasets D1 (Digital Camera 1) and D2 (Digital Camera 2) contain almost the same aspects. Thus, a cross-domain process based on two stages was sufficient for the transfer learning process, since the feature spaces and probability distributions are similar. The Neural Network (MLP) classifier obtained the best results.

Low-Level: Datasets D4 (MP3 Player) and D5 (DVD Player) contain many shared aspects, due to the similar functions of these two products. In the same way as described above (very low-level), a two-stage cross-domain process was sufficient for the transfer learning process. The Support Vector Machine (SVM) classifier obtained the best results.

Mid-Level: Dataset D3 (Cellular Phone) shares some aspects with the D1 (Digital Camera 1), D2 (Digital Camera 2), and D4 (MP3 Player) datasets. However, the target domain contains a significant number of new aspects. In this scenario, the transductive learning process proposed in our CD-ALPHN algorithm is competitive and obtained the best aspect extraction performance. The results are similar to those of traditional two-stage approaches, especially when compared with the MLP and SVM classifiers.

High-Level: Datasets D6 (Laptop) and D7 (Restaurant) represent scenarios in which there is a large set of new aspects in the target domain. In this case, traditional approaches based on two stages yield inferior performance compared to our CD-ALPHN. These experimental results are strong evidence of the importance of a unified representation model based on heterogeneous networks for cross-domain transfer learning.

In summary, the use of transductive learning from a unified representation based on heterogeneous networks yields promising results for cross-domain transfer learning, outperforming the MLP and SVM methods in scenarios where there is a high level of inconsistency between the source and target domains, which is the most common scenario in real-world applications. Moreover, we emphasize that our proposed CD-ALPHN approach is competitive even when there is a low level of inconsistency, being an interesting alternative to be used in different scenarios.
The CD-ALPHN approach is also advantageous in terms of computational time [1]. Table 6 shows the average execution time of each approach considering all datasets. Note that our CD-ALPHN approach is computationally fast for aspect extraction tasks, even when compared to simpler approaches such as Naive Bayes (NB). Transductive approaches have the advantage that it is not necessary to perform a previous model-building stage (training step), since they use both the labeled and unlabeled data in an iterative label propagation process, which already returns the set of classified aspects.

[1] All algorithms were implemented using the Java language and run in the same computing environment (Debian Operating System, Intel i7 Processor and 64GB of RAM).

4.5. Example of Application

To complement the experimental analysis based on benchmark datasets, we discuss the practical use of the CD-ALPHN approach integrated into a decision support system. Thus, we developed a data analytics system for aspect-based sentiment analysis called “Websensors-SentimentAnalysis” (Websensors for Aspect-based Sentiment Analysis), which uses the concept of “information as a sensor” (Marcacini et al., 2017). Websensors are proposed as an instance of the Web Science research area, with the differential of exploring data science techniques and analytical intelligence to explore the relationship between the digital world and our physical world (Phethean et al., 2016). Websensors have been used successfully in real-world applications such as time series forecasting (Marcacini et al., 2016) and urban violence events (Florence et al., 2017). In this paper, we argue that sentiment analysis is a promising application to explore relationships between the virtual world (text reviews) and the real world (products and consumers), a concept also investigated by authors who study human behavior in online e-commerce systems to support decision making (Zhang & Benyoucef, 2016). In this case, in addition to the text reviews, geographic information (e.g., consumer location) and time information (e.g., date of publication) are also collected to improve the information sensor.

Table 6: Average execution time in seconds of each cross-domain aspect extraction approach considering all datasets.

Method     Model Building Time  Classification Time  Total
CD-ALPHN   —                    2.26                 2.26
NB         3.38                 0.70                 4.07
J48        78.81                0.01                 78.82
kNN        —                    373.47               373.56
MLP        25146.29             0.17                 25146.46
SVM        25753.12             94.46                25847.58
Figure 5 illustrates one of the main interfaces of Websensors-SentimentAnalysis applied to 93 real reviews (written in Portuguese) about a Brazilian restaurant (target domain), where the aspect extraction is performed using our CD-ALPHN approach with labeled aspects from computer and laptop reviews (source domain). Initially, we summarize all aspects according to the positive and negative polarities (Figure 5A), thereby providing an overview of the product or service. When the user selects a polarity of interest (Positive or Negative), the Websensors-SentimentAnalysis system presents: the text reviews with the highlighted aspects (Figure 5B), the temporal evolution of the polarity according to the frequency of positive or negative aspects over time (Figure 5C), and a geographical mapping according to the locality of the consumers who submitted the reviews (Figure 5D).

[Figure 5 shows the system interface with four panels, labeled A to D as referenced in the text.]

Figure 5: Overview of the Websensors-SentimentAnalysis (Data Analytics) system interface for aspect-based sentiment analysis.

Note that in Figure 5B we highlight the aspects according to the matrix F(A_T) obtained by using the CD-ALPHN approach. The user can select the extracted aspects and then get an overview of the text reviews in which each aspect was classified as positive or negative. Thus, by selecting an aspect of interest and its polarity, we can combine the selected aspect with geographic information to build a heat map, as shown in Figure 6, thereby presenting the geographical occurrence of an aspect according to the location of the consumers.

Figure 6: Example of a heat map that represents the geographical occurrence of a selected aspect according to the location of the consumers.

and its polarity, we can combine the selected aspect with geographic information
ED

to build a heat map as shown in Figure 6, thereby presenting the geographical


occurrence of an aspect according to the location of the consumers.

Once the aspects are extracted, several other functionalities of a data analytics system can be implemented, such as notifications of the occurrence of a new aspect, reports of the top-k aspects most commented on in the reviews, and alerts for the aspects that need attention (an increase in negative polarity). We believe that these functionalities can be used as guidelines for the development of new decision support systems for aspect-based sentiment analysis. Moreover, it is a promising example of how ABSA-based decision support systems can be constructed more quickly and easily if we use aspects already labeled in other domains combined with transductive learning methods.


5. Concluding Remarks

We present a new approach for aspect extraction in a given domain using labeled aspects from other, different domains. Our approach innovates over existing solutions (1) by providing a unified representation of the feature spaces of different domains through heterogeneous networks, and (2) by using a cross-domain transfer learning process through label propagation with transductive learning.

Experimental results validate the effectiveness (F1-Measure) and efficiency (computational time) of our approach. Our proposed CD-ALPHN (Cross-Domain Aspect Label Propagation through Heterogeneous Networks) was compared to state-of-the-art approaches for cross-domain tasks and obtained competitive results considering all scenarios. Moreover, CD-ALPHN is particularly useful in scenarios where there is a high level of inconsistency between the source and target domains, outperforming the state-of-the-art MLP (Multilayer Perceptron Neural Network) and SVM (Support Vector Machine) approaches. In fact, real-world applications often present a high level of inconsistency, since reviews are collected from different data sources and written in different styles. In addition to the experimental evaluation with benchmark datasets, we discussed how the proposed approach can be employed in decision support systems. Specifically, we present a data analytics system for aspect-based sentiment analysis (Websensors-SentimentAnalysis).

Directions for future work involve the enrichment of the model representation based on heterogeneous networks. We plan to include the sentiment polarity (negative or positive) in the cross-domain transfer learning process, thereby allowing the extraction of aspects as well as the classification of their polarity using labeled data from different domains. In fact, this is a challenging research direction that we believe will be a trend in the sentiment analysis field, and it will require advancements both in unified text representation models (e.g., statistical and linguistic features) and in effective transductive learning methods for label propagation.


All the datasets used in this work, as well as the source code of our CD-
ALPHN approach, are available at http://websensors.net.br/absa/.

Acknowledgment

The authors acknowledge the Brazilian Research Agencies FUNDECT-MS [grant number 147/2016 - SIAFEM 25907], FINEP, CNPq, CAPES, and FAPESP [grant numbers 2014/08996-0 and 2017/08804-2] for their support of this work. The authors also thank NVIDIA for donating computer equipment (GPU Grant Academic Program).
References
AN
Akhtar, M. S., Gupta, D., Ekbal, A., & Bhattacharyya, P. (2017). Feature selection and ensemble construction: A two-step method for aspect based sentiment analysis. Knowledge-Based Systems, 125, 116–135.

Al-Moslmi, T., Omar, N., Abdullah, S., & Albared, M. (2017). Approaches to cross-domain sentiment analysis: A systematic literature review. IEEE Access, 5, 16173–16192.

Belkin, M., Niyogi, P., & Sindhwani, V. (2006). Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 7, 2399–2434.

Breck, E., & Cardie, C. (2017). Opinion mining and sentiment analysis. In The Oxford Handbook of Computational Linguistics (2nd ed.).

Breve, F. A., Zhao, L., Quiles, M. G., Pedrycz, W., & Liu, J. (2012). Particle competition and cooperation in networks for semi-supervised learning. IEEE Transactions on Knowledge and Data Engineering, 24, 1686–1698.

Cambria, E. (2016). Affective computing and sentiment analysis. IEEE Intelligent Systems, 31, 102–107.

Chang, W.-C., Wu, Y., Liu, H., & Yang, Y. (2017). Cross-domain kernel induction for transfer learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17) (pp. 1763–1769).

Chapelle, O., Schölkopf, B., & Zien, A. (Eds.) (2006). Semi-Supervised Learning. MIT Press.

Chen, Z., Mukherjee, A., & Liu, B. (2014). Aspect extraction with automated prior knowledge learning. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (pp. 347–358).

Deng, S., Sinha, A. P., & Zhao, H. (2017). Adapting sentiment lexicons to domain-specific social media texts. Decision Support Systems, 94, 65–76.

Duric, A., & Song, F. (2012). Feature selection for sentiment analysis based on content and syntax models. Decision Support Systems, 53, 704–711.

Feldman, R. (2013). Techniques and applications for sentiment analysis. Communications of the ACM, 56, 82–89.

Florence, R., Nogueira, B., & Marcacini, R. (2017). Constrained hierarchical clustering for news events. In Proceedings of the 21st International Database Engineering & Applications Symposium (pp. 49–56). ACM.

Gupta, M., Kumar, P., & Bhasker, B. (2015). A new relevance measure for heterogeneous networks. In International Conference on Big Data Analytics and Knowledge Discovery (pp. 165–177). Springer.

Hu, M., & Liu, B. (2004). Mining and summarizing customer reviews. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004) (pp. 168–177).

Joachims, T. (1999). Transductive inference for text classification using support vector machines. In Proceedings of the 16th International Conference on Machine Learning (ICML-99) (pp. 200–209).

Joachims, T. (2003). Transductive learning via spectral graph partitioning. In Proceedings of the 20th International Conference on Machine Learning (ICML-03) (pp. 290–297).

Kong, X., Ng, M. K., & Zhou, Z.-H. (2013). Transductive multilabel learning via label set propagation. IEEE Transactions on Knowledge and Data Engineering, 25, 704–719.

Lau, R. Y., Li, C., & Liao, S. S. (2014). Social analytics: learning fuzzy product ontologies for aspect-oriented sentiment analysis. Decision Support Systems, 65, 80–94.

Li, S., Xue, Y., Wang, Z., & Zhou, G. (2013). Active learning for cross-domain sentiment classification. In Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI) (pp. 2127–2133).

Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5, 1–167.

Liu, B., & Zhang, L. (2012). A survey of opinion mining and sentiment analysis. In Mining Text Data (pp. 415–463). Springer.

Long, M., Wang, J., Ding, G., Pan, S. J., & Philip, S. Y. (2014a). Adaptation regularization: A general framework for transfer learning. IEEE Transactions on Knowledge and Data Engineering, 26, 1076–1089.

Long, M., Wang, J., Ding, G., Shen, D., & Yang, Q. (2014b). Transfer learning with graph co-regularization. IEEE Transactions on Knowledge and Data Engineering, 26, 1805–1818.

Long, M., Wang, J., Sun, J., & Philip, S. Y. (2015). Domain invariant transfer kernel learning. IEEE Transactions on Knowledge and Data Engineering, 27, 1519–1532.

Lu, J., Behbood, V., Hao, P., Zuo, H., Xue, S., & Zhang, G. (2015). Transfer learning using computational intelligence: a survey. Knowledge-Based Systems, 80, 14–23.

Luo, P., Zhuang, F., Xiong, H., Xiong, Y., & He, Q. (2008). Transfer learning from multiple source domains via consensus regularization. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (pp. 103–112). ACM.

Manning, C., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S., & McClosky, D. (2014). The Stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp. 55–60).

Marcacini, R. M., Carnevali, J. C., & Domingos, J. (2016). On combining websensors and DTW distance for kNN time series forecasting. In Pattern Recognition (ICPR), 2016 23rd International Conference on (pp. 2521–2525). IEEE.

Marcacini, R. M., Rossi, R. G., Nogueira, B. M., Martins, L. V., Cherman, E. A., & Rezende, S. O. (2017). Websensors analytics: Learning to sense the real world using web news events. In Proceedings of the 23rd Brazilian Symposium on Multimedia and the Web, Workshop on Tools and Applications (pp. 169–173).

Matsuno, I. P., Rossi, R. G., Marcacini, R. M., & Rezende, S. O. (2017). Aspect-based sentiment analysis using semi-supervised learning in bipartite heterogeneous networks. Journal of Information and Data Management, 7, 141–154.

Mukherjee, A., & Liu, B. (2012). Aspect extraction through semi-supervised modeling. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1 (pp. 339–348). Association for Computational Linguistics.

Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22, 1345–1359.

Phethean, C., Simperl, E., Tiropanis, T., Tinati, R., & Hall, W. (2016). The role of data science in web science. IEEE Intelligent Systems, 31, 102–107.

Pontiki, M., Galanis, D., Pavlopoulos, J., Papageorgiou, H., Androutsopoulos, I., & Manandhar, S. (2014). SemEval-2014 task 4: Aspect based sentiment analysis. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval-2014) (pp. 27–35).

Rana, T. A., & Cheah, Y.-N. (2016). Aspect extraction in sentiment analysis:
comparative analysis and survey. Artificial Intelligence Review , 46 , 459–483.

T
Rana, T. A., & Cheah, Y.-N. (2017). A two-fold rule-based model for aspect

IP
extraction. Expert Systems with Applications, 89 , 273–285.

CR
Rohrbach, M., Ebert, S., & Schiele, B. (2013). Transfer learning in a trans-
ductive setting. In Proceedings of Advances in Neural Information Processing
Systems (NIPS) (pp. 46–54).

US
Rossi, R. G., Lopes, A. A., & Rezende, S. O. (2014). A parameter-free label
propagation algorithm using bipartite heterogeneous networks for text classi-
AN
fication. In Proceedings of the Symposium on Applied Computing (pp. 79–84).
ACM.
M

Rossi, R. G., Lopes, A. d. A., & Rezende, S. O. (2016). Optimization and


label propagation in bipartite heterogeneous networks to improve transductive
ED

classification of texts. Information Processing & Management, 52 , 217–257.

Ryan, P. B. K. J., & Michailidis, M. V. C. G. (2017). Graph-based semi-


PT

supervised learning with big data. Handbook of Research on Applied Cyber-


netics and Systems Science, (p. 154).
CE

Schouten, K., & Frasincar, F. (2016). Survey on aspect-level sentiment analysis.


IEEE Transactions on Knowledge and Data Engineering, 28 , 813–830.
AC

Subramanya, A., & Bilmes, J. (2008). Soft-supervised learning for text classi-
fication. In Proceedings of the Conference on Empirical Methods in Natural
Language Processing (pp. 1090–1099). Association for Computational Lin-
guistics.

Wang, T., Cai, Y., Leung, H.-f., Lau, R. Y., Li, Q., & Min, H. (2014). Product
aspect extraction supervised with online domain knowledge. Knowledge-Based
Systems, 71 , 86–100.

33
ACCEPTED MANUSCRIPT

Wu, F., Huang, Y., & Yan, J. (2017). Active sentiment domain adaptation. In
Proceedings of the 55th Annual Meeting of the Association for Computational
Linguistics (Volume 1: Long Papers) (pp. 1701–1711). volume 1.

Wu, Q., & Tan, S. (2011). A two-stage framework for cross-domain sentiment

T
classification. Expert Systems with Applications, 38 , 14269–14275.

IP
Zhang, K. Z., & Benyoucef, M. (2016). Consumer behavior in social commerce:

CR
A literature review. Decision Support Systems, 86 , 95 – 108.

Zhang, P., Wang, J., Wang, Y., & Wang, Y. (2016). A statistical approach to
opinion target extraction using domain relevance. In Proceedings of the 2nd

US
IEEE International Conference on Computer and Communications (ICCC)
(pp. 273–277). IEEE.
AN
Zhou, D., Bousquet, O., Lal, T. N., Weston, J., & Schölkopf, B. (2004). Learning
with local and global consistency. In Proceedings of the Advances in Neural
M

Information Processing Systems (pp. 321–328). volume 16.

Zhu, X., Ghahramani, Z., & Lafferty, J. D. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the 20th International Conference on Machine Learning (ICML-03) (pp. 912–919).

Zhu, X., & Goldberg, A. B. (2009). Introduction to Semi-Supervised Learning.


Morgan and Claypool Publishers.

Zhuang, F., Luo, P., Xiong, H., Xiong, Y., He, Q., & Shi, Z. (2010). Cross-domain learning from multiple sources: A consensus regularization perspective. IEEE Transactions on Knowledge and Data Engineering, 22, 1664–1678.


Biography

Ricardo Marcondes Marcacini is an associate professor of Information Systems at the Federal University of Mato Grosso do Sul, Brazil. He has a PhD in Computer Science from the Institute of Mathematics and Computer Science at the University of São Paulo, Brazil. His research interests include machine learning, data clustering, and data analytics systems. He has published papers in a number of international journals and conferences, such as Pattern Recognition Letters, Journal of Information and Data Management, International Conference on World Wide Web, Web Intelligence Conference, and ACM Symposium on Document Engineering.

Ivone Penque Matsuno is a doctoral candidate at the Institute of Mathematics and Computer Science at the University of São Paulo, Brazil. Her research interests include sentiment analysis and opinion mining. Her research has appeared in the Journal of Information and Data Management, and in conference proceedings, such as the Brazilian Symposium on Multimedia and the Web and the International Conference on Engineering Design.
Rafael Geraldeli Rossi is an associate professor of Information Systems at the Federal University of Mato Grosso do Sul, Brazil. He has a PhD in Computer Science from the Institute of Mathematics and Computer Science at the University of São Paulo, Brazil. His research interests include machine learning, text classification, and network models for data representation. He has published papers in a number of international journals and conferences, such as Knowledge-Based Systems, Pattern Recognition Letters, Intelligent Data Analysis, Information Processing & Management, International Conference on Data Mining, and Annual ACM Symposium on Applied Computing.


Solange Oliveira Rezende is a full professor of Computer Science at the University of São Paulo, Brazil. She has a PhD in Mechanical Engineering from the University of São Paulo and a postdoctoral degree in Computer Science from the University of Minnesota, USA. Her research interests include data and text mining, machine learning, and recommendation systems. She has published papers in a number of international journals, such as Pattern Recognition Letters, Journal of Information and Data Management, Knowledge-Based Systems, Intelligent Data Analysis, Information Retrieval Journal, and Information Processing & Management.
Highlights

• We show that transductive learning is promising for cross-domain aspect extraction.

• A heterogeneous network model to fuse knowledge for cross-domain transfer learning.

• Aspect label propagation using linguistic features as a bridge between domains.

• Experimental results validate the effectiveness and efficiency of our approach.
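The third highlight, propagating aspect labels through linguistic features that bridge domains, builds on the transductive scheme of Zhou et al. (2004), cited above. As an illustrative sketch only (not the paper's CD-ALPHN implementation), the toy NumPy example below applies that closed-form propagation on an invented four-node graph: two labeled aspect nodes spread scores to two unlabeled nodes via shared edges.

```python
import numpy as np

# Toy symmetric affinity matrix over 4 nodes: nodes 0 and 3 are labeled
# aspect seeds; nodes 1 and 2 are unlabeled and reachable through edges
# (standing in for shared linguistic features bridging two domains).
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Symmetric normalization S = D^{-1/2} W D^{-1/2}
d = W.sum(axis=1)
S = W / np.sqrt(np.outer(d, d))

# Initial label vector: 1 = known aspect, 0 = unknown
Y = np.array([[1.0], [0.0], [0.0], [1.0]])

# Closed-form solution of Zhou et al. (2004):
# F = (1 - alpha) * (I - alpha * S)^{-1} * Y
alpha = 0.5
F = (1 - alpha) * np.linalg.solve(np.eye(4) - alpha * S, Y)
print(F.ravel())  # unlabeled nodes 1 and 2 receive positive aspect scores
```

The parameter alpha trades off the initial labels against smoothness over the graph; the unlabeled nodes end up with scores between the seeds' values, which is the basis for ranking candidate aspects in the target domain.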

