Journal of Ambient Intelligence and Humanized Computing

https://doi.org/10.1007/s12652-019-01521-w

ORIGINAL RESEARCH

Design of recommendation system for tourist spot using sentiment analysis based on CNN‑LSTM

Hyeon‑woo An1 · Nammee Moon1

Received: 28 December 2018 / Accepted: 26 September 2019


© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract
Sentiment analysis of text plays an important role in many fields, including decision-making systems. A variety of research has been actively conducted on sentiment analysis techniques, such as approaches using word frequency or morphological analysis and methods using complex neural networks. In this paper, we apply sentiment analysis based on a deep neural network to sightseeing reviews, add ratings to reviews that did not include them, supplement the data to enable classification by weather or season, and design a system that enables custom recommendations based on the data. We also examine the contextual features of tourist attractions and design an efficient preprocessing procedure based on the results, and we describe the overall process, including building a suitable learning environment, combining review and weather information, and the final recommendation method.

Keywords Sentiment analysis · Mobile edge computing · CNN · LSTM · Recommendation system

* Nammee Moon · Hyeon‑woo An
1 Department of Computer Engineering, Hoseo University, Asan, Republic of Korea

1 Introduction

Satisfaction with tourist attractions is determined by more complex factors than satisfaction with products. The convenience of a tourist’s route, the weather, the concentration of other people, and the degree of cultural property preservation are just a sample of the relevant factors. The most important factor is the weather condition (Scott and Lemieux 2010). Drawing on that research, Fig. 1 shows how the climate, forecast, and weather affect tourist planning and tourism. Therefore, simply averaging the evaluations of tourists who have each had one specific experience is not an appropriate method of recommending tourist sites to others. Scenic spots that look more beautiful on rainy days, or beaches that are affected by high and low tides, should be weighted differently and gain an advantage on the days when the conditions are right. Making such recommendations requires an indicator that rates and classifies sites by weather condition, which also means that the data needed to construct it must be very large. However, according to the survey, unpopular tourist sites do not have even 10 review scores.

Fig. 1  Weather–climate information for tourist decision-making

In this paper, we propose a method that supplements insufficient data by using a deep neural network and social networking review data, and a recommendation system that applies a combination of public tourism and meteorological data.

2 Related works

This work combines word embedding and TF-IDF for sentence processing with deep learning for sentence analysis. Owing to space constraints, only the main functions are introduced briefly.

2.1 Word embedding

Word embedding is the process of converting the words in a sentence into numbers. Because almost all deep learning architectures require numeric input, quantifying sentences is a necessary step when a deep learning algorithm is used for sentence analysis. Generally, a sentence is first split into morphemes, and word embedding is performed on those morphemes. Morphological classification refers to the task of splitting a sentence into words and each word into smaller units, morphemes. Instead of simply splitting on characters, the role of each word in its context is judged and tagged. For example, a sentence such as “좋은 장소입니다. 그런데..” (“It is a good place. But..”) is tagged as (좋/VA 은/ETM 장소/NNG 이/VCP ㅂ니다/EF 그런데/MAJ). Embedding then replaces the classified morphemes with R-dimensional vectors. Several types of embedding method exist, such as frequency-based and prediction-based ones, and it is desirable to adopt the one that is appropriate for the purpose of use.

2.2 Term frequency‑inverse document frequency (TF‑IDF)

TF-IDF is an algorithm that gives the words characteristic of a particular document a more important role than commonly used words (Ramos 2003). For example, when categorizing articles, words that are often used in the “politics” category, such as “governing” or “white-house”, play a greater role in choosing the “politics” category than commonly used words such as “this” or “the”. For this purpose, TF-IDF uses the frequency of a term in a specific document and the frequency of the same term in the entire document set.

2.3 Convolutional neural network (CNN)

CNN is a technology widely used in image analysis. It uses kernels and convolution to extract and preserve regional features of images. In recent years, however, methods that use CNN for natural language processing have been studied and have produced good results (Kim 2014). Various CNN-based methodologies have been studied for sentence analysis. The basic mechanism is to represent a sentence consisting of word vectors as a matrix, and to extract and preserve contextual features by applying a window, which plays the role of the kernel, to the word matrix and convolving it. Recently, further research has combined CBOW and CNN for sentiment analysis, and high accuracy has been obtained by capturing the semantic features of text through CNN (Liu 2018).

2.4 Long‑short term memory (LSTM)

When learning with an RNN, the problem of failing to preserve the characteristics of the context over long spans is called the long-term dependency problem. For example, given a short sentence such as “There are many fish in the..”, a typical RNN can use the previous context to predict words such as “river” and “sea” with high accuracy. In a long sentence such as “I have been interested in soccer since I was young, so my dream is to be a..”, however, the word “soccer” cannot be preserved until the end of the sentence, so “soccer player” cannot be deduced. One of the best ways to solve this problem is LSTM. It adds a cell state and gates to the RNN to retain earlier context information, which makes it possible to use important context from the beginning of the sentence. Because LSTM is strong at learning time-series data, it has been applied to generate fuzzy cognitive maps and to construct a clinical decision support system, serving as a model that learns the effect of prescription drugs on treatment over a certain period of time (Duneja et al. 2018).

2.5 Similar systems

Recommendation systems that consider the user’s environmental factors, similar to the system introduced in this paper, have been studied before. In general, to overcome the shortage of evaluation data, evaluation sentences left on social networks (SNs) are often used, mainly because they make it possible to analyze massive data and its flow over time (Kim et al. 2013; Song et al. 2016). If this is regarded as using what is available inside the evaluation platform, there are also many systems that apply environmental factors that exist outside the evaluation platform. One of them is based on the assumption that the topic of news changes according to the user’s location; it uses latent Dirichlet allocation (LDA) topic modeling to investigate the articles accessed at each location and obtains meaningful results (Noh et al. 2014).

Another similar study is a music recommendation system that uses multiple profiles. That research obtained good results by constructing a music recommendation algorithm from user profiles, item profiles, and situation profiles that define the listening environment (weather, season, time, etc.) (Park and Moon 2012). It is based on the assumption that various profiles affect musical appreciation. Each profile is set as a multidimensional weight, and its influence is calculated through multiple regression analysis. The difference from the present study is that, because it treats the various profiles as weights, it accounts only for the relationship and influence across the whole set of products, and it is difficult to reflect the influence of environmental factors on each individual item.

There is also research that considers environmental factors based on deep learning. An example is a recommendation system using a deep autoencoder. Whereas this paper considers the environmental factors to which the user belongs, that system is intended to consider user-specific properties. In summary, it uses a Stacked Denoising AutoEncoder (SDAE) to learn and apply the correlation between review rating scores and the opinion data left by users (Je et al. 2017). As a result, the cold start problem is mitigated by extracting and utilizing the user’s review data and the user’s inherent characteristic information held in advance. However, it is difficult to solve the cold start problem completely, and there is the restriction that associations must first be established by extracting individual profile information and past review information.

Table 1 compares the proposed method with the most commonly used recommendation methods.

3 System overview

The proposed system is a recommendation system that links the various environmental factors that affect tourism to preferences through a series of processes, and thereby provides more suitable recommendations. It consists of data collection, preprocessing, rating learning and supplementation, and a spatial join. The final output is the ‘evaluation index’, a preference index that reflects the influence of the selected environmental factors on each tourist destination. A summary of the entire process is shown in Fig. 2.

The result of sentiment analysis of a sentence should be a score from one to five points, because it should represent a tourist’s point of view. However, no data set with one-to-five-point labels is provided for sentiment analysis in Korea. Therefore, we use crawled rating data as the data set for learning, and the target data are Google Maps and TripAdvisor reviews based on users’ experience. There are many ways to conduct sentiment analysis on sentences. Examples include approaches using the characteristics and frequency of words, clustering methods such as spectral clustering and chi-square-based feature clustering (Dong et al. 2018), and methods using deep neural networks. In this paper, the deep neural network applied to the sentiment analysis of long texts is a combination of CNN (convolutional neural network) and LSTM (long short-term memory) (Wang et al. 2018).

Table 1  Comparison table between proposed method and commonly used recommendation methods

Attribute/methods | Content based | User based collaborative filtering | Item based collaborative filtering | Ours
Cold start problem | None | Exist | Exist | None
First rater problem | None | Exist | Exist | Exist
Over specialized problem | Exist | None | None | None
Scalability for data size | Good | Bad | Bad | Good
Prior work | Selection of item attributes | Table construction for similar pattern matching | Building item-matrix | Environmental factor extraction and evaluation index construction
Recommendation operation | Analyze similarity with items and items, or items and user preferences | Match similar users or groups based on current user information | Selection of similar items based on item-matrix | Evaluation index matching by extracting user’s environment information
Sources of environmental factors | Attributes of items | User profile | Item profile | External/user rating


Fig. 2  Data used, data processing and output results

The evaluation of Korean sentences requires preprocessing (cutting the sentences of the review data, removing unnecessary words, classifying them into morphemes, and expressing them as vectors), a learning process that captures the relation between the preprocessed vectors and the scores, and a score calculation process. The learning process learns the relationship between sentences and scores from arrays vectorized using word2vec. Once a preprocessed review is input into the trained system, an appropriate score for the review can be derived within a given margin of error.

Sentences are preprocessed for learning and rating as follows. The entire sentence is divided to a length appropriate for preprocessing and learning, and morpheme classification is performed. The TF-IDF mechanism is then applied to the morphemes to remove words that depend on a particular tourist spot. This prevents words that describe the characteristics of tourist attractions from being learned. Finally, a vector matrix for learning is derived from the reconstructed sentences.

The learning method proposed in this paper builds on already published research. The ability of CNN to extract local features can be applied to extract contextual features in sentence analysis. The LSTM then takes the concatenated pooling-layer outputs extracted by the CNN and encodes the sentence. Finally, the score is derived from the encoded result through a fully connected layer and a softmax layer.

As a result of these tasks, we obtain a vast amount of rated data, and we combine the weather data with the tourism data to create an evaluation index that includes ratings, weather, and coordinate information. The recommendation service is then provided from the data obtained by processing the evaluation index according to the recommendation algorithm.

Finally, to reduce the load on the server and maintain high bandwidth, we adopt mobile edge computing: the server handles the preliminary tasks, and the recommendation process is reserved for the mobile platform. This can be expected to provide users with a low-latency, responsive service.

4 Recommendation system

The proposed system consists of data collection and preprocessing, learning and assigning ratings, combining the data sets into one, and finally the recommendation process.

Data collection gathers the review data that will be used in the learning and rating process: the rated data to be learned, and a large amount of unrated review data that will be assigned ratings.

Preprocessing reprocesses and vectorizes the sentences so that the collected data can be learned and rated. The learning and rating process uses the preprocessed results to learn the relationship between scores and sentences, and then gives a rating to the unrated review data with the trained system. In creating the data for the recommendation, the weather information and the coordinate information are combined with the rated data, and the result is compressed to reduce the load during the recommendation calculation.


Finally, in the recommendation process, the system searches for recommended tourist spots that match the real environment of the service consumer, such as the location or the weather. In this paper, we propose mobile edge computing to provide a fast, responsive service within the mobile environment; the server can thereby reduce the overhead of the recommendation operation and maintain high bandwidth.

4.1 Dataset

We selected Google Maps and TripAdvisor as the platforms from which to collect sample ratings and reviews, because both contain a large number of reviews whose sentences are of suitable length for learning and which are already rated on a five-point scale. As the unrated review platform, Instagram was chosen because of its relatively large quantity of posts. For example, a search for ‘pinnacleland’, a tourist spot in Korea, returned about 300 reviews on TripAdvisor and Google Maps, while there were about 12,000 posts on Instagram.

There are several reasons why we did not use more diverse platforms for learning. First, the review evaluation criteria of each platform clearly differ. Some platforms have strict criteria for evaluating tourist spots and others do not. This applies not only to tourist spots but also to the evaluation of other items such as commodities or movies. The second reason is a difference in narrative. Blogs, which are often used for tourist reviews, tend to use a diary format. This is not suitable for learning because a single post covers not only the tourist site in question but also the other attractions included in the same day’s tour. Another reason is the difference in sentence length and in the suitability of the post as a tourist review. Although preprocessing includes a step that cuts long sentences, it is more efficient to reduce the variability of sentence length in the first place. Also, even when posts are collected by specifying search keywords, there is a high possibility that users post news articles related to the tourist attractions rather than reviews of the sites, or personal opinions such as “I want to go to Dogo paradise”.

We used meteorological data provided by the Korea Meteorological Administration and tourism data provided by the public data portal to create an evaluation index that combines ratings, weather, and coordinate information for each tourist spot. The meteorological data provide the historical records of more than 600 observatories in Korea, measured at minute intervals, together with observatory information including coordinates. The tourism data, provided by a public institution, include coordinate information for each tourist spot and introductory information about the tourist spots. Finally, the evaluation index, which is the result of combining these data sets with the rated review data, includes the weather information, the name and coordinates of the tourist attraction, and the publication date for each review. Figure 3 shows the collected meteorological and tourism data.

4.2 Preprocess

Preprocessing creates an efficient sentence vector for learning. The whole procedure consists of four operations, including the final vectorization, as shown in Fig. 4.

Review data vary in length and in grammar. Because of the nature of the CNN mechanism used in the proposed system, the maximum document length must be specified. Abnormally long texts must be cut down to the length limit for the efficiency of preprocessing. After the length and wordiness of the document have been reduced, morpheme classification is performed on the result.

Morpheme classification of Korean sentences splits sentences into smaller units suitable for learning. There are several reasons to perform vectorization on morphemes. First, owing to the nature of social media posts, word boundaries may not be clear. On Instagram, for instance, it is common to put several words in a hashtag without spaces. In this case, word segmentation that uses the context of the word distinguishes words more precisely than splitting on spaces. Second, the results of morpheme classification also include the attributes of the words. This makes it possible to tell apart words that share the same form, which improves learning efficiency.

Based on the morphemes thus classified, words are removed using the TF-IDF mechanism. This ensures that words that depend on a particular tourist spot do not affect learning. For example, words like ‘cathedral’, ‘cross’, and ‘Maria’ will appear frequently in reviews of beautiful cathedrals. As a simple example, if all cathedrals have good ratings, these dependent words will also be affected. Therefore, the frequency of each noun in the reviews of a tourist spot is compared with the frequency of the same word in all sightseeing reviews to find and remove dependent words. Simply deleting the word, however, would have a negative impact on the later context learning process. Therefore, dependent words are removed by replacing them with a word that leans neither positive nor negative, such as ‘Thing’.

$$\mathrm{tf}(t, d) = 0.5 + \frac{0.5 \times f(t, d)}{\max\{f(w, d) : w \in d\}}, \qquad \mathrm{idf}(t, D) = \log \frac{|D|}{|\{d \in D : t \in d\}|} \tag{1}$$
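
As a rough illustration of Eq. (1) and of the substitution of dependent words described in this section, the sketch below computes the augmented term frequency and the inverse document frequency over tokenized reviews and replaces low-idf tokens with the neutral placeholder 'Thing'. It assumes that each review of one tourist spot is treated as a document and that the tokens are already morpheme-classified nouns; the threshold of 0.5, the function names, and the sample reviews are hypothetical.

```python
import math
from collections import Counter

def tf(term, doc):
    """Augmented term frequency from Eq. (1)."""
    counts = Counter(doc)
    return 0.5 + 0.5 * counts[term] / max(counts.values())

def idf(term, docs):
    """Inverse document frequency from Eq. (1); term must occur in docs."""
    containing = sum(1 for d in docs if term in d)
    return math.log(len(docs) / containing)

def replace_dependent(docs, threshold, placeholder="Thing"):
    """Swap terms whose idf falls below the threshold for a neutral word."""
    vocab = {t for d in docs for t in d}
    dependent = {t for t in vocab if idf(t, docs) < threshold}
    return [[placeholder if t in dependent else t for t in d] for d in docs]

# Hypothetical noun tokens from three reviews of one cathedral.
reviews = [
    ["cathedral", "window", "light"],
    ["cathedral", "cross", "crowd"],
    ["cathedral", "garden", "cross"],
]
cleaned = replace_dependent(reviews, threshold=0.5)
```

In this toy input, 'cathedral' occurs in every review (idf of 0) and 'cross' in two of three (idf of about 0.41), so both fall under the assumed threshold and are replaced, while words that occur once are kept.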


Fig. 3  Meteorological data and tourism data

Fig. 4  Sentence pre-processing process

Equation (1) gives the term frequency (tf) and the inverse document frequency (idf) of the TF-IDF mechanism. In the original frequency-based analysis, tf(t, d) is multiplied by idf(t, D) to reduce the weight of words commonly used in all documents. In this study, we instead judge a word to be a dependent word, and substitute it, when its idf(t, D) value is less than a certain value.

Then, word2vec is used to vectorize the preprocessed words so that learning and rating can be done.

4.3 CNN and LSTM

The combined model of CNN and LSTM used in this paper builds on already published research. In general, CNN is a model that is often used for image classification. It extracts and preserves local features of images composed of matrix data. This feature-extracting characteristic of CNN can be applied to extract the features of context. In the field of sentiment analysis, its high performance has been verified, and studies using it have been actively conducted (Zhang et al. 2018). Convolution is made possible by creating a window of appropriate length over the consecutive words of the sentence. Efficiency can be increased by adjusting the number and size of these windows and the convolution variables that determine the number of feature maps generated. The LSTM encodes the entire sentence from the concatenated results of the CNN, and the score is derived through the fully connected layer and the softmax layer.

Figure 5 shows the process. The bottom-most visible matrix is the sentence embedded as vectors by preprocessing. The matrix is convolved with windows of predetermined sizes, and as many feature maps as there are convolution kernels are generated. Here we use window sizes 4 and 5 and m convolution kernels. Convolution extracts and preserves the local characteristics of sentences. Each feature map thus generated is subjected to max pooling, resulting in m × 2 feature maps. Finally, these are concatenated and sent to the LSTM. The equation below describes this process.

$$Z = \oplus(P_4, P_5), \quad Z \in \mathbb{R}^{\frac{l-w+1}{2} \times (m \times 2)} \tag{2}$$

Fig. 5  Model architecture for an example sentence

The CNN output resembles N-grams and is produced at low cost. By taking the resulting matrix as input, the LSTM produces a result equivalent to encoding the whole sentence. The LSTM output then forms the penultimate layer, and through the fully connected layer and the softmax layer the probability distribution of the score is obtained. The softmax operation is as follows:

$$\hat{P}_i = \frac{\exp(o_i)}{\sum_{j=1}^{C} \exp(o_j)} \tag{3}$$

The following equation uses cross entropy as the loss function. In the proposed system, T is the set of rated reviews, V is the number of rating points, and \hat{P}^{t} is the one-hot encoded actual training label, a V-dimensional vector of 1 and 0. The entire model is trained end-to-end with stochastic gradient descent (Bottou 2010).

$$\mathrm{loss} = -\sum_{s \in T} \sum_{i=1}^{V} \hat{P}^{t}_{i}(C) \log\bigl(\hat{P}_{i}(C)\bigr) \tag{4}$$

Once learning has been completed, the CNN and LSTM are used to derive ratings. As in learning, the input for deriving the score is created by preprocessing, and the Rating table is created by attaching the resulting score to the review data.

4.4 Spatial join

Spatial join is the process of combining four tables into a single table, and it proceeds on the basis of the following information.

Fig. 6  Datasets and spatial join

Fig. 7  Pseudo code to get coordinates

1. Rated data: tourist spot’s name, date, rating.
2. Meteorological data: observation station number, weather information by date.
3. Observation data: observation number, observation coordinates.
4. Tourism data: tourist spot’s name, tourist spot’s coordinates.
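
The join over these four tables can be sketched as follows. This is a minimal illustration, assuming small in-memory tables keyed as in the list above; all field names, station numbers, coordinates, and weather values are hypothetical, and plain Euclidean distance on raw coordinates is used to pick the nearest station.

```python
import math

# Hypothetical miniature versions of the four tables listed above.
rated = [{"spot": "pinnacleland", "date": "2018-07-01", "rating": 4}]
tourism = {"pinnacleland": (36.89, 126.83)}               # spot -> (lat, lon)
stations = {101: (36.78, 127.00), 202: (37.57, 126.97)}   # station -> (lat, lon)
weather = {(101, "2018-07-01"): {"temp_c": 27.5, "rain_mm": 0.0}}

def nearest_station(coord, stations):
    """Return the station number with the smallest Euclidean distance."""
    return min(stations, key=lambda s: math.dist(coord, stations[s]))

def build_evaluation_index(rated, tourism, stations, weather):
    """Attach coordinates and same-day weather to every rated review."""
    index = []
    for row in rated:
        coord = tourism[row["spot"]]                 # name -> coordinates
        station = nearest_station(coord, stations)   # coordinates -> station
        record = weather.get((station, row["date"]), {})  # station + date -> weather
        index.append({**row, "coord": coord, "station": station, **record})
    return index

evaluation_index = build_evaluation_index(rated, tourism, stations, weather)
```

Each resulting row carries the rating, the spot name, the posting date, the coordinates, and the weather of the closest station on that date. A review with no matching weather record simply keeps its original fields in this sketch.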

Four datasets must be combined to form an evaluation index that includes ratings, weather, and coordinate information. As shown in Fig. 6, the data sets are the rated data, the tourism data, and the meteorological data. The join proceeds as follows. Using the name of the tourist spot in the rated data, the coordinates of the tourist spot are obtained from the tourism data. The Euclidean distance between these coordinates and the coordinates of each observatory in the meteorological data is calculated to find the nearest station. The past records of that observatory are then obtained and combined with the posting date of the rated data, which creates the evaluation index. The created evaluation index contains the rating, the name of the tourist site, the posting date, the sentence, the category, and the weather information.

As an exception, for a tourist site located on the beach, the visitor’s appreciation may vary with the height of the waves and with the high and low tides. Therefore, for tourist sites close to the sea, it is necessary to use meteorological information from an observatory located on the sea. Because sites with a view of the beach must be handled in this way, tourist attractions close to the beach have to be identified and treated separately. This can be done by comparing the coordinates of the “beach” and “non-beach” destinations using the category attribute.

The pseudo-code of the process for obtaining each piece of data required for the spatial join is shown in Figs. 7, 8, and 9.

Fig. 8  Pseudo code to get closest observation station

Fig. 9  Pseudo code to get weather

The evaluation index as created is inefficient to use for recommendation. To make the weather information and the date suitable for recommendation, the temperature, precipitation/snowfall, wave height, season, holiday, and day/night values must be divided into appropriate units, and rows must be merged on those units. The number of merged rows is recorded so that the basis and reliability of each merged value can be judged, which creates a merged-column. As a result, the recommendation is processed on the merged-column, and the merged-column is updated periodically as the evaluation index is updated. The final generated merged-column is shown in Table 2.

Table 2  The final generated merged-columns

4.5 Recommendation

Prior to the recommendation, the server provides the mobile platform used by the recommendation service consumer with a list of tourist locations and their coordinates. The mobile device extracts the tourist attractions within the radius the user is willing to travel, using Euclidean distance, and requests the merged-column for those tourist sites from the server. The server then passes the data merged for the current season and for whether the day is a holiday.

As the number of tourist sites served increases, the distance calculation takes more and more time. Therefore, we propose a scheme that addresses this problem with a hierarchical location database architecture (Han et al. 2017). When there is no initial database, the mobile platform is used to build it: when a mobile platform sends the list of tourist sites it requests for the user’s location and distance radius, the server stores the list under the index corresponding to that location and radius. This way, once data have accumulated, the server can instantly send the list to any mobile platform that requests it under the same conditions. To do this, there must be compartments within which requests are merged and judged; the boundary map of administrative districts available from public portals can serve as these compartments (Fig. 10).

Fig. 10  Hierarchical location database architecture

The merged-column provided by the server is merged only by season and holiday/weekday, so it must still be filtered by the weather conditions of the service consumer. This work is done on the mobile platform. The necessary weather information can be received through an open API from the Korea Meteorological Administration; the received weather information is classified according to the specified units and used to filter the merged-column.

In the future, it will be possible to recommend the best tourist route, taking the best sightseeing hours into account, by calculating the rating variance of the tourist sites according to weather changes and day/night changes, as shown in Fig. 11.

Fig. 11  Graph of rating variation calculated in mobile environment (x: time, y: rating score; legend: gongserichurch, dogoparadise, pinnacleland)

The combined merged-column also includes the categories provided in the tourism data, as shown in Fig. 12 (in order: name, famous mountains, beaches, docks, rivers/valleys/lakes, camping/trekking/experience, landscapes, sanctuaries, religious significance/Buddhist temple/sacred places, old houses/folk villages, exhibitions/sightseeing, recreation/hot springs, specialized sightseeing areas). This can be used to select attractions of a desired theme, and it will be possible to recommend better travel routes by applying the sightseeing time needed for each category.

Fig. 12  Tourism data table with categories

5 Conclusion

If tourism sites can be evaluated under various environmental factors such as timing, season, and weather, the result can serve as important information for the tourism industry and lead toward better sightseeing and travel recommendations. However, because the platforms that currently accumulate site reviews were not considered to have enough data, we proposed using a platform that does not include ratings but has much more content and reviews.

Due to the characteristics of social networking data, there is a chance that the time of user-posted messages [...] inappropriate samples such as advertising posts could be collected. However, collecting metadata such as the EXIF data of photos on Instagram is prohibited, so an approach that uses social engineering, or that filters inaccurate data based on users’ past behavior, is deemed necessary (Aghababaei and Makrehchi 2017).

As previously noted, the information provided in the meteorological data differs with the characteristics of the observatory; for example, wave height is not always included. Therefore, it is considered necessary to combine the data of a sea-based observatory during the spatial join for tourist sites on the beach.

In this paper, we proposed reducing the burden on the server and maintaining high bandwidth by using edge computing to move the operations of the recommendation process, including the complex computations, onto the mobile platform. However, a large amount of merged-column data must still be sent, and some operations are still imposed on the server when returning the recommendation result.

To solve this problem, it is necessary to update the entire list on the server side when tourist sites are added, or to perform more complicated operations. However, this may inevitably result in a temporary interruption of service or a decrease in operation speed.

Acknowledgements This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIT) (No. NRF-2017R1A2B4008886).

References

Aghababaei S, Makrehchi M (2017) Activity-based Twitter sampling for content-based and user-centric prediction models. Hum-centric Comput Inf Sci 7(1):3
Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In: Proceedings of COMPSTAT’2010. Springer, pp 177–186
Dong S, Zhang X, Li Y (2018) Microblog sentiment analysis method based on spectral clustering. J Inf Process Syst 14(3):727–739
Duneja A, Puyalnithi T et al (2018) Analysis of inter-concept dependencies in disease diagnostic cognitive maps using recurrent neural network and genetic algorithms in time series clinical data for targeted treatment. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-018-1116-5
and the actual time of their visits are not congruent or that


Han YH, Lim HK, Gil JM (2017) Hierarchical location caching scheme for mobile object tracking in the internet of things. J Inf Process Syst 13(5):1410–1429
Je HW, Kim JW, Yi MY (2017) Deep AutoEncoder based personalized recommendation system: considering user’s intrinsic characteristics. Korea Inf Sci Soc 2017(6):773–775
Kim Y (2014) Convolutional neural networks for sentence classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). pp 1746–1751
Kim KR, Jeong YS, Moon NM (2013) Social Network Community data based Modeling of User Types for Personalized Service. Proceedings of the KIBME 2013 Conference 165–166
Liu B (2018) Text sentiment analysis based on CBOW model and deep learning in big data environment. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-018-1095-6
Noh Y, Oh YH, Park SB (2014) A location-based personalized news recommendation. In: 2014 International Conference on Big Data and Smart Computing (BIGCOMP). IEEE, pp 99–104
Park KS, Moon NM (2012) Multidimensional optimization model of music recommender systems. KIPS Trans Part B 19:155–164
Ramos J (2003) Using tf-idf to determine word relevance in document queries. In: Proceedings of the first instructional conference on machine learning. Piscataway, NJ, pp 133–142
Scott D, Lemieux C (2010) Weather and climate information for tourism. Procedia Environ Sci 1:146–183
Song HJ, Kim JA, Lee SM, Moon NM (2016) A Study on User’s Purchasing Pattern based on text mining and location awareness for T-Commerce. Proceedings of the KIBME 2016 Conference 134–136
Wang Y, Kim KT, Lee BJ, Youn HY (2018) Word clustering based on POS feature for efficient twitter sentiment analysis. Hum-centric Comput Inf Sci 8(1):17
Zhang Y, Wang Q, Li Y, Wu X (2018) Sentiment classification based on piecewise pooling convolutional neural network. Comput Mater Contin 56:285–297

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
