User's Next Location Prediction Using ML Algo
All content following this page was uploaded by Alireza Hamoudzadeh on 17 April 2022.
Abstract Reliance on smartphones has increased tremendously in the last decade, and this increase yields a large amount of data to investigate. Furthermore, the ability to predict a user's next location plays a vital role in Location-Based Services and recommender systems. This paper seeks to predict the user's next location from their spatial background using machine learning methods such as Artificial Neural Networks and classification methods such as K-Nearest Neighbors (KNN), Support Vector Machine, and Decision Tree. The most suitable method is then chosen by comparison. The data used in this research come from active users in the Geolife dataset for the city of Beijing. The best result is obtained from the Weighted K-Nearest Neighbors (KNN) method, with an overall accuracy of 91.98 percent. Additionally, the dependency between observed data and model prediction is determined. A concept called Routineness is also introduced, which shows the predictability and anomalies of each user's behavior, based on the difference in prediction methods from their spatial and temporal background. A comparison is also made with similar studies that used the same data to evaluate the sufficiency of the methods. The computation shows that the proposed method predicts behavior 2.72% more accurately than similar ones.

Keywords Machine Learning algorithms · Trajectory · Geospatial Information System (GIS) · Behavior recognition · Location prediction

Saeed Behzadi
[email protected]
Alireza Hamoudzadeh
[email protected]
1 Department of Surveying Engineering, Faculty of Civil Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran

1 Introduction

People have certain routines in their daily life, and these routines repeat daily, weekly, and monthly (e.g. a bank employee with a 9–17 job is at the bank from Monday to Friday), although there are exceptions such as sickness and holidays. Still, user activities on Foursquare show that people demonstrate different patterns at the weekend compared to the rest of the week [1]. According to previous studies [2, 3], most people have low-entropy lives, which means their lives have long-term regularity. This long-term regularity makes it possible to understand and extract meaningful patterns from their lives.

Typically, the positioning technology for smartphones can be divided into GPS, Cellular, and Wi-Fi positioning [4]. The places people go also carry a lot of information, which can be extracted and used in various ways, such as providing more applicable advertisements based on their personalities. Consequently, a plethora of publicly available data is generated from platforms like Foursquare and Twitter, and can potentially reflect the behavior of millions of citizens at a remarkable level of detail [5].

Discovering mobility patterns is vital for predicting the next location, and it cannot be done without the user's spatial background, since pattern discovery and prediction are closely related and repetitive tasks [6, 7]. When it comes to prediction, two types of approaches are usually considered. The first is the Markov model, and the second is the decision tree, both of which are appropriate; however, few studies have used ANNs for this purpose.

Alharbi and de Doncker [8] used a Convolutional Neural Network to analyze user behaviors on Twitter. Monreale et al. [9] presented the WhereNext method, which predicted the next location based on a decision tree. Oh [10] also used a decision tree method to predict users' locations from their past movement patterns in an indoor environment, with data from only four users. The prediction model of Ying et al. [11] was based on a cluster-based prediction strategy. Their model evaluates the next location of a mobile user based on the frequent behaviors of similar users in the same cluster, which is determined by analyzing users' common behavior in semantic trajectories. A method called DestPD was proposed by Yang et al. [12] to overcome problems such as heavy computation and data sparsity. This method consisted of two phases, offline training and online prediction. They also proposed two data constructs, Efficient Transition Probability (ETP) and Transition Probabilities with Detours (TPD), to improve the efficiency of matrix multiplication.

Ashbrook and Starner [13] presented a system that automatically clusters GPS data into meaningful locations, which are then used in a Markov model to predict the user's next location. Similarly, Do and Gatica-Perez [14] used various personalized Markov models and developed a framework to predict the user's future location and applications for the next 10 min. Likewise, Lu et al. [15] used a hidden Markov model, with the major difference that it considered traffic times along with a few other factors. Moreover, Chen et al. [16] used three basic Markov models for three tasks: a Global Markov Model that uses all available trajectories to discover global behaviors, a Personal Markov Model that focuses on mining the individual patterns of each moving object, and a Regional Markov Model that clusters the trajectories to mine similar movement patterns. He et al. [17] used a transition graph to support efficient concatenation of sub-trajectories in order to tackle the sparsity issue. They also developed a novel similarity metric between two sets of trajectories to validate whether the reconstructed trajectory set represents the original traces well. Du et al. [18] proposed a Continuous Time Series Markov Model (CTS-MM) to predict the next location of users instantly. Herder et al. [19] also used a Markov model, among five different approaches (top visited places, last visited places, hours spent, closest locations, and a simple Markov model), to predict the next visited location. Additionally, Gambs et al. [20] created a mobility model called Mobility Markov Chain (MMC) that incorporates the n previously visited locations, and developed an algorithm for next location prediction based on this model; their best result was obtained with n = 2.

Li et al. [21] presented a framework, referred to as hierarchical-graph-based similarity measurement (HGSM), for Geographic Information Systems (GIS) to consistently model each individual's location history and effectively measure the similarity among users. With the same purpose, Xiao et al. [22] used users' GPS trajectories with a Semantic Location History (SLH) to determine users' interests and understand the similarity among users beyond geographic positions. Chen et al. [23] proposed a new model called Human Mobility Representation Model (HMRM) to predict the users' next location and simultaneously produce vector representations of the data, which consisted of user ID, location ID, timestamp, and activity type.

In this paper, five different Machine Learning Algorithms and five Artificial Neural Networks are created and tested to determine which one is best suited to predicting users' next location. This paper also proposes a concept called Routineness, which represents the predictability of users based on their spatiotemporal background.

2 Preprocess and procedure

2.1 Data

The data used in this research is from the Geolife project [24, 25]. The data was gathered from 182 users over a period of more than 5 years, from Apr-2007 to Aug-2012. The data used for this research covers 20 users in the city of Beijing. The Beijing land-use and road data used in the present research were obtained from Open Street Map.
Code 1 2 3 4 5 6 7 8 9
Name Water resources Food Military Natural Built-Up Governmental Sport Services Recreational
Code 10 11 12 13 14 15 16 17 18
Predicting user’s next location using machine learning algorithms
2.2 Data preparation

Land-use data that users may visit are grouped into the 18 classes listed in Table 1 (e.g. a pet shop and a café fall into two different classes: the first is in the commercial class (No. 4) and the second is in the food class (No. 11)).

Next, time and coordinates are extracted from the user GPS data and merged, and the coordinates are then matched with the land-use data. The Technical Standards of Highway Engineering are used for the new construction and rehabilitation of roads. According to these standards, roads are divided into five categories based on their function and traffic volumes. Three of them (Class I, Class II, and Class III) lie inside the city and are used in this research, based on the Development Division of the United Nations Economic and Social Commission for Asia and the Pacific (Department of Highways, Ministry of Communication). The urban road widths are given in Table 2.

Table 2 Road width in Beijing
Type         Width (m)
Class I      25.5
Class II     12
Class III    8.5

A buffer is then created around the roads based on the China road standards, so that the time during which the user is definitely on the road can be calculated. For land-use classes in point format (collected by users), the land-use point nearest to each trajectory point is found, subject to a threshold of 20 meters. For land-use classes in polygon format, a point-in-polygon analysis is performed. The timestamp is also converted into the year, the month, the day of the week, and a fraction between 0 and 1 that represents the time of day from 0 to 24 h (e.g. 0.5 equals 12:00 PM). As a result, two types of data are represented: h for the time and p for the location.

Figure 1 and Table 3 show one user's movements and the nodes captured in each land-use class. For example, besides the roads, this user's most visited classes are educational, residential, and commercial. Moreover, this user never visited any place in the agricultural, religious, industrial, or water resources classes.

The likelihood of an event occurring at two close times is greater than at two distant times. Equation 1 states that the probability of observing similar land-uses at two close times is greater than at two times further apart (m > 1):

$$P(p_{h_n})\,P(p_{h_{n+m}}) < P(p_{h_n})\,P(p_{h_{n+1}}) \tag{1}$$

Figures 2 and 3 show the user density heat plot and scatter plot over 160 days.

Data sparsity is another challenge in this research. The user data is sometimes unavailable, so some rules are set to fill the blank intervals left by GPS disconnections. An algorithm with basic rules is created to handle GPS disconnections inside buildings. For gaps shorter than an hour, when the last land-use before the disconnection and the first one after it are the same, the blank interval is filled with the same amenity. A similar rule is created to handle the disconnection during sleep at night.
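Two of the preparation steps above, the timestamp encoding and the gap-filling rule for disconnections shorter than an hour, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the segment representation `(start, end, land_use_code)`, and the ISO day-of-week convention (Monday = 1) are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def encode_timestamp(ts):
    """Return (year, month, day_of_week, day_fraction) for a datetime.

    day_of_week uses ISO numbering (Monday = 1), which is an assumption;
    day_fraction maps 00:00-24:00 onto the interval [0, 1).
    """
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    return ts.year, ts.month, ts.isoweekday(), seconds / 86400.0


def fill_short_gaps(visits, max_gap=timedelta(hours=1)):
    """Merge consecutive segments separated by a short GPS gap.

    visits is a time-ordered list of (start, end, land_use_code) tuples
    (a hypothetical representation). When the land-use before and after
    a gap is the same and the gap is shorter than max_gap, the gap is
    filled by extending the earlier segment.
    """
    if not visits:
        return []
    merged = [visits[0]]
    for start, end, code in visits[1:]:
        prev_start, prev_end, prev_code = merged[-1]
        if code == prev_code and (start - prev_end) < max_gap:
            merged[-1] = (prev_start, max(prev_end, end), prev_code)
        else:
            merged.append((start, end, code))
    return merged
```

Gaps of an hour or longer, or gaps bounded by two different land-use classes, are left untouched in this sketch; the night-time rule mentioned above is not modeled because the text does not specify it.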
Node counts 38,293 3840 15,941 5717 70 131,289 1073 138 910 640 107 169 104 2258
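Node counts such as these come from matching each trajectory point to a land-use class. For point-format land-use data, Sect. 2.2 describes a nearest-point search with a 20 m threshold; a minimal sketch of that rule is given below. The haversine distance, the spherical Earth radius, and the names `haversine_m` and `match_land_use` are assumptions for illustration, since the paper does not state how distances were computed.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def match_land_use(point, land_use_points, threshold_m=20.0):
    """Return the class code of the nearest land-use point, or None.

    point is (lat, lon); land_use_points is a list of (lat, lon, code).
    Only candidates closer than threshold_m are accepted.
    """
    best_code, best_dist = None, threshold_m
    for lat, lon, code in land_use_points:
        d = haversine_m(point[0], point[1], lat, lon)
        if d < best_dist:
            best_code, best_dist = code, d
    return best_code
```

A linear scan is used here for clarity; with many land-use points a spatial index would be the more practical choice.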
2.3 Method

An Artificial Neural Network (ANN) is created to understand user behaviors based on the context data (h, p). ANNs are easy to build and perform well under non-linear circumstances. A network with one hidden layer of 5, 10, 15, 20, or 25 neurons was trained for pattern recognition using Scaled Conjugate Gradient (SCG) backpropagation with random data division. By using a Levenberg–Marquardt approach to scale the step size, SCG avoids the line search in each learning iteration [26]. 25% of the data is set aside for testing and does not take part in the training process. The same data is used for all of the machine learning methods trained in this paper, so that their differences can be assessed under the same criteria. The results change slightly with a different random selection. Figure 4 shows a schema of the network's structure and the analysis process.

[Fig. 4: schema of the network's structure and the analysis process; surviving labels: Network, Σ, Result, Test Data]

To understand and predict the next location of each user, the problem must be analyzed in two aspects: the spatial aspect (p) and the temporal aspect (h). The spatial aspect (p) is used to predict the users' future location, because the ANNs need to understand the location. The locations were therefore transformed into land-use classes (Table 1). The temporal aspect (h) reflects that users' trajectory patterns occur with strong periodicity. The pattern of users' visited places depends on the time of day, the day of the week, and the month of the year. The timestamp is converted into these components so that the ANNs can capture this periodicity with higher accuracy.

The time each user spends in each land-use is extracted using spatial analysis methods (nearest neighbor, buffer, point in polygon, etc.), and the result shows when and where each user is. Consequently, the ANN can learn from a combination of modified timestamps and land-use classes to predict where the user will be at any time.

The output of this process is a set of networks that receive a time and predict where the user will be spending his/her time at that particular moment. Networks give different responses for different users at the same timestamp (e.g. one user might spend the afternoon at home while another goes to the gym).

Data are fed into every ANN in vector form, and each network is exported and tested with the test data. The results are promising and are given in Table 4.

Four supervised machine learning algorithms are also used to predict the user's next location: a Support-Vector Machine (SVM), a decision tree, and two KNNs with different settings.

SVMs are particular linear classifiers that are based on the margin maximization principle [27]. They perform structural risk minimization, which controls the complexity of the classifier with the aim of achieving excellent generalization performance. The SVM accomplishes the classification task by constructing the hyper-plane in a higher dimensional space that optimally separates the data into two categories [28]. A fine Gaussian SVM is used in this research. This method makes finely detailed distinctions among classes, with the kernel scale set to $\sqrt{p}/4$, where p is the number of predictors.

Decision trees are a graphical representation of the choices in a decision-making process. A tree is a composition of nodes, branches, and endpoints; each node represents a point at which a decision has to be taken. The branches emanating from nodes are the alternatives from which a choice can be selected. Each endpoint of the tree has an associated value, which is the pay-off from reaching that endpoint [29]. A fine tree is used, which has numerous leaves to make many fine distinctions among classes (the maximum number of splits is 100).

K-Nearest Neighbor is one of the simplest Machine Learning Algorithms and, depending on the problem, one of the most effective. The original W-KNN forecasting algorithm was developed and introduced by Troncoso et al. in 2007 [30]. Two types of KNN are used in this article.

Fine KNN: A nearest neighbor classifier that makes finely detailed distinctions among classes, with the number of neighbors set to 1.

Weighted KNN: Makes medium distinctions among classes, using a distance weight. The number of neighbors is set to 10.

After constructing the machine learning algorithms, the same data used for the ANNs is used for these algorithms, but in a non-vector mode. The test data are likewise in a non-vector mode.

After constructing the networks, we enter the predictors and the responses, and the training process starts. The training procedure of these methods is similar to that of the ANNs. After the training process, the result for each test sample is a matrix with 18 rows. Each row shows the probability of the user's presence in that type of land-use. The row with the highest value is taken as the class that the user will most likely visit. This predicted class is then compared with the actual response class. If the predicted class and the actual class are the same, the counter variable is increased by 1; for a wrong answer, nothing changes. The accuracy is the counter divided by the total number of test samples. The algorithm's pseudo code is shown in Algorithm 1.
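The evaluation procedure described above (an 18-row score vector per test sample, the highest row taken as the prediction, and a counter of correct answers) can be sketched together with a distance-weighted KNN using 10 neighbors. This is a hedged illustration in plain Python, not a reproduction of Algorithm 1 or of the toolbox the authors used: the inverse-distance weighting, the Euclidean metric, the unscaled feature tuples, and the names `weighted_knn_scores` and `accuracy` are assumptions.

```python
import math

N_CLASSES = 18  # land-use classes are coded 1..18 (Table 1)


def weighted_knn_scores(train_X, train_y, x, k=10):
    """Return 18 normalized class scores for the query vector x.

    Each of the k nearest training samples votes for its class with a
    weight of 1/distance (an assumed weighting scheme).
    """
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    scores = [0.0] * N_CLASSES
    for d, label in dists[:k]:
        scores[label - 1] += 1.0 / (d + 1e-9)  # small term avoids 1/0
    total = sum(scores)
    return [s / total for s in scores]


def accuracy(train_X, train_y, test_X, test_y, k=10):
    """Fraction of test samples whose top-scoring class is correct."""
    counter = 0
    for x, actual in zip(test_X, test_y):
        scores = weighted_knn_scores(train_X, train_y, x, k)
        predicted = scores.index(max(scores)) + 1  # row index -> class code
        if predicted == actual:
            counter += 1  # a wrong answer leaves the counter unchanged
    return counter / len(test_y)
```

Setting `k=1` in this sketch gives behavior comparable to the fine KNN described above, since a single neighbor decides the class regardless of its weight.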
[Figure: prediction accuracy (vertical axis, 0.5 to 1) by user number (horizontal axis, 1 to 20)]
Fig. 6 Prediction accuracy for two different users. a User 96's prediction accuracy for each method, b user 167's prediction accuracy for each method
[Figure: accuracy (vertical axis, 0.7 to 0.95) by user number (horizontal axis, 0 to 21)]
References

1. Noulas, A., et al. (2011). Exploiting semantic annotations for clustering geographic areas and users in location-based social networks. The Social Mobile Web, 11(2), 1–4.
2. Eagle, N., & Pentland, A. (2006). Reality mining: Sensing complex social systems. Personal Ubiquitous Computing, 10(4), 255–268.
3. Patrick, C. (2005). Life patterns: Structure from wearable sensors.
4. Li, F., et al. (2019). Exploiting location-related behaviors without the GPS data on smartphones. Information Sciences, 527, 444–459.
5. Roick, O., & Heuser, S. (2013). Location based social networks – definition, current state of the art and research agenda. Transactions in GIS, 17(5), 763–784.
6. Gonzalez, M. C., Hidalgo, C. A., & Barabasi, A.-L. (2008). Understanding individual human mobility patterns. Nature, 453(7196), 779.
7. Clauset, A., & Eagle, N. (2007). Persistence and periodicity in a dynamic proximity network. In Proc. DIMACS, 2007.
8. Alharbi, A. S. M., & de Doncker, E. (2019). Twitter sentiment analysis with a deep neural network: An enhanced approach using user behavioral information. Cognitive Systems Research, 54, 50–61.
9. Monreale, A., et al. (2009). WhereNext: A location predictor on trajectory pattern mining. In Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. Paris: ACM (pp. 637–646).
10. Oh, S.-C. (2012). Using an adaptive search tree to predict user location. Journal of Information Processing Systems, 8, 437–444.
11. Ying, J. J.-C., et al. (2011). Semantic trajectory mining for location prediction. In Proceedings of the 19th ACM SIGSPATIAL international conference on advances in geographic information systems. Chicago: ACM (pp. 34–43).
12. Yang, Z., et al. (2018). An efficient destination prediction approach based on future trajectory prediction and transition matrix optimization. In IEEE transactions on knowledge and data engineering (pp. 1–1).
13. Ashbrook, D., & Starner, T. (2003). Using GPS to learn significant locations and predict movement across multiple users. Personal Ubiquitous Comput., 7(5), 275–286.
14. Do, T. M. T., & Gatica-Perez, D. (2014). Where and what: Using smartphones to predict next locations and applications in daily life. Pervasive and Mobile Computing, 12, 79–91.
15. Lu, K., Hsu, C., & Yang, D. (2010). A novel approach for efficient and effective mining of mobile user behaviors. In 2010 4th international conference on multimedia and ubiquitous engineering.
16. Chen, M., Yu, X., & Liu, Y. (2015). Mining moving patterns for predicting next location. Information Systems, 54, 156–168.
17. He, D., et al. (2019). Efficient and robust data augmentation for trajectory analytics: A similarity-based approach. World Wide Web, 23, 361–387.
18. Du, Y., et al. (2018). A geographical location prediction method based on continuous time series Markov model. PLoS ONE, 13(11), e0207063.
19. Herder, E., Siehndel, P., & Kawase, R. (2014). Predicting user locations and trajectories (vol. 8538, pp. 86–97).
20. Gambs, S., Killijian, M.-O., & Cortez, M. N. D. P. (2012). Next place prediction using mobility Markov chains. In Proceedings of the first workshop on measurement, privacy, and mobility. Bern: Association for Computing Machinery (p. Article 3).
21. Li, Q., et al. (2008). Mining user similarity based on location history (p. 34).
22. Xiao, X., et al. (2010). Finding similar users using category-based location history (pp. 442–445).
23. Chen, M., et al. (2020). Modeling spatial trajectories with attribute representation learning. In IEEE transactions on knowledge and data engineering (pp. 1–1).
24. Zheng, J., & Ni, L. (2012). An unsupervised framework for sensing individual and cluster behavior patterns from human mobile data (pp. 153–162).
25. Zheng, Y., et al. (2009). Mining interesting locations and travel sequences from GPS trajectories (pp. 791–800).
26. Møller, M. F. (1993). A scaled conjugate gradient algorithm for supervised learning. Neural Networks, 6(4), 525–533.
27. Jafarian, H., & Behzadi, S. (2020). Evaluation of PM2.5 emissions in Tehran by means of remote sensing and regression models. Pollution, 6(3), 521–529.
28. Adankon, M. M., & Cheriet, M. (2009). Support vector machine. In S. Z. Li & A. Jain (Eds.), Encyclopedia of biometrics (pp. 1303–1308). Boston: Springer.
29. Black, J., Hashimzade, N., & Myles, G. D. (2009). A dictionary of economics. Oxford: Oxford University Press.
30. Troncoso, A., et al. (2007). Electricity market price forecasting based on weighted nearest neighbors techniques. IEEE Transactions on Power Systems, 22, 1294–1301.

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.