IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 71, 2022 3512510

Stacking Ensemble Learning for Non-Line-of-Sight Detection of Global Navigation Satellite System
Yuan Sun and Li Fu

Abstract: While the global navigation satellite system (GNSS) has been widely used to provide high-precision location services in many applications, it usually suffers from performance degradation due to non-line-of-sight (NLOS) reception. As the received NLOS signals might have large measurement errors, especially in urban canyons, they should be detected to keep these errors from contaminating the positioning system. However, NLOS detection is quite challenging, as the accuracy rate is usually highly related to the surrounding environment in which the receiver is located. To address this problem, we propose a stacking ensemble learning (SEL) method for the NLOS detection of GNSS. First, satellite measurement features are extracted from the GNSS raw measurements via a designed data processing module. Then, they are input to the SEL module consisting of two levels of machine learning models. In the first level, a support vector machine (SVM) and an extreme gradient boosting (XGBoost) model are adopted in parallel, and the outputs of the first-level models are input to the second-level logistic regression (LR) to obtain NLOS predictions. The proposed SEL module combines the views of different models on the measurement features to address the shortcomings of each single model and improve the model's generalization. Experimental results on real GNSS observations in urban canyons show that the proposed method outperforms the baseline machine learning methods with obvious detection accuracy improvements.

Index Terms: Ensemble learning, global navigation satellite system (GNSS), machine learning, non-line-of-sight (NLOS).

Manuscript received February 9, 2022; revised March 27, 2022; accepted April 19, 2022. Date of publication April 28, 2022; date of current version May 9, 2022. This work was supported in part by the National Natural Science Foundation of China under Grant 61803037. The Associate Editor coordinating the review process was Dr. Alessio De Angelis. (Corresponding author: Yuan Sun.)
Yuan Sun is with the School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China (e-mail: [email protected]).
Li Fu is with JD AI Research, Beijing 100176, China (e-mail: [email protected]).
Digital Object Identifier 10.1109/TIM.2022.3170985
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

I. INTRODUCTION

Recently, the global navigation satellite system (GNSS) has been playing an increasingly important role in a wide range of applications, such as intelligent transportation system (ITS) [1], [2], location-based service (LBS) [3], and artificial intelligence of things (AIoT) [4]. Nevertheless, GNSS positioning could exhibit a serious error caused by the notorious non-line-of-sight (NLOS) reception [5]–[7], especially in urban environments, where the direct line-of-sight (LOS) signal is blocked and the signal is received only via reflections. As the reflected GNSS signal travels a longer path than the corresponding direct LOS signal, which the user does not receive, it introduces a bias in the GNSS pseudorange measurement and causes a significant performance degradation [8]. Thus, to improve the accuracy of GNSS receivers, NLOS signals should be countered for the positioning system.

To address this issue, a natural idea is to detect NLOS signals among all GNSS signals and then eliminate them prior to the position calculation. However, the NLOS detection of GNSS is always a challenging problem, because it is closely related to the surroundings of the user [9]. On the one hand, the environment might be complicated, with different structures of buildings or trees, while NLOS signals might vary in different environments and can be difficult to model for detection. On the other hand, unlike the multipath effect [10], in which the GNSS user receives both reflected and direct signals at the same time, the user under NLOS reception does not receive the direct LOS signal of a satellite but only its reflection. Thus, it is challenging to detect NLOS without a reference of the corresponding LOS signal.

A variety of studies on the NLOS detection of GNSS have been conducted in the navigation domain. Typically, GNSS measurement features [11] that consist of raw measurements and/or the quantities calculated from the measurements are used for NLOS detection. Related work can be divided into three categories according to how the measurement features are used.
1) A single measurement feature of GNSS signals is used to detect NLOS.
2) A combination of multiple measurement features of GNSS signals is used to detect NLOS.
3) A combination of multiple measurement features of GNSS signals and other sensors is used to detect NLOS.

As for the first category, simple satellite measurements are usually adopted to directly detect NLOS signals via comparison with an empirical threshold. The measurements include carrier-to-noise ratio $C/N_0$ and satellite elevation angle [12]. However, strategies based on such simple measurements may fail, as NLOS signals might not follow the expected behavior. For example, a strong reflection of GNSS signals with high $C/N_0$ will result in missed detections, and satellites with low elevation angles might not be blocked by surrounding buildings [13].

To address the shortcomings of a single measurement, the second category of studies focuses on using multiple satellite measurements to better distinguish NLOS and LOS signals. Considering the excellent performance of machine learning

in detection and classification tasks, more work on how to use machine learning to improve the detection performance of NLOS has been proposed. Hsu [14] applied a support vector machine (SVM) to an LOS/NLOS classification task based on multiple measurement features, including the difference between delta pseudorange and pseudorange rate. Sun et al. [15] used the three measurement features of $C/N_0$, pseudorange residuals, and satellite elevation angle with a gradient boosting decision tree (GBDT)-based classification algorithm and achieved significant improvement in NLOS detection rate. Zhang et al. [16] compared different machine learning methods, including SVM, K-nearest neighbors (KNN), neural network (NN), and decision tree (DT), on NLOS detection. In their work, the experimental results showed that the SVM outperformed other methods in different urban scenarios. Xu et al. [11] also extended the measurement features for SVM using signal-to-noise ratio (SNR), pseudorange, elevation angle, and so on.

Regarding the methods of the third category, extra sensors are applied as aids to improve the performance of GNSS NLOS detection. For example, a fish-eye camera was applied to detect the borderline between the sky and the obstacles from the colored fish-eye image to exclude NLOS satellites [17]. A fish-eye camera was also adopted to generate a visibility mask to improve the detection of NLOS [18]. Another method uses 3-D light detection and ranging (LiDAR) to provide the surrounding environment obstacles to the user and detect the NLOS signal [19], [20]. However, the performance of these methods relies on image processing, which might be unstable under varying illumination or weather conditions.

Besides the detection scheme, NLOS mitigation is another widely studied solution, which aims to directly reduce the GNSS positioning errors caused by NLOS reception. The existing mitigation methods for GNSS signals can be divided into hardware-based methods and methods based on data processing [5]. First, as for the hardware-based methods, the choke-ring antenna-based method is usually used to give low gains to low-elevation satellites and mitigate the effect of reflected GNSS signals [21]. However, as noted in [22], the method exhibits little protection against reflected signals with higher elevation. Receiver-based methods also belong to the hardware-based methods, such as the delay lock loop [23], [24], which separates LOS and reflected signals via a feedback loop. Nevertheless, the method might suffer from performance degradation when the direct LOS signal is blocked under NLOS reception. Second, considering the high cost and inconvenience of hardware updating, NLOS mitigation methods based on data processing attract more attention in the navigation domain. For example, the measurements of $C/N_0$ and elevation angle were used for weighted positioning by downweighting the effect of NLOS signals [25], [26]. However, as mentioned for the first category of the detection methods, the performance might be unstable, because these two measurements of NLOS signals vary greatly in different environments. Alternatively, based on the assumption that NLOS measurements produce a less consistent pseudorange residual or navigation solution, consistency checking techniques were explored for NLOS mitigation [27]–[29]. However, the performance of consistency checking degrades when a large proportion of the signals are NLOS or multipath contaminated [22].

Comparing the mitigation and detection methods for GNSS NLOS mentioned earlier, the main advantage of detection methods is that they can eliminate the contaminated measurements prior to the position calculation. Ideally, with a much larger choice of signals from multi-constellation GNSS, optimal position results might be obtained by selecting only those signals least contaminated by NLOS and excluding the rest [25].

However, in the existing works on NLOS detection, more effort is spent on measurement feature selection in GNSS signals or other sensors, and the machine learning model adopted to detect NLOS is usually a simple one. In practice, these existing methods using a single model may fall into a locally optimal solution [30]. As the NLOS detection of GNSS is highly dependent on the surroundings, the existing machine learning models might not generalize well to the environment. To alleviate this problem, stacking ensemble learning (SEL) has shown great potential by blending different, heterogeneous base models with particular parameters to reduce the bias of each single model and decrease the generalization error [31]. In this article, we propose an SEL method for the NLOS detection of GNSS to further improve the performance of GNSS positioning. The proposed SEL consists of two levels of different machine learning models. It comprehensively considers the processing results of different first-level machine learning models on all of these features and makes the final decision of NLOS detection via a second-level machine learning model. The main advantage of SEL is that it combines different machine learning methods from different views to address the shortcomings of each single model. Experimentally, for NLOS detection tasks of GNSS, the proposed SEL significantly improves the detection accuracy in comparison with the baseline machine learning methods.

The main contributions of our work are as follows.
1) To the best of our knowledge, this is the first work using SEL for the NLOS detection of GNSS.
2) We propose a new SEL method for the NLOS detection of GNSS to fuse different models' advantages on detection tasks.
3) We evaluate the effectiveness of our method with real GNSS observation data, and our method significantly outperforms the baseline machine learning methods with obvious detection accuracy improvements.

The remainder of this article is organized as follows. Section II details the proposed SEL for NLOS detection of GNSS, including the descriptions of measurement features used for detection and the proposed SEL method. Section III shows the experimental results and discussion. Finally, the conclusions and future work are given in Section IV.

II. PROPOSED METHOD

In this section, the proposed SEL for NLOS detection of GNSS is explained in detail. First, the system architecture of the proposed method is presented. Then, the data process

Fig. 1. System architecture of the proposed SEL for NLOS detection of GNSS, which mainly consists of the data processing and SEL.
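The data flow of Fig. 1 after feature extraction can be sketched in a few lines of Python. This is only an illustration of the two-level structure described in Section II (sigmoid-normalized SVM score, concatenation with the XGBoost probability, LR on top, rounding to 0 or 1). The function names and all numeric weights below are hypothetical, and the XGBoost probability is passed in as a plain number rather than computed by a trained model.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def svm_probability(features, w, b, sigma=1.0):
    """Linear SVM score (f / sigma)^T w + b, squashed by a sigmoid."""
    score = sum((f / sigma) * wi for f, wi in zip(features, w)) + b
    return sigmoid(score)


def second_level_predict(p_svm, p_xgb, u, v):
    """LR on the concatenated first-level probabilities [p_svm, p_xgb]."""
    z = u[0] * p_svm + u[1] * p_xgb + v
    p_nlos = sigmoid(z)
    return p_nlos, int(p_nlos >= 0.5)  # 1 = NLOS, 0 = LOS


# Hypothetical numbers for one satellite (not trained parameters).
p_svm = svm_probability([0.2] * 9, w=[0.5] * 9, b=-0.4)
p_nlos, label = second_level_predict(p_svm, p_xgb=0.8, u=(2.0, 2.0), v=-2.0)
```

In a real pipeline the first-level probabilities would come from trained models; the sketch only shows how their outputs are combined into one decision per satellite.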

of measurement feature extraction is designed for machine learning. Finally, the ensemble learning method is proposed for GNSS NLOS detection.

A. System Architecture

The goal of this article is to develop an NLOS detection method for GNSS measurements of a receiver. To achieve this, a method based on SEL is proposed, which combines different machine learners to obtain better prediction results than each individual model. The system architecture of the proposed SEL for NLOS detection of GNSS is shown in Fig. 1, which mainly consists of two modules, i.e., data processing and SEL, as follows.

First, the received GNSS raw measurements $\{x_i \mid i \in N\}$ of a receiver are input to the data processing module, with $N$ the number of received GNSS satellites at a sampling time. In this article, the extracted measurement features $\mathbf{f} \in \mathbb{R}^9$ of an observed satellite form a 9-D vector, which consists of pseudorange $p \in \mathbb{R}$, SNR $s_n \in \mathbb{R}$, elevation angle $e \in \mathbb{R}$, azimuth angle $a \in \mathbb{R}$, pseudorange residual $p_r \in \mathbb{R}$, pseudorange rate consistency $p_{rc} \in \mathbb{R}$, and satellite positions $\mathbf{s} \in \mathbb{R}^3$. Further features will be investigated in our future work.

Then, the measurement features of each satellite are input to the proposed SEL module, which consists of two levels of different machine learning models. In practice, there are many individual models that can be adopted in the first and second levels of the SEL framework.¹ Empirically, in the first level, we consider three state-of-the-art machine learning models, i.e., SVM, extreme gradient boosting (XGBoost), and random forest (RF) [32]. All of these models are widely and successfully used in many real applications. As for the second level, we experiment with the widely used logistic regression (LR) and XGBoost to ensemble the prediction outputs of the first level. To balance the final accuracy against the computational cost, our method uses two individual first-level machine learning models, i.e., SVM [33] and XGBoost [34]. Then, the prediction results of these first-level models are concatenated and input to the second-level LR [35]. The choice of these machine learning models will be tested in Section III. Note that these models of the proposed method are trained on a training dataset. By combining the outputs of different models, the SEL module outputs the final NLOS predictions of each received satellite.

¹ The individual models in ensemble learning can also be ensemble learners.

TABLE I
Extracted Measurement Features of a Visible Satellite for GNSS NLOS Detection, Which Consist of the GNSS Raw Measurements (R) and Measurement Features via Data Processing (P)

B. Data Processing

Measurement features are an important premise in determining machine learning performance. In this section, our purpose is to design a data processing module that produces the measurement features input to the subsequent SEL module and improves the NLOS detection accuracy. The selected measurement features $\mathbf{f}$ consist of the GNSS raw measurements and quantities derived from them by simple processing, as shown in Table I. The raw GNSS measurements include SNR $s_n$, pseudorange $p$, and satellite positions $\mathbf{s} = \{s_1, s_2, s_3\}$, where $s_1$, $s_2$, and $s_3$ are the three position coordinates of a visible satellite in the Earth-centered Earth-fixed (ECEF) coordinate system. Nevertheless, empirically, the relation between such raw measurements and NLOS can be hard to model [36]. Thus, based on the raw measurements, we add some measurement features that may be more relevant to NLOS by performing some simple data processing. These features include elevation angle $e$, azimuth angle $a$, pseudorange residual $p_r$, and pseudorange rate consistency $p_{rc}$. Overall, the 9-D measurement feature vector can be obtained for each satellite signal at a sampling

time. The calculation details for these quantities via data processing are as follows.
1) Elevation Angle: The satellite elevation angle $e$ can be estimated by $e = \sin^{-1}(\hat{r}_U / \|\hat{\mathbf{r}}\|)$, where $\hat{\mathbf{r}} \in \mathbb{R}^3$ is the estimated satellite position in the east-north-up (ENU) coordinate system with respect to the receiver's position, with $\hat{r}_E$, $\hat{r}_N$, and $\hat{r}_U$ the “East,” “North,” and “Up” components, respectively. As the receiver's positioning error is negligible compared with the distance between the satellite and the receiver, the satellite elevation angle can be estimated with an acceptable accuracy using the estimated measurements [15].
2) Azimuth Angle: Similar to the calculation of the elevation angle, the satellite's azimuth angle $a$ can be calculated by $a = \tan^{-1}(\hat{r}_E / \hat{r}_N)$.
3) Pseudorange Residual: The pseudorange residual $p_r$ is the satellite's corresponding entry of the estimated pseudorange residual vector $\hat{\boldsymbol{\epsilon}} = \Delta\boldsymbol{\rho} - \mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\Delta\boldsymbol{\rho}$, where $\Delta\boldsymbol{\rho} \in \mathbb{R}^N$ is the difference between the pseudorange measurements and the geometric distances from the estimated receiver position to the satellites; $\mathbf{H}$ is the satellite geometry matrix [11].
4) Pseudorange Rate Consistency: The pseudorange rate consistency $p_{rc}$ can be estimated by $p_{rc} = p_D - p_p$, where $p_p$ is the difference between the pseudorange measurements of two adjacent epochs, and $p_D = -\lambda f_D \Delta t$ is the pseudorange rate from Doppler shift, with $\lambda$ the carrier wavelength, $f_D$ the Doppler shift measurement of the satellite, and $\Delta t$ the interval between two adjacent epochs.

C. SEL Method

As discussed in Section I, ensemble learning is adopted to combine several individual models to obtain better performance of NLOS detection under different environments. In general, the models of the proposed method are trained and evaluated on the training dataset $D_t$ and the evaluation dataset $D_e$, respectively.

Given that the training dataset $D_t$ consists of feature-label pairs $\{\mathbf{f}_t, y_t\}$, with $y_t \in \{0, 1\}$ the label of the measurement feature sample $\mathbf{f}_t$, the proposed SEL method trains a model that can classify the NLOS ($y_t = 1$) and LOS ($y_t = 0$) GNSS signals with high accuracy. Similarly, the feature-label pairs of the evaluation dataset $D_e$ are denoted as $\{\mathbf{f}_e, y_e\}$. In particular, a two-level SEL method is proposed to combine heterogeneous weak models to produce a strong model that is less biased than its component weak models. In the first level, several models are combined in parallel to output different weak model predictions from different views. In the second level, there is a machine learning model, which is trained to output the final prediction based on the predictions of the first level.

Various combination strategies of basic machine learners can be selected for the SEL method, while the method needs to balance the final accuracy against the computational cost. In the proposed SEL method for GNSS NLOS detection, the first-level models are two state-of-the-art machine learning methods, SVM [11] and XGBoost [34]. Then, the predictions of these first-level models are concatenated and input to the second-level LR. A brief introduction to the principles of the selected machine learning models is provided as follows.

First-Level Model ① (SVM): The SVM performs structural risk minimization instead of minimizing the absolute value of an error, which addresses overfitting by balancing the model's complexity against its success at fitting the training data. It has also been shown to be better than other base learners in GNSS NLOS detection tasks [11]. In particular, similar to [11], we adopt the linear SVM classifier, which is trained to find the optimal separating hyperplane that classifies inputs into two different categories. Given the training dataset $D_t$, the SVM model can be trained to solve the following optimization problem:

$$\min_{\mathbf{w}, b} \ \frac{1}{2}\|\mathbf{w}\|^2 \tag{1}$$

$$\text{s.t.} \ (2y_t - 1)\big((\mathbf{f}_t/\sigma)^T \mathbf{w} + b\big) \ge 1 \tag{2}$$

where parameters $\sigma \in \mathbb{R}$, $\mathbf{w} \in \mathbb{R}^9$, and $b \in \mathbb{R}$ are the kernel scale, the vector of fitted linear coefficients, and the bias of the linear SVM classification, respectively.

When the SVM model is trained to convergence, we obtain the model parameters $\tilde{\mathbf{w}} \in \mathbb{R}^9$ and $\tilde{b} \in \mathbb{R}$. Then, given an input measurement feature vector $\mathbf{f}_e$ for evaluation, the score of the linear SVM classification is calculated by

$$S_{\mathrm{SVM}}(\mathbf{f}_e) = (\mathbf{f}_e/\sigma)^T \tilde{\mathbf{w}} + \tilde{b}. \tag{3}$$

The value range of the score in (3) is $(-\infty, \infty)$, while the prediction results of the other base classifiers in the proposed SEL are probabilities. To make the prediction results of different base learners consistent, the NLOS probabilities of the SVM classification are obtained by sigmoid normalization, that is,

$$p_{\mathrm{SVM}}(\mathbf{f}_e) = \frac{e^{S_{\mathrm{SVM}}(\mathbf{f}_e)}}{1 + e^{S_{\mathrm{SVM}}(\mathbf{f}_e)}}. \tag{4}$$

With (4), the score of the linear SVM classification is normalized to $(0, 1)$. In our experiments, we find that the normalization of the SVM score is required for training.

First-Level Model ② (XGBoost): The boosting method GBDT [15] and its improved version XGBoost [34] have proved to be fast and accurate in GNSS NLOS detection and have achieved state-of-the-art results on various classification tasks. XGBoost is an ensemble of tree-based methods that applies the principle of boosting weak learners to improve the final prediction accuracy. The boosting method is an ensemble model by itself, but it can still benefit from being ensembled with other models [32]. In the proposed SEL method, we adopt XGBoost as a base learner to obtain a stronger model for GNSS NLOS detection.

The XGBoost method applies several base models, e.g., classification and regression trees (CARTs), as weak learners and then creates ensemble trees to boost the performance via optimizing a regularized objective function [34]. It is trained in an additive way: the ensemble sequentially adds weak learners that learn from the residual of the previous ensemble. Given the training dataset $D_t$ consists of

Fig. 3. Sky plot of the start point and endpoint in the moving trajectory with
the satellite visibility labeled from ground truth.
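A sky plot places each satellite by its azimuth and elevation angle, the two processed features defined in Section II-B. A minimal sketch of that computation from the ENU components follows. The function name is hypothetical, and `atan2` is used as a quadrant-aware variant of the arctangent in the azimuth formula, which is a choice made for this sketch rather than something the article specifies.

```python
import math


def elevation_azimuth_deg(r_east, r_north, r_up):
    """Elevation and azimuth (degrees) of a satellite from its ENU vector.

    elevation = asin(r_up / ||r||)
    azimuth   = atan2(r_east, r_north), mapped to [0, 360)
    """
    norm = math.sqrt(r_east ** 2 + r_north ** 2 + r_up ** 2)
    if norm == 0.0:
        raise ValueError("ENU vector must be non-zero")
    elevation = math.degrees(math.asin(r_up / norm))
    azimuth = math.degrees(math.atan2(r_east, r_north)) % 360.0
    return elevation, azimuth


# Satellite due east of the receiver, 45 degrees above the horizon.
elev, azim = elevation_azimuth_deg(1.0, 0.0, 1.0)
```

The ENU vector itself would be obtained by rotating the ECEF difference between satellite and receiver positions into the local frame; that step is omitted here.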

feature-label pairs $\{\mathbf{f}_t, y_t\}$, the $i$th regularized objective function of the additive training method can be denoted as

$$L_i = \sum_{\{\mathbf{f}_t, y_t\} \in D_t} l\big(y_t, G_{i-1}(\mathbf{f}_t) + g_i(\mathbf{f}_t)\big) + \Omega(g_i) \tag{5}$$

where $g_i(\cdot)$ is the weak learner at the $i$th boosting round, $G_{i-1}(\cdot) = \sum_{k=0}^{i-1} g_k(\cdot)$ is the ensemble at the $(i-1)$th boosting round, $l(\cdot)$ is the log-likelihood loss function between the label $y_t$ and the models' prediction output, and $\Omega(\cdot)$ is the regularization function that penalizes the model complexity of the weak learners $g_i$ of XGBoost.

After the additive training is terminated, $G_I(\cdot)$ is output as the final classifier, where $I$ is the number of boosting rounds. Given an input measurement feature vector $\mathbf{f}_e$ for evaluation, the probabilities of NLOS of the XGBoost classification can be obtained as

$$p_{\mathrm{XGBoost}}(\mathbf{f}_e) = G_I(\mathbf{f}_e). \tag{6}$$

Second-Level Model (LR): The LR method is a simple but effective classifier, which is often adopted in ensemble learning frameworks. It is trained to improve the system's final decision based on the prediction of each individual model in the first level. The objective function for training is the log-likelihood function

$$L_{\mathrm{LR}} = \sum_{\{\mathbf{o}_t, y_t\} \in D_t} \Big[ y_t(\mathbf{o}_t^T \mathbf{u} + v) - \log\big(1 + e^{\mathbf{o}_t^T \mathbf{u} + v}\big) \Big] \tag{7}$$

where $\mathbf{o}_t = [p_{\mathrm{SVM}}(\mathbf{f}_t), p_{\mathrm{XGBoost}}(\mathbf{f}_t)] \in \mathbb{R}^2$ is the concatenated predictions of the individual models in the first level associated with the training features $\mathbf{f}_t$, and $\mathbf{u} \in \mathbb{R}^2$ and $v \in \mathbb{R}$ are the weight and the bias of the LR method to be trained.

In the process of evaluation, the probabilities of NLOS of the LR method are obtained as

$$p_{\mathrm{LR}}(\mathbf{o}_e) = \frac{e^{\mathbf{o}_e^T \tilde{\mathbf{u}} + \tilde{v}}}{1 + e^{\mathbf{o}_e^T \tilde{\mathbf{u}} + \tilde{v}}} \tag{8}$$

where $\mathbf{o}_e = [p_{\mathrm{SVM}}(\mathbf{f}_e), p_{\mathrm{XGBoost}}(\mathbf{f}_e)] \in \mathbb{R}^2$ is the concatenated predictions of the individual models in the first level associated with the evaluation features $\mathbf{f}_e$, and $\tilde{\mathbf{u}} \in \mathbb{R}^2$ and $\tilde{v} \in \mathbb{R}$ are the weight and the bias of the LR model that has already been trained.

Finally, the predicted probabilities of NLOS $p_{\mathrm{LR}}(\mathbf{o}_e)$ are rounded to the closest value of 1 or 0 to output the NLOS predictions.

III. RESULTS AND DISCUSSION

In this section, three separate experiments on real GNSS data collected in urban canyons are designed to test the effectiveness of the proposed SEL method for GNSS NLOS detection. The first experiment compares different choices of the individual models in the proposed SEL method. The second one evaluates the performance of the proposed method compared with the baseline methods in an in-domain scenario with uniform random sampling. The final experiment tests the generalization performance of the proposed method compared with the existing methods in out-domain scenarios associated with different reception locations and states of motion, respectively.

A. Experimental Setup

1) Data Preparation: In our experiments, two public real datasets² of GNSS receivers in urban canyons are used (see Fig. 2): 1) static data: a dataset collected at eight different static locations via a UBLOX NEO M8T receiver at different times in June 2018 [11] and 2) dynamic data: a dataset collected over a trajectory (about 500 m) via a moving receiver in May 2021 [37]. With a 1-Hz sampling rate, the time duration is about 20 min for each static point and about 90 s for the dynamic data. In particular, the ground truth location and the surrounding 3-D building model of each sample are processed to obtain the NLOS or LOS label for the datasets [11].

Fig. 2. Eight static locations (in orange) and one moving trajectory (in yellow) for GNSS data collection in Hong Kong urban canyon (around latitude 22.299° and longitude 114.177°).

² The datasets are downloaded from the website of Intelligent Positioning and Navigation Laboratory (IPNL): https://www.polyu-ipn-lab.com/

Examples of the sky plots of the dynamic and static data are shown in Figs. 3 and 4, respectively. As the GNSS observations vary over time, the sky plot of each static receiver is plotted based on the last epoch, and the sky plots of the start and end epochs are plotted for the moving receiver. Also, the proportions of NLOS and LOS signals in each dataset are counted (see Figs. 5 and 6). The sky plots and NLOS/LOS proportions show that the reception environments are quite

Fig. 4. Sky plot of different static locations with the satellite visibility labeled from ground truth.

Fig. 5. SNR distributions and NLOS/LOS proportions in different static locations.
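The SNR distributions plotted here are also what the conventional SNR baseline relies on: the article sets its threshold to 35 dB. A minimal sketch of such a threshold classifier and of the NLOS proportion count follows. The direction of the rule (SNR below the threshold means NLOS) is an assumption of this sketch, motivated by the article's remark that NLOS signals have smaller SNR, and the function names and sample values are hypothetical.

```python
SNR_THRESHOLD_DB = 35.0  # threshold quoted in the article for the SNR baseline


def classify_by_snr(snr_db, threshold=SNR_THRESHOLD_DB):
    """Return 1 (NLOS) if the SNR is below the threshold, else 0 (LOS).

    Assumption of this sketch: a value exactly at the threshold counts as LOS.
    """
    return 1 if snr_db < threshold else 0


def nlos_proportion(labels):
    """Fraction of samples labeled NLOS (label 1) in a dataset."""
    if not labels:
        raise ValueError("labels must be non-empty")
    return sum(labels) / len(labels)


# Hypothetical SNR readings (dB) for four satellites at one epoch.
predictions = [classify_by_snr(s) for s in (28.0, 41.5, 35.0, 19.2)]
share = nlos_proportion(predictions)
```

Because the SNR distributions differ between locations, a single fixed threshold is expected to transfer poorly, which is the weakness the learned models are meant to address.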

different for the datasets. The data at each sampling time are pre-processed to obtain the measurement features in Table I.

Fig. 6. SNR distributions and NLOS/LOS proportions in the moving trajectory.

To evaluate the performance of the proposed method in different settings, the GNSS data are split into training datasets and evaluation datasets in two manners. First, we randomly select a subset of the collected data samples as the training set and use the remaining samples as the test set to obtain optimal first/second-level models. Second, considering that the performance of NLOS detection is usually highly related to the surrounding environment, we also test the generalization of our proposed method in environments that are unseen by the model during training. In particular, we split the dataset into training datasets and evaluation datasets derived from different locations and states of motion, respectively. The details of the experimental scenarios are listed as follows.
1) In-Domain Scenario I: All of the data samples collected at the eight static locations are mixed together. Then, 90% of the samples are randomly selected for model training, and the remaining 10% are used for evaluation.
2) Out-Domain Static Scenario: Among the data samples collected at the eight static locations, we select the samples of seven of the eight locations for model training and then use the remaining location for model evaluation. Thus, we can

TABLE II
Accuracy Rates and Computational Costs of Different Strategies for Model Selection on the In-Domain Scenario I, the Mean Value of the Static Out-Domain Scenario O*, and the Out-Domain Dynamic Scenario D, With “✓” the Model Selection and “×” the Model Is Not Used
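The three kinds of scenario named in this caption (I, the eight static out-domain scenarios averaged as O*, and D) correspond to different ways of splitting the data. A minimal sketch of the in-domain 90/10 split and the leave-one-location-out splits follows. The function names, the seed, and the toy data are hypothetical, and the article does not state how its random selection was implemented.

```python
import random


def in_domain_split(samples, train_fraction=0.9, seed=0):
    """Shuffle all static samples together and split them 90% / 10%."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def leave_one_location_out(samples_by_location):
    """Yield (held_out_location, train, test), one scenario per location."""
    for held_out, test_samples in samples_by_location.items():
        train = [
            sample
            for location, group in samples_by_location.items()
            if location != held_out
            for sample in group
        ]
        yield held_out, train, list(test_samples)


# Toy data: eight static locations with five samples each.
static = {loc: [(loc, k) for k in range(5)] for loc in range(1, 9)}
all_static = [s for group in static.values() for s in group]

train_i, test_i = in_domain_split(all_static)           # scenario I
static_scenarios = list(leave_one_location_out(static))  # scenarios O_1..O_8
```

For the dynamic scenario D, `all_static` would serve as the training set and the moving-trajectory samples as the test set. The mean over the eight static scenarios gives the O* value reported in the table.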

obtain eight experimental scenarios denoted as $O_i$, with $i \in \{1, \ldots, 8\}$.
3) Out-Domain Dynamic Scenario D: All samples collected at the eight static locations are mixed for model training, and the dynamic data are used as the testing set.

2) Methods and Implementation: We compare the proposed SEL method with state-of-the-art NLOS detection methods based on machine learning, including SVM [11], GBDT [15], and RF [38]. We also consider the conventional SNR classification method [12] as a comparison. Similar to [11], the threshold of the SNR classification is set to 35 dB. The SNR distributions of LOS and NLOS in different static locations and the moving trajectory are shown in Figs. 5 and 6, respectively. Note that the SNR measurements of NLOS signals are significantly smaller than those of LOS signals, while the SNR distributions for NLOS and LOS signals are quite different for different datasets. In our experiments, each machine learning model in the proposed SEL method is run with the default model hyperparameters in scikit-learn [39].

3) Software/Hardware: All of our experiments were conducted using PyCharm software on a PC with a Core i7 CPU (2.93-GHz with 8-GB memory).

B. Models of SEL

The performance of different model selection strategies on GNSS NLOS detection is shown in Table II. Experimental results show that the accuracy rates of strategy 5 (first level: SVM + XGBoost, second level: LR) and strategy 7 (first level: SVM + XGBoost + RF, second level: LR) are similar, and both largely outperform the other strategies in the in-domain scenario, the out-domain static scenarios, and the out-domain dynamic scenario.

In the results, the more individual models used in the first level, the higher the accuracy rate. A possible reason is that the SEL method combines various views of the features from different heterogeneous models and thus improves the NLOS detection performance. However, one can also find that simply increasing the number of models in the first level may not always bring a significant improvement. For example, compared with strategy 5, strategy 7 adds RF in the first level. The average accuracy over the scenarios is only improved by 0.42%, but the computational cost of each sample for testing is relatively increased by 33.3%.

Moreover, the impact of each model on the results varies widely. In the first level, the importance of the three candidate models is, from high to low, SVM, XGBoost, and RF. Numerically, compared with strategy 7, strategy 1 (strategy 3 or strategy 5) removes SVM (XGBoost or RF), and the average accuracy rate decreases by 8.60% (2.82% or 0.42%). In the second level, LR consistently outperforms XGBoost in terms of accuracy rate and computational cost. Compared with XGBoost, LR is simpler and more efficient in the second level. Moreover, the properties of LR and the first-level models are more heterogeneous, which might help the model learn from ensemble features.

In summary, to balance the accuracy rate against the computational cost, we select strategy 5 as the models of the proposed SEL method.

C. Results of the In-Domain Scenario

To evaluate the performance of model training, we design the in-domain scenario and use the pre-split training dataset for model optimization. Then, we test the models' performance on the test dataset. The accuracy rate and the confusion matrix on the in-domain scenario are shown in Tables III and IV, respectively.

TABLE III
Accuracy Rates of Different GNSS NLOS Detection Methods on the In-Domain Scenario

In Table III, our experimental results of the in-domain scenario show that the proposed SEL method achieves a very significant performance improvement compared with the baseline methods. The existing methods based on machine learning [11], [15], [38] outperform the traditional SNR
method [12], while the SVM method is the best one. Xu et al. [11] and Zhang et al. [16] reported the same finding. Although the GBDT method or the RF method is weaker than the SVM method when used alone in NLOS detection tasks, it can still benefit the SEL method when these individual models are combined. In the experiments, the proposed SEL method consists of the SVM model and the XGBoost model (a type of improved GBDT) in the first level and the LR model in the second level. Compared with the baseline methods that adopt SVM [11], GBDT [15], or RF [38] alone, the proposed SEL method improves the accuracy rate of GNSS NLOS detection by 3.92%, 5.69%, and 8.59%, respectively. Moreover, the values of the confusion matrix also show that the proposed SEL method consistently outperforms the baseline machine learning methods in the GNSS NLOS detection tasks (see Table IV).

TABLE IV
CONFUSION MATRIX OF DIFFERENT GNSS NLOS DETECTION METHODS ON THE IN-DOMAIN SCENARIO

D. Results of the Out-Domain Scenarios

To evaluate the generalization of the proposed SEL method, two kinds of out-domain scenarios are designed to test the performance in environments that the model does not see during training. The accuracy rate and the confusion matrix on the out-domain static scenarios and the out-domain dynamic scenario (see Tables V and VI, respectively) are discussed as follows.

TABLE V
ACCURACY RATES OF DIFFERENT GNSS NLOS DETECTION METHODS ON THE OUT-DOMAIN SCENARIOS

1) Out-Domain Static Scenario: As shown in Table V, the proposed SEL method shows a strong performance compared with the baseline methods on the out-domain static scenarios. In particular, the proposed SEL method achieves the highest accuracy rates in seven (Oi, with i ∈ {2, ..., 8}) of the eight experimental settings, while the performance on O1 is comparable with the best result of SVM [11]. Averaged over the out-domain static scenarios (i.e., O∗), the proposed SEL method significantly outperforms the baseline methods, including the SNR method [12], SVM [11], GBDT [15], and RF [38], by 8.99%, 2.86%, 5.04%, and 7.47%, respectively. Also, the confusion matrix in Table VI shows that the proposed SEL method has a good generalization performance for different environments, with high detection accuracy and low false detection.

TABLE VI
CONFUSION MATRIX OF THE PROPOSED SEL METHOD ON THE OUT-DOMAIN SCENARIOS

2) Out-Domain Dynamic Scenario: For brevity, the results of our method on the out-domain dynamic scenario are presented in the last column of Tables V and VI. The results indicate that the proposed SEL method achieves the best accuracy rate compared with the existing methods. Numerically, it outperforms the methods of SNR [12], SVM [11], GBDT [15], and RF [38] by 10.64%, 2.86%, 5.45%, and 9.74%, respectively. Although trained entirely on static data, our SEL method does not suffer from large performance degradation when evaluated on the dynamic dataset. A possible reason is that our method operates on the measurement features of each epoch, which are not very sensitive to the dynamics caused by receiver motion. In our future work, more dynamic datasets will be collected for model training to improve the performance in realistic applications.

E. Qualitative Analysis

To further evaluate the effectiveness of our method, we analyze the results from the following two perspectives.

First, the time series of accuracy rates for our SEL method is plotted to qualitatively assess the performance stability (see Fig. 7 for the out-domain static scenarios, and Fig. 8 for the out-domain dynamic scenario). The results of each experimental scenario show that the performance of SEL over different epochs varies within a certain range. Numerically, the standard deviation of the accuracy rate ranges from 5.53 to 13.58 points in the experimental scenarios. A possible reason for the larger variances is the surrounding environment the receiver is located in. For example, O4 exhibits the largest standard deviation (13.58 points) and the lowest average accuracy (75.45%) among the experimental scenarios. This might be caused by the highest NLOS proportion in O4 (88%, see Fig. 5), which indicates a much more challenging reception environment for the receiver. Conversely, O1 and O8 achieve the two smallest standard deviations (5.53 and 8.19 points) and the two highest accuracy rates (94.23% and 90.75%). The reason might be that the two scenarios are simple for NLOS detection tasks, with low NLOS proportions (43% and 19%) or easily distinguishable SNR measurements (see Fig. 5). Regarding the time series of performance on the dynamic data, our SEL method still achieves relatively stable accuracy rates even for a moving receiver in urban canyons (see Fig. 8).

Fig. 7. Number of visible LOS and NLOS, and accuracy rates in time series for the out-domain static scenarios.

Fig. 8. Number of visible LOS and NLOS, and accuracy rates in time series for the out-domain dynamic scenario.

Besides the time series analysis on each epoch, which involves only about ten observed satellites (and might therefore cause statistical error), we also analyze how the performance of our method changes as the proportion of NLOS signals in different static locations increases. Combining the results from Fig. 5 and Table V, the proportion of NLOS signals is negatively correlated with the accuracy rate of our SEL method. For example, among the top five (O1, O3, O4, O5, and O6) of the eight static datasets in terms of NLOS proportion, the accuracy rates for four datasets (O3, O4, O5, and O6) are significantly below the average performance (O∗). Numerically, the correlation between the NLOS proportion of each location and the accuracy rate is −0.694, which shows a strong negative correlation. We infer that, as a larger proportion of NLOS signals usually indicates a more complex reception environment, the method may suffer from performance degradation when the NLOS proportion is large.

IV. CONCLUSION

In this article, a novel stacking-based ensemble learning method has been proposed for the NLOS detection of GNSS. It combines different machine learning methods from different views to address the shortcomings of each single model. The proposed method effectively leverages existing individual machine learning models to enhance NLOS detection for GNSS applications, and significantly improves the performance on real GNSS data compared with the baseline methods.

In the future, features of other sensors, such as fish-eye cameras and LiDARs, will be included in the proposed SEL method. Also, ensemble deep learning will be researched for GNSS NLOS detection to obtain better generalization performance.
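
As a rough illustration of the two-level structure described in this article (first level: SVM and a boosted-tree model; second level: LR), the following minimal sketch uses scikit-learn's StackingClassifier on synthetic data. It is not the authors' implementation: GradientBoostingClassifier stands in for XGBoost, the features come from make_classification rather than GNSS measurements, and the sample size, split ratio, fold count, and random seeds are assumptions chosen only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class data standing in for LOS/NLOS feature vectors (assumption).
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# First level: heterogeneous base models with default hyperparameters.
# GradientBoostingClassifier is a stand-in for XGBoost (assumption).
first_level = [
    ("svm", SVC()),
    ("gbdt", GradientBoostingClassifier(random_state=0)),
]

# Second level: logistic regression trained on cross-validated
# first-level outputs (5 folds is an illustrative choice).
sel = StackingClassifier(
    estimators=first_level,
    final_estimator=LogisticRegression(),
    cv=5,
)
sel.fit(X_train, y_train)

predictions = sel.predict(X_test)
accuracy = sel.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

The second-level model only sees the outputs of the first-level models, so a simple learner such as LR is usually enough there, which is in line with the cost argument made for strategy 5.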

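The conventional SNR baseline used for comparison labels a signal by thresholding its SNR at 35 dB, and all methods are scored by accuracy rate and a confusion matrix. A minimal sketch of that evaluation in plain Python is given below. The SNR values and labels are made up for illustration, the function names are hypothetical, and treating a signal whose SNR is below the threshold as NLOS is an assumption (the text states only the 35-dB threshold and that NLOS measurements tend to be smaller).

```python
SNR_THRESHOLD_DB = 35.0  # threshold stated in the text


def classify_by_snr(snr_db, threshold=SNR_THRESHOLD_DB):
    """Return 1 (NLOS) if the SNR is below the threshold, else 0 (LOS).

    The direction of the rule is an assumption made for this sketch.
    """
    return 1 if snr_db < threshold else 0


def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives, with NLOS (1) as positive."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy_rate(cm):
    """Fraction of correctly labeled signals."""
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total


# Illustrative values only, not data from the paper.
snr_values = [42.0, 38.5, 29.0, 33.0, 36.0, 25.5]
true_labels = [0, 0, 1, 1, 1, 0]  # 0 = LOS, 1 = NLOS

predicted = [classify_by_snr(s) for s in snr_values]
cm = confusion_matrix(true_labels, predicted)
acc = accuracy_rate(cm)
print(cm, f"accuracy: {acc:.3f}")
```

The two misclassified samples in this toy data (an NLOS signal at 36.0 dB and a LOS signal at 25.5 dB) show why a fixed threshold struggles when the SNR distributions of the two classes overlap.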
ACKNOWLEDGMENT

The authors would like to thank Haosheng Xu and Li-Ta Hsu for sharing the real global navigation satellite system data and constructive suggestions for this work.

REFERENCES

[1] N. Dasanayaka and Y. Feng, “Analysis of vehicle location prediction errors for safety applications in cooperative-intelligent transportation systems,” IEEE Trans. Intell. Transp. Syst., early access, Jan. 20, 2022, doi: 10.1109/TITS.2022.3141710.
[2] S. Y. Cho and W. S. Choi, “Robust positioning technique in low-cost DR/GPS for land navigation,” IEEE Trans. Instrum. Meas., vol. 55, no. 4, pp. 1132–1142, Aug. 2006.
[3] A. Ul Haque, T. Mahmood, and M. Saeed, “Enhanced GNSS positioning solution on Android for location based services using big data,” J. Internet Technol., vol. 20, no. 2, pp. 399–407, 2019.
[4] Y. Sun, “Autonomous integrity monitoring for relative navigation of multiple unmanned aerial vehicles,” Remote Sens., vol. 13, no. 8, p. 1483, Apr. 2021.
[5] B. Xu, Q. Jia, and L.-T. Hsu, “Vector tracking loop-based GNSS NLOS detection and correction: Algorithm design and performance analysis,” IEEE Trans. Instrum. Meas., vol. 69, no. 7, pp. 4604–4619, Jul. 2020.
[6] J. Bressler, P. Reisdorf, M. Obst, and G. Wanielik, “GNSS positioning in non-line-of-sight context—A survey,” in Proc. IEEE 19th Int. Conf. Intell. Transp. Syst. (ITSC), Nov. 2016, pp. 1147–1154.
[7] P. Francois, B. David, and M. Florian, “Non-line-of-sight GNSS signal detection using an on-board 3D model of buildings,” in Proc. 11th Int. Conf. ITS Telecommun., Aug. 2011, pp. 280–286.
[8] Z. Lyu and Y. Gao, “A new method for non-line-of-sight GNSS signal detection for positioning accuracy improvement in urban environments,” in Proc. 33rd Int. Tech. Meeting Satell. Division Inst. Navigat. (ION GNSS+), Oct. 2020, pp. 2972–2988.
[9] S. Haigh, J. Kulon, A. Partlow, P. Rogers, and C. Gibson, “A robust algorithm for classification and rejection of NLOS signals in narrowband ultrasonic localization systems,” IEEE Trans. Instrum. Meas., vol. 68, no. 3, pp. 646–655, Aug. 2018.
[10] P. Xie and M. G. Petovello, “Measuring GNSS multipath distributions in urban canyon environments,” IEEE Trans. Instrum. Meas., vol. 64, no. 2, pp. 366–377, Feb. 2015.
[11] H. Xu, A. Angrisano, S. Gaglione, and L.-T. Hsu, “Machine learning based LOS/NLOS classifier and robust estimator for GNSS shadow matching,” Satell. Navigat., vol. 1, no. 1, pp. 1–12, Dec. 2020.
[12] L. Wang, P. D. Groves, and M. K. Ziebart, “Smartphone shadow matching for better cross-street GNSS positioning in urban environments,” J. Navigat., vol. 68, no. 3, pp. 411–433, 2015.
[13] D. H. Won et al., “Weighted DOP with consideration on elevation-dependent range errors of GNSS satellites,” IEEE Trans. Instrum. Meas., vol. 61, no. 12, pp. 3241–3250, Dec. 2012.
[14] L.-T. Hsu, “GNSS multipath detection using a machine learning approach,” in Proc. IEEE 20th Int. Conf. Intell. Transp. Syst. (ITSC), Oct. 2017, pp. 1–6.
[15] R. Sun, G. Wang, W. Zhang, L.-T. Hsu, and W. Y. Ochieng, “A gradient boosting decision tree based GPS signal reception classification algorithm,” Appl. Soft Comput., vol. 86, Jan. 2020, Art. no. 105942.
[16] G. Zhang, B. Xu, and L.-T. Hsu, “GNSS shadow matching based on intelligent LOS/NLOS classifier,” in Proc. 16th IAIN World Congr., 2018, pp. 1–7.
[17] S. Kato, M. Kitamura, T. Suzuki, and Y. Amano, “NLOS satellite detection using a fish-eye camera for improving GNSS positioning accuracy in urban area,” J. Robot. Mechatronics, vol. 28, no. 1, pp. 31–39, 2016.
[18] J. S. Sánchez, A. Gerhmann, P. Thevenon, P. Brocard, A. B. Afia, and O. Julien, “Use of a FishEye camera for GNSS NLOS exclusion and characterization in urban environments,” in Proc. Int. Tech. Meeting The Inst. Navigat., Feb. 2016, pp. 283–292.
[19] W. W. Wen, G. Zhang, and L.-T. Hsu, “GNSS NLOS exclusion based on dynamic object detection using LiDAR point cloud,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 2, pp. 853–862, Feb. 2021.
[20] W. Wen and L.-T. Hsu, “3D LiDAR aided GNSS NLOS mitigation in urban canyons,” 2021, arXiv:2112.06108.
[21] V. Filippov, D. Tatarnicov, J. Ashjaee, A. Astakhov, and I. Sutiagin, “The first dual-depth dual-frequency choke ring,” in Proc. 11th Int. Tech. Meeting Satell. Division The Inst. Navigat. (ION GPS), 1998, pp. 1035–1040.
[22] P. D. Groves, Z. Jiang, M. Rudi, and P. Strode, A Portfolio Approach to NLOS and Multipath Mitigation in Dense Urban Areas. Richmond, VA, USA: Institute of Navigation, 2013.
[23] R. D. J. van Nee, J. Siereveld, P. C. Fenton, and B. R. Townsend, “The multipath estimating delay lock loop: Approaching theoretical accuracy limits,” in Proc. IEEE Position, Location Navigat. Symp. (PLANS), Apr. 1994, pp. 246–251.
[24] X. Chen, F. Dovis, S. Peng, and Y. Morton, “Comparative studies of GPS multipath mitigation methods performance,” IEEE Trans. Aerosp. Electron. Syst., vol. 49, no. 3, pp. 1555–1568, Jul. 2013.
[25] P. D. Groves and Z. Jiang, “Height aiding, C/N0 weighting and consistency checking for GNSS NLOS and multipath mitigation in urban areas,” J. Navigat., vol. 66, no. 5, pp. 653–669, Sep. 2013.
[26] S. Tay and J. Marais, “Weighting models for GPS pseudorange observations for land transportation in urban canyons,” in Proc. 6th Eur. Workshop GNSS Signals Signal Process., Dec. 2013, p. 4.
[27] L.-T. Hsu, Y. Gu, and S. Kamijo, “NLOS correction/exclusion for GNSS measurement using RAIM and city building models,” Sensors, vol. 15, no. 7, pp. 17329–17349, 2015.
[28] Z. Jiang and P. D. Groves, “GNSS NLOS and multipath error mitigation using advanced multi-constellation consistency checking with height aiding,” in Proc. 25th Int. Tech. Meeting Satell. Division Inst. Navigat. (ION GNSS), 2012, pp. 79–88.
[29] Z. Jiang, P. D. Groves, W. Y. Ochieng, S. Feng, C. D. Milner, and P. G. Mattos, “Multi-constellation GNSS multipath mitigation using consistency checking,” in Proc. 24th Int. Tech. Meeting Satell. Division Inst. Navigat. (ION GNSS), 2011, pp. 3889–3902.
[30] O. Sagi and L. Rokach, “Ensemble learning: A survey,” WIREs Data Mining Knowl. Discovery, vol. 8, no. 4, Jul. 2018, Art. no. e1249.
[31] T. G. Dietterich, “Ensemble methods in machine learning,” in Proc. Int. Workshop Multiple Classifier Syst., Cham, Switzerland: Springer, 2000, pp. 1–15.
[32] H. Baker, M. R. Hallowell, and A. J.-P. Tixier, “AI-based prediction of independent construction safety outcomes from universal attributes,” Autom. Construct., vol. 118, Oct. 2020, Art. no. 103146.
[33] E. Alickovic and A. Subasi, “Ensemble SVM method for automatic sleep stage classification,” IEEE Trans. Instrum. Meas., vol. 67, no. 6, pp. 1258–1265, Jun. 2018.
[34] T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2016, pp. 785–794.
[35] S. Menard, Applied Logistic Regression Analysis, vol. 106. Newbury Park, CA, USA: Sage, 2002.
[36] Y. Sun, “RAIM-NET: A deep neural network for receiver autonomous integrity monitoring,” Remote Sens., vol. 12, no. 9, p. 1503, May 2020.
[37] G. Zhang, P. Xu, H. Xu, and L.-T. Hsu, “Prediction on the urban GNSS measurement uncertainty based on deep learning networks with long short-term memory,” IEEE Sensors J., vol. 21, no. 18, pp. 20563–20577, Sep. 2021.
[38] M. Ramadan, V. Sark, J. Gutierrez, and E. Grass, “NLOS identification for indoor localization using random forest algorithm,” in Proc. WSA 22nd Int. ITG Workshop Smart Antennas, 2018, pp. 1–5.
[39] F. Pedregosa et al., “Scikit-learn: Machine learning in Python,” J. Mach. Learn. Res., vol. 12, pp. 2825–2830, Jan. 2012.

Yuan Sun received the B.E. degree from the School of Instrumentation Science and OptoElectronics Engineering, Beihang University, Beijing, China, in 2011, and the Ph.D. degree from the School of Electronic and Information Engineering, Beihang University, in 2016.
She is currently a Lecturer with the School of Electronics Engineering, Beijing University of Posts and Telecommunications, Beijing. Her primary research interests include global navigation satellite system positioning and integrity monitoring.

Li Fu received the B.E. degree from the School of Advanced Engineering, Beihang University, Beijing, China, in 2011, and the Ph.D. degree from the School of Electronic and Information Engineering, Beihang University, in 2016.
He is currently an Algorithm Engineer with JD AI Research, Beijing, China. His primary research interests include global navigation satellite system positioning, computer vision, multisensor integration, and intelligent unmanned systems.
