Received 21 December 2020; accepted 2 March 2021. Date of publication 11 March 2021; date of current version 9 April 2021. The review of this paper was arranged by Associate Editor Junmin Wang.
Digital Object Identifier 10.1109/OJVT.2021.3065529

Probabilistic Prediction of Energy Demand and Driving Range for Electric Vehicles With Federated Learning
ADAM THOR THORGEIRSSON 1, STEFAN SCHEUBNER 2, SEBASTIAN FÜNFGELD 3, AND FRANK GAUTERIN 1
1 Karlsruhe Institute of Technology, Institute of Vehicle System Technology, 76131 Karlsruhe, Germany
2 Department of E-Mobility Charging Infrastructure, EnBW AG, 76131 Karlsruhe, Germany
3 Dr. Ing. h. c. F. Porsche AG, Department of Energy Management, R&D Center Weissach, 71287 Weissach, Germany
CORRESPONDING AUTHORS: ADAM THOR THORGEIRSSON; FRANK GAUTERIN (e-mail: [email protected]; [email protected])
This work was supported by the KIT-Publication Fund of the Karlsruhe Institute of Technology.

ABSTRACT Today’s drivers of battery electric vehicles must deal with limited driving range in a sparse
charging infrastructure. An accurate prediction of energy demand and driving range is therefore important
and enables reliable routing and charge planning applications. Predictions of energy demand entail uncertainty,
which can be considered directly with the use of probabilistic prediction algorithms. Machine learning
algorithms are frequently applied in this context, but data used to train these algorithms are often distributed
over a fleet of connected vehicles. Federated learning can be applied in this setting, but predictive uncertainty
is typically not considered. We apply an extension of the federated averaging algorithm to learn probabilistic
neural networks and linear regression models in a communication-efficient and privacy-preserving manner.
We demonstrate the performance advantage of probabilistic prediction models over deterministic prediction
models using proper scoring rules. Furthermore, we show that federated learning can improve the standard,
driver-individual learning. Using probabilistic predictions, variable safety margins based on destination
attainability can be applied, leading to increased effective driving range and reduced travel time.

INDEX TERMS Electric vehicles, energy demand prediction, probabilistic predictions, range estimation.

I. INTRODUCTION
The call for low or zero emissions vehicles, along with improved battery technology, makes the battery electric vehicle (BEV) a serious candidate for the replacement of internal combustion engine powered vehicles (ICEVs). Despite the advantages of such vehicles, they have not gained significant popularity among the general public. Due to limited charging infrastructure and the inevitably shorter driving range, BEV drivers may experience range anxiety, which is the fear that the energy storage will run out before reaching the destination [1]. In order to eliminate range anxiety and increase the usability of BEVs, there is a need for applications that help drivers in arriving safely at their destinations without excessive time or cost. The primary goals of such applications are to maximize the effective driving range and to accurately predict this range. Drivers tend to reserve up to 20% of the battery capacity as a safety margin [2], i.e., the utilization of the available battery energy is poor. The utilization of the battery strongly depends on the calibration of the driving range prediction [3]. A central challenge in this context is the prediction of future energy demand. The energy demand prediction (EDP) is not only used to display remaining driving range [4], but also for other purposes such as the estimation of a destination’s attainability [5], time or energy optimal routing with charge planning [6], [7], energy optimal control [8], [9], BEV fleet management systems [10] and charging infrastructure planning [11].
The EDP and driving range estimation rests upon information about the driver, vehicle, route, traffic and other environmental factors. Frequently, machine learning (ML) algorithms are used to compute the predictions [12]. Because of the high number of influence factors, large amounts of

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 2, 2021 151
THORGEIRSSON ET AL.: PROBABILISTIC PREDICTION OF ENERGY DEMAND AND DRIVING RANGE

predictive data are required for an accurate prediction. Few researchers have addressed the issue of uncertainty of these predictions [5]. Probabilistic predictions compute probability densities for the target variable, so that uncertainty is directly taken into account. The required predictive data come from different sources in a distributed system, which comprises a network of connected vehicles and backend infrastructure in the cloud. A prediction algorithm utilizing data from this network must guarantee the privacy of the users and be able to function without excessive computation and communication overhead. To this end, we can apply federated learning (FL), which is a ML scheme where each end device learns from local data. A centralized server creates a global model by aggregating the model weights received from the devices at regular intervals [13]. The global model is then sent back to the devices where the learning continues. FL algorithms, such as federated averaging (FedAvg), are typically applied when a large dataset is desired, but sharing data between devices is not possible or too expensive. Recently, an extension of FedAvg with predictive uncertainty was presented, called FedAvg-Gaussian (FedAG). There, uncertainty is introduced in the aggregation step of the algorithm by treating the set of local weights as a posterior distribution for the weights of the global model [14].
This paper presents the application of FedAvg and FedAG to the prediction of the energy demand of a BEV on a planned route. We show an efficient way to learn probabilistic ML models, evaluate and accentuate the advantages of probabilistic EDPs and demonstrate their effect on battery utilization and travel time. The paper is organized as follows: An overview of related work is given in Section II. In Section III, the system architecture and available predictive data are presented. The EDP algorithms and federated learning schemes are described in Section IV and the validation of the prediction is shown in Section V. The benefit in safety margin and travel time is discussed in Section VI before the paper is concluded in Section VII.

II. BACKGROUND AND RELATED WORK
Current practice in energy demand prediction (EDP) is to use information from the vehicle, such as driving speed, acceleration, and historic energy consumption together with predictive information about the planned route from a traffic and routing database (TRDB). TRDB information comprises static map data, e.g., road slope, legal speed limit, and dynamic data such as live traffic. The prediction itself is typically performed using mechanistic models based on physical principles [6], [15]–[17]. In recent years, ML algorithms have been trained to find the relation between the available predictive information and the resulting energy consumption [18]–[20]. The main advantage of ML algorithms is that an exact modeling of the mathematical relation between a feature and the target variable is not necessary, or rather, the ML algorithm automatically creates this model. Additionally, hybrid models, combining a mechanistic model and ML, can be applied [5], [7].
In the context of driving range, energy demand and BEV routing, few articles have addressed predictive uncertainty. Oliva et al. describe remaining driving range as a random variable, where the remaining battery energy is estimated with an unscented Kalman filter and the driving profile is predicted with a Markov chain. With that, a probability density function for the remaining driving range is computed [21]. Ondruska and Posner trained linear models to describe the mean and the variance of the energy consumption based on road segment features. Thereby, two deterministic models are used to calculate the parameters of a normal distribution for the prediction of energy consumption [22]. Scheubner et al. used a multi-linear regression (MLR) model to compute a stochastic velocity prediction, which is then used to predict a probability distribution for the energy consumption using a physical model and a sequential Monte Carlo simulation [5]. Furthermore, the uncertainty of EDPs has been considered in BEV routing applications [23]–[26].
Data-driven predictions such as with ML algorithms benefit from a rich training dataset [27]. A few articles have proposed sharing data between vehicles and the cloud, so that a user can benefit from the experience of other users, ultimately leading to more accurate predictions. Grubwinkler et al. proposed an energetic road map created through crowd-sourcing by collecting information on energy consumption of BEVs while driving a road segment [28]. Tseng and Chau applied the concept of participatory sensing to gather crowd-sourced data for the prediction of vehicle energy demand [29]. Straub et al. presented another approach for creating an energetic road map, by collecting crowd-sourced driving profiles where the gaps in data coverage were eliminated using ML methods [30].
By applying FedAG to the EDP problem, the advantages of crowd-sourcing can be extended to probabilistic models in an efficient and privacy preserving manner. Recent publications showed the application of FL in vehicle-to-vehicle (V2V) communications [31], in autonomous driving [32], and in traffic flow prediction [33]. To the best of our knowledge, FL has not yet been applied in EDP for BEVs.

III. SYSTEM DESIGN AND DATA
The digital ecosystem in which the EDP operates is a distributed system of connected vehicles and backend infrastructures in the cloud. In this distributed system, large amounts of data can be used to learn ML models, which typically have high computational requirements. The central challenge is to make use of information in the distributed system to enable accurate and robust probabilistic predictions, while considering aspects such as privacy protection and lean communications.
In our previous work [34], we demonstrated the importance of system architecture and module placement for the performance and user experience of driving range prediction and charge planning software. By placing the prediction algorithm parts intelligently across the vehicle and cloud, the performance can be increased. Following that, the prediction algorithm presented in this work can be implemented in an



efficient system architecture. The learning of the models is performed in the vehicle, so that training data remains in the vehicle. Thereby, the communication between the vehicle and the cloud covers only the transfer of the model weights. Furthermore, the predictions are computed in the cloud, so that the transmission of predictive data from the cloud to the vehicle is reduced to the final predictions. In that way, the amount of data transferred between the vehicles and the cloud is minimized. Fig. 1 shows an overview of the distributed system. The ego vehicle and the vehicle fleet share their model weights W in a central backend in the cloud, where a probabilistic neural network (NN) is built. When a destination D is entered in the ego vehicle’s navigation system, the route and predictive information is queried in the TRDB and a probabilistic EDP $E_c$ is computed with a NN.

FIG. 1. Schematic overview of the distributed system.

A. MEASUREMENT DRIVES AND POWERTRAIN MODEL
In this work, we use a dataset first presented in [5]. The dataset includes 20 real world measurement drives performed by 10 different drivers. All relevant data is logged in the vehicle with a sampling rate of 10 Hz. To generate unified driving data from the pool of measurements with different vehicles, a simulation model for the powertrain of an electric vehicle is used. The simulation model calculates the power P and energy $E_c$ drawn from the battery based on velocity v and driving resistance $F_r$. Fig. 2 shows a schematic overview of the powertrain model. Based on efficiency maps for components such as the gearbox (GB), electric motor (EM), and power electronics (PE), component losses are computed. These losses are denoted by red arrows in Fig. 2. For a complete description of the model, we refer the reader to [5]. An overview of the test run data is shown in Table 1.

FIG. 2. Powertrain model with input variables v, $F_r$ and output variable $E_c$. The red arrows indicate simulated component losses.

TABLE 1. Test Run Data

B. MAP AND TRAFFIC DATA
To complement the driving data measured in the vehicles, map and traffic data are acquired to match the driven routes. Using the GPS traces from the measurement drives, the measured data can be matched to a map. Using the IDs of the road segments that form the driven route, the TRDB can be queried to obtain static map data as well as real-time traffic information. The TRDB includes a list of properties such as road slope α, street class, mean traffic speed u, road curvature κ, legal speed limit $v_{lim}$, segment length l etc. The TRDB does not only report the mean traffic speed but also information on its distribution, such as standard deviation $\sigma_u$ and percentile values $P_i(u)$ in steps of 5% [35]. A further aspect of traffic is the traffic phase. The three-phase traffic theory divides traffic into free flow, synchronized flow, and wide moving jam [36]. A method to classify the traffic phase directly in the vehicle was presented in [5]. Using this method, the estimated traffic phase is included in the dataset. Contrary to the measured driving data, map and traffic data have a much lower spatial resolution, where a typical segment length is 200 m.

C. VELOCITY PERCENTILE ESTIMATION
An important factor in the energy consumption pattern is the driving speed. In this work, we rely on the velocity reported by the TRDB. As different drivers may exhibit different driving styles and cruise at different speeds in free flowing traffic, we individualize the velocity predictions. To this end, we observe to which percentile of the velocity distribution the driver belongs on a complete trip. By minimizing the squared error between ego vehicle speed and percentile values of the traffic speed distribution, the best matching percentile can be found:
$$\rho_d = \operatorname*{arg\,min}_i \, (v - P_i(u))^2, \tag{1}$$
where $\rho_d$ is the percentile that best matches driver d, v is the speed of the ego vehicle, and $P_i(u)$ is the i-th percentile of the traffic speed distribution u. As the traffic speed distribution is very narrow in the case of a traffic jam, we only look at synchronized flow and free flow to determine the best fitting percentile.
For each of the drives, (1) is used to find the best fitting percentile. Fig. 3 presents the results of the velocity prediction.
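A minimal Python sketch of the percentile search in (1) is given below. It assumes that the squared error is summed over the samples of a trip, that the percentile values are available per sample as plain lists, and that samples recorded in a traffic jam are masked out. The function name, the data layout, and the toy numbers are illustrative assumptions and are not taken from the original implementation.

```python
def best_matching_percentile(v_ego, traffic_percentiles, keep=None):
    """Return the percentile level i that minimizes sum_t (v_t - P_i(u_t))^2.

    v_ego: ego vehicle speed per sample.
    traffic_percentiles: dict mapping a percentile level to the list of
        P_i(u) values, one per sample (hypothetical data layout).
    keep: optional list of booleans; False drops a sample, e.g. one that
        was recorded in a traffic jam.
    """
    if keep is None:
        keep = [True] * len(v_ego)
    best_level, best_sse = None, float("inf")
    for level, p_values in sorted(traffic_percentiles.items()):
        # Summed squared error over the retained samples (assumption).
        sse = sum((v - p) ** 2
                  for v, p, k in zip(v_ego, p_values, keep) if k)
        if sse < best_sse:
            best_level, best_sse = level, sse
    return best_level


# Toy data: four samples, the third one recorded in a traffic jam.
v_ego = [31.0, 33.0, 12.0, 32.0]
keep = [True, True, False, True]
percentiles = {
    50: [28.0, 29.0, 20.0, 28.5],
    70: [31.0, 32.5, 22.0, 31.5],
    90: [35.0, 36.0, 24.0, 35.5],
}
rho_d = best_matching_percentile(v_ego, percentiles, keep)
```

With the toy data, the 70th percentile profile is closest to the ego speed, so `rho_d` is 70.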


FIG. 3. Velocity percentile observation and velocity prediction error. (a) Observed velocity percentiles of 10 drivers during trip 1 and trip 2. (b) Velocity prediction error for all drivers.

Fig. 3(a) shows the observed velocity percentiles $\rho_d$ for all 10 drivers. The drivers tend to drive faster than the median traffic speed. Most drivers tend to drive consistently, i.e., the velocity percentiles of trips 1 (×) and 2 (◦) are close to each other. However, drivers 2 and 10 have significant inconsistencies between trips 1 and 2. Fig. 3(b) shows a histogram of the velocity prediction error $e = v - P_\rho(u)$. The mean value of the error distribution is 0 m s⁻¹ and the prediction is therefore unbiased.

IV. ENERGY DEMAND PREDICTION ALGORITHM
The task of the EDP algorithm is to predict the energy demand for a planned route from start to destination. The route consists of multiple road segments and for each of the segments, the energy demand is predicted based on the features corresponding to the segment. In the probabilistic approach, the EDP algorithm computes a probability density for each of the segments. The total EDP is the sum of the EDPs for the individual segments. The probability density of the sum of two independent random variables γ and δ is the convolution of their probability density functions:
$$p_{\gamma+\delta}(x) = \int_{-\infty}^{\infty} f_\gamma(y)\, f_\delta(x - y)\,\mathrm{d}y = (f_\gamma * f_\delta)(x). \tag{2}$$
For a route with segments $S_1, S_2, \ldots, S_N$ and predictions $\hat{E}_{c,1}, \hat{E}_{c,2}, \ldots, \hat{E}_{c,N}$, the probability density for the total EDP is
$$p_{\hat{E}_c}(x) = (p_{\hat{E}_{c,1}} * p_{\hat{E}_{c,2}} * \cdots * p_{\hat{E}_{c,N}})(x). \tag{3}$$
According to the central limit theorem, the sum of independent random variables tends toward a normal distribution and the total EDP is
$$p_{\hat{E}_c}(x) = \mathcal{N}(\mu_{\hat{E}_c}, \sigma^2_{\hat{E}_c}), \tag{4}$$
where $\mu_{\hat{E}_c}$ is the mean value and $\sigma^2_{\hat{E}_c}$ is the variance of the normal distribution [5]. To describe the energy demand on a road segment as a function of the available data, we apply two types of regression models, a linear regression (LR) and a neural network (NN). Both models can be used as probabilistic models with random weights p(w).
The length of the road segments is not uniform. Furthermore, the training data measured in the vehicle is measured with a high sampling frequency (10–100 Hz). Therefore, the data exhibit certain irregularities. To make the most out of the available data, we propose a learning scheme operating on two scales. One part of the model is updated with the sampling frequency of the vehicle measurement data while a second part is updated in accordance with the lower, event-based frequency of road segment changes. In the following, the two-scale method and the application of FedAG are presented.

FIG. 4. Block diagram showing a schematic overview of the training process of the two-scale regression model.

A. TWO-SCALE REGRESSION
To optimally learn the regression model using unstructured data, two regression models are applied. The first model ($M_1$) is learned continuously with a data stream (10 Hz) to describe the current energy consumption. The second model ($M_2$) is learned based on the road segments and tries to correct the prediction of the first model. Fig. 4 displays a block diagram of the ML process. $E_{c,i}$ is the vehicle’s measured energy consumption at time i and is the target variable for $M_1$. Feature vector $x_i$ includes the variables measured by the vehicle at time i. Thereby, model weights $W_1$ are learned. Simultaneously, the mean values of features $x_i$ on segment k are calculated
$$\bar{x}_{i \in k} = \frac{1}{l_k} \sum_{i \in k} x_i\, l_i, \tag{5}$$
where $l_k$ is the length of segment k and $l_i$ is the distance driven from time i − 1 to time i. Using the updated weights $W_1$ and features $\bar{x}_{i \in k}$, $M_1$’s estimation of the energy consumption on segment k, $\hat{E}_{c,k}$, is computed. The difference of the true energy consumption $E_{c,k}$ and the estimation $\hat{E}_{c,k}$ delivers the target variable for $M_2$. Based on the feature vector $z_k$, weights $W_2$ are learned. The first model’s features $x_i$ are:
• v vehicle speed,
• α road slope,
• κ road curvature,
• φ traffic phase,
• $u_h$ historic mean traffic speed,
• $u_c$ current mean traffic speed.
The second model’s features $z_k$ are $\bar{x}_{i \in k}$ and additionally:



• $\bar{v}_k - \bar{v}_{k-1}$ segment speed difference,
• $\sigma_v^{(k)}$ segment speed standard deviation,
• $l_k$ segment length.
The prediction step is limited to the road segments, as the predictive data is only reported on that scale. The final EDP is the sum of the predictions computed with $M_1$ and $M_2$:
$$\hat{E}_{c,k} = \hat{E}^{(1)}_{c,k} + \hat{E}^{(2)}_{c,k}. \tag{6}$$

B. FEDERATED LEARNING WITH PREDICTIVE UNCERTAINTY
To learn the proposed regression models including predictive uncertainty, we apply FedAG, which is shown in Algorithm 1 [14].
The central part of the algorithm is the aggregation step, where a Gaussian is fitted to the set of client weights w. In this work, the posterior distributions are found by calculating the mean value $\mu_w$ and variance $\sigma_w^2$ of weights $w^{(k)}$. Subsequently, the posterior distributions for the weights p(w|D) are returned to the clients. The clients use the expected value $\mu_w$ of the weight posterior distributions for further training, but the predictive distributions are computed with
$$p(y \mid x, D) = \int p(y \mid x, D, w)\, p(w \mid D)\,\mathrm{d}w. \tag{7}$$
Since the integral is typically intractable in non-linear models, Markov chain Monte Carlo (MCMC) is used to compute an approximation. In summary, with FedAG, a probabilistic prediction model can be created in an efficient manner, which benefits from a rich data basis of a vehicle fleet, while minimizing communication overhead and preserving the privacy of the users. In case of an unstable internet connection, a client cannot send and receive updates from the server until a stable connection is restored, i.e., the federated learning becomes asynchronous [37]. In this work, we assume a stable connection between the vehicles and the server at all times.

C. FEDERATED LEARNING WITH CLUSTERING
Not all drivers and vehicles exhibit the same driving behavior and energy consumption patterns. Therefore, a single, global model might not be the best choice for the EDP. An alternative is to generate several federated models, each of which acts as a global model for a subset of drivers. A cluster analysis can be executed to divide the set of drivers into subsets. Drivers can then be assigned to these subsets by observing their driving behavior and properties of their vehicles. In this work, we use aggregated data from the drivers to create two driver clusters with k-means clustering [38]. The features used in the clustering are:
• observed velocity percentile ρ,
• relative positive acceleration [39],
• relative velocity in free flowing traffic $v/v_{lim}$,
• distribution of observed traffic phases.
The following driver subsets are generated by the cluster analysis:
$$S_1 = \{2, 4, 7, 8, 9\}, \qquad S_2 = \{1, 3, 5, 6, 10\}.$$
The drivers in $S_1$ can cooperate in learning one model and the drivers in $S_2$ learn a separate model. FedAG with clustering is denoted by FedAG-Clustering (FedAGC) with FedAvg-Clustering (FedAvgC) as the deterministic counterpart. With the availability of a larger dataset with more variety, additional features, e.g., the type of vehicle, geographical region, or the distribution of observed temperature, could be included.

V. PREDICTION VALIDATION
To validate the algorithms presented in Section IV, the data presented in Section III is used. We apply a leave-one-out cross validation where the scheme depends on the learning algorithm. FL algorithms effectively have access to training data from the entire vehicle fleet, whereas conventional ML algorithms, e.g., stochastic gradient descent (SGD), can typically only access data observed by the respective vehicle. In the following, we validate and compare the learning algorithms FedAG, FedAGC, FedAvg, and conventional driver-individual SGD. The algorithms are applied to a linear regression (LR) and a NN. FedAGC is not applied to the NN, as a NN is able to learn more sophisticated dependencies than a linear model and benefits from a larger data basis. The NN has two hidden layers, each containing 50 hidden units. E = 40 passes over the available training data are performed. In FedAG, K = 10 devices denote the 10 drivers, each with C = 1 and batch size B = 1. The training of the NN is a non-convex optimization and


t > 1 rounds are usually required to ensure convergence. We report the results after t = 5 rounds, but further rounds do not improve the results significantly. The training of the LR is a convex optimization and no more than t = 1 round is needed for the training to converge. For FedAG, appropriate precision parameters for the variance of the target variable are estimated using the variance of the training data.

TABLE 2. Performance Evaluation of All Algorithms on All Drives With Mean CRPS and RMSE

FIG. 5. Boxplots showing the distribution of the CRPS on all test drives for the prediction algorithms.
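As a hedged illustration of the aggregation step described in Section IV-B, the following Python sketch fits an independent Gaussian to each weight across the K client models and then draws Monte Carlo samples from the resulting weight posterior for a linear model. The use of the population variance, the diagonal covariance, the omission of observation noise, and all function names are assumptions made for illustration; Algorithm 1 in [14] remains the authoritative description.

```python
import random
from statistics import fmean, pvariance


def fedag_aggregate(client_weights):
    """Fit one Gaussian per weight from the clients' weight vectors.

    client_weights: list of K weight vectors of equal length.
    Returns (mu, var), the per-weight mean and population variance.
    The variance normalization is an assumption of this sketch.
    """
    columns = list(zip(*client_weights))  # one tuple per weight index
    mu = [fmean(col) for col in columns]
    var = [pvariance(col) for col in columns]
    return mu, var


def predictive_samples(x, mu, var, n=1000, seed=0):
    """Sample predictions of a linear model y = w . x with w ~ N(mu, var).

    This approximates the integral in (7) by plain Monte Carlo sampling
    and ignores observation noise (assumption of this sketch).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        w = [rng.gauss(m, v ** 0.5) for m, v in zip(mu, var)]
        samples.append(sum(wi * xi for wi, xi in zip(w, x)))
    return samples


# Toy example with K = 3 clients and two weights per client.
weights = [[0.9, 2.1], [1.0, 2.0], [1.1, 1.9]]
mu_w, var_w = fedag_aggregate(weights)
y_samples = predictive_samples([1.0, 0.5], mu_w, var_w, n=200)
```

The sample mean and spread of `y_samples` then serve as an estimate of the predictive mean and uncertainty for the given feature vector.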

A. PROPER SCORING RULES
To evaluate the performance of the prediction algorithms, proper scoring rules are required. Scoring rules assess the quality of probabilistic predictions by comparing the predictive distribution and the true observation. A scoring rule S is proper if the expected score is optimized by issuing the true distribution of observations as the prediction. In this work, we regard scores as negatively oriented, i.e., a better prediction leads to a lower score. The requirement for a scoring rule S to be proper is thus
$$\mathbb{E}_{y \sim P}\left[S(P, y)\right] \le \mathbb{E}_{y \sim P}\left[S(Q, y)\right], \tag{8}$$
where y is the true observation of the target variable, P is the true distribution of y and Q is a predictive distribution. The equality in (8) only applies when Q = P [40]. The continuous ranked probability score (CRPS) is a proper scoring rule for density predictions of continuous variables
$$\mathrm{CRPS}(Q, y) = \int_{-\infty}^{\infty} \left(Q(x) - H(x - y)\right)^2 \mathrm{d}x, \tag{9}$$
where Q(x) is the cumulative distribution function of the predictive distribution and H is the Heaviside step function. CRPS can be directly compared with the mean absolute error (MAE) of deterministic predictions. Furthermore, CRPS is expressed in the unit of the target variable, e.g., kWh. In the following, CRPS is used as the main performance indicator in the evaluation of the prediction algorithms.

B. PREDICTION PERFORMANCE EVALUATION
Table 2 shows the mean CRPS (MCRPS) and root mean square error (RMSE) for all algorithms on all drives. Boxplots for the distribution of the CRPS on all drives for the algorithms are shown in Fig. 5. The performance of the algorithms in terms of CRPS and RMSE increases with increasing algorithm complexity and the NN trained using FedAG achieves the best performance. For the LR, FedAGC slightly improves the results of FedAG. Generally, the application of FL increases the performance significantly. Last but not least, the probabilistic prediction algorithms achieve a much smaller CRPS than their deterministic counterparts.
A further visualization of the results of the two best performing algorithms is shown in Fig. 6. The figure shows the mean values and 95% confidence intervals of the predictions computed with a NN and a LR trained using FedAG and FedAGC, respectively. The predictions are normalized with the true energy consumption of the respective drive. The observed energy consumption rarely matches the mean value exactly, but falls within the confidence intervals in all drives.

FIG. 6. Mean values and confidence intervals of the predictions computed with a NN and a LR trained using FedAG and FedAGC, respectively, normalized by the true energy consumption.

The prediction for an exemplary drive (No. 18) using the best algorithm, NN-FedAG, is shown in Fig. 7. The green band represents a 95% confidence interval for the accumulated



energy consumption at each point in the drive. The measured energy consumption is shown in purple. Additionally, the predicted traffic speed percentile value is shown in yellow and the observed driving speed is shown in blue. In Fig. 3(a), driver 9 displayed a moderate inconsistency in driving speed (55th and 70th percentiles). In Fig. 7, the velocity prediction fails to predict the high driving speed of more than 50 m s⁻¹ at around 80 km. Nevertheless, the measured driving speed deviates only a little from the predicted velocity and the observed energy consumption always lies within the confidence interval of the prediction.

FIG. 7. Predicted and observed velocity profile and predicted and observed accumulated energy consumption of drive 18, computed with NN-FedAG.

C. SHARPNESS
The sharpness of a prediction is a measure for the concentration of the predictive distribution. One way to measure sharpness of normally distributed predictions is the determinant sharpness (DS) defined as
$$\mathrm{DS} = \det(\Sigma)^{1/(2d)}, \tag{10}$$
where Σ is the covariance matrix of the predictive distribution of dimension d × d. The EDP is univariate (d = 1) and the DS therefore reduces to the standard deviation of the predictive distribution. Fig. 8 shows boxplots displaying the distribution of the determinant sharpness of the predictions on all drives for the three probabilistic algorithms. The NN computes significantly sharper predictive distributions than the LRs in all drives. The clustering in FedAGC brings a marginally significant benefit in sharpness compared to FedAG when tested with a two-sample Kolmogorov-Smirnov test.

FIG. 8. Boxplots showing the distribution of the DS of the probabilistic EDP algorithms.

D. DESTINATION ATTAINABILITY
With a probabilistic EDP and a known available battery energy, the probability of reaching a destination, i.e., destination attainability p(a), can be calculated [5]. However, this is not possible with a deterministic EDP. The available battery energy is a variable that cannot be measured directly, but is estimated with some uncertainty [41]. The attainability can thus be calculated with
$$p(a) = p(\hat{E}_b \ge \hat{E}_c) = p(\hat{E}_b - \hat{E}_c \ge 0), \tag{11}$$
where $\hat{E}_b$ is the estimated available battery energy. Additionally, the amount of energy needed to achieve p(a) = 0.99 can be calculated using the inverse of the normal cumulative distribution function Φ:
$$\hat{E}_{c,p} = \mu_{\hat{E}_c} + \sigma_{\hat{E}_c}\, \Phi^{-1}(p). \tag{12}$$

FIG. 9. Destination attainability p(a) over the course of all drives based on the EDP computed with a NN trained using FedAG.

With (12), the amount of energy to be charged in order to reach a destination can be computed. An important feature of the prediction and attainability estimation is that the destination is ultimately reached. To analyze this, we compute the energy needed for p(a) = 0.99 with (12) for each drive, set the initial battery energy $\hat{E}_b$ to this value and observe the attainability p(a) during the trip. Fig. 9 shows the progression of the destination attainability over the course of all drives. In some drives, the attainability exhibits fluctuation, e.g., in drives 12 and 19, p(a) is significantly lower than 0.99 at times. The gradient of a sharp prediction’s cumulative distribution is proportionally large, so that a single maneuver, e.g., strong acceleration during overtaking, can have a significant impact on the attainability. However, the attainability converges to 1 when the destination is approached and the destination is reached in all drives. The linear models trained using FedAG and FedAGC are also able to accurately estimate the attainability.
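The attainability in (11) and the energy quantile in (12) can be sketched in Python as follows. The sketch assumes that both the available battery energy and the EDP are normally distributed and independent of each other; these assumptions, the function names, and the numbers in the usage example are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def attainability(mu_b, sigma_b, mu_c, sigma_c):
    """p(a) = p(E_b - E_c >= 0) as in (11).

    Assumes independent normal E_b ~ N(mu_b, sigma_b^2) and
    E_c ~ N(mu_c, sigma_c^2), so that the difference is normal as well.
    At least one of the standard deviations must be positive.
    """
    diff = NormalDist(mu_b - mu_c, sqrt(sigma_b ** 2 + sigma_c ** 2))
    return 1.0 - diff.cdf(0.0)


def energy_for_confidence(mu_c, sigma_c, p=0.99):
    """Energy quantile from (12): mu + sigma * Phi^{-1}(p)."""
    return mu_c + sigma_c * NormalDist().inv_cdf(p)


# Usage with made-up numbers in kWh: an EDP of 20 +/- 1 kWh.
mu_c, sigma_c = 20.0, 1.0
e_99 = energy_for_confidence(mu_c, sigma_c, 0.99)

# Battery energy known exactly versus estimated with 0.3 kWh uncertainty.
p_exact = attainability(e_99, 0.0, mu_c, sigma_c)
p_uncertain = attainability(e_99, 0.3, mu_c, sigma_c)
```

Setting the battery energy to the 0.99 quantile reproduces an attainability of 0.99 when the battery energy is known exactly; any additional uncertainty in the battery estimate lowers the attainability for the same mean energy.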


TABLE 3. Calibration Error Measures for the Destination Attainability With


Probabilistic EDP Algorithms

E. CALIBRATION
The value $p(a)$ can also be called the confidence of the attainability estimation, and the observed ratio of drives in which the destination is reached can be denoted as accuracy. If the confidence always matches the accuracy, the prediction is well calibrated [42]. A measure for the calibration of the attainability decision is the expected difference between confidence and accuracy
$$\mathbb{E}\left[\left| P\left(\hat{Y} = Y \mid \hat{P} = p\right) - p \right|\right], \tag{13}$$
where the accuracy term $P(\hat{Y} = Y \mid \hat{P} = p)$ is the probability of the prediction $\hat{Y}$ being equal to the observation $Y$ given the estimated confidence $\hat{P} = p$ of the predictor. Perfect calibration, although impossible, is when the expected difference is zero [43]. Using (12) and the observed energy consumption, the accuracy for different $p$-values can be computed. In our application, the accuracy $\beta$ is the empirical frequency of successful trips given the EDP $\hat{E}_{c,p}$ and confidence $p$
$$\beta(p) = \frac{1}{N_D} \sum_{j} \mathbb{1}\left(\hat{E}_{c,p}^{(j)} \geq E_c^{(j)}\right), \tag{14}$$
where $N_D$ is the total number of drives. The expected calibration error (ECE) is defined as the mean difference between accuracy and confidence
$$\mathrm{ECE} = \frac{1}{N_p} \sum_{i} \left|\beta(p_i) - p_i\right|, \tag{15}$$
where $N_p$ is the number of confidence levels $p$ tested. The maximum calibration error (MCE) is the maximum difference
$$\mathrm{MCE} = \max_{i} \left|\beta(p_i) - p_i\right|. \tag{16}$$
Finally, the idealized root mean square calibration error (RMSCE) is defined as
$$\mathrm{RMSCE} = \sqrt{\frac{1}{N_p} \sum_{i} \left|\beta(p_i) - p_i\right|^2}. \tag{17}$$

Table 3 shows the ECE, MCE and RMSCE values for the probabilistic prediction algorithms. The LR trained with FedAGC has the lowest calibration errors, followed by the LR and NN trained with FedAG. The ranking of the algorithms thus differs from the ranking by prediction performance in terms of CRPS and RMSE. Fig. 10 shows a reliability diagram visualizing the expected sample accuracy of the attainability estimation as a function of the confidence of the prediction. The black, straight line with slope 1 is the ideal calibration. NN-FedAG tends to be slightly under-confident for $p < 0.5$ but slightly over-confident for $p > 0.5$. Guo et al. discovered that modern NNs are often poorly calibrated [42]. A poorly calibrated prediction can not only lead to a driver being stranded with an empty battery, but also to a significantly higher travel time if the prediction tends to be under-confident. Nonetheless, all three probabilistic EDP algorithms exhibit a sufficient calibration.

FIG. 10. Reliability diagram for the destination attainability estimation using the probabilistic EDP algorithms.

VI. SAFETY MARGIN AND TRAVEL TIME
A central task of the EDP is to enable certain decision making for attainability and charge planning. The requirement is to predict the energy demand so that a destination can be reached safely without an unnecessarily large safety margin $b_E$. A safety margin is the proportion of battery energy reserved in case of an inaccurate prediction. A robust EDP should thus maximize the probability of attaining the destination while minimizing the safety margin, which in turn maximizes the effective driving range of the vehicle. The user primarily experiences how far he can drive without charging and how fast he can travel from A to B. Hence, the user experience is positively influenced by an appropriate safety margin. The safety margin is closely related to the sharpness of the prediction, and a sharp prediction leads to a smaller safety margin than a less sharp prediction. In the following, we analyze the safety margins resulting from the EDPs and their impact on travel time.

A. SAFETY MARGIN
With probabilistic predictions, the safety margin can be directly derived from the predictive distribution. The difference between the mean value and the $p = 0.99$ value of the predictive distribution can be seen as a safety margin. Using (12), these values can be calculated and the safety margin $b_E^{(p)}$ is
$$b_E^{(p)} = 1 - \frac{\hat{E}_{c,0.5}}{\hat{E}_{c,0.99}}. \tag{18}$$
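A minimal sketch of how the calibration measures (14) to (17) and the probabilistic safety margin (18) could be computed is given below. It assumes Gaussian predictive distributions, so that the quantile $\hat{E}_{c,p}$ is available in closed form. The function names, the Gaussian assumption, and the toy data are illustrative assumptions and do not reproduce the implementation or results of this paper.

```python
import math
from statistics import NormalDist


def quantile(mean, std, p):
    """p-quantile of an assumed Gaussian energy demand prediction."""
    return NormalDist(mean, std).inv_cdf(p)


def accuracy(predictions, observed, p):
    """Eq. (14): share of drives whose observed demand does not exceed the p-quantile."""
    hits = sum(quantile(m, s, p) >= e for (m, s), e in zip(predictions, observed))
    return hits / len(observed)


def calibration_errors(predictions, observed, levels):
    """Eqs. (15)-(17): ECE, MCE and RMSCE over the tested confidence levels."""
    gaps = [abs(accuracy(predictions, observed, p) - p) for p in levels]
    ece = sum(gaps) / len(gaps)
    mce = max(gaps)
    rmsce = math.sqrt(sum(g * g for g in gaps) / len(gaps))
    return ece, mce, rmsce


def safety_margin(mean, std, p=0.99):
    """Eq. (18): 1 - E_{c,0.5} / E_{c,p} for a single predictive distribution."""
    return 1.0 - quantile(mean, std, 0.5) / quantile(mean, std, p)


# Hypothetical toy data: (mean, std) in kWh per drive and the observed demand.
predictions = [(20.0, 1.0), (35.0, 2.0), (12.0, 0.5), (50.0, 3.0)]
observed = [20.5, 33.0, 12.2, 49.0]
levels = [0.1, 0.3, 0.5, 0.7, 0.9]

ece, mce, rmsce = calibration_errors(predictions, observed, levels)
print(f"ECE={ece:.3f}  MCE={mce:.3f}  RMSCE={rmsce:.3f}")
print(f"margin, sharp: {safety_margin(20.0, 1.0):.3f}")
print(f"margin, less sharp: {safety_margin(20.0, 3.0):.3f}")
```

With the same mean, the narrower distribution yields the smaller margin, in line with the relation between sharpness and safety margin stated above.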



Here, the superscript $(p)$ denotes that the safety margin is based on a probabilistic prediction. A deterministic prediction includes no information about the uncertainty of the prediction, and a safety margin cannot be derived directly. In a previous publication, we suggested calculating the safety margin $b_E^{(d)}$ based on the maximum probable error
$$b_E^{(d)} \geq \left(\frac{\mathrm{d}E_c/\mathrm{d}s}{e_{\max}} + 1\right)^{-1}, \tag{19}$$
where $\mathrm{d}E_c/\mathrm{d}s$ is the mean consumption, $e_{\max}$ is the maximum probable error in terms of energy per distance, and the superscript $(d)$ denotes that the safety margin is based on a deterministic prediction. We observe that when (19) is applied to the mean values of the probabilistic EDPs, a safety margin similar to the maximum value of the probabilistic safety margins is found:
$$b_E^{(d)} \approx \max\left(b_E^{(p)}\right). \tag{20}$$
Fig. 11 shows empirical cumulative probability distributions for the resulting safety margins of the EDP algorithms. The ranking of the algorithms is the same as according to CRPS. Predictions with a NN lead to lower safety margins than with a LR, and the probabilistic FedAG leads to lower safety margins than FedAvg and SGD. A disadvantage of a constant, deterministic safety margin is that it is frequently too large. An unnecessarily large safety margin reduces the effective driving range and reduces the possibilities for feasible routing and charge planning strategies [26].

FIG. 11. Empirical cumulative probability distributions of safety margins $b_E$ of the EDP algorithms.

B. INFLUENCE ON TRAVEL TIME
The safety margin determines the amount of reserved battery energy. The smaller the safety margin, the further a BEV can drive before a charging stop needs to be planned. Thereby, a faster charging point might be attainable. Additionally, a planned charging stop may be shorter, since with a smaller safety margin, the energy needed for the continuation of the trip may be smaller. Additionally, the driving time may be reduced as well, if a more convenient charging point (CP) is attainable with greater effective driving range. The safety margin therefore has a direct influence on charging and travel time. To quantify this, a stochastic framework was developed to analyse the influence of different vehicle parameters [3]. The framework includes the real road and charging infrastructure, in which virtual routes can be defined based on mobility patterns and population data. Using traffic data, speed profiles for the routes are generated. BEV powertrain and battery models are included to allow a calculation of the energy demand for the routes. With route planning, EDP and charge planning, the fastest route is computed. In turn, driving time and charging time can be measured. For a more detailed description of the framework, we refer the reader to [3]. Using this same stochastic framework, we simulate 452 random long-distance trips in Europe and North America and use the total time spent charging as a performance indicator. Table 4 shows an overview of the simulation data.

TABLE 4. Simulation Data Overview

Table 5 shows the simulated safety margins $b_E$ based on Fig. 11 and the resulting total charging time as a percentage of a benchmark algorithm LR-SGD. With probabilistic predictions, the safety margins follow a normal distribution. The results for charging time in Table 5 show that a decreased safety margin $b_E$ leads to a decrease in charging time. The advantage of a probabilistic EDP, such as with FedAG, over a deterministic EDP can be seen as well. A further analysis of the charging time benefit of probabilistic EDPs can be seen in Fig. 12, where the distribution of difference in charging time over the complete route collective is shown. In the case of LR, the mean reduction in charging time is approximately 4.7% when predictive uncertainty is considered. For a NN, including predictive uncertainty leads to a mean charging time reduction of 2.3%. The predictions with NNs are generally significantly sharper than those performed with LRs. Furthermore, the variance of sharpness and $b_E$ is lower. This leads to the somewhat smaller charging time reduction. Nevertheless, considering predictive uncertainty explicitly improves charging and travel time, especially in regions with sparse charging infrastructure.

TABLE 5. Safety Margin and Charging Time Results

FIG. 12. Proportional charging time reduction between probabilistic and deterministic EDP algorithms.

VII. CONCLUSIONS AND FURTHER WORK
A network of connected BEVs and backend infrastructure in the cloud constitutes a distributed system with various sources of information relevant for the energy demand prediction. By applying federated learning and computing a probabilistic prediction, the uncertainty of the distributed data is considered in a communication-efficient and privacy-preserving manner. With a multi-scale regression, the prediction models can be trained using data measured in the vehicles while the predictions are computed with data from TRDB directly in the cloud. The energy demand predictions are validated with real driving data and the performance is measured with proper scoring rules. The performance of the probabilistic predictions is superior to conventional deterministic predictions. Furthermore, a non-linear model (NN) achieves higher performance in terms of CRPS and RMSE than a linear model (LR). A probabilistic prediction allows the estimation of destination attainability, i.e., the probability of reaching a destination using the available battery energy. The calibration of this estimation is sufficient and the error between accuracy and confidence is low for all algorithms. A further advantage of an accurate, probabilistic energy demand prediction is the variable safety margin. This leads to a better utilization of the battery energy and increases the effective driving range. Additionally, this translates into a shorter travel and charging time on long-distance trips. Our further work includes more research on how global, federated models can be personalized for the participating drivers. The analysis of system design, network usage, and transmission time might prove an important area for future research [44]. Moreover, an ever-expanding database of real driving data can be used to confirm the presented advantages of federated learning and probabilistic models for BEV driving.

REFERENCES
[1] M. Eisel, I. Nastjuk, and L. Kolbe, “Understanding the influence of in-vehicle information systems on range stress – Insights from an electric vehicle field experiment,” Transp. Res. Part F: Traffic Psychol. Behav., vol. 43, pp. 199–211, 2016.
[2] T. Franke, I. Neumann, F. Bühler, P. Cocron, and J. Krems, “Experiencing range in an electric vehicle: Understanding psychological barriers,” Appl. Psychol.: Int. Rev., vol. 61, no. 3, pp. 368–391, 2012.
[3] A. T. Thorgeirsson, S. Scheubner, S. Fünfgeld, and F. Gauterin, “An investigation into key influence factors for the everyday usability of electric vehicles,” IEEE Open J. Veh. Technol., vol. 1, pp. 348–361, Oct. 2020.
[4] Y. Zhang, W. Wang, Y. Kobayashi, and K. Shirai, “Remaining driving range estimation of electric vehicle,” in Proc. IEEE Int. Electric Veh. Conf., Greenville, SC, USA, 2012, pp. 1–7.
[5] S. Scheubner, A. Thorgeirsson, M. Vaillant, and F. Gauterin, “A stochastic range estimation algorithm for electric vehicles using traffic phase classification,” IEEE Trans. Veh. Technol., vol. 68, no. 7, pp. 6414–6428, Jul. 2019.
[6] F. Morlock, B. Rolle, M. Bauer, and O. Sawodny, “Forecasts of electric vehicle energy consumption based on characteristic speed profiles and real-time traffic data,” IEEE Trans. Veh. Technol., vol. 69, no. 2, pp. 1404–1418, Feb. 2020.
[7] L. Thibault, G. D. Nunzio, and A. Sciarretta, “A unified approach for electric vehicles range maximization via eco-routing, eco-driving, and energy consumption prediction,” IEEE Trans. Intell. Veh., vol. 3, no. 4, pp. 463–475, Dec. 2018.
[8] Z. Yi and P. H. Bauer, “Optimal speed profiles for sustainable driving of electric vehicles,” in Proc. IEEE Veh. Power Propulsion Conf., 2015, pp. 1–6.
[9] Z. Yi and P. H. Bauer, “Energy aware driving: Optimal electric vehicle speed profiles for sustainability in transportation,” IEEE Trans. Intell. Transp. Syst., vol. 20, no. 3, pp. 1137–1148, Mar. 2019.
[10] A. Fotouhi, N. Shateri, D. S. Laila, and D. J. Auger, “Electric vehicle energy consumption estimation for a fleet management system,” Int. J. Sustain. Transp., vol. 15, no. 1, pp. 40–45, 2020.
[11] Z. Yi and P. H. Bauer, “Optimization models for placement of an energy-aware electric vehicle charging infrastructure,” Transp. Res. Part E: Logistics Transp. Rev., vol. 91, pp. 227–244, Jul. 2016.
[12] S. Deepak, A. Amarnath, G. KrishnanU, and S. Kochuvila, “Survey on range prediction of electric vehicles,” in Proc. Innov. Power Adv. Comput. Technol. (i-PACT), 2019, pp. 1–7.
[13] B. McMahan, E. Moore, D. Ramage, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Proc. 20th Int. Conf. Artif. Intell. Statist., ser. Proceedings of Machine Learning Research, A. Singh and J. Zhu, Eds., vol. 54. Fort Lauderdale, FL, USA: PMLR, Apr. 20–22, 2017, pp. 1273–1282.
[14] A. T. Thorgeirsson and F. Gauterin, “Probabilistic predictions with federated learning,” Entropy, vol. 23, no. 1, Dec. 2020, Art. no. 41.
[15] Z. Yi and P. H. Bauer, “Adaptive multiresolution energy consumption prediction for electric vehicles,” IEEE Trans. Veh. Technol., vol. 66, no. 11, pp. 10515–10525, Nov. 2017.
[16] S. Grubwinkler, M. Kugler, and M. Lienkamp, “A system for cloud-based deviation prediction of propulsion energy consumption for EVs,” in Proc. IEEE Int. Conf. Veh. Electron. Saf., 2013, pp. 99–104.
[17] A. Jayakumar, F. Ingrosso, G. Rizzoni, J. Meyer, and J. Doering, “Crowd sourced energy estimation in connected vehicles,” in Proc. IEEE Int. Electric Veh. Conf., Dec. 2014, pp. 1–8.
[18] S. Sun, J. Zhang, J. Bi, and Y. Wang, “A machine learning method for predicting driving range of battery electric vehicles,” J. Adv. Transp., vol. 2019, pp. 1–14, Jan. 2019.
[19] A. Fukushima, T. Yano, S. Imahara, H. Aisu, Y. Shimokawa, and Y. Shibata, “Prediction of energy consumption for new electric vehicle models by machine learning,” IET Intell. Transport Syst., vol. 12, no. 9, pp. 1174–1180, 2018.
[20] C. D. Cauwer, W. Verbeke, T. Coosemans, S. Faid, and J. V. Mierlo, “A data-driven method for energy consumption prediction and energy-efficient routing of electric vehicles in real-world conditions,” Energies, vol. 10, no. 5, May 2017, Art. no. 608.



[21] J. A. Oliva, C. Weihrauch, and T. Bertram, “Model-based remaining driving range prediction in electric vehicles by using particle filtering and Markov chains,” in Proc. IEEE World Electric Veh. Symp. Exhib. (EVS27), Barcelona, Spain, 2013, pp. 1–10.
[22] P. Ondruska and I. Posner, “Probabilistic attainability maps: Efficiently predicting driver-specific electric vehicle range,” in Proc. IEEE Intell. Veh. Symp. Proc., Dearborn, MI, USA, 2014, pp. 1169–1174.
[23] M. W. Fontana, “Optimal routes for electric vehicles facing uncertainty, congestion, and energy constraints,” Ph.D. dissertation, Massachusetts Inst. of Technol., Cambridge, MA, USA, 2013. [Online]. Available: http://hdl.handle.net/1721.1/84715
[24] Z. Yi and P. H. Bauer, “Optimal stochastic eco-routing solutions for electric vehicles,” IEEE Trans. Intell. Transp. Syst., vol. 19, no. 12, pp. 3807–3817, Dec. 2018.
[25] S. Pelletier, O. Jabali, and G. Laporte, “The electric vehicle routing problem with energy consumption uncertainty,” Transp. Res. Part B: Methodol., vol. 126, pp. 225–255, Aug. 2019.
[26] G. Huber, K. Bogenberger, and H. van Lint, “Optimization of charging strategies for battery electric vehicles under uncertainty,” IEEE Trans. Intell. Transp. Syst., to be published, doi: 10.1109/TITS.2020.3027625.
[27] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification. Hoboken, NJ, USA: Wiley, 2012.
[28] S. Grubwinkler, T. Brunner, and M. Lienkamp, “Range prediction for EVs via crowd-sourcing,” in Proc. IEEE Veh. Power Propulsion Conf., 2014, pp. 1–6.
[29] C.-M. Tseng and C.-K. Chau, “Personalized prediction of vehicle energy consumption based on participatory sensing,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 11, pp. 3103–3113, Nov. 2017.
[30] T. Straub, M. Nagy, M. Sidorov, L. Tonetto, M. Frey, and F. Gauterin, “Energetic map data imputation: A machine learning approach,” Energies, vol. 13, no. 4, Feb. 2020, Art. no. 982.
[31] S. Samarakoon, M. Bennis, W. Saad, and M. Debbah, “Federated learning for ultra-reliable low-latency V2V communications,” in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), 2018, pp. 1–7.
[32] S. R. Pokhrel and J. Choi, “A decentralized federated learning approach for connected autonomous vehicles,” in Proc. IEEE Wireless Commun. Netw. Conf. Workshops (WCNCW), 2020, pp. 1–6.
[33] Y. Liu, J. J. Q. Yu, J. Kang, D. Niyato, and S. Zhang, “Privacy-preserving traffic flow prediction: A federated learning approach,” IEEE Internet Things J., vol. 7, no. 8, pp. 7751–7763, Aug. 2020.
[34] A. T. Thorgeirsson, M. Vaillant, S. Scheubner, and F. Gauterin, “Evaluating system architectures for driving range estimation and charge planning for electric vehicles,” Softw.: Pract. Experience, vol. 51, no. 1, pp. 72–90, 2021.
[35] HERE Global B.V., “Traffic API developer’s guide,” Accessed: Jul. 2020. [Online]. Available: https://developer.here.com/documentation/traffic/dev_guide/topics/api-reference.html
[36] B. S. Kerner, “Three-phase traffic theory and highway capacity,” Phy. A: Stat. Mechanics Appl., vol. 333, pp. 379–440, 2004.
[37] M. R. Sprague et al., “Asynchronous federated learning for geospatial applications,” in ECML PKDD 2018 Workshops, ECML PKDD 2018. Communications in Computer and Information Science, A. Monreale et al., Eds., Dublin, Ireland: Springer, 2019, pp. 21–28.
[38] S. Lloyd, “Least squares quantization in PCM,” IEEE Trans. Inf. Theory, vol. 28, no. 2, pp. 129–137, Mar. 1982.
[39] E. Ericsson, “Independent driving pattern factors and their influence on fuel-use and exhaust emission factors,” Transp. Res. Part D: Transport Environ., vol. 6, no. 5, pp. 325–345, 2001.
[40] T. Gneiting, F. Balabdaoui, and A. Raftery, “Probabilistic forecasts, calibration and sharpness,” J. Roy. Stat. Soc.: Ser. B. (Stat. Methodol.), vol. 69, no. 2, pp. 243–268, 2007.
[41] L. Lu, X. Han, J. Li, J. Hua, and M. Ouyang, “A review on the key issues for lithium-ion battery management in electric vehicles,” J. Power Sources, vol. 226, no. 1, pp. 272–288, 2013.
[42] C. Guo, G. Pleiss, Y. Sun, and K. Q. Weinberger, “On calibration of modern neural networks,” in Proc. 34th Int. Conf. Mach. Learn., 2017, pp. 1321–1330.
[43] M. P. Naeini, G. F. Cooper, and M. Hauskrecht, “Obtaining well calibrated probabilities using Bayesian binning,” in Proc. 29th AAAI Conf. Artif. Intell., Austin, TX, USA, 2015, pp. 2901–2907.
[44] K. Bonawitz et al., “Towards federated learning at scale: System design,” in Proc. Mach. Learn. Syst., A. Talwalkar, V. Smith, and M. Zaharia, Eds., 2019, pp. 374–388.

ADAM THOR THORGEIRSSON received the B.Sc. degree in mechanical engineering from the University of Iceland, Reykjavik, Iceland, in 2014, and the M.Sc. degree in mechanical engineering from the Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany, in 2017. He is currently a Doctoral Candidate with Dr. Ing. h.c. F. Porsche AG, Ludwigsburg, Germany, and supervised by Frank Gauterin (KIT).

STEFAN SCHEUBNER received the B.Sc. degree in mechanical engineering from the Karlsruhe Institute of Technology, Karlsruhe, Germany, and the M.Sc. degree in mechanical engineering from the Korean Advanced Institute of Science and Technology, Daejeon, South Korea, in 2011 and 2015, respectively. From 2015 to 2020, he worked on energy management functions for BEVs with Porsche AG, Stuttgart, Germany. Since then, he has been concerned with DC fast-charging infrastructure for BEVs with EnBW AG, Karlsruhe, Germany.

SEBASTIAN FÜNFGELD received the M.Sc. degree in mechanical engineering from the Karlsruhe Institute of Technology, Karlsruhe, Germany, in 2014. Since 2014, he has been with Dr. Ing. h.c. F. Porsche AG, Ludwigsburg, Germany. His research interests include intelligent control and predictive systems, including modeling of driver and traffic behavior, stochastic forecasting, and stochastic optimization.

FRANK GAUTERIN received the Diploma degree in physics from the University of Münster, Münster, Germany, in 1989, and the Dr. rer. nat. degree (Ph.D.) in physics from the University of Oldenburg, Oldenburg, Germany, in 1994. From 1989 to 2006, he was in different R & D positions with Continental AG, Hanover, Germany, leaving as the Director of NVH Engineering (noise, vibration, harshness). Since 2006, he has been a Full Professor with the Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany. He is currently the Head with the Institute of Vehicle System Technology, KIT, and a Scientific Spokesperson with KIT Center Mobility Systems. His research interests include vehicle control, vehicle dynamics, vehicle NVH, vehicle suspension, tire dynamics and tire-road-interaction, vehicle concepts, vehicle modeling, and identification methods.

