Statistical Model of Relationship Between Natural Gas Consumption and Temperature
1. Introduction
In this chapter, we will describe a statistical model which was developed from first
principles and from empirical behavior of the real data to characterize the relationship
between the consumption of natural gas and temperature in several segments of a typical
gas utility company’s customer pool. Specifically, we will deal with household (HOU) and
small and medium-sized commercial (SMC) customers. For several reasons, consumption
modeling is both challenging and important here. The essential fact is that these segments
are very numerous in terms of customer numbers. This leads to three practically significant
consequences.
First, their aggregated consumption constitutes an important part of the total gas
consumption for a particular day.
Secondly, their consumption depends strongly on the ambient temperature. Hence,
the temperature lends itself as a convenient and cheap-to-obtain exogenous predictor.
The temperature response is nonlinear and quite complex, however. Traditional,
simplistic approaches to its extraction are not adequate for many practical
purposes.
Further, the number of customers is high, so that their individual follow-up in fine
time resolution (say daily) is not feasible from financial and other points of view.
Routinely, their individual data are available only at a very coarse (time-
aggregated) level, typically in the form of approximately annual consumption
totals obtained from more or less regular meter readings. When daily consumption
is of interest, however, the available observations need to be disaggregated.
Disaggregation is necessary for various practical purposes – for instance for the routine
distribution network balancing, for billing computations related to the natural gas price
changes (leading to the need for pre- and post-change consumption part estimates), etc. As
required by the market regulator, the resulting estimates need to be as precise as possible,
www.intechopen.com
394 Natural Gas
and hence they need to use available information effectively and correctly. Therefore, they
should be based on a good, formalized model of the gas consumption. Since the main driver
of natural gas consumption is temperature, any useful model should reflect the consumption
response to temperature as closely as possible. It ought to follow basic qualitative features of
the relationship (consumption is a decreasing function of temperature having both lower
and upper asymptotes), but it needs to incorporate also much finer details of the
relationship observed in empirical data.
Our model tries to achieve just this and a bit more, as we will describe in the following
paragraphs. It is based on our analyses of rather large amounts of real consumption data of
unique quality (namely of fine time resolution) that was obtained during several projects
our team was involved in during the last several years. These include the Gamma project,
Standardized load profiles (SLP) projects in both the Czech Republic and Slovakia, as well
as the Elvira project (Elvira, 2010). Consumption-to-temperature relationships were
analyzed there in order to be able to model/describe them in a practically usable way.
Our resulting model is built in a stratified way, where the strata had been defined
previously via formal clustering of the consumption dynamics profiles (Brabec et al., 2009).
The stratification concerns the values of model parameters only, however. The form of the
model is kept the same in all strata, both in order to retain simplicity advantageous for
practical implementation and for saving the possibility of a relatively easy (dynamic) model
calibration (Brabec et al., 2009a). Model parameters are estimated from data in a formalized
way (based on statistical theory). The data consist of a sample of consumption trajectories
obtained through individualized measurements (obtained in rare and costly measurement
campaigns for nationwide studies mentioned above).
Construction of the model keeps the same philosophy as our previous models that have
been in practical use in Czech and Slovak gas utility companies (Brabec et al., 2009),
(Vondráček et al., 2008). It is modular, stressing physical interpretation of its components.
This is useful both for practical purposes (e.g. the ability to estimate certain latent quantities
that are not accessible to direct measurement but might be of practical interest) and for
model criticism and improvement (good serviceability of the model).
The model we present here is substantially different from the standardized load profile
(SLP) model we published previously (Brabec et al., 2009) and from other gas consumption
models (Vondráček et al., 2008) in that it has no standard-consumption (or consumption
under standard conditions) part. This is advantageous because the model is more responsive to
temperature changes, especially in years whose temperature dynamics are far from being
“standard” and in transition (spring and fall) periods even during close-to-normal years.
Absence of the smooth standard-consumption part also simplifies the interpretation of
various model parts. It calls for expansion of the temperature response function. Here, we
start from the approach of (Brabec et al., 2008), but we expand it substantially in three
important ways:
Shape of the temperature response is estimated in a flexible, nonparametric way
(so that we let the empirical data speak for themselves, without presupposing
any a priori parametric shape).
Dynamic character of the temperature response and mainly its lag structure is
captured in much more detail.
The model now allows for a temperature*(type of the day) interaction. In plain
words, this means that it allows for different temperature responses on different
days of the week.
Numerous papers have discussed various aspects of modeling, estimation and prediction of
natural gas consumption for various groups of customers such as residential, commercial,
and industrial. Similar tasks are solved in the context of electricity load. Load profiles are
typically constructed using detailed measurements of a sample of customers from each
group. Other methods include dynamic modeling (historical load data are related to an
external factor such as temperature) or proxy days (a day in history is selected which closely
matches the day being estimated). The optimal profiling method should be chosen based on
cost, accuracy and predictability (Bailey, 2000). The close association between gas demand and
outdoor temperature was recognized a long time ago, so the first approaches to modeling
were typically based on regression models with temperature as the most important
regressor. Among such models, nonlinear regression approaches to gas consumption
modeling prevail (Potocnik, 2007). The concept of heating degree days is sometimes used to
suppress the temperature dependency during the days when no heating is needed (Gil &
Deferrari, 2004).
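The heating degree day idea mentioned above can be illustrated with a minimal sketch. The base temperature of 18 °C and the function name are assumptions made for illustration only; the cited works may use a different threshold and definition.

```python
def heating_degree_days(daily_mean_temps, base_temp=18.0):
    """Heating degree days for a list of daily mean temperatures (deg C).

    base_temp is an assumed heating threshold: days at or above it
    contribute zero, so no temperature dependence is implied for them.
    """
    return [max(0.0, base_temp - t) for t in daily_mean_temps]


temps = [-5.0, 3.5, 12.0, 18.0, 24.0]
hdd = heating_degree_days(temps)
print(hdd)
```

Warm days are mapped to zero, which is how this construct suppresses the temperature dependence outside the heating season.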
In addition to the temperature, weather variables like sunshine length or wind speed are
studied as potential predictors. Among other important explanatory variables mentioned in
the literature one can find calendar effects, seasonal effects, dwelling characteristics, site
altitude, client type (residential or commercial customer), or character of natural gas end-use.
Economic, social and behavioral aspects influence the energy consumption, as well.
Data on many relevant potential predictors are not available. Regression and econometric
models may include ARMA terms to capture the effects of latent and time-varying variables.
Another large group of models is based on the classical time series approach, especially on
Box-Jenkins methodology (Lyness, 1984), or on complex time series modifications.
In the following, we will first describe the model construction in a formalized and general
way, having in mind its practical implementation, however. Then, we will illustrate its
performance on real data.
we used them as segments, similarly as in (Vondráček et al., 2008). This way, we have $K = 8$
segments (4 HOU + 4 SMC in the Czech Republic and 2 HOU + 6 SMC in Slovakia).
[...]
where $I(\text{condition})$ is an indicator function. It assumes the value of 1 when the condition in its
argument is true and 0 otherwise. The model (1) has several unknown parameters (that will
have to be estimated from training data somehow).
We will now explain their meaning. $\alpha_{jk}$ is the effect of the $j$-th type of the day
($j = 1, \dots, 5$). Note that different segments have different day type effects (because of the
subscripting by $k$). The notation is similar to the so-called textbook parametrization often
used in the ANOVA and general linear models’ context (Graybill, 1976; Searle, 1971). We
hasten to add that, for numerical stability, the model is actually fitted in the so-called
sum-to-zero (or contr.sum) parametrization (2), in which the five day type effects of each
segment are constrained to sum to zero
(Rawlings, 1988). In other words, we reparametrize the model (1) to the sum-to-zero form for
numerical computations and then we reparametrize the results back to the textbook
parametrization for convenience. Table 1 shows how different types of the day $D_1, \dots, D_5$
are defined by specifying for which particular triplet $(t-1, t, t+1)$ a particular day type
holds. Non-working days are the weekends and (generic) bank holidays of any kind. On the
other hand, two further segment-specific parameters of (1) are the effects of the special
Christmas and Easter holidays. Note that these effects act on top of the generic holiday
effect, so that the total holiday effect e.g. for the 25th of December is (on the log scale) the
sum of the generic holiday effect (given by day type 4, from Table 1) and the Christmas effect.
The Christmas period is (in the Central European implementations of the model) defined to
consist of December 23, 24, 25 and 26, while the Easter period is defined to consist of the
Wednesday, Thursday, Friday and Saturday of the week before Easter Monday. The temperature
correction is the most important part of the model, with quite a rich internal structure that
we will explain in detail in the next section. $p_{ik}$ is a multiple of the so-called expected
annual consumption (scaled as a daily consumption average) for the $i$-th customer. It is
estimated from the past consumption record (typically 3 calendar years) of the particular
customer. For instance, if we have $m$ roughly annual consumption readings
$Y_{ik,i1}, \dots, Y_{ik,im}$ in the intervals $[t_{i,1}, t_{i,2}], \dots, [t_{i,2m-1}, t_{i,2m}]$,
we compute the estimate $\hat{p}_{ik}$ from them
and then condition on that estimate (i.e., we take $\hat{p}_{ik}$ for the unknown $p_{ik}$) in all the
development that follows. That way, we buy considerable computational simplicity,
compared to the correct estimation based on nonlinear mixed effects model style estimation
(Davidian & Giltinan, 1995; Pinheiro & Bates, 2000) at the expense of neglecting some
(relatively minor) part of the variability in the consumption estimates. It is important,
however, that the integration period for the $\hat{p}_{ik}$ estimation is long enough.
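As a rough illustration of scaling roughly annual readings to a daily consumption average, the following sketch divides the total consumption by the total number of days covered. This simple ratio is an assumption made here for illustration; it is not claimed to be the chapter's own estimator of $p_{ik}$, and the function name and the sample readings are hypothetical.

```python
from datetime import date


def daily_average_consumption(readings):
    """Average daily consumption from (start, end, consumption) readings.

    Each reading covers the days from start (inclusive) to end (exclusive).
    Illustrative ratio estimate only: total consumption / total days.
    """
    total = sum(consumption for _, _, consumption in readings)
    days = sum((end - start).days for start, end, _ in readings)
    if days <= 0:
        raise ValueError("readings must cover a positive number of days")
    return total / days


# Three roughly annual meter readings for one hypothetical customer.
readings = [
    (date(2007, 1, 10), date(2008, 1, 15), 1850.0),
    (date(2008, 1, 15), date(2009, 1, 12), 1815.0),
    (date(2009, 1, 12), date(2010, 1, 14), 1835.0),
]
p_hat = daily_average_consumption(readings)
print(p_hat)
```

Using about three years of readings, as in the text, averages out year-to-year weather differences better than a single reading would.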
Note that (1) immediately implies a particular separation
$$\mu_{ikt} = p_{ik} \cdot f_{kt} \qquad (4)$$
into the individual-specific $p_{ik}$ and the common $f_{kt}$ terms. Obviously, the separation is
additive on the log scale.
The additive random error term (independent across $i, k, t$) has a normal distribution with
zero mean and variance $\sigma_k^2 \cdot \mu_{ikt}$ (the variance-to-mean ratio is allowed to differ
across segments). This means that also the observable consumption $Y_{ikt}$ has a normal
distribution, $Y_{ikt} \sim N(\mu_{ikt}, \sigma_k^2 \cdot \mu_{ikt})$, with mean $\mu_{ikt}$ (i.e. the true
consumption mean for a situation given by calendar effects and temperature), variance
$\sigma_k^2 \cdot \mu_{ikt}$, and coefficient of variation given by $\sigma_k / \sqrt{\mu_{ikt}}$. This is
a bit milder variance-to-mean relationship than that used in (Brabec et al., 2009). The
distribution is heteroscedastic (both over individuals and over time). Specifically, variability
increases for times when the mean consumption is higher and also for individuals with
higher average consumption (within the same segment). These changes are such that the
coefficient of variation decreases within a segment, but its proportionality factor is allowed
to change among segments to reflect different consumption volatility of e.g. households and
small industrial establishments.
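The variance-to-mean relationship can be checked numerically with a small sketch. It assumes a variance equal to a segment-specific factor times the mean, written here as `sigma_k ** 2 * mu`; the symbol names and the numbers are illustrative, not estimated values.

```python
import math


def consumption_sd(mu, sigma_k):
    """Standard deviation when the variance is sigma_k**2 times the mean mu."""
    return sigma_k * math.sqrt(mu)


def coefficient_of_variation(mu, sigma_k):
    """sd / mean, which simplifies to sigma_k / sqrt(mu)."""
    return consumption_sd(mu, sigma_k) / mu


# The spread grows with the mean, while the relative spread shrinks.
for mu in (4.0, 25.0, 100.0):
    print(mu, consumption_sd(mu, 0.5), coefficient_of_variation(mu, 0.5))
```

The standard deviation increases with the mean consumption while the coefficient of variation decreases, which is the heteroscedastic pattern described above.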
Taken together, it is clear that the model (1) has multiplicative correction terms for different
calendar phenomena which modulate individual long term daily average consumption and
a correction for temperature.
2.3 Temperature response function
The temperature response function is at the core of model (1). Here, we will describe how it
is structured to capture details of the consumption-to-temperature relationship:
[Equation (5): the temperature correction for segment $k$ and day $t$, a product of two parenthesized terms in the transformed temperature $\rho_k(T_t)$ and its lags]
where Tt is a daily temperature average for day t . We use a nation-wide average based on
official met office measurements, but other (more local) temperature versions can be used.
Even though more detailed temperature information can be obtained in principle (e.g. readings at
several times for a particular day, daily minima, maxima, etc.), we go with the average as
a cheap and easy-to-obtain summary.
It is easy to see that the right-most term in the parenthesis represents a nonlinear, but time
invariant filter in temperature. In the transformed temperature, $\tilde{T}_{kt} = \rho_k(T_t)$, it is even a
linear time invariant filter. In fact, it is quite similar to the so-called Koyck model used in
econometrics (Johnston, 1984). It can be perceived as a slight generalization of that model
[...] terminology sense of the word here (Rawlings, 1988). The impact is controlled by a
segment-specific parameter. Note that the weighting in the 10-day temperature average could be
non-uniform, at least in principle. Estimation of the weights is extremely difficult here, so
we stick to the uniform weighting.
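The two filtering ideas named above, a uniform 10-day average and a Koyck-type geometric lag, can be sketched as follows. This is not the chapter's formula (5): whether the average includes the current day, the recursive form of the geometric lag, and all function and parameter names are assumptions made for illustration.

```python
def uniform_average(series, t, window=10):
    """Equal-weight average of series[t-window+1 .. t] (current day included)."""
    if t + 1 < window:
        raise ValueError("not enough history for the requested window")
    return sum(series[t - window + 1:t + 1]) / window


def koyck_filter(series, lam):
    """Geometric (Koyck-type) lag: s_t = (1 - lam) * x_t + lam * s_{t-1}.

    lam in [0, 1) controls how slowly past values are forgotten;
    the filter is started at s_0 = x_0.
    """
    out = []
    for x in series:
        out.append(x if not out else (1.0 - lam) * x + lam * out[-1])
    return out


transformed = [float(v) for v in range(1, 13)]  # stand-in for transformed temperatures
print(uniform_average(transformed, 9))
print(koyck_filter([0.0, 1.0, 1.0], 0.5))
```

Both functions are linear in their input, so applying them to already transformed temperatures gives a linear time invariant filter, in line with the remark above.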
The left-most parenthesis contains an interaction term. It mediates the interaction of
nonlinearly transformed temperature and type of the day. In other words, the temperature
effect is different on different types of the day. This is a point that was missing in the SLP
model formulation (Brabec et al., 2009) and it was considered one of its weaknesses,
because the empirical data suggest that the response to the same temperature can be quite
different if it occurs on a working day than if it occurs on a Saturday, etc. The (saturated)
interaction is described by segment-specific parameters indexed by the day type, $j = 1, \dots, 5$.
For numerical stability, they are [...]
Therefore, it is given just by evaluating the model (1), (5) with unknown parameters being
replaced by their estimates.
This finishes the description of our gas consumption model (GCM) in daily resolution,
which we will call GCMd, for short.
Model (7) sets the logit $\log \frac{q_{kth}}{1 - q_{kth}}$ equal to the sum over $j = 1, \dots, 24$ of
$I(t \in \text{work}) \cdot I(j = h)$ times the working-day parameter of hour $j$ in segment $k$, plus the
analogous sum with $I(t \in \text{nonwork})$ and the nonworking-day parameters, plus an additive
error term (7),
where we use $\log$ for the natural logarithm (base $e$). Indicator functions are used as
before, now they help to select parameters of a particular hour for a working (w) and
nonworking (n) day. This is an (empirical) logit model (Agresti, 1990) for the proportion of gas
consumed at hour $h$ of the day $t$ (averaged across data available from all customers of the
given segment $k$):
$$q_{kth} = \frac{\sum_{i \in k} Y_{ikth}}{\sum_{i \in k} \sum_{h'} Y_{ikth'}} \qquad (8)$$
with $Y_{ikth}$ being the consumption of a particular customer $i$ within the segment $k$ during hour
$h$ of day $t$. The logit transformation assures here that the modeled proportions will stay
within the legal (0,1) range. They do not sum to one automatically, however. Although a
multinomial logit model (Agresti, 1990) can be posed to do this, we prefer here the (much)
simpler formulation (7) and subsequent renormalization. Model (7) is a working (or
approximative) model in the sense that it assumes an iid (identically distributed) additive error
with zero mean and finite second moment (and independent across $k, t, h$). This is not [...]
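Given the estimated hourly parameters for working and nonworking days, it is easy to compute
the estimated proportion consumed during hour $h$ and normalize it properly. It is given by
$\tilde{q}_{kth}$, the inverse logit $1 / (1 + \exp(-\cdot))$ of the fitted right-hand side of (7) for
hour $h$, divided by the sum of these values over the hours of the same day (9).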
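The back-transformation and renormalization step can be sketched in a few lines. The function name and the input values are hypothetical; the sketch assumes that one fitted logit value per hour is available and that the proportions are rescaled so that they sum to one over the day.

```python
import math


def hourly_proportions(fitted_logits):
    """Inverse-logit each hourly value, then rescale so the day sums to one."""
    raw = [1.0 / (1.0 + math.exp(-eta)) for eta in fitted_logits]
    total = sum(raw)
    return [r / total for r in raw]


# 24 hypothetical fitted logits for one day type of one segment.
logits = [-3.0 + 0.1 * h for h in range(24)]
props = hourly_proportions(logits)
print(round(sum(props), 12))
```

The inverse logit keeps every proportion inside (0, 1), and the division by the daily sum supplies the sum-to-one property that the separate hourly logit fits do not guarantee.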
In the modeling just described, the daily and hourly steps are separated (leading to
substantial computational simplifications during the estimation of parameters).
Temperature modulation is used only at the daily level at present (due to the practical difficulty
of obtaining detailed temperature readings quickly enough for routine gas utility calculations).
Notice that real use of the model described in previous sections is simple both in daily and
hourly resolution, once its parameters (and the nonparametric functions $\rho_k(\cdot)$) are given.
For instance, its software implementation is easy enough and relies upon evaluation of a few
fairly simple nonlinear functions (mostly of exponential character). Indeed, the
implementation of a model similar to that described here in both the Czech Republic and
Slovakia is based on passing the estimated parameter values and tables defining the $\rho_k(\cdot)$
functions (those need to be stored in a fine temperature resolution, e.g. by 0.1 °C) to the gas
distribution company or market operator where the evaluation can be done easily and
quickly even for a large number of customers.
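A tabulated response function of this kind can be evaluated with a simple nearest-grid-point lookup. In the sketch below, the temperature range, the helper names and in particular `hypothetical_response` are placeholders chosen for illustration; the real tables would come from the fitted model.

```python
import math


def hypothetical_response(temp):
    """Placeholder decreasing curve with two asymptotes (not a fitted rho_k)."""
    return 2.0 / (1.0 + math.exp(0.2 * temp)) - 1.0


def build_table(func, t_min=-30.0, t_max=30.0, step=0.1):
    """Tabulate func on a regular temperature grid with the given step."""
    n = int(round((t_max - t_min) / step))
    return [func(t_min + i * step) for i in range(n + 1)]


def lookup(table, temp, t_min=-30.0, step=0.1):
    """Return the tabulated value at the grid point nearest to temp."""
    i = int(round((temp - t_min) / step))
    i = max(0, min(len(table) - 1, i))  # clamp outside the tabulated range
    return table[i]


table = build_table(hypothetical_response)
print(len(table), lookup(table, 12.34))
```

With a 0.1 °C grid, rounding to the nearest grid point is one of several reasonable choices; linear interpolation between neighboring entries would be another.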
The separation property (4) is extremely useful in this context. This is because the
time-varying and nonlinear consumption dynamics part $f_{kt}$ needs to be evaluated only once (per
segment). Individual long-term-consumption-related $p_{ik}$’s enter the formula only linearly
and hence they can be stored, summed and otherwise operated on separately from
the $f_{kt}$ part.
It is only the estimation of the parameters and of the temperature transformations that is
difficult. But that work can be done by a team of specialists (statisticians) once in a longer
period. We re-estimate the parameters once a year in our running projects.
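The computational benefit of the separation can be sketched as follows, assuming that the daily estimate for a customer is the product of an individual scale and a segment-level daily factor. The function names and the numbers are hypothetical.

```python
def customer_estimates(p_by_customer, f_by_day):
    """Per-customer daily estimates: individual scale times the shared factor."""
    return {cust: [p * f for f in f_by_day] for cust, p in p_by_customer.items()}


def segment_totals(p_by_customer, f_by_day):
    """Segment totals per day: sum the scales once, then reuse the factor."""
    p_sum = sum(p_by_customer.values())
    return [p_sum * f for f in f_by_day]


p_by_customer = {"a": 2.0, "b": 3.0}   # long-term daily averages of two customers
f_by_day = [0.5, 1.5, 1.0]             # segment-level factor, evaluated once per day

per_customer = customer_estimates(p_by_customer, f_by_day)
totals = segment_totals(p_by_customer, f_by_day)
print(totals)
```

Because the individual scales enter only linearly, the segment total needs one multiplication per day rather than one per customer and day.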
quicker.
For the functions $\rho_k$, we assume that they are smooth and can be approximated with loess
(Cleveland, 1979). Due to the presence of both fixed parameters and the nonparametric
$\rho_k$’s, the model GCMd is a semiparametric model (Carroll & Wand, 2003). Apart from the
temperature correction part, the structure of the model is additive and linear in parameters
after log transformation; therefore, it can be fitted as a GAM model (Hastie & Tibshirani,
1990), after a small adjustment. Naturally, we use a normal, heteroscedastic GAM with
variance being proportional to the mean, logarithmic link and offset into which we
put $\log p_{ik}$ here. The estimation proceeds in several stages, in the generalized estimating
equation style (Small & Wang, 2003). We start the estimation with estimation of the
function $\rho_k$. To that end, we start with a simpler version of the model GCMd which
For practical computations, we use the R system (R Development Core Team, 2010), with
both standard packages (gam, in particular) and our own functions and procedures.
3.2 Practical applications of the model and typical tasks which it is used for
The model GCM (be it GCMd or GCMh) is typically used for two main tasks in practice,
namely redistribution and prediction. First, it is employed in a retrospective regime when
known (roughly annual) total consumption readings need to be decomposed into parts
corresponding to smaller time units in such a way that they add up to the total. In other words,
the reading over the interval $[t_{1i}, t_{2i}]$ is split among the days $t' = t_{1i}, \dots, t_{2i}$ using the
model estimates, which gives the redistributed estimates $\hat{Y}^R_{ikt}$ (11),
where $\hat{Y}_{ikt}$ has been defined in (6). Disaggregation into hours would be analogous, only the
GCMh model would be used instead of the GCMd. Such a disaggregation is very much of
interest in accounting when the price of the natural gas changed during the interval $[t_{1i}, t_{2i}]$
and hence the amounts of gas consumed at the lower and higher rates need to be estimated. It is
also used in routine network mass balancing, when comparing closed network inputs
and amounts of gas measured by individual customers’ meters (for instance to assess
losses). The disaggregated estimates might need to be aggregated again (to a different
aggregation than the original readings) in this context. The estimate of the desired consumption
aggregation both over time and customers is obtained simply by appropriate integration
(summation) of the disaggregated estimates (11):
$$\hat{Y}^R_{I,T_1,T_2} = \sum_{(i,k) \in I} \sum_{t=T_1}^{T_2} \hat{Y}^R_{ikt} \qquad (12)$$
where $I$ is a given index set. It might e.g. require summing the consumptions of all customers of
two selected segments, etc.
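The retrospective use can be illustrated with a short sketch. It assumes that a known total is split in proportion to the daily model estimates, which is one natural way to make the parts add up to the total but is an assumption here rather than a transcription of formula (11); the function names and the numbers are hypothetical.

```python
def redistribute(total, model_estimates):
    """Split a known total over days in proportion to the model estimates."""
    scale = sum(model_estimates)
    if scale <= 0:
        raise ValueError("model estimates must have a positive sum")
    return [total * y / scale for y in model_estimates]


def split_at_price_change(daily, change_index):
    """Consumption before and from the day on which the price changed."""
    return sum(daily[:change_index]), sum(daily[change_index:])


reading_total = 1200.0                  # metered consumption for the interval
daily_model = [1.0, 2.0, 3.0]           # model estimates for the days in it
daily = redistribute(reading_total, daily_model)
before, after = split_at_price_change(daily, 1)
print(daily, before, after)
```

The redistributed parts sum to the metered total by construction, and any re-aggregation (here the pre- and post-change amounts) is a plain sum of the daily parts.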
Secondly, one might want to have prospective estimates of consumption over an interval
which lies, at least partially, in the future. Redistribution of the known total is not possible here,
and the estimates have to be made without the (helpful) restriction on the total. They will
have to be based on $\hat{Y}_{ikt}$ alone. It is clear that such estimates will be less precise and
hence less reliable, in general. This is even more true in the situation when the average
annual consumption changes systematically, e.g. due to the external economic conditions
(like crisis) which the GCM model does not take into account. At any rate, the disaggregated
estimates can then be used to estimate a new aggregation in a way totally parallel to (12), i.e.
as follows:
$$\hat{Y}_{I,T_1,T_2} = \sum_{(i,k) \in I} \sum_{t=T_1}^{T_2} \hat{Y}_{ikt} \qquad (13)$$
It is important to bear in mind that the estimates (both $\hat{Y}^R_{ikt}$ and $\hat{Y}_{ikt}$, as well as their new
aggregations) are estimates of means of the consumption distribution. Therefore, they are
not to be used directly e.g. for maximal load of a network or similar computations (the mean is
not a good estimate of the maximum). Estimates of the maxima and of general quantiles
(Koenker, 2005) of the consumption distribution are possible, but they are much more
complicated to get than the means.
interested in relative model error, one could be pressed to improve the model by
recalibration (because the small numerators stress the quality of the summer behavior
substantially).
Secondly, when the model is to be used e.g. for network balancing, it can easily happen that
the values which the model is compared against are obtained by a procedure that is not
entirely compatible with the measurement procedure used for individual customer readings
and/or for the fine time resolution reading in the sample. For instance, we might want to
compare the model results to amount of gas consumed in a closed network (or in the whole
gas distribution company). While the model value can be obtained by appropriate
integration over time and customers easily, for instance as in (13), obtaining the value which
this should be compared to is much more problematic than it seems at first. The problem lies
in the fact that, typically there is no direct observation (or measurement) of the total network
consumption. Even if we neglect network losses (including technical losses, leaks, illegal
consumption) or account for them in a normative way (for instance, in the Czech Republic,
there are gas industry standards that describe how to set a (constant) loss percentage) and
hence introduce the first approximation, there are many problems in practical settings. The
network entry is measured with a device that has only a finite precision (measurement
errors are by no means negligible). The precision can even depend on the amount of gas
measured in a complicated way. The errors might be even systematic occasionally, e.g. for
small gas flows which the meter might not follow correctly (so that summer can easily be
much more problematic than winter). Further, there might be large customers within the
network, whose consumption needs to be subtracted from the network input in order to get
the HOU+SMC total that is modeled by a model like GCM. These large customers might be
followed with their own meters with fine time precision (as is the case e.g. in the Czech
Republic and Slovakia), but all these devices have their errors, both random and systematic.
From the previous discussion, it should be clear now that the “observed” SMC+HOU totals
do not have the same properties as the direct measurements used for model training. They are just
an artificial, indirect construct (nothing else is really feasible in practice, however) which
might even have systematic errors. Then the calibration of the model can be very much in
place (because even a good model that gives correct and precise results for individual
consumptions might not do well for network totals).
In the context of the GCM model, we might think about a simple linear calibration of
$Z_{..t}$ against $\sum_{i,k} \hat{Y}_{ikt}$ (where it is understood that the summation is over the indexes
corresponding to the HOU+SMC customers from the network), i.e. about the calibration
model (15), a straight line in $\sum_{i,k} \hat{Y}_{ikt}$ with an intercept, a slope and an error term,
and about fitting it by ordinary least squares (OLS).
Conceptually, it is a starting point, but it is not good as the final solution to the calibration.
Indeed, the model (15) is simple enough, but it has several serious flaws. First, it does not
[...] obtained from random data, it is a random quantity (containing the estimation error of the $\hat{Y}_{ikt}$’s). In
particular, it is not a fixed explanatory variable, as assumed in standard regression problems
that lead to the OLS as the correct solution. The situation here is known as the
measurement error problem (Carroll et al., 1995) in statistics and it is notorious for the
possibility of generating spurious estimates of regression coefficients (here calibration
coefficients). Secondly, the (globally) linear calibration form assumed by (15) can be a bit too
rigid to be useful in real situations. Locally, the calibration might still be linear, but its
coefficients can change smoothly over time (e.g. due to various random disturbances to the
network).
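For orientation, the simple straight-line calibration discussed above can be fitted by ordinary least squares in a few lines. This sketch only illustrates the starting point that the text criticizes; it does not address the measurement error or the time-varying coefficients, and the function name and data are hypothetical.

```python
def ols_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# x: aggregated model estimates per day, y: "observed" network totals per day
model_totals = [1.0, 2.0, 3.0, 4.0]
network_totals = [3.0, 5.0, 7.0, 9.0]
intercept, slope = ols_line(model_totals, network_totals)
print(intercept, slope)
```

A well-behaved model would give an intercept near zero and a slope near one; because the regressor is itself estimated, such a fit should be read with the caveats stated above.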
Therefore, we formulate a more appropriate and complete statistical model from which the
calibration will come out as one of its products. It is a model of state-space type (Durbin &
Koopman, 2001) that takes all the available information into account simultaneously, unlike
the approach based on (15):
[Equation (16): $K + 1$ measurement equations, one state equation for the calibration coefficient on the log scale, and the specification of the random error terms on the last line]
Here, we take the GCMd parameters as fixed. Their unknown values are replaced by the
estimates from the GCMd model (1), (5) fitted previously (hence also the GCMd consumption
means appearing explicitly in the first $K$ equations, as well as in the error specification and
implicitly in the $(K+1)$-th equation, are fixed quantities). Therefore, we have only the three
variances (one of them segment-specific) as unknown parameters, plus we need to estimate the
unknown states. In the model (16), the first $K + 1$ equations are the measurement equations.
In a sense they encompass simultaneously what models (1), (5) and (15) try to do separately.
There is one state equation which describes possible (slow) movements of the linear calibration
coefficient (the exponential of the state) in the random walk (RW) style (Kloeden & Platen,
1992). The RW dynamics is imposed on the log scale in order to preserve the plausible range
for the calibration coefficients (for even a moderately good model, they certainly should be
positive!). The random error terms are specified on the last line. We assume that the three
error terms are mutually independent and that each of them is independent across its indexes
($t$ and $i, k$). For identifiability, we have to have a restriction on the segment-specific
changes of the calibration. In [...]
different at different types of the day, etc., as described by the model (1)). This second,
within individual variability is exactly where the model (5) comes into play. All of this (and
more) needs to be taken into account while estimating the model.
After motivating the model, it is interesting to look at the model’s components and compare
them across customer segments. They can be plotted and compared easily once the model is
segment (subtracting minimum pik and dividing by maximum pik in that segment).
problem is that the new, independent data (unused in the fit) are simply not available in the
fine time resolution (since the measurement is costly and all the available information
should be used for model training). Nevertheless, aggregated data are available. For
instance, total (HOU+SMC) consumptions for closed distribution networks, for individual
gas companies and for the whole country are available from routine balancing. To be able to
compare the model fit with such data, we need to integrate (or re-aggregate) the model
estimates properly, e.g. along the lines of formula (13). When we do this for the balancing
data from the Czech Republic, we get Figure 8. The fit is rather nice, especially when
considering that errors other than model errors are involved in the comparison (as discussed
in section 3.3); note that the model output has not been calibrated here in any way.
[Figure panels: consumption and log(consumption) on the vertical axes; temperature on the horizontal axis of the last panel]
[Figure 3: panels for segments HOU1 to HOU4 and SMC1 to SMC4; vertical axis rho, horizontal axis temperature]
Fig. 3. Temperature response function $\rho_k(\cdot)$ of (5), compared across different HOU and
SMC segments.
[Figure: exp(alpha_jk) against day type j for segments HOU1 to HOU4]
[Figure: scaled p_ik]
[Figure 6: vertical axis proportion of the daily consumption, horizontal axis hour; curves for working and nonworking days]
Fig. 6. Proportions of daily consumption totals consumed in a particular hour of the day, i.e.
$\tilde{q}_{kth}$’s from (9), compared between working and nonworking days for the HOU1 segment (i.e.
for “cookers”).
[Figure 7: vertical axis consumption, horizontal axis day]
Fig. 7. Fit of the model (1) to the HOU4 data (normalized consumptions as dots and
normalized model output as a dotted line).
[Figure 8: vertical axis consumption, horizontal axis day]
Fig. 8. Fit of the model (1) after disaggregation and re-aggregation of normalized model
output, according to (13), on the CR total HOU+SMC consumption data over a period of
more than a year.
relevant for the $(i, k)$-th customer, i.e. by $T_{ikt}$. Obviously, it would not be practical to require
temperature measurements for each individual customer. Therefore, $T_{ikt}$ would depend on the
$(i, k)$ index only through the relation of being included in some more local region for which
the temperature daily average would be available separately (e.g. county). Technically, this
is very simple indeed. Nevertheless, such an improvement requires an appropriate (regionally)
stratified sample.
The calibration model (16) can be expanded to cover not only proportional but also additive
biases. Note that, compared to the full linear calibration of (15), model (16) assumes that the
additive bias is zero. The assumption is in line with what we experienced in practice, but for
other situations, the model (16) can be expanded by one more state equation to have time-
6. Conclusion
In this chapter, we have introduced a gas consumption model (GCM) for household and
small+medium size commercial customers in daily and hourly resolution and shown how it can be used for
various practical tasks, including estimation of consumption aggregates integrated over
time and/or customers as well as network-related balancing. A model similar to the
implementation described here has been running in nationwide systems in the Czech
Republic and Slovakia for several years already.
The model has a moderately rich structure, but it has been built with a strong emphasis on
easy and efficient practical implementation in a gas company or energy market operator
environment. It is built in a modular way, which enhances serviceability and makes local
adjustments to somewhat different conditions rather easy. For more complicated
adjustments, we might be able to help a new user with the statistical modeling part if it would
result in an interesting project.
The GCM is built from first principles, in close contact with the empirical behavior of
the observed consumptions. It is specified in formal terms as a full-blown statistical model
(not only the mean behavior but also variability assumptions and distributional behavior are
given by the model). Our practical experience in natural gas modeling strongly
supports the idea that a rigorous statistical formulation always pays off here and that it is to
be preferred to haphazard ad hoc or even black-box approaches. There is a lot of
structure and many systematic features that a good gas consumption model should follow
closely in order to be useful.
7. Acknowledgement
The work was partly supported by grant 1ET400300513 of the Grant Agency of the
Academy of Sciences of the Czech Republic as well as by the Institutional Research Plan
AV0Z10300504 ‘Computer Science for the Information Society: Models, Algorithms,
Applications’. We would like to acknowledge important support from the M100300904
project of the Academy of Sciences of the Czech Republic. We would also like to thank the
people from RWE GasNet, formerly West Bohemian Gas Distribution Company (J.
Bečvář, J. Čermáková and others), and V. Jilemnický from RWE Plynoprojekt for their help
and willingness to discuss gas distribution background problems and issues.
8. References
Agresti, A. (1990). Categorical data analysis. John Wiley. New York.
Bailey, J. (2000). Load profiling for retail choice: Examining a complex and crucial
component of settlement. Electricity Journal. 13, 69-74
Brabec, M.; Konár, O.; Malý, M.; Pelikán, E.; Vondráček, J. (2009). A statistical model for
natural gas standardized load profiles. JRSS C - Applied Statistics. 58, 1, 123-139
Brabec, M.; Malý, M.; Pelikán, E.; Konár, O. (2009a). Statistical calibration of the natural gas
consumption model. WSEAS transactions on systems. 8, 7, 902-912
Brabec, M.; Konár, O.; Pelikán, E.; Malý, M. (2008). A nonlinear mixed effects model for
prediction of natural gas consumption by individual customers. International
Journal of Forecasting. 24, 659-678
Brabec, M.; Konár, O.; Pelikán. E.; Malý, M. (2008a). Hierarchical model for estimation of
yearly sums from irregular longitudinal data. Book of abstracts, ISF symposium on
forecasting, Nice, France, page 139
Brabec, M.; Konár, O.; Malý,M.; Pelikán, E.; Vondráček, J. (2007). State space model for
aggregated longitudinal data. Abstract Book, 27th International Symposium on
Forecasting, New York 24.-27.6.2007, page 46, ISF.
Carroll, R. J. D.; Ruppert, L. A.; Stefanski. (1995). Measurement error in nonlinear models.
Chapman & Hall/CRC. London.
Carroll, R. J. & Wand, M. P. (2003). Semiparametric regression. Cambridge University Press.
Cambridge.
Cleveland, W. S. (1979). Robust Locally Weighted Regression and Smoothing Scatterplots.
Journal of the American Statistical Association. 74, 829-836
Natural Gas
Edited by Primož Potočnik
ISBN 978-953-307-112-1
Hard cover, 606 pages
Publisher Sciyo
Published online 18, August, 2010
Published in print edition August, 2010
The contributions in this book present an overview of cutting-edge research on natural gas, which is a vital
component of the world's supply of energy. Natural gas is a combustible mixture of hydrocarbon gases, primarily
methane but also heavier gaseous hydrocarbons such as ethane, propane and butane. Unlike other fossil
fuels, natural gas is clean burning and emits lower levels of potentially harmful by-products into the air.
Therefore, it is considered one of the cleanest, safest, and most useful of all energy sources applied in
a variety of residential, commercial and industrial fields. The book is organized in 25 chapters that cover various
aspects of natural gas research: technology, applications, forecasting, numerical simulations, transport and
risk assessment.
How to reference
In order to correctly reference this scholarly work, feel free to copy and paste the following:
Marek Brabec, Marek Maly, Emil Pelikan and Ondrej Konar (2010). Statistical Model of Segment-Specific
Relationship Between Natural Gas Consumption and Temperature in Daily and Hourly Resolution, Natural
Gas, Primož Potočnik (Ed.), ISBN: 978-953-307-112-1, InTech, Available from:
http://www.intechopen.com/books/natural-gas/statistical-model-of-segment-specific-relationship-between-natural-gas-consumption-and-temperature-i