Data Handling and Parameter Estimation: Gürkan Sin Krist V. Gernaey Sebastiaan C.F. Meijer Juan A. Baeza
5.1 INTRODUCTION
Modelling is one of the key tools at the disposal of modern wastewater treatment professionals, researchers and engineers. It enables them to study and understand complex phenomena underlying the physical, chemical and biological performance of wastewater treatment plants at different temporal and spatial scales.

At full-scale wastewater treatment plants (WWTPs), mechanistic modelling using the ASM framework and concept (e.g. Henze et al., 2000) has become an important part of the engineering toolbox for process engineers. It supports plant design, operation, optimization and control applications. Models have also been increasingly used to help take decisions on complex problems, including process/technology selection for retrofitting, as well as validation of control and optimization strategies (Gernaey et al., 2014; Mauricio-Iglesias et al., 2014; Vangsgaard et al., 2014; Bozkurt et al., 2015).

Models have also been used as an integral part of the comprehensive analysis and interpretation of data obtained from a range of experimental methods from the laboratory, as well as pilot-scale studies, to characterise and study wastewater treatment plants. In this regard, models help to properly explain various kinetic parameters for different microbial groups and their activities in WWTPs by using parameter estimation techniques. Indeed, estimating parameters is an integral part of model development and application (Seber and Wild, 1989; Ljung, 1999; Dochain and Vanrolleghem, 2001; Omlin and Reichert, 1999; Brun et al., 2002; Sin et al., 2010) and can be broadly defined as follows:

Given a model and a set of data/measurements from the experimental setup in question, estimate all or some of the parameters of the model using an appropriate statistical method.

The focus of this chapter is to provide a set of tools and the techniques necessary to estimate the kinetic and stoichiometric parameters for wastewater treatment processes using data obtained from experimental batch activity tests. These methods and tools are mainly
© 2016 Gürkan Sin and Krist V. Gernaey. Experimental Methods in Wastewater Treatment. Edited by M.C.M. van Loosdrecht, P.H. Nielsen, C.M. Lopez-Vazquez and D. Brdjanovic. ISBN: 9781780404745 (Hardback), ISBN: 9781780404752 (eBook). Published by IWA Publishing, London, UK.
intended for practical applications, i.e. by consultants, engineers, and professionals. However, it is also expected that they will be useful both for graduate teaching as well as a stepping stone for academic researchers who wish to expand their theoretical interest in the subject. For the models selected to interpret the experimental data, this chapter uses available models from literature that are mostly based on the Activated Sludge Model (ASM) framework and their appropriate extensions (Henze et al., 2000).

The chapter presents an overview of the most commonly used methods in the estimation of parameters from experimental batch data, namely: (i) data handling and validation, (ii) parameter estimation: maximum likelihood estimation (MLE) and bootstrap methods, (iii) uncertainty analysis: linear error propagation and the Monte Carlo method, and (iv) sensitivity and identifiability analysis.

5.2 THEORY AND METHODS

5.2.1 Data handling and validation

5.2.1.1 Systematic data analysis for biological processes

Most activated sludge processes can be studied using simplified process stoichiometry models which rely on a 'black box' description of the cellular metabolism using measurement data of the concentrations of reactants (pollutants) and products, e.g. CO2, intermediate oxidised nitrogen species, etc. Likewise, the Activated Sludge Model (ASM) framework (Henze et al., 2000) relies on a black box description of aerobic and anoxic heterotrophic activities, nitrification, hydrolysis and decay processes.

A general model formulation of the process stoichiometry describing the conversion of substrates to biomass and metabolic products is formulated below (for carbon metabolism):

CH2O + YSO O2 + YSN NH3 → YSX X + YSC CO2 + YSP1 P1 + YSW H2O    Eq. 5.1

where YSO (yield of oxygen per unit of substrate), YSN (yield of ammonia per unit of substrate), YSX (yield of biomass per unit substrate), YSC (yield of CO2 per unit of substrate), YSP1 (yield of intermediate product P1 per unit of substrate), and YSW (yield of water per unit of substrate).

The coefficients of this equation are written on the basis of 1 C-mol of carbon substrate. This includes the growth yield for biomass, YSX, the substrate (ammonia) consumption yield, YSN, the oxygen consumption yield, YSO, the yield for production of CO2, YSC, and the yield for water, YSW. The biomass, X, is also written on the basis of 1 C-mol and is assumed to have a typical composition of CHaObNc. The biomass composition can be measured experimentally, CH1.8O0.5N0.2 being a typical value. Some of the yields are also measured experimentally from the observed rates of consumption and production of components in the process as follows:

Yji = ri / rj = qi / qj  and  Yji = Yij⁻¹    Eq. 5.2

where qi refers to the volumetric conversion/production rate of component i, i.e. the mass of component i per unit volume of the reactor per unit time (Mass i Volume⁻¹ Time⁻¹), ri refers to the measured specific rate, i.e. the mass of component i per unit time per unit weight of the biomass (Mass i Time⁻¹ Mass biomass⁻¹), and Yji is the yield of component i per unit of component j. In the case of biomass, X, this would refer to the specific growth rate μ:

μ = rx = qx / X    Eq. 5.3

One of the advantages of using this process stoichiometry is that it allows elemental balances for C, H, N and O to be set up, to make sure that the process stoichiometry is balanced. For the process stoichiometry given in Eq. 5.1, the following elemental balance for carbon will hold, assuming all the relevant yields are measured:

−1 + YSX + YSC + YSP1 = 0    Eq. 5.4
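The yield and rate definitions of Eqs. 5.2 and 5.3 can be sketched numerically as follows (function names and rate values are illustrative, not from the chapter):

```python
def yield_coefficient(q_i, q_j):
    """Y_ji: yield of component i per unit of component j (Eq. 5.2),
    computed as the ratio of volumetric rates q_i / q_j."""
    return q_i / q_j

def specific_growth_rate(q_x, x):
    """mu = r_x = q_x / X (Eq. 5.3): volumetric biomass production rate
    divided by the biomass concentration."""
    return q_x / x

# Example: substrate consumed at 10 and biomass produced at 6 C-mol m-3 h-1
Y_SX = yield_coefficient(q_i=6.0, q_j=10.0)   # biomass yield on substrate
mu = specific_growth_rate(q_x=6.0, x=20.0)    # h-1, for X = 20 C-mol m-3
print(Y_SX, mu)  # 0.6 0.3
```

Note that the reciprocal relation of Eq. 5.2 holds by construction: the yield of substrate per unit biomass is simply 1/Y_SX.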
Elemental balances for H and O in the process stoichiometry are usually not closed in wastewater applications. However, the balance for the degree of reduction is closed in wastewater treatment process stoichiometry. This is the framework on which ASM is based. The degree of reduction balance is relevant since most biological reactions involve reduction-oxidation (redox)-type chemical conversion reactions in metabolism activities.

5.2.1.2 Degree of reduction analysis

A biological process will convert a substrate, i.e. the input to a metabolic pathway, into a product that is in a reduced or oxidized state relative to the substrate. In order to perform redox analysis on a biological process, a method to calculate the redox potential of substrates and products is required. In the ASM framework and other biotechnological applications (Heijnen, 1999; Villadsen et al., 2011), the following methodology is used:

1) Define a standard for the redox state for the balanced elements, typically C, O, N, S and P.
2) Select H2O, CO2, NH3, H2SO4, and H3PO4 as the reference redox-neutral compounds for calculating the redox state for the elements O, C, N, S, and P respectively. Moreover, a unit of redox is defined as H = 1. With these definitions, the following redox levels of the five listed elements are obtained: O = −2, C = 4, N = −3, S = 6 and P = 5.
3) Calculate the redox level of the substrate and products using the standard redox levels of the elements. Several examples are provided below:
a) Glucose (C6H12O6): 6 · 4 + 12 · 1 + 6 · (−2) = 24. Per 1 C-mol, the redox level of glucose becomes γg = 24/6 = 4 mol e⁻ C-mol⁻¹.
b) Acetic acid (C2H4O2): 2 · 4 + 4 · 1 + 2 · (−2) = 8. Per 1 C-mol, the redox level of HAc becomes γa = 8/2 = 4 mol e⁻ C-mol⁻¹.
c) Propionic acid (C3H6O2): 3 · 4 + 6 · 1 + 2 · (−2) = 14. Per 1 C-mol, the redox level of HPr becomes γp = 14/3 = 4.67 mol e⁻ C-mol⁻¹.
d) Ethanol (C2H6O): 2 · 4 + 6 · 1 + 1 · (−2) = 12. Per 1 C-mol, the redox level of ethanol becomes γe = 12/2 = 6 mol e⁻ C-mol⁻¹.
4) Perform a degree of reduction balance over a given process stoichiometry (see Example 5.1).

Example 5.1 Elemental balance and degree of reduction analysis for aerobic glucose oxidation

General process stoichiometry for the aerobic oxidation of glucose to biomass:

CH2O + YSO O2 + YSN NH3 → YSX X + YSC CO2    Eq. 5.5

Assume the biomass composition X is CH1.8O0.5N0.2. The degree of reduction for biomass is calculated assuming the nitrogen source is ammonia (hence the nitrogen oxidation state is −3): γX = 4 + 1.8 + 0.5 · (−2) + 0.2 · (−3) = 4.2 mol e⁻ C-mol⁻¹.

Now the C, N and degree of reduction balances can be performed for the process stoichiometry as follows:

Carbon balance: −1 + YSX + YSC = 0    Eq. 5.6

Nitrogen balance: −YSN + 0.2 · YSX = 0    Eq. 5.7

Redox balance:
−1 · γg − γO2 · YSO − γNH3 · YSN + γX · YSX + γCO2 · YSC = 0
−1 · γg − γO2 · YSO − 0 · YSN + γX · YSX + 0 · YSC = 0    Eq. 5.8

In these balance equations, there are four unknowns (YSN, YSO, YSX, YSC). Since three equations are available, only one measurement of a yield is necessary to calculate all the others. For example, in ASM applications the biomass growth yield is usually assumed measured or known, hence the other remaining yields can be calculated as follows:

CO2 yield: YSC = 1 − YSX    Eq. 5.9

NH3 yield: YSN = 0.2 · YSX    Eq. 5.10

O2 yield: YSO = (γg − γX · YSX) / γO2 = (4 − 4.2 · YSX) / 4    Eq. 5.11

With these coefficients known, the process stoichiometry model for 1 C-mol of glucose consumption becomes as follows:

CH2O + ((4 − 4.2 · YSX) / 4) · O2 + 0.2 · YSX · NH3 → YSX · X + (1 − YSX) · CO2    Eq. 5.12

In the ASM framework, the process stoichiometry is calculated using a unit production of biomass as a reference. Hence, the coefficients of Eq. 5.12 can be re-arranged by dividing them by YSX.
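The degree of reduction calculation and the yield relations of Example 5.1 (Eqs. 5.6-5.11) can be sketched as follows (the function and the chosen value of YSX are illustrative):

```python
def degree_of_reduction(C, H, O, N):
    """Redox level per C-mol, using the standard element levels of step 2:
    C = 4, H = 1, O = -2, N = -3 (ammonia as nitrogen source)."""
    return (4 * C + 1 * H - 2 * O - 3 * N) / C

# Step 3 examples: glucose written as CH2O, biomass as CH1.8O0.5N0.2
gamma_g = degree_of_reduction(1, 2, 1, 0)        # 4.0 mol e- C-mol-1
gamma_X = degree_of_reduction(1, 1.8, 0.5, 0.2)  # 4.2 mol e- C-mol-1

# With Y_SX assumed measured, the remaining yields follow Eqs. 5.9-5.11
Y_SX = 0.6                                       # illustrative growth yield
Y_SC = 1 - Y_SX                                  # Eq. 5.9
Y_SN = 0.2 * Y_SX                                # Eq. 5.10
Y_SO = (gamma_g - gamma_X * Y_SX) / 4            # Eq. 5.11, |gamma_O2| = 4

# The degree of reduction balance (Eq. 5.8) closes by construction:
balance = -gamma_g + 4 * Y_SO + gamma_X * Y_SX
print(Y_SC, Y_SN, Y_SO, balance)
```

For YSX = 0.6 this gives YSC = 0.4, YSN = 0.12 and YSO = 0.37, and the redox balance closes to zero, which is the consistency property exploited in the next section.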
For use in the ASM framework, the degree of reduction can be converted into COD units, since 1 mol of electrons corresponds to 8 g of COD:

γg (mol e⁻ C-mol⁻¹) · 8 (g COD mol e⁻¹) = 8 · γg (g COD C-mol⁻¹)    Eq. 5.15

5.2.1.3 Consistency check of the experimental data

The value of performing elemental balances around data collected from experiments with biological processes is obvious: to confirm the consistency of the data with the first law of thermodynamics, which asserts that energy (in the form of matter, heat, etc.) is conserved. A primary and obvious requirement for performing elemental balances is that the model is checked and consistent. Experimental data need to be checked for gross (measurement) errors that may be caused by incorrect calibration or malfunction of the instruments, equipment and/or sensors.

Inconsistency in the data can be checked from the sum of the elements that make up the substrates consumed in the reaction (e.g. glucose, ammonia, oxygen, etc.), which should equal the sum of the elements in the products formed in the reaction (see also Eq. 5.4 for the carbon balance). Deviation from this elemental balance indicates an incorrectly defined system description, a model inconsistency and/or measurement flaws.

A generalised elemental balance over all the compounds in the process can be written as follows:

Σj=1..N esj · qsj + ex · qx + Σj=1..M epj · qpj = 0    Eq. 5.16

The equation above is formulated for a biological process with N substrates and M metabolic products. In the equation, e is the elemental composition (C, H, O and N) for a component, and q the volumetric production (or consumption) rate for substrates (qsj), biomass (qx) and metabolic products (qpj). Hence, the elemental balance can be formulated compactly as follows:

E · q = 0    Eq. 5.17

In this equation, E is the conservation matrix; each row of E refers to a conserved element or property, e.g. C, H, O, N, γ, etc., and each column of E contains the values of these conserved properties for one compound (substrates, products and biomass); q is a column vector including the volumetric rates for each compound, i.e. the substrates as well as the products and biomass.

The total number of columns in E is the number of compounds, which is the sum of substrates (N), products (M) and biomass, hence N + M + 1. The total number of constraints (rows) is 5 (C, H, O, N and γ). This means that N + M − 4 is the number of degrees of freedom that needs to be measured or specified in order to calculate all the rates.

Typically, not all the rates will be measured in batch experiments. Therefore, let us assume qm is the measured set of volumetric rates and qu the unmeasured set of rates
which need to be calculated. In this case, Eq. 5.17 can be reformulated as follows:

Em · qm + Eu · qu = 0
qu = −(Eu)⁻¹ · Em · qm    Eq. 5.18

Provided that the inverse of Eu exists (det(Eu) ≠ 0), Eq. 5.18 provides a calculation/estimation of the unmeasured rates in a biological process. These estimated rates are valuable on their own, but can also be used for validation purposes if redundant measurements are available. This systematic method of data consistency checking is highlighted in Example 5.2.

All these calculations help to verify and validate the experimental data and the measurement of the process yields. The data can then be used for further kinetic analysis and parameter estimation.

5.2.2 Parameter estimation

Here we recall a state-space model formalism to describe a system of interest. Let y be a vector of outputs resulting from a dynamic model, f, employing a parameter vector, θ; input vector, u; and state variables, x:

dx/dt = f(x, θ, u, t);  x(0) = x0
y = g(x, θ, u, t)    Eq. 5.19

5.2.2.1 The manual trial and error method

In this approach, the user chooses one parameter from the parameter set and then changes it incrementally (increasing or decreasing it around its nominal value) until a reasonable model fit to the measured data is obtained. The same process may be iterated for another parameter. The fitting process is terminated when the user deems the model fit to the data good. This is often determined by practical and/or time constraints, because this procedure will never lead to an optimal fit of the model to the measured data. In addition, multiple different sets of parameter values can be obtained which may not necessarily have a physical meaning. The success of this procedure often relies on the experience of the modeller in selecting the appropriate parameters to fit certain aspects of the measured data. Although this approach is largely subjective and suboptimal, it is still widely used in industry as well as in the academic/research environment. Practical data quality issues often do not allow the precise determination of parameters, and not all (commercial) modelling software platforms provide the appropriate statistical routines for parameter estimation. There are automated procedures for model calibration using algorithms such as statistical sampling techniques, optimization algorithms, etc. (Sin et al., 2008). However, such procedures focus on obtaining a good fit to the experimental data and not necessarily on the identifiability and/or estimation of parameters from a data set, because the latter requires proper use of statistical theory.
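Before moving on to the formal methods, the unmeasured-rate calculation of Eqs. 5.17-5.18 can be sketched numerically for the glucose example of Section 5.2.1.2 (the matrix layout and rate values are illustrative assumptions, using only the C, N and γ constraints):

```python
import numpy as np

# Conserved properties (rows): C, N, degree of reduction gamma.
# Measured compounds (columns of Em): glucose CH2O, biomass CH1.8O0.5N0.2
Em = np.array([[1.0, 1.0],     # C
               [0.0, 0.2],     # N
               [4.0, 4.2]])    # gamma
# Unmeasured compounds (columns of Eu): O2, NH3, CO2
Eu = np.array([[0.0, 0.0, 1.0],    # C
               [0.0, 1.0, 0.0],    # N
               [-4.0, 0.0, 0.0]])  # gamma: O2 = 2*(-2); NH3, CO2 redox-neutral

# measured volumetric rates (consumption negative): glucose and biomass
q_m = np.array([-10.0, 6.0])

# Eq. 5.18: qu = -(Eu)^-1 * Em * qm (solve instead of an explicit inverse)
q_u = -np.linalg.solve(Eu, Em @ q_m)
print(q_u)  # rates of O2, NH3, CO2
```

With these numbers the computed rates are qO2 = −3.7, qNH3 = −1.2 and qCO2 = 4.0, which reproduce the yields of Example 5.1 for YSX = 0.6 (YSO = 0.37, YSN = 0.12, YSC = 0.4).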
5.2.2.2 Formal statistical methods
Frequentist method - maximum likelihood theory

In the parameter estimation problem we usually define parameter estimators, θ̂, to distinguish them from the true model parameters, θ. In the context of statistical estimation, the model parameters are defined as unknown and statistical methods are used to infer their true value. This difference is subtle but important for understanding and interpreting the results of parameter estimation, irrespective of the methods used.

Maximum likelihood is a general method for finding estimators, θ̂, from a given set of measurements, y. In this approach, the model parameters θ are treated as true, fixed values, but their corresponding estimators θ̂ are treated as random variables. The reason is that the estimators depend on the measurements, which are assumed to be a stochastic process:

y = f(θ) + ε  where ε ~ N(0, σ²)    Eq. 5.20

Provided that the underlying distribution of errors is assumed to follow a normal (Gaussian) distribution, maximizing the likelihood of observing y is equivalent to minimizing the following cost (or objective) function, S(y,θ) (Seber and Wild, 1989):

S(y,θ) = Σ (y − f(θ))² / σ²    Eq. 5.23

where y stands for the measurement set, f(θ) stands for the corresponding model predictions, and σ stands for the standard deviation of the measurement errors. The minimum of the objective function (Eq. 5.24) is found by minimization algorithms (e.g. Newton's method, gradient descent, interior-point, Nelder-Mead simplex, genetic algorithms, etc.):

θ̂: min θ S(y,θ),  i.e.  ∂S(y,θ)/∂θ |θ=θ̂ = 0    Eq. 5.24

The covariance matrix of the parameter estimators is then estimated from the Jacobian of the model function:

cov(θ̂) = s² · (F.ᵀ · F.)⁻¹    Eq. 5.25

where s² is the unbiased estimate of the measurement error variance:

s² = Smin(y,θ̂) / (n − p)    Eq. 5.26
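As an illustrative sketch of Eqs. 5.23-5.24 (the first-order decay model, the data and the crude grid-refinement minimiser are all hypothetical, standing in for e.g. fminsearch):

```python
import numpy as np

t = np.linspace(0.0, 10.0, 11)
y = 10.0 * np.exp(-0.2 * t)     # synthetic noise-free "measurements"
sigma = 0.1                     # assumed measurement standard deviation

def S(k):
    """Objective function of Eq. 5.23 for a one-parameter decay model."""
    residuals = y - 10.0 * np.exp(-k * t)
    return np.sum(residuals**2) / sigma**2

# crude minimisation (Eq. 5.24) by successive ten-fold grid refinement
lo, hi = 0.01, 1.0
for _ in range(30):
    ks = np.linspace(lo, hi, 21)
    k_hat = ks[np.argmin([S(k) for k in ks])]
    lo, hi = k_hat - (hi - lo) / 20, k_hat + (hi - lo) / 20
print(k_hat)  # k_hat is approximately 0.2
```

Because S is unimodal in k for this model, each refinement keeps the true minimum inside the bracket, and k_hat converges to the value used to generate the data.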
Here, n is the total number of measurements, p is the number of estimated parameters, n − p is the degrees of freedom, Smin(y,θ̂) is the minimum objective function value and F. is the Jacobian matrix, which corresponds to the first order derivative of the model function, f, with respect to the parameter vector θ evaluated at θ = θ̂.

The covariance matrix is a square matrix with (p × p) dimensions. The diagonal elements of the matrix are the variances of the parameter estimators, while the non-diagonal elements are the covariances between any pair of parameter estimators.

The 95 % confidence interval of the parameter estimators can now be approximated. Assuming a large n, the confidence intervals (the difference between the estimators and the true parameter values) follow a Student's t-distribution, giving the confidence interval at the 100 · (1−α) % significance level:

θ1−α = θ̂ ± tN−p α/2 · √(diag cov(θ̂))    Eq. 5.27

where tN−p α/2 is the upper α/2 percentile of the t-distribution with N − p degrees of freedom, and diag cov(θ̂) represents the diagonal elements of the covariance matrix of the parameters.

The pairwise linear correlation between the parameter estimators, Rij, can be obtained by calculating a correlation matrix from unit standardization of the covariance matrix as follows:

Rij = cov(θi, θj) / (σθi · σθj)    Eq. 5.28

This linear correlation ranges over [−1, 1] and indicates whether or not a parameter estimator is correlated with another one. The MLE theory outlined above assumes that the measurement errors are independent and normally distributed (Eq. 5.20). In many practical applications, however, this condition is rarely satisfied. Hence, theoretically the MLE method for parameter estimation cannot be applied without compromising its assumptions, which may lead to over- or underestimation of the parameter estimation errors and their covariance structure.

An alternative to this approach is the bootstrap method developed by Efron (1979), which removes the assumption that the residuals follow a normal distribution. Instead, the bootstrap method works with the actual distribution of the measurement errors, which is then propagated to the parameter estimation errors by using an appropriate Monte Carlo scheme (Figure 5.1).

The bootstrap method uses the original data set D(0), with its N data points, to generate any number of synthetic data sets DS(1), DS(2), …, also with N data points each. The procedure is simply to draw N data points with replacement from the set D(0). Because of the replacement, sets are obtained in which a random fraction of the original measured points, typically 1/e ≈ 37 %, are replaced by duplicated original points. This is illustrated in Figure 5.1.

The application of the bootstrap method for parameter estimation in the field of wastewater treatment requires adjustment due to the nature of the data, which are time series. Hence, the sampling is not performed from the original data points (which form a time series and follow a particular trend). Instead, the sampling is performed from the residual errors, which are then added to the simulated model outputs (obtained using the reference parameter estimates) (Figure 5.1). This is reasonable because it is the measurement errors that are assumed to be stochastic, not the main trend of the measured data points, which is caused by the biological processes/mechanisms. Bearing this in mind, the theoretical background of the bootstrap method is outlined below.
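The uncertainty calculations of Eqs. 5.25-5.28 can be sketched for a model that is linear in its parameters, where the Jacobian F is exact (the data, the fixed noise vector and the tabulated t-value are illustrative assumptions):

```python
import numpy as np

# hypothetical measurements of y = a + b*t with small, fixed errors
t = np.arange(10.0)
noise = np.array([0.05, -0.03, 0.02, -0.04, 0.01,
                  0.03, -0.02, -0.05, 0.04, -0.01])
y = 1.0 + 0.5 * t + noise

F = np.column_stack([np.ones_like(t), t])   # Jacobian d f / d theta (exact here)
theta_hat, *_ = np.linalg.lstsq(F, y, rcond=None)

n, p = len(y), 2
residuals = y - F @ theta_hat
s2 = residuals @ residuals / (n - p)        # Eq. 5.26
cov = s2 * np.linalg.inv(F.T @ F)           # Eq. 5.25
stderr = np.sqrt(np.diag(cov))

t_crit = 2.306                              # tabulated t(alpha/2 = 0.025, n - p = 8)
ci = np.column_stack([theta_hat - t_crit * stderr,
                      theta_hat + t_crit * stderr])   # Eq. 5.27
R = cov / np.outer(stderr, stderr)          # Eq. 5.28, correlation matrix
print(theta_hat, stderr)
print(ci)
print(R[0, 1])
```

The diagonal of R is 1 by construction, and the 95 % confidence interval of the slope should contain the value used to generate the data.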
Figure 5.1 Illustration of the workflow for the bootstrap method: synthetic data sets ŷS(1), ŷS(2), ŷS(3), ŷS(4), … are generated by Monte Carlo samples (random sampling with replacement) from the residuals, e, of the reference MLE fit of the experimental data set, y. For each synthetic data set, the same estimation procedure is performed, giving M different sets of parameter estimates: θ̂S(1), θ̂S(2), … θ̂S(M).
Let us define a simple nonlinear model, where yi is the i-th measurement, fi is the i-th model prediction, θ is a parameter vector (of length p), and εi is the measurement error of yi:

yi = fi(θ) + εi  where εi ~ F    Eq. 5.29

The distribution of errors, F, is not known. This is unlike in MLE, where the distribution is assumed a priori. Given y, least squares minimization is used to estimate θ:

θ̂: min θ ‖y − f(θ)‖²    Eq. 5.30

The bootstrap method defines F̂ as the sample probability distribution of ε as follows:

F̂ = 1/n (density) at εi = yi − fi(θ̂),  i = 1, 2, … n    Eq. 5.31

The density is the probability of the i-th observation. In a uniform distribution each observation (in this case the measurement error, εi) has an equal probability of occurrence, hence the density is estimated as 1/n. The bootstrap sample, y*, given (θ̂, F̂), is then generated as follows:

y*i = fi(θ̂) + ε*i  where ε*i ~ F̂    Eq. 5.32

The realisation of the measurement error in each bootstrap sample, ε*, is simulated by random sampling with replacement from the original residuals, which assigns each point a uniform (probability) weight. By performing N such random samplings with replacement and adding them to the model predictions (Eq. 5.32), a new synthetic data set is generated, Ds(1) = y*.

By repeating the above sampling procedure M times, M data sets are generated: Ds(1), Ds(2), Ds(3), … Ds(M).

Each synthetic data set, Ds(j), makes it possible to obtain a new parameter estimator, θ̂(j), by the same least squares minimisation method, which is repeated M times:

θ̂j: min θ ‖Ds(j) − f(θ)‖²  where j = 1, 2, … M    Eq. 5.33

The outcome of this iteration is a matrix of parameter estimators, θ̂ (M × p), where M is the number of Monte Carlo samples of synthetic data and p is the number of parameters estimated. Hence, each parameter estimator now has a column vector of M values. This vector of values can be plotted as a histogram and interpreted using common frequentist measures such as the mean, the standard deviation and the 95 % percentiles. The covariance and correlation matrices can be computed using θ̂ (M × p) itself. This effectively provides all the needed information on the quality of the parameter estimators.
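A compact sketch of the bootstrap loop (Eqs. 5.29-5.33), using a model that is linear in its parameters so that each re-estimation is a simple least squares solve (the data, seed and M are illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
t = np.arange(20.0)
y = 2.0 + 0.3 * t + rng.normal(0.0, 0.2, size=t.size)   # "experimental" data

F = np.column_stack([np.ones_like(t), t])
theta_ref, *_ = np.linalg.lstsq(F, y, rcond=None)   # reference estimation (Eq. 5.30)
residuals = y - F @ theta_ref                       # empirical error sample (Eq. 5.31)

M = 500
theta_boot = np.empty((M, 2))
for j in range(M):
    # random sampling with replacement from the residuals (Eq. 5.32)
    eps_star = rng.choice(residuals, size=residuals.size, replace=True)
    y_star = F @ theta_ref + eps_star               # synthetic data set Ds(j)
    theta_boot[j], *_ = np.linalg.lstsq(F, y_star, rcond=None)   # Eq. 5.33

mean = theta_boot.mean(axis=0)
std = theta_boot.std(axis=0, ddof=1)   # bootstrap parameter uncertainty
print(mean, std)
```

The columns of theta_boot are the M bootstrap realisations of each estimator; their histograms, standard deviations and percentiles give the uncertainty information described above.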
5.2.3.2 The Monte Carlo method

The Monte Carlo (MC) method was originally used to calculate multi-dimensional integrals, and its systematic use started in the 1940s with the 'Los Alamos School' of mathematicians and physicists, namely Von Neumann, Ulam, Metropolis, Kahn, Fermi and their collaborators. The term was coined by Ulam in 1946 in honour of a relative who was keen on gambling (Metropolis and Ulam, 1949).

Within the context of uncertainty analysis, which is concerned with estimating the error propagation from a set of inputs to a set of model outputs, the integrals of interest are the mean and the variance of the model outputs, which are themselves indeed multidimensional integrals (the dimensionality is determined by the length of the vector of input parameters). The variance, for example, is approximated by the sample variance of N Monte Carlo evaluations of the model f, where E̅ is the corresponding sample mean:

σ²(f) ≈ s²(f) = 1/(N−1) · Σi=1..N (f(xi) − E̅)²    Eq. 5.38

For notational simplicity, we consider the following simple model: y = f(x), where the function f represents the model under study, x = [x1; … xd] is the vector of the model inputs, and y = [y1; … yn] is the vector of the model predictions.

The goal of an uncertainty analysis is to determine the uncertainty in the elements of y that results from uncertainty in the elements of x. Given uncertainty in the vector x characterised by the distribution functions D = [D1, … Dd], where D1 is the distribution function associated with x1, the uncertainty in y is given by:

E(y) = ∫ f(x) dx
var(y) = ∫ (f(x) − E(y))² dx    Eq. 5.39
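The Monte Carlo estimates of the output mean, the sample variance (Eq. 5.38) and the percentile bounds can be sketched as follows (the model, the input distributions and the seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=42)
N = 100_000

# input uncertainty: two inputs with assumed distributions D1, D2
x1 = rng.uniform(0.4, 0.6, N)    # e.g. an uncertain yield
x2 = rng.normal(4.0, 0.1, N)     # e.g. an uncertain rate constant

y = x2 * (1.0 - x1)              # the model y = f(x), evaluated per sample

E_y = y.mean()                           # MC estimate of E(y)
var_y = y.var(ddof=1)                    # sample variance, Eq. 5.38
lo, hi = np.percentile(y, [2.5, 97.5])   # 95 % uncertainty bounds
print(E_y, var_y, lo, hi)
```

For these input distributions the expected value of y is 4.0 · 0.5 = 2.0, and the MC estimate converges to it as N grows.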
where var(y) and E(y) are the variance and the expected value, respectively, of the vector of random variables, y, which are computed by the Monte Carlo sampling technique. In addition to the variance and mean values, one can also easily compute percentiles for y, including the 95 % upper and lower bounds.

5.2.4 Local sensitivity analysis and identifiability analysis

Most of the sensitivity analysis results reported in the literature are of a local nature, and these are also called one-factor-at-a-time (OAT) methods. In OAT methods, each input variable is varied (also called perturbed) one at a time around its nominal value, and the resulting effect on the model outputs is quantified. Global methods expand the analysis from one point in the parameter space to cover a broader range in the entire parameter space, but this is beyond the scope of this chapter (interested readers can consult literature elsewhere, such as Saltelli et al., 2000; Sin et al., 2009).

The local sensitivity measure is commonly defined using the first order derivative of an output, y = f(x), with respect to an input parameter, x:

Absolute sensitivity: sa = ∂y/∂x    Eq. 5.40

(the effect on y of perturbing x around its nominal value x0).

Relative sensitivity: sr = (∂y/∂x) · (x°/y°)    Eq. 5.41

The relative sensitivity functions are non-dimensional with respect to units and are used to compare the effects of model inputs among each other.

These first-order derivatives can be computed analytically, for example using Maple or the Matlab symbolic manipulation toolbox. Alternatively, the derivatives can be obtained numerically by model simulations with a small positive or negative perturbation, Δx, of the model inputs around their nominal values, x0. Depending on the direction of the perturbation, the sensitivity can be approximated using the forward, backward or central difference methods:

Forward perturbation: ∂y/∂x = (f(x0 + Δx) − f(x0)) / Δx    Eq. 5.42

Backward perturbation: ∂y/∂x = (f(x0) − f(x0 − Δx)) / Δx    Eq. 5.43

Central difference: ∂y/∂x = (f(x0 + Δx) − f(x0 − Δx)) / (2Δx)    Eq. 5.44

When an appropriately small perturbation step, Δx, is selected (usually a perturbation factor ε = 10⁻³ is used, hence Δx = ε · x), all three methods provide practically the same results.

Once the sensitivity functions have been calculated, they can be used to assess parameter significance in determining the model outputs. Typically, large absolute values indicate high parameter importance, while a value close to zero implies no effect of the parameter on the model output (hence the parameter is not influential). This information is useful for assessing parameter identifiability issues in the design of experiments.

The first step in parameter estimation is determining which sets of parameters can be selected for estimation. This problem is the subject of identifiability analysis, which is concerned with identifying which subsets of parameters can be identified uniquely from a given set of measurements. Thereby, it is assumed a model can have a number of parameters. Here the term uniquely is
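The finite-difference approximations of Eqs. 5.42-5.44 and the relative sensitivity of Eq. 5.41 can be sketched as follows (the Monod-type rate used as model output and the parameter values are hypothetical):

```python
def monod_rate(mu_max, Ks, S=2.0):
    """Hypothetical model output: a Monod growth rate at substrate level S."""
    return mu_max * S / (Ks + S)

def sensitivities(f, x0, eps=1e-3):
    """Forward, backward and central difference estimates of dy/dx
    (Eqs. 5.42-5.44), with perturbation Dx = eps * x0."""
    dx = eps * x0
    fwd = (f(x0 + dx) - f(x0)) / dx
    bwd = (f(x0) - f(x0 - dx)) / dx
    cen = (f(x0 + dx) - f(x0 - dx)) / (2 * dx)
    return fwd, bwd, cen

mu_max0, Ks0 = 4.0, 0.5                       # nominal parameter values
f_Ks = lambda Ks: monod_rate(mu_max0, Ks)     # output as a function of Ks only
fwd, bwd, cen = sensitivities(f_Ks, Ks0)

sr = cen * Ks0 / monod_rate(mu_max0, Ks0)     # relative sensitivity (Eq. 5.41)
print(fwd, bwd, cen, sr)
```

The analytic derivative here is −mu_max·S/(Ks + S)² = −1.28, and the relative sensitivity is −Ks/(Ks + S) = −0.2; all three difference schemes agree with these values to within the perturbation error.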
important and needs to be understood as follows: a parameter estimate is unique when its value can be estimated independently of other parameter values and with sufficiently high accuracy (i.e. a small uncertainty). This means that the correlation coefficient between any pair of parameters should be low (e.g. lower than 0.5) and the standard error of the parameter estimates should be low (e.g. the relative error of the parameter estimate, σθ/θ, lower than e.g. 25 %). As it turns out, many parameter estimation problems are ill-conditioned. A problem is defined as ill-conditioned when the condition number of a function/matrix is very high, which is caused by multicollinearity issues. In regression problems, the condition number is used as a diagnostic tool to identify parameter identifiability issues, and such regression diagnostics are helpful in generating potential candidate parameter subsets for estimation from which the user can select.

A practical identifiability analysis is typically performed in two steps. In Step 1, parameters that have negligible or near-zero influence on the measured model outputs are screened out from consideration for parameter estimation. In the second step, the collinearity index is calculated for each parameter subset (all the combinations of parameter subsets which include 2, 3, 4, … m parameters). The collinearity index is a measure of the similarity between any two vectors of sensitivity functions. Subsets that have highly similar sensitivity functions will tend to have a very large index (γK → ∞), while independent vectors will have a smaller value, γK ≈ 1, which is desirable. In identifiability analysis, a threshold value of 5-20 is usually used in the literature (Brun et al., 2001; Sin and Vanrolleghem, 2007; Sin et al., 2010). It is noted that this γK value is to be used as guidance for selecting candidate parameter subsets for parameter estimation. Best practice is to iterate and try a number of the higher-ranking subsets.
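The collinearity index γK (defined in Eqs. 5.46-5.48 as γK = 1/√(min λK), with λK the eigenvalues of the product of the unit-normalised sensitivity matrix with its transpose) can be sketched as follows (the sensitivity functions used here are illustrative, not from a real model):

```python
import numpy as np

def collinearity_index(S):
    """gamma_K = 1 / sqrt(min eigenvalue of snorm^T snorm), where each column
    of S (one parameter's sensitivity function over time) is normalised to
    unit length (Eqs. 5.46-5.48)."""
    snorm = S / np.linalg.norm(S, axis=0)
    lam = np.linalg.eigvalsh(snorm.T @ snorm)
    return 1.0 / np.sqrt(lam.min())

t = np.linspace(0.0, 10.0, 50)
# near-orthogonal sensitivity functions: subset is identifiable (gamma close to 1)
S_indep = np.column_stack([np.sin(t), np.cos(t)])
# nearly proportional sensitivity functions: subset is poorly identifiable
S_collin = np.column_stack([t, 1.001 * t + 1e-3])

print(collinearity_index(S_indep))    # close to 1
print(collinearity_index(S_collin))   # very large
```

A subset whose index exceeds the usual 5-20 threshold would be rejected as a candidate for simultaneous estimation.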
1
γ = First establish which variables of interest are measured
minλ
Eq. 5.46
and then define the matrices as follows: Em includes the
elemental composition and the degree of reductions for
λK = eigen(snormTK snormK ) Eq. 5.47 these measured variables, while Eu includes those of
sr unmeasured variables. To calculate the degree of
snorm = Eq. 5.48 reduction, use the procedure given in Section 5.2.1.2.
‖sr‖
Step 4. Calculate the yield coefficients. In this step, the parameter estimation problem is defined
as a minimization problem and solved using optimization
In this step, since all the species rates of algorithms (e.g. fminsearch in Matlab)
consumption/productions are now known, the yield
coefficients can be calculated using Eq. 5.2 and the Step 4. Estimate the uncertainty of the parameter
process stoichiometry can be written using the yield estimators and model outputs.
coefficient values.
In this step, calculate the covariance matrix of the
Step 5. Verify the elemental balance. parameter estimators and compute the parameter
confidence intervals as well as the parameter correlation
In this step, a simple check is performed to verify that the elemental balances and the degree of reduction balance are closed. If not, the procedure needs to be iterated using a different hypothesis concerning the formation of by-products.

5.3.2 Parameter estimation workflow for the non-linear least squares method

This workflow assumes that an appropriate and consistent mathematical model is used to describe the data. Such a model conforms to the elemental balance and degree of reduction analysis (see the workflow in Section 5.3.1). Usually these models are available from literature. Most of them are modified from ASM models with appropriate simplifications and/or additions reflecting the conditions of the batch experiment.

Step 1. Initialisation.

In this step, the initial conditions for the model variables are specified, as well as a nominal set of parameters for the model. The initial conditions are specified according to the experimental conditions (e.g. 10 mg NH4-N added at time 0, kLa set to a certain value, oxygen saturation at a given temperature, etc.). An initial guess of the model parameters is taken from literature.

Step 2. Select the experimental data and a parameter subset for the parameter estimation.

In this step, the experimental data for the parameter estimation is reviewed and the parameters that need to be estimated are defined. This can be done using expert judgement or, more systematically, a sensitivity and identifiability analysis (see Section 5.3.4).

Step 3. Define and solve the parameter estimation problem.

matrix. Given the covariance matrix of the parameter estimators, estimate the covariance matrix of the model outputs by linear error propagation.

Step 5. Review and analyse the results.

In this step, review the parameter values, which should be within the range of parameter values obtained from the literature. In addition, inspect the confidence intervals of the parameter estimators. Very large confidence intervals imply that the parameter in question may not be estimated reliably and should be excluded from the subset.

Further, plot and review the results from the best-fit solution. Typically, the data and the model predictions should fit well.

If the results (both the parameter values and the best-fit solution to the data) are not satisfactory, iterate as appropriate by going back to Step 1 or Step 2.

5.3.3 Parameter estimation workflow for the bootstrap method

The workflow of the bootstrap method follows on from Steps 1, 2 and 3 of the non-linear least squares method.

Step 1. Perform a reference parameter estimation using the non-linear least squares method.

This step is basically an execution of Steps 1, 2 and 3 of the workflow of the non-linear least squares technique. The output is a residual vector that is passed on to the next step. The residual vector is then plotted and reviewed. If the residuals follow a systematic pattern (they should be random) or contain outliers, this is a cause for concern as it may imply that the bootstrap method is not suited for the application.
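As a minimal illustration of the non-linear least squares workflow (Steps 1-3, plus the covariance computation), the sketch below uses Python and SciPy rather than the chapter's Matlab code, with a hypothetical first-order decay model and synthetic data; all names and numbers here are illustrative assumptions, not taken from the chapter.

```python
import numpy as np
from scipy.optimize import least_squares

# Step 1. Initialisation: a simple first-order decay model, S(t) = S0*exp(-k*t),
# with a nominal (literature-style) initial guess for theta = [S0, k].
def model(theta, t):
    S0, k = theta
    return S0 * np.exp(-k * t)

t = np.linspace(0, 4, 20)
rng = np.random.default_rng(1)
y_obs = model([10.0, 0.8], t) + rng.normal(0, 0.1, t.size)  # synthetic "measurements"

# Step 2. Select the data and the parameter subset (here: both parameters).
# Step 3. Define and solve the least-squares problem on the residual vector.
res = least_squares(lambda th: model(th, t) - y_obs, x0=[5.0, 0.3])

# Step 4 (sketch). Covariance of the estimators from the Jacobian at the optimum:
dof = t.size - res.x.size
s2 = np.sum(res.fun**2) / dof                    # variance of the residual errors
pcov = s2 * np.linalg.inv(res.jac.T @ res.jac)   # linear approximation
psigma = np.sqrt(np.diag(pcov))                  # standard deviations of estimators
```

Here the estimates land near the true values [10, 0.8], and psigma quantifies their precision, mirroring Step 4 of the workflow.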
DATA HANDLING AND PARAMETER ESTIMATION 213
Step 2. Generate synthetic data by bootstrap sampling and repeat the parameter estimation.

Synthetic data is generated using Eq. 5.29-5.32 by performing bootstrap sampling (random sampling with replacement) from the residual vector and adding the resampled residuals to the model prediction obtained in Step 1. For each synthetic data set, the parameter estimation of Step 1 is repeated and the output (that is, the values of the parameter estimators) is recorded in a matrix.

Step 3. Review and analyse the results.

In this step, the mean, standard deviation and correlation matrix of the parameter estimators are computed from the matrix recorded in Step 2. Moreover, the distribution function of the parameter estimators can be estimated and plotted using the vector of parameter values obtained in Step 2.

As in Step 5 of the workflow of the non-linear least squares method, the results are interpreted and evaluated using knowledge from literature and process engineering.

5.3.4 Local sensitivity and identifiability analysis workflow

The workflow of this procedure starts with the assumption that a mathematical model is available and ready to be used to describe a set of experimental data.

Step 1. Initialisation.

A framework is defined for the sensitivity analysis by defining the experimental conditions (the initial conditions for the batch experiments) as well as a set of nominal values for the model analysis. The model is solved with these initial conditions and the model outputs are plotted and reviewed before performing the sensitivity analysis.

Step 2. Compute the sensitivity functions.

Define which outputs are measured and hence should be included in the sensitivity analysis. Define the experimental data points (every 1 min versus every 5 min). Compute the sensitivity functions by numerical differentiation using a forward, backward or central difference. Plot, review and analyse the results.

Step 3. Rank the parameter significance.

Calculate the delta mean-square measure, δmsqr, and rank the parameters according to this measure. Exclude any parameters that have zero or negligible impact on the outputs.

Step 4. Compute the collinearity index.

For all the parameter combinations (e.g. subset sizes 2, 3, 4, …, m), the collinearity index, γK, is calculated. Each parameter subset is ranked according to its collinearity index value.

Step 5. Review and analyse the results.

Based on the results from Step 3 and Step 4, identify a short list of candidate parameter subsets that are identifiable. Exclude from any candidate subset the parameters that have near-zero or negligible sensitivity on the outputs.

5.3.5 Uncertainty analysis using the Monte Carlo method and linear error propagation

The workflow for the Monte Carlo method includes the following steps:

Step 1. Input the uncertainty definition.

Identify which inputs (parameters) have uncertainty. Define a range/distribution for each uncertain input, e.g. a normal distribution, a uniform distribution, etc. The output from the parameter estimators (e.g. the bootstrap method) can be used as input here.

Step 2. Sampling from the input space.

Define the sampling number, N (e.g. 50, 100, etc.), and sample from the input space using an appropriate sampling technique. The most common sampling techniques are random sampling, Latin Hypercube sampling, etc. The output from this step is a sampling matrix, XNxm, where N is the number of samples and m is the number of inputs.
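Steps 3 and 4 of the sensitivity and identifiability workflow can be sketched as follows; this is an illustrative Python fragment with a made-up sensitivity matrix, not the chapter's Matlab code. Each column of the sensitivity matrix is normalised to unit length, and the collinearity index of a subset K is taken as γK = 1/√λmin, where λmin is the smallest eigenvalue of the normalised subset's cross-product matrix.

```python
import numpy as np
from itertools import combinations

# Hypothetical non-dimensional sensitivity matrix: rows = observations,
# columns = parameters (e.g. a yield, a maximum growth rate, an affinity constant).
S = np.array([[1.0, 0.9, 0.1],
              [2.0, 1.9, 0.0],
              [3.0, 2.8, 0.2],
              [4.0, 3.9, 0.1]])

# Step 3: delta mean-square significance measure per parameter
dmsqr = np.sqrt(np.mean(S**2, axis=0))

# Step 4: normalise each column to unit length, then compute gamma_K per subset
Sn = S / np.linalg.norm(S, axis=0)

def collinearity_index(cols):
    sub = Sn[:, cols]
    lam_min = np.linalg.eigvalsh(sub.T @ sub).min()
    return 1.0 / np.sqrt(lam_min)

gammas = {K: collinearity_index(K)
          for r in (2, 3) for K in combinations(range(S.shape[1]), r)}
```

In this made-up example, columns 0 and 1 are nearly parallel, so subset (0, 1) gets a very large γK (poorly identifiable), while subsets pairing either of them with the near-orthogonal column 2 score far lower.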
Step 3. Perform the Monte Carlo simulations.

Perform N simulations with the model using the sampling matrix from Step 2. Record the outputs in an appropriate matrix form to be processed in the next step.

Step 4. Review and analyse the results.

Plot the outputs and review the results. Calculate the mean, standard deviation/variance, and percentiles (e.g. 95 %) for the outputs. Analyse the results within the context of parameter estimation quality and model prediction uncertainty. Iterate the analysis, if necessary, by going back to Step 1 or Step 2.

The workflow for linear error propagation:

The workflow is relatively straightforward as it is complementary to the covariance matrix of the parameter estimators and should be performed as part of the parameter estimation in the non-linear least squares method. It requires the covariance matrix of the parameter estimators as well as the Jacobian matrix, which are both obtained in Step 4 of the non-linear least squares methodology.

Ammonia is assumed to be the nitrogen source for growth. The biomass composition is assumed to be CH1.61O0.52N0.15. All the substrates are given on the basis of 1 C-mol, whereas nitrogen is on the basis of 1 N-mol. In this biological process, the substrates are CH2O (glucose) and NH3. The products are CH1.61O0.52N0.15 (biomass), CH3O0.5 (ethanol), CH8/3O (glycerol) and CO2. Water is excluded from the analysis, as its rate of production is not considered relevant to the process.

Step 2. Compose the elemental composition matrices (Em and Eu).

As the process has six species (substrates + products) and three constraints (two elemental balances, for C and N, plus a degree of reduction balance), measurement of three rates is sufficient to estimate/infer the remaining rates.

To illustrate the concept, the measured rates are selected as the volumetric consumption rate of substrate (-qs), the biomass production rate (qx) and the glycerol production rate (qg); hence, the remaining rates for ammonia consumption as well as the production of ethanol and CO2 need to be estimated using Eq. 5.18. In the measured rate vectors, a negative sign indicates the consumption of a species, while a positive sign indicates the production of a species.

Step 3. Compute the unmeasured rates of the species (qu).

Recall Eq. 5.18, which is solved as follows:

$$E_m \cdot q_m + E_u \cdot q_u = 0$$

With the measured species ordered as S (glucose), X (biomass) and Gly (glycerol), and the unmeasured species as NH3, Eth (ethanol) and CO2, the C balance, N balance and degree of reduction (γ) balance read:

$$
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 0.15 & 0 \\ 4 & 4.12 & 4.67 \end{bmatrix}
\begin{bmatrix} -q_s \\ q_x \\ q_g \end{bmatrix}
+
\begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 6 & 0 \end{bmatrix}
\begin{bmatrix} -q_n \\ q_e \\ q_c \end{bmatrix}
= 0
$$

Solving the system of linear equations above yields the following solution, where the three unmeasured rates are calculated as a function of the measured rates qs, qg and qx:

$$
\begin{bmatrix} -q_n \\ q_e \\ q_c \end{bmatrix}
=
\begin{bmatrix} -0.15\,q_x \\ 2q_s/3 - 467q_g/600 - 103q_x/150 \\ q_s/3 - 133q_g/600 - 47q_x/150 \end{bmatrix}
$$

Once the rates of all the products and substrates are estimated, one can then calculate the yield coefficients for the process by recalling Eq. 5.2 as follows:
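The linear solve above can be verified numerically. The sketch below (Python with NumPy; the variable names and the example rate values are chosen here for illustration) solves Eu·qu = -Em·qm and checks the closed-form expressions:

```python
import numpy as np

# Elemental composition matrices: rows = C balance, N balance, degree of reduction.
Em = np.array([[1.0, 1.0,  1.0 ],   # columns: S (glucose), X (biomass), Gly (glycerol)
               [0.0, 0.15, 0.0 ],
               [4.0, 4.12, 4.67]])
Eu = np.array([[0.0, 1.0, 1.0],     # columns: NH3, Eth (ethanol), CO2
               [1.0, 0.0, 0.0],
               [0.0, 6.0, 0.0]])

qs, qx, qg = 1.0, 0.3, 0.1          # example measured rates (C-mol basis)
qm = np.array([-qs, qx, qg])

qu = np.linalg.solve(Eu, -Em @ qm)  # qu = [-qn, qe, qc]

# Closed-form solution obtained from the three balances:
expected = np.array([-0.15*qx,
                     2*qs/3 - 467*qg/600 - 103*qx/150,
                     qs/3 - 133*qg/600 - 47*qx/150])
```

The numerical solution coincides with the fractions given above, confirming the reconstruction of the balance equations.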
respiration related to heterotrophic biomass is constant (hence not modelled), (ii) the inert fraction of the biomass released during decay is negligible (hence not modelled), and (iii) the ammonium consumed for the autotrophic growth of biomass is negligible. It is noted that, for the sake of completeness, all of the above phenomena should be described, which would make the analysis more accurate. However, here the model is kept simple to focus the attention of the reader on the workflow of parameter estimation.
Table 5.1 The two-step nitrification model structure using matrix representation (adopted from Sin et al., 2008)
NOO growth rate: µmaxNOO · MNO2 · MO,NOO · XNOO

Monod saturation terms used in the process rates:
MNH = SNH / (SNH + Ks,AOO); MO,AOO = SO / (SO + Ko,AOO); MO,NOO = SO / (SO + Ko,NOO); MNO2 = SNO2 / (SNO2 + Ks,NOO)
The model has in total six ordinary differential equations (ODEs), which correspond to one mass balance for each variable of interest. Using matrix notation, each ODE can be formulated from the stoichiometry and process rates given in Table 5.1.

The model has in total 12 parameters. The nominal values as well as their ranges are taken from literature (Sin et al., 2008) and shown in Table 5.2.

The model has six state variables, all of which need to be specified to solve the system of ODEs. The initial condition corresponding to batch test 1 is shown in Table 5.3.

The parameter estimation is programmed as a minimization problem using the sum of the squared errors as the cost function and solved using an unconstrained non-linear optimisation solver (the fminsearch algorithm in Matlab) with the initial parameter guess given in Table 5.2 and the initial conditions given in Table 5.3. To simulate the inhibitor addition, the maximum growth rate of NOO is assumed to be zero in the model simulations. The best estimates of the parameter estimators are given in Table 5.4.
Table 5.2 Nominal values of the model parameters used as an initial guess for parameter estimation together with their upper and lower bounds.
Table 5.3 Initial condition of the state variables for the model in batch test 1.
Figure 5.2 Data collected in batch test 1. NH4, NO2 and DO are used as the measured data set.
218 EXPERIMENTAL METHODS IN WASTEWATER TREATMENT
%%step 3 define and solve parameter estimation problem (as a minimization problem)
options = optimset('display','iter','tolfun',1.0e-06,'tolx',1.0e-5,'maxfunevals',1000);
[pmin,sse] = fminsearch(@costf,pinit,options,td,yd,idx,iy);

%% get the Jacobian matrix: use the built-in "lsqnonlin.m" but with no iteration
options = optimset('display','iter','tolfun',1.0e-06,'tolx',1.0e-5,'maxfunevals',0);
[~,~,residual,~,~,~,jacobian] = lsqnonlin(@costl,pmin,[],[],options,td,yd,idx,iy);
j(:,:) = jacobian; e = residual;
Step 4. Estimate the uncertainty of the parameter estimators and the model prediction uncertainty.

In this step, the covariance matrix of the parameter estimators is computed. From the covariance matrix, the standard deviation, the 95 % confidence intervals as well as the correlation matrix are obtained. The results are shown in Table 5.4 and Table 5.5.

s = e'*e/dof; % variance of errors
%% calculate the covariance of parameter estimators
pcov = s*inv(j'*j); % covariance of parameters
psigma = sqrt(diag(pcov))'; % standard deviation of parameters
pcor = pcov ./ [psigma'*psigma]; % correlation matrix
alfa = 0.025; % significance level
tcr = tinv((1-alfa),dof); % critical t-dist value at alfa
p95 = [pmin-psigma*tcr; pmin+psigma*tcr]; % +-95% confidence intervals

Table 5.4 Optimal values of the parameter estimators after the solution of the parameter estimation problem.

Parameter   Initial guess, θ°   Optimal values, θ
YAOO        0.1                 0.15
μmaxAOO     0.8                 1.45
Ks,AOO      0.4                 0.50
Ko,AOO      0.5                 0.69
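The fminsearch-based estimation above can be mirrored in Python with the Nelder-Mead simplex method applied to a sum-of-squared-errors cost function. The sketch below is an illustrative assumption-laden example (a hypothetical one-state exponential model and synthetic data), not the chapter's nitrification model:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical measured data from a first-order process y = y0*exp(-k*t)
t = np.linspace(0, 3, 15)
rng = np.random.default_rng(0)
yd = 8.0 * np.exp(-1.2 * t) + rng.normal(0, 0.05, t.size)

def costf(p, t, yd):
    """Sum of squared errors between model and data (the cost function)."""
    y = p[0] * np.exp(-p[1] * t)
    return np.sum((y - yd) ** 2)

# Unconstrained simplex search from an initial guess (fminsearch analogue)
res = minimize(costf, x0=[5.0, 0.5], args=(t, yd), method='Nelder-Mead')
pmin, sse = res.x, res.fun
```

The simplex method needs no gradients, which is convenient for ODE-based cost functions, but, as in the chapter, a separate Jacobian evaluation is then required to quantify the estimation uncertainty.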
Table 5.5 Parameter estimation quality for the ammonium oxidation process: standard deviation, 95 % confidence intervals and correlation matrix.

Parameter   Optimal value, θ   Standard deviation   95 % CI lower bound   95 % CI upper bound   Correlation matrix (YAOO, μmaxAOO, Ks,AOO, Ko,AOO)
YAOO        0.15               0.0076               0.130                 0.160                 1   0.96   0.0520   0.17
μmaxAOO     1.45               0.0810               1.290                 1.610                     1      0.0083   0.42
Ks,AOO      0.50               0.0180               0.470                 0.540                            1        -0.26
Ko,AOO      0.69               0.0590               0.570                 0.800                                     1
Using the covariance matrix of the parameter estimators, the uncertainty in the model prediction is also calculated and the results are shown in Figure 5.3.

%% calculate confidence intervals on the model output
ycov = j * pcov * j';
ysigma = sqrt(diag(ycov)); % std of model outputs
ys = reshape(ysigma,n,m);
y95 = [y(:,iy) - ys*tcr y(:,iy) + ys*tcr]; % 95% confidence intervals
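The same linear error propagation (ycov = J·pcov·Jᵀ) can be sketched in Python. The model and the numbers below are illustrative assumptions; a linear-in-parameters model is chosen because there the propagation is exact and can be checked analytically:

```python
import numpy as np

# Hypothetical model y(t) = a + b*t, so the Jacobian of the outputs with
# respect to theta = [a, b] has rows J_i = [1, t_i].
t = np.linspace(0, 4, 9)
J = np.column_stack([np.ones_like(t), t])

# Hypothetical parameter covariance from a previous estimation step
pcov = np.array([[0.04, 0.01],
                 [0.01, 0.09]])

ycov = J @ pcov @ J.T            # propagated output covariance
ysigma = np.sqrt(np.diag(ycov))  # std of each model output point

# For this linear model the propagation is exact:
# var(y) = var(a) + 2*t*cov(a,b) + t^2*var(b)
check = np.sqrt(0.04 + 2*t*0.01 + t**2 * 0.09)
```

For a non-linear ODE model, as in the chapter, J is evaluated numerically at the optimum, and the propagation is a first-order (linearised) approximation.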
Figure 5.3 Model outputs including 95 % confidence intervals calculated using linear error propagation (red lines). The results are compared with the
experimental data set.
Step 5. Review and analyse the results.

The estimated parameter values (Table 5.5) are found to be within the range reported in literature. This is an indication that the parameter values are credible. The uncertainty of these parameter estimators is found to be quite low. For example, the relative error (the standard deviation divided by the mean value of the parameter) is less than 10 %, which is also reflected in the small confidence intervals. This indicates that the parameter estimation quality is good. It is usually noted that a relative error higher than 50 % is indicative of bad estimation quality, while a relative error below 10 % is good.

Regarding the correlation matrix, typically when estimating parameters from batch data for Monod-like models, the growth yield is significantly correlated with the maximum growth rate (here the linear correlation coefficient is 0.96). Also notable is the correlation between the maximum growth rate and the oxygen affinity constant. This means that a unique estimation of the yield and the maximum growth rate is not possible. Further investigation of the correlation requires a sensitivity analysis, which is demonstrated in Example 5.5.

Since the parameter estimation uncertainty is low, the uncertainty in the model predictions is also observed to be small. In Figure 5.3, the mean (or average) model prediction and the 95 % upper and lower bounds are quite close to each other. This means that the model prediction uncertainty due to the parameter estimation uncertainty is negligible. It is noted that a comprehensive uncertainty analysis of the model predictions would require analysis of all the other sources of uncertainty, including other model parameters as well as the initial conditions. However, this is outside the scope of this example and can be seen elsewhere (Sin et al., 2010). Measurement error uncertainty is considered in Example 5.6.

This concludes the analysis of parameter estimation using the non-linear least squares method for the AOO parameters.

Part 2. Estimate the parameters for the NOO step.

Steps 1 and 2. Initial conditions and selection of data and parameter subsets for the parameter estimation.

The same initial condition as in batch test 1 is used in batch test 2, but without any inhibitor addition, meaning that in this example nitratation is active. The data collected from batch test 2 is shown in Figure 5.4 and includes ammonium, nitrite, nitrate and DO measurements.

• Y2 = [NH4 NO2 NO3 DO]; selected measurement set, Y.
The parameter values of AOO were set to the values estimated in the first part (Table 5.4) and are hence known, while the yield and kinetic parameters of NOO can be identified from the data:

• θ2 = [YNOO µmaxNOO Ks,NOO Ko,NOO]; parameter subset for the estimation.

Figure 5.4 Data collected in batch test 2: NH4+-N, NO2--N and NO3--N (mg N L-1) and DO (mg O2 L-1) versus time (h).
Steps 3 and 4. Solve the parameter estimation problem and calculate the parameter estimation uncertainties.

The results of the parameter estimation problem as well as the parameter uncertainties for NOO are shown in Table 5.6.

Table 5.6 Optimal values of the parameter estimators after solution of the parameter estimation problem.

Parameter   Optimal value, θ   Standard deviation   95 % CI lower bound   95 % CI upper bound   Correlation matrix (YNOO, μmaxNOO, Ks,NOO, Ko,NOO)
YNOO        0.04               0.01                 0.01                  0.07                  1.00   1.00   0.54   -0.86
μmaxNOO     0.41               0.13                 0.15                  0.66                         1.00   0.55   -0.86
Ks,NOO      1.48               0.03                 1.42                  1.55                                1.00   -0.37
Ko,NOO      1.50               0.05                 1.39                  1.60                                       1.00
Figure 5.5 Model outputs including 95 % confidence intervals compared with the experimental data set.
Step 5. Review and analyse the results.

The estimated parameter values are within the range reported for the NOO parameters in literature, which makes them credible. However, this time the parameter estimation error is noticeably higher, e.g. the relative error (the ratio of the standard deviation to the optimal parameter value) is more than 30 %, especially for the yield and the maximum growth rate. This is not surprising, since the estimation of the yield and the maximum growth rate is fully correlated (the pairwise linear correlation coefficient is 1). These statistics mean that a unique parameter estimation for the yield, the maximum growth rate and the oxygen half-saturation coefficient of NOO (the pairwise linear correlation coefficient is 0.86) is not possible with this batch experiment. Hence, this parameter subset should be considered as a subset that provides a good fit to the experimental data, while individually each parameter value may not have a sensible/physical meaning.

The propagation of the parameter covariance matrix to the model prediction uncertainty indicates low uncertainty on the model outputs. This means that although the parameters themselves are not uniquely identifiable, they can still be used to perform model predictions, e.g. to describe batch test data. While performing simulations with the model, however, one needs to report the 95 % confidence intervals of the simulated values as well. The latter reflects how the covariance of the parameter estimates (implying the parameter estimation quality) affects the model prediction quality. For example, if the 95 % confidence interval of the model predictions is low, then the effect of the parameter estimation error is negligible.

Part 1 and Part 2 conclude the parameter estimation for the two-step nitrification model. The results show that the quality of the parameter estimation for AOO is relatively higher than that of NOO using batch data from these experiments. This poor identifiability will be investigated later on, using sensitivity analysis to improve the identifiability of the individual parameters of the model.

Regarding the model prediction errors, the 95 % confidence interval of the model outputs is quite low. This means that the effects of the parameter estimation errors on the model outputs are low.
Example 5.4 Estimate the parameters of ammonium oxidation using data from the batch test – the bootstrap method

In this example, we investigate the parameter estimation problem of Part 1 of Example 5.3. We use the data from batch test 1 to estimate the parameters of AOO.

Step 1. Perform a reference parameter estimation using non-linear least squares.

The workflow in this step is exactly the same as Steps 1, 2 and 3 in Example 5.3. The output from this step is the best fit to the data and the distribution of the residuals (Figure 5.6).
Figure 5.6 Residuals of the best-fit solution for ammonium, nitrite and oxygen, plotted against the data index.
Step 2. Generate synthetic data by bootstrap sampling and repeat the parameter estimation.

In this step, bootstrap sampling from the residuals is performed.

nboot = 50; % bootstrap samples
for i=1:nboot
  disp(['the iteration number is : ',num2str(i)])
  onesam = ceil(n*rand(n,m)); % random sampling with replacement
  rsam = res(onesam); % measurement errors for each variable
  ybt = y(:,iy) + rsam; % synthetic data: error + model (ref PE)
  options = optimset('display','iter','tolfun',1.0e-06,'tolx',1.0e-5,'maxfunevals',1000);
  [pmin(i,:),sse(i,:)] = lsqnonlin(@costl,pmin1,plo,phi,options,td,ybt,idx,iy);
  bootsam(:,:,i) = ybt; % record samples
end

Fifty bootstrap samples from the residuals (random sampling with replacement) are performed and added to the model, thereby yielding the 50 synthetic measurement data sets shown in Figure 5.7.
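The residual bootstrap loop can be sketched in Python as follows. This is an illustrative example on a hypothetical straight-line model rather than the chapter's nitrification model; all names and values are assumptions made for the sketch:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical reference fit: straight line y = a + b*t
t = np.linspace(0, 5, 30)
y_obs = 2.0 + 0.7 * t + rng.normal(0, 0.2, t.size)
b, a = np.polyfit(t, y_obs, 1)           # reference parameter estimation
y_hat = a + b * t
res = y_obs - y_hat                       # residual vector

nboot = 50
theta = np.empty((nboot, 2))              # recorded parameter estimates
for i in range(nboot):
    rsam = rng.choice(res, size=res.size, replace=True)  # resample residuals
    ybt = y_hat + rsam                    # synthetic data set
    bi, ai = np.polyfit(t, ybt, 1)        # repeat the estimation
    theta[i] = (ai, bi)

theta_mean = theta.mean(axis=0)           # bootstrap mean of the estimators
theta_std = theta.std(axis=0, ddof=1)     # bootstrap standard deviation
```

The spread of the 50 estimates (theta_std) plays the same role as the covariance-based uncertainty of the non-linear least squares method, but without assuming a distribution for the measurement errors.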
Figure 5.7 Generation of synthetic data using bootstrap sampling from the residuals (50 samples in total).
For each of these synthetic data sets (bootstrap samples), a parameter estimation is performed and the results are recorded for analysis. Because 50 synthetic data sets are generated, 50 different estimates of the parameters are obtained. The results are shown as a histogram for each parameter estimate in Figure 5.8.
Figure 5.8 Distribution of the parameter estimates obtained using the bootstrap method (each distribution contains 50 estimated values for each
parameter).
Step 3. Review and analyse the results.

Step 2 provided a matrix of the parameter estimates, θ50x4. In this step, the mean, standard deviation and correlation matrix properties of this matrix are evaluated. The results are shown in Table 5.7.

%%step 3 Evaluate/interpret distribution of theta
disp('The mean of distribution of theta are')
disp(mean(pmin))
disp('The std.dev. of distribution of theta are')
disp(std(pmin))
disp('')
disp('The correlation of parameters')
disp(corr(pmin))
Table 5.7 Optimal values of the parameter estimators after solving the parameter estimation problem.
Step 1. Initialisation.

We use the initial conditions of batch test 1 as described in Table 5.3 as well as the nominal values of the AOO model parameters as given in Table 5.2.

The model outputs of interest are:

• y = [NH4 NO2 NO3 DO AOO NOO]

The parameter set of interest is:

• θ = [YAOO µmaxAOO Ks,AOO Ko,AOO bAOO]

Step 2. Compute and analyse the sensitivity functions.

In this step, the absolute sensitivity functions are computed using numerical differentiation and the results are recorded for analysis.

for i=1:m; %for each parameter
  dp(i) = pert(i) * abs(ps(i)); % parameter perturbation
  p(i) = ps(i) + dp(i); % forward perturbation
  [t1,y1] = ode45(@nitmod,td,x0,options,p);
  p(i) = ps(i) - dp(i); % backward perturbation
  [t2,y2] = ode45(@nitmod,td,x0,options,p);
  dydpc(:,:,i) = (y1-y2) ./ (2 * dp(i)); % central difference
  dydpf(:,:,i) = (y1-y) ./ dp(i); % forward difference
  dydpb(:,:,i) = (y-y2) ./ dp(i); % backward difference
  p(i) = ps(i); % reset parameter to its reference value
end

The absolute output sensitivity functions are plotted in Figure 5.9 for one parameter, namely the yield of AOO growth, for the purpose of a detailed examination. The interpretation of a sensitivity function is as follows: (i) a higher magnitude (positive or negative alike) means a higher influence, while a lower or near-zero magnitude means a negligible/zero influence of the parameter on the output; (ii) a negative sensitivity means that an increase in the parameter value would decrease the model output; and (iii) a positive sensitivity means that an increase in the parameter value would increase the model output.

With this in mind, it is noted that the yield of AOO has a positive effect on ammonium and an equally negative impact on nitrite. This is expected from the model structure, where there is an inverse relationship between the yield and the ammonium (substrate) consumption. A higher yield means less ammonium is consumed per unit growth of biomass, and hence it would also mean more ammonium present in the batch test. Since less ammonium is consumed, less nitrite would be produced (hence the negative correlation).

On the other hand, it is also noted that the sensitivity of the yield parameter increases gradually during the linear growth phase and starts to decrease as we come nearer to the depletion of ammonium. Once the ammonium is depleted, the sensitivity becomes nil, as expected. As predicted, the yield has a positive impact on AOO growth, since a higher yield means higher biomass production. Regarding oxygen, the yield first has a positive impact that becomes negative towards the depletion of ammonium. This means there is a rather non-linear relationship between the oxygen profile and the yield parameter. As expected, the yield of AOO has no impact on the nitrate and NOO outputs in batch test 1, because of the addition of the inhibitor that effectively suppressed the second step of nitrification.

In the sensitivity analysis, what is informative is to compare the sensitivity functions among each other. This is done in Figure 5.10 using non-dimensional sensitivity functions, which are obtained by scaling the absolute sensitivity functions with the respective nominal values of the parameters and outputs (Eq. 5.41). Figure 5.10 plots the sensitivity of all the model parameters with respect to the six model outputs. Each subplot in the figure presents the sensitivity functions of all the parameters with respect to one model output shown in the legend. The y-axis indicates the non-dimensional sensitivity measure, while the x-axis indicates the time during the batch activity. For example, we observe that the sensitivity of the parameters to nitrate and NOO is zero. This is logical, since NOO activity is assumed to be zero in this simulation.

For the model outputs for ammonium, nitrite and oxygen, the sensitivity functions of the yield and the maximum growth rate for AOO follow an inversely proportional trend/pattern. This inversely proportional relation is the reason why the parameter estimation problem is an ill-conditioned problem. This means that if the search algorithm increases the yield and at the same time decreases the maximum growth rate by a certain fraction, the effect on the model output could be cancelled out. The result is that many combinations of parameter values for the yield and the maximum growth rate can have a similar effect on the model output. This is the reason why a high correlation coefficient is obtained after the parameter estimation has been performed. This means that, for a parameter to be uniquely identifiable, its sensitivity function should be unique and not correlated with the sensitivity functions of the other parameters.
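The central-difference scheme used in the loop above can be sketched in Python. The model below (a hypothetical exponential decay with a known analytical derivative) is chosen here only so that the numerical sensitivity can be checked exactly:

```python
import numpy as np

def model(p, t):
    """Hypothetical output y(t) = p0 * exp(-p1 * t)."""
    return p[0] * np.exp(-p[1] * t)

t = np.linspace(0, 3, 10)
ps = np.array([10.0, 0.8])   # nominal parameter values
pert = 1e-6                  # relative perturbation factor

# absolute sensitivity dy/dp by central difference, one parameter at a time
dydp = np.empty((t.size, ps.size))
for i in range(ps.size):
    dp = pert * abs(ps[i])
    p_f, p_b = ps.copy(), ps.copy()
    p_f[i] += dp             # forward perturbation
    p_b[i] -= dp             # backward perturbation
    dydp[:, i] = (model(p_f, t) - model(p_b, t)) / (2 * dp)

# analytical sensitivities for this particular model:
# dy/dp0 = exp(-p1*t),  dy/dp1 = -p0*t*exp(-p1*t)
exact = np.column_stack([np.exp(-ps[1]*t), -ps[0]*t*np.exp(-ps[1]*t)])
```

For an ODE model, as in the chapter, the two model evaluations are replaced by two calls to the ODE solver with the perturbed parameter vectors.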
Figure 5.9 Absolute sensitivity of the AOO yield on all the model outputs.
Another point of interest regarding these plots is that the relative effect (that is, the magnitude of the values on the y-axis) of the parameters on ammonium, oxygen and nitrite is quite similar. This means that all three of these variables are equally relevant and important for estimating these parameters.

Step 3. Parameter-significance ranking.

In this step the significance of the parameters is ranked by summarizing the non-dimensional sensitivity functions of the parameters to the model outputs using the δmsqr measure. The results are shown in Figure 5.11.

The results show that the decay rate of AOO has almost zero effect on all three of the measured variables (ammonium, nitrite and oxygen) and therefore cannot be estimated. This is known from process engineering, and for this reason short-term batch tests are not used to determine decay constants. This result is therefore a confirmation of the correctness of the sensitivity analysis. With regard to the maximum growth rate and the yield, these parameters are equally important, followed by the affinity constants for oxygen and ammonium. This indicates that at least four parameters can potentially be estimated from the data set.
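The δmsqr ranking can be sketched as follows. This is an illustrative Python fragment with a made-up non-dimensional sensitivity array; δmsqr is taken here as the root of the mean squared non-dimensional sensitivity over the measurement points, following its use in this section:

```python
import numpy as np

# Hypothetical non-dimensional sensitivities: rows = time points,
# columns = parameters [Y, mu_max, Ks, Ko, b]
S = np.array([[ 1.2, -1.1, 0.3, 0.4, 0.001],
              [ 2.0, -1.9, 0.5, 0.7, 0.002],
              [ 1.5, -1.6, 0.4, 0.5, 0.001],
              [ 0.1, -0.2, 0.1, 0.1, 0.000]])

dmsqr = np.sqrt(np.mean(S**2, axis=0))   # one significance measure per parameter
rank = np.argsort(dmsqr)[::-1]           # most significant parameter first

# parameters with near-zero dmsqr (here: the decay rate b) cannot be estimated
negligible = np.flatnonzero(dmsqr < 1e-2)
```

In this made-up array the decay rate column is essentially zero, so it ends up in the negligible set, mirroring the conclusion drawn for bAOO above.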
Figure 5.10 Relative sensitivity functions of the AOO parameters on the model outputs.
(Brun et al., 2002; Sin et al., 2010). As shown here, the analysis would have diagnosed the issue before performing the parameter estimation (PE) and would have indicated that this subset was not suitable for the estimation.
Figure 5.11 Significance ranking of the AOO parameters with respect to the model outputs.
However, given that the sensitivity of bAOO was not influential on the outputs (see Step 3), any subset containing this parameter would not be recommended for parameter estimation. Nevertheless, there remain many subsets that meet a threshold of 5-15 for γK and that can be considered for the parameter estimation problem. The parameter subsets shaded in Table 5.8 meet these identifiability criteria and can therefore be used for parameter estimation. Best practice is to start with the parameter subset with the largest size (number of parameters) and the lowest γK. Taking these considerations of the sensitivity and the collinearity index of the parameter subsets into account helps to avoid an ill-conditioned parameter estimation problem and to improve the quality of the parameter estimates.

Example 5.6 Estimate the model prediction uncertainty of the nitrification model – the Monte Carlo method

In this example, we wish to propagate the parameter uncertainties resulting from parameter estimation (e.g. Example 5.3 and Example 5.4) to the model output uncertainty using the Monte Carlo method.

For the uncertainty analysis, the problem is defined as follows: (i) only the uncertainty in the estimated AOO parameters is considered, (ii) the experimental conditions of batch test 1 are taken into account (Table 5.3), and (iii) the model in Table 5.1 is used to describe the system, with the nominal parameter values in Table 5.2.
Table 5.8 The collinearity index calculation for all the parameter combinations.
Step 1. Input uncertainty definition.

As defined in the above problem definition, only the uncertainties in the estimated AOO parameters are taken into account:

• θinput = [YAOO µmaxAOO Ks,AOO Ko,AOO].

Mean and standard deviation estimates are taken as obtained from the bootstrap method, together with their correlation matrix (Table 5.7). Further, it is assumed that these parameters follow a normal or multivariate normal distribution, since they have a covariance matrix and are correlated. This assumption can be verified by calculating the empirical density function for each parameter using the parameter estimates matrix (θ50x4), as shown in Figure 5.12.

figure
labels=['\theta_1';'\theta_2';'\theta_3';'\theta_4']; % or better the name of parameter
for i=1:4
  subplot(2,2,i)
  [f xi]=ksdensity(pmin(:,i));
  plot(xi,f)
  xlabel(labels(i,:),'FontSize',fs,'FontWeight','bold')
end
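The Matlab ksdensity check can be mirrored in Python with a Gaussian kernel density estimate. The sketch below is illustrative: the "bootstrap sample" is synthetic, drawn here from a normal distribution with a mean and standard deviation of the same order as the YAOO estimates:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(7)
# Synthetic stand-in for one column of the bootstrap estimates matrix (theta_50x4)
pmin_col = rng.normal(loc=0.15, scale=0.0076, size=50)

kde = gaussian_kde(pmin_col)           # empirical density estimate
xi = np.linspace(pmin_col.min(), pmin_col.max(), 200)
f = kde(xi)                            # density values to plot against xi

# for a roughly normal sample the density should peak near the sample mean
peak = xi[np.argmax(f)]
```

If the estimated density is unimodal and roughly symmetric, the normality assumption used for the multivariate sampling in Step 2 is reasonable.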
Figure 5.12 Empirical probability density estimates for the AOO parameters as obtained by the bootstrap method.
Step 2. Sampling from the input space.

Since the input parameters have a known covariance matrix, any sampling technique must take this into account. In this example, since the parameters are defined to follow a normal distribution, the input uncertainty space is represented by a multivariate normal distribution. A random sampling technique is used to sample from this space:

%% do random sampling
N = 100;            %% sampling number
mu = mean(pmin);    %% mean values of the parameters
sigma = cov(pmin);  %% covariance matrix (includes standard deviation and correlation information)
X = mvnrnd(mu,sigma,N);  % sample the parameter space using multivariate random sampling

The output from this step is a sampling matrix, XNxm, where N is the sampling number and m is the number of inputs. The sampled values can be viewed using a matrix plot as in Figure 5.13. In this matrix plot, the diagonal subplots are the histograms of the parameter values, while the off-diagonal subplots show the sampled values for each pair of parameters. The most important observations are that (i) the parameter input space is sampled randomly and (ii) the parameter correlation structure is preserved in the sampled values.

Step 3. Perform the Monte Carlo simulations.

In this step, N model simulations are performed using the sampling matrix from Step 2 (XNxm) and the model outputs are recorded in matrix form, to be processed in the next step.

%% step 3: perform a Monte Carlo simulation for each sampled parameter set
% Solution of the model
initcond;
options = odeset('RelTol',1e-7,'AbsTol',1e-8);
for i = 1:N
    disp(['the iteration number is: ',num2str(i)])
    par(idx) = X(i,:);                           % read a sample from the sampling matrix
    [t,y1] = ode45(@nitmod,td,x0,options,par);   % solve the model
    y(:,:,i) = y1;                               % record the outputs
end
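Steps 2 and 3 can also be sketched in Python. The model below is a deliberately simplified, hypothetical one-state Monod batch model integrated by forward Euler; it stands in for the chapter's nitmod ODE system, and the mean vector and covariance matrix are illustrative values rather than the estimates of Table 5.7.

```python
import numpy as np

rng = np.random.default_rng(7)

# Step 2: sample the input space while preserving the correlation structure
mu = np.array([1.4, 0.5])            # illustrative means: mu_max (1/h), Ks (mg/L)
sigma = np.array([[0.010, 0.002],    # illustrative covariance matrix
                  [0.002, 0.004]])
N = 100                              # sampling number
X = rng.multivariate_normal(mu, sigma, size=N)   # N x m sampling matrix

# Step 3: one model simulation per sampled parameter set
t = np.linspace(0.0, 3.6, 200)       # time grid (h)
dt = t[1] - t[0]
Y = np.zeros((N, t.size))            # output matrix, one trajectory per row
for i, (mu_max, Ks) in enumerate(X):
    s = 20.0                         # initial substrate concentration (mg N/L)
    for k in range(t.size):
        Y[i, k] = s
        s = max(s - dt * mu_max * s / (Ks + s), 0.0)   # forward Euler step
print(Y.shape)
```

Note that the sampled matrix X retains the positive correlation encoded in the off-diagonal entries of sigma, which is the essential property the text highlights.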
Figure 5.13 Plotting of the sampling matrix of the input space, XNxm – the multivariate random sampling technique with a known covariance matrix.
Figure 5.14 Monte Carlo simulation results for the model outputs (including ammonium (mg N L-1), nitrite (mg N L-1) and oxygen (mg O2 L-1)) plotted against time (h).
Step 4. Review and analyse the results.

In this step, the outputs are plotted and the results are reviewed. In Figure 5.14, the Monte Carlo simulation results are plotted for four model outputs.

As shown in Figure 5.15, the mean, standard deviation and percentiles (e.g. 95 %) can be calculated from the output matrix. The results indicate that, for the sources of uncertainty being studied, the uncertainty in the model outputs can be considered negligible. These results are in agreement with the linear error propagation results shown in Figure 5.5.

This means that while there is uncertainty in the parameter estimates themselves, when the estimated parameter subset is used together with its covariance matrix, the uncertainty in the model prediction is low. For any application of these model parameters, they should be used together as a set, rather than individually.
Figure 5.15 Mean and 95 % percentile calculation of the model output uncertainty.
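The summary statistics behind a figure like this reduce to column-wise operations on the Monte Carlo output matrix. A minimal Python sketch, assuming an output matrix Y with one row per Monte Carlo run (as produced in Step 3) and here filled with synthetic decay curves:

```python
import numpy as np

# Synthetic output matrix: 100 Monte Carlo runs x 50 time points
rng = np.random.default_rng(3)
t = np.linspace(0.0, 3.6, 50)
rates = rng.normal(1.0, 0.05, size=(100, 1))         # per-run decay rates
Y = 20.0 * np.exp(-t[None, :] * rates)

mean_t = Y.mean(axis=0)                              # mean trajectory
std_t = Y.std(axis=0, ddof=1)                        # standard deviation trajectory
p2_5, p97_5 = np.percentile(Y, [2.5, 97.5], axis=0)  # 95 % percentile band
print(mean_t[0], p2_5[-1], p97_5[-1])
```

Plotting mean_t together with the band between p2_5 and p97_5 against time reproduces the type of summary shown in Figure 5.15.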
Another point to make is that the evaluated output uncertainty depends on the input uncertainty defined, as well as on the framing, e.g. the initial conditions of the experimental setup. For example, what was not considered in the above example is the measurement uncertainty, or the uncertainty due to other fixed parameters (decay) and initial conditions (the initial concentration of autotrophic bacteria). Therefore these results need to be interpreted within the context in which they were generated.

5.5 ADDITIONAL CONSIDERATIONS

Best practice in parameter estimation

In practice, while the asymptotic theory assumption gives reasonable results, there are often deviations from the underlying assumptions. In particular:
• The measurement errors are often auto-correlated, meaning that many observations are redundant rather than independent (i.e. not independently and identically distributed (iid) random variables). This tends to cause an underestimation of the asymptotic confidence intervals due to a smaller sample variance, σ². A practical solution to this problem is to check the autocorrelation function of the residuals and to filter them, or to perform subsampling such that the autocorrelation in the data set is decreased. The parameter estimation can then be redone using the subsampled data set.
• Parameter estimation algorithms may stop at local minima, resulting in an incorrect linearization result (the point at which the non-linear least squares problem is linearized). To alleviate this issue, the parameter estimation needs to be performed several times with different initial guesses, different search algorithms and/or an identifiability analysis.

Afterwards it is important to verify that the minimum solution is consistent across different minimization algorithms.

Identifiability or ill-conditioning problem: not all the parameters can be estimated accurately. This problem shows up as confidence intervals that are too large compared to the mean or optimized values of the parameter estimators. The solution is to perform an identifiability analysis or a re-parameterisation of the model, so that a lower number of parameters needs to be estimated.

While we have robust and extensive statistical theories and methods for the estimation of model parameters, as demonstrated above, the definition of the parameter estimation problem itself, which is concerned with stating what data are available, what the candidate model structure is, and what the starting point for the parameter values is, is taken for granted. Hence a proper analysis and definition of the parameter estimation problem will always require good engineering judgment. For robust parameter estimation in practice, due to the empirical/experiential nature of parameter definition, the statistical methods (including the MLE estimates) should be treated within the context/definition of the problem of interest.

With regard to bootstrap sampling, the most important issue is whether or not the residuals are representative of the typical measurement error. For a more detailed discussion of this issue, refer to Efron (1979).

Best practice in uncertainty analysis

When performing an uncertainty analysis, the most important issue is the framing and the corresponding definition of the input uncertainty sources. Hence, the outcome of an uncertainty analysis should not be treated as absolute, but as dependent on the framing of the analysis. A detailed discussion of these issues can be found elsewhere (e.g. Sin et al., 2009; Sin et al., 2010).

Another important issue is the covariance matrix of the parameters (or the correlation matrix), which should be obtained from a parameter estimation technique. Assuming the correlation matrix is negligible may lead to over- or underestimation of the model output uncertainty. Hence, in the sampling step, the appropriate correlation matrix should be defined for the inputs (e.g. parameters) considered in the analysis.

Regarding the sampling number, one needs to iterate several times to see whether the results differ from one iteration to another. Since the models used for parameter estimation are relatively simple to solve numerically, it is recommended to use a sufficiently high number of iterations, e.g. 250 or 500.
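The residual autocorrelation check recommended above can be sketched as follows; the AR(1) residual series is synthetic, and thinning by keeping every n-th point is one simple way to reduce serial correlation before redoing the parameter estimation.

```python
import numpy as np

def autocorr(x, lag=1):
    """Sample autocorrelation of a residual series at a given lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

# Synthetic residuals with strong serial correlation (AR(1), phi = 0.8)
rng = np.random.default_rng(5)
res = np.zeros(500)
for k in range(1, res.size):
    res[k] = 0.8 * res[k - 1] + rng.normal()

print(round(autocorr(res, 1), 2))       # high lag-1 autocorrelation: not iid
res_sub = res[::5]                      # subsample every 5th residual
print(round(autocorr(res_sub, 1), 2))   # noticeably lower after thinning
```

If the autocorrelation of the (sub)sampled residuals is close to zero, the iid assumption behind the asymptotic confidence intervals becomes more defensible.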
References
Bozkurt, H., Quaglia, A., Gernaey, K.V., Sin, G. (2015). A mathematical programming framework for early stage design of wastewater treatment plants. Environmental Modelling & Software, 64: 164-176.
Brun, R., Reichert, P., Künsch, H.R. (2001). Practical identifiability analysis of large environmental simulation models. Water Resources Research, 37(4): 1015-1030.
Brun, R., Kühni, M., Siegrist, H., Gujer, W., Reichert, P. (2002). Practical identifiability of ASM2d parameters - systematic selection and tuning of parameter subsets. Water Research, 36(16): 4113-4127.
Dochain, D., Vanrolleghem, P.A. (2001). Dynamical Modelling and Estimation in Wastewater Treatment Processes. London, UK: IWA Publishing.
Efron, B. (1979). Bootstrap methods: another look at the jackknife. The Annals of Statistics, 7(1): 1-26.
Gernaey, K.V., Jeppsson, U., Vanrolleghem, P.A., Copp, J.B. (Eds.) (2014). Benchmarking of Control Strategies for Wastewater Treatment Plants. IWA Publishing.
Guisasola, A., Jubany, I., Baeza, J.A., Carrera, J., Lafuente, J. (2005). Respirometric estimation of the oxygen affinity constants for biological ammonium and nitrite oxidation. Journal of Chemical Technology and Biotechnology, 80(4): 388-396.
Heijnen, J.J. (1999). Bioenergetics of microbial growth. In: Encyclopaedia of Bioprocess Technology.
Henze, M., Gujer, W., Mino, T., van Loosdrecht, M.C.M. (2000). ASM2, ASM2d and ASM3. IWA Scientific and Technical Report, 9. London, UK.
Ljung, L. (1999). System Identification - Theory for the User. 2nd edition. Prentice-Hall.
Mauricio-Iglesias, M., Vangsgaard, A.K., Gernaey, K.V., Smets, B.F., Sin, G. (2015). A novel control strategy for single-stage autotrophic nitrogen removal in SBR. Chemical Engineering Journal, 260: 64-73.
Meijer, S.C.F., van der Spoel, H., Susanti, S., Heijnen, J.J., van Loosdrecht, M.C.M. (2002). Error diagnostics and data reconciliation for activated sludge modelling using mass balances. Water Science and Technology, 45(6): 145-156.
Metropolis, N., Ulam, S. (1949). The Monte Carlo method. Journal of the American Statistical Association, 44(247): 335-341.
Omlin, M., Reichert, P. (1999). A comparison of techniques for the estimation of model prediction uncertainty. Ecological Modelling, 115: 45-59.
Roels, J.A. (1980). Application of macroscopic principles to microbial metabolism. Biotechnology and Bioengineering, 22(12): 2457-2514.
Saltelli, A., Tarantola, S., Campolongo, F. (2000). Sensitivity analysis as an ingredient of modeling. Statistical Science, 15(4): 377-395.
Seber, G., Wild, C. (1989). Non-linear Regression. Wiley, New York.
Sin, G., de Pauw, D.J.W., Weijers, S., Vanrolleghem, P.A. (2008). An efficient approach to automate the manual trial and error calibration of activated sludge models. Biotechnology and Bioengineering, 100(3): 516-528.
Sin, G., Gernaey, K.V., Neumann, M.B., van Loosdrecht, M.C.M., Gujer, W. (2009). Uncertainty analysis in WWTP model applications: a critical discussion using an example from design. Water Research, 43(11): 2894-2906.
Sin, G., Meyer, A.S., Gernaey, K.V. (2010). Assessing reliability of cellulose hydrolysis models to support biofuel process design - identifiability and uncertainty analysis. Computers & Chemical Engineering, 34(9): 1385-1392.
Sin, G., Gernaey, K.V., Neumann, M.B., van Loosdrecht, M.C.M., Gujer, W. (2011). Global sensitivity analysis in wastewater treatment plant model applications: prioritizing sources of uncertainty. Water Research, 45(2): 639-651.
Sin, G., Vanrolleghem, P.A. (2007). Extensions to modeling aerobic carbon degradation using combined respirometric-titrimetric measurements in view of activated sludge model calibration. Water Research, 41(15): 3345-3358.
van der Heijden, R.T.J.M., Romein, B., Heijnen, J.J., Hellinga, C., Luyben, K. (1994). Linear constraint relations in biochemical reaction systems: II. Diagnosis and estimation of gross errors. Biotechnology and Bioengineering, 43(1): 11-20.
Vangsgaard, A.K., Mauricio-Iglesias, M., Gernaey, K.V., Sin, G. (2014). Development of novel control strategies for single-stage autotrophic nitrogen removal: A process oriented approach. Computers & Chemical Engineering, 66: 71-81.
Villadsen, J., Nielsen, J., Lidén, G. (2011). Elemental and redox balances. In: Bioreaction Engineering Principles (pp. 63-118). Springer, US.