

Error Analysis and Simulations of Complex Phenomena

Michael A. Christie, James Glimm, John W. Grove, David M. Higdon, David H. Sharp, and Merri M. Wood-Schultz

Large-scale computer-based simulations are being used increasingly to predict the behavior of complex systems. Prime examples include the weather, global climate change, the performance of nuclear weapons, the flow through an oil reservoir, and the performance of advanced aircraft. Simulations invariably involve theory, experimental data, and numerical modeling, all with their attendant errors. It is thus natural to ask, "Are the simulations believable?" "How does one assess the accuracy and reliability of the results?" This article lays out methodologies for analyzing and combining the various types of errors that can occur and then gives three concrete examples of how error models are constructed and used.

At the top of these two pages is a simulation of low-viscosity gas (purple) displacing higher-viscosity oil (red) in an oil recovery process. Error models can be used to improve predictions of oil production from this process. Above, at left, is a component of such an error model, and at right is a prediction of future oil production for a particular oil reservoir obtained from a simple empirical model in combination with the full error model.


Reliable Predictions of Complex Phenomena

There is an increasing demand for reliable predictions of complex phenomena encompassing, where possible, accurate predictions of full-system behavior. This requirement is driven by the needs of science itself, as in modeling of supernovae or protein interactions, and by the need for scientifically informed assessments in support of high-consequence decisions affecting the environment, national security, and health and safety. For example, decisions must be made about the amount by which greenhouse gases released into the atmosphere should be reduced, whether and for what conditions a nuclear weapon can be certified (Sharp and Wood-Schultz 2003), or whether development of an oil field is economically sound. Large-scale computer-based simulations provide the only feasible method of producing quantitative, predictive information about such matters, both now and for the foreseeable future. However, the cost of a mistake can be very high. It is therefore vitally important that simulation results come with a high level of confidence when used to guide high-consequence decisions.

Confidence in expectations about the behavior of real-world phenomena is typically based on repeated experience covering a range of conditions. But for the phenomena we consider here, sufficient data for high confidence is often not available for a variety of reasons. Thus, obtaining the needed data may be too hazardous or expensive, it may be forbidden as a matter of policy, as in the case of nuclear testing, or it just may not be feasible. Confidence must then be sought through understanding of the scientific foundations on which the predictions rest, including limitations on the experimental and calculational data and numerical methods used to make the prediction. This understanding must be sufficient to allow quantitative estimates of the level of accuracy and limits of applicability of the simulation, including evidence that any factors that have been ignored in making the predictions actually have a small effect on the answer. If, as sometimes happens, high-confidence predictions cannot be made, this fact must also be known, and a thorough and accurate uncertainty analysis is essential to identify measures that could reduce uncertainties to a tolerable level, or mitigate their impact.

Our goal in this paper is to provide an overview of how the accuracy and reliability of large-scale simulations of complex phenomena are assessed, and to highlight the role of what is known as an error model in this process.

Why Is It Hard to Make Accurate Predictions of Complex Phenomena?

We begin with a couple of examples that illustrate some of the uncertainties that can make accurate predictions difficult. In the oil industry, predictions of fluid flow through oil reservoirs are difficult to make with confidence because, although the fluid properties can be determined with reasonable accuracy, the fluid flow is controlled by the poorly known rock permeability and porosity. The rock properties can be measured by taking samples at wells, but these samples represent only a tiny fraction of the total reservoir volume, leading to significant uncertainties in fluid flow predictions. As an analogy of the difficulties faced in predicting fluid flow in reservoirs, imagine drawing a street map of London and then predicting traffic flows based on what you see from twelve street corners in a thick fog!

In nuclear weapons certification, a different problem arises. The physical processes in an operating nuclear weapon are not all accessible to laboratory experiments (O'Nions et al. 2002). Since underground testing is excluded by the Comprehensive Test Ban Treaty (CTBT), full system predictions can only be compared with limited archived test data.

The need for reliable predictions is not confined to the two areas above. Weather forecasting, global climate modeling, and complex engineering projects, such as aircraft design, all generate requirements for reliable, quantitative predictions—see, for example, Palmer (2000) for a study of predictability in weather and climate simulations. These often depend on features that are hard to model at the required level of detail—especially if many simulations are required in a design-test-redesign cycle.

More generally, because we are

dealing with complex phenomena, knowledge about the state of a system and the governing physical processes is often incomplete, inaccurate, or both. Furthermore, the strongly nonlinear character of many physical processes of interest can result in the dramatic amplification of even small uncertainties in the input so that they produce large uncertainties in the system behavior. The effects of this sensitivity will be exacerbated if experimental data are not available for model selection and validation. Another factor that makes prediction of complex phenomena very difficult is the need to integrate large amounts of experimental, theoretical, and computational information about a complex problem into a coherent whole. Finally, if the important physical processes couple multiple scales of length and time, very fast and very high memory capacity computers and sophisticated numerical methods are required to produce a high-fidelity simulation. The examples discussed in this article exhibit many of these difficulties, as well as the uncertainties in prediction to which they lead.

Figure 1. Oil-in-Place Uncertainty Estimate Variation with Time
This figure shows estimates of p90, p50, and p10 probabilities that the amount of oil in a reservoir is greater than the number shown. The estimated probabilities are plotted as a function of time. The variations shown indicate the difficulties involved in accurate probability estimations. [Photo courtesy of Terrington (York) Ltd.]

Figure 2. Calibration Curve for Weather Forecasts
This plot shows estimates of the probability of precipitation from simulation forecasts vs the observed frequency of precipitation for a large number of observations. Next to each data point is the number of observations for that forecast.

To account for such uncertainties, models of complex systems and their predictions are often formulated probabilistically. But the accuracy of predictions of complex phenomena, whether deterministic or probabilistic, varies widely in practice. For example, estimates of the amount of oil in a reservoir that is at an early stage of development are very uncertain. Large capital investments are made on the basis of probabilistic estimates of oil in place, so that the oil industry is fundamentally a risk-based business. The estimates are usually given at three confidence levels: p90, p50, and p10, meaning that there is a 90 percent, 50 percent, and 10 percent chance, respectively, that the amount of oil in place will be greater than the specified reserve level. Figure 1 shows a schematic plot (based on a real North Sea example) of estimated reserves as a function of time. The plot clearly shows that, as more information about the reservoir was acquired during the course of field development, estimates of the range of reserves changed out-

side the initial prediction. In other words, the initial estimates of reserves, although probabilistic, did not capture the full range of uncertainty and were thus unreliable. This situation was obviously a cause for concern for a company with billions of dollars in investments on the line.

Probabilistic predictions are also used in weather forecasting. If the probabilistic forecast "20 percent chance of rain" were correct, then on average it would have rained on 1 in 5 days that received that forecast. Data on whether or not it rained are easily obtained. This rapid and repeated feedback on weather predictions has resulted in significantly improved reliability of forecasts compared with predictions of uncertainty in oil reserves. The comparison between the observed frequency of precipitation and a probabilistic forecast for a locality in the United States shown in Figure 2 confirms the accuracy of the forecasts.

This accuracy did not come easily, and so we next briefly describe two of the principal methods currently used to improve the accuracy of predictions of complex phenomena: calibration and data assimilation.

Calibration is a procedure whereby a simulation is matched to a particular set of experimental data by performing a number of runs in which uncertain model parameters are varied to obtain agreement with the selected data set. This procedure is sometimes called "tuning," and in the oil industry it is known as history matching. Calibration is useful when codes are to be used for interpolation, but it is of limited help for extrapolation outside the data set that was used for tuning. One reason for this lack of predictability is that calibration only ensures that unknown errors from different sources, say inaccurate physics and numerics, have been adjusted to compensate one another, so that the net error in some observable is small. Because different physical processes and numerical errors are unlikely to scale in the same way, a calibrated simulation is reliable only for the regime for which it has been shown to match experimental data.

In one variant of calibration, multiple simultaneous simulations are performed with different models. The "best" prediction is defined as a weighted average over the results obtained with the different models. As additional observations become available, the more successful models are revealed, and their predictions are weighted more heavily. If the models used reflect the range of modeling uncertainty, then the range of results will indicate the variance of the prediction due to those uncertainties.

Data assimilation, while basically a form of calibration, has important distinctive features. One of the most important is that it enables real-time utilization of data to improve predictions. The need for this capability comes from the fact that, in operational weather forecasting, for example, there is insufficient time to restart a run from the beginning with new data, so that this information must be incorporated on the fly. In data assimilation, one makes repeated corrections to model parameters during a single run, to bring the code output into agreement with the latest data. The corrections are typically determined using a time series analysis of the discrepancies between the simulation and the current observations. Data assimilation is widely used in weather forecasting. See Kao et al. (2004) for a recent application to shock-wave dynamics.
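As a minimal illustration of this kind of on-the-fly correction, the Python sketch below applies a scalar Kalman-style update to one model parameter each time a new observation arrives; the toy model, the noise levels, and the data are invented for illustration and are not taken from the article or from any operational forecasting system.

    import numpy as np

    # Hypothetical toy example: a scalar model parameter b (a bias in the
    # forecast) is corrected on the fly as each new observation arrives,
    # instead of restarting the run from the beginning.
    rng = np.random.default_rng(0)

    b_true = 1.5          # unknown bias in the toy forecast model
    sigma_obs = 0.3       # observation error standard deviation

    b_est, var_est = 0.0, 4.0   # prior estimate of the parameter and its variance
    state = 0.0                 # toy model state

    for step in range(20):
        state += 1.0                                        # toy dynamics: unit drift per step
        forecast = state + b_est                            # model output with current parameter
        obs = state + b_true + rng.normal(0.0, sigma_obs)   # incoming measurement

        # Kalman-style scalar update of the parameter from the latest discrepancy
        gain = var_est / (var_est + sigma_obs**2)
        b_est += gain * (obs - forecast)
        var_est *= (1.0 - gain)

    print(f"estimated bias = {b_est:.3f} (true value {b_true})")

The same idea scales up to operational data assimilation, where the state and parameter vectors are large and the gain is computed from estimated covariances rather than scalars.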
side the data set that was used for tun- input from the user of the code must
ing. One reason for this lack of pre- be linked by a sophisticated computer
dictability is that calibration only Sources of Error and How science infrastructure, with the result
ensures that unknown errors from dif- to Analyze Them that a simulation code for complex
ferent sources, say inaccurate physics phenomena is an exceedingly elabo-
and numerics, have been adjusted to Introducing Error Models. The rate piece of software. Such codes,
compensate one another, so that the role of a thorough error analysis in while elaborate, still provide only an
net error in some observable is small. establishing confidence in predictions approximate representation of reality.
Because different physical processes has been mentioned. But evaluating Simulation errors come from three

Number 29 2005 Los Alamos Science 9


Error Analysis

main sources: inaccurate input data, inaccurate physics models, and limited accuracy of the solutions of the governing equations. Clearly, each of these generic sources of error is potentially important. A perfect physics model with perfect input data will give wrong answers if the equations are solved poorly. Likewise, a perfect solution of the wrong equations will also give incorrect answers. The relative importance of errors from each source is problem dependent, but each source of error must be evaluated. Our discussion of error models will reflect the above comments by categorizing simulation inadequacies as due to input, solution, and physics errors.

Input errors refer to errors in data used to specify the problem, and they include errors in material properties, the description of geometrical configurations, boundary and initial conditions, and others. Solution error is the difference between the exact mathematical solution of the governing equations for the model and the approximate solution of the equations obtained with the numerical algorithms used in the simulation. Physics error includes the effects of phenomena that are inadequately represented in the simulation, for example, the unknown details of subscale physics, such as the microscopic details of material that is treated macroscopically in the simulation. Evaluations of the effects of these details are typically based on statistical descriptions. The physics component of an error model is thus based on knowledge of aspects of the nominal model that need or might need correction.

Figure 3. Uncertainties in Reported Measurements of the Speed of Light (1870–1960)
This figure shows measured values of the speed of light along with estimates of the uncertainties in the measured values up until 1960. The error bars correspond to the estimated 90% confidence intervals. The currently accepted value lies outside the error bars of more measurements than would be expected, indicating the difficulty of truly assessing the uncertainty in an experimental measurement. Refer to the article by Henrion and Fischoff on pp. 666–677 in Heuristics and Biases (2002) for more details on this and other examples of uncertainties in physical constants. (Photo courtesy of Department of Physics, Carnegie Mellon University.)

Experimental Errors and Solution Errors. Much of our understanding of how to analyze errors comes from studies of experimental error. We will also see below that experimental and solution errors play a similar role in an uncertainty analysis. We therefore start by discussing experimental errors.

Experimental errors play an important role in building error models for simulations. First, they can bias conclusions that are drawn when simulation results are compared with measured data. Second, experimental errors affect the accuracy of simulations indirectly through their effects on databases and input data used in a simulation. Experimental errors are classified as random or systematic. Typically, both types of error are present in any particular application. A familiar example of a random error is the statistical sampling error quoted along with the results of opinion polls. Another type of random error is the result of variations in random physical processes, such as the number of radioactive decays in a sample per unit time. The signals from measuring instruments usually contain a compo-

nent that either is or appears to be random whether the process that is the subject of the measurement is random or not. This component is the ubiquitous "noise" that arises from a wide variety of unwanted or uncharacterized processes occurring in the measurement apparatus. The way in which noise affects a measurement must be taken into consideration to attain valid conclusions based on that data. Noise is typically treated probabilistically, either separately or included with a statistical treatment of other random error. However, systematic error is often both more important and more difficult to deal with than random error. It is also frequently overlooked, or even ignored.

To see how a systematic error can occur, imagine that an opinion poll on the importance of education was conducted by questioning people on street corners "at random"—not knowing that many of them were coming and going from a major library that happened to be located nearby. It is virtually certain that those questioned would on average place a higher importance on education than the population in general. Even if a very large number of those individuals were questioned, an activity that would result in a small statistical sampling error, conclusions about the importance of higher education drawn from this data could be incorrect for the population at large. This is why carefully conducted polls seek to avoid systematic errors, or biases, by ensuring that the population sampled is representative.

As a second example, suppose that 10 measurements of the distance from the Earth to the Sun gave a mean value of 95,000,000 miles due, say, to flaws in an electric cable used in making these measurements. How would someone know that 95,000,000 miles is the wrong answer? This error could not be revealed by a statistical analysis of only those 10 measurements. Additional, independent measurements made with independent measuring equipment would suggest that something was wrong if they were inconsistent with these results. However, the cause of the systematic error could only be identified through a physical understanding of how the instruments work, including an analysis of the experimental procedures and the experimental environment. In this example, the additional measurements should show that the electrical characteristics of the cable were not as expected. To reiterate, the point of both examples is that an understanding of the systematic error in a measured quantity requires an analysis that is independent of the instrument used for the original measurement.

An example of how difficult it can be to determine uncertainties correctly is shown in Figure 3, a plot of estimates of the speed of light vs the date of the measurement. The dotted line shows the accepted value, and the published experimental uncertainties are shown as error bars. The length of the error bars—1.48 times the standard deviation—is the "90 percent confidence interval" for a normally distributed uncertainty for the experimental error; that is, the experimental error bars will include the correct value 90 percent of the time if the uncertainty were assessed correctly. It is evident from the figure, however, that many of the analyses were inadequate: The true value lies outside the error bars far more often than 10 percent of the time. This situation is not uncommon, and it provides an example of the degree of caution appropriate when using experimental results.

The analysis of experimental error is often quite arduous, and the rigor with which it is done varies in practice, depending on the importance of the result, the accuracy required, whether the measurement technique is standard or novel, and whether the result is controversial. Often, the best way to judge the adequacy of an analysis of uncertainty in a complex experiment is to repeat the experiment with an independent method and an independent team.

Solution errors enter an analysis of simulation error in several ways. In addition to being a direct source of error in predictions made with a given model, solution errors can bias the conclusions one draws from comparing a model to data in exactly the same way that experimental errors do. Solution errors also can affect a simulation almost covertly: It is common for the data or the code output to need further processing before the two can be directly compared. When this processing requires modeling or simulation with a different code, then the solution error from that calculation can affect the comparison. As with experimental errors, solution errors must be determined independently of the simulations that are being used for prediction.

Using Data to Constrain Models

The scientific method uses a cycle of comparison of model results with data, alternating with model modification. A thorough and accurate error analysis is necessary to validate improvements. The availability of data is a significant issue for complex systems, and data limitations permeate efforts to improve simulation-based predictions. It is therefore important to use all relevant data in spite of differences in experiment design and measurement technique. This means that it is important to have a procedure to combine data from diverse sources and to understand the significance of the various errors that are responsible for limitations on predictability.

The way in which the various categories of error can affect comparison with experimental data and the steps

to be taken if the errors are too large are discussed in the next section. The comparison of model predictions with experimental data is often called the forward step in this cycle and is a key component in uncertainty assessments of a predicted result. The backward step of the cycle for model improvement, which is discussed next, is the statistical inference of an improved model from the experimental data. The Bayesian framework provides a systematic procedure for inference of an improved model from observations; lastly, we describe the use of hierarchical Bayesian models to integrate data from many sources.

Some discussion of the use of the terms "uncertainty" and "error" is in order. In general, any physical quantity, whether random or not, has a specific value—such as the number of radioactive decays in a sample of tritiated paint in a given 5-minute period. The difference between that actual number and an estimate determined from knowledge of the number of tritium nuclei present and the tritium lifetime is the error in that estimate. If the experiment were repeated many times, a distribution of errors would arise, and the probability density function for those errors is the uncertainty in the estimate.

Figure 4. Comparing Experimental Measurements with Simulations
The green line shows the true, unknown value of an observable over the range of uncertainty in the experimental conditions, and the purple cross indicates the uncertainty in the observation. The discrepancy measures the difference between observation and simulation.

Decomposition of Errors. Our ability to predict any physical phenomenon is determined by the accuracy of our input data and our modeling approach. When the modeling input data are obtained by analysis of experiments, the experimental error and modeling error (solution error plus physics approximations) terms control the accuracy of our estimation of those data, and hence our ability to predict. Because a full uncertainty-quantification study is in itself a complex process, it is important to ensure that those errors whose size can be controlled—either by experimental technique or by modeling/simulation choices—are small enough to ensure that predictions of the phenomena of interest can be made with sufficient precision for the task at hand. This means that simpler techniques are often appropriate at the start of a study to ensure that we are operating with the required level of precision.

The discrepancy between simulation results and experimental data is illustrated in Figure 4, which shows the way in which this discrepancy can be related to measurement errors and solution errors. Note that the experimental conditions are also subject to uncertainties. This means that the observed value may be associated with a slightly different condition than the one for which the experiment was designed, as shown in Figure 4.

The three steps below could serve as an initial, deterministic assessment of the discrepancy between simulation and experiment.

Step 1. Compare Simulated and Experimental Results. The size of the measurement error will obviously affect the conclusions drawn from the comparison. Those conclusions can also be affected by the degree of knowledge of the actual as opposed to the designed experimental conditions. For example, the as-built composition of the physical parts of the system

under investigation may differ slightly from the original design. The effects of both of these errors are typically reported together, but they are explicitly separated here because error in the experimental conditions affects the simulated result, as well as the measured result, as can be seen in Figure 4.

Step 2. Evaluate Solution Errors. If the error is a simple matter of numerical accuracy—for example, spatial or temporal resolution—then the error is a fixed, determinable number in principle. In other cases—for example, subgrid stochastic processes—the error may be knowable in only a statistical sense.

Step 3. Determine Impact on Predictability. If the discrepancy is large compared with the solution error and experimental uncertainty, then the model must be improved. If not, the model may be correct, but in either case, the data can be used to define a range of modeling parameters that is consistent with the observations. If that range leads to an uncertainty in prediction that is too large for the decision being taken, the experimental errors or solution errors must be reduced.
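A minimal numerical sketch of this three-step bookkeeping, with invented numbers rather than values from the article: the discrepancy between a simulated and a measured observable is compared with the combined measurement and solution uncertainty before deciding whether the model itself needs improvement.

    import math

    # Hypothetical values for one observable; none of these numbers come from the article.
    observed = 10.4        # measured value
    simulated = 9.1        # code prediction at the nominal experimental conditions
    sigma_meas = 0.3       # estimated measurement error (standard deviation)
    sigma_soln = 0.4       # estimated solution (numerical) error

    # Step 1: size of the discrepancy between simulation and experiment
    discrepancy = observed - simulated

    # Step 2: combined "known" error, assuming the two sources are independent
    combined_sigma = math.sqrt(sigma_meas**2 + sigma_soln**2)

    # Step 3: impact on predictability; a discrepancy much larger than the
    # combined error points to a modeling (physics or input) problem
    if abs(discrepancy) > 3.0 * combined_sigma:
        print("discrepancy exceeds known errors: the model needs improvement")
    else:
        print("discrepancy is consistent with measurement and solution errors")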
A significant discrepancy in step 1 indicates the presence of errors in the simulation and/or experiment, and steps 2 and 3 are necessary, but not sufficient, to pinpoint the source(s) of error. However, these simple steps do not capture the true complexity of analyzing input or modeling errors. In practice, the system must be subdivided into pieces for which the errors can be isolated (see below) and independently determined. The different errors must then be carefully recombined to determine the uncertainties in integral quantities, such as the yield of a nuclear weapon or the production of an oil well, that are measured in full system tests. A potential drawback of this paradigm is that experiments on subsystems may not be able to probe the entire parameter space encountered in full system operation. Nevertheless, because the need to predict integral quantities motivates the development and use of simulation, a crucial test of the "correctness" of a simulation is that it consistently and accurately matches all available data.

Statistical Prediction

A major challenge of statistical prediction is assessing the uncertainty in a predicted result. Given a simulation model, this problem reduces to the propagation of errors from the simulation input to the simulated result. One major problem in examining the impact of uncertainties in input data on simulation results is the "curse of dimensionality." If the problem is described by a large number of input parameters and the response surface is anything other than a smooth quasilinear function of the input variables, computing the shape of the response surface can be intractable even with large parallel machines. For example, if we have identified 8 critical parameters in a specific problem and can afford to run 1 million simulations, we can resolve the response surface to an accuracy of fewer than 7 equally spaced points per axis.

Various methods exist to assess the most important input parameters. Sensitivities to partial derivatives can be computed either numerically or through adjoint methods. Adjoint methods allow computation of sensitivities in a reasonable time and are widely used.

Experimental design techniques can be used to improve efficiency. Here, the response surface is assumed to be a simple low-order polynomial in the input variables, and then statistical techniques are used to extract the maximum amount of information for a given number of runs. Principal component analysis can also be used to find combinations of parameters that capture most of the variability.

The principle that underlies many of these techniques is that, for a complex engineering system to be reliable, it should not depend sensitively on the values of, for example, 10^4 or more parameters. This is as true for a weapon system that is required to operate reliably as it is for an oil field that is developed with billions of dollars of investment funds.

Statistical Inference—The Bayesian Framework. The Bayesian framework for statistical inference provides a systematic procedure for updating current knowledge of a system on the basis of new information. In engineering and natural science applications, we represent the system by a simulation model m, which is intended to be a complete specification of all information needed to solve a given problem. Thus m includes the governing evolution equations (typically, partial differential equations) for the physical model, initial and boundary conditions, and various model parameters, but it would not generally include the parameters used to specify the numerical solution procedure itself. Any or all of the information in m may be uncertain to some degree. To represent the uncertainty that may be present in the initial specification of the system, we introduce an ensemble of models M, with m ∈ M, and define a probability distribution on M. This is called the prior distribution and is denoted by p(m).

If additional information about the system is supplied by an observation O, one can determine an updated estimate of the probability for m, called the posterior distribution and denoted by p(m|O), by using Bayes' formula

p(m|O) = p(O|m) p(m) / p(O).   (1)
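The sketch below illustrates Equation (1) for a model family indexed by a single scalar parameter, assuming a Gaussian probability model for the combined observation and solution error; the stand-in "simulator," the prior range, and the error sizes are invented for illustration.

    import numpy as np

    # Hypothetical single-parameter model family m(theta); the "simulation" here
    # is just a cheap stand-in function, and all numbers are illustrative.
    def simulated_observable(theta):
        return 2.0 * theta + 1.0

    theta_grid = np.linspace(0.0, 10.0, 501)     # discretized ensemble of models M
    prior = np.ones_like(theta_grid)             # noninformative (flat) prior p(m)
    prior /= np.trapz(prior, theta_grid)

    observation = 9.7        # measured value O
    sigma_obs = 0.4          # observation error (standard deviation)
    sigma_soln = 0.3         # solution error supplied by an independent analysis
    sigma2 = sigma_obs**2 + sigma_soln**2

    # Likelihood p(O|m): probability of the discrepancy under the error model
    discrepancy = observation - simulated_observable(theta_grid)
    likelihood = np.exp(-0.5 * discrepancy**2 / sigma2)

    # Bayes' formula (1): posterior proportional to likelihood times prior,
    # normalized over the ensemble M
    posterior = likelihood * prior
    posterior /= np.trapz(posterior, theta_grid)

    print("posterior mean of theta:", np.trapz(theta_grid * posterior, theta_grid))

With a flat prior, the posterior is simply the normalized likelihood; an informative prior from an independent data source or a more fundamental theory would enter as a nonconstant p(m).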

It is important to realize that the Bayesian procedure does not determine the choice of p(m). Thus, in using Bayesian analysis, one must supply the prior from an independent data source or a more fundamental theory, or otherwise, one must use a noninformative "flat" prior.

The factor p(O|m) in Equation (1) is called the likelihood. The likelihood is the (unnormalized) conditional probability for the observation O, given the model m. In the cases of interest here, model predictions are determined by solutions s(m) of the governing equations. The simulated observables are functionals O(s(m)) of s(m). If both the experimentally measured observables O and the solution s(m), hence O(s(m)), are exact, the likelihood p(O|m) is a delta function concentrated on the hypersurface in M defined by the equation

O = O(s(m)).   (2)

Real-world observations and simulations contain errors, of course, so that a discrepancy will invariably be observed between O and O(s(m)). Because the likelihood is evaluated subject to the hypothesis that the model m ∈ M is correct, any such discrepancy can be attributed to errors either in the solution or in the measurements. The likelihood is defined by assigning probabilities to solution and/or measurement errors of different sizes. The required probability models for both types of errors must be supplied by an independent analysis.

Figure 5. Bayesian Framework for Predicting System Performance with Relevant Uncertainties
Multiple simulations are performed using the full physical range of parameters. The discrepancies between the observation and the simulated values are used in a statistical inference procedure to update estimates of modeling and input uncertainties. The update involves computing the likelihood of the model parameters by using Bayes' theorem. The likelihood is computed from a probability model for the discrepancy, taking into account the measurement errors (shown schematically by the green dotted lines) and the solution errors (blue dotted lines). The updated parameter values are then used to predict system performance, and a decision is taken on whether the accuracy of the predictions is adequate.

This discussion shows that the role of the likelihood in simulation-based prediction is to assign a weight to a model m based on a probabilistic measure of the quality of the fit of the model predictions to data. Probability models for solution and measurement errors play a similar role in determining the likelihood.

This point is so fundamental and sufficiently removed from common approaches to error analysis that we repeat it for emphasis: Numerical and observation errors are the leading terms in the determination of the Bayesian likelihood. They supply critical information needed for uncertainty quantification.

Alternative approaches to inference include the use of interval analysis, possibility theory, fuzzy sets, theories of evidence, and others. We do not survey these alternatives here, but simply mention that they are based on different assumptions about what is known and what can be concluded. For example, interval analysis assumes that unknown

parameters vary within an interval (known exactly), but that the distribution of possible values of the parameter within the interval is not known even in a probabilistic sense. This method yields error bars but not confidence intervals.

An illustration of the Bayesian framework we follow to compute the impact of solution error and experimental uncertainty is shown in Figure 5. Multiple simulations are performed with the full physical range of parameters. The discrepancies (between simulation and observation) are used in a statistical inference procedure to update estimates of modeling and input uncertainties. These updated values are then used to predict system performance, and a decision is taken on whether the accuracy of the predictions is adequate.

Combining Information from Diverse Sources

Bayesian inference can be extended to include multiple sources of information about the details of a physical process that is being simulated (Gaver 1992). This information may come from "off-line" experiments on separate components of the simulation model m, expert judgment, measurements of the actual physical process being simulated, and measurements of a physical process that is related, but not identical, to the process being simulated. Such information can be incorporated into the inference process by using Bayesian hierarchical models, which can account for the nature and strength of these various sources of information. This capability is very important since data directly bearing on the process being modeled is often in short supply and expensive to acquire. Therefore, it is essential to make full use of all possible sources of information—even those that provide only indirect information.

In principle, an analysis can utilize any experimental data that can be compared with some part of the output of a simulation. To understand this point, let us make the simple and often useful assumption that the family of possible models M can be indexed by a set of parameters θ. In this case, the somewhat abstract specification of the prior as a probability distribution p(m) on models can be thought of simply as a probability distribution p(θ) on the parameters θ. Depending on the application, θ may include parameters that describe the physical properties of a system, such as its equation of state, or that specify the initial and boundary conditions for the system, to mention just a few examples. In any of these cases, uncertainty in θ affects prediction uncertainty. Typically, different data sources will give information about different parameters.

Multiple sources of experimental data can be included in a Bayesian analysis by generalizing the likelihood term. If, for example, the experimental observations O decompose into three components (O1, O2, O3), the likelihood can be written as

p(O|θ) = p(O1|m1(θ)) p(O2|m2(θ)) p(O3|m3(θ))

if we assume that each component of the data gives information about an independent parameter θ. The subscripts on the models are there to remind us that, although the same simulation model is used for each of the likelihood components, different subroutines within the simulation code are likely to be used to simulate the different components of the output. This means that each of the likelihood terms will have its own solution error, as well as its own observation error. The relative sizes of these errors greatly affect how these various data sources constrain θ. For example, if it is known that m2(θ) does not reliably simulate O2, then the likelihood should reflect this fact. Note that a danger here is that a misspecification of a likelihood term may give some data sources undue influence in constraining possible values of one of the parameters θ.

In some cases, one (or more) component (components) of the observed data is (are) not from the actual physical system of interest, but from a related system. In such cases, Bayesian hierarchical models can be used to borrow strength from that data by specifying a prior model that incorporates information from the different systems. See Johnson et al. (2003) for an example.

Finally, expert judgment usually plays a significant role in the formulation and use of models of complex phenomena—whether or not the models are probabilistic. Sometimes, expert judgment is exercised in an indirect way, through selection of a likelihood model or through the choice of the data sources to be included in an analysis. Expert judgment is also used to help with the choice of the probability distribution for p(θ), or to constrain the range of possible outcomes in an experiment, and such information is often invoked in applications for which experimental or observational data are scarce or nonexistent. However, the use of expert judgment is fraught with its own set of difficulties. For example, the choice of a prior can leave a strong "imprint" on results inferred from subsequent experiments. See Heuristics and Biases (2002) for enlightening discussions of this topic.
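A minimal sketch of this generalized likelihood, assuming three data components with independent Gaussian error models; the stand-in simulator, the observations, and the error sizes are invented. The joint likelihood is the product of the component likelihoods, evaluated here as a sum of log-likelihood terms, and a poorly trusted component could be downweighted by inflating its assumed error.

    import numpy as np

    # Hypothetical stand-ins for three components of the simulation output,
    # each with its own observation and its own combined error estimate.
    def simulate(theta):
        # returns the three simulated observables (O1, O2, O3) for parameter theta
        return np.array([theta, 0.5 * theta**2, np.sin(theta)])

    observations = np.array([2.1, 2.3, 0.85])    # (O1, O2, O3), invented
    sigmas = np.array([0.2, 0.5, 0.1])           # per-component error models

    theta_grid = np.linspace(0.0, 4.0, 401)
    log_like = np.zeros_like(theta_grid)

    for i, theta in enumerate(theta_grid):
        resid = observations - simulate(theta)
        # product of component likelihoods = sum of component log-likelihoods
        log_like[i] = -0.5 * np.sum((resid / sigmas) ** 2)

    posterior = np.exp(log_like - log_like.max())     # flat prior assumed
    posterior /= np.trapz(posterior, theta_grid)
    print("best-fitting theta:", theta_grid[np.argmax(posterior)])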

Building Error Models—Examples

Dropping Objects from a Tower. Some of the basic ideas used in building error models are illustrated in Figure 6. In this example, experimental observations are combined with a simple physics model to predict how long it takes an object to fall to the ground when it is dropped from a tower. The experimental data are drop times recorded when the object is dropped from each of six floors of the tower. The actual drop time is measured with an observation error, which we assume for illustrative purposes to be Gaussian (normal), with mean 0 and a standard deviation of 0.2 second. The physics model is based solely on the acceleration due to gravity. We observe that the predicted drop times are too short and that this discrepancy apparently grows with the height from which the object is dropped.

Figure 6. Dropping an Object from a Tower
(a) The time it takes an object to drop from each of 6 floors of a tower is recorded. There is an uncertainty in the measured drop times of about ±0.2 s. Predictions for times are desired for drops from floors 7 through 10, but they do not yet exist.
(b) A mathematical model is developed to predict the drop times as a function of drop height. The simulated drop times (red line) are systematically too low when compared with the experimental data (triangles). The error bars around the observed drop times show the observation uncertainty.
(c) This systematic deviation between the mathematical model and the experimental data is accounted for in the likelihood model. A fitted correction term adjusts the model-based predictions to better match the data. The resulting 90% prediction intervals for floors 7 through 10 are shown in this figure. Note that the prediction intervals become wider as the drop level moves away from the floors with experimental data. The cyan triangles corresponding to floors 7 through 10 show experimental observations taken later only for validation of the predictions.
(d) An improved simulation model was constructed that accounts for air resistance. A parameter controlling the strength of the resistance must be estimated from the data, resulting in some prediction uncertainty (90% prediction intervals are shown for floors 7 through 10). The improved model captures more of the physics, giving reduced prediction uncertainty.

Even though this model shows a substantial error, which is apparent from the discrepancy between the experimental data and the model predictions (Figure 6(b)), it can still be made useful for predicting drop times from heights that are greater than the height of the tower. As a first step, we account for the discrepancy by including an additional unknown correction in the initial specification of the model, namely, in the prior. This term represents the discrepancy as an unknown, smooth function of drop height that is estimated (with uncertainty) in the analysis. The results are applied to give predictions of drop times for heights that would correspond to the seventh through tenth floors of the tower. These predictions have a fair amount of uncertainty because the discrepancy term has to be extrapolated to drop heights that are beyond the range of the experimental data. Note also that the prediction uncertainty increases with drop height (refer to Figure 6(c)).
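The sketch below mimics this purely phenomenological step under stated assumptions: synthetic drop-time "data" for floors 1 through 6 are generated from a toy model with air drag plus 0.2-s Gaussian noise, the gravity-only model sqrt(2h/g) is evaluated, and the discrepancy is represented by a smooth (here quadratic) function of height and extrapolated to floors 7 through 10. The floor height, the drag strength, and the quadratic form are assumptions, not values from the article.

    import numpy as np

    g = 9.81                    # m/s^2
    v_t = 20.0                  # assumed terminal velocity of the object (sets the drag strength)
    floor_height = 5.0          # assumed height of one floor, in meters
    rng = np.random.default_rng(1)

    heights = floor_height * np.arange(1, 7)        # floors 1-6, where data exist
    new_heights = floor_height * np.arange(7, 11)   # floors 7-10, to be predicted

    def true_drop_time(h):
        # fall time with quadratic air drag: the "real world" of this toy example
        return (v_t / g) * np.arccosh(np.exp(g * h / v_t**2))

    def gravity_only_time(h):
        # the simple physics model used for prediction: free fall, no air resistance
        return np.sqrt(2.0 * h / g)

    data = true_drop_time(heights) + rng.normal(0.0, 0.2, size=heights.size)

    # Represent the model-data discrepancy as a smooth function of drop height
    residuals = data - gravity_only_time(heights)
    correction = np.polynomial.Polynomial.fit(heights, residuals, deg=2)

    # Extrapolate the corrected prediction to the higher floors
    for h in new_heights:
        t = gravity_only_time(h) + correction(h)
        print(f"height {h:4.1f} m: corrected drop-time prediction {t:5.2f} s")

The widening 90% prediction intervals in Figure 6(c) would come from propagating the uncertainty of the fitted correction through the extrapolation, which this sketch omits.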


This strictly phenomenological modeling of the error leads to results that can be extrapolated over a very limited range only, because predictions of drop times from just a few floors above the sixth have unacceptably large uncertainties. But an improved physics model can greatly extend the range over which useful predictions can be made. Thus, we next carry out an analysis using a model that incorporates a physically motivated term for air resistance. This model requires estimation of a single additional parameter appearing as a coefficient in the air resistance term. But when this parameter is constrained by experimental data, much better agreement with the measured drop times is obtained (see Figure 6(d)). In fact, in this case, the discrepancy is estimated to be nearly zero. The remaining uncertainty in this improved prediction results from uncertainties in the measured data and in the value of the air resistance parameter.
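Continuing the same toy setup, the sketch below illustrates the physics-based alternative: the drop-time model now includes a single air-resistance parameter (here a terminal velocity), which is constrained by the data through a Gaussian misfit; all numbers are again invented.

    import numpy as np

    g = 9.81
    v_t_true = 20.0             # assumed "true" terminal velocity used to make the synthetic data
    floor_height = 5.0
    sigma = 0.2                 # observation error (standard deviation), as in the text
    rng = np.random.default_rng(1)
    heights = floor_height * np.arange(1, 7)

    def drop_time(h, v_t):
        # free fall with quadratic air drag; v_t is the single air-resistance parameter
        return (v_t / g) * np.arccosh(np.exp(g * h / v_t**2))

    data = drop_time(heights, v_t_true) + rng.normal(0.0, sigma, size=heights.size)

    # Constrain the air-resistance parameter with the data via a Gaussian misfit
    v_grid = np.linspace(5.0, 80.0, 751)
    misfit = np.array([0.5 * np.sum(((data - drop_time(heights, v)) / sigma) ** 2)
                       for v in v_grid])
    posterior = np.exp(-(misfit - misfit.min()))     # flat prior over the grid
    posterior /= np.trapz(posterior, v_grid)

    v_best = v_grid[np.argmin(misfit)]
    print(f"best-fit terminal velocity: {v_best:.1f} m/s (true value {v_t_true})")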
Using an Error Model to Improve Predictions of Oil Production. In most oil reservoirs, the oil is recovered by injecting a fluid to displace the oil toward the production wells. The efficiency of the oil recovery depends, in part, on the physical properties of the displacing fluid. The example in this section concerns estimation of the viscosity (typically poorly known) of an injected gas displacing oil in a porous medium. We will show how an error model for such estimates allows improved estimates of the uncertainty in future oil production using this method of recovery.

Because the injected gas has lower viscosity than the oil, the displacement process is unstable and viscous fingers develop (see Figure 7). The phenomenon is similar to the Rayleigh-Taylor instability of a dense fluid on top of a less dense fluid. The fingers have a reasonably predictable average behavior, but there is some randomness in their formation and evolution associated with the lack of knowledge of the initial conditions and with unknown small-scale fluctuations in rock properties.

Figure 7. Viscous Fingering in a Realization of Porous Media Flow
Low-viscosity gas (purple) is injected into a reservoir to displace higher-viscosity oil (red). The displacement is unstable and the gas fingers into the oil, reducing recovery efficiency.

The oil industry has a simple empirical model that accounts for the effects of fingering. This model, called the Todd and Longstaff model, fits an expansion wave (rarefaction fan) to the average behavior. Although the model is good, it is not perfect, and in particular, when applied to cases with a correlated permeability field, it tends to underestimate the speed with which the leading edge of the gas moves through the medium. If we compare results from the Todd and Longstaff model with observed data in order to estimate physical parameters such as viscosity, we will introduce errors into the parameter estimates because of the errors in the solution method. To compensate for these errors, we create a statistical model for the solution errors.¹

¹ All the results cited in this section are from Alannah O'Sullivan's Ph.D. thesis on error modeling (O'Sullivan 2004). We are grateful to her for permission to use these unpublished results in this article.

For this example, we assume that the primary unknown in the Todd and Longstaff model is the ratio of gas viscosity to oil viscosity, which determines the rate at which instabilities grow. This ratio will be determined by comparing simulation and observation (in practice, oil and gas viscosities would be measured, although there would still be uncertainties associated with amounts of gas dissolved in the oil). To construct a solution error model for the average gas concentration in the reservoir, we run a number of fine-grid simulations at discrete values of the viscosity ratio, which we refer to as calibration points. Then, for each value of the viscosity ratio, we compute the difference between the Todd and Longstaff model and the fine-grid simulations as a function of scaled distance along the flow (x) and dimensionless time (t) (time divided by the time for gas to break through in the absence of fingering). The mean error computed in this way for the viscosity ratio 10 is shown in Figure 8 as a function of the similarity variable x/t. We also compute the standard deviation of the error at each time, as well as the correlation between errors at different times. This information is represented as a "covariance matrix."

We will show that the solution error model (the mean error and the covariance matrix), when used in conjunction with predictions of the Todd and Longstaff model at different viscosity ratios, can yield good estimates of the viscosity ratio for a given production data set. Figure 9 shows the

observed production data (black curve) for which we wish to determine the unknown viscosity ratio. We first run the Todd and Longstaff model at different viscosity ratios from a prior range of 5 to 25 and then correct each prediction by adding to it the mean error at that specific viscosity ratio. The mean error at each viscosity ratio is calculated by interpolating between the mean error at the known calibration points for each value of the similarity variable x/t. The blue curve in Figure 9 gives an example of a Todd and Longstaff prediction, and the red curve gives the corrected curve obtained by adding the mean error to the blue curve. To apply the error model, we have converted from the similarity variable x/t to time using the known length of the system.

Figure 8. Mean Error and Data to Compute Mean Error
The black curve is the mean error in the gas concentration for viscosity ratio 10. The data to compute the mean error (gray curves) come from the differences between a single coarse-grid or approximate solution (in this case, the Todd and Longstaff model) and multiple fine-grid realizations, all computed at viscosity ratio 10. The variability in the fine-grid realizations reflects random fluctuations in the permeability field, which create different finger locations and growth paths. In this case, the gas concentration averaged across the flow from the fine-grid solution is subtracted from the coarse Todd and Longstaff prediction as a function of x (distance along the flow) divided by t (time). In the example discussed in the text, we compute the mean error and covariance matrix at viscosity ratios 5, 10, and 15, and interpolation is used to predict the behavior in between these values.

After calculating the corrected predictions for each viscosity ratio, the next step is to compare the corrected prediction (an example is shown in red) for each viscosity ratio with the observed data (shown in black) and compute the misfit M between the simulation and the data. The misfit is given by

M = (1/2) (o - s - ē)^T C^(-1) (o - s - ē),   (3)

where o is the observed value, s is the simulated value, ē is the mean error, and the covariance matrix is given by C = σd² I + Csem. That is, for the covariance matrix, we assume that the data errors are Gaussian, independent, and identically distributed and that therefore they have a standard deviation of σd, and we estimate the solution error model covariance matrix Csem from the fine-scale simulations performed at the calibration points.
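A minimal sketch of Equation (3) under the stated Gaussian assumptions; the observation vector, the corrected Todd and Longstaff prediction, the mean error, and the solution-error covariance below are small invented stand-ins, with a simple exponentially correlated matrix standing in for the Csem estimated from the fine-grid runs.

    import numpy as np

    # Invented stand-ins: 4 observation times of produced gas concentration
    o = np.array([0.05, 0.22, 0.41, 0.55])        # observed values
    s = np.array([0.02, 0.15, 0.33, 0.50])        # Todd and Longstaff prediction
    e_mean = np.array([0.02, 0.05, 0.06, 0.04])   # mean solution error at these times

    sigma_d = 0.02                                # data error standard deviation
    # Solution-error covariance estimated from the fine-grid runs; here an
    # exponentially correlated form stands in for the estimated Csem.
    times = np.arange(4.0)
    C_sem = 0.03**2 * np.exp(-np.abs(times[:, None] - times[None, :]) / 2.0)

    C = sigma_d**2 * np.eye(4) + C_sem            # C = sigma_d^2 I + Csem
    r = o - s - e_mean                            # residual after the mean-error correction

    # Misfit of Equation (3): M = 1/2 r^T C^{-1} r
    M = 0.5 * r @ np.linalg.solve(C, r)
    print("misfit M =", M)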
Figure 9. Observed Production Compared with Predictions
The mean error from the error model is added to the coarse-grid result (blue curve) at each time to generate an improved estimate of the gas concentration produced (red curve). The black curve is observed data (actually synthetic data calculated using the fine-grid model with oil-gas viscosity ratio equal to 13).

The red curve in Figure 10 shows the misfits as a function of viscosity ratio computed using the full error model as in Equation (3). The other misfit statis-

tics in Figure 10 were computed using

M = (o - s)^T (o - s) / (2σd²)

for the least-squares model and

M = (o - s - ē)^T (o - s - ē) / (2σd²)

for least-squares plus mean-error model.

The likelihood function L for the viscosity ratio is then given by L = exp(-M). Notice that the exponential is a signal that the probabilities are sensitive to the method used for computing the misfit. The likelihoods are converted to probability distribution functions by being normalized so that they integrate to 1.
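A minimal sketch of this step, with an invented misfit curve standing in for M computed over the grid of viscosity ratios: the misfits are converted to likelihoods with L = exp(-M) and normalized so that the resulting probability density integrates to 1 over the prior range.

    import numpy as np

    # Invented stand-in for the misfit computed at a grid of viscosity ratios;
    # in the article the misfit comes from Equation (3) with the full error model.
    ratios = np.linspace(5.0, 25.0, 201)              # uniform prior range 5-25
    misfit = 0.5 * ((ratios - 13.0) / 1.5) ** 2       # pretend minimum near 13

    likelihood = np.exp(-(misfit - misfit.min()))     # L = exp(-M), scaled for numerical stability
    posterior = likelihood / np.trapz(likelihood, ratios)

    print("maximum-likelihood viscosity ratio:", ratios[np.argmax(posterior)])
    print("posterior integrates to:", np.trapz(posterior, ratios))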
mean error added (blue curve), and the full error model. The misfit measures the To illustrate the improvement in
quality of the fit to the observed data with low misfits indicating a good fit. parameter estimation that results from
To illustrate the improvement in parameter estimation that results from using an error model, we computed estimates of the probability distribution function for the unknown viscosity ratio using the three different misfit curves in Figure 10, which were calculated with the three different methods: standard least squares, least squares modified by the addition of a mean error term, and least squares with the inclusion of the mean error plus the full covariance matrix. The range of possible values for the viscosity ratio and their posterior probabilities are shown in Figure 11.

Figure 11. Posterior Probability Distribution Functions for the Viscosity Ratio Calculated in Three Ways
This figure shows the estimated posterior probability (assuming a uniform prior probability in the range 5–25) of the viscosity ratio obtained from three different methods for matching the Todd and Longstaff predictions to observed data. The black curve is obtained from the Todd and Longstaff predictions and a standard least-squares approach. The probability density rises to a maximum at the upper end of the viscosity range specified in the prior model. The blue curve shows the effect of adding the mean error to the predictions. The bias in the coarse model has been removed, but the uncertainty is still large. The red curve shows the estimated viscosity ratio from a full error model treatment—refer to Equation (3)—indicating that it is possible to use a statistical model of solution error to get a good estimate of a physical parameter. The true value of the viscosity ratio in this example was 13.

The true value of the viscosity ratio used to generate the "observed" (synthetic) production data in Figure 9 was 13, and one can see that this value has been accurately identified by the full error model. The standard least-squares method has not identified this value because of the underlying bias in the Todd and Longstaff model.

We sample from the estimated probability distribution for the viscosity ratio to generate a forecast of uncertainty in future production.
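A sketch of that forecasting step is given below. It assumes the grid posterior from the previous sketch and a hypothetical forecast(ratio) function that returns a curve of future production for a given viscosity ratio; neither is the authors' actual workflow.

```python
import numpy as np

def forecast_band(ratios, pdf, forecast, n_samples=1000, seed=0):
    """Sample viscosity ratios from the posterior and return forecast percentiles.

    ratios, pdf : grid of trial ratios and the normalized posterior density on it.
    forecast : callable mapping a viscosity ratio to an array of future production values.
    """
    rng = np.random.default_rng(seed)
    weights = pdf / pdf.sum()                              # discrete sampling weights on the grid
    draws = rng.choice(ratios, size=n_samples, p=weights)  # posterior samples of the ratio
    runs = np.array([forecast(r) for r in draws])          # one production curve per draw
    return np.percentile(runs, [2.5, 50.0, 97.5], axis=0)  # 95% band plus a central curve
```

The lower and upper percentiles correspond to the 95 percent confidence limits plotted in Figure 12; the article's solid curve is the maximum likelihood prediction rather than the sampled median used here.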


Figure 12 is a plot of the maximum likelihood prediction from the Todd and Longstaff model, along with the 95 percent confidence limits obtained by sampling for different values of viscosity. In addition, 20 predictions from fine-grid simulation are shown. They use the exact viscosity ratio 13. The uncertainty in the evolution of the fingers gives rise to the uncertainty in prediction shown by the multiple light-gray curves. It is clear from the figure that use of an error model has allowed us to produce well-calibrated predictions.

Figure 12. Prediction of Future Oil Production Using Error Model
The solid red line shows the mean (maximum likelihood) prediction from the Todd and Longstaff model and the full error model. The dashed red lines show the 95% confidence interval, and the fine gray curves show the results from 20 fine-grid simulations using the exact viscosity ratio of 13.

Fluid Dynamics—Error Models for Reverberating Shock Waves. Compressible flow exhibits remarkable phenomena, one of the most striking being shock waves, which are propagating disturbances characterized by sudden and often large jumps in the flow variables across the wave front (Courant and Friedrichs 1967). In fact, for inviscid flows, these jumps are represented as mathematical discontinuities. Shock waves play a prominent role in explosions, supersonic aerodynamics, inertial confinement fusion, and numerous other problems. Most problems of practical importance involve two- or three-dimensional (2-D or 3-D) flows, complex wave interactions, and other complications, so that a quantitative description of the flow can be obtained only by solving the fluid-flow equations numerically. The ability to numerically simulate complex flows is a triumph of modern science, but such simulations, like all numerical solutions, are only approximate. The errors in the numerical solution can be significant, especially when the computations use moderate to coarse computational grids, as is often necessary for real-world problems. In this section, we sketch an approach to estimating these errors.

Our approach makes heavy use of the fact that shock waves are persistent, highly localized wave disturbances. In this case, "persistent" means that shock waves propagate as locally steady-state wave fronts that can be modified only by interactions with other waves or unsteady flows. Generally, interactions consist of collisions with other shock waves, boundaries, or material interfaces. The phrase "highly localized" refers to shock fronts being sharp and their interactions occurring in limited regions of space and time and possibly being characterized by the refraction of shock fronts into multiple wave fronts of different families. These properties are illustrated in Figure 13, which shows a sequence of wave interactions being initiated when a shock incident from the left collides with a contact located a short distance from a reflecting wall at the right boundary in the figure. Each collision event produces three outgoing waves: a transmitted shock, a contact discontinuity, and a reflected shock or rarefaction wave. The buildup of a complex space-time pattern due to the multiple wave interactions is evident.

Generally, solution errors are determined by comparison to a fiducial solution, that is, a solution that is accepted, not necessarily as perfect, but as "correct enough" for the problem being studied. But producing a fiducial solution may not be easy. In principle, one might obtain one using a very highly resolved computation. However, in real-world problems, this is generally not feasible. If it were, one would just do it and forget about solution errors. So, what do we do when we cannot compute a fiducial solution?

The development of models for error generation and propagation offers an approach for dealing with flows that are too complex for direct computation of a fiducial solution. For compressible flows, the key point is that the equations are hyperbolic, which implies that errors are largely


advected through smooth-flow regions and significant errors are only created when wave fronts collide. The flow shown in Figure 13 consists of a sequence of binary wave interactions, each of which is simple enough to be computed on an ultrafine grid. The basic idea is to determine the solution errors for an elementary wave interaction and to construct "composition laws" that give the error at any given point in terms of the error generated at each of the elementary wave interactions in its domain of influence.
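To convey the flavor of such a composition law, the deliberately simplified sketch below propagates a scalar error along a single wave path, with each elementary interaction passing on the incoming error and adding its own contribution. The Interaction fields are illustrative placeholders rather than the authors' actual model, which works with wave-strength errors fitted from ensembles of Riemann problems.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    """One elementary wave interaction encountered along a wave path."""
    time: float
    generated_error: float   # error created at this interaction
    transfer: float          # factor passing the incoming error to the outgoing wave

def error_along_path(input_error, interactions):
    """Propagate an input error through a chain of interactions along one wave path."""
    err = input_error
    for event in sorted(interactions, key=lambda e: e.time):
        err = event.transfer * err + event.generated_error
    return err
```

When a point in the flow can be reached along several paths, as in the event 3 example discussed later in this section, contributions of this kind are summed over the paths.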
A number of points need to be made here. First, there are a limited number of types of elementary wave interactions. One-dimensional (1-D) interactions occur as refractions of pairs of parallel wave fronts, 2-D interactions are refractions of two oblique wave fronts, and 3-D interactions correspond to triple points produced by three interacting waves.
It is important to note that, in each spatial dimension, the elementary wave interactions occur at isolated points. Most of the types of wave interactions that can occur in 1-D flow appear in Figure 13. The coherent traveling wave interactions that occur in 2-D flows have been characterized (Glimm et al. 1985). However, substantial limitations remain on the refinement and thoroughness with which 3-D elementary wave interactions can be studied.

Figure 13. The Space-Time Interaction History of a Shock-Tube Refraction
This figure shows the interaction history as reconstructed from the simulated solution data from a shock-tube refraction problem. A planar shock is incident from the left on a contact discontinuity located near the middle of the test section of the shock tube. A reflecting wall is located on the right side of the tube. Event 1 corresponds to the initial refraction of the shock wave into reflected and transmitted waves, event 2 occurs when the transmitted shock produced by interaction 1 reflects at the right wall, and the events numbered 3–10 correspond to subsequent wave interactions between the various waves produced by earlier refractions or reflections. Our error model is applied at each interaction location to estimate the additional solution error produced by the interaction. (This figure was supplied courtesy of Dr. Yan Yu, Stony Brook University.)

Event 1 in Figure 13 is a typical example of a 1-D wave interaction. Here, the "incoming waves" consist of an incident shock and a contact discontinuity, and the "outgoing state" is described by a reflected shock, a (moving) contact, and a transmitted shock. The interaction can be described as the solution to a Riemann problem with data given by the states behind the incoming wave fronts. A Riemann problem is defined as the initial value problem for a hyperbolic system of conservation laws with scale-invariant initial data. Riemann problems and their solutions are basic theoretical tools in the study of shock dynamics and in the development of shock-capturing schemes to numerically compute flows, and they also play a key role in our study of solution errors. A key point in the use of Riemann problem solutions in our error model is that the solution of a 1-D Riemann problem for hydrodynamics reduces to solving a single, relatively simple algebraic equation. It is thus possible to solve large numbers of Riemann problems for a flow analysis quickly and efficiently. This observation is particularly important because our error model requires the solution of multiple Riemann problems whose data are drawn from statistical ensembles of initial data to represent uncertainties in the incoming waves.
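To illustrate what that single algebraic equation looks like, the sketch below solves the standard gamma-law-gas Riemann problem for the pressure p* between the two outgoing waves by root finding on the usual shock and rarefaction pressure functions; this is textbook gas dynamics rather than the authors' production solver.

```python
import numpy as np
from scipy.optimize import brentq

def pressure_function(p_star, rho, p, gamma):
    """One-sided f(p*): shock branch if p* > p, rarefaction branch otherwise."""
    if p_star > p:                                    # shock
        A = 2.0 / ((gamma + 1.0) * rho)
        B = (gamma - 1.0) / (gamma + 1.0) * p
        return (p_star - p) * np.sqrt(A / (p_star + B))
    a = np.sqrt(gamma * p / rho)                      # rarefaction
    return 2.0 * a / (gamma - 1.0) * ((p_star / p) ** ((gamma - 1.0) / (2.0 * gamma)) - 1.0)

def star_pressure(left, right, gamma=5.0 / 3.0):
    """Solve the single equation f_L(p*) + f_R(p*) + (u_R - u_L) = 0 for p*."""
    (rho_l, u_l, p_l), (rho_r, u_r, p_r) = left, right

    def eq(p_star):
        return (pressure_function(p_star, rho_l, p_l, gamma)
                + pressure_function(p_star, rho_r, p_r, gamma)
                + (u_r - u_l))

    # The bracketing interval is problem dependent; this range covers moderate data.
    return brentq(eq, 1.0e-12, 1.0e3 * max(p_l, p_r))
```

For instance, star_pressure((3.97398, 1.0, 1.33725), (1.0, 0.0, 1.0e-3)) takes the post-shock and undisturbed states of Figure 14 as left and right (rho, u, p) data. Because each evaluation reduces to a scalar root find, large ensembles of such problems can be solved cheaply, which is what makes the sampling of uncertain incoming waves practical.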


A final point here is that a realistic solution error model must include the study of the size distribution of errors over an ensemble of problems, in which the variability of problem characteristics is described probabilistically. Of course, one will often want to make as refined an error analysis as possible within a given realization from the ensemble (that is, a deterministic error analysis), but there are powerful reasons why a probabilistic analysis is needed as well. First, you need probability to describe features of a problem that are too complex for feasible deterministic analysis. Thus, fine details of error generation in complex flows are modeled as random, just as are some details of the operation of measuring instruments. Second, a sensitivity analysis is needed to determine the robustness of the conclusions of a deterministic error analysis to parameter variation. To get an accurate picture, one needs to do sensitivity analysis probabilistically, to answer the question of how likely the parameter variations are that lead to computed changes in the errors. Third, to be a useful tool, the error model must be applicable to a reasonable range of conditions and problems. The only way we are aware of for achieving these goals is to base the error model on a study of an ensemble of problems that reflects the degree of variability one expects to encounter in practice. Of course, the choice of such an ensemble reflects scientific judgment and is an ongoing part of our effort.

Now, let us return to the analysis of solution errors in elementary wave interactions. Our work was motivated by a study of a shock-contact interaction—refer to event 1 in Figure 13. The basic setup is shown in Figure 14, which illustrates a classic shock-tube experiment. An ensemble of problems was generated by sampling from uniform probability distributions (±10 percent about nominal values) for the initial shock strength and the contact position. The solution errors were analyzed by computing the difference between coarse to moderate grid solutions and a very fine grid solution (1000 cells). Error statistics are shown in Figure 15 for a 100-cell grid (moderate grid) solution. Two facts about these solution errors are apparent. First, the solution errors follow the same pattern as the solution (the shock waves) itself; they are concentrated along the wave fronts, where steep gradients in the solution occur. Second, errors are generated at the location of wave interactions. The error generated by the interaction increments the error in the outgoing waves, which is otherwise inherited from the errors in the incoming waves.

Figure 14. Initial Data for a 1-D Shock-Tube Refraction Problem
This schematic diagram is for the initial data used to conduct an ensemble of simulations of a 1-D shock-tube refraction. Each simulation consisted of a shock wave incident from the left on a contact discontinuity between gases at the indicated pressures and densities. Each realization from the ensemble is obtained by selecting a shock strength consistent with a velocity v behind the incident shock taken from a 10% uniform distribution about the mean value v̄ = 1, and an initial contact location C chosen from a 10% uniform distribution about the mean position C̄ = 1. In the diagram, S is the shock position, Ms is the shock strength, and vs is the velocity of the shock. The initial state behind the shock is set by using the Rankine-Hugoniot conditions for the realization shock strength and the specified state ahead of the shock. (The diagram shows a test section spanning x = 0 to 2.5, with an open boundary at the left and a reflecting wall at the right, γ = 5/3, and 0 ≤ t ≤ 3.5; ahead of the shock, ρ = 1, P = 10⁻³, and v = 0; behind the shock at the mean realization, ρ(v̄) = 3.97398 and P(v̄) = 1.33725, with v ~ U[0.9, 1.1]; the initial shock position is S = 0.25, and the mean-realization shock strength and speed are Ms(v̄) = 32.6986 and vs(v̄) = 1.33625.)

Comparable studies have been carried out for each of the types of wave interaction shown in Figure 13, as well as corresponding wave interactions that occur in spherical implosions or explosions (Dutta et al. 2004). An analysis of statistical ensembles of such interactions has led us to suggest the following scheme for estimating the solution errors. The key steps are (a) identification of the main wave fronts in a flow, (b) determination of the times and locations of wave interactions, and (c) approximate evaluation of the errors generated during the interactions. Wave fronts are most simply identified as regions of large flow gradients, and the distributions of the wave positions and velocities are found by solving Riemann problems whose data are taken from ensembles of state information near the detected wave fronts. The error generated during an interaction is fit by a linear expression in the uncertainties of the incoming waves' strengths. The coefficients are computed using a least-squares fit to the distribution of outgoing wave strengths. This fitting procedure can be thought of as defining an input/output relation between errors in incoming and outgoing waves.

A linear relation of this kind, which amounts to treating the errors perturbatively, holds even for strong, and hence nonlinear, wave interactions. But there are limitations. Linearity works if the errors in the incoming waves are not too large, but it may break down for larger errors. In the latter case, higher order (for example, bilinear or rational) terms in the expansion may be needed. See Glimm et al. (2003) for details.
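A minimal sketch of that fitting step: given an ensemble of elementary-interaction runs, the error in one outgoing wave strength is regressed on the errors in the incoming wave strengths. The array names are illustrative.

```python
import numpy as np

def fit_error_transfer(incoming_errors, outgoing_errors):
    """Least-squares fit of a linear input/output error relation.

    incoming_errors : (n_runs, n_in) array of errors in the incoming wave strengths.
    outgoing_errors : (n_runs,) array of errors in one outgoing wave strength.
    Returns the transfer coefficients and the constant (interaction-generated) term.
    """
    X = np.column_stack([incoming_errors, np.ones(len(outgoing_errors))])
    coeffs, *_ = np.linalg.lstsq(X, outgoing_errors, rcond=None)
    return coeffs[:-1], coeffs[-1]
```

Applying the fitted coefficients to the incoming-wave errors at an interaction, plus the constant term, gives the outgoing-wave error increment used in the composition law; when the linear fit breaks down, the higher-order terms mentioned above can be added as extra columns of the design matrix X.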


Figure 15. Space-Time Error Statistics for Shock-Tube Refraction Problems
Panel (a) shows the space-time 100-mesh-point density field for a single realization from the flow ensemble. The space-time error field for each realization is computed from the difference between a 100-mesh-zone calculation and a fiducial solution computed using 1000 mesh zones. Panels (b) and (c) show the mean and variance, respectively, over the ensemble as a function of space and time. Note that most errors are generated at the wave interactions and then move with the wave fronts.

We can now explain how the composition law for solution errors actually works. The basic idea is that errors are introduced into the problem by two mechanisms: input errors that are present in waves that initiate the sequence of wave interactions—see the incoming waves for event 1 in Figure 13—and errors generated at each interaction site. However they are introduced, errors advect with the flow and are transferred at each interaction site by computable relations. Generally, waves arrive at a given space-time point by more than one path. Referring again to Figure 13, suppose you want to find the errors in the output waves for event 3, where the shock reflected off the wall reshocks the contact. On path A, the error propagates directly from the output of interaction 1 along the path of the contact, where it forms part of the input error for event 3. On path B, the output error in the transmitted shock from event 1 follows the transmitted shock to the wall, where it is reflected and then re-crosses the contact. In this way, the error coming into event 3 is given as a sum of terms, with each term labeled by a sequence of wave interactions and of waves connecting these interactions. Moreover, each term can be computed on the basis of elementary wave interactions and does not require the full solution of the numerical problem. The final step in the process is to compute the errors in the output waves at event 3, by using the input/output relations developed for this type of wave interaction.

This procedure represents a substantial reduction in the difficulty of the error analysis problem, and we must ask whether it actually works. Full validation requires use in practice, of course. As a first validation step, we compute the error in two ways. First, we compute the error directly by comparing very fine and coarse-grid simulations for an entire wave pattern. Results are shown in Figure 15. Second, we compute the error using the composition law procedure shown in Figure 13. Comparing the errors computed in these two ways provides the basis for validation.

In Glimm et al. (2003) and Dutta et al. (2004), we carried out such validation studies for planar and spherical shock-wave reverberation problems. As an example, for events 1 to 3 in the planar problem in Figure 13, we considered three grid levels, the finest (5000 cells) defining the fiducial solution, and the other two representing "resolved" (500 cells) and "under-resolved" (100 cells) solutions for this problem. We introduced a 10 percent initial input uncertainty to define the ensemble of problems to be examined. The results can be summarized briefly as follows. For the resolved case, the composition law gave accurate results for the errors (as determined by direct fine-to-coarse grid comparisons) in all cases: wave strength, wave width, and wave position errors. This was not the case for the under-resolved simulation. Although the composition law gave good results for wave strength and wave width errors, it gave poor results for wave position errors. The nature of these results can be understood in terms of a breakdown in some of the modeling assumptions used in the analysis.


An interesting point of contrast emerged between the planar and spherical cases. For the planar case, the dominant source of error was from initial uncertainty, while for the spherical symmetry case, the dominant source of error arose in the simulation itself, and especially from shock reflections off the center of symmetry.

We come now to the "so what?" question for error models. What are they good for? Our analysis shows that, with an error model, one can determine the relative importance of input and solution errors (thereby allocating resources effectively to their reduction), as well as the precise source of the solution error (for the same purpose), and, finally, one can assess the error in a far more efficient manner than by direct comparison with a highly refined computation of the full problem.

A significant limitation in our results to date is that they pertain mostly to 1-D flows, namely, to flows having planar, cylindrical, or spherical symmetry. Two-dimensional problems are currently under study, while full 3-D problems are to be solved in the future. Furthermore, errors in some important fluid flows lie outside the framework we have developed, and their analysis will require new ideas. One such problem—fluid mixing—was discussed in the previous subsection.

Conclusions

This paper started from the premise that predictive simulations of complex phenomena will increasingly be called upon to support high-consequence decisions, for which confidence in the answer is essential. Many factors limit the accuracy of simulations of complex phenomena, one of the most important being the sparsity of relevant, high-quality data. Other factors include incomplete or insufficiently accurate models, inaccurate solutions of the governing equations in the model, and the need to integrate the diverse and numerous components of a complex simulation into a coherent whole. Error analysis by itself does not circumvent these limitations. It is a way to estimate the level of confidence that can be placed in a simulation-based prediction on the basis of a careful analysis of the source and size of errors affecting this prediction. Thus, the metric of success of an error analysis is the confidence it gives that the errors are of a specific size—not necessarily that they are small (they might not be).

We have reviewed some of the ideas and methods that are used in the study of simulation errors and have presented three examples illustrating how these methods can be used. The examples show how an improved physics model can dramatically reduce the size of errors, how an improved error model can reduce uncertainty in prediction of future oil production, and how an error model for a complex shock-wave problem can be built up from an error analysis of its components.

Similar to models of natural phenomena, error models will never be perfect. Estimates of errors and uncertainties are always provisional because the data supporting these estimates are derived from a limited range of experience. Certainty is not in the picture. Nevertheless, confidence in predictions can be derived from the scope and power of the theory and solution methods that are being used. Scope refers to the number and variety of cases in which a theory has been tested. Scope is important in building confidence that one has identified the factors limiting the applicability of the theory. Power is judged by comparing what is put into the simulation with what comes out.

Error models contribute to confidence by clarifying what we do and do not understand. They also guide efforts to improve our understanding by focusing on factors that are the leading sources of error. Thus, in predictions of complex phenomena, an error analysis will form an indispensable part of the answer. ■


Further Reading

Courant, R., and K. O. Friedrichs. 1967. Supersonic Flow and Shock Waves. New York: Springer-Verlag.

Dutta, S., E. George, J. Glimm, J. W. Grove, H. Jin, T. Lee, et al. 2004. Shock Wave Interactions in Spherical and Perturbed Spherical Geometries. Los Alamos National Laboratory document LA-UR-04-2989. (Submitted to Nonlinear Anal.)

Gaver, D. P. 1992. Combining Information: Statistical Issues and Opportunities for Research. Report by Panel on Statistical Issues and Opportunities for Research in the Combination of Information, Committee on Applied and Theoretical Statistics, National Research Council. Washington, DC: National Academy Press.

Gilovich, T., D. Griffin, and D. Kahneman, eds. 2002. Heuristics and Biases: The Psychology of Intuitive Judgment. Cambridge, UK: Cambridge University Press.

Glimm, J., J. W. Grove, Y. Kang, T. W. Lee, X. Li, D. H. Sharp, et al. 2003. Statistical Riemann Problems and a Composition Law for Errors in Numerical Solutions of Shock Physics Problems. Los Alamos National Laboratory document LA-UR-03-2921. SIAM J. Sci. Comput. (in press).

Glimm, J., C. Klingenberg, O. McBryan, B. Plohr, S. Yaniv, and D. H. Sharp. 1985. Front Tracking and Two-Dimensional Riemann Problems. Adv. Appl. Math. 6: 259.

Johnson, V. E., T. L. Graves, M. S. Hamada, and C. S. Reese. 2003. A Hierarchical Model for Estimating the Reliability of Complex Systems. In Bayesian Statistics 7: Proceedings of the Seventh Valencia International Meeting, p. 199. Oxford, UK: Oxford University Press.

Kao, J., D. Flicker, R. Henninger, S. Frey, M. Ghil, and K. Ide. 2004. Data Assimilation with an Extended Kalman Filter for Impact-Produced Shock-Wave Dynamics. J. Comp. Phys. 196 (2): 705.

O'Nions, K., R. Pitman, and C. Marsh. 2002. The Science of Nuclear Warheads. Nature 415: 853.

O'Sullivan, A. E. Modelling Simulation Error for Improved Reservoir Prediction. Ph.D. thesis, Heriot-Watt University, Edinburgh.

Palmer, T. N. 2000. Predicting Uncertainty in Forecasts of Weather and Climate. Rep. Prog. Phys. 63: 71.

Sharp, D. H., and M. Wood-Schultz. 2003. QMU and Nuclear Weapons Certification. Los Alamos Science 28: 47.

For further information, contact David H. Sharp (505) 667-5266 ([email protected]).