The Practice of Sequential Gaussian Simulation

M. Nowak and G. Verly
Abstract. The theory of simulation is relatively well documented but not its practice,
which is a problem since simulation is not as robust as linear estimation. As a result,
many costly mistakes probably go undetected. In this paper, a process for simulation is
introduced with the objective of reducing the likelihood of such mistakes. The context is
sequential Gaussian simulation within the mining industry. However, a significant part
of the process can be applied in other simulation frameworks.
Four of the most important aspects of the process are discussed in detail. A gradual trend
adjustment is suggested as a post-simulation step. A modified bootstrap approach is
presented to deal with the grade uncertainty that accounts for spatial dependence
between the samples. A number of pre- and post-simulation checks are also discussed.
Some post-simulation adjustments of the simulated values are suggested to improve the
quality of the simulation.
All of the approaches, solutions and checks presented in this paper are simple, flexible,
and can be easily implemented by a practitioner.
1 Introduction
This paper introduces a process for sequential Gaussian simulation. Note that a significant portion of the process
can be applied to other simulation algorithms (see Figure 1).
Figure 1 shows that the process is more complex than just normal score transformation,
variogram modeling, simulation and back-transformation. A number of steps have been
added to improve on the span of uncertainties, trend reproduction, reproduction of data
distribution, reproduction of a variogram model, reproduction of correlated variables,
and choice of optimistic and pessimistic scenarios. Although some of these steps have
yet to be implemented, the process is generally followed closely by Placer and is
described in Nowak and Verly (2004).
This process is based on specific difficulties encountered during real case studies, in
particular:
- Simulated values may not adequately follow general trends, especially away from
data locations.
- Bootstrapped distributions may be almost identical when created from large data
sets.
- Average and/or variability of simulated data may be substantially different from the
average and/or variability of the conditioning data.
- Variograms of simulated values may be different from the variogram models.
The four aspects of the process discussed in detail in this paper are:
- Trend analysis.
- Bootstrap grades.
- Check/adjust simulated normal scores (histograms and variograms).
- Check/adjust distribution of simulated grades.
2 Trend analysis
Trends are not always well reproduced in sequential Gaussian simulation (Steps I.1.2
and I.1.8 in Figure 1a). This is because of the stationarity assumption necessary for the
normal score transform and the assumption of a constant zero mean in the SK algorithm.
One simple way to deal with this problem is to filter the trend and simulate the residuals
of the original values (Deutsch, 2002). Unfortunately, this solution may produce
simulated grade values that are negative. An obvious way out is to reset the negative
values to zero, but this may result in a significant bias and poor reproduction of the
trend.
A second solution consists in defining the local prior means to be used by SK with a
correction factor for all kriging variances (Goovaerts, 1997; Deutsch and Journel, 1998). This
solution was not tried by the authors but it is suspected that it may lead to difficulties in
the reproduction of the original values distribution and it does not address the fact that
the normal score transform is global within a geological domain.
Figure 1. Process for simulating (a) original data and (b) bootstrapped data. The topics
discussed in this paper are highlighted. The other process steps are discussed in Nowak
and Verly (2004).
A third solution is given by Leuangthong and Deutsch (2004) who suggest a step-wise
normal score transform. The method consists in defining the trend and residuals
followed by a normal score transform of the residuals conditional to the trend. In
practice, the residuals are classified according to a series of trend value intervals and
there is one standard normal score transform of the residual per interval. This method is
very promising because the normal score transform is conditional to the trend. The
method ensures that there is no trend in the normal score space and that a proper normal
score variogram is used. Finally, the method greatly reduces the number of negative
grade values after the step-wise back-transform.
This method can be modified to a transformation of the original values conditional to the
trend instead of residuals conditional to the trend, which would ensure that there are no
negative grades after back-transformation.
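As an illustration, here is a minimal Python sketch of such a transform of the original values conditional to the trend; the function name, the equal-frequency trend classes, and the rank-based normal score computation are illustrative assumptions, not the published algorithm:

```python
import numpy as np
from scipy.stats import norm

def stepwise_nscore(values, trend, n_classes=10):
    # classify samples into trend-value intervals (equal-frequency classes
    # are assumed here; any sensible binning of the trend could be used)
    edges = np.quantile(trend, np.linspace(0.0, 1.0, n_classes + 1))
    cls = np.clip(np.searchsorted(edges, trend, side="right") - 1,
                  0, n_classes - 1)
    ns = np.empty(len(values))
    for k in range(n_classes):
        idx = np.where(cls == k)[0]
        if len(idx) == 0:
            continue
        # one standard normal score transform per trend class
        ranks = values[idx].argsort().argsort() + 1
        ns[idx] = norm.ppf(ranks / (len(idx) + 1.0))
    return ns
```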
Although the step-wise normal score transform is very promising, other solutions for
trend reproduction have been tried by the authors. These solutions rely on a definition of
a trend at all grid locations and on the average simulated model. It is assumed that the
trend represents a relatively smooth surface and can be assessed by OK with a high
nugget effect. An example of the trend values compared with the original data is
presented in Figure 2a. The first attempt consisted of filtering the trend and simulating
residuals. This approach, however, was abandoned because of a significant amount of
simulated negative grades. Other attempts were made to correct for the trend of the
simulated normal score values or the back-transformed simulated values. The best
results have been obtained by adjusting the back-transformed values according to:

$$Sim_{tr}(x) = w(x)\,Sim(x)$$

where Sim(x) is the simulated value at location x before the trend adjustment, Simtr(x) is
the simulated value after trend adjustment, and w(x) is a correction factor calculated as
follows:

$$w(x) = \left(c(x) - 1\right)v(x) + 1$$

$$c(x) = Tr(x)\,/\,Av_{sim}(x)$$

$$v(x) = \sigma_{kr}(x)\,/\,\sigma_{kr\,\max}$$

where Tr(x) is the trend value at location x, Avsim(x) is the average simulated value,
σkr(x) is the kriging standard deviation, and σkr max is the maximum kriging standard
deviation over all nodes.
The kriging standard deviation ıkr(x) affects the amount of the adjustment. If a simulated
node is very close to conditioning data then v(x) ≈ 0 and no adjustment is made. On the
other hand, a maximum adjustment is made far from data locations. Note that a similar
progressive correction, i.e., a correction dependent on the distance from the data, has
been discussed by Xu (1997). The advantages of the approach are:
- The average of simulated values is similar to the trend, in particular away from data
locations.
- The coefficients of variation of the simulated values before and after the correction
have been observed to be quite similar in practice.
- The correction is simple and can be done on already simulated values.
- The correction is flexible in the sense that σkr max can be replaced by an arbitrary
value.
The disadvantage of the approach is the difficulty of inferring the trend everywhere, in
particular far from data locations.
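A minimal sketch of this gradual trend adjustment, assuming the trend Tr(x), the average simulated model and the kriging standard deviations are already available as arrays on the simulation grid (all names are hypothetical):

```python
import numpy as np

def trend_adjust(sim, trend, avg_sim, sigma_kr, sigma_kr_max):
    """Gradual trend adjustment of back-transformed simulated values:
    Simtr(x) = w(x) Sim(x), with w(x) = (c(x) - 1) v(x) + 1."""
    c = trend / avg_sim                               # c(x) = Tr(x) / Avsim(x)
    v = np.clip(sigma_kr / sigma_kr_max, 0.0, 1.0)    # 0 at data, 1 far away
    w = (c - 1.0) * v + 1.0                           # no adjustment where v(x) = 0
    return w * sim
```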
Figure 2b shows a comparison of the trend with the average simulation before the trend
adjustment, and Figure 2c presents the comparison after the adjustment for the trend.
Clearly, there is a substantial improvement in the reproduction of the trend when the
adjustment is made.
Figure 2. Comparison of the trend (solid line) along elevation with (a) conditioning
data, (b) average simulation before trend adjustment, (c) average simulation after trend
adjustment.
3 Check/adjust simulated normal scores

3.1 HISTOGRAMS
The simulated normal score histogram check may reveal that the simulated distribution
is not standard normal. This section discusses (1) the case of the average of the
simulated values different from 0.0, (2) the case of the variance of simulated values
different from 1.0, and (3) a gradual adjustment of the simulated values to a standard
normal distribution.
3.1.1 AVERAGE DIFFERENT FROM 0.0

The difference may result from an improperly defined validation zone, i.e., the zone
within which the simulation results are validated. Usually, this zone should be similar to
the zone within which the simulation parameters (histogram, normal score transform,
variogram) are calibrated. The difference may also result from an improper declustering
of the original distribution. Two possible solutions are:
- Modification of the validation zone. If, for example, the non-zero average is due to a
significant amount of simulated values located at some distance from the conditioning
data, and at the same time conditioned to low assays on the edges of the drilled-out area,
a modification of the validation zone that excludes areas far from the conditioning data
may significantly reduce the observed difference. Figure 3 illustrates the impact of such
a modification of the validation zone. Here, in the original validation zone the average of
the simulated values is -0.11, but in the modified validation zone the average is -0.02,
which is close to the 0.0 data average. The modified validation zone is limited to the
area close to the conditioning data, extending no further than the search radius used for
polygonal declustering (a sketch of such a zone is given after this list).
- Adjustment of the declustering weights. If polygonal declustering is used, the search
radius may be inappropriate. In other words, the original data distribution has not been
properly defined.
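The modified validation zone can be built, for example, as a simple distance mask around the conditioning data; a sketch assuming scipy is available and that the zone is defined by the polygonal declustering search radius:

```python
import numpy as np
from scipy.spatial import cKDTree

def validation_mask(grid_xyz, data_xyz, search_radius):
    # keep only the grid nodes that lie within the polygonal declustering
    # search radius of at least one conditioning datum
    dist, _ = cKDTree(data_xyz).query(grid_xyz, k=1)
    return dist <= search_radius
```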
If the source of the difference is unknown and there is reason to believe that the original
distribution mean (= 0.0) is correct, the simulated values may have to be adjusted as per
sub-section 3.1.3.
3.1.2 VARIANCE DIFFERENT FROM 1.0

As for the average, the difference in variance may result from an improper validation
zone or improper declustering and the solutions proposed earlier for correcting the
average may be applied.
In practice, the variogram is often fitted first with a sill of one (Figure 1a, Step I.1.4).
The dispersion variance $\bar{\gamma}(Z,Z)$ within the validation zone should then be
computed. If the $\bar{\gamma}(Z,Z)$ value is within 5% of one, a simple rescaling of the
variogram values is reasonable; otherwise a variogram model adjustment (sill and range)
is suggested (Figure 4).
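The dispersion variance $\bar{\gamma}(Z,Z)$ can be approximated by discretizing the validation zone and averaging the variogram model over all pairs of discretization points; a sketch with a hypothetical spherical model:

```python
import numpy as np

def gamma_bar(points, gamma):
    # average variogram value over all pairs of points discretizing the
    # zone Z, including the zero-lag (diagonal) terms
    h = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return gamma(h).mean()

# hypothetical spherical variogram, sill 1.0 and range 100 m
def spherical(h, sill=1.0, a=100.0):
    r = np.minimum(h / a, 1.0)
    return sill * (1.5 * r - 0.5 * r ** 3)

pts = np.random.uniform(0.0, 300.0, size=(500, 3))  # stand-in for the zone
print(gamma_bar(pts, spherical))  # compare with 1.0; if within 5%, rescale
```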
3.1.3 GRADUAL ADJUSTMENT TO A STANDARD NORMAL DISTRIBUTION

If the source of the difference in mean (≠ 0.0) and/or variance (≠ 1.0) is unknown and
there is reason to believe that the original N(0,1) distribution is correct, the simulated
values may have to be adjusted. The following approach is a progressive correction that
depends on the distance of the simulated node from the conditioning data.
First, a maximum possible adjustment at a given node, Simtr max(x), is defined by a simple
standardization (mean = 0 and variance = 1):

$$Sim_{tr\,\max}(x) = \frac{Sim(x) - Av_{Gsim}}{\sigma_{Gsim}}$$

where Sim(x) is the original simulated value at location x, AvGsim is the global average of
all simulated values, and σGsim is the global standard deviation of all simulated values.
The adjusted value Simtr(x) is then obtained by moving gradually from Sim(x) toward
Simtr max(x):

$$Sim_{tr}(x) = Sim(x) + ratio(x)\left(Sim_{tr\,\max}(x) - Sim(x)\right)$$

$$ratio(x) = \sigma_{sim}(x)\,/\,\sigma_{\max\,sim}$$

where σsim(x) is the standard deviation of the simulated values at the selected node, and
σmax sim is the maximum standard deviation of the simulated values from all nodes.
Note that for a node located on a conditioning datum, ratio(x) = 0 and Simtr(x) = Sim(x), i.e.,
there is no correction. As the node gets further from the conditioning data, the value of
ratio(x) gradually increases from zero up to one, and the value of Simtr(x) gradually
varies from Sim(x) to Simtr max(x).
As shown in Figure 5, this adjustment results in a modification of both the average and
the variance of the simulated values. Note that the adjustment does not result in an
average and variance equal to 0 and 1 respectively, but there is a substantial
improvement. Note also that the adjustment described in this section is a gradual affine
correction that will not correct the shape of the distribution. If it is necessary to also
correct the shape of the distribution (i.e., adjusting to a N(0,1) distribution), then a more
sophisticated approach can be used (Xu and Journel, 1994).
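A sketch of this gradual affine correction, assuming the per-node standard deviations across realizations have been computed beforehand (array names are hypothetical):

```python
import numpy as np

def gradual_standardize(sim, sigma_sim, sigma_max_sim):
    # Simtr_max(x): simple standardization to mean 0 and variance 1
    sim_max = (sim - sim.mean()) / sim.std()
    # ratio(x) = 0 on conditioning data, up to 1 far from the data
    ratio = np.clip(sigma_sim / sigma_max_sim, 0.0, 1.0)
    # gradual move from Sim(x) toward Simtr_max(x)
    return sim + ratio * (sim_max - sim)
```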
3.2 VARIOGRAMS
The variograms of the simulated values can deviate from the modeled variograms. A
deviation from the original model may adversely impact the simulation results,
especially when the focus of the study is on variability of the mined blocks. The
difference between the simulated and modeled continuities (variograms) may be caused
by (1) poorly fitted variograms, (2) a modeler’s decision to fit according to geological
interpretation, and (3) an unknown reason.
The first two sources of difference are counteracted by data conditioning. Regardless
of the original variogram model, continuities of the experimental data are to some extent
imprinted on the simulated continuities, especially when there are many data, as in
mining. The impact of the data can be checked by comparing conditional and
unconditional simulations. If deemed necessary, both variogram model range and sill
may be adjusted to achieve the desired results. As shown in Figure 6 the adjustment to
the variogram model results in improved, albeit not perfect, continuities of the simulated
values.
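In practice, this check amounts to computing experimental variograms of each realization and comparing them with the model; a minimal sketch for one realization on a regular grid, along one grid axis:

```python
import numpy as np

def grid_variogram(sim_grid, axis=0, max_lag=20):
    # experimental semivariogram of one simulated realization stored on a
    # regular grid, for lags 1..max_lag along the chosen axis
    g = []
    n = sim_grid.shape[axis]
    for lag in range(1, max_lag + 1):
        head = np.take(sim_grid, range(0, n - lag), axis=axis)
        tail = np.take(sim_grid, range(lag, n), axis=axis)
        g.append(0.5 * np.mean((head - tail) ** 2))
    return np.array(g)
```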
Figure 3. Comparison of simulated values with the data: (a) validation domain identical
to simulation zone, (b) validation domain extending no further from the data than the
search radius used for polygonal declustering.
Figure 4. Example of variogram models before and after normal score variability check.
(a) Variogram model with a total sill of 1.0 results in a dispersion variance within the
validation zone of 0.96. (b) Modified model with a total sill of 1.10 results in a dispersion
variance of 0.99.
Figure 5. Comparison of simulated values with the original normal score data: (a)
before the correction, (b) after the correction of both average and variance.
4 Check/adjust distribution of simulated grades

The checks, and sometimes the adjustments, made in normal score space are necessary,
but they do not remove the need to check the simulated values after back-transformation
(Step I.1.8 in Figure 1a). This is especially true because of a potential compounding
effect of the corrections made. Although the writers are not aware of significant
problems related to the series of corrections, their effect on the final simulated grades
should be studied. Comparisons should be made with the original data within the
validation envelope. Histograms, probability plots, scatterplots and visual checks of
maps of simulated values are useful tools. Care should be taken to ensure that the
simulated mean grade in a geological domain is similar to the average estimated grade in
that domain. If they are different, the simulated grades may have to be adjusted either by
modifying some pre-simulation parameters, such as a trimming value, and re-simulating,
or by a simple adjustment of the simulated values to the required average.
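The simple adjustment to the required average could be, for instance, a multiplicative rescaling of the simulated grades of a domain; a sketch with a hypothetical 5% tolerance:

```python
import numpy as np

def adjust_to_average(sim_grades, estimated_mean, rel_tol=0.05):
    # rescale the simulated grades of a geological domain when their mean
    # differs from the estimated mean by more than the tolerance
    m = sim_grades.mean()
    if abs(m - estimated_mean) > rel_tol * estimated_mean:
        return sim_grades * (estimated_mean / m)
    return sim_grades
```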
5 Bootstrap grades
Two main levels of uncertainty can be identified: geological (rock types) and grade
uncertainty. Only the grade uncertainty is discussed in this section, but the same
discussion applies to the geological uncertainty.
Current simulation practice often relies on the assumption that the distribution of in-situ
grade values is known from the declustered grade histogram. The additional risk
associated with an imperfect knowledge of the actual grade distribution should be
addressed, resulting in better reproduction of the space of uncertainty.
Using a bootstrapping methodology, statistical fluctuations can be investigated by
sampling from the original distribution (Steps I.2.1, I.2.2, and I.2.3 in Figure 1b). A
typical procedure consists in creating a series of possible datasets by drawing randomly,
with replacement and with the attached declustering weights, as many values as there are
in the original dataset. The fluctuations between the various datasets are then
investigated.
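A sketch of this classical bootstrap, with the declustering weights used as drawing probabilities (function and argument names are illustrative):

```python
import numpy as np

def bootstrap_datasets(values, weights, n_sets, seed=0):
    rng = np.random.default_rng(seed)
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()  # declustering weights as drawing probabilities
    # each replicate draws as many values as in the original dataset
    return [rng.choice(values, size=len(values), replace=True, p=p)
            for _ in range(n_sets)]
```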
When there are many sample values, such as in mining, the classical bootstrap approach
results in datasets that are very similar to each other. This similarity would be perfectly
correct if the sample values were uncorrelated, but this is not the case in a typical
mining situation.
Spatial correlation can be addressed by drawing fewer values from the original
distribution (Srivastava, pers. comm.). Indeed, the variance of the mean grade is:
$$Var_1(Mean) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} C_{ij}$$

where Cij is the covariance for the distance between samples i and j, and can be deduced
from the variogram.
If P values are drawn randomly from the original dataset, the variance of the mean is:
$$Var_2(Mean) = \frac{Var(Data)}{P}$$
The required fluctuation of the mean is achieved if P is chosen such that
Var2(Mean) = Var1(Mean), that is:

$$P = \frac{Var(Data)}{Var_1(Mean)}$$
Note that this formula could be refined to account for declustering weights.
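A sketch of the computation of P, assuming a covariance function deduced from the variogram model, C(h) = sill − γ(h); all names are hypothetical:

```python
import numpy as np

def spatial_bootstrap_size(coords, cov):
    # Var1(Mean) = (1/N^2) sum_ij C_ij, with C_ij deduced from the variogram
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    var1_mean = cov(h).sum() / len(coords) ** 2
    var_data = float(cov(np.array(0.0)))  # Var(Data) = C(0) = sill
    # P = Var(Data) / Var1(Mean)
    return max(1, int(round(var_data / var1_mean)))

# hypothetical spherical covariance, sill 1.0 and range 100 m
cov = lambda h: np.where(h < 100.0,
                         1.0 - 1.5 * h / 100.0 + 0.5 * (h / 100.0) ** 3, 0.0)
coords = np.random.uniform(0.0, 300.0, size=(200, 2))
P = spatial_bootstrap_size(coords, cov)  # draw P values per replicate
```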
Figure 7 illustrates the impact of bootstrap on the possible means of the original
distribution. If no bootstrap is applied, the standard deviation of the mean is zero, i.e.,
the mean is fixed (Figure 7a). If the classical bootstrap is applied, the standard deviation
of the means is 0.009 (Figure 7b). If the spatial bootstrap is applied, the standard
deviation of the means increases to 0.036 (Figure 7c).
Figure 7. (a) Data mean - no bootstrap. (b) Typical bootstrap - mean distribution.
(c) Spatial dependence bootstrap - mean distribution.
The bootstrapping may be done on data from all geological domains or on data from one
domain at a time. If the former is used, the choice of optimistic (high average) and
pessimistic (low average) distributions is more difficult, because the distributions from
one or two domains may influence the results. The authors feel that bootstrapping per
domain is a better solution. Under those circumstances, a pessimistic/optimistic
distribution is truly pessimistic/optimistic in all domains. Of course, care should be
taken when choosing the bootstrapped distributions for simulating the grades. The
distributions should not be overly pessimistic or optimistic. The choice of
pessimistic/optimistic distributions can be limited to a specific area, or can be based on
low/high metal content or NPV.
Prior to the final choice of the optimistic and pessimistic scenarios, it may be useful to
gain some insight into the potential impact of that choice on the simulated values. Applying a
cut-off grade on the bootstrapped distribution corrected for change of support may
provide such insight.
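As one possibility, a sketch applying a cut-off after an affine change-of-support correction with a variance reduction factor f; the factor and function below are illustrative assumptions, not the authors' procedure:

```python
import numpy as np

def above_cutoff(boot_grades, cutoff, f=0.7):
    # affine change-of-support correction: block values keep the mean of the
    # bootstrapped distribution but have a reduced variance (factor f)
    m = boot_grades.mean()
    block = m + np.sqrt(f) * (boot_grades - m)
    sel = block > cutoff
    # proportion above cut-off and its mean grade, for scenario comparison
    return sel.mean(), (block[sel].mean() if sel.any() else 0.0)
```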
Figure 8. (a) Standard normal score transform based on a bootstrapped distribution. (b)
Original grade distribution converted to normal scores using the standard normal score
transform of the bootstrapped distribution.
6 Conclusions
A process for sequential Gaussian simulation is presented, which contains more steps
than the usual normal score transformation, variogram modeling, simulation and back-
transformation. A significant portion of the process may be used for other simulation
methods, such as sequential indicator simulation. The authors believe that using similar
processes in the mineral industry would avoid many costly mistakes.
Four of the most important aspects of the process are discussed in detail: trends,
bootstrapping, checks, and adjustment of the simulated values.
Sequential Gaussian simulation often fails to correctly reproduce trends because of its
strong stationarity requirement. A simple, albeit approximate, solution consists of
adjusting for the trend after the simulation, via a gradual correction that depends on the
distance to the conditioning data.
References
Deutsch, C.V. and Journel, A.G., 1998. GSLIB: Geostatistical Software Library and User's Guide, Oxford
University Press, New York, 380 pp.
Deutsch, C.V., 2002. Geostatistical Reservoir Modeling, Oxford University Press, New York, 376 pp.
Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation, Oxford University Press, New York,
467 pp.
Isaaks, E.H., 1991. Application of Monte Carlo Methods to the Analysis of Spatially Correlated Data,
unpublished PhD thesis, Stanford University.
Leuangthong, O. and Deutsch, C.V., 2004. Transformation of residuals to avoid artifacts in geostatistical
modelling with a trend, Mathematical Geology, Vol. 36, No. 3, p. 287-305.
Nowak, M. and Verly, G., 2004. A practical process for simulation, with emphasis on Gaussian simulation,
submitted to the Orebody Modelling and Strategic Planning 2004 Symposium, Perth, Australia.
Xu, W. and Journel, A.G., 1994. Posterior identification of histograms conditional to local data, in Stanford
Center for Reservoir Forecasting (SCRF) Report 7, School of Earth Sciences, Stanford, CA, USA.