Tpe 517 Geostatistics II
RESERVOIR MODELLING
PART II
Introduction to Statistics
Sampling
• Gathering information about an entire population often costs too
much or is virtually impossible. Instead, we use a sample of the
population. A sample should have the same characteristics as
the population it represents.
• Most statisticians use various methods of random sampling in an
attempt to achieve this goal. This section describes a few of
the most common methods. In each form of random sampling, each
member of the population initially has an equal chance of being
selected for the sample.
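As a minimal sketch of simple random sampling, the snippet below draws a sample without replacement so that every member has the same chance of selection. The population of well IDs, the sample size and the seed are hypothetical values chosen only for illustration.

```python
import random

# Hypothetical population: 100 well IDs (illustrative only).
population = [f"well-{i:03d}" for i in range(1, 101)]

rng = random.Random(42)  # fixed seed so the draw is repeatable

# Simple random sampling without replacement: every member of the
# population has the same chance of being selected.
sample = rng.sample(population, k=10)
print(sample)
```

Other random sampling schemes (stratified, cluster, systematic) change how the draw is organised, but they build on the same equal-chance idea.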
Introduction to Statistics
• Descriptive Statistics:
• Deals with the organization, presentation and summarization of data.
It allows us to understand the characteristics of the information
and to use the information more productively.
• Inferential Statistics:
• Deals with the procedures for deriving conclusions on the basis of the
data.
Introduction to Statistics
• That in reservoirs with similar geology, 30% will produce oil. This is
related to the random nature of outcomes.
• That there is 30% probability that a well to be drilled will produce oil.
This is related to the incomplete knowledge of the reservoir.
Deterministic events can be treated as random events if we lack
sufficient knowledge about the events.
Introduction to Statistics
Probability Function
• The probability function describes the probability that a random
variable will take a certain value.
• It is closely related to the relative distribution function.
Expected Value
• This is defined as the probability-weighted average outcome of a random
experiment if the experiment is conducted a large number of times.
• E(X) = Σ Xi · P(Xi), summed over all possible outcomes Xi
Introduction to Statistics
• Uniform Distribution
• Triangular Distribution
• Beta Distribution
• Normal or Gaussian distribution
• Log-normal distribution
Exercise
Solution to Exercise
P(A) = 0.2
P(B) = 0.3
Therefore, assuming the outcomes of the two wells are independent,
P(A ∩ B) = P(A) × P(B)
= 0.2 × 0.3 = 0.06
Therefore, the probability of success in both wells is 6%.
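The arithmetic above can be checked with a short sketch. It assumes, as the multiplication rule requires, that the two wells are independent. The "at least one success" figure is an extra illustration and is not part of the original exercise.

```python
p_a = 0.2  # P(A): probability of success in one well
p_b = 0.3  # P(B): probability of success in the other well

# Multiplication rule, valid only if A and B are independent (assumed).
p_both = p_a * p_b

# Complement rule (extra illustration): at least one success
# equals 1 minus the probability that both wells fail.
p_at_least_one = 1 - (1 - p_a) * (1 - p_b)

print(f"P(A and B) = {p_both:.2f}")
print(f"P(A or B)  = {p_at_least_one:.2f}")
```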
Exercise
Solution to Exercise
Let A = Event that oil is found in the exploratory well.
Let B = Event that the source rock is present.
Measures of the Spread of the Data
where s² is the sample variance, x̄ the sample mean and n the total number
of samples.
Variance can also be calculated as:
Measures of the Spread of the Data
Cv = s / x̄
where Cv is the coefficient of variation, s the sample standard deviation
(the square root of the sample variance s²) and x̄ the sample mean.
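The sketch below computes the variance in two equivalent ways and then the coefficient of variation, using the porosity values from the Case 1 table in this deck. It assumes a divide-by-n variance, because that convention reproduces the value Sφ² = 5.424 used in the linear regression example; a divide-by-(n - 1) convention would give a slightly larger number.

```python
import math

# Porosity values taken from the Case 1 table of this deck.
porosity = [29.49, 26.79, 28.74, 27.65, 27.69, 22.69, 23.30, 23.81, 25.54]
n = len(porosity)
mean = sum(porosity) / n

# Definition form: mean squared deviation from the mean
# (divide-by-n convention, assumed).
var_definition = sum((x - mean) ** 2 for x in porosity) / n

# Shortcut form: mean of the squares minus the square of the mean.
var_shortcut = sum(x * x for x in porosity) / n - mean ** 2

std = math.sqrt(var_definition)
cv = std / mean  # coefficient of variation, Cv = s / mean

print(f"mean = {mean:.2f}, s^2 = {var_definition:.3f}, "
      f"s = {std:.3f}, Cv = {cv:.4f}")
```

Because Cv is dimensionless, it allows the spread of variables with very different magnitudes (for example porosity and permeability) to be compared.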
Example Exercise
Solution
• Position of the kth percentile in the sorted data: i = k(n + 1)/100
Calculate 100(X+0.5Y)/N
18, 21, 22, 25, 26, 27, 29, 30, 31, 33, 36, 37, 41, 42, 47, 52,
55, 57, 58, 62, 64, 67, 69, 71, 72, 73, 74, 76, 77
Finding Percentile
a. When k= 70:
i=70(29+1)/100 = 21.
21st position in the data set is 64.
i.e. 70th percentile is 64 years.
b. 85th percentile:
• i=85(29+1)/100 = 25.5
• Age in 25th position = 72; age in 26th position = 73
• Therefore 85th percentile = (72+73)/2 = 72.5 years.
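The two percentile calculations above can be reproduced with a short sketch. The `percentile` function follows the position rule i = k(n + 1)/100 and, for a non-integer position, averages the two neighbouring values as the worked example does. The `percentile_rank` function reads the expression 100(X + 0.5Y)/N with X as the number of values below the value of interest and Y as the number equal to it; that reading of the symbols is an assumption, since the slide does not define them.

```python
ages = [18, 21, 22, 25, 26, 27, 29, 30, 31, 33, 36, 37, 41, 42, 47, 52,
        55, 57, 58, 62, 64, 67, 69, 71, 72, 73, 74, 76, 77]


def percentile(data, k):
    """k-th percentile using the position rule i = k(n + 1)/100."""
    ordered = sorted(data)
    n = len(ordered)
    i = k * (n + 1) / 100      # 1-based position in the sorted data
    i = min(max(i, 1), n)      # clamp so extreme k stays inside the data
    lower = int(i)             # position at or just below i
    if i == lower:
        return ordered[lower - 1]
    # Non-integer position: average the two neighbouring values,
    # as done in the worked example.
    return (ordered[lower - 1] + ordered[lower]) / 2


def percentile_rank(data, value):
    """100(X + 0.5Y)/N with X = count below value, Y = count equal (assumed)."""
    below = sum(1 for d in data if d < value)
    equal = sum(1 for d in data if d == value)
    return 100 * (below + 0.5 * equal) / len(data)


print(percentile(ages, 70))                  # uses position 21
print(percentile(ages, 85))                  # uses position 25.5
print(round(percentile_rank(ages, 64), 1))   # rank of the age 64
```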
Spatial Data Set Analysis (Bivariate)
There are many cases that involve the spatial analysis of data
sets of two variables, e.g. porosity and permeability.
The most common analysis tools are:
• Covariance, C(x,y)
• Coefficient of correlation, r(x,y)
• Linear regression
Bivariate Statistics
Variance and Correlation Coefficient: Case 1
Solution: Case 1
Sample  Porosity (X)  K (Y)  X²        Y²       XY
1       29.49         1156   869.6601  1336336  34090.44
2       26.79         531    717.7041  281961   14225.49
3       28.74         1059   825.9876  1121481  30435.66
4       27.65         822    764.5225  675684   22728.30
5       27.69         1014   766.7361  1028196  28077.66
6       22.69         109    514.8361  11881    2473.21
7       23.30         138    542.8900  19044    3215.40
8       23.81         166    566.9161  27556    3952.46
9       25.54         362    652.2916  131044   9245.48
Sum     235.70        5357   6221.544  4633183  148444.1
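The column sums in the table are what the covariance and the coefficient of correlation need. A minimal sketch is given below; it assumes the divide-by-n convention, C(x,y) = Σxy/n - x̄ȳ, which is the convention that reproduces the figures quoted in the linear regression example of this deck.

```python
import math

porosity = [29.49, 26.79, 28.74, 27.65, 27.69, 22.69, 23.30, 23.81, 25.54]
perm = [1156, 531, 1059, 822, 1014, 109, 138, 166, 362]


def covariance(x, y):
    """C(x, y) = sum(xy)/n - mean(x) * mean(y) (divide-by-n, assumed)."""
    n = len(x)
    return sum(a * b for a, b in zip(x, y)) / n - (sum(x) / n) * (sum(y) / n)


def correlation(x, y):
    """r(x, y) = C(x, y) / (Sx * Sy), where S squared is C(x, x)."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


c_xy = covariance(porosity, perm)
r_xy = correlation(porosity, perm)
print(f"C(x, y) = {c_xy:.1f}, r(x, y) = {r_xy:.3f}")
```

Calling `correlation` with the log K column of Case 2 instead of K shows how the log transform changes the strength of the linear relationship.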
Solution: Case 2
Sample  Porosity (X)  log K (Y)  X²        Y²        XY
1       29.49         3.063      869.6601  9.381969  90.32787
2       26.79         2.725      717.7041  7.425625  73.00275
3       28.74         3.025      825.9876  9.150625  86.93850
4       27.65         2.915      764.5225  8.497225  80.59975
5       27.69         3.006      766.7361  9.036036  83.23614
6       22.69         2.037      514.8361  4.149369  46.21953
7       23.30         2.140      542.8900  4.579600  49.86200
8       23.81         2.220      566.9161  4.928400  52.85820
9       25.54         2.559      652.2916  6.548481  65.35686
Sum     235.70        23.690     6221.544  63.69733  628.4016
m = C(x,y)/Sx²
b = ȳ - m x̄
Linear Regression
• For our example,
C(log K, φ) = 0.8875
Sφ = 2.329
Sφ² = 5.424
m = 0.8875/5.424 = 0.1636
b = mean(log K) - m × mean(φ)
= 2.6322 - 0.1636(26.19) = -1.652
• The fitted line is therefore log K = 0.1636 φ - 1.652
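The regression coefficients can be recomputed from the Case 2 data with the sketch below. It uses m = C(x,y)/Sx² and b = ȳ - m x̄ with divide-by-n statistics, which matches the numbers quoted above. The helper `predict_log_k` is a hypothetical name added for illustration.

```python
porosity = [29.49, 26.79, 28.74, 27.65, 27.69, 22.69, 23.30, 23.81, 25.54]
log_k = [3.063, 2.725, 3.025, 2.915, 3.006, 2.037, 2.140, 2.220, 2.559]

n = len(porosity)
mean_phi = sum(porosity) / n
mean_log_k = sum(log_k) / n

# Covariance and variance with the divide-by-n convention.
cov = sum(p * k for p, k in zip(porosity, log_k)) / n - mean_phi * mean_log_k
var_phi = sum(p * p for p in porosity) / n - mean_phi ** 2

m = cov / var_phi              # slope
b = mean_log_k - m * mean_phi  # intercept


def predict_log_k(phi):
    """Predicted log K for a given porosity (hypothetical helper)."""
    return m * phi + b


print(f"C = {cov:.4f}, m = {m:.4f}, b = {b:.3f}")
print(f"log K at porosity 25: {predict_log_k(25.0):.3f}")
```

A least-squares line always passes through the point of means, so `predict_log_k(mean_phi)` returns the mean of log K; this is a quick sanity check on the coefficients.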
Spatial Relationship
▪ Indicating intensity of pattern and the scale at which that pattern is exposed
▪ Interpolating to predict values at unmeasured points across the domain (e.g. kriging)
▪ Assessing independence of variables before applying parametric tests of significance
Quantifying Spatial Relationship
Areal Heterogeneity: Conventional Methods
Principles that govern determination of weights
Kriging Method
• In other words, kriging replaces the arbitrarily chosen weight
from the inverse distance method (IDM) with a probabilistically based
weighting function that models the spatial dependence of the data.
• The structure of the spatial dependence is quantified in the
semivariogram.
• Semivariograms measure the strength of statistical correlation as
a function of distance; they quantify spatial autocorrelation.
• Kriging associates some probability with each prediction, hence it
provides not just a surface, but some measure of the accuracy of
that surface.
• Kriging equations are estimated through least squares.
Kriging Method
• Kriging has deterministic, stochastic and random error components:
Z(s) = μ(s) + ε’(s) + ε’’(s),
where
μ(s) = deterministic component
ε’(s) = stochastic but spatially dependent component
ε’’(s) = spatially independent residual error
• Assumes spatial variation in the variable is too irregular to be modeled by
a simple smooth function; it is better modeled with a stochastic surface
• Interpolation parameters (e.g. weights) are chosen to optimize the function
Kriging Method
1. Assumption of Stationarity
2. Spatial modelling of sample data using a variogram
3. Estimation of the variable's value at unsampled locations,
called kriging.
Stationarity in Geostatistics
In addition to assuming that all locations are described by
random variables, we also have to consider the restrictions
associated with available sample data to predict values at
unsampled locations.
Stationarity in Geostatistics
For geostatistical purposes, the first and second orders of
stationarity must be specified:
Or,
Bivariate Relationships in Spatial Data
• For two variables x and y, the covariance is given as:
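• For a single variable x, if we denote x(u) as the value at location u and x(u+L) as the value
at location u+L, the covariance above can be rewritten in terms of the lag distance L as c(L).
• The coefficient of correlation, r(x,y) = c(x,y)/(Sx Sy), then becomes:
r(L) = c(L)/(Sx(u) Sx(u+L))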
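A minimal sketch of the lag covariance and lag correlation for a single variable sampled at regular spacing is given below. It pairs each value x(u) with the value one lag away, x(u+L), and then applies the same covariance and correlation formulas to the two sets. The divide-by-n convention and the data values are assumptions for illustration; they are not the data of the worked example in this deck.

```python
import math


def lag_statistics(values, lag):
    """Return (n, c(L), r(L)) for samples at unit spacing; lag must be >= 1."""
    head = values[:-lag]   # x(u)
    tail = values[lag:]    # x(u + L)
    n = len(head)
    mean_h = sum(head) / n
    mean_t = sum(tail) / n
    # Covariance between the head and tail values (divide-by-n, assumed).
    c = sum(h * t for h, t in zip(head, tail)) / n - mean_h * mean_t
    s_h = math.sqrt(sum(h * h for h in head) / n - mean_h ** 2)
    s_t = math.sqrt(sum(t * t for t in tail) / n - mean_t ** 2)
    return n, c, c / (s_h * s_t)


# Hypothetical values at 1 ft spacing (illustrative only).
x = [10.2, 11.1, 12.4, 11.8, 10.9, 9.7, 9.1]

for lag in (1, 2, 3):
    n, c, r = lag_statistics(x, lag)
    print(f"lag = {lag} ft, n = {n}, c(L) = {c:.3f}, r(L) = {r:.3f}")
```

With seven equally spaced samples, lags of 1, 2 and 3 ft give n = 6, 5 and 4 pairs, which is why the number of pairs shrinks as the lag distance grows.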
Example
Lag Distance of 1 ft (n=6)
• Therefore, Sx = 1.638
• Therefore,
= 0.7165
Lag Distance of 2 ft (n=5)
Lag Distance of 3 ft (n=4)
Effect of Lag Distance on Covariance
Mathematical Expectation
Solution:
For a die experiment, the probability of each outcome of the random variable is 1/6
Properties of Expected Value
• E[K] = K
• E[Ku(X)] = KE[u(X)]
• E[u1(X)+u2(X)] = E[u1(X)] + E[u2(X)]
Properties of Expected Value
Example
Calculate the variance for the rolling-a-die experiment.
Solution
σ² = E[X²] - µ²
µ = E[X]= 3.5
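The remaining steps of the die calculation can be sketched as follows. Exact fractions are used so that no rounding enters the result; the only assumption is a fair six-sided die, with each face having probability 1/6.

```python
from fractions import Fraction

outcomes = range(1, 7)   # faces of a fair six-sided die
p = Fraction(1, 6)       # probability of each face

mean = sum(x * p for x in outcomes)          # E[X]
mean_sq = sum(x ** 2 * p for x in outcomes)  # E[X^2]
variance = mean_sq - mean ** 2               # sigma^2 = E[X^2] - mu^2

print(f"E[X] = {mean}, E[X^2] = {mean_sq}, variance = {variance}")
print(f"variance as a decimal: {float(variance):.4f}")
```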
The Variogram
The variogram is the most commonly used geostatistical
technique for describing spatial relationships. It is half of
the variance of the difference between two values located a distance L apart.
Mathematically, it is defined as:
γ(L) = ½ Var[x(u) - x(u+L)]
The Variogram
Oftentimes, both the variogram and covariance can capture
spatial relationship adequately.
Why use the variogram instead of covariance?
• Tradition
• Estimation of the variogram requires less restrictive assumptions
than the covariance:
the variogram requires only that the variance of the
difference between two values be finite, while calculation of the
covariance requires that the variance of the data itself be finite.
Semivariance
• Semivariance (distance h) = 0.5 * average [(value at location i - value at location j)²], or

$$\gamma(h) = \frac{1}{2n} \sum_{i=1}^{n} \{ z(x_i) - z(x_i + h) \}^2$$

• Based on the scatter of points, the computer (Geostatistical Analyst) fits a curve through those points
• The inverse is the covariance matrix, which shows correlation over space
by Austin Troy
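The semivariance formula can be sketched for the simplest case, a one-dimensional series sampled at unit spacing, where every pair separated by exactly `lag` positions contributes to γ(h). The data values are hypothetical, and real spatial data would also need binning of irregular pair distances.

```python
def semivariance(values, lag):
    """gamma(h) = sum of squared pair differences / (2n); lag must be >= 1."""
    # Pair each value z(x_i) with the value one lag away, z(x_i + h).
    pairs = list(zip(values[:-lag], values[lag:]))
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


# Hypothetical measurements at unit spacing (illustrative only).
z = [10.2, 11.1, 12.4, 11.8, 10.9, 9.7, 9.1]

for lag in range(1, 4):
    print(f"lag = {lag}, gamma = {semivariance(z, lag):.3f}")
```

Plotting these values against the lag gives the experimental semivariogram through which a model curve is then fitted.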
Variogram
• Plots semivariance against distance between points
• Is binned to simplify
• Can be binned based on just distance (top) or distance and direction (bottom)
• Where autocorrelation exists, the semivariance should have slope
• Look at variogram to find where slope levels
[Figure: top panel, binning based on distance only; bottom panel, binning based on distance and direction]
Variogram
• The semivariance (SV) value where it flattens out is called the “sill.”
• The distance range for which there is a slope is called the
“neighborhood”; this is where there is positive spatial structure
• The intercept is called the “nugget” and represents random noise
that is spatially independent
[Figure: variogram annotated with sill, nugget and range]
©2008 All lecture materials by Austin Troy except where noted
Functional Forms
The Kriging Technique
The term kriging comes from Danie Krige, a South African mining engineer,
who first applied the technique to gold mines.
The kriging technique uses a linear estimation procedure to estimate a
value at unsampled locations.
Kriging Method
• The fitted variogram results in a series of matrices and vectors that
are used in weighting and locally solving the kriging equation.
• Basically, at this point, it is similar to other interpolation methods
in that we are taking a weighted moving average, but the weights
(λ) are based on statistically derived autocorrelation measures.
• The λs are chosen so that the estimate z(x0) is unbiased and the
estimation variance is less than for any other possible linear combination
of the variables.
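The weighting step can be illustrated with a small one-dimensional sketch of ordinary kriging, one common variant; the source does not say which variant it has in mind. The spherical variogram model, its sill and range, the sample locations and the sample values are all assumptions chosen for illustration. The weights are found by solving the kriging system, with a Lagrange multiplier enforcing that they sum to 1 (the unbiasedness condition).

```python
def spherical(h, sill=1.0, range_=10.0, nugget=0.0):
    """Spherical variogram model (an assumed model choice)."""
    if h == 0:
        return 0.0
    if h >= range_:
        return sill
    r = h / range_
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(xs, zs, x0, gamma=spherical):
    """Return (estimate, kriging variance, weights) at location x0."""
    n = len(xs)
    # Variogram between every pair of samples, bordered by the
    # unbiasedness constraint (weights sum to 1).
    a = [[gamma(abs(xs[i] - xs[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Variogram between each sample and the target location.
    b = [gamma(abs(xs[i] - x0)) for i in range(n)] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]   # mu is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, zs))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights


# Hypothetical sample locations and values (illustrative only).
xs = [0.0, 1.0, 3.0, 6.0]
zs = [12.0, 14.5, 13.0, 10.0]

est, var, w = ordinary_kriging(xs, zs, 2.0)
print(f"estimate = {est:.3f}, variance = {var:.3f}")
print("weights:", [round(v, 3) for v in w])
```

With a zero nugget, kriging at a sampled location returns the sample value itself with zero variance, which is a useful check on any implementation.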
Kriging Method
• Produces four types of prediction maps:
• Prediction map: predicted values
• Probability map: probability that the value exceeds a threshold x
• Prediction standard error map: fit of the model
• Quantile map: probability that the value exceeds a certain quantile
Experimental Variogram
• Measures the variability of data with respect to spatial distribution
• Specifically, looks at variance between pairs of data points over a
range of separation scales
Stationarity
• Stationarity implies that an entire dataset is
described by the same probabilistic process;
that is, we can analyze the dataset with one
statistical model.
Stationarity and the Variogram
• Under the condition of stationarity, the
variogram will tell us over what scale the data
are correlated.
[Figure: variogram γ(h) against separation distance h, contrasting data correlated at any distance with uncorrelated data]
4/22/2021 UB Geology GLY560: GIS
Variogram for Stationary Dataset
• Range: maximum distance at which data are correlated
• Nugget: distance over which data are absolutely correlated or unsampled
• Sill: maximum variance (γ(h)) of data pairs
[Figure: semivariogram against separation distance, annotated with range, sill and nugget]