Prob Stat Petroleum Resources Assessment
By
Robert A. Crovelli1
This report is preliminary and has not been reviewed for conformity with U.S.
Geological Survey editorial standards. Any use of trade, product or firm names is for
descriptive purposes only and does not imply endorsement by the U.S. Government.
1 U.S. Geological Survey, Box 25046, MS 971, DFC, Denver, Colorado 80225
1993
TABLE OF CONTENTS
Venn diagram for describing the fields of probability and statistics
The relationship between probability and inferential statistics
I. Probability
A. Basic Concepts
1. Petroleum accumulation classification hierarchy
2. Experiment, sample space, and event
3. Venn diagram
4. Tree diagram
5. Event relations
6. Combinatorial analysis (counting techniques)
7. Definitions of probability
8. Probability of event relations
9. Conditional probability
10. Probability rules
11. Applications of probability rules
12. Bayes' rule
C. Descriptive Parameters
1. Measures of central location
a. Mean
b. Median
c. Mode
2. Mean, median, and mode related to skewness
3. Measures of variation
a. Variance
b. Standard deviation
4. Fractiles
5. Examples
6. LOGRAF
II. Statistics
A. Sampling Concepts
1. Populations
2. Parameters
3. Samples
4. Sampling techniques
5. Statistics
B. Descriptive Statistics
1. Tabular methods
a. Frequency distribution
b. Relative frequency distribution
c. Cumulative frequency distribution (less than)
d. Relative cumulative frequency distribution (less than)
e. Cumulative frequency distribution (more than)
f. Relative cumulative frequency distribution (more than)
2. Pictorial methods
a. Frequency histogram
b. Relative frequency histogram
c. Cumulative frequency polygon (less than)
d. Relative cumulative frequency polygon (less than)
e. Cumulative frequency polygon (more than)
f. Relative cumulative frequency polygon (more than)
3. Measures of central location
a. Sample mean
b. Sample median
c. Sample mode
4. Measures of variation
a. Sample variance
b. Sample standard deviation
c. Sample range
C. Sampling Distributions
1. Sampling distribution of the mean
2. Central Limit Theorem
3. Normal probability paper
D. Inferential Statistics
1. Statistical estimation
a. Point estimation
b. Interval estimation
2. Tests of hypotheses
a. Z test for p
b. Chi-squared goodness-of-fit test
c. Lognormal probability paper
3. Regression and correlation
a. Formulas
b. Transformations
c. Finding-rate curves
d. Power laws
e. Fractals
Venn Diagram for Describing the Fields of Probability and Statistics
Probability (circle label: Probability Theory)
AREAS
Operations Research
Probability Models
Stochastic Processes
Markov Chains
Queuing Theory
Simulation
Game Theory
Decision Theory
Risk Analysis
Dynamic Programming
Reliability Theory
Combinatorial Analysis
Time Series Analysis
Actuarial Analysis
Random Walks
Intersection of the two circles: Basic Probability
Statistics (circle label: Statistical Theory)
AREAS
Statistical Inference
Estimation Theory
Tests of Hypotheses
Regression & Correlation (Simple & Multiple)
Analysis of Variance
Design of Experiments
Sampling Techniques
Sample Surveys
Nonparametric Statistics
Multivariate Statistics
Factor Analysis
Discriminant Analysis
Quality Control
Descriptive Statistics
Bayesian Statistics
Geostatistics
The Relationship Between Probability and Inferential Statistics
Probability (deductive reasoning): Population → Sample
Statistics (inductive reasoning): Sample → Population
I. Probability
A. Basic Concepts
1. Petroleum accumulation classification hierarchy
Pool: An individual accumulation or reservoir of oil or gas.
Field: A set of one or more pools of oil or gas that are related to a single
structural or stratigraphic feature.
Prospect: A potential oil or gas field.
Play: A set of one or more prospects that are geologically related in their
hydrocarbon sources, reservoirs, traps, and geologic histories.
Province (or basin): A set of one or more plays that are hydrodynamically
related.
Region: A set of one or more provinces that are geographically related.
2. Experiment, sample space, and event
Experiment: any process or action that generates observations.
Experiment: Three-prospect assessment
Suppose we are assessing three prospects in a new play. Each prospect
results in one of two possible outcomes. Let "success" (S) denote having an
oil or gas field and "failure" (F) denote being dry.
Sample space: a set of all possible outcomes (sample points) of an
experiment.
Sample Space
SSS
SSF
SFS
SFF
FSS
FSF
FFS
FFF
Event: a subset of a sample space.
Let event A: Exactly one field
A = {SFF, FSF, FFS}
and event B: At least one field
B = {SSS, SSF, SFS, SFF, FSS, FSF, FFS}
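The sample space and the two events above can be enumerated mechanically. The following is a minimal sketch; the variable names (sample_space, event_a, event_b) are illustrative and do not come from the report.

```python
from itertools import product

# Each of the three prospects is either a success (S) or a failure (F).
# product("SF", repeat=3) lists the outcomes in the same order as the text.
sample_space = ["".join(outcome) for outcome in product("SF", repeat=3)]

# Event A: exactly one field. Event B: at least one field.
event_a = {pt for pt in sample_space if pt.count("S") == 1}
event_b = {pt for pt in sample_space if pt.count("S") >= 1}

print(sample_space)
print(sorted(event_a))
print(sorted(event_b))
```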
3. Venn diagram
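[Figure: Venn diagram of the sample space.]
4. Tree diagram
[Figure: tree diagram of the three-prospect assessment, with sample points from SSS to FFF.]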
5. Event relations
a. Union of events
The union of two events A and B, denoted by A ∪ B and read "A or B," is
the event containing all outcomes in A or B or both.
Let A = {SFF, FSF, FFS} and B = {SSS, SSF, SFS, SFF, FSS, FSF, FFS},
then A ∪ B = {SSS, SSF, SFS, SFF, FSS, FSF, FFS}
b. Intersection of events
The intersection of two events A and B, denoted by A ∩ B and read "A
and B," is the event containing all outcomes in both A and B.
Let A = {SFF, FSF, FFS} and B = {SSS, SSF, SFS, SFF, FSS, FSF, FFS},
then A ∩ B = {SFF, FSF, FFS}
c. Complement of event
The complement of an event A, denoted by A' and read "not A," is the
event containing all outcomes of the sample space that are not in A.
Let A = {SFF, FSF, FFS},
then A' = {SSS, SSF, SFS, FSS, FFF}
d. Mutually exclusive events
Two events A and B are mutually exclusive or disjoint events
if A and B have no outcomes in common, i.e., A ∩ B = ∅
Let A = {SFF, FSF, FFS} and B = {SSF, SFS, FSS},
then A ∩ B = ∅
Therefore, A and B are mutually exclusive events.
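The four event relations map directly onto set operations. This is a small sketch using Python sets; the name two_fields for the second set of the mutually exclusive example is an illustrative label, not one used in the report.

```python
# Sample space and events copied from the three-prospect example.
space = {"SSS", "SSF", "SFS", "SFF", "FSS", "FSF", "FFS", "FFF"}
a = {"SFF", "FSF", "FFS"}            # exactly one field
b = space - {"FFF"}                  # at least one field
two_fields = {"SSF", "SFS", "FSS"}   # second set in the mutually exclusive example

union = a | b                # outcomes in A or B or both
intersection = a & b         # outcomes in both A and B
complement_a = space - a     # outcomes not in A
mutually_exclusive = a.isdisjoint(two_fields)

print(sorted(union))
print(sorted(intersection))
print(sorted(complement_a))
print(mutually_exclusive)
```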
6. Combinatorial analysis (counting techniques)
a. Fundamental principle of counting
If an operation can be performed in $n_1$ ways, and if for each of these a
second operation can be performed in $n_2$ ways, and for each of the first
two a third operation can be performed in $n_3$ ways, and so forth, then
the sequence of k operations can be performed in $n_1 n_2 \cdots n_k$ ways.
Experiment: Three-prospect assessment
Number of sample points in sample space = 2*2*2 = 8
b. Combinations
A combination is any unordered subset of r objects taken from a set of n
distinct objects.
The number of combinations of r objects taken from n distinct objects is
$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$
where "n factorial" is $n! = n(n-1)(n-2)\cdots(3)(2)(1)$ and $0! = 1$.
Number of sample points in event A $= \binom{3}{1} = \frac{3!}{1!\,2!} = \frac{3\cdot 2\cdot 1}{1\cdot 2\cdot 1} = 3$
Number of sample points in event B $= \binom{3}{1} + \binom{3}{2} + \binom{3}{3} = 3 + 3 + 1 = 7$
c. Permutations
A permutation is any ordered subset of r objects taken from a set of n
distinct objects.
The number of permutations of r objects taken from n distinct objects is
$${}_nP_r = \frac{n!}{(n-r)!}$$
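The counting formulas can be checked with the standard library (math.comb and math.perm, available in Python 3.8 and later). A short sketch follows; the permutation example with n = 3 and r = 2 is a hypothetical illustration, since the report gives no numeric permutation example.

```python
import math

# Fundamental principle of counting: two outcomes for each of three prospects.
n_sample_points = 2 * 2 * 2

# Event A (exactly one field): choose which 1 of the 3 prospects is a field.
n_event_a = math.comb(3, 1)

# Event B (at least one field): 1, 2, or 3 fields among the 3 prospects.
n_event_b = sum(math.comb(3, r) for r in (1, 2, 3))

# Hypothetical illustration: ordered selections of 2 prospects out of 3.
n_ordered_pairs = math.perm(3, 2)

print(n_sample_points, n_event_a, n_event_b, n_ordered_pairs)
```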
7. Definitions of probability
Idea of probability:
The probability of an event is a numerical measure of the likelihood that
the event will occur.
a. Classical definition of probability
If an experiment can result in any one of N different equally likely
outcomes, and if exactly n of these outcomes correspond to event A,
then the probability of event A is
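$$P(A) = \frac{n}{N}$$
In the relative frequency (limit) form, where $n_A$ denotes the number of times event A occurs in n repetitions of the experiment,
$$P(A) = \lim_{n \to \infty} \frac{n_A}{n}$$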
8. Probability of event relations
Suppose we assume equally likely sample points for the experiment: three-
prospect assessment.
Sample Space
SSS
SSF
SFS
SFF
FSS
FSF
FFS
FFF
a. Given event A: Exactly one field
A = {SFF, FSF, FFS},
then P(A) = 3/8
b. Given event B: At least one field
B = {SSS, SSF, SFS, SFF, FSS, FSF, FFS},
then P(B) = 7/8
c. Given A ∪ B = {SSS, SSF, SFS, SFF, FSS, FSF, FFS},
then P(A ∪ B) = 7/8
d. Given A ∩ B = {SFF, FSF, FFS},
then P(A ∩ B) = 3/8
e. Given A' = {SSS, SSF, SFS, FSS, FFF},
then P(A') = 5/8
f. Given B' = {FFF},
then P(B') = 1/8
9. Conditional probability
Notation: P(A | B) denotes the conditional probability of event A given that
the event B has occurred.
Given that B has occurred, event B becomes the new reduced sample space.
Conditional probability:
For any two events A and B with P(B) > 0, the conditional probability of A
given that B has occurred is defined by
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
For the three-prospect assessment with equally likely sample points,
$$P(A \mid B) = \frac{3/8}{7/8} = \frac{3}{7}$$
Independence:
Two events A and B are independent if P(A | B) = P(A) and are dependent
otherwise.
Consider the experiment: three-prospect assessment with equally likely
sample points. Since
P(A | B) = 3/7 and P(A) = 3/8 => P(A | B) ≠ P(A),
events A and B are dependent.
Remark: If two events with nonzero probabilities are mutually exclusive, then they are dependent.
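The event probabilities and the dependence check can be reproduced with exact fractions. This is a sketch under the equally likely assumption stated in the text; the helper name prob is illustrative.

```python
from fractions import Fraction
from itertools import product

sample_space = ["".join(o) for o in product("SF", repeat=3)]
a = {pt for pt in sample_space if pt.count("S") == 1}   # exactly one field
b = {pt for pt in sample_space if pt.count("S") >= 1}   # at least one field

def prob(event):
    """Classical probability: favorable outcomes over all equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

p_a = prob(a)
p_b = prob(b)
p_a_given_b = prob(a & b) / p_b      # definition of conditional probability
independent = p_a_given_b == p_a     # independence requires P(A | B) = P(A)

print(p_a, p_b, p_a_given_b, independent)
```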
10. Probability rules
a. Addition rule
If A and B are any two events, then
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
b. Special addition rule
If A and B are mutually exclusive events, then
P(A ∪ B) = P(A) + P(B)
c. Complement rule
If A and A' are complementary events, then
P(A) + P(A') = 1
or
P(A) = 1 - P(A')
d. Multiplication rule
If A and B are any two events, then
P(A ∩ B) = P(A | B)P(B)
and
P(A ∩ B) = P(A)P(B | A)
e. Special multiplication rule
If A and B are independent events, then
P(A ∩ B) = P(A)P(B)
f. Another special multiplication rule
If A and B are mutually exclusive events, then
P(A ∩ B) = 0
11. Applications of probability rules
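a. Equally likely
Sample Point   Probability
SSS            0.125
SSF            0.125
SFS            0.125
SFF            0.125
FSS            0.125
FSF            0.125
FFS            0.125
FFF            0.125
Total          1.000
b. Bernoulli process
[Figure: tree diagram with columns First Prospect, Second Prospect, Third Prospect, Sample Point, and Event A; sample points SSS, SSF, SFS, SFF, FSS, FSF, FFS, FFF.]
c. Independence
[Figure: tree diagram with columns First Prospect, Second Prospect, Third Prospect, Sample Point, and Event A; sample points SSS, SSF, SFS, SFF, FSS, FSF, FFS, FFF.]
d. Dependence
[Figure: tree diagram with columns First Prospect, Second Prospect, Third Prospect, Sample Point, and Event A; sample points from SSS to FFF; one surviving branch probability is 0.8.]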
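12. Bayes' rule
Rule of total probability:
If the events B1, B2, ..., Bk constitute a partition of the sample space such
that P(Bi) ≠ 0 for i = 1, 2, ..., k, then for any event A
$$P(A) = \sum_{i=1}^{k} P(B_i \cap A) = \sum_{i=1}^{k} P(B_i)\,P(A \mid B_i)$$
Bayes' rule:
Under the same conditions, for any event A with P(A) ≠ 0,
$$P(B_r \mid A) = \frac{P(B_r)\,P(A \mid B_r)}{\sum_{i=1}^{k} P(B_i)\,P(A \mid B_i)}$$
for r = 1, 2, ..., k.
Example: Posterior probabilities are
$$P(B_1 \mid A) = \frac{P(B_1)P(A \mid B_1)}{P(B_1)P(A \mid B_1) + P(B_2)P(A \mid B_2)} = \frac{(3/4)(2/3)}{(3/4)(2/3) + (1/4)(1/3)} = \frac{1/2}{7/12} = \frac{6}{7}$$
$$P(B_2 \mid A) = \frac{(1/4)(1/3)}{7/12} = \frac{1}{7}$$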
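The posterior calculation can be sketched with exact fractions. The numbers 3/4, 2/3, 1/4, and 1/3 are the ones that survive in the example; reading them as P(B1), P(A | B1), P(B2), and P(A | B2) is an assumption, and the function name posteriors is illustrative.

```python
from fractions import Fraction

# Assumed reading of the example: priors P(B1), P(B2) and likelihoods P(A|B1), P(A|B2).
priors = [Fraction(3, 4), Fraction(1, 4)]
likelihoods = [Fraction(2, 3), Fraction(1, 3)]

def posteriors(priors, likelihoods):
    """Return P(A) by the rule of total probability and P(Bi | A) by Bayes' rule."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(Bi)P(A | Bi)
    total = sum(joint)                                     # P(A)
    return total, [j / total for j in joint]

p_a, post = posteriors(priors, likelihoods)
print(p_a, post)
```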
B. Random Variables and Probability Distributions
A random variable X is a function that associates a real number with each
element in the sample space.
1. Discrete random variables
a. Binomial random variable
Bernoulli process
(Tree diagram columns: First Prospect, Second Prospect, Third Prospect)
Sample Point   Probability   x
SSS            0.008         3
SSF            0.032         2
SFS            0.032         2
SFF            0.128         1
FSS            0.032         2
FSF            0.128         1
FFS            0.128         1
FFF            0.512         0
Total          1.000
Let random variable X: Number of fields (successes)
Possible distinct values x = 0,1,2,3
Note that P(X = 1) = P(A) = 0.384
A discrete random variable X can take on a countable number of values.
b. Examples of discrete random variables
X: Number of discoveries
X: Number of dry holes
X: Number of prospects
X: Number of petroleum accumulations
X: Number of oil fields
X: Number of gas fields
X: Number of exploratory wells
2. Discrete probability distributions
Probability distributions can be expressed in the form of tables, graphs, and
formulas.
Binomial distribution
a. Probability mass function (pmf)
f(x) = P(X = x)
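x    f(x)
0    0.512
1    0.384
2    0.096
3    0.008
[Figure: graph of f(x) against x = 0, 1, 2, 3.]
Complementary cumulative (more than) distribution function (ccdf), R(x) = P(X > x):
$$R(x) = \begin{cases} 1 & \text{for } x < 0 \\ 0.488 & \text{for } 0 \le x < 1 \\ 0.104 & \text{for } 1 \le x < 2 \\ 0.008 & \text{for } 2 \le x < 3 \\ 0 & \text{for } x \ge 3 \end{cases}$$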
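The tabulated values can be regenerated from the binomial formula. This is a minimal sketch assuming n = 3 prospects and success probability p = 0.2, the value consistent with the tabulated probabilities (0.008 = 0.2 cubed); the function names pmf and ccdf are illustrative.

```python
from math import comb

n, p = 3, 0.2   # assumed: three prospects, P(field) = 0.2

def pmf(x):
    """Binomial probability mass function f(x) = P(X = x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def ccdf(x):
    """Complementary cumulative distribution function R(x) = P(X > x)."""
    return sum(pmf(k) for k in range(n + 1) if k > x)

f_values = [round(pmf(x), 3) for x in range(n + 1)]
r_values = [round(ccdf(x), 3) for x in range(n + 1)]
print(f_values)
print(r_values)
```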
3. Graphs of discrete probability distributions
Binomial distribution
a. Probability histogram
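[Figure: probability histogram of f(x) for x = 0, 1, 2, 3.]
b. Cumulative (less than) distribution function (cdf)
[Figure: step graph of F(x) for x from 0 to 3.]
c. Complementary cumulative (more than) distribution function (ccdf)
[Figure: step graph of R(x) for x from 0 to 3.]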
4. Continuous random variables
a. Concept of a continuous random variable
A continuous random variable X can take on a continuum of values.
X = x where x is a real number in an interval, e.g., 0 < x < ∞, i.e., any
positive number.
5. Continuous probability distributions
Uniform or rectangular distribution
a. Probability density function (pdf)
$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{for } a < x < b \\ 0 & \text{otherwise} \end{cases}$$
6. Graphs of continuous probability distributions
Uniform or rectangular distribution
a. Probability density function (pdf)
[Figure: f(x) is constant at height 1/(b - a) between a and b.]
b. Cumulative (less than) distribution function (cdf)
[Figure: F(x) rises from 0 at a to 1 at b.]
7. General graphs of continuous probability distributions
a. Probability density function (pdf)
[Figure: general graph of a pdf f(x).]
8. Monte Carlo simulation
a. Binomial distribution
X: Number of fields (successes) in 3 prospects (x = 0,1,2,3).
P(field) = 0.2 and P(dry) = 0.8
X = X1 + X2 + X3 where
X1: Number of fields in prospect 1 (x1 = 0,1),
X2: Number of fields in prospect 2 (x2 = 0,1),
X3: Number of fields in prospect 3 (x3 = 0,1).
[Figure: probability distributions of X1, X2, and X3 for Prospect 1, Prospect 2, and Prospect 3, each on the values 0 and 1.]
A relative frequency distribution of X from an actual simulation,
compared to the exact probability distribution of X from the analytic
method:
[Figure: distributions of the input variables Closure, Thickness, and Oil Yield Factor.]
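The binomial simulation in item a can be sketched as follows: each trial draws three independent prospects with P(field) = 0.2 and sums the successes, and the relative frequencies are compared with the exact binomial probabilities. The trial count, the seed, and the function name simulate are illustrative choices, not values from the report.

```python
import random
from collections import Counter
from math import comb

def simulate(n_trials=100_000, n_prospects=3, p_field=0.2, seed=1):
    """Estimate the distribution of X = X1 + X2 + X3 by repeated random trials."""
    rng = random.Random(seed)
    counts = Counter(
        sum(rng.random() < p_field for _ in range(n_prospects))
        for _ in range(n_trials)
    )
    return {x: counts[x] / n_trials for x in range(n_prospects + 1)}

# Exact probabilities from the analytic (binomial) method.
exact = {x: comb(3, x) * 0.2**x * 0.8 ** (3 - x) for x in range(4)}
estimate = simulate()

for x in range(4):
    print(x, round(estimate[x], 3), round(exact[x], 3))
```

With more trials the relative frequencies settle closer to the exact values, at the cost of run time.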
c. Comparison of the analytic probability method and the Monte Carlo
simulation method
[Figure: comparison of the two methods plotted against Difficulty of Problem.]
C. Descriptive Parameters
1. Measures of central location
Also called measures of central tendency
a. Mean
i. Probability density function (pdf)
[Figure: the mean is the center of gravity of the pdf; for the uniform distribution the mean is (a + b)/2.]
b. Median
i. Probability density function (pdf)
[Figure: the median marked on the pdf f(x).]
[Figures: the median is the value at which the cdf and the ccdf R(x) equal 0.5.]
c. Mode
i. Probability density function (pdf)
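[Figure: the mode is the value of x at which f(x) is a maximum.]
ii. Cumulative (less than) distribution function (cdf)
[Figure: the mode corresponds to the inflection point of the cdf.]
iii. Complementary cumulative (more than) distribution function (ccdf)
[Figure: the mode corresponds to the inflection point of the ccdf.]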
2. Mean, median, and mode related to skewness
a. Symmetric probability density function (pdf)
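[Figure: symmetric f(x); the mean, median, and mode coincide.]
[Figure: skewed f(x) with the labels in the order mode, median, mean (positive skewness).]
[Figure: skewed f(x) with the labels in the order mean, median, mode (negative skewness).]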
3. Measures of variation
a. Variance
i. Two probability density functions with different variations
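[Figure: two pdfs f(x) with different variations.]
For the binomial random variable X (number of fields) with mean $\mu = 0.6$:
$$\sigma^2 = \sum_x (x - \mu)^2 f(x) = (0 - 0.6)^2(0.512) + (1 - 0.6)^2(0.384) + (2 - 0.6)^2(0.096) + (3 - 0.6)^2(0.008) = 0.48$$
Theorem: $\sigma^2 = E(X^2) - \mu^2$
$$E(X^2) = (0^2)(0.512) + (1^2)(0.384) + (2^2)(0.096) + (3^2)(0.008) = 0.84$$
$$\sigma^2 = 0.84 - (0.6)^2 = 0.48$$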
b. Standard deviation
i. The standard deviation of X is the positive square root of the
variance of X, i.e.,
$$\sigma = \sqrt{\sigma^2}$$
ii. Experiment: Three-prospect assessment
Binomial random variable X: Number of fields
The standard deviation of X is
$$\sigma = \sqrt{0.48} = 0.69$$
iii. Chebyshev's Theorem:
Given any random variable X with mean μ and standard
deviation σ, then for any k > 0,
$$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}$$
For k = 2, $P(\mu - 2\sigma < X < \mu + 2\sigma) \ge \frac{3}{4}$
For k = 3, $P(\mu - 3\sigma < X < \mu + 3\sigma) \ge \frac{8}{9}$
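The mean, variance, standard deviation, and the Chebyshev bound can be checked numerically from the tabulated pmf of the number of fields. A brief sketch; the helper name prob_within is illustrative.

```python
from math import sqrt

# pmf of X (number of fields) from the three-prospect Bernoulli process.
pmf = {0: 0.512, 1: 0.384, 2: 0.096, 3: 0.008}

mean = sum(x * f for x, f in pmf.items())
var_definition = sum((x - mean) ** 2 * f for x, f in pmf.items())
var_shortcut = sum(x * x * f for x, f in pmf.items()) - mean**2   # E(X^2) - mu^2
sd = sqrt(var_definition)

def prob_within(k):
    """P(mu - k*sd < X < mu + k*sd) computed directly from the pmf."""
    return sum(f for x, f in pmf.items() if abs(x - mean) < k * sd)

for k in (2, 3):
    # Chebyshev guarantees the left value is at least the right value.
    print(k, round(prob_within(k), 3), round(1 - 1 / k**2, 3))
```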
4. Fractiles
a. Fractiles are values of a random variable that correspond to "more than"
or exceedance probabilities.
The 100p-th fractile (0 < p < 1), denoted by $F_{100p}$, is the value of a random
variable X such that $P(X > F_{100p}) = p$.
For example
The 95th fractile, F95, is the value of X such that P(X > F95) = 0.95.
The 5th fractile, F5, is the value of X such that P(X > F5) = 0.05.
The 50th fractile, F50, is the median.
b. Probability density function (pdf)
[Figures: fractiles marked on the pdf and on the ccdf R(x).]
ESTIMATES (shown on both panels)
MEAN     1.08
MEDIAN   0.73
95%      0.18
75%      0.40
50%      0.73
25%      1.32
5%       3.14
MODE     0.33
S.D.     1.19
Figure 9. Conditional probability distribution of the undiscovered recoverable oil for the
North Atlantic Shelf province expressed as A, conditional more-than cumulative
distribution function, and B, conditional probability density function. Estimates
are mean, median, mode, standard deviation (S.D.), and fractiles that correspond to
the percentages listed.
[Figure: probability distribution in two panels; horizontal axis: BILLION BARRELS RECOVERABLE OIL, from 0.0 to 12.0.]
ESTIMATES (shown on both panels)
MEAN     0.45
MEDIAN   0.00
95%      0.00
75%      0.00
50%      0.00
25%      0.59
5%       2.07
MODE     0.33
S.D.     0.94
EXAMPLES
Examples of two probability curves on the same graph are (1) conditional and
unconditional resource potential, and (2) recoverable and economically recoverable
resource potential.
Example 1
LOGRAF is used to duplicate the probability graphs that were originally
generated by EXACTDIS for the national assessment of undiscovered conventional
oil and gas resources by the U.S. Geological Survey (Mast and others, 1989).
Figure 1a consists of cumulative probability distributions for undiscovered
recoverable and undiscovered economically recoverable conventional crude oil
resources of the United States. Figure 1a' is a summary of the input and output of
the assessment, including the lognormal parameters and the conditional and
unconditional estimates for each probability curve. The inputs to LOGRAF are
estimates of the following parameters for each distribution:
Recoverable resources: p = 1, 6 = 0, F95 = 33.2, F5 = 69.9
Economically recoverable: p = 1, 6 = 0, F95 = 20.7, F5 = 53.8
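A two-fractile lognormal fit can be sketched as follows. It is an assumption that LOGRAF derives its parameters this way; the sketch only shows that matching F95 and F5 of a lognormal distribution reproduces the Mu, Sigma, mean, median, and mode printed in the LOGRAF summary for the recoverable oil curve. The function names are illustrative.

```python
from math import exp, log
from statistics import NormalDist

def lognormal_from_fractiles(f95, f05):
    """Fit ln X ~ Normal(mu, sigma) so that P(X > f95) = 0.95 and P(X > f05) = 0.05."""
    z = NormalDist().inv_cdf(0.95)              # standard normal 95th percentile
    mu = (log(f95) + log(f05)) / 2              # midpoint on the log scale
    sigma = (log(f05) - log(f95)) / (2 * z)     # half-width divided by z
    return mu, sigma

def lognormal_summary(mu, sigma):
    """Mean, median, and mode of a lognormal distribution."""
    return {
        "mean": exp(mu + sigma**2 / 2),
        "median": exp(mu),
        "mode": exp(mu - sigma**2),
    }

mu, sigma = lognormal_from_fractiles(33.2, 69.9)   # recoverable oil inputs
estimates = lognormal_summary(mu, sigma)
print(round(mu, 4), round(sigma, 4))
print({name: round(value, 3) for name, value in estimates.items()})
```

Note that under the more-than convention used in this report, F95 is the smaller value and F5 the larger.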
[Figure: Undiscovered Conventional Crude Oil Resources - Total U.S., Recoverable and Economically Recoverable Resources. Horizontal axis: BILLIONS OF BARRELS.]
Figure 1a.--LOGRAF output of cumulative probability distributions for undiscovered recoverable (solid curve) and undiscovered
economically recoverable (dashed curve) conventional crude oil resources of the United States. Both curves are conditional
(cond) probability distributions. (Curves duplicated from Figure 10 in Mast and others, 1989.)
LOGRAF 92.7 15-Dec-1992 17:09:09 C:\LOGRAF\GEOTECH\USOIL.DAT
OUTPUT: OUTPUT:
Lognormal parameters Lognormal parameters
Mu 3.8748 Mu 3.5077
Sigma 0.2263 Sigma 0.2903
Conditional estimates Conditional estimates
Mean 49.423 Mean 34.808
Median 48.173 Median 33.372
Mode 45.769 Mode 30.674
F95 33.2 F95 20.7
F90 36.045 F90 23.003
F75 41.354 F75 27.437
F50 48.173 F50 33.372
F25 56.117 F25 40.59
F10 64.383 F10 48.415
F05 69.9 F05 53.8
S.D. 11.329 S.D. 10.322
Unconditional estimates * Unconditional estimates *
Mean 49.423 Mean 34.808
Median 48.173 Median 33.372
Mode 45.769 Mode 30.674
F95 33.2 F95 20.7
F90 36.045 F90 23.003
F75 41.354 F75 27.437
F50 48.173 F50 33.372
F25 56.117 F25 40.59
F10 64.383 F10 48.415
F05 69.9 F05 53.8
S.D. 11.329 S.D. 10.322
* Because the marginal probability is equal to 1, the unconditional and conditional estimates are equal.
Figure 1a'.--LOGRAF summary of input and output estimates for undiscovered recoverable
(curve #1) and undiscovered economically recoverable (curve #2) conventional crude oil
resources of the United States. For additional output see figure 1a.
[Figure: Undiscovered Total Natural Gas Resources - Total U.S., Recoverable and Economically Recoverable Resources.]
OUTPUT: OUTPUT:
Lognormal parameters Lognormal parameters
Mu 5.9776 Mu 5.5619
Sigma 0.1528 Sigma 0.1358
Conditional estimates Conditional estimates
Mean 399.1 Mean 262.74
Median 394.47 Median 260.32
Mode 385.37 Mode 255.57
F95 306.8 F95 208.2
F90 324.31 F90 218.73
F75 355.84 F75 237.54
F50 394.47 F50 260.32
F25 437.3 F25 285.3
F10 479.81 F10 309.83
F05 507.2 F05 325.5
S.D. 61.341 S.D. 35.851
Unconditional estimates * Unconditional estimates *
Mean 399.1 Mean 262.74
Median 394.47 Median 260.32
Mode 385.37 Mode 255.57
F95 306.8 F95 208.2
F90 324.31 F90 218.73
F75 355.84 F75 237.54
F50 394.47 F50 260.32
F25 437.3 F25 285.3
F10 479.81 F10 309.83
F05 507.2 F05 325.5
S.D. 61.341 S.D. 35.851
* Because the marginal probability is equal to 1, the unconditional and conditional estimates are equal.
Figure 1b'.--LOGRAF summary of input and output estimates for undiscovered recoverable
(curve #1) and undiscovered economically recoverable (curve #2) conventional natural gas
resources of the United States. For additional output see figure 1b.
D. Some Continuous Probability Distributions
1. Normal distribution
a. Probability density function (pdf)
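$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \qquad -\infty < x < \infty$$
Parameters are the mean μ (-∞ < μ < ∞) and standard deviation σ > 0.
Typical normal curve:
68.26% within μ ± σ
95.44% within μ ± 2σ
99.74% within μ ± 3σ
[Figure: normal curve with the horizontal axis marked at μ - 2σ, μ - σ, μ, μ + σ, and μ + 2σ.]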
P(15 < X < 20) = P(X < 20) - P(X < 15) = P(Z < 0.67) - P(Z < - 1)
= 0.7486 - 0.1587 from Table A.1
= 0.5899
P(X > 20) = 1 - P(X < 20) = 1 - 0.7486 = 0.2514
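The table lookups in this example can be reproduced with the standard normal cdf. The problem statement did not survive, so the parameters below (mean 18, standard deviation 3) are an assumption chosen because they give the z values 0.67 and -1 used above; z is rounded to two decimals to mimic a printed table.

```python
from statistics import NormalDist

mu, sigma = 18.0, 3.0        # assumed parameters, inferred from z = 0.67 and z = -1
standard_normal = NormalDist()

# Standardize and round to two decimals, as when reading a printed table.
z_upper = round((20 - mu) / sigma, 2)
z_lower = round((15 - mu) / sigma, 2)

p_between = standard_normal.cdf(z_upper) - standard_normal.cdf(z_lower)
p_above = 1 - standard_normal.cdf(z_upper)

print(z_upper, z_lower)
print(round(p_between, 4), round(p_above, 4))
```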
2. PROBDIST model selection menu
PROBDIST 89.11   FIG15.PDL   14:38:33   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : 7-fractile Probability Histogram
INPUT: PARAMETERS
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     8.52500   6.52745   0.00000   1.00000   3.00000   6.00000   14.0000   19.0000   28.0000
[Figure: 7-fractile probability histogram of the sample data.]
PROBDIST 89.11   FIG16.PDL   14:39:16   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : 3-fractile Probability Histogram
INPUT: PARAMETERS
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     7.37916   1.75230   2.00000   4.00000   6.50000   8.00000   8.66666   9.33333   10.0000
[Figure: 3-fractile probability histogram of the sample data, with labeled values 2.00000, 4.00000, 8.00000, 9.33333, and 10.0000.]
PROBDIST 89.11   FIG17.PDL   14:39:43   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : Min/max Normal Distribution
INPUT: PARAMETERS
                Min    Max
VARIABLE NAME   F100   F0
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     6.00000   1.46074   2.00000   3.80666   5.10000   6.00000   6.90000   8.19333   10.0000
[Figure: min/max normal distribution of the sample data.]
Figure 17. Output of PROBDIST for the minimum/maximum normal distribution model.
PROBDIST 89.11   FIG18.PDL   14:40:08   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : Mean/SD Normal Distribution
INPUT: PARAMETERS
                Mean   S.D.
VARIABLE NAME   (Mu)   (Sigma)
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     5.00000   1.09555   2.00000   3.35500   4.32500   5.00000   5.67500   6.64500   8.00000
[Figure: mean/SD normal distribution of the sample data, Sigma = 1.]
Figure 18. Output of PROBDIST for the mean/standard deviation normal distribution
model.
PROBDIST 89.11   FIG19.PDL   14:40:33   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : Truncated Normal Distribution
INPUT: PARAMETERS
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     5.05290   1.23119   2.00000   3.36504   4.32778   5.00163   5.67623   6.64769   10.0000
[Figure: truncated normal distribution of the sample data.]
Figure 19. Output of PROBDIST for the truncated normal distribution model.
PROBDIST 89.11   FIG20.PDL   14:40:5?   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : Lognormal Distribution
INPUT: PARAMETERS
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     4.29568   1.28129   2.00000   2.93519   3.46408   4.00000   4.73208   6.27719   10.
[Figure: lognormal distribution of the sample data.]
PROBDIST 89.11   FIG21.PDL   14:41:21   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : Truncated Lognormal Distribution
INPUT: PARAMETERS
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     158.630   125.516   2.00000   18.7198   56.6487   117.261   223.345   411.409   500.000
[Figure: truncated lognormal distribution of the sample data, Sigma = 1.2.]
Figure 21. Output of PROBDIST for the truncated lognormal distribution model.
PROBDIST 89.11   FIG22.PDL   14:41:42   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : Exponential Distribution
INPUT: PARAMETERS
                Min    Max
VARIABLE NAME   F100   F0
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     3.27798   1.38289   2.00000   2.05940   2.33316   2.80274   3.60549   5.46941   10.0000
[Figure: exponential distribution of the sample data, with labeled values 2.00000, 2.05940, 2.80274, 5.46941, and 10.0000.]
PROBDIST 89.11   FIG23.PDL   14:42:06   3-May-1990
INPUT: PARAMETERS
                Min    Max
VARIABLE NAME   F100   F0    Beta
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     4.48180   2.02684   2.00000   2.14292   2.79435   3.87791   5.59086   8.46225   10.
[Figure: truncated exponential distribution of the sample data, Beta = 3.]
Figure 23. Output of PROBDIST for the truncated exponential distribution model.
PROBDIST 89.11   FIG24.PDL   14:42:35   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : Pareto Distribution
INPUT: PARAMETERS
                Min    Max
VARIABLE NAME   F100   F0
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     2.74582   1.16589   2.00000   2.02404   2.13864   2.35053   2.76251   4.01936   10.0000
[Figure: Pareto distribution of the sample data, with labeled values 2.02404, 2.35053, 4.01936, and 10.0000.]
PROBDIST 89.11   FIG25.PDL   14:42:56   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : Truncated Pareto Distribution
INPUT: PARAMETERS
                Min    Max
VARIABLE NAME   F100   F0    d
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     4.54702   2.17003   2.00000   2.11531   2.69285   3.81966   5.83592   8.86975   10.
[Figure: truncated Pareto distribution of the sample data, d = 0.5.]
Figure 25. Output of PROBDIST for the truncated Pareto distribution model.
PROBDIST 89.11   FIG26.PDL   14:43:21   3-May-1990
Project name    : Open File Report
Estimation name : Test data
Units           : none
Model           : Uniform Distribution
INPUT: PARAMETERS
                Min    Max
VARIABLE NAME   F100   F0
OUTPUT: ESTIMATES
VARIABLE NAME   MEAN      S.D.      F100      F95       F75       F50       F25       F5        F0
Sample data     6.00000   2.30940   2.00000   2.40000   4.00000   6.00000   8.00000   9.60000   10.0000
[Figure: uniform distribution of the sample data.]
PROBDIST 89.11   FIG27.PDL   14:43:47   3-May-1990
INPUT: PARAMETERS
                Min           Max
VARIABLE NAME   F100   Mode   F0
OUTPUT: ESTIMATES
[Figure: triangle distribution of the sample data, with labeled values 2.00000, 4.00000, and 10.0000.]
II. Statistics
A. Sampling Concepts
1. Populations
Population of units: a set of units having some common characteristic.
Example: The population of oil fields in a play.
Population of observations: the set of all possible observations or values of a
random variable X.
A population of observations is conceptualized from a population of units if
each unit (e.g., oil field) were to be measured according to some random
variable X (e.g., oil field size).
Population distribution: the probability distribution of a random variable X.
Example: Normal population means a population whose observations are
values of a random variable having a normal distribution.
Population size: the number of observations in the population.
Parent population: the population of observations that we are interested in
studying.
Example: The population of oil field sizes in a play. The parent population
distribution could be modeled as a Pareto distribution.
Sampled population: the population of observations from which a sample is
taken.
Sometimes, for various reasons (e.g., financial), the sampled population is
more restricted than the parent population.
Example: The population of oil field sizes in a play with the constraint of a
point of economic truncation. The sampled population distribution could be
modeled as a lognormal distribution.
[Figure: field size distribution; labels: PARENT POPULATION, INCREMENTAL DISTRIBUTION SEGMENTS; horizontal axis: SIZE.]
Figure 9.1. The progressive exhaustion of an oil and gas field size distribution by wildcat drilling.
S0 is the point of economic truncation. W1 to Wn are sequential segments of the field size
distribution added with W1 to Wn wildcat wells.
2. Parameters
Population parameter: a parameter of the population distribution; i.e., any
numerical quantity that characterizes or describes a population
distribution.
Important: A parameter is a constant or fixed value.
Characterizing parameter: a population parameter that characterizes the
population distribution.
Descriptive parameter: a population parameter that is a function of the
characterizing parameters and describes the population distribution, e.g.,
the population mean.
Population mean: the mean $\mu$ of the population distribution.
Population variance: the variance $\sigma^2$ of the population distribution.
Population standard deviation: the standard deviation $\sigma$ of the population
distribution.
For simplicity,
"parameters" will often refer specifically to the characterizing parameters;
"moments" will refer specifically to the population mean and variance.
a. Binomial population
Population distribution: binomial distribution
Parameters:
Number of prospects: n = 3
Probability of field (success): p = 0.2
Moments:
Population mean: $\mu = np = (3)(0.2) = 0.6$
Population variance: $\sigma^2 = np(1 - p) = (3)(0.2)(0.8) = 0.48$
b. Uniform population
Population distribution: uniform distribution
Parameters:
Minimum value: a = 0
Maximum value: b = 1
Moments:
Population mean: $\mu = \dfrac{a + b}{2} = 0.5$
Population variance: $\sigma^2 = \dfrac{(b - a)^2}{12} = 0.083$
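The moments of the two example populations follow directly from their characterizing parameters. A minimal sketch, with illustrative function names, that reproduces the binomial (n = 3, p = 0.2) and uniform (a = 0, b = 1) values:

```python
def binomial_moments(n, p):
    """Return (mean, variance) of a binomial(n, p) population."""
    return n * p, n * p * (1 - p)

def uniform_moments(a, b):
    """Return (mean, variance) of a uniform(a, b) population."""
    return (a + b) / 2, (b - a) ** 2 / 12

bin_mean, bin_var = binomial_moments(3, 0.2)
uni_mean, uni_var = uniform_moments(0, 1)
print(f"binomial: mean = {bin_mean:.2f}, variance = {bin_var:.2f}")
print(f"uniform:  mean = {uni_mean:.2f}, variance = {uni_var:.3f}")
```

The point of the example is that descriptive parameters such as the mean are fixed functions of the characterizing parameters; nothing here depends on a sample.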
3. Samples
Sample: a subset of a population.
Sample size: the number of members in a sample.
Sample of units: a subset of n units from a population of units.
Example: A sample of n oil fields in a play.
Sample of observations: a subset of n observations from a sampled
population of observations, i.e., a set of n random variables $X_1, X_2, \ldots, X_n$.
Example: A sample of n oil field sizes in a play.
Physical sample: in geology, a single unit.
Examples: A core sample or a water sample.
Important: Unless otherwise stated, a "sample" means a sample of
observations, i.e., a set of data.
Two basic types of data
Discrete data: data resulting from a discrete random variable (count data).
Example: Number of new discoveries in each of the plays making up a
basin.
Continuous data: data resulting from a continuous random variable
(measured data).
Example: Discovered oil field sizes in a play.
Example 1: Oil and mixed oil and gas field sizes
Oil and mixed oil and gas field sizes (in million barrels known recovery)
for 175 fields of 1 million BOE or more known recovery in the northern
Michigan Silurian reef play are given below.
Source of data is the Significant Oil and Gas Fields of the United States
Data Base, a product of NRG Associates, Inc. (1988). The version used
included discoveries up to and including 1990. Known recovery refers to
the sum of cumulative production plus reserves. Included in the NRG files
are those fields with at least 1 million BOE of known recovery and also
those smaller, but expected to eventually be revised to at least 1 million
BOE.
Let X: Oil or mixed oil and gas field size (million barrels)
Oil and mixed oil and gas field sizes - 1990
Example 2: Gas field sizes
Gas field sizes (in billion cubic feet known recovery) for 61 fields of 1
million BOE or more known recovery in the northern Michigan Silurian
reef play are given below.
Source of data is the Significant Oil and Gas Fields of the United States Data
Base, a product of NRG Associates, Inc. (1988). The version used included
discoveries up to and including 1990. Known recovery refers to the sum of
cumulative production plus reserves. Included in the NRG files are those
fields with at least 1 million BOE of known recovery and also those smaller,
but expected to eventually be revised to at least 1 million BOE.
Let X: Gas field size (billion cubic feet)
Example 3: Net pay thickness data
Suppose a geologist is studying a new drilling prospect in an area in which
20 wells have been drilled. One of the unknown variables to be considered
in the new prospect is net pay thickness. To get an idea of the possible
likelihoods and ranges of possible values he has tabulated the net pay
thickness values from each of the completed wells, as shown in the table
below. Source of data is Newendorp (1975).
X: Net pay thickness (feet)
4. Sampling techniques
The definitions on sampling techniques are stated in terms of observations,
but could be stated in terms of units.
Random sampling: a method of selecting a sample of size n from the
sampled population such that every possible sample of size n has an equal
chance of being selected.
Random sample: a sample that results from random sampling.
Alternatively, a set of n independent and identically distributed random
variables $X_1, X_2, \ldots, X_n$, each having the same population distribution.
Sampling with replacement: sampling in which each observation of a
sampled population can be selected more than once.
Sampling without replacement: sampling in which each observation of a
sampled population cannot be selected more than once.
Stratified random sampling: the sampled population is divided into
subpopulations and a random sample is taken from each subpopulation.
Example: A series of random samples taken independently from various
strata or depths.
Sampling proportional to size: biased sampling in which the chance of
being selected is proportional to the size of the unit.
The discovery process modeling approach suggested by Barouch and
Kaufman (1976) and Arps and Roberts (1958) relies on the following
postulates:
1. The discovery of pools within an area of exploratory interest can be
modeled statistically as sampling without replacement from an underlying
population of pools.
2. The discovery of a particular pool within the available population of
undiscovered pools is random with the probability of discovery being
proportional to the areal extent of the pool.
This introduces a sampling bias toward the largest pools and produces a
result where the largest pools are found more quickly.
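The two postulates can be illustrated with a small simulation. This is only a sketch under stated assumptions: the pool sizes are hypothetical, "size" stands in for areal extent, and the function `discovery_order` is an illustrative name rather than the procedure of the cited authors.

```python
import random

def discovery_order(pool_sizes, rng):
    """Sample pools without replacement, with chance proportional to size."""
    remaining = list(pool_sizes)
    order = []
    while remaining:
        # Draw one index with probability proportional to the remaining sizes.
        idx = rng.choices(range(len(remaining)), weights=remaining, k=1)[0]
        order.append(remaining.pop(idx))
    return order

rng = random.Random(42)
pools = [100.0, 50.0, 20.0, 10.0, 5.0, 2.0, 1.0]  # hypothetical pool sizes
trials = [discovery_order(pools, rng) for _ in range(2000)]

# Average discovery position (1 = first discovery) of the largest and smallest pools.
avg_pos_largest = sum(t.index(100.0) + 1 for t in trials) / len(trials)
avg_pos_smallest = sum(t.index(1.0) + 1 for t in trials) / len(trials)
print(f"largest pool:  average position {avg_pos_largest:.2f}")
print(f"smallest pool: average position {avg_pos_smallest:.2f}")
```

Across repeated trials the largest pool tends to appear early in the discovery sequence and the smallest pool late, which is the sampling bias described above.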
[Figure: diagram relating the Parent Population, the Sampled Population, and the Sample.]
5. Statistics
Statistic: any function of the observations comprising a sample; i.e., any
numerical quantity that is calculated from a sample $X_1, X_2, \ldots, X_n$. In
general, a statistic $Y$ can be expressed in functional notation as
$$Y = g(X_1, X_2, \ldots, X_n)$$
Important: A statistic is a random variable, because a function of random
variables is a random variable.
The reason the term statistic is defined so broadly, and not simply as a
numerical quantity "describing" a sample, is that in statistical inference we
calculate numerical values from the sample for the purpose of making
inferences concerning the population, and not necessarily to describe the
sample. It is important to realize that a parameter possesses a fixed value,
whereas a statistic can assume one of many possible values since it
depends on the sample; that is, the value of a statistic varies from sample to
sample.
Example: Oil and mixed oil and gas field sizes.
Oil and mixed oil and gas field sizes (in million barrels known recovery)
for 175 fields of 1 million BOE or more known recovery in the northern
Michigan Silurian reef play. Three statistics are
Minimum value: $x_{\min} = 0.11$
Maximum value: $x_{\max} = 14.25$
Total: $\sum_{i=1}^{175} x_i = 317.56$
In summary, we are interested in the population parameters and
sample statistics (defined later) shown in the chart.
Parameters Statistics
B. Descriptive Statistics
1. Tabular methods
Notation: $[x_1, x_2)$ means $x_1 \le x < x_2$
a. Frequency distribution
X: Oil or mixed oil and gas field size (million barrels)
Frequency distribution of oil and mixed oil and gas field sizes
Class  Interval      Frequency f
 1.    [0, 0.5)          10
 2.    [0.5, 1.0)        53
 3.    [1.0, 1.5)        41
 4.    [1.5, 2.0)        21
 5.    [2.0, 2.5)        12
 6.    [2.5, 3.0)         9
 7.    [3.0, 3.5)        10
 8.    [3.5, 4.0)         7
 9.    [4.0, 4.5)         1
10.    [4.5, 5.0)         4
11.    [5.0, 5.5)         4
12.    [5.5, 15)          3
                    n = 175
b. Relative frequency distribution
Relative frequency distribution of oil and mixed oil and gas field sizes
Relative Relative
Class Frequency Frequency Frequency
Interval f f/n %
c. Cumulative frequency distribution (less than)
Cumulative frequency distribution (less than) of oil and mixed oil and
gas field sizes
Class  Interval      Frequency f   Cumulative Frequency (less than)
 1.    [0, 0.5)          10             10
 2.    [0.5, 1.0)        53             63
 3.    [1.0, 1.5)        41            104
 4.    [1.5, 2.0)        21            125
 5.    [2.0, 2.5)        12            137
 6.    [2.5, 3.0)         9            146
 7.    [3.0, 3.5)        10            156
 8.    [3.5, 4.0)         7            163
 9.    [4.0, 4.5)         1            164
10.    [4.5, 5.0)         4            168
11.    [5.0, 5.5)         4            172
12.    [5.5, 15)          3            175
                    n = 175
d. Relative cumulative frequency distribution (less than)
Also called cumulative proportion and cumulative percentage.
Relative cumulative frequency distribution (less than) of oil and mixed
oil and gas field sizes
Class  Interval      Frequency f   Cumulative Frequency   Relative Cumulative    Relative Cumulative
                                   (less than)            Frequency Proportion   Frequency Percentage
 1.    [0, 0.5)          10             10                     0.06                    6%
 2.    [0.5, 1.0)        53             63                     0.36                   36%
 3.    [1.0, 1.5)        41            104                     0.59                   59%
 4.    [1.5, 2.0)        21            125                     0.71                   71%
 5.    [2.0, 2.5)        12            137                     0.78                   78%
 6.    [2.5, 3.0)         9            146                     0.83                   83%
 7.    [3.0, 3.5)        10            156                     0.89                   89%
 8.    [3.5, 4.0)         7            163                     0.93                   93%
 9.    [4.0, 4.5)         1            164                     0.94                   94%
10.    [4.5, 5.0)         4            168                     0.96                   96%
11.    [5.0, 5.5)         4            172                     0.98                   98%
12.    [5.5, 15)          3            175                     1.00                  100%
                    n = 175
e. Cumulative frequency distribution (more than)
Cumulative frequency distribution (more than) of oil and mixed oil and
gas field sizes
Class  Interval      Frequency f   Cumulative Frequency (more than)
 1.    [0, 0.5)          10            175
 2.    [0.5, 1.0)        53            165
 3.    [1.0, 1.5)        41            112
 4.    [1.5, 2.0)        21             71
 5.    [2.0, 2.5)        12             50
 6.    [2.5, 3.0)         9             38
 7.    [3.0, 3.5)        10             29
 8.    [3.5, 4.0)         7             19
 9.    [4.0, 4.5)         1             12
10.    [4.5, 5.0)         4             11
11.    [5.0, 5.5)         4              7
12.    [5.5, 15)          3              3
                    n = 175
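All of the tabular summaries above derive from the single column of class frequencies. A short sketch (variable names are illustrative) that rebuilds the relative, "less than", and "more than" columns from the frequencies:

```python
from itertools import accumulate

# Class frequencies f for the 12 class intervals of the field size data.
freq = [10, 53, 41, 21, 12, 9, 10, 7, 1, 4, 4, 3]
n = sum(freq)

rel_freq = [f / n for f in freq]
# "Less than": number of observations below each upper class boundary.
cum_less = list(accumulate(freq))
# "More than": number of observations at or above each lower class boundary.
cum_more = [n - c + f for c, f in zip(cum_less, freq)]
rel_cum_less = [round(c / n, 2) for c in cum_less]
rel_cum_more = [round(c / n, 2) for c in cum_more]

for i, f in enumerate(freq):
    print(f"{i + 1:2d}  f={f:3d}  f/n={rel_freq[i]:.3f}  "
          f"less={cum_less[i]:3d} ({rel_cum_less[i]:.2f})  "
          f"more={cum_more[i]:3d} ({rel_cum_more[i]:.2f})")
```

The "more than" column is the complement of the "less than" column shifted by one class, which is why the first entry equals n.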
f. Relative cumulative frequency distribution (more than)
Also called cumulative proportion and cumulative percentage.
Relative cumulative frequency distribution (more than) of oil and mixed
oil and gas field sizes
Relative Relative
Cumulative Cumulative Cumulative
Class Frequency Frequency Frequency Frequency
Interval f (more than) Proportion Percentage
2. Pictorial methods
a. Frequency histogram
[Figure: frequency histogram; vertical axis 0 to 60, horizontal axis 0 to 15.]
[Figure: histogram with vertical axis 0 to 0.4, horizontal axis 0 to 15.]
[Figure: plot with vertical axis 0 to 1.00, horizontal axis 0 to 15.]
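Sample mean:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
For a sample of size n = 20: $\bar{x} = 109.75$.
For the sample of size n = 175: $\bar{x} = \frac{1}{175}\sum_{i=1}^{175} x_i$.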
b. Sample median
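Definition: If $X_1, X_2, \ldots, X_n$ represent a sample of size n, arranged in
increasing order of magnitude, then the sample median is defined by
the statistic
$$\tilde{X} = \begin{cases} X_{(n+1)/2} & \text{if } n \text{ is odd} \\[4pt] \dfrac{X_{n/2} + X_{(n/2)+1}}{2} & \text{if } n \text{ is even} \end{cases}$$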
c. Sample mode
Definition: If $X_1, X_2, \ldots, X_n$ represent a sample of size n, then the sample
mode M is that value of the sample that occurs most often or with the
maximum frequency. The mode may not exist, and when it does, it is
not necessarily unique.
4. Measures of variation
a. Sample variance
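Definition: If $X_1, X_2, \ldots, X_n$ represent a sample of size n, then the sample
variance is defined by the statistic
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$
The reason for using n-1 as the divisor will be explained later.
Theorem:
$$S^2 = \frac{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n-1)}$$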
b. Sample standard deviation
Definition: The sample standard deviation, denoted by S, is the positive
square root of the sample variance, i.e.,
$$S = \sqrt{S^2}$$
Note: The sample standard deviation S has the same units as the
random variable X.
c. Sample range
Definition: If $X_1, X_2, \ldots, X_n$ represent a sample of size n, and $X_{(n)}$ and
$X_{(1)}$ are, respectively, the largest and smallest observations in the
sample, then the sample range is defined by the statistic
$$R = X_{(n)} - X_{(1)}$$
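The measures of central tendency and variation defined above can be computed with Python's standard `statistics` module. The data used here are the nine porosity values of sample 1 in the table of 50 uniform samples given later in this report; the variable names are illustrative.

```python
import statistics

# Sample 1 (n = 9) from the table of 50 samples of porosity (%).
x = [14.38, 13.01, 17.18, 21.24, 21.04, 13.92, 15.61, 15.61, 15.88]
n = len(x)

mean = statistics.mean(x)
median = statistics.median(x)        # middle ordered value, since n is odd
mode = statistics.mode(x)            # most frequent value
variance = statistics.variance(x)    # divisor n - 1
std_dev = statistics.stdev(x)        # positive square root of the variance
sample_range = max(x) - min(x)

# Computing form of the variance from the theorem.
shortcut = (n * sum(v * v for v in x) - sum(x) ** 2) / (n * (n - 1))

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
print(f"variance = {variance:.3f}, s.d. = {std_dev:.3f}, range = {sample_range:.2f}")
```

The definitional and computing forms of the variance agree up to floating-point rounding, and the mean matches the value listed for sample 1 in the table.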
C. Sampling Distributions
Recall: A statistic is a random variable.
A sampling distribution is the probability distribution of a statistic.
1. Sampling distribution of the mean
a. Mean and standard deviation of $\bar{X}$
Theorem: Given a random sample of size n from any population
with mean $\mu$ and standard deviation $\sigma$, then
$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \sigma/\sqrt{n}$$
Example: Given a random sample of size n = 9 from any population of
porosity (%) with $\mu = 18$ and $\sigma = 3$, then
$$\mu_{\bar{X}} = 18 \quad \text{and} \quad \sigma_{\bar{X}} = 3/\sqrt{9} = 1$$
b. Sampling distribution of $\bar{X}$
Theorem: Given a random sample of size n from a normal population
with mean $\mu$ and standard deviation $\sigma$, then
$\bar{X}$ has a normal distribution with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
Therefore,
$$P(\bar{X} < x) = P\left(Z < \frac{x - \mu}{\sigma/\sqrt{n}} = z\right)$$
Example: Given a random sample of size n = 9 from a normal
population of porosity (%) with $\mu = 18$ and $\sigma = 3$, then
$\bar{X}$ has a normal distribution with $\mu_{\bar{X}} = 18$ and $\sigma_{\bar{X}} = 1$
$$P(\bar{X} < 20) = P\left(Z < \frac{20 - 18}{1} = 2\right) = 0.9772 \quad \text{from Table A.1}$$
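The table lookup in this example can be reproduced numerically. The sketch below evaluates the standard normal cumulative probability with `math.erf` instead of Table A.1; the function names are illustrative.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_mean_below(x, mu, sigma, n):
    """P(sample mean < x) for a normal population, using sigma / sqrt(n)."""
    z = (x - mu) / (sigma / math.sqrt(n))
    return z, normal_cdf(z)

z, p = prob_mean_below(20, mu=18, sigma=3, n=9)
print(f"z = {z:.2f}, P = {p:.4f}")
```

For the porosity example this gives z = 2 and a probability that agrees with the tabled value to four decimal places.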
[Figure: sampling distribution of $\bar{X}$ and population distribution of $X$; horizontal axis: Porosity (%).]
2. Central Limit Theorem
a. One of the most amazing theorems in all of mathematics
Theorem: Given a random sample of size n from any population
with mean $\mu$ and standard deviation $\sigma$, then
$\bar{X}$ has approximately a normal distribution with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$
if n is large (n > 30).
Therefore,
$$P(\bar{X} < x) \approx P\left(Z < \frac{x - \mu}{\sigma/\sqrt{n}}\right)$$
b. Example: Given a random sample of size n = 36 from a uniform
population of porosity (%) with parameters a = 12.8 and b = 23.2.
The population mean and standard deviation are
$$\mu = \frac{a + b}{2} = \frac{12.8 + 23.2}{2} = 18.0$$
$$\sigma = \frac{b - a}{\sqrt{12}} = \frac{23.2 - 12.8}{\sqrt{12}} = 3.0$$
$\bar{X}$ has approximately a normal distribution with
$\mu_{\bar{X}} = \mu = 18$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 3/\sqrt{36} = 0.5$.
$$P(\bar{X} < 19) = P\left(Z < \frac{19 - 18}{0.5} = 2\right) = 0.9772 \quad \text{from Table A.1}$$
c. Even a small sample size n gives a bell-shaped distribution
Example: Given a random sample of size n = 9 from a uniform
population of porosity (%) with parameters a = 12.8 and b = 23.2.
$X$ has a uniform distribution ($\mu = 18$, $\sigma = 3$)
$\bar{X}$ has approximately a bell-shaped distribution with
$\mu_{\bar{X}} = \mu = 18$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 3/\sqrt{9} = 1$.
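A simulation makes the claim concrete. The following sketch draws many samples of size 9 from the uniform porosity population and summarizes the resulting sample means; the seed and the number of replications (20,000) are arbitrary choices for illustration, not values from the report.

```python
import random
import statistics

rng = random.Random(1)      # fixed seed so the run is repeatable
a, b, n = 12.8, 23.2, 9
replications = 20000        # arbitrary illustration size

# Draw many samples of size n and record each sample mean.
means = [
    statistics.fmean([rng.uniform(a, b) for _ in range(n)])
    for _ in range(replications)
]

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
# Share of sample means within one standard error (1.0) of the population mean.
within_one = sum(abs(m - 18.0) < 1.0 for m in means) / replications

print(f"mean of sample means = {mean_of_means:.3f}")
print(f"s.d. of sample means = {sd_of_means:.3f}")
print(f"share within 18 +/- 1 = {within_one:.3f}")
```

The simulated means should center near 18 with a standard deviation near 1, and roughly two thirds of them should fall within one standard error of the population mean, as a bell-shaped distribution would suggest.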
50 samples of size n = 9 from Uniform (a = 12.8, b = 23.2)
Sample x1 x2 x3 x4 x5 x6 x7 x8 x9 Mean
1 14.38 13.01 17.18 21.24 21.04 13.92 15.61 15.61 15.88 16.43
2 16.64 18.49 22.01 17.93 17.69 13.58 20.78 21.27 16.29 18.30
3 16.68 20.83 20.12 16.29 14.96 22.82 18.81 17.03 14.18 17.97
4 16.90 18.49 16.49 14.43 16.80 20.54 18.08 23.10 21.95 18.53
5 23.03 13.58 14.00 17.48 13.90 22.46 21.61 14.28 15.09 17.27
6 17.81 15.96 20.44 14.88 16.65 22.35 16.75 16.97 18.10 17.77
7 12.99 21.35 14.75 21.04 21.37 19.55 13.57 19.20 21.76 18.40
8 12.87 19.51 22.34 22.86 18.20 22.66 21.59 21.61 19.33 20.11
9 15.94 19.47 16.14 15.05 22.21 16.93 14.72 16.89 16.46 17.09
10 15.10 19.53 22.90 19.56 17.97 21.69 20.25 13.92 21.66 19.18
11 22.67 14.05 13.27 16.26 15.07 15.76 19.33 21.09 22.33 17.76
12 22.00 13.65 21.37 19.35 21.07 21.04 19.39 19.26 14.28 19.05
13 15.63 21.77 18.65 21.33 16.12 14.79 18.90 16.98 15.00 17.69
14 17.08 14.27 22.16 14.64 17.07 13.16 21.55 17.57 19.34 17.43
15 22.81 20.46 15.45 14.49 15.04 22.00 13.91 17.58 16.03 17.53
16 15.53 20.08 21.56 17.38 18.27 18.72 14.39 17.43 20.32 18.19
17 13.01 19.61 21.98 20.90 15.92 13.63 21.30 15.14 19.41 17.88
18 19.18 15.01 20.65 13.23 22.54 16.30 22.23 16.04 22.84 18.67
19 17.12 18.45 13.95 18.61 22.94 14.87 21.97 17.20 15.08 17.80
20 19.84 17.08 16.46 15.35 18.65 15.82 13.55 13.90 12.96 15.95
21 15.68 22.33 16.65 14.51 14.41 21.47 22.93 19.70 13.62 17.92
22 13.46 20.58 16.90 20.43 20.30 22.88 17.77 16.36 17.55 18.47
23 14.31 19.84 17.05 21.27 14.44 18.78 20.21 21.50 19.15 18.51
24 15.28 16.86 18.55 13.48 20.34 22.05 12.83 14.57 14.85 16.53
25 20.25 20.55 22.50 14.04 21.47 17.48 20.02 14.85 21.74 19.21
26 19.91 19.52 21.39 21.28 22.46 21.21 17.97 22.88 19.24 20.65
27 13.02 19.99 14.25 16.18 19.69 21.50 18.46 18.86 22.65 18.29
28 15.16 20.41 13.76 16.82 21.75 19.77 19.70 15.85 15.92 17.68
29 18.88 20.42 21.49 14.08 19.72 17.15 16.31 17.91 18.44 18.27
30 13.30 13.09 13.28 20.33 19.25 18.45 21.45 21.80 17.46 17.60
31 22.34 20.51 17.68 19.51 17.13 16.45 16.63 19.21 14.63 18.23
32 14.93 13.60 16.16 13.48 13.11 20.06 19.46 20.60 15.17 16.29
33 20.70 19.93 13.12 22.24 13.69 17.43 16.12 20.38 17.04 17.85
34 19.02 13.01 20.98 19.44 15.26 22.28 13.49 16.16 20.56 17.80
35 21.44 14.21 21.78 19.03 17.22 23.17 22.71 19.22 18.90 19.74
36 15.76 21.59 21.02 18.81 19.60 20.23 18.48 16.44 15.73 18.63
37 13.88 17.17 14.65 13.97 21.41 19.22 15.08 13.87 20.92 16.68
38 22.91 21.47 19.82 22.28 14.39 22.01 17.16 16.87 16.49 19.27
39 14.58 16.96 21.37 19.98 22.41 15.33 22.86 14.50 21.72 18.86
40 15.94 20.42 20.01 21.24 20.14 21.38 19.32 13.53 16.38 18.71
41 18.45 16.40 21.48 23.08 15.33 14.36 16.88 19.94 14.68 17.84
42 14.27 16.67 13.30 14.71 23.01 14.76 15.65 19.59 21.00 17.00
43 23.06 21.63 22.18 21.26 13.54 13.83 16.96 20.83 19.59 19.21
44 21.10 14.95 18.53 18.03 14.64 17.18 20.72 20.84 20.87 18.54
45 20.11 20.29 21.60 14.21 14.33 17.63 17.09 21.80 17.28 18.26
46 22.22 14.06 18.14 13.72 16.44 14.96 19.87 19.09 21.24 17.75
47 15.37 12.89 17.83 18.78 14.46 18.71 20.58 19.50 18.35 17.39
48 17.31 15.17 14.09 15.15 14.32 13.83 16.77 19.78 20.84 16.36
49 17.42 19.83 15.89 19.83 16.42 18.86 14.16 20.00 20.16 18.06
50 19.50 22.30 20.58 20.21 22.32 14.21 15.80 16.39 22.05 19.26
[Figure: sampling distribution of $\bar{X}$ compared with the population distribution of $X$; percentage labels 38%, 32%, 14%, 10%; horizontal axis: Porosity (%), 12 to 21.]
$X$ has a uniform distribution ($\mu = 18$, $\sigma = 3$)
3. Normal probability paper
[Figure: normal probability paper; probability scale from 99.99 to 0.01.]
Nonparametric (or distribution-free) statistics: statistical methods
based on no assumption of the population distribution (distribution-
free assumption).
There are two basic types of inference:
1. Statistical estimation
2. Tests of hypotheses
Statistical estimation: The type of inference is an estimate of a
population aspect.
Example: Estimate the population mean $\mu$.
Tests of hypotheses: The type of inference is a test of hypothesis of a
population aspect.
Example: Test the hypothesis of a lognormal population.
Assumptions of a statistical method: the conditions under which the
statistical method is valid.
Robust statistical method: a statistical method that is still valid even
with moderate violation of the assumptions.
Throughout the rest of this section, a random sample is assumed.
1. Statistical estimation
a. Point estimation
Point estimate of a population parameter $\theta$: a single number that
can be regarded as the most plausible value of $\theta$.
Example: A point estimate of the porosity population mean $\mu$ is
18.5% (say).
Point estimator of a population parameter $\theta$: a statistic $Y$ that is
used to obtain a point estimate of $\theta$.
Examples:
1. A point estimator of the population mean $\mu$ is the sample mean $\bar{X}$.
2. A point estimator of the population mean $\mu$ is the sample median $\tilde{X}$.
Because a point estimator is a statistic, a point estimator has a
sampling distribution.
Example: For a normal population, $\bar{X}$ has a normal sampling
distribution.
Standard error of a point estimator $Y$: the standard deviation of $Y$.
Example: The standard error of $\bar{X}$ is $\sigma/\sqrt{n}$.
Unbiased estimator of a population parameter $\theta$: a point
estimator $Y$ such that $E(Y) = \theta$ for every possible value of $\theta$.
Examples:
1. An unbiased estimator of $\mu$ is $\bar{X}$ because $E(\bar{X}) = \mu$.
2. An unbiased estimator of $\sigma^2$ is $S^2$ because $E(S^2) = \sigma^2$.
This explains why n-1 is used as the divisor.
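The effect of the divisor can be seen in a small simulation. This sketch assumes a normal porosity population with mean 18 and standard deviation 3 (so the population variance is 9) and samples of size 9; the seed and replication count are arbitrary.

```python
import random
import statistics

rng = random.Random(7)          # fixed seed for repeatability
mu, sigma, n = 18.0, 3.0, 9     # assumed population and sample size

s2_divisor_n_minus_1 = []
s2_divisor_n = []
for _ in range(20000):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s2_divisor_n_minus_1.append(statistics.variance(sample))   # divisor n - 1
    s2_divisor_n.append(statistics.pvariance(sample))          # divisor n

avg_unbiased = statistics.fmean(s2_divisor_n_minus_1)
avg_biased = statistics.fmean(s2_divisor_n)
print(f"average with divisor n-1: {avg_unbiased:.2f}")
print(f"average with divisor n:   {avg_biased:.2f}")
```

On average the n-1 version lands near the population variance of 9, while the divisor-n version is too small by the factor (n-1)/n, that is, near 8 for n = 9.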
Principle of minimum variance unbiased estimation: Among all
unbiased estimators of $\theta$, choose the estimator that has minimum
variance.
Minimum variance unbiased estimator (MVUE) of $\theta$:
the unbiased estimator that has minimum variance.
Example: For a normal population,
1. Both $\bar{X}$ and $\tilde{X}$ are unbiased estimators of the population mean $\mu$.
2. The variance of $\bar{X}$ is smaller than the variance of $\tilde{X}$.
3. $\bar{X}$ is the MVUE of $\mu$.
Important: The best estimator of $\mu$ depends on the population
distribution.
If the population is not normal, the best estimator may not be $\bar{X}$.
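The comparison between the sample mean and the sample median for a normal population can be checked by simulation. The sketch below assumes the same normal porosity population (mean 18, standard deviation 3) and samples of size 9; the seed and replication count are arbitrary.

```python
import random
import statistics

rng = random.Random(11)   # fixed seed for repeatability
means, medians = [], []
for _ in range(10000):
    sample = [rng.gauss(18.0, 3.0) for _ in range(9)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Both estimators center on the population mean, but with different spread.
avg_mean, avg_median = statistics.fmean(means), statistics.fmean(medians)
var_mean, var_median = statistics.variance(means), statistics.variance(medians)
print(f"average of means = {avg_mean:.2f}, average of medians = {avg_median:.2f}")
print(f"variance of means = {var_mean:.3f}, variance of medians = {var_median:.3f}")
```

Both averages sit near 18, consistent with unbiasedness, while the variance of the medians is noticeably larger than the variance of the means.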
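[Figure: sampling distributions of two estimators.] $Y_1$ is an unbiased estimator of $\theta$, while $Y_2$ is not.
[Figure: sampling distributions of two unbiased estimators.] Variance of unbiased estimator $Y_1$ < variance of unbiased estimator $Y_2$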
b. Interval estimation
Interval estimate of a population parameter $\theta$: an interval of
numbers of the form $y_L < \theta < y_U$.
Example: An interval estimate of the porosity population mean $\mu$
is $17.5 < \mu < 19.2$ (say).
Confidence interval of a population parameter $\theta$: an interval such
that $P(Y_L < \theta < Y_U) = 1 - \alpha$, for $0 < \alpha < 1$.
The end points $Y_L$ and $Y_U$ are called the lower and upper
confidence limits.
Example: A 95% confidence interval for $\mu$
Suppose that the parameter of interest is a population mean $\mu$ and
the following assumptions are made:
1. The population distribution is normal, and
2. The value of the population standard deviation $\sigma$ is known.
Recall: The sample mean $\bar{X}$ is normally distributed such that
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
has a standard normal distribution. Then
$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$$
Manipulating the inequalities to get the form $Y_L < \mu < Y_U$:
$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$
Example: A 95% confidence interval for the porosity population
mean $\mu$ with the above assumptions and given quantities
$\sigma = 3.0$, $n = 16$, and $\bar{x} = 18.5$.
$$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} = 18.5 \pm (1.96)\frac{3.0}{\sqrt{16}} = 18.5 \pm 1.5 = (17.0, 20.0)$$
A 95% confidence interval for $\mu$ is $17.0 < \mu < 20.0$.
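The interval in this example is a direct evaluation of the formula. A minimal sketch, with an illustrative function name and the 1.96 multiplier passed as a default argument:

```python
import math

def z_confidence_interval(xbar, sigma, n, z=1.96):
    """Confidence interval for the mean when sigma is known (z = 1.96 for 95%)."""
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

lower, upper = z_confidence_interval(xbar=18.5, sigma=3.0, n=16)
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

The unrounded half-width is 1.47, so the limits are 17.03 and 19.97 before rounding to the one-decimal values quoted in the example.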
2. Tests of hypotheses
a. Z test for $\mu$
Parameter of interest: population mean $\mu$
Assumptions:
1. The population distribution is normal, and
2. The value of the population standard deviation $\sigma$ is known.
Null hypothesis $H_0$: $\mu = \mu_0$
Alternative hypothesis $H_1$: $\mu < \mu_0$, $\mu > \mu_0$, or $\mu \ne \mu_0$
Test statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
Significance level: $\alpha = P(H_0 \text{ is rejected when } H_0 \text{ is true})$
Critical region (one for each alternative hypothesis, in the order listed):
Reject $H_0$ if $z < -z_{\alpha}$; $z > z_{\alpha}$; or $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
Take random sample of size n: $\bar{x}$
Compute test statistic: $z$
Decision: Decide whether or not $H_0$ should be rejected
$\alpha = 0.05$
Critical region: $z < -1.645$
Random sample of size n = 25: $\bar{x} = 16.2$
Compute: $z = \dfrac{16.2 - 17}{3/\sqrt{25}} = -1.33$
Decision: Do not reject $H_0$ and conclude that the mean porosity is
not significantly less than 17%.
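The arithmetic of this lower-tailed test can be sketched as follows. The hypothesized mean of 17 and the standard deviation of 3 are read from the computation and decision lines of the example; the function name is illustrative.

```python
import math

def z_statistic(xbar, mu0, sigma, n):
    """Test statistic for the Z test of a population mean with known sigma."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

z = z_statistic(xbar=16.2, mu0=17.0, sigma=3.0, n=25)
critical = -1.645            # lower-tail critical value for alpha = 0.05
reject_h0 = z < critical     # lower-tailed alternative: mu < mu0

print(f"z = {z:.2f}, reject H0: {reject_h0}")
```

Because the computed z is not below the critical value, the null hypothesis is not rejected, matching the decision above.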
b. Chi-squared goodness-of-fit test
Definition: The lognormal distribution
A nonnegative r.v. $X$ has a lognormal distribution if
the r.v. $Y = \ln(X)$ has a normal distribution.
Parameters: mean $\mu$ and standard deviation $\sigma$ of the normal distribution
Example: Oil and mixed oil and gas field sizes
$H_0$: the population distribution is lognormal
$H_1$: the population distribution is not lognormal
Test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(F_i - E_i)^2}{E_i}$$
where
$\chi^2$ has approximately a chi-squared distribution.
$k$ is the number of classes.
$F_i$ is the observed frequency of class i.
$E_i = n p_i$ is the estimated expected frequency ($E_i > 5$ for every i).
$p_i$ is the estimated probability of an observation falling in class i.
$n$ is the sample size.
Significance level: $\alpha = P(H_0 \text{ is rejected when } H_0 \text{ is true}) = 0.05$
Critical region:
if $\chi^2 > \chi^2_{\alpha,\,k-1}$, reject $H_0$
Chi-squared goodness-of-fit test

Columns: f_i = observed frequency; b_i = upper class boundary;
z_i = (ln(b_i) - 0.31)/0.74; p_i = estimated class probability;
e_i = n p_i = estimated expected frequency.

Class  Interval     f_i   b_i   ln(b_i)   z_i     P(Z<z_i)   p_i      e_i      Combined e_i   (f_i-e_i)^2/e_i
 1.    [0, 0.5)     10    0.5   -0.69     -1.35   0.0885     0.0885   15.49                   1.95
 2.    [0.5, 1.0)   53    1.0    0        -0.42   0.3372     0.2487   43.52                   2.07
 3.    [1.0, 1.5)   41    1.5    0.41      0.14   0.5557     0.2185   38.24                   0.20
 4.    [1.5, 2.0)   21    2.0    0.69      0.51   0.6950     0.1393   24.38                   0.47
 5.    [2.0, 2.5)   12    2.5    0.92      0.82   0.7939     0.0989   17.31                   1.63
 6.    [2.5, 3.0)    9    3.0    1.10      1.07   0.8577     0.0638   11.17                   0.42
 7.    [3.0, 3.5)   10    3.5    1.25      1.27   0.8980     0.0403    7.05                   1.23
 8.    [3.5, 4.0)    7    4.0    1.39      1.46   0.9278     0.0298    5.22                   0.61
 9.    [4.0, 4.5)    1    4.5    1.50      1.61   0.9463     0.0185    3.24
10.    [4.5, 5.0)    4    5.0    1.61      1.76   0.9608     0.0144    2.52    7.35           0.37
11.    [5.0, 5.5)    4    5.5    1.70      1.88   0.9699     0.0091    1.59
12.    [5.5, 15)     3    ∞      ∞         ∞      1          0.0301    5.27                   0.98
       n = 175                                               0.9999   175.00   chi-squared =  9.93

Decision: Because $\chi^2 = 9.93 < 14.067$, the critical value for $\alpha = 0.05$ with 7 degrees of freedom (10 classes after combining, less 1, less the 2 estimated parameters), $H_0$ is not rejected, so the lognormal model provides an acceptable fit for the distribution of oil and mixed oil
and gas field sizes.
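The test statistic can be recomputed from the observed and expected frequencies in the table, with classes 9 to 11 combined into a single class (observed 9, expected 7.35). A minimal sketch with illustrative variable names; note that with an observed frequency of 10 in class 7 the class 7 term, (10 - 7.05)^2 / 7.05, evaluates to about 1.23 and the total to about 9.9, which is still well below the critical value of 14.067.

```python
# Observed and expected frequencies after combining classes 9 to 11.
observed = [10, 53, 41, 21, 12, 9, 10, 7, 9, 3]
expected = [15.49, 43.52, 38.24, 24.38, 17.31, 11.17, 7.05, 5.22, 7.35, 5.27]

# Contribution of each class to the chi-squared statistic.
terms = [(f - e) ** 2 / e for f, e in zip(observed, expected)]
chi_squared = sum(terms)

critical_value = 14.067      # value used in the decision above (alpha = 0.05)
reject_h0 = chi_squared > critical_value

for i, t in enumerate(terms, start=1):
    print(f"class {i:2d}: {t:.2f}")
print(f"chi-squared = {chi_squared:.2f}, reject H0: {reject_h0}")
```

Every expected frequency in the combined table exceeds 5, which is the condition stated for the approximation.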
c. Lognormal probability paper
Example: Oil and mixed oil and gas field sizes (million barrels) for 175 fields of 1
million BOE or more known recovery in the northern Michigan Silurian
reef play.
[Figure: lognormal probability paper plot of the field sizes; horizontal axis: lognormal cumulative percentage.]
Regression and Correlation
Regression
1. Simple linear regression
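Straight line
$$y = a + bx$$
where
$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} \quad \text{and} \quad a = \bar{y} - b\bar{x}$$
$$y = bx$$
where
$$b = \frac{\sum xy}{\sum x^2}$$
$$y = a + bx$$
where
$$b = s_y/s_x = \sqrt{\frac{n\sum y^2 - \left(\sum y\right)^2}{n\sum x^2 - \left(\sum x\right)^2}} \quad \text{and} \quad a = \bar{y} - b\bar{x}$$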
Correlation
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
measures the strength of the linear relationship between two variables, y and x.
2. Coefficient of Determination
$$r^2 = \frac{SSR}{SST}$$
measures what proportion of the total variation in the response y is accounted for by
the fitted regression model and is reported as a percentage ($r^2 \times 100\%$).
SSR is called the regression sum of squares and reflects the amount of variation in the
y values explained by the model, e.g., a postulated straight line.
SST is called the total corrected sum of squares and reflects the total amount of
variation in the y values.
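The least-squares slope, intercept, correlation coefficient, and coefficient of determination can all be computed from the same five sums. The sketch below uses a small hypothetical data set (not from the report) and an illustrative function name; it also checks that the squared correlation equals SSR/SST for the fitted line.

```python
import math

def linear_fit(x, y):
    """Least-squares line y = a + b x and correlation r from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    syy = sum(yi * yi for yi in y)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = sy / n - b * sx / n
    r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return a, b, r

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, r = linear_fit(x, y)

# Coefficient of determination as SSR / SST.
y_bar = sum(y) / len(y)
ssr = sum((a + b * xi - y_bar) ** 2 for xi in x)
sst = sum((yi - y_bar) ** 2 for yi in y)

print(f"a = {a:.3f}, b = {b:.3f}, r = {r:.4f}")
print(f"r^2 = {r * r:.4f}, SSR/SST = {ssr / sst:.4f}")
```

For simple linear regression the two routes to the coefficient of determination give the same number, which is why it can be reported either as the squared correlation or as the share of variation explained.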
Transformations
When log (...) appears, either a base 10 or base e logarithm can be used.
[Figure: transformation curve panels.]
[Figure: depth plot; legible annotations: Z/10,279 and Ro = 0.37.]
1 0.58 1.4
2 0.80 0.68
3 0.58 1.49
4 0.86 0.66
5 0.53 1.2
6 0.28 4.95
7 0.30 3.28
8 1.23 0.20
9 0.56 0.96
10 0.52 1.17
11 0.55 1.00
12 0.51 1.57
13 0.63 0.69
14 0.56 1.04
15 0.47 1.02
16 0.52 0.83
17 0.56 1.18
18 0.70 0.90
19 0.79 0.56
20 1.4 0.32
21 1.34 0.26
22 1.35 0.23
23 0.69 0.45
24 0.32 2.86
25 0.35 2.27
26 0.37 2.64
27 0.36 3.39
28 0.44 2.46
29 0.66 1.26
30 0.98 0.55
31 0.79 0.94
32 3.33 0.39
33 3.86 0.24
34 1.20 0.78
35 1.58 0.41
36 0.77 0.44
37 0.91 0.40
38 0.61 0.60
[Figure: scatter plot; axis label: Face Cleat Spacing (inches).]
An exponential relationship between face cleat spacing y and the reciprocal of vitrinite
reflectance x:
$$y = 0.193\, e^{0.924/x}$$
X y
0.25 7.78
0.28 5.23
0.3 4.2
0.4 1.94
0.5 1.22
0.75 0.66
1 0.49
1.5 0.36
2 0.31
2.5 0.28
3 0.26
3.5 0.25
4 0.24
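The exponential relationship becomes a straight line after transformation: ln(y) is linear in 1/x. As a sketch, an ordinary least-squares fit of ln(y) on 1/x applied to the tabulated (x, y) values should approximately recover the two constants of the model; because the tabulated y values are rounded, the recovered constants are close to, but not exactly, 0.193 and 0.924.

```python
import math

# Tabulated values of x (vitrinite reflectance) and y (face cleat spacing).
x = [0.25, 0.28, 0.3, 0.4, 0.5, 0.75, 1, 1.5, 2, 2.5, 3, 3.5, 4]
y = [7.78, 5.23, 4.2, 1.94, 1.22, 0.66, 0.49, 0.36, 0.31, 0.28, 0.26, 0.25, 0.24]

u = [1 / xi for xi in x]          # transformed predictor: 1/x
v = [math.log(yi) for yi in y]    # transformed response: ln(y)
n = len(u)

# Least-squares slope and intercept of v = ln(a) + b u.
su, sv = sum(u), sum(v)
suv = sum(ui * vi for ui, vi in zip(u, v))
suu = sum(ui * ui for ui in u)
b = (n * suv - su * sv) / (n * suu - su ** 2)
a = math.exp(sv / n - b * su / n)

print(f"y = {a:.3f} * exp({b:.3f} / x)")
```

The same pattern, transform, fit a line, then back-transform the intercept, applies to the other linearizing transformations in this section.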
[Figure: discovery rate plots. Axis labels: EXPLORATORY FOOTAGE (MILLION FEET), YEAR, BOE IN MILLIONS.]
NORTHERN MICHIGAN NIAGARAN PINNACLE REEFS (200)
CUMULATIVE BOE
         HISTORIC     MIDPOINT       TOTAL
         0.-2000.     2200.-3000.    2200.-4000.
EXPO     1033.0       19.1           21.8
HYPER    1033.0       66.1           106.8

CUMULATIVE BOE
         HISTORIC     MIDPOINT       TOTAL
         0.-2100.     2400.-3300.    2400.-4200.
EXPO     1038.0       16.7           18.2
HYPER    1038.0       58.1           82.8
Power function:
$$y = \alpha x^{\beta}$$
Taking logarithms,
$$y' = \alpha' + \beta x'$$
where $y' = \log y$, $\alpha' = \log \alpha$, and $x' = \log x$.
[Figure: log-log plot; vertical axis 0.5 to 30, horizontal axis: VITRINITE REFLECTANCE (%), 0.4 to 3.0. Source: Hester.]
e. Fractals
Even if the linear pattern of points is restricted to the right side of the cumulative
frequency distribution due to an economic truncation, a Pareto (or fractal)
distribution is suggested as a model for the parent population.
The Pareto distribution theory for this approach is established in Crovelli and Barton
(submitted).
Fit a straight line to the linear pattern on the right side of the distribution.
Cumulative frequency distribution (more than) - original data
of oil and mixed oil and gas field sizes
[Figure: log-log plot of the cumulative frequency distribution (vertical axis 1 to 1000) against oil or mixed oil and gas field size (million barrels), original data (horizontal axis 0.1 to 100).]
Computation of sums for
transformation of power function to linear form
n = 31
$\sum x' = 18.132$, $\sum y' = 36.475$
$b = -2.194$
$a = 288.246$
$r = -0.981$
$r^2 = 0.963$
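The summary values for the power function fit can be cross-checked against each other. The sketch below rests on two assumptions that the surviving text does not state outright: that 18.132 and 36.475 are the sums of the base-10 logarithms of x and y, and that 288.246 is the back-transformed intercept of the fitted line. Under those assumptions the intercept follows from the slope and the two means.

```python
n = 31
sum_log_x = 18.132   # assumed: sum of log10(x) over the 31 points
sum_log_y = 36.475   # assumed: sum of log10(y) over the 31 points
b = -2.194           # fitted slope on the log-log scale

# Intercept of the fitted line on the log scale, then back-transformed.
log_a = sum_log_y / n - b * sum_log_x / n
a = 10 ** log_a

def predicted(x):
    """Fitted power function y = a * x**b on the original scale."""
    return a * x ** b

r = -0.981
print(f"a = {a:.1f}, r^2 = {r ** 2:.3f}")
print(f"predicted value at x = 1: {predicted(1.0):.1f}")
```

The recomputed intercept is close to 288.2; the small difference is consistent with the slope having been rounded to three decimals. Squaring the rounded correlation likewise gives a value within rounding of 0.963.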
Selected References
Arps, J.J. and Roberts, T.G., 1958, Economics of drilling for Cretaceous oil on east flank
of Denver-Julesburg Basin: American Association of Petroleum Geologists
Bulletin, v. 42, p. 2549-2566.
Barouch, E. and Kaufman, G.M., 1976, Oil and gas discovery modelled as sampling
without replacement and proportional to random size: Sloan School Working
Paper No. 888-76.
Charpentier, R.R., 1989, A statistical analysis of the larger Silurian reefs in the northern
part of the Lower Peninsula of Michigan: U.S. Geological Survey Open-File
Report 89-216, 34 p.
Crovelli, R.A. and Balay, R.H., 1990, PROBDIST: Probability distributions for modeling
and simulation in the absence of data: U.S. Geological Survey Open-File Report
90-446-A, Documentation (paper copy) 51 p.; Open-File Report 90-446-B,
Executable program (5.25" diskette).
Crovelli, R.A. and Balay, R.H., 1992, LOGRAF: Lognormal graph for resource
assessment forecast: U.S. Geological Survey Open-File Report 92-679-A,
Documentation (paper copy) 30 p.; Open-File Report 92-679-B, Executable
program (5.25" diskette).
Crovelli, R.A. and Barton, C.C. [submitted], Fractals and the Pareto distribution applied
to petroleum field-size distributions, in Barton, C.C. and LaPointe, P.R., eds.,
Fractal geometry and its uses in the petroleum industry: American Association
of Petroleum Geologists Book Series Volume, 26 ms. p., 10 figs. (Director's
approval 28-Jan-1991, and BTR log-in 26-Dec-1990.)
Crovelli, R.A., 1984, Procedures for petroleum resource assessment used by the U.S.
Geological Survey: statistical and probabilistic methodology, in Masters, C.D.,
ed., Petroleum resource assessment: International Union of Geological Sciences,
pub. no. 17, p. 24-38.
Crovelli, R.A., 1984, U.S. Geological Survey probabilistic methodology for oil and gas
resource appraisal of the United States: Journal of the International Association
for Mathematical Geology, v. 16, no. 8, p. 797-808.
Dolton, G.L., Carlson K.H., Charpentier, R.R., Coury, A.B., Crovelli, R.A., Frezon, S.E.,
Khan, A.S., Lister, J.H., McMullin, R.H., Pike, R.S., Powers, R.B., Scott, E.W., and
Varnes, K.L., 1981, Estimates of undiscovered recoverable conventional resources
of oil and gas in the United States: U.S. Geological Survey Circular 860, 87 p.
Drew, L.J., 1990, Oil and gas forecasting: Reflections of a petroleum geologist: New
York, Oxford University Press, International Association for Mathematical
Geology, Studies in Mathematical Geology No. 2, 252 p.
Harbaugh, J.W., Doveton, J.H., and Davis, J.C., 1977, Probability methods in oil
exploration: New York, John Wiley, 269 p.
Hester, T.C., [submitted] Porosity trends of Pennsylvanian sandstones with respect to
thermal maturity and thermal regimes in the Anadarko basin, Oklahoma: U.S.
Geological Survey Bulletin. (Branch Chief approval and CTR log-in)
Hunter, R.L. and Mann, C.J., eds., 1992, Techniques for determining probabilities of
geologic events and processes: New York, Oxford University Press, International
Association for Mathematical Geology, Studies in Mathematical Geology, No. 4,
364 p.
Mast, R.F., Dolton, G.L., Crovelli, R.A., Root, D.H., Attanasi, E.D., Martin, P.E., Cooke,
L.W., Carpenter, G.B., Pecora, W.C., and Rose, M.B., 1989, Estimates of
undiscovered conventional oil and gas resources in the United States: A part of
the Nation's energy endowment: U.S. Geological Survey and Minerals
Management Service, 44 p.
Newendorp, P.O., 1975, Decision analysis for petroleum exploration: Tulsa, Oklahoma,
Petroleum Publishing Company, 668 p.
NRG Associates, Inc., 1992, The significant oil and gas fields of the United States
[through December 31, 1990]: Available from Nehring Associates, Inc., P.O. Box
1655, Colorado Springs, Colorado 80901.
Walpole, R.E. and Myers, R.H., 1989, Probability and statistics for engineers and
scientists: New York, Macmillan Publishing Company, 4th ed., 765 p.
Appendix: Statistical Tables
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
-3.4 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0002
-3.3 .0005 .0005 .0005 .0004 .0004 .0004 .0004 .0004 .0004 .0003
-3.2 .0007 .0007 .0006 .0006 .0006 .0006 .0006 .0005 .0005 .0005
-3.1 .0010 .0009 .0009 .0009 .0008 .0008 .0008 .0008 .0007 .0007
-3.0 .0013 .0013 .0013 .0012 .0012 .0011 .0011 .0011 .0010 .0010
-2.9 .0019 .0018 .0017 .0017 .0016 .0016 .0015 .0015 .0014 .0014
-2.8 .0026 .0025 .0024 .0023 .0023 .0022 .0021 .0021 .0020 .0019
-2.7 .0035 .0034 .0033 .0032 .0031 .0030 .0029 .0028 .0027 .0026
-2.6 .0047 .0045 .0044 .0043 .0041 .0040 .0039 .0038 .0037 .0036
-2.5 .0062 .0060 .0059 .0057 .0055 .0054 .0052 .0051 .0049 .0048
-2.4 .0082 .0080 .0078 .0075 .0073 .0071 .0069 .0068 .0066 .0064
-2.3 .0107 .0104 .0102 .0099 .0096 .0094 .0091 .0089 .0087 .0084
-2.2 .0139 .0136 .0132 .0129 .0125 .0122 .0119 .0116 .0113 .0110
-2.1 .0179 .0174 .0170 .0166 .0162 .0158 .0154 .0150 .0146 .0143
-2.0 .0228 .0222 .0217 .0212 .0207 .0202 .0197 .0192 .0188 .0183
-1.9 .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233
-1.8 .0359 .0352 .0344 .0336 .0329 .0322 .0314 .0307 .0301 .0294
-1.7 .0446 .0436 .0427 .0418 .0409 .0401 .0392 .0384 .0375 .0367
-1.6 .0548 .0537 .0526 .0516 .0505 .0495 .0485 .0475 .0465 .0455
-1.5 .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
-1.4 .0808 .0793 .0778 .0764 .0749 .0735 .0722 .0708 .0694 .0681
-1.3 .0968 .0951 .0934 .0918 .0901 .0885 .0869 .0853 .0838 .0823
-1.2 .1151 .1131 .1112 .1093 .1075 .1056 .1038 .1020 .1003 .0985
-1.1 .1357 .1335 .1314 .1292 .1271 .1251 .1230 .1210 .1190 .1170
-1.0 .1587 .1562 .1539 .1515 .1492 .1469 .1446 .1423 .1401 .1379
-0.9 .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611
-0.8 .2119 .2090 .2061 .2033 .2005 .1977 .1949 .1922 .1894 .1867
-0.7 .2420 .2389 .2358 .2327 .2296 .2266 .2236 .2206 .2177 .2148
-0.6 .2743 .2709 .2676 .2643 .2611 .2578 .2546 .2514 .2483 .2451
-0.5 .3085 .3050 .3015 .2981 .2946 .2912 .2877 .2843 .2810 .2776
-0.4 .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121
-0.3 .3821 .3783 .3745 .3707 .3669 .3632 .3594 .3557 .3520 .3483
-0.2 .4207 .4168 .4129 .4090 .4052 .4013 .3974 .3936 .3897 .3859
-0.1 .4602 .4562 .4522 .4483 .4443 .4404 .4364 .4325 .4286 .4247
-0.0 .5000 .4960 .4920 .4880 .4840 .4801 .4761 .4721 .4681 .4641
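The entries in the table above can be reproduced numerically. The following is a minimal Python sketch, assuming the table gives the standard normal cumulative probability P(Z <= z), with the row label supplying z to one decimal place and the column heading supplying the second decimal place. The function names normal_cdf and table_entry are illustrative and do not come from the report.

```python
import math


def normal_cdf(z: float) -> float:
    """Return P(Z <= z) for a standard normal variable Z."""
    # Identity: Phi(z) = 0.5 * erfc(-z / sqrt(2)).
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def table_entry(row: float, col: float) -> float:
    """Reproduce one entry of the negative-z table.

    row is the row label (for example -1.9) and col is the column
    heading (for example 0.06), so the lookup is made at z = row - col.
    The result is rounded to four decimals, as in the table.
    """
    return round(normal_cdf(row - col), 4)


if __name__ == "__main__":
    print(table_entry(-1.9, 0.06))  # 0.025, the entry for z = -1.96
```

For positive z, the symmetry of the normal curve gives P(Z <= z) = 1 - P(Z <= -z), so the same table (or the same function) covers both signs.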
ν .995 .99 .98 .975 .95 .90 .80 .75 .70 .50
1 .0000393 .000157 .000628 .000982 .00393 .0158 .0642 .102 .148 .455
2 .0100 .0201 .0404 .0506 .103 .211 .446 .575 .713 1.386
3 .0717 .115 .185 .216 .352 .584 1.005 1.213 1.424 2.366
4 .207 .297 .429 .484 .711 1.064 1.649 1.923 2.195 3.357
5 .412 .554 .752 .831 1.145 1.610 2.343 2.675 3.000 4.351
6 .676 .872 1.134 1.237 1.635 2.204 3.070 3.455 3.828 5.348
7 .989 1.239 1.564 1.690 2.167 2.833 3.822 4.255 4.671 6.346
8 1.344 1.646 2.032 2.180 2.733 3.490 4.594 5.071 5.527 7.344
9 1.735 2.088 2.532 2.700 3.325 4.168 5.380 5.899 6.393 8.343
10 2.156 2.558 3.059 3.247 3.940 4.865 6.179 6.737 7.267 9.342
11 2.603 3.053 3.609 3.816 4.575 5.578 6.989 7.584 8.148 10.341
12 3.074 3.571 4.178 4.404 5.226 6.304 7.807 8.438 9.034 11.340
13 3.565 4.107 4.765 5.009 5.892 7.042 8.634 9.299 9.926 12.340
14 4.075 4.660 5.368 5.629 6.571 7.790 9.467 10.165 10.821 13.339
15 4.601 5.229 5.985 6.262 7.261 8.547 10.307 11.036 11.721 14.339
16 5.142 5.812 6.614 6.908 7.962 9.312 11.152 11.912 12.624 15.338
17 5.697 6.408 7.255 7.564 8.672 10.085 12.002 12.792 13.531 16.338
18 6.265 7.015 7.906 8.231 9.390 10.865 12.857 13.675 14.440 17.338
19 6.844 7.633 8.567 8.907 10.117 11.651 13.716 14.562 15.352 18.338
20 7.434 8.260 9.237 9.591 10.851 12.443 14.578 15.452 16.266 19.337
21 8.034 8.897 9.915 10.283 11.591 13.240 15.445 16.344 17.182 20.337
22 8.643 9.542 10.600 10.982 12.338 14.041 16.314 17.240 18.101 21.337
23 9.260 10.196 11.293 11.688 13.091 14.848 17.187 18.137 19.021 22.337
24 9.886 10.856 11.992 12.401 13.848 15.659 18.062 19.037 19.943 23.337
25 10.520 11.524 12.697 13.120 14.611 16.473 18.940 19.939 20.867 24.337
26 11.160 12.198 13.409 13.844 15.379 17.292 19.820 20.843 21.792 25.336
27 11.808 12.879 14.125 14.573 16.151 18.114 20.703 21.749 22.719 26.336
28 12.461 13.565 14.847 15.308 16.928 18.939 21.588 22.657 23.647 27.336
29 13.121 14.256 15.574 16.047 17.708 19.768 22.475 23.567 24.577 28.336
30 13.787 14.953 16.306 16.791 18.493 20.599 23.364 24.478 25.508 29.336
ν .30 .25 .20 .10 .05 .025 .02 .01 .005 .001
1 1.074 1.323 1.642 2.706 3.841 5.024 5.412 6.635 7.879 10.827
2 2.408 2.773 3.219 4.605 5.991 7.378 7.824 9.210 10.597 13.815
3 3.665 4.108 4.642 6.251 7.815 9.348 9.837 11.345 12.838 16.268
4 4.878 5.385 5.989 7.779 9.488 11.143 11.668 13.277 14.860 18.465
5 6.064 6.626 7.289 9.236 11.070 12.832 13.388 15.086 16.750 20.517
6 7.231 7.841 8.558 10.645 12.592 14.449 15.033 16.812 18.548 22.457
7 8.383 9.037 9.803 12.017 14.067 16.013 16.622 18.475 20.278 24.322
8 9.524 10.219 11.030 13.362 15.507 17.535 18.168 20.090 21.955 26.125
9 10.656 11.389 12.242 14.684 16.919 19.023 19.679 21.666 23.589 27.877
10 11.781 12.549 13.442 15.987 18.307 20.483 21.161 23.209 25.188 29.588
11 12.899 13.701 14.631 17.275 19.675 21.920 22.618 24.725 26.757 31.264
12 14.011 14.845 15.812 18.549 21.026 23.337 24.054 26.217 28.300 32.909
13 15.119 15.984 16.985 19.812 22.362 24.736 25.472 27.688 29.819 34.528
14 16.222 17.117 18.151 21.064 23.685 26.119 26.873 29.141 31.319 36.123
15 17.322 18.245 19.311 22.307 24.996 27.488 28.259 30.578 32.801 37.697
16 18.418 19.369 20.465 23.542 26.296 28.845 29.633 32.000 34.267 39.252
17 19.511 20.489 21.615 24.769 27.587 30.191 30.995 33.409 35.718 40.790
18 20.601 21.605 22.760 25.989 28.869 31.526 32.346 34.805 37.156 42.312
19 21.689 22.718 23.900 27.204 30.144 32.852 33.687 36.191 38.582 43.820
20 22.775 23.828 25.038 28.412 31.410 34.170 35.020 37.566 39.997 45.315
21 23.858 24.935 26.171 29.615 32.671 35.479 36.343 38.932 41.401 46.797
22 24.939 26.039 27.301 30.813 33.924 36.781 37.659 40.289 42.796 48.268
23 26.018 27.141 28.429 32.007 35.172 38.076 38.968 41.638 44.181 49.728
24 27.096 28.241 29.553 33.196 36.415 39.364 40.270 42.980 45.558 51.179
25 28.172 29.339 30.675 34.382 37.652 40.646 41.566 44.314 46.928 52.620
26 29.246 30.434 31.795 35.563 38.885 41.923 42.856 45.642 48.290 54.052
27 30.319 31.528 32.912 36.741 40.113 43.194 44.140 46.963 49.645 55.476
28 31.391 32.620 34.027 37.916 41.337 44.461 45.419 48.278 50.993 56.893
29 32.461 33.711 35.139 39.087 42.557 45.722 46.693 49.588 52.336 58.302
30 33.530 34.800 36.250 40.256 43.773 46.979 47.962 50.892 53.672 59.703
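The critical values tabulated above can be checked numerically. The following is a minimal Python sketch, assuming the column headings are upper-tail probabilities and the row labels are integer degrees of freedom. It uses the standard closed-form series for the chi-squared tail probability with integer degrees of freedom, and a simple bisection search to invert it. The names chi2_sf and chi2_critical are illustrative and do not come from the report.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for chi-squared with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2))
    #   + sqrt(2x/pi) * exp(-x/2) * sum_{j=1}^{(df-1)/2} x^(j-1) / (1*3*...*(2j-1))
    term, total = 1.0, 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    tail = math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail


def chi2_critical(alpha: float, df: int) -> float:
    """Value x such that P(X > x) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Compare with the table row for 10 degrees of freedom, column .05.
    print(round(chi2_critical(0.05, 10), 3))
```

Bisection is used here only because it needs nothing beyond the standard library; a statistics package with an inverse chi-squared function would serve equally well.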
Index
A
Assumptions of a statistical method, 99
B
Basin, 4
Bayes' rule, 18
Bernoulli process, 15
Binomial distribution, 20, 26, 29, 33, 34, 62
Binomial random variable, 19
C
Central limit theorem, 93-95
Characterizing parameter, 62
Chebyshev's theorem, 34
Chi-squared goodness-of-fit test, 106, 107
Combinations, 8
Combinatorial analysis
Combinations, 8
Fundamental principle of counting, 8
Permutations, 8
Complement of event, 7
Complementary cumulative distribution function, 20, 23
Conditional probability, 12
Confidence interval, 103
Continuous data, 64
Continuous probability distributions
Cumulative distribution function, 23
Exponential, 53
Field size, 27
General graphs, 25
Graphs, 24
Lognormal, 51
Mean/standard deviation normal, 49
Minimum/maximum normal, 48
Normal, 44
Pareto, 55
Probability density function, 23
Seven-fractile histogram, 46
Three-fractile histogram, 47
Triangular, 58
Truncated exponential, 54
Truncated lognormal, 52
Truncated normal, 50
Truncated Pareto, 56
Uniform, 57
Uniform (rectangular) distribution, 23, 24
Continuous random variables
Concept, 22
Examples, 22
Correlation, 112
Counting techniques, 8
Cumulative distribution function, 20, 23
Cumulative frequency distribution (less than), 75
Cumulative frequency distribution (more than), 77
Cumulative frequency polygon (less than), 81
Cumulative frequency polygon (more than), 83
D
Dependence, 12, 17
Descriptive parameter, 62
Descriptive parameters
Examples, 36-38
Measures of central location, 29
Measures of variation, 33
Descriptive statistics, 98
Measures of central location, 85
Measures of variation, 88
Pictorial methods, 79
Tabular methods, 73
Discrete data, 64
Discrete probability distributions
Binomial distribution, 20
Complementary cumulative distribution, 20
Cumulative distribution function, 20
Graphs, 21
Probability histogram, 21
Probability mass function, 20
Discrete random variables, 19
Disjoint events, 7
E
Event, 5
Event relations
Disjoint events, 7
Intersection of events, 7
Mutually exclusive events, 7
Union of events, 7
Experiment, 5
Exponential distribution, 53
F
Field, 4
Field size distribution, 61
Finding-rate curves, 119-122
Fractals, 125-128, 130
Fractiles, 35
Frequency distribution, 73
Frequency histogram, 79
Fundamental principle of counting, 8
G
Graphs
Continuous probability distributions, 24
Discrete probability distributions, 21
I
Independence, 12, 16
Inferential statistics
Assumptions, 99
Descriptive statistics, 98
Nonparametric (distribution-free) statistics, 99
Parametric statistics, 98
Regression and correlation, 111, 112
Robust methods, 99
Statistical estimation, 99, 100
Tests of hypotheses, 99, 104, 105
Intersection of events, 7
Interval estimate, 103
Interval estimation
Confidence interval, 103
Interval estimate, 103
L
Lognormal distribution, 51, 106-110
Lognormal probability paper, 108-110
LOGRAF, 39-43
M
Mean, 29
Skewness, 32
Measures of central location
Mean, 29
Median, 30
Mode, 31
Sample mean, 85
Sample median, 86
Sample mode, 87
Measures of variation
Sample range, 90
Sample standard deviation, 89
Sample variance, 88
Standard deviation, 34
Variance, 33
Median, 30
Skewness, 32
Minimum variance unbiased estimation, 100
Mode, 31
Skewness, 32
Moments, 62
Monte Carlo simulation, 26-28
Comparison with analytic method, 28
Mutually exclusive events, 7
N
Net pay thickness data, 68
Nonparametric (distribution-free) statistics, 99
Normal distribution
Areas under normal curve, 44
Mean/standard deviation, 49
Minimum/maximum, 48
Probability density function, 44
Normal probability paper, 96, 97
P
Parameters
Characterizing parameter, 62
Descriptive parameter, 62
Moments, 62
Population mean, 62
Population parameter, 62
Population standard deviation, 62
Population variance, 62
Parametric statistics, 98
Parent population, 60
Pareto distribution, 55
Permutations, 8
Petroleum hierarchy
Basin, 4
Field, 4
Play, 4
Pool, 4
Prospect, 4
Region, 4
Physical sample, 64
Pictorial methods
Cumulative frequency polygon (less than), 81
Cumulative frequency polygon (more than), 83
Frequency histogram, 79
Relative cumulative frequency polygon (less than), 82
Relative cumulative frequency polygon (more than), 84
Relative frequency histogram, 80
Play, 4
Point estimate, 100
Point estimation
Minimum variance unbiased estimation, 100
Minimum variance unbiased estimator, 101, 102
Point estimate, 100
Point estimator, 100
Standard error, 100
Unbiased estimator, 100
Point estimator, 100
Pool, 4
Population distribution, 60
Population mean, 62
Population of observations, 60
Population of units, 60
Population parameter, 62
Population size, 60
Population standard deviation, 62
Population variance, 62
Populations
Parent population, 60
Population distribution, 60
Population of observations, 60
Population of units, 60
Population size, 60
Sampled population, 60
Power laws, 123, 124
Probability, 3
Application of rules, 14
Axiomatic definition, 10
Classical definition, 9
Event relations, 11
Relative frequency definition, 9
Rules, 13
Subjective definition, 10
Probability density function, 23
Probability mass function, 20
PROBDIST model menu, 45
Prospect, 4
Province, 4
R
Random sample, 69
Random sampling, 69
Random variable, 19
Region, 4
Regression, 111
Regression and correlation
Correlation, 112
Finding-rate curves, 119-122
Fractals, 125-129
Power laws, 123, 124
Regression, 111
Transformations, 113-118
Relative cumulative frequency distribution (less than), 76
Relative cumulative frequency distribution (more than), 78
Relative cumulative frequency polygon (less than), 82
Relative cumulative frequency polygon (more than), 84
Relative frequency distribution, 74
Relative frequency histogram, 80
Robust statistical methods, 99
Rule of total probability, 18
S
Sample mean, 85
Sample median, 86
Sample mode, 87
Sample of observations, 64
Sample of units, 64
Sample range, 90
Sample size, 64
Sample space, 5
Sample standard deviation, 89
Sampled population, 60
Samples
Continuous data, 64
Discrete data, 64
Gas field sizes, 67
Oil field sizes, 65, 66
Physical sample, 64
Sample of observations, 64
Sample of units, 64
Sample size, 64
Sampling concepts
Parameters, 62, 72
Populations, 60
Samples, 64
Sampling techniques, 69
Statistics, 71, 72
Sampling distribution of the mean, 91, 92
Sampling distributions, 72
Central limit theorem, 93-95
Normal probability paper, 96, 97
Sampling distribution of the mean, 91, 92
Sampling proportional to size, 69
Sampling techniques
Biased sampling, 70
Random sample, 69
Random sampling, 69
Sampling proportional to size, 69
Sampling with replacement, 69
Sampling without replacement, 69
Stratified random sampling, 69
Sampling with replacement, 69
Sampling without replacement, 69
Seven-fractile histogram, 46
Skewness, 32
Standard deviation, 34
Standard error, 100
Statistic, 71
Statistical estimation, 99
Interval estimation, 103
Point estimation, 100-102
Statistics, 59
Examples, 71
Statistic, 71
Stratified random sampling, 69
T
Tabular methods
Cumulative frequency distribution (less than), 75
Cumulative frequency distribution (more than), 77
Frequency distribution, 73
Relative cumulative frequency distribution (less than), 76
Relative cumulative frequency distribution (more than), 78
Relative frequency distribution, 74
Tests of hypotheses, 99
Chi-squared goodness-of-fit test, 106, 107
Lognormal probability paper, 108-110
Z test for μ, 104, 105
Three-fractile histogram, 47
Transformations, 113-118
Tree diagram, 6
Triangular distribution, 58
Truncated exponential distribution, 54
Truncated lognormal distribution, 52
Truncated normal distribution, 50
Truncated Pareto distribution, 56
U
Unbiased estimator, 100
Uniform distribution, 23, 24, 29, 57, 63
Union of events, 7
V
Venn diagram, 1, 6