An introduction to applied geostatistics

Part 1 – Introduction; Spatial Correlation

Overheads
D G Rossiter
Department of Earth Systems Analysis
International Institute for Geo-information Science & Earth Observation (ITC)
<http://www.itc.nl/personal/rossiter>

July 11, 2004


AN INTRODUCTION TO APPLIED GEOSTATISTICS 1

Topic: Resources
There are many resources, at various mathematical levels, some aimed at
particular applications. These lists are not comprehensive but should be good
starting points:

• Texts

• Web pages

• Computer programmes

D G ROSSITER

Texts: Applied

• Isaaks, E.H. and Srivastava, R.M., 1990. An introduction to applied
geostatistics. Oxford University Press, New York.

• Webster, R. and Oliver, M.A., 2001. Geostatistics for environmental scientists.
Wiley & Sons, Chichester.

• Goovaerts, P., 1997. Geostatistics for natural resources evaluation. Applied
Geostatistics. Oxford University Press, New York.


Texts: Mathematical

• Chilès, J.-P. and Delfiner, P., 1999. Geostatistics: modeling spatial uncertainty.
Wiley series in probability and statistics. John Wiley & Sons, New York.

• Christakos, G., 2000. Modern spatiotemporal geostatistics. Oxford University
Press, New York.

• Cressie, N., 1993. Statistics for spatial data. John Wiley & Sons, New York.

• Ripley, B.D., 1981. Spatial statistics. John Wiley and Sons, New York.


Texts: In the context of a particular application field

• Stein, A., Meer, F.v.d. and Gorte, B.G.F. (Editors), 1999. Spatial statistics for
remote sensing. Kluwer Academic, Dordrecht.

• Davis, J.C., 2002. Statistics and data analysis in geology. John Wiley & Sons,
New York.

• Fotheringham, A.S., Brunsdon, C. and Charlton, M., 2000. Quantitative
geography: perspectives on spatial data analysis. Sage Publications, London;
Thousand Oaks, Calif.

• Kitanidis, P.K., 1997. Introduction to geostatistics: applications to
hydrogeology. Cambridge University Press, Cambridge, England.


Web pages

• R: http://www.r-project.org/

• R spatial projects: http://sal.agecon.uiuc.edu/csiss/Rgeo/

• gstat: http://www.gstat.org/

• gslib: http://www.gslib.com/

• GEOEAS: http://www.epa.gov/ada/csmos/models/geoeas.html

• ILWIS: http://www.itc.nl/ilwis/

• ArcGIS Geostatistical Analyst:
http://www.esri.com/software/arcgis/arcgisxtensions/geostatistical/

• Geostatistical analysis tutor [Colorado (USA) School of Mines]:
http://uncert.mines.edu/tutor/

Computer programmes

• ILWIS 3.2 (ITC) [EUR 100]

• R [free] open-source environment for statistical computing and visualisation;
includes several relevant libraries, including

* gstat, by Pebesma
* spatial, by Ripley
* geoR, by Ribeiro & Diggle
* spdep, by Rowlingson & Diggle
* spatstat, by Baddeley & Turner (point pattern analysis)

• ArcGIS Geostatistical Analyst (ESRI) [USD 2,500 + ArcGIS base]

• PCRaster + gstat (Utrecht) [free]

• GeoEAS, GSLIB, Variowin, VESPER . . .


Topic: Introduction to Spatial Analysis

1. Concepts of space: geographic and feature spaces

2. What is special about spatial data?

3. Key concepts in spatial analysis

4. Measuring spatial correlation


What is “space”?

• A set of n continuous dimensions; dimension i has range $[x_{i,\min} \ldots x_{i,\max}]$

• Points are mathematical n-dimensional vectors: $\vec{x} = (x_1, x_2, \ldots, x_n)$

• Geographic vs. feature spaces


Geographic space

• Axes are 1-d lines

• One-dimensional: coordinates are on a line with respect to some origin (0):
$(x_1) = x$

• Two-dimensional: coordinates are on a grid with respect to some origin (0, 0):
$(x_1, x_2) = (x, y) = (E, N)$

• Three-dimensional: coordinates are grid and elevation from a reference
elevation: $(x_1, x_2, x_3) = (x, y, z) = (E, N, H)$

• Must transform latitude-longitude to grid coordinates in some 2-d projection;
distortions occur over large areas

• Can work directly with geographic coordinates, but not as a grid


Feature space

• Axes are the range of each variable

• Coordinates are values of variables, possibly transformed or combined

• Not included in the common use of the term “spatial” data or analysis

• But the observations may be related in this ‘space’ . . .

• . . . and we often plot variables in this space, e.g. 2-D scatterplots


What is special about spatial data (1)?

1. The location of a sample is an intrinsic part of its definition.

2. Values at sample points cannot be assumed to be independent.

3. That is, there may be a spatial structure to the data


• Classical statistics assumes independence, at least within sampling strata
• Major implications for sampling design and statistical inference

4. All data sets from a given area are implicitly related by their coordinates →
models of spatial structure

5. Data values may be related to their coordinates → spatial trend


What is special about spatial data (2)?


For points:

• Points within a dataset have a defined distance from each other

• Euclidean distance:

$$\vec{x}_1 = (x_{11}, x_{12}, \ldots, x_{1n})$$

$$\vec{x}_2 = (x_{21}, x_{22}, \ldots, x_{2n})$$

$$d(\vec{x}_1, \vec{x}_2) = \left[ \sum_{i=1}^{n} (x_{1i} - x_{2i})^2 \right]^{1/2}$$
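The distance formula can be sketched in base R as follows. The function name `euclid` and the coordinates are made up for illustration; `dist()` is the standard base R function for all pairwise distances.

```r
# Euclidean distance between two n-dimensional points (base R only).
euclid <- function(x1, x2) {
  stopifnot(length(x1) == length(x2))
  sqrt(sum((x1 - x2)^2))
}

p1 <- c(0, 0)          # hypothetical grid coordinates (E, N)
p2 <- c(3, 4)
d  <- euclid(p1, p2)   # a 3-4-5 triangle

# dist() computes every pairwise distance for a coordinate matrix
pts  <- rbind(p1, p2, c(6, 8))
dmat <- as.matrix(dist(pts))
```

The matrix form is the one needed later, since spatial covariance works on all point pairs at once.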


What is special about spatial data (3)?


For 2-d and 3-d objects:

1. Objects have a topology as well as distances


(a) Ex. adjacency, containment

For fields:

1. Fields have a simple implicit topology: adjacency of cells

2. Fields have an implicit distance metric, from the row & column positions (the
natural coordinate system of a field)


Key Concepts

• Spatial dependence: the value of a variable at a point in space is related to its
value at nearby points; knowing the values at these points allows us to predict
(with some degree of certainty) the value at the chosen point

• Spatial structure: the nature of the spatial relation: how far, and in what
directions, is the spatial dependence? How does the dependence vary with
distance and direction between points?

• Support of a sample: the physical dimensions it represents (n.b. may try to
predict to coarser or finer resolutions)


Topic: Exploratory spatial data analysis


Since spatial data were collected at known points in geographic space, we
should visualise them in that space.

• Distribution of sample points

• Postplots (values vs. locations): where are which values?

• Geographic postplots: with images, landuse maps etc. as background: does
there appear to be any explanation for the distribution of values?

• Spatial structure: range, direction, strength . . .

• Is there anisotropy? In what direction(s)?

• Do there seem to be several populations with distinct geographic distribution?


Point distribution
This shows how sample points are distributed in space.

• What was the sampling plan?

• Random or clustered?

• Are some areas over– or under–sampled?


Walker Lake: Distribution of points – All points


Walker Lake: Distribution of points – Variable U only


Values distribution (postplot)


The so-called postplot shows how the data values are distributed in space.

• Are values of nearby points similar to each other, or do the values appear to
be random?

• Does there appear to be a trend?

• Are there distinct clusters of high or low values?

• Is there any directional difference in clustering? (anisotropy)


Distribution of V values (postplot) – All points


Distribution of V values (postplot) – where Variable U was sampled


Meuse – Distribution of Log(Cadmium) in soils


Geographic postplot
This shows the postplot against a background that may explain the distribution of
samples or values. Examples:

• land cover or land use

• geologic or soil units

• structural geology


Meuse – Log(Cadmium) on a false-colour composite


Topic: Spatial correlation

1. Evidence of spatial correlation: Moran’s I

2. Computing spatial covariance

3. Summarizing spatial covariance with the experimental variogram

4. Visualising spatial structure with the experimental variogram


Spatial Correlation

• Question: are nearby points in geographic space also ‘nearby’ in feature
space?

• That is, does knowing the value of some variable at some location give us
information on the value at ‘nearby’ locations?

• The concept of correlation between variables can be applied to correlation
within a variable, using distance to model the relation


Covariance and Correlation


Recall: for two non-spatial variables X and Y :

• Sample covariance:

$$s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y})$$

• Sample correlation coefficient: the covariance normalized by sample
standard deviations; range [−1 . . . 1]:

$$r_{XY} = \frac{s_{XY}}{s_X \cdot s_Y} = \frac{\sum (x_i - \bar{x}) \cdot (y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \cdot \sum (y_i - \bar{y})^2}}$$
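As a quick check of these definitions, the sketch below computes the sample covariance (with the usual n − 1 divisor) and both forms of the correlation coefficient by hand. The data values are invented; `cov()`, `cor()` and `sd()` are the base R equivalents and should agree with the manual results.

```r
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)   # made-up values of variable X
y <- c(1.0, 2.2, 0.8, 3.9, 3.1)   # made-up values of variable Y
n <- length(x)

# sample covariance
s_xy <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# correlation: covariance normalized by the sample standard deviations
r_xy <- s_xy / (sd(x) * sd(y))

# the same correlation written with sums of squares only
r_alt <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))
```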


Autocorrelation
We want to apply the idea of correlation to one variable (auto-correlation).

Here, the correlation is controlled by some other dimension:

• time – if the variable is collected as a time–series

• space – if the variable is collected at points in space

So we will get a measure of how much the variable is correlated with itself,
considering the other factor (time or space).


Spatial autocorrelation
Two methods; the variable to be autocorrelated can be:

1. classified according to a stratification of space

2. considered as spatially continuous with the distance between point-pairs as


the

The first is simpler to conceive, as it uses non-spatial correlation analysis on a
spatially-classified variable.

The second requires a new mathematical formulation and stronger assumptions.


Evidence of Spatial Correlation

• Moran’s I statistic: a simple approach which works for regions of arbitrary size,
as long as “adjacency” is well-defined (but here we will use distance classes)

• Compute spatial autocorrelation within various distance classes and
summarize as:

$$I = \frac{n}{\sum_i \sum_j w_{ij}} \cdot \frac{\sum_i \sum_j w_{ij} (z_i - \bar{z})(z_j - \bar{z})}{\sum_i (z_i - \bar{z})^2}$$

• $w_{ij} = 1$ iff the point pair is in the distance class, so $n / \sum_i \sum_j w_{ij}$ is the inverse
proportion of all points in this class.

• Standardized by the overall variance
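A minimal sketch of Moran's I for one distance class, in base R. The function name `moran_class`, the four-point transect and its values are invented for illustration; this is not the ILWIS implementation, only the formula written out with 0/1 weights.

```r
# Moran's I for the distance class (lo, hi], with w_ij = 1 inside the class.
moran_class <- function(xy, z, lo, hi) {
  d <- as.matrix(dist(xy))
  w <- (d > lo & d <= hi) * 1      # 1 iff the pair falls in the distance class
  diag(w) <- 0                     # a point is never paired with itself
  zc <- z - mean(z)                # deviations from the overall mean
  (length(z) / sum(w)) * sum(w * outer(zc, zc)) / sum(zc^2)
}

# hypothetical transect: four points one unit apart, values rising steadily
xy <- cbind(x = 1:4, y = 0)
z  <- c(1, 2, 3, 4)

I_adjacent <- moran_class(xy, z, 0, 1.5)    # neighbouring points only
I_ends     <- moran_class(xy, z, 2.5, 3.5)  # the two end points only
```

On this transect adjacent points are similar (positive I) and the two ends are dissimilar (negative I). With so few pairs in a class the statistic can fall outside [−1, 1].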


Interpreting Moran’s I

• Expected value −1/(n − 1)

• In ILWIS, standardized to 0 for all classes (multiply by 1 − n and subtract 1)

• Higher values (ILWIS > 0): positive spatial autocorrelation (values separated
by this distance tend to be similar)

• Lower values (ILWIS < 0): negative spatial autocorrelation (values separated
by this distance tend to be dissimilar)


Covariance

• The sample covariance between two different variables x and y measured
on the same sample points [1, . . . , i, . . . , n]:

$$s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

• Units are the product of the two variables' units; not standardized (that is the
correlation)

• The spatial covariance is computed within the same variable, i.e.
auto-covariance, but using pairs of observations.

• We form all possible point pairs, total (n · (n − 1))/2.

• This is a large number! For example, with 200 points this is 19,900 point pairs.


Semi-variances

• Each pair of observation points has a semivariance γ, defined as:

$$\gamma(\vec{x}_i, \vec{x}_j) = \frac{1}{2} [z(\vec{x}_i) - z(\vec{x}_j)]^2$$

• This is defined for all point pairs

• Each point pair is separated by a known distance, so . . .

• We can plot the semivariances against distance as a variogram “cloud”, with
(n · (n − 1))/2 points in the graph

• Can also summarize in a variogram
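The construction of the cloud can be sketched by hand in base R. The three points and their values are invented; the data frame columns `dist` and `gamma` are named after the gstat output shown in these overheads, but nothing here calls gstat.

```r
xy <- cbind(x = c(0, 3, 0), y = c(0, 4, 8))   # hypothetical coordinates
z  <- c(1.0, 2.0, 4.0)                        # hypothetical attribute values

idx <- t(combn(nrow(xy), 2))                  # all n(n-1)/2 index pairs (i, j)
i <- idx[, 1]
j <- idx[, 2]

h   <- sqrt((xy[i, "x"] - xy[j, "x"])^2 + (xy[i, "y"] - xy[j, "y"])^2)
gam <- 0.5 * (z[i] - z[j])^2                  # semivariance of each pair

cloud <- data.frame(i = i, j = j, dist = h, gamma = gam)

n_pairs_200 <- choose(200, 2)                 # pair count for 200 points
```

Plotting `cloud$gamma` against `cloud$dist` gives the variogram cloud; with real data sets the number of pairs grows quadratically, as the count for 200 points shows.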


The variogram cloud

• Shows all point pairs, so maximum information

• Shows which point pairs do not fit the general pattern


> library(gstat)
> data(meuse, package="sp")  # the Meuse data set is distributed with package sp
> v<-variogram(log(cadmium)~1, locations=~x+y, data=meuse, cloud=TRUE)
> plot(v, pch=20, cex=.5, col="red")
> #find unusual points in variogram cloud; identify with mouse
> pp<-plot(v, identify=TRUE)
> # show where these pairs are located
> plot(pp, data=meuse)


The experimental variogram

• To summarize the variogram cloud, compute average semi-variance at
various separations (‘lags’); this is the experimental variogram

$$\gamma(\vec{h}) = \frac{1}{2 m(\vec{h})} \sum_{i=1}^{m(\vec{h})} [z(\vec{x}_i) - z(\vec{x}_j)]^2$$

• m(~h) is the number of point pairs separated by vector ~h

• In practice, we have to define the set of vectors in each “bin” (to have enough
points)
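The binning step can be sketched in base R as follows. The pair separations `h`, the semivariances `g` and the bin width are invented; the result mimics the `np`, `dist` and `gamma` columns that gstat reports, but it is only an illustration of the averaging, not gstat's code.

```r
# Assumed input: a variogram cloud given as pair separations and semivariances.
h <- c(10, 20, 60, 80, 120, 140)       # hypothetical separations
g <- c(0.2, 0.4, 0.9, 1.1, 1.4, 1.6)   # hypothetical pair semivariances

width <- 50
bin   <- cut(h, breaks = seq(0, 150, by = width))   # (0,50], (50,100], (100,150]

ev <- data.frame(
  np    = as.vector(table(bin)),            # number of point pairs per bin
  dist  = as.vector(tapply(h, bin, mean)),  # average separation in the bin
  gamma = as.vector(tapply(g, bin, mean))   # average semivariance in the bin
)
```

Because each pair semivariance already carries the factor 1/2, the bin mean equals the sum of squared differences divided by 2m.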


Example of an experimental variogram

> v<-variogram(log(cadmium)~1, ~x+y, meuse); v


    np       dist     gamma
1   57   79.29244 0.6650872
2  299  163.97367 0.8584648
3  419  267.36483 1.0064382
4  457  372.73542 1.1567136
5  547  478.47670 1.3064732
6  533  585.34058 1.5135658
7  574  693.14526 1.6040086
8  564  796.18365 1.7096998
9  589  903.14650 1.7706890
10 543 1011.29177 1.9875659
11 500 1117.86235 1.8259154
12 477 1221.32810 1.8852099
13 452 1329.16407 1.9145967
14 457 1437.25620 1.8505336
15 415 1543.20248 1.8523791

np is the number of point pairs in the bin; dist is the average separation of
these pairs; gamma is the average semivariance


Plotting the experimental variogram


This can be plotted as semivariance gamma against average separation dist,
along with the number of point pairs np that contributed to each estimate:
> plot(v, plot.numbers=TRUE)

(Note: gstat defaults to 15 equally-spaced bins and a maximum distance of 1/3
of the diagonal of the bounding box of the data. These can be over-ridden with
the width and cutoff parameters, respectively.)


Default variogram of Log(Cd)

[Figure: experimental variogram of Log(Cd); semivariance (0.0 to 2.0) plotted
against distance (0 to 1500), each point labelled with its number of point pairs]


Features of the experimental variogram


Later we will look at fitting a model to the variogram; but even without a model we
can notice some features, which we define here only qualitatively:

• Sill: maximum semi-variance; represents variability in the absence of spatial
dependence

• Range: separation between point-pairs at which the sill is reached; distance at
which there is no evidence of spatial dependence

• Nugget: semi-variance as the separation approaches zero; represents
variability at a point that can’t be explained by spatial structure.

In the previous slide, we can estimate the sill ≈ 1.9, the range ≈ 1200 m, and the
nugget ≈ 0.5 i.e. ≈ 25% of the sill.
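These by-eye estimates can be cross-checked roughly against the tabulated experimental variogram. The sketch below copies the `dist` and `gamma` columns from the table; the 1000 m threshold for "on the sill" and the straight-line extrapolation through the first two bins are ad hoc choices for illustration, not a model fit.

```r
lagdist <- c(79.29244, 163.97367, 267.36483, 372.73542, 478.47670,
             585.34058, 693.14526, 796.18365, 903.14650, 1011.29177,
             1117.86235, 1221.32810, 1329.16407, 1437.25620, 1543.20248)
semivar <- c(0.6650872, 0.8584648, 1.0064382, 1.1567136, 1.3064732,
             1.5135658, 1.6040086, 1.7096998, 1.7706890, 1.9875659,
             1.8259154, 1.8852099, 1.9145967, 1.8505336, 1.8523791)

# rough sill: mean semivariance of the bins beyond 1000 m (assumed threshold)
sill_est <- mean(semivar[lagdist > 1000])

# rough nugget: extend the line through the first two bins back to distance 0
slope      <- (semivar[2] - semivar[1]) / (lagdist[2] - lagdist[1])
nugget_est <- semivar[1] - slope * lagdist[1]

nugget_share <- nugget_est / sill_est   # nugget as a fraction of the sill
```

Both numbers land close to the values read from the plot, with the nugget at roughly a quarter of the sill.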


From variogram cloud to variogram


Defining the bins (1)

• Distance interval, specifying the centres. E.g. (0, 100, 200, . . .) means
intervals of [0 . . . 50], [50 . . . 150], . . .

• All point pairs whose separation is in the interval are used to estimate γ(~h) for
~h as the interval centre

• Narrow intervals: more resolution but fewer point pairs for each estimate
> # explicit bin boundaries every 100 m, starting at 50 m
> v<-variogram(log(cadmium)~1, ~x+y, meuse, boundaries=seq(50,2050,by=100))
> plot(v, plot.numbers=TRUE)
> par(mfrow = c(2,3)) #show all six plots together
> # compare bin widths from 20 m to 220 m
> for (bw in seq(20, 220, by = 40)) {
+   v<-variogram(log(cadmium)~1, ~x+y, meuse, width=bw)
+   plot(v$dist, v$gamma, xlab=paste("bin width", bw))
+ }
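The relation between interval centres and interval boundaries can be sketched in base R. The centres and the pair separations `h` are invented; `cut()` is the standard base R function for assigning values to intervals.

```r
centres    <- seq(0, 500, by = 100)        # hypothetical interval centres
half       <- diff(centres)[1] / 2         # half the spacing between centres
boundaries <- c(0, centres[-1] - half, max(centres) + half)
# boundaries are now 0, 50, 150, 250, 350, 450, 550

h   <- c(30, 50, 120, 149, 151, 449)       # hypothetical pair separations
bin <- cut(h, breaks = boundaries, include.lowest = TRUE)
counts <- as.vector(table(bin))            # point pairs falling in each interval
```

Empty or nearly empty intervals, as in this toy example, are the sign that the intervals are too narrow for the data.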


[Figure: six experimental variograms of Log(Cd), v$gamma plotted against
distance, one panel for each bin width: 20, 60, 100, 140, 180 and 220]


Defining the bins (2)

• Each bin should have > 100 point pairs; > 300 is much more reliable
> v<-variogram(log(cadmium)~1, ~x+y, meuse, width=20)
> plot(v, plot.numbers=T)
> v$np
[1] 6 19 27 27 51 65 58 62 62 82 76 75 86 81 76
[16] 91 92 90 88 92 112 103 80 116 108 106 79 94 117 99
[31] 100 101 108 117 110 117 114 107 96 110 109 106 114 117 104
[46] 98 94 117 92 110 105 91 89 98 89 91 103 102 93 92
[61] 73 85 88 91 88 84 75 81 90 73 93 95 76 85 67
[76] 77 88 60
> v<-variogram(log(cadmium)~1, ~x+y, meuse, width=120)
> v$np
[1] 79 380 485 577 583 642 654 648 609 572 522 491 493 148
> plot(v, plot.numbers=T)


Topic: Anisotropy

• Greek “Iso” + “tropic” = English “same” + “trend”; Greek “an-” = English “not-”

• Variation may depend on direction, not just distance

• This is why we refer to the separation vector; up till now this has just meant
distance, but now it includes direction

• Case 1: same sill, different ranges in different directions (geometric, also
called affine, anisotropy)

• Case 2: same range, sill varies with direction (zonal anisotropy)


How can anisotropy arise?

• Directional process

* Example: sand content in a narrow flood plain: much greater spatial
dependence along the axis parallel to the river
* Example: secondary mineralization near an intrusive dyke
* Example: population density in a hilly terrain with long, linear valleys

• Note that nugget must logically be isotropic (it is variation at a point)


Affine (Geometric) anisotropy

• If we can find orthogonal axes of maximum and minimum range and if the
same semi-variogram model can be fitted, the coordinates can be transformed
from an oblique ellipse to a circle

• This is just the affine transformation: rotation and scale change
independently in two orthogonal axes (this is the same transformation as
when registering a paper map on a digitizer)

• In the affine-transformed coordinate system one variogram model applies

• ILWIS, gstat, and many others call this geometric anisotropy
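The transformation can be sketched for a single separation vector in base R. The function name `iso_dist`, the azimuth of 45 degrees and the ranges of 1000 and 400 are invented; the ratio is assumed to be minor range divided by major range, with the azimuth measured clockwise from North.

```r
# Isotropic-equivalent length of a separation vector (dx, dy) under geometric
# anisotropy: rotate onto the major/minor axes, then stretch the minor axis.
iso_dist <- function(dx, dy, azimuth_deg, ratio) {
  th      <- azimuth_deg * pi / 180
  h_major <- dx * sin(th) + dy * cos(th)   # component along the major axis
  h_minor <- dx * cos(th) - dy * sin(th)   # component along the minor axis
  sqrt(h_major^2 + (h_minor / ratio)^2)
}

# hypothetical case: major axis at azimuth 45, ranges 1000 (major) and 400 (minor)
ratio <- 400 / 1000
along_major <- iso_dist(1000 * sin(pi / 4),  1000 * cos(pi / 4), 45, ratio)
along_minor <- iso_dist( 400 * cos(pi / 4),  -400 * sin(pi / 4), 45, ratio)
```

Both separations map to the same transformed distance, which is why a single variogram model can serve all directions in the transformed coordinates.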


Zonal anisotropy

• If the sills of the directional variograms are different, no affine transformation
can make the two variograms coincide.

• Variance is inherently different in the two zones; this is called zonal anisotropy.

• It may be approximated by adding two affine anisotropic models, each with a
very high anisotropy ratio (so that the other direction’s variance is almost zero);
see the gstat manual for details


Detecting anisotropy with a Variogram Surface

• ILWIS: Compute the variogram surface, which is a 2-D variogram.

• This is not a map, but rather a plot of semivariances vs. distance and direction
(the separation vector)

• Each cell shows the semivariance at a given distance and direction (lag)

• Symmetric by definition

• Find axis of maximum spatial dependence (lowest semivariances at a given
distance)

• Second axis is the perpendicular


Variogram surface of Log(Cd)


Detecting anisotropy with a Directional Variogram

• A directional variogram only considers point pairs whose separation lies in a
given direction

• These are put in distance classes (bins) as with an omnidirectional variogram

• Parameters to specify:
1. Direction of the major or minor axis in 1st quadrant; implicitly specifies
perpendicular as other axis; in ILWIS as Azimuth (degrees) clockwise from
Y (North), as with a compass; corresponding minor or major axis is then
+π/2 = +90° clockwise
2. Tolerance: Degrees on either side which are considered to have the ‘same’
angle
3. Band width: Limit the bin to a certain width; this keeps the band from taking
in too many far-away points
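The direction and tolerance parameters can be sketched in base R as a test on each separation vector. The function names `azimuth` and `in_direction` are invented, and the band width limit is left out to keep the example short.

```r
# Azimuth of a separation vector (dx, dy): degrees clockwise from North.
# A pair has no orientation, so the angle is folded into [0, 180).
azimuth <- function(dx, dy) (atan2(dx, dy) * 180 / pi) %% 180

# TRUE if the separation lies within `tol` degrees of direction `alpha`
in_direction <- function(dx, dy, alpha, tol) {
  dif <- abs(azimuth(dx, dy) - alpha %% 180)
  pmin(dif, 180 - dif) <= tol        # angular difference on a half circle
}

ne   <- in_direction( 1,  1, alpha = 45, tol = 10)   # pair pointing NE
sw   <- in_direction(-1, -1, alpha = 45, tol = 10)   # same pair, reversed
east <- in_direction( 1,  0, alpha = 45, tol = 10)   # pair pointing E
nnw  <- in_direction(-1, 10, alpha = 0,  tol = 10)   # slightly west of North
```

Only the pairs that pass this test would go into the distance bins of the directional variogram for that direction.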


Directional Variograms in gstat


In gstat a series of directional variograms can be computed at once and then
plotted together; this makes it easy to find the major axis (longest range).
> # directional variograms every 45 deg.
> meuse$lcad<-log(meuse$cadmium)  # store the log-transformed variable in the data frame
> v3<-variogram(lcad ~1, ~x+y, meuse, alpha=c(0, 45, 90, 135))
> plot(v3, plot.numbers=TRUE) # show four different graphs side-by-side
> plot(v3, multipanel=FALSE) # show all directions on one graph
> # same, but restrict the tolerance angle
> v4<-variogram(lcad ~1, ~x+y, meuse, alpha=c(0, 45, 90, 135), tol.hor=10)
> plot(v4)

• Much shorter range and lower sill in the NE-SW direction

• Fewer point pairs for each estimate, so less reliable models


Anisotropic variograms of Log(Cd)


[Figure: directional variograms of Log(Cd) for directions 0, 45, 90 and 135;
semivariance plotted against distance (0 to 1500), points labelled with the
number of point pairs. Left: one plot for each direction. Right: all directions
on one plot.]


Point pattern analysis


Sometimes we are interested only in distribution of sample points in space,
without reference to the data values.

This can give insight into the spatial process by which the points were
placed (repulsion, attraction, . . . )

References:

• ILWIS 3 help topic “Pattern analysis”;

• Boots, B. N. & Getis, A. (1988). Point pattern analysis. Newbury Park: Sage;

• Ripley, B. D. (1981). Spatial statistics. New York: John Wiley and Sons.


Examples

• Distribution of rare species in a forest

• Location of settlements or farms

• Placement of “representative” soil samples during a soil survey


Complete Spatial Randomness (CSR)

• CSR results from a homogeneous Poisson process

* Assumption 1: Each location has an equal chance of having an observation
point
* Assumption 2: The presence of one or more points does not affect the
placement of any other point

• If points “attract” and form compact groups, with large spaces in between:
clustered

• If points “repel” and arrange themselves in an evenly-spaced pattern: regular
pattern


Measures of the point pattern

1. Points-in-area

2. Distance between points (ILWIS)


• NN: Distance from each point to its nearest neighbour of some order, from
1 to n − 1 (all)
• RNN: Reflexive nearest neighbours: two points are first order RNN if they
are each other’s nearest neighbours; similar definition for higher orders


Points-in-area: The Poisson process

• Probability of observing exactly x points in area a is $p(x) = e^{-\lambda} \lambda^x / x!$
where λ is the expected number of points in the area a

• For a = 1, λ is called the intensity of the process

• If n points are observed over a study area A, λ = (n/A).

• In a sample area a, we then expect $\lambda_a = a\lambda$ points
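A short numerical sketch of these quantities in base R. The point count, study area and sample area are invented; `dpois()` is the base R Poisson probability function.

```r
n <- 200                      # hypothetical number of observed points
A <- 10000                    # hypothetical study area
lambda   <- n / A             # intensity: expected points per unit area
a        <- 100               # hypothetical sample area
lambda_a <- a * lambda        # expected points in the sample area

x <- 0:4
p        <- dpois(x, lambda_a)                              # built-in Poisson
p_manual <- exp(-lambda_a) * lambda_a^x / factorial(x)      # formula as written
```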


Test of CSR: χ²

• Place a large number n of test circles according to CSR

* independent uniform distributions in E and N, over the range of the study
area

• Count the number of sample points falling in each circle, and summarize how
many times 0, 1, . . . are found ⇒ observed

• Calculate (by the Poisson distribution) the probability of a test circle containing
each number under the assumption of CSR

• Multiply by the number of test circles ⇒ expected


Test of CSR: χ² (2)

• Calculate $\chi^2 = \sum_{i=1}^{n} (O_i - E_i)^2 / E_i$, where n here is the number of count
classes (0, 1, . . . points per circle)

• Compare to tabulated probability with n − 2 d.f. (subtract 1 column, 1 row)

• Note: sample size must be large enough so most expected counts > 5; if not,
increase the size of the test circles.
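The whole procedure can be sketched in base R on simulated data. Everything here is an assumption made for illustration: the unit-square study area, the numbers of points and circles, the circle radius, keeping circle centres far enough from the border that every circle lies inside the area, and pooling all counts of 5 or more into one class. Overlapping circles are not independent, so this is a rough check rather than a rigorous test.

```r
set.seed(42)
area  <- 1                                   # unit-square study area
n_pts <- 500
pts   <- cbind(runif(n_pts), runif(n_pts))   # sample points, here simulated CSR

m <- 400                                     # number of test circles
r <- 0.04                                    # circle radius
ctr <- cbind(runif(m, r, 1 - r), runif(m, r, 1 - r))   # centres inside the area

# observed: number of sample points inside each test circle
counts <- sapply(seq_len(m), function(k)
  sum((pts[, 1] - ctr[k, 1])^2 + (pts[, 2] - ctr[k, 2])^2 <= r^2))

lambda_a <- (n_pts / area) * pi * r^2        # expected points per circle
kmax <- 5                                    # pool counts >= kmax into one class
obs  <- as.vector(table(factor(pmin(counts, kmax), levels = 0:kmax)))
prob <- c(dpois(0:(kmax - 1), lambda_a),
          ppois(kmax - 1, lambda_a, lower.tail = FALSE))
expd <- m * prob                             # expected circles in each class

chi2    <- sum((obs - expd)^2 / expd)
df      <- length(obs) - 2                   # classes minus 2
p_value <- pchisq(chi2, df, lower.tail = FALSE)
```

A small `p_value` would suggest the pattern departs from CSR; with points simulated under CSR, as here, no such departure is expected.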


Distance methods

• Problem with area techniques: arbitrary areas; choice of size can affect
interpretation; only tests first-order effects

• Alternative: measure distances between points, compare to what is expected
under CSR

• Measure 1: Reflexive nearest neighbours (RNN): two points are first order
RNN if they are each other’s nearest neighbours; measure the distance
between them

• Similar definition for higher orders
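First-order reflexive nearest neighbours can be found in base R with a distance matrix. The four points are invented; the variable names are illustrative only.

```r
xy <- cbind(c(0, 1, 1, 5), c(0, 0, 2, 5))   # four hypothetical points
d  <- as.matrix(dist(xy))
diag(d) <- Inf                              # ignore the zero self-distances

nn     <- apply(d, 1, which.min)            # index of each point's nearest neighbour
is_rnn <- nn[nn] == seq_along(nn)           # does the nearest neighbour point back?

# distance between each reflexive pair (listed once per member of the pair)
rnn_dist <- d[cbind(seq_along(nn), nn)][is_rnn]
```

Here points 1 and 2 are each other's nearest neighbours, while point 3 has point 2 as its nearest neighbour without being chosen in return.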


ILWIS Point Pattern Analysis: Nearest Neighbour (NN)

• Calculates the distances between point pairs of a given order and uses this as
the ordinate

• For each distance, calculates the “probability” (actually, the observed


frequency) that the nth-order NN is within this distance

• Example: 2 points (out of 459) have at least one neighbour within 1.8 m;
frequency is 2/459 = 0.00436

• Example: 56 points have at least one neighbour within 3.3 m; f = 56/459 =
0.12200
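The observed frequencies can be reproduced in base R from first-order nearest-neighbour distances. The four points and the helper `freq` are invented for illustration; this is not the ILWIS routine.

```r
xy <- cbind(c(0, 1, 1, 5), c(0, 0, 2, 5))   # four hypothetical points
d  <- as.matrix(dist(xy))
diag(d) <- Inf                              # ignore the zero self-distances

nn1 <- apply(d, 1, min)                     # first-order NN distance of each point

# observed frequency: share of points with at least one neighbour within r
freq <- function(r) mean(nn1 <= r)

f1 <- freq(1)
f2 <- freq(2)

# the same arithmetic as in the two examples on the slide
f_slide <- c(2, 56) / 459
```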


Expected NN distances under CSR

• ILWIS: In the table produced by point pattern analysis, View | Additional
Information; 2nd table
1. Left column: Average radius of circle within which a point has at least j
neighbours (j th-order nearest neighbour)
2. Right column: Compare with expected distance assuming CSR
3. Low-order closer than expected or High-order farther than expected:
clumping


Expected RNN pairs under CSR

• ILWIS: In the table produced by point pattern analysis, View | Additional
Information; Top table
1. Left column: number of points which are j th-order RNN
2. Right column: Compare the number of point pairs expected at the j th-order
in a CSR (Boots & Getis Table 4.1)
3. Lower-order expected values > CSR suggest clumping, < CSR suggest
dispersion
4. Higher-order expected values > CSR suggest regularity


References

[1] Brian D. Ripley. Spatial statistics in R. R News, 1(2):14–15, June 2001.

[2] Paulo J. Ribeiro, Jr. and Peter J. Diggle. geoR: A package for geostatistical
analysis. R News, 1(2):14–18, June 2001.

[3] Martin Schlather. Simulation and analysis of random fields. R News,
1(2):18–20, June 2001.

[4] Roger Bivand. More on spatial data. R News, 1(3):13–17, September 2001.

[5] Ole F. Christensen and Paulo J. Ribeiro. geoRglm: A package for
generalised linear spatial models. R News, 2(2):26–28, June 2002.

[6] Edzer J. Pebesma. gstat User’s Manual. Dept. of Physical Geography,
Utrecht University, Utrecht, version 2.3.3 edition, 2001.

[7] Edzer J. Pebesma and Cees G. Wesseling. Gstat: a program for
geostatistical modelling, prediction and simulation. Computers &
Geosciences, 24(1):17–31, 1998.

[8] Brian D. Ripley. Spatial statistics. John Wiley and Sons, New York, 1981.

[9] W. N. Venables and B. D. Ripley. Modern applied statistics with S.
Springer-Verlag, New York, fourth edition, 2002.

[10] A. Stein, Freek van der Meer, and B. G. F. Gorte, editors. Spatial statistics for
remote sensing. Kluwer Academic, Dordrecht, 1999.

[11] John C. Davis. Statistics and data analysis in geology. John Wiley & Sons,
New York, 3rd edition, 2002.

[12] P. K. Kitanidis. Introduction to geostatistics: applications to hydrogeology.
Cambridge University Press, Cambridge, England, 1997.

[13] A. Stewart Fotheringham, Chris Brunsdon, and Martin Charlton.
Quantitative geography: perspectives on spatial data analysis. Sage
Publications, London; Thousand Oaks, Calif., 2000.

[14] Noel Cressie. Statistics for spatial data. John Wiley & Sons, New York,
revised edition, 1993.

[15] G. Christakos. Modern spatiotemporal geostatistics. Oxford University
Press, New York, 2000.

[16] E. H. Isaaks and R. M. Srivastava. An introduction to applied geostatistics.
Oxford University Press, New York, 1990.

[17] R. Webster and M. A. Oliver. Geostatistics for environmental scientists.
Wiley & Sons, Chichester, 2001.

[18] Pierre Goovaerts. Geostatistics for natural resources evaluation. Applied
Geostatistics. Oxford University Press, New York, 1997.

[19] J.-P. Chilès and P. Delfiner. Geostatistics: modeling spatial uncertainty. Wiley
series in probability and statistics. John Wiley & Sons, New York, 1999.

[20] A. B. McBratney, R. Webster, and T. M. Burgess. The design of optimal
sampling schemes for local estimation and mapping of regionalized
variables - I. Computers and Geosciences, 7(4):331–334, 1981.

[21] A. B. McBratney and R. Webster. The design of optimal sampling schemes
for local estimation and mapping of regionalized variables - II. Computers
and Geosciences, 7(4):335–365, 1981.

[22] Jan-Willem van Groenigen. Sampling strategies for effective variogram
estimation. In Jan-Willem van Groenigen, editor, Constrained optimisation of
spatial sampling, ITC Publication 65, pages 105–124. ITC, Enschede, NL,
1999.

[23] B. N. Boots and A. Getis. Point pattern analysis. Scientific Geography series
8. Sage, Newbury Park, 1988.
