REGIONALIZED VARIABLES
- Variance (geostatistics)
- Covariance (spatial correlation)
- Cluster analysis (regionalization)
Ronny Berndtsson
Course objectives
- Ability to do a geostatistical analysis employing the variance of a data set.
- Ability to do a spatial correlation analysis employing the covariance of a data set.
- Ability to do a regionalization employing cluster analysis.
Literature
Handouts:
- Application of spatial correlation and cluster analysis: Uvo and Berndtsson (1996) (available on Air through ftp).
- Application of geostatistics: Berndtsson et al. (1993).
Software
- Geoeas (geostatistical software freely available from [Link] ada/csmos/models/[Link])
- Matlab (correlation and cluster analyses)
Today's topic
Analysis of a single data field z(x, y). Note: for correlation analysis, time series are needed!
[Figure: regionalized variable z = z(x, y); axes x, y, and z.]
Examples of spatially dependent variables (regionalized variables)
- Rainfall
- Soil hydraulic conductivity
- Chemical concentration
- Plant properties
- Population characteristics
Which variable is not?
Why use regionalized variable theory?
- A general analysis tool for spatially varying/dependent data.
- A general tool for spatial interpolation.
- A tool for regionalization studies.
- A basis for developing spatial models that consider regional differences.
- Just because it is fun and interesting!
Definition of variance and covariance
Variance: $V(x) = E[(x - m)^2] = \sigma^2$
Covariance: $C(x, y) = E[(x - m_x)(y - m_y)]$
Correlation coefficient: $R(x, y) = C(x, y)/[V(x)\,V(y)]^{1/2}$
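The definitions above can be sketched in a few lines of Python. This is only an illustration: it uses population (divide by n) estimates, the standard covariance (the mean product of the deviations from the means), and two made-up series `x` and `y` that are not from the course data.

```python
import math

def mean(values):
    return sum(values) / len(values)

def variance(x):
    """Population variance V(x) = E[(x - m)^2]."""
    m = mean(x)
    return sum((xi - m) ** 2 for xi in x) / len(x)

def covariance(x, y):
    """Population covariance C(x, y) = E[(x - m_x)(y - m_y)]."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)

def correlation(x, y):
    """Correlation coefficient R(x, y) = C(x, y) / sqrt(V(x) V(y))."""
    return covariance(x, y) / math.sqrt(variance(x) * variance(y))

# Hypothetical series, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(variance(x), covariance(x, y), correlation(x, y))
```

Because `y` is an exact multiple of `x`, the correlation coefficient of this toy pair is 1.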
Spatial field points
Assumptions:
- 1st order stationarity (E(z) = constant)
- 2nd order stationarity (V(z) = constant)
[Figure: two points z1 and z2 of the field z(x, y) in the x-y plane, separated by the distance h.]
Spurious correlation (or variance)!
- If data contain many zeros
- If data contain outliers
- If data contain a trend
- Check normality (if non-normal, apply a relevant data transformation)
- De-trend if necessary
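One possible way to de-trend a field is sketched below. It assumes that the trend is a plane, z = a + b x + c y, fitted by least squares; the slides do not prescribe a trend model, so this is an assumption, and the function names and sample values are hypothetical.

```python
def fit_plane(xs, ys, zs):
    """Least-squares fit of z = a + b*x + c*y; returns (a, b, c)."""
    n = len(zs)
    # Centre the data so the normal equations reduce to a 2x2 system.
    mx, my, mz = sum(xs) / n, sum(ys) / n, sum(zs) / n
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    dz = [z - mz for z in zs]
    sxx = sum(u * u for u in dx)
    syy = sum(v * v for v in dy)
    sxy = sum(u * v for u, v in zip(dx, dy))
    sxz = sum(u * w for u, w in zip(dx, dz))
    syz = sum(v * w for v, w in zip(dy, dz))
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("points are collinear; a plane cannot be fitted")
    b = (sxz * syy - syz * sxy) / det
    c = (syz * sxx - sxz * sxy) / det
    a = mz - b * mx - c * my
    return a, b, c

def detrend(xs, ys, zs):
    """Return the residuals z - (a + b*x + c*y) of the fitted plane."""
    a, b, c = fit_plane(xs, ys, zs)
    return [z - (a + b * x + c * y) for x, y, z in zip(xs, ys, zs)]

# Hypothetical field that is exactly the plane z = 3 + 2x - y.
xs = [0.0, 1.0, 0.0, 1.0, 2.0]
ys = [0.0, 0.0, 1.0, 1.0, 2.0]
zs = [3.0, 5.0, 2.0, 4.0, 5.0]
residuals = detrend(xs, ys, zs)
```

The residuals, not the raw values, would then go into the variogram or correlation analysis.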
Definition of semivariance
$V(z_2 - z_1) = E[(z_2 - z_1)^2] = 2\gamma(h)$
$\gamma(h) = E[(z_2 - z_1)^2]/2$
$\gamma^*(h) = \sum [z(x + h) - z(x)]^2 / [2n(h)]$
$n(h)$ = number of observation pairs separated by the distance $h$
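The estimator for the experimental semivariance can be sketched as follows. The binning of pair distances into lag classes of equal width is an assumption made here for scattered data (the slide only speaks of pairs at distance h), and the function name and sample points are hypothetical.

```python
import math
from itertools import combinations

def experimental_variogram(points, lag_width, n_lags):
    """Return (mean distance, semivariance, pair count) for each non-empty lag bin.

    points: list of (x, y, z) tuples.
    Bin k collects pairs with k*lag_width <= distance < (k+1)*lag_width.
    """
    sq_sum = [0.0] * n_lags
    dist_sum = [0.0] * n_lags
    count = [0] * n_lags
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        h = math.hypot(x2 - x1, y2 - y1)
        k = int(h // lag_width)
        if k < n_lags:
            sq_sum[k] += (z2 - z1) ** 2
            dist_sum[k] += h
            count[k] += 1
    result = []
    for k in range(n_lags):
        if count[k] > 0:
            # Semivariance: sum of squared differences divided by 2 n(h).
            result.append((dist_sum[k] / count[k], sq_sum[k] / (2 * count[k]), count[k]))
    return result

# Hypothetical observations along a line, for illustration only.
points = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0), (3.0, 0.0, 7.0)]
gamma = experimental_variogram(points, lag_width=1.0, n_lags=3)
```

Pairs farther apart than `n_lags * lag_width` are ignored, which is a common precaution because few pairs exist at the largest distances.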
Spatial correlation
$\rho(h) = C(z_1, z_2)/[V(z_1)\,V(z_2)]^{1/2}$
where z1 and z2 are time series at the corresponding points and h is the distance between z1 and z2.
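Both correlation and semivariance are expressed as functions of the distance h.
If the field is stationary, $\gamma(h) = V_{tot}\,[1 - \rho(h)]$; for data standardized to unit variance this is $\gamma(h) = 1 - \rho(h)$.
[Figure: two panels against distance h. One shows $\rho(h)$ with 1.0 marked on the vertical axis, the other shows $\gamma(h)$ with $V_{tot}$ marked.]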
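The link between semivariance and correlation follows from the two stationarity assumptions. The short derivation below is a sketch: it assumes E(z) is constant, so that E(z2 - z1) = 0, and that V(z1) = V(z2) = Vtot.

```latex
\begin{align}
2\gamma(h) &= E[(z_2 - z_1)^2] = V(z_2 - z_1) \\
           &= V(z_1) + V(z_2) - 2\,C(z_1, z_2) \\
           &= 2V_{tot} - 2\,C(z_1, z_2), \\
\gamma(h)  &= V_{tot} - C(z_1, z_2) = V_{tot}\,[1 - \rho(h)].
\end{align}
```

With data standardized to unit variance (Vtot = 1), the last line becomes the semivariance as one minus the correlation.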
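Errors + small-scale variability
[Figure: correlogram $\rho(h)$ and variogram $\gamma(h)$ against distance h. The sum of errors and small-scale variation appears as the gap below 1.0 in the correlogram and as the offset above zero in the variogram as h approaches 0.]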
The variogram
[Figure: variogram $\gamma(h)$ against distance h, annotated with the nugget, the range, and the sill ($V_{tot}$).]
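To make the three variogram parameters concrete, the sketch below evaluates a spherical variogram model. The slides do not name a model, so the spherical form is picked here only as a common example; the function name and the parameter values are hypothetical. In this sketch the sill is the total plateau value (Vtot), the nugget is the jump at very small h, and the range is the distance at which the sill is reached.

```python
def spherical_variogram(h, nugget, sill, vrange):
    """Spherical model: 0 at h = 0, nugget just above 0, sill for h >= vrange."""
    if h <= 0:
        return 0.0
    if h >= vrange:
        return sill
    r = h / vrange
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

# Hypothetical parameters, for illustration only.
for h in (0.0, 5.0, 10.0, 20.0):
    print(h, spherical_variogram(h, nugget=1.0, sill=5.0, vrange=10.0))
```

Fitting such a model to the experimental variogram gives the parameters that kriging needs.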
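The correlogram
Decorrelation level: $\rho = 1/e \approx 0.37$. The decorrelation distance is the distance h at which $\rho(h)$ has fallen to this level.
[Figure: correlogram $\rho(h)$ against distance h, with 1.0 marked on the vertical axis and the decorrelation distance marked at the 1/e level.]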
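Reading the decorrelation distance off a tabulated correlogram can be sketched as below. The linear interpolation between neighbouring points is an assumption of this sketch, and the function name and sample values are hypothetical.

```python
import math

def decorrelation_distance(distances, correlations, level=1.0 / math.e):
    """First distance at which the correlogram drops to `level`.

    distances must be increasing; values between neighbouring points are
    linearly interpolated. Returns None if the level is never crossed.
    """
    pairs = list(zip(distances, correlations))
    for (h0, r0), (h1, r1) in zip(pairs, pairs[1:]):
        if r0 >= level >= r1 and r0 != r1:
            return h0 + (r0 - level) * (h1 - h0) / (r0 - r1)
    return None

# Hypothetical correlogram following exp(-h / 50), for illustration only.
hs = [0.0, 25.0, 50.0, 75.0]
rs = [math.exp(-h / 50.0) for h in hs]
d = decorrelation_distance(hs, rs)
```

For an exponential correlogram exp(-h / L), the decorrelation distance equals L, which is why the 1/e level is a convenient convention.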
Spatial analyses
[Figure: correlogram and variogram shapes against distance. Panel labels: Normal; Random; Highly correlated in space; Significant trend; Data not stationary.]
Experimental variogram
[Figure: experimental variogram $\gamma(h)$.]
Correlogram for different time steps
[Figure: correlograms $\rho(h)$ against distance for different time steps.]
Correlogram: seasonal differences
[Figure: correlograms $\rho(h)$ against distance for different seasons.]
Regional differences: the data are not homogeneous and the stationarity assumption is not fulfilled!
[Figure: field z(x, y) in the x-y plane with an area of low correlation and an area of high correlation.]
Cluster analysis
A technique for separating data into groups whose members have mutually high similarity.
[Figure: dendrogram. From: [Link]]
Ward's method
From: [Link]
Input data for cluster analysis
- Raw data
- Semivariance
- Correlation
- etc.
Level of detail in dendrogram
[Figure: dendrogram with three levels of detail marked: Level 3, Level 2, Level 1.]
Regionalization based on three levels of detail
Directional dependence of spatial correlation
Regional differences in spatial correlation
Exercises
- Calculate and plot variograms for your data (Geoeas)
- Calculate and plot correlograms for your data (Matlab)
- Use cluster analysis to delineate homogeneous regions (Matlab)
Geoeas
- Calculate experimental variograms
- Plot variograms
- Use the variograms for kriging
Geoeas data file
Data for Geoeas analyses
3
X-coor m
Y-coor m
Al ug/g DM
0.707 39.293 55000
0.303 20.234 44000
0.450 15.232 34000
0.420 10.210 64000
etc.
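A file with this layout could be read as sketched below. The sketch assumes the layout is: a title line, the number of variables, one line per variable holding its name and unit together (so that 3 variables give 3 header lines), then whitespace-separated data rows. The function name `parse_geoeas` is hypothetical and not part of Geoeas.

```python
def parse_geoeas(text):
    """Parse Geoeas-style text into (title, variable names, data rows)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0]
    n_vars = int(lines[1])
    names = lines[2:2 + n_vars]
    rows = []
    for ln in lines[2 + n_vars:]:
        fields = ln.split()
        if len(fields) != n_vars:
            raise ValueError(f"expected {n_vars} values, got: {ln!r}")
        rows.append([float(f) for f in fields])
    return title, names, rows

# The first two data rows from the slide.
sample = """Data for Geoeas analyses
3
X-coor m
Y-coor m
Al ug/g DM
0.707 39.293 55000
0.303 20.234 44000
"""
title, names, rows = parse_geoeas(sample)
print(names, rows[0])
```

Each row can then be passed on as an (x, y, z) observation.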
Spatial correlation
- Calculate the correlation coefficient for the time series of each pair of points
- Calculate the distance between the points of each pair
- Plot correlation vs. distance for all unique station combinations
[Figure: correlation $\rho(h)$ plotted against distance, one marker (x) per station pair.]
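The steps above can be sketched in Python (the exercise itself uses Matlab). The station names, coordinates, and series below are made up for illustration, and the function names are hypothetical.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Correlation coefficient of two equally long series (neither may be constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def correlogram_points(stations):
    """stations: dict name -> (x, y, series).

    Returns a sorted list of (distance, correlation), one entry per unique station pair.
    """
    out = []
    for (_, (xa, ya, sa)), (_, (xb, yb, sb)) in combinations(stations.items(), 2):
        out.append((math.hypot(xb - xa, yb - ya), pearson(sa, sb)))
    return sorted(out)

# Hypothetical stations, for illustration only.
stations = {
    "A": (0.0, 0.0, [1.0, 2.0, 3.0, 4.0]),
    "B": (3.0, 4.0, [2.0, 4.0, 6.0, 8.0]),
    "C": (6.0, 8.0, [4.0, 3.0, 2.0, 1.0]),
}
pts = correlogram_points(stations)
```

Plotting the second element of each tuple against the first gives the correlogram; with N stations there are N(N - 1)/2 points.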
Cluster analysis
- Possible in Matlab
- Perform a regionalization
- Compare, e.g., variance with correlation as the dependence measure
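The idea of a regionalization by hierarchical clustering can be sketched in plain Python. This is not Ward's method and not Matlab's LINKAGE/CLUSTER: it is a minimal average-linkage sketch, and it assumes that the dissimilarity between two stations is taken as 1 minus their correlation. The correlation matrix `r` is made up for illustration.

```python
def average_linkage(dist, n_clusters):
    """Agglomerative clustering (average linkage) on a symmetric distance matrix.

    dist: list of lists, dist[i][j] = dissimilarity between stations i and j.
    Returns a sorted list of clusters, each a sorted list of station indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def between(a, b):
        # Average pairwise dissimilarity between two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the two closest clusters and merge them.
        p, q = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: between(clusters[pq[0]], clusters[pq[1]]),
        )
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return sorted(clusters)

# Hypothetical correlation matrix for four stations, for illustration only.
r = [
    [1.00, 0.90, 0.10, 0.20],
    [0.90, 1.00, 0.15, 0.10],
    [0.10, 0.15, 1.00, 0.80],
    [0.20, 0.10, 0.80, 1.00],
]
dist = [[1.0 - rij for rij in row] for row in r]
regions = average_linkage(dist, n_clusters=2)
print(regions)
```

Replacing the correlation-based dissimilarity with one based on variance would give the comparison suggested in the list above.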
Matlab help Cluster
CLUSTER Construct clusters from LINKAGE output.
T = CLUSTER(Z,'CUTOFF',C) constructs clusters from cluster
tree Z. Z is a matrix of size M-1 by 3, generated by LINKAGE.
C is a threshold for cutting the hierarchical tree generated
by LINKAGE into clusters. Clusters are formed when
inconsistent values are less than CUTOFF (see INCONSISTENT).
The output T is a vector of size M that contains the cluster
number for each observation in the original data.
T = CLUSTER(Z,'MAXCLUST',N) specifies N as the maximum
number of clusters to form from the hierarchical tree in Z.
T = CLUSTER(...,'CRITERION','CRIT') uses the specified
criterion for forming clusters, where 'CRIT' is either
'inconsistent' or 'distance'.
T = CLUSTER(...,'DEPTH',D) evaluates inconsistent values to
a depth of D in the tree. The default is D=2.
See also PDIST, LINKAGE, COPHENET, INCONSISTENT,
CLUSTERDATA.
References
Berndtsson, R., A. Bahri, and K. Jinno (1993), Spatial dependence of geochemical elements in a semi-arid agricultural field: 2. Geostatistical properties, Soil Sci. Soc. Am. J., 57, 1323-1329.
Uvo, C. B., and R. Berndtsson (1996), Regionalization and spatial properties of Ceará State rainfall in Northeast Brazil, J. Geophys. Res., 101, 4221-4233.