
Digital Processing of Remote Sensing Data

Image Classification

CVE572 Satellite Remote Sensing


Image Classification
• The objective of image classification is to automatically categorize all
pixels in an image into land cover/land use classes
Basic Strategy: How do you do it?
• Use the radiometric properties measured by the remote sensor
• Different objects have different spectral signatures
[Figure: mean spectral signatures of vegetation and soil across Bands 1-5 and 7]
Informational vs. Spectral Classes
• Informational classes:
– categories of interest to users: land use, water turbidity
classes, forest habitat types, geological units, chlorophyll
types, soil organic matter content, depth of snow pack, sea
surface temperature, etc…
– Not directly recorded on a remotely sensed image
• Spectral classes:
– Groups of pixels that are uniform with respect to brightness
values and patterns in their multiple spectral channels
The challenge is to derive quantitative connections
between informational and spectral classes
Basic Strategy: How do you do it?
• In an easy world, all “Vegetation” pixels
would have exactly the same spectral
signature
• Then we could just say that any pixel in an
image with that signature was vegetation
• We’d do the same for soil, etc. and end up
with a map of different classes
Basic Strategy: How do you do it?
• But in reality, that isn’t the case. Looking at
several pixels with vegetation, you’d see variety in
spectral signatures.
[Figure: spectral signatures of seven vegetation pixels (Veg 1 through Veg 7) across Bands 1-5 and 7, showing the variability within a single class]

The same would happen for other types of pixels, as well.


The Classification Trick: Deal with variability
• Different ways of dealing with the variability lead to
different ways of classifying images
• To talk about this, we need to look at spectral
signatures a little differently
Think of a pixel’s reflectance in 2-dimensional space. The pixel occupies a point in that space. The vegetation pixel and the soil pixel occupy different points in 2-d space.

[Figure: vegetation and soil spectral signatures across Bands 1-5 and 7, and the same two pixels plotted as points in Band 3 vs. Band 4 space]
Basic Strategy: Dealing with variability
• With variability, the vegetation pixels now occupy a region, not a point, of n-dimensional space
• Soil pixels occupy a different region of n-dimensional space

[Figure: spectral signatures of many vegetation pixels across Bands 1-5 and 7, and the separate vegetation and soil regions in Band 3 vs. Band 4 space]
Basic strategy: Dealing with variability
• Classification:
• Delineate boundaries of classes in n-dimensional space
• Assign class names to pixels using those boundaries

[Figure: class boundaries delineated around the vegetation and soil clusters in Band 3 vs. Band 4 feature space]
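To make the feature-space idea concrete, here is a small illustrative sketch in Python (the reflectance values, names, and the use of numpy/matplotlib are assumptions for illustration, not from the slides) that plots vegetation and soil training pixels as points in Band 3 vs. Band 4 space:

# Illustrative sketch: training pixels as points in 2-D feature space.
# The reflectance values below are invented for illustration only.
import numpy as np
import matplotlib.pyplot as plt

# Each row is one pixel: [Band 3 reflectance, Band 4 reflectance]
vegetation = np.array([[4, 38], [5, 35], [6, 40], [5, 42], [4, 36]])
soil = np.array([[14, 20], [16, 22], [15, 18], [17, 21], [13, 19]])

plt.scatter(vegetation[:, 0], vegetation[:, 1], label="Vegetation")
plt.scatter(soil[:, 0], soil[:, 1], label="Soil")
plt.xlabel("Band 3")
plt.ylabel("Band 4")
plt.legend()
plt.title("Training pixels occupy separate regions of feature space")
plt.show()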
Image Classification Techniques
1) Supervised classification
2) Unsupervised classification
3) Contextual classification
4) Fuzzy classification
Supervised Classification
• The process of using samples of known
informational classes (training sets) to
classify pixels of unknown identity.
• Identification and delineation of training
areas is key to successful implementation
Training Sets
• # of pixels - want to statistically characterize the
spectral properties of an informational class (i.e. forest,
crop, water), should have >= 100 pixels total for an
informational class
• location - geographically dispersed, boundaries away
from edge/mixed pixels
• number of areas - depends on number of information
categories, 10 at a minimum, enough for accuracy
assessment and incorporation of spectral subclasses
• uniformity - unimodal distributions, use training areas
to characterize mean, variance, covariances -
sometimes not easy due to spectral variation present
Supervised Classification Steps
• Identification of the informational categories/classes to be used
• Selection and definition of training data - expensive
and time consuming. Training set can come from field
data (GPS locations), aerial photos, knowledge of the
area, image interpretation, or combinations of those
• Evaluate statistics of training fields (exploratory data
analysis) and discard/re-iterate as needed
• Choose statistical classifier and conduct classification
• Develop displays of results
• Evaluate classification performance
Evaluating Statistics of Training Sets
• In two-dimensional space, training set pixels
representing different informational classes should
be in separate clusters
• Training classes should be spectrally separable -
histograms should not overlap too much or in too
many bands
• Training classes should be homogeneous – unimodal
histograms
• Transformed divergence: another measure of
statistical separation between spectral classes – the
larger this number, the greater the separation of
classes and higher the probability of correct
classification
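As an illustration of the transformed-divergence measure mentioned above, the sketch below uses a commonly cited form of the pairwise divergence between two Gaussian classes and the scaling TD = 2000(1 - exp(-D/8)); the exact formula and scaling vary between software packages, so treat this as an assumed formulation, and all names and example values are invented.

# Sketch (assumed formulas): divergence and transformed divergence between
# two training classes modeled as multivariate Gaussians.
import numpy as np

def class_stats(pixels):
    """pixels: (n_pixels, n_bands) array of training samples for one class."""
    mean = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)
    return mean, cov

def transformed_divergence(pix_a, pix_b):
    m1, c1 = class_stats(pix_a)
    m2, c2 = class_stats(pix_b)
    c1_inv, c2_inv = np.linalg.inv(c1), np.linalg.inv(c2)
    dm = (m1 - m2).reshape(-1, 1)
    # Divergence between two Gaussian classes (commonly cited form)
    div = 0.5 * np.trace((c1 - c2) @ (c2_inv - c1_inv)) \
        + 0.5 * np.trace((c1_inv + c2_inv) @ (dm @ dm.T))
    # Transformed divergence scaled to 0-2000; values near 2000 = well separated
    return 2000.0 * (1.0 - np.exp(-div / 8.0))

# Example with synthetic 6-band training pixels; real values come from training areas
rng = np.random.default_rng(0)
veg = rng.normal(loc=[5, 8, 6, 40, 25, 12], scale=2.0, size=(100, 6))
soil = rng.normal(loc=[15, 18, 20, 24, 30, 28], scale=2.0, size=(100, 6))
print(f"Transformed divergence (veg vs. soil): {transformed_divergence(veg, soil):.0f}")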
Supervised Classification
Supervised classification requires the analyst to select training areas where he/she knows what is on the ground and then digitize a polygon within that area… The computer then creates the mean spectral signatures.

[Figure: digital image with known conifer, water, and deciduous training areas, and the mean spectral signatures (Conifer, Water, Deciduous) derived from them]
Supervised Classification
[Figure: the mean spectral signatures (Conifer, Deciduous, Water), the multispectral image, and the unknown spectral signature of the next pixel to be classified, which is turned into information (the classified image)]

The result is information - in this case a land cover map…

[Figure: land cover map with legend: Water, Conifer, Deciduous]
Supervised classifiers
• Minimum distance to means classifier: uses the central
values (means) of the spectral data clusters (defined by
training data) to assign pixels to information categories
• Parallelepiped classifier: uses ranges of pixel values within
the training data to define classification regions in
multivariate space; one of the earliest classification
algorithms developed
• Maximum likelihood classifier: estimates the means and
variances of information classes defined by training data to
estimate probabilities for each pixel in an image, most
commonly used method
– takes into account the shape of the cluster distribution, as well
as overlapping regions, requires normal distributions
Minimum distance to means
– Find the mean value of the pixels of each training set in n-dimensional space
– All pixels in the image are classified according to the class mean to which they are closest

[Figure: class means plotted in Band 3 vs. Band 4 feature space, with pixels assigned to the nearest mean]
Minimum Distance to means
– Pros:
• All regions of n-dimensional space are classified
• Allows for diagonal boundaries (and hence no
overlap of classes)
– Con:
• Assumes that spectral variability is the same in all directions, which is not the case
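A minimal sketch of the minimum-distance-to-means rule described above (function and variable names are illustrative assumptions): compute each class mean from its training pixels, then assign every pixel to the class whose mean is closest in Euclidean distance.

# Sketch: minimum distance to means classification (assumed implementation).
import numpy as np

def min_distance_classify(image_pixels, training_sets):
    """
    image_pixels : (n_pixels, n_bands) array of pixels to classify.
    training_sets: dict mapping class name -> (n_train, n_bands) training pixels.
    Returns an array of class names, one per pixel.
    """
    names = list(training_sets)
    # Class means in n-dimensional spectral space
    means = np.stack([training_sets[n].mean(axis=0) for n in names])
    # Euclidean distance from every pixel to every class mean
    dists = np.linalg.norm(image_pixels[:, None, :] - means[None, :, :], axis=2)
    # Each pixel gets the class of the nearest mean
    return np.array(names)[dists.argmin(axis=1)]

# Example with invented 2-band pixels
training = {
    "vegetation": np.array([[5.0, 40.0], [6.0, 38.0], [4.0, 42.0]]),
    "soil": np.array([[15.0, 20.0], [16.0, 22.0], [14.0, 18.0]]),
}
pixels = np.array([[5.5, 39.0], [15.5, 21.0]])
print(min_distance_classify(pixels, training))  # ['vegetation' 'soil']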
Parallelepiped Classification
• A parallelepiped is a three dimensional geometrical shape
whose opposite sides are straight and parallel.
• This classifier uses the class limits stored in each class signature to determine whether a given pixel falls within the class or not.
– The class limits specify the dimensions (in standard deviation units) of
each side of a parallelepiped surrounding the mean of the class in
feature space.
– If the pixel falls inside the parallelepiped, it is assigned to the class.
However, if the pixel falls within more than one class, it is put in the
overlap class (code 255). If the pixel does not fall inside any class, it is
assigned to the null class (code 0).
• The parallelepiped classifier is typically used when speed is required. The drawback is (in many cases) poor accuracy and a large number of pixels classified as ties (or overlap)
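A hedged sketch of the parallelepiped rule, assuming boxes of mean +/- k standard deviations in each band; the overlap code 255 and null code 0 follow the convention stated above, while the names and the choice k = 2 are assumptions.

# Sketch: parallelepiped classification with mean +/- k std-dev boxes per band.
import numpy as np

OVERLAP, NULL = 255, 0  # codes used above for ties and unclassified pixels

def parallelepiped_classify(image_pixels, training_sets, k=2.0):
    """Assign class codes 1..n, OVERLAP for ties, NULL for no match."""
    names = list(training_sets)
    lows, highs = [], []
    for n in names:
        mean = training_sets[n].mean(axis=0)
        std = training_sets[n].std(axis=0)
        lows.append(mean - k * std)   # lower face of the box in each band
        highs.append(mean + k * std)  # upper face of the box in each band
    lows, highs = np.stack(lows), np.stack(highs)

    # inside[i, j] is True if pixel i falls inside the box of class j in every band
    inside = np.all((image_pixels[:, None, :] >= lows[None]) &
                    (image_pixels[:, None, :] <= highs[None]), axis=2)
    n_hits = inside.sum(axis=1)
    codes = np.where(n_hits == 0, NULL,
             np.where(n_hits > 1, OVERLAP, inside.argmax(axis=1) + 1))
    return codes, names

Pixels are coded 1..n in the order of the training_sets dict, 255 for ties, and 0 when they fall inside no box.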
Maximum Likelihood Classifier
• Another statistical approach
• Assume multivariate normal distributions of pixels
within classes
• For each class, build a discriminant function
– For each pixel in the image, this function calculates the
probability that the pixel is a member of that class
– Takes into account mean and covariance of training set
• Each pixel is assigned to the class for which it has
the highest probability of membership
Maximum Likelihood Classifier
[Figure: Mean Signature 1, Mean Signature 2, and a candidate pixel plotted across the Blue, Green, Red, Near-IR, and Mid-IR bands]

It appears that the candidate pixel is closest to Signature 1. However, when we consider the variance around the signatures…
Maximum Likelihood Classifier
[Figure: the same signatures shown with the variance around each one, across the Blue, Green, Red, Near-IR, and Mid-IR bands]

The candidate pixel clearly belongs to the Signature 2 group.
Maximum likelihood

– Pro:
• Most sophisticated; achieves good separation of
classes
– Con:
• Requires a strong training set to accurately describe the mean and covariance structure of each class
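A minimal sketch of the Gaussian maximum-likelihood rule, assuming equal prior probabilities: each class gets a discriminant -ln|C_i| - (x - m_i)^T C_i^{-1} (x - m_i) built from its training mean and covariance, and each pixel is assigned to the class with the largest value. Names and structure are illustrative assumptions.

# Sketch: maximum likelihood classification with equal priors (assumed).
import numpy as np

def max_likelihood_classify(image_pixels, training_sets):
    names = list(training_sets)
    scores = []
    for n in names:
        pix = training_sets[n]
        mean = pix.mean(axis=0)
        cov = np.cov(pix, rowvar=False)
        cov_inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        diff = image_pixels - mean                      # (n_pixels, n_bands)
        mahal = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
        scores.append(-logdet - mahal)                  # Gaussian discriminant
    scores = np.stack(scores, axis=1)                   # (n_pixels, n_classes)
    return np.array(names)[scores.argmax(axis=1)]

It is called the same way as the minimum-distance sketch earlier, with a dict of training arrays and an (n_pixels, n_bands) image array.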
Some advanced techniques

– Neural networks
• Use flexible, not-necessarily-linear functions to
partition spectral space
– Contextual classifiers
• Incorporate spatial or temporal conditions
– Linear regression
• Instead of discrete classes, apply proportional values of classes to each pixel; e.g., 30% forest + 70% grass (see the sketch below)
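For the proportional-value idea in the last bullet, one common formulation (a linear mixture model, shown here as an assumed illustration rather than the slides' method) estimates the fraction of each class in a pixel by non-negative least squares against class mean signatures:

# Sketch: estimate per-pixel class proportions with a linear mixture model.
# This is an illustrative approach, not necessarily the one used in the course.
import numpy as np
from scipy.optimize import nnls   # non-negative least squares

def unmix(pixel, endmembers):
    """
    pixel      : (n_bands,) spectrum of one pixel.
    endmembers : (n_bands, n_classes) mean signatures of the pure classes.
    Returns non-negative fractions, rescaled to sum to 1.
    """
    fractions, _ = nnls(endmembers, pixel)
    return fractions / fractions.sum()

# Invented 3-band example: forest and grass mean signatures
endmembers = np.array([[10.0, 30.0],
                       [40.0, 25.0],
                       [20.0, 35.0]])
mixed_pixel = 0.3 * endmembers[:, 0] + 0.7 * endmembers[:, 1]
print(unmix(mixed_pixel, endmembers))   # approximately [0.3, 0.7]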
Unsupervised Classification
• The identification of natural groups, or
structures/patterns, within multispectral data
• Spectral classes are defined by the computer through a statistical clustering method; informational classes are then assigned to the output spectral clusters
Unsupervised Classification
The analyst requests the computer to examine the image and extract a number of spectrally distinct clusters…

[Figure: digital image and the spectrally distinct clusters (Cluster 1 through Cluster 6) extracted from it]
Unsupervised Classification
[Figure: the saved clusters (Cluster 1 through Cluster 6), the next unknown pixel to be classified, and the output classified image]
Unsupervised Classification
The result of the unsupervised classification is not yet information until the analyst determines the ground cover for each of the clusters…

[Figure: unlabeled clusters assigned ground cover labels: Water, Water, Conifer, Conifer, Hardwood, Hardwood]
Unsupervised Classification
It is a simple process to regroup (recode) the clusters into meaningful information classes (the legend). The result is essentially the same as that of the supervised classification.

[Figure: land cover map with legend labels: Water, Conifer, Hardwood]
Unsupervised Classification
• Pros
– Takes maximum advantage of spectral
variability in an image
• Cons
– The maximally-separable clusters in spectral
space may not match our perception of the
important classes on the landscape
ISODATA - A Special Case of Minimum Distance Clustering
• “Iterative Self-Organizing Data Analysis Technique”
• Parameters you must enter include:
– N - the maximum number of clusters that you want
– T - a convergence threshold and
– M - the maximum number of iterations to be
performed.
ISODATA Procedure
• Program begins with randomly selected cluster
centroids
• Distance of each pixel to cluster centroids is
computed
• New cluster centroids are developed based on
distances
• Distance of each pixel to new cluster centroids is
computed
• Iterates until cluster centroids no longer change (or convergence criteria are met)
ISODATA Procedure
• After each iteration, the algorithm calculates
the percentage of pixels that remained in the
same cluster between iterations
• When this percentage exceeds T (convergence
threshold), the program stops or…
• If the convergence threshold is never met, the
program will continue for M iterations and
then stop.
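A stripped-down, hypothetical sketch of the iterative loop described above: random initial centroids, distance-based reassignment, centroid recomputation, and stopping when the fraction of pixels that keep their cluster exceeds T or after M iterations. Full ISODATA also splits and merges clusters, which is omitted here.

# Sketch: simplified ISODATA-style iteration (cluster splitting/merging omitted).
import numpy as np

def isodata_like(pixels, n_clusters, T=0.95, M=20, seed=0):
    """
    pixels: (n_pixels, n_bands). T: convergence threshold (fraction of pixels
    unchanged between iterations). M: maximum number of iterations.
    """
    rng = np.random.default_rng(seed)
    centroids = pixels[rng.choice(len(pixels), n_clusters, replace=False)].astype(float)
    labels = np.zeros(len(pixels), dtype=int)

    for _ in range(M):
        # Distance of each pixel to every cluster centroid
        dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Fraction of pixels that stayed in the same cluster
        if np.mean(new_labels == labels) >= T:
            labels = new_labels
            break
        labels = new_labels
        # Recompute centroids from the new assignments (keep old centroid if empty)
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = pixels[labels == k].mean(axis=0)
    return labels, centroids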
ISODATA Pros and Cons
• Not biased to the top pixels in the image (as
sequential clustering can be)
• Non-parametric: data does not need to be normally distributed
• Very successful at finding the “true” clusters
within the data if enough iterations are allowed
• Cluster signatures saved from ISODATA are easily
incorporated and manipulated along with
(supervised) spectral signatures
• Slowest (by far) of the clustering procedures.
Unsupervised Classification
• After iterations finish, you’re left with a map of the distributions of pixels in the clusters
• How do you assign class names to clusters?
– Requires some knowledge of the landscape
– Ancillary data useful, if not critical (aerial
photos, personal knowledge, etc.)
Alternatives to ISODATA approach
– K-means algorithm
• assumes that the number of clusters is known a priori, while ISODATA allows the number of clusters to vary
– Non-iterative
• Identify areas with “smooth” texture
• Define cluster centers according to first occurrence in
image of smooth areas
– Agglomerative hierarchical
• Group two pixels closest together in spectral space
• Recalculate position as mean of those two; group
• Group next two closest pixels/groups
• Repeat until each pixel grouped
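As a hedged illustration of the agglomerative hierarchical idea (not necessarily the exact variant described above), scipy's hierarchical clustering groups pixel spectra by repeatedly merging the closest pixels/groups; the example data are invented.

# Sketch: agglomerative hierarchical clustering of pixel spectra with scipy.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
# Invented example: 50 "vegetation-like" and 50 "soil-like" 4-band pixels
pixels = np.vstack([rng.normal([5, 8, 40, 25], 2.0, size=(50, 4)),
                    rng.normal([15, 18, 24, 30], 2.0, size=(50, 4))])

# Repeatedly merge the closest pixels/groups (average linkage, Euclidean distance)
tree = linkage(pixels, method="average")
# Cut the tree into a chosen number of spectral clusters
labels = fcluster(tree, t=2, criterion="maxclust")
print(np.bincount(labels))   # roughly two groups of 50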
Classification: Summary
• Use spectral (radiometric) differences to distinguish
objects
• Land cover not necessarily equivalent to land use
• Supervised classification
– Training areas characterize spectral properties of classes
– Assign other pixels to classes by matching with spectral
properties of training sets
• Unsupervised classification
– Maximize separability of clusters
– Assign class names to clusters after classification
Spectral Clusters and Spectral
Signatures
• Recall that clusters are spectrally distinct and
signatures are informationally distinct
• When using the supervised procedure, the
analyst must ensure that the informationally
distinct signatures are spectrally distinct
• When using the unsupervised procedure, the
analyst must supply the spectrally distinct
clusters with information (label the clusters).
