Assignment # 1: Introduction To Supervised Classification
Matt Reaume
GISC9216
Digital Copy:
X:\Students\MREAUME2\GISC9216\Assignment#1\ReaumeMGISC9216D1
containing:
a) ReaumeMGISC9216D1.docx
b) subset_layers.img
c) supervised_maxlike.img
d) unsupervisedclass.img
e) recoded_unsupervised.img
Table of Contents
1.0 Introduction
2.0 Subset Images
3.0 Unsupervised Classification
4.0 Supervised Classification
5.0 Classification Comparison
6.0 Conclusion
7.0 References
Table of Figures
Figure 1. Subset Layer in False Colour
Figure 2. Subset Layer in True Colour
Figure 3. Unsupervised Classification with Pseudo Colours
Figure 4. Max Iterations of 1 (Left) and Max Iterations of 10 (Right) Showing Vegetation Features
Figure 5. Max Iterations of 1 (Left) and Max Iterations of 10 (Right) Showing Residential and Industrial Features
Figure 6. Twenty Classes of the Unsupervised Classification Image
Figure 7. Ten Classes of the Unsupervised Classification Image
Figure 8. Convergence Threshold of 50% (Left Image) and 95% (Right Image) of the Unsupervised Classification
Figure 9. Unsupervised Classification and False Colour Subset (Differences in Water Sedimentation)
Figure 10. Unsupervised Classification and False Colour Subset (Differences in Agricultural and Roads Features)
Figure 11. Supervised Classification Training Sites
Figure 12. Minimum Distance Supervised Classification
Figure 13. Mahalanobis Supervised Classification
Figure 14. Maximum Likelihood Supervised Classification
Figure 15. Maximum Likelihood Supervised Classification
Figure 16. Comparison between Maximum Likelihood Supervised (left) and Unsupervised (right) Classifications
Figure 17. Comparison between Attribute Tables (Supervised, top) (Unsupervised, bottom)
Figure 18. Comparison of Histograms (Supervised, left) (Unsupervised, right)
Figure 19. Water Comparison between Unsupervised Classification (left) and Supervised Classification (right)
Figure 20. Agricultural/Forest Comparison between Unsupervised Classification (left) and Supervised Classification (right)
Figure 21. Roads/Residential Comparison between Unsupervised Classification (right) and Supervised Classification (left)
Figure 22. Industrial/Commercial Comparison between Unsupervised Classification (right) and Supervised Classification (left)
1.0 Introduction
Digital image processing is the use of computer algorithms to process digital images, and it is an
important skill in Geographic Information Systems (GIS). Classification assigns every pixel of an
aerial image to one of several land cover classes, and the categorized data can then be used to
produce thematic maps of the land cover present in the image. In this assignment, the unsupervised
and supervised classification methods are each used to produce images in order to see which
method works best for identifying land surface features on aerial imagery. The unsupervised
classification method lets the user choose the number of classes to classify and the number of
bands ERDAS should use. In the supervised method, spectral signatures are developed from
specified locations in the image. These locations are given the generic name 'training sites' and
are defined by the user. After each classification method is described and illustrated with example
images, a comparison is conducted to decide which classification method is superior for this
assignment.
Max iterations is a parameter of the unsupervised classification process that sets the maximum
number of times ERDAS runs through the image to get the values needed to produce the
unsupervised classification image. If the value is too high, the image could become too detailed,
or the extra passes become redundant once the image stops changing between iterations. However,
if the value is too low, ERDAS may not be able to classify everything needed to produce an
effective image. In Figure 4 below, max iterations is set to 1 on the left side and 10 on the right
side. The pink pigmentation is more pronounced and solid when max iterations is set to 10, which
makes certain areas of agriculture appear brighter. In Figure 5, max iterations is again set to 1
(left) and 10 (right) to illustrate the differences in residential and industrial areas. The image
with a max iterations value of 1 does not display the blue pigmentation as brightly as the image
with a value of 10, which illustrates that a low max iterations value does not allow all the values
to be classified as completely as a higher one.
Figure 4. Max Iterations of 1 (Left) and Max Iterations of 10 (Right) Showing Vegetation Features
Figure 5. Max Iterations of 1 (Left) and Max Iterations of 10 (Right) Showing Residential and Industrial Features
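The effect of the max iterations setting can be illustrated with a minimal sketch. This is a
simplified k-means-style loop written for this report, not ERDAS's actual implementation, and the
pixel values, starting means, and function names are hypothetical. Each pass assigns every pixel
to its nearest cluster mean and then recalculates the means, and the loop simply stops after
max_iterations passes.

```python
import math

def assign(pixels, means):
    """Label each pixel with the index of its nearest cluster mean."""
    return [min(range(len(means)), key=lambda k: math.dist(p, means[k]))
            for p in pixels]

def update(pixels, labels, means):
    """Recompute each cluster mean; keep the old mean if a cluster is empty."""
    new_means = []
    for k, old in enumerate(means):
        members = [p for p, lab in zip(pixels, labels) if lab == k]
        if members:
            new_means.append(tuple(sum(band) / len(members) for band in zip(*members)))
        else:
            new_means.append(old)
    return new_means

def cluster(pixels, initial_means, max_iterations):
    """Run exactly max_iterations assign/update passes (no early stopping)."""
    means = list(initial_means)
    labels = []
    for _ in range(max_iterations):
        labels = assign(pixels, means)
        means = update(pixels, labels, means)
    return labels, means

# Hypothetical single-band brightness values: a dark group and a bright group.
pixels = [(1,), (2,), (3,), (10,), (11,), (12,)]
start = [(1,), (2,)]  # deliberately poor starting means

print(cluster(pixels, start, 1)[0])   # [0, 1, 1, 1, 1, 1]
print(cluster(pixels, start, 10)[0])  # [0, 0, 0, 1, 1, 1]
```

With a single pass, most of the dark pixels are still lumped in with the bright ones. With ten
passes the two groups separate, and the later passes no longer change the result.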
Another parameter of the unsupervised classification method is the number of classes, which is
set by the user. In Figure 6 and Figure 7 below, an unsupervised classification image with pseudo
colours is produced with twenty and ten classes, respectively, to illustrate the difference. The
image with twenty classes shows how land surface features can be clustered together due to the
higher number of colours used. This could cause complications because the user may not be able
to assign every colour to the correct land use category, ultimately leading to misidentification.
The image with ten classes is somewhat simpler and could help the user identify features more
easily, but it risks making the image too generalized.
Convergence threshold is a parameter of the unsupervised classification method that sets the
maximum percentage of pixels whose cluster assignments can go unchanged between iterations. In
Figure 8, the convergence threshold is set to 0.50 for the left image and 0.95 for the right image.
As soon as the percentage of pixels that stay in the same cluster from one iteration to the next
reaches that value, the process stops. With the threshold set to 50%, the process stops before all
of the pixels are fully processed, compared with 95%, which is visible in the amount of detail in
the blue and pink pigmentation that represents different land surface features.
Figure 8. Convergence Threshold of 50% (Left Image) and 95% (Right Image) of the Unsupervised Classification
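The stopping rule described above can be written as a short check. This is a minimal sketch with
hypothetical label lists and function names, not ERDAS code. It compares the cluster labels from
two consecutive iterations and stops once the unchanged share reaches the threshold.

```python
def unchanged_fraction(previous_labels, current_labels):
    """Fraction of pixels whose cluster label is the same in both iterations."""
    same = sum(1 for a, b in zip(previous_labels, current_labels) if a == b)
    return same / len(current_labels)

def has_converged(previous_labels, current_labels, threshold):
    """True once the unchanged fraction reaches the convergence threshold."""
    return unchanged_fraction(previous_labels, current_labels) >= threshold

# Hypothetical labels for ten pixels in two consecutive iterations.
previous = [0, 1, 1, 1, 1, 1, 0, 0, 1, 1]
current = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]

print(unchanged_fraction(previous, current))   # 0.8 (8 of 10 pixels kept their label)
print(has_converged(previous, current, 0.50))  # True: clustering would stop here
print(has_converged(previous, current, 0.95))  # False: clustering would continue
```

In this example a threshold of 0.50 stops the process while two pixels are still changing
clusters, whereas a threshold of 0.95 forces at least one more iteration.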
Lastly, when the unsupervised classification method is used with pseudo colours, many features
may not be displayed, such as the sediment visible in the False Colour image in Figure 9. In
this figure the unsupervised classification (left) does not recognize the sediment in the water and
throughout various agricultural and forested areas. In Figure 10, the unsupervised classification
clumps land surface features together by recognizing two different features as the same and
therefore displaying them in the same pixel colour. For example, agricultural and forested areas
blend together when they should not, which reduces the amount of detail. Also, roads are not
easily visible in the unsupervised classification image because the agricultural and road features
are given the same pixel colour, which undermines the overall effectiveness of the image.
Figure 9. Unsupervised Classification and False Colour Subset (Differences in Water Sedimentation)
Figure 10. Unsupervised Classification and False Colour Subset (Differences in Agricultural and Roads Features)
After experimenting with the different parameters of the unsupervised classification, a better
understanding of this method has been achieved by using max iterations and the number of classes
to alter the image. In this assignment the parameters that provide the best result are 10 max
iterations, 10 classes, and a convergence threshold of 0.95. After the parameters are chosen, each
class is given its own colour and a description that best illustrates the feature. These colours
allowed for a detailed interpretation of the unsupervised classification, illustrating which
parameter settings are the most effective.
For the supervised classification, three classifiers are used: maximum likelihood, minimum
distance, and Mahalanobis. These algorithms produce different outputs, which are illustrated
below in Figures 12, 13, and 14. First, the minimum distance classification, shown in Figure 12,
assigns each unknown pixel to the closest category mean and does not take variability and
dispersion into account. The issue with this classification is that it shows little sediment in the
body of water and puts multiple types of features together, similar to the unsupervised
classification. Many of the roads/residential features (black colours) and industrial/commercial
features (brown colours) were clumped together. This made the output image inaccurate, as it is
based upon clustering.
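The minimum distance rule can be sketched in a few lines. The training pixels, class names, and
function names below are hypothetical two-band values made up for illustration. Each class is
reduced to its mean, and a pixel is given the class whose mean is closest in straight-line
(Euclidean) distance, with no regard for how spread out each class is.

```python
import math

def class_means(training):
    """training maps class name -> list of pixel tuples (one value per band)."""
    return {name: tuple(sum(band) / len(pixels) for band in zip(*pixels))
            for name, pixels in training.items()}

def minimum_distance(pixel, means):
    """Return the class whose mean is nearest to the pixel."""
    return min(means, key=lambda name: math.dist(pixel, means[name]))

# Hypothetical training sites with two bands per pixel.
training = {
    "water": [(10, 5), (12, 7), (11, 6)],
    "forest": [(40, 80), (44, 90), (42, 85)],
}
means = class_means(training)

print(means["water"])                      # (11.0, 6.0)
print(minimum_distance((15, 20), means))   # water
print(minimum_distance((38, 70), means))   # forest
```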
Second, the Mahalanobis classification, shown in Figure 13, is similar to the minimum distance
classification but takes variability into account, which makes it more useful. However, this
classification tends to overclassify signatures with large values in the covariance matrix and is
slower to compute. The issue with this classification is that certain agricultural areas (light
green) are clumped together with the industrial/commercial features (brown). This classification
also adds areas of roads/residential to forest (dark green) and agricultural areas (light green).
There is a very minimal amount of error with this classification, but the maximum likelihood
classification portrays the overall quality of the image the best.
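The difference from the minimum distance rule can be shown with a small sketch. The class
signatures below are hypothetical two-band means and covariance matrices, and the 2x2 matrix
inverse is written out by hand so the example needs no extra libraries. The Mahalanobis distance
scales the offset from each class mean by that class's variability, so a class with a large
covariance "reaches" further.

```python
def mahalanobis_sq(pixel, mean, cov):
    """Squared Mahalanobis distance for a two-band pixel; cov is a 2x2 matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx = pixel[0] - mean[0]
    dy = pixel[1] - mean[1]
    # (dx, dy) * inverse(cov) * (dx, dy)^T with the 2x2 inverse expanded
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def classify(pixel, signatures):
    """Return the class with the smallest Mahalanobis distance to the pixel."""
    return min(signatures, key=lambda name: mahalanobis_sq(pixel, *signatures[name]))

# Hypothetical signatures: (mean, covariance). Agriculture varies a lot, roads very little.
signatures = {
    "agriculture": ((50, 50), ((100, 0), (0, 100))),
    "roads": ((70, 50), ((4, 0), (0, 4))),
}

pixel = (62, 50)  # 12 units from the agriculture mean, only 8 from the roads mean
print(mahalanobis_sq(pixel, *signatures["agriculture"]))  # 1.44
print(mahalanobis_sq(pixel, *signatures["roads"]))        # 16.0
print(classify(pixel, signatures))                        # agriculture
```

The pixel is closer to the roads mean in straight-line terms, yet it is assigned to agriculture
because that signature has the larger covariance. This is the overclassification tendency noted
above.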
Lastly, maximum likelihood appears to be the most accurate because it takes the most variables
into consideration, although it takes the longest to compute and relies heavily on a normal
distribution. This classification also tends to overclassify signatures with large values in the
covariance matrix. Figure 14 shows the maximum likelihood classification, which displayed the
least amount of error. The roads appear to be the most accurate and do not override any other
features. The water is better represented in the minimum distance classification, but the overall
appearance of the water is better in the maximum likelihood classification. The agricultural and
forested areas are somewhat clumped together, and the separation of the fields can be difficult to
see, but this classification does not clump multiple features together like the others. Figure 15
shows the mean plot for this classification, illustrating the values for each band. This image
shows how each of the features peaks at certain bands but falls at others.
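A minimal sketch of the maximum likelihood rule is given here, assuming each class follows a
normal distribution and all classes are equally likely beforehand (both are assumptions of this
sketch). The signatures and function names are hypothetical two-band values. Compared with the
Mahalanobis distance alone, the score adds a penalty for the size of the covariance matrix (its
log-determinant), so a very spread-out class no longer wins automatically.

```python
import math

def log_likelihood_score(pixel, mean, cov):
    """Gaussian log-density of a two-band pixel, up to a constant shared by all classes."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx = pixel[0] - mean[0]
    dy = pixel[1] - mean[1]
    dist_sq = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det  # Mahalanobis^2
    return -0.5 * (math.log(det) + dist_sq)

def maximum_likelihood(pixel, signatures):
    """Return the class with the highest log-likelihood score (equal priors assumed)."""
    return max(signatures, key=lambda name: log_likelihood_score(pixel, *signatures[name]))

# Hypothetical signatures: (mean, covariance).
signatures = {
    "agriculture": ((50, 50), ((100, 0), (0, 100))),
    "roads": ((70, 50), ((4, 0), (0, 4))),
}

print(maximum_likelihood((62, 50), signatures))  # agriculture
print(maximum_likelihood((65, 50), signatures))  # roads
```

For the pixel (65, 50), the Mahalanobis distance alone would favour agriculture (2.25 against
6.25), but the covariance penalty tips the decision to roads.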
Figure 16. Comparison between Maximum Likelihood Supervised (left) and Unsupervised (right) Classifications
Figure 17. Comparison between Attribute Tables (Supervised, top) (Unsupervised, bottom)
The histograms shown in Figure 18 illustrate how the number of classes along the x-axis differs
between the two types of classification. They also show, along the y-axis, how many pixels each
class has. This comparison displays the differences in the shape of each histogram, which depend
on the type of classification used.
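The counts behind such a histogram can be sketched as follows, assuming the classified image is
available as a grid of class values (the small grid and the function name here are hypothetical).
Each distinct class value becomes a position on the x-axis and its pixel count becomes the bar
height on the y-axis.

```python
from collections import Counter

def class_histogram(classified):
    """Count pixels per class value in a classified raster given as a list of rows."""
    return Counter(value for row in classified for value in row)

# Hypothetical 3 x 4 classified image with class values 1, 2 and 3.
classified = [
    [1, 1, 2, 3],
    [1, 2, 2, 3],
    [1, 1, 2, 2],
]

hist = class_histogram(classified)
for value in sorted(hist):
    print(value, hist[value])
# 1 5
# 2 5
# 3 2
```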
In the unsupervised classification, the body of water is a solid blue with no variation, whereas
the supervised classification clearly shows sediment on the edges of the water body and some in
its centre, represented by different pixel colours. The water body also covers a greater area in
the supervised classification, as shown below in Figure 19. When the two images are compared for
quality, the unsupervised classification appears to excel because it dismisses random pixels
throughout the feature.
Figure 19. Water Comparison between Unsupervised Classification (left) and Supervised Classification (right)
Another noticeable difference between the unsupervised and supervised classifications is found
in the agricultural and forested areas. As shown below in Figure 20, the agricultural features in
the unsupervised classification (light green) are clumped together with other features, such as
the white and grey areas that represent residential and industrial areas. The supervised
classification shows more agricultural and forested areas, which is due to the training sites
being built from carefully selected pixels. The road network is also more visible in the
supervised classification because the main roads divide the agricultural areas, as shown in
Figure 21. Overall, the maximum likelihood supervised classification is the best for these
features because it gives a realistic representation of the main roads within the agricultural
areas.
Figure 20. Agricultural/Forest Comparison between Unsupervised Classification (left) and Supervised Classification (right)
Figure 21. Roads/Residential Comparison between Unsupervised Classification (right) and Supervised Classification (left)
Lastly, the industrial and commercial areas on the supervised image, shown below in Figure 22,
do not absorb agricultural and forested areas. On the unsupervised image, by contrast, the white
areas (residential/industrial) clearly overlap areas that appear as green shades on the supervised
classification image. This ultimately demonstrates the overall effectiveness, quality, and
precision of the supervised classification.
Figure 22. Industrial/Commercial Comparison between Unsupervised Classification (right) and Supervised Classification (left)
Overall, both the unsupervised and the supervised classifications have their advantages and
disadvantages. The unsupervised classification excelled at producing solid colours that make
some features easily recognizable, such as the water body, but it combined multiple features,
which made them difficult to classify. The supervised classification excelled in the agricultural
and forested areas because it picked up pixels more accurately, and its road network is more
precise. However, its water body picked up pixels in the middle from sedimentation at the base of
the water. Weighing the benefits and risks of the two types of classification, the maximum
likelihood supervised classification appears to be the better method because of its overall detail
and accuracy. Thus, for the purpose of this assignment, the supervised classification is the most
effective.
6.0 Conclusion
Knowledge of the unsupervised and supervised classification of the original subset image is an
important asset to acquire within Digital Image Processing and Geographic Information Systems.
This ability allows one to properly classify individual land-use types and output the best quality
image. This assignment is successful in providing an introduction to each type of classification
by using various parameters for each one. By comparing the two classifications, a great amount of
knowledge and understanding has been achieved, which will lead to a greater understanding of
Digital Image Processing in future assignments and upcoming employment opportunities.
7.0 References
Lillesand, T. M., Kiefer, R. W., & Chipman, J. W. (2008). Remote Sensing and Image
Interpretation (Sixth Edition). Hoboken, New Jersey, United States of America: John
Wiley & Sons, Inc.
Niagara College. (n.d.). From: X:\GIS Resources\GIS - Second Semester\GISC9216
DIP\Week2 - Unsupervised_Classification. Niagara-On-The-Lake, Ontario, Canada:
Niagara College Canada.
Niagara College. (n.d.). From: X:\GIS Resources\GIS - Second Semester\GISC9216
DIP\Week3 - Supervised_Classification. Niagara-On-The-Lake, Ontario, Canada: Niagara
College Canada.