Fast and Accurate Detection and Classification of
International Journal of Computer Applications (0975 – 8887)
Volume 17– No.1, March 2011
some texture feature set. The proposed method is an improvement of the approach proposed in [1]. As a testbed we use a set of leaves taken from the Al-Ghor area in Jordan. Our program has been tested on five diseases which affect the plants: early scorch, cottony mold, ashen mold, late scorch, and tiny whiteness. The proposed framework could successfully detect and classify the tested diseases with an average precision of more than 94%, with more than 20% speedup over the approach presented in [1]. The minimum precision value was 84%, compared to 80% precision in the previous approach.

In conclusion, the aim of this work is threefold: 1) identifying the infected object(s) based upon the K-means clustering procedure; 2) extracting the feature set of the infected objects using the color co-occurrence methodology for texture analysis; and 3) detecting and classifying the type of disease using ANNs. Moreover, the presented scheme classifies the plant leaves and stems at hand into infected and not-infected classes.

Algorithm 1: Basic steps describing the proposed algorithm.
1. RGB image acquisition
2. Create the color transformation structure
3. Convert the color values in RGB to the space specified in the color transformation structure
4. Apply K-means clustering
5. Masking green pixels
6. Remove the masked cells inside the boundaries of the infected clusters
7. Convert the infected cluster(s) from RGB to HSI
8. SGDM matrix generation for H and S
9. Calling the GLCM function to calculate the features
10. Texture statistics computation
11. Configuring neural networks for recognition
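The clustering and masking stage of Algorithm 1 (steps 4 to 6) can be sketched in a few lines. The following Python/NumPy fragment is an illustrative re-implementation, not the authors' MATLAB code; the function names (`kmeans`, `mask_green`), the deterministic farthest-point initialization, the choice of k = 2, and the green-dominance test are all our own assumptions.

```python
import numpy as np

def kmeans(pixels, k, iters=10):
    """Plain k-means on an (N, 3) array of pixel colors.
    Deterministic farthest-point initialization (our choice, for reproducibility)."""
    centers = [pixels[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(pixels - c, axis=1) for c in centers], axis=0)
        centers.append(pixels[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        # assign each pixel to its nearest cluster center (step 4)
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute each center as the mean of its members
        for c in range(k):
            if np.any(labels == c):
                centers[c] = pixels[labels == c].mean(axis=0)
    return labels, centers

def mask_green(image, labels, centers):
    """Steps 5-6: zero out the predominantly green (healthy) clusters so only
    the potentially infected regions remain."""
    green = centers[:, 1] > centers[:, [0, 2]].max(axis=1)  # G channel dominates
    keep = ~green[labels].reshape(image.shape[:2])
    return image * keep[..., None]

# Tiny synthetic "leaf": green background with a brown lesion patch.
img = np.zeros((8, 8, 3))
img[..., 1] = 0.8                      # healthy green tissue
img[2:5, 2:5] = [0.55, 0.27, 0.07]     # brown (infected) patch
labels, centers = kmeans(img.reshape(-1, 3), k=2)
segmented = mask_green(img, labels, centers)
```

On this toy image the brown lesion pixels survive the mask while the green background is zeroed, which is the kind of input the subsequent feature-extraction steps expect.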
2. THE PROPOSED APPROACH – STEP-BY-STEP DETAILS
The overall framework of any vision-based image classification algorithm is almost the same. First, the digital images are acquired from the environment using a digital camera. Then image-processing techniques are applied to the acquired images to extract useful features that are necessary for further analysis. After that, several analytical discriminating techniques are used to classify the images according to the specific problem at hand. Figure 1 depicts the basic procedure of the proposed vision-based detection algorithm in this research.

Figure 1: Basic procedure of the proposed vision-based detection algorithm: image acquisition, image preprocessing, image segmentation, feature extraction, statistical analysis, and classification based on a classifier.

Figure 2: Sample images from our dataset (early scorch, cottony mold, ashen mold, late scorch).
CCM method in short. It is a method in which both the color and the texture of an image are taken into account to arrive at unique features that represent that image.

2.3.1 Co-occurrence Methodology for Texture Analysis
The image analysis technique selected for this study was the CCM method. The use of color image features in the visible light spectrum provides additional image characteristic features over the traditional gray-scale representation [2].

The CCM methodology established in this work consists of three major mathematical processes. First, the RGB images of leaves are converted into Hue Saturation Intensity (HSI) color space representation. Once this process is completed, each pixel map is used to generate a color co-occurrence matrix, resulting in three CCM matrices, one for each of the H, S and I pixel maps. The HSI space is a popular color space because it is based on human color perception [17]. Electromagnetic radiation in the range of wavelengths of about 400 to 700 nanometers is called visible light because the human visual system is sensitive to this range. Hue is generally related to the wavelength of the light, and intensity shows the amplitude of the light. Lastly, saturation is a component that measures the "colorfulness" in HSI space [17].

Color spaces can be transformed from one space to another easily. In our experiments, Equations 1, 2 and 3 were used to transform the image's components from RGB to HSI [6]:

H = cos^{-1} { [(R - G) + (R - B)] / (2 [(R - G)^2 + (R - B)(G - B)]^{1/2}) }   (1)

S = 1 - 3 min(R, G, B) / (R + G + B)   (2)

I = (R + G + B) / 3   (3)

2.3.2 Normalizing the CCM matrices
The CCM matrices are then normalized using Equation 4:

p(i, j) = P(i, j) / Σ_{i=0}^{Ng-1} Σ_{j=0}^{Ng-1} P(i, j)   (4)

where p(i, j) is the image attribute matrix, P(i, j) represents the intensity co-occurrence matrix, and Ng represents the total number of intensity levels.

The marginal probability matrix p_x can be defined as shown in Equation 5:

p_x(i) = Σ_{j=0}^{Ng-1} p(i, j)   (5)

Sum and difference matrices (p_{x+y} and p_{x-y}) are defined as shown in Equations 6 and 7, respectively:

p_{x+y}(k) = Σ_i Σ_j p(i, j),  i + j = k   (6)

where k = 0, 1, ..., 2(Ng - 1).
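The RGB-to-HSI transformation of Equations 1 to 3 can be checked numerically. The function below is an illustrative Python sketch; the name `rgb_to_hsi`, the normalization of R, G and B to [0, 1], and the reflection of the hue angle when B > G are our assumptions, not details taken from the paper.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to HSI per Eqs. 1-3."""
    i = (r + g + b) / 3.0                          # Eq. 3: intensity
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i  # Eq. 2: saturation
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    h = 0.0 if den == 0 else math.acos(num / den)  # Eq. 1: hue angle in radians
    if b > g:                                      # hue lies in [pi, 2*pi) when B > G
        h = 2.0 * math.pi - h
    return h, s, i

# Pure red: hue 0, fully saturated, intensity 1/3.
h, s, i = rgb_to_hsi(1.0, 0.0, 0.0)
print(round(h, 3), round(s, 3), round(i, 3))  # 0.0 1.0 0.333
```

The degenerate cases (zero intensity, or R = G = B, where the hue denominator vanishes) are handled explicitly, since the closed-form equations are undefined there.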
p_{x-y}(k) = Σ_i Σ_j p(i, j),  |i - j| = k   (7)

where k = 0, 1, ..., Ng - 1.

2.3.3 Texture Features Identification
The following feature set was computed for the components H and S:

The angular moment (I1) is used to measure the homogeneity of the image, and is defined as shown in Equation 8:

I1 = Σ_i Σ_j [p(i, j)]^2   (8)

The product moment (cov) is analogous to the covariance of the intensity co-occurrence matrix and is defined as shown in Equation 9:

cov = Σ_i Σ_j (i - μ_x)(j - μ_y) p(i, j)   (9)

where μ_x and μ_y are the means of the marginal distributions p_x and p_y.

The sum and difference entropies (S_E and D_E) are computed using Equations 10 and 11, respectively:

S_E = - Σ_{k=0}^{2(Ng-1)} p_{x+y}(k) ln p_{x+y}(k)   (10)

D_E = - Σ_{k=0}^{Ng-1} p_{x-y}(k) ln p_{x-y}(k)   (11)

The entropy feature (E) is a measure of the amount of order in an image, and is computed as defined in Equation 12:

E = - Σ_i Σ_j p(i, j) ln p(i, j)   (12)

The information measure of correlation (I_cor) is defined as shown in Equation 13:

I_cor = (E - E_1) / max(H_x, H_y)   (13)

where H_x and H_y are the entropies of p_x and p_y, and E_1 = - Σ_i Σ_j p(i, j) ln [p_x(i) p_y(j)].

Contrast of an image can be measured by the inverse difference moment (IDM) as shown in Equation 14:

IDM = Σ_i Σ_j p(i, j) / [1 + (i - j)^2]   (14)

Correlation (C) is a measure of intensity linear dependence in the image and is defined as shown in Equation 15:

C = [Σ_i Σ_j (i · j) p(i, j) - μ_x μ_y] / (σ_x σ_y)   (15)

where σ_x and σ_y are the standard deviations of p_x and p_y.

Before the data can be fed to the ANN model, the proper network design must be set up, including the type of network and the method of training. This was followed by the optimal parameter selection phase. However, this phase was carried out simultaneously with the network training phase, in which the network was trained using the feed-forward back-propagation network. In the training phase, connection weights were updated until they reached the defined iteration number or an acceptable error. Hence, the capability of the ANN model to respond accurately was assured using the Mean Square Error (MSE) criterion to assess the model validity between the target and the network output.

3. EXPERIMENTAL RESULTS AND OBSERVATIONS

3.1 Input Data Preparation and Experimental Settings
In our experiments, two main files were generated, namely: (i) training texture feature data, and (ii) testing texture feature data. The two files had 192 rows each, representing 32 samples from each of the six classes of leaves. Each row had 10 columns representing the 10 texture features extracted for a particular sample image. Each row had a unique number (1, 2, 3, 4, 5 or 6) which represented the class (i.e., the disease) of the particular row of data: '1' represented early scorch disease infected leaf, '2' represented cottony mold disease infected leaf, '3' represented ashen mold disease infected leaf, '4' represented late scorch disease infected leaf, '5' represented tiny whiteness disease infected leaf, and '6' represented a normal leaf. Then, a software program was written in MATLAB that would take in .mat files representing the training and testing data, train the classifier using the 'train files', and then use the 'test file' to perform the classification task on the test data. Consequently, a MATLAB routine would load all the data files (training and testing data files) and make modifications to the data according to the proposed model chosen. In the experimental results, the threshold value for each of the above categories is constant for all samples infected with the same disease. This threshold is a global image threshold that is computed using Otsu's method [12; 13].

The architecture of the network used in this study was as follows. A set of 10 hidden layers in the neural network was used, with the number of inputs to the neural network (i.e., the number of input neurons) equal to the number of texture features listed above. The number of outputs is 6, which is the number of classes representing the 5 diseases studied along with the case of a normal (uninfected) leaf. Those diseases are early scorch, cottony mold, ashen mold, late scorch, and tiny whiteness. The neural network used is feed-forward back propagation, with the Mean Square Error (MSE) as the performance function; the number of iterations was 10000 and the maximum allowed error was 10^-5.
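The SGDM/GLCM computation and a few of the statistics above (Equations 4, 6, 8, 10, 12 and 14) can be sketched as follows. This Python/NumPy fragment is an illustrative re-implementation; the nearest-horizontal-neighbor pixel offset, the 4-level quantization of the toy channel, and the function names are our assumptions rather than the paper's exact settings.

```python
import numpy as np

def glcm(channel, levels):
    """SGDM/GLCM counts for horizontally adjacent pixel pairs,
    normalized per Eq. 4 into the attribute matrix p(i, j)."""
    P = np.zeros((levels, levels))
    for a, b in zip(channel[:, :-1].ravel(), channel[:, 1:].ravel()):
        P[a, b] += 1
    return P / P.sum()

def texture_features(p):
    eps = 1e-12                                   # guards log(0)
    i, j = np.indices(p.shape)
    asm = (p ** 2).sum()                          # Eq. 8: angular moment
    entropy = -(p * np.log(p + eps)).sum()        # Eq. 12: entropy
    idm = (p / (1.0 + (i - j) ** 2)).sum()        # Eq. 14: inverse difference moment
    # Eq. 6: sum matrix p_{x+y}(k), then Eq. 10: sum entropy
    n = p.shape[0]
    p_sum = np.array([p[(i + j) == k].sum() for k in range(2 * n - 1)])
    sum_entropy = -(p_sum * np.log(p_sum + eps)).sum()
    return asm, entropy, idm, sum_entropy

# Quantized 4-level stand-in for an H (hue) pixel map of a tiny image.
ch = np.array([[0, 0, 1, 1],
               [0, 0, 1, 1],
               [2, 2, 3, 3],
               [2, 2, 3, 3]])
p = glcm(ch, levels=4)
asm, entropy, idm, sum_entropy = texture_features(p)
print(asm, entropy, idm, sum_entropy)
```

In the paper's pipeline these statistics would be computed from the SGDM of the H and S pixel maps of each segmented leaf region, yielding the 10-feature rows used to train the classifier.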
Figure 7 shows a graph representing the percentage classification of the various diseases for all the data models shown in Table 2.

Table 2: Percentage classification of various diseases

Model | Color features used | Early scorch | Cottony mold | Ashen mold | Late scorch | Tiny whiteness | Normal | Overall average
M1 | HS | 98 | 96 | 89 | 91 | 92 | 100 | 94.33
M2 | H | 90 | 92 | 86 | 89 | 93 | 98 | 91.33
M3 | S | 90 | 89 | 85 | 89 | 81 | 98 | 88.67
M4 | I | 92 | 89 | 84 | 88 | 86 | 99 | 89.67
M5 | HSI | 81 | 84 | 78 | 79 | 81 | 99 | 83.67

Figure 7: Percentage classification of various diseases (bar chart of the per-class percentages in Table 2 for the color feature sets HS, H, S, I and HSI).

The recognition rate of the NN classification strategy for all models was also computed for early scorch, cottony mold and normal leaf images based upon Algorithm 1; the obtained results for Models M1 and M5 are reported in Table 3.

Table 3: Recognition rates of individual plant diseases

Model | Color features | Early scorch | Cottony mold | Normal | Overall average
M1 | HS | 99 | 100 | 100 | 99.66
M5 | HSI | 96 | 98 | 100 | 98.00

Table 4: Classification results per class for the neural network with back propagation

From species | Early scorch | Cottony mold | Ashen mold | Late scorch | Tiny whiteness | Normal | Accuracy
Early scorch | 25 | 0 | 0 | 0 | 0 | 1 | 100
Cottony mold | 0 | 24 | 0 | 1 | 0 | 0 | 96
Ashen mold | 0 | 0 | 25 | 0 | 0 | 1 | 100
Late scorch | 0 | 0 | 0 | 22 | 1 | 0 | 88
Tiny whiteness | 0 | 1 | 0 | 0 | 23 | 0 | 92
Normal | 0 | 0 | 0 | 2 | 1 | 23 | 92
Average | | | | | | | 94.67

Figure 8: Classification results per class for the neural network with back propagation (bar chart of the per-class accuracies in Table 4).

The convergence curve for the learning step of the neural network in the proposed study is better than that of [1], as shown in Figure 9. In Figure 9, Approach 1 represents the study presented in [1], while Approach 2 represents our proposed study.
It can be inferred from Table 3 that Model M1, which used only the H and S components in computing the texture features, emerged as the best model among the various models. Furthermore, it can be observed from Table 3 that Model M5 has a lower recognition rate than Model M1; this was in part because of the elimination of the intensity (I) component from the texture-feature computation in Model M1. As a matter of fact, eliminating intensity is serviceable in this study because it nullifies the effect of intensity variations. The numbers of leaf samples that were classified into each of the five tested categories using Model M1 with a specific threshold value are shown in Table 4 and Figure 8. It is observed from Table 4 that only a few samples from the late scorch and tiny whiteness classes were misclassified: three test images were misclassified for the case of late scorch infected leaves, and in the case of tiny whiteness images, only two test images from the class were misclassified. On average, the classification accuracy using our approach was 94.67%, compared to 92.7% when using the approach presented in [1].

Figure 9: The convergence curves of [1], that is "Approach 1", and of our approach, "Approach 2".
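The classifier used above (a feed-forward network trained by back propagation with an MSE criterion) can be sketched compactly. The NumPy code below is an illustrative stand-in for the paper's MATLAB network: the 10-input, 6-output shape mirrors the architecture described in Section 3.1, while the single 10-neuron hidden layer, the toy data, the sigmoid activations, the learning rate and the epoch count are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the texture-feature data: 10 features -> 6 classes.
X = rng.normal(size=(60, 10))
y = np.eye(6)[X.dot(rng.normal(size=(10, 6))).argmax(axis=1)]  # one-hot targets

W1 = rng.normal(scale=0.5, size=(10, 10))   # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(10, 6))    # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    h = sigmoid(x @ W1)                     # hidden activations
    return h, sigmoid(h @ W2)               # network output

_, out0 = forward(X)
mse0 = ((out0 - y) ** 2).mean()             # MSE before training

lr = 0.5
for _ in range(2000):                       # iterate for a fixed budget
    h, out = forward(X)
    d_out = (out - y) * out * (1 - out)     # back-propagated output delta
    d_hid = (d_out @ W2.T) * h * (1 - h)    # back-propagated hidden delta
    W2 -= lr * h.T @ d_out / len(X)         # gradient step on mean squared error
    W1 -= lr * X.T @ d_hid / len(X)

mse = ((forward(X)[1] - y) ** 2).mean()
print(f"MSE before {mse0:.3f} -> after {mse:.3f} training")
```

As in the paper's setup, training stops on an iteration budget and the MSE between target and network output is the criterion that tracks learning progress.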
It can be concluded from the above tables and figures that the obtained results achieve an acceptable level of accuracy, which also lends weight to the proposed study. The approaches presented in this paper and in [1] have been implemented in MATLAB under the Windows XP environment. All the experiments were carried out on a desktop computer with an Intel(R) Pentium(R) CPU 2.20 GHz, 645 MHz, and 256 MB of RAM. The average computation time for the two approaches was computed in seconds for the models M1, M2, M3, M4 and M5, as shown in Table 5. The data in Table 5 were obtained for the two approaches using the same neural network structure and on the same machine. Figure 10 shows that our proposed approach has a 19% speedup over the approach of [1].

Table 5: Average computation time (seconds).

Model | Approach 1 | Approach 2
M1 | 456.19 | 392.98
M2 | 399.00 | 359.91
M3 | 477.91 | 388.27
M4 | 408.22 | 347.44
M5 | 449.12 | 414.60
Average | 438.09 | 380.64

Figure 10: Percentage speedup of the proposed approach (Approach 2) over the approach of [1].

4. CONCLUSION AND FUTURE WORK
In this paper, the applications of K-means clustering and Neural Networks (NNs) have been formulated for the clustering and classification, respectively, of diseases that affect plant leaves. Recognizing the disease is the main purpose of the proposed approach. Thus, the proposed algorithm was tested on five diseases which affect the plants: early scorch, cottony mold, ashen mold, late scorch, and tiny whiteness.

The experimental results indicate that the proposed approach is valuable and can significantly support accurate detection of leaf diseases with little computational effort.

An extension of this work will focus on developing hybrid algorithms, such as genetic algorithms combined with NNs, in order to increase the recognition rate of the final classification process, underscoring the advantages of hybrid algorithms; we will also dedicate future work to automatically estimating the severity of the detected disease.

REFERENCES
[1]. Al-Bashish, D., M. Braik and S. Bani-Ahmad, 2011. Detection and classification of leaf diseases using K-means-based segmentation and neural-networks-based classification. Inform. Technol. J., 10: 267-275. DOI: 10.3923/itj.2011.267.275.
[2]. Aldrich, B. and Desai, M. (1994). Application of spatial grey level dependence methods to digitized mammograms. In Proceedings of the IEEE Southwest Symposium on Image Analysis and Interpretation, pp. 100-105, 21-24 April 1994. DOI: 10.1109/IAI.1994.336675.
[3]. Ali, S. A., Sulaiman, N., Mustapha, A. and Mustapha, N. (2009). K-means clustering to improve the accuracy of decision tree response classification. Inform. Technol. J., 8: 1256-1262. DOI: 10.3923/itj.2009.1256.1262.
[4]. Bauer, S. D., Korc, F. and Förstner, W. (2009). Investigation into the classification of diseases of sugar beet leaves using multispectral images. In: E. J. van Henten, D. Goense and C. Lokhorst (eds.), Precision Agriculture '09. Wageningen Academic Publishers, pp. 229-238. URL: http://www.precision-crop-protection.uni-bonn.de/gk_research/project.php?project=3_09.
[5]. Camargo, A. and Smith, J. S. (2009). An image-processing based algorithm to automatically identify plant disease visual symptoms. Biosystems Engineering, Volume 102, Issue 1, January 2009, pp. 9-21.
[9]. Jun, W. and S. Wang (2008). Image thresholding using weighted parzen-window estimation. J. Applied Sci., 8: 772-779. DOI: 10.3923/jas.2008.772.779. URL: http://scialert.net/abstract/?doi=jas.2008.772.779.
[10]. MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In L. M. LeCam and J. Neyman (eds.), Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1, pp. 281-297, Berkeley, CA. University of California Press. URL: http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.bsmsp/1200512992.
[11]. Martinez, A. (2007). 2007 Georgia Plant Disease Loss Estimates. http://www.caes.uga.edu/Publications/displayHTML.cfm?pk_id=7762. Viewed on Saturday, 15 January 2011.
[12]. Otsu, N. (1979). A threshold selection method from gray-level histograms. IEEE Trans. Sys., Man., Cyber., 9: 62-66. DOI: 10.1109/TSMC.1979.4310076.
[13]. Otsu, N. (1979). A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, Vol. 9, No. 1, pp. 62-66.
[14]. Prasad Babu, M. S. and Srinivasa Rao, B. (2010). Leaves recognition using back-propagation neural network - advice for pest and disease control on crops. Technical report, Department of Computer Science & Systems Engineering, Andhra University, India. Downloaded from www.indiakisan.net on May 2010.
[15]. Sezgin, M. and Sankur, B. (2003). Survey over image thresholding techniques and quantitative performance evaluation. Journal of Electronic Imaging, 13(1): 146-165. DOI: 10.1117/1.1631315.
[16]. Soltanizadeh, H. and B. S. Shahriar (2008). Feature extraction and classification of objects in the rosette pattern using component analysis and neural network. J. Applied Sci., 8: 4088-4096. DOI: 10.3923/jas.2008.4088.4096. URL: http://scialert.net/abstract/?doi=jas.2008.4088.4096.
[17]. Stone, M. C. (August 2001). A Survey of Color for Computer Graphics. Course at SIGGRAPH 2001.
[18]. Rumpf, T., A.-K. Mahlein, U. Steiner, E.-C. Oerke, H.-W. Dehne and L. Plumer (2010). Early detection and classification of plant diseases with Support Vector Machines based on hyperspectral reflectance. Computers and Electronics in Agriculture, Volume 74, Issue 1, October 2010, pp. 91-99. DOI: 10.1016/j.compag.2010.06.009.
[19]. Wang, X., M. Zhang, J. Zhu and S. Geng (2008). Spectral prediction of Phytophthora infestans infection on tomatoes using artificial neural network (ANN). International Journal of Remote Sensing, 29(6): 1693-1706.
[20]. Weizheng, S., Yachun, W., Zhanliang, C. and Hongda, W. (2008). Grading method of leaf spot disease based on image processing. In Proceedings of the 2008 International Conference on Computer Science and Software Engineering, Volume 6, pp. 491-494. IEEE Computer Society, Washington, DC. DOI: 10.1109/CSSE.2008.1649.