Image Analysis System For Detection of Red Cell Disorders Using Artificial Neural Networks
Ms. Y M Hirimutugoda BSc, MSc (IT)
Department of Computer Science and Engineering, Faculty of Engineering, University of Moratuwa, Moratuwa,
Sri Lanka
E-mail address: [email protected]
Dr. Gamini Wijayarathna B.Sc, MEng, DrEng
Senior Lecturer, Department of Industrial Management, Faculty of Science, University of Kelaniya, Sri Lanka
E-mail address: [email protected]
Abstract
This paper investigates the possibility of rapid and accurate automated diagnosis of red blood cell disorders and
describes a method to detect malarial parasites and thalassaemia in blood sample images acquired from light
microscopes. As malaria and thalassaemia are life-threatening diseases and an enormous global health problem,
rapid and precise differentiation is necessary in clinical settings. The analysis of blood is a powerful diagnostic
tool for the detection of these diseases. Visual inspection of microscopic images is the most widely used
technique for determining malaria and possible thalassaemia, but it is a labour-intensive, repetitive and
time-consuming task. Two back-propagation Artificial Neural Network models (three layers and four layers) were
employed together with image analysis techniques to evaluate the accuracy of classification in the recognition of
medical image patterns associated with morphological features of erythrocytes in the blood. The three-layer
Artificial Neural Network (ANN) architecture had the best performance, with an error of 2.74545e-005 and
an 86.54% correct recognition rate. The trained three-layer ANN acts as the final detection classifier to determine
diseases. A medical consultation system has been used jointly with this system to provide clinical decision-making
ability. The medical consultation system conducts a question-and-answer dialogue based on patient history,
physical examination and routine diagnostic tests, together with the image analysis result produced
by the trained ANN.
Introduction
Thalassaemia is a group of inherited diseases of the blood that affect a person's
ability to produce haemoglobin, resulting in anaemia. Haemoglobin is a protein in red blood
cells that carries oxygen to cells in the body. The World Health Organization
(WHO) report on thalassaemia states that each year about 300,000 infants worldwide are
born with thalassaemia syndromes (30%) or sickle-cell anaemia (70%). Globally, the
percentage of carriers of thalassaemia is greater than that of carriers of sickle-cell anaemia,
but because of the higher frequency of the sickle-cell gene in certain regions, the number of
affected sickle-cell births is higher than that of thalassaemia (3). Thalassaemia occurs most
frequently in people of Italian, Greek, Middle Eastern, Southern Asian and African ancestry.
Y. M. Hirimutugoda, et. al. / Sri Lanka Journal of Bio-Medical Informatics 2010;1(1):35-42
Visual inspection of microscopic images is currently the most widely used technique for
determining these blood disorders. Microscopy of Giemsa-stained thick and thin blood
films is the current standard for determining malaria. In a peripheral blood sample,
visual detection and recognition of Plasmodium spp. is made possible by a chemical
process called Giemsa staining. The staining process slightly colourises the red blood cells
(RBCs) but highlights Plasmodium spp. parasites, white blood cells (WBCs), platelets and
artefacts. Detection of Plasmodium spp. therefore requires detection of stained objects. However, to
prevent false diagnosis, the stained objects have to be analyzed further to determine whether they
are parasites or not (4).
Although microscopy has good sensitivity and allows species identification, it has
some drawbacks. Visual inspection of microscopic images is time-consuming and exhausting.
If the detection and counting process is interrupted, the operator has to start over again from
scratch. Human visual analysis can differentiate cells in microscope images using only
spatial and intensity information. After the blood cell slides have been analyzed, they are
stored away, and there is no quick and easy way of retrieving and analyzing large numbers of
images for future reference, as there would be with a computerized system. Some decision makers in
emergency situations may not have access to test results before having to decide on
treatment, and they may have no experience of a particular rare condition and therefore not
recognize it or not know how to deal with it. Emotional problems and fatigue degrade the
expert's performance. Standardized automated image analysis software would circumvent
the limitations associated with manual determination (5).
An Artificial Neural Network (ANN) has been employed together with image processing
techniques to automate the assessment of these blood disorders using the morphological
features of erythrocytes in the blood. Prior to training, the first necessary step was to
pre-process the Giemsa-stained blood sample images acquired using a high-resolution
digital camera mounted on a microscope. The images had various magnification
factors, colour depths and image sizes, and were prone to various types of noise
which reduce image quality. To prepare this data set for training
and testing, common image processing tasks were performed on the digital images, including
image enhancement, edge erosion, colour and size normalization, specification and extraction
of a region of interest (ROI), grey level conversion and reading of pixel values.
Image Processing
A large number of microscope images had to be collected for training and testing.
Depending on the magnification factor, the size of a red cell can change from one
image to another. To bring the average red cell size to a standard size, size
normalization was applied to the images according to their magnification factors. The
average red cell diameter was set to 32 pixels after evaluating the performance of
several different numbers of training cycles on the same dataset with different
average red cell diameters. Accuracy is higher when more information, that is a
larger number of pixels, can be acquired from each blood cell.
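The size normalization step can be sketched as follows. This is a minimal illustration only: it assumes the average cell diameter of an image has already been measured (or derived from its magnification factor), and it uses nearest-neighbour resampling, which the paper does not specify. The function names `scale_factor` and `resize_nearest` are hypothetical.

```python
TARGET_DIAMETER = 32  # average red cell diameter in pixels, as stated in the text


def scale_factor(measured_diameter, target=TARGET_DIAMETER):
    """Factor that maps the measured average cell diameter onto the target."""
    if measured_diameter <= 0:
        raise ValueError("measured_diameter must be positive")
    return target / measured_diameter


def resize_nearest(image, factor):
    """Resize a 2-D list of pixel values with nearest-neighbour sampling.

    Nearest-neighbour is an assumption made for this sketch; any
    interpolation method could be substituted.
    """
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, round(src_h * factor))
    dst_w = max(1, round(src_w * factor))
    return [
        [
            image[min(src_h - 1, int(y / factor))][min(src_w - 1, int(x / factor))]
            for x in range(dst_w)
        ]
        for y in range(dst_h)
    ]
```

For example, an image whose cells average 64 pixels across would be resized with `resize_nearest(image, scale_factor(64))`, halving each dimension.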
It is essential to apply colour normalization to the images in order to decrease the effect of
different light sources or sensor characteristics (e.g. intensity, white balance). Among many
computational colour consistency algorithms based on the different models of illumination
change, we have chosen to use an adapted grey world normalization method. According to
this method, it is assumed that the colour in each sensor channel averages to grey over the
entire image. If it does not, the colour cluster is rotated onto the main diagonal. The grey world
normalization method is based on the diagonal model of illumination change and utilizes
certain characteristics of microscopic peripheral blood images. Grey level normalization
assumes that there is a constant grey value of the image which does not change among
different conditions (6). In the diagonal model, an image of unknown illumination $I^{u}$ can be
transformed to the known illuminant space $I^{k}$ by multiplying pixel values with a
diagonal matrix $M$: $I^{k}_{rgb}(x) = M\,I^{u}_{rgb}(x)$, where $\mu^{I}_{rgb}$ are the means for channels r, g, b.
For ordinary images, normalization with a transformation using the average values yields
poor results. However, the images in this study contain two basic components (plasma
and the rest) which can be separated by foreground and background segmentation. Hence,
the grey value assumption can be successfully incorporated into the normalization process. In
this method, the input image is first separated into foreground and background regions,
following the method described by WHO (3), which uses area morphology to estimate the size
of cells and then extracts foreground objects and estimates histograms. This procedure is
quite efficient for normalizing the image with respect to global illumination and
staining effects.
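As an illustration of the diagonal model, the sketch below applies the plain grey world assumption: each channel is multiplied by a gain so that its mean equals the common grey level. It is not the adapted foreground/background method described above. The choice of diagonal entries (grey level divided by channel mean) is the standard grey world choice and is an assumption here, since the entries of the matrix are not given in the text. The function names are hypothetical.

```python
def channel_means(pixels):
    """Mean of each of the r, g, b channels over a flat list of (r, g, b) pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def grey_world(pixels):
    """Scale each channel so that its mean equals the common grey level.

    The three gains are the diagonal entries of the matrix in the
    diagonal model (assumed here to be grey / channel mean).
    """
    means = channel_means(pixels)
    grey = sum(means) / 3.0
    # A channel with zero mean is left unchanged to avoid division by zero.
    gains = tuple(grey / m if m > 0 else 1.0 for m in means)
    # Values are clipped to the 8-bit range after scaling.
    return [
        tuple(min(255.0, p[c] * gains[c]) for c in range(3))
        for p in pixels
    ]
```

To approximate the adapted method, the means could be computed over background (plasma) pixels only and the resulting gains applied to the whole image; that variant is not shown here.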
Regions of interest were determined within the larger captured images (some of which
contain more than one suspicious area). The size of the ROI was chosen, by inspecting
300 randomly selected images, to be large enough to accommodate the largest abnormal
leukocyte with some headroom. A Region of Interest (ROI) was defined around the
centre (cx, cy) of the suspicious area on the image, as a square of 160 by 160 pixels,
in order to minimize training issues in the implementation stage.
Extraction of the ROI is a common pre-processing operation in image processing: it allows
the region to be investigated more closely without the added complexity of
extraneous data from other, unwanted parts of the image. To do this, the
region of interest is identified and then cropped, or cut away, from the rest of the image. The other
reason for segmentation is to keep the ANN to the smallest possible size
in order to make training easier.
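The ROI extraction can be sketched as a simple crop around the centre (cx, cy). The handling of centres that lie close to the image border (the window is shifted inward so that it stays inside the image) is an assumption of this sketch and is not described in the paper; `crop_roi` is a hypothetical name.

```python
ROI_SIZE = 160  # side length of the square ROI in pixels, as stated in the text


def crop_roi(image, cx, cy, size=ROI_SIZE):
    """Return a size-by-size window of a 2-D list, centred near (cx, cy)."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the region of interest")
    half = size // 2
    # Shift the window inward when the centre is too close to a border.
    left = min(max(cx - half, 0), width - size)
    top = min(max(cy - half, 0), height - size)
    return [row[left:left + size] for row in image[top:top + size]]
```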
Due to edge effects in the captured images (the camera produced noticeable dark bands at the
very edges of the image) there were many nuclear pixels around the perimeter of the image.
An edge erosion filter, which simply set all the pixels within 3 pixels of the edge of the image
to white, was developed and used.
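A minimal sketch of such an edge erosion filter is given below. It assumes an 8-bit grey level image stored as a 2-D list, with 255 representing white; the name `erode_edges` is hypothetical.

```python
WHITE = 255  # assumed white level for an 8-bit grey level image


def erode_edges(image, border=3, fill=WHITE):
    """Set every pixel within `border` pixels of the image edge to `fill`."""
    height, width = len(image), len(image[0])
    return [
        [
            fill
            if (y < border or y >= height - border
                or x < border or x >= width - border)
            else image[y][x]
            for x in range(width)
        ]
        for y in range(height)
    ]
```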
The number of hidden neurons was chosen on the basis of trials with the training set to
be as low as possible consistent with learning. Each pattern in the training and testing sets
consisted of a pattern name, 160 * 160 pixel image data and a target output code. These are
shown in Table 1, which illustrates the actual input and target output of cell patterns.
[Table 1: blood images (160*160 pixels) for the pattern classes Malaria, Thalassaemia, Normal and Other Abnormal]
The outputs from the neurons in the output layer of the neural network are presented as
percentages, from 0% to 100%. With respect to the target outputs, "0" equals 0% and "1"
equals 100%. The target outputs are defined as [1000] for Malaria, [0100] for Thalassaemia,
[0010] for Normal and [0001] for Other Abnormal. All the patterns were manually classified to
define the target output. The ANN was trained with a standard back-propagation learning
algorithm, with weights updated in batch mode. The network was trained for 1000 cycles.
The number of training cycles was chosen after evaluating the performance of several
different numbers of training cycles with the development set.
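The target coding and batch-mode training described above can be sketched in plain Python as follows. This is an illustrative three-layer network (input, one hidden layer, output) with sigmoid units and a squared-error cost. The learning rate, weight initialisation, hidden layer size and all function names are assumptions made for the sketch, not values taken from the study.

```python
import math
import random

# Target output codes as given in the text.
TARGETS = {
    "malaria": [1, 0, 0, 0],
    "thalassaemia": [0, 1, 0, 0],
    "normal": [0, 0, 1, 0],
    "other_abnormal": [0, 0, 0, 1],
}
CLASS_BY_INDEX = ["malaria", "thalassaemia", "normal", "other_abnormal"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """One hidden layer; the last weight in each row is the bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, output


def train_batch(samples, n_hidden=4, cycles=1000, rate=0.5, seed=0):
    """Batch back-propagation: gradients are accumulated over all samples
    and the weights are updated once per training cycle."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    errors = []  # mean half squared error per cycle
    for _ in range(cycles):
        g_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g_out = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
        total = 0.0
        for x, target in samples:
            hidden, output = forward(x, w_hidden, w_out)
            # Error terms (deltas) for the output and hidden layers.
            d_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
            d_hid = [
                h * (1 - h) * sum(d_out[k] * w_out[k][j] for k in range(n_out))
                for j, h in enumerate(hidden)
            ]
            for k in range(n_out):
                for j, v in enumerate(hidden + [1.0]):
                    g_out[k][j] += d_out[k] * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    g_hidden[j][i] += d_hid[j] * v
            total += 0.5 * sum((o - t) ** 2 for o, t in zip(output, target))
        # Single weight update per cycle, using the batch-averaged gradient.
        for k in range(n_out):
            for j in range(n_hidden + 1):
                w_out[k][j] -= rate * g_out[k][j] / len(samples)
        for j in range(n_hidden):
            for i in range(n_in + 1):
                w_hidden[j][i] -= rate * g_hidden[j][i] / len(samples)
        errors.append(total / len(samples))
    return w_hidden, w_out, errors


def classify(x, w_hidden, w_out):
    """Return the class whose output neuron has the highest activation."""
    _, output = forward(x, w_hidden, w_out)
    return CLASS_BY_INDEX[output.index(max(output))]
```

In the study each input pattern would be the 160 * 160 pixel values of an ROI flattened into one list; a pure Python loop of this kind is only practical for much smaller inputs and serves to show the update rule.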
Sample blood      Number of images      No. of correct recognitions in testing
image type        Training   Testing    Three layer ANN     Four layer ANN
Malaria           120        70         61/70 = 87.14%      56/70 = 80.00%
Thalassaemia      120        80         72/80 = 90.00%      67/80 = 83.75%
Normal            120        60         52/60 = 85.00%      50/60 = 83.33%
Other abnormal    120        50         42/50 = 84.00%      40/50 = 80.00%
Total average correct recognition rate  86.54%              84.25%
Saved error                             2.74545e-005        2.82124e-005
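The total average for the three layer ANN appears to be the unweighted mean of the four per-class rates in the table. The short check below, which uses only the percentages reported in the table, gives about 86.5%, in line with the reported 86.54% up to rounding. The variable and function names are illustrative.

```python
# Per-class correct recognition rates (%) of the three layer ANN, from the table.
three_layer_rates = {
    "Malaria": 87.14,
    "Thalassaemia": 90.00,
    "Normal": 85.00,
    "Other abnormal": 84.00,
}


def average_rate(rates):
    """Unweighted mean of the per-class recognition rates."""
    return sum(rates.values()) / len(rates)


overall = average_rate(three_layer_rates)
print(f"Average correct recognition rate: {overall:.3f}%")
```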
[Figure: number of incorrect recognitions in testing with the three layer ANN, by image type]
Rule p1
Rule p3
IF fever
AND sweating
AND chills
………………….………………..
THEN poor general health
Patient Facts
f5: fever
f6: sweating
…………..
fn: chills
fm: temperature >= 38 °C
A classification accuracy of 86.54% was achieved in this study with the three-layer ANN. The
goal was largely met and the false negative rate (7.6%) was low. The false negatives
that were found were mostly in blood cell images with very different brightness ranges or less
contrast than other cell images. This initial study is promising and shows that direct
classification from image pixel intensity data through an ANN is possible, even with some
background variation in the image.
Neural networks are generally perceived as being a 'black box': it is extremely difficult to
document how specific classification decisions are reached. In a medical decision support
system, asking questions, giving explanations and reasons, and dealing with uncertainties are
very important. To address these issues, an Expert System was created.
However, interface problems were encountered when combining the Expert System with the
trained Neural Network to give the system reasoning and explanation power.
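The kind of IF ... AND ... THEN rule used by the consultation system can be illustrated with a small forward-chaining sketch that also records which rules fired, which is what provides the explanation. This is a generic illustration and not the authors' Expert System. The rule uses only the three conditions legible in the source (fever, sweating, chills); the conditions elided there are not reproduced, and the rule name `example_rule` is hypothetical.

```python
def forward_chain(facts, rules):
    """Repeatedly fire rules whose conditions are all known facts.

    Returns the final set of facts and a trace of (rule name, conclusion)
    pairs that can be shown to the user as an explanation.
    """
    known = set(facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for name, conditions, conclusion in rules:
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                fired.append((name, conclusion))
                changed = True
    return known, fired


# Illustrative rule built from the conditions that are legible in the source.
rules = [("example_rule", ("fever", "sweating", "chills"), "poor general health")]
patient_facts = {"fever", "sweating", "chills"}

known, trace = forward_chain(patient_facts, rules)
for name, conclusion in trace:
    print(f"{name} fired: concluded '{conclusion}'")
```

The class predicted by the trained ANN could be added to the patient facts as one more fact, so that rules can refer to the image analysis result; how the two systems were actually interfaced is not described in the text.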
We are also considering possible changes to the algorithms to find an approach that gives the
highest accuracy for this solution. We are currently studying the following research-based
approaches to overcome the above problems and to achieve 100% or a comparably high accuracy.
Studying why and how misclassification occurred is both necessary and important.
Previous studies also report misclassified cases; however, explanations for the
misclassifications are rarely given. The analysis of misclassification results has been
identified as a future research area with potential for clinical application.
Within this research area, in the current study we used the Support Vector Machine
(SVM) classifier on the same medical dataset to look for a more accurate classifier for this
medical problem. If the merits of Support Vector Machines and Artificial Neural Networks
can be combined, a new Artificial Neural Network Support Vector Machine
with better performance can be obtained. Motivated by this idea, we hope to apply a recurrent
Neural Network to SVM training for this medical image pattern recognition. Within this research
area, we are also studying how to use a Multi Agent System to train on this medical
image dataset and to evaluate its accuracy.
Hypermerge hybrid systems can be used to address problems in computer-assisted
decision making. Hybrid systems combine methodologies from two or more techniques
in the same system. Often, these combine knowledge-based methods with
data-based methods such as neural networks. In addition, various techniques are used to
handle uncertain information, including fuzzy set theory and approximate reasoning. In the
current study, the structural components of these systems are examined to understand how
these methods are integrated.