
Available online at www.sciencedirect.com

ScienceDirect
Procedia Computer Science 114 (2017) 281–287
www.elsevier.com/locate/procedia

Complex Adaptive Systems Conference with Theme: Engineering Cyber Physical Systems, CAS
October 30 – November 1, 2017, Chicago, Illinois, USA

Convolutional Neural Network Based Localized Classification of Uterine Cervical Cancer Digital Histology Images
Haidar A. Almubarak (a), R. Joe Stanley (a,*), Rodney Long (b), Sameer Antani (b), George Thoma (b), Rosemary Zuna (c), Shelliane R. Frazier (d)

(a) Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65401, USA
(b) Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, DHHS, Bethesda, MD, USA
(c) Department of Pathology, University of Oklahoma Health Sciences Center, Oklahoma City, OK 73117, USA
(d) Surgical Pathology Department, University of Missouri Hospitals and Clinics, Columbia, MO 65202, USA

Abstract

In previous research, we introduced an automated localized, fusion-based algorithm to classify squamous epithelium into Normal, CIN1, CIN2, and CIN3 grades of cervical intraepithelial neoplasia (CIN). The approach partitioned the epithelium into 10 segments; image processing and machine vision algorithms were used to extract features from each segment; the features were used to classify each segment; and the segment results were fused to classify the whole epithelium. This research extends that work by dividing each of the 10 segments into three parts and using a convolutional neural network to classify the parts. The part results are then fused to classify the segments and the whole epithelium. The experimental dataset consists of 65 images. The proposed method achieves 77.25% accuracy, compared to 75.75% using the previous method on the same dataset.

© 2017 The Authors. Published by Elsevier B.V.
Peer-review under responsibility of the scientific committee of the Complex Adaptive Systems Conference with Theme: Engineering Cyber Physical Systems.

Keywords: Cervical Cancer; Convolutional Neural Networks; Data Fusion; Image Classification; Clinical Decision Support Systems

* Corresponding author. Tel.: +1-573-341-6896.


E-mail address: [email protected]

doi: 10.1016/j.procs.2017.09.044

1. Introduction

Cervical cancer is the second leading cause of cancer death in women aged 20 to 39 years; an estimated 12,820 new cases and 4,210 deaths were reported in 2017 [1]. Screening for cervical cancer and its precursor lesions is carried out using a Papanicolaou (Pap) test. Biopsied cervical tissue histology slides are used to give a definitive evaluation; interpretation of these slides is done by an expert pathologist [2]. Pathologists seek to detect cervical intraepithelial neoplasia (CIN), a pre-malignant condition for cervical cancer. A cervical biopsy is classified as normal (no CIN lesion) or as one of three CIN grades: CIN1 (mild dysplasia), CIN2 (moderate dysplasia), or CIN3 (severe dysplasia), by identifying atypical cells in the epithelium through visual inspection of histology slides [3]. Fig. 1 shows an example of the different CIN grades. Delayed maturation, with an increase in immature atypical cells from the bottom to the top of the epithelium, has been observed as CIN increases in severity [4]. Computer-assisted CIN diagnosis has been studied previously in [5]–[10]; in these studies, handcrafted features had to be extracted using various image processing and machine learning algorithms, which is time-consuming and may not yield the best features.

Fig. 1 CIN Grades

Convolutional neural networks (ConvNets) have proven effective in many image-related domains, such as image classification on very large datasets like ImageNet [11], face recognition [12], and breast cancer mitosis detection [13]. ConvNets do not need manually extracted features; instead, they convolve learned filters with the images to extract features, and the filters are updated and tuned during training. In previous research, our group used a localized fusion-based approach for CIN grade classification [6]; this approach divides an epithelium image into 10 segments and extracts features from each segment. The features are used to train a classifier that assigns each segment one of the CIN grades, and after the segments are classified, the whole-image class is determined by voting among the 10 segments.
This research extends the localized fusion-based approach by further subdividing each segment into three parts: top, middle, and bottom. This division exploits the fact that cell abnormality progresses from the bottom to the top of the epithelium, so analyzing the three parts separately and then fusing the results should improve classification. ConvNets are used for feature extraction and initial classification; no features are manually crafted from the image or image parts. Other classification algorithms, such as support vector machines, logistic regression, and random forests, are used to fuse the part and segment results into the whole-image class.

2. Methodology

The steps for processing an epithelium image for CIN classification are as follows:
• Divide the whole image into 10 segments.
• Divide each segment into 3 parts: top, middle, bottom.
• Extract 32x32 patches (chunks) from each part.

• Train 3 ConvNets on the 32x32 patches, one for each part.
• Classify the training and testing patches to obtain classification probabilities.
• Estimate the class probabilities of the 3 parts from the patches extracted from each part.
• Train a classifier to classify each segment/image based on the probability vector from the 3 parts.
• Use the trained classifier on the test images.
The remainder of this section presents each step in detail.

2.1. Segmenting images and extracting patches

The first step is to determine the medial axis, which is used to partition the whole epithelium image into 10 segments based on the methods from [6], [7]. Partitioning the epithelium image into ten vertical segments has facilitated improved CIN assessment through fusion of local sub-region classifications of the epithelium [6], [7], [14]. An example of the medial axis and vertical segment partitioning is shown in Fig. 2. The segments are then divided into 3 parts: top, middle, and bottom.

Fig. 2 Segmenting epithelium into 10 segments

ConvNets require input images of the same size. Since the epithelium has an irregular shape and non-uniform size, the 3 parts need to be processed in a way that produces fixed-size patches. In this research, 32 by 32 pixel patches are extracted from each part using a non-overlapping sliding window over the 3 parts of the segment. Fig. 3 shows an example of a segment being divided into 3 parts with 32x32 patches extracted from the middle part; a sketch of the extraction step follows the figure.

Fig. 3 Example of a segment divided into 3 parts, with 32x32 patches extracted from the middle part
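As a minimal sketch of this step, the following Python code extracts non-overlapping 32x32 patches from one part of a segment. It assumes the part is supplied as an RGB array together with a binary tissue mask; the rule of discarding patches not fully covered by tissue is our assumption, since the paper does not state how boundary patches are handled.

```python
import numpy as np

def extract_patches(part_rgb, part_mask, patch_size=32, min_coverage=1.0):
    """Extract non-overlapping patch_size x patch_size patches from one part
    (top, middle, or bottom) of a segment.

    part_rgb  : HxWx3 uint8 array holding the part's pixels.
    part_mask : HxW boolean array, True where the pixel belongs to the
                epithelium part (the part has an irregular outline).
    Patches whose mask coverage falls below min_coverage are discarded,
    so every kept patch is a fixed-size, fully valid tissue chunk.
    """
    patches = []
    h, w = part_mask.shape
    for y in range(0, h - patch_size + 1, patch_size):      # non-overlapping rows
        for x in range(0, w - patch_size + 1, patch_size):  # non-overlapping columns
            window = part_mask[y:y + patch_size, x:x + patch_size]
            if window.mean() >= min_coverage:
                patches.append(part_rgb[y:y + patch_size, x:x + patch_size])
    if not patches:
        return np.empty((0, patch_size, patch_size, 3), dtype=np.uint8)
    return np.stack(patches)
```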

2.2. Training Convolutional Neural Network

Each part of the segment has its own ConvNet: topNet, midNet, and botNet. Each network is trained using the 32x32 patches that correspond to its part of the segment. All networks have the same configuration, as listed in Table 1.

Table 1 Convolutional neural network configuration

Layer type       Layer properties
Input            Size (3, 32, 32)
Convolution      32 filters, size (3, 3)
Convolution      32 filters, size (3, 3)
Max pooling      Size (2, 2)
Dropout          P(0.25)
Convolution      64 filters, size (3, 3)
Convolution      64 filters, size (3, 3)
Max pooling      Size (2, 2)
Dropout          P(0.25)
Dense            256 nodes
Dropout          P(0.5)
Dense (output)   4 nodes

The input to the network is the raw RGB values of a 32x32 patch. The input is followed by a series of convolution and max-pooling layers with dropout layers in between; these layers extract features from the image. The last two dense layers are fully connected layers that classify the patch into one of 4 classes using the features extracted by the previous layers. Each network is trained for 300 epochs, and the trained network is used to classify the patches in the test set. This architecture was chosen after experimenting with several architectures and different hyper-parameters; the selected architecture performed best on our dataset. A Keras sketch of this configuration is given below.
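As a minimal sketch, the following Keras code reproduces the Table 1 layer stack. The paper lists only layer types and sizes, so the ReLU activations, softmax output, padding choices, Adam optimizer, and cross-entropy loss are assumptions; the input is also written channels-last (32, 32, 3) rather than the paper's (3, 32, 32) ordering, and the Flatten layer between the convolutional and dense stages is implied rather than listed.

```python
# Sketch of the Table 1 network. Activations, padding, optimizer, and loss
# are assumptions; the paper specifies only layer types and sizes.
from tensorflow import keras
from tensorflow.keras import layers

def build_part_net(num_classes=4):
    model = keras.Sequential([
        keras.Input(shape=(32, 32, 3)),                    # raw RGB patch
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Flatten(),                                  # implied between conv and dense stages
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),   # Normal, CIN1, CIN2, CIN3
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# One network per part of the segment: topNet, midNet, and botNet.
top_net, mid_net, bot_net = (build_part_net() for _ in range(3))
```

Each network would then be fit on its part's patches for 300 epochs, e.g. top_net.fit(top_patches, top_labels_onehot, epochs=300), where the array names are hypothetical.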

2.3. Segment classification

The final output of each ConvNet is the classification of a small patch as Normal, CIN1, CIN2, or CIN3. Since segment sizes are not uniform, the number of patches differs between segments; hence, when classifying the top, middle, or bottom part of a segment, the percentage of patches falling in each class is used, based on this formula:

class_X_p(Y) = (number of patches in part Y classified as X) / (total number of patches in part Y)    (1)

Each part Y (top, middle, or bottom) is assigned a probability of being Normal or one of the 3 CIN grades, based on the ratio of patches belonging to that class to the total number of patches. Using this formula, a feature vector of size 12 is created for each segment; Table 2 shows the layout of this vector.

Table 2 Segment classification probability distribution (the 12-feature vector)

Top third:     Normal_p  CIN1_p  CIN2_p  CIN3_p
Middle third:  Normal_p  CIN1_p  CIN2_p  CIN3_p
Bottom third:  Normal_p  CIN1_p  CIN2_p  CIN3_p

In other words, the feature vector is the percentage of patches falling in each class (Normal, CIN1-3). The CNN classifies the patches in each part (top, middle, and bottom) of the segment independently, using a different trained network for each part. The percentage of patches in each class serves as a confidence value that the part comes from a segment with that class. The confidence values of the 3 parts are arranged into a vector of size 12, and this vector is used as training input to other algorithms, namely support vector machine (SVM), linear discriminant analysis (LDA), multilayer perceptron (MLP), logistic regression, and random forest (RF). A sketch of this construction is shown below.
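As a minimal sketch, the following Python code builds the 12-feature segment vector of Eq. (1) and Table 2. The hard assignment of each patch to its argmax class is our reading of the formula, and names such as top_net and top_patches are hypothetical, carried over from the sketch above.

```python
import numpy as np

def part_class_fractions(patch_labels, num_classes=4):
    """Eq. (1): fraction of a part's patches assigned to each of the
    4 classes (Normal, CIN1, CIN2, CIN3)."""
    counts = np.bincount(patch_labels, minlength=num_classes)
    return counts / max(len(patch_labels), 1)

def segment_feature_vector(top_labels, mid_labels, bot_labels):
    """Concatenate the three 4-bin class distributions into the
    12-feature vector of Table 2 (top, middle, bottom thirds)."""
    return np.concatenate([part_class_fractions(top_labels),
                           part_class_fractions(mid_labels),
                           part_class_fractions(bot_labels)])

# Hypothetical usage: the patch labels come from the argmax of each part
# network's softmax output, e.g.
#   top_labels = top_net.predict(top_patches).argmax(axis=1)
# The resulting 12-feature vectors are then fed to an SVM, LDA, MLP,
# logistic regression, or random forest classifier.
```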

2.4. Whole image classification

Classification of the individual segments gives both the segment class and the probability of each class. To classify the whole image, two approaches were used. The first uses a voting mechanism among the segment classes, assigning the class that appears most often among the segments. The second constructs a new feature vector from the probabilities of all segments; the feature vector length is 40 (4 classes from each of the 10 segments), and it is used to train a new classifier that classifies the images in the test set. Both approaches are sketched below.
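As a minimal sketch, the following Python code shows both whole-image approaches. The array shapes and the choice of logistic regression for the 40-feature classifier are placeholders for any of the classifiers listed in Section 2.3.

```python
import numpy as np
from collections import Counter
from sklearn.linear_model import LogisticRegression

def classify_image_by_voting(segment_classes):
    """Approach 1: assign the class appearing most often among the 10
    segment labels (ties resolved by first occurrence)."""
    return Counter(segment_classes).most_common(1)[0][0]

def image_feature_vector(segment_probabilities):
    """Approach 2: concatenate the 4-class probability vectors of the
    10 segments into a single 40-dimensional feature vector."""
    return np.concatenate(segment_probabilities)

# Hypothetical usage for the 40-feature approach:
#   X = np.stack([image_feature_vector(p) for p in per_image_probabilities])
#   clf = LogisticRegression(max_iter=1000).fit(X, image_labels)
```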

3. Experimental Results

To test the algorithm, a dataset consisting of 65 images is used. The dataset contains 32 images classified as normal, 7 as CIN1, 17 as CIN2, and 10 as CIN3. The images were annotated by an expert pathologist.
The ConvNets were trained using 5-fold cross-validation, dividing the dataset into 80% training and 20% testing sets, where the test sets are disjoint. The images were processed according to the method described in Section 2, resulting in more than 75,000 samples of size 32x32 per fold for ConvNet training. The patch-level test classification results are shown in Table 3. The bottom part of the segment has lower accuracy in general; this part is similar across the different classes, which makes it hard for the neural network to distinguish between them.

Table 3 ConvNet 32x32 patch test-set classification accuracy per fold

Fold Top Middle Bottom


1 0.6503 0.6102 0.4499
2 0.4724 0.4625 0.3008
3 0.5245 0.4932 0.3496
4 0.6565 0.6510 0.4880
5 0.4720 0.4645 0.3523
Average 0.5551 0.5362 0.3847

Extracting 32x32 patches generated a dataset large enough to train and test with the 5-fold cross-validation method, but training and testing the whole-image classification with 5 folds was not viable due to the limited number of images in the dataset; hence, a leave-one-out approach is used. The segments of one image are held out for testing while the rest are used to train the classifier (SVM, LDA, MLP, etc.). This approach is used to classify both the segments and the whole image; a sketch of the protocol is given below. The classification results are shown in Table 4. For SVM, LDA, and MLP, the voting method outperformed the 40-feature method. Logistic regression and random forest outperformed the other algorithms when using the 40-feature method; their confusion matrices are shown in Tables 5 and 6.
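As a minimal sketch of the leave-one-image-out protocol, the following Python code uses scikit-learn's LeaveOneGroupOut so that each fold holds out all 10 segments of a single image. The random arrays are placeholders standing in for the real 12-feature segment vectors and labels.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.ensemble import RandomForestClassifier

# Placeholder data: 65 images x 10 segments, each with a 12-feature vector.
rng = np.random.default_rng(0)
X = rng.random((650, 12))                # segment feature vectors
y = rng.integers(0, 4, size=650)         # segment labels (Normal, CIN1-3)
groups = np.repeat(np.arange(65), 10)    # id of the image each segment came from

# Each fold trains on the segments of 64 images and tests on the held-out image.
correct = 0
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    correct += int((clf.predict(X[test_idx]) == y[test_idx]).sum())
segment_accuracy = correct / len(y)
```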

Table 4 Segment and whole-image classification results

Data                      SVM     LDA     MLP     Logistic  Random Forest
Segment accuracy          0.6950  0.6535  0.6813  0.6965    0.7041
Whole image (voting)      0.7576  0.7576  0.7273  0.7425    0.7273
Whole image (40 feat.)    0.6515  0.6667  0.6970  0.7727    0.7727

Table 5 Confusion matrix (logistic regression)

Actual \ Predicted  Normal  CIN1  CIN2  CIN3
Normal              32      0     0     0
CIN1                7       1     1     0
CIN2                2       0     7     3
CIN3                0       0     2     11

Table 6 Confusion matrix (random forest)

Actual \ Predicted  Normal  CIN1  CIN2  CIN3
Normal              31      1     0     0
CIN1                8       1     0     0
CIN2                0       0     10    2
CIN3                0       0     4     9

CIN1 images were confused with normal images by both logistic regression and random forest. Fig. 4 shows an example of a CIN1 epithelium classified as Normal alongside an example of a normal epithelium; the two look very similar, and differentiating them is not easy.

Compared to the algorithm used in [6], the proposed algorithm gave better results: 77.27% exact class accuracy versus 75.75% on the same dataset.

Fig. 4 Example of a normal epithelium and a CIN1 misclassified as normal



Acknowledgements

This research was supported [in part] by the Intramural Research Program of the National Institutes of Health
(NIH), National Library of Medicine (NLM), and Lister Hill National Center for Biomedical Communications
(LHNCBC). In addition, we gratefully acknowledge the medical expertise and collaboration of Dr. Mark Schiffman
and Dr. Nicolas Wentzensen, both of the National Cancer Institute’s Division of Cancer Epidemiology and Genetics
(DCEG).

References

[1] R. L. Siegel, K. D. Miller, and A. Jemal, "Cancer statistics, 2017," CA Cancer J. Clin., vol. 67, no. 1, pp. 7–30, 2017.
[2] J. Jeronimo, M. Schiffman, L. R. Long, L. Neve, and S. Antani, "A tool for collection of region based data from uterine cervix images for correlation of visual and clinical variables related to cervical neoplasia," in Proceedings of the 17th IEEE Symposium on Computer-Based Medical Systems, 2004, pp. 558–562.
[3] L. He, L. R. Long, S. Antani, and G. R. Thoma, "Computer assisted diagnosis in histopathology," in Sequence and Genome Analysis: Methods and Applications, iConcept Press, 2011.
[4] J. R. Egner, "AJCC Cancer Staging Manual," JAMA: The Journal of the American Medical Association, vol. 304, p. 1726, 2010.
[5] Y. Wang, D. Crookes, O. S. Eldin, S. Wang, P. Hamilton, and J. Diamond, "Assisted diagnosis of cervical intraepithelial neoplasia (CIN)," IEEE J. Sel. Top. Signal Process., vol. 3, no. 1, pp. 112–121, Feb. 2009.
[6] P. Guo et al., "Nuclei-based features for uterine cervical cancer histology image analysis with fusion-based classification," IEEE J. Biomed. Health Inform., Oct. 2015.
[7] S. De et al., "A fusion-based approach for uterine cervical cancer histology image classification," Comput. Med. Imaging Graph., vol. 37, no. 7–8, pp. 475–487, 2013.
[8] J. van der Marel et al., "Molecular mapping of high-grade cervical intraepithelial neoplasia shows etiological dominance of HPV16," Int. J. Cancer, vol. 131, no. 6, pp. E946–E953, 2012.
[9] M. Guillaud et al., "Subvisual chromatin changes in cervical epithelium measured by texture image analysis and correlated with HPV," Gynecol. Oncol., vol. 99, no. 3 Suppl 1, pp. S16–S23, Dec. 2005.
[10] S. J. Keenan et al., "An automated machine vision system for the histological grading of cervical intraepithelial neoplasia (CIN)," J. Pathol., vol. 192, no. 3, pp. 351–362, 2000.
[11] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Adv. Neural Inf. Process. Syst., pp. 1097–1105, 2012.
[12] O. M. Parkhi, A. Vedaldi, and A. Zisserman, "Deep face recognition," in Proceedings of the British Machine Vision Conference 2015, pp. 41.1–41.12, 2015.
[13] D. C. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, "Mitosis detection in breast cancer histology images with deep neural networks," Med. Image Comput. Comput. Assist. Interv. (MICCAI 2013), pp. 411–418, 2013.
[14] P. Guo et al., "Enhancements in localized classification for uterine cervical cancer digital histology image assessment," J. Pathol. Inform., vol. 7, no. 1, p. 51, 2016.
