06 Deep Convolutional Neural Network With Segmentation Techniques For Chest X-Ray Analysis
Abstract: Deep ConvNets are well suited to learning the mapping between CXR gradients. This paper proposes an instance segmentation algorithm based on deep learning for the automatic segmentation and annotation of X-ray medical images. A basic convolutional neural network extracts the feature map of the image, and the corresponding branch structures (classification, regression, and mask) complete the automatic analysis of the image's underlying structure. Our method is evaluated on a dataset of 180 real two-exposure dual-energy subtraction chest radiographs. We also apply histogram averaging and data augmentation to enhance the low-contrast images. Finally, we visualize the output; good results are achieved in the segmentation and labeling of the clavicles and ribs. We hope that our research will provide a good prospect for the application of deep learning in automatic segmentation and labeling of medical images.

Index Terms: deep ConvNets, chest radiographs, segmentation, labeling, mask.

Corresponding author: [email protected].
Supported by the Hefei Borrowing and Subsidizing Project, 2018, Development and Application Demonstration of Intelligent Diagnosis System for Lung Cancer Based on Deep Learning.
978-1-5386-9490-9/19/$31.00 © 2019 IEEE

I. INTRODUCTION

In recent years, deep learning has developed rapidly, and the fields that involve AI have become more extensive. From the initial handwritten digit classification to complex object detection and motion recognition in video, deep learning is slowly changing our lives. Deep learning has also made excellent progress in the field of medical imaging. Research directions such as nodule detection [8], automated delineation of organs, and medical image segmentation [1] are emerging rapidly. The basic types of medical images to which AI is applied include X-ray, CT, and MRI, all of which are common medical imaging technologies.

Computer-aided diagnosis (CAD) can effectively reduce the workload of medical experts and improve their efficiency. Traditional CAD helps doctors manage medical image information better, but it is still of limited use in assisting diagnosis. With the development of artificial intelligence, expectations for computer-aided diagnosis have become higher. In addition to handling repetitive and tedious tasks, people want algorithms that solve some of the more difficult and technical problems. At present, the hot topic is the detection of pulmonary nodules [8], mainly due to the LUNA16 challenge in 2016 and the Tianchi challenge. Meanwhile, the publication of data sets such as ChestX-ray14 and JSRT means that computer-aided diagnosis needs more functions to complete more complex tasks.

X-ray is one of the most widely used medical imaging techniques. CT is more effective and accurate in diagnosis, but it is expensive and complex. Comparatively speaking, the application market for X-ray is larger. X-ray is also cheap and easy to operate, so it is suitable for hospitals of different sizes. According to statistics, the number of X-ray examinations per year is very large, so X-ray is highly meaningful for computer-aided diagnosis and treatment. X-ray chest radiography is
one of the most commonly used imaging techniques. When making a diagnosis, doctors need to know the location of organs and bones promptly so that they can accurately locate the lesion. From the perspective of medical anatomy, accurately locating organs and tissues is a complex task that requires medical experience. However, accurately locating each tissue not only helps the clinician's diagnosis but also provides good prior knowledge for AI-based detection of individual lesions. Because the detection of lesions such as pulmonary nodules still suffers from a very high false-positive rate at this stage, adding the prior knowledge from segmentation can greatly improve the accuracy of computer-aided diagnosis.

Instance segmentation is very challenging in the field of image analysis because it requires not only accurate detection of each target but also accurate segmentation of each instance. Our task integrates object classification, localization, and semantic segmentation, where the goal of semantic segmentation is to classify each pixel into a fixed set of categories without differentiating object instances. In this paper, we need to classify each bone in the X-ray and regress its bounding box, and we also need semantic segmentation. At present, we mainly focus on the 24 ribs and 2 clavicles in the X-ray. More experiments will follow in subsequent studies.

In this paper, for the first time, an instance segmentation algorithm from deep learning is applied to X-ray bone segmentation. The advanced Mask R-CNN segmentation algorithm is used to automatically segment and label the ribs and clavicles in the X-ray. Because of the poor clarity of the ribs in X-ray and the occlusion of the front and rear ribs, we automatically adjust the window width of the image and optimize the parameters during network training to get a better result. In summary, this work provides a good prospect for automatic segmentation and labeling in future computer-aided diagnosis and treatment. We hope our proposed approach will serve as a solid baseline and help ease future research in instance-level recognition of medical images.

The remainder of this paper is organized as follows. First, we briefly introduce the related work and the data set in Section II. The network structure for bone segmentation and labeling is described in Section III. The experimental results are provided in Section IV. Finally, a summary of the results is presented.

II. RELATED WORK AND DATA SET

A. Related work

Driven by the effectiveness of R-CNN, many approaches to instance segmentation are based on segment proposals. Earlier methods [10, 5, 6] resorted to bottom-up segments [9, 12]. DeepMask [7] and the following works learn to propose segment candidates, which are then classified by Fast R-CNN. In these methods, segmentation precedes recognition, which is slow and less accurate. Likewise, Dai et al. [13] proposed a complex multiple-stage cascade that predicts segment proposals from bounding-box proposals, followed by classification. Instead, our method is based on parallel prediction of masks and class labels, which is simpler and more flexible.

X-ray images are widely used to analyze various diseases. In recent years, deep convolutional neural networks have made X-ray analysis more convenient. In [2], the authors presented an effective deep learning method for bone suppression in single conventional CXR using deep convolutional neural networks (ConvNets) as basic prediction units. In "Radiologist-level pneumonia detection on chest X-rays with deep learning" [8], the authors developed an algorithm that can detect pneumonia from chest X-rays at a level exceeding practicing radiologists. An in-depth study of the literature on segmentation methods applied in dental imaging is given in [1], where automatic segmentation of teeth in X-ray images was proposed and the authors described the trends, benchmarking, and future perspectives. A completely integrated CAD system was proposed to screen digital X-ray mammograms, involving detection, segmentation, and classification of breast masses via deep learning methodologies [4].

B. Data set

We collected 180 chest radiographs acquired with a digital radiography (DR) machine at the Hospital of Traditional Chinese Medicine, Anhui Province, China. The images were stored in DICOM format with a 14-bit depth, and the sizes of the CXRs ranged from 2600 × 2600 to 3000 × 3000 pixels. A total of 160 radiographs were randomly selected as the training set, and the remaining 20 cases were used as the test set. All data were randomly extracted from the database, so the age and sex ratios are consistent across the sample.

III. METHOD

We propose to use the basic network framework of Mask R-CNN [3] for automatic segmentation and annotation of X-ray images. First, feature information of the medical image is extracted by the backbone network; then the candidate regions are screened by the RPN. Finally, the classification, bounding-box regression, and mask tasks for the image targets are completed by three branch structures. The image segmentation method and the solutions to some related issues are described in the following sections.

A. Mask R-CNN

Mask R-CNN, an improved structure based on Faster R-CNN [16], is also a two-stage network. Faster R-CNN introduces two branches after the feature extraction module: one classifies the target objects, and the other regresses the coordinates of the bounding box. Mask R-CNN adds a third branch that outputs the object mask. Let us first review Faster R-CNN. The network is divided into two stages. The first stage is the RPN: after the basic convolutional feature extraction network, candidate regions are obtained by screening. The second part
2019 14th IEEE Conference on Industrial Electronics and Applications (ICIEA) 1213
is the detection module: RoIPool extracts features from each candidate box, and classification and bounding-box regression are performed. Note that both the RPN and Fast R-CNN [14] need a base feature extraction network, so Faster R-CNN shares the feature maps between them. The feature extraction network is initialized with parameters pretrained on the ImageNet classification dataset and then fine-tuned. Mask R-CNN basically follows the above structure. It keeps the RPN structure of stage 1, but uses the more advanced ResNet-50 and ResNet-101 [15] as the feature extraction network. In stage 2, it adds a mask branch in parallel with the classification and bounding-box branches. Therefore, in addition to the usual classification and regression of candidate regions, the network can segment and annotate the target by classifying each pixel into a fixed set of categories.

B. Automatic segmentation and annotation system

In this paper, we design an automatic segmentation and annotation system based on Mask R-CNN, as shown in Figure 1. The X-ray image is fed into the network and read as an image matrix denoted I, which then passes through the convolution and pooling layers (the CNN part of Figure 1). The backbone used in this paper is ResNet-101. The residual network is deeper yet has fewer parameters, which greatly improves the learning efficiency and the accuracy of network prediction. Through this process, the network obtains the feature map shown in Figure 1. Then, in parallel with predicting the class and box, Mask R-CNN also outputs a binary mask for each RoI. Finally, the output of the network is the segmented and labeled image.

Fig. 1. Segmentation for Chest X-Ray Analysis.

We define the multi-task loss as

$$L = L_{cls} + L_{box} + L_{mask} \tag{1}$$

in which $L_{cls}$ is the classification loss, $L_{box}$ is the bounding-box loss, and $L_{mask}$ is the average binary cross-entropy loss. The classification and bounding-box losses are identical to those defined in [14]. $L_{cls}(p, u) = -\log p_u$ is the log loss for the true class $u$. For bounding-box regression, the loss (written $L_{loc}$ as in [14]) is

$$L_{loc}(t^u, v) = \sum_{i \in \{x, y, w, h\}} \mathrm{smooth}_{L_1}(t^u_i - v_i) \tag{2}$$

in which

$$\mathrm{smooth}_{L_1}(x) = \begin{cases} 0.5x^2, & \text{if } |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases} \tag{3}$$

is a robust $L_1$ loss that is less sensitive to outliers than the $L_2$ loss used in R-CNN [4]. $L_{loc}$ is defined over a tuple of true bounding-box regression targets for class $u$, $v = (v_x, v_y, v_w, v_h)$, and the predicted tuple $t^u = (t^u_x, t^u_y, t^u_w, t^u_h)$. A mask encodes an input object's spatial layout; for an RoI whose ground-truth class is $k$, $L_{mask}$ is defined only on the $k$-th mask. The output mask represents the prediction of the network, and the loss reflects the difference between the predicted value and the true value.

IV. EXPERIMENTS

A total of 180 samples are labeled, comprising 160 training samples and 20 test samples. The data were provided by Anhui Traditional Chinese Medicine Hospital and labeled under the guidance of professional doctors. The training samples in this paper are labeled for clavicle and rib segmentation. In future research, we will provide segmentation labels for more organs, bones, and tissues. As shown in Figure 2, there are 26 labeled objects: 2 clavicles and 24 ribs (ribs 1 to 12 from top to bottom on each side). Because the ribs mostly lie in the region that overlaps the lungs and the front and back ribs occlude each other, they are difficult to label, so some details
may not be very accurate. To solve these problems, we first considered increasing the contrast between different tissues by adjusting the window width of the medical images. In practice, however, the original window width and window level of the DICOM images produced by different instruments differ considerably. We therefore chose histogram averaging to preprocess the images.

Fig. 2. Label Sample for Chest X-Ray Analysis.

We train the network heads for 300 epochs and then all layers for 600 epochs, while the learning rate decreases from 0.001 to 0.0001. The non-maximum suppression threshold is 0.5, the backbone is ResNet-101, and the detection confidence threshold is 0.9. Note that the network is pretrained on the COCO data set and then fine-tuned on our data set. The experimental results show that the pretrained model transfers well to our data.

Fig. 3. Output of Network.

Mask R-CNN outputs are visualized in Figures 3 and 4. Mask R-CNN achieves good results even under challenging conditions. The left part is the original X-ray image, and the right part is the output of the network model. By comparison, Mask R-CNN achieves better results. The network output in the figure is divided into three parts, which correspond to the structure of the network: (1) classification of the target object; (2) coordinate regression of the target object, i.e., the bounding box; and (3) instance segmentation of the target object. The segmentation is the most difficult part because of the occlusion of the front and rear ribs and the overlap of multiple ribs, which are represented by different colors in the figure. The details of the output are shown in Figure 4, in which the two clavicles and the 24 ribs are detected and segmented. The bounding-box regression for each object is good, and the classification accuracy is very high. There are some errors in the segmentation, but the overall result is satisfactory.

Fig. 4. Segmentation for Chest X-Ray Analysis.

V. CONCLUSION

In this study, we apply deep ConvNets to segment and label bones in CXRs. This paper is the first to apply an instance segmentation algorithm to the problem of automatic segmentation and annotation of medical images. Compared with traditional manual labeling, automatic labeling is of great significance for computer-aided diagnosis and treatment. Combined with the current progress in deep-learning-based segmentation and in medical anatomy, the automatic labeling algorithm proposed in this paper provides a good prospect for the future application of artificial intelligence in the field of medical imaging. Through cooperation with the hospital, a total of 180 samples were labeled for the experiment. Experiments on the relatively complex ribs and clavicles show that the instance segmentation algorithm from deep learning achieves good results in the automatic segmentation of medical images. Our work will continue in the field of automatic annotation based on deep learning. We will improve the research with larger sample sizes and more complex human tissue structures. At the same time, we will actively cooperate with the hospital on our research results and hope to provide more help in computer-aided diagnosis and treatment.

REFERENCES

[1] Gil Jader, Luciano Oliveira, Matheus Pithon, “Automatic segmenting teeth in X-ray images: Trends, a novel data set, benchmarking and future perspectives,” Expert Systems with Applications, Vol. 107, pp. 15–31, 2018.
[2] Wei Yang, Yingyin Chen, Yunbi Liu, Liming Zhong, Genggeng Qin, Zhentai Lu, Qianjin Feng, Wufan Chen, “Cascade of multi-scale convolutional neural networks for bone suppression of chest radiographs in gradient domain,” Medical Image Analysis, Vol. 35, pp. 421–433, Jan 2017.
[3] Kaiming He, Georgia Gkioxari, Piotr Dollar, Ross Girshick, “Mask R-CNN,” IEEE International Conference on Computer Vision (ICCV), 2017.
[4] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in CVPR, 2014.
[5] B. Hariharan, P. Arbelaez, R. Girshick, and J. Malik, “Simultaneous detection and segmentation,” in ECCV, 2014.
[6] B. Hariharan, P. Arbelaez, R. Girshick, and J. Malik, “Hypercolumns for object segmentation and fine-grained localization,” in CVPR, 2015.
[7] P. O. Pinheiro, R. Collobert, and P. Dollar, “Learning to segment object candidates,” in NIPS, 2015.
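As a supplementary illustration of the multi-task loss in Section III-B, the following is a minimal NumPy sketch of Eqs. (1) to (3) for a single RoI. It is not the training code used in the experiments: the function names (`smooth_l1`, `box_loss`, `cls_loss`, `mask_loss`, `multitask_loss`) and the `eps` clipping constant are assumptions made here for illustration, and the equal weighting of the three terms simply follows Eq. (1).

```python
import numpy as np

def smooth_l1(x):
    """Eq. (3): quadratic for |x| < 1, linear otherwise."""
    x = np.asarray(x, dtype=np.float64)
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x ** 2, ax - 0.5)

def box_loss(t_u, v):
    """Eq. (2): sum of smooth L1 over the (x, y, w, h) regression offsets."""
    diff = np.asarray(t_u, dtype=np.float64) - np.asarray(v, dtype=np.float64)
    return float(np.sum(smooth_l1(diff)))

def cls_loss(p, u):
    """L_cls(p, u) = -log p_u, the log loss for the true class index u."""
    return float(-np.log(p[u]))

def mask_loss(pred, target, eps=1e-7):
    """Average binary cross-entropy over the pixels of one predicted mask.

    `eps` is a hypothetical clipping constant that avoids log(0).
    """
    pred = np.clip(np.asarray(pred, dtype=np.float64), eps, 1.0 - eps)
    target = np.asarray(target, dtype=np.float64)
    bce = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(bce.mean())

def multitask_loss(p, u, t_u, v, mask_pred, mask_true):
    """Eq. (1): L = L_cls + L_box + L_mask for a single RoI."""
    return cls_loss(p, u) + box_loss(t_u, v) + mask_loss(mask_pred, mask_true)
```

For example, `box_loss([0.5, 0.0, 2.0, 0.0], [0, 0, 0, 0])` evaluates to 0.125 + 1.5 = 1.625: the small offset falls in the quadratic branch of Eq. (3) and the large one in the linear branch, which is what makes the loss less sensitive to outliers than an $L_2$ loss.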
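The contrast-enhancement preprocessing mentioned in Section IV can be sketched in the same spirit. The paper does not specify the exact procedure, so this sketch assumes that "histogram averaging" refers to standard global histogram equalization, applied to the 14-bit pixel data described in Section II-B. The function name `equalize_histogram` and its `bit_depth` parameter are hypothetical.

```python
import numpy as np

def equalize_histogram(image, bit_depth=14):
    """Global histogram equalization for an unsigned-integer image.

    Assumption: this stands in for the paper's "histogram averaging";
    the authors' actual preprocessing may differ.
    """
    levels = 1 << bit_depth                      # 16384 gray levels for 14-bit data
    img = np.asarray(image)
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist)                        # cumulative pixel counts
    cdf_min = cdf[np.nonzero(hist)[0][0]]        # count at the lowest occupied level
    total = img.size
    if total == cdf_min:                         # constant image: nothing to spread
        return img.copy()
    # Map each gray level so that the output histogram is approximately flat.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * (levels - 1))
    lut = np.clip(lut, 0, levels - 1).astype(img.dtype)
    return lut[img]
```

Because the mapping depends only on the image's own histogram, it does not rely on the window width and window level stored by a particular instrument, which is the difficulty noted in Section IV. Stretching the occupied gray levels over the full 14-bit range is one way to raise the contrast between ribs and surrounding tissue before training.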