A Survey of Deep Learning For Lung Disease Detection On Medical Images: State-of-the-Art, Taxonomy, Issues and Future Directions

This document summarizes a survey of 98 recent articles on using deep learning for lung disease detection from medical images. It presents a taxonomy of seven common attributes used in the surveyed works: image types, features, data augmentation, deep learning algorithms, transfer learning, ensemble classifiers, and lung diseases. The survey finds that lung disease detection using deep learning is an active area, with 90 of the surveyed articles published from 2018 onward. It aims to analyze trends, identify remaining issues, and suggest potential future directions in this important application of deep learning.


Journal of Imaging
Review
A Survey of Deep Learning for Lung Disease
Detection on Medical Images: State-of-the-Art,
Taxonomy, Issues and Future Directions
Stefanus Tao Hwa Kieu 1 , Abdullah Bade 1 , Mohd Hanafi Ahmad Hijazi 2, * and
Hoshang Kolivand 3
1 Faculty of Science and Natural Resources, Universiti Malaysia Sabah, Kota Kinabalu 88400, Sabah, Malaysia;
[email protected] (S.T.H.K.); [email protected] (A.B.)
2 Faculty of Computing and Informatics, Universiti Malaysia Sabah, Kota Kinabalu 88400, Sabah, Malaysia
3 School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool L3 3AF, UK;
[email protected]
* Correspondence: [email protected]

Received: 24 October 2020; Accepted: 25 November 2020; Published: 1 December 2020

Abstract: The recent developments of deep learning support the identification and classification of
lung diseases in medical images. Hence, numerous works on the detection of lung disease using
deep learning can be found in the literature. This paper presents a survey of deep learning for lung
disease detection in medical images. There has only been one survey paper published in the last
five years regarding deep learning directed at lung disease detection. However, that survey is
lacking in the presentation of taxonomy and analysis of the trend of recent work. The objectives
of this paper are to present a taxonomy of the state-of-the-art deep learning based lung disease
detection systems, visualise the trends of recent work on the domain and identify the remaining
issues and potential future directions in this domain. Ninety-eight articles published from 2016 to
2020 were considered in this survey. The taxonomy consists of seven attributes that are common in
the surveyed articles: image types, features, data augmentation, types of deep learning algorithms,
transfer learning, the ensemble of classifiers and types of lung diseases. The presented taxonomy
could be used by other researchers to plan their research contributions and activities. The potential
future direction suggested could further improve the efficiency and increase the number of deep
learning aided lung disease detection applications.

Keywords: deep learning; lung disease detection; taxonomy; medical images

1. Introduction
Lung diseases, also known as respiratory diseases, are diseases of the airways and the other
structures of the lungs [1]. Examples of lung disease are pneumonia, tuberculosis and Coronavirus
Disease 2019 (COVID-19). According to the Forum of International Respiratory Societies [2], about 334
million people suffer from asthma, and, each year, tuberculosis kills 1.4 million people, 1.6 million
people die from lung cancer, while pneumonia also kills millions of people. The COVID-19 pandemic
impacted the whole world [3], infecting millions of people and burdening healthcare systems [4].
It is clear that lung diseases are one of the leading causes of death and disability in this world. Early
detection plays a key role in increasing the chances of recovery and improving long-term survival
rates [5,6]. Traditionally, lung disease can be detected via skin test, blood test, sputum sample
test [7], chest X-ray examination and computed tomography (CT) scan examination [8]. Recently, deep
learning has shown great potential when applied on medical images for disease detection, including
lung disease.

J. Imaging 2020, 6, 131; doi:10.3390/jimaging6120131 www.mdpi.com/journal/jimaging



Deep learning is a subfield of machine learning relating to algorithms inspired by the function
and structure of the brain. Recent developments in machine learning, particularly deep learning,
support the identification, quantification and classification of patterns in medical images [9]. These
developments were made possible due to the ability of deep learning to learn features merely
from data, instead of hand-designed features based on domain-specific knowledge. Deep learning is
quickly becoming state of the art, leading to improved performance in numerous medical applications.
Consequently, these advancements assist clinicians in detecting and classifying certain medical
conditions efficiently [10].
Numerous works on the detection of lung disease using deep learning can be found in
the literature. To the best of our knowledge, however, only one survey paper has been published
in the last five years to analyse the state-of-the-art work on this topic [11]. In that paper, the history
of deep learning and its applications in pulmonary imaging are presented. Major applications of
deep learning techniques on several lung diseases, namely pulmonary nodule diseases, pulmonary
embolism, pneumonia, and interstitial lung disease, are also described. In addition, the analysis of
several common deep learning network structures used in medical image processing is presented.
However, their survey is lacking in the presentation of taxonomy and analysis of the trend of recent
work. A taxonomy shows relationships between previous work and categorises them based on
the identified attributes that could improve reader understanding of the topic. Analysis of trend, on
the other hand, provides an overview of the research direction of the topic of interest identified from
the previous work. In this paper, a taxonomy of deep learning applications on lung diseases and
a trend analysis on the topic are presented. The remaining issues and possible future direction are
also described.
The aims of this paper are as follows: (1) produce a taxonomy of the state-of-the-art deep learning
based lung disease detection systems; (2) visualise the trends of recent work on the domain; and (3)
identify the remaining issues and describe potential future directions in this domain. This paper is
organised as follows. Section 2 presents the methodology of conducting this survey. Section 3 describes
the general processes of using deep learning to detect lung disease in medical images. Section 4
presents the taxonomy, with detailed explanations of each subtopic within the taxonomy. The analysis
of trend, research gap and future directions of lung disease detection using deep learning are presented
in Section 5. Section 6 describes the limitation of the survey. Section 7 concludes this paper.

2. Methodology
In this section, the methodology used to conduct the survey of recent lung disease detection using
deep learning is described. Figure 1 shows the flowchart of the methodology used.
First, a suitable database of articles was identified as the main source of reference. The Scopus
database was selected as it is one of the largest databases of scientific peer-reviewed articles. However,
several significant articles, indexed by Google Scholar but not Scopus, are also included based on
the number of citations that they have received. Some preprint articles on COVID-19 are also included
as the disease has just recently emerged. To ensure that this survey only covers the state-of-the-art
works, only articles published recently (2016–2020) are considered. However, several older but
significant articles are included too. Relevant keywords were used to search for all possible
deep learning aided lung disease detection articles. The keywords used were
“deep learning”, “detection”, “classification”, “CNN”, “lung disease”, “Tuberculosis”, “pneumonia”,
“lung cancer”, “COVID-19” and “Coronavirus”. Studies were limited to articles written in English
only. At the end of this phase, we identified 366 articles.
Second, to select only the relevant works, screening was performed. During the screening,
only the title and abstract were assessed. The main selection criterion was that the work
applied deep learning algorithms to detect the relevant diseases.
Articles considered not relevant were excluded. Based on the screening performed, only 98 articles
were shortlisted.

Last, for all the articles screened, the eligibility inspection was conducted. Similar criteria, as in
the screening phase, were used, whereby the full-text inspection of the articles was performed instead.
All 98 screened articles passed this phase and were included in this survey. Out of the eligible articles,
90 were published from 2018 onwards. This signifies that lung disease detection using deep learning
is still a very active field. Figure 1 shows the numbers of studies identified, screened, assessed for
eligibility and included in this survey.

Figure 1. Flow diagram of the methodology used to conduct this survey.

3. The Basic Process to Apply Deep Learning for Lung Disease Detection
In this section, the process of how deep learning is applied to identify lung diseases from medical
images is described. There are four main steps: image acquisition, preprocessing, training and
classification. Lung disease detection generally deals with classifying an image into healthy lungs or
disease-infected lungs. The lung disease classifier, sometimes known as a model, is obtained via
training. Training is the process in which a neural network learns to recognise a class of images.
Using deep learning, it is possible to train a model that can classify images into their respective
class labels. Therefore, to apply deep learning for lung disease detection, the first step is to gather
images of lungs with the disease to be classified, and the second step is to preprocess them. The third
step is to train the neural network until it is able to recognise the diseases. The final step is to
classify new images. Here, new images unseen by the model before are shown to the model, and
the model predicts the class of those images. The overview of the process is illustrated
in Figure 2.

Figure 2. Overview of using deep learning for lung disease detection.

3.1. Image Acquisition Phase


The first step is to acquire images. To produce a classification model, the computer needs to learn
by example. The computer needs to view many images to recognise an object. Other types of data,
such as time series data and voice data, can also be used to train deep learning models. In the context
of the work surveyed in this paper, the relevant data required to detect lung disease will be images.
Images that could be used include chest X-ray, CT scan, sputum smear microscopy and histopathology
image. The output of this step is images that will later be used to train the model.

3.2. Preprocessing Phase


The second step is preprocessing. Here, the image could be enhanced or modified to improve
image quality. Contrast Limited Adaptive Histogram Equalisation (CLAHE) could be performed to
increase the contrast of the images [12]. Image modification such as lung segmentation [13] and bone
elimination [14] could be used to identify the region of interest (ROI), whereby the detection of the lung
disease can then be performed on the ROI. Edge detection could also be used to provide an alternate
data representation [15]. Data augmentation could be applied to the images to increase the amount
of available data. Feature extraction could also be conducted so that the deep learning model could
identify important features to identify a certain object or class. The output of this step is a set of images
whose quality has been enhanced, or from which unwanted objects have been removed, and that will
later be used in training.
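To make the idea of contrast enhancement concrete, the sketch below applies plain global histogram equalisation to a tiny 8-bit greyscale image stored as nested lists. This is a simplified stand-in and not CLAHE itself: CLAHE additionally equalises local tiles and clips the histogram before redistributing it. The function name `equalise_histogram` and the sample pixel values are illustrative assumptions, not taken from the surveyed works.

```python
def equalise_histogram(image, levels=256):
    """Global histogram equalisation for a greyscale image (list of rows).

    Simplified stand-in for CLAHE: no tiling and no contrast limiting.
    """
    pixels = [p for row in image for p in row]

    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Flat image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Map each intensity so that the output spans the full range.
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


# Hypothetical low-contrast patch: all values lie between 100 and 104.
low_contrast = [
    [100, 101, 102],
    [101, 102, 103],
    [102, 103, 104],
]
enhanced = equalise_histogram(low_contrast)
```

After equalisation, the darkest pixel of the patch maps to 0 and the brightest to 255, so the narrow intensity band is stretched over the whole range.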

3.3. Training Phase


In the third step, namely training, three aspects could be considered. These aspects are the selection
of deep learning algorithm, usage of transfer learning and usage of an ensemble. There are numerous
deep learning algorithms, for example deep belief network (DBN), multilayer perceptron neural network
(MPNN), recurrent neural network (RNN) and convolutional neural network (CNN). Different algorithms
have different learning styles. Different types of data work better with certain algorithms. CNN works
particularly well with images. The deep learning algorithm should be chosen based on the nature of
the data at hand. Transfer learning refers to the transfer of knowledge from one model to another.
Ensemble refers to the usage of more than one model during classification. Transfer learning and
ensemble are techniques used to reduce training time, improve classification accuracy and reduce
overfitting [16]. Further details concerning these two aspects could be found in Sections 4.5 and 4.6,
respectively. The output of this step is models generated from the data learned.

3.4. Classification Phase


In the fourth and final step, which is classification, the trained model will predict which class an
image belongs to. For example, if a model was trained to differentiate X-ray images of healthy lungs
and tuberculosis-infected lungs, it should be able to correctly classify new images (images that are
never seen by the model before) into healthy lungs or tuberculosis-infected lungs. The model will give
a probability score for the image. The probability score represents how likely it is that an image belongs to a
certain class. At the end of this step, the image will be classified based on the probability score given
to it by the model.
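The following minimal sketch shows how a probability score can be turned into a class label. It assumes the model ends in a softmax layer, which is a common choice but is not stated in the text; the raw scores and the helper names `softmax` and `classify` are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw model outputs into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, class_names):
    """Return the most probable class and its probability score."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], probabilities[best]


class_names = ["healthy", "tuberculosis"]
raw_scores = [0.4, 2.1]  # hypothetical outputs of a trained model for one X-ray
label, probability = classify(raw_scores, class_names)
```

In this example the image is assigned to the class with the larger score; a deployed system might instead apply a decision threshold chosen for the clinical setting.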

4. The Taxonomy of State-Of-The-Art Work on Lung Disease Detection Using Deep Learning
In this section, a taxonomy of the recent work on lung disease detection using deep learning is
presented, which is the first contribution of this paper. The taxonomy is built to summarise and provide
a clearer picture of the key concepts and focus of the existing work. Seven attributes were identified for
inclusion in the taxonomy. These attributes were chosen as they were prominent and can be found in all
the articles being surveyed. The seven attributes included in the taxonomy are image types, features,
data augmentation, types of deep learning algorithms, transfer learning, the ensemble of classifiers
and types of lung diseases. Sections 4.1–4.7 describe each attribute in detail, whereby the review of
relevant works is provided. Section 4.8 describes the datasets used by the works surveyed. Figure 3
shows the taxonomy of state-of-the-art lung disease detection using deep learning.

Figure 3. Taxonomy of lung disease detection using deep learning.



4.1. Image Type


In the papers surveyed, four types of images were used to train the model: chest X-ray, CT scans,
sputum smear microscopy images and histopathology images. These images are described in detail in
Sections 4.1.1–4.1.4. It should be noted that other imaging techniques exist, such as positron
emission tomography (PET) and magnetic resonance imaging (MRI) scans. Both PET and MRI scans
could also be used to diagnose health conditions and evaluate the effectiveness of ongoing treatment.
However, none of the papers surveyed used PET or MRI scans.

4.1.1. Chest X-rays


An X-ray is a diagnostic test that helps clinicians identify and treat medical problems [17].
The most widely performed medical X-ray procedure is a chest X-ray, which produces
images of the blood vessels, lungs, airways, heart, spine and chest bones. Traditionally, medical
X-ray images were exposed to photographic films, which require processing before they can be viewed.
To overcome this problem, digital X-rays are used [18]. Figure 4 shows several examples of chest X-ray
with different lung conditions taken from various datasets.

Figure 4. Examples of chest X-ray images.

Among the papers surveyed, the majority of them used chest X-rays. For example, X-rays
were used for tuberculosis detection [19], pneumonia detection [20], lung cancer detection [14] and
COVID-19 detection [21].

4.1.2. CT Scans
A CT scan is a form of radiography that uses computer processing to create sectional images
at various planes of depth from images taken around the patient’s body from different angles [22].
The image slices can be shown individually, or they can be stacked to produce a 3D image of the patient,
showing the tissues, organs, skeleton and any abnormalities present [23]. CT scan images deliver more
detailed information than X-rays. Figure 5 shows examples of CT scan images taken from numerous
datasets. CT scans have been used to detect lung disease in numerous works found in the literature, for
example for tuberculosis detection [24], lung cancer detection [25] and COVID-19 detection [26].

Figure 5. Examples of CT scan images.

4.1.3. Sputum Smear Microscopy Images


Sputum is a dense fluid formed in the lungs and airways leading to the lungs. To perform
sputum smear examination, a very thin layer of the sputum sample is positioned on a glass slide [27].
Among the papers surveyed, only five used sputum smear microscopy images [28–32]. Figure 6 shows
examples of sputum smear microscopy images.

Figure 6. Examples of sputum smear microscopy images.

4.1.4. Histopathology Images


Histopathology is the study of the symptoms of a disease through microscopic examination of
a biopsy or surgical specimen using glass slides. The sections are dyed with one or more stains to
visualise the different components of the tissue [33]. Figure 7 shows a few examples of histopathology
images. Among all the papers surveyed, only Coudray et al. [34] used histopathology images.

Figure 7. Examples of histopathology images.

4.2. Features
In computer vision, features are significant information extracted from images in terms of
numerical values that could be used to solve a specific problem [35]. Features might be in the form
of specific structures in the image such as points, edges, colour, sizes, shapes or objects. Logically,
the types of images affect the quality of the features.
Feature transformation is a process that creates new features using the existing features. These
new features may not have the same representation as the original features, but they may
have more discriminatory power in a different space than the original space. The purpose of
feature transformation is to provide a more useful feature for the machine learning algorithm for
object identification. The features used in the surveyed papers include: Gabor, GIST, Local binary
patterns (LBP), Tamura texture descriptor, colour and edge direction descriptor (CEDD) [36], Hu
moments, colour layout descriptor (CLD), edge histogram descriptor (EHD) [37], primitive length,
edge frequency, autocorrelation, shape features, size, orientation, bounding box, eccentricity, extent,
centroid, scale-invariant feature transform (SIFT), regional properties area and speeded up robust
features (SURF) [38]. Other feature representations in terms of histograms include pyramid histogram
of oriented gradients (PHOG), histogram of oriented gradients (HOG) [39], intensity histograms
(IH), shape descriptor histograms (SD), gradient magnitude histograms (GM), curvature descriptor
histograms (CD) and fuzzy colour and texture histogram (FCTH). Some studies even performed lung
segmentations before training their models (e.g., [13,14,36]).
From the literature, a majority of the works surveyed used features that are automatically extracted
by a CNN. A CNN can automatically learn and extract features, discarding the need for manual feature
generation [40].

4.3. Data Augmentation


In deep learning, it is very important to have a large training dataset, as the community agrees
that having more images can help improve training accuracy. Even a weak algorithm with a large
amount of data can be more accurate than a strong algorithm with a modest amount of data [41].
Another obstacle is imbalanced classes. When doing binary classification training, if the number of
samples of one class is a lot higher than the other class, the resulting model would be biased [6]. Deep
learning algorithms perform optimally when the number of samples in each class is equal or balanced.
One way to increase the training dataset without obtaining new images is to use image
augmentation. Image augmentation creates variations of the original images. This is achieved by
performing different methods of processing, such as rotations, flips, translations, zooms and adding
noise [42]. Figure 8 shows various examples of images after image augmentation.
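Several of the transformations named above can be sketched on a tiny image stored as nested lists. The helper names below are illustrative assumptions. Only lossless operations (flips, a 90° rotation, an integer translation) and salt and pepper noise are shown; arbitrary-angle rotation and zooming need interpolation and are left out.

```python
import random


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def translate(image, dx, dy, fill=0):
    """Shift content dx pixels right and dy pixels down, padding with fill."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            new_x, new_y = x + dx, y + dy
            if 0 <= new_x < width and 0 <= new_y < height:
                shifted[new_y][new_x] = image[y][x]
    return shifted


def add_salt_and_pepper(image, amount, rng, low=0, high=255):
    """Replace roughly a fraction `amount` of pixels with low or high values."""
    noisy = [row[:] for row in image]
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < amount:
                row[x] = rng.choice((low, high))
    return noisy


original = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
rng = random.Random(0)  # fixed seed so the noise is reproducible
augmented = [
    horizontal_flip(original),
    vertical_flip(original),
    rotate_90_clockwise(original),
    translate(original, dx=1, dy=1),
    add_salt_and_pepper(original, amount=0.1, rng=rng),
]
```

Each function returns a new image and leaves the original untouched, so one source image yields five additional training samples here. Whether a given transformation is appropriate depends on the data; for instance, a vertical flip may not produce a realistic chest X-ray.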
Data augmentation can also help increase the amount of relevant data in the dataset. For example,
consider a car dataset with two labels, X and Y. One subset of the dataset contains images of cars of label
X, but all the cars are facing left. The other subset contains images of cars of label Y, but all the cars are
facing right. After training, a test image of a label Y car facing left is fed into the model, and the model
labels the car as X. The prediction is wrong as the neural network searches for the most obvious
features that distinguish one class from another. To prevent this, a simple solution is to flip the images
in the existing dataset horizontally such that they face the other side. Through augmentation, we may
introduce relevant features and patterns, essentially boosting overall performance.

Figure 8. Examples of image augmentation: (a) original; (b) 45° rotation; (c) 90° rotation; (d) horizontal
flip; (e) vertical flip; (f) positive x and y translation; (g) negative x and y translation; (h) salt and pepper
noise; and (i) speckle noise.

Data augmentation also helps prevent overfitting. Overfitting refers to a case where a network
learns a function with very high variance, such that it perfectly models the training data. Data
augmentation addresses the issue of overfitting by introducing the model with more diverse data [43].
This diversity in data reduces variance and improves the generalisation of the model.
However, data augmentation cannot overcome all biases present in a small dataset [43]. Other
disadvantages of data augmentation include additional training time, transformation computing costs
and additional memory costs.

4.4. Types of Deep Learning Algorithm


The most common deep learning algorithm, CNN, is especially useful to find patterns in images.
Similar to the neural networks of the human brain, CNNs consist of neurons with trainable biases
and weights. Each neuron receives several inputs. Then, a weighted sum over the inputs is computed.
The weighted sum is then passed to an activation function, and an output is produced. The difference
between CNN and other neural networks is that CNN has convolution layers. Figure 9 shows
an example of a CNN architecture [44]. A CNN consists of multiple layers, and the main
types of layers are the convolutional layer, pooling layer and fully-connected layer. The convolutional
layer performs an operation called a “convolution”. Convolution is a linear operation involving
the multiplication of a set of weights with the input. The set of weights is called a kernel or a filter.
The input data are larger than the filter. A filter-sized section of the input is multiplied elementwise
with the filter, and the products are summed (a dot product), resulting in a single value. The pooling
layer gradually reduces the spatial size of the representation to lessen the number of parameters and
computations in the network, thus controlling overfitting. A rectified linear unit (ReLU) is added to
the CNN to apply an elementwise activation function, max(0, x), to the output
produced by the previous layer. More details of CNN can be found in [44,45].
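The three operations described above can be sketched in a few lines of plain Python. This is a didactic example under stated assumptions, not an excerpt from any surveyed system: the image, the 2×2 edge filter and the function names are hypothetical, and, as in most deep learning libraries, the "convolution" is computed without flipping the kernel.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Each output value is the dot product of the kernel with the
    kernel-sized section of the image beneath it.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(k_h):
                for b in range(k_w):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Elementwise max(0, x)."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 5x5 image with a bright vertical stripe.
image = [[0, 0, 9, 9, 0] for _ in range(5)]
# Hypothetical filter that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = convolve2d(image, kernel)  # 4x4 map, rows are [0, 18, 0, -18]
activated = relu(feature_map)            # negative responses become 0
pooled = max_pool(activated)             # 2x2 summary of the 4x4 map
```

The filter responds strongly at the left edge of the stripe, ReLU suppresses the negative response at the right edge, and pooling halves each spatial dimension while keeping the strongest response.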
CNN generally has two components when learning, which are feature extraction and classification.
In the feature extraction stage, convolution is implemented on the input data using a filter or kernel.
Then, a feature map is subsequently generated. In the classification stage, the CNN computes the
probability that the image belongs to a particular class or label. CNN is especially useful for image
classification and recognition as it automatically learns features without needing manual feature
extraction [40]. CNN also can be retrained and applied to a different domain using transfer learning [46].
Transfer learning has been shown to produce better classification results [19].

Figure 9. Example of a CNN structure.

Another deep learning algorithm is DBN. DBN can be defined as a stack of restricted Boltzmann
machines (RBM) [47]. Each layer of the DBN, except for the first and final layers, has two functions.
The layer serves as the hidden layer for the nodes that come before it, and as the input layer for
the nodes that come after it. To train a DBN, the first RBM is trained to reproduce its input as
accurately as possible. Then, the hidden layer of the first RBM is treated as the visible layer for the second
one, and the second RBM is trained using the outputs from the first RBM. This process keeps repeating
until every layer of the network is trained. After this initial training, the DBN has created a model that
can detect patterns in the data. DBN can be used to recognise objects in images, video sequences and
motion-capture data. More details of DBN can be found in [31,48].
One more example of a deep learning algorithm used in the papers surveyed is a bag of words
(BOW) model. BOW is a method to extract features from text for use in modelling. In BOW,
the number of appearances of each word in a document is counted, then the frequency of each word
is examined to identify the keywords of the document, and a frequency histogram is made. This
concept is similar to the bag of visual words (BOVW), sometimes referred to as bag-of-features. In
BOVW, image features are considered as the “words”. Image features are unique patterns that are
found in an image. The general idea of BOVW is to represent an image as a set of features, where each
feature contains keypoints and descriptors. Keypoints are the most noticeable points in an image, such
that, even if the image is rotated, shrunk or enlarged, its keypoints are always the same. A descriptor
is the description of the keypoint. Keypoints and descriptors are used to construct vocabularies and
represent each image as a frequency histogram of features. From the frequency histogram, one can
find other similar images or predict the class of the image. Lopes and Valiati proposed Bag of CNN
features to classify tuberculosis [19].
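The step of representing an image as a frequency histogram of features can be sketched as follows. The sketch assumes that a vocabulary of visual words already exists (in practice it is often obtained by clustering descriptors from the training images, which is not shown) and uses made-up two-dimensional descriptors. All names and values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))


def bovw_histogram(descriptors, vocabulary):
    """Normalised frequency histogram of visual words for one image."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Hypothetical vocabulary of three visual words.
vocabulary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
# Hypothetical descriptors extracted at four keypoints of one image.
descriptors = [(1.0, 1.0), (9.0, 1.0), (0.5, 0.0), (1.0, 9.0)]

histogram = bovw_histogram(descriptors, vocabulary)
```

Two images can then be compared, or passed to a classifier, through their histograms rather than their raw pixels.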

4.5. Transfer Learning


Transfer learning emerged as a popular method in computer vision because it allows accurate
models to be built [49]. With transfer learning, a model learned from a domain can be re-used on a
different domain. Transfer learning can be performed with or without a pre-trained model.
A pre-trained model is a model developed to solve a similar task. Instead of creating a model from
scratch to solve a similar task, the model trained on another problem is used as a starting point. Even
though a pre-trained model is trained on a task which is different from the current task, the features
learned are, in most cases, found to be useful for the new task. The objective of training a deep learning
model is to find the correct weights for the network by numerous forward and backward iterations.
By using pre-trained models that have been previously trained on large datasets, the weights and
architecture obtained can be used and applied to the current problem. One of the advantages of a
pre-trained model is the reduced cost of training for the new model [50]. This is because pre-trained
weights were used, and the model only has to learn the weights of the last few layers.

Many CNN architectures are pre-trained on ImageNet [51]. The images were gathered from
the internet and labelled by human labellers using Amazon’s Mechanical Turk crowd-sourcing
tool. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) uses a subset of ImageNet with approximately 1000 images in each of 1000 classes.
Altogether, there are approximately 1.2 million training images, 50,000 validation images and 150,000
testing images.
Transfer learning can be used in two ways: (i) fine-tuning; or (ii) using CNN as a feature extractor.
In fine-tuning, the weights of the pre-trained CNN model are preserved on some of the layers and
tuned in the others [52]. Usually, the weights of the initial layers of the model are frozen while only
the higher layers are retrained. This is because the features obtained from the first layers are generic
(e.g., edge detectors or colour blob detectors) and applicable to other tasks. The top-level layers of
the pre-trained models are retrained so that the model learns high-level features specific to the new
dataset. This method is typically recommended if the training dataset is huge and very similar to
the original dataset that the pre-trained model was trained on. Alternatively, the CNN can be used as a
feature extractor. This is conducted by removing the last fully-connected layer (the one which outputs
the probabilities for being in each of the 1000 classes from ImageNet) and then using the network as a
fixed feature extractor for the new dataset [53]. For tasks where only a small dataset is available, it is
usually recommended to take advantage of features learned by a model trained on a larger dataset in
the same domain. Then, a classifier is trained from the features extracted.
There are several issues that need to be considered when using transfer learning: (i) ensuring that
the pre-trained model selected has been trained on a similar dataset as the new target dataset; and (ii)
using a lower learning rate for CNN weights that are being fine-tuned, because the CNN weights are
expected to be relatively good, and we do not wish to distort them too quickly and too much [53].
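The idea of freezing the initial layers and using a lower learning rate for the fine-tuned weights can be illustrated with a single gradient descent step over named parameter groups. This is a toy sketch: the group names, weights, gradients and learning rates are all hypothetical, and giving the newly added classifier a larger rate than the pre-trained layers is an assumption of this example rather than a recommendation from the text.

```python
def sgd_step(parameters, gradients, learning_rates):
    """One gradient descent update with a separate learning rate per group."""
    updated = {}
    for name, weights in parameters.items():
        lr = learning_rates[name]
        updated[name] = [w - lr * g for w, g in zip(weights, gradients[name])]
    return updated


# Hypothetical weights, grouped by their role in a pre-trained CNN.
parameters = {
    "early_layers": [0.5, -0.2],    # generic features such as edge detectors
    "top_layers": [0.1, 0.3],       # pre-trained, to be fine-tuned gently
    "new_classifier": [0.0, 0.0],   # freshly added for the new task
}
# Hypothetical gradients from one mini-batch.
gradients = {name: [1.0, -1.0] for name in parameters}

learning_rates = {
    "early_layers": 0.0,      # frozen: weights are preserved
    "top_layers": 0.001,      # low rate so good weights are not distorted
    "new_classifier": 0.01,   # assumed larger rate for untrained weights
}

updated = sgd_step(parameters, gradients, learning_rates)
```

With these settings the frozen group is unchanged, the fine-tuned group moves by a small amount, and the new classifier moves ten times further for the same gradient.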

4.6. Ensemble of Classifiers


When more than one classifier is combined to make a prediction, this is known as ensemble
classification [16]. Ensemble decreases the variance of predictions and can therefore make predictions
that are more accurate than those of any individual model. From work found in the literature, the ensemble
techniques used include majority voting, probability score averaging and stacking.
In majority voting, every model makes a prediction for each test instance, or, in other words, votes
for a class label, and the final prediction is the label that received the most votes [54]. An alternate
version of majority voting is weighted majority voting, in which the votes of certain models are deemed
more important than others. For example, majority voting was used by Chouhan et al. [55].
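As a minimal sketch of majority voting and its weighted variant, the function below tallies one vote per model. The function name, the class labels and the weights are illustrative assumptions, and ties are resolved in favour of the label that was voted for first.

```python
from collections import Counter


def majority_vote(labels, weights=None):
    """Return the label with the largest (optionally weighted) vote total."""
    if weights is None:
        weights = [1.0] * len(labels)   # plain majority voting
    tally = Counter()
    for label, weight in zip(labels, weights):
        tally[label] += weight
    # max() keeps the first label that reaches the highest total.
    return max(tally, key=tally.get)


votes = ["tuberculosis", "normal", "tuberculosis"]
plain = majority_vote(votes)
weighted = majority_vote(votes, weights=[0.2, 0.9, 0.3])
```

With equal weights the two "tuberculosis" votes win; with the example weights the single "normal" vote (0.9) outweighs them (0.5 in total).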
In probability score averaging, the prediction scores of each model are added up and divided
by the number of models involved [56]. An alternate version of this is weighted averaging, where
the prediction score of each model is multiplied by the weight, and then their average is calculated.
Examples of works which used probability score averaging are found in [15,57].
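Probability score averaging can be sketched as follows. The scores and weights are made-up values, and dividing the weighted sum by the total weight is an assumption made here so that the averaged scores still sum to one.

```python
def average_scores(model_scores, weights=None):
    """Average per-class probability scores across models.

    model_scores: one list of class probabilities per model.
    weights: optional per-model weights (equal weights if omitted).
    """
    if weights is None:
        weights = [1.0] * len(model_scores)
    total_weight = sum(weights)
    n_classes = len(model_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, model_scores)) / total_weight
        for c in range(n_classes)
    ]


scores = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]   # three models, two classes
simple = average_scores(scores)
weighted = average_scores(scores, weights=[1, 1, 4])
predicted_class = weighted.index(max(weighted))
```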
In stacking ensemble, an algorithm receives the outputs of weaker models as input and tries
to learn how to best combine the input predictions to provide a better output prediction [58]. For
example, stacking ensemble was used by Rajaraman et al. [12].
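A stacking ensemble can be illustrated with a small meta-learner that takes the base models' output scores as its input features. The sketch below uses a logistic regression trained by stochastic gradient descent, which is an assumption for illustration and not the meta-learner of any cited work; the scores and labels are invented.

```python
import math


def train_meta_learner(base_outputs, labels, epochs=500, lr=0.5):
    """Fit logistic-regression weights on the base models' scores."""
    weights = [0.0] * len(base_outputs[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(base_outputs, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y   # prediction minus target
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict_meta(weights, bias, x):
    """Return 1 if the learned combination of base scores is positive."""
    return 1 if bias + sum(w * xi for w, xi in zip(weights, x)) > 0 else 0


# Each row holds the scores of two base models for one (held-out) image.
base_outputs = [[0.9, 0.4], [0.8, 0.6], [0.2, 0.5], [0.1, 0.7], [0.7, 0.3], [0.3, 0.6]]
labels = [1, 1, 0, 0, 1, 0]
weights, bias = train_meta_learner(base_outputs, labels)
stacked = [predict_meta(weights, bias, x) for x in base_outputs]
```

In practice the meta-learner is usually trained on predictions for data that the base models did not see during training, to avoid overfitting to their training-set outputs.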

4.7. Type of Disease


In this section, the deep learning techniques applied for detecting tuberculosis, pneumonia, lung
cancer and COVID-19 are discussed in greater detail in Sections 4.7.1–4.7.4, respectively. The first three
diseases were considered as they are the most common lung-related causes of critical illness and death
worldwide [2], while COVID-19 is an ongoing pandemic [3]. We also found that most of the existing
work was directed at detecting these specific lung-related diseases.

4.7.1. Tuberculosis
Tuberculosis is a disease caused by Mycobacterium tuberculosis bacteria. According to the World
Health Organisation, tuberculosis is among the ten most common causes of death in the world [59].
J. Imaging 2020, 6, 131 12 of 38

Tuberculosis infected 10 million people and killed 1.6 million in 2017. Early detection of tuberculosis is
essential to increase the chances of recovery [5].
Two studies used Computer-Aided Detection for Tuberculosis (CAD4TB) for tuberculosis
detection [60,61]. CAD4TB is a tool developed by Delft Imaging Systems in cooperation with
the Radboud University Nijmegen and the Lung Institute in Cape Town. CAD4TB works by obtaining
the patient’s chest X-ray, analysing the image via CAD4TB cloud server or CAD4TB box computer,
generating a heat map of the patient’s lung and displaying an abnormality score from 0 to 100.
Murphy et al. [60] showed that CAD4TB v6 is an accurate system, reaching the level of expert human
readers. Melendez et al. [61] introduced a technique for automated tuberculosis screening that combines
automatic chest X-ray scoring by computer-aided detection (CAD) with clinical information. This
combination improved accuracy and specificity compared with the use of either type of information
alone.
In the literature, several works use CNNs to classify tuberculosis. Heo et al. [62] presented a method
that incorporates demographic information, such as age, gender and weight, to improve CNN
performance. Their results indicate that a CNN that includes the demographic variables has
a higher area under the receiver operating characteristic curve (AUC) score and greater sensitivity
than a CNN based on chest X-ray images only. Pasa et al. [63] proposed a simple convolutional neural
network for tuberculosis detection. Compared with previous models, this method significantly reduced
the memory and computational requirements without sacrificing the classification performance.
Another CNN-based model has been presented to classify different categories of tuberculosis [64].
A CNN model is trained on the region-based global and local features to generate new features.
A support vector machine (SVM) classifier was then applied for tuberculosis manifestations recognition.
CNN has also been used to classify tuberculosis [65–67]. Ul Abideen et al. [68] used a Bayesian-based
CNN that exploits the model uncertainty and Bayesian confidence to improve the accuracy of
tuberculosis identification. In other work, a deep CNN algorithm named deep learning-based
automatic detection (DLAD), which contains 27 layers with 12 residual connections, was developed
for tuberculosis classification [69]. DLAD shows outstanding performance in tuberculosis detection when
applied to chest X-rays, obtaining results better than those of physicians and thoracic radiologists.
Lopes and Valiati proposed Bag of CNN features to classify tuberculosis [19], where feature
extraction is performed by ResNet, VggNet and GoogLeNet. Then, each chest X-ray is separated into
subregions whose size is equal to the input layer of the networks. Each subregion is regarded as a
“feature”, while each X-ray is a “bag”.
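The splitting of a chest X-ray into subregions the size of the network input can be sketched as below. The tile size of 224 pixels (the usual ImageNet input size for ResNet, VggNet and GoogLeNet), the non-overlapping layout and the dropping of partial tiles at the borders are assumptions for illustration; the surveyed work may tile the image differently.

```python
def split_into_subregions(height, width, tile):
    """Return the (top, left) corner of every full, non-overlapping tile."""
    return [
        (top, left)
        for top in range(0, height - tile + 1, tile)
        for left in range(0, width - tile + 1, tile)
    ]


# One X-ray is the "bag"; each tile is an instance passed to the CNN.
bag = split_into_subregions(height=448, width=672, tile=224)
```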
Several works that utilised transfer learning are described in this paragraph. Hwang et al.
obtained an accuracy of 90.3% and AUC of 0.964 using transfer learning from ImageNet and training
on a dataset of 10,848 chest X-rays [70]. Pre-trained GoogLeNet and AlexNet were used to perform
pulmonary tuberculosis classification by Lakhani and Sundaram [57], who concluded that higher
accuracy was achieved when using the pre-trained model. Their pre-trained AlexNet achieved
an AUC of 0.98 and their pre-trained GoogLeNet achieved an AUC of 0.97. Lopes and Valiati used
pre-trained GoogLeNet, ResNet and VggNet architectures as feature extractors and an SVM classifier
to classify tuberculosis [19]. They achieved an AUC of 0.900–0.912. Fine-tuned ResNet-50, ResNet-101,
ResNet-152, VGG16, VGG19 and AlexNet were used by Islam et al. to classify tuberculosis. These
models achieved an AUC of 0.85–0.91 [71]. Instead of using networks pre-trained from ImageNet,
pre-training can be performed on other datasets, such as the NIH-14 dataset [72]. This dataset contains
an assortment of diseases (which does not include tuberculosis) and is from the same modality as that
of the data under consideration for tuberculosis. Experiments show that the features learned from
the NIH dataset are useful for identifying tuberculosis. A study performed data augmentation and
then compared the performances of three different pre-trained models to classify tuberculosis [73].
The results show that suitable data augmentation methods were able to raise the accuracies of CNNs.
Transfer learning was also used by Abbas and Abdelsamea [74], Karnkawinpong and Limpiyakorn [75]
and Liu et al. [76]. A coarse-to-fine transfer learning was applied by Yadav et al. [77]. First, the datasets
are split according to the resolution and quality of the images. Then, transfer learning is applied to
the low-resolution dataset first, followed by the high-resolution dataset. In this case, the model was
first trained on the low-resolution NIH dataset, and then trained on the high-resolution Shenzhen
and Montgomery datasets. Sahlol et al. [78] used a CNN as a fixed feature extractor and Artificial
Ecosystem-Based Optimisation to select the optimal subset of relevant features. KNN was used
as the classifier.
Several works that utilised ensemble are described in this paragraph. An ensemble method
using the weighted averages of the probability scores for the AlexNet and GoogLeNet algorithms was
used by Lakhani and Sundaram [57]. In [79], ensemble by weighted averages of probability scores
is used. An ensemble of six CNNs was developed by Islam et al. [71]. The ensemble models were
generated by simple averaging of the probability predictions given by every single
model. Another ensemble classifier was created by combining the classifier from the Simple CNN
Feature Extraction and a classifier from Bag of CNN features proposals [19]. Three classifiers were
trained, using the features from ResNet, GoogLeNet and VggNet, respectively. The Simple Features
Ensemble combines all three classifiers, and the output is obtained through a simple soft-voting scheme.
A stacking ensemble for tuberculosis detection was proposed by Rajaraman et al. [12]. An ensemble
generated via a feature-level fusion of neural network models was also used to classify tuberculosis [80].
Three models were employed: the DenseNet, ResNet and Inception-ResNet. As such, the ensemble
was called RID network. Features were extracted using the RID network, and SVM was used as
a classifier. Tuberculosis classification was also executed using another ensemble of three regular
architectures: ResNet, AlexNet and GoogLeNet [79]. Each architecture was trained from scratch,
and different optimal hyper-parameter values were used. The sensitivity, specificity and accuracy
of the ensemble were higher than when each of the regular architectures was used independently.
The authors of [15,81] performed a probability score averaging ensemble of CNNs trained on features
extracted from different types of images: the enhanced chest X-ray images and the edge-detected
images of the chest X-ray. Rajaraman and Antani [82] studied and compared various ensemble methods
that include majority voting and stacking. Results show that stacking ensemble achieved the highest
classification accuracy.
Other techniques used to classify tuberculosis images include k-Nearest Neighbour (kNN),
sequential minimal optimisation and simple linear regression [38]. A Multiple-Instance Learning-based
approach was also attempted [83]. The advantage of this method is the lower labelling detail required
during optimisation. In addition, the minimal supervision required allows easy retraining of a
previously optimised system. One tuberculosis detection system uses ViDi Systems for image analysis
of chest X-rays [84]. ViDi is an industrial-grade deep learning image analysis software developed
by COGNEX. ViDi has shown feasible performance in the detection of tuberculosis. The authors
of [36] introduced a fully automatic frontal chest screening system that is capable of detecting
tuberculosis-infected lungs. This method begins with the segmentation of the lung. Then, features are
extracted from the segmented images. Examples of features include shape and curvature histograms.
Finally, a classifier was used to detect the disease.
For tuberculosis detection on CT scans, a method called AE-CNN was proposed [85].
An AE-CNN block was formed by combining the feature extraction of CNN and the unsupervised
features of AutoEncoder. The model then analyses the region of interest within the image to perform
the classification of tuberculosis. A research study explores the use of CT pulmonary images to diagnose
and classify tuberculosis at five levels of severity to track treatment effectiveness [24]. The tuberculosis
abnormalities only occupy limited regions in the CT image, and the dataset is quite small. Therefore,
depth-ResNet was proposed. Depth-ResNet is a 3D block-based ResNet combined with the injection
of depth information at each layer. In an attempt to automate the detection of tuberculosis-related lung
deformities without sacrificing accuracy, advanced AI algorithms were studied to draw clinically actionable
hypotheses [86]. This approach involves thorough image processing, subsequently performing feature
extraction using TensorFlow and 3D CNN to further augment the metadata with the features extracted
from the image data, and finally performing six-class binary classification using a random forest.
Another attempt for this problem was proposed by Zunair et al. [87]. They proposed a 16-layer
3D convolutional neural network with a slice selection. The goal is to estimate the tuberculosis
severity based on the CT image. An integrated method based on optical flow and a characterisation
method called Activity Description Vector (ADV) was presented to classify
chest CT scan images affected by different types of tuberculosis [88]. The important point of this
technique is the interpretation of the set of cross-sectional chest images produced by CT scan, not as a
volume but as a series of video images. This technique can extract movement descriptors capable of
classifying tuberculosis affections by analysing deformations or movements generated in these video
series. The idea of optical flow refers to the approximation of displacements of intensity patterns. In
short, the ADV vector describes the activity in image series by counting for each region of the image
the movements made in four directions of the 2D space.
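The counting step behind the ADV can be sketched as follows, assuming an optical flow field is already available as one `(dx, dy)` displacement per pixel. The grid layout, the `min_step` threshold and the function name are illustrative assumptions; this is a simplified reading of the description above, not the implementation used in [88].

```python
def activity_description_vector(flow, grid_rows, grid_cols, min_step=0.5):
    """Count movements per image region in four directions.

    flow: 2D list of (dx, dy) displacements, one per pixel.
    Returns a grid of [right, left, down, up] counters.
    """
    height, width = len(flow), len(flow[0])
    adv = [[[0, 0, 0, 0] for _ in range(grid_cols)] for _ in range(grid_rows)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            cell = adv[y * grid_rows // height][x * grid_cols // width]
            if dx > min_step:
                cell[0] += 1      # movement to the right
            elif dx < -min_step:
                cell[1] += 1      # movement to the left
            if dy > min_step:
                cell[2] += 1      # movement downwards (image coordinates)
            elif dy < -min_step:
                cell[3] += 1      # movement upwards
    return adv


flow = [
    [(1, 0), (1, 0), (0, -1), (0, 0)],
    [(1, 0), (0, 0), (0, -1), (-1, 1)],
]
adv = activity_description_vector(flow, grid_rows=1, grid_cols=2)
```

The counters of all regions, accumulated over the slice sequence, can then be concatenated into one feature vector for a classifier.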
For tuberculosis detection on sputum microscopy images, a CNN was used for
the detection and localisation of drug-sensitive tuberculosis bacilli in sputum microscopy images [29].
This method automatically localises bacilli in each view-field (a patch of the whole slide). A study
found that, when training a CNN on three different image versions, namely RGB, R-G and grayscale,
the best performance was achieved when using R-G images [28]. Image binarisation can also be used for
preprocessing before the data are fed into a CNN [30]. Image binarisation is a segmentation method
that separates the foreground and background of the microscopic sputum smear images. The segmented
foreground consists of single bacilli, touching bacilli and other artefacts. A trained CNN is then
given the foreground objects, and the CNN will classify the objects into bacilli and non-bacilli. Another
tuberculosis detection system automatically attains all view-fields using a motorised microscopic
stage [32]. After that, the data are delivered to the recognition system. A customised Inception V3
DeepNet model is used to learn from the pre-trained weights of Inception V3. Afterwards, the data
were classified using SVM. DBN was also used to detect tuberculosis bacillus present in the stained
microscopic images of sputum [31]. For segmentation, the Channel Area Thresholding algorithm is
used. Location-oriented histogram and speeded-up robust features (SURF) algorithms were used to extract
the intensity-based local bacilli features. DBN is then used to classify the bacilli objects. Table 1 shows
the summary of papers for tuberculosis detection using deep learning.
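The binarisation step used for the sputum smear images can be illustrated with a global threshold followed by a search for connected foreground objects, each of which would then be passed to the CNN. The fixed threshold, the bright-foreground convention and 4-connectivity are assumptions for this sketch; the surveyed work may use a different binarisation method.

```python
def binarise(image, threshold):
    """Mark pixels at or above the threshold as foreground (1)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


def foreground_objects(mask):
    """Group foreground pixels into 4-connected objects of (row, col) pairs."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            stack, pixels = [(r, c)], []
            while stack:                      # flood fill from the seed pixel
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            objects.append(pixels)
    return objects


image = [
    [10, 200, 210, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 220],
]
mask = binarise(image, threshold=128)
objects = foreground_objects(mask)
```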

Table 1. Summary of papers for tuberculosis detection using deep learning.

| Authors | Deep Learning Technique | Features | Dataset |
|---|---|---|---|
| [74] | CNN with transfer learning and data augmentation | Features extracted from CNN | Montgomery |
| [38] | K-nearest neighbour, Simple Linear Regression and Sequential Minimal Optimisation (SMO) Classification | Area, major axis, minor axis, eccentricity, mean, kurtosis, skewness and entropy | Shenzhen |
| [84] | ViDi | Features extracted from CNN | Unspecified |
| [64] | CNN | Gabor, LBP, SIFT, PHOG and Features extracted from CNN | Private dataset |
| [24] | CNN | Features extracted from CNN | ImageCLEF 2018 dataset |
| [62] | CNN with transfer learning, with demographic information | Features extracted from CNN + demographic information | Private dataset |
| [79] | CNN with data augmentation, and ensemble by weighted averages of probability scores | Features extracted from CNN | Montgomery, Shenzhen, Belarus, JSRT |
| [70] | CNN with transfer learning and data augmentation | Features extracted from CNN | Private dataset, Montgomery, Shenzhen |
| [69] | CNN | Features extracted from CNN | Private datasets, Montgomery, Shenzhen |
| [71] | CNN with transfer learning and ensemble by simple linear probabilities averaging | Features extracted from CNN + rule-based features | Indiana, JSRT, Shenzhen |
| [29] | CNN | HoG features | ZiehlNeelsen Sputum smear Microscopy image DataBase |
| [75] | CNN and shuffle sampling | Features extracted from CNN | Private datasets |
| [81] | CNN with transfer learning and ensemble by averaging | CNN extracted features from edge images | Montgomery, Shenzhen |
| [57] | CNN with transfer learning, data augmentation and ensemble by weighted probability scores average | Features extracted from CNN | Private dataset, Montgomery, Shenzhen, Belarus |
| [85] | AutoEncoder-CNN | Features extracted from CNN | Private dataset |
| [76] | CNN with transfer learning and shuffle sampling | Features extracted from CNN | Private dataset |
| [65] | End-to-end CNN | Features extracted from CNN | Montgomery, Shenzhen |
| [88] | Optical flow model | Activity Description Vector on optical flow of video sequences | ImageCLEF 2019 dataset |
| [28] | CNN | Colours | TBimages dataset |
| [83] | Modified maximum pattern margin support vector machine (modified miSVM) | First four moments of the intensity distributions | Private datasets |
| [61] | CAD4TB with clinical information | Features extracted from CNN + clinical features | Private dataset |
| [31] | DBN | LoH + SURF features | ZiehlNeelsen Sputum smear Microscopy image DataBase |
| [60] | CAD4TB | Features extracted from CNN | Private dataset |
| [72] | CNN with transfer learning and data augmentation | Features extracted from CNN | Montgomery, Shenzhen, NIH-14 dataset |
| [30] | CNN | Features extracted from CNN | TBimages dataset |
| [63] | CNN from scratch and data augmentation | Features extracted from CNN | Montgomery, Shenzhen, Belarus |
| [86] | 3D CNN | Features extracted from CNN + lung volume + patient attribute metadata | ImageCLEF 2019 dataset |
| [12] | CNN with transfer learning and ensemble by stacking | Local and global feature descriptors + features extracted from CNN | Private dataset, Montgomery, Shenzhen, India |
| [80] | CNN with transfer learning and feature level ensemble | Features extracted from CNN | Shenzhen |
| [15] | CNN with transfer learning and ensemble by averaging | CNN extracted features from edge images | Montgomery, Shenzhen |
| [32] | CNN with transfer learning | Features extracted from CNN | ZiehlNeelsen Sputum smear Microscopy image DataBase |
| [66] | CNN with data augmentation | Features extracted from CNN | Shenzhen |
| [73] | CNN with transfer learning and data augmentation | Features extracted from CNN | NIH-14, Montgomery, Shenzhen |
| [19] | CNN with transfer learning, Bag of CNN Features and ensemble by a simple soft-voting scheme | Features extracted from CNN + BOW | Private dataset, Montgomery, Shenzhen |
| [36] | Neural network | Shape, curvature descriptor histograms, eigenvalues of Hessian matrix | Montgomery, Shenzhen |
| [77] | CNN with transfer learning and data augmentation | Features extracted from CNN | Montgomery, Shenzhen, NIH-14 |
| [87] | 3D CNN | Features extracted from CNN | ImageCLEF 2019 dataset |
| [78] | CNN and Artificial Ecosystem-based Optimisation algorithm | Features extracted from CNN | Shenzhen |
| [67] | CNN | Features extracted from CNN | Shenzhen |
| [68] | Bayesian based CNN | Features extracted from CNN | Montgomery, Shenzhen |
| [82] | CNN with transfer learning, and ensemble by majority voting, simple averaging, weighted averaging, and stacking | Features extracted from CNN | Montgomery, Shenzhen, LDOCTCXR, 2018 RSNA pneumonia challenge dataset, Indiana dataset |

4.7.2. Pneumonia
Pneumonia is a lung infection that causes pus and fluid to fill the alveoli in one or both lungs,
thus making breathing difficult [89]. Symptoms include severe shortness of breath, chest pain, chills,
cough, fever or fatigue. Community-acquired pneumonia is still a recurrent cause of morbidity and
mortality [90]. Most of the studies used transfer learning and data augmentation. Tobias et al. [91]
straightforwardly used CNN. Stephen et al. [92] trained their CNN from scratch while using rescale,
rotation, width shift, height shift, shear, zoom and horizontal flip as their augmentation techniques.
A pre-trained CNN was utilised by the authors of [20,55,93–97] for pneumonia detection, four of
which [20,55,96,97] also applied data augmentation on their training datasets. For data augmentation,
random horizontal flipping was used by Rajpurkar et al. [96]; shifting, zooming, flipping and
40-degree rotation were used by Ayan and Ünver [20]; Chouhan et al. [55] used noise addition,
random horizontal flip, random resized crop and image intensity adjustment; and Rahman et al. [97]
used rotation, scaling and translation. Hashmi et al. [98] used CNN with transfer learning, data
augmentation and ensemble by weighted averaging.
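Two of the augmentation techniques listed above, horizontal flipping and shifting, can be sketched on an image stored as a list of pixel rows. The function names and the zero fill value are assumptions for illustration; rotation, zooming and shearing are omitted because they require interpolation.

```python
def horizontal_flip(image):
    """Mirror every pixel row."""
    return [row[::-1] for row in image]


def shift_right(image, pixels, fill=0):
    """Shift the image right by `pixels` columns (0 <= pixels <= width)."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


image = [[1, 2, 3], [4, 5, 6]]
# Each transformed copy is added to the training set with the original label.
augmented = [image, horizontal_flip(image), shift_right(image, 1)]
```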
In a unique study, Acharya and Satapathy [99] used Deep Siamese CNN architecture. Deep
Siamese network uses the symmetric structure of the two input images for classification. Thus, the X-ray
images were separated into two parts, namely the left half and the right half. Each half was then fed
into the network to compare the symmetric structure together with the amount of the infection that is
spread across these two regions. Training the model for both left and right parts of the X-ray images
makes the classification process more robust. Elshennawy and Ibrahim [100] used CNN and Long
Short-Term Memory (LSTM)-CNN for pneumonia detection. The key advantage of the LSTM is that it
can model both long- and short-term dependencies and can mitigate the vanishing gradient problem by
retaining information from long sequences in its memory. Emhamed et al. [101] studied and compared
seven different machine learning algorithms: Decision Tree, Random Forest, KNN, AdaBoost, Gradient
Boost, XGBoost and CNN. Their results show that CNN obtained the highest accuracy for pneumonia
classification, followed by Random Forest and XGBoost.
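The left/right split used as input to the Deep Siamese network described above can be sketched as follows. Mirroring the right half so that both halves share the same orientation, and dropping the centre column of odd-width images, are assumptions made for this illustration and are not stated in the surveyed work.

```python
def split_left_right(image, mirror_right=True):
    """Split an image (list of pixel rows) into equally wide halves."""
    width = len(image[0])
    half = width // 2
    left = [row[:half] for row in image]
    right = [row[width - half:] for row in image]   # skips the centre column if width is odd
    if mirror_right:
        right = [row[::-1] for row in right]
    return left, right


xray = [[1, 2, 3, 4], [5, 6, 7, 8]]
left_half, right_half = split_left_right(xray)
```

Each half would then be passed to one branch of the network, whose two branches share weights.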
In addition, Kumar et al. [102] attempted not only pneumonia classification, but also ROI
identification. Pneumonia was detected by looking at lung opacity, and Mask-RCNN based model
was used to identify lung opacity that is likely to depict pneumonia. They also performed ensemble by
combining confidence scores and bounding boxes. In addition to pneumonia detection, Hurt et al. [103]
proposed an approach that provides a probabilistic map on the chest X-ray images to assist in
the diagnosis of pneumonia. Table 2 shows the summary of papers for pneumonia detection using
deep learning.
Table 2. Summary of papers for pneumonia detection using deep learning.

| Reference | Deep Learning Technique | Features | Dataset |
|---|---|---|---|
| [99] | Deep Siamese based neural network | CNN extracted features from the left half and right half of the lungs | Unspecified Kaggle dataset |
| [20] | CNN with transfer learning and data augmentation | Features extracted from CNN | LDOCTCXR |
| [55] | CNN with transfer learning, data augmentation and ensemble by majority voting | Features extracted from CNN | LDOCTCXR |
| [93] | CNN with transfer learning | Features extracted from CNN | LDOCTCXR |
| [102] | CNN with transfer learning, data augmentation and ensemble by combining confidence scores and bounding boxes | Features extracted from CNN | Radiological Society of North America (RSNA) pneumonia dataset |
| [96] | CNN with transfer learning and data augmentation | Features extracted from CNN | NIH Chest X-ray Dataset |
| [92] | CNN from scratch and data augmentation | Features extracted from CNN | LDOCTCXR |
| [95] | CNN with transfer learning | Features extracted from CNN | LDOCTCXR |
| [91] | CNN | Features extracted from CNN | Mooney’s Kaggle dataset |
| [100] | CNN and LSTM-CNN, with transfer learning and data augmentation | Features extracted from CNN | Mooney’s Kaggle dataset |
| [103] | CNN with probabilistic map of pneumonia | Features extracted from CNN | 2018 RSNA pneumonia challenge dataset |
| [101] | Decision Tree, Random Forest, K-nearest neighbour, AdaBoost, Gradient Boost, XGBoost, CNN | Multiple features | Mooney’s Kaggle dataset |
| [98] | CNN with transfer learning, data augmentation and ensemble by weighted averaging | Features extracted from CNN | LDOCTCXR |
| [97] | CNN with transfer learning and data augmentation | Features extracted from CNN | Mooney’s Kaggle dataset |
| [94] | CNN with transfer learning | Features extracted from CNN | Private dataset |

4.7.3. Lung Cancer


One key characteristic of lung cancer is the presence of pulmonary nodules, solid clumps of tissue
that appear in and around the lungs [104]. These nodules can be seen in CT scan images and can be
malignant (cancerous) in nature or benign (not cancerous) [23].
As early as 2015, Hua et al. [105] used models of DBN and CNN to perform nodule classification
in CT scans. They showed that, using deep learning, it is possible to seamlessly extract features
for lung nodules classification into malignant or benign without computing the morphology and
texture features. Rao et al. [25] and Kurniawan et al. [106] used CNN in a straightforward way to
detect lung cancer in CT scans. Song et al. [23] compared the classification performance of CNN, deep
neural network and stacked autoencoder (a multilayer sparse autoencoder of a neural network) and
concluded that CNN has the highest accuracy among them. Ciompi et al. [107] used multi-stream
multi-scale CNNs to classify lung nodules into six different classes: solid, non-solid, part-solid, calcified,
perifissural and spiculated nodules. Specifically, they presented a multi-stream multi-scale architecture,
in which CNN concurrently handles multiple triplets of 2D views of a nodule at multiple scales
and then calculates the probability for the nodule in each of the six classes. Yu et al. [14] performed
bone elimination and lung segmentation before training with CNN. Shakeel et al. [108] performed
image denoising and enhanced the quality of the images, and then segmented the lungs by using
the improved profuse clustering technique. Afterwards, a neural network is trained to detect lung
cancer. The approach of Ardila et al. [13] consists of four components: lung segmentation, cancer
region of interest detection model, full-volume model and cancer risk prediction model. After lung
segmentation, the region of interest detection model proposes the most nodule-like regions, while
the full-volume model was trained to predict cancer probability. The outputs of these two models were
considered together to generate the final prediction. Chen et al. [109] performed nodule enhancement and
nodule segmentation before performing nodule detection.
For the works that employed transfer learning, Hosny et al. [110] and Xu et al. [111] both
used CNN with data augmentation. For augmentations, both studies used flipping, translation
and rotation. The authors of [112] leveraged the LUNA16 dataset to train a nodule detector and then
refined that detector with the KDSB17 dataset to provide global features. Combining that and local
features from a separate nodule classifier, they were able to detect lung cancer with high accuracy.
The authors of [113] used transfer learning by training the model multiple times. Training commenced
with the more general images from the ImageNet dataset, followed by detecting nodules from chest
X-rays in the ChestX-ray14 dataset, and finally detecting lung cancer nodules from the JSRT dataset.
The work in [34] is the only surveyed study to perform lung cancer detection on histopathology images.
Adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the most frequent subtypes of lung
cancer, and visual examination by an experienced pathologist is needed to differentiate them. In this
work, CNN was trained on histopathology slides images to automatically and accurately classify them
into LUAD, LUSC or normal lung tissue. Xu et al. [114] used a CNN-long short-term memory network
(LSTM) to detect lesions on chest X-ray images. Long short-term memory is an extension of RNN. This
CNN-LSTM network offers probable clinical relationships between lesions to assist the model to attain
better predictions. Table 3 shows the summary of papers for lung cancer detection using deep learning.

Table 3. Summary of papers for lung cancer detection using deep learning.

| Reference | Deep Learning Technique | Features | Dataset |
|---|---|---|---|
| [13] | CNN | Features extracted from CNN | LUNA, LIDC, NLST |
| [113] | CNN with transfer learning | Features extracted from CNN | JSRT Dataset, NIH-14 dataset |
| [107] | Multi-stream multi-scale convolutional networks | Features extracted from CNN | MILD dataset, DLCST dataset |
| [34] | CNN with transfer learning | Features extracted from CNN | NCI Genomic Data Commons |
| [110] | CNN with transfer learning and data augmentation | Features extracted from CNN | NSCLC-Radiomics, NSCLC-Radiomics-Genomics, RIDER Collections and several private datasets |
| [105] | CNN and DBN | Features extracted from CNN and DBN | LIDC-IDRI |
| [112] | CNN with transfer learning | Features extracted from CNN | Kaggle Data Science Bowl 2017 dataset, Lung Nodule Analysis 2016 (LUNA16) dataset |
| [25] | CNN | Features extracted from CNN | LIDC-IDRI |
| [108] | CNN | Features extracted from CNN | LIDC-IDRI |
| [23] | CNN with data augmentation | Features extracted from CNN | LIDC-IDRI database |
| [111] | CNN with transfer learning and data augmentation | Features extracted from CNN | Private dataset |
| [14] | Bone elimination and lung segmentation before training with CNN | Features extracted using CNN from bone eliminated lung images and segmented lung images | JSRT dataset |
| [114] | CNN-long short-term memory network | Features extracted from CNN | NIH-14 dataset |
| [109] | CNN with transfer learning and data augmentation | Features extracted from CNN | JSRT database |
| [106] | CNN with data augmentation | Features extracted from CNN | Cancer Imaging Archive |

4.7.4. COVID-19
COVID-19 is an infectious disease caused by a recently discovered coronavirus [115]. Older
adults are at high risk of developing severe illness, along with those who have underlying medical
conditions such as cardiovascular disease, chronic respiratory disease, cancer and diabetes [116].
A straightforward approach to detect COVID-19 using CNN with transfer learning and data
augmentation was used by Salman et al. [21]. For transfer learning, they used InceptionV3 as a
fixed feature extractor. Other works that implemented a similar transfer learning approach for
COVID-19 detection can be found in [117–122].
The authors of [123,124] performed 3-class classification using CNN with transfer learning,
classifying X-ray images into normal, COVID-19 and viral pneumonia cases. Chowdhury et al. [125]
utilised CNN with transfer learning and data augmentation to classify X-ray images into
normal, COVID-19 and viral pneumonia cases. The augmentation techniques used were rotation,
scaling and translation. Wang et al. [126] trained a CNN from scratch with data augmentation to perform
three-class classification. The augmentation techniques used were translation, rotation, horizontal flip
and intensity shift. Other work performing three-class classification can be found in [4,127–130]. Studies
that employ data augmentation to increase the amount of data available can be found in [131,132].
In addition to COVID-19 detection on X-ray images, Alazab et al. [131] predicted
the number of COVID-19 confirmations, recoveries and deaths in Jordan and Australia.
For works utilising ensembles, Ouyang et al. [133] implemented a weighted averaging ensemble.
Mahmud et al. [134] implemented a stacking ensemble, whereby the images were classified into four
categories: normal, COVID-19, viral pneumonia and bacterial pneumonia.
Shi et al. [135] utilised VB-Net for image segmentation and feature extraction and used a modified
random decision forests method for classification. Several handcrafted features were also calculated
and used to train the random forest model. More information about random forest can be found
in [136].
A system that receives thoracic CT images and points out suspected COVID-19 cases was
proposed by Gozes et al. [26]. The system analyses CT images in two distinct subsystems. Subsystem
A performed the 3D analysis of the case volume for nodules and focal opacities, while Subsystem B
performed the 2D analysis of each slice of the case to detect and localise larger-sized diffuse opacities.
In Subsystem A, nodule and small opacity detection was conducted using commercial software.
Besides the detection of abnormalities, the software also provided measurements and localisation.
For Subsystem B, lung segmentation was first performed, and then COVID-19-related abnormality
detection was conducted using CNN with transfer learning and data augmentation. If an image was
classified as positive, a localisation map was generated using the Grad-CAM technique. To provide a
complete review of the case, Subsystems A and B were combined. The final outputs include per slice
localisation of opacities (2D), 3D volumetric presentations of the opacities throughout the lungs and a
Corona score, which is a volumetric measurement of the opacities burden.
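The Grad-CAM localisation step can be sketched in NumPy as follows. The sketch assumes that the feature maps of the last convolutional layer and the gradients of the class score with respect to them have already been obtained from a deep learning framework; it shows only the generic Grad-CAM computation, not the specific configuration used in [26].

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a Grad-CAM localisation map.

    activations: feature maps of a convolutional layer, shape (channels, H, W)
    gradients:   d(class score) / d(activations), same shape
    Returns a map of shape (H, W), scaled to [0, 1] when it is not all zero.
    """
    activations = np.asarray(activations, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    # Channel importance: global average of the gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)
    # ReLU keeps only regions that push the class score up.
    cam = np.maximum(cam, 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam
```

In practice the map is upsampled to the input resolution and overlaid on the CT slice.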
The authors of [137] focused on a location-attention classification mechanism. First, the CT images
were preprocessed. Second, a 3D CNN model was employed to segment several candidate image
patches. Third, an image classification model was trained and employed to categorise all image
patches into one of three classes: COVID-19, Influenza-A-viral-pneumonia and irrelevant-to-infection.
A location-attention mechanism was embedded in the image classification model to differentiate
the structure and appearance of different infections. Finally, the overall analysis report for a single
CT sample was generated using the Noisy-or Bayesian function. The results show that the proposed
J. Imaging 2020, 6, 131 20 of 38

approach could detect COVID-19 cases more accurately than the same model without the location-attention mechanism.
Several other studies modified the CNN for COVID-19 detection. In [138], a multi-objective differential
evolution-based CNN was utilised. Sedik et al. [139] implemented CNN and LSTM with data
augmentation, while Ahsan et al. [140] employed an MLP-CNN based model. The authors of [141]
employed a capsule network-based framework with transfer learning. Table 4 summarises the
papers on COVID-19 detection using deep learning.
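The Noisy-or step mentioned for [137] can be illustrated with the standard noisy-OR rule, which combines per-patch probabilities p_i into a sample-level probability 1 - prod(1 - p_i). The sketch below assumes one probability per image patch for a single class; the exact formulation used in [137] may differ.

```python
import numpy as np

def noisy_or(patch_probs):
    """Probability that at least one patch is positive, assuming independent patches."""
    patch_probs = np.asarray(patch_probs, dtype=float)
    return float(1.0 - np.prod(1.0 - patch_probs))
```

Under this rule a single confident patch dominates the result, which suits infections that appear in only a few regions of a CT volume.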

Table 4. Summary of papers for COVID-19 detection using deep learning.

Authors | Deep Learning Technique | Features | Dataset
[137] | CNN with transfer learning and location-attention classification mechanism | Features extracted from CNN | Private dataset
[125] | CNN with transfer learning and data augmentation | Features extracted from CNN | SIRM database, Cohen’s Github dataset, Chowdhury’s Kaggle dataset
[26] | RADLogics Inc., CNN with transfer learning and data augmentation | Features extracted from RADLogics Inc and CNN | Chainz Dataset, a dataset from a hospital in Wenzhou, China, dataset from El-Camino Hospital (CA) and Lung image database consortium (LIDC)
[123] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github dataset and LDOCTCXR
[21] | CNN with transfer learning and data augmentation | Features extracted from CNN | Cohen’s Github dataset and unspecified Kaggle dataset
[135] | VB-Net and modified random decision forests method | 96 handcrafted image features | Dataset obtained from Tongji Hospital of Huazhong University of Science and Technology, Shanghai Public Health Clinical Center of Fudan University, and China-Japan Union Hospital of Jilin University
[126] | CNN from scratch and data augmentation | Features extracted from CNN | COVIDx Dataset
[127] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github dataset, Andrew’s Kaggle dataset, LDOCTCXR
[117] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github dataset, RSNA pneumonia dataset, COVIDx
[131] | CNN with transfer learning and data augmentation | Features extracted from CNN | Sajid’s Kaggle dataset
[4] | CNN with transfer learning and data augmentation | Features extracted from CNN | Cohen’s Github dataset, Mooney’s Kaggle dataset
[118] | CNN with transfer learning | Features extracted from CNN | COVID-CT-Dataset
[128] | CNN as feature extractor and long short-term memory (LSTM) network as classifier | Features extracted from CNN | GitHub, Radiopaedia, The Cancer Imaging Archive, SIRM, Kaggle repository, NIH dataset, Mendeley dataset
[132] | CNN with transfer learning and synthetic data generation and augmentation | Features extracted from CNN | Cohen’s Github, Chowdhury’s Kaggle dataset, COVID-19 Chest X-ray Dataset Initiative
[129] | CNN with transfer learning, data augmentation and ensemble by majority voting | Features extracted from CNN | Cohen’s Github, LDOCTCXR
[134] | CNN with transfer learning and stacking ensemble | Features extracted from CNN | Private dataset, LDOCTCXR
[130] | CNN | Features extracted from CNN | Private dataset
[138] | Multi-objective differential evolution-based CNN | Features extracted from CNN | Unspecified
[119] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github
[139] | CNN and ConvLSTM with data augmentation | Features extracted from CNN | Cohen’s Github, COVID-CT-Dataset
[120] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github
[133] | CNN with ensemble by weighted averaging | Features extracted from CNN | Private hospital datasets
[121] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github, Mooney’s Kaggle dataset, Shenzhen and Montgomery datasets
[140] | MLP-CNN based model | Features extracted from CNN | Cohen’s Github
[122] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github, unspecified Kaggle dataset
[141] | Capsule Network-based framework with transfer learning | Features extracted from CNN | Cohen’s Github, Mooney’s Kaggle dataset

4.8. Dataset
The datasets used by the surveyed works are reported in this section. Tables 5–8 show
the summary of datasets used for tuberculosis, pneumonia, lung cancer and COVID-19 detection,
respectively. This is done to provide readers with relevant information on the datasets. Note that only
public datasets are included in the tables because they are available to the public, whereas private
datasets are inaccessible without permission.
According to Table 5, among the twelve datasets used for tuberculosis detection works, five of
them do not contain tuberculosis medical images: JSRT dataset, Indiana dataset, NIH-14 dataset,
LDOCTCXR and RSNA pneumonia dataset. JSRT dataset contains lung cancer images, while
the Indiana and NIH-14 datasets contain multiple different diseases. LDOCTCXR and RSNA
pneumonia datasets both contain pneumonia and normal lung images. These five datasets were
used for transfer learning in several studies. Models were first trained to identify abnormalities in
chest X-ray, and then they were trained to identify tuberculosis. The India, Montgomery and Shenzhen
datasets contain X-ray images of tuberculosis; ImageCLEF 2018 and ImageCLEF 2019 datasets contain
CT images of tuberculosis; and the Belarus dataset contains both X-ray and CT images of tuberculosis.
Two of the datasets contain sputum smear microscopy images of tuberculosis: the TBimages dataset
and ZiehlNeelsen Sputum smear Microscopy image DataBase.
For detection works related to pneumonia, only four public datasets are available, as shown in
Table 6. All four datasets contain X-ray images only. Even though the number of datasets is low,
the number of images within these datasets is high. Future studies utilising these datasets should have
sufficient data.

Table 5. Summary of datasets used for tuberculosis detection.

Name | Disease | Image Type | Reference | Number of Images | Link
Belarus dataset | Tuberculosis | X-ray and CT | [142] | 1299 | http://tuberculosis.by
ImageCLEF 2018 dataset | Tuberculosis | CT | - | 2287 | https://www.imageclef.org/2018/tuberculosis
ImageCLEF 2019 dataset | Tuberculosis | CT | [143] | 335 | https://www.imageclef.org/2019/medical/tuberculosis
India | Tuberculosis | X-ray | [39] | 78 tuberculosis and 78 normal | https://sourceforge.net/projects/tbxpredict/
Indiana Dataset | Multiple diseases with annotations | X-ray | [144] | 7284 | https://openi.nlm.nih.gov
JSRT dataset | Lung nodules and normal | X-ray and CT | [145] | 154 nodule and 93 non-nodule | http://db.jsrt.or.jp/eng.php
Montgomery and Shenzhen datasets | Tuberculosis and normal | X-ray | [146] | 394 tuberculosis and 384 normal | https://lhncbc.nlm.nih.gov/publication/pub9931
NIH-14 dataset | Pneumonia and 13 other diseases | X-ray | [147] | 112,120 | https://www.kaggle.com/nih-chest-xrays/data
TBimages dataset | Tuberculosis | Sputum smear microscopy image | [148] | 1320 | http://www.tbimages.ufam.edu.br/
ZiehlNeelsen Sputum smear Microscopy image DataBase | Tuberculosis | Sputum smear microscopy image | [27] | 620 tuberculosis and 622 normal | http://14.139.240.55/znsm/
Large Dataset of Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images (LDOCTCXR) | Pneumonia and normal | X-ray | [93] | 3883 pneumonia and 1349 normal | https://data.mendeley.com/datasets/rscbjbr9sj/3
Radiological Society of North America (RSNA) pneumonia dataset | Pneumonia and normal | X-ray | - | 5528 | https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data

Table 6. Summary of datasets used for pneumonia detection.

Name | Disease | Image Type | Reference | Number of Images | Link
LDOCTCXR | Pneumonia and normal | X-ray | [93] | 3883 pneumonia and 1349 normal | https://data.mendeley.com/datasets/rscbjbr9sj/3
NIH Chest X-ray Dataset | Pneumonia and 13 other diseases | X-ray | [147] | 112,120 | https://www.kaggle.com/nih-chest-xrays/data
Radiological Society of North America (RSNA) pneumonia dataset | Pneumonia and normal | X-ray | - | 5528 | https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data
Mooney’s Kaggle dataset | Pneumonia and normal | X-ray | - | 5863 | https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

According to Table 7, among the ten datasets used for lung cancer detection works, only one
contains histopathology images, which is the NCI Genomic Data Commons dataset. The NIH-14
dataset contains X-ray images, while the JSRT dataset contains a mix of X-ray and CT images. The rest
of the datasets all contain CT images.

Table 7. Summary of datasets used for lung cancer detection.

Name | Disease | Image Type | Reference | Number of Images | Link
JSRT dataset | Lung nodules and normal lungs | X-ray and CT | [145] | 154 nodule and 93 non-nodule | http://db.jsrt.or.jp/eng.php
Kaggle Data Science Bowl 2017 dataset | Lung Cancer | CT scans | - | 601 | https://www.kaggle.com/c/data-science-bowl-2017/overview
LIDC-IDRI | Lung Cancer | CT | [149] | 1018 | https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI
Lung Nodule Analysis 2016 (LUNA16) dataset | Location and size of lung nodules | CT scans | [8] | 888 | https://luna16.grand-challenge.org/download/
NCI Genomic Data Commons | Lung Cancer | Histopathology images | [150] | More than 575,000 | https://portal.gdc.cancer.gov/
NIH-14 dataset | 14 lung diseases | X-ray | [147] | 112,120 | https://www.kaggle.com/nih-chest-xrays/data
NLST | Lung Cancer | CT | - | Approximately 200,000 | https://biometry.nci.nih.gov/cdas/learn/nlst/images/
NSCLC-Radiomics | Lung Cancer | CT | - | 422 | https://wiki.cancerimagingarchive.net/display/Public/NSCLC-Radiomics
NSCLC-Radiomics-Genomics | Lung Cancer | CT | - | 89 | https://wiki.cancerimagingarchive.net/display/Public/NSCLC-Radiomics-Genomics
RIDER Collections | Lung Cancer | CT | - | Approximately 280,000 | https://wiki.cancerimagingarchive.net/display/Public/RIDER+Collections

Table 8 shows that there are thirteen public datasets related to COVID-19. With the rise of
the COVID-19 pandemic, multiple datasets have been made available to the public. Many of these
datasets still have a rising number of images. Therefore, the number of images within the datasets
might be different from the number reported in this paper. Take note that some of the images might be
contained in multiple datasets. Therefore, future studies should check for duplicate images.
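A check for duplicate images across merged datasets can be sketched with a content hash. The sketch below is a minimal illustration that only finds byte-identical files; the function name and the (name, bytes) input format are assumptions, and copies that were resized or re-encoded would need a perceptual hash, which is not shown.

```python
import hashlib
from collections import defaultdict

def find_duplicates(named_images):
    """Group images with identical content.

    named_images: iterable of (name, raw_bytes) pairs, e.g. file path and file content.
    Returns a list of name groups; each group holds two or more identical images.
    """
    groups = defaultdict(list)
    for name, data in named_images:
        digest = hashlib.sha256(data).hexdigest()  # same bytes give the same digest
        groups[digest].append(name)
    return [names for names in groups.values() if len(names) > 1]
```

Removing all but one image from each group before splitting the data helps prevent the same image from appearing in both the training and the test set.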
Table 9 summarises the works surveyed based on the taxonomy. This allows readers to quickly
locate articles according to the attributes they are interested in. The analysis of the distribution of works
based on the identified attributes of the taxonomy is given in the following section.

Table 8. Summary of datasets used for COVID-19 detection.

Name | Disease | Image Type | Reference | Number of Images | Link
Andrew’s Kaggle dataset | COVID-19 | X-ray and CT | - | 79 | https://www.kaggle.com/andrewmvd/convid19-x-rays
Chainz Dataset | COVID-19 and normal | CT | - | 50 COVID-19, 51 normal | www.ChainZ.cn
Chowdhury’s Kaggle dataset | COVID-19, normal and pneumonia | X-ray | [125] | 219 COVID-19, 1341 normal and 1345 pneumonia | https://www.kaggle.com/tawsifurrahman/covid19-radiography-database
Cohen’s Github dataset | COVID-19 | X-ray and CT | [151] | 123 | https://github.com/ieee8023/covid-chestxray-dataset
COVIDx Dataset | COVID-19, normal and pneumonia | X-ray | [126] | 573 COVID-19, 8066 normal and 5559 pneumonia | https://github.com/lindawangg/COVID-Net/blob/master/docs/COVIDx.md
Italian Society Of Medical And Interventional Radiology (SIRM) COVID-19 Database | COVID-19 | X-ray and CT | - | 68 | https://www.sirm.org/category/senza-categoria/covid-19/
LDOCTCXR | Pneumonia and normal | X-ray | [93] | 3883 pneumonia and 1349 normal | https://data.mendeley.com/datasets/rscbjbr9sj/3
Lung image database consortium (LIDC) | Lung Cancer | CT | [149] | 1018 | https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI
Sajid’s Kaggle dataset | COVID-19 and normal | X-ray | - | 28 normal, 70 COVID-19 | https://www.kaggle.com/nabeelsajid917/covid-19-x-ray-10000-images
Mooney’s Kaggle dataset | Pneumonia and normal | X-ray | - | 5863 | https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia
COVID-CT Dataset | COVID-19 and normal | CT | - | 349 COVID-19 and 463 non-COVID-19 | https://github.com/UCSD-AI4H/COVID-CT
Mendeley Augmented COVID-19 X-ray Images Dataset | COVID-19 and normal | X-ray | - | 912 | https://data.mendeley.com/datasets/2fxz4px6d8/4
COVID-19 Chest X-Ray Dataset Initiative | COVID-19 | X-ray | - | 55 | https://github.com/agchung/Figure1-COVID-chestxray-dataset

Table 9. Summary of the works surveyed based on the taxonomy.

Attributes | Subattributes | References
Image types | X-Ray | [4,12,14,15,19–21,24,36,38,55,57,60–85,91–103,109,113,114,117,119–129,131,132,134,139–141]
Image types | CT Scans | [13,23,25,26,86–88,105–108,110–112,118,130,133,135,137–139]
Image types | Sputum Smear Microscopy Images | [28–32]
Image types | Histopathology images | [34]
Features | Extracted from CNN | [4,12–15,19–21,23–26,30,32,34,55,57,60–82,84–87,91–103,105–114,117–134,137–141]
Features | Others | [12,15,26,28,29,31,36,38,61,62,64,71,81,83,86,88,105,135]
Data augmentation | Yes | [4,20,21,23,26,55,57,63,66,70,73,74,77,79,92,96–98,100,102,106,109–111,114,122,125,126,128,129,131,132,139]
Types of deep learning algorithm | CNN | [4,12–15,19–21,23–26,28–30,32,34,55,57,60–69,72,74,76–82,84–86,91–103,105–114,117–134,137–141]
Types of deep learning algorithm | Non-CNN | [19,26,31,36,38,83,88,105,135]
Transfer learning | Fixed feature extractor | [12,15,19,21,62,70,76,78,80,81,93,94,96,100,102,117,127,128,137]
Transfer learning | Fine-tuning CNN | [4,20,26,32,34,55,57,71–74,76,77,79,82,95,97,98,102,109–113,118–125,129,131,132,134,141]
Ensemble | Majority voting | [19,55,82,129]
Ensemble | Probability score averaging | [15,57,71,79,81,82,98,102,133]
Ensemble | Stacking | [12,82,134]
Ensemble | Other | [80]
Disease types | Tuberculosis | [12,15,19,24,28–32,36,38,57,60–88]
Disease types | Pneumonia | [20,55,91–103]
Disease types | Lung cancer | [13,14,23,25,34,105–114]
Disease types | COVID-19 | [4,21,26,117–135,137–141]
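The reference lists in Table 9 can be tallied programmatically, for example to cross-check the distributions discussed in Section 5. The sketch below expands the citation ranges of the four disease-type rows and counts the works per disease; the helper name `expand_refs` is hypothetical, and the strings are copied from the table.

```python
def expand_refs(ref_string):
    """Expand a citation list such as '[20,55,91–103]' into sorted reference numbers."""
    refs = set()
    for part in ref_string.strip("[]").split(","):
        part = part.strip().replace("–", "-")   # treat an en dash as a range separator
        if "-" in part:
            low, high = part.split("-")
            refs.update(range(int(low), int(high) + 1))
        elif part:
            refs.add(int(part))
    return sorted(refs)

# Disease-type rows as listed in Table 9.
disease_refs = {
    "Tuberculosis": "[12,15,19,24,28–32,36,38,57,60–88]",
    "Pneumonia": "[20,55,91–103]",
    "Lung cancer": "[13,14,23,25,34,105–114]",
    "COVID-19": "[4,21,26,117–135,137–141]",
}

counts = {disease: len(expand_refs(refs)) for disease, refs in disease_refs.items()}
total = sum(counts.values())
```

The four counts add up to 98, which agrees with the number of articles stated in the conclusions.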

5. Analysis of Trend, Issues and Future Directions of Lung Disease Detection Using
Deep Learning
In this section, a broad analysis of the existing work is presented, which is the last contribution
outlined in this paper. The trend of each attribute identified in the foregoing section
is analysed to show the progress of the works and the direction in which researchers
have been heading over the last five years. The trends shown could be useful in suggesting future directions
for work in this domain. Section 5.1 presents the analysis of the trend of the articles considered.
The issues and potential future work to address the identified issues are described in Section 5.2.

5.1. An Analysis of the Trend of Lung Disease Detection in Recent Years


This subsection presents the analysis of lung disease detection works in recent years for each
attribute of the taxonomy described in the foregoing section.

5.1.1. Trend Analysis of the Image Type Used


Figure 10a shows that the usage of X-ray images increases linearly over the years. The usage of
CT images also increases over the years, with a slight dip in 2018. The sputum smear microscopy and
histopathology images are combined into one category, ‘Others’, due to the low number of previous works
using them to detect lung diseases. The usage of other image types slowly increases until 2018, and
then drops. This indicates that deep learning aided lung disease detection works are heading towards
using X-ray images and CT images.
Figure 10b shows that the majority of the studies used X-ray images, at 71%, while CT images
came second with 23%. This could be due to the greater availability, accessibility and mobility
of X-ray machines compared with CT scanners. As the COVID-19 pandemic has spread to all types
of geographical locations, it is anticipated that X-ray images will remain the dominant choice of
medical images used to detect lung-related diseases over CT images. CT images may remain the second
choice because they provide more detailed information than X-rays.

(a) (b)
Figure 10. (a) The trend of the usage of image types in lung disease detection works in recent years; and
(b) the distribution of the image type used in deep learning aided lung disease detection in recent years.

5.1.2. Trend Analysis of the Features Used


From the perspective of features used for lung disease detection in recent years, as shown in
Figure 11a, the usage of CNN extracted features is steadily increasing, while the usage of other features
and of the combination of CNN extracted features plus other features remains low. This is because CNN
allows automated feature extraction, removing the need for manual feature generation [40]. Other
features were less preferred because most recent works showed the superiority of
CNN extracted features in detecting lung diseases. Figure 11b shows the distribution of work by type
of features used. CNN extracted features were used in 79% of the works. The combination of CNN
extracted features plus some other features was used in 13% of the recent works, while the remaining
works utilised other types of features.

(a) (b)
Figure 11. (a) The trend of the usage of features in lung disease detection works in recent years; and
(b) the distribution of the features used in deep learning aided lung disease detection in
recent years.

5.1.3. Trend Analysis of the Usage of Data Augmentation


Figure 12a shows the trend of the usage of data augmentation. Although implementing data
augmentation increases the complexity of data pre-processing, the number of works employing
data augmentation increases steadily over the years. This trend signifies that more researchers have
realised how beneficial data augmentation is for training lung disease detection models.
Figure 12b shows the distribution of data augmentation usage in deep learning aided lung
disease detection. Only about one-third of the studies used data augmentation. Although data
augmentation is reported to improve classification accuracy, the majority of works did not use
it. One reason for this might be that data augmentation is not simple to implement.
As mentioned in Section 4.3, the disadvantages of data augmentation include additional memory costs,
transformation computing costs and training time.

[Figure 12 plots: (a) trend lines for COVID-19, Lung Cancer, Pneumonia and Tuberculosis over 2014-2015 to 2020; (b) With Data Augmentation versus Without Data Augmentation]

(a) (b)
Figure 12. (a) The trend of the usage of data augmentation in lung disease detection works in recent
years; and (b) the distribution of usage of data augmentation in deep learning aided lung disease
detection in recent years.

5.1.4. Trend Analysis of the Types of Deep Learning Algorithm Used


Figure 13a shows the trend of the usage of deep learning algorithms in lung disease detection
works in recent years. As shown in Figure 13, CNN was the most preferred deep learning algorithm
for the last five years. Future works will likely follow this trend, whereby more work may prefer CNN
for lung disease detection over other deep learning algorithms.
Figure 13b visualises the analysis of the usage of CNN in deep learning aided lung disease
detection in recent years. The majority of the papers surveyed used CNN. This is because CNN is
robust and can achieve high classification accuracy. Many of the works surveyed indicate that CNN
has superior performance [74]. Other benefits of using CNN include automatic feature extraction and
utilising the advantages of transfer learning, which is further analysed in the following subsection.

[Figure 13 plots: (a) trend lines for CNN and Non-CNN over 2014-2015 to 2020; (b) CNN versus Non-CNN for Tuberculosis, Pneumonia, Lung Cancer and COVID-19]

(a) (b)
Figure 13. (a) The trend of the usage of deep learning algorithms in lung disease detection works in
recent years; and (b) the distribution of the usage of CNN in deep learning aided lung disease detection
in recent years.

5.1.5. Trend Analysis of the Usage of Transfer Learning


Figure 14a shows the trend of the usage of transfer learning. As time goes on, more works
employed transfer learning. With transfer learning, there is no need to define a new model. Transfer
learning also allows features learned while training on an old task to be reused for the new task, often
increasing the classification accuracy. This could be because the model is more generalised, as it
has been trained on a greater number of images.
Figure 14b shows the usage of transfer learning among the works which used CNN. According to
the figure, 57% of the recent works utilised transfer learning. Even though the number of works
utilising transfer learning increased over the years, as shown in Figure 14a, the percentage of
works using transfer learning is just 57%. For example, in 2020, out of 44 studies that used CNN,
28 implemented transfer learning. This suggests that works in this domain are moving towards
using transfer learning, but not at a high pace. Transfer learning remains a strong
approach to lung disease detection with respect to detection performance. Hence, the distribution
of work may be skewed towards transfer learning in the near future.

[Figure 14 plots: (a) trend lines for COVID-19, Lung Cancer, Pneumonia and Tuberculosis over 2014-2015 to 2020; (b) With Transfer Learning versus Without Transfer Learning]

(a) (b)
Figure 14. (a) The trend of the usage of transfer learning in lung disease detection works in recent
years; and (b) the usage of transfer learning in lung disease detection works using CNN.

5.1.6. Trend Analysis of the Usage of Ensemble


Based on Figure 15a, ensembles were only applied to COVID-19, pneumonia and
tuberculosis detection. The usage of ensembles is slowly growing in popularity
for pneumonia and COVID-19 detection. Although less popular, the works that deployed an ensemble
classifier reported better detection performance than when not using an ensemble.
Figure 15b shows the distribution of the usage of ensembles in deep learning aided lung disease
detection. Only 15% of the studies used an ensemble. This suggests that ensemble classifiers are still
underexplored for lung disease detection. Only three types of ensemble techniques were found in the papers
surveyed, which were majority voting, probability score averaging and stacking. The difficulty of
implementing an ensemble may be the cause of such low adoption. With an ensemble, the performance
can only improve if the errors of the base classifiers have a low correlation. When using similar
data, which may occur when the size of the datasets and the number of datasets are limited,
the correlation of errors of the base classifiers tends to be high.
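The correlation of errors between two base classifiers can be measured directly on a validation set before they are combined. The sketch below is one simple way to do so, using the Pearson correlation of the 0/1 error indicators; the function name and the choice of metric are assumptions rather than a method taken from the surveyed works.

```python
import numpy as np

def error_correlation(y_true, pred_a, pred_b):
    """Pearson correlation between the error indicators of two classifiers.

    Returns NaN when either classifier makes no errors or only errors,
    because the correlation is undefined for a constant indicator.
    """
    y_true = np.asarray(y_true)
    err_a = (np.asarray(pred_a) != y_true).astype(float)
    err_b = (np.asarray(pred_b) != y_true).astype(float)
    if err_a.std() == 0 or err_b.std() == 0:
        return float("nan")
    return float(np.corrcoef(err_a, err_b)[0, 1])
```

A value close to 1 means that the two classifiers fail on the same images, so combining them is unlikely to help; values near zero or below indicate that they may complement each other.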

[Figure 15 plots: (a) trend lines for COVID-19, Pneumonia and Tuberculosis over 2014-2015 to 2020; (b) Ensemble versus No Ensemble]

(a) (b)
Figure 15. (a) The trend of the usage of ensemble classifier in lung disease detection works in recent
years; and (b) the distribution of the usage of the ensemble in deep learning aided lung disease
detection in recent years.

5.1.7. Trend Analysis of the Type of Lung Disease Detected Using Deep Learning
Based on the trend shown in Figure 16a, the total number of lung disease detection works using
deep learning increased steadily over the years, with most work related to tuberculosis detection.
As more lung disease medical image datasets become public, researchers have access to more data.
Thus, more extensive studies were conducted. Towards 2020, works on COVID-19 detection
emerged while work conducted to detect other diseases decreased tremendously. This signifies that
using deep learning to detect lung disease is still an active field of study. This also shows that much
effort was directed towards easing the burden of detecting COVID-19 using the existing manual
screening test, as anticipated.
Figure 16b shows the distribution of the diseases detected using deep learning in recent
years. The majority of works were directed at tuberculosis detection, followed by COVID-19,
lung cancer and pneumonia. The number of works on tuberculosis is high because
the majority of tuberculosis-infected inhabitants were from resource-poor regions with poor healthcare
infrastructure [61]. Therefore, tuberculosis detection using deep learning provides the opportunity
to accelerate tuberculosis diagnosis among these communities. The number of works on COVID-19
detection is the second highest because researchers all over the world are trying to reduce the burden of
detecting COVID-19, and thus many works have been published, even though COVID-19 is a relatively
new disease.

[Figure 16 plots: (a) trend lines for COVID-19, Lung Cancer, Pneumonia and Tuberculosis from <=2015 to 2020; (b) distribution of works across COVID-19, Lung Cancer, Pneumonia and Tuberculosis]

(a) (b)
Figure 16. (a) The trend of the deep learning aided lung disease detection works in recent years; and
(b) the distribution of the diseases detected using deep learning in recent years.

5.2. Issues and Future Direction of Lung Disease Detection Using Deep Learning
This subsection presents the remaining issues and corresponding future direction of lung disease
detection using deep learning, which are the final contributions of this paper. The state-of-the-art lung
disease detection field is suffering from several issues that can be found in the papers considered.
Some of the proposed future works are designed to deal with the issues found. Details of the issues
and potential future works are presented in Sections 5.2.1 and 5.2.2, respectively.

5.2.1. Issues
This section presents the issues of lung disease detection using deep learning found in
the literature. Four main issues were identified: (i) data imbalance; (ii) handling of huge image
size; (iii) limited available datasets; and (iv) high correlation of errors when using ensemble techniques.

(i) Data imbalance: When training a classifier, if the number of samples of one class is
much higher than that of the other classes, the resulting model would be biased. It is better to have the same
number of images in each class. However, oftentimes that is not the case. For example, when
performing a multiclass classification of COVID-19, pneumonia and normal lungs, the number
of images for pneumonia far exceeds the number of images for COVID-19 [126].
(ii) Handling of huge image size: Most researchers reduced the original image size during training
to reduce computational cost. It is extremely computationally expensive to train with the original
image size, and it is also time-consuming to train a deeply complex model even with the aid of
the most powerful GPU hardware.
(iii) Limited available datasets: Ideally, thousands of images of each class should be obtained for
training. This is to produce a more accurate classifier. However, due to the limited number of
datasets, the number of available training data is often less than ideal. This causes researchers
to search for other alternatives to produce a good classifier.
(iv) High correlation of errors when using ensemble techniques: An ensemble of classifiers performs
best when its members make different errors. The errors of the base classifiers used should
therefore have a very low correlation. In other
words, it is expected that the base classifiers will complement each other to produce better
classification results. Most of the studies surveyed only combine classifiers that were trained on
similar features. This causes the correlation of errors of the base classifiers to be high.
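One common mitigation for the data imbalance in issue (i) is to weight each class inversely to its frequency in the training loss. The sketch below applies this to the COVIDx class counts listed in Table 8. The weighting formula, N / (K × n_c) for N samples, K classes and n_c samples in class c, is a widely used heuristic and is an assumption here, not a technique attributed to the surveyed papers.

```python
def inverse_frequency_weights(class_counts):
    """Return a weight per class so that each class contributes equally overall.

    class_counts: mapping from class name to number of training images.
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {name: total / (n_classes * count) for name, count in class_counts.items()}

# Class counts of the COVIDx dataset as reported in Table 8.
covidx_counts = {"COVID-19": 573, "normal": 8066, "pneumonia": 5559}
weights = inverse_frequency_weights(covidx_counts)
```

With these weights, a misclassified COVID-19 image is penalised far more heavily than a misclassified normal image, which counteracts the bias towards the majority classes.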

5.2.2. Potential Future Works


This section presents the possible future works that should be considered to improve
the performance of lung disease detection using deep learning.

(i) Make datasets available to the public: Some researchers used private hospital datasets. To
obtain larger datasets, efforts such as de-identification of confidential patients’ information
can be conducted to make the data public. With more data available, the produced classifiers
would be more accurate. This is because, with more data comes more diversity. This decreases
the generalisation error because the model becomes more general as it was trained on more
examples. Medical data are hard to come by. Therefore, if the datasets were made public, more
data would be available for researchers.
(ii) Usage of cloud computing: Performing training using cloud computing might overcome
the problem of handling of huge image size. On a local mid-range computer, training with large
images will be slow. A high-end computer might speed up the process a little, but it might still
be infeasible. However, by training the deep learning model using cloud computing, we can
use multiple GPUs at a reasonable cost. This allows higher computational cost training to be
conducted faster and cheaper.
(iii) Usage of more variety of features: Most researchers use features automatically extracted by
CNN. Some other features such as SIFT, GIST, Gabor, LBP and HOG were studied. However,
many other features are still yet to be explored, for example quadtree and image histogram.
Efforts can be directed to studying different types of features. This can address the issue of
the high correlation of errors when using ensemble techniques. With more features comes
more variation. When combining many variations, the results are often better [41]. Feature
engineering allows the extraction of more information from present data. New information
is extracted in terms of new features. These features might have a better ability to describe
the variance in the training data, thus improving model accuracy.
(iv) Usage of ensemble learning: Ensemble techniques show great potential, and ensemble methods
often improve detection accuracy. An ensemble of several features might provide better detection
results. An ensemble of different deep learning techniques could also be considered because
ensembles perform better if the errors of the base classifiers have a low correlation.
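
The low-correlation argument in item (iv) can be illustrated with a small simulation. The sketch below is not taken from any of the surveyed works: it assumes three binary base classifiers with the same accuracy and compares majority voting when their errors are independent with majority voting when their errors are fully shared. The function names, the base accuracy of 0.8 and the `correlation` mixing parameter are all illustrative assumptions.

```python
import math
import random


def independent_majority_accuracy(n_classifiers, base_accuracy):
    """Closed-form majority-vote accuracy for an odd number of binary
    classifiers whose errors are statistically independent."""
    needed = n_classifiers // 2 + 1  # smallest number of correct votes that wins
    return sum(
        math.comb(n_classifiers, k)
        * base_accuracy ** k
        * (1 - base_accuracy) ** (n_classifiers - k)
        for k in range(needed, n_classifiers + 1)
    )


def simulate_majority_vote(n_samples, n_classifiers, base_accuracy,
                           correlation, seed=0):
    """Estimate majority-vote accuracy by simulation.

    `correlation` (hypothetical knob, 0.0 to 1.0) is the probability that a
    classifier copies a shared outcome instead of drawing its own. Each
    classifier keeps the same marginal accuracy either way; only the
    overlap between their errors changes.
    """
    rng = random.Random(seed)
    ensemble_correct = 0
    for _ in range(n_samples):
        # Outcome shared by every classifier that "copies" on this sample.
        shared_correct = rng.random() < base_accuracy
        correct_votes = 0
        for _ in range(n_classifiers):
            if rng.random() < correlation:
                is_correct = shared_correct
            else:
                is_correct = rng.random() < base_accuracy
            correct_votes += is_correct
        # In a binary task, the vote is right when most members are right.
        if correct_votes > n_classifiers / 2:
            ensemble_correct += 1
    return ensemble_correct / n_samples


if __name__ == "__main__":
    print("analytic, independent errors:", independent_majority_accuracy(3, 0.8))
    print("simulated, independent errors:", simulate_majority_vote(20000, 3, 0.8, 0.0))
    print("simulated, fully shared errors:", simulate_majority_vote(20000, 3, 0.8, 1.0))
```

Under these assumptions, three independent classifiers that are each 80% accurate reach a majority-vote accuracy of 0.8^3 + 3(0.8^2)(0.2) = 0.896, whereas three classifiers that always make the same mistakes stay at 0.8. Real base classifiers fall somewhere between the two extremes, which is why combining different features or architectures is suggested as a way to reduce error correlation.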

6. Limitation of the Survey


The survey has a limitation: the primary sources of work considered were those
indexed in the Scopus database, for the reason described in Section 2. Exceptions were made for
COVID-19 related works, as most of those articles were still at the preprint stage when this survey was
conducted. Concerning publication years, the latest publications included were those
published prior to October 2020. Therefore, the findings put forward in this survey do not
consider contributions from works that are not Scopus-indexed or that were published from
October 2020 onwards.

7. Conclusions
As time goes on, more works on lung disease detection using deep learning are being published.
However, a systematic survey of the current state of research and application was lacking.
This paper therefore offers an extensive survey of lung disease detection using deep learning,
specifically on tuberculosis, pneumonia, lung cancer and COVID-19, covering works published
from 2016 to September 2020. In total, 98 articles on this topic were considered in producing this survey.
To summarise and organise the key concepts and focus of the existing work
on lung disease detection using deep learning, a taxonomy of state-of-the-art deep learning aided
lung disease detection was constructed from the surveyed works. Analyses
of the trends in recent works on this topic, based on the attributes identified in the taxonomy,
are also presented. The analyses of the distribution of works show that the usage of both CNN and
transfer learning is high. Concerning the trend of the surveyed works, all the attributes identified in
the taxonomy showed, on average, a linear increase over the years, with the exception of the ensemble
attribute. The remaining issues and future directions of lung disease detection using deep learning were
subsequently established and described. Four issues of lung disease detection using deep learning were
identified: data imbalance, handling of huge image sizes, limited available datasets and high correlation
of errors when using ensemble techniques. Four potential future directions for lung disease detection using deep
learning are suggested to resolve the identified issues: making datasets available to the public, usage
of cloud computing, usage of more features and usage of ensemble learning.
To conclude, investigating how deep learning has been employed in lung disease detection is
important for keeping future research on the right track, thereby improving
the performance of disease detection systems. The presented taxonomy could be used by other
researchers to plan their research contributions and activities. The suggested future directions
could further improve the efficiency and increase the number of deep learning aided lung disease
detection applications.

Author Contributions: All authors contributed to the study conceptualisation and design. Material preparation
and analysis were performed by S.T.H.K. and M.H.A.H. The first draft of the manuscript was written by S.T.H.K.,
supervised by M.H.A.H., A.B. and H.K. All authors provided critical feedback and helped shape the manuscript.
All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by Universiti Malaysia Sabah (UMS) grant number SDK0191-2020.
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design of the study;
in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish
the results.

References
1. Bousquet, J. Global Surveillance, Prevention and Control of Chronic Respiratory Diseases; World Health
Organization: Geneva, Switzerland, 2007; pp. 12–36.
2. Forum of International Respiratory Societies. The Global Impact of Respiratory Disease, 2nd ed.; European
Respiratory Society: Sheffield, UK, 2017; pp. 5–42.
3. World Health Organization. Coronavirus Disease 2019 (COVID-19) Situation Report; Technical Report March;
World Health Organization: Geneva, Switzerland, 2020.
4. Rahaman, M.M.; Li, C.; Yao, Y.; Kulwa, F.; Rahman, M.A.; Wang, Q.; Qi, S.; Kong, F.; Zhu, X.; Zhao, X.
Identification of COVID-19 samples from chest X-Ray images using deep learning: A comparison of transfer
learning approaches. J. X-Ray Sci. Technol. 2020, 28, 821–839. [CrossRef]
5. Yahiaoui, A.; Er, O.; Yumusak, N. A new method of automatic recognition for tuberculosis disease diagnosis
using support vector machines. Biomed. Res. 2017, 28, 4208–4212.
6. Hu, Z.; Tang, J.; Wang, Z.; Zhang, K.; Zhang, L.; Sun, Q. Deep learning for image-based cancer detection and
diagnosis-A survey. Pattern Recognit. 2018, 83, 134–149. [CrossRef]
7. American Thoracic Society. Diagnostic Standards and Classification of Tuberculosis in Adults and Children.
Am. J. Respir. Crit. Care Med. 2000, 161, 1376–1395. [CrossRef]
8. Setio, A.A.A.; Traverso, A.; de Bel, T.; Berens, M.S.; van den Bogaard, C.; Cerello, P.; Chen, H.; Dou, Q.;
Fantacci, M.E.; Geurts, B.; et al. Validation, comparison, and combination of algorithms for automatic
detection of pulmonary nodules in computed tomography images: The LUNA16 challenge. Med. Image
Anal. 2017, 42, 1–13. [CrossRef]
9. Shen, D.; Wu, G.; Suk, H.I. Deep Learning in Medical Image Analysis. Annu. Rev. Biomed. Eng. 2017, 19,
221–248. [CrossRef]
10. Wu, C.; Luo, C.; Xiong, N.; Zhang, W.; Kim, T.H. A Greedy Deep Learning Method for Medical Disease
Analysis. IEEE Access 2018, 6, 20021–20030. [CrossRef]
11. Ma, J.; Song, Y.; Tian, X.; Hua, Y.; Zhang, R.; Wu, J. Survey on deep learning for pulmonary medical imaging.
Front. Med. 2019, 14, 450–469. [CrossRef]
12. Rajaraman, S.; Candemir, S.; Xue, Z.; Alderson, P.O.; Kohli, M.; Abuya, J.; Thoma, G.R.; Antani, S.
A novel stacked generalization of models for improved TB detection in chest radiographs. In Proceedings
of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society
(EMBC), Honolulu, HI, USA, 17–21 July 2018; pp. 718–721. [CrossRef]
13. Ardila, D.; Kiraly, A.P.; Bharadwaj, S.; Choi, B.; Reicher, J.J.; Peng, L.; Tse, D.; Etemadi, M.; Ye, W.;
Corrado, G.; et al. End-to-end lung cancer screening with three-dimensional deep learning on low-dose
chest computed tomography. Nat. Med. 2019, 25, 954–961. [CrossRef]
14. Gordienko, Y.; Gang, P.; Hui, J.; Zeng, W.; Kochura, Y.; Alienin, O.; Rokovyi, O.; Stirenko, S. Deep Learning
with Lung Segmentation and Bone Shadow Exclusion Techniques for Chest X-Ray Analysis of Lung Cancer.
Adv. Intell. Syst. Comput. 2019, 638–647. [CrossRef]
15. Kieu, S.T.H.; Hijazi, M.H.A.; Bade, A.; Yaakob, R.; Jeffree, S. Ensemble deep learning for tuberculosis
detection using chest X-Ray and canny edge detected images. IAES Int. J. Artif. Intell. 2019, 8, 429–435.
[CrossRef]
16. Dietterich, T.G. Ensemble Methods in Machine Learning. Int. Workshop Mult. Classif. Syst. 2000, 1–15.
[CrossRef]
17. Webb, A. Introduction To Biomedical Imaging; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2003. [CrossRef]
18. Kwan-Hoong, N.; Madan M, R. X ray imaging goes digital. Br. Med J. 2006, 333, 765–766. [CrossRef]
19. Lopes, U.K.; Valiati, J.F. Pre-trained convolutional neural networks as feature extractors for tuberculosis
detection. Comput. Biol. Med. 2017, 89, 135–143. [CrossRef] [PubMed]
20. Ayan, E.; Ünver, H.M. Diagnosis of Pneumonia from Chest X-Ray Images using Deep Learning. Sci. Meet.
Electr.-Electron. Biomed. Eng. Comput. Sci. 2019, 1–5. [CrossRef]
21. Salman, F.M.; Abu-naser, S.S.; Alajrami, E.; Abu-nasser, B.S.; Ashqar, B.A.M. COVID-19 Detection using
Artificial Intelligence. Int. J. Acad. Eng. Res. 2020, 4, 18–25.
22. Herman, G.T. Fundamentals of Computerized Tomography; Springer: London, UK, 2009; Volume 224. [CrossRef]
23. Song, Q.Z.; Zhao, L.; Luo, X.K.; Dou, X.C. Using Deep Learning for Classification of Lung Nodules on
Computed Tomography Images. J. Healthc. Eng. 2017, 2017. [CrossRef]
24. Gao, X.W.; James-reynolds, C.; Currie, E. Analysis of tuberculosis severity levels from CT pulmonary images
based on enhanced residual deep learning architecture. Neurocomputing 2019, 392, 233–244. [CrossRef]
25. Rao, P.; Pereira, N.A.; Srinivasan, R. Convolutional neural networks for lung cancer screening in computed
tomography (CT) scans. In Proceedings of the 2016 2nd International Conference on Contemporary
Computing and Informatics, IC3I 2016, Noida, India, 14–17 December 2016; pp. 489–493. [CrossRef]
26. Gozes, O.; Frid, M.; Greenspan, H.; Patrick, D. Rapid AI Development Cycle for the Coronavirus (COVID-19)
Pandemic: Initial Results for Automated Detection & Patient Monitoring using Deep Learning CT Image
Analysis Article. arXiv 2020, arXiv:2003.05037.
27. Shah, M.I.; Mishra, S.; Yadav, V.K.; Chauhan, A.; Sarkar, M.; Sharma, S.K.; Rout, C. Ziehl–Neelsen sputum
smear microscopy image database: A resource to facilitate automated bacilli detection for tuberculosis
diagnosis. J. Med. Imaging 2017, 4, 027503. [CrossRef]
28. López, Y.P.; Filho, C.F.F.C.; Aguilera, L.M.R.; Costa, M.G.F. Automatic classification of light field smear
microscopy patches using Convolutional Neural Networks for identifying Mycobacterium Tuberculosis.
In Proceedings of the 2017 CHILEAN Conference on Electrical, Electronics Engineering, Information and
Communication Technologies (CHILECON), Pucon, Chile, 18–20 October 2017.
29. Kant, S.; Srivastava, M.M. Towards Automated Tuberculosis detection using Deep Learning. In Proceedings
of the 2018 IEEE Symposium Series on Computational Intelligence (SSCI), Bengaluru, India, 18–21 November
2018; pp. 1250–1253. [CrossRef]
30. Oomman, R.; Kalmady, K.S.; Rajan, J.; Sabu, M.K. Automatic detection of tuberculosis bacilli from
microscopic sputum smear images using deep learning methods. Integr. Med. Res. 2018, 38, 691–699.
[CrossRef]
31. Mithra, K.S.; Emmanuel, W.R.S. Automated identification of mycobacterium bacillus from sputum images
for tuberculosis diagnosis. Signal Image Video Process. 2019. [CrossRef]
32. Samuel, R.D.J.; Kanna, B.R. Tuberculosis (TB) detection system using deep neural networks. Neural Comput.
Appl. 2019, 31, 1533–1545. [CrossRef]
33. Gurcan, M.N.; Boucheron, L.E.; Can, A.; Madabhushi, A.; Rajpoot, N.M.; Yener, B. Histopathological Image
Analysis: A Review. IEEE Rev. Biomed. Eng. 2009, 2, 147–171. [CrossRef]
34. Coudray, N.; Ocampo, P.S.; Sakellaropoulos, T.; Narula, N.; Snuderl, M.; Fenyö, D.; Moreira, A.L.;
Razavian, N.; Tsirigos, A. Classification and mutation prediction from non–small cell lung cancer
histopathology images using deep learning. Nat. Med. 2018, 24, 1559–1567. [CrossRef]
35. O’Mahony, N.; Campbell, S.; Carvalho, A.; Harapanahalli, S.; Hernandez, G.V.; Krpalkova, L.; Riordan, D.;
Walsh, J. Deep Learning vs. Traditional Computer Vision. Adv. Intell. Syst. Comput. 2020, 128–144.
[CrossRef]
36. Vajda, S.; Karargyris, A.; Jaeger, S.; Santosh, K.C.; Candemir, S.; Xue, Z.; Antani, S.; Thoma, G. Feature
Selection for Automatic Tuberculosis Screening in Frontal Chest Radiographs. J. Med Syst. 2018, 42.
[CrossRef]
37. Jaeger, S.; Karargyris, A.; Candemir, S.; Folio, L.; Siegelman, J.; Callaghan, F.; Xue, Z.; Palaniappan, K.;
Singh, R.K.; Antani, S.; et al. Automatic tuberculosis screening using chest radiographs. IEEE Trans. Med.
Imaging 2014, 33, 233–245. [CrossRef]
38. Antony, B.; Nizar Banu, P.K. Lung tuberculosis detection using x-ray images. Int. J. Appl. Eng. Res. 2017, 12,
15196–15201.
39. Chauhan, A.; Chauhan, D.; Rout, C. Role of gist and PHOG features in computer-aided diagnosis of
tuberculosis without segmentation. PLoS ONE 2014, 9, e112980. [CrossRef]
40. Al-Ajlan, A.; Allali, A.E. CNN-MGP: Convolutional Neural Networks for Metagenomics Gene Prediction.
Interdiscip. Sci. Comput. Life Sci. 2019, 11, 628–635. [CrossRef] [PubMed]
41. Domingos, P. A Few Useful Things to Know About Machine Learning. Commun. ACM 2012, 55, 78–87.
[CrossRef]
42. Mikołajczyk, A.; Grochowski, M. Data augmentation for improving deep learning in image classification
problem. In Proceedings of the 2018 International Interdisciplinary PhD Workshop, Swinoujscie, Poland,
9–12 May 2018; pp. 117–122. [CrossRef]
43. Shorten, C.; Khoshgoftaar, T.M. A survey on Image Data Augmentation for Deep Learning. J. Big Data 2019,
6. [CrossRef]
44. O’Shea, K.; Nash, R. An Introduction to Convolutional Neural Networks. arXiv 2015, arXiv:1511.08458v2.
45. Ker, J.; Wang, L. Deep Learning Applications in Medical Image Analysis. IEEE Access 2018, 6, 9375–9389.
[CrossRef]
46. Pan, S.J.; Yang, Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359.
[CrossRef]
47. Lanbouri, Z.; Achchab, S. A hybrid Deep belief network approach for Financial distress prediction. In
Proceedings of the 2015 10th International Conference on Intelligent Systems: Theories and Applications
(SITA), Rabat, Morocco, 20–21 October 2015; pp. 1–6. [CrossRef]
48. Hinton, G.E.; Osindero, S. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554.
[CrossRef]
49. Cao, X.; Wipf, D.; Wen, F.; Duan, G.; Sun, J. A practical transfer learning algorithm for face verification.
In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December
2013; pp. 3208–3215. [CrossRef]
50. Wang, C.; Chen, D.; Hao, L.; Liu, X.; Zeng, Y.; Chen, J.; Zhang, G. Pulmonary Image Classification Based on
Inception-v3 Transfer Learning Model. IEEE Access 2019, 7, 146533–146541. [CrossRef]
51. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural
Networks. Adv. Neural Inf. Process. Syst. 2012. [CrossRef]
52. Tajbakhsh, N.; Shin, J.Y.; Gurudu, S.R.; Hurst, R.T.; Kendall, C.B.; Gotway, M.B.; Liang, J. Convolutional
Neural Networks for Medical Image Analysis: Full Training or Fine Tuning? IEEE Trans. Med. Imaging 2016,
35, 1299–1312. [CrossRef]
53. Nogueira, K.; Penatti, O.A.; dos Santos, J.A. Towards better exploiting convolutional neural networks for
remote sensing scene classification. Pattern Recognit. 2017, 61, 539–556. [CrossRef]
54. Kabari, L.G.; Onwuka, U. Comparison of Bagging and Voting Ensemble Machine Learning Algorithm as a
Classifier. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2019, 9, 1–6.
55. Chouhan, V.; Singh, S.K.; Khamparia, A.; Gupta, D.; Albuquerque, V.H.C.D. A Novel Transfer Learning
Based Approach for Pneumonia Detection in Chest X-ray Images. Appl. Sci. 2020, 10, 559. [CrossRef]
56. Lincoln, W.P.; Skrzypek, J. Synergy of Clustering Multiple Back Propagation Networks. Adv. Neural Inf.
Process. Syst. 1990, 2, 650–659.
57. Lakhani, P.; Sundaram, B. Deep Learning at Chest Radiography: Automated Classification of Pulmonary
Tuberculosis by Using Convolutional Neural Networks. Radiology 2017, 284, 574–582. [CrossRef]
58. Divina, F.; Gilson, A.; Gómez-Vela, F.; Torres, M.G.; Torres, J.F. Stacking Ensemble Learning for Short-Term
Electricity Consumption Forecasting. Energies 2018, 11, 949. [CrossRef]
59. World Health Organisation. Global Health TB Report; World Health Organisation: Geneva, Switzerland, 2018;
p. 277.
60. Murphy, K.; Habib, S.S.; Zaidi, S.M.A.; Khowaja, S.; Khan, A.; Melendez, J.; Scholten, E.T.; Amad, F.;
Schalekamp, S.; Verhagen, M.; et al. Computer aided detection of tuberculosis on chest radiographs: An
evaluation of the CAD4TB v6 system. Sci. Rep. 2019, 10, 1–11. [CrossRef]
61. Melendez, J.; Sánchez, C.I.; Philipsen, R.H.; Maduskar, P.; Dawson, R.; Theron, G.; Dheda, K.; Van
Ginneken, B. An automated tuberculosis screening strategy combining X-ray-based computer-aided
detection and clinical information. Sci. Rep. 2016, 6, 1–8. [CrossRef]
62. Heo, S.J.; Kim, Y.; Yun, S.; Lim, S.S.; Kim, J.; Nam, C.M.; Park, E.C.; Jung, I.; Yoon, J.H. Deep Learning
Algorithms with Demographic Information Help to Detect Tuberculosis in Chest Radiographs in Annual
Workers’ Health Examination Data. Int. J. Environ. Res. Public Health 2019, 16, 250. [CrossRef]
63. Pasa, F.; Golkov, V.; Pfeiffer, F.; Cremers, D.; Pfeiffer, D. Efficient Deep Network Architectures for Fast Chest
X-Ray Tuberculosis Screening and Visualization. Sci. Rep. 2019, 9, 2–10. [CrossRef]
64. Cao, Y.; Liu, C.; Liu, B.; Brunette, M.J.; Zhang, N.; Sun, T.; Zhang, P.; Peinado, J.; Garavito, E.S.;
Garcia, L.L.; et al. Improving Tuberculosis Diagnostics Using Deep Learning and Mobile Health
Technologies among Resource-Poor and Marginalized Communities. In Proceedings of the 2016 IEEE
1st International Conference on Connected Health: Applications, Systems and Engineering Technologies,
CHASE, Washington, DC, USA, 27–29 June 2016; pp. 274–281. [CrossRef]
65. Liu, J.; Liu, Y.; Wang, C.; Li, A.; Meng, B. An Original Neural Network for Pulmonary Tuberculosis
Diagnosis in Radiographs. In Lecture Notes in Computer Science, Proceedings of the International Conference on
Artificial Neural Networks, Rhodes, Greece, 4–7 October 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp.
158–166. [CrossRef]
66. Stirenko, S.; Kochura, Y.; Alienin, O. Chest X-Ray Analysis of Tuberculosis by Deep Learning with
Segmentation and Augmentation. In Proceedings of the 2018 IEEE 38th International Conference on
Electronics and Nanotechnology (ELNANO), Kiev, Ukraine, 24–26 April 2018; pp. 422–428.
67. Andika, L.A.; Pratiwi, H.; Sulistijowati Handajani, S. Convolutional neural network modeling for
classification of pulmonary tuberculosis disease. J. Phys. Conf. Ser. 2020, 1490. [CrossRef]
68. Ul Abideen, Z.; Ghafoor, M.; Munir, K.; Saqib, M.; Ullah, A.; Zia, T.; Tariq, S.A.; Ahmed, G.; Zahra, A.
Uncertainty assisted robust tuberculosis identification with bayesian convolutional neural networks. IEEE
Access 2020, 8, 22812–22825. [CrossRef] [PubMed]
69. Hwang, E.J.; Park, S.; Jin, K.N.; Kim, J.I.; Choi, S.Y.; Lee, J.H.; Goo, J.M.; Aum, J.; Yim, J.J.; Park, C.M.
Development and Validation of a Deep Learning-based Automatic Detection Algorithm for Active
Pulmonary Tuberculosis on Chest Radiographs. Clin. Infect. Dis. 2019, 69, 739–747. [CrossRef]
70. Hwang, S.; Kim, H.E.; Jeong, J.; Kim, H.J. A Novel Approach for Tuberculosis Screening Based on Deep
Convolutional Neural Networks. Med. Imaging 2016, 9785, 1–8. [CrossRef]
71. Islam, M.T.; Aowal, M.A.; Minhaz, A.T.; Ashraf, K. Abnormality Detection and Localization in Chest X-Rays
using Deep Convolutional Neural Networks. arXiv 2017, arXiv:1705.09850v3.
72. Nguyen, Q.H.; Nguyen, B.P.; Dao, S.D.; Unnikrishnan, B.; Dhingra, R.; Ravichandran, S.R.; Satpathy, S.;
Raja, P.N.; Chua, M.C.H. Deep Learning Models for Tuberculosis Detection from Chest X-ray Images. In
Proceedings of the 2019 26th International Conference on Telecommunications (ICT), Hanoi, Vietnam, 8–10
April 2019; pp. 381–385. [CrossRef]
73. Kieu, T.; Ho, K.; Gwak, J.; Prakash, O. Utilizing Pretrained Deep Learning Models for Automated Pulmonary
Tuberculosis Detection Using Chest Radiography. Intell. Inf. Database Syst. 2019, 4, 395–403. [CrossRef]
74. Abbas, A.; Abdelsamea, M.M. Learning Transformations for Automated Classification of Manifestation of
Tuberculosis using Convolutional Neural Network. In Proceedings of the 2018 13th International Conference
on Computer Engineering and Systems (ICCES), Cairo, Egypt, 18–19 December 2018; IEEE: New York, NY,
USA, 2018; pp. 122–126.
75. Karnkawinpong, T.; Limpiyakorn, Y. Classification of pulmonary tuberculosis lesion with convolutional
neural networks. J. Phys. Conf. Ser. 2018, 1195. [CrossRef]
76. Liu, C.; Cao, Y.; Alcantara, M.; Liu, B.; Brunette, M.; Peinado, J.; Curioso, W. TX-CNN: Detecting Tuberculosis
in Chest X-Ray Images Using Convolutional Neural Network. In Proceedings of the 2017 IEEE International
Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017.
77. Yadav, O.; Passi, K.; Jain, C.K. Using Deep Learning to Classify X-ray Images of Potential Tuberculosis
Patients. In Proceedings of the 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM),
Madrid, Spain, 3–6 December 2018; IEEE: New York, NY, USA, 2018; pp. 2368–2375.
78. Sahlol, A.T.; Elaziz, M.A.; Jamal, A.T.; Damaševičius, R.; Hassan, O.F. A novel method for detection of
tuberculosis in chest radiographs using artificial ecosystem-based optimisation of deep neural network
features. Symmetry 2020, 12, 1146. [CrossRef]
79. Hooda, R.; Mittal, A.; Sofat, S. Automated TB classification using ensemble of deep architectures. Multimed.
Tools Appl. 2019, 78, 31515–31532. [CrossRef]
80. Rashid, R.; Khawaja, S.G.; Akram, M.U.; Khan, A.M. Hybrid RID Network for Efficient Diagnosis of
Tuberculosis from Chest X-rays. In Proceedings of the 2018 9th Cairo International Biomedical Engineering
Conference (CIBEC), Cairo, Egypt, 20–22 December 2018; IEEE: New York, NY, USA, 2018; pp. 167–170.
81. Kieu, S.T.H.; Hijazi, M.H.A.; Bade, A.; Saffree Jeffree, M. Tuberculosis detection using deep learning and
contrast-enhanced canny edge detected x-ray images. IAES Int. J. Artif. Intell. 2020, 9. [CrossRef]
82. Rajaraman, S.; Antani, S.K. Modality-Specific Deep Learning Model Ensembles Toward Improving TB
Detection in Chest Radiographs. IEEE Access 2020, 8, 27318–27326. [CrossRef] [PubMed]
83. Melendez, J.; Ginneken, B.V.; Maduskar, P.; Philipsen, R.H.H.M.; Reither, K.; Breuninger, M.; Adetifa, I.M.O.;
Maane, R.; Ayles, H.; Sánchez, C.I. A Novel Multiple-Instance Learning-Based Approach to Computer-Aided
Detection of Tuberculosis on Chest X-Rays. IEEE Trans. Med. Imaging 2014, 34, 179–192. [CrossRef] [PubMed]
84. Becker, A.S.; Bluthgen, C.; van Phi, V.D.; Sekaggya-Wiltshire, C.; Castelnuovo, B.; Kambugu, A.; Fehr, J.;
Frauenfelder, T. Detection of tuberculosis patterns in digital photographs of chest X-ray images using Deep
Learning: Feasibility study. Int. J. Tuberc. Lung Dis. 2018, 22, 328–335. [CrossRef] [PubMed]
85. Li, L.; Huang, H.; Jin, X. AE-CNN Classification of Pulmonary Tuberculosis Based on CT images. In
Proceedings of the 2018 9th International Conference on Information Technology in Medicine and Education
(ITME), Hangzhou, China, 19–21 October 2018; IEEE: New York, NY, USA, 2018; pp. 39–42. [CrossRef]
86. Pattnaik, A.; Kanodia, S.; Chowdhury, R.; Mohanty, S. Predicting Tuberculosis Related Lung Deformities from CT
Scan Images Using 3D CNN; CEUR-WS: Lugano, Switzerland, 2019; pp. 9–12.
87. Zunair, H.; Rahman, A.; Mohammed, N. Estimating Severity from CT Scans of Tuberculosis Patients using 3D
Convolutional Nets and Slice Selection; CEUR-WS: Lugano, Switzerland, 2019; pp. 9–12.
88. Llopis, F.; Fuster-Guillo, A.; Azorin-Lopez, J.; Llopis, I. Using improved optical flow model to detect
Tuberculosis; CEUR-WS: Lugano, Switzerland, 2019; pp. 9–12.
89. Wardlaw, T.; Johansson, E.W.; Hodge, M. Pneumonia: The Forgotten Killer of Children; United Nations
Children’s Fund (UNICEF): New York, NY, USA, 2006; p. 44.
90. Wunderink, R.G.; Waterer, G. Advances in the causes and management of community acquired pneumonia
in adults. BMJ 2017, 1–13. [CrossRef]
91. Tobias, R.R.; De Jesus, L.C.M.; Mital, M.E.G.; Lauguico, S.C.; Guillermo, M.A.; Sybingco, E.; Bandala, A.A.;
Dadios, E.P. CNN-based Deep Learning Model for Chest X-ray Health Classification Using TensorFlow. In
Proceedings of the 2020 RIVF International Conference on Computing and Communication Technologies,
RIVF 2020, Ho Chi Minh, Vietnam, 14–15 October 2020.
92. Stephen, O.; Sain, M.; Maduh, U.J.; Jeong, D.U. An Efficient Deep Learning Approach to Pneumonia
Classification in Healthcare. J. Healthc. Eng. 2019, 2019. [CrossRef]
93. Kermany, D.S.; Goldbaum, M.; Cai, W.; Lewis, M.A. Identifying Medical Diagnoses and Treatable Diseases
by Image-Based Deep Learning. Cell 2018, 172, 1122–1131.e9. [CrossRef] [PubMed]
94. Young, J.C.; Suryadibrata, A. Applicability of Various Pre-Trained Deep Convolutional Neural Networks for
Pneumonia Classification based on X-Ray Images. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 2649–2654.
[CrossRef]
95. Moujahid, H.; Cherradi, B.; Gannour, O.E.; Bahatti, L.; Terrada, O.; Hamida, S. Convolutional Neural
Network Based Classification of Patients with Pneumonia using X-ray Lung Images. Adv. Sci. Technol. Eng.
Syst. 2020, 5, 167–175. [CrossRef]
96. Rajpurkar, P.; Irvin, J.; Zhu, K.; Yang, B.; Mehta, H.; Duan, T.; Ding, D.; Bagul, A.; Ball, R.L.; Langlotz, C.; et al.
CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning. arXiv 2017,
arXiv:1711.05225v3.
97. Rahman, T.; Chowdhury, M.E.H.; Khandakar, A.; Islam, K.R.; Islam, K.F.; Mahbub, Z.B.; Kadir, M.A.;
Kashem, S. Transfer Learning with Deep Convolutional Neural Network (CNN) for Pneumonia Detection
Using Chest X-ray. Appl. Sci. 2020, 10, 3233. [CrossRef]
98. Hashmi, M.; Katiyar, S.; Keskar, A.; Bokde, N.; Geem, Z. Efficient Pneumonia Detection in Chest Xray Images
Using Deep Transfer Learning. Diagnostics 2020, 1–23. [CrossRef] [PubMed]
99. Acharya, A.K.; Satapathy, R. A Deep Learning Based Approach towards the Automatic Diagnosis of
Pneumonia from Chest Radio-Graphs. Biomed. Pharmacol. J. 2020, 13, 449–455. [CrossRef]
100. Elshennawy, N.M.; Ibrahim, D.M. Deep-Pneumonia Framework Using Deep Learning Models Based on
Chest X-Ray Images. Diagnostics 2020, 10, 649. [CrossRef] [PubMed]
101. Emhamed, R.; Mamlook, A.; Chen, S. Investigation of the performance of Machine Learning Classifiers for
Pneumonia Detection in Chest X-ray Images. In Proceedings of the 2020 IEEE International Conference on
Electro Information Technology (EIT), Chicago, IL, USA, 31 July–1 August 2020; pp. 98–104.
102. Kumar, A.; Tiwari, P.; Kumar, S.; Gupta, D.; Khanna, A. Identifying pneumonia in chest X-rays: A deep
learning approach. Measurement 2019, 145, 511–518. [CrossRef]
103. Hurt, B.; Yen, A.; Kligerman, S.; Hsiao, A. Augmenting Interpretation of Chest Radiographs with Deep
Learning Probability Maps. J. Thorac. Imaging 2020, 35, 285–293. [CrossRef]
104. Borczuk, A.C. Benign tumors and tumorlike conditions of the lung. Arch. Pathol. Lab. Med. 2008, 132,
1133–1148. [CrossRef]
105. Hua, K.L.; Hsu, C.H.; Hidayati, S.C.; Cheng, W.H.; Chen, Y.J. Computer-aided classification of lung nodules
on computed tomography images via deep learning technique. OncoTargets Ther. 2015, 8, 2015–2022.
[CrossRef]
106. Kurniawan, E.; Prajitno, P.; Soejoko, D.S. Computer-Aided Detection of Mediastinal Lymph Nodes using
Simple Architectural Convolutional Neural Network. J. Phys. Conf. Ser. 2020, 1505. [CrossRef]
107. Ciompi, F.; Chung, K.; Van Riel, S.J.; Setio, A.A.A.; Gerke, P.K.; Jacobs, C.; Th Scholten, E.; Schaefer-Prokop, C.;
Wille, M.M.; Marchianò, A.; et al. Towards automatic pulmonary nodule management in lung cancer
screening with deep learning. Sci. Rep. 2017, 7, 1–11. [CrossRef]
108. Shakeel, P.M.; Burhanuddin, M.A.; Desa, M.I. Lung cancer detection from CT image using improved profuse
clustering and deep learning instantaneously trained neural networks. Meas. J. Int. Meas. Confed. 2019, 145,
702–712. [CrossRef]
109. Chen, S.; Han, Y.; Lin, J.; Zhao, X.; Kong, P. Pulmonary nodule detection on chest radiographs using balanced
convolutional neural network and classic candidate detection. Artif. Intell. Med. 2020, 107, 101881. [CrossRef]
[PubMed]
110. Hosny, A.; Parmar, C.; Coroller, T.P.; Grossmann, P.; Zeleznik, R.; Kumar, A.; Bussink, J.; Gillies, R.J.; Mak,
R.H.; Aerts, H.J. Deep learning for lung cancer prognostication: A retrospective multi-cohort radiomics
study. PLoS Med. 2018, 15, 1–25. [CrossRef] [PubMed]
111. Xu, Y.; Hosny, A.; Zeleznik, R.; Parmar, C.; Coroller, T.; Franco, I.; Mak, R.H.; Aerts, H.J. Deep learning
predicts lung cancer treatment response from serial medical imaging. Clin. Cancer Res. 2019, 25, 3266–3275.
[CrossRef] [PubMed]
112. Kuan, K.; Ravaut, M.; Manek, G.; Chen, H.; Lin, J.; Nazir, B.; Chen, C.; Howe, T.C.; Zeng, Z.; Chandrasekhar, V.
Deep Learning for Lung Cancer Detection: Tackling the Kaggle Data Science Bowl 2017 Challenge. arXiv
2017, arXiv:1705.09435.
113. Ausawalaithong, W.; Thirach, A.; Marukatat, S.; Wilaiprasitporn, T. Automatic Lung Cancer Prediction
from Chest X-ray Images Using the Deep Learning Approach. In Proceedings of the 2018 11th Biomedical
Engineering International Conference (BMEiCON), Chiang Mai, Thailand, 21–24 November 2018.
114. Xu, S.; Guo, J.; Zhang, G.; Bie, R. Automated detection of multiple lesions on chest X-ray images:
Classification using a neural network technique with association-specific contexts. Appl. Sci. 2020, 10, 1742.
[CrossRef]
115. Huang, C.; Wang, Y.; Li, X.; Ren, L.; Zhao, J.; Hu, Y.; Zhang, L.; Fan, G.; Xu, J.; Gu, X. Clinical features of
patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020, 395, 497–506. [CrossRef]
116. Velavan, T.P.; Meyer, C.G. The COVID-19 epidemic. Trop. Med. Int. Health 2020, 25, 278–280. [CrossRef]
117. Shibly, K.H.; Dey, S.K.; Islam, M.T.U.; Rahman, M.M. COVID faster R–CNN: A novel framework to Diagnose
Novel Coronavirus Disease (COVID-19) in X-Ray images. Inform. Med. Unlocked 2020, 20, 100405. [CrossRef]
118. Alsharman, N.; Jawarneh, I. GoogleNet CNN neural network towards chest CT-coronavirus medical image
classification. J. Comput. Sci. 2020, 16, 620–625. [CrossRef]
119. Zhu, J.; Shen, B.; Abbasi, A.; Hoshmand-Kochi, M.; Li, H.; Duong, T.Q. Deep transfer learning artificial
intelligence accurately stages COVID-19 lung disease severity on portable chest radiographs. PLoS ONE
2020, 15, e0236621. [CrossRef]
120. Sethi, R.; Mehrotra, M.; Sethi, D. Deep Learning based Diagnosis Recommendation for COVID-19 using
Chest X-Rays Images. In Proceedings of the 2020 Second International Conference on Inventive Research in
Computing Applications (ICIRCA), Coimbatore, India, 15–17 July 2020.
121. Das, D.; Santosh, K.C.; Pal, U. Truncated inception net: COVID-19 outbreak screening using chest X-rays.
Phys. Eng. Sci. Med. 2020, 43, 915–925. [CrossRef] [PubMed]
122. Panwar, H.; Gupta, P.K.; Siddiqui, M.K.; Morales-Menendez, R.; Singh, V. Application of deep learning for
fast detection of COVID-19 in X-Rays using nCOVnet. Chaos Solitons Fractals 2020, 138, 109944. [CrossRef]
[PubMed]
123. Narin, A.; Kaya, C.; Pamuk, Z. Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images
and Deep Convolutional Neural Networks. arXiv 2020, arXiv:2003.10849.
124. Apostolopoulos, I.D.; Mpesiana, T.A. Covid-19: Automatic detection from X-ray images utilizing transfer
learning with convolutional neural networks. Phys. Eng. Sci. Med. 2020, 1–6. [CrossRef] [PubMed]
125. Chowdhury, M.E.H.; Rahman, T.; Khandakar, A.; Mazhar, R.; Kadir, M.A.; Reaz, M.B.I.; Mahbub, Z.B.;
Islam, K.R.; Salman, M.; Iqbal, A.; et al. Can AI help in screening Viral and COVID-19 pneumonia? arXiv
2020, arXiv:2003.13145.
126. Wang, L.; Lin, Z.Q.; Wong, A. COVID-Net: A Tailored Deep Convolutional Neural Network Design for
Detection of COVID-19 Cases from Chest X-Ray Images. Sci. Rep. 2020, 10, 1–12. [CrossRef]
127. Sethy, P.K.; Behera, S.K.; Ratha, P.K.; Biswas, P. Detection of coronavirus disease (COVID-19) based on deep
features and support vector machine. Int. J. Math. Eng. Manag. Sci. 2020, 5, 643–651. [CrossRef]
128. Islam, M.Z.; Islam, M.M.; Asraf, A. A combined deep CNN-LSTM network for the detection of novel
coronavirus (COVID-19) using X-ray images. Inform. Med. Unlocked 2020, 20, 100412. [CrossRef]
129. Shorfuzzaman, M.; Masud, M. On the detection of covid-19 from chest x-ray images using cnn-based transfer
learning. Comput. Mater. Contin. 2020, 64, 1359–1381. [CrossRef]
130. Li, L.; Qin, L.; Xu, Z.; Yin, Y.; Wang, X.; Kong, B.; Bai, J.; Lu, Y.; Fang, Z.; Song, Q.; et al. Using Artificial
Intelligence to Detect COVID-19 and Community-acquired Pneumonia Based on Pulmonary CT: Evaluation
of the Diagnostic Accuracy. Radiology 2020, 296, 65–71. [CrossRef]
131. Alazab, M.; Awajan, A.; Mesleh, A.; Abraham, A.; Jatana, V.; Alhyari, S. COVID-19 prediction and detection
using deep learning. Int. J. Comput. Inf. Syst. Ind. Manag. Appl. 2020, 12, 168–181.
132. Waheed, A.; Goyal, M.; Gupta, D.; Khanna, A.; Al-Turjman, F.; Pinheiro, P.R. CovidGAN: Data Augmentation
Using Auxiliary Classifier GAN for Improved Covid-19 Detection. IEEE Access 2020, 8, 91916–91923.
[CrossRef]
133. Ouyang, X.; Huo, J.; Xia, L.; Shan, F.; Liu, J.; Mo, Z.; Yan, F.; Ding, Z.; Yang, Q.; Song, B.; et al. Dual-Sampling
Attention Network for Diagnosis of COVID-19 from Community Acquired Pneumonia. IEEE Trans. Med.
Imaging 2020, 39, 2595–2605. [CrossRef]
134. Mahmud, T.; Rahman, M.A.; Fattah, S.A. CovXNet: A multi-dilation convolutional neural network
for automatic COVID-19 and other pneumonia detection from chest X-ray images with transferable
multi-receptive feature optimization. Comput. Biol. Med. 2020, 122, 103869. [CrossRef] [PubMed]
135. Shi, F.; Xia, L.; Shan, F.; Wu, D.; Wei, Y.; Yuan, H.; Jiang, H. Large-Scale Screening of COVID-19 from
Community Acquired Pneumonia using Infection Size-Aware Classification. arXiv 2020, arXiv:2003.09860.
136. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [CrossRef]
137. Xu, X.; Jiang, X.; Ma, C.; Du, P.; Li, X.; Ly, S.; Yu, L.; Chen, Y.; Su, J.; Lang, G.; et al. A Deep Learning System
to Screen Novel Coronavirus Disease 2019 Pneumonia. Engineering 2020. [CrossRef]
138. Singh, D.; Kumar, V.; Kaur, M. Classification of COVID-19 patients from chest CT images using
multi-objective differential evolution–based convolutional neural networks. Eur. J. Clin. Microbiol. Infect.
Dis. 2020, 39, 1379–1389. [CrossRef]
139. Sedik, A.; Iliyasu, A.M.; El-Rahiem, B.A.; Abdel Samea, M.E.; Abdel-Raheem, A.; Hammad, M.; Peng, J.;
Abd El-Samie, F.E.; Abd El-Latif, A.A. Deploying machine and deep learning models for efficient
data-augmented detection of COVID-19 infections. Viruses 2020, 12, 769. [CrossRef]
140. Ahsan, M.M.; Alam, T.E.; Trafalis, T.; Huebner, P. Deep MLP-CNN model using mixed-data to distinguish
between COVID-19 and Non-COVID-19 patients. Symmetry 2020, 12. [CrossRef]
141. Afshar, P.; Heidarian, S.; Naderkhani, F.; Oikonomou, A.; Plataniotis, K.N.; Mohammadi, A. COVID-CAPS:
A capsule network-based framework for identification of COVID-19 cases from X-ray images. Pattern
Recognit. Lett. 2020, 138, 638–643. [CrossRef] [PubMed]
142. Rosenthal, A.; Gabrielian, A.; Engle, E.; Hurt, D.E.; Alexandru, S.; Crudu, V.; Sergueev, E.; Kirichenko, V.;
Lapitskii, V.; Snezhko, E.; et al. The TB Portals: An Open-Access, Web-Based Platform for Global
Drug-Resistant-Tuberculosis Data Sharing and Analysis. J. Clin. Microbiol. 2017, 55, 3267–3282. [CrossRef]
[PubMed]
143. Cid, Y.D.; Liauchuk, V.; Klimuk, D.; Tarasau, A. Overview of ImageCLEFtuberculosis 2019 - Automatic CT-Based
Report Generation and Tuberculosis Severity Assessment. CEUR-WS: Lugano, Switzerland, 2019; pp. 9–12.
144. Demner-Fushman, D.; Kohli, M.D.; Rosenman, M.B.; Shooshan, S.E.; Rodriguez, L.; Antani, S.; Thoma, G.R.;
McDonald, C.J. Preparing a collection of radiology examinations for distribution and retrieval. J. Am. Med.
Inform. Assoc. 2016, 23, 304–310. [CrossRef]
145. Shiraishi, J.; Katsuragawa, S.; Ikezoe, J.; Matsumoto, T.; Kobayashi, T.; Komatsu, K.I.; Matsui, M.; Fujita, H.;
Kodera, Y.; Doi, K. Development of a digital image database for chest radiographs with and without a
lung nodule: Receiver operating characteristic analysis of radiologists’ detection of pulmonary nodules.
Am. J. Roentgenol. 2000, 174, 71–74. [CrossRef] [PubMed]
146. Jaeger, S.; Candemir, S.; Antani, S.; Wáng, Y.X.J.; Lu, P.X.; Thoma, G. Two public chest X-ray datasets for
computer-aided screening of pulmonary diseases. Quant. Imaging Med. Surg. 2014, 4, 475–477. [CrossRef]
147. Wang, X.; Peng, Y.; Lu, L.; Lu, Z.; Bagheri, M.; Summers, R.M. ChestX-ray8: Hospital-scale chest
X-ray database and benchmarks on weakly-supervised classification and localization of common thorax
diseases. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI,
USA, 21–26 July 2017; pp. 3462–3471.
148. Costa, M.G.; Filho, C.F.; Kimura, A.; Levy, P.C.; Xavier, C.M.; Fujimoto, L.B. A sputum smear microscopy
image database for automatic bacilli detection in conventional microscopy. In Proceedings of the 2014 36th
Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC, Chicago,
IL, USA, 26–30 August 2014; pp. 2841–2844. [CrossRef]
149. Armato, S.G.; McLennan, G.; Bidaut, L.; McNitt-Gray, M.F.; Meyer, C.R.; Reeves, A.P.; Zhao, B.; Aberle, D.R.;
Henschke, C.I.; Hoffman, E.A.; et al. The Lung Image Database Consortium (LIDC) and Image Database
Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans. Med. Phys. 2011,
38, 915–931. [CrossRef]
150. Grossman, R.L.; Heath, A.P.; Ferretti, V.; Varmus, H.E.; Lowy, D.R.; Kibbe, W.A.; Staudt, L.M. Toward a
Shared Vision for Cancer Genomic Data. N. Engl. J. Med. 2016, 375, 1109–1112. [CrossRef]
151. Cohen, J.P.; Morrison, P.; Dao, L.; Roth, K.; Duong, T.Q.; Ghassemi, M. COVID-19 Image Data Collection:
Prospective Predictions Are the Future. arXiv 2020, arXiv:2006.11988.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional
affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
