A
Project Report
on
‘Perception and Prognosis of COVID-19 Cases Using Radiology Modalities’
Submitted in partial fulfillment of the requirement for the award of the degree of
Bachelor of Engineering
in
Electronics and Communication Engineering
Submitted By
AISHWARYA PANDIT 2JH18EC006
KAVERI C. AKKI 2JH18EC020
NAYANA B. HUBBALLI 2JH18EC025
NIVEDITA P. MUSTIPALLI 2JH18EC027
CERTIFICATE
Certified that the Project entitled “Perception and Prognosis of COVID-19 Cases Using Radiology
Modalities” carried out by AISHWARYA PANDIT, bearing USN: 2JH18EC006, KAVERI
C. AKKI, bearing USN: 2JH18EC020, NAYANA B. HUBBALLI, bearing USN:
2JH18EC025, NIVEDITA P. MUSTIPALLI, bearing USN: 2JH18EC027, bona fide students
of Jain College of Engineering and Technology, is in partial fulfillment for the award of the
BACHELOR OF ENGINEERING in Electronics and Communication Engineering from
Visvesvaraya Technological University, Belagavi during the year 2021-2022. It is certified that
all the corrections/suggestions indicated for Internal Assessment have been incorporated in the
report submitted in the department library. The Project report has been approved as it satisfies
the academic requirements in respect of the Project prescribed for the said Degree.
DECLARATION
We, AISHWARYA PANDIT bearing USN: 2JH18EC006, KAVERI C. AKKI bearing USN:
2JH18EC020, NAYANA B. HUBBALLI bearing USN: 2JH18EC025, NIVEDITA P.
MUSTIPALLI bearing USN: 2JH18EC027, students of Eighth Semester B.E, Department of
Electronics and Communication Engineering, Jain College of Engineering and Technology, Sai
Nagar, Hubballi, declare that the Project Work entitled “Perception and Prognosis of COVID-19
Cases Using Radiology Modalities” has been carried out by us and submitted in partial
fulfillment of the course requirements for the award of degree in Bachelor of Engineering in
Electronics and Communication Engineering from Visvesvaraya Technological
University, Belagavi during the academic year 2021-2022. The matter embodied in this report
has not been submitted to any other university or institution for the award of any other degree.
Place: Hubballi
Date: 25/07/2022
ABSTRACT
Artificial intelligence (AI) techniques in general and convolutional neural networks (CNNs) in
particular have attained successful results in medical image analysis and classification. A deep
CNN architecture has been proposed in this paper for the diagnosis of COVID-19 based on the
chest X-ray image classification.
Due to the non-availability of sufficient-size and good-quality chest X-ray image dataset, an
effective and accurate CNN classification was a challenge. To deal with these complexities
such as the availability of a very-small-sized and imbalanced dataset with image-quality issues,
the dataset has been preprocessed in different phases using different techniques to achieve an
effective training dataset for the proposed CNN model to attain its best performance.
The preprocessing stages of the datasets performed in this study include dataset balancing,
medical experts’ image analysis, and data augmentation. The experimental results have shown
the overall accuracy as high as 99.5% which demonstrates the good capability of the proposed
CNN model in the current application domain.
The CNN model has been tested in two scenarios. In the first scenario, the model has been
tested using the 100 X-ray images of the original processed dataset which achieved an accuracy
of 100%. In the second scenario, the model has been tested using an independent dataset of
COVID-19 X-ray images.
The performance in this test scenario was as high as 99.5%. To further prove that the proposed
model outperforms other models, a comparative analysis has been done with some of the
machine learning algorithms. The proposed model has outperformed all the models generally
and specifically when the model testing was done using an independent testing set.
ACKNOWLEDGEMENT
The satisfaction and the euphoria that accompany the successful completion of any task would
be incomplete without the mention of the people who made it possible. The constant guidance
and encouragement provided by these persons crowned our efforts with success and glory.
Although it is not possible to thank individually all the members who helped in the completion
of the project, we take this opportunity to express our gratitude to one and all.
We are grateful to the management and our institute, JAIN COLLEGE OF ENGINEERING
AND TECHNOLOGY, with its ideals and inspiration, for having provided us with the
facilities which made this work a success.
We express our sincere gratitude to Dr. Prashanth Banakar, Principal, Jain College of
Engineering and Technology, for the support and encouragement.
We wish to place on record our grateful thanks to Prof. Prasanna Pattanshetty, HOD,
Department of ECE, Jain College of Engineering and Technology, for the constant
encouragement provided to us.
We are indebted, with a deep sense of gratitude, for the constant inspiration, encouragement, timely
guidance and valid suggestions given to us by our guide Prof. Poornima Patil, Professor,
Department of ECE, Jain College of Engineering and Technology.
We are thankful to all the staff members of the department for providing relevant information
and helping in different capacities in carrying out this project.
Last, but not least, we owe our debt to our parents, friends and also those who directly or
indirectly have helped us to make the project work a success.
ABSTRACT
ACKNOWLEDGEMENT
CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1. INTRODUCTION
3.2 OBJECTIVES
4.3.1.1 ANACONDA
4.3.1.2 PYTHON
4.3.1.3 FLASK
5.3 DATASET
5.4 PRE-PROCESSING
5.4.3 RESNET
CONCLUSION
REFERENCES
CHAPTER 1
INTRODUCTION
The virus called severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was
discovered in late 2019. The virus, which originated in China, causes the
disease known as Coronavirus Disease 2019 or COVID-19. The World Health Organization
(WHO) declared the disease a pandemic in March 2020 [1, 2]. According to the reports issued
and updated by global healthcare authorities and state governments, the pandemic affected
millions of people globally.
The most serious illness caused by COVID-19 is related to the lungs, such as pneumonia. The
symptoms of the disease can vary and include dyspnea, high fever, runny nose, and cough.
These cases can most commonly be diagnosed using chest X-ray imaging analysis for the
abnormalities [3]. X-radiation or X-ray is an electromagnetic form of penetrating radiation.
These radiations are passed through the desired human body parts to create images of internal
details of the body part.
The X-ray image is a representation of the internal body parts in black and white shades. X-ray
is one of the oldest and most commonly used medical diagnostic tests. Chest X-ray is used to
diagnose chest-related diseases like pneumonia and other lung diseases [4], as it provides the
image of the thoracic cavity, consisting of the chest and spine bones along with the soft
organs including the lungs, blood vessels, and airways.
Deep ANNs have outperformed other conventional models on many essential
benchmarks. Thus, ANNs have generally proved to be the state-of-the-art technology across a
wide range of application areas, including NLP, speech recognition, image processing,
biological sciences, and other commercial as well as academic areas.
As seen in recent times, various parts of the world face a healthcare crisis in terms of both
the number of healthcare professionals needed and testing equipment. Considering the
present pandemic situation, there is a pertinent relationship between the detection of
COVID-19 cases and chest X-ray image analysis and classification.
CHAPTER 2
LITERATURE SURVEY
A literature survey, or review, in a project report is the section that presents the
analyses and research already carried out in the field of interest and the results already
published. The literature survey consists of searching the referred base papers for techniques
and optimized algorithms in order to arrive at a modification that leads to the best solution.
Deep learning has shown a dramatic increase in medical applications in general and
specifically in medical image-based diagnosis. Deep learning models performed prominently
in computer vision problems related to medical image analysis. The ANNs outperformed
other conventional models and methods of image analysis [7, 8]. Due to the very promising
results provided by CNNs in medical image analysis and classification, they are considered as
de facto standard in this domain [9, 10].
CNN has been used for a variety of classification tasks related to medical diagnosis such as
lung disease [10], detection of malarial parasite in images of thin blood smear [11], breast
cancer detection [12], wireless endoscopy images [13], interstitial lung disease [14], CAD-
based diagnosis in chest radiography [15], diagnosis of skin cancer by classification [16], and
automatic diagnosis of various chest diseases using chest X-ray image classification [17].
Since the emergence of COVID-19 in December 2019, numerous researchers
have been engaged in experimentation and research activities related to the diagnosis, treatment,
and management of COVID-19.
Researchers in [18] have reported the significance of the applicability of AI methods in image
analysis for the detection and management of COVID-19 cases. COVID-19 detection can be
done accurately using deep learning models’ analysis of pulmonary CT [18]. Researchers in
[19] have designed an open-source COVID-19 diagnosis system based on a deep CNN. In
this study, tailored deep CNN design has been reported for the detection of COVID-19
patients using X-ray images. Another significant study has reported on the X-ray dataset
comprising X-ray images belonging to common pneumonia patients, COVID-19 patients, and
people with no disease [20].
The study uses state-of-the-art CNN architectures for the automatic detection of patients with
COVID-19. Transfer learning has achieved a promising accuracy of 97.82% in COVID-19
detection.
The authors have reported the results of the study with an accuracy of 95.12%, sensitivity of
97.91%, and specificity of 91.87%. A review of the relevant and recent research work
on the design, development, and possible applicability of CNNs in COVID-19 detection
using medical images, particularly X-ray images, shows that the accuracy of the models was
affected by the very small number of available X-ray images of COVID-19 patients and the
poor quality of some images in the datasets.
This study is particularly focused on dataset pre-processing to fine-tune it, data augmentation,
and design of a CNN with extra layers to increase further the performance of the COVID-19
diagnosis using CNNs as described in subsequent sections.
CHAPTER 3
PROPOSED METHODOLOGY
3.2 OBJECTIVES
To collect X-ray images of COVID-19, non-COVID-19 and other respiratory disease patients.
To create a dataset from these X-ray images to train the machine learning model.
To ensure early diagnosis of COVID-19 cases and to identify the infection amongst
patients with a high risk of developing severe disease due to underlying conditions.
In this work, an automatic diagnostic system has been developed using CNN which uses
chest X-ray analysis results to diagnose whether a person is COVID-19-affected or normal.
Preliminary analysis of this study has shown promising results in terms of its accuracy and
other performance parameters to diagnose the disease in a cost-effective and time-efficient
manner. This study used CNN with extra layers to improve the COVID-19 X-ray image
classification accuracy. In neural networks, the CNN structure is specially designed to
process the two-dimensional image tasks although it can also be used in one- and three-
dimensional data. CNN is a type of DNN, inspired by the visual system of the human brain,
and is most commonly used in the analysis of visual imagery.
To train the CNN model, first, the dataset has been obtained from Kaggle [5]. Since the
dataset obtained for training the model was very small and imbalanced, it has been
extended using data
augmentation techniques to increase its size and also to make the model training feature-rich.
Image flipping and rotation at different angles have been used to generate more data. For
dataset balancing in terms of the proportion of images with different class labels, the dataset
has been further extended with some more image instances of the minority class. After data
augmentation and dataset balancing, the CNN model has been trained using a total of 800
images (400 COVID-19 and 400 normal), and then the model has been tested using a test
set.
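The flipping and rotation steps described above can be illustrated with a minimal pure-Python sketch. It treats an image as a list of pixel rows and rotates only in 90-degree steps. The function names and the tiny 2×3 sample image are illustrative assumptions, not the code used in this project, which may have applied arbitrary angles through an image library.

```python
# Minimal augmentation sketch: an image is a list of rows of pixel values.

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    # Reverse the row order, then turn the columns into rows.
    return [list(col) for col in zip(*image[::-1])]


def augment(image):
    """Return the original image plus one flipped and three rotated variants."""
    variants = [image, flip_horizontal(image)]
    rotated = image
    for _ in range(3):  # 90, 180 and 270 degrees
        rotated = rotate_90_clockwise(rotated)
        variants.append(rotated)
    return variants


# Hypothetical 2x3 grayscale image, used only for illustration.
sample = [[1, 2, 3],
          [4, 5, 6]]
augmented = augment(sample)
print(len(augmented), "images generated from one original")
```

Each original image yields five training images in this sketch, which is how augmentation enlarges a small dataset without collecting new X-rays.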
The CNN model performance evaluation has then been done using different performance
metrics. These metrics include accuracy, precision, sensitivity, specificity, ROC AUC, and
F1 score. Later, the proposed CNN model has also been tested using an independent dataset
obtained from the IEEE DataPort [6] for independent validation of the proposed CNN model.
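As a rough illustration of how most of the metrics named above follow from a confusion matrix, the sketch below computes accuracy, precision, sensitivity, specificity and F1 score from counts of true and false positives and negatives. The counts are hypothetical and are not results from this study. ROC AUC is left out because it needs the predicted scores rather than the counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)      # share of predicted positives that are correct
    sensitivity = tp / (tp + fn)    # recall: share of actual positives detected
    specificity = tn / (tn + fp)    # share of actual negatives detected
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for a 100-image test set (illustration only).
metrics = classification_metrics(tp=45, fp=5, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```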
(i) The CNN with extra convolutional layers (e.g., six layers have been used in the
CNN proposed in this study) performs best in COVID-19 diagnosis.
(ii) The CNN models require a sufficient number of images for efficient and more
accurate image classification.
(iii) The Data augmentation techniques are very effective to improve the CNN model
performance remarkably by generating more data from an existing limited size
dataset.
(iv) The Data augmentation is also effective in image classification as it gives the
ability of invariance to CNNs.
(v) The performance of the proposed CNN model has been shown to be statistically significant in
comparison with the performance of other ML models.
(vi) The CNN-based diagnosis using X-ray imaging can be very effective for medical
sector to handle the mass testing situations in pandemics like COVID-19.
CHAPTER 4
REQUIREMENT ANALYSIS
Generality: The detection model must be usable in several applications, so that it
helps users in different areas.
Usefulness: The model must allow the detection of various chest-related diseases.
Accuracy: The model must be accurate enough for the system to support the early
detection of COVID-19 cases.
Performance: The system must respond quickly so that radiologists can make a quick detection.
Reliability: The system is used for all age groups, so it should generate valid predictions,
as the user is totally dependent on the system.
Availability: The system should be available to the user at all times.
Maintainability: The model should be adaptable to new requirements.
Editors: Anaconda
Programming Language: Python
Web Framework: Flask
4.3.1.1 ANACONDA
Anaconda is an open-source distribution for Python and R. It is used for data science, machine
learning, deep learning, etc. With more than 300 libraries for data science available,
it is a convenient choice for any programmer working on data
science. Anaconda simplifies package management and deployment. Anaconda
comes with a wide variety of tools to easily collect data from various sources using various
machine learning and AI algorithms. It helps in getting an easily manageable environment
setup that can deploy any project with the click of a single button.
Package versions in Anaconda are managed by the package management system conda. This
package manager was spun out as a separate open-source package as it ended up being useful
on its own and for things other than Python. There is also a small, bootstrap version of
Anaconda called Miniconda, which includes only conda, Python, the packages they depend
on, and a small number of other packages.
The big difference between conda and the pip package manager is in how package
dependencies are managed, which is a significant challenge for Python data science and the
reason conda exists.
Before version 20.3, when pip installed a package, it automatically installed any dependent
Python packages without checking if these conflicted with previously installed packages. It
would install a package and any of its dependencies regardless of the state of the existing
installation.
Because of this, a user with a working installation of, for example, TensorFlow, could find
that it stopped working having used pip to install a different package that requires a different
version of the dependent numpy library than the one used by TensorFlow. In some cases, the
package would appear to work but produce different results in detail. While pip has since
implemented consistent dependency resolution, this difference accounts for a historical
differentiation of the conda package manager.
Open-source packages can be individually installed from the Anaconda repository, Anaconda
Cloud, or the user's own private repository or mirror, using the conda install command.
Anaconda, Inc. compiles and builds the packages available in the Anaconda repository itself,
and provides binaries for Windows 32/64-bit, Linux 64-bit and macOS 64-bit. Anything
available on PyPI may be installed into a conda environment using pip, and conda will keep
track of what it has installed itself and what pip has installed.
Custom packages can be made using the conda build command and can be shared with others
by uploading them to Anaconda Cloud, PyPI or other repositories. The default installation of
Anaconda2 includes Python 2.7 and Anaconda3 includes Python 3.7. However, it is possible
to create new environments that include any version of Python packaged with conda.
4.3.1.2 PYTHON
Often, programmers fall in love with Python because of the increased productivity it
provides. Since there is no compilation step, the edit-test-debug cycle is incredibly fast.
Debugging Python programs is easy: a bug or bad input will never cause a segmentation
fault. Instead, when the interpreter discovers an error, it raises an exception.
When the program doesn't catch the exception, the interpreter prints a stack trace. A source
level debugger allows inspection of local and global variables, evaluation of arbitrary
expressions, setting breakpoints, stepping through the code a line at a time, and so on. The
debugger is written in Python itself, testifying to Python's introspective power. On the other
hand, often the quickest way to debug a program is to add a few print statements to the
source: the fast edit-test-debug cycle makes this simple approach very effective.
Python combines remarkable power with very clear syntax. It has interfaces to many system
calls and libraries, as well as to various window systems, and is extensible in C or C++. It is
also usable as an extension language for applications that need a programmable interface.
Finally, Python is portable: it runs on many Unix variants including Linux and macOS, and
on Windows.
4.3.1.3 FLASK
WSGI
The Web Server Gateway Interface (WSGI) has been adopted
as a standard for Python web application development. WSGI is the specification of a
common interface between web servers and web applications.
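To make the WSGI interface concrete, here is a minimal sketch of a WSGI application using only the Python standard library. The application is a callable that receives the request environment and a `start_response` function and returns an iterable of bytes. The greeting text and the way the application is called directly, without a real server, are illustrative choices rather than part of this project.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI application: always returns a plain-text greeting."""
    body = b"Hello from a WSGI application"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


# Call the application the way a server would, but without opening a socket.
captured = {}


def start_response(status, headers, exc_info=None):
    """Record what the application reports instead of sending it to a client."""
    captured["status"] = status
    captured["headers"] = headers


environ = {}
setup_testing_defaults(environ)  # fill in a minimal, valid WSGI environment
response_body = b"".join(application(environ, start_response))
print(captured["status"], response_body.decode())
```

Any WSGI-compliant server can host such a callable, which is what allows frameworks like Flask to run on different servers unchanged.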
Werkzeug
Werkzeug is a WSGI toolkit that implements requests, response objects, and utility functions.
This enables a web framework to be built on it. The Flask framework uses Werkzeug as one of its
bases.
Jinja2
Jinja2 is a popular template engine for Python. A web template system combines a template
with a specific data source to render a dynamic web page.
This allows you to pass Python variables into HTML templates like this:
<html>
<head>
<title>{{title}}</title>
</head>
<body>
<h1>Hello {{username}}</h1>
</body>
</html>
Instead of an abstraction layer for database support, Flask supports extensions to add such
capabilities to the application.
Unlike the Django framework, Flask is very Pythonic. It’s easy to get started with Flask,
because it doesn’t have a huge learning curve.
On top of that it’s very explicit, which increases readability. To create the “Hello World” app,
you only need a few lines of code.
if __name__ == '__main__':
    app.run()
If you want to develop on your local computer, you can do so easily. Save this program
as server.py and run it with python server.py.
$ python server.py
* Serving Flask app "hello"
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
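For reference, a complete "Hello World" application of the kind described above might look like the following minimal sketch. It assumes Flask is installed. The route, the function name `hello` and the greeting text are illustrative. Instead of starting a server, the sketch uses Flask's built-in test client to call the route, so it can be run without blocking.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    """Return a plain greeting for the root URL."""
    return "Hello, World!"


# Exercise the route with Flask's test client instead of starting a server.
with app.test_client() as client:
    response = client.get("/")
    status_code = response.status_code
    body = response.get_data(as_text=True)

print(status_code, body)
```

To serve the application on a local computer, `app.run()` would be called under the usual `if __name__ == '__main__':` guard.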
CHAPTER 5
SYSTEM ARCHITECTURE
5.1 EXISTING SYSTEM
Molecular RT-PCR test and Antigen test
PCR tests, which are nucleic acid amplification tests (NAATs) that detect the nucleic acid of
the virus itself, are the most sensitive ones. To complete the testing process, they need to be
done in quite a sophisticated laboratory setting, which is why
the turnaround time for these tests can be several days. If there is an outbreak and
there are many samples, it can take several days longer than hoped for.
The antigen tests that currently exist on the market are called antigen
rapid diagnostic tests. They look for an antigen on the outer surface of the virus itself
and have been developed so that they can be done at the bedside or in the
field, so they do not need a sophisticated laboratory setting. They are
not as accurate as PCR tests, but they have very important value as one of the
tools to address the COVID-19 pandemic.
Diagnostic tests for SARS-CoV-2, none of which perfectly reflect viral carriage, fall into
two broad categories: antigen tests and real-time reverse transcription polymerase chain
reaction (RT-PCR) tests. While both tests are used as diagnostics, antigen tests detect the
presence of a specific viral antigen and are capable of returning results within 15
minutes, while RT-PCR tests amplify genomic sequences and therefore require longer
turnaround times. Substantial attention has been paid to the lower sensitivity of antigen
testing compared with that of RT-PCR testing.
1. One disadvantage of antigen tests is that they have lower accuracy for people who do
not have symptoms. If someone gets a positive result, an RT-PCR test is typically done
afterward to confirm the result.
2. A disadvantage of RT-PCR tests is that it takes time for the polymerase to make many copies of DNA;
typically, samples are collected at a clinic and then sent to a lab.
3. Antigen tests are more accurate in the first 5 days of symptoms than when testing people with no symptoms.
[Figure: block diagram of the system with the blocks Domain of the Patient, Dataset, Training Data, Test Data, CNN, and Machine Learning Model]
5.3 DATASET
We built a publicly available SARS-CoV-2 CT scan dataset containing 1252 CT scans that
are positive for SARS-CoV-2 infection (COVID-19) and 1230 CT scans of patients not
infected by SARS-CoV-2, 2482 CT scans in total. These data have been collected from real
patients in hospitals. The aim of this dataset is to encourage the research and development of
artificial intelligence methods which are able to identify whether a person is infected by
SARS-CoV-2 through the analysis of his/her CT scans. As a baseline result for this dataset, we used an
explainable deep learning (DNN) approach, with which we could achieve an F1 score of
97.31%, which is very promising.
5.4 PRE-PROCESSING
There are two main parts in a Convolutional Neural Network (CNN) architecture:
• A convolution tool that separates and identifies the various features of the image for analysis
in a process called Feature Extraction.
• A fully connected layer that utilizes the output from the convolution process and predicts the
class of the image based on the features extracted in previous stages.
Three types of layers make up the CNN: the convolution layer, the
pooling layer, and the fully connected (FC) layer. When these layers are stacked, a CNN
architecture is formed.
1. Convolution Layer
This layer is the first layer that is used to extract the various features from the input
images. In this layer, the mathematical operation of convolution is performed between the
input image and a filter of a particular size MxM. By sliding the filter over the input image,
the dot product is taken between the filter and the parts of the input image with respect to the
size of the filter (MxM).
The output is termed as the Feature map which gives us information about the image such as
the corners and edges. Later, this feature map is fed to other layers to learn several other
features of the input image.
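The sliding dot product described above can be written out as a short pure-Python sketch. It uses stride 1 and no padding, so an N×N image and an M×M filter give an (N−M+1)×(N−M+1) feature map. As in most deep learning frameworks, the filter is not flipped. The 4×4 image and the 2×2 vertical-edge filter are made-up values for illustration only.

```python
def convolve2d(image, kernel):
    """Slide an MxM kernel over the image and take the dot product at each position."""
    m = len(kernel)
    out_rows = len(image) - m + 1
    out_cols = len(image[0]) - m + 1
    feature_map = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Dot product between the kernel and the image patch at (i, j).
            total = 0
            for a in range(m):
                for b in range(m):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 image: bright left half, dark right half.
image = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [1, 1, 0, 0],
         [1, 1, 0, 0]]
# Hypothetical 2x2 filter that responds to a bright-to-dark vertical edge.
kernel = [[1, -1],
          [1, -1]]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The non-zero column in the middle of the resulting feature map marks the position of the vertical edge in the input.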
2. Pooling Layer
In most cases, a Convolution Layer is followed by a Pooling Layer. The primary aim of this
layer is to decrease the size of the convolved feature map to reduce the computational costs.
This is achieved by decreasing the connections between layers; the pooling layer operates
independently on each feature map. Depending upon the method used, there are several types of pooling
operations.
In Max Pooling, the largest element is taken from each section of the feature map. Average Pooling calculates
the average of the elements in a predefined-size image section. The total sum of the
elements in the predefined section is computed in Sum Pooling. The Pooling Layer usually
serves as a bridge between the Convolution Layer and the FC Layer.
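The three pooling operations can be compared side by side with a small pure-Python sketch. It assumes non-overlapping square windows (stride equal to the window size). The function name `pool2d` and the 4×4 feature map are illustrative.

```python
def pool2d(feature_map, size, mode="max"):
    """Apply max, average or sum pooling over non-overlapping size x size windows."""
    reducers = {
        "max": max,
        "average": lambda values: sum(values) / len(values),
        "sum": sum,
    }
    reduce_window = reducers[mode]
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            # Collect the elements of the current window.
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(reduce_window(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map.
fmap = [[1, 3, 2, 0],
        [5, 7, 4, 6],
        [9, 1, 0, 2],
        [3, 5, 8, 4]]

print(pool2d(fmap, 2, "max"))      # [[7, 6], [9, 8]]
print(pool2d(fmap, 2, "average"))  # [[4.0, 3.0], [4.5, 3.5]]
print(pool2d(fmap, 2, "sum"))      # [[16, 12], [18, 14]]
```

In each case the 4×4 map shrinks to 2×2, which is the reduction in size that lowers the computational cost of the following layers.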
[Figure: processing pipeline: Image Acquisition, Image Pre-processing, Image Segmentation, Feature Extraction, Detection of COVID-19]
5.4.3 Resnet
In 1998, the LeNet-5 architecture was introduced in a research paper titled “Gradient-Based
Learning Applied to Document Recognition” by Yann LeCun, Leon Bottou, Yoshua Bengio,
and Patrick Haffner. It is one of the earliest and most basic CNN architectures.
It consists of 7 layers:
The first layer takes an input image with dimensions of 32×32. It is convolved with 6
filters of size 5×5, resulting in dimensions of 28×28×6.
The second layer is a pooling operation with filter size 2×2 and stride of 2. Hence
the resulting image dimensions will be 14×14×6.
Similarly, the third layer involves a convolution operation with 16 filters of
size 5×5, followed by a fourth pooling layer with the same filter size of 2×2 and stride
of 2. Thus, the resulting image dimensions will be reduced to 5×5×16.
Once the image dimensions are reduced, the fifth layer is a fully connected convolution
layer with 120 filters, each of size
5×5. Each of the 120 units in this layer is connected to the 400
(5×5×16) units from the previous layer.
The sixth layer is also a fully connected layer, with 84 units.
The final seventh layer is a softmax output layer with ‘n’ possible classes,
depending upon the number of classes in the dataset.
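The layer dimensions listed above can be checked with the standard output-size formula for a convolution or pooling layer without padding, (input − filter) / stride + 1. The short sketch below applies it layer by layer. The helper name `output_size` is an illustrative choice.

```python
def output_size(input_size, filter_size, stride=1):
    """Spatial output size of a convolution or pooling layer without padding."""
    return (input_size - filter_size) // stride + 1


input_size = 32
conv1 = output_size(input_size, 5)        # 6 filters of 5x5 -> 28x28x6
pool1 = output_size(conv1, 2, stride=2)   # 2x2 pooling, stride 2 -> 14x14x6
conv2 = output_size(pool1, 5)             # 16 filters of 5x5 -> 10x10x16
pool2 = output_size(conv2, 2, stride=2)   # 2x2 pooling, stride 2 -> 5x5x16
flattened = pool2 * pool2 * 16            # units feeding the fully connected part

print(conv1, pool1, conv2, pool2, flattened)
```

The intermediate 10×10×16 output of the second convolution is not stated in the text but follows from the same formula, and the final count of 400 units matches the 5×5×16 figure given above.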
CHAPTER 6
SYSTEM DESIGN
[Figure: system design flow: Collection of Dataset (D1, D2, D3), Data Pre-Processing, Training Dataset, Test Data, Convolution Neural Network, Machine Learning Model]
The workflow of this study begins with the collection of a primary dataset containing two image
classes: one class consists of chest X-rays of COVID-19-confirmed cases and the other
class consists of images of normal people without the disease. In the next phase of the
study, the concerned medical professionals analysed the dataset and removed some of the X-ray
images which were not clear in terms of quality and diagnostic parameters. Hence, the
resulting dataset was very clean, as each X-ray image was of good quality as well as clear in
terms of significant diagnostic parameters according to their expertise. In the third phase, the
dataset was augmented using standard augmentation techniques to increase its size. The
resulting dataset was used to train the model in the next phase. After training, the model was
tested for its performance in disease detection.
Dataset Pre-processing:
The output given by the sigmoid activation function lies between 0 and 1. The loss function
measures the error between the predicted class and the actual class. The “Adam” optimizer has
been used, which adjusts the weights and the learning rate to reduce the loss of the learning model. The
model parameter values are given in Table 3, and the model architecture is given above.
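To illustrate how a sigmoid output and a classification error fit together, the sketch below computes sigmoid probabilities and a binary cross-entropy loss in plain Python. Binary cross-entropy is assumed here because it is the loss commonly paired with a sigmoid output for two classes. The text above does not name the loss used in this project, and the input values and labels are made up.

```python
import math


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average error between true labels (0 or 1) and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical raw model outputs and labels (1 = COVID-19, 0 = normal).
raw_outputs = [2.0, -1.0, 0.0]
labels = [1, 0, 1]

probabilities = [sigmoid(z) for z in raw_outputs]
loss = binary_cross_entropy(labels, probabilities)
print([round(p, 3) for p in probabilities], round(loss, 3))
```

An optimizer such as Adam would then adjust the model weights in the direction that lowers this loss.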
During the initial experiments, the CNN was used with different configurations in terms
of the number of convolution layers in the model. The decision on how many
convolution layers to use in the model was made using an incremental approach. First, the
CNN was tested using only one convolution layer and the results were analysed. Then, the
CNN was built with two layers and the results were analysed, and so on. The approach was
continued until the results provided by the model were accurate and effective. The final model,
which was judged feasible according to its results, consisted of six convolution layers. The
results of each increment of the model have been reported in the Results section.
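The incremental approach can be sketched as a simple loop that adds one convolution layer at a time and stops when the score no longer improves by a meaningful margin. Everything in this sketch is an assumption made for illustration: the function name `choose_num_layers`, the stopping threshold `min_gain`, and especially the toy scores, which are invented placeholders and not results from this study. In practice `evaluate` would train and validate a CNN with the given number of layers.

```python
def choose_num_layers(evaluate, max_layers, min_gain=0.005):
    """Add one layer at a time; stop when the score gain falls below min_gain."""
    best_layers = 1
    best_score = evaluate(1)
    for n in range(2, max_layers + 1):
        score = evaluate(n)
        if score - best_score < min_gain:
            break  # performance has stabilised; keep the previous depth
        best_layers, best_score = n, score
    return best_layers, best_score


# Invented placeholder scores standing in for "train and validate with n layers".
toy_scores = {1: 0.70, 2: 0.78, 3: 0.84, 4: 0.89,
              5: 0.93, 6: 0.96, 7: 0.961, 8: 0.95}

layers, score = choose_num_layers(toy_scores.get, max_layers=8)
print(layers, score)
```

With these placeholder scores the loop stops at the depth after which the gain becomes negligible, which mirrors the stopping rule described in the text.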
CHAPTER 7
INTERPRETATION OF RESULTS
• Providing a rapid result for COVID-19 infection by processing the X-ray of the person.
• Differentiating between patients infected with COVID-19, normal persons and persons with
pneumonia with the help of Machine Learning.
CONCLUSION
This study has been conducted to demonstrate the effective and accurate diagnosis of
COVID-19 using CNN which was trained on chest X-ray image datasets. The model training
was performed incrementally with different datasets to attain the maximum accuracy and
performance. The primary dataset was very limited in size and also imbalanced in terms of
class distribution.
These two issues with the primary dataset affected the performance of the models very badly.
To overcome these issues, the dataset was pre-processed using different techniques, including
dataset balancing technique, manual analysis of X-ray images by concerned medical experts,
and data augmentation techniques. To balance the dataset for model training and also to test
its performance parameters, an ample number of chest X-rays were collected from different
available sources. After training and testing the CNN model on the fully processed dataset,
the performance results have been reported.
In addition, to further test the model performance, particularly the accuracy, the proposed
CNN model has been tested using an independent dataset obtained from IEEE DataPort as an
independent validation and real-world test. As reported in the results, in both testing
scenarios the proposed CNN model has shown highly promising results.
Since this study uses an incremental approach in training the model using different sizes and
types of datasets, the approach confirmed the fact that CNN models require an ample amount
of image data for the efficient and more-accurate classification. The data augmentation
techniques are very effective to significantly improve the CNN model performance by
generating more data from an existing limited-size dataset and also by giving the ability of
invariance to the CNN.
The proposed CNN model’s number of convolution layers was also decided in an incremental
approach; that is, in the first increment, only one convolution layer was used and, then, on the
basis of model performance metrics, one layer in each increment was increased till it reaches
a stable and efficient stage in terms of its performance.
The final version of the CNN consisted of six convolution layers. A comparative analysis has
also been done to further test the scope of the proposed CNN model by comparing its performance
with that of other machine learning models.
The results prove that the proposed CNN has outperformed all the models, particularly when
each model was tested on the independent validation dataset. Considering the significant
effect of data augmentation techniques on model performances, the authors are currently
working on the application of other state-of-the-art data augmentation algorithms and
techniques. In the future, the results obtained from the study concerned with the applicability
of these modern data augmentation techniques in different application domains will be
published.
REFERENCES
[1] D. Cucinotta and M. Vanelli, “WHO declares COVID-19 a pandemic,” Acta Biomedica:
Atenei Parmensis, vol. 91, pp. 157–160, 2020.
[2] F. Rustam, A. A. Reshi, A. Mehmood et al., “COVID-19 future forecasting using
supervised machine learning models,” IEEE Access, vol. 8, pp. 101489–101499, 2020.
[3] D. J. Cennimo, “Coronavirus disease 2019 (COVID-19) clinical presentation,” 2020,
https://emedicine.medscape.com/article/2500114-clinical#b2. Online.
[4] X-ray Radiography, https://www.radiologyinfo.org/en/info.cfm?pg=chestrad#overview.
[5] J. P. Cohen, “Github Covid19 X-ray dataset,” 2020,
https://github.com/ieee8023/covid-chestxray-dataset. Online.
[6] Z. H. Chen, “Mask-RCNN detection of COVID-19 pneumonia symptoms by employing
stacked autoencoders in deep unsupervised learning on low-dose high resolution CT,” IEEE
Dataport, 2020.
[7] A. S. Lundervold and A. Lundervold, “An overview of deep learning in medical imaging
focusing on MRI,” Zeitschrift für Medizinische Physik, vol. 29, no. 2, pp. 102–127, 2019.
[8] M. Ahmad, “Ground truth labeling and samples selection for hyperspectral image
classification,” Optik, vol. 230, Article ID 166267, 2021.
[9] B. Kayalibay, G. Jensen, and P. van der Smagt, “CNN-based segmentation of medical
imaging data,” 2017, http://arxiv.org/abs/1701.03056.
[10] Q. Li, W. Cai, X. Wang, Y. Zhou, D. D. Feng, and M. Chen, “Medical image
classification with convolutional neural network,” in Proceedings of the 2014 13th
International Conference on Control Automation Robotics & Vision (ICARCV), pp. 844–848,
Singapore, December 2014.
[11] M. Umer, S. Sadiq, M. Ahmad, S. Ullah, G. S. Choi, and A. Mehmood, “A novel
stacked CNN for malaria parasite detection in Thin blood smear images,” IEEE Access, vol.
8, pp. 93782–93792, 2020.
[12] R. Rouhi, M. Jafari, S. Kasaei, and P. Keshavarzian, “Benign and malignant breast
tumors classification based on region growing and CNN segmentation,” Expert Systems with
Applications, vol. 42, no. 3, pp. 990–1002, 2015.
[13] M. Sharif, M. Attique Khan, M. Rashid, M. Yasmin, F. Afza, and U. J. Tanik, “Deep
CNN and geometric features-based gastrointestinal tract diseases detection and classification
from wireless capsule endoscopy images,” Journal of Experimental & Theoretical Artificial
Intelligence, pp. 1–23, 2019.