
Received 17 August 2022, accepted 9 October 2022, date of publication 17 October 2022, date of current version 27 October 2022.

Digital Object Identifier 10.1109/ACCESS.2022.3215264

Deep Learning Models Performance Evaluations for Remote Sensed Image Classification

ABEBAW ALEM 1,2 AND SHAILENDER KUMAR 2, (Member, IEEE)
1 IT Department, Debre Tabor University, Debre Tabor 251272, Ethiopia
2 Department of Computer Science and Engineering, Delhi Technological University, Delhi 110042, India

Corresponding author: Abebaw Alem ([email protected])


This work of Abebaw Alem was supported by the Scholarship from the Ethiopian Ministry of Education (MoE).

ABSTRACT Deep learning-based land cover and land use (LCLU) classification systems are a significant aspiration for the remote sensing community. Remote sensing images have many properties that need to be analyzed, and analyzing and interpreting them is difficult because of the nature of the images, the capability of the sensor technology, and other determinant variables such as season and weather conditions. Solving this problem with the support of deep learning systems is essential for environmental monitoring, agricultural decision-making, and urban planning. Therefore, deep learning approaches are proposed to analyze and interpret remote sensing images quickly and classify LCLU. Deep learning methods can be designed from scratch or built on pre-trained networks; however, there are few comparisons between the two. Thus, we propose evaluating and comparing three deep learning models for LCLU classification of remote sensed images: a convolutional neural network feature extractor (CNN-FE) developed from scratch, transfer learning (TL), and fine-tuning. Using the CNN-FE, TL, and fine-tuning models as examples, this paper compares and analyzes deep learning algorithms for remote sensed image classification. After developing and training each model on the UCM dataset, we evaluated and compared their performances using the metrics accuracy, precision, recall, f1-score, and confusion matrix. The proposed deep learning algorithms can adapt to and learn the features of remote sensing images, and the TL and fine-tuning classification performances are significantly improved. Considering the time required to train the models, this paper found that the fine-tuned deep learning model achieved the best accuracy results on the UCM dataset.

INDEX TERMS Convolutional neural network, deep learning, fine-tuning, performance comparisons,
remote sensed image classification, transfer learning.

The associate editor coordinating the review of this manuscript and approving it for publication was Giacomo Fiumara.

I. INTRODUCTION
Environment monitoring, agricultural decision-making, and urban planning in today's fast-paced world all rely primarily on LCLU classification learning systems. LCLU classification using remote sensed (RS) images is a critical issue in managing natural resources and human-made activities affecting natural phenomena in the earth's environment. RS image classification is currently a major focus for the RS community within computer vision and image processing research. The world's population is increasing dramatically, and with it the demand for land use; a learning system could be applied in this domain to utilize the land properly.

Thus, LCLU classification is a recent, hot, and challenging task in RS [1], [2], [3], [4]. With advanced sensor technologies, RS images are satellite data collected from the earth's environment. The deep learning (DL) method could be applied to solve this challenge.

DL is a recent, specialized machine learning (ML) approach that can automatically extract image features from large datasets with admirable performance improvements.


Thus, DL is a recently focused research area applied in various domains, such as classification [1], [5], [6], [7], [8], recognition [9], and object detection [10]. It is also potentially challenging in many other domains [11].

The DL techniques proposed in this paper are convolutional neural networks (CNNs), transfer learning (TL), and fine-tuning, which make the classification task more attractive. The CNN is one of computer vision's most common DL methods [18] for feature extraction and LCLU modelling using RS images. A CNN is a feedforward and backward neural network consisting of convolutional calculations and deep structures. Therefore, CNN models have powerful feature extraction capabilities for improving classification performance on RS images [12].

To train classification models from scratch, DL approaches such as CNNs require a large amount of data and a tremendous amount of processing power [13]. However, the classification problem can be solved with less training time and smaller training samples using TL [14] and fine-tuning [4], [15]. Thus, the main problem of DL models is that training a deep CNN from scratch requires a large dataset and takes longer to train. To solve this kind of DL problem, we adopted the TL and fine-tuning approaches and compared how well they worked against the convolutional neural network feature extractor (CNN-FE) model, which was built from scratch.

TL is another recent DL technique used to train a DL model by reusing pre-trained networks. TL and fine-tuning are suited to smaller datasets and can be designed from pre-trained networks by reworking their top fully connected layers to reuse the learned features. The training time for TL and fine-tuning can be much less than that of deep CNN models trained from scratch. Therefore, TL can solve the problems of building DL models from scratch by training the models in less time on smaller datasets while freezing the already-trained network.

TL adopts the features of the pre-trained network to train the new models. Moreover, fine-tuning is a DL technique used to train the model by unfreezing the pre-trained network; this technique is vital to increasing the performance of the model. TL adopts the properties of the pre-trained layers, excluding the last fully connected layer, i.e., the dense layer is replaced by our classifier with 21 neurons and a softmax activation function.

This paper designed and evaluated the DL models CNN, TL, and fine-tuning. The CNN-FE has been developed from scratch and has four CNN blocks. Using Keras applications, the deep CNN-based TL and fine-tuning models have been developed on the pre-trained model EfficientNetB7 [16].

To date, very few studies have compared the capabilities of DL models developed from scratch with those that reuse a pre-trained network. For instance, [17] applied the TL and fine-tuning methods on the ResNet50 pre-trained network and compared their performances with other pre-trained networks for scene image classification. However, the evaluation of models developed from scratch and their comparison with models built on pre-trained networks have not been widely researched. For designing the TL and fine-tuning models, we preferred a recent pre-trained network, EfficientNetB7, which was trained on the large "ImageNet" dataset. Recently, [16] achieved 84.4% top-1 accuracy on "ImageNet" with the state-of-the-art EfficientNetB7. According to [16], a series of eight scaled-up EfficientNet pre-trained models, EfficientNetB0 through EfficientNetB7, were designed on the large "ImageNet" dataset, with the performance of each successive version improving. According to [18], who applied EfficientNetB3, the larger EfficientNet models achieve better accuracy than the smaller ones. Thus, we proposed the EfficientNetB7 pre-trained network to design TL and fine-tuning models for LCLU classification using RS images, to evaluate their performances, and to compare them with the CNN-FE developed from scratch. We selected the UCM dataset to assess and compare the DL models.

Therefore, this paper aims to design the DL models and evaluate their performance with various performance measurement metrics. Our main contributions are as follows.
• We developed the CNN model from scratch and compared its performance with the deep TL and fine-tuning models for LCLU classification using the RS UCM dataset.
• We applied the recent advanced EfficientNetB7 pre-trained network to design TL and fine-tuning DL models for LCLU classification in RS images.
• We evaluated the models, compared their performance using different measurement metrics, and concluded that the fine-tuning model performed better with less training time.

II. MATERIALS AND PROPOSED METHODS
A. DATASETS AND TOOLS
We used the publicly available University of California Merced (UCM) dataset for modelling the CNN, the CNN-based TL, and fine-tuning. The UCM dataset is an LCLU dataset collected from the earth, labeled manually, and introduced by [19] at the University of California Merced. It contains twenty-one classes, and each class includes 100 images with a resolution of 256 × 256 pixels and a spatial resolution of about 30 centimeters per pixel. However, the UCM dataset is inconsistent, as about 44 images have different pixel shapes. The dataset is available at http://weegee.vision.ucmerced.edu/datasets/landuse.html.

Python is the high-level programming language used for model development; it is versatile, user-friendly, and offers many libraries for DL and ML models. TensorFlow and Keras are the DL tools used with Python in this work.
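To make the dataset setup concrete, the sketch below shows one way the 21-class UCM images could be loaded in Keras; the directory path, batch size, seed, and the image-format conversion are illustrative assumptions rather than the paper's exact pipeline.

```python
# A minimal loading sketch, assuming the UCM archive has been extracted so
# that each of the 21 LCLU classes sits in its own sub-folder and the images
# are in a format Keras can decode (the original UCM .tif files would first
# need converting to PNG/JPEG). Path, batch size, and seed are assumptions.
import tensorflow as tf

IMG_SIZE = (256, 256)     # UCM image resolution stated above
BATCH_SIZE = 32           # assumed batch size

ucm_ds = tf.keras.utils.image_dataset_from_directory(
    "UCMerced_LandUse/Images",     # hypothetical local path to the dataset
    image_size=IMG_SIZE,           # inconsistently sized images are resized on load
    batch_size=BATCH_SIZE,
    label_mode="categorical",      # one-hot labels for categorical cross-entropy
    shuffle=True,
    seed=42,
)
print(ucm_ds.class_names)          # the 21 LCLU class labels
```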


B. PROPOSED DL METHODS
Previous works have studied DL methods for classification problems on various RS image datasets. However, evaluating and comparing DL models developed from scratch against those built on pre-trained networks has not been investigated widely. Further investigation is still needed to design and assess current DL techniques for LCLU classification using RS datasets.

Thus, to evaluate and compare the performances of different DL models applied to LCLU classification problems on the UCM RS dataset, we designed the CNN model from scratch and the TL and fine-tuning models on the EfficientNet pre-trained network. EfficientNet was trained on the large "ImageNet" dataset; ImageNet is the most significant benchmark dataset, introduced by [20], for designing DL models.

The DL hyperparameters influence CNN performance [21]. For instance, according to [22], different dropout values produce different performance results. We also showed that a dropout value of 0.25 generated an accuracy of 84.76%, which differs from our previous work, where a dropout value of 0.50 produced an accuracy of 89.76%. Therefore, considering their effects, we set the same hyperparameters for all three DL models on the given dataset to evaluate the models' performances, as shown in Table 1.

TABLE 1. The DL hyperparameter settings for training the datasets.

1) THE CONVOLUTIONAL NEURAL NETWORK (CNN) ALGORITHM
The CNN algorithm is the most critical DL technique that can extract and automatically learn features from the data. From the input images, features are extracted and the pixel weights are learned as new (usually reduced) values. The CNN method consists of several series of connected layers. These layers share weights throughout the process, i.e., from the first layer to the final classifier layer, as depicted in Figure 1. This process creates the feature maps for the model's layers as well as the class prediction at the output layer. The feature map of the model is built by pixel-wise multiplication of the input image pixels and the provided weight (kernel) pixels with learnable parameters [23], [24].

FIGURE 1. Layers of the CNN-FE model (Conv2D, Batch Normalization, MaxPooling2D, Flatten, Dense, and Dropout) with the input sample images and the output classes. The convolution operation uses a filter size of 3 × 3.

CNNs are capable of spatial feature representation for RS image classification using the convolution technique at the pixel level [25]. The convolution process updates weights with each layer's non-linear activation function. The input data types and the weight calculations in the convolution method make CNNs different from other conventional ML approaches [26].
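As a concrete illustration of this description, the following Keras sketch builds a CNN-FE-style network with four Conv2D blocks (3 × 3 filters, relu, batch normalization, max pooling) followed by Flatten, Dense, Dropout, and a 21-way softmax classifier, as listed in the Figure 1 caption. The filter counts, dense width, dropout rate, and input size are assumptions for illustration, not the paper's exact configuration.

```python
# A minimal CNN-FE-style sketch under assumed layer sizes.
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(256, 256, 3))       # assumed input size
x = inputs
for filters in (32, 64, 128, 128):                  # assumed filter counts per block
    x = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D((2, 2))(x)               # no learnable parameters
x = layers.Flatten()(x)                              # no learnable parameters
x = layers.Dense(128, activation="relu")(x)          # assumed dense width
x = layers.Dropout(0.25)(x)                          # assumed dropout rate
outputs = layers.Dense(21, activation="softmax")(x)  # one neuron per UCM class
model = tf.keras.Model(inputs, outputs, name="cnn_fe")

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()   # reports the per-layer parameter counts discussed below
```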
In DL model training, relu and softmax non-linear activation functions are the most relevant functions for updating the weights in the convolution process. We used relu in every convolutional layer to activate the weights in each convolution and softmax at the output layer, since softmax is reliable for multiclass classification problems. The softmax function is a feature classifier that produces a probability score for each class.

In the convolution process, the number of parameters (params) can be calculated using (1) and (2). The total number of model parameters is the sum of the results computed for the Conv2D and dense layers. The CNN-FE model was built with four Conv2D layers, whose parameter counts all follow (1), and two dense layers, which follow (2); the calculation formula for dense parameters differs from that for Conv2D, as given in (2). The added 1 represents the bias associated with each filter.

C2DP# = #OC × (#IC × FH × FW + 1)    (1)
DP# = #OC × (#IC + 1)    (2)

where C2DP# is the number of convolutional parameters, #OC the number of output channels, DP# the number of dense parameters, #IC the number of input channels, FH the filter height, and FW the filter width.

The total parameter count follows from (1) and (2). However, the number of parameters for all MaxPooling2D and Flatten layers is zero because these layers learn no weights (or filters) in the built model. As a result, 1.68 million parameters were found and learned in the CNN-FE model, while 18.88 million parameters were found and learned in the TL and fine-tuning models.
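As a worked check of (1) and (2), the helper functions below compute the two counts for illustrative layer sizes; the channel and unit numbers are examples, not the actual dimensions of the CNN-FE layers.

```python
# Worked example of equations (1) and (2) with illustrative layer sizes.
def conv2d_params(in_channels, out_channels, fh, fw):
    # Eq. (1): each of the #OC filters has #IC*FH*FW weights plus one bias.
    return out_channels * (in_channels * fh * fw + 1)

def dense_params(in_units, out_units):
    # Eq. (2): each output unit has #IC weights plus one bias.
    return out_units * (in_units + 1)

# e.g., a 3x3 Conv2D mapping 32 channels to 64 channels:
print(conv2d_params(32, 64, 3, 3))   # 64 * (32*3*3 + 1) = 18496
# e.g., a dense layer mapping 256 inputs to the 21 output classes:
print(dense_params(256, 21))         # 21 * (256 + 1) = 5397
```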
2) THE TRANSFER LEARNING (TL) METHOD
TL is a method of training a DL model by replacing the input layer with an image embedding: EfficientNet transfers the knowledge learned from the much larger "ImageNet" dataset to our classification problem. The TL model has been trained by making the EfficientNet layers trained on ImageNet non-trainable (pre_trained_model.trainable = False). We trained only the last flatten (1D vector form) layer and two dense layers, together with the relu and softmax activation functions and dropout, on the 21 classes of the LCLU RS UCM dataset.

Therefore, a classification head with dense layers can be appended to handle our new classification problem. TL is an efficient, reliable DL technique used in various domains, especially for the image classification problem in this paper. Recently, TL has been applied to LCLU classification in RS images [27], [28]. TL can train DL models in a short amount of time with improved results [29], [30]. However, deep TL is used for limited training samples, while a deep CNN from scratch is used for large training samples. In this paper, we used the TL model to compare how well it works with the other DL techniques for classifying LCLU in RS images.
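A minimal Keras sketch of this TL setup follows, assuming EfficientNetB7 with ImageNet weights is frozen and only a new flatten/dense head is trained for the 21 UCM classes; the input size, dense width, and dropout rate are illustrative assumptions.

```python
# TL sketch: frozen EfficientNetB7 backbone plus a small trainable head.
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.applications import EfficientNetB7

base = EfficientNetB7(include_top=False,            # drop the ImageNet classifier
                      weights="imagenet",
                      input_shape=(256, 256, 3))    # assumed input size
base.trainable = False                              # pre_trained_model.trainable = False

inputs = tf.keras.Input(shape=(256, 256, 3))
x = base(inputs, training=False)                    # keep frozen batch-norm statistics
x = layers.Flatten()(x)                             # 1D vector form of the features
x = layers.Dense(256, activation="relu")(x)         # assumed dense width
x = layers.Dropout(0.25)(x)                         # assumed dropout rate
outputs = layers.Dense(21, activation="softmax")(x) # one neuron per UCM class
tl_model = tf.keras.Model(inputs, outputs, name="tl_efficientnetb7")

tl_model.compile(optimizer="adam",
                 loss="categorical_crossentropy",
                 metrics=["accuracy"])
```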


3) THE FINE-TUNING ON EFFICIENTNET
Fine-tuning is a DL technique used to train a model by allowing the EfficientNet layers pre-trained on the large ImageNet dataset to be trainable and to adapt (pre_trained_model.trainable = True). EfficientNet is a recent, advanced CNN-based network that can be applied to classification tasks on ImageNet. To obtain improved performance, EfficientNet has been fine-tuned, and the final fully connected layer is treated as the output classifier layer, as in TL, except that the layers are allowed to be trained. As stated by [4] and [15], fine-tuning a pre-trained network is the optimal solution for a limited number of training samples. The EfficientNet pre-trained network was introduced by [16] for rethinking model scaling for CNNs. We chose the EfficientNet pre-trained network because it is a new, advanced network that has not yet been applied to the LCLU classification problem.

The fine-tuning was trained on the last three fully connected layers, using the EfficientNet network pre-trained on ImageNet. These final fully connected layers include a flatten layer that transforms the input into vector form, two dense layers, dropout, and the relu and softmax activation functions. Accordingly, the pre-trained weights are used as the initial weights for our fine-tuned neural network. Fine-tuning is used to compare the results of the fully connected layers and the convolutional layers. Thus, we proposed the fine-tuning technique to compare its performance with the convolutional layer-based CNN-FE model and the fully connected layer-based TL model.
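A minimal sketch of the fine-tuning variant is shown below, assuming the same backbone and head as in the TL sketch but with the pre-trained layers set trainable; the layer sizes and the reduced learning rate are illustrative assumptions rather than the paper's exact settings.

```python
# Fine-tuning sketch: the EfficientNetB7 weights are the starting point and
# are updated during training (pre_trained_model.trainable = True).
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.applications import EfficientNetB7

base = EfficientNetB7(include_top=False, weights="imagenet",
                      input_shape=(256, 256, 3))    # assumed input size
base.trainable = True                               # unfreeze the pre-trained layers

inputs = tf.keras.Input(shape=(256, 256, 3))
x = base(inputs)
x = layers.Flatten()(x)
x = layers.Dense(256, activation="relu")(x)         # assumed dense width
x = layers.Dropout(0.25)(x)                         # assumed dropout rate
outputs = layers.Dense(21, activation="softmax")(x)
ft_model = tf.keras.Model(inputs, outputs, name="finetuned_efficientnetb7")

# A smaller learning rate is a common (assumed) choice when all layers train.
ft_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                 loss="categorical_crossentropy",
                 metrics=["accuracy"])
```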
III. EXPERIMENTAL RESULTS AND DISCUSSIONS
Experiments were done on a laptop computer with an Intel Core i3-4000M CPU running at 2.40 GHz and 4 GB of RAM, connected to Google Colaboratory with its Tesla K80 GPU. Keras and TensorFlow, open-source DL software packages, were used for this experiment in Python. In addition, we used the scikit-learn statistical package for computing the performance measurement metrics.

A. EXPERIMENTAL SETTING AND RESULT
We used the UCM dataset in the experiments to design and evaluate the DL models for LCLU classification problems. We split the dataset into train, validation, and test samples at 60%, 20%, and 20%, respectively, and set the DL hyperparameters as indicated in Table 1. Then, we trained the models, validated them with the validation dataset during training, and evaluated their performances with the test dataset.

After the experimental parameters were set, we evaluated the models during and after the experiments with the validation and test datasets, respectively. We evaluated the models using accuracy, precision, recall, f1-score, and confusion matrix (CM) metrics. The CM assesses class performance, showing whether samples are classified correctly or incorrectly at the row-column intersections. We used the categorical cross-entropy loss function to measure the errors, in addition to the accuracy. The training and validation losses are expected to decrease as the epochs increase, as shown in Figures 2, 3, and 4 (on the right).

Therefore, we evaluated the models with 420 test (support) images, as shown in Tables 2, 3, and 4, using the UCM dataset. The UCM is an imbalanced RS dataset, and the accuracy metric, the percentage of correctly classified images, may not be suitable for an imbalanced dataset. Thus, each class's performance is evaluated using the precision, recall, f1-score, and CM metrics in addition to the accuracy metric. The f1-score is the harmonic mean of precision and recall, and it summarizes the performance of each class and the average performance of the DL models built. If both precision and recall achieve the best result for a category, then the f1-score is also the best for that class; whereas if either precision or recall is 0, then the f1-score is 0, meaning the model predicts nothing for that class.
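A minimal evaluation sketch along these lines is given below, assuming a trained Keras model `model` and a batched test set `test_ds` with one-hot labels (both names are placeholders); it produces the same kinds of metrics reported in Tables 2 through 5 using scikit-learn.

```python
# Evaluation sketch: collect test predictions, then report accuracy,
# per-class precision/recall/f1-score, and the confusion matrix.
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

y_true, y_pred = [], []
for images, labels in test_ds:                     # iterate over the test batches
    probs = model.predict(images, verbose=0)       # softmax scores per class
    y_pred.extend(np.argmax(probs, axis=1))        # predicted class indices
    y_true.extend(np.argmax(labels.numpy(), axis=1))

print(classification_report(y_true, y_pred, digits=2))  # precision, recall, f1, support
print(confusion_matrix(y_true, y_pred))                  # rows: true, cols: predicted
```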


FIGURE 2. The training and validation accuracies and losses in the CNN-FE model.

FIGURE 3. The training and validation accuracies and losses in the TL model.

FIGURE 4. The training and validation accuracies and losses in the fine-tuning DL model.

Accordingly, the categories that scored best (100%) on the f1-score metric are agricultural and chaparral in CNN-FE (Table 2); chaparral, parkinglot, and storagetanks in the TL model (Table 3); and agricultural, airplane, chaparral, freeway, and runway in the fine-tuning model (Table 4), respectively.


TABLE 2. CNN-FE classification performances in precision, recall, and F1-score on 420 support images.

TABLE 3. TL classification performance in precision, recall, and F1-score on 420 support images.

TABLE 4. Fine-tuning classification performance in precision, recall, and F1-score on 420 support images.

The accuracy of the DL models is also shown graphically in terms of accuracy and loss curves in Figures 2, 3, and 4 for the CNN-FE, TL, and fine-tuning models, respectively. The training accuracies (blue curves) increase smoothly, while the validation accuracies (red curves) fluctuate somewhat as they increase in all models, especially in fine-tuning, as depicted in Figures 2, 3, and 4 (on the left). We used the cross-entropy loss function to reduce the errors in model performance. As shown in Figures 2, 3, and 4 (on the right), the training losses (blue curves) decrease smoothly, while the validation losses (red curves) decrease unevenly as the models' errors are reduced.

In addition to precision, recall, and f1-score, we used the CM to evaluate class performance in each DL model. As with the f1-score, better class performance is observed for most classes in the CM metric. The CM measures class performance, showing whether each sample is classified correctly or incorrectly. The CM places each class label in the rows (true labels) and columns (predicted labels), as depicted in Figures 5 through 7. The scores on the diagonal intersections correspond to correctly classified samples, whereas the results in the other row-column cells are misclassified predictions.

For evaluating the models on the test set images, we used the argmax function to predict the class with the maximum probability score. For instance, our classification problem has twenty-one possible classes in the UCM dataset. If the output probabilities are [0.0, 0.0, 0.0, 0.0, 0.05, 0.0, 0.55, 0.0, 0.0, 0.0, 0.05, 0.0, 0.30, 0.05, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], the argmax (maximum-argument) probability is 0.55, and it corresponds to the denseresidential class predicted by the CNN-FE model, as shown in Figure 5. In the same way, the argmax probability corresponds to each class prediction in the CM metric. The output probabilities of each prediction sum to 1.00.
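A tiny NumPy illustration of this argmax step uses the example probability vector above; index 6, which holds the 0.55 score, is the position the text associates with the denseresidential class.

```python
# The argmax step described above, with the example probability vector.
import numpy as np

probs = np.array([0.0, 0.0, 0.0, 0.0, 0.05, 0.0, 0.55, 0.0, 0.0, 0.0, 0.05,
                  0.0, 0.30, 0.05, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
pred_index = int(np.argmax(probs))                  # index 6, whose score is 0.55
print(pred_index, probs[pred_index], probs.sum())   # 6 0.55 1.0
```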
When evaluating class performance with the CM metric, the lowest results are obtained for denseresidential (55%) and golfcourse (60%) in CNN-FE; denseresidential (50%), mobilehomepark (50%), and golfcourse (55%) in TL; and mobilehomepark (55%) and golfcourse (60%) in fine-tuning, as shown in Figures 5, 6, and 7, respectively. The lower results show that these classes share properties mainly associated with other classes. For instance, denseresidential has more features in common with the mediumresidential class. The class performance in the CM metric is generally best in the fine-tuning model.


FIGURE 5. CM performance results for the CNN-FE model in the UCM dataset.

FIGURE 6. CM performance results for the TL model in the UCM dataset.

B. DISCUSSIONS
In this paper, we utilized DL models for LCLU classification in RS images. The performance of these models has been assessed using a variety of measurement metrics. The results showed good performance on the classification problem, as shown in Table 5. With the Adam (adaptive moment estimation) optimizer controlling the learning rate, the experimental results showed that the proposed DL algorithms could adapt to and learn the features of remote sensing images.


FIGURE 7. CM performance results for the fine-tuning model in the UCM dataset.

TABLE 5. The DL model performance evaluations using performance measurement metrics in the UCM dataset and the time (in seconds) consumed for
training each DL model.

The TL and fine-tuning performances are significantly improved over the scratch-developed CNN-FE.

To achieve the stated goal of this paper, Tables 2 through 4 and Figures 2 through 7 compare the DL model performances on the UCM dataset. From the results, good class performance has been achieved in precision, recall, f1-score, and CM, although some classes scored lower values. The training accuracy increases smoothly in the CNN-FE, TL, and fine-tuning DL models, as shown in Figures 2, 3, and 4, respectively.

The overall accuracies for each model are summarized in Table 5. According to Table 5, the fine-tuning model achieved the best performance in accuracy (88%), precision (89%), recall (88%), and f1-score (89%) with efficient training time, whereas the CNN-FE model performed lower on each metric than the other two models, which could be because the dataset used was small. Moreover, the CNN-FE spent much more time training than the other two models did.

The maximum parameter capacity of EfficientNet is 64 million parameters. The same number of parameters, 18.88 million, was found and learned in both the TL and fine-tuning models. This is about 11 times more than the 1.68 million parameters found in the CNN-FE model, because the convolution-only design used in CNN-FE keeps the number of parameters small.

The fine-tuning technique is used to compare the performance of the DL models designed with the convolutional method and with fully connected layers. As a result, improved performance has been achieved by the fine-tuned model, with efficient training time, compared to the other DL techniques used in this paper.

Designing a CNN model from scratch is essential for identifying the correct properties of the categories in large datasets, and it is usually recommended when a dataset exceeds about 5000 images per class. However, it can consume much training time and is subject to over-fitting.


DL techniques that consume less training time, such as TL and fine-tuning, can overcome this limitation. The TL and fine-tuning DL techniques are efficient in training time and produce improved performance results. Nevertheless, we recommend applying TL and fine-tuning to small datasets, with perhaps fewer than 5,000 images per class. Therefore, as observed in Table 5, we can conclude that the TL and fine-tuning DL techniques are economical in saving time and essential for performance improvement.

IV. CONCLUSION
This paper designed three DL models, CNN-FE, TL, and fine-tuning, for LCLU classification problems using RS images. The TL and fine-tuning models have been trained on the recent EfficientNetB7 pre-trained baseline network, whereas the CNN-FE has been trained from scratch, using the UCM dataset. The models' performances were evaluated using the accuracy, precision, recall, f1-score, and CM metrics. The fine-tuned model achieved profound accuracy on the UCM dataset. We observed that the nature of the denseresidential class is largely similar to the properties of the mediumresidential category; thus, its precision, recall, f1-score, and CM results are the worst compared with the other class categories. In addition to those metrics, the training time is another critical evaluation metric used to compare the economic advantages of the TL and fine-tuning models over DL models developed from scratch. We found that the TL and fine-tuning DL models are efficient in saving time and essential for improving performance.

Since the TL and fine-tuning models are reliable for small datasets, all the experiments were performed using a small LCLU dataset in Google Colab. Thus, we would like to recommend developing a DL model from scratch and comparing its performance with other pre-trained DL models using a powerful GPU and other, larger RS datasets to improve the performance of the DL models. Like the dataset, varying the DL hyperparameters could also affect model performance. Therefore, our future studies will focus on DL optimization methods for LCLU classification in RS images.

ACKNOWLEDGMENT
Reviewing an article for publication is a challenging task. The editors (such as Prof. Giacomo Fiumara), the administrator (Ritika Gupta), and the reviewers have devoted their time to sharing their deep and experienced knowledge with authors and readers worldwide. Thus, the authors would like to thank the anonymous editors, administrators, and reviewers of the IEEE Access international journal for their constructive reviews and comments and for devoting their precious time to improving the quality of this article.
REFERENCES
[1] S. Li, W. Song, L. Fang, Y. Chen, P. Ghamisi, and J. Benediktsson, "Deep learning for hyperspectral image classification: An overview," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 9, pp. 6690-6709, Sep. 2019.
[2] B. Liu, X. Yu, P. Zhang, A. Yu, Q. Fu, and X. Wei, "Supervised deep feature extraction for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 4, pp. 1909-1921, Apr. 2018.
[3] G. Cheng, Z. Li, J. Han, X. Yao, and L. Guo, "Exploring hierarchical convolutional features for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 11, pp. 6712-6722, Nov. 2018.
[4] M. Mahdianpari, B. Salehi, M. Rezaee, F. Mohammadimanesh, and Y. Zhang, "Very deep convolutional neural networks for complex land cover mapping using multispectral remote sensing imagery," Remote Sens., vol. 10, no. 7, p. 1119, 2018.
[5] M. A. Shafaey, M. A. Salem, H. M. Ebied, M. N. Al-Berry, and M. F. Tolba, "Deep learning for satellite image classification," in Proc. Int. Conf. Adv. Intell. Syst. Inform., vol. 845, 2019, pp. 383-391.
[6] G. J. Scott, M. R. England, W. A. Starms, R. A. Marcum, and C. H. Davis, "Training deep convolutional neural networks for land-cover classification of high-resolution imagery," IEEE Geosci. Remote Sens. Lett., vol. 14, no. 4, pp. 549-553, Apr. 2017.
[7] A. B. Hamida, A. Benoit, P. Lambert, and C. B. Amar, "3-D deep learning approach for remote sensing image classification," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 8, pp. 4420-4434, Aug. 2018.
[8] C. Deng, Y. Xue, X. Liu, C. Li, and D. Tao, "Active transfer learning network: A unified deep joint spectral-spatial feature learning model for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 3, pp. 1741-1754, Mar. 2019.
[9] F. Özyurt, "Efficient deep feature selection for remote sensing image recognition with fused deep learning architectures," J. Supercomput., vol. 76, no. 11, pp. 8413-8431, Dec. 2019.
[10] J. Han, D. Zhang, G. Cheng, L. Guo, and J. Ren, "Object detection in optical remote sensing images based on weakly supervised learning and high-level feature learning," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 6, pp. 3325-3337, Jun. 2015.
[11] O. Fink, Q. Wang, M. Svensén, P. Dersin, W.-J. Lee, and M. Ducoffe, "Potential, challenges and future directions for deep learning in prognostics and health management applications," Eng. Appl. Artif. Intell., vol. 92, Jun. 2020, Art. no. 103678.
[12] X. Liu, Y. Zhou, J. Zhao, R. Yao, B. Liu, and Y. Zheng, "Siamese convolutional neural networks for remote sensing scene classification," IEEE Geosci. Remote Sens. Lett., vol. 16, no. 8, pp. 1200-1204, Aug. 2019.
[13] U. Zahid, I. Ashraf, M. A. Khan, M. Alhaisoni, K. M. Yahya, H. S. Hussein, and H. Alshazly, "BrainNet: Optimal deep learning feature fusion for brain tumor classification," Comput. Intell. Neurosci., vol. 2022, pp. 1-13, Aug. 2022.
[14] B. Yang, S. Hu, Q. Guo, and D. Hong, "Multisource domain transfer learning based on spectral projections for hyperspectral image classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 15, pp. 3730-3739, May 2022.
[15] E. Maggiori, Y. Tarabalka, G. Charpiat, and P. Alliez, "Convolutional neural networks for large-scale remote-sensing image classification," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 2, pp. 645-657, Feb. 2017.
[16] M. Tan and Q. V. Le, "EfficientNet: Rethinking model scaling for convolutional neural networks," 2019, arXiv:1905.11946.
[17] A. Shabbir, N. Ali, J. Ahmed, B. Zafar, A. Rasheed, M. Sajid, A. Ahmed, and S. H. Dar, "Satellite and scene image classification based on transfer learning and fine tuning of ResNet50," Math. Problems Eng., vol. 2021, pp. 1-18, Jul. 2021.
[18] H. S. Alhichri, A. S. Alswayed, Y. Bazi, N. Ammour, and N. A. Ajlan, "Classification of remote sensing images using EfficientNet-B3 CNN model with attention," IEEE Access, vol. 9, pp. 14078-14094, 2021.
[19] Y. Yang and S. Newsam, "Bag-of-visual-words and spatial extensions for land-use classification," in Proc. 18th SIGSPATIAL Int. Conf. Adv. Geographic Inf. Syst. (GIS), 2010, pp. 270-279.
[20] O. Russakovsky, "ImageNet large scale visual recognition challenge," Int. J. Comput. Vis., vol. 115, no. 3, pp. 211-252, Dec. 2015.
[21] R. P. de Lima and K. Marfurt, "Convolutional neural network for remote-sensing scene classification: Transfer learning analysis," Remote Sens., vol. 12, no. 1, p. 86, Dec. 2019.
[22] R. Stivaktakis, G. Tsagkatakis, and P. Tsakalides, "Deep learning for multilabel land cover scene categorization using data augmentation," IEEE Geosci. Remote Sens. Lett., vol. 16, no. 7, pp. 1031-1035, Jul. 2019.
[23] C. Peng, Y. Li, L. Jiao, Y. Chen, and R. Shang, "Densely based multi-scale and multi-modal fully convolutional networks for high-resolution remote-sensing image semantic segmentation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 8, pp. 2612-2626, Aug. 2019.


[24] E. Maggiori, Y. Tarabalka, G. Charpiat, and P. Alliez, "Fully convolutional neural networks for remote sensing image classification," in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), Jul. 2016, pp. 5071-5074.
[25] Z. Zeng, X. Chen, and Z. Song, "MGFN: A multi-granularity fusion convolutional neural network for remote sensing scene classification," IEEE Access, vol. 9, pp. 76038-76046, 2021.
[26] M. Kim, "Convolutional neural network-based land cover classification using 2-D spectral reflectance curve graphs with multitemporal satellite imagery," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 12, pp. 4604-4617, Dec. 2018.
[27] R. Naushad, T. Kaur, and E. Ghaderpour, "Deep transfer learning for land use land cover classification: A comparative study," Nov. 2021, arXiv:2110.02580.
[28] D. Zhang, Z. Liu, and X. Shi, "Transfer learning on EfficientNet for remote sensing image classification," in Proc. 5th Int. Conf. Mech., Control Comput. Eng. (ICMCCE), Dec. 2020, pp. 2255-2258.
[29] A. Alem and S. Kumar, "Transfer learning models for land cover and land use classification in remote sensing image," Appl. Artif. Intell., vol. 36, no. 1, Dec. 2022, Art. no. 2014192.
[30] M. Rashid, M. A. Khan, M. Alhaisoni, S.-H. Wang, S. R. Naqvi, A. Rehman, and T. Saba, "A sustainable deep learning framework for object recognition using multi-layers deep features fusion and selection," Sustainability, vol. 12, no. 12, p. 5037, Jun. 2020.

ABEBAW ALEM received the B.Sc. degree in information science from Haramaya University, Haramaya, Oromia, Ethiopia, in 2010, and the M.Sc. degree in information science from Addis Ababa University, Addis Ababa, Ethiopia, in 2014. He is currently pursuing the Ph.D. degree in computer science and engineering with Delhi Technological University, India.
Since September 2010, he has been a Faculty Member with the Computer Science and IT Department, Debre Tabor University, Debre Tabor, Ethiopia. He has middle-level management experience, having served as the Vice Dean for the Technology Faculty, from November 2014 to October 2016, and as a Technology Faculty Educational Development Center Coordinator, from October 2016 to July 2018. His research interests include deep learning for remote sensed image classification, machine learning for image analysis, case-based reasoning, and computer vision.
Mr. Alem is a member of the Black Artificial Intelligence Professional Group, the Global Initiative of Academic Networks (GIAN), and the Young African Leaders Initiative (YALI).

SHAILENDER KUMAR (Member, IEEE) received the B.E. degree in computer science and engineering, in 2001, the M.Tech. degree in computer science, in 2005, and the Ph.D. degree in computer science and engineering from Maharshi Dayanand University, Haryana, India, in 2017.
He has more than 20 years of teaching experience in the Computer Science and Engineering Departments at various esteemed engineering colleges, like Delhi College of Engineering, Netaji Subhas Institute of Technology, and Ambedkar Institute of Advanced Communication Technologies and Research, Delhi, India. Moreover, he has positional experience, such as serving as the Officer-in-Charge of the Data Mining Laboratory and the Ph.D. Student Coordinator with the Computer Science and Engineering Department, Delhi Technological University, Delhi, India, where he is currently a Professor of computer science and engineering. He has published over 50 research papers in various reputed international journals and conferences. His research interests include database systems, data mining, big data analytics, machine learning, information security, compiler construction, and computer networks.
