Segmentation of MR images for brain tumor detection
Research
Abstract
Medical images often require segmentation into different regions in the first stage of analysis. Relevant features are selected to differentiate the various regions from each other, and the images are segmented into meaningful (anatomically significant) regions based on these features. The purpose of this study is to present a model for segmenting and identifying local tumor formation in MR images of the human brain. The proposed system operates in an unsupervised manner to minimize the intervention of expert users and to achieve an acceptable speed in the tumor classification process. The proposed method includes several preprocessing steps for the different brain image classes that perform the normalization task. These preprocessing steps lead to more accurate results in high-resolution images and ultimately improve the accuracy and sensitivity of separating the tumor from brain tissue. The output of this stage is applied to an autoencoder neural network for image zoning. By their nature, autoencoder networks reduce the dimensionality of tumor pixels relative to the surrounding healthy tissue, which significantly helps remove regions incorrectly extracted as tumors. Finally, by extracting features from the previous stage's output through Otsu thresholding, the area surrounding the tumor and the tumor type are also extracted. The proposed method was trained and tested using the BRATS2020 database and evaluated with various performance metrics. The results based on the Dice Similarity Coefficient (DSC) show an accuracy of 97% for the entire MR image and improved detection accuracy compared to other methods, as well as a reduction in the cost of the diagnostic process.
Keywords Autoencoder neural network · MR images · Segmentation · MRI · Brain tumor detection
1 Introduction
The brain is a soft and spongy organ composed of tissues protected by the skull, three thin layers of tissue called the meninges, and a watery liquid (cerebrospinal fluid) that flows in the spaces between the meninges and within the ventricles of the brain [1]. A brain tumor is a solid and often spherical neoplasm that develops inside the brain or the central spinal canal. In simple terms, a brain tumor is an abnormal mass in the brain that can be either cancerous (malignant) or non-cancerous (benign) [2]. The threat level posed by a tumor depends on factors such as its type, location, size, and mode of growth and development, as well as the patient's age [3]. Physicians classify brain tumors based on their grades [4]. A type one tumor consists of benign, non-cancerous tissue; under a microscope, its cells look approximately like normal brain cells, and patients usually have long-term survival [5]. Type two tumors are less like normal brain cells than type one tumors and appear somewhat abnormal under a microscope; these tumors also grow slowly [6]. Type three tumors are cancerous and have cells that differ from normal cells. In this type of tumor, abnormal cells (anaplastic cells) actively grow [7]. Type four tumors have highly malignant tissue and clearly abnormal cells. These tumors readily invade adjacent normal brain tissues and tend to grow rapidly [8].
Based on these explanations of brain structure and tumors, the best way to observe and examine these tissues is through non-invasive imaging of the inside of the tissue, such as MR imaging, which uses electromagnetic waves to produce images [9]. MRI is a precise and powerful imaging technique for diagnosing problems and diseases in body tissues. It is based on Nuclear Magnetic Resonance (NMR) enhancement [10]. The NMR signal can be used for nuclei that possess magnetic properties, meaning their atomic particles have a spin or precessional motion [11]. Currently, MR imaging is one of the most important tools for recognizing and evaluating non-palpable brain tumors due to its high resolution and quality. The analysis of tumors present in MR images is performed by specialists based on regions extracted by segmentation algorithms. In MR images, the presence of noise, partial volume effects, orientation fields, and highly diverse patterns in the convolutions and sulci often significantly complicate the segmentation process [12].
The process of separating an image into multiple components is called image segmentation. Image segmentation
results in the creation of different sets of pixels within an image that facilitate the analysis and meaningful extraction
of information from the image [13]. In general, segmentation involves labeling each pixel in an image such that pixels
with the same label possess similar characteristics. Image segmentation divides the image into distinct regions, each
having uniform levels of brightness. This process is used in medical imaging to observe tissues and accurately detect
tissue boundaries [14].
Autoencoders are a specific type of artificial neural network used for learning efficient codings. Instead of being trained to predict a target value Y from an input value X, autoencoders are trained to reconstruct the input X, resulting in output vectors of the same dimensions as the input vectors [15]. During this process, the autoencoder is optimized by minimizing the reconstruction error. The purpose of autoencoders is to learn a compressed representation of a dataset. Autoencoders consist of three main parts [16]: an input layer, such as brain MR images; one or more smaller hidden layers that perform the encoding; and an output layer for the decoding, where each neuron has the same meaning as in the input layer [17]. In summary, an autoencoder as a function primarily includes two parts: the encoder, a feature extraction function that computes a feature vector from the inputs, and the decoder, which is trained using a predefined probabilistic model to maximize the (often approximate) similarity to the data [18]. Autoencoders are commonly used for dimensionality reduction or feature extraction [19]. They are often trained using variants of backpropagation algorithms, such as conjugate gradient methods [20]. Although this model is generally efficient and effective, it can perform very poorly if errors occur in the initial layers, which may cause the network to reconstruct an average of the training data. A suitable way to address this problem is to initialize the network with weights that approximate the final solution or to use appropriate preprocessing techniques [21].
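The basic idea can be sketched in a few lines of MATLAB using the Deep Learning Toolbox function trainAutoencoder; the data, hidden size, and training options below are illustrative placeholders, not the architecture proposed later in this paper:
X = arrayfun(@(k) rand(240,240), 1:10, 'UniformOutput', false); % placeholder: 10 random 240 x 240 "images"
hiddenSize = 64;                            % fewer neurons than inputs -> compression
autoenc = trainAutoencoder(X, hiddenSize, 'MaxEpochs', 100);
code = encode(autoenc, X);                  % compressed hidden representation
Xrec = predict(autoenc, X);                 % reconstruction with the input's dimensions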
The purpose of this study is to assist specialists and physicians in the segmentation of brain MR images by using an autoencoder neural network and extracting optimal features, improving the accuracy of separating tumor-surrounding regions and of tumor detection. The proposed method introduces a layered architecture for the autoencoder neural network and trains the network using brain MR images from the BRATS2020¹ database. The performance of the proposed method is evaluated with different parameters, and its accuracy is compared to other recent studies.
The present study is organized as follows. Section 2 provides a brief review of various techniques and methods proposed in recent years for the segmentation of MR images and then describes the proposed method for segmenting MR images using an autoencoder neural network. Section 3 presents the results of the proposed method, Section 4 evaluates and discusses them, and finally, the conclusion and future recommendations are presented in Section 5. In the subsequent sections, each stage is explained in detail.
¹ https://www.kaggle.com/datasets/awsaf49/brats2020-training-data
2 Related works
2.1 Recent studies in brain MR image segmentation
In [22] (2023), an advanced feature-enhanced stacked autoencoder model (FEMALE) was introduced for the diagnosis of brain diseases. Because of the suboptimal results obtained with a standard autoencoder, the researchers used a stacked autoencoder instead. The model proposed in this paper includes four main stages: (1) preprocessing techniques are applied to remove noise and convert the images to grayscale, (2) important features are extracted through the Discrete Wavelet Transform (DWT) and channelization, (3) brain MR images are classified into four main classes (normal, tumor, stroke, and Alzheimer's), and (4) evaluation. An accuracy of 96.55% was achieved for 40 to 60 test data samples.
In [23] (2021), a Deep Wavelet Autoencoder model (DWAE) was proposed for the segmentation of input data into normal or abnormal tumor regions. In the preprocessing stages of the proposed method, high-pass filters were used to represent the inhomogeneities in brain MR images and integrate them with the input image. The merging of the segmented parts in this paper was performed using a median filter. The quality of the output was improved by enhancing the highlighted edges and smoothing the input images. Finally, to obtain the segmentation classes from the proposed autoencoder, thresholding with the seed-growing method was employed. Experimental results showed that the proposed method achieved a segmentation accuracy of 96.55% on brain MR images.
In [11] (2020), researchers used a combination of Convolutional Neural Networks (CNNs) for brain tumor seg-
mentation. In this study, a combination of two popular CNN models, U-Net and SegNet, was used. In the proposed
architecture, U-Net used skip connections to gather precise information. However, the feature extraction process by
U-Net required significant computational time. To address this, a SegNet architecture with five convolutional blocks
was employed to improve computational efficiency. Experimental results demonstrated that the proposed method
achieved a segmentation accuracy of 93.12% on brain MR images.
In [24] (2018), a deep convolutional neural network that extracted high-level concepts from low-level features was used for brain tumor segmentation. Because of the low accuracy and the computational expense of network training, the Adapt Ahead [25] optimization algorithm was used to improve the output accuracy and efficiency. The layered model proposed in this study can effectively segment a large volume of data within a suitable time range. Experimental results indicate that the proposed method achieved a segmentation accuracy of 90% on brain MR images.
In [26] (2017), a combination of a Deep Wavelet Autoencoder and a Deep Neural Network (DWA + DNN) was used for brain tumor segmentation. In this study, wavelet techniques are employed for image compression. The proposed method combines the reduced features obtained by the wavelet-based autoencoder with the original features, offering a combined approach. Experimental results indicate that the proposed method achieved a segmentation accuracy of 93% on brain MR data.
Recent research results indicate that deep neural network algorithms can be trained to achieve highly accurate results in detecting various tumor types in MR images, which can lead to a reduction in diagnostic errors and increased use of medical tools.
In addition to the recent studies on the segmentation of brain MR images, other studies based on deep learning and machine vision models have been conducted, and some of these papers are discussed in this section.
In [27] (2024), a revolution in Brain Tumor Segmentation with a Dynamic Fusion of Handcrafted Features and
Global Pathway-based Deep Learning was introduced. In this study, a revolutionary cascade strategy is presented
that intelligently transfers past information from handcrafted feature-based ML algorithms to CNN. Handcrafted
features and deep learning are used to segment brain tumors in a global convolutional neural network (GCNN). The
proposed GCNN architecture with two parallel CNNs, CSPathways CNN (CSPCNN) and MRI Pathways CNN (MRIPCNN),
segmented BRATS brain tumors with high accuracy. This research can improve brain tumor segmentation and help
doctors diagnose and treat patients.
In [28] (2024), expeditious detection and segmentation of bone mass variation in DEXA images was performed using the hybrid GLCM-AlexNet approach. Because most researchers do not attempt to identify and segment low bone mass from DEXA images, in this study medical image segmentation is applied to the analysis and visualization of low bone mass. The proposed hybrid approach, which integrates GLCM for feature extraction and AlexNet for classifying low bone mass variability, provides segmented images that help classify bone health as normal, osteopenic, or osteoporotic. The performance of the developed algorithm in terms of Dice coefficient, sensitivity, and specificity was 92.35%, 90.26%, and 92.42%, respectively, which shows the proper performance of the proposed method.
In [29] (2024), the design and evaluation of a deep genetic algorithm for recognizing yoga postures was presented. This paper uses chunks of data from the Yoga-82 dataset and the HARNet model, considering the movements of yoga asanas. Experimental evaluations of the HARNet model show an accuracy of 98.52% on the Yoga-82 dataset. The cost of the framework is also compared with that of advanced related research, and the economic feasibility of the proposed framework is evident.
In [30] (2024), a methodological intervention for predicting bone health was used to make clinical recommendations. Texture is one of the most important and distinctive image features. In this study, a texture feature extraction system with advanced discrimination power for volumetric images is developed by combining two types of complementary information: local binary patterns (LBP) and normalized gray-level co-occurrence matrix (nGLCM) based techniques. A Kaggle dataset including X-ray images obtained from patients with osteoporosis was used to extract the features and train the U-Net model in the classification stage of the developed algorithm. In summary, this paper uses a modified U-Net semantic segmentation classifier (ModU-Net) to segment low-bone-mass segments in the processed image. This method helps doctors with early diagnosis and also protects patients from bone fragility and possible fractures.
In [31] (2024), a SqueezeNet-guided Gaussian-kernel SVM was used for COVID-19 diagnosis. The approach proposed in this study combines the SqueezeNet deep neural network with a Gaussian kernel in Support Vector Machines (SVM). This model is trained on a dataset of CT images. SqueezeNet is used for feature extraction and the Gaussian kernel for non-linear classification. According to the evaluations, the SVM Gaussian-Kernel SN (SGS) model achieved high accuracy and sensitivity in the diagnosis of COVID-19. This model outperformed other models with an accuracy of 96.15% and is a promising approach for accurate diagnosis of COVID-19. The integration of SqueezeNet and the Gaussian kernel enhances its ability to capture complex relationships and effectively classify COVID-19 cases.
In [32] (2023), automated multi-class Alzheimer's disease classification in brain MRI was performed using a hybrid dense optimal capsule network. In this research paper, an efficient and automatic deep learning-based AD diagnosis using MRI image data was proposed. The initial step in AD classification was data collection: MRI brain images were gathered from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and the OASIS dataset. The collected data are pre-processed with an improved median filter (IMF) and normalized using [0, 1] rescaling. Skull stripping is performed using morphological thresholding (MoT). The preprocessed images are fed to the Multiview Fuzzy Clustering (MvFC) algorithm to segment brain tissue into gray matter (GM), cerebrospinal fluid (CSF), and white matter (WM). Deep feature extraction, multi-class classification, and loss optimization were performed using a Dense Hybrid Optimal Capsule Network (Hybrid D-OCapNet). For the ADNI dataset, the overall accuracy in AD classification was 99.32%, with sensitivity of 98.42%, specificity of 98.90%, precision of 98.93%, and F1 score of 98.44%; for the OASIS dataset, the accuracy was 98.97%, sensitivity 98.31%, and F1 score 98.39%.
In [33] (2023), a neural network and a mobile app were presented for COVID-19 recognition. In this study, a four-way variable-distance GLCM (FDVD-GLCM) is presented based on the gray-level co-occurrence matrix (GLCM), and an extreme learning machine (ELM) is used as the classifier for the diagnosis of COVID-19. The proposed model uses a multi-way data augmentation method to enhance the training sets. Ten-fold cross-validation shows that this FECNET model achieved sensitivities of 92.23 ± 2.14, 93.18 ± 5.87, and 93.12 ± 0.83, and accuracies of 92.19 ± 1.89, 92.88 ± 5.88, and 92.83 ± 1.22; an accuracy of 92.53 ± 1.37 was obtained for the second dataset.
In [34] (2023), an in-depth exploration of transfer learning techniques was used in the context of brain tumor clas-
sification. This paper identifies research on transfer learning for brain tumor classification, including multimodal data
fusion, interpretable transfer learning models, and domain adaptation techniques. Ethical considerations and limitations
of transfer learning in health care are also discussed in this study. Finally, to advance the development of accurate
and efficient diagnostic tools in clinical settings, this paper has provided insight into the challenges, opportunities, and
prospects of transfer learning in brain tumor classification.
In [35] (2023), an evolutionary model for brain cancer grading and classification was presented. In this study, an
evolutionary model first prepares the data, extracts the important features, and merges the data using a special type of
classification called group classification. The proposed model was evaluated using the BRATS2020 dataset. The dataset
consists of 285 MRI scans from patients with glioma. The simulation results showed that the proposed model achieved
93.0% accuracy, 0.94 precision, 0.93 recall, 0.94 F1 score, and an area under the Receiver Operating Characteristic Curve
(AUC-ROC) value of 0.984.
In [36] (2023), brain tumor segmentation accuracy is enhanced through scalable federated learning with advanced data privacy and security measures. This study used a U-Net-based model architecture. The experimental results show the remarkable effectiveness of federated learning in this study: the specificity improves to 0.96 and the Dice coefficient to 0.89 as the number of clients increases from 50 to 100. The findings of this research contribute to the advancement of the field of medical image segmentation while maintaining privacy and data security.
In [37] (2023), brain tumor segmentation from MRI images is enhanced using a handcrafted convolutional neural network. In this study, handcrafted features, including intensity-based, texture-based, and shape-based features, were first extracted from MRI scans, and then a unique CNN architecture was developed and trained to automatically identify features from the data. The proposed hybrid method combined the handcrafted features with the features detected by the CNN in different directions to form a new CNN. The BRATS dataset was used to measure performance with a variety of evaluation criteria. The results showed that the proposed approach performs better than the traditional feature-based methods and individual CNN-based methods used for brain tumor segmentation. Moreover, the incorporation of handcrafted features increased the performance of the CNN and yielded a more robust and generalizable solution.
Brain MR images typically contain complex structures, and accurate segmentation of these structures is essential for clinical diagnosis. Many tools are available for automated and semi-automated segmentation of brain MR images, but most of them fail for various reasons, such as unknown noise, low gray-level contrast, inhomogeneity, and the weak edges commonly found in brain MR images. MR images also vary because of the bias field, which leads to intensity variations in inhomogeneous tissues throughout the image. It is therefore not guaranteed that the intensity distribution of a tissue type is consistent across different subjects with the same MRI sequence; rather, this is an implicit assumption in most segmentation methods. The intensity distribution may even vary when an image of the same patient is acquired with a different scanner or at different time intervals. Because of these factors, each segmentation method works well only in detecting certain tumor types. Considering the dimensionality reduction and noise reduction capabilities of autoencoder networks for addressing the inhomogeneity and weak edges in such images, this study employs an autoencoder network for more optimal and accurate segmentation of various tumor types in different sequences of brain MR images.
2.2 The proposed method
The proposed method includes four main stages to segment and ultimately recognize an accurate tumor region: (1) acquiring the input image, (2) initial preprocessing of the images, (3) segmentation using the proposed autoencoder neural network, and (4) extraction of tumor regions.
In the proposed method, the MR images are first subjected to data normalization for homogenization and resized to 240 × 240 pixels. They are then normalized in terms of intensity, and the noise resulting from these transformations is removed. Finally, image sharpening operations are performed. After the input data preparation is complete, the preprocessed images are fed into the autoencoder neural network for the segmentation process. In the final step, the output of the segmentation process is used for tumor extraction and size determination.
2.3 Input images
The images used in this study belong to the BRATS2020 database. The database contains four MRI sequences for each patient in the BRATS challenges: T1-weighted (T1), T1 with gadolinium contrast enhancement (T1c), T2-weighted (T2), and FLAIR. The BRATS2020 database provides Training, Validation, and Testing datasets, each containing MR scans taken from different patients. In the training process of the proposed method, two sets of images were used, related to benign tumors (types 1 and 2) and malignant tumors (types 3 and 4). To improve the results of the proposed method, a data augmentation technique was applied to the input data. Data augmentation is used to create reliable data for pre-training. This technique involves rotating image patches at different angles. In this study, for type 1 tumors, patches were rotated at angles of 90, -90, and 180 degrees, while for type 2 tumors, the patches were rotated at angles of 45 and -45 degrees.
2.4 Preprocessing
Initial preprocessing includes normalization of the image size and intensity, noise elimination, and image sharpening. In this stage, the input MR images are first resized to 240 × 240 pixels to match the input dimensions of the proposed autoencoder neural network, so that the mapping to the intended neurons of the autoencoder is done with a uniform standard. In the intensity normalization stage, the input image is first converted to a grayscale image, and the intensity values are normalized between zero and one. Noise reduction is performed by applying a median filter to all images. To restore the details lost during noise reduction and enhance the sharpness of the brain image, image details are extracted using a sharpening filter and added back to the original image.
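A compact sketch of this preprocessing chain in MATLAB is given below; the file name and filter size are illustrative assumptions, and imsharpen stands in for the detail extraction and re-addition described above:
I = imread('brain_slice.png');             % hypothetical input file name
I = imresize(I, [240 240]);                % match the network input size
if size(I, 3) == 3, I = rgb2gray(I); end   % grayscale conversion
I = mat2gray(I);                           % normalize intensities to [0, 1]
If = medfilt2(I, [3 3]);                   % median filtering for noise removal
I_pre = imsharpen(If);                     % restore edge details lost by smoothing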
2.5 Segmentation
Autoencoders are a type of deep neural network primarily used for dimensionality reduction. These networks are trained to reconstruct the input data and achieve dimensionality reduction by using fewer neurons in the hidden layer than in the input layer. The initial target is to develop distinct parameters for segmenting the brain image into regions of white matter, gray matter, and cerebrospinal fluid, as shown in Fig. 1. These three regions have distinct characteristics: for example, white matter consists of large white patches, cerebrospinal fluid is thinner and more flexible, and gray matter includes both of these characteristics. Therefore, we decided to adjust the corresponding layers separately based on the characteristics of each region.
Another purpose is to design an algorithm that uses fewer parameters than most neural network architectures. In autoencoders, the number of hidden-layer neurons is usually smaller than the number of input neurons; therefore, autoencoders essentially learn a representation of the input space in a lower-dimensional space, and this hidden space can be used for reconstructing the original image. Although this reconstruction may not be perfect, it can be conceptualized as compressing the input image from a higher-dimensional space to a lower-dimensional space, similar to JPEG compression. In this study, the proposed autoencoder network is used with 8-, 16-, 32-, 64-, and 128-neuron layers. The compression phases of the 8-, 16-, 32-, and 64-neuron layers are performed sequentially, where the output of each layer goes to the next encoder. The inverse compression is performed immediately in the 128-neuron layer, while the compression reversers are embedded symmetrically in the other layers of the decoder. The 8- and 16-neuron layers, along with their corresponding reversers, are suitable for detecting cerebrospinal fluid: the arrangement and number of neurons in these two detection layers provide suitable detection of abnormalities in cerebrospinal fluid, which is thinner and more flexible. The 32-neuron layer, along with its corresponding reverser layer, is suitable for detecting white matter, which consists of large white patches. The 64-neuron layer, along with its 64-neuron reverser layer and the combination of its output with the 128-neuron layer, handles the task of detecting gray matter, which possesses all the characteristics of large white patches, thinness, and flexibility.
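This sequential compression can be sketched with stacked MATLAB autoencoders as below; this is a simplified stand-in for the custom network of Fig. 2, with placeholder data X and illustrative training options:
ae8  = trainAutoencoder(X,   8, 'MaxEpochs', 50);  f8  = encode(ae8, X);    % CSF-oriented stage
ae16 = trainAutoencoder(f8, 16, 'MaxEpochs', 50);  f16 = encode(ae16, f8);  % CSF-oriented stage
ae32 = trainAutoencoder(f16, 32, 'MaxEpochs', 50); f32 = encode(ae32, f16); % white-matter stage
ae64 = trainAutoencoder(f32, 64, 'MaxEpochs', 50); f64 = encode(ae64, f32); % gray-matter stage
% a final 128-neuron layer then performs the inverse compression symmetrically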
The overall architecture of the proposed autoencoder neural network can be seen in Fig. 2. This network consists of five layers of increasing filter size, followed by a reverser network that reduces the filter size. The input data is corrupted by adding noise or randomly removing a portion of the image, and the network is then trained against the original image, aiming to reconstruct the missing part of the image or recover the correct image from a noisy input. The network receives a noisy or corrupted image and returns a clean, denoised image. In other words, the noisy inputs are first encoded into a smaller space with lower dimensions. This process is similar to compression and leads to some information loss (in this study, the goal is to eliminate the noise-related information). Finally, the output of the hidden space is decoded (inverse compression), providing a reconstructed and denoised image of all three parts of the brain. This output can be used to detect abnormal brain image conditions and consequently identify tumor-affected regions.
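The corruption step can be sketched as below; this is a minimal example in which the 20% masking rate matches the noise level reported for training later in this section, and the variable names are illustrative:
Xclean = im2double(patch);           % patch: a preprocessed image patch
mask = rand(size(Xclean)) > 0.20;    % randomly drop about 20% of the pixels
Xnoisy = Xclean .* mask;             % corrupted input; Xclean is the training target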
When it comes to autoencoders and hidden layers, the main concern is how many hidden layers and how many neurons are needed. There are several rules of thumb for choosing the number of neurons in hidden layers: the number of hidden neurons should be between the size of the input layer and the size of the output layer; the number of hidden neurons can be 2/3 the size of the input layer plus the size of the output layer; and the number of hidden neurons should be less than twice the size of the input layer. In addition, the number of neurons and the number of hidden layers required also depend on the training examples, the number of outliers, the complexity of the data to be learned, and the type of activation functions used. In this study, we used a deep autoencoder in which the first layers capture first-order features of the raw input and the middle layers capture second-order features corresponding to patterns in the first-order features. The deeper layers of the deep autoencoder learn higher-order features. In fact, the deep autoencoder network consists of two symmetric deep belief networks: as shown in Fig. 2, the left half is the encoding half of the network and the right half is the decoding half. Selecting too few neurons leads to underfitting, while choosing too many neurons may lead to overfitting and high variance and increase the time required for network training.
The proposed autoencoder neural network has two execution phases: the encoding phase (from the input layer to the hidden layer) and the decoding phase (from the hidden layer to the output layer). The encoding phase is formulated in Eq. (1), where W represents the weight matrix and b1 the bias vector of the encoder phase. Equation (2) formulates the decoding phase, where W^T is the weight matrix and b2 the bias vector of the decoder phase. h represents the hidden representation of the input x, and x̃, given by p(x|h), should approximately resemble x. f in both equations refers to one of the commonly used activation functions. In the proposed autoencoder neural network, the sigmoid activation function is used for the encoder, and a linear function is used for the decoder. The sigmoid is a real, bounded, and differentiable function; it is defined for all real values and has a positive derivative. Activation functions are an important part of neural networks: their input can be any number, and their output is a number between zero and one. By transforming a number into this range, these functions decide whether a neuron in a neural network should be activated.

h = f(Wx + b1)    (1)

x̃ = f(W^T h + b2)    (2)
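In code, the two phases amount to a sigmoid layer followed by a linear layer (a sketch; W, b1, and b2 would be learned during training):
sigmoid = @(z) 1 ./ (1 + exp(-z));   % bounded, differentiable, positive derivative
h    = sigmoid(W * x + b1);          % Eq. (1): encode input x into hidden code h
xhat = W' * h + b2;                  % Eq. (2): linear decoder reconstructs x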
In summary, the layers of the proposed autoencoder neural network consist of four encoder layers (compressors) and four corresponding decoder layers (inverse compressors) arranged linearly, along with a 128-neuron layer. Encoding and decoding in the 128-neuron layer are performed symmetrically, while in the other four layer pairs this process is performed correspondingly. These layers were trained to identify the features of type 3 and 4 tumors for 130 patients with malignant tumors (type 3 tumors: 70; type 4 tumors: 60). The proposed autoencoder neural network was also tested using 10 images with type 1 and 2 tumors and 10 images with type 3 and 4 tumors. Validation was performed using fragments obtained from 11 images with type 1 and 2 tumors and 10 images with type 3 and 4 tumors.
The proposed autoencoder neural network was trained using images from the BRATS2020 database. The layers of the proposed autoencoder neural network were trained to detect type 3 and 4 (malignant) tumors on 110,492 training patches (type 1 and 2 tumors: 533,856; type 3 and 4 tumors: 576,636) and evaluated on 438,275 testing patches (type 1 and 2 tumors: 244,320; type 3 and 4 tumors: 193,955). The training data were accompanied by 20 percent noise. The proposed autoencoder neural network was trained for 7000 epochs with an initial learning rate of 0.001. Additionally, the layer-by-layer network was also trained to detect type 1 and 2 (benign) tumors using 941,716 patches. In this process, which was performed on 50 images with a 25% noise mask, the weight and bias values were initialized with their previous defaults. For fine-tuning, the weights and biases of the network layers and decision layer were initialized with zero values. We also use spatial dropout after the initial encoder convolution, with a dropout rate of 25% for all layers. We experimented with other placements of the dropout but did not find any additional accuracy improvements. The network was retrained with 3,304,035 patches (type 1 and 2 tumors: 1,416,015; type 3 and 4 tumors: 1,888,020), and validation was performed on 411,495 patches (type 1 and 2 tumors: 176,355; type 3 and 4 tumors: 235,140). Due to the limited number of type 1 and 2 tumors in the image database, the network was also retrained on type 1 and 2 tumor data, and performance optimization was conducted with a 35% random dropout rate using type 1 and 2 tumor patches (training: 1,365,450; validation: 181,170).
The network weights were trained by minimizing the negative log-likelihood cost of the input using stochastic gradient descent with a momentum of 0.9. The initial learning rate was 0.500 and was reduced to 0.001. The current learning rate at each epoch was calculated by Eq. (3):

Current learning rate = Initial learning rate ÷ (Epoch × Reduced learning rate)    (3)
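As a direct transcription of Eq. (3) (the variable names are illustrative):
initialRate = 0.500;                                  % initial learning rate
reducedRate = 0.001;                                  % reduced learning rate
currentRate = initialRate / (epoch * reducedRate);    % Eq. (3), for the current epoch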
2.6 Tumor detection
In the first step of extracting the tumor regions, an averaging filter is used to reduce noise in the output images. A 5 × 5 filter (25 pixels) was used, determined based on brightness intensity.
The code snippet for noise reduction is given below:
I = imread('aks.png');     % read the output image ('aks.png' is the authors' example file)
h = ones(5,5) / 25;        % 5 x 5 averaging (mean) kernel
I2 = imfilter(I, h);       % smoothed image
In the next stage, the original image and the filtered image are merged in order to enhance the image quality. Then, the image is filtered using a median filter, a non-linear operation commonly used in image processing to reduce "salt and pepper" noise. When the purpose is to simultaneously reduce noise and preserve edges, the median filter is more effective than other filters. To simulate this part, the "medfilt2" command from the Image Processing Toolbox in MATLAB is used.
The code snippet for the median filter is given below:
J = imnoise(I2, 'salt & pepper', 0.02);   % add synthetic salt-and-pepper noise to the merged image
K = medfilt2(J);                          % median filtering (default 3 x 3 window)
After this stage, the filtered image is converted to a binary image based on Otsu thresholding. In this method, a global threshold is computed for converting the image, with intensities normalized between 0 and 1, to a binary image. The Otsu method calculates the threshold that minimizes the within-class variance of black and white pixels. After segmenting the image using the Otsu method, morphological operations are applied to the resulting binary image to find the area of interest and neglect the other, undesired regions. To optimize the attributes, the "imerode" and "imdilate" commands from the morphological operations are used for erosion and dilation on the resulting image. The same morphological operations have been applied to all the state-of-the-art algorithms to avoid loss and ensure a fair comparison of the segmented brain tumor images. Finally, the obtained result is merged with the original image.
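The Otsu binarization step can be sketched as below, where K is the median-filtered image from the previous snippet:
level = graythresh(K);       % Otsu threshold minimizing within-class variance
BW = imbinarize(K, level);   % binary image of candidate tumor pixels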
The code snippet for two operations of “imerode” and “imdilate” is given below:
se = strel('disk', 3);   % structuring element (a disk of radius 3 is assumed; the original arguments were garbled)
M = imerode(BW, se);     % erosion removes small spurious regions
V = imdilate(M, se);     % dilation restores the extent of the remaining regions
At this stage, the resulting image represents the tumor region clearly on the original image. Finally, the identified
tumor edges are highlighted, and the tumor image is displayed individually without noise. At the end of this step, a post-
processing stage consisting of histogram matching and normalization is performed. In the histogram matching process
depicted in Fig. 3, all available images are matched to the reference training image’s histogram. This procedure ensures
that the contrast and dynamic range remain consistent throughout the processing.
After histogram matching, outlier correction is performed based on the z-score method. In this method, all sequences related to tumor volumes are independently normalized to a mean of zero and a standard deviation of one.
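Per sequence, this z-score normalization is simply (a minimal sketch, with V denoting one MRI sequence of the tumor volume):
mu = mean(V(:)); sigma = std(V(:));   % statistics over the whole sequence
Vz = (V - mu) / sigma;                % zero mean, unit standard deviation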
After this stage, the extraction process of patches is performed. In each sequence of MR images, patches of size 21 × 21
are utilized for pre-training and fine-tuning the network. For pre-training, the patches are sampled along the image plane
with a step of 10, using a 21 × 21 window. Subsequently, the patches are extracted from the regions surrounding the
tumor for fine-tuning. This sampling approach reduces the imbalance between damaged tissues and healthy patches,
thereby improving the overall performance of the network.
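A sketch of this 21 × 21 sampling with a step of 10 for pre-training is given below (I_pre is a preprocessed slice; boundary handling is kept minimal):
patchSize = 21; step = 10;
patches = {};
for r = 1:step:(size(I_pre, 1) - patchSize + 1)
    for c = 1:step:(size(I_pre, 2) - patchSize + 1)
        patches{end+1} = I_pre(r:r+patchSize-1, c:c+patchSize-1); %#ok<SAGROW>
    end
end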
3 Experimental results
In this section, we present the results obtained from the proposed method, including the introduction of the database,
simulation environment, evaluation metrics, and the display of graphs and images related to the output of the proposed
model.
3.1 Dataset
The images used in this study were extracted from the BRATS2020 dataset, which provides four MRI sequences (T1, T1c, T2, and FLAIR) for each patient. The dataset includes three sets, Training, Leaderboard, and Challenge, belonging to MR scans of different patients. The Training set includes 20 high-grade tumors and 10 low-grade tumors, for which manual segmentations are available. The Leaderboard set consists of 21 high-grade tumors and 4 low-grade tumors, and the Challenge set includes 10 high-grade tumors. Automated segmentation of brain tumors can save physicians time and provide an accurate, reproducible solution for further tumor analysis and monitoring. Nowadays, deep learning-based segmentation techniques have surpassed traditional computer vision methods for dense semantic segmentation. Autoencoders can learn from examples and demonstrate state-of-the-art segmentation accuracy both in 2D natural images and in 3D medical image modalities. The Multimodal Brain Tumor Segmentation Challenge (BRATS) aims to evaluate state-of-the-art methods for the segmentation of brain tumors by providing an MRI dataset with annotated ground-truth tumor segmentation labels. Manual segmentation identifies four tumor categories: necrosis, edema, enhancing tumor, and non-enhancing tumor. However, the evaluation also considers the tumor core, a combination of categories (necrosis + non-enhancing tumor + enhancing tumor), along with the complete tumor. The images in this dataset are in MHA format, which was converted to PNG format for image processing purposes. When we train a machine learning model, we set its parameters so that it can map a specific input (such as an image) to an output (a label). The purpose of optimization in this study is to pursue the optimal point where the loss of our model is low, which occurs when the model parameters are set correctly. Hence, the number of parameters should be chosen according to the complexity of the task the model performs. To improve the results for type 1 and 2 tumors, data augmentation techniques were applied to the input data because of the limited available data. This technique enhances and increases the available data. In this study, for type 1 tumors, the patches were rotated by 90, -90, and 180 degrees, while for type 2 tumors, the patches were rotated by 45 and -45 degrees. It is also possible to use random angles in this technique, but determining the appropriate zero filling (zero padding) then becomes challenging.
3.2 Simulation environment
The computer system used in this research consists of an Intel Core i7 processor accompanied by an Nvidia GeForce graphics processor, which handles the main processing and graphics tasks. The graphics processor has 4 gigabytes of memory for working with the data. The software employed for the simulation in this study is MATLAB R2016a, prepared using the MatCaffe package based on Caffe 2.0; MR images of brain tumors obtained from the BRATS2020 dataset were used for the processing tasks.
3.3 Evaluation criteria
Various metrics are used to evaluate the results and performance of tumor detection methods, with the main purpose of measuring the accuracy of tumor region pixel detection. The more pixels a method or algorithm can correctly identify as tumor pixels, the more accurate the method is considered to be. However, this evaluation approach alone is not sufficient for region-based segmentation of brain MR images for para-clinical purposes, where the purpose is not only to diagnose and surgically treat the tumor region. Therefore, other metrics have been proposed for measuring segmentation accuracy, which, in addition to the importance of correctly identifying the number of tumor pixels, also consider the number of healthy pixels that may be mistakenly classified as tumor pixels by the filters and algorithmic changes employed. These evaluation metrics are usually expressed in two ways: one is based on the sets of tumor and healthy brain regions, and the other is based on the number of pixels belonging to tumor and healthy brain regions, as shown in Fig. 4. This figure is called a confusion matrix, in which both of the mentioned sets and the number of pixels in each set are visible. In Fig. 4, Tp (true positive) occurs when the model correctly predicts a pixel's class. Tn (true negative) occurs when the model correctly predicts that a pixel does not belong to a class. Fp (false positive) occurs when the model incorrectly predicts a pixel's class. Fn (false negative) occurs when the model incorrectly predicts that a pixel does not belong to its own class. In general, Tp + Tn + Fp + Fn = N, where N is the total number of pixels in the image. The following are the most important evaluation metrics used in this study to assess the proposed method:
Jaccard Similarity Index (JSI): one of the most important metrics for evaluating the success or failure of a method in detecting tumor regions. JSI is defined by Eq. (4):

JSI = |A ∩ B| / |A ∪ B|    (4)
Dice Similarity Coefficient (DSC): alongside the Jaccard similarity index, the Dice coefficient is employed to assess the quality of segmentation. The DSC is defined by Eq. (5):

DSC = 2 × |A ∩ B| / (|A| + |B|)    (5)
The closer the value of these two indices is to 1, the better the quality of segmentation. The set of segmented pixels of
the target tissue in the image segmented by the neurologist expert is shown as A, and B shows the set of segmented pixels
of the target tissue by the algorithm.
Furthermore, for a more detailed assessment of segmentation accuracy, two characteristics, sensitivity (TPF) and specificity (TNF), are used. Sensitivity is defined by Eq. (6), and specificity by Eq. (7):

Sensitivity = tp / (tp + fn)    (6)

Specificity = tn / (tn + fp)    (7)
Additionally, for a better evaluation of segmentation, the percentages of under-segmentation, over-segmentation, and mis-segmentation can be used. According to Eqs. (8), (9), and (10), UnS represents the percentage of pixels mistakenly assigned to cluster A, OvS represents the percentage of pixels from cluster A that are mistakenly left out, and InS represents the overall percentage of pixels that are mis-segmented for a cluster.

UnS = (fp / n) × 100    (8)

OvS = (fn / p) × 100    (9)
InS = ((fp + fn) / N) × 100    (10)
In the above Equations, n represents all the pixels belonging to cluster A (n = tp + fn), p represents all the pixels outside
cluster A (p = tn + fp), and N represents the total number of pixels in an image (N = n + p).
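Collected in one place, Eqs. (4)-(10) can be computed from two binary masks as in the sketch below, where A is the expert's mask and B is the algorithm's mask (both logical arrays of the same size):
function m = segMetrics(A, B)
tp = nnz(A & B);  fp = nnz(~A & B);   % pixel counts of the confusion matrix
fn = nnz(A & ~B); tn = nnz(~A & ~B);
n = tp + fn;  p = tn + fp;  N = n + p;
m.JSI = tp / (tp + fp + fn);          % Eq. (4): |A ∩ B| / |A ∪ B|
m.DSC = 2*tp / (2*tp + fp + fn);      % Eq. (5)
m.Sensitivity = tp / (tp + fn);       % Eq. (6)
m.Specificity = tn / (tn + fp);       % Eq. (7)
m.UnS = fp / n * 100;                 % Eq. (8)
m.OvS = fn / p * 100;                 % Eq. (9)
m.InS = (fp + fn) / N * 100;          % Eq. (10)
end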
3.4 Output results
To display the segmentation results for brain tumors, a multi-class classification with five categories (healthy tissue, necrosis, edema, enhancing tumor, and non-enhancing tumor) was needed to align with the BRATS dataset. However, these classes are imbalanced in brain tumors; therefore, all samples from the underrepresented classes and random oversampling from the other classes were used in this study. The number of necrosis and enhancing tumor samples in the training set for type 1 and 2 (benign) tumors was low. To solve this problem, the intensity values of type 3 and 4 (malignant) tumors were calculated using boundary cues and normalized against type 1 and 2 (benign) tumors, so that the sample classes obtained from type 3 and 4 (malignant) tumors could also be used as training samples for type 1 and 2 (benign) tumors. In the training process of the proposed autoencoder neural network, approximately 450,000 and 335,000 patches were extracted for type 3 and 4 (malignant) and type 1 and 2 (benign) tumors, respectively. Data augmentation made about four times as many effective training samples available. These patches contain approximately 40% healthy tissue for type 3 and 4 (malignant) tumors and 50% for type 1 and 2 (benign) tumors. The learning rate decreased linearly after each epoch during the training process. The results of applying each layer of the autoencoder neural network to a sample image are shown in Table 1.
In the network training, the number of epochs was seven, since the network accuracy did not change with a larger number. After finalizing the meta-parameters on the input images and recording the network's accuracy during the training procedure, the cost function is illustrated in Fig. 5. As can be observed, the network cost decreased until the end of the 6th epoch, after 7000 iterations; no change occurred after that, and the cost remained constant.
The steps described in the simulation section were implemented on each of the 10 selected images. The results obtained from applying the proposed method to the 10 selected images are summarized in Table 2. Table 3 also displays the size of the images and the size of the tumor in pixels (in other words, the area of the tumor in pixels) for each of the 10 selected images.
In this section, we evaluate the impact of each component of the proposed method by examining the resulting increase in performance. This performance improvement is evaluated as the average gain in the TPF and TNF coefficients. The evaluation is carried out as follows: first, we calculate all the coefficients using the proposed method on the dataset; then we remove or replace the component under investigation. Finally, by subtracting each metric between the two systems, we calculate the average of the differences and obtain the average gain. Furthermore, the intensity normalization method has only been applied to the images within the training dataset during the training phase. All the experiments conducted in this section use patches extracted from the vertical planes along axis X of the MR image, and only the selection of the best axis is evaluated at the final stage. This approach was chosen primarily because it has been the most widely used in brain tumor segmentation methods based on autoencoder neural networks.
The alternative preprocessing initially begins by trimming a small percentage of the extreme intensity values inside the brain, then applying bias field correction to each of the MRI sequences used, and linearly converting the intensity values to [0, 1]. Finally, each sequence is normalized to have a mean of zero and unit variance. Additionally, during the training of the autoencoder neural network with this preprocessing, we observed that for type 1 and 2 (benign) tumors the initial and final learning rates should be reduced to 3 × 10⁻⁷ and 3 × 10⁻⁵, respectively; otherwise, the optimization diverges. Based on the results shown in Table 4 for the Training image set and Test image set, it can be concluded that preprocessing with the intensity normalization method has increased most of the coefficients. The proposed preprocessing improves complete tumor detection as well as the challenging task of tumor core segmentation.
The results of this experiment are intriguing for both types, as we know that instead of calculating point-wise features as intensity values, the features learned by the autoencoder neural network are computed in local regions through low-pass filters at different scales. Additionally, we investigated the effect of increasing the number of training epochs up to 90 but did not observe any improvement when performing simpler preprocessing. Referring to Table 4 and examining the results for test image set segmentation, it can be concluded that the proposed preprocessing
enables better training of the autoencoder neural network, resulting in segmentation that displays an improved
pattern of non-enhancing regions and necrosis in both datasets.
In this study, two types of data augmentation were examined. In the first type, the effect of augmenting the number of samples using rotation was investigated; rotations of multiples of 90 degrees (90, 180, and 270 degrees) were used. In the second type, three equidistant rotational angles were used, sampled consecutively from a uniform distribution. The angle values were determined as 90° × α with α ∈ {1.8, 1.16, 1.32}, where the value α = 1.16 was examined. Table 5 shows the results for each type of rotation with different angles. As observed, the rotations increased DSC performance in all regions, but a decrease in sensitivity was also observed on both datasets. Considering the results shown in Table 4, we conclude that the additional information generated by the first type of rotations in training the autoencoder neural network results in a better depiction of the entire tumor and its internal structures in the segmentations. In both grades, when training without data augmentation, we observe the dominance of the non-enhancing class, and for type 3 and 4 (malignant) tumors,
this class is even found within the region formed by the enhancing structures and necrotic tissue, although this does
not occur in manual segmentation.
In Table 6, examples of the output of the proposed method are displayed. From left to right, T1, T1c, T2, FLAIR, and the segmentation output are shown. Each color represents a tumor class: green for edema, blue for necrosis, yellow for non-enhancing tumor, and red for enhancing tumor. Row A shows the segmentation of type 3 and 4 (malignant) tumors for the patient with ID 210, row B shows the segmentation of type 1 and 2 (benign) tumors for the patient with ID 105, and row C shows an example of complete tumor segmentation from the Challenge dataset for the patient with ID 310.
Table 6 Examples of the output of image segmentation with the proposed autoencoder neural network (columns: Patient ID, T1, T1c, T2, FLAIR, and the result of segmentation; rows: A (210), B (105), C (310))
The achieved accuracy for brain MR image segmentation of some of the studies introduced in the background
section (in terms of the Dice similarity metric) is shown in Table 7 in the order of their publication year. The average
Dice similarity metric accuracy can be obtained for the three cases of complete, core, and enhancing regions on the
BRATS image dataset; in this study, only the results related to the complete case are reported for comparison with
other articles. While these accuracies have improved significantly compared to the conventional previous methods,
further improvement in performance accuracy and reduction in training and testing time are still highly important.
4 Discussion
An autoencoder can learn nonlinear transformations with a nonlinear activation function and multiple layers. In this architecture, there is no need to rely on dense layers alone; convolutional layers can be used, depending on which is better suited for video, image, or time-series data. Learning multiple layers with an autoencoder is much more efficient than a massive transformation with PCA. An autoencoder provides a representation of each layer as output, and it can use pre-trained layers of another model and transfer learning for the encoder/decoder. Therefore, the use of this network in medical applications can be very efficient, because autoencoder networks learn to encode the input into a set of simple signals, try to reconstruct the input from them, and can accommodate changes in image geometry or reflectivity.
In this study, the autoencoder neural network architecture was used for the segmentation of brain tumors in MR
images. The overall flowchart of the proposed approach is shown in Fig. 2. The implementation and evaluation of
the proposed method were carried out on the BRATS2020 dataset. For training the network, three sets of Training,
Leaderboard, and Challenge were used, which included MR scans of different patients, all of which had four sequences
of T1, T1c, T2, and FLAIR. The necessary preprocessing steps, including image normalization and data augmentation,
were performed before inputting into the autoencoder neural network. After the preprocessing step, segmentation
was performed by the autoencoder neural network, and finally, the tumor region and tumor type were extracted
using the proposed method. Table 1 shows an example of the results obtained by applying the different layers of
the proposed model on a sample image (a patient), and Table 2 shows various examples of the results obtained for the tumor region of different patients.
Table 2 Output of the proposed autoencoder neural network for 10 test images (columns: original image, segmentation of the image, the location of the tumor, and tumor shape; rows: images B3, B4, B332, B368, B412, B436, B460, B469, B471, B491)
The distinctive feature of this network compared to previous methods, which were mostly designed symmetrically, was the use of 8- and 16-neuron layers for the segmentation of cerebrospinal fluid, a 32-neuron layer for the segmentation of white matter, and finally, 64- and 128-neuron layers for the segmentation of gray matter, which has a more complex tissue structure than the previous two tissues. The proposed autoencoder neural network used a sigmoid encoder and a linear decoder.
Table 3 Image size and extracted tumor size for ten output images

#Num  Image's name  Image's size  Tumor's size (pixels)
1     B3            240 × 240     3769
2     B4            240 × 240     2136
3     B332          240 × 240     2472
4     B368          240 × 240     2780
5     B412          240 × 240     1879
6     B436          240 × 240     2553
7     B460          240 × 240     1518
8     B469          240 × 240     3078
9     B471          240 × 240     2887
10    B491          240 × 240     2921
Table 4 Evaluation results of the proposed method on the Training image set and Test image set

Evaluation coefficient  Training image set  Test image set
JSI                     0.8327              0.8337
DSC                     0.9887              0.9790
TPF                     0.9088              0.8811
TNF                     0.9895              0.9940
UnS                     1.18                0.658
OvS                     9.11                11.88
InS                     2.09                1.81
Based on the results shown in Table 4, obtained by evaluating the proposed method with several different metrics, it can be concluded that the method performs accurately and is well suited to improving the segmentation of brain MR images. The 97% accuracy obtained for the Dice similarity coefficient (DSC) indicates good performance of the proposed method compared with the other recent works reviewed in Table 7.
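For reference, the two overlap coefficients reported in Table 4 can be computed from a predicted and a ground-truth binary mask as follows. This is the standard formulation of these metrics, not code from the study, and the example masks are hypothetical.

```python
# Standard Dice (DSC) and Jaccard (JSI) overlap coefficients between a
# predicted and a ground-truth binary segmentation mask.
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|)."""
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

def jaccard(pred: np.ndarray, truth: np.ndarray) -> float:
    """JSI = |A ∩ B| / |A ∪ B|."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union

p = np.zeros((240, 240), bool); p[100:140, 100:140] = True
t = np.zeros((240, 240), bool); t[105:145, 102:142] = True
print(f"DSC={dice(p, t):.4f}, JSI={jaccard(p, t):.4f}")
```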
5 Conclusion
Today, machine learning is a powerful tool for solving artificial intelligence and data mining problems, but its success depends on a good feature representation of the raw data; learning algorithms perform poorly when data are poorly represented. Feature extraction is often done manually and requires domain-specific knowledge. Neural networks, inspired by studies of the visual cortex of living organisms, provide representations of raw input data at different levels of abstraction. Medical image processing is one of the most challenging and necessary research areas today. Currently, MR imaging is one of the most important tools for detecting and evaluating non-palpable brain tumors, owing to its clarity and high quality. Tumors in MR images are analyzed by medical experts based on the regions
extracted by segmentation algorithms. MR images are affected by noise, the partial volume effect, orientation fields, and highly variable patterns in the indentations and grooves of the brain, all of which increase the difficulty of slice-wise segmentation.
Table 6 Examples of the output of image segmentation with the proposed autoencoder neural network for patients A (210), B (105), and C (310); each row shows the T1, T1c, T2, and FLAIR inputs and the segmentation result (images not reproduced here).
In brain MR images of patients, tumors and the surrounding fluid often appear diffuse and low-contrast, with a tentacle-like structure. Furthermore, these tumors can be present anywhere
in the brain, with any shape and size. These factors make the detection and segmentation of such tumors challenging for physicians, rendering the analysis a complex and time-consuming task; an intelligent system to aid physicians in this process is therefore highly necessary. In this study, we introduced an autoencoder neural network for the segmentation of brain MR images. The aim was to improve the accuracy of brain MR image segmentation using this network, thereby assisting physicians in detecting and segmenting both benign and malignant brain tumors. The proposed method was trained and tested on the BRATS2020 dataset and proceeded in several steps: preprocessing the input data, segmentation using the autoencoder neural network, and extraction of the regions around the tumor and the tumor type. The evaluation of the proposed method yielded 97% correct diagnosis in terms of the Dice Similarity Coefficient (DSC), demonstrating its good performance compared with other recent works.
6 Future works
Future research could examine additional useful parameters and vary the criteria used in the layers of the autoencoder network to increase the accuracy of the proposed model. Other classification methods, such as region growing and modified clustering methods that incorporate spatial information, could also be evaluated. Additionally, GPU technology and parallel processing could be used to speed up the operations, especially for high-resolution images or when segmentation is performed to reconstruct noisy parts of the brain.
Author contributions All authors wrote the main manuscript text, prepared all figures, reviewed all the results, accepted responsibility for the entire content of this manuscript, consented to its submission to the journal, and approved the final version. First author: conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, resources, software, supervision, validation, visualization, writing – original draft, writing – review & editing. Second author: conceptualization, data curation, formal analysis, methodology, software, writing – review & editing. Third author: conceptualization, formal analysis, methodology, project administration, resources, visualization, writing – review & editing.
Consent to participate Not applicable.
Consent to publish Not applicable.
Funding Not applicable; This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Data availability No datasets were generated or analysed during the current study.
Declarations
Ethics approval and consent to participate Not applicable.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which
permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to
the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You
do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party
material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If
material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds
the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Kavitha P, Dhinakaran D, Prabaharan G, Manigandan MD. Brain tumor detection for efficient adaptation and superior diagnostic precision by utilizing mbconv-finetuned-b0 and advanced deep learning. Int J Intell Eng Syst. 2024. https://doi.org/10.22266/ijies2024.0430.51.
2. Kienzler JC, Becher B. Immunity in malignant brain tumors: tumor entities, role of immunotherapy, and specific contribution of myeloid
cells to the brain tumor microenvironment. Eur J Immunol. 2024;54(2):2250257. https://doi.org/10.1002/eji.202250257.
3. Feng Y, Cao Y, An D, Liu P, Liao X, Yu B. DAUnet: A U-shaped network combining deep supervision and attention for brain tumor segmenta-
tion. Knowl-Based Syst. 2024;285: 111348. https://doi.org/10.1016/j.knosys.2023.111348.
4. Zhu Z, Wang Z, Qi G, Mazur N, Yang P, Liu Y. Brain tumor segmentation in MRI with multi-modality spatial information enhancement and
boundary shape correction. Pattern Recognit. 2024. https://doi.org/10.1016/j.patcog.2024.110553.
5. Cekic E, Pinar E, Pinar M, Dagcinar A. Deep learning-assisted segmentation and classification of brain tumor types on magnetic resonance
and surgical microscope images. World Neurosurg. 2024;182:e196–204. https://doi.org/10.1016/j.wneu.2023.11.073.
6. Zulfiqar F, Bajwa UI, Mehmood Y. Multi-class classification of brain tumor types from MR images using EfficientNets. Biomed Signal Process
Control. 2023;84: 104777. https://doi.org/10.1016/j.bspc.2023.104777.
7. Sahoo AK, Parida P, Muralibabu K, Dash S. Efficient simultaneous segmentation and classification of brain tumors from MRI scans using
deep learning. Biocybern Biomed Eng. 2023;43(3):616–33. https://doi.org/10.1016/j.bbe.2023.08.003.
8. Aamir M, Rahman Z, Abro WA, Bhatti UA, Dayo ZA, Ishfaq M. Brain tumor classification utilizing deep features derived from high-quality
regions in MRI images. Biomed Signal Process Control. 2023;85: 104988. https://doi.org/10.1016/j.bspc.2023.104988.
9. Singh C, Ranade SK, Kaur D, Bala A. A novel approach for brain MRI segmentation and image restoration under intensity inhomogeneity
and noisy conditions. Biomed Signal Process Control. 2024;87: 105348. https://doi.org/10.1016/j.bspc.2023.105348.
10. Feng L, Chen S, Wu B, Liu Y, Tang W, Liu F, Zhang C. Detection of oilseed rape clubroot based on low-field nuclear magnetic resonance
imaging. Comput Electron Agri. 2024. https://doi.org/10.1016/j.compag.2024.108687.
11. Daimary D, Bora MB, Amitab K, Kandar D. Brain tumor segmentation from MRI images using hybrid convolutional neural networks. Pro-
cedia Comput Sci. 2020;167:2419–28. https://doi.org/10.1016/j.procs.2020.03.295.
12. Rao CS, Karunakara K. A comprehensive review on brain tumor segmentation and classification of MRI images. Multimed Tools Appl.
2021;80(12):17611–43. https://doi.org/10.1007/s11042-020-10443-1.
13. Xiao H, Li L, Liu Q, Zhu X, Zhang Q. Transformers in medical image segmentation: a review. Biomed Signal Process Control. 2023;84: 104791.
https://doi.org/10.1016/j.bspc.2023.104791.
14. Yu Y, Wang C, Fu Q, Kou R, Huang F, Yang B, Gao M. Techniques and challenges of image segmentation: a review. Electronics. 2023. https://doi.org/10.3390/electronics12051199.
15. Li P, Pei Y, Li J. A comprehensive survey on design and application of autoencoder in deep learning. Appl Soft Comput. 2023;138: 110176.
https://doi.org/10.1016/j.asoc.2023.110176.
16. Chen S, Guo W. Auto-encoders in deep learning—a review with new perspectives. Mathematics. 2023;11(8):1777. https://doi.org/10.3390/math11081777.
17. Berahmand K, Daneshfar F, Salehi ES, Li Y, Xu Y. Autoencoders and their applications in machine learning: a survey. Artif Intell Rev.
2024;57(2):28. https://doi.org/10.1007/s10462-023-10662-6.
18. Prasshanth CV, Venkatesh SN, Sugumaran V, Aghaei M. Enhancing photovoltaic module fault diagnosis: leveraging unmanned aerial vehicles and autoencoders in machine learning. Sustain Energy Technol Assess. 2024;64:103674. https://doi.org/10.1016/j.seta.2024.103674.
19. Khan SU, Hussain T, Ullah A, Baik SW. Deep-ReID: deep features and autoencoder assisted image patching strategy for person re-identi-
fication in smart cities surveillance. Multimedia Tools Appl. 2024;83(5):15079–100. https://doi.org/10.1007/s11042-020-10145-8.
20. Cui H, Li Y, Wang Y, Xu D, Wu LM, Xia Y. Towards accurate cardiac MRI segmentation with variational autoencoder-based unsupervised
domain adaptation. IEEE Trans Med Imaging. 2024. https://doi.org/10.1109/TMI.2024.3382624.
21. Das A, Mohapatra SK, Pattanaik RK, Tripathy B, Patra GR, Mohanty MN. Target driven autoencoder: a supervised learning approach for tumor segmentation. In: 2024 International Conference on Emerging Systems and Intelligent Computing (ESIC). IEEE; 2024. https://doi.org/10.1109/ESIC60604.2024.10481532.
22. Butt UM, Arif R, Letchmunan S, Malik BH, Butt MA. Feature enhanced stacked auto encoder for diseases detection in brain MRI. Comput Mater Continua. 2023. https://doi.org/10.32604/cmc.2023.039164.
23. Abd El Kader I, Xu G, Shuai Z, Saminu S, Javaid I, Ahmad IS, Kamhi S. Brain tumor detection and classification on MR images by a deep
wavelet auto-encoder model. Diagnostics. 2021;11(9):1589. https://doi.org/10.3390/diagnostics11091589.
24. Hoseini F, Shahbahrami A, Bayat P. An efficient implementation of deep convolutional neural networks for MRI segmentation. J Digit
Imaging. 2018;31(5):738–47. https://doi.org/10.1007/s10278-018-0062-2.
25. Hoseini F, Shahbahrami A, Bayat P. AdaptAhead optimization algorithm for learning deep CNN applied to MRI segmentation. J Digit
Imaging. 2019;32:105–15. https://doi.org/10.1007/s10278-018-0107-6.
26. Mallick PK, Ryu SH, Satapathy SK, Mishra S, Nguyen GN, Tiwari P. Brain MRI image classification for cancer detection using deep wavelet
autoencoder-based deep neural network. IEEE Access. 2019;7:46278–87. https://doi.org/10.1109/ACCESS.2019.2902252.
27. Ullah F, Nadeem M, Abrar M. Revolutionizing brain tumor segmentation in MRI with dynamic fusion of handcrafted features and global
pathway-based deep learning. KSII Trans Int Inf Syst. 2024. https://doi.org/10.3837/tiis.2024.01.007.
28. Amiya G, Murugan PR, Ramaraj K, Govindaraj V, Vasudevan M, Thirumurugan M, Thiyagarajan A. Expeditious detection and segmenta-
tion of bone mass variation in DEXA images using the hybrid GLCM-AlexNet approach. Soft Comput. 2024. https://doi.org/10.1007/
s00500-024-09900-y.
29. Subramanian RR, Govindaraj V. HARNet: design and evaluation of a deep genetic algorithm for recognizing yoga postures. Signal Image
Video Process. 2024. https://doi.org/10.1007/s11760-024-03173-6.
30. Amiya G, Murugan PR, Ramaraj K, Govindaraj V, Vasudevan M, Thirumurugan M, Thiyagarajan A. LMGU-NET: methodological intervention
for prediction of bone health for clinical recommendations. J Supercomput. 2024. https://doi.org/10.1007/s11227-024-06048-2.
31. Shi F, Wang J, Govindaraj V. SGS: SqueezeNet-guided Gaussian-kernel SVM for COVID-19 diagnosis. Mobile Netw Appl. 2024. https://doi.org/10.1007/s11036-023-02288-3.
32. Nisha AV, Rajasekaran MP, Kottaimalai R, Vishnuvarthanan G, Arunprasath T, Muneeswaran V. Hybrid d-ocapnet: automated multi-
class Alzheimer’s disease classification in brain mri using hybrid dense optimal capsule network. Int J Pattern Recognit Artif Intell.
2023;37(15):2356025. https://doi.org/10.1142/S0218001423560256.
33. Zhang YD, Govindaraj V, Zhu Z. FECNet: a neural network and a mobile app for COVID-19 recognition. Mobile Netw Appl. 2023. https://doi.org/10.1007/s11036-023-02140-8.
34. Anwar RW, Abrar M, Ullah F. Transfer learning in brain tumor classification: challenges, opportunities, and future prospects. In: 2023 14th International Conference on Information and Communication Technology Convergence (ICTC). IEEE; 2023. https://doi.org/10.1109/ICTC58733.2023.1039283.
35. Ullah F, Nadeem M, Abrar M, Amin F, Salam A, Alabrah A, AlSalman H. Evolutionary model for brain cancer-grading and classification. IEEE
Access. 2023. https://doi.org/10.1109/ACCESS.2023.3330919.
36. Ullah F, Nadeem M, Abrar M, Amin F, Salam A, Khan S. Enhancing brain tumor segmentation accuracy through scalable federated learning
with advanced data privacy and security measures. Mathematics. 2023;11(19):4189. https://doi.org/10.3390/math11194189.
37. Ullah F, Nadeem M, Abrar M, Al-Razgan M, Alfakih T, Amin F, Salam A. Brain tumor segmentation from MRI images using handcrafted
convolutional neural network. Diagnostics. 2023;13(16):2650. https://doi.org/10.3390/diagnostics13162650.
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.