F.SH: A 3D Recurrent Residual Attention U-Net For Automated Multiple Sclerosis Lesion Segmentation
Abstract:- Multiple sclerosis (MS) is an autoimmune disease affecting the central nervous system, characterized by lesions in the brain and spinal cord. Accurate detection and localization of these lesions on MRI scans are crucial for diagnosis and for monitoring disease progression. Manual segmentation is time-consuming and prone to inter-rater variability. This study proposes F.sh (3DR2AUNet), a novel deep learning architecture for automated MS lesion segmentation. F.sh combines 3D recurrent residual blocks, attention gates, and the U-Net structure to effectively capture lesion features. The model was trained and evaluated using patch-based preprocessing, data augmentation, and a composite loss function combining Binary Cross-Entropy and 3D Dice Loss. Experimental results demonstrate the superior performance of F.sh compared to baseline methods, achieving a Dice score of 0.92. The proposed approach has the potential to assist radiologists in the accurate and efficient assessment of MS lesion burden.

I. INTRODUCTION

Multiple sclerosis (MS) is a chronic autoimmune disorder that attacks the central nervous system, leading to the formation of focal lesions in the brain and spinal cord [1]. These lesions, also known as plaques, are visible on magnetic resonance imaging (MRI) scans. In T2-weighted and FLAIR sequences, MS lesions appear as hyperintense regions, while in T1-weighted images with gadolinium contrast, they present as incomplete bright rings [2]. Lesions can occur in periventricular, infratentorial, white matter, and juxtacortical regions of the brain.

Accurate detection and localization of MS lesions on MRI scans are essential for diagnosis, monitoring disease progression, and evaluating treatment efficacy. The McDonald criteria, which rely on the number and location of lesions, play a crucial role in the definitive diagnosis of MS [3]. However, manual segmentation of lesions is a time-consuming and subjective task, prone to inter-rater variability.

Automated MS lesion segmentation using image processing and artificial intelligence techniques has the potential to improve the accuracy and efficiency of lesion assessment. Deep learning, particularly convolutional neural networks (CNNs) [4], has shown remarkable success in medical image segmentation tasks [5]. In this study, we propose F.sh (3DR2AUNet), a novel deep learning architecture specifically designed for MS lesion segmentation. F.sh combines 3D recurrent residual blocks, attention gates, and the U-Net structure to effectively capture lesion features and achieve accurate segmentation results.

II. MATERIALS AND METHODS

A. Dataset

The dataset used in this study consists of MRI scans from 70 MS patients. The scans were acquired using a Siemens Avanto 1.5 Tesla MRI scanner with a twelve-channel head coil. FLAIR sequences were obtained with dimensions of 181x217x181 voxels and stored in NIfTI format. Corresponding ground truth lesion masks were provided for each scan.

B. Pre-Processing

The MRI scans were pre-processed using the following steps:

Normalization: Intensity values were scaled to the range [0, 1].
Patch extraction: 3D patches of size 64x64x64 with a stride of 32 were extracted from the normalized scans.
Data augmentation: Random rotation, flipping, and elastic deformation were applied to the patches to increase training data diversity.

This patch-based approach offers several advantages:

Memory efficiency: Enables processing of high-resolution 3D volumes on GPUs with limited memory.
Data augmentation: Effectively increases the number of training samples.
Local context: Focuses the model on learning local features.
Class imbalance reduction: Mitigates the severe class imbalance problem in MS lesion segmentation by selecting patches containing lesions.

C. F.sh (3DR2AUNet) Architecture

F.sh is a 3D CNN that combines recurrent residual blocks (R2CL), attention gates, and the U-Net structure for MS lesion segmentation. The key components of F.sh are:
3D Recurrent Residual Convolutional Layer (3DR2CL): This layer incorporates two 3D convolutional layers with batch normalization and ReLU activation. It also includes a 3D residual connection and a 3D recurrent connection. The 3DR2CL enhances 3D feature extraction and facilitates gradient flow, allowing for more effective learning of complex spatial relationships in the MRI data.
3D Attention Gates: Integrated into the decoder path, these gates focus on relevant 3D features and suppress irrelevant ones. This mechanism improves the model's ability to capture small lesions in 3D space by emphasizing important spatial information while reducing the impact of background noise.
3D U-Net Structure: The overall architecture follows a 3D U-Net design, consisting of an encoder path, a bridge, and a decoder path. The encoder path comprises four 3DR2CL blocks with 3D max pooling, progressively reducing spatial dimensions while increasing feature depth. The bridge connects the encoder and decoder, maintaining high-level feature representations. The decoder path includes four 3D upsampling blocks with 3D attention gates and 3DR2CL blocks, gradually recovering spatial information. 3D skip connections between corresponding encoder and decoder levels facilitate the integration of low-level and high-level features.

Figure 1 illustrates the complete F.sh architecture, showing the connections between the various components [6].
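
To make the two building blocks of Section II.C more concrete, the following is a minimal PyTorch sketch of a 3D recurrent residual block and a 3D additive attention gate. It is an illustration under stated assumptions, not the authors' implementation: the class names, the number of recurrence steps (t = 2), the 1x1x1 projection on the residual path, the channel widths, and the additive form of the gate are all choices made for this example and are not specified in the paper.

```python
import torch
import torch.nn as nn


class RecurrentConv3d(nn.Module):
    """Conv3d -> BatchNorm -> ReLU, re-applied t times with shared weights."""

    def __init__(self, channels, t=2):  # t = 2 is an assumed value
        super().__init__()
        self.t = t
        self.conv = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm3d(channels),
            nn.ReLU(),
        )

    def forward(self, x):
        h = self.conv(x)
        for _ in range(self.t):
            # Recurrent connection: the previous state is added to the input.
            h = self.conv(x + h)
        return h


class R2Block3d(nn.Module):
    """Two recurrent conv layers wrapped in a residual connection."""

    def __init__(self, in_ch, out_ch, t=2):
        super().__init__()
        # Assumed 1x1x1 projection so the residual sum has matching channels.
        self.project = nn.Conv3d(in_ch, out_ch, kernel_size=1)
        self.body = nn.Sequential(
            RecurrentConv3d(out_ch, t), RecurrentConv3d(out_ch, t)
        )

    def forward(self, x):
        x = self.project(x)
        return x + self.body(x)


class AttentionGate3d(nn.Module):
    """Additive attention gate: scales skip features by a map in [0, 1]."""

    def __init__(self, gate_ch, skip_ch, inter_ch):
        super().__init__()
        self.w_g = nn.Conv3d(gate_ch, inter_ch, kernel_size=1)
        self.w_x = nn.Conv3d(skip_ch, inter_ch, kernel_size=1)
        self.psi = nn.Sequential(
            nn.Conv3d(inter_ch, 1, kernel_size=1), nn.Sigmoid()
        )

    def forward(self, g, x):
        # g: decoder (gating) features, assumed already upsampled to x's size.
        alpha = self.psi(torch.relu(self.w_g(g) + self.w_x(x)))
        return x * alpha  # the single-channel map broadcasts over channels


# Toy usage on a small random volume (sizes are illustrative only).
skip = R2Block3d(1, 8)(torch.randn(1, 1, 16, 16, 16))
gated = AttentionGate3d(8, 8, 4)(skip, skip)  # skip reused as a stand-in gate
print(skip.shape, gated.shape)
```

Because the attention coefficients lie between 0 and 1, the gate can only attenuate skip features, which is how background regions are suppressed before they are merged into the decoder.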
Fig 1: F.sh (3DR2AUNet) architecture. The diagram shows the encoder path (left), bridge (center), and decoder path (right), highlighting the 3DR2CL blocks, attention gates, and skip connections.
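
The patch extraction step of Section II.B (64x64x64 patches with a stride of 32) can be worked through for the reported 181x217x181 volume size. The sketch below only enumerates patch corners. It assumes that patches must fit entirely inside the volume and that no padding is applied, neither of which is stated in the paper, and the function names are hypothetical.

```python
from itertools import product


def patch_starts(dim, patch=64, stride=32):
    """Start indices of patches that fit entirely inside an axis of length dim."""
    return list(range(0, dim - patch + 1, stride))


def patch_grid(shape, patch=64, stride=32):
    """All start corners of a sliding-window patch grid over a 3D volume."""
    return list(product(*(patch_starts(d, patch, stride) for d in shape)))


shape = (181, 217, 181)  # FLAIR volume size reported in the Dataset section
per_axis = [len(patch_starts(d)) for d in shape]
corners = patch_grid(shape)
print(per_axis, len(corners))  # [4, 5, 4] 80
```

Under these assumptions each scan yields 80 overlapping patches. The last patch along the 181-voxel axes ends at index 160, so a thin slab at the far edge of the volume is left uncovered unless the volume is padded or an extra edge-aligned patch is added.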
D. Loss Function and Evaluation

The model uses a weighted combination of two loss functions:

Binary Cross-Entropy (BCE):

$$\mathrm{BCE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right]$$

where y_i is the true label and p_i is the predicted probability for voxel i, and N is the number of voxels.

3D Dice Loss:

$$\mathrm{Dice\ Loss} = 1 - \frac{2 \sum_i p_i g_i + \epsilon}{\sum_i p_i + \sum_i g_i + \epsilon}$$

where p_i are the predicted lesion voxels, g_i are the true lesion voxels, and ε is a small constant to prevent division by zero.

Evaluation metrics include accuracy, sensitivity (recall), specificity, precision, F1 score, and 3D Dice score.

E. Training and Optimization

F.sh was implemented using the PyTorch deep learning framework [7] and trained on an NVIDIA GTX 1650 GPU with 8GB memory. The model was optimized using the Adam optimizer with an initial learning rate of 0.0001. A learning rate scheduler (ReduceLROnPlateau) monitored the validation loss and reduced the learning rate by a factor of 0.5 if no improvement was observed. The composite loss function combined BCE and Dice loss with equal weights of 0.5. The model was trained for 70 epochs.
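
As a worked example of the composite loss, the sketch below evaluates the BCE and Dice loss terms of this section on a four-voxel toy case and combines them with equal weights of 0.5. It is written in plain Python so the arithmetic is visible, and it is not the authors' PyTorch code. The function names, the clamping constant, the value of ε, and the toy predictions are assumptions for illustration.

```python
import math


def bce_loss(p, y, clamp=1e-7):
    """Mean binary cross-entropy; p is clamped to avoid log(0)."""
    total = 0.0
    for pi, yi in zip(p, y):
        pi = min(max(pi, clamp), 1.0 - clamp)
        total += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return -total / len(p)


def dice_loss(p, g, eps=1e-6):
    """Soft Dice loss with eps in numerator and denominator."""
    intersection = sum(pi * gi for pi, gi in zip(p, g))
    return 1.0 - (2.0 * intersection + eps) / (sum(p) + sum(g) + eps)


def composite_loss(p, y, w_bce=0.5, w_dice=0.5):
    """Weighted sum of the two terms; equal weights as stated in the text."""
    return w_bce * bce_loss(p, y) + w_dice * dice_loss(p, y)


labels = [1, 0, 0, 0]          # one lesion voxel among four (toy data)
probs = [0.8, 0.1, 0.2, 0.1]   # hypothetical predicted probabilities

print(round(bce_loss(probs, labels), 4))
print(round(dice_loss(probs, labels), 4))
print(round(composite_loss(probs, labels), 4))
```

In this toy case the Dice term is larger than the BCE term even though three of four voxels are confidently correct, because Dice only counts overlap on the lesion class. This is the usual motivation for pairing the two terms when lesion voxels are rare.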
Fig 2: Qualitative segmentation results. (A) Axial view, (B) coronal view, (C) sagittal view. For each view: left - original FLAIR MRI, middle - ground truth lesion mask, right - F.sh prediction. The model accurately identifies lesions across different brain regions and orientations.
C. Training Progress
Figures 3 and 4 illustrate the training progress over 70 epochs.
Fig 3: Training and validation loss curves. The graph shows the Binary Cross-Entropy (BCE) loss, Dice loss, and total loss for both training and validation sets over 70 epochs. The smooth convergence of these curves indicates stable and effective learning. The training loss (solid lines) consistently decreases, while the validation loss (dashed lines) shows a similar trend with slight fluctuations, suggesting good generalization without overfitting.
Fig 4: Performance metrics during training. This graph displays the evolution of accuracy, sensitivity (recall), specificity, precision, and F1 score over 70 epochs. All metrics show consistent improvement throughout the training process. The rapid initial increase in the first 10-15 epochs demonstrates the model's quick learning of basic features. The subsequent gradual improvement indicates refinement of the model's ability to distinguish subtle lesion characteristics. By the final epoch, the model achieves high values across all metrics, with specificity reaching near-perfect levels, highlighting the model's ability to avoid false positives.
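
The metrics plotted in Figure 4 can all be derived from voxel-wise confusion counts. The sketch below shows one way to compute them. The counts are hypothetical values for a single 64x64x64 patch, chosen only to illustrate the arithmetic, and are not results from this study. The function name is also an assumption, and the code assumes every denominator is non-zero.

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Voxel-wise metrics from confusion counts (denominators assumed non-zero)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "dice": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical patch: 64**3 = 262144 voxels, 1000 of them lesion.
tp, fn, fp = 900, 100, 100
tn = 64 ** 3 - tp - fn - fp

metrics = segmentation_metrics(tp, fp, tn, fn)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Two points follow from the formulas. For binary masks the F1 score and the Dice score are the same quantity, and specificity and accuracy are dominated by the large number of background voxels, which is consistent with those two curves rising faster than sensitivity and precision.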
The training and validation loss curves (Figure 3) show smooth convergence, indicating stable and effective learning. The initial rapid decrease in loss is followed by a more gradual improvement, suggesting that the model quickly learns basic features and then refines its ability to capture more subtle lesion characteristics.

The performance metrics (Figure 4) demonstrate consistent improvement across all evaluation criteria throughout the training process. The accuracy and specificity curves show a steep initial increase, indicating that the model quickly learns to correctly classify the majority of non-lesion voxels. The sensitivity, precision, and F1 score curves show a more gradual improvement, reflecting the challenge of accurately identifying and delineating lesions, which often represent a small fraction of the total brain volume.

By the final epoch, the model achieves high accuracy, sensitivity, and specificity, with specificity reaching near-perfect levels. This indicates that F.sh is highly capable of distinguishing between lesion and non-lesion voxels, with a very low false-positive rate.

IV. DISCUSSION

The experimental results highlight the effectiveness of F.sh for automated MS lesion segmentation. The proposed architecture successfully addresses the challenges associated with lesion heterogeneity, small size, and low contrast in 3D MRI scans. The combination of 3D recurrent residual blocks, 3D attention gates, and the 3D U-Net structure enables F.sh to capture fine-grained lesion features and achieve accurate segmentation in three-dimensional space.

The high 3D Dice score (0.92) and sensitivity (0.90) obtained by F.sh indicate its potential to assist radiologists in the accurate detection and localization of MS lesions. By automating the 3D lesion segmentation process, F.sh can reduce the time and effort required for manual delineation and improve the reproducibility of lesion assessment. The model's ability to handle 3D FLAIR MRI scans further enhances its clinical applicability.

The learning curves and performance metrics over the training epochs demonstrate the model's stable learning process and consistent improvement. The high specificity (0.9998) indicates that F.sh is highly capable of avoiding false positives, which is crucial in clinical settings to prevent overdiagnosis.

However, there are limitations to consider. The dataset used in this study is relatively small (70 patients), and further validation on larger and more diverse cohorts is necessary to assess the generalizability of F.sh. Additionally, the model's performance may be affected by variations in MRI acquisition protocols and scanners, requiring further investigation and potential adaptations.

A. Validity and Reliability

The validity and reliability of the F.sh model are critical aspects to consider when evaluating its potential for clinical application. In terms of validity, the high performance metrics achieved by F.sh, particularly the Dice score of 0.92, indicate strong concurrent validity when compared to expert manual segmentations. The model's ability to accurately identify lesions across various brain regions (periventricular, juxtacortical, and infratentorial) further supports its construct validity in capturing the diverse manifestations of MS lesions.

To assess reliability, future work should include test-retest experiments, where the same MRI scans are processed multiple times by F.sh to evaluate the consistency of its segmentations. Additionally, inter-rater reliability studies comparing F.sh's performance to multiple human raters would provide valuable insights into the model's consistency relative to expert variability.

The generalizability of F.sh to different scanner types, field strengths, and patient populations is an important aspect of its external validity. While the current study demonstrates promising results on a dataset of 70 patients, further validation on larger and more diverse cohorts is necessary to establish the model's broader applicability and reliability across various clinical settings.

To enhance the model's validity and reliability, future work could explore the following:

Multi-center validation studies to assess performance across different institutions and scanner types.
Longitudinal studies to evaluate the model's consistency in tracking lesion changes over time.
Comparison with other automated segmentation methods to benchmark F.sh's performance against state-of-the-art techniques.
Integration of uncertainty quantification methods to provide confidence measures for the model's predictions, enhancing its interpretability and reliability in clinical decision-making.

By addressing these aspects of validity and reliability, F.sh can be further developed into a robust and trustworthy tool for automated MS lesion segmentation in clinical practice.

V. CONCLUSION

In this study, we proposed F.sh (3DR2AUNet), a novel 3D deep learning architecture for automated MS lesion segmentation. F.sh combines 3D recurrent residual blocks, 3D attention gates, and the 3D U-Net structure to effectively capture lesion features in three-dimensional space and achieve accurate segmentation results. Experimental evaluation on a dataset of 70 MS patient FLAIR MRI scans demonstrates the superior performance of F.sh compared to baseline methods.
The high 3D Dice score, sensitivity, and specificity obtained by F.sh highlight its potential to assist radiologists in the accurate and efficient assessment of MS lesion burden. By automating the 3D lesion segmentation process, F.sh can improve the reproducibility and objectivity of lesion assessment, ultimately contributing to enhanced diagnosis and monitoring of MS.

Future work includes further validation of F.sh on larger and more diverse datasets, investigating its robustness to variations in MRI acquisition protocols, and exploring its integration into clinical workflows. The incorporation of additional MRI sequences, such as T1-weighted and T2-weighted scans, may further improve the model's segmentation performance.

In conclusion, F.sh represents a promising approach for automated 3D MS lesion segmentation, combining advanced deep learning techniques to achieve accurate and reliable results in three-dimensional space. With further validation and refinement, F.sh has the potential to become a valuable tool in the clinical management of MS, assisting in diagnosis, monitoring disease progression, and evaluating treatment efficacy.

REFERENCES

[1]. Reich, D. S., Lucchinetti, C. F., & Calabresi, P. A. (2018). Multiple sclerosis. New England Journal of Medicine, 378(2), 169-180.
[2]. Filippi, M., Preziosa, P., Banwell, B. L., Barkhof, F., Ciccarelli, O., De Stefano, N., ... & Rocca, M. A. (2019). Assessment of lesions on magnetic resonance imaging in multiple sclerosis: practical guidelines. Brain, 142(7), 1858-1875.
[3]. Thompson, A. J., Banwell, B. L., Barkhof, F., Carroll, W. M., Coetzee, T., Comi, G., ... & Cohen, J. A. (2018). Diagnosis of multiple sclerosis: 2017 revisions of the McDonald criteria. The Lancet Neurology, 17(2), 162-173.
[4]. Akkus, Z., Galimzianova, A., Hoogi, A., Rubin, D. L., & Erickson, B. J. (2017). Deep learning for brain MRI segmentation: state of the art and future directions. Journal of Digital Imaging, 30(4), 449-459.
[5]. Ronneberger, O., Fischer, P., & Brox, T. (2015, October). U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 234-241). Springer, Cham.
[6]. Oktay, O., Schlemper, J., Folgoc, L. L., Lee, M., Heinrich, M., Misawa, K., ... & Rueckert, D. (2018). Attention u-net: Learning where to look for the pancreas. arXiv preprint arXiv:1804.03999.
[7]. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).
[8]. Li, H., Xiong, P., An, J., & Wang, L. (2018). Pyramid attention network for semantic segmentation. arXiv preprint arXiv:1805.10180.
[9]. Valanarasu, J. M. J., Sindagi, V. A., Hacihaliloglu, I., & Patel, V. M. (2021). KiU-Net: Towards accurate segmentation of biomedical images using over-complete representations. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 363-373). Springer, Cham.
[10]. Taghanaki, S. A., Abhishek, K., Cohen, J. P., Cohen-Adad, J., & Hamarneh, G. (2021). Deep semantic segmentation of natural and medical images: a review. Artificial Intelligence Review, 54(1), 137-178.
[11]. Gros, C., Lemay, A., Cohen-Adad, J., & Guttmann, C. R. (2021). Automatic segmentation of multiple sclerosis lesions using 3D residual fully convolutional neural networks. NeuroImage: Clinical, 29, 102541.
[12]. Zhang, J., Wang, Y., Wang, Z., & Zhang, J. (2021). MS-Net: Multi-site network for improving deep learning with limited training data on lesion segmentation in multiple sclerosis. NeuroImage, 237, 118155.
[13]. Kamnitsas, K., Ledig, C., Newcombe, V. F., Simpson, J. P., Kane, A. D., Menon, D. K., ... & Glocker, B. (2017). Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Medical Image Analysis, 36, 61-78.
[14]. La Rosa, F., Fartaria, M. J., Abdulkadir, A., Rahmanzadeh, R., Lu, P. J., Galbusera, R., ... & Bach Cuadra, M. (2021). Multiple sclerosis cortical and WM lesion segmentation at 3T MRI: a deep learning method based on FLAIR and MP2RAGE. NeuroImage: Clinical, 31, 102736.
[15]. Nair, T., Precup, D., Arnold, D. L., & Arbel, T. (2020). Exploring uncertainty measures in deep networks for multiple sclerosis lesion detection and segmentation. Medical Image Analysis, 59, 101557.