Presentation New

The project focuses on enhancing cataract diagnosis using transfer learning techniques, aiming to develop an automated and accurate detection system through AI and deep learning. It addresses current limitations in datasets and diagnostic methods while introducing innovations such as synthetic data generation and explainable AI techniques like Grad-CAM. The project outlines objectives, methodologies, and preliminary results indicating the potential for improved cataract detection and patient outcomes.

School of Computer Science and Engineering

Project-I Review - Presentation

BCSE497J (Project-I)
Title of Project: Enhancing Cataract Diagnosis with Transfer Learning Techniques

Student Details:
Anusha Garg - 21BCE0412
Mitali Gopinath Paul - 21BCE2447
Suhani Bhatnagar - 21BCE3129

Faculty Guide:
Dr. Shashank Mouli Satapathy - Professor Grade-I

Date of Presentation: 13-11-2024

Introduction

Cataracts, a leading cause of visual impairment globally, occur when the eye's lens
becomes cloudy, resulting in blurred vision and potential blindness if left
untreated. Traditional diagnostic methods rely on manual evaluation by healthcare
professionals, which can be slow and inconsistent, particularly in regions with
limited access to specialized care. The rise of artificial intelligence (AI) and deep
learning (DL) offers a transformative solution, enabling automated and highly
accurate cataract detection. Leveraging advanced techniques such as convolutional
neural networks (CNNs), transfer learning, and ensemble models, AI-driven
systems can analyze medical images efficiently, reducing the need for manual
intervention and improving diagnostic precision. This project aims to harness these
technologies to develop a robust, early detection system for cataracts, enhancing
patient outcomes through faster and more reliable diagnoses.
Literature Survey

• Transfer learning with pre-trained models such as VGGNet [15], ResNet [8], MobileNet [19], and Inception is widely applied to cataract
detection, often combined with SVM classifiers [18] to enhance performance, with studies reporting up to 98.17% accuracy. Hybrid
models built on these architectures reach accuracies such as 95.65% and 92.91% with reduced data requirements [1][5].

• Ensemble learning that combines multiple CNN models [3], such as AlexNet [9], InceptionV3 [22], and ResNet [7][18], improves cataract
classification accuracy, reaching up to 99.20%. Hybrid methods, such as CNN-RNN combinations, further improve performance, with F1
scores reaching 95.90% and accuracies of 97.39% [4].

• Several studies introduce advanced techniques for enhancing feature extraction in cataract classification. These include object detection
and multi-task learning [6] to focus on key areas, portable diagnostic systems using Inception-v3, real-time grading with YOLOv3 [24][25]
and DenseNet [17], and the efficient CataractNet [11] model with reduced computational cost [2].

• XAI techniques like Grad-CAM [20] enhance model transparency by generating visual heatmaps that highlight decision-relevant areas in
cataract detection. DeepSurgery uses AI to supervise cataract surgeries in real time, improving surgical accuracy and outcomes [14].
Research Gaps

Dataset Limitations:
• Most studies rely on small, homogeneous datasets. While these are effective for the reported experiments, larger and more
varied datasets are needed to improve model accuracy and robustness.
Focus on Binary Classification:
• Current models mostly separate cataract from normal eyes and often overlook the subtleties of different cataract types and
severity levels, which limits generalizability to diverse populations and real-world conditions.
Real-Time & Portable Diagnostic Challenges:
• Real-time, portable tools remain difficult to integrate, especially in resource-constrained settings.
Novelties

Synthetic Data Generation using GANs

• Implementing GANs (Generative Adversarial Networks) to generate synthetic fundus images that simulate different
conditions (cataract, non-cataract) and augment the existing datasets.
• Impact: This could help overcome the challenge of limited datasets by giving the model a wider variety of cases to
learn from, improving accuracy and generalization.

Using Grad-CAM for Cataract Severity Prediction

• Implementing Grad-CAM (Gradient-weighted Class Activation Mapping) to generate heatmaps that visualize the regions in
fundus images contributing most to the model's predictions.
• Impact: The heatmaps could make predictions easier to interpret by showing which image regions drove each decision.

Adding Custom Layers to the InceptionResNetV2 Architecture

• Using transfer learning, the InceptionResNetV2 base model is customized with the following layers: Global Average
Pooling, Batch Normalization, Dropout, and Dense layers with L2 regularization.
• Impact: These layers are intended to improve generalization, reduce overfitting, and keep the classification head efficient.

Using New Pre-Processing Techniques

• CLAHE: Enhances contrast in low-contrast areas while limiting over-amplification of noise.
• Wavelet Denoising: Reduces noise, particularly illumination noise and small artifacts, while preserving image details.
• Impact: These techniques are intended to improve image quality, leading to more accurate and robust models.
Dataset

Dataset 1: Cataract Image Dataset

This dataset is available through GitHub and has been contributed by jr2ngb. It can be accessed
at: https://github.com/yiweichen04/retina_dataset

Class-wise Distribution:

Class              Number of Images
Normal             300
Cataract           100
Total for Study    400

Dataset 2: Cataract DR Normal Glaucoma Fundus Images Dataset

This dataset is contributed by Dr. S. K. Prabhakar and is hosted in a public repository.

Class-wise Distribution:

Class              Number of Images
Normal             1039
Cataract           1075
Total for Study    2114

Dataset 3: Immature Cataract Fundus Images

This dataset is provided by Telkom University’s Dataverse and can be accessed
via DOI: 10.34820/FK2/CDWESA. Contributors include Sofia Sa'idah, Rita
Magdalena, and Yunendah Nur Fuadah.
Class-wise Distribution:

Class                Number of Images
Immature Cataract    802
Total for Study      802

Dataset 4: Eye Disease Image Dataset

The dataset was developed with the help of contributors from Daffodil International University and
Jahangirnagar University. It is available under the CC BY 4.0 license and can be accessed via DOI:
10.17632/s9bfhswzjb.1.
Class-wise Distribution:

Class              Number of Images
Healthy Eye        2676
Total for Study    2676
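
The four class-wise tables above can be tallied to see the overall class balance. The sketch below assumes, purely for illustration, that all four datasets are pooled and that "Healthy Eye" counts as normal and "Immature Cataract" as cataract; the slides do not state how the datasets were combined. The inverse-frequency class weights are likewise an illustrative option, not a step reported in this project.

```python
# Class counts copied from the four dataset tables above.
datasets = {
    "Cataract Image Dataset": {"normal": 300, "cataract": 100},
    "Cataract DR Normal Glaucoma Fundus Images": {"normal": 1039, "cataract": 1075},
    "Immature Cataract Fundus Images": {"cataract": 802},   # assumed to map to "cataract"
    "Eye Disease Image Dataset": {"normal": 2676},          # "Healthy Eye" assumed to map to "normal"
}

# Pool the counts per class (pooling itself is an assumption).
totals = {"normal": 0, "cataract": 0}
for counts in datasets.values():
    for label, n in counts.items():
        totals[label] += n

n_images = sum(totals.values())

# Inverse-frequency class weights: n_images / (n_classes * class_count).
class_weights = {label: n_images / (len(totals) * n) for label, n in totals.items()}

print(totals, n_images, class_weights)
```

Under these assumptions the pooled data holds roughly twice as many normal images as cataract images, which is one reason to consider class weighting or synthetic augmentation of the minority class.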
Objectives

The objectives of this project are to:

• Explore and select a pre-trained deep learning model (e.g., InceptionResnetV2, VGG16, ResNet50) for transfer learning,
identifying the most suitable model for cataract detection.
• Fine-tune the selected pre-trained model on the cataract dataset to optimize performance.
• Utilize synthetic data generation through Generative Adversarial Networks (GANs) to augment the dataset and enhance
model robustness.
• Generate Explainable AI (XAI) heatmaps using Grad-CAM to highlight critical regions in the eye images, ensuring model
interpretability.
• Evaluate the model’s performance using metrics such as accuracy, precision, recall, and F1 score to ensure accuracy
and reliability.
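
The evaluation metrics named in the last objective can all be derived from the four confusion-matrix counts of a binary classifier. The following is a minimal sketch; the function name and the example counts are hypothetical and are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only ("cataract" is the positive class).
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(metrics)
```

For a screening task, recall on the cataract class is usually worth reporting alongside accuracy, since a missed cataract (false negative) is typically costlier than a false alarm.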
Project Plan
Hardware/Software Requirements

Software Specifications
• Windows 10 operating system
• Python programming language
• Google Colab (T4 GPU)
• Key libraries and frameworks: TensorFlow, PyTorch, OpenCV, NumPy, Matplotlib (for visualization)
• Database: Publicly available datasets

Hardware Specifications
• Processor
• Memory (RAM)
• Storage
• Graphics Processing Unit (GPU)
System Architecture
• Data Input: Fundus images of the eye, possibly with metadata/labels (CSV
format).
• Data Preprocessing: Images are normalized for scale and pixel values to
ensure consistency.
• Cloud Storage: Normalized data is stored in the cloud for scalable access
and processing.
• Model Training: The machine learning model is trained on the fundus
images to identify patterns associated with cataract formation.
• Model Parameters Storage: Trained model parameters and related
information are stored in a data repository for future use.
• Testing & Predictions: Unseen test data is input to the system, and the
trained model generates predictions to detect cataracts.
• Cloud-Based Workflow: The architecture leverages cloud storage for
efficient training, testing, and diagnosis.
Design - GAN
• Neural Networks: Two key networks are involved: the Generator and the
Discriminator.
• Generator's Role: The generator takes a random vector or matrix as input and
generates a "fake" sample that mimics the real data.
• Discriminator's Role: The discriminator evaluates whether a given sample is "real"
(from the actual training dataset) or "fake" (produced by the generator).
• Training Process: Both networks are trained together in an adversarial setup:
   1. First, the discriminator is trained for a few epochs to improve its ability to
   distinguish between real and fake samples.
   2. Next, the generator is trained for a few epochs to get better at generating
   realistic samples.
• Adversarial Improvement: As training progresses, the generator gets better at
producing convincing fakes, while the discriminator becomes more accurate at
identifying real vs. fake samples, leading to mutual improvement over time.
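
The adversarial setup described above can be made concrete through the two loss functions. The sketch below assumes the standard binary cross-entropy GAN losses; the slides do not state which loss the project used, and the function names are illustrative. It operates on discriminator output probabilities only, so it shows the objective rather than a full training loop.

```python
import numpy as np


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and targets."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(np.mean(-(target * np.log(pred)
                           + (1 - target) * np.log(1 - pred))))


def discriminator_loss(d_real, d_fake):
    """Real samples should be scored 1, generated samples 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))


def generator_loss(d_fake):
    """The generator wants its samples to be scored as real (1)."""
    return bce(d_fake, np.ones_like(d_fake))


# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
undecided = np.full(8, 0.5)
d_loss = discriminator_loss(undecided, undecided)
g_loss = generator_loss(undecided)
print(d_loss, g_loss)
```

With these losses, a discriminator that outputs 0.5 for every sample gives a discriminator loss of 2 ln 2 and a generator loss of ln 2. Losses hovering near such values are one sign that neither network is overpowering the other.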
Implementation Progress - GAN

[Figure: Generated cataract images at 100 epochs]
[Figure: Generated normal images at 100 epochs]
[Figure: Generated cataract images at 200 epochs, showing overfitting]
Design - Model

[Figure: InceptionResNetV2 architecture]

Original InceptionResNetV2 Architecture:

• Convolutional Layers: Extract features from the input image.
• Inception Modules: Combine different-sized convolutions to capture features at various scales.
• Residual Connections: Help the network learn deeper representations without suffering from the vanishing gradient
problem.
• Pooling Layers: Downsample the feature maps to reduce computational cost and extract global information.
• Fully Connected Layers: Classify the extracted features into specific categories.

New Layers Added:

• GlobalAveragePooling2D: Reduces each feature map to a single value, lowering the number of parameters in the head.
• Dense(128, activation='relu'): Learns complex features with 128 units using ReLU.
• Dense(64, activation='relu'): Refines features with 64 units, adding ReLU non-linearity.
• Dense(1, activation='sigmoid'): Outputs a probability for cataract classification.
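
To show how the added layers transform the backbone output, here is a NumPy-only sketch of the head's forward pass. It is not the project's Keras model: the weights are random, the `dense` helper is hypothetical, and the backbone output shape of 5x5x1536 (for a 224x224 input) is an assumption stated here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(x, units, activation):
    """Fully connected layer with random weights (illustration only).

    Returns the layer output and the number of trainable parameters.
    """
    w = rng.normal(0.0, 0.05, size=(x.shape[-1], units))
    b = np.zeros(units)
    z = x @ w + b
    n_params = w.size + b.size
    if activation == "relu":
        return np.maximum(z, 0.0), n_params
    if activation == "sigmoid":
        return 1.0 / (1.0 + np.exp(-z)), n_params
    raise ValueError(f"unsupported activation: {activation}")


# Assumed backbone output: a batch of 2 feature volumes of shape 5x5x1536.
features = rng.normal(size=(2, 5, 5, 1536))

pooled = features.mean(axis=(1, 2))        # GlobalAveragePooling2D -> (2, 1536)
h1, p1 = dense(pooled, 128, "relu")        # Dense(128, relu)
h2, p2 = dense(h1, 64, "relu")             # Dense(64, relu)
prob, p3 = dense(h2, 1, "sigmoid")         # Dense(1, sigmoid) -> cataract probability
head_params = p1 + p2 + p3

print(pooled.shape, prob.shape, head_params)
```

Under the assumed 1536-channel backbone output, the head adds about 205 thousand trainable parameters. Flattening the 5x5x1536 volume instead of pooling it would make the first dense layer 25 times larger, which is the parameter saving the pooling layer provides.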

Existing Pre-processing Techniques Used, Based on Research Papers:

1. Resizing: Images are resized to 224x224 to match the input size of the model.
2. Green (G) Channel Extraction: The G-channel of the image is extracted for enhanced clarity, as it provides
the best contrast for fundus images.
3. Image Augmentation: Random rotations, zooms, width/height shifts, and horizontal flips were applied to
reduce overfitting by artificially increasing the dataset.
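
The three steps above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the project's pipeline: the helper names are hypothetical, nearest-neighbour indexing stands in for a library resize, the shift wraps around the image border, rotation and zoom are omitted for brevity, and a random array stands in for a real fundus image.

```python
import numpy as np


def green_channel(rgb):
    """Keep only the G channel of an H x W x 3 RGB image."""
    return rgb[..., 1]


def resize_nearest(img, size=224):
    """Nearest-neighbour resize to size x size (stand-in for a library resize)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]


def random_flip_shift(img, rng, max_shift=10):
    """Random horizontal flip plus a random height/width shift (wrap-around)."""
    if rng.random() < 0.5:
        img = img[:, ::-1]
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(img, (int(dy), int(dx)), axis=(0, 1))


rng = np.random.default_rng(0)
# Placeholder for a real fundus photograph (random pixels, RGB channel order assumed).
fundus = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)

g = resize_nearest(green_channel(fundus))
augmented = random_flip_shift(g, rng)
print(g.shape, augmented.shape)
```

Note that some image libraries load images in BGR rather than RGB order; the green channel is at index 1 in both cases.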

New Pre-processing Techniques Added:

1. CLAHE (Contrast Limited Adaptive Histogram Equalization): Applied to the G-channel to improve
contrast while limiting over-amplification of noise, especially in low-contrast areas.
2. Wavelet Denoising: Applied on top of CLAHE-enhanced G-channel images using wavelet transforms (Haar
wavelet). This method reduces noise, particularly illumination noise and small artifacts, while preserving
important image details.
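
The wavelet denoising step can be illustrated with a single-level 2-D Haar transform and soft thresholding of the detail coefficients. This NumPy sketch is a simplified stand-in for a wavelet library: the function names, the threshold value, and the synthetic test image are assumptions, the slides do not give the project's decomposition level or threshold, and CLAHE is not reproduced here.

```python
import numpy as np

S = np.sqrt(2.0)


def haar2d(x):
    """One level of the orthonormal 2-D Haar transform (even height and width)."""
    a = (x[0::2] + x[1::2]) / S            # averages of adjacent row pairs
    d = (x[0::2] - x[1::2]) / S            # differences of adjacent row pairs
    ll = (a[:, 0::2] + a[:, 1::2]) / S     # approximation band
    lh = (a[:, 0::2] - a[:, 1::2]) / S     # detail bands
    hl = (d[:, 0::2] + d[:, 1::2]) / S
    hh = (d[:, 0::2] - d[:, 1::2]) / S
    return ll, lh, hl, hh


def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d."""
    h, w = ll.shape
    a = np.empty((h, 2 * w))
    d = np.empty((h, 2 * w))
    a[:, 0::2], a[:, 1::2] = (ll + lh) / S, (ll - lh) / S
    d[:, 0::2], d[:, 1::2] = (hl + hh) / S, (hl - hh) / S
    x = np.empty((2 * h, 2 * w))
    x[0::2], x[1::2] = (a + d) / S, (a - d) / S
    return x


def soft_threshold(c, t):
    """Shrink coefficients towards zero by t, zeroing those smaller than t."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)


def haar_denoise(img, threshold):
    """Soft-threshold the three detail bands and reconstruct the image."""
    ll, lh, hl, hh = haar2d(img.astype(float))
    return ihaar2d(ll, *(soft_threshold(c, threshold) for c in (lh, hl, hh)))


# Synthetic example: a smooth horizontal ramp corrupted by Gaussian noise.
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
noisy = clean + rng.normal(0.0, 0.1, clean.shape)
denoised = haar_denoise(noisy, threshold=0.3)

mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_denoised = float(np.mean((denoised - clean) ** 2))
print(mse_noisy, mse_denoised)
```

The threshold controls the trade-off: a larger value removes more noise but also more fine detail, which may matter for cataract images where subtle texture carries diagnostic signal.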
Design - XAI

• Grad-CAM (Gradient-weighted Class Activation Mapping) provides visual explanations for predictions made by convolutional
neural networks (CNNs).
• It computes the gradients of the target class score with respect to the feature maps of the final convolutional layer.
• These gradients are global-average-pooled to give one importance weight per feature map, indicating how strongly each map
contributed to the prediction.
• The feature maps are combined in a weighted sum using these weights and passed through a ReLU, producing a heatmap that is
upsampled and overlaid on the input image to show the areas the model focused on.
• This helps visualize which parts of the image are crucial for classification, aiding model interpretability.
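
The Grad-CAM computation itself is short once the last-layer feature maps and the gradients of the class score are available. The NumPy sketch below assumes those two arrays have already been obtained from a deep learning framework's automatic differentiation; the function name and the toy inputs are hypothetical.

```python
import numpy as np


def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap from last-conv activations and class-score gradients.

    Both inputs have shape (H, W, K): K feature maps of size H x W.
    """
    # One weight per feature map: global average of its gradients.
    weights = gradients.mean(axis=(0, 1))                        # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(feature_maps, weights, axes=([2], [0]))   # shape (H, W)
    # ReLU keeps only regions with a positive influence on the class.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] for display (skipped if the map is all zeros).
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam


# Toy example: channel 0 fires at (2, 3) and supports the class,
# channel 1 fires at (0, 0) and counts against it.
fmap = np.zeros((5, 5, 2))
fmap[2, 3, 0] = 1.0
fmap[0, 0, 1] = 1.0
grads = np.zeros((5, 5, 2))
grads[..., 0] = 0.2
grads[..., 1] = -0.1

heatmap = grad_cam(fmap, grads)
peak = np.unravel_index(heatmap.argmax(), heatmap.shape)
print(peak)
```

In the toy example the heatmap peaks where the supporting channel is active, and the ReLU suppresses the location tied to the opposing channel. For display, the coarse heatmap is resized to the input resolution and blended with the original image.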
Implementation - XAI

[Figure: Fundus image of an eye containing a cataract]
[Figure: Heatmap superimposed on the original image, highlighting the regions that contributed most to the prediction]
Result

• Image Generation Quality (GANs): The GAN model trained with 64x64 resolution images for 100 epochs produced coherent
cataract images with structural clarity, though lacking detail for effective feature extraction. Extending to 200 epochs reduced
image quality due to instability and potential mode collapse.
• Training Metrics (GANs): Discriminator and generator losses achieved balance at 100 epochs, indicating stable training;
however, at 200 epochs, the generator loss increased, reflecting instability from overfitting or mode collapse.
• Model Training Outcomes: Without preprocessing, InceptionResNetV2 achieved a stable 97.7% training and 88.2% validation
accuracy, suggesting good generalization. After CLAHE and wavelet denoising, the model's validation accuracy dropped to
73.4%, with instability suggesting overfitting to enhanced features rather than generalized ones.
• Grad-CAM Visualization: Applying Grad-CAM to InceptionResNetV2 highlighted critical regions in cataract detection,
specifically the lens and surrounding eye tissues, aiding in interpretability and validating the model’s focus on relevant features.
References

Weblinks:
1. https://glassboxmedicine.com/2020/05/29/grad-cam-visual-explanations-from-deep-networks/
2. https://jovian.ai/aakashns/06b-anime-dcgan
Dataset Links:
3. https://www.kaggle.com/datasets/jr2ngb/cataractdataset
4. https://www.kaggle.com/datasets/drskprabhakar/cataract-dr-normal-glaucoma-fundus-images-dataset
5. https://dataverse.telkomuniversity.ac.id/dataset.xhtml?persistentId=doi:10.34820/FK2/CDWESA
6. https://data.mendeley.com/datasets/s9bfhswzjb/1
Journals:
1. Pratap, Turimerla, and Priyanka Kokil. "Computer-aided diagnosis of cataract using deep transfer learning." Biomedical
Signal Processing and Control 53 (2019): 101533.
2. Hasan, Md Kamrul, et al. "[Retracted] Cataract Disease Detection by Using Transfer Learning‐Based Intelligent
Methods." Computational and Mathematical Methods in Medicine 2021.1 (2021): 7666365.
3. Jidan, Omar Jilani, et al. "A comprehensive study of DCNN algorithms-based transfer learning for human eye cataract
detection." International Journal of Advanced Computer Science and Applications 14.6 (2023).
4. Imran, Azhar, et al. "Fundus image-based cataract classification using a hybrid convolutional and recurrent neural
network." The Visual Computer 37 (2021): 2407-2417.
5. Khan, Md Sajjad Mahmud, et al. "Cataract detection using convolutional neural network with VGG-19 model." 2021
IEEE World AI IoT Congress (AIIoT). IEEE, 2021.
6. Junayed, Masum Shah, et al. "CataractNet: An automated cataract detection system using deep learning for fundus
images." IEEE Access 9 (2021): 128799-128808.
7. Imran, Azhar, et al. "Automated identification of cataract severity using retinal fundus images." Computer Methods in
Biomechanics and Biomedical Engineering: Imaging & Visualization 8.6 (2020): 691-698.
8. Kalyani, B. J. D., et al. "Smart cataract detection system with bidirectional LSTM." Soft Computing 27.11 (2023): 7525-7533.
9. Kumari, Pammi, and Priyank Saxena. "Cataract detection and visualization based on multi-scale deep features by RINet
tuned with cyclic learning rate hyperparameter." Biomedical Signal Processing and Control 87 (2024): 105452.
10. Yadav, Sunita, and Jay Kant Pratap Singh Yadav. "Automatic Cataract Severity Detection and Grading Using Deep
Learning." Journal of Sensors 2023.1 (2023): 2973836.
11. Ganokratanaa, Thittaporn, Mahasak Ketcham, and Patiyuth Pramkeaw. "Advancements in Cataract Detection: The
Systematic Development of LeNet-Convolutional Neural Network Models." Journal of Imaging 9.10 (2023): 197.
12. Patil, Yogeshwar, et al. "Multiple ocular disease detection using novel ensemble models." Multimedia Tools and
Applications 83.4 (2024): 11957-11975.
13. Yadav, Jay Kant Pratap Singh, and Sunita Yadav. "Computer‐aided diagnosis of cataract severity using retinal fundus
images and deep learning." Computational Intelligence 38.4 (2022): 1450-1473.
14. Faizal, Sahil, et al. "Automated cataract disease detection on anterior segment eye images using adaptive thresholding
and fine tuned inception-v3 model." Biomedical Signal Processing and Control 82 (2023): 104550.
15. Goh, Jocelyn Hui Lin, et al. "Artificial intelligence for cataract detection and management." Asia-Pacific Journal of
Ophthalmology 9.2 (2020): 88-95.
16. Wang, Yong, et al. "Cataract detection based on ocular B-ultrasound images by collaborative monitoring deep
learning." Knowledge-Based Systems 231 (2021): 107442.
17. Hu, Shenming, et al. "ACCV: automatic classification algorithm of cataract video based on deep learning." BioMedical
Engineering OnLine 20 (2021): 1-17.
18. Keenan, Tiarnan DL, et al. "DeepLensNet: deep learning automated diagnosis and quantitative classification of cataract
type and severity." Ophthalmology 129.5 (2022): 571-584.
19. Wang, Ting, et al. "Intelligent cataract surgery supervision and evaluation via deep learning." International Journal of
Surgery 104 (2022): 106740.