Adrija NTCC File
Submitted To
Amity Institute of Information Technology (AIIT)
Place: Noida
ACKNOWLEDGEMENT
Adrija Monal
A1049522010
Industry Guide
Project Information
1) Project Duration: (63 Days)
a) Date of Summer Internship commencement (17/12/2024)
b) Date of Summer Internship Completion (17/02/2025)
2) Topic
3) Project Objective
The methodology for this project involves a systematic approach to satellite image
classification using machine learning and deep learning techniques. Initially, satellite
imagery data is collected from reliable sources such as Landsat, Sentinel-2, or
Google Earth Engine.
The raw data undergoes preprocessing steps, including noise reduction, resizing,
and normalization, to enhance model performance. Key features, such as spectral
indices like NDVI, are extracted to aid in distinguishing different land cover types.
The dataset is then divided into training, validation, and testing subsets. Machine
learning algorithms like Random Forest and Support Vector.
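The NDVI mentioned above is conventionally computed as (NIR - Red) / (NIR + Red). The following is a minimal sketch of that formula, assuming each band is available as a plain 2-D list of reflectance values. The function names and the sample values are illustrative and are not taken from the project code.

```python
def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index for a single pixel.

    nir, red: reflectance in the near-infrared and red bands.
    eps guards against division by zero when both bands are 0.
    """
    return (nir - red) / (nir + red + eps)


def ndvi_map(nir_band, red_band):
    """Apply ndvi() pixel by pixel to two equally sized 2-D bands."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2x2 reflectance patches (values made up for illustration).
nir_band = [[0.50, 0.40], [0.10, 0.05]]
red_band = [[0.10, 0.10], [0.10, 0.15]]
result = ndvi_map(nir_band, red_band)
```

Values near +1 typically indicate dense vegetation, while values near zero or below are more typical of bare soil, built-up surfaces, or water.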
This project focuses on classifying satellite images into different land cover types
using machine learning and deep learning techniques. Satellite images often carry
valuable geospatial information, but manual interpretation is time-consuming and
prone to errors. By leveraging automated classification methods, this project seeks
to enhance the speed and precision of land cover analysis.
1. Abstract
2. Introduction
o 1a. Motivation
o 1b. Objectives
3. Requirement specification
4. Design
o 4A. Model Diagram
5. Implementation
o 5A. System Requirements
o 5C. Output
6. Testing
Performance Analysis
7. Conclusion and Future Work
8. References
Abstract
Land Use and Land Cover (LULC) classification plays a crucial role in
monitoring environmental changes, resource management, and urban planning.
The integration of remote sensing technologies with deep learning has
significantly enhanced the accuracy and efficiency of automated classification
methods. This research investigates the performance of the ResNet-50 deep
convolutional neural network in categorizing high-resolution satellite imagery
into various LULC categories.
Results indicate that ResNet-50 efficiently identifies complex land cover patterns
and generalizes well across various landscapes, making it a viable option for
large-scale remote sensing applications. The application of deep learning in
satellite image processing allows for rapid and precise LULC mapping, aiding in
data-driven decision-making for environmental sustainability and urban
expansion.
This research highlights the potential of deep learning, particularly the ResNet-
50 model, in transforming LULC classification through automation and
scalability. Future studies may explore the inclusion of multi-spectral and
hyperspectral data to further enhance classification accuracy and robustness.
Chapter 1
Introduction
Land Use and Land Cover (LULC) classification is a core task in remote
sensing because it shows how land is used and how it changes across
different areas. It supports environmental monitoring, deforestation
tracking, urban growth studies, and agricultural planning. In the past,
LULC classification was done by hand or with rule-based algorithms and
basic machine learning, but these methods were limited in scalability,
accuracy, and general applicability.
Deep learning, and convolutional neural networks (CNNs) in particular, can now
learn to recognize image content quickly and accurately. ResNet-50 is one of
the most widely used CNN architectures; its residual (skip) connections help it
avoid common training problems in deep networks, such as vanishing gradients.
This study uses ResNet-50 to identify land cover types, such as forests or
cities, from satellite images, specifically the EuroSAT dataset derived from
Sentinel-2 imagery.
Through the application of ResNet-50 for land use and land cover (LULC)
classification, this research seeks to illustrate the transformative potential of deep
learning in enhancing classification accuracy, efficiency, and automation.
1a. Motivation
1b. Objectives
With Preprocessing → The images are cleaned up and adjusted—resized to fit the
model, normalized to balance pixel values, and sometimes enhanced using
techniques like rotation or contrast adjustment. This helps the model learn better.
Without Preprocessing → The raw images are fed directly into the model as they
are, without any modifications. This might affect accuracy since unprocessed
images may contain noise or inconsistencies.
Model → Once the images are prepared (or left as they are), they are passed into a
deep learning model. The diagram shows three options: GoogLeNet, ResNet-50, and
ResNet-101. These models analyze the images, extract important patterns, and try to
classify them correctly.
Sequence Layer → After the model processes the images, the extracted features are
passed through additional layers (like fully connected layers) to refine the
classification. Think of this as a final step where the model makes its best guess
based on all the patterns it has learned.
Training and Evaluation → Finally, the model is trained and tested. This is where
we check how well it performs using different metrics like accuracy, precision, recall,
and F1-score. These help us understand whether the model is making correct
predictions and where improvements might be needed.
This structured flow ensures that the model is well-prepared to classify satellite
images accurately.
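The evaluation metrics named above can all be derived from a confusion matrix. The sketch below shows one way to do so in plain Python; the function name and the 3-class matrix are made up for illustration and are not results from this project.

```python
def classification_metrics(confusion):
    """Compute accuracy and per-class precision, recall and F1-score.

    confusion[i][j] = number of samples whose true class is i and
    whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    accuracy = correct / total if total else 0.0

    per_class = []
    for k in range(n):
        tp = confusion[k][k]
        # Predicted as k but actually another class.
        fp = sum(confusion[i][k] for i in range(n)) - tp
        # Actually k but predicted as another class.
        fn = sum(confusion[k]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class.append({"precision": precision, "recall": recall, "f1": f1})
    return accuracy, per_class


# Made-up 3-class confusion matrix, for illustration only.
toy_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
accuracy, per_class = classification_metrics(toy_confusion)
```

Reading the metrics per class, rather than only the overall accuracy, helps show which land cover types the model confuses with one another.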
Chapter 3
Design
A. Model Diagram
The model follows a structured pipeline for LULC classification, as shown in the
provided diagram. It consists of the following components:
1. Input Image:
o To comply with ResNet-50's input specifications, the satellite image
(from the EuroSAT dataset) is pre-processed and scaled.
Processing Workflow:
CPU:
• A multi-core processor (Intel i5 or AMD Ryzen 5 or better).
• Performance will be enhanced by more cores and faster clock speeds,
particularly for data preprocessing.
Storage:
• SSD (Solid State Drive) for faster data access and model training.
• At least 20GB of free space for datasets, models, and libraries.
Software Requirements:
1. Operating System:
1. Annual Crop: Fields of crops that are sown and harvested within a single
year.
2. Forest: Areas covered with trees and other vegetation.
3. Herbaceous Vegetation: Grasslands and other non-woody vegetation.
4. Highway: Major roads.
5. Industrial: Areas with infrastructure and buildings used for industry.
6. Pasture: Land where animals graze.
7. Permanent Crop: Crops (like orchards) that are planted once and yield for
several years.
8. Residential: Urban areas with residential buildings.
9. River: Flowing water bodies such as streams and rivers.
10. Sea/Lake: Lakes and other large bodies of water.
3. Image Specifications
• Image Format: JPEG (the RGB version of the dataset).
• Image Size: EuroSAT images are 64x64-pixel patches. This implementation
resizes every image to 64x64 pixels so that all inputs share the same
shape.
• Colour Channels: The pictures have three colour channels (Red, Green, and
Blue) because they are RGB.
4. Number of Images
There are 27,000 images in the EuroSAT collection, split among the 10 classes.
Each class contains between 2,000 and 3,000 images, so the classes are not
perfectly balanced.
5. Pixel Information
• Pixel Values: Each color channel (RGB) in the images has pixel values
ranging from 0 to 255. These values are usually normalized to the range [0,
1] during preprocessing by dividing by 255.
• Image Resolution: The images are kept at a small resolution (64x64
pixels), which lessens the computational load and speeds up
training.
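The normalization step described above amounts to a single division per channel value. A minimal sketch, assuming an image stored as nested lists of 8-bit RGB values (the function name and the sample image are illustrative):

```python
def normalize_pixels(image):
    """Scale 8-bit channel values from [0, 255] to [0.0, 1.0]."""
    return [
        [[value / 255.0 for value in pixel] for pixel in row]
        for row in image
    ]


# Hypothetical 1x2 RGB image (one row, two pixels).
image = [[[0, 128, 255], [64, 64, 64]]]
normalized = normalize_pixels(image)
```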
Data Preprocessing
1. Import Required Libraries
Verify that the required libraries are installed: TensorFlow for model
construction and training, NumPy for numerical computations, and Matplotlib
for visualisation.
• os: Used to navigate directories and interact with the operating
system.
• numpy: An essential Python library for numerical computation that
manages matrices and arrays.
• tensorflow: An open-source machine learning and deep learning library
that offers model construction and training capabilities.
• matplotlib.pyplot: A plotting library for visualising data, including images.
2. Set Up Parameters
Establish the parameters that will be applied throughout the preprocessing and
model training phases.
Explanation:
• dataset_url: The path to the dataset on the local machine. Adapt this path
according to where the EuroSAT dataset was extracted.
• img_height and img_width: The dimensions to which every image is
resized. This guarantees that every input image fed into the model has
the same shape.
• batch_size: The number of images processed in a single training step.
A bigger batch size uses more memory but can speed up
training.
• rescale: The factor used to normalize pixel values. Dividing by 255
converts pixel values from the range [0, 255] to [0, 1].
• validation_split: The portion of the dataset that is set aside for
validation. This helps assess how well the model performs on
unseen data.
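As an illustration, the parameters above could be collected in one place as shown below. This is a sketch under assumptions: the image size and the dataset size come from this report, while the dataset path, the batch size of 32, and the validation fraction of 0.2 are placeholder values chosen for the example.

```python
import math

# Placeholder path: adapt to the folder where EuroSAT was extracted.
dataset_url = "path/to/EuroSAT"
img_height = 64            # image height in pixels
img_width = 64             # image width in pixels
batch_size = 32            # assumed value for this sketch
rescale = 1.0 / 255        # maps pixel values from [0, 255] to [0, 1]
validation_split = 0.2     # assumed fraction held out for validation

# Consequences of these choices for a 27,000-image dataset.
total_images = 27000
val_images = round(total_images * validation_split)
train_images = total_images - val_images
steps_per_epoch = math.ceil(train_images / batch_size)
```

With these assumed values, 21,600 images remain for training, which corresponds to 675 batches per epoch.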
4. Create ImageDataGenerator
Images are loaded and pre-processed using TensorFlow's ImageDataGenerator
class, which can also apply real-time data augmentation.
Explanation:
• ImageDataGenerator: This class produces batches of tensor image data with
optional real-time data augmentation. It can also rescale pixel
values.
• validation_split: Using the given fraction, this option automatically divides
the dataset into training and validation sets.
• rescale: Normalizes the pixel values, which speeds up convergence during
neural network training.
5. Splitting Dataset
The images are divided into:
• Training set (70-80%)
• Validation set (10-15%)
The split is applied with the flow_from_directory method on the dataset's
directory structure. This method labels each image automatically using the
name of its subdirectory.
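The idea of labelling images by subdirectory name and holding out a fraction of each class can be illustrated without TensorFlow. The sketch below is a plain-Python illustration of that idea, not TensorFlow's implementation; the helper names, the two class folders, and the file names are hypothetical.

```python
import os
import tempfile


def list_labelled_images(root):
    """Return (path, label) pairs, using each subdirectory name as the label."""
    samples = []
    for label in sorted(os.listdir(root)):
        class_dir = os.path.join(root, label)
        if not os.path.isdir(class_dir):
            continue
        for name in sorted(os.listdir(class_dir)):
            samples.append((os.path.join(class_dir, name), label))
    return samples


def split_per_class(samples, validation_split):
    """Hold out the first `validation_split` fraction of every class."""
    by_label = {}
    for path, label in samples:
        by_label.setdefault(label, []).append((path, label))
    train, val = [], []
    for label in sorted(by_label):
        items = by_label[label]
        n_val = int(len(items) * validation_split)
        val.extend(items[:n_val])
        train.extend(items[n_val:])
    return train, val


# Build a tiny throwaway directory tree with two hypothetical classes.
with tempfile.TemporaryDirectory() as root:
    for label in ("Forest", "River"):
        os.makedirs(os.path.join(root, label))
        for i in range(10):
            open(os.path.join(root, label, f"img_{i}.jpg"), "w").close()
    samples = list_labelled_images(root)
    train, val = split_per_class(samples, validation_split=0.2)
```

Splitting inside each class, rather than over the whole file list, keeps the class proportions the same in the training and validation subsets.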
C. Output
The trained ResNet-50 model labels each satellite image with one of the ten
LULC categories (forest, river, residential, etc.).
The final classification map makes the result easier to interpret by graphically
representing the various categories of land cover and land use.
Chapter 5
Testing
2. Model Training
The model used in this study is the ResNet-50 architecture, a 50-layer
network built mainly from convolutional layers combined with pooling. The
model was trained for 100 epochs with a batch size of 32, using categorical
cross-entropy as the loss function and accuracy as the evaluation metric.
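For reference, categorical cross-entropy compares the one-hot true label with the predicted class probabilities and equals the negative log of the probability assigned to the correct class. The sketch below works through the formula for a single sample; the logits are made-up numbers, and the three classes stand in for the ten used in the project.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Loss for one sample: -sum(t_i * log(p_i)) over all classes."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


# Made-up scores for a 3-class example; the true class is index 0.
logits = [2.0, 0.5, 0.1]
probs = softmax(logits)
target = [1, 0, 0]
loss = categorical_cross_entropy(target, probs)
```

The loss approaches zero as the probability of the correct class approaches 1, and grows as the model becomes more confident in a wrong class.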
5. Performance Metrics
• Overall Accuracy: 87.4%
• Class-wise Accuracy:
• AnnualCrop: 90%
• Forest: 85%
• HerbaceousVegetation: 80%
• Highway: 95%
• Industrial: 88%
• Pasture: 82%
• PermanentCrop: 89%
• Residential: 86%
• River: 91%
• SeaLake: 93%
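As a quick check on the figures above, the unweighted mean of the ten class-wise values can be computed directly. It comes to 87.9%, close to but not identical with the reported overall accuracy of 87.4%. A gap of this kind is expected when classes contain different numbers of test images, because overall accuracy weights each class by its size. The sketch assumes the class-wise values are per-class accuracies measured on the same test set.

```python
# Class-wise accuracy values as reported above (percent).
class_accuracy = {
    "AnnualCrop": 90,
    "Forest": 85,
    "HerbaceousVegetation": 80,
    "Highway": 95,
    "Industrial": 88,
    "Pasture": 82,
    "PermanentCrop": 89,
    "Residential": 86,
    "River": 91,
    "SeaLake": 93,
}

# Unweighted (macro) average: every class counts equally.
macro_average = sum(class_accuracy.values()) / len(class_accuracy)

# Classes with the lowest and highest reported accuracy.
weakest = min(class_accuracy, key=class_accuracy.get)
strongest = max(class_accuracy, key=class_accuracy.get)
```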
REFERENCES
Achievements:
• Established clear project objectives and goals.
• Drafted a comprehensive project plan.
• Completed a preliminary requirements document.
Achievements:
Critical spectral indices were successfully calculated to enhance feature
extraction and classification. The NDVI was utilized to quantify
vegetation health and density, while the NDWI was applied to highlight
water bodies and assess their boundaries. Additionally, the Built-up Index
was effectively employed to detect urban areas. Texture features were
extracted using the Gray Level Co-occurrence Matrix, which addressed
ambiguities between spectrally similar land cover types, such as urban and
barren areas. By combining spectral indices and texture features, initial
regions were mapped, laying a solid foundation for further classification
tasks.
Future Work Plans:
• The plan includes integrating additional data, such as topographical
features, to provide further context for distinguishing land cover types.
Multi-temporal imagery will be utilized to capture seasonal variations.
Extracted features will undergo dimensionality reduction techniques
before being fed into the classification pipeline.
AMITY INSTITUTE OF INFORMATION TECHNOLOGY
MAJOR PROJECT REPORT
Achievements:
Refinements were made to the extracted feature datasets to improve
classification accuracy. Multi-source data integration was initiated by
incorporating topographical attributes such as elevation and slope,
which provided additional context for distinguishing similar land cover
types. Adjustments were made to spectral indices thresholds to
minimize overlap between urban and barren regions. Statistical analysis
of texture features was conducted, resulting in the selection of the most
relevant parameters, such as contrast and homogeneity. These refinements
enabled a more precise differentiation of land cover categories, further
improving the quality of the input dataset for classification.
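To make the texture parameters concrete, the sketch below builds a Gray Level Co-occurrence Matrix for horizontally adjacent pixels and derives contrast and homogeneity from it, using the common definitions sum(p(i,j) * (i-j)^2) and sum(p(i,j) / (1 + (i-j)^2)). The function names, the pixel offset, and the two tiny test patches are assumptions made for illustration and are not taken from the project code.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for the pixel offset (dx, dy).

    image: 2-D list of integer gray levels in range(levels).
    Entry [i][j] is the share of pixel pairs where a pixel with level i
    has a neighbour with level j at the given offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """High when neighbouring pixels differ strongly in gray level."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def homogeneity(p):
    """High when neighbouring pixels have similar gray levels."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))


# Two made-up 3x3 patches with 4 gray levels: uniform vs. checkerboard.
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
checker = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]
flat_glcm = glcm(flat, levels=4)
checker_glcm = glcm(checker, levels=4)
```

The uniform patch gives zero contrast and maximal homogeneity, while the checkerboard gives the opposite pattern, which is why these two features help separate smooth surfaces from highly textured ones.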
Future Work Plans:
Incorporate temporal datasets to analyze seasonal variations and their
impact on land cover features. Begin initial training of the
classification model using the refined dataset to evaluate its baseline
performance.
Achievements:
The initial training of the classification model was successfully
conducted using the refined dataset, incorporating spectral indices,
texture features, and topographical attributes. Model evaluation was
performed using a split dataset, with 70% for training and 30% for
testing, ensuring a balanced distribution of land cover classes. Various
machine learning algorithms, including Convolutional Neural
Networks (CNN) and Random Forest, were tested to compare their
effectiveness. Preliminary results indicated promising accuracy for
vegetation and water classes; however, urban and barren land
categories still exhibited minor misclassification due to spectral
similarities.
Future Work Plans:
Fine-tune the model further by incorporating advanced feature selection
techniques and additional data augmentation strategies. Implement
post-classification refinement methods to enhance spatial accuracy and reduce
classification noise.
Achievements:
Significant improvements were made to the classification model by
optimizing hyper-parameters such as learning rate, batch size, and the
number of training epochs. Data augmentation techniques, including
rotation, flipping, and contrast adjustments, were applied to increase
dataset diversity and reduce overfitting. The inclusion of additional
training samples for urban and barren land classes helped improve
differentiation between spectrally similar regions. Model validation was
conducted using cross-validation, leading to a noticeable increase in
classification accuracy, particularly for previously misclassified areas.
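The three augmentation types named above can be illustrated on a small single-channel image. This is a minimal sketch, assuming 8-bit gray values stored in nested lists; the 90-degree rotation angle, the contrast factor, and the function names are choices made for the example rather than the settings used in the project.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def adjust_contrast(image, factor):
    """Scale each pixel's distance from the mean, clipped to [0, 255]."""
    pixels = [v for row in image for v in row]
    mean = sum(pixels) / len(pixels)
    return [
        [min(255, max(0, round(mean + factor * (v - mean)))) for v in row]
        for row in image
    ]


# Made-up 2x2 grayscale image.
image = [[10, 20], [30, 40]]
flipped = flip_horizontal(image)
rotated = rotate_90_clockwise(image)
contrasted = adjust_contrast(image, factor=2.0)
```

Each transformed copy keeps the original label, so the model sees more varied examples of the same class without any new imagery being collected.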
Future Work Plans:
Incorporate temporal datasets to analyze seasonal variations and their impact
on land cover features. Begin initial training of the classification model using
the refined dataset to evaluate its baseline performance.
Achievements:
During the week, higher-resolution satellite images were successfully incorporated,
leading to a significant improvement in spatial details and classification accuracy. GIS-
based post-classification correction techniques were implemented, which helped in
refining misclassified regions and enhancing the overall model performance. An extensive
error analysis was conducted to identify persistent classification issues, and necessary
adjustments were made to the model parameters accordingly. Furthermore, a comparative
evaluation of performance metrics before and after these enhancements demonstrated the
effectiveness of the implemented improvements.
Future Work Plans:
The focus will be on exploring deep learning-based segmentation techniques to further
enhance classification accuracy. Additional hyper-parameter tuning will be carried out
to optimise model performance, ensuring the model achieves its best possible results.
Moreover, the results will be validated against benchmark datasets to confirm their
robustness and generalisability, ensuring the reliability of the classification model.
Achievements:
Expanded the training dataset with more diverse satellite images to improve
model adaptability.
Fine-tuned the network architecture by experimenting with deeper layers and
improved activation functions.
Applied ensemble learning techniques such as stacking and bagging to combine
multiple classification models.
Analyzed the impact of ensemble learning on classification
performance.
Future Work Plans:
- Final optimisation of the deep learning model before project documentation.
- Perform extensive validation using real-world satellite data.
- Prepare a comparative performance analysis between different classification
models.
- Continue drafting the final project report, focusing on results and conclusions.
Achievements:
Refined model parameters based on validation results to enhance performance.
Conducted extensive benchmarking, analyzing accuracy, precision, recall, and
F1-score.
Created detailed visualizations (heatmaps, confusion matrices) to interpret classification
outputs.
Drafted sections of the final report, summarizing key findings and insights.
Future Work Plans
Thursday: Identified commonly used datasets like FER2013, CK+, and AffectNet.
Sunday: Noted key findings and potential improvements from the literature review.
Monday: Summarized insights and started the dataset selection process
and the creation of a detailed timeline for project execution.
Sunday: Pre-processed 50% of the dataset and prepared for model training.
Monday: Reviewed preprocessing pipeline and documented key steps.
Thursday: Identified errors and edge cases where the model failed.
Saturday:
Student Name:
Adrija Monal
Enrolment No:
A1049522010
Program:
BCA+MCA
(Dual)
Batch: 2022-2027
Semester: 6
Course: NTCC
Name of the Guide: Dr. ParthaSarathi Chakraborty
Project Title : Satellite image classification