Research Paper UGR - Team-07

Progress Report of UGR

Name of the Students: i) G. Akshitha; ii) P. Sindhu; iii) M. Harshini; iv) N. Vyshnavi

Roll No.: i) 2205A42007; ii) 2205A42035; iii) 2205A42013; iv) 2205A42018

Department (School): ECE

Email ID & Mobile No.: i) [email protected] – 8019102643; ii) [email protected] – 9014124965; iii) [email protected] – 7799212666; iv) [email protected] – 9398839095

Supervisor(s) Name: Dr. K. Sreedhar

Research Title: Image detection using deep learning

Team no.: 07

Abstract
Deep learning has revolutionized image detection, powering applications in object detection, facial
recognition, autonomous vehicles, and medical imaging. This review highlights key models like YOLO,
ResNet, and Faster R-CNN, analyzing their architectures, methodologies, and performance on datasets
like COCO, PASCAL VOC, and ImageNet.
Trade-offs between efficiency and accuracy are discussed, with attention to lightweight real-time models
and deeper networks for feature extraction. Innovations like multi-scale feature integration, attention
mechanisms, and advanced loss functions address challenges such as small object detection, occlusion,
and class imbalance.
Limitations include high computational costs, reliance on large datasets, and vulnerability to adversarial
attacks. Future directions involve transfer learning, unsupervised methods, and transformer-based
architectures to enhance scalability and practical deployment, bridging research and real-world
applications.
Keywords: Image Detection, CNNs, YOLO, ResNet, Faster R-CNN, Object Detection.

I. Introduction
Image detection plays a vital role in computer vision, underpinning numerous applications such as
object tracking, scene understanding, facial recognition, autonomous driving, and medical imaging. It
involves two primary tasks: identifying objects present in an image and determining their precise
locations within it. Over the years, deep learning, especially Convolutional Neural Networks (CNNs),
has brought about a transformative shift in this domain. These models have replaced traditional methods
that relied heavily on handcrafted features and domain-specific knowledge, enabling end-to-end learning
from raw image data and achieving unprecedented accuracy and scalability.
The success of CNNs stems from their ability to learn hierarchical representations of data, from low-
level features like edges and textures to high-level semantic information. Breakthroughs in architecture
design, such as AlexNet, ResNet, and DenseNet, have significantly improved the capacity of deep
networks to model complex patterns in visual data. Moreover, specialized architectures like YOLO (You
Only Look Once) and Faster R-CNN have optimized image detection tasks by combining speed and
precision.
However, despite these advancements, significant challenges persist. Detecting small or overlapping
objects remains a critical issue, especially in complex scenes with cluttered backgrounds. Similarly,
ambiguity in identifying visually similar objects often leads to misclassification. Balancing
computational efficiency with accuracy poses another challenge, particularly for real-time applications
requiring lightweight yet precise models. Furthermore, reliance on large-scale annotated datasets and the
susceptibility of deep learning models to adversarial attacks limit their generalization capabilities and
robustness.
This paper provides an in-depth review of the evolution of deep learning-based image detection
techniques, emphasizing key innovations and their applications across various domains. It explores how
state-of-the-art methods address critical challenges and evaluates their performance on benchmark
datasets. Additionally, the paper highlights emerging trends and technologies, such as attention
mechanisms, transformer-based models, and semi-supervised learning, that promise to further enhance
the field. By synthesizing insights from existing research, this study aims to identify opportunities for
optimizing image detection models for real-world deployment, ensuring a balance between efficiency,
accuracy, and scalability.

Fig. 1. Object Detection Examples

II. Literature Review
A. Literature Survey for Image Detection
Numerous studies have demonstrated the effectiveness of deep learning techniques in image detection,
highlighting key architectures that have driven advancements in the field. Below is an elaboration of
prominent models and their contributions:
1. YOLO (You Only Look Once)
The YOLO architecture revolutionized object detection by introducing a single-stage detection
framework. Unlike traditional two-stage models like Faster R-CNN, YOLO treats object
detection as a regression problem, predicting bounding boxes and class probabilities directly
from an image in a single evaluation. This approach allows YOLO to achieve real-time detection
speeds while maintaining reasonable accuracy. Variants such as YOLOv3, YOLOv4, and
YOLOv5 have improved on the original by incorporating features like multi-scale predictions,
better backbone networks (e.g., CSPDarknet), and advanced training strategies.
2. Faster R-CNN (Region-based Convolutional Neural Network)
Faster R-CNN set a benchmark in detection accuracy by integrating Region Proposal Networks
(RPNs) directly into the detection pipeline. This innovation eliminated the need for external
region proposal methods, making the process more efficient. Faster R-CNN is particularly
known for its high accuracy in tasks requiring precise localization, such as facial recognition and
autonomous driving. Extensions like Mask R-CNN have further built on this architecture to
include instance segmentation, enabling pixel-level object classification.
3. ResNet (Residual Networks)
ResNet introduced the concept of skip connections, which effectively mitigated the vanishing
gradient problem in deep networks. This innovation allowed for the development of extremely
deep architectures (e.g., ResNet-50, ResNet-101) without a significant degradation in
performance. In object detection, ResNet often serves as a backbone network in models like
Faster R-CNN and RetinaNet, extracting high-quality feature representations critical for detection
tasks.
4. SSD (Single Shot MultiBox Detector)
SSD combines the speed of single-stage detectors with competitive accuracy. It employs a
multi-scale feature map approach, enabling the detection of objects of varying sizes. Unlike the
original YOLO, which predicts from a single-scale grid, SSD makes predictions from convolutional
feature maps at several resolutions, improving its performance on smaller objects.
5. RetinaNet
RetinaNet introduced the Focal Loss, which addresses the issue of class imbalance in object
detection by down-weighting easy negatives and focusing more on hard-to-classify examples.
This innovation led to significant performance improvements, particularly in detecting smaller or
less distinct objects.
6. Transformer-Based Models
Recent advancements, such as DETR (DEtection TRansformer), leverage transformer
architectures to simplify the object detection pipeline. DETR replaces traditional components
like RPNs with an end-to-end transformer model, which directly predicts object locations and
labels. This approach has demonstrated competitive accuracy and simplicity, paving the way for
further exploration of transformers in computer vision.
7. Applications and Comparisons
Each of these models has been applied across diverse domains:
o YOLO is widely used in real-time applications, such as video surveillance and robotics.
o Faster R-CNN excels in scenarios requiring high precision, such as medical imaging.
o SSD and RetinaNet strike a balance between speed and accuracy, making them suitable
for applications like autonomous vehicles.
o Transformer-based models like DETR are promising for tasks that benefit from a more
unified and end-to-end approach.
Fig. 2. CNN YOLO Models
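
The Focal Loss described for RetinaNet above can be illustrated with a minimal sketch. It assumes the binary form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), with gamma = 2 and alpha = 0.25 as commonly cited defaults; the function names are illustrative and are not taken from any library.

```python
import math

def cross_entropy(p, y):
    """Standard binary cross-entropy for one prediction (p = P(class 1))."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, 0 < p < 1
    y: ground-truth label, 1 (object) or 0 (background)
    gamma, alpha: assumed default hyperparameters
    """
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy_negative = focal_loss(0.05, 0)  # background correctly scored low
hard_negative = focal_loss(0.95, 0)  # background wrongly scored high
```

With these settings the easy negative contributes far less to the total loss than the hard one, which is the down-weighting effect the text refers to.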

B. Dataset Survey for Image Detection

The availability of comprehensive datasets has been instrumental in advancing the field of image
detection. These datasets not only serve as benchmarks for evaluating model performance but also play a
critical role in training models capable of generalizing to diverse real-world scenarios. Below is a
detailed overview of some widely used datasets and their contributions to the field:
1. PASCAL VOC (Visual Object Classes Challenge)
• Overview: Initiated in 2005, PASCAL VOC is one of the pioneering datasets in computer
vision. From the 2007 challenge onward it consists of annotated images for a fixed set of 20
object categories, such as people, animals, and vehicles.
• Key Features:
o Provides both image classification and object detection annotations.
o Offers segmentation masks, making it useful for multiple vision tasks.
o Yearly challenges encouraged the development of novel algorithms.
• Impact: PASCAL VOC set the foundation for standard evaluation metrics like mean Average
Precision (mAP), widely used for object detection tasks.
2. MS COCO (Microsoft Common Objects in Context)
• Overview: MS COCO, introduced in 2014, significantly expanded the scope of image detection
datasets by emphasizing object detection within complex, cluttered scenes.
• Key Features:
o Contains over 330,000 images, with 80 object categories.
o Annotations include bounding boxes, segmentation masks, and keypoints for human pose
estimation.
o Focuses on objects in natural and contextual environments, increasing the difficulty of
detection tasks.
• Impact: MS COCO’s diverse and challenging dataset has become a gold standard for testing
object detection and segmentation models, such as Faster R-CNN, YOLO, and Mask R-CNN.
3. ImageNet
• Overview: ImageNet revolutionized computer vision by providing a large-scale dataset with
over 14 million labeled images spanning 20,000 categories.
• Key Features:
o Although primarily known for its use in image classification tasks, its subset
(ImageNet-LOC) includes bounding box annotations for object detection.
o Hosts the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC), which
fostered the rise of deep learning-based architectures like AlexNet, VGG, and ResNet.
• Impact: ImageNet established the baseline for deep learning research and motivated the
development of feature extraction techniques now adapted for object detection tasks.
Role of Datasets in Advancing Image Detection
1. Improved Model Generalization:
o Datasets like PASCAL VOC and MS COCO provide a variety of object categories and
environmental conditions, enabling models to learn robust features applicable to real-
world scenarios.
o The diverse annotations (e.g., segmentation masks, keypoints) support multi-task
learning, improving model adaptability.
2. Benchmarking and Evaluation:
o These datasets establish standardized metrics, such as mAP and IoU (Intersection over
Union), ensuring fair comparisons between models.
o Annual challenges associated with datasets like MS COCO and ImageNet encourage
innovation and highlight cutting-edge techniques.
3. Encouraging Dataset Diversity:
o The evolution of datasets highlights the growing need for annotations reflecting real-
world complexities, such as occlusions, small objects, and varied lighting.
o Future datasets, inspired by these benchmarks, aim to address gaps in representation for
underrepresented environments and categories[8].
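
The IoU (Intersection over Union) metric named above is simple enough to sketch directly. The example below assumes axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates; the function name is illustrative.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # two 2x2 boxes sharing a 1x1 corner
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold, and mAP is then computed from the resulting precision-recall curves per class.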
III. Scope
Researchers publish numerous papers each year in the field of deep learning and its applications,
making it challenging to compile a comprehensive review of all state-of-the-art methods within
the constraints of a single paper. This study focuses specifically on the significant advancements
in deep learning-based object detection since 2015. The primary goal is to provide a detailed
comparison of recent techniques and models, evaluating them based on key metrics such as
FLOPs (Floating Point Operations) and mAP (mean Average Precision).
As each year brings forth new techniques or improvements to existing ones, often through model
refinements, this paper serves as a valuable resource for researchers. It aids in identifying
optimal methods, backbone DCNNs (Deep Convolutional Neural Networks), or models to use in
developing more effective object detection systems. By facilitating informed decision-making,
this study aims to support researchers in achieving superior network performance and
discovering novel applications, ultimately contributing significantly to the field.

IV. Deep Learning Models for Image Detection

The advent of deep learning has ushered in state-of-the-art models that have redefined image detection
by addressing diverse requirements such as speed, accuracy, and computational efficiency. Notable
among these models are YOLO, ResNet, and MobileNet, each tailored to specific use cases and
challenges. This section delves into their architectural innovations and application-specific strengths.

YOLO (You Only Look Once) has become a benchmark for real-time image detection tasks due to its
innovative single-shot detection mechanism. Unlike traditional multi-stage detectors, YOLO treats
object detection as a regression problem, simultaneously predicting bounding boxes and class
probabilities in a single forward pass of the network. This streamlined approach enables exceptional
speed, making YOLO particularly suitable for applications like autonomous driving, video surveillance,
and robotics. Over successive iterations, such as YOLOv3, YOLOv4, and YOLOv5, the model has
improved in accuracy and its ability to detect smaller objects in complex scenes. However, YOLO can
struggle with overlapping objects in crowded environments, and its precision sometimes lags behind
more complex, slower models.

ResNet (Residual Networks) addresses the challenges associated with training very deep neural
networks, such as vanishing gradients, through the use of residual learning. Its innovative skip
connections allow data to bypass certain layers, ensuring efficient gradient flow and enabling the
training of extremely deep architectures like ResNet-50, ResNet-101, and ResNet-152. This scalability
has made ResNet a popular backbone in object detection frameworks like Faster R-CNN, where its
ability to extract rich feature representations enhances model accuracy. ResNet’s exceptional
performance in image detection and classification tasks underscores its role as a foundational
architecture in deep learning research.
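
The skip connection idea can be made concrete with a toy residual block. This is a minimal sketch on plain Python lists, assuming two fully connected layers instead of convolutions and omitting batch normalization; all names are illustrative.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def linear(v, weights):
    """Multiply vector v by a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two linear layers with a ReLU between."""
    f = linear(relu(linear(x, w1)), w2)
    # Skip connection: the input is added back to the transformed signal
    return relu([fi + xi for fi, xi in zip(f, x)])

zeros = [[0.0, 0.0], [0.0, 0.0]]
out = residual_block([1.0, 2.0], zeros, zeros)
```

With all weights at zero the block still passes its (non-negative) input through unchanged, which suggests why adding layers in this form does not have to degrade the signal or the gradient.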

MobileNet, designed for efficiency, is a lightweight model tailored for mobile and embedded devices. It
achieves high computational efficiency using depthwise separable convolutions, which reduce the
number of parameters and operations required. MobileNet’s modular architecture allows developers to
balance accuracy and speed by adjusting parameters like the width multiplier and input resolution. This
makes it an ideal choice for applications requiring low-latency detection on resource-constrained
devices, such as smartphones, drones, and IoT systems. Although it is less accurate than more complex
models like ResNet, its efficiency and portability have ensured widespread adoption.
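
The parameter saving from depthwise separable convolutions can be checked with simple arithmetic. The sketch below assumes square k×k kernels and ignores biases; the layer sizes are arbitrary illustrative values.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
ratio = sep / std  # equals 1/c_out + 1/k^2
```

For a 3×3 layer with 64 input and 128 output channels this gives 8,768 weights instead of 73,728, roughly an eightfold reduction.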
Together, YOLO, ResNet, and MobileNet exemplify the diversity of approaches in deep learning
models for image detection. They cater to different application needs, from real-time performance to
high accuracy and computational efficiency, highlighting the adaptability of modern deep learning to
meet the challenges of various domains.

Fig. 3. Image Detection Examples

V. Object Detection Models

A. YOLO (You Only Look Once)

YOLO is a fast and efficient object detection model that processes images in a single stage, dividing the
input into a grid and predicting bounding boxes and class probabilities in one pass. This approach
enables real-time performance, making YOLO one of the most popular models for object detection.
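
The grid idea can be sketched as follows. The example assumes an S×S grid with S = 7 and a 448×448 input, as in the original YOLOv1 setup, and shows only how a ground-truth box center is assigned to a cell; the function name is illustrative.

```python
def assign_to_cell(cx, cy, img_w, img_h, s=7):
    """Return the grid cell responsible for a box centered at (cx, cy).

    Also returns the center offsets inside that cell, each in [0, 1),
    which is the form in which the network regresses the box center.
    """
    gx = cx / img_w * s            # center in grid units
    gy = cy / img_h * s
    col = min(int(gx), s - 1)      # clamp centers on the right/bottom edge
    row = min(int(gy), s - 1)
    return row, col, gx - col, gy - row

row, col, x_off, y_off = assign_to_cell(224, 112, 448, 448)
```

Here the box center falls in row 1, column 3 of the grid, and only that cell is trained to predict the object.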

Since its introduction with YOLOv1 in 2016, the model has evolved through multiple versions.
YOLOv2 introduced anchor boxes, batch normalization, and multi-scale training, improving accuracy
for objects of varying sizes. YOLOv3 added predictions at three scales, in the style of feature
pyramid networks, and adopted Darknet-53 as a backbone for better multi-scale detection. Later
versions like YOLOv4 and YOLOv5 optimized training techniques, computational efficiency, and
deployment ease. Subsequent iterations such as YOLOv6 and YOLOv7 further enhanced speed and
accuracy, making them suitable for real-time applications like video analytics and robotics.

While YOLO excels in speed, it can struggle with small, overlapping, or densely packed objects.
However, its simplicity, versatility, and ability to balance performance and computation have made it
indispensable in fields like autonomous driving, surveillance, retail, healthcare, and robotics. Its
continuous evolution ensures its relevance in both research and practical applications[3-4].

Fig. 4. YOLOv2 block diagram

B. RCNN (Region-based CNN)

RCNN combined region proposals with CNN features to improve object detection. It uses
the selective search algorithm to generate around 2,000 region proposals per image, each of which
is then passed through a CNN for feature extraction and classification. While it improved
detection accuracy, it was computationally expensive and slow, both because selective search is
time-consuming and because the CNN runs separately on every proposal.

Fast RCNN

Fast RCNN addressed the inefficiency of RCNN by passing the entire image through a CNN first, and
then using RoI pooling to extract fixed-size feature maps for each region proposal. This significantly
sped up the process, but it still relied on selective search, making it slower than desired for real-time
applications.

Faster RCNN

Faster RCNN introduced Region Proposal Networks (RPNs) to replace selective search, generating
region proposals directly from the CNN’s feature maps. This made Faster RCNN much faster and more
efficient, enabling end-to-end training and near real-time object detection by eliminating the
computational bottleneck of selective search.
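
The way an RPN covers the image with candidate boxes can be sketched by generating its anchors. This is a minimal sketch assuming a feature-map stride of 16 and the three scales and three aspect ratios reported for the original Faster R-CNN; the function name is illustrative.

```python
import math

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) for every feature-map location."""
    anchors = []
    for i in range(feat_h):
        for j in range(feat_w):
            # Center of this feature-map cell in image coordinates
            cx = (j + 0.5) * stride
            cy = (i + 0.5) * stride
            for s in scales:
                for r in ratios:
                    # r is height / width; the area stays close to s * s
                    w = s / math.sqrt(r)
                    h = s * math.sqrt(r)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors

anchors = make_anchors(2, 3)
```

The RPN then scores each anchor as object or background and regresses offsets to it, so proposals come from the shared feature map rather than from a separate algorithm.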

Limitations

• RCNN: Requires significant computational power due to the use of selective search for region
proposals and independent CNN processing for each region.
• Fast RCNN: Still relies on selective search for region proposals, making it faster than RCNN but
not fast enough for real-time applications.
• Faster RCNN: While it improves efficiency, the model can still be slow for real-time
applications compared to more recent advancements like YOLO and SSD, especially on devices
with limited computational resources [3-7].

Fig. 5. Faster R-CNN block diagram

Fig. 6. Fast R-CNN block diagram

C. MobileNetV2

MobileNetV2 builds on the lightweight design of MobileNetV1 with significant architectural
enhancements to improve efficiency and accuracy. The key innovation in MobileNetV2 is the
introduction of bottleneck residual blocks, which consist of three stages: expansion, depthwise
convolution, and projection. The expansion layer uses a 1×1 convolution to increase the number of
input channels, followed by a depthwise convolution that applies spatial filtering independently to
each channel. Finally, the projection layer reduces the expanded channels back to a
lower-dimensional space using another 1×1 convolution with a linear activation function, preserving
essential information in the low-dimensional feature space. These blocks also include residual
connections, which skip over layers and directly add the input to the output if their dimensions
match, enabling better gradient flow and feature reuse.
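
The three stages of a bottleneck block can be traced by following the channel counts. The sketch below assumes an expansion factor of 6, a commonly used default that is not stated in this report, and only tracks shapes rather than computing convolutions; the function name is illustrative.

```python
def bottleneck_trace(c_in, c_out, stride=1, expansion=6):
    """Return (stage name, input channels, output channels) per stage,
    plus whether the residual connection can be applied."""
    hidden = c_in * expansion
    stages = [
        ("expand 1x1 + ReLU6", c_in, hidden),
        ("depthwise 3x3 + ReLU6", hidden, hidden),
        ("project 1x1 (linear)", hidden, c_out),
    ]
    # The input can only be added to the output when shapes match:
    # same channel count and no spatial downsampling.
    use_residual = (stride == 1 and c_in == c_out)
    return stages, use_residual

stages, use_residual = bottleneck_trace(24, 24)
```

For a block with 24 input and 24 output channels the features are expanded to 144 channels, filtered per channel, and projected back to 24, after which the input is added to the result.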

The MobileNetV2 architecture implements these bottleneck blocks 17 times, compared to
MobileNetV1's 13 depthwise separable convolutions. By incorporating residual connections and linear
bottlenecks, MobileNetV2 reduces the loss of information during compression, improving performance
while keeping the model lightweight. This design makes MobileNetV2 highly efficient for real-time
applications on resource-constrained devices such as mobile phones and IoT systems.

Overall, MobileNetV2 enhances computational efficiency while maintaining high accuracy for tasks like
image classification and object detection. Its innovative use of expansion, depthwise convolution, and
projection steps, coupled with residual connections, makes it a powerful backbone for mobile and
embedded deep learning systems.

Comparison with MobileNetV1

Feature | MobileNetV1 | MobileNetV2
Key Innovation | Depthwise Separable Convolutions | Bottleneck Residual Blocks with Expansion
Non-linearity | ReLU | ReLU6 in expansion, Linear in projection
Number of Convolutions | 13 Depthwise Separable Convolutions | 17 Bottleneck Blocks
Residual Connections | No | Yes
Efficiency | High | Higher

Fig. 7. MobileNetV2 block diagram

D. DETR (DEtection TRansformer)

DETR (DEtection TRansformer) revolutionizes object detection by utilizing a transformer architecture,
removing the need for traditional region proposal networks and anchor-based strategies. It combines a
CNN backbone, which extracts visual features from the input image, with a transformer encoder-
decoder architecture that processes these features. The transformer encoder captures global context
through self-attention, while the decoder uses object queries, which are learnable embeddings
representing potential objects, to predict bounding boxes and class labels in parallel. DETR employs a
set-based loss function that uses the Hungarian algorithm to uniquely match each predicted object to
ground truth, eliminating the need for post-processing techniques like non-maximum suppression
(NMS). Bounding boxes are directly predicted as continuous values, enabling a simpler and more
elegant design. This end-to-end framework achieves competitive accuracy and streamlines the object
detection pipeline, making it a promising approach for tasks requiring both simplicity and performance.
Transformers, originally designed for sequence-to-sequence tasks like natural language processing, are
used in DETR for visual feature representation and global context understanding.
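
The one-to-one matching behind DETR's set-based loss can be illustrated on a tiny example. The sketch below is not the Hungarian algorithm itself: it assumes a precomputed cost matrix and finds the same minimum-cost assignment by brute force, which is only practical for a handful of boxes. The names and cost values are illustrative.

```python
from itertools import permutations

def match_predictions(cost):
    """Find the minimum-cost one-to-one assignment by exhaustive search.

    cost[i][j] is the cost of assigning prediction i to ground-truth j.
    Assumes there are at least as many predictions as ground-truth objects.
    Returns ({ground-truth index: prediction index}, total cost).
    """
    n_pred, n_gt = len(cost), len(cost[0])
    best_total, best_perm = float("inf"), None
    # Try every way of picking a distinct prediction for each ground truth
    for perm in permutations(range(n_pred), n_gt):
        total = sum(cost[p][g] for g, p in enumerate(perm))
        if total < best_total:
            best_total, best_perm = total, perm
    return {g: p for g, p in enumerate(best_perm)}, best_total

# Three predictions, two ground-truth objects (hypothetical costs)
cost = [[0.9, 0.1],
        [0.2, 0.8],
        [0.5, 0.5]]
matching, total = match_predictions(cost)
```

Each ground-truth object ends up with exactly one prediction, and the remaining prediction is left unmatched and treated as "no object", which is why duplicate detections do not need to be suppressed afterwards.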

Fig. 8. DETR (DEtection TRansformer) block diagram

E. SSD (Single Shot MultiBox Detector)

The Single Shot MultiBox Detector (SSD) is an efficient and fast object detection model that predicts
object classes and bounding boxes in a single pass through the network, making it ideal for real-time
applications. One of the core innovations of SSD is its use of multi-scale feature maps. By extracting
features from different layers of the network, SSD can detect objects at various sizes, with higher-
resolution layers handling smaller objects and lower-resolution layers detecting larger objects. This
multi-scale approach ensures the model is robust across different object sizes and improves its overall
accuracy. Additionally, SSD employs anchor boxes, which are predefined bounding boxes of different
aspect ratios and sizes, attached to each feature map cell. These anchor boxes are refined during training
to match ground truth objects, allowing SSD to handle multiple objects in the same region. This model
performs both classification and bounding box regression in a single forward pass, optimizing both
speed and accuracy. As a result, SSD is particularly well-suited for real-world applications where a
balance of fast processing and detection accuracy is required, such as in mobile devices, surveillance
systems, and autonomous vehicles.
Fig. 9. SSD (Single Shot MultiBox Detector) block diagram
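
How SSD spreads its anchor (default) boxes across feature maps can be sketched with the linear scale rule. The example assumes six feature maps, a smallest scale of 0.2 and a largest of 0.9 relative to the image size, and three aspect ratios; these values and the function names are illustrative.

```python
import math

def default_box_scales(m=6, s_min=0.2, s_max=0.9):
    """Scales spaced linearly from s_min (first map) to s_max (last map)."""
    return [s_min + (s_max - s_min) * k / (m - 1) for k in range(m)]

def default_box_shapes(scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (width, height) per aspect ratio, as fractions of the image size."""
    return [(scale * math.sqrt(a), scale / math.sqrt(a)) for a in aspect_ratios]

scales = default_box_scales()
shapes = default_box_shapes(scales[0])
```

The first, highest-resolution feature map receives the smallest boxes and the last map the largest, matching the division of labor between layers described above.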

Comparison of Key Models

Model | Type | Speed | Accuracy | Use Cases
YOLO | Single-Stage | Very Fast | Moderate | Real-time detection in videos, drones
SSD | Single-Stage | Fast | Good | Mobile and embedded systems
Faster R-CNN | Two-Stage | Slow | High | High-precision tasks like medical imaging
Mask R-CNN | Two-Stage | Slower | Very High | Object detection with segmentation
DETR | Transformer-Based | Moderate | High | Research and general detection tasks

VI. Conclusion
Deep learning has transformed image detection, improving accuracy in tasks like object and facial
recognition. However, challenges remain, including high computational costs, dataset biases, and the
lack of interpretability in models. Training deep models requires significant resources, which limits their
deployment on smaller devices. Biases in training data can lead to unfair results, while the "black-box"
nature of models makes them difficult to trust, especially in sensitive areas like healthcare. Additionally,
real-world conditions often reduce model performance. Future research should focus on creating more
efficient, interpretable models that can handle diverse environments and ensure fairness.

VII. References

1. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
2016, 770–778.
2. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image
Recognition. In Proceedings of the International Conference on Learning Representations (ICLR),
2015, 1–14.
3. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time
Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2016, 779–788.
4. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection
with Region Proposal Networks. In Advances in Neural Information Processing Systems
(NeurIPS), 2015, 91–99.
5. Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
2015, 3431–3440.
6. Chen, L.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic
Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected
CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4), 834–848.
7. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer
Vision (ICCV), 2015, 1440–1448.
8. Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick,
C.L. Microsoft COCO: Common Objects in Context. In European Conference on Computer
Vision (ECCV), 2014, 740–755.
9. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional
Neural Networks. In Advances in Neural Information Processing Systems (NeurIPS), 2012,
1097–1105.
10. Zhou, B.; Wang, D.; Xo, X. Object Detection with Deep Learning: A Review. IEEE
Transactions on Neural Networks and Learning Systems, 2018, 29(8), 1234–1248.
PERSONAL REFLECTION:
Over the past month, we have gained valuable insights into the application of deep learning for image
detection. We have developed new skills in training and fine-tuning convolutional neural networks
(CNNs) and other deep learning architectures, enhancing our understanding of how these models can be
leveraged for tasks like object detection, medical imaging, and facial recognition. We have also
encountered challenges related to model performance, such as dealing with dataset biases and the high
computational cost of training deep networks. Despite these obstacles, this experience has deepened our
understanding of the practical implications of deep learning in image detection and its potential to
revolutionize fields such as healthcare, security, and autonomous systems.

DATE:

SIGNATURE OF THE SUPERVISOR:

40 pages
An Evaluation of Deep Learning Methods For Small Object
No ratings yet
An Evaluation of Deep Learning Methods For Small Object
18 pages
Final Project Paper Akash
No ratings yet
Final Project Paper Akash
5 pages
An Investigation of Deep Neural Network Based Techniques For Object Detection An
No ratings yet
An Investigation of Deep Neural Network Based Techniques For Object Detection An
6 pages
Investigations of Object Detection in Im
No ratings yet
Investigations of Object Detection in Im
46 pages
Implementation of An Improved Multi-Object Detection, Tracking, and Counting For Autonomous Driving
No ratings yet
Implementation of An Improved Multi-Object Detection, Tracking, and Counting For Autonomous Driving
29 pages
Yolov10 To Its Genesis A Decadal and Comprehensive
No ratings yet
Yolov10 To Its Genesis A Decadal and Comprehensive
49 pages
Lecture7 PDF
No ratings yet
Lecture7 PDF
228 pages
Multilayer Feed Forward Neural Network
No ratings yet
Multilayer Feed Forward Neural Network
8 pages
Lecture 1 - Neural Network Definitions and Concepts 1
No ratings yet
Lecture 1 - Neural Network Definitions and Concepts 1
4 pages
Nath Et Al 2024
No ratings yet
Nath Et Al 2024
18 pages
Unit - 1, Notes
No ratings yet
Unit - 1, Notes
38 pages
AIML MCQ All
No ratings yet
AIML MCQ All
20 pages
7-Knowledge Distillation
No ratings yet
7-Knowledge Distillation
29 pages
Resnet 50 1D CNN
No ratings yet
Resnet 50 1D CNN
24 pages
Difference Between Perceptron and Neuron
No ratings yet
Difference Between Perceptron and Neuron
3 pages
Deep L PPT
No ratings yet
Deep L PPT
8 pages
Survey of Neuromorphic Computing A Data Science Perspective
No ratings yet
Survey of Neuromorphic Computing A Data Science Perspective
4 pages
Nasscom Digital 101
No ratings yet
Nasscom Digital 101
6 pages
Unit 1: Introduction To Soft Computing and Neural Networks
No ratings yet
Unit 1: Introduction To Soft Computing and Neural Networks
6 pages
AD3511-DEEP LEARNING LAB MANUAL Revised
No ratings yet
AD3511-DEEP LEARNING LAB MANUAL Revised
72 pages
Lecture 08 On Neural Networks 1
No ratings yet
Lecture 08 On Neural Networks 1
15 pages
LLM Research Paper
No ratings yet
LLM Research Paper
2 pages
A Survey of Neural Networks Usage For Intrusion Detection Systems
No ratings yet
A Survey of Neural Networks Usage For Intrusion Detection Systems
18 pages
The GR4J Hydrological Model
No ratings yet
The GR4J Hydrological Model
13 pages
Lecture04 Graph SVM
No ratings yet
Lecture04 Graph SVM
54 pages
A Review of Handwritten Text Recognition Using Machine Learning and Deep Learning Techniques
No ratings yet
A Review of Handwritten Text Recognition Using Machine Learning and Deep Learning Techniques
6 pages
2022 - Exam 1 Solution Source - 2022
No ratings yet
2022 - Exam 1 Solution Source - 2022
6 pages
Brochure AIMLEA 2024 SBI Collect REVISED Dates
No ratings yet
Brochure AIMLEA 2024 SBI Collect REVISED Dates
6 pages
期末專題1
No ratings yet
期末專題1
14 pages
Machine Learning Syllabus
No ratings yet
Machine Learning Syllabus
5 pages
Manish Bhatt 2451137 ProjectIV
No ratings yet
Manish Bhatt 2451137 ProjectIV
20 pages
Python Package Imports
No ratings yet
Python Package Imports
3 pages
Emotion Classification On Youtube Comments
No ratings yet
Emotion Classification On Youtube Comments
5 pages
Machine Learning - Advanced Concepts
From Everand
Machine Learning - Advanced Concepts
Derrick Mwiti
No ratings yet
YOLO Object Detection Explained: Definitive Reference for Developers and Engineers
From Everand
YOLO Object Detection Explained: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet

Fig. 1. Object Detection Examples

B. Dataset Survey for Image Detection

IV. Deep Learning Models for Image Detection

Fig. 3. Image Detection Examples

V. Object Detection Models

Fig. 4. YOLOv2 block diagram

B. R-CNN (Region-based CNN)

Fig. 5. Faster R-CNN block diagram

MobileNetV2 builds on the lightweight design of MobileNetV1 with significant architectural

The MobileNetV2 architecture implements these bottleneck blocks 17 times, compared to

Comparison with MobileNetV1

Feature MobileNetV1 MobileNetV2

Fig. 7. MobileNetV2 block diagram

D. DETR (DEtection TRansformer)

Fig. 8. DETR (DEtection TRansformer) block diagram

E. SSD (Single Shot MultiBox Detector)

Comparison of Key Models

SIGNATURE OF THE SUPERVISOR:
