YOLOv1 to v8: Unveiling Each Variant - A Comprehensive Review of YOLO
ABSTRACT This paper implements a systematic methodological approach to review the evolution of YOLO variants. Each variant is dissected by examining its internal architectural composition, providing a thorough understanding of its structural components. Subsequently, the review highlights key architectural innovations introduced in each variant, shedding light on the incremental refinements. The review includes benchmarked performance metrics, offering a quantitative measure of each variant's capabilities. The paper further presents the performance of YOLO variants across a diverse range of domains, manifesting their real-world impact. This structured approach ensures a comprehensive examination of YOLO's journey, methodically communicating its internal advancements and benchmarked performance before delving into domain applications. It is envisioned that the incorporation of concepts such as federated learning can introduce a collaborative training paradigm, where YOLO models benefit from training across multiple edge devices, enhancing privacy, adaptability, and generalisation.
INDEX TERMS Computer vision, YOLO, edge computing, manufacturing, object detection, real-time.
a dialogue on the future trajectory of research, outlining potential pathways to strengthen YOLO's robustness in the realm of object detection.

In summary, this article dives into the evolving YOLO architectures, comprehensively evaluating their effectiveness and pondering future prospects. The article unravels how YOLO has transformed over time, how effective it is now, and what the future of YOLO variants may hold in the domain of computer vision.

A. SURVEY OBJECTIVE
This article seeks to examine the factors fuelling the profound adoption of the YOLO variants, with a focus on their evolution from YOLOv1 to YOLOv8. Figure 1 presents the article structure, with the key components of the article highlighted in green. These components form the key objectives of this paper:
1) Architectural Evolution Analysis: Examine the architectural innovations across YOLO variants, elucidating motivations and impact on real-time industrial applications.
2) Training Strategies Scrutiny: Analyse YOLO's training methodologies, including data augmentation and transfer learning, to understand its adaptability across diverse domains.
3) Real-world Impact Assessment: Explore specific domains where YOLO has manifested impressive efficacy, showcasing its practical versatility.
4) Challenges and Future Directions Exploration: Identify real-time challenges, such as occlusions and scale variations, and propose future research directions to fortify YOLO's standing in object detection.
B. IMPORTANCE OF SURVEY
Although several papers have reviewed YOLO architectures, they often exhibit limitations such as focusing on specific YOLO variants [10], [11], or concentrating on particular application domains [12]. However, this review distinguishes itself as the first to provide an in-depth analysis of mainstream YOLO variants from YOLOv1 to YOLOv8. The analysis delves into the innovations fuelling the performance of each variant, offering a comparative study that spans more than 20 domains.

Furthermore, beyond examining the strengths of YOLO architectures, this comprehensive review sheds light on the persistent challenges faced by the YOLO series. By outlining current limitations and areas for improvement, the review aims to present a nuanced understanding of the ongoing hurdles. Additionally, it anticipates future developments and enhancements, providing insights into potential directions for overcoming existing challenges. This forward-looking approach positions the review as a valuable resource not only for understanding the historical evolution of YOLO but also for anticipating its trajectory in addressing emerging issues and meeting the demands of diverse domains.

C. ORGANIZATION OF PAPER
This article is structured to succinctly examine the evolution and inspiration fuelling the popularity of YOLO variants in industrial applications. Beginning with an introduction that lays the foundations, subsequent sections are intricately structured. Section II presents an overview of object detection. Section III delves into the motivations and implications of architectural reforms across the variants, YOLOv1 to YOLOv8.

Section IV scrutinizes the versatility of YOLO variants through an examination of training methodologies, including data augmentation, transfer learning, and training datasets. In Section V, a rigorous empirical assessment of YOLOv1-v8 is conducted, benchmarking against contemporaneous models to quantify performance with respect to Mean Average Precision (mAP), Frames Per Second (FPS), and internal intricacies such as the nature of the loss functions deployed.

Section VI explores wide-ranging industrial applications where YOLO has demonstrated efficacy, showcasing its practical versatility. Section VII identifies barriers such as handling occlusions, addressing biases, and real-time edge deployment, and proposes future research directions. Finally, Section VIII summarises key findings, highlighting the factors contributing towards YOLO's popularity and its significant implications for the field of object detection. This organized structure ensures a coherent and insightful journey through the multifaceted analysis of YOLO's evolution and impact in object detection and the wider field of computer vision.
II. OBJECT DETECTION
Addressing the intricacies of object detection presents numerous challenges. A key issue involves effectively managing fluctuations in image resolutions and aspect ratios [13], a task aggravated when the target objects manifest substantial differences in spatial dimensions [14]. The presence of class imbalance, particularly in scenarios where ascertaining a sufficient number of images for specific classes is challenging [15], can detrimentally impact architectural performance, leading to biased predictions [16].

Furthermore, a noteworthy hurdle is the computational complexity associated with object detection architectures, demanding considerable computational resources in terms of power, memory, and time [17], [18]. Figure 2 illustrates object detection for both single and multiple objects in an image; detectors with deep internal networks require significant computational capabilities to process intricate datasets and extract essential features.
Object detection can be bifurcated into two categories: single- and two-stage detectors. The latter involves proposing candidate regions within an image, followed by classification and localization of the proposed regions. Examples of two-stage detectors include RCNN (Region-based Convolutional Neural Network) [19], Fast R-CNN [20], Faster R-CNN [21], and FPN (Feature Pyramid Network) [22].

RCNN [19], proposed in 2014, deployed selective search for candidate region proposals, utilising a convolutional network for feature extraction. Fast R-CNN [20] alleviates these concerns by proposing ROI pooling, which significantly reduces computation by extracting fixed-size feature maps for each region from the original feature maps.
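To make the mechanism concrete, the following minimal sketch applies ROI pooling via torchvision's built-in operator; the feature-map dimensions and box coordinates are illustrative assumptions, not values from the original papers.

```python
# Minimal ROI pooling sketch (Fast R-CNN style) using torchvision.
# Shapes and box coordinates are illustrative placeholders.
import torch
from torchvision.ops import roi_pool

feature_map = torch.randn(1, 256, 50, 50)        # backbone output for one image
# Each ROI is (batch_index, x1, y1, x2, y2) in feature-map coordinates.
rois = torch.tensor([[0., 4., 4., 24., 24.],
                     [0., 10., 12., 40., 30.]])
# Regions of arbitrary size are pooled to a fixed 7 x 7 grid, giving the
# downstream fully connected layers a constant input size.
pooled = roi_pool(feature_map, rois, output_size=(7, 7), spatial_scale=1.0)
print(pooled.shape)                               # torch.Size([2, 256, 7, 7])
```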
Faster R-CNN [21] improved upon Fast R-CNN by implementing the Region Proposal Network (RPN). This innovation eradicated the need for a separate proposal stage by directly generating region proposals from feature maps, optimising both speed and accuracy.
FPN (Feature Pyramid Network) [22] tackled the challenge of detecting targets at multiple scales by generating a feature pyramid. This pyramid fused feature maps of varying resolutions from different network stages, empowering effective detection of targets across different scales. Notwithstanding their impressive accuracy, two-stage detectors are limited by their high computational demands.
In contrast, single-stage detectors aim to detect objects in a single pass, side-stepping the need for a separate region proposal step. Notable single-stage detectors include SSD (Single Shot Multibox Detector), YOLO variants (You Only Look Once), RefineDet++, DSSD (Deconvolution Single Shot Detector), and RetinaNet.

SSD [23] deploys manifold convolutional feature maps at various scales to predict bounding boxes and class probability scores, effectively detecting objects of various sizes and shapes in a single forward pass.
RefineDet++ [24] optimises the original RefineDet architecture through iterative refinement of target proposals across multiple stages, improving accuracy via enhanced feature fusion mechanisms and refined target boundaries.

DSSD (Deconvolution Single Shot Detector) integrates deconvolution layers to preserve spatial information lost during feature pooling, enabling the model to capture fine-grained details by maintaining spatial resolution.
RetinaNet [25] addresses class imbalance via Focal Loss, attributing higher weights to misclassified samples, enhancing the architecture's ability to handle class imbalance and improve detection performance.
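To illustrate the principle, a minimal sketch of a binary focal loss follows; the alpha and gamma values are the defaults commonly quoted for RetinaNet, and the formulation here is generic rather than a reproduction of the reference implementation.

```python
# Minimal binary focal loss sketch: cross-entropy down-weighted for
# well-classified examples so training focuses on hard samples.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)           # prob. of true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

loss = focal_loss(torch.randn(8), torch.randint(0, 2, (8,)).float())
```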
Vision Transformer (ViT) was introduced in 2020 [27]. Based on the encoder-decoder mechanism, ViT extends the concept of tokens to visual data streams. As an alternative to CNNs, ViT can be utilized for backbone feature extraction. Selecting ResNet as a baseline, Wu et al. [26] integrated ViTs by replacing the ultimate convolutional layer. This enabled the preceding convolutional layers to extract low-level features, which then segued into the ViT, demonstrating the adaptability of the transformer architecture in the territory of computer vision.

III. EVOLUTION OF YOLO ARCHITECTURE

A. YOLOv1
Announced in 2016, YOLOv1 marked a profound leap in single-shot object detection. Enthused by the GoogLeNet architecture [28], YOLOv1 deployed a unique approach by substituting GoogLeNet's inception modules with (1 × 1) convolutions followed by (3 × 3) convolutional filters.

The architecture, benchmarked on the Pascal VOC 2007 and 2012 datasets [29], exploited the Darknet framework for training. Featuring 24 convolutional layers, of which only four were followed by max-pooling layers, YOLOv1 embraced (1 × 1) convolutions and global average pooling as standout features.

Initially trained on the ImageNet dataset [30], the model was fine-tuned by adding four additional convolutional layers and two fully connected layers with randomly initialized weights. Leaky Rectified Linear Units (LReLU) were employed as the activation function, except for the final layer, which used a linear activation. Despite its pioneering status, YOLOv1 exhibited drawbacks, including large localization errors and lower recall compared to two-stage object detectors.
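The grid-based prediction described above can be made concrete with a short sketch of YOLOv1's output layout; S = 7 grid cells, B = 2 boxes, and C = 20 classes follow the original formulation, while the tensor values below are random placeholders.

```python
# Sketch of YOLOv1's output tensor: an S x S grid where each cell predicts
# B boxes (x, y, w, h, confidence) plus C conditional class probabilities.
import torch

S, B, C = 7, 2, 20
pred = torch.randn(S, S, B * 5 + C)      # network output: 7 x 7 x 30

cell = pred[3, 4]                        # predictions of a single grid cell
boxes = cell[:B * 5].view(B, 5)          # two (x, y, w, h, conf) tuples
class_probs = cell[B * 5:]               # 20 conditional class scores
# Class-specific confidence = box confidence x class probability.
scores = boxes[:, 4:5] * class_probs     # (B, C) detection scores
```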
B. YOLOv2
YOLOv2 [31] was inspired by the once-popular VGG architecture, featuring the Darknet-19 framework with 19 convolutional layers.

C. YOLOv3
YOLOv3 [33] addressed the shortcomings observed in its predecessors by concentrating on rectifying localisation errors and optimising detection efficiency, particularly for smaller objects. Benchmarked on the COCO dataset [34], YOLOv3 presented improved performance in detecting smaller objects, while encountering difficulties in achieving precise results for medium and large-sized objects.

Constructed on the Darknet-53 framework, YOLOv3 employs a robust network comprising 53 convolutional layers, incorporating 3 × 3 and 1 × 1 convolutional filters along with skip connections, as presented in Table 2. Conspicuously, the Darknet-53 framework, with its 53 convolutional layers, achieved double the speed of ResNet-152 [35].

TABLE 2. YOLOv3 internal architecture.

c: THREE-SCALE DETECTION MECHANISM
YOLOv3 generated feature maps at three distinct scales, down-sampling the input by factors of 32, 16, and 8. Detection was carried out on a 13 × 13 feature map after a series of convolutions, followed by a 26 × 26 feature map obtained via up-sampling and concatenation. Additionally, a 52 × 52 feature map was involved in the detection process. This three-scale mechanism enabled YOLOv3 to detect large, medium, and small-sized objects using distinct feature maps.
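The arithmetic of the three-scale mechanism can be sketched as follows; the 416 × 416 input resolution is the configuration commonly used with YOLOv3 and is assumed here purely for illustration, as are the channel counts.

```python
# Strides of 32, 16 and 8 map a 416 x 416 input to 13 x 13, 26 x 26 and
# 52 x 52 detection grids; coarse maps are up-sampled and concatenated
# with finer ones, as in YOLOv3's head.
import torch
import torch.nn.functional as F

img = 416
for stride in (32, 16, 8):
    print(f"stride {stride}: {img // stride} x {img // stride} grid")

coarse = torch.randn(1, 256, 13, 13)     # deepest feature map
mid = torch.randn(1, 128, 26, 26)
fused = torch.cat([F.interpolate(coarse, scale_factor=2.0), mid], dim=1)
print(fused.shape)                       # torch.Size([1, 384, 26, 26])
```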
D. YOLOv4
The authors of YOLOv4 [36] introduced a plethora of advanced techniques and sophisticated methodologies, distinguishing YOLOv4 as a faster and more accurate object detector tailored for production systems compared to its predecessors.

The YOLOv4 architecture was defined through a sequence of pivotal components: initial image processing; feature extraction utilising potent networks such as VGG16 [37], Darknet53, and ResNet50; feature scaling with neck structures such as the Feature Pyramid Network (FPN) and Path Aggregation Network (PAN) [38]; and the integration of single-stage and two-stage detectors for prediction.

In their experimentation with architectures, the authors compared CSPResNeXt50, CSPDarknet53, and EfficientNetB3, ultimately selecting CSPDarknet53 as the backbone. CSPDarknet53 featured 29 convolutional layers with 3 × 3 filters and around 27.6 million parameters, incorporating cross-stage partial (CSP) connections to enhance gradient combination efficiency with minimal computational cost. Key architectural components included:
YOLOv6 incorporated new classification and regression losses, employing the VariFocal loss for classification and an SIoU/GIoU regression loss.
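As a concrete illustration of one such objective, the following is a hedged sketch of a generic GIoU regression loss; it is not YOLOv6's exact implementation, and boxes are assumed valid (x1 < x2, y1 < y2).

```python
# Generic GIoU loss sketch: IoU penalised by the normalised area of the
# smallest enclosing box, so disjoint boxes still receive a gradient.
import torch

def giou_loss(a, b, eps=1e-7):           # a, b: (N, 4) boxes as (x1, y1, x2, y2)
    lt = torch.max(a[:, :2], b[:, :2])   # intersection top-left
    rb = torch.min(a[:, 2:], b[:, 2:])   # intersection bottom-right
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    union = area_a + area_b - inter
    iou = inter / (union + eps)
    # Smallest axis-aligned box enclosing both inputs.
    enc_wh = torch.max(a[:, 2:], b[:, 2:]) - torch.min(a[:, :2], b[:, :2])
    enclose = enc_wh[:, 0] * enc_wh[:, 1]
    giou = iou - (enclose - union) / (enclose + eps)
    return (1.0 - giou).mean()
```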
c: SELF-DISTILLATION STRATEGY
YOLOv6 implemented a self-distillation strategy for both regression and classification tasks. This strategy assisted the model in distilling knowledge from its own predictions, contributing to improved performance and generalisation.

d: QUANTIZATION SCHEME WITH RepOptimiser AND CHANNEL-WISE DISTILLATION
The authors introduced a quantisation scheme for detection using RepOptimiser and channel-wise distillation. This scheme not only assisted in achieving a faster detector but also ensured that quantisation did not compromise accuracy.

e: BIDIRECTIONAL CONCATENATION (BiC) MODULE
YOLOv6 introduced a BiC module in the neck of the detector, enhancing localisation signals and delivering performance gains with negligible speed degradation.
f: ANCHOR-AIDED TRAINING (AAT) STRATEGY
AAT caters for both anchor-based and anchor-free paradigms without compromising inference efficiency.

g: ENHANCED BACKBONE AND NECK DESIGN
By deepening YOLOv6 to include another stage in the backbone and neck, the architecture achieved state-of-the-art performance on the COCO dataset at high-resolution input.

h: SELF-DISTILLATION STRATEGY
A new self-distillation strategy was implemented to boost the performance of the smaller YOLOv6 models, employing an auxiliary regression branch during training and removing it at inference to avoid a marked speed decline.

i: MODEL VARIANTS AND PERFORMANCE
The authors provide eight scaled variants, ranging from YOLOv6-N to YOLOv6-L6, catering to different application requirements. Benchmarked on the MS COCO dataset test-dev 2017, the largest variant achieved an impressive AP of 57.2% while maintaining a speed of around 29 FPS on an NVIDIA Tesla T4.
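The self-distillation idea recurring in these subsections can be sketched generically as below; the temperature and weighting are illustrative assumptions, and YOLOv6's actual formulation differs in detail (e.g., it also distils the box regression branch).

```python
# Generic self-distillation sketch: the student minimises a weighted sum of
# the task loss and a KL term against (its own) teacher predictions.
import torch
import torch.nn.functional as F

def self_distill_loss(student_logits, teacher_logits, targets, T=2.0, w=0.5):
    task = F.cross_entropy(student_logits, targets)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                            # rescale to keep gradient magnitude
    return (1 - w) * task + w * kl
```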
G. YOLOv7
YOLOv7, released in 2022, represents an innovative advancement in the realm of object detection [43]. At the time of its release, it outperformed many existing object detectors across the range from 5 FPS to an impressive 160 FPS. Notably, YOLOv7 was trained on the MS COCO dataset without leveraging pre-trained backbones, showcasing its ability to achieve remarkable results through its unique training approach. Architectural advents include:

a: EXTENDED EFFICIENT LAYER AGGREGATION NETWORK (E-ELAN)
YOLOv7 proposed an extended version of the efficient layer aggregation network (ELAN) [44], termed E-ELAN. ELAN is a strategic mechanism facilitating efficient learning and convergence in deep models by controlling the shortest longest gradient path. E-ELAN optimises this concept for models with unlimited stacked computational blocks. It achieves this by shuffling and merging cardinality features, thus augmenting the network's learning capabilities without compromising the original gradient path.

b: MODEL SCALING FOR CONCATENATION-BASED MODELS
YOLOv7 adopted a concatenation-based architecture, and to generate models of varying sizes, it introduced a novel mechanism for model scaling. Unlike standard scaling techniques, such as depth scaling, YOLOv7 ensured that the depth and width of the block are scaled proportionally. This maintained the optimal structure of the model, preventing unwanted distortions in the hardware usage of the model.

c: PLANNED RE-PARAMETERIZED CONVOLUTION (RepConvN)
Inspired by re-parameterized convolutions (RepConv) from YOLOv6, YOLOv7 introduced RepConvN. In contrast to RepConv, RepConvN eradicates the identity connection.
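The re-parameterisation principle behind RepConv-style blocks can be illustrated with a minimal sketch: after training, a parallel 1 × 1 branch is folded into the 3 × 3 kernel, leaving a single convolution at inference. The shapes below are illustrative and batch-norm folding is omitted.

```python
# Folding a parallel 1x1 conv branch into a 3x3 kernel: zero-pad the 1x1
# kernel to 3x3 and add it, so one convolution reproduces both branches.
import torch
import torch.nn.functional as F

k3 = torch.randn(64, 64, 3, 3)                 # 3x3 branch weights
k1 = torch.randn(64, 64, 1, 1)                 # 1x1 branch weights
fused = k3 + F.pad(k1, [1, 1, 1, 1])           # pad 1x1 to 3x3 and merge

x = torch.randn(1, 64, 32, 32)
two_branch = F.conv2d(x, k3, padding=1) + F.conv2d(x, k1)
one_branch = F.conv2d(x, fused, padding=1)
print(torch.allclose(two_branch, one_branch, atol=1e-4))   # True
```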
TABLE 4. Key features and architectural evolution.

TABLE 5. Training and optimization.
but not limited to, random scaling, rotation, translation, illumination, and the popular Mosaic (YOLOv5), serves as a cornerstone for enhancing variant robustness. By exposing variants to a myriad of augmented instances during training, YOLO becomes adept at handling the inherent variations and complexities present in real-world scenarios. This augmentation strategy, embedded within the algorithmic pipeline, not only mitigates the risk of overfitting but also fosters a model that generalizes effectively across diverse object appearances, orientations, and environmental conditions.
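A simplified sketch of the Mosaic idea follows: four equally sized images are tiled around a random centre to form one composite, exposing the detector to varied scales and contexts. Label shifting is omitted, so this is indicative rather than a faithful reproduction of the YOLOv5 pipeline.

```python
# Simplified Mosaic augmentation: tile four images around a random centre.
# Each source image is assumed pre-resized to (size, size); box labels
# would need the same crop/offset applied (omitted here).
import numpy as np

def mosaic(imgs, size=640):
    canvas = np.zeros((size, size, 3), dtype=np.uint8)
    cx = np.random.randint(size // 4, 3 * size // 4)     # random mosaic centre
    cy = np.random.randint(size // 4, 3 * size // 4)
    regions = [(0, cy, 0, cx), (0, cy, cx, size),        # top-left, top-right
               (cy, size, 0, cx), (cy, size, cx, size)]  # bottom-left/right
    for img, (y1, y2, x1, x2) in zip(imgs, regions):
        canvas[y1:y2, x1:x2] = img[: y2 - y1, : x2 - x1]
    return canvas
```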
B. DYNAMIC TRAINING MECHANISMS
The training methodologies deployed across different variants of YOLO, as presented in Table 5, underscore a continual evolution in optimizing object detection models. YOLOv1 initiated the journey with a grid-based approach leveraging the Darknet framework for training on the Pascal dataset. Subsequent variants, like YOLOv2 and YOLOv3, expanded their horizons by incorporating hierarchical classification and adopting the Darknet-53 backbone, along with introducing innovative techniques such as the FPN. YOLOv4 further enhanced the training process through techniques like enhanced quantization, PAN, and RepVGG. YOLOv5 marked a transition to PyTorch, embracing AutoAnchor, Mosaic, and MixUp for improved performance. YOLOv6 introduced advancements like RepVGG, PAN, and EfficientRep, while YOLOv7 continued to innovate with ELAN and model scaling. YOLOv8, developed in PyTorch, stands out with its C2f module, EfficientRep, CIoU, and DFL for robust and efficient training.

This iterative refinement in training techniques across YOLO versions showcases a commitment to optimizing object detection models through a diverse range of methodologies, each tailored to address the specific challenges and opportunities presented by evolving datasets.
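For completeness, MixUp, mentioned above among the YOLOv5-era augmentations, can be sketched as follows; the Beta-distribution parameter is an illustrative assumption rather than a documented default.

```python
# Minimal MixUp sketch: blend two images with a Beta-distributed weight;
# the same weight mixes the two images' label contributions.
import numpy as np

def mixup(img_a, img_b, alpha=32.0):
    lam = np.random.beta(alpha, alpha)             # blending weight in (0, 1)
    mixed = (lam * img_a.astype(np.float32) +
             (1.0 - lam) * img_b.astype(np.float32)).astype(np.uint8)
    return mixed, lam
```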
V. YOLO VERSIONS: A COMPARATIVE ANALYSIS
This section provides a comparative analysis of the reviewed YOLO variants from YOLOv1 to YOLOv8, across a wide range of metrics, as presented in Table 6.

YOLOv1: Pioneer in Object Detection (2015). The inaugural version of YOLO, YOLOv1, introduced the groundbreaking concept of real-time object detection using a single-stage, grid-based architecture. Deploying the Darknet24 framework, it achieved a remarkable Mean Average Precision (mAP) of 63.4% while maintaining a processing speed of 45 frames per second (FPS).

YOLOv2: Refinements and Anchor Boxes (2016). Building upon the success of YOLOv1, YOLOv2 introduced anchor boxes for improved localization accuracy. Implemented within the Darknet-19 framework, it achieved a notable increase in mAP, reaching 69.0%, and maintained real-time processing capabilities with 52 FPS.

YOLOv3: Multi-scale Features and Loss Functions (2018). YOLOv3 marked a balanced approach by adopting a multi-scale feature extraction architecture and introducing novel loss functions such as CIoU, GIoU, and BCE. Utilizing the Darknet53 framework, it achieved a mAP of 57.9% and demonstrated the ability to handle object detection across various scales at 34 FPS.

YOLOv4: Advanced Loss Functions (2020). With the adoption of the CSPDarknet53 framework, YOLOv4 emphasized advanced loss functions, including CIoU, DFL, and BCE, aiming to enhance bounding box accuracy while sustaining real-time processing. Despite a decrease in mAP to 44.3%, it exhibited a high FPS of 65.

YOLOv5: Leap in Accuracy and Efficiency (2020). A significant leap in accuracy and efficiency, YOLOv5 implemented the Modified CSP v7 architecture in PyTorch. With a single-stage detection mechanism and novel loss functions (CIoU, DFL, BCE), it achieved a mAP of 50.7% and a substantial increase in FPS to 200, showcasing its efficiency in real-time applications.

YOLOv6 to YOLOv8: Iterative Improvements (2022-2023). The subsequent iterations, YOLOv6, YOLOv7, and YOLOv8, demonstrate a commitment to iterative improvements. YOLOv6, utilizing the EfficientRep architecture, improved accuracy to 52.5%, while YOLOv7, based on RepConvN, achieved a mAP of 56.8%. YOLOv8, introducing an anchor-free model, maintained a high accuracy of 53.9% with an impressive processing speed of 280 FPS.
VI. REAL-WORLD APPLICATIONS AND IMPACT

A. SURVEILLANCE SYSTEMS AND PUBLIC SAFETY
YOLO's real-time processing capabilities make it invaluable in surveillance systems, enhancing public safety through the efficient monitoring of public spaces [46].

B. AUTONOMOUS VEHICLES AND TRAFFIC MANAGEMENT
In the realm of autonomous vehicles, YOLO plays a crucial role in object detection for obstacle avoidance and navigation [47]. Its rapid identification and classification of objects contribute to the safe and efficient operation of autonomous vehicles [48]. YOLO also supports traffic management systems by providing real-time information on road conditions [49].

C. INDUSTRIAL AUTOMATION AND QUALITY CONTROL
YOLO finds applications in industrial settings for automation and quality control [50]. In manufacturing, it can detect and inspect defects in products, ensuring adherence to quality standards [51]. The real-time nature of YOLO facilitates swift decision-making [52] in automated processes, contributing to increased efficiency [53] and reduced errors [54] in areas such as defect detection.

D. HEALTHCARE IMAGING AND DIAGNOSIS
In medical imaging, YOLO demonstrates efficacy in detecting and localizing abnormalities, aiding medical professionals in timely and accurate diagnoses in areas such as cancer and exudate detection for the early diagnosis of diabetic retinopathy [55]. YOLO's real-time processing is particularly valuable in scenarios where quick decisions are critical for patient care.
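Across such domains, deployment often reduces to a few lines of inference code. The sketch below assumes the ultralytics Python package and a pretrained YOLOv8 checkpoint; the input filename and confidence threshold are placeholders.

```python
# Hedged sketch of off-the-shelf YOLOv8 inference with the ultralytics API.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                 # small pretrained variant
results = model("scan.jpg", conf=0.25)     # detect at a 0.25 confidence cut-off
for box in results[0].boxes:
    print(box.xyxy, box.conf, box.cls)     # coordinates, confidence, class id
```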
E. ENVIRONMENTAL MONITORING AND WILDLIFE CONSERVATION
YOLO's adaptability extends to environmental monitoring, supporting wildlife conservation, biodiversity studies, and renewable energy. It can detect and track animals in their natural habitats, aiding researchers in population monitoring and protection efforts. YOLO's real-time capabilities enhance the efficiency of conservation initiatives.

F. RETAIL AND CUSTOMER EXPERIENCE
In the retail domain, YOLO variants have been implemented to enhance customer experiences and optimise several aspects of the supply chain. By leveraging their efficient object detection and tracking, YOLO variants can significantly contribute to automated inventory management, offering retailers real-time analysis of their stock levels and product availability [56], [57].
To further illustrate YOLO's impact, Table 7 provides an overview of diverse applications and research studies leveraging YOLO. Each entry in the table highlights the reference, detection type, YOLO model used, key characteristics of the application, and performance metrics achieved. Notably, multiple applications have optimised the selected YOLO architecture for diverse purposes. While the majority of the works presented prioritised attaining high accuracy across metrics such as mAP, precision, and recall, certain works, driven by limitations in hardware resources or domain restrictions, directed efforts toward optimising Frames Per Second (FPS) for expedited inferencing, highlighting YOLO's versatility in adapting to the specific needs of different applications.

Another notable observation showcased in Table 7 is that most variants implemented are v3 onwards. This preference can be attributed to the crucial role played by YOLO-v3 as the initial variant addressing the challenge of small object detection. YOLO-v3 introduced multi-scale detection mechanisms, with subsequent variants, i.e., PANet (YOLOv4), building on this concept, thereby unlocking applicability in scenarios where the detection of small targets was essential.
VII. CHALLENGES
Despite its remarkable success, YOLO faces certain challenges and areas for improvement. This section critically examines the limitations of the YOLO framework, proposes potential avenues for future research to address these challenges, and explores the integration of YOLO with edge deployment and federated learning for enhanced privacy and adaptability.
A. HANDLING OCCLUSIONS AND CLUTTER
One persistent challenge for YOLO is effectively handling occluded objects and scenes with high clutter. In scenarios where objects overlap or are partially obscured [87], YOLO may struggle to accurately detect and delineate individual instances. Future research could explore novel approaches, such as improved feature representations or context-aware models, to enhance YOLO's ability to cope with occlusions and cluttered scenes [88].
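The difficulty can be made concrete with a minimal sketch: detectors that rely on non-maximum suppression may discard a genuinely distinct but heavily occluded object as a duplicate. The coordinates below are illustrative.

```python
# Two heavily overlapping ground-truth objects: greedy NMS keeps only the
# higher-scoring box, so the occluded instance is lost.
import torch
from torchvision.ops import nms

boxes = torch.tensor([[10., 10., 60., 60.],    # object A
                      [18., 12., 68., 62.]])   # object B, occluding A (IoU ~ 0.68)
scores = torch.tensor([0.9, 0.8])
keep = nms(boxes, scores, iou_threshold=0.5)
print(keep)                                    # tensor([0]) - B is suppressed
```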
B. SCALE VARIATIONS AND FINE-GRAINED OBJECT DETECTION
The robust detection of objects at varying scales and the identification of fine-grained details remain areas where YOLO can be refined [89]. Adapting the architecture to better handle small or distant objects, potentially through multi-scale feature fusion strategies, could elevate YOLO's performance in scenarios demanding fine-grained object detection. The integration of federated learning can contribute to the enhancement of YOLO's adaptability across diverse scales by leveraging collaborative learning from edge devices.
C. DOMAIN ADAPTATION AND GENERALIZATION
While YOLO has showcased versatility across domains, there is room for improvement in domain adaptation [90]. Ensuring robust performance when transitioning from one environment to another [4], especially in scenarios with significant domain shifts, is a challenge. The integration of federated learning introduces a collaborative approach to domain adaptation, allowing YOLO models to adapt to diverse edge environments through decentralized learning.
D. EXPLAINABILITY AND INTERPRETABILITY
As with any machine learning system, addressing biases in training data and ensuring ethical considerations are paramount [91]. YOLO, like other object detection models, may exhibit biases that mirror the biases present in the data it was trained on. The integration of federated learning [92] can contribute to addressing biases by ensuring a more diverse and representative dataset across edge devices, enhancing the fairness and interpretability [93] of YOLO models.
E. ADDRESSING BIASES AND ETHICAL CONSIDERATIONS
As YOLO evolves, considerations for privacy preservation become increasingly important. The integration of YOLO with federated learning aligns with privacy-preserving objectives by allowing models to be trained collaboratively across edge devices without centralizing sensitive data [94]. This integration addresses ethical considerations related to data privacy in various applications, from surveillance to healthcare.
F. REAL-TIME PROCESSING OPTIMIZATION
While YOLO is renowned for its real-time processing capabilities, continuous optimization in this aspect is essential [95]. Future research may explore innovative techniques for further improving inference speed without compromising accuracy. The integration of edge deployment and federated learning introduces a decentralized approach to real-time processing, where models are trained collaboratively on edge devices, contributing to enhanced efficiency.
G. EDGE DEPLOYMENT AND FEDERATED LEARNING
The deployment of YOLO at the edge and the integration with federated learning present exciting opportunities [96]. Edge devices benefit from YOLO's efficiency, enabling on-device object detection without relying heavily on centralized servers [97]. Federated learning introduces a collaborative training paradigm where YOLO models are trained across multiple edge devices [98], enhancing privacy, adaptability, and generalization [99]. This integration aligns with the evolving landscape of decentralized and privacy-preserving machine learning.
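The federated-averaging principle underpinning this paradigm can be sketched minimally as follows; client sampling, dataset-size weighting, and secure aggregation are all omitted, so this is an illustration rather than a deployable implementation.

```python
# Minimal federated averaging sketch: each edge device trains locally and
# only model weights are averaged centrally - raw images never leave devices.
import torch

def federated_average(client_state_dicts):
    avg = {}
    for key in client_state_dicts[0]:
        avg[key] = torch.stack(
            [sd[key].float() for sd in client_state_dicts]
        ).mean(dim=0)
    return avg

# Each round: broadcast averaged weights -> local training -> re-average.
```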
VIII. CONCLUSION
As we conclude this comprehensive exploration of YOLO's evolution, challenges, and integrations, it becomes evident that YOLO has not only shaped the landscape of object detection but continues to evolve dynamically, staying at the forefront of advancements in computer vision. From the pioneering YOLOv1 to the sophisticated YOLOv8, the architectural innovations and training strategies have propelled YOLO into the limelight, making it a go-to choice for real-time object detection.

Reviewing the first objective of this review, it is evident that YOLO variants have endured significant architectural innovations during their evolution. This progression includes highlights such as the introduction of Feature Pyramid Networks (FPN) in YOLOv3 and the incorporation of ELAN mechanisms in YOLOv7. Notably, the later variants have acknowledged the requirement for versatility to meet the diverse demands of industrial deployments. To address this, researchers have proposed several sub-variants of each architecture, such as v5s/m/l/x, each with varying internal architectural configurations. This approach enables developers to select a base architecture based on their specific accuracy and detection-rate requirements. The resulting versatility has permitted YOLO variants to successfully penetrate various applications in the industry, as evident from Table 7.

The second objective of the paper, which scrutinizes the training strategies for performance optimisation, reveals a comprehensive analysis of training methodologies across YOLO variants. As presented in Table 5, each variant not only endured testing on key benchmark datasets but also engaged in in-depth tuning of internal architectures. YOLOv4, for instance, transitioned from Darknet53 to CSPDarknet53, demonstrating a shift in architectural choices for enhanced performance. In the case of YOLOv6, the focus moved towards training optimisation through EfficientRep, followed by RepConvN (YOLOv7), indicating a deliberate effort to incorporate incremental training boosters.

These refined training strategies have bestowed developers with a rich selection pool, enabling them to select methodologies based on their specific domain requirements. This diversity is evident in the extensive range of domains presented in the third objective, highlighting YOLO variant deployments across various industries. The incremental advances in training strategies contribute significantly to the adaptability and performance optimization of YOLO variants in real-world applications.

In considering future challenges, it is envisioned that YOLO variants will continue to address and improve performance on small object targets, especially as they penetrate into more specialized areas such as precision manufacturing. This trajectory suggests a necessity for advancements in lightweight architectures that balance high accuracy with stringent FPS requirements. As YOLO progresses, meeting the demands of niche applications will likely drive further innovation in architectural design and optimisation, ensuring its continued relevance in domains with stringent requirements for precision and efficiency.

REFERENCES
[1] M. Hussain and R. Hill, "Custom lightweight convolutional neural network architecture for automated detection of damaged pallet racking in warehousing & distribution centers," IEEE Access, vol. 11, pp. 58879–58889, 2023.
[2] M. Hussain, "YOLO-v5 variant selection algorithm coupled with representative augmentations for modelling production-based variance in automated lightweight pallet racking inspection," Big Data Cognit. Comput., vol. 7, no. 2, p. 120, Jun. 2023.
[3] M. F. Talu, K. Hanbay, and M. H. Varjovi, "CNN-based fabric defect detection system on loom fabric inspection," Tekstil Konfeksiyon, vol. 32, no. 3, pp. 208–219, Sep. 2022.
[4] B. A. Aydin, M. Hussain, R. Hill, and H. Al-Aqrabi, "Domain modelling for a lightweight convolutional network focused on automated exudate detection in retinal fundus images," in Proc. 9th Int. Conf. Inf. Technol. Trends (ITT), May 2023, pp. 145–150.
[5] M. A. Ansari, A. Crampton, and S. Parkinson, "A layer-wise surface deformation defect detection by convolutional neural networks in laser powder-bed fusion images," Materials, vol. 15, no. 20, p. 7166, Oct. 2022.
[6] P. Lala Mehta and A. Kumar, "Livai: A novel resource-efficient real-time facial emotion recognition system based on a custom deep CNN model," SSRN Electron. J., Feb. 2022.
[7] M. Hussain, "YOLO-v1 to YOLO-v8, the rise of YOLO and its complementary nature toward digital manufacturing and industrial defect detection," Machines, vol. 11, no. 7, p. 677, Jun. 2023.
[8] A. Koubaa, A. Ammar, A. Kanhouch, and Y. AlHabashi, "Cloud versus edge deployment strategies of real-time face recognition inference," IEEE Trans. Netw. Sci. Eng., vol. 9, no. 1, pp. 143–160, Jan. 2022.
[9] Z. Zou, K. Chen, Z. Shi, Y. Guo, and J. Ye, "Object detection in 20 years: A survey," Proc. IEEE, vol. 111, no. 3, pp. 257–276, Mar. 2023.
[10] P. Jiang, D. Ergu, F. Liu, Y. Cai, and B. Ma, "A review of YOLO algorithm developments," Proc. Comput. Sci., vol. 199, pp. 1066–1073, Jan. 2022.
[11] P. P. Khaire, R. D. Shelke, D. Hiran, and M. Patil, "Comparative study of a computer vision technique for locating instances of objects in images using YOLO versions: A review," in Proc. Int. Conf. Inf. Commun. Technol. Intell. Syst., Springer, 2023, pp. 349–359.
[12] C. Chen, Z. Zheng, T. Xu, S. Guo, S. Feng, W. Yao, and Y. Lan, "YOLO-based UAV technology: A review of the research and its applications," Drones, vol. 7, no. 3, p. 190, Mar. 2023.
[13] X. Qian, B. Wu, G. Cheng, X. Yao, W. Wang, and J. Han, "Building a bridge of bounding box regression between oriented and horizontal object detection in remote sensing images," IEEE Trans. Geosci. Remote Sens., vol. 61, 2023.
[14] X. Qian, Y. Huo, G. Cheng, C. Gao, X. Yao, and W. Wang, "Mining high-quality pseudoinstance soft labels for weakly supervised object detection in remote sensing images," IEEE Trans. Geosci. Remote Sens., vol. 61, 2023.
[15] L. Li, X. Yao, X. Wang, D. Hong, G. Cheng, and J. Han, "Robust few-shot aerial image object detection via unbiased proposals filtration," IEEE Trans. Geosci. Remote Sens., vol. 61, 2023.
[16] S. Agarwal, J. O. D. Terrail, and F. Jurie, "Recent advances in object detection in the age of deep convolutional neural networks," 2019, arXiv:1809.03193.
[17] L. Liu, W. Ouyang, X. Wang, P. Fieguth, J. Chen, X. Liu, and M. Pietikäinen, "Deep learning for generic object detection: A survey," 2018, arXiv:1809.02165.
[18] C.-Y. Wang, H.-Y. M. Liao, Y.-H. Wu, P.-Y. Chen, J.-W. Hsieh, and I.-H. Yeh, "CSPNet: A new backbone that can enhance learning capability of CNN," 2020, arXiv:1911.11929.
[19] X. Xie, G. Cheng, J. Wang, X. Yao, and J. Han, "Oriented R-CNN for object detection," in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2021, pp. 3500–3509.
[20] R. Girshick, "Fast R-CNN," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 1440–1448.
[21] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, Jun. 2017.
[22] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 936–944.
[23] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot MultiBox detector," in Proc. Eur. Conf. Comput. Vis., 2016, pp. 21–37.
[24] C. Sun, Y. Ai, S. Wang, and W. Zhang, "Dense-RefineDet for traffic sign detection and classification," Sensors, vol. 20, no. 22, p. 6570, Nov. 2020.
[25] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," 2017, arXiv:1708.02002.
[26] D. Wu, S. Lv, M. Jiang, and H. Song, "Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments," Comput. Electron. Agricult., vol. 178, Nov. 2020, Art. no. 105742.
[27] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, "An image is worth 16 × 16 words: Transformers for image recognition at scale," 2020, arXiv:2010.11929.
[28] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 1–9.
[29] M. Everingham, L. van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal visual object classes (VOC) challenge," Int. J. Comput. Vis., vol. 88, no. 2, pp. 303–338, Jun. 2010.
[30] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2009, pp. 248–255.
[31] J. Redmon and A. Farhadi, "YOLO9000: Better, faster, stronger," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2017, pp. 7263–7271.
[32] T.-Y. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick, J. Hays, P. Perona, D. Ramanan, C. L. Zitnick, and P. Dollár, "Microsoft COCO: Common objects in context," 2014, arXiv:1405.0312.
[33] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," 2018, arXiv:1804.02767.
[34] T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," in Proc. Eur. Conf. Comput. Vis., 2014, pp. 740–755.
[35] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770–778.
[36] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal speed and accuracy of object detection," 2020, arXiv:2004.10934.
[37] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, arXiv:1409.1556.
[38] Z. Ma, M. Li, and Y. Wang, "PAN: Path integral based convolution for deep graph neural networks," 2019, arXiv:1904.10996.
[39] G. Jocher et al., "ultralytics/yolov5: v3.0," Zenodo, 2020, doi: 10.5281/zenodo.3983579.
[40] Z. Yao, Y. Cao, S. Zheng, G. Huang, and S. Lin, "Cross-iteration batch normalization," 2021, arXiv:2002.05712.
[41] C.-Y. Wang, A. Bochkovskiy, and H.-Y. Liao. (2022). YOLOv6. GitHub. [Online]. Available: https://github.com/meituan/YOLOv6
[42] C. Li, L. Li, H. Jiang, K. Weng, Y. Geng, L. Li, Z. Ke, Q. Li, M. Cheng, W. Nie, Y. Li, B. Zhang, Y. Liang, L. Zhou, X. Xu, X. Chu, X. Wei, and X. Wei, "YOLOv6: A single-stage object detection framework for industrial applications," 2022, arXiv:2209.02976.
[43] C.-Y. Wang, A. Bochkovskiy, and H.-Y. Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," 2022, arXiv:2207.02696.
[44] X. Ding, X. Zhang, N. Ma, J. Han, G. Ding, and J. Sun, "RepVGG: Making VGG-style ConvNets great again," 2021, arXiv:2101.03697.
[45] J. Solawetz, "What is YOLOv8? The ultimate guide," Tech. Rep., Jan. 2023.
[46] D. Beymer, "Person counting using stereo," in Proc. Workshop Human Motion, Dec. 2000, pp. 127–133.
[47] M. Nagy and G. Lăzăroiu, "Computer vision algorithms, remote sensing data fusion techniques, and mapping and navigation tools in the Industry 4.0-based Slovak automotive sector," Mathematics, vol. 10, no. 19, p. 3543, Sep. 2022.
[48] S. Battiato, S. Conoci, R. Leotta, A. Ortis, F. Rundo, and F. Trenta, "Benchmarking of computer vision algorithms for driver monitoring on automotive-grade devices," in Proc. AEIT Int. Conf. Electr. Electron. Technol. Automot. (AEIT AUTOMOTIVE), Nov. 2020, pp. 1–6.
[49] J. Barthélemy, N. Verstaevel, H. Forehead, and P. Perez, "Edge-computing video analytics for real-time traffic monitoring in a smart city," Sensors, vol. 19, no. 9, p. 2048, May 2019.
[50] L. Scime and J. Beuth, "Anomaly detection and classification in a laser powder bed additive manufacturing process using a trained computer vision algorithm," Additive Manuf., vol. 19, pp. 114–126, Jan. 2018.
[51] N. Lyons, "Deep learning-based computer vision algorithms, immersive analytics and simulation software, and virtual reality modeling tools in digital twin-driven smart manufacturing," Econ., Manage., Financial Markets, vol. 17, no. 2, pp. 67–81, 2022.
[52] K. Li, E. D. Miller, M. Chen, T. Kanade, L. E. Weiss, and P. G. Campbell, "Computer vision tracking of stemness," in Proc. 5th IEEE Int. Symp. Biomed. Imag., Nano Macro, May 2008, pp. 847–850.
[53] Q.-J. Zhao, P. Cao, and D.-W. Tu, "Toward intelligent manufacturing: Label characters marking and recognition method for steel products with machine vision," Adv. Manuf., vol. 2, no. 1, pp. 3–12, Mar. 2014.
[54] S. Paneru and I. Jeelani, "Computer vision applications in construction: Current state, opportunities & challenges," Autom. Construct., vol. 132, Dec. 2021, Art. no. 103940.
[55] M. Hussain, H. Al-Aqrabi, M. Munawar, R. Hill, and S. Parkinson, "Exudate regeneration for automated exudate detection in retinal fundus images," IEEE Access, vol. 11, pp. 83934–83945, 2022.
[56] P. Cortez, L. M. Matos, P. J. Pereira, N. Santos, and D. Duque, "Forecasting store foot traffic using facial recognition, time series and support vector machines," in Proc. Int. Joint Conf. Cham, Switzerland: Springer, 2017, pp. 267–276.
[57] N. James, "Automated checkout for stores: A computer vision approach," Revista Gestão Inovação Tecnologias, vol. 11, no. 3, pp. 1830–1841, Jun. 2021.
[58] W. Lan, J. Dang, Y. Wang, and S. Wang, "Pedestrian detection based on YOLO network model," in Proc. IEEE Int. Conf. Mechatronics Autom. (ICMA), Aug. 2018, pp. 1547–1551.
[59] W.-Y. Hsu and W.-Y. Lin, "Adaptive fusion of multi-scale YOLO for pedestrian detection," IEEE Access, vol. 9, pp. 110063–110073, 2021.
[60] S. Shinde, A. Kothari, and V. Gupta, "YOLO based human action recognition and localization," Proc. Comput. Sci., vol. 133, pp. 831–838, Jan. 2018.
[61] P. Maski and A. Thondiyath, "Plant disease detection using advanced deep learning algorithms: A case study of papaya ring spot disease," in Proc. 6th Int. Conf. Image, Vis. Comput. (ICIVC), Jul. 2021, pp. 49–54.
[62] M. Lippi, N. Bonucci, R. F. Carpio, M. Contarini, S. Speranza, and A. Gasparri, "A YOLO-based pest detection system for precision agriculture," in Proc. 29th Medit. Conf. Control Autom. (MED), Jun. 2021, pp. 342–347.
[63] W. Yang and Z. Jiachun, "Real-time face detection based on YOLO," in Proc. 1st IEEE Int. Conf. Knowl. Innov. Invention (ICKII), Jul. 2018, pp. 221–224.
[64] W. Chen, H. Huang, S. Peng, C. Zhou, and C. Zhang, "YOLO-Face: A real-time face detector," Vis. Comput., vol. 37, no. 4, pp. 805–813, Mar. 2020.
[65] M. A. Al-masni, M. A. Al-antari, J.-M. Park, G. Gi, T.-Y. Kim, P. Rivera, E. Valarezo, M.-T. Choi, S.-M. Han, and T.-S. Kim, "Simultaneous detection and classification of breast masses in digital mammograms via a deep learning YOLO-based CAD system," Comput. Methods Programs Biomed., vol. 157, pp. 85–94, Apr. 2018.
[66] Y. Nie, P. Sommella, M. O'Nils, C. Liguori, and J. Lundgren, "Automatic detection of melanoma with YOLO deep convolutional neural networks," in Proc. E-Health Bioeng. Conf. (EHB), Nov. 2019, pp. 1–4.
[67] H. M. Ünver and E. Ayan, "Skin lesion segmentation in dermoscopic images with combination of YOLO and GrabCut algorithm," Diagnostics, vol. 9, no. 3, p. 72, Jul. 2019.
[68] L. Tan, T. Huangfu, L. Wu, and W. Chen, "Comparison of RetinaNet, SSD, and YOLO v3 for real-time pill identification," BMC Med. Informat. Decis. Making, vol. 21, no. 1, Nov. 2021.
[69] N. Bordoloi, A. K. Talukdar, and K. K. Sarma, "Suspicious activity detection from videos using YOLOv3," in Proc. IEEE 17th India Council Int. Conf. (INDICON), Dec. 2020, pp. 1–5.
[70] K. Bhambani, T. Jain, and K. A. Sultanpure, "Real-time face mask and social distancing violation detection system using YOLO," in Proc. IEEE Bengaluru Humanitarian Technol. Conf. (B-HTC), Oct. 2020, pp. 1–6.
[71] Hendry and R.-C. Chen, "Automatic license plate recognition via sliding-window darknet-YOLO deep learning," Image Vis. Comput., vol. 87, pp. 47–56, Jul. 2019.
[72] C. Dewi, R.-C. Chen, X. Jiang, and H. Yu, "Deep convolutional neural network for enhancing traffic sign recognition developed on YOLO v4," Multimedia Tools Appl., vol. 81, no. 26, pp. 37821–37845, Apr. 2022.
[73] A. M. Roy, J. Bhaduri, T. Kumar, and K. Raj, "WilDect-YOLO: An efficient and robust computer vision-based accurate object localization model for automated endangered wildlife detection," Ecolog. Informat., vol. 75, Jul. 2023, Art. no. 101919.
[74] D. H. Dos Reis, D. Welfer, M. A. D. S. L. Cuadros, and D. F. T. Gamarra, "Mobile robot navigation using an object recognition software with RGBD images and the YOLO algorithm," Appl. Artif. Intell., vol. 33, no. 14, pp. 1290–1305, Nov. 2019.
[75] A. Ye, B. Pang, Y. Jin, and J. Cui, "A YOLO-based neural network with VAE for intelligent garbage detection and classification," in Proc. 3rd Int. Conf. Algorithms, Comput. Artif. Intell., Dec. 2020.
[76] J. Li, J. Gu, Z. Huang, and J. Wen, "Application research of improved YOLO v3 algorithm in PCB electronic component detection," Appl. Sci., vol. 9, no. 18, p. 3750, Sep. 2019.
[77] J. Jiang, X. Fu, R. Qin, X. Wang, and Z. Ma, "High-speed lightweight ship detection algorithm based on YOLO-V4 for three-channels RGB SAR image," Remote Sens., vol. 13, no. 10, p. 1909, May 2021.
[78] B. Chen and X. Miao, "Distribution line pole detection and counting based on YOLO using UAV inspection line video," J. Electr. Eng. Technol., vol. 15, no. 1, pp. 441–448, Jul. 2019.
[79] S. R. Vrajesh, A. N. Amudhan, A. Lijiya, and A. P. Sudheer, "Shuttlecock detection and fall point prediction using neural networks," in Proc. Int. Conf. Emerg. Technol. (INCET), Jun. 2020, pp. 1–6.
[80] H. Wu, Y. Hu, W. Wang, X. Mei, and J. Xian, "Ship fire detection based on an improved YOLO algorithm with a lightweight convolutional neural network model," Sensors, vol. 22, no. 19, p. 7420, Sep. 2022.
[81] K. Chen, H. Li, C. Li, X. Zhao, S. Wu, Y. Duan, and J. Wang, "An automatic defect detection system for petrochemical pipeline based on cycle-GAN and YOLO v5," Sensors, vol. 22, no. 20, p. 7907, Oct. 2022.
[82] R. Zhang and C. Wen, "SOD-YOLO: A small target defect detection algorithm for wind turbine blades based on improved YOLOv5," Adv. Theory Simulations, vol. 5, no. 7, Jul. 2022, Art. no. 2100631.
[83] I. Khokhlov, E. Davydenko, I. Osokin, I. Ryakin, A. Babaev, V. Litvinenko, and R. Gorbachev, "Tiny-YOLO object detection supplemented with geometrical data," in Proc. IEEE 91st Veh. Technol. Conf. (VTC-Spring), May 2020, pp. 1–5.
[84] Y. A. Khan, S. Imaduddin, A. Ahmad, and Y. Rafat, "Image-based foreign object detection using YOLO v7 algorithm for electric vehicle wireless charging applications," in Proc. 5th Int. Conf. Power, Control Embedded Syst. (ICPCES), Jan. 2023, pp. 1–6.
[85] E. S. T. K. Reddy and V. Rajaram, "Pothole detection using CNN and YOLO v7 algorithm," in Proc. 6th Int. Conf. Electron., Commun. Aerosp. Technol., Dec. 2022, pp. 1255–1260.
[86] A. Munin, A. Folarin, A. Munin-Doce, L. Alonso-Garcia, V. Diaz-Casas, S. Ferreno-Gonzalez, and J. M. Ciriano-Palacios, "Real time vessel detection model using deep learning algorithms for controlling a barrier system," J. SSRN, Apr. 2023.
[87] M. Ghafoor and A. Mahmood, "Quantification of occlusion handling capability of 3D human pose estimation framework," IEEE Trans. Multimedia, 2022.
[88] M. F. Aslan, A. Durdu, K. Sabanci, and M. A. Mutluer, "CNN and HOG based comparison study for complete occlusion handling in human tracking," Measurement, vol. 158, Jul. 2020, Art. no. 107704.
[89] H. T. Mustafa, J. Yang, and M. Zareapoor, "Multi-scale convolutional neural network for multi-focus image fusion," Image Vis. Comput., vol. 85, pp. 26–35, May 2019.
[90] A. Zahid, M. Hussain, R. Hill, and H. Al-Aqrabi, "Lightweight convolutional network for automated photovoltaic defect detection," in Proc. 9th Int. Conf. Inf. Technol. Trends (ITT), May 2023, pp. 133–138.
[91] D. S. Char, N. H. Shah, and D. Magnus, "Implementing machine learning in health care—Addressing ethical challenges," New England J. Med., vol. 378, no. 11, pp. 981–983, Mar. 2018.
[92] A. Lakhan, M. A. Mohammed, K. H. Abdulkareem, H. Hamouda, and S. Alyahya, "Autism spectrum disorder detection framework for children based on federated learning integrated CNN-LSTM," Comput. Biol. Med., vol. 166, Nov. 2023, Art. no. 107539.
[93] H. Younes, H. L. Blevec, M. Léonardon, and V. Gripon, "Interoperability of compression techniques for efficient deployment of CNNs on microcontrollers," in Proc. Int. Conf. Syst.-Integr. Intell., Springer, 2022, pp. 543–552.
[94] N. Rane, S. Choudhary, and J. Rane, "YOLO and faster R-CNN object detection in architecture, engineering and construction (AEC): Applications, challenges, and future prospects," Eng. Construction, Appl., Challenges, Future Prospects, Oct. 2023.
[95] B.-G. Han, J.-G. Lee, K.-T. Lim, and D.-H. Choi, "Design of a scalable and fast YOLO for edge-computing devices," Sensors, vol. 20, no. 23, p. 6779, Nov. 2020.
[96] G. Plastiras, M. Terzi, C. Kyrkou, and T. Theocharides, "Edge intelligence: Challenges and opportunities of near-sensor machine learning applications," in Proc. IEEE 29th Int. Conf. Application-specific Syst., Architectures Processors (ASAP), Jul. 2018, pp. 1–7.
[97] M. P. Véstias, "A survey of convolutional neural networks on edge with reconfigurable computing," Algorithms, vol. 12, no. 8, p. 154, Jul. 2019.
[98] Q. Wang, Q. Li, K. Wang, H. Wang, and P. Zeng, "Efficient federated learning for fault diagnosis in industrial cloud-edge computing," Computing, vol. 103, no. 10, pp. 2319–2337, Oct. 2021.
[99] C. He, M. Annavaram, and S. Avestimehr, "Group knowledge transfer: Federated learning of large CNNs at the edge," in Proc. Adv. Neural Inf. Process. Syst., vol. 33, 2020, pp. 14068–14080.

MUHAMMAD HUSSAIN received the B.Eng. degree in electrical and electronic engineering and the M.S. degree in Internet of Things from the University of Huddersfield, in 2019, and the Ph.D. degree in artificial intelligence for defect identification. He is an accomplished Researcher hailing from Dewsbury, U.K. His work contributes to optimizing PV systems' efficiency and reliability. He is equally passionate about machine vision, focusing on lightweight architectures for edge device deployment in real-world production settings. Beyond fault detection, he explores AI interpretability, concentrating on developing explainable AI for medical and healthcare applications. His interdisciplinary approach underscores his commitment to ethical and impactful AI solutions. With his diverse expertise spanning AI, fault detection, machine vision, and interpretability, he aims to leave his mark on shaping the future of technology and its positive influence on society. His research interests include fault detection, particularly microcracks on photovoltaic (PV) cells due to mechanical and thermal stress.