Object Detection: Harmful Weapons Detection using YOLOv4
Wan Emilya Izzety Binti Wan Noor Afandi
School of Electrical Engineering
College of Engineering

Naimah Mat Isa
School of Electrical Engineering
College of Engineering
2021 IEEE Symposium on Wireless Technology & Applications (ISWTA) | 978-1-6654-4043-1/21/$31.00 ©2021 IEEE | DOI: 10.1109/ISWTA52208.2021.9587423
Abstract— Closed-circuit television (CCTV) is essential to the security industry, providing surveillance, monitoring activities, recording incidents, and storing evidence. Research and development have been carried out to improve its application to meet the ever-changing security landscape. This paper presents a method to enhance the application of CCTVs in Malaysia. The purpose of this study is to develop an Artificial Intelligence (AI) based weapons detection system that helps people identify violent crimes that are currently happening. This study focuses on detecting harmful weapons such as handguns and knives using a custom trained object detection model that has been trained using the YOLOv4 Darknet framework. Two sets of training have been done to test the effectiveness of this system. The first training was done on a single class custom object detection model while the second was done on a multiple class custom object detection model. Based on the results obtained, the single class object detection only managed to achieve 66.67% to 77.78% accuracy on average whilst the multiple class object detection managed to achieve up to 100% accuracy on most of its input images.

Keywords— CCTV; Artificial Intelligence (AI); Object detection; Weapon; YOLOv4

INTRODUCTION
The earliest documented use of closed-circuit television (CCTV) technology was in Germany [1], where the system was set up to monitor the V-2 rockets [2]. The world's first CCTV could only be used for live monitoring and not to record footage [2]. Not long after, the system was promoted by a vendor called Vericon [3] and made commercially available to the public. This technology has significantly improved over the years, and recording systems have become more versatile and dependable, as seen today.

Conventional CCTV requires constant monitoring by security personnel. One drawback of conventional CCTV is that security personnel might miss incidents happening in the background while they are focusing on something more prominent. Apart from that, excessive screen time can also lead to an eye problem called computer vision syndrome [4], with symptoms such as redness, dryness, blurred vision, and double vision.

On top of that, CCTV that relies on motion only has a high possibility of giving out false alarms, which is a disadvantage to CCTV systems. A study has shown that spiders and their cobwebs cause some of the most common false alarms [5] that an operator has to deal with, apart from pets. A method to detect a suspicious person today is by detecting a person idling around at a certain location [6]. This method is however inaccurate, as some might just be waiting for friends or family.

To curb this issue, Artificial Intelligence (AI) can be applied. This study is proposed to help reduce violent crimes. The system provides a solution for regular CCTV by detecting two types of weapons, i.e., handguns and knives. It will take the security industry to a new level and hopefully lead to declining statistics in armed crimes. This system is meant to be implemented on CCTV or alarm security systems for security purposes, but this study is limited to the detection of the object. Further steps have to be undertaken before the system can be completely applied in a CCTV security system.

Apart from that, the mean average precision (mAP) of the two custom trained object detection models has been studied to compare the results obtained. Lastly, to test the effectiveness of this system, accuracy is calculated and compared for each of the input images.

LITERATURE REVIEW
Previous researchers have done studies on how to protect society from violent crimes. One of the most studied areas is the versatility of CCTV. A few studies have reported on the effectiveness of CCTV implementations. Some found no effect after the installation of CCTV [7] while others show a significant reduction in violent crimes [8]. The research listed below is the primary source of ideas and motivation for this project.

AI-Based Automatic Robbery/Theft Detection using Smart Surveillance in Banks
Their focus is on implementing a Smart Cam that monitors the bank's activity. Their system can detect any kind of suspicious behaviour. This Smart Cam can also detect the types of weapons and count the number of weapons it has detected. Once a weapon or suspicious behaviour has been detected, the thieves are tracked, and an alert notification including the details, as shown in Fig. 1, is sent automatically to the security department [9].
Authorized licensed use limited to: VIT University. Downloaded on December 01,2024 at 06:29:21 UTC from IEEE Xplore. Restrictions apply.
Fig. 1. Object detected after training. [9]

A Neural Network Based Intelligent Intruders and Tracking System using CCTV Images
Early development on an intruder detection and tracking system based on the neural network approach has been made. According to this paper, the potential intruder is identified by examining the technique and algorithm in a neural network. When the system identifies the presence of an intruder, it will start monitoring and tracking its movement. This way, the information gathered can be used for further identification. Apart from that, they made a comparison between the traditional approach of Intelligent Scene Monitoring (ISM) and the artificial neural network (ANN). It is said that the ANN approach can differentiate between suspicious behaviour and non-suspicious behaviour [10].

METHODOLOGY
This study proposes a weapons detector framework to detect the presence of harmful weapons by analyzing the image or video frame by frame. Weapons are targeted because incidents involving the use of firearms remain a major threat to national security. Although armed robbery in Malaysia has decreased significantly [12], appropriate action should be taken to make sure armed robbery in Malaysia is under control. Hence, this will be a breakthrough in security management as potential crimes involving firearms can be effectively curtailed.

The proposed approach uses Darknet YOLOv4 and the TensorFlow platform. Darknet is an open-source library to build a neural network framework while TensorFlow is a platform to run the YOLOv4 detector. Further explanation is given in this section.

(1) Dataset Preparation
Instead of using a CCTV video, this research is done offline using an image dataset. The images came from the Open Images V6 dataset, which consists of 9.6 million images with annotations for segmentation, object detection and classification [13]. The images are downloaded with their labels for training purposes. More than 3000 images are downloaded and used in this project. Then, the images are sorted and labelled according to the YOLOv4 format.

(2) Training YOLOv4
For this project, the training session was done using Google Colab. YOLOv4 is trained to detect harmful weapons. The training dataset is saved in Google Drive, where it can be retrieved by Google Colab. The training is divided into two sessions: the first trains single class object detection and the second trains multiple class object detection. Before the training session is started, some parameters need to be defined, i.e., batch size, subdivisions, maximum batches, number of classes, width and height.

(3) Object Detection Model
Fig. 2. Single Class Object Detection Model for AI Based Monitoring System flowchart.

The system flowchart of a single class object detection model is shown in Fig. 2. First, the process starts by reading the input image or video footage frame by frame, but for this project an image is used. Then the model starts to detect an object in the input image. The detected object is enclosed in a bounding box, and the bounding box has a threshold value to be achieved. In this system, the threshold value is set to a minimum of 0.6; therefore, only when the confidence value of a detection is above 0.6 will a red bounding box be displayed. Otherwise, the system will just ignore the predicted bounding box and continue to read frames.

the label. The image size varies and before training the default value for image size is 416 x 461 pixels.
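The thresholding rule described for the flowchart in Fig. 2 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Detection` type, the `filter_detections` helper and the sample values are hypothetical, and the strict "greater than" comparison is an assumption based on the wording "above 0.6".

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.6  # minimum confidence stated in the text


@dataclass
class Detection:
    """Hypothetical container for one predicted bounding box."""
    label: str
    confidence: float
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


def filter_detections(detections, threshold=CONFIDENCE_THRESHOLD):
    """Keep only predictions whose confidence is above the threshold.

    Predictions at or below the threshold are ignored, mirroring the
    "ignore the predicted bounding box and continue" branch.
    """
    return [d for d in detections if d.confidence > threshold]


# Illustrative values only; they are not results from the paper.
raw = [
    Detection("Handgun", 0.94, (40, 60, 180, 200)),
    Detection("Knife", 0.41, (220, 80, 300, 240)),
]
kept = filter_detections(raw)
for d in kept:
    print(f"draw red box for {d.label} ({d.confidence:.0%}) at {d.box}")
```

In a real pipeline the kept detections would be passed to a drawing routine, while the rejected ones are simply dropped before the next frame is read.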
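For the dataset preparation step, labelling "according to the YOLOv4 format" usually means one text line per object in the Darknet convention: a class index followed by the box centre, width and height, all normalised to the image size. The sketch below assumes the Open Images boxes are available as normalised corner coordinates (XMin, XMax, YMin, YMax). The helper name `open_images_to_yolo` and the class index mapping are hypothetical and are not taken from the paper.

```python
def open_images_to_yolo(class_id, x_min, x_max, y_min, y_max):
    """Convert normalised corner coordinates to a Darknet YOLO label line.

    All coordinates are fractions of the image width or height in [0, 1].
    """
    if not (0.0 <= x_min < x_max <= 1.0 and 0.0 <= y_min < y_max <= 1.0):
        raise ValueError("coordinates must be normalised with min < max")
    x_center = (x_min + x_max) / 2
    y_center = (y_min + y_max) / 2
    width = x_max - x_min
    height = y_max - y_min
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical class indices for the two-class model.
CLASS_IDS = {"Handgun": 0, "Knife": 1}

line = open_images_to_yolo(CLASS_IDS["Knife"], 0.25, 0.75, 0.10, 0.50)
print(line)
```

Each image would then get a text file with one such line per annotated weapon, which is the layout the Darknet training tools read.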
weights file has been converted to a TensorFlow file to run
this object detection. To compare the effectiveness of both
systems, the accuracy is then calculated. Each value of true
positive (TP), false positive (FP), true negative (TN) and
false negative (FN) in each image or video is counted. The
following formulae shown in (1) are used to calculate the
accuracy for each system.
No Image TP FP TN FN Accuracy
1 Self-taken 1 0 5 3 66.67%
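The expression referred to as (1) did not survive in this copy of the text. As a hedged illustration, the standard accuracy definition, (TP + TN) / (TP + FP + TN + FN), reproduces the tabulated value for the row above (TP = 1, FP = 0, TN = 5, FN = 3 gives 66.67%), so the sketch below assumes that definition. The function name `accuracy` is illustrative.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of correct outcomes among all counted outcomes."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one outcome must be counted")
    return (tp + tn) / total


# Row 1 of the single class table: TP=1, FP=0, TN=5, FN=3
single_class_row = accuracy(tp=1, fp=0, tn=5, fn=3)
print(f"{single_class_row:.2%}")  # 66.67%
```

The same function applied to the counts reported for the other input images gives the per-image accuracy values that the two models are compared on.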
random positions and random items are added to test the
effectiveness of the detection. For this single class object
detection, only one weapon is detected with a confidence
value of 98%.
Fig. 9. Image of knives detected together in one bounding box but fails to
detect all.
Fig. 10. The image of several weapons failed to be detected by the object
detection model.
Fig. 11. Image of a fidget spinner being detected as a weapon.

No  Image              TP  FP  TN  FN  Accuracy
1   Self-taken         4   0   5   0   100%
2   Handgun and knife  1   0   2   1   75%
3   Handguns           7   0   0   0   100%
4   Knives             9   0   0   0   100%
5   Random position 1  5   0   4   0   100%
6   Random position 2  2   0   6   1   88.89%

Fig. 12. The self-taken image that has been successfully detected by the object detection model.

Fig. 13. Handgun has been detected but fails to detect the knife.

In Fig. 13, only the handgun was detected, with a confidence value of 94%, an improvement of 28% compared to the previous single class detection of this similar image. The knife, however, could not be recognized by the object detection model due to its position.
Fig. 14. Object detection model successfully detected each handgun.

Fig. 15. The object detection model has successfully detected each knife.

In Fig. 15, the multiple class object detection model managed to successfully detect every one of the knives with a confidence level ranging from 65% to 100%. In comparison to Fig. 9, Fig. 15 predicts better and is more accurate as it was able to detect each knife individually.

CONCLUSION
To conclude, the objective has been successfully achieved, as this system can detect two types of weapons, specifically handguns and knives. This approach is beneficial to everyone, especially those in the security industry. By comparison, the multiple class object detection model has better mAP scores and results in better detection than the single class object detection model. The multiple class object detection model also attained a much higher accuracy than the single class object detection model. Overall, the multiple class object detection model performs better in detecting objects. However, it can sometimes miss an object if the object is moving too fast or is placed at a different angle or position.

To improve this system, the multiple class object detection model can be upgraded to classify more types of weapons. This can also help the model to be more accurate, as it will be trained to classify various types of weapons, and will make the system more dynamic. Apart from that, this system can be implemented on a single-board computer such as Raspberry Pi for more robust use. A notification system can also be implemented to send notifications once a harmful weapon has been detected. By combining all of these, the AI based surveillance monitoring system can be implemented in places such as banks, shopping malls and even at home.
REFERENCES
[1] K. Yeganegi, D. Moradi, and A. J. Obaid, “Create a wealth of security CCTV cameras,” 2020, doi: 10.1088/1742-6596/1530/1/012110.
[2] H. Rama Moorthy, V. Upadhya, V. V. Holla, S. S. Shetty, and V. V. Tantry, “Challenges encountered in building a fast and efficient surveillance system: An overview,” Proc. 4th Int. Conf. IoT Soc. Mobile, Anal. Cloud, ISMAC 2020, pp. 731–737, 2020, doi: 10.1109/I-SMAC49090.2020.9243563.
[3] I. Journal and S. Sciences, “Akpauche: International Journal of Arts and Social Sciences, Vo 1, No 2,” no. 2, pp. 96–105.
[4] K. Y. Loh and S. C. Reddy, “Understanding and preventing computer vision syndrome,” Malaysian Fam. Physician, vol. 3, no. 3, 2008.
[5] R. Hebbalaguppe, “A computer vision based approach for reducing false alarms caused by spiders and cobwebs in surveillance camera networks,” 2014.
[6] W. Aitfares, A. Kobbane, and A. Kriouile, “Suspicious behavior detection of people by monitoring camera,” Int. Conf. Multimed. Comput. Syst. -Proceedings, vol. 0, pp. 113–117, 2017, doi: 10.1109/ICMCS.2016.7905601.
[7] Y. L. Lai, C. J. Sheu, and Y. F. Lu, “Does the Police-Monitored CCTV Scheme Really Matter on Crime Reduction? A Quasi-Experimental Test in Taiwan’s Taipei City,” Int. J. Offender Ther. Comp. Criminol., vol. 63, no. 1, pp. 101–134, 2019, doi: 10.1177/0306624X18780101.
[8] E. L. Piza, “The crime prevention effect of CCTV in public places: a propensity score analysis,” J. Crime Justice, vol. 41, no. 1, pp. 14–30, 2018, doi: 10.1080/0735648X.2016.1226931.
[9] R. Kakadiya, R. Lemos, S. Mangalan, M. Pillai, and S. Nikam, “AI Based Automatic Robbery/Theft Detection using Smart Surveillance in Banks,” Proc. 3rd Int. Conf. Electron. Commun. Aerosp. Technol. ICECA 2019, pp. 201–204, 2019, doi: 10.1109/ICECA.2019.8822186.
[10] C. C. Fung and N. Jerrat, “Neural network based intelligent intruders detection and tracking system using CCTV images,” IEEE Reg. 10 Annu. Int. Conf. Proceedings/TENCON, vol. 2, 2000, doi: 10.1109/tencon.2000.888772.
[11] A. Bochkovskiy, C. Y. Wang, and H. Y. M. Liao, “YOLOv4: Optimal Speed and Accuracy of Object Detection,” arXiv, 2020.
[12] “Msia’s crime index down significantly, due in part to Sosma, Poca.” [Online]. Available: https://www.nst.com.my/news/crime-courts/2020/09/628610/msias-crime-index-down-significantly-due-part-sosma-poca. [Accessed: 08-Jan-2021].
[13] “Open Images Dataset v6 (Bounding Boxes) | Appen Datasets.” [Online]. Available: https://appen.com/datasets/open-images-annotated-with-bounding-boxes/. [Accessed: 10-Jan-2021].