People Monitoring and Mask Detection Using Real-Time Video Analyzing
Abstract: Video-based people counting and mask detection is an important field in Computer Vision.
There is growing interest in video-based solutions for people monitoring and counting in business and security
applications using Computer Vision technology, which has been used effectively in many Artificial Intelligence
fields. Compared to conventional sensor-based solutions, video-based solutions allow more flexible
performance and improved functionality at lower cost. A people counting system requires more
processing because it deals with real-time video, so the proposed technique converts each color
image into a binary image in order to reduce the amount of image data. Reducing processing time is an important
consideration in Software Engineering when building a well-functioning system. The people counting method is
based on head detection and tracking: it evaluates the total number of people who move under an overhead
camera and checks whether those people are wearing masks. There are four main features in the proposed
system: People counting, Mask detection, Alarm alert and Scan ID. Based on head tracking, the method uses a
crossing-line judgment to determine whether a particular head object is counted. The two main
challenges addressed in this system are the difficult estimation of the background scene and of the number of
persons in merge-split scenarios. Masked face detection in this system uses three steps: eye line
detection, facial part detection and eye detection. If the people count exceeds the permitted limit, or if a
mask is not worn, the alarm is raised.
Keywords - Convolutional Neural Network, MobileNet SSD, Dataset
I. INTRODUCTION
Public safety has become a major problem in areas like malls, railway stations and streets during
festive seasons, concerts and similar events, particularly during a pandemic. The massive disasters that happen
worldwide include numerous instances of fatalities where people gather in crowds. An efficient automated system
to manage the crowd count is therefore essential. Head tracking provides a way to detect the position, to obtain
the motion trail and to maintain the identities of persons in the scene. Managing a crowd of varying density
involves detecting the individual humans in the crowd. In a high-density crowd, inter-object occlusion makes
detection and tracking of humans a challenge for computer vision. This system focuses on
training a model for human head detection using positive and negative samples. The trained model is
then used to process the video frames, in which the human heads are detected and the count of humans in the
scene is provided. It also detects whether people are wearing masks. If a person is not wearing a mask, the
alarm is raised; the same alert is raised when the number of people gathered exceeds the permitted limit. This
system can be used in malls or any other places where crowds should be kept to a minimum.
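
The crossing-line judgment applied to the motion trail of each tracked head can be illustrated with a minimal sketch. It assumes a horizontal counting line and per-head centroid trails that are already available from the tracker; the names crossing_direction, count_crossings and line_y, the sample coordinates, and the convention that downward motion means entering are hypothetical choices made for illustration and are not taken from the proposed system.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) centroid of a tracked head, in pixels


def crossing_direction(prev: Point, curr: Point, line_y: float) -> int:
    """Return +1 for a downward crossing of the line, -1 for upward, 0 otherwise."""
    if prev[1] < line_y <= curr[1]:
        return 1
    if curr[1] < line_y <= prev[1]:
        return -1
    return 0


def count_crossings(trails: Dict[int, List[Point]], line_y: float) -> Tuple[int, int]:
    """Count downward crossings (assumed entries) and upward crossings (assumed exits)."""
    entered = exited = 0
    for trail in trails.values():
        # Compare each pair of consecutive positions of the same head.
        for prev, curr in zip(trail, trail[1:]):
            step = crossing_direction(prev, curr, line_y)
            if step > 0:
                entered += 1
            elif step < 0:
                exited += 1
    return entered, exited


# Hypothetical trails keyed by track id; the counting line is at y = 100.
trails = {
    1: [(50, 80), (52, 95), (55, 110)],      # crosses the line moving down
    2: [(120, 130), (118, 104), (117, 90)],  # crosses the line moving up
    3: [(200, 40), (205, 60)],               # never reaches the line
}
entered, exited = count_crossings(trails, line_y=100)
print(entered, exited, entered - exited)  # prints: 1 1 0
```

Counting only the step on which a head moves from one side of the line to the other avoids counting the same head repeatedly while it remains in view.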
www.viva-technology.org/New/IJRI
VIVA-Tech International Journal for Research and Innovation Volume 1, Issue 4 (2021)
ISSN(Online): 2581-7280 Article No. X
PP XX-XX
VIVA Institute of Technology
9th National Conference on Role of Engineers in Nation Building – 2021 (NCRENB-2021)
III. METHODOLOGY
The Crowd Monitoring and Mask Detection system is a simple system used for people counting and mask
detection in crowded places. The system uses a Convolutional Neural Network (CNN), which is an image
classification algorithm, together with MobileNet SSD, which is used for detection. A CNN is made up of
neurons with learnable weights. CNNs are a class of deep neural networks used especially for image
recognition and image processing. MobileNet is a simple, efficient and computationally light convolutional
neural network for mobile vision applications. MobileNet is widely used in real-world applications, which
include fine-grained classification, object detection, face attributes and localization. A CNN takes an image
as input, identifies and assigns importance to various features of the image, and differentiates the features
from one another. MobileNet is a neural network used for classification and recognition, whereas SSD (Single
Shot MultiBox Detector) is a framework used to realize multi-box detection. Object detection requires the
combination of the two. SSD can be replaced with another detection framework such as R-CNN. A CNN requires
comparatively little preprocessing and is able to learn image characteristics. A CNN consists of several sets of
convolution and pooling layers, followed by a flatten layer and dense layers. The sets of
convolution and pooling layers are used for feature extraction, and the number of such sets may vary.
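
The layer sequence described above (convolution, pooling, flatten, dense) can be traced on a toy input. The following is a minimal sketch in plain Python, not the model used in the proposed system: the 5x5 image, the 2x2 edge kernel and the dense weights are hand-picked assumptions rather than trained values, and a real model would stack many such layers with learned parameters.

```python
import math
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image without padding (CNN-style convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap: Matrix) -> Matrix:
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap: Matrix) -> Matrix:
    """Keep the largest value of each non-overlapping 2x2 window."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


def flatten(fmap: Matrix) -> List[float]:
    """Turn the 2-D feature map into a 1-D feature vector."""
    return [v for row in fmap for v in row]


def dense(x: List[float], weights: List[float], bias: float) -> float:
    """Single-output fully connected layer: weighted sum plus bias."""
    return sum(w * v for w, v in zip(weights, x)) + bias


# Toy grayscale image with a vertical edge between columns 1 and 2.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edge_kernel = [[-1.0, 1.0], [-1.0, 1.0]]  # hand-picked, responds to vertical edges

feature_map = relu(conv2d(image, edge_kernel))   # 4x4 feature map
pooled = max_pool2x2(feature_map)                # 2x2 after pooling
features = flatten(pooled)                       # 4 values
score = dense(features, [0.5, 0.5, 0.5, 0.5], bias=-1.0)
probability = 1.0 / (1.0 + math.exp(-score))     # sigmoid for a two-class output

print(pooled)   # prints: [[2.0, 0.0], [2.0, 0.0]]
print(score)    # prints: 1.0
```

The convolution and pooling steps perform the feature extraction, while the flatten and dense steps map the extracted features to a single score that a two-class classifier (such as mask or no mask) can threshold.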
The convolution layer is the basic building block of a CNN and is used for extracting features from an input
image. The proposed system uses a convolutional model consisting of multiple layers for the purpose of
feature extraction from the image. Training data is provided to the model to improve prediction of whether people are wearing
a mask or not. To classify people wearing masks, the input video is split into frames, and each frame is
converted to RGB format and represented as a matrix from which the convolution layers extract information.
Multiple convolutional layers are used to provide better predictions with higher accuracy. Figure 1 represents
the mask detection system flow using CNN. The MobileNet algorithm is used in this system as it requires less
processing time. The module is tested using real-time images of people with and without masks to
measure the accuracy of the model. Hence, the model performs real-time people counting and mask detection
in an efficient way.
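
Once heads have been detected in a frame and each has been classified, the alarm decision reduces to two checks. The sketch below is a minimal illustration under stated assumptions: the HeadDetection record, the alarm_reasons function, the 0.5 mask threshold and the limit of two people are hypothetical and are not specified by the proposed system.

```python
from typing import List, NamedTuple


class HeadDetection(NamedTuple):
    track_id: int            # identity assigned by the head tracker
    mask_probability: float  # classifier confidence that a mask is worn


def alarm_reasons(detections: List[HeadDetection], max_people: int,
                  mask_threshold: float = 0.5) -> List[str]:
    """Return the reasons to raise the alarm for one frame (empty list means no alarm)."""
    reasons = []
    # Check 1: the number of people in the scene exceeds the permitted limit.
    if len(detections) > max_people:
        reasons.append(f"people count {len(detections)} exceeds limit {max_people}")
    # Check 2: at least one detected person is classified as not wearing a mask.
    unmasked = [d.track_id for d in detections if d.mask_probability < mask_threshold]
    if unmasked:
        reasons.append(f"no mask detected for head id(s) {unmasked}")
    return reasons


# Hypothetical detections for a single frame.
frame = [HeadDetection(1, 0.93), HeadDetection(2, 0.12), HeadDetection(3, 0.88)]
for reason in alarm_reasons(frame, max_people=2):
    print("ALARM:", reason)
```

Returning the reasons rather than a single flag lets the caller distinguish an overcrowding alert from a missing-mask alert.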
IV. CONCLUSION
Mask detection using CNN with the MobileNet algorithm is used in this system as it requires less
processing time. The system presents people counting as a way to manage crowds by keeping a
count of people. Keeping the pandemic situation in mind, a mask detection feature is added: if the count exceeds
the permitted limit, or if the model recognizes that people are not wearing masks, the alarm is
raised. The system reduces the time people spend on counting or checking, since
this work is done by the system itself. With this model, human errors are reduced to a great extent, as
the system is trained on large datasets. The process requires comparatively less time and provides
good accuracy. As the system is trained repeatedly on the same mask detection task, the loss decreases and
the accuracy improves. As the system is still under development, we cannot yet report a precise accuracy
figure, but the accuracy is expected to improve.