
St. Xavier's College
Affiliated to Tribhuvan University
Maitighar, Kathmandu

Final Year Project Report


On
“Face Mask Detection and Alert System”
A Final Year Project Report submitted in partial fulfillment of the requirements for the
degree of Bachelor of Science in Computer Science and Information Technology awarded by
Tribhuvan University.

Under the supervision of


Mr. Bal Krishna Subedi
Lecturer

Submitted by
Dilip Karki (T.U. Exam Roll No. 15173/074)
Kishan KC (T.U. Exam Roll No. 15181/074)

Submitted to
St. Xavier's College
Department of Computer Science
Maitighar, Kathmandu, Nepal
April, 2021
Face Mask Detection And Alert System – [CSC-404]

A final year project report submitted in partial fulfillment of the
requirements for the degree of Bachelor of Science in Computer Science
and Information Technology awarded by Tribhuvan University.

Submitted By:
Dilip Karki (T.U. Exam Roll No. 15173/074)
Kishan KC (T.U. Exam Roll No. 15181/074)

Submitted To:
St. Xavier's College
Department of Computer Science
Maitighar, Kathmandu, Nepal
April, 2021
CERTIFICATE OF APPROVAL

The undersigned certify that they have read and recommended to the Department of
Computer Science for acceptance, a project proposal entitled “Face Mask Detection
and Alert System” submitted by Dilip Karki (T.U. Exam Roll No. 15173/074) and
Kishan KC (T.U. Exam Roll No. 15181/074) in partial fulfillment of the
requirements for the degree of Bachelor of Science in Computer Science and Information
Technology awarded by Tribhuvan University.

…………………………..
Mr. Bal Krishna Subedi
Project Supervisor /Lecturer
St. Xavier’s College

…………………………..
External Examiner
Tribhuvan University

…………………………..
Mr. Ganesh Yogi
Head of the Department
Department of Computer Science
St. Xavier’s College
ACKNOWLEDGEMENT

We are greatly privileged to be students of Computer Science at St. Xavier's
College, whose department is staffed by experts in the field and is highly
supportive of its learners. We would like to express our sincere gratitude to Mr. Sarjan
Shrestha, our supervisor, for creating a positive academic and sociable environment to
foster this project, and for providing us with all the crucial advice, guidelines, and
resources for its accomplishment.

We are also grateful to the entire Computer Science Department of St. Xavier's College for
giving us a suitable environment in which to work on this project, and for helping us in
every possible way. We would also like to take this opportunity to express our gratitude
to Mr. Ganesh Yogi for his continuous
encouragement and support throughout the completion of this project.

We would also like to express our heartfelt gratitude to Er. Rajan Karmacharya, Mr. Bal
Krishna Subedi, Er. Anil Shah, Er. Saugat Sigdel, Er. Nitin Malla, Er. Sansar Dewan,
Er. Sanjay Kumar Yadav, Mr. Ganesh Dhami and Mr. Ramesh Shahi for their constant
support and guidance.

Finally, we would like to express our sincere thanks to all our friends and others who
helped us directly or indirectly during this project work.

Dilip Karki (T.U. Exam Roll No. 15173/074)


Kishan KC (T.U. Exam Roll No. 15181/074)

ABSTRACT

Object recognition is one of the newest and most widely studied areas of computer vision
and machine learning. The purpose of object recognition is to locate particular objects
within an image and assign each one the correct class label. With the help of deep
learning, the capability and performance of object recognition systems have expanded
significantly. Our project applies modern object detection strategies that can also be
used for real-time detection.

One of the main drawbacks of many earlier detection mechanisms is their reliance on
hand-crafted features and classical machine learning techniques, which limits the overall
performance of the system. This project instead uses deep learning to solve the object
detection problem end-to-end. The network is trained on a dataset developed in-house. The
resulting modules are fast and accurate and can also be used for real-time object
recognition.

Keywords: Object recognition, algorithm, deep learning

TABLE OF CONTENTS

ACKNOWLEDGEMENT
ABSTRACT
LIST OF FIGURES
CHAPTER 1: INTRODUCTION
    1.1 Background
    1.2 Problem Statement
    1.3 Project Objectives
    1.4 Project Scope
        1.4.1 Airports
        1.4.2 Hospitals
        1.4.3 Offices
    1.5 Development Methodology
        1.5.1 Agile
    1.6 Report Organization
CHAPTER 2: LITERATURE REVIEW
    2.1 General Theory and Concept
        2.1.1 Neural Networks
        2.1.2 Convolutional Neural Network (CNN)
        2.1.3 Linear Regression
        2.1.4 Logistic Regression
        2.1.5 Multi-label Classification
        2.1.6 Feature Pyramid Networks
    2.2 Related Terms
        2.2.1 IOU
        2.2.2 Anchor Box / Bounding Box
        2.2.3 Threshold
        2.2.4 Activation Function
        2.2.5 Loss Function
    2.3 Related Work
CHAPTER 3: SYSTEM ANALYSIS
    3.1 System Analysis
        3.1.1 Requirement Analysis
        3.1.2 Feasibility Analysis
        3.1.3 System Analysis
CHAPTER 4: SYSTEM DESIGN
    4.1 Design
    4.2 Algorithm Details
        4.2.1 System Algorithm
        4.2.2 Project Algorithm
CHAPTER 5: IMPLEMENTATION AND TESTING
    5.1 Implementation
        5.1.1 Tools Used
        5.1.2 Implementation Details of Modules
    5.2 Testing
        5.2.1 Test Cases for Unit Testing
        5.2.2 Test Cases for System Testing
    5.3 Result Analysis
CHAPTER 6: CONCLUSION AND FUTURE ENHANCEMENTS
    6.1 Conclusion
    6.2 Future Enhancements
REFERENCES

LIST OF FIGURES

Figure 1 Agile Model
Figure 2 Deep Neural Network
Figure 3 Convolutional Neural Network
Figure 4 Convolutional process
Figure 5 Pooling process
Figure 6 Logistic regression
Figure 7 Intersect Over Union
Figure 8 Boundary Box
Figure 9 Precision & Recall
Figure 10 Sigmoid Activation Function
Figure 11 ReLU Activation Function
Figure 12 Leaky ReLU Activation Function
Figure 13 Use Case Diagram
Figure 14 Gantt chart demonstrating project timeline
Figure 15 ER Diagram
Figure 16 DFD level 0
Figure 17 DFD level 1
Figure 18 DFD level 2
Figure 19 Relational Model
Figure 20 Convolutional layer
Figure 21 Start of application
Figure 22 Camera opened
Figure 23 Mask detected
Figure 24 No mask detected
Figure 25 Counted no mask
Figure 26 Counted mask
Figure 27 Email alert
Figure 28 Single-person face mask detection
Figure 29 Detection of mask worn in improper place
Figure 30 Detection of multiple persons not wearing masks
Figure 31 Detection of faces with different cases (wearing mask and not wearing mask)
Figure 32 Detection of faces with no masks
Figure 33 Detection of faces wearing masks
Figure 34 E-mail alert system
Figure 35 Detection of absence of face mask
Figure 36 Detection of presence of face mask
Figure 37 Email alert after detection of absence of face mask
CHAPTER 1: INTRODUCTION

Background
Face masks are crucial in the prevention of airborne diseases. An airborne illness is
produced when droplets carrying microorganisms are ejected into the air by coughing,
sneezing, or talking. The pathogens in question may be viruses, bacteria, or fungi.
Tuberculosis, influenza, and smallpox are just a few of the common illnesses that can
spread through the air.

People with some diseases can transfer disease via the air when they cough, sneeze, or talk,
releasing nasal and throat secretions. Some viruses or bacteria take to the air and float
around, landing on humans or surfaces. When you inhale harmful germs from the air, they
take up residence inside you. You can also pick up germs by touching a germ-infested
surface and then touching your own eyes, nose, or mouth. These infections are difficult to
control because they spread through the air.

COVID-19 has recently triggered a global pandemic. COVID-19 spreads when an infected
person exhales virus-containing droplets and very minute particles. Other people may
inhale these droplets and particles, or they may settle on their eyes, noses, or mouths. They
may contaminate surfaces they come into contact with in some cases. People who are closer
than 6 feet from the infected person are most likely to get infected.

Wearing a mask can prevent the spattering of droplets from the body of an infected person.
If a person is infected by an airborne disease, then by using a face mask he or she can
prevent other people from being infected, and vice versa.

Previous studies have found that facemask-wearing is valuable in preventing the spread of
respiratory viruses. For instance, the efficiencies of N95 and surgical masks in blocking the
transmission of SARS are 91% and 68%, respectively. Facemask-wearing can interrupt
airborne viruses and particles effectively, such that these pathogens cannot enter the
respiratory system of another person. As a non-pharmaceutical intervention, facemask-
wearing is a non-invasive and cheap method to reduce mortality and morbidity from
respiratory infections [1].

Hence, it is very crucial to detect whether an individual is wearing a face mask. Our
project will help in detecting the presence of a face mask on a person's face. If the
person isn't wearing a mask of any kind, then the system will detect such an individual
and alert the concerned authority. At present, mask detection is mostly done manually.
With an alert system functioning side by side with the face mask detection system, the
process of mask detection can be automated. This will help prevent the spread of many
airborne diseases, the importance of which the recent global pandemic has demonstrated.

Problem Statement

Face masks are crucial for preventing the spread of many airborne diseases. Although the
importance of face masks is evident, many people can be seen roaming in public places such
as banks without a face mask. Some people don't consider wearing a face mask a moral duty.
When such people go unnoticed or unpunished, they tend not to wear masks in the future
either. Many people also wear the mask improperly, i.e., not covering the nose and mouth
properly.

Project Objectives
This project aims to identify face masks as an object in video surveillance cameras across
different places such as hospitals, emergency departments, out-patient facilities,
residential care facilities, emergency medical services, and home health care delivery, to
provide safety to doctors and patients and reduce the outbreak of disease. The detection
of face masks is required to happen in real time, so that the necessary actions in case of
any disobedience can be taken on the spot.

The objectives we desire to meet are as follows:

 To develop a system which can detect whether an individual is wearing a mask or not.
 To build an alert system which alerts the concerned authority when an individual
isn't wearing a mask.
 To ensure a safe working environment.

Project Scope

1.4.1 Airports:

The Face Mask Detection System can be used at airports to detect travelers without
masks. Face data of travelers can be captured by the system at the entrance. If a traveler
is found to be without a face mask, their picture is sent to the airport authorities so
that they can take quick action.

1.4.2 Hospitals:

Using the Face Mask Detection System, hospitals can monitor whether their staff are
wearing masks during their shift or not, and alert any health worker found without a mask.
Also, for quarantined people who are required to wear a mask, the system can keep watch
and detect whether a mask is present, sending a notification automatically or reporting to
the authorities.

1.4.3 Offices:

The Face Mask Detection System can be used at office premises to detect if employees
are maintaining safety standards at work. It monitors employees without masks.

Development Methodology

1.5.1 Agile

Our project is based on the agile model, with each development process carried out
iteratively. Agile means swift or versatile; the "agile process model" refers to a
software development approach based on iterative development. Agile methods break tasks
into smaller iterations, or parts, and do not directly involve long-term planning. The
project scope and requirements are laid down at the beginning of the development process,
and plans regarding the number of iterations and the duration and scope of each iteration
are clearly defined in advance. Each iteration is treated as a short time "frame" in the
agile process model, typically lasting from one to four weeks. Dividing the entire project
into smaller parts helps to minimize the project risk and to reduce the overall project
delivery time.

3
Each iteration involves a team working through a full software development life cycle,
including planning, requirements analysis, design, coding, and testing, before a working
product is demonstrated to the client [2].

Figure 1 Agile Model

Report Organization
The first chapter of this report consists of the project introduction, along with the
problem statement, objectives, scope, and development methodology. The second chapter
presents a literature review covering the background of the project and existing systems.
Chapter 3 consists of system analysis, including requirement analysis and feasibility
analysis. Chapter 4 describes the system design and algorithm details. Chapter 5 covers
the implementation and testing of the system, along with result analysis. Chapter 6
presents the conclusion and future enhancements.

CHAPTER 2: LITERATURE REVIEW

General theory and concept:

Research is a spiral process, as it revolves around analysis, planning, execution, and
evaluation. Any research project needs to build on the basis of prior research. The
present study also needed the support of several other research works and other
literature [3].

2.1.1 Neural networks

Neural networks, also known as artificial neural networks (ANNs) or simulated neural
networks (SNNs), are a subset of machine learning and are at the heart of deep learning
algorithms. Their name and structure are inspired by the human brain, mimicking the
way that biological neurons signal to one another.

Artificial neural networks (ANNs) are comprised of node layers, containing an input
layer, one or more hidden layers, and an output layer. Each node, or artificial neuron,
connects to another and has an associated weight and threshold. If the output of any
individual node is above the specified threshold value, that node is activated, sending
data to the next layer of the network. Otherwise, no data is passed along to the next
layer of the network.
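The weighted-sum-and-threshold behaviour of a single node described above can be sketched in a few lines of Python. This is only an illustrative toy neuron; the weights, bias, and threshold values are arbitrary, not taken from any trained model.

```python
import numpy as np

def neuron(inputs, weights, bias, threshold=0.0):
    """A single artificial neuron: weighted sum of inputs plus bias,
    passed on only if the result exceeds the threshold."""
    total = np.dot(inputs, weights) + bias
    return total if total > threshold else 0.0

# Two inputs with illustrative weights 0.6 and 0.4 and bias -0.3.
out_active = neuron(np.array([1.0, 1.0]), np.array([0.6, 0.4]), -0.3)
out_silent = neuron(np.array([0.1, 0.1]), np.array([0.6, 0.4]), -0.3)
```

In the first call the weighted sum exceeds the threshold, so the activation is passed along; in the second it does not, so nothing is propagated.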

Neural networks rely on training data to learn and improve their accuracy over time.
However, once these learning algorithms are fine-tuned for accuracy, they are powerful
tools in computer science and artificial intelligence, allowing us to classify and cluster
data at a high velocity. Tasks in speech recognition or image recognition can take
minutes versus hours when compared to manual identification by human experts.
One of the most well-known neural networks is Google's search algorithm [4].

Figure 2 Deep Neural Network

2.1.2 Convolutional Neural Network (CNN)

A convolutional neural network, or CNN, is a deep learning neural network designed
for processing structured arrays of data such as images. Convolutional neural networks
are widely used in computer vision and have become the state of the art for many visual
applications such as image classification, and have also found success in natural
language processing for text classification.

Convolutional neural networks are very good at picking up on patterns in the input
image, such as lines, gradients, circles, or even eyes and faces. It is this property that
makes convolutional neural networks so powerful for computer vision. Unlike earlier
computer vision algorithms, convolutional neural networks can operate directly on a
raw image and do not need any preprocessing.

A convolutional neural network is a feed-forward neural network, often with up to 20
or 30 layers. The power of a convolutional neural network comes from a special kind of
layer called the convolutional layer.

Convolutional neural networks contain many convolutional layers stacked on top of
each other, each one capable of recognizing more sophisticated shapes. With three or
four convolutional layers it is possible to recognize handwritten digits, and with 25
layers it is possible to distinguish human faces.

The usage of convolutional layers in a convolutional neural network mirrors the
structure of the human visual cortex, where a series of layers process an incoming image
and identify progressively more complex features.

Figure 3 Convolutional Neural Network

The above is important when our goal is to design an architecture that is not
only good at learning features but is also scalable to massive datasets. Figure 3 shows
the Convolutional Neural Network [5].

The Kernel

The element which is involved in the process of carrying out the convolution operation
in the first part of the convolutional layer is called the Kernel/Filter.

Figure 4 Convolutional process

In Fig. 4, the left section is a 5 × 5 × 1 matrix, which is the input image, and the
right section is a 3 × 3 × 1 matrix, which is the kernel, represented here as K.

 Image Dimensions = 5 (Height) × 5 (Breadth) × 1 (Number of channels, i.e. a
single-channel image).

 Kernel/Filter K = 3 × 3 × 1.

Here, the kernel will shift 9 times because the Stride Length = 1, every time performing a
matrix multiplication operation between K and the portion P of the image over which
the kernel is hovering. The filter keeps moving to the right with some stride value
until it has parsed the complete width. Then it moves down to the leftmost beginning of
the image, where it continues its journey until the complete image is traversed [6].
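The sliding-kernel process described above can be sketched directly with NumPy. The identity kernel below is only an illustrative choice (the actual kernel values are learned during training); the point is that a 3 × 3 kernel over a 5 × 5 image with stride 1 visits exactly 9 positions, producing a 3 × 3 feature map.

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    """Slide the kernel over the image; at each position take the
    elementwise product with the covered patch and sum it (valid padding)."""
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(25).reshape(5, 5)      # 5 x 5 single-channel input
kernel = np.eye(3)                       # illustrative 3 x 3 kernel
feature_map = convolve2d(image, kernel)  # kernel shifts 9 times -> 3 x 3 map
```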

Pooling Layer

The function of the pooling layer is to reduce the spatial size of the convolved feature.
Because of this, the computational power required to process the data decreases through
dimensionality reduction. It is also useful for finding the dominant features, which are
independent of rotation and position, thereby maintaining the process of effectively
training the model.

Pooling is of two types:

2.1.2.2.1 Max Pooling:

Max pooling works as a noise reducer: it discards the noisy activations, performing
de-noising along with dimensionality reduction.

2.1.2.2.2 Average Pooling:

Average pooling simply performs dimensionality reduction as a noise-suppressing
mechanism. Hence, we can conclude that max pooling performs better than average
pooling [6].

Figure 5 Pooling process
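The two pooling variants above can be sketched as follows. The 4 × 4 input values are arbitrary illustrative numbers; each non-overlapping 2 × 2 window is reduced either to its maximum or to its average.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling: reduce each size x size window of x
    to its maximum ("max") or its average ("avg")."""
    h, w = x.shape[0] // size, x.shape[1] // size
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            window = x[i*size:(i+1)*size, j*size:(j+1)*size]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

x = np.array([[1., 3., 2., 4.],
              [5., 7., 6., 8.],
              [1., 2., 3., 4.],
              [5., 6., 7., 8.]])
max_pooled = pool2d(x, mode="max")  # keeps the strongest activation per window
avg_pooled = pool2d(x, mode="avg")  # smooths the activations instead
```

Both halve each spatial dimension; max pooling retains only the dominant response in each window, which is why it tends to suppress noisy activations.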

2.1.3 Linear Regression

Linear regression analysis is used to predict the value of a variable based on the value
of another variable. The variable you want to predict is called the dependent variable.
The variable you are using to predict the other variable's value is called the independent
variable.

This form of analysis estimates the coefficients of the linear equation, involving one or
more independent variables that best predict the value of the dependent variable. Linear
regression fits a straight line or surface that minimizes the discrepancies between
predicted and actual output values. There are simple linear regression calculators that
use a "least squares" method to discover the best-fit line for a set of paired data. You
can then estimate the value of Y (the dependent variable) from X (the independent
variable) [7].
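A least-squares fit on a small set of paired data can be sketched with NumPy. The data points here are made up for illustration (roughly following y = 2x); `np.polyfit` returns the slope and intercept of the best-fit line, which can then be used to predict Y for a new X.

```python
import numpy as np

# Illustrative paired data, roughly y = 2x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.1, 8.0])

# Least-squares fit of the line y = a*x + b.
a, b = np.polyfit(x, y, 1)

# Estimate the dependent variable for a new independent value.
prediction = a * 5.0 + b
```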

2.1.4 Logistic Regression

 Logistic regression is one of the most popular machine learning algorithms, and
comes under the supervised learning technique. It is used for predicting a
categorical dependent variable from a given set of independent variables.

 Logistic regression predicts the output of a categorical dependent variable.
Therefore, the outcome must be a categorical or discrete value: Yes or No, 0 or 1,
True or False, etc. Instead of giving the exact value 0 or 1, however, it gives
probabilistic values which lie between 0 and 1.

 Logistic regression is much like linear regression except in how it is used:
linear regression is used for solving regression problems, whereas logistic
regression is used for solving classification problems.

 In logistic regression, instead of fitting a regression line, we fit an
"S"-shaped logistic function, which predicts two maximum values (0 or 1).

 The curve from the logistic function indicates the likelihood of something, such
as whether cells are cancerous or not, or whether a mouse is obese or not based on
its weight.

 Logistic regression is a significant machine learning algorithm because it has
the ability to provide probabilities and classify new data using continuous and
discrete datasets.

Logistic regression can be used to classify observations using different types of data
and can easily determine the most effective variables for the classification [8].

Figure 6 Logistic regression
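The idea that logistic regression squashes a linear score into a probability between 0 and 1 can be sketched as follows. The weight and bias here are arbitrary illustrative values, not fitted parameters; thresholding the probability at 0.5 yields the class label.

```python
import numpy as np

def sigmoid(z):
    """The S-shaped logistic function, mapping any real score into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative (not fitted) model parameters.
w, b = 1.5, -3.0

def predict_proba(x):
    """Probability that input x belongs to the positive class."""
    return sigmoid(w * x + b)

p_low = predict_proba(1.0)   # linear score -1.5 -> probability below 0.5
p_high = predict_proba(3.0)  # linear score  1.5 -> probability above 0.5
```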

2.1.5 Multi-label classification

Multi-label classification is an AI text analysis technique that automatically labels (or
tags) text to classify it by topic. It differs from multi-class classification in that
multi-label classification can apply more than one tag to a single text. Using machine
learning and natural language processing to automatically analyze text (news articles,
emails, social media, etc.), multi-label classification can help categorize text data
under predetermined tags, usually topics such as customer service or pricing. It can be a
massive time-saver when analyzing huge amounts of text for a business [9].
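The distinction from multi-class classification can be sketched with one independent sigmoid score per tag, so a single text may receive several labels at once. The tag names and scores below are hypothetical illustrative values, not outputs of a real model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One independent score per tag (illustrative model outputs).
labels = ["customer service", "pricing", "shipping"]
scores = np.array([2.0, 0.5, -3.0])

# Each tag is decided independently, so more than one can apply.
probs = sigmoid(scores)
assigned = [label for label, p in zip(labels, probs) if p >= 0.5]
```

Here two tags clear the 0.5 threshold, so the text is assigned both, which a multi-class classifier (one label per text) could not do.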

2.1.6 Feature pyramid networks

A Feature Pyramid Network, or FPN, is a feature extractor that takes a single-scale
image of an arbitrary size as input and outputs proportionally sized feature maps at
multiple levels, in a fully convolutional fashion. This process is independent of the
backbone convolutional architectures. It therefore acts as a generic solution for building
feature pyramids inside deep convolutional networks to be used in tasks like object
detection.

The construction of the pyramid involves a bottom-up pathway and a top-down pathway.

The bottom-up pathway is the feedforward computation of the backbone ConvNet,
which computes a feature hierarchy consisting of feature maps at several scales with a
scaling step of 2. For the feature pyramid, one pyramid level is defined for each stage.
The output of the last layer of each stage is used as a reference set of feature maps. For
ResNets, the feature activations output by each stage's last residual block are used [10].

The top-down pathway hallucinates higher-resolution features by upsampling spatially
coarser, but semantically stronger, feature maps from higher pyramid levels. These
features are then enhanced with features from the bottom-up pathway via lateral
connections. Each lateral connection merges feature maps of the same spatial size from
the bottom-up pathway and the top-down pathway. The bottom-up feature map is of
lower-level semantics, but its activations are more accurately localized, as it was
subsampled fewer times.
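One top-down merge step can be sketched with plain NumPy, under simplifying assumptions: nearest-neighbour upsampling by the scaling step of 2, and identity addition standing in for the 1 × 1 lateral convolution (which in a real FPN is a learned projection).

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour upsampling by the pyramid's scaling step of 2."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

# Semantically strong but spatially coarse map from a higher pyramid level.
top_down = np.ones((2, 2))

# Finer, better-localized map from the corresponding backbone stage.
bottom_up = np.full((4, 4), 0.5)

# Lateral connection: merge maps of the same spatial size by addition.
merged = upsample2x(top_down) + bottom_up
```

The merged map has the spatial resolution of the bottom-up map while carrying the stronger semantics of the upsampled top-down map.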

Related Terms

2.2.1 IOU

 IOU is computed as the Area of Intersection divided by the Area of Union of
two boxes.

 IOU must be ≥ 0 and ≤ 1.

 A predicted box that perfectly matches the ground-truth box has IOU ≈ 1.

 In the left image of Figure 7, the IOU is very low.

Figure 7 Intersect Over Union
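The IOU definition above can be sketched for axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates:

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union

perfect = iou((0, 0, 10, 10), (0, 0, 10, 10))  # identical boxes -> IOU = 1
disjoint = iou((0, 0, 10, 10), (20, 20, 30, 30))  # no overlap -> IOU = 0
```

The two example calls confirm the bounds: a perfect match gives 1 and non-overlapping boxes give 0, with partial overlaps falling in between.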

2.2.2 Anchor Box / Bounding Box

The bounding box is a rectangle that is drawn in such a way that it covers the entire
object and fits it perfectly. There exists a bounding box for every instance of the object
in the image. And for the box, 4 numbers are predicted which are as follows:

 center_X, center_Y, width, height

Figure 8 Bounding Box

Recall

 Recall is the ratio of true positives (correct predictions) to the total number of ground
truth positives (e.g., the total number of cars)[11].
 How many relevant items are selected?
 The recall is the measure of how accurately we detect all the objects in the data.
 Recall = TP / (TP + FN)
Precision

 Precision is the ratio of true positives (TP) to the total number of
predicted positives (total predictions)[11].
 How many selected items are relevant?

 Precision = TP / (TP + FP)

Figure 9 Precision & Recall

mAP

Average Precision (AP) is calculated by taking the area under the precision-recall curve.

Average Precision combines both precision and recall together.

Mean Average Precision (mAP) is the mean of the AP calculated over all the classes.
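These definitions translate directly into code (an illustrative sketch; the TP, FP and FN counts are assumed to be already matched against the ground truth, e.g. via an IOU threshold):

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def mean_average_precision(ap_per_class):
    """mAP is simply the mean of the per-class average precisions."""
    return sum(ap_per_class) / len(ap_per_class)
```

For example, 8 true positives with 2 false positives and 2 false negatives gives a precision and recall of 0.8 each.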

2.2.3 Threshold

Conf. Threshold

Confidence Threshold is a base probability value above which the detection made by
the algorithm will be considered as an object. Most of the time it is predicted by a
classifier[12].

NMS Threshold

While performing non-max suppression, the nms_threshold decides which overlapping
bounding boxes should be merged into a single bounding box, based on the IOU computed
between those bounding boxes.
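Both thresholds can be illustrated together in a minimal greedy NMS sketch (an intentionally simplified version of how detectors such as YOLO post-process their boxes; the (x1, y1, x2, y2) box format and function names are assumptions for the illustration):

```python
def iou(a, b):
    """IOU of two (x1, y1, x2, y2) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def nms(boxes, scores, conf_threshold=0.5, nms_threshold=0.4):
    """Keep detections above conf_threshold, then greedily suppress boxes
    whose IOU with an already-kept box exceeds nms_threshold."""
    candidates = sorted(
        (bs for bs in zip(boxes, scores) if bs[1] >= conf_threshold),
        key=lambda bs: bs[1], reverse=True)
    kept = []
    for box, _ in candidates:
        if all(iou(box, k) <= nms_threshold for k in kept):
            kept.append(box)
    return kept
```

Low-confidence boxes are discarded first by the confidence threshold; the NMS threshold then collapses heavily overlapping detections of the same object into one box.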

2.2.4 Activation Function

Sigmoid Function

The Sigmoid Activation Function is sometimes known as the logistic function or


squashing function.

Research carried out on sigmoid functions has resulted in three variants of the sigmoid
activation function, which are used in deep learning applications. The sigmoid function
is mostly used in feedforward neural networks.

It is a bounded differentiable real function, defined for real input values, with positive
derivatives everywhere and some degree of smoothness.

The sigmoid function is given by the formula σ(x) = 1 / (1 + e^(−x)).

The sigmoid function appears in the output layers of the DL architectures, and they are
useful for predicting probability-based output.

Figure 10 Sigmoid Activation Function

Rectified Linear Unit (ReLU) Function

ReLU is the most widely used activation function for deep learning applications with the most
accurate results. It is faster compared to many other Activation Functions. ReLU represents a
nearly Linear function and hence it preserves the properties of the linear function that made it
easy to optimize with gradient descent methods. The ReLU activation function performs a
threshold operation to each input element where values less than zero are set to zero[13].

The ReLU is given by the formula f(x) = max(0, x).

Figure 11 ReLU Activation Function

Leaky ReLU (LReLU) Function

The leaky ReLU, was introduced to sustain and keep the weights updates alive during the entire
propagation process. A parameter named alpha was introduced as a solution to ReLU’s dead
neuron problem so that the gradients will not be zero at any time during training.

LReLU computes the gradient using a very small constant value alpha (in the range of
0.01) for the negative part of the input, so the gradient never becomes zero.

LReLU is computed as: f(x) = x if x > 0, and f(x) = αx otherwise, with α ≈ 0.01.

The LReLU has a similar result as compared to standard ReLU with an exception that it
will have non-zero gradients over the entire duration and hence suggesting that there is
no significant result improvement except in sparsity and dispersion when compared to
standard ReLU and other activation functions[13].

Figure 12 Leaky ReLU Activation Function
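The three activation functions above can be written as short scalar functions (a plain-Python sketch; deep learning frameworks apply the same formulas element-wise over tensors):

```python
import math

def sigmoid(x):
    """Logistic (squashing) function: maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """ReLU: thresholds at zero, so negative inputs become 0."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: negative inputs keep a small slope alpha instead of zero."""
    return x if x > 0 else alpha * x
```

Note how leaky_relu(-2.0) returns a small negative value rather than 0, which is exactly what keeps gradients alive for negative inputs.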

2.2.5 Loss Function

Loss Function in YOLO v3:

There are 3 detection layers in the YOLO algorithm. Each of these 3 layers is responsible
for the calculation of loss at three different scales. Then the losses that are calculated at
the 3 scales are then summed up for Backpropagation. Every layer of YOLO uses 7
dimensions to calculate the Loss. The first 4 dimensions correspond to center_X,
center_Y, width, height of the bounding box. The next dimension corresponds to the
objectness score of the bounding box and the last 2 dimensions correspond to the one-
hot encoded class prediction of the bounding box.

The following 4 losses will be calculated:

 MSE of center_X, center_Y, width and height of the bounding box
 BCE of the objectness score of a bounding box
 BCE of the no-objectness score of a bounding box
 BCE of the multi-class predictions of a bounding box[14].
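A simplified sketch of these terms for a single predicted box, using the 7-value layout described above (the real YOLOv3 loss also sums over all boxes and the three scales, and masks out no-object cells, which this sketch omits):

```python
import math

def bce(p, y):
    """Binary cross-entropy between a predicted probability p and target y."""
    eps = 1e-7
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def yolo_box_loss(pred, target):
    """pred/target: [center_X, center_Y, width, height, objectness, class_0, class_1]."""
    coord = sum((p - t) ** 2 for p, t in zip(pred[:4], target[:4]))  # MSE on box coords
    objectness = bce(pred[4], target[4])                             # objectness BCE
    classes = sum(bce(p, t) for p, t in zip(pred[5:], target[5:]))   # one-hot class BCE
    return coord + objectness + classes
```

A prediction that matches its target exactly yields a loss near zero, and the loss grows as the objectness or class probabilities drift from the target.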

Related work

After the breakout of the worldwide pandemic COVID-19, there arises a severe need of
protection mechanisms. The year 2020 has shown mankind some mind-boggling series of
events amongst which the COVID-19 pandemic is the most life-changing event which has
startled the world since the year began. Affecting the health and lives of masses, COVID-
19 has called for strict measures to be followed in order to prevent the spread of disease.
From the very basic hygiene standards to the treatments in the hospitals, people are doing
all they can for their own and the society’s safety; face masks are one of the personal
protective equipment. People wear face masks once they step out of their homes and
authorities strictly ensure that people are wearing face masks while they are in groups and
public places. To select a base model, we evaluated the metrics like accuracy, precision
and recall and selected MobileNetV2 architecture with the best performance having 100%
precision and 99% recall. It is also computationally efficient using MobileNetV2 which
makes it easier to install the model to embedded systems. This face mask detector can be
deployed in many areas like shopping malls, airports and other heavy traffic places to
monitor the public and to avoid the spread of the disease by checking who is following
basic rules and who is not[15].

To identify facemask wearing condition, the input images were processed with image pre-
processing, facial detection and cropping, SR, and facemask-wearing condition
identification. Finally, SRCNet achieved a 98.70% accuracy and outperformed traditional
end-to-end image classification methods by over 1.5% in kappa. Our findings indicate
that the proposed SRCNet can achieve high accuracy in facemask-wearing condition
identification, which is meaningful for the prevention of epidemic diseases including
COVID-19 in public[1].

In 2017, a cascade framework for masked face detection proposed by Wei Bu, Jiangjian
Xiao and Chunhong Zhou used a simple system for mask detection. The architecture
consists of three cascaded convolutional mask detectors: Mask-12, Mask-24-1 and
Mask-24-2. Here a small convolutional model with convolutional layers followed by a
pooling layer is used. Mask-12 is the first stage and Mask-24-2 is the last stage of the
masked face detector. A masked face dataset is used, containing 160 images for training
and 40 images for testing. The training process includes a pre-trained model and
fine-tuned models. Finally, the PASCAL VOC protocol is used for the evaluation process.
Testing on masked faces achieved 86.6% accuracy[16].

While many people are already persuaded of the value of facial protective masks, as
suggested by the World Health Organization (WHO, 2020) and studies in science
conducted by (N. Leung et al., 2020), (S. Zhou et al., 2018) and (M. Sande et al., 2008),
one may note that many people do not wear masks for protection from the virus (see
various sample data in Figure 1). Such findings led nurses and other people to initiate
public health education prevention campaigns on mask wearing. Such campaigns consist
specifically of sensitizing people to the importance of wearing a mask by sharing
prevention posters and sketches[17].

In this paper, we proposed a new object detection method based on YOLOv3, named
Squeeze and Excitation YOLOv3 (SE-YOLOv3). The proposed method can locate the face
in real time and assess how the mask is being worn to aid the control of the pandemic in
public areas. Our main contributions are depicted as follows: We built a large dataset of
masked faces, the Properly-Wearing Masked Face Detection Dataset (PWMFD). Three
predefined classes were included concerning the targeted cases. Combined with the channel
dimension of the attention mechanism, the backbone of YOLOv3 was improved. We added
the Squeeze and Excitation block (SE block) between the convolutional layers of Darknet53,
which helped the model to learn the relationship between channels. The final accuracy
reached 99.64% on the Real-World Masked Face Dataset (RMFD)[18].

Jiang and Fan in 2020 proposed a one-stage face-detection model capable of classifying
detected faces with respect to whether they are wearing masks or not. The proposed
approach was again inspired by the RetinaNet model and represents a one-stage object
detector that consists of a Feature Pyramid Network (FPN) and a novel context attention
module. The model comprises a backbone, a neck, and a head. The main (high accuracy)
model uses a ResNet backbone, but a simpler model with a MobileNet backbone is also
explored. For the neck of the model (the intermediate connection between the backbone
and the heads of the model), the authors use an FPN. For the heads, the proposed approach
relies on a structure similar to that used in single-stage detectors (SSD). The model is tested
on selected subsets from the MAFA and Wider Face datasets that consist of a total of 7959
images with masked and unmasked faces. Despite the impressive detection performance
the proposed models do not distinguish between faces that wear masks properly (in
accordance with recommendations) and faces that do not [19].

Face and iris localization is one of the most active research areas in image understanding for
new applications in security and theft prevention, as well as in the development of human–
machine interfaces. In the past, several methods for real-time face localization have been
developed using face anthropometric templates which include face features such as eyes,
eyebrows, nose and mouth. It has been shown that accuracy in face and iris localization is
crucial to face recognition algorithms. An error of a few pixels in face or iris localization
will produce significant reduction in face recognition rates. In this paper, we present a new
method based on particle swarm optimization (PSO) to generate templates for frontal face
localization in real time. The PSO templates were tested for face localization on the Yale
B Face Database and compared to other methods based on anthropometric templates and
Adaboost.

Additionally, the PSO templates were compared in iris localization to a method using
combined binary edge and intensity information in two subsets of the AR face database,
and to a method based on SVM classifiers in a subset of the FERET database. Results show
that the PSO templates exhibit better spatial selectivity for frontal faces resulting in a better
performance in face localization and face size estimation. Correct face localization reached
a rate of 97.4% on Yale B which was higher than 96.2% obtained with the anthropometric
templates and much better than 60.5% obtained with the Adaboost face detection method.
On the AR face subsets, different disparity errors were considered and for the smallest
error, a 100% correct detection was reached in the AR-63 subset and 99.7% was obtained
in the AR-564 subset. On the FERET subset a detection rate of 96.6% was achieved using
the same criteria. In contrast to the Adaboost method, PSO templates were able to localize
faces on high-contrast or poorly illuminated environments. Additionally, in comparison
with the anthropometric templates, the PSO templates have fewer pixels, resulting in a 40%
reduction in processing time thus making them more appropriate for real-time applications
[20].

In 2021, Madhura Inamdar and Ninad Mehendale performed a project on the detection of
face masks. Deep learning can be used in unsupervised learning algorithms to process
unlabeled data. A CNN model for speedy face detection has been introduced by Li et al.
that evaluates an input image at low resolution, discards non-face sections, and accurately
processes the regions at a greater resolution for precise detection. Calibration nets
are used to stimulate detection. The advantage of this model is that it is fast: it achieves
14 FPS on standard VGA images on the CPU and can be quickened to 100 FPS on
GPU [21].

The objective of this work is to provide a simple and yet efficient tool to detect human faces
in video sequences. This information can be very useful for many applications such as
video indexing and video browsing. In particular the paper will focus on the significant
improvements made to our face detection algorithm presented in [1]. Specifically, a novel
approach to retrieve skin-like homogeneous regions will be presented, which will be later
used to retrieve face images. Good results have been obtained for a large variety of video
sequences [22].

The closest to our work is the recent paper by Qin and Li. Here, the authors describe an
approach (SRCNet) for classifying face-mask wearing. The approach incorporates an
image super resolution model that makes it possible to process low-resolution faces and a
classification network that predicts whether faces are masked, without masks or if the
masks are worn incorrectly. The model is trained and evaluated on a dataset that contained
a total of 3835 images, which unfortunately is no longer available. Out of the 3835 images,
671 contain faces without masks, 134 images contain faces with incorrectly worn masks
and 3030 images contain faces with correctly worn face-masks. An accuracy of 98.70% is
reported for the proposed model. Although this work shares the basic problem statement,
we do not focus solely on low-resolution faces, but explore the general task of detecting
whether face-masks are worn correctly or not regardless of the data characteristics[1],[23].

Currently, there is a global outbreak of novel coronavirus pneumonia, which infected many
people.

One of the most efficient ways to prevent infection is to wear a mask. Thus, mask detection,
which essentially belongs to object detection is meaningful for the authorities to prevent
and control the epidemic. After comparing different methods utilized in object detection
and conducting relevant analysis, YOLO v3-tiny is proved to be suitable for real-time
detection[24].

There are many solutions to prevent the spread of the COVID-19 virus and one of the most
effective solutions is wearing a face mask. Almost everyone is wearing face masks at all
times in public places during the coronavirus pandemic. This encourages us to explore face
mask detection technology to monitor people wearing masks in public places. Most recent
and advanced face mask detection approaches are designed using deep learning. In this
article, two state-of-the-art object detection models, namely, YOLOv3 and faster R-CNN
are used to achieve this task. The authors have trained both the models on a dataset that
consists of images of people of two categories that are with and without face masks. This
work proposes a technique that will draw bounding boxes (red or green) around the faces
of people, based on whether a person is wearing a mask or not, and keeps the record of the
ratio of people wearing face masks on the daily basis. The authors have also compared the
performance of both the models i.e., their precision rate and inference time[25].

The human face is a complicated multidimensional visual model and hence it is very
difficult to develop a computational model for recognizing it. The paper presents a
methodology for recognizing the human face based on the features derived from the image.
The proposed methodology is implemented in two stages. The first stage detects the human
face in an image using viola Jones algorithm. In the next stage the detected face in the
image is recognized using a fusion of Principle Component Analysis and Feed Forward
Neural Network. The performance of the proposed method is compared with existing
methods. Better accuracy in recognition is realized with the proposed method. The
proposed methodology uses Bio ID-Face-Database as standard image database[25].

The COVID-19 is an unparalleled crisis leading to a huge number of casualties and security
problems. To reduce the spread of coronavirus, people often wear masks to protect
themselves. This makes face recognition a very difficult task since certain parts of the face
are hidden. A primary focus of the researchers during the ongoing coronavirus pandemic
is to come up with suggestions to handle this problem through rapid and efficient solutions.
This paper aims to present a review of various methods and algorithms used for human
recognition with a face mask. Different approaches i.e. Haar cascade, Adaboost, VGG-16
CNN Model, etc. are described in this paper. A comparative analysis is made on these
methods to conclude which approach is feasible. With the advancement of technology and
time more reliable methods for human recognition with a face mask can be implemented
in the future. Finally, it includes some of the applications of face detection. This system
has various applications at public places, schools, etc. where people need to be detected
with the presence of a face mask and recognize them and help society[26].

The outbreak of Coronavirus disease has thus far killed over 2.85M people and infected
over 131M all over the world, causing a global health crisis. Due to this, governments were
forced to impose lockdowns all over the world. As made mandatory by the World Health
Organization (WHO), the only effective protection method is to wear face mask every time
we are out in public and maintain social distancing. Wearing face masks will automatically

reduce the risk of spreading of the deadly virus. An efficient approach used for building a
deep learning model for face mask detection will be presented. Here, we will have a dataset
that consists of images with mask and without mask, and later use OpenCV for real-time
face mask detection from our webcam. We will use the dataset to build a COVID-19 face
mask detector with computer vision using Python, OpenCV, TensorFlow and Keras. Our
aim is to identify if the person in the image/video is masked or unmasked. The model
achieves 98.7% accuracy in distinguishing people with or without a face mask. We hope
that our study would be useful in reducing the rapid spread of the virus[27].

CHAPTER 3: SYSTEM ANALYSIS

System Analysis

It is a process of collecting and interpreting facts, identifying the problems, and


decomposition of a system into its components. System analysis is conducted for the
purpose of studying a system or its parts in order to identify its objectives. It is a problem
solving technique that improves the system and ensures that all the components of the
system work efficiently to accomplish their purpose[28].

3.1.1 Requirement Analysis

The team size for the development of the system was 2 and the total project duration
was 20 weeks. Each of the members worked 35 hours per week to develop the system.

i) Functional Requirements

A Functional Requirement (FR) describes the service that the software must provide. It
refers to a software system or its component. A function is nothing more than the
software system's inputs, behavior, and outputs. It could be a calculation, data
manipulation, business process, user interaction, or any other specific functionality that
describes the function that a system is likely to perform[29].

Figure 13 Use Case Diagram

The Use Case describes the interaction between the actor and the system - what the actor
does and how the system reacts.

In face mask detection, all users/visitors should pass through the system first. The system
takes visual input of the user and determines whether each detected face is masked or not.

ii) Non Functional Requirements

Nonfunctional Requirements (NFRs) define system attributes such as security, reliability,


performance, maintainability, scalability, and usability. They serve as constraints or restrictions
on the design of the system across the different backlogs. They ensure the overall system's
usability and effectiveness. Failure to meet any of them can result in systems that do not meet
internal business, user, or market needs, or that do not meet mandatory regulatory or standards
agency requirements.

 Performance and scalability: How fast do the system return results? How much will
this performance change with higher workloads?

 Portability and compatibility: Face mask detection can run on a system with RAM
of 4 GB or higher and a 1 GHz or faster processor. These are readily available on most
systems nowadays. During development, it runs on the Windows platform but
can be further developed to run on Mac and Linux. Python is available for all
platforms. The library used is called OpenCV. All of the tools are
available for cross-platform portability and will have no issues with compatibility.

 Reliability, availability, maintainability: Face mask detection is being deployed


locally in the development environment.

 Localization: Face mask detection matches the local specifics currently. It can
surely also be used globally.

 Usability: The face mask detection system has a simple design. The system is very simple to use.

Hardware Requirements

• 4 GB RAM or higher.
• 1 GHz or faster processor.
• Input device: Keyboard, Mouse
• Output device: Monitor
• Camera

Software Requirements

• Code Editor: VS Code/Pycharm


• Web Browser (for running Google Colab)
• HTML, CSS, jQuery, JavaScript, PHP

3.1.2 Feasibility Analysis

Feasibility analysis, in simple words is an analysis and evaluation of a proposed project


to ensure that it is technically, economically and operationally feasible. As the name
suggests, a feasibility analysis is a study of the viability of an idea. It focuses on
answering the essential question of whether this proposed project idea should be pursued[30].

i) Technical Feasibility

Development of the proposed detection-based face mask detection system is technically
feasible. It complies with current technology. PyCharm is an open-source platform which
can be programmed using the Python language. Also, it only requires access to a
computer system.

ii) Operation Feasibility

This project aims to provide a suitable working environment in the office by providing
alerts to the individual without any harm. Hence, this product is operationally
feasible.

iii) Economic Feasibility

The web application and mobile application used in the system will be developed using
open-source platforms and technologies such as Python, PyCharm and TensorFlow, which
will require no seed investment. Also, any computer device or smartphone will be capable
of making use of the user application. The controller can also access it using any computer
device connected to the system. The tool used for the simulation was open source.
So, the proposed project is economically feasible.

iv) Schedule Feasibility

Schedule Feasibility is defined as the probability of a project to be completed within its


scheduled time limits, by a planned due date. If a project has a high probability to be
completed on-time, then its schedule feasibility is appraised as high[31]. Our project
will be completed on schedule as per the submitted project proposal. Hence, the development
of our system is schedule feasible.

Gantt Chart

The project timeline runs from 13 June to 20 November, with task durations (in days):
Preliminary Investigation: 18, Planning: 30, Coding: 25, Development: 50, Testing and
Debugging: 20, Finalizing: 10, Documentation: 120.

Figure 14 Gantt chart demonstrating project timeline

3.1.3 System Analysis

Data Modelling

The practice of developing a simple illustration of a complicated software system using


text and symbols to illustrate how data will flow is known as data modeling. The graphic
can be used as a blueprint for building new software or reengineering an existing
program to ensure efficient data consumption.

Figure 15 ER Diagram

3.1.3.1.1 Process Modeling


Process models allow you to visualize your business processes so that your enterprise
can better understand your internal business processes and manage and design them
more efficiently. This is usually an agile continuous improvement exercise.

Figure 16 DFD level 0

DFD Level 0 is also called a Context Diagram. It’s a basic overview of the whole system
or process being analyzed or modeled. It’s designed to be an at-a-glance view, showing
the system as a single high-level process, with its relationship to external entities.

Figure 17 DFD level 1

DFD Level 1 provides a more detailed breakout of pieces of the Context Level Diagram.
The main functions carried out by the system are highlighted, as the high-level process

of the Context Diagram is broken down into its sub-processes. The DFD level 1 is in the
above figure.

Figure 18 DFD level 2

Level 2 DFD goes one step deeper into parts of the Level 1 DFD. It can be used to plan or
record the specific/necessary detail about the system’s functioning.

CHAPTER 4: SYSTEM DESIGN

Design
The relational diagram as follows shows the relationship between the 4 major entities of
our application.

Figure 19 Relational Model

Algorithm details

4.2.1 System Algorithm

Introduction to YOLO v3

YOLOv3 (You Only Look Once, Version 3) is a real-time object detection algorithm
that identifies specific objects in videos, live feeds, or images. YOLO uses features
learned by a deep convolutional neural network to detect an object. Versions 1-3 of
YOLO were created by Joseph Redmon and Ali Farhadi.

The first version of YOLO was created in 2016, and version 3, which is discussed
extensively in this article, was made two years later in 2018. YOLOv3 is an improved
version of YOLO and YOLOv2. YOLO is implemented using the Keras or OpenCV
deep learning libraries.

As typical for object detectors, the features learned by the convolutional layers are
passed onto a classifier which makes the detection prediction. In YOLO, the prediction
is based on a convolutional layer that uses 1×1 convolutions.

YOLO is named “you only look once” because the network processes the whole image in a
single forward pass; since its prediction uses 1×1 convolutions, the size of the prediction
map is exactly the size of the feature map before it[32].

Architecture

The YOLOv3 algorithm first separates an image into a grid. Each grid cell predicts some
number of boundary boxes (sometimes referred to as anchor boxes) around objects that
score highly with the aforementioned predefined classes.

Each boundary box has a respective confidence score of how accurate it assumes that
prediction should be and detects only one object per bounding box. The boundary boxes
are generated by clustering the dimensions of the ground truth boxes from the original
dataset to find the most common shapes and sizes.

Other comparable algorithms that can carry out the same objective are R-CNN (Region-
based Convolutional Neural Networks made in 2015) and Fast R-CNN (R-CNN
improvement developed in 2017), and Mask R-CNN.

However, unlike systems like R-CNN and Fast R-CNN, YOLO is trained to do
classification and bounding box regression at the same time.

Working

YOLO is a Convolutional Neural Network (CNN) for performing object detection in


real-time. CNNs are classifier-based systems that can process input images as structured
arrays of data and identify patterns between them (view image below). YOLO has the
advantage of being much faster than other networks and still maintains accuracy.

It allows the model to look at the whole image at test time, so its predictions are informed
by the global context in the image. YOLO and other convolutional neural network
algorithms “score” regions based on their similarities to predefined classes.

High-scoring regions are noted as positive detections of whatever class they most closely
identify with. For example, in a live feed of traffic, YOLO can be used to detect different
kinds of vehicles depending on which regions of the video score highly in comparison
to predefined classes of vehicles[33].

Figure 20 convolutional layer
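As a hedged illustration of this prediction layout, the raw rows a YOLO output map produces can be decoded into pixel-space boxes roughly as follows (the function name and the relative-coordinate convention are assumptions for the sketch; in the actual system the output rows would come from the trained network, e.g. via OpenCV's DNN module):

```python
import numpy as np

def decode_yolo_output(output, frame_w, frame_h, conf_threshold=0.5):
    """Decode raw YOLO rows [cx, cy, w, h, objectness, class scores...]
    (relative coordinates) into (class_id, confidence, x, y, w, h) boxes."""
    detections = []
    for row in output:
        scores = row[5:]
        class_id = int(np.argmax(scores))
        confidence = float(row[4] * scores[class_id])  # objectness * class score
        if confidence < conf_threshold:
            continue
        bw, bh = row[2] * frame_w, row[3] * frame_h
        x = int(row[0] * frame_w - bw / 2)   # top-left corner from the centre
        y = int(row[1] * frame_h - bh / 2)
        detections.append((class_id, confidence, x, y, int(bw), int(bh)))
    return detections
```

Rows whose combined objectness and class score fall below the confidence threshold are discarded; the surviving boxes would then go through non-max suppression as described earlier.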

4.2.2 Project Algorithm

Step 1: Start
Step 2: User launches the software.
Step 3: User presses the start button.
Step 4: When the user presses the start button, the software accesses the webcam of
the device and initializes the webcam.
Step 5: After the webcam is initialized, the software checks whether a person is
wearing face mask or not for each frame.
Step 6: If everyone is wearing face mask, the system displays the status as safe. If less
than 3 persons are not wearing face mask, the system displays warning status. If
more than 3 persons are not wearing face mask, the status is displayed as danger.
Step 7: For each danger status, an alert email is sent to the concerned authority
consisting of count value of masked and non-masked individual along with date and
time.
Step 8: End.
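The status logic of Steps 6 and 7 can be sketched with two hypothetical helpers (the original steps do not say whether exactly three unmasked persons counts as warning or danger, so this sketch treats three or more as danger; the webcam capture and actual email sending would wrap around these functions):

```python
def status_for(unmasked_count):
    """Map the number of unmasked faces in a frame to an alert status (Step 6)."""
    if unmasked_count == 0:
        return "safe"
    if unmasked_count < 3:
        return "warning"
    return "danger"

def alert_message(masked, unmasked, timestamp):
    """Body of the alert email sent for each 'danger' frame (Step 7)."""
    return (f"Face mask alert at {timestamp}: "
            f"{masked} masked and {unmasked} unmasked individuals detected.")
```

In the running system, each frame's detections would be counted, the status displayed on screen, and alert_message passed to the mailer whenever status_for returns "danger".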

CHAPTER 5: IMPLEMENTATION
AND TESTING

Implementation

The face mask detection and alert system developed from the previously mentioned
requirements and designed is implemented in this phase. The tools used and the
implementation details are mentioned as follows:

5.1.1 Tools Used

The following hardware and software tools are used to develop the face mask detection
and alert system:

Hardware tools

 4 GB RAM or higher.
 1 GHz or faster processor.
 Input device: Keyboard, Mouse
 Output device: Monitor
 Camera

Software Tools

The following software tools are used to develop the face mask detection and alert system:

 VS code: Visual Studio Code is a distribution of the Code repository with Microsoft-
specific customizations released under a traditional Microsoft product license. Visual
Studio Code combines the simplicity of a code editor with what developers need for their
core edit-build-debug cycle. It provides comprehensive code editing, navigation, and
understanding support along with lightweight debugging, a rich extensibility model, and
lightweight integration with existing tools.
 PyCharm: PyCharm is an integrated development environment (IDE) used in
computer programming, specifically for the Python programming language. It is
developed by the Czech company JetBrains (formerly known as IntelliJ). It
provides code analysis, a graphical debugger, an integrated unit tester, integration
with version control systems (VCSes), and supports web development with Django
as well as data science with Anaconda.
 Python: Python is an interpreted high-level general-purpose programming language.
Its design philosophy emphasizes code readability with its use of significant
indentation. Its language constructs as well as its object-oriented approach aim to
help programmers write clear, logical code for small and large-scale projects. Python
is dynamically-typed and garbage-collected. It supports multiple programming
paradigms, including structured (particularly, procedural), object-oriented and
functional programming. It is often described as a "batteries included" language due
to its comprehensive standard library.
The Python language was designed with the following features:
 Easy to code: Python is a high-level programming language. Python is very easy to
learn the language as compared to other languages like C, C#, JavaScript, Java, etc.
It is very easy to code in python language and anybody can learn python basics in
a few hours or days. It is also a developer-friendly language.
 Free and Open Source: Python language is freely available at the official website.
Since it is open-source, this means that the source code is also available to the public,
so you can download it, use it, as well as share it.
 Object-Oriented Language: One of the key features of python is Object-Oriented
programming. Python supports object-oriented language concepts such as classes,
objects, encapsulation, etc.
 GUI Programming Support: Graphical user interfaces can be made using modules such as PyQt5, PyQt4, wxPython, or Tk in Python. PyQt5 is the most popular option for creating graphical apps with Python.
 High-Level Language: Python is a high-level language. When we write programs in Python, we do not need to remember the system architecture, nor do we need to manage memory.
 Anaconda Python: Anaconda is a distribution of the Python and R programming
languages for scientific computing (data science, machine learning applications,
large-scale data processing, predictive analytics, etc.), that aims to
simplify package management and deployment. The distribution includes data-
science packages suitable for Windows, Linux, and macOS. It is developed and
maintained by Anaconda, Inc., which was founded by Peter Wang and Travis
Oliphant in 2012. As an Anaconda, Inc. product, it is also known as Anaconda
Distribution or Anaconda Individual Edition, while other products from the
company are Anaconda Team Edition and Anaconda Enterprise Edition, both of
which are not free.
 LabelImg: LabelImg is a free, open source tool for graphically labeling images. It’s
written in Python and uses QT for its graphical interface. It’s an easy, free way to
label a few hundred images to try out your next object detection project.
 Google Drive: Google Drive is a file storage and synchronization
service developed by Google. Launched on April 24, 2012, Google Drive allows
users to store files in the cloud (on Google's servers), synchronize files across
devices, and share files. In addition to a web interface, Google Drive offers apps
with offline capabilities for Windows and macOS computers
and Android and iOS smartphones and tablets. Google Drive encompasses Google
Docs, Google Sheets, and Google Slides, which are a part of the Google Docs
Editors office suite that permits collaborative editing of documents, spreadsheets,
presentations, drawings, forms, and more.
 Microsoft Excel: Microsoft Excel is a helpful and powerful program for data analysis and documentation. It is a spreadsheet program containing a number of columns and rows, where each intersection of a column and a row is a "cell". It is used to create grids of text, numbers, and formulas specifying calculations, which is extremely valuable for many businesses, which use it to record expenditures and income, plan budgets, chart data, and succinctly present fiscal results [34].
The main reason we used Excel in our project was to create a Gantt chart.
 Draw.io software: Designed by Seibert Media, draw.io is proprietary software for making diagrams and charts. The software lets you choose an automatic layout or create a custom one, and it offers a large selection of shapes and hundreds of visual elements to make your diagram or chart one of a kind. It also produces web-based diagramming technology and integrates with Google Drive and Dropbox [35].
We used this online diagram software to make the flowchart, context diagram, data flow diagram, use case diagram, and class diagram.
 Microsoft Word: Microsoft Word is a word processor developed by Microsoft. It was first released on October 25, 1983 [36]. Using Word, you can create documents and edit them later, as and when required, by adding more text and by modifying, deleting, or moving parts of the existing text. Changing the size of the margins can reformat the complete document or part of the text, the font size and type of font can be changed, and page numbers, headers, and footers can be included.
We used Microsoft Word to document the whole workflow from the start to the end of the project.
 GitHub: GitHub is a widely used code hosting platform for version control and collaboration. It lets you and others work together on projects from anywhere. To understand GitHub, we must first have an understanding of Git. Git is a version control system that allows developers to collaborate easily, as they can download a new version of the software, make changes, and upload the newest revision.
The main reason we preferred Git is that it has multiple advantages over the other systems available: it stores file changes more efficiently and ensures file integrity better [37].
 OpenCV: OpenCV is a huge open-source library for computer vision, machine learning, and image processing, and it now plays a major role in the real-time operation that is so important in today's systems. Using it, one can process images and videos to identify objects, faces, or even human handwriting. When it is integrated with libraries such as NumPy, Python is capable of processing the OpenCV array structure for analysis. To identify image patterns and their various features, we use vector spaces and perform mathematical operations on these features.
 NumPy: NumPy is a general-purpose array-processing package. It provides a high-performance multidimensional array object and tools for working with these arrays, and it is the fundamental package for scientific computing with Python. Besides its obvious scientific uses, NumPy can also be used as an efficient multidimensional container of generic data.
 pytz: pytz brings the Olson tz database into Python and thus supports almost all time zones. This module provides date-time conversion functionality and helps when serving an international client base. It enables time-zone calculations in our Python applications and allows us to create timezone-aware datetime instances.
 Imutils: A series of convenience functions to make basic image processing
functions such as translation, rotation, resizing, skeletonization, and displaying
Matplotlib images easier with OpenCV and both Python 2.7 and Python 3.
 Python GUI –
Tkinter: Python offers multiple options for developing GUIs (graphical user interfaces). Of all the GUI methods, tkinter is the most commonly used: it is the standard Python interface to the Tk GUI toolkit shipped with Python. Python with tkinter is the fastest and easiest way to create GUI applications.
Pillow: Pillow is the friendly fork of PIL, the Python Imaging Library by Fredrik Lundh and contributors; we use it for loading and displaying images in the GUI.
Python Tcl: Tcl is a dynamic interpreted programming language, just like Python.
Though it can be used on its own as a general-purpose programming language, it is
most commonly embedded into C applications as a scripting engine or an interface
to the Tk toolkit[38].
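To make the role of these libraries concrete, the mask/no-mask counting performed later in this chapter reduces to a few NumPy array operations. The following is a minimal sketch with illustrative values, not actual project output:

```python
import numpy as np

# Hypothetical class ids for five detections: 0 = mask, 1 = no mask
class_ids = np.array([0, 1, 0, 0, 1])

mask_count = int((class_ids == 0).sum())    # boolean comparison, then count
nomask_count = int((class_ids == 1).sum())

print(mask_count, nomask_count)  # 3 2
```

Comparing an array against a scalar yields a boolean array, and summing it counts the matches; this is exactly the idiom the detection loop uses.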

5.1.2 Implementation Details of Modules

The functions that define the project are as follows:
For mean subtraction, normalization, and channel swapping (inside the capture loop):

while True:
    _, frame = cap.read()
    fps.update()  # advance the imutils FPS frame counter
    height, width, _ = frame.shape
    # Scale pixels to [0, 1], resize to 416x416, subtract a zero mean,
    # and swap BGR to RGB for the network
    blob = cv2.dnn.blobFromImage(frame, 1/255, (416, 416),
                                 (0, 0, 0), swapRB=True, crop=False)
    net.setInput(blob)
    output_layers_names = net.getUnconnectedOutLayersNames()
    layerOutputs = net.forward(output_layers_names)

For detection of individuals wearing face masks:

boxes = []
confidences = []
class_ids = []

for output in layerOutputs:
    for detection in output:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.2:
            # YOLO outputs box centers and sizes relative to the frame
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)
            x = int(center_x - w / 2)  # top-left corner
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)

# Non-maximum suppression: score threshold 0.2, NMS threshold 0.4
indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.2, 0.4)

Adding the output frame and calculating count values:

border_size = 100
border_text_color = [255, 255, 255]
frame = cv2.copyMakeBorder(frame, border_size, 0, 0, 0, cv2.BORDER_CONSTANT)

# calculate count values
filtered_classids = np.take(class_ids, indexes)
mask_count = (filtered_classids == 0).sum()
nomask_count = (filtered_classids == 1).sum()

Displaying count and status:

text = "NoMaskCount: {}  MaskCount: {}".format(nomask_count, mask_count)
cv2.putText(frame, text, (0, int(border_size - 50)),
            cv2.FONT_HERSHEY_SIMPLEX, 0.65, border_text_color, 2)

# display status
text = "Status:"
cv2.putText(frame, text, (width - 200, int(border_size - 50)),
            cv2.FONT_HERSHEY_SIMPLEX, 0.65, border_text_color, 2)

# fraction of detected faces without a mask (small constant avoids /0)
ratio = nomask_count / (mask_count + nomask_count + 0.000001)

if ratio >= 0.1 and nomask_count >= 3:
    text = "Danger !"
    cv2.putText(frame, text, (width - 100, int(border_size - 50)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.65, [26, 13, 247], 2)
    if fps._numFrames >= next_frame_towait:
        # send the danger email again only after skipping a few seconds
        msg = "**Face Mask System Alert** \n\n"
        msg += "Status: \n" + str(text) + "\n"
        msg += "No_Mask Count: " + str(nomask_count) + " \n"
        msg += "Mask Count: " + str(mask_count) + " \n"
        datetime_ist = datetime.now(IST)
        msg += "Date-Time of alert: \n" + \
               datetime_ist.strftime('%Y-%m-%d %H:%M:%S %Z')
        sendEmail(msg)
        next_frame_towait = fps._numFrames + (30 * 60 * 60 * 5 * 5)
        print(fps._numFrames)
        print(next_frame_towait)
elif ratio != 0 and not np.isnan(ratio):
    text = "Warning !"
    cv2.putText(frame, text, (width - 100, int(border_size - 50)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.65, [0, 255, 255], 2)
else:
    text = "Safe"
    cv2.putText(frame, text, (width - 100, int(border_size - 50)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.65, [0, 255, 0], 2)

# draw a labeled bounding box for every detection kept by NMS
if len(indexes) > 0:
    for i in indexes.flatten():
        x, y, w, h = boxes[i]
        label = str(classes[class_ids[i]])
        confidence = str(round(confidences[i], 2))
        color = colors[i]
        cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
        cv2.putText(frame, label + " " + confidence, (x, y + 20),
                    font, 2, (255, 255, 255), 2)

For displaying output:

cv2.imshow('FaceMask Detection', frame)
key = cv2.waitKey(1)
if key == 27:  # Esc key exits
    break

For sending mail:

port = 465  # for SSL
smtp_server = "smtp.gmail.com"
sender_email = "[email protected]"
password = "********"  # Gmail app password (redacted)

def sendEmail(msg, receiver_email="[email protected]"):
    message = 'Subject: {}\n\n{}'.format("Facemask Notifier", msg)
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL(smtp_server, port, context=context) as server:
        server.login(sender_email, password)
        server.sendmail(sender_email, receiver_email, message)
        print("mail sent to", receiver_email)

For the GUI:

root = tkinter.Tk()
root.title("Face mask detection and alert system")
root.configure(bg='#87cefa')

canvas = tkinter.Canvas(root, width=600, height=400)
canvas.grid(columnspan=3, rowspan=3)

logo = Image.open('pp.png')
logo = ImageTk.PhotoImage(logo)
logo_label = tkinter.Label(image=logo)
logo_label.image = logo  # keep a reference so the image is not garbage-collected
logo_label.grid(column=1, row=0)

instruction = tkinter.Label(root, text=" 'we ensure a safe environment for you' ",
                            font="Raleway", anchor="center", justify="center",
                            padx=10, pady=10, bg="#87cefa")
instruction.grid(column=1, row=10, padx=10, pady=10)

# Click is the handler (defined elsewhere) that starts the detection loop
browse_btn = tkinter.Button(root, text="Start", font="Calibri", bg="#64e764",
                            fg="white", height=1, width=15, anchor="center",
                            bd=2, justify="center", command=Click, padx=10, pady=10)
browse_btn.grid(column=1, row=20, padx=10, pady=10)

root.mainloop()

5.2 Testing

5.2.1 Test cases for unit testing

Unit testing is a software testing technique in which individual units of software (groups of program modules together with their usage and operating procedures) are tested to determine whether they are suitable for use. Each independent module is tested by the developer to determine whether it has any issues.
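As an illustration, the status-threshold logic from Section 5.1.2 can itself be unit tested by isolating it in a small function. The function below is a sketch written for this report; its name is assumed, but its thresholds mirror the detection loop above:

```python
import unittest

def status(mask_count, nomask_count):
    # Mirrors the threshold logic used in the detection loop
    ratio = nomask_count / (mask_count + nomask_count + 0.000001)
    if ratio >= 0.1 and nomask_count >= 3:
        return "Danger"
    elif ratio != 0:
        return "Warning"
    return "Safe"

class StatusTest(unittest.TestCase):
    def test_all_masked_is_safe(self):
        self.assertEqual(status(5, 0), "Safe")

    def test_one_unmasked_is_warning(self):
        self.assertEqual(status(5, 1), "Warning")

    def test_three_unmasked_is_danger(self):
        self.assertEqual(status(2, 3), "Danger")

if __name__ == "__main__":
    unittest.main()
```

Running the file executes all three cases, so a regression in the threshold logic is caught without starting the camera or the GUI.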

Test Case 1

Test Objectives: Test for starting the GUI

Expected Output: The application opens successfully.

Figure 21: Start of application

Test Case 2

Test Objectives: Test for opening the camera

Expected Output: The camera opens successfully.

Figure 22: Camera opened

Test Case 3

Test Objectives: Test for detection of a face mask

Expected Output: The system successfully detects whether a person is wearing a mask or not.

Figure 23: Mask detected

Figure 24: No mask detected

Test Case 4

Test Objectives: Test for showing the mask count and no-mask count

Expected Output: The system successfully counts the number of masks.

Figure 25: No-mask count

Figure 26: Mask count

Test Case 5

Test Objectives: Test for the email alert system

Expected Output: An email is successfully sent to the controller.

Figure 27: Email alert

5.2.2 Test cases for system testing

System testing is testing conducted on a complete integrated system to evaluate the system's compliance with its specified requirements. System testing takes as its input all of the integrated components that have passed integration testing.

Test Case 1

Test Objectives: Test for detection of a face mask for a single person, with status

Expected Output: Successfully detected

Figure 28: Single-person face mask detection

Test Case 2

Test Objectives: Test for detection of a face mask for a single person in unusual cases

Expected Output: Successfully detected, with status Warning

Figure 29: Detection of a mask worn in an improper place

Test Case 3

Test Objectives: Test for detection of face masks for multiple people, with status Warning

Expected Output: Successfully detected, with status Warning (persons ≤ 2)

Figure 30: Detection of multiple people not wearing masks

Figure 31: Detection of faces in mixed cases (wearing and not wearing masks)

Test Case 4

Test Objectives: Test for detection of face masks for multiple people, with status Danger

Expected Output: Successfully detected, with status Danger (persons ≥ 3)

Figure 32: Detection of faces with no masks

Test Case 5

Test Objectives: Test for detection of face masks for multiple people, with status Safe

Expected Output: Successfully detected, with status Safe (persons ≥ 3)

Figure 33: Detection of faces wearing masks

Test Case 6

Test Objectives: Test for the email alert system

Expected Output: An email is successfully sent to the controller (at most once every 30 seconds)

Figure 34: Email alert system

Result Analysis

All the functionalities of the project were achieved as per its objectives; they are illustrated with relevant screenshots below.

The finally developed software comes with the following features:

 A system which can detect whether an individual is wearing a mask or not.
Figure 35: Detection of absence of face mask

Figure 36: Detection of Presence of face mask

 An alert system which alerts the concerned authority when an individual isn’t
wearing a mask.

Figure 37: Email alert after detection of absence of face mask

CHAPTER 6: CONCLUSION AND FUTURE
ENHANCEMENTS

Conclusion
The face mask detection system is software that detects whether a person is wearing a face mask and alerts the concerned authority, both by displaying statistics on the screen and by sending an alert email. A simple and easy GUI pops up when the software is launched, and the detection system starts with a single button. Users can set a threshold count for receiving email alerts at their email address. The email states the number of people with and without face masks at a particular time, and the system waits a certain amount of time before sending another alert email. The algorithms used in the face detection and recognition process are fast and work well in good lighting conditions.

This system can replace manual face mask monitoring, which is unsafe, hectic, and impractical in many places. It can be easily deployed as an executable file. The GUI is very simple to use, and the software runs properly on all major operating systems.

Future Enhancements
To overcome the current limitations, the following enhancements can be implemented in the future:

1. Improve accuracy in detecting uncommon or fancy face masks.
2. Detect faces under any lighting conditions.
3. Improve the recognition rate when there are unintentional changes in a person's appearance, such as scarves, glasses, or hats.
4. Provide better performance even on low-end devices.
5. Store images of individuals roaming without face masks so that offenders can be identified and punished.

REFERENCES

[1] B. Qin and D. Li, "Identifying facemask-wearing condition using image super-resolution with classification network to prevent COVID-19," Sensors (Switzerland), vol. 20, no. 18, pp. 1–23, Sep. 2020, doi: 10.3390/s20185236.

[2] "Agile model," [Online]. Available: https://www.javatpoint.com/software-engineering-agile-model.

[3] "The research process," Lumen Learning, [Online]. Available: https://courses.lumenlearning.com/boundless-sociology/chapter/the-research-process/.

[4] "Neural networks," IBM, [Online]. Available: https://www.ibm.com/cloud/learn/neural-networks.

[5] T. Wood, "Convolutional neural network," DeepAI, [Online]. Available: https://deepai.org/machine-learning-glossary-and-terms/convolutional-neural-network.

[6] B. O. F. Technology, Face-Mask Detection Using Yolo V3 Architecture, May 2020.

[7] "Linear regression," IBM, [Online]. Available: https://www.ibm.com/topics/linear-regression.

[8] "Logistic regression in machine learning," [Online]. Available: https://www.javatpoint.com/logistic-regression-in-machine-learning.

[9] "An introduction to multilabel classification," [Online]. Available: https://www.geeksforgeeks.org/an-introduction-to-multilabel-classification/.

[10] K. Sharma and E. Gurinder Singh, "Face Recognition using Principal Component Analysis and ANN," Int. J. Adv. Res. Comput. Commun. Eng., vol. 5, 2016, doi: 10.17148/IJARCCE.2016.53144.

[11] "No Title."

[12] "No Title," [Online]. Available: http://www.thresh.net/.

[13] "Activation function."

[14] "Loss function."

[15] A. Ahmed, S. Adeel, H. Shahriar, and S. Mojumder, "Face Mask Detector," 2020, doi: 10.13140/RG.2.2.32147.50725.

[16] J. Babu, "A Review on Face Mask Detection using Convolutional Neural Network," Int. Res. J. Eng. Technol., 2020, [Online]. Available: www.irjet.net.

[17] L. Dinalankara, "Face Detection & Face Recognition Using Open Computer Vision Classifiers," [Online]. Available: https://www.researchgate.net/publication/318900718.

[18] X. Jiang, T. Gao, Z. Zhu, and Y. Zhao, "Real-time face mask detection method based on YOLOv3," Electronics, vol. 10, no. 7, Apr. 2021, doi: 10.3390/electronics10070837.

[19] M. Jiang, X. Fan, and H. Yan, "RetinaMask: A Face Mask detector," May 2020, [Online]. Available: http://arxiv.org/abs/2005.03950.

[20] C. A. Perez, C. M. Aravena, J. I. Vallejos, P. A. Estevez, and C. M. Held, "Face and iris localization using templates designed by particle swarm optimization," Pattern Recognit., vol. 31, pp. 857–868, 2010.

[21] M. Inamdar and N. Mehendale, "Real-time face mask identification using Facemasknet deep learning network," [Online]. Available: https://ssrn.com/abstract=3663305.

[22] A. Kumar, A. Kaur, and M. Kumar, "Face detection techniques: a review," Artif. Intell. Rev., vol. 52, pp. 927–948, 2019.

[23] B. Batagelj, P. Peer, V. Štruc, and S. Dobrišek, "How to Correctly Detect Face-Masks for COVID-19 from Visual Information?," Appl. Sci., vol. 11, no. 5, p. 2070, Feb. 2021, doi: 10.3390/app11052070.

[24] G. Cheng, S. Li, Y. Zhang, and R. Zhou, "A Mask Detection System Based on Yolov3-Tiny," Front. Soc. …, vol. 2, no. 11, pp. 33–41, 2020, doi: 10.25236/FSST.2020.021106.

[25] S. Singh, U. Ahuja, M. Kumar, K. Kumar, and M. Sachdeva, "Face mask detection using YOLOv3 and faster R-CNN models: COVID-19 environment," Multimed. Tools Appl., vol. 80, no. 13, pp. 19753–19768, 2021, doi: 10.1007/s11042-021-10711-8.

[26] V. S. Bhat, "Review on Literature Survey of Human Recognition with Face Mask," vol. 10, no. 01, pp. 697–702, 2021.

[27] S. Singh, R. Swami, and M. V. Bonde, "REAL TIME FACE MASK DETECTION USING," vol. 8, no. 5, pp. 1–5, 2021.

[28] "System analysis and design overview," [Online]. Available: https://www.tutorialspoint.com/system_analysis_and_design/system_analysis_and_design_overview.htm.

[29] "What is a Functional Requirement in Software Engineering? Specification, Types, Examples," [Online]. Available: https://www.guru99.com/functional-requirement-specification-example.html.

[30] "Feasibility Study Definition: How Does It Work?," [Online]. Available: https://www.investopedia.com/terms/f/feasibility-study.asp.

[31] "What is schedule feasibility?"

[32] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," Tech. Rep., pp. 1–6, 2018, [Online]. Available: https://pjreddie.com/media/files/papers/YOLOv3.pdf.

[33] V. Meel, "YOLOv3 real-time object detection algorithm," viso.ai, [Online]. Available: https://viso.ai/deep-learning/yolov3-overview/.

[34] "What is Excel?," [Online]. Available: https://itconnect.uw.edu/learn/technology-training/.

[35] C. Hope, "Draw.io," 2020.

[36] T. Warren, "Microsoft launches Office 2019," 2019.

[37] K. Brown, "What Is GitHub, and What Is It Used For?," How-To Geek, 13 November 2019.

[38] "tkinter: Python interface to Tcl/Tk," python.org, [Online]. Available: https://docs.python.org/3/library/tkinter.html.
