
Batch 20 KV

Uploaded by

gokulmp.eee2021
SURVEILLANCE THROUGH OBJECT DETECTION USING

MASK R-CNN

A PROJECT REPORT

Submitted by

KANNAN. E (210416105018)

GOKUL. G (210416105012)

in partial fulfillment for the award


of the degree of

BACHELOR OF ENGINEERING
IN
ELECTRICAL AND ELECTRONICS ENGINEERING

CHENNAI INSTITUTE OF TECHNOLOGY, CHENNAI

ANNA UNIVERSITY: CHENNAI 600 025


SEPTEMBER 2020
BONAFIDE CERTIFICATE

Certified that this project report “SURVEILLANCE THROUGH OBJECT
DETECTION USING MASK RCNN” is the bonafide work of “KANNAN. E
(210416105018) and GOKUL. G (210416105012)” who carried out the
project work under my supervision.

SIGNATURE
DR. M. ETTAPAN, M.E., Ph.D.,
HEAD OF THE DEPARTMENT
PROFESSOR
Electrical and Electronics Engineering,
Chennai Institute of Technology,
Pudhupedu, Kundrathur,
Chennai – 600069

SIGNATURE
DR. R. JANARDHANAN, M.E., Ph.D.,
SUPERVISOR
PROFESSOR
Information Technology,
Chennai Institute of Technology,
Pudhupedu, Kundrathur,
Chennai – 600069

Submitted for the ANNA UNIVERSITY examination held on ___22-09-2020___

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT

We thank our beloved Chairman Shri. P. SRIRAM and all the trust members of
Chennai Institute of Technology for providing us with a plethora of facilities to
complete our project successfully.
We owe our sincere gratitude to our Vice Chairman Mr. P. JANAKIRAMAN and our
Secretary Mrs. S. SRIDEVI, for helping us in every way to complete the project successfully.

We take the privilege of expressing our thanks to our Principal Dr. A. Ramesh, M.E., Ph.D., who has been a
bastion of moral strength and a source of incessant encouragement to us.

We express our sincere thanks to Dr.M. ETTAPAN, Ph.D., Head of the Department,
Chennai Institute of Technology, for her valuable guidance and suggestions.

We take immense pleasure to express our heartfelt thanks to our beloved project guide,
Mr. R. JANARTHANAN, M.E., PH.D., and Co-project Guide Mr. KEERTHI
VIJAYADHASAN, M.E for their valuable suggestions, excellent guidance and constant
support provided all through the course of our project.

We also thank the teaching and non-teaching staff members of Electrical and Electronics
Engineering Department and all our fellow students who stood with us to complete our project
successfully.

Last but not least, we extend our deep gratitude to our beloved family members for their
moral support, encouragement and financial support in carrying out this project.
ABSTRACT

India is the second most populous country in the world after China, and its
large population makes crowd management a very challenging task. For the
management and analysis of crowds, we propose a system based on Computer
Vision and Deep Learning. The system uses Mask RCNN to detect objects,
specifically people, in real-time surveillance camera footage. Since only
people have to be detected by our neural network, we train the Convolutional
Neural Network on data that we created ourselves. With the help of the Deep
Learning algorithm (Mask RCNN), we detect the people who cross the camera and
count them in parallel. When the count of people exceeds a certain threshold,
a warning message is displayed so that the crowd in that particular area can
be managed. With our system, crowd management becomes an easy task.
Surveillance cameras are already present everywhere for monitoring people,
and our system uses those cameras to perform crowd management. With the help
of this data we can analyse the crowd and the rate at which people cross a
certain path. This gives an easy and simple solution based on Computer Vision
and Deep Learning.
TABLE OF CONTENTS
CHAPTER TITLE PAGE NO
ABSTRACT (iv)
LIST OF FIGURES (ix)
LIST OF TERMINOLOGIES (ix)
1 INTRODUCTION 1
1.1 Aim 1
1.2 Object Detection 2
1.3 Mask RCNN 3
1.4 Characteristics of Mask RCNN 4
1.5 Applications of Object Detections 5
1.6 Object Detection Algorithms 6
2 LITERATURE SURVEY 8
3 SYSTEM ANALYSIS 11
3.1 Existing System 11
3.1.1 Disadvantages in Existing System 12
3.2 Proposed System 12
3.2.1 Advantages of Proposed System 12
4 SYSTEM REQUIREMENT 14
4.1 Introduction 14
4.2 Hardware and Software Specification 14
4.2.1 Hardware Requirement 14
4.2.2 Software Requirement 15
4.3 Technologies Used 15
4.3.1 Python 15
4.3.2 Tensorflow 16
4.3.3 Keras 16
4.3.4 Open-CV 17
4.3.5 Mask RCNN 18
4.3.6 Version tracking: GIT 18
4.3.7 User Interface 19
4.3.8 Cython 19
4.3.9 Numpy 19

5 SYSTEM DESIGN 20
5.1 Introduction 20
5.2 UML Diagram 20
5.2.1 System Architecture 21
5.2.2 Sequence Diagram 22
5.2.3 Class Diagram 23
6 SYSTEM IMPLEMENTATION 24
6.1 Module 24
6.1.1 Image Annotation Module 24
6.1.2 Object Detection Module 25
6.1.3 People counter module 25
6.2 Algorithm 25
6.2.1 Mask-RCNN 25

7 SYSTEM TESTING 27
7.1 Coding Standards 27
7.1.1 Naming Conventions 27
7.1.2 Value Conventions 28
7.1.3 Script Writing Standard 28
7.1.4 Message Box Format 29
7.2 Testing Objective 29
7.3 Types of Testing 30
7.3.1 Unit Testing 30
7.3.2 Integration Testing 30
7.3.3 Validation Testing 31
7.4 Testing Strategies 31
7.4.1 White Box Testing 31
7.4.2 Black Box Testing 32
7.4.3 User Interface Testing 32
7.4.4 Module Testing 32
7.4.5 Integration Testing 33
7.4.6 User Acceptance Testing 33

8 FEASIBILITY STUDY 34
8.1 Feasibility Study 34
8.1.1 Technical Feasibility 34
8.1.2 Economic Feasibility 35
8.1.3 Operational Feasibility 35

9 RESULT ANALYSIS 37

A) SCREENSHOTS 38

10 CONCLUSION AND FUTURE WORKS 43

11) REFERENCES 44
LIST OF FIGURES

FIGURE NO NAME OF THE FIGURE PAGE NO

5.2 Use Case Diagram 20


5.2.1 System Architecture Diagram 21
5.2.2 Sequence Diagram 22
5.2.3 Class Diagram 23
6.2.1 Mask R-CNN Architecture Diagram 26

LIST OF ABBREVIATIONS

CNN Convolutional Neural Network

FPN Feature Pyramid Network

YOLO You Only Look Once

CHAPTER 1

INTRODUCTION

1.1 AIM

India is the second most populous country in the world after
China, and its large population makes crowd management a very
challenging task. For the management and analysis of crowds, we
propose a system based on Computer Vision and Deep Learning. The
system uses Mask RCNN to detect objects, specifically people, using
the surveillance camera. Since only people have to be detected by
our neural network, we train the Convolutional Neural Network on
data that we created ourselves. With the help of the Deep Learning
algorithm (Mask RCNN), we detect the people who cross the camera
and count them in parallel. When the count of people exceeds a
certain threshold, a warning message is displayed so that the crowd
in that particular area can be managed. With our system, crowd
management becomes an easy task. Surveillance cameras are already
present everywhere for monitoring people, and our system uses those
cameras to perform crowd management. With the help of this data we
can analyse the crowd and the rate at which people cross a certain
path. This gives an easy and simple solution based on Computer
Vision and Deep Learning.

1.2 OBJECT DETECTION

Object detection is a field related to computer vision and image
processing that deals with detecting instances of objects of a certain
class, such as people, cats, dogs or vehicles, in an image or a video.
The most widely used forms of object detection are face detection and
pedestrian detection, since these are well-researched segments of the
field.
Object detection has a wide range of uses, such as image annotation,
activity recognition, face detection, face recognition and object tracking.
The basic concept of object detection is that every object class has its
own special features that help in classifying it; for example, all
circles are round. Object class detection uses these special features to
classify the objects. For example, if we need to detect a circle, we look
for points that lie at a particular distance from a point (the centre).
The methods of object detection considered here fall in the category of
deep-learning-based approaches. For deep learning approaches, any one of
the following methods can be used for object detection:

• Region Proposals (R-CNN, Fast R-CNN, Faster R-CNN)

• Single Shot Multi-Box Detector (SSD)

• You Only Look Once (YOLO)

• Single-Shot Refinement Neural Network for Object Detection (Refine-Det)

• Retina-Net

• Deformable convolutional networks

1.3 MASK RCNN

• Mask RCNN is designed for accuracy rather than memory efficiency.
It is not a lightweight model.
• If you have a small GPU, you might notice that inference runs
correctly but training fails with an Out of Memory error.
• That is because training requires a lot more memory than running in
inference mode.
• Ideally, you would want to use a GPU with 12 GB or more, but you can
train on smaller GPUs by choosing good settings and making the right
trade-offs.
1.4 CHARACTERISTICS OF MASK RCNN

• Mask RCNN has been the state of the art in instance
segmentation.
• Mask RCNN is a deep neural network aimed at solving the instance
segmentation problem in machine learning and computer vision.
• The backbone is a Feature Pyramid Network (FPN) style deep neural
network.
• A lightweight neural network called the Region Proposal Network
(RPN) scans the FPN top-down pathway and proposes regions which
may contain objects.
• An advantage of Mask RCNN is that different layers in the
neural network can be made to learn features at different
scales.
1.5 APPLICATIONS OF OBJECT DETECTION

1. OPTICAL CHARACTER RECOGNITION

Optical character recognition or optical character reader, often
abbreviated as OCR, is the mechanical or electronic conversion of images
of typed, handwritten or printed text into machine-encoded text, whether
from a scanned document, a photo of a document, a scene photo (for
example the text on signs and billboards in a landscape photo) or from
subtitle text superimposed on an image. In each case, characters are
extracted from the image or video.
2. SELF DRIVING CARS
One of the best examples of why object detection is needed is
autonomous driving. In order for a car to decide what to do next,
whether to accelerate, apply the brakes or turn, it needs to know
where all the objects around the car are and what those objects are.
That requires object detection, and we would essentially train the car
to detect a known set of objects such as cars, pedestrians, traffic
lights, road signs, bicycles and motorcycles.
3. FACE DETECTION AND FACE RECOGNITION
Face detection and face recognition are widely used computer
vision tasks. A familiar example is how Facebook detects faces when
a photo is uploaded; this is a simple application of object
detection that we see in our daily life. Face detection can be regarded
as a specific case of object-class detection. In object-class detection,
the task is to find the locations and sizes of all objects in an image that
belong to a given class. Examples include upper torsos, pedestrians
and cars.
4. PEDESTRIAN DETECTION
Pedestrian detection is an essential and significant task in any
intelligent video surveillance system, as it provides the fundamental
information for semantic understanding of the video footage. It has an
obvious extension to automotive applications due to the potential for
improving safety systems.
1.6 OBJECT DETECTION ALGORITHMS
1. Single Shot Detector (SSD):

Single Shot Detector achieves a good balance between speed and
accuracy. SSD runs a convolutional network on the input image only
once and calculates a feature map. A small 3x3 convolutional kernel
is then run on this feature map to predict the bounding boxes and
classification probabilities.
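
As a rough illustration of the prediction step described above, the sketch below works out how many values the 3x3 prediction kernel has to output at each feature-map location. The helper name ssd_head_output_shape, the number of default boxes per location, the feature-map size and the number of classes are assumed values chosen for the example, not figures from this project.

```python
def ssd_head_output_shape(fmap_h, fmap_w, boxes_per_cell, num_classes):
    """Return (total_boxes, values_per_cell) for one SSD-style prediction layer.

    Each default box needs 4 box offsets plus one score per class.
    """
    values_per_box = 4 + num_classes
    values_per_cell = boxes_per_cell * values_per_box
    total_boxes = fmap_h * fmap_w * boxes_per_cell
    return total_boxes, values_per_cell

# Assumed example: 38x38 feature map, 4 default boxes per location,
# 2 classes (person and background).
total_boxes, values_per_cell = ssd_head_output_shape(38, 38, 4, 2)
print(total_boxes, values_per_cell)
```

Because the kernel slides over the whole feature map in one pass, all of these boxes are predicted in a single forward run of the network.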
2. YOLO:
YOLO divides each image into a grid of S x S cells, and each grid cell
predicts N bounding boxes and a confidence for each box. The confidence
reflects the accuracy of the bounding box and whether the bounding box
actually contains an object (regardless of class).
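
The grid idea can be sketched in a few lines. The two helper functions below are hypothetical names introduced only for this illustration; the image size, S, N and the number of classes are assumed values. The output layout of five values per box (x, y, w, h, confidence) plus class scores per cell follows the original YOLO formulation.

```python
def yolo_cell_for_box(cx, cy, img_w, img_h, S):
    """Return (row, col) of the grid cell that contains the box centre."""
    col = min(int(cx / img_w * S), S - 1)
    row = min(int(cy / img_h * S), S - 1)
    return row, col

def yolo_output_shape(S, N, num_classes):
    """Each cell predicts N boxes (x, y, w, h, confidence) plus class scores."""
    return (S, S, N * 5 + num_classes)

# Assumed example: a person centred at (320, 100) in a 640x480 frame, S = 7.
cell = yolo_cell_for_box(320, 100, 640, 480, 7)
shape = yolo_output_shape(7, 2, 1)
print(cell, shape)
```

The cell that contains the centre of an object is the one responsible for predicting that object's bounding box.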

3. Fast R-CNN:

Fast RCNN uses ideas from SPP-net and RCNN and fixes the key
problem in SPP-net, i.e. it makes it possible to train end-to-end.
To propagate the gradients through spatial pooling, it uses a simple
back-propagation calculation that is very similar to the max-pooling
gradient calculation, with the exception that pooling regions overlap
and therefore a cell can have gradients flowing in from multiple
regions.
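
The point about overlapping regions can be shown with a simplified sketch. Here each region of interest is pooled to a single maximum value (a real RoI pooling layer pools each region to a small grid), and the function name roi_max_pool_backward and the example numbers are assumptions made for illustration only.

```python
import numpy as np

def roi_max_pool_backward(fmap, rois, grads_out):
    """Route each region's gradient back to the cell that held its maximum.

    fmap: 2-D feature map. rois: list of (y0, x0, y1, x1) with exclusive ends.
    grads_out: one upstream gradient value per region.
    """
    grad_in = np.zeros_like(fmap, dtype=float)
    for (y0, x0, y1, x1), g in zip(rois, grads_out):
        region = fmap[y0:y1, x0:x1]
        iy, ix = np.unravel_index(np.argmax(region), region.shape)
        # "+=" because overlapping regions may share the same maximum cell.
        grad_in[y0 + iy, x0 + ix] += g
    return grad_in

fmap = np.array([[1, 2, 0, 0],
                 [3, 9, 1, 0],
                 [0, 1, 2, 0],
                 [0, 0, 0, 1]], dtype=float)
rois = [(0, 0, 2, 2), (1, 1, 3, 3)]   # both regions contain the cell (1, 1)
grad = roi_max_pool_backward(fmap, rois, grads_out=[0.5, 2.0])
print(grad[1, 1])
```

In plain max pooling the windows do not overlap, so each cell receives at most one gradient; here the shared cell receives the sum of both.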

4. SPATIAL PYRAMID POOLING:

In SPP-net, we calculate the CNN representation for the entire image
only once and can use that to calculate the CNN representation for each
patch generated by Selective Search.
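
A minimal NumPy sketch of the pooling step is given below, assuming max pooling over 1x1, 2x2 and 4x4 grids of bins. The function name spatial_pyramid_pool, the pyramid levels and the random feature map are assumptions for illustration; the feature map stands in for the CNN output computed once for the whole image.

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map into a fixed-length vector.

    For each level n the map is split into an n x n grid of bins and the
    maximum of each bin is kept, so the output length is C * sum(n * n).
    """
    c, h, w = fmap.shape
    pooled = []
    for n in levels:
        # Bin edges chosen so that the bins cover the whole map.
        ys = np.linspace(0, h, n + 1).astype(int)
        xs = np.linspace(0, w, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                y0, y1 = ys[i], max(ys[i + 1], ys[i] + 1)
                x0, x1 = xs[j], max(xs[j + 1], xs[j] + 1)
                pooled.append(fmap[:, y0:y1, x0:x1].max(axis=(1, 2)))
    return np.concatenate(pooled)

fmap = np.random.rand(8, 40, 60)   # stand-in for the CNN output of one image
patch_a = fmap[:, 5:18, 10:27]     # two candidate regions of different size
patch_b = fmap[:, 0:30, 30:52]
vec_a = spatial_pyramid_pool(patch_a)
vec_b = spatial_pyramid_pool(patch_b)
print(vec_a.shape, vec_b.shape)
```

Both patches are cut from the same feature map and give vectors of the same length, which is what allows patches of any size to be passed to the same fully connected layers.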

CHAPTER 2

LITERATURE SURVEY

1. Mask R-CNN, Facebook AI Research (FAIR)

Authors:
Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick

Description:
The vision community has rapidly improved object detection and
semantic segmentation results over a short period of time. In large
part, these advances have been driven by powerful baseline systems,
such as the Fast/Faster R-CNN and Fully Convolutional Network (FCN)
frameworks for object detection and semantic segmentation,
respectively. These methods are conceptually intuitive and offer
flexibility and robustness, together with fast training and inference
time. Our goal in this work is to develop a comparably enabling
framework for instance segmentation.

2. Crossing-line Crowd Counting with Two-phase Deep Neural Networks

Authors:
Zhuoyi Zhao, Hongsheng Li, Rui Zhao, Xiaogang Wang

Description:
We propose a deep Convolutional Neural Network (CNN) for
counting the number of people across a line-of-interest (LOI) in
surveillance videos. It is a challenging problem and has many
potential applications. Observing the limitations of temporal slices used
by state-of-the-art LOI crowd counting methods, our proposed CNN
directly estimates the crowd counts with pairs of video frames as inputs
and is trained with pixel-level supervision maps. Such rich supervision
information helps our CNN learn more discriminative feature
representations.
A two-phase training scheme is adopted, which decomposes the original
counting problem into two easier sub-problems, estimating crowd
density map and estimating crowd velocity map. Learning to solve the
sub-problems provides a good initial point for our CNN model, which is
then fine-tuned to solve the original counting problem. A new dataset
with pedestrian trajectory annotations is introduced for evaluating LOI
crowd counting methods and has more annotations than any existing one.
Our extensive experiments show that our proposed method is robust to
variations of crowd density, crowd velocity, and directions of the LOI,
and outperforms state-of-the-art LOI counting methods.
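
To make the idea of counting across a line-of-interest concrete, a simple geometric sketch is given below. It is not the CNN method of the paper above: it assumes that per-person trajectories (lists of centroid positions) are already available, treats the line as infinite, and uses hypothetical function names and coordinates.

```python
def side_of_line(point, line_start, line_end):
    """Sign of the cross product: positive on one side of the line, negative on the other."""
    (x, y), (x1, y1), (x2, y2) = point, line_start, line_end
    return (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)

def count_line_crossings(trajectories, line_start, line_end):
    """Count trajectories whose first and last points lie on opposite sides."""
    forward = backward = 0
    for track in trajectories:
        first = side_of_line(track[0], line_start, line_end)
        last = side_of_line(track[-1], line_start, line_end)
        if first < 0 < last:
            forward += 1
        elif last < 0 < first:
            backward += 1
    return forward, backward

# Assumed example: a horizontal line at y = 5 and three centroid tracks.
tracks = [
    [(2, 1), (2, 4), (2, 8)],   # crosses from below the line to above it
    [(5, 9), (5, 2)],           # crosses in the opposite direction
    [(1, 1), (8, 3)],           # never crosses
]
crossings = count_line_crossings(tracks, (0, 5), (10, 5))
print(crossings)
```

Counting the two directions separately gives the net flow of people through the line, which relates to the rate of people crossing a path mentioned in the aim of this project.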

3. A Novel YOLO-based Real-time People Counting Approach

Authors:

Peiming Ren, Wei Fang and Soufiene Djahel

Description:

Real-time people counting from video records is a main building block
for many applications in smart cities. In practice, this task usually
encounters many problems, like the lack of real-time processing of the
recorded videos or the occurrence of errors due to irrelevant people
being counted. To overcome the above issues, we propose a novel
real-time people counting approach dubbed YOLO-PC (YOLO based
People Counting).
CHAPTER 3
SYSTEM ANALYSIS

3.1 EXISTING SYSTEM


The existing systems are mostly hardware based, since hardware
produces more accurate results, but this brings with it the question
of capital investment, installation and maintenance costs, and issues
in the long run.
Software-based systems are easier to use and more useful. The
software-based systems used deep learning algorithms and OpenCV to
implement people counting and analysis. These algorithms are effective
and produced good results; the only drawback of those systems was
their accuracy. Such a system used the YOLO algorithm to identify the
people in the camera view and obtain the count from the video. The
YOLO algorithm runs on a TensorFlow backend, a deep learning framework
by Google for training and detection of objects using computer vision.

3.1.1 DISADVANTAGES IN EXISTING SYSTEM

• Hardware systems have capital investment, installation and
maintenance costs, all of which are a concern.
• The systems have lower accuracy because of the earlier deep learning
algorithms they use.
• They consume more resources because of the heavy training and
detection process.
3.2 PROPOSED SYSTEM

The proposed system is based on a convolutional neural network
algorithm named Mask RCNN. Mask RCNN provides high accuracy and
efficiency in detecting objects in an image as well as in real-time video.
Through object detection we detect the people crossing the camera in order
to analyse the crowd that arrives in a particular area.
We can also identify the total number of people arriving at a particular
place on an occasion. This helps in further preparation for the management
of the crowd. Huge footfall is often associated with popular festivals,
religious ceremonies, public events, concerts etc. The enormous count of
people, coupled with their density (how close they are to each other), easily
poses the risk of a normal crowd turning into a dangerous stampede that
could potentially cost many lives. The counts of visitors at different areas
of a big venue could further be used to detect crowded areas that are at risk
of stampedes. This information could be used to take appropriate crowd
control measures, like crowd routing and redirection, to avoid the
possibility of stampedes.
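
The use of per-area counts described above can be sketched as follows. The function name crowded_areas, the area names, the floor areas and the density limit are all hypothetical values chosen for illustration; the limit in particular is not a safety guideline.

```python
def crowded_areas(counts, area_sizes_m2, max_density):
    """Return the names of areas whose people per square metre exceed max_density."""
    flagged = []
    for name, count in counts.items():
        density = count / area_sizes_m2[name]
        if density > max_density:
            flagged.append(name)
    return sorted(flagged)

# Assumed example: people counts reported by three cameras at one venue.
counts = {"gate": 120, "hall": 300, "stage": 450}
area_sizes_m2 = {"gate": 40, "hall": 400, "stage": 150}
at_risk = crowded_areas(counts, area_sizes_m2, max_density=2.5)
print(at_risk)
```

Areas returned by such a check would be the candidates for crowd routing and redirection.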

3.2.1 ADVANTAGES OF PROPOSED SYSTEM

• Increased accuracy due to the use of the Mask RCNN algorithm.
• The system can easily be attached to any CCTV camera footage to
get it working.
• In addition, this kind of information, coming from multiple
cameras recording multiple areas of a venue, could be used to
improve the crowd counting model so that it makes accurate
predictions about future crowd counts.
CHAPTER 4

SYSTEM REQUIREMENT

4.1 INTRODUCTION
The system requirement is a technical specification of the
requirements for the software product. It is the first step in the
requirements analysis process; it lists the requirements of a particular
software system, including functional, performance and security
requirements. The requirements also provide usage scenarios from a
user, an operational and an administrative perspective. The purpose
of the software requirements specification is to provide a detailed
overview of the software project, its parameters and goals. It
describes the project's target audience and its user interface,
hardware and software requirements. It defines how the client, team
and audience see the project and its functionality.

4.2 HARDWARE AND SOFTWARE SPECIFICATION

4.2.1 HARDWARE REQUIREMENTS

Processor : Intel Core i5 or higher, 3.3 GHz
Mother Board : Intel Board
Hard Disk : 50 GB
RAM : Min 8 GB

4.2.2 SOFTWARE REQUIREMENTS
Operating System : Windows 10
Technologies Used : Python, TensorFlow, Keras, ImageAI
Tools Used : PyCharm
4.3 TECHNOLOGIES USED

4.3.1 PYTHON

Python is an object-oriented, interpreted and interactive
programming language. It is often compared to Lisp, Tcl, Perl,
Ruby, C#, Visual Basic, Visual FoxPro, Scheme or Java.
Python combines remarkable power with very clear syntax. It has
modules, classes, exceptions, very high level dynamic data types, and
dynamic typing. There are interfaces to many system calls and
libraries, as well as to various windowing systems. New built-in
modules are easily written in C or C++ (or other languages,
depending on the chosen implementation). Python is also usable as
an extension language for applications written in other languages that
need easy-to-use scripting or automation interfaces.

4.3.2 TENSORFLOW

TensorFlow is a free and open-source software library. It is a
symbolic math library and is also used for machine learning
applications such as neural networks. It is a library for dataflow
and differentiable programming across a range of tasks. TensorFlow is
Google Brain's second-generation system. Version 1.0.0 was released on
February 11, 2017. While the reference implementation runs on single
devices, TensorFlow can run on multiple CPUs and GPUs (with
optional CUDA and SYCL extensions for general-purpose
computing on graphics processing units). TensorFlow is available on
64-bit Linux, macOS, Windows, and mobile computing platforms
including Android and iOS.

4.3.3 KERAS

Keras is an open source Python library for easily building neural
networks. The library is capable of running on top of TensorFlow,
Microsoft Cognitive Toolkit, Theano and MXNet. TensorFlow and
Theano are the most used numerical platforms in Python
when building deep learning algorithms, but they can be quite
complex and difficult to use. By comparison, Keras provides an easy
and convenient way to build deep learning models.

Keras creator François Chollet developed the library to help people
build neural networks as quickly and easily as possible, putting a
focus on extensibility, modularity, minimalism and Python support.
Keras can be used with GPUs and CPUs and it supports both Python 2
and 3.
4.3.4 Open-CV
OpenCV (Open Source Computer Vision Library) is an open
source computer vision and machine learning software library.
OpenCV was built to provide a common infrastructure for computer
vision applications and to accelerate the use of machine perception in
commercial products. Being a BSD-licensed product, OpenCV
makes it easy for businesses to utilize and modify the code. The
library has more than 2500 optimized algorithms, which include a
comprehensive set of both classic and state-of-the-art computer vision
and machine learning algorithms. These algorithms can be used to
detect and recognize faces, identify objects and classify human
actions in videos.

4.3.5 Mask RCNN

Mask RCNN is designed for accuracy rather than memory
efficiency. It is not a lightweight model. If you have a small GPU,
you might notice that inference runs correctly but training fails with
an Out of Memory error. That is because training requires a lot more
memory than running in inference mode. Mask RCNN has been the
state of the art in instance segmentation. Mask RCNN is a
deep neural network aimed at solving the instance segmentation problem
in machine learning and computer vision. The backbone is a Feature
Pyramid Network (FPN) style deep neural network. A lightweight neural
network called the Region Proposal Network (RPN) scans the FPN
top-down pathway and proposes regions which may contain objects.
4.3.6 VERSION TRACKING: GIT
Version control systems are a category of software tools that help a
software team manage changes to source code over time. Version
control software keeps track of every modification to the code in a
special kind of database. If a mistake is made, developers can turn
back the clock and compare earlier versions of the code to help fix the
mistake while minimizing disruption to all team members.

4.3.7 USER INTERFACE(UI)


HTML and CSS contribute to the User Interface. UX Design refers
to User Experience Design, while UI Design stands for User
Interface Design. Both elements are crucial to a product and work
closely together. But despite their professional relationship, the roles
themselves are quite different, referring to very different parts of the
process and the design discipline. Where UX Design is a more
analytical and technical field, UI Design is closer to what we refer to
as graphic design, though the responsibilities are somewhat more
complex.

4.3.8 CYTHON
CPython is the reference implementation of the Python
programming language. Written in C and Python, CPython is the
default and most widely used implementation of the language.
CPython can be described as both an interpreter and a compiler, as it
compiles Python code into bytecode before interpreting it. It has
a foreign function interface with several languages including C, in
which one must explicitly write bindings in a language other than
Python.
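
The compile-then-interpret behaviour described above can be observed from within Python itself using the built-in compile function and the standard dis module. The expression and the variable name count below are arbitrary examples.

```python
import dis

source = "count + 1"
# Step 1: the source text is compiled into a code object holding bytecode.
code_obj = compile(source, "<example>", "eval")
# Step 2: dis prints a human-readable listing of that bytecode.
dis.dis(code_obj)
# Step 3: the interpreter executes the bytecode with a given variable binding.
result = eval(code_obj, {"count": 41})
print(result)
```

The exact instructions printed by dis differ between Python versions, but the two stages, compilation to bytecode and interpretation of that bytecode, are the same.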

4.3.9 NUMPY

NumPy is a library for the Python programming language, adding
support for large, multi-dimensional arrays and matrices, along with a
large collection of high-level mathematical functions to operate on
these arrays. The ancestor of NumPy, Numeric, was originally created
by Jim Hugunin with contributions from several other developers. In
2005, Travis Oliphant created NumPy by incorporating features of the
competing Numarray into Numeric, with extensive modifications.
NumPy is open-source software and has many contributors.
NumPy targets the CPython reference implementation of Python,
which is a non-optimizing bytecode interpreter. Mathematical
algorithms written for this version of Python often run much slower
than compiled equivalents. NumPy addresses the slowness problem
partly by providing multidimensional arrays and functions and
operators that operate efficiently on arrays; this requires rewriting
some code, mostly inner loops, using NumPy.
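
As a small illustration of rewriting an inner loop with NumPy, the sketch below computes the total area of a set of bounding boxes in two ways. The function names and the box sizes are made up for the example.

```python
import numpy as np

def total_area_loop(widths, heights):
    """Pure-Python inner loop: one multiplication per interpreter step."""
    total = 0.0
    for w, h in zip(widths, heights):
        total += w * h
    return total

def total_area_numpy(widths, heights):
    """Same result, with the loop moved into NumPy's array operations."""
    return float(np.sum(np.asarray(widths) * np.asarray(heights)))

widths = [2, 3, 4]
heights = [5, 6, 7]
print(total_area_loop(widths, heights), total_area_numpy(widths, heights))
```

For three boxes the difference is negligible; the benefit of the array version appears when the same operation runs over thousands of values per frame.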

CHAPTER 5

SYSTEM DESIGN

5.1 INTRODUCTION

In order to design a website, the relational database must be designed
first. Conceptual design can be divided into two parts: the data model
and the process model. The data model focuses on what data should be
stored in the database, while the process model deals with how the data
is processed. To put this in the context of the relational database, the
data model is used to design the relational tables. The process model is
used to design the queries that will access and perform operations on
those tables.

5.2 UML DIAGRAM

USE CASE DIAGRAM

5.2.1 SYSTEM ARCHITECTURE DIAGRAM

An architecture diagram is a graphical representation of a set of
concepts that are part of an architecture, including their principles,
elements and components.

FIG 5.2.1 SYSTEM ARCHITECTURE DIAGRAM
5.2.2 SEQUENCE DIAGRAM

Sequence diagrams model the flow of logic within your system in a
visual manner, enabling you to both document and validate your
logic. They are commonly used for analysis and design purposes and
are among the most popular UML diagrams.

FIG 5.2.2 SEQUENCE DIAGRAM
5.2.3 CLASS DIAGRAM

A class diagram in the Unified Modeling Language (UML) is
a type of static diagram that describes the structure of a system by
showing the system’s classes, their attributes, operations (or
methods), and the relationships among objects.
CHAPTER 6

SYSTEM IMPLEMENTATION

6.1 MODULES

Software is divided into separately named and addressable components
called modules that are integrated to satisfy problem requirements.
Modularity is the single attribute of software that allows a program to be
intellectually manageable.

There are 3 modules:

• Image annotation module
• Object detection module
• People counter module

6.1.1 IMAGE ANNOTATION MODULE:

In this module we create the annotated images used to train our
Mask RCNN, using the image annotation tool. With the help of the
annotation tool we annotate only the people in a set of images; these
annotated images are used for training the object detection module.
The tool produces a model from the training images in the .pb format.
This file is the model that is fed to the object detection module,
with the help of which the algorithm identifies the people in images
and videos.
6.1.2 OBJECT DETECTION MODULE:

The object detection module is where the input image or video is
given, from which the people are detected. This module uses OpenCV
as the base to produce the window that shows the object detection
output. It uses the Mask RCNN algorithm to detect the people in the
image or in real-time video from a CCTV camera. The detected output
is then shown in the OpenCV window, and a copy of the detected video
is saved to the output folder for further analysis.

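6.1.3 PEOPLE COUNTER MODULE

The people counter module maintains a counter of the people detected by
the object detection module. As described in the aim of the project, a
warning message is displayed when the count exceeds a certain threshold.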
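
A minimal sketch of such a counter is shown below. It is not the project's actual implementation: the class name PeopleCounter, the detection format (a list of class-name and score pairs per frame), the minimum score and the threshold value are all assumptions made for illustration.

```python
class PeopleCounter:
    """Counts 'person' detections per frame and flags counts above a threshold."""

    def __init__(self, threshold, min_score=0.5):
        self.threshold = threshold    # assumed warning limit for the area
        self.min_score = min_score    # assumed minimum detection confidence
        self.history = []             # one count per processed frame

    def update(self, detections):
        """detections: list of (class_name, score) pairs for one frame."""
        count = sum(
            1 for name, score in detections
            if name == "person" and score >= self.min_score
        )
        self.history.append(count)
        return count, count > self.threshold

counter = PeopleCounter(threshold=2)
frame_1 = [("person", 0.9), ("person", 0.8), ("dog", 0.7)]
frame_2 = [("person", 0.9), ("person", 0.8), ("person", 0.95), ("person", 0.3)]
first = counter.update(frame_1)
second = counter.update(frame_2)
if second[1]:
    print("WARNING: crowd count", second[0], "exceeds threshold", counter.threshold)
```

Keeping the per-frame history makes it possible to analyse how the count changes over time, in addition to raising the warning.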

6.2 ALGORITHM
The system uses the Mask RCNN algorithm to detect the people in the
image or the video. The algorithm is explained as follows.

6.2.1 MASK RCNN:


Mask RCNN is designed for accuracy rather than memory
efficiency. It is not a lightweight model. If you have a small GPU,
you might notice that inference runs correctly but training fails with
an Out of Memory error. That is because training requires a lot more
memory than running in inference mode.

FIG 6.2.1 MASK RCNN ARCHITECTURE DIAGRAM
CHAPTER-7

SYSTEM TESTING
7.1 CODING STANDARDS
Coding standards are programming guidelines that focus on
the physical structure and appearance of the program. They make the code
easier to read, understand and maintain. This phase of the system actually
implements the blueprint developed during the design phase. The coding
specification should be such that any programmer is able to understand
the code and can make changes whenever necessary. Some of the standards
needed to achieve the above-mentioned objectives are as follows:
• Program should be simple, clear and easy to understand.

• Naming conventions

• Value conventions

• Script and comment procedure

• Message box format

• Exception and error handling

7.1.1 NAMING CONVENTIONS

Naming conventions of classes, data members, member functions,
procedures etc. should be self-descriptive. One should be able to get
the meaning and scope of a variable from its name. The conventions are
adopted for easy understanding of the intended message by the user,
so it is customary to follow them. These conventions are as follows:

Class names
Class names are problem domain equivalents; they begin with a capital
letter and have mixed case.

Member function and data member names
Member function and data member names begin with a lowercase
letter, with the first letter of each subsequent word in uppercase
and the rest of the letters in lowercase.
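
The conventions above can be illustrated with a short example. The class CrowdMonitor and its members are hypothetical names invented for this illustration. Note that the lower camel case used here follows the convention stated in this report, whereas Python's own style guide (PEP 8) recommends lowercase names with underscores for functions and variables.

```python
class CrowdMonitor:                                  # class name: capitalised, mixed case
    def __init__(self, warningThreshold):
        self.warningThreshold = warningThreshold     # data member: lower camel case
        self.peopleCount = 0

    def updatePeopleCount(self, newCount):           # member function: lower camel case
        self.peopleCount = newCount
        return self.peopleCount

    def isOverThreshold(self):
        return self.peopleCount > self.warningThreshold

monitor = CrowdMonitor(warningThreshold=10)
monitor.updatePeopleCount(12)
print(monitor.isOverThreshold())
```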

7.1.2 VALUE CONVENTIONS


Value conventions ensure that variables hold valid values at any point
of time. This involves the following:
• Proper default values for the variables.

• Proper validation of values in the field.

• Proper documentation of flag values.
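
The three points above can be sketched in code as follows. The constant names, the default threshold of 50 and the two status flags are hypothetical values chosen for illustration.

```python
DEFAULT_THRESHOLD = 50   # proper default value for the warning threshold

# Documented flag values returned by crowd_status().
STATUS_NORMAL = 0    # count is at or below the threshold
STATUS_WARNING = 1   # count is above the threshold

def validate_threshold(value):
    """Proper validation: the threshold must be a positive whole number."""
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError("threshold must be a positive integer")
    return value

def crowd_status(count, threshold=DEFAULT_THRESHOLD):
    threshold = validate_threshold(threshold)
    return STATUS_WARNING if count > threshold else STATUS_NORMAL

print(crowd_status(10), crowd_status(60))
```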

7.1.3 SCRIPT WRITING STANDARD


Script writing is an art in which indentation is of the utmost
importance. Conditional and looping statements are to be properly
aligned to facilitate easy understanding. Comments are included to
minimize the number of surprises that could occur when going
through the code.
7.1.4 MESSAGE BOX FORMAT

When something has to be prompted to the user, the user
must be able to understand it properly. To achieve this, a specific
format has been adopted for displaying messages to the user. The
formats are as follows:
• X – User has performed an illegal operation.

• ! – Information to the user.

7.2 TESTING OBJECTIVES:
Testing is a set of activities that can be planned in advance and
conducted systematically. For this reason a template for software
testing, a set of steps into which we can place specific test case
design techniques and testing methods, should be defined for the
software process. Testing often accounts for more effort than any other
software engineering activity. If it is conducted haphazardly, time is
wasted, unnecessary effort is expended, and even worse, errors sneak
through undetected. It would therefore seem reasonable to establish a
systematic strategy for testing software.
7.3 TYPES OF TESTING:

7.3.1 UNIT TESTING:

The primary goal of unit testing is to take the smallest piece of
testable software in the application, isolate it from the remainder of
the code, and determine whether it behaves exactly as expected.
Each unit is tested separately before the units are integrated into
modules, at which point the interfaces between modules are tested.
Unit testing has proven its value in that a large percentage of
defects are identified during its use. In both the company and the
seeker registration forms, a zero-length username and password are
entered and checked. A duplicate username is also entered and checked.
In the job and question entry forms, the button sends data to the
server only if the client-side validations pass. Dates are entered in
the wrong format and checked. An invalid email ID and website URL
(Uniform Resource Locator) are entered and checked.
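
The checks described above can be written as small unit tests. The sketch below assumes a hypothetical `validate_registration` helper and a deliberately simple email pattern; neither is taken from the project code.

```python
import re

# Deliberately simple email pattern, used only for this illustration.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_registration(username, password, email, existing_usernames):
    """Return a list of error messages; an empty list means valid input."""
    errors = []
    if len(username) == 0:
        errors.append("username is empty")
    elif username in existing_usernames:
        errors.append("username already taken")
    if len(password) == 0:
        errors.append("password is empty")
    if EMAIL.fullmatch(email) is None:
        errors.append("email is malformed")
    return errors


def test_zero_length_fields():
    errors = validate_registration("", "", "user@example.com", set())
    assert errors == ["username is empty", "password is empty"]


def test_duplicate_username():
    errors = validate_registration("alice", "secret", "a@example.com", {"alice"})
    assert errors == ["username already taken"]


def test_malformed_email():
    errors = validate_registration("bob", "secret", "bob.example.com", set())
    assert errors == ["email is malformed"]
```

Each test isolates one rule, so a failure points directly at the rule that is broken.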

7.3.2 INTEGRATION TESTING:

Testing is done for each module. After all the modules have been
tested, they are integrated, and the final system is tested with
test data specially designed to show that the system will operate
successfully under all conditions.
Thus system testing is a confirmation that all is correct and
an opportunity to show the user that the system works.

7.3.3 VALIDATION TESTING:

The final step involves validation testing, which determines
whether the software functions as the user expects. The end user,
rather than the system developer, conducts this test. Most software
developers use a process called alpha and beta testing to uncover
errors that only the end user seems able to find.
The completion of the entire project is based on the full
satisfaction of the end users. In this project, validation testing is
carried out in various forms. In the question entry form, only the
correct answer is accepted in the answer box. Answers other than the
four given choices are not accepted.

7.4 TESTING STRATEGIES:

A number of software testing strategies have been proposed in the
literature. All provide the software developer with a template for
testing, and all have the following generic characteristics:
• Testing begins at the component level and works "outward"
toward the integration of the entire computer-based system.
• Different testing techniques are appropriate at different points in
time.
• Testing is conducted by the developer of the software and, for
large projects, by an independent test group.

7.4.1 WHITE BOX TESTING

White box testing is the converse of black box testing. Here the
internal variables are watched during testing, which gives a clear
idea of what is going on during execution of the system. The points at
which bugs occurred were clear, and the bugs were removed.

7.4.2 BLACK BOX TESTING

In this testing, we give input to the system and test the output.
We do not inspect the internal files of the system or the changes
made to them in producing the required output.
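
As an illustration of black box testing, the sketch below exercises a hypothetical people-counting helper purely through its inputs and outputs. The function name, the `(label, score)` tuple layout, and the default threshold are assumptions and are not taken from the project code.

```python
def count_people(detections, min_score=0.5):
    """Count detections labelled 'person' whose score is at least min_score.

    Each detection is assumed to be a (label, score) tuple.
    """
    return sum(
        1 for label, score in detections
        if label == "person" and score >= min_score
    )


def test_count_people_black_box():
    # Only inputs and outputs are checked; internals are not inspected.
    detections = [("person", 0.9), ("person", 0.4), ("car", 0.95)]
    assert count_people(detections) == 1
    assert count_people(detections, min_score=0.3) == 2
    assert count_people([]) == 0
```

A white box test of the same helper would additionally step through the loop and watch the intermediate values.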

7.4.3 USER INTERFACE TESTING

Interface testing is performed to verify the interfaces between
submodules while the submodules are integrated, recursively, into
the master module.
The objective of GUI testing is to validate the GUI against the
business requirements. The expected GUI of the application is
described in the Detailed Design Document and the GUI mockup screens.
GUI testing covers the size of the buttons and input fields on the
screen and the alignment of all text, tables, and table content.
It also validates the menus of the application: after different
menus and menu items are selected, it checks that the page does not
fluctuate and that the alignment remains the same.

7.4.4 MODULE TESTING

Module testing is the process of testing the system module by
module. It covers the various inputs given, the outputs produced, and
their correctness. Testing in this way gives a clear picture of
all the bugs that have occurred.

7.4.5 INTEGRATION TESTING


Integration testing is a level of software testing where
individual units are combined and tested as a group. The purpose of
this level of testing is to expose faults in the interaction between
integrated units. Test drivers and test stubs are used to assist in
Integration Testing.
In a comprehensive software development environment,
bottom-up testing is usually done first, followed by top-down testing.
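
The role of test stubs and test drivers mentioned above can be sketched as follows. This is a minimal illustration, assuming a hypothetical detector interface with a single `detect(frame)` method; the class and function names are our own.

```python
class DetectorStub:
    """Test stub: stands in for the real detector with a fixed answer."""

    def detect(self, frame):
        # Always report two people, regardless of the frame content.
        return [("person", 0.9), ("person", 0.8)]


class CrowdMonitor:
    """Unit under integration: combines a detector with a crowd limit."""

    def __init__(self, detector, limit):
        self.detector = detector
        self.limit = limit

    def is_overcrowded(self, frame):
        people = [d for d in self.detector.detect(frame) if d[0] == "person"]
        return len(people) > self.limit


def run_driver():
    """Test driver: feeds a dummy frame to the unit and collects results."""
    frame = object()  # placeholder; the stub ignores the frame
    strict = CrowdMonitor(DetectorStub(), limit=1).is_overcrowded(frame)
    relaxed = CrowdMonitor(DetectorStub(), limit=5).is_overcrowded(frame)
    return strict, relaxed
```

Because the stub replaces the lower-level detector, the interaction between the two units can be tested before the real detector is available.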

7.4.6 USER ACCEPTANCE TESTING

User acceptance of the system is a key factor in the success
of any system. The system under consideration is tested for user
acceptance by constantly keeping in touch with prospective system
users during development and making changes whenever
required. This is done with regard to the following points:
• Input screen design.
• Output screen design.

CHAPTER 8
FEASIBILITY STUDY

8.1 FEASIBILITY STUDY

A feasibility study is carried out to select the best system that meets
performance requirements. The main aim of the feasibility study
activity is to determine whether it would be financially and
technically feasible to develop the product. The feasibility study
activity involves the analysis of the problem and collection of all
relevant information relating to the product such as the different data
items which would be input to the system, the processing required to
be carried out on these data, the output data required to be produced
by the system as well as various constraints on the behavior of the
system.

8.1.1 TECHNICAL FEASIBILITY

This is concerned with specifying equipment and software that
will successfully satisfy the user requirements. The technical needs
of the system may vary considerably, but might include:
• The facility to produce outputs in a given time.
• Response time under certain conditions.
• Ability to process a certain volume of transactions at a
particular speed.
• Facility to communicate data to distant locations.

In examining technical feasibility, the configuration of the system is
given more importance than the actual make of hardware. The
configuration should give a complete picture of the system's
requirements: how many workstations are required, how these units
are interconnected so that they can operate and communicate
smoothly, and what input and output speeds should be achieved at a
particular quality of printing.

8.1.2 ECONOMIC FEASIBILITY

Economic analysis is the most frequently used technique for
evaluating the effectiveness of a proposed system. More commonly
known as cost/benefit analysis, the procedure is to determine the
benefits and savings that are expected from a proposed system and
compare them with the costs. If the benefits outweigh the costs, a
decision is taken to design and implement the system. Otherwise,
further justification or alterations to the proposed system will have
to be made if it is to have a chance of being approved. This is an
ongoing effort that improves in accuracy at each phase of the system
lifecycle.

8.1.3 OPERATIONAL FEASIBILITY

This is mainly related to human, organizational, and political
aspects. The points to be considered are:
• What changes will be brought about by the system?
• What organizational structures are disturbed?
• What new skills will be required? Do the existing staff
members have these skills? If not, can they be trained in due
course of time?
This feasibility study is carried out by a small group of people
who are familiar with information system techniques and are skilled
in the system analysis and design process. Proposed projects are
beneficial only if they can be turned into information systems that
will meet the operating requirements of the organization. This test of
feasibility asks whether the system will work when it is developed and
installed.

CHAPTER 9

RESULT ANALYSIS

Based on the test results obtained from the different test strategies, we
can define the performance of the code and its efficiency. Black box
testing and white box testing revealed no internal errors. These tests can
be functional or non-functional, though they are usually functional. They
show that our code is working correctly.

The model is trained on a data set of images and videos scraped from the
web.

The model was able to recognize test data that is entirely different from
the data used to train it. The model accuracy for the test images and
video is given below.

A common threshold for IoU (intersection over union) is 0.5. Our model
has an IoU of 0.57, which is considered a decent score among object
detection models.
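
IoU is the area of overlap between a predicted region and the ground truth region, divided by the area of their union. The sketch below computes it for axis-aligned bounding boxes and applies the 0.5 threshold. It is an illustration only: the function names and the `(x1, y1, x2, y2)` box layout are assumptions, and the report does not state whether the reported score was measured on boxes or on masks.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, truth, threshold=0.5):
    """Treat a prediction as correct when its IoU reaches the threshold."""
    return box_iou(pred, truth) >= threshold
```

For example, the boxes `(0, 0, 4, 4)` and `(1, 0, 5, 4)` overlap in an area of 12 and have a union of 20, giving an IoU of 0.6, which passes the 0.5 threshold.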

In lab tests under certain conditions, the model achieved an average
accuracy score above 80.44.

A) SCREENSHOTS

FIG 10.1 Frame detection for object detection

FIG 10.2 Real Time Object Detection Module

FIG 10.3 Video Object Detection Module

FIG 10.4 People Counter Module

FIG 10.5 Mask R-CNN Object Detection Module

CHAPTER 10

CONCLUSION AND FUTURE WORKS

Crowd management is done efficiently and precisely with the help of the
Mask R-CNN algorithm, which helps to monitor every individual in a
hallway or in any place where people are likely to gather. We can thereby
maintain social distancing in these places without the fear of social
transmission.

As a future enhancement, we can apply this project to larger sets of
data, such as football and cricket stadiums, or temples during certain
seasons, which usually carry a high risk of stampede deaths. We can use
our project to monitor the crowd and determine the possibility of a
stampede based on the crowd movement in a certain area.

In large bus stands and railway stations, we can manage the crowd during
festivities by diverting people to prevent congestion.

