Smart Monitoring Final Project
JOMO KENYATTA UNIVERSITY OF AGRICULTURE AND TECHNOLOGY
PROJECT TITLE:
SECURITY TRACKING SYSTEM
Submitted by:
MAINA ARNOLD NDIRITU – EN272-3964/2015
PROJECT SUPERVISOR
MR. OLOO
A Final Year Project submitted to the Department of Electrical and
Electronic Engineering in partial fulfillment of the requirements for the award of a
Bachelor of Science Degree in Electronics and Computer Engineering.
NOVEMBER 2020
DECLARATION
This project report is my original work, except where due acknowledgement is made in the
text, and to the best of my knowledge has not been previously submitted to Jomo Kenyatta
University of Agriculture and Technology or any other institution for the award of a degree
or diploma.
TITLE OF PROJECT:
SUPERVISOR CONFIRMATION:
This project report has been submitted to the Department of Electrical and Electronic
Engineering, Jomo Kenyatta University of Agriculture and Technology, with my approval as
the University supervisor:
ABSTRACT
A lot of investigations have gone cold due to the sudden disappearance of suspected
criminals. Known criminals whose pictures are held by various government agencies also
remain at large; their photos may be circulated, but often to no avail. Known terrorists who
pose a serious security concern can go about their activities because few people are able to
recognize them.
This project seeks to help the relevant authorities capture suspected criminals. This will help
keep the public safe, as offenders will be removed from the streets, and known terrorists can
be captured before they pose a major threat.
Currently, authorities post pictures of persons of interest and hope the public will identify
them and report them to the nearest police station. They also use the many CCTV cameras
mounted across the city to retrace the path used by a criminal and identify their current
location.
A real-time criminal alert system will assist the various authorities in tracking known
criminals as well as suspects in ongoing criminal investigations. This is done by utilizing
computer vision to recognize faces; the methods used for face detection are Haar cascades
and HOG + linear SVM. The system has a web interface for checking the live feed of the
mounted cameras and sends an alert when it recognizes the face of a person of interest. A
text message is also sent to the authorities when a face is recognized.
The project detects criminals' faces in real time and alerts the authorities of their
whereabouts according to the location of the CCTV camera. This will go a long way in
helping the authorities reduce crime, as it will help put criminals behind bars.
TABLE OF CONTENTS
DECLARATION .......................................................................................................................ii
ABSTRACT............................................................................................................................. iii
2.1.4 Flask......................................................................................................................... 15
3.2 Facial Detection and Recognition ................................................................... 19
5.1 Conclusion...................................................................................................................... 27
REFERENCES ........................................................................................................................ 28
APPENDIX .............................................................................................................................. 30
LIST OF FIGURES
FIGURE 2.1: RASPBERRY PI .................................................................................................... 10
FIGURE 2.2: RASPBERRY PI AND CAMERA ........................................................................ 12
FIGURE 3.1: SYSTEM BLOCK DIAGRAM ............................................................................. 18
FIGURE 3.2: SYSTEM FLOWCHART ...................................................................................... 22
FIGURE 4.1 IMAGE RECOGNIZED OF PERSON 1 ................................................................ 23
FIGURE 4.2 IMAGE RECOGNIZED OF PERSON 2 ................................................................ 24
FIGURE 4.3 IMAGE RECOGNIZED OF PERSON 3 ................................................................ 24
FIGURE 4.4 TEXT NOTIFICATION .......................................................................... 25
FIGURE 4.5 DYNAMIC WEBSITE ............................................................................ 26
FIGURE 4.6 STATIC WEBSITE ................................................................................. 26
LIST OF ACRONYMS AND ABBREVIATIONS
API - Application Programming Interface
IP - Internet Protocol
RPI - Raspberry Pi
SMS - Short Message Service
CHAPTER ONE: INTRODUCTION
1.1. Background Information
The demands on video surveillance systems are rapidly increasing in the present day. One of
the first things people will want to know about their surveillance system is whether or not
they have the ability to connect to it over the internet for remote viewing. In the past, security
systems had to be monitored by a guard who was locked away in a room all day watching the
monitors to make sure that nothing would happen. The other option was to review the footage
afterwards, but by then the damage could already have been done. Researchers and scientists
therefore had to devise ways of overcoming this, improving security at large [1].
Commercial spaces, universities, hospitals, casinos, warehouses and major roads require
video capture systems that can alert and record in addition to live video streaming of the
intruder. Advancements in video surveillance technology have made it possible to view a
remote security camera from any internet-enabled PC or smartphone anywhere in the world.
This encompasses the use of CCTV (DVR) systems and IP cameras.
The Raspberry Pi meets both criteria in that it is a cheap, effective computer which can be
interfaced with other modules to realize systems with immense functionality. A lot can be
done with it, ranging from motor speed control, automatic lighting and VPN servers to
security systems [2]. The latter is of great interest in this project.
The Raspberry Pi microcomputer is capable of implementing a cost-effective security system
for various applications. This emerging security technology provides a comfortable and safe
environment. The system can be tweaked to detect an intruder and convey an alert message
to the facility owner, thus allowing remote monitoring of homes from anywhere in the world.
The system to be designed cannot wholly replace CCTV and IP surveillance cameras, but it
can also be implemented by low-income home owners to monitor their homes at a very
affordable price. In addition to the Raspberry Pi board being cheap, the camera used in this
case is relatively cheap compared to the alternatives. The whole security system circuitry is
simple and easy to implement.
Image processing refers to processing performed on an image or video frame taken as input,
where the result may be a set of related parameters of the image. One purpose of image
processing is visualization: observing objects that are not directly visible. Analysis of human
motion is one of the most recent and popular research topics in digital image processing. The
aim is to detect the motion of humans against the background image in a video sequence,
which includes both detection and tracking [3].
1.4. OBJECTIVES
The project uses a Raspberry Pi and a Raspberry Pi camera, both for facial detection and
recognition. When someone commits a crime, pictures of their face are taken and stored in a
database. The database contains watch lists of suspects who can be searched for in live
camera footage. If a person has committed a crime before and been arrested, their picture
will already be in the database. For a first-time offender, a picture of them committing the
crime is used, or one obtained from other sources, e.g. social media or other records
containing their picture. That picture is stored in the database to support facial detection and
recognition. When a face is recognized, an alert is sent to the police as a text message. The
website is also updated asynchronously on every successful facial recognition, which is
useful when there are problems with the communication network and sending a text message
is difficult. The website holds the details of each detection, i.e. name, time and location.
CHAPTER TWO: LITERATURE REVIEW
2.1. EXISTING TECHNOLOGY
2.1.1 Face Recognition
Most people are familiar with facial recognition from its use in social media applications
such as Instagram and Snapchat filters, and in Face ID for phone security. The face is like a
fingerprint, and the technology behind facial recognition is complex.
The most basic task in face recognition is, of course, face detection. Before anything else, a
face must be "captured" in order to recognize it when compared with a new face captured in
the future.
The most common way to detect a face (or any object) is using the Haar cascade classifier.
Object detection using Haar feature-based cascade classifiers is an effective object detection
method proposed by Paul Viola and Michael Jones in their 2001 paper, "Rapid Object
Detection using a Boosted Cascade of Simple Features". It is a machine-learning-based
approach where a cascade function is trained from a large number of positive and negative
images and is then used to detect objects in other images.
Initially, the algorithm needs many positive images (images of faces) and negative images
(images without faces) to train the classifier, from which features are then extracted.
OpenCV comes with a trainer as well as a detector, and already contains many pre-trained
classifiers for faces, eyes, smiles, etc.
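As an illustration, the following minimal sketch (assuming the opencv-python package and a test image named people.jpg) loads one of the bundled pre-trained classifiers and draws a box around each detected face:

import cv2

# haarcascade_frontalface_default.xml ships with OpenCV; cv2.data.haarcascades
# points at the copy bundled with the opencv-python package.
cascade_path = cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
face_clf = cv2.CascadeClassifier(cascade_path)

img = cv2.imread('people.jpg')                  # assumed test image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# scaleFactor=1.3 and minNeighbors=5 are common defaults; tune per camera.
faces = face_clf.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('people_detected.jpg', img)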
2.1.2 Surveillance
In the present day, researchers and developers have come up with a wide range of
surveillance systems used for remote monitoring, alerting and control tasks through
affordable, easy-to-implement hardware. Some have so far been realized, while others
remain propositions.
An embedded home surveillance system assessing the implementation of a cost-effective
alerting system based on small-motion detection was presented by Padmashree A. Shake and
Sumedha S. Borde. They worked on implementing a cheap, low-power, resource-efficient
surveillance system using a set of various sensors. Their system helps to monitor household
activities in real time from anywhere, and is based on a microcontroller, which is nowadays
considered a limited-resource but open-source solution compared to an SBC [5].
D. Jeevanand worked on the design of a networked video capture system using the Raspberry
Pi. The proposed system captures and distributes video over networked systems, besides
alerting the administrator via an SMS alarm as required by the client. Their system was
designed to work in real time and is based on the Raspberry Pi SBC. In contrast to other
embedded systems, their real-time application offers the client a video monitor with the help
of an alerting module and an SBC platform [6].
Sneha Singh and team described an IP camera video surveillance system using Raspberry Pi
technology. The researchers aimed at developing a system which captures real-time images
and displays them in the browser using TCP/IP. The face detection algorithm is implemented
on the Raspberry Pi, enabling live video streaming along with detection of human faces. The
research did not include any surveillance reactions [7].
Mahima F. Chauhan and Gharge Anuradha designed and developed a real-time video
surveillance system based on an embedded web server and the Raspberry Pi B+ board. Their
system has low cost, good openness and portability, and is easy to maintain and upgrade,
thus providing better security solutions. It can be used to effect security in banking halls,
industry, the environment and military installations [8].
In 2014, Jadhav G. J. evaluated the use of various sensors, a wireless module, a
microcontroller unit and a fingerprint module to formulate and implement a cost-effective
surveillance system. He and his team adopted an ARM core as the base processor of the
system. A PIR sensor is used to detect motion in the vision area, while a vibration sensor is
used to sense vibration events such as the sound of breaking. The intruder detection
technique uses the PIR sensor to detect motion and trigger a system that alerts and sends a
short message service notification through a GSM module to a specified phone number.
Their work is notable for adopting numerous diverse kinds of databases, making it more
secure and difficult to hack [9]. Also in 2014, Sanjana Prasad and colleagues worked on
developing a mobile smart surveillance system based on the Raspberry Pi SBC and a PIR
motion detector. Their development boosts the use of portable technology to offer vital
safety to daily life, home security and even control applications. The objective of their
research was to develop a mobile smart-phone home security system based on an
information-capturing module combined with a transmitting module based on 3G
technology, fused with web applications. The SBC controls the PIR sensor events and
operates the video cameras for video streaming and recording tasks. Their system has the
capability to count the number of objects in the scene [10].
Uday Kumar worked on the implementation of a low-cost wireless remote surveillance
system using the Raspberry Pi. Conventional wireless CCTV cameras are widely used in
surveillance systems at low cost. He and his team implemented a low-cost, secure
surveillance system using a camera with a Raspberry Pi, with the acquired images transferred
to Dropbox using a 3G internet dongle. This was successfully implemented using the
Raspberry Pi and a 3G dongle [11].
Major cities around the world are introducing facial recognition cameras to their streets. As
of January this year, Russia has rolled out a live facial recognition system with an application
to alert the police. London is also integrating live facial recognition into daily police
activities, with the Metropolitan Police deploying cameras in busy tourist and shopping areas
to spot individuals "wanted for serious and violent offences."
Police have used facial recognition in past years to identify individuals in archive footage,
but the deployment of live facial recognition is a new development.
Twilio
The main benefits of the Twilio platform are its cost efficiency, rapid development process,
the reliability of the connections provided, and its regional distribution capability.
The Internet of Things has led to the creation of huge volumes of data, both structured and
unstructured, and distilling it into actionable insights is the end goal. To achieve this, the data
must first be gathered, stored and retrieved with ease.
Amazon Simple Storage Service (Amazon S3) is the most fundamental and global
Infrastructure as a Service (IaaS) solution provided by Amazon Web Services (AWS).
Amazon S3 facilitates highly scalable, secure and low-latency data storage in the cloud.
With its simple web service interface, it is easy to store and retrieve data on Amazon S3 from
anywhere on the web. All you need to do is choose a region (a separate geographic area;
choose the one closest to you), create a bucket and start storing data.
Also, with Amazon S3 there is no need to predict future data usage, since any amount of data
can be stored and accessed at any time (though individual objects can only be up to
5 terabytes in size).
Amazon S3 automatically creates multiple replicas of the data so that it is never lost, and
keeping different versions of data on Amazon S3 is easy, which makes data recovery a
simple task. With no minimum fee or setup cost, lifecycle policies such as moving less-used
data to Amazon Glacier to reduce cost, and security policies to stop unauthorized access to
the data, Amazon S3 helps you make the most of your data without much headache.
Amazon S3 is a pioneer in cloud data storage and has many benefits, including security,
availability, low cost and simplicity of management.
The programming languages used for web development are HTML, JavaScript and Python.
HTML is used to structure a web page and its content. It is a mark-up language that defines
the structure of your content. HTML consists of a series of elements, which you use to
enclose, or wrap, different parts of the content to make it appear or act a certain way.
JavaScript is a text-based programming language used both on the client-side and server-side
that allows you to make web pages interactive. Where HTML and CSS are languages that
give structure and style to web pages, JavaScript gives web pages interactive elements that
engage a user.
Python can be used to build server-side web applications. While a web framework is not
required to build web apps, it's rare that developers would not use existing open source
libraries to speed up their progress in getting their application working.
Python is not used in the web browser; the language executed in browsers such as Chrome,
Firefox and Internet Explorer is JavaScript. Projects such as pyjs can compile from Python to
JavaScript. In this case, however, the website has been written using a combination of
Python and JavaScript: Python is executed on the server side, while JavaScript is
downloaded to the client and run by the web browser.
Figure 2.1: Raspberry Pi
To enable communication with the outside world, the Raspberry Pi has to be programmed in
a suitable programming language. These languages include Java, FORTRAN, Pascal,
Python, C, C++, etc. [12]. Each language has its own syntax and semantics. The RPi can be
programmed using any of these languages, but for the purposes of this project Python is of
greatest interest: it is provided by default on Raspbian, and thus optimum operation of the Pi
can be achieved.
An operating system makes the Raspberry Pi run. Since the Raspberry Pi is a credit-card-
sized computer based on Linux, optimum performance is achieved if it is operated in this
environment. Raspbian provides more than a pure OS: it comes with over 35,000 packages,
pre-compiled software bundled in a convenient format for easy installation on the RPi [11].
It is important to note that the Raspberry Pi does not operate in a Windows environment.
PUTTY
PuTTY is a free and open-source terminal emulator, serial console and network file transfer
application. It supports several network protocols, including SCP, SSH, Telnet, rlogin, and
raw socket connection. It can also connect to a serial port.
PuTTY supports many variations on the secure remote terminal and provides user control
over the SSH encryption key and protocol version, alternate ciphers such as AES, 3DES,
RC4, Blowfish and DES, and public-key authentication. PuTTY uses its own format of key
files, PPK (protected by a Message Authentication Code). PuTTY supports SSO through
GSSAPI, including user-provided GSSAPI DLLs. It can also emulate control sequences
from xterm, VT220, VT102 or ECMA-48 terminal emulation, and allows local, remote or
dynamic port forwarding with SSH (including X11 forwarding). The network
communication layer supports IPv6, and the SSH protocol supports the
[email protected] delayed compression scheme. It can also be used with local serial port
connections.
PuTTY comes bundled with command-line SCP and SFTP clients, called "pscp" and "psftp"
respectively, and plink, a command-line connection tool, used for non-interactive sessions.
XMING
Xming provides the X Window System display server, a set of traditional sample X
applications and tools, as well as a set of fonts. It features support for several languages and
has Mesa 3D, OpenGL, and GLX 3D graphics extension capabilities.
Xming may be used with implementations of Secure Shell (SSH) to securely forward X11
sessions from other computers. It supports PuTTY and ssh.exe, and comes with a version of
PuTTY's plink.exe. The Xming project also offers a portable version of PuTTY. When SSH
forwarding is not used, the local file Xn.hosts must be updated with the host name or IP address
of the remote machine where the GUI application is started.
Reasons for choosing the Raspberry Pi include:
• Low cost
• Many interfaces (HDMI, multiple USB, Ethernet, onboard Wi-Fi and Bluetooth, many
GPIOs, USB powered, etc.)
• Readily available examples with community support
• Developing an equivalent custom embedded board would cost a great deal of money and
effort
2.1.3 OpenCV
Computer vision is the process of understanding how images and videos are stored, and how
we can manipulate and retrieve data from them. Computer vision underpins much of
artificial intelligence and plays a major role in self-driving cars and robotics, as well as in
photo-correction apps.
OpenCV is a huge open-source library for computer vision, machine learning and image
processing, and it now plays a major role in real-time operation, which is very important in
today's systems. Using it, one can process images and videos to identify objects, faces, or
even human handwriting. When integrated with libraries such as NumPy, Python can process
the OpenCV array structure for analysis: to identify an image pattern and its features, we use
vector space and perform mathematical operations on those features.
The first OpenCV version was 1.0. OpenCV is released under a BSD license and is hence
free for both academic and commercial use. It has C++, C, Python and Java interfaces and
supports Windows, Linux, Mac OS, iOS and Android. When OpenCV was designed, the
main focus was real-time applications, with an emphasis on computational efficiency;
everything is written in optimized C/C++ to take advantage of multi-core processing.
OpenCV will be used to detect, train and recognize images.
HAAR CASCADES
Haar cascade classification is a machine-learning-based approach where a cascade function
is trained from many positive and negative images and is then used to detect objects in other
images.
The algorithm has four stages:
1. Haar feature selection
2. Creating integral images
3. AdaBoost training
4. Cascading classifiers
It is well known for its ability to detect faces and body parts in an image, but it can be trained
to identify almost any object. Initially, the algorithm needs many positive images of faces
and negative images without faces to train the classifier; features are then extracted from
them.
The first step is to collect the Haar features. A Haar feature considers adjacent rectangular
regions at a specific location in a detection window, sums up the pixel intensities in each
region and calculates the difference between these sums.
During the detection phase, a window of the target size is moved over the input image, and
the Haar features are calculated for each subsection of the image. The difference is then
compared to a learned threshold that separates non-objects from objects.
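The integral image is what makes these rectangle sums cheap: after one cumulative-sum pass, any rectangle sum takes only four lookups. A minimal NumPy sketch of the idea (illustrative only, not the project's code):

import numpy as np

def integral_image(gray):
    # Cumulative sum over rows then columns, padded with a leading zero
    # row and column so every rectangle sum needs exactly four lookups.
    ii = np.cumsum(np.cumsum(gray.astype(np.int64), axis=0), axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def rect_sum(ii, x, y, w, h):
    # Sum of pixels in the w-by-h rectangle with top-left corner (x, y).
    return ii[y+h, x+w] - ii[y, x+w] - ii[y+h, x] + ii[y, x]

def haar_two_rect(ii, x, y, w, h):
    # A two-rectangle Haar feature: top half minus bottom half.
    top = rect_sum(ii, x, y, w, h // 2)
    bottom = rect_sum(ii, x, y + h // 2, w, h // 2)
    return top - bottom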
Cascade Classifier
The cascade classifier consists of a collection of stages, where each stage is an ensemble of
weak learners. The weak learners are simple classifiers called decision stumps. Each stage is
trained using a technique called boosting, which provides the ability to train a highly
accurate classifier by taking a weighted average of the decisions made by the weak learners.
Each stage of the classifier labels the region defined by the current location of the sliding
window as either positive or negative. Positive indicates that an object was found
and negative indicates no objects were found. If the label is negative, the classification of this
region is complete, and the detector slides the window to the next location. If the label is
positive, the classifier passes the region to the next stage. The detector reports an object
found at the current window location when the final stage classifies the region as positive.
The stages are designed to reject negative samples as fast as possible. The assumption is that
the vast majority of windows do not contain the object of interest. Conversely, true positives
are rare and worth taking the time to verify.
Cascade classifier training requires a set of positive samples and a set of negative images.
You must provide a set of positive images with regions of interest specified to be used as
positive samples; an image labeler tool can be used to mark objects of interest with bounding
boxes and output a table of positive samples. You also must provide a set of negative images
from which the function generates negative samples automatically. To achieve acceptable
detector accuracy, set the number of stages, the feature type, and the other training
parameters.
LBPH
In the LBP approach to texture classification, the occurrences of the LBP codes in an image
are collected into a histogram, and classification is then performed by computing simple
histogram similarities. However, applying this approach directly to facial image
representation results in a loss of spatial information, so the texture information should be
codified while also retaining its location. One way to achieve this is to use the LBP texture
descriptors to build several local descriptions of the face and combine them into a global
description. Such local descriptions have been gaining interest lately, which is
understandable given the limitations of holistic representations: these local-feature-based
methods are more robust against variations in pose or illumination than holistic methods.
The basic methodology for LBP-based face description proposed by Ahonen et al. (2006) is
as follows: the facial image is divided into local regions, and LBP texture descriptors are
extracted from each region independently. The descriptors are then concatenated to form a
global description of the face.
This histogram effectively describes the face at three different levels of locality: the LBP
labels in the histogram contain information about the patterns at pixel level; the labels are
summed over a small region to produce information at a regional level; and the regional
histograms are concatenated to build a global description of the face.
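The following NumPy sketch (illustrative only, not the project's code) shows the basic 3×3 LBP operator and the region-histogram concatenation described above:

import numpy as np

def lbp_image(gray):
    # Basic 3x3 LBP: compare each interior pixel with its 8 neighbours,
    # clockwise from the top-left, and pack the results into one byte.
    g = gray.astype(np.int32)
    center = g[1:-1, 1:-1]
    code = np.zeros_like(center)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = g.shape
    for bit, (dy, dx) in enumerate(shifts):
        neighbour = g[1+dy:h-1+dy, 1+dx:w-1+dx]
        code |= (neighbour >= center).astype(np.int32) << bit
    return code

def lbph_descriptor(gray, grid=(8, 8)):
    # Divide the LBP image into grid cells, histogram each cell, and
    # concatenate the histograms into one global face descriptor.
    codes = lbp_image(gray)
    h, w = codes.shape
    gy, gx = grid
    hists = []
    for i in range(gy):
        for j in range(gx):
            cell = codes[i*h//gy:(i+1)*h//gy, j*w//gx:(j+1)*w//gx]
            hist, _ = np.histogram(cell, bins=256, range=(0, 256))
            hists.append(hist / max(hist.sum(), 1))
    return np.concatenate(hists)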
2.1.4 Flask
The Flask package contains the framework used for development of web applications in
Python. The framework depends on two external libraries: Jinja2 (Ronacher, Jinja2, 2008)
and Werkzeug (Ronacher, Werkzeug, The Python WSGI Utility Library, 2014). The Jinja2
library is used to render templates, while the Werkzeug library is a toolkit for WSGI, an
interface between web servers and Python web applications. To install Flask from the
command line on Linux or OS X, use pip install Flask. The pip command can also be used on
Windows, but easy_install and pip need to be installed first.
Jinja2
One of the main features that Flask uses is Jinja2. Jinja2 is a templating language designed
for Python and is an essential part of any Flask application. Its main advantage is the ability
to use data on the server side and display it on the client side of the application.
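A minimal sketch of this server-side rendering (the page content and values here are hypothetical):

from flask import Flask, render_template_string

app = Flask(__name__)

# Jinja2 placeholders ({{ ... }}) are filled on the server before the
# page is returned to the client.
PAGE = "<h1>Criminal Tracker</h1><p>Last detection: {{ name }} at {{ when }}</p>"

@app.route('/')
def index():
    return render_template_string(PAGE, name='Person 1', when='12:00')

if __name__ == '__main__':
    app.run(port=5000)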
Werkzeug
One of the essential parts of the Flask framework is the Werkzeug package, which is a WSGI
library. It is the main component responsible for routing in the application and is needed for
request and response objects. The Web Server Gateway Interface provides an essential
interface which enables communication between a web server and a Python application.
When deploying an application to PythonAnywhere, it is critical that the application's WSGI
Python file be correct before the application will work. WSGI is basically a protocol defined
so that a Python application can communicate with a web server and thus be used as a web
application outside of CGI.
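The WSGI contract itself is small: a callable that takes the request environment and a start_response function and returns an iterable of byte strings. A bare sketch, runnable with the standard library's reference server and independent of Flask:

from wsgiref.simple_server import make_server

def application(environ, start_response):
    # environ carries the request; start_response sets status and headers.
    body = b'Hello from a bare WSGI app'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]

if __name__ == '__main__':
    make_server('', 8000, application).serve_forever()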
WebSocket
WebSocket is a communication protocol introduced with HTML5, mainly implemented by
web clients and servers, though it can also be used outside the web. Unlike HTTP
connections, a WebSocket connection is a permanent, bi-directional communication channel
between a client and the server, where either one can initiate an exchange. Once established,
the connection remains available until one of the parties disconnects. WebSocket
connections are useful for games or web sites that need to display live information with very
low latency.
SocketIO
SocketIO is a cross-browser JavaScript library that abstracts the client application from the
actual transport protocol. For modern browsers the WebSocket protocol is used, but for older
browsers without WebSocket support, SocketIO emulates the connection using one of the
older solutions, choosing the best one available for each client.
The important fact is that in all cases the application uses the same interface; the different
transport mechanisms are abstracted behind a common API. Using SocketIO, you can be
fairly sure that any browser will be able to connect to your application, and that the most
efficient transport available will be used for each browser.
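A minimal Flask-SocketIO sketch of this pattern; the event name and namespace match the project's appendix code, while the 5-second heartbeat is illustrative:

from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app)

def background_loop():
    # Any server-side event (here a timer; in the project a successful
    # facial recognition) can push data to every connected browser.
    while True:
        socketio.sleep(5)
        socketio.emit('newnname', {'details': 'heartbeat'}, namespace='/test')

@socketio.on('connect', namespace='/test')
def on_connect():
    print('Client connected')

if __name__ == '__main__':
    socketio.start_background_task(background_loop)
    socketio.run(app, port=5000)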
CHAPTER THREE: METHODOLOGY
The system adopts the form illustrated by the block diagram below, in line with the
objectives.
[Figure 3.1: System block diagram: capture image, detect face (Haar cascade, HOG + linear
SVM), recognize face, send SMS]
3.1 System Modules
The entire system consists of five components, namely:
a) Raspberry Pi Model B+ controller
b) RJ45 Ethernet connector
c) Pi camera module
d) Micro SD card
e) USB power cable
There are two methods of inputting data: taking photos with the person physically present, or
copying images into the dataset when pictures of a person exist but are not yet in the dataset,
e.g. from social media or other records.
3.2 Facial Detection and Recognition
The Raspberry Pi camera captures the image, which is then processed by the Raspberry Pi.
The sub-processes that take place are: face detection, computation of a face embedding via a
deep metric network, comparison against the known database and, finally, recognition of the
face.
Face detection is done using OpenCV's Haar cascades. Haar cascades were chosen because
HOG is far too slow on the RPi for real-time face detection, and while a CNN face detector
is accurate, the Raspberry Pi does not have enough memory to run a CNN.
Deep metric learning is different: instead of trying to output a single label (or even the
coordinates/bounding box of objects in an image), the network outputs a real-valued feature
vector. The network quantifies the faces, constructing a 128-d embedding (quantification) for
each. For the dlib facial recognition network, the output feature vector is 128-d (i.e., a list of
128 real-valued numbers) used to quantify the face. The network is trained using triplets: the
weights of the neural network are tweaked so that the 128-d measurements of two images of
the same person are closer to each other and farther from the measurement of a third image
of a different person.
Then each face in the input image is matched to a known encoding in the dataset. During
classification, a simple k-NN model with votes is used to make the final face classification,
along the lines of the sketch below.
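As a sketch of this embed-and-compare step, using the face_recognition package, which wraps the dlib network (the filenames are assumptions):

import face_recognition

# Known face: compute its 128-d embedding once and store it.
known_img = face_recognition.load_image_file('suspect.jpg')
known_encoding = face_recognition.face_encodings(known_img)[0]

# New frame: embed every detected face and compare by Euclidean distance.
frame = face_recognition.load_image_file('camera_frame.jpg')
for encoding in face_recognition.face_encodings(frame):
    distance = face_recognition.face_distance([known_encoding], encoding)[0]
    if distance < 0.6:   # 0.6 is the library's conventional match threshold
        print('Suspect recognised, distance %.2f' % distance)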
The LBPH recognizer then works as follows (see the sketch after this list):
• First it sets up four parameters: radius, the radius used to build the circular local binary
pattern; neighbors, the number of sample points used to build the circular local binary
pattern; and grid x and grid y, the number of cells in the horizontal and vertical directions.
• The algorithm is trained using the dataset of facial images obtained.
• The LBP operation is applied to create an intermediate image that describes the original
image in a better way, highlighting the facial characteristics using the neighbors and radius
parameters.
• The grid x and grid y parameters are used to divide the image into multiple grids, from
which the histograms are extracted.
• Once the algorithm is trained, each histogram created represents an image from the training
dataset, so to find the image that matches an input image we just need to compare the two
histograms and return the image with the closest histogram.
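A minimal sketch of these steps with OpenCV's LBPH recognizer (this requires the opencv-contrib-python build; the random training data here is a stand-in for the project's images/ dataset):

import cv2
import numpy as np

# The four parameters named above map directly onto the constructor.
recognizer = cv2.face.LBPHFaceRecognizer_create(
    radius=1, neighbors=8, grid_x=8, grid_y=8)

# faces: equal-size grayscale face crops; labels: integer identities.
faces = [np.random.randint(0, 256, (100, 100), dtype=np.uint8)
         for _ in range(4)]
labels = np.array([0, 0, 1, 1])
recognizer.train(faces, labels)

label, confidence = recognizer.predict(faces[2])
print(label, confidence)   # lower confidence means a closer histogram match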
Once there is a successful facial recognition, an alert is sent via text message, through a
combination of the Twilio API and Amazon S3 cloud storage.
Twilio API
The Twilio API is used to register all receiver numbers; the sender number is provided by
the API. The Twilio API is then used to actually send the text message, which contains the
name and time of the successful facial recognition. The location is also contained in the text,
since the physical location of each camera is recorded during installation, removing the need
for a GPS module.
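A minimal sketch of the Twilio call; the credentials and numbers are placeholders (real values come from the Twilio console):

from twilio.rest import Client

client = Client('ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'your_auth_token')

message = client.messages.create(
    body='Person 1 has been spotted at EMB at 14:32',
    from_='+15005550006',    # Twilio-provided sender number
    to='+254700000000')      # a registered receiver number
print(message.sid)           # unique ID confirming the message was queued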
Amazon Web Services S3
An Amazon S3 cloud storage bucket is used to store the image files. Cloud storage is needed
because the Twilio API cannot directly serve attachments such as images and videos; to send
a multimedia SMS, the images are stored on Amazon S3 before being sent using Twilio. A
package called boto3 transfers the files from the Raspberry Pi to AWS S3.
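A minimal boto3 sketch of this upload; the bucket name and filename are assumptions, and the project's actual upload code appears in the appendix:

import boto3

s3 = boto3.resource('s3')
bucket_name = 'criminal-tracker-media'

# Upload the captured frame, then build its URL for Twilio's media_url.
with open('capture.jpg', 'rb') as data:
    s3.Bucket(bucket_name).put_object(Key='capture.jpg', Body=data,
                                      ContentType='image/jpeg')
url = 'https://s3.amazonaws.com/' + bucket_name + '/capture.jpg'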
3.3 Website
There is also a website which stores the records of successful facial recognitions, i.e. the
name and time of each recognition. The website is hosted on the Raspberry Pi; to access the
web page from another computer, the Raspberry Pi's IP address is used. Data is sent from the
server to the client using SocketIO and WebSockets. There is also a static website displaying
the cameras and their locations. The server is written in Python and the client side in
JavaScript.
The web page is automatically updated as a result of events that happen in the background
on the server: the event that triggers an update is a successful facial recognition, and the page
is updated with the name, time and location. Unlike text messages, which can be deleted, the
data displayed on the website cannot be modified.
3.4 System Flowchart
Figure 3.2 illustrates the complete system flowchart.
CHAPTER FOUR: RESULTS
The project implements Haar cascades and LBPH to detect and recognize faces respectively.
LBPH is fairly accurate, rarely giving false results, i.e. mistaken identity.
Figure 4.1 Image Recognized of person 1
Figure 4.2 Image Recognized of person 2
Figure 4.3 Image Recognized of person 3
The dataset contains pictures of persons of interest: either there was a previous conviction
and a picture was taken, or a new suspect's picture can be obtained and included in the
dataset.
Figure 4.4 Text Notification
Figure 4.5 Dynamic Website
There is also a static webpage which displays the locations of the cameras.
Figure 4.6 Static Website
CHAPTER FIVE: CONCLUSION
5.1 Conclusion
The various modules were successfully integrated to achieve a security tracking system.
Whenever there is a successful facial recognition, an SMS is sent giving the name, time and
location at which the criminal was tracked.
With the aid of the website, there is a record of every successfully tracked criminal, which
can be used for confirmation and authentication, since the data on the website cannot be
modified. The data is updated asynchronously, with updates triggered by each successful
facial recognition.
5.2 Challenges
5.3 Recommendation
i. That anyone wishing to further this project consider using a higher-quality camera.
ii. That anyone wishing to further this project consider incorporating a more powerful
microprocessor, as the Raspberry Pi is naturally limited in computation power and memory
(especially without a GPU), which makes the project slow.
iii. That the department consider setting up a computer lab with free, fast internet, especially
for research by finalist students.
iv. That the school incorporate more units into the curriculum, especially computer-based
units, to enable students to come up with more robust projects based on more recent
technologies.
REFERENCES
[1] "Real-Time Face Recognition: An End-to-End Project," [Online]. Available:
https://www.hackster.io/mjrobot/real-time-face-recognition-an-end-to-end-project-a10826#toc-introduction-0
[2] Z. Sundas, “Motion Detecting Camera Security System with Email Notifications and
Live Streaming Using Raspberry Pi.”
[3] M. Peter and H. David, “Learn Raspberry Pi with Linux,” Apress, 2012.
[4] P. S. Dhake and S. S. Borde, "Embedded Surveillance System Using PIR Sensor," vol. 2,
no. 3, 2014.
[6] J. D., “Real Time Embedded Network Video Capture And SMS Alerting system,” Jun.
2014.
[7] S. Sneha, “IP Camera Video Surveillance using Raspberry Pi.,” Feb. 2015.
[8] M. F. Chauhan and A. Gharge, "Design and Develop Real Time Video Surveillance
System Based on Embedded Web Server Raspberry PI B+ Board," International Journal of
Advance Engineering and Research Development (IJAERD), NCRRET, pp. 1–4, 2015.
[10] P. Sanjana, J. S. Clement, and S. R., “Smart Surveillance Monitoring System Using
Raspberry PI and PIR Sensor.,” 2014.
[11] U. Kumar, R. Manda, S. Sai, and A. Pammi, "Implementation of Low Cost Wireless
Image Acquisition and Transfer to Web Client Using Raspberry Pi for Remote Monitoring,"
International Journal of Computer Networking, Wireless and Mobile Communications
(IJCNWMC), vol. 4, no. 3, pp. 17–20, 2014.
APPENDIX
A1: Code for image detection
import cv2
import numpy as np
import time
import os

##############################################################################

def detect_faces(img, train=0):
    # Load the pre-trained frontal-face Haar cascade and run it on a
    # grayscale copy of the frame.
    face_clf = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_clf.detectMultiScale(gray, 1.3, 5)
    array_face = []
    rect = []
    for (x, y, w, h) in faces:
        fc = gray[y:y+h, x:x+w]          # cropped face region
        rect.append((x, y, w, h))
        array_face.append(fc)
    if train:
        # In training mode return only the first detected face.
        if len(array_face) != 0:
            return array_face[0], rect[0]
        return [], []
    return array_face, rect
# ---- Main Flask application (a separate file from A1) ----
from flask import Flask, render_template, request
from flask_socketio import SocketIO          # needed for the socketio object used below
from flask_bootstrap import Bootstrap
from flask_moment import Moment
from threading import Thread                  # Face_recog subclasses Thread
import datetime
from pytz import timezone
import cv2
import numpy as np
import time
import os
import boto3                                  # used by upload()
from twilio.rest import Client                # used to send the SMS alert
from face_detection import *

app = Flask(__name__)
app.config['SECRET_KEY'] = 'secret!'
app.config['DEBUG'] = True
socketio = SocketIO(app)
bootstrap = Bootstrap(app)
moment = Moment(app)

lastsent = None
thres = 0
thread = Thread()      # placeholder thread, replaced on first client connect

def camera():
    # Map the camera number to the physical location recorded at installation.
    cameranumber = 1
    if cameranumber == 1:
        location = "EMB"
    if cameranumber == 2:
        location = "ANNEX ROAD"
    if cameranumber == 3:
        location = "Hospital Road"
    if cameranumber == 4:
        location = "Rufui road"
    if cameranumber == 5:
        location = "SPA"
    return location
def label_reading(filename):
    # "person1.jpg" -> "person": strip the extension and any digits.
    s = filename.split('.jpg')[0]
    return ''.join([i for i in s if not i.isdigit()])

# names (the list of dataset labels) and bucketName are defined in code
# omitted from this extract.

class Face_recog(Thread):
    def __init__(self):
        self.face_recognizer = cv2.face.LBPHFaceRecognizer_create()
        self.path = 'images/'
        self.delay = 1
        super(Face_recog, self).__init__()

    def train(self):
        images = os.listdir(self.path)
        faces = []
        labels = []
        for image in images:                  # loop restored; lost in the extract
            img = cv2.imread(self.path + image)
            labels.append(names.index(label_reading(image)))
            img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            faces.append(img)
        faces = np.array(faces)
        print("faces recognised")
        self.face_recognizer.train(faces, np.array(labels))
    def recognition(self, img):
        # Detect every face in the frame, predict its identity with the
        # trained LBPH model, then alert and annotate the frame.
        global lastsent
        array_face, rect = detect_faces(img)
        for i, face in enumerate(array_face):
            # predict() restored here; the extract jumps straight to the label.
            label, confidence = self.face_recognizer.predict(face)
            label_text = names[label]
            location = camera()

            # Send the SMS alert via Twilio. The credentials and receiver
            # number are redacted placeholders.
            account_sid = 'ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
            auth_token = 'your_auth_token'
            client = Client(account_sid, auth_token)
            message = client.messages.create(
                body=names[label] + ' has been spotted at ' + location,
                from_='+15025491709',
                # media_url=self.upload(img),
                to='+254700000000')

            x = ' '
            print(names[label])
            # nname and ts reconstructed; the extract omits these assignments.
            nname = names[label]
            ts = datetime.datetime.now(timezone('Africa/Nairobi')).strftime('%H:%M:%S')
            details = str(nname) + str(9*x) + '--' + ts + str(9*x) + '--' + str(location)
            socketio.emit('newnname', {'details': details}, namespace='/test')
            socketio.sleep(5)
            print(message.sid)
            lastsent = names[label]

            # Draw the bounding box and recognized name on the frame.
            (x, y, w, h) = rect[i]
            cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0))
            cv2.putText(img, label_text, (x, y), cv2.FONT_HERSHEY_PLAIN,
                        1.5, (0, 255, 0), 2)
        return img

    def upload(self, img):
        # Write the frame to a temporary file and push it to S3 so Twilio
        # can fetch it as MMS media. TempFile is a small helper not shown here.
        tempImage = TempFile(ext=".jpg")
        cv2.imwrite(tempImage.path, img)
        s3 = boto3.resource('s3')
        filename = tempImage.path[tempImage.path.rfind("/") + 1:]
        data = open(tempImage.path, 'rb')
        s3.Bucket(bucketName).put_object(Key=filename, Body=data,
                                         ContentType='image/jpg')
        url = 'https://s3.amazonaws.com/' + bucketName + '/' + filename
        return url
def main():
    # Background task: train the model, then run recognition on every frame.
    # (The original file also decorated this with @app.route('/'), which
    # duplicated index() below; it is started via start_background_task instead.)
    model = Face_recog()
    model.train()
    video_capture = cv2.VideoCapture(0)
    while True:
        ret, frame = video_capture.read()
        frame = model.recognition(frame)   # per-frame step implied by the extract
    cv2.destroyAllWindows()

@app.route('/')
def index():
    # Only by serving this page first is the client connected to socketio.
    return render_template('indexx.html')

@app.route('/location')
def location():
    return render_template('inddex.html')

@socketio.on('newnname', namespace='/test')
def my_event(msg):
    print(msg['details'])

@socketio.on('connect', namespace='/test')
def test_connect():
    # Need visibility of the global thread object.
    global thread
    print('Client connected')
    if not thread.is_alive():              # isAlive() is removed in Python 3.9+
        print("Starting Thread")
        thread = socketio.start_background_task(main)

@socketio.on('disconnect', namespace='/test')
def test_disconnect():
    print('Client disconnected')

if __name__ == '__main__':
    socketio.run(app, port=5000, debug=True)
{% block page_content %}
<!-- Flask/Jinja2 template for the dynamic tracking page -->
<!DOCTYPE html>
<html>
<head>
    <script src="//code.jquery.com/jquery-3.3.1.min.js"></script>
    <script type="text/javascript"
            src="//cdnjs.cloudflare.com/ajax/libs/socket.io/1.3.6/socket.io.min.js"></script>
    <script src="static/js/application2.js"></script>
    <script src="/static/js/moment.min.js"></script>
    <link rel="stylesheet"
          href="//maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css">
</head>
<body>
    <div class="container">
        <div class="jumbotron">
            <h1>CRIMINAL TRACKER WEBSITE</h1>
        </div>
        <h3>Criminals Tracked:</h3>
        <div id="log">
            <!-- entries are appended here by application2.js -->
        </div> <!-- /#log -->
    </div>
</body>
</html>
{% endblock %}
A5: JAVASCRIPT CODE
$(document).ready(function(){
    // Connect to the socket server.
    namespace = '/test'; // change to an empty string to use the global namespace
    var socket = io.connect('http://' + document.domain + ':' + location.port + '/test');
    var details_received = [];
    socket.on('newnname', function(msg){
        console.log("Received " + msg.details);