Artificial Intelligence Based Real-Time Attendance System Using Face Recognition

This document describes a project report on an artificial intelligence based real-time attendance system using face recognition. The system uses a deep learning algorithm and facial recognition technology to take group photos of students in a classroom and identify each student for attendance marking. The project aims to develop an autonomous attendance system with minimal human interaction. It discusses the objectives, methodology, software requirements and platforms used including YOLOv5 neural network, Python programming and OpenCV library for image processing and feature extraction from photos. Experimental results demonstrate the effectiveness of the proposed face recognition based multiple attendance system.


ARTIFICIAL INTELLIGENCE BASED REAL-TIME
ATTENDANCE SYSTEM USING FACE RECOGNITION

A PROJECT REPORT

Submitted by

GURURAJ R(191EC152)
HARIRAM S(191EC156)
GOWTHAM G K(191EC146)

In partial fulfilment for the award of the degree


of
BACHELOR OF ENGINEERING
in
ELECTRONICS AND COMMUNICATION ENGINEERING

BANNARI AMMAN INSTITUTE OF TECHNOLOGY


(An Autonomous Institution Affiliated to Anna University, Chennai)
SATHYAMANGALAM-638401

ANNA UNIVERSITY: CHENNAI 600 025

MARCH 2023
BONAFIDE CERTIFICATE

Certified that this project report “ARTIFICIAL INTELLIGENCE BASED REAL-TIME ATTENDANCE SYSTEM USING FACE RECOGNITION” is the bonafide work of HARIRAM S (191EC156), GOWTHAM G K (191EC146), and GURURAJ R (191EC152), who carried out the project work under my supervision.

SIGNATURE SIGNATURE
Dr C. POONGODI Dr C.RAJU
HEAD OF THE DEPARTMENT, SUPERVISOR,
Professor & Head, Assistant Professor,
Department of ECE, Department of ECE,
Bannari Amman Institute Of Bannari Amman Institute Of
Technology-638401. Technology-638401.

Submitted for project Viva Voce examination held on ____________.

Internal Examiner External Examiner


DECLARATION

We affirm that the project work titled “ARTIFICIAL INTELLIGENCE BASED REAL-TIME ATTENDANCE SYSTEM USING FACE RECOGNITION”, being submitted in partial fulfillment for the award of the degree of Bachelor of Engineering, is the record of original work done by us under the guidance of Dr C. RAJU, Supervisor, Assistant Professor, Department of ECE. It has not formed a part of any other project work(s) submitted for the award of any degree or diploma, either in this or any other University.

HARIRAM S GOWTHAM G K GURURAJ R

(191EC156) (191EC146) (191EC152)

I certify that the declaration made above by the candidates is true.

SIGNATURE
Dr C.RAJU
SUPERVISOR,
Assistant Professor,
Department of ECE,
Bannari Amman Institute Of
Technology-638401.
ACKNOWLEDGEMENT

We would like to express heartfelt thanks to our esteemed Chairman Sri. S. V. Balasubramaniam and the respected Director Dr. M. P. Vijaykumar for providing excellent facilities and support during the course of study in this institute.

We are grateful to Dr. C. Poongodi, Professor and Head of the Department, Electronics and Communication Engineering, for the valuable suggestions to carry out the project work successfully.

We wish to express our sincere thanks to our faculty guide Dr. C. Raju, Assistant Professor, Department of Electronics and Communication Engineering, for his constructive ideas, inspiration, encouragement, excellent guidance and much-needed technical support extended to complete our project work.

We would like to thank our friends, faculty and non-teaching staff who
have directly and indirectly contributed to the success of this project.

GURURAJ R (191EC152)

GOWTHAM G K (191EC146)

HARIRAM S (191EC156)
ABSTRACT

Attendance marking is a widespread practice for keeping track of students' daily presence in academic institutions across all grade levels. In the past, manual methods were used to record attendance. These methods are accurate and leave no opportunity for recording phoney attendance, but they take a lot of time and require effort from many pupils.

Automated solutions based on biometric technologies, such as radio frequency identification, fingerprint, face, and iris scanning, have been created to address the shortcomings of manual methods. Each method has benefits and drawbacks; however, all of these systems share the drawback of requiring human interaction to mark each person's attendance one at a time. In this work, we present a robust and effective attendance marking system that works from a single group photograph, using face detection and recognition algorithms to overcome the shortcomings of existing manual and automated attendance systems.

In this technique, a group photo of every student seated in a classroom is taken using
a high-resolution camera set at a fixed point. Following the extraction of the face
pictures from the group image using an algorithm, a convolutional neural network
trained on a student face database is used for recognition. We tested our system using
several group picture formats and database types. Our experimental findings
demonstrate that, in terms of effectiveness, usability, and implementation, the
suggested framework performs better than other attendance marking systems. The
suggested system is an autonomous attendance system that can be readily integrated
into a smart classroom because it necessitates minimal contact between humans and
machines.

This project describes a multiple-attendance system based on a face recognition algorithm. Using a deep learning algorithm, a reliable and effective facial detection-based attendance system is put into place. Facial recognition technology is commonly used to identify faces.

Keywords: facial recognition, deep learning, autonomous, multiple attendance, radio frequency identification, fingerprint, face, iris scanning.
TABLE OF CONTENTS

CHAPTER NO    TITLE

     ABSTRACT
     LIST OF FIGURES

1    INTRODUCTION
     1.1 ADVANTAGES
     1.2 APPLICATIONS
2    LITERATURE SURVEY
3    OBJECTIVES AND METHODOLOGY
     3.1 OBJECTIVES
     3.2 METHODOLOGY
         3.2.1 Building Dataset
         3.2.2 Training Dataset
         3.2.3 Object Detection Model Testing
4    PLATFORMS
     4.1 SOFTWARE REQUIREMENT
         4.1.1 H/W System Configuration
         4.1.2 S/W System Configuration
     4.2 SOFTWARE ENVIRONMENT
         4.2.1 Python Technology
         4.2.2 Python Programming
5    YOLOV5 AND PYTHON ARCHITECTURE
     5.1 YOLO ALGORITHM
         5.1.1 Advantages
     5.2 YOLOV3 NETWORK
     5.3 YOLOV5 NETWORK
     5.4 PYTHON
         5.4.1 The Python Platform
         5.4.2 What Does Python Technology Do?
         5.4.3 Productivity and Speed
         5.4.4 Python Is Popular for Web Apps
     5.5 OPEN SOURCE AND FRIENDLY COMMUNITY
         5.5.1 Python Is Quick to Learn
         5.5.2 Broad Application
6    EXPERIMENTAL PROCEDURE
     6.1 IMAGE PROCESSOR
     6.2 IMAGE PREPROCESSING
     6.3 IMAGE SEGMENTATION
     6.4 FEATURE EXTRACTION
     6.5 IMAGE ENHANCEMENT
     6.6 CLASSIFICATION
     6.7 API DOCUMENTATION GENERATORS
         6.7.1 Uses
     6.8 PANDAS
         6.8.1 Library Features
     6.9 CSV READER
7    PROCESSOR
     7.1 INTRODUCTION TO PROCESSOR
     7.2 GENERAL PURPOSE PROCESSOR
     7.3 MICROPROCESSOR
         7.3.1 Basic Components of Processor
         7.3.2 Primary CPU Processor Operations
     7.4 TYPES OF PROCESSOR
         7.4.1 Single Core Processor
         7.4.2 Dual Core Processor
         7.4.3 Multi Core Processor
         7.4.4 Quad Core Processor
         7.4.5 Octa Core Processor
     7.5 WEBCAM
         7.5.1 Video Calling and Video Conferencing
         7.5.2 Video Security
         7.5.3 Video Clips and Stills
         7.5.4 Input Control Devices
         7.5.5 Astrophotography
     7.6 OPENCV
     7.7 YOLO
         7.7.1 How YOLO Improves over Previous Object Detection Methods
     7.8 CNN ARCHITECTURE
         7.8.1 Deep Neural Network
8    RESULTS
     8.1 DATASET CREATION
     8.2 DATA COLLECTION
     8.3 FACE EXTRACTION
     8.4 OUTPUT
     8.5 EXPECTED RESULT
9    CONCLUSION
     9.1 CONCLUSION
     9.2 REFERENCE
10   ANNEXURE 1
11   ANNEXURE 2
12   ANNEXURE 3
LIST OF FIGURES

FIGURE NO.   FIGURE NAME

1    Methodology Chart
2    Labelling and Roboflow
3    Flowchart of Dataset to Weight Model
4    Flowchart of Input to Output
5    Overall Flowchart
6    A Repository Architecture of IDE
7    Workflow of a Source Code
8    Block Diagram of the Fundamental Sequence Involved in an Image Processing System
9    Software Libraries
10   Python Panda Features
11   Opening a CSV Reader
12   Webcam
13   CNN Architecture
CHAPTER 1

INTRODUCTION

Marking attendance is a common practice in both workplaces and organizations. In educational institutions, attendance is seen as a crucial aspect for both students and professors, and it takes a lot of work to keep track of student attendance in the classroom. The two main categories of attendance systems are manual and automatic. The roll call method, in which a teacher records attendance by calling out each student's name one at a time, is the most used manual approach. It can take more than 10 minutes per day, is incredibly out-of-date, and offers the highest number of opportunities for marking proxy attendance.

The second way involves marking one's presence on an attendance register or sheet. It is the most time-consuming method and, if unchecked, is easily manipulated and falsified. Therefore, it is crucial to create an automated attendance system that can efficiently record attendance without human involvement. Face recognition is the best option for building attendance systems because it is the least invasive method of identification, can capture images from a distance, is cost-effective, leaves no chance of recording a proxy as present, and is a simple yet dependable process. In this study, we created an automated attendance system that uses video footage taken by a webcam to track students' attendance through facial recognition.

In this study, we propose a face detection and recognition-based attendance system that can record numerous attendances with a single input, increasing the system's efficiency while eliminating the possibility of proxy attendance. The method begins by taking a group photo of the class using a live CCTV feed, after which the faces are recognized. The proposed system recognizes faces from group images using the DCNN algorithm. Face data from each user is initially captured using the OpenCV library.

More than 1000 photos per user have been gathered. The four crucial preprocessing procedures of grayscale conversion, resizing, normalization, and augmentation are applied to the collected data. The processed data was used to create and train the CNN architecture. The automated facial detection block stores and uses the trained model, which helps accurately identify the faces it has been trained on. Attendance is noted when a face is recognized; recognizing a student counts as marking him or her present. To improve system effectiveness, the procedure is performed several times, and the final results are recorded in an Excel file. Due to its background operation and minimal to no involvement from either professors or students, this automatic attendance system helps pupils preserve their valuable study time.
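The four preprocessing steps named above (grayscale conversion, resizing, normalization, and augmentation) can be sketched as follows. This is a minimal illustration using NumPy rather than the project's actual code; the 64x64 target size, the nearest-neighbour resize, and the horizontal-flip augmentation are assumptions made for the example.

```python
import numpy as np

def preprocess_face(image, size=64):
    """Grayscale, resize (nearest-neighbour), and normalize one face image.

    `image` is an H x W x 3 RGB array; `size` is an assumed target side length.
    """
    # 1. Grayscale conversion using the standard luminance weights.
    gray = image[..., 0] * 0.299 + image[..., 1] * 0.587 + image[..., 2] * 0.114
    # 2. Resize by nearest-neighbour index sampling.
    h, w = gray.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = gray[rows][:, cols]
    # 3. Normalize pixel values into [0, 1].
    return resized / 255.0

def augment_flip(image):
    """A simple augmentation: horizontal flip of the face image."""
    return image[:, ::-1]
```

In practice, these steps would typically be performed with OpenCV routines such as `cv2.cvtColor` and `cv2.resize` on the face crops extracted from the group photo.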

1.1 ADVANTAGES

• The main advantage of this system is that attendance is marked on the server, which is highly secure, so no one can mark the attendance of another.
• Time saving.
• Ease in maintaining attendance.
• Reduced paper work.
• Automatically operated and accurate.
• Reliable and user friendly.
1.2 APPLICATIONS

● To verify identities in Government organizations.


● Enterprises.
● Attendance in library.
● To detect fake entries at international borders.
● Industries.
CHAPTER 2

LITERATURE SURVEY

[1] Vyavahare M. D. and Kataria S. S., in "Library Management Using Real-Time Face Recognition System", proposed an automatic system for human face identification in a vast home-made dataset of a person's faces in a real-time backdrop setting. The task is extremely challenging since real-time background removal from an image is still a problem. In addition, there is a significant range in the size, position, and expression of the human face. Most of this variation is collapsed by the suggested approach. Ada-boost with cascade is performed to recognize human faces in real time, and a quick and easy PCA and LDA are used to identify the faces found. The matching face is then used to record attendance in the lab. Real-time attendance is tracked by this library management system, which uses human face recognition and attains a high accuracy rate. There are two databases: one for the library system and the other for students.

[2] K. Susheel Kumar, Shitala Prasad, Vijay Bhaskar Semwal, and R. C. Tripathi, in "Real Time Face Recognition using Ada-boost Improved Fast PCA Algorithm", provide an automated method for human face identification in a big home-made dataset of a person's face in a real-time backdrop setting. The task is extremely challenging since real-time background removal from an image is still a problem. In addition, there is a significant range in the size, position, and expression of the human face. Most of this variation is collapsed by the suggested approach. Ada-boost with cascade is used to detect human faces in real time, and a quick and easy PCA and LDA are used to identify the faces found. The matching face is then utilized to record attendance in the lab. This biometric system is a real-time attendance system.
[3] Chengji Liu, Yufan Tao, Jiawei Liang, Kai Li, and Yihang Chen, in "Object Detection Based on YOLO Network", proposed that by applying degrading techniques such as noise, blurring, rotation, and cropping to the training sets, a generic object identification network can be created. The model's generalization and resilience were improved by using degraded training sets during training. The experiment demonstrated that for a model trained only on standard data, resilience and generalization on degraded pictures are both subpar. The model was then trained on damaged photos, which increased its average precision. It was established that the generic degenerative model outperformed the standard model in terms of average accuracy on deteriorated photos.

[4] Rumin Zhang and Yifeng Yang, in "An Algorithm for Obstacle Detection based on YOLO and Light Field Camera", combine the YOLO object detection algorithm and a light field camera into a proposed obstacle detection system for indoor environments. This algorithm categorizes objects into groups and marks them in the picture. To train YOLO, photographs of typical obstacles were tagged, and unimportant obstructions are eliminated using an object filter. The usefulness of this obstacle identification algorithm is illustrated using several scene types, such as pedestrians, chairs, books, and so on.

[5] S. Aravindh, R. Athira, and M. J. Jeevitha, in "Automated Attendance Management Reporting System using Face Recognition", highlight the development of a system that automatically marks attendance by recognizing students' faces. The procedure of this face recognition system is broken down into many stages, but the crucial ones are face detection and face recognition. First, a picture of each student's face is needed to record their attendance.

[6] Swarnendu Ghosh, Mohammed Shafi KP, Neeraj Mogal, Prabhu Kalyan Nayak, and Biswajeet Champaty, in "Smart Attendance System", suggest that the Android application be restricted to authorized staff in order to track student attendance and communicate information for library records. The gadget is extremely secure since only authorized personnel's fingerprints may be used to activate it.
[7] Vassilios Tsakanikas and Tasos Dagiuklas, in "Video Surveillance Systems: Current Status and Future Trends", attempt to document the current state of video surveillance systems. The fundamental elements of a surveillance system are provided and carefully examined, along with algorithms for object detection, object tracking, object recognition, and object re-identification. The most popular surveillance system modalities are examined, with a focus on video in terms of available resolutions and cutting-edge imaging techniques such as High Dynamic Range video. Together with the most popular methods for improving image and video quality, the most significant features and statistics are offered. The most significant deep learning algorithms and the intelligent analytics they employ are described. Finally, before examining the difficulties and potential future directions of surveillance, augmented reality and the role it can play in a surveillance system are reported.
[8] S. Aravindh, R. Athira, and M. J. Jeevitha delivered the idea of automated attendance in the paper titled "Automated Attendance Management and Reporting System using Face Recognition". Manually managing an attendance system is challenging, so biometrics-based smart and automated attendance systems are frequently used; one of them is face recognition, which largely resolves the issue of proxies and fake attendance. Older facial recognition-based attendance systems had certain drawbacks, such as sun intensity and head posture issues. Thus, a number of methods, including illumination invariance, the Viola-Jones algorithm, and principal component analysis, are utilized to overcome these problems. The two basic processes in this system are face detection and face recognition, after which the discovered faces are compared by cross-referencing with the student face database. This technique makes it easier to keep track of students' attendance and records. Taking attendance manually in a classroom full of many pupils is a laborious and time-consuming operation; as a result, a system that effectively marks pupils' attendance by identifying their faces can be put in place.
[9] Swarnendu Ghosh, Mohammed Shafi KP, Neeraj Mogal, Prabhu Kalyan Nayak, and Biswajeet Champaty proposed the "Automated Attendance System". For the best possible use of teaching and learning time, the study outlines the design and development of a smart attendance system for students in schools or colleges. The suggested gadget is a biometric attendance recorder built with an Arduino UNO and a fingerprint sensor. Through the enrollment procedure, the gadget recorded the fingerprints of all faculty members and students at an institute. During the attendance process, students' fingerprints were compared to the enrolled database. If there was a match, the student's name was stored on the device and wirelessly communicated over the Bluetooth protocol to an Android application created in-lab. Only approved staff members have access to the Android app, which is used to share and track student attendance. The device is very secure since only the authorized persons concerned (faculty) may activate it using their fingerprints. The gadget is affordable, reliable, transportable, and user-friendly, and its portability and affordability give it an advantage over the items already on the market. The technology shortens class periods, increasing instructors' and students' important teaching and learning time and offering them more opportunities to teach and learn, respectively.
CHAPTER 3

OBJECTIVES AND METHODOLOGY

3.1 OBJECTIVES

● To detect multiple faces in real-time scenarios for monitoring attendance in


working place and producing the daily report to an authorized person.
● To increase identity security using a face recognition system.
● To prevent the misuse of the identity of an individual.

3.2 METHODOLOGY

The library attendance system is being built in three phases, each based on one of its three subsystems: the API Service, Face Recognition Using YOLOv5, and the Visitor Identification System. The stages of development are as follows:

1. Creation of a library attendance API service

2. Developing facial recognition software using YOLOv5

3. Creating a system for visitor identification

4. Thorough system testing.


Figure 1 Methodology chart

The general software development life-cycle is used in phases 1 and 3 to construct the API Service and Visitor Identification System. Stage 2, facial recognition development using YOLOv5, calls for more careful attention. YOLO (You Only Look Once) is a very accurate real-time object identification system. For object identification, the YOLO method makes use of a convolutional neural network: the picture is divided into various regions, and the network predicts a bounding box for each region. YOLO excels in both prediction and classification, and it can recognize many objects at the same time. YOLOv5 includes four models for training data: YOLOv5-s, YOLOv5-m, YOLOv5-l, and YOLOv5-x. The network design, number of layers, and number of parameters differ across the four models.
3.2.1 BUILDING A DATASET

The first step is to create a dataset before beginning training.

Figure 2 Labelling and Roboflow

A collection of JPG images is required to create the dataset, and each photo is subsequently tagged or annotated using labeling software. The annotation's output comes in the form of an XML file. For picture pre-processing, the XML files in the dataset are then concatenated.
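Labeling tools commonly emit Pascal VOC-style XML, one file per image. A minimal sketch of reading such a file with the Python standard library is shown below; the tag names (`object`, `name`, `bndbox`, `xmin`, ...) follow the VOC convention and are an assumption about the labeling software's output, not taken from the project itself.

```python
import xml.etree.ElementTree as ET

def parse_annotation(xml_text):
    """Extract (class name, bounding box) pairs from a VOC-style annotation.

    Returns a list of (name, (xmin, ymin, xmax, ymax)) tuples.
    """
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bb = obj.find("bndbox")
        coords = tuple(int(bb.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, coords))
    return boxes
```

Concatenating the datasets then amounts to collecting the parsed boxes of every XML file into one list before conversion to the training format.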

3.2.2 TRAINING DATASET

In the second phase, which employs a custom YOLOv5-s model, the outputs of the training phase, i.e. the YOLOv5 weight model, are used for detection.

Figure 3 Flowchart of Dataset to Weight Model (YOLOv5)

The dataset is read, and a class file is created to serve as the basis for YOLOv5's custom detection model. The dataset is then trained using this file. When the training is finished, pictures and videos may be used to test the trained model.
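Training a custom YOLOv5-s model is driven by a dataset YAML file. The sketch below shows an assumed single-class ("face") layout; the paths are illustrative placeholders, not the project's actual directory structure.

```yaml
# dataset.yaml -- illustrative layout; actual paths depend on the project
train: ../dataset/images/train
val: ../dataset/images/val
nc: 1                  # one class: a student's face
names: ["face"]
```

Training is then launched from the YOLOv5 repository with a command along the lines of `python train.py --img 640 --batch 16 --epochs 100 --data dataset.yaml --weights yolov5s.pt`, where the hyperparameter values are placeholders rather than the settings used in this project.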
3.2.3 OBJECT DETECTION MODEL TESTING

The final procedure is object detection using a trained model.

Figure 4 Flowchart of Input to Output

The testing step, using images and videos, is represented in the picture above. Images or videos can be used as input. The built-in model is loaded to begin the detection phase, which is followed by classification and prediction using bounding boxes and confidence ratings. Prediction boxes, confidence values, and object classes are shown as the results. The fourth stage is thorough system testing: the three sub-systems are integrated, and the combined system is evaluated as a whole.

Figure 5 Overall Flowchart

The input begins with a face that the camera has photographed. The built-in model is then loaded and used by the detection process to perform classification and prediction using bounding boxes and confidence scores. Prediction boxes, confidence values, and object classes are shown as the results. The report database saves the detection results as an object class (cls), and the report page then displays the data from the report database. The report page is designed to show information in the form of a number, user name, NPM, attendance date, and time.
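The report fields listed above (number, user name, NPM, date, and time) can be appended to a log as each face is recognized. The sketch below writes CSV rows for illustration only; the field names and the `append_attendance` helper are assumptions, not the project's actual report schema or database code.

```python
import csv
import io
from datetime import datetime

# Assumed column layout, mirroring the report page described in the text.
REPORT_FIELDS = ["no", "user_name", "npm", "date", "time"]

def append_attendance(stream, row_no, user_name, npm, when=None):
    """Write one recognized-face event as a CSV row on an open text stream."""
    when = when or datetime.now()
    writer = csv.DictWriter(stream, fieldnames=REPORT_FIELDS)
    writer.writerow({
        "no": row_no,
        "user_name": user_name,
        "npm": npm,
        "date": when.strftime("%Y-%m-%d"),
        "time": when.strftime("%H:%M:%S"),
    })
```

In the deployed system, the same record would be inserted into the report database and surfaced on the report page.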
CHAPTER 4
PLATFORMS

4.1 SOFTWARE REQUIREMENT

4.1.1 H/W SYSTEM CONFIGURATION:

• Processor - Intel

• RAM - 4 GB (min)

• Hard Disk - 20 GB

4.1.2 S/W SYSTEM CONFIGURATION:

• Operating System : Windows 7 or 8

• Software : Python IDLE

4.2 SOFTWARE ENVIRONMENT

4.2.1 PYTHON TECHNOLOGY:


Python is a general-purpose, interpreted programming language. Programming
paradigms including procedural, object-oriented, and functional programming are all
supported. Due to its extensive standard library, Python is frequently referred to as a
"batteries included" language.

4.2.2 PYTHON PROGRAMMING LANGUAGE:


Python is a multi-paradigm programming language. Object-oriented programming and structured programming are fully supported, and many of its features support functional programming and aspect-oriented programming, including metaprogramming and metaobjects (magic methods). Extensions are available for many additional paradigms, such as design by contract and logic programming.

Python offers a wide range of features, including:

● Easy to Learn and Use


● Expressive Language
● Interpreted Language
● Cross-platform Language
● Free and Open Source
● Object-Oriented Language
● Extensible
● Large Standard Library
● GUI Programming Support
● Integrated

Python's memory management system combines reference counting with a cycle-detecting garbage collector, alongside dynamic typing. Moreover, it has dynamic name resolution (late binding), which binds variable and method names as the programme runs.
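In CPython, the reference counting just described can be observed with `sys.getrefcount`; the small sketch below shows the count rising by one when a second name is bound to the same object. The exact counts are CPython implementation details, so only the difference is meaningful.

```python
import sys

def refcount_demo():
    """Binding a second name to an object raises its CPython reference count."""
    obj = object()
    before = sys.getrefcount(obj)   # counts obj plus getrefcount's own argument
    alias = obj                     # one more reference to the same object
    assert alias is obj
    after = sys.getrefcount(obj)
    return after - before           # the single extra reference from `alias`
```

When the count drops to zero, the object is reclaimed immediately; the cycle detector handles groups of objects that reference each other.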
Python was made to be very extendable rather than having all of its features
included in its core. It is especially well-liked for adding programmable interfaces to
already-existing applications because of its compact modularity. Van Rossum's
dissatisfaction with ABC, which advocated the opposite strategy, led to his concept
of a tiny core language with a huge standard library and easily expandable interpreter.
CHAPTER 5
YOLOV5 AND PYTHON
ARCHITECTURE

5.1 YOLO ALGORITHM

YOLO is an algorithm that uses neural networks to provide real-time object detection. Object detection in computer vision means identifying different objects in digital images or videos, and YOLO can find and identify such objects in images. The YOLO (You Only Look Once) framework approaches object identification differently from earlier detectors: it predicts the bounding box coordinates and class probabilities for these boxes using the complete picture in a single pass. The main benefit of adopting YOLO is its outstanding speed; it can process 45 frames per second. Moreover, YOLO learns generalizable representations of objects.

YOLO is popular because it achieves great accuracy while running in real time. The method "only looks once" at the picture in the sense that making predictions takes only one forward propagation pass through the neural network. Following non-max suppression (which ensures that the object detection algorithm identifies each object only once), it returns the identified objects together with their bounding boxes. A single CNN predicts multiple bounding boxes and the class probabilities for those boxes. YOLO trains on entire images and directly optimizes detection performance. This approach offers some advantages over conventional object detection methods: YOLO is extremely fast, and because it sees the complete image during training and testing, it implicitly encodes contextual information.
When you feed an image into the YOLO algorithm, it divides the picture into an SxS grid and uses the grid to determine whether a given bounding box contains an object (or portions of it). It then uses this knowledge to determine what class the object belongs to. Before describing how the method works in depth, we must understand how the algorithm creates and specifies each bounding box. The YOLO algorithm predicts an outcome using four components and one extra value:

1. The center of the bounding box (bx, by)

2. Width (bw)

3. Height (bh)

4. The class of the object (c)

The final predicted value is the confidence (pc). It represents the likelihood that an object will be found inside the bounding box. The centre of the bounding box is given by the (bx, by) coordinates. As most bounding boxes will not typically contain an object, the confidence prediction is needed to discard them. Non-max suppression is a technique we may use to get rid of extra boxes that are unlikely to contain objects and of those that share large regions with other boxes.
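Non-max suppression as described above can be sketched in a few lines of plain Python. Boxes are assumed to be in (x1, y1, x2, y2) corner format with one score each, and the 0.5 IoU threshold is an illustrative default rather than the project's setting.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0  # no overlap
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def non_max_suppression(boxes, scores, iou_thresh=0.5):
    """Return indices of the boxes kept after suppressing heavy overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap an already-kept box too much.
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in keep):
            keep.append(i)
    return keep
```

Boxes are visited in decreasing score order, so for each cluster of overlapping detections only the highest-confidence box survives.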

5.1.1 ADVANTAGES

In the past, object identification tasks were completed as a pipeline of multiple steps using techniques like Region-based Convolutional Neural Networks (R-CNN), including Fast R-CNN. R-CNN trains each component individually while concentrating on a particular region of the picture.

This procedure takes a long time since R-CNN must classify about 2000 regions per image (47 seconds per test image). As a result, real-time implementation is not possible. Moreover, R-CNN employs a fixed region-proposal method, meaning no learning takes place at that stage and the network may produce subpar region recommendations.

As a result, object detection networks like R-CNN are slower than YOLO and are more difficult to improve. YOLO employs just one neural network to perform all of the task's components, making it quicker (45 frames per second) and simpler to tune than earlier techniques.

To understand what YOLO is, we must first investigate its design and algorithm.

5.2 YOLOV3 NETWORK

YOLOv3 is an incremental improvement over YOLOv2 and employs a variant of the Darknet backbone. The YOLOv3 architecture has 106 layers: 53 trained on ImageNet and another 53 tasked with object detection. While this significantly increased network accuracy, it also lowered network speed from 45 to 30 frames per second.

5.3 YOLOV5 NETWORK

Similar to a standard CNN, a YOLO network has convolution and max-pooling layers, followed by two fully connected layers. Loss function: as the YOLO method predicts numerous bounding boxes for each grid cell, we only want one of the predicted bounding boxes to be responsible for the object within the image. To do this, we compute the loss for each true positive, and the bounding box with the highest Intersection over Union (IoU) with the ground truth is chosen in order to increase the efficiency of the loss function. By making each predictor specialize in certain bounding boxes, this technique enhances predictions for particular aspect ratios and sizes.
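Selecting the "responsible" predictor by highest IoU with the ground truth, as described above, can be sketched as follows. The center-format (bx, by, bw, bh) boxes follow the notation of Section 5.1; the helper names are illustrative, not YOLOv5's actual code.

```python
def center_to_corners(box):
    """Convert YOLO's (bx, by, bw, bh) center format to (x1, y1, x2, y2)."""
    bx, by, bw, bh = box
    return (bx - bw / 2, by - bh / 2, bx + bw / 2, by + bh / 2)

def box_iou(a, b):
    """Intersection over Union of two corner-format boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def responsible_box(truth, preds):
    """Index of the predicted box with the highest IoU against the ground truth."""
    t = center_to_corners(truth)
    ious = [box_iou(t, center_to_corners(p)) for p in preds]
    return max(range(len(preds)), key=ious.__getitem__)
```

During training, only the box returned by this selection contributes the localization terms of the loss for its grid cell.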

5.4 PYTHON
Python is designed to be a language that is simple to read. Its formatting is visually clean, and it frequently uses English keywords where other languages use punctuation. It differs from many other languages in that blocks are not delimited by curly brackets, and the use of semicolons to end statements is optional. Compared to C or Pascal, it features fewer syntactic exceptions and special cases.
Figure 6: A repository architecture for an IDE

Python gives developers a choice in their development style while aiming for a
simpler, less cluttered syntax and grammar. Python adheres to a "there should be one and preferably only one obvious way to do it" design philosophy, as opposed to Perl's "there is more than one way to do it" maxim. "To describe something as 'clever' is not considered a compliment in the Python culture," says Alex Martelli, a Fellow of the Python Software Foundation and author of several Python books.
The Python developers try to avoid premature optimization, and they reject changes to non-critical parts of the reference implementation that would offer marginal speed increases at the cost of readability. When speed is important, a Python programmer can use PyPy, a just-in-time compiler, or move time-critical functions to extension modules written in languages such as C. There is also Cython, which translates a Python script into C and makes direct C-level API calls into the Python interpreter.
The developers of Python prioritise keeping the language enjoyable to use. This
is reflected in the name of the language, which pays homage to the British comedy
group Monty Python, as well as in the language's occasionally lighthearted approach
to tutorials and reference materials, as in the use of examples like spam and eggs
(from a well-known Monty Python sketch) rather than the more traditional foo and
bar.
Python uses duck typing: it has typed objects but untyped variable names. Type constraints are not checked at compile time; instead, operations on an object may fail at run time because the object's type is unsuitable. Despite being dynamically typed, Python is strongly typed: it forbids operations that are not well defined rather than silently attempting to make sense of them.
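A minimal sketch of both behaviours described above: duck typing (any object with the right method works, regardless of its declared type) and strong typing (undefined cross-type operations raise an error rather than being coerced). The class and function names here are illustrative only.

```python
class Duck:
    def speak(self):
        return "quack"

class Robot:
    def speak(self):
        return "beep"

def greet(thing):
    # No type declaration: any object with a speak() method works (duck typing).
    return thing.speak()

# Objects are typed even though variable names are not ...
assert greet(Duck()) == "quack"
assert greet(Robot()) == "beep"

# ... and Python is strongly typed: an undefined cross-type operation fails
# with a TypeError instead of being silently coerced.
try:
    result = "3" + 4          # str + int is not defined
except TypeError:
    result = "TypeError"
```

The `"3" + 4` line is exactly the kind of operation the text refers to: Python refuses to guess whether a string or a number was intended.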
Figure 7 Work flow of a source code

5.4.1 The Python Platform:


Python's platform module can be used to get data on the underlying platform,
including details on the hardware, operating system, and interpreter version. Tools to
view information about the hardware, operating system, and interpreter version of the
platform where the programme is running are included in the platform module.
Four functions report details of the current Python interpreter. python_version() and python_version_tuple() return Python's major, minor, and patch-level components in string and tuple form respectively. python_compiler() reports the compiler used to build the interpreter, and python_build() returns a version string for the interpreter's build.
platform() returns a string containing a general-purpose platform identifier. The function accepts two optional Boolean arguments: if aliased is true, the names in the return value are converted from their formal forms to their more informal ones, and if terse is true, it returns an abbreviated value with some parts omitted.
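These are standard-library functions from the platform module; a short demonstration (the printed values will of course differ between machines):

```python
import platform

# Interpreter version in string and tuple form.
print("Version:      ", platform.python_version())        # e.g. '3.11.4'
print("Version tuple:", platform.python_version_tuple())  # e.g. ('3', '11', '4')

# Compiler used to build the interpreter, and the build identifier.
print("Compiler:     ", platform.python_compiler())
print("Build:        ", platform.python_build())

# General-purpose platform identifier; terse=True returns an abbreviated form.
print("Platform:      ", platform.platform())
print("Terse platform:", platform.platform(terse=True))
```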
5.4.2 WHAT DOES PYTHON TECHNOLOGY DO?
Python is very well-liked by programmers, but actual usage demonstrates that
business owners also believe in Python development, and for good reason. Its
reputation as one of the simplest programming languages to learn and its simple
syntax make it a favourite among software engineers. The fact that there is a
framework for almost everything, from web apps to machine learning, is appreciated
by business owners or CTOs.
However, it is more of a technology platform that was created by a massive
collaboration amongst hundreds of independent professional developers who came
together to establish a sizable and unusual community of enthusiasts.
What specific advantages does the language offer those who adopt it as their primary technology? Here are a few of the justifications.

5.4.3 PRODUCTIVITY AND SPEED


There is a well-known claim in the programming community that writing Python code can be up to 10 times faster than writing the same application in Java or C/C++. The clear object-oriented design, improved process-control capabilities, excellent integration, and text-processing capabilities all contribute to this outstanding advantage in development time. Moreover, Python's own unit-testing framework makes a significant contribution to productivity.

5.4.4 PYTHON IS POPULAR FOR WEB APPS

The market still favors technology for quick and efficient web development because
web development isn't showing any signs of slowing down. Together with JavaScript
and Ruby, Python also offers excellent support for creating web apps and is very well-
liked in the web development world thanks to its most well-known web framework,
Django.
5.5 OPEN-SOURCE AND FRIENDLY COMMUNITY
As mentioned on the official website, Python is developed under an OSI-approved open-source licence, making it freely usable and distributable. Furthermore, the community actively organizes conferences, meet-ups, hackathons, and similar events to promote camaraderie and knowledge-sharing.

5.5.1 PYTHON IS QUICK TO LEARN


The language is relatively simple, so you can get results quickly without spending too much time digging into the complex engineering details of the technology. Even though Python programmers are in high demand these days, the language's friendliness and attractiveness only increase the number of people eager to master it.

5.5.2 BROAD APPLICATION

It is used for the broadest spectrum of activities and applications in nearly every industry, from simple automation tasks to gaming, web development, and even complex enterprise systems. These are the areas where the technology is still king, with little or no competition:
● Machine learning as it has a plethora of libraries implementing machine learning
algorithms.
● Web development as it provides back end for a website or an app.
● Cloud computing as Python is also known to be among one of the most popular cloud-
enabled languages even used by Google in numerous enterprise-level software apps.
● Scripting.
● Desktop GUI applications.
CHAPTER 6
EXPERIMENTAL PROCEDURE

6.1 IMAGE PROCESSOR


An image processor carries out the tasks of image acquisition, storage, preprocessing, segmentation, representation, recognition, and interpretation, and finally displays or records the resulting image. The basic steps of an image processing system are shown in the block diagram below.

Fig 8 Block diagram of the fundamental sequence involved in an image processing system (problem domain → image acquisition → preprocessing → segmentation → representation & description → recognition & interpretation → result, with a knowledge base supporting every stage)

The process begins with picture acquisition, which is done using an imaging
sensor and a digitizer to digitize the image, as shown in the diagram. The following
phase is preprocessing, when the image is enhanced and fed into the other processes
as an input. Preprocessing frequently involves improving, eliminating noise, isolating
regions, etc. Segmentation divides an image into its individual objects or components.
The result of segmentation is often raw pixel data, which either includes the region's
perimeter or the region's individual pixels. The process of representation involves
converting the raw pixel data into a format that can be used by the computer for
further processing. The task of description is to identify key characteristics that
distinguish one class of items from another. Based on the details provided by an
object's descriptors, recognition assigns it a label. Interpretation then assigns meaning to the ensemble of recognised objects. The knowledge base holds the information about a particular problem domain; it guides the operation of each processing module and regulates how the modules communicate with one another. Not all modules are required for a given task; the application determines how the image processing system is built. The image processor typically operates at a frame rate of 25 frames per second or fewer.

6.2 IMAGE PREPROCESSING:

The input image may vary in size, contain noise, and use a different colour scheme; preprocessing adjusts these properties to the requirements of the process. Image noise is most noticeable in regions with low signal levels, such as shadow areas or underexposed photographs. There are several kinds of noise, including film grain and salt-and-pepper noise, and filtering algorithms are used to remove it; the Wiener filter is one of the many filters employed. In the preprocessing module, the acquired image is processed to obtain accurate output, and preprocessing must be applied to all images to improve the final result.
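As an illustration of noise removal, here is a minimal pure-Python median filter, a standard choice for salt-and-pepper noise (shown instead of the Wiener filter mentioned above, whose implementation is more involved; in practice OpenCV's cv2.medianBlur would normally be used):

```python
from statistics import median

def median_filter(image, size=3):
    """Median filter on a 2D grayscale image given as a list of lists.

    Salt-and-pepper noise appears as isolated extreme pixels, so replacing
    each pixel with the median of its neighbourhood removes the outliers
    while preserving edges better than simple averaging.
    """
    h, w = len(image), len(image[0])
    r = size // 2
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            # Neighbourhood window, clipped at the image borders.
            window = [image[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = median(window)
    return out

# A flat grey patch with one bright 'salt' pixel in the middle.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
clean = median_filter(noisy)   # the outlier 255 is replaced by 10
```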

6.3 IMAGE SEGMENTATION:

The technology and process of segmenting the facial image into various
distinct and specific regions and locating objects of interest within those regions is
known as facial image segmentation. Technology for face image segmentation has
become widely employed in processing facial data in recent years.

6.4 FEATURE EXTRACTION:

Statistics is the study of data collection, organization, analysis, and interpretation. It covers every facet of this, including the design of surveys and experiments and the planning of data collection. The following statistical features are extracted from the image: mean, variance, skewness, and standard deviation.
Texture is analysed using the gray-level co-occurrence matrix (GLCM), also known as the gray-level spatial-dependence matrix, a statistical technique that takes the spatial relationship of pixels into account.
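The four first-order statistical features listed above can be computed directly from the pixel intensities. A minimal sketch in pure Python over a flattened pixel list; the population variance and the third standardised moment are assumed as the definitions:

```python
from math import sqrt

def image_statistics(pixels):
    """First-order statistical features of an image's pixel intensities."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = sqrt(variance)
    # Skewness: third standardised moment; 0 for a symmetric distribution,
    # positive when the histogram has a long bright tail.
    skewness = (sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
                if std else 0.0)
    return {"mean": mean, "variance": variance,
            "std": std, "skewness": skewness}

# A tiny, symmetric intensity sample: mean 30, variance 200, skewness 0.
features = image_statistics([10, 20, 30, 40, 50])
```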

6.5 IMAGE ENHANCEMENT:

Image enhancement methods can be grouped into spatial-domain and frequency-domain techniques, and several such techniques are applied to different images. This includes improving the smoothness of the image and removing noise and blur. Gabor filters have been shown to be effective at removing noise and blur, and the filtered image benefits the phase that follows.

6.6 CLASSIFICATION:

To classify a piece of data into one of several classes or categories, the relationship between the data and the classes must be clearly understood. For a computer to accomplish this, it must be trained, and training is essential to classification success. Features are characteristics of the data items that serve as the basis for assigning them to different groups.
A classifier acts as a discriminant, favouring some classes over others:
1) in the multiclass case, the discriminant value is highest for one class and lower for the others;
2) in the two-class case, the discriminant value is positive for one class and negative for the other.
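The two discriminant rules above can be sketched as follows; the class names and score values are purely hypothetical:

```python
def classify(discriminants):
    """Multiclass rule: pick the class whose discriminant value is highest."""
    return max(discriminants, key=discriminants.get)

def classify_two_class(discriminant_value):
    """Two-class rule: positive discriminant -> class A, negative -> class B."""
    return "A" if discriminant_value > 0 else "B"

# Discriminant values produced for one feature vector (hypothetical numbers).
scores = {"cat": 0.2, "dog": 1.7, "bird": -0.4}
assert classify(scores) == "dog"            # highest value wins
assert classify_two_class(0.9) == "A"       # positive side of the boundary
assert classify_two_class(-1.3) == "B"      # negative side of the boundary
```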
6.7 API DOCUMENTATION GENERATORS

Generators of Python API documentation include:


● Sphinx
● Epydoc
● HeaderDoc
● Pydoc

6.7.1 USES

Python has been successfully embedded in many software products as a scripting


language, including in finite element method software such as Abaqus, 3D parametric
modeller like FreeCAD, 3D animation packages such as 3ds Max, Blender, Cinema
4D, Lightwave, Houdini, Maya, modo, MotionBuilder, Softimage, the visual effects
compositor Nuke, 2D imaging programmes like GIMP, Inkscape, Scribus and Paint
Shop Pro, and musical notation programmes such as scorewriters and capella. GNU Debugger uses Python as a pretty printer to display complex structures such as C++ containers. Esri recommends Python as the best language
for creating scripts for ArcGIS. It has also been incorporated into a number of video
games, and Google App Engine has chosen it as the first of the three programming
languages it offers, the other two being Java and Go.

With the aid of packages like TensorFlow, Keras, and Scikit-learn, Python is
frequently used in artificial intelligence projects. Python is frequently used for natural
language processing because it is a scripting language with a modular architecture,
easy syntax, and rich text processing facilities.

Python comes preinstalled as a standard component of many operating systems. It can be used from the command line and ships with most Linux distributions, AmigaOS 4, FreeBSD (as a package), NetBSD, OpenBSD (as a package), and macOS (in the terminal). Several Linux distributions use Python-based installers: Red Hat Linux and Fedora use the Anaconda installer, while Ubuntu uses the Ubiquity installer. Gentoo Linux's Portage package manager is written in Python.

Python is widely used in the information security sector, notably for the creation of
exploits.

Python is the primary programming language used in Sugar Labs' development of the
One Laptop per Child XO software. Python has been chosen as the primary user-
programming language for the Raspberry Pi single-board computer project.

Python is a component of LibreOffice, which intends to replace Java with it. Its Python Scripting Provider became a core component with Version 4.0, released on 7 February 2013.
6.8 PANDAS
Pandas is a software library for the Python programming language designed for data manipulation and analysis. It includes dedicated data structures and operations for working with time series and numerical tables. It is free software distributed under the three-clause BSD license. The name is derived from "panel data," an econometrics term for data sets that contain observations over multiple time periods for the same individuals.

Fig 9 Software libraries

6.8.1 LIBRARY FEATURES

● A DataFrame object for data manipulation with integrated indexing.
● Tools for reading and writing data between in-memory data structures and several file formats.
● Data alignment and integrated handling of missing data.
● Reshaping and pivoting of data sets.
● Label-based slicing, fancy indexing, and subsetting of large data sets.
● Inserting and deleting columns in data structures.
● A group-by engine allowing split-apply-combine operations on data sets.
● Merging and joining of data sets.
● Hierarchical axis indexing to work with high-dimensional data in a lower-dimensional data structure.
● Time-series functionality: date shifting, frequency conversion, lagging, moving-window statistics, and moving-window linear regressions.
● Data filtering.

Fig 10 Python panda features
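A few of the listed features (integrated indexing, missing-data handling, the group-by engine, and label-based slicing) in one brief sketch; the attendance-style column names and values are hypothetical, not taken from the project code:

```python
import pandas as pd

# A small DataFrame with a date-label index.
df = pd.DataFrame(
    {"student": ["Asha", "Ravi", "Mina", "Ravi"],
     "present": [1, 0, 1, 1]},
    index=["2023-01-09", "2023-01-09", "2023-01-10", "2023-01-10"])

# Missing-data handling: a column added with a gap, then filled.
df["late_minutes"] = [5, None, 0, 2]
df["late_minutes"] = df["late_minutes"].fillna(0)

# Group-by engine: split-apply-combine to total attendance per student.
totals = df.groupby("student")["present"].sum()

# Label-based slicing with .loc, and boolean filtering.
day_one = df.loc["2023-01-09"]          # both rows for that date
present_only = df[df["present"] == 1]   # rows where present == 1
```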

6.9 CSV READER

The CSV (Comma Separated Values) file format is straightforward and used to store
tabular data in spreadsheets and databases. Tabular data (numbers and text) is stored
as plain text in a CSV file. The file's lines each contain a data record. One or more
fields, separated by commas, make up each record. The name of this file format is
derived from the fact that fields are separated by commas.

Python includes a module called csv that may be used to open and read CSV files.
Fig 11 Opening a CSV reader
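A minimal sketch of reading CSV data with the standard library's csv module; the attendance-style fields are hypothetical. An in-memory string stands in for a file on disk, which would normally be opened with open(..., newline=""):

```python
import csv
import io

# A CSV attendance sheet as it might appear on disk (hypothetical columns).
raw = "name,date,status\nAsha,2023-01-09,present\nRavi,2023-01-09,absent\n"

# csv.reader yields each line as a list of fields split on commas.
rows = list(csv.reader(io.StringIO(raw)))

# csv.DictReader instead uses the header line as field names.
records = list(csv.DictReader(io.StringIO(raw)))
```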
CHAPTER-7
PROCESSOR
7.1 INTRODUCTION TO PROCESSOR

The basic instructions needed to operate a specific computer are responded to and
processed by the processor, which is a chip or logical circuit. The fetching, decoding,
execution, and write-back of an instruction are the processor's primary tasks. Any
system that includes computers, laptops, smartphones, embedded systems, etc. has a
processor, which is also referred to as the system's brain. The two components of the
processors are the CU (Control Unit) and ALU (Arithmetic Logic Unit). The control
unit functions like a traffic cop, managing the command or the operation of the
instructions, while the arithmetic logic unit conducts all mathematical operations such
as addition, multiplication, subtraction, and division. The processor also communicates with the other components: input/output devices, memory, and storage devices.

7.2 GENERAL PURPOSE PROCESSOR

There are five different types of general-purpose processors: media processor, embedded processor, DSP, microcontroller, and microprocessor.

7.3 MICROPROCESSOR

In embedded systems, the microprocessor is representative of general-purpose processors, and numerous types from various manufacturers are on the market. A microprocessor includes a control unit, an ALU, and a number of registers, including control registers, status registers, and scratchpad registers. There may also be on-chip memory, along with ports, interrupt lines, and other lines for memory and for interfacing with the outside world. Ports are frequently referred to as programmable ports because we can program them to operate as either inputs or outputs. General-purpose processors are listed in the table below.
7.3.1 BASIC COMPONENTS OF PROCESSOR

⮚ The ALU (arithmetic logic unit) executes all arithmetic and logic operations.
⮚ The FPU (floating point unit), also called the "math coprocessor", performs floating-point mathematical calculations.
⮚ Registers store instructions and data; they feed operands to the ALU and save the output of operations.
⮚ Cache memory reduces the time spent fetching data from main memory.

7.3.2 PRIMARY CPU PROCESSOR OPERATIONS

⮚ Fetch – obtains the instructions from the main memory unit (RAM).
⮚ Decode – converts the instructions into a form the other CPU components can act on; this operation is performed by the decoder.
⮚ Execute – carries out the operations, activating every CPU component needed to execute the instructions.
⮚ Write-back – after execution, the result is written back.

7.4 TYPES OF PROCESSOR

Here we discuss the different types of CPUs (processors) used in computers. In short, there are five types of processor.

7.4.1 SINGLE CORE PROCESSOR


Single-core CPUs were used in traditional computers. They could perform only one operation at a time, so they were poorly suited to multitasking; running multiple programs at the same time degraded overall system performance.

A single-core CPU uses a FIFO (first in, first out) model: operations go to the CPU for processing in priority order, and the remaining operations wait until the first is completed.

7.4.2 DUAL CORE PROCESSOR

A dual-core processor combines two processors into a single integrated circuit. Each core has its own local cache and controller, enabling it to complete demanding tasks faster than a single-core CPU.

Intel Core Duo, AMD X2, and the dual-core PowerPC G5 are a few examples of dual
core CPUs in use.

7.4.3 MULTI CORE PROCESSOR

A multi-core processor is designed with several processing units, or "cores", on one chip, and each core can perform its tasks independently. For example, if you are doing multiple activities at the same time, such as messaging on WhatsApp and playing a game, one core handles the WhatsApp activity while another core manages the game.

7.4.4 QUAD CORE PROCESSOR


A quad-core processor is a high-powered CPU in which four processor cores are combined into one processor. Each core can execute and process instructions on its own without support from the remaining cores. Quad-core processors can execute a large number of instructions at a time without waiting queues. A quad-core CPU enhances the processing power of a computer system, but its performance depends on the other computing components in use.

7.4.5 OCTA CORE PROCESSOR

An octa-core processor uses a multiprocessor architecture, and its design yields higher processing speed. An octa-core processor is well suited to multitasking and boosts CPU efficiency; processors of this type are mostly used in smartphones.

7.5 WEB CAM

A webcam is a video camera that streams or feeds its image in real time to or through a computer network. The computer can "capture" the video stream, which can then be saved, viewed, or sent on to other networks via the Internet, for example as an email attachment.
A webcam is typically connected through a USB or similar cable, or integrated into computer hardware such as a laptop, unlike an IP camera, which connects using Ethernet or Wi-Fi.
Fig 12 Web cam

7.5.1 VIDEO CALLING AND VIDEOCONFERENCING

One-to-one live video communication over the Internet (videotelephony and videoconferencing) has now reached millions of mainstream PC users worldwide. Cameras can be added to instant messaging and text chat services such as AOL Instant Messenger, and to VoIP systems such as Skype. Webcams are beginning to replace traditional video conferencing solutions thanks to improved visual quality.
Webcams are becoming more and more popular thanks to new capabilities like
automatic lighting adjustments, real-time upgrades (retouching, wrinkle smoothing,
and vertical stretch), automatic face tracking, and autofocus. Program, computer
operating system, and CPU capabilities can all affect the webcam's functionality and
performance. Several well-known instant messaging apps now include capabilities
for video calling.

7.5.2 VIDEO SECURITY

Security cameras can be made from webcams. There is software that enables PC-
connected cameras to listen for sound and detect movement, and record both when
they are found. These recordings can then be downloaded from the Internet, saved to
a PC, or sent via email. In one well-known instance, the owner of the computer was
able to provide authorities with a clear photograph of the burglar's face even after the
computer had been stolen because the burglar e-mailed pictures of himself while the
computer was being stolen. Webcam access without authorization might cause
serious privacy problems (see "Privacy" section below).

7.5.3 VIDEO CLIPS AND STILLS

Both static images and video can be captured using webcams. For this, a variety of
widely used software programmes can be used, such as Pic Master (for Windows
operating systems), Photo Booth (Mac), or Cheese (on Unix systems). See Comparison of webcam software for a more comprehensive list.

7.5.4 INPUT CONTROL DEVICES

Specialised software can use the video stream from a camera to facilitate or improve a user's control over programs and games. Video features such as faces, shapes, models, and colours can be detected and tracked to produce a corresponding form of control. For instance, a head-mounted light would enable hands-free computing and significantly increase computer accessibility, and the position of a single light source can be tracked and used to emulate a mouse cursor.
This can be used in games to provide players more control, better involvement, and a
more immersive experience. Free Track is a free webcam motion-tracking
programme for Windows that can track an exclusive head-mounted model in up to
six degrees of freedom and output information to mouse, keyboard, joystick, and Free
Track-compatible games. The webcam's IR filter can be removed so that IR LEDs
can be used instead. As IR LEDs are invisible to the naked eye, the user is not
distracted when using them. A commercial application of this technology is called
Track IR.

7.5.5 ASTRO PHOTOGRAPHY


A select few webcam models with very-low-light capability are frequently used by astronomers and astrophotographers to capture images of the night sky.
These cameras often have manual focus and a somewhat older CCD array rather than
a CMOS array. The cameras' lenses are taken off, and the cameras are then mounted
on telescopes to capture still, moving, or both types of media. In more recent methods,
movies of extremely faint objects are captured for a few seconds, and then all of the
video's frames are "stacked" together to create a still image with decent contrast.

7.6 OPENCV

A group of enthusiastic programmers created the Open Source Computer Vision Library (OpenCV) in 1999 to bring image processing to a wide range of programming languages. It runs on Windows, Linux, Android, and macOS and offers C++, C, and Python interfaces.
The OpenCV project, formally launched in 1999, began as an Intel Research initiative to advance CPU-intensive applications, one of several initiatives that also included real-time ray tracing and 3D display walls.
The project's major contributors included the Intel Performance Library Team and several optimization specialists from Intel Russia. The goals were to create a common infrastructure that developers could build on, making code more readable and transferable, and to advance commercial vision-based applications by making portable, performance-optimized code freely available under a license that does not require the code built on it to be open or free. The first alpha version of OpenCV was released to the public at the IEEE Conference on Computer Vision and Pattern Recognition in 2000, and five beta versions followed between 2001 and 2005. Version 1.0 was released in 2006, and a version 1.1 "pre-release" appeared in October 2008. OpenCV's second major release came in October 2009. OpenCV 2 included significant changes to the C++ interface, aiming at easier, more type-safe patterns, new functions, and better performance for existing ones (especially on multi-core systems). Official releases now occur every six months, and development is backed by private businesses in addition to independent Russian developers. In August 2012, support for OpenCV was taken over by the non-profit foundation OpenCV.org, which maintains a developer and user site. In May 2016, Intel signed an agreement to acquire Itseez, a leading OpenCV developer.

7.7 YOLO

YOLO is a method that provides real-time object detection using neural networks. Object detection in computer vision means identifying different objects in digital images or videos, and YOLO is an algorithm that can locate and identify various objects in an image. The YOLO framework (You Only Look Once) approaches object detection differently: it predicts the bounding-box coordinates and class probabilities for those boxes from the complete image in a single pass. The main benefit of adopting YOLO is its outstanding speed; it can process 45 frames per second.
Moreover, YOLO is aware of generalized object representation. This is one of
the best object detection algorithms and has demonstrated performance that is
comparable to R-CNN algorithms. We shall learn about various methods employed
by the YOLO algorithm in the parts that follow. One of the fundamental issues in
computer vision is object detection, where the goal is to identify what and where,
particularly what things are present in a given image and where they are located
within the image. Object detection is a more challenging task than classification,
which can also identify items but does not tell the viewer where they are in the image.
Moreover, classification fails for photos with several objects. YOLO takes a very
different tack. A smart convolutional neural network (CNN) called YOLO is used to
recognise objects in real time. A single neural network is applied to the entire image
by the algorithm, which then divides it into regions and forecasts bounding boxes and
probabilities for each region. The projected probabilities are used to weight these
bounding boxes.
Because it can run in real-time and attain great accuracy, YOLO is well-liked.
In the sense that it only needs to perform one forward propagation run through the
neural network to produce predictions, the algorithm "only looks once" at the image.
It outputs recognised items along with the bounding boxes after non-max suppression.
A single CNN predicts several bounding boxes and class probabilities for those boxes
simultaneously with YOLO. YOLO moves quite quickly. YOLO implicitly encodes
contextual information about classes in addition to their outward appearance because
it views the full image during training and testing. YOLO develops generalizable
representations of objects, outperforming previous top detection techniques when
trained on natural photos and tested on creative works. A network called You Only
Look Once (YOLO) employs Deep Learning (DL) techniques for object detection.
YOLO accomplishes object detection by categorising certain things in the image and
locating them on it. A YOLO network, for instance, will produce a vector of bounding
boxes for each individual sheep and identify it as such if you input a picture of a herd
of sheep.
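The pipeline described above ends with non-max suppression, which discards duplicate detections of the same object. The greedy formulation below (keep the highest-scoring box, drop overlapping boxes above an IoU threshold, repeat) is a common textbook version shown as an illustration, not YOLO's exact implementation:

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the highest-scoring box in each cluster of overlapping boxes.

    `detections` is a list of (score, box) pairs for a single class.
    """
    remaining = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Discard every remaining box that overlaps the kept one too much.
        remaining = [d for d in remaining
                     if iou(d[1], best[1]) < iou_threshold]
    return kept

# Two near-duplicate detections of one face plus a distinct second face.
dets = [(0.9, (0, 0, 10, 10)), (0.8, (1, 1, 11, 11)), (0.7, (50, 50, 60, 60))]
kept = non_max_suppression(dets)   # the 0.8 duplicate is suppressed
```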

7.7.1 HOW YOLO IMPROVES OVER PREVIOUS OBJECT DETECTION


METHODS-

In the past, object detection was performed as a multi-step pipeline using techniques such as Region-based Convolutional Neural Networks (R-CNN), including Fast R-CNN. R-CNN trains each component separately while concentrating on a particular region of the image.

This procedure is slow because R-CNN must classify about 2,000 regions per image (47 seconds per test image), so real-time operation is not possible. Furthermore, R-CNN uses a fixed region-selection method, meaning no learning takes place at that stage and the network may produce poor region proposals.

As a result, object detection networks like R-CNN are slower than YOLO and more difficult to improve. YOLO is based on an algorithm that uses a single neural network to run all of the task's components, making it faster (45 frames per second) and simpler to optimise than earlier techniques.

To understand what YOLO is, we must first examine its architecture and algorithm.

YOLO architecture: structural design and algorithm operation

A YOLO network has three essential components: first, the algorithm, sometimes referred to as the predictions vector; second, the network itself; and third, the loss function.

7.8 CNN ARCHITECTURE

This work describes image classification using a deep neural network combined with HOG feature extraction and a K-means segmentation algorithm, with final classification through an SVM classifier for greater accuracy. The proposed system has the following advantages:
1) the proposed CNN method reduces the number of preprocessing steps;
2) the extra shape features extracted by the HOG algorithm provide better accuracy;
3) the SVM classifier reduces the complexity of the work and improves the robustness of the system.

7.8.1 DEEP NEURAL NETWORK

A complete 2-D convolutional neural network consists of an image input layer, convolution layers, ReLU layers, max-pooling (MaxPooling2D) layers, a fully connected layer, a softmax layer, and a classification layer. Each layer is described below.
(1) Image input layer: The image input layer receives the input image; the pixel dimensions of the input are defined at this stage.
(2) Convolution layer: The convolution layer extracts features from the output of the image input layer. It consists of one or more kernels (filters) with different weights; depending on the weights of each filter, different features of the image are extracted.
(3) Pooling layer: The pooling layer down-samples the convolved feature maps after the non-linearity, reducing the spatial dimensions of the feature maps while preserving the detected features.
(4) Fully connected layer: The fully connected layer maps the outputs of the preceding blocks to the 26 image classes; the predicted class is determined from the resulting class scores.
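The layer stack above can be sketched in plain NumPy. The 8×8 image, single 3×3 kernel, and random fully connected weights are illustrative placeholders, not the trained network's parameters:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (cross-correlation, as in CNN libraries)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0)

def maxpool2d(x, size=2):
    """Non-overlapping max pooling; trims edges not divisible by `size`."""
    h, w = x.shape
    h, w = h - h % size, w - w % size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
img = rng.random((8, 8))              # image input layer: one 8x8 grayscale image
kernel = rng.normal(size=(3, 3))      # convolution layer: one 3x3 filter
fmap = relu(conv2d(img, kernel))      # ReLU non-linearity
pooled = maxpool2d(fmap)              # 2x2 max pooling halves each dimension
flat = pooled.ravel()                 # flatten for the fully connected layer
W = rng.normal(size=(26, flat.size))  # fully connected: 26 classes, as in the text
scores = softmax(W @ flat)            # softmax layer: class probabilities
print(pooled.shape, scores.shape)     # (3, 3) (26,)
```

The predicted class is then simply `scores.argmax()`, mirroring the classification layer.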

Fig 13 CNN Architecture


CHAPTER 8
RESULTS

8.1 DATASET CREATION


In this module, the dataset for each student is created using OpenCV-Python. One thousand images were collected from every student to build the dataset.
Fig 14 Dataset creation
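One possible on-disk layout for such a dataset can be sketched as follows. The directory and file names are assumptions for illustration; in the running system each saved frame would come from `cv2.VideoCapture` rather than being enumerated in advance:

```python
from pathlib import Path

# Hypothetical naming scheme: dataset/<student_name>/img_0001.jpg ... img_1000.jpg.
# In the real system, each file would be written with cv2.imwrite() from a
# cv2.VideoCapture frame; here we only enumerate the intended paths.

def dataset_layout(root, students, images_per_student=1000):
    paths = []
    for name in students:
        student_dir = Path(root) / name
        for i in range(1, images_per_student + 1):
            paths.append(student_dir / f"img_{i:04d}.jpg")
    return paths

# Small example with 3 images per student instead of 1000.
paths = dataset_layout("dataset", ["student_a", "student_b"], images_per_student=3)
for p in paths:
    print(p)
```

Zero-padded indices (`img_0001.jpg`) keep the files in capture order when listed alphabetically, which simplifies later face-extraction passes.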

8.2 DATA COLLECTION


Fig 15 Dataset collection

8.3 FACE EXTRACTION


Fig 16 Face extraction

8.4 OUTPUT
Fig 17 Result

8.5 EXPECTED RESULT


The intended outcome of the proposed system is to record each student's attendance automatically from a single group video of the class. Attendance is marked based on each student's presence or absence, as determined by the face recognition model, and is finally presented on a website showing the date, time, and name of each attendee in an orderly manner.
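The final marking step can be sketched as below. The CSV layout, column names, and example student names are assumptions standing in for the website's actual backend storage:

```python
import csv
import io
from datetime import datetime

# Sketch of attendance marking: once face recognition decides who is
# present, each name is logged with the date and time. A deployment would
# write to the website's database; here we emit CSV for illustration.

def mark_attendance(present, now=None, out=None):
    now = now or datetime.now()
    out = out or io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["name", "date", "time", "status"])
    for name in present:
        writer.writerow([name,
                         now.strftime("%Y-%m-%d"),
                         now.strftime("%H:%M:%S"),
                         "present"])
    return out.getvalue()

print(mark_attendance(["Student A", "Student B"],
                      now=datetime(2023, 3, 1, 9, 0, 0)))
```

Students whose faces are never matched in the video simply do not appear in the log, which is how absence is recorded.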

CHAPTER 9

9.1 CONCLUSION
The automated attendance system is intended to reduce the shortcomings of the previous manual method. This attendance system demonstrates the application of image processing techniques in the classroom. The proposed automated attendance system using face recognition is an effective way of recording student attendance in a classroom, and it also reduces the likelihood of proxies and fake attendance. Many biometric methods are available today, but facial recognition stands out as a promising solution because of its high accuracy and minimal need for human participation. The system aims to offer a high level of security and, beyond assisting with attendance, can enhance an institution's reputation.

9.2 REFERENCES

[1] P. Cocca, F. Marciano, and M. Alberti, ``Video surveillance systems to enhance occupational safety: A case study,'' Saf. Sci., 2016.
[2] M. L. Garcia, Vulnerability Assessment of Physical Protection Systems. Oxford, U.K.: Heinemann, 2006.
[3] M. P. J. Ashby, ``The value of CCTV surveillance cameras as an investigative tool: An empirical analysis,'' Eur. J. Criminal Policy Res., 2017.
[4] B. C. Welsh, D. P. Farrington, and S. A. Taheri, ``Effectiveness and social costs of public area surveillance for crime prevention,'' 2015.
[5] The Effectiveness of Public Space CCTV: A Review of Recent Published Evidence Regarding the Impact of CCTV on Crime, Police Community Saf. Directorate, Scottish Government, Edinburgh, U.K., 2009.
[6] W. Hu, T. Tan, L. Wang, and S. Maybank, ``A survey on visual surveillance of object motion and behaviors,'' IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., 2004.
[7] P. L. Venetianer and H. Deng, ``Performance evaluation of an intelligent video surveillance system: A case study,'' Comput. Vis. Image Understand., Nov. 2010.
[8] V. Tsakanikas and T. Dagiuklas, ``Video surveillance systems: Current status and future trends,'' Comput. Electr. Eng., Aug. 2018.
[9] L. Patino, T. Nawaz, T. Cane, and J. Ferryman, ``PETS 2017: Dataset and challenge,'' in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Honolulu, HI, USA, Jul. 2017.
[10] G. Awad, A. Butt, J. Fiscus, D. Joy, A. Delgado, M. Michel, A. F. Smeaton, Y. Graham, W. Kraaij, G. Quénot, M. Eskevich, R. Ordelman, G. J. F. Jones, and B. Huet, ``TRECVID 2017: Evaluating ad-hoc and instance video search, events detection, video captioning, and hyperlinking,'' 2018.
ANNEXURE – I
WORK CONTRIBUTION

PROJECT TITLE: ARTIFICIAL INTELLIGENCE BASED REAL-TIME ATTENDANCE SYSTEM USING FACE RECOGNITION

INDIVIDUAL CONTRIBUTION OF STUDENT 1

Student Name: GURURAJ R

Register Number: 191EC152

Role in the project: Designing the UI using AngularJS: creation of the login page, signup page, and all dashboard visualizations.

INDIVIDUAL CONTRIBUTION OF STUDENT 2

Student Name: HARIRAM S

Register Number: 191EC156

Role in the project: Data collection, data organization, feature extraction, and output design execution.

INDIVIDUAL CONTRIBUTION OF STUDENT 3

Student Name: GOWTHAM G K


Register Number: 191EC146

Role in the project: Data visualization in Python at the end of the project.
ANNEXURE 2 :
ANNEXURE 3 :
