Final Year Project Report
BSc DEGREE
IN
Computer Science with Network Communications
PROJECT REPORT
ID Number: K1557901
Date: 23/04/2018
Plagiarism Declaration
Declaration
I have read and understood the University regulations on plagiarism and I
understand the meaning of the word plagiarism. I declare that this report is
entirely my own work. Any other sources are duly acknowledged and
referenced according to the requirements of the School of Computer Science
and Mathematics. All verbatim citations are indicated by double quotation
marks (“…”). Neither in part nor in its entirety have I made use of another
student’s work and pretended that it is my own. I have not asked anybody to
contribute to this project in the form of code, text or drawings. I did not allow
and will not allow anyone to copy my work with the intention of presenting it
as their own work.
Date 23/04/2018
Signature JirrehJam
Acknowledgements:
A big thank you to Dr. Jean-Christophe Nebel for his support and guidance throughout this project.
I also express gratitude to all my friends for their help and support especially those of the Computer
Science Department.
I also thank my daughter, Jirdith Esther Qwasinwi Roberts, for her encouraging words, and my parents,
Rev. Dr. Ilaja and Susan Jam, for their love and prayers.
Most of all, a big thanks to God Almighty for his guidance throughout the course.
Thank you all for the encouragement and support!
Abstract:
This paper will show how we can implement algorithms for face detection and recognition in image
processing to build a system that will detect and recognise frontal faces of students in a classroom. “A
face is the front part of a person’s head from the forehead to the chin, or the corresponding part of an
animal” (Oxford Dictionary). In human interactions, the face is the most important factor as it contains
important information about a person or individual. All humans have the ability to recognise individuals
from their faces. The proposed solution is to develop a working prototype of a system that will facilitate
class control for Kingston University lecturers in a classroom by detecting the frontal faces of students
from a picture taken in a classroom. The second part of the system will also be able to perform facial
recognition against a small database.
In recent years, research has been carried out and face recognition and detection systems have been
developed. Some of these are used on social media platforms, in banking apps and by government bodies
such as the Metropolitan Police, as well as by companies such as Facebook.
Table of Contents
Plagiarism Declaration
ACKNOWLEDGEMENTS
ABSTRACT
CHAPTER ONE
INTRODUCTION
MOTIVATION AND PROBLEM DEFINITION
PROJECT AIMS AND OBJECTIVES
DISCUSSION REGARDING ETHICS
REPORT STRUCTURE
CHAPTER TWO
LITERATURE REVIEW
OVERVIEW OF FACE DETECTION AND RECOGNITION
FACE RECOGNITION
CHAPTER THREE
METHODOLOGY
OTHER METHODOLOGIES FOR PROJECT DEVELOPMENT
SCRUM
Extreme Programming (XP)
REQUIREMENTS AND ANALYSIS
FUNCTIONAL REQUIREMENTS
NON-FUNCTIONAL REQUIREMENTS
MOSCOW ANALYSIS
SWOT ANALYSIS
USE CASE DIAGRAM
THE USER STORY
CONCLUSION
PROJECT PLAN
PROJECT CONTINGENCY
SUMMARY
CHAPTER FOUR
DESIGN
ARCHITECTURE DESIGN
INTERFACE DESIGN
THE ACTIVITY DIAGRAM
CHOICE OF METHODS/ALGORITHMS
CHAPTER FIVE
IMPLEMENTATION
TECHNOLOGY USED
IMPLEMENTATION OF FACE DETECTION
IMPORTANT CODE DETAILS OF FACE DETECTOR
IMPORTANT CODE DETAILS OF FACE RECOGNITION
TRAINING THE DATASET
STRUCTURE OF THE GUI AND IMPLEMENTATION
SUMMARY
CHAPTER SIX
CHAPTER ONE
Introduction:
In face detection and recognition systems, the process starts with detecting and recognising
frontal faces from an input device such as a mobile phone camera. Students tend to engage better during
lectures when there is effective classroom control, so a high level of student engagement is important.
An analogy can be made with pilots, as described by Mundschenk et al. (2011, p.101): "Pilots need to
keep in touch with an air traffic controller, but it would be annoying and unhelpful if they called in
every 5 minutes". In the same way, students need to be continuously engaged during lectures, and one
way of doing this is to recognise and address them by their names; a system like this will therefore
improve classroom control. From my own experience as a teacher, I found that calling a student by his
or her name gave me more control of the classroom and drew the attention of the other students,
prompting them to engage during the lecture.
Face detection and recognition are not new in our society. The capacity of the human mind to
recognise particular individuals is remarkable: it persists in identifying certain individuals even
through the passage of time, despite slight changes in appearance.
Anthony (2014, p.1) reports that the remarkable ability of the human mind to produce near-certain
identification of images and faces has drawn considerable attention from researchers, who invest
time in finding algorithms that replicate effective face recognition in electronic systems for use
by humans.
Wang et al. (2015, p.318) state that "the process of searching a face is called face detection. Face
detection is to search for faces with different expressions, sizes and angles in images in possession of
complicated light and background and feeds back parameters of face".
Face recognition processes images and identifies one or more faces in an image by analysing and
comparing patterns: algorithms extract features and compare them against a database to find a match.
Furthermore, in one of the most recent studies, Nebel (2017, p.1) suggests that DNA techniques could
transform facial recognition technology: video analysis software could be improved thanks to advances
in DNA analysis, with camera-based surveillance software treating a video as a scene that evolves in
the same way a DNA sequence does, in order to detect and recognise human faces.
Motivation and Problem Definition:
This project is being carried out due to the concerns that have been highlighted on the methods which
lectures use to take attendance during lectures. The use of clickers, ID cards swiping and manually
writing down names on a sheet of paper as a method to track student attendants has prompted this
project to be carried out. This is not in any way to criticize the various methods used for student
attendance, but to build a system that will detect the number of faces present in a classroom as well as
recognizing them. Also, a teacher will be able to tell if a student was honest as these methods mentioned
can be used by anyone for attendance records, but with the face detection and recognition system in
place, it will be easy to tell if a student is actually present in the classroom or not.
This system will not only improve classroom control during lectures, it will also possibly detect faces
for student attendance purposes. I will use MATLAB to build and implement this system.
Project Aims and Objectives:
The aim and objectives of this project were agreed after meeting with the client.

Aim: To develop a prototype that will facilitate classroom control and attendance by face detection
and recognition of students' faces in a digital image taken by a mobile phone camera.

Objectives:
- The system should be able to detect students' frontal faces in a classroom within 30% accuracy.
- The system should be able to recognise detected faces against a small database.
CHAPTER TWO
Literature Review:
In this chapter, a brief overview of studies made on face detection and recognition will be introduced
alongside some popular face detection and recognition algorithms. This will give a general idea of the
history of systems and approaches that have been used so far.
The figure below shows a simplified diagram of the framework for face recognition from the study
suggested by Shang-Hung Lin (2000).
Fig 2.2: Face recognition system framework, as suggested by Shang-Hung Lin (2000, p.2).
In the figure above, face detection finds any face in the given image or input video. Face
localization determines where the faces are located in the image or video, using bounding boxes.
Face alignment finds each face and aligns landmarks such as the nose, eyes, chin and mouth for
feature extraction. Feature extraction extracts key features such as the eyes, nose and mouth,
which undergo tracking. Feature matching and classification matches a face against a dataset
trained on a database of about 200 pictures. Face recognition then gives a positive or negative
output for a recognised face, based on the feature matching and classification against a reference
facial image.
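To make this framework concrete, the sketch below strings the stages together in MATLAB (the
language chosen for this project), assuming the Computer Vision Toolbox is available. It is a
minimal illustration, not the final system: recogniseFace is a hypothetical placeholder for the
matching and classification stages, and the file name is illustrative.

    % Minimal sketch of the detection-to-recognition pipeline above.
    I = imread('classroom.jpg');               % input from a mobile phone
    detector = vision.CascadeObjectDetector(); % detection + localization
    bboxes = step(detector, I);                % bounding boxes, [x y w h]
    for k = 1:size(bboxes, 1)
        face = imcrop(I, bboxes(k, :));        % isolate one detected face
        face = imresize(face, [112 92]);       % crude alignment to a fixed size
        % feature extraction, matching and recognition would go here, e.g.:
        % name = recogniseFace(face, trainedDatabase);  % hypothetical helper
    end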
Face detection is the process of locating a face in a digital image by special computer software
built for this purpose. Feraud et al. (2000, p.77) describe face detection thus: "To detect a face
in an image means to find its position in the image plane and its size or scale".
As Figure 2.3 shows, the detection of a face in a digital image is a prerequisite for any further
processing in face recognition or any other face processing software.
In early years, face detection algorithms focused mainly on the frontal part of the human face
(Srinivasan, Golomb and Martinez, 2016, p.4434).
However, in recent years, Cyganek (2013, p.346) suggests that newer algorithms take different
perspectives into consideration for face detection. Researchers have used such systems, but the
biggest challenge has been making a system detect faces irrespective of illumination conditions.
This observation is based on a study by Castrillón et al. (2011, p.483) on the Yale database, which
contains higher-resolution images of 165 frontal faces. Face detection methods are often classified
into different categories. In order to tackle the first major problem of the project (detecting
students' faces), a wide range of techniques has been researched. Many face detection techniques
and methodologies have been proposed by different researchers and are often grouped into major
categories of approach. In this paper, we will look at some reviews and the major classifications
by different groups of researchers, and relate them to the system.
Yang et al. (2002) classify face detection methodologies into four major categories:
knowledge-based, feature-invariant, template-matching and appearance-based approaches.

Knowledge-based methods use human knowledge, hand-coded as rules, to model facial features based
on the nature of the human face, such as two eyes, a mouth and a nose. The rules are easy to apply,
but detection is difficult against varied backgrounds, poses and illumination; the result is low
detection accuracy, with a small computational burden and a short detection time.

To investigate this method, Yang et al. (2002, pp.36-37) created a multi-resolution hierarchy of
images by averaging and subsampling, as in Figure 2.4 below.

Figure 2.4: Taken from Detecting Faces in Images: A Survey. Yang et al. (2002, p.36).

They subdivided these resolution hierarchies into three levels, with level 1 being the lowest
resolution, which only searches for face candidates that are then processed at finer resolutions.
At level 2, the face candidates from level 1 undergo local histogram equalization followed by edge
detection. At level 3, the surviving face candidate regions are checked against a set of rules
corresponding to facial features such as the mouth and eyes. They conducted their experiment on 60
images: their system located faces in 50 of these images, while 28 images gave false alarms, giving
a success rate of 83.33% and a false alarm rate of 46.66%.

Feature-invariant methods use algorithms to look for structural features that hold regardless of
pose, viewpoint or lighting conditions in order to find faces. Template-matching methods use stored
standard facial patterns, correlating an input image with the stored pattern to compute a
detection. Appearance-based methods use a set of training images to learn templates and capture
representative facial appearance.

Furthermore, Yang et al. also carried out experiments on a standard database set, shown in Tables
2.1 and 2.2 (Yang et al., 2002, pp.53-54) below, with the detection rate and false detection rate
results.
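Returning briefly to the multi-resolution hierarchy used in the knowledge-based experiment above:
it is straightforward to reproduce. A minimal MATLAB sketch, assuming the Image Processing Toolbox,
where each level is produced by smoothing and subsampling the previous one (the file name is an
assumption):

    % Three-level resolution hierarchy by averaging and subsampling, in the
    % spirit of Yang et al. (2002). Level 1 is the coarsest and would only
    % propose face candidates for the finer levels to examine.
    I = im2double(rgb2gray(imread('group.jpg')));   % assumed input image
    level3 = I;                            % finest: rule-based feature checks
    level2 = impyramid(level3, 'reduce');  % mid: histogram equalization, edges
    level1 = impyramid(level2, 'reduce');  % coarsest: face-candidate search
    figure;
    subplot(1,3,1); imshow(level1); title('Level 1');
    subplot(1,3,2); imshow(level2); title('Level 2');
    subplot(1,3,3); imshow(level3); title('Level 3');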
Table 2.1: Standard database test sets for face detection. Yang et al. (2002, p.53).
Table 2.2: Results of the two image test sets experimented on. Yang et al. (2002, p.54).
As Table 2.2 summarizes, the experimental results cover different training sets with different
tuning parameters, which have a direct impact on training performance. For example, dimensionality
reduction is carried out to improve computational efficiency and detection efficacy, with image
patterns projected into a lower-dimensional space to form a discriminant function for
classification. The training and execution times and the number of scanning windows in these
experiments also influenced the performance. Hjelmås and Low (2001) classify face detection
methodologies into two major categories, the first being image-based approaches, which are further
sub-categorized into linear subspace methods, neural networks and statistical approaches.
Image-based approaches: most of the recent feature-based attempts in the same study by Hjelmås and
Low (2001, p.252) have improved the ability to cope with variations, but are still limited to
heads, shoulders and parts of frontal faces. There is therefore a need for techniques that cope in
hostile scenarios, such as detecting multiple faces in a cluttered scene, e.g. against a
clutter-intensive background. Furthermore, this method ignores basic knowledge of the face in
general and instead learns face patterns from a given set of images; this is usually known as the
training stage of the detection method.

From this training stage, the system may be able to detect similar face patterns in an input image.
A decision on the existence of a face is then made by comparing the distance between the pattern
from the input image and the training images, using a 2D intensity array extracted from the input
image. Most image-based approaches use window-scanning techniques, in which an algorithm searches
for possible face locations at all scales (a sketch of this scanning loop is given after the
discussion below).

In further research on window-scanning methods, Ryu et al. (2006) experimented with the
scanning-window techniques discussed by Hjelmås and Low (2001, p.252) in their own system. They
based their system on a combination of classifiers, for a more reliable result than a single
classifier, and designed multiple face classifiers that take different representations of face
patterns. They used three classifiers: a gradient feature classifier, which contains the integral
information of the pixel distribution and returns a degree of invariance among facial features; a
texture feature classifier, which extracts texture features by correlation (using the joint
probability of occurrence of specified pixels), variance (measuring the amount of local variation
in an image) and entropy (measuring image disorder); and a pixel intensity feature classifier,
which extracts pixel intensity features of the eye, nose and mouth regions to determine the face
pattern. They further used a coarse-to-fine classification approach for computational efficiency.
Based on 1056 images obtained from the AT&T, BioID, Stirling and Yale datasets, they achieved the
results presented in Tables 2.4 and 2.5 (Ryu et al., 2006, p.489). The first face classification of
their experiment, with respect to shifts in both the x and y directions, achieved a detection rate
of 80% when images were shifted within 10 pixels in the x direction and 4 pixels in the y
direction. The second and third classifications showed detection rates of over 80% when images were
shifted by 2 pixels in the x and y directions respectively.
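To make the window-scanning idea concrete, here is a minimal MATLAB sketch of the exhaustive scan
described above. classifyWindow is a hypothetical stand-in for any of the face/non-face classifiers
discussed (gradient, texture or pixel-intensity based); it is not taken from Ryu et al.

    % Exhaustive window scanning over positions and scales: each window is
    % cut out and scored by a face/non-face classifier.
    % classifyWindow is a hypothetical placeholder returning a score in [0,1].
    I = im2double(rgb2gray(imread('scene.jpg')));
    winSize = 24; stride = 4; detections = [];
    for scale = [1.0 0.75 0.5]
        S = imresize(I, scale);
        for r = 1:stride:size(S,1)-winSize+1
            for c = 1:stride:size(S,2)-winSize+1
                w = S(r:r+winSize-1, c:c+winSize-1);
                if classifyWindow(w) > 0.5
                    % record [x y w h] mapped back to original coordinates
                    detections(end+1,:) = [c r winSize winSize] / scale; %#ok<AGROW>
                end
            end
        end
    end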
Table 2.4: Results showing Exhaustive full scanning method and Proposed scanning method. Ryu et
al (2006, p.489)
Table 2.5: Performance Comparison by different Researchers and Proposed System by Ryu et al
(2006, p.490)
As seen in Table 2.5, their system achieved a detection rate between 93.0% and 95.7%. Rowley et al.
(1998), in their study of neural network-based face detection, experimented with a system that
applies a set of neural network-based filters to an image and then uses an arbitrator to combine
the outputs. They tested their system against two databases of images, the CMU database (made up of
130 images) and the FERET database, and achieved a detection rate of 86.2% with 23 false
detections. Feraud et al. (2001) also experimented with a neural network-based face detection
technique. They used a combination of components in their system: a motion filter, a colour filter,
a prenetwork filter and a large neural network. The prenetwork filter is a single multilayer
perceptron with 300 inputs corresponding to the extracted sizes of the subwindows, a hidden layer
of 20 neurons, and a face/non-face output, for a total number of weights [reference]. These
components, combined with the neural network, achieved an 86.0% detection rate with 8 false
detections, based on a face database of 8000 images from the Sussex Face Database and the CMU
database, further subdivided into subsets of equal sizes corresponding to different views (p.48).
Tables 2.6 and 2.7 (Feraud et al., 2001, p.49) below show the experimental results obtained by
these researchers.
Table 2.6: Showing Results of Sussex Face Database. Feraud et al. (2001, p.49)
Table 2.7: Showing Results of CMU Test Set A. Feraud et al. (2001, p.49)
Wang et al. (2016), in their study supporting neural network face detectors, used a multi-task
convolutional neural network-based face detector, which relies on learning features directly from
images instead of hand-crafted features; hence its ability to differentiate faces from uncontrolled
backgrounds or environments. The system they experimented on used a Region Proposal Network, which
generates the candidate proposals, and a CNN-based detector for the final detection output. They
experimented on 183,200 images from their database and used the AFLW dataset for validation. Their
face detector was evaluated on the AFW, FDDB and Pascal Faces datasets and achieved a 98.1% face
detection rate. The authors did not reveal all the details behind the development of the system,
and I have limited time to implement it in OpenCV. Table 2.8 (Wang et al., 2016, p.479) shows
comparisons of their system against other state-of-the-art detectors. Wang et al. (2016, p.480)
discuss how their system (FaceHunter) performs better than all other structured models. However,
this cannot be independently verified, as the system was commercialised; one cannot conclude
whether this was for marketing purposes or a complete solution to the problem, as I have limited
time to implement it.
The other major category is feature-based approaches, which depend on extracted features that are
not affected by variations in lighting conditions and pose. Hjelmås and Low (2001, p.241) further
clarify that "visual features are organised into a more global concept of face and facial features
using information of face geometry". In my own opinion, this technique will be slightly difficult
to use for images containing faces against uncontrolled backgrounds. The technique relies on
feature analysis and feature derivation to gain the required knowledge about the face to be
detected; the features extracted are skin colour, face shape, eyes, nose and mouth. In another
study, Mohamed et al. (2007, p.2) suggest that "human skin colour is an effective feature used to
detect faces, although different people have different skin colour, several studies have shown that
the basic difference based on the intensity rather than their chrominance". The texture of human
skin can therefore be separated from that of other objects (a sketch of a simple skin-colour filter
follows below). Some feature-based methods depend on detecting edges and then grouping the edges
for face detection. Furthermore, Sufyanu et al. (2016) suggest that a good extraction process will
involve feature points chosen for the reliability of their automatic extraction and their
importance for face representation. Most geometric feature-based approaches use the active
appearance model (AAM), as suggested by S., N.S. and M., M. (2010). This allows localisation of
facial landmarks in different ways, to extract the shape of facial features and the movement of
these features as expression evolves.
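A common way to exploit this chrominance property is to threshold in the YCbCr colour space, which
separates intensity (Y) from chrominance (Cb, Cr). The MATLAB sketch below is a minimal
illustration; the Cb/Cr bounds are commonly quoted values from the skin-detection literature, not
thresholds tuned for this project, and the file name is an assumption.

    % Skin-colour segmentation sketch in YCbCr. The Cb/Cr ranges are commonly
    % quoted defaults and would need tuning in practice.
    RGB = imread('classroom.jpg');          % assumed input photo
    ycbcr = rgb2ycbcr(RGB);
    Cb = ycbcr(:,:,2); Cr = ycbcr(:,:,3);
    skinMask = Cb >= 77 & Cb <= 127 & Cr >= 133 & Cr <= 173;
    skinMask = bwareaopen(skinMask, 200);   % discard small noisy regions
    imshowpair(RGB, skinMask, 'montage');   % photo next to candidate regions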
Hjelmås and Low (2001, p.241) further divide the feature-based approach into sub-categories:
- Low-level analysis (edges, grey levels, colour, motion and generalised measures).
- Feature analysis (feature searching and constellation analysis).
- Active shape models (snakes, deformable templates and point distribution models (PDMs)).
Figure 2.5 shows the different approaches to face detection as reported in the study by Hjelmås and
Low (2001); it can be compared with Figure 2.6, which shows the same classification by Modi and
Macwan (2014, p.11108).
Figure 2.5: Face detection classified into different methodologies. Hjelmås and Low (2001, p.238).
Figure 2.6 Various Face Detection Methodologies. Modi and Macwan (2014, p.11108).
Hjelmås and Low (2001, p.240), in their study, show an experiment based on an edge-detection
approach to face detection on a set of 60 images of 9 faces with complex backgrounds; it correctly
detected 76% of faces, with an average of two false alarms per image. Nehru and Padmavathi (2017,
p.1), in their study, experimented with face detection based on the Viola-Jones algorithm on a
dataset of dark-skinned subjects, to support their statement that "it is possible to detect various
parts of the human body based on the facial features present", such as the eyes, nose and mouth.
Systems of this kind have to be trained properly to distinguish features like the eyes, nose and
mouth when a live dataset is used. The Viola-Jones algorithm detects faces as seen in Figure 2.7,
which shows dark-skinned faces detected accurately.

Figure 2.7: Face detection on dark-skinned subjects. Nehru and Padmavathi (2017, p.1).

Also, in support of the claim made by Nehru and Padmavathi (2017), the research carried out by
Viola and Jones on their face detection algorithm has had the most impact in the past decade. As
suggested by Mayank Chauhan et al. (2014, p.1615), Viola-Jones face detection is widely used in
genuine applications such as digital cameras and digital photo management software. This claim is
based on the study by Viola and Jones (2001). Table 2.9 summarises the results obtained by these
researchers, showing the numbers of false and positive detections on the MIT and CMU database set
of 130 images and 507 faces.
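MATLAB, the implementation language chosen for this project, ships a cascade detector based on the
Viola-Jones framework in its Computer Vision Toolbox. A minimal sketch (the image file name is an
assumption):

    % Viola-Jones frontal-face detection via MATLAB's cascade detector.
    detector = vision.CascadeObjectDetector();  % default frontal-face model
    I = imread('lecture_photo.jpg');            % assumed classroom photo
    bboxes = step(detector, I);                 % one [x y w h] row per face
    out = insertObjectAnnotation(I, 'rectangle', bboxes, 'Face');
    imshow(out);
    fprintf('Detected %d face(s)\n', size(bboxes, 1));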
Table 2.9: Detection rates of different algorithms, showing positive and false detection rates.
Viola and Jones (2001, p.I-517).
Wang et al. (2015, p.318), quoted earlier on the definition of face detection, tested face
detection based on two modules: one using a combination of two algorithms (PCA with SVM), and the
other based on a real-time field-programmable gate array (FPGA). With these, they reported a face
detection accuracy of 89%. Table 2.10 is a screenshot taken from their paper, showing the
experimental results of the two units combined in order to investigate the accuracy of the system.

Table 2.11: Comparing different algorithms on classification rates. Thai et al. (2011, p.392).

The overall objective of the face detection part of this project is to find out whether any faces
exist in the input image and, if so, to return the location and extent of each face in bounding
boxes, counting the number of faces detected. A challenge for this project is that, owing to
variations in location, scale, pose, orientation, facial expression, illumination or lighting
conditions, and appearance features such as facial hair and makeup, it will be difficult to achieve
an excellent result. However, the performance of the system will be evaluated, taking into
consideration the learning time, the execution time, the number of samples required for training,
and the ratio between the detection rate and false detections. Table 2.12 below shows experiments
from different researchers, who used image datasets of different sizes; some used combinations of
algorithms, applied other methods such as colour filtering, and used different training sets to
obtain their results. However, we can conclude that the Viola-Jones algorithm, which classifies
images based on local features only, can still detect faces with very high accuracy, and more
rapidly than pixel-based systems (Viola and Jones, 2001, p.139).
Face Recognition:
Face recognition can be defined as the identification of an individual, based on biometrics, by
comparing a digitally captured image or video with the stored record of the person in question.
In the early 1990s, numerous algorithms were developed for face recognition, along with a growing
need for face detection, and systems were designed to deal with video streaming. The past few years
have seen more research and more systems developed to deal with such challenges. Dodd (2017, p.1)
reported that at the recent Notting Hill carnival, some arrests resulted from a trial of facial
recognition systems; hence there is still on-going research into these systems. In contrast, facial
recognition software contributed just one arrest out of the 4962 made during the 2011 London riots.
With the most recent facial recognition and detection techniques, commercial products have emerged
on the market. Despite this commercial success, a few issues are still to be explored.
Jafri and Arabnia (2009), in their study, discuss face recognition as two primary tasks.
Verification is a one-to-one matching of an unknown face against a claim of identity, to ascertain
whether the face belongs to the individual claiming to be the one in the image. Identification is a
one-to-many matching: given an input image of an unknown individual's face, the identity is
determined by comparing the image against a database of images of known individuals. Face
recognition can be used in numerous applications, such as security, surveillance, general identity
verification (electoral registration, national ID cards, passports, driving licences, student IDs),
criminal justice systems, image database investigations, smart cards, multimedia environments,
video indexing and witness face reconstruction. In its most common form, face recognition works on
the frontal view, which is neither unique nor rigid, as numerous factors cause its appearance to
vary. Variations in facial appearance have been categorised into two groups: intrinsic factors (the
physical nature of the face, independent of the observer) and extrinsic factors (illumination,
pose, scale and imaging parameters such as resolution, noise and focus), as discussed by Gong et
al. (2000) and supported by Jafri and Arabnia (2009, p.42).
Lenc and Král (2014, pp.759-769) classify face recognition into various approaches. The correlation
method compares two images by computing the correlation between them, with the images handled as
one-dimensional vectors of intensity values. The images are normalised to zero mean and unit
variance, and a nearest-neighbour classifier is applied to the images directly; with these
normalisations, the light source intensity and the characteristics of the camera are suppressed.
The limitations of this method are that a large amount of memory is needed, that the corresponding
points in the image space may not be tightly clustered, and that it is computationally expensive.
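A minimal sketch of this comparison, assuming two grayscale face images of equal size (file names
illustrative): MATLAB's corr2 computes the 2-D correlation coefficient, which is equivalent to
normalising both images to zero mean and unit variance and taking their inner product.

    % Correlation-based matching: compare a probe face against one gallery
    % face; in practice the probe is scored against every gallery image and
    % the highest score wins.
    A = im2double(imread('probe_face.pgm'));    % unknown face (assumed file)
    B = im2double(imread('gallery_face.pgm'));  % stored face, same size
    score = corr2(A, B);                        % 1.0 = identical up to gain/offset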
Eigenfaces: this method considers the whole image as a vector. Its performance depends on the
images being aligned, with approximately the same pose; changes in lighting conditions, scale, pose
and other dissimilarities decrease the recognition rate rapidly.
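A compact sketch of the eigenfaces idea: each training image becomes a column vector, the principal
components are obtained from an SVD of the mean-centred data, and a probe is classified by nearest
neighbour in the projected subspace. The data here are faked with random values purely so the
sketch runs; the image size, number of components and labels are all assumptions.

    % Eigenfaces sketch: PCA via SVD plus nearest-neighbour matching.
    % Each column of X is one vectorised, aligned training image.
    h = 112; w = 92; N = 40;                 % assumed image size and set size
    X = rand(h*w, N); labels = repelem(1:10, 4); x = rand(h*w, 1);

    meanFace = mean(X, 2);
    Xc = X - meanFace;                       % centre the data
    [U, ~, ~] = svd(Xc, 'econ');             % columns of U are the eigenfaces
    k = 20;                                  % keep the top-k components
    W = U(:, 1:k);
    G = W' * Xc;                             % project the gallery

    p = W' * (x - meanFace);                 % project the probe image
    [~, idx] = min(vecnorm(G - p));          % nearest neighbour in subspace
    identity = labels(idx);                  % predicted identity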
View-based eigenfaces: unlike the previous method, this evaluates images on a large database and
addresses the problem of viewing orientation.
Independent component analysis: this separates a signal into sub-components, the main aim being to
find a linear combination of non-Gaussian data signals that reconstructs the original signal. With
this method, images are treated as random variables and pixels as observations, or vice versa.
Fisherfaces: these are derived from Fisher's Linear Discriminant (FLD) and projected into a
lower-dimensional space. The dimensionality, given by the image resolution, is reduced according to
the number of distinct classes. Yi et al. (2000) investigated this method and compared it with
SKKUfaces (Sungkyunkwan University faces), an algorithm they developed which adopts Principal
Component Analysis and FLD in series, similarly to Fisherfaces. They discuss the Fisherfaces method
as being able to perform dimensionality reduction using a projection matrix while still preserving
class separability; FLD is applied to reduce the PCA subspace, thus achieving more reliable
classification. Their performance comparison of Fisherfaces and SKKUfaces on SKKU facial images
showed a recognition rate of approximately 88% for Fisherfaces and 92% for SKKUfaces, with changes
in illumination and other factors considered. They also compared Fisherfaces and SKKUfaces on Yale
facial images and achieved approximate recognition rates of 92% for Fisherfaces and 96% for
SKKUfaces. On the other hand, Jeong and Choi (2013) show the performance of Fisherfaces for various
numbers of features, with a recognition rate of approximately 91% on the Yale database, as shown in
Figure 2.8. Furthermore, they expanded their research and compared the Fisherfaces method with
Feature Feedback, obtaining a recognition rate of approximately 95% for Fisherfaces, with Feature
Feedback performing at approximately 96% on the Yale database.
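For reference, the projection sought by FLD maximises the ratio of between-class to within-class
scatter; in the standard notation (a general definition, not taken from the papers above):

    W_{\mathrm{opt}} = \arg\max_{W} \frac{\left| W^{T} S_{B} W \right|}{\left| W^{T} S_{W} W \right|},
    \qquad
    S_{B} = \sum_{i=1}^{c} N_i (\mu_i - \mu)(\mu_i - \mu)^{T},
    \qquad
    S_{W} = \sum_{i=1}^{c} \sum_{x \in X_i} (x - \mu_i)(x - \mu_i)^{T}

where c is the number of classes, mu_i is the mean of class X_i containing N_i samples, and mu is
the overall mean; Fisherfaces applies PCA first so that S_W is non-singular.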
Figure 2.8: Number of features for recognition. (a) Descending order of eigenvalues. (b)
Recognition rate as a function of the number of features. Jeong and Choi (2013, p.542).
Kernel methods: namely KPCA and KFLD, these address the issue that the original methods are based
on second-order statistics, as discussed by Lenc and Král (2014, p.761). They take the dependencies
among multiple pixels into consideration, allowing more information to be captured for the face
representation. Another kernel-based method uses kernel Fisher discriminant analysis, with features
extracted by the Discrete Cosine Transform (DCT) and the Radon transform; the resulting
coefficients are used as the feature vector, and the Kernel Fisher Discriminant (KFD) is applied to
increase discrimination capability. This method was tested by Jadhav and Holambe (2010, p.1007)
alongside other algorithms and showed, on the FERET database, an average recognition rate of
approximately 90.09% for KPCA on two image sets and 92.01% for KFD on three image sets. On the ORL
database, KPCA showed an average recognition rate of 90.65% and KFD 91.54% for three images per
training set; on the Yale database, KPCA averaged 90.22% and KFD 94.76%, giving overall average
recognition rates of 90.32% for KPCA and 92.77% for KFD.
Adaptive local hyperplane: one of the methods discussed by Lenc and Král (2014), this is an
extension of the K-local Hyperplane Distance Nearest Neighbour (HKNN). The method approximates
possibly missing instances in the manifolds of particular classes by a local hyperplane.
Classification of an unknown vector starts by identifying its K nearest neighbours, from which the
local hyperplane is constructed.
Genetic algorithms: this approach processes a facial image in a lower-dimensional PCA subspace. It
searches for an optimal rotation of a basis vector based on a fitness function, as the rotations
are random.
Trace transformation: this is invariant to image transformation; the image is first transformed
into a trace space, which creates the face representation.
Linear regression: this assumes that the faces of one class lie on a linear subspace, with multiple
training images for each class (individual).
Active appearance models: this method uses a statistical model of grey-level appearance and object
shape. As stated by Lenc and Král (2014, p.762), "a set of training examples is used to learn the
valid shapes. The shapes must be labelled". This indicates that the landmarks are marked manually,
so that the algorithm can try to match the model to an image; the distance between the synthesized
model and the image is then minimised iteratively.
Neural networks: recognition is performed by neural networks, with the images sampled into a set of
vectors; the vectors created from the labelled images are used as a training set for a
Self-Organising Map. In another study, Dhanaseely et al. (2012) discuss neural network classifiers
as Artificial Neural Networks (ANNs) composed of artificial neurons that use a computational model
to process information. They conducted an experiment with their proposed system to measure the
recognition rates of two neural networks, a feed-forward neural network and a cascade neural
network. A diagram of their proposed system is shown in Figure 2.9.
Table 2.13: Parameters for the cascade neural network and the feed-forward neural network.
Dhanaseely et al. (2012).

The overall recognition rates based on the parameters shown in Table 2.13 are 97.5% for the cascade
network (CASNN) and 92.5% for the feed-forward network (FFNN). This experiment is based on the ORL
database.
Hidden Markov Models: the states of the HMM are associated with subdivided regions of the face
(eyes, nose, mouth, etc.). Images in this method are sampled with a rectangular window of the same
width as the image, shifted downward with a specific block overlap. The boundaries between the face
regions are represented by probabilistic transitions between the states of the HMM.
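In the standard notation (a general definition, not specific to the face recognition papers
discussed here), such a model is the triple

    \lambda = (A, B, \pi), \qquad
    A = \{\, a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i) \,\}

where A holds the transition probabilities between the hidden states (here, the face regions), B
holds the observation probability distributions that generate the sampled image blocks in each
state, and pi is the initial state distribution.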
Miar-Naimi and Davari (2008), in their study, investigated the algorithm based on a 7-state HMM.
Their experiment used Singular Value Decomposition (SVD) coefficients as the extracted features:
with the quantised SVD coefficients of each block extracted from the image, each face becomes a
numerical sequence that is modelled by the HMM. With an order-statistic filter used to pre-process
the images, their experiment was carried out on the ORL database of 400 images, and a 99%
recognition rate was achieved. On the Yale database, which has 165 images of 15 subjects, they
obtained a 97.78% recognition rate. However, this high recognition rate was achieved with only
three features and the best classification rate per block, alongside image resizing. With different
numbers of quantisation levels and symbols (24, 182 and 630), the recognition rate dropped to 80%,
94.5% and 97% respectively. Moreover, the high recognition rates obtained in this study were
achieved only when the number of training images was greater than five and the facial images were
forward-facing; with fewer than five training images, the recognition rate on the Yale database was
78%.
Lenc and Král (2014, p.763), in their study, investigated this algorithm on a dataset containing 5
images each of 24 individuals; the recognition rate using this approach was 84%. To put this in
context, they compared it with Eigenfaces on the same dataset and obtained a recognition rate of
74%. In another study, Cheng et al. (2014) describe the HMM as "a doubly embedded stochastic
process with a Markov chain describing the transition of the states while another stochastic
process used to describe the statistical corresponding relation between states and observations".
With the states hidden, the model can only be observed through the observations produced by each
state, and the HMM works better with well-specified parameters. Cheng et al. (2014) tested this on
the ORL database and obtained an 86.50% recognition rate. This rate depends on the parameters
(sampling height, overlap, DCT coefficients, training set, training times and recognition time);
the recognition rate can be partially increased by optimising the sampling height, overlap and DCT
coefficients. Wang and Abdulghafour (2015, p.294), in their study, also investigated two databases
by combining DCT-HMM for face recognition; with an overlap value of 9, they obtained recognition
rates of 83.5% on the ORL database and 82.23% on the Yale dataset.
Support vector machines: this method relies on two approaches to create the vectors that represent
the face. In the global approach, the pixel values of the whole face are used as the input vector
to a Support Vector Machine classifier. In the component-based approach, separate representations
of important parts of the face are fed into the classifier for individual classification; this is
less sensitive to image variation. SVMs are also used with features derived from Linear
Discriminant Analysis (LDA). Da Costa et al. (2015) investigated this method by applying feature
extraction methods, extracting the coefficients generated by different transforms (shearlet,
wavelet, contourlet and curvelet) and variations of PCA and LDA, and obtained an approximate
recognition accuracy of 85.05%.
Cost-sensitive face recognition: most researchers consider the recognition rate but do not take
into account the different types of misclassification, which may have an impact on the performance
of the system; the loss value depends on the classification error. Examples of cost-sensitive
classifiers are mckNN and mcKLR.
Elastic bunch graph matching and related approaches: these use features obtained from Gabor
wavelets. The first phase of the process is to manually label the landmarks presented to the
algorithm; the landmarks are then used to compare landmark positions in an input image. The
landmark positions are encoded by Gabor wavelet convolutions (jets) and used for the face
representation. With a "bunch graph" created to relate these, each node in the graph contains a set
of jets for each landmark across all the images. Face similarity is obtained from the landmark
positions and the jet values.
Kepenekci method: in this algorithm, the landmarks are obtained dynamically with Gabor filters, in
contrast to the previous algorithm, which requires manual labelling of the facial landmarks. It
uses a sliding window to scan the images and identifies the maxima of the Gabor filter responses
within the window; these points are known as fiducial points. The fiducial points are not constant,
and they are used to calculate the feature vectors; cosine similarity is used to compare these
vectors. The larger the window size, the fewer fiducial points are detected; however, searching
with a larger window takes more computational time, and the number of fiducial points determines
the time needed in the comparison stage.
Local binary patterns: first used as a texture descriptor, the LBP operator uses the value of the
central pixel to threshold a local image region; the neighbouring pixels are labelled 0 or 1
depending on whether their value is lower or greater than the threshold (a sketch of the basic
operator follows at the end of this section). Linna et al. (2015), in their study, proposed an
online face recognition system based on LBP and facial landmarks, which uses a nearest-neighbour
classifier for LBP histogram matching. They experimented with the system on the videos of the
Honda/UCSD video database, using both offline and online testing with different distance
thresholds, and achieved recognition rates of 62.2%, 64.0% and 98.6% for the offline tests. The
recognition rates were calculated from a confusion matrix, shown in Figure 2.10 below as a
screenshot from their paper. The online test performed at a recognition rate of 95.9%. The high
recognition rates in their experiment rest on a longer search strategy: the detected, tracked face
is used to find the nearest-neighbour match, and the number of frames from the start of face
tracking is used in the database. The number of frames that can be processed decreases as the
database gets larger, and the search time increases, because more time is needed to find the
nearest match for a single frame. So although the recognition rate may be high, the method is not
robust enough to compete with other methods.
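A plain implementation of the basic 3x3 LBP operator described above, as a MATLAB sketch; this is
the textbook operator, not necessarily the exact variant used by Linna et al., and the file name is
an assumption.

    % Basic 3x3 LBP: threshold the 8 neighbours against the centre pixel
    % and pack the 0/1 results into an 8-bit code per pixel.
    I = im2double(rgb2gray(imread('face.jpg')));   % assumed input image
    [rows, cols] = size(I);
    centre = I(2:rows-1, 2:cols-1);
    lbp = zeros(rows-2, cols-2);
    offs = [-1 -1; -1 0; -1 1; 0 1; 1 1; 1 0; 1 -1; 0 -1];  % clockwise
    for k = 1:8
        nb = I(2+offs(k,1):rows-1+offs(k,1), 2+offs(k,2):cols-1+offs(k,2));
        lbp = lbp + 2^(k-1) * (nb >= centre);      % set bit k where >= centre
    end
    h = histcounts(lbp(:), 0:256);                 % 256-bin histogram descriptor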
Figure 2.10: Confusion Matrix showing Offline test results by Linna et al. (2015, p.10).
Local derivative patterns: these construct patterns from local derivative variations. They have an
advantage over LBP in that they are of higher order and can represent more information. They can be
applied to original grey-scale images or to Gabor-filtered images. Dagher et al. (2013), in their
research, investigated LDP and other algorithms against their proposed voting-technique algorithm
on four databases (ORL, UMIST, Yale and BioID). They randomly partitioned each database into a 60%
training set and a 40% test set; the LDP obtained a recognition rate of approximately 73.59%.
Scale Invariant Feature Transform (SIFT): originally developed for object recognition by Lowe
(1999), SIFT creates local features that can lead to high recognition rates. These features are
invariant to image scaling, translation, rotation and illumination. With this algorithm, features
of the reference image are compared using the Euclidean distance between their feature vectors. The
algorithm works in four stages: extrema detection, removal of key-points with low contrast,
orientation assignment, and descriptor calculation. Lenc and Král (2014, p.765) report recognition
rates of 96.3% and 91.7% on the ORL and Yale databases respectively. In other research by Sunghoon
et al. (2016), the recognition rate on the ORL database was 90.0% for an image size of 50x57. Their
conclusion was that the usage of "SIFT for face recognition has many problems because face has the
landmark of doubleness, non-rigid and smooth character compared to general object", as stated by
Sunghoon et al. (2016, p.10).
Speeded-Up Robust Features (SURF): another useful method for key-point detection and descriptor
creation. In this method, key-point detection is based on the Hessian matrix, with box filters
approximating the second-order Gaussian derivatives (the underlying matrix is given below). It is
invariant to face rotation, as one orientation is assigned to each key-point, and computation is
based on a circular neighbourhood of the key-points. Carro et al. (2015) investigated this method
in comparison with the SIFT method implemented in OpenCV. Their proposed approach followed the
step-by-step flow shown in Figure 2.11 below, obtained as a screenshot from their study. They
measured the performance of SURF against SIFT based on the Correct Recognition Rate (CRR) and the
Equal Error Rate (EER): SIFT showed 87.34% CRR and 31.7% EER, while SURF showed 98.97% CRR and
29.62% EER. Figure 2.12 shows the comparison of SIFT and SURF key-point matching obtained from the
results of their investigation. They concluded that both algorithms can be used for face
recognition, with SURF the more suitable.
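For reference, the Hessian matrix on which SURF's detector is based, at image point x = (x, y) and
scale sigma (the standard definition from the SURF literature):

    H(\mathbf{x}, \sigma) =
    \begin{bmatrix}
      L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma) \\
      L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma)
    \end{bmatrix}

where L_{xx}(x, sigma) is the convolution of the second-order Gaussian derivative with the image at
x, and similarly for L_{xy} and L_{yy}; SURF approximates these convolutions with box filters so
that they can be evaluated cheaply on an integral image.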
Figure 2.11: Overview of the proposed system by Carro et al. (2015, p.324).
Figure 2.12: Comparison of SIFT (left) and SURF (right) showing key-point matching. Carro et al. (2015).
The approaches reviewed in this section were evaluated by the experts cited on different databases.
There are bound to be disparities in the experimental setups, which leads one to conclude that
there is no single worst or best performing approach; but the review can guide the choice of
technique for our application, which will be discussed in the implementation phase.
The main purpose of face recognition in this paper is therefore to match a face image of a student,
captured in a lecture, against a database of known faces, in order to identify the student in the
query image. The investigation will not be limited to discussing challenges but will test some of
the methods in the review to achieve the aim of the project.
CHAPTER THREE
Methodology:
Software projects have historically been developed using different methodologies applied by
software developers and engineers. As discussed by Veerapaneni and Rao (2011, p.41), the
methodology used is informed by the type of project, the organisation and the developers involved
in executing the project to completion.
The agile project delivery framework is the approach used for the development of the system in this
paper. DSDM is an agile methodology primarily used as a software development method. The
development of this project requires the supervisor's (user's) involvement in order to have timely,
visible results. The information gathered from the literature review shows researchers using
different algorithms for face detection and recognition. As this is an ongoing research area, the
project requires incremental implementation of smaller functionalities, which will be put together
at the end into a complete system. With the help of my supervisor, DSDM is the appropriate approach
to achieve the project objectives: functionalities will be prioritised in order of importance,
alongside continuous user involvement, and the supervisor's expertise will guide this
prioritisation. Unlike approaches such as Waterfall, where the stages of implementation are fixed
in advance, DSDM adapts easily to changes made during implementation (Baseer, 2015).
DSDM takes an iterative development approach, which is facilitated by the following
characteristics:
- The supervisor and the developer (myself) will be actively involved in the development of the
system.
- The client will be satisfied with the rapid development of the system, as working prototypes will
be released.
- The results of the development will be timely and directed by the supervisor.
- More functionality will be delivered at regular intervals, with basic functionalities delivered
frequently.
- Bureaucracy will be eliminated to break down communication barriers between parties and
facilitate incremental development.
- There will be early indications of whether the project objectives will be achieved, rather than
surprises at the end of the project, as different algorithms will be tried.
- The system is likely to be delivered on time by trying the different algorithms identified in the
literature review.
- The direction of the project will be influenced by the client and supervisor.
Below is Figure 3.1, showing the DSDM development process (Pierre, H. 2016). The feasibility study
for this project has been informed by the literature review. Each interactive and iterative stage
will be implemented according to changes to functionalities at regular intervals, by putting
together different algorithms to achieve the project objectives. The risk of building on the wrong
solution is reduced, as the project will be closely monitored; in this way, deployment of the
project will be a smooth process. DSDM guards against late delivery, as the project is focused on
the time frame; moreover, unused features will be dropped in order to meet the project deadline.
Extreme Programming (XP):
Another method that could have been considered is the agile framework Extreme Programming (XP).
With this methodology, there are constant releases of small prototypes, with testing alongside
(Wood et al., 2013). The incremental development of the system in this project requires a little
work upfront to understand the system design in a wider perspective, before going into the details
of specific features and functionalities. In this way, design decisions change based on the most
current information, and refactoring is incorporated to remove duplication. This project is ongoing
research in its field of study and will require small iterations complemented by evaluations and
client feedback. Development using this method also gives the client a clear idea of user
interaction with the system. It is a suitable approach for small teams of developers working on
projects with unclear or changing requirements. Furthermore, poor communication among the XP
foundations (testing, client acceptance tests, test-first design, pair programming and refactoring)
is easy to identify and can be moderated professionally, through simple design and a sustainable
pace incorporated with performance (Wood et al., 2013). However, pair programming, which is an
aspect of this methodology, would slow down the development of this project, as it is not possible
to work with the supervisor daily.
Comparing all the above methodologies, the most suitable framework for the development of this project is DSDM. This project will be divided into smaller functionalities, as the information gathered from the literature review shows there is no guarantee that a given algorithm will work as expected. In this case, there will be amendments during the implementation of each iteration, and new solutions can be applied. The SCRUM framework is similar, but with its short sprints and pair programming it is not a good fit, as there is only one developer.
To sum up, this Agile approach was chosen because of the information gathered from the literature review of the ongoing research on face recognition. Moreover, I initially had little understanding of face detection and face recognition, or of the resources and libraries that could best be used to implement them. However, with further research and a supervisor who is an expert in the field, I decided to choose this approach for iterative development throughout the life cycle of the project.
To conclude, with DSDM there is the flexibility to go back to earlier stages and find applicable solutions to upgrade functionalities.
- Display the input image alongside the output image, side by side on the same plot.
- Display the name of the output image above the image in the plot area.
Non-Functional Requirements:
Non-functional requirements are a set of requirements with specific criteria used to judge the system's operation. These requirements were collected after meetings with the client. They cover ease of use for the client, security, support availability, operational speed, and implementation considerations. More specifically:
- The user will find it convenient to take photos.
- The user will inform the students when taking a photo, with clear instructions on how to position their faces.
- The system should be secure.
- The system should have a response time of no more than 10 seconds.
- The system can be easily installed.
- The system should be efficient.
- The system must be fast and reliable.
MoSCoW Analysis:
In order to support the analysis stage, we use MoSCoW, a DSDM/Agile prioritisation tool, to weigh the importance of the requirements captured from analysing the use cases. This helps prioritise the delivery of each requirement under Must Have, Should Have, Could Have and Will Not Have. The "Must Have" requirements must be implemented for the final solution of this system to be acceptable. The "Should Have" requirements are priority features that will be implemented if possible within this project's time frame. The "Could Have" requirements are desirable features to have if time permits, but the system will still function without them. The "Will Not Have" requirements are features of this system that will be implemented in future.
Must Have: With regard to this project, the "Must Have" requirements are those identified by the client that must be implemented for the final solution. Without these requirements, the final solution will not achieve its aim and objectives.
- The application must detect faces and mark them with bounding boxes.
- Crop the total number of faces detected.
- The application must resize faces to match the size of the faces stored on the database.
- Compute the total attendance based on the number of faces detected.
- Train images for recognition.
- Display the input image alongside the output image, side by side on the same plot.
- Display the name of the output image above the image in the plot area.
Should Have: These are priority features that the system "Should Have", as identified by the client during the meeting. These features will be implemented if possible within this project's time frame. Although these features are priorities, the system will still meet its aim and objectives without them.
- Display the name of the input search image and the output image in the command window.
- Determine the percentage recognition of an image against that found on the database.
- Compute the recognition rate of the system.
Could Have: The "Could Have" requirements are desirable features for this project's system but will only be implemented if time permits. The system will still function without them.
- Graphical User Interface (GUI).
- Professional HD camera.
Will Not Have: The "Will Not Have" requirement is a feature identified during the meeting that will be implemented in future, as it is not much of an issue at the moment. Due to the unique login of the university, it is easy to trace who is using this system. However, if this system is to go commercial, this will be a requirement that must be implemented.
- A login authentication.
SWOT Analysis:
An analysis of the strategic implementation of this project can be done using SWOT, which evaluates the project in terms of Strengths, Weaknesses, Opportunities and Threats. This type of analysis makes it simple to identify these points. By using this method, this section will consider both internal (strengths and weaknesses) and external (opportunities and threats) factors which are either harmful or useful to accomplishing the project objectives. This will help estimate the likelihood of success in accomplishing the project objectives.
Strengths are internal factors that influence the success of the project objectives.
Weaknesses are internal factors that are harmful to accomplishing the project objectives.
Opportunities are external factors that help accomplish the project objectives.
Threats are external factors that are harmful to the project aims and objectives.
S.W.O.T. Matrix

Strengths:
1. The background knowledge I have in Java programming.
2. The ability to learn new skills within a short time.
3. The client has shown willingness to meet up, which is a strong indication that the management of the project will facilitate the elimination of non-essential requirements and possible mistakes. A strong point for achieving the objectives.
4. Strong listening and communication skills, and the ability to follow closely what the supervisor requests.
5. The experience of my supervisor in research and student-focused guidance in the right direction.
6. The expertise of my supervisor in image processing.

Weaknesses:
1. No previous experience in project management; I have some knowledge, but this is the first time applying it on a real project.
2. The technology (MATLAB) used to build this system is completely new to me and I have to learn it from scratch.
3. No previous experience of building a face recognition system using other languages.
4. The system may not have in-built security or authentication.
5. The possibility of not being granted permission to use the Kingston University image database, due to the Data Protection Act and procedure, given the time constraint.

Opportunities:
1. An improved system in the future, upon research into other methodologies and a completely new development for face recognition.
2. Free images of students in a classroom setting can be obtained from the internet for testing the system.
3. The application could be tested using face databases available online.
4. The expertise of my supervisor is an opportunity to take on the project, as valuable feedback will be very useful throughout the development of the system.
5. MATLAB is available both on the University platform and on my personal computer.
6. Fingerprint class attendance systems used by universities already exist in the market today.

Threats:
1. The system could be hacked and all images of students fall into the wrong hands.
2. An error in the system could cause a mismatch of students and could be unacceptable for use by the University.
3. Face recognition, which is still an ongoing research area worldwide, may eliminate the chance of the product being used by the University.
4. The government could change privacy and data-handling laws, which could affect the use of the system by the university.
5. The period to implement this project is very short.
6. Face recognition has been researched for many years and there is no guarantee that this project will be the best.
the user can click to facilitate interaction for certain tasks as requested. Because the system has two phases, the second phase will involve the training of images in a dataset that are to be used for recognition.
The proposed system behaviour has been captured by the use case diagram in Figure 3.2 below.
Conclusion:
The requirements analysis has laid the foundation to take the project forward through the design and implementation phases. Meeting with the client has been very useful in gathering functional and non-functional requirements, and the information gathered from the literature review has also been very useful. MoSCoW, SWOT analysis and use cases have been strong tools for identifying how the client wants the system to work. However, because aspects of this project are part of ongoing research, there will be changes during implementation to achieve more, which could lead to further contributions in the future.
Project Plan
This project has been planned and followed using the Gantt chart below. The detailed layout of the plan is attached in Appendix A2 of this report, showing the phases of implementation using the DSDM methodology chosen for this project. The reader can follow these phases on the
detailed layout to understand how iteratively this project has been developed.
[Gantt chart: project plan timeline from 9 October to 27 April.]
Project Contingency
The project will be worked on continuously, as I will be using the DSDM/Agile methodology approach throughout the build. The Gantt chart shows a planned duration based on generous time-frame estimates, which gives me and my supervisor ample time to meet the deadline. Some of the factors outlined below may hinder the project phases.
Risk factor: Delay in project implementation.
Evaluation: Likelihood: Medium; Impact: Medium.
Contingency: Online tutorials, attend lectures on Image Processing and Vision, online research on similar projects.

Risk factor: Breakdown in communication with supervisor.
Evaluation: Likelihood: Medium; Impact: High.
Contingency: Planned meetings with supervisor; communicate via email and work with a calendar.

Risk factor: Work in progress with short deliverables and functionalities may increase the time frame, as unforeseen functionalities may come into play.
Evaluation: Likelihood: Low; Impact: High.
Contingency: More research and online study on areas that may arise with these functionalities; use existing image processing systems to compare functionalities.

Risk factor: Lack of technical expertise.
Evaluation: Likelihood: Medium; Impact: High.
Contingency: Commit extra hours of work on the use of MATLAB and online research.

Risk factor: Technology used may not fulfil project objectives and requirements as stated.
Evaluation: Likelihood: High; Impact: Very High.
Contingency: Carry out research to use the best platform or technology.

Risk factor: Ill health.
Evaluation: Likelihood: Medium; Impact: Low.
Contingency: Rearrange workload to cope with ill health.

Risk factor: Bugs in the code leading to unsatisfactory results.
Evaluation: Likelihood: Very Low; Impact: Very High.
Contingency: Continuous integration, testing and evaluation to ensure the system is satisfactory.
Summary:
In summary, to understand the goals and objectives of this project, boundaries have been established using various tools to capture and analyse specific requirements. The user stories have enabled us to capture specific requirements, the non-functional requirements give criteria by which to judge the system, and MoSCoW prioritises the functionalities the system must have. Moreover, SWOT has been used to analyse the internal and external factors which can be either helpful or harmful to the project objectives. Contingency plans for the identified potential risks were classified under likelihood and impact for the smooth management of the project.
CHAPTER FOUR
Design:
This chapter presents the design concepts that have led to the current implementation of the prototype of this project. The design of this system is carried out using the requirements analysed in the previous chapter, in order to produce a description of the system's internal structure that will serve as the basis for implementing the system (Bass et al 2013). This will result in the system's architecture, showing
how the system will be decomposed and organised into components, alongside the interfaces of those components. The model design of this system will form a blueprint for the implementation that will be put together to achieve the project objectives and the best performance for the final product. This system design consists of activities that fit between software requirements analysis and software construction. The algorithms that compute the functionalities of this system are discussed further in this chapter.
The various outputs of this design are the functional specification, detailed design, user interface specification, data model, and prototype implementation plan.
Data Flow Model: This is a representation of this project's system, by way of diagrams, to show the exchange of information within the system. It gives an overview of the system without going into much detail, which can be elaborated later (Wikipedia 2018).
Structural Model: A structural model displays this project's system in terms of its components and their relationships. For example, the system is modelled by an architectural design to illustrate how it responds to events in its environment and the interactions between components, both externally and internally (Sommerville 2011, p.119). The system will also be represented using a behavioural diagram based on the Unified Modelling Language (UML).
After a clear understanding of the requirements and the components assigned to the system, the method chosen to inform the implementation phase is described below.
During this phase, the client provided feedback to match specific objectives. The prototype has been built iteratively using the requirements gathered in the requirements and analysis section of the previous chapter.
Architecture Design:
This shows the interaction between software (internal) components and hardware (external) components, with an interface establishing a framework to achieve the system objectives. Both external and internal components have been considered. The internal component incorporates all the functionalities, with a graphical user interface allowing the user to interact with the system.
The system will very much rely on the resolution and quality of the image and how it is trained for recognition.
Interface Design:
The graphical user interface has been designed to allow the user to interact with the system. It has been implemented using MATLAB's GUI designer, accessed via GUIDE. A simple menu could still do the job, but a GUI brings the system together, from face detection to recognition.
Choice of Methods/Algorithms:
The figures above represent the system's structure, interface, and the activities that occur in the system. Figure 4.1 represents the structure, showing the hardware component (camera) as an input device for images. The software component that performs the same task is a local database (folder) of images. The image is loaded using an internal functionality (Load image) for pre-processing. When the image is processed, the faces in the image are detected and aligned into the suitable sizes required for feature extraction. The features are then classified and matched to the corresponding faces requested by the system or the user. This is then output to the GUI, which is part of the internal software components. Figure 4.2 shows the structure of the GUI and the buttons for the various functionalities implemented in the system. The first axis will output the image, and the second axis, below the first, will mirror the command window output for the user to view when a task is completed. Figure 4.3 shows the activities going on in the system, which have been described above. These functionalities use the algorithms described below for the design and implementation of the system.
The Viola-Jones Algorithm:
The Viola-Jones algorithm for face detection was proposed by Viola and Jones (2001), as mentioned in the literature review. Since then, this algorithm has been widely used by researchers for face detection.
It has shown the most impact compared to other methods, with fast detections, due to its broad use in genuine applications (Mayank Chauhan et al 2014).
The Viola-Jones algorithm works with a full view of frontal faces (Viola-Jones 2001). Difficulty arises when faces are tilted or turned to either side, but this can be adjusted for, as has been implemented with reference to MATLAB.
The Viola-Jones algorithm scans a sub-window across an input image in order to detect faces. The standard approach with this algorithm is to rescale the detector and run it many times through the image, rather than rescaling the input image, which would result in more computational time. Each time the rescaled detector is run against the input image, the size of the detection window changes. The initial step of the Viola-Jones algorithm converts the input image into an integral image, in which each pixel is set to the sum of all the pixels above and to the left of the pixel in question.
[Figure: a 3x3 input image of ones (left) and its integral image (right); regions A, B, C and D illustrate how a rectangle sum is formed.]
Input image:      Integral image:
1 1 1             1 2 3
1 1 1             2 4 6
1 1 1             3 6 9

A B
C D
Rectangular features of arbitrary sizes can then be computed in constant time. The given sub-window is analysed by Viola-Jones using features of two or more rectangles, each needing only a small number of array references. The features are as shown in Figure 4.2, where the resulting value is calculated by subtracting the sum of the white rectangles from the sum of the grey rectangles (Viola-Jones 2001).
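To make the constant-time property concrete, here is a minimal MATLAB sketch (not taken from the project code; the image file name is hypothetical) that builds an integral image with cumsum and evaluates a rectangle sum from only four array references:

% Sketch: integral image and constant-time rectangle sums (illustrative only).
I = imread('classroom.jpg');                 % hypothetical test image
if size(I, 3) == 3, I = rgb2gray(I); end     % ensure grayscale
ii  = cumsum(cumsum(double(I), 1), 2);       % ii(r,c) = sum of I(1:r,1:c)
iip = padarray(ii, [1 1], 0, 'pre');         % zero-pad so borders need no special case
% Sum of pixels in the rectangle with top-left (r1,c1) and bottom-right (r2,c2):
rectSum = @(r1,c1,r2,c2) iip(r2+1,c2+1) - iip(r1,c2+1) - iip(r2+1,c1) + iip(r1,c1);
s = rectSum(10, 10, 30, 30);                 % constant time, regardless of size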
The recognition part of the system has been implemented using a Hidden Markov Model (HMM) with Singular Value Decomposition (SVD). This choice is based on the information gathered during the literature review: HMM with SVD coefficients was considered because of the strong results obtained by implementing these algorithms together, as reported by research. Those results were the motivation for adopting these algorithms for this part of the system. As this area of study is still under active research, the algorithm has been implemented differently here and evaluated accordingly.
Hidden Markov Models (HMM):
An HMM is a statistical tool that models a sequence of events (observations) without direct access to the sequence of states the model went through to generate them (MATLAB R2018a Documentation). In an HMM, the sequence of states can be predicted from a sequence of observations. Other research defines the HMM as a finite set of states, each associated with a multidimensional probability distribution. Transitions between these states are governed by transition probabilities, and an observation is generated according to the probability distribution associated with the current state. The training and testing of the images for recognition are performed in the observation vector space (Miar-Naimi and Davari 2008). The image is usually represented as a 2D matrix. In this design, a one-dimensional HMM is used to partition the face.
The face is modelled in one dimension with seven states. In a simple Markov chain the state is visible to the observer, but in an HMM the state is hidden, with only the output of the state made visible. The seven states of the HMM run from the top to the bottom of the face and correspond to the head, forehead, eyebrows, eyes, nose, mouth, and chin. These states are refined by adjusting the following elements needed to complete an HMM (Miar-Naimi and Davari 2008):
- The number of states (N) of the model.
- The number of observations.
- A set of transition probabilities.
Singular Value Decomposition (SVD):
The SVD of a given matrix is the factorization of that matrix into three matrices, where the columns of the left and right matrices are orthonormal and the matrix in the middle is diagonal with real positive entries. It is a tool used in signal processing and statistical data analysis (Miar-Naimi and Davari 2008). The singular values of a data matrix contain information on the noise level, the energy and the rank of the matrix. SVD is used because the singular values capture features of patterns embedded in a signal, and the singular vectors are orthonormal and span the bases of the matrix (Miar-Naimi and Davari 2008). SVD is relevant to face recognition due to its stability on the face image; it is a robust feature extraction technique. The singular value decomposition of an m-by-n matrix X is given by X = U*S*V^T, where U and V are orthogonal matrices, S is a diagonal matrix of the singular values of X, and V^T is the transpose of V. The matrix in this case contains grayscale values, one for each pixel.
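As a small illustration (the file name is hypothetical and not from the project code), MATLAB's built-in svd returns exactly this factorization for a grayscale face block:

% Sketch: SVD of a grayscale image block (illustrative only).
A = double(imread('face.pgm'));       % hypothetical grayscale face image
[U, S, V] = svd(A);                   % A = U*S*V' up to floating-point error
sv = diag(S);                         % singular values: real, non-negative, descending
reconErr = norm(A - U*S*V', 'fro');   % near zero, confirming the factorization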
Feature Extraction:
The features are extracted using SVD, because its coefficients form the observation vectors. As these coefficient values are continuous, there is effectively an infinite number of possible observation vectors, which cannot be modelled by a discrete HMM. Quantization is therefore used to model the probability density function by a distribution of prototype vectors. It proceeds by dividing the large set of vectors into groups containing approximately the same number of points, which may lead to some information loss (Miar-Naimi and Davari 2008).
Figure 4.3: Converting a single face image into an observation sequence (Omid Sahki 2008).
The diagram shows an image of width W and height H. The image is divided into overlapping blocks of a new height (newH) and width W. The features are extracted across the seven states (from forehead to chin). A patch of size newH-by-W slides from top to bottom and the sequence of overlapping blocks is generated. The overlap size is OL = newH - 1, showing that each patch is moved by only 1 pixel at a time. Using a for loop, the number of blocks extracted from each face is #Blocks = (H - newH)/(newH - OL) + 1, which gives the number of elements in the sequence. Each extracted block is converted to singular values using SVD, and a 1-by-3 matrix stores the three coefficient values. These values undergo quantization, rounding them to approximate discrete values. Assume the discrete coefficient values lie between 0 and 17, 0 and 9, and 0 and 6. Each block of the face will then have three discrete values, and a single label is assigned to each block. The possible combinations amount to a maximum label (maxLabel), with a minimum label of 1 when all the values are zero. From these values, the final observation sequence is generated.
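A minimal MATLAB sketch of this slicing (the sizes follow the worked values in this section; the random matrix stands in for a real face image):

% Sketch: extract overlapping horizontal blocks from a face image.
H = 56; W = 46;                        % face height and width after resizing
newH = 5; OL = newH - 1;               % block height and overlap
face = rand(H, W);                     % stand-in for a grayscale face
nBlocks = (H - newH)/(newH - OL) + 1;  % = 52 with these values
blocks = cell(1, nBlocks);
for b = 1:nBlocks
    top = (b - 1)*(newH - OL) + 1;     % each patch moves down one pixel at a time
    blocks{b} = face(top:top + newH - 1, :);   % a newH-by-W block
end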
The system is put together using these algorithms and fronted by the graphical user interface, which specifies how the system outputs each task.
CHAPTER FIVE
Implementation
This chapter focuses on the implementation of the proposed system and how it has been developed iteratively using the DSDM methodology. It follows the structural modelling and architectural design of the requirements captured during the requirements analysis. A detailed description of the functionalities implemented, with the stages of how the prototype evolved to completion, is also discussed.
Technology Used:
The key algorithms are Viola-Jones for face detection and a Hidden Markov Model with SVD for recognition.
Existing implementations of the Viola-Jones algorithm are available for MATLAB, OpenCV and web browsers (using Adobe Flash).
Existing implementations of the Hidden Markov Model with SVD for face recognition are available in MATLAB, C++ and OpenCV libraries.
Hence, with advice from my supervisor, I chose MATLAB for the implementation of the Viola-Jones algorithm for face detection, due to its full implementation in the Computer Vision System Toolbox (MATLAB R2018a Documentation). Also, from the research in the literature review, different implementation strategies were considered and the Viola-Jones algorithm had the most impact compared to other algorithms used for face detection. According to the literature, the Viola-Jones algorithm performs with robustness and high accuracy.
With MATLAB already chosen as the implementation platform for face detection, it was natural to consider HMM with SVD for the implementation of face recognition. HMM has reduced complexity in training and recognition, better initial estimates of the model parameters can be obtained, and it works well with images exhibiting variations in lighting and facial expression (S. Sharavanan et al 2009, p.82). In addition, SVD coefficients are used as features instead of the gray values of the pixels in the sampling blocks. It also has the ability to integrate with OpenCV (Open Source Computer Vision).
Implementation of Face Detection:
As informed by the design section, the implementation uses the vision.CascadeObjectDetector in MATLAB R2018a, which detects objects using the Viola-Jones algorithm. "The cascade object detector uses the Viola-Jones algorithm to detect people's faces, noses, eyes, mouth, or upper body" (MATLAB R2018a Documentation). The Viola-Jones algorithm examines the image within a sliding window, matching dark and light regions to identify a face containing a mouth, eyes and nose. The window size varies to cover faces at different scales, with its aspect ratio unchanged. The cascade classifier in MATLAB determines the regions where a face can be detected. According to the MATLAB R2018a documentation, the stages in the cascade classifier are designed to rule out regions that do not contain a face in the initial stages (rejecting negative samples), saving time so that regions with potential faces can be analysed in the later stages.
Firstly, the input source of the image is implemented. This will be either the database or a webcam directly. If loading from the database, the functionality is implemented as shown in figure 5.1 below. Notice that the variable image has been set to global, so it can be accessed anywhere in the GUI. The "uigetfile" call on line 103 is a MATLAB function that will load images of any file type, and the "strcat" on line 108 appends the filename; the image is then read using the MATLAB R2018a "imread" function to extract the file from the path declared on line 103.
2018a functionality, which can be implemented with a simple menu choice as shown on line 14 of figure 5.2. The "webcam()" function starts the camera, and it is previewed by the "preview(cam)" function on line 9 of figure 5.2.
Figure 5.3: FaceDtect declared using the vision.CascadeObjectDetector, with parameters that influence the detection rate.
On line 381, we use MATLAB's vision library to run the Viola-Jones algorithm to detect faces by calling "BBox = step(FaceDtect, images)" to see whether any object is detected. The rectangular search region of interest is denoted by the variable "images" and processed by FaceDtect; each detection is a four-element vector [x y width height], given in pixels as the upper-left corner and size of the bounding box (MATLAB R2018a Documentation). This is also implemented in the real-time camera mode described in figure 5.2 above.
FrontalFaceCART: This is a classification model, given as a character vector, which detects upright, forward-facing faces. It is the default model of the algorithm (MATLAB Documentation R2018a).
MergeThreshold: This is the criterion used to define the final detection in an area where there are multiple detections around an object (MATLAB Documentation R2018a). The threshold is specified as an integer; varying it changes how detections are merged and influences the false detection rate during multiscale detection.
ScaleFactor: This controls the detection resolution between MinSize and MaxSize. The search region is scaled by the detector between MinSize and MaxSize, where MinSize is the size of the smallest detectable object, given as a two-element vector in pixels. Similarly, MaxSize is the size of the largest detectable object, in the same format as MinSize.
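The following MATLAB sketch (the parameter values and file name are illustrative, not the project's exact settings) shows how these parameters come together in a single detection call:

% Sketch: Viola-Jones detection with the parameters described above.
FaceDtect = vision.CascadeObjectDetector('FrontalFaceCART', ...
    'MergeThreshold', 4, 'ScaleFactor', 1.05, 'MinSize', [20 20]);
img  = imread('classroom.jpg');                 % hypothetical input image
BBox = step(FaceDtect, img);                    % M-by-4 matrix of [x y width height]
out  = insertObjectAnnotation(img, 'rectangle', BBox, 'Face');
figure, imshow(out);                            % display annotated detections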
Adjusting the Size of the Detections:
The BBox detections are returned as an M-by-4 matrix. The bounding box position can be adjusted as shown on line 385 in figure 5.4 to modify the bounding box values for each detected face. The values can be changed accordingly to adjust the rectangle's width and height, as on line 387 in figure 5.4. The number of bounding boxes can also be counted with the MATLAB command 'size', and annotations inserted with 'insertObjectAnnotation', as shown on line 388 in figure 5.4 below. Adjusting the size of the BBox was an additional implementation, spotted iteratively during testing, to crop the detected face image to a size that does not lose many pixels during resizing. This section of the code will be adjusted depending on the number of subjects in a group, as we do not know their positions in advance.
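A short sketch of this adjustment, continuing from the detection sketch above (the widening offsets and output file names are illustrative assumptions, not the project's values):

% Sketch: enlarge each bounding box before cropping, so the face keeps
% more pixels when it is later resized for recognition.
nFaces = size(BBox, 1);                         % one row per detected face
for k = 1:nFaces
    box = BBox(k, :) + [-5 -5 10 10];           % illustrative widening offsets
    faceImg = imcrop(img, box);                 % crop the enlarged region
    imwrite(faceImg, sprintf('face_%02d.jpg', k));  % hypothetical output names
end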
Figure 5.7: Declared Parameters to be used during the Training and Recognition
Figure 5.8 below shows how the folder of images for each student is accessed from the dataset. The dataset contents are declared in the variable "folderContents" as a cell matrix on line 33. A matrix in MATLAB is an array of a single data type, while a cell matrix is a matrix that can hold matrices of different types and formats. With the simple MATLAB command "dir", the contents of the dataset (database) can be retrieved. Before going through each folder in the dataset, we initialise an empty cell where this information will be stored; this is done on line 35. In order to have a valid index for each subject in the dataset, I have initialised the subject index (studentIndex) to zero, as on line 36. The size command is used on line 34 to obtain the number of folders in the dataset, declared as "numbofFolders". The for-loop from line 39 to 45 goes through all the folders of the dataset. The waitbar on line 40 reads the names of the folders while the loop goes through each one, and outputs the names in a pop-up box to the user. The first two rows in a directory listing are "." (the current directory) and "..", which refers to the parent directory (MATLAB R2018a Documentation). Inside the loop, the folder for each subject is accessed to read the list of files ending in the image format in which they were saved (e.g. 1.jpg). This is done with the same "dir" function, but this time with the name of the subject and the file format appended, as shown on line 50.
In each folder, the name can be accessed with the code "folderContents(student,1).name", shown as a parameter of the waitbar. This functionality is important, as it will be used later in the implementation to output the name of the input subject alongside the matched subject on the same plot. It is also important to output the name in the user's view to show that the code is being executed. A vector declared by the variable "uvt" contains 5 integers that match the integer values in the file names stored in each subject's folder. With these integer values, only the images to be used for training, and those for testing recognition, are accessed and processed respectively.
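A condensed MATLAB sketch of this traversal (the folder layout and the training indices in "uvt" are assumptions based on the description above):

% Sketch: walk a dataset where each subject has a folder of face images.
folderContents = dir('dataset');                       % hypothetical dataset root
folderContents = folderContents([folderContents.isdir] & ...
    ~ismember({folderContents.name}, {'.', '..'}));    % drop '.' and '..'
uvt = [1 3 5 7 9];                                     % assumed training indices
for student = 1:numel(folderContents)
    name  = folderContents(student).name;              % the subject's folder name
    files = dir(fullfile('dataset', name, '*.jpg'));   % this subject's images
    for t = uvt
        I = imread(fullfile('dataset', name, files(t).name));
        % ... pre-process, slice into blocks, compute SVD features ...
    end
end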
A minimum order-statistic filter (a non-linear spatial filter) is applied to the grayscale image. This filter reduces the amount of intensity variation by ranking the pixels in each neighbourhood, where the chosen rank is the order statistic. It is an image restoration technique that makes corrupted images closely resemble the original image (Kaur and Singh 2014). It has been implemented as shown on line 69 in figure 5.9. The results reported in the paper from the literature review were good partly due to the use of this filter.
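A one-line MATLAB sketch of such a filter using ordfilt2 from the Image Processing Toolbox (the 3-by-3 neighbourhood is an assumption; the project's window size may differ):

% Sketch: minimum order-statistic filter over a 3x3 neighbourhood.
grayFace = imread('face.pgm');                  % hypothetical grayscale face
filtered = ordfilt2(grayFace, 1, ones(3, 3));   % order 1 = minimum of each window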
The block matrix is then converted to double to represent a data matrix with a grayscale value for each pixel. Lines 77 to 78 compute the SVD of this matrix within the loop. The "blockcell" is a 52-by-5 cell matrix, with 52 rows corresponding to the number of blocks extracted from each image and 5 columns corresponding to the number of images to be trained. These values result from the image being resized to half of its original size, giving [56 46]; this increases computational speed, and the observation sequence is generated as explained in the design chapter. Each face of width 46 and height 56 is divided into overlapping blocks of newHeight = 5 and width = 46. The overlap size is OL = newHeight - 1. The number of blocks, as seen previously, is #Blocks = (H - newH)/(newH - OL) + 1, which gives #Blocks = (56 - 5)/(5 - 4) + 1 = 52.
The block extraction starts with the block parameters declared in figure 5.7. With these parameters declared, the extraction is implemented by the code from line 74 to line 82 of figure 5.9 above. This converts each face into a sequence of 52 elements, as mentioned earlier, stored in the 52-by-5 cell matrix. Note that 5 images are used for training, as declared by the vector of integers. How these are output to the command window is shown in Figure 5.10 below.
The values of U(1,1), S(1,1) and S(2,2), which were stored on line 84 in figure 5.9, are quantized into the discrete values mentioned above. These coefficients have maximum and minimum values which can be computed from all possible observed coefficients. First, the values stored in the second row of the database (studentDatabase), as in the code on line 84 of figure 5.9, are used to compute the minimum and maximum values of all observation vectors using a for-loop from line 96 to line 112, gathering them into separate matrices as shown. The quantized values are calculated with delta coefficients and stored in the third row of the database (studentDatabase). This is done on lines 117 to 138, as in figure 5.12. A study by Miar-Naimi and Davari (2008, p.6) has shown that "each quantized vector is associated with a label that here is an integer". Each block needs to store a discrete value (label), so each block of the image is assigned an integer. These integers are derived from the discrete level counts declared in figure 5.7, lines 21-23 (18, 10, 7), and the labels are stored, on line 137, in the fourth row of the database (studentDatabase). With all the labels stored as a cell matrix in the fourth row, they can be converted to a regular matrix using the MATLAB command "cell2mat", as shown on line 142 in figure 5.12.
Figure 5.12: Quantization data stored on the third row of database as in line 133.
In figure 5.12 above, following line 134, the maximum value for one discrete block label is 1260. The values computed for each subject are stored in the fourth row of the matrix, as on line 137 in the figure above. The fifth row is a cell matrix converted into a regular matrix of integer values that holds the observation sequence for each image.
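The label computation can be sketched in MATLAB as follows. The level counts 18, 10 and 7 follow the text, giving the maximum label of 1260 mentioned above; the coefficient values and observed ranges are made up for illustration:

% Sketch: quantize the three SVD coefficients of a block into one label.
levels = [18 10 7];                        % discrete levels per coefficient
coeffs = [12.4 8.1 2.3];                   % example [U(1,1) S(1,1) S(2,2)] values
cmin = [0 0 0];  cmax = [20 10 5];         % hypothetical observed ranges
q = zeros(1, 3);
for c = 1:3
    delta = (cmax(c) - cmin(c)) / levels(c);                  % bin width
    q(c) = min(floor((coeffs(c) - cmin(c)) / delta), levels(c) - 1);
end
label = q(1)*levels(2)*levels(3) + q(2)*levels(3) + q(3) + 1; % 1 .. 18*10*7 = 1260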
Training the Dataset:
The training of the HMM is done with the MATLAB function hmmtrain, as shown below. This gives estimates of the emission and transition probabilities for a Hidden Markov Model. First, initial probabilities are created as shown on line 13 below: TRANSGUES is an initial guess for the transition probabilities and EMISGUES is an initial guess for the emission probabilities. According to Miar-Naimi and Davari (2008, p.7), initial estimates of the probability matrices (TRANSGUES) have to be obtained. These are computed from the number of states of our model and all the possible observation symbols obtained during quantization. This probability matrix is used to estimate the final state probabilities of the HMM. On line 13 in the figure below, the initial guess is an n-by-n matrix of ones, where n is the number of states (7), since we are using the 7-state model. estTRANS and estEMIS are the final estimated probabilities; these are stored in a sixth row of the database, as shown on lines 32 and 33. These steps are iterated over all the training images, identified by the integers in the vector "uvt", for each subject in the dataset.
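A compact sketch of this training step; the uniform initial guesses and the row layout of studentDatabase follow the description above, but the exact indexing is an assumption:

% Sketch: estimate HMM probabilities for each subject with hmmtrain.
N = 7;                                     % states of the face model
M = 1260;                                  % possible observation labels
TRANSGUES = ones(N, N) / N;                % uniform initial transition guess
EMISGUES  = ones(N, M) / M;                % uniform initial emission guess
for s = 1:numSubjects                      % assumed subject count from the dataset
    seqs = studentDatabase{5, s};          % observation sequences for subject s
    [estTRANS, estEMIS] = hmmtrain(seqs, TRANSGUES, EMISGUES);
    studentDatabase{6, s} = {estTRANS, estEMIS};   % sixth row stores the estimates
end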
The system can be evaluated through the face recognition function, which returns the index of the recognised face in the database. The total number of correctly recognised faces can then be computed as a percentage of the total number of faces tested. First, the function goes through each folder and picks the images that were not used for training, i.e. those whose indices do not appear in the vector "uvt" of training-image indices. The total number of recognised subjects is held in the "recSubjects" variable of the function.
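The report does not reproduce the scoring call itself, so the following is only a plausible sketch: assuming each subject's trained matrices are stored as above, the recognised index is the subject whose HMM assigns the highest log-likelihood to the test sequence (using hmmdecode from the Statistics Toolbox):

% Sketch: recognise a test sequence by maximum log-likelihood over subjects.
best = -Inf;  matchIndex = 0;
for s = 1:numSubjects
    est = studentDatabase{6, s};                   % {estTRANS, estEMIS} from training
    [~, logp] = hmmdecode(testSeq, est{1}, est{2});
    if logp > best
        best = logp;  matchIndex = s;              % most likely subject so far
    end
end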
The recognition callback has been wired up with the same command. However, because the requirement was to display the input image side by side with the matched image, this has been done using the subplot and imshow commands, as shown in figure 5.21 below.
CHAPTER SIX
of image patches of any size that contain a face, defined as two eyes, a nose and a mouth. From this, there is always a risk of detecting false positives in a face detection system. Yi-Qing Wang (2014, p.1), in his analysis of the Viola-Jones algorithm, highlights that "no objective distribution can describe the actual probability for a given image to have a face". Therefore, in order to achieve acceptable performance, any face detection algorithm must minimise the false positive and false negative rates respectively.
This system uses the state-of-the-art Viola-Jones framework for face detection, which has been discussed in the literature review section of this paper. The Viola-Jones algorithm becomes highly efficient with the use of image scaling approaches and techniques to detect faces of different dimensions and at different angles to the viewer. The quality of the image was also taken into consideration whilst analysing the system.
The factors considered in this analysis are the cascade classification model (e.g. FrontalFaceCART) alongside three parameters of the vision.CascadeObjectDetector: the MergeThreshold level, the scale factor and the window size. Low-quality scaling of the integral image in this algorithm leads to a loss of features, which can directly affect the performance of the system.
Window Size: The window size consists of MinSize and MaxSize, which bound the sizes at which a face can be detected. For this system, to maximise accuracy, I have decided to use MinSize only, which sets the minimum size for a face, in order to include the very small faces at the back of the class, since the faces are of different sizes. The MinSize [height width] is greater than or equal to [20 20] for this system. Other sizes were tested during the implementation and iterative testing of the system before settling on this size.
ScaleFactor: The ScaleFactor determines the scale of the detection resolution between successive increments of the window size during scanning. This parameter helps decrease the number of false positives (a decrease in false positives results from an increase in scale factor).
MergeThreshold: This parameter controls the number of face detections and declares the final detection of a face area after combining and rejecting multiple detections around the object. The accuracy of the system depends on the MergeThreshold level: the higher its value, the lower the accuracy, and vice versa.
The algorithm has been analysed with images of different sizes, referred to as Image1, Image2 and Image3 respectively. These images were taken from different websites showing students in a classroom setting, in natural sitting positions, with face sizes ranging from 24x24 to 60x60 pixels. A textual description of the images is summarised in the table below, classified in order of difficulty.
Key:
nFaces = number of faces per image
nRows = number of rows per image
Image: Image1
Size: 930x620
nFaces: 29
nRows: 9
Classification: Easy
Description:
- All the students are facing the camera, with each face clearly visible.
- The sitting positions are not widely spaced out (can be seen as organised).
- The students are arranged in 8 rows and 6 columns, directly facing the camera.
- The sitting arrangement is queued in a quadrilateral form.
- There is minimal obstruction to the students' frontal faces.
- At least 5 students have facial hair.
- Older-looking students, in the age range 30 to 50.
- At least 7 students have glasses on.
- The face sizes range from 25x25 to 80x80.
Table 1: Textual Description of Images with Level of Difficulty for Face Detection.
In order to carry out the experiment, all three images were used. The image sizes were generated by resizing each image to ¾, ½ and ¼ of its original size respectively. The MergeThreshold values were obtained by running the experiment at MergeThreshold levels of 0, 1, 5, 7 and 10. The ScaleFactor values used were 1.05 and 1.25. The performance metrics used to analyse the impact of these techniques are True Positives (TP), False Positives (FP) and False Negatives (FN), where:
- TP is the number of faces correctly detected by the algorithm;
- FP is the number of non-faces falsely detected as faces;
- FN is the number of faces not detected.
The detection rate is derived from these counts, as sketched below.
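As a sketch of how these counts turn into the detection rates reported below (the example counts are chosen to reproduce the 96.55% figure quoted for Image1 at its original size):

% Sketch: detection rate from the counts defined above.
TP = 28;  FP = 3;  FN = 1;                 % illustrative counts for a 29-face image
detectionRate = TP / (TP + FN) * 100;      % = 96.55...% of actual faces detected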
First, Image1 is considered below, with the students sitting with their faces towards the front of the classroom.
Figure 1 of Image1: Students sitting at different positions from the first row to the last row. Face detection tested with no parameters specified (defaults used).
In this image, the image size has not been reduced. The default FrontalFaceCART model, a MergeThreshold of 4, a MinSize of [] and a ScaleFactor of 1.1 were used, and all 29 faces of the students were detected. The MergeThreshold level, face window size and ScaleFactor were left at their defaults. This shows how well the system can perform on a good-quality image with faces not too distant from the camera.
In figure 2 of Image1, the algorithm was run with the following parameters: MergeThreshold = 1, face window size = [20,20] and ScaleFactor = 1.05.
In this approach, a window of size [20,20] slides over the image, scanning it in different directions. After one complete scan, the scale factor (1.05) is applied to increase the detection resolution of the window. This process continues until the window size approximates the size of each face. The image size plays a big role during this process, as it determines the face sizes in the image; reducing the image size may lead to a loss of features. However, the detector still finds some false positives, as the algorithm searches for anything shaped like a face. Further experiments were carried out to demonstrate the performance of the algorithm; their results are summarised in the tables below.
Table Key:
MergeThreshold = MT.
Image sizes were produced with the MATLAB function imresize(I, scale), where I is the image.
ScaleFactor = SF.
True Positive = TP.
False Positive = FP.
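A sketch of the experimental loop behind these tables (the parameter grid follows the text, although the text runs the size sweep at a ScaleFactor of 1.05 and the threshold sweep at 1.25; this sketch combines them for brevity, and the file name is hypothetical):

% Sketch: count detections over image sizes and MergeThreshold levels.
img = imread('image1.jpg');                        % hypothetical test image
for scale = [1 0.75 0.5 0.25]                      % original, 3/4, 1/2 and 1/4 sizes
    J = imresize(img, scale);
    for MT = [0 1 5 7 10]                          % MergeThreshold levels tested
        det = vision.CascadeObjectDetector('FrontalFaceCART', ...
            'MergeThreshold', MT, 'ScaleFactor', 1.25, 'MinSize', [20 20]);
        BBox = step(det, J);
        fprintf('scale %.2f, MT %2d: %d detections\n', scale, MT, size(BBox, 1));
    end
end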
Table 3B above shows the evaluation of Image1, experimented with different image sizes, with the results displayed as shown. From the figures in this table, the system performs best at the original size, with a 96.55% detection rate. As the image is reduced to smaller sizes, the detection rate drops. A plot of the detection rate against image size is shown in Figure 3 below.
Table 3C above shows the analysis at different MergeThreshold levels, with a ScaleFactor of 1.25 and a window size of [20,20], keeping the original size of the image. The detection rate was 78.86% at a threshold level of 0. At this MergeThreshold level, a detected face is counted depending on the number of windows clustered on the face. The least number of windows for a face to be counted is three; the centres of the smaller windows must lie within the centre of the face, and the window with the highest detection confidence is considered. The number of detection windows is predicted to grow linearly with the size of the window scanned over a face.
Based on research by Yi-Qing Wang (2014), actual faces attract a high cluster of bounding boxes. As a consequence, faces with fewer than three bounding boxes are not counted, because the experiment was performed at a MergeThreshold level of zero. Objects with a high cluster of windows were therefore counted as faces and used to calculate the detection rate. Figure 4 of Image1 below shows face detection performed at a MergeThreshold level of zero.
Further experiments were carried out on Image2 and Image3 respectively, with the same parameters used to analyse Image1, and the following results were obtained, as shown in the figures and tables below.
Table 4A Image 2: Performance metrics at different image sizes (using imresize) of ¾, ½ and ¼ of the original image size, using the default parameters of the Viola-Jones face detection algorithm.
Table 5A Image 3: Performance metrics at different image sizes (using imresize) of ¾, ½ and ¼ of the original image size, using the default parameters of the Viola-Jones face detection algorithm.
Figure 10: Plot of Detection Rate Against Image Size for Image 3.
Figure 11: Plot of Detection Rate Against MergeThreshold level for Image 3
This shows that the default parameters only perform well on images with higher illumination, higher resolution, face sizes of at least 25x25, enough contrast on the faces, and most of the characteristics outlined for Image1 in Table 1 above.
Tables 3B, 4B and 5B show the performance metrics at different image sizes. The values were obtained by resizing the original sizes of Image1, Image2 and Image3 (930x620, 850x570 and 1200x630) respectively. Each image was scanned with a scale factor of 1.05 and a window size of [20 20]. In these tables (3B, 4B and 5B), the numbers of True Positives and False Positives, and the detection rates, drop as the image size decreases. In contrast, the number of False Negatives increases as the image size decreases. This is further illustrated by plotting the detection rate against image size, as shown in Figures 3, 7 and 10 respectively.
Tables 3C, 4C and 5C show the performance metrics at different MergeThreshold levels. Each threshold level was introduced in turn as the experiment was carried out, with the window size [20 20] scanned at a scale factor of 1.25. From these tables (3C, 4C and 5C), we can see that the numbers of True Positives and False Positives decrease as the MergeThreshold level is increased from 1 to 10, with no significant change at a MergeThreshold level of zero. The detection rate decreases, and the number of False Negatives increases, as the MergeThreshold level increases. To further illustrate this, plots of the detection rate against MergeThreshold level are shown in Figures 5, 8 and 11 respectively.
From the analysis above, based on the figures obtained, the face detection part of this system will use MergeThreshold = 1, ScaleFactor = 1.05 and a window size of [20 20], with FrontalFaceCART as the cascade. Assuming the lecturer asks the students to face the camera, and the students cooperate, the face detection performance of the system will be approximately 70%.
The faces that were not detected were either faces whose resolution the ScaleFactor could not correctly resolve, given the MinSize set for the system, or faces without enough contrast. Some faces are turned sideways at angles of approximately 70-90 degrees (orientation). Some faces appear in half profile, which makes it difficult to extract the features that allow the system to detect them as faces (head pose).
Some faces (approximately 18 in Image2 and Image3 combined) were blocked by colleagues sitting in front of them, or had an obstruction to the face (e.g. hands) which blocked the features needed for face detection.
Approximately 56 of the 88 faces in Image3, and approximately 12 of the 59 faces in Image2, are too far towards the back of the lecture hall, which can also be a resolution problem. The sizes of these faces are approximately [17 17], [16 16], [19 19] and [19 25], which are far below the minimum size specified by the system, and by MATLAB overall, for detection. Moreover, one can conclude that although all the images are of high resolution (930x620, 850x570, 1200x630), they will not meet the requirements for good algorithm performance if reduced markedly. A reduction in the image size leads to a reduction in the face sizes, which in turn leads to poor detection rates. In summary, for better performance, the image size should not be reduced below ¾ of its original size.
Furthermore, the difficulty of Image3, where facial features are difficult to see even from a human point of view, could be a further reason why the system could not detect more faces. However, it can be very difficult to judge image quality, as it is a contributing factor in the evaluation time needed to detect a face (illumination).
Conclusions for this part of the system have been reached based on the findings from the investigation of the different parameters of the Viola-Jones algorithm. The purpose of this part of the system is to detect faces and count the number of faces detected for attendance. The system automatically counts the total number of bounding boxes it detects as faces, which includes both True Positives and False Positives. In this regard, the headcount of the lecture room is obtained from the total number of bounding boxes detected by the system. This count can, however, be higher or lower than the true number due to False Positives and missed faces respectively. The lecturer will have to confirm this by counting the total number of students and comparing with the number obtained by the system.
To achieve a good register, the images of the lecture room will have to be taken in rows: the front three rows are photographed first, followed by subsequent groups of three rows, for detection and then for the recognition phase of the system. Recognition cannot be achieved from very small face detections, as the face size will be too small.
To achieve this with a class size of 80, a 16MP phone camera is recommended, giving a maximum image size of 4920x3264 pixels. The minimum requirements for the parameters of the Viola-Jones algorithm will be FrontalFaceCART, a MergeThreshold level of 1, a ScaleFactor of 1.05 and a window size (MinSize) of [20 20]. The assumption here is that the camera is positioned directly facing the students with respect to their sitting positions.
As part of this project, the practical conclusion for achieving the recognition phase is to use a professional camera which can give a high-resolution image with good illumination. With this, we reduce the class size to 8 rows of 40 students. The assumption here is that images of each row are taken with the students facing the camera, and that each student has a sitting position with no obstruction. This will enable the face detection part of the system to detect faces at a size that is useful for face recognition, and also to give an accurate headcount for the class register. Each row of students is considered in turn, and a photo of each set of students is taken for face detection. The face detection part of the system has been implemented to automatically crop the faces detected within a bounding box. The cropped faces are automatically stored in a folder and will be used as test images for recognition.
Analysis of the Face Recognition part of the system:
The face recognition part of this system uses a 7-state Hidden Markov Model for face recognition, as mentioned in the literature review. The HMM requires seven blocks of the image vector for a face. The seven segments of a face are shown in the figure below.
found. Whether the image is a correct match or a mismatch, the displayed name tells the user whether it is correct.
Figure 17: Match of Image from Database1 and Mismatch of the same image on the same Database.
Figure 18: Match of Image from Database1 and Mismatch of the same image on the same Database.
Figure 19: Match of Image from Database2 and Mismatch of the same image on the same Database.
Figure 20: Match of Image from Database2 and Mismatch of the same image on the same Database.
In order to analyse the system, five images per subject are used for training and the remaining images for testing. The following performance metrics have been taken into consideration: a True Positive (TP) here means correctly matching an individual whose photos are on the database, and a False Positive (FP) means wrongly matching an individual on the database. The recognition rate is calculated from the TP and FP counts obtained from the system. However, because the system extracts features in blocks, it has been analysed using the sizes [92 112], [84 69] and [56 46] respectively. This is because the blocks are extracted as a sequence of overlapping blocks, with a patch of face height (112) and face width (92) passed through the sliding window from top to bottom. The computational time varies depending on the size of the image; this can also have an impact on the recognition rate of the system.
Database  | Number of Subjects | Number of images tested | Image size (I, [p.face_height p.face_width]) | Recognition Rate (%)
Database1 | 52                 | 260                     | [112 92]                                     | 67.3077
Database1 | 52                 | 260                     | [69 84]                                      | 69.2308
Database1 | 52                 | 260                     | [46 56]                                      | 73.0769
Database2 | 40                 | 268                     | [112 92]                                     | 63.6364
This will, however, achieve good quality in resolution. All the faces in the image below are in Database1 of our datasets, which has 52 subjects with 10 images each. These faces are used in the face detection part of the system for detection, cropping and recognition.
The faces in the image above were detected and cropped by the face detection part of the system, as shown below. The cropped faces were stored in a folder, and with the system already trained, they were used for recognition.
Figure 23: Correct Match of cropped faces against faces of subjects on database.
The faces above show an 80% match for Jacques Chirac, a 70% match for Tony Blair and a 50% match for Angela Merkel.
From this, I can conclude that a class attendance system based on face detection and face recognition is achievable at this level. Further research and experiments will be carried out in future work to improve the system.
CHAPTER SEVEN
Conclusion
The entire project has been developed from the requirements to a complete system, alongside evaluation and testing. The system developed has achieved its aim and objectives, and the client was happy with its overall performance. Although some challenges were encountered during implementation, they were addressed. Future work and strategies for improving the system are discussed later in this chapter.
The Problems:
During the development of the system, some problems were encountered; they are discussed here. The experiments conducted after the implementation of the system required some changes. During the initial phase of the evaluation, the aim was to change the parameters of the Viola-Jones algorithm to meet the objectives. However, it became clear that meeting this objective with those parameters would cause the second part of the system to fail badly. The decision to set the parameters of this part of the system for a very small class size was driven by the failures observed in the recognition part of the system. The size of the image is very important in face recognition, as every pixel counts: resizing an image loses pixels, and this strongly degraded the performance of the system. The evaluation results showed the face detection part of the system performing at approximately a 70% detection rate, which is poor compared with that obtained by Viola and Jones (2001). I have also learned that face detection algorithms are sensitive to changes in illumination, image resolution and so on, as already mentioned, and that detection on images with uncontrolled backgrounds is still an area of ongoing research. The next challenge, for future work, is therefore to implement a system that can achieve high performance on such images.
The evaluation of the face recognition part of the system produced results that were not as expected. It showed that the recognition part can achieve approximately a 60% recognition rate, depending on the image resolution; with a poor image resolution, there is a high chance the system will fail. Furthermore, to achieve this result the user must make sure the image is of the resolution discussed in this report. Some people are not good at following instructions, which may lead to poor image quality and make the system perform poorly. However, when evaluating the system against images taken in controlled backgrounds (the Yale Database), there is a high chance of excellent results. Using images of faces in the wild to evaluate the system was a great idea suggested by my supervisor; it gave a rough idea of the performance rate, at approximately 70%. This differs from the similar system seen in the literature review, which performed at 97% on one of the datasets used in that research.
The choice of implementation platform may also have contributed to the poor performance, as I do not know the language well.
The algorithm used to determine the percentage probability generates a different percentage score each time. This is because the percentage expresses the extent to which the most likely state sequence agrees with the random sequence: the random sequence is that of the input image, and the most likely sequence is that of the output image. Although this does not change the overall recognition percentage, it was not possible to tell the user at what percentage they could decide that a face had been recognised. The workaround was to run the test on all five input images; with at least three matches, the user can confirm a face from the output matches displayed side by side.
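The sketch below, modelled on the MathWorks discrete-HMM example, shows where this run-to-run variation comes from: hmmgenerate draws a fresh random sequence on every call, so the agreement between that sequence's true states and the Viterbi decoding differs each time. The uniform TRANS and EMIS matrices are placeholders; in the real system they would come from training the 7-state model.

    % Sketch: percentage agreement between a random state sequence and the
    % most likely (Viterbi) sequence decoded from its observations.
    numStates  = 7;  numSymbols = 16;                  % assumed model dimensions
    TRANS = ones(numStates) / numStates;               % placeholder transition matrix
    EMIS  = ones(numStates, numSymbols) / numSymbols;  % placeholder emission matrix
    [seq, states] = hmmgenerate(100, TRANS, EMIS);     % random sequence + true states
    likelyStates  = hmmviterbi(seq, TRANS, EMIS);      % most likely state path
    agreement = 100 * sum(states == likelyStates) / numel(states);
    fprintf('Agreement: %.2f%%\n', agreement);         % differs on every run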
The performance of the system has affected its reliability. Because this is still an ongoing research area, the system will not be released for general use at the end of the project. However, it can be used for research purposes by the supervisor and trialled in a lecture room before approval.
It was not possible to deploy the system as a standalone application, as the MATLAB SDK Compiler was not available on a MATLAB version with the Computer Vision support packages.
Future Work and Improvement:
More detailed research is needed for a project such as this. The methods used could be combined with others to achieve better results; different methods have been implemented in the past, according to the literature review.
The use of HMM with other feature extraction methods could be implemented and tested. This would need more time, as any trial would have to take the existing methods into account in order to arrive at a genuinely new idea.
The system that has been delivered should only be used for experimental purposes, as it is not completely reliable.
Critical Review
Here, I will discuss the challenges faced during the implementation of the project.
Challenges:
The major challenge during the implementation of this project was learning Matlab from scratch. I started learning Matlab in the Computer Vision, Graphics and Image Processing course, but a timetable clash with my core module stopped me from continuing, so I later focused on YouTube tutorials on image processing, face detection and face recognition in MATLAB. Alternatives such as OpenCV and digiKam, which are written in C++, could have been explored, but were not, owing to time constraints, the nature of the project and other courses.
Agreeing the set of objectives with the client was not easy. Like many clients, mine was very ambitious, but in the end we settled on objectives that would lead to a workable solution.
Learning how to implement face detection in MATLAB was the first challenge, but with the help of MATLAB webinars and YouTube tutorials I was able to overcome it.
Another challenge was setting the bounding box to a size that would not be too small to resize to the size required for face recognition. Moreover, getting these bounding boxes into the right position relative to the positions of the subjects in the image was not easy. One way to handle the first issue is sketched below.
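The 10-pixel padding in this sketch is an assumed value rather than the setting used in the project.

    % Sketch: enlarge each detected bounding box before cropping, so the crop
    % is not too small to resize to the size required for recognition.
    img      = imread('classroom_row.jpg');            % hypothetical input photo
    detector = vision.CascadeObjectDetector();
    bboxes   = step(detector, img);
    pad      = 10;                                     % assumed padding in pixels
    for k = 1:size(bboxes, 1)
        bbox = bboxes(k, :);                           % [x y width height]
        bbox(1:2) = max(bbox(1:2) - pad, 1);           % move the top-left corner out
        bbox(3:4) = bbox(3:4) + 2 * pad;               % grow width and height
        face = imresize(imcrop(img, bbox), [112 92]);  % crop and resize for recognition
    end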
Learning how to implement face recognition using PCA with SVD was another challenge in its own right, but it served as a foundation. It gave me the enthusiasm to carry on with the project and to implement it with the Hidden Markov Model discussed in this report.
Implementing the Hidden Markov Model for face recognition was also not as easy as I had thought, but I finally managed with help from Mathworks.com. With reference to a similar solution, and to questions and answers from other researchers, I was able to carry on with the implementation.
Evaluation and testing were not easy tasks. I had to test the system manually and calculate the recognition percentage for each subject. Changing the parameters of the face detection part, and deciding on the final parameters for the system, was an even greater challenge.
Using GUIDE for the GUI has been a challenge, as the documentation has changed and some functionality has been removed completely compared with older versions.
Time management has not been great, as a result of weekly medical appointments due to ill health and disability. There has also been other coursework, with deadlines, carrying almost an equal amount of work. Furthermore, developing a real-life project as one of my module assessments has added to the time pressure, as I am the SCRUM master of that project.
BIBLIOGRAPHY
Alpaydin, E. (2014) Introduction to Machine Learning. 3rd edn. Cambridge, MA: The MIT Press.
Anthony, S. (2014) Facebook's facial recognition software is now as accurate as the human brain,
but what now?. Available at: http://www.extremetech.com/extreme/178777-facebooks-facial-
recognition-software-is-now-as-accurate-as-the-human-brain-but-what-now (Accessed:
09/01/2018).
Baseer, K. (2015) 'A Systematic Survey on Waterfall Vs. Agile Vs. Lean Process Paradigms', I-
Manager's Journal on Software Engineering, 9 (3), pp. 34-59.
Belaroussi, R. and Milgram, M. (2012) 'A comparative study on face detection and tracking
algorithms', Expert Systems with Applications, 39 (8), pp. 7158-7164.
Kavitha, C. R. and Thomas, S. M. (2011) 'Requirement Gathering for Small Projects using Agile Methods', IJCA Special Issue on Computational Science – New Dimensions & Perspectives, pp. 122-128.
Carro, R. C., Larios, J.-A., Huerta, E. B., Caporal, R. M. and Cruz, F. R. (2015) 'Face recognition
using SURF', Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial
Intelligence and Lecture Notes in Bioinformatics), 9225 pp. 316-326.
Castrillón, M., Déniz, O., Hernández, D. and Lorenzo, J. (2011) 'A comparison of face and facial
feature detectors based on the Viola–Jones general object detection framework', Machine Vision
and Applications, 22 (3), pp. 481-494.
Cheng, Y. F., Tang, H. and Chen, X. Q. (2014) 'Research and improvement on HMM-based face
recognition', Applied Mechanics and Materials, 490-491 pp. 1338-1341.
Da Costa, D. M. M., Peres, S. M., Lima, C. A. M. and Mustaro, P. (2015) Face recognition
using Support Vector Machine and multiscale directional image representation methods: A
comparative study. Killarney,Ireland. Neural Networks (IJCNN), 2015 International Joint
Conference on: IEEE.
Dagher, I., Hassanieh, J. and Younes, A. (2013) Face recognition using voting technique for the
Gabor and LDP features. Dallas, TX, USA. Neural Networks (IJCNN), The 2013 International
Joint Conference on: IEEE.
Feraud, R., Bernier, O., Viallet, J. E. and Collobert, M. (2000) 'A fast and accurate face detector for
indexation of face images', Automatic Face and Gesture Recognition, 2000.Proceedings.Fourth
IEEE International Conference on, pp. 77-82.
Hadizadeh, H. (2015) 'Multi-resolution local Gabor wavelets binary patterns for gray-scale texture
description', Pattern Recognition Letters, 65 pp. 163-169.
Hiremath, P. S. and Hiremath, M. (2014) '3D Face Recognition based on Radon Transform, PCA,
LDA using KNN and SVM', International Journal of Image, 6 (7), pp. 36-43.
Hjelmås, E. and Low, B. K. (2001) 'Face Detection: A Survey', Computer Vision and Image
Understanding, 83 (3), pp. 236-274.
Jadhav, D. V. and Holambe, R. S. (2010) 'Rotation, illumination invariant polynomial kernel Fisher
discriminant analysis using Radon and discrete cosine transforms based features for face
recognition', Pattern Recognition Letters, 31 (9), pp. 1002-1009.
Jafri, R. and Arabnia, H. (2009) 'A Survey of Face Recognition Techniques', Journal of
Information Processing Systems, 5 (2), pp. 41-68.
Jeong, G. and Choi, S. (2013) 'Performance evaluation of face recognition using feature feedback
over a number of Fisherfaces', IEEJ Transactions on Electrical and Electronic Engineering, 8 (6),
pp. 541-545.
Kashif, M., Deserno, T. M., Haak, D. and Jonas, S. (2016) 'Feature description with SIFT, SURF,
BRIEF, BRISK, or FREAK? A general question answered for bone age assessment', Computers in
Biology and Medicine, 68 pp. 67-75.
Leigh-Pollitt, P. (2001) The Data Protection Act Explained. 3rd edn. London: The Stationery Office.
Lemley, J., Bazrafkan, S. and Corcoran, P. (2017) 'Deep Learning for Consumer Devices and Services: Pushing the limits for machine learning, artificial intelligence, and computer vision', IEEE Consumer Electronics Magazine, 6 (2), pp. 48-56. doi: 10.1109/MCE.2016.2640698.
Lenc, L. and Král, P. (2014) 'Automatic face recognition approaches', Journal of Theoretical and
Applied Information Technology, 59 (3), pp. 759-769.
Li, C., Tan, Y., Wang, D. and Ma, P. (2017) 'Research on 3D face recognition method in cloud
environment based on semi supervised clustering algorithm', Multimedia Tools and Applications,
76 (16), pp. 17055-17073.
Li, S. Z., Xiao, R., Li, Z. Y. and Zhang, H. J. (2001) 'Nonlinear mapping from multi-view face patterns to a Gaussian distribution in a low dimensional space', Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, 2001. Proceedings. IEEE ICCV Workshop on, pp. 47-54. doi: 10.1109/RATFG.2001.938909.
Linna, M., Kannala, J. and Rahtu, E. (2015) 'Online face recognition system based on local binary
patterns and facial landmark tracking', Lecture Notes in Computer Science (Including Subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9386 pp. 403-414.
Lowe, D. G. (1999) 'Object recognition from local scale-invariant features', Computer Vision,
1999.the Proceedings of the Seventh IEEE International Conference on, 2 pp. 1150-1157.
Marciniak, T., Chmielewska, A., Weychan, R., Parzych, M. and Dabrowski, A. (2015) 'Influence
of low resolution of images on reliability of face detection and recognition', Multimedia Tools and
Applications, 74 (12), pp. 4329-4349.
Mathworks (2017) Detect objects using the Viola-Jones algorithm. Available at:
https://uk.mathworks.com/help/vision/ref/vision.cascadeobjectdetector-system-
object.html?s_tid=srchtitle (Accessed: 05/02/2018).
Chauhan, M. and Sakle, M. (2014) 'Study & Analysis of Different Face Detection Techniques', (IJCSIT) International Journal of Computer Science and Information Technologies, 5 (2), pp. 1615-1618.
Yang, M.-H., Kriegman, D. J. and Ahuja, N. (2002) 'Detecting faces in images: a survey',
Pattern Analysis and Machine Intelligence, IEEE Transactions on, 24 (1), pp. 34-58.
Modi, M. and Macwan, F. (2014) 'Face Detection Approaches: A Survey', International Journal
of Innovative Research in Science, Engineering and Technology, 3 (4), pp. 11107-11116.
Mohamed, A. S. S., Ying Weng, S. S., Ipson, S. S. and Jianmin Jiang, S. S. (2007) 'Face detection
based on skin color in image by neural networks', Intelligent and Advanced Systems, 2007.ICIAS
2007.International Conference on, pp. 779-783.
Ojala, T., Pietikainen, M. and Maenpaa, T. (2002) 'Multiresolution gray-scale and rotation invariant
texture classification with local binary patterns' Pattern Analysis and Machine Intelligence, IEEE
Transactions on, 24 (7), pp. 971-987. 10.1109/TPAMI.2002.1017623.
Orceyre, M. J. (1975) 'Data security', Journal of Chemical Information and Computer Sciences, 15
(1), pp. 11.
Parmar, D. N. and Mehta, B. B. (2013) 'Face Recognition Methods & Applications',
International Journal of Computer Technology & Applications, 4 (1), pp. 84-86.
Perveen, N., Ahmad, N., Abdul Qadoos Bilal Khan, M., Khalid, R. and Qadri, S. (2015) 'Facial
Expression Recognition Through Machine Learning', International Journal of Scientific &
Technology Research, 4 (8), pp. 91-97.
Postma, E. (2002) 'Review of Dynamic Vision: From Images to Face Recognition: S. Gong, S.
McKenna & A. Psarrou; Imperial College Press, 2000 (Book Review)', Cognitive Systems
Research, 3 (4), pp. 579-581.
Rowley, H. A., Baluja, S. and Kanade, T. (1998) 'Neural network-based face detection', Pattern
Analysis and Machine Intelligence, IEEE Transactions on, 20 (1), pp. 23-38.
Ryu, H., Chun, S. S. and Sull, S. (2006) 'Multiple classifiers approach for computational efficiency
in multi-scale search based face detection', Advances in Natural Computation, Pt 1, 4221 pp. 483-
492.
Saxena, V., Grover, S. and Joshi, S. (2008) 'A real time face tracking system using rank deficient
face detection and motion estimation' Cybernetic Intelligent Systems, 2008.CIS 2008.7th IEEE
International Conference on, pp. 1-6. 10.1109/UKRICIS.2008.4798956.
Lin, S.-H. (2000) 'An Introduction to Face Recognition Technology', Informing Science the
International Journal of an Emerging Transdiscipline, 3 (1), pp. 1-7.
Sharma, H., Saurav, S., Singh, S., Saini, A. K. and Saini, R. (2015) Analyzing impact of image
scaling algorithms on viola-jones face detection framework. Kochi, India. Advances in Computing,
Communications and Informatics (ICACCI), 2015 International Conference on: IEEE.
Shiwen, L., Feipeng, D. and Xing, D. (2015) A 3D face recognition method using region-based
extended local binary pattern. Quebec City, QC, Canada. Image Processing (ICIP), 2015 IEEE
International Conference on: IEEE.
Smith, S. M. (1995) 'ASSET-2: real-time motion segmentation and shape tracking' Computer
Vision, 1995.Proceedings., Fifth International Conference on, pp. 237-244.
10.1109/ICCV.1995.466780.
Sommerville, I. (2011) Software Engineering. 9th edn. USA: Pearson Education.
Sharavanan, S. et al. (2009) 'LDA Based Face Recognition by Using Hidden Markov Model in Current Trends', International Journal of Engineering and Technology, 1 (2), pp. 77-85.
Surayahani, S. and Masnani, M. (2010) 'Modeling Understanding Level Based Student Face
Classification' Mathematical/Analytical Modelling and Computer Simulation (AMS), 2010 Fourth
Asia International Conference on, pp. 271-275. 10.1109/AMS.2010.61.
Taigman, Y., Yang, M., Ranzato, M. and Wolf, L. (2014) 'DeepFace: Closing the gap to human-
level performance in face verification', Proceedings of the IEEE Computer Society Conference on
Computer Vision and Pattern Recognition, pp. 1701-1708.
Thai, L. H., Nguyen, N. D. T. and Hai, T. S. (2011) 'A Facial Expression Classification System
Integrating Canny, Principal Component Analysis and Artificial Neural Network', International
Journal of Machine Learning and Computing, 1 (4), pp. 388-393.
Venugopalan, J., Brown, C., Cheng, C., Stokes, T. H. and Wang, M. D. (2012) 'Activity and school attendance monitoring system for adolescents with sickle cell disease', Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012, pp. 2456. doi: 10.1109/EMBC.2012.6346461.
Vincent, P., Larochelle, H., Bengio, Y. and Manzagol, P. (2008) 'Extracting and composing robust features with denoising autoencoders', Proceedings of the 25th International Conference on Machine Learning, pp. 1096-1103. doi: 10.1145/1390156.1390294.
Viola, P. and Jones, M. (2001) 'Rapid object detection using a boosted cascade of simple features',
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern
Recognition, 1 pp. I511-I518.
Wang, K., Song, Z., Sheng, M., He, P. and Tang, Z. (2015) 'Modular Real-Time Face Detection
System', Annals of Data Science, 2 (3), pp. 317-333.
Wang, X., Cai, Y. and Abdulghafour, M. (2015) A comprehensive assessment system to optimize
the overlap in DCT-HMM for face recognition. Dubai, United Arab Emirates. Innovations in
Information Technology (IIT), 2015 11th International Conference on: IEEE.
Zhao, W. and Chellappa, R. (2005) Face Processing: Advanced Modeling and Methods.
Elsevier Science.
Yi, J., Yang, H. and Kim, Y. (2000) 'Enhanced fisherfaces for robust face recognition', Biologically
Motivated Computer Vision, Proceeding, 1811 pp. 502-511.
Wang, Y.-Q. (2014) 'An Analysis of the Viola-Jones Face Detection Algorithm', Image
Processing on Line, 4 pp. 128-148.
Appendix:
Task ID  Task                                             Start Date (DD-MMM)  Duration (Days)
1        Initiate Project Proposal                        09-Oct               22
3        Project Proposal Draft                           09-Oct               15
4        Project proposal feedback                        10-Oct               2
5        Review changes with Feedback                     11-Oct               4
6        Submit Draft report                              15-Oct               1
7        Prototyping and Initial Demo                     05-Nov               94
9        Install Matlab                                   12-Nov               1
10       Research                                         14-Nov               80
11       Implementation Face Detection                    15-Nov               10
12       Initial Demo                                     16-Nov               1
19       Analysis of Methodology                          17-Nov               2
20       Literature Review completed                      -                    1
21       Resource Implementation                          19-Oct               130
22       Develop and Refine functionalities               19-Oct               2
23       Review of various algorithms                     21-Oct               2
24       Decide on Design and algorithm                   23-Oct               1
25       MATLAB Implementation                            24-Oct               1
26       First Working Prototype                          -                    -
27       Review and Evaluation Face Detection             25-Nov               30
28       Identify Issues with Detection                   25-Nov               3
29       Implement Changes with Feedback                  28-Nov               2
30       Demo and Feedback                                30-Nov               3
31       Testing and Implementation                       03-Dec               5
32       Feedback from Supervisor                         -                    -
33       Review and Evaluation Face Detection             08-Jan               30
34       Feedback on Face Detection                       08-Jan               2
35       Implementation of Face Recognition               10-Jan               15
36       Feedback on Face Recognition                     25-Jan               2
37       Implement changes to Face Recognition            27-Jan               2
38       Incremental Implementation                       -                    -
39       Review work with Supervisor                      15-Feb               40
40       Analyse Requirements                             15-Feb               20
41       Make relevant adjustments                        07-Mar               2
42       Review progress with Supervisor                  09-Mar               2
43       Prioritise Functionalities of Face Recognition   11-Mar               3
44       Face Recognition Testing and Evaluation          14-Mar               2
-        Put System Functionalities Together              -                    -
45       Evaluation and Analysis of System                01-Mar               45
46       Identify issues with Supervisor                  01-Mar               4
47       Implement Changes to System                      05-Mar               2
-        -                                                07-Mar               2
(Gantt chart: the tasks listed above plotted as horizontal bars against the project timeline from 9-Oct onwards, with bar lengths showing durations of up to 200 days.)