Face Recognition Using Machine Learning Algorithm
E-ISSN: 2321-9637
International Conference on “Topical Transcends in Science, Technology and Management”
(ICTTSTM-2018)
Available online at www.ijrat.org
ABSTRACT
Face recognition has attracted considerable attention in security systems because of its fast and accurate results
and its non-intrusiveness. The work proposed here applies different feature extraction and classification methods
and compares their results. Local Binary Pattern Histogram (LBPH) features and Eigen features are used for
feature extraction, and a minimum distance classifier and an SVM classifier are used for classification. The
algorithm is implemented using OpenCV with Python and is tested on the FEI, ORL and our own MSRIT datasets.
Index terms: face recognition, SVM, PCA, LBPH, Euclidean distance
I. INTRODUCTION
Face recognition is a biometric validation technique that compares a face with the templates stored in a
dataset. Biometrics is the ability of a computer to recognize people based on their unique physical or
behavioral characteristics, and it is one of the fastest-evolving fields in advanced technology. Physical
characteristics are traits of the human body that remain largely stable with age; they are the basis of face
recognition, fingerprint recognition, iris recognition, palm print recognition, finger geometry, retina scanning
and DNA matching. Behavioral characteristics relate to the personal behavior of individuals and are the basis of
voice recognition, signature recognition, gait analysis and keystroke dynamics.
Face recognition is used extensively in biometrics because of its universality, user-friendliness,
non-intrusiveness and ease of management. Nowadays, systems that require heavy security are increasingly prone
to illicit entry and fraud. These problems can be addressed by verifying the actual identity of the person by
means of face recognition techniques. Over the years, numerous face recognition techniques have been developed.
Although face recognition systems have existed for decades, many challenges remain to be addressed, and the
topic continues to attract research interest.
The algorithm is implemented using the OpenCV library, which was developed to help build systems that involve
image processing. OpenCV has many built-in packages that assist in face detection and recognition, and it
carries out these tasks with less processing time and improved efficiency.
Figure 1 shows a block diagram of the face recognition system. It has two phases of operation: a training phase
and a test phase. In the training phase, features are extracted from preprocessed faces and stored in a database.
During the test phase, the same feature extraction is applied and a machine learning algorithm matches the test
face against the template faces stored in the database. The system includes the following operations in each phase:
(1) Face detection:
Face detection is the pre-processing step of the face recognition system. The Viola-Jones face detection
algorithm has been used. There are four key steps in this technique:
1. Haar-like features
2. Integral image
3. AdaBoost training
4. Cascading classifiers
1. Haar feature selection
All human faces share a few similar properties. Haar-like features detect differences between the dark and
light portions of an image, so these regularities can be represented using Haar features. A few properties
common to human faces are:
1. The eye region is darker than the upper cheeks.
International Journal of Research in Advent Technology, Special Issue, August 2018
2. The nose bridge region is brighter than the eyes.
Composition of properties forming matchable facial features:
Location and size: eyes, mouth, bridge of nose
Value: oriented gradients of pixel intensities
Given an image, we take a 24X24 window and apply each Haar feature to that window pixel by pixel. The value of
a Haar feature is calculated as
Value = Σ (pixels in black area) - Σ (pixels in white area)
This value measures the variation in brightness between the white and black rectangles over a specific area.
Each feature is related to a specific location in the sub-window.
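The feature value defined above can be illustrated with a minimal sketch. It assumes a horizontal two-rectangle feature whose left half is the black rectangle and whose right half is the white one; the function name `two_rect_haar_value`, the layout and the toy window are choices made for this example and are not taken from the implementation described in this paper.

```python
import numpy as np

def two_rect_haar_value(window, top, left, height, width):
    """Value of a horizontal two-rectangle Haar-like feature.

    The feature covers window[top:top+height, left:left+2*width]. The left
    half is treated as the black rectangle and the right half as the white
    one (an arbitrary choice for this sketch).
    """
    black = window[top:top + height, left:left + width]
    white = window[top:top + height, left + width:left + 2 * width]
    # Value = sum(pixels in black area) - sum(pixels in white area)
    return int(black.sum()) - int(white.sum())

# Toy 24x24 window: left half dark (10), right half bright (200).
window = np.full((24, 24), 200, dtype=np.int64)
window[:, :12] = 10

# The feature straddles the dark/bright boundary, so its magnitude is large.
edge_value = two_rect_haar_value(window, top=0, left=8, height=4, width=4)
# The feature lies entirely in the bright half, so the two sums cancel.
flat_value = two_rect_haar_value(window, top=0, left=12, height=4, width=4)
```

A large magnitude indicates a strong brightness contrast between the two rectangles at that location, while a value near zero indicates a uniform region.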
2. Integral images
The second step of the Viola-Jones face detection algorithm is to transform the input image into an integral
image. The integral image at location (x, y) contains the sum of the pixels above and to the left of (x, y).
This allows the sum of all pixels within any given rectangle to be calculated using only four values. These
values are the entries of the integral image that correspond to the corners of the rectangle in the input
image.
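The four-value rectangle sum can be sketched as follows. This is a minimal illustration, assuming NumPy and an integral image padded with one extra row and column of zeros so that no border cases are needed; the helper names `integral_image` and `rect_sum` are hypothetical.

```python
import numpy as np

def integral_image(img):
    """Integral image with an extra zero row and column at the top/left.

    ii[y, x] holds the sum of img[:y, :x], so rectangle sums need no
    special handling at the image border.
    """
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] from four lookups."""
    bottom, right = top + height, left + width
    return int(ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left])

img = np.arange(24 * 24, dtype=np.int64).reshape(24, 24)
ii = integral_image(img)
patch_total = rect_sum(ii, top=3, left=5, height=4, width=6)
```

Once the integral image is built, every rectangle sum costs the same four lookups regardless of rectangle size, which is what makes evaluating many Haar features per window affordable.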
3. AdaBoost machine learning method
The Viola-Jones algorithm uses a 24X24 window as the base window size to evaluate the features in an image. If
we consider all the feasible parameters of the Haar features, we need to evaluate 160,000+ features in a
given window. The basic idea is to exclude the redundant features that are not useful.
AdaBoost is a machine learning algorithm that selects only the most relevant features from the 160,000+
features. The selected features are then combined into a weighted arrangement that is used to evaluate and
decide whether any given window contains a face or not. These features are called weak classifiers.
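The weighted combination of weak classifiers can be sketched as a weighted vote. The sketch below covers only the final decision step, not AdaBoost training; it assumes each weak classifier is a threshold test (a decision stump) on one Haar feature value and that the window is accepted when the weighted vote reaches half of the total weight. The stump parameters and feature values are hand-picked, hypothetical numbers.

```python
def weak_classify(feature_value, threshold, polarity):
    """Decision stump on one Haar feature value: 1 = face-like, 0 = not."""
    return 1 if polarity * feature_value < polarity * threshold else 0

def strong_classify(feature_values, stumps):
    """Weighted vote of weak classifiers.

    stumps is a list of (feature_index, threshold, polarity, alpha) tuples,
    where alpha is the weight assigned to that weak classifier by boosting.
    """
    total = sum(alpha for _, _, _, alpha in stumps)
    score = sum(alpha * weak_classify(feature_values[i], thr, pol)
                for i, thr, pol, alpha in stumps)
    return 1 if score >= 0.5 * total else 0

# Hypothetical stumps and feature values, chosen by hand for illustration.
stumps = [(0, 50.0, 1, 0.9), (1, -20.0, -1, 0.6), (2, 10.0, 1, 0.3)]
face_like = strong_classify([30.0, -5.0, 40.0], stumps)
non_face = strong_classify([80.0, -60.0, 5.0], stumps)
```

Each weak classifier alone is only slightly informative; the weights let the more reliable ones dominate the final decision.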
4. Cascading classifiers
The cascading classifier is a sequence of stages, each of which contains a strong classifier. The job of each
stage is to determine whether a particular sub-window is definitely not a face or may be a face. A sub-window
classified as a possible face is passed to the next stage in the cascade. If a stage rejects it, we conclude
that there is no face, discard that sub-window and move on to the next sub-window.
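The early-rejection behaviour of the cascade can be sketched in a few lines. This is an illustrative skeleton only: each stage is assumed to be a callable that returns True ("may be a face") or False ("definitely not a face"), and the two toy stages operate on plain numbers that stand in for sub-windows.

```python
def run_cascade(window, stages):
    """Return True if every stage accepts the window, False on first rejection."""
    for stage in stages:
        if not stage(window):
            return False  # rejected: stop early and skip the remaining stages
    return True

def detect(windows, stages):
    """Keep only the sub-windows that survive the whole cascade."""
    return [w for w in windows if run_cascade(w, stages)]

# Hypothetical stages; `calls` records which stages actually ran.
calls = []

def cheap_stage(w):
    calls.append(("cheap", w))
    return w > 0

def costly_stage(w):
    calls.append(("costly", w))
    return w % 2 == 0

survivors = detect([-3, 4, 7], [cheap_stage, costly_stage])
```

Because most sub-windows in an image contain no face, rejecting them in the first, inexpensive stages means the later, costlier stages run on only a small fraction of the sub-windows.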
(2) FEATURE EXTRACTION
1. LOCAL BINARY PATTERN HISTOGRAM
Several methods exist for extracting features from the preprocessed images. One is the Local Binary Pattern
Histogram (LBPH). This was introduced by Ojala et al. and works by dividing an image into several small
regions from which the features are extracted.
The LBPH feature vector is computed as follows:
a. Divide the examined window into cells (e.g. 16x16 pixels for each cell).
b. For each pixel in a cell, compare the pixel to each of its 8 neighbors (on its left-top, left-middle,
left-bottom, right-top, etc.). Follow the pixels along a circle, i.e. clockwise or counter-clockwise.
c. When the center pixel's value is greater than the neighbor's value, assign "1". Otherwise, assign "0".
This gives an 8-digit binary number (which is usually converted to decimal).
d. Compute the histogram, over the cell, of the frequency of each "number" occurring (i.e., each
combination of which pixels are smaller and which are greater than the center).
e. Concatenate the histograms of all cells to obtain the feature vector for the window.
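The steps above can be sketched with NumPy as follows. This is a minimal illustration, not the OpenCV implementation used in this work: it assumes a 3x3 neighbourhood, non-overlapping 16x16 cells, and concatenation of the cell histograms into one vector as the final step. It follows the comparison rule as stated in the text (bit 1 when the centre is greater than the neighbour); many implementations threshold the other way round, and either convention works as long as it is applied consistently to training and test images.

```python
import numpy as np

# Neighbour offsets (dy, dx) in clockwise order starting at the top-left.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_codes(img):
    """8-bit LBP code for every interior pixel of a 2-D grayscale array."""
    img = np.asarray(img, dtype=np.int32)
    h, w = img.shape
    center = img[1:h - 1, 1:w - 1]
    codes = np.zeros_like(center)
    for dy, dx in OFFSETS:
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Rule as stated in the text: the bit is 1 when centre > neighbour.
        codes = (codes << 1) | (center > neighbour).astype(np.int32)
    return codes

def lbph_features(img, cell=16):
    """Concatenate the 256-bin LBP histograms of all cell x cell blocks."""
    codes = lbp_codes(img)
    hists = []
    for y in range(0, codes.shape[0] - cell + 1, cell):
        for x in range(0, codes.shape[1] - cell + 1, cell):
            block = codes[y:y + cell, x:x + cell]
            hists.append(np.bincount(block.ravel(), minlength=256))
    return np.concatenate(hists)

# A random 34x34 image gives a 32x32 code map, i.e. four 16x16 cells.
rng = np.random.default_rng(0)
features = lbph_features(rng.integers(0, 256, size=(34, 34)))
```

With four cells and 256 bins per cell, the resulting feature vector has 1024 entries; the vector length therefore grows with the number of cells into which the face is divided.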
2. EIGEN FEATURES (PCA)
i. Prepare the data: suppose we have M vectors of size N (= rows x columns of the image) representing a set of
images. The training set is then $\Gamma_1, \Gamma_2, \Gamma_3, \ldots, \Gamma_M$.
ii. Subtract the mean: the average face $\Psi$ is calculated and subtracted from each original face $\Gamma_i$,
and the result is stored in $\Phi_i = \Gamma_i - \Psi$.
iii. Determine the covariance matrix: the covariance matrix $B$ is calculated as $B = \Phi^{T}\Phi$, where the
columns of $\Phi$ are the vectors $\Phi_i$.
iv. Determine the eigenvectors and eigenvalues of the covariance matrix: the eigenvectors $X_i$ and the
eigenvalues $\lambda_i$ of $B$ are determined.
v. Determine the eigenfaces: $\Phi X_i = F_i$, where $X_i$ and $F_i$ are the eigenvectors and eigenfaces respectively.
vi. Classify the faces: the new image is transformed into its eigenface components, and the resulting weights
form the weight vector.
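Steps i to vi can be sketched with NumPy as shown below. This is an illustrative sketch under stated assumptions, not the OpenCV routine used in this work: the images are assumed to be flattened into the rows of an M x N array, only the k largest eigenvalues are kept (with k smaller than M), and each eigenface is normalised to unit length. The function names `train_eigenfaces` and `project` are hypothetical.

```python
import numpy as np

def train_eigenfaces(faces, k):
    """faces: array of shape (M, N), one flattened image per row; k < M."""
    gamma = np.asarray(faces, dtype=np.float64)
    psi = gamma.mean(axis=0)                 # step ii: mean face
    phi = (gamma - psi).T                    # N x M matrix of centred faces
    b = phi.T @ phi                          # step iii: M x M matrix B
    eigvals, x = np.linalg.eigh(b)           # step iv: eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    f = phi @ x[:, order]                    # step v: eigenfaces F_i = Phi X_i
    f /= np.linalg.norm(f, axis=0)           # unit-length eigenfaces
    return psi, f

def project(image, psi, f):
    """Step vi: weight vector of one flattened image in eigenface space."""
    return f.T @ (np.asarray(image, dtype=np.float64) - psi)

# Six random 8x8 "faces", flattened to 64 values each (toy data only).
rng = np.random.default_rng(0)
faces = rng.random((6, 64))
psi, eigenfaces = train_eigenfaces(faces, k=3)
weights = project(faces[0], psi, eigenfaces)
```

Working with the M x M matrix instead of the N x N one keeps the eigen-decomposition small when the number of training images is much lower than the number of pixels.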
(3) Matching or classification:
Face detection is performed first; features are then extracted from the detected face and fed to the
classifier. In this paper, we considered two types of classifier: support vector machines (SVM) and the
Euclidean distance classifier.
(I) SUPPORT VECTOR MACHINES
A support vector machine (SVM) is a supervised machine learning algorithm that can be used for
both classification and regression. SVMs are widely used for classification.
A distinct property of the SVM is that it simultaneously minimizes the classification error and maximizes the
margin. It is therefore known as a maximum margin classifier.
i. Binary classification: SVMs belong to the class of maximum margin classifiers. They discriminate
between two classes by determining the decision surface that has the maximum distance to the closest points in
the training set, which are called support vectors. A direct method for building a classifier for person A is
to feed the SVM algorithm a training set in which one class contains facial images of person A and the other
class contains facial images of other people.
Figure 5 illustrates that the optimal SVM hyperplane is the one with the maximum distance from the
nearest training patterns. The support vectors (solid dots) are those nearest patterns, at a distance b from
the hyperplane.
To estimate the margin, two parallel hyperplanes are built, one on each side of the separating hyperplane,
and pushed up against the two data sets. Intuitively, a good separation is achieved by the hyperplane that
has the largest distance to the nearest data points of both classes. The larger the margin or distance between
these parallel hyperplanes, the lower the generalization error of the classifier.
An SVM algorithm generates a linear decision surface, and the identity claim for image p is accepted if
$w \cdot p + b \le 0$,
otherwise the claim is rejected.
ii. Multi-class classification:
There are two fundamental strategies for solving an n-class problem with SVMs:
a. In the one-vs.-all approach, n SVMs are trained. Each SVM separates a single class from all the remaining
classes.
b. In the one-vs.-one approach, n(n-1)/2 SVMs are trained. Each SVM separates a pair of classes. The pairwise
classifiers are organized in trees, where each tree node represents an SVM.
We opted for the one-vs.-all strategy, where the number of SVMs is linear in the number of classes.
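The one-vs.-all decision can be sketched as follows, assuming that a linear SVM (w, b) has already been trained for each person. The weights below are hand-picked, hypothetical values, not trained models. The sketch keeps the sign convention stated above (a claim is accepted when w.p + b <= 0), so the predicted label is the one with the lowest decision value; with the opposite orientation, which many libraries use, the highest value would be taken instead.

```python
def decision_value(w, b, p):
    """Linear SVM output w.p + b for feature vector p."""
    return sum(wi * pi for wi, pi in zip(w, p)) + b

def one_vs_all_predict(models, p):
    """models maps a class label to the (w, b) of that class's SVM.

    Following the sign convention in the text (accept when w.p + b <= 0),
    the predicted label is the one with the lowest decision value.
    """
    return min(models, key=lambda label: decision_value(*models[label], p))

def svm_counts(n):
    """Number of binary SVMs needed: (one-vs.-all, one-vs.-one)."""
    return n, n * (n - 1) // 2

# Hypothetical hand-picked models for three people in a 2-D feature space.
models = {
    "A": ([1.0, 0.0], -1.0),
    "B": ([-1.0, 0.0], -1.0),
    "C": ([0.0, 1.0], 0.5),
}
label = one_vs_all_predict(models, [2.0, 0.0])
orl_counts = svm_counts(40)  # 40 persons, as in the ORL dataset
```

For 40 classes the one-vs.-all strategy needs 40 SVMs, whereas one-vs.-one would need 780, which illustrates why the number of classifiers is easier to manage with one-vs.-all.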
(II) EUCLIDEAN DISTANCE:
The Euclidean distance classifier computes the sum of squared differences (SSD) between the
corresponding histograms of the test image and the template image.
The smaller the distance, the more similar the test image and the template image.
After obtaining the distance between the test image and every image in the dataset, we classify the
test image according to the minimum distance obtained.
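The minimum distance rule can be sketched as a nearest-template search. This is a minimal illustration with made-up four-bin histograms and hypothetical labels. Since the SSD is the square of the Euclidean distance, both give the same ranking of templates.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_template(test_vec, templates):
    """templates: list of (label, feature_vector). Returns (label, distance)."""
    label, vec = min(templates, key=lambda t: ssd(test_vec, t[1]))
    return label, ssd(test_vec, vec)

# Toy templates: two stored histograms for person_1 and one for person_2.
templates = [
    ("person_1", [4, 0, 2, 1]),
    ("person_2", [1, 3, 0, 5]),
    ("person_1", [3, 0, 2, 1]),
]
label, dist = nearest_template([3, 0, 2, 2], templates)
```

The test image receives the label of the single closest template, so the classifier needs no training beyond storing the feature vectors.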
DATASETS USED
FEI dataset: 48 persons, where each image is of size 112X92
ORL dataset: 40 persons, where each image is of size 128X128
MSRIT (our own acquired) dataset: 100 persons, where each image is of size 128X128
Each dataset contains 10 samples per person.
In total, we considered 1880 images ((48 + 40 + 100) x 10) for experimental purposes.
VII. Simulation and Experimental Results:
For each face, the Viola-Jones algorithm is applied to detect only the face area. From the detected face,
Eigen features and LBPH features are extracted, and the features are then fed to the classifier.
Feature vector sizes for the databases are listed below:
Each dataset contains 10 samples per person. The training samples are selected randomly, their number is
varied, and the recognition rate is calculated each time. The recognition rate is defined as the ratio of the
number of test samples correctly recognized to the total number of test samples considered. Figure 6 shows the
graph of recognition rate for all three databases. We have considered a 6:4 train:test ratio for simulation purposes.
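The evaluation protocol can be sketched as follows. This is an illustrative sketch only: the fixed seed, the function names and the toy labels are assumptions made for the example and do not reproduce the experiments reported here.

```python
import random

def split_samples(samples_per_person=10, n_train=6, seed=0):
    """Randomly pick sample indices for training; the rest are for testing."""
    rng = random.Random(seed)
    indices = list(range(samples_per_person))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])

def recognition_rate(true_labels, predicted_labels):
    """Correctly recognized test samples divided by all test samples."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# 6:4 split of the 10 samples of one person, and a toy set of predictions.
train_idx, test_idx = split_samples()
rate = recognition_rate(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Varying `n_train` and repeating the split with different seeds gives one recognition rate per setting, which is the quantity plotted against the number of training samples.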