Alamaldin Face Detection
Abstract
In recent years, a lot of effort has been made in the field of face
detection. The human face contains important features that can be used by
vision-based automated systems in order to identify and recognize
individuals. Face location, the primary step of the vision-based automated
systems, finds the face area in the input image. An accurate location of
the face is still a challenging task. The Viola-Jones framework has been widely used by researchers to detect the location of faces and objects in a given image. Face detection classifiers are shared by open-source projects such as OpenCV, and an evaluation of these classifiers helps researchers choose the best classifier for their particular needs. This work focuses on face detection using the Haar cascade.
1) Introduction
Face detection is a computer vision technology that involves
identifying and locating human faces in digital images or video frames. It
plays a crucial role in various applications, ranging from photography and
video surveillance to facial recognition and augmented reality. The
primary goal of face detection is to locate and extract facial features
within an image or a video stream. Although recognizing an individual
by the face is an easy task for humans, it is a challenge for vision-based
automated systems. It has been an active research area involving several
disciplines such as image processing, neural networks, statistics, pattern
recognition, anthropometry and computer vision. Vision-based automated
systems can apply facial recognition and facial identification in numerous
commercial applications, such as biometric authentication, human-
computer interaction, surveillance, games and multimedia entertainment.
Unlike other biometrics, face recognition is non-invasive, and does not
need physical contact of the individual with the system, making it a very
acceptable biometric. Vision-based automated systems applied to face
recognition can be divided into 4 steps: face detection, image pre-
processing, feature
extraction and matching [1]. Face detection is a hard task, since faces form a similar class of objects and their features, such as eyes, mouth, nose and chin, have, in general, the same geometrical configuration. The
captured image of the face may be pre-processed to overcome
illumination variations [2]. Feature extraction is the process in which a geometrical or vectorial model is obtained by gathering important characteristics present in the face. Feature extraction can be divided
into 3 approaches: holistic, feature-based and hybrid. Principal
component analysis [3] [4], fisher discriminant analysis [5] [6] and
support vector machine [7] are examples of the holistic approach. The feature-based approach relies on the geometrical relations of the facial features. [8] applied an active shape model, gathering important information present in some of the facial features. Statistical classifiers such as the Euclidean distance [9], Bayes classifier [10], Mahalanobis distance [11] and
neural classifiers [12] can be used to compare the characteristic vector
with other classes (individuals) in the
matching step. Face detection has been improved in terms of speed with the application of Haar-like features through the contribution of the Viola-Jones object detection framework. Implementations of this framework, such as OpenCV, provide different face classifiers created by authors who used different datasets in their training. The performance and reliability of these classifiers vary considerably.
Face detection
Face detection is a technology that can identify a person's face in pictures
or videos. It has become increasingly important for security purposes,
including legal requirements and global security. There are various
algorithms used for face detection, such as the Haar cascade and Local
Binary Pattern (LBP) algorithms. These algorithms use different
techniques to extract facial features and classify faces based on their
positions. In the context of the COVID-19 pandemic, there has been a
focus on detecting masked faces, which presents additional challenges for
face detection algorithms. Studies have compared the performance of
different algorithms, and it has been found that the Haar cascade classifier
outperforms the LBP classifier [24] [25]. However, there is still a lack of
evidence regarding how well existing face detection algorithms perform
on masked faces [26] [27].
Haar-like features
In the early 20th century the Hungarian mathematician Alfréd Haar introduced the concept of Haar wavelets, a sequence of rescaled "square-shaped" functions which together form a wavelet family or basis. Viola and Jones adapted the idea of Haar wavelets and developed the so-called Haar-like features.
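For reference, the standard definition of the Haar mother wavelet (not spelled out in the text above) is the square-shaped function

\psi(t) = \begin{cases} 1, & 0 \le t < \tfrac{1}{2} \\ -1, & \tfrac{1}{2} \le t < 1 \\ 0, & \text{otherwise} \end{cases}

and its rescaled, shifted copies \psi_{n,k}(t) = 2^{n/2}\,\psi(2^{n}t - k) form the wavelet basis mentioned above. The black and white rectangles of a Haar-like feature mimic this +1/-1 square shape over image regions.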
Haar-like features have been used in a variety of research fields. They have
been applied in the development of frameworks for automatic gun
detection using CCTV images [29]. Haar-like features have also been
used in the detection of human faces for tracking purposes in unmanned
aerial vehicles (UAVs) [30]. Additionally, Haar-like features have been
utilized in the detection of wind turbine blade cracks from images [31]. In
the field of advertising, Haar-like features have been employed in
augmented reality (AR) technology for marker detection and car
specification presentation [32]. Furthermore, Haar-like features have been
used in face detection systems for biometric research, face recognition,
and identification [33]. These applications demonstrate the versatility and
effectiveness of Haar-like features in various domains.
The typical cascade classifier is the very successful method of Viola and Jones for face detection [22] [23]. Generally, many detection tasks for objects with a rigid structure can be addressed by means of this method, not only face detection. The cascade classifier is a tree-based technique, in which Viola and Jones used Haar-like features for human face detection. The default Haar-like features are shown in Figure 1; they can be applied at all scales in the boosted classifier and can be rapidly computed from an integral version of the image being scanned.
D. Landmarks
Landmark detection is important not only to generate a geometric face
model, but also can be used for face detection [17]. [18] compared
different algorithms for facial landmark localization and proposed a set of
tools that ease the integration of other face databases. [19] proposed a
technique for face segmentation using Active Shape Model based on
border landmarks of the face. [20] used a facial geometrical model based
on the distance between the eyes to estimate the position of other landmarks for face segmentation, as shown in Fig. 3.
The FGnet project has published the locations of 22 facial features for each face of the AR face database [21]. We also manually marked the same 22 facial feature points on the Yale and FEI face database images used in this work. Fig. 4 shows an image with the marked facial points. In total, 565 images were used and, for each one of the 22 landmarks, a score was given (see the Landmarks and Scores table). The scores were either 1 or 2. The landmarks located on the contour of the face were given the
highest score. The application of the scores will be explained in the next
section.
The Viola Jones algorithm has four main steps, which we shall discuss in
the sections to follow:
Selecting Haar-like features
Creating an integral image
Running AdaBoost training
Creating classifier cascades
Edge features and line features are useful for detecting edges and lines, respectively. The four-rectangle features are used for finding diagonal structures.
The value of the feature is calculated as a single number: the sum of pixel
values in the black area minus the sum of pixel values in the white area.
The value is zero for a plain surface in which all the pixels have the same value, and thus provides no useful information.
Since faces are complex shapes with darker and brighter regions, a Haar-like feature gives a large value when the areas under the black and white rectangles are very different. Using this value, we extract a piece of useful information from the image.
For example, when we apply such a Haar-like feature to the bridge of the nose, we get a strong response. We then combine many of these features to decide whether an image region contains a human face.
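As a minimal sketch of this computation (assuming MATLAB with the Image Processing Toolbox, the toolbox demo image visionteam.jpg, and an arbitrarily chosen 24x24 window at the top-left corner; none of these choices come from the text above), a two-rectangle "edge" feature can be evaluated directly on the pixels:

I = im2double(rgb2gray(imread('visionteam.jpg')));   % example grayscale image
r = 1; c = 1; h = 24; w = 24;                        % assumed window position and size
top    = I(r : r+h/2-1,   c : c+w-1);                % upper ("white") rectangle
bottom = I(r+h/2 : r+h-1, c : c+w-1);                % lower ("black") rectangle
featureValue = sum(bottom(:)) - sum(top(:));         % black-area sum minus white-area sum

A value near zero means the region is roughly uniform; a large magnitude means the two halves differ strongly, as over the bridge of the nose or the eye region.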
Integral Images
To calculate a value for each feature, we need to perform computations on all the pixels inside that particular feature. In practice, these calculations can be very intensive, since the number of pixels is much greater when we are dealing with a large feature.
The integral image plays its part in allowing us to perform these intensive calculations quickly, so we can understand whether one feature or several features fit the criteria.
An integral image (also known as a summed-area table) is the name of
both a data structure and an algorithm used to obtain this data structure. It
is used as a quick and efficient way to calculate the sum of pixel values in
an image or rectangular part of an image.
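A minimal sketch of the idea (again assuming MATLAB and the visionteam.jpg demo image; the variable names and the example rectangle are illustrative only) is to build the summed-area table with two cumulative sums and then obtain any rectangle sum from four look-ups:

I  = im2double(rgb2gray(imread('visionteam.jpg')));  % example grayscale image
ii = zeros(size(I) + 1);                             % zero row/column in front avoids edge checks
ii(2:end, 2:end) = cumsum(cumsum(I, 1), 2);          % ii(r+1,c+1) = sum of I(1:r, 1:c)
r1 = 10; c1 = 20; r2 = 33; c2 = 43;                  % an example 24x24 region
rectSum = ii(r2+1, c2+1) - ii(r1, c2+1) - ii(r2+1, c1) + ii(r1, c1);

Because every rectangle sum costs only four array accesses, the value of any Haar-like feature can be computed in constant time regardless of its size.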
During AdaBoost training, each Haar-like feature acts as a weak classifier, and the weak classifiers that perform well are given higher importance or weight. The final result is a strong classifier, also called a boosted classifier, that combines the best-performing weak classifiers.
When we train AdaBoost to identify important features, we feed it labeled training images and it learns which features best separate faces from non-faces. Ultimately, the algorithm sets a minimum threshold that determines whether something can be classified as a useful feature or not.
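For concreteness, the standard boosted form used by Viola and Jones (summarized here from the original framework rather than from the text above) combines the T selected weak classifiers h_t, each of which thresholds a single Haar-like feature, into

C(x) = \begin{cases} 1, & \sum_{t=1}^{T} \alpha_t h_t(x) \ge \tfrac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0, & \text{otherwise} \end{cases} \qquad \alpha_t = \log\frac{1-\epsilon_t}{\epsilon_t}

where \epsilon_t is the weighted training error of h_t, so weak classifiers with lower error receive larger weights \alpha_t.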
Cascading Classifiers
Even if AdaBoost finally selects the best features, say around 2,500 of them, it is still a time-consuming process to calculate these features for each region. We slide a 24×24 window over the input image, and we need to find whether any of those regions contain a face. The job of the cascade is to quickly discard non-faces and avoid wasting precious time and computation, thus achieving the speed necessary for real-time face detection.
When a subregion gets a "maybe" from a stage, it is sent to the next stage of the cascade, and the process continues in this way until we reach the last stage.
How does this increase our speed? If the first stage gives a negative evaluation, the window is immediately discarded as not containing a human face. If it passes the first stage but fails the second stage, it is discarded as well. In short, a window can be discarded at any stage of the classifier, and only windows that pass every stage are reported as faces.
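The early-rejection logic can be sketched as follows (an illustrative snippet, not the paper's implementation: window is assumed to be a candidate 24x24 region and stageClassifiers a cell array of function handles, one per stage, each returning true when the region passes that stage):

isFace = true;
for k = 1:numel(stageClassifiers)
    if ~stageClassifiers{k}(window)   % a single negative stage rejects the window
        isFace = false;               % later, more expensive stages are never run
        break;
    end
end

Only windows for which isFace remains true after the final stage are reported as faces.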
Code
% Assumed setup (not in the original snippet): create the detector and read a frame
faceDetector = vision.CascadeObjectDetector();   % Viola-Jones (Haar cascade) detector
the_Image = imread('visionteam.jpg');            % example frame from the toolbox
width = size(the_Image, 2);                      % frame width in pixels
if width > 320                                   % downscale wide frames for speed
    the_Image = imresize(the_Image, [NaN 320]);  % NaN keeps the aspect ratio
end
% finding the bounding boxes that enclose the faces on the video frame
face_Location = step(faceDetector, the_Image);
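The detector returns one row per detected face in [x y width height] format. The boxes can then be drawn on the frame, for example with insertShape(the_Image, 'Rectangle', face_Location) from the same toolbox; this display step is an assumption about how the rest of the script uses face_Location and is not part of the original snippet.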
4) Conclusion
In conclusion, using the Haar Cascade for face detection has both
advantages and limitations. Haar Cascade classifiers are appreciated for
their speed and resource efficiency, making them suitable for real-time applications on hardware with limited computational power. They perform well in
controlled environments with consistent lighting and well-posed faces,
and they are robust against overfitting.
However, Haar Cascade classifiers may fall short in accuracy when faced
with complex backgrounds, occlusions, or variations in facial expressions
and poses. Their training process can be complex, requiring a substantial
amount of diverse image samples.
Additionally, Haar Cascade classifiers are less effective for detecting
small faces or faces at a distance. Their performance tends to degrade as
the size of the face decreases.
In summary, Haar Cascade is a practical choice for certain applications,
especially those prioritizing speed and efficiency in controlled
environments. For more demanding applications that require higher
accuracy and robustness in handling diverse conditions, alternative
methods like deep learning approaches using Convolutional Neural
Networks (CNNs) may be more suitable. The choice depends on the
specific needs and constraints of the application.
5) REFERENCES