Final Report On Face Recognition
A Mid-Term Project Report Submitted in partial fulfillment of the requirements for the Degree of Bachelor of Technology under Biju Pattnaik University of Technology. Project ID: 11079. Submitted By
2010-2011
i. ABSTRACT
Generic face recognition systems identify a subject by comparing the subject's image to images in an existing face database. These systems are very useful in forensics for criminal identification and in security for biometric authentication, but they are constrained by the availability and quality of subject images. In this project, we propose a novel system that uses descriptive, non-visual human input of facial features to perform face recognition without the need for a reference image for comparison. Our system maps images in an existing database to a fourteen-dimensional descriptive feature space and compares input feature descriptions to images in that feature space. The system clusters database images in feature space using feature-weighted K-means clustering, which offers a computational speedup while searching feature space for matching images. We are working in the MATLAB environment. Our system has four modules: 1) convert descriptive features to numeric data; 2) extract fourteen features from each face image in the face database; 3) compare the input features with the extracted features of a face; 4) display the best-matching image.
ii. ACKNOWLEDGEMENT
It is our proud privilege to epitomize our deepest sense of gratitude and indebtedness to our guide, Mr. SOURAV PRAMANIK, for his valuable guidance, keen support, intuitive ideas and persistent endeavor. His inspiring assistance and affectionate care enabled us to complete our work smoothly and successfully. We are also thankful to Mr. SWADHIN MISHRA, B.Tech Project Coordinator, for giving us valuable time and support during the presentation of the project. We acknowledge with immense pleasure the interest, encouraging attitude and constant inspiration rendered by Dr. Ajit Kumar Panda, Dean, N.I.S.T., and Prof. Sangram Mudali, Director, N.I.S.T. Their continued drive for better quality in everything that happens at N.I.S.T. and their selfless inspiration have always helped us to move ahead. We can never forget to thank our family and friends for taking the pain of helping us and understanding us at any hour during the completion of the project. Lastly, we bow our gratitude to the omnipresent Almighty for all his kindness. We still seek his blessings to proceed further.
1. INTRODUCTION
The face recognition problem involves searching an existing face database for a face, given a description of the face as input. The face identification problem is one of accepting or rejecting a person's claimed identity by searching an existing face database to validate the input data. Many databases for face identification and recognition have been built and are now widely used. However, most systems developed in the past are constrained by images being the primary, and often the only, form of input data. In cases where images are not available as sample input, it is not possible for such systems to perform face recognition. Our system uses general facial descriptions as input to retrieve images from a database. Users may utilize this system to identify images by entering only general descriptions, removing the constraint of input images for face recognition and identification purposes. Our system formalizes subjective human descriptions into discrete feature values and associates seven descriptive and seven geometric features with face images. The seven discretized geometric features combine with the seven descriptive features to form a composite fourteen-dimensional feature set for our system. Similar images are clustered in feature space using weighted K-means clustering. User input, in the form of facial descriptions, maps directly to the fourteen-dimensional descriptive feature space. The input description is then compared iteratively to the three closest clusters of images in feature space to check for matches, and a set of prospective matches is identified and returned. Our approach draws inspiration from the fact that humans describe faces using abstract and often subjective feature measures such as the shape of the face, the color of the skin, the hair color, etc. [3]. These semantic descriptions, supplied by humans, are immune to picture quality and other effects that reduce the efficiency of contemporary face recognition and identification algorithms. We will identify the possible facial features that may lead to better recognition [5] while arriving at our present feature set. We will implement all of this in the MATLAB programming language.
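To make the matching step concrete, the following is a minimal MATLAB sketch of how a descriptive input could be encoded as a fourteen-dimensional vector and compared against cluster centroids. The feature names, encodings, weights and placeholder centroids are illustrative assumptions, not the project's final design.

% Minimal sketch (illustrative assumptions): encode a description as a
% 14-dimensional vector and rank cluster centroids by weighted distance.
faceShape = containers.Map({'oval','round','square'}, {1, 2, 3});
skinTone  = containers.Map({'fair','medium','dark'},  {1, 2, 3});

input = zeros(1, 14);          % remaining features left at a neutral 0 for brevity
input(1) = faceShape('oval');  % descriptive feature 1: face shape
input(2) = skinTone('fair');   % descriptive feature 2: skin tone

k = 5;
centroids = rand(k, 14);       % placeholder for centroids from weighted K-means
w = ones(1, 14);               % placeholder feature weights

d = sqrt(sum(((centroids - repmat(input, k, 1)).^2) .* repmat(w, k, 1), 2));
[~, order] = sort(d);
nearestThree = order(1:3);     % the three closest clusters, searched first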
2. LITERATURE REVIEW
For every unknown person, it is his or her face that draws our attention most; the face is the most important visual identity of a human being. For that reason, face recognition has been an important research problem spanning numerous fields and disciplines. This is because face recognition, in addition to having numerous practical applications such as bankcard identification, access control, mug-shot searching, security monitoring and surveillance systems, is a fundamental human behaviour that is essential for effective communication and interaction among people. A formal method of classifying faces was first proposed in [5]. The author proposed collecting facial profiles as curves, finding their norm, and then classifying other profiles by their deviations from the norm. This classification is multi-modal, i.e. it results in a vector of independent measures that can be compared with other vectors in a database. Progress has advanced to the point that face recognition systems are being demonstrated in real-world settings [2]. The rapid development of face recognition is due to a combination of factors: active development of algorithms, the availability of large databases of facial images, and methods for evaluating the performance of face recognition algorithms. The problem of face recognition can be stated as follows: given still images or video of a scene, identify one or more persons in the scene by using a stored database of faces [1]. The problem is mainly a classification problem: training the face recognition system with images of the known individuals and classifying newly arriving test images into one of the classes is the main aspect of face recognition systems.
1. Eigenfaces:
Eigenfaces is one of the most thoroughly investigated approaches to face recognition. It is also known as the Karhunen-Loève expansion, eigenpicture, eigenvector and principal-component approach. References [2, 3] used principal component analysis to efficiently represent pictures of faces. They argued that any face image could be approximately reconstructed from a small collection of weights for each face and a standard face picture (eigenpicture). The weights describing each face are obtained by projecting the face image onto the eigenpictures. There is also substantial related work in multimodal biometrics: for example, some systems have combined face and fingerprint in multimodal biometric identification, and others have combined face and voice. However, the use of face and ear in combination seems more relevant to surveillance applications.
2. Neural Networks:
The attractiveness of using neural networks could be due to the nonlinearity of the network. Hence, the feature extraction step may be more efficient than in the eigenface method. One of the first artificial neural network (ANN) techniques used for face recognition was a single-layer adaptive network called WISARD, which contains a separate network for each stored individual. The way in which the neural network structure is constructed is crucial for successful recognition, but this approach does not scale well to larger numbers of persons: as the number of persons increases, the computing expense becomes more demanding. In general, neural network approaches encounter problems when the number of classes (i.e., individuals) increases. Moreover, they are not suitable for a single-model-image recognition test, because multiple model images per person are necessary in order to train the system to an optimal parameter setting.
3. Graph Matching:
Graph matching is another approach to face recognition. One reference presented a dynamic link structure for distortion-invariant object recognition that employed elastic graph matching to find the closest stored graph. Dynamic link architecture is an extension of classical artificial neural networks. Memorized objects are represented by sparse graphs whose vertices are labeled with a multi-resolution description in terms of a local power spectrum, and whose edges are labeled with geometric distance vectors. Object recognition can then be formulated as elastic graph matching, which is performed by stochastic optimization of a matching cost function. In general, dynamic link architecture is superior to other face recognition techniques in terms of rotation invariance; however, the matching process is computationally expensive.
4. Hidden Markov Models (HMMs):
Stochastic modeling of non-stationary vector time series based on hidden Markov models (HMMs) has been very successful for speech applications. Reference [3] applied this method to human face recognition. Faces were intuitively divided into regions such as the eyes, nose, mouth, etc., which can be associated with the states of a hidden Markov model. Since HMMs require a one-dimensional observation sequence and images are two-dimensional, the images must be converted into either 1D temporal sequences or 1D spatial sequences.
5. Geometrical Feature Matching:
Geometrical feature matching based on precisely measured distances between features may be most useful for finding possible matches in a large database such as a mug-shot album. However, it depends on the accuracy of the feature-location algorithms. Current automated face-feature location algorithms do not provide a high degree of accuracy and require considerable computational time.
6. Template Matching:
A simple version of template matching is that a test image, represented as a two-dimensional array of intensity values, is compared using a suitable metric, such as the Euclidean distance, with a single template representing the whole face. There are several more sophisticated versions of template matching for face recognition; for example, more than one face template from different viewpoints can be used to represent an individual's face. In general, template-based approaches are a more logical approach than feature matching. In summary, no existing technique is free from limitations. Further efforts are required to improve the performance of face recognition techniques, especially in the wide range of environments encountered in the real world.
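As a simple illustration of the whole-face template comparison described above, here is a minimal MATLAB sketch; the image sizes and the random placeholder images are assumptions made only for the example.

% Minimal sketch (illustrative assumptions): whole-face template matching
% with a Euclidean metric, using random placeholder images.
template = rand(64, 64);                   % stand-in for a stored face template
testImg  = rand(64, 64);                   % stand-in for a test image

d = norm(testImg(:) - template(:));        % Euclidean distance to the single template

% With several templates per person (e.g. different viewpoints), keep the
% smallest distance as that person's matching score.
templates = {template, rand(64, 64), rand(64, 64)};
scores = cellfun(@(t) norm(testImg(:) - t(:)), templates);
personScore = min(scores);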
2.2 FUZZY LOGIC
Fuzzy logic is a form of many-valued logic that deals with approximate rather than exact reasoning. Its use remains debated: it is disputed by some statisticians, who prefer Bayesian logic, and by some control engineers, who prefer traditional two-valued logic.
2.2.1 Degrees of truth: Fuzzy logic and probabilistic logic are mathematically similar (both have truth values ranging between 0 and 1) but conceptually distinct, due to different interpretations. Fuzzy logic corresponds to "degrees of truth", while probabilistic logic corresponds to "probability" or "likelihood"; as these differ, fuzzy logic and probabilistic logic yield different models of the same real-world situations. Both degrees of truth and probabilities range between 0 and 1 and hence may seem similar at first. For example, let a 100 ml glass contain 30 ml of water. We may then consider two concepts: Empty and Full. The meaning of each of them can be represented by a certain fuzzy set, and one might define the glass as being 0.7 empty and 0.3 full. Note that the concept of emptiness is subjective and thus depends on the observer or designer: another designer might equally well design a set membership function where the glass would be considered full for all values down to 50 ml. It is essential to realize that fuzzy logic uses degrees of truth as a mathematical model of the vagueness phenomenon, while probability is a mathematical model of ignorance. The same behaviour could be achieved using probabilistic methods, by defining a binary variable "full" that depends on a continuous variable describing how full the glass is. There is no consensus on which method should be preferred in a specific situation.
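The glass example can be written down directly as two membership functions. The linear shape of the memberships below is an assumption made purely for illustration; a designer may choose any other shape.

% Minimal sketch (illustrative assumptions): membership functions for the
% 100 ml glass example, chosen to be linear for simplicity.
glassCapacity = 100;                        % ml
muFull  = @(v) v / glassCapacity;           % degree to which the glass is full
muEmpty = @(v) 1 - v / glassCapacity;       % degree to which the glass is empty

v = 30;                                     % 30 ml of water
fprintf('full: %.1f, empty: %.1f\n', muFull(v), muEmpty(v));   % prints 0.3 and 0.7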
2.2.2 Linguistic variables: While variables in mathematics usually take numerical values, in fuzzy logic applications non-numeric linguistic variables are often used to facilitate the expression of rules and facts [4]. A linguistic variable such as age may have a value such as young or its antonym old. The great utility of linguistic variables, however, is that they can be modified via linguistic hedges applied to primary terms; these linguistic hedges can be associated with certain functions, as sketched below.
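A common textbook convention, assumed here only for illustration, is to model the hedge "very" as squaring a membership value and "somewhat" as taking its square root; the membership function for young below is likewise an invented example.

% Minimal sketch (illustrative assumptions): linguistic hedges applied to
% the primary term "young".
muYoung       = @(age) max(0, min(1, (40 - age) / 20));   % invented membership for "young"
veryYoung     = @(age) muYoung(age).^2;                    % hedge "very" as squaring
somewhatYoung = @(age) sqrt(muYoung(age));                 % hedge "somewhat" as square root

age = 30;
[muYoung(age), veryYoung(age), somewhatYoung(age)]         % 0.50, 0.25, 0.71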
2.2.3 Example: Fuzzy set theory defines fuzzy operators on fuzzy sets. The problem in applying this is that the appropriate fuzzy operator may not be known. For this reason, fuzzy logic usually uses IF-THEN rules, or constructs that are equivalent, such as fuzzy associative matrices. Rules are usually expressed in the form: IF variable IS property THEN action.
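As a sketch of how such a rule could be evaluated, the fragment below fires a single invented rule to a degree given by the membership of its antecedent; the membership function and the rule are illustrative assumptions, not the rule base used in this project.

% Minimal sketch (illustrative assumptions): evaluate one fuzzy IF-THEN rule.
% Rule: IF faceRatio IS round THEN faceShape IS round (to a matching degree).
muRound = @(ratio) max(0, min(1, (ratio - 0.8) / 0.2));   % invented membership for "round"

faceRatio = 0.9;               % width/height ratio of a hypothetical face
firing = muRound(faceRatio);   % the rule fires to degree 0.5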
2.3.1 Method
Step 1: Get some data.
Step 2: Subtract the mean.
Step 3: Calculate the covariance matrix.
Step 4: Calculate the eigenvectors and eigenvalues of the covariance matrix.
Step 5: Choose components and form a feature vector.
Step 6: Derive the new data set.
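The six steps above are the standard principal component analysis procedure. A minimal MATLAB sketch on a small random data matrix (one observation per row) is given below; the data and the number of retained components are placeholders.

% Minimal sketch (placeholder data): the six PCA steps.
X = rand(20, 5);                               % Step 1: get some data

mu = mean(X, 1);
Xc = X - repmat(mu, size(X, 1), 1);            % Step 2: subtract the mean

C = cov(Xc);                                   % Step 3: covariance matrix

[V, D] = eig(C);                               % Step 4: eigenvectors and eigenvalues
[~, idx] = sort(diag(D), 'descend');
V = V(:, idx);                                 % sort eigenvectors by eigenvalue

nComp = 2;
featureVector = V(:, 1:nComp);                 % Step 5: choose components, form feature vector

newData = Xc * featureVector;                  % Step 6: derive the new data set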
Figure 2.1: Flow chart for K-means clustering.
Step 1. Begin with a decision on the value of k = the number of clusters.
Step 2. Put any initial partition that classifies the data into k clusters. You may assign the training samples randomly, or systematically as follows: 1) take the first k training samples as single-element clusters; 2) assign each of the remaining (N - k) training samples to the cluster with the nearest centroid, and after each assignment recompute the centroid of the gaining cluster.
Step 3. Take each sample in sequence and compute its distance from the centroid of each of the clusters. If a sample is not currently in the cluster with the closest centroid, switch this sample to that cluster and update the centroids of the cluster gaining the new sample and the cluster losing the sample.
Step 4. Repeat Step 3 until convergence is achieved, that is, until a pass through the training samples causes no new assignments.
If the number of data points is less than the number of clusters, we assign each data point as the centroid of a cluster, and each centroid is given a cluster number. If the number of data points is greater than the number of clusters, we calculate, for each data point, the distance to all centroids and take the minimum; the data point is said to belong to the cluster whose centroid is at minimum distance.
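The following MATLAB sketch follows the steps above on a random placeholder data matrix. For brevity it uses a batch centroid update at the end of each pass rather than an immediate update after every reassignment; the data, k and the iteration limit are assumptions.

% Minimal sketch (placeholder data, batch-update variant) of the K-means steps.
X = rand(30, 14);                              % N-by-d data matrix
k = 3;

centroids = X(1:k, :);                         % Steps 1-2: first k samples as initial clusters
assign = zeros(size(X, 1), 1);

for pass = 1:100                               % Step 4: repeat until no new assignments
    changed = false;
    for i = 1:size(X, 1)
        d = sum((centroids - repmat(X(i, :), k, 1)).^2, 2);   % Step 3: distance to centroids
        [~, nearest] = min(d);
        if assign(i) ~= nearest
            assign(i) = nearest;
            changed = true;
        end
    end
    for c = 1:k                                % recompute centroids of non-empty clusters
        if any(assign == c)
            centroids(c, :) = mean(X(assign == c, :), 1);
        end
    end
    if ~changed
        break;
    end
end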
Such a neural-network classifier typically needs 200-400 presentations of the training patterns per classifier, where the training patterns include translation and variation in facial expressions. One classifier is constructed for each subject in the database, and classification is achieved by determining the classifier that gives the highest response for the given input image.
2). Multilayer feedforward networks: In this class of network there are one or more hidden layers, whose neurons intervene between the external input and the network output in some useful manner. The ability of the hidden neurons to extract higher-order statistics is particularly valuable when the size of the input layer is large. The input vectors are fed forward to the first hidden layer, whose output is passed to the second hidden layer, and so on until the last layer, i.e. the output layer, which gives the actual network response.
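A forward pass through such a network can be written in a few lines. The sketch below uses two hidden layers with sigmoid units; the layer sizes and random weights are placeholders, and no training is shown.

% Minimal sketch (placeholder weights): forward pass of a two-hidden-layer
% feedforward network with sigmoid units.
sigmoid = @(z) 1 ./ (1 + exp(-z));

x  = rand(14, 1);                          % input vector, e.g. a 14-dimensional feature vector
W1 = randn(10, 14); b1 = randn(10, 1);     % input layer      -> 1st hidden layer
W2 = randn(6, 10);  b2 = randn(6, 1);      % 1st hidden layer -> 2nd hidden layer
W3 = randn(1, 6);   b3 = randn(1, 1);      % 2nd hidden layer -> output layer

h1 = sigmoid(W1 * x  + b1);                % activations of the 1st hidden layer
h2 = sigmoid(W2 * h1 + b2);                % activations of the 2nd hidden layer
y  = sigmoid(W3 * h2 + b3);                % actual network response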
3). Recurrent networks: A recurrent network distinguishes itself from a feedforward neural network in that it has at least one feedback loop. As shown in the figures, the output of a neuron fed back into its own input is referred to as self-feedback. A recurrent network may consist of a single layer of neurons with each neuron feeding its output signal back to the inputs of all the other neurons; the network may or may not have hidden layers.
Once the network has become tuned to the statistical regularities of the input data, it develops the ability to form internal representations for encoding features of the input and thereby to create new classes automatically.
2.7 FEATURE EXTRACTION
When the input data to an algorithm is too large to be processed and is suspected to be highly redundant (much data, but not much information), the input data is transformed into a reduced representation set of features (also called a feature vector). Transforming the input data into this set of features is called feature extraction. If the extracted features are carefully chosen, it is expected that the feature set will capture the relevant information from the input data, so that the desired task can be performed using this reduced representation instead of the full-size input.
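As a small illustration of reducing raw input to a feature vector, the sketch below turns a few hypothetical facial landmark coordinates into two geometric ratios. The landmarks and the particular ratios are assumptions for illustration, not the project's fourteen features.

% Minimal sketch (hypothetical landmarks): reduce raw coordinates to a
% small feature vector of geometric ratios.
landmarks = struct('leftEye', [30 40], 'rightEye', [70 40], ...
                   'nose', [50 60], 'mouth', [50 80], 'chin', [50 110]);

eyeDist   = norm(landmarks.leftEye - landmarks.rightEye);
noseMouth = norm(landmarks.nose    - landmarks.mouth);
faceLen   = norm(landmarks.leftEye - landmarks.chin);

features = [eyeDist / faceLen, noseMouth / faceLen];   % reduced representation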
3. FUTURE WORK
We believe that our approach will be of great use for forensic face recognition and criminal identification systems, which require descriptive input semantics, since the available data often consist of witnesses' descriptions. In addition, our method of searching for data using descriptive semantics could be combined with existing automated face recognition systems to augment them. Adler et al. concluded in 2006 that humans effectively utilize contextual information while recognizing faces and, in general, equal or outperform even the best automated systems. Extensions to our work could include the annotation of contextual data to images using the descriptive semantic method. This could help improve our face recognition method by obtaining qualitatively better user input as well as improving recognition performance. In general, the use of descriptive input features allows the input data to bear different semantics than the data being searched for. We believe that this could yield good results for other data types as well, especially where direct pattern recognition is either infeasible or yields unsatisfactory results.
8. REFERENCES
[1] A. Adler and M. E. Schuckers. Comparing human and automatic face recognition performance. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 37(5):1248-1255, Oct. 2007.
[2] B. W. Behrman and S. L. Davey. Eyewitness identification in actual criminal cases: an archival analysis. Law and Human Behavior, 25(5):475-491, 2001.
[3] R. Gross. Face databases. February 2005.
[4] J. A. Hartigan and M. A. Wong. A K-means clustering algorithm. Journal of the Royal Statistical Society, Series C (Applied Statistics), 28:100-108, 1979.
[5] M. Kirby and L. Sirovich. Application of the Karhunen-Loève procedure for the characterization of human faces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12:831-835, Dec. 1990.