PCA + Distance Methods
V. Perlibakas
Image Processing and Multimedia Laboratory, Kaunas University of Technology, Studentu st. 56-305, LT-3031 Kaunas, Lithuania
Received 2 September 2003; received in revised form 9 December 2003
Abstract

In this article we compare 14 distance measures between feature vectors, and their modifications, with respect to the recognition performance of the principal component analysis (PCA)-based face recognition method, and we propose a modified sum square error (SSE)-based distance. Recognition experiments were performed using a database containing photographs of 423 persons. The experiments showed that the proposed distance measure was among the three best measures with respect to different characteristics of biometric systems. The best recognition results were achieved using the following distance measures: simplified Mahalanobis, weighted angle-based distance, the proposed modified SSE-based distance, and angle-based distance between whitened feature vectors. Using the modified SSE-based distance, fewer images need to be extracted from the database in order to achieve 100% cumulative recognition than with any other tested distance measure. We also showed that an algorithmic combination of distance measures achieves better recognition results than the distances used separately.
© 2004 Elsevier B.V. All rights reserved.
(Navarrete and Ruiz-del-Solar, 2002; Phillips et al., 1997, 2000) in order to achieve better recognition results.

In this article we compare the recognition performance of 14 distance measures, including Euclidean, angle-based, Mahalanobis and their modifications. We also propose a modified sum square error (SSE)-based distance and a modified Manhattan distance. The experiments showed that the proposed distance measures were among the best with respect to different characteristics of biometric systems. For comparison we used the following characteristics of biometric systems: equal error rate (EER), first one recognition rate, area above the cumulative match characteristic (CMC), area below the receiver operating characteristic (ROC), and the percent of images that must be extracted in order to achieve 100% cumulative recognition.

2. PCA-based face recognition

In this section we describe the Karhunen–Loeve transform (KLT)-based face recognition method, often called principal component analysis (PCA) or eigenfaces. We present only the main formulas of this method; the details can be found in (Groß, 1994).

Let X_j be an N-element one-dimensional image and suppose that we have r such images (j = 1, ..., r). A one-dimensional image column X is formed from the two-dimensional image (face photograph) by scanning all the elements of the two-dimensional image row by row and writing them into a column vector. Then the mean vector, centered data vectors and covariance matrix are calculated:

\[ m = \frac{1}{r}\sum_{j=1}^{r} X_j, \qquad (1) \]

\[ d_j = X_j - m, \qquad (2) \]

\[ C = \frac{1}{r}\sum_{j=1}^{r} d_j d_j^{T}, \qquad (3) \]

here X = (x_1, x_2, ..., x_N)^T, m = (m_1, m_2, ..., m_N)^T, d = (d_1, d_2, ..., d_N)^T.

In order to perform the KLT it is necessary to find the eigenvectors u_k and eigenvalues λ_k of the covariance matrix C (C u_k = λ_k u_k). Because the dimensionality (N^2) of the matrix C is large even for small images, and computation of its eigenvectors using traditional methods is complicated, the dimensionality of C is reduced using the decomposition described in (Kirby and Sirovich, 1990). The found eigenvectors u = (u_1, u_2, ..., u_N)^T are normed and sorted in decreasing order of the corresponding eigenvalues. These vectors are then transposed and arranged to form the row vectors of the transformation matrix T. Any data vector X can now be projected into the eigenspace using the following formula:

\[ Y = T(X - m), \qquad (4) \]

here X = (x_1, x_2, ..., x_N)^T, Y = (y_1, y_2, ..., y_r, 0, ..., 0)^T.

We can also perform the "whitening" (Bishop, 1995) transform:

\[ Y = \Lambda^{-1/2} T(X - m), \qquad (5) \]

here Λ^(−1/2) = diag(1/√λ_1, 1/√λ_2, ..., 1/√λ_r). Whitening is a linear rescaling that makes the transformed input data have zero mean and a covariance matrix equal to the identity matrix.

For projection into the eigenspace we can use not all of the found eigenvectors, but only a few of them, corresponding to the largest eigenvalues. We can select the desired number of eigenvectors manually or use the method described in (Swets et al., 1998).

When an image (human face photograph) is projected into the eigenspace we get its eigenfeature vector Z = (z_1, z_2, ..., z_n)^T = (y_1, y_2, ..., y_n)^T, here n is the number of features. When we have the feature vector Z of each face, identification can be performed. After projecting a new unknown face image into the eigenspace we get its feature vector Z_new, calculate the Euclidean distances between the unknown face and each known face, e_i = ||Z_new − Z_i||, and say that the face with projection Z_new belongs to the person s = arg min_i [e_i]. For rejection of unknown faces a threshold θ is chosen, and the face with projection Z_new is declared unknown if e_s ≥ θ.
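As an illustration, the training, projection and whitening steps above can be sketched in a few lines of NumPy. This sketch obtains the eigenvectors through an SVD of the centered data instead of the Kirby–Sirovich decomposition cited above, and the function and variable names are assumptions rather than the author's code.

```python
import numpy as np

def train_pca(images):
    """images: r x N matrix with one vectorized face image per row."""
    m = images.mean(axis=0)                      # mean vector, eq. (1)
    d = images - m                               # centered data, eq. (2)
    # Eigenvectors of C = (1/r) sum d_j d_j^T via SVD of the centered data,
    # which avoids forming the N x N covariance matrix explicitly.
    _, s, vt = np.linalg.svd(d, full_matrices=False)
    eigvals = (s ** 2) / images.shape[0]         # eigenvalues of C, sorted decreasing
    T = vt                                       # rows are the normed, sorted eigenvectors
    return m, T, eigvals

def project(x, m, T, n_features):
    # Y = T (x - m), keeping only the first n_features components, eq. (4)
    return T[:n_features] @ (x - m)

def project_whitened(x, m, T, eigvals, n_features):
    # Y = Lambda^(-1/2) T (x - m), the whitening transform, eq. (5)
    y = T[:n_features] @ (x - m)
    return y / np.sqrt(eigvals[:n_features])
```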
The distance between projections Z is usually measured using the Euclidean distance; some authors measured the distance between the feature vectors in the eigenspace using the angle-based measure (Phillips et al., 1997), but other distance measures could also be used.

3. Distance measures

Let X, Y be eigenfeature vectors of length n. Then we can calculate the following distances between these feature vectors (Grudin, 1997; Yambor and Draper, 2002; Phillips et al., 1999, 2000; Cekanavicius and Murauskas, 2002):

(1) Minkowski distance (Lp metrics)

\[ d(X,Y) = L_p(X,Y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}, \qquad (6) \]

here p > 0;

(2) Manhattan distance (L1 metrics, city block distance)

\[ d(X,Y) = L_{p=1}(X,Y) = \sum_{i=1}^{n} |x_i - y_i|; \qquad (7) \]

(3) Euclidean distance (L2 metrics)

\[ d(X,Y) = L_{p=2}(X,Y) = \|X - Y\| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}; \qquad (8) \]

(4) Squared Euclidean distance (sum square error, SSE) and mean square error (MSE)

\[ d(X,Y) = L_{p=2}^2(X,Y) = \mathrm{SSE} = \|X - Y\|^2 = \sum_{i=1}^{n} (x_i - y_i)^2, \qquad (9) \]

\[ d(X,Y) = \frac{1}{n} L_{p=2}^2(X,Y) = \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (x_i - y_i)^2; \qquad (10) \]

(5) Angle-based distance

\[ d(X,Y) = -\cos(X,Y), \qquad (11) \]

\[ \cos(X,Y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2}}; \]

(6) Correlation coefficient-based distance

\[ d(X,Y) = -r(X,Y), \qquad (12) \]

\[ r(X,Y) = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{\left( n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 \right)\left( n\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2 \right)}}; \]

(7) Mahalanobis distance and Mahalanobis distance between normed vectors

\[ d(X,Y) = -\sum_{i=1}^{n} z_i x_i y_i, \qquad (13) \]

\[ d(X,Y) = -\frac{1}{\sqrt{\sum_{i=1}^{n} x_i^2}\sqrt{\sum_{i=1}^{n} y_i^2}} \sum_{i=1}^{n} z_i x_i y_i, \qquad (14) \]

here z_i = √(λ_i/(λ_i + α^2)), α = 0.25, where λ_i are the corresponding eigenvalues; the simplified Mahalanobis distance uses z_i = √(1/λ_i) instead;

(8) Weighted Manhattan distance

\[ d(X,Y) = \sum_{i=1}^{n} z_i |x_i - y_i|, \quad z_i = \sqrt{1/\lambda_i}; \qquad (15) \]

(9) Weighted SSE-based distance

\[ d(X,Y) = \sum_{i=1}^{n} z_i (x_i - y_i)^2, \quad z_i = \sqrt{1/\lambda_i}; \qquad (16) \]

(10) Weighted angle-based distance

\[ d(X,Y) = -\frac{\sum_{i=1}^{n} z_i x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2}}, \quad z_i = \sqrt{1/\lambda_i}; \qquad (17) \]

(11) Chi square distance

\[ d(X,Y) = \chi^2 = \sum_{i=1}^{n} \frac{(x_i - y_i)^2}{x_i + y_i}; \qquad (18) \]

(12) Canberra distance

\[ d(X,Y) = \sum_{i=1}^{n} \frac{|x_i - y_i|}{|x_i| + |y_i|}; \qquad (19) \]
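For illustration, several of the listed distance measures can be implemented directly; a minimal NumPy sketch is shown below, assuming the corresponding eigenvalues are available in a vector eigvals. The function names are illustrative, not part of the paper.

```python
import numpy as np

def manhattan(x, y):
    # (7) L1 / city block distance
    return np.sum(np.abs(x - y))

def sse(x, y):
    # (9) squared Euclidean distance (sum square error)
    return np.sum((x - y) ** 2)

def angle_based(x, y):
    # (11) negative cosine of the angle between the vectors
    return -np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y))

def simplified_mahalanobis(x, y, eigvals):
    # (13) with the simplified weights z_i = sqrt(1 / lambda_i)
    z = np.sqrt(1.0 / eigvals)
    return -np.sum(z * x * y)

def weighted_angle(x, y, eigvals):
    # (17) angle-based distance with weights z_i = sqrt(1 / lambda_i)
    z = np.sqrt(1.0 / eigvals)
    return -np.sum(z * x * y) / np.sqrt(np.dot(x, x) * np.dot(y, y))
```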
(13) Modified Manhattan distance

\[ d(X,Y) = \frac{\sum_{i=1}^{n} |x_i - y_i|}{\sum_{i=1}^{n} |x_i| \sum_{i=1}^{n} |y_i|}; \qquad (20) \]

(14) Modified SSE-based distance

\[ d(X,Y) = \frac{\sum_{i=1}^{n} (x_i - y_i)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2}; \qquad (21) \]

(15) Weighted modified Manhattan distance

\[ d(X,Y) = \frac{\sum_{i=1}^{n} z_i |x_i - y_i|}{\sum_{i=1}^{n} |x_i| \sum_{i=1}^{n} |y_i|}, \quad z_i = \sqrt{1/\lambda_i}; \qquad (22) \]

(16) Weighted modified SSE-based distance

\[ d(X,Y) = \frac{\sum_{i=1}^{n} z_i (x_i - y_i)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2}, \quad z_i = \sqrt{1/\lambda_i}. \qquad (23) \]

If the feature vectors X are stored in the database, some components of the described distances, such as ∑x_i^2, n∑x_i^2, (∑x_i)^2, ∑|x_i|, could be calculated in advance and stored in the database in order to speed up comparisons and search. In some cases, instead of using the eigenvalues λ_i in the distance measures, we can include them in the transformation formula. For example, instead of using the weighted Manhattan distance (15) we can use the whitening transform (5) and the simple Manhattan distance (7):

\[ d_{\text{Weighted Manhattan}}(X,Y) = \sum_{i=1}^{n} \lambda_i^{-1/2} |x_i - y_i| = \sum_{i=1}^{n} \left| \lambda_i^{-1/2} x_i - \lambda_i^{-1/2} y_i \right| = \sum_{i=1}^{n} |u_i - v_i|, \]

here u_i = λ_i^(−1/2) x_i, v_i = λ_i^(−1/2) y_i. That is, in some cases, instead of using weighted distances we can calculate the weighted vectors in advance and then use plain (unweighted) distances. It must be noted, however, that although the mentioned weighted distances perform some data scaling along the principal directions, the weighting is not necessarily the same as whitening, because the scaling factors differ. It must also be noted that some of the distances (e.g. (11) and (12)) could be shifted in order to have positive distance values and scaled in order to have values in the interval [0, 1], but these normalizations increase computation time. So if we do not necessarily need normalized values, we can calculate and search faster with unnormalized values. When the search is done and we present some of the best results to the user (usually only a small part of the database), we can then normalize the displayed results.

Now we will compare some of the mentioned distance measures using the PCA-based face recognition method.
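Before moving on, here is a rough NumPy sketch of the two shortcuts described above: precomputing the per-vector components that the modified distances reuse, and folding the weights √(1/λ_i) into the stored vectors so that a plain distance can be used. The variable and function names, and the placeholder data, are illustrative assumptions, not part of the paper.

```python
import numpy as np

# Gallery of eigenfeature vectors (one per row) and the PCA eigenvalues.
gallery = np.random.randn(423, 127)          # placeholder data
eigvals = np.linspace(10.0, 0.1, 127)        # placeholder eigenvalues

# Precompute the components that depend only on the stored vectors,
# e.g. sum(x_i^2) needed by the modified SSE-based distance (21).
gallery_sq_norms = np.sum(gallery ** 2, axis=1)

def modified_sse(query, idx):
    # (21): sum((x_i - y_i)^2) / (sum(x_i^2) * sum(y_i^2)),
    # with sum(x_i^2) of the stored vector taken from the precomputed array.
    diff = gallery[idx] - query
    return np.sum(diff ** 2) / (gallery_sq_norms[idx] * np.sum(query ** 2))

# Folding the weights into the vectors: the weighted Manhattan distance (15)
# between x and y equals the plain Manhattan distance (7) between the
# pre-scaled vectors u_i = x_i / sqrt(lambda_i), v_i = y_i / sqrt(lambda_i).
scale = 1.0 / np.sqrt(eigvals)
gallery_scaled = gallery * scale             # computed once, stored in the database

def weighted_manhattan(query, idx):
    return np.sum(np.abs(gallery_scaled[idx] - query * scale))
```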
4. Experiments and results

For the experiments we used images from the AR (AR, 1998; Martinez and Benavente, 1998), Bern (1995), BioID (2001), Yale (1997), Manchester (1998), MIT (MIT, 1989; Turk and Pentland, 1991), ORL (ORL, 1992; Samaria and Harter, 1994), Umist (Umist, 1997; Graham and Allinson, 1998), and FERET (Phillips et al., 1997) databases. From these databases we collected a database containing photographs of 423 persons (two images per person: one for learning and one for testing). In order to avoid recognition errors related to incorrectly detected faces, we manually selected the centers of the eyes and lips. We then rotated the images to make the line connecting the eye centers horizontal, resized the images so that the distance between the eye centers equals 26 pixels, calculated the center of the face using the centers of the eyes and lips, cropped the 64 × 64 central part of the face, and performed histogram equalization on the cropped part of the image. It must be noted that in some cases histogram equalization reduces recognition performance, but it is usually applied in order to normalize illumination. Using the cropped templates we performed PCA-based face recognition. In all the experiments we use the same templates and change only the distance measures between eigenfeature vectors and the number (percent) of used features. For comparison we use cumulative match characteristic (CMC) and receiver operating characteristic (ROC)-based measures, described in (Bromba, 2003).

The results of the experiments are summarized in Tables 1–5. In these tables we can see how different distance measures affect recognition accuracy. For measuring the overall goodness of a distance measure with respect to recognition accuracy, we use the area above the cumulative match characteristic (CMCA).
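As an aside, the geometric and photometric normalization described at the start of this section can be sketched roughly as follows. The use of scipy.ndimage, the landmark-based similarity transform and the choice of the face center as the midpoint between the eye midpoint and the mouth center are assumptions for illustration, not the author's implementation.

```python
import numpy as np
from scipy import ndimage

def align_face(gray, eye_l, eye_r, mouth, eye_dist=26.0, size=64):
    """gray: 2-D array with 0-255 values; landmarks given as (row, col)."""
    eye_l, eye_r, mouth = (np.asarray(p, dtype=float) for p in (eye_l, eye_r, mouth))
    d = eye_r - eye_l
    angle = np.arctan2(d[0], d[1])            # eye-line angle w.r.t. the horizontal
    scale = eye_dist / np.hypot(d[0], d[1])   # rescale so the eyes end up eye_dist apart
    c, s = np.cos(angle), np.sin(angle)
    R = scale * np.array([[c, -s], [s, c]])   # maps input offsets to output offsets

    # Face center from the eye and lip centers (assumed: midpoint between the
    # eye midpoint and the mouth center); it becomes the center of the crop.
    center_in = ((eye_l + eye_r) / 2.0 + mouth) / 2.0

    # ndimage.affine_transform needs the output -> input mapping.
    A = np.linalg.inv(R)
    offset = center_in - A @ np.array([size / 2.0, size / 2.0])
    crop = ndimage.affine_transform(gray, A, offset=offset,
                                    output_shape=(size, size), order=1)

    # Histogram equalization of the cropped face.
    hist, bins = np.histogram(crop.ravel(), bins=256, range=(0.0, 255.0))
    cdf = hist.cumsum() / float(hist.sum())
    return np.interp(crop.ravel(), bins[:-1], cdf * 255.0).reshape(crop.shape)
```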
Table 1
Recognition using 10% of features (42)
Columns: rank (%) of images that must be extracted to achieve 80/85/90/95/100% cumulative recognition; CMCA (0–10^4); First1 recognition (%); EER (%); ROCA (0–10^4). The ranks (1)–(5) appended in parentheses mark the five best values in each column.

Distance measure              80    85    90    95    100       CMCA        First1      EER        ROCA
Euclidean; SSE                0.2   0.5   2.6   9.5   85.8      222.16      81.56       7.09       260.84
Angle; Mahalanobis normed     0.2   0.5   1.9   5.9   47.0      117.03      81.56       6.86       139.02 (5)
Correlation                   0.2   0.5   1.4   6.6   53.4      125.02      80.61       6.62       144.80
SSE modified                  0.5   0.5   1.2   4.7   24.6 (1)  95.99       79.91       7.09       162.73
Manhattan                     0.2   0.5   2.4   6.4   77.3      128.12      82.27       7.09       218.17
Manhattan modified            0.2   0.5   0.9   2.8   31.4 (4)  91.07 (5)   82.27       5.44 (5)   139.70
Mahalanobis                   0.5   0.9   1.7   5.2   46.8      115.60      72.81       6.86       200.56
Mahalanobis simplified        0.2   0.5   0.7   1.9   25.3 (2)  66.17 (1)   82.98 (5)   3.78 (3)   71.43 (4)
Angle weighted                0.2   0.5   0.9   2.4   31.0 (3)  68.32 (2)   84.63 (3)   4.49 (4)   57.65 (1)
Manhattan weighted            0.2   0.7   2.1   6.9   65.5      149.28      81.09       9.22       307.51
Manhattan weighted modified   0.2   0.5   0.7   2.6   62.4      136.20      81.56       5.91       204.50
SSE weighted                  0.2   0.5   1.9   5.4   68.8      123.32      83.22 (4)   7.09       220.79
SSE weighted modified         0.2   0.5   0.9   2.6   50.4      92.19       82.27       6.15       169.19
Angle whitened                0.2   0.2   0.7   1.9   35.5 (5)  77.41 (4)   85.34 (1)   3.55 (1)   58.66 (2)
Correlation whitened          0.2   0.2   0.7   2.1   36.4      77.04 (3)   85.11 (2)   3.55 (2)   59.53 (3)
Table 2
Recognition using 20% of features (85). Columns and rank marks (1)–(5) as in Table 1.

Distance measure              80    85    90    95    100       CMCA        First1      EER        ROCA
Euclidean; SSE                0.2   0.5   2.4   7.6   86.1      210.61      83.22       7.33       254.60
Angle; Mahalanobis normed     0.2   0.5   1.4   5.2   49.4 (4)  106.47      83.45       5.91 (5)   125.52 (5)
Correlation                   0.2   0.5   1.2   5.4   50.8      109.07      83.22       5.91       127.68
SSE modified                  0.2   0.5   1.2   4.3   22.2 (1)  86.82 (4)   81.80       6.86       147.15
Manhattan                     0.2   0.5   1.7   4.5   73.3      122.20      83.22       7.57       251.09
Manhattan modified            0.2   0.2   0.5   2.4   51.5      92.44       85.82 (5)   6.15       144.63
Mahalanobis                   0.5   0.5   1.2   4.7   51.3      102.55      78.25       6.38       166.16
Mahalanobis simplified        0.2   0.2   0.5   1.2   31.7 (2)  56.98 (1)   86.52 (4)   3.31 (2)   50.18 (2)
Angle weighted                0.2   0.2   0.5   1.7   32.9 (3)  58.71 (2)   87.00 (3)   3.78 (4)   46.44 (1)
Manhattan weighted            0.2   0.5   1.4   7.3   92.4      200.19      82.98       10.64      442.58
Manhattan weighted modified   0.2   0.2   0.7   3.3   72.8      149.92      85.11       7.09       210.48
SSE weighted                  0.2   0.2   1.2   4.3   68.6      130.19      85.11       7.33       270.42
SSE weighted modified         0.2   0.5   0.9   1.9   40.0      97.11       84.16       6.62       216.36
Angle whitened                0.2   0.2   0.5   1.2   51.1 (5)  85.03 (3)   88.42 (2)   3.07 (1)   68.52 (3)
Correlation whitened          0.2   0.2   0.5   1.2   51.3      87.60 (5)   88.89 (1)   3.31 (3)   72.70 (4)
Table 3
Recognition using 30% of features (127). Columns and rank marks (1)–(5) as in Table 1.

Distance measure              80    85    90    95    100       CMCA        First1      EER        ROCA
Euclidean; SSE                0.2   0.5   2.1   8.5   86.1      215.20      83.22       7.33       261.63
Angle; Mahalanobis normed     0.2   0.2   1.4   4.3   46.8 (4)  103.42      85.11       5.91 (5)   123.06 (5)
Correlation                   0.2   0.5   1.2   4.7   49.2      105.01      84.87       5.91       124.69
SSE modified                  0.2   0.5   0.9   3.5   21.3 (1)  84.11 (3)   83.45       7.09       144.36
Manhattan                     0.2   0.5   1.2   5.0   78.7      155.20      83.92       8.51       318.38
Manhattan modified            0.2   0.2   0.5   2.1   52.2      103.78      87.23 (5)   6.38       170.44
Mahalanobis                   0.2   0.5   0.9   4.5   48.5 (5)  98.98 (5)   80.61       5.91       155.97
Mahalanobis simplified        0.2   0.2   0.5   0.9   23.9 (2)  54.57 (1)   87.94 (4)   3.07 (1)   45.19 (2)
Angle weighted                0.2   0.2   0.5   1.4   26.0 (3)  57.15 (2)   88.42 (2)   3.31 (2)   44.08 (1)
Manhattan weighted            0.2   0.5   4.3   20.1  92.7      316.44      82.98       13.48      628.08
Manhattan weighted modified   0.2   0.5   0.7   7.8   76.4      208.66      84.16       8.51       301.52
SSE weighted                  0.2   0.2   1.2   6.6   78.5      166.27      85.34       8.75       343.28
SSE weighted modified         0.2   0.5   1.2   2.6   54.6      118.65      82.98       7.09       261.83
Angle whitened                0.2   0.2   0.5   2.1   49.9      98.92 (4)   88.18 (3)   3.78 (3)   84.84 (3)
Correlation whitened          0.2   0.2   0.5   2.4   52.0      101.52      88.89 (1)   3.78 (4)   87.78 (4)
Table 4
Recognition using 60% of features (254). Columns and rank marks (1)–(5) as in Table 1.

Distance measure              80    85    90    95    100       CMCA        First1      EER        ROCA
Euclidean; SSE                0.2   0.5   2.1   8.0   86.3      216.78      83.45       7.33       265.36
Angle; Mahalanobis normed     0.2   0.2   1.2   4.0   46.6 (4)  100.46 (4)  85.58       5.67 (5)   119.72 (3)
Correlation                   0.2   0.2   1.4   4.0   48.2      101.49 (5)  85.82       5.91       120.62
SSE modified                  0.2   0.5   1.2   3.3   20.8 (1)  81.12 (3)   83.22       6.86       140.54
Manhattan                     0.2   0.5   1.4   7.3   81.6      218.97      83.92       9.46       388.59
Manhattan modified            0.2   0.2   0.5   2.1   63.8      113.34      86.76 (5)   6.62       192.74
Mahalanobis                   0.2   0.5   0.9   4.3   48.0 (5)  95.40       81.56       6.15       145.42
Mahalanobis simplified        0.2   0.2   0.5   0.9   21.3 (2)  49.01 (1)   88.65 (4)   2.84 (1)   36.04 (1)
Angle weighted                0.2   0.2   0.5   0.9   23.4 (3)  52.14 (2)   89.60 (3)   3.07 (2)   38.76 (2)
Manhattan weighted            0.5   2.1   15.1  48.0  98.3      589.51      79.20       17.26      951.26
Manhattan weighted modified   0.5   0.9   5.2   18.0  92.4      298.53      79.43       11.58      196.63
SSE weighted                  0.2   0.5   1.4   8.5   83.5      237.58      84.40       10.17      437.54
SSE weighted modified         0.2   0.7   1.7   6.1   58.4      160.40      82.51       8.75       369.03
Angle whitened                0.2   0.2   0.5   1.9   68.8      126.56      89.83 (2)   4.73 (3)   119.73 (4)
Correlation whitened          0.2   0.2   0.5   1.9   68.6      126.00      89.83 (1)   4.73 (4)   120.04 (5)
A smaller CMCA means better overall recognition accuracy. We also present how many images (in percent) must be extracted from the database in order to achieve a given cumulative recognition rate (80–100%); smaller values mean that fewer images need to be extracted to reach that cumulative recognition rate. The last columns of the tables are the equal error rate (EER) and the area below the receiver operating characteristic (ROCA); smaller values mean better results.
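For reference, the CMC-derived quantities reported in these tables can be computed from a test-versus-gallery distance matrix roughly as follows; the scaling of CMCA to the 0–10^4 range is an assumed reading of the table convention, and all names are illustrative.

```python
import numpy as np

def cmc_statistics(dist, true_idx):
    """dist: n_test x n_gallery distance matrix; true_idx[i] is the gallery
    index of the correct identity for test image i."""
    n_test, n_gallery = dist.shape
    order = np.argsort(dist, axis=1)                      # best match first
    # 1-based rank at which the correct identity is retrieved for each test image.
    ranks = np.array([np.where(order[i] == true_idx[i])[0][0] + 1
                      for i in range(n_test)])
    # Cumulative match characteristic: fraction recognized within the first k matches.
    cmc = np.array([(ranks <= k).mean() for k in range(1, n_gallery + 1)])
    first1 = 100.0 * cmc[0]                               # first one recognition rate, %
    # Area above the CMC curve, scaled to the 0-10^4 range used in the tables (assumed).
    cmca = 1e4 * np.mean(1.0 - cmc)
    # Percent of the gallery that must be extracted to reach 100% cumulative recognition.
    cum100 = 100.0 * ranks.max() / n_gallery
    return first1, cmca, cum100
```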
Table 5
Recognition using 90% of features (381). Columns and rank marks (1)–(5) as in Table 1.

Distance measure              80    85    90    95    100       CMCA        First1      EER        ROCA
Euclidean; SSE                0.2   0.5   2.1   7.6   87.2      217.54      83.69       7.33       266.20
Angle; Mahalanobis normed     0.2   0.2   0.9   3.8   46.1 (4)  99.42 (5)   85.34       5.91 (3)   118.66 (3)
Correlation                   0.2   0.2   1.2   4.0   46.3 (5)  100.12      85.11       5.91       119.24 (4)
SSE modified                  0.2   0.5   0.9   3.3   20.8 (3)  79.70 (3)   83.45       6.86       140.41
Manhattan                     0.2   0.5   2.6   12.1  86.1      270.22      83.92       9.46       423.03
Manhattan modified            0.2   0.2   0.5   2.1   77.8      119.80      88.42 (3)   6.38       182.42
Mahalanobis                   0.2   0.5   0.9   4.3   47.0      94.20 (4)   81.80       5.91       142.75
Mahalanobis simplified        0.2   0.2   0.5   0.9   18.4 (2)  44.18 (1)   89.83 (2)   2.84 (1)   30.49 (1)
Angle weighted                0.2   0.2   0.2   0.7   16.3 (1)  47.78 (2)   90.07 (1)   3.07 (2)   33.82 (2)
Manhattan weighted            8.7   23.9  49.4  77.1  99.5      1079.20     66.43       18.44      1199.75
Manhattan weighted modified   1.7   5.2   14.9  37.1  94.8      474.77      72.58       15.84      828.84
SSE weighted                  0.2   0.5   2.6   16.5  86.8      301.63      83.92       11.35      497.21
SSE weighted modified         0.5   1.4   3.1   14.7  74.9      227.72      78.01       10.64      486.00
Angle whitened                0.2   0.2   0.7   2.8   89.1      168.75      86.52 (4)   5.20 (4)   169.87 (5)
Correlation whitened          0.2   0.2   0.7   3.5   87.9      172.78      86.52 (5)   5.20 (5)   172.27
Fig. 3. Recognition performance using Minkowski distance. (a) CMCA and ROCA, (b) Cumulative 100% and First1 recognition,
EER.
Table 6
Sorted distance measures with respect to recognition performance (the three best measures for each characteristic)

Feat. num.   CMCA                       First1                     Cum100                     EER                        ROCA
10% (42)     Mahalanobis (simplified)   Angle (whitened)           SSE (modified)             Angle (whitened)           Angle (weighted)
             Angle (weighted)           Correlation (whitened)     Mahalanobis (simplified)   Correlation (whitened)     Angle (whitened)
             Correlation (whitened)     Angle (weighted)           Angle (weighted)           Mahalanobis (simplified)   Correlation (whitened)
20% (85)     Mahalanobis (simplified)   Correlation (whitened)     SSE (modified)             Angle (whitened)           Angle (weighted)
             Angle (weighted)           Angle (whitened)           Mahalanobis (simplified)   Mahalanobis (simplified)   Mahalanobis (simplified)
             Angle (whitened)           Angle (weighted)           Angle (weighted)           Correlation (whitened)     Angle (whitened)
In Tables 7 and 8 the distances are numbered as follows: 1––Mahalanobis (simplified), 2––angle (weighted), 3––SSE (modified), 4––angle (whitened), 5––correlation (whitened). As we can see from Table 8, the mean differences are not significant at the α = 0.001 level between the following distance measures: Mahalanobis (simplified) and SSE (modified) with respect to Cum100; Mahalanobis (simplified) and angle (weighted) with respect to ROCA; angle (whitened) and correlation (whitened) with respect to EER and ROCA; Mahalanobis (simplified) and angle (whitened) with respect to First1; and angle (weighted) and angle (whitened) with respect to First1. All other differences are statistically significant. We also used further post hoc tests available in the SPSS 12.0 package (SPSS, 2003), and almost all of them showed the same significant (and insignificant) differences. The only exception was Fisher's least significant difference (LSD) test, which showed that the difference between Mahalanobis (simplified) and angle (whitened) with respect to First1 is statistically significant.

Now we will compare our results with the results of other researchers. The experiments described in (Phillips et al., 1997, 2000; Navarrete and Ruiz-del-Solar, 2002) showed that the recognition performance of the PCA-based recognition method with the angle-based distance measure is better than with the Euclidean distance, that the Euclidean distance achieves larger recognition rates than the Manhattan distance, and that the Mahalanobis distance performs better than the other mentioned distances. The experiments with Manhattan, Euclidean, angle-based and Mahalanobis distances and different combinations described in
Fig. 4. Recognition performance using different number of features. (a) Cumulative 100% recognition, (b) First1 recognition,
(c) CMCA, (d) ROCA and (e) EER.
Table 7
Descriptive statistics
Distance ID   Mean      Std. Dev.   Std. Err.   95% CI, lower bound   95% CI, upper bound   Min     Max
CMCA
1 54.4720 9.53966 0.21331 54.0537 54.8904 29.34 90.18
2 56.9940 10.33195 0.23103 56.5409 57.4471 30.04 96.77
3 84.0432 12.61411 0.28206 83.4900 84.5963 48.06 131.64
4 99.0460 21.02596 0.47015 98.1240 99.9680 39.82 174.45
5 101.7862 21.68936 0.48499 100.8350 102.7373 43.76 181.75
First1
1 89.6742 1.62365 0.03631 89.6030 89.7454 83.69 94.80
2 89.9774 1.59608 0.03569 89.9074 90.0474 84.16 94.56
3 85.1074 1.82273 0.04076 85.0275 85.1874 78.96 91.02
4 89.8616 1.60529 0.03590 89.7912 89.9320 83.92 94.56
5 90.3178 1.55792 0.03484 90.2495 90.3862 84.16 95.04
Cum100
1 21.6412 3.72947 0.08339 21.4776 21.8047 4.50 30.30
2 24.1841 4.39684 0.09832 23.9913 24.3770 4.50 33.80
3 21.4187 2.00773 0.04489 21.3307 21.5067 11.10 27.90
4 43.6803 8.88014 0.19857 43.2909 44.0698 19.10 60.50
5 45.5153 9.21713 0.20610 45.1111 45.9194 19.10 62.20
EER
1 3.0896 0.87283 0.01952 3.0513 3.1278 0.97 6.80
2 3.3346 0.76055 0.01701 3.3012 3.3679 1.36 8.78
3 6.7445 1.29171 0.02888 6.6879 6.8012 3.35 15.17
4 3.7646 1.15688 0.02587 3.7139 3.8154 1.05 8.40
5 3.7791 1.18242 0.02644 3.7273 3.8310 1.17 8.85
ROCA
1 47.0447 19.52074 0.43650 46.1887 47.9007 11.69 170.44
2 44.9181 20.03766 0.44806 44.0394 45.7968 9.32 182.29
3 147.7803 43.57351 0.97433 145.8695 149.6911 56.38 451.57
4 85.7599 36.75803 0.82193 84.1479 87.3718 20.31 279.02
5 88.8438 37.94469 0.84847 87.1798 90.5078 19.53 288.80
(Yambor and Draper, 2002) showed that the simplified Mahalanobis distance performs significantly better than the L1, L2 or angle-based distance when more than 60% of the eigenfeatures are used. Our results also showed that the angle-based distance performs better than the Euclidean distance. The simplified Mahalanobis distance performs better than the Euclidean, Manhattan and angle-based distance measures with respect to CMCA and EER. But the results also showed that the weighted angle-based distance performs better than the simplified Mahalanobis distance with respect to ROCA and the first one recognition rate. The experiments also showed that the proposed modified SSE-based distance performs better than the simplified Mahalanobis distance and the weighted angle-based distance with respect to 100% cumulative recognition. We also tested the Chi square and Canberra distances, but the results were much worse than with the Euclidean or any other tested distance measure. The results using the Euclidean or SSE-based distance between whitened feature vectors were worse than the results using the angle-based distance between whitened vectors.

In order to achieve higher recognition performance we can try to combine different distance measures, as was done in (Yambor and Draper, 2002).
Table 8
Mean differences between the distance measures (significance values are given in parentheses below each row of differences)

Distance ID 1  Distance ID 2  CMCA         First1      Cum100      EER         ROCA
1              2              −2.52199     −0.30319    −2.54300    −0.24499    2.12662
                              (0.000006)   (0.000000)  (0.000000)  (0.000000)  (0.250291)
1              3              −29.57116    4.56678     0.22245     −3.65495    −100.73560
                              (0.000000)   (0.000000)  (0.801765)  (0.000000)  (0.000000)
1              4              −44.57398    −0.18735    −22.03920   −0.67508    −38.71515
                              (0.000000)   (0.002906)  (0.000000)  (0.000000)  (0.000000)
1              5              −47.31414    −0.64362    −23.87410   −0.68954    −41.79910
                              (0.000000)   (0.000000)  (0.000000)  (0.000000)  (0.000000)
2              3              −27.04916    4.86998     2.76545     −3.40997    −102.86222
                              (0.000000)   (0.000000)  (0.000000)  (0.000000)  (0.000000)
2              4              −42.05199    0.11584     −19.49620   −0.43009    −40.84177
                              (0.000000)   (0.169208)  (0.000000)  (0.000000)  (0.000000)
2              5              −44.79215    −0.34043    −21.33110   −0.44455    −43.92572
                              (0.000000)   (0.000000)  (0.000000)  (0.000000)  (0.000000)
3              4              −15.00282    −4.75414    −22.26165   2.97988     62.02044
                              (0.000000)   (0.000000)  (0.000000)  (0.000000)  (0.000000)
3              5              −17.74299    −5.21040    −24.09655   2.96542     58.93650
                              (0.000000)   (0.000000)  (0.000000)  (0.000000)  (0.000000)
4              5              −2.74017     −0.45626    −1.83490    −0.01446    −3.08394
                              (0.000001)   (0.000000)  (0.000000)  (0.993103)  (0.026595)
Table 9
Recognition using 30% of features (127) and combined distance measures. Columns as in Table 1.

Distance measure                              80    85    90    95    100    CMCA    First1   EER    ROCA
SSE (mod.)                                    0.2   0.5   0.9   3.5   21.3   84.11   83.45    7.09   144.36
Mahalanobis (simplified)                      0.2   0.2   0.5   0.9   23.9   54.57   87.94    3.07   45.19
Angle (weighted)                              0.2   0.2   0.5   1.4   26.0   57.15   88.42    3.31   44.08
SSE (mod.) + Mahalanobis (simplified)         0.2   0.2   0.5   0.9   16.8   49.54   87.94    3.07   38.97
SSE (mod.) + angle (weighted)                 0.2   0.2   0.5   1.4   15.8   50.55   88.42    3.31   36.63
Mahalanobis (simplified) + angle (weighted)   0.2   0.2   0.5   1.4   24.8   56.81   88.42    3.31   41.79
We can also perform an algorithmic combination (Perlibakas, 2002) by sorting all images using one distance measure (e.g. the modified SSE-based distance) and then re-sorting some part (e.g. 25%) of the images with the smallest distances using another distance measure, for example the simplified Mahalanobis or the weighted angle-based distance. The results of such an algorithmic combination are presented in Table 9. As we can see from the table, using such a combination we can achieve better performance with respect to CMCA, ROCA and 100% cumulative recognition. But it must be noted that, in order to achieve better results with the combined method than with the individual distances, we must choose an appropriate percent of re-sorting.
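A sketch of this algorithmic combination is given below; the 25% re-sorting fraction follows the example in the text, while the function names and the two-argument distance callables are illustrative assumptions.

```python
import numpy as np

def combined_ranking(query, gallery, primary_dist, secondary_dist, resort_fraction=0.25):
    """Rank gallery vectors by primary_dist, then re-rank the best
    resort_fraction of them by secondary_dist (algorithmic combination)."""
    d1 = np.array([primary_dist(query, g) for g in gallery])
    order = np.argsort(d1)                          # full ranking by the first measure
    k = max(1, int(resort_fraction * len(gallery)))
    head = order[:k]                                # images with the smallest distances
    d2 = np.array([secondary_dist(query, gallery[i]) for i in head])
    head = head[np.argsort(d2)]                     # re-sorted by the second measure
    return np.concatenate([head, order[k:]])
```

For instance, primary_dist and secondary_dist could be taken from the distance sketches in Section 3 (wrapping the eigenvalue-dependent ones so that they accept two arguments), which would mirror the combined rows of Table 9 in spirit.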
5. Conclusions

In this publication we compared 14 distance measures and their modifications for the principal component analysis-based face recognition method and proposed a modified sum squared error (SSE)-based distance measure. Recognition experiments were performed using a database containing photographs of 423 persons. The experiments showed that the proposed distance measure is among the three best measures with respect to different characteristics of biometric systems. The best recognition results were achieved using the following distance measures: simplified Mahalanobis, weighted angle-based distance, the proposed modified SSE-based distance, and angle-based distance between whitened feature vectors. Using the proposed modified SSE-based distance, fewer images need to be extracted in order to achieve 100% cumulative recognition than using any other tested distance measure. We also showed that using an algorithmic combination of distance measures we can achieve better recognition results than using the distances separately.

References

Abdi, H., Valentin, D., Edelman, B., O'Toole, A.J., 1995. More about the difference between men and women: Evidence from linear neural networks and the principal component approach. Perception 24, 539–562.
AR, 1998. The AR face database. Available from <http://rvl1.ecn.purdue.edu/~aleix/ar.html>.
Bern, 1995. Face database at the University of Bern. Available from <ftp://ftp.iam.unibe.ch/pub/Images/FaceImages/>.
BioID, 2001. The BioID face database. Available from <http://www.humanscan.de/support/downloads/facedb.php>.
Bishop, C.M., 1995. Neural Networks for Pattern Recognition. Clarendon Press, Oxford. p. 504.
Bromba, M., 2003. Biometrics FAQ. Available from <http://home.t-online.de/home/manfred.bromba/biofaqe.htm>.
Cekanavicius, V., Murauskas, G., 2002. Statistics and its Applications, Part 2. TEV, Vilnius. p. 272 (in Lithuanian).
Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Chapman and Hall, New York.
Graham, D.B., Allinson, G.N.M., 1998. Characterizing virtual eigensignatures for general purpose face recognition. In: Face Recognition: From Theory to Applications. NATO ASI Series F, Computer and Systems Sciences 163, 446–456.
Groß, M., 1994. Visual Computing: The Integration of Computer Graphics, Visual Perception and Imaging. Computer Graphics: Systems and Applications. Springer-Verlag.
Grudin, M.A., 1997. A compact multi-level model for the recognition of facial images. Ph.D. thesis, Liverpool John Moores University.
Hancock, P.J.B., Burton, A.M., Bruce, V., 1996. Face processing: Human perception and principal component analysis. Memory & Cognition 24, 26–40.
Kirby, M., Sirovich, L., 1990. Application of the Karhunen–Loeve expansion for the characterization of human faces. IEEE Trans. PAMI 12 (1), 103–108.
Manchester, 1998. Manchester face database. Available from <ftp://peipa.essex.ac.uk/ipa/pix/faces/manchester/>.
Martinez, A.M., Benavente, R., 1998. The AR face database. CVC TR 24.
MIT, 1989. MIT Media Laboratory face database. Available from <ftp://whitechapel.media.mit.edu/pub/images/>.
Moghaddam, B., Pentland, A., 1998. Beyond linear eigenspaces: Bayesian matching for face recognition. In: Face Recognition: From Theory to Applications. NATO ASI Series F, Computer and Systems Sciences 163, 230–243.
Moon, H., Phillips, P.J., 1998. Analysis of PCA-based face recognition algorithms. In: Bowyer, K.W., Phillips, P.J. (Eds.), Empirical Evaluation Techniques in Computer Vision. IEEE CS Press, pp. 57–71.
Navarrete, P., Ruiz-del-Solar, J., 2001. Eigenspace-based recognition of faces: Comparisons and a new approach. In: International Conference on Image Analysis and Processing ICIAP 2001, pp. 42–47.
Navarrete, P., Ruiz-del-Solar, J., 2002. Comparative study between different eigenspace-based approaches for face recognition.
ORL, 1992. The ORL face database at the AT&T (Olivetti) Research Laboratory. Available from <http://www.uk.research.att.com/facedatabase.html>.
Perlibakas, V., 2002. Image and geometry based face recognition. Inform. Technol. Control 1 (22), 73–79.
Phillips, P.J., Moon, H., Rauss, P., Rizvi, S.A., 1997. The FERET September 1996 database and evaluation procedure. In: 1st International Conference on Audio- and Video-based Biometric Person Authentication, Crans-Montana, Switzerland.
Phillips, P.J., Moon, H.J., Rizvi, S.A., Rauss, P.J., 2000. The FERET evaluation methodology for face recognition algorithms. IEEE Trans. PAMI 22 (10), 1090–1104.
Phillips, P.J., O'Toole, A.J., Cheng, Y., Ross, B., Wild, H.A., 1999. Assessing algorithms as computational models for human face recognition. NIST TR.
Samaria, F., Harter, A., 1994. Parameterisation of a stochastic model for human face identification. In: 2nd IEEE Workshop on Applications of Computer Vision.
SPSS, 2003. SPSS home page. <http://www.spss.com/>.
Swets, D.L., Pathak, Y., Weng, J.J., 1998. An image database system with support for traditional alphanumeric queries and content-based queries by example. Multimedia Tools Appl. (7), 181–212.
Turk, M., Pentland, A., 1991. Eigenfaces for recognition. J. Cognit. Neurosci. 3 (1), 71–86.
Umist, 1997. Umist face database. Available from <http://images.ee.umist.ac.uk/danny/database.html>.
Viisage, Inc., 2001. Viisage face recognition technology. Available from <http://www.viisage.com/technology.htm>.
Yale, 1997. The Yale face database. Available from <http://cvc.yale.edu/projects/yalefaces/yalefaces.html>.
Yambor, W.S., Draper, B.A., Beveridge, J.R., 2002. Analyzing PCA-based face recognition algorithms: Eigenvector selection and distance measures. In: Christensen, H., Phillips, J. (Eds.), Empirical Evaluation Methods in Computer Vision. World Scientific Press, Singapore.
Yilmaz, A., Gokmen, M., 2001. Eigenhill vs. eigenface and eigenedge. Pattern Recognition 34, 181–184.