Maximum recognition rate in different color spaces (CMU database, unit: %). (a) Independent processing, (b) Concatenated processing.
Light reflected from an object is multi-spectral, and human beings recognize the object by perceiving the color spectrum of the visible light (Wyszecki & Stiles, 2000). However, most face recognition algorithms have used only luminance information (Bartlett et al., 2002; Belhumeur et al., 1997; Etemad & Chellappa, 1997; Liu & Wechsler, 2000; Turk & Pentland, 1991a, 1991b; Wiskott et al., 1997; Yang, 2002). Many face recognition algorithms convert color input images to grayscale images, discarding their color information.
Only a limited number of face recognition methods made use of color information. Torres et al. proposed a global eigen scheme to make use of color components as additional channels (Torres et al., 1999). They reported color information could potentially improve performance of face recognition. Rajapakse et al. proposed a non-negative matrix factorization method to recognize color face images and showed that the color image recognition method is better than grayscale image recognition approaches (Rajapakse et al., 2004). Yang et al. presented the complex eigenface method that combines saturation and intensity components in the form of a complex number (Yang et al., 2006). This work shows that the multi-variable principal component analysis (PCA) method outperforms traditional grayscale eigenface methods. Jones III and Abbott showed that the optimal transformation of color space into monochrome form can improve the performance of face recognition (Jones III & Abbott, 2004), and Neagoe extended the optimal transformation to two-dimensional color space (Neagoe, 2006).
Color images include more visual cues than grayscale images, and the above-mentioned work showed the effectiveness of color information for face recognition. However, there is a lack of analysis and evaluation of recognition performance in various color spaces, even though a large number of face recognition algorithms (Bartlett et al., 2002; Belhumeur et al., 1997; Etemad & Chellappa, 1997; Liu & Wechsler, 2000; Turk & Pentland, 1991a, 1991b; Wiskott et al., 1997; Yang, 2002) have been presented.
This paper is an extended version of an earlier paper (Yoo et al., 2007), supplemented with an analysis of the recognition rate in various color spaces with two different approaches on the CMU PIE database (Sim et al., 2003; Zheng et al., 2005) and the color FERET database (Phillips et al., 1998; Phillips et al., 2000). Note that PCA-based algorithms are employed since they are the most fundamental and prevalent approaches. To use color information for PCA-based face recognition, we adopt two approaches: independent and concatenated processing. Recognition performance is evaluated with both approaches in the SV, RGB, YCg‘Cr‘, YUV, YCbCr, and YCgCb color spaces. Experimental results show that the use of color information gives a significant improvement in the recognition rate on the CMU and FERET databases, whose test sets contain a large number of face images with wide variation in illumination, facial expression, and aging.
The rest of the paper is organized as follows. In Section 2, a fundamental eigenface method is introduced. In Section 3, two schemes for color PCA-based face recognition are introduced and in Section 4, six color spaces for face recognition are described. Performance comparison of the face recognition for six color spaces is presented in Section 5. Finally, Section 6 gives conclusions and future work.
2. Eigenface face recognition
Turk and Pentland proposed eigenface-based face analysis, which is based on PCA, for efficient face recognition (Turk & Pentland, 1991a, 1991b). The algorithm consists of two phases: training and recognition. In the training phase of the eigenface method, eigenvectors are calculated from a large number of training faces. The computed eigenvectors are called eigenfaces. Then, faces are enrolled in a face recognition system by projecting them onto the eigenface space. In the recognition phase, an unknown input face is identified by measuring the distances between the projected coefficients of the input face and those of the enrolled faces in the database.
2.1. Eigenface space decomposition
The dimension of an image space is so high that it is often both impractical and inefficient to process images in their original dimensions. PCA optimally reduces the dimensionality of images by constructing the eigenface space, which is composed of eigenvectors (Turk & Pentland, 1991a, 1991b). The eigenface decomposition procedure is briefly described in the following. Let $\{x_1, x_2, \ldots, x_{M_t}\}$ be a training set of face images, where $x_i$ represents the ith training face image, expressed as an N×1 vector. Note that $M_t$ signifies the number of training images and N denotes the total number of pixels in an image. The mean vector µ of the dataset is defined by
$$\mu = \frac{1}{M_t} \sum_{i=1}^{M_t} x_i .$$
Then, the N×N covariance matrix C of the dataset is computed by
$$C = \frac{1}{M_t} \sum_{i=1}^{M_t} (x_i - \mu)(x_i - \mu)^T ,$$
where the superscript T denotes the transpose operation. The eigenvalues and the corresponding eigenvectors of C can be computed with the singular value decomposition (SVD). Let $\lambda_1, \lambda_2, \ldots, \lambda_N$ be the eigenvalues of C, sorted in decreasing order, and let $u_1, u_2, \ldots, u_N$ be the N eigenvectors of C, so that the ith eigenvalue $\lambda_i$ is associated with the ith eigenvector $u_i$. Eigenvectors with larger $\lambda_i$ are considered more dominant axes for representing the training face images. We choose the N' eigenvectors with the largest eigenvalues as the eigenface space for face recognition (N' << N).
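The decomposition described above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `train_eigenfaces`, the synthetic data, and the 1/M_t normalization of the covariance matrix are assumptions. The SVD is applied to the centered data matrix, whose right singular vectors are the eigenvectors of C, so the N×N matrix never has to be formed.

```python
import numpy as np

def train_eigenfaces(X, n_components):
    """Return (mean, eigenfaces, eigenvalues) for training images X.

    X has shape (M_t, N): one flattened face image per row.
    eigenfaces has shape (N, n_components), one eigenvector per column.
    """
    X = np.asarray(X, dtype=float)
    m_t = X.shape[0]
    mean = X.mean(axis=0)            # mean vector mu
    A = X - mean                     # centered data, shape (M_t, N)
    # C = A.T @ A / M_t (assumed normalization). The right singular
    # vectors of A are the eigenvectors of C, and the squared singular
    # values divided by M_t are its eigenvalues, in decreasing order.
    _, s, vt = np.linalg.svd(A, full_matrices=False)
    eigenvalues = s**2 / m_t
    return mean, vt[:n_components].T, eigenvalues[:n_components]

# Synthetic stand-in for real data: 20 "images" of 64 pixels each.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 64))
mean, eigenfaces, eigenvalues = train_eigenfaces(X_train, n_components=5)
```

Working on the centered data matrix rather than on C itself is a common design choice when N is much larger than M_t, as is typical for face images.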
2.2. Projection onto the eigenface space
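A face image is transformed by projecting it onto the eigenface space. Let $\{g_1, g_2, \ldots, g_{M_g}\}$ be a gallery set of face images, where $M_g$ is the size of the gallery set. Then, the weight $\omega_{ik}$ of $g_i$ with respect to the kth eigenface can be obtained by
$$\omega_{ik} = u_k^T (g_i - \mu), \quad k = 1, 2, \ldots, N',$$
and all the weights are represented by a weight vector,
$$\omega_i = [\omega_{i1}, \omega_{i2}, \ldots, \omega_{iN'}]^T .$$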
Given an unknown face image, we obtain its weight vector, $\omega = [\omega_1, \omega_2, \ldots, \omega_{N'}]^T$, by projecting it onto the eigenface space. Then, the input face image can be classified using the nearest neighbor classifier. The distances between the input face and the faces in the gallery are computed in the eigenface space. The Euclidean distance between the input face and the ith face image in the gallery set is defined by
$$d_i = \sqrt{\sum_{k=1}^{N'} (\omega_k - \omega_{ik})^2} ,$$
whereas the Mahalanobis distance is defined by
$$d_i = \sqrt{\sum_{k=1}^{N'} \frac{(\omega_k - \omega_{ik})^2}{\lambda_k}} .$$
The identity of the input face image is determined by finding the minimum distance under one of the above distance measures. The decision rule for face recognition can be expressed as
$$i_{matching} = \arg\min_{1 \le i \le M_g} d_i ,$$
where $i_{matching}$ is the index indicating the identified person.
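The projection and nearest-neighbor decision can be illustrated with a small pure-Python sketch. The function names and the toy data are hypothetical, and the Mahalanobis form used here (squared weight differences scaled by 1/λ_k) is an assumed, commonly used definition that may differ in detail from the one used in this work.

```python
import math

def project(x, mean, eigenfaces):
    """Weights of image x: w_k = u_k . (x - mean), one per eigenface u_k."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    return [sum(uj * cj for uj, cj in zip(u, centered)) for u in eigenfaces]

def euclidean(w, w_i):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w, w_i)))

def mahalanobis(w, w_i, eigenvalues):
    # Assumed form: each squared difference is scaled by 1 / lambda_k.
    return math.sqrt(sum((a - b) ** 2 / lam
                         for a, b, lam in zip(w, w_i, eigenvalues)))

def identify(w, gallery_weights, distance):
    """Index of the gallery entry closest to w (nearest-neighbor rule)."""
    distances = [distance(w, w_i) for w_i in gallery_weights]
    return min(range(len(distances)), key=distances.__getitem__)

# Toy data: two orthonormal "eigenfaces" in a 3-pixel image space.
mean = [1.0, 1.0, 1.0]
eigenfaces = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
eigenvalues = [4.0, 1.0]
gallery = [project(g, mean, eigenfaces)
           for g in ([3.0, 1.0, 1.0], [1.0, 2.0, 1.0])]
probe = project([2.5, 2.0, 1.0], mean, eigenfaces)

match_euclidean = identify(probe, gallery, euclidean)
match_mahalanobis = identify(
    probe, gallery, lambda a, b: mahalanobis(a, b, eigenvalues))
```

In this toy case the two measures select different gallery entries, because the Mahalanobis form down-weights differences along the high-variance first axis.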
3. Face recognition in different color spaces
In general, color images have three components or channels: red (R), green (G), and blue (B). To apply the eigenface method to color facial images, two methods are employed. One way is to combine outcomes of independent PCA for each color component (independent processing), whereas the other is to serially concatenate three color components into a single component (concatenated processing). In this section, we will describe these two approaches for face recognition in different color spaces.
3.1. Independent color face recognition
Each color component of a signal can be independently fed into an eigenface method, and the final decision is made with the distances from three independent eigenface modules (Torres et al., 1999). Fig. 1(a) shows the block diagram of the face recognition system (independent processing) for multi-channel face images. First, color space conversion is performed, i.e., the three components of the RGB color space, x R , x G , and x B, are converted into three other color components x C1, x C2, and x C3. At the second stage, the eigenface analysis is performed for each component independently. Then, the three distance vectors, d C1, d C2, and d C3, are consolidated with weighting factors, and a person in the database is finally identified.
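The consolidation step can be sketched as follows. The text states only that the three distance vectors are combined with weighting factors; the weighted sum, the function names, and the numeric values below are assumptions chosen for illustration.

```python
def fuse_distances(d_c1, d_c2, d_c3, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of three per-channel distance vectors.

    Each vector holds one distance per gallery face. The weighted sum
    is an assumed consolidation rule, not necessarily the authors'.
    """
    w1, w2, w3 = weights
    return [w1 * a + w2 * b + w3 * c for a, b, c in zip(d_c1, d_c2, d_c3)]

def identify(fused):
    """Index of the gallery face with the smallest fused distance."""
    return min(range(len(fused)), key=fused.__getitem__)

# Hypothetical distances from three eigenface modules, three gallery faces.
d_c1 = [0.9, 0.4, 0.7]
d_c2 = [0.2, 0.5, 0.6]
d_c3 = [0.8, 0.3, 0.9]
fused = fuse_distances(d_c1, d_c2, d_c3, weights=(0.5, 0.25, 0.25))
best = identify(fused)
```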
3.2. Concatenated color face recognition
A simple way to process a multi-channel signal is to concatenate the independent components into a single component (concatenated processing) and process it as if it were obtained from a single channel, as shown in Fig. 1(b). x R , x G , and x B are N×1 vectors, denoting the red, green, and blue components of an input face image, respectively, while x C is a 3N×1 vector, representing the serially combined input for a color eigenface system. d C is an M g ×1 vector that represents the distances between the input and the M g persons in a gallery. In this way, the multiple-component signal is converted into a single-channel signal: the number of components becomes one, whereas the length of the component is multiplied by the original number of components. Then, the eigenface method is applied to the combined signal.
In the case of color images which consist of three channels, (RR…), (GG…), and (BB…), the concatenated signal will be expressed as (RGBRGB…).
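As a minimal sketch of the concatenation, the following function interleaves three N-element channels into one 3N-element vector in the (RGBRGB…) order described above. The function name and sample values are hypothetical.

```python
def concatenate_channels(x_r, x_g, x_b):
    """Interleave three N-element channels into one 3N-element vector."""
    if not (len(x_r) == len(x_g) == len(x_b)):
        raise ValueError("all channels must have the same number of pixels")
    # zip() groups the three values of each pixel; flattening the groups
    # yields R1, G1, B1, R2, G2, B2, ...
    return [value for pixel in zip(x_r, x_g, x_b) for value in pixel]

x_c = concatenate_channels([1, 2], [10, 20], [100, 200])
```

The resulting vector can then be passed to a single eigenface module exactly like a grayscale image of 3N pixels.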
4. Color spaces for face recognition
Even though most digital image acquisition devices produce R, G, and B components, the RGB color space is converted into different color spaces depending on the application. For face recognition, the eigenface analysis in the RGB color space is known not to be effective, because the R, G, and B components are strongly correlated with each other. Some studies have also pointed out that the RGB domain is inadequate for face recognition (Torres et al., 1999). Instead of the RGB color space, other color spaces whose components are less correlated should be investigated for face recognition. In this work, performance evaluation is conducted on the SV, RGB, YCg‘Cr‘, YUV, YCbCr, and YCgCb color spaces.
The HSV and HSI color spaces are the well-known color spaces reflecting the human visual perception and they are composed of hue (H), saturation (S), and value (V)/ intensity (I) (Jack, 2001). The conversion equations are given by
where is computed by
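For illustration, the S and V components that form the SV space used in the experiments (HSV without hue) can be obtained with Python's standard `colorsys` module. This sketch assumes the common HSV definition implemented there (V as the maximum of R, G, B and S as the normalized difference between maximum and minimum), which may differ from the conversion equations used in this work. The helper name `rgb_to_sv` is hypothetical.

```python
import colorsys

def rgb_to_sv(r, g, b):
    """Return (S, V) for RGB values in [0, 1], discarding the hue."""
    _, s, v = colorsys.rgb_to_hsv(r, g, b)
    return s, v

pure_red = rgb_to_sv(1.0, 0.0, 0.0)   # fully saturated, full value
mid_gray = rgb_to_sv(0.5, 0.5, 0.5)   # no saturation, half value
```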
The YUV color space consisting of luminance (Y) and chrominance (U, V) components has been widely used for video transmission systems. The black-and-white video systems use only Y information and U and V components are added for color systems. RGB to YUV conversion can be performed by
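A minimal sketch of an RGB-to-YUV conversion is given below. It assumes the widely used BT.601-derived coefficients, with U and V expressed as scaled color differences; these values are an assumption and may differ from the conversion matrix used in this work.

```python
def rgb_to_yuv(r, g, b):
    """Convert RGB values in [0, 1] to (Y, U, V).

    Coefficients are the common BT.601-derived ones (assumed here).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # scaled blue difference
    v = 0.877 * (r - y)                     # scaled red difference
    return y, u, v

gray_yuv = rgb_to_yuv(0.5, 0.5, 0.5)   # chrominance vanishes for gray
red_yuv = rgb_to_yuv(1.0, 0.0, 0.0)
```

For any gray pixel (R = G = B) the chrominance components are zero, which reflects the separation of luminance from color information.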
The YCbCr color space is an alternative to the YUV color space by employing an offset value for each component. It is used for multiple coding standards. This color space is also known as an effective space for skin color segmentation (Chai & Ngan, 1999) and the conversion matrix is defined by
The YIQ color space is related to the YUV color space. ‘I’ stands for ‘in-phase’ and ‘Q’ for ‘quadrature’, referring to the quadrature amplitude modulation on which the color space is based. I and Q are computed from U and V by
The YCgCr color space was proposed for fast face segmentation (De Dios & Garcia, 2003). This color space produces another chrominance component Cg instead of Cb in YCbCr. Moreover, the YCg‘Cr‘ color space was derived by rotating the CgCr plane for face segmentation (De Dios & Garcia, 2004). YCgCr and YCg‘Cr‘ are defined by
The YCgCb color space was also proposed for face segmentation (Zhang & Shi, 2009). This color space produces another chrominance component Cb instead of Cr in YCbCr, expressed as,
Among the various color spaces described in this section, only the six color spaces that give high face recognition rates are presented in the next section.
5. Experimental results and discussions
5.1. Database and preprocessing
For the experiments, we used the CMU PIE and FERET databases. The CMU database was used to test face recognition performance under illumination variation because it contains significant changes in lighting conditions. The FERET database has smaller illumination variation than the CMU database; instead, it includes expression changes and aging.
To remove the effect of background and hair style variations, face regions were cropped to exclude the background and hair regions. All the face images in the CMU database were rescaled to 150×150 pixels, while those in the FERET database were rescaled to 50×50 pixels, and all images were rotated so that the line connecting the two eyes is horizontal. Then each color component of each transformed image was normalized to zero mean and unit variance.
The CMU database used in our experiments consists of three gallery sets (Subset-1, Subset-2, and Subset-3) and three probe sets (Subset-4, Subset-5, and Subset-6), as shown in Fig. 2. Each gallery set consists of 24 face images with various poses, while each probe set consists of 1632 face images with various illuminations. Another 412 face images were used as a training set to construct an eigenface space. Fig. 2 shows example face images for each data set from the CMU database used in our experiments. Fig. 2(a) shows example face images in the three gallery sets with no illumination change: from left to right, a frontal face image (Subset-1), a half right profile face image (Subset-2), and a full right profile face image (Subset-3). Figs. 2(b)-2(d) show the three probe sets with illumination variation: frontal face images (Subset-4), half right profile face images (Subset-5), and full right profile face images (Subset-6), with five face images in each probe set.
The FERET database used in our experiments consists of one gallery set (Fa) and three probe sets (Fb, Dup1, and Dup2). We used 194 images of set Fa as the gallery set of our system, while the three sets Fb, Dup1, and Dup2, which consist of 194, 269, and 150 face images, respectively, were used as probe sets. Another 386 face images were used as the training set to construct an eigenface space. Fig. 3 shows example faces of each data set in the FERET database used in our experiments. Fig. 3(a) shows an example face image in the gallery set with no facial expression. Figs. 3(b)-3(d) show the three example sets: face images with different facial expression (Fb), additional short-term aging (Dup1), and additional long-term aging (Dup2).
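The per-component normalization can be sketched as follows, using only the Python standard library. The function name is hypothetical, the use of the population standard deviation is an assumption, and the guard for a constant component is an added safety check not described in the text.

```python
import statistics

def normalize_component(pixels):
    """Scale one color component to zero mean and unit variance."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)     # population standard deviation
    if std == 0:
        # Constant component: variance cannot be scaled (assumed guard).
        return [0.0] * len(pixels)
    return [(p - mean) / std for p in pixels]

normalized = normalize_component([10, 20, 30, 40])
```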
In this section, the PCA-based color face recognition system with various color spaces including SV, RGB, YCg‘Cr‘, YUV, YCbCr, and YCgCb is investigated using CMU database and FERET database. We compare recognition performance of independent and concatenated processing with that of the conventional eigenface method employing only luminance information. Note that luminance component images are generated with two different conversions, i.e., Y = 0.3R + 0.59G + 0.11B and I = (R + G + B) / 3.
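The two luminance conversions quoted above can be written directly as small functions. The function names and the sample pixel are hypothetical; the coefficients are those given in the text.

```python
def luminance_y(r, g, b):
    """Weighted luminance: Y = 0.3R + 0.59G + 0.11B."""
    return 0.3 * r + 0.59 * g + 0.11 * b

def intensity_i(r, g, b):
    """Unweighted intensity: I = (R + G + B) / 3."""
    return (r + g + b) / 3

pixel = (0.2, 0.6, 0.1)
y_value = luminance_y(*pixel)
i_value = intensity_i(*pixel)
```

For a greenish pixel such as this one, Y is larger than I because the weighted form emphasizes the green component.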
Figs. 4 and 5 illustrate the recognition rate of probe sets in CMU database and FERET database, respectively, in different color spaces with independent and concatenated processing when the number of features is set from 10 to 200. From all the graphs shown in Figs. 4 and 5, it is noted that the more features we use, the higher the recognition rate is. The recognition rate becomes saturated when the number of features is large enough, i.e., 180.
The recognition rates on the saturation range are influenced by color space and data set used for the probe set. Tables 1 and 2 show the maximum recognition rates in each color space for probe sets in CMU database and FERET database, respectively.
5.2. Different color spaces (CMU database)
The performance of face recognition under various lighting conditions is presented in this subsection. The performance of the PCA-based face recognition algorithm in six different color spaces is evaluated with independent and concatenated processing for CMU database images. The performance is compared in terms of the recognition rate as a function of the number of features (Fig. 4) and in terms of the maximum recognition rate (Table 1).
For probe set 1 consisting of frontal face images with illumination variations, the best performance is observed in the SV color space, with independent and concatenated processing, as shown in Fig. 4 (probe set 1). For probe set 2 consisting of half profile face images with illumination variations, the recognition rate in the SV color space, with independent and concatenated processing, also gives the best performance, as shown in Fig. 4 (probe set 2). For probe set 3 consisting of full profile face images with illumination variations, the recognition rate in the SV color space with independent processing also gives the best performance, as shown in Fig. 4(a) (probe set 3), whereas the recognition rate in the RGB color space with concatenated processing gives the best performance, as shown in Fig. 4(b) (probe set 3).
As shown in Table 1(a) with independent processing, for probe set 1, the maximum recognition rate in the SV color space is 18.3% and 22.3% higher than that in the RGB and YCg‘Cr‘ color spaces, respectively, whereas for probe set 2, 17.1% and 22.8% higher, respectively, and for probe set 3, 5.5% and 11.2% higher, respectively.
As shown in Table 1(b) with concatenated processing, for probe set 1, the maximum recognition rate in the SV color space is 16.8% and 26.9% higher than that in the RGB and YCbCr color spaces, respectively, while for probe set 2, 13.3% and 19% higher, respectively, and for probe set 3, 2.8% lower and 0.3% higher, respectively.
Not using H component in the HSV color space improves the recognition rate, as shown in Fig. 4 and Table 1. Because S component is not sensitive to illumination change, robustness to illumination variation can be observed. Various experiments show that the recognition rate in all the color spaces with independent processing is higher than that with concatenated processing, as shown in Fig. 4 and Table 1.
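Table 1(a). Independent processing (CMU database, maximum recognition rate, %)
| |SV|RGB|YCg‘Cr‘|YUV|YCbCr|YCgCb|
|Probe set 1|90.5|72.2|68.2|66.1|65.7|56.9|
|Probe set 2|81.7|64.6|58.9|57.3|56.9|55.2|
|Probe set 3|71.6|66.1|60.4|60.7|60.2|51.5|

Table 1(b). Concatenated processing (CMU database, maximum recognition rate, %)
| |SV|RGB|YCg‘Cr‘|YUV|YCbCr|YCgCb|
|Probe set 1|85.2|68.4|53.5|56.0|58.3|56.8|
|Probe set 2|73.0|59.7|43.7|48.7|54.0|48.7|
|Probe set 3|59.8|62.6|52.7|57.1|59.5|31.3|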
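The differences quoted in the discussion of Table 1 can be reproduced with a few lines of arithmetic. In this sketch the table values are taken from Table 1, while the column order (SV, RGB, YCg‘Cr‘, YUV, YCbCr, YCgCb) is inferred from the differences stated in the text. The helper name `gain` is hypothetical.

```python
COLOR_SPACES = ("SV", "RGB", "YCg'Cr'", "YUV", "YCbCr", "YCgCb")

# Maximum recognition rates (%) per probe set, CMU database.
TABLE_1A = {  # independent processing
    1: (90.5, 72.2, 68.2, 66.1, 65.7, 56.9),
    2: (81.7, 64.6, 58.9, 57.3, 56.9, 55.2),
    3: (71.6, 66.1, 60.4, 60.7, 60.2, 51.5),
}
TABLE_1B = {  # concatenated processing
    1: (85.2, 68.4, 53.5, 56.0, 58.3, 56.8),
    2: (73.0, 59.7, 43.7, 48.7, 54.0, 48.7),
    3: (59.8, 62.6, 52.7, 57.1, 59.5, 31.3),
}

def gain(table, probe, space, reference):
    """Difference in percentage points between two color spaces."""
    row = table[probe]
    diff = row[COLOR_SPACES.index(space)] - row[COLOR_SPACES.index(reference)]
    return round(diff, 1)

sv_vs_rgb_independent = gain(TABLE_1A, 1, "SV", "RGB")
sv_vs_rgb_concatenated = gain(TABLE_1B, 3, "SV", "RGB")
# Independent processing is at least as good in every cell of Table 1.
independent_never_worse = all(
    a >= b for p in TABLE_1A for a, b in zip(TABLE_1A[p], TABLE_1B[p]))
```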
5.3. Different color spaces (FERET database)
The performance of face recognition in various expressions and aging is shown in this subsection. The performance of the PCA-based face recognition algorithm in six different color spaces is evaluated, with independent and concatenated processing for FERET database images. The performance is compared in terms of the recognition rate as a function of the number of features (Fig. 5) and in terms of the maximum recognition rate (Table 2).
For probe set 1 with facial expression variations, the best performance is observed in the YUV/YCbCr color spaces with independent processing, as shown in Fig. 5(a) (probe set 1). The recognition rate in the YUV space gives the best performance with concatenated processing, as shown in Fig. 5(b) (probe set 1). Fig. 5 (probe set 2) shows the recognition rate of face images with short-term aging as well as facial expression variations. As shown in Fig. 5 (probe set 2), the recognition rate in the YCg‘Cr‘ color space, with independent and concatenated processing, gives the best performance. For probe set 3, consisting of face images with long-term aging as well as facial expression variation, the recognition rate in the YCg‘Cr‘ color space, with independent and concatenated processing, also gives the best performance, as shown in Fig. 5 (probe set 3).

Table 2(a). Independent processing (FERET database, maximum recognition rate, %)
| |SV|RGB|YCg‘Cr‘|YUV|YCbCr|YCgCb|
|Probe set 1|91.2|85.6|88.1|92.3|92.3|80.9|
|Probe set 2|64.3|59.5|69.5|65.4|65.1|64.7|
|Probe set 3|56.7|51.3|62.0|58.0|58.0|59.3|

Table 2(b). Concatenated processing (FERET database, maximum recognition rate, %)
| |SV|RGB|YCg‘Cr‘|YUV|YCbCr|YCgCb|
|Probe set 1|88.7|86.1|85.1|89.7|84.0|80.9|
|Probe set 2|60.6|52.0|66.9|62.8|56.1|63.2|
|Probe set 3|51.3|58.7|59.3|50.0|48.7|35.3|
As shown in Table 2(a), for probe set 1 with independent processing, the maximum recognition rate in the YUV/YCbCr color spaces is 1.1% and 4.2% higher than that in the SV and YCg‘Cr‘ color spaces, respectively. For probe set 2, the maximum recognition rate in the YCg‘Cr‘ color space is 4.1% and 4.4% higher than that in the YUV and YCbCr color spaces, respectively. For probe set 3, the maximum recognition rate in the YCg‘Cr‘ color space is 2.7% and 4% higher than that in the YCgCb and YUV/YCbCr color spaces, respectively.
As shown in Table 2(b) with concatenated processing, for probe set 1, the maximum recognition rate in the YUV color space is 1% and 3.6% higher than that in the SV and RGB color spaces, respectively. For probe set 2, the maximum recognition rate in the YCg‘Cr‘ space is 3.7% and 4.1% higher than that in the YCgCb and YUV color spaces, respectively. For probe set 3, the maximum recognition rate in the YCg‘Cr‘ color space is 0.6% and 8% higher than that in the RGB and SV color spaces, respectively.
Note that the Cg‘Cr‘ components are more robust to illumination variations and to short- and long-term aging than the CbCr components, in the sense that the YCg‘Cr‘ color space is more effective than the YCbCr color space for probe sets 2 and 3, which consist of face images with short- and long-term aging, respectively, as well as illumination changes.
5.4. Color space vs. gray space
Fig. 6 shows the importance of color information for face recognition. The performance of face recognition with color information is significantly improved compared with that using only grayscale information. We used Subset-4 in the CMU database and Fb in the FERET database as probe sets (independent processing) and compared face recognition performance in color spaces and gray spaces. The recognition rate in the SV color space is approximately 20% and 5% higher than that in the gray space (luminance space, i.e., Y and I) for CMU and FERET database images, respectively. Note that the performance of the RGB color space is similar to that of the luminance space. The use of RGB components gives little benefit in generating distinguishable features for effective face recognition, since all three components of the RGB color space are strongly correlated with each other. On the other hand, the SV color space is effective because its components are less correlated with each other through the separation of luminance and chrominance components.
6. Conclusions
In this paper, we evaluate PCA-based face recognition algorithms in various color spaces and analyze their performance in terms of the recognition rate. Experimental results with a large number of face images (CMU and FERET databases) show that color information is beneficial for face recognition and that the SV, YCbCr, and YCg‘Cr‘ color spaces are the most appropriate spaces for face recognition. The SV color space is shown to be robust to illumination variation, the YCbCr color space to facial expression variation, and the YCg‘Cr‘ color space to aging. From the experiments, we found that the maximum recognition rate with independent processing is higher than that with concatenated processing in all the color spaces on the CMU database and in most cases on the FERET database (the exceptions in Table 2 being the RGB color space for probe sets 1 and 3). Further work will focus on the analysis of inter-color correlation and investigation of illumination-invariant color features for effective face recognition.
This work was supported in part by Brain Korea 21 Project. Portions of the research in this paper use CMU database of facial images collected by Carnegie Mellon University and FERET database of facial images collected under FERET program.