
From "State of the art in Biometrics", book edited by Jucheng Yang and Loris Nanni, ISBN 978-953-307-489-4. Published: July 27, 2011 under CC BY-NC-SA 3.0 license. © The Author(s).

Chapter 11

Gabor-Based RCM Features for Ear Recognition

By Ali Pour Yazdanpanah and Karim Faez
DOI: 10.5772/17076


1. Introduction

Ear biometrics has received comparatively little attention next to the more popular techniques of face, eye, or fingerprint recognition. Nevertheless, the ear as a biometric is no longer in its infancy, and it has shown encouraging progress so far. Ears have played an important role in forensic science for many years, especially in the United States, where an ear classification system based on manual measurements was developed by Iannarelli (Iannarelli, 1989). In recent years, biometric recognition technology has been widely investigated and developed. The human ear, as a new biometric, not only extends existing biometrics but also has characteristics of its own. Iannarelli has shown that the human ear is one of the representative human biometrics, with uniqueness and stability (Iannarelli, 1989). Since the ear was first measured as a major feature for human identification by Alphonse Bertillon in 1890, so-called ear prints have been used in forensic science for a long time (Bertillon, 1890). Ears have certain advantages over the more established biometrics; as Bertillon pointed out, they have a rich and stable structure that is not affected by changes in age, skin color, cosmetics, or hairstyle. The ear is also unaffected by changes in facial expression, and it is firmly fixed in the middle of the side of the head, so that the background is more predictable than in face recognition, which usually requires the face to be captured against a controlled background. The ear is large compared with the iris, retina, and fingerprint, and is therefore more easily captured at a distance.

We present the Gabor-based region covariance matrix as an efficient feature for ear recognition. In this method, we construct a region covariance matrix from Gabor features, the illumination intensity component, and pixel location, and use it as an efficient and robust ear descriptor for recognizing people. The feasibility of the proposed method has been tested on ear recognition using two USTB databases, with a total of 488 ear images corresponding to 137 persons. The effectiveness of the proposed method is shown by comparing its performance against several popular ear recognition methods.

This chapter is organized as follows. In section 2, related works are presented. In section 3, the region covariance matrix (RCM) and the method for fast RCM computation are presented. In section 4, the proposed method is presented in detail. In section 5, the ear image databases are introduced. In section 6, experimental results are shown and discussed. The chapter concludes in section 7.

2. Related works

Ear recognition depends heavily on the particular choice of features used in the ear biometric system. Principal Component Analysis (PCA) is a classical statistical feature extraction method. The PCA transformation (Xu, 1994; Abdi & Williams, 2010), which is commonly used in biometric systems, is based on second-order statistics. Second-order methods use the information contained in the covariance matrix of the data to find a description with minimum reconstruction error. It is assumed that all the information of zero-mean Gaussian variables is contained in the covariance matrix. Independent Component Analysis (ICA) is another popular feature extraction method. ICA (Comon, 1994; Stone, 2005) is based on higher-order statistics of the data and provides a linear representation that minimizes the statistical dependencies among its components. Dependencies among higher-order features can be eliminated by isolating independent components. ICA is a statistical method for transforming an observed multidimensional random vector into components that are as statistically independent from each other as possible. Its ability to handle higher-order statistics in addition to second-order statistics is useful for achieving an effective separation of the feature space for given data. The higher-order features are capable of capturing invariant features of natural images. In (Zhang & Mu, 2008), PCA and ICA methods with an RBFN classifier are presented: PCA or ICA is used to extract features, and the RBFN is used as the classifier. In this chapter, these two methods are denoted by PCA+RBFN and ICA+RBFN, respectively.

Hmax+SVM is another popular method for ear recognition. The Hmax model is motivated by a quantitative model of the visual cortex (Riesenhuber & Poggio, 1999), and SVMs are classifiers that have demonstrated high generalization capabilities in many different tasks, including object recognition. The method of (Yaqubi et al., 2008) combines these two techniques for robust ear recognition. With Hmax, a new set of features is introduced for human identification; each element of this set is a complex feature obtained by combining position- and scale-tolerant edge detectors over neighboring positions and multiple orientations.

Another feature extraction method for ear recognition, called Local Similarity Binary Pattern (LSBP), is presented by (Guo & Xu, 2008). LSBP considers both connectivity and similarity information in its representation. The LSBP histogram captures connectivity and similarity information, such as lines and connected areas. To make the representation more efficient, the histograms encode not only local information but also spatial information, obtained by image decomposition. Because of the special characteristics of ear images, the connectivity and similarity of intensity play a significant role in ear recognition, and both can be encoded by LSBP.

3. RCM

3.1. Covariance matrix as a region descriptor

The covariance matrix is a symmetric matrix. Its diagonal entries represent the variance of each feature, and its off-diagonal entries represent the correlations between features. Using covariance matrices as region descriptors has many advantages. The covariance matrix offers a natural way of fusing multiple features without normalizing them or using blending weights. It embodies the information embedded within histograms as well as the information that can be derived from appearance models. In general, a single covariance matrix extracted from a region is enough to match that region in different views and poses. The noise corrupting individual samples is largely filtered out by the averaging performed during covariance computation. Because the covariance matrix of any region has the same size, any two regions can be compared without being restricted to a constant window size. If raw features such as image gradients and orientations are extracted according to the scale difference, the covariance matrix is also scale invariant over regions in different images.

The covariance matrix can also be invariant to rotations. However, if information regarding the orientation of the points is embedded within the feature vector, it is possible to detect rotational discrepancies. The covariance is also invariant to changes of the mean, such as an identical shift of all color values. This is an advantageous property when objects are tracked under different illumination conditions. The region covariance matrix (RCM) was presented by (Tuzel et al., 2006). An RCM is a covariance matrix of image statistics computed within a region.
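The invariance to mean changes can be checked numerically. The following is a minimal sketch in Python with NumPy; the random array stands in for real image features, and the array names and the offset value are illustrative assumptions rather than part of the method.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical feature vectors: 200 pixels with 3 features each (e.g. x, y, intensity)
features = rng.normal(size=(200, 3))

# Add a constant offset to the third feature, as a uniform brightness change would
shifted = features.copy()
shifted[:, 2] += 0.25

cov_original = np.cov(features, rowvar=False)
cov_shifted = np.cov(shifted, rowvar=False)

# The two covariance matrices agree up to floating-point rounding
shift_invariant = np.allclose(cov_original, cov_shifted)
print(shift_invariant)
```

Because the mean is subtracted before the products are accumulated, a constant offset in any feature cancels out.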

We define $I$ as a one-dimensional, unit-normalized intensity image. The method can be generalized to other types of images, such as a 2D intensity image, a 3D color image, or a multispectral image. Let $F$ be the $W \times H \times d$ dimensional feature image extracted from $I$:

$$F(x,y) = \phi(I,x,y) \tag{1}$$

where the function $\phi$ can be any mapping such as color, image gradients ($I_x$, $I_{xx}$, ...), edge magnitude, edge orientation, filter responses, etc. This pixel-wise mapping list can be extended by including higher-order derivatives, radial distances, texture scores, angles, and, when video data is available, temporal frame differences.

For a given rectangular window $R$, let $\{f_k\}_{k=1..n}$ be the $d$-dimensional feature vectors inside $R$.

Each feature vector $f_k$ corresponds to a pixel $(x, y)$ within that window. Since we extract the mutual covariance of the features, the windows can actually be of any shape, not necessarily rectangular. Basically, covariance is a statistical measure of how much two variables vary together. It can be negative, positive, or zero, depending on the relation between the two features (Forsyth & Ponce, 2002). If the features increase together, the covariance is positive. If one feature increases while the other decreases, the covariance is negative, and if the two features are independent, the covariance is zero. We represent each window $R$ by the covariance matrix of its features:

$$C_R = \frac{1}{n-1} \sum_{k=1}^{n} (f_k - \mu)(f_k - \mu)^T \tag{2}$$

where $\mu$ is the mean vector of the corresponding features for the points within the region $R$. The diagonal coefficients represent the variance of the corresponding features. For example, the $j$th diagonal element represents the variance of the $j$th feature. The off-diagonal elements represent the covariance between two different features.
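The region covariance described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, assuming the unbiased 1/(n-1) normalization, a feature image stored as an H x W x d array, and a three-feature mapping (x, y, intensity) chosen only to keep the example small; the function names are hypothetical.

```python
import numpy as np

def simple_features(I):
    """Illustrative mapping: stack x, y and intensity into an H x W x 3 array."""
    H, W = I.shape
    ys, xs = np.mgrid[0:H, 0:W]
    return np.stack([xs, ys, I], axis=-1).astype(float)

def region_covariance(F, x0, y0, x1, y1):
    """Covariance of the d-dimensional feature vectors inside a window.

    The window covers rows y0..y1-1 and columns x0..x1-1, a half-open
    convention chosen for this sketch.
    """
    f = F[y0:y1, x0:x1, :].reshape(-1, F.shape[2])   # n x d matrix of feature vectors
    mu = f.mean(axis=0)                              # mean vector of the region
    centered = f - mu
    return centered.T @ centered / (f.shape[0] - 1)

rng = np.random.default_rng(1)
I = rng.random((32, 24))            # stand-in for a normalized intensity image
F = simple_features(I)
C = region_covariance(F, 4, 8, 20, 30)
print(C.shape)
```

The result is a d x d matrix regardless of the window size, which is what allows regions of different sizes to be compared.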

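The feature vectors can be constructed using different types of mapping functions, such as pixel coordinates, color intensity, gradients, etc.:

$$f_k = [\,x \;\; y \;\; I(x,y) \;\; I_x(x,y) \;\; \ldots\,] \tag{3}$$

or they can be constructed using polar coordinates:

$$f_k^r = [\,r(x',y') \;\; \theta(x',y') \;\; I(x,y) \;\; I_x(x,y) \;\; \ldots\,] \tag{4}$$

where

$$(x', y') = (x - x_0,\; y - y_0) \tag{5}$$

are the relative coordinates with respect to the window center $(x_0, y_0)$,

$$r(x',y') = \sqrt{x'^2 + y'^2} \tag{6}$$

is the distance from $(x_0, y_0)$, and

$$\theta(x',y') = \arctan\left(\frac{y'}{x'}\right) \tag{7}$$

is the orientation component. For the human detection problem, (Tuzel et al., 2007) introduced the mapping function

$$\phi(I,x,y) = \left[\,x \;\; y \;\; |I_x| \;\; |I_y| \;\; \sqrt{I_x^2 + I_y^2} \;\; |I_{xx}| \;\; |I_{yy}| \;\; \arctan\frac{|I_x|}{|I_y|}\,\right]^T \tag{8}$$

where $|\cdot|$ denotes the absolute value operator. First- and second-order gradients and pixel location are used in this function to construct the RCM. Another feature mapping function, introduced by (Tuzel et al., 2006) for gray-level images, is

$$\phi(I,x,y) = [\,x \;\; y \;\; I(x,y) \;\; |I_x| \;\; |I_y| \;\; |I_{xx}| \;\; |I_{yy}|\,]^T \tag{9}$$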
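Three other feature mapping functions are introduced by (Tuzel et al., 2007; Pang et al., 2008):

$$\phi(I,x,y) = \left[\,x \;\; y \;\; |I_x| \;\; |I_y| \;\; \sqrt{I_x^2 + I_y^2} \;\; |I_{xx}| \;\; |I_{yy}| \;\; \arctan\frac{|I_x|}{|I_y|}\,\right]^T \tag{10}$$

$$\phi(I,x,y) = [\,x \;\; y \;\; |I_x| \;\; |I_y| \;\; |I_{xx}| \;\; |I_{yy}|\,]^T \tag{11}$$

$$\phi(I,x,y) = \left[\,x \;\; y \;\; I(x,y) \;\; |I_x| \;\; |I_y| \;\; \sqrt{I_x^2 + I_y^2} \;\; |I_{xx}| \;\; |I_{yy}| \;\; \arctan\frac{|I_x|}{|I_y|}\,\right]^T \tag{12}$$

Figure 1 shows a sample covariance matrix for a given image.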


Figure 1. Covariance matrix provided for these seven features

Despite these advantages of the RCM, computing the covariance matrices for all rectangular regions within an image is computationally prohibitive with conventional methods. Several applications, such as detection, segmentation, and recognition, require the computation and comparison of covariance matrices of many regions. Conventional methods disregard the fact that these regions overlap heavily, and that the statistical moments extracted for the overlapping areas can be reused to speed up the computation.

3.2. Fast covariance computation using integral images

Instead of repeating the summation for each possible window, we can, as described by (Veksler, 2003; Porikli, 2005), precompute a cumulative image in time linear in the number of pixels and then obtain the sum of the values within any rectangular window with a constant number of operations. First, we define the cumulative image function. Each element of this function is equal to the sum of all values to the left of and above the pixel, including the value of the pixel itself. The cumulative image can be calculated for every pixel with four arithmetic operations per pixel. The sum of the image function over a rectangle can then be computed with another four arithmetic operations, with some modifications at the border. Therefore, after a linear amount of precomputation, the sum of the image function over any rectangle can be calculated in constant time.
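The cumulative image and the four-corner rectangle sum can be sketched as follows. This is a minimal Python/NumPy illustration; the function names and the inclusive corner convention are assumptions of the sketch, and the border cases are handled with explicit checks.

```python
import numpy as np

def integral_image(img):
    # Entry (y, x) holds the sum of img[0..y, 0..x], both bounds inclusive
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, x0, y0, x1, y1):
    """Sum of the image over rows y0..y1 and columns x0..x1 (inclusive)."""
    total = ii[y1, x1]
    if y0 > 0:
        total -= ii[y0 - 1, x1]          # remove the rows above the rectangle
    if x0 > 0:
        total -= ii[y1, x0 - 1]          # remove the columns to its left
    if y0 > 0 and x0 > 0:
        total += ii[y0 - 1, x0 - 1]      # add back the corner removed twice
    return total

img = np.arange(20, dtype=np.int64).reshape(4, 5)
ii = integral_image(img)
fast = rect_sum(ii, 1, 1, 3, 2)
direct = img[1:3, 1:4].sum()
print(fast, direct)   # prints "57 57"
```

Once `ii` has been built, each rectangle sum costs at most three additions or subtractions, independent of the rectangle size.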

Integral images are intermediate image representations used for the fast calculation of region sums (Viola & Jones, 2001). Porikli (Porikli, 2005) later extended this idea to the fast calculation of region covariances. He showed that the covariances can be obtained with a few arithmetic operations from a series of integral images.

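We can rewrite the $(i,j)$-th element of the covariance matrix introduced in (2) as

$$C_R(i,j) = \frac{1}{n-1} \sum_{k=1}^{n} \big(f_k(i) - \mu(i)\big)\big(f_k(j) - \mu(j)\big) \tag{13}$$

By expanding the mean, we have

$$C_R(i,j) = \frac{1}{n-1} \left[ \sum_{k=1}^{n} f_k(i) f_k(j) - \frac{1}{n} \sum_{k=1}^{n} f_k(i) \sum_{k=1}^{n} f_k(j) \right] \tag{14}$$

To compute the covariance of a rectangular region $R$, we need the sum of each feature dimension, $f(i)$, $i = 1..d$, as well as the sum of the products of any two feature dimensions, $f(i)f(j)$, $i,j = 1..d$. These sums can be computed with a few arithmetic operations from a series of integral images.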

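We construct an integral image for each feature dimension $f(i)$ and for the product of any two feature dimensions $f(i)f(j)$, which gives $d + d^2$ integral images in total. Define $P$ as the $W \times H \times d$ tensor of the integral images along each feature dimension:

$$P(x', y', i) = \sum_{x \le x',\, y \le y'} F(x, y, i), \quad i = 1 \ldots d \tag{15}$$

and define $Q$ as the $W \times H \times d \times d$ tensor of the second-order integral images:

$$Q(x', y', i, j) = \sum_{x \le x',\, y \le y'} F(x, y, i)\, F(x, y, j), \quad i, j = 1 \ldots d \tag{16}$$

Let $P_{x,y}$ be the $d$-dimensional vector and $Q_{x,y}$ the $d \times d$ matrix

$$P_{x,y} = [\,P(x,y,1) \;\; \ldots \;\; P(x,y,d)\,]^T, \quad Q_{x,y} = \begin{bmatrix} Q(x,y,1,1) & \cdots & Q(x,y,1,d) \\ \vdots & \ddots & \vdots \\ Q(x,y,d,1) & \cdots & Q(x,y,d,d) \end{bmatrix} \tag{17}$$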

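Let $R(x', y'; x'', y'')$ be the rectangular region shown in Figure 2, with upper-left corner $(x', y')$ and lower-right corner $(x'', y'')$. The covariance of the region bounded by $(1,1)$ and $(x', y')$ is

$$C_{R(1,1;x',y')} = \frac{1}{n-1} \left[ Q_{x',y'} - \frac{1}{n} P_{x',y'} P_{x',y'}^T \right] \tag{18}$$

where $n = x' \cdot y'$. In the same way, the covariance of the region $R(x', y'; x'', y'')$ is

$$C_{R(x',y';x'',y'')} = \frac{1}{n-1} \Big[ Q_{x'',y''} + Q_{x',y'} - Q_{x'',y'} - Q_{x',y''} - \frac{1}{n} \big(P_{x'',y''} + P_{x',y'} - P_{x',y''} - P_{x'',y'}\big) \big(P_{x'',y''} + P_{x',y'} - P_{x',y''} - P_{x'',y'}\big)^T \Big] \tag{19}$$

where $n = (x'' - x') \cdot (y'' - y')$. Therefore, by using the integral images, the covariance of any rectangular region can be computed in $O(d^2)$ time. In our method, we use integral-image-based covariance computation as a fast approach for computing the RCM of the given features.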
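The integral-image computation of a region covariance can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the feature image is stored as H x W x d, and a zero row and column are prepended to the integral tensors so that the four-corner lookup needs no border handling.

```python
import numpy as np

def build_integral_tensors(F):
    """First- and second-order integral images of an H x W x d feature image."""
    H, W, d = F.shape
    P = np.zeros((H + 1, W + 1, d))
    Q = np.zeros((H + 1, W + 1, d, d))
    P[1:, 1:] = F.cumsum(axis=0).cumsum(axis=1)
    outer = F[:, :, :, None] * F[:, :, None, :]      # products f(i) f(j) at every pixel
    Q[1:, 1:] = outer.cumsum(axis=0).cumsum(axis=1)
    return P, Q

def fast_region_covariance(P, Q, x0, y0, x1, y1):
    """Covariance over rows y0..y1-1 and columns x0..x1-1 in O(d^2) time."""
    n = (x1 - x0) * (y1 - y0)
    p = P[y1, x1] + P[y0, x0] - P[y0, x1] - P[y1, x0]   # sum of the feature vectors
    q = Q[y1, x1] + Q[y0, x0] - Q[y0, x1] - Q[y1, x0]   # sum of their outer products
    return (q - np.outer(p, p) / n) / (n - 1)

rng = np.random.default_rng(2)
F = rng.random((16, 12, 4))                  # stand-in feature image, d = 4
P, Q = build_integral_tensors(F)
C_fast = fast_region_covariance(P, Q, 2, 3, 10, 15)
C_ref = np.cov(F[3:15, 2:10].reshape(-1, 4), rowvar=False)
print(np.allclose(C_fast, C_ref))
```

The tensors are built once per image; every subsequent window then costs a fixed number of d-vector and d x d matrix operations, whatever its size.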


Figure 2. Rectangular region R

3.3. Covariance matrix distance calculation

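Since RCMs lie on a connected Riemannian manifold rather than in a Euclidean space, the Euclidean distance is not appropriate for our features; for instance, the space of covariance matrices is not closed under multiplication by negative scalars. We use the distance measure presented in (Forstner & Moonen, 1999) to compute the dissimilarity of two covariance matrices:

$$\rho(C_1, C_2) = \sqrt{\sum_{i=1}^{d} \ln^2 \lambda_i(C_1, C_2)} \tag{20}$$

where $\lambda_1(C_1,C_2), \ldots, \lambda_d(C_1,C_2)$ are the generalized eigenvalues of $C_1$ and $C_2$, computed from

$$\lambda_i C_1 x_i - C_2 x_i = 0, \quad i = 1 \ldots d \tag{21}$$

where $x_i \neq 0$ are the generalized eigenvectors.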
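A distance of this kind, the square root of the summed squared logarithms of the generalized eigenvalues, can be sketched in Python with NumPy. The sketch assumes both matrices are symmetric positive definite and obtains the generalized eigenvalues as the eigenvalues of the product of the inverse of the first matrix with the second; the function name and the test matrices are illustrative.

```python
import numpy as np

def covariance_distance(C1, C2):
    """Dissimilarity of two symmetric positive definite matrices."""
    # Generalized eigenvalues: lambda * C1 x = C2 x, i.e. eigenvalues of C1^{-1} C2
    lam = np.linalg.eigvals(np.linalg.solve(C1, C2)).real
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3))
C1 = A @ A.T + 3 * np.eye(3)      # random symmetric positive definite matrices
C2 = B @ B.T + 3 * np.eye(3)

d_self = covariance_distance(C1, C1)        # close to zero
d_scaled = covariance_distance(C1, 4 * C1)  # all eigenvalues equal 4
d12 = covariance_distance(C1, C2)
d21 = covariance_distance(C2, C1)
print(d_self, d_scaled, d12, d21)
```

Swapping the arguments inverts every eigenvalue, which only flips the sign of each logarithm, so the measure is symmetric. Scaling a matrix by 4 gives a distance of sqrt(3) * ln 4 for d = 3.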

4. Gabor-based region covariance matrix

4.1. Gabor features extraction

The RCM-based methods with feature mapping functions (9) and (10) have had great success in people detection, object tracking, and texture classification (Tuzel et al., 2006; Tuzel et al., 2007). However, our experimental results showed that the recognition rates of these methods are very low when they are applied to ear recognition, which is a very difficult task from the classification point of view. To obtain better results in ear recognition, we construct effective features for the RCM from Gabor features, pixel location, and the illumination intensity component. The biological relevance and computational properties of Gabor wavelets for image analysis have been investigated in (Jones & Palmer, 1987).

The Gabor features of ear images are robust against illumination changes. Gabor representation facilitates recognition without correspondence, because it captures the local structure corresponding to spatial frequency (scale), spatial localization, and orientation selectivity (Schiele & Crowley, 2000).

Daugman (Daugman, 1985) modeled the responses of the visual cortex by Gabor functions, because they are similar to the receptive field profiles of mammalian cortical simple cells. He extended Gabor functions to two dimensions, yielding a series of local spatial bandpass filters with good spatial localization, orientation selectivity, and frequency selectivity. Lee (Lee, 2003) gave a good description of image representation using Gabor functions. A Gabor (wavelet, kernel, or filter) function is the product of an elliptical Gaussian envelope and a complex plane wave:

$$\psi_{\mu,v}(\bar{x}) = \frac{\|\bar{k}\|^2}{\sigma^2} \exp\left(-\frac{\|\bar{k}\|^2 \|\bar{x}\|^2}{2\sigma^2}\right) \left[ \exp(i\,\bar{k} \cdot \bar{x}) - \exp\left(-\frac{\sigma^2}{2}\right) \right] \tag{22}$$

where $\bar{x} = (x,y)$ is the variable in the spatial domain and $\bar{k}$ is the frequency vector, which determines the scale and direction of the Gabor function: $\bar{k} = k_v e^{i\phi_\mu}$, where $k_v = k_{max}/f^v$ with $k_{max} = \pi/2$. In our application, $f = \sqrt{2}$ and $\phi_\mu = \pi\mu/8$. The term $\exp(-\sigma^2/2)$ is subtracted in order to make the kernel DC-free and thus insensitive to illumination. Examples of the real part of the Gabor functions used in this chapter are shown in Figure 3. We use Gabor functions with five different scales ($v$) and eight different orientations ($\mu$), making a total of 40 Gabor functions. The number of oscillations under the Gaussian envelope is determined by $\sigma = 2\pi$.


Figure 3. The real part of the Gabor function for five different scales and eight different orientations

The family of Gabor kernels is constructed by taking five scales ($v \in \{0,\ldots,4\}$) and eight orientations ($\mu \in \{0,\ldots,7\}$). The Gabor features are obtained by convolving the Gabor kernels with the image $I$:

$$g_{\mu,v}(x,y) = |I(x,y) * \psi_{\mu,v}(x,y)| \tag{23}$$

where $|\cdot|$ is the magnitude operator and $g_{\mu,v}(x,y)$ is the Gabor representation of the image at orientation $\mu$ and scale $v$. Figure 4 shows the magnitude of the Gabor representation of an ear image.
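The construction of the 40 Gabor magnitude maps can be sketched in Python with NumPy. Several choices here are assumptions of the sketch and are not specified in the chapter: the 65 x 65 kernel support (the coarsest scales are still truncated at this size), the zero-padded FFT used for the convolution, the function names, and the scale factor f, which is taken as the square root of 2 (the value commonly paired with k_max = π/2) and exposed as a parameter.

```python
import numpy as np

def gabor_kernel(mu, v, size=65, k_max=np.pi / 2, f=np.sqrt(2), sigma=2 * np.pi):
    """Complex Gabor kernel for orientation index mu and scale index v."""
    # Frequency vector written as a complex number: magnitude k_v, angle pi*mu/8
    k = (k_max / f ** v) * np.exp(1j * np.pi * mu / 8)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    k_sq = np.abs(k) ** 2
    envelope = (k_sq / sigma ** 2) * np.exp(-k_sq * (x ** 2 + y ** 2) / (2 * sigma ** 2))
    # Complex plane wave minus the constant term that removes the DC response
    wave = np.exp(1j * (k.real * x + k.imag * y)) - np.exp(-sigma ** 2 / 2)
    return envelope * wave

def gabor_magnitudes(image, n_orient=8, n_scale=5):
    """Magnitudes of the filter responses, shape H x W x (n_orient * n_scale)."""
    H, W = image.shape
    maps = []
    for mu in range(n_orient):
        for v in range(n_scale):
            ker = gabor_kernel(mu, v)
            kh, kw = ker.shape
            # Linear convolution via zero-padded FFT, cropped back to the image size
            shape = (H + kh - 1, W + kw - 1)
            full = np.fft.ifft2(np.fft.fft2(image, s=shape) * np.fft.fft2(ker, s=shape))
            maps.append(np.abs(full[kh // 2:kh // 2 + H, kw // 2:kw // 2 + W]))
    return np.stack(maps, axis=-1)

rng = np.random.default_rng(4)
image = rng.random((48, 40))          # stand-in for a normalized ear image
G = gabor_magnitudes(image)
print(G.shape)
```

The maps are ordered with the orientation index in the outer loop and the scale index in the inner loop, so channel `mu * 5 + v` corresponds to orientation `mu` and scale `v`.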


Figure 4. The magnitude part of the Gabor representation of an ear image

4.2. Gabor based RCM

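We propose a new Gabor-based feature mapping function to construct an effective and robust RCM:

$$\phi(I,x,y) = [\,x \;\; y \;\; I(x,y) \;\; g_{0,0}(x,y) \;\; g_{0,1}(x,y) \;\; \ldots \;\; g_{7,4}(x,y)\,]^T \tag{24}$$

where $I(x,y)$ is the pixel illumination intensity and $g_{\mu,v}(x,y)$ is the Gabor representation of the ear image. By substituting (24) into (2), we obtain the Gabor-based region covariance matrix $C_R$ of region $R$. With two pixel coordinates, one intensity value, and 40 Gabor features, the dimensionality of $C_R$ is $43 \times 43$.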
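The assembly of the 43-dimensional feature image and its covariance matrix can be sketched as follows in Python with NumPy. The sketch assumes that the 40 Gabor magnitude maps are already available as an H x W x 40 array; a random array is used as a placeholder here, and the function names and the order of the channels are illustrative assumptions.

```python
import numpy as np

def gabor_rcm_features(I, G):
    """Stack x, y, intensity and the Gabor magnitudes into an H x W x 43 array."""
    H, W = I.shape
    ys, xs = np.mgrid[0:H, 0:W]
    channels = [xs[..., None], ys[..., None], I[..., None], G]
    return np.concatenate(channels, axis=-1).astype(float)

def rcm(F):
    """Covariance matrix of all feature vectors in the given feature image."""
    f = F.reshape(-1, F.shape[-1])
    return np.cov(f, rowvar=False)

rng = np.random.default_rng(5)
I = rng.random((40, 30))              # stand-in for a normalized ear image
G = rng.random((40, 30, 40))          # placeholder for the 40 Gabor magnitude maps
F = gabor_rcm_features(I, G)
C = rcm(F)
print(F.shape, C.shape)
```

Passing a cropped slice of `F` to `rcm` gives the descriptor of a sub-region, which is how a part-based descriptor could be obtained from the same feature image.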

In our method, we represent each ear image with five RCMs extracted from five different regions ($C_1, \ldots, C_5$). The first RCM ($C_1$) is defined over the whole ear image, so it gives a global representation of the ear image. The four other RCMs are each defined over a part of the ear image, so they give a part-based representation. To increase the robustness of our method against illumination variations, we use both the global and the part-based representations. Figure 5 shows the five regions for $C_1, C_2, C_3, C_4, C_5$.

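To compute the distance between a gallery RCM and a probe RCM, we use

$$d(C^G, C^P) = \sum_{i=1}^{5} \rho(C_i^G, C_i^P) - \max_{j} \rho(C_j^G, C_j^P) \tag{25}$$

where $C^G$ and $C^P$ are the RCMs from the gallery and probe sets, respectively.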


Figure 5. Five regions for covariance matrices of a sample ear image

Sometimes a local RCM may be affected so strongly by illumination variation or noise that its corresponding distance becomes unreliable. For this reason, the most unreliable distance is subtracted in (25) from the sum of all distances between the gallery and probe RCMs. We use a nearest neighbor classifier with the distance in (25).
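The matching step can be sketched in Python with NumPy as follows. The sketch assumes that the "most unreliable" region distance is the largest of the five, that each image is represented by a list of five covariance matrices, and that the per-region dissimilarity is the generalized-eigenvalue measure of section 3.3; the function names and the random test matrices are illustrative.

```python
import numpy as np

def rho(C1, C2):
    """Per-region dissimilarity from the generalized eigenvalues of C1 and C2."""
    lam = np.linalg.eigvals(np.linalg.solve(C1, C2)).real
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def ear_distance(gallery_rcms, probe_rcms):
    """Sum of the per-region distances with the largest one left out."""
    dists = [rho(g, p) for g, p in zip(gallery_rcms, probe_rcms)]
    return sum(dists) - max(dists)

def nearest_neighbor(gallery, probe_rcms):
    """gallery is a list of (label, list_of_rcms); returns the closest label."""
    return min(gallery, key=lambda item: ear_distance(item[1], probe_rcms))[0]

def random_spd(rng, d=4):
    # Random symmetric positive definite matrix used as a stand-in for an RCM
    A = rng.normal(size=(d, d))
    return A @ A.T + d * np.eye(d)

rng = np.random.default_rng(6)
gallery = [(name, [random_spd(rng) for _ in range(5)]) for name in ("A", "B", "C")]
# Probe: a slightly perturbed copy of the five RCMs of person "B"
probe = [C + 0.01 * np.eye(4) for C in gallery[1][1]]
predicted = nearest_neighbor(gallery, probe)
print(predicted)
```

Dropping the largest term means that a single region corrupted by noise or lighting cannot dominate the decision, at the cost of ignoring one region for every comparison.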

5. Databases

Our method was tested on two USTB databases (Yuan et al., 2005). Database 1 includes 180 images of human ears corresponding to 60 individuals, with three images per person. All the images in database 1 were acquired under standard conditions with only slight variations. Figure 6 shows sample ear images from database 1.


Figure 6. Sample ear images for two persons from database 1

Database 2 includes 308 images of human ears corresponding to 77 individuals, with four images per person. All the images in database 2 were acquired under illumination variation and ±30 degree pose variations. Figure 7 shows sample ear images from database 2.


Figure 7. Sample ear images for two persons from database 2

6. Experimental results

We performed experimental studies comparing our method with various ear recognition algorithms: the PCA+RBFN method (Zhang & Mu, 2008), the ICA+RBFN method (Zhang & Mu, 2008), the Hmax+SVM method (Yaqubi et al., 2008), the LSBP method (Guo & Xu, 2008), and four RCM-based methods (Tuzel et al., 2007; Pang et al., 2008). To compare the recognition performance of our method with these methods, we used the USTB databases (Yuan et al., 2005) in our experiments. In database 1, from a total of 60 persons, two images per person were randomly used for training. There are three different ways of selecting two training images from three images. In database 2, from a total of 77 persons, three images per person were randomly used for training. There are four different ways of selecting three training images from four images.
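The number of possible training selections per person can be enumerated directly. The short Python sketch below lists the index combinations for both databases; treating the images not chosen for training as test images is an assumption of the sketch, and the function name is hypothetical.

```python
from itertools import combinations

def training_splits(n_images, n_train):
    """All ways of choosing the training images of one person, by image index."""
    splits = []
    for train in combinations(range(n_images), n_train):
        test = tuple(i for i in range(n_images) if i not in train)
        splits.append((train, test))
    return splits

db1_splits = training_splits(3, 2)   # database 1: two of three images for training
db2_splits = training_splits(4, 3)   # database 2: three of four images for training
print(len(db1_splits), len(db2_splits))   # prints "3 4"
```

Averaging the recognition rate over such selections gives a mean and a standard deviation per method, which is the form in which the results are reported.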

For simplicity, the RCM-based methods associated with (9), (10), (11), and (12) are denoted by RCM1, RCM2, RCM3, and RCM4, respectively. RCM3 is a subset of RCM1 that lacks the intensity component; likewise, RCM2 is a subset of RCM4 that lacks the intensity component.

Figures 8 and 9 show the mean recognition rates for databases 1 and 2. From Figures 8 and 9, it can be seen that the four RCM-based methods performed worse than the other methods, so it can be concluded that the discrimination power of these RCM-based methods is weak for the recognition task. To assess the effect of the intensity feature $I(x,y)$ on the recognition rate, we compare the result of RCM1 with that of RCM3 and the result of RCM2 with that of RCM4. We can conclude that $I(x,y)$ is an important feature in RCMs and that it contributes to increasing the recognition performance of RCM-based methods. Thus, we include the illumination intensity component in our mapping function to increase the accuracy of our method.

Table 1 compares the standard deviation of the recognition performance of all discussed methods on databases 1 and 2. From Table 1, we can see that the standard deviation of our method on database 1 is low, and our method showed better performance than the other methods on database 1. The mean recognition rates of our method on databases 1 and 2 are 93.33% and 87.98%, respectively. On database 2, whose images contain pose variations, our method outperforms the other methods in terms of average accuracy, except for the LSBP and ICA methods.


Figure 8. Mean recognition rates of different methods on database 1 (%)


Figure 9. Mean recognition rates of different methods on database 2 (%)

Methods        Standard deviation
               Database 1    Database 2
Our method     1.67          5.23

Table 1. Standard deviations of the recognition rates

Overall, these results indicate that using Gabor features as the main features in constructing RCMs improves the discrimination ability for recognizing ear images and yields a better recognition rate than previous methods.

7. Conclusion

In this chapter, we proposed Gabor-based region covariance matrices for ear recognition. In this method, we form a region covariance matrix from Gabor features, the illumination intensity component, and pixel location, and utilize it as an efficient ear descriptor. We compared our method with the PCA+RBFN method (Zhang & Mu, 2008), the ICA+RBFN method (Zhang & Mu, 2008), the Hmax+SVM method (Yaqubi et al., 2008), the LSBP method (Guo & Xu, 2008), and four RCM-based methods (Tuzel et al., 2007; Pang et al., 2008), using two USTB databases.

Unlike the previous RCM-based methods, which have very low recognition rates when applied to ear recognition, our RCM-based method, which uses Gabor features as the main features for constructing the RCM, showed better results in ear recognition. The experimental results showed that our method improves the recognition rate compared with other methods. Our method obtains average accuracies of 93.33% and 87.98% on databases 1 and 2, respectively.


1 - H. Abdi, L. J. Williams, 2010. Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4), 433-459.
2 - A. Bertillon, 1890. La Photographie Judiciaire, avec un appendice sur la classification et l'Identification Anthropometriques. Gauthier-Villars.
3 - P. Comon, 1994. Independent Component Analysis, A New Concept? Signal Processing, 36(3), 287-314.
4 - J. G. Daugman, 1985. Uncertainty Relation for Resolution in Space, Spatial Frequency and Orientation Optimized by Two-Dimensional Visual Cortical Filters. Journal of the Optical Society of America A, 2(7), 1160-1169.
5 - W. Forstner, B. Moonen, 1999. A metric for covariance matrices. Technical report, Dept. of Geodesy and Geoinformatics, Stuttgart University, Stuttgart, Germany.
6 - D. A. Forsyth, J. Ponce, 2002. Computer Vision: A Modern Approach. Prentice Hall.
7 - Y. Guo, Zh. Xu, 2008. Ear recognition using a new local matching approach. Proceedings of IEEE International Conference on Image Processing.
8 - A. Iannarelli, 1989. Ear Identification. Forensic Identification Series, Paramount Publishing Company, Fremont, California.
9 - J. Jones, L. Palmer, 1987. An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. Journal of Neurophysiology, 58(6), 1233-1258.
10 - T. S. Lee, 2003. Image Representation Using 2D Gabor Wavelets. IEEE Trans. Pattern Analysis and Machine Intelligence, 18(10), 959-971.
11 - Y. Pang, Y. Yuan, X. Li, 2008. Gabor-based region covariance matrices for face recognition. IEEE Trans. on Circuits and Systems for Video Technology, 18(7), 989-993.
12 - F. Porikli, 2005. Integral Histogram: A fast way to extract histograms in Cartesian spaces. Proceedings of CVPR 2005.
13 - M. Riesenhuber, T. Poggio, 1999. Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11), 1019-1025.
14 - B. Schiele, J. L. Crowley, 2000. Recognition without correspondence using multidimensional receptive field histograms. International Journal of Computer Vision, 36(1), 31-52.
15 - J. Stone, 2005. Independent Component Analysis: A Tutorial Introduction. The Knowledge Engineering Review, 20(2), June 2005.
16 - O. Tuzel, F. Porikli, P. Meer, 2006. Region covariance: A fast descriptor for detection and classification. Proceedings of European Conference on Computer Vision, 589-600.
17 - O. Tuzel, F. Porikli, P. Meer, 2007. Human detection via classification on Riemannian manifolds. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1-8.
18 - O. Veksler, 2003. Fast variable window for stereo correspondence by integral images. Proceedings of CVPR 2003.
19 - P. Viola, M. Jones, 2001. Rapid object detection using a boosted cascade of simple features. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Kauai, HI, vol. 1, 511-518.
20 - L. Xu, 1994. Theories of Unsupervised Learning, PCA and Its Nonlinear Extension. Proceedings of IEEE International Conference on Neural Networks, Orlando, USA, 1254-1257.
21 - M. Yaqubi, K. Faez, S. Motamed, 2008. Ear recognition using features inspired by visual cortex and support vector machine technique. Proceedings of IEEE International Conference on Computer and Communication Engineering, Kuala Lumpur, Malaysia.
22 - Li Yuan, Zhichun Mu, Zhengguang Xu, 2005. Using ear biometrics for personal recognition. International Workshop on Biometric Recognition Systems, IWBRS 2005, 221-228.
23 - H. Zhang, Zh. Mu, 2008. Compound Structure Classifier System For Ear Recognition. Proceedings of the IEEE International Conference on Automation and Logistics, Qingdao, China.