Using Wavelet Transforms for Dimensionality Reduction in a Gait Recognition Framework



Introduction
This work proposes a novel computer vision approach that processes video sequences of people walking and then recognizes those people by their gait. Human motion carries different information that can be analyzed in various ways.
Gait can be defined as motor behavior consisting of integrated movements of the human body: a cyclical pattern of corporal movements repeated every cycle. Gait analysis is very important in the medical field, both for detecting and for treating locomotion disorders. Historically, gait analysis was restricted to medical contexts, but it has since been extended to other applications, such as biometry. Research has shown that human beings have special and distinct ways of walking (Winter, 1991; Sarkar et al., 2005; Havasi et al., 2007; Boulgouris, 2007). Given this premise, a human being's gait can be understood as an important biometric characteristic.

Arantes and Gonzaga (Arantes & Gonzaga, 2010, 2011) have proposed a framework for gait recognition called Global Body Motion (GBM). This framework was developed as a fusion of four models of human movement. Each model was based on a specific image segmentation of the human silhouette and extracted global information about tri-dimensional, bi-dimensional, boundary and skeleton motion. That work applied the Haar Wavelet Transform (WT) for image dimensionality reduction with reduced loss of movement information. However, it did not analyze which wavelet family could best maintain the discriminant information of the human body movement in spite of the image scale reduction.

Wavelet Transforms can be seen as mechanisms to decompose, or break, signals into their constituent parts. Thus, one can analyze data in different frequency bands, with the resolution of each component adjusted to its scale. In this chapter, we analyze several wavelet families and choose the best one, where "best" is defined as the Wavelet Transform that maintains the movement information after scale reduction for each model (Arantes & Gonzaga, 2010, 2011).

Objectives
There are differences in the way each person walks, and these differences can be significant in terms of identifying an individual. In a video sequence with only one person walking, the movement of this person, even in images with a complex background, generates valuable data among the highly correlated frames. In this work, we assumed that a frame-by-frame video sequence of a person walking forms one class, where each frame is an element of this class. Thus, our objective is to establish a methodology that can recognize a person from the way he/she walks. Movement of the human body can be interpreted in various ways using standard techniques of image processing. Our system obtains global information about the body's movement as a whole, from four models of segmented video images of the human being, before merging the results into a single model that we call the GBM (Global Body Motion). This model should improve the rates of biometric recognition.
In this work, we propose to determine the best family of wavelets, i.e., the one that maintains the characteristics of human body movement across scales, for each previously published model (SGW, SBW, SEW and SSW) (Arantes & Gonzaga, 2010, 2011). Because each family of wavelets has distinct characteristics, its low-pass and high-pass filters will generate different discriminant features. To improve gait recognition, we developed a fusion of the human movement models using the framework proposed by Arantes and Gonzaga (2010, 2011), and the fusion results are compared with the previously published models to determine the best-suited one.
The wavelet families analyzed in this work were as follows:


Haar: the simplest wavelet family. It has a linear phase, is discontinuous and is identical to Daubechies 1. It has only two filter coefficients, so its support is short and transitions are sharp. The Haar wavelet function is a square wave, so smooth signals are not well reconstructed. It is the only wavelet that is both symmetric and orthogonal (Burrus et al., 1998).

Daubechies: has a non-linear phase, and its impulse response is maximally flat. This wavelet is quite compact in time but, in the frequency domain, shows a high degree of spectral overlap between scales (Burrus et al., 1998). These wavelets were the first to make discrete analysis practical. Ingrid Daubechies constructed them with a maximally flat orthogonal frequency response at half the sampling rate, imposing a restriction on the amount of decay within a certain range and thereby obtaining better resolution in the time domain; a wavelet of order n produces 2n filter coefficients (Burrus et al., 1998).

Symlets: have a non-linear phase, with a more symmetrical impulse response. This family was proposed by Daubechies as a modification of the "dbN" family: it has similar properties but tends to be symmetric (Burrus et al., 1998).

Bi-Orthogonal: has a linear phase. This family uses two wavelets, one for decomposition and another for reconstruction. It has compact support and is symmetric (Burrus et al., 1998).

Methodology
The proposed framework is shown in Figure 1. The extracted features create independent models (SGW, SBW, SEW and SSW) of the global movement of the human body, and these are compared separately using distinct wavelet families.
To eliminate the background and segment the movement, we used an algorithm based on the Gaussian Mixture Model (GMM), originally proposed by Stauffer and Grimson (Stauffer & Grimson, 1999) and modified by KaewTrakulpong and Bowden (KaewTraKulPong & Bowden, 2001).

Two types of images were generated by the segmentation. The first is the segmented moving region in grayscale; this sequence is called Silhouette-Gray (SG).
The second image is obtained from the binary mask generated by the GMM. This sequence is called Silhouette-Binary (SB).
We have evaluated four wavelet families, with the goal of achieving scale reduction without significant loss of information. The performance of each WT is tested within the previously proposed framework for gait recognition (Arantes & Gonzaga, 2010, 2011), which had been studied only for the Haar family.
The images of people walking are decomposed into four sub-bands with different information in terms of both content and detail. At each level of decomposition, four new images are generated, each with half the spatial resolution. Each decomposition level outputs one image from the low-pass filtering stage and three images from the high-pass filtering stage. The low-pass filter generates the approximation-coefficient image, and the high-pass filter outputs the vertical, horizontal and diagonal details. The approximation coefficients contain information about the human body shape and gray-level variations, and the detail coefficients furnish information about the silhouette contour.
Given that the original segmented image contains all the information about the global movement of the human body when walking and that this information does not change significantly with scale, the four families of wavelets are applied at two levels for each of the segmented sequences.
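As an illustration, one level of this 2-D decomposition can be sketched in pure Python using the Haar filters (the other families differ only in their filter coefficients; the function name and the 0.5 normalization are ours, not from the original framework):

```python
def haar_dwt2(image):
    """One level of a 2-D Haar wavelet decomposition.

    image: list of rows (lists of numbers) with even width and height.
    Returns the approximation sub-band and the three detail sub-bands
    (horizontal, vertical and diagonal), each at half resolution.
    """
    s = 0.5  # normalization so repeated levels keep values bounded

    def rows_pass(img):
        lo, hi = [], []
        for row in img:
            lo.append([(row[i] + row[i + 1]) * s for i in range(0, len(row), 2)])
            hi.append([(row[i] - row[i + 1]) * s for i in range(0, len(row), 2)])
        return lo, hi

    def cols_pass(img):
        t = [list(c) for c in zip(*img)]        # transpose to filter columns
        lo, hi = rows_pass(t)
        return [list(c) for c in zip(*lo)], [list(c) for c in zip(*hi)]

    L, H = rows_pass(image)                     # low/high-pass along rows
    LL, LH = cols_pass(L)                       # then along columns
    HL, HH = cols_pass(H)
    return LL, LH, HL, HH
```

Applying the function again to the approximation sub-band yields the two-level decomposition used in this chapter.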

Scale reduction
The SG sequence, after applying the WT, generates the SGW sequence; the SB sequence, likewise, generates the SBW sequence. The segmented sequences, which constitute each class of subjects, are output as images of 31 x 60 pixels with the subject walking in the center of each frame. The scale reduction is essential for reducing the amount of data without decreasing the global movement information, thereby optimizing the computational effort of the recognition. Thus, two models of human gait are obtained: the SGW model, derived from the SG sequence after two levels of the WT, and the SBW model, generated from the SB sequence, also after the WT.
Scale reduction is performed for the four wavelet families: Haar, second-order Symlets, second-order Daubechies and Bi-Orthogonal 1.1 and 1.3. Thus, there are five databases for SGW and SBW, one per wavelet family and order. Figure 2(b) shows the second level of wavelet decomposition, broken down into four components: the low-frequency (approximation) coefficients and the coefficients with horizontal, vertical and diagonal details, respectively. At this stage, the size of the generated image is 31 x 60 pixels. This figure represents the Bi-Orthogonal WT; the same process is applied to the other wavelet families considered in this work.

Movement of the contour and the skeleton
Aiming to capture the global variations of human body movement contained only in the contour of the silhouette, we used the horizontal, vertical and diagonal details generated by the WT. We applied the algorithm proposed by Lam (Lam et al., 1992) for the silhouette skeletonization. This procedure generates the skeleton sequence class of global movements, called SSW (Arantes & Gonzaga, 2010, 2011). For each wavelet family and order, we generated the complete set of models:

SGW Model - Silhouette-Gray-Wavelet: each class is represented by a grayscale silhouette sequence obtained by applying the WT to the moving objects segmented by the GMM. The SGW model carries information about the three-dimensional global movement of the human gait through its grayscale variations, but it can be quite sensitive to variations in light.
SBW Model - Silhouette-Binary-Wavelet: each class is represented by a sequence of binary silhouettes obtained by applying the WT to the moving objects segmented by the GMM.
The SBW Model provides information about the two-dimensional global movement of the silhouette of the human body while walking. The SBW Model reduces the sensitivity to the variation of light, but the clothes remain a variable that can negatively impact system performance.
SEW Model - Silhouette-Edge-Wavelet: each class is represented by a sequence of edge images obtained from the horizontal, vertical and diagonal coefficients of the WT. The SEW model carries information about the global behavior of the contours while walking. It is even less sensitive to light variations than the SBW model; however, the contour alone is insufficient for satisfactory recognition.
SSW Model - Silhouette-Skeleton-Wavelet: each class is represented by a sequence of skeleton silhouettes obtained from the SBW model. The SSW model contains information about the global movement of the joints of the human body and how they behave while walking.

Feature extraction -EigenGait
The EigenGait captures the temporal features (or temporal differences) of the human gait among the frames within each class and projects these features in a prototype vector.
Because each frame sequence represents a corresponding class of a person walking, the intra-class variance is small, and the inter-class variance is large. Therefore, the PCA (Principal Component Analysis) technique is used to extract relevant characteristics for recognition. The PCA technique is applied to all frames of all classes belonging to the four models (SGW, SBW, SEW and SSW) for the wavelet families with the best individual result for gait recognition.
The data dimensionality is also reduced in relation to the original variables, but maintains the relevant information. The main extracted characteristic is the feature vector that will be used for silhouette classification in its respective class.
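As a minimal sketch of this step, the following pure-Python code estimates the first principal component of a set of flattened frames by power iteration and projects a frame onto it (the actual framework retains several components; all names and the deterministic seed here are illustrative assumptions):

```python
import random

def pca_top_component(frames, iters=100):
    """Estimate the first principal component of a set of flattened
    frames (lists of floats) by power iteration on the covariance
    matrix, without external libraries."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(d)]
    centered = [[f[k] - mean[k] for k in range(d)] for f in frames]

    def cov_times(v):
        # Compute (X^T X / n) v as X^T (X v) / n, avoiding the d x d matrix.
        xv = [sum(c[k] * v[k] for k in range(d)) for c in centered]
        return [sum(centered[j][k] * xv[j] for j in range(n)) / n
                for k in range(d)]

    random.seed(0)                       # deterministic start vector
    v = [random.random() for _ in range(d)]
    for _ in range(iters):
        w = cov_times(v)
        norm = sum(x * x for x in w) ** 0.5 or 1.0
        v = [x / norm for x in w]
    return mean, v

def project(frame, mean, component):
    """Scalar gait feature: projection of a centered frame."""
    return sum((frame[k] - mean[k]) * component[k] for k in range(len(frame)))
```

In the framework, each frame's projection onto the retained components forms the feature vector compared against the EigenGait prototype of each class.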

Fusion
Different motion representations carry distinct information about human body movement and silhouettes, and each is vulnerable in different situations (presence of shadows, changes in lighting, changes in dress, etc.). The proposed fusion model can combine both the static characteristics of the movement (from the SGW, SBW and SEW models) and the dynamic ones (from the SSW model).
The proposed fusion approach assumes that the output of each model (SGW, SBW, SEW and SSW), trained individually with different wavelet families, yields a similarity score between each frame and the candidate classes. This similarity score is obtained through the Nearest Neighbor (NN) classifier. Thus, we obtain the correct-classification rate of each model individually for each wavelet family. The gait representation that yields better performance will have a greater weight in the frame classification decision.
The following steps describe the algorithm:
1. Calculate the similarity score S(i, j, c), given by equation 1, as the smallest Euclidean distance between the j-th frame and the EigenGait of each class c for each model i, with i varying from 1 to 4 (SGW, SBW, SEW and SSW). The frame will be classified into the class for which the distance is minimized.
2. Calculate the average precision of each model i, given by equation 2: P_i = TP_i / TG_i, where TP_i is the number of correctly classified frames (true positives) of model i, and TG_i is the total number of samples in the test set of model i.
3. Calculate the fusion score F(j, c) between the j-th frame and class c, given by equation 3, as the weighted mean of the per-model distances S(i, j, c), using the precisions P_i as weights; the frame is assigned to the class that minimizes this score.
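The steps above can be sketched as follows (a minimal illustration with our own names: `distances` stands for the per-model distances of one frame, `precisions` for the per-model weights P_i):

```python
def fuse(distances, precisions):
    """Weighted-mean fusion of per-model distances for one frame.

    distances:  model name -> {class -> smallest Euclidean distance
                between the current frame and that class's EigenGait}
    precisions: model name -> average precision P_i = TP_i / TG_i,
                used as the fusion weight of model i.
    Returns the class that minimizes the weighted mean distance.
    """
    total_w = sum(precisions.values())
    classes = next(iter(distances.values())).keys()

    def score(c):
        return sum(precisions[m] * distances[m][c] for m in distances) / total_w

    return min(classes, key=score)


# Hypothetical scores for one frame over two classes and two models:
frame_distances = {"SGW": {"A": 1.0, "B": 5.0},
                   "SSW": {"A": 4.0, "B": 2.0}}
model_precisions = {"SGW": 0.9, "SSW": 0.3}
best = fuse(frame_distances, model_precisions)  # the heavier SGW model dominates
```

Models with higher individual precision thus pull the decision toward their own nearest class, as intended by the weighting scheme.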

Materials
In this work, we used the "Gait Database" of the National Laboratory of Pattern Recognition (NLPR) from the Institute of Automation at the Chinese Academy of Sciences (CASIA, 2010). These images were captured outdoors in an environment with natural light. The images include three views: side, oblique and front (0°, 45° and 90°).
Each class has three views and four sequences per view (two sequences walking from the left to the right and two sequences walking from the right to the left). These are numbered sequence 1, sequence 2, sequence 3 and sequence 4 with the following respective directions: right-left, left-right, right-left and left-right. Each variation of angles in each of the four sequences is illustrated in Figure 3.
The sequences of videos were assembled from the available images. Altogether, 20 classes were obtained with 240 video sequences and 8,400 frames.

Evaluation methods
To evaluate the Wavelet Transform performance for human gait recognition in this framework, independent tests were carried out for each type of sequence (SGW, SBW, SEW and SSW) of each wavelet family. The results for each wavelet family were analyzed, taking into account the False Acceptance Rate (FAR) and False Rejection Rate (FRR).
Each image of each frame was projected into a PCA sub-space and compared with the EigenGait prototype of each class. For each experiment, confusion matrices were generated, and the FAR and FRR were calculated for each class. We used the FERET protocol (Phillips et al., 2000) with the leave-one-out cross-validation rule to evaluate the results. After computing the similarity between the test sample and the training set, the nearest neighbor (NN) rule was applied for classification. The correct-classification rate of each model is used as its weight in the weighted mean of the fusion process.

For the test in which the three angles are used together to form a single base, the wavelet family that best extracts the low-frequency coefficients (approximation image) and the details from the high-pass filtering is the Bi-Orthogonal 1.3. This implies that the Bi-Orthogonal 1.3 WT captured more information on the global movement of the object. Another point to consider is the low False Acceptance Rate (FAR) of the Bi-Orthogonal wavelet family. When the amount of information present in the image is reduced, the length of the wavelet filter plays a very important role: the Bi-Orthogonal 1.3 wavelet has a low-pass filter of length 6, so the level of detail obtained with this family is much larger than that of the Haar family. Because the images of the SBW model are derived from low-pass filtering, a higher level of detail is obtained, resulting in a better image.

For an angle of 0°, the Symlets and Bi-Orthogonal wavelets provided equivalent results. For the other angles, the best performance occurs with the Bi-Orthogonal family, whose FARs are also lower than those of the Haar and Daubechies WTs. Table 3 shows the percentages of matches with their respective FAR and FRR for the SEW model.
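Per-class FAR and FRR can be computed from a confusion matrix as sketched below (an illustrative helper with our own names, not code from the original work):

```python
def far_frr(confusion, c):
    """Per-class FAR and FRR from a confusion matrix.

    confusion[t][p] holds the number of frames of true class t
    that were classified as class p.
    FRR(c): fraction of genuine class-c frames rejected
            (classified as some other class).
    FAR(c): fraction of frames from other classes wrongly
            accepted as class c.
    """
    others = [t for t in confusion if t != c]
    genuine = sum(confusion[c].values())          # all class-c test frames
    impostor = sum(sum(confusion[t].values()) for t in others)
    frr = (genuine - confusion[c][c]) / genuine
    far = sum(confusion[t][c] for t in others) / impostor
    return far, frr
```

For example, with two classes where 2 of 10 "A" frames are misclassified and 1 of 10 "B" frames is accepted as "A", FAR(A) is 0.1 and FRR(A) is 0.2.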

Results
The SEW model carries information about the global movement related to the human contours of the silhouettes. The SEW model is generated from coefficients with horizontal, vertical and diagonal details. The best performance for this model is with the Bi-Orthogonal wavelet at an angle of 90°. Table 4 shows the percentages of matches with their respective FAR and FRR considering the SSW model.
The SSW model carries the global information about the movement of the body's joints. This model provides the least amount of information about the movement. Nevertheless, its rate of correct classifications using the Symlet and Bi-Orthogonal wavelet families is good and far exceeds that of the Haar wavelet.

Table 3. Percentages of matches with their respective FAR and FRR rates for the SEW model.
Analyzing the overall performance of the wavelet families, the Bi-Orthogonal WT maintains good performance regardless of the type of movement or angle used. The Haar WT, in this study, is very susceptible to the motion model and to the walking direction angle.
The average correct-classification rate of each wavelet in each model is used as the weight in the weighted mean of the fusion scheme. The FERET protocol (Phillips et al., 2000) is used to evaluate the results. The statistical performance of this method is reported as the CMS (Cumulative Match Score), defined as the cumulative probability that the correct class of a test object appears within the top k hits.
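The CMS can be sketched as follows, given the 1-based rank at which each test sample's true class was retrieved (an illustrative helper; names are ours):

```python
def cms(true_class_ranks, max_k):
    """Cumulative Match Score curve.

    true_class_ranks: for each test sample, the 1-based position of
    its true class in the candidate list sorted by distance.
    Returns [CMS(1), ..., CMS(max_k)], where CMS(k) is the fraction
    of samples whose true class appears within the top k matches.
    """
    n = len(true_class_ranks)
    return [sum(1 for r in true_class_ranks if r <= k) / n
            for k in range(1, max_k + 1)]
```

CMS(1) is simply the rank-1 recognition rate, and the curve is non-decreasing in k, which is why it is reported at ranks such as 1, 5 and 10.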
The CMS curves in Figure 4 were obtained through the fusion of the SGW, SBW, SEW and SSW models. The models used in the fusion process were those that achieved the best results for the analyzed wavelet families.
The CMS curves in Figure 5 were obtained through the fusion of the SGW, SBW, SEW and SSW models, using the combination of the three views with two sequences each.

Comparative results
The results of the GBM model for this angle were compared with the results obtained in the previous work of Arantes and Gonzaga (Arantes & Gonzaga, 2010, 2011). Table 5 presents the best correct-classification results based on the CMS, for ranks 1, 5 and 10 at an angle of 0°. For this angle, the best results were obtained with the Symlets and Bi-Orthogonal wavelet families. Table 6 shows the best correct-classification results based on the CMS, for ranks 1, 5 and 10 at an angle of 45°. For this angle as well, the best results were obtained with the Symlets and Bi-Orthogonal wavelet families.

Conclusions
To evaluate the Wavelet Transform performance for human gait recognition in the proposed framework, independent tests were carried out for each type of sequence (SGW, SBW, SEW and SSW) and each wavelet family. The results for each wavelet family were analyzed taking into account the FAR and FRR. Each image of each frame was projected into a PCA sub-space and compared with the EigenGait prototype of each class. For each experiment, confusion matrices were generated, and the FARs and FRRs were calculated for each class. The FERET protocol (Phillips et al., 2000) with a leave-one-out cross-validation rule was used to evaluate the results. The fusion process, carried out with the best-performing wavelet family, was compared with the original GBM (Arantes & Gonzaga, 2010, 2011).
For the SGW model, at an angle of 0°, the average hit rate is similar for all the wavelet families analyzed. The best rate of correct classifications is for the Bi-Orthogonal 1.3 WT, whose hit rate is 5.3% above that of the Haar WT. The second-order Daubechies WT obtained a lower rate of correct classifications. The SGW model carries the most information; however, it is also the model most sensitive to interference from the external environment.
For the 45° and 90° angles, considering the SGW model, the Haar WT had the lowest rate of correct classifications among the families. The Bi-Orthogonal 1.3 WT obtained 81.2% correct matches on average for the 45° angle and 83.7% for the 90° angle. For these angles, the best choice is the Bi-Orthogonal family. This improvement in the hit rate can be attributed to the many details that the Bi-Orthogonal WT can capture compared with the Haar WT.
The SBW model carries global information about human movement, present in binary images. For the three views in this model, there was an increase in the hit rate of approximately 23%, in the best case.
For the 45° angle, the Haar WT obtained an FRR higher than its rate of correct answers. When all the views are combined into a single base, the Bi-Orthogonal 1.3 WT also performs well.
The SEW model is obtained from the horizontal, vertical and diagonal coefficients generated by the WT implementation. This model carries fewer details than the SGW and SBW models; thus, the greater the number of details that the WT can capture, the better the model performance. The match score is similar for the Symlets and Daubechies families, which may be because these families have filters of the same length.
The SSW model provides the global information of the human movement contained in the skeleton of the body. It carries an even smaller number of details than the other models, but it is less susceptible to changes in the external environment. The best average hit rates are for the Bi-Orthogonal 1.3 and Symlet wavelets.
The highest rates of correct classifications are chosen as the weights in the fusion process. Therefore, rates are chosen from the Bi-Orthogonal 1.3 family. This led to better performance in the system, which can be observed in the CMS curves. The amount of detail that each wavelet family captures is closely related to the system performance.