This chapter introduces a new phase-input joint transform correlator (PiJTC) designed to cope with additive noise in face recognition. It begins with the 4f Vander Lugt architecture in order to highlight the importance of spatial filtering in the Fourier plane. It continues with the non-zero order fringe-adjusted amplitude joint transform correlator (FA-AJTC) model. The experimental optical and hybrid setups are also presented, with their benefits and drawbacks, for a better understanding of the correlation process.
The authors emphasize this model because it is the basis for the phase-input model. A good understanding of the (FA-AJTC) facilitates the use of the proposed sine modulated fringe-adjusted phase-input joint transform correlator (FA-sinePiJTC). All the steps that improve the performance of the amplitude (FA-AJTC) model apply similarly to the phase model, and their benefits carry over as well. A short step-by-step example of the correlation process illustrates the improved pattern recognition performance of the proposed phase-input JTC model.
The sine modulation function provides a degree of freedom through the limits of the modulation domain. With these parameters one can tune the pattern recognition performance of the correlation process to the variations of faces in the captured images.
The next stage of this chapter analyses the face database recognition performance of the proposed (FA-sinePiJTC) model under additive noise conditions from 10% to 50% random noise. Computer simulation results of the correlation process are presented for face images selected from a small test database with 21 individual classes x 3 face images/class. Because of the variations of the faces in the original images, different sine modulation domains were chosen to achieve better face recognition performance under additive noise for a larger database. For this larger database, 102 individual classes with 5 face images/class were selected, each image under 5 additive noise conditions. The goal was to compute 1,560,600 cross-correlations between all original face images and all the noisy ones. Where necessary, the face recognition performance was analysed with the following rates: equal error rate (EER), genuine acceptance rate (GAR), false acceptance rate (FAR) and false rejection rate (FRR). The receiver operating characteristic (ROC) curve was also built.
2. Amplitude joint transform correlators (AJTC)
The Vander Lugt correlator (VLC) is based on a 4f setup involving two convergent lenses, L1 and L2, with the same focal length, f (figure 1) (Born & Wolf, 1970; Vlad et al., 1976). Each lens acts as a 2D Fourier operator and generates, in its output plane, the Fourier transform of its input image. Thus, there are two Fourier output planes in the VLC: the Fourier plane or spatial frequency plane, PF, of the L1 lens and the output plane, Pe, of the L2 lens. It is important to note that optical or hybrid correlators based on the 4f Vander Lugt principle involve two Fourier transforms, one for each of L1 and L2.
The L1 lens generates the Fourier transform of the input image in the (VLC) Fourier plane, PF. In this plane the complex filter of the reference image that has to be recognized by the (VLC) is placed. In other words, the reference has to be discriminated in the input image, at the target image location. The complex filter consists of the complex conjugate of the Fourier transform of the reference (Born & Wolf, 1970; Vlad et al., 1976)
where the symbols denote the Fourier transform of the reference, the spatial coordinates in the Fourier plane, the focal length f of the lenses and the wavelength of the coherent light used.
This filter is applied to the Fourier transform of the input image and generates the filtered Fourier transform in the (VLC) Fourier plane (Vlad et al., 1976)
The L2 lens generates the second Fourier transform, that of the filtered spectrum. The result is the optical 2D cross-correlation function, or image (Vlad et al., 1976)
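The filtering chain described above can be sketched numerically. This is a minimal simulation under stated assumptions, not the optical system itself: the function name `vlc_correlate`, the array sizes and the random test data are hypothetical, and an inverse FFT stands in for the second lens transform (which would only mirror the output coordinates).

```python
import numpy as np

def vlc_correlate(scene, reference):
    """Matched-filter correlation of a scene with a smaller reference."""
    scene_spectrum = np.fft.fft2(scene)
    # Complex filter: conjugate spectrum of the zero-padded reference.
    matched_filter = np.conj(np.fft.fft2(reference, s=scene.shape))
    # Second transform; the inverse FFT avoids the coordinate flip that a
    # second forward transform (a real lens) would introduce.
    return np.abs(np.fft.ifft2(scene_spectrum * matched_filter))

rng = np.random.default_rng(0)
reference = rng.random((16, 16))
scene = 0.1 * rng.random((64, 64))   # weak background clutter
scene[20:36, 30:46] += reference     # target embedded at row 20, column 30

corr = vlc_correlate(scene, reference)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(corr), corr.shape))
print(peak)
```

The brightest point of the correlation plane marks the position of the embedded target, which is the information the output plane of the correlator is meant to deliver.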
There are two output cross-correlation terms. The first is called the DC or zero-order term and represents the “useless” cross-correlation between the reference image and the embedding noise. The second represents the cross-correlation between the reference image and the target image. If the target image is an identical or distorted copy of the reference image, this term is the autocorrelation term.
The pattern recognition (i.e. discrimination or detection) process fails if a cross-correlation term presents a higher correlation peak than the autocorrelation. The main disadvantage of the (VLC) is the alignment of the optical axes of the L1 and L2 lenses. If these axes are not perfectly aligned, the spatial filtering in the (VLC) Fourier plane is misplaced and the correlation process does not generate accurate information. In this case, the pattern recognition process may fail.
The optical axis alignment problem, the major disadvantage of the (VLC), motivated the development of the amplitude joint transform correlator (AJTC). This correlator is based on the 4f Vander Lugt correlator, but uses a joint image to relax the optical axis alignment requirement (Abookasis et al., 2001; Alam & Karim, 1993a). The joint image contains the reference image and the scene image placed in two separate halves (figure 2). The corresponding mathematical model of an optical (AJTC) consists of the following well-known equations, starting with the joint image
Here the two terms are the reference image and the scene image (figure 2).
The next equations, after the L1 lens optical Fourier transform, are
Here the quantities involved are the Fourier transforms of the reference and scene images, the spatial frequency coordinates in the Fourier plane, the focal length f of the lenses, the wavelength of the light used, and the joint, scene and reference power spectra (Alam & Karim, 1993a; Huang et al., 1997; Javidi & Kuom, 1988; Lu et al., 1997).
The correlation result consists of the zero-order term and two anti-parallel correlation “lines” that carry the same information.
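The structure of the JTC output (one zero-order term plus two symmetric correlation terms) can be checked with a small numerical sketch. The helper names, the 128x128 joint image and the random patch are assumptions made for illustration; the square-law detection is modelled by the squared modulus of the FFT.

```python
import numpy as np

def joint_power_spectrum(joint):
    """Intensity recorded by a square-law detector in the Fourier plane."""
    return np.abs(np.fft.fft2(joint)) ** 2

def correlation_plane(spectrum):
    """Second transform of a power spectrum, centred with fftshift."""
    return np.abs(np.fft.fftshift(np.fft.ifft2(spectrum)))

rng = np.random.default_rng(1)
patch = rng.random((16, 16))
joint = np.zeros((128, 128))
joint[24:40, 56:72] = patch   # reference, upper half
joint[64:80, 56:72] = patch   # identical target, lower half (40 rows below)

out = correlation_plane(joint_power_spectrum(joint))
zero_order = out[64, 64]       # centre of the output plane
term_plus = out[64 + 40, 64]   # correlation term at +40 rows
term_minus = out[64 - 40, 64]  # mirrored term at -40 rows
print(zero_order, term_plus, term_minus)
```

In this noise-free toy case the two side terms are equal and the zero-order term is twice as high, which illustrates why the zero-order peak dominates the output plane.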
Amplitude joint transform correlators (AJTC) are known to be robust to embedding and additive noise, so they are a good solution for pattern recognition in these conditions. However, the main disadvantage of JTCs is the very high intensity of the zero-order correlation peak (dc term). This term is useless for the discrimination (e.g. pattern recognition) process.
Removing it was an early concern in JTC research. To remove the zero-order term and obtain the non-zero order JTC (NZAJTC), many methods have been proposed: for instance, the phase-shifting technique (Su & Karim, 1998; Su & Karim, 1999), the joint transform power spectrum (JTPS) subtraction strategy (Cheni & Wu, 2003; Cheni & Wu, 2005; Lu et al., 1997) and the Mach-Zehnder method.
The (NZAJTC) based on JTPS subtraction is more time consuming than the others, because a digital subtraction is needed to obtain the non-zero order term (Cheni & Wu, 2003; Cheni & Wu, 2005; Lu et al., 1997)
where the four quantities are, in order, the non-zero order joint power spectrum, the joint power spectrum, the scene power spectrum and the reference power spectrum.
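A minimal numerical sketch of the JTPS subtraction follows. It assumes that the reference-only and scene-only power spectra are obtained from joint-sized input planes in which the other half is blank; the function names and test data are hypothetical.

```python
import numpy as np

def power(img):
    """Power spectrum as recorded by a square-law detector."""
    return np.abs(np.fft.fft2(img)) ** 2

def nonzero_order_jps(reference_plane, scene_plane):
    """Joint power spectrum minus the reference-only and scene-only spectra."""
    return power(reference_plane + scene_plane) - power(reference_plane) - power(scene_plane)

rng = np.random.default_rng(2)
patch = rng.random((16, 16))
reference_plane = np.zeros((128, 128))
scene_plane = np.zeros((128, 128))
reference_plane[24:40, 56:72] = patch   # reference alone in the upper half
scene_plane[64:80, 56:72] = patch       # target alone in the lower half

nz_jps = nonzero_order_jps(reference_plane, scene_plane)
out = np.abs(np.fft.fftshift(np.fft.ifft2(nz_jps)))
print(out[64, 64], out[104, 64], out[24, 64])
```

After the subtraction the centre of the output plane is numerically zero, while the two correlation terms at plus and minus 40 rows keep their height.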
In order to achieve narrower and higher correlation peaks with the (NZAJTC), one must apply a spatial frequency domain filter that enhances the high spatial frequencies, which define the details of the reference object.
One of the conventional methods to address this issue is the amplitude fringe-adjusted joint transform correlator (FA-AJTC). The method applies to the non-zero order joint power spectrum an amplitude fringe-adjusted filter (FAF), which consists of the inverse amplitude spectrum of the reference Fourier transform (Abookasis et al., 2001; Alam & Karim, 1993a; Alam & Karim, 1993b; Alam & Horache, 2004)
where one parameter is the lowest positive real value that the computer can represent and the other is a real non-zero function used to avoid the poles of the filter.
The (FAF) improves the light diffraction efficiency because it works like an inverse adaptive filter matched to the reference. In other words, the inverse adaptive filter boosts the higher spatial frequencies and attenuates the lower spatial frequencies of the non-zero order joint power spectrum. The higher frequencies are responsible for the “details” of an object image and the lower frequencies for its “common” properties. Thus, boosting the higher spatial frequencies with a reference inverse adaptive filter generates higher autocorrelation peaks and suppresses the cross-correlation peaks. As a consequence, a better discrimination of the reference in the scene image is achieved.
The filtered spectrum generates the correlation result
According to this equation, the amplitude fringe-adjusted joint transform correlator (FA-AJTC) generates only the two anti-parallel cross-correlation “lines”, which carry the useful pattern recognition information (figure 5).
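The peak-sharpening effect of the fringe-adjusted filter can be illustrated with a short sketch. It assumes the form commonly attributed to Alam & Karim, H = B / (A + |R|^2), with B = 1 and A set to machine epsilon; these parameter choices, the function names and the test data are assumptions, not values taken from this chapter.

```python
import numpy as np

def power(img):
    """Power spectrum as recorded by a square-law detector."""
    return np.abs(np.fft.fft2(img)) ** 2

def fringe_adjusted_filter(reference_plane, a=np.finfo(float).eps, b=1.0):
    """Assumed FAF form: b / (a + reference power spectrum)."""
    return b / (a + power(reference_plane))

rng = np.random.default_rng(3)
patch = rng.random((16, 16))
reference_plane = np.zeros((128, 128))
scene_plane = np.zeros((128, 128))
reference_plane[24:40, 56:72] = patch
scene_plane[64:80, 56:72] = patch      # identical target, 40 rows below

# Non-zero order joint power spectrum by JTPS subtraction.
nz_jps = power(reference_plane + scene_plane) - power(reference_plane) - power(scene_plane)
filtered = fringe_adjusted_filter(reference_plane) * nz_jps

plain = np.abs(np.fft.fftshift(np.fft.ifft2(nz_jps)))
sharp = np.abs(np.fft.fftshift(np.fft.ifft2(filtered)))
print(plain[103, 64] / plain[104, 64], sharp[103, 64] / sharp[104, 64])
```

For an identical target the filtered spectrum reduces to a pure cosine fringe, so each correlation term collapses to a single-pixel peak, whereas the unfiltered peak keeps a broad shoulder on neighbouring pixels.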
An all-optical experimental setup of the fringe-adjusted non-zero order amplitude joint transform correlator (FA-AJTC) requires holographic (i.e. CD media type) or high-resolution photographic media. Using coherent laser light, these plates have to be “written” with the joint image in the input plane and with the reference fringe-adjusted filter in the Fourier plane.
Alternatively, a hybrid (optical-digital) (FA-AJTC) can be built with an amplitude spatial light modulator (ASLM) and a square-law image capture device (CCD camera) (Demoli et al., 1997; Sharp et al., 1999). A coherent laser beam is projected on (ASLM1), which is addressed by a computer with the joint image in the input plane. After the Fourier transform performed by the L1 lens, the joint power spectrum is captured by the CCD1 camera and stored in the computer. Here, digital processing of the joint power spectrum generates the non-zero order joint power spectrum, and then the FAF is applied. After these digital processing steps, the resulting power spectrum is projected in the Fourier plane with (ASLM2) onto the L2 lens to perform the inverse Fourier transform. In the output plane, the CCD2 camera captures the correlation result as a digital image.
All the above steps describe the 4f (FA-AJTC) architecture in the hybrid version. The 1/f (FA-AJTC) architecture in the hybrid version can be built using just one amplitude spatial light modulator (ASLM), one CCD camera and one lens. The correlation peaks are generated by two passes through this hybrid system.
The only drawback of the hybrid setup of the (FA-AJTC) is the time consumed by the digital operations (i.e. digitally addressing the (ASLM), capturing the digital image with the CCD camera and the digital computer operations). The benefits of the hybrid experimental setup are the mathematically perfect digital filtering and processing; thus, there is no need for optical alignment of the lens axes.
A quantitative analysis follows. The correlation performance criteria used to analyse the described joint transform correlators require the definition of the cross-correlation peak intensity, CPI, and the autocorrelation peak intensity, API. Their ratio is the detection efficiency coefficient, DEC (Abookasis et al., 2001; Alam & Karim, 1993a; Alam & Karim, 1993b); pattern recognition is considered to fail for values below the recommended threshold of 1.2000 (and not 1.0000). Values above the threshold indicate a successful pattern recognition process.
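The decision rule can be written as a short sketch. It assumes that the ratio is taken as API divided by CPI, which is the reading consistent with failure below the 1.2 threshold; the function names and the normalised peak values in the example are hypothetical.

```python
def detection_efficiency(api, cpi):
    """DEC, assumed here to be the ratio API / CPI."""
    return api / cpi

def is_recognized(api, cpi, threshold=1.2):
    """Recognition is treated as failed when DEC falls below the threshold."""
    return detection_efficiency(api, cpi) >= threshold

# Hypothetical peak intensities, normalised so that CPI = 1.
print(is_recognized(api=1.75, cpi=1.0))   # DEC above the threshold
print(is_recognized(api=1.19, cpi=1.0))   # DEC below the threshold
```

With a threshold of 1.2 rather than 1.0, an autocorrelation peak that is only marginally higher than the strongest cross-correlation peak still counts as a failure.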
The correlation performance of the (JTC) is best understood through a short example. Consider a joint image (figure 2) whose scene image contains a reference with additive noise (50% random type, indexed F1), two morphologically distorted reference images (indexed F2 and F3) and a non-reference image (indexed F4).
The correlation results, presented in figures 3 and 5, show the benefits of fringe-adjusted frequency domain filtering (figure 4). After this filtering, the output correlation peaks are sharper, and the minimum detection efficiency coefficient, calculated with the CPI for F4 and the APIs for F1, F2 and F3, is greater for the (FA-AJTC) (DEC = 1.7501) than for the (NZAJTC) (DEC = 1.1944). The (NZAJTC) thus fails to discriminate between the reference-type image F3 and the non-reference image F4: F3 generates the lowest API of the reference image class, and the resulting DEC value is below the threshold.
3. Phase-input joint transform correlators (PiJTC)
Amplitude-encoding joint transform correlators need an (ASLM) with large bandwidth and therefore generate wider correlation peaks and lower light efficiency. Holographic research has shown that phase encoding needs only a small bandwidth of the recording medium to generate higher light efficiency. These reasons made the transition from amplitude JTC to phase-input JTC, denoted (PiJTC), worthwhile.
Since the appearance of phase spatial light modulators (PSLM), there have been several approaches to encoding the joint image in the phase domain. In early attempts the direct phase encoding domain was restricted. Nowadays, (PSLM) direct phase encoding domains cover the full phase range (Cohn & Liang, 1996; Labastida et al., 1994; Takahashi & Ishii, 2010; Takahashi & Ishii, 2006). It must be mentioned that all-optical phase-input correlators can be built with a holographic approach, but such a solution is not versatile.
The (PiJTC) model is the same as the amplitude one, but it uses the phase transformation of the reference, scene and joint input images. This method assumes that the amplitude image is transformed from intensity gray levels (usually from 0 to 255) into phase levels within the phase domain of the modulator. The phase transformation function invoked to obtain a phase image is mathematically described by (Lu & Yu, 1996; Sharp et al., 1998)
where the parameters are the phase depth and the maximum and minimum gray levels of the amplitude-encoded image.
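A gray-level-to-phase mapping built from these three parameters can be sketched as follows. The linear form used here, phase = depth x (g - g_min) / (g_max - g_min), is an assumption based on the parameters named in the text; the function name and the default phase depth of pi are hypothetical.

```python
import numpy as np

def phase_encode(gray, phase_depth=np.pi, g_min=None, g_max=None):
    """Map gray levels linearly to phases and return a unit-modulus field."""
    gray = np.asarray(gray, dtype=float)
    g_min = gray.min() if g_min is None else g_min
    g_max = gray.max() if g_max is None else g_max
    phase = phase_depth * (gray - g_min) / (g_max - g_min)
    return np.exp(1j * phase)

image = np.array([[0, 64], [255, 128]])
field = phase_encode(image, phase_depth=np.pi)
print(np.abs(field))      # modulus is 1 everywhere
print(np.angle(field))    # phases between 0 and the phase depth
```

The encoded field carries all image information in its phase, so no light is absorbed at the input plane, which is the source of the higher light efficiency mentioned above.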
The reference, scene and joint phase-encoded images are optically Fourier transformed to obtain the reference, scene and joint power spectra in the Fourier plane (Chang & Chen, 2006; Nomura, 1998; Su & Karim, 1998; Su & Karim, 1999). These power spectra can be processed as in the amplitude non-zero order fringe-adjusted joint transform correlator (FA-AJTC) to provide the correlation peaks in the output plane. The second Fourier transform, generated by the L2 lens, uses an amplitude-coded image projection and therefore needs only an amplitude modulator (ASLM).
These steps describe the non-zero order fringe-adjusted phase-input joint transform correlator (FA-PiJTC). The “phase-input” terminology comes from the fact that, in the input plane of the (FA-PiJTC), a (PSLM) is used to make a direct phase projection of an amplitude input image.
In optical holography, when the object is phase encoded, the whitened holographic method must be applied in order to get the best contrast and light diffraction efficiency. Similarly, in the (FA-PiJTC) the directly projected phase-encoded input image must have a non-black background to achieve the theoretically predicted performance. One solution is to apply a gray-level grating background at the (PSLM) pixel level. However, this processing depends on the amplitude gray range of the input image and on the phase encoding domain of the (PSLM). If this step is done manually, real-time applications are not possible. Thus, for reasons of speed and better pattern recognition performance, the authors proposed a new (FA-PiJTC), denoted (FA-sinePiJTC), which can automatically adjust the amplitude gray range of the input image and the phase encoding domain of the (PSLM) at the same time.
The sine modulated fringe-adjusted non-zero order phase-input joint transform correlator (FA-sinePiJTC) adds an amplitude-domain pre-processing step in addition to the scalar transform of the input amplitude image. This step, like the phase transformation, is a scalar transformation of the input amplitude. It is carried out by a pre-processing trigonometric function, chosen by the authors to be the sine function
The pre-processing sine function needs a definition domain into which the amplitude image is scaled. The limits of this definition domain are the parameters of the (FA-sinePiJTC) and can be considered a degree of freedom. With these parameters, one can adjust the correlator for better correlation performance. One of the aims of this section is to analyse how far the correlation performance can be adjusted with these two parameters.
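One possible reading of this pre-processing step is sketched below. The exact expression used by the authors is not recoverable from the text, so everything here is an assumption: the gray levels are scaled linearly into a definition domain and the sine of the result is taken. The function name, the example domain limits and the gray range of 0 to 255 are hypothetical.

```python
import numpy as np

def sine_preprocess(gray, domain=(0.0, np.pi / 2), g_min=0.0, g_max=255.0):
    """Scale gray levels into the sine definition domain, then apply sine."""
    lower, upper = domain
    gray = np.asarray(gray, dtype=float)
    theta = lower + (upper - lower) * (gray - g_min) / (g_max - g_min)
    return np.sin(theta)

image = np.array([0.0, 85.0, 170.0, 255.0])
print(sine_preprocess(image))                              # rises from 0 to 1
print(sine_preprocess(image, domain=(-np.pi / 2, np.pi / 2)))  # rises from -1 to 1
```

Changing the two domain limits changes how the gray levels are compressed or stretched before phase encoding, which is the degree of freedom discussed in the text.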
The correlation performance of the (FA-PiJTC) and (FA-sinePiJTC) is illustrated by a short example which considers the same joint image used previously for the amplitude models.
In the same way, figures 7 and 9 show that the (FA-PiJTC) has sharper correlation peaks than the (FA-AJTC), with a minimum detection efficiency coefficient of DEC = 1.5547 for the chosen (PSLM) definition domain; the (FA-sinePiJTC) has the sharpest correlation peaks and the best pattern recognition performance, DEC = 2.0036, for the chosen sine definition domain and (PSLM) definition domain.
These results indicate that the (FA-sinePiJTC) can be chosen for database (i.e. inter-class and intra-class) face recognition with and without additive noise (which is a form of image distortion).
4. Experimental and computer simulation aspects of phase-input fringe-adjustment joint transform correlators (PiJTC)
The previous sections presented the mathematical models of the (FA-AJTC) and (FA-sinePiJTC) and the experimental setup for the (FA-AJTC) in all-optical and hybrid versions.
The following presentation focuses on the experimental hybrid setup, the computer simulation conditions and the results for the (FA-sinePiJTC) model. In the 4f architecture the (FA-sinePiJTC) can be implemented more accurately. The reason is that in the first stage the joint image is projected in the phase domain with a (PSLM), and in the second stage the processed power spectrum is projected in the amplitude domain with an (ASLM). Thus, a phase-input joint transform correlator necessarily needs two spatial light modulators: one in the phase domain and one in the amplitude domain. This is why the 4f phase-input joint transform correlator is also called a “phase-amplitude” architecture correlator. In this respect, some researchers have shown that the “phase-phase” architecture does not bring any improvement in pattern recognition performance over the “amplitude-amplitude” one.
The drawbacks and benefits of the 4f hybrid architecture are the same for the (FA-sinePiJTC) model as for the (FA-AJTC) model. The (FA-sinePiJTC) model has a degree of freedom, the sine modulation domain limits, that can adjust the performance of the correlation process.
The computer simulations were done on a face database with the sine modulated fringe-adjusted phase-input joint transform correlator (FA-sinePiJTC). Two sine modulation domains are used in the computer simulations.
The face databases used are two parts of the “General Purpose Recognition and Face Recognition” section of the Computer Vision Research Projects supervised by Dr. Libor Spacek at the Department of Computer Science, University of Essex, Colchester, UK (http://cswww.essex.ac.uk/mv/allfaces/index.html). As described by the database owner, the face images are held in four directories (Faces94, Faces95, Faces96, Grimace), in order of increasing difficulty. Faces96 and Grimace are the most difficult, for two different reasons: variation of background and scale, versus extreme variation of expressions.
The presented computer simulations use the “male” and “malestaff” directories from Faces94 (updated Friday, 16-Feb-2007 15:52:52 GMT). The subjects sit at a fixed distance from the camera and are asked to speak while a sequence of images is taken. The speech is used to introduce facial expression variation (see table 1).
Joint images, built up to perform the correlation process, have 512x512 pixels. The reference image is located in the upper half and two target images are located in the lower half composing the scene image.
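The layout of such a joint image can be sketched as follows. The exact positions of the three faces inside the 512x512 frame are not given in the text, so the centring used here is an assumption, as are the function name and the constant-valued placeholder images of 200x180 pixels.

```python
import numpy as np

def build_joint_image(reference, target_a, target_b, size=512):
    """Place the reference in the upper half and two targets in the lower half."""
    joint = np.zeros((size, size))
    height, width = reference.shape
    half, quarter = size // 2, size // 4

    def place(img, row_centre, col_centre):
        r0 = row_centre - height // 2
        c0 = col_centre - width // 2
        joint[r0:r0 + height, c0:c0 + width] = img

    place(reference, quarter, half)            # centre of the upper half
    place(target_a, half + quarter, quarter)   # lower-left quadrant
    place(target_b, half + quarter, half + quarter)  # lower-right quadrant
    return joint

reference = np.full((200, 180), 1.0)   # placeholder face images
target_a = np.full((200, 180), 2.0)
target_b = np.full((200, 180), 3.0)
joint = build_joint_image(reference, target_a, target_b)
print(joint.shape)
```

Because one reference is correlated with two targets in a single pass, each joint image yields two reference-target comparisons.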
5. Correlation performances of phase-input fringe-adjustment joint transform correlator (FA-PiJTC) in face recognition with additive noise
Computer simulations of the correlation process over a database can be very time consuming. To save time, a step-by-step strategy is chosen: a small database is tested first, and the decisions drawn from its results are then applied to a larger database in order to reach the final conclusions. The smaller database comes from the “malestaff” directory and the larger one from the “male” directory.
The test face database is organized in individual classes (figure 10). One individual class has a total of 20 images of the same person (i.e. individual). For the test database, only 3 face images (figure 11) are selected from each of the 21 classes (figure 12).
Five additive noise conditions, with 10%, 20%, 30%, 40% and 50% random noise levels, are presented in figure 11. The references, consisting of just the face regions, were cut out of the original images to prevent false correlations with the background. A CIELab colour filter was used to segment the predominantly green background.
The correlation performance over the face database without additive noise must be analysed first. If the pattern recognition process fails or has poor detection efficiency without additive noise, then additive noise will make the process collapse. At the start, correlations over all class reference images were computed: 21 classes x (20x20) reference image pairs/class. For each class, only the overall minimum of the 20x20 = 400 values was retained as the intra-class correlation peak intensity (ICPI), represented by the diamond line in figure 13.
A large range of ICPI values can be noticed in figure 13. The main reason is that the face images were captured during a monologue. At the start of the session, in the first images, the individuals pay attention to their pose. After a few seconds, their head motions generate significant variations in the face images. As a consequence, the ICPI decreases very fast for 8 individual classes. In order to analyse the correlator performance under additive noise, the authors decided, as a countermeasure, to reduce the number of reference images for the test database from 20 to only 3, and the target images to just one (the first in the class).
Correlations among the 63 reference images (21 classes x 3 reference images) were computed. The diagonal (3x3) blocks of the cross-correlation matrix represent the intra-class results. For each class, only the overall minimum of the 3x3 = 9 values was retained as the intra-class correlation peak intensity (ICPI), represented by the boxed line in figure 14. The remaining values are the inter-class cross-correlation values. For each class, only the maximum of the inter-class correlation values was retained as the inter-class correlation peak intensity (IECPI), represented by the diamond line in figure 14.
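The reduction of the correlation matrix to one ICPI and one IECPI value per class can be sketched as below. The function name and the small 4x4 example matrix are hypothetical; the sketch assumes a square matrix of peak intensities whose rows and columns are ordered class by class.

```python
import numpy as np

def class_peak_lines(peaks, images_per_class):
    """Return per-class ICPI (minimum intra-class peak) and IECPI (maximum inter-class peak)."""
    n_classes = peaks.shape[0] // images_per_class
    icpi = np.empty(n_classes)
    iecpi = np.empty(n_classes)
    for c in range(n_classes):
        rows = slice(c * images_per_class, (c + 1) * images_per_class)
        same_class = np.zeros(peaks.shape[1], dtype=bool)
        same_class[rows] = True                  # columns of the diagonal block
        block = peaks[rows]
        icpi[c] = block[:, same_class].min()     # worst match inside the class
        iecpi[c] = block[:, ~same_class].max()   # best match against other classes
    return icpi, iecpi

# Hypothetical peak intensities: 2 classes x 2 images per class.
peaks = np.array([[10.0, 9.0, 2.0, 1.0],
                  [9.0, 10.0, 3.0, 2.0],
                  [2.0, 3.0, 10.0, 8.0],
                  [1.0, 2.0, 8.0, 10.0]])
icpi, iecpi = class_peak_lines(peaks, images_per_class=2)
print(icpi, iecpi, (icpi - iecpi).min())
```

A positive minimum of `icpi - iecpi` corresponds to non-intersecting lines: a single threshold placed in the gap separates every intra-class peak from every inter-class peak.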
In the same manner, the results of the cross-correlations between the 63 reference images (21 classes x 3 reference images) and the 105 noisy images (21 classes x (1 original image x 5 noise conditions)) were generated. The graphical results are presented in figure 15 as the ICPI-AN and IECPI-AN (AN for Additive Noise) correlation peak intensity lines.
Figure 14 shows that the ICPI line does not intersect the IECPI line; figure 15 shows that the ICPI-AN line does not intersect the IECPI-AN line. If the lines intersected, there would be an image from one individual class that is similar to an image from a different class. Since they do not, the (FA-sinePiJTC) is robust to random additive noise on the test database.
Figures 14 and 15 show gaps between the ICPI/ICPI-AN values and the IECPI/IECPI-AN values along all classes. These gaps signify a very good discrimination (e.g. pattern recognition) performance of the (FA-sinePiJTC). Thus, the (FA-sinePiJTC) can discriminate with high accuracy, over all classes (figure 16), between faces of different classes and can successfully register the faces from the same class. The accuracy (gap magnitude) decreases from 96.05 (a.u.) without additive noise to 22.95 (a.u.) under additive noise (figure 16).
The procedure established by the correlation results with the (FA-sinePiJTC) over the test face image database can now be applied to a larger face image database. First, the (102 classes x (20x20) reference images) autocorrelations are computed. The results presented in figure 17 show small ICPI values for 10% of the classes. As a consequence, only 5 reference images per individual class were considered.
Under these conditions, the correlations among the 510 reference images (102 classes x 5 reference images) were computed. The results are presented in figure 18 as the ICPI and IECPI correlation peak intensity lines.
The correlation lines, ICPI and IECPI, in figure 18 do not intersect, but the gap between them is very small. As mentioned before, the main reason is the significant variation of the faces due to motion during the monologue.
Under analogous conditions, the cross-correlations between the 510 reference images (102 classes x 5 reference images) and the 2550 noisy images (102 classes x (5 original images x 5 noise conditions)) were computed. The graphical results are presented in figure 19 as the ICPI-AN and IECPI-AN correlation peak intensity lines. The ICPI-AN and IECPI-AN lines intersect, so no gap is present between them. This indicates that some individuals appear similar to the (FA-sinePiJTC) under the mentioned conditions.
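The total of 1,560,600 correlations quoted in this chapter can be reproduced from the figures above. The decomposition below (references against references plus references against noisy images) is an inference from the arithmetic, and the variable names are hypothetical.

```python
classes = 102
references_per_class = 5
noise_conditions = 5

references = classes * references_per_class        # 510 reference images
noisy_images = references * noise_conditions       # 2550 noisy images

clean_pairs = references * references              # references vs references
noisy_pairs = references * noisy_images            # references vs noisy images
total_pairs = clean_pairs + noisy_pairs
print(references, noisy_images, clean_pairs, noisy_pairs, total_pairs)
```

The sum of the two sets of pairs matches the stated number of single-pair correlations for the large database.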
The pattern recognition performance in this case is measured by the equal error rate (EER), genuine acceptance rate (GAR), false acceptance rate (FAR) and false rejection rate (FRR) (figures 20-22). The receiver operating characteristic (ROC) curve is also built (figure 22).
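These rates can be computed from two lists of correlation peak intensities, one for intra-class (genuine) pairs and one for inter-class (impostor) pairs. The sketch below uses the standard definitions with a peak-intensity threshold; the function names and the small score lists are hypothetical and are not the chapter's data.

```python
import numpy as np

def error_rates(genuine, impostor, thresholds):
    """FAR, FRR and GAR for each threshold on the correlation peak intensity."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    far = np.array([(impostor >= t).mean() for t in thresholds])  # impostors accepted
    frr = np.array([(genuine < t).mean() for t in thresholds])    # genuines rejected
    return far, frr, 1.0 - frr

def equal_error_rate(far, frr, thresholds):
    """EER and its threshold, taken where FAR and FRR are closest."""
    i = int(np.argmin(np.abs(far - frr)))
    return (far[i] + frr[i]) / 2.0, thresholds[i]

genuine = [5.0, 6.0, 7.0, 8.0]      # hypothetical intra-class peak intensities
impostor = [1.0, 2.0, 3.0, 6.5]     # hypothetical inter-class peak intensities
thresholds = np.linspace(0.0, 9.0, 19)

far, frr, gar = error_rates(genuine, impostor, thresholds)
eer, eer_threshold = equal_error_rate(far, frr, thresholds)
print(eer, eer_threshold)
```

Plotting `gar` against `far` over all thresholds gives the ROC curve; the threshold returned with the EER plays the role of the CPI value reported alongside the EER.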
This chapter presents the additive noise robustness of the sine modulated fringe-adjusted phase-input joint transform correlator (FA-sinePiJTC), tested over a database of human face images. Two databases were inspected: a small test database for preliminary results, and then a large database for the final results and conclusions. The architecture of the proposed JTC is the 4f one, with a non-zero order algorithm and a fringe-adjusted filter, in order to achieve better pattern recognition efficiency.
The sine modulation function gives the correlation process a degree of freedom through the limits of the modulation domain. These parameters can tune the pattern recognition performance of the correlation process to the variations of faces in the captured images. These variations are in-plane and in-axis head rotations, known to strongly degrade pattern discrimination performance. To ensure better performance, the correlation processes were run with one sine modulation domain for the test database and with another for the large database. It must be pointed out that the combination of the sine modulation function, the phase input and the fringe-adjusted filter provides the best pattern recognition performance over the inspected face databases.
The presented results of the proposed joint transform correlator algorithm were achieved with 512x512 pixel input images. The joint image has two target images in the scene section, so it works twice as fast as the Vander Lugt correlator (VLC) or other single-pair matching correlators. Computer simulations were done on an Intel i7-870 processor at 2.93 GHz with 3 GB of RAM, at an operation speed of 0.185 seconds per single reference-target correlation. The application was built to be used as a single process (CPU thread) or as multiple processes run simultaneously in at most 5 CPU threads. The number of CPU threads is limited by the RAM and not by the processor. Under these conditions, the single reference-target correlation time drops to only 37 milliseconds, which makes it possible to compute the 1,560,600 single-pair correlations of the large database. In this way, the statistical significance of the face recognition results is very high.
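The timing figures can be related by simple arithmetic. The sketch assumes ideal scaling over the 5 threads, which is what the quoted 37 milliseconds implies; the total run time derived at the end is an estimate under that assumption and is not a figure reported by the authors.

```python
single_thread_seconds = 0.185   # one reference-target correlation, one thread
threads = 5
total_pairs = 1_560_600

per_pair_seconds = single_thread_seconds / threads       # effective time per pair
total_hours = total_pairs * per_pair_seconds / 3600.0    # estimated wall-clock time
print(per_pair_seconds * 1000.0, total_hours)
```

Under these assumptions the full set of correlations takes on the order of 16 hours, compared with roughly 80 hours on a single thread.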
The (FA-sinePiJTC) face recognition for the test database reveals that choosing only 3 out of 20 face images at the start of the capture process provides zero EER and FAR error rates. This follows from the gap between the maximum IECPI and the minimum ICPI over the entire database, both without and with random additive noise. We can conclude that the (FA-sinePiJTC) is robust to additive noise up to 50% for database face recognition.
Selecting more than 3 initial face images, up to 5 face images per individual class, widens the variations of faces. In this situation non-zero error rates appear. The equal error rate is EER = 0.0886% for a CPI value of 35.4007, and the genuine acceptance rate is GAR = 99.9020% at a false acceptance rate of FAR = 0.0869%. These error rates are very low, revealing the high accuracy and high robustness of the (FA-sinePiJTC) in face recognition with up to 50% additive random noise.
The low resolution used in the above performance analysis (200x180 pixels) supports applications in video security systems.
The authors' experience with the same (FA-sinePiJTC) confirms the good pattern recognition results in correlation applications involving human fingerprints and irises, mostly owing to the sine modulation function, the phase input and the fringe-adjusted filter emphasized previously.
Future work can involve 3D face recognition systems based on (FA-sinePiJTC) correlation algorithm.