Advances in Adaptive Composite Filters for Object Recognition


Introduction
Object recognition is one of the most common problems addressed by researchers and engineers who develop artificial vision or image analysis systems. To recognize an object within an image or video sequence, we must solve two different but related tasks. First, the target object must be detected within the scene image; second, its exact location within the image must be estimated. While the general concept of object recognition is straightforward, even a brief review of modern literature reveals a wide range of proposals and systems (Goudail & Refregier, 2004; Szeliski, 2010). One of the most common and successful approaches is the local feature-based system, which normally employs two basic steps (Lowe, 2004; Tuytelaars & Mikolajczyk, 2008). First, object features are extracted from the scene image; afterwards, a classification step determines whether the observed features belong to the target object, a process known as feature matching. Feature-based systems have achieved very good results and are widely used in many application domains. Nevertheless, they suffer from two noteworthy drawbacks. First, they can be computationally expensive, and second, their overall performance depends upon some ad-hoc decisions that might require optimization (Brown et al., 2011; Olague & Trujillo, 2011; Pérez & Olague, 2008; Theodoridis & Koutroumbas, 2008; Trujillo & Olague, 2008).
An attractive alternative to feature-based systems is given by correlation filtering algorithms, an approach that has been intensively investigated over the last decades (Vijaya-Kumar et al., 2005). A correlation filter is basically a linear system whose output is the maximum-likelihood estimator of the target's coordinates in the observed scene (Goudail & Refregier, 2004; Refregier, 1999). In other words, detection is carried out by searching for correlation peaks in the system output, and the coordinates of these peaks provide the position estimates that localize the objects within the scene. An advantage of correlation filtering is its strong mathematical foundation. Moreover, the design process of correlation filters usually considers the optimization of various performance criteria (Vijaya-Kumar & Hassebrook, 1990). As a result, correlation filters have been used to develop reliable object recognition systems that exhibit robust performance even in highly noisy conditions (Javidi & Horner, 1994; Javidi & Wang, 1997; Javidi et al., 1996). Correlation filters are commonly implemented using hybrid opto-digital correlators, thus exploiting the inherent parallelism of optics and achieving a very high rate of operation. Optical correlators follow two basic types of architectures: the 4F correlator (4FC) (Vanderlugt, 1964) and the joint transform correlator (JTC) (Javidi & Horner, 1989; Weaver & Goodman, 1966). Both architectures allow fast object recognition; however, they are very sensitive to ambient disturbances and to misalignments in the optical setup (Nicolás et al., 2001). On the other hand, it is also possible to effectively implement correlation filters using a digital computer and efficient algorithms for the fast Fourier transform.
In fact, currently there are several very large scale integration (VLSI) devices that can be used to digitally implement correlation filtering algorithms that operate in real-time, such as field programmable gate arrays (FPGA) (Rakvic et al., 2010) and graphics processing units (GPU) (Sanders & Kandrot, 2010).
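As a rough illustration of how correlation filtering can be carried out digitally with the fast Fourier transform, the following Python sketch (assuming NumPy) correlates a scene with a plain matched template and reads the target position from the correlation peak. The function names `correlate_fft` and `find_peak`, the image sizes, and the random test data are illustrative assumptions, not part of the cited works.

```python
import numpy as np

def correlate_fft(scene, template):
    """Circular cross-correlation of `scene` with `template` via the FFT."""
    S = np.fft.fft2(scene)
    # Zero-pad the template to the scene size (its origin is the top-left corner).
    H = np.fft.fft2(template, s=scene.shape)
    # Correlation theorem: correlation = IFFT(S * conj(H)).
    return np.fft.ifft2(S * np.conj(H))

def find_peak(corr):
    """Return (row, col) of the maximum-intensity point of the correlation plane."""
    intensity = np.abs(corr) ** 2
    return np.unravel_index(np.argmax(intensity), intensity.shape)

# Illustrative data: a random 8x8 "target" placed on an empty 64x64 scene.
rng = np.random.default_rng(0)
target = rng.random((8, 8))
scene = np.zeros((64, 64))
scene[20:28, 30:38] = target      # top-left corner of the target at (20, 30)

peak = find_peak(correlate_fft(scene, target))
```

The peak coordinates refer to the top-left corner of the template because of the zero-padding convention chosen here; a centered convention would only shift the estimate by a constant offset.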
In general, correlation filters can be broadly classified into two main classes: analytical filters and composite filters. Analytical filters are typically given by a closed-form mathematical expression that is directly derived from the respective signal and noise models while optimizing specific quality metrics (Javidi & Wang, 1997; Kerekes & Vijaya-Kumar, 2006; Vijaya-Kumar et al., 2000; Yaroslavsky, 1993). On the other hand, composite filters are constructed by combining a set of training images, which are explicit representations of the target object and its expected distortions (Bahri & Kumar, 1988; Kerekes & Vijaya-Kumar, 2008; Vijaya-Kumar, 1992). It is assumed that when the training images are properly chosen, we can synthesize composite filters that achieve very good and robust performance in recognizing the target object. The rest of this chapter deals with composite correlation filters, while the interested reader is referred to (Javidi & Horner, 1994; Vijaya-Kumar et al., 2005) for more information regarding analytical filters. Composite filters can be further classified as constrained or unconstrained filters. Constrained filters are designed in such a manner that the filter's output at the origin of the training images must be equal to a prespecified value (Kerekes & Vijaya-Kumar, 2008; Vijaya-Kumar, 1992). These restrictions are known as the equal output correlation-peak (EOC) constraints. Synthetic discriminant functions (SDF) (Hester & Casasent, 1980) and minimum average correlation energy (MACE) (Mahalanobis et al., 1987) are two popular constrained filters. Unconstrained filters avoid the EOC constraints in order to expand the solution space for filter synthesis, thus achieving a higher robustness to scene distortions when compared to constrained filters. Maximum average correlation height (MACH) filters and optimal trade-off SDF (OTSDF) filters (Goudail & Refregier, 2004; Vijaya-Kumar et al., 1994) are examples of widely used unconstrained filters.
The MACH filters maximize the average response at the origin of the training images and also minimize an average dissimilarity measure over the training set. Thus, MACH filters are robust to distorted versions of the target which are not included in the training set (called intraclass distortions). Several versions of MACH filters exist; among these, the generalized MACH (GMACH) filter achieves the lowest variations in correlation peaks among the set of training images (Alkanhal et al., 2000; Nevel & Mahalanobis, 2003). This means that the GMACH filter yields an optimized response to intraclass distortions. The OTSDF filters, on the other hand, provide a compromise between multiple performance criteria by optimizing their weighted sum. As a result, OTSDF filters can yield a balanced performance in recognizing a target corrupted by several types of concurrent noise processes. Recently, a composite filter which performs a compromise between a constrained and an unconstrained filter using two mutually exclusive training sets was proposed (Diaz-Ramirez, 2010). This constrained filter improves tolerance to intraclass distortions without lowering the signal-to-noise ratio.
A main drawback of both constrained and unconstrained composite filters is that their performance strongly depends upon the proper selection of the training set of images. In fact, the training images are commonly chosen based on the experience of the designer in an ad-hoc manner. Therefore, it is not possible to guarantee optimal performance in the general case, given that it is not possible to a priori determine the optimal set of training patterns.
To overcome these shortcomings, recent works propose an adaptive approach towards filter synthesis (Aguilar-Gonzalez et al., 2008;Diaz-Ramirez & Kober, 2007;Diaz-Ramirez et al., 2006;Gonzalez-Fraga et al., 2006;Martinez-Diaz et al., 2008;Ramos-Michel & Kober, 2008). In such an approach, the goal is to construct a composite filter with optimal performance characteristics for a fixed set of patterns, rather than a filter that achieves average performance over an ensemble of images. One possible way to implement an adaptive approach for filter synthesis is to use an incremental search algorithm. Such an algorithm can use all available information about the objects to be recognized, as well as examples of false objects or background samples that should be rejected. The adaptive process for filter synthesis can also account for additive sensor noise by training with images corrupted by a particular noise model. Therefore, adaptive filters can exhibit a high amount of robustness to noise during the imaging process.
This chapter presents recent advances in the design of adaptive composite correlation filters for robust object recognition. We describe two different design approaches, based on the basic models of constrained and unconstrained filters. We show that the resultant adaptive constrained filters can achieve a high recognition rate with a low computational complexity, by simply using EOC constraints with complex values. Furthermore, unconstrained adaptive filters can be constructed to produce robust recognition in highly noisy conditions. The remainder of the chapter is organized as follows. Section 2 presents a brief review of the most successful composite filters for object recognition. Then, Section 3 describes two proposed algorithms to synthesize adaptive composite filters. Computer simulation results obtained with the proposed adaptive filters are presented in Section 4. These results are discussed and compared, in terms of performance metrics, with those obtained with existing composite filters in noisy scenes. Finally, Section 5 summarizes our conclusions.

Composite correlation filters
In this section, the main strategies for composite correlation filter design are recalled. We consider constrained SDF and MACE filters, as well as unconstrained MACH and OTSDF filters. Basically, composite filters can be used for intraclass distortion-tolerant pattern recognition; i.e., detection of distorted patterns belonging to the same class of objects. Let $\{S\} = \{T_i(\mu,\nu) \mid i = 1,\ldots,N\}$ be a set consisting of $N$ different training images expressed in the frequency domain, where each one represents a distorted version of the target object $t(x,y)$, and $T(\mu,\nu)$ is the Fourier transform of $t(x,y)$. Composite filters must be able to recognize the target and all the distorted versions in $\{S\}$ using a single correlation operation.

Synthetic Discriminant Functions (SDF) filter
An SDF filter can be expressed as a linear combination of the Fourier-transformed training images $T_i(\mu,\nu)$, as follows,

$$H(\mu,\nu) = \sum_{i=1}^{N} a_i T_i(\mu,\nu), \tag{1}$$

where $\{a_i \mid i = 1,\ldots,N\}$ are unknown coefficients that must be chosen to satisfy the inner-product conditions (Vijaya-Kumar, 1992)

$$\langle T_i(\mu,\nu), H(\mu,\nu) \rangle = c_i, \quad i = 1,\ldots,N. \tag{2}$$

The quantities $\{c_i\}$ represent the EOC constraints, that is, prespecified values in the correlation output at the origin of each training image. Let $T$ be a matrix with $N$ columns and $d$ rows (the number of pixels in each training image), whose $i$th column is given by $t_i$, a $d \times 1$ vector constructed by placing the elements of $T_i(\mu,\nu)$ in lexicographical order. Let $a$ and $c$ respectively represent column vectors of $\{a_i\}$ and $\{c_i\}$. In matrix-vector notation, the filter $H(\mu,\nu)$ and the constraints $\{c_i\}$ can be rewritten as

$$h_{sdf} = T a \tag{3}$$

and

$$c = T^{+} h_{sdf}, \tag{4}$$

where superscripts "*" and "+" represent the complex conjugate and the conjugate transpose, respectively. Combining Eqs. (3) and (4), the solution of the system of equations is $a = (T^{+}T)^{-1} c$, and if the matrix $T^{+}T$ is nonsingular the filter solution is

$$h_{sdf} = T (T^{+}T)^{-1} c. \tag{5}$$
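The closed-form SDF solution, $h = T (T^{+}T)^{-1} c$, can be sketched numerically as follows. This is a minimal illustration assuming NumPy; the function name `sdf_filter`, the image sizes, and the random training images are hypothetical choices made only for the example. The last line checks the EOC constraints by evaluating $T^{+}h$.

```python
import numpy as np

def sdf_filter(T, c):
    """SDF filter h = T (T^+ T)^{-1} c.

    T : (d, N) complex array, one vectorized training spectrum per column.
    c : (N,) array of prespecified EOC constraint values.
    """
    gram = T.conj().T @ T            # N x N matrix T^+ T (assumed nonsingular)
    a = np.linalg.solve(gram, c)     # coefficients a = (T^+ T)^{-1} c
    return T @ a

# Illustrative data: three random 16x16 training images (not from the chapter).
rng = np.random.default_rng(1)
images = rng.random((3, 16, 16))
T = np.stack([np.fft.fft2(im).ravel() for im in images], axis=1)   # d x N
c = np.ones(3, dtype=complex)

h = sdf_filter(T, c)
peaks = T.conj().T @ h               # equals c when the constraints are met
```

Solving the small $N \times N$ system with `np.linalg.solve` avoids forming an explicit inverse, which is usually preferable numerically.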

Minimum Average Correlation Energy (MACE) filter
The MACE filter is able to produce sharp correlation peaks by suppressing lateral sidelobes (Mahalanobis et al., 1987). This can be done by minimizing the average correlation energy (ACE) in the filter output, subject to the prespecified EOC constraints. The effect of minimizing the ACE measure is that the resultant correlation function yields values close to zero everywhere except at the central location of the training images, where the EOC constraints occur (Mahalanobis et al., 1987). Let $D$ be a $d \times d$ diagonal matrix whose main-diagonal entries are obtained by computing $E\{|t_i|^2;\ i = 1,\ldots,N\}$, the average power spectrum of the training images. In matrix-vector notation, the filter $h_{MACE}$ that minimizes the ACE, $h^{+} D h$, subject to the EOC constraints is given by (Mahalanobis et al., 1987)

$$h_{MACE} = D^{-1} T \left(T^{+} D^{-1} T\right)^{-1} c.$$
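A small numerical sketch may help to see the effect of the ACE minimization. Assuming NumPy and the standard MACE solution $h = D^{-1}T(T^{+}D^{-1}T)^{-1}c$ of Mahalanobis et al. (1987), the code below builds a MACE filter and an SDF filter for the same constraints and compares their ACE values $h^{+}Dh$. All function names and the random training data are illustrative assumptions.

```python
import numpy as np

def sdf_filter(T, c):
    """SDF filter h = T (T^+ T)^{-1} c (used here only as a baseline)."""
    return T @ np.linalg.solve(T.conj().T @ T, c)

def mace_filter(T, c):
    """MACE filter h = D^{-1} T (T^+ D^{-1} T)^{-1} c, with D stored as a vector."""
    D = np.mean(np.abs(T) ** 2, axis=1)       # average power spectrum (diagonal of D)
    DinvT = T / D[:, None]                    # D^{-1} T without building a d x d matrix
    a = np.linalg.solve(T.conj().T @ DinvT, c)
    return DinvT @ a, D

def ace(h, D):
    """Average correlation energy h^+ D h for a diagonal D."""
    return float(np.sum(D * np.abs(h) ** 2))

# Illustrative data: three random 16x16 training images.
rng = np.random.default_rng(5)
images = rng.random((3, 16, 16))
T = np.stack([np.fft.fft2(im).ravel() for im in images], axis=1)
c = np.ones(3, dtype=complex)

h_mace, D = mace_filter(T, c)
h_sdf = sdf_filter(T, c)
ace_mace, ace_sdf = ace(h_mace, D), ace(h_sdf, D)
```

Both filters satisfy the same constraints, so the MACE value of the ACE cannot exceed the SDF value; the gap between the two is what shows up as suppressed sidelobes.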

Two-class problem
Assume that there are several distorted versions of a target object $\{t_i(x,y)\}$ and various objects to be discriminated $\{f_i(x,y)\}$; in other words, a two-class pattern recognition problem. Then, the goal is to design a constrained composite filter to recognize images from the training set of true-class objects (target class), given by

$$\{T\} = \{T_i(\mu,\nu) \mid i = 1,\ldots,N_T\},$$

and to reject training images from the false class (unwanted class), given by

$$\{F\} = \{F_i(\mu,\nu) \mid i = 1,\ldots,N_F\}.$$

A two-class composite filter can be constructed by combining all of the given training images in a set $\{S\} = \{T\} \cup \{F\}$. Afterwards, to solve the two-class pattern recognition problem we can set the filter output as

$$c_i = 1$$

for the true-class objects, and

$$c_i = 0$$

for the false-class objects. In this manner, the vector $c$ of EOC constraints is given by

$$c = [\,1, \ldots, 1, 0, \ldots, 0\,]^{T},$$

with $N_T$ ones followed by $N_F$ zeros. It can be seen that both SDF and MACE filters with equal output correlation peaks can be used for intraclass distortion-tolerant pattern recognition or for interclass pattern recognition. For a two-class constrained composite filter, we can expect that the central correlation peak will be close to unity for the true-class objects and close to zero for objects of the false class. Moreover, this approach can easily be extended to multiclass problems.

Multiclass problem
Suppose that the true-class subset $\{T\}$ is given by the union of $K$ different subsets of training images, as follows

$$\{T\} = \{T^1\} \cup \{T^2\} \cup \cdots \cup \{T^K\},$$

where $\{T^k\}$ is a subset of training images that represents the $k$th true class of objects to be recognized, which is given by

$$\{T^k\} = \{T_i^k(\mu,\nu) \mid i = 1,\ldots,N_T\}.$$

Here, $T_i^k(\mu,\nu)$ is the $i$th Fourier-transformed training image, which belongs to the $k$th true class of objects. For simplicity, we assume that each subset $\{T^k\}$ contains $N_T$ training images. The set $\{S\}$ of all training images can be constructed as follows

$$\{S\} = \{T^1\} \cup \{T^2\} \cup \cdots \cup \{T^K\} \cup \{F\}.$$

According to the SDF approach, a constrained filter can be constructed as a linear combination of all training images in $\{S\}$, subject to the prespecified EOC constraints $\{c_i\}$ (Vijaya-Kumar, 1992). In the basic two-class object recognition problem, we need to set the filter output to yield an intensity value equal to unity for any object that belongs to $\{T\}$, and to yield an intensity value of zero for any object that belongs to $\{F\}$; i.e.,

$$|c_i|^2 = 1, \quad T_i \in \{T\}, \tag{17}$$

and

$$|c_i|^2 = 0, \quad F_i \in \{F\}. \tag{18}$$

Furthermore, to distinguish among objects from different true classes $\{T^k\}$, the constraint values $\{c_i\}$ must not only satisfy Eqs. (17) and (18), they must also provide information regarding the specific class of each training image. For this, we propose to use complex values $\{c_i\}$ with a magnitude equal to unity for all, but each with a different prespecified phase value that indicates the class that corresponds to each training image. The encoded phase values must be chosen to allow us to associate (in the complex correlation plane of the output) any unknown input pattern to one of the $K$ different true classes. This can be achieved by using the following EOC constraints,

$$c_i = \begin{cases} \exp(j\phi_k), & T_i \in \{T^k\}, \\ 0, & F_i \in \{F\}. \end{cases}$$

Here, $\{\phi_k \mid k = 1,\ldots,K\}$ are prespecified phase values associated with the $k$th true class of objects $\{T^k\}$. Observe that by using a constrained composite filter with complex EOC constraints, we satisfy the equal output intensity restrictions imposed by Eqs. (17) and (18), and at the same time we can classify any unknown input pattern from the input scene by comparing the obtained phase values $\hat{\phi}_k$ at the coordinates of maximum intensities (correlation peaks) with the prespecified $\phi_k$ values previously defined in the filter constraints (Diaz-Ramirez et al., 2012).
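The phase-encoding idea can be illustrated with a short sketch, assuming NumPy. A constrained SDF-type filter is synthesized with unit-magnitude complex constraints for the true classes and zero constraints for the false class, and the class of each true-class training vector is then recovered from the phase of its central correlation value. The evenly spaced phases, the sizes `K`, `n_true`, `n_false`, `d`, and the function names are assumptions made for the example; the text does not prescribe particular phase values.

```python
import numpy as np

def sdf_filter(T, c):
    """Constrained solution h = T (T^+ T)^{-1} c; columns of T are training vectors."""
    return T @ np.linalg.solve(T.conj().T @ T, c)

def decode_class(peak_value, phis):
    """Index of the prespecified phase closest to the phase of `peak_value`."""
    unit = np.exp(1j * np.angle(peak_value))
    return int(np.argmin(np.abs(unit - np.exp(1j * phis))))

rng = np.random.default_rng(2)
K, n_true, n_false, d = 3, 2, 2, 64            # hypothetical problem sizes
phis = 2 * np.pi * np.arange(K) / K            # assumed: evenly spaced class phases
labels = np.repeat(np.arange(K), n_true)       # class index of each true-class column

n_cols = K * n_true + n_false
T = rng.standard_normal((d, n_cols)) + 1j * rng.standard_normal((d, n_cols))

# Unit-magnitude complex constraints for true classes, zeros for the false class.
c = np.concatenate([np.exp(1j * phis[labels]), np.zeros(n_false)])

h = sdf_filter(T, c)
peaks = T.conj().T @ h                         # central correlation values
decoded = [decode_class(p, phis) for p in peaks[: K * n_true]]
```

In a real scene the measured phase is perturbed by noise and background, so the nearest-phase rule used in `decode_class` is only one plausible decision rule.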

Maximum Average Correlation Height (MACH) filter
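The MACH filter $h_{MACH}$ is designed to maximize the ratio between the intensity of the output average correlation height (ACH) and the average similarity measure (ASM) among training images. Hence, the MACH filter is designed to maximize the function $J = |ACH|^2 / ASM$. Let $X_i$ and $M$ both be $d \times d$ diagonal matrices containing the elements of the training vectors $t_i$ and of the average training vector

$$m = \frac{1}{N} \sum_{i=1}^{N} t_i, \tag{20}$$

respectively. Furthermore, the ACH measure can be described as the average of the output central correlation values produced by the training images, as

$$ACH = \frac{1}{N} \sum_{i=1}^{N} t_i^{+} h_{MACH}.$$

Additionally, the ASM can be seen as the average error between the full correlation responses produced by the training images, $v_i = X_i^{*} h_{MACH}$, and the correlation function produced by the average training image, $\bar{v} = M^{*} h_{MACH}$, that is,

$$ASM = \frac{1}{N} \sum_{i=1}^{N} \left| v_i - \bar{v} \right|^2 = \frac{1}{N} \sum_{i=1}^{N} \left( X_i^{*} h_{MACH} - M^{*} h_{MACH} \right)^{+} \left( X_i^{*} h_{MACH} - M^{*} h_{MACH} \right).$$

In a compact notation we can rewrite the ACH and ASM measures as follows,

$$ACH = m^{+} h_{MACH}$$

and

$$ASM = h_{MACH}^{+} S\, h_{MACH},$$

where

$$S = \frac{1}{N} \sum_{i=1}^{N} (X_i - M)(X_i - M)^{*}. \tag{25}$$

Thus, the filter $h_{MACH}$ is obtained by maximizing the following objective function (Mahalanobis et al., 1987):

$$J(h_{MACH}) = \frac{\left| m^{+} h_{MACH} \right|^2}{h_{MACH}^{+} S\, h_{MACH}},$$

where the resultant MACH filter is given by

$$h_{MACH} = S^{-1} m.$$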
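Since the matrices involved are diagonal, a MACH filter of the commonly used form $h = S^{-1}m$ reduces to an element-wise division. The following sketch, assuming NumPy, computes the mean spectrum $m$, the diagonal of $S$, and the criterion $J = |m^{+}h|^2 / (h^{+}Sh)$. The small constant `eps`, the function names, and the random training images are assumptions added for the example.

```python
import numpy as np

def mach_filter(T, eps=1e-12):
    """Sketch of a MACH filter h = S^{-1} m for diagonal S.

    T   : (d, N) complex array of vectorized training spectra.
    eps : assumed small regularizer guarding against division by zero.
    """
    m = T.mean(axis=1)                                   # average training vector
    s = np.mean(np.abs(T - m[:, None]) ** 2, axis=1)     # diagonal of S
    return m / (s + eps), m, s

def mach_criterion(h, m, s):
    """J(h) = |m^+ h|^2 / (h^+ S h) for diagonal S."""
    return float(np.abs(np.vdot(m, h)) ** 2 / np.sum(s * np.abs(h) ** 2))

# Illustrative data: four random 16x16 training images.
rng = np.random.default_rng(4)
images = rng.random((4, 16, 16))
T = np.stack([np.fft.fft2(im).ravel() for im in images], axis=1)

h_mach, m, s = mach_filter(T)
j_mach = mach_criterion(h_mach, m, s)
j_mean = mach_criterion(m, m, s)       # filter matched to the mean image, for comparison
```

Comparing `j_mach` with `j_mean` shows how much the criterion gains over a filter simply matched to the average training image.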

Generalized MACH (GMACH) filter
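The GMACH filter $h_{GMACH}$ (Alkanhal et al., 2000) can be seen as a trade-off between a filter with EOC constraints and the MACH filter. Note that the correlation output at the origin for the $i$th training image is $c_i = t_i^{+} h_{GMACH}$. Furthermore, the average correlation output at the origin is $\bar{c} = m^{+} h_{GMACH}$. The output correlation variance can be written as (Alkanhal et al., 2000)

$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} \left| c_i - \bar{c} \right|^2 = h_{GMACH}^{+}\, \Omega\, h_{GMACH},$$

where

$$\Omega = \frac{1}{N} \sum_{i=1}^{N} (t_i - m)(t_i - m)^{+}$$

is a covariance matrix estimate. The GMACH filter $h_{GMACH}$ is designed to maximize an objective function $J(h_{GMACH})$; both this function and the resultant filter are given in (Alkanhal et al., 2000).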

Optimal trade-off SDF (OTSDF) filter
In earlier sections, we have seen that the most successful composite filters are designed to optimize certain performance criteria, namely ACE, ASM, and ACH. However, some of these metrics are conflicting objectives, for instance ACE and ASM. Consider the MACE filter, which produces sharp correlation peaks by minimizing the output ACE. This means that the MACE filter has a great capacity to distinguish between target objects that should be recognized and false patterns that should be rejected. However, it is well known that the MACE filter has a poor tolerance to intraclass distortions, which is characterized by the ASM metric. Therefore, OTSDF filters are designed to perform a compromise between several conflicting measures (Goudail & Refregier, 2004). For instance, an OTSDF filter can be obtained by minimizing a weighted combination of these measures, where ACE and ASM are functions to be minimized, ACH is a function to be maximized, and the trade-off constants satisfy $\omega_1^2 + \omega_2^2 = 1$. The resultant OTSDF filter $h_{otsdf}$ is given in (Goudail & Refregier, 2004). We can see that unconstrained filters cannot restrict their correlation responses at the origin of the training images in the same manner that constrained filters do. Instead, these filters maximize the intensity value produced by the average training image and minimize the intensity response produced by unwanted patterns.

Adaptive composite filter designs
In Section 2 we described how a basic SDF filter is designed to satisfy the EOC constraints. This means that the filter is only able to control the output correlation points at the central location of the training images within the observed scene. This limited control yields high correlation sidelobes over the entire image background, which causes a drastic reduction in recognition performance for the SDF filter when it is used in highly cluttered scenes. This problem is solved by the MACE filter, which yields sharp correlation peaks at the central location of the training images and suppresses correlation sidelobes by minimizing the ACE metric. However, as we saw in Section 2, the MACE filter has a poor tolerance to intraclass distortions. In contrast, the OTSDF filter removes the EOC constraints to gain more control over the output correlation plane. In this manner, the filter can suppress the correlation sidelobes more efficiently and can improve its tolerance to intraclass distortions. This is accomplished because the OTSDF filter optimizes the ACH, ASM, and ACE performance measures. Note, however, that these metrics are based on the calculation of spatial averages over the complete training set of images. This leads to the synthesis of composite filters which can only yield average performance over several similar applications, and assuming stationary conditions.
In this chapter, we are interested in designing composite filters that are optimized in terms of performance metrics for a given set of patterns that are directly related to a particular application problem. First, we analyze the two-class pattern recognition problem, where the training set is given by $\{S\} = \{T\} \cup \{F\}$. We assume that the true-class training images $\{t_i(x,y)\} \in \{T\}$ are previously chosen by the filter designer and that the false-class images $\{f_i(x,y)\} \in \{F\}$ can be given by any known false object to be rejected, or by unknown patterns that have similar structures to those of the target. If information about the background where detection will be carried out is available, the false-class images $f_i(x,y)$ can be given by small fragments taken from a synthetic image with similar statistical properties to those of the expected background in the image scene.
Fig. 2. Iterative training procedure to synthesize an adaptive unconstrained composite filter
Let us define a set $\{U_F\}$ that contains all feasible image patterns that can be chosen as false-class images $f_i(x,y)$, an extremely vast set given the size and resolution of common digital images. The set $\{U_F\}$ can be seen as the universe of feasible training images from which we can obtain subset $\{F\}$. In this sense, we can see that an optimal subset $\{F_O\} \subset \{U_F\}$ of image patterns must exist, which is the set of false-class images that can be used to synthesize a composite filter that achieves optimal performance; i.e., when $\{F\} = \{F_O\}$. Note that the subset $\{F_O\}$ is a priori unknown, and its contents cannot be derived analytically from the problem definition. Therefore, a search and optimization strategy is required to find $\{F_O\}$.
In this chapter, the proposal is to use an adaptive iterative algorithm to search for $\{F_O\}$. The first step of the adaptation algorithm is to perform the correlation process between the background scene and a basic composite filter, initially trained with all available versions of the target and known false-class objects. The background function can be described either deterministically as an image or by a stochastic process. Next, we search for the coordinates of a point in the output correlation plane that allows us to improve the performance of the filter. The goal is to incorporate a segment, or region, that is cropped from the synthetic background around a central point as a new false-class image in $\{F\}$. We call this new image $f_n(x,y)$; it is taken from the background and has a support region similar to that of the target image class. The new image $f_n(x,y)$ should provide the maximum performance increase, based on a chosen performance criterion, when compared to all other possible background segments that could have been chosen. After including $f_n(x,y)$ in $\{F\}$, a new composite filter is synthesized. This procedure is iteratively repeated until a prespecified performance level for the filter is reached. Note that the suggested training procedure can be used to synthesize adaptive composite filters based on constrained or unconstrained models. The general steps of the training procedure are summarized as follows:
• STEP 1: Include all available training images in the corresponding subset {T} or {F}, and construct the training set {S} = {T} ∪ {F}.
• STEP 2: Synthesize a composite filter trained for {S} using a constrained or unconstrained filter model.
• STEP 3: Carry out the correlation between the current composite filter and a synthetic image of the background.
• STEP 4: Calculate the performance metrics of the composite filter and set them as the current performance level of the filter. If the performance level of the filter is greater than a prespecified value, the procedure is finished. Otherwise, go to the next step.
• STEP 5: Find the maximum intensity value in the output correlation plane, and around this point extract a new training image to be rejected from the background. The region of support of this new training image is similar to that of the reference image of the target.
• STEP 6: Include the new false-class image in set {F} and update set {S}. Next, go to STEP 2.
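The six steps above can be sketched in Python as follows, assuming NumPy, a constrained SDF model of the form $h = T(T^{+}T)^{-1}c$, and a discrimination-type performance level computed as one minus the ratio between the highest background intensity and the target peak intensity. This is only a simplified sketch under those assumptions: the function names, the iteration cap `max_iter`, the goal `dc_goal`, the wrap-around cropping, and the random test images are all hypothetical, and the target peak intensity is taken as unity because it is fixed by the constraint.

```python
import numpy as np

def sdf_filter(T, c):
    """Constrained SDF solution h = T (T^+ T)^{-1} c (columns of T are spectra)."""
    return T @ np.linalg.solve(T.conj().T @ T, c)

def spectrum(img, shape):
    """Zero-pad `img` to `shape` and return its vectorized 2-D FFT."""
    return np.fft.fft2(img, s=shape).ravel()

def correlation_plane(scene, h):
    """Correlation plane, scaled so a training image yields conj(c_i) at its origin."""
    H = h.reshape(scene.shape)
    return np.fft.ifft2(np.fft.fft2(scene) * np.conj(H)) * scene.size

def crop_wrapped(img, top, left, shape):
    """Crop a window of size `shape` starting at (top, left), wrapping at the borders."""
    rows = (top + np.arange(shape[0])) % img.shape[0]
    cols = (left + np.arange(shape[1])) % img.shape[1]
    return img[np.ix_(rows, cols)]

def train_adaptive_sdf(targets, background, dc_goal=0.9, max_iter=10):
    shape = background.shape
    true_cols = [spectrum(t, shape) for t in targets]                 # STEP 1
    false_cols, picked, dc_history = [], [], []
    while True:
        T = np.stack(true_cols + false_cols, axis=1)
        c = np.r_[np.ones(len(true_cols)), np.zeros(len(false_cols))].astype(complex)
        h = sdf_filter(T, c)                                          # STEP 2
        intensity = np.abs(correlation_plane(background, h)) ** 2     # STEP 3
        target_peak = 1.0            # |c_i|^2, fixed by the EOC constraint
        dc = 1.0 - intensity.max() / target_peak                      # STEP 4
        dc_history.append(dc)
        if dc >= dc_goal or len(false_cols) >= max_iter:
            return h, picked, dc_history
        top, left = np.unravel_index(np.argmax(intensity), shape)     # STEP 5
        patch = crop_wrapped(background, top, left, targets[0].shape)
        false_cols.append(spectrum(patch, shape))                     # STEP 6
        picked.append((int(top), int(left)))

# Illustrative run on random data (not the images used in the chapter).
rng = np.random.default_rng(3)
targets = [rng.random((8, 8)) for _ in range(2)]
background = rng.random((32, 32))
h, picked, dc_history = train_adaptive_sdf(targets, background, max_iter=5)
```

Each background fragment added to the false class forces the correlation output to zero at the location it was cropped from, which is how the loop gradually removes the strongest sidelobes. The cap `max_iter` is a practical safeguard, since the number of linearly independent training images is bounded by the number of pixels in the target window.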

Adaptive constrained filter design
An adaptive constrained filter can be constructed by training a simple SDF filter with the iterative procedure described above. First, all available views of the target are included in the true-class training set $\{T\}$. Next, we construct the matrix $T$ and the vector of constraints $c$, and a basic SDF filter $h_{sdf}$ is synthesized using Eq. (5). At this point, the $h_{sdf}$ filter is able to recognize all objects in subset $\{T\}$ with a single correlation operation. However, the filter may produce high correlation sidelobes when the target is embedded into a highly cluttered background. Nonetheless, we can train the filter $h_{sdf}$ to optimize its ability to distinguish among the different views of the target and the background by optimizing the discrimination capability (DC) of the filter. The DC can be formally defined as follows (Yaroslavsky, 1993):

$$DC = 1 - \frac{\left| c^{B}_{\max} \right|^2}{\left| c^{T}_{\max} \right|^2}, \tag{36}$$

where $|c^{B}_{\max}|^2$ is the maximum intensity value in the output correlation plane over the background area, and $|c^{T}_{\max}|^2$ is the maximum intensity value in the output correlation plane over the area occupied by the target. The background area and the target area are complementary. A filter with a DC value close to unity possesses a good capacity to distinguish between targets and unwanted objects. Negative values of DC indicate that the filter is unable to recognize any target. Note that other discrimination metrics can be used in the training procedure; for instance, the peak-to-correlation energy (PCE) (Vijaya-Kumar & Hassebrook, 1990) and the peak-to-sidelobe ratio (PSR) (Kerekes & Vijaya-Kumar, 2008). To measure the DC of the filter, we carry out the correlation process between $h_{sdf}$ and a synthetic image of the background with similar statistical properties to those of the real background; then we calculate the DC using Eq. (36). If the DC of the $h_{sdf}$ filter is greater than a prespecified value, the training procedure is finished.
Fig. 3. Sample views of the target object
Otherwise, we search for the coordinates of the highest sidelobe in the output correlation plane between $h_{sdf}$ and the background image. These coordinates are set as the origin, and around the origin we construct a training image from the background. This new training image is included in the false-class subset $\{F\}$ and a new $h_{sdf}$ filter is synthesized to recognize the object patterns in $\{T\}$ and reject the object patterns in $\{F\}$. This cycle can be continued until a desired DC value is reached. The training algorithm to synthesize an adaptive constrained composite filter is presented in Fig. 1.

Adaptive unconstrained filters
An unconstrained adaptive composite filter can be constructed by training a basic OTSDF filter and optimizing several performance criteria. It must be noted that since the OTSDF filter is not restricted to satisfy hard EOC constraints, the filter has more freedom to concurrently optimize multiple criteria. The flow diagram of the proposed iterative algorithm is presented in Fig. 2. The algorithm begins by constructing subset $\{T\}$ with all available views of the target objects. Next, we create the mean vector of training images $m$ (see Eq. (20)) and the matrix $S$ using Eq. (25); then a basic OTSDF filter is synthesized following Eq. (35). The diagonal matrix $D$ required in Eq. (35) can be constructed using all available known patterns that ought to be rejected; otherwise $D$ is zero. The next step of the algorithm is to carry out the correlation process between the current $h_{otsdf}$ filter and a synthetic image that is representative of the background. Afterwards, we evaluate the performance of the filter using the objective function of Eq. (37), where $D_{bg}$ is a diagonal matrix whose main diagonal is given by $|bg|^2$, the power spectrum vector of the representative image of the background. Note that the objective function increases when both the ACE and ASM metrics are minimized and when the ACH metric is maximized. If the value of Eq. (37) is greater than a desired value, the training procedure is finished. Otherwise, we search for the coordinates in the output correlation plane (between $h_{otsdf}$ and the background image) that achieve the maximum improvement of the objective function. These coordinates are the center of the background region that is extracted and included as a new training image. This new training image is included in the false set $\{F\}$ and the matrix $D$ is updated; finally, a new filter $h_{otsdf}$ is constructed. This cycle can be continued until a desired trade-off performance is obtained.

Experimental results
In this section, we analyze and discuss the simulation performance of the proposed adaptive filters for object recognition. These results are compared with those obtained with conventional MACE (Mahalanobis et al., 1987) and MACH composite filters. The performance of the composite filters is evaluated in terms of recognition performance and location accuracy. Recognition performance is given by the discrimination capability (see Eq. (36)), whereas location accuracy is characterized by the location errors (LE), defined by (Kober & Campos, 1996):

$$LE = \sqrt{(\tau_x - \hat{\tau}_x)^2 + (\tau_y - \hat{\tau}_y)^2},$$

where $(\tau_x, \tau_y)$ and $(\hat{\tau}_x, \hat{\tau}_y)$ are the exact and estimated target coordinates, respectively. The coordinates $(\tau_x, \tau_y)$ are assumed to be known, whereas $(\hat{\tau}_x, \hat{\tau}_y)$ are estimated from the correlation-peak location. The target is a flying bird whose sample views, extracted from a real video sequence, are shown in Fig. 3. The input scene is defined with a non-overlapping signal model (Javidi & Wang, 1994; Kober et al., 2000) as follows,

$$f(x,y) = t_k(x - \tau_{x_k}, y - \tau_{y_k}) + b(x,y)\left[1 - w_k(x - \tau_{x_k}, y - \tau_{y_k})\right] + n(x,y),$$

where $t_k(x,y)$ represents the $k$th view of the target, $\tau_{x_k}, \tau_{y_k}$ are random variables representing unknown coordinates of the target within the scene, $b(x,y)$ is the background, $n(x,y)$ is a zero-mean additive noise with variance $\sigma_n^2$, and $w_k(x,y)$ is the region of support of $t_k(x,y)$ (unity inside the target area and zero elsewhere). The input scene can be interpreted as a view of the target embedded into a background at unknown coordinates, and corrupted with additive noise. In our experiments, we use monochrome images of size 400×400 pixels. The signal range is [0, 1] with 256 quantization levels. The size of the target is about 120×95 pixels, with a mean value and standard deviation of $\mu_t = 0.354$ and $\sigma_t = 0.237$, respectively. The background image has a mean value $\mu_b = 0.73$ and a standard deviation $\sigma_b = 0.21$. Fig. 4 (a)-(d) shows examples of the input test scene for different positions of the target and different amounts of noise.
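To make the test-scene construction concrete, the sketch below (assuming NumPy) synthesizes a scene in which the target replaces the background inside its support and zero-mean noise is added everywhere, which is one reading of a non-overlapping signal model. It also computes a location error, assumed here to be the Euclidean distance between the exact and estimated coordinates. The rectangular support, the function names, the small image sizes, and the use of Gaussian noise are assumptions of the example, not details stated in the text.

```python
import numpy as np

def make_scene(target, background, top, left, noise_var, rng):
    """Embed `target` at (top, left), blocking the background under it, then add noise.

    A rectangular region of support equal to the target array is assumed.
    """
    scene = background.copy()
    th, tw = target.shape
    scene[top:top + th, left:left + tw] = target     # target occludes the background
    noise = rng.normal(0.0, np.sqrt(noise_var), scene.shape)   # zero-mean additive noise
    return scene + noise

def location_error(true_xy, est_xy):
    """Euclidean distance between exact and estimated target coordinates (assumed LE)."""
    return float(np.hypot(true_xy[0] - est_xy[0], true_xy[1] - est_xy[1]))

# Illustrative data with small sizes.
rng = np.random.default_rng(6)
target = rng.random((6, 5))
background = rng.random((40, 40))
clean = make_scene(target, background, top=10, left=12, noise_var=0.0, rng=rng)
noisy = make_scene(target, background, top=10, left=12, noise_var=0.01, rng=rng)
le = location_error((10, 12), (13, 16))
```

With `noise_var=0.0` the scene reduces to the background with the target pasted in, which is a convenient sanity check before noise is added.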
First, we design an adaptive constrained filter (ACF) trained to recognize the five views of the target shown in Fig. 3, using the iterative algorithm shown in Fig. 1. In the design process we use a different background image, whose statistical properties are similar to those of the one used during the recognition experiments. Before the first iteration the DC value for the ACF is negative. However, after 31 iterations of the adaptation process the ACF reaches DC=0.95. This implies that a high level of control over the correlation plane for the input scene can be achieved. Fig. 5 shows the performance of the ACF in the design process in terms of the DC value versus the iteration index. To illustrate the performance of the proposed method, Fig. 4 shows four test scenes in panels (a)-(d), together with the output correlation intensity planes produced by the ACF on each scene. We can see one sharp correlation peak in each output intensity plane, indicating the presence of the target at the correct position. Moreover, the output-correlation intensity values in the background area are very low in all the tests. Next, we compare the recognition performance of all considered composite filters when different views of the target are embedded into the background at unknown coordinates and the variance of the additive noise $\sigma_n^2$ is changed. To obtain statistically reliable results, 120 statistical trials of each experiment were carried out for different views of the target and realizations of the random noise processes. The performance results in terms of DC and LE, with 95% confidence, are presented in Table 1. One can observe that the proposed ACF yields the best results in terms of DC and that no location errors occurred. This suggests that the proposed ACF is robust to additive noise and to disjoint background noise.
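As a hedged illustration of how such trial results might be summarized, the sketch below computes a discrimination capability and a 95% confidence interval over repeated trials. It assumes the definition of DC that is common in this literature, one minus the ratio of the squared background peak to the squared target peak, which is consistent with the negative initial DC reported above but should be checked against Eq. (36). The normal-approximation interval is also an assumption, since the chapter does not state how its confidence level was obtained, and the peak values used here are synthetic.

```python
import math
import random
import statistics

def discrimination_capability(target_peak, background_peak):
    """Assumed DC: 1 - |C_B|^2 / |C_T|^2; negative when the background peak dominates."""
    return 1.0 - (abs(background_peak) ** 2) / (abs(target_peak) ** 2)

def mean_with_ci95(samples):
    """Sample mean and half-width of a normal-approximation 95% confidence interval."""
    mean = statistics.fmean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width

# Synthetic peaks for 120 trials (illustrative values, not the chapter's data).
random.seed(0)
dc_values = [
    discrimination_capability(1.0, random.uniform(0.1, 0.3))
    for _ in range(120)
]
dc_mean, dc_half_width = mean_with_ci95(dc_values)
```

The result would be reported as `dc_mean` plus or minus `dc_half_width`; the same helper applies to the location errors collected over the trials.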
Now, we design an adaptive unconstrained filter (AUF) trained to recognize the five views of the target, including rotated versions from -10 to 10 degrees in increments of two degrees and scaled versions with scale factors of 0.8 and 1.2. In this case, the true-class training set {T} contains 70 training images. The AUF was synthesized using the iterative training algorithm shown in Fig. 2, reaching its maximum value of the objective function $J(h_{AUF})$ (see Eq. (37)) after 16 iterations. The normalized performance of the AUF in the design process, in terms of $J(h_{AUF})$ versus the iteration index, is shown in Fig. 6. To illustrate the performance of the AUF in recognizing geometrically distorted views of the target, Fig. 7 (a)-(d) exhibits several input test scenes containing a distorted version of the target over the background at unknown coordinates. The output intensity planes obtained with the AUF for each of the input scenes are presented in Fig. 7 (e)-(h). It can be seen that the distorted target is accurately located in each scene with the adaptive filter. Next, we test the performance of the AUF in recognizing geometrically distorted views of the target embedded within noisy scenes. To obtain statistically reliable results, 120 statistical trials of each experiment were carried out for different positions, rotations, and scale changes of the target (within the training intervals) and realizations of the random noise processes. In each trial, we randomly choose a geometrically distorted view of the target, given either by a rotated version of the target within the range of [-10, 10] degrees or by a scaled version with a scale factor within the range [0.8, 1.2]. The distorted target is embedded into the background at unknown coordinates and the scene is corrupted with additive noise. Then, the constructed scene is correlated with the composite filters and the DC and LE metrics are calculated.
The results are summarized in Table 2. It can be seen that the proposed AUF yields the best results in terms of DC and LE, whereas the MACE filter yields the worst results.
Finally, the simulation results suggest that both the ACF and the AUF possess very good discrimination capability, outperforming conventional MACE and MACH filters in all our tests. Moreover, one can observe that the ACF is more robust than the AUF with respect to additive noise, and also yields better location accuracy. In contrast, the AUF is more tolerant when recognizing geometrically distorted views that are embedded into a background.

Conclusions
In summary, the chapter presents an iterative approach to synthesize adaptive composite correlation filters for object recognition. The approach can be used to monotonically improve the quality of a simple composite filter in terms of quality metrics using all available information about the target object to be recognized, and false patterns to be rejected such as the background. Given a subset of true-class training images the proposed approach designs the impulse response of an optimized adaptive filter in terms of a particular performance criterion using an incremental search-based strategy. We designed an adaptive constrained filter with the suggested iterative algorithm optimizing the discrimination capability. According to the simulation results, the proposed adaptive constrained filter proved to be very robust in recognizing different views of a target within an input scene that is corrupted with additive noise. Moreover, the filter exhibits high levels of discrimination capability and location accuracy when compared with conventional MACE and MACH formulations. Furthermore, we synthesized an adaptive unconstrained composite filter optimized with respect to a proposed objective function based on the ACH, ACE, and ASM metrics. Here again, the experimental results suggest that the adaptive unconstrained filter provides a robust detection of geometrically distorted versions of the target when it is embedded within a highly cluttered background.
Finally, we can envision several lines of future research that can be derived from the algorithms and methods presented here. First, future experimental tests should consider real-world scenarios and applications to validate the usefulness of these filters in applied domains. Second, while the adaptive design process presented here has shown promising performance, it is evident that we cannot assume that an optimal strategy has been chosen. For instance, the proposed algorithm follows an iterative mechanism to build the final solution; i.e., it incrementally constructs the training set of images. However, from a search and optimization stand-point there is no reason to assume that this is in any way an optimal strategy for the filter design process. Therefore, it would be instructive to propose, design, and test other iterative search algorithms, such as population-based meta-heuristics, since the structure of the search space is not known a priori and is probably discontinuous and highly multi-modal. Finally, a comparative study of the developed composite filters with other object recognition approaches, particularly feature based methods, might provide a more comprehensive understanding regarding the domain of competence of each.