Estimation of the Separable MGMRF Parameters for Thematic Classification

Because of its ability to describe interdependence between neighboring sites, the Markov Random Field (MRF) is a very attractive model in characterizing correlated observations (Moura and Balram, 1993) and it has potential applications in areas of remote sensing, such as spatio-temporal modeling and machine vision. In this study, we model image random field conditional to the texture label as a Multivariate Gauss Markov Random Field (MGMRF); whereas; the thematic map is modeled as a discrete label MRF (Li, 1995). The observations in the Gauss Markov Random Field (GMRF) are distributed with the Gaussian distribution.


Introduction
Because of its ability to describe interdependence between neighboring sites, the Markov Random Field (MRF) is a very attractive model in characterizing correlated observations (Moura and Balram, 1993) and it has potential applications in areas of remote sensing, such as spatio-temporal modeling and machine vision. In this study, we model image random field conditional to the texture label as a Multivariate Gauss Markov Random Field (MGMRF); whereas; the thematic map is modeled as a discrete label MRF (Li, 1995). The observations in the Gauss Markov Random Field (GMRF) are distributed with the Gaussian distribution.
There are some MGMRF models where the interaction matrices are modeled in some simplified form, including the MGMRF with isotropic interaction matrix which we shall refer here as Hazel's GMRF (Hazel, 2000), The MGFMRF with anisotropic interaction matrix proportional to the identity matrix which we shall refer here as Rellier's GMRF (Rellier et. al., 2004), and the Gaussian Symmetric Clustering (GSC) (Hazel, 2000).
From these developments, the model for anisotropic GMRF was generalized and its parameter estimator for an arbitrary neighborhood system is characterized (Navarro et al., 2009). Using our model, the classification performance was analyzed and compared with the GMRF models in literature.
Spectral classes are explored in segmenting image random field models to be able to extract the spatial, spectral, and temporal information. A special case is addressed when the observation includes spectral and temporal information known as the spectro-temporal observation. With respect to the spectral and temporal dimensions, the separability structure is considered based on the Kronecker tensor product of the GMRF model parameters. Separable parameters contain less parameters, compared with its non-separable counterpart. In addition, the spectral and temporal dimensions on a separable model can be analyzed separately. We analyzed whether the separability of the GMRF parameters would improve the classification of the thematic map.

Image random field modelling and thematic classification
This section covers statistical background in characterizing random fields based on the MRF. Then, we will present estimation for the thematic map and image random field parameters.

100
Finally the thematic map classifier is presented based on the Iterated Conditional Modes (ICM) algorithm.

Markov random fields
A random field   :  ZZ s s  where s is a site on the lattice  with the neighborhood system  with parameter Π is a MRF if for  s  (Winkler, 2003 where a m is the singleton potential coefficient for the th m thematic class, b r are made up by the pairwise potential coefficients, and  is region of support (Jeng & Woods, 1991) or the neighborhood set (Kasyap & Chellappa, 1983). Its conditional probability density function (pdf) is given by The noise process has the following characterization:

Maximum pseudo-likelihood estimation
The maximum pseudo-likelihood estimation (MPLE) combines sites to form the pseudolikelihood function from the conditional probabilities (Li, 1995). The pseudo-likelihood functions for the thematic map random field and image random field parameters are given as follows:  (Casella and Berger, 2002) since the form of the pseudolikelihood function is analogous that of the likelihood function, depending on the parameter given the data. Moreover, the MPLE converges to the MLE almost surely as the lattice size approaches infinity (Geman and Greffigne, 1987).

Thematic classification
The thematic map can be recovered by the maximum a posteriori probability (MAP) rule. It can be implemented using a numerical optimization technique such as Simulated Annealing (SA) (Jeng & Woods, 1991). Although the global convergence employing SA is guaranteed almost surely, its convergence is very slow (Aarts & Korts, 1987;Winkler, 2006). An alternative to this is to use the ICM algorithm (Besag, 1986) given as 102 This is interpreted as the instantaneous freezing of the annealing schedule of the SA.
However, since   ; p YLΘ is difficult to evaluate, alternatively, it is replaced by its pseudolikelihood (Hazel, 2000) The ICM algorithm, unlike the SA, is only guaranteed to converge to the local maxima. This problem can be alleviated by initializing the thematic map from the Gaussian Spectral Clustering (GSC) model (Hazel, 2000).

Numerical implementation
The MPLE-based estimators are not in their closed form and must be evaluated numerically. The pseudocode for estimating the parameters is presented below. The image random field parameters are estimated using a method with some resemblance to the Gauss-Seidel iteration method (Kreyzig, 1993). The convergence criterion for estimating these parameters using this iteration method has yet to be established. As a precautionary measure, a single iteration was performed. This method was also applied in estimating the image random field estimators in Rellier's GMRF (Rellier, et. al., 2004).

Spectro-temporal MGMRF modelling
The spectro-temporal observation image random field will be characterized with hybrid separable MGMRF parameters.

Image random field modeling
We let M 1 -number of lines, M 2 -number of samples, N 1 -number of spectral bands, and N 2 -number of temporal slots. The image random field is characterized as follows: The matrix # s Y is characterized by allocating the reflectance across the bands for a given time by column and the reflectance across time for a given band by row.

Separable structure of the covariance matrix
There is a growing interest in modeling the covariance structure with more than one attribute. For example, in spatio-temporal modeling, the covariance structure of "spatial" and "temporal" attributes is jointly considered (Kyriakidis and Journel, 1999;Huizenga, et. al., 2002). On the other hand, in the area of longitudinal studies the covariance structure between "factors" and "temporal" attributes are jointly considered (Naik and Rao, 2001). Both studies mentioned above considered covariance matrices with a separable structure between these attributes.
In the realm of remote sensing, few studies have been conducted combining the covariance structure involving spectro-temporal attributes. Campbell and Kiiveri demonstrated canonical variates calculations are reduced to simultaneous between-groups and within-group analyses of a linear combination of spectral bands over time, and the analyses of a linear combination of the time over the spectral bands (Campbell and Kiiveri, 1988).
In light of recent literature, we propose to model the GMRF models as applied to remote sensing image processing where the covariance structure of the "spectral" and "temporal" attributes is characterized jointly. The separable covariance structure associated with the matrix Gaussian distribution has been considered.

Non-separable covariance structure
The matrix observation driven by a colored noise and its vectorized distribution, is assumed to be a realization from the process whose conditional form is given by does not have any special structure, except it has to be a positive definite symmetric matrix. This covariance matrix structure referred to as an unpatterned covariance matrix (Dutilleul, 1999). The statistical characterization is similar to the MGMRF discussed in Section 2.3.

Matrix gaussian distribution
Let # X be a random matrix distributed as  is the covariance matrix across the rows, and  is the covariance matrix across the columns. Hence, the pdf of # X is given as , 1981), and its pdf is given as

Separable covariance structure
The spectro-temporal, separable covariance matrix model (Lu and Zimmerman, 2005;Fuentes, 2006)   , which has fewer parameters compared to its non-separable counterpart.

Separable of interaction matrix structure
We can also model the interaction matrix coefficients with a separable structure for all   ss r XX L Θ has a separable structure between the spectral domain and temporal dimensions. It has a form analogous to that of what is shown in (4) through (6), which is intuitively appealing.
The number of parameters in the unpatterned interaction matrix coefficient is 22 2 12 .

NN N 
On the other hand, the number of parameters for the separable interaction matrix coefficient is 22

12
NN  , which has fewer parameters compared to its non-separable counterpart.

Separable mean structure
Likewise, we can also model the mean with a separable structure of the form NN  which has fewer number of parameters compared to its non-separable counterpart.

Hybrid separable structure
Finally, we can model the GMRF parameters as having a hybrid separability structure, that is, some of its parameters are separable while the rest are not. Hence, there are eight combinations to consider. As shown in Section 5.2, it is impossible to model a separable interaction matrix with a non-separabable matrix. This leave us six cases to consider in this study.

Estimation of thematic map parameters
The MPLE of φ is obtained by taking the derivative of   log PL φ with respect to   1 m mM a  and   b  r r  , then equating to zero (Li, 1995). Accordingly, the estimators are obtained numerically by solving the following set of simultaneous nonlinear equations:

Important MGMRF specifications
This section provides important characterizations enable us to derive the estimators of the GMRF parameters in the next section. We present a simple, yet powerful, method to derive the MPL estimators of the mean and the interaction matrix. Finally, new problems arise in estimating the multivariate observation GMRFs, which were not encountered in the univariate case, are discussed.

MPL-based method technique of deriving mean and the interaction matrix estimators
In this section, a method of deriving the MPL estimators for the mean and the vectorized interaction coefficients are presented regardless of separability. The MPL estimator of the interaction matrix coefficients can be derived by taking the matrix derivative of the log of the pseudo-likelihood function with respect to the interaction matrix coefficient or with respect to its vectorized version from the equivalence relation (Neudecker, 1969)   ff vec vec The latter expression is preferred, since it is easier to evaluate. The following proposition provides a simple way of deriving the MPL estimators, where the estimator is either the mean or the vectorized interaction matrix coefficient (Navarro, et. al., 2009 Proof From (7) and (9), the log pseudo-likelihood of the image random field conditional to the thematic map is given as Taking the gradient of the log pseudo-likelihood function in (33) Finally, substituting the result of (36) into (34)

GMRF parameter estimation
This section proposes an estimation procedure for the GMRF parameters for both separable and non-separable cases based on the MPL.

Proposition 2
Assume that the interaction matrix coefficients   Proof a. The proof for the non-separable case is derived by applying Proposition 1 (Navarro, et. al., 2009 by applying Preposition 1 and rearranging terms, we obtain (47).

Proposition 3
Assume that the mean vectors   Proof a. The proof for the non-separable case is derived by applying Proposition 1 (Navarro, et. al., 2009 The above expression can also be written using the following matrix identities (Magnus and Neudecker, 1999)  . In addition, from the identity (Magnus and Neudecker, 1999) By aggregating the equation in (79) The above estimators are not in their closed form. The estimators can be solved iteratively using the flip-flop algorithm (Dutilleul, 1999).

Data preparation
The multispectral and multitemporal satellite image under consideration is the 'Butuan' image acquired from the LANDSAT TM. The image shows the scenery of Butuan City and its surroundings in Northeastern Mindanao, Philippines. It consists of six spectral bands and four temporal slots with a dynamic range of 8 bits. The images were captured chronologically on the following dates: August 1, 1992, August 7, 2000, May 22, 2001, and December 3, 2002. The images were radiometrically corrected, geometrically co-registered with each other, and have been resized to 600 x 800 pixels. The image in Fig. 1  The thematic classes were established by employing the k-means algorithm (Richards and Jia, 2006). The thematic classes were identified and their mean reflectance vector form the training data are shown in Table 1 Training and verification sites were obtained from a random sample of 1200 sites. The firstorder neighborhood system in the MRF modeling of the thematic map and the image were used.

Non-separable case
The classification performance of our model with non-separable MGMRF parameters, as compared to the GSC, Hazel's, and Rellier's models are presented in Table 2 (Hazel, 2000) which in general, does not hold the multivariate case. This relation, however, holds in the univariate case (Kashyap and Chellappa, 1983) as well as the Rellier's GMRF.
On the other hand, anisotropic models, such as Rellier's GMRF, and our model exhibited a substantially better classification performance as compared to the GSC. Since the covariance matrix estimators used a sub-optimal alternative, some slight performance degradation has resulted.

Hybrid separable case
Denote S μ , S θ , and S Σ to be the separable indicators for the mean, interaction matrix, and covariance matrix, respectively.

Hybrid separable GSC model
Since the GSC model is a degenerate form of our MGMRF with zero interaction matrices, the separability structure of the mean and covariance matrices are examined. The results are presented in Table 3 showed that no improvement in the classification performance, regardless of separability of the parameters.  Table 3. Classification Accuracy of Hybrid Separable GSC models.

Hybrid separable anisotropic GMRF model
The hybrid separable anisotropic MGMRF shows the separability of the covariance matrix has a slight improvement in performance over a non-separable spectro-temporal observation. As discussed in Section 5.2, the hybrid separable model with separable interaction matrix, together with a non-separable matrix, were excluded in the model performance as these modes are not possible. The classification accuracy is presented in

Summary, conclusions, and recommendations
This study presents a parameter estimation procedure based on the MPL for an anisotropic MGMRF with hybrid-separable parameters. Although the MGMRF is a natural extension of its univariate counterpart, the interaction matrix relationship is, in general, dependent on the covariance matrix. In an effort to make the estimation and classification procedure more tractable to compute, some sub-optimal approximations were incorporated. This resulted in a slight degradation in the classification performance. The classification performance based on our model performed well when compared to the GSC model and Hazel's MGMRF. Nonetheless, its performance is comparable to the Rellier's MGMRF. Moreover, for spectrotemporal observations, the separability of the interaction matrix as well as the covariance matrix improved the classification performance. Computational capabilities are foreseen to further advance in the near future following the improvement of numerical estimation and classification procedures.
This study presents a parameter estimation procedure based on the MPL for anisotropic MGMRF with hybrid-separable parameters. Although the MGMRF is a natural extension of its univariate counterpart, the interaction matrix relationship is, in general, dependent on the covariance matrix. In an effort to make the estimation and classification procedure more tractable to compute, some sub-optimal approximations were incorporated in the process. This resulted in a slight degradation in the classification performance. The classification performance, based on our model, has performed well, as compared to the GSC model and Hazel's MGMRF. Furthermore, its performance is comparable to Rellier's MGMRF. In terms of spectro-temporal observations, the separability of the covariance matrix has improved the classification performance. This study can be improved even more with numerical estimation and classification procedure as computational capabilities. This is foreseen to further advance in the near future.