1. Introduction
During the past decades, image segmentation and edge detection have been two important and challenging topics. The main idea is to produce a partition of an image such that each category or region is homogeneous with respect to some measure. The processed image can then be used in subsequent image processing tasks.
Spatial autoregressive moving average (ARMA) processes have been used extensively in image and signal processing. In particular, these models have been used for image segmentation, edge detection and image filtering. Image restoration algorithms based on robust estimation of a two-dimensional process have been developed (Kashyap & Eom, 1988). The two-dimensional autoregressive model has also been used to perform unsupervised texture segmentation (Cariou & Chehdi, 2008). Generalizations of the previous algorithms that use generalized M estimators to deal with the effect of additive contamination were also addressed (Allende et al., 2001). Later on, robust autocovariance (RA) estimators for two-dimensional autoregressive (AR-2D) processes were introduced (Ojeda et al., 2002). Several theoretical contributions have been suggested in the literature, including the asymptotic properties of a nearly unstable sequence of stationary spatial autoregressive processes (Baran et al., 2004). Other contributions and applications of spatial ARMA processes have been considered in many publications (Basu & Reinsel, 1993; Bustos et al., 2009a; Francos & Friedlander, 1998; Guyon, 1982; Ho, 2011; Illig & Truong-Van, 2006; Martin, 1996; Vallejos & Mardesic, 2004).
A new approach to image segmentation based on the estimation of AR-2D processes has recently been suggested (Ojeda et al., 2010). First, an image is locally modeled using a spatial autoregressive model for the image intensity. Then the residual autoregressive image is computed. This resulting image possesses interesting texture features. The borders and edges are highlighted, suggesting that the algorithm can be used for border detection. Experimental results with real images clarify how the algorithm works in practice. A robust version of the algorithm was also proposed, to be used when the original image is contaminated with additive outliers. Applications in the context of image inpainting were also offered.
Another concern that has been pointed out in the context of spatial statistics is the development of coefficients to compare two spatial processes. Coefficients that take into account the spatial association between two processes have been proposed in the literature. Tjostheim (1978) suggested a nonparametric coefficient to assess the spatial association between two spatial variables. Later on, Clifford et al. (1989) proposed a hypothesis testing procedure to study the spatial dependence between two spatial sequences. Rukhin & Vallejos (2008) studied asymptotic properties of the codispersion coefficient first introduced by Matheron (1965). The performance and impact of this coefficient in quantifying the spatial association between two images is currently under study (Ojeda et al., 2012). An adaptation of this coefficient to time series analysis was studied in Vallejos (2008).
In the context of clustering time series, Chouakria & Nagabhushan (2007) proposed a distance measure that is a function of the codispersion coefficient. This measure includes the correlation behavior and the proximity of two time series. They proposed to combine these quantities in a multiplicative way, introducing a tuning constant that controls the weight of each quantity in the final product. This makes the measure flexible enough to model sequences with different behaviors, comparing them in terms of both correlation and dissimilarity between the values of the series.
The structure of this chapter consists of two parts. In the first part, we review some theoretical aspects of spatial ARMA processes. Then the algorithm suggested by Ojeda et al. (2010) is briefly described, together with its limitations and advantages. To obtain a more efficient algorithm, new variants are suggested, specifically to address the problem of determining the most convenient prediction window (in terms of the quality of the segmentation) of unilateral AR-2D processes. The distance between the filtered images and the original one is computed using the codispersion coefficient and other image quality measures (Wang & Bovik, 2002). Examples with real images highlight the features of the modified algorithm. In the second part, the codispersion coefficient previously used to measure the closeness between images is used in a distance measure to perform cluster analysis of time series. The distance measure introduced in Chouakria & Nagabhushan (2007) is generalized in the sense that it considers an arbitrary lag
2. Image Segmentation Through Estimation of Spatial ARMA Processes
2.1. The Spatial ARMA Processes
Spatial ARMA processes have been studied in the context of random fields indexed over
A random field
where
with
Applications of spatial ARMA processes have been developed, including the analysis of yield trials in the context of incomplete block designs (Cullis & Gleeson, 1991; Grondona et al., 1996) and the study of the spatial unilateral first-order ARMA model (Basu & Reinsel, 1993). Other theoretical extensions of time series and spatial ARMA models can be found in Baran et al. (2004), Bustos et al. (2009b), Gaetan & Guyon (2010), Choi (2000), Genton & Koul (2008), Guo & Billard (1998) and Vallejos & García-Donato (2006).
2.2. An Image Segmentation Algorithm
In this section, we describe an image segmentation algorithm that is based on a previous fitting of spatial autoregressive models to an image. This fitted image is constructed by dividing the original image into square sub-images (e.g.,
Let
where
where
Algorithm 1.
For each block
1. Compute estimators
where
2. Let
where
Then the approximated image
The image segmentation algorithm we describe below is supported by a widely known notion in regression analysis. If a fitted image represents the patterns of the original image very well, then the residual image (i.e., the fitted image minus the observed image) will not contain useful information about the original patterns, because the model already explains the features that are present in the original image. On the contrary, if the model does not represent the patterns of the original image well, then the residual image will contain useful information that has not been explained by the model. Thus, to implement an algorithm based on these notions, we must characterize which patterns are present in the residual image when the fitted image is not a good representation of the original one, and we must develop a technique to produce a fitting that is satisfactory in terms of segmentation but not a very good estimation, in the sense that the residual image still contains valuable information. Ojeda et al. (2010) investigated these concerns and, based on several numerical experiments with images, determined that the residual image associated with a good local fitting is in fact poor in terms of structure (i.e., it is very similar to white noise). However, when the fitted image is poor in terms of estimation, the residual image is useful for highlighting the boundaries and edges of the original image. Moreover, a bad fitting is related to the size of the block (or window) used in Algorithm 1. The best segmentation performance is attained for the maximum block size, which is the size of the original image. The image segmentation algorithm introduced by Ojeda et al. (2010) can be summarized as follows.
Algorithm 2.
1. Use Algorithm 1 to generate an approximated image
2. Compute the residual autoregressive image given by
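As an illustration of Algorithms 1 and 2, the following is a minimal sketch and not the authors' implementation. It assumes a prediction window with the two strongly causal neighbours X(i-1, j) and X(i, j-1), ordinary least squares on each mean-centred block, and that pixels in the first row and column of a block keep their observed values. The function names `fit_ar2d_block`, `approximate_image` and `residual_image` are hypothetical.

```python
def fit_ar2d_block(block):
    """Fit X(i,j) ~ a*X(i-1,j) + b*X(i,j-1) by least squares on a mean-centred block."""
    rows, cols = len(block), len(block[0])
    mean = sum(map(sum, block)) / (rows * cols)
    z = [[v - mean for v in row] for row in block]
    # Accumulate the 2x2 normal equations over pixels that have both neighbours.
    s11 = s12 = s22 = t1 = t2 = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            up, left, y = z[i - 1][j], z[i][j - 1], z[i][j]
            s11 += up * up
            s12 += up * left
            s22 += left * left
            t1 += up * y
            t2 += left * y
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        a = b = 0.0  # degenerate block (assumption): fall back to the block mean
    else:
        a = (t1 * s22 - t2 * s12) / det
        b = (t2 * s11 - t1 * s12) / det
    fitted = [row[:] for row in block]  # border pixels keep their observed values
    for i in range(1, rows):
        for j in range(1, cols):
            fitted[i][j] = mean + a * z[i - 1][j] + b * z[i][j - 1]
    return fitted


def approximate_image(image, block_size):
    """Algorithm 1 (sketch): fit each block separately and paste the fitted blocks together."""
    rows, cols = len(image), len(image[0])
    fitted = [row[:] for row in image]
    for r0 in range(0, rows, block_size):
        for c0 in range(0, cols, block_size):
            block = [row[c0:c0 + block_size] for row in image[r0:r0 + block_size]]
            if len(block) < 2 or len(block[0]) < 2:
                continue  # too small to fit; keep the observed values
            for di, frow in enumerate(fit_ar2d_block(block)):
                fitted[r0 + di][c0:c0 + len(frow)] = frow
    return fitted


def residual_image(image, block_size):
    """Algorithm 2 (sketch): fitted image minus observed image."""
    fitted = approximate_image(image, block_size)
    return [[f - x for f, x in zip(frow, xrow)] for frow, xrow in zip(fitted, image)]


# Toy 6x6 image; block_size=6 treats the whole image as a single block.
toy = [[(3 * i + j * j) % 7 for j in range(6)] for i in range(6)]
toy_residual = residual_image(toy, block_size=6)
```

Setting `block_size` to the image size corresponds to the maximum block size discussed above; smaller values give a more local fit.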
Example 1. We present examples with real images to illustrate the performance of Algorithms 1 and 2. These images were taken from the database http://sipi.usc.edu/database. Figure 1(a) shows an original image of size
2.3. Improving the Segmentation Algorithm
In all experiments carried out in Ojeda et al. (2010) and Quintana et al. (2011), Algorithm 1 was implemented using the same prediction window for the AR-2D process, which contains only two elements belonging to a strongly causal region on the plane. Here, we consider other prediction windows to observe the effect on the performance of Algorithm 2. A description of the most commonly used prediction windows in statistical image processing is given in Bustos et al. (2009a). A brief description of the strongly causal prediction windows is given below.
For all
For a given
In particular, if
The set
Visually, the best segmentation for the aerial image is yielded by the prediction window
To gain insight into image quality measures, the fitted images produced by Algorithm 1 associated with the images shown in Figure 4(a)-(d) were compared with the original image using three coefficients described in Ojeda et al. (2012). These coefficients are briefly described below.
Consider two weakly stationary processes,
where
For
with
The index
where
where
The correlation coefficient and the coefficients defined in (6), (7) and (8) were computed to compare the fitted images, which were generated with a prediction window with two elements and associated with the images shown in Figure 4(a) -(f), and the original images. The results are shown in Table 1. In all cases, the highest values of the image quality measures are attained for the image fitted using the prediction window
experiment was carried out for the image shown in Figure 2(a). Table 2 summarizes the values of the image quality coefficients for the fitted images generated by Algorithm 2 with prediction windows
generated with prediction window
Algorithm 3.
1. Use Algorithm 1 to generate the approximated images
2. Compute an image quality index between
3. Compute the residual autoregressive image
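The steps of Algorithm 3 can be sketched as follows. This is a sketch under stated assumptions, not the authors' code: the quality index is taken to be the global (single-window) form of the universal image quality index of Wang and Bovik (2002), and the prediction window with the highest index is assumed to be the one selected. The selection rule is our reading of the discussion of Tables 1 and 2 and is exposed as a parameter. The window labels `"W1"` and `"W2"` and the function names are hypothetical.

```python
def quality_index(x, y):
    """Global universal image quality index Q (Wang and Bovik, 2002) of two images."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((v - mx) ** 2 for v in xs) / (n - 1)
    vy = sum((v - my) ** 2 for v in ys) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (n - 1)
    denom = (vx + vy) * (mx * mx + my * my)
    if denom == 0:
        raise ValueError("Q is undefined for these images")
    return 4 * cxy * mx * my / denom


def select_window(original, fitted_by_window, index=quality_index, choose=max):
    """Score each fitted image, pick one window, and return its residual image."""
    scores = {w: index(original, f) for w, f in fitted_by_window.items()}
    best = choose(scores, key=scores.get)  # assumption: highest index wins
    fitted = fitted_by_window[best]
    residual = [[f - x for f, x in zip(frow, xrow)]
                for frow, xrow in zip(fitted, original)]
    return best, scores, residual


# Toy example: the fitted images would normally come from Algorithm 1.
original = [[1, 2], [3, 5]]
candidates = {"W1": [[1, 2], [3, 5]], "W2": [[5, 3], [2, 1]]}
best, scores, residual = select_window(original, candidates)
```

Passing `choose=min`, or a different `index` such as the codispersion coefficient, changes the selection rule without touching the rest of the procedure.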
3. Clustering Time series
3.1. Measuring Closeness and Association Between Time Series
Let
where
where
Dynamic time warping (DTW) is a variant of the Fréchet distance that considers mapping length as the sum of the spans of all coupled observations. That is,
Dynamic time warping is then defined as
The distances defined above are based on the proximity of the values
Several distance measures that are functions of the correlation between two sequences (
where
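The Fréchet and dynamic time warping distances discussed in this section can be computed by dynamic programming. The sketch below assumes the span of a coupled pair of observations is the absolute difference of their values; the function names are hypothetical. The Fréchet distance minimizes the largest span over all order-preserving mappings, whereas dynamic time warping minimizes the sum of the spans.

```python
def _warp(u, v, combine):
    """Dynamic programme shared by both distances.

    combine(span, best_previous) gives the cost of extending a mapping by one couple.
    """
    n, m = len(u), len(v)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            span = abs(u[i - 1] - v[j - 1])
            previous = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = combine(span, previous)
    return cost[n][m]


def frechet_distance(u, v):
    """Smallest achievable maximum span over all order-preserving mappings."""
    return _warp(u, v, max)


def dtw_distance(u, v):
    """Smallest achievable sum of spans over all order-preserving mappings."""
    return _warp(u, v, lambda span, previous: span + previous)


# The second series repeats one value; warping absorbs the repetition.
d = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

Both functions accept series of different lengths, which the Euclidean distance does not.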
3.2. The Codispersion Coefficient for Time Series
Consider two weakly stationary processes,
where
3.3. Dissimilarity Index for Time Series
This coefficient involves a distance measure and a correlation-type measure that addresses both the correlation behavior and the proximity of two time series. The dissimilarity index depends on similarity behaviors, which should be specified in advance. The suggested dissimilarity index
where
where
Note that (13) is a generalization of the dissimilarity index introduced in Chouakria & Nagabhushan (2007). The dissimilarity index (13) can capture high-order serial correlations between the sequences because the distance lag
The dependence of (13) on
When the variance of the codispersion coefficient is difficult to compute, resampling methods can be used to estimate the variance of the sample codispersion coefficient (Politis & Romano, 1994; Vallejos, 2008).
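One possible resampling scheme is sketched below under stated assumptions; it is not necessarily the procedure used in the cited works. It applies the stationary bootstrap of Politis & Romano (1994), with geometrically distributed block lengths of mean 1/p and circular wrapping, to the paired lag-h increments of the two series, and recomputes the sample codispersion coefficient on each resample. Resampling increments rather than raw values is a design choice made here to avoid artificial jumps where blocks are joined. The function names and the default values of `p`, `n_boot` and `seed` are hypothetical.

```python
import math
import random


def codispersion_from_increments(dx, dy):
    """Sample codispersion coefficient computed from paired increments."""
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


def stationary_bootstrap_indices(n, p, rng):
    """Indices of one stationary bootstrap resample of length n."""
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p:
            idx.append(rng.randrange(n))    # start a new block at a random position
        else:
            idx.append((idx[-1] + 1) % n)   # continue the current block, wrapping around
    return idx


def bootstrap_variance(x, y, h=1, p=0.1, n_boot=500, seed=0):
    """Bootstrap estimate of the variance of the lag-h sample codispersion coefficient."""
    dx = [x[t + h] - x[t] for t in range(len(x) - h)]
    dy = [y[t + h] - y[t] for t in range(len(y) - h)]
    rng = random.Random(seed)
    n = len(dx)
    reps = []
    for _ in range(n_boot):
        idx = stationary_bootstrap_indices(n, p, rng)
        reps.append(codispersion_from_increments([dx[i] for i in idx],
                                                 [dy[i] for i in idx]))
    mean = sum(reps) / n_boot
    return sum((r - mean) ** 2 for r in reps) / (n_boot - 1)


# Toy series with non-constant increments.
x = [math.sin(0.3 * t) + 0.05 * t for t in range(60)]
y = [math.cos(0.3 * t) + 0.05 * t for t in range(60)]
var_hat = bootstrap_variance(x, y, h=1)
```

Smaller values of `p` give longer blocks on average, which preserves more of the serial dependence in each resample.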
In the next section, we present two simulation examples to illustrate the capabilities of the hierarchical methods using the distance measure (13) under the tuning function given by (14). All else being constant, the clusters produced using traditional distances are usually different from those yielded using the distance measure (13).
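To make the construction concrete, here is a minimal sketch of a dissimilarity of this multiplicative form. Two ingredients are assumptions, since the corresponding displayed formulas are not reproduced in the text above: the tuning function is taken to be f(u) = 2 / (1 + exp(k u)) with k >= 0, as in Chouakria & Nagabhushan (2007), and the proximity measure is taken to be the Euclidean distance. The function names are hypothetical.

```python
import math


def codispersion(x, y, h=1):
    """Sample codispersion coefficient of two equal-length series at lag h."""
    dx = [x[t + h] - x[t] for t in range(len(x) - h)]
    dy = [y[t + h] - y[t] for t in range(len(y) - h)]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


def dissimilarity(x, y, h=1, k=2.0):
    """Tuning function of the lag-h codispersion times the Euclidean distance."""
    rho = codispersion(x, y, h)
    weight = 2.0 / (1.0 + math.exp(k * rho))   # assumed tuning function, k >= 0
    delta = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return weight * delta


x = [1, 2, 3, 4]
same_trend = [2, 3, 4, 5]      # increments identical to those of x
opposite_trend = [5, 4, 3, 2]  # increments opposite to those of x
d_same = dissimilarity(x, same_trend)
d_opposite = dissimilarity(x, opposite_trend)
```

With k = 0 the weight equals one and the index reduces to the proximity measure alone; larger k gives more weight to the behaviour of the increments, so series that move together are pulled closer and series that move in opposite directions are pushed apart.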
3.4. Simulations
In this example, we simulate observations from six first-order autoregressive models to illustrate the clustering produced by hierarchical methods when the sequences exhibit serial correlation. To generate the series, we consider the following models.
where
with
Two hundred observations were generated from each model for
In Figure 5, we see that the dendrogram obtained using hierarchical methods with the Euclidean distance does not recognize the correlation structure between
To obtain better insight into the classification process using the proposed distance measure (13), we carried out a second simulation study that involves clustering measures based on other distances (but using the same setup). Observations from models 1-6 were generated using Gaussian white noise sequences for the errors, thereby preserving the same correlation structure used in the first study. The goal was to explore the ability of the distance measure (13) to group strongly correlated series first. A total of 1000 runs were considered for this experiment, and 200 observations were generated in each run. We used measure (13) under the tuning function (14) for
Note from Table 3 that the traditional distance measures failed to group the correlated sequences, with the exception of the Minkowski distance, which correctly grouped the correlated series 99% of the time. The hierarchical algorithm that uses the distance measure (13) has a higher percentage of well-clustered correlated sequences than the same algorithm using the traditional distance measures described in Section 3.1 (see Table 4). The percentage of correct clusters increased in all cases with the distance measure (13), suggesting that hierarchical algorithms can be improved by including coefficients of association that consider high-order cross-correlation.
3.5. The NDVI Data Set
In this section, we consider time series from four different locations in Argentina. The data set consists of 15 monthly NDVI series measured during a period of 19 years (i.e., January 1982-December 2000). The observed values correspond to a transformation to the interval
We can observe a variety of different patterns in Figure 6. In particular, the data collected during the period 1994-1995 show irregular behavior. Additionally, the original data lack some information (less than one percent) for all series over the period 1999-2000. An imputation technique based on moving averages, which takes into account past and future values of the series, was used to replace missing values. The series were grouped by geographical region and then plotted (Figure 7). Similar patterns are observed for the series across each group.
An exploratory data analysis was carried out for each of the 15 series. There is significant autocorrelation of order at least one in all series. Seasonal components are present in most of the partial autocorrelations. Because there is no large departure from the weak stationarity assumptions (i.e., constant means and variances), all series can be modeled using the Box-Jenkins approach. Specifically, seasonal ARIMA models can be fitted to each single series with a small number of parameters (i.e.,
3.6. Clustering
Using the NDVI data set described in Section 3.5, the distance measure
4. Concluding Remarks and Future Work
This chapter described two problems. The first problem involved image segmentation, while the second problem involved clustering time series. For the first problem, a new algorithm was proposed that enhances the segmentation yielded by a previous algorithm (Ojeda et al., 2010). Identifying the best prediction window improves segmentation based on the estimation of AR-2D processes and generalizes the previous algorithm to different prediction windows associated with unilateral processes on the plane. An analysis of the association between the original and fitted images relies on the selection of a suitable image quality measure. Using three image quality coefficients that are commonly used in image segmentation, we carried out experiments that support our algorithm. Specifically, a set of images from the image database (http://sipi.usc.edu/database/) was processed and provided satisfactory results (not shown here) in terms of image segmentation.
This chapter also proposed an extension of the dissimilarity measure first introduced in Chouakria & Nagabhushan (2007). The simulation experiments performed and the data analysis carried out for relevant ecological series show that the distance lag
We now outline directions for further research on the topics presented in this chapter.
Following the notation used in Algorithm 3, consider the following residual image.
One interesting open problem involves the characterization of the types of images and distributions associated with the segmentation produced by Algorithms 2 and 3. In addition, the definition and study of linear combinations of residual images produced by distinct prediction windows is also of interest. For example,
where
Regarding the clustering technique problem, the distribution of
Acknowledgments
The first author was partially supported by Fondecyt grant 1120048, UTFSM under grant 12.12.05, and Proyecto Basal CMM, Universidad de Chile. The second author was supported in part by CIEM-FAMAF, UNC, Argentina.
References
1. Allende H., Galbiati J., Vallejos R. (2001). Robust image modeling on image processing. 22(11), 1219-1231.
2. Baran S., Pap G., Zuijlen M. C. A. (2004). Asymptotic inference for a nearly unstable sequence of stationary spatial AR models. 69(1), 53-61.
3. Basu S., Reinsel G. (1993). Properties of the spatial unilateral first-order ARMA model. 25(3), 631-648.
4. Bustos O., Ojeda S., Vallejos R. (2009a). Spatial ARMA models and its applications to image filtering. 23(2), 141-165.
5. Bustos O., Ojeda S., Ruiz M., Vallejos R., Frery A. (2009b). Asymptotic Behavior of RA-estimates in Autoregressive 2D Gaussian Processes. 139(10), 3649-3664.
6. Cariou C., Chehdi K. (2008). Unsupervised texture segmentation/classification using 2-D autoregressive modeling and the stochastic expectation-maximization algorithm. 29(7), 905-917.
7. Choi B. (2000). On the asymptotic distribution of mean, autocovariance, autocorrelation, crosscovariance and impulse response estimators of a stationary multidimensional random field. 29(8), 1703-1724.
8. Chouakria A. D., Nagabhushan P. N. (2007). Adaptive dissimilarity for measuring time series proximity. 1(1), 5-21.
9. Clifford P., Richardson S., Hémon D. (1989). Assessing the significance of the correlation between two spatial processes. 45(1), 123-134.
10. Cullis B. R., Gleeson A. C. (1991). Spatial analysis of field experiments-an extension to two dimensions. 47(4), 1449-1460.
11. Francos J., Friedlander B. (1998). Parameter estimation of two-dimensional moving average random fields. 46(8), 2157-2165.
12. Gaetan C., Guyon X. (2010). Springer, New York.
13. Genton M. G., Koul H. L. (2008). Minimum distance inference in unilateral autoregressive lattice processes. 18, 617-631.
14. Golay X., Kollias S., Stoll G., Meier D., Valavanis A. (1998). A new correlation-based fuzzy logic clustering algorithm of FMRI. 40(2), 249-260.
15. Grondona M. R., Crossa J., Fox P. N., Pfeiffer W. H. (1996). Analysis of variety yield trials using two-dimensional separable ARIMA processes. 52(2), 763-770.
16. Guo J., Billard L. (1998). Some inference results for causal autoregressive processes on a plane. 19(6), 681-691.
17. Guyon X. (1982). Parameter estimation for a stationary process on a d-dimensional lattice. 69(1), 95-105.
18. Ho P. G. P. (2011). Image segmentation by autoregressive time series model. Edited by Pei-Gee Ho. InTech.
19. Illig A., Truong Van B. (2006). Asymptotic results for spatial ARMA Models. 35(4), 671-688.
20. Jain A. K., Murty M. N., Flynn P. J. (1999). Data Clustering: A Review. 31(3), 264-323.
21. Kashyap R., Eom K. (1988). Robust images techniques with an image restoration application. 36(8), 1313-1325.
22. MacQueen J. B. (1967). Some Methods for classification and Analysis of Multivariate Observations. Berkeley: University of California Press, 1, 281-297.
23. Martin R. J. (1996). Some results on unilateral ARMA lattice processes. 50(3), 395-411.
24. Matheron G. (1965). Masson, Paris.
25. Politis D. N., Romano J. P. (1994). The stationary bootstrap. 89(428), 1303-1313.
26. Ojeda S. M., Vallejos R. O., Lucini M. (2002). Performance of RA Estimator for Bidimensional Autoregressive Models. 72(1), 47-62.
27. Ojeda S. M., Vallejos R., Bustos O. (2010). A New Image Segmentation Algorithm with Applications to Image Inpainting. 54(9), 2082-2093.
28. Ojeda S. M., Vallejos R., Lamberti W. P. (2012). A Measure of Similarity Between Images, in press.
29. Quintana C., Ojeda S., Tirao G., Valente M. (2011). Mammography image detection processing for automatic micro-calcification recognition. 2(2), 69-79.
30. Rukhin A., Vallejos R. (2008). Codispersion coefficient for spatial and temporal series. 78(11), 1290-1300.
31. Vallejos R., Mardesic T. (2004). A recursive algorithm to restore images based on robust estimation of NSHP autoregressive models. 13(3), 674-682.
32. Vallejos R., Garcia-Donato G. (2006). Bayesian analysis of contaminated quarter plane moving average models. 76(2), 131-147.
33. Vallejos R. (2008). Assessing the association between two spatial or temporal sequences. 35(12), 1323-1343.
34. Tjostheim D. (1978). A measure of association for spatial variables. 65(1), 109-114.
35. Wang Z., Bovik A. (2002). A universal image quality index. 9(3), 81-84.