Meteorological readings, hydrological parameters and many measures of air, soil and water pollution are often collected for a certain span, regularly in time, and at different survey stations of a monitoring network. Then, these observations can be viewed as realizations of a random function with a spatio-temporal variability. In this context, the arrangement of valid models for spatio-temporal prediction and environmental risk assessment is strongly required. Spatio-temporal models might be used for different goals: optimization of sampling design network, prediction at unsampled spatial locations or unsampled time points and computation of maps of predicted values, assessing the uncertainty of predicted values starting from the experimental measurements, trend detection in space and time, particularly important to cope with risks coming from concentrations of hazardous pollutants. Hence, more and more attention is given to spatio-temporal analysis in order to sort out these issues.
Spatio-temporal geostatistical techniques provide useful tools to analyze, interpret and control the complex evolution of various variables observed by environmental monitoring networks. However, in the literature there are no specialized monographs which contain a thorough presentation of multivariate methodologies available in Geostatistics, especially in a spatio-temporal context. Several authors have developed different multivariate models for analyzing the spatial and spatio-temporal behavior of environmental variables, as it is clarified in the following brief review.
In multivariate spatial analysis, direct and cross correlations for the variables under study are quantified by estimating and modeling the matrix variogram. The difficulty in modeling this matrix function, especially the off diagonal entries of the same matrix, has been first faced by using the linear coregionalization model (LCM), proposed by [Matheron, 1982].
For matrix covariance functions, [Gneiting et al., 2010] constructed a parametric family of symmetric covariance models for stationary and isotropic multivariate Gaussian spatial random fields, where both the diagonal and off diagonal entries are of the Matérn type. In the bivariate case, they provided necessary and sufficient conditions for the positive definiteness of the second-order structure, whereas for the other multivariate cases they suggested a parsimonious model which imposes restrictions on the scale and cross-covariance smoothness parameters. In the bivariate case, where the smoothness parameter is the same for both covariance functions, the Gneiting model is a simplified LCM. The Gneiting cross-covariance model also assumes that the scale parameter is the same for all the covariance functions and the cross-covariance functions. Both the LCM and Gneiting constructions for cross-covariances result in symmetric models; however, no distributional assumptions are required for using a LCM, which can easily incorporate components with compact support and multiple ranges and an unbounded variogram component.
Although models for multivariate spatial data have been extensively explored [Wackernagel, 2003, Gelfand et al., 2004, Ver hoef Barry, 1998], models for multivariate spatio-temporal data have received relatively less attention. In the literature, it is common to use classical techniques for multivariate spatial and temporal analysis [Wackernagel, 2003, De Iaco et al., 2001b]. Recently, canonical correlation analysis was combined with space-time geostatistical tools for detecting possible interactions between two groups of variables, associated with pollutants and atmospheric conditions [De Iaco, 2011]. In the dynamic modeling framework, there are some results in studying the spatio-temporal variability of several correlated variables: [Gelfand et al., 2005], for example, extended univariate spatio-temporal dynamic models to multivariate dynamic spatial models. Moreover, [Li et al., 2008] proposed a methodology to evaluate the appropriateness of several common assumptions, such as symmetry, separability and linear model of coregionalization, on multivariate covariance functions in the spatio-temporal context, while [Choi et al., 2009] proposed a spatio-temporal LCM where the multivariate spatio-temporal process was expressed as a linear combination of independent Gaussian processes in space-time with mean zero and a separable spatio-temporal covariance. [Apanasovich Genton, 2010] considered some solutions to the symmetry problem; moreover, they proposed a class of cross-covariance functions for multivariate random fields based on the work of [Gneiting, 2002]. The maximum likelihood estimation of heterotopic spatio-temporal models with spatial LCM components and temporal dynamics was developed by [Fassò Finazzi(2011)]. A GSLib [Deutsch Journel, 1998] routine for cokriging was properly modified in [De Iaco et al., 2010] to incorporate the spatio-temporal LCM, previously developed using the generalized product-sum variogram model [De Iaco et al., 2003]. Recently, in [De Iaco et al., 2012a] an automatic procedure for fitting the spatio-temporal LCM using the product-sum variogram model has been presented and some computational aspects, analytically described by a main flow-chart, have been discussed. In [De Iaco et al., 2012 b] simultaneous diagonalization of the sample matrix variograms has been used to isolate the basic components of a spatio-temporal LCM and it has been illustrated how nearly simultaneous diagonalization of the cross-variogram matrices simplifies modeling of the matrix variogram.
In the following, after an introduction of the theoretical framework of the multivariate spatio-temporal random function and its features (Section 2), a review of recent techniques for building admissible models is proposed (Section 3). Successively, the spatio-temporal LCM, its assumptions and appropriate statistical tests are presented (Section 4) and techniques for prediction and risk assessment maps are introduced (Section 5). Some critical aspects regarding sampling, modeling and computational problems are discussed (Section 6). Finally, a case study concerning particle pollution () and two atmospheric variables (Temperature and Wind Speed) in the South of Apulian region (Italy), has been presented (Section 7). Before using the spatio-temporal LCM to describe the spatio-temporal multivariate correlation structure among the variables under study, its adequacy with respect to the data has been analyzed; in particular, the assumption of symmetry of the cross-covariance function, has been properly tested [Li et al., 2008]. By using a recent fitting procedure [De Iaco et al., 2012 b], based on the simultaneous diagonalization of several symmetric real-valued matrix variograms, the basic structures of the spatio-temporal LCM which describes the spatio-temporal correlation among the variables, have been easily detected. Predictions of the primary variable () are obtained by using a modified GSLib program, called COK2ST [De Iaco et al., 2010]. Then, risk maps showing the probability that the particle pollution exceeds the national law limit have been associated to predition maps and the estimation of the probability distributions for two sites of interest have been produced.
2. Multivariate spatio-temporal random functionLet be a vector of spatio-temporal random functions (STRF) defined on the domain, with, then
represents a multivariate spatio-temporal random function ( MSTRF), where are the coordinates of the spatial domain and the coordinate of the temporal domain.
Afterwards, the MSTRF will be denoted with and its components with. The STRF are the components of and they are associated to the spatio-temporal variables under study; these components are called coregionalized variables [Goovaerts, 1997].
The observations of the variables, at the points, are considered as a finite realization of a MSTRF Z.
2.1. Moments of a MSTRF
Given a MSTRF Z, with components, we define, if they exist and they are finite:
the expected value, or first-order moment of each component,
the second-order moments,
1. the variance of each component,
2. the cross-covariance for each pair of STRF ,
3. the cross-variogram for each pair of STRF ,
Note that for, we obtain:
the covariance of the STRF, called direct covariance, or simply covariance,
the direct variogram of the STRF,
These moments describe the basic features of a MSTRF, such as the spatio-temporal correlation for each variable and the cross-correlation among the variables.
2.2. Admissibility conditions
In multivariate Geostatistics, admissibility conditions concern both the cross-covariances and the cross-variograms, as described in the following.Let be a MSTRF, with components, and let a set of points of a spatio-temporal domain; the direct and cross-covariances of the MSTRF must satisfy the following inequality:
for any choice of the points and for any choice of the weights. Using the matrix notation, the matrices of the direct and cross-covariances of the STRF and will be admissible if they satisfy the following condition:
where is a vector of weights.
As in the univariate case, the matrices of the direct and cross-variograms of the STRF and will be admissible if, for any choice of the N points, they satisfy the following condition
under the constraint: 3mm
2.3. Stationarity hypotheses
Stationarity hypotheses allow to make inference on the MSTRF. In particular, second-order stationarity and intrinsic hypotheses concern the first and second-order moments of the MSTRF.
2.3.1. Second-order stationarity
A MSTRF Z, with p components, is second-order stationary if:
for any STRF
for any pair of STRF and the cross-covariance depends only on the spatio-temporal separation vector between the points u and u + h:
where. For, the direct covariance function of the STRF is obtained.
There exist several physical phenomena for which neither variance, nor the covariance exist, however it is possible to assume the existence of the variogram.
2.3.2. Intrinsic hypotheses
A MSTRF Z, with components, satisfies the intrinsic hypotheses if:
for any STRF
for any pair of STRF and the cross-variogram exists and it depends only on the spatio-temporal separation vector h:
Second-order stationarity implies the existence of the intrinsic hypotheses, however the converse is not true. Intrinsic hypotheses imply that the cross-variogram can be expressed as the expected value of the product of the increments:
2.3.3. Properties of the cross-covariance for second-order stationary MSTRF
Given a second-order stationary MSTRF, the cross-covariance satisfies the properties listed below.
The cross-covariance is not invariant with respect to the exchange of the variables:
as well as it is not invariant with respect to the sign of the vector:
However, the cross-covariance is invariant with respect to the joint exchange of the variables and the sign of the vector h:
2.3.4. Properties of the cross-variogram for intrinsic MSTRF
Afterwards, the main properties of the cross-variogram for intrinsic MSTRF are given.
The cross-variogram vanishes at the origin, that is:
The cross-variogram is invariant with respect to the exchange of the variables:
The cross-variogram is invariant with respect to the sign of the vector h:
From (15) and (16) follows that the cross-variogram is completely symmetric, as it will be pointed out in the next sections.
2.3.5. Separability for a MSTRF
The cross-covariance for a second-order stationary MSTRF Z is separable if:
where are the elements of a positive definite matrix and is a correlation function. In this case, it results:
hence the changes of the cross-covariance functions, with respect to the changes of the vector h, do not depend on the pair of the STRF.
The cross-covariance for a second-order stationary MSTRF Z is fully separable if:
where are the elements of a positive definite matrix, is a spatial correlation function and is a temporal correlation function. In the literature, many statistical tests for separability have been proposed and are based on parametric models [Brown et al., 2000, Guo, 1998, Shitan Brockwell, 1995], likelihood ratio tests and subsampling [Mitchell et al., 2005] or spectral methods [Fuentes, 2006, Scaccia Martin, 2005].
2.3.6. Symmetry for a MSTRF
The cross-covariance of a second-order stationary MSTRF Z, with components, is symmetric if:
or, equivalently, if:
The cross-covariance of a second-order stationary MSTRF Z, with components, is fully symmetric if:
Atmospheric, environmental and geophysical processes are often under the influence of prevailing air or water flows, resulting in a lack of full symmetry [de Luna Genton, 2005, Gneiting, 2002, Stein, 2005].
Fig. 1 summarizes the relationships between separability, symmetry, stationarity and the LCM in the general class of the cross-covariance functions of a MSTRF Z. If a cross-covariance is separable, then it is symmetric, however, in general, the converse is not true. Moreover, the hypothesis of full separability is a special case of full symmetry.
3. Techniques for building admissible models
In the following, a brief review of the most utilized techniques to construct admissible cross-covariance models is presented.
For the intrinsic correlation model, the matrices are described by separable cross-covariances, [Mardia Goodall, 1993], that is:
where the coefficients are the elements of a positive definite matrix, and is a correlation function. However, this model is not flexible enough to handle complex relationships between processes, because the cross-covariance function between components measured at each location always has the same shape regardless of the relative displacement of the locations. As it will be discussed in the next section, the LCM is a straightforward extension of the intrinsic correlation model.
In the kernel convolution method [Ver hoef Barry, 1998] the cross-covariance functions is represented as follows:
where the are square integrable kernel functions and is a valid stationary correlation function. This approach assumes that all the variables, are generated by the same underlying process, which is very restrictive. Moreover, this model and its parameters lack interpretability and, except for some special cases, it requires Monte Carlo integration.
where are second-order stationary covariances. The motivation and interpretation of the resulting cross-dependency structure is rather unclear. Although some closed-form expressions exist, this method usually requires Monte Carlo integration.
Recently, an approach based on latent dimensions and existing covariance models for univariate random fields, has been proposed; the idea is to develop flexible, interpretable and computationally feasible classes of cross-covariance functions in closed form [Apanasovich Genton, 2010].
4. Linear coregionalization model
LCM is based on the hypothesis that each direct or cross-variogram (covariogram) can be represented as a linear combination of some basic models and each direct or cross-variogram (covariogram) must be built using the same basic models [Journel Huijbregts, 1981].
LCM is utilized in several applications because of its flexibility, moreover it encouraged the development of algorithms able to estimate quickly the parameters of the selected model, assuring the admissibility conditions, even in presence of several variables [Goovaerts, 1997, Goulard, 1989, Goulard Voltz, 1992].
Let Z be a second-order stationary MSTRF with p components, , the direct and cross-covariances for the spatio-temporal LCM are defined as follows:
where are covariances, called basic structures, and the non-negative coefficients satisfy the following property:
hence, in the LCM, it is assumed that:
The matrix C for the second-order stationary Z is built as follows:
Analogously, it is also possible to introduce the LCM for a MSTRF which satisfies the intrinsic hypotheses. In such a case, the direct and cross-variograms are built as follows:
where each basic structure is a variogram and the L matrices of the coefficients, corresponding to the sill values of the basic models, are positive definite.
Then, for a MSTRF which satisfies the intrinsic hypotheses the matrix of the direct and cross-variograms is:
The necessary and sufficient conditions because the model defined in (17) and (20) are admissible are:
the matrices, called coregionalization matrices, must be positive definite.
The necessary, but not sufficient conditions, for the coefficients are the following:a)
The basic structures of the spatio-temporal LCM (20) can be modelled by using several spatio-temporal variogram models known in the literature, such as the metric model [Dimitrakopoulos Luo, 1994], Cressie-Huang models [Cressie Huang, 1999], Gneiting models [Gneiting, 2002] and many others [De Iaco et al., 2002, Kolovos et al., 2004, Ma, 2002, Ma, 2005, Porcu, 2008, Stein, 2005].
whereand are, respectively, the marginal spatial and temporal variograms at th scale of variability; , defined as follows
is the parameter of generalized product-sum variogram model and it is such that
The inequality (23) represents a necessary and sufficient condition in order that each basic structure, with, is admissible. Recently, it was shown that strict conditional negative definiteness of both marginals is a necessary as well as a sufficient condition for the generalized product-sum (21) to be strictly conditionally negative definite [De Iaco et al., 2011a, De Iaco et al., 2011 b].
Substituting (21) in (20), the spatio-temporal LCM can be defined through two marginals: one in space and one in time, i.e.:
Using the generalized product-sum variogram model it is possible:
to identify the different scales of variability and build the matrices by means of the direct and cross marginal variograms;
to describe the correlation structure of processes characterized by a different spatial and temporal variability.
4.1. Assumptions in the spatio-temporal LCM
Fitting a spatio-temporal LCM to the data requires the identification of the spatio-temporal basic variograms and the corresponding positive definite coregionalization matrices, however this is often a hard step to tackle. A recent approach [De Iaco et al., 2012b], based on the simultaneous diagonalization of a set of matrix variograms computed for several spatio-temporal lags, allows to determine the spatio-temporal LCM parameters in a very simple way.
In several environmental applications [Wackernagel, 2003], the cross-covariance function is not symmetric, as for example, in time series in presence of a delay effect, as well as in hydrology, for the cross-correlation between a variable and its derivative, such as water head and transmissivity [Thiebaux, 1990]. Hence, this assumption should be tested before fitting a spatio-temporal LCM.
A useful hint to verify the symmetry of the cross-covariance can be given by estimating all the pseudo cross-variograms [Myers, 1991] of the standardized variables, i.e., If the differences between the estimated pseudo cross-variograms and are zero or close to zero, then it could be assumed that the cross-covariances are symmetric.
The appropriateness of the assumption of symmetry of a spatio-temporal LCM can be tested by using the methodology proposed by [Li et al., 2008], based on the asymptotic joint normality of the sample spatio-temporal cross-covariances estimators. Given a set of user-chosen spatio-temporal lags and the cardinality of, let be a vector of cross-covariances at spatio-temporal lags in. Moreover, let be the estimator of based on the sample data in the spatio-temporal domain, where represents the spatial domain and the temporal one, and define. Under the assumptions given in the above paper, , where converges to. Then the tests for symmetry properties can be based on the following statistics
where a is the row rank of the matrix A, which is such that under the null hypothesis.
Moreover, the choice of modeling the MSTRF Z by a spatio-temporal LCM is based on the prior assumption that the multivariate correlation structure of the variables under study is characterized by scales of spatio-temporal variability. On the other hand, if the multivariate correlation of a set of variables does not present different scales of variability (), then the cross-covariance functions are separable, i.e.,
where are the entries of a positive definite matrix B and is a spatio-temporal correlation function. Hence, as in the spatial context [Wackernagel, 2003], a spatio-temporal intrinsic coregionalization model can be considered.
Obviously, this last model is just a particular case () of the spatio-temporal LCM defined in (17) and it is much more restrictive than the linear model of coregionalization since it requires that all the variables have the same correlation function, with possible changes in the sill values. Note that, if a cross-covariance is separable, then it is symmetric.
In the spatio-temporal LCM, each component is represented as a linear combination of latent, independent univariate spatio-temporal processes. However, the smoothness of any component defaults to that of the roughest latent process, and thus the standard approach does not admit individually distinct smoothness properties, unless structural zeros are imposed on the latent process coefficients [Gneiting et al., 2010].
In most applications, the wide use of the spatio-temporal LCM is justified by practical aspects concerning the admissibility condition for the matrix variograms (covariances). Indeed, it is enough to verify the positive definiteness of the coregionalization matrices, , at all scales of variability.
The spatio-temporal LCM allows unbounded variogram components to be used [Wackernagel, 2003].
5. Prediction and risk assessment in space-time
For prediction purposes, various cokriging algorithms can be found in the literature [Journel Huijbregts, 1981, Chilés Delfiner, 1999]. As a natural extension of spatial ordinary cokriging to the spatio-temporal context, the linear spatio-temporal predictor can be written as
where is a point in the spatio-temporal domain, are the data points in the same domain and are matrices of weights whose elements are the weights assigned to the value of the th variable, at the th data point, to predict the ith variable, , at the point
The predicted spatio-temporal random vector at is such that each component is obtained by using all information at the data points.
The matrices of weights are determined by ensuring the unbiased condition for the predictor and the efficiency condition, obtained by minimizing the error variance [Goovaerts, 1997].
The new.. routine COK2ST [De Iaco et al., 2010] produces multivariate predictions in space-time, for one or all the variables under study, using the spatio-temporal LCM (20) where the basic spatio-temporal variograms are modelled as generalized product-sum variograms. An application is also given in [De Iaco et al., 2005].
Similarly, for environmental risk assessment, the formalism of multivariate spatio-temporal indicator random function () and corresponding predictor, have to be introduced. Let
be a vector of p spatio-temporal indicator random functions (STIRF) defined on the domain, with, as follows
represents a MSTIRF. In other words, for each coregionalized variable with a STIRF can be appropriately defined. Then the linear spatio-temporal predictor (27) can be easily written in terms of the indicator random variables. If the spatio-temporal correlation structure of a is modelled by using the spatio-temporal LCM, based on the product-sum, the new routine COK2ST [De Iaco et al., 2010] can be used to produce risk assessment maps, for one or all the variables under study. If, the dependence of the indicator variable is characterized by the corresponding indicator variogram of I:, which depends solely on the lag vector, and. After fitting a model for, which must be conditionally negative definite, ordinary kriging can be applied to generate the environmental risk assessment maps. In this case, the routine K2ST [De Iaco and Posa, 2012] can be used for prediction purposes in space and time.
6. Some critical aspects
Multivariate geostatistical analysis for spatio-temporal data is rather complex because of several problems concerning:
b. the choice of admissible direct and cross-correlation models,
c. the definition of automatic procedures for estimation and modeling.
There exist several sampling techniques for multivariate spatio-temporal data, as specified herein. Let
be the sets of sampled points in the spatio-temporal domain for the p variables under study. It is possible to distinguish the following situations:
total heterotopy, where the sets of the sampled points are pairwise disjoint, that is
partial heterotopy, where the sets of the sampled points are not pairwise disjoint, that is
isotopy, where the sets of the sampled points coincide, that is
A special case of partial heterotopy is the so-called undersampling: in such a case, the points where a variable, called primary or principal, has been sampled, constitute a subset of the points where the remaining variables, called auxiliary or secondary, have been observed. The secondary variables provide additional information useful to improve the prediction of the primary variable.
In Geostatistics, the main modeling problems concern the choice of an admissible parametric model to be fitted to the empirical correlation function (covariance or variogram). In particular, in multivariate analysis for spatio-temporal data it is important to identify an admissible model able to describe the correlation among several variables which describe the spatio-temporal process. In this context, it is suitable to underline that
it is not enough to select an admissible direct variogram to model a cross-variogram;
the direct variograms are positive functions, on the other hand cross-variograms could be negative functions;
only some necessary conditions of admissibility are known to model a cross-variogram, as the Cauchy-Schwartz inequality, however sufficient conditions cannot be easily applied [Wackernagel, 2003].
The use of multivariate correlation models well-known in the literature, such as the LCM [Wackernagel, 2003], the class of non-separable and asymmetric cross-covariances, proposed by [Apanasovich Genton, 2010], or the parametric family of cross-covariances, where each component is a Matérn process [Gneiting et al., 2010], requires the identification of several parameters, especially in presence of many variables.
Moreover, estimation and modeling the direct and cross-correlation functions could be compromised by the sampling plan.
For spatial multivariate data different algorithms for fitting the LCM have been implemented in software packages. [Goulard, 1989] and [Goulard Voltz, 1992] described an iterative procedure to fit a LCM using a weighted least-squares like technique: this requires first fitting the diagonal entries, i.e. the basic variogram structures must be determined first. In contrast, [Xie Myers, 1995] and [Xie et al., 1995] developed an alternative method for modeling the matrix-valued variogram by near simultaneously diagonalizing the sample variogram matrices, without assuming any model for the variogram matrix; [Kunsch et al., 1997] proposed estimating the range parameters of a LCM using a non-linear regression method to fit the range parameters; [Lark Papritz, 2003] used simulated annealing to minimize a weighted sum of squares of differences between the empirical and the modelled variograms; [Pelletier et al., 2004] modified the Goulard and Voltz algorithm to make it more general and usable for generalized least-squares or any other least-squares estimation procedure, such as ordinary least-squares; [Zhang, 2007] developed an algorithm for the maximum-likelihood estimation for the purely spatial LCM and proved that the EM algorithm gives an iterative procedure based on quasi-closed-form formulas, at least in the isotopic case. Significant contributions concerning estimation and computational aspects of a LCM can be found in [Emery, 2010]. Unfortunately, for multivariate spatio-temporal data there does not exist software packages which perform in an unified way a) the structural analysis, b) a convenient graphical representation of the covariance (variogram) models fitted to the empirical ones and c) predictions. One of the solutions could be to extend the above mentioned techniques to space-time. Indeed, some routines already implemented in the GSLib software [Deutsch Journel, 1998] or in the module gstat of R, can only be applied in a multivariate spatial context.
In the last years, a new GSLib routine, called COK2ST, can be used to make predictions in the domain under study, utilizing the spatio-temporal LCM, based on the generalized product-sum model [De Iaco et al., 2010]. This routine could be merged with the automatic procedure for fitting the spatio-temporal LCM using the product-sum variogram model, presented in [De Iaco et al., 2012 a], in order to provide a complete and helpful package for the analyst who needs to obtain predictions in a spatio-temporal multivariate context. This is certainly the first step for other developments and improvements in this field.
7. Case study
In the present case study, the environmental data set, with a multivariate spatio-temporal structure, involves daily concentrations, Temperature and Wind Speed daily averages measured at some monitored stations located in the South of Apulia region (Italy), from the 1st to the 23rd of November 2009. In particular, there are 28 survey stations, 60 and 54 atmospherical stations for monitoring Temperature and Wind Speed, respectively. As it is highlighted in Fig. 2(a), over the domain of interest, almost all the monitoring stations are either traffic or industrial stations, depending on the area where they are located (close to heavy traffic area or to industrialized area). The remaining monitoring stations are called peripheral. In Fig. 2(b), box plots of daily concentrations classified by typology of survey stations are illustrated.
Fig. 3 shows the temporal profiles of the observed values. It is evident that low (high) values of Temperature and Wind Speed are associated with high (low) values of.
Note that, during the period of interest, the threshold value fixed by National Laws for the human health protection (i.e. 50 which should not be exceeded more than 35 times per year) has been overcome the 13rd, the 14th, the 15th and the 23rd of November 2009. Indeed, as regards this last aspect, it is worth noting that the highest daily averages have been registered at all kinds of monitored stations, even at stations located at peripheral areas, likely for the transport effects caused by wind. As shown in Fig. 2(b), although the maximum values registered at stations close to heavy traffic area are greater than the threshold value, it is evident that these values are less than the ones measured at the other stations.
Spatio-temporal modeling and prediction techniques have been applied in order to assess risk pollution over the area of interest for the period 24-29 November 2009. In particular, the following aspects have been considered:
1. estimating and modeling spatio-temporal correlation among the variables; in the fitting stage of a spatio-temporal LCM, the recent procedure [De Iaco et al., 2012b] based on the nearly simultaneous diagonalization of several sample matrix variograms, has been applied and the product-sum variogram model [De Iaco et al., 2001a] has been fitted to the basic components;
2. spatio-temporal cokriging based on the estimated model, in order to obtain prediction maps for pollution levels during the period 24-29 November 2009 and indicator kriging [Journel, 1983] in order to construct risk maps related to the probability that predicted concentrations exceed the threshold value (50) fixed by National Laws;
3. generating and comparing, for two sites of interest (one close to an industrial area and the other one close to a heavy traffic area), the probability distributions that daily concentrations exceed some risk levels during the period 24-29 November 2009.
7.1. Modeling spatio-temporal LCM
Modeling the spatio-temporal correlation among the variables under study by using the spatio-temporal LCM, requires first to check the adequacy of such model. In particular, the symmetry assumption has been checked by
a. exploring the differences between the pseudo cross-variograms of the standardized variables (standardized by subtracting the mean value and dividing this difference by the standard deviation),
b. using the methodology proposed by [Li et al., 2008].
As regards point a), the largest absolute difference has been equal to 0.135 and has been observed among the differences between the two pseudo cross-variograms concerning and Wind Speed standardized data. On the other hand, for the point b), three pairs composed by six stations, with consecutive daily average measurements have been selected for the test. These pairs of monitoring stations have been picked to be approximately along the direction, since the prevalent wind direction over the area under study and during the analyzed period was. Moreover, the temporal lag has been selected in connection with the largest empirical cross-correlations for all variable combinations; hence has been considered for all testing lags. Three different variables and three pairs of stations generate 9 testing pairs in the symmetry test, and consequently the degree of freedom a for the symmetry test is equal to 9. The test statistic TS (25) has been equal to 0.34 with a corresponding p-value equal to 0.99. Hence, the results from both the procedures have highlighted that the spatio-temporal LCM is suitable for the data set under study.
By using the recent fitting procedure [De Iaco et al., 2012b] based on the nearly simultaneous diagonalization of several sample matrix variograms computed for a selection of spatio-temporal lags, the basic independent components and the scales of spatio-temporal variability have been simply identified. In particular, the spatio-temporal surfaces of the variables under study have been computed for 7 and 5 user-chosen spatial and temporal lags, respectively (Fig. 4).
Then, the 35 symmetric matrices of sample direct and cross-variograms have been nearly simultaneous diagonalization in order to detect the independent basic components. In this way, 3 scales of spatial and temporal variability have been identified: 10, 18 and 31.5 km in space, and 2.5, 3.5 and 6 days in time.
Thus, the following spatio-temporal LCM has been fitted to the observed data:
where the spatio-temporal variograms are modelled as a generalized product-sum model, i.e.
The spatial and temporal marginal basic variogram models, and respectively, and the coefficients previously defined in (22), are shown below:
where and denote the well known exponential and Gaussian variogram models, with practical range a [Deutsch Journel, 1998].
Finally, the matrices of the spatio-temporal LCM (31), computed by the procedure described in [De Iaco et al., 2012 a], are the following:
Fig. 5 shows the spatio-temporal variograms and cross-variograms fitted to the surfaces of, Temperature and Wind Speed daily averages. Then, the spatio-temporal LCM (31) has been used to produce prediction and risk assessment maps for daily concentrations, as discussed hereafter.
7.2. Prediction maps and risk assessment
In order to obtain the prediction maps for pollution levels for the period 24-29 November 2009, spatio-temporal cokriging has been applied, using the routine COK2ST [De Iaco et al., 2010]. Risk assessment maps have been associated to the prediction maps. Spatial indicator kriging has been applied to assess the probability that predicted daily concentrations exceed the threshold value fixed by National Laws for the human health protection (i.e. 50), during the period of interest. Fig. 6 and Fig. 7 show contour maps of the predicted values and the corresponding risk maps, for the period 24-29 November 2009. The red points on the maps represent the monitoring stations.
It is important to highlight that the highest values are predicted in the Eastern part of the domain of interest: this area corresponds to the boundary between Lecce and Brindisi districts which is strictly close to an industrial site, such as the thermoelectric power station Enel-Federico II, located in Cerano (Brindisi district). Moreover, in this area the probability that daily concentrations exceed 50 is high during the predicted week. It is worth noting that on Saturday and Sunday the predicted values of daily concentrations show lower average levels than the ones estimated during the working days, when heavy traffic contributes to keep pollution concentrations high; consequently, the corresponding risk maps do not show hazardous conditions.
7.3. Probability distributions for different sites
After producing predicted maps of daily concentrations, it is also interesting to estimate the probability distribution that daily concentrations exceed some risk levels at sites characterized by sources of pollution.
Two different sites, one close to an industrialized area, located at Brindisi district, and the other one close to a heavy traffic area, located at Lecce district, have been considered in order to generate and compare the probability distributions that daily concentrations overcome several risk levels during the period 24-29 November 2009. Fig. 8 shows that, all over the industrialized area of interest, the probability that daily concentrations exceed the threshold fixed by National Laws (50) is very high (greater than 80%) during the period 24-27 November 2009 (working days); on the other hand, during the weekend (28-29 November 2009) such a probability decreases at 40-42%.
Note that at the site close to a heavy traffic area, the probability that daily concentrations exceed the national law limit is very low during the analyzed period, except on the 25th of November; moreover, it drops off rapidly for values below the national threshold. This empirical evidence highlights that there is no critical exceeding for the selected traffic site, especially during the last 3 days of the week. As it is shown, this is a very powerful tool since any action of environmental protection might be adopted in advance by taking into account the actual likelihood of dangerous exceeding.
In this paper, some significant theoretical and practical aspects for multivariate geostatistical analysis have been discussed and some critical issues concerning sampling, modeling and computational aspects, which should be faced, have been pointed out. The proposed multivariate geostatistical techniques have been applied to a case study pertaining particle pollution () and two atmospheric variables (Temperature and Wind Speed) in the South of Apulian region.
Further analysis regarding the integration of land use and possible sources of pollution through an appropriate geographical information system could be helpful to fully understand the dynamics of, which is still considered one of the most hazardous pollutant for human health.
The authors are grateful to the Editor and the reviewers, whose comments contribute to improve the present version of the paper. This research has been partially supported by the 5per1000 project (grant given by University of Salento in 2011).