Spatial Coordinate Transformations with Noisy Data

The parametric transformation of spatial coordinates between different reference frames is traditionally dealt with a stepwise approach which imposes a suboptimal treatment in the presence of noisy data. The chapter explains briefly the drawbacks of this approach and then presents an alternative scheme for spatial coordinate transformations that improves the classic stepwise solution when using noisy coordinates of known stochastic structure. The proposed methodology is simple in principle, although its numerical implementation with nonlinear parametric models is a bit more involved and it relies on the joint least squares adjustment of the observed coordinates using their full stochastic model over all points of interest. The mathematical framework and the related properties of this “ stacking ” approach are presented in detail, along with a numerical example that demonstrates its feasibility for practical problems in geospatial applications.

Spatial coordinate transformations (SCTs) are utilized in practice either explicitly for determining the unknown coordinates of scattered points in a desired frame from their observed coordinates in a different frame or implicitly in the context of more composite procedures such as the self-calibration of terrestrial laser scanners [20,23,26], the conflation of digital maps and geographical databases [2,4,6,27], the reconstruction of 3D models from multi-sensor data [9,29,35], and the integration of aerial or satellite images in ground-based systems of geographic coordinates [14,15,21]. Various technical terms have actually been used with regard to SCTs in practical problems, for instance, spatial adjustment, image registration, absolute orientation, geo-referencing, and frame transformation, to name a few. Despite their linkage to different application fields, all these terms refer, more or less, to the same archetypical problem, that is, the optimal fusion of partially overlapping configurations of spatial points using their coordinate-based representations in separate frames and an application-specific model to describe their systematic differences. Although this general viewpoint includes also cases with raster-type where X 0 and X contain the Cartesian coordinates for the same group of physical points (or homologous groups of points) in different reference frames. The spatial dimensionality does not need to be specified here, and it can refer to any case that occurs in practice. The vector θ represents the parameters of the transformation model which enable the coordinate mapping from the initial frame to the target frame.
In the following, Eq. (1) is considered as an exact formula for noise-free coordinates and provides the general framework for the LSA of observed coordinates in the involved frames. Simpler types of transformation models with joint or marginal linearity in X 0 and θ (e.g., errors-in-variables models, differential or close-to-identity models) can be also analyzed under the previous setting.
For the purpose of this contribution, the user's data shall consist of: a. the observed coordinates for CPs and NPs in the initial frame (denoted by X 0 and Z 0 , respectively); b. the observed coordinates for CPs in the target frame (denoted by X); and c. the error CV matrices of the previous vectors (denoted by Σ X 0 , Σ Z 0 , Σ X ).
An additional matrix of special importance is the cross-CV matrix Σ X 0 Z 0 which reflects the intra-frame error correlation between CPs and NPs, and it is totally ignored in the traditional stepwise procedure.

Estimation of transformation parameters
The first step refers to the estimation of the transformation parameters using a sufficient number of known CPs. Following a statistical estimation perspective, the optimal parameter values are obtained by solving the nonlinear LSA problem where the vectors v X and v X 0 represent the zero-mean random errors in the observed coordinates. After appropriate linearization, the above problem can be reduced to a linear LSA problem for the so-called Gauss-Helmert (GH) model [12,17], and it leads to an iterative solution via successive refinements of the preliminary estimate:θ where θ o contains approximate values for the transformation parameters. The recursive updating of the previous solution is performed by the Newton-Gauss iteration algorithm in accordance to a more complex expression that will be presented later in this chapter. The matrices J X 0 and J θ are the Jacobians with respect to the initial frame coordinates and the transformation parameters, that is, (6) and they need to be re-evaluated at each iteration step using the adjusted values from the previous step. For more details on nonlinear least squares adjustment and iterative computational algorithms, the reader should consult the excellent treatise in [24] (see also [3,10,34]).

Determination of transformed coordinates
After estimating the transformation parameters, an additional step is required to complete the solution of the problem at hand, that is, the computation of the transformed coordinates in the target frame. This is performed by a simple forward evaluation of the transformation model at the CPs and NPs, using the respective nonlinear formulae:X Note thatθ corresponds to the estimated parameters from the first step, whereas X 0 and Z 0 refer to the observed coordinates in the initial frame. The following Jacobian matrices are also defined here (to be used later on): which differ from their previous counterparts in Eq. (6) as they refer to a separate group of points (NPs).

Deficiency of the stepwise approach
A number of drawbacks exist in the stepwise approach for SCT problems with noisy data. More specifically, (a) the noise of the original coordinates remains unfiltered during their transformation to the target frame, (b) the correlated errors in the original coordinates between CPs and NPs are not taken into account, and (c) the accuracy of the transformed coordinates is not optimized under any statistical principle. All these drawbacks relate to the same modeling deficiency that is summarized as follows: the observed coordinates in the initial frame are contaminated by random errors which remain uncontrolled during the second step of the transformation process, and they are fully absorbed by the transformed coordinates of CPs and NPs.
The aforesaid deficiency is irrelevant for practical applications only in two cases: • if the sole objective is to determine a set of transformation parameters between different frames, without the need to perform any coordinate transformation at specific points; or • if spatial objects (e.g., point cloud, network, digitized map) need to be transferred from an initial frame to another frame, without any "quality improvement" of their transformed coordinates.
However, if the user's goal is the optimal referencing of spatial objects with respect to a target frame, then the unfiltered data noise becomes a critical error source for SCT problems. This does not mean that the stepwise approach leads to wrong results, but it signifies that the composite estimators in Eqs. (7) and (8) do not provide an optimal solution of maximum accuracy for the transformed coordinates.
It is worth noting that the stepwise approach is not compelled to reproduce the prior reference coordinates of CPs in the target frame, that is,X 6 ¼ X, even if these coordinates are perfectly known without any errors!

Best-fitting transformation solutions
In some cases, the estimation of transformation parameters is performed via the alternative nonlinear least squares principle [36,37]: where Á k k denotes the standard form of the Euclidean vector norm. The rationale of the above principle is to bring in the best alignment two different coordinate sets over a group of CPs, and it does not lead to the same parameter estimates as the statistical least squares formulation of Section 3.2. Their formal equivalency occurs if the known coordinates in the initial frame are treated as noiseless quantities and the respective coordinates in the target frame are affected by uncorrelated random errors of equal variance. Nevertheless, Eq. (10) has a strong geometrical significance, and it is often used in practice regardless of the noise characteristics of the available data.
If the transformation parameters are obtained by the alternative principle of Eq. (10), then it obviously holds that X ÀX 2 ! min (11) which implies that the transformed coordinates of CPs will be optimally fitted, in a least squares sense, to their prior known values in the target frame. This best-fitting property does not enforce statistical optimality to the accuracy of the transformed coordinates-the latter will still absorb the entire observation noise according to Eqs. (7) and (8). Therefore, the point to be stressed here is that a high-quality transformation solution should not just rely on the fitting performance at CPs, but it has to exploit in an optimal sense the stochastic error model of the observed coordinates over all points of interest.

The stacking approach in spatial coordinate transformations 4.1 Theoretical aspects
A unified optimal solution for SCT problems can be obtained in a single stage through the rigorous combination of all available data. This requires the joint LSA of the nonlinear transformation equations: which should be performed in a linearized context via the Newton-Gauss iteration method [3,24,34]. The algebraic setup of this stacking adjustment and the basic properties of the resulting estimators for the transformed coordinates are presented in this section.

Linearization
At first, we need to approximate the nonlinear Eqs. (12) and (13) by the truncated multivariate Taylor's series expansions: where θ o is a vector of approximate values for the transformation parameters and X 0 o , Z 0 o are vectors of approximate coordinates for the respective points in the initial frame. Taking into account that the observables correspond to the coordinate vectors X, X 0 , and Z 0 , the previous formulae should be further augmented as follows: where the added vectors v X , v X 0 , v Z 0 denote the zero-mean random errors of the observed coordinates. The linearized expressions (16) and (17) can be equivalently written in the block-matrix form: which conforms to the usual structure of Gauss-Helmert linear models of statistical estimation theory [12,17,24]. Our objective here is to invert the above stacked system of the general form Ax þ Bv þ w = 0 using the general least squares principle v T P v ¼ min, in conjunction with the data weight matrix: which reflects the total statistical accuracy of the observables. Note that the inter-frame correlations of the observed coordinates are assumed to be zero, whereas the intra-frame correlations between CPs and NPs are taken into account by the cross-CV matrix If applied under a proper iterative setting, the LSA of Eq. (18) leads to the sought optimal solution of the problem at hand. Specifically, the transformation parameters and the coordinates of NPs in the target frame are both contained into the "parameter vector" of the stacked GH-type model, and they can be directly obtained via the respective least squares estimator (see next section). On the other hand, the estimated coordinates of CPs in the target frame shall be deduced in an implicit way by correcting the observed values X for the effect of their random errors (v X ) which are also estimable from the iterative least squares inversion of Eq. (18).

Optimal least squares estimators
By applying the general LSA solution of linear GH models (see [12,17]) to the stacked system of Eq. (18) and after some extra lengthy derivations using analytic inversions of 2 Â 2 block matrices, we obtain the explicit estimators for the transformation parameters: and for the coordinates of NPs in the target framê whereas the estimated errors for each subset of observed coordinates are given by the equation The auxiliary matrix W that appears in the previous equations was defined earlier in Section 3.2. Finally, if we combine the first error component from Eq. (22) with the basic formulaX ¼ X þv X , we get the estimated coordinates of CPs in the target frame: To facilitate a comprehensive analysis of the stacking approach, it is useful to rewrite Eqs. (21) and (23) in the combined Kalman-like form: where the auxiliary terms X _ and Z _ are strictly given by the expressions which, to a first-order approximation, mimic the result of the traditional stepwise approach, that is, All previous estimators refer to a single execution of the weighted LSA in the linearized system of Eq. (18). Their use in practical applications with nonlinear transformation models requires a recursive algorithm, as explained in more detail in Section 4.2.

Basic features of the stacking approach
Compared to the traditional stepwise methodology, the stacking approach leads to the same least squares estimate for the transformation parameters but to different values for the estimated coordinates in the target frame. This partial equivalency is expected since the inclusion of NPs into the adjustment procedure does not contribute additional information for the transformation parameters. On the other hand, the estimated coordinates contain extra corrections which are derived from stochastic filtering of the coordinate residuals XÀ X _ and kriging-like prediction over all points of interest [see Eq. (24)]. Loosely speaking, the effect of those corrections resembles a rubber-sheeting process in the sense of "stretching" the classic stepwise solution to counteract the propagated data noise in the entire set of transformed coordinates.
The stacking approach permits also the exact fit over all CPs regardless of the noise level in the initial frame. This essential property is easily verified by Eq. (24) which implies that or in a loosened version The first condition dictates that the transformed coordinates of CPs will match their prior values, if the latter are assumed to be of perfect quality. The second condition is also useful for practical applications, as it allows the users to improve the fitting performance of the transformation results via a simple tuning of the CV matrix Σ X . This last option is essentially equivalent to stochastic constraining of the prior coordinates of CPs in the target frame.
As a final note, let us point out that both approaches give similar results in the presence of noiseless data in the initial frame. In such case the least squares estimators of the previous section admit the conditional behavior: Interestingly, the CV matrix Σ Z 0 does not play an active role within the stacking approach, in contrast to the cross-CV matrix Σ Z 0 X 0 which is of crucial importance for the optimal transformation at the NPs [see Eq. (22)]. In Table 1 all relevant cases that can appear in SCT problems are classified with regard to the stochastic model of the observed coordinates in the respective frames.

Computational aspects
The numerical computation of the stacking solution in nonlinear SCT problems requires a recursive implementation of the least squares estimators given in Section 4.1.2. The Newton-Gauss iteration method is suitable for this purpose and entails the updating of the approximate vectors X 0 o , Z 0 o , θ o at each step by their adjusted values from the previous step until sufficient convergence is achieved in all estimated quantities of interest [3,24,34].
The aforesaid procedure should be applied for computing both the transformation parameters and the coordinates of CPs/NPs in the target frame, based on the following algorithm: where the index k ¼ 1, 2, … denotes the LSA iteration step. All Jacobian matrices shown in these equations should be re-evaluated at each step as follows: CV matrices of observed coordinates Does data noise filtering occur in the transformation process ?

Initial frame Target frame
Control points Σ X 0 6 ¼ 0 Yes-perfect fit to prior values Table 1. Different cases in the stacking approach with regard to the stochastic model of the observed coordinates.
Note that the auxiliary weight matrix W that appears in Eq. (31) depends on J X 0 [see Eq. (5)] and it is also required to be updated at each step.
To initialize the Newton-Gauss iteration process, a simple choice is to set the approximate coordinates equal to the observed values (X 0 , while the approximate transformation parameters are typically obtained via empirical procedures. The initial computation ofθ is thus reduced to the simpler form given already in Section 3.2, whereas for subsequent iterations the rigorous expression of Eq. (31) should be used. The updating of all approximate vectors at each step should be performed by the following equations: Special cases with noise-free coordinates in the initial frame (Σ X 0 ¼ 0) and/or uncorrelated coordinates between CPs and NPs (Σ Z 0 X 0 ¼ 0) can be easily treated under the previous framework, and they lead to identical results as the traditional stepwise approach.

Statistical accuracy assessment in SCT solutions
The error CV matrices ofθ,X, andẐ are the fundamental elements for the formal quality assessment in SCT solutions. Their rigorous expressions are obtained by covariance propagation to the respective estimators given in previous sections, and they are presented here without their full mathematical proofs.
Both the stepwise and the stacking approach lead to the same optimal estimate for the transformation parameters, whose error CV matrix is given by the formula: Regarding the accuracy assessment of the transformed coordinates by the stepwise approach, the following expressions should be used: which refer to the CPs and NPs, respectively. The overbar symbol is used to distinguish the above error CV matrices from the respective expressions that apply in the stacking approach. The latter are given by the general formulae: where the auxiliary matrices K and Q are defined as and Σ e is the CV matrix of the coordinate residuals X À f X 0 ;θ , that is, Equations (43) and (44) reveal the expected improvement of the statistical accuracy in the SCT solution by the stacking approach. The diagonal elements (i.e., coordinate error variances) of ΣX and ΣẐ are always smaller than the respective elements of ΣX and ΣẐ , a fact that is attributed to the noise filtering of the observed coordinates during the transformation process.

Numerical example
To demonstrate the potential of the stacking approach in practical transformation problems, a simple example is given here for a simulated 2D network with seven CPs and four NPs. The true coordinates of all network points are listed in Table 2, and they are related by a second-order polynomial transformation: whose associated parameters are provided in Table 3.
The observed coordinates for our experiments stem by adding simulated Gaussian noise to the true values of Table 2. The known coordinates of NPs in

Initial frame
Target frame the target frame are not included in the observables, but they were used only for cross-validation of the transformation results. The generated random errors at the CPs in the target frame are uncorrelated with a common standard deviation of 0.1 cm for the x and y coordinates. On the other hand, the generated random errors at the CPs/NPs in the initial frame are spatially correlated in terms of the simplified Gaussian-type covariance model: 18.50 À0.25 1.20 À0.0002 0.0011 À0.0002 Table 3.
True parameter values of the second-order polynomial transformation model. where σ is the common error standard deviation for the x 0 and y 0 coordinates (set equal to 5 cm) and ρ is their error correlation coefficient at each point (set equal to À0.2). The values of the auxiliary parameters A and B were fixed to 6 Â 10 À7 and 7 Â 10 À6 , respectively, which ensure the positive definiteness of the resulting CV matrix for the observed coordinates in the initial frame.
Using a Monte Carlo sampling scheme and a Cholesky-based algorithm for the stochastic simulation of correlated random vectors, a total of 1000 noisy ensembles were produced for the triplet of coordinate vectors X, X 0 , and Z 0 . These synthetic datasets were used with the stepwise and stacking approach to determine the transformed coordinates and their associated accuracy, over all points of the simulated network.
The differences between the true and the transformed coordinates in the target frame, as obtained by all data ensembles under each approach, are shown in Figures 1 and 2. The cloud plots in these figures refer only to a subset of the CPs/NPs, yet similar results are acquired at all other network points. It is clear that the stacking approach yields significantly better results than the traditional stepwise approach, and it effectively filters the existing noise of the initial coordinates. The accuracy improvement ranges from 88 to 92% at the CPs, while it is a bit lower (63-78%) at the NPs (see detailed results in Table 4). It should be emphasized that the stochastic model of the observed coordinates plays a key role in the performance of the stacking approach. This means that the results shown here may exhibit different behavior-displaying either insignificant or even more profound accuracy improvement for the transformed coordinatesfor varied choices of the CV matrices Σ X , Σ X 0 , Σ Z 0 , and Σ X 0 Z 0 .  All values given in cm. Table 4. Standard deviations of the transformed coordinates in the simulated network by different approaches.