Multimodal datasets with multisensor (#1, #4, #10), multisource (#2, #3, #6, #7, #8) and multilooking (#5, #9) bi-temporal satellite images.
Open access peer-reviewed article
This Article is part of NUMERICAL ANALYSIS AND SCIENTIFIC COMPUTING Section
Version of Record (VOR)
*This version of record replaces the original advanced online publication published on 28/03/2022
Article Type: Research Paper
Date of acceptance: March 2022
Date of publication: March 2022
DoI: 10.5772/acrt.02
copyright: ©2022 The Author(s), Licensee IntechOpen, License: CC BY 4.0
Statistical methods for automatic change detection in heterogeneous bitemporal satellite images remain a challenging research topic in remote sensing, mainly because this research field involves the processing of image data with potentially very different statistical behaviors. In this paper, we propose a new Bayesian statistical approach relying on spatially adaptive class-conditional likelihoods, which are also adapted to the considered imaging modality pair and whose parameters are estimated in a preliminary estimation step. Once that estimation is done, a second stage is dedicated to the change detection segmentation itself, based on this likelihood model defined for each pixel and for each imaging modality. In this context, we compare and discuss the performance of different Markovian segmentation strategies, obtained in the sense of several non-hierarchical or hierarchical Markovian estimators, on real satellite images with different imaging multi-modalities. Based on our original pixel-wise likelihood model, we also compare these Markovian segmentation strategies with the existing state-of-the-art heterogeneous change detection algorithms proposed in the literature.
change detection
heterogeneous satellite captors/sensors
Markovian segmentation
multimodal or multisource or multisensor or multilooking satellite images
parameter estimation
Markovian estimators
Multimodal change detection (MCD) is a recent area of research that has grown considerably over the past decade, mainly due to the rapid development of new sensor systems, new data processing techniques and easier access to remote sensing data. In satellite imagery, multimodal (or heterogeneous) change detection (CD) [1] is the process of detecting or identifying, in a given geographical area, any changes, based on two (or more) images acquired at different times. The pair (or set) of satellite images to be analyzed, commonly referred to as bi- (or multi-) temporal images, is usually acquired either by different (or heterogeneous) sensors or with the same sensor but with different specifications or conditions.
More precisely, in MCD systems, the heterogeneous bitemporal images may be (in order of increasing difference from conventional monomodal CD and of increasing processing difficulty) either provided by different (active or passive) types of sensors, like passive optical and active SAR systems (leading to bitemporal multisource images), or provided by different passive systems (e.g., different optical sensors) or active remote sensors such as lidars and radar (in the case of bitemporal multisensor or cross-sensor images). Heterogeneous bitemporal images can also be acquired with the same sensing system, but either in different wavelength ranges (hyperspectral or multispectral images), or with different internal settings, different looks (possibly different look angles) or different and complementary speckle noise pre-filtering processes (for multilooking images), or with unbalanced data or noise distributions, possibly due to lighting, weather, or phenological or chemical changes that also influence the imaging system (unbalanced image data [2]).
MCD is a difficult image processing problem that can only be solved with a sufficiently flexible, intelligent and robust model for processing image data with different statistical behaviors. This low-level image processing task allows us to solve the same issues usually handled by homogeneous CD techniques [3–8], required for the development, for example, of damage detection and evaluation systems (earthquake, flooding, hurricane, tsunami, forest fire, volcanic activity, etc.) or of environmental, agricultural, mineral exploration or urban growth monitoring or planning systems. MCD has attracted growing interest over the past decade in the geoscience community, mainly because this approach is much less restrictive about the origin of the acquired data than the conventional homogeneous CD technique. In fact, this low-level processing allows us to fully and intelligently use the multiple heterogeneous (and ever increasing) data sourced from existing archives or from the many and very varied Earth observation satellites existing or planned. In addition, due to advances in this field and to formats and specifications evolving through time, the heterogeneity of these satellite or aerial image data is expected to increase in the years to come. Finally, the complementarity of these different imaging modalities can be used to our advantage. More precisely, a technique of fusion of imaging modalities could potentially be exploited (not only in geo-scientific imagery [9]) to further improve the detection and the analysis of surface changes.
The problem of change detection for homogeneous satellite datasets has been extensively studied in the literature since the advent of digital imaging techniques. The geoscience research community only recently turned to the MCD problem, once the homogeneous case was sufficiently well understood. This explains why, until now, relatively little research has been proposed on MCD.
Despite this, and depending on the modeling strategy used, four main groups of approaches can be identified in the recent remote sensing literature.
The first and most basic techniques are based on similarity metrics [10–12] or rely on empirical (generally hand-crafted) features or on local and non-parametric operators or detectors [12–14] that are supposed to have good invariance to sensor characteristics.
The second group is made up of non-parametric (possibly statistical) based models whose structure is not
Thirdly, one can also identify algorithms based mainly on a projection or transformation of the bi-temporal heterogeneous images into a common representation space, in which the pair of multimodal satellite images have the same behavior in statistical terms and on which conventional change detection techniques using homogeneous multitemporal satellite images can then be adopted [31–43]. Belonging to this category but also to the previous one, image mapping from one domain to another domain was also exploited
Finally the last class is given by parametric methods, which we will detail more precisely below since the model described in this work falls into this group. In this strategy, a (possibly finite mixture of) parameterized multivariate distribution law(s) are usually (
Finally, the authors in [55] use a pixel pair modeling and find the likelihood laws for these two possible class labels;
In this work, we present a quite different approach, in a parametric statistical framework, whose originality is to rely on a set of class-conditional likelihoods that are also conditioned on the spatial neighborhood of each pixel. This set of spatially adaptive likelihoods allows us to formally define the MCD issue in the Bayesian setting with a reliable and spatially (or neighborhood) adaptive likelihood model whose parameters are estimated in a preliminary estimation step. Once the estimation stage is completed, a second stage is dedicated to the binarization (or detection of multimodal change) itself, given this set of estimated spatially adaptive likelihoods. In this context, we compare and discuss the accuracy of the CD segmentation map obtained in the sense of several Markovian estimators such as the ML (Maximum Likelihood), MAP (Maximum A Posteriori) [56], MPM (Marginal Posterior Mode) [57], SMAP (or Sequential MAP) estimator using the multiscale and hierarchical Bayesian segmentation framework of Bouman
Let
In the unsupervised Bayesian framework proposed in this work, it is necessary, first of all, to estimate the likelihoods (
To this end, for each pixel
In our experiments, we have noticed that the distribution of the grey levels associated to each central pixel of these
Regarding the class label
At this point, it is important to understand that some of the
Once the estimation of the conditional likelihood distributions (see equations (1)) are defined for each site
At this point, it is interesting to recall the fundamental difference between these different criteria commonly used in the Bayesian image segmentation framework. To this end, let
From left to right: hierarchical Markovian structure of the SMAP, with a quad-tree structure (for the ascending Markovian process) and a pyramid structure (for the descending pass), and of the SCMAP, with a spatial second-order neighborhood system including a scale-causal link with the (immediately) coarser resolution scale.
We detail a little more in this section, the simplest Markovian segmentation algorithm called ICM related to the classical
In this Bayesian strategy, the segmentation in two classes 𝛬
In order to minimize this energy function, we use an iterative deterministic optimization technique called the ICM algorithm, whose steps we recall here:
For the initialization of ICM (iteration [0], in superscript), we choose an initial configuration as close as possible to the optimal segmentation; for example a segmentation in the ML sense:
Estimation of
For each pixel (or site) in lexicographic order:
For each site
We select the class
Return to 1. until a criterion is reached, generally:
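The ICM steps listed above can be sketched in Python as follows. This is only an illustrative sketch and not the author's implementation: since the exact energy function is not reproduced here, it assumes a standard Potts-type prior with a hypothetical regularization weight `beta`, a second-order (8-connected) neighborhood, and a stopping rule (no label change during a sweep, or `max_iter` sweeps) that is a common choice rather than the criterion stated in the text. The names `icm_binary` and `neg_log_lik` are hypothetical.

```python
import numpy as np

# Offsets of the 8 nearest neighbors (second-order neighborhood, an assumption).
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def icm_binary(neg_log_lik, beta=1.0, max_iter=20):
    """Illustrative ICM for a two-class change map.

    neg_log_lik: array of shape (2, H, W) holding -log p(y_s | x_s = k)
                 for each class k and each site s (assumed already estimated).
    beta:        hypothetical weight of the assumed Potts prior.
    Returns an (H, W) integer label map.
    """
    n_classes, height, width = neg_log_lik.shape
    # Initialization (iteration 0): segmentation in the ML sense.
    labels = np.argmin(neg_log_lik, axis=0)
    for _ in range(max_iter):
        n_changed = 0
        # Visit the sites in lexicographic order.
        for i in range(height):
            for j in range(width):
                best_label, best_energy = labels[i, j], np.inf
                for k in range(n_classes):
                    # Potts prior: count the neighbors whose label differs from k.
                    disagree = 0
                    for di, dj in NEIGHBORS:
                        ni, nj = i + di, j + dj
                        if 0 <= ni < height and 0 <= nj < width:
                            disagree += int(labels[ni, nj] != k)
                    energy = neg_log_lik[k, i, j] + beta * disagree
                    if energy < best_energy:
                        best_label, best_energy = k, energy
                if best_label != labels[i, j]:
                    labels[i, j] = best_label
                    n_changed += 1
        # Assumed stopping rule: a full sweep without any label change.
        if n_changed == 0:
            break
    return labels
```

With `beta=0` the sketch reduces to the ML segmentation; larger values of `beta` increasingly favor spatially homogeneous change maps. Because ICM is a deterministic local descent, the result depends on the initialization, which is why an ML starting point is used.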
Multimodal datasets (see table
Multimodal datasets (see table
CD confusion maps (white: TN, red: TP, blue: FP, cyan: FN) for the dataset #1 (top) and #2 (bottom), obtained in the sense of several Markovian estimators combined with the data likelihood model defined in equation (
To show the effectiveness of our approach, we have performed extensive experiments on different real heterogeneous datasets with different multi-modality types (multi-sensor/source/looking), exhibiting a wide diversity of change events and provided at different resolution levels and image sizes (see table 1).
For all the tests we have performed, we have only used the luminance component or the greyscale information of the image and thus converted the three color channels (or the multi-spectral bands (possibly with the models introduced in [62, 63])) into one single gray channel when it is necessary. In addition, to reduce the computational load, we have chosen to sub-sample the image so that its maximum length or width is less than 512 pixels (with a decimation technique given by a simple moving average filter) and finally, we have used a double histogram matching method [64] on
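The grey-level conversion and sub-sampling described in this paragraph can be sketched as follows. This is a minimal sketch under stated assumptions, not the author's code: the luminance weights are the usual Rec. 601 coefficients (the text does not say which ones were used), the moving-average decimation is implemented as a non-overlapping block mean, and the function names `to_luminance` and `decimate` are hypothetical. The histogram matching step is not shown.

```python
import numpy as np


def to_luminance(rgb):
    """Convert an (H, W, 3) color image to a single grey channel.

    Assumption: Rec. 601 luminance weights.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]


def decimate(gray, max_side=512):
    """Sub-sample so that both sides are strictly smaller than max_side.

    Each output pixel is the mean of a k x k block (moving average followed
    by sub-sampling with stride k); rows/columns that do not fill a complete
    block are cropped.
    """
    gray = np.asarray(gray, dtype=np.float64)
    height, width = gray.shape
    # Smallest integer factor k such that max(height, width) // k < max_side.
    k = max(height, width) // max_side + 1
    if k == 1:
        return gray
    new_h, new_w = height // k, width // k
    cropped = gray[: new_h * k, : new_w * k]
    return cropped.reshape(new_h, k, new_w, k).mean(axis=(1, 3))
```

For instance, a 1024 × 600 image would be reduced with a factor of 3 to 341 × 200 pixels in this sketch.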
# | Date | Location | Size (pixels) | Event (& spatial res.) | Sensor |
---|---|---|---|---|---|
1 | 09/95–07/96 | Sardinia, It | 123.6 K | Overflow (30 m) | Landsat-5 (NIR band)/Optical |
2 | 07/06–07/07 | Gloucester, UK | 9.6 M | Flooding (0.65 m) | TerraSAR-X/QuickBird 2 |
3 | 02/09–07/13 | Toulouse, Fr | 11.4 M | Urbanization (2 m) | TerraSAR-X/Pleiades |
4 | 05/12–07/13 | Toulouse, Fr | 4 M | Urbanization (0.52 m) | Pleiades/WorldView-2 |
5 | 01/01–01/02 | Congo, Africa | 320 K | Volcano (10 m) | Radarsat (3/5-looks) |
6 | 01/17–02/17 | Sutter, CA, USA | 439.7 K | Flooding (≈15 m) | Landsat-8/Sentinel-1A |
7 | 06/08–06/13 | Island town, Ch | 167.2 K | Urbanization (8 m) | Radarsat-2/ Google Earth |
8 | 06/08–09/12 | Shuguang, Ch | 419 × 342 | Urbanization (8 m) | Radarsat-2/Google Earth |
9 | 06/08–06/09 | Yellow river, Ch | 150 K | River drying up (8 m) | Radarsat-2 (1/4-looks) |
10 | 1999–2000 | Gloucester, UK | 548.5 K | Flooding (≈25 m) | Spot/NDVI |
Multimodal datasets with multisensor (#1, #4, #10), multisource (#2, #3, #6, #7, #8) and multilooking (#5, #9) bi-temporal satellite images.
The internal parameters of our estimation step model (see section 2.1) are;
To discuss the efficiency of our neighborhood-adaptive class-conditional data likelihood model defined in equation (1), we have compared the CD segmentation results obtained on our heterogeneous datasets in the sense of different existing (non-hierarchical or hierarchical) Markovian estimators. This first quantitative study uses the same evaluation measures as those proposed in [55] and [40], namely the F-measure and the total percentage of good classification (accuracy).
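As an illustration, the two evaluation measures can be computed from a binary change map and its ground truth as sketched below. The text does not spell out which F-measure variant is used; this sketch assumes the usual F1 score (harmonic mean of precision and recall on the "changed" class), and the function name `change_detection_scores` is hypothetical.

```python
import numpy as np


def change_detection_scores(predicted, reference):
    """Return (accuracy, f_measure) for two binary maps (1 = changed).

    Assumption: F-measure is the F1 score of the 'changed' class.
    """
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = int(np.sum(predicted & reference))    # changed, detected as changed
    tn = int(np.sum(~predicted & ~reference))  # unchanged, detected as unchanged
    fp = int(np.sum(predicted & ~reference))   # unchanged, detected as changed
    fn = int(np.sum(~predicted & reference))   # changed, missed
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0.0:
        return accuracy, 0.0
    return accuracy, 2.0 * precision * recall / (precision + recall)
```

Because changed pixels are typically a small minority of the image, the accuracy can be high while the F-measure remains moderate, so the two measures are best read together.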
We can notice (see table 2) that hierarchical Markovian estimators, like the SMAP and SCMAP are in fact comparable, in terms of efficiency (while being quite different in terms of hierarchical structures and algorithms) and clearly perform better than the classical non-hierarchical Markovian estimators such as the MAP or MPM. MAP (
Estimator | #1 | #2 | #3 | #4 | #5 | #6 | #7 | #8 | #9 | #10 | PCC Mean |
---|---|---|---|---|---|---|---|---|---|---|---|
ML | 92.6 | 87.4 | 81.9 | 79.7 | 80.4 | 85.9 | 86.0 | 90.3 | 86.8 | 85.8 | 85.680 |
MAP | 95.7 | 92.1 | 87.8 | 84.9 | 81.5 | 92.5 | 92.1 | 92.5 | 95.9 | 90.8 | 90.588 |
MPM | 95.8 | 92.2 | 87.8 | 84.9 | 81.5 | 92.5 | 92.1 | 92.6 | 95.9 | 90.9 | 90.629 |
SMAP | 96.5 | 92.8 | 88.3 | 85.9 | 82.3 | 93.7 | 94.4 | 95.2 | 97.0 | 94.8 | 92.095 |
SCMAP | 96.1 | 94.1 | 89.4 | 85.8 | 82.6 | 94.3 | 94.8 | 95.4 | 97.1 | 94.8 | |
F-measure | | | | | | | | | | | F-measure Mean |
ML | 0.40 | 0.38 | 0.32 | 0.21 | 0.33 | 0.20 | 0.28 | 0.34 | 0.25 | 0.55 | 0.3256 |
MAP | 0.66 | 0.56 | 0.48 | 0.29 | 0.40 | 0.32 | 0.42 | 0.46 | 0.59 | 0.72 | 0.4892 |
MPM | 0.66 | 0.56 | 0.48 | 0.29 | 0.40 | 0.32 | 0.42 | 0.46 | 0.58 | 0.72 | 0.4894 |
SMAP | 0.70 | 0.57 | 0.51 | 0.27 | 0.36 | 0.21 | 0.46 | 0.63 | 0.67 | 0.82 | |
SCMAP | 0.63 | 0.60 | 0.54 | 0.25 | 0.34 | 0.26 | 0.44 | 0.61 | 0.66 | 0.82 | 0.5158 |
Performance of the different single-scale (ML, MAP, MPM) or hierarchical (SMAP, SCMAP) Markovian estimators on the ten datasets. Top: percentage of correctly classified (changed and unchanged) pixels (PCC). Bottom: F-measure.
NIR Thermic/Optical [#1] | Accuracy | | | |
---|---|---|---|---|
| | | | |
Mignotte [41] | 0.928 | Mignotte [41] | 0.971 | |
Touati | 0.847 | Touati | 0.943 | |
Touati | 0.964 | Touati | 0.961 | |
Zhang | 0.975 | Touati | 0.955 | |
PCC [21] | 0.882 | Touati | 0.949 | |
Prendes | 0.844 | |||
Correlation [52] | 0.670 | |||
Mutual Inf. [52] | 0.580 | |||
| ||||
| | | | |
| ||||
| | | | |
Mignotte [41] | 0.909 | Mignotte [41] | 0.859 | |
Touati | 0.881 | Touati | 0.877 | |
Touati | 0.892 | Touati | 0.880 | |
Touati | 0.909 | Touati | 0.862 | |
Touati | 0.867 | Touati | 0.853 | |
Prendes | 0.918 | Prendes | 0.844 | |
Prendes | 0.854 | Correlation [52, 53] | 0.679 | |
Copulas [49, 51] | 0.760 | Mutual Inf. [52, 53] | 0.759 | |
Correlation [49, 51] | 0.688 | Pixel Dif. [53, 65] | 0.708 | |
| ||||
| | | | |
| ||||
| | | | |
Mignotte [41] | 0.830 | Mignotte [41] | 0.952 | |
Touati | 0.840 | |||
Chatelain | 0.749 | |||
Correlation [50] | 0.713 | |||
Ratio edge [50] | 0.737 | |||
| ||||
| | | | |
| ||||
| | | | |
Mignotte [41] | 0.940 | Mignotte [41] | 0.942 | |
Touati | 0.767 | Touati | 0.847 | |
Touati | 0.817 | |||
Liu | 0.976 | |||
| ||||
| | | | |
| ||||
| | | | |
Mignotte [41] | 0.979 | Mignotte [41] | 0.962 | |
Liu | 0.957–0.964 |
Percentage of good classification (accuracy) on the datasets described in table 1: comparison of the two hierarchical (SMAP or SCMAP) segmentation methods, based on our pixel-wise neighborhood-adaptive and class-conditional likelihoods, with the state-of-the-art multimodal change detectors existing in the literature (upper part of each table) and with the mono-modal change detectors (lower part of each table) [13, 14, 21, 25, 26, 30, 39, 41, 49–53, 55, 65].
Second, a comparison of the SMAP and SCMAP segmentation results with different state-of-the-art approaches [13, 30, 39, 49, 51] is summarized in table 3. From figures 2 and 3 and from this table, we can see that the accuracy rate of these two hierarchical Markovian methods outperforms, on average, the other state-of-the-art non-Markovian approaches and allows us to obtain good CD results across a wide variety of existing satellite imagery heterogeneities. We can also notice that a SCMAP or SMAP hierarchical approach combined with our likelihood model allows us to achieve a relatively constant efficiency whatever the type of multi-modality of satellite imagery encountered. It should be noted, however, that this approach tends to underestimate the
In this paper, we have shown that a data likelihood model relying on a set of spatial neighborhood-adaptive class-conditional likelihoods, whose parameters are estimated in a first estimation step, combined with a hierarchical Markovian segmentation procedure in a second step, turns out to be a simple, reliable and computationally efficient unsupervised statistical strategy for the change detection issue, whatever the type of multi-modality encountered in heterogeneous remote sensing imagery.
The author would like to thank the NSERC (Natural Sciences and Engineering Research Council of Canada; RGPIN 2016-04578) for having supported this research work.
The code, data, and all that is necessary for reproduction of the results will be freely accessible on the author’s website http://www.iro.umontreal.ca/~mignotte/ (directory: ResearchMaterial/).
The Author declares that there is no conflict of interest associated with this work and this publication.
© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 4.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.