Threshold values, pixel change ratio, omission/commission error, and the kappa coefficient of agreement, obtained from each detection method, input image, and threshold technique.

## Abstract

To detect changes in satellite imagery, a supervised change detection technique was applied to Landsat images from an area in the south of México. At first, the linear regression (LR) method using the first principal component (1-PC) data, the Chi-square transformation (CST) method using first three principal component (PC-3), and tasseled cap (TC) images were applied to obtain the continuous images of change. Then, the threshold was defined by statistical parameters, and histogram secant techniques to categorize as change or unchanged the pixels. A threshold optimization iterative algorithm is proposed, based on the ground truth data and assessing the accuracy of a range of threshold values through the corresponding Kappa coefficient of concordance. Finally, to evaluate the change detection accuracy of conventional methods and the threshold optimization algorithm, 90 polygons (15,543 pixels) were sampled, categorized as real change/unchanged zones, and defined as ground truth, from the interpretation of color aerial photo slides aided by the land cover maps to obtain the omission/commission errors and the Kappa coefficient of agreement. The results show that the threshold optimization is a suitable approach that can be applied for change detection analysis.

### Keywords

- Landsat
- threshold
- histogram
- change detection
- optimization
- algorithm

## 1. Introduction

The change detection is the process of identifying differences between the state of specific characteristics of a phenomenon or elements, by comparing its spatial representation at two points in time [1–3], controlling all variances caused by the differences in variables that are not of interest, and measuring changes caused by differences in the variables of interest.

The change detection applied to assess the transformation of the space over the time can be a suitable alternative to support the territory management policies and in the proposal of appropriate actions of urban planning, natural resource management, disaster assessment, and risk management that are required to cope with the impacts of global change that are recorded in several regions of the world and are rapidly transforming the landscapes [3–6].

To evaluate and address suitably the effect of changes in land coverage and land use, several studies and models of change detection based on multispectral image data analysis have been carried out such as principal component analysis (PCA), image subtraction, image ratioing, change vector analysis, image regression, and spectral features variance, which are the usual methods of the change detection assessment [7–11].

Other nonconventional studies introduce transformations as input data in change detection to process, such as the use of the pseudo-cross variogram as a statistical function to quantify the changes [4], the multivariate alteration detection, based on the established canonical correlations analysis, introduced by [12], or the Chi-square transformation, based on the Mahalanobis distance between pixels of images at different dates, submitted by Ridd and Liu [13].

The first stage of any change detection process results in a (continuous) change image, in which the pixel value corresponds to the intensity of the change calculated according to the method employed. However, to complete the process of detecting changes, it is necessary the generation of a binary map that categorizes the continuous value of the pixels only into two classes: change and no change. Thus, a segmentation of the continuous change image is required, according to a value assigned as a threshold, and thus, the dynamic areas can be distinguished from those that remained stable during the period analyzed [10, 14].

Considering that only a few pixels change over a period, the classic method for defining the threshold value needed to categorize the changed and unchanged pixels is based on the fact that the density function of the difference image can be considered almost equal to the density function of unmodified pixels [15]. Thus, the unchanged pixels are distributed around the mean (*μ*), while the changes are spread over the tails of the distribution [1–16], separated from the mean a certain number of standard deviations of the distribution.

Typically, the definition of the threshold values is directly based on a manual process of trial-and-error or by using the statistical parameters (mean and standard deviation) of the difference image interactively, adjusting the number of standard deviations and evaluating the results until the sets of accuracy and error are satisfied [7, 10, 17]. In practice, the determination of the thresholds is highly subjective, based on the analyst’s abilities and the known details of the study area [18].

There are also other studies where automatic processes define the threshold value. Hervás and Rosin [19] proposed an automatic threshold definition through the intensity value of the pixel in the difference image, which corresponds to the point in the histogram (usually unimodal) of maximum distance to the secant line between the highest and lowest points of the distribution function [20].

In the first stage of this work, two conventional methods of detection of changes are applied: linear regression (LR) and Chi-square transformation (CST); followed by the thresholding definition by statistical parameters and by the histogram secant technique, in order to categorize the pixels as changes or no change and thus generate the final thematic map for each method.

In a second stage, an optimization algorithm was designed, aimed to define the value of an optimal threshold that maximizes the pixel hits and minimizes the fails ones, and based on the ground truth data of the studied area. This stage was developed for each of the continuous images obtained from the conventional methods analyzed, and also, a final (optimal) thematic change map was obtained.

Finally, the accuracy of each of the final thematic maps obtained from the traditional methods and the threshold optimization process was assessed, validating the results with the collected ground truth information through confusion matrices.

The Dinamica EGO software1 [21, 22] was used to develop the described models and processes of this work.

## 2. Materials and methods

### 2.1. Study area

The studied area consists of a group of basins in the central region of the Guerrero state in México, around the largest gold mine in Latin America “Filos-Bermejal” [23] covering approximately 4800 km^{2} ( Figure 1 ), of which 43.8% is covered by oak, pine, and mesophyll forest, 39.3% is deciduous forest, 8.2% is agricultural-livestock land use, and 8.7% is covered by grassland, bare soil, water, and urban.

This area is interesting because the concentration of diverse factors that lead to an important dynamics of landscape transformation, such as the pressure on vegetation areas by urban and agricultural growth, there are deforestation areas by industrial and mining activity and also areas of natural hazards.

### 2.2. Dataset

Two Landsat surface reflectance (SR) images (Path 26, Row 48, WRS-2) were used. This processed product is courtesy of the U.S. Geological Survey Earth Resources Observation and Science Centre and is offered as climate data records [24, 25]. The images correspond to the dates and sensors: February 24, 2011 for Landsat 5-TM and February 16, 2014 for Landsat 8-OLI.

In the performed processes, the thermal bands were excluded. Thus, the bands used were Blue, Green, Red, Near-Infrared (NIR), Shortwave Infrared 1 (SWIR-1), and Shortwave Infrared 2 (SWIR-2), according to the official nomenclature [26].

The SR data are produced by the specialized software *Landsat Ecosystem Disturbance Adaptive Processing System* for Landsat 5-TM and *L8SR* for Landsat 8-OLI, developed by the Making Earth System Data Records for Use in Research Environments of the Goddard Space Flight Center of NASA and the University of Maryland [27].

The software applies the MODIS atmospheric correction routines. In addition to the Landsat data, the water vapor, ozone, geopotential height, aerosol optical thickness, and digital elevation data are presented to the radiative transfer model *Second Simulation of a Satellite Signal in the Solar Spectrum* (6S) to generate the top of atmosphere reflectance, SR, brightness temperature and cloud masks, cloud shadows, and adjacent clouds, land, and water.

### 2.3. Topographic correction and image pre-processing

An overview of the stages performed in this study can be seen in Figure 2 .

To prevent false results in the change detection analysis, it is recommended to perform a precise coregistration, radiometric calibration, atmospheric, and topographic corrections between the multitemporal images [28, 29]. Almost all of these processes are already applied in the elaborated product used (Landsat SR images).

Thus, to reduce the effect that the changes of slopes, terrain orientation, and solar geometry at the time of the image acquisition cause on the reflectance values, the topographic correction was performed by Sun Canopy Sensor + Correction (SCS + C) method [30], which is more appropriate for forest mountain areas than other ground-based methods, because it preserves the geotropic nature of the trees (growth normal to geoid) [31].

The C parameter used in the topographic correction aims to improve the correction by moderating the overcorrection of dimly lit pixels [32] and was determined by linear regressions between the illumination and reflectance values, according to a classification of the topographical slopes of the studied area [14, 33]. Thus, six NR bands for each image date were obtained. A true-color composite of the normalized bands (RGB) is shown in Figure 3 .

Images of brightness, greenness, and wetness indices (tasseled cap-TC) [34–36] and principal components (PC) [37] were generated, to be processed as input data in the change detection analysis to differentiate the results.

### 2.4. Change detection methods

#### 2.4.1. Linear regression (LR)

The LR method for change detection assumes that the pixel values (*Y*) of the final date image *f* _{2} results from a linear function of the pixel values (*X*) of the initial date image *f* _{1}. Thus, it is possible to perform a regression from *m* and ordinate *b* regression line parameters to find an equation of the form *Y* ^{′} = *mX* + *b* to model it.

Applying the RL model to the *f* _{1} image, a new image *f* _{2} image, it is possible to obtain a continuous image of the residual values calculated by:

where

If no changes are recorded (Residual = 0), then the pixel values obtained by the prediction model in the expected image will be the same as that of the real pixel values in the final date image *f* _{2}. On the other hand, the founded differences registered in the corresponding residual value image will indicate a change, and its magnitude will indicate the intensity.

The first PC images (PC-1) of each date were used as input data in the performance of the LR detection method, obtaining a single image of continuous change.

#### 2.4.2. Chi-square transformation (CST)

The CST is a statistical technique that is applied to obtain a measure of divergence or distance between groups regarding multiple characteristics. The most used measure is *Mahalanobis distance* (Md) [39] and plays a fundamental role in the data analysis with multiple measurements, finding applications on statistical patterns recognition in archeology, medical diagnosis, or remote sensing [40].

In remote sensing, the Chi-square transformation is applied to recognize patterns in a single global image, starting from the multivariate information stored in numerous multispectral bands [20].

To distinguish how much each image has changed over the time, an additional image of the relative difference of each pixel (*P*) compared to its mean (*μ*) was calculated by (*P* − *μ*)/*μ*, applied to each kind of images and dates [20]. For this task, a cloud mask was considered in the calculation of the mean not to affect the final values of relative differences.

For each pair of corresponding images acquired on different dates, a simple subtraction pixel-by-pixel was performed, aimed at obtaining a third image that stores the numerical differences. Thus, three differences images have been achieved for each kind of data: Dif-PC_{1}, Dif-PC_{2,} and Dif-PC_{3} and Dif-TC_{B}, Dif-TC_{G}, and Dif-TC_{W}.

According to Ridd and Liu [13], the CST detection method was applied to each group of difference images (Dif-PC and Dif-TC) to obtain a single global image that stores the continuous change, represented by the square Md [39, 40], determined by the equation:

where *Y* is the square root of Md for each pixel in the change image, *X* is the difference vector of the *n* values between the two dates, *M* is the vector of the mean residuals of each band, *T* is the transpose of the matrix (*X* − *M*), and (*P* − 1) is the inverse covariance matrix of the bands between the two dates.

The Md takes into account the variance of each variable and the covariance between variables. Geometrically, this is done by transforming the data into standardized uncorrelated data and computing the ordinary Euclidean distance for the transformed data, thus providing a way to measure distances that take into account the scale of the data (standard deviation) [20, 41].

The Md performs a Chi-square distribution with degrees of freedom equal to the number of bands used in the transformation model. Thus, it is possible to compare the distances just as the data would be compared to a normal distribution, when the number of bands has a multivariate normal distribution with the covariance matrix Σ and the vector of the mean residuals *M* [13, 20, 42].

The square root of the Md (*Y*) can be represented as a single image, and the pixel values represent the magnitudes of change between the analyzed dates. Thus, theoretically, a zero pixel value means an absolute no change.

The CST was applied to the first 3-PC images and the TC images between the dates of study. Thus, two single continuous images of change were obtained.

### 2.5. Threshold definition

Any of the methods explained previously results in a continuous change image, where the pixel values correspond to the intensity of the change estimated according to the method employed. However, to complete the process, it is necessary to generate a categorical binary map that isolates the pixels into change and no change classes. Thus, it is a need to segment the continuous change image according to a value assigned as a threshold, and in this way, the dynamic areas can be distinguished from those that were stable during the analyzed period.

#### 2.5.1. Statistical method

From the continuous change image, the statistical parameters of the mean (*μ*) and the standard deviation (*σ*) are used to calculate the threshold by *μ* ± *nσ*.

According to D’Addabbo et al. [16], the density function of the continuous change image is almost equal to the density function of the unmodified pixels, and in the determination of the threshold statistically fixed, *n* is an empirical parameter set by the user that can be adjusted.

In this stage, the *n* value was set equal to 2. Thus, the *nσ* factor was obtained first, then added to, and subtracted from the mean value of a continuous image obtained by the LR method (with thresholds in both tails of the distribution, as can be seen in Figure 3 ). On the other hand, the *nσ* factor was only added to the average value of the continuous image obtained by the CST method (with a single threshold in the right tail of the distribution).

#### 2.5.2. Secant method

The threshold is automatically defined by selecting the pixel value corresponding to the point in the distribution of the histogram, where the maximum perpendicular line intersects the secant line between the highest and lowest points of the histogram [19]. According to the distribution of the continuous change image, this procedure was performed just in the right tail for the CST detection method and in both tails for the LR detection method ( Figure 4 ).

### 2.6. Accuracy assessment

To assess the accuracy of the change detection process, a sampling of 90 polygons was identified as ground truth, and their pixels were categorized as change and unchanged. Through the interpretation of color, aerial photo slides near to the study dates, aided by the land cover, land use, and vegetation maps, 7810 pixels were defined as changed and 7733 ones were defined as unchanged. This ground truth was compared with the final thematic maps obtained from each detection method, and each thresholding technique applied through confusion matrices to get the omission and commission errors.

For each final thematic map, the Kappa concordance coefficient of agreement [43] was obtained to quantify the difference between the observed map-reality agreement and the one that would be expected simply by random. The Kappa index attempts to define the degree of adjustment due only to the accuracy of the categorization, regardless of random factors causes [10, 44]. The Kappa coefficient was calculated by:

where *k* is the Kappa coefficient of agreement, *n* is the sample size, *Xii *is the observed agreement, and *X* _{i+} *X* _{+i }is the expected agreement in each category *i*.

The Kappa coefficient allows knowing if the marked degree of agreement draws away or is not significantly different from the expected random agreement. The agreement observed highlights on the diagonal of the confusion matrix, and the expected agreement is used to calculate the fit between the map and the reality, due to randomness [10].

### 2.7. Threshold optimization

The optimization algorithm proposed is based on the Kappa concordance coefficient of agreement, considering the threshold value that reports the highest Kappa index corresponds to the one that also will report the maximum numbers of pixels hit and the minimum numbers of pixels fail.

The algorithm is an automatic, but also a supervised process, unlike previous methods explained that are automatic unsupervised. Thus, it is necessary to have real information as field data about the changed and unchanged areas that will be used as ground truth (the same data to assess the accuracy) to train and evaluate the optimization iteratively.

It is also necessary for an initial threshold value, to start the iterative process, which can be any of the automatically threshold determined by the previously explained methods (statistical or secant).

Figure 6 shows the flowchart of the optimization process.

For the CST detection method (using 3-PC and TC as the input images), the threshold optimization process was applied following the scheme as it is shown in Figure 5 ; starting from the value of the threshold established by the statistical method to set the range and develop 200 iterations in the whole process.

For the continuous change image obtained by the LR detection method (using PC-1 as the input image), the threshold optimization process was applied in two stages, first looking for the optimal value of the right threshold by establishing the left one equal to the value calculated by the statistical method. Then, the optimum value of the left threshold was sought by setting as the right limit equal to the optimal value previously founded.

For each method, two maps (change/no change) were obtained, and their accuracy was assessed by confusion matrices and the concordance index (Kappa). Then, the proposed algorithm was performed to obtain an optimal threshold by successive iterations to first increasing/decreasing the last threshold value and then calculate the corresponding Kappa and comparing with previous results to stop the process and finally define the highest Kappa linked to the optimal threshold.

For each continuous change, image generated by each detection method and kind of input image processed, an optimal final binary change/no change map was obtained, which was compared, analyzed, and discussed.

## 3. Results and discussion

### 3.1. Change detection analysis

The threshold definition processes performed to the histograms of the continuous change image arising from the LR detection method using PC-1 as input images are shown in Figure 7 . The values raised from the thresholding (statistical and secant) processes are also illustrated.

The observed ranges of the distribution ( Figure 7 ) show the changed pixels (at the ends of both tails) for each of the threshold methods applied and also the unchanged pixels remaining in the center of the distribution (around the mean).

The definition of the thresholds values varies according to the applied method. The threshold values in the left tail are similar in both approaches; however, in the right tail, the statistical method reports 0.0818 as a threshold, while the secant method reports 0.0693. This fact affects the number of pixels considered as changed and their location in the final thematic change maps, making it more or less intense visually.

The histograms of the continuous change images resulting from the CST detection method using the 3-PC and TC as input images, as well as the values arising from the thresholding process are shown in Appendix A.

### 3.2. Optimization process

The optimization algorithm was applied to each of the three detection methods considered in the study. Figure 8 shows the values of the coefficient of concordance Kappa reached by each threshold valued analyzed for each of the detection method and threshold definition.

The algorithm determines the Kappa index for each of the threshold values within the analyzed range. According to the patterns depicted, it can be seen how the value of the tested threshold increases, the successes increase, and errors decrease, which is reflected in a better Kappa index. This performance occurs until a certain limit (maximum of the curve); from this point, the pattern change, now the successes decrease, the errors increase, and the Kappa index decreases. Then, based on the pattern depicted for each detection method, the algorithm determines the highest Kappa index and matches it to its corresponding threshold, resulting as the optimal one.

### 3.3. Thematic maps of change

From the threshold values determined for each detection method applied, thematic change maps were generated. Figure 9 shows the maps obtained by the LR detection method and each thresholding determined.

The resulting thematic maps seem similar, detecting changes around the same places regardless of the detection method, the nature of the input images used, and the thresholding technique. At a glance, the resulting maps show the agreement of the analyzed approaches in detecting visible changes caused by specific phenomena or events occurred between the dates ( Figure 9a ).

The resulting maps are more or less visually intense, depending on whether more or fewer pixels are detected as change. According to the kind of data used as input images and how it is analyzed by the detection method, result in more or less homogeneous areas represented in maps for certain kinds of land cover use.

More pixels can be detected as a change in the river zones when 3-PC images were used ( Figure 9 ( d – f )) may be due to the presence of elements (such as sediments in water) that this kind of images detects better than the other used images. In forest areas, more changes are detected when TC images were used ( Figure 9 ( g – i ))]. As the previous case may be, the nature of the data is more suitable for detect shifts in this type of cover.

The optimal-threshold map obtained by the LR method seems to detect more pixels as a change in the same areas as the statistic and secant maps, except in the cloud zone where a difference is observed in the interpretation as a change between the cloud and its shadow (according to Figure 3 ).

In the same way, the optimal-threshold map obtained by the CST using 3-PC images seems more visually intense than the statistic and secant maps in all areas, including the cloud zone. While the optimal-threshold map from the CST using TC images looks very similar to the statistical and secant maps.

### 3.4. Accuracy assessment

To assess the accuracy of the resulting thematic change maps, confusion matrices were performed to obtain the omission and commission errors and the Kappa coefficient of agreement ( Table 1 ).

Detection method | Input image | Threshold definition | Threshold value | Pixels detected as change (%) | Mean omission error | Mean commission error | Kappa |
---|---|---|---|---|---|---|---|

LR | PC-1 | Statistic | <−0.0818> 0.0818 | 4.40 | 4.80 | 4.70 | 90.31 |

LR | PC-1 | Secant | <−0.0831>0.0693 | 4.89 | 5.00 | 4.90 | 90.07 |

LR | PC-1 | Optimal | <-0.0482> 0.1942 | 7.37 | 3.10 | 3.10 | 93.78 |

CST | 3-PC | Statistic | >3.4969 | 3.64 | 6.00 | 5.80 | 87.97 |

CST | 3-PC | Secant | >3.5882 | 3.41 | 6.30 | 6.10 | 87.29 |

CST | 3-PC | Optimal | >3.0248 | 5.44 | 5.00 | 5.00 | 90.04 |

CST | TC | Statistic | >3.4725 | 3.61 | 5.80 | 5.80 | 88.29 |

CST | TC | Secant | >3.1137 | 4.76 | 5.60 | 5.60 | 88.75 |

CST | TC | Optimal | >3.1773 | 4.53 | 5.50 | 5.50 | 89.00 |

According to the ground truth validation, all the analyzed methods reach at least 85% of accuracy in isolating the changed from the unchanged areas. According to the Kappa coefficient of agreement ( Table 1 ), it can be seen that the LR method results in an improvement in the change detection process in comparison to the CST method.

The CST method reports very similar Kappa coefficients, regarding the use of 3-PC or TC as input images in the detection of general changes. However, as was mentioned before, one kind of images could have better results than another, for the detection of specific changes and specific land covers.

According to the Kappa coefficient, in all cases, the optimal-threshold thematic change map reports an improvement compared with the accuracy by statistic and secant thresholds. Over the mean of statistic and secant coefficients, the improvement is 3.6% for the LR method, 1.6% for the CST method using 3-CP images, and 0.3% for the CST method using TC images ( Table 1 ).

From all the analyzed approaches, the LR detection method using 1-PC as input images and the threshold value determined by the optimization algorithm combine the highest Kappa coefficient and the lowest proportion of omission–commission errors; thus it is the most accurate in isolating the changed from the unchanged areas, according to the ground truth validation defined.

On the other hand, the CST detection method using TC as input images and the threshold value determined by statistic parameters is the least accurate of analyzed approaches.

## 4. Conclusions

In the change detection analysis, the threshold value designation is a major step that defines the overall capacity of the method employed. A suitable threshold maximizes its ability to discriminate the dynamic from the stable areas; but even other impacts such as those caused by random factors occurring between the periods analyzed could also be detected like differences in illumination, atmospheric conditions, soil moisture, sensor calibration, or information recording.

The statistical threshold definition is suitable when the pixel ratio of change in the studied area is less than ≈4.6%, by the threshold definition based on the statistical parameters. On the other hand, the threshold optimization algorithm works with the information of the sample data of the ground truth and thus has no limitation of the pixel ratio of change resulting, as can be seen in Table 1 .

According to the obtained results, it can be concluded that the proposed threshold optimization algorithm is a suitable approach in the change detection analysis, since it reports substantial improvements compared with the other conventional detection methods developed.

The proposed threshold optimization is a supervised approach, and as such, it necessarily requires ground truth information to be performed which could be a limitation, unlike the conventional methods applied which are not supervised and fully automatic in the definition of thresholds.

Although this studio focuses on Landsat imagery, the threshold optimization can be applied to any satellite imagery, any study area, and any data information contained in the images to compare.

The threshold optimization could even be of interest in areas other than satellite imagery, such as medical imaging where images of the human body are used in medical procedures that seek to reveal, diagnose, or examine diseases.

## Appendix

## Notes

- Dinamica EGO consists of a sophisticated platform for environmental modeling with outstanding possibilities for the design from the very simple static spatial model to very complex dynamic ones, which can ultimately involve nested iterations, multi-transitions, dynamic feedbacks, multi-region and multi-scale approach, decision processes for bifurcating and joining execution pipelines, and a series of complex spatial algorithms for the analysis and simulation of space-time phenomena [22].