Sum of precision and recall at each scale for area 1. Bold values indicate the optimal segmentation scale in different categories. Italic values indicate the overall optimal segmentation scale for all categories.
With the recent developments in the acquisition of images using drone systems, object-based image analysis (OBIA) is widely applied to such high-resolution images. Therefore, it is expected that the application of drone survey images would benefit from studying the uncertainty of OBIA. The most important source of uncertainty is image segmentation, which could significantly affect the accuracy at each stage of OBIA. Therefore, the trans-scale sensitivity of several spatial autocorrelation measures optimizing the segmentation was investigated, including the intrasegment variance of the regions, Moran’s I autocorrelation index, and Geary’s C autocorrelation index. Subsequently, a top-down decomposition scheme was presented to optimize the segmented objects derived from multiresolution segmentation (MRS), and its potential was examined using a drone survey image. The experimental results demonstrate that the proposed strategy is able to effectively improve the segmentation of drone survey images of urban areas or highly consistent areas.
- high-resolution image
- Moran’s I
- Geary’s C
Low-altitude drone imaging is widely used in mapping, land cover/land use monitoring, and resource and environment monitoring, and various low-altitude drone data processing and analysis models have been established [1, 2, 3, 4]. As drones are flexible and offer customizable temporal resolution and high spatial resolution, they have attracted much attention from researchers and manufacturers. Drone-based remote sensing has already been applied to the management and monitoring of forest resources, vegetation and river monitoring, monitoring of archeological sites, management of natural disasters and seismic monitoring, precision farming, and other fields. This wide application is mainly owing to recent breakthroughs in drone-based data acquisition technology, as well as innovations and technological improvements in the remote sensing field [10, 11, 12, 13]. The information extraction studies mentioned above exploit the high spatial resolution of drone imagery and employ object-based image analysis (OBIA) technology. Therefore, studying the uncertainty of OBIA in drone-based image processing is important for the application of drone-based high-resolution imaging.
Segmentation is a prerequisite for OBIA, and the segmentation scale is an important factor that affects nearly every stage of OBIA. Multiresolution segmentation (MRS) has been shown to be one of the more successful segmentation algorithms in OBIA [14, 15]. This algorithm is complex and places high demands on the user; its main parameters are the scale, shape, and compactness, all of which are user-customizable. However, many studies have shown that the scale is the most important parameter, as it controls the size of the objects after segmentation and can directly affect the subsequent classification [16, 17, 18, 19, 20]. Scale has therefore become a prominent problem in OBIA, particularly in OBIA research based on MRS. Arbiol et al. also pointed out that semantically significant regions are found at different scales, which makes it important to identify appropriate segmentation scales and obtain optimized segmentation results. However, many studies on the extraction of specific terrain features relied on repeated experiments, and the scale parameters were determined according to experience. Evidently, such trial-and-error selection is subjective and hard to reproduce, and therefore, many researchers have proposed methods to determine the optimal scale parameter [20, 23, 24, 25, 26, 27].
Therefore, this chapter focuses on discussing the uncertainty of multiscale segmentation and testing the sensitivity of the evaluation indicators in different segmentation results. Furthermore, the quality of the segmentation results from different scales will be verified in order to propose a strategy to improve the quality of multiscale segmentation. Firstly, the internal consistency of the segmentation object (area-weighted average variance) and the spatial autocorrelation indicators of the object (Moran’s I and Geary’s C) under different segmentation results were evaluated and measured. Subsequently, based on the consistency and autocorrelation indicators, a top-down object decomposition protocol was proposed so that the segmentation objects can coincide with objects in different terrains. Lastly, an area-based method was used to calculate the precision and recall indicators to evaluate the quality of the multiscale segmentation results. In addition, the optimized segmentation results in the proposed method were verified. This contributed to the high-efficiency processing of data generated by drone-based remote sensing.
2. Study site and data
In August 2011, we used a fixed-wing drone equipped with a Canon EOS 5D Mark II digital camera to collect raw image data over a total of 400 km² of built-up areas and suburbs in Deyang city, with end and side overlaps of 80 and 60%, respectively, at an average flight altitude of 750 m. The size of a single image was 5616 × 3744 pixels, and the spatial resolution was 0.2 m. The actual coverage area of each image was 1123 × 748 m. The focal length of the camera was 24.5988 mm, and the sensor pixel size was 0.0064 mm. After the field images were acquired, field control points were collected, with one control point per flight strip interval. Within a flight strip, there were generally 3–5 photographs per control point. Subsequently, digital photogrammetry was used to produce a 0.2-m resolution digital orthophoto map (DOM), which was divided into 500 × 500 m standard map sheets.
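As a quick consistency check on the acquisition parameters, the ground sample distance of a nadir image over flat terrain can be estimated as flight altitude times pixel pitch divided by focal length. The sketch below is illustrative only: the function name is hypothetical, the focal length is taken in millimetres, and the pixel pitch is an assumption derived from a 36 mm full-frame sensor width spread over 5616 pixels (about 0.0064 mm), which reproduces the stated resolution of roughly 0.2 m.

```python
def ground_sample_distance(altitude_m, focal_length_mm, pixel_pitch_mm):
    """Ground size of one pixel (in metres) for a nadir image over flat terrain."""
    return altitude_m * pixel_pitch_mm / focal_length_mm


# Assumption: pixel pitch = 36 mm sensor width / 5616 pixels (about 0.0064 mm).
pixel_pitch_mm = 36.0 / 5616
gsd = ground_sample_distance(750.0, 24.5988, pixel_pitch_mm)

# Image footprint on the ground at the nominal 0.2 m resolution.
footprint_m = (5616 * 0.2, 3744 * 0.2)

print(f"GSD: {gsd:.3f} m, footprint: {footprint_m[0]:.0f} x {footprint_m[1]:.0f} m")
```

The estimate lands close to the reported 0.2 m, and the footprint matches the reported coverage of about 1123 × 748 m per image.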
Two standard drone DOM sheets (500 × 500 m) were selected for the study: area 1 and area 2 (Figure 1). The land cover proportions of the two experimental images differ: area 1 comprises cultivated land (38%), forests (43%), buildings (6%), bare land (5%), and roads (2%), whereas area 2 comprises cultivated land (45%), forests (37%), buildings (4%), water bodies (5%), and roads (1%).
3. Multiscale segmentation
Multiscale segmentation is one of the most popular remote sensing segmentation algorithms and has been widely used in practical applications [22, 28, 29]. It is a region-merging technique: starting from the pixel layer, image objects are merged bottom-up, layer by layer, into larger image objects to produce segmentation results at different segmentation scales. The goal is to keep the average spectral heterogeneity of all image objects in the layer as small as possible after merging. To achieve this aim, each single merging step must minimize the heterogeneity of the two adjacent objects involved, such that the increase in the heterogeneity of the merged object relative to the area-weighted heterogeneity of the two original objects is minimal.
In summary, the segmentation parameters mainly include the scale parameter and two groups of weighting parameters (color/shape and smoothness/compactness).
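The merging criterion can be illustrated with a small single-band sketch. It is not the chapter's implementation: it assumes the commonly cited formulation in which the merge cost is the increase in size-weighted standard deviation, and in which a merge is allowed only while that cost stays below the square of the scale parameter. The shape term is omitted, and all function names are hypothetical.

```python
import math


def std(values):
    """Population standard deviation of a list of pixel values."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def spectral_merge_cost(pixels_a, pixels_b):
    """Increase in size-weighted standard deviation caused by merging two objects."""
    merged = pixels_a + pixels_b
    before = len(pixels_a) * std(pixels_a) + len(pixels_b) * std(pixels_b)
    return len(merged) * std(merged) - before


def may_merge(pixels_a, pixels_b, scale):
    # Assumption: a merge is accepted while its cost is below scale squared.
    return spectral_merge_cost(pixels_a, pixels_b) < scale ** 2


field_a = [10, 11, 10, 11]
field_b = [10, 11, 11, 10]   # spectrally similar neighbour
roof = [50, 51, 50, 51]      # spectrally different neighbour

cost_similar = spectral_merge_cost(field_a, field_b)
cost_different = spectral_merge_cost(field_a, roof)
print(cost_similar, cost_different)
```

Under these assumptions, a larger scale parameter tolerates costlier merges, which is why coarser scales yield fewer and larger objects.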
4. Evaluation indicators of segmentation results
4.1. Measurement of internal consistency
Multiscale segmentation is essentially a technique based on region merging/growing; this type of method is usually sensitive to the threshold values of the merging conditions, and manually determined threshold values generally have errors. Therefore, we first measured the sensitivity of different indicators to the segmentation results and focused on two types of indicators (internal consistency of objects and heterogeneity between objects). The best segmentation result should have maximum internal consistency and maximum heterogeneity between objects (low spatial autocorrelation). Currently, in order to evaluate the consistency of objects in the segmentation results, many studies on scale optimization have focused on an area-weighted average variance or local variance, given by the following formula:

$$v = \frac{\sum_{i} a_i v_i}{\sum_{i} a_i}$$
Here, $v_i$ represents the variance of the ith segmentation object and $a_i$ represents its area. The result v is the area-weighted average internal variance of all the segmentation objects. The smaller the value of v, the stronger the internal consistency of the objects; the larger the value, the greater the variation within the segmentation objects.
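A minimal sketch of the area-weighted average variance, using the per-object variances and areas defined above. The function name and the two-object example values are hypothetical.

```python
def area_weighted_variance(areas, variances):
    """Area-weighted average of per-object variances (v in the text)."""
    total_area = sum(areas)
    return sum(a * v for a, v in zip(areas, variances)) / total_area


# Hypothetical example: a small homogeneous object and a larger, more variable one.
areas = [100, 300]
variances = [4.0, 8.0]
v = area_weighted_variance(areas, variances)
print(v)
```

Because the weights are areas, large objects dominate the result, so a few big heterogeneous objects raise v more than many small ones.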
4.2. Object spatial autocorrelation
Generally, the best segmentation results result in the largest difference among objects, such that objects can be better differentiated and heterogeneity indicators reflect this difference. In order to evaluate the heterogeneity between segmentation objects, we tested two heterogeneity indicators, including the Moran’s I and reverse Geary’s C indices. Moran’s I is widely used in current research [23, 32] and tends to indicate global heterogeneity. Geary’s C index is less commonly used and tends to represent local heterogeneity.
(1) Moran’s I index
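$$I = \frac{n \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\left(\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\right)\sum_{i=1}^{n}(y_i - \bar{y})^2}$$

where n represents the number of segmentation objects, $y_i$ represents the average grayscale value of all the pixels in the ith segmentation object $R_i$, $\bar{y}$ represents the overall mean grayscale value, and $w_{ij}$ is the spatial weight between objects $R_i$ and $R_j$ (typically 1 if the two objects are adjacent and 0 otherwise).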
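For illustration, Moran's I over segmentation objects can be sketched as follows. The sketch assumes binary adjacency weights (1 for neighbouring objects, 0 otherwise) and uses the mean of the object values as the overall mean; the function name and the four-object chain are hypothetical.

```python
def morans_i(values, neighbors):
    """Moran's I for object mean values.

    values: list of mean gray values, one per object.
    neighbors: dict mapping object index -> set of adjacent object indices
               (binary weights, listed symmetrically).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [y - mean for y in values]
    cross = sum(dev[i] * dev[j] for i in neighbors for j in neighbors[i])
    weight_sum = sum(len(adjacent) for adjacent in neighbors.values())
    return n * cross / (weight_sum * sum(d * d for d in dev))


# Four objects in a row: 0-1-2-3.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
smooth = morans_i([1, 2, 3, 4], chain)        # neighbours are similar
alternating = morans_i([1, 4, 1, 4], chain)   # neighbours are dissimilar
print(smooth, alternating)
```

Similar neighbours give a positive value and dissimilar neighbours a negative one, which matches the interpretation that over-segmentation produces high autocorrelation between adjacent objects.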
(2) Reverse Geary’s C index
The range of Geary’s C index values is [0, 2], where a value of 1 indicates no spatial autocorrelation; values less than 1 indicate positive spatial autocorrelation, and the smaller the value, the stronger the correlation. Correspondingly, values greater than 1 indicate negative spatial autocorrelation. Therefore, it is not difficult to see that the Geary’s C and Moran’s I indices are essentially negatively correlated. In order for Geary’s C to be consistent with Moran’s I, Geary’s C was converted here into the reverse Geary’s C (C), such that C is equal to one minus Geary’s C, given as follows:

$$C = 1 - \frac{(N-1)\sum_{i}\sum_{j} w_{ij}\,(X_i - X_j)^2}{2\left(\sum_{i}\sum_{j} w_{ij}\right)\sum_{i}(X_i - \bar{X})^2}$$

where N represents the total number of segmentation objects indexed by i and j in the calculation, X represents the characteristic variable in the calculation, $\bar{X}$ is its mean, and $w_{ij}$ is the spatial weight between objects i and j.
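The reverse Geary's C can be sketched in the same way as one minus the standard Geary's C. As before, binary adjacency weights are an assumption, and the function names and example values are hypothetical.

```python
def gearys_c(values, neighbors):
    """Standard Geary's C with binary adjacency weights."""
    n = len(values)
    mean = sum(values) / n
    sq_diff = sum((values[i] - values[j]) ** 2
                  for i in neighbors for j in neighbors[i])
    weight_sum = sum(len(adjacent) for adjacent in neighbors.values())
    denom = 2 * weight_sum * sum((x - mean) ** 2 for x in values)
    return (n - 1) * sq_diff / denom


def reverse_gearys_c(values, neighbors):
    """One minus Geary's C, so that larger values mean stronger positive autocorrelation."""
    return 1.0 - gearys_c(values, neighbors)


# Four objects in a row: 0-1-2-3.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
rc_smooth = reverse_gearys_c([1, 2, 3, 4], chain)
rc_alternating = reverse_gearys_c([1, 4, 1, 4], chain)
print(rc_smooth, rc_alternating)
```

After the reversal, the index moves in the same direction as Moran's I: positive for similar neighbours, negative for dissimilar ones.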
4.3. Combined analysis of indicators
As consistency and spatial autocorrelation evaluate the segmentation results from different angles, this section further tests the combined results of both indicators. In order for the consistency and autocorrelation measurements to be comparable, both were first normalized, and the normalized values were then summed.
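One way to make the two measures comparable is min-max rescaling to [0, 1] before summation. The sketch below assumes that form of normalization, which the text does not specify; the function names and the indicator values per scale are made up for illustration.

```python
def min_max(values):
    """Rescale values linearly to [0, 1]; assumes they are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def combined_score(variances, autocorrelations):
    """Sum of normalized variance and normalized autocorrelation per scale."""
    return [v + a for v, a in zip(min_max(variances), min_max(autocorrelations))]


# Hypothetical indicator values at four segmentation scales.
scales = [50, 100, 150, 200]
variance = [10.0, 14.0, 20.0, 30.0]   # rises with scale
morans = [0.8, 0.5, 0.3, 0.2]         # falls with scale

scores = combined_score(variance, morans)
best_scale = scales[scores.index(min(scores))]
print(scores, best_scale)
```

Because one indicator rises and the other falls with scale, their normalized sum can show a minimum at an intermediate scale, although a clear minimum is not guaranteed for every image.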
5. Top-down object decomposition
The optimal segmentation scales of differently sized objects are different, and the scales obtained from the individual indicators above are each only a single optimized scale for the whole image; therefore, the segmentation objects have further potential for optimization. Here, the above three indicators were used as a reference, the global indicators were replaced by local spatial autocorrelation indicators, and a top-down object decomposition strategy was proposed to optimize the segmentation objects in different terrain types. The specific steps were as follows. First, segmentation was carried out at different scales, such as 10–300 with a step length of 10. Then, starting from scale 300, for each object at scale 300 the set O of objects at scale 290 that it contains was identified. If the absolute value of the C indicator of O was smaller than a given threshold, the scale-300 object was decomposed and replaced by the objects in O; otherwise, it was retained. The same test was then repeated layer by layer toward finer scales.
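The decomposition can be sketched as a recursion over the segmentation hierarchy. This is a simplified reading of the strategy, using the threshold rule described in Section 7.3: the hierarchy is assumed to be given as a parent-to-children mapping, the local reverse Geary's C of each object's child set is assumed to be precomputed, and all identifiers and values are hypothetical.

```python
def decompose(obj, children, local_c, threshold):
    """Return the final objects obtained by top-down decomposition of obj.

    children: dict mapping an object id to the ids of the finer-scale
              objects it contains (missing or empty at the finest scale).
    local_c:  dict mapping an object id to the reverse Geary's C computed
              over its child set.
    """
    kids = children.get(obj, [])
    if not kids or abs(local_c[obj]) >= threshold:
        return [obj]  # homogeneous enough (or finest scale): keep as is
    result = []
    for kid in kids:
        result.extend(decompose(kid, children, local_c, threshold))
    return result


# Hypothetical three-level hierarchy.
children = {"A300": ["A290a", "A290b"], "A290a": ["A280x", "A280y"]}
local_c = {"A300": 0.990, "A290a": 0.9995}

final_objects = decompose("A300", children, local_c, threshold=0.999)
print(final_objects)
```

In this example the coarse object falls below the threshold and is split, while its first child stays intact because its own child set is highly autocorrelated.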
6. Validation method for segmentation boundaries
Finally, the segmentation boundaries were validated. On the one hand, the multiscale segmentation results were validated as a reference for subsequent studies; on the other hand, the segmentation results of the method proposed here were validated. A manually interpreted reference image layer was used together with precision and recall indicators calculated by an area-based method. These two indicators have already been widely used in the evaluation of segmentation boundaries [35, 36]. The basic principle is as follows: given a segmentation result S of the raw image and a corresponding ground reference image layer R, the precision indicator is the proportion of the segmented area that falls within the reference object that each segmentation object overlaps most. This indicator is relatively sensitive to over-segmentation. The recall indicator is the proportion of the reference area that falls within the segmentation object that each reference object overlaps most, and it is sensitive to under-segmentation. The area-based calculation follows a previously published description. To calculate the precision indicator, the segmentation image layer was matched to the reference image layer, and each object $S_i$ in the segmentation image layer was traversed to find the reference object $R_{i\max}$ with which it has the largest overlap and to calculate that overlap area. Subsequently, the sum of the overlap areas was divided by the total area of the segmentation image layer, to calculate the precision indicator as follows:

$$\mathrm{precision} = \frac{\sum_{i} \left| S_i \cap R_{i\max} \right|}{\sum_{i} \left| S_i \right|}$$

The recall indicator is calculated analogously by traversing each reference object $R_j$ and its largest overlapping segmentation object $S_{j\max}$:

$$\mathrm{recall} = \frac{\sum_{j} \left| R_j \cap S_{j\max} \right|}{\sum_{j} \left| R_j \right|}$$
From the principle and calculation process of the two accuracy indicators, it is not difficult to see that, similar to the consistency/heterogeneity indicators, these two indicators are negatively correlated with each other to some degree, and it is difficult for both to be large at the same time. Generally, only a compromise between the two indicators can be obtained; therefore, both indicators are simply summed to measure the overall quality of the segmentation:

$$\mathrm{Sum} = \mathrm{precision} + \mathrm{recall}$$
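The area-based calculation can be sketched on two label rasters of equal size, flattened to lists, where each pixel carries a segmentation label and a reference label. This is a minimal illustration under the assumption that both layers cover exactly the same pixels; the function names and the eight-pixel example are hypothetical.

```python
from collections import Counter, defaultdict


def largest_overlap_sum(labels_a, labels_b):
    """For each object in A, take its largest overlap (in pixels) with any object in B, and sum."""
    overlaps = defaultdict(Counter)
    for a, b in zip(labels_a, labels_b):
        overlaps[a][b] += 1
    return sum(max(counts.values()) for counts in overlaps.values())


def precision_recall(segmentation, reference):
    """Area-based precision and recall for per-pixel object labels."""
    total = len(segmentation)
    precision = largest_overlap_sum(segmentation, reference) / total
    recall = largest_overlap_sum(reference, segmentation) / total
    return precision, recall


reference = [1, 1, 1, 1, 2, 2, 2, 2]
over_segmented = [1, 1, 2, 2, 3, 3, 4, 4]   # every reference object split in two
under_segmented = [1] * 8                    # everything merged into one object

p_over, r_over = precision_recall(over_segmented, reference)
p_under, r_under = precision_recall(under_segmented, reference)
print(p_over, r_over, p_under, r_under)
```

The example shows the trade-off described above: splitting objects keeps precision high but lowers recall, while merging them does the opposite.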
7. Experiment discussion
7.1. Changes in each indicator with scale
(1) Area-weighted variance
Generally, optimized scales can be identified by considering the relationship between variance and scale. Figures 2 and 3 show the variation of the average variance of the three bands with segmentation scale in experimental areas 1 and 2, respectively, and both areas show consistent trends: as the scale increases, the number of segmentation objects decreases and the average variance of the objects gradually increases. This is easily understood: when the segmentation scale increases, the segmentation objects become larger and each object tends to include a wider range of image brightness values. Therefore, at coarse scales, the average variance of the segmentation objects tends to increase. Kim et al. found that with increased scale, and even until under-segmentation, mixed objects include more pixels that do not belong to the actual ground object, thereby decreasing the variance of these mixed objects. Therefore, it is generally believed that an optimal segmentation scale exists just before the variance curve levels off. However, the experimental results showed that, apart from insignificant inflection points near scale 60, it is difficult to find regions with gentle changes in Figures 2 and 3. Instead, the variance increases with scale at an almost constant rate. It is worth noting that a similar principle was used to develop the estimation of scale parameter (ESP) scale optimization tool, which integrates the variance curve and its rate of change to identify the optimal segmentation scale. Under the conditions here, where the change in variance with scale shows no marked variation, this approach was not the best choice.
(2) Moran’s I
Figures 4 and 5 show that Moran’s I index continuously decreased when the scale increased from fine to coarse. A fine scale generally tends toward over-segmentation, that is, the segmentation objects that are adjacent to each other are more similar, resulting in a larger Moran’s I index (i.e., stronger autocorrelation between objects). Conversely, an increase in scale results in under-segmentation; the segmentation objects become larger, the differences between adjacent segmentation objects become significant, the spectral consistency decreases, and thus Moran’s I index decreases. Therefore, considering the variation curve of the autocorrelation index with the segmentation scale, it has been suggested that the minimum autocorrelation should correspond to the optimal segmentation scale. However, the results (Figures 4 and 5) showed that the autocorrelation in both experimental areas continuously decreases with increasing scale, so Moran’s I index alone cannot determine the optimal scale.
(3) Reverse Geary’s C index
Considering that the changes in Moran’s I are not very pronounced, we tried another autocorrelation index that is more sensitive to local heterogeneity, Geary’s C. Figures 6 and 7 show the changes in Geary’s C index with segmentation scale in the two experimental areas; Geary’s C decreased with increasing scale. Moreover, the index responds more clearly around the optimal segmentation scale: in experimental areas 1 and 2, the curves started to become stable near scales 120 and 150, respectively, and the magnitude of the changes was not as large as that at the fine scales. According to the validation results of the optimized segmentation boundaries (Figures 12 and 13), the reverse Geary’s C index can better represent the optimized scale compared with variance and Moran’s I index.
(4) Normalized sums
According to the consistency and autocorrelation tests in the preceding section, as the scale increases, the variance indicator that represents the consistency of the segmentation objects continuously increases, whereas the autocorrelation indicators that represent heterogeneity continuously decrease, and it is difficult to find regions where the changes start to stabilize. In principle, the two indicators can each be used individually to identify the optimal segmentation scale; Drǎguţ et al. and Kim et al. obtained the optimal segmentation scale using one indicator in their studies. A maximum or minimum value can be identified for a single indicator, but in fact these extreme values do not correspond to the optimal segmentation scale: the variance still increases at scale 500 (Figures 2 and 3), and Moran’s I and the reverse Geary’s C still decrease at scale 500 (Figures 4–7). In our experimental areas, scale 500 or even larger scales are evidently not optimal, as shown in the subsequent validation of the segmentation boundaries. Therefore, single indicators are not suitable for identifying the overall optimal scale, and the sum of two indicators may be more appropriate. The test results for experimental areas 1 and 2 are shown in Figures 8–11. Figures 8 and 9 show the sum of the normalized variance and normalized Moran’s I index, whereas Figures 10 and 11 show the sum of the normalized variance and normalized reverse Geary’s C index. From the validation results of the segmentation boundaries (Figures 12 and 13), it is easy to see that the curve of the sum of the normalized variance and normalized Moran’s I index against scale better highlights the optimal segmentation scale and even reaches a local minimum at scale 200 in Figure 8.
This result is consistent with the earlier suggestion that the scale at which the sum of the consistency and heterogeneity indicators is lowest is the optimal scale. However, depending on the experimental area, a minimum is not always reached at the optimal scale: Figure 9 shows that the sum of the normalized variance and normalized Moran’s I index in experimental area 2 did not reach a local minimum at a suitable scale. It is worth noting that, starting from scales 150–200, the sum of the normalized variance and normalized Moran’s I index in experimental area 2 begins to level off markedly with increasing scale, and this region corresponds well to the optimal segmentation scale in experimental area 2. Optimal segmentation scales are thus assumed to lie between over-segmentation and under-segmentation. Theoretically, the indicator values should change markedly before and after such a scale, but owing to differences in the terrain being segmented, a local minimum does not always appear. Generally, the scale region just before the sum of the normalized variance and normalized Moran’s I index starts to stabilize is used as the optimal segmentation scale. In addition, the results of the sum of the normalized variance and normalized reverse Geary’s C index were not good for experimental area 1 (Figure 10), as abnormal changes occurred at smaller scales. This is due to the oversensitivity of the reverse Geary’s C index, and this sum is not recommended. Among the single indicators, the reverse Geary’s C index is recommended. Nevertheless, in experimental area 2, the sum of the normalized variance and normalized reverse Geary’s C (Figure 11) performed similarly to the sum of the normalized variance and normalized Moran’s I index (Figure 9) and indicated the optimal segmentation scale even more clearly.
7.2. Precision indicator analysis of multiscale segmentation results
This section evaluates the segmentation quality at different segmentation scales by testing the segmentation results against reference polygons. At the same time, the performance of the abovementioned indicators is validated in order to provide reliable reference information on which indicators are more suitable for representing the optimal segmentation scale. Figures 12 and 13 show the precision and recall indicators of the two experimental areas and their changes with scale. It can be clearly seen that the precision indicator decreases when the scale increases, whereas the recall indicator increases when the scale increases. The sum of the two increases with scale and starts to become stable within a suitable scale range. Ideally, the larger the sum of the precision and recall values, the better the segmentation result. As these two indicators are sensitive to over-segmentation and under-segmentation, respectively, similar to the consistency and autocorrelation indicators, the optimal segmentation scale is assumed to be the scale at which both indicators start to become stable. Therefore, for experimental area 1, the optimal segmentation scale should be in the region of scale 130, whereas that of experimental area 2 should be in the region of scale 150, which is similar to the analysis results of Section 7.1. Therefore, the sum of the region-based precision and recall indicators can effectively show the optimal segmentation scale, which is consistent with previously reported results.
Furthermore, when the sum of the consistency and autocorrelation measures is plotted together with the sum of precision and recall, it can be clearly seen that the combined consistency and autocorrelation value marks the region where the combined precision and recall indicators start to become stable; that is, the sum of the consistency and autocorrelation measures also changes markedly in the corresponding scale region. Figures 14 and 15 show the best combination in the two experimental areas: for experimental area 1, it is the sum of the normalized variance and Moran’s I (Figure 14), and for experimental area 2, it is the sum of the normalized variance and reverse Geary’s C (Figure 15). In the figures, the dotted vertical lines mark the manually identified optimal scales.
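The optimal scales in this chapter were identified by inspecting the curves. As a rough, automatable stand-in, one could look for the first scale after which every subsequent step change of a curve stays within a tolerance. The sketch below illustrates that heuristic only; the tolerance, the function name, and the example values are hypothetical and are not taken from the experiments.

```python
def first_stable_scale(scales, scores, tolerance):
    """Return the first scale after which all step-to-step changes are within tolerance."""
    for k in range(1, len(scores)):
        steps = [abs(scores[m] - scores[m - 1]) for m in range(k, len(scores))]
        if all(step <= tolerance for step in steps):
            return scales[k - 1]
    return None  # the curve never settles within the given tolerance


# Hypothetical sum of precision and recall at five scales.
scales = [50, 100, 150, 200, 250]
pr_sum = [1.00, 1.30, 1.50, 1.52, 1.53]

stable_from = first_stable_scale(scales, pr_sum, tolerance=0.05)
print(stable_from)
```

The result depends strongly on the chosen tolerance and scale step, so such a rule is best treated as a starting point for visual inspection rather than a replacement for it.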
7.3. Top-down decomposition based on autocorrelation measures
Currently, the majority of scale optimization studies aim to obtain a single optimized scale [38, 39]. However, previous work has shown that the optimized scales of different terrains are different, and relying purely on the identification of a single optimized scale is essentially not in line with the core idea of object-based remote sensing analysis. Here, we propose a top-down multiscale segmentation scheme with the aim of obtaining optimized segmentation results for different terrains. Table 1 shows the sum of the precision and recall indicators at different scales for experimental area 1. The maximum or local maximum values of the different categories do not always appear at the overall optimal segmentation scale. Here, we used the reverse Geary’s C to achieve a top-down decomposition of under-segmented objects. A reverse Geary’s C value close to 1 indicates strong positive autocorrelation in the object. The segmentation objects in the various layers used in the top-down decomposition were obtained by bottom-up merging based on the consistency of adjacent objects; therefore, a high degree of autocorrelation exists in the set of lower-layer objects contained in an upper-layer object. Through testing, we found that the local reverse Geary’s C indices of the lower-layer object sets contained in the objects of the middle and upper layers in the experimental area are all large and approach 1. Therefore, the threshold values in this test were 0.999, 0.997, and 0.995. If the reverse Geary’s C index calculated for an upper-layer object from the lower-layer objects it contains is smaller than the threshold, the upper-layer object is decomposed, that is, the object set in the lower layer is retained. Figure 16 shows the layer-by-layer decomposition result from segmentation scale 320 to scale 50 for experimental area 1 with a threshold value of 0.999.
It can be seen that this not only retained the overall characteristics of cultivated land and buildings but also refined the segmentation of forests. In particular, the proposed method can better represent buildings. Figure 16(a) and (b) show the results of top-down decomposition and optimal segmentation scale 130, respectively. Figure 16(c) and (d) show the optimal segmentation scale 130 result and top-down decomposition result, respectively. The experiments showed that this method can make up for the deficiencies in merging that arise when the bottom-up multiscale segmentation only considers the consistency of adjacent objects.
| Scale | Forests | Roads | Cultivated land | Buildings | Bare land |
7.4. Comparison of results of single-scale and multiscale decomposition
Assume that the optimal segmentation scale of experimental area 1 is 130; the corresponding precision and recall indicators of the various categories are then as shown in Table 2. Table 2 compares the sums of the precision and recall indicators of the various categories obtained from the accuracy validation of the gradual decomposition from scale 320 to 50, using a threshold value of 0.999, with those obtained at the optimal segmentation scale of 130. It can be seen that, with this method, the sums of precision and recall for forests, roads, and cultivated land are lower than those at the single optimal scale. However, this method can greatly increase the segmentation accuracy of buildings (Figure 16) while retaining the segmentation characteristics of bare land to a maximum degree; for these categories, the sum of precision and recall was better than at scale 130. The proposed method can therefore effectively improve the segmentation results of urban areas or regions with high consistency (such as buildings and bare land). However, the results of this method may be worse for forests, cultivated land, or other regions with similar spectra. Therefore, this method must be used selectively, such as in study sites that are dominated by urban areas or consistent regions.
| Different methods | Forests | Roads | Cultivated land | Buildings | Bare land |
8. Chapter summary
This chapter presented the use of drone-based remote sensing images to evaluate the quality of the MRS algorithm for the segmentation of drone images, tested the sensitivity of different segmentation evaluation indicators, and proposed an optimization protocol for segmentation scales. First, the consistency and heterogeneity measures of the object were used to test the sensitivity of different indicators in multiscale segmentation results. The results showed that it is more difficult to find optimal scales by using single indicators. A combination of area-weighted variance (consistency) and Moran’s I spatial autocorrelation index (heterogeneity) can simultaneously account for the internal consistency of the object and the heterogeneity between objects, such that the optimized segmentation object can internally achieve maximum homogeneity and maximum heterogeneity can be achieved between objects, which is more conducive to discovering the optimal segmentation scale. For normalized combined indicators, the combined results of the normalized variance and normalized Moran’s I were found to be better than the results of the normalized variance and normalized reverse Geary’s C. Through a combination of normalized precision and recall measures, we found the optimal segmentation scale region for experimental areas 1 and 2. These results can provide an empirical reference for the optimization of segmentation in drone-based remote sensing images. Compared with other indicators, the reverse Geary’s C is more sensitive to the segmentation scale, as the top-down object decomposition protocol based on its spatial autocorrelation indicator can improve the segmentation results of different terrains. However, this method does not show good results for forests or cultivated land, which have low spectral consistency. Therefore, based on the results of this research, it is recommended that this method be selectively used.
This work was supported by the National Natural Science Foundation of China (No. 41701374), the Natural Science Foundation of Jiangsu Province of China (No. BK20170640), the China Postdoctoral Science Foundation (No. 2017 T10034, 2016 M600392), and the National Key Research and Development Program of China (No. 2017YFB0504200). We are also grateful to anonymous reviewers and members of the editorial team for advice.