Clustering Techniques for Land Use Land Cover Classification of Remotely Sensed Images

Image processing is a rapidly growing field. Clustering of remotely sensed images aims to partition an image into meaningful land use/land cover classes with respect to a particular application. Image clustering groups an image into units or categories that are homogeneous with respect to one or more characteristics. Many algorithms and techniques have been developed to solve image clustering problems, but none of them is a general solution. This chapter reviews current developments in clustering and explores the potential of these techniques for extracting earth surface feature information from high spatial resolution remotely sensed imagery. It also provides insight into the existing mathematical methods and their application to image clustering. Special emphasis is given to the Hölder exponent (HE) and variance (VAR), both well-established techniques for texture analysis, and to an HE- and VAR-based clustering method for classifying land use/land cover in high spatial resolution remotely sensed images.


Introduction
High spatial resolution remotely sensed imagery provides detailed, high-quality information about the earth's surface features together with their geographical associations. However, the internal variability within a single land-use/land-cover unit increases as the resolution increases. This increased variability reduces the statistical separability of land-use/land-cover classes in the spectral data space, which in turn tends to lower the accuracy of pixel-based clustering algorithms such as Fuzzy C-Means [1], minimum distance classifiers [2] and K-Means [3]. These pixel-based techniques assign a pixel to a region according to the similarity of its spectral signature, and they consider only one pixel at a time [4]. A spectral signature is the specific combination of emitted, reflected or absorbed electromagnetic (EM) radiation at varying wavelengths that can uniquely identify an object [4].
Compared to IRS-1A/1B sensors, the spectral resolution of high spatial resolution images is normally relatively poor. Spectral resolution describes a sensor's ability to resolve fine wavelength intervals: the better the spectral resolution, the narrower the channel or band width. There is therefore a trade-off between spatial and spectral resolution. This is particularly true for high spatial resolution panchromatic (PAN) images such as CARTOSAT-II 1 m and IKONOS 1 m. Because the spatial structure in high-resolution (HR) images varies widely, classifying them requires considering the spatial relationships between pixel values, also known as the 'texture' of the scene objects. Consequently, several texture-based clustering techniques, namely GLCM [5][6][7][8], the Markov random field (MRF) model [5] and gray-scale rotation-invariant methods [9], were developed for clustering high spatial resolution remote sensing images. These methods, however, are appropriate only in the textured areas of HR images. A region is called textured when the intensity variation between adjacent pixels is substantial, and non-textured when that variation is insignificant [10,11]. Texture-based classification techniques fail in the non-textured regions of a high spatial resolution image because there is little variation in the spatial pattern of those regions [12]. Earlier studies therefore suggest that classifying high spatial resolution imagery with either a pixel-based or a texture-based algorithm alone may not yield the desired results.
Other techniques, namely the watershed approach [13,14], the region-growing approach [4,15], the mean shift approach [16,17] and the region merging approach [18], are also used for clustering high spatial resolution remote sensing images. Applying these approaches tends to lead to either under-segmentation or over-segmentation [19,20]. A structural image indexing approach [21], a semi-supervised feature learning approach [22] and a multi-scale SVM approach [23] have also been found fairly suitable for clustering high-resolution images. Because high-resolution imagery contains both textured and non-textured areas, a purely pixel-based or purely texture-based algorithm does not produce the expected results, and clustering of such imagery remains an active research topic. In [10], a multi-circular local binary pattern (MCLBP) operator and a variance-based method were used separately to cluster high-resolution images having textured and non-textured regions, with the MCLBP operator measuring the spatial structure of the image. The disadvantage of this strategy is that the MCLBP operator is susceptible to noise, because it uses the exact value of the moving window's central pixel as the threshold for computing the spatial structure around that pixel.
Over the last decade, the Hölder exponent (HE) has been used to calculate the spatial structure of images [24][25][26] and to cluster high-resolution images [12]. HE characterizes the spatial structure of the image and is not much influenced by noise. In addition to spatial structure, local image contrast is an important property for calculating the texture around a pixel. In this research, the textured and non-textured regions of a high-resolution image are first segmented using an HE- and VAR-based method, and the textured and non-textured areas are then clustered separately. VAR is used to calculate the contrast around each pixel. The suggested method is applied to high-resolution IKONOS PAN images with a spatial resolution of 1 m.

Methods
The suggested technique for clustering a high-resolution image 'P' has three main steps: (i) image transformation, (ii) segmentation and extraction, and (iii) clustering. Initially, every pixel of the image is converted into a value expressing the degree of texture around that pixel. In the second step, the transformed image is segmented into a mask, and this segmented image mask is used to extract the non-textured and textured regions from the initial image. Finally, the two extracted regions are clustered separately.

Transformation of image
The Hölder exponent (HE) and VAR are jointly used to transform the image for computing texture. The HE measures the spatial structure around each pixel of P. Besides spatial structure, local image contrast is also an important property for computing the texture around a pixel. In this research, therefore, VAR is used to calculate the contrast around the pixel.

Hölder exponent
The Hölder exponent has been used for investigating texture in high-resolution images [12]. It measures the irregularity in the neighborhood of a point. The advantages of applying the Hölder exponent to HR images are that (i) it can be used to calculate the spatial structure around each pixel of the image, (ii) no prior information on the pixel intensity is required and (iii) it is not very sensitive to noise [12].
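As an illustration of how such an exponent can be estimated numerically, the following minimal sketch fits, for every pixel, the slope of log μ(B(x, r)) against log r over the 15 radii listed in the definition that follows. It is not the implementation used in [12]. Two choices here are assumptions rather than details taken from the source: the measure μ(B(x, r)) is taken to be the sum of intensities inside the disc of radius r, and image borders are handled by mirror padding. The function name `holder_exponent` is hypothetical.

```python
import numpy as np

# Squared radii of the 15 circles used as scale parameters in the text.
R_SQUARED = (1, 2, 5, 9, 13, 18, 25, 29, 40, 45, 49, 61, 72, 85, 98)

def holder_exponent(image, eps=1e-12):
    """Estimate a per-pixel Hoelder exponent by log-log regression.

    ASSUMPTION: mu(B(x, r)) is the sum of (non-negative) intensities in
    the disc of radius r; the source does not state which measure is used.
    """
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    pad = int(np.sqrt(max(R_SQUARED)))         # largest integer offset (9)
    padded = np.pad(img, pad, mode="reflect")  # ASSUMPTION: mirrored borders

    # log_mu[k] holds log mu(B(x, r_k)) for every pixel x.
    log_mu = np.empty((len(R_SQUARED), h, w))
    for k, r2 in enumerate(R_SQUARED):
        mu = np.zeros((h, w))
        for dy in range(-pad, pad + 1):
            for dx in range(-pad, pad + 1):
                if dy * dy + dx * dx <= r2:
                    mu += padded[pad + dy:pad + dy + h, pad + dx:pad + dx + w]
        log_mu[k] = np.log(mu + eps)           # eps avoids log(0)

    # Least-squares slope of log_mu against log r, computed per pixel.
    log_r = 0.5 * np.log(np.array(R_SQUARED, dtype=float))
    x = log_r - log_r.mean()
    return np.tensordot(x, log_mu, axes=(0, 0)) / np.dot(x, x)
```

Under these assumptions a uniform area gives a slope close to 2 (the disc area grows roughly as r²), whereas an isolated bright pixel on a dark background gives a slope close to 0, so lower values indicate stronger local irregularity.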
Definition of HE [27]: Let μ be a measure on a set Ω. For all x ∈ Ω, the Hölder exponent of μ at x is

$$\alpha(x) = \lim_{r \to 0} \frac{\log \mu\left(B(x, r)\right)}{\log r}$$

where B(x, r) denotes the ball of radius r centered at x.

A sequence of 15 values of radius r (i.e. 1, √2, √5, 3, √13, 3√2, 5, √29, 2√10, 3√5, 7, √61, 6√2, √85, 7√2) centered on x is used as the scale parameter for calculating the HE value around each pixel x in the image [12], and the total number (N) of pixels intersected by the perimeters of this series of circles is used as the scale parameter for computing the VAR value around x [12]. N is computed using Eq. (1):

$$N = \sum_{r=1}^{t} m_r \tag{1}$$

where t is the total number of circles and $m_r$ is the number of intersected pixels on the perimeter of the circle of radius r.
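The count N can be illustrated with a short sketch. The source does not say how "intersected" pixels are determined, so the code below assumes, purely for illustration, that a pixel belongs to a circle when its integer offset (dx, dy) from the center satisfies dx² + dy² = r² exactly. The names `R_SQUARED` and `ring_offsets` are hypothetical.

```python
import math

# Squared radii of the 15 circles: 1, sqrt(2), sqrt(5), 3, ..., 7*sqrt(2).
R_SQUARED = [1, 2, 5, 9, 13, 18, 25, 29, 40, 45, 49, 61, 72, 85, 98]

def ring_offsets(r2):
    """Integer offsets (dx, dy) lying exactly on the circle of squared radius r2.

    ASSUMPTION: this exact-lattice rule stands in for the "intersected
    pixels" of the text, whose precise definition is not given.
    """
    lim = math.isqrt(r2)
    return [(dx, dy)
            for dx in range(-lim, lim + 1)
            for dy in range(-lim, lim + 1)
            if dx * dx + dy * dy == r2]

# m[i] is the pixel count on the i-th circle; N is their sum over all t circles.
m = [len(ring_offsets(r2)) for r2 in R_SQUARED]
t = len(R_SQUARED)
N = sum(m)
```

A different rasterization rule (for example, a midpoint circle algorithm) would give different counts per circle, but N would still be the sum of those counts.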

VAR (σ²) for contrast measurement around each pixel of the image
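To obtain the contrast value at (x, y), the variance σ² of the neighborhood of each pixel (x, y) is calculated over the entire image. σ²(x, y) is obtained using Eq. (2):

$$\sigma^2(x, y) = \frac{1}{N} \sum_{r=1}^{t} \sum_{j=1}^{m_r} \left(a_{rj} - \bar{a}\right)^2, \qquad \bar{a} = \frac{1}{N} \sum_{r=1}^{t} \sum_{j=1}^{m_r} a_{rj} \tag{2}$$

where $a_{rj}$ is the intensity value of pixel (r, j), that is, the j-th intersected pixel on the perimeter of the r-th circle, and $\bar{a}$ is the mean intensity of the N intersected pixels. α(x, y) and σ²(x, y) are thus obtained for each P(x, y). Afterward, these values are used in Eq. (3) to obtain the corresponding pixel value at (x, y) in the transformed image T. Each pixel (x, y) of T signifies the degree of texture around that pixel.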
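The contrast measure can be sketched as the variance of the intensities sampled on the 15 circle perimeters around each pixel. This is a minimal illustration, not the implementation from the source. It assumes that the sampled pixels are those whose integer offsets lie exactly on a circle and that borders are mirror-padded. The function name `ring_variance` is hypothetical.

```python
import numpy as np

# Squared radii of the 15 circles used as scale parameters in the text.
R_SQUARED = (1, 2, 5, 9, 13, 18, 25, 29, 40, 45, 49, 61, 72, 85, 98)

def ring_variance(image):
    """Per-pixel variance of the intensities on the circle perimeters.

    ASSUMPTIONS: a pixel is 'on' a circle when dx^2 + dy^2 equals the
    squared radius exactly, and borders are handled by mirror padding.
    """
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    pad = int(np.sqrt(max(R_SQUARED)))
    padded = np.pad(img, pad, mode="reflect")
    on_ring = set(R_SQUARED)

    total = np.zeros((h, w))      # running sum of sampled intensities
    total_sq = np.zeros((h, w))   # running sum of squared intensities
    n = 0                         # number of sampled pixels (N in the text)
    for dy in range(-pad, pad + 1):
        for dx in range(-pad, pad + 1):
            if dy * dy + dx * dx in on_ring:
                v = padded[pad + dy:pad + dy + h, pad + dx:pad + dx + w]
                total += v
                total_sq += v * v
                n += 1

    mean = total / n
    # E[a^2] - (E[a])^2, clipped at zero to absorb rounding error.
    return np.maximum(total_sq / n - mean * mean, 0.0)
```

A uniform area yields zero variance, while areas with strong local intensity changes yield larger values, which is why the measure serves as a contrast indicator.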

Image segmentation and extraction
The image 'T' is segmented into textured and non-textured regions based on a threshold value 'δ'. A pixel whose value in T is below δ is assigned to the non-textured region, whereas a pixel whose value is greater than or equal to δ is assigned to the textured region. In the segmented image mask, pixels in non-textured areas are labeled zero and pixels in textured areas are labeled one:

$$\Gamma(x, y) = \begin{cases} 0, & T(x, y) < \delta \\ 1, & T(x, y) \geq \delta \end{cases} \tag{4}$$

where T(x, y) and Γ(x, y) represent the pixel values at position (x, y) of the two-dimensional transformed image and segmented image respectively, and δ represents the threshold value. δ is calculated using Eq. (5), where $T_{\min}$ and $T_{\max}$ represent the minimum and maximum pixel gray values in T respectively and K is a user-defined value.
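The thresholding step can be sketched as follows. The exact form of Eq. (5) is not reproduced in this text, so the sketch assumes a hypothetical rule, δ = T_min + (T_max − T_min)/K, only to make the example concrete; the labeling rule (zero below δ, one at or above δ) follows the description above. The function name `texture_mask` is also hypothetical.

```python
import numpy as np

def texture_mask(T, K=5):
    """Return (mask, delta): mask is 1 where T >= delta, else 0.

    ASSUMPTION: delta = T_min + (T_max - T_min) / K is a placeholder for
    Eq. (5), whose actual form is not available in this text.
    """
    T = np.asarray(T, dtype=float)
    delta = T.min() + (T.max() - T.min()) / K
    mask = (T >= delta).astype(np.uint8)   # 0 = non-textured, 1 = textured
    return mask, delta
```

For example, `texture_mask([[0, 1], [2, 10]], K=5)` gives a threshold of 2.0 under the assumed rule, so the two pixels in the second row are labeled textured.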
An IKONOS PAN sensor image of size 256 × 256 pixels (shown in Figure 1a) is used to determine the optimum K. The suggested clustering method is implemented for distinct K values on this image.
The segmented image is subsequently used to extract the textured and non-textured regions from the initial image P. This process is represented mathematically as follows:

$$R_1(x, y) = P(x, y)\,\bigl[1 - \Gamma(x, y)\bigr], \qquad R_2(x, y) = P(x, y)\,\Gamma(x, y)$$

where P, Γ, $R_1$ and $R_2$ indicate the original image, the segmented image, the non-textured region extracted from P and the textured region extracted from P respectively.
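The extraction step amounts to masking the original image with the segmented image and its complement. The sketch below is one way to express it. Filling the excluded pixels with zero is an assumption made for illustration, and the function name `extract_regions` is hypothetical.

```python
import numpy as np

def extract_regions(P, gamma):
    """Split image P into (R1, R2) using the binary mask gamma.

    R1 keeps the non-textured pixels (gamma == 0) and R2 the textured
    pixels (gamma == 1); excluded pixels are set to 0 (ASSUMPTION).
    """
    P = np.asarray(P)
    textured = np.asarray(gamma).astype(bool)
    R1 = np.where(textured, 0, P)   # non-textured region
    R2 = np.where(textured, P, 0)   # textured region
    return R1, R2
```

Because a true intensity of zero is indistinguishable from an excluded pixel in this form, a practical implementation would usually carry the mask alongside each region or index the pixels directly with the boolean mask.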

Clustering
Initially, a threshold is used to segment the transformed image into textured and non-textured regions. Afterward, the textured and non-textured regions are extracted from the original image using the segmented image mask and clustered independently. The extracted textured region ($R_2$) is clustered by means of the ISODATA clustering algorithm [28], considering the HE, VAR and intensity values of each pixel of the textured area. ISODATA is computationally inexpensive, simple and unsupervised. The non-textured area ($R_1$) of the image is also clustered using ISODATA, but the HE and VAR values of individual pixels are not considered there, as there is no significant variation in texture between classes in the non-textured region. The classified outputs of the non-textured and textured regions are produced separately and then merged to obtain the final classified image. To demonstrate the strength of the suggested technique, its results are compared with those of two other clustering methods, "HE-VAR and PAN" and "MCLBP and VAR". The "HE-VAR and PAN" technique clusters the entire image using the HE, VAR and intensity of each pixel of the IKONOS PAN image.
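The two-branch clustering and the final merge can be sketched as below. This is not ISODATA: a plain k-means with a fixed number of clusters is swapped in as a simplified stand-in, without ISODATA's split and merge steps. The feature standardization, the deterministic initialization and the names `kmeans` and `cluster_regions` are assumptions made for illustration.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain Lloyd k-means (stand-in for ISODATA; no split/merge steps)."""
    X = np.asarray(X, dtype=float)
    # Deterministic init: points at evenly spaced ranks of the first feature.
    order = np.argsort(X[:, 0], kind="stable")
    centers = X[order[np.linspace(0, len(X) - 1, k).astype(int)]].copy()
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def cluster_regions(P, alpha, var, mask, k_textured=2, k_smooth=2):
    """Cluster textured and non-textured pixels separately, then merge."""
    P = np.asarray(P, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    var = np.asarray(var, dtype=float)
    textured = np.asarray(mask).astype(bool)
    out = np.zeros(P.shape, dtype=int)
    if (~textured).any():
        # Non-textured region: intensity only.
        out[~textured] = kmeans(P[~textured][:, None], k_smooth)
    if textured.any():
        # Textured region: HE, VAR and intensity, standardized (ASSUMPTION).
        feats = np.column_stack([alpha[textured], var[textured], P[textured]])
        feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-12)
        # Offset the labels so the two label sets do not collide.
        out[textured] = k_smooth + kmeans(feats, k_textured)
    return out
```

Offsetting the textured labels by `k_smooth` keeps the two label sets disjoint in the merged map; matching the resulting clusters to land cover classes remains a separate step.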

Results and discussion
The proposed clustering method uses a threshold δ to obtain the segmented image mask from the transformed image. The threshold is computed using a constant 'K'. In this study, the proposed clustering procedure is implemented on an IKONOS PAN image with 1 m spatial resolution for 'K' values between 3 and 7, and the classification accuracy is measured for each 'K' using the ground truth data. The classification accuracy for different 'K' is shown in Figure 2. As Figure 2 shows, 'K' considerably affects the accuracy of classifying high spatial resolution images, so a suitable choice of 'K' is important for computing texture. In this study, the best performance in high-resolution image classification was achieved with K = 5. The optimum K was determined on Figure 1a and was then applied to the classification of Figure 1e and other images, where the classification accuracy was found to be more than 88%. The present study thus suggests that the same K value is suitable for most images.
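The selection of K described above is a small parameter sweep: run the method for each candidate K, score the result against the ground truth, and keep the best. A minimal sketch follows. The callable `classify` is a hypothetical placeholder for the full clustering pipeline followed by an assignment of clusters to classes, and `select_k` is a hypothetical name.

```python
def select_k(classify, truth, ks=range(3, 8)):
    """Return (best_k, scores) for the candidate K values 3..7.

    classify: hypothetical callable K -> sequence of predicted class labels
    truth:    sequence of reference labels in the same order
    """
    scores = {}
    for k in ks:
        pred = classify(k)
        hits = sum(1 for p, t in zip(pred, truth) if p == t)
        scores[k] = hits / len(truth)      # fraction of matching samples
    best_k = max(scores, key=scores.get)   # first maximum wins on ties
    return best_k, scores
```

The `scores` dictionary holds one accuracy value per K, which is the kind of data plotted as accuracy against K.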
The proposed clustering method, the "MCLBP and VAR" based method and the "HE-VAR and PAN" based method were applied to two different 1 m PAN (IKONOS) images (size 256 × 256 pixels) covering (i) vegetation, (ii) built-up area, (iii) water bodies, and (iv) fallow (shown in Figure 1a, e). Texture is observable in Figure 1a, e. The results of the proposed method are then compared with the results obtained from the "HE-VAR and PAN" and "MCLBP and VAR" based methods. Figure 1b-d shows the classification outcomes of the "HE-VAR and PAN," "MCLBP and VAR" and proposed methods, respectively, for the first IKONOS image, and Figure 1f-h shows the corresponding outcomes for the second IKONOS image. The classified images in Figure 1b-d, f-h distinguish the various features. From the results, it is evident that the "MCLBP and VAR" based method gives less heterogeneous segments than the "HE-VAR and PAN" based method, while the proposed method provides more homogeneous segments with distinct classes than the "MCLBP and VAR" based method. The ground truth data were collected using GPS equipment for the classes vegetation, built-up area, fallow and water body, with sample sizes of 656, 519, 577 and 462 square meters respectively. Afterward, ArcGIS software was used to convert the ground truth data into vector data. Subsequently, the ground truth information was overlaid on the results obtained from both IKONOS images (Figure 1a, e) with the "HE-VAR and PAN," "MCLBP and VAR" and proposed clustering methods, and the classification accuracy of each method is reported as a confusion matrix. The confusion matrices (Table 1) The technique based on "MCLBP and VAR" overcomes these inconsistencies to some extent.
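The accuracy assessment described above can be sketched with a confusion matrix built from paired ground truth and predicted labels. The code below is a generic illustration rather than the ArcGIS overlay procedure used in the study. The class names in the usage note and the helper names are hypothetical, and Cohen's kappa is included as a common companion statistic that the source does not mention.

```python
def confusion_matrix(truth, pred, classes):
    """Rows are ground truth classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(truth, pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

def cohen_kappa(matrix):
    """Agreement corrected for chance: (po - pe) / (1 - pe)."""
    n = sum(sum(row) for row in matrix)
    po = overall_accuracy(matrix)
    rows = [sum(row) for row in matrix]
    cols = [sum(col) for col in zip(*matrix)]
    pe = sum(r * c for r, c in zip(rows, cols)) / (n * n)
    return (po - pe) / (1 - pe)
```

As a usage example, with classes `["vegetation", "built-up", "fallow", "water"]`, each row of the matrix shows how the reference samples of one class were distributed over the predicted classes, so off-diagonal entries expose which classes are confused with each other.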
With the "MCLBP and VAR" based technique, as shown in Figure 1c, g, the overlap among the fallow, water body and vegetation regions is reduced. In addition, the reduced inconsistencies improve the classification accuracy of the fallow, water body and vegetation regions (see Tables 1 and 2).
The "HE-VAR and PAN" based method classifies water bodies and fallow areas as a single class (Figure 1b, f), since the texture patterns of these two areas do not differ much in high-resolution imagery, as shown in Figure 1a, e. The "MCLBP and VAR" based technique improves the classification of fallow areas and water bodies, as is observable in Figure 1g. However, this method could not extract the non-textured region appropriately from Figure 1a, since MCLBP is sensitive to noise. The "MCLBP and VAR" based method therefore could not properly discriminate fallow areas and water bodies in Figure 1a, as visible in Figure 1c. HE is less sensitive to noise, so the proposed technique partitions the image into textured and non-textured regions clearly, which in turn helps in classifying the fallow areas and water bodies, as shown in Figure 1d.
The proposed clustering method is further applied to a 1 m PAN (IKONOS) image (Figure 3a) of (i) urban woodland, (ii) building, (iii) water bodies, and (iv) fallow to show the robustness and validity of the method in classifying land use areas. The method satisfactorily discriminates urban woodland, building, fallow and water bodies as shown in

Conclusion
In the present study, the spatial structure of local image texture is computed using HE, and the contrast around each pixel is measured using VAR. The image is then transformed using HE and VAR together to measure texture, and a threshold δ is used to extract the textured and non-textured regions from the image. The ISODATA clustering algorithm classifies the textured region using the HE, VAR and intensity values of its individual pixels. ISODATA also classifies the extracted non-textured region, but without the HE and VAR values of individual pixels. The results show that the suggested technique is helpful for extracting earth surface characteristics from complex remote sensing images that contain both textured and non-textured areas. Moreover, it can be regarded as an intuitively appealing, unsupervised clustering algorithm for extracting features from remotely sensed images. The method is therefore potentially useful for extracting earth surface features by clustering high spatial resolution panchromatic images more efficiently.