Open access peer-reviewed chapter

Incorporating Local Data and KL Membership Divergence into Hard C-Means Clustering for Fuzzy and Noise-Robust Data Segmentation

By Reda R. Gharieb

Submitted: October 6th 2017. Reviewed: January 26th 2018. Published: August 1st 2018.

DOI: 10.5772/intechopen.74514


Abstract

Hard C-means (HCM) and fuzzy C-means (FCM) are among the most popular algorithms for data clustering, including image data. The HCM algorithm assigns each data entity a cluster membership of 0 or 1, meaning the entity is assigned to exactly one cluster. In contrast, the FCM algorithm gives an entity a membership value between 0 and 1, so the entity may belong to all clusters but with different membership values. The main disadvantage of both algorithms is that they cluster an entity based only on its own features, without incorporating the influence of the entity's neighborhood, which makes the clustering prone to additive noise. In this chapter, Kullback-Leibler (KL) membership divergence is incorporated into the HCM for image data clustering. The resulting HCM-KL-based clustering algorithm provides a twofold advantage. First, it offers a fuzzification approach to the HCM clustering algorithm. Second, by incorporating a local spatial membership function into the HCM objective function, additive noise can be tolerated. Local spatial data are also incorporated for more noise-robust clustering.

Keywords

  • data science
  • clustering
  • image clustering
  • hard and fuzzy C-means
  • membership function
  • Kullback-Leibler (KL) divergence

1. Introduction

Image segmentation is a principal process in many image, video, scene-analysis and computer-vision applications [1–3]. The objective of segmentation is to divide an image into multiple separate regions, or clusters, which makes it easier to recognize and distinguish different objects in the image. Over the last few decades, several image segmentation methods have been developed; however, performance is still unsatisfactory, especially for noisy images. This makes the development of segmentation algorithms capable of handling noisy images an active area of research. Current segmentation methods can be classified into thresholding, region detection, edge detection, probabilistic and artificial neural-network classification, and clustering [1, 2, 3]. Among the most widely used are the hard and fuzzy clustering methods, since clustering needs no training examples [4–24]. The hard C-means (HCM) algorithm, also called the K-means clustering algorithm, is an unsupervised approach in which data are partitioned based on the locations of and distances between data points [4, 5, 6]. K-means partitions the data into C clusters so that data within each cluster are as close as possible while data in different clusters are as far apart as possible. The HCM clustering algorithm offers crisp segmentation in which each data point belongs to exactly one cluster; it therefore ignores fine details of the data's infrastructure, such as hybridization or mixing. Compared with HCM, the fuzzy C-means (FCM) algorithm provides soft segmentation by introducing a degree of belonging described by a membership function [7, 8]. However, one disadvantage of the standard FCM is that it does not incorporate any spatial or local information in the image context, making it very sensitive to additive noise and other imaging artifacts.
To handle this problem, different techniques have been developed [9–13]. These techniques involve spatial or local data information to enhance and regularize the performance of the standard FCM algorithm. Local membership information has also been employed to generate a parameter that weights or modifies the membership function, giving more weight to a pixel's membership when the immediately neighboring pixels belong to the same cluster [14]. The HCM algorithm has also been fuzzified by incorporating membership-entropy optimization [15, 16, 17].

In this chapter, the HCM clustering algorithm is modified by incorporating local spatial data and Kullback-Leibler (KL) membership divergence [18, 19, 20, 21, 22]. The local data information is incorporated via an additional weighted HCM function in which the smoothed image data is used for the distance computation. The KL membership divergence aims at minimizing the information distance between the membership function of each pixel and the locally smoothed one in the pixel's vicinity. The KL membership divergence thus provides an approach for both regularization and fuzzification. The chapter is organized as follows. In Section 2, the clustering problem formulation is overviewed. In Section 3, the HCM clustering algorithm is described. In Section 4, several FCM-related clustering algorithms are explained. In Sections 5 and 6, the proposed local membership KL divergence-based FCM (LMKLFCM) and local data and membership KL divergence-based FCM (LDMKLFCM) clustering algorithms are discussed. In Section 7, simulation results of clustering and segmentation of synthetic and real-world images are presented. Finally, Section 8 presents the conclusions.

2. Problem formulation

The objective is to cluster a set of observed data $x_n$, $n = 1, 2, \ldots, N$, where each data point is an M-dimensional real vector called the feature or pattern vector, i.e., $x_n \in \mathbb{R}^{1 \times M}$. For gray-scale image data, $x_n$, $n = 1, 2, \ldots, N$, is a row-wise concatenation of a 2-D image $X_{pq}$, $p = 1, 2, \ldots, P$, $q = 1, 2, \ldots, Q$. That is, $n = (p-1)Q + q$, and the intensity feature $x_n$ is a single-dimensional real value, i.e., $M = 1$. Clustering aims at partitioning these N observations into $C < N$ divisions $\{\mu_1, \mu_2, \ldots, \mu_C\}$, called C clusters or segments, so as to make the entities or pixels in the same cluster as similar as possible and those in different clusters as dissimilar as possible. One approach to clustering these data is to minimize the within-cluster sum of squared distances (WCSS) and to maximize the between-cluster sum of squared distances (BCSS).

3. Hard C-means (HCM)

In the hard C-means (HCM) algorithm, also called the K-means algorithm, the objective is to minimize the following function [4, 5, 6, 15]:

$$J_{\mathrm{HCM}} = \sum_{i=1}^{C} \sum_{n=1}^{N} u_{in}\, d_{in} \qquad (1)$$

where $d_{in} = \|x_n - v_i\|^2$ is the squared Euclidean distance between the nth pixel feature $x_n$ of the image under segmentation and $v_i \in V = \{v_1, v_2, \ldots, v_C\}$, called the center of the ith cluster, given by

$$v_i = \frac{\sum_{x_n \in \mu_i} x_n}{N_i}, \quad i = 1, 2, \ldots, C \qquad (2)$$

where $\mu_i$ is the ith cluster label and $N_i$ is the number of patterns in cluster i. From (2), it is clear that the pattern $x_n$ belongs to only one cluster, which means that $u_{in} \in \{0, 1\}$, called the membership function, is given by [15]

$$u_{kn} = \begin{cases} 1, & k = \arg\min_i d_{in} \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

From (3), it is obvious that the HCM provides a crisp membership function $u_{in} \in \{0, 1\}$, i.e., {False, True}. Thus the HCM algorithm does not take into account fine details of the data's infrastructure, such as hybridization or mixing, which are important in data clustering and decision making. The algorithm is implemented by an iterative procedure as summarized in Table 1.

Given $x_n$, $n = 1, 2, \ldots, N$.
Initialize $v_i^{(0)}$, $i = 1, 2, \ldots, C$; $t = 0$.
  1. For $n = 1, 2, \ldots, N$, compute:
  2. $d_{in} = \|x_n - v_i^{(t)}\|^2$, $i = 1, 2, \ldots, C$.
  3. HCM: $k = \arg\min_i d_{in}$; $u_{kn} = 1$; $u_{in} = 0$ for $i \neq k$.
     FCM: $u_{in} = 1 \big/ \sum_{j=1}^{C} \left( d_{in}/d_{jn} \right)^{1/(m-1)}$.
  4. Update $v_i^{(t+1)} = \sum_n u_{in} x_n \big/ \sum_n u_{in}$, $i = 1, 2, \ldots, C$; $t = t + 1$.
  5. Check whether $\|V^{(t-1)} - V^{(t)}\|^2 > \varepsilon$; if the change is not negligible, repeat steps 1–5 until convergence.

Table 1.

Pseudo code of the HCM (FCM) algorithms.
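As an illustration, the HCM branch of the pseudo code in Table 1 can be sketched in NumPy as follows. This is a minimal sketch for 1-D (intensity) features, not the author's original MATLAB implementation; the function and variable names are my own.

```python
import numpy as np

def hcm(x, C, iters=100, tol=1e-6, seed=0):
    """Hard C-means (K-means) for 1-D feature data x of length N."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    v = rng.choice(x, size=C, replace=False)        # initial cluster centers (step 0)
    for _ in range(iters):
        d = (x[None, :] - v[:, None]) ** 2          # squared distances, shape (C, N)
        labels = np.argmin(d, axis=0)               # crisp assignment, Eq. (3)
        u = np.zeros((C, x.size))
        u[labels, np.arange(x.size)] = 1.0          # membership is exactly 0 or 1
        v_new = np.array([x[labels == i].mean() if np.any(labels == i) else v[i]
                          for i in range(C)])       # center update, Eq. (2)
        if np.sum((v_new - v) ** 2) < tol:          # convergence check (step 5)
            v = v_new
            break
        v = v_new
    return u, v

# toy example: two well-separated intensity groups
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
u, v = hcm(x, C=2)
```

For well-separated data the crisp memberships recover the two groups and the centers converge to the group means.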

4. Related fuzzy clustering algorithms

4.1. Conventional FCM

The fuzzy C-means (FCM) algorithm seeks to minimize the following objective function [7].

$$J_{\mathrm{FCM}} = \sum_{i=1}^{C} \sum_{n=1}^{N} u_{in}^m\, d_{in} \qquad (4)$$

The difference between the FCM and HCM algorithms is the incorporation of the exponent parameter m, called the fuzzification parameter; if it is set to 1, the FCM algorithm reduces to the HCM one. Due to this exponent m, the membership of the nth pixel to the ith cluster, $u_{in}$, can take on a continuum of values from 0 to 1. In the extreme, each pixel may belong to all clusters with equal membership values of 1/C, which yields a maximally fuzzy membership function. The exponent parameter $m > 1$ thus controls the degree of fuzzification: the bigger m, the more the fuzzification. Finally, the fuzzy membership $u_{in}$ should satisfy [7]

$$u_{in} \in U = \left\{ u_{in} \in [0, 1] \;\middle|\; \sum_{i=1}^{C} u_{in} = 1 \;\forall n, \quad 0 < \sum_{n=1}^{N} u_{in} < N \;\forall i \right\} \qquad (5)$$

The membership $u_{in}$ and the cluster center $v_i$ that minimize the FCM function in (4), subject to $\sum_{i=1}^{C} u_{in} = 1 \;\forall n$, are given by [7]

$$u_{in} = \frac{1}{\sum_{j=1}^{C} \left( d_{in}/d_{jn} \right)^{1/(m-1)}} \qquad (6)$$
$$v_i = \frac{\sum_{n=1}^{N} u_{in}^m x_n}{\sum_{n=1}^{N} u_{in}^m} \qquad (7)$$
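The alternating updates (6) and (7) can be sketched as follows, in the same 1-D toy setup as before. This is an illustrative NumPy version, not the author's code; the fixed initial centers are my own choice, made to keep the example deterministic.

```python
import numpy as np

def fcm(x, C, m=2.0, v0=None, iters=100, tol=1e-10, seed=0):
    """Fuzzy C-means: alternate membership Eq. (6) and center Eq. (7)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    v = np.array(v0, dtype=float) if v0 is not None else rng.choice(x, size=C, replace=False)
    for _ in range(iters):
        d = (x[None, :] - v[:, None]) ** 2 + 1e-12           # squared distances, (C, N)
        # Eq. (6): u_in = 1 / sum_j (d_in / d_jn)^(1/(m-1))
        u = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (1.0 / (m - 1)), axis=1)
        w = u ** m
        v_new = (w @ x) / np.sum(w, axis=1)                  # Eq. (7)
        if np.sum((v_new - v) ** 2) < tol:
            v = v_new
            break
        v = v_new
    return u, v

x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
u, v = fcm(x, C=2, v0=[0.0, 5.0])
```

Unlike the HCM sketch, every column of `u` sums to 1 with all entries strictly between 0 and 1, but the centers still settle near the group means.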

4.2. Local spatial data-based FCM (LDFCM)

The neighboring pixels of an image are highly correlated and are thus highly likely to belong to the same cluster or object. To benefit from this spatial data information, the standard FCM objective function in (4) has been modified by adding a weighted regularization function that depends on local image data information [10, 11, 12]. That is, the LDFCM objective function is given by

$$J_{\mathrm{LDFCM}} = J_{\mathrm{FCM}} + \alpha \sum_{i=1}^{C} \sum_{n=1}^{N} u_{in}^m\, \bar{d}_{in} \qquad (8)$$

where $\alpha$ is a weight experimentally selected by the user, m is the fuzzification parameter, $\bar{d}_{in} = \|\bar{x}_n - v_i\|^2$, and $\bar{x}_n \in \bar{X}$ is the nth pixel of the locally smoothed image $\bar{X}$, obtained in advance from the original one by $\bar{X} = w_x ** X$, where ** denotes two-dimensional convolution. The weights $w_x$ may be equal or unequal, provided that the center weight is zero and the remaining weights sum to unity. From (8), it is clear that the LDFCM minimizes the standard FCM objective function plus another weighted modified FCM function acting as a regularizer, in which the distances are generated from the locally smoothed image data instead of the original image data. This correlates the pixel $x_n$ being clustered with its immediate spatial neighbors, which biases the algorithm toward clustered images with piecewise-homogeneous regions. The membership $u_{in}$ and cluster-center $v_i$ functions of the LDFCM method are given by [10, 11, 12]

$$u_{in} = \frac{1}{\sum_{j=1}^{C} \left( \dfrac{d_{in} + \alpha \bar{d}_{in}}{d_{jn} + \alpha \bar{d}_{jn}} \right)^{1/(m-1)}} \qquad (9)$$
$$v_i = \frac{\sum_{n=1}^{N} u_{in}^m \left( x_n + \alpha \bar{x}_n \right)}{(1 + \alpha) \sum_{n=1}^{N} u_{in}^m} \qquad (10)$$

It is obvious from (9) and (10) that when $\alpha = 0$, the membership $u_{in}$ and the cluster center $v_i$ reduce to those of the standard FCM algorithm in (6) and (7). The advantage of the LDFCM method arises from involving the locally smoothed data $\alpha \bar{x}_n$ in computing both the membership $u_{in}$ and the cluster center $v_i$, which indeed helps handle additive noise.
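A sketch of one LDFCM sweep, i.e., the updates (9) and (10), might look as follows. The 3×3 smoothing with zero center weight follows the description above; the helper names and the toy image are mine.

```python
import numpy as np

def local_mean(img):
    """3x3 moving average with zero center weight (8-neighbour mean, edge-replicated)."""
    pad = np.pad(img, 1, mode='edge')
    acc = np.zeros_like(img, dtype=float)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue                              # center weight is zero
            acc += pad[1 + di: 1 + di + img.shape[0], 1 + dj: 1 + dj + img.shape[1]]
    return acc / 8.0

def ldfcm_step(x, xbar, v, m=2.0, alpha=0.5):
    """One LDFCM update: membership Eq. (9), centers Eq. (10)."""
    d = (x[None, :] - v[:, None]) ** 2 + 1e-12        # distances to original data
    dbar = (xbar[None, :] - v[:, None]) ** 2          # distances to smoothed data
    e = d + alpha * dbar
    u = 1.0 / np.sum((e[:, None, :] / e[None, :, :]) ** (1.0 / (m - 1)), axis=1)
    w = u ** m
    v_new = (w @ (x + alpha * xbar)) / ((1 + alpha) * np.sum(w, axis=1))
    return u, v_new

img = np.array([[0., 0., 1., 1.],
                [0., 0., 1., 1.],
                [0., 0., 1., 1.]])
xbar = local_mean(img).ravel()                        # smoothed image, row-concatenated
u, v = ldfcm_step(img.ravel(), xbar, v=np.array([0.1, 0.9]))
```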

4.3. Spatial-based fuzzy C-means (SFCM)

An approach to incorporating local spatial data information into the standard FCM has been presented in [13]. The objective function of the SFCM algorithm is given by

$$J_{\mathrm{SFCM}} = \sum_{i=1}^{C} \sum_{n=1}^{N} u_{in}^m\, D_{in} \qquad (11)$$

where $D_{in}$ is a modified (weighted) distance between the nth pixel and the ith cluster center. This modified distance is computed from the standard distance $d_{in} = \|x_n - v_i\|^2$ as follows

$$D_{in} = (1 - \lambda)\, d_{in}\, f_{in} + \lambda\, d_{in} \qquad (12)$$

where $\lambda \in [0, 1]$ is an experimentally selected weight, and $f_{in}$ is a spatial or local data function given by [13]

$$f_{in} = \frac{\sum_{k \in N_n} d_{ik}}{\min_{j = 1, \ldots, C} \sum_{k \in N_n} d_{jk}} \qquad (13)$$

It is obvious from (12) that with $\lambda = 1$, the SFCM clustering method reduces to the standard FCM method. The spatial data function $f_{in}$ depends on the original distances of the set of pixels $N_n$ in the immediate neighborhood of the nth pixel. If none of the pixels in the neighborhood belongs to the ith cluster, $f_{in}$ is maximal, since the denominator is at its minimum while the numerator is at its maximum. This implies that $f_{in}$ causes $D_{in}$ to increase when the pixels in the immediate neighborhood of the nth pixel do not belong to the ith cluster. This increase of $D_{in}$ contributes to decreasing the membership $u_{in}$, so as to achieve and preserve the minimum of the SFCM function in (11).

The membership $u_{in}$ and the cluster center $v_i$ associated with the SFCM method are given by [13]

$$u_{in} = \frac{1}{\sum_{j=1}^{C} \left( D_{in}/D_{jn} \right)^{1/(m-1)}} \qquad (14)$$
$$v_i = \frac{\sum_{n=1}^{N} u_{in}^m x_n}{\sum_{n=1}^{N} u_{in}^m} \qquad (15)$$

It is obvious from (14) that, as in the standard FCM, the membership $u_{in}$ is inversely proportional to the weighted distance $D_{in}$; again, increasing $D_{in}$ when the immediate neighbors of the nth pixel do not belong to the ith cluster decreases the membership function $u_{in}$. From (15), however, it is clear that the SFCM algorithm computes the cluster center $v_i$ in the same way as the standard FCM method. Hence, additive noise can still reduce the accuracy of the cluster center $v_i$ obtained by the SFCM algorithm.
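The spatial weighting of (12) and (13) can be sketched as follows. This is my own NumPy phrasing; here the 3×3 neighbourhood sum includes the centre pixel, which is an assumption, since the exact neighbourhood convention of [13] is not reproduced above.

```python
import numpy as np

def sfcm_distances(d, shape, lam=0.5):
    """Weighted distances of Eq. (12): D_in = (1-lam)*d_in*f_in + lam*d_in.

    d: (C, N) squared distances; shape: (rows, cols) of the image.
    f_in (Eq. 13) is the 3x3 neighbourhood distance sum for cluster i,
    normalised by the minimum of that sum over all clusters."""
    C, N = d.shape
    rows, cols = shape
    s = np.empty_like(d)
    for i in range(C):                       # per-cluster neighbourhood sums
        pad = np.pad(d[i].reshape(rows, cols), 1, mode='edge')
        acc = np.zeros((rows, cols))
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                acc += pad[1 + di:1 + di + rows, 1 + dj:1 + dj + cols]
        s[i] = acc.ravel()
    f = s / (np.min(s, axis=0, keepdims=True) + 1e-12)   # Eq. (13), >= 1
    return (1 - lam) * d * f + lam * d                   # Eq. (12)

rng = np.random.default_rng(0)
d = rng.random((2, 12)) + 0.01               # toy distances for a 3x4 image, 2 clusters
D = sfcm_distances(d, (3, 4), lam=0.5)
```

With `lam=1.0` the weighted distances collapse back to `d`, matching the remark that the SFCM reduces to the standard FCM.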

4.4. HCM incorporating membership entropy

The membership entropy has been incorporated into the HCM for fuzzification. The membership entropy-based FCM (MEFCM) algorithm has the following objective function [17].

$$J_{\mathrm{MEFCM}} = J_{\mathrm{HCM}} + \beta \sum_{i=1}^{C} \sum_{n=1}^{N} \left[ u_{in} \log u_{in} + \left( 1 - u_{in} \right) \log \left( 1 - u_{in} \right) \right] \qquad (16)$$

where $\beta > 0$ is an experimentally selected weight that controls the fuzziness induced by the entropy term. U must still be constrained to satisfy (5). It can be shown that the membership and the cluster center that minimize (16) are respectively given by [17]

$$u_{in} = \frac{1}{\sum_{j=1}^{C} \dfrac{\exp\left( d_{in}/\beta \right) + 1}{\exp\left( d_{jn}/\beta \right) + 1}} \qquad (17)$$
$$v_i = \frac{\sum_{n=1}^{N} u_{in} x_n}{\sum_{n=1}^{N} u_{in}} \qquad (18)$$

It is obvious so far that the membership function of the nth entity provided by the FCM, HCM and MEFCM algorithms depends on the inverse of the squared Euclidean distance $d_{in} = \|x_n - v_i\|^2$, which is a function of $x_n$ only; no data or membership information from the clustering entity's neighbors is involved. Hence, the FCM, HCM and MEFCM algorithms miss important local spatial data and membership information. Thus additive noise can degrade $x_n$, $v_i$ and $d_{in}$, thereby biasing the membership of a degraded entity toward a false cluster.
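The MEFCM membership (17) is essentially a normalised logistic weighting of the distances; a minimal sketch (names are mine):

```python
import numpy as np

def mefcm_membership(d, beta):
    """Membership of Eq. (17): u_in proportional to 1/(exp(d_in/beta)+1),
    normalised so each column sums to 1 over the clusters."""
    a = 1.0 / (np.exp(d / beta) + 1.0)        # logistic-style weight per cluster
    return a / np.sum(a, axis=0, keepdims=True)

# toy distances for 2 clusters, 2 pixels: pixel 0 close to cluster 0, pixel 1 to cluster 1
d = np.array([[0.0, 4.0],
              [4.0, 0.0]])
u = mefcm_membership(d, beta=2.0)
```

As the text notes for the fuzziness weight, a very large `beta` flattens the memberships toward 1/C.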

5. HCM incorporating local membership KL divergence

In [18], an approach to incorporating local spatial membership information into the HCM algorithm has been presented. By adding the Kullback-Leibler (KL) divergence between the membership function of an entity and the locally smoothed membership in its immediate spatial neighborhood, the modified objective function, called the local membership KL divergence-based FCM (LMKLFCM), is given by [18, 19, 20, 21, 22]

$$J_{\mathrm{LMKLFCM}} = J_{\mathrm{HCM}} + \gamma \left[ \sum_{i=1}^{C} \sum_{n=1}^{N} u_{in} \log \frac{u_{in}}{\pi_{in}} + \sum_{i=1}^{C} \sum_{n=1}^{N} \bar{u}_{in} \log \frac{\bar{u}_{in}}{\bar{\pi}_{in}} \right] \qquad (19)$$

where $\gamma$ is a weighting parameter experimentally selected to control the fuzziness induced by the second term in (19), $\bar{u}_{in} = 1 - u_{in}$ is the complement of the membership function $u_{in}$, and $\pi_{in}$ and $\bar{\pi}_{in}$ are the local (moving) averages of the membership $u_{in}$ and the complement membership $\bar{u}_{in}$, respectively. These local membership and membership-complement averages are computed by [18, 19, 20, 21, 22]

$$\pi_{in} = \frac{1}{N_K} \sum_{k \in N_n;\, k \neq n} u_{ik} \qquad (20)$$
$$\bar{\pi}_{in} = \frac{1}{N_K} \sum_{k \in N_n;\, k \neq n} \left( 1 - u_{ik} \right) = 1 - \pi_{in} \qquad (21)$$

where $N_n$ is the set of entities/pixels falling in a square window around the nth pixel and $N_K$ is its cardinality. All entities in the window are weighted equally by $w_{pq}^u = 1/N_K$. Other windows, such as a Gaussian one, can be used provided that the weight of the window center is 0 and the remaining weights sum to unity. The first term in (19) provides hard cluster labeling: it pushes the membership function toward 0 or 1. The KL membership and membership-complement divergences, in addition to providing a fuzzification approach to HCM clustering, measure the proximity between the membership of a pixel in a certain cluster and the local average of the membership over the immediately neighboring pixels in that cluster. They thus push the membership function toward the locally smoothed membership function $\pi_{in}$. This can smooth out additive noise and bias the solution toward piecewise-homogeneous labels, which leads to a segmented image with piecewise-homogeneous regions.

The minimization of the objective function $J_{\mathrm{LMKLFCM}}$ in (19) yields $u_{in}$ and $v_i$ given, respectively, by [18]

$$u_{in} = \frac{1}{\sum_{j=1}^{C} \dfrac{\pi_{jn} \left[ \left( 1 - \pi_{in} \right) \exp\left( d_{in}/\gamma \right) + \pi_{in} \right]}{\pi_{in} \left[ \left( 1 - \pi_{jn} \right) \exp\left( d_{jn}/\gamma \right) + \pi_{jn} \right]}} = \delta_{in}\, \pi_{in} \qquad (22)$$
$$v_i = \frac{\sum_{n=1}^{N} u_{in} x_n}{\sum_{n=1}^{N} u_{in}} \qquad (23)$$

It is obvious from (22) that $u_{in}$ is proportional to $\pi_{in}$, and the proportionality factor $\delta_{in}$ is inversely related to the entity's distance $d_{in}$; the maximum $\delta_{kn}$ occurs when $d_{kn} = 0$.

It is clear that if $\gamma \to \infty$, then $u_{in} = \pi_{in} / \sum_{j=1}^{C} \pi_{jn}$. In this case the resulting membership is independent of the data to be clustered but depends on the initial value of the membership matrix $U^{(0)}$ and on the smoothing scheme. If $u_{in}^{(0)}$ is generated from a random process with values greater than zero, then $u_{in}^{(t)}$, as a function of the iteration number t, converges because of the recursive averaging and normalizing to a normally distributed variable with mean $1/C = E[u_{in}^{(t)}] = E[\pi_{in}] / \sum_{j=1}^{C} E[\pi_{jn}]$, which means a maximally fuzzy membership function. This has been verified experimentally using a synthetic image of 4 clusters and $\gamma = 10^{10}$. Finally, as shown by (23), the computation of the cluster center $v_i$ is still independent of the local original data.
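The local membership average (20) and the membership update (22) can be sketched as follows. This is an illustrative NumPy version; the window handling at image borders uses edge replication, an assumption not specified above.

```python
import numpy as np

def local_membership_avg(u, shape):
    """Eq. (20): moving average of u over a 3x3 window excluding the centre."""
    C = u.shape[0]
    rows, cols = shape
    pi = np.empty_like(u)
    for i in range(C):
        pad = np.pad(u[i].reshape(rows, cols), 1, mode='edge')
        acc = np.zeros((rows, cols))
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                if di == 0 and dj == 0:
                    continue                          # centre weight is zero
                acc += pad[1 + di:1 + di + rows, 1 + dj:1 + dj + cols]
        pi[i] = acc.ravel() / 8.0
    return pi

def lmklfcm_membership(d, pi, gamma):
    """Eq. (22): pi-weighted logistic of the distances, normalised over clusters."""
    eps = 1e-12
    a = pi / ((1 - pi) * np.exp(d / gamma) + pi + eps)   # unnormalised membership
    return a / (np.sum(a, axis=0, keepdims=True) + eps)

# toy case: neutral local averages (pi = 0.5) and two well-assigned pixels
d = np.array([[0.0, 4.0],
              [4.0, 0.0]])
pi = np.full((2, 2), 0.5)
u = lmklfcm_membership(d, pi, gamma=2.0)
```

As the text observes, for very large `gamma` the data term drops out and the membership collapses to the normalised local average `pi`.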

6. HCM incorporating local data and membership KL divergence

To incorporate local spatial data into the LMKLFCM objective function in (19), the following objective function has been proposed in [18].

$$J_{\mathrm{LDMKLFCM}} = \sum_{i=1}^{C} \sum_{n=1}^{N} u_{in} \left( d_{in} + \alpha \bar{d}_{in} \right) + \gamma \left[ \sum_{i=1}^{C} \sum_{n=1}^{N} u_{in} \log \frac{u_{in}}{\pi_{in}} + \sum_{i=1}^{C} \sum_{n=1}^{N} \bar{u}_{in} \log \frac{\bar{u}_{in}}{\bar{\pi}_{in}} \right] \qquad (24)$$

Therefore, similar to (22) and (23), the membership function $u_{in}$ and the cluster center $v_i$ are, respectively, given by [18]

$$u_{in} = \frac{1}{\sum_{j=1}^{C} \dfrac{\pi_{jn} \left[ \left( 1 - \pi_{in} \right) \exp\left( \left( d_{in} + \alpha \bar{d}_{in} \right)/\gamma \right) + \pi_{in} \right]}{\pi_{in} \left[ \left( 1 - \pi_{jn} \right) \exp\left( \left( d_{jn} + \alpha \bar{d}_{jn} \right)/\gamma \right) + \pi_{jn} \right]}} \qquad (25)$$
$$v_i = \frac{\sum_{n=1}^{N} u_{in} \left( x_n + \alpha \bar{x}_n \right)}{(1 + \alpha) \sum_{n=1}^{N} u_{in}} \qquad (26)$$

It is obvious that the LDMKLFCM algorithm in (24)–(26) provides a membership that depends on both local spatial data and membership information, while the cluster center depends on the locally smoothed data. Thus the algorithm has a twofold mechanism for handling additive noise.
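One LDMKLFCM sweep, combining (25) and (26), might be sketched as follows. The smoothed data x̄ is assumed to be computed beforehand, as in Section 4.2; here γ is set small because the toy intensities lie in [0, 1], whereas the chapter's experiments use γ on the order of 1000 for image-scale distances. Names are mine.

```python
import numpy as np

def ldmklfcm_update(x, xbar, v, pi, alpha=0.5, gamma=0.1):
    """One LDMKLFCM sweep: membership Eq. (25), centers Eq. (26)."""
    eps = 1e-12
    d = (x[None, :] - v[:, None]) ** 2                   # distances to original data
    dbar = (xbar[None, :] - v[:, None]) ** 2             # distances to smoothed data
    e = np.exp((d + alpha * dbar) / gamma)               # combined data term
    a = pi / ((1 - pi) * e + pi + eps)                   # Eq. (25), unnormalised
    u = a / (np.sum(a, axis=0, keepdims=True) + eps)
    # Eq. (26): centers from the locally smoothed data (membership enters linearly)
    v_new = (u @ (x + alpha * xbar)) / ((1 + alpha) * (np.sum(u, axis=1) + eps))
    return u, v_new

x = np.array([0.0, 0.0, 1.0, 1.0])
xbar = x.copy()                  # assume the image was smoothed beforehand
pi = np.full((2, 4), 0.5)        # neutral local membership averages
v = np.array([0.2, 0.8])
u, v_new = ldmklfcm_update(x, xbar, v, pi, alpha=0.5, gamma=0.1)
```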

7. Simulation results

This simulation aims at examining the performance of the conventional FCM, the membership entropy-based FCM (MEFCM), the spatial distance weighted FCM (SFCM), the local membership KL divergence-based FCM (LMKLFCM) and the local data and membership KL divergence-based FCM (LDMKLFCM) algorithms. It is to be noted that all the algorithms can be implemented almost identically to the pseudo code in Table 1 by replacing steps 3 and 4 with the corresponding membership-function and cluster-center computations of each algorithm.

7.1. Clustering validity

To measure the performance of the fuzzy clustering algorithms, several quantitative measures or indices have been adopted in [23, 25] and references therein. Among these measures are the partition coefficient $V_{PC}$ and partition entropy $V_{PE}$ indices of Bezdek and the Xie-Beni index $V_{XB}$; the first two are given respectively by

$$V_{PC} = \frac{1}{N} \sum_{n=1}^{N} \sum_{i=1}^{C} u_{in}^2 \qquad (27)$$
$$V_{PE} = -\frac{1}{N} \sum_{n=1}^{N} \sum_{i=1}^{C} u_{in} \log u_{in} \qquad (28)$$

The closer $V_{PC}$ is to 1, the better the performance, since under the constraint $\sum_{i=1}^{C} u_{in} = 1$ the value 1 is reached only by a crisp (0/1) membership. The closer $V_{PE}$ is to 0, the better the performance, since this means less fuzziness of the membership and thus well-separated clusters.
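The two indices (27) and (28) can be computed directly from the membership matrix; a short sketch with the two extreme cases:

```python
import numpy as np

def partition_coefficient(u):
    """V_PC of Eq. (27): mean of squared memberships; 1 means crisp."""
    return np.sum(u ** 2) / u.shape[1]

def partition_entropy(u):
    """V_PE of Eq. (28): mean membership entropy; 0 means crisp."""
    eps = 1e-12
    return -np.sum(u * np.log(u + eps)) / u.shape[1]

# crisp membership: V_PC = 1, V_PE = 0
u_crisp = np.array([[1., 1., 0., 0.],
                    [0., 0., 1., 1.]])
# maximally fuzzy membership: V_PC = 1/C, V_PE = log(C)
u_fuzzy = np.full((2, 4), 0.5)
```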

For synthetic images, in addition to the above clustering validity measures, several further performance measures have also been used, such as the accuracy, sensitivity and specificity, given respectively by

$$\text{Acc.} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (29)$$
$$\text{Sen.} = \frac{TP}{TP + FN} \qquad (30)$$
$$\text{Spe.} = \frac{TN}{TN + FP} \qquad (31)$$

where T, F, P, and N stand for true, false, positive, and negative, respectively. The TP, FP, TN, and FN are computed as follows. While generating the synthetic image, the ground-truth labels are formulated as the logical matrix given by [23]

$$L_{in} = \begin{cases} 1, & \text{if } x_n \in \mu_i \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, C, \; n = 1, 2, \ldots, N \qquad (32)$$

where $x_n$ is the noise-free pixel of the synthetic image, and 1 and 0 represent True and False, respectively. After segmentation, the estimated labels are likewise formulated as logical matrices generated by [20]

$$\hat{L}_{kn} = \begin{cases} 1, & k = \arg\max_i u_{in} \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, C, \; n = 1, 2, \ldots, N \qquad (33)$$

Finally, the TP, TN, FP, and FN are given by [20].

$$TP = \sum_{i=1}^{C} \sum_{n=1}^{N} \hat{L}_{in} \wedge L_{in}; \quad TN = \sum_{i=1}^{C} \sum_{n=1}^{N} \bar{\hat{L}}_{in} \wedge \bar{L}_{in}; \quad FP = \sum_{i=1}^{C} \sum_{n=1}^{N} \hat{L}_{in} \wedge \bar{L}_{in}; \quad FN = \sum_{i=1}^{C} \sum_{n=1}^{N} \bar{\hat{L}}_{in} \wedge L_{in} \qquad (34)$$

where the overbar denotes the logical complement.
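Equations (32)–(34) and the derived rates can be sketched as follows; the logical label matrices here are a hand-made toy example, not data from the chapter's experiments.

```python
import numpy as np

def confusion_counts(L_true, L_hat):
    """TP/TN/FP/FN of Eq. (34) from logical label matrices of shape (C, N)."""
    L_true = L_true.astype(bool)
    L_hat = L_hat.astype(bool)
    TP = np.sum(L_hat & L_true)        # estimated 1, truth 1
    TN = np.sum(~L_hat & ~L_true)      # estimated 0, truth 0
    FP = np.sum(L_hat & ~L_true)       # estimated 1, truth 0
    FN = np.sum(~L_hat & L_true)       # estimated 0, truth 1
    return TP, TN, FP, FN

# ground truth (Eq. 32): pixels 0,1 in cluster 0; pixels 2,3 in cluster 1
L_true = np.array([[1, 1, 0, 0],
                   [0, 0, 1, 1]])
# estimated labels (Eq. 33): pixel 1 is misclassified into cluster 1
L_hat = np.array([[1, 0, 0, 0],
                  [0, 1, 1, 1]])
TP, TN, FP, FN = confusion_counts(L_true, L_hat)
acc = (TP + TN) / (TP + TN + FP + FN)   # accuracy, Eq. (29)
sen = TP / (TP + FN)                    # sensitivity
spe = TN / (TN + FP)                    # specificity
```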

7.2. Artificial image

In this simulation, the artificial (synthetic) noise-free image shown in Figure 1(a) is degraded by adding zero-mean white Gaussian noise (WGN) with different variances. The noisy image shown in Figure 1(b) corresponds to a noise variance of 0.08. We have studied the performance of the five algorithms, namely the standard FCM, the membership entropy-based FCM (MEFCM), the spatial distance weighted FCM (SFCM), the local membership KLFCM (LMKLFCM) and the local data and membership KLFCM (LDMKLFCM), in segmenting these noisy images with m = 2 and C = 4. The parameters for the algorithms have been selected via simulation as β = 1000 for MEFCM; λ = 0.5 for SFCM; γ = 1000 for LMKLFCM; and γ = 1000 and α = 0.5 for LDMKLFCM. For the computation of the locally smoothed data x̄_n, a neighboring window of size 3×3 has been used; the same spatial window has been used for the computation of the locally smoothed membership function π_in. The initial values of the membership functions U and the cluster centers V are generated from a uniformly distributed random process with mean 0.5 and mean equal to the image mean, respectively. We have collected results from 25 Monte Carlo runs of each algorithm. In each run, the initial values of U and V for the FCM are new random samples, while those of the other algorithms are generated by executing a few iterations of the FCM algorithm. Simulation results, not included for space limitations, have shown that the algorithms provide further improvement with initial values generated by the FCM algorithm than with randomly generated ones. Also, in each run, a new random sample of WGN is used to generate the noisy images. Figure 1(c–g) shows the clustered images generated by the five algorithms for the 0.08 noise variance. These clustered images show that the LMKLFCM and LDMKLFCM algorithms produce images with less noise, i.e., fewer misclassified pixels. Moreover, the LDMKLFCM algorithm offers the best clustered image.
Table 2 summarizes the averages and standard deviations (μ ± σ) of the performance measures. The LMKLFCM and LDMKLFCM algorithms show the maximum V_PC and the minimum V_PE. The averages of the accuracy, sensitivity and specificity of the five algorithms have been studied against the noise variance; Figure 2 shows these measures versus noise variance. It is clear that both the LMKLFCM and LDMKLFCM algorithms provide the superior performance among the five algorithms, and the LDMKLFCM algorithm shows more noise robustness.

Figure 1.

Clustering of the synthetic image: (a) noise-free image; (b) the noise-free image plus zero-mean WGN with 0.08 variance; (c) FCM; (d) MEFCM; (e) SFCM; (f) LMKLFCM; (g) LDMKLFCM. It is evident that the clustered images in (f) and (g) have fewer misclassified pixels, which means that noisy pixels are correctly clustered. Clustering validation measures are summarized in Table 2.

Algorithm | Image        | V_PC            | V_PE
----------|--------------|-----------------|----------------
FCM       | Synthetic    | 0.8105 ± 0.0007 | 0.3517 ± 0.0012
FCM       | Simulated MR | 0.7921 ± 0.0011 | 0.3986 ± 0.0020
FCM       | Real MR      | 0.8930 ± 0.0140 | 0.1998 ± 0.0240
FCM       | Lena         | 0.8286 ± 0.0004 | 0.2824 ± 0.0006
SFCM      | Synthetic    | 0.8370 ± 0.0010 | 0.3017 ± 0.0017
SFCM      | Simulated MR | 0.8674 ± 0.0009 | 0.2409 ± 0.0014
SFCM      | Real MR      | 0.9204 ± 0.0006 | 0.1440 ± 0.0012
SFCM      | Lena         | 0.8936 ± 0.0006 | 0.1786 ± 0.0009
MEFCM     | Synthetic    | 0.8616 ± 0.0012 | 0.2271 ± 0.0019
MEFCM     | Simulated MR | 0.8873 ± 0.0012 | 0.1841 ± 0.0018
MEFCM     | Real MR      | 0.9602 ± 0.0113 | 0.0650 ± 0.0183
MEFCM     | Lena         | 0.9268 ± 0.0004 | 0.1198 ± 0.0007
LMKLFCM   | Synthetic    | 0.9853 ± 0.0011 | 0.0270 ± 0.0028
LMKLFCM   | Simulated MR | 0.8958 ± 0.0088 | 0.1721 ± 0.0146
LMKLFCM   | Real MR      | 0.9625 ± 0.0087 | 0.0441 ± 0.0128
LMKLFCM   | Lena         | 0.9609 ± 0.0012 | 0.0643 ± 0.0020
LDMKLFCM  | Synthetic    | 0.9874 ± 0.0011 | 0.0227 ± 0.0022
LDMKLFCM  | Simulated MR | 0.9234 ± 0.0030 | 0.1258 ± 0.0049
LDMKLFCM  | Real MR      | 0.9519 ± 0.0016 | 0.0604 ± 0.0025
LDMKLFCM  | Lena         | 0.9730 ± 0.0026 | 0.0446 ± 0.0026

Table 2.

Clustering validation measures for synthetic and real-world images.

Figure 2.

The average accuracy (a), sensitivity (b), and specificity (c) versus noise variance for the FCM, MEFCM, SFCM, LMKLFCM and LDMKLFCM algorithms. The proposed LMKLFCM and LDMKLFCM algorithms provide the superior performance among the five algorithms, and the LDMKLFCM algorithm shows more noise robustness.

7.3. Magnetic resonance image (MRI)

A simulated MRI from [26], illustrated in Figure 3(a), has been used as a noise-free image. It has been degraded by adding white Gaussian noise (WGN) with zero mean and 0.005 variance to generate the noisy MRI illustrated in Figure 3(b). This noisy MRI has been clustered by the five algorithms. The parameters for all algorithms have been taken similar to those of the synthetic image simulation, except β = 200 for the MEFCM algorithm and γ = 1000 for both the LMKLFCM and LDMKLFCM algorithms. We have also executed 25 runs of each algorithm. The initial values of u_in and v_i have been generated and adjusted as explained in the synthetic image simulation. Figure 3(c–g) shows the resulting clustered images provided by the five algorithms in a given run. Table 2 shows the averages and standard deviations (μ ± σ) of the performance measures V_PC and V_PE of the five algorithms. It is obvious that the LMKLFCM and LDMKLFCM algorithms provide the segmented images with less noise, i.e., fewer misclassified pixels, along with the maximum V_PC and the minimum V_PE.

Figure 3.

Clustering of simulated MRI: (a) noise-free MRI; (b) the MRI in (a) plus zero-mean WGN with 0.005 variance. Segmented images by: (c) FCM; (d) MEFCM; (e) SFCM; (f) LMKLFCM; (g) LDMKLFCM. Obviously, the segmented images in (f) and (g), provided by the LMKLFCM and LDMKLFCM algorithms respectively, have less noise, which means that the noisy pixels are correctly clustered. The clustering validation measures summarized in Table 2 show that the LMKLFCM and LDMKLFCM provide the maximum V_PC and the minimum V_PE.

A real MRI from [27], shown in Figure 4(a), has been considered as a noise-free image. To generate the noisy MRI shown in Figure 4(b), salt & pepper noise with 0.050 variance has been added. The noisy MRI has been clustered by the FCM, SFCM, MEFCM, LMKLFCM and LDMKLFCM algorithms. The parameters for all algorithms have been taken similar to those of the synthetic image simulation, except β = 300 for the MEFCM algorithm and γ = 800 for both the LMKLFCM and LDMKLFCM algorithms. We have also obtained the results of 25 runs of each algorithm. The initial values of u_in and v_i have been generated and adjusted as mentioned in the synthetic image simulation. Figure 4(c–g) shows the segmented images provided by the five algorithms in a given run, while Table 2 summarizes the averages and standard deviations (μ ± σ) of the performance measures. It is obvious that the proposed LMKLFCM and LDMKLFCM algorithms provide the segmented images with less noise, i.e., fewer misclassified pixels, along with the maximum V_PC and the minimum V_PE.

Figure 4.

Clustering of a real MRI example: (a) noise-free real MRI; (b) the image in (a) plus salt & pepper noise with 0.05 variance. Segmented images by: (c) FCM; (d) MEFCM; (e) SFCM; (f) LMKLFCM; (g) LDMKLFCM. Clearly, the segmented images in (f) and (g), generated by the LMKLFCM and LDMKLFCM algorithms respectively, have less noise. The clustering validation coefficients summarized in Table 2 show that the LMKLFCM and LDMKLFCM provide the maximum V_PC and the minimum V_PE.

7.4. Lena image

The popular Lena image shown in Figure 5(a) has been considered as a noise-free image example. The noisy Lena image shown in Figure 5(b) has been generated by adding WGN with zero mean and 0.01 variance. The parameters of the five algorithms have been set to values similar to those used in the previous simulations, except C = 2; β = 1000 for the MEFCM algorithm; γ = 2000 for the LMKLFCM algorithm; and γ = 2000 and α = 0.5 for the LDMKLFCM algorithm. We have also executed 25 Monte Carlo runs of each algorithm, as explained above. Figure 5(c–g) shows the resulting segmented images obtained by the five algorithms. Visual inspection of the segmented images shows that the LMKLFCM and LDMKLFCM algorithms provide the images with the fewest misclassified pixels. Table 2 shows the averages and standard deviations (μ ± σ) of the performance measures of the five algorithms. It is also clear that these two algorithms provide the maximum V_PC and the minimum V_PE.

Figure 5.

Segmentation of the Lena image: (a) noise-free image; (b) the image in (a) plus zero-mean WGN with 0.05 variance. It is obvious that the images in (f) and (g) have the fewest misclassified pixels. The clustering validation coefficients summarized in Table 2 show that the LMKLFCM and LDMKLFCM algorithms provide the superior V_PC and V_PE.

8. Conclusions

The hard C-means algorithm has been fuzzified by incorporating spatial local information into the objective function through two KL membership divergences. The first KL membership divergence measures the information proximity between the membership of each pixel and its local membership average over the pixel's neighborhood. The second measures the information proximity between the complement membership and its local average over the pixel's neighborhood. For regularization, local data information has been incorporated through an additional weighted hard C-means function in which the noisy image is replaced by a noise-reduced one. Incorporating both local data and local membership information biases the algorithm toward classifying each pixel in correlation with its immediate neighboring pixels. Results of segmentation of synthetic, simulated medical and real-world images have shown that the proposed local membership KL divergence-based FCM (LMKLFCM) and local data and membership KL divergence-based FCM (LDMKLFCM) algorithms outperform several widely used FCM-related algorithms. Moreover, the average runtimes of all algorithms have been measured via simulation. In all runs, all algorithms start from the same randomly generated initial conditions, as mentioned in the simulation section, and stop at the same fixed point. The LDMKLFCM, LMKLFCM, standard FCM, MEFCM, and SFCM algorithms provided average runtimes of 1.5, 1.75, 1, 0.9 and 1 s, respectively. The simulations were performed using MATLAB R2013b under Windows on an Intel(R) Core(TM) i3 CPU M370 at 2.4 GHz with 4 GB RAM.

Acknowledgments

The author would like to thank the funder of the open access publication of this chapter. The author would also like to thank Prof. H. Selim, Dr. A. AbdelFattah and Eng. G. Gendy for their contributions to this work.

Conflict of interest

No potential conflicts of interest to report.

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite this chapter:

Reda R. Gharieb (August 1st 2018). Incorporating Local Data and KL Membership Divergence into Hard C-Means Clustering for Fuzzy and Noise-Robust Data Segmentation. In: Harun Pirim (Ed.), Recent Applications in Data Clustering. IntechOpen. DOI: 10.5772/intechopen.74514.
