
A Clustering Approach Based on Charged Particles

By Yugal Kumar, Sumit Gupta, Dharmender Kumar and Gadadhar Sahoo

Submitted: October 20th, 2015. Reviewed: March 15th, 2016. Published: September 21st, 2016.

DOI: 10.5772/63081


Abstract

In pattern recognition, clustering is a powerful technique that can be used to find groups of similar objects in a given dataset. It has proven its importance in various domains such as bioinformatics, machine learning, pattern recognition, document clustering, and so on. However, in clustering, it is difficult to determine the optimal cluster centers for a given set of data. So, in this chapter, a new method called magnetic charged system search (MCSS) is applied to determine the optimal cluster centers. This method is based on the behavior of charged particles. The proposed method employs the electric force and the magnetic force to initiate the local search, while Newton's second law of motion is employed for the global search. The performance of the proposed algorithm is tested on several datasets taken from the UCI repository and compared with other existing methods like K-means, GA, PSO, ACO, and CSS. The experimental results prove the applicability of the proposed method in the clustering domain.

Keywords

  • clustering
  • charged particles
  • electric force
  • magnetic force
  • Newton's second law of motion

1. Introduction

Clustering is an unsupervised technique that can be applied to understand the organization of data. The basic principle of clustering is to partition a set of objects into a set of clusters such that the objects within a cluster share more similar characteristics with each other than with the objects in other clusters. A pre-specified criterion is used to measure the similarity between the objects. Clustering requires no training of the data; it deals only with the internal structure of the data and uses a similarity criterion to group the objects into different clusters. Due to this, it is also known as an unsupervised classification technique. Clustering becomes an NP-hard problem when the number of clusters is greater than three. Consider a set S = {A1, A2, A3, …, AN}, Ai ∈ S, consisting of N data objects, and another set P = {B1, B2, …, BK} consisting of K cluster centers. The objective of clustering is to assign each data object in S to one of the cluster centers Bj in P such that the value of the objective function is minimized. The objective function is defined as the sum of squared Euclidean distances between each data object Ai and its nearest cluster center Bj, and it can be described as follows:

$$D = \sum_{i=1}^{N} \min_j \left\| A_i - B_j \right\|^2, \qquad j = 1, 2, 3, \ldots, K$$

where Ai denotes the ith data object, Bj denotes the jth cluster center, and D denotes the total distance of the data objects from their cluster centers. It is also noted that:

  • Each cluster consists of at least one data object:

    Bj ≠ Ø for all j ∈ {1, 2, 3, …, K}, where Bj represents the jth cluster and K denotes the total number of clusters.

  • Each data object is allotted to only one cluster:

    Bi ∩ Bj = Ø for all i ≠ j and i, j ∈ {1, 2, 3, …, K}, i.e., the ith and jth clusters do not contain the same data objects.

  • Each data object is allocated to a cluster:

    $$\bigcup_{j=1}^{K} B_j = S$$ where S represents the set of data objects.

Hence, the aim of a partition-based clustering algorithm is to determine the K cluster centers in a given dataset. Here, the MCSS algorithm is applied to determine the optimal cluster centers in a dataset.
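
To make the objective concrete, a minimal NumPy sketch of D is given below; the function name and array layout are illustrative and are not part of the chapter.

```python
import numpy as np

def clustering_objective(S, centers):
    """Sum of squared Euclidean distances between each data object in S
    (an N x d array) and its nearest cluster center (a K x d array)."""
    # Pairwise squared distances between objects and centers, shape (N, K)
    d2 = ((S[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Each object contributes the squared distance to its closest center
    return d2.min(axis=1).sum()
```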

Clustering has successfully proven its importance in many applications. Some of these are pattern recognition [1, 2], image processing [3–6], process monitoring [7], machine learning [8, 9], quantitative structure–activity relationships [10], document retrieval [11], bioinformatics [12], image segmentation [13], construction management [14], marketing [15, 16], and healthcare [17, 18]. Broadly, clustering algorithms can be divided into two categories: partition-based clustering algorithms and hierarchical clustering algorithms. Partition-based clustering algorithms divide a dataset into k clusters on the basis of some fitness function [19]. In hierarchical clustering algorithms, the clustering of data is represented in the form of a tree, known as a dendrogram. Hierarchical clustering algorithms do not require any prior knowledge about the number of clusters in a dataset, but their main drawback is a lack of dynamism, as the objects are tightly bound to their respective clusters [20–23]. However, our research is focused on partition clustering, which decomposes the data into several disjoint clusters that are optimal in terms of some predefined criteria. From the literature, it is found that the K-means algorithm is the oldest, most popular, and most extensively used partition-based algorithm for data clustering. It is easy to implement, fast, simple in structure, and has linear time complexity [24, 25]. In the K-means algorithm, a dataset is decomposed into a predefined number of clusters, and the data objects are assigned to distinct clusters based on the Euclidean distance [25]. Nowadays, heuristic approaches have gained wide popularity for solving the clustering problem and have become quite successful. Numerous researchers have applied heuristic approaches in the field of clustering. Some of these are simulated annealing [26], tabu search [27, 28], the genetic algorithm [29–32], particle swarm optimization [33, 34], ant colony optimization [35], the artificial bee colony algorithm [36, 37, 56], the charged system search algorithm [38, 39], cat swarm optimization [40–42, 57], the teacher learning based optimization method [43, 44], the gravitational search algorithm [45, 46], and the binary search based clustering algorithm [47].

2. Magnetic charged system search (MCSS) algorithm

The magnetic charged system search (MCSS) algorithm is a recent meta-heuristic algorithm based on electromagnetic theory [48]. According to electromagnetic theory, moving charged particles produce an electric field as well as a magnetic field. The movement of charged particles in a magnetic field exerts a magnetic force on the other charged particles, and the resultant force is proportional to the charge (mass) and speed of the charged particles. The magnitude and direction of the resultant force depend on two factors: first, the velocity of the charged particles and, second, the magnitude and direction of the magnetic field. Thus, the MCSS algorithm is a further advancement of the charged system search (CSS) algorithm using the concepts of electromagnetic theory. The difference between CSS and MCSS is that the CSS algorithm considers only the electric force to determine the movement of CPs, while MCSS utilizes both forces (electric and magnetic) to determine the same. Moreover, the magnetic force in MCSS can be either attractive or repulsive in nature, which allows the algorithm to generate more promising solutions in the search space. In CSS, by contrast, the force is attractive by nature, so the performance of the algorithm can be affected when the number of CPs is small. The addition of the magnetic force to the already existing electric force thus enhances both the exploration and exploitation capabilities of CSS and makes the algorithm more realistic. Hence, the inclusion of the magnetic force in the charged system search (CSS) algorithm results in the formation of a new algorithm known as magnetic charged system search (MCSS). The main steps of the MCSS algorithm are as follows.

Step 1: Initialization

The algorithm starts by identifying the initial positions of the charged particles (CPs) in d-dimensional space in random order; the initial velocities of the CPs are set to zero. To determine the initial positions of the CPs, equation 1 is used. A variable called the charge memory (CM) is used to store the best results.

$$C_k = X_j^{\,min} + r_j \cdot \frac{X_j^{\,max} - X_j^{\,min}}{K}, \qquad j = 1, 2, \ldots, d; \; k = 1, 2, \ldots, K \qquad (E1)$$

where Ck denotes the kth cluster center for a given dataset, rj is a random number in the range [0, 1], Xjmin and Xjmax denote the minimum and maximum values of the jth attribute of the dataset, and K represents the total number of clusters in the dataset.
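
A minimal sketch of this initialization step, assuming the dataset is held in an N × d NumPy array; the helper name is illustrative.

```python
import numpy as np

def init_charged_particles(S, K, rng=None):
    """Initial CP positions per equation 1; initial velocities are zero.
    S is an (N, d) data matrix, K the number of clusters."""
    rng = np.random.default_rng() if rng is None else rng
    x_min, x_max = S.min(axis=0), S.max(axis=0)   # X_j^min, X_j^max per attribute
    r = rng.random((K, S.shape[1]))               # r_j drawn from [0, 1]
    centers = x_min + r * (x_max - x_min) / K     # equation 1
    velocities = np.zeros_like(centers)           # Step 1: velocities start at zero
    return centers, velocities
```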

Step 2: Compute the total force (Ftotal) acting on the CPs.

The total force is the combination of the electric force and the magnetic force, and this force governs the movement of the CPs in d-dimensional space. It can be computed as follows:

  • Determine the electric force: when a CP moves in d-dimensional space, an electric field is produced around it, which exerts an electric force on the other CPs. This electric force is directly proportional to the magnitude of the charge and depends on the distance between the CPs. The magnitude of the electric force that one charged particle exerts on another can be measured using equation 2.

    $$E_{ik} = q_k \sum_{i,\, i \neq k} \left( \frac{q_i}{R^3}\, r_{ik} \cdot w_1 + \frac{q_i}{r_{ik}^2} \cdot w_2 \right) p_{ik} \left( X_i - C_k \right), \quad k = 1, 2, \ldots, K; \quad \begin{cases} w_1 = 1,\ w_2 = 0 & \text{if } r_{ik} < R \\ w_1 = 0,\ w_2 = 1 & \text{if } r_{ik} \geq R \end{cases} \qquad (E2)$$

In equation 2, qi and qk represent the fitness values of the ith and kth CPs, rik denotes the separation distance between the ith and kth CPs, w1 and w2 are two variables whose values are either 0 or 1, R represents the radius of the CPs, which is set to unity (it is assumed that each CP has a uniform volume charge density, although this changes in every iteration), and pik denotes the moving probability of each CP.

  • Determine the magnetic force: the movement of the CPs also produces a magnetic field along with the electric field. As a result, a magnetic force is imposed on the other CPs, and equation 3 is utilized to compute the magnitude of the magnetic force exerted by a CP on the other CPs. This force can be either positive or negative depending on the value of the average electric current of the previous iteration.

    $$M_{ik} = q_k \sum_{i,\, i \neq k} \left( \frac{I_i}{R^2}\, r_{ik} \cdot w_1 + \frac{I_i}{r_{ik}} \cdot w_2 \right) pm_{ik} \left( X_i - C_k \right), \quad k = 1, 2, \ldots, K; \quad \begin{cases} w_1 = 1,\ w_2 = 0 & \text{if } r_{ik} < R \\ w_1 = 0,\ w_2 = 1 & \text{if } r_{ik} \geq R \end{cases} \qquad (E3)$$

In equation 3, qk represents the fitness value of the kth CP, Ii is the average electric current, rik denotes the separation distance between the ith data instance and the kth CP, w1 and w2 are two variables whose values are either 0 or 1, R represents the radius of the CPs, which is set to unity, and pmik denotes the probability of magnetic influence between the ith data instance and the kth CP. In summary, the magnetic force can be either attractive or repulsive in nature, so more promising solutions can be generated during the search. The electric force, on the other hand, is always attractive in nature, which may limit the performance of the algorithm. Hence, to allow the electric force to act repulsively as well, a probability function is attached to it, and the total force acting on the CPs is finally computed using equation 4.

    $$F_{total} = p_r \cdot E_{ik} + M_{ik} \qquad (E4)$$

where pr denotes a probability value that determines whether the electric force (Eik) is repelling or attracting, and Eik and Mik represent the electric and magnetic forces exerted on the kth CP by the ith data instance.
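
The force computation of equations 2 to 4 can be sketched as follows. The chapter does not reproduce the formulas for the charges q_i, the probabilities p_ik and pm_ik, or the currents I_i (these are defined in [48]), so the sketch takes them as inputs; names, defaults, and the value of p_r are illustrative assumptions.

```python
import numpy as np

def total_force(X, C, q, q_cp, p, pm, I, R=1.0, pr=0.8, eps=1e-3):
    """Total force on each CP per equations 2-4.

    X (n, d): data instances      q (n,): instance charges (fitness based)
    C (K, d): CP positions        q_cp (K,): CP charges
    p, pm (n, K): moving / magnetic-influence probabilities
    I (n,): average electric currents
    """
    diff = X[:, None, :] - C[None, :, :]          # (X_i - C_k), shape (n, K, d)
    r = np.linalg.norm(diff, axis=2) + eps        # r_ik; eps prevents singularity
    near = r < R                                  # selects the w1 / w2 branch
    # Electric force magnitude, equation 2
    e = np.where(near, q[:, None] * r / R**3, q[:, None] / r**2) * p
    # Magnetic force magnitude, equation 3
    m = np.where(near, I[:, None] * r / R**2, I[:, None] / r) * pm
    # Sum the per-instance contributions acting on each CP
    E = (e[:, :, None] * diff).sum(axis=0)        # shape (K, d)
    M = (m[:, :, None] * diff).sum(axis=0)        # shape (K, d)
    return q_cp[:, None] * (pr * E + M)           # equation 4
```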

Step 3: Determine the new positions and velocities of the CPs.

Newton's second law of motion is applied to determine the movement of the CPs. The magnitude of the total force, combined with the Newtonian laws of motion, is used to produce the next positions and velocities of the CPs, which can be computed using equations 5 and 6.

$$C_k^{\,new} = rand_1 \cdot Z_a \cdot \frac{F_{total}}{m_k} \cdot \Delta t^2 + rand_2 \cdot Z_v \cdot V_k^{\,old} \cdot \Delta t + C_k^{\,old} \qquad (E5)$$

where rand1 and rand2 are two random variables in the range [0, 1], Za and Zv act as control parameters governing the influence of the total force (Ftotal) and the previous velocity, Vk old denotes the velocity of the kth CP, mk is the mass of the kth CP, which is equal to qk, Δt represents the time step, which is set to 1, and Ck old denotes the current position of the kth CP.

$$V_k^{\,new} = \frac{C_k^{\,new} - C_k^{\,old}}{\Delta t} \qquad (E6)$$

where Vk new denotes the new velocity of the kth CP, Ck old and Ck new represent the old and new positions of the kth CP, and Δt represents the time step.
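
A sketch of the motion update of equations 5 and 6 under the same assumptions as the earlier snippets; the default values of Z_a and Z_v are placeholders, since the chapter does not state them.

```python
import numpy as np

def move_particles(C_old, V_old, F_total, m, za=0.5, zv=0.5, dt=1.0, rng=None):
    """New CP positions and velocities per equations 5 and 6.
    m (K,) is the CP mass (equal to the charge q_k); za and zv stand in for
    the control parameters Z_a and Z_v; dt is the time step (1 in the chapter)."""
    rng = np.random.default_rng() if rng is None else rng
    rand1 = rng.random(C_old.shape)               # rand_1 in [0, 1]
    rand2 = rng.random(C_old.shape)               # rand_2 in [0, 1]
    accel = F_total / m[:, None]                  # Newton's second law: a = F / m
    C_new = rand1 * za * accel * dt**2 + rand2 * zv * V_old * dt + C_old
    V_new = (C_new - C_old) / dt                  # equation 6
    return C_new, V_new
```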

Step 4: Update the charge memory (CM)

CPs with better objective function values replace the worst CPs in the CM, and the positions of the new CPs are stored in the CM.

Step 5: Termination condition

If the maximum number of iterations is reached, stop the algorithm and output the optimal cluster centers; otherwise, repeat steps 2–4.

2.1. Pseudo code of MCSS algorithm for clustering

This section summarizes the pseudo code of the MCSS algorithm for clustering tasks; a condensed sketch in code follows the listing.

Step 1: Load the dataset and initialize the parameters of the MCSS algorithm.

Step 2: Initialize the positions and velocities of the charged particles (CPs).

Step 3: Compute the value of the objective function using equation 7 and assign the data instances to the clusters using the minimum value of the objective function:

$$d_{ik} = \sum_{k=1}^{K} \sum_{i=1}^{n} \sqrt{\sum_{j=1}^{d} \left( X_{ji} - C_{jk} \right)^2} \qquad (E7)$$

where Xji denotes the jth attribute of the ith data instance, Cjk represents the jth attribute of the kth CP, and dik denotes the Euclidean distance of the ith data instance from the kth CP.

Step 4: Compute the masses of the initially positioned CPs.

Step 5: Store the positions of the initial CPs (Ck) in a variable called the charge memory (CM).

Step 6: While the termination conditions are not met, compute the value of the electric force (Eik) for each CP as follows:

  Step 6.1: Calculate the value of the moving probability (pik) for each charged particle Ck.

  Step 6.2: Compute the fitness qi of each instance.

  Step 6.3: Compute the separation distances (rik) of the CPs.

  Step 6.4: Compute the value of (Xi − Ck).

  Step 6.5: Compute the value of the electric force (Eik) for each CP.

Step 7: Determine the value of the magnetic force (Mik) for each CP:

  Step 7.1: Compute the value of the average electric current (Ii).

  Step 7.2: Compute the probability of magnetic influence (pmik).

  Step 7.3: Compute the value of the magnetic force (Mik) for each CP.

Step 8: Compute the total force (Ftotal) acting on each CP.

Step 9: Calculate the new positions and velocities of the charged particles using equations 5 and 6.

Step 10: Recalculate the value of the objective function using the new positions of the charged particles.

Step 11: Compare the newly generated charged particles with the charged particles residing in the CM.

Step 12: Memorize the best solution achieved so far and set Iteration = Iteration + 1.

Step 13: Output the best solution obtained.
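
For orientation, a condensed Python sketch of the loop above, reusing the helper functions sketched in Section 2. The charge-related quantities (q_i, p_ik, pm_ik, I_i) are deliberately reduced to crude placeholders here, since the chapter does not reproduce their formulas (see [48]); this is a structural outline, not a faithful implementation.

```python
import numpy as np

def mcss_clustering(S, K, max_iter=100, rng=None):
    """Condensed sketch of Steps 1-13 using the helpers sketched earlier.
    The charge-related quantities are placeholders, not the MCSS formulas."""
    rng = np.random.default_rng() if rng is None else rng
    C, V = init_charged_particles(S, K, rng)                 # Steps 1-2
    best_C, best_D = C.copy(), clustering_objective(S, C)    # Step 5: charge memory
    for _ in range(max_iter):                                # Step 6: main loop
        d2 = ((S[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        q = 1.0 / (1.0 + d2.min(axis=1))                     # placeholder charges (6.2)
        p = np.ones((len(S), K))                             # placeholder p_ik (6.1)
        pm = np.ones((len(S), K))                            # placeholder pm_ik (7.2)
        I = np.full(len(S), q.mean())                        # placeholder currents (7.1)
        q_cp = np.full(K, q.mean())                          # placeholder CP charges
        F = total_force(S, C, q, q_cp, p, pm, I)             # Steps 6-8
        C, V = move_particles(C, V, F, m=q_cp, rng=rng)      # Step 9
        D = clustering_objective(S, C)                       # Step 10
        if D < best_D:                                       # Steps 11-12: update CM
            best_C, best_D = C.copy(), D
    return best_C, best_D                                    # Step 13
```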

3. Experimental results

This section deals with the experimental setup of our study. It includes the performance measures, parameter settings, datasets used, experimental results, and statistical analysis. To prove the effectiveness of the MCSS algorithm, 10 datasets are used, of which two are artificial and the rest are taken from the UCI repository. The proposed algorithm is implemented in MATLAB 2010a on a computer with a Windows operating system, a Core i3 processor running at 3.4 GHz, and 4 GB of RAM. The experimental outcomes of the MCSS algorithm are compared with those of other clustering algorithms like K-means, GA [30], PSO [49], ACO [35], and CSS [38].

    3.1. Performance measures

The performance of the MCSS algorithm is examined using the sum of intra-cluster distances and the F-measure. The sum of intra-cluster distances is reported in terms of the best case, average case, and worst case solutions, along with the standard deviation, which shows the dispersion of the data. The F-measure is used to measure the accuracy of the proposed method. The performance measures are described as follows:

    3.1.1. Intra cluster distances

The intra-cluster distance can be used to measure the quality of clustering [35, 36]. It indicates the distance between the data objects within a cluster and its cluster center: the smaller the intra-cluster distance, the better the quality of the solution. The results are reported in terms of the best, average, and worst solutions.

    3.1.2. Standard Deviation (Std.)

The standard deviation gives information about the scattering of data within a cluster [47, 49]. A low value of the standard deviation indicates that the data objects are scattered near their center, while a high value indicates that the data are dispersed away from the center point.
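
These run statistics are straightforward to aggregate; a small sketch, assuming the objective values of the individual runs are collected in a list:

```python
import numpy as np

def summarize_runs(objectives):
    """Best / average / worst intra-cluster distance and standard deviation
    over repeated runs, as reported in Tables 3-12."""
    d = np.asarray(objectives, dtype=float)
    return {"best": d.min(), "average": d.mean(),
            "worst": d.max(), "std": d.std()}

# Example: aggregate the objective over 10 independent runs
# summarize_runs([mcss_clustering(S, K)[1] for _ in range(10)])
```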

    3.1.3. F-Measure

This parameter is measured in terms of the recall and precision of an information retrieval system [50, 51]. It can also be described as the weighted harmonic mean of recall and precision. The recall and precision are computed using equation 8 as follows:

$$\text{Recall: } r(i,j) = \frac{n_{ij}}{n_i}, \qquad \text{Precision: } p(i,j) = \frac{n_{ij}}{n_j} \qquad (E8)$$

where n_ij denotes the number of data instances of class i that appear in cluster j, n_i the number of instances of class i, and n_j the number of instances in cluster j.

The value of the F-measure F(i, j) is then computed using equation 9:

$$F(i,j) = \frac{2 \cdot r(i,j) \cdot p(i,j)}{r(i,j) + p(i,j)} \qquad (E9)$$

Finally, the overall F-measure for a clustering of a dataset with n data instances is calculated using equation 10:

$$F = \sum_{i} \frac{n_i}{n} \cdot \max_j \left\{ F(i,j) \right\} \qquad (E10)$$
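
A sketch of the F-measure computation, assuming ground-truth class labels and predicted cluster labels as integer arrays; n_ij denotes the number of instances of class i placed in cluster j, as in equation 8.

```python
import numpy as np

def f_measure(labels_true, labels_pred):
    """Overall F-measure (equations 8-10): for each true class i take the
    best F(i, j) over clusters j, weighted by the class size n_i / n."""
    n = len(labels_true)
    total = 0.0
    for ci in np.unique(labels_true):
        in_class = labels_true == ci
        n_i = in_class.sum()
        best = 0.0
        for cj in np.unique(labels_pred):
            in_cluster = labels_pred == cj
            n_ij = (in_class & in_cluster).sum()
            if n_ij == 0:
                continue
            recall = n_ij / n_i                               # equation 8
            precision = n_ij / in_cluster.sum()               # equation 8
            best = max(best, 2 * recall * precision
                             / (recall + precision))          # equation 9
        total += (n_i / n) * best                             # equation 10
    return total
```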

3.2. Parameter settings

In order to evaluate the performance of the proposed algorithm, user-defined parameters must be set prior to the process. In MCSS, there are four user-defined parameters: the number of CPs, rand, R, and ∈. The details of the parameters are as follows: the number of CPs is equal to the number of clusters present in a dataset; rand is a random function that provides a value in the range [0, 1]; R denotes the radius of the CPs and is set to 1; and ∈ is a user-defined parameter used to prevent singularity and is set to 0.001. In addition, the number of iterations for the algorithm must be specified: the maximum number of iterations is set to 100, and the results are summarized over 10 runs of the algorithm using different initial cluster centers for each dataset. Table 1 summarizes the parameter settings of the MCSS algorithm. The performance of the proposed algorithm is compared with K-means, GA, PSO, ACO, and CSS; the parameter settings of these algorithms are set as reported in the corresponding references (Figures 1 and 2; Table 2).

| Parameter | Value |
|---|---|
| No. of CPs | No. of clusters |
| rand | Random value in [0, 1] |
| R | 1 |
| ∈ | 0.001 |

    Table 1.

    Parameters setting of MCSS algorithm.

    Figure 1.

    (a): Distribution of data in ART1. (b): Distribution of data in ART2.

    Figure 2.

    (a): Clustering in ART1 dataset. (b): Clustering the ART2 dataset. (c): Clustering the ART1 dataset using MCSS (Vertical view as X1 and Y1 coordinate in horizontal plane and Z1 coordinate in vertical plane).

| Dataset | Classes | Attributes | Total instances | Instances in each class |
|---|---|---|---|---|
| ART 1 | 3 | 2 | 300 | (100, 100, 100) |
| ART 2 | 3 | 3 | 300 | (100, 100, 100) |
| Iris | 3 | 4 | 150 | (50, 50, 50) |
| Glass | 6 | 9 | 214 | (70, 17, 76, 13, 9, 29) |
| LD | 2 | 6 | 345 | (145, 200) |
| Thyroid | 3 | 3 | 215 | (150, 30, 35) |
| Cancer | 2 | 9 | 683 | (444, 239) |
| CMC | 3 | 9 | 1473 | (629, 334, 510) |
| Vowel | 6 | 3 | 871 | (72, 89, 172, 151, 207, 180) |
| Wine | 3 | 13 | 178 | (59, 71, 48) |

    Table 2.

    Description of datasets.

3.3. Experimental results

This subsection demonstrates the results of the proposed algorithm. The results of the proposed algorithm are compared with those of other existing techniques like K-means, GA, PSO, ACO, and CSS using a mixture of datasets [53–55]. Two artificial and eight real-life datasets are used to obtain the results [52]. Among the real-life datasets, the iris, thyroid, and vowel datasets are categorized as low dimensional, the cancer and LD datasets as moderate, and the rest of the datasets (wine, CMC, and glass) as high dimensional. For clearer visualization and understanding, the results are discussed one dataset at a time.

| Dataset | Parameter | K-means | GA | PSO | ACO | CSS | MCSS |
|---|---|---|---|---|---|---|---|
| ART 1 | Best | 157.12 | 154.46 | 154.06 | 154.37 | 153.91 | 153.18 |
| | Average | 161.12 | 158.87 | 158.24 | 158.52 | 158.29 | 158.02 |
| | Worst | 166.08 | 164.08 | 161.83 | 162.52 | 161.32 | 159.26 |
| | Std | 0.34 | 0.281 | 0 | 0 | 0 | 0 |
| | F-Measure | 99.14 | 99.78 | 100 | 100 | 100 | 100 |

    Table 3.

    Comparison of the proposed MCSS algorithm with other clustering algorithms using ART1 dataset.

Table 3 illustrates the results of the proposed method as well as the other clustering algorithms (in terms of intra-cluster distance: best, average, and worst; standard deviation; and F-measure) for the ART1 dataset. It is seen that K-means exhibits the poorest performance among all the techniques on all of the parameters. From the results, it is also noticed that the performance of PSO, ACO, CSS, and MCSS is almost identical except for the intra-cluster distance. On the basis of the intra-cluster distance, it can be said that the MCSS algorithm achieves the minimum distance in comparison to all other algorithms.

| Dataset | Parameter | K-means | GA | PSO | ACO | CSS | MCSS |
|---|---|---|---|---|---|---|---|
| ART 2 | Best | 743 | 741.71 | 740.29 | 739.81 | 738.96 | 737.85 |
| | Average | 749.83 | 747.67 | 745.78 | 746.01 | 745.61 | 745.12 |
| | Worst | 754.28 | 753.93 | 749.52 | 749.97 | 749.66 | 748.67 |
| | Std | 0.516 | 0.356 | 0.237 | 0.206 | 0.209 | 0.17 |
| | F-Measure | 98.94 | 99.17 | 99.26 | 99.19 | 99.43 | 99.56 |

    Table 4.

    Comparison of the proposed MCSS algorithm with other clustering algorithms using ART2 dataset.

Table 4 summarizes the results of all the techniques for the artificial dataset ART2. From the results, it is clearly seen that a significant difference exists between the results of the proposed algorithm and those of the other algorithms. The proposed algorithm outperforms the others on all of the parameters. Again, it is observed that the performance of the K-means algorithm is the poorest among all the methods. The results of the CSS and PSO algorithms are close to the optimal solution, but with slightly higher values of the standard deviation.

Table 5 displays the results of the proposed algorithm and the other algorithms for the iris dataset. From this table, it can be noticed that the results obtained using GA are far from the optimal solutions, while the proposed method again gives superior results. It is also observed that the K-means algorithm gives good results on the iris dataset; in particular, it performs well compared to the GA and ACO algorithms. The result of the K-means algorithm is also very close to that of the PSO algorithm in terms of the F-measure.

The results of all six methods for the wine dataset are listed in Table 6. They demonstrate that the MCSS algorithm obtains good results (in terms of intra-cluster distance and F-measure) in comparison to the others, but with a slightly larger value of the standard deviation. The performance of GA, PSO, and ACO is nearly the same except for some variation in the F-measure. The performance of the K-means algorithm is better than that of GA, PSO, and ACO in terms of the F-measure, but with a large value of the standard deviation. The CSS algorithm also performs well on the wine dataset, second only to MCSS, and obtains a low value of the standard deviation, which shows that a near-optimal solution is generated in each iteration.

| Dataset | Parameter | K-means | GA | PSO | ACO | CSS | MCSS |
|---|---|---|---|---|---|---|---|
| Iris | Best | 97.33 | 113.98 | 96.89 | 97.1 | 96.47 | 96.34 |
| | Average | 106.05 | 125.19 | 97.23 | 97.17 | 96.63 | 96.57 |
| | Worst | 120.45 | 139.77 | 97.89 | 97.8 | 96.78 | 96.63 |
| | Std | 14.631 | 14.563 | 0.347 | 0.367 | 0.14 | 0.1 |
| | F-Measure | 0.782 | 0.778 | 0.782 | 0.779 | 0.787 | 0.790 |

    Table 5.

    Comparison of the proposed MCSS algorithm with other clustering algorithms using iris dataset.

| Dataset | Parameter | K-means | GA | PSO | ACO | CSS | MCSS |
|---|---|---|---|---|---|---|---|
| Wine | Best | 16555.68 | 16530.53 | 16345.96 | 16530.53 | 16282.12 | 16158.56 |
| | Average | 18061 | 16530.53 | 16417.47 | 16530.53 | 16289.42 | 16189.96 |
| | Worst | 18563.12 | 16530.53 | 16562.31 | 16530.53 | 16317.67 | 16223.61 |
| | Std | 793.213 | 0 | 85.497 | 0 | 10.31 | 36.72 |
| | F-Measure | 0.521 | 0.515 | 0.518 | 0.519 | 0.529 | 0.537 |

    Table 6.

    Comparison of the proposed MCSS algorithm with other clustering algorithms using wine dataset.

Table 7 describes the results of all six algorithms on the LD dataset. From the results, it is clearly seen that the outcome of the proposed algorithm is better in comparison to the other algorithms. It is also noted that the K-means algorithm does not perform well on the LD dataset, and the results obtained using K-means are very far from the optimal ones. After MCSS, the performance of the PSO algorithm is the best, and its results are close to the optimal solutions.

| Dataset | Parameter | K-means | GA | PSO | ACO | CSS | MCSS |
|---|---|---|---|---|---|---|---|
| LD | Best | 11397.83 | 532.48 | 209.15 | 224.76 | 207.09 | 206.14 |
| | Average | 11673.12 | 543.69 | 224.47 | 235.16 | 228.27 | 221.69 |
| | Worst | 12043.12 | 563.26 | 239.11 | 256.44 | 242.14 | 236.23 |
| | Std | 667.56 | 41.78 | 29.38 | 17.46 | 18.54 | 12.07 |
| | F-Measure | 0.467 | 0.482 | 0.493 | 0.487 | 0.491 | 0.495 |

    Table 7.

    Comparison of the proposed MCSS algorithm with other clustering algorithms using LD dataset.

| Dataset | Parameter | K-means | GA | PSO | ACO | CSS | MCSS |
|---|---|---|---|---|---|---|---|
| Cancer | Best | 2999.19 | 2999.32 | 2973.5 | 2970.49 | 2946.48 | 2932.43 |
| | Average | 3251.21 | 3249.46 | 3050.04 | 3046.06 | 2961.16 | 2947.74 |
| | Worst | 3521.59 | 3427.43 | 3318.88 | 3242.01 | 3006.14 | 2961.03 |
| | Std | 251.14 | 229.734 | 110.801 | 90.5 | 12.23 | 10.33 |
| | F-Measure | 0.829 | 0.819 | 0.819 | 0.821 | 0.847 | 0.859 |

    Table 8.

    Comparison of the proposed MCSS algorithm with other clustering algorithms using cancer dataset.

The results of all six algorithms for the cancer dataset are listed in Table 8. As it indicates, the performance of the GA and PSO algorithms is not good on the cancer dataset, but the proposed algorithm works well and achieves respectable results compared to the others. The K-means algorithm also achieves good results compared to the GA, ACO, and PSO algorithms.

| Dataset | Parameter | K-means | GA | PSO | ACO | CSS | MCSS |
|---|---|---|---|---|---|---|---|
| CMC | Best | 5842.2 | 5705.63 | 5700.98 | 5701.92 | 5672.46 | 5653.26 |
| | Average | 5893.6 | 5756.59 | 5820.96 | 5819.13 | 5687.82 | 5678.83 |
| | Worst | 5934.43 | 5812.64 | 5923.24 | 5912.43 | 5723.63 | 5697.12 |
| | Std | 47.16 | 50.369 | 46.959 | 45.634 | 21.43 | 17.37 |
| | F-Measure | 0.334 | 0.324 | 0.331 | 0.328 | 0.359 | 0.368 |

    Table 9.

    Comparison of the proposed MCSS algorithm with other clustering algorithms using CMC dataset.

Table 9 demonstrates the results of all six algorithms for the CMC dataset. As can be seen clearly, the proposed method achieves better results than the other algorithms on all the parameters. The performance of GA is found to be the poorest among all the algorithms in terms of the standard deviation and F-measure. In addition, it is found that the K-means algorithm obtains the maximum intra-cluster distance among all.

Table 10 illustrates the results of the proposed algorithm and all other algorithms for the thyroid dataset. From the results, it is again observed that the proposed algorithm obtains superior results in comparison to the other algorithms, but with a marginally higher value of the standard deviation than the GA, PSO, and ACO algorithms. The K-means results are far from the optimal ones, and the CSS algorithm also obtains a high value of the standard deviation, exceeded only by K-means. Among the remaining algorithms, the performance of the ACO algorithm is the best.

| Dataset | Parameter | K-means | GA | PSO | ACO | CSS | MCSS |
|---|---|---|---|---|---|---|---|
| Thyroid | Best | 13956.83 | 10176.29 | 10108.56 | 10085.82 | 9997.25 | 9928.89 |
| | Average | 14133.14 | 10218.82 | 10149.7 | 10108.13 | 10078.23 | 10036.93 |
| | Worst | 14642.21 | 10254.39 | 10172.86 | 10134.82 | 10116.52 | 10078.34 |
| | Std | 246.06 | 32.64 | 27.13 | 21.34 | 49.02 | 43.61 |
| | F-Measure | 0.731 | 0.763 | 0.778 | 0.783 | 0.789 | 0.793 |

    Table 10.

    Comparison of the proposed MCSS algorithm with other clustering algorithms using thyroid dataset.

The results of all six algorithms for the glass dataset are summarized in Table 11. They indicate that the CSS algorithm gives better results for the glass dataset than the other algorithms in terms of the intra-cluster distance and standard deviation. From the analysis of the F-measure, however, it is found that the performance of MCSS is better than that of CSS. It is worth noting that both CSS and MCSS achieve good results at the cost of the standard deviation. It is also noticed that the GA exhibits weak performance.

| Dataset | Parameter | K-means | GA | PSO | ACO | CSS | MCSS |
|---|---|---|---|---|---|---|---|
| Glass | Best | 215.74 | 278.37 | 270.57 | 269.72 | 203.58 | 209.47 |
| | Average | 235.5 | 282.32 | 275.71 | 273.46 | 223.44 | 231.61 |
| | Worst | 255.38 | 286.77 | 283.52 | 280.08 | 241.27 | 263.44 |
| | Std | 12.47 | 4.138 | 4.55 | 3.584 | 13.29 | 17.08 |
| | F-Measure | 0.431 | 0.333 | 0.359 | 0.364 | 0.446 | 0.449 |

    Table 11.

    Comparison of the proposed MCSS algorithm with other clustering algorithms using glass dataset.

Table 12 summarizes the results of the proposed algorithm and all other algorithms for the vowel dataset. From the results, it is noticed that the MCSS algorithm obtains the minimum intra-cluster distance among all, but at the cost of a high value of the standard deviation. In addition, it is observed that the MCSS and K-means algorithms exhibit similar performance in terms of the F-measure, but K-means obtains the minimum value of the standard deviation. Since the quality of clustering is measured in terms of the intra-cluster distance, the MCSS algorithm provides good-quality results, whereas the ACO method obtains the maximum intra-cluster distance among all the methods. Overall, it is concluded that the proposed algorithm performs better than the other algorithms on most of the datasets and obtains good-quality solutions. A statistical analysis is also carried out to prove the same.

| Dataset | Parameter | K-means | GA | PSO | ACO | CSS | MCSS |
|---|---|---|---|---|---|---|---|
| Vowel | Best | 149422.26 | 149513.73 | 148976.01 | 149395.6 | 149335.61 | 146124.87 |
| | Average | 159242.89 | 159153.49 | 151999.82 | 159458.14 | 152128.19 | 149832.13 |
| | Worst | 161236.81 | 165991.65 | 158121.18 | 165939.82 | 154537.08 | 157726.43 |
| | Std | 916 | 3105.544 | 2881.346 | 3485.381 | 2128.02 | 2516.58 |
| | F-Measure | 0.652 | 0.647 | 0.648 | 0.649 | 0.649 | 0.652 |

    Table 12.

    Comparison of the proposed MCSS algorithm with other clustering algorithms using vowel dataset.

    4. Conclusion

In this chapter, the magnetic charged system search algorithm is applied to solve clustering problems. The idea of the proposed algorithm comes from electromagnetic theory and is based on the behavior of moving charged particles. A moving charged particle exerts both forces (the electric force and the magnetic force) on other charged particles, which in turn alters the positions of those particles. In the MCSS algorithm, the initial population is therefore represented in the form of charged particles. It utilizes the concepts of the electric and magnetic forces along with Newton's second law of motion to obtain the updated positions of the charged particles. In MCSS, the electric force (Eik) and magnetic force (Mik) drive the local search for the solution, while the global search is carried out using Newton's second law of motion. The aim of this research is to investigate the applicability of the MCSS algorithm to clustering problems. To achieve this, the performance of the MCSS algorithm is evaluated on a variety of datasets and compared with K-means, GA, PSO, ACO, and CSS using the intra-cluster distance, standard deviation, and F-measure parameters. The experimental results support the applicability of the proposed algorithm in the clustering field, as the proposed method provides good results on most datasets in comparison to the other methods. Finally, it is concluded that the proposed method not only gives good results but also improves the quality of the solutions.

    © 2016 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
