Open access peer-reviewed chapter

Deterministic Annealing: A Variant of Simulated Annealing and its Application to Fuzzy Clustering

By Makoto Yasuda

Submitted: June 2nd 2016Reviewed: September 28th 2016Published: April 26th 2017

DOI: 10.5772/66072

Downloaded: 1725

Abstract

Deterministic annealing (DA) is a deterministic variant of simulated annealing. In this chapter, after briefly introducing DA, we explain how DA is combined with the fuzzy c-means (FCM) clustering by employing the entropy maximization method, especially the Tsallis entropy maximization. The Tsallis entropy is a q parameter extension of the Shannon entropy. Then, we focus on Tsallis-entropy-maximized FCM (Tsallis-DAFCM), and examine effects of cooling functions for DA on accuracy and convergence. A shape of a membership function of Tsallis-DAFCM depends on both a system temperature and q. Accordingly, a relationship between the temperature and q is quantitatively investigated.

Keywords

  • deterministic annealing
  • simulated annealing
  • free energy
  • fuzzy c-means clustering
  • entropy maximization
  • Shannon entropy
  • fuzzy entropy
  • Tsallis entropy

1. Introduction

Statistical mechanics investigates the macroscopic properties of a physical system consisting of many elements. Recently, great research activities of applying statistical mechanical models or tools to information engineering problems have been seen. The entropy maximization method applied to a fuzzy c-means clustering is a good example of such models.

Simulated annealing (SA) [1, 2] is one of the most commonly used optimization techniques and plays an important role in the field of engineering because many of the engineering problems can be formulated as optimization problems. SA is a stochastic relaxation method that treats an objective function as a system energy, and by analogy with the annealing process of solids, searches for its minimum with decreasing the system temperature. SA searches randomly at a high temperature, but more deterministically at a low temperature. As long as a neighborhood can be defined and the temperature is lowered sufficiently slowly, SA is a global optimization technique for solving optimization problems. It requires a very long time to find an optimal global solution because of a stochastic search at each temperature. Thus, SA is an approximation method in practical.

Deterministic annealing (DA) is a deterministic variant of SA, which is first proposed by Rose et al. [3] for a vector quantization algorithm. DA characterizes the minimization problem of the objective function as the minimization of the free energy of the system, which depends on the temperature. DA tracks the minimum of the free energy while decreasing the temperature. Thus, it can deterministically optimize the objective function at each temperature, and is more efficient than SA but does not guarantee the optimal solution. In addition, an effect of a cooling function on the quality of a DA’s solution is still unclear.

Membership functions of the fuzzy c-means (FCM) clustering [4] maximized or regularized with entropy [5, 6] have similar forms with the distribution functions that appear in statistical mechanics. For example, a membership function obtained by the Shannon entropy maximization has a similar form with the Boltzmann-Gibbs distribution function [3, 5]. Similarly, a membership function obtained by the fuzzy entropy [7] has a similar form with the Fermi-Dirac distribution function [8]. Annealing methods can be applicable to these membership functions because they contain a parameter that can be corresponded to the system temperature. The advantage of applying the entropy maximization methods to FCM is that fuzzy clustering can be analyzed from both information processing and statistical physical points of view.

Tsallis [9], by extending Boltzmann-Gibbs statistics nonextensively with a generalization parameter q, postulated a generalized formulation of entropy. The entropy is now well known as The Tsallis entropy, which, in the limit of qto 1, approaches the Shannon entropy. The Tsallis entropy is applicable to numerous fields, including physics, bioscience, chemistry, networks, computer science, and so on [1012]. Menard et al. [13, 14] investigated fuzzy clustering in the framework of nonextensive thermostatistics, and derived the possibilistic membership function by taking the possibilistic constraint into account.

On the other hand, based on the Tsallis entropy, Yasuda [15] defined another form of entropy for FCM, and then derived the membership function by maximizing this entropy within FCM [15]. After that, by combining the membership function with DA, a new fuzzy clustering algorithm (the Tsallis-DAFCM algorithm) has been developed. Tsallis-DAFCM was proved to yield superior results in comparison with the conventional annealing methods.

Similarly to SA, a performance of Tsallis-DAFCM strongly depends on how to decrease the temperature. Among the cooling functions for SA, the very fast annealing (VFA) method decreases the temperature fastest. Thus, VFA is applied to Tsallis-DAFCM to improve its performance, and proved to be effective [16].

In spite of its performance, it remains unknown how appropriate qvalue and initial annealing temperature Thigh for Tsallis-DAFCM should be determined according to the data distribution. One of the important characteristics of the membership function of Tsallis-DAFCM is that centers of clusters are given as a weighted function of the membership function to the power of q(uikq). Furthermore, it changes its shape in a similar way by decreasing the temperature or by increasing q. In order to reveal the relationship between the temperature and q, uikqis quantitatively analyzed using the Iris Data Set [17]. The result shows that the temperature and qaffect uikqalmost inversely, suggesting that a q-incrementation algorithm is possible. This algorithm might be a solution to the initialvalue problem of Tsallis-DAFCM [18, 19].

This chapter is composed of five sections. Section 1 is this introduction. In Section 2, how DA works is explained generously, and its applications are summarized. In Section 3, each of the components of entropy-maximized FCM clustering methods is explained. In Section 4, VFA is used as the cooling function of Tsallis-DAFCM, and its effects are experimentally investigated. Effects of the temperature and q-values on the membership function of Tsallis-DAFCM are also examined. Section 5 gives a conclusion of this chapter.

Advertisement

2. Deterministic annealing

Vector quantization is a classification method for a big data, which is widely used in the field of image compression. Linde-Buzo-Gray [20] and self-organizing feature map [21] algorithms, for example, are well-known vector quantization algorithms. However, classification results of these algorithms depend on initial settings of the reference vector, and can easily fall into local minima.

In order to overcome the problem, by analogy with statistical mechanics, a DA method has been proposed. DA does not suffer from the initial vector problem. It usually performs better than other methods. However, it does not theoretically guarantee to find a global optimal solution. Furthermore, its performance depends on a way of decreasing the system temperature.

This section gives a brief introduction of the deterministic annealing method. In Section 2.1, a definition and major characteristics of DA are explained. In Section 2.2, DA is compared with SA. In Section 2.3, applications and modifications of DA are summarized.

2.1. Major characteristics of deterministic annealing

In DA, by analogy with statistical mechanics, the free energy Fis derived from an objective function Jof a problem. At a high temperature, Frepresents a global structure of J. As the temperature decreases, it gradually reaches J.

Based on these characteristics, at the high temperature, DA is able to find a global minimum of Fby the steepest descent method because it should have multiple local minima. When the temperature is lowered a little, Fwould change its shape only a little. Accordingly, by setting the previous global mimimum as an initial value of the steepest descent mehod, DA searches the next global minimum. This procedure continues until the temperature is lowered sufficiently, and Freaches J. Consequently, at each temperature, DA searches the local minimum of Jdeterministically.

2.2. Comparison with simulated annealing

While decreasing the temperature, SA searches the minimum stochastically at each temperature and thus requires a very long time to find an optimal solution. Hence, though theoretically, it is guaranteed to find the optimal solution, SA is practically an approximation method.

On the contrary, DA consumes less computational time because it searches the local minimum deterministically at each temperature. Furthermore, it should be noticed that, in case that multiple local minima exist at some temperature, DA might not be able to find the minimum. For this reason, even theoretically, DA is not guaranteed to find the optimal solution.

Approaches to speed up SA are mainly based on the improvement of a transition method and a cooling function including its parallelization. For example, adaptive SA (ASA) [22], which may belong to the both categories, is an implementation of very fast simulated re-annealing (VFSA) [23]. As compared with the acceleration method called fast annealing (FA) [24] in which the temperature is lowered inversely linear to a number of iterations, ASA is faster than FA. In addition, among many features included in ASA, it can get the benefit of speeding-up by simulated quenching.

In DA, it seems no comprehensive studies on this topic have been conducted.

The summary of comparison is shown in Table 1.

SADA
Search strategy   Stochastic search based on the Metropolis algorithm.Deterministic search based on the
steepest descent algorithm.
Cooling function  Two categories of cooling functions are well-used.
1. Functions based on statistical analysis.
2. Adaptive functions depending on the problems.
Cooling functions appear in SA are
used empirically.
Optimality     The global minimum can be achieved if a temperature
is decreased as slow as T∝1/log(iterations).
Not guaranteed.

Table 1.

Comparison of SA and DA.

2.3. Applications and modifications of deterministic annealing

The study on DA first addressed avoidance of the poor local minima of data clustering [25]. Then it was extensively applied to various subjects such as combinational optimization problems [26], vector quantization [27, 28], maximum likelihood estimation [29], classifier design [30], and pairwise data clustering [31].

In order to cluster a large number of data, research activities that attempt to parallelize DA have become popular. Kim et al. [32] discussed the parallelization method of DA using GPU, and applied it to color image segmentation. In order to cluster a large number of bioscientific data, Fox et al. [33, 34] parallelized DA using MPI. Qiu et al. [35] compared the DA’s performance using C# messaging runtime library CCR with that using MPI.

Advertisement

3. Application of deterministic annealing to fuzzy c-means clustering maximized with entropy

One of the important applications of DA is fuzzy clustering. In this section, we focus on fuzzy c-means (FCM) clustering. By maximizing the objective function of FCM with various entropies, membership functions similar to the statistical mechanical distribution functions are obtained. These membership functions can be easily combined with DA, because they contain a parameter corresponding to a system temperature.

In this section, first we outline the formulation of FCM. Then, we describe how to apply the entropy maximization methods to FCM. In Section 3.1, the classical FCM clustering method is introduced. In Section 3.2, various entropy maximization methods are explained. The free energies for each entropy maximization method are derived in Section 3.3. In Section 3.4, representative annealing functions are presented.

3.1. Fuzzy c-means clustering

Let X={x1,,xn}(xk=(xk1,,xkp)Rp) be a data set in p-dimensional real space, which is to be divided into cclusters. Let V={v1,,vc}(vi=(vi1,,vip)Rp)be the centers of the clusters, and let uik[0, 1] (i=1,,c;k=1,,n)be the membership functions. Then, let

J=k=1ni=1cuikmdik  (dik=xkvi2; mR; 1<m)E1

be the objective function of FCM to be minimized. Under the normalization constraint of

i=1cuik=1  (k),E2

the Lagrange function Lis given by

L=Jk=1nηk(i=1cuik1),E3

where ηkdenotes the Lagrange multiplier. L/uik=0gives the membership function of the form:

uik=[j=1c(dikdjk)1m1]1.E4

Similarly, vican be determined by L/vi=0as follows:

vi=k=1nuikxkk=1nuik.E5

Desirable cluster centers can be obtained by calculating Eq. (4) and Eq. (5) repeatedly.

3.2. Entropy maximization method for fuzzy c-means

3.2.1. Shannon entropy maximization

Shannon entropy for FCM takes the form:

SSE=k=1ni=1cuikloguik.E6

By setting mto 1 in Eq. (1), under the normalization constraint of Eq. (2), the Shannon entropy functional is given by

δSSEk=1nαkδ(i=1cuik1)βk=1ni1cδ(uikdik),E7

where αkand βdenote the Lagrange multipliers. The stationary condition for Eq. (7) leads to the following Gaussian membership function:

uik=eβdikj=1ceβdjk,E8

and the same formula for viin Eq. (5).

3.2.2. Fuzzy entropy maximization

Fuzzy entropy for FCM is defined as [36]

SFE=k=1ni=1c{uikloguik+(1uik)log(1uik)}.E9

The fuzzy entropy functional is given by

δSFEk=1nαkδ(i=1cuik1)βk=1ni1cδ(uikdik).E10

The stationary condition for Eq. (10) leads to the following membership function:

uik=1eαk+βdik+1,E11

and the same formula for viin Eq. (5). Eq. (11) is similar to the Fermi-Dirac distribution function.

3.2.3. Tsallis entropy maximization

The Tsallis entropy for FCM is defined as [15]

STsallis=1q1(k=1ni1cuikq1),E12

where q ∈ Ris a real number. In case of the Tsallis entropy maximization, the objective function should be rewritten as

JTsallis=k=1ni=1cuikqdik.E13

Accordingly, the Tsallis entropy functional is given by

δSTsallisk=1nαkδ(i=1cuik1)βk=1ni1cδ(uikqdik).E14

The stationary condition for Eq. (14) leads to the membership function of the form:

uik={1β(1q)dik}11qZ,E15

where

Z=j=1c{1β(1q)djk}11q.E16

viis defined as

vi=k=1nuikqxkk=1nuik.E16

3.3. Free energy for entropy maximized fuzzy c-means

In each entropy, maximization methods introduced in Section 3.2, βcan be regarded as the inverse of the system temperature T-1. This feature makes it possible to apply DA, and Shannon-entropy-maximized FCM with DA (Shannon-DAFCM, hereafter), fuzzy-entropy-maximized FCM with DA (fuzzy-DAFCM, hereafter), and Tsallis-entropy-maximized FCM with DA (Tsallis-DAFCM, hereafter) have been developed.

3.3.1. Free energy for Shannon-DAFCM

In Shannon-DAFCM, by analogy with statistical mechanics [37], the sum of the states (the partition function) for the grand canonical ensemble of FCM can be expressed as

ξ=k=1ni=1ceβdikE17

The free energy is derived as

F=1βlogξ=1βk=1nlogi=1ceβdik.E18

Stable thermal equilibrium requires a minimization of the free energy. By formulating deterministic annealing as a minimization of the free energy, F/vi=0yields the same expression for vias that in Eq. (5).

3.3.2. Free energy for fuzzy-DAFCM

Similarly, in fuzzy-DAFCM, the grand partition function for the grand canonical ensemble for FCM can be written as

Ξ=k1ni=1c(1+eαkβdik).E19

The free energy is calculated as

F=1β(logΞαklogΞαk)=1βk=1n{i=1clog(1+eαkβdik)+αk}.E20

F/vi=0 yields the same expression for vias that in Eq. (5).

3.3.3. Free energy for Tsallis-DAFCM

In Tsallis-DAFCM, the free energy can be derived as

F=JTsallisTSTsallis=1βk=1nZ1q11q,E21

where Tis a system temperature. F/vi=0 yields the same expression for vias that in Eq. (16).

3.4. Effect of annealing temperature on clustering

3.4.1. Dependency of shape of membership function on temperature

While reducing the system temperature T, DA achieves thermal equilibrium at each temperature by minimizing the free energy. Thus, DA searches a cluster distribution that minimizes the free energy at each temperature. When the temperature is high, the membership functions distribute widely. This makes clusters to which a data belong fuzzy. In case of Tsallis-DAFCM, when qis nearly equal to 2, the width of the membership function is almost proportional to T. On the contrary, at the low temperature, fuzzy clustering approaches hard clustering. The relationship F=JTsallisTSTsallisin Eq. (21) suggests that, at the higher temperature, the larger entropy state or chaotic state is caused by a widening of the extent of the membership function.

3.4.2. Cooling function

In SA, the temperature decreases according to a cooling function or an annealing schedule. The representative cooling functions for SA [38] are:

(I) Exponential function

T=Thighrt,E22

where Thighis the highest initial temperature, ris a parameter which defines a temperature reduction rate, and tis a number of iterations of temperature reduction.

(II) Inversely linear function

T=Thight.E23

(III) Inversely logarithmic function

T=Thighlnt.E24

(IV) Inversely exponential function

T=Thighert.E25

(V) Very fast annealing

Rosen [39] proposed another inversely exponential function known as VFA. VFA decreases the temperature exponentially in a similar way to ASA:

T=Thighert(1/D),E26

where Dis a dimension of a state space.

In clustering, a state space refers to an input space of a data set. For example, if a data set consists of 3 attributes, Dis set to 3.

Figure 1 compares plots of Eq. (25) and Eq. (26).

Figure 1.

Plots of (a)Eq. (25)and (b)Eq. (26)(Thigh = 1.0 × 105,D= 2).

Advertisement

4. Tsallis-entropy-maximized fuzzy c-means clustering with deterministic annealing

In this section, we focus on Tsallis-DAFCM, and its important experimental results are explained. In Section 4.1, we present the Tsallis-DAFCM clustering algorithm. In Section 4.2, how VFA affects Tsallis-DAFCM is experimentally investigated. In Section 4.3, effects of the temperature and q-values on the membership function are examined.

4.1. Tsallis-DAFCM clustering algorithm

The Tsallis-entropy-maximization method, fuzzy c-means clustering, and the deterministic annealing method can be combined as the following Tsallis-DAFCM clustering algorithm [15]:

  1. Set the number of clusters c, the highest temperature Thigh, the temperature reduction rate m, and the threshold of convergence test δ1 and δ2;

  2. Generate cclusters at random positions. Set current temperature Tto Thigh;

  3. Calculate the membership function uikby Eq. (15);

  4. Calculate the centers of clusters viby Eq. (17);

  5. Compare the difference between the current cluster centers viand the cluster centers obtained in the previous iteration vi. If the convergence condition max1ic||viv¯i||<δ1is satisfied, then go to 6, otherwise go back to 3;

  6. Compare the difference between the current cluster centers and the cluster centers obtained at the previous temperature vi¯¯.If the convergence condition max1ic||viv¯¯i||<δ2is satisfied, then stop, otherwise decrease the temperature using the cooling function, and go back to 3.

4.2. Effect of cooling functions

In general, the temperature should be reduced gradually in DA. However, this takes a long time to converge. If VFA is applicable to Tsallis-DAFCM, it is of great advantage to this method. Accordingly, VFA is tested as a cooling function of Tsallis-DAFCM.

In this subsection, both Shannon- and Tsallis-DAFCM are examined.

4.2.1. Experiment 1

In experiment 1, the numerical data composed of five clusters and 2000 data points are used, as shown in Figure 2. The parameters are set as follows: c= 10, δ1 = 50, δ2 = 2, q= 1.5, Thigh = 1.0×106 or 1.0×105.

Figure 2.

Numerical data.

First, the inversely exponential cooling function is applied to Tsallis-DAFCM. The changes of βparameterized by rare plotted in Figure 3. In case of r= 1000.0, when Thigh = 1.0×105, as Tis lowered from Figures 3 (A) to (D), data are clustered gradually and desirably. In case of r= 10.0 and r= 1.0 (Figures 3 (E) and (F), respectively), it is observed that uikand viconverge more rapidly.

Figure 3.

Increasing of β by inversely exponential cooling function.

However, when Thigh = 1.0×106, the algorithm fails to converge with r= 100.0 and 10.0 (expressed by “Not converged” in Figure 3) because the initial distribution of uikis too wide. This result indicates that it is important to set both Thighand rproperly.

Figure 4.

Shifts of cluster centers during clustering obtained by (a) Shannon-DAFCM and (b) Tsallis-DAFCM.

In order to clarify the adaptability of VFA as a cooling function of DA, numerical experiments on Shannon- and Tsallis-DAFCM are performed. The shifts of cluster centers with decreasing temperature are illustrated in Figures 4 (a) and (b). Initially, clusters are located randomly. Then, at the higher temperature, βis comparatively small and clusters move to near the center of gravity of data because the membership functions are extend over the data area and become extremely uniform. As Tis lowered, contrarily, the membership functions become narrower and the associations of data to the clusters become less fuzzy. In this process, in Shannon-DAFCM, the clusters move to their nearest local data distribution centers. However, in Tsallis-DAFCM, the clusters can move a long distance to optimal positions because the membership functions have gentle base slopes.

Figures 5 (a) and (b) illustrates the three-dimensional plots of uikin the progress of Shannon- and Tsallis-DAFCM clustering combined with VFA. When the temperature is as high as 3.7×104, roughness of uikof Tsallis-DAFCM is smaller than that of Shannon-DAFCM. After that, the shapes of both membership functions do not change greatly, because VFA reduces the temperature extremely only at the early annealing stage. When the temperature is lowered to 1.3×104, both methods cluster data desirably.

Figure 5.

Initial and final landscapes ofuikof (a) Shannon-DAFCM and (b) Tsallis-DAFCM.

Consequently, because Tsallis-DAFCM has gentle slope in the region far from the origin, clusters can move long distance to optimal positions stably. This feature makes it possible to reduce the temperature rapidly. Thus, VFA is suitable as a cooling function of Tsallis-DAFCM. On the other hand, final cluster positions obtained by Shannon-DAFCM tend to depend on their initial positions.

4.2.2. Experiment 2

In experiment 2, the Iris Data Set [17] consisting of 150 four-dimensional vectors of iris flowers is used. Three clusters of flowers detected are Versicolor, Virginia, and Setosa. Each cluster consists of 50 vectors. VFA is used as a cooling function of DA. The parameters are set as follows: c= 3, δ1 = 0.1, δ2 = 0.01, q= 1.5, Thigh = 2.0.

The minimum, maximum, and average values of misclassified data points of 100 trials are summarized in Table 2. It can be seen that Shannon-DAFCM gives slightly better results than Tsallis-DAFCM. However, it is also confirmed that Tsallis-DAFCM gives its best results when the temperature reduction rate ris set to 1 or 2, though the best result for Shannon-DAFCM is obtained only when r= 2. Furthermore, variances of Tsallis-DAFCM are smaller than those of Shannon-DAFCM. These features indicate that a wide range of rvalues are applicable to Tsallis-DAFCM. On the other hand, with larger rvalues, Shannon-DAFCM becomes unstable.

Shannon-DAFCMTsallis-DAFCM
rMin.Max.Ave.rMin.Max.Ave.
1151615.011141414.00
2111312.972141514.01
3111413.593141514.68

Table 2.

Misclassified data points of Iris Data Set (100 trials).

Figure 6.

Plots ofuq(x).

4.3. Dependencies of the membership function on temperature and q

In Eq. (16), it can be seen that uikqplays an important role as a weight value to each xk, and it determines vi. For this reason, dependencies of uikqon Tand qare to be investigated.

In this subsection, for simplicity, viis set to be 0, because this makes the denominator of Eq. (15) become the sum of the same formulas of its numerator. Figure 6 illustrates the numerator of uikq(expressed by uq(x)) as a function of xk, parameterized by Tand q, respectively. In order to plot the shape of uikqas a function of the distance between the cluster center and various data points, in this figure, xkis replaced to a continuous variable x. Figure 6 confirms that the extent of uikqbecomes narrower with increasing q. On the contrary, as the temperature decreases, the distribution becomes narrower.

4.3.1. Quantitative relationship between temperature and q

As stated in the previous subsection, Tand qinversely affect the extent of uikq, which changes in a similar way with decreasing Tor increasing q. In order to examine the quantitative relationship between Tand qin more detail, they are changed independently as follows:

First, we define

uq(x,T,q)={11qTx2}q1q.E27

Then, Eq. (27) is calculated by fixing Tand qto some constants T0 and q0. Next, by decreasing T, we search the qvalues that minimize the sum of squares of the residuals of the following two functions:

S=0Smax|uq(ΔxS,T0,q0)uq(ΔxS,T,q)|2.E28

In the following calculations, the parameters are set as follows: Thigh(= T0) = 2.0; the domain of xis 0 ≤ x ≤ 100; the number of sampling points of the sum of residuals Smax = 10,000; ∆x = 0.01.

For q(= q0) values of 1.01, 2.0, 6.0, 10.0, and for Tdecreasing from Thigh, the qvalues that minimize Eq. (28) (expressed by qmin) are shown in Figure 7 (a). Figure 7 (b), on the other hand, shows the results of cases in which qis set to 2.0 and Tis lowered from Thigh = 2.0, 20.0, 100.0, 200.0. Approximate curves plotted in Figure 7 are obtained by fitting the data to the following exponential function:

Figure 7.

Plots ofqminas a function ofTparameterized by (a)qand (b)Thigh.

qmin=aTb,E29

where aand bdenote the fitting parameters. Optimal values for these parameters obtained by the least squares method are summarized in Table 3 and Table 4. From these tables, it is concluded that bis nearly equal to 1.0 indicating that qis inversely proportional to T. In addition, it can be seen that, though bdoes not change its value much, aincreases with increasing T.

qab
1.012.711.126
24.671.066
612.651.023
1020.641.014

Table 3.

Parameters of approximate curves (Thigh = 2.0).

Thighab
24.671.066
2054.331.066
100302.061.066
200632.351.066

Table 4.

Parameters of approximate curves (q= 2.0).

As a result, by using the approximate relationship of Tand qmin, instead of annealing or T-reduction, q-incrementation clustering might be possible [18].

Advertisement

5. Conclusion

In this chapter, we first explained the major characteristics of DA and compared it with SA. DA is a variant of SA and searches a minimum deterministically. Thus, generally it is more efficient than SA. We then explained how DA could be applied to the fuzzy c-means clustering by employing the entropy maximization method.

After that, by focusing on Tsallis-entropy-maximized FCM combined with DA (Tsallis-DAFCM), an effect of VFA on DA was examined. VFA reduces the temperature extremely only at the early annealing stage, and the experimental result showed that this feature improved the performance of Tsallis-DAFCM because it has gentle slope in the region far from the origin and clusters can move long distance to optimal positions from the beginning.

The Tsallis entropy is an extension of the Shannon entropy with a generalization parameter q. A shape of a membership function of Tsallis-DAFCM strongly depends on both the temperature and q. Accordingly, a relationship between the temperature and qwas quantitatively investigated, and it was experimentally confirmed that they affected the area covered by the membership function almost inversely. Based on the result, a development of a q-incrementation algorithm is our future subject.

Notes

  • In clustering, a state space refers to an input space of a data set. For example, if a data set consists of 3 attributes, D is set to 3.

© 2017 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Link to this chapter Copy to clipboard

Cite this chapter Copy to clipboard

Makoto Yasuda (April 26th 2017). Deterministic Annealing: A Variant of Simulated Annealing and its Application to Fuzzy Clustering, Computational Optimization in Engineering - Paradigms and Applications, Hossein Peyvandi, IntechOpen, DOI: 10.5772/66072. Available from:

chapter statistics

1725total chapter downloads

3Crossref citations

More statistics for editors and authors

Login to your personal dashboard for more detailed statistics on your publications.

Access personal reporting

Related Content

This Book

Next chapter

Generalized Simulated Annealing

By Yang Xiang, Sylvain Gubian and Florian Martin

Related Book

First chapter

Multi-Agent Based Distributed Manufacturing

By J. Li, J.Y. H. Fuh, Y.F. Zhang and A.Y.C. Nee

We are IntechOpen, the world's leading publisher of Open Access books. Built by scientists, for scientists. Our readership spans scientists, professors, researchers, librarians, and students, as well as business professionals. We share our knowledge and peer-reveiwed research papers with libraries, scientific and engineering societies, and also work with corporate R&D departments and government entities.

More About Us