Collective Solutions on Sets of Stable Clusterings

Vladimir Vasilevich Ryazanov

doi:10.5772/intechopen.76189

Abstract

Two clustering problems are considered. We consider a lot of different clusters of the same data for a given number of clusters. Data clustering is understood as their stable partition into a given number of sets. Clustering is considered stable if the corresponding partitioning remains unchanged with its minimum change. How to create a new cluster based on ensemble clusterings? The second problem is the following. A definition of the committee synthesis as ensemble clustering is introduced. The sets of best and worst matrices of estimates are considered. Optimum clustering is built on the basis of the clusterings obtained as being closest to the set of the best estimation matrices or as the most distant from the set of worst-case matrices of estimates. As a result, the problem of finding the best committee clustering is formulated as a discrete optimization problem on permutations.

Keywords

clustering
algorithm
ensemble
collective
stability
optimality
construction

Author Information

Show +

Vladimir Vasilevich Ryazanov*
- Dorodnicyn Computing Centre, Federal Research Center, “Computer Science and Control” of Russian Academy of Sciences, Moscow, Russia

*Address all correspondence to: rvvccas@mail.ru

1. Introduction

There are many different approaches to solving the problems of clustering multidimensional data: based on the optimization of internal criteria (indices) [1, 2], hierarchical clustering [3], centroid-based clustering [4], density-based clustering [5], distribution-based clustering [6], and many others. There are well-known books and papers on clustering [7, 8, 9, 10].

This section is devoted to one approach to the creation of stable clusterings and the processing of their sets. A natural criterion is considered, which is applicable to any clustering method. In work [11], various criteria (indices) are proposed, optimizing which clustering is built with a definite look “what is clustering?” In this chapter, we use a criterion based on stability. If we really got clustering, that is, a solution for the whole sample, the partitioning should not change with a small change in the data. Criteria are introduced for the quality of the partition obtained. If the criterion value is less than one, then the partition is unstable. Let us obtain for the same data N clusterings. How to create a new ensemble clustering based on the N partitions? Previously, a committee method for building ensemble clusterings was proposed [12, 13, 14, 15]. Let there be N results of cluster analysis of the same data for l clusters. The committee method of building ensemble clustering makes it possible to build such l clusters, each of which is the intersection of “many” initial clusters. In other words, we find such l clusters whose objects are “equivalent” to each other according to several principles. As initial N clusterings, one can take stable ones. Finally, we consider a video-logical approach to building the initial N coarse clusterings.

2. Criteria for stability of clustering

Let the sample of objects Χ=xii=12…m,xi∈Rn be given and Κ=K1K2…Kl is the clustering of the sample into l clusters obtained by some method, Ki⊆Χ,i=1,2,…,l,∪1lKi=Χ,Ki∩Kj=∅,i≠j. Speaking of clustering, we mean applying a method to a sample without focusing on the method itself. Is partition Κ of a sample by this method clustering or here some kind of stopping criterion is satisfied? For example, an extremum of some functional is obtained or the maximum number of operations in the iterative process is fulfilled. We will use the following thesis as the main one. If the resulting partition Κ is indeed clustering, then it must be the same clustering for any minimal change in the sample Χ. Let xi be arbitrary, xi∈Kα ensemble then the sample Χ∖xi partition Κ∗xi=K1∗K2∗…Kl∗, Kj∗=Kj,j=1,2,…,l,j≠α,Kα∗=Kα∖xi,i=1,2,…,m must be clustering. The fact of “coincidence” of clusterings Κ=K1K2…Kl and Κ∗xi=K1∗K2∗…Kl∗ will be called identity, the clusterings themselves are identical and denoted it as Κ∗xi≈Κ. In this case, it is natural to call a partition Κ as stable clustering if the partitions Κ∗xi and Κ coincide for all xi,i=1,2,…,m. In the case of non-identity of some individual Κ∗xi with Κ, we will call Κ as quasi-clustering.

Definition 1. The quality of quasi-clustering (of unstable clustering) is the quantity ФΚ={xii=12…m:Κ∗xi≈Κ}/m.

If ФΚ=1, then in this case, we will talk about stable clustering Κ or simply clustering. Suppose that for some i,i=1,2,…,m the condition Κ∗xi≈Κ is not satisfied, and Κ∘xi=K1∘K2∘…Kl∘ is the clustering of the sample X∖xi obtained from the partition Κ∗xi using Κ∗xi as the initial approximation. Then Κ∘xi can significantly differ from Κ∗xi. We will use as a function of the proximity between clustering Κ∘xi and partitioning Κ the value dΚ∘xiΚ=maxα∑i=1lKi∘∩Kαi/m−1. Note that to calculate proximity it is required to find the maximum matching in a bipartite graph, for which there is a polynomial algorithm [16]. If Κ∘xi does not exist, we will assume that dΚ∘xiΚ=0.

Definition 2. The quality Fmin(Κ) of the quasi-clustering Κ will be called the quantity FminΚ=minidΚ∘xiΚ.

Definition 3. The quality Favr(Κ) of the quasi-clustering Κ will be called the quantity FavrΚ=∑i=1mdΚ∘xiΚ/m.

For some clustering algorithms, there are simple economical rules for computing ФΚ. Let us bring them (see also in [3, 17, 18]).

2.1. Method of minimizing the dispersion criterion

It is known that in order to minimize the dispersion criterion, it suffices to satisfy inequalities

njnj−1x×−mj2−nknk+1x×−mk2≤0E1

for any clusters Kj and Kk, arbitrary x×∈Kj, where nj=Kj, mj=1nj∑xt∈Kjxt.

We establish the conditions for the identity Κ∗xi≈Κ of the partitions Κ∗xi and Κ. In the case x×∈Kj [considering (Eq. (1))] to satisfy the condition Κ∗xi≈Κ inequalities.

nj−1nj−2x×−mj2+2nj−2x×−mjxi−mj+1nj−1nj−2xi−mj2−nknk+1x×−mk2≤0 must be satisfied. In the case x×∈Kk inequalities nknk−1x×−mk2−nj−1njx×−mj2−2njx×−mjxi−mj−1njnj−1xi−mj2≤0 must be satisfied.

2.2. k-means method

Let the clustering Κ be obtained by k-means method, that is, x×−mj≤x×−mk,∀j≠k,∀x×∈Kj. In the case of equality, the object is considered to belong to a cluster with a lower number. Then, Κ∗xi≈Κ is satisfied if x×−mj2+2nj−1x×−mjxi−mj+1nj−12xi−mj2≤x×−mk2 under x×∈Kj,x×≠xi and x×−mk2≤x×−mj2+2nj−1x×−mjxi−mj+1nj−12xi−mj2 under x×∈Kk.

2.3. Method of hierarchical agglomeration grouping

We confine ourselves to the case of an agglomeration hierarchical grouping. To find the value of the criterion ФΚ, you can calculate the partitioning Κ, partitions Κ∘xi,i=1,2,…,m, and compare Κ with each Κ∘xi,i=1,2,…,m. Here it is possible to save in the calculation of ФΚ without carrying through the clustering for some of “i”. Indeed, let there Κtxi=K1tK2t…Km−tt be clustering of the sample X∖xi into m−t clusters, t≤m−l. Κ is a partition obtained by the clustering algorithm X. The main property of the hierarchical grouping is that for any k=1,2,…,m−t there is j=1,2,…,m−t−1 for which Kkt⊆Kjt+1. In this case, if at some step t,t≤m−l for some k the condition Kkt⊆Kj does not hold for all j=1,2,…,l, then the condition Κ∗xi≈Κ will not be fulfilled.

2.4. Examples

We give some examples illustrating the stability criteria introduced.

Below are the results obtained for model samples. The method of clustering based on the minimization of the dispersion criterion [3] has been used. As the initial data, we used samples of a mixture of two two-dimensional normal distributions with independent features, different а, and σ. Examples are shown in Figures 1–3 (images of the samples in question) and in Tables 1 and 2. Figure 1 represents a sample of 200 objects for which all the criteria ФΚ, Fmin(Κ), Favr(Κ) are equal to 1, and the resulting clustering into two clusters is stable clustering. Here we used distributions with parameters а1=00,а2=99, and σ1=σ2=33.
Further, with the same parameters а1,а2, experiments were carried out for σ1=σ2=55.
Then, we used distributions with parameters а1=00,а2=99, σ1=σ2=1010, m=200. In this case, we have the case of strongly intersecting distributions. Formally, the clustering method gives a quasi-clustering, approximately corresponding to the partitioning of the original sample (Figure 3) into two sets by a diagonal from the upper left corner of the picture to the lower right. The values of the criteria in Table 2 were obtained.
Data clustering of [19] and criteria values ФΚ, Fmin(Κ), Favr(Κ). The following data from classification problem of electromagnetic signals were considered: n=34, m1=225, m2=126, l=2. We give the values of the stability criteria obtained. Figure 4 shows the visualization [3] of the sample. The accuracy of the supervised classification methods was about 87% of the correct answers. However, the clustering of data turned out to be only quasi-clustering (Table 3).

Figure 1.
Clustering in a task with parameters а1=00,а2=99, σ1=σ2=33, m=200.

Figure 2.
Clustering in a task with parameters а1=00,а2=99, σ1=σ2=55, m=200.

Figure 3.
Data with parameters а1=00,а2=99, σ1=σ2=1010, m=200.

ФΚ	0.995
Fmin(Κ)	0.995
Favr(Κ)	0.999

Table 1.

Values of quasi-clustering criteria.

ФΚ	0.770
Fmin(Κ)	0.995
Favr(Κ)	0.998

Table 2.

Values of quasi-clustering criteria. ?ase of very intersecting distributions.

ФΚ	0.966
Fmin(Κ)	0.997
Favr(Κ)	0.999

Table 3.

The values of the criteria in the problem “ionosphere” ФΚ, Fmin(Κ), Favr(Κ).

3. Committee synthesis of ensemble clustering

The problem is as follows. There are N clusterings for the same number of clusters. How to choose from them the only one or build a new clustering from the available ones? In the supervised classification problem (with the help of a collective solution of a set of algorithms) there is a criterion according to which one can choose an algorithm from existing ones or build a new algorithm. This is a supervised classification error. This direction in the theory of classification appeared in the early 1970s of the last century [20, 21], then was created an algebraic approach [22], various correctors were appeared. The key in the algebraic approach is the creation in the form of special algebraic polynomials of a correct (error-free) algorithm based on a set of supervised classification algorithms. Some algebraic operations on matrices of “degrees of belonging” of recognized objects are used. Various types of correctors were also created [22, 23, 24, 25], when the problem of constructing (and applying) the best algorithm is also solved in two stages. First, the supervised classification algorithms are determined, and then the corrector. This can be, for example, the problem of approximating a given partial Boolean function by some monotonic function. In recent decades, there are conferences on multiple classifier systems, these issues are reflected in the books [21, 10]. How to choose or create the best clustering using a finite set of given solutions? Here, all problems are connected primarily with the absence of a single generally accepted criterion. Each clustering algorithm finds such “source” clusters of objects that are “equivalent” to each other. In this chapter, it is proposed to build such a clustering of the initial data, the cluster solutions of which have a large intersection with the initial clusters.

Let the sample of objects Χ=x1x2…xm,xi∈Rn for supervised classification and l classes are given. In the theory of supervised classification, the following definition of the supervised classification algorithm exists [21]. Let αij∈01 be equal to 1 when the object xi,i=1,2,…,m is classified by the algorithm Ar as xi∈Kj and 0 otherwise: ArΧ=αijm×l. Here the intersection of classes is allowed. Unlike the supervised classification problem, when clustering a sample, we have freedom in the designation of clusters.

Definition 4. The matrices I=αijm×l,αij∈01 and I'=αij'm×l,αij'∈01 are said to be equivalent if they are equals to within a permutation of the columns.

It is clear that this definition defines a class of equivalent matrices for some matrix.

Definition 5. A clustering algorithm is an algorithm that maps a sample Χ to a set of equivalent information matrices AcΧ=Καijm×l.

The number of clusters and the length of the control sample are considered to be given. This definition emphasizes the fact that in an arbitrary partition of a sample into l clusters, we have complete freedom in the numbering of clusters. In what follows we shall always consider matrices of dimension m×l.

Let there be given N algorithms A1c,A2c,…,ANc for clustering and their solutions AνcΧ=Καijvm×l for sample Χ. We denote Iν=αijνm×l an arbitrary element of the clustering Καijvm×l.

Therefore, we have Ι=ΚI1×ΚI2×…×ΚIN or set Ι=I1'I2'…IN'Iν'∈ΚIν, Iν'=αij'νm×l.

There are two problems.

Construction of the mapping Ι on, Κc, Ι→Κc=Κсijm×l,сij∈01 (that is, the construction of some kind of clustering).
Finding the optimal element in Κc (i.e. finding the best clustering in Κc).

Definition 6. An operator ΒI1'I2'…IN'=B=bijm×l is called an adder if bij=∑ν=1Nαij'ν.

It is clear that 0≤bij≤N,bij∈012…N.

Definition 7. An operator r is called a threshold decision rule, if rB=С=сijm×l, cij=1,bij≥δi,0,otherwise, where δi∈R.

Definition 8. By the committee synthesis of an information matrix С on an element I˜'=I1'I2'…IN' let us call it a computation by the formula С=rΒI˜', provided that Β is the adder and r is the threshold decision rule.

The general scheme of collective synthesis is shown in Figure 5.

Figure 5.
Scheme of committee synthesis.

We note that the total number of possible values B is bounded from above by a quantity l!N. Let s be the operator that performs permutation of columns of matrices m×l with the help of a substitution <j1,j2,…,jl>, S=s is the set of all operators s. We believe that rs=sr,∀s∈S. We continue s∈S to the n-dimensional case σI˜'=sI1'sI2'…sIn'. We denote Σ=σ,σ is the extension of s. From the definition of the adder it follows that σΒ=Βσ,∀σ∈Σ. Further, ∀I˜'∈Ι,∀σ∈Σ we have rΒσI˜'=rσΒI˜'=srΒI˜' and finally σI˜'σ∈Σ→rΒsrΒI˜'s∈S=ΚrΒI˜'=Κcijm×l. Therefore, the product rΒ defines the desired mapping and specifies some ensemble clustering. It is necessary to determine the optimal element from Κc, find it and I˜'.

Ι→rΒΚc, AI˜'cΧ=ΚrΒI˜'.

We introduce definitions of potentially best and worst-case solutions. As the “ideal” of the collective solution, we will consider the case when all algorithms give us essentially the same partitions or coverings.

Definition 9. A numerical matrix bijm×l is called contrasting if bij∈0N. A numeric matrix bijm×l is called blurred if bij=δi∈R.

As the distance between two numerical matrices, we consider the function ρB1B2=∑i=1m∑j=1lbij1−bij2.

Denote by Μ the set of all contrast matrices, and by M˜ the set of all blurred matrices. We introduce definitions for estimating the quality of matrices.

Definition 10.

ΦB=ρBΜ→Bmin.E2

Definition 11.

Φ˜B=ρBΜ˜→Bmax.E3

The set Μ˜'=B˜ (where B˜=b˜ijm×l,b˜ij=N2) is called the mean blurred matrix.

Definition 12.

Φ˜'B=ρBB˜→BmaxE4

We note that the optimums according to the criteria (Eq. (2)) and (Eq. (3)) do not have to coincide. The sets Μ and Μ˜ intersect.

Figure 6 illustrates the sets of contrasting and blurred matrices. Arrows indicate some elements of sets.

Figure 6.
The sets of contrasting Μ, blurred Μ˜ matrices, and the set of matrices B.

Theorem 1. The sets of optimal solutions by criteria Eqs. (2) and (4) coincide.

Let us show that ΦB+Φ˜'B = Nml2 for any B. We write Φ˜'B=∑i=1m∑j=1lα˜ij,α˜ij=bij−N2,ΦB=∑i=1m∑j=1lαij∗,αij∗=minbijN−bij. If bij≥N2 then α˜ij=bij−N2,αij∗=N−bij, and α˜ij+αij∗=N2. If bij<N2 then α˜ij=N2−bij,αij∗=bij, and α˜ij+αij∗=N2.

Summing over all the set of values of pairs of indices i,j, we get that ΦB+Φ˜'B = Nml2.

We consider the problem of finding optimal ensemble clusterings for the criterion (2). It is clear that ΦB=∑i=1m∑j=1lminbijN−bij.

We introduce the notations M=12…m, Xj=ibij≥N2i=12…m, Yj=M∖Xj, j=1,2,…,l. Let πν=<μ1ν,μ2ν,…,μlν>, ν=1,2,…,N be some permutation of the set π0=<1,2,…,l>. A set of permutations π=<π1,π2,…,πN> uniquely determines the matrix of estimates.

B'=bij'm×l,bij'=bijπ=∑ν=1Nαij'ν.

We will further assume that the “initial” matrix αijνm×l of the algorithm Aνc corresponds to the permutation π0. αij'νm×l is the matrix of the algorithm Aνc corresponding to some permutation πν. Then αij'ν=αiμjνν.

Consider Δ˜ν=∑j=1l∑i∈Xjα¯ijν+∑i∈Yjαijν, Δ˜ν'=∑j=1l∑i∈Xjα¯ij'ν+∑i∈Yjαij'ν.

Then Δν=Δ˜ν'−Δ˜ν=∑j=1l∑i∈Xjαijν−αij'ν+∑i∈Yjαij'ν−αijν. We convert this expression.

The identity ∑j=1l∑i∈Xjαijν+∑i∈Yjαijν=∑j=1l∑i∈Xjαiμjνν+∑i∈Yjαiμjνν is valid. Get Δν=∑j=1l∑i∈Xjαijν−αiμjνν+∑i∈Yjαiμjνν−αijν=2∑j=1l∑i∈Xjαijν−αiμjνν=2∑j=1l∑i∈Xjαijν−2∑j=1l∑i∈Xjαiμjνν.

Thus, minimizing a function is equivalent to maximizing the second sum of the expression.

After applying the permutations π=<π1,π2,…,πN>, the sets Xj,Yj,j=1,2,…,l change. We introduce the notations M1j=Xj∖Yj'∖Yj, M2j=Yj'∖Yj, M3j=Yj∖Xj'∖Xj, M4j=Xj'∖Xj.

Figure 7 schematically shows the changes in sets Xj,Yj,j=1,2,…,l.

Figure 7.
Sets Xj,Yj,j=1,2,…,l are changed.

Theorem 2

ΔΦ=ΦB'−ΦB≤∑ν=1NΔν+∑ν=1NM2j−2,N−even,−1,N−odd+M4j0,N−even,−1,N−odd

The proof is given in [12, 13]. Theorem 2 is the basis for creating an effective minimization algorithm of Φ.

Since the second sum is always not positive, we have an upper bound. We consider the problem of minimizing a function Δν. We write out all possible variants of the function ∑j=1l∑i∈Xjαiμjνν in the form of a table in Figure 8. Then the minimum of this function is reduced to finding the maximum matching of the bipartite graph, for finding which we can use the polynomial Hungarian algorithm [16].

Figure 8.
All possible variants of ∑j=1l∑i∈Xjαiμjνν for all admissible j and i.

It is clear that minπνΔν≤0. Now we can propose the following heuristic algorithm for steepest descent.

Algorithm.

1. We calculate Xj,j=1,2,…,l.

2. We find Δν∗=minπνΔν for each ν.

If ∑ν=1NΔν∗<0, then apply the found permutations πν=<μ1ν,μ2ν,…,μlν>,ν=1,2,…,N and go to step 1).

If ∑ν=1NΔν∗=0 then the END of algorithm.

NOTE. We note that our algorithm does not even find a local minimum of the criterion ΦB. Nevertheless, this algorithm is very fast, its complexity at each iteration is estimated as Ol5mN.

4. The algorithm of collective k-means

Results of clustering by N algorithms of sampling of m objects to l clusters solutions are obtained, which we can write in the form of a binary matrix αijv,ν=1,2,…,N,i=1,2,…,m,j=1,2,…,l. We assume that the cluster numbers in each algorithm are fixed. Then any horizontal layer number i of this three-dimensional matrix will denote the results of object xi clustering. As an ensemble clustering of the sample Χ, we can take the result of clustering the “new” descriptions—the layers of the original matrix αijv,ν=1,2,…,n. As a method of clustering, we take the method of minimizing the dispersion criterion. Let there be a lot of N clusterings αi1jv,αi2jv,…,αiNjv with heuristic clustering algorithms, then we calculate their sample mean αj∗ν as the solution of the problem ∑μ=1tαj∗v−αiμjv2→minαj∗v. Where do we obtain αj∗v=1N∑μ=1Nαiμjv. Note that this method makes it possible to calculate such ensemble clusterings Κ=K1∗K2∗…Kl∗ that the sets of heuristic clustering of the objects of some cluster of the collective solution will be close to each other in the Euclidean metric. The committee synthesis of collective decisions provides more interpretable solutions. Indeed, if Κν=K1νK2ν…Klν,ν=1,2,…,N are separate solutions of heuristic clustering algorithms, then the cluster of collective solution will be the “intersection” of many some original clusters Ki11,Ki22,…,KilN.

5. Man-machine (video-logical) clustering method

In the problems of ensemble clustering synthesis considered earlier, we did not consider the number of initial clustering algorithms, their quality and their proximity. Ensemble clustering was built and reflected only the opinion of the collective decisions that we used. “Internal” indices [9] reflect the person’s ideas about clustering. You can think up examples of data when known internal criteria lead to degenerate solutions.

At the same time, a person has the ability to cluster visual sets on a plane without using any proximity functions, criteria and indices. The following idea was realized. A person can personally cluster projections of sets of points from Rn into R2. Having made such clusterings under different projections, we can construct generally speaking various N clusterings, which we submit to the input of the construction of the collective solution. The person himself “does not see” the objects in Rn, but can exactly solve the clustering tasks on the plane. Thus, here we use N precise solutions, but of various partial information about the data. Consider this video-logical method on one model example.

A sample of two normal distributions with independent characteristics was considered. The first feature of the first distribution (200 objects) had zero expectation and the standard deviation, the first attribute of the second distribution (200 objects) had these values equal to 5. All the other 49 attributes for all objects had аi=5, σi=5,i=2,3,…,50. That is, the two sets had equal distributions for 49 features and one informative feature. Clustering of the entire sample by minimizing dispersion is shown in Figure 9. Black and gray points on sample visualization represent the objects of the first and second clusters. Here the fact of informative character of the first feature is lost.

Figure 9.
Clustering of a sample of model objects by the method of minimizing variance.

The program of the video-logical approach worked as follows. With the help of a single heuristic approach, all Cn2 projections are automatically ordered according to the descending criteria of the presence of two clusters. Next we as experts consider some projections and with the help of the mouse we select in each of them two clusters. Figure 10 shows two such examples. Note that the first feature was present in all projections. It was used “manually” as the defining area for the dense location of objects. Then 10 “manual” clustering went to the program entrance for the committee synthesis of the collective solution. Note that only two objects were erroneously clustered.

Figure 10.
Allocation of clusters by mouse on the (1.4) and (1.6) features.

6. Conclusion

This chapter consists of two parts. First, clustering criteria based on sustainability are introduced. Next, we propose an approach to processing the sets of obtained partitions of the same sample. As the initial clustering, it is better to use stable clustering. It is shown how a person can be used in the construction of the committee synthesis of ensemble clustering.

Acknowledgments

The reported study was funded by RFBR according to the research project No 17-01-00634 and No 18-01-00557.

References

1. Halkidi M, Batistakis Y, Vazirgiannis M. Cluster validity methods: Part 1. SIGMOD Record. 2002;31(2):40-45. DOI: 10.1145/601858.601862
2. Aggarwal C, Reddy C. Data Clustering: Algorithms and Applications. CRC Press; 2014
3. Duda R, Hart P, Stork D. Pattern Classification. 2nd ed. New York: Wiley; 2000
4. Lloyd S. Least squares quantization in PCM (PDF). IEEE Transactions on Information Theory. 1982;28(2):129-137. DOI: 10.1109/TIT.1982.1056489
5. Kriegel H, Kröger P, Sander J, Zimek A. Density-based clustering. WIREs Data Mining and Knowledge Discovery. 2011;1(3):231-240. DOI: 10.1002/widm.30
6. Dempster A, Laird N, Rubin D. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B. 1977;39(1):1-38 JSTOR 2984875. MR 0501537
7. Jain A, Dubes R. Algorithms for Clustering Data. Englewood Cliffs: Prentice-Hall, Inc.; 1998
8. Kaufman L, Rousseeuw P. Finding Groups in data: An Introduction to Cluster Analysis. New York: Wiley; 2009
9. Aggarwal C. Data Mining: The Textbook. Yorktown Heights/New York: IBM T.J. Watson Research Center; 2015. 771 p. DOI: 10.1007/978-3-319-14142-8
10. Kuncheva L. Combining Pattern Classifiers: Methods and Algorithms. Hoboken: Wiley; 2004. DOI: 10.1002/9781118914564
11. Desgraupes B. Clustering indices. University Paris Ouest. Lab Modal'X; 2013
12. Ryazanov V. Commitee synthesis of algorithms for recognition and classification. Journal of Computational Mathematics and Mathematical Physics. 1981;21(6):1533-1543. DOI: 10.1016/0041-5553(81)90161-0
13. Ryazanov V. On the synthesis of classification algorithms on finite sets of classification algorithms (taxonomy). Journal of Computational Mathematics and Mathematical Physics. 1982;22(2):429-440. DOI: 10.1016/0041-5553(82)90049-0
14. Ryazanov V. One approach for classification (taxonomy) problem solution by sets of heuristic algorithms. In: Proceedings of the 9-th Scandinavian Conference on Image Analysis; 6–9 June 1995; Uppsala; 1995(2). pp. 997-1002
15. Biryukov A, Shmakov A, Ryazanov V. Solving the problems of cluster analysis by collectives of algorithms. Journal of Computational Mathematics and Mathematical Physics. 2008;48(1):176-192. DOI: 10.1134/S0965542508010132
16. Kuhn H. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly. 1955;2:83-97. DOI: 10.1002/nav.3800020109
17. Ryazanov V. Estimations of clustering quality via evaluation of its stability. In: Bayro-Corrochano E, Hancock E, editors. CIARP 2014. LNCS. Vol. 8827; 2014. pp. 432-439. DOI: 10.1007/978-3-319-12568-8_53
18. Ryazanov V. About estimation of quality of clustering results via its stability. Intelligent Data Analysis. 2016;20:S5-S15. DOI: 10.3233/IDA-160842
19. Sigillito V, Wing S, Hutton L, Baker K. Classification of radar returns from the ionosphere using neural networks. Johns Hopkins APL Technical Digest. 1989;10:262-266
20. Rastrigin L, Erenstein R. Collective decision-making in pattern recognition. Avtomatica i telemehanika. 1975;9:133-144
21. Method of committees in pattern recognition. Sverdlovsk, IMM AN USSR; 1974
22. Yu Z. On the algebraic approach to solving problems of recognition or classification. Problems of Cybernetics. 1978;33:5-68
23. Zuev Y. The method of increasing the reliability of classification in the presence of several classifiers, based on the principle of monotony. Journal of Computational Mathematics and Mathematical Physics. 1981;21(1):157-167
24. Krasnoproshin V. About the optimal corrector of the set of recognition algorithms. Journal of Computational Mathematics and Mathematical Physics. 1979;19(1):204-215
25. Zhuravlev Y. Selected Scientific Works. Moscow: Publishing House Magister; 1998. p. 420

[1] 1. Halkidi M, Batistakis Y, Vazirgiannis M. Cluster validity methods: Part 1. SIGMOD Record. 2002;31(2):40-45. DOI: 10.1145/601858.601862

[2] 2. Aggarwal C, Reddy C. Data Clustering: Algorithms and Applications. CRC Press; 2014

[3] 3. Duda R, Hart P, Stork D. Pattern Classification. 2nd ed. New York: Wiley; 2000

[4] 4. Lloyd S. Least squares quantization in PCM (PDF). IEEE Transactions on Information Theory. 1982;28(2):129-137. DOI: 10.1109/TIT.1982.1056489

[5] 5. Kriegel H, Kröger P, Sander J, Zimek A. Density-based clustering. WIREs Data Mining and Knowledge Discovery. 2011;1(3):231-240. DOI: 10.1002/widm.30

[6] 6. Dempster A, Laird N, Rubin D. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B. 1977;39(1):1-38 JSTOR 2984875. MR 0501537

[7] 7. Jain A, Dubes R. Algorithms for Clustering Data. Englewood Cliffs: Prentice-Hall, Inc.; 1998

[8] 8. Kaufman L, Rousseeuw P. Finding Groups in data: An Introduction to Cluster Analysis. New York: Wiley; 2009

[9] 9. Aggarwal C. Data Mining: The Textbook. Yorktown Heights/New York: IBM T.J. Watson Research Center; 2015. 771 p. DOI: 10.1007/978-3-319-14142-8

[10] 10. Kuncheva L. Combining Pattern Classifiers: Methods and Algorithms. Hoboken: Wiley; 2004. DOI: 10.1002/9781118914564

[11] 11. Desgraupes B. Clustering indices. University Paris Ouest. Lab Modal'X; 2013

[12] 12. Ryazanov V. Commitee synthesis of algorithms for recognition and classification. Journal of Computational Mathematics and Mathematical Physics. 1981;21(6):1533-1543. DOI: 10.1016/0041-5553(81)90161-0

[13] 13. Ryazanov V. On the synthesis of classification algorithms on finite sets of classification algorithms (taxonomy). Journal of Computational Mathematics and Mathematical Physics. 1982;22(2):429-440. DOI: 10.1016/0041-5553(82)90049-0

[14] 14. Ryazanov V. One approach for classification (taxonomy) problem solution by sets of heuristic algorithms. In: Proceedings of the 9-th Scandinavian Conference on Image Analysis; 6–9 June 1995; Uppsala; 1995(2). pp. 997-1002

[15] 15. Biryukov A, Shmakov A, Ryazanov V. Solving the problems of cluster analysis by collectives of algorithms. Journal of Computational Mathematics and Mathematical Physics. 2008;48(1):176-192. DOI: 10.1134/S0965542508010132

[16] 16. Kuhn H. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly. 1955;2:83-97. DOI: 10.1002/nav.3800020109

[17] 17. Ryazanov V. Estimations of clustering quality via evaluation of its stability. In: Bayro-Corrochano E, Hancock E, editors. CIARP 2014. LNCS. Vol. 8827; 2014. pp. 432-439. DOI: 10.1007/978-3-319-12568-8_53

[18] 18. Ryazanov V. About estimation of quality of clustering results via its stability. Intelligent Data Analysis. 2016;20:S5-S15. DOI: 10.3233/IDA-160842

[19] 19. Sigillito V, Wing S, Hutton L, Baker K. Classification of radar returns from the ionosphere using neural networks. Johns Hopkins APL Technical Digest. 1989;10:262-266

[20] 20. Rastrigin L, Erenstein R. Collective decision-making in pattern recognition. Avtomatica i telemehanika. 1975;9:133-144

[21] 21. Method of committees in pattern recognition. Sverdlovsk, IMM AN USSR; 1974

[22] 22. Yu Z. On the algebraic approach to solving problems of recognition or classification. Problems of Cybernetics. 1978;33:5-68

[23] 23. Zuev Y. The method of increasing the reliability of classification in the presence of several classifiers, based on the principle of monotony. Journal of Computational Mathematics and Mathematical Physics. 1981;21(1):157-167

[24] 24. Krasnoproshin V. About the optimal corrector of the set of recognition algorithms. Journal of Computational Mathematics and Mathematical Physics. 1979;19(1):204-215

[25] 25. Zhuravlev Y. Selected Scientific Works. Moscow: Publishing House Magister; 1998. p. 420

Collective Solutions on Sets of Stable Clusterings

Recent Applications in Data Clustering

Abstract

Keywords

Author Information

Vladimir Vasilevich Ryazanov*

1. Introduction

2. Criteria for stability of clustering

2.1. Method of minimizing the dispersion criterion

2.2. k-means method

2.3. Method of hierarchical agglomeration grouping

2.4. Examples

Figure 1.

Figure 2.

Figure 3.

Table 1.

Table 2.

Figure 4.

Table 3.

3. Committee synthesis of ensemble clustering

Figure 5.

Figure 6.

Figure 7.

Figure 8.

4. The algorithm of collective k-means

5. Man-machine (video-logical) clustering method

Figure 9.

Figure 10.

6. Conclusion

Acknowledgments

References

Clustering Algorithms for Incomplete Datasets

Collective Solutions on Sets of Stable Clusterings

Recent Applications in Data Clustering

Abstract

Keywords

Author Information

Vladimir Vasilevich Ryazanov*

1. Introduction

2. Criteria for stability of clustering

2.1. Method of minimizing the dispersion criterion

2.2. k-means method

2.3. Method of hierarchical agglomeration grouping

2.4. Examples

Figure 1.

Figure 2.

Figure 3.

Table 1.

Table 2.

Figure 4.

Table 3.

3. Committee synthesis of ensemble clustering

Figure 5.

Figure 6.

Figure 7.

Figure 8.

4. The algorithm of collective k-means

5. Man-machine (video-logical) clustering method

Figure 9.

Figure 10.

6. Conclusion

Acknowledgments

References

Continue reading from the same book

Recent Applications in Data Clustering