About the book
Clustering is the process of grouping data objects into classes, or clusters, of similar objects. A cluster is a collection of data objects that are similar to one another and dissimilar to the objects in other clusters. Clustering can be used to learn the distribution of the data, to characterize the features of each cluster, and to analyze particular clusters of interest. Many algorithms have been developed for data clustering, including partitional clustering (e.g., K-means, K-medoids), hierarchical clustering, density-based clustering, and model-based clustering. However, these existing methods cannot be applied directly to high-dimensional, large-scale, distributed, or big data. To this end, the project aims to develop novel methods and algorithms for clustering various kinds of data, including generic data, time-series data, big data, large-scale data, and high-dimensional and missing data, and to investigate their theoretical properties under some mild conditions. In particular, novel dimension reduction techniques, model averaging techniques, and Bayesian tools are developed to cluster complicated data.