This chapter will survey the clustering algorithm that is unsupervised learning among data mining and machine learning techniques. The most popular clustering algorithm is the K-means clustering algorithm; It can represent a cluster of data. The K-means clustering algorithm is an essential factor in finding an appropriate K value for distributing the training dataset. It is common to find this value experimentally. Also, it can use the elbow method, which is a heuristic approach used in determining the number of clusters. One of the present clusterings applied studies is the particulate matter concentration clustering algorithm for particulate matter distribution estimation. This algorithm divides the area of the center that the fine dust distribution using K-means clustering. It then finds the coordinates of the optimal point according to the distribution of the particulate matter values. The training dataset is the latitude, longitude of the observatory, and PM10 value obtained from the AirKorea website provided by the Korea Environment Corporation. This study performed the K-means clustering algorithm to cluster feature datasets. Furthermore, it showed an experiment on the K values to represent the cluster better. It performed clustering by changing K values from 10 to 23. Then it generated 16 labels divided into 16 cities in Korea and compared them to the clustering result. Visualizing them on the actual map confirmed whether the clusters of each city were evenly bound. Moreover, it figures out the cluster center to find the observatory location representing particulate matter distribution.
Part of the book: Data Mining