Trace elements migrate among environmental compartments through natural geochemical reactions and are also affected by human industrial, agricultural, and civil activities. High loads of trace elements in water, river and lake sediment, soil, and airborne particles pose potential risks to human health and ecological systems. To control these environmental impacts, source apportionment is a meaningful but challenging task. Traditional source apportionment methods are usually based on geochemical techniques or univariate analysis. In recent years, multivariate analysis and related fields such as data mining, machine learning, and big data have developed rapidly, providing a new route that combines geochemical and data mining techniques. These methods have proved successful in addressing source apportionment. This chapter reviews the data mining methods applied to this topic and their implementations in recent years. The basic methods include principal component analysis, factor analysis, cluster analysis, positive matrix factorization, decision trees, Bayesian networks, and artificial neural networks, among others. Source apportionment of trace elements in surface water, groundwater, river and lake sediment, soil, airborne particles, and dust is discussed.
Part of the book: Trace Metals in the Environment
Water inrush is a major threat to working safety in the coal mines of the Northern China coal district. The inrush pattern, threat level, and geochemical characteristics vary with the water source. Correctly identifying the water source is therefore an important step in predicting and controlling water inrush accidents. This chapter reviews the algorithms and attempts to identify water inrush sources, especially in the Northern China coal mine district. Geochemical methods and machine learning algorithms are the two main approaches to identifying water inrush sources. Machine learning (ML) modelling involves four main steps: data processing, feature selection, model training, and evaluation. In a worked calculation example, most of the major ions and some trace elements, such as Ti, Sr, and Zn, were identified as important by both geochemical analysis and machine learning modelling. ML algorithms such as random forest (RF), support vector machine (SVM), and logistic regression (LR) perform well in identifying the source of coal mine water inrush.
Part of the book: Groundwater Management and Resources
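The four ML modelling steps named in the water inrush abstract above (data processing, feature selection, model training, evaluation) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the chapter's actual workflow: the data are synthetic, and the feature list, the three water-source classes, and the choice of k=5 selected features are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical feature names (major ions plus Ti, Sr, Zn); the values below
# are synthetic and are NOT measurements from the chapter.
FEATURES = ["Na", "Ca", "Mg", "Cl", "SO4", "HCO3", "Ti", "Sr", "Zn"]

rng = np.random.default_rng(0)
n_per_source = 60
# Three invented water-source classes, each with its own mean vector.
means = rng.uniform(1.0, 5.0, size=(3, len(FEATURES)))
X = np.vstack(
    [rng.normal(m, 0.5, size=(n_per_source, len(FEATURES))) for m in means]
)
y = np.repeat([0, 1, 2], n_per_source)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = make_pipeline(
    StandardScaler(),                                   # 1. data processing
    SelectKBest(f_classif, k=5),                        # 2. feature selection
    RandomForestClassifier(n_estimators=200, random_state=0),  # 3. model training
)
model.fit(X_train, y_train)

# 4. evaluation on the held-out samples
accuracy = model.score(X_test, y_test)

# Names of the features kept by the selection step.
mask = model.named_steps["selectkbest"].get_support()
selected = [name for name, keep in zip(FEATURES, mask) if keep]

print("held-out accuracy:", accuracy)
print("selected features:", selected)
```

Replacing `RandomForestClassifier` with `sklearn.svm.SVC` or `sklearn.linear_model.LogisticRegression` in the same pipeline would give the SVM and LR variants mentioned in the abstract.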
Coal and host rock, including gangue dumps, are important sources of toxic elements with a high potential to contaminate surface water and groundwater. Surface water in coal mining areas and groundwater in active or abandoned coal mines have been observed to be polluted by trace elements such as arsenic, mercury, lead, selenium, and cadmium. Understanding the leaching behavior and mechanism helps to control the pollution caused by these trace elements. The leaching and migration of trace elements are controlled mainly by two factors: the occurrence of the trace elements and the surrounding environment. The traditional way to investigate the occurrence and leaching mechanism of elements is based on geochemical methods. In this research, data mining was applied to find the relationships and patterns concealed in the data matrix. From a geochemical point of view, these patterns reflect the occurrence and leaching mechanism of trace elements from coal and host rock. Principal component analysis, an unsupervised machine learning method, was applied to reduce the dimensions of the data matrix of solid and liquid samples, and the transformed data were then clustered with a Gaussian mixture model to find their co-existing patterns.
Part of the book: Data Mining
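The two-stage approach described in the leaching abstract above (principal component analysis for dimension reduction, followed by Gaussian mixture model clustering) can be sketched roughly as follows. This is only an illustration using scikit-learn under stated assumptions: the data matrix is synthetic, and the number of principal components (2) and mixture components (3) are hypothetical choices, not values reported in the chapter.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in for a samples-by-elements concentration matrix:
# three invented groups of 40 samples, 8 hypothetical element columns.
centers = rng.uniform(0.0, 10.0, size=(3, 8))
data = np.vstack([rng.normal(c, 1.0, size=(40, 8)) for c in centers])

# Standardize so that no single element dominates the variance.
scaled = StandardScaler().fit_transform(data)

# Step 1: reduce the data matrix to a few principal components.
pca = PCA(n_components=2)
scores = pca.fit_transform(scaled)

# Step 2: cluster the PCA scores with a Gaussian mixture model.
gmm = GaussianMixture(n_components=3, random_state=0)
labels = gmm.fit_predict(scores)

print("explained variance ratio:", pca.explained_variance_ratio_)
print("cluster sizes:", np.bincount(labels, minlength=3))
```

In practice the number of mixture components is often chosen by comparing `gmm.bic(scores)` across candidate values rather than being fixed in advance.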