Description of daily average and weekly average input layers.
Reliable prediction of drainage flow rate and drainage chemistry is essential to the treatment of drainage from waste rock storages at mine sites. Traditional predictive models require simplifications and assumptions about geo-bio-chemical processes, followed by intensive characterization, and sometimes give poor prediction accuracy. In the big data era, various sensors are installed in the field to monitor mine sites continuously, which enables machine learning to use the resulting monitoring data and learn the underlying patterns in the data. This chapter describes an approach that uses an artificial neural network to predict drainage flow rate and drainage chemistry from weather monitoring data collected at mine sites. The advantage of this approach is that generally no additional characterization is required to make predictions, because the relevant geo-bio-chemical mechanisms are naturally embedded in the monitoring data and can be captured through the machine learning process.
- machine learning
- artificial neural network
- drainage flow rate
- drainage chemistry
- waste rock storages
- weather monitoring data
Hard rock mining generates a huge amount of mine wastes (mine tailings and waste rocks), which often contain metal sulfide minerals. Once these minerals are exposed to air and water during and after mining, their oxidation releases acid and heavy metals to the environment. The oxidation of metal sulfides can be accelerated in the presence of microorganisms. The drainage from mine wastes may have high levels of toxic elements and chemicals such as arsenic, selenium, lead, uranium and zinc. Over time, waste rocks are deposited in storages that can contain over one hundred million tons and cover a few hundred hectares. The drainage water from waste rock storages and its impact on the surrounding environment are becoming critical challenges to both mining companies and the public. The treatment of the drainage from waste rock storages may have to last decades, even centuries, and brings a significant cost to the mining sector [1, 2].
For day-to-day mine site management, flood control and contaminant remediation plans depend on the evaluation and prediction of drainage flow rates and drainage chemistries. Various methodologies have been developed over the past several decades to predict the drainage from waste rock storages. For predicting drainage flow rates, a numerical model to simulate groundwater flow through unsaturated beds or layers of earth is developed in . In reference , a water balance approach is proposed to calculate the conservation of total water flow through waste rock piles by dividing the whole hydrological process into independent components. In terms of predicting drainage chemistries, numerous numerical models that include mass transport effects have been developed to evaluate the geochemical reactions and transport inside waste rock piles [5, 6, 7]. Furthermore, the use of a dimensional equation to correlate drainage chemistries with seepage flow rates from waste rock piles is explored in . In reference , the effectiveness of a cover system is assessed and a multiphysics model is developed to predict the iron loading and lime consumption for a full-scale waste rock pile.
However, the above predictive models have several limitations. For example: (1) many predictive models are based on lab testing and then scaled up to predict the result in the field, but there is little comprehensive understanding of how to scale up; (2) the accuracy of the predictive models depends critically on the simplifications and assumptions made about the geo-bio-chemical processes governing geochemical reaction and leaching in waste rock storages; (3) lab or field characterization of the material and transport properties required by the predictive models is essential, but it is also very costly and time-consuming.
To understand and minimize the environmental impact of contaminated drainage, routine monitoring of waste rock storages is required by many governmental regulators. With the rapid development of computer and sensing technologies, continuous and comprehensive monitoring of waste rock storages is now possible at many mine sites. Daily or even hourly monitoring data have become available for many key parameters such as precipitation, temperature, wind, internal temperature, gas concentrations, air/water flow rates and drainage chemistries. These monitoring data accumulate into weekly, monthly and yearly datasets, and become so large and complex that traditional data analysis approaches are inadequate to handle and investigate them. As one of the best-known machine learning technologies, the artificial neural network not only offers high processing speed and high computational accuracy, but also enables a machine to mimic human learning behavior and problem-solving functions. Using a neural network to investigate the large monitoring datasets and then predict drainage flow rates and drainage chemistries from waste rock storages therefore shows promising potential. For example, the concentrations of sulphate, chlorine, total dissolved solids and total suspended solids in mine water are predicted by an artificial neural network based on the input of pH, temperature and hardness in . Heavy metals in acid rock drainage are investigated with a support vector machine and a neural network for a copper mine in Iran . Five machine learning approaches to predict copper concentration are compared in . A feedforward neural network with weather input is proposed to predict drainage flow rates for a full-scale waste rock pile .
In this book chapter, a refined feedforward neural network based on  is introduced to learn from historical monitoring data and then predict the drainage flow rate. In addition, the refined neural network is extended to predict drainage chemistries in the field. Compared with the traditional predictive models above, the proposed neural network approach requires far fewer simplifications and assumptions about the geo-bio-chemical processes involved, and it can significantly reduce characterization costs for mining companies, because the monitoring data inherently contain information on all the underlying physical mechanisms within real waste rock storages. However, the prediction accuracy is highly dependent on the quality of the monitoring data, as the proposed neural network is essentially a mathematical regression process.
The proposed feedforward neural network takes the weather monitoring data from mine sites as the input and the drainage as the output. The underlying logic for this approach is that the water passing through the waste rock storage comes mainly from two sources: (1) precipitation that falls directly onto the storage and infiltrates into it; (2) groundwater that originates from uphill precipitation and flows into the storage from higher elevations. Both sources are highly dependent on rain, snow, temperature, hydrologic properties and geo-bio-chemical conditions in the field. As the hydrologic properties and geo-bio-chemical conditions are relatively stable compared with the weather-related factors, the evolution of total precipitation and mean temperature of the ambient environment at the mine site is adopted to correlate with the drainage flow rates and drainage chemistries. The correlation can be gradually captured by machine learning through studying historical monitoring data from a specific waste rock storage. In addition, the reference  proposed using the year and month numbers as additional inputs to capture the long-term fluctuation of drainage. Because the month number resets from 12 to 1 every year, its numerical value carries no meaning for machine learning and only introduces a learning issue at the transition from December to January. This chapter proposes using the concept of accumulated days to capture the long-term fluctuation instead. With all input data further normalized, the refined feedforward neural network can better predict the drainage flow rates and, in addition, the drainage chemistries. A case study on a full-scale waste rock storage is provided in this chapter to validate the proposed approach.
2.1 Feedforward neural network
A feedforward neural network is an artificial neural network in which the connections between the artificial neurons do not form a cycle, which distinguishes it from recurrent neural networks. The artificial neurons are capable of simulating basic learning behaviors by receiving inputs, calculating a weighted sum and then passing the sum through a transformation known as an activation function to produce outputs. The mathematical calculation for an artificial neuron in a feedforward neural network is generally illustrated as follows:
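For reference, a generic form of the neuron calculation described above (a weighted sum passed through an activation function) is sketched below. The symbols are assumptions chosen for illustration and may differ from the chapter's own notation: $x_i$ are the $n$ inputs, $w_i$ the corresponding weights, $b$ a bias term that is commonly included, $f$ the activation function and $y$ the neuron output.

```latex
y = f\left(\sum_{i=1}^{n} w_i x_i + b\right)
```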
2.2 Application to predict the drainage from waste rock storage
The schematic of the proposed feedforward neural network structure is illustrated in Figure 1. As mentioned above, hydrologic properties and geo-bio-chemical conditions in waste rock storages generally evolve very slowly, which means they are relatively stable compared with weather conditions such as rain, snow and temperature in the field. The dynamics of the weather conditions can therefore act as the driving inputs for the training process, leaving hydrologic properties and geo-bio-chemical conditions as coefficients within the neural network to be determined during the learning process. As temperature controls not only the formation of rain or snow but also the evaporation rate at the surface, total precipitation and mean temperature are selected as two groups of neurons in the input layer. Current and preceding values of total precipitation and mean temperature are extracted from the weather monitoring database and formatted into a time series in the input layer before entering the hidden layer. The length of the time series determines the number of neurons in each group of the input layer. For example, an input covering 10 days of daily weather monitoring data corresponds to 10 neurons for daily total precipitation and 10 neurons for daily mean temperature in the input layer.
An additional neuron in the input layer holds a time tag that represents the drainage measurement day. The day of the first weather monitoring record is considered the first day, and its time tag is set to 1. The time tag of any later drainage measurement is the accumulated number of days counted from the first day to the measurement day. By introducing the accumulated day number as the time tag, the geo-bio-chemical conditions inside the waste rock storage no longer have to remain constant over time and can become time dependent. This hybrid input structure thus enables the proposed neural network to capture the long-term trend of the drainage flow rates and drainage chemistries.
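The input structure described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary-based weather storage (`temps`, `precips`) and the ordering of the neurons within the vector are hypothetical and are not taken from the original implementation.

```python
from datetime import date, timedelta


def time_tag(first_day: date, measurement_day: date) -> int:
    """Accumulated day number; the first weather-monitoring day has tag 1."""
    return (measurement_day - first_day).days + 1


def daily_input(temps, precips, first_day, measurement_day, window=10):
    """Build one input vector: `window` days of weather plus the time tag.

    temps / precips: dicts mapping a date to the daily mean temperature
    and daily total precipitation (hypothetical storage format).
    """
    vec = []
    # Oldest day first, ending with the measurement day itself (lag 0).
    for lag in range(window - 1, -1, -1):
        day = measurement_day - timedelta(days=lag)
        vec.append(temps[day])
        vec.append(precips[day])
    vec.append(time_tag(first_day, measurement_day))
    return vec
```

With a window of 10 days the vector has 2 × 10 + 1 = 21 entries: 10 temperature values, 10 precipitation values and one time tag.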
The output can be calculated by moving forward through the neural network based on Eq. (1). After the output is obtained, it is compared with the drainage monitoring data (target), including flow rate and chemistry concentration. A cost function is then adopted to evaluate the difference between output and target. In this study, the mean squared error (MSE) is adopted as the cost function:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - t_i\right)^2$$

where $y_i$ is the $i$th calculated output, $t_i$ is the $i$th target, and $N$ is the number of targets for machine learning.
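As a small illustration of the mean squared error used as the cost function, the sketch below computes it for a list of calculated outputs and a list of targets. The function name and the list-based interface are assumptions made for this example.

```python
def mean_squared_error(outputs, targets):
    """Average of the squared differences between outputs and targets."""
    if not targets or len(outputs) != len(targets):
        raise ValueError("outputs and targets must be non-empty and equal in length")
    return sum((y - t) ** 2 for y, t in zip(outputs, targets)) / len(targets)
```

For instance, outputs `[1.0, 2.0, 3.0]` against targets `[1.0, 2.0, 5.0]` differ only in the last entry, by 2, so the error is 4/3.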
As the input and target data have different scales, they generally need to be normalized before the training starts. Normalization can accelerate the training process by allowing all undetermined coefficients in the neural network to be updated on the same scale. For this study, each data set (total precipitation, mean temperature, flow rate, chemistry concentration) is normalized so that its mean is 0 and its standard deviation is 1. The normalization is obtained by the following calculations:

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \hat{x} = (x - \mu) \oslash \sigma$$

where $\mu$ and $\sigma$ are the mean and standard deviation of the data set $x$, $n$ is the total number of data in the set, $\oslash$ denotes Hadamard division, and $\hat{x}$ is the normalization of $x$.
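The normalization step can be sketched as below. The function names are hypothetical, and the use of the population standard deviation (dividing by the number of data rather than by one less) is an assumption of this example; the chapter does not state which convention is used.

```python
import math


def standardize(values):
    """Return (normalized values, mean, std) with zero mean and unit std."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        raise ValueError("cannot normalize a constant data set")
    return [(v - mean) / std for v in values], mean, std


def destandardize(normalized, mean, std):
    """Map normalized values back to the original scale."""
    return [z * std + mean for z in normalized]
```

When the network is later used for prediction, it seems reasonable to reuse the mean and standard deviation obtained from the training data for the new inputs, and to map the predicted output back with `destandardize`.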
At the beginning, all coefficients within the neural network shown in Eq. (1) are randomly initialized. During the training process, they are automatically updated through an iterative technique called the backpropagation algorithm, which calculates the gradient of the cost function based on the comparison of target and output. The proposed feedforward neural network should be trained with a sufficient number of observation samples from the historical monitoring database so that it can capture the correlation between input data and target data. Here an observation sample is defined as a combination of input and target from the historical monitoring database. The training needs to be assessed with a validation method to prevent both underfitting and overfitting. The hold-out approach is adopted in this study. Among all observation samples, the training observation samples are used for the actual training. The validation observation samples are used to evaluate the generalization of the neural network, and the training process continues until the generalization stops improving. The remaining samples are called testing observation samples; they do not affect the training process but give an independent assessment of the training performance.
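The hold-out approach and the stopping rule described above can be sketched as follows. The function names, the random shuffling, the default fractions and the `patience` value are all hypothetical choices for illustration, not settings taken from this study.

```python
import random


def holdout_split(samples, train_frac=0.8, val_frac=0.2, seed=0):
    """Shuffle the samples and split them into training, validation, testing."""
    if train_frac < 0 or val_frac < 0 or train_frac + val_frac > 1.0 + 1e-9:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    n_val = min(round(val_frac * len(shuffled)), len(shuffled) - n_train)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test


def stopped_improving(val_errors, patience=6):
    """True when the validation error has not improved for `patience` epochs."""
    if len(val_errors) <= patience:
        return False
    best_before = min(val_errors[:-patience])
    return all(e >= best_before for e in val_errors[-patience:])
```

In a training loop, the validation error would be appended to `val_errors` after each epoch, and training would end once `stopped_improving` returns `True`.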
After the training is completed, both of
Theoretically, the lower value for
In theory, a well-trained neural network of the type proposed in this study is able to reasonably predict future drainage flow rate and drainage chemistry concentration for full-scale waste rock storages, as long as the historical weather monitoring database, the historical drainage monitoring database and the on-site weather forecast are available. Figure 2 shows the schematic diagram of the general functions of the proposed feedforward neural network approach. Two processes are involved in the implementation of the approach: training and prediction. After the training process is completed, the correlation between weather and drainage for the waste rock storage is considered to be captured by the proposed neural network, and the prediction process then uses the weather forecast to predict future drainage on site.
3. Validation-a case study
3.1 Input, target and neural network parameters
To validate the proposed neural network approach, a full-scale case study is performed to predict real drainage flow rate and drainage chemistry under field conditions. A real waste rock pile from an anonymous mine site in western Canada is adopted in this study. The proposed neural network is trained with 16 years of historical monitoring data (Year 1 to Year 16). After the training is completed, it is used to predict the drainage in the following 2 years (Year 17 and Year 18). A comparison between the real measurements and the predicted values is provided.
At this mine site, the drainage flow rates are not directly measured but are calculated from readings of the water level in v-notch weirs installed at the end of the drainage collecting ditch. The v-notch reading is therefore used as one actual target of the neural network training. In the following discussion, the drainage flow rate refers to the original v-notch measurement from the weirs. Among all drainage chemistry data from this waste rock pile, acidity is selected as the other target for this case study, because it is directly linked to the lime consumption for contaminant treatment. These drainage measurements are not taken at fixed intervals at this mine site. During spring freshet and periods of heavy precipitation, when increased drainage flow rates are observed, the measurements are usually more frequent than during the rest of the year.
The weather monitoring data for this site are extracted from the website of Environment Canada (weather.gc.ca) on a daily basis, and include minimum temperature, maximum temperature, mean temperature, total rain, total snow, total precipitation and snow thickness on the ground. For this case study, a total of 16 years of weather monitoring data have been obtained for training purposes. As mentioned in the previous section, two independent weather parameters measured on a daily basis, the total precipitation and the mean temperature, are selected as the inputs for the proposed neural network.
To evaluate how long the weather can affect the drainage from this waste rock pile, two types of input layers are proposed for comparison in the study. When a target (flow rate or acidity) is selected to train the neural network, its measurement date is extracted. The daily average input layer consists of 21 neurons: the time tag of the measurement day, plus the daily total precipitation and daily mean temperature for the previous 9 days and the measurement day. It mainly investigates the short-term weather impact on the drainage. The weekly average input layer also has 21 neurons: the time tag of the measurement day, plus the weekly averaged total precipitation and weekly averaged mean temperature for the previous 9 weeks and the current week. It investigates the long-term weather impact. The weekly average input layer covers a longer period of weather monitoring data than the daily average input layer does; however, high-frequency information is filtered out in the weekly average input layer. A summary of the daily average and weekly average input layers can be found in Table 1. Here 0 day means the measurement day, and 0 week means the measurement day and the previous 6 days. Finally, both types of input layers are adopted to train the neural network to determine which one is better able to capture the underlying pattern and make a better prediction.
|Type of input layer||Daily Average||Weekly Average|
|Description||Mean temperature of −9 day||Mean temperature of −9 week|
| ||Total precipitation of −9 day||Total precipitation of −9 week|
| ||Mean temperature of −8 day||Mean temperature of −8 week|
| ||Total precipitation of −8 day||Total precipitation of −8 week|
| ||…||…|
| ||Mean temperature of −1 day||Mean temperature of −1 week|
| ||Total precipitation of −1 day||Total precipitation of −1 week|
| ||Mean temperature of 0 day||Mean temperature of 0 week|
| ||Total precipitation of 0 day||Total precipitation of 0 week|
| ||Time tag of measurement day||Time tag of measurement day|
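The weekly average input layer can be sketched as below, following the definition that 0 week covers the measurement day and the previous 6 days. The function name, the dictionary-based weather storage and the choice to return `None` when a daily record is missing are assumptions of this example rather than details of the original spreadsheet macro.

```python
from datetime import date, timedelta


def weekly_input(temps, precips, first_day, measurement_day, n_weeks=10):
    """Build the weekly average input vector, or None if any day is missing.

    temps / precips: dicts mapping a date to the daily mean temperature
    and daily total precipitation (hypothetical storage format).
    """
    vec = []
    # Oldest week first (-9 week), ending with the current week (0 week).
    for week in range(n_weeks - 1, -1, -1):
        last_day = measurement_day - timedelta(days=7 * week)
        days = [last_day - timedelta(days=k) for k in range(7)]
        if any(d not in temps or d not in precips for d in days):
            return None  # incomplete weather record for this sample
        vec.append(sum(temps[d] for d in days) / 7)
        vec.append(sum(precips[d] for d in days) / 7)
    vec.append((measurement_day - first_day).days + 1)  # time tag
    return vec
```

A measurement therefore needs 70 consecutive days of complete weather records to yield a weekly average input, compared with 10 days for the daily average input.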
As the mine site is anonymous, the original monitoring data are confidential and not publicly accessible. To protect the site information, only normalized historical monitoring data, including total precipitation, mean temperature, flow rate and acidity from the 16 years, are provided in Figure 3. The total numbers of flow rate measurements and acidity measurements during the 16 years are 1741 and 320, respectively. It should be noted that the weather data are extracted on a daily basis and any missing data are represented by a gap. The flow rate and acidity are not measured at fixed intervals, so each measurement is represented by a solid dot in the figure. As some weather data are missing, not all drainage measurements are used for the training: drainage measurements that do not have a complete daily average or weekly average input are excluded. For the hold-out approach to avoid overfitting, 80% of the total observation samples are used for training and 20% for validation. No testing data are allocated in the training process, as a prediction of the drainage in the following 2 years is performed after the training is completed. As the monitoring data for weather and drainage during those 2 years are also available, the real weather data are used as the input of the neural network. In a real situation, weather forecast data would be adopted for prediction purposes. The predicted drainage flow rates and acidities are compared with the real monitoring values.
As shown in Figure 1, the feedforward neural network adopted for this case study has only a single hidden layer, so the number of neurons in the hidden layer is treated as a hyperparameter for the training process. A grid search is performed over 5, 10 and 20 neurons in the hidden layer to find the optimal size. The proposed neural network is trained with the Levenberg–Marquardt backpropagation algorithm. The adaptive value (damping factor) is initially set to 0.001 and is multiplied by 10 until a change results in a reduced performance value. The change is then made to the network and the adaptive value is multiplied by 0.1. The maximum adaptive value is set to 1e10. The maximum number of epochs before training stops is set to 1000. However, the training may be stopped early if the
The proposed neural network is implemented with the machine learning toolbox built into the commercial software MATLAB. The weather monitoring data are pre-constructed for both types of input layer with a Microsoft Excel macro before being imported into MATLAB.
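To illustrate the damping-factor schedule used in Levenberg–Marquardt training, the sketch below applies the same rule (initial value 0.001, multiply by 10 on a failed step, multiply by 0.1 on a successful step, stop above 1e10 or after 1000 epochs) to a toy problem. The one-parameter exponential model and all names are hypothetical; this is not the MATLAB implementation and not the neural network of this study.

```python
import math


def levenberg_marquardt_1d(xs, ys, a0=0.0, mu=1e-3, mu_inc=10.0,
                           mu_dec=0.1, mu_max=1e10, max_epochs=1000):
    """Fit y = exp(a * x) for a single parameter a by damped least squares."""

    def sse(a):
        return sum((y - math.exp(a * x)) ** 2 for x, y in zip(xs, ys))

    a, err = a0, sse(a0)
    for _ in range(max_epochs):
        # Derivative of the model with respect to a, and the residuals.
        grads = [x * math.exp(a * x) for x in xs]
        resid = [y - math.exp(a * x) for x, y in zip(xs, ys)]
        jte = sum(g * r for g, r in zip(grads, resid))
        jtj = sum(g * g for g in grads)
        while mu <= mu_max:
            candidate = a + jte / (jtj + mu)  # damped Gauss-Newton step
            cand_err = sse(candidate)
            if cand_err < err:
                # Step reduces the error: accept it and relax the damping.
                a, err = candidate, cand_err
                mu *= mu_dec
                break
            mu *= mu_inc  # step failed: increase the damping and retry
        else:
            break  # damping factor exceeded mu_max, stop training
    return a, err
```

A large damping factor shortens the step and makes it behave like gradient descent, while a small one approaches the Gauss-Newton step, which is why the factor is reduced after each successful update.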
3.2 Regression and prediction results
After the training is completed, all training and validation observation samples go through Eq. (1) again and then the calculated output is called drainage regression. The
|Number of Neurons in the Hidden layer||Flow Rate (||Acidity (|
|Daily Average||Weekly Average||Daily Average||Weekly Average|
|5||1.01, 0.62||0.25, 0.93||0.53, 0.65||0.35, 0.78|
|10||0.99, 0.63||0.18, 0.95||0.55, 0.65||0.41, 0.79|
|20||0.92, 0.67||0.18, 0.95||0.58, 0.67||0.28, 0.84|
The comparison between the regressions of flow rate and acidity and the real measurements (targets) over time is provided in Figure 4. The proposed neural network is found to be capable of capturing the underlying correlation between the drainage and the weather, as both the seasonal fluctuations within a year and the long-term evolution across years are well reflected in the drainage regression.
In addition, the best-performing candidate neural networks (weekly average input layer with 10 neurons in the hidden layer for flow rate, and weekly average input layer with 20 neurons in the hidden layer for acidity) are adopted to predict the future flow rate and acidity in Year 17 and Year 18, based on the input extracted from the real weather monitoring data. The predictions and real measurements over time are compared in Figure 5.
The general trends of both flow rate and acidity are well predicted by the neural network. The proposed neural network is capable of predicting both the timing and the magnitude of the peak flow rates in the spring freshet of both years, which is important for site water management. In terms of acidity, the long-term downward trend shown in Year 11 to Year 16 is reflected in the prediction, which matches the trend of the real measurements in both years. However, the seasonal fluctuation shows some mismatch. The reason is that far fewer observation samples are available for acidity than for flow rate, so the acidity prediction is not as good as the flow rate prediction in this case study. The accuracy can be improved as more monitoring data are accumulated for training in the future.
A machine learning algorithm based on a feedforward neural network is introduced in this chapter to correlate the drainage flow rate and drainage chemistry with the field precipitation and temperature for waste rock storages. Compared with traditional predictive models, the neural network approach requires few simplifications and assumptions about the geo-bio-chemical processes involved. In addition, the cost and time for characterization can be significantly reduced. The advantage of the neural network is that all underlying mechanisms are naturally reflected in the monitoring data and can be gradually captured during the machine learning process.
A case study on a full-scale waste rock storage is performed. The results show that the flow rate and acidity of the drainage discharged in the field correlate strongly with the weekly averaged weather data of the 10 most recent weeks at this site. The capability of predicting future drainage in the field is also validated. However, the structure of the input layer, the number of hidden layers and the number of neurons in the hidden layer are all site specific and may need to be adjusted for applications to other waste rock storages.
It should also be noted that the measurements of drainage flow rate and drainage chemistry may not always be accurate in the field. Furthermore, they can fluctuate within a single day depending on the hydrogeological conditions. The monitoring data may therefore not represent the daily average in some cases, which means that some mismatch between prediction and measurement does not necessarily indicate that the prediction is wrong. High-frequency measurement (hourly or multiple times in a single day) is strongly recommended, so that good-quality monitoring data will be available to predict drainage from waste rock storages in the future.
The work is financially supported by the Environmental Advances in Mining Program and New Beginnings Initiative in the National Research Council Canada.