Radon as an Earthquake Precursor – Methods for Detecting Anomalies

Radon is one of many geophysical and geochemical phenomena that can be considered to be an earthquake precursor. Due to the non-linear dependence of earthquakes’ initial conditions, the question about the predictability of earthquakes often arises (Geller, 1997). The successful prediction of earthquakes is yet to be accomplished, in terms of their magnitude, location and time, and much effort has been spent on this goal. The term “earthquake precursor” is used to describe a wide variety of geophysical and geochemical phenomena that reportedly precede at least some earthquakes (Cicerone et al., 2009). The observation of these types of phenomena is one recent research activity which has aimed at reducing the effects of natural hazards. Among the different precursors, geochemistry has provided some high-quality signals, since fluid flows in the Earth’s crust have a widely recognised role in faulting processes (Hickman et al., 1995). The potential of gas geochemistry in seismo-tectonics has been widely discussed by Toutain and Baubron (1999). In the late 1960s and early 1970s, reports from seismically active countries such as the former USSR, China, Japan and the USA (Ulomov & Mavashev, 1967; Wakita et al., 1980) indicated that concentrations of radon gas in the earth apparently changed prior to the occurrence of nearby earthquakes (Lomnitz, 1994). The noble gas radon (222Rn) originates from the radioactive transformation of 226Ra in the 238U decay chain in the Earth’s crust. Since radon is a radioactive gas, it is easy and relatively inexpensive to monitor instrumentally, and its short half-life (3.82 days) means that short-term changes in radon concentration in the earth can be monitored with a very good time resolution. Radon emanation from grains depends mainly on their 226Ra content and their mineral grain size, its transport in the earth being governed by geophysical and geochemical parameters (Etiope & Martinelli, 2002), while exhalation is controlled by hydrometeorological conditions. The stress-strain developed within the Earth’s crust before an earthquake leads to changes in gas transport and a rise of volatiles from the deep earth up to the surface (Ghosh et al., 2009; Thomas, 1988), resulting in anomalous changes in radon concentration. The mechanism of observed radon anomalies is still poorly understood, although several theories have been proposed (Atkinson, 1980; King, 1978; Lay et al., 1998; Martinelli, 1991). Over the past three decades, the occurrence of anomalous temporal


Radon migration in the Earth's crust
Only a fraction of the radon atoms created by radioactive transformation from radium are able to emanate from mineral grains and enter into the void space, filled either by gas or water. Radon ascends towards the surface mainly through cracks or faults, on a short scale by diffusion and, for longer distances, by advection -dissolved either in water or in carrier gases. Gas movement should be ascribed to the combination of both processes. Diffusive movement is driven by a concentration gradient and is described by Fick's law. Considering gas diffusion in porous media, it is necessary to take into account that the volume through which gas diffuses is reduced and the average path length between two points is increased, thus altering the diffusion coefficient (Etiope & Martinelli, 2002). Nevertheless, the velocity of radon transport in the earth is quite low (≤ 10 -3 cm/s) and the concentration of radon is reduced by radioactive decay to the background level before even 10 m are traversed (Etiope & Martinelli, 2002;Fleischer, 1981). Diffusion is only important in capillaries and small-pored rocks. On the other hand, the velocity and space scales of advective movements are much higher than those of diffusive ones. Advective transport is driven by pressure gradients, following Darcy's law. The amount of radon itself is, however, too small to form a macroscopic quantity of gas which can react to pressure gradients. Therefore, it must be carried by a macroscopic flow of carrier gases (Kristiansson & Malmqvist, 1982). A gas mixture formed by carrier gases (e.g., CO 2 , CH 4 , and N 2 ) and rare gases (e.g., He, Rn) can be referred to as "geogas" (Etiope & Martinelli, 2002;Kristiansson & Malmqvist, 1982). In dry, porous or fractured media, gas flows through an interstitial or fissure space (gas-phase advection) whereas in saturated, porous media gas can dissolve and then be transported in three ways: by groundwater (waterphase advection), by displacing water (gas-phase advection) or by forming a bubble flow by means of buoyancy in aquifers and water-filled fractures. The bubble movement has been theoretically and experimentally recognised as a fast gas migration mechanism governing the distribution of carrier and trace gases over wide areas on the Earth's surface (Vàrhegyi et al., 1992).

External effects on radon in soil gas and water
Radon concentration in soil, gas or water is not only controlled by geophysical parameters, but it also changes due to other external effects. Meteorological effects -such as soil humidity, rainfall, temperature, barometric pressure and wind -control radon concentrations in soil gas. These parameters change the physical characteristics of soil and rock, thus influencing the rate of radon transport and, consequently, perturbing eventual radon variations caused by geophysical processes originating in the deeper parts of the Earth's crust. Shallow soil levels are more affected by changing meteorological conditions than deeper ones. Radon concentrations with no larger variations present are usually observed at depths of 0.8 m or deeper. Besides the effects of meteorological parameters on radon in soil gas, considerable variations in the gas composition of thermal springs have been shown to be the result of fluctuations of local hydrologic regimes (Klusman & Webster, 1981). The significant influence of barometric pressure has been discussed by several authors, who clearly pointed out an inverse relationship between barometric pressure and radon concentration in soil gas (Chen et al., 1995;Clements & Wilkening, 1974;Klusman & Webster, 1981). A decrease in barometric pressure, with the values of other environmental parameters remaining constant, generally causes an increase in radon exhalation from the ground, whereas during periods of rising pressure, air with low radon concentration is forced into the ground, thus diluting radon. Temperature-related fluctuations of soil gas radon concentration have also been proven to be very important. Klusman & Jaacks (1987) found an inverse relationship between soil temperature and radon concentration. They suggested that lower air temperatures as compared with soil temperatures during winter months promoted an upward movement of radon by convection, whereas during the summer, lower soil temperatures as compared with air temperatures and an inversion layer below the level of sampling reduces the upward flux and observed concentration. In general, the behaviour of soil gas migration in different types of soil is seasonally dependent (King & Minissale, 1994;Washington & Rose, 1990). In systems where gas movement is driven by diffusion or slow advection processes, radon activity in soil might be controlled by soil moisture and rainfall through the opening of cracks in the surface (Pinault & Baubron, 1996;Toutain & Baubron, 1999). On the other hand, barometric pressure has the major influence on radon concentrations in soils in advective systems, which display generally higher gas flows. However, micro-scale soil heterogeneities in permeability, porosity and lithology can cause significant heterogeneities in the response of radon concentration to changes of atmospheric parameters (King & Minissale, 1994;Neznal et al., 2004). Numerous and often divergent results in studies related to the effect of external factors on soil gas radon concentration suggest that no general predictive model for excluding meteorological effects can be proposed, and studies of radon in soil gas need a simultaneous record of meteorological parameters.
of the bedrock and soil in areas of crustal discontinuities, such as fractures and fault zones, promotes intense degassing fluxes, which causes higher soil gas radon concentrations on the ground surface above active fault zones. Although several measurements, experiments and models have been performed, the understanding of the mechanism of radon anomalies and their connection to earthquakes is still inadequate (Chyi et al., 2010;King, 1978;Ramola et al., 1990). Before the earthquake stress in the Earth's crust builds up causing a change in the strain field; the formation of new cracks and pathways under the tectonic stress leads to changes in gas transport and a rise in volatiles from the deep layers to the surface. In fact, fluids play a widely recognised role in controlling the strength of crustal fault zones (Hickman et al., 1995). Anomalous changes of radon concentration are closely linked to changes in fluid flow and, therefore, also to highly permeable areas along fault zones. Large faults are not discrete surfaces but rather a braided array of slip surfaces encased in a highly fractured and often hydrothermally altered transition -or "damage" -zone. Episodic fracturing and brecciation are followed by cementation and crack healing, leading to cycles of permeability enhancement and reduction along faults (Hickman et al., 1995). Several mechanisms have been proposed, which could explain the relationship between radon anomalies and earthquake. Two models of earthquake precursors are discussed by Mjachkin et al. (1975), with a common principle: at a certain preparation stage, a region of many cracks is formed. According to the dilatancy-diffusion model (Martinelli, 1991;Mjachkin et al., 1975), the increase in tectonic stress causes the extension and opening of favourably-oriented cracks in a porous, cracked, saturated rock. Water flows into the opened cracks, drying the rock near each pore and finally resulting in a decrease of pore pressure in the total earthquake preparation zone. Water from the surrounding medium diffuses into the zone. The increased water-rock surface area, due to cracking, leads to an increase in radon transfer from the rock matrix to the water. At the end of the diffusion period, the appearance of pore pressure and the increased number of cracks leads to the main rupture. According to the crack-avalanche model (Mjachkin et al., 1975), the increasing tectonic stress leads to the formation of a cracked focal rock zone, with slowly altering volume and shape. At a certain stage -when the whole focal zone becomes unstable -the cracks quickly concentrate near the fault surface, triggering the main rupture. An alternate mechanism for earthquake precursory study, based on stress-corrosion theory, has been proposed by Anderson and Grew (1977). According to them, the observed radon anomalies are due to slow crack growth controlled by stress corrosion in a rock matrix saturated by ground waters. King (1978) has proposed a compression mechanism for radon release, whereby anomalous high radon release may be due to an increase of crustal compression before an impending earthquake, that squeezes out soil gas into the atmosphere at an increasing rate. Toutain and Baubron (1999) observed that gas transfer within the upper crust is affected by strains less than 10 -7 , much smaller than those causing earthquakes. According to Dobrovolsky (1979), the radius of the effective precursory manifestation zone depends on the earthquake magnitude and can be calculated using the empirical equation: Where R D is the strain radius in km and M L is the magnitude of the earthquake. Considering the Earth's crust to be an anisotropic medium, this law can be modified according to the effective sensitivity to the impending earthquake. The ideal circle with its theoretical radius can be transformed into an ellipse or characterised by shadow areas where no precursory phenomena are observable due to crustal anisotropy, discontinuities or loose contacts along some faults, which prevent further stress transfer (İnan & Seyis, 2010;Martinelli, 1991). Although radon anomalies can be studied in soil gas and thermal waters, thermal waters could be much more representative of the geologic environment and could be more reactive to stress/strain changes acting at depth than soil gases. The disadvantage of soil gases lie in weak gas concentrations, generally due to the thickness of the sedimentary cover and the high level of atmospheric perturbations (Toutain & Baubron, 1999).

Methods for detecting anomalies in radon time series
An anomaly in radon concentration is defined as a significant deviation from the mean value. Due to the high background noise of radon time series, it is often impossible to distinguish an anomaly caused solely by a seismic event from one resulting from meteorological or hydrological parameters. For this reason, the implementation of more advanced statistical methods in data evaluation is important (Belyaev, 2001;Cuomo et al., 2000;Negarestani et al., 2003;Sikder & Munakata, 2009;Steinitz et al., 2003). In our research, radon has been monitored in several thermal springs (Gregorič et al., 2008;Zmazek et al., 2002a;Zmazek et al., 2006) and in soil gas (Zmazek et al., 2002b) and different approaches to distinguishing radon anomalies were applied.

Standard deviation
A very common practice in determining radon anomalies is the use of standard deviation. The average radon concentration is calculated for different periods with regard to the nature of yearly cycles of radon concentration. In the case of radon in soil gas, the mean value of radon concentration is calculated separately for four seasons (spring, summer, autumn and winter) based on the air and soil temperature. In contrast to soil gas , rad on in ground or spring water is greatly influenced by the hydrologic cycle, which has to be considered during the data analysis. To define the mean and standard deviation, anomalously high and low values -which may cause unnecessary high deviation and perturb the real anomalies -have to be neglected. The periods when radon concentration deviates by more than ±2σ from the related seasonal value are considered as radon anomalies that are possibly caused by earthquake events and not by meteorological parameters (Ghosh et al., 2007;Gregorič et al., 2008;Virk et al., 2002;Zmazek et al., 2002b). Fig. 1 shows an anomalous radon concentration, exceeding 2σ above the average value, which appeared approximately 10 days before the occurrence of three earthquakes with magnitudes from 1.8 to 3.2.

Relationship between radon exhalation and barometric pressure
An inverse relationship exists between the time derivative of radon concentration in soil gas and the time derivative of barometric pressure (as was discussed previously in section 2.1). A decrease in barometric pressure causes an increase in radon exhalation from the ground, whereas during periods of rising pressure, air with low radon concentration is forced into Fig. 2. The time gradient of radon concentration in soil gas and the time gradient of barometric pressure during two periods at the Krško basin: a) the period without local seismic activity, b) the seismically active period, whereby the radon anomaly 14 days before the earthquake is marked by the green rectangle. The earthquake is expressed in terms of local magnitude (M L ) and the distance between the measuring location and the earthquake epicentre (D).
the ground, thus diluting the radon concentration. Therefore, deviations from this rule during these periods -when the time gradient of barometric pressure, ΔP/Δt, and the time gradient of radon concentration, ΔC Rn /Δt, in soil gas have the same sign -can be considered to be radon anomalies (Zmazek et al., 2002b). A clear negative correlation between the time gradient of radon concentration and the time gradient of barometric pressure can be seen in Fig. 2a, when no seismic activity is present. The radon anomaly, characterised by positive correlation of time gradients, is marked in Fig. 2b. Anomalous behaviour in radon concentration started 14 days before the earthquake with a local magnitude of 2.6, and ended a few days after the earthquake.

Machine learning methods
Machine learning methods have been successfully applied to many problems in the environmental sciences (Džeroski, 2002). In the case of radon as an earthquake precursor, it must be considered -as discussed in section 2.1 -that the variation in radon concentration is controlled not only by geophysical phenomena in the Earth's crust, but also by the environmental parameters associated with the radon monitoring sites. With machine learning methods, a model for the prediction of radon concentration can be built, taking into account various environmental parameters (e.g., barometric pressure, rainfall, and air and soil temperature). The aim is to identify radon anomalies which might be caused by seismic events. The application of artificial neural networks (Negarestani et al., 2002(Negarestani et al., , 2003Torkar et al., 2010), regression and model trees Sikder & Munakata, 2009;Zmazek et al., 2003;Zmazek et al., 2006) and some other methods (Sikder & Munakata, 2009;Steinitz et al., 2003) have proven to be useful means of extracting radon anomalies caused by seismic events.

Artificial neural networks
An artificial neural network (ANN) is a well-known computational structure inspired by the operation of the biological neural system (Jain et al., 1996) and it is a well-established tool, being used widely in signal processing, pattern recognition and other applications. An ANN consists of a set of units (neurons, nodes), and a set of weighted interconnections among them (links). The organisation of neurons and their interconnections defines the net topology. The inputs are grouped in an input layer, the outputs in an output layer and all the other units in so-called hidden layers. The algorithm repeatedly adjusts the weights to minimise the mean square error between the actual output vector and the desired network output vector. The universal approximator functional form of ANNs is well-suited for the requirements of modelling the non-linear dependency of radon concentrations on multiple variables. Among a number of various topologies, training algorithms and architectures of ANNs, the traditional multilayer perceptron (MLP) with a conjugate gradient learning algorithm was chosen in the case of analysing the soil gas radon concentration time series at the Krško basin (Torkar et al., 2010). The series was first split into seismically non-active periods (NSA) and seismically active periods (SA), adjusting the duration of the seismic window from 0 to 10 days before and after the earthquake and with the purpose of investigating the influence of a complete earthquake event on radon concentration (the preparation phase, the earthquake itself and aftershocks). The ANN of the MLP type was trained with each of the NSA datasets, which were divided into three sets: the training set (60%), the cross-validation set (15%) and the test set (25%). The ANN was trained with the training and cross-validation set, while the test set was used to verify its performance. The topology of the ANN generated for each NSA dataset is shown in Fig. 3.   Fig. 3. The ANN topology for learning radon concentration dependency on environmental parameters.
In the testing phase, the correlation between the measured (m-C Rn ) and predicted (p-C Rn ) radon concentration in NSA periods was compared to the correlation between the measured and predicted radon concentration in the entire dataset (NSA and SA). The difference between the correlation coefficients might indicate a period of seismically induced radon anomaly. The ratio between the measured and predicted values (m-C Rn /p-C Rn )1 represents the discrepancy between both values (Fig. 4). Fig. 4. The ratio between the measured and predicted radon concentration (m-C Rn /p-C Rn )1 using an ANN in the case of soil gas radon in the Krško basin for a seismic window of ±7 days. A radon anomaly, possibly caused by a seismic event, is observed when the signal exceeds the threshold value of 0.2.
A radon anomaly is held to be when the absolute value of signal (m-C Rn /p-C Rn )1 exceeds the predefined threshold of 0.2. The ANN in this case performed the best in the case of a seismic window of ±7 days (indicating the length of the period of pre-and post-seismic changes).

Decision trees
Decision trees are machine-learning methods for constructing prediction models from data. The models are obtained by recursively partitioning the data space and fitting a simple prediction model within each partition. As a result, the partitioning can be represented graphically as a decision tree, where each internal node contains a test on an attribute, each branch corresponds to an outcome of the test, and each leaf node gives a prediction for the value of the class variable (Džeroski, 2001;Loh, 2011). Regression trees are designed for dependent variables that take continuous or ordered discrete values. Like classical regression equations, they predict the value of a dependent variable (called the class) from the values of a set of independent variables (called attributes). The model in each leaf can be either a linear equation or just a constant; trees with linear equations in the leaves are also called model trees. Tree construction proceeds recursively, starting with the entire set of training examples. At each step, the most discriminating attribute is selected as the root of the sub-tree and the current training set is split into subsets according to the values of the selected attribute. For continuous attributes, a threshold is selected and two branches are created, based on that threshold. The attributes that appear in the training set are considered to be thresholds. Tree construction stops when the variance of the class values of all examples in a node is small enough. These nodes are called leaves and are labelled with a model for predicting the class value. An important mechanism used to prevent the tree from over-fitting data is tree pruning. Regression (RT) and model trees (MT), as implemented with the WEKA data mining suite (Witten & Frank, 1999), were used for predicting radon concentration from meteorological parameters in the case of radon time series in soil gas at the Krško basin Zmazek et al., 2005) and in the thermal spring water in Zatolmin (Zmazek et al., 2006). Fig. 6. Measured and predicted radon concentration using model trees in the case of soil gas radon at the Krško basin for a seismic window ±7 days; a) low discrepancy in the period without seismic activity; b) high discrepancy starting 10 days before a group of earthquakes; c) the ratio between the measured and predicted radon concentration (m-C Rn /p-C Rn )1 for the same SA period. A radon anomaly, possibly caused by earthquakes, is observed when the signal exceeds the threshold value of 0.2 (marked by the green lines).
As presented in Fig. 5 the first stage of data analysis comprises the selection of attributesi.e. environmental parameters -and the partitioning of the whole data set to the periods with and without seismic activity, SA and NSA respectively. After inspecting the correlation changes between radon concentration and barometric pressure, a seismic window of ±7 days was chosen. The performance was estimated with 10-fold cross-validation in order to evaluate the predictability of the radon concentration in the NSA periods. The model built on the NSA data set was then applied to the SA data set and the performance change was determined using two different measures, the correlation coefficient (r) and the root mean square error (RMSE). For the purposes of prediction, the measured performance in NSA periods should be higher than the performance in SA periods. In these periods, when the discrepancy between the measured and predicted radon concentration is low, no seismic activity is anticipated (Fig. 6a), while in the periods with a higher discrepancy, a radon anomaly can be ascribed to increased seismic activity, rather than to the effect of atmospheric parameters (Fig. 6b). This discrepancy is clearly shown in form of the ratio between both values (m-C Rn /p-C Rn )1, as shown in Fig. 6c. A radon anomaly is held to be when the absolute value of the signal (m-C Rn /p-C Rn )1 exceeds the predefined threshold of 0.2. Besides regression trees, other machine learning methods were also tested (e.g., linear regression and instance-based regression). However, model trees have been shown to outperform other approaches.

Comparison of the results
The results of all of the approaches used for the identification of radon anomalies caused by seismic events in the case of soil gas radon at the Krško basin are shown in Fig. 7 for the period of 1/9 -30/12/2000. Among all of the approaches -and although not very exactthe ±xσ method (I) is the most frequently used. The threshold of anomalous concentrations (e.g., ±1σ, ±2σ, ±3σ) should be chosen in order to minimise the number of false anomalies (FA: anomalies in seismically non-active periods) and so as not to miss the correct ones (CA). Generally, a range of ±2σ from the related seasonal mean value is chosen. Furthermore, a cyclic behaviour of radon concentration has to be taken into account in order to accurately define the period of standard deviation and the calculation of the mean value. For this purpose different methods of time series analyses -for example, Fourier transform (Ramola, 2010) -can be applied.
In the case shown in Fig. 7a, three radon anomalies exceeding 2σ above the mean value may be noticed. The first, in the beginning of September, cannot be assigned to a seismic event (FA). About a week before a weak earthquake of local magnitude M L =1.1, 5 km away from the measurement location -which is the first of five earthquakes over a period of 2 monthsthe second anomaly is observed. And finally, the third one can be noticed soon after a weak earthquake 6 km away (M L =1). The first of the anomalies mentioned above as FA is also visible by applying the method of pressure gradients (II) (Fig. 7b). A positive correlation between the time gradient of radon concentration and the time gradient of barometric pressure is considered to be a radon anomaly, and corresponds to the anomaly observed through method (I) which preceded the first earthquake (M L =1.1). A radon anomaly can also be noticed a few days before the last earthquake, as with the analysis of method (I). Additionally, the anomalous behaviour of the radon concentration as regards the gradient approach is observed during the period starting a few days before the earthquake with M L =2.7 and lasting until the earthquake with M L =1. More often than not, swarms of anomalies are observed over longer periods, with a higher number of anomalies in a swarm observed for approach (II) than for approach (I). As an additional criterion, a threshold of ΔP/Δt > 2 hPa d  1 is introduced by this approach in order to optimise the identification of anomalies caused by seismic events. However, by increasing the threshold value above 2 hPa d  1 , the ratio between correct and false anomalies cannot be significantly improved (Zmazek et al., 2005). Both machine learning approaches, artificial neural networks (III) and decision trees (IV) give promising results, with a low number of false anomalies. The two distinctive anomalies -observed in Fig. 7c and Fig. 7d, for ANN and MT, respectively -confirm the anomalies identified by approaches (I) and (II). Additionally, a relatively long negative anomaly was observed using the ANN approach at the end of November, accompanying the earthquake with M L =1.6. On the other hand, the same negative anomaly is only weakly expressed using the MT approach. A FA observed at the beginning of September using approaches (I) and (II) was also noticed using the MT approach but not by the ANN approach. Approaches (III) and (IV) do not appear to greatly depend upon the choice for the threshold of (m-C Rn /p-C Rn )1 and can, therefore, be used with less hesitation.

Conclusion
Since the appropriate interpretation of field measurements plays an important role in any research, the purpose of this work was to combine and evaluate the different approaches applied by our research group for differentiating the radon anomalies caused by increased seismic activity from those caused solely by environmental parameters. The application of four different approaches -standard deviation from the related mean value (I), the correlation between time gradients of barometric pressure and radon concentration (II), artificial neural networks (III) and decision trees (IV) -was presented. Radon anomalies based on approach (I) have been less successful in predicting earthquakes than those based on the other three approaches. Secondly, approaches (I) and (II) greatly depend upon the values of the ±xσ and ΔP/Δt thresholds, respectively, while the dependence of approaches (III) and (IV) on the threshold of (m-C Rn /p-C Rn )1 is very weak. The number of false anomalies for approach (II) points to the disturbance of radon exhalation by other environmental parameters and not just by barometric pressure. The assumption that radon exhalation is only directly influenced by barometric pressure is further suggested by different forms of radon transport at compression and dilatation zones (Ghosh et al., 2009). Promising results are achieved by applying approaches (III) and (IV), which make it possible to simultaneously incorporate all of the available environmental parameters. Furthermore, in using these techniques, the relation between radon concentration and environmental parameters does not necessarily have to be presumed linear. And finally, in taking into account the scale of the earthquake magnitudes observed during the time of radon measurements, one may speculate that the performance of the applied approaches would be better in the case of stronger earthquakes.

Acknowledgement
This study was done within the program P1-0143: Cycling of substances in the environment, mass balances, modelling of environmental processes and risk assessment.