Open Access book publisher

The aim of this work is to study the seismicity in Chile using the ETAS (epidemic type aftershock sequences) space‐time approach. The proposed ETAS model is estimated using a semi‐parametric technique taking into account the parametric and nonparametric components corresponding to the triggered and background seismicity, respectively. The model is then used to predict the temporal and spatial intensity of events for some areas of Chile where recent large earthquakes (with magnitude greater than 8.0 M) occurred.


Introduction
Chile is considered one of the most active seismic countries in the world. It is located on the South American tectonic plate, on its western boundary where the Nazca and Antarctic plates converge and generate the subduction zones. The interaction of the two plates produces a large deformation of the South American continent, which renders earthquakes in Chile. Due to the high speed of convergence between the Nazca and South America plates, the seismicity in that area is the most intense and produces the strongest earthquakes in the country. Recent significant events are the 2010 Chile earthquake occurred off the coast of central Chile, having a magnitude of 8.8, the 2014 Iquique earthquake with a magnitude of 8.2, and the 2015 Illapel earthquake with magnitude 8.3.
As described in reference [1], Chilean seismicity can vary substantially from place to place, showing various clustering zones. Fault zone heterogeneity and complexity are often responsible for spatial and temporal variations. Stochastic point-process models for earthquake occurrence have been shown to be appropriate tools for studying such complex futures [2,3].
Temporal and space-time epidemic type aftershock sequence (ETAS) models and their extensions have been introduced for such a purpose [3,4]. In particular, such space-time point processes are used to model the occurrence of earthquakes, with two different components of the intensity function: a background component homogeneous in time but nonhomogeneous in space and a short-period component inhomogeneous in time and space which depends on previous events. In statistical seismology, these two components are associated with the background seismicity and the aftershock sequences, respectively. In such processes, the estimation of the basic reproductive rate is often complicated by the presence of triggered events, superimposed to the persistent background component. Since the persistent background activity prevails, on large timescale, over the aftershocks activity, location of large earthquakes may be forecasted starting from the identification of the background seismicity.
Recently, ETAS has been used by a number of researchers for the development of both shortand long-term forecasting models [5][6][7][8][9][10]. Moreover, few works study the seismicity in Chile: nowadays the ETAS model has been applied to the Chilean catalog only by Nicolis et al. [1]. In this work, we use the ETAS model based on the forward likelihood for prediction (FLP) estimation proposed by Adelfio and Chiodi [11,12] for predicting the spatial and temporal seismicity in Chile. In particular, in this chapter, we focus on three areas associated with the biggest earthquakes of the last 10 years. For the space-time forecasting, we will consider a period including the principal event.
The work is organized as follows. In Section 2, we describe the conditional intensity function and the ETAS model as a particular branching-type model; Section 3 presents the spatial and temporal forecasts for of the Chilean catalog. Conclusion and further developments are discussed in Section 4. Sections 5 and 6 report acknowledges and references.

The forecast approach
Earthquake prediction research is still a considerable problem. Earthquake forecast can be defined as a sequence of earthquake rates corresponding to specified multidimensional intervals (defined by the location, time, and magnitude) [13,14]. From the rates specified in each forecast, a vector of the expected number of events within all intervals is calculated.
In the context of earthquakes, expectation has a different meaning from prediction, since earthquake prediction implies high probability and imminence of a single earthquake [13]. In other words, earthquake prediction can be considered as a special case of forecast in which the forecast rate is temporarily high enough to justify an exceptional response beyond that appropriate under normal conditions [13,14].
In this context, the most general earthquake forecast is a ranking of the bins according to their expected probability to host one or more earthquakes. In recent years, several projects have been developed in order to provide some tools to quantify the predictive skill of an earthquake forecast, which is to check if an observed set of earthquakes is consistent with its forecast. Famous prediction centers are the Collaboratory for the Study of Earthquake Predictability (CSEP) testing centers [15,16], and the Regional Earthquake Likelihood Models (RELMs) [17].
A proper comprehension and an accurate interpretation of real natural phenomena described by evolutionary processes require the identification of a function used to specify the dependence of the present on the past. If we assume that the process under study evolves according to a memoryless distribution, the number of events occurring in any bounded interval of time after time t is independent of the number of events occurred before time t [18,19]. This assumption is sometimes unrealistic and useless in real contexts, where the dependence on past events is crucial, mostly if we are aimed at predictive results [18]. Point process theory provides a very useful theoretical tool to represent the evolution of some random systems, through time in a fixed space region. In such kind of processes, it is assumed that the present may be influenced by its past, but not by the future. This identifies a natural ordering for temporal point processes. Furthermore, if we consider the spatial component, a space-time point process can be defined by specifying a stochastic model based on the spatial coordinates and the time of the next event, given the history of previous spatial temporal events. The expected number of events in a given region conditional on the past events is given by the conditional intensity function [18,19].
Let T×S d-1 be a general d-dimensional closed region, with T the time domain and S d-1 a two-or three-dimensional space. Any analytic space-time point process is uniquely characterized by its associated conditional intensity function [19] defined as the expected frequency of events in a space-time unit region, conditional on the history H t of the point process up to time t, i.e.:,  [20]. Generally, intensities λ( t, s ) depend on some unknown parameters. A very primary and crucial step to provide a valid forecast is to obtain a proper estimate of the conditional intensity function of the generator process of the observed data.
Among the processes used to model and describe the evolutionary features of the observed reproduction process, branching models may be considered. A widely used branching-type model for earthquake description is the Epidemic Type Aftershock-Sequences (ETAS) model [21], a self-exciting point processes for seismological context.

The space-time ETAS model
In general, the conditional intensity function of the branching model in a time-space point (t, s) is defined as the sum of a term describing the long-term variation (spontaneous activity or background) and one relative to the short-term variation (induced activity or offsprings): with θ=(ϕ, μ), the vector of parameters including the parameter set of the induced intensity (ϕ) together with the parameter of the background general intensity (μ), f(s) the space density, and τ ϕ (t, s) the induced intensity, given by with ν ϕ (.) a space-time density function.
In such models, the main issue is the simultaneous estimate of the two different components of the intensity function: the large-timescale component given by the component μf(s) and the short-period component given by τ φ (t, s ) . If the large-timescale component in Eq. (2) is known, the parameter set ϕ is usually estimated by the maximum likelihood method. In applications, the background component μf(s) is usually estimated by nonparametric techniques, like kernel estimators [22].
In particular, in such processes where the reproduction or some epidemic activity can be modeled, prediction of the basic reproductive rate is often complicated by the presence of triggered events, superimposed to the persistent background component. In the seismological process, earthquake clusters, formed by the main shock event with its foreshocks and aftershocks, often complicate the statistical analysis of the background seismic activity that might be linked to the plate tectonic rate. On a large timescale, the background seismic activity prevails over the aftershock activity, and so background seismicity can be used to forecast the location of large earthquakes, and removal of temporal cluster members may be a crucial issue. Indeed, when forecasting large earthquakes when aftershocks occur, the statistical analysis of the background activity may be complicated by earthquake clusters [11].
Moreover, since description of real phenomena like seismic events often requires the definition of more complex models than the stationary Poisson process. If we do not assume statistical independence of earthquakes, second-order properties of point processes may have a relevant role in the study and the comprehension of seismic process and its realization [21]. Indeed, seismic data represent an interesting situation where the study and the interpretation of features like self-similarity, long-range dependence, and fractal dimension are crucial. For this reason, fractal point processes could also find an application to seismological analysis, although it would require further advances in the study of geology and seismology.
A branching process widely used in seismology for describing the temporal and spatial distribution of earthquakes events is the epidemic type aftershocks sequences (ETAS) model [23]. The ETAS conditional intensity function can be written as follows: where m j is the magnitude of the jth event. In the ETAS model, the background seismicity f ( s ) is assumed to be stationary in time, while the occurrence rate of aftershocks at time t is described by the following parametric function, where κ is the normalizing constant, c and p are characteristic parameters of the seismicity activity, and p indicates the decay rate of aftershock in time. The function l ( ⋅ ) given by, describes the spatial distribution of the triggering events. It relates the occurrence rate of aftershocks to the main shock (principal earthquake event) of magnitude m j through the parameters α and γ ; m 0 is the lower bound for which earthquakes with higher values of magnitude are surely recorded in the catalog; d and q are two parameters related to the spatial influence of the main-shock.

The estimation of a branching process
The simultaneous estimation of the background intensity and the triggered intensity components of ETAS model is a crucial statistical issue. In exploratory contexts, or to assess the adequacy of a specific parametric model, some kind of nonparametric estimation procedures could be useful, though this cannot be appropriate when we look for predictive properties of the estimated intensity function.
Because of the different seismogenic features controlling the kind of seismic release of clustered and background seismicity to describe the seismicity of an area in space, time, and magnitude domains, it could be useful to study separately the features of independent events and those of the strongly correlated ones. Zhuang [24] proposed a stochastic method associating with each event a probability to be either a background event or an offspring generated by other events, based on the ETAS for clustering patterns: a thinned catalog is generated by a random assignment of events, where events with a higher probability of being main shocks are more likely included, and their spatial intensity is modeled using the inhomogeneous Poisson process. The two complementary subprocesses of the general seismic process are so identified: the background subprocess and the cluster (or offspring) subprocess [11]. In previous papers [12,18], we proposed a technique to find out the two main components of seismicity, i.e., the background seismicity and the triggered one. Helmstetter et al. [7] presented a seismic sequence detection technique based on Maximum Likelihood Estimation (MLE) of parameters, which identifies the conditional intensity function of a model describing the seismic activity as a clustering-process, such as the ETAS model. Console et al. [25] proposed a stochastic method associating with each event a probability to be either a background event or an offspring generated by other events; Marsan and Lenglinè [26] used the concept of cascade triggering without using models. A probabilistic clustering approach, providing an uncertainty about an object's class membership, can be provided by latent clustering analysis [27]. In recent papers, we classify events according to their probability of being a background or an offspring event, estimating the space-time intensity of the generating point process of the different components by mixing nonparametric and parametric approaches. In references [11,18],a new approach (forward likelihood for prediction, FLP) for estimating the components of a space-time point process is introduced and developed. FLP provides an estimation of the space-time intensity of a branching-type point process that is usually characterized by these different components (background (nonparametric) and triggered (parametric), and maximizes the sum of increments of the likelihood, similarly to a cross-validation-type technique, with the advantage of taking into account the information of the past on the present. In other words, it is a nonparametric estimation procedure based on the subsequent increments of log-likelihood obtained adding an observation one at a time, to account for the information of the observations until t k on the next one.
Briefly, suppose that in a space-time point process the intensity function λ ( · ) depends on a set of parameters ψ. The difference among the log-likelihood for the point process, given the k observed values (t i s i ) and the log-likelihood computed on the first k+1 observations, but using the estimates based on the first k defined, is a measure of the predictive information of the first k observations on the k+1th, named δ k, k+1 (ψ). This approach leads to a technique similar to cross-validation, but applied only on next observations. In particular, we choose ψ that maximizes the sum of these measures adding one observation at a time, denoted by FLP(ψ).
The maximization of the measure FLP(ψ) can be used to estimate the nonparametric component of earthquake models such as branching one, providing better kernel estimates (in terms of Mean Integrated Squared error, MISE) of space-time intensity functions than classical methods [11,18]. In general, the FLP approach can be used to provide valid estimates of the parameters (i.e., the bandwidth in kernel components) of complex intensity functions that identify multidimensional point processes.

Predicting the number of events
In order to make predictions in the temporal domain, we develop a new procedure that can be summarized as follows [12]: • Start from the estimated ETAS model, based on the FLP estimation using data until time t.
• Forecast the intensity of the following day, that is, compute the intensity in the time t + dt , conditioned to the history up to t, that is, λ θ ( t + dt, s| H t ) (in our experiment dt is one day).
• Then, reestimate the model based on the predicted intensity and repeat these steps for a period of n fixed days.
In particular, the main idea for improving prediction for t + dt is that we have to account also for the triggering effect of events that may occur in the meanwhile, conditioning not simply to the history up to t but to H t+dt , that is we actually make a prediction computing λ θ ( t + dt, s| H t+dt ) instead of simply λ θ ( t + dt, s| H t ) .
To obtain this, we assume that we have just a temporal process: Then, in order to make a prediction of the number of events in t + dt , we have to consider the integral of Eq. (7) with respect to the time: In Eq. (8), the first part of the sum describes the predicted number of events in t + dt conditioning to the history up to t, while the second part represents an additional effect due to the events that may occur between t and dt .Although traditional methods only consider the first part of Eq. (8), we think that the correction term given by the additional effect in Eq. (8) could improve the intensity of temporal predictions.
This last part has been approximated, considering its value in the point 0.5, taken as the half of the next day and, in particular for the g(.) function, we considered the following approximation:.
In the next section, we will compare predictions obtained with and without the additional effect.

Forecasting Chilean events
We consider a rectangular area of Chile that stretches from the town of Arica to the town of Valdivia, delimited by the longitudes (-76, -68) and the latitudes of (-40, -18) and we select only events at a depth of less than 60 km for the period January 2007-April 2016. We then study the temporal and spatial distribution of the seismic events with magnitude greater than 3.0.
Staring from the estimated ETAS model, we forecast the number of seismic events before and after some big earthquakes. In particular, we consider three main earthquake events: the 27F-2010 Chile earthquake (in the center of Chile), whose epicenter was close to the town of Cencepción; the 2014 Iquique earthquake (in the north of Chile), and the 2015 Illapel earthquake (in the center north of Chile) (see Figure 1).
In particular, we aim to obtain daily estimates for the probability to observe at least one event of magnitude greater than the threshold magnitude m 0 in each of the cells of the testing area.
We employ an ETAS model, with conditional intensity function as in Eq. (3). Background seismicity f(x; y) is estimated through the FLP approach, while parametric components are estimated through maximum likelihood.
The two estimation steps are alternated until convergence [5]. Estimates were obtained using the R-package etasFLP [12]. As an example of the performance of the proposed technique, we report a forecast experiment conducted for the temporal period of the three catastrophic seismic sequences that hit the Chilean area as described above. We first estimate the total, background and triggered intensity using all observations, with magnitude greater than 3, from January 1, 2007 to May 4, 2016. Then we forecast the daily intensity forward for 120 days, each day at time. Results of the experiment are presented in Figure 2.
Some preliminary results are shown in Figure 2, which compares the number of observed events with the number of events predicted by the ETAS model, for the 2014 Iquique earthquake.
As shown in Figure 1, the predicted events with the correction term (green line) (see, Section 2.3) are very close to the observed ones, which implies that the forecast including the amplifier effect of magnitude is the closest ones to the real situation. However, it is evident lack of fit in forecasting the number of events in the first days when we do not account for the correction effect. Figure 3(a) and (b) shows the spatial estimated intensity of the seismicity for the 5th and 20th day, respectively, after the main earthquake at Iquique. In both cases, the model identifies the highest risk seismic area. As expected, the intensity of earthquake events is lower for 20 days after the main event. The sequence of the estimated parameters (μ, κ, c, p, and α) for the 120 prediction days is represented in Figure 4, together with the standard errors. In particular, it is evident how that the main effect of the main earthquake event changes the behavior of estimated parameters, always in the same direction: e.g., a decreasing effect for the background intensity parameter (ε), and an increasing one for the parameter of the triggered component (k 0 ). These effects are reasonable and confirm the idea that seismicity is characterized by complex and multiscale  features, strictly related also to the magnitude of events. A similar behavior of the parameters may be observed also for the other sequences reported in this paper (see Figures 7 and 10).
The number of predicted events for the 2015 Illapel earthquake is represented in Figure 5.
As in the latter case, predictions obtained by the ETAS model with the correction effect (green line) are much better than the model without the correction. Unlike the 2014 Iquique earthquake, in this case, there are not precursors for the main event. Despite this, the model is able to almost exactly predict the number of events for the day after the main shock The forecasted spatial intensity maps represented in Figure 6 correctly identified the areas with higher probability of seismicity (the pixels with higher values are close to the epicentres of the observed earthquakes). The behavior of the estimated parameters for the Illapel's eartquake (Figure 7) are similar to the case of Iquique's earthquake except for the background seismicity parameter which seems to decrease after the main shock. The predictions for 27F-2010 earthquake are not as good as in the latter cases. The seismicity in this area seems to have a different behavior compared to the Gutenberg-Richter law where the number of events should follow an exponential distribution. In this case, the number of events after the main shock seems to follow an asymmetric distribution with a mode close to the 20th event after the main earthquake (Figure 8).   This earthquake is characterized by a magnitude (equal to 8.8 M) greater than the other two events considered in this study and affected a large area. Probably the strongest earth quake provoked break points in the neighboring faults that may not necessarily due to the triggering seismicity. Figure 9 shows the extension of the area affected by the 27F earthquake. We can also note that most of the events that occurred after the February 27 are not clustered around the main event but in a northern area. However, the spatial model seems to correctly identify the seismic area.   Figure 10 shows the behavior of the parameters for the 120 prediction days. Although the behavior of the parameters μ, κ, c, and p is similar to the behavior observed in the previous big earthquakes, the parameter α, measuring the efficiency of an event of a given magnitude to generate aftershocks, shows a very strong decrease, reaching values close to 0.5 (this parameter normally assumes values close to one).

Conclusions
In this work, we estimate the space-time intensity function of the seismic process describing the very recent Chilean activity (as shown in Figure 1), using the FLP approach. Moreover, we propose a forecast method based on a correction term for predicting three sequences of earthquakes whose epicenters are distributed in different areas of Chile. In all cases, the ETAS model provides better forecasts when the correction term is considered. However, we think that future developments are needed to improve the results of the forecast experiment, most in terms of the spatial adjustment. For this purpose, more flexible versions of the ETAS model will be considered, by accounting, for instance, for the observed space geophysics structures [28].