Tohoku volcanic arc [30, 28]. Dense-rock equivalent (DRE) of eruptive volumes is the product of volume and density of the respective volcanic deposits.
Geological hazard assessments based on established statistical techniques are now commonly used as a basis for decisions that may affect society over the long term (0.1–1 Ma). Volcanic risk essentially consists of:
A challenge with the long-term probabilistic assessment of future volcanism in relation to the siting of, for example, geological repositories is that, because new volcano formation is rare, uncertainties in models are inherently large. Sites for nuclear facilities in particular must be located in areas of very low geologic risk. Recent studies have examined the hazard posed by volcanoes to nuclear power plants in Armenia [e.g. 9, 12] and Java, Indonesia [e.g. 13]. There the focus was more on the consequences of an eruption at an existing volcano for the safety of an operating nuclear power plant. In the case of a geological repository for high and/or low level radioactive waste, the emphasis is on the consequences of new igneous activity, such as a dike that may intrude the repository [e.g. 14] and transport the waste to the surface. In this case, the probability of a new volcano forming in the first place is very low (typically < 10⁻⁷/a), since by definition such facilities should be located away from existing Quaternary volcanoes. However, the lack of volcano ‘data’ implies that additional information on the processes that control the future long-term spatio-temporal distribution of volcanism is needed. This has motivated several investigators to incorporate datasets in addition to the distribution and timing of past volcanic activity in volcanic probabilistic analyses [e.g. 15]. Bayesian inference has been used to combine geophysical datasets with probability distributions constructed from known historic volcano locations in order to estimate the location of future volcanism on a regional scale. More recently, Bayesian inference has been used to merge prior information and past data to construct a probability map of vent opening at the Campi Flegrei caldera in Italy.
Here we revisit a previously developed Bayesian approach in which seismic tomography and geothermal gradients were incorporated into probabilistic assessments by Bayesian inference in Tohoku. We apply the same Bayesian technique in the same study area to incorporate recently acquired helium isotope data into probabilistic hazard assessments. Noble gases have been shown to be excellent natural tracers for mantle-crust interaction because they are chemically inert and are therefore not altered by complex chemical processes. Moreover, helium isotopes provide evidence for the presence of mantle-derived materials in the crust, owing to the distinct isotopic compositions of the crust and the upper mantle [e.g. 17, 18]. We examine the link between volcanism and ³He/⁴He ratios, which may indicate possible regions of magma generation and hence volcano formation. Such links between magmatism and elevated ³He/⁴He ratios have been proposed [e.g. 19, 20], but the link has not been examined quantitatively in probabilistic models. Finally, we discuss this Bayesian method in the context of recent approaches to incorporating multiple datasets [e.g. 21, 22].
2. Japan and the Tohoku region
Japan is one of the most tectonically active regions in the world. Due to the dynamics of four plates, Quaternary volcanoes have formed along distinct volcanic fronts in east and west Japan (Figure 1).
The Tohoku region (Figure 2) is arguably one of the most extensively studied volcanic arcs in the world, particularly regarding the relationship between volcanism and tectonics. Moreover, numerous geological and geophysical investigations have yielded high-quality datasets [e.g., 23–29].
Tohoku is a mature double volcanic arc with a back-arc marginal sea basin located on a convergent plate boundary of the subducting Pacific plate and the North American plate (Figure 1). The location and orientation of the volcanic front (grey line in Figure 2) have been linked to the opening of the Sea of Japan and the subduction angle of the Pacific plate [e.g., 26, 33]. From 60 Ma until about 10 Ma the volcanic front migrated east and west several times; however, it has been relatively static during the last 8 Ma.
Presently there are 15 known historically active volcanoes in the Tohoku region and a total of 170 volcanoes that formed during the Quaternary. Volcanism has gradually become more clustered and localized over the period from 14 Ma to the present; thus volcano clustering is a characteristic feature in Tohoku.
3. Defining the volcanic event
What do we mean here by ‘volcano’? In developing probabilistic based models, one of the most difficult and challenging tasks is defining the ‘volcanic event’. This is because the volcanic event defined has to be simple and consistent enough for the probabilistic based models to handle. To a certain extent the degree of consistency that can be realistically included in a model is largely constrained by the size of the study area and by the amount and quality of geological, geochemical and/or geophysical data available. The volcanic event could range from a single eruption to a series of eruptions. It could be defined as the existence of a relatively young cinder cone, spatter mound, maar, tuff ring, tuff cone, pyroclastic fall, lava flow or even a large composite volcano. On the other hand older edifices may have been eroded and/or covered by sedimentary deposits such as alluvium and thus be more difficult to locate and/or are easily overlooked. Results of magnetic and gravity data have been used as evidence for locating such hidden volcanic events which in turn had an impact on resulting probabilities at given locations [e.g. 15].
If we were carrying out a hazard assessment on a single volcano, we may be interested in defining the event as a series of pyroclastic flows or surges, or as eruptions that generate lava flows exceeding a certain volume [e.g. 9]. This is particularly relevant to volcanic hazard assessments carried out at volcanoes near densely urbanized areas such as the Campi Flegrei caldera in southern Italy.
Several aligned edifices with the same eruption age may also be considered as a single volcanic event. Such vent alignments typically developed simultaneously as a result of magma supply from a single dike. For example, the vent alignments in the Higashi-Izu monogenetic volcanic group [e.g. 35] could well be classified as a single volcanic event temporally, although spatially they are multiple. Where age data have been limited, some authors have implemented a condition whereby a cone or cones can only be defined as a volcanic event if they are associated with a single linear dike or a dike system with more complex geometry [e.g. 36].
Many of the advances made in modelling future spatial or spatio-temporal patterns of volcanism were carried out in monogenetic volcano fields, owing to the apparent relative ease of defining such volcanoes as point processes [e.g., 37, 38]. However, as composite or established polygenetic volcanoes represent multiple eruptions from the same conduit occurring over several tens to hundreds of thousands of years, defining the volcanic event is not so easy if the focus is on single eruption episodes, because the type of eruption can evolve significantly during the lifetime of the composite volcano. In fact, the temporal definition of a monogenetic volcano appears not to be so straightforward either, as this can range from several days to a few weeks or longer. For example, the Ukinrek maar in Alaska formed in about eight days, and the 1913 eruption at Ambrym Volcano, Vanuatu, in the southwest Pacific took just a few days. Moreover, it has been argued that monogenetic volcanoes can be both spatially and temporally more complex than a single eruptive event. In other words, so-called ‘monogenetic’ volcanoes can also be ‘polygenetic’, albeit on a smaller scale than large-volume complex strato or caldera volcanoes. Based on this, there could be a case to look again at the volcanic event definition used in earlier probabilistic assessments carried out in monogenetic volcanic fields [e.g. 42, 43].
In Tohoku, new volcanoes forming at new locations typically evolve into large complex strato and/or caldera volcanoes containing multiple vents, e.g. Akitakomagatake volcano. Such large polygenetic volcanoes in Tohoku have been sub-grouped into unstable types, where the eruptive centre has migrated more than 1.5 km within 10 ka, and stable types, where the vents are more concentrated around the geographic centre of the volcano.
The volcanic event definition requires information on both temporal and spatial aspects: the temporal definition relates to the recurrence rate (number of volcanic events per unit time), and the spatial definition to the intensity or spatial recurrence rate (number of volcanic events per unit area). The two can also be combined as a spatio-temporal recurrence rate (number of volcanic events per unit area per unit time).
The temporal definition of a volcanic event could range from a single eruption occurring in one day or less to an eruption cycle in which active periods of eruptions occur between dormant periods. The time scale of an active period may vary from several years to thousands of years. In previous volcanic hazard analyses carried out on complex, large-volume strato and/or caldera volcanoes, volcanologists have typically defined volcanic events as single eruptions or several eruptions within some defined time period, separated by periods in which there is no activity. This is because the focus at such established volcanoes is not on the probability of a new volcano forming in the vicinity of the existing volcano but rather on the probability of the next eruption or eruption phase.
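The three rates defined above differ only in what the event count is divided by. The following is a minimal sketch of that arithmetic; the function name and all input values are hypothetical and are not taken from the Tohoku dataset.

```python
def recurrence_rates(n_events, duration_a, area_km2):
    """Return (temporal, spatial, spatio-temporal) recurrence rates.

    n_events   : number of volcanic events counted
    duration_a : length of the observation period in years
    area_km2   : area of the volcanic field in km^2
    """
    temporal = n_events / duration_a                      # events per year
    spatial = n_events / area_km2                         # events per km^2
    spatio_temporal = n_events / (area_km2 * duration_a)  # events per km^2 per year
    return temporal, spatial, spatio_temporal


# Hypothetical inputs, for illustration only.
lam_t, lam_xy, lam_xyt = recurrence_rates(
    n_events=100, duration_a=2.0e6, area_km2=50_000.0
)
```

Note that the result depends directly on how the volcanic event is defined, since that definition fixes the count in the numerator.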
3.1. Tohoku volcanic event definition
In the context of siting a geological repository, the main concern is the formation of a new volcano in a region where volcanoes do not already exist. Thus the distinction between monogenetic (simple or complex) and polygenetic (complex strato and/or caldera) volcanism is not relevant for the definition of volcanic event here. Table 1 is a compilation of all Quaternary volcanoes in the Tohoku volcanic arc, modified from the Catalog of Quaternary Volcanoes in Japan [1, 30]. Volcano complexes refer to magma systems that have evolved over the long term (order of 0.1 Ma) and appear as regional-scale clusters. In this chapter we adopt the previously published definition of volcanic event, which takes eruption volumes into account. The volcanic event is depicted as a white triangle in Figure 3 and is the average geographic location of the vents (white dots). The eruption products released from the vents are represented by the dark grey regions in Figure 3. The lighter grey areas in Figure 3a are the eruption products of a separate volcanic event. Each volcanic event is typically separated from others by a time gap of more than 10 ka and/or is differentiated from other volcanic events according to geochemistry.
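As a minimal sketch of the location rule described above, the volcanic event position can be computed as the arithmetic mean of the vent coordinates. The vent coordinates below are hypothetical, and averaging latitude and longitude directly is an assumption that is only reasonable for vents a few kilometres apart.

```python
import numpy as np


def event_location(vent_lat, vent_lon):
    """Average geographic location of a group of vents (degrees)."""
    lat = np.asarray(vent_lat, dtype=float)
    lon = np.asarray(vent_lon, dtype=float)
    return float(lat.mean()), float(lon.mean())


# Hypothetical vents belonging to one volcanic event.
vents_lat = [39.95, 39.97, 39.96]
vents_lon = [140.75, 140.77, 140.76]
event_lat, event_lon = event_location(vents_lat, vents_lon)
```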
|Volcano||Volcanic event||Latitude (°N)||Longitude (°E)||Age (Ma)||Dating method||Volume (DRE)|
|Okiura||Aoni F. Aonigawa P.F.||40.573||140.763||ca.1.7||K-Ar||17.6|
|Okiura||Aoni F. Other P.F.||40.573||140.763||1.7–0.9||K-Ar||3.7|
|Kensomori/Morobidake||1470m Mt. lava||39.909||140.872||0.1|| || |
|Tamagawa Welded Tuff||Tamagawa Welded Tuffs R4||39.963||140.763||ca.2.0||K-Ar||83.2|
|Tamagawa Welded Tuff||Tamagawa Welded Tuffs D||39.963||140.763||ca.1.0||K-Ar||32.0|
|Takamatsu/Kabutoyama||Kabutoyama Welded Tuff||39.025||140.618||1.16||TL||3.2|
|Takamatsu/Kabutoyama||Kiji-yama Welded Tuffs||39.025||140.618||0.30||K-Ar||5.1|
|Kurikoma||Older Higashi Kurikoma||38.934||140.779||ca.0.5||K-Ar||2.2|
|Kurikoma||Younger Higashi Kurikoma||38.934||140.779||0.4–0.1||K-Ar||0.7|
|Onikobe||Onikobe Central cones||38.805||140.727||ca.0.2||TL||1.1|
|Naruko||Naruko Central cones||38.730||140.727||ca.0.045||14C||0.1|
|Zao||Central Zao 1st.||38.133||140.453||1.46–0.79||K-Ar||0.8|
|Zao||Central Zao 2nd.||38.133||140.453||0.32–0.12||K-Ar||15.2|
|Zao||Central Zao 3rd.||38.133||140.453||0.03–0||K-Ar||0.0|
|Azuma||Azuma Kitei lava||37.733||140.247||1.3–1||K-Ar||24.7|
|Nishikarasugawa andesite||Nishikarasugawa andesite||37.650||140.283||ca.1.5||K-Ar||1.9|
|Adatara||Adatara Stage 1||37.625||140.280||0.55–0.44||K-Ar||0.3|
|Adatara||Adatara Stage 2||37.625||140.280||ca.0.35||K-Ar||0.4|
|Adatara||Adatara Stage 3a||37.625||140.280||ca.0.20||K-Ar||2.0|
|Adatara||Adatara Stage 3b||37.625||140.280||0.12–0.0024||K-Ar||0.3|
|Shirakawa||Tokaichi A.F. tuffs||37.242||140.032||1.31–1.24||Strat.||12.0|
|Shirakawa||Kinshoji A.F. tuffs||37.242||140.032||1.2–1.18||Strat.||9.0|
|Chokai||Shinsan Lava flow||39.097||140.053||0.02–0||Strat.|| |
|Numazawa||Sozan lava domes||37.452||139.577|| || || |
|Numazawa||Mizunuma pyroclastic dep.||37.452||139.577||ca.0.05||FT||2.0|
|Hijiori||Hijiori Pyroclastic flow||38.610||140.159||ca.0.01||Strat.||1.0|
|Hijiori||Komatsubuchi lava dome||38.613||140.171||ca.0.01||Strat.||0.0|
|Tashiro||Hirataki nuée ardente deps.||40.420||140.413||0.020–0.020||Strat.||0.7|
4. Bayesian model
The following is a slightly shorter description of the previously published Bayesian methodology. A two-dimensional surface distribution is set up showing the continuous probability of one or more volcanic event(s) forming within a region of interest, in an arbitrary time frame of the order of 0.1–1 Ma. The volcanic event definition given above means that we are estimating, with known uncertainty, the probability of a new volcano forming at a given location (x, y). A challenge with estimating the long-term future spatial distribution of volcanism is that we are trying to model something that we cannot sample directly, namely the locations of future volcanoes. In this chapter we incorporate ³He/⁴He ratios, as these may be indicative of conduits in the Earth’s crust through which magma may rise, resulting in future volcano formation [19, 20].
Information, no matter how obtained, can be described by a probability density function (PDF) [e.g. 45, 46]. Once the dataset is expressed as a PDF, it is possible to combine with our initial PDF created based on
Essentially, two stages are performed yielding the
4.1. Bayesian inference and Bayes’ theorem
Bayes’ theorem [e.g. 47] is used to setup a model providing a joint probability distribution for the location known volcanic events (
4.2. A priori PDF
We assume that past and present volcanic events can be used to estimate future locations of volcanoes over the long term, as well as to constrain upper-bound recurrence rates in the volcanic field. The spatial distribution of volcanoes in volcanic arcs like Tohoku is random; hence, by treating volcanism in the Tohoku volcanic arc as a low-frequency, random event, it is assumed that the underlying process can be approximated by a Poisson process. Moreover, by treating the locations of volcanic events as random points within some set, the spatial distribution of volcanism can be modeled as a spatial point process, where a spatial point process is a stochastic model that can be described as the process controlling the spatial locations of a set of events in some arbitrary set. In applying point process models to volcanism, the events are defined as volcanic events and the set as the volcanic field.
The Poisson process is ‘homogeneous’ if the spatial distribution of point events is completely random. However, as with many volcanic fields, spatial patterns of volcanism in the Tohoku volcanic arc are clustered [34, 50]; hence the distribution of volcanoes is not completely random and is therefore non-homogeneous (also referred to as inhomogeneous). Application of the Clark-Evans nearest-neighbour test showed that the distribution of the volcanic events defined above is clustered with greater than 95% confidence. A non-homogeneous Poisson process is the simplest alternative for modeling such clustered events. Moreover, point process models based on non-homogeneous Poisson processes have been extensively used in modeling the spatial and spatio-temporal characteristics of several volcano fields (e.g. the Springerville volcanic field in Arizona and the Higashi-Izu monogenetic volcano group, Shizuoka Prefecture, Japan). In these models the local spatial density of volcanic events is calculated using a kernel function [37, 52]. The kernel function itself is a density function used to obtain the intensity of volcanic events at a sampling point, calculated as a function of the distance to nearby volcanoes and a smoothing constant
As noted by  the choice of kernel function with appropriate values of
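The Clark-Evans nearest-neighbour test referred to in this section compares the observed mean nearest-neighbour distance with the value expected for a completely random pattern of the same density. The sketch below uses the standard form of the statistic without edge correction; the synthetic point pattern, the study area and the function name are hypothetical, and this is not necessarily the exact procedure applied to the Tohoku events.

```python
import numpy as np


def clark_evans(points, area):
    """Clark-Evans ratio R and z-score for a 2-D point pattern.

    R < 1 suggests clustering, R > 1 suggests regular spacing.
    No edge correction is applied in this sketch.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise distances between all points.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)          # ignore self-distances
    r_obs = dist.min(axis=1).mean()         # observed mean nearest-neighbour distance
    density = n / area
    r_exp = 0.5 / np.sqrt(density)          # expected value under complete randomness
    se = 0.26136 / np.sqrt(n * density)     # standard error of r_exp
    return r_obs / r_exp, (r_obs - r_exp) / se


# Hypothetical clustered pattern: 5 clusters of 20 points in a 100 km x 100 km area.
rng = np.random.default_rng(0)
centres = rng.uniform(10.0, 90.0, size=(5, 2))
clustered = np.vstack([c + rng.normal(0.0, 1.0, size=(20, 2)) for c in centres])
R, z = clark_evans(clustered, area=100.0 * 100.0)
```

A z-score below about -1.645 indicates clustering at the 95% level in a one-sided test.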
4.3. Estimating an optimum smoothing coefficient h for the volcanoes in Tohoku
The choice of the smoothing coefficient depends on a combination of the size of the volcanic field, the size and degree of clustering, and the amount of robustness and conservatism required at specific points within or near the volcanic field in question. In order to estimate the most likely optimum value of the smoothing coefficient, cumulative probability density functions with varying values of the smoothing coefficient were compared with the fraction of volcanic vents and nearest-neighbour volcanic event distances in Tohoku (Figure 5).
The cumulative plots in Figure 5 suggest that the spatial distribution of volcanic events in the Tohoku volcanic arc fit a Cauchy distribution with smoothing coefficients of
4.4. A priori probabilities
Probability estimates for each grid point are computed using a Poisson distribution in which the intensity parameter is computed using equation (2):
Using smoothing coefficients of 1 - 1.5 km for the Cauchy kernel, as well as weighting eruption volumes, probability plots were constructed using equation (3). A probability contour plot for one case is shown in Figure 6.
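The a priori calculation can be sketched as follows. Because equations (2) and (3) are not reproduced in this text, the sketch makes two assumptions: the kernel has the Cauchy-type form 1/(1 + (d/h)²), normalised numerically over the grid rather than analytically, and the probability of one or more events in a cell is 1 − exp(−rate × time × spatial probability). Event coordinates, the regional rate and all function names are hypothetical.

```python
import numpy as np


def cauchy_spatial_pdf(gx, gy, ex, ey, h, weights=None):
    """Spatial probability per grid cell from a Cauchy-type kernel.

    gx, gy  : 2-D arrays of grid-node coordinates (km)
    ex, ey  : event coordinates (km)
    h       : smoothing coefficient (km)
    weights : optional per-event weights (e.g. eruption volume)
    """
    ex = np.asarray(ex, dtype=float)
    ey = np.asarray(ey, dtype=float)
    w = np.ones_like(ex) if weights is None else np.asarray(weights, dtype=float)
    # Squared distance from every grid node to every event: shape (ny, nx, n_events).
    d2 = (gx[..., None] - ex) ** 2 + (gy[..., None] - ey) ** 2
    kernel = 1.0 / (1.0 + d2 / h ** 2)
    density = (kernel * w).sum(axis=-1)
    return density / density.sum()          # normalise so the cells sum to 1


def prob_one_or_more(spatial_pdf, rate_per_a, t_a):
    """Poisson probability of at least one event per cell within t_a years."""
    return 1.0 - np.exp(-rate_per_a * t_a * spatial_pdf)


# Hypothetical 50 km x 50 km grid with 1 km spacing and three events.
x = np.arange(0.0, 50.0, 1.0)
y = np.arange(0.0, 50.0, 1.0)
gx, gy = np.meshgrid(x, y)
pdf = cauchy_spatial_pdf(gx, gy, ex=[10.0, 12.0, 40.0], ey=[10.0, 11.0, 35.0], h=1.5)
p = prob_one_or_more(pdf, rate_per_a=1.0e-4, t_a=1.0e5)
```

Passing eruption volumes as `weights` corresponds to the volume-weighted event definition; leaving it unset treats all events equally.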
The highest probabilities are located in the Sengan region (10⁻⁶–10⁻⁵/a), which has the highest density of volcanic events in the Tohoku volcanic arc. By testing the two volcanic event sub-definitions (weighted with and without eruption volume), it was found that the probabilities in the vicinity of monogenetic volcanoes in the back-arc region were higher when volcanic events were not weighted with eruption volumes (1–4 × 10⁻⁷/a, weighted; 1–4 × 10⁻⁶/a, un-weighted), whereas the probabilities around established centers such as Iwaki, Towada, Sengan and Chokai were reduced slightly. This is expected, as volcanoes with large eruption volumes are the sites of highest magma production. However, if the focus of the assessment is on new volcanic event formation, irrespective of whether the new volcano evolves into a large complex stratovolcano and/or caldera or not, then selecting the volcanic event definition that is not weighted with eruption volume would seem more appropriate.
4.5. The likelihood function
In order to compare the R/RA ratios, cumulative plots of values around all volcanic events and values of 10 km² bins over all of Tohoku are plotted. Figure 8 shows R/RA ratios below all volcanic events (8a) and below volcanic events younger than 100 ka (8b). In both cases approximately 90% of all volcanic events are distributed in regions with R/RA ratios greater than 3. In other words, 90% of volcanoes are located in regions where ³He/⁴He is elevated.
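The comparison above amounts to counting the fraction of volcanic events that lie in each range of R/RA. A minimal sketch is given below; the R/RA values, the bin edges and the use of per-bin event fractions as a likelihood lookup table are all assumptions for illustration and are not the measured Tohoku data.

```python
import numpy as np


def fraction_above(values, threshold):
    """Fraction of events whose R/RA value exceeds the threshold."""
    v = np.asarray(values, dtype=float)
    return float((v > threshold).mean())


def binned_likelihood(event_values, bin_edges):
    """Fraction of events falling in each R/RA bin."""
    counts, _ = np.histogram(event_values, bins=bin_edges)
    return counts / counts.sum()


def likelihood_at(grid_values, bin_edges, table):
    """Look up the binned likelihood for interpolated R/RA values on a grid."""
    idx = np.digitize(grid_values, bin_edges) - 1
    idx = np.clip(idx, 0, len(table) - 1)   # clamp values outside the bins
    return table[idx]


# Hypothetical R/RA values sampled beneath ten volcanic events.
event_rra = [0.8, 3.5, 4.2, 5.1, 6.0, 3.2, 4.8, 5.5, 6.3, 7.0]
edges = np.arange(0.0, 9.0, 1.0)            # bins 0-1, 1-2, ..., 7-8
frac = fraction_above(event_rra, 3.0)
table = binned_likelihood(event_rra, edges)
like = likelihood_at(np.array([0.5, 3.7, 7.5]), edges, table)
```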
The R/RA ratios are interpolated to represent a continuous, differentiable surface and then the spatial data are mapped into a likelihood function based on the percentage of recent volcanic events that lie within the binned R/RA ratios in Figure 8. For low P velocity perturbation,  assumed an inverse linear relationship; based on the interpretation that low
4.6. A posteriori probabilities
Using equations (1) to (3) above, two dimensional probability plots are subsequently constructed showing the probability of one or more future volcanic event(s) forming during the long-term, given that a volcanic event will occur in the Tohoku volcanic arc during 100 ka. Figure 9 shows a comparison of the
The probability of new volcanic event formation in the forearc region to the east of the volcanic front is reduced slightly in the
The R/RA analyses are compared with the probability calculations conditioned on
 found that
The main advantage of probabilistic models over deterministic models is that the probability of new volcanic event formation is never zero. Bayesian inference has been shown to be well suited for formally combining observations relevant to the imaging of the magma source region (e.g. seismic tomography) with quantitative methods for estimation of volcano intensity. Moreover, the strength of Bayesian inference is that probabilistic assessments can be improved with increased understanding of the physical processes governing magmatism and/or data that may be indicative of future volcanism, such as the helium isotope ratios presented here. Nevertheless, it is worth examining the logic behind what we perceive to be ‘data’ and what we mean by
5.1. Which datasets are a priori information?
The earlier analysis used the volcano geographical datasets themselves as a starting point. The same approach was applied in this chapter. In the first step a Cauchy kernel was used to calculate the a priori PDF. This means that the probability of new volcanic event formation decreases with increasing distance from existing volcanic events. In the case of selecting a location for a geological repository, there may be a need to have a conservative estimate and accept that extreme events may occur. In this case, selection of the Cauchy as the
It could equally be argued, however, that this logic should be reversed, in that the models based on seismic tomography or elevated helium isotope ratios are in fact
On the other hand there are also practical aspects to be considered particularly when starting a hazard analysis in a region where there have not been many studies. In such a case, the only data available to begin with might be just the geographical location of volcanoes. Information from more complicated and expensive surface based investigations might not come until later.
5.2. Model evaluation
Since it is not possible to infer directly the location of future volcanic events that will form in the next 0.1 to 1 Ma, models can instead be evaluated by calculating the probability of the new volcanic events that formed after some time in the past, using all volcanic events that formed before that time [1, 38]. Since we calculate the probability of future volcanism in the next 100 ka in most of the analyses described here, 100 ka is selected as the timeframe in the verification calculations. In Tohoku, as there are a large number of dated volcanic events, it is possible to verify the Bayesian models to a certain extent by using all volcanic events that formed before 100 ka to predict the location of volcanic events that formed between 100 ka and the present day. Since the ‘new’ volcanic events are still in the past, it is possible to compare probability plots with the locations of the volcanic events we are attempting to forecast. Figure 13 shows probability plots for the Cauchy PDF (h=1.5 km) and the
In both cases, all subsequent volcanoes formed in regions where the probability was at least 10%. Approximately 50% of newly formed volcanic events formed in regions where the probability was at least 25%. There was an approximately 10% increase in probabilities in the locations where volcanoes formed in the
The probability calculations above were made using single inferences on one set of data. However, Bayes’ theorem allows beliefs to be updated as additional information becomes available. This was attempted by combining geothermal and seismic tomography datasets (Figure 14).
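Updating with more than one dataset can be sketched as repeated application of Bayes' theorem on the grid, with each posterior becoming the prior for the next dataset. The sketch below assumes the datasets are conditionally independent given the event location; the four-cell grid, the likelihood values and the function name are hypothetical.

```python
import numpy as np


def bayes_update(prior, likelihood):
    """Posterior over grid cells: prior times likelihood, renormalised."""
    unnorm = np.asarray(prior, dtype=float) * np.asarray(likelihood, dtype=float)
    total = unnorm.sum()
    if total <= 0.0:
        raise ValueError("prior and likelihood have no overlapping support")
    return unnorm / total


# Hypothetical a priori PDF over four grid cells and two likelihood surfaces.
prior = np.array([0.1, 0.2, 0.3, 0.4])
like_a = np.array([0.9, 0.5, 0.5, 0.1])   # e.g. derived from one geophysical dataset
like_b = np.array([0.2, 0.8, 0.6, 0.4])   # e.g. derived from a second dataset

post_a = bayes_update(prior, like_a)       # first inference
post_ab = bayes_update(post_a, like_b)     # posterior reused as the new prior
```

Under the independence assumption the order of the updates does not change the final posterior, which is one way to check an implementation.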
By conditioning on
5.3. Varying the temporal recurrence rate
It can thus be argued that for periods beyond 0.1 Ma, it is unreasonable to treat the temporal recurrence rate in equation (3) as constant or steady state. One option might be to assign, say, a Weibull function in which recurrence rates can increase or decrease with time, if there are sufficient age data to indicate temporal trends statistically. Alternatively, one could assume that the temporal recurrence rates are entirely random with a tendency to cluster temporally [e.g. 22, 62]. Moreover, time clustering has been shown to have an impact on the spatial intensity of volcanoes.
A challenge with utilizing temporal data, though, is the quantity and quality of the age datasets and being consistent enough with the temporal definitions, since eruptions may last for several days, weeks, months, years or even longer. Having a consistent temporal definition is especially challenging when handling volcanic datasets on the regional scale described in this chapter. As highlighted in section 3, even for monogenetic volcanoes, the temporal definition is not so straightforward. It is for this reason that it has been argued that a drawback of nearest-neighbour models that are a function of both spatial and temporal parameters is that they require the ages of every single volcanic event within the volcanic field in question. Nevertheless, in certain cases, such as tectonically controlled basaltic fields, eruptions can be time predictable; hence there is potential to improve on the Bayesian model presented here by taking into account time clustering in the temporal rate parameter.
Bayes’ theorem is a powerful statistical tool for incorporating additional datasets. In this chapter R/RA ratios were used in probabilistic volcanic hazard assessments applying a previously developed methodology. These were compared with earlier assessments in Tohoku incorporating low P-velocity perturbations at 10 km and 40 km depth and geothermal gradients. Probabilities of one or more volcanic event(s) forming in Tohoku for both analyses were found to be similar, ranging from 10⁻¹⁰–10⁻⁹/a between clusters to 10⁻⁵/a within clusters. The Cauchy kernel, combined with multiple datasets, successfully captures all subsequent volcanic events, including extreme events. This is particularly important when making calculations over 1 Ma, when the tectonic setting is likely to change, resulting in a potential shift of the volcanic front. Although the Cauchy kernel appears to be over-conservative for regions east of the volcanic front, where probabilities are expected to be negligible, values are reduced when R/RA ratios are included.
Diagrams of the probability plots were made using Generic Mapping Tools (GMT). The authors thank two anonymous reviewers for their constructive comments, which improved the manuscript.
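A time-varying recurrence rate of the Weibull type can be sketched as a power-law (Weibull) Poisson process, in which a shape parameter above 1 gives an increasing rate, below 1 a decreasing rate, and exactly 1 recovers the constant-rate case. The parameterisation, parameter values and function names below are assumptions for illustration and are not fitted to Tohoku age data.

```python
import math


def weibull_rate(t, k, theta):
    """Recurrence rate (events per year) at time t > 0 since onset of activity.

    k     : shape parameter (k > 1 increasing, k < 1 decreasing, k = 1 constant)
    theta : scale parameter in years
    """
    return (k / theta) * (t / theta) ** (k - 1.0)


def expected_events(t0, t1, k, theta):
    """Expected number of events between t0 and t1 (integral of the rate)."""
    return (t1 / theta) ** k - (t0 / theta) ** k


def prob_one_or_more(t0, t1, k, theta):
    """Probability of at least one event between t0 and t1."""
    return 1.0 - math.exp(-expected_events(t0, t1, k, theta))


# Hypothetical parameters: with k = 1 the rate is constant at 1/theta.
rate_const = weibull_rate(5.0e5, k=1.0, theta=1.0e4)
n_const = expected_events(0.0, 1.0e5, k=1.0, theta=1.0e4)
```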