Open access peer-reviewed chapter

Optimization Using Genetic Algorithms – Methodology with Examples from Seismic Waveform Inversion

Written By

Subhashis Mallick

Submitted: 23 October 2023 Reviewed: 06 November 2023 Published: 01 December 2023

DOI: 10.5772/intechopen.113897

From the Edited Volume

Genetic Algorithms - Theory, Design and Programming

Edited by Yann-Henri Chemin


Abstract

Genetic algorithms use the survival-of-the-fittest analogy from evolution theory to make random walks in a multiparameter model-space and find the model, or the suite of models, that best fits the observations. Because of their nonlinear nature, the runtimes of genetic algorithms increase exponentially with increasing model-space size. Computation can be expedited by a diversity-preserved genetic algorithm, in which each member of the population is assigned a measure of diversity and models are selected in preference to both their objective and diversity values, and by scaling the objectives with a suitably chosen scaling function. Starting from an initial model, with the model-space defined as search intervals around it, and using a new sampling strategy that generates a smoothly varying initial set of random models within the specified search intervals, the proposed diversity-preserved method converges rapidly and estimates reliable models. The methodology and implementation of this new genetic-algorithm optimization are described using examples from prestack seismic waveform inversion problems. In geophysics, this new method can be useful for subsurface characterization where well-control is sparse.

Keywords

  • global optimization
  • genetic algorithm
  • inversion
  • parameter estimation
  • diversity preservation
  • sampling strategy

1. Introduction

Genetic algorithm (GA) is a global optimization method based on a natural analogy from genetics and evolution theory [1, 2]. In geophysics, GA has been used to solve a variety of single- and multi-objective inverse problems [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17].

GA belongs to the class of model-based optimizations in which there are three distinct components: (1) model, (2) data, and (3) objective. It is also assumed that the model and the objective are related to one another via the data and the underlying physics of the problem. The model or decision space is usually denoted as X. In seismic inverse problems under an isotropic elastic assumption, for example, the model consists of the vectors of the P-wave velocity (VP), S-wave velocity (VS), and density (ρ) for each subsurface depth (or time) sample, and the model space is the entire feasible range of their variability. For an anisotropic elastic inversion, the model would include other anisotropic parameters in addition to VP, VS, and ρ. For electromagnetic (EM) and gravity inverse problems, the model would consist of electrical resistivities or densities, respectively. Mathematically, the model or decision space is thus defined as

X = [x1, x2, …, xN]^T.  (1)

In Eq. (1), the superscript ^T represents a transpose, each xi, i = 1, 2, …, N represents one model vector, and N is the total number of model vectors. Thus, for the isotropic elastic seismic inverse problem at a single location, N = 3 with x1 = [VP1, VP2, …, VPM]^T, x2 = [VS1, VS2, …, VSM]^T, and x3 = [ρ1, ρ2, …, ρM]^T, where for any depth or time sample j, j = 1, 2, …, M, VPj, VSj, and ρj respectively denote the P-wave velocity, S-wave velocity, and density at that sample, and M is the total number of samples. The data space is defined as

D = [d1, d2, …, dJ]^T,  (2)

where each di, i = 1, 2, …, J is the vector representing the ith dataset being optimized. Finally, the objective space is defined as

Y = [y1, y2, …, yJ]^T,  (3)

where each yi, i = 1, 2, …, J is a scalar-valued quantity representing the objective of the ith dataset. To compute the objective, it is assumed that there exists a unique mapping of the model space onto the objective space, i.e., X → Y, via the underlying physics of the problem and the data space D. The aim of any model-based optimization is to find the model (or suite of models) in the model (decision) space that either minimizes or maximizes the objective.

The problem defined above is a multi-parameter and multi-objective optimization, where multiple parameters (Eq. 1) are simultaneously estimated from multiple datasets (Eq. 2) by optimizing multiple objectives (Eq. 3). Such multi-parameter, multi-objective optimizations have previously been used in geophysics to solve a variety of problems, such as estimating anisotropic properties of the mantle lithosphere from the splitting parameters of teleseismic S-waves and P-wave residual spheres [18], wave-equation migration velocity inversion [19], estimating earthquake focal mechanisms [20], and inverting multicomponent seismic and electromagnetic (EM) data [10, 11, 12, 14, 17].

This chapter restricts the discussion to the multi-parameter, single-objective optimization problem, in which multiple parameters in the model space are to be estimated (Eq. 1) using a single dataset and a single objective, i.e., J = 1 in Eqs. (2) and (3). Although the examples provided are from the seismic inversion problem, the extension to other optimization problems is straightforward.


2. Multi-parameter and single-objective optimization problem

Restricting to seismic inversion under the isotropic assumption, the model consists of three parameters: (1) the longitudinal or P-wave velocity (VP), (2) the transverse or S-wave velocity (VS), and (3) the density (ρ). Thus, for single-component isotropic elastic seismic inverse problems, the model and data can be redefined as

m = [VP, VS, ρ]^T,  (4)

and

d = [d1, d2, …, dN]^T.  (5)

In Eq. (4), the vectors VP = [VP1, VP2, …, VPM]^T, VS = [VS1, VS2, …, VSM]^T, and ρ = [ρ1, ρ2, …, ρM]^T respectively denote the P-wave velocity, S-wave velocity, and density for each depth (or time) sample i, i = 1, 2, …, M, and the vector d in Eq. (5) is the input seismic data with N samples.

Having defined the model and data, a unique forward modeling operator s = g(m) is then defined, which allows computing the synthetic or predicted data vector s = [s1, s2, …, sN]^T at exactly the same points as those of the data vector d of Eq. (5). This forward modeling operator varies depending upon the flavor of the inversion method. For post-stack or pre-stack amplitude-variation-with-angle (AVA)/elastic-impedance inversion [9, 20, 21, 22, 23, 24], g(m) is the convolutional modeling at normal or nonnormal incidence angles [25]. For wave-equation-based inversion such as full waveform inversion (FWI) [26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39], g(m) is the numerical solution of the elastic or acoustic wave equation using finite-difference or finite-element modeling. Finally, for prestack waveform inversion (PWI), which is a subset of FWI under the assumption of a locally horizontal (1D) stratification at each location [3, 4, 5, 6, 7, 8, 13, 16, 40, 41], g(m) is the analytical solution to the 1D wave equation [42, 43].
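For the simplest of these forward operators, the normal-incidence convolutional model, a minimal sketch might look like the following. The Ricker wavelet, its dominant frequency, and the sampling values are illustrative assumptions, not prescribed by the chapter:

```python
import numpy as np

def reflectivity(vp, rho):
    """Normal-incidence reflection coefficients from acoustic impedance."""
    z = np.asarray(vp) * np.asarray(rho)        # acoustic impedance per layer
    return (z[1:] - z[:-1]) / (z[1:] + z[:-1])  # R_i = (Z_{i+1}-Z_i)/(Z_{i+1}+Z_i)

def ricker(f_dom, dt, length=0.128):
    """Ricker wavelet with dominant frequency f_dom (Hz) sampled at dt (s)."""
    t = np.arange(-length / 2, length / 2, dt)
    a = (np.pi * f_dom * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def g(vp, rho, f_dom=30.0, dt=0.002):
    """Forward operator s = g(m): reflectivity convolved with a wavelet."""
    r = reflectivity(vp, rho)
    w = ricker(f_dom, dt)
    full = np.convolve(r, w)           # full convolution
    start = len(w) // 2                # align wavelet peak with each reflector
    return full[start:start + len(r)]  # same length as the reflectivity series
```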

After defining the synthetic or predicted data s = g(m), the misfit or error between the observed and synthetic data is defined as e = d − s, following which the objective is defined as the sum of the squared errors

y = e^T e,  (6)

and the optimization can be cast as the minimization of the objective y. Alternatively, the objective can be defined either as e†e [17] or as the normalized cross-correlation [3]

y = (d†s + s†d) / (d†d + s†s),  (7)

and the optimization can be cast as the maximization of y. The superscript † in Eq. (7) denotes a complex conjugate transpose. It is assumed here that the real and synthetic data vectors d and s can be complex-valued. When they are real-valued, d† = d^T and s† = s^T. Also, the objectives defined by Eqs. (6) and (7) represent a pure least-squares optimization, which is unstable for most practical problems. Therefore, additional regularization or damping terms are used in defining the objective. These damping terms stabilize the optimization by addressing the inherent noise that is present in the data.
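The two objectives of Eqs. (6) and (7) can be sketched directly in NumPy; `np.vdot` conjugates its first argument, which supplies the † products for complex-valued data:

```python
import numpy as np

def objective_lsq(d, s):
    """Sum of squared errors, Eq. (6): y = e^T e with e = d - s (minimize)."""
    e = np.asarray(d) - np.asarray(s)
    return np.real(np.vdot(e, e))

def objective_xcor(d, s):
    """Normalized cross-correlation, Eq. (7) (maximize)."""
    d, s = np.asarray(d), np.asarray(s)
    num = np.vdot(d, s) + np.vdot(s, d)   # d†s + s†d
    den = np.vdot(d, d) + np.vdot(s, s)   # d†d + s†s
    return np.real(num / den)
```

For identical real-valued data and synthetics, the cross-correlation objective reaches its maximum of 1.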

Figure 1 illustrates single-objective and multi-parameter global optimization with reference to PWI. As shown in Figure 1, an initial model of VP, VS, and ρ, each as a function of depth, is first estimated, and search windows around it are provided to define the model-space. When well-logs with borehole measurements of VP, VS, and ρ are available, they can be used to generate this initial model. In the absence of well-logs, the initial VP can be estimated from seismic data using velocity analysis or more advanced velocity-estimation procedures such as traveltime tomography [44], and then established VP-VS [23, 24, 45, 46] and VP-ρ [47] relations can be used to estimate the initial VS and ρ. Based on the local geology and the knowledge of the variability of VP, VS, and ρ, the model-space or search windows can then be defined around these initial models, and a suite of models is generated within them. This suite of models can be generated in different ways by providing an appropriate probability distribution around the initial model. For the most unbiased case, the distribution could be uniform (boxcar) between the minimum and maximum values, and the models are generated such that the values of VPi, VSi, and ρi at any depth i can take any random value within the minimum and maximum search bounds specified for that depth with equal probability. Alternatively, a Gaussian distribution, with the mean and standard deviation dictated by the initial model and the search window, can be defined for each depth, and depth-dependent models of VP, VS, and ρ can be randomly drawn from it. After generating this suite of models, synthetic seismograms are computed for each model and matched with the observed seismic data.
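The two sampling strategies just described (uniform between per-depth bounds, or Gaussian around the initial model) might be sketched as follows; mapping the search window to a standard deviation of half the window is an illustrative assumption, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_models(lo, hi, n_models):
    """Draw n_models depth profiles uniformly between per-depth bounds."""
    lo, hi = np.asarray(lo), np.asarray(hi)
    return rng.uniform(lo, hi, size=(n_models, lo.size))

def gaussian_models(initial, window, n_models):
    """Draw depth profiles from a Gaussian centered on the initial model;
    the search window sets the standard deviation (window/2 is assumed)."""
    initial = np.asarray(initial)
    sigma = np.asarray(window) / 2.0
    return rng.normal(initial, sigma, size=(n_models, initial.size))
```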
When there are adequate offset samples, prestack seismic data in the time-offset domain can be decomposed into plane-waves in the intercept-time and ray-parameter (τ-p) domain [5]; synthetic seismograms can also be computed in the τ-p domain and matched with the real seismic data using Eq. (7) to compute the objective for each model. In many practical situations, however, the offset samples may not be adequate for an accurate plane-wave decomposition in the τ-p domain, and it must instead be done approximately in the angle domain [8]. Thus, using the P-wave velocity field of each model, the offset-domain seismic data are converted into the angle domain using an offset-to-angle transformation method [48, 49, 50]. Synthetic seismograms are also computed in the angle domain and matched with the observed angle-gathers using Eq. (7) to compute the objective. Following the objective evaluation, the objectives are scaled using a suitably chosen scaling function. Objective scaling is, in principle, equivalent to objective regularization and will be discussed in detail later in relation to genetic algorithms. Following scaling, the models generated so far are checked to verify whether convergence has been achieved. If yes, the method reports the solutions and exits. If not, the models are iteratively modified until one or more of the convergence criteria are satisfied (see Figure 1).

Figure 1.

Illustration of the single-objective and multi-parameter global optimization for prestack waveform inversion of seismic data.

The way the models are modified in global optimization (the box shown in a different color in Figure 1) depends upon the flavor of global optimization being used. In simulated annealing (SA), for example [4, 6], the analogy of the fundamental physics of crystallization from a melt via slow and steady cooling is used to generate new models at each iteration. In the genetic algorithm (GA), on the other hand [1, 2, 3, 7, 8], the analogy of natural selection and survival of the fittest from evolution theory is used. In particle swarm optimization (PSO), the models are modified using the analogy of the social behavior of a flock of birds or a school of fish [51, 52].

In theory, global optimizations do not depend upon the choice of the initial model the way local, gradient-based optimizations do. The only requirement for global methods is that the model-space, provided as search bounds or prior probability distributions, is wide enough to contain the true model. In seismology, these global methods can be useful for characterizing the subsurface in new exploration areas where well-control is sparse or unavailable. Even in mature areas with adequate well control, if production data are to be integrated with time-lapse seismic data through history matching via iterative reservoir simulations, geomechanical modeling, and seismic inversion, the wells that existed before production may not represent the true subsurface model, and a global optimization would be useful to predict dynamic reservoir properties. The superiority of GA over a linearized local inversion in predicting ocean salinity and temperature from water-column reflections with no prior information has been clearly demonstrated by Padhi et al. [41]. Yet the use of global methods in seismic inversion is still limited. The primary reason is that all global methods are nonlinear optimizations and computationally challenging for moderate- to large-sized seismic inverse problems. The advantage of using a global method is that it can find a reasonable subsurface model even when the initial starting model is far from the true model. However, finding the true (global) model starting from a faraway initial guess requires (1) defining sufficiently large search bounds within the model-space, (2) generating a very large number of models within these bounds, and (3) iterating these models many times such that the model-space is adequately sampled (Figure 1). In addition, the runtime of all global optimizations increases exponentially with increasing model-space size (i.e., with increasing search bounds) and with an increasing number of models to be iterated within this model-space [12].

Restricting specifically to GA, the steps shown in Figure 1 are now elaborated with particular emphasis on seismic inversion. As shown in Figure 1, the first step for GA is to obtain an estimate of the initial model and define the model-space. This initial model can be generated from well-logs. But when well-logs are present, the initial model is close to the true model; under such conditions, there are many compute-efficient local gradient-based methods to handle the inverse problem, and the use of a global optimization is unnecessary. The power of a global optimization lies in situations where well-logs are unavailable, and the initial model must be estimated differently.

Figure 2 illustrates a simple way to estimate the initial VP via velocity analysis and normal moveout (NMO) correction. In this method, the input seismic data (Figure 2a) are interactively NMO-corrected using different VP fields until a time-varying VP field, shown in Figure 2b, is obtained that optimally flattens (NMO-corrects) the input gather (Figure 2c). Once the VP field is estimated directly from data using the procedure shown in Figure 2 or using other sophisticated methods such as tomography, the initial VS and ρ can be estimated using established VP-VS and VP-ρ relations. To keep things simple, the VP-VS relation VS = VP/2 and Gardner's relation [47], ρ = α VP^β, can be used to estimate the initial VS and ρ. The relation VS = VP/2 is commonly used in AVA/elastic-impedance inversion [23, 24, 46]. In Gardner's relation, β = 0.25, and α is 0.23 or 0.31 for VP in ft/s and m/s, respectively, for most Gulf of Mexico sedimentary rocks.

Figure 2.

Estimation of initial VP from prestack seismic data. (a) Input seismic gather. (b) Estimated VP. (c) Seismic gather after NMO correction using the estimated VP.
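The two simple relations quoted above can be written down directly (Gardner's coefficients here are for VP in m/s):

```python
def initial_vs(vp):
    """The simple VP-VS relation used here: VS = VP / 2."""
    return vp / 2.0

def gardner_rho(vp_mps):
    """Gardner's relation rho = alpha * VP**beta, with alpha = 0.31 and
    beta = 0.25 for VP in m/s (most Gulf of Mexico sedimentary rocks)."""
    return 0.31 * vp_mps ** 0.25
```

For example, a VP of 3000 m/s gives an initial VS of 1500 m/s and a density of about 2.29 g/cm³.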

The cyan curves in Figure 3 represent the initial model for inverting the prestack seismic data shown in Figure 2a. Figure 3a is the initial VP, which is the same as the VP shown in Figure 2b after time-to-depth conversion. The initial VS and ρ, shown in Figure 3b and c, were computed from the initial VP of Figure 3a using the VP-VS and VP-ρ relations discussed above. The cyan curve of Figure 3d is the Poisson's ratio, computed from the initial VP and VS of Figure 3a and b using the formula [53]

Figure 3.

The well-log model (black), initial model (cyan), and the model space search limits (dashed red) for (a) VP, (b) VS, (c) ρ, and (d) Poisson’s ratio.

ν = [1 − 2(VS/VP)²] / {2[1 − (VS/VP)²]}.  (8)

Note that because the initial VS is computed from the initial VP by setting VS = VP/2, per Eq. (8), the initial Poisson's ratio shown in Figure 3d is a constant 1/3. The location of the seismic data shown in Figure 2 coincides with a well location, and the well-log VP, VS, ρ, and the computed Poisson's ratio are shown as black curves in Figure 3a–d.

Besides the initial model, GA needs the model-space to be defined. The upper and lower bounds, shown as dashed red curves on each panel of Figure 3, are computed by varying (1) the initial VP by ±10% (Figure 3a), (2) the initial ρ by ±15% (Figure 3c), (3) the initial Poisson's ratio from −70% to +25% (Figure 3d), and (4) by computing the upper and lower bounds of VS (Figure 3b) from the initial VP (Figure 3a) and the lower and upper bounds of the Poisson's ratio (Figure 3d). Note that the model space defined by these upper and lower bounds (dashed red curves) is wide enough to span the range of variations in the true (well-log) model.
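A sketch of how these search bounds might be constructed from the initial model, using the percentages quoted above. Pairing the VP and Poisson's-ratio extremes to obtain the VS bounds (a lower ν gives a higher VS/VP) is an assumed implementation detail:

```python
import numpy as np

def vs_from_vp_nu(vp, nu):
    """Eq. (9): VS from VP and Poisson's ratio."""
    return vp * np.sqrt((1.0 - 2.0 * nu) / (2.0 * (1.0 - nu)))

def search_bounds(vp0, rho0, nu0):
    """Model-space bounds per the text: VP +/-10%, rho +/-15%,
    Poisson's ratio -70% to +25%; VS bounds follow from the VP and
    nu bounds, pairing (vp_lo, nu_hi) and (vp_hi, nu_lo)."""
    vp_lo, vp_hi = 0.90 * vp0, 1.10 * vp0
    rho_lo, rho_hi = 0.85 * rho0, 1.15 * rho0
    nu_lo, nu_hi = 0.30 * nu0, 1.25 * nu0
    vs_lo = vs_from_vp_nu(vp_lo, nu_hi)   # slowest VP with highest nu
    vs_hi = vs_from_vp_nu(vp_hi, nu_lo)   # fastest VP with lowest nu
    return (vp_lo, vp_hi), (vs_lo, vs_hi), (rho_lo, rho_hi), (nu_lo, nu_hi)
```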

After estimating the initial model and the model-space (Figure 3), the next step in GA is generating the initial population of models. As mentioned earlier, these initial models can be generated by letting the initial VP, VS (or Poisson's ratio), and ρ vary randomly between their respective lower and upper bounds such that each of them can take any value between these bounds with equal probability. These models could be generated from the lower and upper bounds of VP and ρ and the lower and upper bounds of either VS or Poisson's ratio. Note that for the single-component (i.e., vertical or pressure response) seismic data considered here, the variations of reflection amplitudes are controlled by the variations of the P-P reflection coefficients (RPP) at subsurface interface boundaries as functions of the angle-of-incidence (θ). While the exact mathematical expression for RPP is complex, linearized approximations suggest that RPP(θ) is primarily controlled by (1) the P-wave velocity contrast, (2) the density contrast, and (3) the VS-VP ratio (VS/VP) or Poisson's ratio contrast [53, 54, 55, 56, 57, 58, 59]. For both AVA waveform inversion and PWI, Mallick [7, 8] found that parameterizing the random models with VP, Poisson's ratio, and ρ, and then computing VS from VP and Poisson's ratio, provides a more stable inversion than parameterizing them directly in VP, VS, and ρ. From Eq. (8), it is straightforward to show that VS can be obtained from VP and Poisson's ratio ν as

VS = VP √[(1 − 2ν) / (2(1 − ν))].  (9)

Thus, in the applications shown here, VP, ν, and ρ were randomly generated, and Eq. (9) was then used to calculate VS. Also note that for 1D seismic inverse problems, layer thickness is also necessary. However, instead of treating layer thickness as a model parameter to be estimated, it is preferable to calculate it using a user-defined wavelength fraction at the dominant seismic frequency. Therefore, from the value VPi for any layer i, the thickness of layer i can be computed as ti = g λi, where g is the user-defined wavelength fraction and λi is the wavelength at the dominant seismic frequency fdom, i.e., λi = VPi/fdom. Thus, the layer thickness also varies with VP but is controlled by the dominant seismic frequency and the user-defined wavelength fraction. For seismic inverse problems, it has been shown that allowing layer thickness to vary randomly along with VP, VS, and ρ is unstable, and stable results are obtained by fixing the layers at the observed reflection boundaries in the time domain [3, 4, 60]. Thus, in some old GA and SA implementations [3, 4, 60], the layers were of fixed time-thickness. Later, Mallick [8] obtained stable results by resolving models at a given fraction of the dominant wavelength. More recently, trans-dimensional seismic inversion methods [61, 62, 63, 64, 65] have been developed in which the number of layers, besides VP, VS or ν, and ρ, is treated as an additional model parameter and estimated. In the applications shown here, however, a fixed number of layers with thickness set by the user-defined wavelength fraction is used.
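Eq. (9) and the wavelength-fraction thickness rule can be sketched as follows; the value g = 0.25 is an illustrative choice, not one prescribed by the text:

```python
def vs_from_nu(vp, nu):
    """Eq. (9): VS = VP * sqrt((1 - 2*nu) / (2*(1 - nu)))."""
    return vp * ((1.0 - 2.0 * nu) / (2.0 * (1.0 - nu))) ** 0.5

def layer_thickness(vp_i, f_dom=30.0, g=0.25):
    """t_i = g * lambda_i with lambda_i = VP_i / f_dom; g is the
    user-defined wavelength fraction (0.25 assumed for illustration)."""
    return g * vp_i / f_dom
```

For example, with ν = 1/3 (the initial Poisson's ratio above), Eq. (9) recovers VS = VP/2.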

Figure 4 shows randomly generated VP models along with the search window, initial model, and true (well-log) model. For any layer i, the P-wave velocity VPi for the layer is allowed to take any random value between the minimum and maximum search limits VPi,min and VPi,max for that layer. Generating such values for all layers and for all other model parameters (i.e., for VS/ν and ρ) provides one random model, and repeating the same for many models gives the model population. In Figure 4, ten random VP models are shown, with one of the models in orange and the other nine in grey. Note that when a sufficiently large population of such random models is generated, they span the entire model space.

Figure 4.

Ten random VP models generated within the specified search window. One of the models is shown in a different color than the others to show the variability of the model within the specified search limits. The well-log and initial VP are also shown.

After generating the random population of models, synthetic data are computed for each model and matched with the observation to compute the objective (Figure 1). However, before discussing objective computation, it is important to discuss one GA-specific step known as parameter coding/decoding. One popular version of GA works with coded parameters instead of the parameters themselves, and one good way of coding parameters is binary coding, illustrated in Figure 5. In binary coding, the model parameters (VP, VS or ν, and ρ) for each layer are coded as binary digits (bits). In Figure 5, for example, 16 bits are assigned to VP and 8 bits each to VS/ν and ρ. Figure 5 shows the binary-coded parameters for layer-1 only, but those for all layers are concatenated to form a single binary string representing one model or one member of the population. In GA, this entire binary string representing one member is called a chromosome, and each bit within the string is called a gene. The number of bits (genes) Nbits in a single model (chromosome) is thus Nbits = NL(NVP + NVS + Nρ), in which NL is the number of layers and NVP, NVS, and Nρ are the numbers of bits used to code VP, VS or ν, and ρ for each layer. Thus, in binary coding, a coin is tossed Nbits times with a 50% probability of heads (1) and 50% of tails (0), and the results (i.e., zeros and ones) of the coin-tosses are placed next to one another to represent one model (chromosome). Repeating this 2N times, where N is an integer, produces a population of 2N models (Figure 5). For a single model parameter (VP, VS/ν, ρ) in each layer i, having all bits set to 0 (zero) corresponds to the minimum parameter value, and having them all set to 1 (one) corresponds to the maximum parameter value specified by the search window for that layer (Figure 4). Thus, the bits with random zeros and ones from the coin-tosses can be decoded to their actual values by linearly interpolating between these minimum and maximum values.
For example, assume the following parameters for binary coding:

  • For each layer, VP, ν, and ρ are coded with 8 bits.

  • The generated binary string for a single layer is 100101110011100111001001, and the search limits for the layer are 2500–5500 m/s for VP, 0.1–0.42 for ν, and 2.0–2.5 g/cm3 for ρ.

Figure 5.

Illustration of binary coding.

Under these conditions, VP for the layer is represented by the first eight bits, 10010111; ν for the layer by the next eight bits, 00111001; and ρ by the last eight bits, 11001001. With the leftmost bit taken as the least significant, the equivalent decimal integer dval of the binary string 10010111 representing VP is dval = 1×2⁰ + 0×2¹ + 0×2² + 1×2³ + 0×2⁴ + 1×2⁵ + 1×2⁶ + 1×2⁷ = 1 + 8 + 32 + 64 + 128 = 233. Assuming that all eight bits zero (00000000) corresponds to the minimum search limit VPmin (2500 m/s) and all eight bits one (11111111) corresponds to the maximum search limit VPmax (5500 m/s), VP for the layer is decoded as VP = VPmin + (dval − 1)(VPmax − VPmin)/(2⁸ − 1) = 2500 + 232 × (5500 − 2500)/255 = 5229 m/s. Similarly, it can be shown that ν and ρ for the layer are 0.295 and 2.29 g/cm3, respectively. While such binary coding and decoding allows an easy way to search the model-space via crossover and mutation (see below), such a coding discretely samples the model-space. For any model parameter P with minimum and maximum search limits Pmin and Pmax, the resolution of this discrete sampling is (Pmax − Pmin)/(2^NB − 1), where NB is the number of bits used to code the parameter. Considering that the model space is, in fact, continuous, modern GA implementations are real-coded: they do not use any coding and work directly with real numbers. Thus, in real-coded GA, real values of the model parameters (VP, VS/ν, ρ) within the specified search limits are randomly generated for each layer to represent one model or member of the population and require no decoding. Details of real parameter coding for GA can be found elsewhere [10, 11, 12, 66] and are not repeated here. Irrespective of whether the model parameters are coded as binary strings (Figure 5) or as real numbers, the next step in GA is to compute synthetic data using the underlying physics, match them with the observation, and compute the objective.
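The decoding convention of this worked example (leftmost bit as the least significant) can be verified with a short sketch:

```python
def decode(bits, p_min, p_max):
    """Decode a binary gene string to a real parameter value by linear
    interpolation; the leftmost bit carries 2**0, matching the worked
    example, and the decoding formula follows the text."""
    d_val = sum(int(b) << i for i, b in enumerate(bits))
    n = len(bits)
    return p_min + (d_val - 1) * (p_max - p_min) / (2 ** n - 1)
```

Applied to the example chromosome, the first eight bits decode to about 5229 m/s for VP, the next eight to about 0.295 for ν, and the last eight to about 2.29 g/cm³ for ρ.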

Objective computation is followed by objective scaling or fitness scaling. In GA, the objective is the primary driving mechanism for the model-space search. However, if these objectives are used directly to drive the search, the good models tend to dominate the poor ones, which is undesirable at early generations, when none of the models is expected to be anywhere close to the true (global) optimum; this drives the GA optimization to a local optimum [2]. In fitness scaling, the good models are scaled down and the poor models are scaled up, and the degree of scaling up and down varies with generation. In GA, the original objectives computed using the normalized cross-correlation (Eq. 7 or equivalent) are called the raw fitness, and those after scaling are called the scaled fitness. Although many ways of fitness scaling have been proposed, one simple yet stable method is the linear scaling proposed by Goldberg (1989) [2]. In linear fitness scaling, the scaled fitness (f′) is assumed to be linearly related to the raw fitness (f) as

f′ = a + b f,  (10)

where a and b are constants, and are computed from the constraints

f′max = SC favg,  (11)

and

f′avg = favg.  (12)

In Eq. (11), f′max is the maximum value of the scaled fitness, favg is the average value of the raw fitness, and SC is the scaling constant; f′avg in Eq. (12) is the average scaled fitness, set equal to the average raw fitness favg. The scaling constant SC is a user-defined parameter that controls model selection; mathematically, it represents the expected number of copies of the best member to be selected for crossover and mutation (discussed below). In seismic inversion, setting SC = 0.8 at the beginning and slowly increasing it to 1.8 at the final generation provides good results [7, 8]. As mentioned earlier, fitness scaling is like adding a regularization or damping term to the pure least-squares inverse problem.
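Solving Eqs. (10)–(12) for the two constants gives b = (SC − 1) favg/(fmax − favg) and a = (1 − b) favg, which a minimal sketch can implement (the guard for equal fitness values is an added safeguard, not from the text):

```python
import numpy as np

def linear_scale(f, sc):
    """Linear fitness scaling, Eqs. (10)-(12): f' = a + b*f, with
    constraints f'_max = sc * f_avg and f'_avg = f_avg."""
    f = np.asarray(f, dtype=float)
    f_avg, f_max = f.mean(), f.max()
    if np.isclose(f_max, f_avg):      # all members equally fit: no scaling
        return f.copy()
    b = (sc - 1.0) * f_avg / (f_max - f_avg)
    a = (1.0 - b) * f_avg
    return a + b * f
```

By construction, the scaled population keeps its average while the best member's fitness becomes sc times that average.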

Fitness scaling lends stability to GA optimization. In seismic inverse problems, however, Sen and Stoffa (1992) [4] found that even after fitness scaling, GA may still fail to converge. This is because of genetic drift, a tendency of GA to cluster toward a single region of the model-space when a finite population size is used [2]. One way to avoid genetic drift is to use a very large population size and propagate it over many generations. However, being a nonlinear method, GA becomes prohibitively expensive with increasing population size, so this must be avoided. Instead of using a large population, running several independent GA optimizations with small population sizes and later combining them is one good way to handle genetic drift [4, 7, 17]. However, making several such GA runs is still compute-intensive, and it is advisable to find ways to avoid genetic drift in a single run rather than several. For multi-objective optimizations, one way to avoid genetic drift in a single run is to compute the population diversity, given as the normalized Euclidean distance of a member of the population from its nearest neighbors measured along all objective axes [67]. For a single-objective optimization problem, diversity can be calculated using the pseudo-code shown in Figure 6, in which the population size is NP and the measured diversity is stored as a floating-point array dist of size NP. Note that the diversity value calculated by the pseudo-code of Figure 6 is normalized and can vary between 0 and 1. In addition, the higher the diversity value, the more diverse (i.e., the more isolated from its neighbors) is the model. Thus, multiplying the raw fitness value of each model by its diversity value prior to fitness scaling prefers more diverse models over less diverse ones and adds an additional control against premature convergence of GA. Thus, in diversity-preserved GA, diversity is an additional regularization parameter for the objective besides fitness scaling.

Figure 6.

The pseudo-code for diversity computation.
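One possible rendering of the idea behind Figure 6, assuming diversity is the normalized Euclidean distance to each member's nearest neighbor (the exact pseudo-code in the figure may differ):

```python
import numpy as np

def diversity(pop):
    """Normalized nearest-neighbor distance for each member of the
    population; pop is an (NP, n_params) array of models."""
    pop = np.asarray(pop, dtype=float)
    # pairwise Euclidean distances between all members
    diff = pop[:, None, :] - pop[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)   # exclude each member's self-distance
    dist = d.min(axis=1)          # distance to the nearest neighbor
    return dist / dist.max()      # normalize into [0, 1]
```

A member far from all others receives a diversity near 1, while closely clustered members receive small values, so multiplying raw fitness by diversity penalizes clustering.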

Having discussed genetic drift and how to avoid clustering and premature convergence via objective regularization, the next step in GA is reproduction or tournament selection. In this step, models are selected in preference to their respective regularized (diversity-preserved and scaled) fitness values. Although many methods have been proposed for reproduction, the stochastic remainder selection without replacement [2] has been reported to work very well for seismic inversion problems [3, 7]. In this method, the mathematically expected value Ei for each member i of the population is first calculated as

Ei = NP fi / (f1 + f2 + … + fNP).  (13)

In Eq. (13), NP is the population size (the number of models in the population) and fi is the diversity-preserved scaled fitness of the ith member of the population. After calculating Ei for all members, the integer part of Ei is used as the number of copies of member i to be selected into a new population pool of the same size NP. For example, if for a given member j, Ej = 3.67, then three copies of the jth member would be selected into the new population pool. Repeating this process for all members j, j = 1, 2, …, NP, partially fills the new population pool, in which some of the original members are present once, more than once, or not at all, depending upon whether the integer parts of their mathematically expected values are 1, greater than 1, or less than 1. Next, the rest of the new population is filled as follows:

  1. A member is randomly selected out of the entire original population and a coin is tossed with the probability of heads set to the fractional part of its expected value.

     a. Using the same example again, if the member j is randomly selected where Ej = 3.67, then the coin would be tossed with the probability of heads set to 0.67.

     b. If the outcome of the coin-toss is heads, then one copy of the member j is selected.

  2. Step 1 above is repeated until the entire population is filled, i.e., the selected number of members in the new population is NP.
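The selection steps above can be sketched as follows; how strictly "without replacement" is enforced in the fractional stage is an assumption in this sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

def stochastic_remainder_select(fitness):
    """Stochastic remainder selection: the integer parts of
    E_i = NP * f_i / sum(f) give guaranteed copies (Eq. 13);
    the fractional parts are resolved by biased coin-tosses."""
    f = np.asarray(fitness, dtype=float)
    pop_size = f.size
    expected = pop_size * f / f.sum()          # Eq. (13)
    pool = []
    for i, e in enumerate(expected):           # deterministic integer part
        pool.extend([i] * int(e))
    while len(pool) < pop_size:                # stochastic remainder fill
        i = rng.integers(pop_size)             # pick a random member
        if rng.random() < expected[i] - int(expected[i]):
            pool.append(i)                     # heads: select one copy
    return pool[:pop_size]
```

For example, with fitness values [1, 1, 2] the expected values are [0.75, 0.75, 1.5], so member 2 is guaranteed one copy and the remaining two slots are filled by coin-tosses.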

Reproduction generates a new population from the original (old) population, but the new population is simply a fitness- and diversity-preferred copy of the old one. To explore the model space and create a new generation of models from the old generation, GA uses crossover, mutation, and elitism (update). Typically, crossover and mutation are combined into a single process, which is then followed by elitism to advance to the new generation. In the following, crossover, mutation, and elitism are explained using binary coding. Their extension to real-coded GA can be found in [66].

Figure 7 illustrates binary-coded crossover. First, two members from the reproduced population are randomly selected and treated as parents; in Figure 7, they are called Parent-1 and Parent-2. Next, a crossover site within each parameter space (VP, VS/ν, ρ) for each layer is randomly selected (dashed purple lines in Figure 7). Finally, the bits (genes) on the left-hand side of the crossover site of each parameter are swapped between the parents to produce two children, denoted Child-1 and Child-2 in Figure 7. Crossover is performed with a crossover probability PC: for each layer and each parameter, a coin is tossed with the probability of heads set to PC. If the outcome of the coin-toss is heads, then the crossover for the given layer and model parameter is performed; else the parents' bits for that parameter are simply copied into the children (i.e., Parent-1 into Child-1, and Parent-2 into Child-2). Thus, crossover produces two new (child) members from the two original (parent) members. For seismic inversions, setting PC=0.8 at early generations and slowly reducing it to about 0.65 at the end produces good results [7, 8]. In mutation, each bit (gene) in both child members (Child-1 and Child-2) is sequentially visited and a coin is tossed with the probability of heads set to the probability of mutation PM. If the outcome of the coin-toss is heads, then the bit is flipped, i.e., a 0 is changed to 1 and a 1 to 0. Starting PM at 0.15 and reducing it to about 0.01 at the end is a good choice [7, 8].

Figure 7.

Illustration of crossover.
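The per-parameter crossover and bit-flip mutation just described might look like the following minimal sketch. The function name and the representation of one binary-coded parameter as a list of 0/1 bits are illustrative assumptions:

```python
import random

def crossover_and_mutate(parent1, parent2, pc=0.8, pm=0.05, rng=random):
    """Single-point crossover and bit-flip mutation on one binary-coded
    parameter (a list of 0/1 bits), as illustrated in Figure 7."""
    child1, child2 = list(parent1), list(parent2)
    # Crossover with probability pc: pick a random site and swap the bits
    # on the left-hand side of it between the two parents.
    if rng.random() < pc:
        site = rng.randrange(1, len(parent1))
        child1[:site], child2[:site] = child2[:site], child1[:site]
    # Mutation: visit each bit sequentially and flip it with probability pm.
    for child in (child1, child2):
        for i in range(len(child)):
            if rng.random() < pm:
                child[i] = 1 - child[i]
    return child1, child2
```

In a full inversion this would be applied once per parameter (VP, VS/ν, ρ) per layer, with pc and pm scheduled across generations as recommended above.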

After each pair of children is produced from their parents via crossover and mutation, the objectives for the children are computed. Next, in elitism, the objectives of the children and their corresponding parents are compared, and the two members with the highest objective values are allowed to advance to the next generation. Like crossover and mutation, elitism is performed with a probability of elitism PE; a value of 0.5 at early generations, increasing to about 0.9 at the end, is a good choice for seismic inversion problems [7, 8]. Crossover, mutation, and elitism thus produce two members of the next generation from two members of the old generation, and performing N such operations for a population size NP=2N produces NP members of the new (next) generation from NP members of the old (previous) generation. Once the new generation of models is produced, it is advanced to another generation via reproduction, crossover, mutation, and elitism, and the process continues until a prespecified maximum number of generations Gmax is reached or some other stopping criterion (see below for details) is satisfied.
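A minimal sketch of the elitism step, assuming a hypothetical `elitism` helper and an `objective` callable that returns the raw fitness of a member:

```python
import random

def elitism(parents, children, objective, pe=0.9, rng=random):
    """With probability pe, advance the two members with the highest
    objective values among the two parents and two children; otherwise
    the children advance unconditionally."""
    if rng.random() < pe:
        candidates = sorted(parents + children, key=objective, reverse=True)
        return candidates[:2]
    return list(children)
```

Skipping elitism with probability 1 - PE lets occasionally weaker children through, which preserves exploration early on; raising PE toward the end makes the search increasingly greedy.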


3. Practical implementation of GA

GA optimization is computationally challenging. However, the entire methodology can be parallelized in high-performance computing environments, which, in turn, can lead to a compute-efficient nonlinear global seismic inversion method. Because a global method, unlike a local gradient-based method, can in principle estimate a reliable subsurface depth model of elastic properties even when there is no prior (well) information, implementing an efficient global method like GA to solve seismic inverse problems would be a big leap forward not only for oil and gas exploration but also for solid-earth geophysics, global seismology, and emerging fields such as Carbon Capture, Utilization, and Storage (CCUS), Underground Hydrogen Storage in Porous media (UHSP), and characterization of Enhanced Geothermal Systems (EGS). To implement GA in practice, however, a few important aspects must be addressed. Above, GA is described as a process in which an initial population of NP models, randomly generated within a user-specified model-space (Figure 4), is propagated via reproduction, crossover, mutation, and elitism up to a specified number of generations Gmax. While these are the fundamental GA steps, implementing just them, irrespective of how large NP and Gmax are, may work on synthetic data but is most likely to fail when applied to real seismic data. The fundamental issue that must be addressed in developing a GA that works on a wide variety of real data is that real data are always noisy, and this noise must be handled efficiently. Any model-based optimization, whether local or global, is an iterative process that must have ways to stop iterating and exit so that the method does not suffer unnecessary computational burden. Thus, after each iteration, these algorithms check whether the method has converged to a reasonable solution.
If not, the iteration continues; otherwise, the method reports solutions and exits (Figure 1). In dealing with noisy real data, however, deciding whether to continue iterating or to stop is not the single simple step shown in Figure 1. Noise levels in real seismic data vary widely, controlled by geological factors, environmental factors, and many other factors directly related to how the data were acquired in the field and later processed in a computational facility. Even within a single area, noise levels often differ from one part to another. To handle noisy data, the stopping criteria for GA optimization must include convergence checks at different points within the algorithm, so that less noisy data may converge early and exit while more iterations are allowed for noisier data. Such a multipoint convergence check, if correctly implemented, not only allows additional computation when needed but also avoids unnecessary computation on data that are relatively less noisy.

Considering the above issues, Figure 8 shows the workflow for prestack waveform inversion (PWI) using GA optimization. The workflow is complex; its different parts, or modules, are shown in different colors and outlined by dashed boxes, also of different colors. The first module (Module-1) is color coded in light green and outlined by a red dashed box; it comprises the basic GA optimization steps. The second module (Module-2), color coded in peach and outlined by a gray dashed box, is the first convergence checkpoint. The third module (Module-3), in light blue within a purple dashed box, is the second convergence checkpoint. Finally, the fourth module (Module-4), coded in yellow within a green box, is the output module where the method reports solutions and exits. The main user-defined controlling parameters are (1) NP, the population size; (2) Gmax, the maximum number of generations; (3) Rmax, the maximum number of repeats; (4) Niter, the maximum number of iterations; and (5) Cmin, the minimum correlation, i.e., the minimum value of the objective (raw fitness) to be achieved in the optimization process.

Figure 8.

Methodology for GA optimization with parallel implementation and multiple convergence checkpoints to solve seismic inverse problems.

At the beginning of Module-1 (the module within the red box), an initial random population of NP models is generated and distributed to different processors. Each processor computes the synthetic data and objectives (raw fitness values) for the subset of the model population it receives. The master node then collects results from all processors, computes diversity, and performs fitness scaling and reproduction (tournament selection). Next, the selected models are again distributed to different processors, each of which performs crossover, mutation, and elitism and produces new members of the next generation from a subset of the entire population. The size of the population subset each processor receives is decided by the number of processors being used. For example, if NP=100 and 50 processors are used, then each processor would receive two members from the selected population as parents, produce two child members via crossover and mutation, and then two new members of the new generation via elitism. With fewer than 50 processors, each would receive more than two parents and thus produce more than two new next-generation members. With more than 50 assigned processors, only the first 50 would do the work and the rest would sit idle; thus, for a population size NP, the maximum number of processors the method can use is NP/2, else the additional processors would go unused. In addition, NP should be an even number so that NP/2 is an integer. After the new members are generated within each processor, the master node collects them all and checks whether the method has progressed through Gmax generations. If not, the master node performs tournament selection on the new models and distributes the selected models to different processors to create the models of the next generation. However, if Gmax is reached, the methodology goes into Module-2.
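The processor-allocation rule described above (at most NP/2 active processors, parent pairs assigned round-robin) can be sketched as follows; `distribute_pairs` is a hypothetical helper, not part of any published code:

```python
def distribute_pairs(population, n_processors):
    """Split a reproduced population of NP members into parent pairs and
    assign them round-robin to at most NP/2 processors."""
    np_ = len(population)
    assert np_ % 2 == 0, "NP must be even so that NP/2 is an integer"
    # More than NP/2 processors cannot be used: each active processor
    # needs at least one parent pair.
    n_active = min(n_processors, np_ // 2)
    pairs = [population[i:i + 2] for i in range(0, np_, 2)]
    chunks = [[] for _ in range(n_active)]
    for k, pair in enumerate(pairs):
        chunks[k % n_active].append(pair)
    return chunks
```

With NP=100 and 50 processors each chunk holds exactly one pair; with 25 processors each holds two, mirroring the example in the text.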
In this module, whether the best objective value achieved so far is at least 0.9Cmin is first checked. If yes, the method goes straight to Module-4 (yellow) to report solutions and exit. Otherwise, the generation number is reset to 1 and the current model population is sent back to Module-1 (green) for another set of GA optimization over Gmax generations; this process is repeated at most Rmax times, with the chance of reporting solutions and exiting if 0.9Cmin is reached at any stage. After Rmax repeats, the method goes to Module-3. In this module, the convergence criterion of the objective reaching at least 0.9Cmin is again checked; if it is not met, the maximum-likelihood model achieved so far is set as the new initial model, a fresh set of random models is generated within the model-space, and the method returns to the top of Module-1 to start a fresh GA optimization with Rmax repeats. Module-3 is repeated at most Niter times, with chances of being sent to Module-4 to report solutions and exit at multiple points (see Figure 8).
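The nested loops of Modules 1-3 and the 0.9Cmin early-exit checks can be summarized in a control-flow skeleton. All three callables are placeholders for the actual forward-modeling and sampling machinery, not an implementation of it:

```python
def ga_with_checkpoints(run_generations, best_objective, restart_models,
                        gmax, rmax, niter, cmin):
    """Skeleton of the multipoint convergence check of Figure 8.

    run_generations(pop, gmax) advances the population by gmax generations
    (Module-1), best_objective(pop) returns the best raw fitness achieved,
    and restart_models() draws a fresh random population around the
    current maximum-likelihood model.
    """
    population = restart_models()
    for _ in range(niter):                                # Module-3 loop
        for _ in range(rmax):                             # Module-2 loop
            population = run_generations(population, gmax)  # Module-1
            if best_objective(population) >= 0.9 * cmin:
                return population                         # Module-4: exit
        # Rmax repeats exhausted: restart around the best model found.
        population = restart_models()
    return population  # report the best available after Niter iterations
```

Less noisy data, whose objective climbs quickly, exits at the first checkpoint; noisier data is allowed up to Niter × Rmax × Gmax generations.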


4. Examples

Here, different PWI runs using GA optimization on a single real prestack seismic dataset are shown. The input seismic data, the initial model (generated from velocity analysis and NMO correction and then using established VP-VS and VP-ρ relations), and the search windows used to define the model-space are shown in Figures 2 and 3. In all examples, GA with diversity preservation and linear fitness scaling was used. The fitness scaling constant (SC) and the probabilities of crossover, mutation, and elitism (PC, PM, PE) were all set to the recommended values discussed earlier. Also, the models were parameterized as P-wave velocity (VP), Poisson's ratio (ν), and density (ρ).

Using the workflow of Figure 8, the first two examples were run with NP=80, Gmax=400, Rmax=5, Niter=7, and Cmin=2. Note that the value of Cmin, being a correlation, cannot exceed 1; in these two runs it was deliberately set to 2 to ensure that the methodology goes through all seven iterations (Niter), each with five repeats (Rmax) of GA optimization using a population size (NP) of 80 and 400 generations (Gmax). In both examples, the search windows defining the model space were ±10% for VP, -70% to +25% for ν, and ±15% for ρ around their initial values, shown as dashed red curves in Figure 4. Using 40 Intel Sandybridge/Ivybridge processors, the runtime for each was approximately 45 min.

In the first example, the initial random models were generated using the method shown in Figure 4, where VP, ν, and ρ for each layer were allowed to vary randomly within their respective model-space search limits; the inversion result, along with the initial model and the true (well-log) model, is shown in Figure 9. The inverted model shown in red in this figure is not the raw inverted model but a smoothed version of it, computed by taking a moving average of five samples (layers) across the entire depth range. The reason for showing a smoothed model instead of the actual model is that unconstrained GA inversion, where well-logs are not used at all, requires multiple inversion passes. For the first pass, the initial model, generated from velocity analysis and the VP-VS and VP-ρ relations, is discretized at 0.5λ resolution, where λ is the dominant wavelength as discussed earlier. In successive passes, a smoothed model from the previous pass is discretized at finer resolutions (0.25λ, 0.15λ, etc.) and used as the initial model. If the search window required for a given inversion pass is narrower than the one needed for the previous pass, it can be inferred that the multi-pass inversion is moving toward convergence. Comparing the (smoothed) inverted model with the true (well-log) model in Figure 9, it can be readily verified that VP is estimated reasonably well. However, the estimates of VS (or ν) and ρ are not as good, especially for depths greater than 3 km. Thus, when the initial random models are created using the sampling method of Figure 4, further refinement of the estimated model is possible by making successive GA runs, with the smoothed inverted model from the previous run set as the initial model for the next run with new search windows.
Comparing the true (well-log) model with the initial model and the smoothed inverted model in Figure 9, it can be readily seen that if the smoothed inverted model (the red curve in Figure 9) were used as the initial model for another inversion pass, a much narrower search window than in the first pass would be needed for GA to encompass all variations of the true (well-log) model, and running a few such inversion passes would eventually find a model close to the true model. However, running such multiple inversion passes is not only compute-intensive and cumbersome but also impractical. Although well-logs were not used here to define the initial model, the results could still be compared with the well data and the inversion stopped when the estimated model is sufficiently close to the true well-log model. In the absence of well data, however, it is difficult to decide when the estimated model is sufficiently close to the true model so that the multiple inversion passes can be stopped. Thus, although the inversion result of Figure 9 indicates that, starting from a faraway initial model and using a wide model-space, GA optimization would in theory find the true model via successive inversion runs, the method is still difficult to apply in practice.

Figure 9.

PWI results using the diversity preserved GA optimization and random sampling method for initial model generation shown in Figure 4. (a) P-wave velocity, (b) S-wave velocity, (c) density, and (d) Poisson’s ratio.
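The five-sample moving average used above to smooth the inverted model could be implemented as follows (a sketch; the shrinking window at the ends of the depth range is an assumption, as the chapter does not specify the edge handling):

```python
def smooth_model(model, window=5):
    """Moving average of `window` samples (layers) across the depth range,
    shrinking the window near the ends so output length matches input."""
    half = window // 2
    smoothed = []
    for i in range(len(model)):
        lo, hi = max(0, i - half), min(len(model), i + half + 1)
        smoothed.append(sum(model[lo:hi]) / (hi - lo))
    return smoothed
```

Applied to each of the VP, VS (or ν), and ρ profiles, this yields the red curves of the kind shown in Figure 9.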

In contrast with the method for generating initial models shown in Figure 4, Figure 10 shows a new way to generate them. Like Figure 4, Figure 10 shows ten random VP models generated between the minimum and maximum search bounds, with model-1 in a different color from models 2-10. Note that the random VP models of Figure 10 span the entire search limit like the ones shown in Figure 4. These newly generated random models, however, vary vertically but are much smoother laterally than the ones in Figure 4. Comparing the random models of Figures 4 and 10, the former could be regarded as laterally sampled and the latter as vertically sampled random models.

Figure 10.

A new strategy of generating initial random models, demonstrated using VP. Vertically varying and laterally smooth models are generated randomly between the minimum and maximum search limits. Like Figure 4, the search limits, well-log and initial VP models, and model-1 in a different color than models 2–10 are shown.

Figure 11 shows the PWI result with GA optimization using the same parameters as in Figure 9, except that the initial set of random models was generated via the vertical sampling method of Figure 10. Note that, unlike in Figure 9, the smoothed inverted model of Figure 11 is sufficiently close to the true model that just a single inversion pass using this smoothed inverted model as the initial model should find the true model.

Figure 11.

Same as Figure 9 except when the initial set of random models were generated using the vertical sampling method of Figure 10.

To verify whether the smoothed inverted model shown in Figure 11 would find the true model, another pass of PWI with GA optimization was run using the model shown in red in Figure 11 as the initial model, discretized at 0.25λ resolution. The search windows for VP and ν were set to ±5% and that for ρ to ±2%, and random models of population size NP=40 were generated using the vertical sampling procedure of Figure 10. The other parameters were Gmax=200, Rmax=3, Niter=3, and Cmin=1.0. Like the last two inversions, diversity-preserved GA optimization with linear fitness scaling was used with the same parameters as before. Using 20 Intel Sandybridge/Ivybridge processors, the runtime for this inversion was 10 min; the results are shown in Figure 12.

Figure 12.

Pass-2 inversion, using the smoothed inverted model of Figure 11 as the initial model. The original initial model is also shown in blue.

Unlike Figures 9 and 11, the inverted model in Figure 12 is the actual model estimated from inversion, not a smoothed version of it. This model is almost identical to the true (well-log) model, indicating that the vertical sampling strategy for generating the initial set of random models is superior to lateral sampling and is the practical way to use GA optimization to solve seismic inverse problems when the initial model is far from the true model and the model-space is large.


5. Discussion

Vertical sampling (Figure 10) instead of lateral sampling (Figure 4) is a new concept in GA optimization for PWI. By starting from an initial model obtained directly from the data and well-known VP-VS and VP-ρ relations, using a large model-space defined by wide search windows for each model parameter, and generating a vertically sampled initial set of random models (Figure 10), this newly proposed two-pass GA optimization can find the true (well-log) model with very good accuracy. Comparing lateral sampling (Figure 4) with the new vertical sampling (Figure 10) may suggest that the former encompasses the model space more uniformly than the latter. This is, however, untrue. In Figures 4 and 10 only ten random models are shown; when a sufficiently large number of models is generated, both methods sample the model-space equally well.

Figure 13 compares a single random model of VP, ρ, and ν generated from both the standard (lateral) and newly proposed (vertical) sampling methods, along with the initial model and the search limits. There is a fundamental difference between how the models are generated in these methods and how they behave within the model space. In lateral sampling, values for each parameter and each layer are independently generated between their respective search bounds. Thus, VP, ρ, and ν for each layer vary widely between their specified search limits (light green curves in Figure 13). Vertical sampling, shown as magenta curves in Figure 13, on the other hand, uses a single random number for each model: a random number between 0 and 1 is generated and combined with the minimum and maximum search limits for each model parameter at each depth point or layer to define the model. For example, consider two specific layers in Figure 13, at 1 and 2 km depths. The minimum and maximum limits for VP at 1 km are approximately 3.3 and 4.2 km/s, and those at 2 km are about 3.5 and 4.5 km/s. So, if the random number generated for a given model is 0.35, then VP at 1 km for that model would be set to 3.3 + 0.35 × (4.2 − 3.3) = 3.615 km/s. Similarly, VP at 2 km for the same model would be 3.5 + 0.35 × (4.5 − 3.5) = 3.85 km/s. Values for ρ and ν are computed in the same fashion. Note that because the initial Poisson's ratio is constant, the random Poisson's ratio models generated by vertical sampling are also constant. Because a single random number defines one model, vertically sampled random models are much smoother than laterally sampled ones. In addition, the vertically sampled random models follow the initial VP, VS (or ν), and ρ model trends.
In the examples shown here, the initial VP is derived from velocity analysis and NMO correction of the prestack seismic data, and the initial VS (ν) and ρ were generated from this initial VP using established VP-VS and VP-ρ relations, which, in a broad sense, are representative of the local geology following the regional compaction trend. Even when well-logs are used as the initial model and the initial random models are then generated by vertical sampling, they would still be representative of the geology. Thus, the vertically sampled random models can also be regarded as geologically constrained random models. Although random, their smooth and geologically constrained behavior is the reason they sample the model-space better. Starting from combinations of VP, VS (or ν), and ρ models consistent with the local geology, they tend to converge faster than those generated from lateral sampling. Finally, the dashed cyan line across all panels of Figure 13 is the top of the inversion window. All model parameters above this line were regarded as overburden and kept unchanged.
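The difference between the two sampling strategies reduces to where the random number is drawn. A minimal sketch with hypothetical function names, where `vmin` and `vmax` are the per-layer search bounds for one parameter:

```python
import random

def lateral_sample(vmin, vmax, rng=random):
    """Lateral sampling (Figure 4): an independent random number for every
    layer, so values jump freely within the search bounds."""
    return [lo + rng.random() * (hi - lo) for lo, hi in zip(vmin, vmax)]

def vertical_sample(vmin, vmax, rng=random):
    """Vertical sampling (Figure 10): one random number for the whole
    model, applied to the bounds at every depth, so the model follows the
    initial trend and varies smoothly with depth."""
    r = rng.random()
    return [lo + r * (hi - lo) for lo, hi in zip(vmin, vmax)]
```

With r = 0.35 and bounds (3.3, 4.2) and (3.5, 4.5) km/s, `vertical_sample` reproduces the worked values 3.615 and 3.85 km/s from the example above.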

Figure 13.

Comparison between the lateral (light green) and vertical (magenta) sampling using a single random model. The search limits (dashed red) and the initial model (black) are also shown. In addition, the dashed line in cyan represents the top of the inversion window. (a) VP, (b) ρ, and (c) ν.

The discussion here is restricted to the application of GA to solve the seismic inverse problem at a single location using the parallel implementation outlined in Figure 8. This single-location procedure is extendable to multiple locations using multi-level parallelization, in which multiple instances of the workflow of Figure 8 simultaneously invert different regions of the seismic data. Details of such multi-level parallelization are described elsewhere [13, 16] and therefore not repeated.


6. Conclusions

A GA optimization methodology with a new sampling strategy for generating the initial random models is presented here. By avoiding genetic drift via diversity preservation and fitness scaling, the proposed method has been shown to work well for large model-spaces in solving seismic waveform inversion problems. The method, demonstrated for a single location, can be easily extended to multiple locations using a multi-level parallelization approach.

References

  1. Holland JH. Adaptation in Natural and Artificial Systems. Ann Arbor, MI, USA: University of Michigan Press; 1975
  2. Goldberg DE. Genetic Algorithms in Search, Optimization and Machine Learning. Boston, MA, USA: Addison Wesley Publishing Company; 1989
  3. Stoffa PL, Sen MK. Nonlinear multiparameter optimization using genetic algorithms: Inversion of plane-wave seismograms. Geophysics. 1991;56:1794-1810
  4. Sen MK, Stoffa PL. Rapid sampling of model space using genetic algorithms: Examples from seismic waveform inversion. Geophysical Journal International. 1992;108:281-292
  5. Sen MK, Stoffa PL. Global Optimization Methods in Geophysical Inversion. Amsterdam, Netherlands: Elsevier Science Publications; 1995
  6. Sen MK, Stoffa PL. Bayesian inference, Gibbs's sampler and uncertainty estimation in geophysical inversion. Geophysical Prospecting. 1996;44:313-350
  7. Mallick S. Model-based inversion of amplitude-variations-with-offset data using a genetic algorithm. Geophysics. 1995;60:939-954
  8. Mallick S. Case History: Some practical aspects of prestack waveform inversion using a genetic algorithm: An example from the east Texas Woodbine gas sand. Geophysics. 1999;64:326-336
  9. Du Z, MacGregor LM. Reservoir characterization from joint inversion of marine CSEM and seismic AVA data using genetic algorithms: A case study based on the Luva gas field. SEG Technical Program Expanded Abstracts. 2010;80:737-741
  10. Padhi A, Mallick S. Accurate estimation of density from the inversion of multicomponent prestack seismic waveform data using a non-dominated sorting genetic algorithm. The Leading Edge. 2013;32:94-98
  11. Padhi A, Mallick S. Multicomponent prestack seismic waveform inversion in transversely isotropic media using a non-dominated sorting genetic algorithm. Geophysical Journal International. 2014;196:1600-1618
  12. Li T, Mallick S. Multicomponent, multi-azimuth pre-stack seismic waveform inversion for azimuthally anisotropic media using a parallel and computationally efficient non-dominated sorting genetic algorithm. Geophysical Journal International. 2015;200:1136-1154
  13. Mallick S, Adhikari S. Amplitude-variation-with-offset and prestack waveform inversion: A direct comparison using a real data example from the Rock Springs Uplift, Wyoming, USA. Geophysics. 2015;80(2):B45-B59
  14. Li T, Mallick S, Tamimi N, Davis T. Inversion of wide-azimuth multicomponent vertical seismic profile data for anisotropic subsurface properties. SEG Technical Program Expanded Abstracts. 2016;2016:1252-1257
  15. Mazzotti A, Bienati N, Stucchi E, Tognarelli A, Aleardi M, Sajeva A. Two-grid genetic algorithm full-waveform inversion. The Leading Edge. 2016;35:1068-1075
  16. Pafeng J, Mallick S, Sharma H. Prestack waveform inversion of three-dimensional seismic data – An example from the Rock Springs Uplift, Wyoming, USA. Geophysics. 2017;82(1):B1-B12
  17. Ayani M, MacGregor L, Mallick S. Inversion of marine controlled source electromagnetic data using a parallel non-dominated sorting genetic algorithm. Geophysical Journal International. 2020;220:1066-1077
  18. Kozlovskaya E, Vecsey L, Plomerova J, Raita T. Joint inversion of multiple data types with the use of multiobjective optimization: problem formulation and application to the seismic anisotropy investigations. Geophysical Journal International. 2007;171:761-779
  19. Singh VP, Duquet B, Leger M, Schoenauer M. Automatic wave equation migration velocity inversion using multiobjective evolutionary algorithms. Geophysics. 2008;73:VE61-VE73
  20. Heyburn R, Fox B. Multi-objective analysis of body and surface waves from the Market Rasen (UK) earthquake. Geophysical Journal International. 2010;181:532-544
  21. Oldenburg DW, Scheuer T, Levy S. Recovery of acoustic impedance from reflection seismograms. Geophysics. 1983;48:1318-1337
  22. Oldenburg DW, Levy S, Stinson K. Root-means-square velocities and recovery of the acoustic impedance. Geophysics. 1984;49:1653-1663
  23. Connolly P. Elastic impedance. The Leading Edge. 1999;18:438-452
  24. Hampson DP, Russell BH, Bankhead B. Simultaneous inversion of pre-stack seismic data. SEG Technical Program Expanded Abstracts. 2005;75:1633-1637
  25. Mallick S. Amplitude-variation-with-offset, elastic impedance, and wave-equation synthetics – A modeling study. Geophysics. 2007;72:C1-C7
  26. Pratt RG. Seismic waveform inversion in the frequency domain. Part 1: Theory and verification in a physical scale model. Geophysics. 1999;64:888-901
  27. Vigh D, Starr EW. 3D prestack plane-wave, full-waveform inversion. Geophysics. 2008;73:VE135-VE144
  28. Plessix R-É. Three-dimensional frequency-domain full-waveform inversion with an iterative solver. Geophysics. 2009;74:WCC149-WCC157
  29. Lee KH, Kim HJ. Source-independent full-waveform inversion of seismic data. Geophysics. 2003;68:2010-2015
  30. Liu F, Guasch L, Morton SA, Warner M, Umpleby A, Meng Z, et al. 3-D time-domain full waveform inversion of a Valhall OBC dataset. SEG Technical Program Expanded Abstracts. 2012;82:1-5. DOI: 10.1190/segam2012-1105.1
  31. Guasch L, Warner M, Nangoo T, Morgan J, Umpleby A, Stekl I, et al. Elastic 3D full-waveform inversion. SEG Technical Program Expanded Abstracts. 2012;82:1-7. DOI: 10.1190/segam2012-1239.1
  32. Guitton A, Ayeni G, Diaz E. Constrained full-waveform inversion by model reparameterization. Geophysics. 2012;77:R117-R217
  33. Warner M, Ratcliffe A, Nangoo T, Morgan J, Umpleby A, Shah N, et al. Anisotropic 3D full-waveform inversion. Geophysics. 2013;78:R59-R80
  34. Bansal R, Krebs J, Routh P, Lee S, Anderson J, Baumstein A, et al. Simultaneous-source full-wavefield inversion. The Leading Edge. 2013;32:1100-1108
  35. Xue Z, Zhu H, Fomel S. Full-waveform inversion using seislet regularization. Geophysics. 2017;82:A43-A49
  36. Huang G, Nammour R, Symes W. Full-waveform inversion via source-receiver extension. Geophysics. 2017;82:R153-R171
  37. Biswas R, Sen MK. 2D Full Waveform Inversion and Uncertainty Estimation using the Reversible Jump Hamiltonian Monte Carlo. SEG Technical Program Expanded Abstracts. 2017;87:1280-1285
  38. da Silva NV, Yao G, Warner M. Semiglobal viscoacoustic full-waveform inversion. Geophysics. 2019;84:R271-R293
  39. Zhang Z-d, Alkhalifah T. Local-crosscorrelation elastic full-waveform inversion. Geophysics. 2019;84:R897-R908
  40. Sen MK, Roy IG. Computation of differential seismograms and iteration adaptive regularization in prestack waveform inversion. Geophysics. 2003;68:2026-2039
  41. Padhi A, Mallick S, Fortin W, Holbrook WS, Blacic T. 2-D ocean temperature and salinity images from pre-stack seismic waveform inversion methods: an example from the South China Sea. Geophysical Journal International. 2015;202:800-810
  42. Kennett BLN. Seismic Wave Propagation in Stratified Media. Cambridge, United Kingdom: Cambridge University Press; 1983
  43. Kennett BLN, Kerry NJ. Seismic waves in a stratified half space. Geophysical Journal of the Royal Astronomical Society. 1979;57:557-583
  44. Yilmaz O. Seismic Data Processing. Tulsa, OK, USA: Society of Exploration Geophysicists; 1987
  45. Castagna JP, Batzle ML, Eastwood RL. Relationships between compressional-wave and shear-wave velocities in clastic silicate rocks. Geophysics. 1985;50:571-581
  46. Hilterman F. Is AVO the seismic signature of lithology? A case study of Ship Shoal-South Addition. The Leading Edge. 1990;10:39-42
  47. Gardner GHF, Gardner LW, Gregory AR. Formation velocity and density – The diagnostic basics for stratigraphic traps. Geophysics. 1974;39:770-780
  48. Todd CP, Backus MM. Offset-dependent reflectivity in a structural context. SEG Technical Program Expanded Abstracts. 1985;55:586-588
  49. Resnick JR. Seismic data processing for AVO and AVA analysis. In: Castagna JE, Backus MM, editors. Offset Dependent Reflectivity—Theory and Practice for AVO and AVA analysis. Tulsa, OK, USA: Society of Exploration Geophysicists; 1993. pp. 175-189
  50. Mukhopadhyay PK, Mallick S. An accurate ray-based offset-to-angle transform from normal moveout uncorrected multicomponent data in a transversely isotropic medium with vertical symmetry axis. Geophysics. 2011;76:C41-C51
  51. Shaw R, Srivastava S. Particle swarm optimization: A new tool to invert geophysical data. Geophysics. 2007;72:F75-F83
  52. Tronicke J, Paasche H, Böniger U. Crosshole traveltime tomography using particle swarm optimization: A near-surface field example. Geophysics. 2016;77:R19-R32
  53. Mallick S. A simple approximation to the P-wave reflection coefficient and its implication in the inversion of amplitude variation with offset data. Geophysics. 1993;58:544-552
  54. Koefoed O. On the effect of Poisson's ratios of rock strata on the reflection coefficients of plane waves. Geophysical Prospecting. 1955;3:381-387
  55. Koefoed O. Reflection and transmission coefficients for plane longitudinal incident waves. Geophysical Prospecting. 1962;10:304-351
  56. Bortfeld R. Approximations to the reflection and transmission coefficients of plane longitudinal and transverse waves. Geophysical Prospecting. 1961;9:485-502
  57. Richards PG, Frasier CW. Scattering of elastic waves from depth-dependent inhomogeneities. Geophysics. 1976;41:441-458
  58. Shuey RT. A simplification of the Zoeppritz equations. Geophysics. 1985;50:609-614
  59. Aki K, Richards PG. Quantitative Seismology. Herndon, VA, USA: University Science Books; 2002
  60. Sen MK, Stoffa PL. Nonlinear one-dimensional seismic waveform inversion using simulated annealing. Geophysics. 1991;56:1624-1638
  61. Sen MK, Biswas R. Transdimensional seismic inversion using the reversible jump Hamiltonian Monte Carlo algorithm. Geophysics. 2017;82:R119-R134
  62. Zhu D, Gibson R. Seismic inversion and uncertainty quantification using transdimensional Markov chain Monte Carlo method. Geophysics. 2018;83:R321-R334
  63. Jian W, Lei Z, Hao C, Xiu-ming W. Trans-dimensional Bayesian inversion for directional resistivity logging while drilling data. SEG Technical Program Expanded Abstracts. 2018;88:849-852
  64. Visser G, Guo P, Saygin E. Bayesian transdimensional seismic full-waveform inversion with a dipping layer parameterization. Geophysics. 2019;84:R845-R858
  65. Dhara A, Sen MK, Yang D, Schmedes J, Routh P, Sain R. Facies and reservoir properties estimation by a transdimensional seismic inversion. SEG Technical Program Expanded Abstracts. 2020;90:255-259
  66. Deb K, Agrawal S. A niched-penalty approach for constraint handling in genetic algorithms. Vienna: Springer; 1999. DOI: 10.1007/978-3-7091-6384-9_40
  67. Deb K, Pratap A, Agarwal S, Meyarivan T. A fast and elitist multi-objective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation. 2002;6(2):181-197