DL Seismic FWI
net/publication/343986020
Deep learning seismic full waveform inversion for realistic structural models
All content following this page was uploaded by Yangkang Chen on 31 August 2020.
Bin Liu∗†‡ , Senlin Yang∗ , Yuxiao Ren∗ , Xinji Xu† , Peng Jiang∗ and Yangkang
Chen§
Shandong University,
Shandong University,
Shandong University,
GEO-2020
ABSTRACT
Velocity model inversion is one of the most important tasks in seismic exploration. Full
waveform inversion (FWI) can obtain the highest resolution in traditional velocity inver-
sion methods, but it heavily depends on initial models and is computationally expensive.
In recent years, a large number of deep learning based velocity model inversion methods
have been proposed. One critical component in those deep learning based methods is a
large training set containing different velocity models. We propose a method to construct
realistic structural models for deep learning networks. Our P-wave velocity model building
method for creating dense-layer/fault/salt body models can automatically construct a large
number of models without much human effort, which is very meaningful for deep learning
networks. Moreover, to improve the inversion result on these realistic structural models,
instead of using only the common-shot gather, we propose to extract features from the
common-receiver gather as well. Through a large number of realistic structural
models, reasonable data acquisition methods, and appropriate network setups, a more gen-
eralized result can be obtained through our proposed inversion framework, which has been
demonstrated to be effective on the independent testing data set. The results of dense-layer
models, fault models, and salt body models are compared and analyzed, respectively, which
demonstrates the reliability of the proposed method and also provides practical guidelines
INTRODUCTION
Accurate seismic velocity model inversion from seismic data is an important task in
seismic exploration for a variety of applications. The velocity model plays an essential role
in high-quality and high-resolution seismic exploration, and an accurate velocity model
provides a better basis for reverse-time migration (Baysal et al., 1983), prestack
depth migration (Mittet et al., 1995), and other imaging methods (Bunks et al., 1995).
With the increasing complexity of seismic exploration conditions and requirements, the
the true velocity model, many methods have been developed from different perspectives,
such as normal moveout correction (NMO) based velocity analysis (Dunkin and Levin, 1973;
Alkhalifah and Tsvankin, 1995), wave-equation tomography (Woodward, 1992) and full
waveform inversion (FWI) (Lailly and Bednar, 1983; Tarantola, 1984) methods. In recent
years, with the development of machine learning, more and more data-driven methods are
In view of the importance of the velocity model in seismic exploration, researchers have
devoted great effort to velocity inversion methods (Cohen and Bleistein, 1979). Lailly and
Bednar (1983) and Tarantola (1984) first proposed seismic full waveform inversion
based on the generalized least-squares criterion, which provides an overall framework for
seismic velocity inversion. Unlike traditional tomography methods, FWI requires
high-accuracy modeling of wave propagation so that it can make full use of the kinematic
and dynamic information in the prestack seismic wavefield. In addition, because FWI is
highly nonlinear, it depends strongly on the initial model and is prone to converging to
a local minimum. To address the local minimum problem, Bunks et al. (1995) proposed
multiscale full-waveform inversion in the time domain, where seismic data
at different frequencies are used to improve the inversion result. After that, FWI methods
in the frequency domain (Pratt and Worthington, 1990; Pratt, 1999; Pratt and Shipp, 1999;
Hu et al., 2009b) and the Laplace domain (Shin and Cha, 2008; Shin and Ho Cha, 2009)
have been proposed with satisfactory results achieved. In addition, the directional total
variation (DTV) method has been introduced recently to solve the local minimum problem
(Qu et al., 2019). With intensive study, FWI methods have been further extended
in various aspects, including viscoelastic media (Yang et al., 2009) and joint inversion (Khan
et al., 2010; Hu et al., 2009a), and have gradually been applied to field data (Sirgue et al., 2010).
However, the limitations of FWI methods, such as the impact of initial models, still exist,
and new methods are needed to solve these problems (Chen et al., 2016).
a new idea for velocity model inversion. Among them, deep learning has become one of
the most important research topics, and more and more fields are beginning to introduce
deep learning methods to solve related problems. After decades of development, neural
networks have evolved from the initial form of neurons to the recent architecture of deep
neural networks. The concept of neural networks dates back to McCulloch and
Pitts (1943), who studied neural computing. Rosenblatt (1958) proposed the perceptron,
which opened up research on neural network algorithms. Rumelhart et al. (1988) proposed
the back-propagation neural network. However, because of the large computational resources
required, neural network research mostly stayed at the theoretical stage. With the great
improvement in computing power in the 21st century, neural network algorithms returned to
the research focus through a series of papers (Hinton et al., 2006; Lecun et al., 2015;
Hinton et al., 2012a). With the development of deep learning and related technologies,
more and more fields have begun to use these methods (Deng et al., 2014), such as
computer vision (Voulodimos et al., 2018), medical diagnosis (Esteva et al., 2017), speech
In the seismic exploration community, researchers began to use deep learning methods and
achieved better results than traditional methods. Röth and Tarantola (1994) first applied a
neural network to invert 1D velocity models from seismic data, which confirmed
the applicability of neural networks to velocity model inversion. Moseley et al. (2018) apply
data. Araya-Polo et al. (2018) achieve velocity model reconstruction through a convolutional
neural network (CNN) applied to a velocity spectrum cube calculated from prestack data. For
layer and fault models, Wu and Lin (2018) use an encoder-decoder CNN, called
InversionNet, to build the corresponding velocity models. Yang and Ma (2019) use
fully convolutional networks (FCN) to rebuild velocity models, especially salt models,
from prestack data with Gaussian noise. In particular, they further trained on SEG data sets
using transfer learning, and the result was better than that from FWI. The above-mentioned
methods apply existing network architectures to seismic data sets. Based on an in-depth
analysis of the characteristics of seismic data, Li et al. (2020) further designed and
optimized a CNN and a fully connected network and proposed SeisInvNet, which achieved
better results than InversionNet. Methods that use neural networks to improve FWI have
also been proposed, with significant improvements in computational efficiency and
inversion results (Sun et al., 2020; Ren et al., 2020). Moreover,
deep learning has been successfully applied in seismic data processing, for example to
seismic data denoising (Yu et al., 2018; Chen et al., 2019; Saad and Chen, 2020), fault
identification (Wu et al., 2019), lithology prediction (Shi et al., 2019; Zhang et al., 2018),
geological structure classification (Li, 2018), and arrival picking (Tsai et al., 2018; Yuan
et al., 2019; Zhang et al., 2020). More deep learning applications are expected in
geophysical exploration in the future, such as mineral exploration (Malehmir
et al., 2012), forward geological prospecting in tunneling (Li et al., 2017), and
four-dimensional data monitoring (Liu et al., 2020b). Beyond seismic exploration, other
exploration methods have also applied deep learning to data processing and interpretation
(George and Huerta, 2018; Puzyrev, 2019; Nurindrawati and Sun, 2019; Liu et al., 2020a).
In this study, we develop a method to build realistic structural models, i.e., dense-layer
models, fault models, and salt body models, and a complete framework for P-wave velocity
model inversion using a deep neural network. The research consists of two major aspects.
First, to obtain as many complex models as possible, a model building process is designed,
and 18,000 dense-layer/fault/salt body models are generated, which provides sufficient data
for network training. Second, SeisInvNet is chosen and improved for more complex models.
Based on the generated velocity models and the corresponding dense-shot data, a new deep
neural network is trained to approximate the nonlinear mapping from the data to the model.
With the trained network, the velocity model can be obtained directly from the input
seismic data. Compared with traditional methods, our method achieves better computational
efficiency and accuracy and has a certain degree of generalization. Finally, to
compare with the original SeisInvNet, we analyze the results using four evaluation criteria:
MAE, MSE, SSIM, and MSSIM. From the analysis, the proposed method
prediction results of the fault model are carefully studied, and future research plans are
discussed.
METHOD
Problem definition
excite seismic wavefield and use receivers on the ground to record seismic waves. In this
paper, synthetic data are modeled based on the acoustic wave equation in the time domain:
where a denotes the wave velocity, u denotes the pressure (i.e., the acoustic wavefield), x
and y denote the spatial coordinates, t is time, and f (x, y, t) is the source function.
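For reference, a common constant-density form of the 2D acoustic wave equation that is consistent with the symbols defined above is sketched here. It is an assumption about the intended form (the source term in particular may be scaled differently), not a restatement of the authors' exact equation:

$$
\frac{\partial^2 u}{\partial t^2} = a^2 \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right) + f(x, y, t)
$$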
For velocity model inversion, the velocity model V is inferred by the corresponding
For FWI methods, the velocity model is iteratively optimized by gradients derived from
an objective function, and it usually converges to a local minimum. In contrast,
deep learning (DL) methods learn the nonlinear mapping by optimizing the network
parameters and usually require many data-model pairs to train the network. The
performance of deep learning methods depends heavily on the training data set and the
network design. According to recent research (Kawaguchi, 2016; Choromanska et al., 2014),
the point to which deep learning methods converge tends to be closer to the global minimum.
Given input data D, the deep neural network predicts a corresponding model $\hat{V}$,
so the misfit between the prediction $\hat{V}$ and the true model V can be computed, and the
gradients are then derived to update the network parameters. In this way, the trained network
is obtained, i.e., the nonlinear mapping F from time-series data to the model is built as:
$$\hat{V} = F(D).$$
Figure 1 shows the workflow of velocity model inversion based on deep learning, where
the velocity model V of the size [H × W ] serves as the output target (also called label),
and the corresponding seismic data D as the input. The size of the input data is [S × R × T ],
where S, R, and T represent the number of shots, receivers, and recorded time steps,
respectively. During the training process, we calculate the misfit between the
network output $\hat{V}$ and the label V, and update the network parameters by gradient
back-propagation. After multiple epochs, the error converges to a sufficiently small
value, and the network parameters can be determined. During inference, given the seismic
Generally speaking, most research on DL velocity model inversion follows the supervised
learning paradigm, where a large quantity of data with target labels is indispensable.
the data used during training. Thus, for DL velocity model inversion, the testing model
design is crucial for bringing out the nonlinear mapping ability of neural networks.
Specifically, a reasonable model design makes the trained neural network more likely to be
applicable in realistic situations. Some studies (Wu and Lin, 2018; Yang and Ma, 2019; Li
et al., 2020) have collected their own data sets, including models with faults, salt bodies,
and layered subsurface. However, most of these works design models following only some
simple rules, which makes the models neither realistic nor complex enough. To obtain
more realistic structural velocity models, we propose a new scheme of designing the
To design a dense-layer model with one or two faults or salt bodies, we first generate a
dense-layer structural model and then randomly add faulting structures to it. For dense-layer
models, the difficulty is how to ensure the continuity and variability of each layer while
increasing the number of underground media within the limited depth and exploration resolution.
2. Iteratively generate curves by adjusting the upper interface, so that two adjacent
interfaces do not change drastically and the model remains realistic.
3. Fill the P-wave velocity value into the medium between two adjacent interfaces, following the
The target inversion model size of this paper is set to be 100 × 100. Around the model,
there is an extra 20-grid absorption boundary at the left, right, and bottom sides.
continuous, fluctuant, and complex curves. The function and separate equations are shown
as follows:
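$$
\begin{aligned}
y &= y_1 + y_2 + y_3, \\
y_1 &= a_1 \sin\frac{x}{2\pi T_1} + \theta_1 + a_2 \left( \sin\frac{x}{2\pi T_2} + \theta_2 \right)^2, \\
y_2 &= a_3 \left( \sin\frac{x}{2\pi T_3} + \theta_3 \right)^i, \quad i = 1, 2, 3, \\
y_3 &= r \times x + c_0,
\end{aligned}
\tag{4}
$$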
where aj, Tj, and θj, j = 1, 2, 3, are parameters of the trigonometric terms, and r and
c0 are constants that control the tilt and depth of the interface, respectively. We let
a1 > a2 > a3 and T1 < T2 < T3, while the θs are given randomly, so y1 is the dominant part
of the curve. To keep the trends of two adjacent layers similar, y1 for each interface is
only adjusted relative to that of the previous interface. For y2, the period and magnitude
are re-assigned at each interface individually. Finally, for y3, c0 and r are
assigned to control the depth and tilt of the current interface, respectively. With the
above definitions, we obtain multiple interfaces with the same trend, as shown in
Figure 2.
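The interface construction described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: each θj is treated as a phase shift inside the sine, and all numeric ranges (amplitudes, periods, tilt, the 15-grid-point spacing) are hypothetical values chosen so that neighbouring interfaces cannot cross; none of them come from the paper.

```python
import numpy as np

def interface_curve(x, a, T, theta, i, r, c0):
    """Depth of one interface, y = y1 + y2 + y3, following Equation 4.

    Assumption: each theta_j is treated as a phase shift inside the sine.
    """
    a1, a2, a3 = a
    T1, T2, T3 = T
    th1, th2, th3 = theta
    y1 = (a1 * np.sin(x / (2 * np.pi * T1) + th1)
          + a2 * np.sin(x / (2 * np.pi * T2) + th2) ** 2)
    y2 = a3 * np.sin(x / (2 * np.pi * T3) + th3) ** i
    y3 = r * x + c0
    return y1 + y2 + y3

rng = np.random.default_rng(0)
x = np.arange(100, dtype=float)          # 100 lateral grid points
a1, a2 = 4.0, 2.0                        # hypothetical amplitudes (a1 > a2 > a3)
T1, T2 = 2.0, 4.0                        # hypothetical periods (T1 < T2 < T3)
th1, th2 = rng.uniform(0.0, 2.0 * np.pi, size=2)

interfaces = []
for k in range(5):
    # y1 is only nudged between neighbours, so adjacent trends stay similar.
    a1 += rng.uniform(-0.2, 0.2)
    th1 += rng.uniform(-0.05, 0.05)
    # y2 is re-drawn independently at every interface.
    a3 = rng.uniform(0.2, 1.0)
    T3 = rng.uniform(5.0, 8.0)
    th3 = rng.uniform(0.0, 2.0 * np.pi)
    i = int(rng.integers(1, 4))          # exponent i in {1, 2, 3}
    # y3 sets the tilt and the mean depth (in grid points).
    r = rng.uniform(-0.02, 0.02)
    c0 = 15.0 * (k + 1)
    interfaces.append(interface_curve(x, (a1, a2, a3), (T1, T2, T3),
                                      (th1, th2, th3), i, r, c0))
interfaces = np.stack(interfaces)        # shape (5, 100), depth increases downward
```

Each row of `interfaces` is one interface; with these bounds the mean spacing always exceeds the largest possible difference between neighbouring curves, so the layers stay ordered.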
To further weaken the smoothness of the curve caused by the trigonometric terms in
Equation 4 and make the model more realistic, we select some discrete points from each
curve and form a new curve by reconnecting these points. In this way, the randomness of
After all the interfaces are determined, we set the P-wave velocity value for each layer
from top to bottom, within the range [1500, 4000] m/s. Moreover, to reflect the
consolidation effect of real earth media, the velocity of each layer increases with layer
depth. In our implementation, the velocity difference between adjacent layers is set to be
greater than 200 m/s. The velocity range of each layer is selected according to the
required maximum velocity vmax, the velocity of the layer above vupper, the number of
remaining layers ln, and a random term r. The
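One way to satisfy the constraints stated above (velocities in [1500, 4000] m/s, increasing with depth, and at least 200 m/s apart) is sketched below. The sampling rule is a stand-in chosen for illustration, not the authors' selection formula; the function name and the way the remaining range is shared among the layers are hypothetical.

```python
import numpy as np

def assign_velocities(n_layers, v_min=1500.0, v_max=4000.0, min_gap=200.0, rng=None):
    """Draw layer velocities that increase with depth by at least min_gap (m/s)."""
    rng = np.random.default_rng() if rng is None else rng
    velocities = []
    lower = v_min
    for k in range(n_layers):
        remaining = n_layers - k - 1               # layers still to be filled below
        cap = v_max - min_gap * remaining          # leave room for the layers below
        # Share the available range among this layer and the remaining ones.
        upper = lower + (cap - lower) / (remaining + 1)
        v_k = rng.uniform(lower, upper)
        velocities.append(v_k)
        lower = v_k + min_gap                      # next layer must be faster
    return np.array(velocities)

v = assign_velocities(9, rng=np.random.default_rng(1))
```

Reserving `min_gap * remaining` below `v_max` at every step is what guarantees that the deepest layer never has to exceed the maximum velocity.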
In Figure 4, we carry out a statistical analysis of the velocity changes for different
kinds of layer models (five to nine layers). The trend is the same for these layer
models: the velocity increases with depth. Through the above
curve design, interface setting, and velocity assignment processes, the realistic structural
Based on the dense-layer model design, the fault setting is further studied. We use the
dense-layer model and the fault line l(x) as input, and generate a fault at a random
position. l(x) can be defined as a straight diagonal line or as a curve, such as a
1. Randomly generate the starting position P of the fault and the amount of movement
2. Randomly select the moving part, i.e., the hanging wall or the footwall and the moving
distance.
The randomness ensures that a fault can appear at any position, which
improves the model complexity. However, in some cases, the reflected wave information
from the fault is difficult to record, which affects the prediction result. The impact of faults
on the inversion results is discussed in the Discussion section. Consequently, for
each point [h0 , w0 ] of the moving part, the updated position [h∗ , w∗ ] can be calculated by
the equation:
It is worth noting that the velocity values at the boundary positions are sometimes
missing and need to be filled. The workflow of the fault setting is shown in Figure 5.
To further enrich our model and improve the complexity of the model, we further designed
We further design the salt body model based on the dense-layer model. The salt body
can be considered as a dense stratum rising upward from the bottom, and it
fluctuates as it flows into the upper strata. We use a Gaussian function to simulate
where a, b, and c are amplitude, center, and standard deviation, respectively. To simulate
the influence of the salt body on the dense layers, where the deeper layer will have a larger
The original layered model fluctuates according to this set of Gaussian curves, simulating
the invading influence of the salt body. Finally, a new rock mass is designed through a
parabola with random parameters and given a wave velocity in the range [4400, 4500] m/s.
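The Gaussian deformation of the layers can be illustrated as follows. The Gaussian itself follows from the stated parameters (amplitude a, center b, standard deviation c); the flat starting interfaces, the amplitude values, and the choice to make the amplitude grow linearly with depth are hypothetical and only meant to show the idea that deeper layers are lifted more.

```python
import numpy as np

def gaussian(x, a, b, c):
    """Gaussian with amplitude a, center b, and standard deviation c."""
    return a * np.exp(-((x - b) ** 2) / (2.0 * c ** 2))

x = np.arange(100, dtype=float)                        # lateral grid points
flat_depths = np.array([20.0, 40.0, 60.0, 80.0])       # hypothetical flat interfaces
interfaces = np.tile(flat_depths[:, None], (1, x.size))

# Deeper interfaces get a larger uplift (hypothetical amplitudes, in grid points).
amplitudes = np.array([2.0, 4.0, 6.0, 8.0])
uplift = gaussian(x, amplitudes[:, None], b=50.0, c=12.0)   # broadcasts to (4, 100)
uplifted = interfaces - uplift                          # smaller depth = pushed upward
```

Because the amplitudes differ by less than the layer spacing, the lifted interfaces remain ordered from shallow to deep.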
According to the proposed design method for dense-layer, fault, and salt body models,
we have generated 18,000 models in batches (6000 models of each type) for
the next step of network training. Some of them are shown in Figure 7.
Another research focus in this work is the network design for seismic inversion. For the
more complex velocity models generated in this work, a stable and high-resolution deep
neural network is needed. Here, we select SeisInvNet (Li et al., 2020) as our backbone, and
we further modify and adapt it to meet our needs. We first introduce the four parts of the
network and explain how to implement the end-to-end inversion process that maps data to
the corresponding model. Then, our improvements are carefully presented and discussed.
1) Encoder: Because of the weak correspondence between seismic data and the velocity
model, difficulties arise when convolutional neural networks or fully connected networks
are applied directly. One of the main advantages of SeisInvNet is that it analyzes and
extracts the characteristics of seismic data, especially in the Encoder. Convolution
is used to extract global information and neighborhood information from the common-shot
gathers, and a one-hot vector (one element equals one while the others are zero) is used
to label each trace, which enriches the single-trace information and makes it possible
to introduce a fully connected network. For seismic data of size [S × R × T ], through two
information are obtained, whose sizes are [R × C] and [R × T ], respectively. Besides, the
source and receiver location information are also encoded by a one-hot vector with a length
Improvement: To extract as many effective features of the data as possible, R
common-receiver gathers of size [S × T ] are additionally considered in this work. That is,
we obtain data information from two aspects: common shot points and common receiver points.
Correspondingly, the global information comes from two profiles. Specifically, the
convolution operation is performed on the two kinds of seismic gathers (i.e., the
common-shot gather and the common-receiver gather) separately, and two vectors of length C
are obtained. In the same way, the neighboring information also comes from the neighboring
traces of the two profiles and is then linearly weighted to give the final neighbor
information with a length of
R.
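The two views of the data used by the improved Encoder are simply different slicings of the same [S × R × T ] array. The short sketch below shows this with random numbers standing in for seismic traces; the array names are illustrative.

```python
import numpy as np

S, R, T = 20, 32, 1000                      # shots, receivers, time steps
rng = np.random.default_rng(0)
data = rng.standard_normal((S, R, T)).astype(np.float32)   # stand-in for seismic data

shot_gathers = data                         # S common-shot gathers, each [R x T]
receiver_gathers = data.transpose(1, 0, 2)  # R common-receiver gathers, each [S x T]

# The trace for shot s and receiver r appears in both views.
s, r = 3, 5
same_trace = np.array_equal(shot_gathers[s, r], receiver_gathers[r, s])
```

No data are duplicated by `transpose`; it returns a view, so both gather types can be fed to separate convolution branches at no extra memory cost.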
which is slightly larger than that in the original SeisInvNet. In this way, each trace contains
more valid information, which provides more advantages for the fully connected network.
Figures 8a and 8b show the difference between SeisInvNet and the proposed method on
2) Generator: Given the encoded S ×R traces, the main purpose of this part is to obtain
S × R feature maps through a fully connected network. For problems of mapping a time
series to a spatial sequence, using a fully connected network is considered the most direct
and efficient way. We extract the feature of each encoded trace, whose size is [w × d] for
model construction.
3) Decoder: In this part, the generated S × R features are decoded to get the velocity
model V similar to the decoder part of many networks (Simonyan and Zisserman, 2014).
The dropout (Hinton et al., 2012b) is used to ensure that the network does not depend on a
particular feature: by randomly dropping some features from a channel, the results
do not depend heavily on any single feature. In this way, good inversion results can
still be achieved even when some data are missing, which will be discussed later.
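As a minimal illustration of dropout, the sketch below uses the common "inverted dropout" formulation, in which kept features are rescaled during training so that no rescaling is needed at inference. This formulation and the element-wise masking are assumptions for illustration; the paper does not specify the variant used. The default ratio of 0.2 matches the value reported for the Decoder.

```python
import numpy as np

def dropout(features, ratio=0.2, rng=None, training=True):
    """Inverted dropout: zero a random fraction `ratio` of features, rescale the rest."""
    if not training or ratio == 0.0:
        return features
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(features.shape) >= ratio       # True for features that survive
    return features * keep / (1.0 - ratio)

features = np.ones((1000, 64))
dropped = dropout(features, ratio=0.2, rng=np.random.default_rng(0))
unchanged = dropout(features, training=False)        # inference: identity
```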
4) Loss function: The loss function determines the direction of approximating to the
velocity model. Mean Squared Error (MSE) is widely used in velocity inversion literature.
In the network, the misfit between the predicted model and the actual model is calculated
to derive gradients and update the network parameters. The MSE loss function is defined
as follows:
$$
L_{MSE}\left(\hat{V}^i, V^i\right) = \frac{1}{H \times W} \sum_{k=1}^{H \times W} \left(\hat{V}^i_k - V^i_k\right)^2. \tag{8}
$$
Here, $\hat{V}^i$ and $V^i$ are the prediction of the i-th model and the corresponding
ground truth.
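The MSE loss and the gradient it sends back to the network output can be written directly from Equation 8. The sketch below works on a single model of size [H × W ]; the function name is illustrative.

```python
import numpy as np

def mse_loss(v_pred, v_true):
    """Mean squared error over all H x W cells, and its gradient w.r.t. v_pred."""
    diff = v_pred - v_true
    loss = np.mean(diff ** 2)
    grad = 2.0 * diff / diff.size        # d(loss) / d(v_pred), same shape as v_pred
    return loss, grad

v_true = np.zeros((100, 100))            # normalized ground-truth model
v_pred = np.full((100, 100), 0.5)        # constant prediction, for illustration
loss, grad = mse_loss(v_pred, v_true)    # loss = 0.25, every grad entry = 1e-4
```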
could better represent the structural similarity between two figures than MSE, and have
been widely used in many computer vision tasks. Here, to better optimize the structural
characteristics of the predicted model, MSSIM is also used as a loss function. The advantage
of this kind of loss function has been demonstrated in previous work (Li et al., 2020).
$$
\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}, \tag{9}
$$
where x and y represent two corresponding windows in two images. µx and µy
are the means of x and y, σx² and σy² are the variances of x and y, and σxy is the covariance
of x and y. This metric measures the similarity of two image windows. The value ranges
from 0 to 1, and the closer the value is to zero, the lower the similarity. With SSIM, the
where V is the model (image) and Vy(k,r) is a window of V centered at k with width
r; the definition is similar for $\hat{V}$. Thus, this equation computes the similarity by
summing the SSIM over windows at all positions and widths. λr is the weight for the SSIM of
windows with width r. Details of the MSSIM parameters can be found in Wang et al. (2003).
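Equation 9 can be evaluated for a single pair of windows as shown below. The constants C1 = 0.01² and C2 = 0.03² are the values commonly used for data normalized to [0, 1] and are an assumption here, since the paper does not list them; the multi-scale weighting of MSSIM is not reproduced.

```python
import numpy as np

def ssim_window(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM of two equally sized windows, following Equation 9."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()                    # sigma_x^2 and sigma_y^2
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()          # sigma_xy
    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

rng = np.random.default_rng(0)
window = rng.random((8, 8))                            # stand-in for a model window
identical = ssim_window(window, window)                # equals 1 for identical windows
scaled = ssim_window(window, 0.5 * window)             # lower for a mismatched window
```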
Through the calculation of the above two loss functions, we sum them and get the final
loss function:
$$
L_{SUM}\left(\hat{V}^i, V^i\right) = L_{MSE}\left(\hat{V}^i, V^i\right) + L_{MSSIM}\left(\hat{V}^i, V^i\right). \tag{11}
$$
EXPERIMENT AND RESULTS
Dataset Setting
Through the above model construction method, we obtain 18,000 dense-layer, fault, and
salt body models (of size 140 × 140) in total. Each model has five to nine layers. In the
numerical simulation of the acoustic wavefield, the size of each grid cell is 10 m × 10 m.
That is, the inversion region is 1 km × 1 km for a model size of 100 × 100 after removing the
absorbing boundary. As for the observation system, 20 seismic sources and 32 receivers
are placed uniformly and symmetrically on the surface. The interval between adjacent
sources is 50 m (five grid points), and the interval between adjacent receivers is 30 m
(three grid points).
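The acquisition geometry can be reproduced with a small helper that centers an evenly spaced line of stations on the surface. Treating the surface as 100 grid nodes indexed 0 to 99 and centering each line on it is an assumption made for this sketch; the helper name is illustrative.

```python
import numpy as np

def line_positions(n, step, n_grid=100):
    """Grid indices of n stations spaced `step` nodes apart, centered on the surface."""
    span = (n - 1) * step                      # distance between first and last station
    start = (n_grid - 1 - span) // 2           # equal margin on both sides
    return start + step * np.arange(n)

dx = 10.0                                      # grid spacing in metres
sources = line_positions(20, 5)                # 20 sources, 5 nodes (50 m) apart
receivers = line_positions(32, 3)              # 32 receivers, 3 nodes (30 m) apart
source_x_m = sources * dx                      # positions in metres
receiver_x_m = receivers * dx
```

With these settings the sources occupy nodes 2 to 97 and the receivers nodes 3 to 96, so both lines fit inside the 1 km wide inversion region.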
A pseudospectral method (Kosloff and Baysal, 1982; Furumura et al., 1998; Virieux
et al., 2011) is used to simulate the seismic wave propagation. A Ricker wavelet with a
dominant frequency of 20 Hz is used as the seismic source. The sampling interval is 1 ms, and
the first 1000 time steps are recorded. We present some representative seismic
data and the corresponding velocity models in Figure 10. Finally, the 18,000 model-data pairs
are divided into training, validation, and test sets in a ratio of 10:1:1. They are
used to train the network, validate performance to save the optimal parameters, and test
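The source wavelet and the data split described in this subsection can be sketched as follows. The Ricker formula is the standard one; the time delay t0 = 1/f0 is a hypothetical choice, since the paper does not state the delay, and both function names are illustrative.

```python
import numpy as np

def ricker(f0=20.0, dt=1e-3, nt=1000, t0=None):
    """Ricker wavelet with dominant frequency f0 (Hz), sampled every dt seconds."""
    t0 = 1.0 / f0 if t0 is None else t0        # assumed delay of one dominant period
    t = np.arange(nt) * dt
    arg = (np.pi * f0 * (t - t0)) ** 2
    return t, (1.0 - 2.0 * arg) * np.exp(-arg)

def split_counts(n_total, ratio=(10, 1, 1)):
    """Number of samples in each subset for an integer ratio."""
    total = sum(ratio)
    return tuple(n_total * part // total for part in ratio)

t, wavelet = ricker()                          # 1000 samples, i.e. 1 s of recording
n_train, n_val, n_test = split_counts(18000)   # (15000, 1500, 1500)
```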
The setting of network hyperparameters also plays a key role in achieving a good inversion
result. Especially for the mapping between the seismic data and the velocity model,
because of the large number of parameters and the weak mapping relationship, a suitable
method is needed to prevent vanishing gradients and overfitting. Based on our
previous research and experience, we have selected appropriate network parameters to
make the most of the network. We use the Adam optimizer (Kingma and Ba, 2014) with a
batch size of 36 and an initial learning rate of 5 × 10^{-5} to optimize both our network and
SeisInvNet. For each layer of the network, including the convolutional and fully connected
layers, the Rectified Linear Unit (ReLU) activation and batch normalization (BN) are
applied, which are among the most commonly used techniques in deep learning. The ReLU
This activation function is simple and fast to calculate. Its derivative is also very
simple: when the value x is greater than zero, it is 1, which can effectively avoid the problem
network optimization process, and improve the predictability and stability, which
makes the network training process faster (Santurkar et al., 2018).
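For readers less familiar with these two building blocks, a minimal NumPy sketch is given here. ReLU is the standard max(0, x); the batch normalization shown is the plain training-time form without running statistics, with the scale `gamma`, shift `beta`, and `eps` as illustrative defaults.

```python
import numpy as np

def relu(x):
    """ReLU(x) = max(0, x)."""
    return np.maximum(x, 0.0)

def relu_grad(x):
    """Derivative of ReLU: 1 where x > 0, otherwise 0."""
    return (x > 0).astype(x.dtype)

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch axis, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.array([-1.0, 0.0, 2.0])
activated = relu(x)                            # [0., 0., 2.]
slopes = relu_grad(x)                          # [0., 0., 1.]

batch = np.random.default_rng(0).normal(3.0, 2.0, size=(36, 8))   # batch of 36 samples
normalized = batch_norm(batch)                 # per-feature mean ~0, std ~1
```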
A dropout ratio of 0.2 is used in the Decoder. All the hyperparameters of the network
are shown in Table 1. We train the network for 200 epochs in total. After each epoch of
training, validation is carried out on the validation set. The parameters perform
Result analysis
In this subsection, we evaluate and compare the results of the two networks under different
configurations, including a visual comparison and a quantitative comparison with different
metrics: MAE, MSE, SSIM, and MSSIM. All the scores are calculated after the velocity
models are normalized.
The loss curves on the training set and validation set with respect to the epoch
are shown in Figure 11. It can be seen that the loss curves for L_MSE and L_MSSIM decrease
monotonically and converge to a very small value on both training and validation sets.
However, our network is more stable and reliable on the validation set than the original
SeisInvNet. The minimum values of these curves are shown in Table 2, which demonstrates
the superiority of our network. Considering that the realistic structural models include
various model types such as salt bodies, this may be the main reason for the large fluctuation of
Some ground truth and prediction results are shown in Figures 12-14. We observe that
for relatively simple dense-layer models, the prediction results are very accurate, i.e.,
almost identical to the ground truth, while the results get slightly worse for the fault
models, where the fault location is sometimes unclear. Especially for the new models with
multiple faults, the higher degree of model complexity reduces the amount of effective
information collected in the data, and this lack of data leads to poor agreement in the
multi-fault models. Furthermore, we provide the statistics for dense-layer models, fault
models, and salt body models on the test set in Table 3. We choose four metrics to obtain
these statistics: MAE, MSE, SSIM, and MSSIM (a better network has smaller MAE and MSE and
larger SSIM and MSSIM). From these statistics, we observe a similar phenomenon as before,
i.e., our network performs better on the dense-layer models. Thus, we conclude that the fault
models have more complex features, which pose more challenges to the network.
Three groups of dense-layer, fault, and salt body models are selected
for visual comparison in Figures 12-14. We plot the velocity curves in these
figures to help compare the differences. It can be seen that for simple layer models, the
velocity is almost perfectly fitted, though there are still some errors, especially for
models with large fluctuations. Besides, since the seismic data have only a limited
recording time, less reflection information is obtained for the deeper layers, which leads
to poorer inversion results in those layers, as shown in Figure 12c. For the fault model,
the change of the velocity at the fault is more complex and does not increase monotonically
with depth, so the network's predictive ability on the fault models degrades significantly,
though it can still reflect the trend of the fault. Especially for the multi-fault model,
as shown in Figure 13b, it is difficult to obtain the information of multiple faults. For
the salt body model, the exact position, wave velocity, and shape of the salt body can be
obtained by both methods, while the inversion results of the other layers are still
accurate. However, our method is better than SeisInvNet in terms of the position and shape
of the salt body, as shown in Figure 14. The same conclusion can be drawn from Table 3.
Among the three types of models, the results for the dense-layer models are the best, while
those for the fault models are the worst.
terms of robustness, we evaluate the prediction results when some shots are missing, as
shown in Figure 15. Here we show the inversion results for data with zero, two, four, and
eight missing shots, respectively. Figure 15a is the
ground truth, Figures 15b-15e are the results of our proposed method, and Figures 15f-15i
are the prediction results of SeisInvNet. It can be seen that with several missing shots,
both methods still work well. As more data are missing, the prediction results
of SeisInvNet, especially for the salt body, become more and more inaccurate. However, the
inversion quality of our improved SeisInvNet is still preserved to some extent. The
salt body and interfaces are recovered accurately, and the absence of shots does not affect
the prediction of the position and shape of the salt body. The significance of this test is
that, when acquiring seismic data on the earth's surface, severe terrain
conditions often make it difficult to excite shots at all of the planned positions.
Our method can still obtain acceptable prediction results in the presence of missing shots.
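A missing-shot test of this kind can be set up by blanking whole shot gathers in the input array. Representing a missing shot as a zero-filled gather is an assumption of this sketch (the paper does not say how missing shots are encoded), and the random array only stands in for real seismic data.

```python
import numpy as np

def drop_shots(data, n_missing, rng=None):
    """Return a copy of data [S x R x T] with n_missing randomly chosen shots zeroed."""
    rng = np.random.default_rng() if rng is None else rng
    missing = rng.choice(data.shape[0], size=n_missing, replace=False)
    out = data.copy()
    out[missing] = 0.0                         # assumed encoding of a missing shot
    return out, np.sort(missing)

rng = np.random.default_rng(0)
data = rng.standard_normal((20, 32, 1000))     # stand-in for one data sample

# The four cases of the robustness test: 0, 2, 4, and 8 missing shots.
cases = {n: drop_shots(data, n, rng=np.random.default_rng(n)) for n in (0, 2, 4, 8)}
```

Each entry of `cases` holds the degraded input and the indices of the removed shots, which can then be passed to the trained network in place of the complete data.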
DISCUSSION
In this study, a novel method for designing dense-layer, fault, and salt body models is
developed, and the applicability and generalizability of SeisInvNet are also demonstrated.
Unlike the models previously used for DL-based inversion methods, the proposed velocity
model building workflow can automatically generate various models containing dense layers,
faults, and salt bodies, thus providing realistic model-data pairs to train the DL network.
We consider the variation of stratum fluctuations, layer thickness, and velocity, and
propose an architecture based on the design workflow of the dense-layer model. Similarly,
for the fault model, the range of fault location, tilt angle, and moving direction is selected
and the fault setting function is completed, which adds the faulting structure to the
layer model. Finally, a total of 18,000 models with up to fifteen classes of velocities were
constructed, providing good data-model pairs for deep learning. For SeisInvNet,
owing to its generalizability and applicability, the effective information in the fault
velocity model data can also be extracted by the encoder and generator, and the velocity model is
While analyzing the results, we found that for fault models with steep dips or with faults
at the edge of the observation system, the cause of poor predictions is that the reflected
wave data are difficult to record. As can be seen from Figure 16, the velocity values and
the interface locations are well predicted, but the fault is not recovered. This shows that
the fault can hardly be predicted because the reflected wave information of the fault is
not recorded by the receivers. Although the full waveform information is used, the network
relies more on the strong first reflected wave for prediction. Information such as
diffractions is hard for the network to use effectively. In future research, we will
further consider the problem
To verify the generalization ability of the trained network, we test it on ten- and
twelve-layer models (not included in the training set). As shown in Figure 17a and 17b, for
the ten-layer model, the trained network can still predict accurately and produces clearly
layered results. For the twelve-layer model, because the layers are thin, part of the
interface prediction is inaccurate, but the overall velocity prediction is achieved, as shown
In addition, we make a simple comparison with FWI (as shown in Figure 18). Given a good
initial model, full waveform inversion still achieves better resolution in the shallow
layers. However, building the initial model and the tendency to fall into a local minimum
remain problems for full waveform inversion. Our method does not rely on an initial model
and can retrieve the velocity of each layer accurately and with high computational
efficiency. Therefore, we think that deep learning inversion and full waveform inversion are
not opposed but may be complementary, which will also be our next research focus.
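The dependence of waveform inversion on the initial model can be illustrated with a toy one-trace example. The sketch below is not the FWI used for Figure 18: it assumes a single zero-offset trace, one reflector, an oscillatory Gaussian-windowed wavelet, and a single unknown velocity, and every constant in it is hypothetical. It samples the least-squares misfit over trial velocities and then walks downhill from two different starting velocities.

```python
import math

DT = 0.002      # sample interval (s), hypothetical
NT = 1000       # number of samples (2 s record), hypothetical
DEPTH = 1.0     # reflector depth (km), hypothetical
FREQ = 10.0     # wavelet frequency (Hz), hypothetical
SIGMA = 0.15    # width of the Gaussian envelope (s), hypothetical
V_TRUE = 2.5    # true velocity (km/s), hypothetical

def trace(v):
    """One zero-offset trace: an oscillatory wavelet at two-way time 2 * DEPTH / v."""
    t0 = 2.0 * DEPTH / v
    out = []
    for i in range(NT):
        tau = i * DT - t0
        out.append(math.cos(2.0 * math.pi * FREQ * tau) * math.exp(-(tau / SIGMA) ** 2))
    return out

def misfit(v, observed):
    """Least-squares misfit between the trace modeled with velocity v and the data."""
    return sum((s - d) ** 2 for s, d in zip(trace(v), observed))

observed = trace(V_TRUE)
velocities = [(150 + i) / 100.0 for i in range(251)]   # 1.50 to 4.00 km/s
curve = [misfit(v, observed) for v in velocities]

def local_descent(start_v):
    """Walk downhill on the sampled misfit curve until no neighbour is lower."""
    i = min(range(len(velocities)), key=lambda k: abs(velocities[k] - start_v))
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(curve)]
        best = min(neighbours, key=lambda j: curve[j])
        if curve[best] >= curve[i]:
            return velocities[i]
        i = best

# Count interior points that are lower than both neighbours.
n_minima = sum(
    1 for i in range(1, len(curve) - 1)
    if curve[i] < curve[i - 1] and curve[i] < curve[i + 1]
)
print("local minima on the misfit curve:", n_minima)
print("start at 2.4 km/s ->", local_descent(2.4))
print("start at 3.5 km/s ->", local_descent(3.5))
```

A start close to the true velocity reaches it, while a start more than half a wavelet period away in travel time stops in a neighbouring minimum. This is the cycle-skipping behaviour that makes a local, gradient-type inversion sensitive to its initial model.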
Furthermore, we adjust the network slightly and test it on a small elastic wave dataset.
As shown in Figure 19, the P- and S-wave velocity models are predicted by the network.
Although the elastic wave data are more complex than the acoustic wave data and fewer
training data are available, the improved SeisInvNet still achieves reasonable results.
Considering the lack of training data and that the network structure has not been optimized
for elastic wave data, more in-depth research on elastic wave data will be carried out in
the future.
CONCLUSION
In this study, we developed a model design method and proposed a seismic inversion network
based on SeisInvNet. In our model design scheme, we considered a number of controlling
factors, including the stratum fluctuations, layer thickness, and velocity changes for the
dense layers; the location range and moving direction for the faults; and the intruding
salt body. In total, 18,000 realistic and complex structural models were constructed. With
the data simulated from these models, we assembled a large data-model dataset for training,
validating, and testing deep neural networks. We improve SeisInvNet by introducing more
information from the common-receiver gathers and further enhance the
For the prediction results, the dense-layer, fault, and salt body models are analyzed
separately. We find that the proposed method works successfully for the dense-layer and
salt body models and reasonably for the fault models. The inversion results for the fault
models could be improved by optimizing the data recording system, which we intend
ACKNOWLEDGMENTS
The research is supported by the Joint Program of the National Natural Science Foundation
of China (grant no. U1806226), the National Natural Science Foundation of China
(grant nos. 5179007, 51809155, 61702301), and the starting fund from Zhejiang University.
REFERENCES
Alkhalifah, T., and I. Tsvankin, 1995, Velocity analysis for transversely isotropic media:
Baysal, E., D. D. Kosloff, and J. W. Sherwood, 1983, Reverse time migration: Geophysics,
Bunks, C., F. M. Saleck, S. Zaleski, and G. Chavent, 1995, Multiscale seismic waveform
Chen, Y., H. Chen, K. Xiang, and X. Chen, 2016, Geological structure guided well log
Chen, Y., M. Zhang, M. Bai, and W. Chen, 2019, Improving the signal-to-noise ratio of
Choromanska, A., M. Henaff, M. Mathieu, G. B. Arous, and Y. Lecun, 2014, The loss
Cohen, J. K., and N. Bleistein, 1979, Velocity inversion procedure for acoustic waves: Geo-
Deng, L., D. Yu, et al., 2014, Deep learning: methods and applications: Foundations and
Dunkin, J., and F. Levin, 1973, Effect of normal moveout on a seismic pulse: Geophysics,
Esteva, A., B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, and S. Thrun, 2017,
24
Dermatologist-level classification of skin cancer with deep neural networks: Nature, 542,
115–118.
Furumura, T., B. Kennett, and H. Takenaka, 1998, Parallel 3-D pseudospectral simulation
George, D., and E. Huerta, 2018, Deep learning for real-time gravitational wave detection
and parameter estimation: Results with advanced LIGO data: Physics Letters B, 778,
64–70.
P. Nguyen, and T. N. Sainath, 2012a, Deep neural networks for acoustic modeling in
speech recognition: The shared views of four research groups: IEEE Signal Processing
Hinton, G. E., S. Osindero, and Y.-W. Teh, 2006, A fast learning algorithm for deep belief
preprint arXiv:1207.0580.
Hu, W., A. Abubakar, and T. M. Habashy, 2009a, Joint electromagnetic and seismic inver-
Kawaguchi, K., 2016, Deep learning without poor local minima: Advances in neural infor-
seismic and gravity data for lunar composition and thermal state: Geophysical Journal
25
International, 168, no. 1, 243–258.
Kingma, D. P., and J. Ba, 2014, Adam: A method for stochastic optimization: arXiv
preprint arXiv:1412.6980.
Kosloff, D. D., and E. Baysal, 1982, Forward modeling by a fourier method: Geophysics,
Lailly, P., and J. Bednar, 1983, The seismic inverse problem as a sequence of before stack
Lecun, Y., Y. Bengio, and G. Hinton, 2015, Deep learning: Nature, 521, no. 7553, 436.
Li, S., B. Liu, Y. Ren, Y. Chen, S. Yang, Y. Wang, and P. Jiang, 2020, Deep-learning
inversion of seismic data: IEEE Transactions on Geoscience and Remote Sensing, 58, no.
3, 2135–2149.
Li, S., B. Liu, X. Xu, L. Nie, Z. Liu, J. Song, H. Sun, L. Chen, and K. Fan, 2017, An
Li, W., 2018, Classifying geological structure elements from seismic images using deep
Geophysicists, 4643–4648.
Liu, B., Q. Guo, S. Li, B. Liu, Y. Ren, Y. Pang, L. Liu, and P. Jiang, 2020a, Deep learning
Liu, B., Y. Pang, D. Mao, J. Wang, Z. Liu, N. Wang, S. Liu, and X. Zhang, 2020b, A
26
Bernd, and C. Geoff, 2012, Seismic methods in mineral exploration and mine planning: A
general overview of past and present case histories and a look into the future: Geophysics,
McCulloch, W. S., and W. Pitts, 1943, A logical calculus of the ideas immanent in nervous
Mittet, R., R. Sollie, and K. Hokstad, 1995, Prestack depth migration with compensation
Nurindrawati, F., and J. Sun, 2019, Estimating total magnetization directions using convo-
lutional neural networks, in SEG Technical Program Expanded Abstracts 2019: Society
Pratt, R. G., 1999, Seismic waveform inversion in the frequency domain, part 1: Theory
Pratt, R. G., and R. M. Shipp, 1999, Seismic waveform inversion in the frequency domain,
part 2: Fault delineation in sediments using crosshole data: Geophysics, 64, no. 3, 902–
914.
Pratt, R. G., and M. Worthington, 1990, Inverse theory applied to multi-source cross-hole
3, 287–310.
Puzyrev, V., 2019, Deep learning electromagnetic inversion with convolutional neural net-
Qu, S., E. Verschuur, and Y. Chen, 2019, Full-waveform inversion and joint migration
inversion with an automatic directional total variation constraint: Geophysics, 84, no. 2,
27
R175–R183.
Ren, Y., X. Xu, S. Yang, L. Nie, and Y. Chen, 2020, A physics-based neural-network way
Rosenblatt, F., 1958, The perceptron: a probabilistic model for information storage and
Röth, G., and A. Tarantola, 1994, Neural networks and inversion of seismic data: Journal
Saad, O., and Y. Chen, 2020, Deep denoising autoencoder for seismic random noise atten-
Santurkar, S., D. Tsipras, A. Ilyas, and A. Madry, 2018, How does batch normalization
Shi, Y., X. Wu, and S. Fomel, 2019, Saltseg: Automatic 3D salt segmentation using a deep
Shin, C., and Y. H. Cha, 2008, Waveform inversion in the Laplace domain: Geophysical
Shin, C., and Y. Ho Cha, 2009, Waveform inversion in the Laplace—Fourier domain: Geo-
Simonyan, K., and A. Zisserman, 2014, Very deep convolutional networks for large-scale
Sirgue, L., O. Barkved, J. Dellinger, J. Etgen, U. Albertin, and J. Kommedal, 2010, The-
matic set: Full waveform inversion: The next leap forward in imaging at Valhall: First
28
Break, 28, no. 4, 65–70.
Sun, J., Z. Niu, K. A. Innanen, J. Li, and D. O. Trad, 2020, A theory-guided deep-learning
Tarantola, A., 1984, Inversion of seismic reflection data in the acoustic approximation:
Tsai, K. C., W. Hu, X. Wu, J. Chen, and Z. Han, 2018, First-break automatic picking
with deep semisupervised learning neural network, in SEG Technical Program Expanded
Virieux, J., H. Calandra, and R.-E. Plessix, 2011, A review of the spectral, pseudo-spectral,
for computer vision: A brief review: Computational Intelligence and Neuroscience, 2018,
1–13.
Wang, Z., E. P. Simoncelli, and A. C. Bovik, 2003, Multiscale structural similarity for
Wu, X., L. Liang, Y. Shi, and S. Fomel, 2019, Faultseg3d: using synthetic datasets to train
Wu, Y., and Y. Lin, 2018, InversionNet: A Real-Time and Accurate Full Waveform Inversion
Yang, F., and J. Ma, 2019, Deep-learning inversion: a next generation seismic velocity-
29
model building method: Geophysics, 84, no. 4, 583–599.
Yang, Y., Y. Li, and T. Liu, 2009, 1D viscoelastic waveform inversion for Q structures from
the surface seismic and zero-offset VSP data: Geophysics, 74, no. 6, WCC141–WCC148.
Yu, S., J. Ma, and W. Wang, 2018, Deep learning tutorial for denoising: arXiv preprint
arXiv:1810.11614.
Yuan, P., W. Hu, X. Wu, J. Chen, and H. Van Nguyen, 2019, First arrival picking using
u-net with lovasz loss and nearest point picking method, in SEG Technical Program
Zhang, G., C. Lin, and Y. Chen, 2020, Convolutional neural networks for microseismic
Zhang, G., Z. Wang, and Y. Chen, 2018, Deep learning for seismic lithology prediction:
30
LIST OF TABLES
LIST OF FIGURES
1 The workflow of the deep learning based velocity model inversion algorithm. Through
the proposed velocity model and designed method, a large number of models are provided.
After the observation system is determined, we obtain the seismic data by performing the
wavefield simulation. We use seismic data as input and the velocity model as the label to
train the designed network. After training, given the input seismic data, we can directly
4 The average wave velocity at each depth of dense-layer models in the training set.
8 Comparison between the previous SeisInvNet and the proposed method on the Encoder
part. (a) shows SeisInvNet, which encodes information only from the common-shot gather.
(b) shows the proposed method, which encodes information from
10 The velocity model and its corresponding simulated data. (a) denotes the velocity
model; (b), (c), and (d) represent the recorded data from the 1st, 10th, and 20th shot,
respectively.
11 Loss curves on training set and validation set during the training process. Among
them, the red line is derived by the proposed method, and the blue line is obtained by
the SeisInvNet. (a) denotes mean-squared error for the simulated velocity inversion in the
training set; (b) denotes MSSIM for the simulated velocity inversion in the training set; (c)
denotes mean-squared error for the simulated velocity inversion in the validation set; (d)
denotes MSSIM for the simulated velocity inversion in the validation set.
12 The comparison between the ground truth, prediction results of the proposed
method, and prediction results of SeisInvNet with the velocity curves for the dense-layer
13 The comparison between the ground truth, prediction results of the proposed
method, and prediction results of SeisInvNet with the velocity curves for the fault models
in testing set.
14 The comparison between the ground truth, prediction results of the proposed
method, and prediction results of SeisInvNet with the velocity curves for the salt body
15 The inversion results from data with zero, two, four, and eight missing shots,
respectively. (a) represents the ground truth. (b)-
(e) represent the results of our proposed method. (f)-(i) represent the prediction results of
SeisInvNet.
16 Prediction results for faults with large dips and faults close to the model boundary.
17 Network generalization ability test. (a) and (c) represent 10 and 12 layer velocity
models, respectively; (b) and (d) represent the prediction results of the proposed method,
respectively.
18 Comparison between the proposed method and full waveform inversion. (a) represents
the true velocity model; (b) represents the initial model of the full waveform inversion;
(c) represents the full waveform inversion result; (d) represents the prediction result of the
proposed method.
19 Preliminary test results of elastic data inversion problem. (a), (b) and (c) represent
the received data from the 1st, 10th, and 20th shot, respectively; (d) and (e) represent
P-wave velocity model and S-wave velocity model, respectively; (f) and (g) represent the
Figure 1: The workflow of the deep learning based velocity model inversion algorithm.
Through the proposed velocity model and designed method, a large number of models
are provided. After the observation system is determined, we obtain the seismic data by
performing the wavefield simulation. We use seismic data as input and the velocity model
as the label to train the designed network. After training, given the input seismic data, we
Figure 2: Multiple interfaces made up of y.
Figure 3: Curves and the corresponding models in different construction ways.
[Plot: velocity (km/s), 1.5 to 4.0, versus depth (km), 0 to 1.0; curves for the 5-, 6-, 7-, 8-, and 9-layer models]
Figure 4: The average wave velocity at each depth of dense-layer models in the training set.
[Figure panels: (a) Layer model; (b) Fault movement; (c) Fault model]
Figure 6: Salt body model design workflow.
Figure 7: Samples of the designed dense-layer, fault, and salt body velocity models.
Figure 8: Comparison between the previous SeisInvNet and the proposed method on the
Encoder part. (a) shows SeisInvNet, which encodes information only from the common-shot
gather. (b) shows the proposed method, which encodes information from
Figure 9: The proposed method used for model prediction.
Figure 10: The velocity model and its corresponding simulated data. (a) denotes the
velocity model; (b), (c), and (d) represent the recorded data from the 1st, 10th, and 20th
shot, respectively.
[Plot panels (a)-(d): loss versus epochs, 0 to 200; legend: The proposed method, SeisInvNet]
Figure 11: Loss curves on training set and validation set during the training process. Among
them, the red line is derived by the proposed method, and the blue line is obtained by the
SeisInvNet. (a) denotes mean-squared error for the simulated velocity inversion in the
training set; (b) denotes MSSIM for the simulated velocity inversion in the training set; (c)
denotes mean-squared error for the simulated velocity inversion in the validation set; (d)
denotes MSSIM for the simulated velocity inversion in the validation set.
Figure 12: The comparison between the ground truth, prediction results of the proposed
method, and prediction results of SeisInvNet with the velocity curves for the dense-layer
Figure 13: The comparison between the ground truth, prediction results of the proposed
method, and prediction results of SeisInvNet with the velocity curves for the fault models
in testing set.
Figure 14: The comparison between the ground truth, prediction results of the proposed
method, and prediction results of SeisInvNet with the velocity curves for the salt body
Figure 15: The inversion results from data with zero, two, four, and eight missing shots,
respectively. (a) represents the ground truth. (b)-
(e) represent the results of our proposed method. (f)-(i) represent the prediction results of
SeisInvNet.
Figure 16: Prediction results for faults with large dips and faults close to the model boundary.
Figure 17: Network generalization ability test. (a) and (c) represent 10 and 12 layer velocity
models, respectively; (b) and (d) represent the prediction results of the proposed method,
respectively.
Figure 18: Comparison between the proposed method and full waveform inversion. (a)
represents the true velocity model; (b) represents the initial model of the full waveform
inversion; (c) represents the full waveform inversion result; (d) represents the prediction
result of the proposed method.
Figure 19: Preliminary test results of elastic data inversion problem. (a), (b) and (c)
represent the received data from the 1st, 10th, and 20th shot, respectively; (d) and (e)
represent P-wave velocity model and S-wave velocity model, respectively; (f) and (g) represent the
Hyperparameter    Value
Batch size        36
Learning rate
Epochs            200
Loss                       The proposed method    SeisInvNet
L_MSE in validation set    2.659×10^-4            2.698×10^-4
Metric Dense-layer model Fault model Salt body model