Semi-supervised Learning for Acoustic Impedance Inversion

Motaz Alfarraj; Ghassan AlRegib

arXiv:1905.13412·eess.IV·June 3, 2019

TL;DR

This paper introduces a semi-supervised deep learning framework for acoustic impedance inversion in seismic data, effectively reducing the need for labeled data while maintaining high accuracy.

Contribution

It presents a novel semi-supervised neural network approach combining convolutional and recurrent layers for seismic inversion, leveraging well logs and a learned forward model as constraints.

Findings

1. Achieves 98% correlation between estimated and true impedance.
2. Uses only 20 labeled traces for training with high accuracy.
3. Incorporates geophysical constraints via a learned seismic forward model.


Tables

Table 1: Quantitative evaluation of the estimated AI.

Metric	Training	Validation
PCC	0.9836	0.9809
$r^{2}$	0.9466	0.9422

Equations

$$d=\mathcal{F}(m)+n \tag{1}$$

$$\hat{m}=\underset{m\in X}{\arg\min}\left[\mathcal{H}\left(\mathcal{F}(m),d\right)+\lambda\,\mathcal{C}(m)\right] \tag{2}$$

$$\mathcal{F}_{\Theta}^{\dagger}(d)\approx m \tag{3}$$

$$\mathcal{L}(\Theta):=\mathcal{D}\left(\hat{m},\mathcal{F}_{\Theta}^{\dagger}(d)\right) \tag{4}$$

$$\mathcal{L}(\Theta):=\mathcal{D}\left(\mathcal{F}\left(\mathcal{F}_{\Theta}^{\dagger}(d)\right),d\right) \tag{5}$$

$$\mathcal{L}(\Theta_{1},\Theta_{2}):=\alpha\cdot\underbrace{\mathcal{D}\left(\hat{m},\mathcal{F}_{\Theta_{1}}^{\dagger}(d)\right)}_{\text{property loss}}+\beta\cdot\underbrace{\mathcal{D}\left(\mathcal{F}_{\Theta_{2}}\left(\mathcal{F}_{\Theta_{1}}^{\dagger}(d)\right),d\right)}_{\text{seismic loss}} \tag{6}$$

$$\mathcal{L}(\Theta_{1},\Theta_{2}):=\underbrace{\frac{\alpha}{N_{p}}\left\|\hat{m}-\mathcal{F}_{\Theta_{1}}^{\dagger}(d)\right\|_{2}^{2}}_{\text{mean property loss}}+\underbrace{\frac{\beta}{N_{s}}\left\|d-\mathcal{F}_{\Theta_{2}}\left(\mathcal{F}_{\Theta_{1}}^{\dagger}(d)\right)\right\|_{2}^{2}}_{\text{mean seismic loss}} \tag{7}$$

Full text

Citation

M. Alfarraj and G. AlRegib, "Semi-supervised Learning for Acoustic Impedance Inversion," SEG Technical Program Expanded Abstracts 2019, Society of Exploration Geophysicists.

Review

Accepted on: 23 May 2019

Data and Codes

[GitHub Link]

Bib

@incollection{alfarraj2019semisupervised,
  title={Semi-supervised Learning for Acoustic Impedance Inversion},
  author={Alfarraj, Motaz and AlRegib, Ghassan},
  booktitle={SEG Technical Program Expanded Abstracts},
  year={2019},
  publisher={Society of Exploration Geophysicists}
}

Contact

[email protected] OR [email protected]

http://ghassanalregib.com/

Semi-supervised Learning for Acoustic Impedance Inversion

Abstract

Recent applications of deep learning in the seismic domain have shown great potential in different areas such as inversion and interpretation. Deep learning algorithms, in general, require tremendous amounts of labeled data to train properly. To overcome this issue, we propose a semi-supervised framework for acoustic impedance inversion based on convolutional and recurrent neural networks. Specifically, seismic traces and acoustic impedance traces are modeled as time series. Then, a neural-network-based inversion model comprising convolutional and recurrent neural layers is used to invert seismic data for acoustic impedance. The proposed workflow uses well log data to guide the inversion. In addition, it utilizes a learned seismic forward model to regularize the training and to serve as a geophysical constraint for the inversion. The proposed workflow achieves an average correlation of $98\%$ between the estimated and target acoustic impedance using 20 AI traces for training.

1 Introduction

Seismic inversion is the process of estimating rock properties from seismic reflection data. In principle, inversion is a procedure to infer true model parameters $m\in X$ through indirect measurements $d\in Y$. Mathematically, the problem can be formulated as follows:

$$d=\mathcal{F}(m)+n \tag{1}$$

where $\mathcal{F}:X\rightarrow Y$ is a forward operator, $d$ is the measured data, $m$ is the model, and $n\in Y$ is a random variable that represents noise in the measurements. To estimate the model from the measured data, one needs to solve an inverse problem. The solution depends on the nature of the forward model and observed data. In the case of seismic inversion, and due to the non-linearity and heterogeneity of the subsurface, the inverse problem is ill-posed. In order to find a stable solution to an ill-posed problem, the problem needs to be regularized. For instance, one can seek a solution by imposing constraints on the solution space, or by incorporating prior knowledge about the model. A classical approach to solve inverse problems is to set up the problem as a Bayesian inference problem, and improve prior knowledge by optimizing for a cost function based on the data likelihood,

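$$\hat{m}=\underset{m\in X}{\arg\min}\left[\mathcal{H}\left(\mathcal{F}(m),d\right)+\lambda\,\mathcal{C}(m)\right] \tag{2}$$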

where $\hat{m}$ is the estimated model, $\mathcal{H}:Y\times Y\rightarrow\mathbb{R}$ is an affine transform of the data likelihood, $\mathcal{C}:X\rightarrow\mathbb{R}$ is a regularization function that incorporates prior knowledge in the inversion, and $\lambda$ is a regularization parameter that controls the influence of the regularization function.

The solution of equation 2 in seismic inversion can be sought in a stochastic or a deterministic fashion through an optimization routine. The literature on seismic inversion is rich in methods to formulate, regularize, and solve the problem (e.g., [14, 13, 15, 24, 7, 16, 23]).

Recently, there have been several successful applications of machine learning and deep learning methods in inverse problems [18]. Moreover, machine learning and deep learning methods have been utilized in the seismic domain for different tasks such as inversion and interpretation [5]. For example, seismic inversion has been attempted using supervised-learning algorithms such as support vector regression (SVR) [2], artificial neural networks [22, 6], committee models [17], convolutional neural networks (CNNs) [12], recurrent neural networks [4], and many other methods [8, 27, 17, 9, 20, 10].

In general, machine learning algorithms are used to learn a non-linear mapping parameterized by $\Theta\in Z$ , i.e., $\mathcal{F}_{\Theta}^{\dagger}:Y\rightarrow X$ from a set of examples (known as the training dataset) such that:

$$\mathcal{F}_{\Theta}^{\dagger}(d)\approx m \tag{3}$$

There is one key difference between classical inversion methods and machine learning methods. In classical inversion, the outcome is a set of model parameters (deterministic) or a posterior probability density function (stochastic). Learning methods, on the other hand, produce a mapping from the measurement domain to the model-parameter domain ( $\mathcal{F}^{\dagger}_{\Theta}$ ).

Using neural networks, one can learn $\mathcal{F}^{\dagger}_{\Theta}$ (in equation 3) using different learning schemes such as supervised or unsupervised learning [1]. In supervised learning, the machine learning algorithm is given a set of measurement-model pairs $\{d,m\}$ (e.g., seismic traces and their corresponding rock property traces from well logs) to learn the mapping by minimizing the following loss function:

$$\mathcal{L}(\Theta):=\mathcal{D}\left(\hat{m},\mathcal{F}_{\Theta}^{\dagger}(d)\right) \tag{4}$$

where $\mathcal{D}$ is a distance measure that compares the estimated rock property to the true property. Namely, supervised machine learning algorithms seek a solution that minimizes the inversion error over the given measurement-model pairs. There are many challenges that might prevent supervised machine learning algorithms from finding a proper mapping that can be generalized beyond the training dataset. One of the challenges is the lack of labeled data from a given survey area on which a model can be trained. For this reason, such algorithms must have a limited number of learnable parameters (i.e., shallow neural networks) and good regularization methods in order to prevent over-fitting and to be able to generalize well [4].

Alternatively, a solution of the inverse problem can be sought in an unsupervised-learning scheme where the learning algorithm is given only a set of measurements $d$ and a forward model $\mathcal{F}$. The algorithm then learns by minimizing the data misfit:

$$\mathcal{L}(\Theta):=\mathcal{D}\left(\mathcal{F}\left(\mathcal{F}_{\Theta}^{\dagger}(d)\right),d\right) \tag{5}$$

Such a formulation does not integrate well log data directly in the learning process. Furthermore, the forward model and its parameters must be chosen carefully to result in a reasonable inversion.

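In this work, we propose a semi-supervised machine-learning approach to seismic inversion that integrates the well log (property) misfit in addition to the seismic data misfit. Semi-supervised learning enables the use of deep learning to seek better inversion without the large amounts of labeled data often required in supervised deep learning schemes. Formally, the loss function of the proposed workflow is written as

$$\mathcal{L}(\Theta_{1},\Theta_{2}):=\alpha\cdot\underbrace{\mathcal{D}\left(\hat{m},\mathcal{F}_{\Theta_{1}}^{\dagger}(d)\right)}_{\text{property loss}}+\beta\cdot\underbrace{\mathcal{D}\left(\mathcal{F}_{\Theta_{2}}\left(\mathcal{F}_{\Theta_{1}}^{\dagger}(d)\right),d\right)}_{\text{seismic loss}} \tag{6}$$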

where $\mathcal{F}^{\dagger}_{\Theta_{1}}$ is a learned inverse model parameterized by $\Theta_{1}$ and $\mathcal{F}_{\Theta_{2}}$ is a learned forward model parameterized by $\Theta_{2}$ . In addition, $\alpha,\beta\in\mathbb{R}$ are tuning parameters that govern the influence of each of the property loss and seismic loss, respectively.

2 Methodology

The proposed workflow shown in Figure 1 consists of two main modules: the inverse model ( $\mathcal{F}^{\dagger}_{\Theta_{1}}$ ) and the forward model ( $\mathcal{F}_{\Theta_{2}}$ ), both of which have learnable parameters. The inverse model takes zero-offset seismic traces as inputs and outputs the best estimate of the corresponding AI. Then, the forward model is used to synthesize seismograms from the estimated AI. The error (data misfit) is computed between the synthesized seismogram and the input seismic traces using the seismic loss module for all traces in the survey. Furthermore, property loss is computed between estimated and true AI on traces for which we have a true AI from well logs. The parameters of both the inverse model and forward model are adjusted by combining both losses as in equation 6.

In this work, we chose the distance measure ( $\mathcal{D}$ ) as the Mean Squared Error (MSE). Hence, equation 6 reduces to:

$$\mathcal{L}(\Theta_{1},\Theta_{2}):=\underbrace{\frac{\alpha}{N_{p}}\left\|\hat{m}-\mathcal{F}_{\Theta_{1}}^{\dagger}(d)\right\|_{2}^{2}}_{\text{mean property loss}}+\underbrace{\frac{\beta}{N_{s}}\left\|d-\mathcal{F}_{\Theta_{2}}\left(\mathcal{F}_{\Theta_{1}}^{\dagger}(d)\right)\right\|_{2}^{2}}_{\text{mean seismic loss}} \tag{7}$$

where $N_{s}$ is the total number of seismic traces in the survey, and $N_{p}$ is the number of available well logs from which AI traces are obtained. In seismic surveys, $N_{p}\ll N_{s}$; therefore, the seismic loss is computed over many more traces than the property loss. On the other hand, the property loss has access to direct high-resolution model parameters (well log data). To ensure stable learning, $\alpha$ and $\beta$ are chosen to balance learning from the two terms of the loss function. In this work, we chose $\alpha=0.2$ and $\beta=1$.
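As a minimal sketch of how the two MSE terms described above combine, the following pure-Python function evaluates the weighted sum for nested lists of trace samples. The names `squared_error` and `semi_supervised_loss`, the argument names, and the toy values are illustrative assumptions and not the authors' implementation; only the weights $\alpha=0.2$ and $\beta=1$ come from the text.

```python
def squared_error(a, b):
    """Sum of squared sample differences between two sets of traces."""
    return sum((x - y) ** 2
               for trace_a, trace_b in zip(a, b)
               for x, y in zip(trace_a, trace_b))


def semi_supervised_loss(ai_true, ai_est, seis_obs, seis_syn,
                         alpha=0.2, beta=1.0):
    """Weighted sum of the mean property loss and the mean seismic loss.

    ai_true, ai_est    : the N_p traces with well-log AI and their estimates
    seis_obs, seis_syn : the N_s observed traces and their re-synthesized versions
    Each term is normalized by its own trace count (N_p or N_s).
    """
    n_p, n_s = len(ai_true), len(seis_obs)
    property_loss = squared_error(ai_true, ai_est) / n_p
    seismic_loss = squared_error(seis_obs, seis_syn) / n_s
    return alpha * property_loss + beta * seismic_loss


# Toy example (hypothetical numbers): one labeled trace, two seismic traces.
loss = semi_supervised_loss(
    ai_true=[[1.0, 4.0]], ai_est=[[1.0, 2.0]],
    seis_obs=[[0.0, 0.0], [1.0, 1.0]], seis_syn=[[0.0, 1.0], [1.0, 1.0]],
)
```

In this toy case the property term is $0.2\cdot 4/1$ and the seismic term is $1\cdot 1/2$, which shows how the per-term normalization keeps the many unlabeled traces from swamping the few labeled ones.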

2.1 Inverse Model

The inverse model in the proposed workflow consists of four main submodules (shown in Figure 2): sequence modeling, local pattern analysis, upsampling, and regression. Each of the four submodules performs a different task in the overall inversion model.

2.1.1 Sequence Modeling

The sequence modeling submodule consists of a series of Gated Recurrent Units (GRU) [11]. GRUs model their inputs as sequential data and compute temporal features based on the temporal variations of the input traces. In addition, they compute a state variable from future and past predictions that serves as a memory. The series of three GRUs in the sequence modeling submodule is equivalent to a 3-layer deep GRU. Deeper networks are able to model complex input-output relationships that shallow networks might not capture. Moreover, deep GRUs generally produce smooth outputs. Hence, the output of the sequence modeling submodule is considered as the low-frequency trend of AI.
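To make the role of the state variable concrete, here is a minimal sketch of a single-unit GRU scanned over a trace in both directions. It is a scalar toy under stated assumptions, not the 3-layer GRU of the paper: the weight names in `demo_weights`, their values, and the helper names `gru_scan` and `bidirectional_gru` are hypothetical, and the state update uses the convention in which the update gate `z` keeps the previous state.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_scan(trace, w, h0=0.0):
    """Run a single-unit GRU over a 1-D trace and return all hidden states."""
    h, states = h0, []
    for x in trace:
        z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])  # update gate
        r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])  # reset gate
        # Candidate state: the reset gate scales how much past state is used.
        h_tilde = math.tanh(w["wh"] * x + w["uh"] * (r * h) + w["bh"])
        # Blend the previous state and the candidate state.
        h = z * h + (1.0 - z) * h_tilde
        states.append(h)
    return states


def bidirectional_gru(trace, w_fwd, w_bwd):
    """Pair forward-in-time and backward-in-time states for every sample."""
    fwd = gru_scan(trace, w_fwd)
    bwd = gru_scan(trace[::-1], w_bwd)[::-1]
    return list(zip(fwd, bwd))


# Hypothetical weights, chosen only so the example runs.
demo_weights = {"wz": 0.5, "uz": 0.5, "bz": 0.0,
                "wr": 0.5, "ur": 0.5, "br": 0.0,
                "wh": 1.0, "uh": 1.0, "bh": 0.0}
features = bidirectional_gru([0.2, -0.1, 0.4, 0.0], demo_weights, demo_weights)
```

Each output sample carries one state computed from earlier samples and one computed from later samples, which is what allows a bidirectional layer to use context on both sides of a sample.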

2.1.2 Local Pattern Analysis

The local pattern analysis submodule consists of a set of 1-dimensional convolutional blocks with different dilation factors in parallel. The output features of the parallel convolutional blocks are then combined using another convolutional block. Dilation refers to the spacing between convolution kernel points in the convolutional layers [26]. Using multiple dilation factors extracts multiscale features by incorporating information from trace samples that are direct neighbors of a reference sample (i.e., the center sample), in addition to samples that are further from it. A convolutional block (ConvBlock) in Figure 2 consists of a convolutional layer followed by group normalization [25] and an activation function. In this work, we chose the hyperbolic tangent function as the activation function.
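The effect of the dilation factor can be illustrated with a small pure-Python sketch. The function `dilated_conv1d`, the fixed all-ones kernel, and the toy trace are assumptions made for illustration; in the paper the kernels are learned and each block also applies group normalization and an activation.

```python
def dilated_conv1d(x, kernel, dilation):
    """Same-length 1-D dilated convolution (cross-correlation form).

    Assumes an odd-length kernel; samples outside the trace count as zero.
    """
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k * dilation - half  # input index touched by kernel tap k
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


trace = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 1.0, 1.0]
# Parallel branches: the same kernel size covers a wider window as dilation grows.
features = {d: dilated_conv1d(trace, kernel, d) for d in (1, 2)}
# features[1] == [3.0, 6.0, 9.0, 12.0, 9.0]  (direct neighbors)
# features[2] == [4.0, 6.0, 9.0, 6.0, 8.0]   (neighbors two samples away)
```

With dilation 1 each output mixes a sample with its direct neighbors, while with dilation 2 the same three-tap kernel reaches samples two steps away, which is the multiscale behavior described above.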

Convolutional layers operate on small windows of the input trace. Therefore, they mostly capture high-frequency trends in traces. However, since convolutional layers do not have a state variable like recurrent layers, they do not capture low-frequency trends. Hence, the outputs of the local pattern analysis and sequence modeling submodules are added to obtain full-band frequency content.

2.1.3 Upsampling

The upsampling submodule is used to compensate for the resolution mismatch between seismic data and well log data. Deconvolutional layers (also known as transposed convolutional or fractionally-strided convolutional layers) are upsampling modules with learnable kernel parameters unlike classical interpolation methods with fixed kernel parameters (e.g., linear interpolation). In addition, the stride controls the factor by which the inputs are upsampled. For example, a deconvolutional layer with a stride of ( $s=2$ ) produces an output that has twice the number of the input samples (vertically). Deconvolutional layers have been used for various applications like semantic segmentation and seismic structure labeling [21, 3].

A deconvolutional block (DeconvBlock) in Figure 2 has a structure similar to that of the convolutional blocks introduced earlier: a deconvolutional layer followed by group normalization and an activation function.
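The stride-controlled upsampling can be sketched with a plain 1-D transposed convolution. The function name `conv_transpose1d` and the fixed kernel `[1.0, 1.0]` are assumptions for illustration; with that kernel the layer reduces to nearest-neighbor upsampling, whereas the deconvolutional layers described above learn their kernel values.

```python
def conv_transpose1d(x, kernel, stride):
    """1-D transposed convolution without padding.

    Every input sample scatters a scaled copy of the kernel into the output,
    and consecutive copies are placed `stride` samples apart.
    """
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for k, w in enumerate(kernel):
            out[i * stride + k] += value * w
    return out


coarse = [1.0, 2.0, 3.0]
fine = conv_transpose1d(coarse, kernel=[1.0, 1.0], stride=2)
# fine == [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]: twice as many samples as the input
```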

2.1.4 Regression

The final submodule in the inverse model is regression, which consists of a GRU followed by a linear mapping layer (fully-connected layer). The role of this module is to regress the extracted features from the other modules to the target domain (AI domain). The GRU in this module is a simple 1-layer GRU that augments the interpolated outputs of the upsampling submodule using global temporal features. Finally, a linear affine transformation layer (fully-connected layer) takes the output features from the GRU and maps them to AI values.

2.2 Forward Model

The role of the forward model is to synthesize seismograms from AI. Forward modeling is commonly used in classical inversion approaches. However, in this work, we use a neural network to learn an appropriate forward model from the data. We used a simple 2-layer CNN to compute features from the AI traces, followed by a single convolutional layer that resembles a wavelet convolution in forward modeling. One of the advantages of using a learned forward model is that it automatically extracts the wavelet from the data.
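For comparison, the classical wavelet convolution that the final layer of the learned forward model resembles can be sketched as follows. This is the textbook convolutional model (reflectivity from impedance contrasts, convolved with a wavelet), not the learned network of the paper; the Ricker wavelet, its 25 Hz peak frequency, the 4 ms sampling interval, and all function names are assumptions chosen for illustration.

```python
import math


def reflectivity(ai):
    """Normal-incidence reflection coefficients from an AI trace."""
    return [(ai[i + 1] - ai[i]) / (ai[i + 1] + ai[i])
            for i in range(len(ai) - 1)]


def ricker(peak_freq, dt, n_samples):
    """Ricker wavelet with an odd number of samples, centered at t = 0."""
    half = n_samples // 2
    wavelet = []
    for i in range(-half, half + 1):
        a = (math.pi * peak_freq * i * dt) ** 2
        wavelet.append((1.0 - 2.0 * a) * math.exp(-a))
    return wavelet


def convolve_same(signal, kernel):
    """Discrete convolution cropped to the length of `signal`."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - k + half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def synthesize(ai, wavelet):
    """Classical forward model: reflectivity convolved with a wavelet."""
    return convolve_same(reflectivity(ai), wavelet)


wavelet = ricker(peak_freq=25.0, dt=0.004, n_samples=5)
seismogram = synthesize([2.0, 2.0, 6.0, 6.0], wavelet)  # single interface
```

In the classical setting the wavelet has to be chosen or estimated beforehand, whereas the learned forward model described above infers it from the data.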

3 Case Study

In order to validate the proposed algorithm, we chose the Marmousi 2 model [19] (converted to time) as a case study. The Marmousi 2 model is an extension of the original Marmousi synthetic model, which has been used in numerous geophysical studies for applications including seismic inversion, seismic modeling, and seismic imaging. The model spans 17 km in width and 3.5 km in depth with a vertical resolution of 1.25 m.

3.1 Training The Models

To train the proposed inversion workflow, we chose $20$ evenly-spaced traces for training ( $N_{p}=20$ ). For those training traces, we assume we have access to both AI and seismic data. For all remaining traces in the survey ( $N_{s}=2721$ traces), we assume we have access to seismic data only.

First, the inverse and forward models are initialized with random parameters. Then, randomly chosen seismic traces, together with the seismic traces for which we have AI traces in the training dataset, are fed to the inverse model to obtain a corresponding set of AI traces. The forward model is then used to synthesize seismograms from the estimated AI. Seismic loss is computed as the MSE between the synthesized and input seismic traces. Property loss is computed as the MSE between the predicted AI and the true AI on the training traces only. The total loss is computed as a weighted sum of the two losses. Then, the gradients of the total loss are computed, and the parameters of the inverse model are updated accordingly. The process is repeated until convergence.
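The way each training batch mixes labeled and unlabeled traces can be sketched as below. The helper names, the integer spacing rule, the batch size of 10 unlabeled traces, and the use of 2721 as the trace count are all assumptions for illustration; the text only states that 20 evenly spaced traces are labeled and that randomly chosen seismic traces are added to them.

```python
import random


def evenly_spaced_indices(n_total, n_select):
    """Pick n_select trace indices spread evenly from the first to the last."""
    return [(i * (n_total - 1)) // (n_select - 1) for i in range(n_select)]


def make_batch(n_total, labeled, n_unlabeled, rng):
    """Combine all labeled traces with randomly drawn unlabeled traces.

    Property loss would use only the labeled part of the batch, while
    seismic loss would use every trace in it.
    """
    labeled_set = set(labeled)
    pool = [i for i in range(n_total) if i not in labeled_set]
    return list(labeled) + rng.sample(pool, n_unlabeled)


labeled = evenly_spaced_indices(n_total=2721, n_select=20)
batch = make_batch(2721, labeled, n_unlabeled=10, rng=random.Random(0))
```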

3.2 Results and Discussion

Figure 3 shows estimated AI and true AI for the entire section. The shown predicted AI is the direct output of the inversion workflow with no post-processing. The jitter effect visible in the predicted AI is expected since the proposed workflow is based on 1-dimensional modeling with no explicit spatial constraints as often done in classical inversion methods.

The traces around $x=3400$ m pass through an anomaly (a gas-charged sand channel) represented by an isolated and sudden transition in AI at $1.25$ ms. This anomaly causes the inverse model to incorrectly estimate AI. Since our workflow is based on bidirectional sequence modeling, we expect the error to propagate to nearby samples in both directions. However, the algorithm quickly recovers a good estimate for deeper and shallower samples of the trace. This quick recovery is mainly due to the reset-gate variable in the GRU that limits the propagation of such errors in sequential data estimation.

Furthermore, we show a scatter plot of the estimated and true AI in Figure 4. The shaded region includes all points that are within one standard deviation of the true AI ( $\sigma_{\text{AI}}$ ). The scatter plot shows a linear correlation between the estimated and true AI with the majority of the estimated samples within $\pm\sigma_{\text{AI}}$ from the true AI.

To evaluate the performance of the proposed workflow quantitatively, we use two metrics that are commonly used for regression analysis, namely the Pearson correlation coefficient (PCC) and the coefficient of determination ( $r^{2}$ ). PCC is a measure of the linear correlation between the estimated and target traces. It is commonly used to measure the overall fit between the two traces. On the other hand, $r^{2}$ is a goodness-of-fit measure that takes into account the mean squared error between the two traces. The quantitative results are computed over the training traces and for all traces in the survey that were not used in the training (validation data). The results are summarized in Table 1.
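The difference between the two metrics can be seen in a small pure-Python sketch. It assumes the standard definitions (PCC as normalized covariance, and $r^{2}=1-SS_{\text{res}}/SS_{\text{tot}}$); the function names and the toy traces are illustrative and not taken from the paper.

```python
import math


def pearson_cc(est, true):
    """Pearson correlation coefficient between two equal-length traces."""
    mean_e = sum(est) / len(est)
    mean_t = sum(true) / len(true)
    cov = sum((e - mean_e) * (t - mean_t) for e, t in zip(est, true))
    var_e = sum((e - mean_e) ** 2 for e in est)
    var_t = sum((t - mean_t) ** 2 for t in true)
    return cov / math.sqrt(var_e * var_t)


def r_squared(est, true):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(true) / len(true)
    ss_res = sum((t - e) ** 2 for e, t in zip(est, true))
    ss_tot = sum((t - mean_t) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot


true_ai = [2.0, 4.0, 6.0]
scaled_est = [1.0, 2.0, 3.0]   # correct shape, wrong amplitude
pcc = pearson_cc(scaled_est, true_ai)   # 1.0: perfectly correlated
r2 = r_squared(scaled_est, true_ai)     # -0.75: penalized for the amplitude error
```

An estimate with the right shape but the wrong amplitude still reaches a PCC of 1, while $r^{2}$ drops sharply, which is why reporting both gives a fuller picture of the inversion quality.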

The results in Table 1 show that the performance of the proposed workflow on unseen data (validation) is very close to its performance on the training data, which indicates its generalizability beyond the training data.

4 Conclusion

In this work, we proposed an innovative semi-supervised machine learning workflow for acoustic impedance (AI) inversion from zero-offset seismic data. The proposed workflow was validated on the Marmousi 2 model. Although the training was carried out on a small number of AI traces, the proposed workflow was able to estimate AI for the entire Marmousi 2 model with an average correlation of $98\%$. The application of the proposed workflow is not limited to AI inversion; it can be easily extended to perform full elastic inversion as well as property estimation for reservoir characterization.

5 Acknowledgements

This work is supported by the Center for Energy and Geo Processing (CeGP) at Georgia Institute of Technology and King Fahd University of Petroleum and Minerals (KFUPM).

Bibliography

[1] Jonas Adler and Ozan Öktem. Solving ill-posed inverse problems using iterative deep neural networks. Inverse Problems, 33(12):124007, 2017.
[2] AF Al-Anazi and ID Gates. Support vector regression to predict porosity and permeability: effect of sample size. Computers & Geosciences, 39:64–76, 2012.
[3] Yazeed Alaudah, Shan Gao, and Ghassan AlRegib. Learning to label seismic structures with deconvolution networks and weak labels. In SEG Technical Program Expanded Abstracts 2018, pages 2121–2125. Society of Exploration Geophysicists, 2018.
[4] Motaz Alfarraj and Ghassan AlRegib. Petrophysical property estimation from seismic data using recurrent neural networks. In SEG Technical Program Expanded Abstracts 2018, pages 2141–2146. Society of Exploration Geophysicists, 2018.
[5] Ghassan AlRegib, Mohamed Deriche, Zhiling Long, Haibin Di, Zhen Wang, Yazeed Alaudah, Muhammad Amir Shafiq, and Motaz Alfarraj. Subsurface structure analysis using computational interpretation and learning: A visual signal processing perspective. IEEE Signal Processing Magazine, 35(2):82–98, 2018.
[6] Mauricio Araya-Polo, Joseph Jennings, Amir Adler, and Taylor Dahlke. Deep-learning tomography. The Leading Edge, 37(1):58–66, 2018.
[7] Arild Buland and Henning Omre. Bayesian linearized AVO inversion. Geophysics, 68(1):185–198, 2003.
[8] Soumi Chaki, Aurobinda Routray, and William K Mohanty. A novel preprocessing scheme to improve the prediction of sand fraction from seismic attributes using neural networks. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 8(4):1808–1820, 2015.