Data-driven methods for the initialization of full-waveform inversion

Ph.D. thesis defense by Oleg Ovcharenko
Physical Science & Engineering

Earth Science and Engineering Program

KAUST

15.11.2021, Thuwal, Saudi Arabia
Advisor: Daniel Peter (KAUST)

Committee:
Martin Mai (Chair, KAUST)
Tariq Alkhalifah (KAUST)

Xiangliang Zhang (UT ND)

Sergey Fomel (UT Austin)

Outline
Salt flooding by variance-based interpolation
Low-frequency extrapolation by deep learning
Multi-task learning for data and model recovery

Hydrocarbon exploration and extraction
https://www.semanticscholar.org/paper/Machine-learning-applications-to-geophysical-data-Bougher/91cd3152a60ddf41911dd2a0b876718f89555633/figure/12
https://www.arabianbusiness.com/saudi-aramco-cut-drilling-costs-hold-rig-count-steady-579284.html
https://www.oedigital.com/news/474771-saudi-aramco-ades-in-rig-contract-extension
Motivation

Seismic data
Signal is band-limited by hardware and noise
• Source energy
• Sensitivity of receivers
Frequencies < 4 Hz are generally considered low
Industry advances hardware
Airgun source
https://commons.wikimedia.org/wiki/File:Streamer-detail_hg.jpg
https://archive.epa.gov/esd/archive-geophysics/web/html/marine_seismic_methods.html
(Chelminski et al., 2021)
Common-shot-gather
Towed streamer

Full-waveform inversion (FWI)
1. Initial guess: init model parameters
2. Numerical modeling: calculated data
3. Misfit calculation: observed vs. calculated data
4. Gradient w.r.t. model parameters
5. Update model parameters, then repeat from step 2
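The FWI loop on this slide can be sketched with a toy one-parameter problem. This is only an illustration under stated assumptions, not the solver used in the thesis: the "subsurface model" is a single slowness value, the "numerical modeling" is a monochromatic arrival delayed by offset times slowness, and the names forward, misfit_and_gradient and toy_fwi are hypothetical.

```python
import numpy as np

def forward(slowness, freq, t, offset):
    """Numerical modeling (toy): monochromatic arrival delayed by offset * slowness."""
    return np.sin(2.0 * np.pi * freq * (t - offset * slowness))

def misfit_and_gradient(slowness, d_obs, freq, t, offset):
    """Least-squares misfit and its derivative w.r.t. the single model parameter."""
    phase = 2.0 * np.pi * freq * (t - offset * slowness)
    residual = np.sin(phase) - d_obs                       # calculated - observed
    d_calc_ds = -2.0 * np.pi * freq * offset * np.cos(phase)
    misfit = 0.5 * np.mean(residual ** 2)
    gradient = np.mean(residual * d_calc_ds)
    return misfit, gradient

def toy_fwi(s_init, d_obs, freq, t, offset, n_iter=100):
    """Init the model, then repeat: modeling -> misfit -> gradient -> update."""
    s = s_init
    step = 1.0 / (2.0 * np.pi * freq * offset) ** 2        # conservative fixed step
    history = []
    for _ in range(n_iter):
        misfit, grad = misfit_and_gradient(s, d_obs, freq, t, offset)
        history.append(misfit)
        s = s - step * grad                                # update model parameter
    return s, history

t = np.linspace(0.0, 4.0, 1000, endpoint=False)  # recording time, s
offset, freq = 2.0, 1.0                          # km, Hz (assumed toy values)
s_true, s_init = 1.0 / 2.0, 1.0 / 1.5            # slowness, s/km
d_obs = forward(s_true, freq, t, offset)         # "observed" data
s_est, history = toy_fwi(s_init, d_obs, freq, t, offset)
v_est = 1.0 / s_est                              # recovered velocity, km/s
```

With this low frequency the initial travel-time error is below half a period, so the loop recovers the true velocity of 2 km/s.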

Challenges
Gradient-based nonlinear optimization: alter the subsurface model to minimize the data misfit
Trade-off between initial point and data frequency bandwidth
Local minima where the derivative turns to zero
[Figure: misfit as a function of frequency, with Start and Stop points of the optimization (Bunks, 1995)]
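The trade-off between the initial point and the frequency bandwidth can be made explicit with a worked example. The setup is an assumption chosen for simplicity, not a derivation from the thesis: a single monochromatic trace whose only unknown is the arrival time, with the misfit averaged over whole periods.

```latex
\begin{align}
d(t;\tau) &= \sin\big(\omega\,(t-\tau)\big), \qquad \omega = 2\pi f \\
J(\Delta\tau) &= \tfrac{1}{2}\,\Big\langle \big[d(t;\tau_0+\Delta\tau) - d(t;\tau_0)\big]^2 \Big\rangle_t
              = \tfrac{1}{2}\,\big[1-\cos(\omega\,\Delta\tau)\big] \\
\frac{dJ}{d\Delta\tau} &= \tfrac{\omega}{2}\,\sin(\omega\,\Delta\tau) = 0
  \quad\Longleftrightarrow\quad \Delta\tau = \frac{k}{2f}, \quad k \in \mathbb{Z}
\end{align}
```

Even k gives minima (the global one at k = 0 and cycle-skipped ones at multiples of the period), odd k gives maxima. In this toy setting a gradient method started at a travel-time error of Δτ reaches the true minimum only if |Δτ| < 1/(2f), so lowering f widens the basin of attraction.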

Missing low-frequency data
Converged to a local minimum
(Bunks, 1995)

Solutions
• Recover low-frequency data
• Better initial model
(Bunks, 1995)

Trend of data-driven research
https://deepai.org/publication/integrating-machine-learning-for-planetary-science-perspectives-for-the-next-decade
Anomalies as data
Deep learning (DL)

Thesis objectives
Field of contributions: Geophysics > Exploration > Seismic inversion > Gradient-based FWI
• Low-frequency data recovery
• Initial model building
Explore ways to improve the robustness of FWI for complex environments
• How to automate salt flooding?

• How could deep learning be used to expand seismic data bandwidth?

• How to bridge the gap between synthetic and field data experiments?

Evolution of my Ph.D. ideas
[Timeline (2016, 2017, 2019, 2021) of increasing realism, from abstract concepts to practical implementation: salt flooding, low-frequency extrapolation, model and data reconstruction; industrial experience]

Chapter 1 of 3
Variance-based model interpolation for improved
full-waveform inversion in the presence of salt
bodies
• Challenges of salt

• Iterative salt flooding

• Synthetic example
Idea:
Use cycle-skipping artifacts from frequency-domain FWI
as a guide for salt flooding
Objective:
Automate salt flooding for frequency-domain FWI without
intervention into FWI formulation

Salt imaging
Features and challenges:
• Hydrocarbons near salt bodies

• High-velocity contrasts

• Complex geometries, steep flanks

• Illumination issues
https://wiki.seg.org/wiki/Salt_imaging_techniques
Existing solutions:
• Top-bottom approach (Zhang et al., 2009)

• Regularisation / conditioning (Alkhalifah, 2016)

• Automated salt flooding (Esser et al., 2016; Kalita
et al., 2019; etc.)
Willacy and Kryvohuz, 2019
Salt flooding

Frequency-domain experiment
Crop from BP 2004 (Billette and Brandsberg-Dahl, 2005)
Size: 61 x 220, dx = 50 m
Acoustic, isotropic
Frequency domain
[Figure: sources and receivers; cycle-skipping artifacts at different mono-frequencies, from low to high: 3 Hz, 3.33 Hz, 3.7 Hz, 4.12 Hz]

Selection of frequencies
Size of cycle-skipping artifacts is proportional to wavelength λ
[Figure: low-frequency, intermediate-frequency, and high-frequency artifacts for frequencies f1 (low) to f4 (high) with wavelengths λ1 to λ4]
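As a quick arithmetic check of how the artifact scale changes across the four mono-frequencies of the experiment, the wavelength follows λ = v / f. The reference velocity below (water, 1.5 km/s) is an assumption made only for illustration.

```python
import numpy as np

frequencies_hz = np.array([3.0, 3.33, 3.7, 4.12])   # mono-frequencies from the experiment
velocity_km_s = 1.5                                  # assumed reference velocity (water)

wavelengths_km = velocity_km_s / frequencies_hz      # lambda = v / f
ratios = frequencies_hz[1:] / frequencies_hz[:-1]    # spacing between neighbouring frequencies
```

The wavelengths shrink as frequency grows, and neighbouring frequencies differ by a roughly constant factor of about 1.11.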

0. Modeling for frequencies f1 (low) to f4 (high)
1. Averaging
2. Variance
3. Flooding

1. Averaging
Weighted average: more weight is given to lower frequencies since these are less prone to cycle-skipping

2. Variance
How much does a variable deviate from its weighted average value?
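The averaging and variance steps can be sketched as follows. The slides do not give the weights, so weighting each mono-frequency model by 1/f is an assumption that merely honours "more weight to lower frequencies"; the function name is hypothetical.

```python
import numpy as np

def weighted_mean_and_variance(models, frequencies):
    """models: array (n_freq, nz, nx), one inverted model per mono-frequency."""
    models = np.asarray(models, dtype=float)
    weights = 1.0 / np.asarray(frequencies, dtype=float)  # assumed: weight ~ 1 / f
    weights = weights / weights.sum()                      # normalize to sum to one
    w = weights[:, None, None]
    mean = np.sum(w * models, axis=0)                      # step 1: weighted average
    variance = np.sum(w * (models - mean) ** 2, axis=0)    # step 2: weighted variance
    return mean, variance

# Toy input: the models agree everywhere except in one cell
frequencies = [3.0, 3.33, 3.7, 4.12]
models = np.full((4, 2, 2), 2.0)
models[:, 0, 0] = [2.0, 3.0, 4.0, 5.0]   # cell where the inversions disagree
mean, variance = weighted_mean_and_variance(models, frequencies)
```

Cells where the mono-frequency inversions disagree show up with high variance, while cells where they agree have zero variance.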

Floating threshold tracks the history of variance properties

3. Flooding
Flood where the variance exceeds the threshold
Low (high) SNR leads to flooding with the mean (max) from a half-wavelength circle
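A minimal sketch of the flooding step, under assumptions: the threshold is passed in as a plain number (here the midpoint between the mean and the max of the variance map, not the floating threshold with history from the slides), and the half-wavelength circle is given as a radius in grid cells. The function flood is hypothetical.

```python
import numpy as np

def flood(model, variance, threshold, radius, high_snr=True):
    """Replace cells whose variance exceeds the threshold by the max (high SNR)
    or the mean (low SNR) of the model inside a circle of `radius` cells."""
    nz, nx = model.shape
    zz, xx = np.mgrid[0:nz, 0:nx]
    flooded = model.copy()
    mask = variance > threshold                       # high-variance mask
    for iz, ix in zip(*np.nonzero(mask)):
        circle = (zz - iz) ** 2 + (xx - ix) ** 2 <= radius ** 2
        values = model[circle]                        # read from the unflooded model
        flooded[iz, ix] = values.max() if high_snr else values.mean()
    return flooded, mask

# Toy example: one fast (salt-like) cell next to one high-variance cell
model = np.full((5, 5), 2.0)
model[1, 2] = 4.5
variance = np.zeros((5, 5))
variance[2, 2] = 1.0
threshold = 0.5 * (variance.mean() + variance.max())  # assumed static threshold
flooded_max, mask = flood(model, variance, threshold, radius=1, high_snr=True)
flooded_mean, _ = flood(model, variance, threshold, radius=1, high_snr=False)
```

In practice the radius would be half a wavelength converted to cells, for example round(0.5 * v / f / dx) with dx = 50 m as in the experiment.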

[Figure: input model and its evolution over salt-flooding iterations; color scale in km/s]

Salt flooding result
Target crop from BP 2004 model
FWI from linear initial model
Initial model after salt flooding
FWI from salt-flooded initial model

Chapter summary
Variance-based interpolation built around using cycle-skipping artifacts as new data
Pros:
• Does not interfere with the core of frequency-domain FWI
• Computationally affordable
Cons:
• Modeling for multiple frequencies
• How do these artifacts look in the real world?
Takeaways:
• Distinctive geological features of salt bodies might be beneficial for the generation of synthetic subsurface models

Chapter 2 of 3
Deep learning for low-frequency extrapolation
from multi-offset seismic data
• Value of low frequencies

• Frequency domain

• Deep learning method

Idea:
Supervised deep learning to extrapolate patterns in
frequency-domain high-frequency data
Objective:
Reconstruct missing low-frequency data to compensate
for poor initial model for frequency-domain FWI

Why do we need low frequencies?
Lack of low-frequency data
- Due to instrumental limitations
- Due to noise
(Kazei et al., 2016)
Benefits of low frequencies:
- Invert large-scale velocity structures
- Less chance to get stuck in local minima
- Reveal deep model structures / below salt
[Figure labels: fLow, fHigh]
Seismic buoys for ultra-long offset surveys by GWL
Low-frequency data

Frequency bandwidth extrapolation
[Figure axes: fidelity of wave phenomena vs. computational complexity]
Deep learning methods: Trace-to-Trace, Shot-to-Shot, Data-to-Data
(Ovcharenko et al., 2017, 2018, 2019, 2020)
(Sun & Demanet, 2018-2021; Hu, 2019)
(Aharchaou et al., 2020, 2021)
Pre-deep learning methods:
• Extrapolation for atomic events (Li & Demanet, 2015, 2016)
• Beat-tone inversion (Hu, 2014)
• Envelope inversion (Wu et al., 2013)

Common shot gather in frequency domain
[Figure labels: source, receivers]
Dataset size = Nshots * Nmodels
Solve the Helmholtz equation to get complex mono-frequency amplitudes at receiver locations
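A minimal sketch of how such a shot-wise dataset could be laid out. Several things here are assumptions and not the thesis setup: the numerical Helmholtz solver is swapped for the analytic 3D free-space Green's function in homogeneous media, the source depth is arbitrary, and the array layout (one sample per shot, real and imaginary parts as the last axis) is hypothetical.

```python
import numpy as np

def frequency_domain_data(velocity, freqs, src_x, rec_x, depth):
    """Complex amplitudes d[f, s, r] from exp(i k r) / (4 pi r), homogeneous medium."""
    offsets = src_x[:, None] - rec_x[None, :]
    r = np.sqrt(offsets ** 2 + depth ** 2)               # source-receiver distance, km
    k = 2.0 * np.pi * freqs[:, None, None] / velocity    # wavenumber, rad/km
    return np.exp(1j * k * r[None]) / (4.0 * np.pi * r[None])

def build_dataset(velocities, high_freqs, low_freq, src_x, rec_x, depth):
    """One sample per (model, shot): dataset size = Nshots * Nmodels."""
    inputs, targets = [], []
    for v in velocities:                                 # stand-in for subsurface models
        high = frequency_domain_data(v, high_freqs, src_x, rec_x, depth)
        low = frequency_domain_data(v, np.array([low_freq]), src_x, rec_x, depth)[0]
        for s in range(len(src_x)):
            shot = high[:, s, :].T                       # (n_rec, n_freq)
            inputs.append(np.stack([shot.real, shot.imag], axis=-1))
            targets.append(np.stack([low[s].real, low[s].imag], axis=-1))
    return np.array(inputs), np.array(targets)

x = np.linspace(0.0, 20.0, 64)                # 64 sources and receivers, km
high_freqs = np.linspace(3.0, 5.0, 32)        # 32 known frequencies, Hz
X, Y = build_dataset([2.0, 3.0, 4.0], high_freqs, 0.25, x, x, depth=0.05)
d = frequency_domain_data(2.0, high_freqs, x, x, 0.05)
```

Here X holds the high-frequency input per shot and Y the matching low-frequency target, so three models with 64 shots each give 192 samples.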

Mapping high frequencies to low
Extrapolate patterns from high frequencies down to low frequencies

Experimental and training setup
Input: high-frequency data. Target: low-frequency data
MobileNet (Howard et al., 2017)
64 sources and receivers
32 known frequencies in the range 3-5 Hz
Successive mono-frequency inversions at 0.25, 0.55, 0.93, 2.04, 2.66, 3.46, 4.50 Hz
Acoustic modeling, frequency domain

Inference
[Figure: target, prediction, and difference for frequency slices of the data cube at 0.25 Hz, 0.55 Hz, and 0.93 Hz; each slice is 64 sources x 64 receivers; real part of frequency-domain data]

Validation by FWI
[Figure: velocity models (2-4 km/s; 0-20 km horizontally, 0-6 km in depth) after successive mono-frequency inversions at 0.25 Hz, 0.55 Hz, 0.93 Hz, and 4.5 Hz; FWI of predicted data compared with FWI of target data]

Chapter summary
Low-frequency extrapolation in frequency domain by deep neural network
Pros:
• Mono-frequency target is “simple” compared to time domain
• Efficient generation of training data by shots
• Suitable for extrapolation of ultra-low (< 1 Hz) frequencies
Cons:
• One frequency = one training
• Frequencies disconnected
Takeaways:
• Bandwidth extrapolation is feasible, but application in FWI requires high accuracy of reconstructed data

Chapter 3 of 3
Multi-task learning for low-frequency
extrapolation and elastic model building from
seismic data
• Multi-task learning

• Time domain data


• Field data example
Idea:
Jointly predict initial model and low-frequency data so
missing ultra-low frequencies are compensated by the
predicted model
Objective:
Alleviate the high accuracy requirement for extrapolated low-frequency data

Multi-task learning
Benefit from knowledge acquired by learning related tasks
Example: a child learns to recognize faces and can then apply this knowledge to recognize other objects
Hard parameter sharing (Ruder, 2017)
(Kendall et al., 2018)
• General representations in encoder

• Learn a complex task by solving a simple task

• Reduced risk of overfitting

Multi-task network architecture
[Architecture diagram: an encoder takes high-frequency data (> 4 Hz); a data decoder outputs low-frequency data (< 5 Hz); a model decoder outputs a local velocity model; building blocks are convolution, dilated convolution, and concatenation, with kernels 7x7, 5x5, and 3x3]

Multi-task objective
Loss terms breakdown:
• Data loss: to reconstruct low-frequency data
• Data correlation loss: to treat the data trace-wise
• Model loss: to reconstruct the low-wavenumber model
• Model regularization: to avoid data leakage into the model
W is the weight of a loss term
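The four loss terms could be combined as in the sketch below. The exact definitions are not given on the slides, so everything concrete here is an assumption: L1 misfits for data and model, one minus the Pearson correlation per trace as the correlation loss, and a lateral-smoothness penalty standing in for the model regularization. Function names and the weight dictionary are hypothetical.

```python
import numpy as np

def trace_correlation_loss(pred, target, eps=1e-12):
    """1 - mean Pearson correlation, computed per trace along the time (last) axis."""
    p = pred - pred.mean(axis=-1, keepdims=True)
    t = target - target.mean(axis=-1, keepdims=True)
    norms = np.linalg.norm(p, axis=-1) * np.linalg.norm(t, axis=-1)
    corr = (p * t).sum(axis=-1) / (norms + eps)
    return 1.0 - corr.mean()

def total_loss(data_pred, data_true, model_pred, model_true, w):
    """Weighted sum of the four terms; w maps term name to its weight W."""
    data_loss = np.mean(np.abs(data_pred - data_true))
    corr_loss = trace_correlation_loss(data_pred, data_true)
    model_loss = np.mean(np.abs(model_pred - model_true))
    model_reg = np.mean(np.abs(np.diff(model_pred, axis=-1)))  # assumed regularizer
    return (w["data"] * data_loss + w["corr"] * corr_loss
            + w["model"] * model_loss + w["reg"] * model_reg)

rng = np.random.default_rng(0)
data = rng.standard_normal((4, 50))                     # 4 traces, 50 time samples
model = np.tile(np.linspace(1.5, 4.5, 10)[:, None], (1, 8))  # laterally invariant model
weights = {"data": 1.0, "corr": 1.0, "model": 1.0, "reg": 0.1}

loss_same = trace_correlation_loss(data, data)
loss_scaled = trace_correlation_loss(3.0 * data, data)  # insensitive to trace amplitude
loss_flipped = trace_correlation_loss(-data, data)
loss_perfect = total_loss(data, data, model, model, weights)
```

The correlation term ignores per-trace amplitude scaling, which is one way to read "treat the data trace-wise".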

On the fly loss balancing
Sigmas quantify uncertainties associated with a given loss.
The logarithmic term prevents excessive uncertainty growth.
In practice, sigmas are scalars that are trainable alongside the network weights.
(Kendall et al., 2018)
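The balancing rule of Kendall et al. (2018) for regression-type losses sums L_i / (2 sigma_i^2) + log(sigma_i) over tasks. The sketch below uses the common reparameterization s_i = log(sigma_i^2) and, as a simplifying assumption, holds the task losses fixed to show where the trainable scalars settle. It does not reproduce the thesis training code.

```python
import numpy as np

def uncertainty_weighted_loss(losses, log_var):
    """Sum over tasks of 0.5 * exp(-s_i) * L_i + 0.5 * s_i, with s_i = log(sigma_i^2)."""
    losses = np.asarray(losses, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return np.sum(0.5 * np.exp(-log_var) * losses + 0.5 * log_var)

def grad_log_var(losses, log_var):
    """Analytic derivative of the weighted loss w.r.t. each s_i."""
    return 0.5 - 0.5 * np.exp(-log_var) * np.asarray(losses, dtype=float)

losses = np.array([4.0, 0.25])        # e.g. a large and a small task loss, held fixed
log_var = np.zeros(2)                 # trainable scalars, initialized at sigma = 1
start = uncertainty_weighted_loss(losses, log_var)
for _ in range(2000):                 # plain gradient descent on the scalars only
    log_var -= 0.1 * grad_log_var(losses, log_var)
sigma_sq = np.exp(log_var)
end = uncertainty_weighted_loss(losses, log_var)
```

With fixed losses the optimum is sigma_i^2 = L_i, so a large loss is down-weighted, while the log term keeps sigma from growing without bound.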

Semi-synthetic training dataset based on field data
Noise collection
Elastic modeling in random subsurface models
Source wavelet
Pre-arrival noise
BroadSeis data by CGG
324 hydrophones every 25 m, recording for 7 seconds

Semi-synthetic training dataset
Frequency bands, for synthetic and field data:
• High: > 4 Hz
• Low: < 5 Hz
• ULow: < 3 Hz
Panels: Input, Target #1, Validation, Target #2
Offset: 324 traces ~ 8 km
Time: 376 samples ~ 6 s
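Splitting a shot gather into the High (> 4 Hz), Low (< 5 Hz) and ULow (< 3 Hz) bands can be sketched with a brick-wall filter in the Fourier domain. This is an assumption for illustration: the filter type used in the thesis is not stated on the slides, a practical filter would be tapered, and the sampling interval of 0.016 s is inferred from 376 samples over about 6 s.

```python
import numpy as np

def band_limit(traces, dt, f_min=None, f_max=None):
    """Zero-phase brick-wall filter along the time (last) axis."""
    n = traces.shape[-1]
    freqs = np.fft.rfftfreq(n, d=dt)
    keep = np.ones(freqs.shape, dtype=bool)
    if f_min is not None:
        keep &= freqs > f_min          # drop everything at or below f_min
    if f_max is not None:
        keep &= freqs < f_max          # drop everything at or above f_max
    spectrum = np.fft.rfft(traces, axis=-1)
    return np.fft.irfft(spectrum * keep, n=n, axis=-1)

# Shot gather with the dimensions quoted on the slide (random stand-in values)
dt = 0.016
shot = np.random.default_rng(0).standard_normal((324, 376))
high = band_limit(shot, dt, f_min=4.0)    # network input
low = band_limit(shot, dt, f_max=5.0)     # data target
ulow = band_limit(shot, dt, f_max=3.0)    # band absent from the input

# Check on a two-tone trace whose frequencies fall exactly on FFT bins
t = np.arange(375) * dt
tone_low, tone_high = np.sin(2 * np.pi * 2.0 * t), np.sin(2 * np.pi * 10.0 * t)
mix = (tone_low + tone_high)[None, :]
mix_low = band_limit(mix, dt, f_max=5.0)[0]
mix_high = band_limit(mix, dt, f_min=4.0)[0]
```

The input and target bands overlap between 4 and 5 Hz, while the band below 3 Hz is never seen by the network at its input.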

Experiments
Synthetic data: modified Marmousi II model
Shear-wave velocity (Vs) and density (Rho) are constructed from empirical relations (Gardner et al., 1974)
The domain geometry for the synthetic experiment is the same as for FWI on field data. The velocity range is different.
Field data: marine streamer data from Australia
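The empirical relations themselves did not survive on the slide. As a sketch, Gardner's relation gives density from P-wave velocity as rho = 0.31 * Vp^0.25 (rho in g/cm^3, Vp in m/s). The shear-velocity rule below, a constant Vp/Vs ratio of sqrt(3), is purely an assumption and may differ from the relation used in the thesis.

```python
import numpy as np

def gardner_density(vp_m_s):
    """Gardner et al. (1974): rho [g/cm^3] = 0.31 * Vp^0.25, with Vp in m/s."""
    return 0.31 * np.asarray(vp_m_s, dtype=float) ** 0.25

def vs_from_vp(vp_m_s, vp_vs_ratio=np.sqrt(3.0)):
    """Assumed constant Vp/Vs ratio (Poisson solid); not taken from the slides."""
    return np.asarray(vp_m_s, dtype=float) / vp_vs_ratio

vp = np.array([2000.0, 3000.0, 4500.0])   # P-wave velocities, m/s
rho = gardner_density(vp)                 # g/cm^3
vs = vs_from_vp(vp)                       # m/s
```

A water layer would need separate handling (zero shear velocity and water density), which this sketch leaves out.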

Inference depending on loss configuration
[Figure: inference on synthetic data and field data; panels: Input, Target, UNet, L, LC, LCM]
Legend:
L - data loss
C - correlation loss
M - model loss
These are predicted data after low-pass filtering below 3 Hz, where the input data was set to strict zero

FWI application workflow
[Workflow diagram: the NN is applied to shots one-by-one; input data > 4 Hz, predicted data < 5 Hz; Blend; Stack; FWI]

Validation by FWI
[Figure: FWI on synthetic data and field data at successive stages: predicted initial model; predicted data < 3 Hz; predicted data < 4 Hz; predicted and available data < 7 Hz]

Compare to inversion of true data
Expectation: true low-frequency data > 2.5 Hz, started from the 1D initial model
Reality: predicted low-frequency data > 2.5 Hz, started from the predicted initial model
Well-log comparison

Data match before
[Figure: at 4 km location and at 8 km location]

Data match after
[Figure: at 4 km location and at 8 km location]

Chapter summary
Multi-task learning for frequency bandwidth extrapolation and initial model building from time-domain data
Under review at IEEE TGRS
Pros:
• Data generation is affordable and follows conventional FWI steps
• Dynamically weighted loss terms
Cons:
• Needs to be tailored for a specific dataset
Takeaways:
• Recovered initial model addresses the time-domain challenge of low-frequency extrapolation
• Semi-synthetic dataset is sufficient for inference on field data

Conclusions & Outlook
• Salt flooding with variance-based method can help to automate initial model building
• Low-frequency extrapolation with deep learning is feasible for salt-induced environments
• Multi-task learning can help to relax accuracy expectations for reconstructed data
• Semi-synthetic dataset to bridge the gap between synthetic and field data applications

Outlook
Supervised vs. unsupervised learning:
* Accuracy? Computational costs? Feasibility?
* Low-frequency or directly invert for subsurface model?
Explainable AI:
* How to analyze the NN to understand the input problem?
Physics-guided methods:
* Should we replace deterministic solvers by NN?

Contributions of my Ph.D. work
• Three methods to improve the initialization of FWI (journal articles)

• Model domain: cycle-skipping artifacts as new data to guide salt flooding

• Data domain: frequency domain suitable for ultra-low frequency extrapolation

• Data + Model domains: joint recovery of low frequencies and background model to compensate for
imperfections of each other
• Several concepts introduced, extended or adopted (conference proceedings)

• Multiple-frequency bands to enable domain adaptation

• Texture-transfer from geological prior

• Orthogonal encoding for surface multiple suppression
• Open-source contributions

• Python API for DENISE-Black-Edition by Daniel Kohn

• WaveProp in MATLAB

• Multi-task learning for joint low-frequency data and model extrapolation

Journal articles published and submitted Peer-reviewed conference proceedings
…
…
…
…
…
…

Acknowledgements
I would like to thank my supervisor Daniel Peter, Vladimir Kazei and Tariq
Alkhalifah for shaping me as a researcher. My Ph.D. Committee members:
Martin Mai, Xiangliang Zhang and Sergey Fomel for their time and efforts
dedicated to evaluating my work. SMI and SWAG group members for fruitful
discussions.

Individuals who helped me on the way: Pavel Plotnitskii, Mahesh Kalita,
Hanchen Wang, Christos Tzivanakis, Jubran Akram, Yana Ovcharenko, Dias
Urozaev, Muhammad Izzatullah, Fuqiang Chen, Armando Carmona, Eduardo
Cano, Martyn Ovcharenko, Yan Yang, Daniel Kohn, Siarhei Khirevich, Matteo
Ravasi, Claire Birnie and others.

Anatoly Baumstein, Song Hou, and Andrey Bakulin for my industrial
experience and feedback. CGG for marine streamer data. KAUST, ECRC and
Saudi Aramco for giving me the environment and for funding my work.
https://inhabitat.com/kaust-breakwater-beacon-is-a-naturally-cooled-lighthouse-in-saudi-arabia/

1. Averaging
Weighted average using weights
Assigns more weight to lower frequencies since these are less prone to cycle-skipping

2. Variance
Weighted variance using weights
Indicates how much a variable deviates from its weighted average value

Floating threshold
initial threshold
mean of variance map
max of variance map
max threshold in flooding history

3. Flooding
High-variance mask. Flooding within the mask
Low SNR: flooding with the mean from a half-wavelength circle. Noise-free scenario (infinite SNR): flooding with the maximum value
