Here are the slides from the Ph.D. thesis defense by Oleg Ovcharenko that took place at KAUST in Thuwal, Saudi Arabia. The research is related to deep learning applications for initial model building for FWI and low-frequency extrapolation in seismic data.
Data-driven methods for the initialization of full-waveform inversion
1. Ph.D. thesis defense by Oleg Ovcharenko
Data-driven methods for the initialization of full-waveform inversion
Physical Science & Engineering
Earth Science and Engineering Program
KAUST
15.11.2021, Thuwal, Saudi Arabia
Advisor: Daniel Peter (KAUST)
Committee:
Martin Mai (Chair, KAUST)
Tariq Alkhalifah (KAUST)
Xiangliang Zhang (University of Notre Dame)
Sergey Fomel (UT Austin)
2. Outline
Salt flooding by variance-based interpolation
Low-frequency extrapolation by deep learning
Multi-task learning for data and model recovery
3. Hydrocarbon exploration and extraction
https://www.semanticscholar.org/paper/Machine-learning-applications-to-geophysical-data-Bougher/91cd3152a60ddf41911dd2a0b876718f89555633/figure/12
https://www.arabianbusiness.com/saudi-aramco-cut-drilling-costs-hold-rig-count-steady-579284.html
https://www.oedigital.com/news/474771-saudi-aramco-ades-in-rig-contract-extension
Motivation
4. Seismic data
Signal is band-limited by hardware and noise
Source energy
Sensitivity of receivers
Frequencies < 4 Hz are generally considered low
Industry advances hardware
Airgun source
https://commons.wikimedia.org/wiki/File:Streamer-detail_hg.jpg
https://archive.epa.gov/esd/archive-geophysics/web/html/marine_seismic_methods.html
(Chelminski et al., 2021)
Common-shot-gather
Towed streamer
6. Challenges
Frequency
(Bunks, 1995)
Gradient-based nonlinear optimization: alter the subsurface model to minimize the data misfit
Trade-off between the initial point and the data frequency bandwidth
Local minima occur where the derivative turns to zero
Stop Start
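To illustrate why local minima appear and why lower frequencies widen the basin of attraction, here is a minimal sketch under simplified assumptions: a least-squares misfit between a sinusoid and a time-shifted copy of itself. The function name `misfit` and all numerical values are illustrative and are not taken from the thesis.

```python
import math

def misfit(shift_s, freq_hz, duration_s=2.0, n=2000):
    """Least-squares misfit between sin(2*pi*f*t) and its copy delayed by shift_s."""
    dt = duration_s / n
    total = 0.0
    for k in range(n):
        t = k * dt
        observed = math.sin(2 * math.pi * freq_hz * t)
        predicted = math.sin(2 * math.pi * freq_hz * (t - shift_s))
        total += (predicted - observed) ** 2
    return total * dt

# A 0.2 s timing error equals one full period at 5 Hz: the misfit is
# numerically zero there, so a descent method can stop at this wrong shift.
skipped = misfit(0.2, 5.0)
# Half a period away (0.1 s) the 5 Hz misfit peaks and separates the two basins.
barrier = misfit(0.1, 5.0)
# At 2.5 Hz the same 0.1 s error still lies on the slope toward the true minimum.
low_freq = misfit(0.1, 2.5)
```

In this toy setting the basin around the correct shift is half a period wide on each side, so halving the frequency doubles the timing error that a gradient-based update can tolerate.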
9. Trend of data-driven research
https://deepai.org/publication/integrating-machine-learning-for-planetary-science-perspectives-for-the-next-decade
Anomalies as data
Deep learning (DL)
10. Thesis objectives
Geophysics
Exploration Seismic inversion Gradient-based FWI
Low-frequency data recovery
Initial model building
Field of contributions:
Explore ways to improve the robustness of FWI for complex environments
• How to automate salt flooding?
• How could deep learning be used to expand seismic data bandwidth?
• How to bridge the gap between synthetic and field data experiments?
11. Evolution of my Ph.D. ideas
Realism
2016 2017 2019 2021
Industrial experience
Salt flooding
Low-frequency extrapolation
Model and data reconstruction
Abstract concepts Practical implementation
12. Outline
Salt flooding by variance-based interpolation
Low-frequency extrapolation by deep learning
Multi-task learning for data and model recovery
13. Chapter 1 of 3
Variance-based model interpolation for improved full-waveform inversion in the presence of salt bodies
• Challenges of salt
• Iterative salt flooding
• Synthetic example
Idea:
Use cycle-skipping artifacts from frequency-domain FWI as a guide for salt flooding
Objective:
Automate salt flooding for frequency-domain FWI without intervening in the FWI formulation
14. Salt imaging
Features and challenges:
• Hydrocarbons near salt bodies
• High-velocity contrasts
• Complex geometries, steep flanks
• Illumination issues
https://wiki.seg.org/wiki/Salt_imaging_techniques
Existing solutions:
• Top-bottom approach (Zhang et al., 2009)
• Regularisation / conditioning (Alkhalifah, 2016)
• Automated salt flooding (Esser et al., 2016; Kalita et al., 2019; etc.)
Willacy and Kryvohuz, 2019
Salt flooding
15. Frequency-domain experiment
Receivers
Sources
Crop from BP 2004 (Billette and Brandsberg-Dahl, 2005)
Size: 61 x 220, dx = 50 m
Acoustic, isotropic
Frequency domain
Cycle-skipping artifacts at different mono-frequencies, from low to high: 3 Hz, 3.33 Hz, 3.7 Hz, 4.12 Hz
16. Selection of frequencies
Size of cycle-skipping artifacts is proportional to wavelength λ
[Figure: low-, intermediate-, and high-frequency artifacts for frequencies f1 to f4 (low to high), with corresponding wavelengths λ1 to λ4]
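As a rough numerical illustration of the proportionality between artifact size and wavelength, the sketch below converts the mono-frequencies listed on the frequency-domain experiment slide into wavelengths. The salt velocity of 4500 m/s is an assumed value chosen for illustration, and the helper name `wavelength_m` is hypothetical.

```python
def wavelength_m(velocity_m_s, freq_hz):
    """Wavelength = velocity / frequency."""
    return velocity_m_s / freq_hz

freqs_hz = [3.0, 3.33, 3.7, 4.12]   # mono-frequencies listed on the slides
salt_velocity_m_s = 4500.0          # assumed value, for illustration only
dx_m = 50.0                         # grid spacing from the experiment slide

wavelengths_m = [wavelength_m(salt_velocity_m_s, f) for f in freqs_hz]
wavelengths_cells = [lam / dx_m for lam in wavelengths_m]
```

Under these assumptions the lowest frequency corresponds to a wavelength of 1500 m, or 30 grid cells, and the wavelength shrinks as the frequency grows.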
18. Averaging
Workflow: 0. Modeling, 1. Averaging, 2. Variance, 3. Flooding
Weighted average: more weight goes to lower frequencies, since these are less prone to cycle-skipping
19. Variance
How much does a variable deviate from its weighted average value?
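The averaging and variance steps can be sketched per grid cell as follows. This is only an illustration: the exact weights used in the thesis are not given in the slide text, so weights proportional to 1/f are an assumption, and the function name `weighted_mean_and_variance` is hypothetical. Plain nested lists are used to keep the example self-contained.

```python
def weighted_mean_and_variance(models, freqs_hz):
    """Per-cell weighted mean and variance over models inverted at several frequencies.

    models: list of 2D grids (lists of rows), one per frequency.
    Weights are assumed proportional to 1/f, so lower frequencies count more.
    """
    raw = [1.0 / f for f in freqs_hz]
    total = sum(raw)
    weights = [w / total for w in raw]
    nz, nx = len(models[0]), len(models[0][0])
    mean = [[sum(w * m[i][j] for w, m in zip(weights, models))
             for j in range(nx)] for i in range(nz)]
    var = [[sum(w * (m[i][j] - mean[i][j]) ** 2 for w, m in zip(weights, models))
            for j in range(nx)] for i in range(nz)]
    return mean, var, weights

freqs = [3.0, 3.33, 3.7, 4.12]
# Toy 1x2 velocity grids (km/s): the first cell is stable across frequencies,
# the second cell changes from one inversion to the next.
models = [[[2.0, 4.5]], [[2.0, 3.0]], [[2.0, 4.0]], [[2.0, 2.5]]]
mean, var, weights = weighted_mean_and_variance(models, freqs)
```

The stable cell ends up with a variance close to zero, while the cell that changes between frequencies gets a large variance and becomes a candidate for flooding.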
20. Floating threshold
A floating threshold tracks the history of variance properties
21. Flooding
Flood where the variance exceeds the threshold
Low SNR leads to flooding with the mean, and high SNR to flooding with the maximum, taken from a half-wavelength circle
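A minimal sketch of the flooding step, assuming the variance map and the threshold are already available. The function name `flood`, the `mode` switch, and the choice to express the circle radius in grid cells are illustrative assumptions; how the half-wavelength circle is sized in the thesis is not specified in the slide text.

```python
def flood(model, variance, threshold, radius_cells, mode="max"):
    """Replace cells whose variance exceeds the threshold.

    Each flagged cell takes the max (high SNR) or mean (low SNR) of the
    model values inside a circle of radius_cells around it.
    """
    nz, nx = len(model), len(model[0])
    flooded = [row[:] for row in model]  # copy so the input stays untouched
    for i in range(nz):
        for j in range(nx):
            if variance[i][j] <= threshold:
                continue
            neighbours = [
                model[k][l]
                for k in range(max(0, i - radius_cells), min(nz, i + radius_cells + 1))
                for l in range(max(0, j - radius_cells), min(nx, j + radius_cells + 1))
                if (k - i) ** 2 + (l - j) ** 2 <= radius_cells ** 2
            ]
            if mode == "max":
                flooded[i][j] = max(neighbours)
            else:
                flooded[i][j] = sum(neighbours) / len(neighbours)
    return flooded

model = [[2.0, 2.0, 4.5],
         [2.0, 3.0, 2.0],
         [2.0, 2.0, 2.0]]
variance = [[0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0]]
flooded_max = flood(model, variance, threshold=0.5, radius_cells=2, mode="max")
flooded_mean = flood(model, variance, threshold=0.5, radius_cells=1, mode="mean")
```

Only the high-variance cell in the centre changes: with `mode="max"` it picks up the salt-like velocity found inside the circle, and with `mode="mean"` it is smoothed toward its neighbours.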
26. Salt flooding result
Target crop from BP 2004 model
FWI from linear initial model
Initial model after salt flooding
FWI from salt-flooded initial model
27. Chapter summary
Pros:
Does not interfere with the core of frequency-domain FWI
Computationally affordable
Cons:
Modeling for multiple frequencies
How do these artifacts look in the real world?
Variance-based interpolation is built around using cycle-skipping artifacts as new data
Takeaways:
Distinctive geological features of salt bodies might be beneficial for the generation of synthetic subsurface models
28. Outline
Salt flooding by variance-based interpolation
Low-frequency extrapolation by deep learning
Multi-task learning for data and model recovery
29. Chapter 2 of 3
Deep learning for low-frequency extrapolation from multi-offset seismic data
• Value of low frequencies
• Frequency domain
• Deep learning method
• Synthetic example
Idea:
Supervised deep learning to extrapolate patterns in frequency-domain high-frequency data
Objective:
Reconstruct missing low-frequency data to compensate for a poor initial model for frequency-domain FWI
30. Why do we need low frequencies?
Lack of low-frequency data
- Due to instrumental limitations
- Due to noise
(Kazei et al., 2016)
fHigh
fLow
- Inverts large-scale velocity structures
- Less chance to get stuck in local minima
- Reveals deep model structures / below salt
Seismic buoys for ultra-long offset surveys by GWL
Low-frequency data
32. Frequency bandwidth extrapolation
Fidelity of wave phenomena vs. computational complexity
Trace-to-Trace
Shot-to-Shot
Data-to-Data
(Ovcharenko et al., 2017, 2018, 2019, 2020)
(Sun & Demanet, 2018-2021; Hu, 2019)
(Aharchaou et al., 2020, 2021)
Extrapolation for atomic events (Li & Demanet, 2015, 2016)
Deep learning methods
Beat-tone inversion (Hu, 2014)
Envelope inversion (Wu et al., 2013)
Pre-deep learning methods
33. Common shot gather in frequency domain
Source Receivers
Dataset size = Nshots * Nmodels
Solve the Helmholtz equation to get complex mono-frequency amplitudes at receiver locations
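To make the notion of complex mono-frequency amplitudes at receiver locations concrete, here is a minimal sketch that uses the closed-form solution of the Helmholtz equation in a homogeneous 3D medium, G = exp(ikr) / (4πr) with k = 2πf / v. This is a stand-in for illustration only: the thesis solves the equation numerically in heterogeneous models, and the function name `mono_frequency_gather` and all parameter values are assumptions.

```python
import cmath
import math

def mono_frequency_gather(source, receivers, freq_hz, velocity_m_s):
    """Complex amplitudes at receivers for one source and one frequency.

    Uses the free-space Green's function of the 3D Helmholtz equation,
    which is valid only for a homogeneous medium.
    """
    k = 2 * math.pi * freq_hz / velocity_m_s  # wavenumber, rad/m
    gather = []
    for receiver in receivers:
        r = math.dist(source, receiver)
        gather.append(cmath.exp(1j * k * r) / (4 * math.pi * r))
    return gather

source = (0.0, 0.0, 0.0)
receivers = [(1000.0, 0.0, 0.0), (2000.0, 0.0, 0.0)]
gather = mono_frequency_gather(source, receivers, freq_hz=3.0, velocity_m_s=1500.0)

# One gather per shot and per model gives the dataset size from the slide.
n_shots, n_models = 64, 100   # n_models is an arbitrary illustrative number
dataset_size = n_shots * n_models
```

Each entry of `gather` is one complex number per receiver, so a full common shot gather at a single frequency is a short complex vector rather than a set of time traces.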
34. Mapping high frequencies to low
Extrapolate patterns from high frequencies down to low frequencies
35. Experimental and training setup
Input high-frequency data Target low-frequency data
MobileNet
(Howard et al., 2017)
64 sources and receivers
32 known frequencies in the range 3-5 Hz
Successive mono-frequency inversions at 0.25, 0.55, 0.93, 2.04, 2.66, 3.46, 4.50 Hz
Acoustic modeling
Frequency domain
37. Validation by FWI
[Figure: FWI of predicted data vs. FWI of target data after mono-frequency inversions at 0.25 Hz, 0.55 Hz, 0.93 Hz, and 4.5 Hz; horizontal axis 0-20 km, depth axis 0-6 km, velocity scale 2-4 km/s]
38. Chapter summary
Pros:
Mono-frequency target is “simple” compared to time domain
Efficient generation of training data by shots
Suitable for extrapolation of ultra-low (< 1 Hz) frequencies
Cons:
One frequency = one training
Frequencies disconnected
Low-frequency extrapolation in frequency domain by deep neural network
Takeaways:
Bandwidth extrapolation is feasible, but application in FWI requires high accuracy of the reconstructed data
39. Outline
Salt flooding by variance-based interpolation
Low-frequency extrapolation by deep learning
Multi-task learning for data and model recovery
40. Chapter 3 of 3
Multi-task learning for low-frequency extrapolation and elastic model building from seismic data
• Multi-task learning
• Time domain data
• Synthetic example
• Field data example
Idea:
Jointly predict the initial model and low-frequency data so that missing ultra-low frequencies are compensated by the predicted model
Objective:
Alleviate the high accuracy requirement for extrapolated low-frequency data
41. Multi-task learning
Multi-task learning
Benefit from knowledge acquired by learning related tasks
A child learns to recognize faces and can then apply this knowledge to recognize other objects
Hard parameter sharing (Ruder, 2017)
(Kendall et al., 2018)
• General representations in encoder
• Learn a complex task by solving a simple task
• Reduced risk of overfitting
42. Multi-task network architecture
Encoder, data decoder, model decoder
Layers: convolution, dilated convolution, concatenation
Kernels: 7x7, 5x5, 3x3
Input: high-frequency data, > 4 Hz
Outputs: low-frequency data, < 5 Hz, and local velocity model
43. Multi-task objective
Loss terms breakdown:
• Data loss: to reconstruct low-frequency data
• Data correlation loss: to treat the data trace-wise
• Model loss: to reconstruct the low-wavenumber model
• Model regularization: to avoid data leakage into the model
W is the weight of a loss term
44. On the fly loss balancing
Sigmas quantify uncertainties associated with a given loss.
The logarithmic term prevents excessive uncertainty growth.
In practice, sigmas are scalars that are trainable alongside the network weights.
(Kendall et al., 2018)
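The balancing idea can be sketched with the regression form of uncertainty weighting from Kendall et al. (2018), where each loss L_i contributes L_i / (2σ_i²) + log σ_i. The exact expression shown on the slide is not reproduced in the text, so treat this as an assumption; the function name `balanced_loss` and the parametrization by s_i = log σ_i² are illustrative choices.

```python
import math

def balanced_loss(losses, log_vars):
    """Uncertainty-weighted sum of task losses.

    Each term is 0.5 * exp(-s) * loss + 0.5 * s with s = log(sigma^2),
    which equals loss / (2 * sigma^2) + log(sigma).
    """
    return sum(0.5 * math.exp(-s) * loss + 0.5 * s
               for loss, s in zip(losses, log_vars))

# For a fixed loss value the term is minimized at sigma^2 = loss, so a
# trainable sigma settles at a scale set by the loss it weights.
loss_value = 0.04
best_log_var = math.log(loss_value)
at_optimum = balanced_loss([loss_value], [best_log_var])
too_large = balanced_loss([loss_value], [best_log_var + 0.5])
too_small = balanced_loss([loss_value], [best_log_var - 0.5])
```

Without the logarithmic term the optimizer could shrink every weighted loss simply by inflating sigma; the penalty on log σ makes that growth costly.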
45. Semi-synthetic training dataset based on field data
Noise collection
Elastic modeling in random subsurface models
Source wavelet
Pre-arrival noise
BroadSeis data by CGG
324 hydrophones every 25 m, recording for 7 seconds
46. Semi-synthetic training dataset
High Low
Synthetic
Field
Low
< 5 Hz
High
> 4 Hz
Input Target #1 Validation Target #2
Offset, 324 ~ 8 km
Time, 376 ~ 6 sec
ULow
< 3 Hz
47. Experiments
Vs
Rho
Synthetic data: modified Marmousi II model
Shear-wave velocity and density are constructed from empirical relations (Gardner et al., 1974)
The domain geometry for the synthetic experiment is the same as for FWI on field data; the velocity range is different
Field data: marine streamer data from Australia
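As an example of such an empirical relation, the sketch below implements the commonly quoted form of Gardner's relation, ρ = 0.31 · Vp^0.25 with Vp in m/s and ρ in g/cm³, written here in kg/m³. Whether the thesis uses exactly these coefficients is an assumption, and the shear-wave relation is not spelled out in the slide text, so only density is sketched. The function name `gardner_density` is hypothetical.

```python
def gardner_density(vp_m_s):
    """Density in kg/m^3 from P-wave velocity in m/s (Gardner et al., 1974)."""
    return 310.0 * vp_m_s ** 0.25

# Illustrative P-wave velocities, not values from the thesis models.
vp_values_m_s = [1500.0, 2000.0, 4500.0]
densities_kg_m3 = [gardner_density(vp) for vp in vp_values_m_s]
```

For Vp = 2000 m/s this gives a density of about 2073 kg/m³, and density grows slowly with velocity because of the fourth-root dependence.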
48. Inference depending on loss configuration
Panels (synthetic data and field data): Input, Target, UNet, L, LC, LCM
Legend:
L - data loss
C - correlation loss
M - model loss
These are predicted data after low-pass filtering below 3 Hz, where the input data were set strictly to zero
50. Validation by FWI
Synthetic data and field data
Predicted initial model
Predicted data < 3 Hz
Predicted data < 4 Hz
Predicted and available data < 7 Hz
51. Compare to inversion of true data
Expectation: True low-frequency data > 2.5 Hz, started from 1D initial
Reality: Predicted low-frequency data > 2.5 Hz, started from predicted initial
Well-log comparison
54. Chapter summary
Pros:
Data generation is affordable and follows conventional FWI steps
Dynamically weighted loss terms
Cons:
Needs to be tailored for a specific dataset
Multi-task learning for frequency bandwidth extrapolation and initial model building from time domain data
Takeaways:
Recovered initial model addresses the time-domain challenge of low-frequency extrapolation
Semi-synthetic dataset is sufficient for inference on field data
Under review at IEEE TGRS
55. Conclusions & Outlook
• Salt flooding with variance-based method can help to automate initial model building
• Low-frequency extrapolation with deep learning is feasible for salt-induced environments
• Multi-task learning can help to relax accuracy expectations for reconstructed data
• Semi-synthetic dataset to bridge the gap between synthetic and field data applications
Supervised vs. unsupervised learning:
* Accuracy? Computational costs? Feasibility?
* Low-frequency or directly invert for subsurface model?
Explainable AI:
* How to analyze the NN to understand the input problem?
Physics-guided methods:
* Should we replace deterministic solvers by NN?
OUTLOOK
56. Contributions of my Ph.D. work
• Three methods to improve the initialization of FWI (journal articles)
• Model domain: cycle-skipping artifacts as new data to guide salt flooding
• Data domain: frequency domain suitable for ultra-low frequency extrapolation
• Data + Model domains: joint recovery of low frequencies and background model to compensate for imperfections of each other
• Several concepts introduced, extended or adopted (conference proceedings)
• Multiple-frequency bands to enable domain adaptation
• Texture-transfer from geological prior
• Orthogonal encoding for surface multiple suppression
• Open-source contributions
• Python API for DENISE-Black-Edition by Daniel Kohn
• WaveProp in MATLAB
• Multi-task learning for joint low-frequency data and model extrapolation
58. Acknowledgements
I would like to thank my supervisor Daniel Peter, Vladimir Kazei and Tariq
Alkhalifah for shaping me as a researcher. My Ph.D. Committee members:
Martin Mai, Xiangliang Zhang and Sergey Fomel for their time and efforts
dedicated to evaluating my work. SMI and SWAG group members for fruitful
discussions.
Individuals who helped me on the way: Pavel Plotnitskii, Mahesh Kalita,
Hanchen Wang, Christos Tzivanakis, Jubran Akram, Yana Ovcharenko, Dias
Urozaev, Muhammad Izzatullah, Fuqiang Chen, Armando Carmona, Eduardo
Cano, Martyn Ovcharenko, Yan Yang, Daniel Kohn, Siarhei Khirevich, Matteo
Ravasi, Claire Birnie and others.
Anatoly Baumstein, Song Hou, and Andrey Bakulin for my industrial
experience and feedback. CGG for marine streamer data. KAUST, ECRC and
Saudi Aramco for giving me the environment and for funding my work.
https://inhabitat.com/kaust-breakwater-beacon-is-a-naturally-cooled-lighthouse-in-saudi-arabia/
63. Averaging
Workflow: 0. Modeling, 1. Averaging, 2. Variance, 3. Flooding
Weighted average using weights that assign more weight to lower frequencies, since these are less prone to cycle-skipping
64. Variance
Weighted variance using weights
Indicates how much a variable deviates from its weighted average value
65. Floating threshold
Quantities involved: initial threshold, mean of variance map, max of variance map, max threshold in flooding history
66. Flooding
High-variance mask and flooding within the mask
Low SNR: flooding with the mean from a half-wavelength circle
Noise-free scenario (infinite SNR): flooding with the maximum value