Machine Learning encompasses data acquisition, transmission, retention, analysis, and reduction. The expected outgrowth of 24x7 data systems and operations centers is Knowledge Engineering and Data Intensive Analytics, also known as Machine Learning. This presentation develops Machine Learning concepts and applies them to the upstream O&G industry, with specific focus on the fundamental concepts and definitions of Machine Learning and on its application.
2. 1
Introduction to Southwestern Energy
Southwestern Energy Company (NYSE: SWN) is a leading natural gas and oil company with operations predominantly in the United States, engaged in exploration, development and production activities, including related natural gas gathering and marketing.
Source: http://www.swn.com/
3. 2
Digital Energy Luncheon
Machine Learning:
Fundamentals and E&P Applications
4. 3
Machine Learning
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
~Tom Mitchell
Source: Mitchell, T. (1997). Machine Learning. McGraw Hill.
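A minimal sketch of Mitchell's definition, under assumptions that are not in the slides: the task T is deciding whether a reading lies above a cutoff, the performance measure P is accuracy on a fixed test grid, and the experience E is the number of labeled examples seen. The cutoff value, the sampling scheme, and all function names are hypothetical.

```python
# Illustrative only: T = classify a reading as above/below a cutoff,
# P = accuracy on a fixed test grid, E = number of labeled examples seen.
TRUE_CUTOFF = 0.5            # hypothetical ground truth, unknown to the learner
GOLDEN = 0.618033988749895   # multiplier that spreads samples over [0, 1)


def experience(n):
    """Return n labeled examples (x, label) for task T."""
    xs = [(i * GOLDEN) % 1.0 for i in range(1, n + 1)]
    return [(x, x >= TRUE_CUTOFF) for x in xs]


def learn_cutoff(examples):
    """Place the cutoff midway between the closest negative and positive."""
    below = [x for x, label in examples if not label]
    above = [x for x, label in examples if label]
    lo = max(below) if below else 0.0
    hi = min(above) if above else 1.0
    return (lo + hi) / 2.0


def performance(cutoff, grid=1000):
    """P: fraction of test points classified the same as the true cutoff."""
    hits = sum(((j / grid) >= cutoff) == ((j / grid) >= TRUE_CUTOFF)
               for j in range(grid))
    return hits / grid


acc_small = performance(learn_cutoff(experience(2)))    # little experience
acc_large = performance(learn_cutoff(experience(200)))  # more experience
print(acc_small, acc_large)
```

In this toy setup the accuracy with 200 examples is higher than with 2, which is the sense in which the program "learns" under the definition.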
5. 4
Use Case #1 – Lateral Placement
Source: http://geology.com/articles/horizontal-drilling/
6. 5
Precursors to Machine Learning
Predictive Analytics
• Focuses on Prediction
– Based on Known Properties
– Learned from Training Data
Data Mining
• Focuses on Discovery
– Unknown Properties in Data
– The Analysis Phase of Knowledge Discovery
Machine Learning is the “Extraction of Wisdom by Understanding the underlying Data”
~Mark Reynolds
Source: Mark Reynolds, compilation
7. 6
Machine Learning: Data into Wisdom
Source: Mark Reynolds, compilation
[Figure: Data into Wisdom]
Disciplines: Seismic, Drilling, Completions, Production
Stages: Data, Information, Knowledge, Understanding, Wisdom, Application
Steps between stages: Visualization, Forensics, Analysis & Mining, Anticipating
Example data sources: RT Frac, Daily Rpts, Well Plan, RT Drill, Geosteer, AFE, RT Prod, Reservoir
9. 8
Use Case #2 – Offset Torque & Drag
Source: Gefei Liu, PVI Connecting Dots with Lines Using Drilling Software, August 20, 2013 http://www.pvisoftware.com/blog/2013/08/
10. 9
The Four Paradigms in O&G
• Empirical: O&G is where we found it
• Theoretical: O&G is where we expect it
• Computational: O&G is where we estimate it
• Data Exploration: O&G is where we infer it
Source: Mark Reynolds, compilation
11. 10
Machine Learning in the 4th Paradigm
The Catalyst
• Data captured by instruments
• Data generated by simulations
• Data acquired by sensor networks
The Destination
• Solutions from data analysis
• Solutions from data mining
• Solutions from visualization
• Solutions from drill down
• Solutions for bottom line
• Solutions using eScience
Source: eScience and the Fourth Paradigm: Data-Intensive Scientific Discovery and Digital Preservation, Tony Hey, Microsoft Research
http://www.alliancepermanentaccess.org/wp-content/uploads/2011/12/apa2011/15_%28Nov11%29TonyHey-APA%20Meeting.pdf
“eScience is the set of tools and technologies to support data federation and collaboration”
~Jim Gray
12. 11
Machine Learning in the 4th Paradigm
Acquire, Analyze, Annunciate, Archive, Analyze, Anticipate, Apply
Stages: Data, Information, Knowledge, Understanding, Wisdom, Application
Steps between stages: Visualization, Forensics, Analysis & Mining, Anticipating
Creating Informational Accessibility and Transparency
Discovering Experiential Performance Improvements
Segmenting Processes and Process Results
Replacing Human Decision w/ Automated Algorithms
Innovating New Models, Products, Services
Source: Mark Reynolds, compilation
13. 12
Modern Data Exploration
Unsupervised Learning
Supervised Learning
Reinforcement Learning
Semi-Supervised Learning
[Figure labels: 24/7, Predictive Analytics, Machine Learning, Data Mining, AI]
Source: Mark Reynolds, compilation
14. 13
Principal Concepts in Machine Learning
• Unsupervised Learning
– Data is unlabeled
• Supervised Learning
– Teach and train with data that is well labeled with a defined output
• Reinforcement Learning
– Feedback on the validity of each result guides the learning
• Semi-Supervised Learning
– Some of the data is labeled, some is unlabeled
Source: Mark Reynolds, compilation
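The difference between supervised and semi-supervised learning can be sketched with a nearest-centroid classifier. This is an illustration only: the readings, the labels "low" and "high", and the self-training step are assumptions chosen for brevity, not a method taken from the slides.

```python
from statistics import mean

# Hypothetical 1-D sensor readings; the labels are illustrative only.
labeled = [(1.0, "low"), (1.2, "low"), (0.9, "low"),
           (3.1, "high"), (2.9, "high")]
unlabeled = [1.1, 3.0, 2.8, 0.8]


def fit_centroids(pairs):
    """Supervised step: average the readings that share a label."""
    labels = {label for _, label in pairs}
    return {lab: mean(x for x, other in pairs if other == lab)
            for lab in labels}


def predict(centroids, x):
    """Assign x to the label whose centroid is nearest."""
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))


# Supervised learning: only the labeled data is used.
supervised = fit_centroids(labeled)

# Semi-supervised (self-training): label the unlabeled readings with the
# supervised model, then refit on the combined set.
pseudo = [(x, predict(supervised, x)) for x in unlabeled]
semi_supervised = fit_centroids(labeled + pseudo)

print(predict(supervised, 2.5), predict(semi_supervised, 1.4))
```

The unlabeled readings shift the centroids slightly, which is the only way they influence the semi-supervised model here.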
15. 14
Use Case #3 – Unsupervised Learning
Unsupervised Learning: torque increases in the curve
Source: Mark Reynolds, compilation
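One way to picture this use case is a two-cluster k-means on torque readings, with no labels supplied. The torque values, their units, and the choice of k-means are assumptions made for illustration; the slide does not state which algorithm was used.

```python
from statistics import mean

# Hypothetical surface torque readings (kft-lbf), ordered by measured depth.
torque = [4.1, 4.3, 4.0, 4.2, 4.4, 7.8, 8.1, 8.4, 8.0, 8.3]


def nearest(value, centers):
    """Index (0 or 1) of the center closest to value."""
    return 0 if abs(value - centers[0]) <= abs(value - centers[1]) else 1


def two_means(values, iterations=20):
    """Plain k-means with k=2 on 1-D data; no labels are used."""
    centers = [min(values), max(values)]      # simple deterministic start
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            groups[nearest(v, centers)].append(v)
        # keep the old center if a group happens to be empty
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, [nearest(v, centers) for v in values]


centers, assignments = two_means(torque)
first_high = assignments.index(1)   # first reading in the higher-torque cluster
print(centers, first_high)
```

The index of the first reading in the higher cluster marks where torque steps up, which in this made-up series would correspond to entering the curve.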
16. 15
Textbook Process of Machine Learning
Phase 1) Learning: Training Data, Pre-Processing, Learning, Error Analysis, Model
Phase 2) Prediction: New Data, Model, Predictable Result
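The two phases can be sketched end to end with a least-squares line fit. The data values and function names below are hypothetical, and a straight-line model is assumed only to keep the example short.

```python
from math import sqrt

# Hypothetical training data: (input, observed output); None marks a gap.
training = [(1.0, 2.1), (2.0, 3.9), (3.0, None), (4.0, 8.1), (5.0, 9.9)]


def preprocess(rows):
    """Pre-processing: drop rows with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def learn(rows):
    """Learning: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    slope = sxy / sxx
    return slope, my - slope * mx


def rmse(model, rows):
    """Error analysis: root-mean-square error of the model on rows."""
    slope, intercept = model
    squared = sum((slope * x + intercept - y) ** 2 for x, y in rows)
    return sqrt(squared / len(rows))


def predict(model, x):
    """Phase 2: apply the stored model to new data."""
    slope, intercept = model
    return slope * x + intercept


# Phase 1) Learning
clean = preprocess(training)
model = learn(clean)
training_error = rmse(model, clean)

# Phase 2) Prediction
result = predict(model, 6.0)
print(model, training_error, result)
```

Keeping the model as a plain value returned by Phase 1 makes the hand-off to Phase 2 explicit: prediction needs only the model and the new data.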
17. 16
Algorithmic Approaches
• Decision Tree Learning
– Maps observations to conclusions
• Association Rule Learning
– Discovers interesting relations between variables
• Artificial Neural Networks
– Incremental function modules
• Inductive Logic Programming
– Rule-based representations for input --> output
• Support Vector Machines
– Classification and regression
• Clustering
– Assignment of observations to clusters
• Bayesian Networks
– Probabilistic models correlating variables
• Reinforcement Learning
– Finds a policy that maps states to a desired outcome
• Representation Learning
– For example, principal component analysis
• Similarity & Metric Learning
– Learns a similarity measure from pairs of examples
• Sparse Dictionary Learning
– Represents each datum as a sparse linear combination of basis elements
• Genetic Algorithms
– Mimic natural selection as a search heuristic
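As one concrete instance from the list, a decision stump (a decision tree with a single split) shows how observations are mapped to conclusions. The observations, the labels "ok" and "alarm", and the function names are hypothetical.

```python
# Hypothetical observations: (feature value, conclusion).
observations = [(2.0, "ok"), (3.5, "ok"), (4.0, "ok"),
                (6.5, "alarm"), (7.0, "alarm"), (8.2, "alarm")]


def conclude(split, x, below="ok", above="alarm"):
    """Map one observation to a conclusion using the split."""
    return above if x > split else below


def fit_stump(rows):
    """Pick the split that misclassifies the fewest training rows."""
    values = sorted(x for x, _ in rows)
    # candidate splits: midpoints between neighbouring feature values
    candidates = [(a + b) / 2.0 for a, b in zip(values, values[1:])]

    def errors(split):
        return sum(conclude(split, x) != label for x, label in rows)

    return min(candidates, key=errors)


split = fit_stump(observations)
print(split, conclude(split, 5.0), conclude(split, 9.0))
```

A full decision tree repeats this split search recursively on each side; the stump is the smallest case of the same idea.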
18. 17
Use Case #4: Compositional Reservoir
SPE 154505
A novel approach for treating the phase stability and phase split
problems in compositional reservoir simulation…
~Vassilis Gaganis, et al
Source: SPE 154505: Machine Learning Methods to Speed up Compositional Reservoir Simulation, June 2012
19. 18
Machine Learning: The “Data Layer”
• Engineering the Source
– Signals, content, and characterizations
• Engineering the Data
– Address errant data
– Address valid spurious data
– Address data quality
• Engineering the Store
– Repository
– Recall and Reporting
– Representations
Data Acquisition
Data Transmission
Data Retention
Data Analysis
Data Reduction
Source: Mark Reynolds, compilation
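A sketch of the "Engineering the Data" step, reading "errant" as values that cannot be real readings and "valid spurious" as plausible values that are isolated outliers. That reading, the pressure channel, the range limits, and the spike tolerance are all assumptions for illustration.

```python
from statistics import median

# Hypothetical standpipe pressure samples (psi). -999.25 is a null marker
# often seen in well log files, used here as the errant value.
samples = [3010.0, 3025.0, -999.25, 3018.0, 4550.0, 3022.0, 3030.0]

VALID_RANGE = (0.0, 10000.0)   # assumed physical limits for this channel
SPIKE_LIMIT = 500.0            # assumed tolerance around the local median


def classify(values, window=3):
    """Tag each sample as 'errant', 'spurious', or 'good'."""
    in_range = [VALID_RANGE[0] <= v <= VALID_RANGE[1] for v in values]
    tags = []
    for i, v in enumerate(values):
        if not in_range[i]:
            tags.append("errant")       # cannot be a real reading
            continue
        start, stop = max(0, i - window), min(len(values), i + window + 1)
        # neighbours in the window, skipping errant samples and v itself
        near = [values[j] for j in range(start, stop)
                if j != i and in_range[j]]
        if near and abs(v - median(near)) > SPIKE_LIMIT:
            tags.append("spurious")     # plausible value, but an outlier
        else:
            tags.append("good")
    return tags


tags = classify(samples)
print(tags)
```

Tagging rather than deleting keeps the decision about spurious samples with the engineer, since a spike may be a real event.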
20. 19
Machine Learning: Data Diversity
• Macro (or field-level)
– Spatial
– Temporal
• Pad (or offset)
– Spatial
– Temporal
• Well (or wellbore)
– Spatial
– Temporal
• External
– Uploads
– Political, Climate, etc
• The 3 Cs of Data Quality
– Consistency
– Correctness
– Completeness
– [#4] Currency
– [#5] Conformity
Source: Mark Reynolds, compilation
Data Diversity - Spatial, Temporal, Referential
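Three of the data quality checks above (completeness, conformity, currency) can be sketched as simple record tests. The field names, the one-hour freshness limit, and the sample records are hypothetical.

```python
from datetime import datetime, timedelta

REQUIRED = ("well", "depth_ft", "timestamp")   # hypothetical schema
MAX_AGE = timedelta(hours=1)                   # assumed freshness limit


def quality_issues(record, now):
    """Return the names of the quality checks the record fails."""
    issues = []
    if any(record.get(key) is None for key in REQUIRED):
        issues.append("completeness")   # a required field is absent
    depth = record.get("depth_ft")
    if depth is not None and not isinstance(depth, (int, float)):
        issues.append("conformity")     # value is not in the agreed type
    stamp = record.get("timestamp")
    if stamp is not None and now - stamp > MAX_AGE:
        issues.append("currency")       # reading is too old to act on
    return issues


now = datetime(2015, 1, 1, 12, 0)
fresh = {"well": "A-1", "depth_ft": 9850.0,
         "timestamp": now - timedelta(minutes=5)}
stale = {"well": "A-1", "depth_ft": "9850 ft",
         "timestamp": now - timedelta(hours=3)}
partial = {"well": "A-1", "depth_ft": None, "timestamp": now}

for record in (fresh, stale, partial):
    print(quality_issues(record, now))
```

Consistency and correctness usually need a second source or a physical model to compare against, so they are left out of this sketch.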
21. 20
Machine Learning: The “Output Layer”
• Engineering the Store
– Data distribution
– Data staging
• Engineering the Recall
– Simple query
– Cube v Matrix
• Engineering the Use Case
– Destination: human
– Destination: machine
Classification
Regression
Clustering
Density Estimation
Dimensional Reduction
22. 21
Use Case #5: Decline Curve Anomaly
Source: Mark Reynolds, compilation
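A decline curve anomaly can be illustrated by fitting a curve and flagging months that depart from it. The slide does not say which model was used; this sketch assumes an exponential decline q = qi * exp(-D * t), synthetic monthly rates, and a 20% tolerance, all chosen for illustration.

```python
from math import exp, log

# Hypothetical monthly rates following q = 1000 * exp(-0.05 * t),
# with month 6 deliberately lowered to act as the anomaly.
rates = [1000.0 * exp(-0.05 * t) for t in range(12)]
rates[6] *= 0.6


def fit_exponential(qs):
    """Least-squares line through (t, ln q); returns (qi, decline)."""
    n = len(qs)
    ts = range(n)
    ys = [log(q) for q in qs]
    mt = sum(ts) / n
    my = sum(ys) / n
    slope = (sum((t - mt) * (y - my) for t, y in zip(ts, ys))
             / sum((t - mt) ** 2 for t in ts))
    return exp(my - slope * mt), -slope


def anomalies(qs, tolerance=0.2):
    """Months whose rate departs from the fitted curve by more than tolerance."""
    qi, decline = fit_exponential(qs)
    return [t for t, q in enumerate(qs)
            if abs(q / (qi * exp(-decline * t)) - 1.0) > tolerance]


print(fit_exponential(rates), anomalies(rates))
```

Because the anomalous month also takes part in the fit, it pulls the curve slightly toward itself; a robust fit or a fit on earlier months only would reduce that effect.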
23. 22
The Fast Data ecosystem in O&G
[Figure labels: Land, Drilling, Reservoir, Completion, Water, Production, Steering, Regulatory, Midstream]
Source: Assorted web images
24. 23
Security – OPC / SCADA / IIoT
Source: Industrial control systems and SCADA cyber-security, 11 August 2014, By Dr Richard Piggin
http://eandt.theiet.org/magazine/2014/08/cyber-security-new-battlefront.cfm
25. 24
Machine Learning must be Integrated
Systems & Knowledge Engineer
O&G Systems
Control Systems
Remote Systems
Information Systems
Embedded Systems
Robotic Systems
Data Fusion
Real-Time Systems
Look-Back Analysis
Look-Ahead Systems
Land and Regulatory
Geology Geophysics
Drilling Engineering
Completion Engineering
Production Engineering
Reservoir Engineering
Systems Engineering
Source: Mark Reynolds, compilation
26. 25
Algorithmic Approaches (revisited)
• Decision Tree Learning
– Maps observations to conclusions
• Association Rule Learning
– Discovers interesting relations between variables
• Artificial Neural Networks
– Incremental function modules
• Inductive Logic Programming
– Rule-based representations for input --> output
• Support Vector Machines
– Classification and regression
• Clustering
– Assignment of observations to clusters
• Bayesian Networks
– Probabilistic models correlating variables
• Reinforcement Learning
– Finds a policy that maps states to a desired outcome
• Representation Learning
– For example, principal component analysis
• Similarity & Metric Learning
– Learns a similarity measure from pairs of examples
• Sparse Dictionary Learning
– Represents each datum as a sparse linear combination of basis elements
• Genetic Algorithms
– Mimic natural selection as a search heuristic
27. 26
Keep Your Eye on the Prize
Data
Information
Knowledge
Understanding
Wisdom
Application
The question is NOT
“How can we … ?”
But instead
“What is the objective?”
( or “Why?” )
29. 28
Mark Reynolds
Mark Reynolds Vitae
• Southwestern Energy
• Lone Star College
• Intent Driven Designs
• Scan Systems
• Sikorsky Aircraft
• General Dynamics
• Southwestern Energy Email
– Mark_Reynolds@swn.com