This document discusses using Python for solving problems in geophysics. It begins by defining geophysics as the application of physics to the study of the Earth, its environments, and its processes. It then discusses various geophysical themes like gravity, heat flow, electricity, fluid dynamics, magnetism, radioactivity, and vibration. The rest of the document focuses on different geophysical libraries and software that can be used with Python, applications of geophysics to energy exploration and production, and challenges of dealing with big data in upstream oil and gas.
29. ALMOST ALL OF THAT IS OPEN-SOURCE (AND SO IS THE DATA)
BUT HERE’S THE KICKER:
30. GEOPHYSICS-FOCUSED SCIPY TALKS
2012
ALGES: Geostatistics and Python
Py-ART: Python for Remote Sensing Science
Building a Solver Based on PyClaw for the Solution of the Multi-Layer Shallow Water Equations
2013
Modeling the Earth with Fatiando a Terra
2014
The Road to Modelr: Building a Commercial Web Application on an Open-Source Foundation
Measuring Rainshafts: Bringing Python to Bear on Remote Sensing Data
The History and Design Behind the Python Geophysical Modeling and Interpretation (PyGMI) Package
Prototyping a Geophysical Algorithm in Python
2015
(and an entire Geophysics Track)
Using Python to Span the Gap Between Education, Research, and Industry Applications in Geophysics
Practical Integration of Processing, Inversion, and Visualization of Magnetotelluric Geophysical Data
Striplog: Wrangling 1D Subsurface Data
Geodynamic Simulations in HPC with Python
62. VARIETY
• Structured
• Standard data models
• SEG-Y
• WITSML
• RESQML
• PRODML
• LAS (a short reading sketch follows this list)
• .shp, .lyr, other GIS files
• Unstructured
• Images (maps, embedded well logs in PDFs)
• Audio, video
• …and more, on both fronts
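As a small illustration of how approachable the structured formats are, here is a minimal sketch of reading a LAS well log in Python with lasio. This is not from the talk: lasio isn't in the library list later on, and the filename and curve mnemonic are placeholders.

```python
# Hedged sketch: "well.las" and the "GR" curve are placeholders.
import lasio

las = lasio.read("well.las")   # parse the header sections and curve data
print(las.keys())              # curve mnemonics, e.g. ['DEPT', 'GR', 'RHOB', ...]
gr = las["GR"]                 # one curve as a numpy array, if the file has it
df = las.df()                  # all curves as a pandas DataFrame indexed by depth
print(df.describe())
```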
72. UNCONVENTIONALS
Huge number of wells operating simultaneously
Operators need to make decisions very quickly, and are far removed from central business units – autonomy
• Geology interpretation – comparing geology to production
• New well delivery – improving drilling and completions, reducing lag time and minimizing the number of wells in process at any given moment in time
• Well and field optimization – well spacing and completions techniques (cluster spacing, number of stages, proppants and fluids used, etc.)
73. CONVENTIONALS
Fewer wells in this scenario
Can still spot trends from the constant streams of information, particularly sensors – spotting where a piece of equipment might fail
Reducing the potential for environmental disasters
74. MIDSTREAM / DOWNSTREAM
Monitoring pipelines and equipment for a more predictable and precise approach to maintenance
Preventing shutdowns and launching interventions to prevent spills
Ideally, we would have as few people operating in hazardous locations as possible
75. Historically, oil companies relied on operating models that focused on functional excellence and clear hand-offs from one function to the next. This process takes time, and it breaks down when you have to make decisions quickly.
76. Each individual function may have a wealth of data, but unless your model can put it all in a single location, analyze it, and place that information in the right hands at the right time, it’s difficult to improve performance. (Bain Energy Report)
So you’ll have a good idea on whether you want to stick around or not… ;)
General overview of what Geophysics is
Listing of some of my favorite python and geophysics libraries
People are doing great work, and deserve to be recognized
The final topic is going to be what I know best (I guess) – the progression of data throughout the life cycle of the oil industry
Employed by a truly rad technology-focused O&G company by day, MS Earth Sciences graduate student at Rice University by night, founder of PyLadies-HTX (though these sometimes all bleed into one another)
Background: degrees are ABA, BA Sociology, BS Geophysics – which is the weirdest combo anyone could ever have
Adrian Lenardic’s first class.
Magritte actually had a series of paintings of curiously-shaped rocks suspended in space, or in natural settings. Arches national park; other curious geologic formations. How did they get there? What processes shaped them?
Hydrology and the Talking Heads.
Huge concepts, right?
Bouguer anomaly
Geoid
Geopotential
Gravity anomaly
Undulation of the geoid
20,000 feet tall
Cathedral sized. More than a cathedral. For context, the Empire State Building is like 1300 feet.
And they’re all over the dang place. Mention the Lake Peigneur salt mine fiasco.
Geothermal gradients and internal heating
Subsurface heat flow – whole earth geophysics
Heating of hydrocarbons – if the organic material is too deeply buried, it turns into gas or “overcooks” entirely
Isostasy
Post-glacial rebound
Mantle convection
Geodynamo
Rate of lithospheric uplift due to Postglacial Rebound, as modelled by Paulson, A., S. Zhong, and J. Wahr. Inference of mantle viscosity from GRACE and relative sea level data, Geophys. J. Int. (2007) 171, 497–508. doi: 10.1111/j.1365-246X.2007.03556.x
This layered beach at Bathurst Inlet, Nunavut, Canada, is an example of post-glacial rebound after the last Ice Age. Little to no tide helped to form its layer-cake look. Isostatic rebound is still underway here.
The Earth’s poles sometimes reverse direction – and we don’t know why. North at the bottom, south at the top. What’s interesting is that as the seafloor spreads, cools, and lithifies, certain minerals in the rock orient themselves to align with Earth’s current polarity. This means that as you check magnetism readings along the bottom of the seafloor, you see these wonderful bands
Whole earth perspective: Earth’s magnetic field
Basically materials science – researching how structures change based on differential heating, pressure, compaction. Same chemical makeup, different expressions and structures.
A great resource for this is USGS’s earthquakes website.
Joe Kington’s presentation on 3D-printing cubes of geology (to get a better feel for the stratigraphy) and seismic
Madagascar – multi-dimensional data analysis, including seismic processing
PySIT – imaging and inversion
Segpy – reading and writing SEG-Y files
segpy-py – reading SEG-Y files
SLIMpy – processing front end
Fatiando a Terra – geophysical modeling and inversion; extensive cookbook
ObsPy – seismology toolbox (see the short sketch after this list)
PyGMI – 3D interpretation and modelling of magnetic and gravity data
SimPEG – simulation and parameter estimation in geophysics; great learning utility
Seismic Handler – signal processing for earthquakes
sgp4 – tracking earth satellites
Py-ART – python ARM radar toolkit (weather data)
SgFm – sediment transport at geologic scale
Laspy – LAS file conversion
ParaView Geo – 3D geoscience visualization
3ptScience – Rowan Cockett’s website
Bruges – modelling and post-processing seismic reflection data
Modelr – seismic forward modeling on the web
Pick This – social image interpretation
G3.js – coming soon, a geoscience wrapper for D3.js
Striplog – wrangling 1D data, usually core with varying sample rates
ArcPy – geospatial processing tools for ArcGIS
PyQGIS – the same, for the open-source mapping alternative QGIS
University of British Columbia
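To make the list a little more concrete, here is a minimal sketch of loading and filtering a waveform with ObsPy, one of the toolboxes above. It's only an illustration (not part of the original talk); read() with no arguments returns ObsPy's bundled example data, so it runs without any local files.

```python
# Illustrative only: load ObsPy's bundled example data, then inspect and filter it.
from obspy import read

st = read()                      # Stream of example traces (no file needed)
print(st)                        # station, channel, sampling rate, sample count
tr = st[0]                       # first Trace in the Stream
tr.detrend("demean")             # remove the mean before filtering
tr.filter("bandpass", freqmin=1.0, freqmax=10.0)  # simple band-pass filter
print(tr.stats.sampling_rate, len(tr.data))
```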
SEG-Y is one of the standards developed by SEG for storing geophysical data
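For a sense of what the format looks like from Python, here is a hedged sketch that reads a SEG-Y file with segyio, an open-source reader that isn't in the list above but is widely used; the filename is a placeholder.

```python
# Hedged sketch: "survey.sgy" is a placeholder SEG-Y file.
# ignore_geometry=True skips inline/crossline sorting so any file will open.
import segyio

with segyio.open("survey.sgy", "r", ignore_geometry=True) as f:
    print(f.tracecount)       # number of traces in the file
    print(f.samples[:5])      # sample times/depths from the binary header
    first = f.trace[0]        # amplitudes of the first trace as a numpy array
    print(first.min(), first.max())
```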
USGS puts out scads of data sets; so does NASA. Mention the importance of Python in geoscience research (and science research in general) because there’s a move toward reusable code and repeatable experiments
“Github for scientists is just… Github.”
SEG Hackathon – sponsored by Agile geoscience, I believe it’s their third
Saturday and Sunday, October 17th and 18th so you can go to this without going to the SEG Conference as a whole, if you can’t get off work.
…but now for something completely different. And apologies for focusing on the oil and gas aspects of energy.
1927 by Conrad Schlumberger, though he’d been formulating the idea since 1919
He sent down a sonde (sensor attached to a wire) into a 500m deep well in the Alsace region of France and started collecting information
“Electrical resistivity log”
All measurements were made by hand
1921 by J. Clarence Karcher, who was an Electrical Engineer
This is the means by which the majority of the world’s oil reserves have been discovered
Founded Geophysical Service Incorporated in 1930, which eventually turned into Texas Instruments
Got the idea because his assignment in World War I, the assignment that took him out of grad school, was to locate heavy artillery batteries in France by studying the acoustic waves the guns generated in the air.
He noticed an unexpected event in his research and switched his concentration to seismic waves in the earth
He thought it would be possible to determine the depths of the underlying geologic strata by vibrating the earth’s surface while precisely recording and timing the waves of energy
Earliest known oil wells were drilled in China, in 347 AD
These wells had depths of up to about 790 feet, and were drilled using bits attached to bamboo poles
Asphalt was used more than 4,000 years ago in the construction of the walls of Babylon. Ancient Persians were using petroleum for medicinal and lighting uses. The first streets of Baghdad were paved with tar.
Befuddled “shoot the ground and gusher comes up” situations. Producing dozens of barrels a day, maybe hundreds, but recovery rates were exceptionally low, and you weren’t really finding anything interesting.
I guess the point that I’m trying to make is that…
[read slide]
Advances in technology create a marked step change in petroleum exploration. Those advances are primarily in terms of better hardware / equipment, which give explorers better data about the subsurface. The data is the key.
Now, I’m a geophysicist – so those advances are the ones I’m best at spotting.
Point out the upticks for 2D seismic, better resolution for 3D seismic
80’s: 2D data acquired, pre-stack and post-stack imaging, Cray supercomputers
90’s: 3D narrow azimuth data, 3D post-stack and pre-stack imaging, Unix
00’s: 3D wide azimuth data, imaging, reverse time migration; Linux clusters
Now: coil shooting, continuous machine-generated sensory data
Mathematical insights – mention that last night you found out that the guy who first discovered the FFT was a Chevron employee, ain’t no thing
Point out fracking boom, mention that the crazy upward tick has continued, though the steepness of the slope has decreased a bit due to the drop in oil prices
Shamelessly stolen from Wikipedia
7 out of 10 of the largest public, state-owned, and private businesses – and a huge proportion of the overall list. Trillions of dollars of revenue.
Direct link to reserves and success of a company. We’re selling a thing; the margins on the beef jerky you buy in a gas station are higher than the margins for a barrel of oil
Oil companies are all in the business of getting barrels out of the ground – so characterizing the subsurface is incredibly important. Both of those bits of data that I mentioned before – that came so late in the game – were huge technological step changes for the industry, and drastically impacted oil discovery.
Improved resolution within the reservoir is critical because deepwater wells cost a lot - $100 million or more – and fully exploiting assets is essential
The oil industry is a bit like an ecosystem. This particular piece is subsurface characterization – the earth science-y and engineering bits
Every image you see here has a data type (or more!) associated with it, and, though it’s getting better, a shortage of standards
So these components of the energy ecosystem, and this subsurface data workflow, can be grouped into “earth science-y bits” and “engineering bits” with this kind of fuzzy area in between with petrophysics. Earth scientists record millions and billions of data points called “seismic” and they don’t trust any of them unless you put them all together
Engineers trust pressure readings in the well, the stuff they can measure with sensors – and trust it everywhere, and extrapolate everywhere
Something that I should also mention is that this is an iterative process. I put a loop here, but in reality, all of these steps can feed back into one another – and a change to one component of the subsurface model drastically impacts all other components
New sorts of geology: horizontal drilling and hydraulic fracturing combined have been revolutionary
All that I mentioned before was earth sciences or drilling related – impacting the “upstream” components of the oil industry.
But in reality, data impacts every single component of the oil and gas value chain. And what’s more: it’s a variety of data, coming in at asynchronous rates.
How we get it, how we transport it, how we process it, how we use it – all of these components have the opportunity to be honed by analytics insights.
Streamlining the transport, refinement, and distribution of O&G is vital.
So this past decade, the first of the two-thousands, 2000 – 2010, has been the decade of “big data”.
Kind of a buzzword, right? Like “in the cloud”.
and if you thought there was a lot of data in this first decade, you realize there's going to be a heck of a lot more in the second.
Mobility, infrastructure, and collaboration technologies currently are the biggest investment areas
In the next three to five years, investments are expected to increase in big data, the industrial IoT, and automation
In a recent study (May 2015) from Microsoft and Accenture, 86 – 90% of respondents said that increasing their analytical, mobile, and internet of things capabilities would increase the value of their business
In the near term during the current low crude price cycle, approximately 3 out of 5 respondents said they plan to invest the same amount (32%) or more or significantly more (25%) in digital technologies
89% noted that leveraging more analytics capabilities would add business value
90% felt more mobile tech in the field would add business value
86% leveraging more IIoT and automation would boost value
That’s near unanimous. I’ve never seen management be unanimous about *anything*.
Structured and Unstructured Data
Data scientists seem to really like alliteration, for whatever reason.
…and all supposedly leading up to “Value”
In the 80’s, seismic was gigabytes in size; some people were still hand-interpreting on paper
Static
5D interpolation: can produce file sets that exceed 100 TB in size. Some seismic surveys I’ve seen – regional studies – can reach petabytes.
This is partially due to the way that the seismic is acquired
Coil seismic has replaced lines and grids – explain why, and explain why that impacts the size of the data that you’re looking at
Real-Time
Shell is using fiberoptic cables created in a special partnership with HP for their sensors, and this data is transferred to AWS servers – 1TB / day
And it’s not just in the engineering realm. On the business side:
Chevron’s internal IT traffic alone exceeds 1.5 TB a day – and that’s 2013 numbers.
CAT scanning of cores
What you’re seeing here is a subsection of the well
Pore-scale imaging (.01 to 10 microns) can generate large data sets, as well: a centimeter cubed can exceed 10GB, and when you take into account that you’re measuring 1000 meters of core, that’s 1 exabyte
Reducing the approximations, improving the equations
Images taken from Schlumberger
Handled with specific applications used to manage surveying, processing and imaging, exploration planning, reservoir modeling, production, and other upstream activities
The structured stuff’s (mostly) easy to deal with. You might not have standard naming conventions, and it might not always be as complete as you’d like, but (for the most part) you know what you’re getting and you know what it’s intended for
Unstructured or semi-structured such as:
Emails
Word processing documents
Spreadsheets
Images
Voice recordings
Multimedia
Data market feeds
Pictures of well logs
PDFs
This all makes it difficult and costly to store in traditional data warehouses or routinely query and analyze. Enter Hadoop (or other large-scale unstructured databases)
And a note – even though data is structured, it can come in a variety of formats. There’s no such thing as a pristine data set, out of the box.
Real-time streaming data: offshore, onshore; pipelines, refineries, in the wellbore, on machinery at the wellsite, in office buildings…
But, again, it’s that variety in the velocity that’s important. We have some data that comes in immediately, and some that comes in three months later via spreadsheet. How can we consolidate and use both?
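One hedged way to picture that consolidation, with entirely made-up file and column names: downsample the high-frequency sensor feed and join it to the slower spreadsheet data with pandas.

```python
# Hedged sketch: the file names and column names below are invented.
import pandas as pd

# High-frequency sensor readings (e.g. wellhead pressure, arriving continuously)
sensors = pd.read_csv("wellhead_pressure.csv", parse_dates=["timestamp"])

# Slower business data (e.g. allocated production, arriving months later via spreadsheet)
monthly = pd.read_excel("allocated_production.xlsx", parse_dates=["month"])

# Downsample the stream to daily means so the two sources live on comparable time scales
daily = (sensors.set_index("timestamp")["pressure_psi"]
                .resample("1D").mean()
                .reset_index())

# merge_asof attaches the most recent monthly record to each daily reading
combined = pd.merge_asof(daily.sort_values("timestamp"),
                         monthly.sort_values("month"),
                         left_on="timestamp", right_on="month")
print(combined.head())
```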
It’s not that great
“success rate” for exploration is very low
Studies show that a gradual shift to a data and technology-driven oilfield is expected to tap into 125 billion barrels of oil, equal to the current estimated reserves of Iraq
Currently, recovery rates are only about 50%. The biggest risk is finding the oil; the second biggest risk is getting it out of the ground safely.
Increased speed to first oil
Enhanced production
Reduced costs, such as non-productive time
Reduced risks, especially in the area of health, environment, and safety
Our survey of more than 400 executives in many sectors revealed that companies with better analytics capabilities were twice as likely to be in the top quartile of financial performance in their industry, five times more likely to make decisions faster than their peers, and three times more likely to execute decisions as planned. The evidence is compelling.
…which leads to more alliteration.
Remember what I said about data scientists loving alliteration?
So you’ve got all this data. How can you use it?
The business of a data scientist.
And making sure that data from all sectors is integrated.
And there are opportunities for so many others – everything from HR Analytics, to looking at social media to detect political unrest, to machine learning on seismic to detect channels or slug models – things that geologists usually hunt for
“Unconventional resources” such as shale gas and tight oil supply 20% of the gas used in the USA and are expanding rapidly around the globe.
Mention the tech talk that you went to that was sponsored by the SPE – Randy LaFollette, Baker Hughes
flat time
which crews are most efficient
bit economics
when to use different bits
mud-motor optimization
Not any of the fancy horizontal drilling.
Deepwater wells are key here; onshore is less complex.
Refineries have limited capacity, and fuel needs to be produced as close as possible to its point of end use to minimize transportation costs. Complex algorithms take into account the cost of producing the fuel as well as diverse data such as economic indicators and weather patterns to determine demand, allocate resources and set prices at the pumps.
Functional excellence isn’t something that can be sacrificed, by any means – it’s just that companies are going to have to leverage technologies in more ways to accelerate the decision making process.
Consider, for example, the new well delivery process, where performance metrics such as the time from spud to hookup or the dead time between steps require visibility into activity data from each function involved. If the functions (including land, regulatory, pad construction, drilling, completions and operations) run on different systems and rely on differently constructed data models, it becomes very difficult to have a clear, integrated view of what is happening in the field.
(and I’m paraphrasing)
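To make those metrics concrete, here is a hedged sketch (not from the Bain report) that computes spud-to-hookup time and the dead time between steps from a toy activity table; the layout, step names, and dates are invented.

```python
# Hedged sketch: the activity table, step names, and dates are invented.
import pandas as pd

activities = pd.DataFrame({
    "step":  ["pad construction", "drilling", "completions", "hookup"],
    "start": pd.to_datetime(["2015-01-05", "2015-01-20", "2015-02-15", "2015-03-10"]),
    "end":   pd.to_datetime(["2015-01-18", "2015-02-10", "2015-03-05", "2015-03-12"]),
}).sort_values("start").reset_index(drop=True)

# Spud-to-hookup: from the start of drilling to the end of hookup
spud = activities.loc[activities["step"] == "drilling", "start"].iloc[0]
hookup = activities.loc[activities["step"] == "hookup", "end"].iloc[0]
print("spud to hookup:", hookup - spud)

# Dead time: gaps between the end of one step and the start of the next
gaps = activities["start"].shift(-1) - activities["end"]
print("dead time:", gaps[gaps > pd.Timedelta(0)].sum())
```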
Companies that build better analytics capabilities concentrate their efforts in three areas: technology architecture, interaction between IT and the business, and hiring and retaining strong analytic talent.