3. Goodchild et al. (2012):
“The supply of geographic information from satellite-based and ground-
based sensors has expanded rapidly, encouraging belief in a new, fourth, or
“big data,” paradigm of science that emphasizes international
collaboration, data-intensive analysis, huge computing resources, and
high-end visualization.”
19. The Earth Engine Data Catalog
> 200 public datasets; > 5 million images; > 7 petabytes of data; > 4,000 new images every day
Landsat & Sentinel-1, -2: 10-30 m, weekly
MODIS: 250 m, daily
Weather & Climate: NOAA NCEP, OMI, ...
Terrain & Land Cover
Vector Data: WDPA, TIGER
... and upload your own vectors and rasters
28. Global composites with a few lines of code
var composite = ee.Algorithms.Landsat.simpleComposite({
collection: ee.ImageCollection('LANDSAT/LC08/C01/T1'),
asFloat: true
});
Map.addLayer(composite,
{bands: ['B4', 'B3', 'B2'], max: 0.3},
'composite');
https://code.earthengine.google.com/05d2e23206b329dfe696e5ba8e232c3f
30. The Earth Engine Code Editor
Labeled UI elements: Your Scripts & Example Scripts, API Docs, Your Data, Search, Your Code, Data Inspector, Batch Tasks, Output Console, Drawing Tools, Map
code.earthengine.google.com
45. Data Models
Feature
Image: a stack of georeferenced bands
Each band has its own: mask, projection, resolution
A list of properties, including: date, bounding box
47. Map
Apply a function to each element of a collection
A "map" (for-each) operation
Examples
● Compute area of each feature
● Cloud cover of each image
● Mosaic for each month
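The per-element "map" idea can be sketched in plain JavaScript, outside Earth Engine. The feature objects and helper below are illustrative stand-ins, not the real `ee.Feature` / `ee.FeatureCollection` API:

```javascript
// Plain-JS analogy of collection.map(): apply a function to every
// element and get a new collection back, without mutating the input.
// These "features" are simplified stand-ins for ee.Feature objects.
const features = [
  { id: 'a', width: 2, height: 3 },
  { id: 'b', width: 4, height: 5 },
];

// Like ee.FeatureCollection.map(fn): compute an area property per feature.
function mapCollection(collection, fn) {
  return collection.map(fn);
}

const withArea = mapCollection(features, (f) => ({ ...f, area: f.width * f.height }));
// withArea[0].area === 6, withArea[1].area === 20
```

In Earth Engine the same pattern applies to images: the mapped function runs once per element of the collection.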
48. Reduce
Aggregate everything in a collection
"Reduction"
Examples
● Summed area over all features
● Median-pixel composite
● Train a classifier
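The reduction idea can also be sketched in plain JavaScript; the summed-area example above might look like this (illustrative values, not real feature areas):

```javascript
// Plain-JS analogy of a reducer: fold an entire collection into one value.
// Here, a summed-area reduction over hypothetical per-feature areas.
const areas = [6, 20, 14];

function reduceSum(values) {
  return values.reduce((acc, v) => acc + v, 0);
}

const totalArea = reduceSum(areas); // 40
```

Earth Engine's `ee.Reducer` family generalizes this fold to features, pixels, and bands on the server side.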
63. Images are tiled during ingestion
Downsampled by averaging
During computation
Compute output tiles
Tiling
65. Images are tiled during ingestion
Downsampled by averaging
During computation
Compute output tiles
Find intersecting source tiles
Reproject into the output projection
Tiling
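The "downsampled by averaging" step can be sketched as a simple 2x2 mean over a grid of pixel values. This is an illustration of the pyramiding concept only; Earth Engine's ingestion pipeline is internal:

```javascript
// Pyramid downsampling by averaging: each pixel of the coarser level
// is the mean of a 2x2 block of the finer level.
function downsample2x(grid) {
  const out = [];
  for (let y = 0; y < grid.length; y += 2) {
    const row = [];
    for (let x = 0; x < grid[y].length; x += 2) {
      row.push((grid[y][x] + grid[y][x + 1] + grid[y + 1][x] + grid[y + 1][x + 1]) / 4);
    }
    out.push(row);
  }
  return out;
}

const tile = [
  [1, 3, 5, 7],
  [1, 3, 5, 7],
  [2, 4, 6, 8],
  [2, 4, 6, 8],
];
// downsample2x(tile) -> [[2, 6], [3, 7]]
```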
This section has information about the background and history of Earth Engine.
Setting the stage: this is what the external community was asking for around the time Earth Engine was under development (the paper itself was written several years before it was published). What you will discover is that Earth Engine delivers exactly what academia needed.
There are 40+ years of remotely sensed data available from a constellation of satellites and sensors, but many facilities still lack capacity for downloading and analyzing the data.
Organizing geospatial data is part of Google's mission! One of the fundamental goals of Earth Engine is to organize all that satellite imagery and make it accessible and useful.
The most efficient way to make the data accessible and useful is to "move the question to the data." Jim Gray elaborated on this idea in the influential book edited by Hey et al. of Microsoft Research (The Fourth Paradigm). Earth Engine implements this plan.
In order to move the question to the data, Earth Engine hosts a petabyte-scale archive of satellite imagery and other geospatial data on Google infrastructure. At the same time, Earth Engine provides an API for processing, analyzing, and visualizing the data, also on Google machines. Data storage and processing all happen at Google; you connect to the service through a web browser (details coming).
You need compute power colocated with the data.
You need an API to be able to implement your geospatial workflows.
Example.
The southeast coast of Borneo as seen in Landsat imagery. At the time this image was collected, it was the most cloud-free Landsat image of this area. Since the clouds move around over time, applying a simple algorithm to a stack of such images makes it possible to opportunistically choose clear pixels and create a cloud-free composite. For example, the per-pixel median over time gets rid of both cloud shadows (dark) and clouds (bright), leaving surface-reflected light (right in the middle of the distribution). Some of these places had never before been seen from space in their entirety. This was a clue that Earth Engine might be useful.
Once the concept was demonstrated for individual places, the next logical step was to create a cloud-free composite of the whole world. This greenest-pixel composite became known as Pretty Earth because it's not a literal representation of Earth's surface: there is not a cloud in the sky anywhere, and it's springtime everywhere. It is, however, a beautiful representation, and it became the Google Maps and Google Earth satellite basemap.
Once a cloud-free composite of the world was made for a single time period, the next logical step was to make many composites spanning 30 years of the Landsat record and turn those composites into a video. That video became known as Timelapse (developed in conjunction with Randy Sargent at CMU). It was released on the Time magazine website and won a Webby for best use of video on the internet.
This animation shows that Timelapse is nearly global in scope. That means you can go to your area of interest and see how places you care about have changed over time. A few compelling examples follow.
Meandering river in Pucallpa, Peru.
Urbanization in Suzhou, China.
Amazon deforestation. Timelapse is available from the Earth Engine home page. You can pan, zoom, and interact with the video to explore places of interest. You can make tours and/or embed the video in other websites.
Making Timelapse used a huge amount of data and millions of hours of computation. However, when such a job is run massively in parallel on Google infrastructure, it takes just a few days.
That's the background. Now it's time to talk about what Earth Engine is. At its core is the petabyte-scale data catalog. Here's a brief introduction to the public data hosted in the Earth Engine archive.
Landsat data: whatever USGS has, Earth Engine has. Landsats 1-8: raw, TOA reflectance, and surface reflectance. Sentinel-2: TOA reflectance. Sentinel-1: SAR data processed to backscattering coefficients. Most MODIS terrestrial composites. Global DEMs at a variety of resolutions. Multiple land cover datasets. Atmosphere and climate data. And growing daily: there is a one- or two-day latency between scene acquisition and ingestion into the catalog.
This is the pixel count of the Landsat archive. Making an image like this would be very, very difficult if you had to download all that imagery; in Earth Engine it takes just a few lines of code.
High temporal cadence data. This is a harmonic model fit to 10+ years of MODIS EVI composites. The colors represent the seasonality of maximum greenness predicted by the model. Another example of a very difficult analysis that can be done with very little code in Earth Engine. (We teach this example in about half an hour.)
Terrain data. The purpose of this slide is to illustrate that compelling visualizations can be created and exported exactly as they appear on screen. You have control over scale, projection and rendering.
This image is mean annual ozone from the merged OMI/TOMS dataset. Red is less ozone, blue is more ozone. There are lots of other climate and atmosphere data in the catalog. Use the search tool to explore the catalog.
A composite of Sentinel-1 backscattering coefficients.
You've seen a sample of what's in the data catalog. The next section is intended to give a sense of what you can do with those data.
Image is the fundamental raster data structure in Earth Engine. Bands. Pixels. Metadata.
ImageCollection is a stack or time series of images.
Feature is the fundamental vector structure in Earth Engine. Geometry and metadata.
FeatureCollection is a collection of Features (surprise).
Filter is how to limit the scope of analysis temporally, spatially or by metadata.
Reducer is the way to aggregate data. Statistics such as min, max, mean, SD, variance, covariance, linear regression, etc.
Joins are the way to combine different data sets.
Kernel facilitates image processing operations, for example convolution.
Machine Learning algorithms support supervised and unsupervised classification.
Projection is the way to control scale and appearance of data outputs.
And growing in response to user requests.
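The Filter concept above can be illustrated in plain JavaScript. The metadata records and helper below are hypothetical, not the real `ee.Filter` API:

```javascript
// Plain-JS analogy of a Filter: limit a collection by metadata,
// here a date-range filter over hypothetical image metadata records.
const images = [
  { id: 'scene1', date: '2016-03-01', cloudCover: 12 },
  { id: 'scene2', date: '2016-07-15', cloudCover: 80 },
  { id: 'scene3', date: '2017-01-10', cloudCover: 5 },
];

// Like collection.filterDate(start, end) in Earth Engine:
// keep only elements whose date falls in [start, end).
function filterDate(collection, start, end) {
  return collection.filter((img) => img.date >= start && img.date < end);
}

// filterDate(images, '2016-01-01', '2017-01-01') keeps scene1 and scene2.
```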
Concept: reduce a stack of images to one image. The inputs are the images. The output is the median. Map is just for review.
The reducer is repeated. It is evaluated separately for each band. Come to the arrays session to learn another way.
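The median reduction over an image stack can be sketched in plain JavaScript. Images here are plain 2-D arrays of a single band, a stand-in for what the server-side median reducer does per pixel:

```javascript
// Per-pixel median across a stack of single-band images: the core of a
// cloud-free median composite. Clouds (bright outliers) and shadows
// (dark outliers) fall at the ends of each pixel's sorted values.
function medianComposite(stack) {
  const h = stack[0].length, w = stack[0][0].length;
  const out = [];
  for (let y = 0; y < h; y++) {
    const row = [];
    for (let x = 0; x < w; x++) {
      const vals = stack.map((img) => img[y][x]).sort((a, b) => a - b);
      const m = Math.floor(vals.length / 2);
      row.push(vals.length % 2 ? vals[m] : (vals[m - 1] + vals[m]) / 2);
    }
    out.push(row);
  }
  return out;
}

// Three 1x2 "images": a bright cloud (0.9), a dark shadow (0.05),
// and clear surface (0.2) at the first pixel.
const stack = [
  [[0.9, 0.2]],
  [[0.05, 0.2]],
  [[0.2, 0.2]],
];
// medianComposite(stack) -> [[0.2, 0.2]]
```

For a multi-band image the same reduction is simply repeated per band, as noted above.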
Interactive mode is the Code Editor. To run big jobs at arbitrary scale and scope, use batch mode.
Where the magic happens: the online IDE at code.earthengine.google.com, running inside a Chrome browser. All you need to use Earth Engine is an internet connection. Note:
● Scripts tab (Git repositories and examples to help get started)
● Docs tab (API reference docs)
● Assets tab (upload your own data)
● Inspector tab (query the layers on the map)
● Console tab (messages)
● Tasks tab (execute long-running tasks)
● Code Editor (for JavaScript, but there's also a Python API)
● Map (including layer tools, geometry tools, etc.; the image displayed in this slide is from the Image Collection > Linear Fit example)
● Get Link button (send it to your friends!)
● Search bar and Help button
"Cloud computing: it's as if you had access to a supercomputer designed for geospatial analysis." All the heavy lifting is done on Google servers. Getting the result is low bandwidth.
How does it work? The code you write in the Code Editor is turned into an object representing a set of instructions, which is then sent to Google for processing. The analysis you requested runs in parallel on many computers. What comes back to your browser is only what you asked for: for example, a statistic or chart printed to the console, or small RGB tiles to display on the Map. This is low bandwidth, but it's as if you had access to a supercomputer for geospatial analysis.
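The "object representing the instructions" idea can be sketched in a few lines of plain JavaScript. The node shape and operation names below are hypothetical, not the real Earth Engine client library internals:

```javascript
// Sketch of deferred execution: client-side calls build a description of
// the computation (an expression tree) rather than running it locally.
function node(op, args) {
  return { op, args };
}

// A hypothetical tree for the simpleComposite example shown earlier.
const composite = node('Landsat.simpleComposite', {
  collection: node('ImageCollection.load', { id: 'LANDSAT/LC08/C01/T1' }),
});

// Only when a result is requested is the tree serialized and sent off
// to the server, which evaluates it in parallel and returns the answer.
const request = JSON.stringify(composite);
```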
See this paper for more details. Segue to publications.
So far we have covered a little background, the data catalog, and the API. What follows is a brief review of papers published using Earth Engine.
This visualization is from the New York Times article about the Pekel et al. publication:
https://www.nytimes.com/interactive/2016/12/09/science/mapping-three-decades-of-global-water-change.html
Lobell et al. (2016) fit regressions from many simulations of crop growth over a range of meteorological variables. They used relationships between remotely sensed variables and model variables (e.g., VI <-> LAI) to apply the equation in each pixel. They compared aggregate predictions to reported yields (with varying success).
Dong et al. (2016) used a phenology- and pixel-based algorithm to estimate rice paddy distribution. Flooding signals are based on the interaction of vegetation and water indices (e.g., LSWI > NDVI), with LST defining the growing season. Rice paddies are defined as areas flooded for at least 10% of the growing season.
Allred et al. (2016) used data on well location in conjunction with MOD17 NPP to estimate the loss of ecosystem services from oil and natural gas drilling. Well density per sq. km. determined the percentage (up to 33%) by which the NPP is reduced.
The previous examples were all published studies and what follows represent examples of operationalizing science through online, interactive apps.
Climate Engine allows users to interactively create maps of climate data and trends, again without reading or writing any code. It is an App Engine app. See https://developers.google.com/earth-engine/app_engine_intro
Screenshot of the eco dash monitor in Vietnam. Cumulative anomalies indicate the trajectory of the project area in terms of recovery from disturbance. An increasing trajectory indicates revegetation, while a decrease indicates lack of revegetation. http://ecodash-servir.adpc.net/ Also an App Engine app.
Push-button interactive app publishing. See https://developers.google.com/earth-engine/apps
All this will sound extremely mysterious, but will start to make more sense once you start using Earth Engine.
Vector data. Geometry + attributes.
Raster data. Now it's starting to get weird. Images can mix bands of different projection and scale.
A stack of images is an ImageCollection. A set of vectors is a FeatureCollection.
Do something to each element in a collection with map().
Aggregate data (i.e. compute statistics) with reduce().
See the docs for a definitive reference. Docs are auto-populated by the server, so should always be up to date.
"Spectral" reduction.
Convolutions, linear and non-linear.
Temporal reduction.
Spatial reduction
Spatial reductions (plural).
Raster to vector conversion.
AKA map algebra.
Earth Engine helpfully replicates one-band images in mathematical operations involving multi-band images.
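This broadcasting behavior can be sketched in plain JavaScript. Pixels are flattened to per-band scalars for brevity, and the helper is illustrative, not the real band-math implementation:

```javascript
// Sketch of one-band broadcasting in band math: when a single-band image
// meets a multi-band image, the single band is replicated to match.
function addImages(a, b) {
  const n = Math.max(a.length, b.length);
  const widen = (img) => (img.length === 1 ? Array(n).fill(img[0]) : img);
  const [wa, wb] = [widen(a), widen(b)];
  return wa.map((v, i) => v + wb[i]);
}

// A 3-band image plus a 1-band image: the single band is reused per band.
// addImages([1, 2, 3], [10]) -> [11, 12, 13]
```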
Parallelization
Images exist at multiple scales.
Locate the dataset.
Identify tiles in the area of interest (e.g. Code Editor map).
Resample and reproject as necessary according to the output, then do the computation.
Your code does not itself run at Google; it gets turned into a request object, which is sent to Google for processing. See Gorelick et al. for details.