BigDataEurope 1st SC5 Workshop, Project Teleios & LEO, by M. Koubarakis, Univ. of Athens
1. Managing Big, Linked and Open Earth Observation Data: the Projects TELEIOS and LEO
Manolis Koubarakis
2. Introduction
• TELEIOS: a STREP project funded
under the 5th call of FP7/ICT
(Strategic Objective: Intelligent
Information Management).
• Managed by: DG CONNECT, Data
Value Chain Unit G3
• Duration: September 2010 –
August 2013
• Consortium:
• LEO: a STREP project funded
under the 7th call of FP7/ICT
(Strategic Objective: SME
initiative on analytics)
• Managed by: DG CONNECT, Data
Value Chain Unit G3
• Duration: October 2013 –
September 2015
• Consortium:
3. The life cycle of EO data
6/18/2015
[Figure: the life cycle of EO data. Labels: Mission 1, Mission 2, …, Mission N; Auxiliary data; Value-adding or downstream processing; Information products; Users]
4. The V’s of Big EO Data
• Volume: The Sentinel satellites are expected to
produce around 3000 TB yearly.
• Velocity: Several TB of new data will be
arriving every day.
• Variety: Copernicus data includes satellite
images, but also in-situ data. EO data are
usually complemented by auxiliary data (e.g.,
geospatial).
• Veracity: Data sources are of varying quality.
• Value: EO data gains value when analyzed,
correlated and enriched with other data sources
and turned into information and knowledge.
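As a rough consistency check of the volume and velocity figures above, the yearly estimate can be converted to a daily rate. This is only a back-of-the-envelope sketch: it assumes the 3000 TB are spread evenly over a 365-day year, and the function name is illustrative.

```python
# Back-of-the-envelope check: yearly Sentinel volume -> average daily volume.
# Assumption: data arrives evenly over a 365-day year.
YEARLY_VOLUME_TB = 3000  # figure quoted in the Volume bullet
DAYS_PER_YEAR = 365


def daily_volume_tb(yearly_tb: float, days: int = DAYS_PER_YEAR) -> float:
    """Average volume per day, in TB, for a given yearly volume."""
    return yearly_tb / days


print(f"{daily_volume_tb(YEARLY_VOLUME_TB):.1f} TB/day")  # prints "8.2 TB/day"
```

About 8 TB per day is in line with the "several TB of new data every day" stated under Velocity.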
5. Use Cases of TELEIOS and LEO
• Semantic catalogues for EO archives
• Real-time wild fire monitoring
• Diachronic burn scar mapping
• Publishing European EO datasets as linked data
• Precision Farming
6. Use Case I: Semantic Catalogues for
EO Archives
• Consider the following query:
Find images taken by the MSG2
satellite on August 25, 2007 which
contain fire hotspots in areas which
have been classified as forests
according to CORINE land cover, and
are located within 2km from an
archaeological site in the Peloponnese.
• Can I pose this query using current Web
interfaces for EO archives?
8. Example (cont’d)
• Well, only partially.
9. Example (cont’d)
• But why?
• All this information is available in the satellite
images and other auxiliary data sources of
EO data centers or on the Web.
• However, EO data centers today do not allow:
• the mining of satellite image content
and
• its integration with other relevant
data sources so the previous query can
be answered.
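To illustrate why integration matters, the sketch below shows that once image content (hotspots), land cover classes and archaeological sites sit in one place, the example query reduces to a filter plus a distance test. It is a toy model in plain Python, not the TELEIOS implementation: the record types, field names, identifiers and coordinates are all made up for illustration, and distances use the haversine formula on a spherical Earth.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Hotspot:
    image_id: str    # image in which the hotspot was detected (hypothetical id)
    satellite: str
    date: str        # ISO date of acquisition
    lat: float
    lon: float
    land_cover: str  # land cover class at the hotspot location (illustrative label)


@dataclass
class Site:
    name: str
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def find_images(hotspots: List[Hotspot], sites: List[Site], satellite: str,
                date: str, land_cover: str, max_km: float) -> List[str]:
    """Ids of images with a matching hotspot within max_km of any site."""
    result = set()
    for h in hotspots:
        if (h.satellite, h.date, h.land_cover) != (satellite, date, land_cover):
            continue
        if any(haversine_km(h.lat, h.lon, s.lat, s.lon) <= max_km for s in sites):
            result.add(h.image_id)
    return sorted(result)


# Made-up data: one archaeological site and four hotspots.
SITES = [Site("hypothetical site", 37.64, 21.63)]
HOTSPOTS = [
    Hotspot("img-001", "MSG2", "2007-08-25", 37.65, 21.64, "forest"),        # matches
    Hotspot("img-002", "MSG2", "2007-08-25", 37.80, 21.90, "forest"),        # too far away
    Hotspot("img-003", "MSG2", "2007-08-25", 37.65, 21.64, "agricultural"),  # wrong land cover
    Hotspot("img-004", "MSG2", "2007-08-26", 37.65, 21.64, "forest"),        # wrong date
]

print(find_images(HOTSPOTS, SITES, "MSG2", "2007-08-25", "forest", 2.0))
```

Only the first hotspot satisfies all conditions, so only its image is returned. In the projects themselves this kind of query is posed declaratively over linked data rather than with hand-written loops.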
11. Knowledge Discovery and Data
Mining
• Developed a knowledge discovery and data mining framework
for satellite images and related geospatial data.
• Applied the knowledge discovery and data mining framework to the
TerraSAR-X archive of DLR:
• Processed 300+ scenes (3TB data)
• Discovered 850+ semantic categories with high
precision and recall.
• The resulting Virtual Earth Observatory was used to develop:
• New generation of semantic catalogues for TerraSAR-X data
• Rapid mapping applications
16. Use Case IV: Publishing European
EO Datasets as Linked Data
• See also the Greek linked open data portal:
http://www.linkedopendata.gr/
• Available as linked data at http://datahub.io/organization/teleios
• Datasets shown: CORINE land cover, Urban Atlas
17. Use Case V: Precision Farming
Our challenge for 2050: feeding 9 billion people
We will live in a more populated and more urban world.
[Chart: urban population vs. rural population (*forecast)]
18. Precision Farming (cont’d)
Latest studies conclude that global agricultural supply needs to be increased by 70-150% to meet the increasing demand by 2050.
19. Precision Farming (cont’d)
• How can we increase and optimize agricultural productivity?
• higher yields with the same factor input, or
• production savings at an equal yield level, e.g., more efficient use of fertilizer and plant protection measures
• Precision farming (site-specific cultivation) is the technique to achieve this, providing both economic and ecological benefits.
• Goal: Develop a precision farming application utilizing open EO data and linked geospatial data.
20. Focus on Fertilization
Stefan Burgstaller
[Figure labels: Applying high amounts of fertilizers; Applying low amounts of fertilizers]
Limestone areas with reduced nutrient- and water-holding capacity show reduced biomass & yield
23. Strabon
• A state-of-the-art spatiotemporal RDF store.
Find more about Strabon at http://strabon.di.uoa.gr/
[Figure: Strabon architecture. Labels: Strabon; Repository; SAIL; Query Engine (Parser, Optimizer, Evaluator, Transaction Manager); Storage Manager; RDBMS; stSPARQL to SPARQL Translator; Named Graph Translator; GeneralDB; PostgreSQL; MonetDB; PostGIS; PostgreSQL Temporal. Inputs: stRDF graphs; stSPARQL/GeoSPARQL queries. Geometry formats: WKT, GML]
24. Sextant
• A browser and visualizer for linked
spatiotemporal data (available as a Web or
Android application)
http://bit.ly/sextant-rapid-mapping-attica
Find more at: http://sextant.di.uoa.gr/
25. MonetDB SciQL
• The scientific database query language SciQL in
MonetDB.
• One of the 3 international efforts used as a basis for the
ArrayQL standard (http://www.xldb.org/arrayql/).
Find more about SciQL at http://monetdb.org/
26. MonetDB Data Vault
• The data vault functionality in MonetDB
• LRIT/HRIT, GeoTIFF, FITS, mSEED and BAM file types can be handled now.
Find more about Data Vaults at http://monetdb.org/
27. GeoTriples
• A tool for transforming EO and geospatial data into RDF.
• Available from http://sourceforge.net/projects/geotriples/
[Figure: GeoTriples architecture. Labels: inputs KML, SHP, GeoTIFF, netCDF, Relational Database; Connector; Mapping Generator; R2RML Mapping Document; R2RML Processor; D2RQ Engine]
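To give a feel for what transforming geospatial data into RDF means, the sketch below serializes a single feature as N-Triples. It does not use GeoTriples or reproduce its mappings or output: the `http://example.org/` namespace, the function names and the sample feature are assumptions made for illustration. Only the GeoSPARQL terms (`hasGeometry`, `asWKT`, `wktLiteral`) and `rdfs:label` are standard vocabulary.

```python
from typing import List

GEO = "http://www.opengis.net/ont/geosparql#"      # GeoSPARQL vocabulary
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/"                         # hypothetical namespace


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def feature_to_ntriples(feature_id: str, label: str, wkt: str) -> List[str]:
    """Describe one feature and its WKT geometry as three N-Triples lines."""
    feature = f"<{EX}feature/{feature_id}>"
    geometry = f"<{EX}geometry/{feature_id}>"
    return [
        f'{feature} <{RDFS}label> "{escape_literal(label)}" .',
        f"{feature} <{GEO}hasGeometry> {geometry} .",
        f'{geometry} <{GEO}asWKT> "{escape_literal(wkt)}"^^<{GEO}wktLiteral> .',
    ]


# Made-up sample feature; WKT coordinates are in longitude-latitude order.
for line in feature_to_ntriples("1", "Forest patch", "POINT (21.63 37.64)"):
    print(line)
```

A mapping-driven tool generalizes this idea: instead of hard-coding the triple patterns, it reads them from a mapping document and applies them to every row or feature of the input.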
28. Silk
• Geospatial and temporal extensions of the tool Silk.
• Available from https://github.com/psmeros/stSilk
[Figure labels: contains, close]
29. Project MELODIES
• Maximising the Exploitation of Linked
Open Data In Enterprise and Science
• 8 use cases: Groundwater Modelling, Marine
Transport Services, Crisis Mapping,
Desertification Indicators, Ocean Status
Assessment, Land Management, Urban
Accounting and Emission Inventories.
• http://www.melodiesproject.eu/
30. Lessons Learned from TELEIOS, LEO and MELODIES
• Open EO data will continue to be produced.
• EO data is an important class of big data.
• EO data have all the V’s of big data.
• Transforming EO data into linked data and integrating it
with other kinds of linked data can help us develop
many interesting applications.
• Scientific Database and Semantic Web technologies
are important development tools for EO data.