2. Talk outline
• The BigDataEurope action
• The Big Data Integrator platform
• Pilots across all seven H2020 challenges
• Upcoming BDE activities
18 Oct 2016, www.big-data-europe.eu
4. Big Data Europe (CSA: 2015-17)
• Show the societal value of Big Data
o Across all societal challenges addressed by Horizon 2020
• Lower the barrier to using big data technologies
o Effort and resources to convert tools and workflows
o Skills and expertise
• Help establish data value chains
o Across languages, organizations, and domains
6. Stakeholder Engagement
• Present the action, showcase deployments
• Raise awareness about BDE results and what they mean for stakeholders
• Collect requirements to drive further development
[Timeline: M6, M12, M18, M24, M30]
7. Data Value Chain Evolution
[Figure: data value chain evolution in three stages (Stage 1, Stage 2, Stage 3). Value chain steps: Extraction, Curation, Quality, Linking, Integration, Publication, Visualization, Analysis. Domains shown: Health, Food, Climate, Energy, Transport, Societies, Security. Also labelled: Data Repositories, Linked Open Data, Cloud.]
9. Architecture
• Big Data Integrator (BDI):
o The prototype developed by BDE
• Main points of the architecture
o Dockerization
o Support layer, including integrated UI
o Semantification layer
11. Docker containers
• Docker offers lightweight virtualization
o Docker containers can be shared and provisioned on different Linux distributions and versions
• An identical base system is not required
• All BDI components are Docker containers
12. BDI components
• Processing and storage components
o Re-used existing Docker containers where available
o Dockerized by BDE otherwise
o Ensured all can be provisioned through Docker Swarm
• Components by BDE:
o Support Layer
o Semantic Layer
13. Support Layer
• BDE defines uniform UI stylesheets
o Web UIs of BDE Docker containers (including those for third-party components) follow these BDE stylesheets
• BDE-developed tools:
o Starting containers and their dependencies
o Monitoring execution
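
Starting containers together with their dependencies amounts to ordering components so that each one starts after everything it needs. The sketch below illustrates that idea only; the dependency map, the component names, and the `start_order` helper are hypothetical and are not taken from the BDI tooling.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: component -> components it needs running first.
# The names are illustrative only, not the actual BDI configuration.
DEPENDENCIES = {
    "hdfs-namenode": [],
    "hdfs-datanode": ["hdfs-namenode"],
    "spark-master": ["hdfs-namenode"],
    "spark-worker": ["spark-master", "hdfs-datanode"],
    "integrated-ui": ["spark-master"],
}


def start_order(dependencies):
    """Return components in an order where every dependency comes first."""
    # TopologicalSorter raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(dependencies).static_order())
```

Calling `start_order(DEPENDENCIES)` yields a list in which, for example, `hdfs-namenode` always precedes `spark-worker`; several valid orders may exist, so the exact sequence is not fixed.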
14. Semantic data lake
• Minimal pre-processing at ingestion
• Semantic layer maintains metadata
• Add meaning when retrieving/processing
[Figure: Data Lake (scalable unstructured data store), with relationship definitions and metadata expressed via JSON-LD, CSVW, R2RML, XML2RDF]
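
The "add meaning when retrieving" idea can be illustrated with a small sketch: the raw file is stored untouched, and a separately maintained mapping is applied only at read time. The mapping format, the `example.org` URIs, and the `read_as_triples` helper are assumptions made for illustration; they are not CSVW, R2RML, or any BDE API.

```python
import csv
import io

# Raw data as ingested: stored as-is, with no schema applied.
RAW_CSV = "id,temp_c,station\n1,21.5,ATH\n2,19.0,SKG\n"

# Hypothetical metadata kept by the semantic layer, separate from the data:
# how to mint a subject URI and which predicate each column maps to.
MAPPING = {
    "subject": "http://example.org/obs/{id}",
    "predicates": {
        "temp_c": "http://example.org/vocab/temperatureCelsius",
        "station": "http://example.org/vocab/station",
    },
}


def read_as_triples(raw_text, mapping):
    """Apply the mapping at read time, yielding (subject, predicate, value)."""
    for row in csv.DictReader(io.StringIO(raw_text)):
        subject = mapping["subject"].format(**row)
        for column, predicate in mapping["predicates"].items():
            yield (subject, predicate, row[column])
```

Because the mapping lives outside the data, it can be revised later without re-ingesting the raw file.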
16. Semantic layer tools
• BDE tooling for the Semantic Data Lake:
o Swagger: Semantics of RESTful APIs
o Semantic Analytics Stack (SANSA): Distributed data processing for large-scale RDF data
o Semagrow: SPARQL perspective over Big Data stores
19. SC2: Viticulture resources
Food and Agriculture
• AgInfra is a major infrastructure for agriculture researchers, serving cross-linked bibliography, data, and processing services
• Pilot automates publication ingestion and thematic classification
21. SC4: Traffic conditions estimation
Transport
• Estimation of real-time traffic conditions in Thessaloniki
• Combines:
o Traffic modelling from historical data
o Current measurements from a taxi fleet of 1200 vehicles
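
One simple way to combine a historical model with current fleet measurements is a weighted average whose weight shifts toward the measurements as more readings arrive. The slides do not say how the pilot combines the two sources, so the sketch below is an illustrative assumption; the `blend_speed` function and its `prior_weight` parameter are hypothetical.

```python
def blend_speed(historical_kmh, measured_kmh, prior_weight=5.0):
    """Blend a historical speed estimate with current measurements.

    historical_kmh: speed expected for this road segment and time of day.
    measured_kmh: list of current speed readings (possibly empty).
    prior_weight: how many readings the historical value counts as
                  (a hypothetical tuning parameter).
    """
    n = len(measured_kmh)
    if n == 0:
        # No vehicles reported on this segment: fall back to history.
        return historical_kmh
    measured_mean = sum(measured_kmh) / n
    return (prior_weight * historical_kmh + n * measured_mean) / (prior_weight + n)
```

With few readings the estimate stays close to the historical value; with many readings it approaches the measured mean.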
22. SC5: Climate modelling
Climate
• Discovering and re-using previously computed derivatives
o Lineage annotation: datasets and model parameters used to compute derivative datasets
o Finding appropriate past runs avoids repeating weeks-long modelling runs
• Preparing modelling experiments
o Slicing, transforming, and combining datasets into new datasets
o Submission to and retrieval from the modelling infrastructure
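
Lineage annotation makes re-use possible because a run can be identified by its inputs. A minimal sketch, assuming a hypothetical in-memory catalogue keyed by a hash of the input dataset identifiers and model parameters (the names `lineage_key` and `RunCatalogue` are illustrative and are not the pilot's actual schema):

```python
import hashlib
import json


def lineage_key(dataset_ids, parameters):
    """Derive a stable key from the inputs that produced a derivative."""
    canonical = json.dumps(
        {"datasets": sorted(dataset_ids), "parameters": parameters},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RunCatalogue:
    """Hypothetical in-memory index of past modelling runs by lineage key."""

    def __init__(self):
        self._runs = {}

    def register(self, dataset_ids, parameters, result_location):
        self._runs[lineage_key(dataset_ids, parameters)] = result_location

    def find(self, dataset_ids, parameters):
        """Return the stored result location, or None if no such run exists."""
        return self._runs.get(lineage_key(dataset_ids, parameters))
```

Sorting the dataset identifiers and the parameter keys before hashing means the same inputs always map to the same key, regardless of the order in which they were listed.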
23. SC5 Pilot: Points Demonstrated
Climate
• Existing infrastructure and stable, reliable software for parallel computation of models
• BDI is deployed as an external infrastructure for preparing and managing datasets
• BDI offers:
o Hive, for managing data as datasets that can be retrieved and manipulated, rather than as file blocks
o Cassandra, for storing structured and textual metadata used to search headers and lineage
25. SC7: Change detection & verification
Secure Societies
• Events are extracted from text published by news agencies and on social networking sites
• Events are geo-located and relevant changes are detected by comparing current and previous satellite images
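
Comparing a current image with a previous one can be sketched, in its simplest form, as per-pixel differencing against a threshold. The slides do not describe the pilot's actual change detection method, so this is only an illustration; the `changed_fraction` function, the grayscale list-of-lists representation, and the `threshold` default are assumptions.

```python
def changed_fraction(previous, current, threshold=30):
    """Fraction of pixels whose intensity changed by more than `threshold`.

    previous, current: equally sized 2-D lists of grayscale values (0-255).
    """
    if len(previous) != len(current) or any(
        len(p) != len(c) for p, c in zip(previous, current)
    ):
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in previous)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for prev_row, curr_row in zip(previous, current)
        for p, c in zip(prev_row, curr_row)
        if abs(p - c) > threshold
    )
    return changed / total
```

A location would be flagged for verification when the returned fraction exceeds some chosen limit; real satellite imagery would also need co-registration and radiometric correction, which this sketch leaves out.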
27. 2nd round of Societal Workshops
Challenge | Date | Location | Notes
Transport | 22 September 2016 | Brussels | Collocated with Big Data for Transport, Tisa workshop
Food&Agri | 30 September 2016 | Brussels | Collocated with DG AGRI WP2018-20 stakeholder consultation
Energy | 4 October 2016 | Brussels | Collocated with EC H2020 Info Day on “Smart Grids and Storage”
Climate | 11 October 2016 (1) | Brussels | Collocated with Melodies Project Event - Exploiting Open Data
Health | 19 October 2016 | Brussels | Standalone Workshop
Security | 18 October 2016 | Brussels | Standalone Workshop
Societies | 5 December 2016 | Cologne | Collocated with EDDI16 - 8th Annual European DDI User Conference
28. Other Activities
• Hands-on BDE pilots workshop
o Apache Big Data Europe, Seville, 14-16 Nov
o Enable big data technology practitioners to try out BDI and its components
o Fine-tune technical BDI requirements
• Various SC-focussed and general hangouts, follow!
o Apache Flink & BDE (20 Oct), free webinar