iMarine empowers users in the marine community and beyond by providing a highly efficient e-Infrastructure, tools, and platforms that accelerate data discovery, exchange, and analysis and facilitate scientific discovery. The project is funded by the European Commission's 7th Framework Programme. A number of iMarine services are already available through the iMarine Gateway, supplying cross-disciplinary data to support experts in the field.
2. Outline
1. Project Info & Objectives (D. Castelli)
2. e-Infrastructure selected capabilities (A. Ellenbroek)
3. e-Infrastructure governance (M. Taconet)
4. Concluding remarks (M. Taconet)
Marine Knowledge All Projects meeting,
11-12 October 2012
3. iMarine project info
• Research Infrastructures CP & CSA funded by the
European Commission under the FP7 Capacities
Programme - eInfrastructure Unit DG INFSO
• 1 Nov 2011 - 30 Apr 2014
• 13 partners
• 660 p/m co-funded by EU + 123 p/m in-kind
contribution from external collaborators
4. iMarine Community
5. Objective
Launch an initiative aimed at establishing and
operating an e-infrastructure contributing to the
implementation of the principles of the Ecosystem
Approach to Fisheries Management and
Conservation of Marine Living Resources.
6. Implementing the EA
• Analysis and processing of large amounts of heterogeneous
information produced across domains
• Multidisciplinary & multifaceted collaboration at the local,
national, regional and international levels
[Diagram labels: Marine resource assessment; Habitat types; Socio-economic aspects; Inventories of biological information; Physical and chemical features; Fishery operation, processing and trade]
7. e-Infrastructure
Electronic platform operated by a responsible
entity offering an open set of basic enabling
services (including access to resources) to a
distributed Community of Practice. By
exploiting these shared services the members of
the Community of Practice realise economies of
scale.
8. iMarine focus
…
«The creation of the marine knowledge begins with
the observation of the sea and oceans. Data from
these observations are assembled, then analysed
to create information and knowledge.
Subsequently, the knowledge can be applied to
deliver smart sustainable growth, to assess the
health of the marine ecosystem or to protect
coastal communities.»
Marine Knowledge 2020 Communication
[Diagram labels: Assemble; Analyse; Production]
…
9. iMarine offer
…
Assemble
Analyse
Production
Functionality Capacity
…
10. Building upon existing e-Infrastructures
[Diagram labels: e-Infrastructure services; Genesi-DEC; GBIF; MyOcean; EGI (Grid & Cloud); VENUS-C (cloud); EMODNET]
11. e-Infrastructure ecosystem
• Interoperability
– Each e-Infrastructure can outsource
required facilities to other e-Infrastructures
– The same e-Infrastructure can play
both provider and consumer roles
• Competition
– The most effective and sustainable
e-Infrastructures will survive
12. Data infrastructure components
e-Infrastructure components:
• Software system
• Physical architecture (computing & storage resources)
• Data & sw tool resources
• Governing procedures and policies
13. Software system
• Application: Business Document Workflow; Ecological Niche Modelling; Time Series; Workspace; Vessel Activity Analyser
• Data Management: Access; Mining; Search; Storage; Transformation
• Enabling: Process Execution; Resource Discovery; Resource Management; Security
14. Functionality classes
• Data import & sharing
• Data harmonization, validation and enrichment
• Data transformation, publishing and visualization
• Advanced data analysis
• Collaborative environments (Virtual Research Environments)
16. Products
The initiative
(CoP, board, policies, sustainability, …)
The e-infrastructure
(the operational platform)
The system
(the enabling sw system)
17. We are not starting from scratch
EA Partnerships
Information Technology
iMarine
(2011-2014)
D4ScienceII (2010-2011)
D4Science
(2008-2009)
DILIGENT
(2004-2006)
19. iMarine e-Infrastructure - Options
Select from different Applications
20. iMarine e-Infrastructure – Selected Options
TimeSeries Data
• Import
• Validation
• Analysis
• Mining
Biodiversity Data
• Discovery
• Access
• Analysis
Geospatial Data
• Discovery
• Access
• Process
21. Time Series - Import
Import formatted and unformatted data
CSV import: harmonize, structure, and publish as Time Series
22. Time Series - Validation
Harmonize, Format, and Structure data
Use rules for formatting, range check, code-list recognition, etc.
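The three rule types named on this slide (formatting, range check, code-list recognition) can be illustrated with a small row-level validator. This is a minimal Python sketch and not the iMarine implementation: the column names (`year`, `catch_tonnes`, `species_code`), the range bound, and the code list are all hypothetical.

```python
import re

# Hypothetical code list; in practice it could be imported from a registry.
SPECIES_CODES = {"COD", "HER", "TUN"}

def validate_row(row):
    """Return a list of human-readable problems found in one record."""
    problems = []
    # Formatting rule (assumed): the year must be exactly four digits.
    if not re.fullmatch(r"\d{4}", str(row.get("year", ""))):
        problems.append("year: expected four digits")
    # Range rule (assumed): the catch must be a non-negative number.
    try:
        if float(row.get("catch_tonnes", "")) < 0:
            problems.append("catch_tonnes: negative value")
    except (TypeError, ValueError):
        problems.append("catch_tonnes: not a number")
    # Code-list rule: the species code must appear in the code list.
    if row.get("species_code") not in SPECIES_CODES:
        problems.append("species_code: not in code list")
    return problems

rows = [
    {"year": "2011", "catch_tonnes": "12.5", "species_code": "COD"},
    {"year": "11", "catch_tonnes": "-3", "species_code": "XXX"},
]
report = [validate_row(r) for r in rows]
```

Returning a list of problems per row, rather than failing on the first one, lets a user see every issue in a record before re-importing it.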
23. Time Series – Data Analysis
Time Series are treated as
tabular data
The Options include:
• Union / Join / Merge / Sum
• Graphs
• Plot on maps
• Analysis with R, Weka,
RapidMiner
• Safely share
• Publish to 'World'
24. Time Series – Code Lists
Import formatted and unformatted Code Lists
Create your own, or import from an SDMX registry. Useful in validation
25. Time Series Analysis - Data Mining
Outlier detection makes it possible to recognize anomalies in n-dimensional data sets
• Outlier detection
• Frequency detection
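As an illustration of outlier detection on n-dimensional records, the sketch below flags points that lie unusually far from the component-wise median. The slides do not say which algorithm the platform uses, so the method, the function name `find_outliers`, and the factor `k` are assumptions chosen for brevity.

```python
import math
from statistics import median

def find_outliers(points, k=3.0):
    """Return indices of points whose distance from the component-wise
    median exceeds k times the median of all such distances."""
    dims = range(len(points[0]))
    centre = [median(p[d] for p in points) for d in dims]
    dists = [math.dist(p, centre) for p in points]
    typical = median(dists)
    if typical == 0:
        # Degenerate case: most points coincide with the centre.
        return [i for i, d in enumerate(dists) if d > 0]
    return [i for i, d in enumerate(dists) if d > k * typical]

samples = [
    (1.0, 2.0, 3.0),
    (1.1, 2.1, 2.9),
    (0.9, 1.9, 3.1),
    (1.0, 2.2, 3.0),
    (9.0, 9.0, 9.0),  # deliberately anomalous record
]
outliers = find_outliers(samples)
```

Medians are used instead of means so that the anomalous record itself does not shift the reference point.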
26. Time Series Analysis - Vessel Position Analysis
Example: Time Series analysis of position observations
Time Series processing techniques can be exploited for:
- Aggregating Vessel Data
- Calculating fishing effort
- Classifying the fishing activity
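The three tasks listed above can be sketched on a toy track of position observations: the speed of each leg is derived from consecutive positions, each leg is labelled by speed, and the hours labelled as fishing are summed as a crude effort measure. The speed thresholds (2 to 5 knots), the labels, and the function names are hypothetical; the slides do not describe the actual classification rules.

```python
import math
from datetime import datetime

EARTH_RADIUS_NM = 3440.065  # approximate mean Earth radius in nautical miles

def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two positions, in nautical miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))

def classify_speed(knots, low=2.0, high=5.0):
    """Hypothetical rule: speeds between low and high knots count as fishing."""
    if knots < low:
        return "stationary"
    if knots <= high:
        return "fishing"
    return "steaming"

def analyse_track(track):
    """Label each leg between consecutive observations and sum the fishing hours."""
    labels, effort_hours = [], 0.0
    for (t1, lat1, lon1), (t2, lat2, lon2) in zip(track, track[1:]):
        hours = (t2 - t1).total_seconds() / 3600
        if hours <= 0:
            continue  # skip duplicate or out-of-order timestamps
        label = classify_speed(distance_nm(lat1, lon1, lat2, lon2) / hours)
        labels.append(label)
        if label == "fishing":
            effort_hours += hours
    return labels, effort_hours

# Invented hourly observations: (timestamp, latitude, longitude)
track = [
    (datetime(2012, 10, 11, 0), 50.00, 0.0),
    (datetime(2012, 10, 11, 1), 50.05, 0.0),
    (datetime(2012, 10, 11, 2), 50.25, 0.0),
    (datetime(2012, 10, 11, 3), 50.30, 0.0),
]
labels, effort_hours = analyse_track(track)
```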
27. Biological datasets
Biological Data Provider Status
Catalogue of Life Released (2.9.0)
GBIF Released (2.9.0)
ITIS Released (2.9.0)
OBIS Released (2.9.0)
WoRMS Released (2.9.0)
IRMNG Release (2.10.0) – October 16
NCBI Release (2.10.0) – October 16
28. Biodiversity Products Retrieval
Discovery and access across heterogeneous providers
Search by scientific name or common name; retrieval of taxonomy items and
occurrence points
29. Biodiversity – Taxonomic Items
• Active links on selected
items
• Common names matrix
• Checklists (DwC-A)
production via jobs
– Executed in batch
– Concurrent jobs
– Live monitoring
30. Biodiversity – Occurrence Points Visualization
• Active links on selected
items
• Geo-visualisation
• Export
– DarwinCore
– CSV
– CSV for openModeller
31. Biodiversity – Occurrence Points Analysis
A set of probabilistic operations on Occurrence Points.
Two thresholds: T° for spatial proximity, Ts for similarity confidence.
Operations:
• Merge(A, B), with thresholds T°, Ts
• Inters(A, B), with thresholds T°, Ts
• NoDuplicates(A), with thresholds T°, Ts
• OnEarth(A)
• InSea(A)
Comparison of a record of A with a record of B (fields: x,y; Event Date; Modif Date; Author; Species Scientific Name):
• x,y: distance < T°
• Event Date = Event Date
• LexicalD(Author) * LexicalD(SciName) > Ts
• Take the most recent (Modif Date)
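One possible reading of the comparison shown on this slide is sketched below for the NoDuplicates operation: two records are treated as the same observation when they are closer than T°, share the event date, and the product of their author and scientific-name similarities exceeds Ts; the most recently modified record is kept. The slide does not define LexicalD, so `difflib.SequenceMatcher` is used as a stand-in, distance is taken as plain Euclidean distance in degrees, and the record keys are hypothetical.

```python
import math
from difflib import SequenceMatcher

def lexical_sim(a, b):
    """Stand-in for LexicalD: similarity in [0, 1] between two strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def same_observation(p, q, t_deg, t_sim):
    """Apply the spatial, date, and lexical checks to two records."""
    close = math.hypot(p["x"] - q["x"], p["y"] - q["y"]) < t_deg
    same_date = p["event_date"] == q["event_date"]
    sim = lexical_sim(p["author"], q["author"]) * lexical_sim(p["sci_name"], q["sci_name"])
    return close and same_date and sim > t_sim

def no_duplicates(points, t_deg, t_sim):
    """Keep one record per observation, preferring the most recently modified."""
    kept = []
    for p in points:
        for i, q in enumerate(kept):
            if same_observation(p, q, t_deg, t_sim):
                if p["modified"] > q["modified"]:  # ISO dates sort as strings
                    kept[i] = p
                break
        else:
            kept.append(p)
    return kept

# Invented records for illustration only.
points = [
    {"x": 12.50, "y": 41.90, "event_date": "2010-06-01", "modified": "2011-01-10",
     "author": "Linnaeus", "sci_name": "Gadus morhua"},
    {"x": 12.51, "y": 41.90, "event_date": "2010-06-01", "modified": "2012-03-05",
     "author": "Linnaeus", "sci_name": "Gadus morhua L."},
    {"x": 30.00, "y": 10.00, "event_date": "2009-02-14", "modified": "2010-07-01",
     "author": "Linnaeus", "sci_name": "Thunnus thynnus"},
]
unique = no_duplicates(points, t_deg=0.05, t_sim=0.8)
```

Under the same assumptions, Merge(A, B) could be approximated by running `no_duplicates` on the concatenation of the two sets.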
32. Biodiversity – Occurrence Points Analysis
• Dedicated environment for
occurrence points
management
• Open environment
• Export
– DarwinCore
– CSV
– CSV for openModeller
33. Geospatial – A Simple Scenario
After the Joint Activity with the other participants
34. Visualization Example: Neural Network inferred suitable range maps
[Map panels]
• DISTRIBUTIONS / AQUAMAPS_SUITABLE: Aquamaps Suitable Distribution
• DISTRIBUTIONS / AQUAMAPS_SUITABLE_NEURAL_NETWORK: Aquamaps Neural Network Suitable Distribution
35. A big plus: Integrated Quality Analysis with Biodiversity products
EVALUATORS / QUALITY_ANALYSIS
Quality Analysis on AbsencePresence Points (Res. 0.5 degrees)

Metric               Aquamaps_suitable   Neural Network_suitable
                     (eq. to native)     (eq. to native)
TRUE POSITIVES       13                  32
FALSE POSITIVES      0                   0
TRUE NEGATIVES       7                   7
FALSE NEGATIVES      21                  2
ACCURACY             0.49                0.95
SENSITIVITY          0.38                0.94
SPECIFICITY          1                   1
OMISSION RATE        0.62                0.059
ROC BEST THRESHOLD   0.17                0
AUC                  0.41                1
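The accuracy, sensitivity, specificity, and omission rate reported on this slide follow from the four confusion-matrix counts. The sketch below uses the standard definitions, which reproduce the reported figures after rounding; the function name is hypothetical, and the ROC best threshold and AUC are left out because they require the underlying model scores, which the slide does not give.

```python
def quality_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics for presence/absence predictions."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of correct predictions
        "sensitivity": tp / (tp + fn),      # presences correctly predicted
        "specificity": tn / (tn + fp),      # absences correctly predicted
        "omission_rate": fn / (tp + fn),    # presences missed (1 - sensitivity)
    }

# Counts taken from the slide.
aquamaps = quality_metrics(tp=13, fp=0, tn=7, fn=21)
neural = quality_metrics(tp=32, fp=0, tn=7, fn=2)
```

With no false positives in either case, the gap between the two models comes entirely from false negatives: 21 missed presences against 2.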
36. iMarine e-Infrastructure
[Architecture diagram labels: iMarine; EUPL; Application Platform; Subsystem Boundary; External interactions]
• Application: Business Document Workflow; Ecological Niche Modelling; Time Series; Workspace; Vessel Activity Analyser
• Data Management: Access; Mining; Search; Storage; Transformation
• Enabling: Process Execution; Resource Discovery; Resource Management; Security
37. Secure, Powerful, and Standard-based
[Diagram labels: Portal; application; Server; gCube Enabling Technology]
• Secure: all data moved over the network and all server-to-server communications
are authorised and encrypted; data can be stored encrypted
38. Virtual Research Environments
A VRE is the hardware, data, and applications allocated for a timeframe
to a group of people for effective collaboration
• User uploads/selects apps: stored in the Software Repository
• User registers/selects data sets: accessible through Mediators
• Apps are executed on the most suitable HW: the system deploys, configures, executes and monitors
• User invites other users: the system controls authentication and enforces policies
39. Summary
• An advanced data e-Infrastructure
– across location and ownership boundaries
– across technological boundaries
– regulated by governance and policies (EUPL)
• Designed for future developments
– integrate or develop applications
– designed to grow incrementally in size
– share, harmonize, transform data
41. e-Infrastructure governance
• Objective 1 of iMarine
«To develop community-driven policies enabling:
– governance and operation of a data infrastructure,
– sharing of data and other resources, and processing data
in order to support the Community of Practice in implementing the Ecosystem
Approach to fishery management and marine living resource conservation»
• Role of the iMarine Board
42. iMarine Board’s tasks
• Mobilize user community
• Develop governance model
• Address systems’ harmonization
43. The iMarine Board
• Mobilize the user community
– Core set of influential partners stimulated to work together on three main business cases:
• Support to implementation of the EU Common Fishery Policy
• Support to FAO’s deep seas fisheries programme
• Support to regional tropical LME pelagic EAF community
– Raise awareness on the offer
– Position/align the offer versus the needs
Partners (grouped on the slide under Fisheries, Bio-diversity and Environment): FAO, DG MARE, Eurostat, NEAFC, MEDDE/DOF, IRD, ICES, IOC/OBIS, FIN, CRIA, VLIZ, T2/GENESI-DEC
44. The iMarine Board
• Develop Governance model, with sustainability
focus
» Governance is the combination of
processes (including relationships among stakeholders),
structures (including formal and informal institutions), and
instruments (policies, laws)
implemented by the board,
» through which stakeholders' interests are articulated, rights and
obligations are established, and differences are mediated
45. The iMarine Board – develop Governance model
• An interface between the CoP and the data infrastructure owners
• Designed to allow control of EA Community on Data Infrastructure Developments
• Forerunner of a Governance structure
[Diagram labels: Marine Community; iMarine Project; Project Boards; iMarine Board; Advisory Council; Recomm.; Explain; Decision; Recommendations; DGMare; Eurostat; ICES; FAO; RFBs; IUCN; UNEP; IFSA (GOBI); Ecoscope; Cat. of Life; Emodnet; OBIS; FIGIS]
46. The iMarine Board
• Develop Governance model, with sustainability
focus
– Policies as governing instruments
• Data access and sharing policies
• Software and hardware sharing policies
47. The iMarine Board
• Address the harmonization of systems
– Rationalizing solutions among partners (cost efficiency)
• 4 technical clusters: Biodiversity, Geo-spatial, Statistical, Semantics
– Agreeing on “iMarine” standards
• OGC, Darwin Core, SDMX, RDF
• Includes promotion of new standards (e.g. FLUX)
– Mainstreaming requirements and specifications
• Operationalization of the agreed formats and protocols
48. The iMarine Board
• Result of Systems harmonization
[Diagram labels]
• Areas: Ocean environment; Geo-Processing services; Taxonomy; Niche modelling algorithms; Biodiversity; Fisheries
• Sources and partners: MyOcean; WORMS; VLIZ; T2; SeaDatanet; Emodnet; OBIS; IOC; Aquamaps; FIN; CRIA; CLM; IRD; Ecoscope; ICES; RDB; FAO; FIGIS; DG-MARE; National DOF; ESTAT; Other sources
• Standards and policy: Policy fmk; FLUX std; Open Source software; OpenSSDMX; RDF; DC; SDMX; FLOD; OGC
50. What’s peculiar about iMarine
• The iMarine resources offer is about
– Integrating and managing services across systems’
administrative boundaries
– End users:
• Interactive gateway for collaborative science: VREs
– Infrastructure owners:
• Platform providing outsourcing services: computing,
distribution, scaling, interoperability
• Cloud hosting services
51. What’s peculiar about iMarine
• iMarine Positioning in Open data perspective
[Diagram, top to bottom]
• Cross-thematic federator: iMarine
• Thematic aggregators: OBIS, FIRMS, SEADATANET, Genomics, GEOSEAS
• Data locked in isolated computers
52. For major impact, we look for
• Strategic alliances with Federators:
– Thematic aggregators of data providers
– Scientists end-user needs
– Governance and policy models
53. Thanks for your attention
Landscape
D4Science e-Infrastructure
www.i-marine.org
portal.i-marine.d4science.org
gCube Framework
gCube Apps
www.gcube-system.org
gcube.wiki.gcube-system.org
Discussion