iMarine catalogue of services
1. iMarine Catalogue of Services
Pasquale Pagano (CNR)
iMarine Technical Director
pasquale.pagano@isti.cnr.it
iMarine data platform for collaborations
7th March 2014, 09:00 – 17:30
Food and Agriculture Organization of the United Nations (FAO) Headquarters
2. The Catalogue of Services
iMarine is exploiting a Hybrid Data Infrastructure combining over 500 software components into a coherent and centrally managed system of hardware, software, and data resources.
3. Born from the user needs
• I need to host my applications in a secure and scalable environment
• I need to maintain my database
• I need to back up my data
• I need to deliver my data to a set of known people
• I need to analyse my big datasets
4. Born from the user needs
• I need to manage and analyze biological and ecological data
• I need to manage the full data life-cycle from import to validation, curation, harmonization and publication
• I need to offer to my team a powerful tool to manage code-lists
• I need to store and analyze geospatial explicit information
• I want to offer a flexible sharing, storage, reporting, search and retrieval tool
5. Born from the user needs
• I need to access authoritative biological and ecological data
• I wish to simplify the access to my geospatial data
• I need to mash-up statistical and biodiversity data
• I need to reduce the costs of data maintenance of my dept.
• I need to validate my datasets and provide a standard access to them
6. User Needs Analysis
• Needs
– Not isolated
– Not disconnected
– Not trivial
• Solutions
– Current, but with an eye to the future
– Designed for individuals but looking at the community
7. Capacities: Storage as Service
• Scalability and high availability
• Across sites
• ISO 19115/19139 Metadata
• Catalogue
• Open source RDBMS
• Up to 1 TB data
• Secure
• Fault-tolerant
• Replication
Virtual Workspace
Relational Databases
Large and Active data storage
Spatial Database
8. Capacities: Computing as Service
Hadoop
Statistical Manager
R clusters
• MapReduce
• Analysis/clustering/modeling
• Windows and Linux
1000 CPUs Currently Available
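To illustrate the MapReduce model listed among the computing capacities, here is a minimal single-process sketch in Python. It is not Hadoop code and does not use any iMarine or gCube API. The record layout, the sample records, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `count_occurrences`) are assumptions made for this example only.

```python
from collections import defaultdict


def map_phase(records):
    """Map step: emit one (species, 1) pair per occurrence record."""
    for record in records:
        yield record["species"], 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the values of each key to get occurrences per species."""
    return {key: sum(values) for key, values in groups.items()}


def count_occurrences(records):
    """Run the three steps in sequence on an in-memory list of records."""
    return reduce_phase(shuffle(map_phase(records)))


# Hypothetical occurrence records, for illustration only (not real observations).
records = [
    {"species": "Eleutheronema tetradactylum", "lat": 10.5, "lon": 99.0},
    {"species": "Thunnus albacares", "lat": -5.0, "lon": 60.0},
    {"species": "Eleutheronema tetradactylum", "lat": 12.0, "lon": 100.5},
]

counts = count_occurrences(records)
```

On a real cluster the map and reduce functions would run in parallel on different nodes; only the division of work into the two steps is shown here.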
9. Applications
• Management and interpretation of biological and ecological data in the environment
• Complete full life-cycle data framework, from observational data to aggregated data repositories enriched with validation and analytical tools
• Storage and interpretation of geospatial explicit information, including WPS processing
• Flexible sharing, storage, reporting, search and retrieval, aggregation and projection facilities
A BUNDLE is a set of services and technologies grouped according to a family of related tasks for achieving a common objective
10. Applications
• Occurrence and Taxonomic Data Discovery
• Occurrence Data Processing
• Species Distribution Modeling
• Species Distribution Maps Discovery
• Taxonomic Data Comparison
• Taxonomic Data Matching
• Code List Discovery
• Code List Management
• Statistical Engine
• Tabular Data Discovery
• Tabular Data Enrichment
• Tabular Data Management
• Tabular Data Processing
• Geospatial Data Discovery
• Geospatial Data Processing
• Enhanced Documents Management
• Fact-sheets Management
• Information Object Discovery
• Messaging
• Shared Workspace
• Social Networking Facilities
A BUNDLE is a set of services and technologies grouped according to a family of related tasks for achieving a common objective
11. Features Clustering with StatsCube
Presence Points (FishBase + OBIS)
Density Based Clustering: DBSCAN (with outliers)
Other methods are also available … K-Means, X-Means
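The density-based clustering named on this slide can be sketched in a few lines of plain Python. This is a textbook DBSCAN, not the StatsCube implementation. The sample points and the parameter values (`eps`, `min_pts`) are hypothetical, and plain Euclidean distance on coordinate pairs is assumed for simplicity.

```python
import math

NOISE = -1  # label given to outliers


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # core point: expand the cluster
    return labels


# Hypothetical presence points: two dense groups and one isolated point.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
    (50.0, 50.0),
]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Unlike K-Means, DBSCAN does not need the number of clusters in advance and leaves isolated points unassigned, which is why the slide mentions outliers.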
12. Data Analysis with StatsCube
• Import CodeLists
• Validate Datasets
• Analyse and Project
14. Maps Comparison with GeosCube
FAO Eleutheronema tetradactylum VS AquaMaps Eleutheronema tetradactylum
MEAN=0.81
VARIANCE=0.02
NUMBER_OF_ERRORS=6691
NUMBER_OF_COMPARISONS=259200
ACCURACY=97.42
MAXIMUM_ERROR=1.0
MAXIMUM_ERROR_POINT=3005:363:1
COHENS_KAPPA=0.218
COHENS_KAPPA_CLASSIFICATION_LANDIS_KOCH=Fair
COHENS_KAPPA_CLASSIFICATION_FLEISS=Marginal
TREND=EXPANSION
RESOLUTION=0.5
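Some of the figures above can be cross-checked with a short Python sketch. It assumes that the comparison covers a global longitude/latitude grid and that ACCURACY is the percentage of compared cells without error. Neither assumption is stated on the slide, although both reproduce the reported numbers. The function names are hypothetical, the kappa function is the standard two-class formula rather than GeosCube's own code, and the slide does not say how the maps are turned into classes. Only the Landis and Koch scale is implemented.

```python
def grid_cells(resolution_deg):
    """Number of cells in a global lon/lat grid at the given resolution."""
    return round(360 / resolution_deg) * round(180 / resolution_deg)


def accuracy_percent(errors, comparisons):
    """Percentage of comparisons that were not counted as errors."""
    return 100.0 * (1.0 - errors / comparisons)


def cohens_kappa(a, b):
    """Cohen's kappa for two equally long sequences of binary (0/1) labels.

    Undefined (division by zero) when the expected agreement equals 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a1 = sum(a) / n
    p_b1 = sum(b) / n
    expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Agreement label on the Landis and Koch (1977) scale."""
    if kappa < 0:
        return "Poor"
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost perfect"


cells = grid_cells(0.5)                                # RESOLUTION=0.5
accuracy = round(accuracy_percent(6691, 259200), 2)    # errors, comparisons
label = landis_koch(0.218)                             # COHENS_KAPPA
```

The high accuracy alongside a low kappa is typical when most cells are empty in both maps: raw agreement is dominated by the shared absences, which kappa discounts as chance agreement.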
15. Data
OBIS, WoRMS, WoRDS, GBIF, CoL, ITIS, IRMNG, NCBI, MyOcean, WOA, EuroStat, Data.FAO, …
iMarine: Registries, Validation, Enriching, Processing, Sharing
16. Data
Biological and Ecological Data: DarwinCore / ISO19139
• >35 M Observations (OBIS)
• ≈ 120 K Observed Species (OBIS)
• ≈ 500 K Taxa (WoRMS)
• >600 K Scientific Names (ITIS)
• >12 K Species Maps (AquaMaps)
• ≈ 600 Species Extent (FAO)
• … FishBase, SeaLifeBase
• … CoL, GBIF
Statistical Data: SDMX *
• FAO CodeLists
• IRD CodeLists
• FAO datasets
• Eurostat
• …
GeoSpatial Data: ISO19139 (OGC W*S)
• 10 years Chemical and Physical variables in 2D space
• Ice concentration and velocity, Chlorophyll, Oxygen, Nitrate, Phosphate, Phytoplankton as carbon, Salinity, Temperature, …
• On-demand Chemical and Physical variables in 3D space
• Apparent Oxygen Utilization, Dissolved Oxygen, Salinity, Temperature, …
• > 350 variables
Documents: OAI-PMH, OpenSearch
• FAO Factsheets
• Aquatic Commons
• Bioline International
• Biodiversity Heritage
• OceanDocs
• Nature, PenSoft Journals
• …
Ontologies and Data Warehouses: RDF, OWL
• FAO FLOD
• Marine Top Level Ontology
• IRD Ecoscope
• FactForge, Yago2
• …
17. Is this enough?
• An ecosystem of participatory data e-Infrastructures
• Regulated by policies
• Enabled by standards
• Promoting not only access but mash-up of heterogeneous data
User centric
18. Virtual Research Environment
iMarine is user-centric and workflow-oriented thanks to the gCube VRE technology
A Virtual Research Environment (VRE) is
• a distributed and dynamically created environment
• where a subset of data, services, computational, and storage resources
• regulated by tailored policies
• are assigned to a subset of users via interfaces
• for a limited timeframe
• at little or no cost for the providers of the participatory data e-infrastructures
L. Candela, D. Castelli, P. Pagano (2013) Virtual Research Environments: An Overview and a Research Agenda. Data Science Journal, Vol. 12
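The VRE definition above can be read as a small data structure: a named set of resources, a set of users, a validity period, and policies that restrict who may use what. The Python sketch below models only that reading. It is not the gCube API, and the class, field names, and example values are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class VRE:
    """Toy model of a Virtual Research Environment (hypothetical, not gCube)."""
    name: str
    users: set        # the subset of users assigned to this VRE
    resources: set    # data, services, computational and storage resources
    start: date       # first day of the limited timeframe
    end: date         # last day of the limited timeframe
    # Tailored policies: resource -> users allowed to use it.
    # A resource without an entry is open to every user of the VRE.
    policies: dict = field(default_factory=dict)

    def is_active(self, today):
        """True while the VRE is inside its limited timeframe."""
        return self.start <= today <= self.end

    def can_access(self, user, resource, today):
        """Check timeframe, membership, and the policy for the resource."""
        if not self.is_active(today):
            return False
        if user not in self.users or resource not in self.resources:
            return False
        allowed = self.policies.get(resource)
        return allowed is None or user in allowed


# Hypothetical example values, for illustration only.
vre = VRE(
    name="demo-vre",
    users={"alice", "bob"},
    resources={"occurrence-db", "r-cluster"},
    start=date(2014, 3, 1),
    end=date(2014, 3, 31),
    policies={"r-cluster": {"alice"}},
)
```

In this sketch `vre.can_access("alice", "r-cluster", date(2014, 3, 7))` is true, while the same call for `"bob"` is false because the policy for `r-cluster` names only `alice`.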
19. iMarine Technology
• iMarine is powered by gCube
https://www.ohloh.net/p/gCube
22. iMarine e-infrastructure
iMarine is exploiting D4Science.org
Geographically Distributed Computing Infrastructure
Across administrative boundaries
Across private and commercial providers
Service Allocations, Deployment, Monitoring, and Operation
Uniform resource and data access
Operation
Built on SLAs
Support monitoring, auditing, reporting, and notification
Trust
Privacy, governance, and attribution
Security, trusted network
23. Landscape
D4Science e-Infrastructure
gCube Framework
gCube Apps
Discussion
www.i-marine.eu
i-marine.d4science.org