What does research infrastructure really need for data?
               A personal view based on LifeWatch and ENVRI
                            Alex Hardisty, Cardiff University

LifeWatch: an ESFRI Research Infrastructure; an e-Infrastructure for
Biodiversity and Ecosystem Science.

What is LifeWatch?
Biodiversity science is the study of the diversity of life on our planet – plants, animals, microorganisms and
viruses – and the environments (ecosystems) they live in. LifeWatch (www.lifewatch.eu) will be an open
access infrastructure, accessed through a single portal (portal.lifewatch.eu) for users from the scientific
community, as well as policy makers and representatives of the private sector. It will allow scientists to
explore, describe and understand patterns in biodiversity, and the processes that maintain biodiversity, in
space and time at the gene, species, ecosystem and landscape levels; and to understand what causes and
affects species diversity.

The innovative design of LifeWatch offers integrated access to large-scale data resources, advanced
algorithms and computational capability through a service-oriented architecture to support creation of new
knowledge. Key elements of the infrastructure will include: distributed observatories/sensors,
interoperable datasets, processing and analytical tools, and both computational capability and capacity.
Data mining, data analysis and modelling allow users to study patterns and mechanisms across different
levels of biodiversity. The LifeWatch infrastructure provides scientific research teams with new
collaborative environments by creating ‘e-Laboratories’ or composing ‘e-Services’. Teams may share their
data and their analytical and modelling algorithms with others, while controlling access. LifeWatch enables
“distributed large scale” and collaborative research on complex and multidisciplinary problems.

After three years of planning, LifeWatch is now moving into its construction phase. Early Virtual
Labs are likely to support scientific studies of biodiversity in marine wetlands and of the vulnerability of
ecosystems to alien and invasive species. The Biodiversity Virtual e-Laboratory (BioVeL) project
(www.biovel.eu) contributes to the construction by encouraging islands of compatible infrastructure to
emerge at key centres across Europe.

The challenges of scale and heterogeneity
LifeWatch is supported by many good data providers from within the scientific communities (networks of
excellence) for terrestrial ecology, marine ecology and the natural history collections with all their
biological specimens. There are currently about 1800 terrestrial monitoring sites and 200 marine research
sites across Europe. Hundreds of millions of specimens in natural history collections all over Europe are
gradually being indexed and digitised.

Biodiversity data is extremely diverse and heterogeneous. Biodiversity science spans many more familiar
disciplines: biology, botany, zoology, ecology, genetics, soil science, biogeography, climate science and
chemistry, to name but a few. Each of these established scientific communities already has its own way of
doing things, its own data resources and its own tools. Not only that, but each has its own
vocabularies and conceptual underpinnings. Interoperability is a problem demanding a determined
ontological and thesaurus solution like that used in the medical domain: the Unified Medical Language
System (UMLS) (www.nlm.nih.gov/research/umls).

Alex Hardisty, XLDB-Europe, Edinburgh, 8-10th June 2011

The interconnections between different biodiversity ideas/concepts, data sources, and the outputs from
data processing, manipulation and modelling are intricate. As well as the traditional sources mentioned
above, genomic data (for example sequence data, DNA barcodes and phylogenies) are becoming
increasingly important. Biodiversity science also demands environmental data (climate, soil, ocean
temperature, etc.), as well as economic and census data for particular types of studies.

Apart from the well-known and often large sources (GBIF, EBI, environmental data, census data) there are
numerous small datasets in the hands of individual researchers. If computerised at all, these small datasets
are often held in spreadsheets with no identifiable common structure. There are probably thousands of
them, and multiple tools for processing them too. The biodiversity science community is highly fragmented,
and all these kinds of small personal, group and departmental datasets need to be published and to become
discoverable and usable.
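
To make the problem of small spreadsheet datasets concrete, the following is a minimal sketch of how differently labelled columns might be brought under one shared set of field names. The synonym table, the field names and the two sample sheets are all hypothetical, invented for illustration; they are not LifeWatch definitions, and a real solution would draw on a community-agreed vocabulary.

```python
import csv
import io

# Hypothetical synonym table: maps column headers as they might appear in
# individual researchers' spreadsheets to one shared set of field names.
COMMON_FIELDS = {
    "species": "scientific_name",
    "taxon": "scientific_name",
    "scientific name": "scientific_name",
    "lat": "latitude",
    "latitude": "latitude",
    "lon": "longitude",
    "long": "longitude",
    "longitude": "longitude",
    "date": "observed_on",
    "obs_date": "observed_on",
}


def harmonise(csv_text):
    """Return rows keyed by the shared field names.

    Columns with no known mapping are kept under 'unmapped' rather than
    silently dropped, so nothing the researcher recorded is lost.
    """
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {"unmapped": {}}
        for header, value in raw.items():
            key = COMMON_FIELDS.get(header.strip().lower())
            if key is None:
                row["unmapped"][header] = value
            else:
                row[key] = value
        rows.append(row)
    return rows


# Two invented sheets describing the same kind of observation differently.
sheet_a = "Species,Lat,Lon\nParus major,51.48,-3.18\n"
sheet_b = "taxon,latitude,longitude,collector\nParus major,55.95,-3.19,J. Smith\n"

records = harmonise(sheet_a) + harmonise(sheet_b)
```

Once both sheets share field names they can be searched and combined as one dataset. The hard part in practice is agreeing the synonym table across communities, not the code.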

LifeWatch aims to support upwards of 25,000 users, primarily from the academic and research community
and the policymaking community, but also from the student education sector and the general public
(citizen science).

The LifeWatch strategy of “Thinking globally, acting locally” addresses these challenges of heterogeneity
and scale. Thinking globally means devising and promoting the pan-European top-down strategies
that foster collaboration and interoperability; acting locally means assisting and encouraging ‘islands’ of
compliant infrastructure to emerge and fuse.


ENVRI: Common Operations of the ESFRI Environmental Research
Infrastructures
What is ENVRI?
ENVRI is a soon-to-be-funded EC FP7 project that brings together many of the main ESFRI research
infrastructures from the environmental sciences domain. The ENVRI project will contribute to the
construction of these research infrastructures by sharing experiences and technologies and by solving
crucial common technology issues and challenges together. Through cooperation in this project the ESFRI
ENV infrastructures, together with ICT partners, are seeking to increase the interoperability of their data
and facilities, and thereby the use and effectiveness of their infrastructures. The central goal of the ENVRI
project is to implement harmonised solutions and draw up guidelines for the common needs of the
environmental ESFRI projects, with a special focus on issues such as architectures, metadata frameworks,
data discovery in scattered repositories, visualization and data curation.

ENVRI recognises scientific data services as part of a horizontal set of foundational services that include
communications, distributed computing, and storage. It recognises that data providers, as well as data
users, are users of data services and that there are common requirements irrespective of domain-specific
communities. Community-specific services sit on top of data services and interact with them.

The key to improved interoperability is finding common solutions to common problems that can be
adopted by each research infrastructure as it progresses through its construction phase. Fundamental
common solutions include:


a) A Common Reference Model providing multiple compatible ‘views’ of the research infrastructure for
   different purposes.

   An ENVRI Common Reference Model is likely to be based on the ISO/IEC 10746 series of Standards for
   Open Distributed Processing, presenting five viewpoints: i) Science business / enterprise view; ii)
   Information view; iii) Computational / services view; iv) Engineering view; and v) Technology view.

b) “Standards, Standards, Standards” are required for, at least:
   • Data capture from distributed sensors
   • Metadata definition
   • Management of high volume data
   • Execution of workflows
   • Visualization of data
   • Provenance and annotation
   • Interoperability between assets

c) Tools, based on a generic metadata model (the Information viewpoint of the Common Reference Model),
   to allow data discovery and access in a federation of distributed digital repositories and
   interoperating infrastructures.

d) RDF and OWL frameworks to describe relations between (virtualized) e-Infrastructure components,
   and to link semantic descriptions of data with the semantic descriptions of the infrastructure, allowing
   the creation of a data-centric network.
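
As an illustration of point d), the sketch below links a description of a dataset to descriptions of infrastructure components and then asks which services could process that dataset. It uses plain Python tuples as subject-predicate-object triples rather than a real RDF/OWL toolkit. Apart from `rdf:type`, every identifier and predicate (the `ex:` names) is hypothetical and chosen only for illustration.

```python
# Subject-predicate-object triples held as plain tuples. The ex:... names
# are hypothetical; a real deployment would use agreed vocabularies and
# an RDF store rather than an in-memory set.
TRIPLES = {
    # Semantic description of the data
    ("ex:dataset/bird-counts", "rdf:type", "ex:Dataset"),
    ("ex:dataset/bird-counts", "ex:hasFormat", "ex:format/csv"),
    ("ex:dataset/bird-counts", "ex:storedAt", "ex:repository/site-a"),
    # Semantic description of the infrastructure
    ("ex:service/niche-model", "rdf:type", "ex:Service"),
    ("ex:service/niche-model", "ex:acceptsFormat", "ex:format/csv"),
    ("ex:service/niche-model", "ex:runsOn", "ex:cluster/hpc-1"),
    ("ex:service/map-viewer", "rdf:type", "ex:Service"),
    ("ex:service/map-viewer", "ex:acceptsFormat", "ex:format/netcdf"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def services_for(dataset):
    """Find services that accept at least one format the dataset has."""
    formats = {o for _, _, o in match(s=dataset, p="ex:hasFormat")}
    return sorted(
        s for s, _, o in match(p="ex:acceptsFormat") if o in formats
    )


usable = services_for("ex:dataset/bird-counts")
```

Because data and infrastructure are described in the same graph, a single query can cross from one to the other. That is the sense in which the resulting network is data-centric.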


Riding the Wave: How Europe can gain from the rising tide of scientific
data
The recently published report of the High Level Expert Group on Scientific Data, “Riding the Wave: How
Europe can gain from the rising tide of scientific data”, is an important contribution towards addressing
the question of what research infrastructures really need for data. Neelie Kroes, the Vice-President of the
European Commission responsible for the Digital Agenda, has asked “every citizen and every organisation
involved in scientific research to take note of this report and to use it as a reference point when discussing
the priorities of EU research investments.”

The report may be found here:

http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf





Más contenido relacionado

La actualidad más candente

BiPday 2014 -- Pesole Graziano
BiPday 2014 -- Pesole GrazianoBiPday 2014 -- Pesole Graziano
BiPday 2014 -- Pesole Graziano
eventi-ITBbari
 
2014-09-09-CReATIVE-B Roadmap Interactive
2014-09-09-CReATIVE-B Roadmap Interactive2014-09-09-CReATIVE-B Roadmap Interactive
2014-09-09-CReATIVE-B Roadmap Interactive
David MANSET
 
ELIXIR Node poster Norway
ELIXIR Node poster NorwayELIXIR Node poster Norway
ELIXIR Node poster Norway
ELIXIR-Europe
 

La actualidad más candente (20)

Open Data and Cross Disciplinary Research - EUDAT Summer School (Brian Matthe...
Open Data and Cross Disciplinary Research - EUDAT Summer School (Brian Matthe...Open Data and Cross Disciplinary Research - EUDAT Summer School (Brian Matthe...
Open Data and Cross Disciplinary Research - EUDAT Summer School (Brian Matthe...
 
BiPday 2014 -- Pesole Graziano
BiPday 2014 -- Pesole GrazianoBiPday 2014 -- Pesole Graziano
BiPday 2014 -- Pesole Graziano
 
The Biodiversity Informatics Landscape
The Biodiversity Informatics LandscapeThe Biodiversity Informatics Landscape
The Biodiversity Informatics Landscape
 
Energy files
Energy filesEnergy files
Energy files
 
2014-09-09-CReATIVE-B Roadmap Interactive
2014-09-09-CReATIVE-B Roadmap Interactive2014-09-09-CReATIVE-B Roadmap Interactive
2014-09-09-CReATIVE-B Roadmap Interactive
 
ELIXIR Node poster Norway
ELIXIR Node poster NorwayELIXIR Node poster Norway
ELIXIR Node poster Norway
 
ELIXIR
ELIXIRELIXIR
ELIXIR
 
Data driven systems medicine article
Data driven systems medicine articleData driven systems medicine article
Data driven systems medicine article
 
Opendata repository-v2
Opendata repository-v2Opendata repository-v2
Opendata repository-v2
 
Museum collections as research data - October 2019
Museum collections as research data - October 2019Museum collections as research data - October 2019
Museum collections as research data - October 2019
 
Saarikko jarmo-nefis-helsinki-2005
Saarikko jarmo-nefis-helsinki-2005Saarikko jarmo-nefis-helsinki-2005
Saarikko jarmo-nefis-helsinki-2005
 
2021-01-27--biodiversity-informatics-gbif-(52slides)
2021-01-27--biodiversity-informatics-gbif-(52slides)2021-01-27--biodiversity-informatics-gbif-(52slides)
2021-01-27--biodiversity-informatics-gbif-(52slides)
 
FAIR and open biodiversity collection data management
FAIR and open biodiversity collection data managementFAIR and open biodiversity collection data management
FAIR and open biodiversity collection data management
 
Open Science at Genome Scale
Open Science at Genome ScaleOpen Science at Genome Scale
Open Science at Genome Scale
 
USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 3
USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 3USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 3
USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 3
 
EMBL-ABR_ AGRF2016
EMBL-ABR_ AGRF2016EMBL-ABR_ AGRF2016
EMBL-ABR_ AGRF2016
 
ViBRANT 8th e-Concertation Meeting, CERN
ViBRANT 8th e-Concertation Meeting, CERNViBRANT 8th e-Concertation Meeting, CERN
ViBRANT 8th e-Concertation Meeting, CERN
 
Legal Assessment Tool (LAT) - interactive help for data sharing
Legal Assessment Tool (LAT) - interactive help for data sharingLegal Assessment Tool (LAT) - interactive help for data sharing
Legal Assessment Tool (LAT) - interactive help for data sharing
 
Opendata repository-Gabarone,20181108
Opendata repository-Gabarone,20181108Opendata repository-Gabarone,20181108
Opendata repository-Gabarone,20181108
 
GBIF & GRScicoll, Høstseminar Norges museumsforbunds Seksjon for natur, 2021-...
GBIF & GRScicoll, Høstseminar Norges museumsforbunds Seksjon for natur, 2021-...GBIF & GRScicoll, Høstseminar Norges museumsforbunds Seksjon for natur, 2021-...
GBIF & GRScicoll, Høstseminar Norges museumsforbunds Seksjon for natur, 2021-...
 

Destacado (6)

Informativo nº4 conj. de jovens shalon ad canaã i ipatinga mg
Informativo  nº4 conj. de jovens shalon ad canaã i ipatinga mgInformativo  nº4 conj. de jovens shalon ad canaã i ipatinga mg
Informativo nº4 conj. de jovens shalon ad canaã i ipatinga mg
 
Owls
OwlsOwls
Owls
 
TF-04S COOWIN
TF-04S COOWIN TF-04S COOWIN
TF-04S COOWIN
 
Global Research Infrastructures for Biodiversity and Ecosystems Research
Global Research Infrastructures for Biodiversity and Ecosystems ResearchGlobal Research Infrastructures for Biodiversity and Ecosystems Research
Global Research Infrastructures for Biodiversity and Ecosystems Research
 
Klingon Countdown Timer
Klingon Countdown TimerKlingon Countdown Timer
Klingon Countdown Timer
 
Approach and outcome of the Biodiversity Virtual e-Laboratory (BioVeL) project
Approach and outcome of the Biodiversity Virtual e-Laboratory (BioVeL) projectApproach and outcome of the Biodiversity Virtual e-Laboratory (BioVeL) project
Approach and outcome of the Biodiversity Virtual e-Laboratory (BioVeL) project
 

Similar a AH-XLDBEurope-position-09 jun2011

Implementation of a European e-Infrastructure for the 21st Century
Implementation of a European e-Infrastructure for the 21st CenturyImplementation of a European e-Infrastructure for the 21st Century
Implementation of a European e-Infrastructure for the 21st Century
Ed Dodds
 
ViBRANT Project Overview
ViBRANT Project OverviewViBRANT Project Overview
ViBRANT Project Overview
vbrant
 
The Ascent of Open Science and the European Open Science Cloud
The Ascent of Open Science and the European Open Science CloudThe Ascent of Open Science and the European Open Science Cloud
The Ascent of Open Science and the European Open Science Cloud
Tiziana Ferrari
 

Similar a AH-XLDBEurope-position-09 jun2011 (20)

Using Open Research Data for Public Policy Making: Opportunities of Virtual R...
Using Open Research Data for Public Policy Making: Opportunities of Virtual R...Using Open Research Data for Public Policy Making: Opportunities of Virtual R...
Using Open Research Data for Public Policy Making: Opportunities of Virtual R...
 
Using e-Infrastructures for Biodiversity Conservation
Using e-Infrastructures for Biodiversity ConservationUsing e-Infrastructures for Biodiversity Conservation
Using e-Infrastructures for Biodiversity Conservation
 
EOSC-hub: first steps towards realising EOSC vision
EOSC-hub: first steps towards realising EOSC visionEOSC-hub: first steps towards realising EOSC vision
EOSC-hub: first steps towards realising EOSC vision
 
Implementation of a European e-Infrastructure for the 21st Century
Implementation of a European e-Infrastructure for the 21st CenturyImplementation of a European e-Infrastructure for the 21st Century
Implementation of a European e-Infrastructure for the 21st Century
 
ELIXIR
ELIXIRELIXIR
ELIXIR
 
ViBRANT Project Overview
ViBRANT Project OverviewViBRANT Project Overview
ViBRANT Project Overview
 
The Ascent of Open Science and the European Open Science Cloud
The Ascent of Open Science and the European Open Science CloudThe Ascent of Open Science and the European Open Science Cloud
The Ascent of Open Science and the European Open Science Cloud
 
XldbEuropeEdinburgh-09-jun2011
XldbEuropeEdinburgh-09-jun2011XldbEuropeEdinburgh-09-jun2011
XldbEuropeEdinburgh-09-jun2011
 
Building data infrastructures for science
Building data infrastructures for scienceBuilding data infrastructures for science
Building data infrastructures for science
 
British Library Datasets Programme 2010
British Library Datasets Programme 2010British Library Datasets Programme 2010
British Library Datasets Programme 2010
 
British Library Datasets Programme Feb 2011
British Library Datasets Programme Feb 2011British Library Datasets Programme Feb 2011
British Library Datasets Programme Feb 2011
 
OpenAIRE services in support of “Open Science as-a-Service” - Presentation at...
OpenAIRE services in support of “Open Science as-a-Service” - Presentation at...OpenAIRE services in support of “Open Science as-a-Service” - Presentation at...
OpenAIRE services in support of “Open Science as-a-Service” - Presentation at...
 
OpenAIRE at Workshop on CRIS and OAR, May 2010
OpenAIRE at Workshop on CRIS and OAR, May 2010OpenAIRE at Workshop on CRIS and OAR, May 2010
OpenAIRE at Workshop on CRIS and OAR, May 2010
 
Vince smith-delivering biodiversity knowledge in the information age-notext
Vince smith-delivering biodiversity knowledge in the information age-notextVince smith-delivering biodiversity knowledge in the information age-notext
Vince smith-delivering biodiversity knowledge in the information age-notext
 
ViBRANT—Virtual Biodiversity Research and Access Network for Taxonomy
ViBRANT—Virtual Biodiversity Research and Access Network for TaxonomyViBRANT—Virtual Biodiversity Research and Access Network for Taxonomy
ViBRANT—Virtual Biodiversity Research and Access Network for Taxonomy
 
Horizon 2020: Outline of a Pilot for Open Research Data
Horizon 2020: Outline of a Pilot for Open Research Data  Horizon 2020: Outline of a Pilot for Open Research Data
Horizon 2020: Outline of a Pilot for Open Research Data
 
An introduction to ViBRANT: Virtual Biodiversity Research and Access Network ...
An introduction to ViBRANT: Virtual Biodiversity Research and Access Network ...An introduction to ViBRANT: Virtual Biodiversity Research and Access Network ...
An introduction to ViBRANT: Virtual Biodiversity Research and Access Network ...
 
Sla2009 D Curation Heidorn
Sla2009 D Curation HeidornSla2009 D Curation Heidorn
Sla2009 D Curation Heidorn
 
Scratchpads introductory presentation 45mins
Scratchpads introductory presentation   45minsScratchpads introductory presentation   45mins
Scratchpads introductory presentation 45mins
 
Big Data Europe at eHealth Week 2017: Linking Big Data in Health
Big Data Europe at eHealth Week 2017: Linking Big Data in HealthBig Data Europe at eHealth Week 2017: Linking Big Data in Health
Big Data Europe at eHealth Week 2017: Linking Big Data in Health
 

Más de Alex Hardisty

Data accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphereData accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphere
Alex Hardisty
 
Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3
Alex Hardisty
 
TextofKeynote-EGIforum-15-Sep2010
TextofKeynote-EGIforum-15-Sep2010TextofKeynote-EGIforum-15-Sep2010
TextofKeynote-EGIforum-15-Sep2010
Alex Hardisty
 

Más de Alex Hardisty (11)

openDS - A new standard for digital specimens
openDS - A new standard for digital specimensopenDS - A new standard for digital specimens
openDS - A new standard for digital specimens
 
Data accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphereData accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphere
 
Constructing bottomup
Constructing bottomupConstructing bottomup
Constructing bottomup
 
Mapping Research Infrastructures with the ENVRI Reference Model
Mapping Research Infrastructures with the ENVRI Reference ModelMapping Research Infrastructures with the ENVRI Reference Model
Mapping Research Infrastructures with the ENVRI Reference Model
 
BioVeL at IBERGRID e-Infrastructures and biodiversity workshop, 19th Septembe...
BioVeL at IBERGRID e-Infrastructures and biodiversity workshop, 19th Septembe...BioVeL at IBERGRID e-Infrastructures and biodiversity workshop, 19th Septembe...
BioVeL at IBERGRID e-Infrastructures and biodiversity workshop, 19th Septembe...
 
Biodiversity Informatics Horizons 2013 - Introduction and Scope
Biodiversity Informatics Horizons 2013 - Introduction and ScopeBiodiversity Informatics Horizons 2013 - Introduction and Scope
Biodiversity Informatics Horizons 2013 - Introduction and Scope
 
Hardistyroberts190313opt 130319072407-phpapp02
Hardistyroberts190313opt 130319072407-phpapp02Hardistyroberts190313opt 130319072407-phpapp02
Hardistyroberts190313opt 130319072407-phpapp02
 
Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3Eudat user forum-london-11march2013-biovel-v3
Eudat user forum-london-11march2013-biovel-v3
 
Biodiversity Virtual e-Laboratory (BioVeL)
Biodiversity Virtual e-Laboratory (BioVeL)Biodiversity Virtual e-Laboratory (BioVeL)
Biodiversity Virtual e-Laboratory (BioVeL)
 
TextofKeynote-EGIforum-15-Sep2010
TextofKeynote-EGIforum-15-Sep2010TextofKeynote-EGIforum-15-Sep2010
TextofKeynote-EGIforum-15-Sep2010
 
EGIforum-Amsterdam-15-Sep2010
EGIforum-Amsterdam-15-Sep2010EGIforum-Amsterdam-15-Sep2010
EGIforum-Amsterdam-15-Sep2010
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

AH-XLDBEurope-position-09 jun2011

  • 1. What does research infrastructure really need for data? A personal view based on LifeWatch and ENVRI Alex Hardisty, Cardiff University LifeWatch: an ESFRI Research Infrastructure; an e-Infrastructure for Biodiversity and Ecosystem Science. What is LifeWatch? Biodiversity science is the study of the diversity of life on our planet – plants, animals, microorganisms and viruses – and the environments (ecosystems) they live in. LifeWatch (www.lifewatch.eu) will be an open access infrastructure, accessed through a single portal (portal.lifewatch.eu) for users from the scientific community, as well as policy makers and representatives of the private sector. It will allow scientists to explore, describe and understand patterns in biodiversity, and the processes that maintain biodiversity, in space and time at the gene, species, ecosystem and landscape levels; and to understand what causes and affects species diversity. The innovative design of LifeWatch offers integrated access to large-scale data resources, advanced algorithms and computational capability through a service-oriented architecture to support creation of new knowledge. Key elements of the infrastructure will include: distributed observatories/sensors, interoperable datasets, processing and analytical tools, and both computational capability and capacity. Data mining, data analysis and modelling allows users to study patterns and mechanisms across different levels of biodiversity. The LifeWatch infrastructure provides scientific research teams with new collaborative environments by creating ‘e-Laboratories’ or composing ‘e-Services’. They may share their data and analytical and modelling algorithms with others, while controlling access. LifeWatch enables “distributed large scale” and collaborative research on complex and multidisciplinary problems. In planning for the past 3 years, LifeWatch is presently transitioning to its construction phase. 
Early Virtual Labs are likely to support scientific studies of biodiversity in marine wetlands and the fragility of ecosystems towards alien and invasive species. The Biodiversity Virtual e-Laboratory (BioVeL) project (www.biovel.eu) contributes to the construction by causing islands of compatible infrastructure to be created / emerge at key centres across Europe. The challenges of scale and heterogeneity LifeWatch is supported by many good data providers from within the scientific communities (networks of excellence) for terrestrial ecology, marine ecology and the natural history collections with all their biological specimens. There are currently about 1800 terrestrial monitoring sites and 200 marine research sites across Europe. Hundreds of millions of specimens in natural history collections all over Europe are gradually being indexed and digitised. Biodiversity data is extremely diverse and heterogeneous. Biodiversity science spans many more familiar disciplines: biology, botany, zoology, ecology, genetics, soil science, biogeography, climate science, chemistry - to name but a few. Each of these established scientific communities already has its own way of Alex Hardisty, XLDB-Europe, Edinburgh, 8-10th June 2011 Page 1
  • 2. doing things, their own data resources and their own tools. Not only that, but they have their own different vocabularies and conceptual underpinnings. Interoperability is a problem demanding a determined ontological and thesaurus solution like that used in the medical domain: the Unified Medical Language System (UMLS) (www.nlm.nih.gov/research/umls). The interconnections between different biodiversity ideas/concepts, data sources, and the outputs from data processing, manipulation and modelling are intricate. As well as the traditional sources mentioned above, genomic data including, for example: sequence data, DNA barcodes and phylogenies are becoming increasingly important sources. Biodiversity science also demands environmental data (climate, soil, ocean temperature, etc.), as well as economic and census data for particular types of studies. Apart from the well known and often large sources - GBIF, EBI, environmental data, census data - there are numerous small datasets in the hands of individual researchers. If computerised at all, these small datasets are often held in spreadsheets and with no identifiable common structure. There are probably thousands of them. And multiple tools for processing too. The biodiversity science community is highly fragmented and all these kinds of small, personal, group and departmental datasets need to get published and become discoverable and usable. LifeWatch aims to support upwards of 25,000 users, primarily from the academic and research community, and the policymaking community, but also supporting the student education sector and the general public (citizen science). The LifeWatch strategy of “Thinking globally, acting locally” addresses these challenges of heterogeneity and scale. 
Under “Thinking globally, acting locally”, LifeWatch devises and promotes pan-European, top-down strategies that foster collaboration and interoperability and, at the local level, assists and encourages ‘islands’ of compliant infrastructure to emerge and fuse.

ENVRI: Common Operations of the ESFRI Environmental Research Infrastructures

What is ENVRI?
ENVRI is a soon-to-be-funded EC FP7 project that brings together many of the main ESFRI research infrastructures from the environmental sciences domain. The ENVRI project will contribute to the construction of these research infrastructures by sharing experiences and technologies and by solving crucial common technology issues and challenges together. Through cooperation in this project the ESFRI ENV infrastructures, together with ICT partners, are seeking to increase the interoperability of their data and facilities, and thereby the use and effectiveness of their infrastructures.

The central goal of the ENVRI project is to implement harmonised solutions and draw up guidelines for the common needs of the environmental ESFRI projects, with a special focus on issues such as architectures, metadata frameworks, data discovery in scattered repositories, visualization and data curation.

ENVRI recognises scientific data services as part of a horizontal set of foundational services that include communications, distributed computing, and storage. It recognises that data providers, as well as data users, are users of data services and that there are common requirements irrespective of domain-specific communities. Community-specific services sit on top of data services and interact with them.

The key to improved interoperability is finding common solutions to common problems that can be adopted by each research infrastructure as it progresses through its construction phase. Fundamental common solutions include:
a) A Common Reference Model providing multiple compatible ‘views’ of the research infrastructure for different purposes. An ENVRI Common Reference Model is likely to be based on the ISO/IEC 10746 series of Standards for Open Distributed Processing, presenting five viewpoints: i) Science business / enterprise view; ii) Information view; iii) Computational / services view; iv) Engineering view; and v) Technology view.

b) “Standards, Standards, Standards” are required for, at least:
  • Data capture from distributed sensors
  • Metadata definition
  • Management of high volume data
  • Execution of workflows
  • Visualization of data
  • Provenance and annotation
  • Interoperability between assets

c) Tools, based on a generic metadata model (the Information viewpoint of the Common Reference Model), that allow data discovery and access in a federation of distributed digital repositories and interoperating infrastructures;

d) RDF and OWL frameworks to describe relations between (virtualized) e-Infrastructure components, and to link semantic descriptions of data with the semantic descriptions of the infrastructure, allowing the creation of a data-centric network.

Riding the Wave: How Europe can gain from the rising tide of scientific data
The recently published report of the High Level Expert Group on Scientific Data, “Riding the Wave: How Europe can gain from the rising tide of scientific data”, is an important contribution towards addressing the question of what research infrastructures really need for data. Neelie Kroes, the Vice-President of the European Commission responsible for the Digital Agenda, has asked: “every citizen and every organisation involved in scientific research to take note of this report and to use it as a reference point when discussing the priorities of EU research investments.”

The report may be found here: http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf
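
Returning to common solution d) in the ENVRI list, the idea of linking semantic descriptions of data with semantic descriptions of the infrastructure can be made concrete with a small sketch. The Python below is illustrative only and is not part of ENVRI or LifeWatch. It holds RDF-style (subject, predicate, object) statements in a plain set and follows the links from a dataset description to the infrastructure component that hosts it. Every identifier is an assumption made for the example: the `ex:` names, the predicates `ex:describes`, `ex:hostedBy` and `ex:locatedAt`, and the helper functions are hypothetical placeholders, not terms from any real ontology or library.

```python
# Illustrative sketch only: all names below are hypothetical placeholders,
# not terms from ENVRI, LifeWatch or any published ontology.

# RDF-style statements held as (subject, predicate, object) tuples.
TRIPLES = {
    # Semantic descriptions of data
    ("ex:dataset/marine-obs", "ex:describes", "ex:topic/marine-ecology"),
    ("ex:dataset/marine-obs", "ex:hostedBy", "ex:repo/A"),
    ("ex:dataset/soil-cores", "ex:describes", "ex:topic/soil-science"),
    ("ex:dataset/soil-cores", "ex:hostedBy", "ex:repo/B"),
    # Semantic descriptions of the infrastructure
    ("ex:repo/A", "ex:locatedAt", "ex:site/1"),
    ("ex:repo/B", "ex:locatedAt", "ex:site/2"),
}


def objects(triples, subject, predicate):
    """Return, sorted, every object stated for (subject, predicate)."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def subjects(triples, predicate, obj):
    """Return, sorted, every subject stated for (predicate, object)."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


def locate_datasets(triples, topic):
    """Map each dataset about `topic` to the sites of the repositories hosting it."""
    result = {}
    for dataset in subjects(triples, "ex:describes", topic):
        sites = []
        # Cross from the data description into the infrastructure description.
        for repo in objects(triples, dataset, "ex:hostedBy"):
            sites.extend(objects(triples, repo, "ex:locatedAt"))
        result[dataset] = sorted(sites)
    return result


if __name__ == "__main__":
    print(locate_datasets(TRIPLES, "ex:topic/marine-ecology"))
```

In practice the statements would come from RDF/OWL documents published by each infrastructure rather than from a hard-coded set. The sketch only shows that, once data and infrastructure descriptions share one graph, a single traversal crosses from one to the other.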