agINFRA: A data infrastructure to support agricultural scientific communities
Andreas Drakos, University of Alcala
EGI-APARSEN workshop, Amsterdam, 4-6 March 2014
Our project

in agINFRA we will:

share agricultural research…
…over a data e-infrastructure

Agricultural research data
• Primary data:
– Structured, e.g. datasets as tables
– Digitized: images, videos, etc.

• Secondary data (elaborations, e.g. a dendrogram)
• Provenance information, incl. authors, their
organizations and projects
• Methods and procedures followed
• Reports, including papers
• Secondary documents, e.g. training resources
• Metadata about the above
• Social data, tags, ratings, etc.

agINFRA values: scientific data must be

A. Open: must be open and interlinked
NOT subject to barriers, based on standard formats and avoiding building
data silos due to lack of interrelatedness and ad-hoc APIs.

B. Meaningful: must be meaningful through explicit semantics
Reusing the semantics already provided in mature terminologies and
ontologies that are exposed and interlinked through the Web.

C. Reliable: must be reliable, traceable and accessible
Any kind of research objects can be stored in the data infrastructure, and
there are NO barriers to expressing relations between these objects to
capture the context of research activities.

D. Actionable: must be actionable via services that empower research
Data is not useful without flexible and adaptable services that allow
researchers to act on the data in the ways they need.

There is a lot of data

[Diagram: routes by which content providers share (meta)data over agINFRA]
• Content provider with unorganised collection (e.g. listed at Web site or in DVD-ROM)
– chooses sharing-compliant tool
– registers as data source
• Content provider with CMS that does not support sharing (e.g. proprietary DB)
– (meta)data export in proprietary format & mapping to a known format
– ingestion in sharing-compliant tool
– registers as data source
• Content provider with CMS that supports sharing (e.g. OAI-PMH, RSS, ...)
– registers as data source
• Each provider shares (meta)data, e.g. through OAI-PMH, with a (meta)data aggregator
• (Meta)data aggregator: indexed & available through CIARD RING, served through agINFRA
• Labels in the diagram mark individual steps as "hosted over agINFRA" or "computed over agINFRA"
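The "shares (meta)data e.g. through OAI-PMH" step can be illustrated with a minimal sketch. It builds a ListRecords request URL and pulls Dublin Core titles out of a response. The endpoint URL and the sample record are hypothetical; only the verb, the metadataPrefix and set parameters, and the XML namespaces come from OAI-PMH 2.0 and Dublin Core. Nothing here is taken from the agINFRA implementation.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_titles(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS)
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Hypothetical response, hand-written for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Soil maps of a hypothetical region</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical endpoint; a real harvester would fetch this URL over HTTP.
print(list_records_url("http://repository.example.org/oai"))
print(extract_titles(SAMPLE))
```

A real harvester would also follow the resumptionToken element to page through large result sets; that is left out to keep the sketch short.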
Actors over the infrastructure
[Diagram: infrastructure components]
• Registry of Datasets and APIs collections
• Registry of vocabularies and tools
• data sources
• Cloud / SaaS tools
• APIs
• LOD Vocabularies
• agINFRA RDF vocabularies
• Public REST APIs
• Grid jobs
• Grid workflows
• Productivity Tools
• Information services
• agINFRA LOD KOSs

Actors over the infrastructure
[Diagram: the components above, annotated with the actors that use them]
• Developers
• Information systems providers
• Taxonomists
• Data providers
• Researchers
• Policy makers

An existing data community

• CIARD: a global community movement to make
agricultural research information and
knowledge publicly accessible to all
– http://www.ciard.net

agINFRA 2nd Review Meeting, 13th of December 2013

A core registry service

• CIARD RING (Routemap to Information Nodes
and Gateways)
– global registry to give access to any kind of
information sources pertaining to agricultural
research for development
– principal tool created through CIARD to allow
information providers to register their services in
various categories and facilitate discovery of
sources of agriculture-related information across
the world

New agINFRA RING

RING data registry usage scenario 1

• data aggregators registering their data providers to CIARD RING
– asking directly to be registered there (AGRIS)
– federating own smaller registries (GLN)

RING data registry usage scenario 2

• new data providers using agINFRA cloud tools
can be automatically registered to CIARD RING
– cloud-hosted AgriDrupal or AgriOceanDSpace
instances for document repositories
– cloud-hosted agLR instances for learning
repositories

• agINFRA Cloud hosting services
– In collaboration with other cloud communities
(e.g. OKEANOS/GRNET)
– In collaboration with CHAIN-REDS project etc.

Data provider scenario 1
• Data provider in need of hosting & storage of small-scale CMS
• Use a cloud-hosted CMS
• Sets up own CMS instance
[Diagram: these steps are overlaid on the infrastructure components listed under "Actors over the infrastructure"]

Data provider scenario 2
• Data provider in need of large-scale hosting & replication CMS
• Requests space/accounts in large-scale CMS
[Diagram: these steps are overlaid on the infrastructure components listed under "Actors over the infrastructure"]

A semantic backbone for agINFRA

• to help all data providers declare, publish &
link their metadata properties and value
spaces
– Publishing their KOSs using VocBench and their
metadata vocabularies using Neologism
– Linking them to existing vocabularies, e.g. AGROVOC
for KOSs, Dublin Core for metadata

• guidelines & tools to support data providers in
adopting such a LOD framework
– e.g. LODE-BD recommendations

• to provide an entry point to existing relevant
vocabularies

Exposing to the e-infrastructure scenario
• Data provider hosting CMS at own or external/commercial infrastructure
• Interested to expose (meta)data to e-infrastructure
[Diagram: these steps are overlaid on the infrastructure components listed under "Actors over the infrastructure"]

agINFRA LOD layer usage scenario 1
• A data owner wants to share their data as Linked
Data
• The data owner uses non-LOD vocabularies and
KOSs and wants to publish them as LOD and link
them to existing vocabularies
• agINFRA offers tools for publishing vocabularies
and KOSs

Once the vocabularies are published, all metadata
and all concepts have URIs and can be referenced by
any other system
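As a rough illustration of what "concepts have URIs and can be referenced" means in practice, the sketch below writes one local concept as N-Triples and links it to a concept in an existing vocabulary. The SKOS and RDF property URIs are standard; the base URI, the concept identifier and the target URI are hypothetical placeholders (in this scenario the target would be a concept from a vocabulary such as AGROVOC). This is not how VocBench or Neologism work internally.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def concept_triples(base_uri, local_id, label, lang="en", exact_match=None):
    """Return N-Triples lines describing one SKOS concept."""
    subject = f"<{base_uri}{local_id}>"
    # Escape backslashes and double quotes inside the literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .",
        f'{subject} <{SKOS}prefLabel> "{escaped}"@{lang} .',
    ]
    if exact_match:
        # Link the local concept to an equivalent concept published elsewhere.
        lines.append(f"{subject} <{SKOS}exactMatch> <{exact_match}> .")
    return lines


# Hypothetical URIs, used only to show the shape of the output.
triples = concept_triples(
    "http://vocab.example.org/kos/",
    "maize",
    "Maize",
    exact_match="http://thesaurus.example.org/concept/123",
)
print("\n".join(triples))
```

Once such triples are served at the concept's URI, any other system can point at the concept instead of copying its label.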

agINFRA LOD layer usage scenario 2
• Once KOSs are published, all metadata and all
concepts have URIs and can be referenced by any
other system
• Data aggregators like AGRIS and GLN can create
mash-ups between their core data and other
agricultural data types (e.g. germplasm, soil maps,
statistics, ...) by using the LOD semantic backbone as
a crosswalk between metadata formalizations and
concepts in different vocabularies
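A crosswalk of this kind can be sketched as a lookup table from each source's local field names to shared property URIs. The Dublin Core element URIs below are real; the source names, field names and concept URI are hypothetical and only illustrate the idea, not the AGRIS or GLN data models.

```python
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical crosswalks: local field names mapped to shared property URIs.
CROSSWALKS = {
    "bibliographic": {
        "titel": DC + "title",
        "autor": DC + "creator",
        "keyword": DC + "subject",
    },
    "germplasm": {
        "accession_name": DC + "title",
        "crop": DC + "subject",
    },
}


def to_shared(source, record):
    """Re-key a record with its source's crosswalk; unmapped fields are dropped."""
    mapping = CROSSWALKS[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def shared_subjects(a, b):
    """Return the subject concept URIs two re-keyed records have in common."""
    key = DC + "subject"
    return set(a.get(key, [])) & set(b.get(key, []))


MAIZE = "http://vocab.example.org/kos/maize"  # hypothetical concept URI

paper = to_shared(
    "bibliographic",
    {"titel": "Maize yields", "keyword": [MAIZE], "pages": 12},
)
accession = to_shared(
    "germplasm",
    {"accession_name": "Accession 42", "crop": [MAIZE]},
)
print(shared_subjects(paper, accession))
```

Because both records end up keyed by the same property URI and reference the same concept URI, a mash-up can join them without knowing either source's original schema.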

agINFRA LOD layer usage scenario 2
Example: LOD-based mash-ups in AGRIS
[Diagram: AGRIS bibliographic metadata linked to external sources]
• AGRIS bibliographic metadata: Journal, Topic, Geographic metadata, Thematic metadata, Scientific names
• Linked sources: AGRIS Journals RDF store, DBpedia, FAO Country Profiles, FAO Fisheries, WorldBank indicators by country
• Resulting information: Info on journal, Info on topic, Info on country, Info on species, Specific indicators on country

Workflow architecture

[Diagram: metadata processing workflow]
• Components: Ariadne harvester; Filtering component; Identification and de-duplication component; Transformation component; Link checking component; PostProcessing/Enrichment component
• Stores: File system (DC, IEEE LOM, MODS XML), shown twice; File system (XMLs); MySQL; Records with Broken Links; Duplicates
• Labels: Stores; Get unique ID; Store metadata in JSON; To be ported on the Grid

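The components named in the workflow can be chained as in the minimal sketch below. The order of the steps, the record fields ("title", "url"), the hash-based ID and the injected link checker are all assumptions made for illustration; the slides do not specify them, and this is not the agINFRA code.

```python
import hashlib
import json


def filter_records(records, required=("title", "url")):
    """Filtering: keep records that have every required field."""
    return [r for r in records if all(r.get(f) for f in required)]


def check_links(records, is_alive):
    """Link checking: split records into (ok, broken) using the given checker."""
    ok, broken = [], []
    for r in records:
        (ok if is_alive(r["url"]) else broken).append(r)
    return ok, broken


def unique_id(record):
    """Identification: derive a stable ID from the normalised title and URL."""
    key = record["title"].strip().lower() + "|" + record["url"].strip()
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def deduplicate(records):
    """De-duplication: keep the first record per ID, collect the rest."""
    seen, unique, duplicates = set(), [], []
    for r in records:
        rid = unique_id(r)
        if rid in seen:
            duplicates.append(r)
        else:
            seen.add(rid)
            unique.append({**r, "id": rid})
    return unique, duplicates


def to_json(records):
    """Transformation: serialise the cleaned records as JSON."""
    return json.dumps(records, indent=2, sort_keys=True)


def run_pipeline(records, is_alive):
    """Run filter, link check, de-duplication and transformation in turn."""
    kept = filter_records(records)
    ok, broken = check_links(kept, is_alive)
    unique, duplicates = deduplicate(ok)
    return to_json(unique), broken, duplicates


# Hypothetical harvested records.
records = [
    {"title": "Maize yields", "url": "http://a.example.org/1"},
    {"title": "maize yields ", "url": "http://a.example.org/1"},
    {"title": "Soil map", "url": "http://dead.example.org/2"},
    {"title": "", "url": "http://a.example.org/3"},
]
# The checker is a stand-in; a real one would issue HTTP requests.
output, broken, duplicates = run_pipeline(records, lambda url: "dead" not in url)
print(output)
```

Passing the link checker in as a function keeps the sketch runnable offline. Each step also returns its rejects, mirroring the separate "Records with Broken Links" and "Duplicates" stores in the diagram.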
Thank you!
Questions

Automating Google Workspace (GWS) & more with Apps Script
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

agINFRA EGI-APARSEN workshop, Amsterdam, 4-6 March 2014

  • 1. agINFRA A data infrastructure to support agricultural scientific communities Andreas Drakos, University of Alcala EGI-APARSEN workshop, Amsterdam, 4-6 March 2014
  • 2. Our project: in agINFRA we will share agricultural research over a data e-infrastructure
  • 3. Agricultural research data • Primary data: – Structured, e.g. datasets as tables – Digitized: images, videos, etc. • Secondary data (elaborations, e.g. a dendrogram) • Provenance information, incl. authors, their organizations and projects • Methods and procedures followed • Reports, including papers • Secondary documents, e.g. training resources • Metadata about the above • Social data, tags, ratings, etc.
  • 4. agINFRA values: scientific data must be • A | Open | Must be open and interlinked: NOT subject to barriers, based on standard formats, and avoiding the data silos that result from a lack of interrelatedness and from ad-hoc APIs. • B | Meaningful | Must be meaningful through explicit semantics: reusing the semantics already provided in mature terminologies and ontologies that are exposed and interlinked through the Web. • C | Reliable | Must be reliable, traceable and accessible: any kind of research object can be stored in the data infrastructure, and there are NO barriers to expressing relations between these objects to capture the context of research activities. • D | Actionable | Must be actionable via services that empower research: data is not useful without flexible and adaptable services that allow researchers to act on the data in the ways they need.
  • 5. There is a lot of data
  • 6. [Diagram: content providers joining agINFRA] Content provider with unorganised collection (e.g. listed at Web site or in DVD-ROM); content provider with CMS that does not support sharing (e.g. proprietary DB); content provider with CMS that supports sharing (e.g. OAI-PMH, RSS, ...). [Diagram labels: chooses sharing compliant tool; register as data source; hosted over agINFRA; (meta)data export in proprietary format & ingestion in sharing; mapping to known compliant tool; computed over agINFRA]
  • 7. [Diagram, continued] (META)DATA AGGREGATOR. [Diagram labels: shares (meta)data, e.g. through OAI-PMH; indexed & available through CIARD RING; served through agINFRA; hosted over agINFRA; computed over agINFRA]
  • 8. [Diagram, continued] [Diagram labels: computed over agINFRA; hosted over agINFRA; …]
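
Slides 6 to 8 name OAI-PMH as one way a content provider shares (meta)data with an aggregator. As a rough illustration of what such a harvesting request looks like, the sketch below builds an OAI-PMH `ListRecords` URL. The base URL is a placeholder and the function name is hypothetical; neither comes from the slides, and nothing here describes an actual agINFRA endpoint.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    # 'verb' and 'metadataPrefix' are required arguments of ListRecords.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # 'set' and 'from' are optional selective-harvesting arguments.
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Placeholder repository address, assumed for illustration only.
url = list_records_url("https://repository.example.org/oai", from_date="2014-01-01")
print(url)
```

An aggregator would fetch this URL and parse the returned XML; that part is omitted because it depends on the repository.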
  • 9. Actors over the infrastructure. [Diagram labels: Registry of Datasets and APIs; collections; Registry of vocabularies and tools; data sources; Cloud / SaaS tools; APIs; LOD Vocabularies; agINFRA RDF vocabularies; agINFRA LOD KOSs; Public REST APIs; Grid jobs; Grid workflows; Productivity Tools; Information services]
  • 10. Actors over the infrastructure: Developers; Information systems providers; Taxonomists; Data providers; Researchers; Policy makers. [Diagram labels: Registry of Datasets and APIs; collections; Registry of vocabularies and tools; data sources; Cloud / SaaS tools; APIs; LOD Vocabularies; agINFRA RDF vocabularies; agINFRA LOD KOSs; Public REST APIs; Grid jobs; Grid workflows; Productivity Tools; Information services]
  • 11. An existing data community • a global community movement to make agricultural research information and knowledge publicly accessible to all – http://www.ciard.net (agINFRA 2nd Review Meeting, 13th of December 2013)
  • 12. A core registry service • CIARD RING (Routemap to Information Nodes and Gateways) – global registry giving access to any kind of information source pertaining to agricultural research for development – principal tool created through CIARD to allow information providers to register their services in various categories and to facilitate discovery of sources of agriculture-related information across the world
  • 13. New agINFRA RING
  • 14. New agINFRA RING
  • 15. RING data registry usage scenario 1 • data aggregators registering their data providers to CIARD RING – asking directly to be registered there (AGRIS) – federating own smaller registries (GLN)
  • 16. RING data registry usage scenario 2 • new data providers using agINFRA cloud tools can be automatically registered to CIARD RING – cloud-hosted AgriDrupal or AgriOceanDSpace instances for document repositories – cloud-hosted agLR instances for learning repositories • agINFRA Cloud hosting services – In collaboration with other cloud communities (e.g. OKEANOS/GRNET) – In collaboration with CHAIN-REDS project etc.
  • 17. Data provider scenario 1: Data provider in need of hosting & storage of small-scale CMS. Use a cloud-hosted CMS: sets up own CMS instance. [Diagram labels: Cloud / SaaS tools; Registry of Datasets and APIs; collections; Registry of vocabularies and tools; data sources; APIs; LOD Vocabularies; agINFRA RDF vocabularies; agINFRA LOD KOSs; Public REST APIs; Grid jobs; Grid workflows; Productivity Tools; Information services]
  • 18. Data provider scenario 2: Data provider in need of large-scale hosting & replication CMS. Requests space/accounts in large-scale CMS. [Diagram labels: Cloud / SaaS tools; Registry of Datasets and APIs; collections; Registry of vocabularies and tools; data sources; APIs; LOD Vocabularies; agINFRA RDF vocabularies; agINFRA LOD KOSs; Public REST APIs; Grid jobs; Grid workflows; Productivity Tools; Information services]
  • 19. A semantic backbone for agINFRA • to help all data providers declare, publish & link their metadata properties and value spaces – Publishing their KOSs using VocBench and their metadata vocabularies using Neologism – Linking them to existing vocabularies, e.g. AGROVOC for KOSs, Dublin Core for metadata • guidelines & tools to support data providers in adopting such a LOD framework – e.g. LODE-BD recommendations • to provide an entry point to existing relevant vocabularies
  • 20. Exposing to the e-infrastructure scenario: Data provider hosting CMS at own or external/commercial infrastructure. Interested to expose (meta)data to e-infrastructure. [Diagram labels: Cloud / SaaS tools; Registry of Datasets and APIs; collections; Registry of vocabularies and tools; data sources; APIs; LOD Vocabularies; agINFRA RDF vocabularies; agINFRA LOD KOSs; Public REST APIs; Grid jobs; Grid workflows; Productivity Tools; Information services]
  • 21. agINFRA LOD layer usage scenario 1 • A data owner wants to share their data as Linked Data • The data owner uses non-LOD vocabularies and KOSs and wants to publish them as LOD and link them to existing vocabularies • agINFRA offers tools for publishing vocabularies and KOSs • Once the vocabularies are published, all metadata and all concepts have URIs and can be referenced by any other system
  • 22. agINFRA LOD layer usage scenario 2 • Once KOSs are published, all metadata and all concepts have URIs and can be referenced by any other system • Data aggregators like AGRIS and GLN can create mash-ups between their core data and other agricultural data types (e.g. germplasm, soil maps, statistics, ...) by using the LOD semantic backbone as a crosswalk between metadata formalizations and concepts in different vocabularies
  • 23. agINFRA LOD layer usage scenario 2. Example: LOD-based mash-ups in AGRIS. [Diagram labels: AGRIS bibliographic metadata; Journal; Topic; Geographic metadata; Thematic metadata; Scientific names; AGRIS Journals RDF store; DBpedia; FAO Country Profiles; FAO Fisheries; WorldBank indicators by country; Info on journal; Info on topic; Info on country; Info on species; Specific indicators on country]
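
The crosswalk idea in scenario 2 can be sketched with plain dictionaries: once terms from different vocabularies resolve to the same concept URI, records from unrelated sources can be grouped under that URI. Everything below is illustrative. The URIs, source names and records are invented placeholders, and a real deployment would resolve terms against published vocabularies rather than a hard-coded table.

```python
from collections import defaultdict

# Hypothetical crosswalk: (source vocabulary, local term) -> shared concept URI.
CROSSWALK = {
    ("bibliographic", "maize"): "http://example.org/concept/maize",
    ("germplasm", "Zea mays"): "http://example.org/concept/maize",
    ("bibliographic", "rice"): "http://example.org/concept/rice",
}


def mash_up(records):
    """Group (source, term, title) records by the concept URI their term maps to."""
    grouped = defaultdict(list)
    for source, term, title in records:
        uri = CROSSWALK.get((source, term))
        if uri is not None:  # terms without a mapping are skipped
            grouped[uri].append((source, title))
    return dict(grouped)


# Invented sample records from two kinds of data source.
records = [
    ("bibliographic", "maize", "Paper on maize yields"),
    ("germplasm", "Zea mays", "Accession 001"),
    ("bibliographic", "wheat", "Paper on wheat rust"),
]
result = mash_up(records)
print(result)
```

The bibliographic record and the germplasm record end up under one URI even though they use different local terms, which is the effect the slide attributes to the semantic backbone.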
  • 24. Workflow architecture. [Diagram labels: File system (DC, IEEE LOM, MODS XML); Stores; Ariadne harvester; File system (DC, IEEE LOM, MODS XML); Stores; Filtering component; To be ported on the Grid; MySQL; Records with Broken Links; File system (XMLs); Get unique ID; Identification and de-duplication component; Transformation component; Stores; Duplicates; Store metadata in JSON; Link checking component; Post-processing/Enrichment component
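
Several of the components named on slide 24 (filtering, identification and de-duplication, transformation to JSON) can be illustrated with a small in-memory sketch. The slide does not say how IDs are computed or which records are filtered out, so the hashing rule and the "must have a title" filter below are assumptions, as are all function and field names. Link checking and harvesting are left out because they need network access.

```python
import hashlib
import json


def unique_id(record):
    """Derive a stable ID from identifier and title (assumed rule, not from the slides)."""
    key = (record.get("identifier", "") + "|" + record.get("title", "")).strip().lower()
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def run_pipeline(records):
    """Filter, de-duplicate and serialise metadata records to JSON strings."""
    kept, duplicates, seen = [], [], set()
    for rec in records:
        if not rec.get("title"):  # filtering step: drop records without a title
            continue
        rid = unique_id(rec)
        if rid in seen:  # de-duplication step: set aside repeated records
            duplicates.append(rec)
            continue
        seen.add(rid)
        # transformation step: store the metadata as JSON
        kept.append(json.dumps({"id": rid, **rec}, sort_keys=True))
    return kept, duplicates


# Invented sample input: one record, a case-variant duplicate, and one without a title.
sample = [
    {"identifier": "oai:example:1", "title": "Soil Maps"},
    {"identifier": "oai:example:1", "title": "soil maps"},
    {"identifier": "oai:example:2", "title": ""},
]
kept, duplicates = run_pipeline(sample)
print(len(kept), len(duplicates))
```

Keeping the duplicates in a separate list mirrors the "Duplicates" store in the diagram, so they can be inspected rather than silently discarded.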