COMSODE tools
Pushing data to the open ecosystem
Jindřich Mynarz
EEA.sk
ELAG 2015 Stockholm
June 9, 2015
The gist of the talk
To save legacy library data and satisfy
internal and external requirements on your
data you need ETL.
“Libraries have to focus on making their data
infrastructure more efficient if they want to
keep up with the ever changing needs of their
audience and invest in sustainable service
development.” — Lukas Koster (source)
Building tools to publish & reuse open data
EU FP7 project (2013➝2015)
Project partners:
● University of Milano-Bicocca, Italy
● Charles University in Prague, Czech Republic
● EEA, Czech Republic and Slovakia
● ADDSEN, Slovakia
● Spinque, the Netherlands
● Ministry of Interior of the Slovak Republic
Legacy library data
Save the data?
● …or let it go?
● What’s the cost of recovering the legacy?
● To save legacy data you need automation
⇒ ETL
● Unfortunately, paraphrasing Tolstoy, “tidy
datasets are all alike but every messy dataset
is messy in its own way.” (source)
Confusion of tongues
● MARC used to be (or still is?) the lingua franca. What's next?
● Many data formats required to be supported
● MARC→Web impedance mismatch
● Export & import in systems integration
Open Data Node
“(Linked) open data plumbing”
● Open Data Node (ODN) is a platform for
publishing (open) data & automating
internal data flows that enables
progressive enhancement of data.
● Main product of the COMSODE project
● Free, open source, modular, integrated (e.g., single sign-on)
Open Data Node networks
● Data replication (e.g., local copy of name authority file)
● Data synchronization (e.g., periodical harvesting of incremental updates via OAI-PMH)
● Data distribution (e.g., shared cataloguing)
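A minimal sketch of what harvesting incremental updates over OAI-PMH can look like: the function below only builds a ListRecords request URL limited by the protocol's "from" argument. The endpoint URL and the "marcxml" metadata prefix are hypothetical placeholders, and this is not how ODN itself implements harvesting.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for selective harvesting."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Only records created or changed since this date are returned.
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and metadata prefix, for illustration only.
url = list_records_url("https://example.org/oai", "marcxml", from_date="2015-06-01")
print(url)
```

Running the harvest periodically with "from" set to the date of the previous run keeps a local copy in sync without downloading everything again.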
Open Data Node workflow
1. Catalogue your internal data
2. Create a data processing pipeline for the
datasets to be published
3. Schedule the pipeline to be run to publish
the data
Internal catalogue
● Map out the data you have or external
data you use; both open and closed.
● If data cannot be found, it is as if it did
not exist, so make data discoverable and
provide it with descriptive metadata
(DCAT-AP).
● Based on CKAN.
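To illustrate what descriptive metadata in the DCAT vocabulary can look like, here is a minimal sketch that assembles a dataset description as JSON-LD. The function name, the sample values, and the choice of properties are assumptions made for illustration; it is not a complete or validated DCAT-AP record.

```python
import json


def dcat_dataset(title, description, access_url, license_url):
    """Return a small JSON-LD description of a dataset using DCAT terms."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:accessURL": {"@id": access_url},
                "dct:license": {"@id": license_url},
            }
        ],
    }


# Hypothetical dataset, for illustration only.
record = dcat_dataset(
    "Name authority file",
    "Local copy of a name authority file.",
    "https://example.org/data/authorities.xml",
    "https://creativecommons.org/publicdomain/zero/1.0/",
)
print(json.dumps(record, indent=2))
```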
UnifiedViews
● An extensible ETL tool with native RDF
support for automating repetitive data
exchange and transformation tasks.
● Allows you to define, execute, monitor,
debug (examine intermediate data),
schedule, and share (import/export) data
transformations.
● Open source, dual-licensed to enable
commercial extensions
Extract-Transform-Load pipeline
Data flow of an ETL process in UnifiedViews
is defined as a pipeline composed of data
processing units.
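The idea of a pipeline composed of data processing units can be sketched in a few lines. This is a conceptual illustration only, assuming a simple linear chain: all function names and sample records are made up, and it does not use the actual UnifiedViews API.

```python
from functools import reduce


def run_pipeline(units, data=None):
    """Pass data through each processing unit in order."""
    return reduce(lambda acc, unit: unit(acc), units, data)


def extract(_):
    # Extractor: stands in for reading records from a file or database.
    return ["  Tolstoy, Leo ", "tolstoy, leo", "Koster, Lukas"]


def transform(records):
    # Transformer: normalize whitespace and case, drop duplicates.
    return sorted({r.strip().title() for r in records})


loaded = []


def load(records):
    # Loader: stands in for writing to a triple store or SQL database.
    loaded.extend(records)
    return records


result = run_pipeline([extract, transform, load])
print(result)
```

Each unit only needs to agree with its neighbours on the shape of the data passed between them, which is what makes units reusable across pipelines.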
Data processing units
Extractors
● Download file
● Load from SQL database
● SPARQL endpoint extractor
Transformers
● Zip/unzip
● Find/replace
● Parse and serialize RDF
● SPARQL Update
● XSLT
● ISO 2709 to MARCXML
● SPARQL SELECT to CSV
Loaders
● Files upload
● Load to Virtuoso
● Load to SQL database
+ Quality Assessment
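As an example of what one such transformer does, the sketch below converts SPARQL SELECT results in the standard SPARQL JSON results format into CSV. It is an independent illustration with made-up sample data, not the code of the UnifiedViews unit listed above.

```python
import csv
import io


def sparql_json_to_csv(results):
    """Convert SPARQL SELECT results (JSON results format) to CSV text."""
    variables = results["head"]["vars"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(variables)
    for binding in results["results"]["bindings"]:
        # Unbound variables are absent from a binding; emit an empty cell.
        writer.writerow([binding.get(v, {}).get("value", "") for v in variables])
    return out.getvalue()


# Made-up query results, for illustration only.
sample = {
    "head": {"vars": ["title", "year"]},
    "results": {
        "bindings": [
            {
                "title": {"type": "literal", "value": "War and Peace"},
                "year": {"type": "literal", "value": "1869"},
            },
            {"title": {"type": "literal", "value": "Anna Karenina"}},
        ]
    },
}
csv_text = sparql_json_to_csv(sample)
print(csv_text)
```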
Public catalogue
● Public interface that enables users to
discover & access your data.
● Links to data dumps, APIs (REST API,
SPARQL endpoint), and applications
based on the data.
● Provides metadata, such as licence,
dataset maintainer’s contact, or last
update date.
● Based on CKAN.
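Because the catalogue is based on CKAN, datasets can in principle also be discovered programmatically through CKAN's Action API. The sketch below only builds a package_search request URL; the catalogue address is a hypothetical placeholder, and whether a given ODN deployment exposes this API publicly is an assumption.

```python
from urllib.parse import urlencode


def package_search_url(catalogue_url, query, rows=10):
    """Build a CKAN Action API URL that searches datasets by keyword."""
    params = urlencode({"q": query, "rows": rows})
    return catalogue_url.rstrip("/") + "/api/3/action/package_search?" + params


# Hypothetical catalogue address, for illustration only.
search_url = package_search_url("https://example.org/", "library catalogue", rows=5)
print(search_url)
```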
COMSODE methodology
● Guidelines on how to use ODN for those
with little open data experience
● Defines phases, practices, roles, and
artifacts.
● Phases:
a. Development of open data publication plan
b. Preparation of publication
c. Realization of publication
d. Archiving
http://opendatanode.org/product/methodology-for-od-publishing
Open Data Node in use
● Reality check
○ Eating our own dog food
○ Testing the ODN’s versatility
● 150 datasets transformed
by COMSODE partners
● Supporting 10 pilot projects, including:
○ eDemokracia: Slovak nation-wide e-government
project
○ Czech Trade Inspection Authority
○ Slovak Environment Agency
○ Slovak National Library
Slovak National Library
COMSODE pilot
Demo time!
Impact
● Improve your internal & external data
flows.
● Libraries are required to publish data by
the EU directive on the re-use of public
sector information.
○ If you release MARC, is the cost of access to the
data marginal?
● Insiders have access to the data, yet outsiders
often have more experience in building value
on top of it.
In conclusion
♫ The pipelines, the pipelines are calling... ♫
To save legacy library data and satisfy internal and
external requirements on your data you need ETL.
http://opendatanode.org
Image credits from the Noun Project:
Database by Dmitry Baranovskiy, Counter by Sergey
Demushkin, Ventil by Sergey Demushkin, Spider
Web by Denis, Scroll by EliRatus, Chest by Victor
Escorsin, Pipes by Christopher T. Howlett, Adoption
by Luis Prado, Plumber by Luis Prado, Filter by
Muneer A.Safiah, Lock by Alex Auda Samora, Lego
by Jon Trillana, Atom by Mister Pixel

Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 

Comsode tools - pushing data to open ecosystem

  • 1. COMSODE tools Pushing data to the open ecosystem Jindřich Mynarz EEA.sk ELAG 2015 Stockholm June 9, 2015
  • 2. The gist of the talk To save legacy library data and satisfy internal and external requirements on your data you need ETL. “Libraries have to focus on making their data infrastructure more efficient if they want to keep up with the ever changing needs of their audience and invest in sustainable service development.” — Lukas Koster (source)
  • 3. Building tools to publish & reuse open data EU FP7 project (2013➝2015) Project partners: ● University of Milano-Bicocca, Italy ● Charles University in Prague, Czech Republic ● EEA, Czech Republic and Slovakia ● ADDSEN, Slovakia ● Spinque, the Netherlands ● Ministry of Interior of the Slovak Republic
  • 4. Legacy library data Save the data? ● …or let it go? ● What’s the cost of recovering the legacy? ● To save legacy data you need automation ⇒ ETL ● Unfortunately, paraphrasing Tolstoy, “tidy datasets are all alike but every messy dataset is messy in its own way.” (source)
  • 5. Confusion of tongues ● MARC used to be (or still is?) the lingua franca. What's next? ● Many data formats required to be supported ● MARC→Web impedance mismatch ● Export & import in systems integration
  • 6. Open Data Node “(Linked) open data plumbing” ● Open Data Node (ODN) is a platform for publishing (open) data & automating internal data flows that enables progressive enhancement of data. ● Main product of the COMSODE project ● Free, open source, modular, integrated (e.g., single sign-on)
  • 7. Open Data Node networks ● Data replication (e.g., local copy of name authority file) ● Data synchronization (e.g., periodical harvesting of incremental updates via OAI-PMH) ● Data distribution (e.g., shared cataloguing)
  • 8. Open Data Node workflow 1. Catalogue your internal data 2. Create a data processing pipeline for the datasets to be published 3. Schedule the pipeline to be run to publish the data
  • 9. Internal catalogue ● Map out the data you have or external data you use; both open and closed. ● If data cannot be found, it is as if it did not exist, so make data discoverable and provide it with descriptive metadata (DCAT-AP). ● Based on CKAN.
  • 10. UnifiedViews ● An extensible ETL tool with native RDF support for automating repetitive data exchange and transformation tasks. ● Allows you to define, execute, monitor, debug (examine intermediate data), schedule, and share (import/export) data transformations. ● Open source, dual-licensed to enable commercial extensions
  • 11. Extract-Transform-Load pipeline Data flow of an ETL process in UnifiedViews is defined as a pipeline composed of data processing units.
  • 12. Data processing units Extractors ● Download file ● Load from SQL database ● SPARQL endpoint extractor Transformers ● Zip/unzip ● Find/replace ● Parse and serialize RDF ● SPARQL Update ● XSLT ● ISO 2709 to MARCXML ● SPARQL SELECT to CSV Loaders ● Files upload ● Load to Virtuoso ● Load to SQL database + Quality Assessment
  • 13. Public catalogue ● Public interface that enables users to discover & access your data. ● Links to data dumps, APIs (REST API, SPARQL endpoint), and applications based on the data. ● Provides metadata, such as licence, dataset maintainer’s contact, or last update date. ● Based on CKAN.
  • 14. COMSODE methodology ● Guidelines on how to use ODN for those with little open data experience ● Defines phases, practices, roles, and artifacts. ● Phases: a. Development of open data publication plan b. Preparation of publication c. Realization of publication d. Archiving http://opendatanode.org/product/methodology-for-od-publishing
  • 15. Open Data Node in use ● Reality check ○ Eating our own dog food ○ Testing the ODN’s versatility ● 150 datasets transformed by COMSODE partners ● Supporting 10 pilot projects, including: ○ eDemokracia: Slovak nation-wide e-government project ○ Czech Trade Inspection Authority ○ Slovak Environment Agency ○ Slovak National Library
  • 18. Impact ● Improve your internal & external data flows. ● Libraries are required to publish data by the EU directive on the re-use of public sector information. ○ If you release MARC, is the cost of access to the data marginal? ● Insiders have access, yet outsiders often have more experience to build value upon the data.
  • 19. In conclusion ♫ The pipelines, the pipelines are calling... ♫ To save legacy library data and satisfy internal and external requirements on your data you need ETL. http://opendatanode.org Image credits from the Noun Project: Database by Dmitry Baranovskiy, Counter by Sergey Demushkin, Ventil by Sergey Demushkin, Spider Web by Denis, Scroll by EliRatus, Chest by Victor Escorsin, Pipes by Christopher T. Howlett, Adoption by Luis Prado, Plumber by Luis Prado, Filter by Muneer A.Safiah, Lock by Alex Auda Samora, Lego by Jon Trillana, Atom by Mister Pixel
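
The pipeline idea from slides 11 and 12 (an extractor, some transformers, and a loader chained into one data flow) can be sketched in a few lines of Python. This is only an illustration of the pattern and not the UnifiedViews API: every function name here is hypothetical, the sample records are made up, and the "loader" simply appends to an in-memory list where a real one would upload files or write to a database.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


def extract_records() -> Iterable[Record]:
    """Hypothetical extractor: stands in for a 'Download file' or SQL unit."""
    # Made-up sample data with the kind of mess legacy exports tend to have.
    return [
        {"id": "1", "title": "  War and Peace "},
        {"id": "2", "title": "Anna Karenina"},
    ]


def strip_whitespace(records: Iterable[Record]) -> Iterable[Record]:
    """Hypothetical transformer: a find/replace-style cleanup step."""
    for record in records:
        yield {key: value.strip() for key, value in record.items()}


def make_list_loader(target: List[Record]) -> Callable[[Iterable[Record]], None]:
    """Hypothetical loader factory: collects records into an in-memory list."""
    def load(records: Iterable[Record]) -> None:
        for record in records:
            target.append(record)
    return load


def run_pipeline(
    extractor: Callable[[], Iterable[Record]],
    transformers: List[Callable[[Iterable[Record]], Iterable[Record]]],
    loader: Callable[[Iterable[Record]], None],
) -> None:
    """Pass the extractor's output through each transformer, then load it."""
    data = extractor()
    for transform in transformers:
        data = transform(data)
    loader(data)


published: List[Record] = []
run_pipeline(extract_records, [strip_whitespace], make_list_loader(published))
```

Because each unit only agrees on the shape of the data passed between them, a step can be swapped (for example, a different extractor) without touching the rest of the pipeline, which is the property the slides attribute to data processing units.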