Dataverse SSHOC enrichment of DDI support at EDDI'19
1. Enrichment of DDI support in the
Dataverse data repository
Slava Tykhonov, Marion Wittenberg (DANS-KNAW)
EDDI 2019, Tampere, Finland, December 3, 2019
Creative Commons Attribution 4.0 International (CC BY 4.0)
2. SSHOC objective and deliverables
Objective
Development of a research data repository service on EOSC, for SSH
institutions currently without such a facility for their designated communities
Deliverables
After 38 months: Data repository service running on EOSC
After 40 months: Report on principles of governance and sustainability of
the data repository service
3. Development process
The Dataverse SSHOC project has two parallel development tracks:
● The core development team is working on modifying and extending
the Dataverse core functionality.
● The application development team will create new tools or integrate
existing ones, to be published on the Dataverse App Store website.
Our goal is to build a distributed and mature data infrastructure based on
sustainable microservices.
4. Services in European Open Science Cloud (EOSC)
● EOSC requires at least maturity level 8
● we need the highest quality of software
to be accepted as a service
● clear and transparent evaluation of
services is essential
● evidence of technical maturity is the
key to success
● a limited warranty makes it possible to stop
out-of-warranty services
5. Applications maturity level
Every software package should follow the same CESSDA Maturity Model to
be accepted as a service.
https://zenodo.org/record/2591055#.XKR6ny2B2u5
Must have: Kubernetes (k8s) infrastructure with upstream Docker images, warranty
statement, documentation, unit tests, Selenium tests, Jenkins pipeline.
External Dataverse applications that are mature enough and deployed as
Cloud services can be connected to any Dataverse repository by using an API
token.
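As a minimal sketch of what "connected by using an API token" can look like, the snippet below builds an authenticated request against the Dataverse native API. The server URL, the token value and the helper name `dataverse_request` are placeholders invented for illustration; the only Dataverse-specific detail relied on is that the API token is passed in the `X-Dataverse-key` HTTP header. The request is built but not sent.

```python
import urllib.request


def dataverse_request(base_url: str, api_token: str, path: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request to a Dataverse API path.

    `dataverse_request` is a hypothetical helper, not part of any Dataverse client.
    """
    url = base_url.rstrip("/") + "/api/" + path.lstrip("/")
    # Dataverse reads the user's API token from the X-Dataverse-key header.
    return urllib.request.Request(url, headers={"X-Dataverse-key": api_token})


# Placeholder server and token; replace with a real instance and account token.
example_request = dataverse_request(
    "https://dataverse.example.org",
    "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "dataverses/root/contents",
)
print(example_request.full_url)
```

Because the token travels with each request, the same external application can be pointed at a different Dataverse instance by changing only the base URL and the token.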
6. Dataverse App Store
We’re building different services out of tools!
Data preview: DDI Explorer, Spreadsheet/CSV, PDF, Text files, HTML,
Images, video render, audio, JSON, GeoJSON/Shapefiles/Map, XML
Interoperability: external controlled vocabularies (CESSDA CV Manager)
Data processing: NESSTAR DDI migration tool
Linked Data: RDF compliance including SPARQL endpoint
Federated login: eduGAIN, PIONIER ID
7. DDI Converter tool
It usually takes a lot of effort and time to migrate metadata and data from a
data repository like NESSTAR or DSpace to another repository.
The main idea of the DDI Converter is to separate the mappings from the
conversion process, so that metadata specialists can maintain the mappings
independently of the DDI migration pipeline.
DDI Converter has a Docker infrastructure, so it can be deployed as an image
on Kubernetes or other Cloud platforms. You don’t need any development
capacity to use it: just create the mappings and the tool will do the rest!
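The separation of mapping and conversion can be sketched as follows. This is an illustration only, not the DDI Converter's actual code: the stylesheet `MAPPING_XSLT` is a hypothetical, deliberately tiny mapping that copies the study title from a DDI Codebook 2.5 document into a flat `<dataset>` element, and the third-party `lxml` package is assumed to be installed. The point is that `convert` knows nothing about the mapping; a metadata specialist could swap in a different stylesheet without touching the Python code.

```python
from lxml import etree  # third-party package, assumed to be installed

# Hypothetical minimal mapping, maintained separately from the conversion code.
MAPPING_XSLT = b"""<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ddi="ddi:codebook:2_5"
    exclude-result-prefixes="ddi">
  <xsl:output method="xml" indent="no"/>
  <xsl:template match="/">
    <dataset>
      <title>
        <xsl:value-of select="/ddi:codeBook/ddi:stdyDscr/ddi:citation/ddi:titlStmt/ddi:titl"/>
      </title>
    </dataset>
  </xsl:template>
</xsl:stylesheet>
"""

# Made-up sample input in DDI Codebook 2.5.
SAMPLE_DDI = b"""<codeBook xmlns="ddi:codebook:2_5">
  <stdyDscr><citation><titlStmt>
    <titl>Example Survey 2019</titl>
  </titlStmt></citation></stdyDscr>
</codeBook>"""


def convert(ddi_xml: bytes, mapping_xslt: bytes) -> str:
    """Apply an XSLT mapping to a DDI document and return the serialized result."""
    transform = etree.XSLT(etree.fromstring(mapping_xslt))
    result = transform(etree.fromstring(ddi_xml))
    return str(result)


converted = convert(SAMPLE_DDI, MAPPING_XSLT)
print(converted)
```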
9. Why XSLT mappings?
● XSLT (1998) is a language designed
primarily for transforming human-
readable documents into other self-
describing documents.
● The DDI community already uses XSLT to
map metadata from one format to
another and has collected many mappings
that can be reused.
● XSLT mappings for different DDI standards
can be managed in the same GitHub
repository.
● At the moment, knowledge of XSLT is a
common job requirement for metadata
specialists.
10. DDI Converter in a nutshell
● Developed in Python 3 as a Flask application using the pyDataverse module
(AUSSDA)
● DDI Converter uses XSLT mappings stored in GitHub
● all CESSDA DDI transformations are also supported
https://github.com/MetadataTransform/ddi-xslt
● The Swagger framework makes the tool usable both as a manual deposit form
and as a microservice built into the migration pipeline
● A Docker image deployed locally or in the Cloud can connect DDI Converter
to any Dataverse instance through the API
● You can migrate your data even if the Dataverse instance is maintained by
someone else. Just copy the API token from your Dataverse account and
paste it into DDI Converter, and it will do the job for you!
11. Using Swagger as a dataset deposit form
Import steps:
1. Open the Swagger page
2. Upload the DDI file
3. Select an XSLT mapping from GitHub
4. Copy the API token from the user page in Dataverse
5. Choose the subdataverse where the dataset should go
6. Start the migration process in one click
7. Check the result in Dataverse
Interested?
https://github.com/IQSS/dataverse-ddi-converter-tool
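The inputs collected in the import steps can be modelled roughly as below. This is a sketch under stated assumptions, not the tool's implementation: `ImportJob`, `build_deposit_request`, the server URL, the token and the subdataverse alias are hypothetical, and the dataset JSON is an empty placeholder rather than real converter output. The sketch assumes the final deposit goes to the Dataverse native API endpoint for creating a dataset inside a dataverse, with the token sent in the `X-Dataverse-key` header. The request is built but not sent.

```python
import json
import urllib.request
from dataclasses import dataclass, fields


@dataclass
class ImportJob:
    """User inputs gathered in the deposit form (hypothetical structure)."""
    ddi_path: str         # step 2: the DDI file to upload
    mapping_source: str   # step 3: where the XSLT mapping comes from
    api_token: str        # step 4: token copied from the Dataverse user page
    dataverse_alias: str  # step 5: target subdataverse

    def validate(self) -> None:
        # Every step must have supplied a non-empty value before migration starts.
        missing = [f.name for f in fields(self) if not getattr(self, f.name)]
        if missing:
            raise ValueError("missing inputs: " + ", ".join(missing))


def build_deposit_request(server: str, job: ImportJob, dataset_json: dict) -> urllib.request.Request:
    """Step 6: build the POST request that would create the dataset."""
    job.validate()
    url = f"{server.rstrip('/')}/api/dataverses/{job.dataverse_alias}/datasets"
    return urllib.request.Request(
        url,
        data=json.dumps(dataset_json).encode("utf-8"),
        method="POST",
        headers={"X-Dataverse-key": job.api_token, "Content-Type": "application/json"},
    )


job = ImportJob(
    ddi_path="study.xml",
    mapping_source="https://github.com/MetadataTransform/ddi-xslt",
    api_token="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    dataverse_alias="my-subdataverse",
)
# Placeholder body; in practice this would be the converted metadata.
deposit_request = build_deposit_request(
    "https://dataverse.example.org", job, {"datasetVersion": {"metadataBlocks": {}}}
)
print(deposit_request.get_method(), deposit_request.full_url)
```

Validating the inputs before building the request mirrors the form: the migration is only started once a file, a mapping, a token and a target subdataverse have all been provided.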
12. What’s next? DDI Explorer as a service
DDI Explorer is a Dataverse
application developed by
Scholars Portal
dataverse.scholarsportal.info
The Dataverse SSHOC project
integrated it into a Docker image
and incorporated it into the
Kubernetes infrastructure
Dataverse-docker module
DDI Explorer will be delivered
as a Cloud service that can be
connected to any Dataverse
instance!
13. Spreadsheet previewer
This tool was contributed by
the Dataverse SSHOC project and
integrated by Harvard IQSS in
Dataverse 4.18.
It lets users browse and view
data directly in the web
interface, without
downloading it.
The spreadsheet viewer can
increase the chances of finding
the right data and of getting a
citation - more FAIRness!