Research Data Management
- an introductory webinar
Tony Ross-Hellauer, OpenAIRE
Sarah Jones, EUDAT
This work is licensed under the Creative
Commons CC-BY 4.0 licence
Who we are
OpenAIRE: Open Access Infrastructure for Research in Europe
www.openaire.eu
EUDAT: Research Data Services, Expertise & Technology
https://www.eudat.eu
Overview
• Why manage data?
• RDM in Horizon 2020 (+ recent changes)
• How to manage and share research data?
• EUDAT and OpenAIRE services
WHY MANAGE DATA?
Image CC-BY-NC-SA by Leo Reynolds www.flickr.com/photos/lwr/13442910354
Data explosion
• More and more data are being created
• The issue is not creating data, but being able to navigate and use it
• Data management is critical to make sure data are well-organised, understandable and reusable
Data loss
Digital data are fragile and susceptible to loss for a wide variety of reasons:
• Natural disaster
• Facilities infrastructure failure
• Storage failure
• Server hardware/software failure
• Application software failure
• Format obsolescence
• Legal encumbrance
• Human error
• Malicious attack
• Loss of staffing competencies
• Loss of institutional commitment
• Loss of financial stability
• Changes in user expectations
Image CC BY-NC-SA 2.0 by Dave Hill https://www.flickr.com/photos/dmh650/4031607067
A reproducibility crisis
Why manage data?
• Make your research easier
• Stop yourself drowning in irrelevant stuff
• Save data for later
• Avoid accusations of fraud or bad science
• Share your data for re-use
• Get credit for it
• Meet funder/institution requirements
Because well-managed data opens up opportunities for re-use and sharing, and makes for better science!
RDM IN HORIZON 2020
Image “Open Data” CC BY 2.0 by http://www.descrier.co.uk
EC Open Research Data Pilot (from Jan 2015)
• A limited, voluntary pilot (initially 8 programme areas) with opt-out and safeguards
• Participating projects must:
– Keep a data management plan, to be updated at regular intervals
– Deposit in an open access repository:
1. the data, including associated metadata, needed to validate the results presented in scientific publications, as soon as possible;
2. other data, including associated metadata, as specified and within the deadlines laid down in the data management plan
EC Open Research Data Pilot
Opt-out Reasons
https://open-data.europa.eu/data/dataset/open-research-data-the-uptake-of-the-pilot-in-the-first-calls-of-horizon-2020
Just announced!
H2020 - Open Data by Default from 2017
MANAGING & SHARING DATA
Research data lifecycle
• CREATING DATA: designing research, DMPs, planning consent, locating existing data, data collection and management, capturing and creating metadata
• PROCESSING DATA: entering, transcribing, checking, validating and cleaning data, anonymising data, describing data, managing and storing data
• ANALYSING DATA: interpreting and deriving data, producing outputs, authoring publications, preparing for sharing
• PRESERVING DATA: data storage, back-up & archiving, migrating to best format & medium, creating metadata and documentation
• GIVING ACCESS TO DATA: distributing data, sharing data, controlling access, establishing copyright, promoting data
• RE-USING DATA: follow-up research, new research, undertaking research reviews, scrutinising findings, teaching & learning
Ref: UK Data Archive: http://www.data-archive.ac.uk/create-manage/life-cycle
FAIR data
• Findable
– Assign persistent IDs, provide rich metadata, register in a searchable resource...
• Accessible
– Retrievable by their ID using a standard protocol, metadata remain accessible even if data aren’t...
• Interoperable
– Use formal, broadly applicable languages, use standard vocabularies, qualified references...
• Reusable
– Rich, accurate metadata, clear licences, provenance, use of community standards...
www.force11.org/group/fairgroup/fairprinciples
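As a rough illustration of how the FAIR points could be turned into a self-check, the sketch below looks for a few fields in a metadata record. The field names, their grouping under each principle and the example record are all assumptions made for this illustration; they are not defined by the FAIR principles or by any repository schema.

```python
# Hypothetical mapping from FAIR principle to metadata fields (illustrative only).
FAIR_FIELDS = {
    "Findable": ["identifier", "title", "description"],
    "Accessible": ["access_protocol"],
    "Interoperable": ["metadata_standard"],
    "Reusable": ["licence", "provenance"],
}


def missing_fair_fields(record):
    """Return {principle: [field names]} for fields that are absent or empty."""
    gaps = {}
    for principle, field_names in FAIR_FIELDS.items():
        missing = [name for name in field_names if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


example_record = {
    "identifier": "doi:10.1234/example",  # placeholder, not a real DOI
    "title": "River temperature readings",
    "description": "Hourly water temperature at three sites.",
    "access_protocol": "https",
    "licence": "CC-BY-4.0",
}

# Lists the principles for which this record still lacks a field.
print(missing_fair_fields(example_record))
```

A check like this only confirms that fields are filled in; whether the metadata are rich and accurate still needs a human reviewer.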
Data Management Plans
A DMP is a brief plan to define:
• how the data will be created
• how it will be documented
• who will access it
• where it will be stored
• who will back it up
• whether (and how) it will be shared & preserved
DMPs are often submitted as part of grant applications, but are useful whenever researchers are creating data.
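One way to keep track of these six questions is a small structured record. The sketch below is only an assumption about how that might look: the class name, attribute names and sample answers are hypothetical and are not taken from DMPonline or from any funder template.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """One free-text answer per DMP question; attribute names are illustrative."""
    creation: str = ""       # how the data will be created
    documentation: str = ""  # how it will be documented
    access: str = ""         # who will access it
    storage: str = ""        # where it will be stored
    backup: str = ""         # who will back it up
    sharing: str = ""        # whether (and how) it will be shared and preserved

    def unanswered(self):
        """Return the names of the questions that still have no answer."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


plan = DataManagementPlan(
    creation="Sensor readings exported as CSV",
    storage="University filestore",
)

# Shows which questions remain open in this draft plan.
print(plan.unanswered())
```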
DMPonline
A web-based tool to help researchers write DMPs
Includes a template for Horizon 2020
Guidance from EUDAT and OpenAIRE being added
https://dmponline.dcc.ac.uk
Metadata & documentation
• Metadata and documentation are needed to locate and understand research data
• Think about what others would need in order to find, evaluate, understand, and reuse your data
• Get others to check the metadata to improve quality
• Use standards to enable interoperability
Metadata standards
Use relevant standards for interoperability
http://rd-alliance.github.io/metadata-directory
Where to store data?
• Your own drive (PC, server, flash drive, etc.)
– And if you lose it? Or it breaks?
• Somebody else’s drive / departmental drive
• “Cloud” drive
– Do they care as much about your data as you do?
• Large scale infrastructure services like EUDAT
How to back up?
• 3... 2... 1... backup!
– at least 3 copies of a file
– on at least 2 different media
– with at least 1 offsite
• Use managed services where possible, e.g. university filestores or infrastructure services like EUDAT, rather than local or external hard drives
• Ask IT teams for advice
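The 3-2-1 rule can be written down as a simple check. This is a minimal sketch under stated assumptions: the `Copy` record, its fields and the sample inventory are invented for illustration, and a real inventory would come from whatever storage services are in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    location: str  # free-text label, e.g. "laptop"
    medium: str    # e.g. "ssd", "disk", "cloud"
    offsite: bool  # True if stored away from the primary site


def satisfies_3_2_1(copies):
    """Check for at least 3 copies, on at least 2 media, with at least 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


copies = [
    Copy("laptop", "ssd", offsite=False),
    Copy("departmental server", "disk", offsite=False),
    Copy("managed cloud service", "cloud", offsite=True),
]

print(satisfies_3_2_1(copies))      # all three conditions are met
print(satisfies_3_2_1(copies[:2]))  # only two copies, and none offsite
```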
Backup and preservation – not the same thing!
• Backups
– Used to take periodic snapshots of data in case the current version is destroyed or lost
– Backups are copies of files stored for the short or near-long term
– Often performed on a somewhat frequent schedule
• Archiving
– Used to preserve data for historical reference or potentially for use during disasters
– Archives usually hold the final version, are stored for the long term, and are generally not copied over
– Often performed at the end of a project or at major milestones
Data repositories
http://databib.org
http://service.re3data.org/search
• Does your publisher or funder suggest a repository?
• Are there data centres or databases for your discipline?
• Does your university offer support for long-term preservation?
The importance of sharing data
A mistake in a spreadsheet led to dramatically different results from those published.
These results were cited by the International Monetary Fund and the UK Treasury to justify austerity programmes.
Had the data been shared, this could have been picked up earlier.
Concerns about data sharing
In each case, the solution lies in the metadata:
• Concern: inappropriate use due to misunderstanding of research purpose or parameters
– Solution: provide rich Abstract, Purpose, Constraints and Supplemental Information where needed
• Concern: security and confidentiality of sensitive data
– Solution: the metadata does NOT contain the data; Use Constraints specify who may access the data and how
• Concern: lack of acknowledgement / credit
– Solution: specify a required data citation within the Use Constraints
• Concern: loss of data insight and competitive advantage when vying for research funding
– Solution: create a second, public version with a generalised Data Processing Description
Make data shareable
• Create robust metadata that has been checked
• Include reference information in metadata, e.g. unique IDs & properly formatted data citations
• Publish your metadata so it’s discoverable. Use portals, clearing houses, online resources…
• Package up the data and associated metadata to deposit in repositories
• License the data clearly
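The packaging step can be sketched with the Python standard library. Everything specific here is an assumption for illustration: the function name `package_for_deposit`, the `data/` folder and `metadata.json` layout, and the sample metadata fields are hypothetical, and individual repositories may expect a different structure.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def package_for_deposit(data_files, metadata, archive_path):
    """Write the data files plus a metadata.json sidecar into one zip archive."""
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, data_files):
            archive.write(path, arcname=f"data/{path.name}")
        archive.writestr("metadata.json", json.dumps(metadata, indent=2))
    return archive_path


# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "readings.csv"
    csv_path.write_text("site,temp_c\nA,11.2\n")
    metadata = {
        "title": "River temperature readings",
        "identifier": "doi:10.1234/example",  # placeholder, not a real DOI
        "licence": "CC-BY-4.0",
        "citation": "Example, A. (2016). River temperature readings.",
    }
    bundle = package_for_deposit([csv_path], metadata, Path(tmp) / "deposit.zip")
    with zipfile.ZipFile(bundle) as archive:
        names = sorted(archive.namelist())

print(names)
```

Keeping the licence and citation inside the bundled metadata means they travel with the data wherever the package is deposited.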
Licensing research data
This DCC guide outlines the pros and cons of each approach and gives practical advice on how to implement your licence:
www.dcc.ac.uk/resources/how-guides/license-research-data
CREATIVE COMMONS LIMITATIONS
• NC (Non-Commercial): what counts as commercial?
• ND (No Derivatives): severely restricts use
Licences that include these clauses are not open licences
Horizon 2020 Open Access guidelines point to: [licence image] or [licence image]
EUDAT licensing tool
Answer questions to determine which licence(s) are
appropriate to use
http://ufal.github.io/public-license-selector
What to preserve & share
It’s not possible to keep everything. Select based on:
– What has to be kept e.g. data underlying publications
– What can’t be recreated e.g. environmental recordings
– What is potentially useful to others
– What has scientific, cultural or historical value
– What legally must be destroyed
How to select and appraise research data:
www.dcc.ac.uk/resources/how-guides/appraise-select-research-data
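The selection criteria above could be applied as a first-pass filter, as in the sketch below. The flag names and the rule that a legal duty to destroy overrides every reason to keep are assumptions made for this example; real appraisal needs institutional and legal advice.

```python
def appraise(dataset):
    """Return 'destroy', 'keep' or 'review' for a dict of boolean flags."""
    if dataset.get("must_be_destroyed"):
        return "destroy"  # assumed to override every reason to keep
    keep_reasons = (
        "underlies_publication",  # what has to be kept
        "cannot_be_recreated",    # e.g. environmental recordings
        "useful_to_others",
        "has_lasting_value",      # scientific, cultural or historical
    )
    if any(dataset.get(reason) for reason in keep_reasons):
        return "keep"
    return "review"  # no criterion matched, so a person should decide


print(appraise({"underlies_publication": True}))
print(appraise({"cannot_be_recreated": True, "must_be_destroyed": True}))
print(appraise({}))
```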
EUDAT & OPENAIRE SERVICES
Image CC-BY-NC ‘Data centre’ by Bob Mical www.flickr.com/photos/small_realm/15995555571
EUDAT services
EUDAT offers a pan-European solution, providing a generic set of services to ensure a minimum level of interoperability
Building common data services in close collaboration with 25+ communities
EUDAT B2 service suite
Covering both access and deposit, from informal data sharing to long-term archiving, and addressing identification, discoverability and computability of both long-tail and big data, EUDAT’s services will address the full lifecycle of research data
[Diagram: EUDAT services mapped onto the research data lifecycle (creating, processing, analysing, preserving, giving access to and re-using data)]
• Referencing data: PIDs
• Finding data and making data findable
• Data transfer from public data servers
• Store mutable data
• Accessing services
• Move data to HPC
OpenAIRE services:
zenodo.org
For all content types!
With GitHub integration!
Upload Describe Publish
Create communities!
https://www.openaire.eu/search
Link data to publications
OpenAIRE training and support materials
• Briefing papers, factsheets, webinars, workshops, FAQs
• Information on:
– Open Research Data Pilot
– Creating a data management plan
– Selecting a data repository
https://www.openaire.eu/opendatapilot
https://www.openaire.eu/support
www.eudat.eu www.openaire.eu
Thanks – any questions?
Contact us:
Tony Ross-Hellauer, OpenAIRE: ross-hellauer@sub.uni-goettingen.de
Sarah Jones, EUDAT: Sarah.Jones@glasgow.ac.uk
Acknowledgements:
Thanks to EUDAT colleagues Mark van de Sanden and Christine Staiger
for slides.
Content has also been repurposed from the DataONE Educational
modules, ‘Data Management’ and ‘Data Sharing’ Retrieved from
https://www.dataone.org/education-modules

WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 


Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT

  • 1. Research Data Management - an introductory webinar Tony Ross-Hellauer, OpenAIRE Sarah Jones, EUDAT This work is licensed under the Creative Commons CC-BY 4.0 licence
  • 2. Open Access Infrastructure for Research in Europe www.openaire.eu Who we are Research Data Services, Expertise & Technology https://www.eudat.eu
  • 3. • Why manage data? • RDM in Horizon 2020 (+ recent changes) • How to manage and share research data? • EUDAT and OpenAIRE services Overview
  • 4. WHY MANAGE DATA? Image CC-BY-NC-SA by Leo Reynolds www.flickr.com/photos/lwr/13442910354
  • 5. Data explosion • More and more data is being created • Issue is not creating data, but being able to navigate and use it • Data management is critical to make sure data are well-organised, understandable and reusable
  • 6. Digital data are fragile and susceptible to loss for a wide variety of reasons • Natural disaster • Facilities infrastructure failure • Storage failure • Server hardware/software failure • Application software failure • Format obsolescence • Legal encumbrance • Human error • Malicious attack • Loss of staffing competencies • Loss of institutional commitment • Loss of financial stability • Changes in user expectations Data loss Image CC BY-NC-SA 2.0 by Dave Hill https://www.flickr.com/photos/dmh650/4031607067
  • 8. Why manage data? • Make your research easier • Stop yourself drowning in irrelevant stuff • Save data for later • Avoid accusations of fraud or bad science • Share your data for re-use • Get credit for it • Meet funder/institution requirements Because well-managed data opens up opportunities for re-use, sharing and makes for better science!
  • 9. RDM IN HORIZON 2020 Image “Open Data” CC BY 2.0 by http://www.descrier.co.uk
  • 10. EC Open Research Data Pilot, Jan 2015 - • A limited, voluntary pilot (initially 8 programme areas) with opt-out and safeguards • Participating projects must: • Keep a data management plan, to be updated at regular intervals • Deposit in an open access repository: 1. the data, including associated metadata, needed to validate the results presented in scientific publications as soon as possible; 2. other data, including associated metadata, as specified and within the deadlines laid down in the data management plan
  • 11. EC Open Research Data Pilot Opt-out Reasons https://open-data.europa.eu/data/dataset/open-research-data-the-uptake-of- the-pilot-in-the-first-calls-of-horizon-2020
  • 12. Just announced! H2020 - Open Data by Default from 2017
  • 14. CREATING DATA PROCESSING DATA ANALYSING DATA PRESERVING DATA GIVING ACCESS TO DATA RE-USING DATA Research data lifecycle CREATING DATA: designing research, DMPs, planning consent, locate existing data, data collection and management, capturing and creating metadata RE-USING DATA: follow- up research, new research, undertake research reviews, scrutinising findings, teaching & learning ACCESS TO DATA: distributing data, sharing data, controlling access, establishing copyright, promoting data PRESERVING DATA: data storage, back- up & archiving, migrating to best format & medium, creating metadata and documentation ANALYSING DATA: interpreting, & deriving data, producing outputs, authoring publications, preparing for sharing PROCESSING DATA: entering, transcribing, checking, validating and cleaning data, anonymising data, describing data, manage and store data Ref: UK Data Archive: http://www.data-archive.ac.uk/create-manage/life-cycle
  • 15. • Findable – assign persistent IDs, provide rich metadata, register in a searchable resource... • Accessible – Retrievable by their ID using a standard protocol, metadata remain accessible even if data aren’t... • Interoperable – Use formal, broadly applicable languages, use standard vocabularies, qualified references... • Reusable – Rich, accurate metadata, clear licences, provenance, use of community standards... www.force11.org/group/fairgroup/fairprinciples FAIR data
  • 16. A DMP is a brief plan to define: • how the data will be created? • how it will be documented? • who will access it? • where it will be stored? • who will back it up? • whether (and how) it will be shared & preserved? DMPs are often submitted as part of grant applications, but are useful whenever researchers are creating data. Data Management Plans
  • 17. DMPonline A web-based tool to help researchers write DMPs Includes a template for Horizon 2020 Guidance from EUDAT and OpenAIRE being added https://dmponline.dcc.ac.uk
  • 18. • Metadata and documentation is needed to locate and understand research data • Think about what others would need in order to find, evaluate, understand, and reuse your data. • Get others to check the metadata to improve quality • Use standards to enable interoperability Metadata & documentation
  • 19. Metadata standards Use relevant standards for interoperability http://rd-alliance.github.io/metadata-directory
  • 20. Where to store data? • Your own drive (PC, server, flash drive, etc.) – And if you lose it? Or it breaks? • Somebody else’s drive / departmental drive • “Cloud” drive – Do they care as much about your data as you do? • Large scale infrastructure services like EUDAT
  • 21. How to backup? • 3... 2... 1... backup! – at least 3 copies of a file – on at least 2 different media – with at least 1 offsite • Use managed services where possible e.g. University filestores or infrastructure services like EUDAT rather than local or external hard drives • Ask IT teams for advice
  • 22. Backup and preservation – not the same thing! • Backups – Used to take periodic snapshots of data in case the current version is destroyed or lost – Backups are copies of files stored for short or near-long-term – Often performed on a somewhat frequent schedule • Archiving – Used to preserve data for historical reference or potentially during disasters – Archives are usually the final version, stored for long-term, and generally not copied over – Often performed at the end of a project or during major milestones
  • 23. Data repositories http://databib.org http://service.re3data.org/search • Does your publisher or funder suggest a repository? • Are there data centres or databases for your discipline? • Does your university offer support for long-term preservation?
  • 24. A mistake in a spreadsheet led to dramatically different results from those published. These results were cited by the International Monetary Fund and the UK Treasury to justify austerity programmes. Had the data been shared, this could have been picked up earlier. The importance of sharing data
  • 25. Concerns about data sharing Concern Solution inappropriate use due to misunderstanding of research purpose or parameters security and confidentiality of sensitive data lack of acknowledgement / credit loss of advantage when competing for research funding
  • 26. Concerns about data sharing. Concern: inappropriate use due to misunderstanding of research purpose or parameters. Solution: metadata. Concern: security and confidentiality of sensitive data. Solution: metadata. Concern: lack of acknowledgement / credit. Solution: metadata. Concern: loss of advantage when competing for research funding. Solution: metadata.
  • 27. Concerns about data sharing. Concern: inappropriate use due to misunderstanding of research purpose or parameters. Solution: provide rich Abstract, Purpose, Constraints and Supplemental Information where needed. Concern: security and confidentiality of sensitive data. Solution: the metadata does NOT contain the data; Use Constraints specify who may access the data and how. Concern: lack of acknowledgement / credit. Solution: specify a required data citation within the Use Constraints. Concern: loss of data insight and competitive advantage when vying for research funding. Solution: create second, public version with generalised Data Processing Description.
  • 28. Make data shareable • Create robust metadata that has been checked • Include reference information in metadata e.g. unique IDs & properly formatted data citations • Publish your metadata so it’s discoverable. Use portals, clearing houses, online resources… • Package up the data and associated metadata to deposit in repositories • License the data clearly
  • 29. www.dcc.ac.uk/resources/how-guides/license-research-data Licensing research data This DCC guide outlines the pros and cons of each approach and gives practical advice on how to implement your licence CREATIVE COMMONS LIMITATIONS NC Non-Commercial What counts as commercial? ND No Derivatives Severely restricts use These clauses are not open licenses Horizon 2020 Open Access guidelines point to: or
  • 30. EUDAT licensing tool Answer questions to determine which licence(s) are appropriate to use http://ufal.github.io/public-license-selector
  • 31. What to preserve & share It’s not possible to keep everything. Select based on: – What has to be kept e.g. data underlying publications – What can’t be recreated e.g. environmental recordings – What is potentially useful to others – What has scientific, cultural or historical value – What legally must be destroyed How to select and appraise research data: www.dcc.ac.uk/resources/how-guides/appraise-select-research-data
  • 32. EUDAT & OPENAIRE SERVICES Image CC-BY-NC ‘Data centre’ by Bob Mical www.flickr.com/photos/small_realm/15995555571
  • 33. EUDAT services EUDAT offers a pan-European solution, providing a generic set of services to ensure minimum level of interoperability Building common data services in close collaboration with 25+ communities
  • 34. EUDAT B2 service suite Covering both access and deposit, from informal data sharing to long-term archiving, and addressing identification, discoverability and computability of both long- tail and big data, EUDAT’s services will address the full lifecycle of research data
  • 35. CREATING DATA PROCESSING DATA ANALYSING DATA PRESERVING DATA GIVING ACCESS TO DATA RE-USING DATA PIDs  Referencing data: Finding data and making data findable Data Transfer from public data servers Store mutable data Accessing services Move data to HPC
  • 36. OpenAIRE services: zenodo.org For all content types! With GitHub integration! Upload Describe Publish Create communities!
  • 38. OpenAIRE training and support materials • Briefing papers, factsheets, Webinars, workshops, FAQs • Information on: • Open Research Data Pilot • Creating a data management plan • Selecting a data repository https://www.openaire.eu/opendatapilot https://www.openaire.eu/support
  • 39. www.eudat.eu www.openaire.eu Thanks – any questions? Contact us: Tony Ross-Hellauer, OpenAIRE: ross-hellauer@sub.uni-goettingen.de Sarah Jones, EUDAT: Sarah.Jones@glasgow.ac.uk Acknowledgements: Thanks to EUDAT colleagues Mark van de Sanden and Christine Staiger for slides. Content has also been repurposed from the DataONE Educational modules, ‘Data Management’ and ‘Data Sharing’ Retrieved from https://www.dataone.org/education-modules
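
The "3... 2... 1... backup!" rule from slide 21 (at least 3 copies, on at least 2 different media, with at least 1 offsite) can be written down as a small check. The following is only an illustrative sketch: the `Copy` class, the `check_321` function and all location and medium names are hypothetical and do not come from any EUDAT or OpenAIRE tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a file (all example values are hypothetical)."""
    location: str   # e.g. "laptop", "university filestore"
    medium: str     # e.g. "ssd", "network storage", "tape"
    offsite: bool   # True if the copy is held away from the primary site


def check_321(copies: list[Copy]) -> list[str]:
    """Return a list of problems; an empty list means the 3-2-1 rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need at least 3")
    media = {c.medium for c in copies}
    if len(media) < 2:
        problems.append(f"only {len(media)} media type(s), need at least 2")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")
    return problems


# Two local copies on different media: fails on copy count and offsite storage.
copies = [
    Copy("laptop", "ssd", offsite=False),
    Copy("university filestore", "network storage", offsite=False),
]
print(check_321(copies))
```

Adding a third copy held offsite, for example in a managed infrastructure service, would make the returned list empty. A check like this only describes where copies are meant to live; it does not verify that the copies exist or are intact.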

Editor's notes

  1. There are four main topics that we will discuss. First, why manage data: the changing data landscape and the issues this brings. Second, a brief overview of the evolution of the EC’s RDM policies. Third, the considerations to make when managing and sharing data. Finally, we’ll touch on EUDAT and OpenAIRE services to show how support is provided throughout the lifecycle.
  2. So let’s begin by looking at the changing data landscape.
  3. There’s been a data explosion. 1. 90% of all the data in the world has been generated over the last 2 years. 2. Scientific data output is currently increasing at an annual rate of 30%. As the amount of data being created now is growing exponentially, the biggest challenge is being able to navigate and use it. This is why data management is critical.
  4. Digital data are fragile. There are lots of ways in which data can be lost. Hardware and software can fail, formats can become obsolete, you can lose the knowledge and skills needed to understand the data, and you can lose the investment needed to keep the data accessible. Despite significant investment, data is not being managed effectively. The current estimated total global spend on research and development is $1.5 trillion, which could be at risk. Much of the data generated is lost: in one study, the odds of sourcing datasets declined by 17% each year. The same study found that 80% of datasets over 20 years old were not available.
  5. Many experimentally established "facts" don't seem to hold up to repeated investigation. Several studies have shown alarming numbers of published papers that don’t stand up to scrutiny. In one reproducibility test, over half of the psychology studies examined failed (61 of 100; Nosek et al., Science, 2015). The causes of irreproducibility are not well understood, but where the original data is available, accountability is clearly increased, because the data can be reviewed when questions arise.
  6. There are lots of reasons to manage research data. Ultimately though, it’s to make your research easier. If data are properly documented and organised, you can stop yourself drowning in irrelevant stuff and find the data when you need it, for example to validate findings. By managing your data you can also more easily share it with others to get more credit and impact. Your funder or university may also require you to explain how you will manage your data. Well-managed data opens up opportunities for re-use, integration and new science.
  7. Let’s move on to the considerations to make when managing and sharing data
  8. Introduced at the start of 2015, covering just seven work programme areas, the Horizon 2020 Open Research Data Pilot has been a big success. In the first six months of the pilot, about a third of the projects that were part of the pilot chose to opt out (65.4% of the 431 signed grant agreements participated). The most common reasons for opting out were: (1) concerns over intellectual property (37%), (2) the project did not expect to generate any data (18%), and (3) privacy/data protection concerns (18%). Of those projects that were not originally part of the pilot, 11.9% (3268 projects) nonetheless have voluntarily opted in.
  10. Let’s move on to the considerations to make when managing and sharing data
  11. This research data lifecycle is taken from the UK Data Archive. It shows you the different processes and activities you’ll go through. Creating data: This is when you’ll design the research, write Data Management Plans, negotiate consent agreements, find any existing data you want to reuse, collect/capture your data and create any associated metadata Processing data: When processing your data, you’ll be entering, transcribing, checking, validating and cleaning it, you may also need to anonymise your data, you should describe it and make sure it’s properly managed and stored. Analysing data: when you analyse your data you’ll be interpreting it and creating derived data and outputs, you’ll probably also author publications and prepare the data for deposit and sharing. Preserving data: data repositories play a key role in preserving data: they will make sure it’s properly stored and archived, they will migrate the formats and storage medium and create associated metadata and documentation to explain any changes made Access to data: it may be that you share your data via a repository or handle access requests yourself. Either way, you need to establish copyright, decide who can have access and promote the data. Re-using data: data can be re-used in follow-up studies, new research, research reviews, to evidence findings or for teaching and learning. Try to keep an open mind about the different ways in which your data could be re-used and make it as open as possible.
  12. A Data Management Plan is often written early on in the research process to determine what data will be created and how it will be managed. Sometimes you are asked for a DMP as part of a grant application, but they are useful to write regardless, as they help to develop consistent procedures from the outset.
  13. Metadata is needed to locate and understand the data. When you are deciding what information to capture, think about what others would need in order to find, evaluate, understand, and reuse your data. Also get others to check your metadata to improve the quality and make sure it’s understandable to others. Standards should be used where possible.
  14. To make sure their data can be understood by themselves, their community and others, researchers should create metadata and documentation. Metadata is basic descriptive information to help identify and understand the structure of the data, e.g. title, author... Documentation provides the wider context. It’s useful to share the methodology / workflow, software and any information needed to understand the data, e.g. an explanation of abbreviations or acronyms. There are lots of standards that can be used. The DCC started a catalogue of disciplinary metadata standards, which is now being taken forward as an international initiative via an RDA working group.
  15. There are lots of places you can store your data. It is best to use managed services where possible as they’re more resilient. If you store data on standalone computers, memory sticks or in the cloud, be mindful of the risk of loss or security breaches.
  16. If you’re responsible for backing up your own data, you want to ensure there are multiple copies, on different media with at least 1 offsite. Where possible though, you should use managed services so the backup is done automatically for you.
  17. Remember that backup and preservation are not the same thing (though the terms are often used interchangeably). Backups are performed regularly to take periodic snapshots of the data for the short to medium term, whereas archiving is preserving the final version of the data for the long-term. You should make sure your data are backed-up during the active phase of research and that any data needed for the long-term are archived.
  18. It is also important to share your data where possible, particularly to evidence your findings. This article reflects on an inadvertent error in an economics paper by Reinhart and Rogoff. Missing some rows out of an average gave drastically different results: what was published suggested that countries with 90% debt ratios see their economies shrink by 0.1%. Instead, it should have found that they grow by 2.2%, which is less than those with lower debt ratios but not a spiralling collapse. This mistake wasn’t picked up on initially as the data hadn’t been shared. The mistake fed into government policy, as the findings were used as justification for austerity measures in the UK and various other countries in the EU.
  19. Naturally, researchers may worry that the data will be taken out of context, misinterpreted or used inappropriately. They may also be concerned about maintaining the confidentiality and security of sensitive data. Business concerns may arise as well - will data users give proper credit and acknowledgement to the scientist? Will the scientist lose a competitive advantage by sharing this valuable resource? There are lots of reasons why researchers may be reluctant to share data, so what is the solution?
  20. Each of these issues can, in great part, be addressed by providing rich data documentation known as ‘metadata’.
  21. By providing metadata, the research scientist establishes the purpose, methods, sources and parameters of the data. As such, data users are given the information necessary to appropriately apply, protect and cite the data. If the metadata contains information about proprietary data processing or analysis techniques, the competitive advantage can be maintained by creating a second, more generalized, metadata record for public distribution.
  22. To make your data shareable, you should create robust metadata and seek a second opinion on this to ensure it’s understandable to others. Also include reference information so others can find your data and give you credit. The metadata should be published online and packaged up with your data to deposit in repositories.
  23. Guidance from the DCC can also help researchers to understand data licensing. This guide outlines the pros and cons of each approach, e.g. the limitations of some CC options. The OA guidelines under Horizon 2020 point to CC-0 or CC-BY as a straightforward and effective way to make it possible for others to mine, exploit and reproduce the data. See p11 at: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
  24. It might not be possible to preserve and share all your data, so you may need to make a selection. Some factors to consider are what has to be kept, for example for legal reasons or to evidence findings, and what is potentially useful to others or can’t be recreated. You may also be under obligation to destroy certain data due to consent agreements or commercial non-disclosure restrictions. The Digital Curation Centre has guidance on how to select what data to keep.
  25. Let’s close by looking briefly at the EUDAT service suite and how it helps with data management and sharing
  26. EUDAT offers a pan-European solution, providing a generic set of data services. These are being built in close collaboration with user communities.
  27. The services help researchers to store, manage and process the data throughout the active phase of research, and also help to archive data and make it discoverable to others.
  28. The B2DROP service helps you to synchronise and exchange research data, like Dropbox; B2STAGE helps you get data to computation when processing and analysing data; B2SAFE helps you to replicate the data safely; B2SHARE is a repository to archive the data and share it with others; and B2FIND is a cataloguing service that allows you and others to find relevant data.
  29. Catch-all repository Multiple data types Publications Long tail of research data Citable data (DOI) Links to funding, pubs, data, software
  30. Should happen automatically thanks to our data-literature interlinking services But where it doesn’t, you
  31. Thanks
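
The notes above recommend metadata that carries an identifier, a licence, use constraints and a data citation, and that is checked before publication. A minimal sketch of such a check follows. The field names and `REQUIRED_FIELDS` are assumptions made for illustration only and are not a real metadata standard; a real project should take its fields from a relevant community standard. All record values, including the DOI, are placeholders.

```python
import json

# Hypothetical minimal field set, loosely based on the points in the notes
# (IDs, licence, citation, constraints). This is NOT a real metadata standard.
REQUIRED_FIELDS = (
    "title", "creators", "identifier", "description",
    "licence", "use_constraints", "citation",
)


def missing_fields(record: dict) -> list[str]:
    """List required fields that are absent or empty in the record."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]


# Placeholder values throughout; the DOI below is not a real identifier.
record = {
    "title": "Example survey dataset",
    "creators": ["A. Researcher"],
    "identifier": "doi:10.0000/placeholder",
    "description": "Purpose, methods and sources of the data.",
    "licence": "CC-BY-4.0",
    "use_constraints": "",  # left empty on purpose to show the check
    "citation": "A. Researcher (2016). Example survey dataset.",
}

print(missing_fields(record))        # fields still to be filled in
print(json.dumps(record, indent=2))  # the record as it might be deposited
```

A completeness check like this catches empty fields only. It cannot tell whether the description is understandable to others, which is why the notes suggest asking someone else to review the metadata.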