Preparing your own data for future re-use: data management and the FAIR principles
Martin Donnelly
Digital Curation Centre
University of Edinburgh
Sheffield Hallam University, 5 April 2017
The Digital Curation Centre (DCC)
• UK national centre of expertise in digital preservation and data management, established 2004
• Principal audience is the UK higher education sector, but we increasingly work further afield (continental Europe, North America, South Africa, Asia…)
• Provide guidance, training, tools (e.g. DMPonline) and other services on all aspects of research data management and Open Science
• Now offering tailored consultancy/training
• Organise national and international events and webinars (International Digital Curation Conference, Research Data Management Forum)
Contents
1. Overview
2. Recap:	why	does	data	need	managing?	
3. The	FAIR	principles
4. Principles	into	practice:	data	in	Horizon	2020
5. FAIR	data,	step-by-step
6. References/resources
Overview
• As Open Access to publications became normal (if not ubiquitous), scholarly attention turned to the data underpinning the written outputs of research, which is now considered a first-class research output in its own right. The development of OA and of research data management (RDM) are closely linked as part of a broader trend in research, sometimes termed ‘Open Science’ or ‘Open Research’
• “The European Commission is now moving beyond open access towards the more inclusive area of open science. Elements of open science will gradually feed into the shaping of a policy for Responsible Research and Innovation and will contribute to the realisation of the European Research Area and the Innovation Union, the two main flagship initiatives for research and innovation”
http://ec.europa.eu/research/swafs/index.cfm?pg=policy&lib=science
• The EC’s data expectations are based on the framework of the FAIR principles, which state that data (and metadata) should ideally be Findable, Accessible, Interoperable and Reusable
The old way of doing research
1. Researcher collects data (information)
2. Researcher interprets/synthesises data
3. Researcher writes paper based on data
4. Paper is published (and preserved)
5. Data is left to benign neglect, and eventually ceases to be accessible
Without intervention, data + time = no data
Vines et al. “examined the availability of data from 516 studies between 2 and 22 years old”
- The odds of a data set being reported as extant fell by 17% per year
- Broken e-mails and obsolete storage devices were the main obstacles to data sharing
- Policies mandating data archiving at publication are clearly needed
“The current system of leaving data with authors means that almost all of it is lost over time, unavailable for validation of the original results or to use for entirely new purposes” according to Timothy Vines, one of the researchers. This underscores the need for intentional management of data from all disciplines and opened our conversation on potential roles for librarians in this arena. (“80 Percent of Scientific Data Gone in 20 Years”, HNGN, Dec. 20, 2013, http://www.hngn.com/articles/20083/20131220/80-percent-of-scientific-data-gone-in-20-years.htm)
Vines et al., The Availability of Research Data Declines Rapidly with Article Age, Current Biology (2014), http://dx.doi.org/10.1016/j.cub.2013.11.014
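To get a feel for what a 17% annual fall in odds means, the arithmetic can be sketched as below. This is an illustration only: the starting probability of 90% is an assumed value, not a figure from Vines et al., and the function name is hypothetical. Note that the decline applies to the odds, not directly to the probability.

```python
def extant_probability(initial_probability, years, annual_odds_decline=0.17):
    """Probability that a data set is still extant after `years`, assuming the
    odds fall by a fixed fraction each year (17% is the figure quoted above)."""
    if not 0 < initial_probability < 1:
        raise ValueError("initial_probability must be strictly between 0 and 1")
    # Convert probability to odds, apply the yearly decline, convert back.
    odds = initial_probability / (1 - initial_probability)
    odds *= (1 - annual_odds_decline) ** years
    return odds / (1 + odds)


# Assumed starting point for illustration: a 90% chance at year 0.
for years in (0, 5, 10, 20):
    print(years, round(extant_probability(0.9, years), 3))
```

Changing the assumed starting probability shifts the curve, but the compounding effect of a steady yearly decline is the same.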
Baker, M. (2016) “1,500 scientists lift the lid on reproducibility”, Nature, 533:7604, http://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
(Aside: from data to research objects?)
• ‘Research object’ is a term that is gaining in popularity, not least in the humanities where the relevance of the term ‘data’ is not always recognised…
• Research objects can comprise any supporting material which underpins or otherwise enriches the (written) outputs of research
  • Data (numeric, written, audiovisual…)
  • Software code and algorithms
  • Workflows and methodologies
  • Slides, logs, lab books, sketchbooks, notebooks, etc
• See http://www.researchobject.org/ for more info
The new way of doing research
[Figure: the DataONE lifecycle model: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze; annotated “DEPOSIT …and RE-USE”]
N.B. other models are available…
Ellyn Montgomery, US Geological Survey
Data sharing isn’t entirely new…
from Philosophical Transactions of the Royal Society, (MDCCCLXI) (or 1861 if you’d prefer)
…but what’s “normal” is shifting
Data management is a part of good research practice.
- RCUK Policy and Code of Conduct on the Governance of Good Research Conduct
The benefits of Open / managed data
• SPEED: The research process becomes faster
• EFFICIENCY: Data collection can be funded once, and used many times for a variety of purposes
• ACCESSIBILITY: Interested third parties can (where appropriate) access and build upon publicly-funded research resources with minimal barriers to access
• IMPACT and LONGEVITY: Open publications and data receive more citations, over longer periods (see for example the recent DCC/SPARC-Europe paper, “The Open Data Citation Advantage”)
• TRANSPARENCY and QUALITY: The evidence that underpins research can be made open for anyone to scrutinise and to attempt to replicate the findings. This leads to a more robust scholarly record
• SECURITY: Not all data should be made available to everyone. Careful management reduces the risk of inappropriate disclosure.
MANAGEMENT ≠ SHARING
Open and/or Managed?
• Taking a managed and planned approach to research is not the same as making everything open to everyone
• Research data management serves one of two purposes:
  • To ensure that data remains accessible and understandable; or
  • To ensure that data is not accessible or understandable (in its raw state, by the wrong people, or at the wrong time)
• Which of these pertains will depend on the nature of the research. It is increasingly expected that publications and data (and software, algorithms, workflows etc) will be made Open by default, unless…
  • There is an ethical reason to restrict access
  • There is a public safety reason to restrict access
  • There is a commercial or contractual reason to restrict access
• In some cases, data can be made partially open (i.e. anonymised, aggregated or redacted) in order to protect these interests
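As a rough illustration of what "partially open" can mean in practice, the sketch below redacts a directly identifying field and publishes only group-level averages. The records, field names and function names are all invented for this example. Dropping a name column is rarely enough on its own to anonymise real data, so treat this as a picture of the idea and not as a disclosure-control method.

```python
from collections import defaultdict
from statistics import mean

# Invented records for illustration only.
records = [
    {"name": "Participant A", "region": "North", "score": 7},
    {"name": "Participant B", "region": "North", "score": 9},
    {"name": "Participant C", "region": "South", "score": 4},
]


def redact(record, sensitive_fields=("name",)):
    """Return a copy of the record without the listed sensitive fields."""
    return {key: value for key, value in record.items() if key not in sensitive_fields}


def aggregate_mean(rows, group_field, value_field):
    """Average `value_field` within each group, so no single row is exposed."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row[value_field])
    return {group: mean(values) for group, values in groups.items()}


redacted = [redact(record) for record in records]
summary = aggregate_mean(records, "region", "score")
print(redacted)
print(summary)
```

The original records are left untouched, so the full data can stay under managed access while the redacted or aggregated view is shared.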
Unanticipated data re-use
Ships’ log books build picture of climate change, 14 October 2010
You can now help scientists understand the climate of the past and unearth new historical information by revisiting the voyages of First World War Royal Navy warships.
Visitors to OldWeather.org will be able to retrace the routes taken by any of 280 Royal Navy ships. These include historic vessels such as HMS Caroline, the last survivor of the 1916 Battle of Jutland still afloat. By transcribing information about the weather and interesting events from images of each ship's logbook, web volunteers will help scientists build a more accurate picture of how our climate has changed over the last century.
http://www.nationalarchives.gov.uk/news/503.htm
Detail from Royal Navy Recruitment poster, RNVR Signals branch, 1917 (Catalogue reference: ADM 1/8331)
Endeavour, 1768-71 (Captain Cook)
HMS Beagle, 1830-34
HMS Torch, 1918
Unanticipated data mis-use?
Controversial FOI requests to…
- University of East Anglia
- Queen's University Belfast
- University of Stirling
The FAIR Data Principles (0/4)
One of the grand challenges of data-intensive science is to facilitate knowledge discovery by assisting humans and machines in their discovery of, access to, integration and analysis of, task-appropriate scientific data and their associated algorithms and workflows.
FAIR is a set of guiding principles to make data
• Findable
• Accessible
• Interoperable, and
• Re-usable
The FAIR Data Principles (1/4)
To be Findable:
F1. (meta)data are assigned a globally unique and eternally persistent identifier.
F2. data are described with rich metadata.
F3. (meta)data are registered or indexed in a searchable resource.
F4. metadata specify the data identifier.
The FAIR Data Principles (2/4)
To be Accessible:
A1. (meta)data are retrievable by their identifier using a standardized communications protocol.
  A1.1. the protocol is open, free, and universally implementable.
  A1.2. the protocol allows for an authentication and authorization procedure, where necessary.
A2. metadata are accessible, even when the data are no longer available.
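One common way to meet A1 is to resolve a DOI over HTTPS through the doi.org resolver. The sketch below only builds the resolver URL and makes no network request; the function name is hypothetical, and the DOI used is the one cited for Vines et al. earlier in these slides.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Turn a bare DOI (optionally prefixed with 'doi:') into a resolver URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    if not doi.startswith("10."):
        raise ValueError("a DOI is expected to start with '10.'")
    # Keep '/' unescaped: it separates the DOI prefix from the suffix.
    return DOI_RESOLVER + quote(doi, safe="/")


print(doi_to_url("doi:10.1016/j.cub.2013.11.014"))
```

Because the identifier is resolved with an open, widely implemented protocol, any HTTP client can follow it, and a landing page can still be served (A2) even if the data themselves have been withdrawn.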
The FAIR Data Principles (3/4)
To be Interoperable:
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (meta)data use vocabularies that follow FAIR principles.
I3. (meta)data include qualified references to other (meta)data.
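As one possible reading of I1 to I3, the sketch below expresses a dataset description as JSON-LD using the schema.org vocabulary. The choice of schema.org is an assumption made for illustration (the slides do not prescribe a vocabulary), and the title and both DOIs are placeholders that do not resolve.

```python
import json

# A minimal schema.org "Dataset" description serialised as JSON-LD.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example logbook transcriptions",  # hypothetical title
    "identifier": "https://doi.org/10.1234/example-dataset",  # placeholder DOI
    "license": "https://creativecommons.org/licenses/by/4.0/",
    # A typed link to another record: the property name ("citation")
    # states how the two are related, which is the sense of I3.
    "citation": "https://doi.org/10.1234/related-article",  # placeholder DOI
}

serialised = json.dumps(dataset, indent=2)
print(serialised)
```

The point of the shared vocabulary is that a harvester which has never seen this repository can still tell which value is the licence and which is a reference to another work.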
The FAIR Data Principles (4/4)
To be Re-usable:
R1. (meta)data have a plurality of accurate and relevant attributes.
  R1.1. (meta)data are released with a clear and accessible data usage license.
  R1.2. (meta)data are associated with their provenance.
  R1.3. (meta)data meet domain-relevant community standards.
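A quick self-check against a few of the principles above might look like the sketch below. The mapping from principle labels to metadata field names is an assumption made for this example, not a formal schema, and the record is invented. A field being present does not by itself make data FAIR; the check only flags obvious gaps.

```python
# Illustrative mapping from some FAIR principles to metadata fields.
# The field names are assumptions for this sketch, not a standard.
FAIR_FIELD_CHECKS = {
    "F1": "identifier",
    "F2": "description",
    "R1.1": "license",
    "R1.2": "provenance",
}


def unmet_principles(metadata):
    """Return the principle labels whose field is missing or empty."""
    return [
        principle
        for principle, field in FAIR_FIELD_CHECKS.items()
        if not metadata.get(field)
    ]


record = {
    "identifier": "https://doi.org/10.1234/example-dataset",  # placeholder DOI
    "description": "Transcribed weather observations from ship logbooks.",
    "license": "CC BY 4.0",
    # "provenance" is deliberately left out to show a failed check.
}

print(unmet_principles(record))
```

Principles such as R1.3 (community standards) need human judgement and are left out of the automated check on purpose.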
FAIR in practice: European data policy
• The EC is currently midway through an extended pilot for Horizon 2020. Other projects can participate voluntarily, and opting in has been more popular than opting out
• The pilot applies as a minimum to research data underlying publications, plus any other data as decided by the project
• Participants must:
  • Create and maintain a DMP as a project deliverable
  • Deposit data in a repository
  • Make it possible for others to access, mine, exploit and reuse the data
  • Share information on the tools needed
  …unless there are compelling reasons not to do so. (And these reasons should be recorded in the DMP.)
“As open as possible, as closed as necessary”
Horizon 2020 – extended pilot (i)
The DMP should include information on:
• the handling of research data during and after the end of the project
• what data will be collected, processed and/or generated
• which methodologies and standards will be applied
• whether data will be shared/made open access, and
• how data will be curated and preserved (including after the end of the project)
Horizon 2020 – extended pilot (ii)
• Once funding is approved and the project gets underway, the first version of the DMP is submitted (as a deliverable) within the first 6 months
• The EC provides a template (in the Guidelines), use of which is recommended but voluntary
• The DMP needs to be updated over the course of the project whenever significant changes arise (e.g. new datasets created; changes in consortium policies; changes in consortium members, etc.)
• The DMP should also be updated for each periodic evaluation/assessment of the project, and at minimum in time for the final review.
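The six-month window for the first DMP can be turned into a calendar date with a small helper. This is a sketch under an assumption: it reads "within the first 6 months" as six calendar months from the project start date, whereas the binding deadline is whatever the grant agreement specifies. The function names are hypothetical.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`, clamping to
    the last day of the target month when the day does not exist there."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def first_dmp_due(project_start):
    """Latest date for the first DMP, assuming a six calendar month window."""
    return add_months(project_start, 6)


print(first_dmp_due(date(2017, 4, 5)))
```

Later updates (periodic assessments, final review) depend on project-specific dates and would need to be entered by hand.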
Making your data FAIR, step-by-step
1. Understand your funder’s policies (e.g. the EC Guidelines)
2. Create a data management plan (e.g. with DMPonline)
3. Decide which data to preserve using the DCC How-To guide and checklist, “Five Steps to Decide what Data to Keep”
4. Identify a long-term home for your data (e.g. via re3data.org)
5. Link your data to your publications with a persistent identifier (e.g. via DataCite)
  • N.B. Many repositories will do this for you
6. Investigate infrastructure services and resources, e.g. EUDAT, OpenAIRE, FOSTER, etc…
Tools and resources
References/resources
• FORCE11, “Guiding principles for findable, accessible, interoperable and re-usable data publishing”, https://www.force11.org/fairprinciples
• Guidelines on FAIR Data Management in Horizon 2020, v3.0, 26 July 2016, http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• DMPonline, https://dmponline.dcc.ac.uk/
• DCC guide, “Five Steps to Decide what Data to Keep” (2014), http://www.dcc.ac.uk/resources/how-guides/five-steps-decide-what-data-keep
• DCC/SPARC-Europe report, “The Open Data Citation Advantage”, http://sparceurope.org/open-data-citation-advantage/
• Registry of Research Data Repositories, http://www.re3data.org/
• DataCite, https://www.datacite.org/
• Workshop materials from “How EUDAT services support FAIR data” at IDCC 2017, Edinburgh, https://www.eudat.eu/events/trainings/eudat-workshop-how-eudat-services-support-fair-data-at-idcc-2017-edinburgh
• OpenAIRE, https://www.openaire.eu/
• FOSTER, https://www.fosteropenscience.eu/
Thank you: any questions?
• For more information about the DCC:
  • Website: www.dcc.ac.uk
  • Director: Kevin Ashley (kevin.ashley@ed.ac.uk)
  • General enquiries: Alex Delipalta (alexandra.delipalta@ed.ac.uk)
  • Twitter: @digitalcuration
• My contact details:
  • Email: martin.donnelly@ed.ac.uk
  • Twitter: @mkdDCC
  • Slideshare: http://www.slideshare.net/martindonnelly
This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.

  • 6. The old way of doing research 1. Researcher collects data (information) 2. Researcher interprets/synthesises data 3. Researcher writes paper based on data 4. Paper is published (and preserved) 5. Data is left to benign neglect, and eventually ceases to be accessible
  • 7. Without intervention, data + time = no data Vines et al. “examined the availability of data from 516 studies between 2 and 22 years old” - The odds of a data set being reported as extant fell by 17% per year - Broken e-mails and obsolete storage devices were the main obstacles to data sharing - Policies mandating data archiving at publication are clearly needed “The current system of leaving data with authors means that almost all of it is lost over time, unavailable for validation of the original results or to use for entirely new purposes” according to Timothy Vines, one of the researchers. This underscores the need for intentional management of data from all disciplines and opened our conversation on potential roles for librarians in this arena. (“80 Percent of Scientific Data Gone in 20 Years”, HNGN, Dec. 20, 2013, http://www.hngn.com/articles/20083/20131220/80-percent-of-scientific-data-gone-in-20-years.htm) Vines et al., The Availability of Research Data Declines Rapidly with Article Age, Current Biology (2014), http://dx.doi.org/10.1016/j.cub.2013.11.014
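To make the 17% figure concrete: a yearly fall of 17% in the odds compounds multiplicatively. The sketch below is illustrative only. The starting odds are a hypothetical value chosen for the example, not a number reported by Vines et al., and a constant yearly rate is assumed throughout.

```python
# Illustration of a 17% per-year decline in the *odds* that a data set is extant.
# ASSUMPTION: initial_odds is a hypothetical starting value, not taken from the study.

YEARLY_DECLINE = 0.17  # figure quoted on the slide


def odds_after(years: int, initial_odds: float) -> float:
    """Odds remaining after `years`, shrinking by 17% each year."""
    return initial_odds * (1.0 - YEARLY_DECLINE) ** years


def odds_to_probability(odds: float) -> float:
    """Convert odds (p / (1 - p)) back to a probability p."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    initial_odds = 9.0  # hypothetical: 90% chance the data is extant at year 0
    for years in (0, 2, 10, 20):
        odds = odds_after(years, initial_odds)
        print(f"year {years:2d}: odds {odds:6.3f}, probability {odds_to_probability(odds):.2f}")
```

Note that the decline applies to the odds, not directly to the probability, so the two should not be read interchangeably.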
  • 8. Baker, M. (2016) “1,500 scientists lift the lid on reproducibility”, Nature, 533:7604, http://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
  • 9. (Aside: from data to research objects?) • ‘Research object’ is a term that is gaining in popularity, not least in the humanities where the relevance of the term ‘data’ is not always recognised… • Research objects can comprise any supporting material which underpins or otherwise enriches the (written) outputs of research • Data (numeric, written, audiovisual…) • Software code and algorithms • Workflows and methodologies • Slides, logs, lab books, sketchbooks, notebooks, etc. • See http://www.researchobject.org/ for more info
  • 10. The new way of doing research [Figure: the DataONE lifecycle model: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze] DEPOSIT …and RE-USE
  • 11. N.B. other models are available… Ellyn Montgomery, US Geological Survey
  • 12. Data sharing isn’t entirely new… from Philosophical Transactions of the Royal Society, (MDCCCLXI) (or 1861 if you’d prefer)
  • 13. …but what’s “normal” is shifting Data management is a part of good research practice. - RCUK Policy and Code of Conduct on the Governance of Good Research Conduct
  • 14. The benefits of Open / managed data • SPEED: The research process becomes faster • EFFICIENCY: Data collection can be funded once, and used many times for a variety of purposes • ACCESSIBILITY: Interested third parties can (where appropriate) access and build upon publicly-funded research resources with minimal barriers to access • IMPACT and LONGEVITY: Open publications and data receive more citations, over longer periods (see for example recent DCC/SPARC-Europe paper, “The Open Data Citation Advantage”) • TRANSPARENCY and QUALITY: The evidence that underpins research can be made open for anyone to scrutinise, and attempt to replicate findings. This leads to a more robust scholarly record • SECURITY: Not all data should be made available to everyone. Careful management reduces the risk of inappropriate disclosure.
  • 16. Open and/or Managed? • Taking a managed and planned approach to research is not the same as making everything open to everyone • The purpose of research data management is twofold: • To ensure that data remains accessible and understandable; or • To ensure that data is not accessible or understandable (in its raw state, by the wrong people, or at the wrong time) • Which of these pertains will depend on the nature of the research. It is increasingly expected that publications and data (and software, algorithms, workflows etc) will be made Open by default, unless… • There is an ethical reason to restrict access • There is a public safety reason to restrict access • There is a commercial or contractual reason to restrict access • In some cases, data can be made partially-open (i.e. anonymised, aggregated or redacted) in order to protect these interests
  • 17. Unanticipated data re-use Ships’ log books build picture of climate change 14 October 2010 You can now help scientists understand the climate of the past and unearth new historical information by revisiting the voyages of First World War Royal Navy warships. Visitors to OldWeather.org will be able to retrace the routes taken by any of 280 Royal Navy ships. These include historic vessels such as HMS Caroline, the last survivor of the 1916 Battle of Jutland still afloat. By transcribing information about the weather and interesting events from images of each ship's logbook, web volunteers will help scientists build a more accurate picture of how our climate has changed over the last century. http://www.nationalarchives.gov.uk/news/503.htm Detail from Royal Navy Recruitment poster, RNVR Signals branch, 1917 (Catalogue reference: ADM 1/8331) Endeavour, 1768-71 (Captain Cook) HMS Beagle, 1830-34 HMS Torch, 1918
  • 20. The FAIR Data Principles (0/4) One of the grand challenges of data-intensive science is to facilitate knowledge discovery by assisting humans and machines in their discovery of, access to, integration and analysis of, task-appropriate scientific data and their associated algorithms and workflows. FAIR is a set of guiding principles to make data • Findable • Accessible • Interoperable, and • Re-usable
  • 21. The FAIR Data Principles (1/4) To be Findable: F1. (meta)data are assigned a globally unique and eternally persistent identifier. F2. data are described with rich metadata. F3. (meta)data are registered or indexed in a searchable resource. F4. metadata specify the data identifier.
  • 22. The FAIR Data Principles (2/4) To be Accessible: A1. (meta)data are retrievable by their identifier using a standardized communications protocol. A1.1. the protocol is open, free, and universally implementable. A1.2. the protocol allows for an authentication and authorization procedure, where necessary. A2. metadata are accessible, even when the data are no longer available.
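As a rough illustration of A1 (retrieval by identifier over a standardized, open protocol), the sketch below builds an HTTPS request that resolves a DOI through the doi.org resolver and asks for machine-readable metadata. It assumes the DOI's registration agency supports content negotiation for CSL JSON; the function names are hypothetical, and the request is only constructed here, not sent.

```python
# Sketch: resolve a persistent identifier (a DOI) over HTTPS and ask for
# machine-readable metadata rather than the human-readable landing page.
# ASSUMPTION: the registration agency supports DOI content negotiation for CSL JSON.
import json
import urllib.request

DOI_RESOLVER = "https://doi.org/"
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP request for a DOI's metadata."""
    return urllib.request.Request(DOI_RESOLVER + doi, headers={"Accept": CSL_JSON})


def fetch_metadata(doi: str) -> dict:
    """Send the request; needs network access and may fail if the service changes."""
    with urllib.request.urlopen(build_metadata_request(doi), timeout=30) as response:
        return json.load(response)


# DOI of the Vines et al. paper cited on slide 7
request = build_metadata_request("10.1016/j.cub.2013.11.014")
print(request.full_url)
```

HTTP(S) is used here because it is open, free and universally implementable (A1.1); an authenticated variant (A1.2) would add credentials to the same request.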
  • 23. The FAIR Data Principles (3/4) To be Interoperable: I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation. I2. (meta)data use vocabularies that follow FAIR principles. I3. (meta)data include qualified references to other (meta)data.
  • 24. The FAIR Data Principles (4/4) To be Re-usable: R1. (meta)data have a plurality of accurate and relevant attributes. R1.1. (meta)data are released with a clear and accessible data usage license. R1.2. (meta)data are associated with their provenance. R1.3. (meta)data meet domain-relevant community standards.
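One way to picture how some of these principles translate into practice is a simple completeness check on a metadata record. The sketch below is an illustration under stated assumptions: the field names and the example record are hypothetical rather than drawn from any standard schema, and only a handful of sub-principles (F1, F2, R1.1, R1.2) are covered.

```python
# Sketch: map a few FAIR sub-principles onto checks against a metadata record.
# ASSUMPTION: the field names below are hypothetical, not a standard schema.

REQUIRED_FIELDS = {
    "identifier": "F1: globally unique, persistent identifier",
    "title": "F2: rich descriptive metadata (title)",
    "description": "F2: rich descriptive metadata (description)",
    "license": "R1.1: clear and accessible usage license",
    "provenance": "R1.2: provenance",
}


def missing_fair_fields(record: dict) -> list:
    """Return the principle labels for fields that are absent or empty."""
    return [label for field, label in REQUIRED_FIELDS.items() if not record.get(field)]


example_record = {
    "identifier": "doi:10.1234/example",  # hypothetical DOI
    "title": "Interview transcripts, 2016",
    "description": "",  # present but empty, so it is reported as a gap
    "license": "CC BY 4.0",
}

for gap in missing_fair_fields(example_record):
    print("Missing:", gap)
```

A check like this only tests whether fields exist; whether the metadata is accurate, rich enough, or meets community standards (R1.3) still needs human judgement.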
  • 26. FAIR in practice: European data policy • The EC is currently midway through an extended pilot for Horizon 2020. Other projects can participate voluntarily, and opting in has been more popular than opting out • The pilot applies as a minimum to research data underlying publications, plus any other data as decided by the project • Participants must: • Create and maintain a DMP as a project deliverable • Deposit data in a repository • Make it possible for others to access, mine, exploit and reuse the data • Share information on the tools needed …unless there are compelling reasons not to do so. (And these reasons should be recorded in the DMP.) “As open as possible, as closed as necessary”
  • 27. Horizon 2020 – extended pilot (i) The DMP should include information on: • the handling of research data during and after the end of the project • what data will be collected, processed and/or generated • which methodologies and standards will be applied • whether data will be shared/made open access, and • how data will be curated and preserved (including after the end of the project)
  • 28. Horizon 2020 – extended pilot (ii) • Once funding is approved and the project gets underway, the first version of the DMP is submitted (as a deliverable) within the first 6 months • The EC provides a template (in the Guidelines), use of which is recommended but voluntary • The DMP needs to be updated over the course of the project whenever significant changes arise (e.g. new datasets created; changes in consortium policies; changes in consortium members, etc.) • The DMP should be updated for each periodic evaluation/assessment of the project, and at minimum in time for the final review.
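The six-month window for the first DMP can be turned into a calendar date with a few lines of date arithmetic. This is a minimal sketch: the project start date is hypothetical, the helper name is invented for the example, and the binding deadline is whatever the grant agreement specifies.

```python
# Sketch: latest date for the first DMP, read as "project start + 6 months".
# ASSUMPTION: the start date is hypothetical; the grant agreement sets the real deadline.
import calendar
from datetime import date

FIRST_DMP_WINDOW_MONTHS = 6  # "within the first 6 months", per the slide


def add_months(start: date, months: int) -> date:
    """Shift a date forward by whole months, clamping to the end of short months."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


project_start = date(2017, 1, 1)  # hypothetical start date
print("First DMP due by:", add_months(project_start, FIRST_DMP_WINDOW_MONTHS))
```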
  • 30. Making your data FAIR, step-by-step 1. Understand your funder’s policies (e.g. the EC Guidelines) 2. Create a data management plan (e.g. with DMPonline) 3. Decide which data to preserve using the DCC How-To guide and checklist, “Five Steps to Decide what Data to Keep” 4. Identify a long-term home for your data (e.g. via re3data.org) 5. Link your data to your publications with a persistent identifier (e.g. via DataCite) • N.B. Many repositories will do this for you 6. Investigate infrastructure services and resources, e.g. EUDAT, OpenAIRE, FOSTER, etc…
  • 33. References/resources • FORCE11, “Guiding principles for findable, accessible, interoperable and re-usable data publishing”, https://www.force11.org/fairprinciples • Guidelines on FAIR Data Management in Horizon 2020, v3.0, 26 July 2016, http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf • DMPonline, https://dmponline.dcc.ac.uk/ • DCC guide, “Five Steps to Decide what Data to Keep” (2014), http://www.dcc.ac.uk/resources/how-guides/five-steps-decide-what-data-keep • DCC/SPARC-Europe report, “The Open Data Citation Advantage”, http://sparceurope.org/open-data-citation-advantage/ • Registry of Research Data Repositories, http://www.re3data.org/ • DataCite, https://www.datacite.org/ • Workshop materials from “How EUDAT services support FAIR data” at IDCC 2017, Edinburgh, https://www.eudat.eu/events/trainings/eudat-workshop-how-eudat-services-support-fair-data-at-idcc-2017-edinburgh • OpenAIRE, https://www.openaire.eu/ • FOSTER, https://www.fosteropenscience.eu/
  • 34. Thank you: any questions? • For more information about the DCC: • Website: www.dcc.ac.uk • Director: Kevin Ashley (kevin.ashley@ed.ac.uk) • General enquiries: Alex Delipalta (alexandra.delipalta@ed.ac.uk) • Twitter: @digitalcuration • My contact details: • Email: martin.donnelly@ed.ac.uk • Twitter: @mkdDCC • Slideshare: http://www.slideshare.net/martindonnelly This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.