Data management planning: the what, the why, the who, the how
1. INTECOL 2013 workshop
22 August 2013
Martin Donnelly
Digital Curation Centre
University of Edinburgh
2. Running order
1. What and why:
   1. Thinking it through, then writing it down
   2. Carrots and sticks
2. Who: roles and responsibilities
3. How: DMPonline and DMP Tool
4. What is a DMP?
• Research funders (and other bodies) often ask for a short
statement/plan to be submitted alongside grant
applications
– in the UK, 6 of the 7 RCUK funders require something like this as
part of the Je-S application process (and NERC ask for two
versions)
– in the US, as of 2011 the NSF requires that all grant applications
include a data management plan of no more than 2 pages (also
true of NIH)
– an EC requirement/expectation is coming in FP8/Horizon 2020
• Institutions increasingly see a benefit in asking their
researchers to do this too
• As do some publishers…
5. Why do they want this?
• Planning is an important early stage in
the research lifecycle
• For researchers, it’s an opportunity to
think things through, to prepare for
longer-term preservation, and to identify
tacit assumptions which may exist in
multi-partner / multi-disciplinary
research
• From the funder’s point of view, DMPs contribute to
the grant awards process. They also help to drive
efficiencies and improve the longevity of data assets
6. Typical DMP contents
In general, funders want to know:
- What kinds of data will be created and how?
- How will the data be documented and described?
- Are there ethical and Intellectual Property issues?
- What are the arrangements for data sharing and third-party
access?
- What is the strategy for longer-term preservation?
- In short, how and why: list methods and standards, justify
decisions, and note any limitations (e.g. on access)
But they all have different requirements, and express them
in different ways…
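As a rough illustration of the list above, the typical questions can be treated as a set of sections that a draft plan either addresses or leaves blank. This is only a sketch: the section names and the `missing_sections` helper are hypothetical, and no funder's actual template is reproduced.

```python
# Hypothetical section names paraphrasing the typical funder questions;
# real funder templates word and group these differently.
TYPICAL_SECTIONS = [
    "data_types_and_creation",
    "documentation_and_description",
    "ethics_and_ip",
    "sharing_and_access",
    "long_term_preservation",
]


def missing_sections(plan: dict[str, str]) -> list[str]:
    """Return the typical sections that the draft plan leaves blank."""
    return [s for s in TYPICAL_SECTIONS if not plan.get(s, "").strip()]


draft = {
    "data_types_and_creation": "Field survey counts recorded as CSV files.",
    "sharing_and_access": "   ",  # whitespace only, so it counts as blank
}
print(missing_sections(draft))
```

A check like this only shows which headings are empty; it says nothing about whether the answers justify the decisions, which is what funders actually look for.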
8. So, who’s involved in this process?
- Just the principal investigator? (usually directly/ultimately responsible)
- But what about the research assistants?
- And the institution’s funding office?
- And the Library/IT?
- What about partners based in other institutions?
- And commercial partners?
- Etc. and so on
11. What do DMPonline & DMP Tool do?
Two free and Open Source web-based tools which
enable users to...
i. Create, store and update Data Management Plans
across the research lifecycle
ii. Meet a variety of specific data-related requirements
(from funders, institutions, publishers, etc.) in a single
place
iii. Get tailored guidance on best practice and helpful
contacts, at the point of need
iv. Customise, export and share DMPs in a variety of
formats in order to facilitate communication within and
beyond research projects
13. How it works
- User logs in and starts a new DMP by filling in some basic
details. She then chooses the template(s) she needs.
- Each template contains a subset of the checklist questions,
which have been mapped to the requirements of the funder,
institution, discipline, publishers, etc. Templates also contain
detailed guidance and links to further information to help the
user on her way.
- The DMP owner can share access with collaborators or with
support staff. They can therefore work together on different
sections of the plan.
- Plans can be exported at any stage, and in a variety of different
formats; once complete, they can be submitted alongside the
funding application.
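The workflow on this slide can be sketched in a few lines of code. This is an illustrative model only, not how DMPonline or DMP Tool is implemented: the checklist wording, the question ids, the template names (`funder_x`, `institution_y`) and the `Plan` class are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical checklist: ids and wording are invented for illustration.
CHECKLIST = {
    "q1": "What data will be created?",
    "q2": "How will the data be documented?",
    "q3": "Are there ethical or IP issues?",
    "q4": "How will the data be shared?",
    "q5": "How will the data be preserved?",
}

# Hypothetical templates: each maps one requirement source to a subset
# of the checklist questions.
TEMPLATES = {
    "funder_x": ["q1", "q4", "q5"],
    "institution_y": ["q2", "q3", "q4"],
}


@dataclass
class Plan:
    owner: str
    templates: list[str]
    answers: dict[str, str] = field(default_factory=dict)
    collaborators: set[str] = field(default_factory=set)

    def questions(self) -> list[str]:
        """Question ids needed by the chosen templates, without repeats, in checklist order."""
        wanted = {q for t in self.templates for q in TEMPLATES[t]}
        return [q for q in CHECKLIST if q in wanted]

    def share_with(self, user: str) -> None:
        """Give a collaborator or support staff member access to the plan."""
        self.collaborators.add(user)

    def export_text(self) -> str:
        """Export the plan as plain text; unanswered questions are flagged."""
        lines = []
        for q in self.questions():
            answer = self.answers.get(q, "[not yet answered]")
            lines.append(f"{CHECKLIST[q]}\n  {answer}")
        return "\n".join(lines)


plan = Plan(owner="researcher", templates=["funder_x"])
plan.answers["q1"] = "Survey data in CSV."
plan.share_with("librarian")
print(plan.questions())
print(plan.export_text())
```

Choosing both templates would surface the shared question once rather than twice, which is the point of mapping every template onto a single checklist.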
34. As part of the workflow…
DMP tools can also be used in conjunction
with other tools that support the data
management/curation lifecycle, e.g.…
- DAF (Data Asset Framework)
- DRAMBORA (Digital Repository Audit Method
Based On Risk Assessment)
- CARDIO (Collaborative Assessment of
Research Data Infrastructure and Objectives)
- LIFE
- Planets testbed tools
- CRIS and RMS systems
- DataONE tools
- and more
35. DMPonline evaluation (2012)
PRO
- Users prefer using an online tool to paper templates
- They like the sharing functionality
- Considerable demand for institutional customisations
CON
- Users generally find the current screen layout and user interface
too complex
- Some also complained about the mapping process, and said
they would prefer to answer funder questions verbatim (as is the
case with DMP Tool)
36. Future plans (short and longer term)
- We’ll be simplifying and improving the user interface ASAP
- We’ve produced a new version of the Checklist, with fewer
questions. In time this *might* develop along the lines of a
taxonomy of data management, and we’re looking into linking
up with work that CASRAI are doing (which we influenced via
the original Checklist!)
- We’re freshening up the associated guidance
- Users will be able to select new templates that don’t map to
the Checklist at all, or relate to it in a different way
- New system will retain compatibility with existing content
See DCC Director Kevin Ashley’s blog post at
http://www.dcc.ac.uk/news/future-plans-dmponline
39. A few more DMP resources
– “Dealing with Data” (Lyon, 2008)
– Analysis of Funder Policies (Jones,
2009)
– Checklist for a Data Management Plan
(Donnelly and Jones, 2009-2012)
– “How to Develop a Data Management
and Sharing Plan” (Jones, 2011)
– “Data Management Plans and
Planning” (Donnelly, 2012) in Pryor
(ed.) Managing Research Data,
London: Facet
Links to all DCC resources via http://www.dcc.ac.uk/resources/data-management-plans
40. Key things to remember
• All research projects are different, so there’s no one-size-fits-all DMP approach
• The DMP will depend upon the nature of the research
AND its context (funder, domain, institution(s) etc.)
• DMPs are useful communication tools between
multiple stakeholders
• Keep the twin goals of RDM in mind throughout, i.e.
– Keep sensitive data safe
– Enable reuse via ongoing access
41. Thank you
Image credits:
Slide 1 - http://upload.wikimedia.org/wikipedia/commons/8/88/LernaeanHydraRephael.jpg
Slide 3 - http://www.flickr.com/photos/axis/
Slide 7 - http://www.oxbridgebiotech.com/wp-content/uploads/2013/07/Confused-Doctor.jpg
Slide 9 - http://en.wikipedia.org/wiki/File:Hercules_slaying_the_Hydra.jpg
Thanks to DMPTool’s Bill Michener and Carly Strasser for the use of their slides.
This work is licensed under the Creative
Commons Attribution 2.5 UK: Scotland
License.
Martin Donnelly
Digital Curation Centre
University of Edinburgh
martin.donnelly@ed.ac.uk
Twitter: @mkdDCC
www.dcc.ac.uk/resources/data-management-plans
For other DCC services see www.dcc.ac.uk or follow us on twitter @digitalcuration and
#ukdcc
Editor's Notes
What I’m going to cover
A DMP is a basic statement of how you will create, manage, share and preserve your data. Funders expect the decisions to be justified, particularly where they are not in line with the funder’s policy (e.g. limits on data sharing).
The DCC Checklist is by nature very long, and its length was felt to be off-putting to researchers. Most of them don’t want to deal with this stuff even at a basic level, and a long Checklist with over 100 questions was not going to enjoy a large take-up. No matter how many times we said “you don’t need to fill it all in, just the bits that are relevant to you at this time” the message wasn’t going to sink in, so we developed a fairly basic wizard-style tool which asked a few questions about what stage your research was at, who your funder was, etc., and then pulled out only the most relevant questions from the Checklist to help you meet the pertinent requirements. So instead of seeing 115 questions, you might be presented with only 15 or 20. Much better. We then added functionalities like export and customisation, and some generic guidance to help with some of the more esoteric sections such as file format selection and metadata.
Similarly, DMPonline can also be used in conjunction with other tools that support the data management/curation lifecycle, be these DCC tools or tools from other sources.
So in summary, these are some of the key DMP-related resources.
The main thing to remember about DMPs is that all research projects are different, so the DMP will vary with context. Apart from a few very specialised areas like backup, there are no universal rights and wrongs. Research data management by nature involves multiple stakeholders, so planning is important as a communication mechanism. The process of producing a plan (i.e. engaging with others and deciding on the best way forward) is as important as the plan itself.