1. MANAGING AND SHARING
RESEARCH DATA
Martin Donnelly
Digital Curation Centre
University of Edinburgh
White Rose Perspectives on Research Data Management event
University of York, 24 May 2012
2. Running order
I. DEFINITIONS
II. DRIVERS
III. SOME (IN)EQUATIONS
IV. STAKEHOLDERS
V. CURRENT WORK
4. The DCC Mission
Helping to build capacity, capability
and skills in data management and
curation across the UK’s higher
education research community
– DCC Phase 3 Business
Plan
5. What is Research Data Management?
“the active management and
appraisal of data over the
lifecycle of scholarly and
scientific interest”
Data management is a part
of good research practice
Manage
Share
6. Why manage research data?
• Enable reuse
• Control costs
• Research integrity
• Research impact
– Linking data and publication
– Making data citable
• Regulatory requirements
• Maximising value
RLUKRDM -20120416 - Kevin Ashley, DCC, CC-BY
7. Where is the data in research?
The six datacentric phases of the research lifecycle
9. “Research Data Management”
- The phrase means different things to
different people
- Researchers may care enormously about
their data, so much so that they worry
about it going out into the world on its
own
- Others (e.g. those with responsibility for
compliance) may worry about it not
going out into the world, or going out
when it shouldn’t / underdressed
- Some may not even recognise the
relevance of ‘data’ in what they do
11. The data deluge
“Surfing the
Tsunami”
Science: 11 February 2011
12.
• Public good
• Preservation
• Discovery
• Confidentiality
• First use
• Recognition
• Public funding
13. RCUK Policy and Code of Conduct on the
Governance of Good Research Conduct
Unacceptable research conduct includes mismanagement
or inadequate preservation of data and/or primary
materials, including failure to:
– keep clear and accurate records of the research procedures followed and the results
obtained, including interim results;
– hold records securely in paper or electronic form;
– make relevant primary data and research evidence accessible to others for reasonable
periods after the completion of the research: data should normally be preserved and
accessible for 10 yrs (in some cases 20 yrs or longer);
– manage data according to the research funder’s data policy and all relevant legislation;
– wherever possible, deposit data permanently within a national collection.
Responsibility for proper management and preservation
of data and primary materials is shared between the
researcher and the research organisation.
14.
15. April 2011 - EPSRC Letter to VCs
EPSRC expects all those institutions it funds:
- to develop a roadmap that aligns their
policies and processes with EPSRC’s
expectations by 1st May 2012
- to be fully compliant with these
expectations by 1st May 2015
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx
18. Government pressure…
6.9 The Research Councils expect the researchers they fund to deposit
published articles or conference proceedings in an open access
repository at or around the time of publication. But this practice is
unevenly enforced. Therefore, as an immediate step, we have asked
the Research Councils to ensure the researchers they fund fulfil the
current requirements. Additionally, the Research Councils have now
agreed to invest £2 million in the development, by 2013, of a UK
‘Gateway to Research’. In the first instance this will allow ready access
to Research Council funded research information and related data but
it will be designed so that it can also include research funded by others
in due course. The Research Councils will work with their partners and
users to ensure information is presented in a readily reusable
form, using common formats and open standards.
http://www.bis.gov.uk/assets/biscore/innovation/docs/i/11-1387-innovation-and-research-strategy-for-growth.pdf
23. “Data sharing was more readily discussed by
early career researchers.”
“While many researchers are
positive about sharing data in
principle, they are almost
universally reluctant in
practice. ..... using these
data to publish results before
anyone else is the
primary way of gaining
prestige in nearly all
disciplines.”
INCREMENTAL Project
25. Rule 2. Don’t Share It All
• Data Protection Act
• Ethical concerns
• Commercial interests
26. Open to all? Case studies of openness in research
Choices are made according to context, with degrees
of openness reached according to:
• The kinds of data to be made available
• The stage in the research process
• The groups to whom data will be made available
• On what terms and conditions it will be provided
Default position of most:
• YES to protocols, software, analysis tools, methods
and techniques
• NO to making research data content freely
available to everyone
Angus Whyte, RIN/NESTA, 2010
29. “The ability to take data - to
be able to understand it, to
process it, to extract value
from it, to visualise it, to
communicate it - that’s
going to be a hugely
important skill in the next
decades.”
Hal Varian, Chief Economist, Google
30. Implications of
“Big Data” and
data science for
organisations in
all sectors
Predicts a
shortage of
190,000
data scientists
by 2019
http://www.mckinsey.com/Insights/MGI/Research/Technology_and_Innovation/Big_data_The_next_frontier_for_innovation
31.
Position                                       Location
Science Data Librarian                         Stanford
Data Management Librarian                      Oregon State
Social Sciences Data Librarian                 Brown
Data Curation Librarian                        Northeastern
Data Librarian                                 New South Wales
Research Data Management Co-ordinator          Sydney
Research Data & Digital Curation Officer       Cambridge
Data Services Librarian                        Iowa
Data Analyst                                   ANDS
Institutional Data Scientist                   Bath
32. Data roles
1. Director IS/CIO/University Librarian
2. Data librarians / data scientists /
liaison / subject / faculty librarians
3. Repository managers
4. IT/Computing Services
5. Research Support/Innovation Office
6. Doctoral Training Centres
7. PVC Research
+ Public Engagement Office
Liz Lyon, Informatics Transform, IJDC Current Issue, 2012
34. DCC institutional stakeholders
University managers
Researchers
Research support staff with a role to play in data
management, particularly those from:
• University library / repository
• IT services
• Research and innovation
• Etc
35. Institutional Engagements
With funding from HEFCE we’re:
• Working intensively with 18 HEIs to increase RDM capability
– 60 days of effort per HEI drawn from a mix of DCC staff
– Deploy DCC and external tools, approaches and best practice
• Support varies based on what each institution wants/needs
– Institution agrees a schedule of work with the DCC, and each assigns a primary
contact / programme manager
• Lessons and examples to be shared with the community
www.dcc.ac.uk/community/institutional-engagements
36. Some current IE activities
• Assessing needs
• Piloting tools, e.g. DataFlow
• RDM roadmaps
• Policy development
• Policy implementation
37. Support offered by the DCC
DCC support team
• Assess needs
• Make the case
• Develop support and services
• DAF & CARDIO assessments
• Workflow assessment
• Advocacy to senior management
• Institutional data catalogues
• Pilot RDM tools
• Guidance and training
• RDM policy development
• Customised Data Management Plans
…and support policy implementation
39. CREDITS
Images:
Slide 3 – http://www.flickr.com/photos/dougbelshaw/
Slide 9 – http://www.flickr.com/photos/chaparral/
Slide 10 – http://www.flickr.com/photos/rpmarks/
Slide 19 – http://www.flickr.com/photos/billburris/
Slide 21 – http://www.flickr.com/photos/mykl/
Slide 28 – http://www.flickr.com/photos/mugley/
Slide 33 – http://www.flickr.com/photos/chiotsrun/
Slide 38 – http://www.treehugger.com/picture-is-worth-sum-car-parts.jpg
Thanks to DCC colleagues for their slides:
Kevin Ashley, Liz Lyon, Graham Pryor, Sarah Jones
40. QUESTIONS AND CONTACTS
For more information:
– Visit http://www.dcc.ac.uk
– Email martin.donnelly@ed.ac.uk
– Twitter @mkdDCC
This work is licensed under a Creative Commons Attribution 2.5
UK: Scotland License.
Editor's notes
The DCC developed the curation lifecycle model to explain the range of activities involved in creating, preserving and sharing digital content. In RDM terms ‘curation’ is simply managing & sharing data. The DCC argues that this is just part of good research practice.
for projects of clinical or major social, environmental or heritage importance, for 20 years or longer
This is what senior university managers REALLY want to avoid. They also want to keep the government happy, and BIS wants the private sector to be able to build upon public sector data.
Point out that although funders like the ESRC have been asking for data for years (apparently since around 2001), until recently these requirements were poorly enforced: researchers had very little incentive to comply, because they still got their grant money. The funders are now under external pressure to ensure that researchers do provide data, so things are changing and the old 'lax' system is being replaced by one that is more stringent.
IT departments in particular tend to think of data management as primarily a hardware/technical problem. It’s not – the human side is about 80% of the problem.
Best place to hide a leaf is in a tree…
Consideration of roles will be explored in the breakouts
Whole range of support and services we can offer. There are typically a few things we do with each HEI. The focus depends on what they need.
We’ll talk more about some of these shortly, but first we’ll give a snapshot of the current UK ‘scene’…
Research infrastructure depends on all of its parts, and a good data management strategy relies upon coordination and communication to ensure smooth and accurate interactions