1. PREFACE: Without wishing to state the obvious
Ideas are the currency used by all learned institutions - their
generation, communication, and exchange precipitate
knowledge.
However, academic institutions are governed and influenced by
a complex set of contributors:
history, culture, politics, legislation, economics, geography, jurisdiction
They may function under specific operational parameters:
public or private
centralized or devolved
vocational or academic
large or small
2. They may be susceptible to sentient elements such as:
Opportunism
Serendipity
Reputation
Competition
Aspiration
But all things considered each is equipped with:
strategic direction
research agendas
curriculum development
support infrastructures
technological capabilities
THUS - What’s significant or appropriate for one institution
may not be so for another!
3. RDM Priorities, Stakeholders, Practice:
an Edinburgh perspective
Stuart Macdonald
CISER Data Services Librarian
Associate Data Librarian
University of Edinburgh
Emails:
srm262@cornell.edu
stuart.macdonald@ed.ac.uk
Presented at the Cornell Univ. RDMS & Data Discussion Groups Meeting, 7 March, 2014
4. EDINA & Data Library (EDL)
• EDINA and University Data Library (EDL) together
are a division within Information Services (IS) of the
University of Edinburgh.
• EDINA is a Jisc-funded National Data Centre
providing national online resources for education
and research.
• The Data Library assists Edinburgh University
users in the discovery, access, use and
management of research datasets.
5. EDINA National Data Centre
• Mission statement: “We develop and deliver online services and
digital infrastructure for UK research and education ….. drawing
upon knowledge and expertise gained through research, innovation
and development.”
• Networked access to a range of online resources for UK
FE and HE
• Services free at the point of use for use by staff and
students in learning, teaching and research through
institutional subscription
• Focus is on service but also undertake R&D (projects
services)
• delivers about 20 online services
• 5 - 10 major projects (incl. services in development)
• employs about 80 staff (Edinburgh & St Helens)
6. Data Library services and projects
• Data Library & Consultancy
• JISC-funded projects
– DISC-UK DataShare (2007-2009)
• Edinburgh DataShare Repository
– Research Data MANTRA (2010-2011)
– Data Audit Framework Implementation (2008)
7. Data Library & Consultancy
• finding …
• accessing …
• using …
• teaching …
• managing
Building relationships with researchers via postgraduate
teaching activities, research support projects, IS Skills
workshops, Research Data Management training and
through traditional reference interviews.
8. Edinburgh DataShare
An online institutional repository of multi-disciplinary
research datasets produced at the University of
Edinburgh, hosted by the Data Library
Researchers producing research data associated with a
publication, or which has potential use for other
researchers, can upload their dataset for sharing and
safekeeping. A persistent identifier and suggested citation
will be provided.
DataShare is a customised DSpace instance with a
selection of standards-compliant metadata fields useful
for dataset discovery, through Google and other search
engines via OAI-PMH.
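For readers unfamiliar with OAI-PMH: a harvester retrieves records from a repository with ordinary HTTP GET requests. The sketch below only builds such a request URL and does not contact any server. The base URL and the helper name `list_records_url` are placeholders invented for illustration, not DataShare's actual endpoint; `ListRecords` and the `oai_dc` (Dublin Core) metadata prefix are part of the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; NOT the real DataShare address.
BASE_URL = "https://repository.example.org/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting: restrict the request to one set.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


url = list_records_url(BASE_URL)
print(url)
```

A harvester would fetch this URL and parse the XML response; that step is left out here because it depends on the repository being queried.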
9. Research Data MANTRA
Partnership between:
Data Library & Institute for Academic Development
Funded by JISC Managing Research Data Programme (Sept.
2010 – Aug. 2011)
Grounded in three disciplinary contexts: social science,
clinical psychology and geoscience
Aim was to develop online interactive open learning resources
for PhD students and early career researchers that will:
• Raise awareness of the key issues related to research data
management
• Contribute to culture change & good research practice
10. Online learning module
Eight units with activities, scenarios and videos:
• Research data explained
• Data management plans
• Organising data
• File formats and transformation
• Documentation and metadata
• Storage and security
• Data protection, rights and access
• Preservation, sharing and licensing
Four data handling practicals: SPSS, NVivo, R, ArcGIS
Xerte Online Toolkits – University of Nottingham
• CC licence to allow manipulation of content for re-use with attribution
• Portable content in open standard formats (e.g. SCORM)
11. RDM Roadmap@Edinburgh
- an institutional approach
Background:
Edinburgh Data Audit Framework (DAF) Implementation Project
(May – Dec 2008)
A JISC-funded pilot project produced 6 case studies from research
units across the University, identifying research data assets and
assessing their management using the DAF methodology developed
by the Digital Curation Centre.
4 main outcomes:
• Develop online RDM guidance
• Develop RDM training
• Develop university research data management policy
• Develop services & support for RDM (in partnership with IS)
12. Drivers
• RDM policy containing 10 aspirational statements affirming both the
researchers’ and the University’s responsibilities, e.g.
– PI responsible for RDM
– University will provide RDM support and training
– University will provide RDM services (such as back-up, storage,
deposit)
– Data retained elsewhere will be registered with the University
– RDM plans must ensure that data are available for access and re-use
under appropriate safeguards
• RCUK Common Principles on Data Policy state:
– Publicly funded research data are a public good produced in the
public interest
– Data with long-term value should be preserved & accessible
for re-use
• UK Research Funders have all issued research data management
policies demanding institutions take care of and preserve data for
re-use
13. Committees
An RDM Policy Implementation Committee was set up by the
Vice Principal Knowledge Management, Professor Jeff Haywood
to implement recommendations:
• Membership from across IS
• Charged with delivering services that will meet RDM policy
objectives
• Iterate with researchers to ensure services meet the needs of
researchers
The Vice Principal also established a Steering Committee led by
Prof. Peter Clarke comprising members of Research Committee
from the 3 colleges, IS, DCC and Edinburgh Research and
Innovation (ERI).
Their role is to:
• Provide oversight to the activity of the Implementation
Committee
• Ensure services meet researcher requirements without
harming research competitiveness
14. RDM Roadmap
•EPSRC expects funded projects to have developed a roadmap
aligned with EPSRC’s RDM expectations by 1st May 2012, and to be
fully compliant with these expectations by 1st May 2015.
•The Executive Summary of the Information Services Plan, 2012-13
states, “Research data management & storage – policies, training,
curation, preservation, baseline 0.5Tb/user,” is a major IS-led
project for the year.
•The Edinburgh RDM roadmap was set out as a high level plan for its
delivery, noting objectives, outcomes, deliverables and target dates
for an 18-month period - consisting of Phase 0 planning period (May
– Sept. 2013) followed by 3 x six monthly phases up to April 2015.
•The RDM Roadmap forms the basis of the RDM Programme of
service, support and communication activities.
15. Costs !
•The roadmap follows up a business case submitted to the
University IT Committee in Summer 2012 by Jeff Haywood which
estimated one-off and recurrent costs.
•In May 2013 the Vice Principal announced that funding would be in the
order of £2 million, split between infrastructure and RDM support
and technical personnel.
•Currently the Roadmap does not include itemised costs.
•Freemium model - 1.6PB storage – allocated to Schools / Research
Institutes / Research groups
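As a rough sense of scale, the 1.6PB pool can be set against the 0.5TB per-user baseline quoted from the Information Services Plan. The sketch below is arithmetic only and rests on two assumptions the roadmap does not state: decimal units (1PB = 1,000,000GB) and the entire pool being handed out as baseline allocations.

```python
# Figures quoted on the slides, expressed in whole gigabytes to avoid
# floating-point rounding. Decimal units are an assumption.
POOL_GB = 1_600_000      # 1.6PB storage pool
BASELINE_GB = 500        # 0.5TB baseline per user


def baseline_allocations(pool_gb, baseline_gb):
    """Number of whole baseline allocations that fit in the pool."""
    return pool_gb // baseline_gb


allocations = baseline_allocations(POOL_GB, BASELINE_GB)
print(allocations)  # 3200
```

In practice the pool is allocated to Schools, Research Institutes and research groups rather than to individuals, so this is an upper bound on baseline-sized shares, not a user count.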
16.
17. General consultancy and support service throughout the
research process
Example services might include:
• Tailored awareness and advocacy activities
• Online Data Management guidance
• Training (online / F-2-F)
• Data Management consultancy (as part of grant funded research)
18. Support for planning activities that are typically performed
before research data is collected or created
Example services might include:
• Bespoke Data Management Planning (DMP) support (as
dependent upon funding body’s requirements)
• Customised DMPOnline tool (incl. boiler plate text)
19. Facilities to store data that is actively used in current
research activities, to provide access to that storage, and
tools to assist in working with the data
Example services might include:
• Accessible cross-platform Data Store
• (Remote) File Access Services (e.g. Dropbox-like)
• Data Synchronisation (e.g. mobile devices)
• Web-based Collaboration tools
• Structured Data Version Control (WebDAV)
• Central Database service
20. Tools and services to aid in the description, deposit, and
on-going management of completed research data
outputs
Example services might include:
• Data archive service (vault)
• Data asset register
• Data repository (enhanced)
• PURE Current Research Information System integrated with
other systems
21. To describe how these services fit together we have differentiated
each system by what it will hold & who can access the content.
The vertical axis differentiates systems that hold metadata only from
those that contain data files.
The horizontal axis differentiates between private and public systems.
22. • PURE is our Current Research Information System
(CRIS). It is a private system for the University to record
metadata about the research outputs it generates. (It can
hold files, and has a public interface, but this is primarily
for OA publications rather than research data).
• DataShare is our open data repository. It holds and
curates datasets (and associated metadata) for public
consumption on behalf of the data creators.
• What about the other two quadrants:
• Is there a case where we need a public store of
metadata about research data?
• or a private store of finished data sets?
• We think there is!
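The two-axis grid can be written down as a small data structure, which makes the unfilled quadrants easy to see. This is an illustrative sketch: the `System` class and helper names are invented here, and PURE is simplified to "private, metadata only" even though the slide notes it can hold files and has a public interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    holds_files: bool  # vertical axis: False means metadata only
    public: bool       # horizontal axis: False means private


# The two systems placed so far (PURE simplified, see lead-in).
SYSTEMS = [
    System("PURE", holds_files=False, public=False),
    System("DataShare", holds_files=True, public=True),
]


def empty_quadrants(systems):
    """Return the (holds_files, public) combinations no system occupies."""
    filled = {(s.holds_files, s.public) for s in systems}
    grid = [(files, pub) for files in (False, True) for pub in (False, True)]
    return [q for q in grid if q not in filled]


def describe(quadrant):
    holds_files, public = quadrant
    access = "public" if public else "private"
    content = "data files" if holds_files else "metadata"
    return f"{access} {content}"


gaps = [describe(q) for q in empty_quadrants(SYSTEMS)]
print(gaps)  # ['public metadata', 'private data files']
```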
23. Public Metadata:
Not only is it good practice for a research institution to know what
research data it is creating, some research funders require this. In
addition the University’s RDM policy requires (point 6.)
“Any data which is retained elsewhere, for example in an international data
service or domain repository should be registered with the University.”
This need can be fulfilled by a Data Asset
Register or DAR.
Private Data:
Whilst some data are suitable for sharing, some will need to have their
access controlled. Thus there is a need for a secure place for keeping
data. Once archived, these files should only be accessible by the data
creator. It should not be possible to change files, but only to create new
versions or to remove/delete them.
This need can be fulfilled by a Data Vault – a place to store date-stamped
golden copy data (associated with a paper, containing personal
information, from completed research, or subject to retention rules)
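The vault rules stated above (creator-only access, no in-place changes, new versions or deletion only, date-stamped copies) can be sketched as a toy in-memory store. The class and method names are hypothetical and say nothing about how the University's Data Vault is actually built.

```python
from datetime import datetime, timezone


class DataVault:
    """Toy vault: stored versions are immutable and creator-only."""

    def __init__(self):
        self._items = {}  # name -> list of (creator, timestamp, content)

    def deposit(self, name, creator, content):
        """Store a new date-stamped version; returns its version number."""
        versions = self._items.get(name, [])
        if versions and versions[0][0] != creator:
            raise PermissionError("only the data creator may add versions")
        stamp = datetime.now(timezone.utc)
        # bytes objects and tuples cannot be modified after creation.
        self._items[name] = versions + [(creator, stamp, bytes(content))]
        return len(versions) + 1

    def read(self, name, requester, version):
        """Return (timestamp, content) of one version, creator only."""
        if version < 1:
            raise ValueError("versions are numbered from 1")
        creator, stamp, content = self._items[name][version - 1]
        if requester != creator:
            raise PermissionError("only the data creator may read")
        return stamp, content

    def delete(self, name, requester):
        """Remove an item and all its versions, creator only."""
        if self._items[name][0][0] != requester:
            raise PermissionError("only the data creator may delete")
        del self._items[name]


vault = DataVault()
v1 = vault.deposit("survey.csv", "alice", b"id,score\n1,10\n")
v2 = vault.deposit("survey.csv", "alice", b"id,score\n1,12\n")
print(v1, v2)  # 1 2
```

There is deliberately no update method: a correction is a new deposit, so version 1 stays readable exactly as it was archived.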
24. Interoperation
Systems do not live in isolation, and become more powerful and
more likely to be used if they are integrated with each other.
However, the last thing that we want is to introduce further
systems that need to be fed with duplicate information.
This means interoperation for some or all of the components
25. Working on the premise that the DAR will become the main user interface
for entering metadata about datasets:
It may also be the main user interface for uploading files into the Data Vault.
Thus the DAR and the Vault will need to be integrated.
PURE is the master system for holding records and relationships about
research outputs including data sets. If some or all of these are being
created in the DAR, then they will need to be pushed into PURE. Equally if
data are being registered directly in PURE, it will be useful to pull this out of
PURE and into the DAR
Finally, for instances where metadata is held in the DAR and corresponding
files are held in the Data Vault, the data owner may decide to make the data
openly available. Thus the DAR should be able to deposit these as a new
item in the Data Repository.
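The push and pull between the DAR and PURE described above amounts to a two-way synchronisation keyed on a shared dataset identifier, so that nobody types the same record twice. The sketch below uses plain dictionaries as stand-ins for the two systems; the function name, identifiers and record fields are invented, and conflict handling for records edited on both sides is deliberately left out.

```python
def sync_registers(dar, pure):
    """Copy records missing on one side from the other.

    Both arguments are dicts mapping a dataset identifier to a metadata
    dict. Returns the identifiers pushed to PURE and pulled into the DAR.
    """
    pushed = [key for key in dar if key not in pure]
    pulled = [key for key in pure if key not in dar]
    for key in pushed:
        pure[key] = dict(dar[key])   # copy, so the two sides stay independent
    for key in pulled:
        dar[key] = dict(pure[key])
    return pushed, pulled


# Hypothetical example records.
dar = {"ds-1": {"title": "Registered in the DAR"}}
pure = {"ds-2": {"title": "Registered directly in PURE"}}

pushed, pulled = sync_registers(dar, pure)
print(pushed, pulled)  # ['ds-1'] ['ds-2']
```

After the call both registers hold the same set of identifiers, each record having been entered by a person only once.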
26.
27. RDM Programme Communications Plan
There are a number of different groups within the university and
outside with whom we need to communicate our RDM programme.
This will be done through a variety of communication activities.
Target Audiences
1. University of Edinburgh staff need to understand the principles of
RDM and how it is practiced and supported within the University:
• Research active staff
• IS and School/college support staff
• Other university committees and groups (research policy group, library
committee, IT committee, knowledge strategy committee)
2. External collaborators and stakeholders such as funding bodies,
Russell Group, national and international RDM community e.g. RDA,
DANS, ANDS, COAR, DPC, DCC
28. Key Messages:
Co-ordinated, Consistent, Coherent
There are three key messages which will need to be tailored
and made timely and relevant to our target audiences.
The core of each message however must be maintained to
ensure that everyone gains the same level of understanding.
1. The University is committed to and has invested in RDM
• services, training, support
2. What is meant by Research Data Management?
• definitions, data lifecycle, responsibilities
3. The University is supporting researchers
• encourage good research practice, effect culture change
29. Communication activities
• Awareness raising sessions
• In each of the 22 Schools (all researchers) – to ensure that all
research staff have the same level of knowledge about RDM
• Attend School Research Committee meeting to ensure committee
understands key RDM messages
• Regular (12 monthly) updates for IS staff on RDM Programme
• Tailored sessions for specific support groups (ERI, Helpline)
• Tailored sessions for School support staff – likely to be first port
of call for researchers
• Segment into new staff induction sessions (run by IAD)
• Website – ‘One Stop Shop’ for all university RDM materials (FAQs,
key messages, RDM planning guidance, service guides)
30. • Internal and external publications
– Service leaflets
– Data blog
– Internal publications (newsletters, annual and community reports)
– Emails to staff applying for grants
– Emails to staff who have just received a grant
– Information Packs for new grant recipients sent out by Research Office (ERI)
containing service leaflets, help guides, contact details
– Outreach and dissemination (papers, posters, demos at conferences,
seminars, articles, blog posts)
• Training for researchers
– Online via MANTRA
– Training courses (1-2 hour sessions over 6 weeks delivered by IS)
• Training for support staff
– Academic Support Librarian Training
– Training for Research Office staff
– Training for School Support staff (esp. research administrators, data
managers, IT support staff)
• Launch of Live Services – May 2014 !
• External Events – IDCC, RDMF, RDA events, conferences where
appropriate
31. Jisc MRD Programme 2009-2013
• Jisc is ‘a registered charity [.. that ..] champions the use of digital
technologies in UK education and research.’ It is funded by all the UK
tertiary education funding councils.
• Jisc funded two Managing Research Data (MRD) strands:
– Strand 1 (Oct. 2009 – Sept. 2011)
• Piloting RDM infrastructures within institutions
• Improving practice in RDM planning
• Developing tools to help institutions plan their RDM practice
• Demonstrating the benefits of citing and publishing research data
• Developing RDM Training materials – up-skilling researchers and
support staff
32. Strand 2 (Oct. 2011 – Jul. 2013) - Working closely with the
DCC this strand aimed to improve institutional RDM
capability.
• 17 institutional projects to help universities pilot or develop RDM
infrastructures to provide quality support for research
• 8 projects for helping research groups or departments fulfill the
requirements of research funders by implementing DMPs
• 2 projects to customize DCC’s DMP Online tool for institutional use
These activities were complemented by work:
• to improve the practice of data citation and explore innovative
ways of publishing research data
• to develop disciplinary focused RDM training materials for other
stakeholders, including discipline liaison librarians and research
liaison officers.
33. Objectives are to:
1. Build support structures for researchers in depositing
research publications in collaboration with Open Access Liaison
Offices in 27 member states
2. Establish and operate an electronic infrastructure for handling
scholarly communications and other value-added functionality
(annotation tools, metrics and reporting tools)
3. Work with subject communities to explore the practices,
incentives, workflows, and technologies required to deposit,
access, and manipulate research datasets associated with
research publications.
34. The European Commission recently announced its €15 Billion Horizon 2020
Programme, intended to boost the knowledge economy with a set of
"... new rules to make 'open access' a requirement for Horizon 2020, so that
publications of project results are freely accessible to all."
Open Access to Research Data
The EC has also announced a Pilot on Open Access to Research Data. This
aims to maximise and share research data produced by EC funded projects
‘for the benefit of society and the economy’.
Each pilot project will have to develop an RDM Plan indicating what kind of
data their project will create and how this data can be exploited and made
available ‘for use by other researchers, innovative industries and citizens’.
The EC has also issued Guidelines for Data Management and another set of
guidelines for Open Access to Scientific Publications and Data.
35.
36. THANK YOU!
Links:
Data Library Services: http://www.ed.ac.uk/is/data-library
EDINA: http://edina.ac.uk/
Edinburgh University Data Policy: http://www.ed.ac.uk/is/research-data-policy
Edinburgh Data Audit Framework (DAF) Implementation:
http://ie-repository.jisc.ac.uk/283/
Research Data MANTRA course: http://datalib.edina.ac.uk/mantra
Edinburgh University RDM Roadmap: http://tinyurl.com/km99a9l
Managing Research Data Strand 1:
http://www.jisc.ac.uk/whatwedo/programmes/mrd.aspx
Managing Research Data Strand 2: http://tinyurl.com/6w6g6qx
OpenAIRE: https://www.openaire.eu/en/home
Guidelines on Open Access to Scientific Publications and Research Data in
Horizon 2020 [Dec. 2013]: http://tinyurl.com/ndfrdts
Guidelines on Data Management in Horizon 2020 [Dec. 2013]:
http://tinyurl.com/leu4v7h
37. Acknowledgements:
Dr. Cuna Ekmekcioglu (Vice Principal’s
Office)
Sarah Jones (Digital Curation Centre)
Stuart Lewis (Research & Learning Services)
Kerry Miller (Research & Learning Services)
Robin Rice (EDINA & Data Library)
Dr. John Scally (Library and Collections)
Tony Weir (IT Infrastructure)
Editor's notes
Without wanting to state the obvious!
All urls and links will be available on the last slide
25 years ago
disk storage - expensive
researchers interested in working with data came together to petition the PLU and the University’s Library – wanting a university-wide provision for files that were too large to be stored on individual computing accounts
Early holdings were research data from the universities of Edinburgh, Glasgow, and Strathclyde
Division within Information Services along with Applications, IT Infrastructure, Library and Collections, User Services Division, DCC
Primarily social sciences but not exclusively so, large scale government surveys (micro data), macro-economic time series data (country-level data), Elections studies, Geospatial data, financial datasets, population census data
Free on internet / subscription / through national data centres/archives / resource discovery portals
Registration / authorisation and authentication / special conditions / budget to pay for data
SPSS, STATS, SAS, R, ArcGIS – interpret documentation/codebooks, merge and match users' data with other data (via look-up tables), subset data
Data Catalogue
Funded by JISC as part of its UK programme, Managing Research Data to develop online learning materials to assist researchers manage their digital assets.
IAD – set up to deliver training and development for postgraduate students and staff – via online course, Virtual Learning Environments, transferable skills training
Training for postgraduates and early career researchers
These were the School of Divinity; the School of History, Classics and Archaeology; the School of Biomedical Sciences; the School of Molecular and Clinical Medicine; the School of Physics and Astronomy; and the School of Geosciences
Publicly funded research data are a public good produced in the public interest; data with long-term value should be preserved & accessible for re-use; metadata for searching and description/context; embargo for privileged use by researcher; data citation/attribution
Funders have policies, responsibilities fall to the university as well as the researcher
Researchers are mobile
Institution and researcher must work together, define the responsibilities
Awareness raising within university of practicalities
1.6PB storage – allocated to Schools / research institutes / research groups
To support most research use-cases, provide off-site back-up
ECDF – cross-platform storage, HPC file storage, back-up services, version control & software source code store
THIS grant, funded THAT piece of equipment, which was used to create THIS data set, that was described in THESE journal articles
There are a wide variety of different communication activities that will be required to ensure that all audiences receive the right message, at the right time, and in an appropriate way
20 Billion Dollars
The Pilot involves key areas of Horizon 2020:
Future and Emerging Technologies
Research infrastructures – part e-Infrastructures
Leadership in enabling and industrial technologies – Information and Communication Technologies
Societal Challenge: Secure, Clean and Efficient Energy – part Smart cities and communities
Societal Challenge: Climate Action, Environment, Resource Efficiency and Raw materials – with the exception of topics in the area of raw materials
Societal Challenge: Europe in a changing world – inclusive, innovative and reflective Societies
Science with and for Society