Presentation given at Bett: Technology in Higher Education Conference, Jan 30 - 31
http://www.bettshow.com/Default.aspx?nid=15&refer=17&id=mainLnk2&id1=ssubLnk8
Improving Access to Research Data: What does changing legislation mean for you?
1. Improving Access to
Research Data:
What does changing legislation
mean for you?
Marieke Guy, Institutional Support Officer,
Digital Curation Centre, UKOLN, University of Bath, UK
Email: m.guy@ukoln.ac.uk
Twitter Id: mariekeguy
Web: http://www.dcc.ac.uk
Technology in Higher Education, 31st January 2013
UKOLN is supported by:
This work is licensed under a Creative Commons Licence
Attribution-ShareAlike 2.0
2. Research Data
Image sources:
http://www.google.co.uk/imgres?q=illumina+bgi&hl=en&client=firefox-a&hs=Jl2&rls=org.mozilla:en-GB:official&biw=1366&bih
http://www.flickr.com/photos/thinkmulejunk/352387473/
http://www.flickr.com/photos/usfsregion5/4546851916/
http://www.flickr.com/photos/wasp_barcode/4793484478/
http://www.flickr.com/photos/charleswelch/3597432481/
3. What is Research Data?
…whatever is produced in research or evidences its outputs
• Facts
• Statistics
• qualitative
• quantitative
• Not published research output
What Kinds of Data?
• Discipline specific
4. A Data Present
“Data underpins our economy and
our society - data about how
much is being spent and where,
data about how schools, hospitals
and police are performing, data
about where things are and data
about the weather.”
Tim Berners-Lee, director of W3C.
5. Big Data…and Small Data
• Big data
• DIY data
• Consumer data
• Crowd Sourced data
• Linked data
• Open data
• Databases
• Learning data
• Administrative data
“The 1000 Genomes Project generated more DNA sequence
data in its first 6 months than GenBank had
accumulated in its entire 21 year existence”
“Volume, velocity and variety”
JISC MaRDI-Gross project: “data volume is the least significant
(issue) in the present context, since ‘only’ a technical problem”
6. Away from Secrecy
“We need to move away from a
culture of secrecy and towards
a world where researchers can
benefit from sharing expertise
throughout the research
lifecycle”
Dr Malcolm Read, then executive secretary
of JISC, 2011
7. Making Public Data Accessible
“We have opened up much public
data already, but need to go much
further in making this data
accessible. We believe publicly
funded research should be freely
available. We have commissioned
independent groups of academics
and publishers to review the
availability of published research,
and to develop action plans for
making this freely available”
The Open Data Institute (ODI) will be the first of its
kind, a pioneering centre
of innovation, driven by
the UK Government’s
Open Data policy
8. Science as an Open Enterprise
• Report by Royal Society, June 2012
• Analyses the impact of new and emerging
technologies that are transforming the
conduct and communication of research
• Recommendations:
• Scientists should make data available in
data repositories
• Universities have a major role to play in
supporting open data
• Learned societies and academic bodies
should promote open science
• Science journals should require data
underpinning article
• Industry sector and regulators should
work together to share data in public
interest
9. Finch Report
• June 2012: Finch report: Accessibility, sustainability,
excellence: how to expand access to research publications
• Funded by BIS, HEFCE, RCUK & Publishers Association
• Addresses the question of how to achieve “better, faster access to
research publications”
• Recommends that UK should embrace the transition to open
access
• Recommends ‘gold’ open access journals (over ‘green’)
• Government accepted all recommendations
• HEFCE endorsed report – making open access published
research the basis for the REF from 2014
• Cost of the transition (up to £50m a year) must be covered by
the existing science budget
• Main concerns: cost, repository use, reduction in niche
funding, jobs?
10. Funding: RCUK
• RCUK Common Principles on Data Policy
– Public good: Publicly funded research should be made
openly available with as few restrictions as possible
– Planning for preservation
– Discovery: Metadata should be available and discoverable
– Confidentiality: Ensure legal, ethical and commercial
constraints assessed
– First use: Provision for a period of exclusive use
– Recognition: Acknowledge data sources
– Public funding: Must be efficient and cost-effective
• 16th July 2012 new ‘Access to Research Outputs’ policy based
on Finch report
• All publications submitted from 1 April 2013 must be
published in journals which are compliant with Research
Council policy on Open Access
11. Data Policies of Funders
http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
12. Funders Policies: EPSRC
Engineering and Physical Sciences Research
Council (EPSRC) expects all those institutions
it funds
•to develop a roadmap that aligns their policies and
processes with EPSRC’s expectations by 1st May 2012;
•to be fully compliant with these expectations by 1st May
2015.
•Compliance will be monitored and non-compliance
investigated.
•Failure to share research data could result in the imposition
of sanctions.
14. Other Moves Towards Openness
Organisation for Economic Co-operation and
Development describes data as a public good that
should be made available
European Commission Statement on open access
July 2012
•All research funded through its Horizon 2020 programme
(2014 – 2020) must be made open access.
•The commission wants to see 60% of publicly-funded
research articles in Europe available for free by 2016.
15. Government Open Agenda
David says…
FOI 'furs up' government
with repeated requests
about processes. Open
data is better.
We need to shine the
light of transparency on
everything we do
We recognise that
transparency and open data
can be a powerful tool to
help reform public services,
foster innovation and
empower citizens.
16. Research Data Management Drivers
External
• Government Open Agenda
• Public pressure – data as a public good
• Changes in funders’ data policies
• Research now becoming more global and more ‘data
intensive’ – Riding the Wave report
Internal
• Institutional need for better research integrity - REF
• Best practice
• Desire to be a ‘good researcher’ and a well-cited
researcher
18. Research Integrity
“Employers must take responsibility for the
integrity of their employees' research.
However, we question who would oversee
the employer and make sure that they are
doing the right thing. In the same way that
there is an external regulator overseeing
health and safety, we consider that there
should be an external regulator overseeing
research integrity.”
House of Commons Select Committee on Science
and Technology.
Eighth Report: Peer Review in Scientific Publications.
Published 28 July 2011
19. Data for Impact
• Research Excellence Framework (REF) measures researcher
contributions and their impact
• Has struggled in terms of its breadth when it comes to
extending beyond paper-based metrics
• Researchers are wary of spending time on activity that
doesn’t count towards the REF
• REF panels now allow submission of “a substantial,
coherent and widely admired data set or research
resource”
20. Research Data and FOI
• Recent years - some high profile cases of
FOI requests
• 3 Dec 2012 announced “Universities are
not compelled to release unpublished
research data”
• Recommendation by a House of Commons
Justice Committee report in July 2012
• Dedicated exemption, subject to both a
prejudice and public interest test
“This isn’t about transparency, it’s about timing,”
Vivienne Stern, head of political affairs at the
vice-chancellors’ group, Universities UK
21. Data Citation
•Data access raises
visibility
•Data with DOI = citeable
research output
•Data citations are good
for researchers
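As a rough illustration of "Data with DOI = citeable research output", the sketch below assembles a one-line citation from a handful of descriptive fields. It is only a sketch under stated assumptions: the field names, the sample record and the DOI "10.1234/example" are hypothetical placeholders, and the creator, year, title, publisher, identifier ordering is one common pattern rather than a format prescribed by the slides.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Minimal descriptive metadata for a dataset (hypothetical field names)."""
    creators: list[str]
    year: int
    title: str
    publisher: str
    doi: str  # placeholder value below, not a real DOI


def format_citation(ds: Dataset) -> str:
    """Build a one-line citation: Creators (Year): Title. Publisher. DOI link."""
    names = "; ".join(ds.creators)
    return f"{names} ({ds.year}): {ds.title}. {ds.publisher}. https://doi.org/{ds.doi}"


# Hypothetical sample record, for illustration only.
example = Dataset(
    creators=["Researcher, A.", "Colleague, B."],
    year=2013,
    title="Survey responses on data sharing",
    publisher="Example University Data Repository",
    doi="10.1234/example",
)
print(format_citation(example))
```

Because the DOI is rendered as a resolvable link, anyone reading the citation can follow it to the dataset's landing page, which is what makes the data findable and countable as an output.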
22. To Recap…
• The age of open access publishing and open data has finally
arrived
• Most research outputs, including underlying data, will soon
have to be published in open access format whether or not
the research has been funded externally
• Not making data accessible could result in loss of funding,
legal issues (FOI), reputational issues,
research integrity issues (inability to verify, scrutinise),
lack of visibility, data loss …
It is impossible to make data openly accessible
unless they have been properly managed
23. Challenges caused by Access
• Scale, volume – data deluge
• Complexity of data –
heterogeneous in nature
• Pace of data
• Management – storage,
infrastructure, sustainability
• Quality of data
• Reputation – FOI, DPA,
computer misuse
• Selection and appraisal
• Preservation implications
• Partnerships
• Resourcing and cost
24. What is Research Data Management?
Caring for, facilitating access to,
preserving and adding value
to research data throughout its
lifecycle.
Organisation, Resources and
Technology required to
support and sustain.
25. RDM Activities
• Producing and sharing of data with
research colleagues in collaborative
environments (internal and external)
• File naming
• Applying metadata for context and
discovery
• Caring for sensitive data
• Cleaning data for longer-term use
• Selecting mechanisms for data capture and
storage
• Selecting and appraising data for short and
longer-term retention
• Licensing data for reuse
• Developing data management plans
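Two of the activities above, file naming and applying metadata, can be sketched in a few lines. This is a minimal illustration, not a DCC recommendation: the naming pattern (project, description, date, version), the helper names and the metadata fields are all assumptions made for the example.

```python
import json
import re
from datetime import date


def slug(text: str) -> str:
    """Lowercase the text and replace runs of non-alphanumerics with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project: str, description: str, created: date,
                   version: int, ext: str) -> str:
    """Return a consistent name: project_description_YYYYMMDD_vNN.ext (assumed pattern)."""
    return f"{slug(project)}_{slug(description)}_{created:%Y%m%d}_v{version:02d}.{ext}"


def metadata_record(filename: str, creator: str, licence: str) -> dict:
    """A small sidecar record giving context for discovery (hypothetical fields)."""
    return {"file": filename, "creator": creator, "licence": licence}


name = build_filename("Soil Study", "pH readings (site A)", date(2013, 1, 31), 2, "csv")
record = metadata_record(name, "A. Researcher", "CC BY-SA")
print(name)
print(json.dumps(record, indent=2))
```

Keeping the date and version in the name makes files sort predictably, and storing the record alongside the file keeps the licence and creator with the data when it is shared.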
26. The Digital Curation Centre
• A consortium comprising units from the Universities of Bath
(UKOLN), Edinburgh (DCC Centre) and Glasgow (HATII)
• Launched 1st March 2004 as a national centre for solving
challenges in digital curation that could not be tackled by
any single institution or discipline
• Funded by JISC with additional HEFCE funding from 2011
for the provision of support to national cloud services
• Targeted institutional development
• http://www.dcc.ac.uk/
27. Advocacy and Training
• Informatics: disciplinary
metadata schema, standards,
formats, identifiers, ontologies
• Storage: file-store, cloud, data
centres, funder policy
• Access: embargoes, FOI
• How to: appraise and select, cite
data sets, develop a data
management plan, licence
research data
New: How to set up an RDM service –
coming soon!
How to cite data
29. DCC Tools
• Suite of tools to help with digital curation
30. Institutional Engagement Work
• Funded by the HEFCE through its Universities
Modernisation Fund (UMF)
• Intensive, tailored support to increase research data
management capability
• Originally 18 Higher Education Institutions (HEIs) between
Summer 2011 and Spring 2013
• Can help:
– win the support of senior management
– understand current data practices
– redesign data support services
– develop policy and training
31. So Where do I Start?
• Think about who you need involved
• Carry out audits to assess current assets, practices and
requirements, gaps in provision
• Identify quick wins while developing a long-term plan
• Avoid reinventing: try integrating, adapting, augmenting
– e.g. policies, training, storage
• Raise awareness and look at training
• Look at current support available
• Take it step by step
32. Who Do I Involve?
• Researcher(s)
• Research support officers / project staff
• Lab technicians
• Librarians / Data Centre staff
• Faculty ethics committees
• Institutional legal/IP advisors
• FOI officer / DPA officer / records manager
• Computing support
• Institutional compliance officers
• Funders
• Archive / long-term data repository
• Senior management
• Others...
33. 5 Steps to Research Data Readiness
•Step 1: Take stock
•Step 2: Let research needs
drive your strategy
•Step 3: Re-evaluate your
existing infrastructure and data
architecture
•Step 4: Get to know the new
technologies and standards
•Step 5: Bring your staff up to
speed
34. A Data Future
“The ability to take data - to
be able to understand it, to
process it, to extract value
from it, to visualise it, to
communicate it - that’s going to
be a hugely important skill in
the next decades.”
Hal Varian, Google’s chief economist.
35. Thank You
• Thanks to DCC colleagues for contributing to slide material.
Any questions?
m.guy@ukoln.ac.uk
Editor's notes
Dame Janet Finch CBE June 2012
The question of where to host large server farms is not just about heat output – I’ve also heard it said that when they looked at building server farms in very cold climates to offset the cost of cooling, they found that they had to spend an equal amount of energy on heating (no idea if this is true though). Perhaps more to the point is the availability of eco-friendly renewable energy sources – this is why Iceland has been touted as an ideal venue – it is about the geothermal energy available, not about the climate http://www.theregister.co.uk/2007/04/10/iceland_to_power_server_farms/. I guess this also suggests that hosting server farms where there is a ready supply of solar energy might also be eco-friendly (if you could generate enough power from solar), even though plenty of solar energy suggests a warmer climate. Finally, a message from the Green Computing Forum was that linking electricity usage to cost was key in terms of getting the organisation to react to the issue of energy efficiency. Re-charging electricity costs to the people using the power currently seems a better way than simply arguing that ‘for the common good’ we should be ‘greener’. So what we need is not to link digital preservation to radicalism, but commercialism – digital preservation will save you money.
Liz Lyon, “The Informatics Transform: Re-Engineering Libraries for the Data Decade”, International Journal of Digital Curation, Volume 7, Issue 1, 2012