Introduction to Research
Data Management

Stuart Macdonald
EDINA & Data Library
stuart.macdonald@ed.ac.uk




RDM Training, School of Geosciences, 7 November 2012
•   Background
•   Data Library Services & Projects
•   Research Data MANTRA
•   What is RDM
    –   Research Data Defined
    –   Data Management Planning
    –   Organising Data
    –   File Formats & Transformations
    –   Documentation & Metadata
    –   Storage & Security
    –   Data protection & Rights
    –   Preservation & Sharing
Background
EDINA and University Data Library (EDL) together
are a division within Information Services of the
University of Edinburgh.

EDINA is a JISC-funded National Data Centre
providing national online resources for education
and research - url: http://edina.ac.uk

The Data Library assists Edinburgh University
users in the discovery, access, use and
management of research datasets - url:
http://www.ed.ac.uk/is/data-library
Data Library Services and Projects

• Data Library & Consultancy
• Edinburgh DataShare
• JISC-funded projects
  – DISC-UK DataShare (2007-2009)
  – Data Audit Framework Implementation (2008)
  – Research Data MANTRA (2010-2011)
Data Library & Consultancy
•   finding…
•   accessing …
•   using …
•   teaching …
•   managing

Building relationships with researchers via
PG teaching activities, research support projects,
IS Skills workshops, Research Data Management
training and through traditional reference
interviews.
Edinburgh DataShare:
url: http://datashare.is.ed.ac.uk/

An online institutional repository of multi-disciplinary
research datasets produced by University researchers,
hosted by the Data Library.

Researchers producing research data associated with a
publication, or which has re-use potential, can upload their
dataset for sharing and safekeeping. A persistent identifier
and suggested citation will be provided.

DataShare is a customised DSpace instance with a selection
of standards-compliant metadata fields to aid discovery
through Google and other search engines via OAI-PMH.
Edinburgh Data Audit Framework
(DAF) Implementation
(May – Dec 2008)

     A JISC-funded pilot project produced 6 case studies from
     research units across the University, identifying research
     data assets and assessing their management using the DAF
     methodology developed by the Digital Curation Centre.
     4 main outcomes:

     •     Develop online RDM guidance
     •     Develop university research data management policy
     •     Develop services & support for RDM (in partnership with IS)
     •     Develop RDM training
Research Data
Management Web
Guidance

Online suite of web pages for IS
website developed in 2009 –
recently rationalised and
revamped (Oct. 2012)

url: http://tinyurl.com/pmje7o
University Research Data
Management Policy
   In spring 2010, a review commenced at the University to
   address the issue of managing the rapidly expanding volume
   and complexity of data produced by Edinburgh researchers.
   The Review was overseen by the IT & Library Committee and
   had twin tracks to look at Data Storage, and Data
   Management, Curation and Preservation.
   The Review looked at current practice in the University, in
   peer universities & internationally.
   Championed by Vice-Principal & Chief Information Officer
   Prof. Jeff Haywood, the policy for management of research
   data was approved by the University Court on 16 May 2011.
   It was one of the first RDM policies in a UK tertiary education
   institution.
IS RDM Roadmap
Drivers: University research data management policy
and EPSRC request that all institutions in receipt of their
funding should develop a roadmap for research data
management (to be implemented by May 1st 2015).

Information Services (IS) has committed to an RDM
Roadmap over an 18 month period (July 2012-Jan. 2014)
across four strategic areas.

The Roadmap will help to engage academic units and
PIs in research data management and provide services
to implement the University’s RDM Policy.

The Roadmap is a cross-divisional goal of IS supported
by: DCC, EDINA & Data Library, User Services, Library
& Collections, IT Infrastructure.
Research Data MANTRA
Research Data MANTRA
Partnership between:
Edinburgh University Data Library
Institute for Academic
Development

Funded by JISC Managing Research
Data Programme (Sept. 2010 – Aug.
2011)
Why Manage
Research Data?

Data Deluge – exponential growth in the
volume of digital research artifacts created
within academia.

Data management is one of the essential
areas of responsible conduct of research.
Project Overview
Grounded in three disciplinary contexts: social science,
clinical psychology and geoscience.

Aim was to develop online interactive open learning
resources for PhD students and early career
researchers that will:
    • Raise awareness of the key issues related to
    research data management & contribute to
    culture change.

    • Provide guidelines for good practice.


 Selling RDM as a Transferable Skill.
 (voluntary participation)
Online Learning Module
Eight units with activities, scenarios and videos:
•   Research data explained
•   Data management plans
•   Organising data
•   File formats and transformation
•   Documentation and metadata
•   Storage and security
•   Data protection, rights and access
•   Preservation, sharing and licensing

Four data handling practicals: SPSS, NVivo, R, ArcGIS
Video stories from researchers in a variety of settings
Xerte Online Toolkits – University of Nottingham
MANTRA & Research Data Lifecycle
url: http://datalib.edina.ac.uk/mantra/index.html
Online Learning Module
•   Delivered online – self-paced, available ‘anytime,
    anyplace’
•   Emphasis on practical experience and active
    engagement via online activities
•   One hour per unit
•   Read and work through scenarios & activities
    (incl. videos etc)
•   CC licence to allow manipulation of content for
    re-use with attribution
•   Portable content in open standard formats (e.g.
    SCORM)
MANTRA Dissemination
• Learning materials deposited with an open
licence in JorumOpen & Xpert.
• Learning materials to be embedded in three
participating postgraduate programmes and
made available through IAD programme for use
by all postgraduate students and early career
researchers.
• Website: http://datalib.edina.ac.uk/MANTRA
• Download/re-brand/re-purpose materials
from JorumOpen in standards-compliant
formats.
• Software modules – data handling practicals
(MS Word)
End of Part One!

Questions?
What is Research Data Management?
• An umbrella term to describe all aspects of
  planning, organising, documenting, storing
  and sharing research data.

• It also takes into account issues such as
  documentation, data protection and
  confidentiality.

• It provides a framework that supports
  researchers and their data throughout the
  course of their research and beyond.
* Research Information Network. “Stewardship of digital research data - principles and guidelines”, 30 March 2007. Viewed 30 October 2012.




 Research Data Defined
 US Office of Management and Budget in its grants management
 circular A-110 defines research data as “the recorded factual
 material commonly accepted in the scientific community as
 necessary to validate research findings.”
 The KRDS2 study (Beagrie et al, 2009) defines research data as
 ‘ collections of structured digital data from any disciplines or
 sources which can be used by academic researchers to
 undertake their research or provides an evidential record of
 their research.’
 RIN Classification*:
 • Observational – real-time, unique, usually irreplaceable
 • Experimental – from lab equipment, expensive, often
 reproducible
 • Simulation – generated from models – model & metadata more
 important than output data
 • Derived or compiled – reproducible but expensive
 • Reference - a (static or organic) collection of smaller (peer-
 reviewed) datasets, most probably published and curated
Research Data Defined

• Research data, unlike other information
  types, is collected, observed, or created, for
  purposes of analysis to produce original
  research results.

• Research data can be generated for different
  purposes and through different processes in
  a multitude of digital formats.
Research data may include the
following:
•   Documents (text, MS Word), spreadsheets
•   Lab books, field notes, diaries
•   Questionnaires, transcripts, codebooks
•   Audiotapes, videotapes, photographs, images
•   Slides, artefacts, specimens, samples
•   Collection of digital objects acquired & generated during the research
    process
•   Database contents (video, audio, text, images)
•   Models, algorithms, scripts
•   Contents of an application (input, output, logfiles for analysis software,
    schemas)
•   Methodologies, workflows
•   SOPs, protocols
By managing your data you will:
•   ensure scientific integrity of research and aid replication
•   ensure research data and records are accurate, complete, authentic
    and reliable
•   increase your research efficiency
•   save time, effort and resources in the long run
•   enhance data security and minimise the risk of data loss
•   prevent duplication of effort by enabling others to use your data
•   meet funding council grant requirements

Note:
It may also be important to manage research records (both digital &
    hardcopy) during and beyond the life of the project e.g.
    correspondence (emails); project files; grant applications; technical
    reports; research reports; consent forms; ethics applications.
Funders' Policies




url: http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
What Do Funders Want?
• timely release of data
   - once patents are filed or on (acceptance for)
     publication.

• open data sharing
   - minimal or no restrictions if possible.

• preservation of data
   - typically 5-10+ years if of long-term value.


    See the RCUK Common Principles on data policy:
    www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
Data Management & Sharing Plans
Five common questions asked by funders are:
  •   What data will be created? (format, types, volumes etc)

  •   What standards and methodologies will you use?

  •   How will you manage ethics and Intellectual Property?

  •   What are the plans for data sharing and access?

  •   What is the strategy for long-term preservation?



 DCC’s DMP Online tool: https://dmponline.dcc.ac.uk
 How to write a DMP guide:
 www.dcc.ac.uk/resources/how-guides/develop-data-plan
Data Management Plan. What is it?
A DMP is a document which describes:

    What research data will be created.

    What policies (funding, institutional, legal) apply to the data.

     What data management practices (backups, storage, access
    control, archiving) will be used.

  What facilities and equipment will be required (hard-disk
space, backup server, repository).

    Who will own the copyright and have access to the data.

    Who will be responsible for each aspect of the plan.

  How its reuse will be enabled and long-term preservation
ensured after the original research is completed.
The data management plan must be continuously maintained
and kept up-to-date throughout the course of research.
Why do we need one?
It improves your research both now and later...
•Data is often valuable for a long time!
•Results of your research may outlast your degree.
•Will you use your data throughout your career?
•Loss of physical/digital data and records.
•Loss of usefulness through records loss, media and
software obsolescence.
•Forgetting stuff!


Good practice → Better research
Why do we need one?
•Ensure research integrity (and repeatability) through keeping
better records.
•People can trace your outcomes from data collection,
through research methodology, through to results.
•Maximises usefulness of data to fellow researchers.

•Highlights how data was collected, quality controls, how
people can and should use it (access and licensing), how you
then attribute people/projects.
•Facilitates data use within collaboration.

•Can help lead to subsequent research papers.
Getting started with a DMP

  Gain an understanding of terminology & issues.

  Gain understanding of your project/community
     – Supervisor and colleagues
     – People in your School, i.e. IT Officers, Graduate Research
        Coordinator...

  Talk to your supervisor about data authorship, IP, licensing,
policies.

  Use a research data planning checklist.

  Keep it practical and simple, and don't spend too much time on it.
Where you don't know something, leave a gap, investigate, and fill it in later.

  Remember it is never finished! Review it regularly through the
course of your research.
Organising your data
•Research data files and folders need to be labelled and
organised in a systematic way so that they are both
identifiable and accessible for current and future users.

•Naming datasets according to agreed conventions should
make file naming easier for colleagues because they will not
have to ‘re-think’ the process each time.

•One benefit of consistent research data file labelling is that
files are not accidentally overwritten or deleted.

•It is important to consistently identify and distinguish
versions of data files. This ensures that a clear audit trail
exists for tracking the development of a data file and
identifying earlier versions when needed.
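
The naming and versioning points above can be sketched in a few lines of Python. This is only an illustration under assumptions: the pattern `project_description_date_vNN.ext` and the function names `build_filename` and `bump_version` are hypothetical, not a convention prescribed by this course. Agree your own pattern with your colleagues.

```python
import re
from datetime import date

VERSION_RE = re.compile(r"_v(\d+)(\.[^.]+)$")


def slug(text: str) -> str:
    """Lower-case the text and replace runs of other characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project: str, description: str, when: date,
                   version: int, ext: str) -> str:
    """Build a name following the assumed pattern project_description_date_vNN.ext."""
    return (f"{slug(project)}_{slug(description)}_{when.isoformat()}"
            f"_v{version:02d}.{ext.lstrip('.').lower()}")


def bump_version(filename: str) -> str:
    """Return the same name with its version number increased by one."""
    match = VERSION_RE.search(filename)
    if match is None:
        raise ValueError(f"no version suffix in {filename!r}")
    number, ext = int(match.group(1)), match.group(2)
    return f"{filename[:match.start()]}_v{number + 1:02d}{ext}"


first = build_filename("GeoSurvey", "Soil samples", date(2012, 11, 7), 1, "csv")
second = bump_version(first)
```

Saving each revision under the bumped name, rather than overwriting the earlier file, is what keeps the audit trail of versions.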
File Formats & Transformation
•   A file format encodes information in a computer file, enabling
    another program to access data within it

•   HTML and PDF are two examples of commonly used file formats
    and may be identified by their suffixes .html and .pdf.

•   Files are based on either text or binary encoding. The former is
    both machine- and human-readable and the latter is readable only
    by means of appropriate software.

•   Thus text files are less likely to become obsolete. Examples of file
    name extensions for these files are .txt, .csv and .por. 

•   If you convert or migrate your data files from one format to
    another, be aware of the potential risk of the loss or corruption of
    your data and take appropriate steps to avoid/minimise it.
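
One way to guard against loss during conversion is to read both versions back and compare them value by value. The sketch below assumes a simple case, tab-delimited text converted to CSV, and uses hypothetical function names (`convert_tsv_to_csv`, `same_content`); other formats would need their own readers.

```python
import csv
import io


def convert_tsv_to_csv(tsv_text: str) -> str:
    """Convert tab-delimited text to comma-separated text."""
    rows = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    # The csv writer quotes any value that itself contains a comma.
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


def same_content(tsv_text: str, csv_text: str) -> bool:
    """Check that both versions hold identical rows and values."""
    original = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    converted = list(csv.reader(io.StringIO(csv_text)))
    return original == converted


# Made-up sample data: one value contains a comma.
sample = "site\tnote\nA1\tclay, wet\nB2\tsand\n"
converted = convert_tsv_to_csv(sample)
naive = sample.replace("\t", ",")  # a careless conversion that splits "clay, wet"
```

Here `same_content(sample, converted)` holds, while the naive find-and-replace conversion fails the check because the embedded comma creates an extra column.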
File Formats & Transformation
•When compressing your data files for storage,
transportation or transmission, you encode the information
using fewer bits than the original representation. Commonly
used tools are Zip and Tar (Tar bundles files into one archive
and is usually paired with a compressor such as gzip).

•You may use the process of data normalisation. This means
converting data from one format (e.g. proprietary) into another
for use or preservation (e.g. ASCII).

•You may also need to compute new values from old in your
data, a process which is called data transformation.

•This may be necessary prior to analysing your data. Three
techniques for doing this are aggregation, anonymisation and
perturbation.
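
The three transformation techniques named above can be illustrated with a small Python sketch. Everything here is an assumption for illustration: the records are made up, the function names are hypothetical, and removing a name field is only the simplest form of anonymisation. It does not by itself guarantee that individuals cannot be re-identified through the remaining variables.

```python
import random
from collections import defaultdict
from statistics import mean

# Made-up records for illustration only.
RECORDS = [
    {"name": "Participant One", "site": "north", "ph": 6.0},
    {"name": "Participant Two", "site": "north", "ph": 7.0},
    {"name": "Participant Three", "site": "south", "ph": 5.0},
]

DIRECT_IDENTIFIERS = {"name"}  # assumption: the only identifying field here


def anonymise(rows):
    """Drop direct identifiers, keeping the analysis variables."""
    return [{k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
            for row in rows]


def aggregate_mean(rows, group_key, value_key):
    """Replace individual values with one mean per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: mean(values) for group, values in groups.items()}


def perturb(rows, value_key, scale, seed):
    """Add uniform random noise in [-scale, +scale] to one variable."""
    rng = random.Random(seed)  # seeded so the result can be reproduced
    return [{**row, value_key: row[value_key] + rng.uniform(-scale, scale)}
            for row in rows]


anonymous = anonymise(RECORDS)
site_means = aggregate_mean(anonymous, "site", "ph")
noisy = perturb(anonymous, "ph", scale=0.1, seed=42)
```

Each function returns new records and leaves the originals untouched, so the master copy of the data is never altered by the transformation.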
Documenting Data
There are many reasons why you need to document
your data:
•To help you remember the details later
•To help others understand your research
•Verify your findings
•Review your submitted publication
•Replicate your results
•Archive your data for access and re-use

Some examples of data documentation are:
•Laboratory notebooks
•Field notes
•Questionnaires
Documenting Data
Laboratory or field notebooks, for example, play an
important role in supporting claims relating to
intellectual property developed by University
researchers, and even defending claims against
scientific fraud.

Research data need to be documented at various
levels:
•Project level
•File or database level
•Variable or item level

The term metadata (‘data about data’) is often used.

The importance of metadata lies in the potential for
machine-to-machine interoperability to assist location
and access to data through search interfaces.
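
As a rough illustration of documenting data at the three levels above in a machine-readable form, the sketch below stores a metadata record as JSON and checks a data file's column names against it. The field names and the helper `undocumented_variables` are hypothetical and do not follow any particular metadata standard; a real project would normally adopt a schema used in its discipline or repository.

```python
import json

# Hypothetical metadata record with project, file and variable levels.
metadata = {
    "project": {
        "title": "Example soil survey",
        "methods": "Samples collected by hand at fixed intervals.",
    },
    "files": [
        {"name": "soil-samples_v01.csv", "format": "CSV",
         "description": "One row per sample."},
    ],
    "variables": [
        {"name": "site", "label": "Sampling site code", "units": None},
        {"name": "ph", "label": "Soil pH", "units": "pH units"},
    ],
}


def undocumented_variables(header, record):
    """Return the column names that have no entry at the variable level."""
    documented = {variable["name"] for variable in record["variables"]}
    return [name for name in header if name not in documented]


# Serialise to text so the record can be kept alongside the data files.
as_text = json.dumps(metadata, indent=2)
missing = undocumented_variables(["site", "ph", "depth_cm"], metadata)
```

In this made-up case the check reports `depth_cm` as a column that still needs a label and units.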
Secure data storage:
For the purposes of integrity, efficiency and ease of replication it is
important that research data is stored securely & backed up regularly via:

• Networked drives

    •   Fileservers managed by department / school / IS.

    •   Stored in single, secure, accessible place – regular back-ups.

• Personal computers / laptops

    •   Convenient, temporary storage - should not be used for storing
        master copies.
    •   Local drives may fail & laptops may get lost/stolen.
• External storage devices

   •   Hard drives, USB sticks, CDs, DVDs – low cost & portable BUT
       not recommended for long term storage.
   •   Longevity not guaranteed – degradation over time.
   •   Easily damaged or misplaced.
   •   Not big enough for all research data – need for use of multiple
       discs/drives.
   •   May pose a security threat.

   If USB sticks, DVDs, CDs are used for working data or extra back-up
      then:
   • Choose high quality products from reputable manufacturers.
   • Conduct regular checks to ensure media is not failing.
   • Periodically refresh data (i.e. copy to a new disc or drive).
   • Ensure confidential data is password protected / encrypted
• Remote or online back-up services - services that provide
  an online system for storing and backing-up computer files e.g.
  Dropbox, Mozy, Humyo, A-Drive

   •   Allow users to store and sync data files online and between
       computers.
   •   Employ cloud computing storage facilities (e.g. Amazon S3).
   •   Business model – first few GBs free, pay for more space.
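
Regular checks that stored copies have not degraded can be automated with checksums. The following is a minimal sketch, assuming SHA-256 and a manifest held in memory; the function names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical, and in practice the manifest would be saved to a file kept with the data.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict:
    """Map each file's relative path to its checksum."""
    return {path.relative_to(folder).as_posix(): sha256_of(path)
            for path in sorted(folder.rglob("*")) if path.is_file()}


def changed_files(folder: Path, manifest: dict) -> list:
    """List files from the manifest that are now missing or have different content."""
    current = build_manifest(folder)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)
```

Build the manifest when the data is first written to the drive or disc, then run `changed_files` at each periodic check; an empty list means every recorded file still matches its original checksum.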
Backing-up
 Considerations for back-up policy:
 • Whether all data (full back-up), or only changed data will be backed-up
   (incremental back-up)?
 • How often full and incremental back-ups will be made?
 • How much hard-drive space or DVDs will be required to maintain this
   schedule?
 • If working with sensitive data, how will it be secured (and destroyed)?
 • What back-up services are available that meet these needs?
 • Who will be responsible for ensuring back-ups are available?

 Recommendation:
 Keep at least 3 copies of your data (e.g. original, external/local, and
external/remote) and put in place a regular back-up procedure.
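
To make the difference between a full and an incremental back-up concrete, here is a minimal Python sketch of the incremental case. It assumes that a file needs copying when it is missing from the back-up, has a different size, or has a newer modification time; the function name `incremental_backup` is hypothetical, and a managed back-up service would normally be preferable to a home-made script.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, target: Path) -> list:
    """Copy only new or changed files from source to target; return their names."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        relative = src.relative_to(source)
        dst = target / relative
        if dst.exists():
            src_stat, dst_stat = src.stat(), dst.stat()
            unchanged = (src_stat.st_size == dst_stat.st_size
                         and src_stat.st_mtime_ns <= dst_stat.st_mtime_ns)
            if unchanged:
                continue  # already backed up, skip it
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(relative.as_posix())
    return copied
```

The first run against an empty target behaves like a full back-up, and later runs copy only what has changed since. The returned list can be logged as a record of what each run backed up.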
Data Security
    The means of ensuring that data is kept safe from corruption and that
    access to it is suitably controlled. It is important to consider data security
    to prevent:

•      Accidental or malicious damage / modification to data.
•      Theft of valuable or irreplaceable data.
•      Breach of confidentiality agreements and privacy laws.
•      Release of data before it has been checked for accuracy and
       authenticity.
Data Protection
•   The 1998 Data Protection Act regulates how personal data may be
    held and processed, and is aimed at organisations but also applies
    to individuals.

•   The Act recognises that personal data on its own or linked with
    other data, can reveal the identity of an actual living person.

•   You must comply with the Act from the moment you obtain
    personal data until the time when the data have been returned,
    destroyed, or perhaps transformed into a public use dataset for
    purposes of sharing.

•   Research exemption exists if you are able to process anonymised
    data instead of personal data for your research by destroying the
    “key” between the identifiers and the personally identifying
    information.
•   The Records Management Office has full guidance on its website.
Rights and access
•   Intellectual property rights (IPR) can be defined as rights
    acquired over any work created or invented with the intellectual
    effort of an individual.

•   Facts are not copyrightable but the structure of a database could
    be.

•   As a researcher, you should clarify ownership of and rights
    relating to research data before a project starts. This includes the
    right of access and the right to make copies.

•   Data licences determine the terms and conditions of use by
    another, and may accompany a purchase or subscription.

•   Open data licences attempt to “set data free” by minimising and
    standardising the terms and conditions of re-use. Conditions may
    include attribution, non-commercial use, no derivative works, or
    ‘share alike’.
Benefits of Sharing Data
• Scientific integrity – publishing & citing data in published
  research papers can allow others to replicate, validate, or
  correct results, thus improving the scientific record.
• Publicly funded research - there is a growing movement for
  making publicly funded research available to the public.
• Funding mandates - UK research councils are increasingly
  mandating data sharing so as to avoid duplication of effort and
  save costs.
• University of Edinburgh’s mission - "the creation, dissemination
  and curation of knowledge" implies transparency about the
  research that is conducted in its name.
• Preserve research data for researchers’ own future use.
THANK YOU!

Data Library services:
http://www.ed.ac.uk/is/data-library
EDINA:
http://edina.ac.uk/
Research data management guidance pages:
http://www.ed.ac.uk/is/research-data-management
Edinburgh University data policy:
http://www.ed.ac.uk/is/research-data-policy
Edinburgh Data Audit Framework (DAF) Implementation:
http://ie-repository.jisc.ac.uk/283/
Research data MANTRA course:
http://datalib.edina.ac.uk/mantra
Scenarios for Discussion

At completion of a research project the data and
records are boxed and stored in a departmental
storeroom. A participant in a research project lodges a
claim for compensation, alleging that he was not
adequately informed about the effects of the study and
does not recall giving consent. He finds that the
storeroom has since been converted into a coffee shop.
Where are the records?
Scenarios for Discussion
Sometime after completion of a research project the
researcher wishes to revisit her findings, applying a new
statistical approach. She manages to read the floppy discs
that the data were stored on, eventually gets the old
software format imported into her current statistical
package, only to find she cannot remember what many of the
variable labels (each 8 digits in length) actually mean. Has
she documented her data?

You publish a paper based on your thesis and are surprised
to find it has become a hot topic in your field. Suddenly
people are writing to you asking for the underlying data. How
much effort is required to give them a well-cleaned dataset
and adequate documentation for re-use?

More Related Content

What's hot

Scottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShareScottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShareRobin Rice
 
Building Confidence: Training Librarians in Research Data Management
Building Confidence: Training Librarians in Research Data ManagementBuilding Confidence: Training Librarians in Research Data Management
Building Confidence: Training Librarians in Research Data ManagementRobin Rice
 
IASSIST40: Data management & curation workshop
IASSIST40: Data management & curation workshopIASSIST40: Data management & curation workshop
IASSIST40: Data management & curation workshopRobin Rice
 
Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Robin Rice
 
Open Repositories and Interoperability Challenges in UK
Open Repositories and Interoperability Challenges in UKOpen Repositories and Interoperability Challenges in UK
Open Repositories and Interoperability Challenges in UKEDINA, University of Edinburgh
 
A national repository (library?) service for learning materials
A national repository (library?) service for learning materialsA national repository (library?) service for learning materials
A national repository (library?) service for learning materialsEDINA, University of Edinburgh
 
Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...EDINA, University of Edinburgh
 
Certifying CISER! A Data Seal of Approval Case Study
Certifying CISER! A Data Seal of Approval Case StudyCertifying CISER! A Data Seal of Approval Case Study
Certifying CISER! A Data Seal of Approval Case StudyHistoric Environment Scotland
 
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShare
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShareResearch Data Services @ Edinburgh: MANTRA & Edinburgh DataShare
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShareHistoric Environment Scotland
 
Building research data management services at the University of Edinburgh: a ...
Building research data management services at the University of Edinburgh: a ...Building research data management services at the University of Edinburgh: a ...
Building research data management services at the University of Edinburgh: a ...Robin Rice
 
Research Data Management (RDM) Initiatives at the University of Edinburgh
Research Data Management (RDM) Initiatives at the University of EdinburghResearch Data Management (RDM) Initiatives at the University of Edinburgh
Research Data Management (RDM) Initiatives at the University of EdinburghEDINA, University of Edinburgh
 
Six Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShareSix Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShareRobin Rice
 
Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Robin Rice
 

What's hot (20)

Scottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShareScottish Digital Library Consortium Meeting: Edinburgh DataShare
Scottish Digital Library Consortium Meeting: Edinburgh DataShare
 
Building Confidence: Training Librarians in Research Data Management
Building Confidence: Training Librarians in Research Data ManagementBuilding Confidence: Training Librarians in Research Data Management
Building Confidence: Training Librarians in Research Data Management
 
Engaging the Researcher in RDM
Engaging the Researcher in RDMEngaging the Researcher in RDM
Engaging the Researcher in RDM
 
Research Data Management and Spatial Data
Research Data Management and Spatial DataResearch Data Management and Spatial Data
Research Data Management and Spatial Data
 
IASSIST40: Data management & curation workshop
IASSIST40: Data management & curation workshopIASSIST40: Data management & curation workshop
IASSIST40: Data management & curation workshop
 
Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...
 
Open Repositories and Interoperability Challenges in UK
Open Repositories and Interoperability Challenges in UKOpen Repositories and Interoperability Challenges in UK
Open Repositories and Interoperability Challenges in UK
 
RDM Programme @ Edinburgh - Service Interoperation
RDM Programme @ Edinburgh - Service InteroperationRDM Programme @ Edinburgh - Service Interoperation
RDM Programme @ Edinburgh - Service Interoperation
 
A national repository (library?) service for learning materials
A national repository (library?) service for learning materialsA national repository (library?) service for learning materials
A national repository (library?) service for learning materials
 
Using a dumb identifier to do smart things
Using a dumb identifier to do smart thingsUsing a dumb identifier to do smart things
Using a dumb identifier to do smart things
 
Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...Supporting the development of a national Research Data Discovery Service – a ...
Supporting the development of a national Research Data Discovery Service – a ...
 
Research Data MANTRA Project at Edinburgh
Research Data MANTRA Project at EdinburghResearch Data MANTRA Project at Edinburgh
Research Data MANTRA Project at Edinburgh
 
Aggregation as Tactic
Aggregation as TacticAggregation as Tactic
Aggregation as Tactic
 
Certifying CISER! A Data Seal of Approval Case Study
Certifying CISER! A Data Seal of Approval Case StudyCertifying CISER! A Data Seal of Approval Case Study
Certifying CISER! A Data Seal of Approval Case Study
 
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShare
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShareResearch Data Services @ Edinburgh: MANTRA & Edinburgh DataShare
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShare
 
Building research data management services at the University of Edinburgh: a ...
Building research data management services at the University of Edinburgh: a ...Building research data management services at the University of Edinburgh: a ...
Building research data management services at the University of Edinburgh: a ...
 
Research Data Management (RDM) Initiatives at the University of Edinburgh
Research Data Management (RDM) Initiatives at the University of EdinburghResearch Data Management (RDM) Initiatives at the University of Edinburgh
Research Data Management (RDM) Initiatives at the University of Edinburgh
 
Six Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShareSix Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShare
 
Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...
 
RDM Programme @ Edinburgh
RDM Programme @ Edinburgh RDM Programme @ Edinburgh
RDM Programme @ Edinburgh
 

Introduction to Research Data Management

  • 1. Introduction to Research Data Management Stuart Macdonald EDINA & Data Library stuart.macdonald@ed.ac.uk RDM Training, School of Geosciences, 7 November 2012
  • 2. Background • Data Library Services & Projects • Research Data MANTRA • What is RDM – Research Data Defined – Data Management Planning – Organising Data – File Formats & Transformations – Documentation & Metadata – Storage & Security – Data protection & Rights – Preservation & Sharing
  • 3. Background EDINA and University Data Library (EDL) together are a division within Information Services of the University of Edinburgh. EDINA is a JISC-funded National Data Centre providing national online resources for education and research - url: http://edina.ac.uk The Data Library assists Edinburgh University users in the discovery, access, use and management of research datasets - url: http://www.ed.ac.uk/is/data-library
  • 4. Data Library Services and Projects • Data Library & Consultancy • Edinburgh DataShare • JISC-funded projects – DISC-UK DataShare (2007-2009) – Data Audit Framework Implementation (2008) – Research Data MANTRA (2010- 2011)
  • 5. Data Library & Consultancy • finding… • accessing … • using … • teaching … • managing Building relationships with researchers via PG teaching activities, research support projects, IS Skills workshops, Research Data Management training and through traditional reference interviews.
  • 6. Edinburgh DataShare: url: http://datashare.is.ed.ac.uk/ An online institutional repository of multi-disciplinary research datasets produced by University researchers, hosted by the Data Library. Researchers producing research data associated with a publication, or which has Re-use potential, can upload their dataset for sharing and safekeeping. A persistent identifier and suggested citation will be provided. DataShare is a customised DSpace instance with a selection of standards-compliant metadata fields to aid discovery through Google and other search engines via OAI-PMH.
  • 7. Edinburgh Data Audit Framework (DAF) Implementation (May – Dec 2008) A JISC-funded pilot project produced 6 case studies from research units across the University, identifying research data assets and assessing their management using the DAF methodology developed by the Digital Curation Centre. 4 main outcomes: • Develop online RDM guidance • Develop university research data management policy • Develop services & support for RDM (in partnership with IS) • Develop RDM training
  • 8. Research Data Management Web Guidance Online suite of web pages for IS website developed in 2009 – recently rationalised and revamped (Oct. 2012) url: http://tinyurl.com/pmje7o
  • 9. University Research Data Management Policy In spring 2010, a review commenced at the University to address the issue of managing the rapidly expanding volume and complexity of data produced by Edinburgh researchers. The Review was overseen by the IT & Library Committee and had twin tracks to look at Data Storage, and Data Management, Curation and Preservation. The Review looked at current practice in the University, in peer universities & internationally. Championed by Vice-Principal & Chief Information Officer Prof. Jeff Haywood the policy for management of research data was approved by the University Court on 16 May, 2011. One of the first RDM policies in a UK tertiary education Institution.
  • 10. IS RDM Roadmap Drivers: University research data management policy and EPSRC request that all institutions in receipt of their funding should develop a roadmap for research data management (to be implemented by May 1st 2015). Information Services (IS) has committed to an RDM Roadmap over an 18 month period (July 2012-Jan. 2014) across four strategic areas. The Roadmap will help to engage academic units and PIs in research data management and provide services to implement the University’s RDM Policy. The Roadmap is a cross-divisional goal of IS supported by: DCC, EDINA & Data Library, User Services, Library & Collections, IT Infrastructure.
  • 13. Research Data MANTRA Partnership between: Edinburgh University Data Library Institute for Academic Development Funded by JISC Managing Research Data Programme (Sept. 2010 – Aug. 2011)
  • 14. Why Manage Research Data? Data Deluge – exponential growth in the volume of digital research artifacts created within academia. Data management is one of the essential areas of responsible conduct of research.
  • 15. Project Overview Grounded in three disciplinary contexts: social science, clinical psychology and geoscience. Aim was to develop online interactive open learning resources for PhD students and early career researchers that will: • Raise awareness of the key issues related to research data management & contribute to culture change. • Provide guidelines for good practice. Selling RDM as a Transferrable Skill. (voluntary participation)
  • 16. Online Learning Module Eight units with activities, scenarios and videos: • Research data explained • Data management plans • Organising data • File formats and transformation • Documentation and metadata • Storage and security • Data protection, rights and access • Preservation, sharing and licensing Four data handling practicals: SPSS, NVivo, R, ArcGIS Video stories from researchers in variety of settings Xerte Online Toolkits – University of Nottingham
  • 17. MANTRA & Research Data Lifecycle url: http://datalib.edina.ac.uk/mantra/index.html
  • 18. Online Learning Module • Delivered online – self-paced, available ‘anytime, anyplace’ • Emphasis on practical experience and active engagement via online activities • One hour per unit • Read and work through scenarios & activities (incl. videos etc) • CC licence to allow manipulation of content for re-use with attribution • Portable content in open standard formats (e.g. SCORM)
  • 19. MANTRA Dissemination • Learning materials deposited with an open licence in JorumOpen & Xpert. • Learning materials to be embedded in three participating postgraduate programmes and made available through IAD programme for use by all postgraduate students and early career researchers. • Website: http://datalib.edina.ac.uk/MANTRA • Download/re-brand/re-purpose materials from JorumOpen in standards-compliant formats. • Software modules – data handling practicals (MS Word)
  • 20. End of Part One! Questions?
  • 21. What is Research Data Management? • An umbrella term to describe all aspects of planning, organising, documenting, storing and sharing research data. • It also takes into account issues such as documentation, data protection and confidentiality. • It provides a framework that supports researchers and their data throughout the course of their research and beyond.
  • 22. Research Data Defined US Office of Management and Budget in its grants management circular A-110 defines research data as “the recorded factual material commonly accepted in the scientific community as necessary to validate research findings.” The KRDS2 study (Beagrie et al, 2009) defines research data as ‘collections of structured digital data from any disciplines or sources which can be used by academic researchers to undertake their research or provides an evidential record of their research.’ RIN Classification*: • Observational – real-time, unique, usually irreplaceable • Experimental – from lab equipment, expensive, often reproducible • Simulation – generated from models – model & metadata more important than output data • Derived or compiled – reproducible but expensive • Reference – a (static or organic) collection of smaller (peer-reviewed) datasets, most probably published and curated * Research Information Network. “Stewardship of digital research data - principles and guidelines", 30 March 2007. Viewed 30 October 2012
  • 23. Research Data Defined • Research data, unlike other information types, is collected, observed, or created, for purposes of analysis to produce original research results. • Research data can be generated for different purposes and through different processes in a multitude of digital formats.
  • 24. Research data may include the following: • Documents (text, MS Word), spreadsheets • Lab books, field notes, diaries • Questionnaires, transcripts, codebooks • Audiotapes, videotapes, photographs, images • Slides, artefacts, specimens, samples • Collection of digital objects acquired & generated during the research process • Database contents (video, audio, text, images) • Models, algorithms, scripts • Contents of an application (input, output, logfiles for analysis software, schemas) • Methodologies, workflows • SOPs, protocols
  • 25. By managing your data you will: • ensure scientific integrity of research and aid replication • ensure research data and records are accurate, complete, authentic and reliable • increase your research efficiency • save time, effort and resources in the long run • enhance data security and minimise the risk of data loss • prevent duplication of effort by enabling others to use your data • meet funding council grant requirements Note: It may also be important to manage research records (both digital & hardcopy) during and beyond the life of the project e.g. correspondence (emails); project files; grant applications; technical reports; research reports; consent forms; ethics applications.
  • 27. What Do Funders Want? • timely release of data - once patents are filed or on (acceptance for) publication. • open data sharing - minimal or no restrictions if possible. • preservation of data - typically 5-10+ years if of long-term value. See the RCUK Common Principles on data policy: www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
  • 28. Data Management & Sharing Plans Five common questions asked by funders are: • What data will be created? (format, types, volumes etc) • What standards and methodologies will you use? • How will you manage ethics and Intellectual Property? • What are the plans for data sharing and access? • What is the strategy for long-term preservation? DCC’s DMP Online tool: https://dmponline.dcc.ac.uk How to write a DMP guide: www.dcc.ac.uk/resources/how-guides/develop-data-plan
  • 29. Data Management Plan. What is it? A DMP is a document which describes:  What research data will be created.  What policies (funding, institutional, legal) apply to the data.  What data management practices (backups, storage, access control, archiving) will be used.  What facilities and equipment will be required (hard-disk space, backup server, repository).  Who will own the copyright and have access to the data.  Who will be responsible for each aspect of the plan.  How its reuse will be enabled and long-term preservation ensured after the original research is completed. The data management plan must be continuously maintained and kept up-to-date throughout the course of research.
  • 30. Why do we need one? It improves your research both now and later... •Data is often valuable for a long time! •Results of your research may outlast your degree. •Will you use your data throughout your career? •Loss of physical/digital data and records. •Loss of usefulness through records loss, media and software obsolescence. •Forgetting stuff! Good practice → Better research
  • 31. Why do we need one? •Ensure research integrity (and repeatability) through keeping better records. •People can trace your outcomes from data collection, through research methodology, through to results. •Maximises usefulness of data to fellow researchers. •Highlights how data was collected, quality controls, how people can and should use it (access and licensing), how you then attribute people/projects. •Facilitates data use within collaboration. •Can help lead to subsequent research papers.
  • 32. Getting started with a DMP  Gain an understanding of terminology & issues.  Gain understanding of your project/community – Supervisor and colleagues – People in your School, i.e. IT Officers, Graduate Research Coordinator...  Talk to your supervisor about data authorship, IP, licensing, policies.  Use a research data planning checklist.  Keep it practical and simple, don't spend too much time. What you don't know leave gaps, investigate, fill in later.  Remember it is never finished! Review it regularly through the course of your research.
  • 33. Organising your data •Research data files and folders need to be labelled and organised in a systematic way so that they are both identifiable and accessible for current and future users. •Naming datasets according to agreed conventions should make file naming easier for colleagues because they will not have to ‘re-think’ the process each time. •One benefit of consistent research data file labelling is that files are not accidentally overwritten or deleted. •It is important to consistently identify and distinguish versions of data files. This ensures that a clear audit trail exists for tracking the development of a data file and identifying earlier versions when needed.
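The naming and versioning advice above can be sketched in a few lines of Python. This is only an illustration: the convention shown (project, description, ISO date, two-digit version) and the helper names `build_filename` and `next_version` are assumptions, not a convention prescribed by the slides.

```python
import re
from datetime import date


def build_filename(project, description, day, version, ext):
    """Compose a name such as 'geochem_soil-samples_2012-11-07_v02.csv'."""
    def slug(text):
        # Lowercase and replace runs of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{slug(project)}_{slug(description)}_{day.isoformat()}_v{version:02d}.{ext}"


def next_version(existing_names, project, description, day, ext):
    """Return the next unused version number so earlier versions are never overwritten."""
    prefix = build_filename(project, description, day, 0, ext).rsplit("_v", 1)[0]
    pattern = re.compile(re.escape(prefix) + r"_v(\d+)\." + re.escape(ext) + r"$")
    versions = [int(m.group(1)) for n in existing_names if (m := pattern.match(n))]
    return max(versions, default=0) + 1


# Hypothetical example values.
day = date(2012, 11, 7)
existing = [
    build_filename("GeoChem", "Soil samples", day, 1, "csv"),
    build_filename("GeoChem", "Soil samples", day, 2, "csv"),
]
version = next_version(existing, "GeoChem", "Soil samples", day, "csv")
print(build_filename("GeoChem", "Soil samples", day, version, "csv"))
```

Keeping the version number in the file name means a new save produces a new file, which preserves the audit trail of earlier versions.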
  • 34. File Formats & Transformation • A file format encodes information in a computer file, enabling another program to access data within it. • HTML and PDF are two examples of commonly used file formats and may be identified by their suffixes .html and .pdf. • Files are based on either text or binary encoding. The former is both machine- and human-readable; the latter is readable only by means of appropriate software. • Thus text files are less likely to become obsolete. Examples of file name extensions for text files are .txt, .csv and .por. • If you convert or migrate your data files from one format to another, be aware of the potential risk of loss or corruption of your data and take appropriate steps to avoid or minimise it.
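One way to guard against loss during a format migration is to read the migrated file back and compare it with the original. The sketch below assumes a CSV to JSON migration purely as an example; `csv_to_json` and `migration_is_lossless` are hypothetical names, and all values are kept as text so that no type conversion can alter them.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Migrate CSV text to JSON text, keeping every value as a string."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


def migration_is_lossless(csv_text, json_text):
    """Check that the migrated JSON holds the same rows as the source CSV."""
    original = list(csv.DictReader(io.StringIO(csv_text)))
    migrated = json.loads(json_text)
    return original == migrated
```

A check like this only covers the records themselves; documentation, encodings and formatting would need their own checks.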
  • 35. File Formats & Transformation •When compressing your data files for storage, transportation or transmission, you encode the information using fewer bits than the original representation. Commonly used programs are Zip and Tar (Tar bundles files and is usually paired with a compressor such as gzip). •You may use the process of data normalisation. This means converting data from one format (e.g. proprietary) into another for use or preservation (e.g. ASCII). •You may also need to compute new values from old in your data, a process which is called data transformation. •This may be necessary prior to analysing your data. Three techniques for doing this are aggregation, anonymisation and perturbation.
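The three transformation techniques named above can be illustrated with a minimal sketch. The function names and the record layout (a list of dictionaries) are assumptions made for the example. Note that simply dropping identifying fields, as `anonymise` does here, is only the most basic form of anonymisation and may not be sufficient on its own for real personal data.

```python
import random
from collections import defaultdict
from statistics import mean


def aggregate(records, group_key, value_key):
    """Aggregation: replace individual values with a mean per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])
    return {group: mean(values) for group, values in groups.items()}


def anonymise(records, identifier_keys):
    """Anonymisation (basic): drop the fields that directly identify a person."""
    return [
        {key: value for key, value in record.items() if key not in identifier_keys}
        for record in records
    ]


def perturb(records, value_key, max_noise, seed=None):
    """Perturbation: add bounded random noise to a numeric field."""
    rng = random.Random(seed)
    return [
        {**record, value_key: record[value_key] + rng.uniform(-max_noise, max_noise)}
        for record in records
    ]
```

Each function returns new data and leaves the input untouched, so the original dataset remains available as the master copy.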
  • 36. Documenting Data There are many reasons why you need to document your data: •To help you remember the details later •To help others understand your research •Verify your findings •Review your submitted publication •Replicate your results •Archive your data for access and re-use Some examples of data documentation are: •Laboratory notebooks •Field notes •Questionnaires
  • 37. Documenting Data Laboratory or field notebooks, for example, play an important role in supporting claims relating to intellectual property developed by University researchers, and even in defending against claims of scientific fraud. Research data need to be documented at various levels: •Project level •File or database level •Variable or item level The term metadata (‘data about data’) is often used. The importance of metadata lies in the potential for machine-to-machine interoperability to assist location of and access to data through search interfaces.
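The three documentation levels above (project, file, variable) can be captured in a single machine-readable record. The sketch below is illustrative only: the field names and sample values are assumptions and do not follow any particular metadata standard. The helper `undocumented_variables` shows one practical use, flagging columns that have no variable-level description.

```python
import json

# Illustrative record only: field names and values are assumptions,
# not a metadata standard.
METADATA = {
    "project": {
        "title": "Ocean temperature survey",
        "creator": "A. Researcher",
        "methods": "Hourly sensor readings at fixed sites",
    },
    "files": [
        {
            "name": "ocean-temp.csv",
            "format": "CSV, UTF-8",
            "variables": [
                {"name": "site", "label": "Sampling site code"},
                {
                    "name": "temp_c",
                    "label": "Water temperature",
                    "unit": "degrees Celsius",
                },
            ],
        }
    ],
}


def undocumented_variables(metadata, file_name, column_names):
    """Return the columns of a file that have no variable-level entry."""
    for entry in metadata["files"]:
        if entry["name"] == file_name:
            documented = {variable["name"] for variable in entry["variables"]}
            return [name for name in column_names if name not in documented]
    # No file-level entry at all: every column is undocumented.
    return list(column_names)


def to_json(metadata):
    """Serialise the record as text so that other software can read it."""
    return json.dumps(metadata, indent=2)
```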
  • 38. Secure data storage: For the purposes of integrity, efficiency and ease of replication it is important that research data is stored securely & backed up regularly via: • Networked drives • Fileservers managed by department / school / IS. • Stored in single, secure, accessible place – regular back-ups. • Personal computers / laptops • Convenient, temporary storage - should not be used for storing master copies. • Local drives may fail & laptops may get lost/stolen.
  • 39. • External storage devices • Hard drives, USB sticks, CDs, DVDs – low cost & portable BUT not recommended for long term storage. • Longevity not guaranteed – degradation over time. • Easily damaged or misplaced. • Not big enough for all research data – need for use of multiple discs/drives. • May pose a security threat. If USB sticks, DVDs, CDs are used for working data or extra back-up then: • Choose high quality products from reputable manufacturers. • Conduct regular checks to ensure media is not failing. • Periodically refresh data (i.e. copy to a new disc or drive). • Ensure confidential data is password protected / encrypted
  • 40. • Remote or online back-up services - services that provide an online system for storing and backing-up computer files e.g. Dropbox, Mozy, Humyo, A-Drive • Allow users to store and sync data files online and between computers. • Employ cloud computing storage facilities (e.g. Amazon S3). • Business model – first few GBs free, pay for more space.
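The advice to conduct regular checks on storage media can be put into practice with checksums: record a digest for every file when the copy is made, then recompute the digests later and compare. This is a minimal sketch assuming SHA-256 and a manifest held in memory; `sha256_of`, `build_manifest` and `failed_files` are hypothetical names.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (by relative path) to its digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def failed_files(root, manifest):
    """List files that are missing or whose content no longer matches."""
    root = Path(root)
    failures = []
    for relative_path, expected in manifest.items():
        path = root / relative_path
        if not path.is_file() or sha256_of(path) != expected:
            failures.append(relative_path)
    return sorted(failures)
```

In practice the manifest would be saved alongside the data, and on a separate medium as well, so that a failing disc cannot corrupt both.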
  • 41. Backing-up Considerations for back-up policy: • Will all data be backed-up (full back-up), or only changed data (incremental back-up)? • How often will full and incremental back-ups be made? • How much hard-drive space, or how many DVDs, will be required to maintain this schedule? • If working with sensitive data, how will it be secured (and destroyed)? • What back-up services are available that meet these needs? • Who will be responsible for ensuring back-ups are available? Recommendation: Keep at least 3 copies of your data (e.g. original, external/local, and external/remote) and put in place a regular back-up procedure.
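The question of how much space a back-up schedule needs comes down to simple arithmetic. The sketch below assumes a schedule in which each full back-up is followed by a fixed number of incremental back-ups of roughly equal size, and a fixed number of such cycles is retained; the function name and parameters are hypothetical.

```python
def storage_needed_gb(full_size_gb, incremental_size_gb,
                      incrementals_per_full, cycles_retained):
    """Estimate space for a schedule of full plus incremental back-ups.

    One cycle is one full back-up followed by its incremental back-ups.
    """
    cycle_size = full_size_gb + incrementals_per_full * incremental_size_gb
    return cycles_retained * cycle_size
```

For example, a 50 GB full back-up each week with six daily incrementals of about 2 GB, keeping four weeks, gives 4 × (50 + 6 × 2) = 248 GB per copy. This figure applies to each back-up location separately.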
  • 42. Data Security The means of ensuring that data is kept safe from corruption and that access to it is suitably controlled. It is important to consider data security to prevent: • Accidental or malicious damage / modification to data. • Theft of valuable or irreplaceable data. • Breach of confidentiality agreements and privacy laws. • Release of data before it has been checked for accuracy and authenticity.
  • 43. Data Protection • The 1998 Data Protection Act regulates how personal data may be held and processed, and is aimed at organisations but also applies to individuals. • The Act recognises that personal data, on its own or linked with other data, can reveal the identity of an actual living person. • You must comply with the Act from the moment you obtain personal data until the time when the data have been returned, destroyed, or perhaps transformed into a public use dataset for purposes of sharing. • A research exemption exists if you are able to process anonymised data instead of personal data for your research by destroying the “key” between the identifiers and the personally identifying information. • The Records Management Office has full guidance on its website.
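The “key” mentioned above can be pictured as a lookup table that links a neutral participant code to the identifying details. The sketch below is an illustration of that idea only, not legal guidance: `split_identifiers` is a hypothetical helper, and whether the resulting data count as anonymised depends on what the remaining fields could reveal.

```python
import secrets


def split_identifiers(records, identifier_keys):
    """Separate identifying fields from research data.

    Returns (research_data, key_table). The key table is the only link
    between a participant code and the identifying fields; once it is
    destroyed, the codes can no longer be traced back through it.
    """
    key_table = {}
    research_data = []
    for record in records:
        code = secrets.token_hex(8)
        while code in key_table:  # regenerate on the unlikely collision
            code = secrets.token_hex(8)
        key_table[code] = {key: record[key] for key in identifier_keys}
        remaining = {
            key: value for key, value in record.items()
            if key not in identifier_keys
        }
        research_data.append({"participant_code": code, **remaining})
    return research_data, key_table
```

While the project is running, the key table would be stored separately from the research data and under stricter access control.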
  • 44. Rights and access • Intellectual property rights (IPR) can be defined as rights acquired over any work created or invented with the intellectual effort of an individual. • Facts are not copyrightable but the structure of a database could be. • As a researcher, you should clarify ownership of and rights relating to research data before a project starts. This includes the right of access and the right to make copies. • Data licences determine the terms and conditions of use by another, and may accompany a purchase or subscription. • Open data licences attempt to “set data free” by minimising and standardising the terms and conditions of re-use. Conditions may include attribution, non-commercial use, no derivative works, or ‘share alike’.
  • 45. Benefits of Sharing Data • Scientific integrity – publishing & citing data in published research papers can allow others to replicate, validate, or correct results, thus improving the scientific record. • Publicly funded research - there is a growing movement for making publicly funded research available to the public. • Funding mandates - UK research councils are increasingly mandating data sharing so as to avoid duplication of effort and save costs. • University of Edinburgh’s mission - "the creation, dissemination and curation of knowledge" implies transparency about the research that is conducted in its name. • Preserve research data for researchers’ own future use.
  • 46. THANK YOU! Data Library services: http://www.ed.ac.uk/is/data-library EDINA: http://edina.ac.uk/ Research data management guidance pages: http://www.ed.ac.uk/is/research-data-management Edinburgh University data policy: http://www.ed.ac.uk/is/research-data-policy Edinburgh Data Audit Framework (DAF) Implementation: http://ie-repository.jisc.ac.uk/283/ Research data MANTRA course: http://datalib.edina.ac.uk/mantra
  • 47. Scenarios for Discussion At completion of a research project the data and records are boxed and stored in a departmental storeroom. A participant in a research project lodges a claim for compensation, alleging that he was not adequately informed about the effects of the study and does not recall giving consent. He finds that the storeroom has since been converted into a coffee shop. Where are the records?
  • 48. Scenarios for Discussion Some time after completion of a research project the researcher wishes to revisit her findings, applying a new statistical approach. She manages to read the floppy discs that the data were stored on and eventually gets the old software format imported into her current statistical package, only to find she cannot remember what many of the variable labels (each 8 digits in length) actually mean. Has she documented her data? You publish a paper based on your thesis and are surprised to find it has become a hot topic in your field. Suddenly people are writing to you asking for the underlying data. How much effort is required to give them a well-cleaned dataset and adequate documentation for re-use?

Editor's Notes

  1. 25 years ago disk storage was expensive. Researchers interested in working with data came together to petition the PLU and the University’s Library, wanting a university-wide provision for files that were too large to be stored on individual computing accounts. Early holdings were research data from the universities of Edinburgh, Glasgow, and Strathclyde.
  2. Primarily social sciences but not exclusively so: large scale government surveys (micro data), macro-economic time series data (country-level data), election studies, geospatial data, financial datasets, population census data. Free on internet / subscription / through national data centres/archives / resource discovery portals. Registration / authorisation and authentication / special conditions / budget to pay for data. SPSS, STATS, SAS, R, ArcGIS – interpret documentation/codebooks, merge and match users' data with other data (via look-up tables), subset data. Data Catalogue
  3. Training for postgraduates and early career researchers. These were the School of Divinity, School of History, Classics and Archaeology, School of Biomedical Sciences, School of Molecular and Clinical Medicine, School of Physics and Astronomy. Also, the School of Geosciences
  4. Digital Curation Centre, Data Library, Information Services Infrastructure, Research Computing, Library & Collections. Concern is both for the shorter term – ensuring competitive advantage through secure and easy-to-use access, and for the longer term – ensuring enduring access and usability to the research community into the future and compliance with legislation. 2 working groups: RDS working group, RDM working group
  5. Funded by JISC as part of its UK programme, Managing Research Data, to develop online learning materials to help researchers manage their digital assets. IAD – set up to deliver training and development for postgraduate students and staff – via online course, Virtual Learning Environments, transferable skills training
  6. A set of Multi- or Cross-Disciplinary online learning resources FRUIT principles – Fun Relevant Useful Interesting Timely
  7. Shareable Content Object Reference Model – XML-based
  8. JorumOpen - national OER repository
  9. What about preserving?
  10. Observational – sensor data, survey or sample data, neuroimages – e.g. ocean temperature, voters' attitudes before an election, photographs of a supernova. Experimental – e.g. gene sequences, chromatograms, toroid magnetic field data, HPLC, gel electrophoresis, chemical reaction rates. Simulation – e.g. climate models, economic models, algorithms. Derived – e.g. text and data mining, compiled database, 3D models, maps. Reference – e.g. gene sequence databanks, chemical structures, spatial data portals
  11. BioData Blog: “Documenting data may seem like a tedious, wasteful step, but each researcher must think of its long-term benefits” - methodologies, workflows, procedures, recording conditions etc