Dataset citation and identification




Adam Farquhar, PhD
Head of Digital Library Technology, The British Library
President, DataCite

December, 2009
Widening gap



A widening gap in the scientific
record between published research
and the data that underlies it
   Published work held by libraries
   Datasets held by data centres
   No effective way to link between
   datasets and articles
   No widely used method to
   identify datasets
   No widely used method to cite
   datasets
As a result, datasets are
  Difficult to discover
  Difficult to access
  Second-class citizens in the
  scientific record


Datasets – first class citizens?



Datasets                                Published articles

Data is difficult to manage after       Libraries ensure long-term storage
project funding ceases                  and management

Informal networks provide the           Established funded services provide
primary means of sharing                the primary means of access

Only 21% use a national or              Nearly all published articles are held
international facility                  in multiple national libraries

Datasets are not included in impact     Articles and citations form the
analysis                                backbone of impact analysis

Good luck finding it or getting         Catalogues and full-text search
permission to use it (your discipline   support discovery
may vary)

Source: UKRDS Study
Dataset citation using Digital Object Identifiers (DOIs)



The DOI system offers an easy way to connect the article with the
underlying data

Several organisations assign DOIs to datasets
  IUCR, ICPSR, OECD through CrossRef
  PANGAEA, Mare, and others through TIB (German Science Library)

Dataset
G. Yancheva, N. R. Nowaczyk et al (2007)
Rock magnetism and X-ray flourescence spectrometry analyses on sediment
cores of the Lake Huguang Maar, Southeast China, PANGAEA
doi:10.1594/PANGAEA.587840

Article
G. Yancheva, N. R. Nowaczyk et al (2007)
Influence of the intertropical convergence zone on the East Asian monsoon
Nature 445, 74-77
doi:10.1038/nature05431

[Arrow between Article and Dataset, labelled "Cites"]
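As a rough illustration of how a DOI connects an article to its underlying data, the two identifiers on this slide can be turned into resolvable links. This is a minimal sketch, not part of the original talk: it assumes the public DOI proxy at https://doi.org/ as the resolver, and the function names `normalize_doi` and `resolver_url` are hypothetical helpers introduced here for illustration.

```python
from urllib.parse import quote

# Assumption: the public DOI proxy at doi.org is used as the resolver.
RESOLVER = "https://doi.org/"

# Labels and URL prefixes often written in front of a DOI (illustrative list).
_PREFIXES = (
    "doi:",
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def normalize_doi(raw: str) -> str:
    """Return the bare DOI name, e.g. '10.1594/PANGAEA.587840'."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    # A DOI name begins with "10." and separates prefix and suffix with "/".
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {raw!r}")
    return doi


def resolver_url(raw: str) -> str:
    """Build a link that the resolver redirects to the object's landing page."""
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return RESOLVER + quote(normalize_doi(raw), safe="/")


if __name__ == "__main__":
    # The article and dataset identifiers shown on the slide.
    print("Article:", resolver_url("doi:10.1038/nature05431"))
    print("Dataset:", resolver_url("doi:10.1594/PANGAEA.587840"))
```

Because both the article and the dataset resolve through the same mechanism, a reference list can treat a dataset citation exactly like an article citation.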
DataCite – International Data
Citation Initiative



Our long-term vision is to support researchers by
providing methods for them to locate, identify, and cite
research datasets with confidence.

Milestones
  2005, Hannover, TIB begins to issue DOIs for datasets
  March 2009, Paris
    Memorandum signed at ICSTI
  December 2009, London
    DataCite Association founded

(DataCite : Data Centres :: CrossRef : Publishers)

Global partnership



 Germany - Technische Informationsbibliothek (TIB)
 United Kingdom - The British Library
 France - L’Institut de l’Information Scientifique et Technique (INIST)
 Switzerland - Library of the ETH Zürich
 Netherlands - Library of TU Delft
 Denmark - Technical Information Center of Denmark
 Canada - Canadian Institute for Scientific and Technical Information (CISTI)
 Australia - Australian National Data Service (ANDS)
 USA - California Digital Library
 USA - Purdue University


DataCite



The DataCite registration agency
  Maintains the resolution infrastructure
  Maintains a searchable database of metadata
  Manages the identifiers over the long term
  Establishes and shares best practice

Publishing agents (data centres, research institutes, publishers) are
responsible for
   Quality assurance
   Content storage and access
   Creating the identifier
   Creating and updating metadata


DataCite Structure



[Diagram: DataCite organisational structure. Labels, top to bottom:
International DOI Foundation (DataCite is a Member); DataCite, with TIB
as Managing Agent; Member Institutions and an Associate Stakeholder;
Data Centres grouped under the Member Institutions. Connector labels:
"Carries", "Works with".]
Research Data in Articles




How can we work together?



DataCite supports researchers by enabling them to locate, identify, and
cite research datasets with confidence

This is the start of a conversation

We welcome your comments, questions, and ideas!

Contact:
adam.farquhar {@} bl.uk
jan.brase {@} tib.uni-hannover.de

  Help to establish best practices
  Adjust author policies to require clear unambiguous citations for
  datasets
  Integrate links to datasets into delivery platforms
  Collaborate to understand evolving roles and responsibilities among
  publishers, data centres, and libraries

Help me to rewrite this list!
