1. Data Publishing
@PLOS
THOR Workshop, Amsterdam
Catriona MacCallum, PLOS Advocacy Director
Member of the Boards of OASPA and OpenAIRE
@PLOS, @catmacOA
April 2016
ORCID: 0000-0001-9623-2225
3. Data Availability
Probability of finding the data associated with a paper declined by 17% every year
Vines, Timothy, et al. “The Availability of Research Data Declines Rapidly with Article Age.” Current Biology 24, no. 1 (January 6, 2014): 94–97. doi:10.1016/j.cub.2013.11.014.
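As a rough illustration of what a 17% annual decline implies over time, the sketch below compounds the decline with article age. It assumes the 17% applies multiplicatively to the chance of finding the data each year, relative to a baseline of 1.0 at publication; the function name and the simple compounding model are illustrative assumptions, not the analysis used by Vines et al.

```python
def remaining_fraction(years: int, annual_decline: float = 0.17) -> float:
    """Fraction of the baseline availability left after `years` years.

    ASSUMPTION: the decline compounds multiplicatively year on year.
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return (1.0 - annual_decline) ** years


if __name__ == "__main__":
    # Print the remaining fraction for a few article ages.
    for age in (1, 5, 10, 20):
        print(f"{age:>2} years: {remaining_fraction(age):.1%} of baseline")
```

Under this simplified model, less than a fifth of the baseline remains after ten years.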
4. PLOS Data Policy
• PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception.
• When submitting a manuscript online, authors must provide a Data Availability Statement describing compliance with PLOS's policy.
Since March 2014
5. External Data Advisory Group
• Academic Chair: Phil Bourne
• 40 experts across the world with
representatives from all PLOS journals
6. DAS
NB: The DAS (Data Availability Statement) is openly available and machine-readable via the PLOS search API
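A minimal sketch of how such a lookup might be built is shown below. It only constructs a query URL for the Solr-style PLOS search endpoint and does not send a request. The field name `data_availability` is a hypothetical placeholder for wherever the DAS is exposed, and the helper name is illustrative; check the API's published field list before relying on either.

```python
from urllib.parse import urlencode

PLOS_SEARCH_URL = "http://api.plos.org/search"

# ASSUMPTION: hypothetical field name for the Data Availability Statement.
# Confirm the real field name against the PLOS search API documentation.
DAS_FIELD = "data_availability"


def build_das_query(doi: str, rows: int = 1) -> str:
    """Build a search URL asking for the DAS of the article with this DOI."""
    params = {
        "q": f'id:"{doi}"',                 # PLOS article ids are DOIs
        "fl": f"id,title,{DAS_FIELD}",      # fields to return
        "wt": "json",                       # response format
        "rows": rows,
    }
    return f"{PLOS_SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_das_query("10.1371/journal.pbio.1002106"))
```

Keeping URL construction separate from the network call makes the query easy to inspect and test before any request is sent.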
8. Where are the Data (PLOS ONE)?
Time       | Papers with DAS | Data in Submission Files (#) | Data in Submission Files (%) | Data in Repositories (Estimate) | Data upon Request (Estimate)
Q2-Q4 2014 | 9491            | 7918                         | 74%                          | 11%                             | 10%
Q2-Q4 2015 | 22142           | 15382                        | 69%                          | 14%                             | 12%

Time       | Dryad | Figshare | NCBI | GitHub
Q2-Q4 2014 | 152   | 210      | 551  | 37
Q2-Q4 2015 | 551   | 753      | 1229 | 174

DAS = Data availability statement
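The repository counts can be compared year on year with a few lines of arithmetic. The sketch below uses only the figures from the slide; the dictionary layout and function name are illustrative.

```python
# Repository deposits from the slide: (Q2-Q4 2014, Q2-Q4 2015).
deposits = {
    "Dryad": (152, 551),
    "Figshare": (210, 753),
    "NCBI": (551, 1229),
    "GitHub": (37, 174),
}


def growth_factor(before: int, after: int) -> float:
    """Ratio of the later count to the earlier count."""
    if before <= 0:
        raise ValueError("the earlier count must be positive")
    return after / before


if __name__ == "__main__":
    for name, (y2014, y2015) in deposits.items():
        print(f"{name:<9} {y2014:>5} -> {y2015:>5}  x{growth_factor(y2014, y2015):.2f}")
```

The number of papers with a DAS also grew over the same period (9491 to 22142), so part of this increase reflects publication volume rather than a change in author behaviour.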
9. Internal Checks: PLOS ONE
• At submission: check for unacceptable restrictions to access
• During review: Editors & Reviewers assess underlying data
• At acceptance: check statements & ensure clinical datasets have no potentially identifying information
• Post-publication: work with authors as needed
12. On January 7, 2016, a coalition of publishers signed an open letter committing to start requiring ORCID iDs in 2016.
1. Implement best practices for ORCID iD collection and auto-update of ORCID records upon publication
2. Require ORCID iDs for corresponding authors and encourage them for co-authors
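Collecting ORCID iDs at submission usually involves a basic sanity check on the identifier. The final character of an ORCID iD is a checksum computed with ISO 7064 MOD 11-2, and the sketch below validates the hyphenated 16-character form. The function names are illustrative, and a passing check only shows the iD is well formed, not that it is registered or belongs to the author.

```python
import re

# Four groups of four characters; the last character may be the check value "X".
_ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD has the expected shape and a matching checksum."""
    if not _ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


if __name__ == "__main__":
    print(is_valid_orcid("0000-0001-9623-2225"))  # iD from the title slide
```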
13. CRediT – Contributor Roles Taxonomy
A simple taxonomy of research contributions (CASRAI and NISO).
- Includes but not limited to traditional authorship roles
- Makes contributions machine-readable and portable
- Meant to inspire development: Mozilla badges, VIVO-ISF ontology, JATS integration, ORCID integration
14. The CRediT taxonomy is simple by design, which may become limiting, but it provides an important framework for authorship discussions.
Ideal solution:
* includes a free text field for each contribution
* can be used upstream from submission, during research
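One way to picture a machine-readable contribution record with a free-text field is sketched below. The role names are the 14 CRediT roles; the `Contribution` class, its field names and the JSON layout are hypothetical illustrations, not an official CRediT, JATS or ORCID schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

# The 14 roles of the CRediT taxonomy.
CREDIT_ROLES = frozenset({
    "Conceptualization", "Data curation", "Formal analysis",
    "Funding acquisition", "Investigation", "Methodology",
    "Project administration", "Resources", "Software", "Supervision",
    "Validation", "Visualization", "Writing – original draft",
    "Writing – review & editing",
})


@dataclass
class Contribution:
    contributor: str
    role: str
    note: str = ""  # free-text detail for this contribution (hypothetical field)

    def __post_init__(self) -> None:
        if self.role not in CREDIT_ROLES:
            raise ValueError(f"not a CRediT role: {self.role!r}")


def to_json(contributions: List[Contribution]) -> str:
    """Serialise contributions so they can travel with the manuscript."""
    return json.dumps([asdict(c) for c in contributions], indent=2, ensure_ascii=False)


if __name__ == "__main__":
    record = [Contribution("A. Researcher", "Software", "wrote the analysis pipeline")]
    print(to_json(record))
```

Because the record is plain data, it could in principle be captured during the research itself and carried forward to submission.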
15. Data Citation (ongoing):
credit for data producers and collectors
• Should comply with Force11 Data Citation Principles
• Minimum Requirements
• author names, repository name, date + persistent unique
identifier (such as DOI or URI)
• citation should link to the dataset directly via the persistent
identifier
• comprehensive, machine-readable landing pages for
deposited data
• guidance to authors to include data in references
https://www.force11.org/group/joint-declaration-data-citation-principles-final
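The minimum requirements listed above can be sketched as a small record type that refuses incomplete citations and always resolves to a link built from the persistent identifier. The class name, field names and output format are illustrative assumptions rather than a prescribed citation style, and the example values are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCitation:
    authors: List[str]
    repository: str
    date: str
    identifier: str  # persistent unique identifier: a DOI or a URI

    def __post_init__(self) -> None:
        # Every minimum element from the slide must be present.
        if not self.authors or not all([self.repository, self.date, self.identifier]):
            raise ValueError("authors, repository, date and identifier are all required")

    def link(self) -> str:
        """Direct link to the dataset via its persistent identifier."""
        if self.identifier.startswith(("http://", "https://")):
            return self.identifier
        return f"https://doi.org/{self.identifier}"  # treat bare identifiers as DOIs

    def format(self) -> str:
        names = "; ".join(self.authors)
        return f"{names} ({self.date}). {self.repository}. {self.link()}"


if __name__ == "__main__":
    # Placeholder values, not a real dataset.
    print(DataCitation(["Doe J"], "Dryad", "2016", "10.1234/example").format())
```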
17. Protocols.io
• Database of experimental protocols
• Open access and free for users
• Desktop and mobile applications
• Functionality to
• Create
• Fork – create derivatives (keeps provenance)
• Run
• Annotate while running
• Keep date-stamped version of actual run
• Export to PDF, etc
10k registrants
1,000 private protocols
21. False expectations
Peer review is expected to police the literature but:
• Science has become more cross disciplinary and more
complicated (mammoth datasets)
• Are 2 or 3 reviewers + 1 editor sufficient?
• Anonymity conceals/engenders negativity and bias
• No incentive/reward for constructive collaboration
• Reviewers review for journals and editors – not for readers,
colleagues or society
• Peer review is a black box – impossible to assess its
effectiveness
22. Is science reliable?
• Poorly designed studies
• small sample sizes, lack of randomisation, blinding and controls
• Data not available to scrutinise/replicate
• ‘p-hacking’ (selective reporting) widespread [1]
• Poorly reported methods & results [2]
• Negative results are not published
[1] Head ML, Holman L, Lanfear R, Kahn AT, Jennions MD (2015) The Extent and Consequences of P-Hacking in Science. PLoS Biol 13(3): e1002106. doi:10.1371/journal.pbio.1002106
[2] Landis SC, et al. (2012) A call for transparent reporting to optimize the predictive value of preclinical research. Nature 490(7419): 187–191.
26. “Current incentive structures in science are likely to lead rational scientists to adopt an approach to maximise their career advancement that is to the detriment of the advancement of scientific knowledge.”
Andrew Higginson and Marcus Munafò, in prep (cited with their permission)
27. Declaration on Research
Assessment
• A worldwide initiative, spearheaded by the ASCB (American Society for Cell Biology), together with scholarly journals and funders
• Focuses on:
• the need to eliminate the use of journal-based metrics, such as Journal Impact Factors, in funding, appointment, and promotion considerations;
• “need to assess research on its own merits rather than on the basis of the journal in which the research is published”
28. A simple proposal for the publication of journal citation
distributions
Vincent Larivière (1), Véronique Kiermer (2), Catriona J. MacCallum (3), Marcia McNutt (4), Mark Patterson (5), Bernd Pulverer (6), Sowmya Swaminathan (7), Stuart Taylor (8), Stephen Curry (9)*
(1) Associate Professor of Information Science, École de bibliothéconomie et des sciences de l’information, Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal, QC. H3C 3J7, Canada; Observatoire des Sciences et des Technologies (OST), Centre Interuniversitaire de Recherche sur la Science et la Technologie (CIRST), Université du Québec à Montréal, CP 8888, Succ. Centre-Ville, Montréal, QC. H3C 3P8, Canada
(2) Executive Editor, PLOS, 1160 Battery Street, San Francisco, CA 94111, USA
(3) Advocacy Director, PLOS, Carlyle House, Carlyle Road, Cambridge CB4 3DN, UK
(4) Editor-in-Chief, Science journals, American Association for the Advancement of Science, 1200 New York Avenue, NW, Washington, DC 20005, USA
(5) Executive Director, eLife, 24 Hills Road, Cambridge CB2 1JP, UK
(6) Chief Editor, The EMBO Journal, Meyerhofstrasse 1, 69117 Heidelberg, Germany
(7) Head of Editorial Policy, Nature Research, Springer Nature, 225 Bush Street, Suite 1850, San Francisco 94104, USA
(8) Publishing Director, The Royal Society, 6-9 Carlton House Terrace, London SW1Y 5AG, UK
(9) Professor of Structural Biology, Department of Life Sciences, Imperial College, Exhibition Road, London, SW7 2AZ, UK
*Corresponding Author. Email: s.curry@imperial.ac.uk
Published in bioRxiv, 2016: http://biorxiv.org/content/early/2016/07/05/062109
CC BY
29. Fig 1. Citation distributions of 11 different science journals. Citations are to ‘citable documents’ as classified by Thomson Reuters, which include standard research articles and reviews. The distributions contain citations accumulated in 2015 to citable documents published in 2013 and 2014 in order to be comparable to the 2015 JIFs published by Thomson Reuters. To facilitate direct comparison, distributions are plotted with the same range of citations (0-100) in each plot; articles with more than 100 citations are shown as a single bar at the right of each plot.
[Figure: eleven histogram panels, one per journal: eLife, EMBO J., J. Informetrics, Nature, Nature Comm., PLOS Biol., PLOS Genet., PLOS ONE, Proc. R. Soc. B, Science, Sci. Rep. x-axis: Number of citations (0 to 100+); y-axis: Number of papers.]
Larivière et al. (2016)
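The binning behind such a distribution can be sketched in a few lines: count the papers at each citation value and pool the long tail into a single final bar. The sketch below follows the "100+" axis label by placing counts of 100 and above in the last bin. The function name is illustrative, and this is not the authors' published code.

```python
from collections import Counter
from typing import Iterable, List


def citation_histogram(citations: Iterable[int], cap: int = 100) -> List[int]:
    """Number of papers with 0, 1, ..., cap-1 citations, plus one final
    bin that pools every paper with `cap` or more citations."""
    values = list(citations)
    if any(c < 0 for c in values):
        raise ValueError("citation counts must be non-negative")
    counts = Counter(min(c, cap) for c in values)
    return [counts.get(n, 0) for n in range(cap + 1)]


if __name__ == "__main__":
    # Made-up per-paper citation counts, for illustration only.
    sample = [0, 0, 1, 3, 3, 7, 12, 100, 250]
    hist = citation_histogram(sample)
    print(hist[:8], "...", hist[-1])
```

Plotting the returned list as bars gives one panel of the kind described in the caption; a single average such as the JIF collapses this whole shape into one number.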
30. Fig 4. A log-scale comparison of the 11 citation distributions. (a) The absolute number of articles plotted against the number of citations. (b) The percentage of articles plotted against the number of citations.
[Figure: two panels on logarithmic axes. x-axis: Number of citations (+1); y-axis: (a) Number of articles, (b) Percentage of articles. Series in both panels: eLife, EMBO J., J. Informetrics, Nature, Nature Comm., PLOS Biol., PLOS Genet., PLOS ONE, Proc. R. Soc. B, Science, Sci. Rep.]
Larivière et al. (2016)
31. Recommendations
• We encourage journal editors and publishers that advertise or display JIFs to publish their own distributions using the above method.
• We encourage publishers to make their citation lists open via Crossref, so that citation data can be scrutinized and analyzed openly.
• We encourage all researchers to get an ORCID iD that … facilitates the consideration of a broader range of outputs in research assessment.
Larivière et al. (2016)
32. By the time a paper is submitted to a journal it’s
generally too late