Reproducibility (and the R*) of Science: motivations, challenges and trends
1. Reproducibility (and the R*) of Science: motivations, challenges and trends
Professor Carole Goble
The University of Manchester, UK
Software Sustainability Institute, UK
Head of Node ELIXIR-UK
ELIXIR, IBISBA, FAIRDOM Association e.V., BioExcel Life Science Infrastructures
carole.goble@manchester.ac.uk
IRCDL Pisa 31 Jan – 1 Feb 2019
Beware.
Results may vary.
4. The famous Nature survey
1576 researchers, 2016
https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
5. Reporting and Availability
John P. A. Ioannidis, Why Most Published Research Findings Are False, PLoS Medicine, August 30, 2005, DOI: 10.1371/journal.pmed.0020124
incomplete reporting of method, software configurations, resources,
parameters & resource versions, missed steps, missing data, vague
methods, missing software, unreproducible environments.
Joppa et al., Troubling Trends in Scientific Software Use, Science 340, May 2013
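Much of that under-reporting can be automated: an analysis script can record its own interpreter version, platform, and parameter settings alongside the results. A minimal sketch in Python (the function name and the parameters shown are illustrative, not from any cited tool):

```python
import json
import platform
import sys

def capture_environment(params: dict) -> dict:
    """Record the minimum context a reader needs to rerun an analysis:
    interpreter version, platform, and the exact parameters used."""
    return {
        "python_version": sys.version,
        "platform": platform.platform(),
        "parameters": params,
    }

# Store this record next to the results it describes.
record = capture_environment({"threshold": 0.05, "iterations": 1000})
print(json.dumps(record, indent=2))
```

Writing such a record as a side effect of every run costs nothing and removes "vague methods" and "resource versions" from the list above.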
Better Training
Methodological Support
More robust designs
Independent
accountability
Collaboration & team science
Diversifying peer review
Better Practices
Funding replication studies
Rewarding right behaviour
Design Flaws
HARKing (hypothesizing after the results are known), cherry-picking data, unreported random seeds, non-independent bias, poor positive and negative controls, poor normalisation, arbitrary cut-offs, premature data triage, un-validated materials, improper statistical analysis, poor statistical power, stopping when you "get to the right answer", software misconfigurations, misapplied black-box software.
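Some of these flaws have mechanical fixes. Unreported random seeds, for instance, disappear if every analysis takes an explicit seed and records it with the results. A minimal Python sketch (the analysis itself is a toy stand-in):

```python
import random

def run_analysis(seed: int) -> list:
    """Toy analysis: draws 'measurements' from a seeded RNG.

    Recording `seed` alongside the results lets anyone repeat the
    run bit-for-bit; omitting it makes the run unrepeatable."""
    rng = random.Random(seed)  # local RNG: no hidden global state
    return [rng.gauss(0.0, 1.0) for _ in range(5)]

first = run_analysis(seed=42)
second = run_analysis(seed=42)
assert first == second  # same seed, same "results"
```

Using a local `random.Random` instance rather than the module-level functions also avoids hidden coupling between analysis steps through shared global state.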
6. Trend: Policy and advice proliferation
Findable ( and be Citable)
Accessible (and be Trackable)
Interoperable (and be Intelligible)
Reusable (and be Reproducible)
Record, Automate,
Contain, Access
7. *Based on
Scientific publications
• announce a result
• convince readers to trust it.
Experimental science
• describe the results
• provide a clear enough description of the materials and protocol to allow successful repetition and extension. (Jill Mesirov, 2010*)
Computational science
• describe the results
• provide the complete software
development environment,
data, instructions, techniques
(which generated the figures)
(David Donoho 1995*).
8. “Virtual Witnessing”
*Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life (1985), Shapin and Schaffer. Joseph Wright, Experiment with the Air Pump, c. 1768
9. “virtually witnessing” the “moist lab”
Experiment setup | Wet | Dry
Laboratory | Physical lab | Software and hardware infrastructure…
Methods | Lab protocols, standard operating procedures… | Algorithms, spec. of the analysis steps, models…
Materials | Chemicals, reagents, samples, strain of mouse… | Datasets, parameters, algorithm seeds…
Instruments | Mass specs, sequencers, microscopes, calibrations… | Codes, services, scripts, workflows, reference datasets…
10. International Mouse Strain
Resource (IMSR)
Bramhall et al., Quality of Methods Reporting in Animal Models of Colitis, Inflammatory Bowel Diseases, 2015
“Only one of the 58 papers
reported all essential criteria on
our checklist. Animal age,
gender, housing conditions and
mortality/morbidity were all
poorly reported…..”
The Materials
11. Turning FAIR into reality
Final report and action plan from the European Commission expert group on FAIR data, Nov 2018
The Materials
12. The Methods
Method Reproducibility
the provision of enough detail about
study procedures and data so the
same procedures could, in theory or
in actuality, be exactly repeated.
Result Reproducibility
obtaining the same results from the conduct of an independent study whose procedures are as closely matched to the original experiment as possible.
Procedure = Software, SOP, Lab Protocol, Workflow, Script.
Tools, Technologies, Techniques. A whole bunch of them together.
Goodman et al., Science Translational Medicine 8(341), 2016
14.
[Lifecycle diagram: Plan → Assemble methods & materials → Run experiment (observe, simulate, analyse) → Manage results → Publish/share results]
“I can’t immediately reproduce the
research in my own laboratory. It
took an estimated 280 hours for an
average user to approximately
reproduce the paper.
Garijo et al. 2013 Quantifying Reproducibility in
Computational Biology: The Case of the Tuberculosis
Drugome PLOS ONE.
16. The R* Nautilus
with thanks to Nicola Ferro for the visualisation
Repeat
Replicate
Reproduce
Reuse /
Generalise
17. The R* Nautilus
Repeat
Same data, set up
Same task/goal
Same materials
Same methods
Same group/lab
My Research Environment
robust, defensible, productive
“Micro” Reproducibility
18. The R* Nautilus
Replicate
Same data, set up
Same task/goal
Same materials
Same methods
Different group/lab
Our Research Environment
review, validate, certify
Publication Environment
review, validate, certify
"Sameness"
Accountability
Trust
19. The R* Nautilus
Reproduce
Different data, set up
Same task/goal
Same/different materials
Same/different methods
Different group/lab
Their Research Environment
review, compare, verify
"Similar"
Accountability
Trust
"Macro" Reproducibility
20. The R* Nautilus
Reuse / Generalise
Different data, set up
Different task/goal
Same/different materials
Same/different methods
Different group/lab
Transferred
Repurposed
Trusted
Productivity
21. The R* Nautilus
[Funnel: experimental outputs → outputs retained → outputs used and shared → outputs published → reused]
Not all outputs are worth the burden of metadata unless it's automagical and a side-effect.
22. Why does this matter?
Moving between different environments
Recreating / accessing common environments
Fragmented, decentralised, multifarious and complicated…
Research Infrastructure Services
[Lifecycle diagram: assemble methods & materials; run experiment (observe, simulate, analyse); manage and share results; quality assessment; track and credit; deposit & licence; disseminate via publishing services]
Science 2.0 Repositories: Time for a Change in Scholarly Communication, Assante, Candela, Castelli, Manghi, Pagano, D-Lib 2015
24. Why does this matter?
Accuracy, Sameness, Change, Dependencies
What has been fixed, what must be fixed, and what variations are valid?
We snapshot publications, but science does not stay still.
Replication may be harder than reproduction, and will decay as the tools, methods, software, data… move on or become inherently unavailable.
What are the dependencies? What are the black-box steps?
Results may vary.
25. Why does this matter?
More than just “FAIR” data
Open Access to data, software and platforms
Rich descriptions of data, software, methods
• Transparent record of steps, dependencies,
provenance.
• Reporting robustness of methods, versions,
parameters, variation sensitivities
• Portability and preservation of the software
and the data
This should be embedded in research practice, not a burdensome afterthought at publication. Keeping track should be a side effect of using research tools.
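As a sketch of what "keeping track as a side effect" can look like in code, a decorator can log each analysis step's inputs, outputs, and timestamp while the step runs. All names here are hypothetical, not from any real provenance tool:

```python
import functools
import time

PROVENANCE = []  # in a real tool this would go to a provenance store

def tracked(step):
    """Record each call's inputs, output, and timestamp as a side effect."""
    @functools.wraps(step)
    def wrapper(*args, **kwargs):
        result = step(*args, **kwargs)
        PROVENANCE.append({
            "step": step.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": result,
            "timestamp": time.time(),
        })
        return result
    return wrapper

@tracked
def normalise(values, factor=1.0):
    """Toy analysis step: divide every value by a normalisation factor."""
    return [v / factor for v in values]

normalise([2.0, 4.0], factor=2.0)
print(PROVENANCE[0]["step"])  # → normalise
```

The researcher writes `normalise` exactly as before; the provenance record accumulates without any extra effort, which is the point of the slide.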
27. Extreme example
Precision medicine HTS pipelines
Alterovitz, Dean, Goble, Crusoe, Soiland-Reyes et al., Enabling Precision Medicine via standard communication of NGS provenance, analysis, and results, bioRxiv, 2017, https://doi.org/10.1101/191783
28. Why does this matter?
• Reproducibility is a spectrum
• Strength and difficulty
depends on context and
purpose in the scholarly
workflow
• Beware reproducibility (and
FAIR) dogmatists.
29. Why does this matter?
forced fragmentation and decentralisation
distributed knowledge infrastructures
Today: de-contextualised, static, fragmented, lost semantic linking; results buried in a PDF figure, reading and writing scattered…
Goal: contextualised, active, unified, semantic linking.
31. Trend: Research Objects
context, data, methods, models, provenance bundled together
Handling and embracing decentralisation and enabling portability
32. Trend: Tool/Environment Proliferation
reproducibility built in as a side effect, reproducibility ramps disguised as productivity. If only they worked together…
Standards and templates for reporting
methods, provenance, tracking
Tools and platforms for capturing, tracking, structuring and organising assets throughout the whole research project lifecycle.
Shared Cloud-based
analysis systems &
collaboratories
Workflow/Script Automation
Containers for
executable software
dependencies &
portability
Electronic lab notebooks
Open source software
repositories
Models and methods archives
Research
Commons
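Several of these tools rest on the same primitive: knowing exactly which software versions a run used. In Python, the standard library can enumerate the installed distributions, which is the raw material for a lock file or a container image. A minimal sketch:

```python
from importlib import metadata

def freeze_dependencies() -> dict:
    """Map each installed distribution to its exact version:
    the raw material for a lock file or container image."""
    return {
        dist.metadata["Name"]: dist.version
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip distributions with broken metadata
    }

deps = freeze_dependencies()
for name, version in sorted(deps.items())[:5]:
    print(f"{name}=={version}")
```

Pinning these versions in a container definition is what turns "works on my machine" into a portable, rerunnable environment.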
33. Trend: Publication Tool Proliferation
mostly as an additional step
eLife Reproducible Document Stack: publish computationally reproducible research articles online.
Data2Paper
35. Provocation:
why are we still publishing articles?
For Reproducible Research
Release Research Objects
Jennifer Schopf, Treating Data Like Software: A Case for Production-Quality Data, JCDL 2012
Analogous to software products and
practices rather than data or articles or
library practices…
Treat ALL Products and
ALL Research Like Software
Times Higher Education Supplement, 14 May 2015
36. Acknowledgements
• Dagstuhl Seminar 16041, January 2016
– http://www.dagstuhl.de/en/program/calendar/semhp/?semnr=16041
• ATI Symposium Reproducibility, Sustainability and Preservation, April 2016
– https://turing.ac.uk/events/reproducibility-sustainability-and-preservation/
– https://osf.io/bcef5/files/
• Nicola Ferro
• C. Titus Brown
• Juliana Freire
• David De Roure
• Stian Soiland-Reyes
• Barend Mons
• Tim Clark
• Daniel Garijo
• Norman Morrison
• Matt Spritzer
• Scott Edmunds
• Paolo Manghi …