Lecture 2:
Being Reproducible: Models, Research Objects and R* Brouhaha
Reproducibility is an R* minefield, depending on whether you are testing for robustness (rerun), defence (repeat), certification (replicate), comparison (reproduce) or transferring between researchers (reuse). Different forms of "R" make different demands on the completeness, depth and portability of research. Sharing is another minefield, raising concerns of credit and protection from sharp practices.
In practice the exchange, reuse and reproduction of scientific experiments depend on bundling and exchanging the experimental methods, computational codes, data, algorithms, workflows and so on along with the narrative. These "Research Objects" are not fixed, just as research is not “finished”: the codes fork, data is updated, algorithms are revised, workflows break, service updates are released. ResearchObject.org is an effort to systematically support more portable and reproducible research exchange.
In this talk I will explore these issues in more depth using the FAIRDOM Platform and its support for reproducible modelling. The talk will cover initiatives and technical issues, and raise social and cultural challenges.
1. Being Reproducible:
Models, Research
Objects and R* Brouhaha
Professor Carole Goble, carole.goble@manchester.ac.uk
The University of Manchester, UK
The FAIRDOM Association Coordinator
ELIXIR-UK Head of Node
Co-lead ELIXIR Interoperability Platform
SSBSS 2017, July 17 2017, Cambridge, UK
4th International Synthetic & Systems Biology Summer School
7. John P. A. Ioannidis, How to Make More Published Research True, October 21, 2014, DOI: 10.1371/journal.pmed.1001747
8. Reproducibility of biological experiments
is hard
for in vivo/vitro and
for in silico analysis
• OS version
• Revision of scripts
• Data analysis software versions
• Version of data files
• Command line parameters written on
a napkin
• “Black magic” only a grad student
knows
Fix with latest technologies, best
practices and willingness
[Keiichiro Ono, Scripps Institute]
The first
step is to
be
FAIR
See the whole of
the previous talk…
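The items listed on this slide (OS version, script revision, software versions, data file versions, command line parameters) can be recorded by the analysis itself rather than on a napkin. The following is a minimal sketch, not a prescribed tool: the function names `sha256_of`, `package_versions` and `run_record` are hypothetical, and a file hash stands in for a proper version-control revision.

```python
import hashlib
import platform
from importlib import metadata


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def package_versions(names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def run_record(script_path, data_paths, argv, packages=()):
    """Collect the facts about one run that are easy to lose afterwards."""
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        # Assumption: a content hash is used in place of a VCS revision id.
        "script": {"path": str(script_path), "sha256": sha256_of(script_path)},
        "data": {str(p): sha256_of(p) for p in data_paths},
        "packages": package_versions(packages),
        "command_line": list(argv),
    }
```

A script might call `run_record(__file__, ["input.csv"], sys.argv, ["numpy"])` at start-up and write the result next to its outputs with `json.dump`; the file and package names here are placeholders.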
9. Record All
Automate All
Contain All
Access All
Findable (Citable)
Accessible (Trackable)
Interoperable (Intelligible)
Reusable (Reproducible)
10. design
cherry picking data, random seed
reporting, non-independent bias, poor
positive and negative controls, dodgy
normalisation, arbitrary cut-offs,
premature data triage, un-validated
materials, improper statistical analysis,
poor statistical power, stop when “get to
the right answer”, software
misconfigurations, misapplied black box
software
reporting
incomplete reporting of software configurations, parameters & resource
versions, missed steps, missing data, vague methods, missing software
Empirical Statistical Computational
V. Stodden, IMS Bulletin (2013)
Reproducibility and reliability of biomedical
research: improving research practice
https://www.sciencenews.org/article/12-reasons-research-goes-wrong
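One of the design problems named above, random seed reporting, has a simple computational remedy: pass the seed explicitly and return it with the result. The sketch below assumes a bootstrap estimate of a mean as the stochastic analysis; the function name `bootstrap_mean` and its parameters are illustrative only.

```python
import random


def bootstrap_mean(values, seed, n_resamples=1000):
    """Bootstrap estimate of the mean, with the seed reported in the result."""
    rng = random.Random(seed)  # private generator, no hidden global state
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original data.
        sample = [rng.choice(values) for _ in values]
        means.append(sum(sample) / len(sample))
    return {
        "seed": seed,
        "n_resamples": n_resamples,
        "estimate": sum(means) / len(means),
    }
```

Because the seed travels with the estimate, a reader can rerun the call with the same arguments and expect the same number on the same Python version.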
12. “When I use a word," Humpty Dumpty
said in rather a scornful tone, "it means
just what I choose it to mean - neither
more nor less.”
Carroll, Through the Looking Glass
re-compute
replicate
rerun
repeat
re-examine
repurpose
recreate
reuse
restore
reconstruct
review
regenerate
revise
recycle
redo
robustness
tolerance
verification
compliance
validation
assurance
remix
13. Scientific publications goals:
(i) announce a result
(ii) convince readers it is correct.
Papers in experimental science
should describe the results and
provide a clear enough protocol to
allow successful repetition and
extension.
Papers in computational science
should describe the results and
provide the complete software
development environment, data
and set of instructions which
generated the figures.
Virtual Witnessing*
*Leviathan and the Air-Pump: Hobbes, Boyle, and the
Experimental Life (1985) Shapin and Schaffer.
Jill Mesirov
David Donoho
15. Repeatability:
“Sameness”
Same result
1 Lab
1 experiment
Reproducibility:
“Similarity”
Similar result
> 1 Lab
> 1 experiment
why the differences?
https://2016-oslo-repeatability.readthedocs.org/en/latest/repeatability-discussion.html
Validate
Verify
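The distinction between "sameness" and "similarity" can be made concrete when results are numeric. This is a minimal sketch under stated assumptions: results are flat lists of numbers, and the 5% relative tolerance is an arbitrary placeholder that each study would have to justify for itself.

```python
import math


def is_repeat(original, rerun):
    """Repeatability: the rerun gives exactly the same numbers."""
    return list(original) == list(rerun)


def is_reproduction(original, independent, rel_tol=0.05):
    """Reproducibility: an independent run gives similar numbers.

    rel_tol is a hypothetical threshold, not a recommended value.
    """
    original, independent = list(original), list(independent)
    return len(original) == len(independent) and all(
        math.isclose(a, b, rel_tol=rel_tol)
        for a, b in zip(original, independent)
    )
```

When `is_reproduction` passes but `is_repeat` fails, the interesting question from the slide remains: why the differences?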
16. Method Reproducibility
the provision of enough detail about
study procedures and data so, in
theory or in actuality, the same
procedures could be exactly
repeated.
Result Reproducibility
(aka replicability)
obtaining the same results from the
conduct of an independent study
whose procedures are as closely
matched to the original experiment
as possible
Goodman, et al. Science Translational Medicine 8 (341) 2016
Validate
Verify
17. What are you reproducing?
Algorithm vs its script conflation
Methods
techniques, algorithms,
spec. of the steps, models
Materials
datasets, parameters,
algorithm seeds
Instruments
codes, services, scripts,
underlying libraries,
workflows, ref datasets
Laboratory
sw and hw infrastructure,
systems software,
integrative platforms
computational environment
21. Black boxes
• closed codes
• closed external or cloud
services
• method obscurity
• manual steps
[Thanks to Jason Scott]
22. The Reproducibility Window
all experiments become less reproducible over
time….
• Can’t contain everything
– Pesky Internet in a Box
• Can’t automate everything
– Pesky people intervening
• Can’t fix and fossilise everything
– Pesky science keeps changing
Results may vary
23. Bonus slide
At SSBSS Theodor Gescher came up with REALSCI
Robust - many runs
Environment - describe the equipment/OS
Another - done by a lab other than yours
Limits - parameters
Standards - well understood/comprehensible methods
Complete - not cherry picking
Immortal - community supported commodity systems
24. Mixed Central and Distributed stores:
Containment and Dependencies. Upload vs Referencing
In House Stores
External Databases
Publishing services
Model Resources
25. Mixed Central and Distributed stores:
Migrations into FAIRDOMHub
For long term reproducibility
26. Shades of Reproducibility
Running an active instrument
Reading an archived record
Are you using
hard-wired
localhost ids?
Workflows
SOPs
Containers, cloud services, common services
Markup languages,
reporting guidelines and
checklists, ontologies,
catalogues
Sounds hard….
what can I do?
Catalogue
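The question "Are you using hard-wired localhost ids?" can be checked mechanically before a script or workflow is shared. The sketch below is a rough heuristic only: the pattern list in `HARD_WIRED` is an assumption about what machine-specific references tend to look like, and it will both miss cases and raise false alarms.

```python
import re

# Hypothetical patterns for references that only resolve on one machine.
HARD_WIRED = re.compile(
    r"(localhost|127\.0\.0\.1|file://|/home/[\w.-]+|/Users/[\w.-]+|[A-Za-z]:\\)"
)


def find_hard_wired(text):
    """Return (line_number, matched_text) pairs for suspicious references."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in HARD_WIRED.finditer(line):
            hits.append((number, match.group(0)))
    return hits
```

Running `find_hard_wired(open("analysis.py").read())` on a script (the file name is a placeholder) gives a list of lines to replace with resolvable identifiers or configurable paths.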
27. Protocol specs and sharing…
A language for specifying
experimental protocols for
biological research in way that is
precise, unambiguous, and
understandable by both humans
and computers.
30. in situ reproducible models in FAIRDOM
metadata annotation against standards
validation, comparison and simulation
SBML Model simulation
Model comparison
Model versioning
Reproducing simulations
[Jacky Snoep, Dagmar Waltemath, Martin Peters, Martin Scharm]
JWS Online
32. Tracking model versions smartly
Scharm, M., Wolkenhauer, O., & Waltemath, D. (2015). An algorithm to detect and
communicate the differences in computational models describing biological
systems. Bioinformatics, btv484
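To give a feel for what "differences between model versions" means, here is a deliberately naive sketch that compares two versions of a model by their parameter values. It is not the algorithm of Scharm et al., which works on the structure of the model documents; the dictionaries and the function name `diff_parameters` are hypothetical.

```python
def diff_parameters(old, new):
    """Classify parameters as added, removed or changed between versions."""
    return {
        "added": {k: new[k] for k in new.keys() - old.keys()},
        "removed": {k: old[k] for k in old.keys() - new.keys()},
        "changed": {
            k: (old[k], new[k])
            for k in old.keys() & new.keys()
            if old[k] != new[k]
        },
    }
```

Even this crude report, stored with each new version, tells a later reader whether a changed figure could be explained by a changed parameter.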
34. A simulation database allows a one-click, live
figure reproduction in a FAIRDOM-SEEK
JWS model Excel data file
Dagmar Waltemath, Uni Rostock
Jacky Snoep, Uni Stellenbosch
Simulation Experiment Description Markup
Language: XML-based format for encoding
simulation setups, to ensure exchangeability and
reproducibility of simulation experiments
• which models to use in an experiment,
• modifications to apply on the models before using them,
• which simulation procedures to run on each model,
• what analysis results to output,
• and how the results should be presented.
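The five parts listed above map onto the top-level lists of a SED-ML document. The following sketch builds only a skeleton with Python's standard library to show that mapping. It is not schema-valid SED-ML: the namespace, data generator variables and maths, and plot curves are omitted, and all ids and values (`model1`, `k1`, the time course settings) are hypothetical.

```python
import xml.etree.ElementTree as ET


def sedml_skeleton(model_source):
    """Build an illustrative, incomplete SED-ML-like element tree."""
    root = ET.Element("sedML", level="1", version="3")

    # 1. Which models to use, and 2. modifications to apply before use.
    models = ET.SubElement(root, "listOfModels")
    model = ET.SubElement(
        models, "model", id="model1",
        language="urn:sedml:language:sbml", source=model_source,
    )
    changes = ET.SubElement(model, "listOfChanges")
    ET.SubElement(
        changes, "changeAttribute",
        target="/sbml:sbml/sbml:model/sbml:listOfParameters"
               "/sbml:parameter[@id='k1']/@value",
        newValue="0.5",
    )

    # 3. Which simulation procedure to run on each model.
    simulations = ET.SubElement(root, "listOfSimulations")
    ET.SubElement(
        simulations, "uniformTimeCourse", id="sim1", initialTime="0",
        outputStartTime="0", outputEndTime="100", numberOfPoints="1000",
    )
    tasks = ET.SubElement(root, "listOfTasks")
    ET.SubElement(
        tasks, "task", id="task1",
        modelReference="model1", simulationReference="sim1",
    )

    # 4. What analysis results to output.
    generators = ET.SubElement(root, "listOfDataGenerators")
    ET.SubElement(generators, "dataGenerator", id="time_gen", name="time")

    # 5. How the results should be presented.
    outputs = ET.SubElement(root, "listOfOutputs")
    ET.SubElement(outputs, "plot2D", id="plot1")
    return root
```

`ET.tostring(sedml_skeleton("model.xml"), encoding="unicode")` prints the skeleton; a real simulation description should be written and validated with dedicated SED-ML tooling.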
36. Model Technical curation for Journals
[Jacky Snoep (Stellenbosch), Dagmar Waltemath, Martin Peters, Martin Scharm (Rostock)]
* store DOI citable supplementary files on FAIRDOMHub
** model and data curation
*** reproducible clickable figures in papers using SED-ML
37. Cataloguing
Packaging
Penkler, G., du Toit, F., Adams,
W., Rautenbach, M., Palm, D. C.,
van Niekerk, D. D. and Snoep, J.
L. (2015), Construction and
validation of a detailed kinetic
model of glycolysis in
Plasmodium falciparum. FEBS J,
282: 1481–1511.
doi:10.1111/febs.13237
https://fairdomhub.org/investigations/56
DOI: 10.15490/seek.1.investigation.56
Snapshot
preservation
active
38.
An “evolving manuscript” would begin with a pre-
publication, pre-peer review “beta 0.9” version of an
article, followed by the approved published article itself, [
… ] “version 1.0”.
Subsequently, scientists would update this paper with
details of further work as the area of research develops.
Versions 2.0 and 3.0 might allow for the “accretion of
confirmation [and] reputation”.
Ottoline Leyser […] assessment criteria in science revolve
around the individual. “People have stopped thinking
about the scientific enterprise”.
http://www.timeshighereducation.co.uk/news/evolving-manuscripts-the-future-of-scientific-communication/2020200.article
39. Packaging: CombineArchive
https://sems.uni-rostock.de/projects/combinearchive/
Scharm M, Wendland F, Peters M, Wolfien M, Theile T, Waltemath D
SEMS, University of Rostock
zip-like file with a manifest & metadata
- Bundling files - Keeping provenance
- Exchanging data - Shipping results
Bergmann, F. T., Adams, R., Moodie, S., Cooper, J., Glont, M., Golebiewski, M., ... & Olivier, B. G. (2014). COMBINE archive and OMEX format:
one file to share all information to reproduce a modeling project. BMC Bioinformatics, 15(1), 1.
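The "zip-like file with a manifest" can be sketched with Python's standard `zipfile` module. This is an illustration of the idea, not a replacement for COMBINE tooling: the helper names are hypothetical, and the format identifiers are written as I understand them from the OMEX specification and should be checked against it before use.

```python
import zipfile
from xml.sax.saxutils import quoteattr

# Assumption: format identifiers share this prefix in the OMEX specification.
SPEC = "http://identifiers.org/combine.specifications/"


def build_manifest(entries):
    """Return manifest.xml text listing each (file name, format key) entry."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<omexManifest xmlns={quoteattr(SPEC + "omex-manifest")}>',
        f'  <content location="." format={quoteattr(SPEC + "omex")}/>',
        f'  <content location="./manifest.xml" '
        f'format={quoteattr(SPEC + "omex-manifest")}/>',
    ]
    for name, kind in entries:
        lines.append(
            f'  <content location={quoteattr("./" + name)} '
            f'format={quoteattr(SPEC + kind)}/>'
        )
    lines.append("</omexManifest>")
    return "\n".join(lines) + "\n"


def write_archive(target, files):
    """Write files (name -> (format key, bytes)) plus a manifest to a zip."""
    manifest = build_manifest([(name, kind) for name, (kind, _) in files.items()])
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", manifest)
        for name, (_, payload) in files.items():
            archive.writestr(name, payload)
```

For example, `write_archive("project.omex", {"model.xml": ("sbml", model_bytes), "experiment.sedml": ("sed-ml", sedml_bytes)})` would bundle a model with its simulation description; the file names and variables are placeholders.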
40. Standards-based metadata framework for
bundling (scattered) resources with context and citation
Packaging:
Research Objects
http://researchobject.org
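The idea of bundling scattered resources with context can be sketched as a small manifest that aggregates some files by inclusion and others by reference. The field names below (`aggregates`, `role`, `embedded`) are hypothetical and chosen for illustration; they are not taken from the Research Object specifications at researchobject.org.

```python
import json


def research_object(title, creators, resources):
    """Describe a bundle of (uri, role) resources, local or remote."""
    return {
        "title": title,
        "creators": list(creators),
        "aggregates": [
            {
                "uri": uri,
                "role": role,
                # Anything without a web address is assumed to be shipped
                # inside the bundle; the rest is only referenced.
                "embedded": not uri.startswith(("http://", "https://")),
            }
            for uri, role in resources
        ],
    }


def to_json(manifest):
    """Serialise the manifest so it can travel with the bundle."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Referenced resources keep the bundle small but tie its reproducibility to the lifetime of the external service, which is the upload versus referencing trade-off raised earlier.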
43. From Virtual Machines to Executable Containers
for portable execution
• Containers: everything required to make a piece of
software run is packaged into isolated containers.
• Unlike VMs, containers do not bundle a full operating
system - only libraries and settings required to make
the software work.
• Efficient, lightweight, self-contained systems
• Guarantees that software will always run the same,
regardless of where it’s deployed.
https://www.software.ac.uk/c4rr/ https://biocontainers.pro/
Biocontainers
44. Use commodity and community systems
Sustained platforms
Communities to drive them
Tooling and training
Spreadsheets are the Cockroaches of Science
45. EU FAIR Data Expert Group Consultation
https://github.com/FAIR-Data-
EG/consultation/issues
46. Want to know more?
Go on a Software or Data Carpentry Course
https://tess.elixir-europe.org
48. Software Sustainability Institute,
http://www.software.ac.uk
Goble, Better Software Better Research
IEEE Internet Computing 18(5), (2014)
DOI: 10.1109/MIC.2014.88
Jiménez RC, Kuzak M, Alhamdoosh M et al.
Four simple recommendations to
encourage best practices in research
software [version 1; referees: 3 approved].
F1000Research 2017, 6:876 (doi:
10.12688/f1000research.11407.1)
49. Use Common
Platforms
Get the licensing
right…
MATLAB
Mathematica….
Proprietary
software
Cloud Centralised Service
in situ reproducibility….
Galaxy
FAIRDOMHub + JWS Online
Blackbox vs
Whitebox
51. Use a workflow – the vision!
preferably a workflow management system
preferably described using Common Workflow Language
Experimental
workflows
Event BUS Business Process Management
Taverna Knime Galaxy
Workflow
BPM layer
Workflow
Computation
Application
layer
Computing resources Databases
Effector
layer
Front-end
Web interface / Monitoring interface
Pipeline
Pilot
FAIRDOM SEEK
Workflow repository
Workflow portal
repository
launch, results
FAIRDOM
[Jean Loup Fallon, Carole Goble]
54. What can you do?
• Follow the 10 RACA Principles
• Take action, be imperfect
• Demand reproducibility in reviews.
• Educate your PIs and supervisors.
56. What are the incentives?
[Garza] [Malone] [Resnik]
57. Acknowledgements
• David De Roure
• Tim Clark
• Sean Bechhofer
• Robert Stevens
• Christine Borgman
• Victoria Stodden
• Marco Roos
• Jose Enrique Ruiz del Mazo
• Oscar Corcho
• Ian Cottam
• Steve Pettifer
• Magnus Rattray
• Chris Evelo
• Katy Wolstencroft
• Robin Williams
• Pinar Alper
• C. Titus Brown
• Greg Wilson
• Kristian Garza
• Juliana Freire
• Jill Mesirov
• Simon Cockell
• Paolo Missier
• Paul Watson
• Gerhard Klimeck
• Matthias Obst
• Jun Zhao
• Daniel Garijo
• Yolanda Gil
• James Taylor
• Alex Pico
• Sean Eddy
• Cameron Neylon
• Barend Mons
• Kristina Hettne
• Stian Soiland-Reyes
• Rebecca Lawrence
• Michael Crusoe
58. Jon Olav Vik,
Norwegian University of Life Sciences
Maksim Zakhartsev
University Hohenheim, Stuttgart,
Germany
Alexey Kolodkin
Siberian Branch
Russian Academy of Sciences
Tomasz Zieliński,
SynthSys Centre
University Edinburgh, UK
Martin Peters, Martin Scharm
Systems Biology Bioinformatics
University of Rostock, Germany
59. Web sites
• Force11 http://www.force11.org
• TeSS https://tess.elixir-europe.org
• FAIRDOM http://www.fair-dom.org
• FAIRDOMHub http://www.fairdomhub.org
• Software Carpentry http://software-carpentry.org
• Data Carpentry http://datacarpentry.org
• Software Sustainability Institute http://www.software.ac.uk
• Rightfield http://www.rightfield.org.uk
• FAIRSharing http://www.fairsharing.org
• Common Workflow Language http://commonwl.org/
60. Reading List (refs also throughout)
• John P. A. Ioannidis, How to Make More Published Research True, October 21, 2014, DOI:
10.1371/journal.pmed.1001747
• Ioannidis JPA (2005) Why Most Published Research Findings Are False. PLoS Med 2(8): e124.
doi:10.1371/journal.pmed.0020124
• Steven N. Goodman, Daniele Fanelli and John P. A. Ioannidis, What does research reproducibility mean? Science
Translational Medicine 01 Jun 2016: Vol. 8, Issue 341, pp. 341ps12, DOI: 10.1126/scitranslmed.aaf5027
• Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research.
PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285
• Massimiliano Assante, Leonardo Candela, Donatella Castelli, Paolo Manghi and Pasquale Pagano, Science 2.0
Repositories: Time for a Change in Scholarly Communication, D-Lib Magazine January/February 2015, Volume 21,
Number 1/2, DOI: 10.1045/january2015-assante
• Waltemath, D., Henkel, R., Hälke, R., Scharm, M., & Wolkenhauer, O. (2013). Improving the reuse of
computational models through version control. Bioinformatics, 29(6), 742-748.
• Bergmann, F. T., Adams, R., Moodie, S., Cooper, J., Glont, M., Golebiewski, M., ... & Olivier, B. G. (2014).
COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project. BMC
Bioinformatics, 15(1), 1.
• Scharm, M., Wolkenhauer, O., & Waltemath, D. (2015). An algorithm to detect and communicate the differences
in computational models describing biological systems. Bioinformatics, btv484
• http://www.reuters.com/article/2012/03/28/us-science-cancer-idUSBRE82R12P20120328
• http://www.acmedsci.ac.uk/policy/policy-projects/reproducibility-and-reliability-of-biomedical-research/