Scott Edmunds at International Data Week 2022: A decade's experiences in transparent and interactive publication of FAIR data and software via an end-to-end XML publishing platform. 21st June 2022
1. A decade's experiences in transparent and interactive
publication of FAIR data and software via an end-to-end XML
publishing platform
Scott Edmunds 0000-0001-6444-1436
3. GigaSolution: rewarding open data & code
http://gigasciencejournal.com/
Publishes “Data Notes” for CC0 data, “Tech Notes” for OSI software.
Transparent: Open Peer Review and linked to preprints. Mandates code in repo.
4. Integrated GigaDB repository: DataCite DOIs, no size limits, code snapshots, APC covers curation
http://gigadb.org/
9. Independent execution of computations underlying research articles.
Experience publishing CODECHECK: 2020
CODECHECK tackles one of the main challenges of computational research by supporting
codecheckers with a workflow, guidelines and tools to evaluate computer programs
underlying scientific papers. The independent time-stamped runs conducted by
codecheckers will award a “certificate of executable computation” and increase availability,
discovery and reproducibility of crucial artefacts for computational sciences.
https://codecheck.org.uk/
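As a rough illustration of what an "independent time-stamped run" could record, the sketch below re-executes a computation and compares a hash of its output against a reported value. It is not CODECHECK's own tooling: the `analysis` function, the `check_run` helper, and the record fields are all hypothetical names chosen for this example.

```python
import hashlib
from datetime import datetime, timezone


def analysis() -> str:
    """Hypothetical stand-in for the computation behind a paper's result."""
    values = [2.0, 4.0, 6.0]
    return f"mean={sum(values) / len(values):.3f}"


def check_run(compute, expected_sha256: str) -> dict:
    """Re-run a computation and return a time-stamped comparison record."""
    output = compute()
    digest = hashlib.sha256(output.encode("utf-8")).hexdigest()
    return {
        "checked_at": datetime.now(timezone.utc).isoformat(),  # UTC timestamp of this run
        "output_sha256": digest,
        "matches_reported": digest == expected_sha256,
    }


# Digest the authors would have reported alongside their result (illustrative value).
reported = hashlib.sha256(b"mean=4.000").hexdigest()
record = check_run(analysis, reported)
print(record["matches_reported"])
```

A real check would run the authors' actual code in a clean environment; hashing a text output is only the simplest possible comparison.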
12. Lessons learned in a decade of data & software publishing:
• Tech really the bottleneck
• Process much too slow & expensive
• Still too focused on narrative and static “version of record”
• Still not very FAIR
13. A new approach
DATA, CODE, ENTITIES, FACTS, STABILITY
Follow the Software Paradigm?
CODE, RELEASE, FORK, UPDATE, REPEAT
Deconstruct the “Version of Record”?
14. Move to new XML end-to-end pipeline
Custom end-to-end workflow makes integrations simpler with one integration point
15. Features of new journal:
Main advantage of workflow is XML from start to end
https://gigabytejournal.com/
• Several modules acting as one platform: no import/export of files, so fast and accurate
• Cutting out production allows huge time & cost saving (currently as little as 3.5 hrs per paper)
• Any number of versions can be published instantly, including typographic quality PDF or updates/forks
• Allows instantaneous switch of views
• Leverage embeddable dynamic content/widgets
• Initial focus on forkable open source products: data + software + update papers
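To illustrate why a single XML source makes switching views cheap, here is a minimal sketch that renders one small XML document into two different views. The element names (`article`, `sec`, `title`, `p`) and both rendering functions are assumptions made for this example, not the journal's actual schema or platform code.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, heavily simplified article XML (not the journal's real schema).
ARTICLE_XML = """<article>
  <title>Example data paper</title>
  <sec><title>Background</title><p>Why the data were collected.</p></sec>
  <sec><title>Data</title><p>What the dataset contains.</p></sec>
</article>"""


def to_html(root: ET.Element) -> str:
    """Render the full article as simple HTML."""
    parts = [f"<h1>{escape(root.findtext('title'))}</h1>"]
    for sec in root.findall("sec"):
        parts.append(f"<h2>{escape(sec.findtext('title'))}</h2>")
        parts.extend(f"<p>{escape(p.text or '')}</p>" for p in sec.findall("p"))
    return "\n".join(parts)


def to_outline(root: ET.Element) -> str:
    """Render only the article title and section headings as plain text."""
    lines = [root.findtext("title")]
    lines += [f"  - {sec.findtext('title')}" for sec in root.findall("sec")]
    return "\n".join(lines)


root = ET.fromstring(ARTICLE_XML)
print(to_outline(root))
print(to_html(root))
```

Because both views are derived from the same parsed tree, publishing an update means changing the XML once and regenerating every view, with no file import/export step in between.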
16. Focusing beyond VoR allows different views…
What does focusing on Data + software + XML allow us to do?
https://doi.org/10.46471/gigabyte.1
17. Thinking about users: machines
https://doi.org/10.46471/gigabyte.6
Helping to make research “AI-ready”
High quality rich XML
CC-BY open licensed, open citations, open corpus
Structured schema.org metadata
No hiding of material in supplemental files
Maximise use of persistent identifiers (PIDs)
Who: ORCID IDs, CASRAI contributorship, Funder (FundRef), Institution (ROR)
What: Species (NCBI, FishBase), Cell/strain (RRID)
How: Equipment (RRID), Software (RRID, bio.tools)
Output: Data (accessions, DOIs), Results (DOIs)
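A small sketch of what machine-readable metadata built on PIDs might look like: it validates an ORCID iD using the ISO 7064 MOD 11-2 check digit that ORCID iDs carry, then emits a schema.org record as JSON-LD. The `article_jsonld` helper, the chosen fields, and the DOI `10.1234/example` are placeholders for illustration, not the journal's actual metadata output; the ORCID iD is the one given on the title slide.

```python
import json


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the shape and check digit of an ORCID iD such as 0000-0001-6444-1436."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


def article_jsonld(doi: str, author_name: str, author_orcid: str) -> dict:
    """Build a minimal schema.org record that identifies things by PID."""
    if not is_valid_orcid(author_orcid):
        raise ValueError(f"invalid ORCID iD: {author_orcid}")
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "@id": f"https://doi.org/{doi}",
        "author": {
            "@type": "Person",
            "name": author_name,
            "@id": f"https://orcid.org/{author_orcid}",
        },
    }


# The DOI below is a placeholder, not a real article.
record = article_jsonld("10.1234/example", "Scott Edmunds", "0000-0001-6444-1436")
print(json.dumps(record, indent=2))
```

Validating identifiers before they enter the metadata is one cheap way to keep PID links resolvable for machine readers.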
18. Interaction: increasing understanding & trust
https://doi.org/10.46471/gigabyte.13
Do you trust an immunoinformatics tool to predict whether memory T cells generated from
previous exposure to common cold coronaviruses are cross-reactive against SARS-CoV-2?
19. Interaction: software and code via Stencila and CodeOcean
http://gigasciencejournal.com/blog/gigabyte-executable-research-articles/
Code Ocean “Compute Capsule”: readers can directly interact with software via an embedded version in the article; or deploy and run in their own cloud computing environment.
Popout Stencila “Executable Research Article” where figures are accompanied by editable code blocks that can be edited and re-executed to immediately see the changes.
20. Interact with Stenci.la “code chunks” & Code Ocean “compute
capsules” of COVID-19 immunoinformatics papers
https://doi.org/10.46471/gigabyte.13
21. A new way of publishing FAIR research with new tech
• Share & get credit for updatable data & software papers
• Follow the software paradigm, bring your research to life
• XML makes it much easier to embed interactive content
• Use automation & interaction to increase scrutiny & trust
• XML only workflow cuts time and cost to publish
• Rethink “Version of Record”: focus on facts/data/code &
discard the packaging
Help us change scientific publishing, contact: editorial@gigabytejournal.com
https://gigabytejournal.com/
22. Thanks to:
Follow us:
@GigaByteJournal
facebook.com/GigaScience
http://gigasciencejournal.com/blog/
+ Weibo & WeChat
Laurie Goodman, Publisher
Nicole Nogoy, Editor
Hans Zauner, Assistant Editor
Hongling Zhao, Assistant Editor
Peter Li, Head of IT
Chris Hunter, Lead BioCurator
Chris Armit, Data Scientist
Mary Ann Tulli, Data Editor
Rija Ménagé, Senior Software Engineer
Ken Cho, Systems Programmer Analyst
Chen Qi, Shenzhen Office.
https://gigabytejournal.com/
editorial@gigabytejournal.com