Reproducible genomics analysis pipelines with GNU Guix
R. Wurmus, B. Uyar, B. Osberg, V. Franke, A. Gosdschan, K. Wreczycka, J. Ronen, A. Akalin
https://doi.org/10.1093/gigascience/giy123
Notebook
a = 10 ml, b = 30 ml
Supplier: ACME
Temp: 22 deg C
To repeat an experiment we first need to reproduce its environment.
How hard could this possibly be?
[Figure: package dependency graph, with nodes including bootstrap-binaries-0, gcc-bootstrap-0, glibc-bootstrap-0, guile-bootstrap-2.0, binutils-2.25.1, gcc-4.9.3, glibc-2.22, bash-4.3.42, perl-5.22.1, and coreutils-8.24]
Very.
Containers to the rescue?
Containers lack transparency
strawberry? whale oil?
Design goal 1: Automate genomics analyses
RNAseq, ChIPseq, single cell, BSseq
PiGx ChIPseq
Improve read quality: Trim-Galore
Align reads: Bowtie2
Call peaks: MACS2
ChIP QC & reproducibility: ChIPQC + IDR
Peak annotation: genomation
Compute read coverage: R scripts
Check sequencing quality: FastQC
Pan-sample quality check: MultiQC
Design goal 2: Simple user interface
Settings, Sample sheet
interactive reports, browser tracks, alignments, QC reports, sample clustering
Design goal 3: Easy to install reproducibly
guix package --install pigx
Reproducible package manager
Full environment declarations
Builds software in isolation
source / binary transparency
Pack an application bundle
higher-order source description
lower-level binary application bundles
Status
[Chart: reproducibility status (reproducible, minor problems, not reproducible) for all pipelines and for PiGx BSseq, PiGx ChIPseq, PiGx RNAseq, and PiGx scRNAseq; value labels: 90% and ~98%]
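The status categories above depend on some way of deciding whether a build is reproducible. The slides do not say how the figures were measured. One common criterion, assumed here, is that two independent builds of the same package produce bit-identical output. A minimal Python sketch of that comparison follows; the function names are hypothetical and in-memory bytes stand in for real build outputs.

```python
import hashlib


def digest(output: bytes) -> str:
    """Return the SHA-256 hex digest of one build output."""
    return hashlib.sha256(output).hexdigest()


def reproducible_share(builds):
    """Percentage of packages whose two builds are bit-identical.

    `builds` maps a package name to a pair (first_output, second_output).
    This layout is an assumption made for illustration only.
    """
    if not builds:
        return 0.0
    same = sum(
        1 for first, second in builds.values()
        if digest(first) == digest(second)
    )
    return 100.0 * same / len(builds)


# Hypothetical data: one package rebuilds identically, one embeds a timestamp.
example = {
    "pkg-a": (b"same bytes", b"same bytes"),
    "pkg-b": (b"built at 10:00", b"built at 10:05"),
}
share = reproducible_share(example)
```

With this made-up data, one of the two packages rebuilds identically, so `share` is 50.0.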
1. Constrain software variables
2. Containers are not transparent (smoothies)
3. Guix builds software reproducibly and transparently
4. PiGx shows that Guix makes reproducibility easy
5. PiGx brings analysis to non-bioinformaticians
Learn more
http://bioinformatics.mdc-berlin.de/pigx/
https://hpc.guixsd.org
https://gnu.org/s/guix
Let’s talk!
#guix on irc.freenode.net
ricardo.wurmus@mdc-berlin.de
PiGx RNAseq
Improve read quality: Trim-Galore
Align reads: STAR
Quantify expression: STAR / Salmon
Analyze differential expression: DESeq2
Find enriched GO terms: g:ProfileR
Compute read coverage: Bedtools
Check sequencing quality: FastQC
Pan-sample quality check: MultiQC
PiGx BSseq
Improve read quality: Trim-Galore
Align reads: Bismark
Call methylation: methylkit
Differential methylation: methylkit
Annotate DMRs and segments: genomation
Check sequencing quality: FastQC
Pan-sample quality check: MultiQC
Methylation segmentation: methylkit
PiGx single cell RNAseq
Improve read quality: Trim-Galore
Align reads: STAR
Determine cell number: Dropbead
Dropout rate and QC: Scater
Dimension reduction: tSNE + PCA
Compute read coverage: Bedtools
Check sequencing quality: FastQC
Pan-sample quality check: MultiQC
headers, sources, build tools, libraries, ...
cabba9e-samtools-1.7/
  bin/
    samtools
  lib/
  ...
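The directory name `cabba9e-samtools-1.7/` pairs a hash prefix with the package name and version. In Guix that prefix is derived from everything that went into the build (headers, sources, build tools, libraries). The Python sketch below illustrates the idea in simplified form and is not Guix's actual hashing scheme. The function `store_name`, the JSON encoding, the 7-character prefix, and the input lists are all hypothetical.

```python
import hashlib
import json


def store_name(name, version, inputs):
    """Derive a directory name from a package's full declaration.

    Assumption: the declaration is just name, version, and a list of
    input identifiers. A real package description carries far more.
    """
    declaration = {"name": name, "version": version, "inputs": sorted(inputs)}
    canonical = json.dumps(declaration, sort_keys=True)
    prefix = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:7]
    return f"{prefix}-{name}-{version}"


# Same declaration, inputs listed in a different order: same name.
a = store_name("samtools", "1.7", ["zlib-1.2.8", "ncurses-6.0"])
b = store_name("samtools", "1.7", ["ncurses-6.0", "zlib-1.2.8"])

# One input changed: a different name, so both builds can coexist.
c = store_name("samtools", "1.7", ["zlib-1.2.11", "ncurses-6.0"])
```

Because any change to an input yields a different directory name, several variants of the same tool can sit side by side without overwriting each other.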

Ricardo Wurmus: Reproducible genomics analysis pipelines with GNU Guix