SlideShare una empresa de Scribd logo
1 de 21
Taming Snakemake
1/27/14
Why Make?


What are Make's advantages (over Perl and shell scripts)?


Make forces you to think about file transformation in terms of inputs and outputs, recipes and rules. In Perl you are forced to think at the level of
variables, conditionals, and loops. In Shell you are forced to think like a caveman.



Unfortunately, bioinformatics is still largely about files and their suffixes. Make has a very powerful syntax based almost entirely around file suffixes.



Make knows what's been made and what hasn't. Make can be interrupted and restarted safely, and without overwriting finished work.



Make knows what's changed and what hasn't. If an input is newer than an output, it will attempt to rebuild the output.



Make allows you to add new input files without worrying about overwriting old ones.



Make is well supported. There are 1333 Make questions on SO alone.



When people see a Makefile, they immediately know how to run it.



Make does not force you to wrap shell statements in quotes.



Make is a DSL. It will attempt to validate your syntax.



Make is ancient, ubiquitous, and reliable.



Make can parallelize with --jobs.



Make recipes encourage reuse.

https://share.chop.edu/pages/viewpage.action?pageId=138478819
Make review

http://github.research.chop.edu/BiG/err_chip_seq/blob/master/Makefile
Pipelines and Workflows
Other pipelines
Ruffus

Queue

GKNO
Why Snakemake?
 Addresses Makefile weaknesses without
throwing out the good stuff
 Difficult to implement control flow
 No cluster support
 Inflexible wildcards
 Too much reliance on sentinal files
 No reporting mechanism

Johannes Köster
Syntax
Make
Variables

Targets

Rules

Snakemake
Utilities
 Logs - wire them up manually

 Cluster support pretty decent
source /nas/is1/leipzig/martin/variome-env/bin/activate
snakemake --directory /nas/is1/leipzig/martin/snake-env --snakefile /nas/is1/leipzig/martin/snake-env/Snakefile -c qsub -j
16
source /mnt/isilon/cbmi/variome/leipzig/martin/respublica-env/bin/activate
snakemake --directory /mnt/isilon/cbmi/variome/leipzig/martin/snake-env --snakefile /mnt/isilon/cbmi/variome/leipzig/martin/snake-env/Snakefile -c qsub -j
16

 Cores/jobs/resources
Useful stuff
 dry-runs
 keep-going
 touch
 version changes
 workflow diagrams
Python legal
Client websites with Jekyll
 Jekyll is a templating engine for blogs that
accepts Markdown
 Layouts use the Liquid markup

http://mitomap.org/martin-rnaseq/
A workflow that reports itself
Avoiding Sweave-Hell
The bad way
Cache-ing chunks?
Avoiding Sweave-Hell
Avoiding Sweave-Hell
R/Snakemake integration
git submodule add git@github.research.chop.edu:BiG/rna-seq-common-functions.git common/rna-seq
Leave a paper trail
Reproducible Checklist
 repository github.research.chop.edu
 workflow of some kind from beginning to end
 website at mybic.chop.edu
Ties that bind

Más contenido relacionado

La actualidad más candente

Sciunits: Resuable Research Object
Sciunits: Resuable Research Object Sciunits: Resuable Research Object
Sciunits: Resuable Research Object Tanu Malik
 
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGIWhole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGIPhil Ewels
 
Introduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-SeqIntroduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-SeqEnis Afgan
 
Lichtenberg bosc2010 wordseeker
Lichtenberg bosc2010 wordseekerLichtenberg bosc2010 wordseeker
Lichtenberg bosc2010 wordseekerBOSC 2010
 
London bosc2010
London bosc2010London bosc2010
London bosc2010BOSC 2010
 
Galaxy RNA-Seq Analysis: Tuxedo Protocol
Galaxy RNA-Seq Analysis: Tuxedo ProtocolGalaxy RNA-Seq Analysis: Tuxedo Protocol
Galaxy RNA-Seq Analysis: Tuxedo ProtocolHong ChangBum
 
Performance improvement techniques for software distributed shared memory
Performance improvement techniques for software distributed shared memoryPerformance improvement techniques for software distributed shared memory
Performance improvement techniques for software distributed shared memoryZongYing Lyu
 
Next-generation sequencing format and visualization with ngs.plot
Next-generation sequencing format and visualization with ngs.plotNext-generation sequencing format and visualization with ngs.plot
Next-generation sequencing format and visualization with ngs.plotLi Shen
 
NGS data formats and analyses
NGS data formats and analysesNGS data formats and analyses
NGS data formats and analysesrjorton
 
Bic bellec 2016_small
Bic bellec 2016_smallBic bellec 2016_small
Bic bellec 2016_smallPierre Bellec
 
Closing the Gap in Time: From Raw Data to Real Science
Closing the Gap in Time: From Raw Data to Real ScienceClosing the Gap in Time: From Raw Data to Real Science
Closing the Gap in Time: From Raw Data to Real ScienceJustin Johnson
 
The Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesThe Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesReece Hart
 
Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...
Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...
Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...Anne Nicolas
 
Variability, Bugs, and Cognition
Variability, Bugs, and CognitionVariability, Bugs, and Cognition
Variability, Bugs, and CognitionAndrzej Wasowski
 
Architecture of the oasis mobile shared virtual memory system
Architecture of the oasis mobile shared virtual memory systemArchitecture of the oasis mobile shared virtual memory system
Architecture of the oasis mobile shared virtual memory systemZongYing Lyu
 
An Empirical Study on Inconsistent Changes to Code Clones at Release Level
An Empirical Study on Inconsistent Changes to Code Clones at Release LevelAn Empirical Study on Inconsistent Changes to Code Clones at Release Level
An Empirical Study on Inconsistent Changes to Code Clones at Release LevelNicolas Bettenburg
 
Big data solution for ngs data analysis
Big data solution for ngs data analysisBig data solution for ngs data analysis
Big data solution for ngs data analysisYun Lung Li
 

La actualidad más candente (20)

Sciunits: Resuable Research Object
Sciunits: Resuable Research Object Sciunits: Resuable Research Object
Sciunits: Resuable Research Object
 
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGIWhole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
 
Introduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-SeqIntroduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-Seq
 
Lichtenberg bosc2010 wordseeker
Lichtenberg bosc2010 wordseekerLichtenberg bosc2010 wordseeker
Lichtenberg bosc2010 wordseeker
 
London bosc2010
London bosc2010London bosc2010
London bosc2010
 
Galaxy RNA-Seq Analysis: Tuxedo Protocol
Galaxy RNA-Seq Analysis: Tuxedo ProtocolGalaxy RNA-Seq Analysis: Tuxedo Protocol
Galaxy RNA-Seq Analysis: Tuxedo Protocol
 
Software Dev
Software DevSoftware Dev
Software Dev
 
Performance improvement techniques for software distributed shared memory
Performance improvement techniques for software distributed shared memoryPerformance improvement techniques for software distributed shared memory
Performance improvement techniques for software distributed shared memory
 
Next-generation sequencing format and visualization with ngs.plot
Next-generation sequencing format and visualization with ngs.plotNext-generation sequencing format and visualization with ngs.plot
Next-generation sequencing format and visualization with ngs.plot
 
NGS data formats and analyses
NGS data formats and analysesNGS data formats and analyses
NGS data formats and analyses
 
ChipSeq Data Analysis
ChipSeq Data AnalysisChipSeq Data Analysis
ChipSeq Data Analysis
 
Bic bellec 2016_small
Bic bellec 2016_smallBic bellec 2016_small
Bic bellec 2016_small
 
Nephele 2.0: How to get the most out of your Nephele results
Nephele 2.0: How to get the most out of your Nephele resultsNephele 2.0: How to get the most out of your Nephele results
Nephele 2.0: How to get the most out of your Nephele results
 
Closing the Gap in Time: From Raw Data to Real Science
Closing the Gap in Time: From Raw Data to Real ScienceClosing the Gap in Time: From Raw Data to Real Science
Closing the Gap in Time: From Raw Data to Real Science
 
The Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesThe Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment Discrepancies
 
Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...
Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...
Kernel Recipes 2016 - Would an ABI changes visualization tool be useful to Li...
 
Variability, Bugs, and Cognition
Variability, Bugs, and CognitionVariability, Bugs, and Cognition
Variability, Bugs, and Cognition
 
Architecture of the oasis mobile shared virtual memory system
Architecture of the oasis mobile shared virtual memory systemArchitecture of the oasis mobile shared virtual memory system
Architecture of the oasis mobile shared virtual memory system
 
An Empirical Study on Inconsistent Changes to Code Clones at Release Level
An Empirical Study on Inconsistent Changes to Code Clones at Release LevelAn Empirical Study on Inconsistent Changes to Code Clones at Release Level
An Empirical Study on Inconsistent Changes to Code Clones at Release Level
 
Big data solution for ngs data analysis
Big data solution for ngs data analysisBig data solution for ngs data analysis
Big data solution for ngs data analysis
 

Similar a Taming Snakemake

🐲 Here be Stacktraces — Flink SQL for Non-Java Developers
🐲 Here be Stacktraces — Flink SQL for Non-Java Developers🐲 Here be Stacktraces — Flink SQL for Non-Java Developers
🐲 Here be Stacktraces — Flink SQL for Non-Java DevelopersHostedbyConfluent
 
Makefile for python projects
Makefile for python projectsMakefile for python projects
Makefile for python projectsMpho Mphego
 
Katello on TorqueBox
Katello on TorqueBoxKatello on TorqueBox
Katello on TorqueBoxlzap
 
Ruby on Rails (RoR) as a back-end processor for Apex
Ruby on Rails (RoR) as a back-end processor for Apex Ruby on Rails (RoR) as a back-end processor for Apex
Ruby on Rails (RoR) as a back-end processor for Apex Espen Brækken
 
Compile ahead of time. It's fine?
Compile ahead of time. It's fine?Compile ahead of time. It's fine?
Compile ahead of time. It's fine?Dmitry Chuyko
 
Iptablesrocks
IptablesrocksIptablesrocks
Iptablesrocksqwer_asdf
 
Functional programming is the most extreme programming
Functional programming is the most extreme programmingFunctional programming is the most extreme programming
Functional programming is the most extreme programmingsamthemonad
 
ABRIDGED VERSION - Joys & frustrations of putting 34,000 lines of Haskell in...
 ABRIDGED VERSION - Joys & frustrations of putting 34,000 lines of Haskell in... ABRIDGED VERSION - Joys & frustrations of putting 34,000 lines of Haskell in...
ABRIDGED VERSION - Joys & frustrations of putting 34,000 lines of Haskell in...Saurabh Nanda
 
Sparklife - Life In The Trenches With Spark
Sparklife - Life In The Trenches With SparkSparklife - Life In The Trenches With Spark
Sparklife - Life In The Trenches With SparkIan Pointer
 
Developing OpenResty Framework
Developing OpenResty FrameworkDeveloping OpenResty Framework
Developing OpenResty FrameworkOpenRestyCon
 
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...NETWAYS
 
Webinar: Learn Perl - The Jewel of Scripting Languages
Webinar: Learn Perl - The Jewel of Scripting LanguagesWebinar: Learn Perl - The Jewel of Scripting Languages
Webinar: Learn Perl - The Jewel of Scripting LanguagesEdureka!
 
Concurrent Programming with Ruby and Tuple Spaces
Concurrent Programming with Ruby and Tuple SpacesConcurrent Programming with Ruby and Tuple Spaces
Concurrent Programming with Ruby and Tuple Spacesluccastera
 
Le PERL est mort
Le PERL est mortLe PERL est mort
Le PERL est mortapeiron
 
Lecture1: NGS Analysis on Beocat and an introduction to Perl programming for ...
Lecture1: NGS Analysis on Beocat and an introduction to Perl programming for ...Lecture1: NGS Analysis on Beocat and an introduction to Perl programming for ...
Lecture1: NGS Analysis on Beocat and an introduction to Perl programming for ...Jennifer Shelton
 
The computer science behind a modern disributed data store
The computer science behind a modern disributed data storeThe computer science behind a modern disributed data store
The computer science behind a modern disributed data storeJ On The Beach
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala InternalsDavid Groozman
 
Introduction to Clojure
Introduction to ClojureIntroduction to Clojure
Introduction to ClojureRenzo Borgatti
 

Similar a Taming Snakemake (20)

🐲 Here be Stacktraces — Flink SQL for Non-Java Developers
🐲 Here be Stacktraces — Flink SQL for Non-Java Developers🐲 Here be Stacktraces — Flink SQL for Non-Java Developers
🐲 Here be Stacktraces — Flink SQL for Non-Java Developers
 
Makefile for python projects
Makefile for python projectsMakefile for python projects
Makefile for python projects
 
Katello on TorqueBox
Katello on TorqueBoxKatello on TorqueBox
Katello on TorqueBox
 
Ruby on Rails (RoR) as a back-end processor for Apex
Ruby on Rails (RoR) as a back-end processor for Apex Ruby on Rails (RoR) as a back-end processor for Apex
Ruby on Rails (RoR) as a back-end processor for Apex
 
Compile ahead of time. It's fine?
Compile ahead of time. It's fine?Compile ahead of time. It's fine?
Compile ahead of time. It's fine?
 
Iptablesrocks
IptablesrocksIptablesrocks
Iptablesrocks
 
Functional programming is the most extreme programming
Functional programming is the most extreme programmingFunctional programming is the most extreme programming
Functional programming is the most extreme programming
 
ABRIDGED VERSION - Joys & frustrations of putting 34,000 lines of Haskell in...
 ABRIDGED VERSION - Joys & frustrations of putting 34,000 lines of Haskell in... ABRIDGED VERSION - Joys & frustrations of putting 34,000 lines of Haskell in...
ABRIDGED VERSION - Joys & frustrations of putting 34,000 lines of Haskell in...
 
Sparklife - Life In The Trenches With Spark
Sparklife - Life In The Trenches With SparkSparklife - Life In The Trenches With Spark
Sparklife - Life In The Trenches With Spark
 
Basic Make
Basic MakeBasic Make
Basic Make
 
Developing OpenResty Framework
Developing OpenResty FrameworkDeveloping OpenResty Framework
Developing OpenResty Framework
 
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
OSDC 2018 | The Computer science behind a modern distributed data store by Ma...
 
Intro to Elixir talk
Intro to Elixir talkIntro to Elixir talk
Intro to Elixir talk
 
Webinar: Learn Perl - The Jewel of Scripting Languages
Webinar: Learn Perl - The Jewel of Scripting LanguagesWebinar: Learn Perl - The Jewel of Scripting Languages
Webinar: Learn Perl - The Jewel of Scripting Languages
 
Concurrent Programming with Ruby and Tuple Spaces
Concurrent Programming with Ruby and Tuple SpacesConcurrent Programming with Ruby and Tuple Spaces
Concurrent Programming with Ruby and Tuple Spaces
 
Le PERL est mort
Le PERL est mortLe PERL est mort
Le PERL est mort
 
Lecture1: NGS Analysis on Beocat and an introduction to Perl programming for ...
Lecture1: NGS Analysis on Beocat and an introduction to Perl programming for ...Lecture1: NGS Analysis on Beocat and an introduction to Perl programming for ...
Lecture1: NGS Analysis on Beocat and an introduction to Perl programming for ...
 
The computer science behind a modern disributed data store
The computer science behind a modern disributed data storeThe computer science behind a modern disributed data store
The computer science behind a modern disributed data store
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala Internals
 
Introduction to Clojure
Introduction to ClojureIntroduction to Clojure
Introduction to Clojure
 

Último

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 

Último (20)

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

Taming Snakemake

  • 2. Why Make?  What are Make's advantages (over Perl and shell scripts)?  Make forces you to think about file transformation in terms of inputs and outputs, recipes and rules. In Perl you are forced to think at the level of variables, conditionals, and loops. In Shell you are forced to think like a caveman.  Unfortunately, bioinformatics is still largely about files and their suffixes. Make has a very powerful syntax based almost entirely around file suffixes.  Make knows what's been made and what hasn't. Make can be interrupted and restarted safely, and without overwriting finished work.  Make knows what's changed and what hasn't. If an input is newer than an output, it will attempt to rebuild the output.  Make allows you to add new input files without worrying about overwriting old ones.  Make is well supported. There are 1333 Make questions on SO alone.  When people see a Makefile, they immediately know how to run it.  Make does not force you to wrap shell statements in quotes.  Make is a DSL. It will attempt to validate your syntax.  Make is ancient, ubiquitous, and reliable.  Make can parallelize with --jobs.  Make recipes encourage reuse. https://share.chop.edu/pages/viewpage.action?pageId=138478819
  • 6. Why Snakemake?  Addresses Makefile weaknesses without throwing out the good stuff  Difficult to implement control flow  No cluster support  Inflexible wildcards  Too much reliance on sentinal files  No reporting mechanism Johannes Köster
  • 8. Utilities  Logs - wire them up manually  Cluster support pretty decent source /nas/is1/leipzig/martin/variome-env/bin/activate snakemake --directory /nas/is1/leipzig/martin/snake-env --snakefile /nas/is1/leipzig/martin/snake-env/Snakefile -c qsub -j 16 source /mnt/isilon/cbmi/variome/leipzig/martin/respublica-env/bin/activate snakemake --directory /mnt/isilon/cbmi/variome/leipzig/martin/snake-env --snakefile /mnt/isilon/cbmi/variome/leipzig/martin/snake-env/Snakefile -c qsub -j 16  Cores/jobs/resources
  • 9. Useful stuff  dry-runs  keep-going  touch  version changes  workflow diagrams
  • 11. Client websites with Jekyll  Jekyll is a templating engine for blogs that accepts Markdown  Layouts use the Liquid markup http://mitomap.org/martin-rnaseq/
  • 12. A workflow that reports itself
  • 18. R/Snakemake integration git submodule add git@github.research.chop.edu:BiG/rna-seq-common-functions.git common/rna-seq
  • 19. Leave a paper trail
  • 20. Reproducible Checklist  repository github.research.chop.edu  workflow of some kind from beginning to end  website at mybic.chop.edu