Cloud BioLinux: open source, fully-customizable bioinformatics computing on the cloud for the genomics community and beyond

BOSC 2011 - Vienna, Austria

Ntino Krampis, PhD
Asst. Professor
J. Craig Venter Institute (JCVI)
agbiotec@gmail.com
The community is what makes an open source project


Brad Chapman, Tim Booth, Mesude Bicak, Dawn Field, Dan Pass:
core development and planning

Enis Afgan, Pjotr Prins, Stephen Möller,
and all other members of the Cloud BioLinux community who move it forward

J. Craig Venter Inst.:
time allowed to work on an open-source project
Expensive sequencing and large organizations

●   large sequencing centers, multi-million, broad-impact sequencing projects
●   dedicated bioinformatics department, compute clusters

Commodity sequencing and small labs

●   small form-factor, bench-top sequencer available: GS Junior by 454
●   sequencing as a standard technique in basic biology and genetics research
●   RNA-seq and ChIP-seq, and each biologist will be tackling a metagenome
Will small labs become the long tail of sequencing?

[Figure: long-tail curve; y-axis: amount of sequencing; x-axis: number of labs. Credit: WikiMedia Commons]
“Bioinformatics nation is a land of city-states” Lincoln Stein

●   small labs building small-scale bioinformatics infrastructures
●   duplication of effort in compiling and installing software tools
●   some groups have no hardware, expertise, or time to install and run software

●   NEBC BioLinux ( tinyurl.com/BioLinux-NEBC ): 100+ pre-configured tools
●   examples: glimmer, hmmer, phylip, rasmol, genespring, clustalw, EMBOSS

    how about large-scale sequence datasets?
Cloud BioLinux
    pre-configured and on-demand bioinformatics computing on the cloud

+
●   JCVI cloud computing research
●   NEBC BioLinux software repository
●   community effort: Hackathon / BOSC 2010-11
●   Virtual Machine (VM) on Amazon cloud

=
●   large-scale computing independently of institutional or geographic boundaries
●   only need a desktop computer with internet access

cloudbiolinux.org
simple for end-users

sign up at aws.amazon.com

http://tinyurl.com/cloud-biolinux-tutorial
Amazon EC2 → Linux desktop via remote desktop client
What if I want to share my alignments with a collaborator?

save your data as a new VM

$0.10 / GB / month

at 15 GB, it costs $1.50 / month
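
The storage arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the rate is the figure quoted on the slide, not a current AWS price, and the function name `monthly_storage_cost` is made up for illustration.

```python
from decimal import Decimal

# Rate quoted on the slide (a 2011 figure, not a current AWS price).
RATE_USD_PER_GB_MONTH = Decimal("0.10")


def monthly_storage_cost(size_gb, rate=RATE_USD_PER_GB_MONTH):
    """Return the monthly cost in USD of storing `size_gb` gigabytes."""
    # Decimal avoids binary floating-point rounding in money arithmetic.
    return Decimal(str(size_gb)) * rate


if __name__ == "__main__":
    cost = monthly_storage_cost(15)
    print(f"15 GB saved VM: ${cost:.2f} / month")
```

The cost scales linearly with size, so a 150 GB dataset at the same rate would come to $15 / month.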
“whole system snapshot exchange” (Dudley and Butte 2010)
capture the state of the computing system and data
software execution parameters and “massaged” input datasets
Cloud BioLinux developer's framework
        create cloud VM / images with standardized software configurations


●   customize Cloud BioLinux based on community requirements

●   mix and match software from NEBC or other sources (Debian Med, Scientific Linux, etc.)

●   share customized VMs with collaborators, avoiding effort duplication

●   deploy Cloud BioLinux on private and local clouds
software domains in bioinformatics: next-gen sequencing, de novo assembly, annotation, phylogeny, molecular structures, gene expression analysis

github.com/chapmanb/cloudbiolinux
Cloud BioLinux developer's framework


    ●   based on python-fabric auto-deployment tool

    ●   software components listed in plain text files

    ●   collaborators use files to share descriptions of cloud VM / images

    ●   start with a bare-bones VM / image

    ●   fabric downloads and installs specified software




tinyurl.com/python-fabric
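
The idea of listing software components in plain text files can be sketched in Python as below. The file layout (one package per line, `#` for comments) and the helper names `parse_package_list` and `build_install_command` are assumptions for illustration, not the project's actual format or code. The sketch only builds the install command string; a deployment tool such as Fabric would be the piece that runs it on the bare-bones VM.

```python
import shlex

# Hypothetical package list: one package name per line, '#' starts a comment.
# This layout is an assumption, not the actual Cloud BioLinux file format.
EXAMPLE_LIST = """\
# sequence analysis
hmmer
clustalw
emboss   # EMBOSS suite

# phylogeny
phylip
"""


def parse_package_list(text):
    """Return package names in file order, skipping comments, blanks, duplicates."""
    packages = []
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()  # drop trailing comment
        if name and name not in packages:
            packages.append(name)
    return packages


def build_install_command(packages):
    """Build one apt-get command line for the given packages ('' if none)."""
    if not packages:
        return ""
    return "apt-get install -y " + " ".join(shlex.quote(p) for p in packages)


if __name__ == "__main__":
    print(build_install_command(parse_package_list(EXAMPLE_LIST)))
```

Because the list is plain text, collaborators can exchange or version-control it and rebuild an equivalent VM image, rather than copying the image itself.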
Cloud BioLinux: the future


●   groups.google.com/cloudbiolinux and cloudbiolinux.org

●   expand community, receive feedback, add more software to the VM

●   scalable computing: SGE (Galaxy Cloudman), Hadoop (cloudgene.uibk.ac.at)

●   add next-gen sequencing pipelines; NIH funding adds development effort

●   We just had a 2-day codefest at the MetaLab, http://metalab.at/
and before I finish this talk...

Thank you!
