H Mishima - Biogem, Ruby UCSC API, and BioRuby
BioRuby
•a bioinformatics
 library for the Ruby
 language
•>11 years - project
 since Nov. 21, 2000
BioRuby

 is an
 open-source
 project
          BUT, I HAVE A QUESTION...
Aspects of the word ‘OPEN’
 •OPEN for
  redistribution
 •OPEN for source
  code access
 •OPEN for
  contribution
CENTRALIZED APPROACH
• Pros
  –QC for stability and consistency
  –easy to apply a coding standard
  –enables extensive tests and documentation
• Cons
  –heavy burden on release managers
  –longer process, less frequent releases
  –lack of cutting-edge features
Two ways to participate in
  BioRuby development
1. Be a committer
  1.   be a trusted contributor in the community
  2. get an open-bio.org account
  3. be a CVS/SVN committer
2. Send patches to (busy) core-members
  1. wait for patch evaluation
  2. wait for next release of BioRuby
Actions of BioRuby
 •more OPEN for
  source code
  access
 •more OPEN for
  contribution
ACTION 1

  Social Coding Using GitHub

       In 2010, the BioRuby
       project source repository
       moved to GitHub
• Users can fork the code freely.
• Users still have to wait for
  acceptance of pull-requests to get
  their code incorporated into the
  official repository.
ACTION 2

Plug-in system - BioGem
DECENTRALIZED APPROACH
• Enables expanding BioRuby without
  tweaking its stable core
• Plug-ins are maintained by their authors
• Encourages ‘best practice’ through a tool
  (the biogem command)
  – Standard directory structure
  – version control using Git
  – Using the RubyGems packaging system
  – testing and documentation
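
To make the RubyGems packaging point concrete, here is a minimal sketch of a gem specification for a plug-in. The plug-in name `bio-myplugin`, the author placeholder, the version number, and the directory layout in the comments are all assumptions for illustration; this is not necessarily what the biogem command generates. The only fact relied on is that the BioRuby core is distributed as the `bio` gem.

```ruby
# Hypothetical gemspec for an imaginary plug-in named "bio-myplugin".
# Assumed layout (a common RubyGems convention, not necessarily what the
# biogem command produces):
#   bio-myplugin/
#     lib/bio-myplugin.rb
#     test/
#     README
#     bio-myplugin.gemspec
require "rubygems"

spec = Gem::Specification.new do |s|
  s.name          = "bio-myplugin"      # hypothetical plug-in name
  s.version       = "0.1.0"             # illustrative version
  s.summary       = "Example BioRuby plug-in (illustration only)"
  s.authors       = ["Plug-in Author"]  # placeholder
  s.files         = Dir["lib/**/*.rb"]  # ship everything under lib/
  s.require_paths = ["lib"]
  # The plug-in depends on the stable core instead of modifying it.
  s.add_dependency "bio"
end

puts spec.full_name
```

Because the core is declared as a dependency, the plug-in can be released on its author's own schedule without waiting for a BioRuby release.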
The Biogems workflow
Biogems.info – a portal site for Biogem users
 Biogems.info

 rank by total downloads (with rank changes up and down),
 citation, current version,
 date of latest release, links to source code,
 status of Travis continuous integration




                           highly motivating (me)
• Database / web-service API: bio ucsc api, intermine, eutils, sequenceserver, goruby, bio ensembl
• Wrapper: bio samtools, bio logger, bio bwa, bio signalp, bio sge, bio exportpred, bio tabix
• Application: scaffolder, genfrag, bio isoelectric point, bio phyta, bio tm hmm, dna sequence aligner, bio gag, bio kmer counter
• File Parser: bio gff3, bio assembly, bio blastxmlparser, bio faster, bio alignment, bio nexml, bio kb illumina, bio octopus, bio affy, bio dbsnp, bio rdf, bio hmmer model, bio hmmer3 report, bio pileup iterator, bio phyloxml
• Visualization: bio graphics
• Framework: bio ngs
• Toolbox: bio genomic interval, bio bigbio, bio hello, bio plasmoap, bio cnls screenscraper, bio data, bio aliphatic index, bio hydropathy, bio gngm
• Biogem Example: bio hello
• Biogem Collection: bio core
                                              more than 60 Biogems...
Groups highlighted in the Biogem list:
 •Database Access-related
 •Next Generation Sequencing-related
Hiro Mishima
•   NOT a core
    developer of
    BioRuby
•   not a computer
    scientist but a
    dentist
•   semi-dry biologist
•   human geneticist
Ruby UCSC API
>40,000
tables!
How to get started


$ gem install bio-ucsc-api



A query written in fluent interface.

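 require 'bio-ucsc'
 # connect to the UCSC database for the hg19 assembly
 Bio::Ucsc::Hg19.connect
 # look up one record in the snp131 table by its rs ID
 result =
   Bio::Ucsc::Hg19::Snp131.
   find_by_name("rs56289060")
 puts result.chrom # => "chr1"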


SQL made easy
    region = "chr17:7,579,614-7,579,700"
    condition =
      Bio::Ucsc::Hg19::Snp131.
      with_interval(region).select(:name)
    puts condition.to_sql



SELECT name FROM `snp131`
WHERE (chrom = 'chr17' AND bin in (642,80,9,1,0)
 AND ( (chromStart BETWEEN 7579613 AND 7579700)
    OR (chromEnd BETWEEN 7579613 AND 7579700)
    OR (chromStart <= 7579613 AND
        chromEnd >= 7579700) ));
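
The `bin in (642,80,9,1,0)` clause in the generated SQL can be reproduced by hand. The sketch below assumes the standard five-level UCSC binning scheme (128 kb bins at the finest level, each coarser level covering eight times more); the helper names `parse_region` and `overlapping_bins` are hypothetical and are not part of the Ruby UCSC API.

```ruby
# Standard UCSC binning scheme (assumed): bin-number offsets of the five
# levels, from the finest (128 kb bins) to the coarsest (one 512 Mb bin).
BIN_OFFSETS     = [585, 73, 9, 1, 0].freeze
BIN_FIRST_SHIFT = 17 # 2**17 = 128 kb, size of the finest bins
BIN_NEXT_SHIFT  = 3  # each coarser level is 8 times larger

# Hypothetical helper: turn "chr17:7,579,614-7,579,700" (1-based, inclusive)
# into [chrom, zero_based_start, exclusive_end].
def parse_region(region)
  chrom, range = region.split(":")
  first, last = range.delete(",").split("-").map(&:to_i)
  [chrom, first - 1, last]
end

# Hypothetical helper: list every bin, at every level, that can contain a
# feature overlapping the half-open interval [start, stop).
def overlapping_bins(start, stop)
  start_bin = start >> BIN_FIRST_SHIFT
  end_bin   = (stop - 1) >> BIN_FIRST_SHIFT
  BIN_OFFSETS.flat_map do |offset|
    bins = ((offset + start_bin)..(offset + end_bin)).to_a
    start_bin >>= BIN_NEXT_SHIFT
    end_bin   >>= BIN_NEXT_SHIFT
    bins
  end
end

chrom, start, stop = parse_region("chr17:7,579,614-7,579,700")
puts "#{chrom}: bins #{overlapping_bins(start, stop).join(',')}"
```

Restricting the query to these few bins lets the database use the bin index first and apply the `chromStart`/`chromEnd` comparisons only to the rows that remain.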
FUTURE DIRECTION of BioGem
• QC by peer review is still important.
  –ensures stability and quality of code
   and documentation
  –educates plug-in authors
• R/Bioconductor has an excellent
  peer-review system
  –good coding style and well-formatted
   documentation
  –requires huge human resources and
   effort
Solutions would be…

• recommended collections
   • Bio-Core (Raoul J.P. Bonnal)
• loose/casual peer-review
• need to draw up guidelines for
  designing “good” biogems
ACKNOWLEDGMENTS
• All BioRuby contributors
• Ruby UCSC API
  – Jan Aerts
• The BioRuby Panel
  –   Raoul Bonnal
  –   Naohisa Goto
  –   Francesco Strozzi
  –   Toshiaki Katayama
  –   Pjotr Prins
• Dept. of Human Genetics, Nagasaki Univ.
  – Koh-ichiro Yoshiura
• Google Summer of Code students
• O|B|F – Open Bioinformatics Foundation
or   mishima_eng

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 

H Mishima - Biogem, Ruby UCSC API, and BioRuby

  • 1.
  • 2. BioRuby •a bioinformatics library for the Ruby language •>11 years - project since Nov. 21, 2000
  • 3. BioRuby is an open-source project BUT, I HAVE A QUESTION...
  • 4.
  • 5. Aspects of the word ‘OPEN’ •OPEN for redistribution •OPEN for source code access •OPEN for contribution
  • 6. CENTRALIZED APPROACH • Pros –QC for stability and consistency –easy to apply coding standard –enables extensive tests and documentation • Cons –heavy burden on release managers –longer process, sparser release –lack of cutting-edge features
  • 7. Two ways to participate in BioRuby development 1. Be a committer 1. be a trusted contributor in the community 2. get an open-bio.org account 3. be a CVS/SVN committer 2. Send patches to (busy) core-members 1. wait for patch evaluation 2. wait for next release of BioRuby
  • 9.
  • 10. Actions of BioRuby •more OPEN for source code access •more OPEN for contribution
  • 11. ACTION 1 Social Coding Using GitHub In 2010, the BioRuby project source repository moved to GitHub
  • 12. • Users can fork the code freely. • Users still have to wait for acceptance of pull-requests to get their code incorporated into the official repository.
  • 14. DECENTRALIZED APPROACH • Enables expanding BioRuby without tweaking its stable core • plug-ins are maintained by their authors • encourage ‘best practice’ using a tool (biogem command) – Standard directory structure – version control using Git – Using the RubyGems packaging system – testing and documentation
  • 16. Biogems.info: a portal site for Biogem users • rank in total downloads (rank up & down) • citation, current version, day of final release, links to source code, status of Travis continuous integration • highly motivating (me)
  • 17. Database/web-service API File Parser Visualization bio ucsc api bio gff3 bio graphics intermine bio assembly Framework eutils bio blastxmlparser bio ngs sequenceserver bio faster Toolbox goruby bio alignment bio genomic interval bio ensembl bio nexml bio bigbio Wrapper bio kb illumina bio hello bio samtools bio octopus bio plasmoap bio logger bio affy bio cnls screenscraper bio bwa bio dbsnp bio data bio signalp bio rdf bio aliphatic index bio sge bio hmmer model bio hydropathy bio exportpred bio hmmer3 report bio gngm bio tabix bio pileup iterator Application bio phyloxml Biogem Example scaffolder bio hello genfrag bio isoelectric point Biogem Collection bio phyta bio core bio tm hmm dna sequence aligner bio gag bio kmer counter more than 60 Biogems...
  • 18. Database /web-service API File Parser Visualization bio ucsc api bio gff3 bio graphics intermine bio assembly Framework eutils bio blastxmlparser bio ngs sequenceserver bio faster Toolbox goruby bio alignment bio genomic interval bio ensembl bio nexml bio bigbio Wrapper bio kb illumina bio hello bio samtools bio octopus bio plasmoap bio logger bio affy bio cnls screenscraper bio bwa bio dbsnp bio data bio signalp bio rdf bio aliphatic index bio sge bio hmmer model bio hydropathy bio exportpred bio hmmer3 report bio gngm bio tabix bio pileup iterator Application bio phyloxml Biogem Example scaffolder bio hello genfrag bio isoelectric point Biogem Collection bio phyta bio core bio tm hmm dna sequence aligner bio gag Database Access-related bio kmer counter Next Generation Sequencing-related
  • 19. Hiro Mishima • NOT a core developer of BioRuby • not a computer scientist but a dentist • semi-dry biologist • human geneticist
  • 22. How to get started
        $ gem install bio-ucsc-api
  • 23. A query written in a fluent interface.
        require 'bio-ucsc'
        Bio::Ucsc::Hg19.connect
        result = Bio::Ucsc::Hg19::Snp131.find_by_name("rs56289060")
        puts result.chrom # => "chr1"
  • 24. SQL made easy
        region = "chr17:7,579,614-7,579,700"
        condition = Bio::Ucsc::Hg19::Snp131.with_interval(region).select(:name)
        puts condition.to_sql
        Generated SQL:
        SELECT name FROM `snp131`
        WHERE (chrom = 'chr17' AND bin in (642,80,9,1,0) AND (
          (chromStart BETWEEN 7579613 AND 7579700) OR
          (chromEnd BETWEEN 7579613 AND 7579700) OR
          (chromStart <= 7579613 AND chromEnd >= 7579700) ));
  • 25.
  • 26. FUTURE DIRECTION of BioGem • Still QC by peer-review is important. –ensures stability and quality of codes and documents –educates plug-in authors • R/Bioconductor has excellent peer-review system –good coding style and well-formatted document –requires huge human resources and efforts
  • 27. Solutions would be… • recommended collections • Bio-Core (Raoul J.P. Bonnal) • loose/casual peer-review • need to draw up guidelines for designing “good” biogems
  • 28.
  • 29. ACKNOWLEDGMENTS • All BioRuby contributors • Ruby UCSC API – Jan Aerts • The BioRuby Panel – Raoul Bonnal – Naohisa Goto – Francesco Strozzi – Toshiaki Katayama – Pjotr Prins • Dept. of Human Genetics, Nagasaki Univ. – Koh-ichiro Yoshiura • Google Summer of Code students • O|B|F – Open Bioinformatics Foundation
  • 30. or mishima_eng
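
The `bin in (642,80,9,1,0)` clause in the slide 24 query appears to come from the UCSC Genome Browser's binning index, which lets the database skip rows that lie far from the queried interval. The following is a minimal sketch of how such a bin list can be derived, assuming the standard five-level UCSC scheme (finest bins of 128 kb, each level 8 times larger). The helper names `parse_region` and `overlapping_bins` are hypothetical and are not part of the bio-ucsc-api gem, whose internals are not shown in the slides.

```ruby
# Offsets of the first bin at each level, from finest (128 kb) to
# coarsest (512 Mb). Assumption: the standard UCSC binning scheme.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0].freeze
BIN_FIRST_SHIFT = 17 # 2**17 bases = 128 kb, the size of the finest bins
BIN_NEXT_SHIFT  = 3  # each coarser level covers 2**3 = 8 finer bins

# Hypothetical helper: parse "chr17:7,579,614-7,579,700" (1-based,
# inclusive) into [chrom, zero_based_start, end], the coordinate
# convention used by the chromStart/chromEnd columns.
def parse_region(region)
  chrom, range = region.split(":")
  first, last = range.delete(",").split("-").map { |s| Integer(s) }
  [chrom, first - 1, last]
end

# Hypothetical helper: list every bin, at every level, that can hold a
# feature overlapping the half-open interval [chrom_start, chrom_end).
def overlapping_bins(chrom_start, chrom_end)
  start_bin = chrom_start >> BIN_FIRST_SHIFT
  end_bin   = (chrom_end - 1) >> BIN_FIRST_SHIFT
  BIN_OFFSETS.flat_map do |offset|
    bins = ((offset + start_bin)..(offset + end_bin)).to_a
    # Move up one level: eight bins collapse into one.
    start_bin >>= BIN_NEXT_SHIFT
    end_bin   >>= BIN_NEXT_SHIFT
    bins
  end
end

chrom, chrom_start, chrom_end = parse_region("chr17:7,579,614-7,579,700")
puts "#{chrom}: bins #{overlapping_bins(chrom_start, chrom_end).join(',')}"
```

For the region on slide 24 this sketch yields bins 642, 80, 9, 1 and 0, the same list that appears in the generated SQL.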