Bio.Phylo
A unified phylogenetics toolkit for Biopython


                Eric Talevich

            Institute of Bioinformatics
              University of Georgia


               June 29, 2010
Abstract


       Bio.Phylo is a new phylogenetics library for:

• Exploring, modifying and annotating trees
• Reading & writing standard file formats
• Quick visualization
• Gluing together computational pipelines



                 Availability: Biopython 1.54
A quick survey of file formats

   Newick (a.k.a. New Hampshire) is a simple nested-parens
          format:    (A, (B, C), (D, E))
             • Extended & tweaked, led to NHX (and parsing
               problems)

   Nexus is a collection of formats, including Newick trees
             • More than just tree data... still tough to parse

PhyloXML is an XML-based replacement for NHX
             • Annotations formalized as XML elements;
               extensible with user-defined element types

  NeXML is an XML-based successor to Nexus
             • Ontology-based — key-value assignments have
               semantic meaning
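
To make the "nested-parens" idea concrete, here is a minimal sketch of a parser for name-only Newick strings. It is an illustration only and is not how Bio.Phylo parses Newick: the function name `parse_newick` is hypothetical, and branch lengths, internal node labels, comments and quoted names are deliberately not handled.

```python
# Illustrative only: a tiny recursive parser for name-only Newick strings.
# Real parsers also handle branch lengths, internal labels, comments and quoting.

def parse_newick(text):
    """Turn '(A,(B,C));' into nested lists such as ['A', ['B', 'C']]."""
    text = text.strip().rstrip(";").replace(" ", "")
    pos = 0

    def peek():
        # Current character, or '' once the input is exhausted.
        return text[pos] if pos < len(text) else ""

    def parse_node():
        nonlocal pos
        if peek() == "(":
            pos += 1  # consume '('
            children = [parse_node()]
            while peek() == ",":
                pos += 1  # consume ','
                children.append(parse_node())
            if peek() != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1  # consume ')'
            return children
        # Leaf: read a name up to the next delimiter.
        start = pos
        while peek() and peek() not in ",()":
            pos += 1
        return text[start:pos]

    tree = parse_node()
    if pos != len(text):
        raise ValueError("unexpected trailing text at position %d" % pos)
    return tree


nested = parse_newick("(A, (B, C), (D, E))")
print(nested)  # prints ['A', ['B', 'C'], ['D', 'E']]
```

Each pair of parentheses becomes one list, so the nesting depth of the lists mirrors the depth of the clades in the tree.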
Demo: What’s in a tree?




1. Read a simple Newick file
2. Inspect through IPython
3. Draw with PyLab/matplotlib
4. Promote to a PhyloXML tree
5. Set branch colors
6. Write a PhyloXML file
# In a terminal, make a simple Newick file
# Then launch the IPython interpreter and read the file


% cat > simple.dnd <<EOF
> (((A,B),(C,D)),(E,F,G))
> EOF

% ipython -pylab
>>> from Bio import Phylo
>>> tree = Phylo.read('simple.dnd', 'newick')
# String representation shows the object structure

>>> print(tree)

Tree(weight=1.0, rooted=False, name='')
    Clade(branch_length=1.0)
        Clade(branch_length=1.0)
            Clade(branch_length=1.0)
                Clade(branch_length=1.0, name='A')
                Clade(branch_length=1.0, name='B')
            Clade(branch_length=1.0)
                Clade(branch_length=1.0, name='C')
                Clade(branch_length=1.0, name='D')
        Clade(branch_length=1.0)
            Clade(branch_length=1.0, name='E')
            Clade(branch_length=1.0, name='F')
            Clade(branch_length=1.0, name='G')
# Draw an ASCII-art dendrogram

>>> Phylo.draw_ascii(tree, column_width=52)

                                  ______________   A
                  ______________|
                 |               |______________   B
   ______________|
 |               |                ______________   C
 |               |______________|
_|                               |______________   D
 |
 |                 ______________ E
 |               |
 |______________|______________ F
                 |
                 |______________ G
>>> tree.rooted = True
>>> Phylo.draw_graphviz(tree)

Figure: Graphviz drawing of the rooted tree, with terminal nodes A through G labeled
# Promote a basic tree to PhyloXML
>>> from Bio.Phylo.PhyloXML import Phylogeny
>>> phy = Phylogeny.from_tree(tree)
>>> print(phy)

Phylogeny(rooted=True, name='')
    Clade(branch_length=1.0)
        Clade(branch_length=1.0)
            Clade(branch_length=1.0)
                Clade(branch_length=1.0, name='A')
                Clade(branch_length=1.0, name='B')
            Clade(branch_length=1.0)
                Clade(branch_length=1.0, name='C')
                Clade(branch_length=1.0, name='D')
        Clade(branch_length=1.0)
            Clade(branch_length=1.0, name='E')
            Clade(branch_length=1.0, name='F')
            Clade(branch_length=1.0, name='G')
Branch color
>>> phy.root.color = (128, 128, 128)
Or:
>>> phy.root.color = '#808080'
Or:
>>> phy.root.color = 'gray'
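
The three assignments above are meant to describe the same color. As a rough sketch of why, the helper below normalizes each spelling to one (red, green, blue) tuple. The names `to_rgb` and `NAMED_COLORS` are hypothetical and this is not Bio.Phylo's implementation; the name table is a small illustrative subset.

```python
# Hypothetical helper, not Bio.Phylo's code: normalize the three color
# spellings used on the slide to a single (red, green, blue) tuple.

# Small illustrative name table; a real library would know many more names.
NAMED_COLORS = {
    "gray": (128, 128, 128),
    "blue": (0, 0, 255),
    "salmon": (250, 128, 114),
}


def to_rgb(value):
    """Accept an RGB triple, a '#rrggbb' string, or a known color name."""
    if isinstance(value, str):
        if value.startswith("#") and len(value) == 7:
            # Two hex digits per channel: '#808080' -> (128, 128, 128)
            return tuple(int(value[i:i + 2], 16) for i in (1, 3, 5))
        if value in NAMED_COLORS:
            return NAMED_COLORS[value]
        raise ValueError("unknown color: %r" % (value,))
    red, green, blue = value
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("channel out of range: %r" % (channel,))
    return (red, green, blue)


forms = [(128, 128, 128), "#808080", "gray"]
print([to_rgb(form) for form in forms])
# prints [(128, 128, 128), (128, 128, 128), (128, 128, 128)]
```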

Find clades by attribute values:
>>> mrca = phy.common_ancestor({'name': 'E'},
                               {'name': 'F'})
>>> mrca.color = 'salmon'

Directly index a clade:
>>> phy.clade[0,1].color = 'blue'
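
To show which clades these two lookups select, here is a plain-Python sketch on nested tuples with the same topology as simple.dnd. It does not use Bio.Phylo's data model; `TREE`, `leaves`, `common_ancestor` and `by_index` are hypothetical stand-ins written for this illustration.

```python
# Illustrative sketch on plain nested tuples, not Bio.Phylo's classes.
# Same topology as simple.dnd: (((A,B),(C,D)),(E,F,G))
TREE = ((("A", "B"), ("C", "D")), ("E", "F", "G"))


def leaves(node):
    """Return the set of leaf names at or below a node."""
    if isinstance(node, str):
        return {node}
    names = set()
    for child in node:
        names |= leaves(child)
    return names


def common_ancestor(node, targets):
    """Return the smallest subtree containing every name in targets."""
    targets = set(targets)
    if not targets <= leaves(node):
        raise ValueError("not all targets are in the tree")
    while not isinstance(node, str):
        for child in node:
            if targets <= leaves(child):
                node = child  # all targets sit inside this child: descend
                break
        else:
            return node  # targets are split across children: this is the MRCA
    return node


def by_index(node, path):
    """Follow child indexes from the root, like phy.clade[0, 1]."""
    for index in path:
        node = node[index]
    return node


print(common_ancestor(TREE, ["E", "F"]))  # prints ('E', 'F', 'G')
print(by_index(TREE, (0, 1)))             # prints ('C', 'D')
```

So in the demo tree, the 'salmon' color lands on the (E, F, G) clade and the 'blue' color on the (C, D) clade.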

>>> Phylo.draw_graphviz(phy, prog='neato')
Figure: Graphviz (neato) drawing of the tree with the branch colors applied; terminal nodes A through G labeled
# Save the color annotations in phyloXML

>>> Phylo.write(phy, 'simple-color.xml', 'phyloxml')

<phy:phyloxml xmlns:phy="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <clade>
        <branch_length>1.0</branch_length>
        <color>
            <red>128</red>
            <green>128</green>
            <blue>128</blue>
        </color>
        <clade>
            <branch_length>1.0</branch_length>
            <clade>
                 <branch_length>1.0</branch_length>
                 <clade>
                     <name>A</name>
                     ...
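
One way to check that the color survived the round trip is to read it back with Python's standard xml.etree.ElementTree. The sketch below parses a trimmed snippet modeled on the output above rather than a real file. The element paths assume the layout shown on the slide; files written by other Biopython versions may place elements in the phyloXML namespace, in which case the paths would need namespace-qualified tags.

```python
import xml.etree.ElementTree as ET

# Trimmed snippet modeled on the slide's output (assumption: same layout).
SNIPPET = """\
<phy:phyloxml xmlns:phy="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <clade>
      <branch_length>1.0</branch_length>
      <color>
        <red>128</red>
        <green>128</green>
        <blue>128</blue>
      </color>
      <clade>
        <name>A</name>
      </clade>
    </clade>
  </phylogeny>
</phy:phyloxml>
"""

root = ET.fromstring(SNIPPET)

# The root clade's <color> element holds one child element per channel.
color = root.find("./phylogeny/clade/color")
rgb = tuple(int(color.findtext(channel)) for channel in ("red", "green", "blue"))

print(rgb)  # prints (128, 128, 128)
```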
Thanks


Holla:
  • Brad Chapman and Christian Zmasek, GSoC 2009 mentors
  • The Biopython developers, feat. Peter J. A. Cock,
    Frank Kauff & Cymon J. Cox
  • Hilmar Lapp & the NESCent Phyloinformatics program
  • Google’s Open Source Programs Office
  • My professor, Dr. Natarajan Kannan
  • Developers like you
Q&A



• Which 3rd-party applications should we wrap in
  Bio.Phylo.Applications? (e.g. RAxML, MrBayes)
• Which other libraries should we support interoperability with?
  (PyCogent, ape)
• What other algorithms are simple, stable and relevant?
  (Consensus, rooting)
• Features for systematics? (Geography, PopGen integration?)
Extra: Tree methods
>>> dir(tree)

collapse                      get_terminals
collapse_all                  is_bifurcating
common_ancestor               is_monophyletic
count_terminals               is_parent_of
depths                        is_preterminal
distance                      ladderize
find_any                      prune
find_clades                   split
find_elements                 total_branch_length
get_nonterminals              trace
get_path

   See: http://biopython.org/DIST/docs/api/Bio.Phylo.BaseTree.TreeMixin-class.html
Extra: The Bio.Phylo class hierarchy




Figure: Inheritance relationship among the core classes
Extra: PhyloXML classes

 $ pydoc Bio.Phylo.PhyloXML

Accession              Date                 Point
Alphabet               Distribution         Polygon
Annotation             DomainArchitecture   Property
BaseTree               Events               ProteinDomain
BinaryCharacters       Id                   Reference
BranchColor            MolSeq               Sequence
Clade                  Other                SequenceRelation
CladeRelation          Phylogeny            Taxonomy
Confidence             Phyloxml             Uri


            See: http://biopython.org/wiki/PhyloXML

Más contenido relacionado

Destacado

Goozzy
Goozzy Goozzy
Goozzy alarin
 
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...Juan Carbonell
 
The holy land 2015
The holy land 2015The holy land 2015
The holy land 2015tomdinapoli
 
Brochure Koertse Bouw En Onderhoud
Brochure Koertse Bouw En OnderhoudBrochure Koertse Bouw En Onderhoud
Brochure Koertse Bouw En OnderhoudCees Koertse
 
Русская бизнес-профессионалов
Русская бизнес-профессионаловРусская бизнес-профессионалов
Русская бизнес-профессионаловDavid Dugas
 
Handout sekolah pilot aaa academy
Handout sekolah pilot aaa academyHandout sekolah pilot aaa academy
Handout sekolah pilot aaa academyMuhammad Abdullah
 
Westweaves Profile
Westweaves ProfileWestweaves Profile
Westweaves Profileanantdamani
 
Overzicht syllabus beroepspraktijk 1
Overzicht syllabus beroepspraktijk 1Overzicht syllabus beroepspraktijk 1
Overzicht syllabus beroepspraktijk 1CVO-SSH
 
4de lesdag kindfactoren
4de lesdag kindfactoren4de lesdag kindfactoren
4de lesdag kindfactorenCVO-SSH
 
Moser and Riutta - Partnerships for Student Learning
Moser and Riutta - Partnerships for Student LearningMoser and Riutta - Partnerships for Student Learning
Moser and Riutta - Partnerships for Student Learningoxfordcollegelibrary
 
Shannon Smith Cv 201109
Shannon Smith Cv 201109Shannon Smith Cv 201109
Shannon Smith Cv 201109shagsa
 
Overall complete result indonesia friendly memory championship i
Overall complete result indonesia friendly memory championship iOverall complete result indonesia friendly memory championship i
Overall complete result indonesia friendly memory championship iYudi Lesmana
 
Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...Dimitri Corpakis
 

Destacado (20)

Proton ds userguide
Proton ds userguideProton ds userguide
Proton ds userguide
 
Shikha Verma_Resume
Shikha Verma_ResumeShikha Verma_Resume
Shikha Verma_Resume
 
Goozzy
Goozzy Goozzy
Goozzy
 
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...
The Organic IT Department: Strategic Cost Analysis to Unlock a Sustainable Co...
 
Latest trends in em
Latest trends in emLatest trends in em
Latest trends in em
 
Small Business Profits Tune-Up
Small Business Profits Tune-UpSmall Business Profits Tune-Up
Small Business Profits Tune-Up
 
2004 04 27_ocpd_casestudies
2004 04 27_ocpd_casestudies2004 04 27_ocpd_casestudies
2004 04 27_ocpd_casestudies
 
cvBarisGomleksizoglu-eng
cvBarisGomleksizoglu-engcvBarisGomleksizoglu-eng
cvBarisGomleksizoglu-eng
 
The holy land 2015
The holy land 2015The holy land 2015
The holy land 2015
 
Brochure Koertse Bouw En Onderhoud
Brochure Koertse Bouw En OnderhoudBrochure Koertse Bouw En Onderhoud
Brochure Koertse Bouw En Onderhoud
 
Русская бизнес-профессионалов
Русская бизнес-профессионаловРусская бизнес-профессионалов
Русская бизнес-профессионалов
 
Chi2015 sig-od
Chi2015 sig-odChi2015 sig-od
Chi2015 sig-od
 
Handout sekolah pilot aaa academy
Handout sekolah pilot aaa academyHandout sekolah pilot aaa academy
Handout sekolah pilot aaa academy
 
Westweaves Profile
Westweaves ProfileWestweaves Profile
Westweaves Profile
 
Overzicht syllabus beroepspraktijk 1
Overzicht syllabus beroepspraktijk 1Overzicht syllabus beroepspraktijk 1
Overzicht syllabus beroepspraktijk 1
 
4de lesdag kindfactoren
4de lesdag kindfactoren4de lesdag kindfactoren
4de lesdag kindfactoren
 
Moser and Riutta - Partnerships for Student Learning
Moser and Riutta - Partnerships for Student LearningMoser and Riutta - Partnerships for Student Learning
Moser and Riutta - Partnerships for Student Learning
 
Shannon Smith Cv 201109
Shannon Smith Cv 201109Shannon Smith Cv 201109
Shannon Smith Cv 201109
 
Overall complete result indonesia friendly memory championship i
Overall complete result indonesia friendly memory championship iOverall complete result indonesia friendly memory championship i
Overall complete result indonesia friendly memory championship i
 
Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...Designing and implementing synergies; Coordinating investment in Research and...
Designing and implementing synergies; Coordinating investment in Research and...
 

Similar a Talevich bosc2010 bio-phylo

Phylogeny in R - Bianca Santini Sheffield R Users March 2015
Phylogeny in R - Bianca Santini Sheffield R Users March 2015Phylogeny in R - Bianca Santini Sheffield R Users March 2015
Phylogeny in R - Bianca Santini Sheffield R Users March 2015Paul Richards
 
A search engine for phylogenetic tree databases - D. Fernándes-Baca
A search engine for phylogenetic tree databases - D. Fernándes-BacaA search engine for phylogenetic tree databases - D. Fernándes-Baca
A search engine for phylogenetic tree databases - D. Fernándes-BacaRoderic Page
 
Package-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary ResultsPackage-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary ResultsJie Bao
 
Bioinformatics p5-bioperl v2013-wim_vancriekinge
Bioinformatics p5-bioperl v2013-wim_vancriekingeBioinformatics p5-bioperl v2013-wim_vancriekinge
Bioinformatics p5-bioperl v2013-wim_vancriekingeProf. Wim Van Criekinge
 
Representing and Reasoning with Modular Ontologies
Representing and Reasoning with Modular OntologiesRepresenting and Reasoning with Modular Ontologies
Representing and Reasoning with Modular OntologiesJie Bao
 
Phylogenetics Analysis in R
Phylogenetics Analysis in RPhylogenetics Analysis in R
Phylogenetics Analysis in RKlaus Schliep
 
Querying XML: XPath and XQuery
Querying XML: XPath and XQueryQuerying XML: XPath and XQuery
Querying XML: XPath and XQueryKatrien Verbert
 
Semi-automated Exploration and Extraction of Data in Scientific Tables
Semi-automated Exploration and Extraction of Data in Scientific TablesSemi-automated Exploration and Extraction of Data in Scientific Tables
Semi-automated Exploration and Extraction of Data in Scientific TablesElsevier
 
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docx
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docxCS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docx
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docxannettsparrow
 
PhyloTastic: names-based phyloinformatic data integration
PhyloTastic: names-based phyloinformatic data integrationPhyloTastic: names-based phyloinformatic data integration
PhyloTastic: names-based phyloinformatic data integrationRutger Vos
 
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)Erik Hatcher
 
Plays Well with Others, or What I’ve learned as a data provider in an intero...
Plays Well with Others, or What I’ve learned as a data provider in an intero...Plays Well with Others, or What I’ve learned as a data provider in an intero...
Plays Well with Others, or What I’ve learned as a data provider in an intero...Chris Freeland
 
These questions will be a bit advanced level 2
These questions will be a bit advanced level 2These questions will be a bit advanced level 2
These questions will be a bit advanced level 2sadhana312471
 
Perl%20SYLLABUS%20PB
Perl%20SYLLABUS%20PBPerl%20SYLLABUS%20PB
Perl%20SYLLABUS%20PBtutorialsruby
 

Similar a Talevich bosc2010 bio-phylo (20)

Phylogeny in R - Bianca Santini Sheffield R Users March 2015
Phylogeny in R - Bianca Santini Sheffield R Users March 2015Phylogeny in R - Bianca Santini Sheffield R Users March 2015
Phylogeny in R - Bianca Santini Sheffield R Users March 2015
 
A search engine for phylogenetic tree databases - D. Fernándes-Baca
A search engine for phylogenetic tree databases - D. Fernándes-BacaA search engine for phylogenetic tree databases - D. Fernándes-Baca
A search engine for phylogenetic tree databases - D. Fernándes-Baca
 
PYTHON 101.pptx
PYTHON 101.pptxPYTHON 101.pptx
PYTHON 101.pptx
 
Package-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary ResultsPackage-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary Results
 
biopython, doctest and makefiles
biopython, doctest and makefilesbiopython, doctest and makefiles
biopython, doctest and makefiles
 
Uncovering Library Features from API Usage on Stack Overflow
Uncovering Library Features from API Usage on Stack OverflowUncovering Library Features from API Usage on Stack Overflow
Uncovering Library Features from API Usage on Stack Overflow
 
Bioinformatica p6-bioperl
Bioinformatica p6-bioperlBioinformatica p6-bioperl
Bioinformatica p6-bioperl
 
Bioinformatics p5-bioperl v2013-wim_vancriekinge
Bioinformatics p5-bioperl v2013-wim_vancriekingeBioinformatics p5-bioperl v2013-wim_vancriekinge
Bioinformatics p5-bioperl v2013-wim_vancriekinge
 
Representing and Reasoning with Modular Ontologies
Representing and Reasoning with Modular OntologiesRepresenting and Reasoning with Modular Ontologies
Representing and Reasoning with Modular Ontologies
 
Phylogenetics Analysis in R
Phylogenetics Analysis in RPhylogenetics Analysis in R
Phylogenetics Analysis in R
 
Querying XML: XPath and XQuery
Querying XML: XPath and XQueryQuerying XML: XPath and XQuery
Querying XML: XPath and XQuery
 
Semi-automated Exploration and Extraction of Data in Scientific Tables
Semi-automated Exploration and Extraction of Data in Scientific TablesSemi-automated Exploration and Extraction of Data in Scientific Tables
Semi-automated Exploration and Extraction of Data in Scientific Tables
 
philogenetic tree
philogenetic treephilogenetic tree
philogenetic tree
 
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docx
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docxCS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docx
CS101S. ThompsonUniversity of BridgeportLab 7 Files, File.docx
 
i18n and L10n in TYPO3 Flow
i18n and L10n in TYPO3 Flowi18n and L10n in TYPO3 Flow
i18n and L10n in TYPO3 Flow
 
PhyloTastic: names-based phyloinformatic data integration
PhyloTastic: names-based phyloinformatic data integrationPhyloTastic: names-based phyloinformatic data integration
PhyloTastic: names-based phyloinformatic data integration
 
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
 
Plays Well with Others, or What I’ve learned as a data provider in an intero...
Plays Well with Others, or What I’ve learned as a data provider in an intero...Plays Well with Others, or What I’ve learned as a data provider in an intero...
Plays Well with Others, or What I’ve learned as a data provider in an intero...
 
These questions will be a bit advanced level 2
These questions will be a bit advanced level 2These questions will be a bit advanced level 2
These questions will be a bit advanced level 2
 
Perl%20SYLLABUS%20PB
Perl%20SYLLABUS%20PBPerl%20SYLLABUS%20PB
Perl%20SYLLABUS%20PB
 

Más de BOSC 2010

Langmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomicsLangmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomicsBOSC 2010
 
Schultheiss bosc2010 persistance-web-services
Schultheiss bosc2010 persistance-web-servicesSchultheiss bosc2010 persistance-web-services
Schultheiss bosc2010 persistance-web-servicesBOSC 2010
 
Swertz bosc2010 molgenis
Swertz bosc2010 molgenisSwertz bosc2010 molgenis
Swertz bosc2010 molgenisBOSC 2010
 
Rice bosc2010 emboss
Rice bosc2010 embossRice bosc2010 emboss
Rice bosc2010 embossBOSC 2010
 
Morris bosc2010 evoker
Morris bosc2010 evokerMorris bosc2010 evoker
Morris bosc2010 evokerBOSC 2010
 
Kono bosc2010 pathway_projector
Kono bosc2010 pathway_projectorKono bosc2010 pathway_projector
Kono bosc2010 pathway_projectorBOSC 2010
 
Kanterakis bosc2010 molgenis
Kanterakis bosc2010 molgenisKanterakis bosc2010 molgenis
Kanterakis bosc2010 molgenisBOSC 2010
 
Gautier bosc2010 pythonbioconductor
Gautier bosc2010 pythonbioconductorGautier bosc2010 pythonbioconductor
Gautier bosc2010 pythonbioconductorBOSC 2010
 
Gardler bosc2010 community_developmentattheasf
Gardler bosc2010 community_developmentattheasfGardler bosc2010 community_developmentattheasf
Gardler bosc2010 community_developmentattheasfBOSC 2010
 
Friedberg bosc2010 iprstats
Friedberg bosc2010 iprstatsFriedberg bosc2010 iprstats
Friedberg bosc2010 iprstatsBOSC 2010
 
Fields bosc2010 bio_perl
Fields bosc2010 bio_perlFields bosc2010 bio_perl
Fields bosc2010 bio_perlBOSC 2010
 
Chapman bosc2010 biopython
Chapman bosc2010 biopythonChapman bosc2010 biopython
Chapman bosc2010 biopythonBOSC 2010
 
Bonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_rubyBonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_rubyBOSC 2010
 
Puton bosc2010 bio_python-modules-rna
Puton bosc2010 bio_python-modules-rnaPuton bosc2010 bio_python-modules-rna
Puton bosc2010 bio_python-modules-rnaBOSC 2010
 
Bader bosc2010 cytoweb
Bader bosc2010 cytowebBader bosc2010 cytoweb
Bader bosc2010 cytowebBOSC 2010
 
Zmasek bosc2010 aptx
Zmasek bosc2010 aptxZmasek bosc2010 aptx
Zmasek bosc2010 aptxBOSC 2010
 
Wilkinson bosc2010 moby-to-sadi
Wilkinson bosc2010 moby-to-sadiWilkinson bosc2010 moby-to-sadi
Wilkinson bosc2010 moby-to-sadiBOSC 2010
 
Venkatesan bosc2010 onto-toolkit
Venkatesan bosc2010 onto-toolkitVenkatesan bosc2010 onto-toolkit
Venkatesan bosc2010 onto-toolkitBOSC 2010
 
Taylor bosc2010
Taylor bosc2010Taylor bosc2010
Taylor bosc2010BOSC 2010
 
Robinson bosc2010 bio_hdf
Robinson bosc2010 bio_hdfRobinson bosc2010 bio_hdf
Robinson bosc2010 bio_hdfBOSC 2010
 

Más de BOSC 2010 (20)

Langmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomicsLangmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomics
 
Schultheiss bosc2010 persistance-web-services
Schultheiss bosc2010 persistance-web-servicesSchultheiss bosc2010 persistance-web-services
Schultheiss bosc2010 persistance-web-services
 
Swertz bosc2010 molgenis
Swertz bosc2010 molgenisSwertz bosc2010 molgenis
Swertz bosc2010 molgenis
 
Rice bosc2010 emboss
Rice bosc2010 embossRice bosc2010 emboss
Rice bosc2010 emboss
 
Morris bosc2010 evoker
Morris bosc2010 evokerMorris bosc2010 evoker
Morris bosc2010 evoker
 
Kono bosc2010 pathway_projector
Kono bosc2010 pathway_projectorKono bosc2010 pathway_projector
Kono bosc2010 pathway_projector
 
Kanterakis bosc2010 molgenis
Kanterakis bosc2010 molgenisKanterakis bosc2010 molgenis
Kanterakis bosc2010 molgenis
 
Gautier bosc2010 pythonbioconductor
Gautier bosc2010 pythonbioconductorGautier bosc2010 pythonbioconductor
Gautier bosc2010 pythonbioconductor
 
Gardler bosc2010 community_developmentattheasf
Gardler bosc2010 community_developmentattheasfGardler bosc2010 community_developmentattheasf
Gardler bosc2010 community_developmentattheasf
 
Friedberg bosc2010 iprstats
Friedberg bosc2010 iprstatsFriedberg bosc2010 iprstats
Friedberg bosc2010 iprstats
 
Fields bosc2010 bio_perl
Fields bosc2010 bio_perlFields bosc2010 bio_perl
Fields bosc2010 bio_perl
 
Chapman bosc2010 biopython
Chapman bosc2010 biopythonChapman bosc2010 biopython
Chapman bosc2010 biopython
 
Bonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_rubyBonnal bosc2010 bio_ruby
Bonnal bosc2010 bio_ruby
 
Puton bosc2010 bio_python-modules-rna
Puton bosc2010 bio_python-modules-rnaPuton bosc2010 bio_python-modules-rna
Puton bosc2010 bio_python-modules-rna
 
Bader bosc2010 cytoweb
Bader bosc2010 cytowebBader bosc2010 cytoweb
Bader bosc2010 cytoweb
 
Zmasek bosc2010 aptx
Zmasek bosc2010 aptxZmasek bosc2010 aptx
Zmasek bosc2010 aptx
 
Wilkinson bosc2010 moby-to-sadi
Wilkinson bosc2010 moby-to-sadiWilkinson bosc2010 moby-to-sadi
Wilkinson bosc2010 moby-to-sadi
 
Venkatesan bosc2010 onto-toolkit
Venkatesan bosc2010 onto-toolkitVenkatesan bosc2010 onto-toolkit
Venkatesan bosc2010 onto-toolkit
 
Taylor bosc2010
Taylor bosc2010Taylor bosc2010
Taylor bosc2010
 
Robinson bosc2010 bio_hdf
Robinson bosc2010 bio_hdfRobinson bosc2010 bio_hdf
Robinson bosc2010 bio_hdf
 

Último

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 

Último (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Talevich bosc2010 bio-phylo

  • 1. Bio.Phylo A unified phylogenetics toolkit for Biopython Eric Talevich Institute of Bioinformatics University of Georgia June 29, 2010
  • 2. Abstract Bio.Phylo is a new phylogenetics library for: • Exploring, modifying and annotating trees • Reading & writing standard file formats • Quick visualization • Gluing together computational pipelines Availability: Biopython 1.54
  • 3. A quick survey of file formats Newick (a.k.a. New Hampshire) is a simple nested-parens format: (A, (B, C), (D, E)) • Extended & tweaked, led to NHX (and parsing problems) Nexus is a collection of formats, including Newick trees • More than just tree data... still tough to parse PhyloXML is an XML-based replacement for NHX • Annotations formalized as XML elements; extensible with user-defined element types NeXML is an XML-based successor to Nexus • Ontology-based: key-value assignments have semantic meaning
  • 4. Demo: What’s in a tree? 1. Read a simple Newick file 2. Inspect through IPython 3. Draw with PyLab/matplotlib 4. Promote to a PhyloXML tree 5. Set branch colors 6. Write a PhyloXML file
  • 5. # In a terminal, make a simple Newick file # Then launch the IPython interpreter and read the file % cat > simple.dnd <<EOF > (((A,B),(C,D)),(E,F,G)) > EOF % ipython -pylab >>> from Bio import Phylo >>> tree = Phylo.read('simple.dnd', 'newick')
  • 6. # String representation shows the object structure >>> print tree Tree(weight=1.0, rooted=False, name='') Clade(branch_length=1.0) Clade(branch_length=1.0) Clade(branch_length=1.0) Clade(branch_length=1.0, name='A') Clade(branch_length=1.0, name='B') Clade(branch_length=1.0) Clade(branch_length=1.0, name='C') Clade(branch_length=1.0, name='D') Clade(branch_length=1.0) Clade(branch_length=1.0, name='E') Clade(branch_length=1.0, name='F') Clade(branch_length=1.0, name='G')
  • 7. # Draw an ASCII-art dendrogram >>> Phylo.draw_ascii(tree, column_width=52) [ASCII-art dendrogram of the tree, with leaves A through G]
  • 8. >>> tree.rooted = True >>> Phylo.draw_graphviz(tree) [Figure: Graphviz drawing of the tree, with leaves A through G]
  • 9. # Promote a basic tree to PhyloXML >>> from Bio.Phylo.PhyloXML import Phylogeny >>> phy = Phylogeny.from_tree(tree) >>> print phy Phylogeny(rooted=True, name='') Clade(branch_length=1.0) Clade(branch_length=1.0) Clade(branch_length=1.0) Clade(branch_length=1.0, name='A') Clade(branch_length=1.0, name='B') Clade(branch_length=1.0) Clade(branch_length=1.0, name='C') Clade(branch_length=1.0, name='D') Clade(branch_length=1.0) Clade(branch_length=1.0, name='E') Clade(branch_length=1.0, name='F') Clade(branch_length=1.0, name='G')
  • 10. Branch color >>> phy.root.color = (128, 128, 128) Or: >>> phy.root.color = '#808080' Or: >>> phy.root.color = 'gray' Find clades by attribute values: >>> mrca = phy.common_ancestor({'name':'E'}, {'name':'F'}) >>> mrca.color = 'salmon' Directly index a clade: >>> phy.clade[0,1].color = 'blue' >>> Phylo.draw_graphviz(phy, prog='neato')
  • 11. [Figure: Graphviz drawing of the colored tree, with leaves A through G]
  • 12. # Save the color annotations in phyloXML >>> Phylo.write(phy, 'simple-color.xml', 'phyloxml') <phy:phyloxml xmlns:phy="http://www.phyloxml.org"> <phylogeny rooted="true"> <clade> <branch_length>1.0</branch_length> <color> <red>128</red> <green>128</green> <blue>128</blue> </color> <clade> <branch_length>1.0</branch_length> <clade> <branch_length>1.0</branch_length> <clade> <name>A</name> ...
  • 13. Thanks Holla: • Brad Chapman and Christian Zmasek, GSoC 2009 mentors • The Biopython developers, feat. Peter J. A. Cock, Frank Kauff & Cymon J. Cox • Hilmar Lapp & the NESCent Phyloinformatics program • Google’s Open Source Programs Office • My professor, Dr. Natarajan Kannan • Developers like you
  • 14. Q&A • Which 3rd-party applications should we wrap in Bio.Phylo.Applications? (e.g. RAxML, MrBayes) • Which other libraries should we support interoperability with? (PyCogent, ape) • What other algorithms are simple, stable and relevant? (Consensus, rooting) • Features for systematics? (Geography, PopGen integration?)
  • 15. Extra: Tree methods >>> dir(tree) collapse, collapse_all, common_ancestor, count_terminals, depths, distance, find_any, find_clades, find_elements, get_nonterminals, get_path, get_terminals, is_bifurcating, is_monophyletic, is_parent_of, is_preterminal, ladderize, prune, split, total_branch_length, trace See: http://biopython.org/DIST/docs/api/Bio.Phylo.BaseTree.TreeMixin-class.html
  • 16. Extra: The Bio.Phylo class hierarchy Figure: Inheritance relationship among the core classes
  • 17. Extra: PhyloXML classes $ pydoc Bio.Phylo.PhyloXML Accession, Alphabet, Annotation, BaseTree, BinaryCharacters, BranchColor, Clade, CladeRelation, Confidence, Date, Distribution, DomainArchitecture, Events, Id, MolSeq, Other, Phylogeny, Phyloxml, Point, Polygon, Property, ProteinDomain, Reference, Sequence, SequenceRelation, Taxonomy, Uri See: http://biopython.org/wiki/PhyloXML
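
Slide 3 describes Newick as a simple nested-parens format, and slide 5 uses the tree (((A,B),(C,D)),(E,F,G)) for the demo. As a rough illustration of what "nested parens" means, here is a minimal sketch in plain Python that turns a topology-only Newick string into nested lists. The helpers parse_newick and leaves are hypothetical names made up for this example; this is not how Bio.Phylo parses Newick, and it assumes no branch lengths, internal node labels, quoted names, or comments.

```python
# Minimal sketch: parse a topology-only Newick string into nested lists.
# parse_newick and leaves are hypothetical helpers, not part of Bio.Phylo.
# Assumption: no branch lengths, internal labels, quoted names, or comments.

def parse_newick(text):
    """Return nested lists of leaf names for a topology-only Newick string."""
    text = text.strip().rstrip(";").replace(" ", "")
    pos = 0

    def parse_clade():
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children = [parse_clade()]
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_clade())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses at position %d" % pos)
            pos += 1  # consume ")"
            return children
        # Otherwise this is a leaf: read up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos]

    tree = parse_clade()
    if pos != len(text):
        raise ValueError("unexpected trailing text at position %d" % pos)
    return tree


def leaves(node):
    """Collect leaf names in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]


demo = parse_newick("(((A,B),(C,D)),(E,F,G))")
print(demo)
print(leaves(demo))
```

The nested lists mirror the clade nesting that `print tree` shows on slide 6: each inner list is an unnamed internal clade and each string is a named terminal. For real data, Phylo.read handles the parts of the format this sketch leaves out.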