Biopython Project Update
Bioinformatics Open Source Conference (BOSC)
July 14, 2012
Long Beach, California, USA

Eric Talevich, Peter Cock, Brad Chapman, João Rodrigues,
and Biopython contributors
Hello, BOSC
Biopython is a freely available Python library for biological
computation, and a long-running, distributed collaboration
to produce and maintain it [1].
 ● Supported by the Open Bioinformatics Foundation
    (OBF)
 ● "This is Python's Bio* library. There are several Bio*
    libraries like it, but this one is ours."
 ● http://biopython.org/
_____
[1] Cock, P.J.A., Antao, T., Chang, J.T., Chapman, B.A., Cox, C.J., Dalke, A.,
Friedberg, I., Hamelryck, T., Kauff, F., Wilczynski, B., de Hoon, M.J. (2009)
Biopython: freely available Python tools for computational molecular biology
and bioinformatics. Bioinformatics 25(11) 1422-3.
doi:10.1093/bioinformatics/btp163
Bio.Graphics (Biopython 1.59, February 2012)
New features in...
BasicChromosome:
 ● Draw simple sub-features on chromosome segments
 ● Show the position of genes, SNPs or other loci

GenomeDiagram [2]:
 ● Cross-links between tracks
 ● Track-specific start/end positions for showing regions

_____
[2] Pritchard, L., White, J.A., Birch, P.R., Toth, I. (2006) GenomeDiagram: a
python package for the visualization of large-scale genomic data.
Bioinformatics 22(5) 616-7.
doi:10.1093/bioinformatics/btk021
BasicChromosome: Potato NB-LRRs
Jupe et al. (2012) BMC Genomics
GenomeDiagram: A tale of three phages
Swanson et al. (2012) PLoS One (to appear)
GenomeDiagram imitates
Artemis Comparison Tool (ACT)
SeqIO and AlignIO
(Biopython 1.58, August 2011)

● SeqXML format [3]

● Read support for ABI chromatogram files (Wibowo A.)

● "phylip-relaxed" format (Connor McCoy, Brandon I.)
     ○ Relaxes the 10-character limit on taxon names
     ○ Space-delimited instead
     ○ Used in RAxML, PhyML, PAML, etc.

_____
[3] Schmitt et al. (2011) SeqXML and OrthoXML: standards for sequence and
orthology information. Briefings in Bioinformatics 12(5): 485-488.
doi:10.1093/bib/bbr025
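
The difference between strict and relaxed PHYLIP names can be sketched with the standard library alone. This is a minimal illustration for sequential, single-block files, not Biopython's own parser; the function name `parse_phylip` and the sample alignment are hypothetical.

```python
def parse_phylip(text, relaxed=False):
    """Parse a sequential PHYLIP alignment into {name: sequence}.

    strict:  the name is exactly the first 10 characters of each line
    relaxed: the name is everything up to the first run of whitespace
    """
    lines = [line for line in text.splitlines() if line.strip()]
    n_taxa, n_sites = (int(x) for x in lines[0].split())
    records = {}
    for line in lines[1:1 + n_taxa]:
        if relaxed:
            name, seq = line.split(None, 1)
        else:
            name, seq = line[:10].strip(), line[10:]
        seq = "".join(seq.split())  # drop spaces used to group columns
        if len(seq) != n_sites:
            raise ValueError(f"{name!r}: expected {n_sites} sites, got {len(seq)}")
        records[name] = seq
    return records


# Hypothetical data: both names are longer than 10 characters.
relaxed_text = """2 8
Homo_sapiens_chr1 ACGTACGT
Mus_musculus_chr4 ACGTTCGT
"""
alignment = parse_phylip(relaxed_text, relaxed=True)
print(sorted(alignment))
```

Reading the same text in strict mode fails, because the 10-character cut lands in the middle of each name. In Biopython itself the format is selected by name, for example `AlignIO.read(handle, "phylip-relaxed")`.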
Bio.Phylo & pypaml

● PAML interop: wrappers, I/O, glue
  ○ Merged Brandon Invergo’s pypaml as
    Bio.Phylo.PAML (Biopython 1.58, August 2011)

● Phylo.draw improvements

● RAxML wrapper (Biopython 1.60, June 2012)

● Paper in review [4]

_____
[4] Talevich, E., Invergo, B.M., Cock, P.J.A., Chapman, B.A. (2012) Bio.Phylo:
a unified toolkit for processing, analysis and visualization of phylogenetic data
in Biopython. BMC Bioinformatics 13:209. doi:10.1186/1471-2105-13-209
Phylo.draw and matplotlib
Bio.bgzf (Blocked GNU Zip Format)
● BGZF is a GZIP variant that compresses
  blocks of a fixed, known size
● Used in Next Generation Sequencing for
  efficient random access to compressed files
  ○ SAM + BGZF = BAM


Bio.SeqIO can now index BGZF compressed
sequence files. (Biopython 1.60, June 2012)
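
Why blocks give random access can be sketched with the standard library. The code below is a simplified illustration, not the `Bio.bgzf` implementation: it writes each chunk as its own gzip member carrying the "BC" extra field that records the block length, then decompresses a single block by its byte offset. The helper names `bgzf_block` and `read_block` are hypothetical, and the usual empty end-of-file block is omitted; the two offset helpers reimplement the 64-bit "virtual offset" arithmetic (block start in the high 48 bits, position inside the decompressed block in the low 16 bits).

```python
import struct
import zlib


def bgzf_block(data):
    """Pack one small chunk as a BGZF block: a gzip member with a 'BC' extra field."""
    deflater = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw deflate stream
    cdata = deflater.compress(data) + deflater.flush()
    bsize = len(cdata) + 25  # total block length minus one
    # ID1, ID2, CM=deflate, FLG=FEXTRA, MTIME, XFL, OS=unknown, XLEN=6
    header = struct.pack("<BBBBIBBH", 31, 139, 8, 4, 0, 0, 255, 6)
    extra = b"BC" + struct.pack("<HH", 2, bsize)
    trailer = struct.pack("<II", zlib.crc32(data), len(data))
    return header + extra + cdata + trailer


def read_block(buf, start):
    """Decompress only the block that begins at byte offset `start`."""
    bsize = struct.unpack_from("<H", buf, start + 16)[0]
    block = buf[start:start + bsize + 1]
    return zlib.decompress(block[18:-8], -15)  # skip 18-byte header, 8-byte trailer


def make_virtual_offset(block_start, within_block):
    return (block_start << 16) | within_block


def split_virtual_offset(voffset):
    return voffset >> 16, voffset & 0xFFFF


# Hypothetical data: two FASTA records, one per block.
chunks = [b">seq1\nACGTACGT\n", b">seq2\nTTGGCCAA\n"]
blocks = [bgzf_block(chunk) for chunk in chunks]
compressed = b"".join(blocks)

# Jump straight to the sequence line of the second record.
voffset = make_virtual_offset(len(blocks[0]), 6)
block_start, within = split_virtual_offset(voffset)
record = read_block(compressed, block_start)[within:]
print(record)
```

Because every block is a complete gzip member, the concatenated file remains readable by ordinary gzip tools, while an index of virtual offsets lets a reader decompress only the block it needs.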
TogoWS
(Biopython 1.59, February 2012)

● TogoWS is an integrated web resource for
  bioinformatics databases and services
● Provided by the Database Center for Life Science in
  Japan
● Usage is similar to NCBI Entrez

_____
http://togows.dbcls.jp/
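
As with Entrez, a lookup boils down to naming a database and one or more entry identifiers. The sketch below only builds a request URL and performs no network access. The path layout (`/entry/<database>/<ids>/<field>.<format>`) is an assumption about the TogoWS REST scheme, and `togows_entry_url` is a hypothetical helper, not part of the `Bio.TogoWS` module.

```python
from urllib.parse import quote

TOGOWS_BASE = "http://togows.dbcls.jp"  # base URL from the slide


def togows_entry_url(db, entry_ids, field=None, fmt=None):
    """Build an entry-retrieval URL (assumed layout, see the note above)."""
    if isinstance(entry_ids, str):
        entry_ids = [entry_ids]
    ids = quote(",".join(entry_ids), safe=",")  # keep commas between identifiers
    url = f"{TOGOWS_BASE}/entry/{quote(db)}/{ids}"
    if field:
        url += "/" + quote(field)
    if fmt:
        url += "." + quote(fmt)
    return url


print(togows_entry_url("pubmed", "16381885", field="authors"))
```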
PyPy and Python 3
Biopython:
● works well on PyPy 1.9
  (excluding NumPy & C extensions)
● works on Python 3 (excluding some C
  extensions), but concerns remain about
  performance in default unicode mode.
  ○ Currently 'beta' level support.
Bio.PDB
● mmCIF parser restored (Biopython 1.60, June 2012)
  ○ Lenna Peterson fixed a 4-year-old lex/yacc-related
    compilation issue
  ○ That was awesome
  ○ Now she's a GSoC student
  ○ Py3/PyPy/Jython compatibility in progress

● Merging GSoC results incrementally
  ○ Atom element names & weights (João Rodrigues,
    GSoC 2010)
  ○ Lots of feature branches remaining...
Bio.PDB feature branches

[Figure: branch diagram over GSoC '10, '11, '12, ... Labels: PDBParser,
Bio.Struct, Mocapy++, Generic Features, InterfaceAnalysis, mmCIF Parser]
Google Summer of Code (GSoC)
In 2011, Biopython had three projects funded via the OBF:
● Mikael Trellet (Bio.PDB)
● Michele Silva (Bio.PDB, Mocapy++)
● Justinas Daugmaudis (Mocapy++)

In 2012, we have two projects via the OBF:
● Wibowo Arindrarto (SearchIO)
● Lenna Peterson (Variants)

_____
http://biopython.org/wiki/Google_Summer_of_Code
http://www.open-bio.org/wiki/Google_Summer_of_Code
https://www.google-melange.com/
GSoC 2011: Mikael Trellet
Biomolecular interfaces in Bio.PDB
Mentor: João Rodrigues

● Representation of protein-protein
  interfaces: SM(I)CRA
● Determining interfaces from PDB coordinates
● Analyses of these objects

_____
http://biopython.org/wiki/GSoC2011_mtrellet
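
One common way to determine an interface from coordinates is a distance cutoff: a residue belongs to the interface if any of its atoms lies within a few ångströms of an atom in the partner chain. The sketch below assumes that approach for illustration; it is not necessarily the method used in the GSoC project, and the `Atom` tuple, the `interface_residues` function, the 5 Å default and the coordinates are all hypothetical.

```python
import math
from collections import namedtuple

# Minimal stand-in for a parsed PDB atom (hypothetical, not a Bio.PDB class).
Atom = namedtuple("Atom", "chain resseq x y z")


def interface_residues(atoms, chain_a, chain_b, cutoff=5.0):
    """Return (chain, residue number) pairs with an atom within `cutoff` Å of the other chain."""
    side_a = [a for a in atoms if a.chain == chain_a]
    side_b = [b for b in atoms if b.chain == chain_b]
    contacts = set()
    for a in side_a:
        for b in side_b:
            if math.dist((a.x, a.y, a.z), (b.x, b.y, b.z)) <= cutoff:
                contacts.add((a.chain, a.resseq))
                contacts.add((b.chain, b.resseq))
    return sorted(contacts)


atoms = [
    Atom("A", 10, 0.0, 0.0, 0.0),
    Atom("A", 11, 20.0, 0.0, 0.0),
    Atom("B", 55, 3.0, 0.0, 0.0),
    Atom("B", 56, 40.0, 0.0, 0.0),
]
print(interface_residues(atoms, "A", "B"))
```

The all-pairs loop is quadratic in the number of atoms, which is acceptable for a sketch; for real structures a spatial index such as `Bio.PDB.NeighborSearch` would likely be the better choice.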
GSoC 2011: Michele Silva
Python/Biopython bindings for Mocapy++
Mentor: Thomas Hamelryck

Michele Silva wrote a Python bridge for Mocapy++ and
linked it to Bio.PDB to enable statistical analysis of protein
structures.

More-or-less ready to merge after the next Mocapy++
release.
_____
http://biopython.org/wiki/GSOC2011_Mocapy
GSoC 2011: Justinas Daugmaudis
Mocapy extensions in Python
Mentor: Thomas Hamelryck

The project enhances Mocapy++ in a complementary way: a plugin
system that lets users write new nodes (probability
distribution functions) in Python.

He's finishing this as part of his master's thesis project with
Thomas Hamelryck.
_____
http://biopython.org/wiki/GSOC2011_MocapyExt
GSoC 2012: Lenna Peterson
Diff My DNA: Development of a
Genomic Variant Toolkit for Biopython
Mentors: Brad Chapman, James Casbon

● I/O for VCF, GVF formats
● internal schema for variant data


_____
http://arklenna.tumblr.com/tagged/gsoc2012
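
To make the I/O and schema goals concrete, here is a minimal sketch that reads the eight fixed VCF columns into a small record type. It is an illustration only, not the project's design: the `Variant` dataclass and both functions are hypothetical names, the sample lines are illustrative, and genotype columns and GVF are not handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    chrom: str
    pos: int  # 1-based position, as written in the file
    id: Optional[str]
    ref: str
    alts: list
    qual: Optional[float]
    filters: list
    info: dict


def parse_vcf_line(line):
    """Parse the eight fixed, tab-separated VCF columns; '.' means missing."""
    chrom, pos, vid, ref, alt, qual, filt, info = line.rstrip("\n").split("\t")[:8]
    info_dict = {}
    if info != ".":
        for item in info.split(";"):
            key, sep, value = item.partition("=")
            info_dict[key] = value if sep else True  # keys without '=' are flags
    return Variant(
        chrom=chrom,
        pos=int(pos),
        id=None if vid == "." else vid,
        ref=ref,
        alts=[] if alt == "." else alt.split(","),
        qual=None if qual == "." else float(qual),
        filters=[] if filt == "." else filt.split(";"),
        info=info_dict,
    )


def read_vcf(lines):
    """Yield one Variant per data line, skipping '#' header lines and blanks."""
    for line in lines:
        if line.strip() and not line.startswith("#"):
            yield parse_vcf_line(line)


sample = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;DB",
    "20\t1110696\t.\tA\tG,T\t.\t.\tNS=2",
]
variants = list(read_vcf(sample))
print(variants[0])
```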
GSoC 2012: Wibowo Arindrarto
SearchIO implementation in
Biopython
Mentor: Peter Cock

Unified, BioPerl-like API for
search results from BLAST,
HMMer, FASTA, etc.


_____
http://biopython.org/wiki/SearchIO
http://bow.web.id/blog/tag/gsoc/
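
The idea of a unified API is that every search tool's output is mapped onto one object model, so downstream code does not depend on the file format. A minimal sketch of such a model follows, fed from BLAST's 12-column tabular output. The class and function names here are hypothetical and chosen for illustration; this is not the Bio.SearchIO API, which was still being written during the project.

```python
from dataclasses import dataclass, field


@dataclass
class HSP:
    """One aligned region (high-scoring pair) between query and hit."""
    query_start: int
    query_end: int
    evalue: float
    bitscore: float


@dataclass
class Hit:
    """One database sequence matched by the query."""
    id: str
    hsps: list = field(default_factory=list)


@dataclass
class QueryResult:
    """All hits for one query sequence."""
    id: str
    hits: dict = field(default_factory=dict)


def parse_blast_tab(lines):
    """Group standard 12-column BLAST tabular rows into QueryResult > Hit > HSP."""
    results = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        query_id, hit_id = cols[0], cols[1]
        result = results.setdefault(query_id, QueryResult(query_id))
        hit = result.hits.setdefault(hit_id, Hit(hit_id))
        hit.hsps.append(
            HSP(int(cols[6]), int(cols[7]), float(cols[10]), float(cols[11]))
        )
    return list(results.values())


# Hypothetical rows: one query, two hits, the first hit with two HSPs.
rows = [
    "q1\tsubjA\t98.5\t200\t3\t0\t1\t200\t11\t210\t1e-100\t360",
    "q1\tsubjA\t91.0\t50\t4\t1\t250\t299\t400\t449\t2e-10\t70.1",
    "q1\tsubjB\t80.0\t100\t20\t0\t5\t104\t1\t100\t3e-20\t95.3",
]
results = parse_blast_tab(rows)
for hit in results[0].hits.values():
    print(hit.id, len(hit.hsps))
```

A parser for another tool's output would only need to fill the same three classes, which is what lets the rest of a pipeline stay format-agnostic.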
Thanks
● OBF
● BOSC organizers
● Biopython contributors
● Scientists like you

Check us out:
● Website: http://biopython.org
● Code: https://github.com/biopython/biopython
