TECHNICAL AND SOCIAL CHALLENGES
IN SYNTHESIZING THE TREE OF LIFE
Karen Cranston	

National Evolutionary Synthesis Center

@kcranstn	

http://slideshare.net/kcranstn
IF WE “HAD” A TREE OF LIFE?
complete = contains all of biodiversity

dynamic = continuously updated with new data

available digitally = browsing, querying, downloading
Produce a digitally-available phylogeny that
contains all of biodiversity	

Provide tools for managing, analyzing and sharing
phylogenetic data
http://avatol.org
CHALLENGE: COMPLETENESS
Even if there were phylogenies for all species in GenBank, we would only have a small fraction of biodiversity
NCBI taxonomy data (578 taxa)

Soltis et al APG III phylogeny (30 taxa)
from Stephen Smith
Dipsacales graph

Synthesized tree: keeps the structure of the phylogeny but contains all 578 taxa
from Stephen Smith
Inputs:
• Published phylogenies
• Taxonomies

filter / weight input trees

synthesize into single data structure

process feedback

input new data sets

complete tree of life
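
A minimal sketch of the idea behind the synthesized tree, in Python: keep the branching structure of a published phylogeny and use a taxonomy to fill in the taxa the phylogeny did not sample. The taxon names, the two-level genus/species taxonomy, and the function `expand_with_taxonomy` are hypothetical, and the toy assumes at most one sampled tip per genus. It is not the Open Tree synthesis method.

```python
# Toy illustration only: trees are nested tuples, tips are strings.
# All names below are placeholders, not real taxa.

def expand_with_taxonomy(tree, genus_of, species_in):
    """Replace each sampled tip with a clade holding every species of its genus."""
    if isinstance(tree, tuple):
        return tuple(expand_with_taxonomy(child, genus_of, species_in)
                     for child in tree)
    members = sorted(species_in[genus_of[tree]])
    # A genus with a single species stays a plain tip.
    return tree if len(members) == 1 else tuple(members)


def tips(tree):
    """Return all tip labels of a nested-tuple tree, left to right."""
    if isinstance(tree, tuple):
        return [label for child in tree for label in tips(child)]
    return [tree]


# Taxonomy: every known species, grouped by genus (hypothetical).
species_in = {"A": ["A1", "A2", "A3"], "B": ["B1"], "C": ["C1", "C2"]}
genus_of = {sp: genus for genus, members in species_in.items() for sp in members}

# Published phylogeny: samples only one species per genus.
phylogeny = (("A1", "B1"), "C1")

synthesized = expand_with_taxonomy(phylogeny, genus_of, species_in)
print(synthesized)
print(len(tips(phylogeny)), "tips in the phylogeny,",
      len(tips(synthesized)), "tips after adding the taxonomy")
```

The relationships among the sampled lineages come from the phylogeny; the taxonomy only contributes the unsampled species, which appear as unresolved groups.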
CHALLENGE: ACCESS TO
PUBLISHED PHYLOGENIES
“Phylogeny provides a mechanism through which to
interpret the patterns and processes of evolution and to
predict the responses of life to rapid environmental change.
Phylogenies and phylogenetic methods are now being used
to enhance agriculture, identify and combat diseases,
conserve biodiversity, and predict responses to global
climate change and to biological invasions.” *

(tl;dr: We need trees to do cool and important science)
* OpenTree grant proposal
Expertise in
phylogenetic
inference

Expertise in
methods
that use
phylogenies
EVOLUTION

TREE Fig._S1 = [&R] (2,1,((3,7),(4,(6,(33,(15,((20,(47,((51,
(49,50)),(46,(48,(52,16)))))),(((44,45),((18,(12,(13,(43,42)))),
((41,((39,38),(40,17))),((35,9),(34,(36,37)))))),(32,(((21,19),
((30,14),(22,((11,31),((27,25),(23,((28,(24,8)),(10,(26,
(5,29)))))))))),((((72,(63,57)),((65,64),((66,67),(68,(69,(70,
(71,54))))))),(((82,59),(60,(61,(62,55)))),((80,(81,56)),((53,
(77,78)),((75,73),(76,(58,74))))))),((88,((86,87),((85,84),
(83,89)))),(79,((91,(93,(95,(92,(96,(94,90)))))),((100,(99,98)),
(97,(((168,((172,185),((159,101),(109,157)))),(((181,(179,180)),
((102,(183,187)),(175,(176,(178,177))))),(212,((195,(210,211)),
(199,((201,(196,202)),((194,197),((203,(192,205)),(204,(193,
((209,(208,206)),(198,(200,207))))))))))))),(113,(((154,
((169,170),(103,191))),((131,126),(128,((134,135),(129,(125,
((132,130),(104,133)))))))),((((190,166),((162,171),((116,120),
(115,114)))),((122,(188,(186,108))),((118,(119,105)),(117,(158,
(184,189)))))),((123,124),(((148,((165,161),(174,182))),
((106,121),(163,(167,127)))),((173,(156,(155,160))),(164,
(((136,137),(139,(138,107))),((153,145),(112,(((146,143),(144,
(140,141))),((142,152),(147,((110,111),(149,
(150,151)))))))))))))))))))))))))))))))));

Fig. 1. Combined molecular phylogenetic tree for Diptera. Partitioned ML analysis of combined taxon sets of tier 1 and tier 2 FLYTREE data samples (−lnL = 344155.6169) calculated in RAxML. Circles indicate bootstrap support >80% (black/bp = 95–100%, gray/bp = 88–94%, white/bp = 80–88%). Nodes with improved bootstrap values resulting from postanalysis pruning of unstable taxa are marked by stars (black/bp = 95–100%, gray/bp = 88–94%, white/bp = 80–88%). Colored squares on terminal branches indicate the presence, in at least one species of a family, of ecological traits as shown to lower left. The number of origins of each trait was estimated with reference to the phylogeny, the distribution of each trait among genera within a family, and the known biology of the organisms.

Furthermore, a paraphyletic relationship of phorids and syrphids would support the hypothesis that their shared special mode of extraembryonic development (dorsal amnion closure) (26) evolved in the stem lineage of Cyclorrhapha and preceded the origin of the schizophoran amnioserosa.
Wiegmann et al.

To test this hypothesis, we used a relatively recent phylogenomic
marker: small, noncoding, regulatory micro-RNAs (miRNAs).
miRNAs exhibit a striking phylogenetic pattern of conservation
across the metazoan tree of life, suggesting the accumulation and
maintenance of miRNA families throughout organismal evolution

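library(phytools)                  # also attaches ape, which supplies read.tree()
flyTree <- read.tree("flies.tre")  # read the tree from a Newick file
contMap(flyTree, flyData)          # flyData: named numeric vector of a continuous trait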

Wiegmann et al. PNAS, 2011
Archiving sequence data is a community norm

Archiving phylogenetic data is quite rare: ~4% of all published phylogenetic trees (Stoltzfus et al 2012)
OPENTREE PHYLOGENY INPUTS
Surveyed >7000 phylogenetic studies in plants, fungi, animals and unicellular organisms

Result: data for >2700 studies, >4800 trees
CHALLENGE: SELECTING
BACKBONE TAXONOMY
Complete?	

Up to date with taxonomic literature?	

Phylogenetically-informed?
Systematics research → (very slow…) → Online taxonomic resources
OPEN TREE TAXONOMY
[source taxonomy logos]
+ patch files for manual edits (requires source info!)

3,133,028 nodes and 2,559,835 ‘species’
https://github.com/OpenTreeOfLife/reference-taxonomy
CHALLENGE: PHYLOGENY
CURATION
TREE Fig._S1 = [&R] (2,1,((3,7),(4,(6,(33,(15,((20,(47,((51,
(49,50)),(46,(48,(52,16)))))),(((44,45),((18,(12,(13,(43,42)))),
((41,((39,38),(40,17))),((35,9),(34,(36,37)))))),(32,(((21,19),
((30,14),(22,((11,31),((27,25),(23,((28,(24,8)),(10,(26,
(5,29)))))))))),((((72,(63,57)),((65,64),((66,67),(68,(69,(70,
(71,54))))))),(((82,59),(60,(61,(62,55)))),((80,(81,56)),((53,
(77,78)),((75,73),(76,(58,74))))))),((88,((86,87),((85,84),
(83,89)))),(79,((91,(93,(95,(92,(96,(94,90)))))),((100,(99,98)),
(97,(((168,((172,185),((159,101),(109,157)))),(((181,(179,180)),
((102,(183,187)),(175,(176,(178,177))))),(212,((195,(210,211)),
(199,((201,(196,202)),((194,197),((203,(192,205)),(204,(193,
((209,(208,206)),(198,(200,207))))))))))))),(113,(((154,
((169,170),(103,191))),((131,126),(128,((134,135),(129,(125,
((132,130),(104,133)))))))),((((190,166),((162,171),((116,120),
(115,114)))),((122,(188,(186,108))),((118,(119,105)),(117,(158,
(184,189)))))),((123,124),(((148,((165,161),(174,182))),
((106,121),(163,(167,127)))),((173,(156,(155,160))),(164,
(((136,137),(139,(138,107))),((153,145),(112,(((146,143),(144,
(140,141))),((142,152),(147,((110,111),(149,
(150,151)))))))))))))))))))))))))))))))));

How was this tree inferred?	

What are the tip labels?	

Is it rooted correctly?	

What clade was the focus of the study?
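
The question about tip labels can be made concrete with a small Python sketch. It parses a Newick string that has labels only (no branch lengths, quotes, or comments) and lists the tips. The short input string is an illustrative stand-in patterned on the tree above, and `parse_newick` and `tip_labels` are hypothetical helper names, not part of any library.

```python
def parse_newick(text):
    """Parse a labels-only Newick string into nested lists of tip labels."""
    text = text.strip().rstrip(";")
    pos = 0

    def peek():
        # Current character, or "" once the input is exhausted.
        return text[pos] if pos < len(text) else ""

    def parse_node():
        nonlocal pos
        if peek() == "(":
            pos += 1                      # consume "("
            children = [parse_node()]
            while peek() == ",":
                pos += 1                  # consume ","
                children.append(parse_node())
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1                      # consume ")"
            return children
        start = pos
        while peek() and peek() not in ",()":
            pos += 1
        return text[start:pos].strip()

    tree = parse_node()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    return tree


def tip_labels(tree):
    """Flatten a parsed tree into its tip labels, left to right."""
    if isinstance(tree, list):
        return [label for child in tree for label in tip_labels(child)]
    return [tree]


tree = parse_newick("(2,1,((3,7),(4,6)));")
print(tip_labels(tree))
```

The tips come back as bare integers. Without the study's translation table there is no way to tell which taxa they stand for, and nothing in the string records the inference method or the focal clade.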
CURATOR TOOLS
Data curation

NexSON
(NeXML as JSON)

Tree synthesis
Input names

Mapped to
taxonomy
Tree synthesis

API layer
Common data store of NexSON files (NeXML as JSON)
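
A rough Python sketch of the "input names → mapped to taxonomy" step, under loud assumptions: the taxon names, the `tax:` identifiers, the matching rule (case and underscore insensitive, exact otherwise), and the JSON field names are all hypothetical. This is not the NexSON schema and not the Open Tree curation tool.

```python
import json


def normalize(name):
    """Lower-case a label and turn underscores or runs of spaces into single spaces."""
    return " ".join(name.replace("_", " ").split()).lower()


def map_tips(tip_labels, taxonomy):
    """Split tip labels into those found in the taxonomy and those left unmapped."""
    lookup = {normalize(name): taxon_id for name, taxon_id in taxonomy.items()}
    mapped, unmapped = {}, []
    for label in tip_labels:
        taxon_id = lookup.get(normalize(label))
        if taxon_id is None:
            unmapped.append(label)    # needs a curator's attention
        else:
            mapped[label] = taxon_id
    return mapped, unmapped


# Hypothetical taxonomy and tip labels as they might appear in a tree file.
taxonomy = {"Genus alpha": "tax:1", "Genus beta": "tax:2"}
labels = ["Genus_alpha", "genus  beta", "Genus gamma sp. 3"]

mapped, unmapped = map_tips(labels, taxonomy)
record = {"study": "example-study", "mapped": mapped, "unmapped": unmapped}
print(json.dumps(record, indent=2))
```

Keeping the original label next to the mapped identifier preserves what the study actually published, while the unmapped list shows where manual curation is still needed.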
• Open source software tools for managing open data
• Publicly-accessible data store
• Full provenance data (who changed what & when?)
• Allows access & download through standard protocols (git)
• Where possible, using Creative Commons 0 waiver
CHALLENGE: SYNTHESIZING
PHYLOGENY AND TAXONOMY
Graph databases are key

Open Tree of Life
Thanks to Joseph Brown, Stephen Smith, Jonathan Rees, Jim Allman
for getting the latest version up last night!
Synthesis details next week from Stephen
Smith, University of Michigan	

Thursday, February 13, 1 pm EST	

phyloseminar.org
WHAT CAN WE DO WITH THESE
DATA AND TOOLS?
Comparing phylogeny and taxonomy

Rick Ree & Lyndon Coghill
Conflict within sets of trees

Open Tree of Life

Stephen Smith
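
One common way to express conflict between rooted trees, sketched in Python: two clades conflict when they share some tips but neither contains the other. The trees and tip names are made up, `clusters` and `conflicts` are hypothetical helper names, and the sketch assumes both trees have the same tip set. It is not necessarily how the Open Tree software measures conflict.

```python
def clusters(tree):
    """Return the clades of a nested-tuple tree as frozensets of tip labels.

    The root clade (all tips) is left out because it can never conflict.
    """
    found = set()

    def visit(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        tips = frozenset().union(*(visit(child) for child in node))
        found.add(tips)
        return tips

    found.discard(visit(tree))
    return found


def conflicts(tree_a, tree_b):
    """List pairs of clades that overlap without one being nested in the other."""
    return [
        (a, b)
        for a in clusters(tree_a)
        for b in clusters(tree_b)
        if a & b and not (a <= b or b <= a)
    ]


t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
t3 = ((("A", "B"), "C"), "D")

print(len(conflicts(t1, t2)), "conflicting clade pairs between t1 and t2")
print(conflicts(t1, t3))
```

Here `t1` and `t3` agree on the clade A+B and disagree only on where C belongs, so a single clade pair is reported.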
Highlight under-studied parts of the tree	

Label internal nodes on phylogenies 	

Test various methods for synthesis	

Quantify and visualize phylogenetic conflict	

Extract phylogeny given list of taxa 	

Infer branch lengths on synthetic trees	

Organize biodiversity data phylogenetically	

… and many more, enabled by phylogenetic synthesis and digitally
available phylogenetic data
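
As one example from the list, "extract phylogeny given list of taxa" can be sketched in Python as pruning a tree down to the requested tips and collapsing nodes left with a single child. The tree, the tip names, and the function `induced_subtree` are hypothetical; a real service would work on the synthetic tree rather than on nested tuples.

```python
def induced_subtree(tree, keep):
    """Prune a nested-tuple tree to the tips in `keep`.

    Returns None when no requested tip lies below this node.
    """
    if not isinstance(tree, tuple):
        return tree if tree in keep else None
    kept = [sub for sub in (induced_subtree(child, keep) for child in tree)
            if sub is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]        # collapse a node that has only one child left
    return tuple(kept)


full_tree = ((("A", "B"), "C"), ("D", "E"))
wanted = {"A", "C", "E"}

print(induced_subtree(full_tree, wanted))
```

The pruned tree keeps the relationships the full tree implies among the requested taxa and nothing else.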
COMING IN 2014	


Hackathon, jointly with	

Clade-based curation and analysis workshops
QUESTIONS? PARTICIPATE?

opentreeoflife@googlegroups.com	

opentreeoflife-software@googlegroups.com	

irc: #opentreeoflife on freenode	

http://github.com/OpenTreeOfLife
Gordon Burleigh	

Keith Crandall	

Karl Gude	

David Hibbett	

Mark Holder	

Laura Katz	

Rick Ree	


Stephen Smith	

Doug Soltis	

Tiffani Williams	

+ many postdocs,
grad students and
undergrads

@NESCent: Karen Cranston, Jonathan Rees, Jim Allman

Open Tree of Life Phyloseminar 2014
