Branch: An interactive, web-based tool for building decision tree 
classifiers 
Benjamin M. Good, Karthik Gangavarapu, Vyshakh Babji, Max Nanis, Andrew I. Su 
ABSTRACT 
A crucial task in modern biology is the prediction of complex 
phenotypes, such as breast cancer prognosis, from genome-wide 
measurements. Machine learning algorithms can sometimes infer 
predictive patterns, but there is rarely enough data to train and test 
them effectively, and the patterns that they identify are often 
expressed in forms (e.g. support vector machines, neural networks, 
random forests composed of tens of thousands of trees) that are 
difficult to understand. In addition, it is generally unclear 
how to include prior knowledge in the course of their construction. 
Decision trees provide an intuitive visual form that can capture 
complex interactions between multiple variables. Effective methods 
exist for inferring decision trees automatically, but it has been shown 
that these techniques can be improved by the manual 
interventions of experts. Here, we introduce Branch, a new Web-based 
tool for the interactive construction of decision trees from 
genomic datasets. Branch offers the ability to: (1) upload and share 
datasets intended for classification tasks (in progress), (2) construct 
decision trees by manually selecting features such as genes for a 
gene expression dataset, (3) collaboratively edit decision trees, (4) 
create feature functions that aggregate content from multiple 
independent features into single decision nodes (e.g. pathways) and 
(5) evaluate decision tree classifiers in terms of precision and recall. 
The tool is optimized for genomic use cases through the inclusion of 
gene and pathway-based search functions. 
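
Item (5) above refers to evaluating a tree by precision and recall. As a minimal sketch of those two measures (not Branch's own implementation), the following computes them for one class of interest. The function name, the class labels, and the example data are illustrative assumptions.

```python
def precision_recall(actual, predicted, positive):
    """Precision and recall for one class of interest (the 'positive' class)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if p == positive and a == positive)  # true positives
    fp = sum(1 for a, p in pairs if p == positive and a != positive)  # false positives
    fn = sum(1 for a, p in pairs if p != positive and a == positive)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for six samples; "poor" is the class of interest.
actual = ["poor", "poor", "good", "good", "poor", "good"]
predicted = ["poor", "good", "good", "poor", "poor", "good"]
precision, recall = precision_recall(actual, predicted, positive="poor")
```

Precision is the fraction of samples predicted as the class that truly belong to it; recall is the fraction of samples in the class that the tree finds.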
Branch enables expert biologists to easily engage directly with high-throughput 
datasets without the need for a team of 
bioinformaticians. The tree building process allows researchers to 
rapidly test hypotheses about interactions between biological 
variables and phenotypes in ways that would otherwise require 
extensive computational sophistication. In so doing, this tool can 
both inform biological research and help to produce more accurate, 
more meaningful classifiers. 
A prototype of Branch is available at http://biobranch.org/ 
The Scripps Research Institute 
Background 
Feature types 
REFERENCES 
CONTACT 
Benjamin Good: bgood@scripps.edu @bgood 
Andrew Su: asu@scripps.edu @andrewsu 
Dataset library 
http://biobranch.org/ 
Building a decision tree 
Research reported in this poster was supported by the National Institute of General Medical Sciences 
of the National Institutes of Health under award numbers R01GM089820 and R01GM083924, and by 
the National Center for Advancing Translational Sciences of the National Institutes of Health under 
award number UL1TR001114. 
Goals 
(1) Find patterns 
(2) Make predictions on new samples 
[Figure: known samples labeled < 10 year or > 10 year; new samples to be predicted as < 10 year or > 10 year] 
1. Griffith et al. (2013) A robust prognostic signature for hormone-positive node-negative 
breast cancer. Genome Medicine. 
2. Dutkowski and Ideker (2011) Protein Networks as Logic Functions in Development and 
Cancer. PLoS Computational Biology. 
3. Winter et al. (2012) Google Goes Cancer: Improving Outcome Prediction for Cancer 
Patients by Network-Based Ranking of Marker Genes. PLoS Computational Biology. 
4. Liu et al. (2012) Identifying dysregulated pathways in cancers from pathway interaction 
networks. BMC Bioinformatics. 
5. Paik et al. (2004) A Multigene Assay to Predict Recurrence of Tamoxifen-Treated, 
Node-Negative Breast Cancer. The New England Journal of Medicine. 
6. Ankerst et al. (1999) Visual classification: an interactive approach to decision tree 
construction. Proceedings of the Fifth ACM SIGKDD International Conference on 
Knowledge Discovery and Data Mining. 
7. Ware et al. (2002) Interactive machine learning: letting users build classifiers. 
International Journal of Human-Computer Studies. 
Example: breast cancer survival prediction 
Gene expression data (+ CNVs, SNPs, etc.) 
(3) Understand the biology that the pattern indicates 
Statistics and machine learning 
• Example: random forests [1] 
• Good at (1) finding patterns 
• Mixed results at (2) identifying patterns that 
generalize well across cohorts 
• Sometimes offer little help with (3) increasing 
understanding of the underlying biology 
Prior knowledge 
• Known relationships between the data elements 
(e.g. genes) can be used to improve predictor 
accuracy and generalizability. 
• Examples of inputs to automated methods: protein-protein 
interactions [2,3], pathway databases [4] 
• Manual consideration by domain experts is a vital 
part of inferring new classifiers and is 
fundamental to the formation of understanding. 
See, for example, the creation of the Oncotype DX 
predictor for breast cancer prognosis [5] 
Funding 
Decision Trees 
• Can be inferred automatically, but... 
• Engaging domain experts in their creation: 
• (1) provides access to prior knowledge, (2) results in 
smaller, more understandable trees, (3) can improve 
predictive performance, and (4) can increase users' 
comprehension of both the classifier and the data [6,7] 
Clicking on a node shows the 
percentage of the dataset that 
passes through it and its 
accuracy. 
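
The two per-node figures described above can be sketched as follows. This assumes that a node's accuracy means the fraction of its samples that belong to its majority class; the poster does not define the measure, and the function name and example counts are hypothetical.

```python
from collections import Counter


def node_summary(node_labels, total_samples):
    """Summarize one tree node from the class labels of the samples reaching it."""
    counts = Counter(node_labels)
    majority_class, majority_count = counts.most_common(1)[0]
    return {
        # Share of the whole dataset that passes through this node, in percent.
        "coverage_pct": 100.0 * len(node_labels) / total_samples,
        # Class the node would predict (assumed here: the majority class).
        "predicted": majority_class,
        # Assumed definition: fraction of the node's samples in that class.
        "accuracy": majority_count / len(node_labels),
    }


# Hypothetical node: 8 of 40 samples reach it, 6 of them in one class.
summary = node_summary(["<10 year"] * 6 + [">10 year"] * 2, total_samples=40)
```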
View and use trees shared by the community 
• Gene (e.g. expression) 
• Non-gene (e.g. clinical data) 
• Custom feature (manually created feature 
combination) 
• Classifier node (e.g. a trained SVM) 
• Pre-existing tree 
• Visual (manually defined decision 
boundary using GUI) 
• Create a classifier node. 
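
As an illustration of the custom feature type, the sketch below collapses several gene measurements into a single value that one decision node could test. Averaging is an assumption made for this example, since the poster does not say how Branch combines features. The gene names and values are hypothetical.

```python
from statistics import mean


def pathway_feature(sample, genes):
    """Combine several gene values from one sample into a single score.

    The mean is used purely for illustration; other aggregation
    functions could be substituted.
    """
    values = [sample[g] for g in genes if g in sample]
    if not values:
        raise ValueError("none of the listed genes are present in the sample")
    return mean(values)


# Hypothetical expression values for one sample.
sample = {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 9.0}
score = pathway_feature(sample, ["GENE_A", "GENE_B"])
```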
Iteratively select a feature to create each split (an if-then rule) 
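
A single split of this kind can be sketched as a threshold test on one chosen feature. The helper below is illustrative only: the feature name, the threshold, and the sample records are assumptions, and Branch's own data structures are not shown in the poster.

```python
def split(samples, feature, threshold):
    """Apply one if-then rule: if feature <= threshold go left, else go right."""
    left = [s for s in samples if s[feature] <= threshold]
    right = [s for s in samples if s[feature] > threshold]
    return left, right


# Hypothetical samples with one expression value and a survival label each.
samples = [
    {"GENE_A": 1.2, "label": ">10 year"},
    {"GENE_A": 3.4, "label": "<10 year"},
    {"GENE_A": 0.8, "label": ">10 year"},
    {"GENE_A": 2.9, "label": "<10 year"},
]
low, high = split(samples, "GENE_A", threshold=2.0)
```

Building a tree by hand amounts to repeating this step on each resulting subset, choosing a new feature and threshold each time.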
Transplant rejection 
HIV-1 coreceptor usage 
• Test datasets loaded: 
• Breast cancer survival (gene expression) 
• Kidney transplant rejection (gene expression) 
• HIV coreceptor usage (amino acid sequences) 
• Coming soon: upload your own data 
The number of colored squares indicates the 
number of samples that pass through the 
node. The colors are associated with the 
classes to be predicted. Ideal leaf nodes are 
‘pure’ in that they contain samples from only 
one class. 
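
One common way to quantify how close a node is to being pure is Gini impurity, which is 0 when every sample at the node shares one class. The poster does not mention this measure; it is used here only as a familiar example, and the labels are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


pure_leaf = gini_impurity(["<10 year"] * 5)
mixed_leaf = gini_impurity(["<10 year", ">10 year"] * 3)
```

A pure leaf scores 0, and a two-class leaf with equal counts scores 0.5, the maximum for two classes.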
Breast cancer survival 
Decision trees can be made 
private or shared with the 
public when saved. Public 
trees may be used as a 
starting point for others. 
For collaboratively authored 
trees, the author associated 
with each node is tracked. 

Citizen sciencepanel2015 pdf
 

Branch: An interactive, web-based tool for building decision tree classifiers

Benjamin M. Good, Karthik Gangavarapu, Vyshakh Babji, Max Nanis, Andrew I. Su
The Scripps Research Institute

ABSTRACT
A crucial task in modern biology is the prediction of complex phenotypes, such as breast cancer prognosis, from genome-wide measurements. Machine learning algorithms can sometimes infer predictive patterns, but there is rarely enough data to train and test them effectively, and the patterns that they identify are often expressed in forms (e.g. support vector machines, neural networks, random forests composed of tens of thousands of trees) that are very difficult to understand. In addition, it is generally unclear how to include prior knowledge in the course of their construction. Decision trees provide an intuitive visual form that can capture complex interactions between multiple variables. Effective methods exist for inferring decision trees automatically, but it has been shown that these techniques can be improved upon via the manual interventions of experts.

Here, we introduce Branch, a new Web-based tool for the interactive construction of decision trees from genomic datasets. Branch offers the ability to: (1) upload and share datasets intended for classification tasks (in progress), (2) construct decision trees by manually selecting features such as genes for a gene expression dataset, (3) collaboratively edit decision trees, (4) create feature functions that aggregate content from multiple independent features into single decision nodes (e.g. pathways), and (5) evaluate decision tree classifiers in terms of precision and recall. The tool is optimized for genomic use cases through the inclusion of gene and pathway-based search functions.

Branch enables expert biologists to engage directly with high-throughput datasets without the need for a team of bioinformaticians. The tree building process allows researchers to rapidly test hypotheses about interactions between biological variables and phenotypes in ways that would otherwise require extensive computational sophistication. In so doing, this tool can both inform biological research and help to produce more accurate, more meaningful classifiers. A prototype of Branch is available at http://biobranch.org/

Background

Goals
(1) Find patterns
(2) Make predictions on new samples
(3) Understand the biology that the pattern indicates

Example: breast cancer survival prediction
Gene expression data (+ CNVs, SNPs, etc.)
[Figure labels: < 10 year, > 10 year (known samples); < 10 year ?, > 10 year ? (new samples)]

Statistics and machine learning
• Example: Random Forests [1]
• Good at (1) finding patterns
• Have mixed results at (2) identifying patterns that generalize well across cohorts
• Sometimes offer little help for (3) increasing understanding of the underlying biology

Prior knowledge
• Known relationships between the data elements (e.g. genes) can be used to improve predictor accuracy and generalizability.
• Examples of inputs to automated methods: protein-protein interactions [2,3], pathway databases [4]
• Manual consideration by domain experts is a vital aspect of the inference of new classifiers and is fundamental to the formation of understanding. See, for example, the creation of the OncoTypeDx predictor for breast cancer prognosis [5].

Decision Trees
• Can be inferred automatically, but...
• Engaging domain experts in their creation: (1) provides access to prior knowledge, (2) results in smaller, more understandable trees, (3) can improve predictive performance, (4) can increase the user's comprehension of both the classifier and the data [6,7]

REFERENCES
1. Griffith et al. (2013) A robust prognostic signature for hormone-positive node-negative breast cancer. Genome Medicine.
2. Dutkowski and Ideker (2011) Protein Networks as Logic Functions in Development and Cancer. PLoS Computational Biology.
3. Winter et al. (2012) Google Goes Cancer: Improving Outcome Prediction for Cancer Patients by Network-Based Ranking of Marker Genes. PLoS Computational Biology.
4. Liu et al. (2012) Identifying dysregulated pathways in cancers from pathway interaction networks. BMC Bioinformatics.
5. Paik et al. (2004) A Multigene Assay to Predict Recurrence of Tamoxifen-Treated, Node-Negative Breast Cancer. The New England Journal of Medicine.
6. Mihael et al. (1999) Visual classification: an interactive approach to decision tree construction. Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining.
7. Malcolm W. (2002) Interactive machine learning: letting users build classifiers. International Journal of Human-Computer Studies.

Funding
Research reported in this poster was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award numbers R01GM089820 and R01GM083924, and by the National Center for Advancing Translational Sciences of the National Institutes of Health under award number UL1TR001114.

CONTACT
Benjamin Good: bgood@scripps.edu @bgood
Andrew Su: asu@scripps.edu @andrewsu
http://biobranch.org/

Dataset library
• Test datasets loaded:
  • Breast cancer survival (gene expression)
  • Kidney transplant rejection (gene expression)
  • HIV coreceptor usage (amino acid sequences)
• Coming soon: upload your own data
[Figure panels: Breast cancer survival; Transplant rejection; HIV-1 coreceptor usage]

View/use trees shared by community
Decision trees can be made private or shared with the public when saved. Public trees may be used as a starting point for others. For collaboratively authored trees, the author associated with each node is tracked.

Feature types
• Gene (e.g. expression)
• Non-gene (e.g. clinical data)
• Custom feature (manually created feature combination)
• Classifier node (e.g. a trained SVM)
• Pre-existing tree
• Visual (manually defined decision boundary using GUI)

Building a decision tree
• Iteratively select a feature to create each split (If/Then rule).
• Create a classifier node.
• The number of colored squares indicates the number of samples that pass through the node. The colors are associated with the classes to be predicted. Ideal leaf nodes are 'pure' in that they contain only one class.
• Clicking on a node shows the percentage of the dataset that passes through it and its accuracy.
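The node statistics and the precision/recall evaluation described in this poster can be illustrated with a small sketch. This is not Branch's implementation. The `Node` class, the feature names `GENE_A` and `GENE_B`, the thresholds, and the eight toy samples are all hypothetical, and "accuracy" at a node is assumed here to mean the share of samples reaching that node which the subtree below it classifies correctly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One If/Then split, or a leaf when `feature` is None (hypothetical structure)."""
    feature: Optional[str] = None
    threshold: float = 0.0
    low: Optional["Node"] = None    # branch taken when value <= threshold
    high: Optional["Node"] = None   # branch taken when value > threshold
    label: Optional[str] = None     # class predicted at a leaf


def predict(node, sample):
    """Follow the If/Then rules from `node` down to a leaf and return its label."""
    while node.feature is not None:
        node = node.low if sample[node.feature] <= node.threshold else node.high
    return node.label


def node_report(node, data, total, name="root"):
    """Per node: fraction of the dataset passing through, accuracy, and purity."""
    classes = {label for _, label in data}
    correct = sum(1 for s, y in data if predict(node, s) == y)
    report = [{
        "node": name,
        "fraction": len(data) / total,
        "accuracy": correct / len(data) if data else 0.0,
        "pure": len(classes) == 1,   # only one class reaches this node
    }]
    if node.feature is not None:
        low = [(s, y) for s, y in data if s[node.feature] <= node.threshold]
        high = [(s, y) for s, y in data if s[node.feature] > node.threshold]
        report += node_report(node.low, low, total, name + "/low")
        report += node_report(node.high, high, total, name + "/high")
    return report


def precision_recall(tree, data, positive):
    """Precision and recall of the whole tree for one class treated as positive."""
    tp = sum(1 for s, y in data if predict(tree, s) == positive and y == positive)
    fp = sum(1 for s, y in data if predict(tree, s) == positive and y != positive)
    fn = sum(1 for s, y in data if predict(tree, s) != positive and y == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (invented): expression values and a survival class per sample.
data = [
    ({"GENE_A": 0.2, "GENE_B": 0.9}, "<10yr"),
    ({"GENE_A": 0.3, "GENE_B": 0.8}, "<10yr"),
    ({"GENE_A": 0.4, "GENE_B": 0.1}, ">10yr"),
    ({"GENE_A": 0.1, "GENE_B": 0.2}, ">10yr"),
    ({"GENE_A": 0.9, "GENE_B": 0.5}, ">10yr"),
    ({"GENE_A": 0.8, "GENE_B": 0.4}, ">10yr"),
    ({"GENE_A": 0.7, "GENE_B": 0.6}, "<10yr"),
    ({"GENE_A": 0.6, "GENE_B": 0.3}, ">10yr"),
]

# A hand-built tree: split on GENE_A first, then on GENE_B in the low branch.
tree = Node(
    feature="GENE_A", threshold=0.5,
    low=Node(feature="GENE_B", threshold=0.5,
             low=Node(label=">10yr"), high=Node(label="<10yr")),
    high=Node(label=">10yr"),
)

report = node_report(tree, data, total=len(data))
precision, recall = precision_recall(tree, data, positive="<10yr")

for row in report:
    print(row)
print("precision:", precision, "recall:", recall)
```

On this toy data the tree misclassifies one of the eight samples, and that sample ends up in the one leaf that is not pure. A real dataset would also call for a held-out test set, which this sketch omits.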