The Cure: Making a game of gene selection for breast cancer survival prediction

Benjamin M. Good¹, Karthik Gangavarapu¹, Salvatore Loguercio¹, Obi L. Griffith², Max Nanis¹, Chunlei Wu¹, Andrew I. Su¹
¹The Scripps Research Institute, ²Washington University School of Medicine

ABSTRACT
Background: Molecular signatures for predicting breast cancer prognosis
could greatly improve care through personalization of treatment.
Computational analyses of genome-wide expression datasets have
identified such signatures, but these signatures leave much to be desired
in terms of accuracy, reproducibility and biological interpretability.
Methods that take advantage of structured prior knowledge (e.g. protein
interaction networks) show promise in helping to define better
signatures but most knowledge remains unstructured. Crowdsourcing via
scientific discovery games is an emerging methodology that has the
potential to tap into human intelligence at scales and in modes
previously unheard of.
Objective: The main objective of this study was to test the hypothesis
that knowledge linking expression patterns of specific genes to breast
cancer outcomes could be captured from players of an open, Web-based
game. We envisioned capturing knowledge both from the player’s prior
experience and from their ability to interpret text related to candidate
genes presented to them in the context of the game.
Methods: We developed and evaluated an online game called “The
Cure” that captured information from players regarding genes for use in
predictors of breast cancer survival. Information gathered from game
play was aggregated using a voting approach and used to create rankings
of genes. The top genes from these rankings were evaluated using
annotation enrichment analysis, comparison to prior predictor gene
sets, and by using them to train and test machine learning systems for
predicting 10-year survival.
Results: Between its launch in Sept. 2012 and Sept. 2013, The Cure
attracted more than 1,000 registered players who collectively played
nearly 10,000 games. Gene sets assembled through aggregation of the
collected data showed significant enrichment for genes known to be
related to key concepts such as Cancer, Disease Progression, and
Recurrence (P < 1.1e-07). In terms of the accuracy of models trained
using them, these gene sets provided comparable performance to gene
sets generated using other methods including those used in commercial
tests. The Cure is available at http://genegames.org/cure/
CONTACT
Benjamin Good: bgood@scripps.edu @bgood
Andrew Su: asu@scripps.edu @andrewsu

Cure 2.0: Interactive, Collaborative, Genomic Decision Tree Construction, now live!

FUNDING
We acknowledge support from the National Institute of
General Medical Sciences (GM089820 and GM083924).

ACKNOWLEDGEMENTS
Thanks to all of the players of The Cure!

REFERENCES
1. Dutkowski and Ideker (2011) Protein Networks as Logic Functions in Development and Cancer. PLoS Computational Biology
2. Winter et al. (2012) Google Goes Cancer: Improving Outcome Prediction for Cancer Patients by Network-Based Ranking of Marker Genes. PLoS Computational Biology
3. Liu et al. (2012) Identifying dysregulated pathways in cancers from pathway interaction networks. BMC Bioinformatics
4. Good and Su (2011) Games with a Scientific Purpose. Genome Biology
5. Wang et al. (2013) WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013. Nucleic Acids Research

Molecular survival prediction
[Figure: find patterns in samples labeled < 10 year or > 10 year survival, then make predictions on new samples]
• With tens of thousands of measurements but only
hundreds of samples, many possible patterns are found.
• But which ones are real?
• Which genes should we use to build predictors?

The purpose
Prior knowledge encoded in protein-protein interaction
databases [1,2] and pathway databases [3] has been used to
improve prediction.
What about knowledge that is not recorded in structured databases?

Crowdsourcing via scientific discovery games
Online games are successfully tapping into the knowledge
and reasoning abilities of thousands of people [4].
[Figure: example games: devise protein folding algorithms; design RNA molecules]

Figure: The Cure game. Players alternate turns taking a gene card from the
board and adding it to their hand. The tabbed display provides gene
annotations (‘ontology’, ‘Rifs’) and views of decision trees
constructed by the system using the selected genes. There are one
hundred boards to choose from in a given round of the game (four
rounds were completed).
The game
The Cure is a game designed to focus the collective intelligence of a
diverse community on the challenge of selecting genes for building
prognostic classifiers.

The rules
• Goal: pick the best set of genes.
• Best: the gene set that produces the best decision tree classifier.
• Classifier: created using training data and the selected genes, used to predict 10-year survival.
• Score: accuracy of the tree inferred using the selected genes.
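The scoring rule above (a hand of genes is scored by the accuracy of the tree built from it) can be illustrated with a minimal sketch. This is not the game's implementation: it swaps the decision tree learner for a one-level tree (a decision stump) so that the example stays dependency-free, and the gene names, expression values and function names (`fit_stump`, `score_hand`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Dict[str, float]          # hypothetical: gene symbol -> expression value
Stump = Tuple[str, float, int]     # (gene, threshold, label predicted above threshold)


def fit_stump(samples: Sequence[Sample], labels: Sequence[int],
              genes: Sequence[str]) -> Stump:
    """Fit a one-level decision tree restricted to the genes in the hand.

    Labels: 1 = survived more than 10 years, 0 = did not.
    The split with the highest training accuracy wins.
    """
    best, best_correct = None, -1
    for gene in genes:
        values = sorted({s[gene] for s in samples})
        # Candidate thresholds: midpoints between consecutive observed values.
        for thr in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            for above in (0, 1):
                correct = sum(
                    (above if s[gene] > thr else 1 - above) == y
                    for s, y in zip(samples, labels)
                )
                if correct > best_correct:
                    best, best_correct = (gene, thr, above), correct
    if best is None:
        raise ValueError("no split possible with the selected genes")
    return best


def predict(stump: Stump, sample: Sample) -> int:
    gene, thr, above = stump
    return above if sample[gene] > thr else 1 - above


def score_hand(hand: Sequence[str],
               train: Sequence[Sample], train_labels: Sequence[int],
               test: Sequence[Sample], test_labels: Sequence[int]) -> float:
    """Score = held-out accuracy of the classifier built from the hand."""
    stump = fit_stump(train, train_labels, hand)
    hits = sum(predict(stump, s) == y for s, y in zip(test, test_labels))
    return hits / len(test)


# Toy data (invented): GENE_A separates the classes, GENE_B does not.
train: List[Sample] = [
    {"GENE_A": 0.1, "GENE_B": 0.9},
    {"GENE_A": 0.2, "GENE_B": 0.1},
    {"GENE_A": 0.8, "GENE_B": 0.8},
    {"GENE_A": 0.9, "GENE_B": 0.2},
]
train_labels = [0, 0, 1, 1]
test: List[Sample] = [
    {"GENE_A": 0.15, "GENE_B": 0.5},
    {"GENE_A": 0.85, "GENE_B": 0.5},
]
test_labels = [0, 1]

score_a = score_hand(["GENE_A"], train, train_labels, test, test_labels)
score_b = score_hand(["GENE_B"], train, train_labels, test, test_labels)
print(score_a, score_b)
```

In this toy setting the hand containing the informative gene scores higher than the hand containing only the uninformative one, which is the signal the game rewards.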
Results – recruitment and engagement
• One year, 1,077 players, 9,904 games played.
Key result: Genes selected in high frequencies by
the player community performed comparably to
genes selected using statistical approaches and to
genes used in commercial tests when used to train
machine learning models for survival prediction
Results – knowledge captured
Workflow for Synthesizing Knowledge Regarding Gene Selection
1. Select a set of played games based on player information such as education.
2. Measure the frequency with which each gene was selected by these players
across many different games and boards. Each time a gene is added to a hand
a ‘vote’ is recorded for that gene.
3. Measure the likelihood of observing the number of votes a gene has received
by chance and calculate a P value for that gene.
4. Rank genes by P value and select those with P <= 0.001.
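The voting workflow above can be sketched in a few lines. The poster does not state the null model used in step 3, so this sketch assumes one: a player picking at random takes each gene shown on a board with a fixed probability `pick_prob` (hand size divided by board size), and a gene's vote count is compared against a binomial distribution. The board, the gene names and the function names are hypothetical.

```python
from collections import Counter
from math import comb
from typing import Dict, List, Sequence, Tuple

Game = Tuple[Sequence[str], Sequence[str]]  # (genes on the board, genes picked)


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def vote_p_values(games: Sequence[Game], pick_prob: float) -> Dict[str, float]:
    """Steps 2 and 3: count votes per gene and compute a chance P value.

    Assumed null model: every time a gene is shown, a random player
    picks it with probability `pick_prob`.
    """
    shown: Counter = Counter()
    votes: Counter = Counter()
    for board, hand in games:
        shown.update(board)
        votes.update(hand)   # each gene added to a hand is one 'vote'
    return {g: binomial_tail(votes[g], shown[g], pick_prob) for g in shown}


def select_genes(p_values: Dict[str, float], alpha: float = 0.001) -> List[str]:
    """Step 4: rank by P value and keep genes with P <= alpha."""
    ranked = sorted(p_values.items(), key=lambda kv: kv[1])
    return [g for g, p in ranked if p <= alpha]


# Toy data (invented): 12 games on one 4-gene board, 2 genes per hand.
board = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
games = [(board, ["GENE_A", board[1 + i % 3]]) for i in range(12)]

p_values = vote_p_values(games, pick_prob=2 / 4)
selected = select_genes(p_values)
print(selected)
```

Step 1 (choosing which games to include, for example by player education) corresponds to filtering `games` before calling `vote_p_values`.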
Three gene sets were extracted: from all games, from games played by experts, and from games played by novices.
[Figure: Overlap between genes selected by different player populations]
[Figure: Overlap of the ‘expert’ player-selected gene set with known predictor gene sets]
[Figure: Disease terms associated with the 61 genes preferentially selected by all players (P <= 0.001), identified using WebGestalt [5] with adj. P < 1e-05]
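One common way to quantify overlaps like those in the figures above is a hypergeometric test, which asks how likely an overlap at least as large would be if one gene set were drawn at random. The poster does not say how its overlaps were assessed, so the following is only an illustrative sketch; the gene names and the universe size are invented.

```python
from math import comb
from typing import Set


def overlap_p_value(set_a: Set[str], set_b: Set[str], universe_size: int) -> float:
    """P(overlap >= observed) if set_b were a random draw from the universe."""
    k = len(set_a & set_b)                  # observed overlap
    K, n, N = len(set_a), len(set_b), universe_size
    upper = min(K, n)                       # largest possible overlap
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def jaccard(set_a: Set[str], set_b: Set[str]) -> float:
    """Overlap size relative to the union of the two sets."""
    return len(set_a & set_b) / len(set_a | set_b)


# Toy data (invented): 20 genes in total, two small gene sets sharing 3 genes.
player_genes = {"G1", "G2", "G3", "G4", "G5"}
known_predictor_genes = {"G3", "G4", "G5", "G6"}

p_overlap = overlap_p_value(player_genes, known_predictor_genes, universe_size=20)
similarity = jaccard(player_genes, known_predictor_genes)
print(p_overlap, similarity)
```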
Changes in Cure 2.0
1. Adapted for advanced players and scientists.
2. Players choose from all genes in the dataset.
3. Clinical features are supported.
4. Players control the structure of trees.
5. Scoring is based on the accuracy, complexity and
novelty of trees.
6. Collaborative: players can build from other
players’ trees.
7. Trees can also be kept private.
Try it now: http://genegames.org/cure/

Más contenido relacionado

La actualidad más candente

Hallberg & Morgan A Model of the Number of Brain Tumors by Year from Cellphon...
Hallberg & Morgan A Model of the Number of Brain Tumors by Year from Cellphon...Hallberg & Morgan A Model of the Number of Brain Tumors by Year from Cellphon...
Hallberg & Morgan A Model of the Number of Brain Tumors by Year from Cellphon...
Lloyd Morgan
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
Ian Foster
 

La actualidad más candente (12)

Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...
Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...
Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...
 
Genomica Yquimiot
Genomica YquimiotGenomica Yquimiot
Genomica Yquimiot
 
Web applications for rapid microbial taxonomy identification
Web applications for rapid microbial taxonomy identification Web applications for rapid microbial taxonomy identification
Web applications for rapid microbial taxonomy identification
 
Personalized medicine via molecular interrogation, data mining and systems bi...
Personalized medicine via molecular interrogation, data mining and systems bi...Personalized medicine via molecular interrogation, data mining and systems bi...
Personalized medicine via molecular interrogation, data mining and systems bi...
 
Uses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in BioinformaticsUses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in Bioinformatics
 
Bioinformatics in medicine
Bioinformatics in medicineBioinformatics in medicine
Bioinformatics in medicine
 
NetBioSIG2014-Talk by Traver Hart
NetBioSIG2014-Talk by Traver HartNetBioSIG2014-Talk by Traver Hart
NetBioSIG2014-Talk by Traver Hart
 
Big Data Challenges for Real-Time Personalized Medicine
Big Data Challenges for Real-Time Personalized MedicineBig Data Challenges for Real-Time Personalized Medicine
Big Data Challenges for Real-Time Personalized Medicine
 
Exploiting NLP for Digital Disease Informatics
Exploiting NLP for Digital Disease InformaticsExploiting NLP for Digital Disease Informatics
Exploiting NLP for Digital Disease Informatics
 
Hallberg & Morgan A Model of the Number of Brain Tumors by Year from Cellphon...
Hallberg & Morgan A Model of the Number of Brain Tumors by Year from Cellphon...Hallberg & Morgan A Model of the Number of Brain Tumors by Year from Cellphon...
Hallberg & Morgan A Model of the Number of Brain Tumors by Year from Cellphon...
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
 
Pine Biotech
Pine BiotechPine Biotech
Pine Biotech
 

Destacado

Microtask crowdsourcing for disease mention annotation in PubMed abstracts
Microtask crowdsourcing for disease mention annotation in PubMed abstractsMicrotask crowdsourcing for disease mention annotation in PubMed abstracts
Microtask crowdsourcing for disease mention annotation in PubMed abstracts
Benjamin Good
 
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotationMark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
Benjamin Good
 
2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata
Benjamin Good
 
Gene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meetingGene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meeting
Benjamin Good
 
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
Benjamin Good
 
Building a massive biomedical knowledge graph with citizen science
Building a massive biomedical knowledge graph with citizen scienceBuilding a massive biomedical knowledge graph with citizen science
Building a massive biomedical knowledge graph with citizen science
Benjamin Good
 

Destacado (20)

Gene Wiki and Mark2Cure update for BD2K
Gene Wiki and Mark2Cure update for BD2KGene Wiki and Mark2Cure update for BD2K
Gene Wiki and Mark2Cure update for BD2K
 
Citizen sciencepanel2015 pdf
Citizen sciencepanel2015 pdfCitizen sciencepanel2015 pdf
Citizen sciencepanel2015 pdf
 
The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...
The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...
The Cure: A Game with the Purpose of Gene Selection for Breast Cancer Surviva...
 
Channeling Collaborative Spirit
Channeling Collaborative SpiritChanneling Collaborative Spirit
Channeling Collaborative Spirit
 
Science Game Lab
Science Game LabScience Game Lab
Science Game Lab
 
(Bio)Hackathons
(Bio)Hackathons(Bio)Hackathons
(Bio)Hackathons
 
Microtask crowdsourcing for disease mention annotation in PubMed abstracts
Microtask crowdsourcing for disease mention annotation in PubMed abstractsMicrotask crowdsourcing for disease mention annotation in PubMed abstracts
Microtask crowdsourcing for disease mention annotation in PubMed abstracts
 
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotationMark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
 
Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3
 
2015 6 bd2k_biobranch_knowbio
2015 6 bd2k_biobranch_knowbio2015 6 bd2k_biobranch_knowbio
2015 6 bd2k_biobranch_knowbio
 
2016 mem good
2016 mem good2016 mem good
2016 mem good
 
2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata
 
Gene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meetingGene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meeting
 
Wikidata workshop for ISB Biocuration 2016
Wikidata workshop for ISB Biocuration 2016Wikidata workshop for ISB Biocuration 2016
Wikidata workshop for ISB Biocuration 2016
 
Opportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationOpportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocuration
 
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
Poster: Microtask crowdsourcing for disease mention annotation in PubMed abst...
 
Wikidata and the Semantic Web of Food
Wikidata and the  Semantic Web of FoodWikidata and the  Semantic Web of Food
Wikidata and the Semantic Web of Food
 
Building a massive biomedical knowledge graph with citizen science
Building a massive biomedical knowledge graph with citizen scienceBuilding a massive biomedical knowledge graph with citizen science
Building a massive biomedical knowledge graph with citizen science
 
Computing on the shoulders of giants
Computing on the shoulders of giantsComputing on the shoulders of giants
Computing on the shoulders of giants
 
Gene Wiki and Wikimedia Foundation SPARQL workshop
Gene Wiki and Wikimedia Foundation SPARQL workshopGene Wiki and Wikimedia Foundation SPARQL workshop
Gene Wiki and Wikimedia Foundation SPARQL workshop
 

Similar a The Cure: Making a game of gene selection for breast cancer survival prediction

Establishment and analysis of a disease risk prediction model for chronic kid...
Establishment and analysis of a disease risk prediction model for chronic kid...Establishment and analysis of a disease risk prediction model for chronic kid...
Establishment and analysis of a disease risk prediction model for chronic kid...
KrishMendapara1
 
Branch: An interactive, web-based tool for building decision tree classifiers
Branch: An interactive, web-based tool for building decision tree classifiersBranch: An interactive, web-based tool for building decision tree classifiers
Branch: An interactive, web-based tool for building decision tree classifiers
Benjamin Good
 
How Genomics & Data analysis are intertwined each other (1).pdf
How Genomics & Data analysis are intertwined each other  (1).pdfHow Genomics & Data analysis are intertwined each other  (1).pdf
How Genomics & Data analysis are intertwined each other (1).pdf
Nusrat Gulbarga
 
Research Statement Chien-Wei Lin
Research Statement Chien-Wei LinResearch Statement Chien-Wei Lin
Research Statement Chien-Wei Lin
Chien-Wei Lin
 
Introducción a la bioinformatica
Introducción a la bioinformaticaIntroducción a la bioinformatica
Introducción a la bioinformatica
Martín Arrieta
 
John Boikov Personalised Medicine Essay, Mark - 95 out of 100
John Boikov Personalised Medicine Essay, Mark - 95 out of 100John Boikov Personalised Medicine Essay, Mark - 95 out of 100
John Boikov Personalised Medicine Essay, Mark - 95 out of 100
John Boikov
 
Algorithmically Optimized Gene Selection for Targeted Clinical Sequencing Panels
Algorithmically Optimized Gene Selection for Targeted Clinical Sequencing PanelsAlgorithmically Optimized Gene Selection for Targeted Clinical Sequencing Panels
Algorithmically Optimized Gene Selection for Targeted Clinical Sequencing Panels
Thermo Fisher Scientific
 
Mining of Important Informative Genes and Classifier Construction for Cancer ...
Mining of Important Informative Genes and Classifier Construction for Cancer ...Mining of Important Informative Genes and Classifier Construction for Cancer ...
Mining of Important Informative Genes and Classifier Construction for Cancer ...
ijsc
 
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
ijsc
 

Similar a The Cure: Making a game of gene selection for breast cancer survival prediction (20)

A Critical Assessment Of Mus Musculus Gene Function Prediction Using Integrat...
A Critical Assessment Of Mus Musculus Gene Function Prediction Using Integrat...A Critical Assessment Of Mus Musculus Gene Function Prediction Using Integrat...
A Critical Assessment Of Mus Musculus Gene Function Prediction Using Integrat...
 
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMERGENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
GENE-GENE INTERACTION ANALYSIS IN ALZHEIMER
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
 
Pistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier Datathon
 
Establishment and analysis of a disease risk prediction model for chronic kid...
Establishment and analysis of a disease risk prediction model for chronic kid...Establishment and analysis of a disease risk prediction model for chronic kid...
Establishment and analysis of a disease risk prediction model for chronic kid...
 
Computational biology
Computational biologyComputational biology
Computational biology
 
Kishor Presentation
Kishor PresentationKishor Presentation
Kishor Presentation
 
Branch: An interactive, web-based tool for building decision tree classifiers
Branch: An interactive, web-based tool for building decision tree classifiersBranch: An interactive, web-based tool for building decision tree classifiers
Branch: An interactive, web-based tool for building decision tree classifiers
 
Reconstruction and analysis of cancerspecific Gene regulatory networks from G...
Reconstruction and analysis of cancerspecific Gene regulatory networks from G...Reconstruction and analysis of cancerspecific Gene regulatory networks from G...
Reconstruction and analysis of cancerspecific Gene regulatory networks from G...
 
How Genomics & Data analysis are intertwined each other (1).pdf
How Genomics & Data analysis are intertwined each other  (1).pdfHow Genomics & Data analysis are intertwined each other  (1).pdf
How Genomics & Data analysis are intertwined each other (1).pdf
 
Research Statement Chien-Wei Lin
Research Statement Chien-Wei LinResearch Statement Chien-Wei Lin
Research Statement Chien-Wei Lin
 
Introducción a la bioinformatica
Introducción a la bioinformaticaIntroducción a la bioinformatica
Introducción a la bioinformatica
 
John Boikov Personalised Medicine Essay, Mark - 95 out of 100
John Boikov Personalised Medicine Essay, Mark - 95 out of 100John Boikov Personalised Medicine Essay, Mark - 95 out of 100
John Boikov Personalised Medicine Essay, Mark - 95 out of 100
 
Pre-clinical drug prioritization via prognosis-guided genetic interaction net...
Pre-clinical drug prioritization via prognosis-guided genetic interaction net...Pre-clinical drug prioritization via prognosis-guided genetic interaction net...
Pre-clinical drug prioritization via prognosis-guided genetic interaction net...
 
Semantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical InformaticsSemantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical Informatics
 
Algorithmically Optimized Gene Selection for Targeted Clinical Sequencing Panels
Algorithmically Optimized Gene Selection for Targeted Clinical Sequencing PanelsAlgorithmically Optimized Gene Selection for Targeted Clinical Sequencing Panels
Algorithmically Optimized Gene Selection for Targeted Clinical Sequencing Panels
 
In silico prediction of novel therapeutic targets using gene - disease associ...
In silico prediction of novel therapeutic targets using gene - disease associ...In silico prediction of novel therapeutic targets using gene - disease associ...
In silico prediction of novel therapeutic targets using gene - disease associ...
 
Mining of Important Informative Genes and Classifier Construction for Cancer ...
Mining of Important Informative Genes and Classifier Construction for Cancer ...Mining of Important Informative Genes and Classifier Construction for Cancer ...
Mining of Important Informative Genes and Classifier Construction for Cancer ...
 
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
 
Izant openscience
Izant openscienceIzant openscience
Izant openscience
 

Más de Benjamin Good

Integrating Pathway Databases with Gene Ontology Causal Activity Models
Integrating Pathway Databases with Gene Ontology Causal Activity ModelsIntegrating Pathway Databases with Gene Ontology Causal Activity Models
Integrating Pathway Databases with Gene Ontology Causal Activity Models
Benjamin Good
 
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery (Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
Benjamin Good
 
Short update on The Cure game first week
Short update on The Cure game first weekShort update on The Cure game first week
Short update on The Cure game first week
Benjamin Good
 
An online game for human phenotype prediction
An online game for human phenotype predictionAn online game for human phenotype prediction
An online game for human phenotype prediction
Benjamin Good
 

Más de Benjamin Good (10)

Representing and reasoning with biological knowledge
Representing and reasoning with biological knowledgeRepresenting and reasoning with biological knowledge
Representing and reasoning with biological knowledge
 
Integrating Pathway Databases with Gene Ontology Causal Activity Models
Integrating Pathway Databases with Gene Ontology Causal Activity ModelsIntegrating Pathway Databases with Gene Ontology Causal Activity Models
Integrating Pathway Databases with Gene Ontology Causal Activity Models
 
Pathways2GO: Converting BioPax pathways to GO-CAMs
Pathways2GO: Converting BioPax pathways to GO-CAMsPathways2GO: Converting BioPax pathways to GO-CAMs
Pathways2GO: Converting BioPax pathways to GO-CAMs
 
Knowledge Beacons
Knowledge BeaconsKnowledge Beacons
Knowledge Beacons
 
Building a Biomedical Knowledge Garden
Building a Biomedical Knowledge Garden Building a Biomedical Knowledge Garden
Building a Biomedical Knowledge Garden
 
Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2
 
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery (Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
(Poster) Knowledge.Bio: an Interactive Tool for Literature-based Discovery
 
Serious games for bioinformatics education. ISMB 2014 education workshop
Serious games for bioinformatics education.  ISMB 2014 education workshopSerious games for bioinformatics education.  ISMB 2014 education workshop
Serious games for bioinformatics education. ISMB 2014 education workshop
 
Short update on The Cure game first week
Short update on The Cure game first weekShort update on The Cure game first week
Short update on The Cure game first week
 
An online game for human phenotype prediction
An online game for human phenotype predictionAn online game for human phenotype prediction
An online game for human phenotype prediction
 

Último

Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
gindu3009
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Sérgio Sacani
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
Sérgio Sacani
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
PirithiRaju
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
Lokesh Kothari
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Lokesh Kothari
 

Último (20)

GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
 
American Type Culture Collection (ATCC).pptx
American Type Culture Collection (ATCC).pptxAmerican Type Culture Collection (ATCC).pptx
American Type Culture Collection (ATCC).pptx
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 

The Cure: Making a game of gene selection for breast cancer survival prediction

  • 1. The Cure: Making a game of gene selection for breast cancer survival prediction

Benjamin M. Good1, Karthik Gangavarapu1, Salvatore Loguercio1, Obi L. Griffith2, Max Nanis1, Chunlei Wu1, Andrew I. Su1
1The Scripps Research Institute, 2Washington University School of Medicine

ABSTRACT

Background: Molecular signatures for predicting breast cancer prognosis could greatly improve care through personalization of treatment. Computational analyses of genome-wide expression datasets have identified such signatures, but these signatures leave much to be desired in terms of accuracy, reproducibility and biological interpretability. Methods that take advantage of structured prior knowledge (e.g. protein interaction networks) show promise in helping to define better signatures but most knowledge remains unstructured. Crowdsourcing via scientific discovery games is an emerging methodology that has the potential to tap into human intelligence at scales and in modes previously unheard of.

Objective: The main objective of this study was to test the hypothesis that knowledge linking expression patterns of specific genes to breast cancer outcomes could be captured from players of an open, Web-based game. We envisioned capturing knowledge both from the player’s prior experience and from their ability to interpret text related to candidate genes presented to them in the context of the game.

Methods: We developed and evaluated an online game called “The Cure” that captured information from players regarding genes for use in predictors of breast cancer survival. Information gathered from game play was aggregated using a voting approach and used to create rankings of genes. The top genes from these rankings were evaluated using annotation enrichment analysis, comparison to prior predictor gene sets, and by using them to train and test machine learning systems for predicting 10-year survival.

Results: Between its launch in Sept. 2012 and Sept. 2013, The Cure attracted more than 1,000 registered players who collectively played nearly 10,000 games. Gene sets assembled through aggregation of the collected data showed significant enrichment for genes known to be related to key concepts such as Cancer, Disease Progression, and Recurrence (P < 1.1e-07). In terms of the accuracy of models trained using them, these gene sets provided comparable performance to gene sets generated using other methods including those used in commercial tests.

The Cure is available at http://genegames.org/cure/

Cure2.0: Interactive, Collaborative, Genomic Decision Tree Construction, now live!

The Cure game. Players alternate turns taking a gene card from the board and adding it to their hand. The tabbed display provides gene annotations (‘ontology’, ‘Rifs’) and views of decision trees constructed by the system using the selected genes. There are one hundred boards to choose from in a given round of the game (four rounds were completed).

CONTACT

Benjamin Good: bgood@scripps.edu @bgood
Andrew Su: asu@scripps.edu @andrewsu

FUNDING ACKNOWLEDGEMENTS

Thanks to all of the players of The Cure! We acknowledge support from the National Institute of General Medical Sciences (GM089820 and GM083924).

Molecular survival prediction

[Figure: find patterns, then make predictions on new samples (< 10 year vs. > 10 year).]
• With tens of thousands of measurements but only hundreds of samples, many possible patterns are found.
• But which ones are real?
• Which genes should we use to build predictors?

Crowdsourcing via scientific discovery games

Online games are successfully tapping into the knowledge and reasoning abilities of thousands of people [4].
[Figure: example games. Devise protein folding algorithms; design RNA molecules.]

The purpose

Prior knowledge encoded in protein-protein interaction databases [1,2] and pathway databases [3] has been used to improve prediction. What about knowledge that is not recorded in structured databases?

The Cure is a game designed to focus the collective intelligence of a diverse community on the challenge of selecting genes for building prognostic classifiers.

REFERENCES

1. Dutkowski and Ideker (2011) Protein Networks as Logic Functions in Development and Cancer. PLoS Computational Biology.
2. Winter et al. (2012) Google Goes Cancer: Improving Outcome Prediction for Cancer Patients by Network-Based Ranking of Marker Genes. PLoS Computational Biology.
3. Liu et al. (2012) Identifying dysregulated pathways in cancers from pathway interaction networks. BMC Bioinformatics.
4. Good and Su (2011) Games with a Scientific Purpose. Genome Biology.
5. Wang et al. (2013) WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013. Nucleic Acids Research.

The rules

• Goal: pick the best set of genes.
• Best: the gene set that produces the best decision tree classifier.
• Classifier: created using training data and selected genes, used to predict 10 year survival.
• Score: accuracy of the tree inferred using the selected genes.

Results – recruitment and engagement

• One year, 1077 players, 9904 games played.

Key result: Genes selected in high frequencies by the player community performed comparably to genes selected using statistical approaches and to genes used in commercial tests when used to train machine learning models for survival prediction.

Results – knowledge captured

Workflow for Synthesizing Knowledge Regarding Gene Selection
1. Select a set of played games based on player information such as education.
2. Measure the frequency with which each gene was selected by these players across many different games and boards. Each time a gene is added to a hand a ‘vote’ is recorded for that gene.
3. Measure the likelihood of observing the number of votes a gene has received by chance and calculate a P value for that gene.
4. Rank genes by P value and select those with P <= 0.001.

Figure captions:
• 3 gene sets extracted from all games, games from experts, and games from novices.
• Overlap of ‘expert’ player selected gene set with known predictor gene sets.
• Disease terms associated with 61 genes preferentially selected by all players, using WebGestalt [5] with adj. P < 10^-5.
• Overlap between genes selected by different player populations.
• 61 genes preferentially selected by all players, P <= 0.001.

Changes in Cure 2.0
1. Adapted for advanced players / scientists.
2. Players choose from all genes in dataset.
3. Clinical features supported.
4. Players control structure of trees.
5. Scoring based on accuracy, complexity and novelty of trees.
6. Collaborative: players can build from other players' trees.
7. Trees can also be kept private.

Try it now: http://genegames.org/cure/
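
The workflow for synthesizing knowledge regarding gene selection (record a 'vote' each time a gene is added to a hand, estimate how likely that vote count is by chance, keep genes with P <= 0.001) can be sketched in Python. The poster does not say which statistical test was used, so this sketch assumes a one-sided binomial test. The function names, the `offered` counts (how often each gene was available to pick), and the `chance` probability of 0.2 are hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import Counter
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def rank_genes(hands, offered, chance=0.2, alpha=0.001):
    """Rank genes by how unlikely their vote counts are under random picking.

    hands:   list of hands, each a list of gene symbols a player selected
    offered: dict mapping gene -> number of times it was available to pick
             (hypothetical bookkeeping, assumed to be >= the gene's votes)
    chance:  assumed probability that a random player picks an offered gene
    alpha:   P value cutoff (the poster uses P <= 0.001)
    """
    # Step 2 of the workflow: one vote each time a gene is added to a hand.
    votes = Counter(gene for hand in hands for gene in hand)

    # Step 3: P value for seeing at least this many votes by chance.
    scored = []
    for gene, n_offered in offered.items():
        k = votes.get(gene, 0)
        scored.append((gene, k, binomial_tail(k, n_offered, chance)))

    # Step 4: rank by P value and keep genes at or below the cutoff.
    scored.sort(key=lambda row: row[2])
    return [row for row in scored if row[2] <= alpha]


if __name__ == "__main__":
    # Toy data with made-up gene names: GENE_A is picked in 15 of 20 offers,
    # GENE_B in 4 of 20, GENE_C never.
    hands = [["GENE_A", "GENE_B"]] * 4 + [["GENE_A"]] * 11
    offered = {"GENE_A": 20, "GENE_B": 20, "GENE_C": 20}
    for gene, k, p_value in rank_genes(hands, offered):
        print(f"{gene}: {k} votes, P = {p_value:.2e}")
```

Step 1 of the workflow (choosing which games to include, for example by player education) corresponds to filtering `hands` before calling `rank_genes`. In this toy data only GENE_A passes the cutoff; a real analysis would need the true selection probability implied by the board and hand sizes.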