Prediction of novel targets using disease association data from Open Targets
Enrico Ferrero, PhD, Associate GSK Fellow
Scientific Leader, Computational Biology, Target Sciences
GSK
@enricoferrero
Data + AI = drugs?
BBC News, 2017; Nature Biotechnology, 2017
The pharma AI space is getting crowded
Developing a new drug: 15+ years, $2B+
So, what’s wrong?
Harrison, Nat Rev Drug Discov, 2016
Cook et al., Nat Rev Drug Discov, 2014
Rethink the drug discovery pipeline
Manhattan Institute, 2012
Late phase failures cost (a lot) more
Spend more time and resources in target discovery
Reduce attrition in later phases
But how do we find good targets?
Nelson et al., Nat Genet, 2015
Open Targets
Koscielny et al., 2016
Could it be as easy as spotting spam emails?
▪ Is it possible to predict novel therapeutic targets using available gene-disease association data?
▪ Is Open Targets just a catalogue of gene-disease associations, or can we learn from it what makes a good target?
A positive-unlabelled (PU) semi-supervised learning approach
▪ Obtain all gene-disease associations and supporting evidence from the Open Targets platform. For each gene, create numeric features by taking the mean score across all diseases for each evidence type:
  ▪ Genetic associations (germline)
  ▪ Somatic mutations
  ▪ Significant gene expression changes
  ▪ Disease-relevant phenotype in animal model
  ▪ Pathway-level evidence
▪ Gather positive labels from Pharmaprojects: only consider targets with drugs currently on the market, in clinical trials or in preclinical studies. A semi-supervised framework with only positive labels is used: targets listed in Pharmaprojects constitute the positive class (P), while the rest of the proteome is used as the unlabelled class (U), which contains both negatives and yet-to-be-discovered positives.
▪ All positive cases (1421) and an equal number of randomly selected unlabelled cases (2842 in total) are set apart for training (80%) and testing (20%). The remainder is kept as a prediction set, on which the final model makes its predictions.
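The data preparation steps above can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the code used for the talk: the record layout, the evidence-type keys, the function names `gene_features` and `pu_split`, and the choice to treat a missing evidence type as a score of 0 are all assumptions, and the toy records are invented.

```python
import random
from collections import defaultdict
from statistics import mean

# Hypothetical keys standing in for the five evidence types.
EVIDENCE_TYPES = ["genetic", "somatic", "expression", "animal_model", "pathway"]


def gene_features(associations):
    """Mean score per evidence type for each gene, taken across its diseases."""
    scores = defaultdict(lambda: defaultdict(list))
    for row in associations:
        for evidence in EVIDENCE_TYPES:
            # Assumption: a missing evidence type counts as a score of 0.
            scores[row["gene"]][evidence].append(row.get(evidence, 0.0))
    return {
        gene: [mean(by_type[e]) for e in EVIDENCE_TYPES]
        for gene, by_type in scores.items()
    }


def pu_split(genes, positives, train_fraction=0.8, seed=0):
    """Build balanced P/U training and test sets plus a prediction set."""
    rng = random.Random(seed)
    pos = sorted(g for g in genes if g in positives)
    unl = sorted(g for g in genes if g not in positives)
    sampled = rng.sample(unl, len(pos))  # as many unlabelled cases as positives
    labelled = [(g, 1) for g in pos] + [(g, 0) for g in sampled]
    rng.shuffle(labelled)
    cut = round(train_fraction * len(labelled))
    # Everything unlabelled that was not sampled is kept for prediction.
    prediction_set = sorted(set(unl) - set(sampled))
    return labelled[:cut], labelled[cut:], prediction_set


# Toy records, invented for illustration only.
toy_associations = [
    {"gene": "G1", "disease": "D1", "genetic": 0.8, "expression": 0.2},
    {"gene": "G1", "disease": "D2", "genetic": 0.4},
    {"gene": "G2", "disease": "D1", "animal_model": 1.0},
    {"gene": "G3", "disease": "D3", "somatic": 0.5},
    {"gene": "G4", "disease": "D2", "pathway": 0.3},
    {"gene": "G5", "disease": "D1", "expression": 0.6},
    {"gene": "G6", "disease": "D3", "genetic": 0.1},
]
features = gene_features(toy_associations)
train, test, prediction_set = pu_split(features, positives={"G1", "G2"})
```

Sampling the unlabelled cases at random keeps the two classes balanced, at the cost of labelling some undiscovered positives as 0 during training.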
Finding structure and most important features
t-SNE dimensionality reduction reveals structured observations
Most important features according to chi-squared test and information gain
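As a rough illustration of the two feature-ranking criteria named on this slide, the sketch below computes information gain and the Pearson chi-squared statistic for a single binary feature against binary labels. The slide does not say how the continuous scores were discretised, so the binary "feature present" encoding and the function names are assumptions made for this example.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return -sum(q * log2(q) for q in (p, 1 - p) if q > 0)


def information_gain(feature_present, labels):
    """Reduction in label entropy after splitting on a binary feature."""
    n = len(labels)
    with_f = [y for x, y in zip(feature_present, labels) if x]
    without_f = [y for x, y in zip(feature_present, labels) if not x]
    conditional = (len(with_f) / n) * entropy(with_f) \
        + (len(without_f) / n) * entropy(without_f)
    return entropy(labels) - conditional


def chi_squared(feature_present, labels):
    """Pearson chi-squared statistic for the 2x2 feature/label table."""
    n = len(labels)
    statistic = 0.0
    for x in (0, 1):
        for y in (0, 1):
            observed = sum(
                1 for f, l in zip(feature_present, labels)
                if int(bool(f)) == x and l == y
            )
            row_total = sum(1 for f in feature_present if int(bool(f)) == x)
            col_total = sum(1 for l in labels if l == y)
            expected = row_total * col_total / n
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic


labels = [1, 1, 0, 0]
informative = [1, 1, 0, 0]    # perfectly tracks the label
uninformative = [1, 0, 1, 0]  # independent of the label
```

Both criteria score the informative feature above the uninformative one, which is the ordering a feature ranking relies on.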
Nested cross-validation and bagging for tuning and model selection
Bischl et al., 2012
Wikipedia
Four classifiers are independently tuned, trained and tested on the training
set using a nested cross-validation strategy (4 inner rounds for parameter
tuning and 4 outer rounds to assess performance):
▪ Random forest
▪ Feed-forward neural network with a single hidden layer
▪ Support vector machine with radial kernel
▪ Gradient boosting machine with AdaBoost exponential loss function
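The nested cross-validation scheme described above (4 inner rounds for tuning, 4 outer rounds for assessment) can be sketched as follows. To keep the example self-contained, a one-dimensional k-nearest-neighbour classifier stands in for the four classifiers listed; that substitution, the parameter grid and the synthetic data are assumptions for illustration, not the setup used in the talk.

```python
import random


def k_folds(indices, k, rng):
    """Shuffle indices and deal them into k roughly equal folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return int(2 * sum(label for _, label in nearest) > len(nearest))


def accuracy(train, test, k):
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


def nested_cv(data, grid=(1, 3, 5), outer_k=4, inner_k=4, seed=0):
    """Return one held-out accuracy per outer fold."""
    rng = random.Random(seed)
    outer = k_folds(list(range(len(data))), outer_k, rng)
    outer_scores = []
    for i, test_idx in enumerate(outer):
        train_idx = [j for f, fold in enumerate(outer) if f != i for j in fold]
        inner = k_folds(train_idx, inner_k, rng)

        def inner_score(k):
            # Mean validation accuracy over the inner folds for one setting.
            scores = []
            for v, val_idx in enumerate(inner):
                fit_idx = [j for f, fold in enumerate(inner) if f != v for j in fold]
                scores.append(accuracy([data[j] for j in fit_idx],
                                       [data[j] for j in val_idx], k))
            return sum(scores) / len(scores)

        best_k = max(grid, key=inner_score)  # tuned on inner folds only
        outer_scores.append(accuracy([data[j] for j in train_idx],
                                     [data[j] for j in test_idx], best_k))
    return outer_scores


# Synthetic one-feature data: two overlapping classes, invented for the demo.
data_rng = random.Random(1)
data = [(data_rng.gauss(1.0, 1.0), 1) for _ in range(20)] \
    + [(data_rng.gauss(-1.0, 1.0), 0) for _ in range(20)]
scores = nested_cv(data)
```

Because the outer test fold is never seen while the parameter is being chosen, the outer scores give a less optimistic performance estimate than tuning and testing on the same folds.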
In PU learning, U contains both positive and negative cases, which makes classifiers unstable. Bagging (bootstrap aggregating) can improve the performance of unstable classifiers by randomly resampling P and U with replacement (bootstrap) and then aggregating the results by majority voting:
▪ Bagging with 100 iterations was applied to the neural network, the support vector machine and the gradient boosting machine.
▪ Random forests are already a special case of bagging.
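A minimal sketch of bagging in the PU setting: P and U are each resampled with replacement, a base learner is fitted on every bootstrap sample with U treated as the negative class, and the final call is a majority vote. A nearest-centroid rule is used here as a deliberately simple stand-in for the neural network, SVM and gradient boosting machine; the function names and toy feature vectors are assumptions for illustration.

```python
import random


def fit_centroids(pos, unl):
    """Base learner: the mean feature vector of each class."""
    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    return centroid(pos), centroid(unl)


def predict_one(model, x):
    """Return 1 if x is closer to the positive centroid, else 0."""
    c_pos, c_unl = model
    d_pos = sum((a - b) ** 2 for a, b in zip(x, c_pos))
    d_unl = sum((a - b) ** 2 for a, b in zip(x, c_unl))
    return int(d_pos < d_unl)


def bagged_pu_predict(pos, unl, queries, n_iter=100, seed=0):
    """Majority vote over base learners fitted on bootstrap samples."""
    rng = random.Random(seed)
    votes = [0] * len(queries)
    for _ in range(n_iter):
        boot_pos = rng.choices(pos, k=len(pos))  # resample P with replacement
        boot_unl = rng.choices(unl, k=len(unl))  # resample U with replacement
        model = fit_centroids(boot_pos, boot_unl)
        for i, x in enumerate(queries):
            votes[i] += predict_one(model, x)
    return [int(2 * v > n_iter) for v in votes]


# Toy feature vectors, invented for illustration. The third unlabelled
# case looks like a positive, mimicking a yet-to-be-discovered target.
pos = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.9]]
unl = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.7], [0.1, 0.1]]
predictions = bagged_pu_predict(pos, unl, [[0.85, 0.85], [0.1, 0.15]])
```

Each bootstrap sample contains a different mix of hidden positives in U, so averaging over many samples dampens the effect of any single contaminated draw.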
Assessing performance and investigating results
Neural network classifier achieves 71% accuracy (0.76 AUC) on test set
More advanced targets have higher disease association evidence
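For readers unfamiliar with the two metrics quoted on this slide, the sketch below shows how accuracy and AUC are computed from predicted scores. AUC is calculated here as the fraction of positive/negative pairs in which the positive case receives the higher score (ties count as half). The labels and scores are invented; they are not the talk's results.

```python
def accuracy(y_true, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the label."""
    predictions = [int(s >= threshold) for s in scores]
    return sum(p == y for p, y in zip(predictions, y_true)) / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: four cases with model scores.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
```

Accuracy depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why the two numbers are reported together.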
Validation of predictions with literature mining
Significant overlap between neural network predictions and text mining results (p = 5.05e-172)
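The slide does not state which test produced this p-value. One common way to assess the overlap between two gene sets is a one-sided hypergeometric test, sketched below under that assumption. The function name and the counts in the example are hypothetical.

```python
from math import comb


def overlap_pvalue(universe, size_a, size_b, overlap):
    """P(X >= overlap) if set B were drawn at random from the universe.

    universe: number of genes considered
    size_a:   size of the first set (e.g. model predictions)
    size_b:   size of the second set (e.g. text mining hits)
    overlap:  number of genes found in both sets
    """
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    return tail / total


# Hypothetical counts: two sets of 5 genes out of 10, fully overlapping.
p_full = overlap_pvalue(universe=10, size_a=5, size_b=5, overlap=5)
```

Python integers have arbitrary precision, so the binomial coefficients stay exact even for proteome-sized inputs; only the final division is rounded to a float.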
Automating drug target discovery with machine learning
▪ The gene-disease association data from Open Targets contains enough information to predict whether a protein can make a therapeutic target with reasonable accuracy (71% accuracy, 0.76 AUC on the test set).
▪ According to our model, the most informative evidence types are animal models showing disease-relevant phenotypes, dysregulated gene expression in disease tissue and genetic associations between gene and disease.
▪ The ability to predict late-stage targets with greater accuracy confirms that a clear linkage between target and disease is essential to maximise the chances of success in the clinic.
▪ Limitations:
  ▪ Lack of prediction on indication;
  ▪ No tractability considerations.
Thank you!
▪ Philippe Sanseau
▪ Ian Dunham
▪ Gautier Koscielny
▪ Giovanni Dall’Olio
▪ Pankaj Agarwal
▪ Mark Hurle
▪ Steven Barrett
▪ Nicola Richmond
▪ Jin Yao
