Predicting Pharmacology. Willem van Hoorn, Pfizer Global Research & Development, Sandwich, UK. Pipeline Pilot UGM, San Diego, Mar 2006
Standing on the Shoulders of Giants: Willem van Hoorn; Gaia Paolini, Richard Shapland, Andrew Hopkins, Jonathan Mason
The Work of Giants
  4.8 M structures; 275k active compounds; 600k activities (IC50, etc); 3k targets; 800 human targets
  Sources combined into the Unified DB: Inpharmatica StARLITe, Cerep Bioprint, Thomson IDDB, Pfizer in house
Why Giants Are Required
Unified Database as Starting Point
  Unified DB
  Bayesian (Learn Molecular Categories): predicting activities
  Linear Discriminant Analysis (LDA): predicting gene families
  Polypharmacology interaction network
Polypharmacology Network From Binding Data
  Node: target. Edge: compound.
  Target classes in the legend: Metalloproteases; Cysteine proteases; Serine proteases; Phosphodiesterases; Aminergic GPCRs; Peptide GPCRs; GPCRs (others: classes A, B & C); Enzymes (hydrolases, transferases, oxidoreductases & others); Ion Channels; Nuclear hormone receptors; Aspartyl proteases; Kinases; Miscellaneous
Deriving Multi-Category Bayesian Model
  Unified DB: 238k actives (≤ 10 µM), human target, Mw < 1000, pass reactivity filter, ≥ 10 actives / target
  Descriptor: FCFP_6
  Training set: 90% / 214k
  Test set: 10% / 23,792 compounds, 55,781 activities
  698 models
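The slide does not say how the multi-category Bayesian learner scores a compound, so the following is only a minimal sketch of one common approach: a Laplacian-corrected naive Bayes estimator with one model per target. That choice of estimator is an assumption, as are the function names, the target names, and the toy integer feature IDs, which merely stand in for FCFP_6 fingerprint features. The scores it produces are not on the same scale as the cut-offs quoted on later slides.

```python
import math
from collections import Counter, defaultdict

def train(compounds):
    """compounds: list of (feature_ids, targets). Returns {target: {feature: weight}}."""
    n_total = len(compounds)
    feature_count = Counter()                    # compounds containing each feature
    active_count = Counter()                     # actives per target
    active_feature_count = defaultdict(Counter)  # actives per target containing each feature
    for features, targets in compounds:
        feature_count.update(features)
        for t in targets:
            active_count[t] += 1
            active_feature_count[t].update(features)
    models = {}
    for t, n_active in active_count.items():
        base_rate = n_active / n_total
        # Laplacian-corrected weight (assumed estimator): features seen more
        # often among actives than the base rate predicts get positive weights.
        models[t] = {
            f: math.log((active_feature_count[t][f] + 1) / (n_f * base_rate + 1))
            for f, n_f in feature_count.items()
        }
    return models

def score(models, features):
    """Sum the weights of the query's features for every target model."""
    return {t: sum(w.get(f, 0.0) for f in features) for t, w in models.items()}

def predict(models, features, cutoff=0.0):
    """Targets whose score exceeds the cut-off, best first."""
    scores = score(models, features)
    return sorted((t for t, s in scores.items() if s > cutoff),
                  key=lambda t: -scores[t])

# Hypothetical toy training set: feature IDs and target names are made up.
compounds = [
    ({1, 2, 3}, {"target_A"}),
    ({1, 2, 4}, {"target_A", "target_C"}),
    ({5, 6, 7}, {"target_B"}),
    ({5, 6, 8}, {"target_B"}),
    ({2, 9}, {"target_A"}),
    ({6, 9}, {"target_B"}),
]
models = train(compounds)
print(predict(models, {1, 2}))
print(predict(models, {5, 6}))
```

Because a compound may list several targets, a single query can return more than one prediction, which is what makes the later polypharmacology network possible.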
Assessing the Predictions of the Random Test Set
Assessing the Predictions of the Random Test Set (Bayesian cut-off 50)
  58,428 predictions / 17,210 compounds
  16,281 compounds with ≥ 1 correct prediction
  31,600 true positives (random: 292); enrichment ~ 100 fold
  26,828 false positives (random: 55,489)
  24,181 false negatives
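The counts on this slide are enough to recompute the headline figures. The sketch below derives precision, recall and enrichment over random from the slide's numbers; the function and variable names are illustrative only.

```python
def prediction_metrics(tp, fp, fn, random_tp):
    """Precision, recall and enrichment over random from raw counts."""
    precision = tp / (tp + fp)     # fraction of predictions that are correct
    recall = tp / (tp + fn)        # fraction of known activities recovered
    enrichment = tp / random_tp    # true positives relative to random picking
    return precision, recall, enrichment

# Counts quoted on the slide for the random 10% test set.
precision, recall, enrichment = prediction_metrics(
    tp=31_600, fp=26_828, fn=24_181, random_tp=292
)
print(f"precision={precision:.2f} recall={recall:.2f} enrichment={enrichment:.0f}x")
```

True positives plus false negatives add up to 55,781, the number of test-set activities, and true plus false positives add up to the 58,428 predictions, so the slide's counts are internally consistent.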
Predicted Polypharmacology Network At Bayesian Cut-off 50
  [Figure: network with edges marked as true positive prediction or false positive prediction. Labelled target classes: Nuclear hormone receptors; Ion Channels; Phosphodiesterases; Aminergic GPCRs; Peptide GPCRs; GPCRs (others); Enzymes (others)]
Predicted Polypharmacology Network At Bayesian Cut-off 50
A More Challenging Test Set: Cerep Bioprint
  Unified DB: 238k actives (≤ 10 µM), human target, Mw < 1000, pass reactivity filter, ≥ 10 actives / target
  Descriptor: FCFP_6
  Training set: 237k
  Test set: Bioprint, 997 compounds, 316 targets
  694 models
A More Challenging Test Set: Cerep Bioprint (Bayesian cut-off 50)
  720 predictions / 291 compounds
  210 compounds with ≥ 1 correct prediction
  433 true positives (random: 17); enrichment ~ 25 fold
  287 false positives (random: 55,489)
  12,281 false negatives
Another Look At The Same Data (Bayesian cut-off 0)
  36,222 predictions
  6,121 true positives
  30,101 false positives
  6,593 false negatives
  48% of actives in 11% of data
  Plus 378 extra predicted targets
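Lowering the cut-off trades precision for recall. A small sketch using only the Bioprint counts quoted on the two slides makes the trade-off explicit; the dictionary layout and names are illustrative.

```python
# Bioprint test-set counts at the two Bayesian cut-offs quoted on the slides.
bioprint = {
    50: {"tp": 433, "fp": 287, "fn": 12_281},
    0: {"tp": 6_121, "fp": 30_101, "fn": 6_593},
}

def summarize(counts):
    """Return (precision, recall) for one cut-off."""
    precision = counts["tp"] / (counts["tp"] + counts["fp"])
    recall = counts["tp"] / (counts["tp"] + counts["fn"])
    return precision, recall

summary = {cutoff: summarize(counts) for cutoff, counts in bioprint.items()}
for cutoff, (precision, recall) in summary.items():
    print(f"cut-off {cutoff}: precision={precision:.1%} recall={recall:.1%}")
```

At both cut-offs the true positives and false negatives sum to the same 12,714 known activities, and the recall at cut-off 0 reproduces the 48% of actives quoted on the slide.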
A More Challenging Test Set: Cerep Bioprint
Linear Discriminant Analysis
  [Figure: banknote with measured dimensions labelled length, height, left rim, bottom rim, diagonal]
  Source: H. Lohninger, Teach/Me Data Analysis, http://www.vias.org/tmdatanaleng

  NOTE    Length  Left   Right  Bottom  Top    Diagonal  Genuine
  BN1     214.8   131.0  131.1   9.000   9.700  141.0     true
  BN2     214.6   129.7  129.7   8.100   9.500  141.7     true
  BN3     214.8   129.7  129.7   8.700   9.600  142.2     true
  BN4     214.8   129.7  129.6   7.500  10.40   142.0     true
  BN5     215.0   129.6  129.7  10.40    7.700  141.8     true
  BN6     215.7   130.8  130.5   9.000  10.10   141.4     true
  BN7     215.5   129.5  129.7   7.900   9.600  141.6     true
  BN8     214.5   129.6  129.2   7.200  10.70   141.7     true
  BN9     214.9   129.4  129.7   8.200  11.00   141.9     true
  BN10    215.2   130.4  130.3   9.200  10.00   140.7     true
  ...
  BN195   214.9   130.3  130.5  11.60   10.60   139.8     false
  BN196   215.0   130.4  130.3   9.900  12.10   139.6     false
  BN197   215.1   130.3  129.9  10.30   11.50   139.7     false
  BN198   214.8   130.3  130.4  10.60   11.10   140.0     false
  BN199   214.7   130.7  130.8  11.20   11.20   139.4     false
  BN200   214.3   129.9  129.9  10.20   11.50   139.6     false
Predicting Forgeries with LDA and Bayesian

  LDA
  NOTE  Length  Left   Right  Bottom  Top    Diagonal  BankNotes  LD1
  BN1   215.1   130.0  129.8   9.100  10.20  141.5     true        2.501
  BN2   214.7   130.7  130.8  11.20   11.20  139.4     false      -4.561
  BN3   214.3   129.9  129.9  10.20   11.50  139.6     false      -3.390
  BN4   214.7   130.0  129.4   7.800  10.00  141.2     true        4.060

  Bayesian
  NOTE  Length  Left   Right  Bottom  Top    Diagonal  BankNotesBayes
  BN1   215.1   130.0  129.8   9.100  10.20  141.5      1.992
  BN2   214.7   130.7  130.8  11.20   11.20  139.4     -6.611
  BN3   214.3   129.9  129.9  10.20   11.50  139.6     -6.341
  BN4   214.7   130.0  129.4   7.800  10.00  141.2      1.771
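To illustrate how a linear discriminant separates genuine from forged notes, here is a minimal two-class Fisher LDA sketch in plain Python. It is deliberately reduced: it uses only two of the six measurements (Bottom and Diagonal) and only the 16 training rows visible on the previous slide, not the full 200-note data set, and it is not the Pipeline Pilot component used in the talk. Its scores therefore differ in scale from the LD1 column above; only the sign is expected to agree.

```python
# (Bottom, Diagonal) for the training notes visible on the previous slide.
genuine = [(9.0, 141.0), (8.1, 141.7), (8.7, 142.2), (7.5, 142.0), (10.4, 141.8),
           (9.0, 141.4), (7.9, 141.6), (7.2, 141.7), (8.2, 141.9), (9.2, 140.7)]
forged = [(11.6, 139.8), (9.9, 139.6), (10.3, 139.7),
          (10.6, 140.0), (11.2, 139.4), (10.2, 139.6)]

def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, m):
    """Within-class scatter terms (sxx, sxy, syy) around the class mean m."""
    sxx = sum((x - m[0]) ** 2 for x, _ in points)
    sxy = sum((x - m[0]) * (y - m[1]) for x, y in points)
    syy = sum((y - m[1]) ** 2 for _, y in points)
    return sxx, sxy, syy

def fit_lda(pos, neg):
    """Fisher discriminant: w = Sw^-1 (mean_pos - mean_neg), threshold at the midpoint."""
    mp, mn = mean(pos), mean(neg)
    a, b = scatter(pos, mp), scatter(neg, mn)
    sxx, sxy, syy = (a[i] + b[i] for i in range(3))   # pooled within-class scatter
    det = sxx * syy - sxy ** 2
    dx, dy = mp[0] - mn[0], mp[1] - mn[1]
    w = ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)
    mid = ((mp[0] + mn[0]) / 2, (mp[1] + mn[1]) / 2)
    return w, mid

def lda_score(w, mid, x):
    """Positive means closer to the genuine class, negative to the forged class."""
    return w[0] * (x[0] - mid[0]) + w[1] * (x[1] - mid[1])

w, mid = fit_lda(genuine, forged)
# The four notes scored on this slide, reduced to (Bottom, Diagonal).
test_notes = {"BN1": (9.1, 141.5), "BN2": (11.2, 139.4),
              "BN3": (10.2, 139.6), "BN4": (7.8, 141.2)}
for name, x in test_notes.items():
    s = lda_score(w, mid, x)
    print(name, "genuine" if s > 0 else "forged", round(s, 3))
```

Placing the decision threshold at the midpoint of the two class means assumes equal priors and a shared covariance, which is the usual simplification for a two-class illustration.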
Predicting Gene Class by Physical Properties
  Compounds binding to different gene classes possess different physical property distributions (plot axes: Mw, clogP).
  Can this be used to predict gene class from physical properties alone?
  How does LDA compare to Bayesian?
Predicting Gene Class by Physical Properties
  Unified DB: 148k actives (≤ 10 µM), human target, Mw < 1000, pass reactivity filter, binding to single target class only
  20 Gene Classes: Aminergic GPCRs; Aspartyl Proteases; Cysteine Proteases; Enzymes- others; GPCRs Class A- others; GPCRs Class B; GPCRs Class C; Hydrolases; Ion Channels- Ligand_Gated; Ion Channels- others; Kinases- others; Metalloproteases; Nuclear hormone receptors; Others; Oxidoreductases; PDEs; Peptide GPCRs; Protein Kinases; Serine Proteases; Transferases
Predicting Gene Class by Physical Properties
  10 Descriptors: Molecular_Weight; Num_H_Acceptors; Num_H_Donors; Num_RotatableBonds; Molecular_PolarSurfaceArea; No_IonCenters; Molecular_Solubility; Molecular_SurfaceArea; ClogP*; Andrews*
  147,534 compounds split into 118,118 (training) and 29,416 (test)
Predicting Gene Class by Physical Properties
  Number of test compounds per target class: experimental count, then predicted count (of which correct) for each method.

  Target class                 Experiment  LDA (correct)  Bayes (correct)  Random (correct)
  Aminergic GPCRs                    3228    5864 (2042)     2299 (902)       1463 (153)
  Aspartyl Proteases                  728    1180 (613)      1670 (614)       1479 (29)
  Cysteine Proteases                  252      75 (28)       1464 (73)        1470 (13)
  Enzymes- others                    2574    1969 (321)         1 (0)         1451 (135)
  GPCRs Class A- others              2647    1268 (366)       962 (309)       1524 (117)
  GPCRs Class B                       226       1 (0)        2340 (115)       1477 (14)
  GPCRs Class C                       339       0 (0)        3346 (146)       1438 (29)
  Hydrolases                          286       0 (0)         350 (42)        1461 (15)
  Ion Channels- Ligand_Gated          764      47 (0)        2109 (280)       1448 (52)
  Ion Channels- others                594     152 (59)        964 (104)       1441 (29)
  Kinases- others                     198       0 (0)        1545 (100)       1430 (11)
  Metalloproteases                    849     279 (74)       1626 (293)       1515 (47)
  Nuclear hormone receptors          1238     482 (163)      2083 (345)       1459 (53)
  Others                             3336    2638 (499)        90 (47)        1465 (167)
  Oxidoreductases                    1385     888 (241)      1437 (329)       1492 (54)
  PDEs                               1178     791 (248)      2176 (392)       1468 (56)
  Peptide GPCRs                      5027    8123 (2811)     2809 (1135)      1461 (236)
  Protein Kinases                    2927    5309 (1423)      341 (147)       1488 (148)
  Serine Proteases                    913     349 (137)       792 (133)       1526 (53)
  Transferases                        727       1 (0)        1012 (125)       1460 (36)
  Total                             29416   29416 (9025)    29416 (5631)     29416 (1447)
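The totals on this slide can be turned into overall accuracies, and a few rows into per-class recall, to compare the three methods directly. The sketch below uses only numbers from the slide; the per-class rows assume that the class labels and the value columns appear in the same order in the slide text, and all names are illustrative.

```python
TEST_SIZE = 29_416
correct_total = {"LDA": 9_025, "Bayes": 5_631, "Random": 1_447}

# Overall accuracy and improvement over random assignment.
accuracy = {method: n / TEST_SIZE for method, n in correct_total.items()}
fold_over_random = {m: accuracy[m] / accuracy["Random"] for m in ("LDA", "Bayes")}

# (experimental count, LDA correct, Bayes correct) for three example classes.
per_class = {
    "Aminergic GPCRs": (3_228, 2_042, 902),
    "Aspartyl Proteases": (728, 613, 614),
    "Kinases- others": (198, 0, 100),
}
recall = {
    name: {"LDA": lda / n, "Bayes": bayes / n}
    for name, (n, lda, bayes) in per_class.items()
}

for method, acc in accuracy.items():
    print(f"{method}: {acc:.1%} correct")
for name, r in recall.items():
    print(f"{name}: LDA recall {r['LDA']:.1%}, Bayes recall {r['Bayes']:.1%}")
```

On these numbers LDA has the higher overall accuracy, but it assigns almost nothing to several small classes, whereas the Bayesian model spreads its predictions across all 20 classes.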
Predicting Gene Class by Physical Properties
Conclusions
 
