Exploiting medicinal chemistry knowledge to accelerate projects October 2020
October 2020
Not for Circulation
Accelerating lead optimisation with Active Learning -
joining MMPA ADMET knowledge with Regression Forest
machine learning models
Dr Alexander G. Dossetter
Managing Director, MedChemica Ltd
Available on Slideshare - search for Dossetter
Twitter @MedChemica
Twitter @covid_moonshot
Twitter #BucketListPapers
https://www.medchemica.com/bucket-list/
Agenda
• Problem statement
• What is Active Learning?
– How can it be applied to LI and LO?
• Generating new ideas with MMPA
– Enumeration with MMPA (RuleDesign™)
• “hit-to-lead” / “AllRules” / 3pairtrans
• Protein class Rule sets
– Permutative-MMPA (Free Wilson ++)
• Getting the best ideas from small data sets
• Regression Forest models for ‘potency’ prediction
– QSAR revisited with transparent descriptors
- Analysis of Error
• Learnings so far
– The system can get ‘stuck’ at the start…
• “It’s like the first 8 moves in chess”
Problem Statement
…8 Years of working with pharma companies
“Our median number of compounds per LO project is 3000 - this is
unsustainable… [it should be] 300”
– Director of Chemistry (large pharma)
“Can we define the text book of medicinal chemistry?”
– Director of Comp Chem (large pharma)
“We are aiming at 300 compounds per project. Currently we are about 400, we will
get better”
– Exscientia scientist at SCI ‘What can Big Data do for Chemistry’
“Can you find us hits [leads] and predict potency on this [brand] new
protein?”
– Many, many people…
MedChemica: using knowledge extraction techniques to build Artificial
Intelligence (AI) systems to reduce the time and cost to critical
compounds and candidate drugs.
Problem Statement
“Can you find us hits [leads] and predict potency on this
[brand] new protein?”
Can we automate Lead compound design?
The algorithm will:
- design compounds and explore SAR
- ‘actively’ select compounds to improve properties
- AND improve the machine learning models
[Diagram: Small amount of data → Matched Molecular Pair Analysis + Explainable QSAR → Awesome leads (pIC50 > 7, good in-vitro PK, SAR, Novelty)]
Augmenting the Medicinal Chemist
• Prioritizes options
• Sets goals
• Makes decisions
• Data is organized and summarized
Augmented Chemists
[Diagram labels: proposals; ideas; RuleDesign™; Permutative MMPA; SpotDesign™; Missing features; Explainable QSAR models; Alerts; Score and store; Make & test]
Augmenting the Chemist: Lessons so far…
Develop AI constructively
• Use methods that can be directly connected to
chemical structures and data
– SpotDesign™, RuleDesign™, Permutative MMPA, Explainable QSAR
• Ensure that all methods are auditable
– See the transformations and underlying data,
see the pharmacophore pairs on molecules
• Automate updates and track metrics
– All systems are automated from the start,
logging is built in
• Integrate automated systems and chemists’ ideas
Principles for Positive Engagement
• Define common goals
• Evaluate with directly observable
data
• Expose conflicting views
• Continuous learning and
improvement
• Place in context
Chemists: AI Is Here; Unite To Get the Benefits.
Griffen, E. J.; Dossetter, A. G.; Leach, A. G. J. Med. Chem. 2020, 63 (16), 8695–8704.
https://doi.org/10.1021/acs.jmedchem.0c00163
Explainable AI for Medicinal Chemistry Design
[Diagram labels: Data Warehouse; Automated loader; Clean Structures & Data; MMPA; rule finder; Exploitable Knowledge; Molecule problem solving; Explainable QSAR; Property Prediction; Idea ranking; Instant SAR analysis; REST API & GUI]
Griffen, E. et al. J. Med. Chem. 2011, 54(22), pp.7739 - 7750.
Leach et al. J. Chem. Inf. Model. 2017, 57, 2424 - 2436
Fully Automated Matched Molecular Pair Analysis (MMPA)
What is this form of Artificial Intelligence?
[Figure: matched molecular pair A → B with Δ Data (A-B); numbers 1 to 4 mark the atom environment levels around the point of change]
• Matched Molecular Pairs – Molecules that differ only by a
particular, well-defined structural transformation
• Capture the change and environment – MMPs can be recorded as
transformations from A → B
• Statistical analysis to define “medicinal chemistry rules”
Defined transformations with high probability of improving
properties of molecules
• Store in a high performance database and provide an intuitive user
interface
Level 4 and higher very
important to P-MMPA
From SAR to MMPA…
(structures A and B not shown)
pSol A (μM)       pSol B (μM)       ∆pSol
-4.3 (48 μM)      -3.2 (700 μM)     1.1
-6.0 (1.0 μM)     -3.7 (178 μM)     2.3
-5.7 (2.0 μM)     -4.1 (82 μM)      1.6
3 pairs +ve Sol, median 1.6
Compounds shown: CHEMBL1949790, CHEMBL1949786; CHEMBL3356658, CHEMBL218767; CHEMBL456322, CHEMBL456802
MCPairs Rule finder required 6 matched pairs for 95% confidence
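As a worked check of the three matched pairs on this slide, the ∆pSol values and their median can be recomputed from the pSol columns. This is only an illustrative sketch; the variable names are invented here, and three pairs is below the six the Rule finder requires.

```python
from statistics import median

# (pSol A, pSol B) for the three matched pairs shown on the slide
pairs = [(-4.3, -3.2), (-6.0, -3.7), (-5.7, -4.1)]

# Change in pSol for the transformation A -> B, rounded to one decimal
deltas = [round(b - a, 1) for a, b in pairs]
median_delta = median(deltas)

print(deltas, median_delta)
```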
The Matched Pairs leading to Rule…..
Actual Rule from MCPairs
Endpoint:
Aqueous Solubility at pH 7.4
[CHEMBL2362975]
n-qual 69
n-qual-up 47
n-qual-down 21
median ∆pSol 0.26
std dev +/- 0.636
Explainable
• Drill back to real world
examples and measured data
Actionable
• Clear decision to make the
compound
Rule selection
• Identify and group matching SMIRKS
• Calculate statistical parameters for each unique SMIRKS (n, median, sd, se, n_up/n_down)
• Is n ≥ 6?
– no: not enough data, ignore transformation
– yes: continue
• Is |median| ≤ 0.05 and the intercentile range (10-90%) ≤ 0.3?
– yes: transformation is classified as ‘neutral’
– no: perform a two-tailed binomial test on the transformation to determine the significance of the up/down frequency
• Binomial test result
– pass: transformation classified as ‘increase’ or ‘decrease’, depending on which direction the property is changing
– fail: transformation classified as ‘NED’ (No Effect Determined)
[Figure: median data difference axis (-ve, 0, +ve) with regions Decrease, Neutral, Increase and NED]
• No assumption of normal
distribution
• Manages ‘censored’ =
qualified / out-of-range data
Leach et al. J. Chem. Inf. Model. 2017, 57, 2424 - 2436
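The rule selection logic on this slide can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the MCPairs implementation: the function names, the 0.05 significance level, the handling of zero changes (ignored in the up/down count) and the quantile method are all assumptions made here. Only the thresholds n ≥ 6, |median| ≤ 0.05 and intercentile range ≤ 0.3 come from the slide.

```python
from math import comb
from statistics import median, quantiles

MIN_PAIRS = 6          # from the slide: n >= 6
NEUTRAL_MEDIAN = 0.05  # from the slide: |median| <= 0.05
NEUTRAL_RANGE = 0.3    # from the slide: 10-90% intercentile range <= 0.3
ALPHA = 0.05           # assumption: significance level for the binomial test


def binomial_two_tailed(n_up, n_down):
    """Two-tailed binomial p-value against a 50:50 up/down split."""
    n = n_up + n_down
    k = max(n_up, n_down)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def classify(deltas):
    """Classify one transformation from its list of property changes."""
    if len(deltas) < MIN_PAIRS:
        return "not enough data"
    deciles = quantiles(deltas, n=10)  # 9 cut points: 10%, ..., 90%
    spread = deciles[-1] - deciles[0]
    if abs(median(deltas)) <= NEUTRAL_MEDIAN and spread <= NEUTRAL_RANGE:
        return "neutral"
    n_up = sum(d > 0 for d in deltas)
    n_down = sum(d < 0 for d in deltas)
    if binomial_two_tailed(n_up, n_down) >= ALPHA:
        return "NED"
    return "increase" if n_up > n_down else "decrease"


# Counts from the solubility Rule shown earlier: 47 up, 21 down
print(binomial_two_tailed(47, 21) < ALPHA)
```

Because the test works on up/down counts rather than on the sizes of the changes, it makes no assumption of a normal distribution, which matches the first bullet on the slide.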
Molecule Problem Solving - RuleDesign™
RuleDesign™ (formerly “Compounds From Rules”)
• Exploitable Knowledge is a Rule database derived from MMPA
• User puts in a problem molecule with a property they wish to improve
o e.g. solubility, metabolism, hERG….
• System generates potential improved molecules based on data
[Diagram: Problem molecule + property to improve → Enumerator System, drawing on Exploitable Knowledge → Solution molecules]
Watch RuleDesign™ on YouTube https://www.youtube.com/watch?v=nQxXddJDTfc
“..it’s like asking 150 of your peers for ideas in just a few seconds”
- Principal Scientist (large pharma)
Looking at the results
• Results sorted by increasing RMM (molecular weight)
• Yellow highlight is the overlap with the input compound
• One column per assay, with colour and direction, e.g. LogD decrease, Sol increase
• Hyperlink to “drill back” to the original data
“Multi-Step” transformations
[Image: Shibuya Crossing, Tokyo, with points labelled A to F]
Would you go in steps via A -> B -> C?
How would you know to go E -> F?
Or go straight there via D, if the data said it was good?
A Turing test for molecular generators
Green, D.; et al. J. Med. Chem. 2020
https://doi.org/10.1021/acs.jmedchem.0c01148
How many pairs? – deeper Goal setting
Specific Goal settings
Non-rules transformations from pair counts
‘All Rules’
– all of the Increase and Decrease Rules for all datasets
– warning: output can be large
– not suitable for an Excel spreadsheet
‘Hit to Lead’
– most frequent transformations chemists perform
‘Min 3 pair Trans’
– all transformations with 3 OR MORE matched pairs
‘Min 6 pair Trans’
– all transformations with 6 OR MORE matched pairs
– actually Increase, Decrease, Neutral and NED
Broad Rule Sets
• “Rules” for increasing
“potency” are gathered by
MMPA
• Individual assay Rules
(numbers in brackets) are
grouped as a “Broad” Goal
• Example Dopamine Rules
number 3548 (screen shot)
• Therefore new hits for a new
Dopamine target can have
these Rules applied [What
worked in the past?]
Permutative MMPA
• Take all compounds in a data set
• Find all matched pairs and extract ΔpIC50 and the transforms between them
• Aggregate transformations with median ΔpIC50 and count of pairs
• Apply all transformations back to the initial data set (at the most specific environment level). NO R GROUP MAPPING REQUIRED!
• Predicted pIC50 = substrate pIC50 + median ΔpIC50
• Remove existing compounds
• Prioritize new compounds by pIC50 estimate
[Diagram: Internal structures & data (M1 to M5) → Extract transforms (t1) → Apply transforms → New structures (M*) & estimated data → Remove existing compounds → Filter and prioritize]
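The Permutative MMPA steps on this slide can be sketched on toy data. This is a deliberately simplified illustration and not the MedChemica implementation: molecules are represented as hypothetical (scaffold, substituent) tuples so the example stays self-contained, whereas the real method fragments structures and needs no R group mapping. Taking the median when several routes reach the same new compound is also an assumption made here.

```python
from collections import defaultdict
from itertools import permutations
from statistics import median


def permutative_mmpa(data):
    """data maps (scaffold, substituent) -> measured pIC50 (toy molecules)."""
    # 1. Find all matched pairs and collect delta pIC50 per transformation.
    deltas = defaultdict(list)
    for (mol_a, p_a), (mol_b, p_b) in permutations(data.items(), 2):
        if mol_a[0] == mol_b[0]:  # same scaffold, different substituent
            deltas[(mol_a[1], mol_b[1])].append(p_b - p_a)

    # 2. Aggregate: median delta pIC50 (pair count is len(deltas[t])).
    rules = {t: median(d) for t, d in deltas.items()}

    # 3. Apply every transformation back to every compound it fits.
    estimates = defaultdict(list)
    for (scaffold, sub), p in data.items():
        for (sub_from, sub_to), med in rules.items():
            product = (scaffold, sub_to)
            # 4. Skip compounds that already exist in the data set.
            if sub == sub_from and product not in data:
                estimates[product].append(p + med)

    # 5. Rank new compounds; median over routes is an assumption here.
    ranked = [(mol, median(vals)) for mol, vals in estimates.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


toy_data = {
    ("S1", "H"): 5.0, ("S1", "F"): 5.6, ("S1", "OMe"): 6.0,
    ("S2", "H"): 5.5, ("S2", "F"): 6.0,
}
print(permutative_mmpa(toy_data))
```

In this toy set the only compound not yet made is the OMe analogue on scaffold S2, reached both from the H and the F compound.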
Exploit Own or Patent Data
[Diagram: transforms are extracted from external patents & data or from internal structures & data, applied to the internal structures to give new structures & estimated data, existing compounds are removed, and the results are filtered and prioritized]
Client Oncology PPI project example
• 386 patent compounds analyzed
• 6024 pair relationships found (39%, a good number of MMPs)
• Permutative MMPA process:
– Apply to own series
– Then filter: remove undesirable substructures, estimated potency >= 6.5, clogP <= 2.5
• 52 suggestions
Measurement =
p(TR-FRET nucleotide exchange assay pIC50) or
estimated pIC50 from seed value + ΔpIC50
Explainable
• Visible, original real world compounds and
measurement
Actionable
• Prioritises ‘realistic’ next step compounds.
[Plot: PPI pIC50 vs cLogP; molecule suggestions marked yes / no]
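The filter step of this example (no undesirable substructure, estimated potency >= 6.5, clogP <= 2.5) might be expressed as below. The `Suggestion` class, its field names and the example values are hypothetical and used only for illustration; the substructure flag is assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    name: str
    est_pic50: float                    # seed pIC50 + median delta pIC50
    clogp: float
    has_undesirable_substructure: bool  # assumed to be precomputed


def passes_filters(s, min_pic50=6.5, max_clogp=2.5):
    """Apply the three filters listed on the slide."""
    return (not s.has_undesirable_substructure
            and s.est_pic50 >= min_pic50
            and s.clogp <= max_clogp)


suggestions = [
    Suggestion("idea-1", 6.8, 2.1, False),
    Suggestion("idea-2", 7.2, 3.4, False),  # too lipophilic
    Suggestion("idea-3", 6.1, 1.5, False),  # not potent enough
    Suggestion("idea-4", 6.9, 2.0, True),   # flagged substructure
]
kept = [s.name for s in suggestions if passes_filters(s)]
print(kept)
```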
Regression Forest Models
• Features are acid, base, hydrogen bond
donor, acceptor, hydrophobe, aromatic
attachment, aliphatic attachment and
halogen. Definitions are highly engineered
[SMARTS]
• Feature 1 – topological dist - Feature 2
• Engineered for chemical relevance –
features can be superimposed or directly
linked, e.g. enables a group to be both a
hydrogen bond acceptor and a base
• A bit identifies a pharmacophore pair
e.g. : Aromatic - 3 bonds - Base
• Used as unfolded 360 bit fingerprints
• Regression Forest as ML method
• Build models with 10 fold CV – report
CV-Pearson’s R2 and CV RMSE
• Build RF error model to generate
predicted error for each compound
using the same descriptors
MedChemica Advanced Pharmacophore Pairs
Feature: Definition
Basic Group: atom or group most likely protonated at pH 7.4
Acidic Group: atom or group most likely deprotonated at pH 7.4; includes N and C acids
Acceptor: definitions derived from Taylor, Cosgrove et al.
Donor: definitions derived from Taylor, Cosgrove et al.
Hydrophobic: C4 or greater cyclic or acyclic alkyl group
Aromatic Attachment: connection of any group to an aromatic atom, excluding connections within rings
Aliphatic Attachment: connection of any atom to an aliphatic group not in a ring
Halo: F, Cl, Br, I
Reference for donor and acceptor feature definitions:
Taylor, R.; Cole, J. C.; Cosgrove, D. A.; Gardiner, E. J.; Gillet, V. J.; Korb, O. J Comput Aided Mol Des 2012, 26 (4), 451–472.
Acid and base definitions are SMARTS (MedChemica definitions), including C, N and heteroaromatic acids; bases exclude weak aniline bases and include amidines and guanidines.
Gobbi, A.; Poppinger, D. Biotechnology and Bioengineering 1998, 61 (1), 47–54.
Reutlinger, M.; Koch, C. P.; Reker, D.; Todoroff, N.; Schneider, P.; Rodrigues, T.; Schneider, G. Mol. Inf. 2013, 32 (2), 133–138.
Regression Forest & Pharmacophore understanding
• hERG – auditable models
• Identify important chemical features driving potency
• Predict hERG potency from RF model [10 fold CV]
Pharmacophore fp length 280
10 fold CV
Compounds in training 6196
RMSE 0.37
CV R2 0.51
Examples of exact Pharmacophore Pairs
HBA-same_group-Base HBA-1_atom-HBD Base-2_atom-Ar
Topological distances are precisely specified and can be exactly visualized on the
molecules – no ambiguity over which features are correlated with activity
Critically – enables interrogation and validation of SAR understanding
Record as an unfolded fingerprint of 360 bits, 1 or 0 for presence or absence of a
feature-distance-feature pair
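An unfolded feature-distance-feature fingerprint of this kind can be sketched without a cheminformatics toolkit. The layout below is an assumption, not the MedChemica definition: with the eight feature types there are 36 unordered feature pairs, and pairing them with ten topological distance bins (0 to 9 bonds) happens to give 360 bits. The molecule is passed as a hypothetical adjacency dictionary with per-atom feature sets, where a real system would assign features by SMARTS matching.

```python
from collections import deque
from itertools import combinations_with_replacement

FEATURES = ["acid", "base", "donor", "acceptor", "hydrophobe",
            "aromatic_attachment", "aliphatic_attachment", "halogen"]
MAX_DIST = 9  # assumption: distance bins 0..9 bonds
PAIRS = list(combinations_with_replacement(FEATURES, 2))  # 36 pairs
BIT = {(a, b, d): i * (MAX_DIST + 1) + d
       for i, (a, b) in enumerate(PAIRS) for d in range(MAX_DIST + 1)}
ORDER = {f: i for i, f in enumerate(FEATURES)}


def bond_distances(adj, start):
    """Shortest path length in bonds from start to every reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def fingerprint(adj, atom_features):
    """Return a 0/1 list with one bit per feature-distance-feature triple."""
    bits = [0] * len(BIT)
    for a, feats_a in atom_features.items():
        dist = bond_distances(adj, a)
        for b, feats_b in atom_features.items():
            if b < a or b not in dist or dist[b] > MAX_DIST:
                continue
            for fa in feats_a:
                for fb in feats_b:
                    if a == b and fa == fb:
                        continue  # do not pair a feature with itself
                    pair = tuple(sorted((fa, fb), key=ORDER.get))
                    bits[BIT[pair + (dist[b],)]] = 1
    return bits


# Toy chain 0-1-2-3: aromatic attachment at atom 0, and atom 3 is both
# a base and an acceptor (superimposed features, distance 0).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = {0: {"aromatic_attachment"}, 3: {"base", "acceptor"}}
fp = fingerprint(adj, features)
print(len(fp), sum(fp))
```

Because every bit maps back to exactly one key in `BIT`, a set bit can be reported to the chemist as a named pair, for example Aromatic - 3 bonds - Base, which is the point the slide makes about unfolded fingerprints.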
Regression Forest & Pharmacophore understanding
• hERG – auditable models
• Predict hERG potency from RF model [10 fold CV]
• Example CHEMBL12713 sertindole
• Colour structure by feature importance weighted sum of pharmacophore pair fingerprints – show the chemists where the hotspots are.
• Drill deeper to show the most important positive and negative features.
RF prediction pIC50 7.8
Positive feature: median_with 5.1, median_without 4.7, median_diff 0.4, n_examples_with 4585, n_examples_without 1383
Negative feature: median_with 5.1, median_without 5.3, median_diff -0.2, n_examples_with 3106, n_examples_without 2862
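The per-feature statistics on this slide (median_with, median_without, median_diff and the two counts) can be computed directly from the training fingerprints and measured values. The sketch below assumes fingerprints are 0/1 lists; the function name and the toy data are illustrative only.

```python
from statistics import median


def bit_summary(fingerprints, pic50s, bit):
    """Compare measured pIC50 for compounds with and without one bit set."""
    with_bit = [p for fp, p in zip(fingerprints, pic50s) if fp[bit]]
    without_bit = [p for fp, p in zip(fingerprints, pic50s) if not fp[bit]]
    # median() raises StatisticsError if either group is empty
    return {
        "median_with": median(with_bit),
        "median_without": median(without_bit),
        "median_diff": median(with_bit) - median(without_bit),
        "n_examples_with": len(with_bit),
        "n_examples_without": len(without_bit),
    }


toy_fps = [[1, 0], [1, 1], [0, 1], [0, 0]]
toy_pic50 = [6.0, 5.0, 4.0, 5.0]
summary = bit_summary(toy_fps, toy_pic50, bit=0)
print(summary)
```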
Explainable – chemists can see the parts of the molecule that count
Explainable
• Highlighted features show the chemist the contribution to the
prediction
Actionable
• Which parts should be optimized to achieve the Goal
Explainable
• Nearest Neighbours show original data on which model is built
Actionable
• What weight do I put on this result? How likely is it? Do we test?
RF and kNN are good but……
• The models are good but could be great or even superb..
• Analysis of error identifies the exact “functional groups” that are less accurately
predicted
• A feedback loop could design compounds to improve models → testing
• “Either not enough or the wrong sort of data – the downfall of AI in Life Science?” – Dossetter, A.G.
https://www.linkedin.com/pulse/either-enough-wrong-sort-data-downfall-ai-life-al-dossetter/
Using the model RMSE to estimate error: 78% of measured values are in the range prediction +/- RMSE
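The coverage figure quoted here (the share of measured values falling within prediction +/- RMSE) is simple to reproduce for any set of predictions. The sketch below uses invented toy values and an invented function name; for reference, zero-mean normally distributed errors would give roughly 68% coverage at one RMSE.

```python
from math import sqrt


def rmse_coverage(measured, predicted):
    """Return (RMSE, fraction of compounds with |error| <= RMSE)."""
    residuals = [m - p for m, p in zip(measured, predicted)]
    rmse = sqrt(sum(r * r for r in residuals) / len(residuals))
    within = sum(abs(r) <= rmse for r in residuals)
    return rmse, within / len(residuals)


measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.1, 5.9, 7.0, 9.0]
rmse, coverage = rmse_coverage(measured, predicted)
print(round(rmse, 3), coverage)
```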
Overview
Generate virtual compounds from MCPairs MMPA
• Hit-to-Lead transformations – the most used medicinal chemistry
• ADMET transformations for metabolism and solubility
• Target class transformations learning from target analogues
• E.g. Dopamine Rule
Regression forest models
• Accurate pharmacophore features with topological distance
• Unfolded fingerprints connect feature importance to
pharmacophores
• Error models give accuracy of prediction for each compound
Active Learning
• Explore Strategy - predicted high potency, high error
• Exploit Strategy - predicted high potency, low error
Active Learning
[Flowchart: Hits → Build model with error estimates → Enumerate → Select for Explore and Exploit → Synthesise & Test → Compounds with data → Compounds meet criteria? (Yes / No)]
STRATEGIES
Explore: prioritize high error
Exploit : prioritize high potency & low error
Ratio of explore to exploit varies with stage
Select enumeration strategy by stage: Hit-to-Lead, target class, solubility, metabolism
For in silico simulation, match to known and measured compounds
System operational
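One way to turn the two strategies into a selection step is sketched below. It is an assumption-laden illustration rather than the system described: explore picks are ranked by predicted error alone, exploit picks by predicted pIC50 minus predicted error (one of several possible ways to combine "high potency & low error"), and the explore fraction is a free parameter that would change with project stage.

```python
def select_batch(candidates, batch_size, explore_fraction):
    """candidates: list of (name, predicted_pIC50, predicted_error)."""
    n_explore = round(batch_size * explore_fraction)

    # Explore: prioritize high predicted error.
    by_error = sorted(candidates, key=lambda c: c[2], reverse=True)
    explore = by_error[:n_explore]

    # Exploit: prioritize high potency and low error among the rest.
    chosen = {c[0] for c in explore}
    remaining = [c for c in candidates if c[0] not in chosen]
    by_score = sorted(remaining, key=lambda c: c[1] - c[2], reverse=True)
    exploit = by_score[:batch_size - n_explore]
    return explore, exploit


candidates = [
    ("a", 7.0, 0.2), ("b", 6.5, 0.9), ("c", 5.5, 1.2),
    ("d", 7.2, 0.6), ("e", 6.8, 0.1),
]
explore, exploit = select_batch(candidates, batch_size=4, explore_fraction=0.5)
print([c[0] for c in explore], [c[0] for c in exploit])
```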
Active Learning – V1
Challenges:
• How to get started when you only have a few compounds to build a model from
• Limited synthesis resource
D2 Case study
• Start with 30 literature compounds: 5 <= pIC50 <= 6, -1 < AlogP < 3.5, selected by LLE sort (literature contains 5200 compounds)
• Build RF model CV-R2 -0.26, small data set
• Enumerate from all compounds:
• What is the best enumeration strategy?
– How to pick the (few) compounds to make from the enumerated set?
– Enumeration is a success if we match literature compounds (very stringent test)
– Have we learnt all that the initial set of compounds can teach us?
Round 1…
Strategy (MMPA)              Compounds generated   Matches to D2 known set   Max pIC50 (actual)   Max pIC50 (predicted [error])
Hit-to-Lead                  682                   10                        7.8                  5.5 [0.21]
Dopamine class               469                   8                         7.9                  5.5 [0.23]
Solubility                   10148                 10                        7.8                  5.5 [0.21]
Metabolism                   12729                 19                        7.9                  5.5 [0.21]
Permutative MMPA (env = 4)   5                     3                         7.9                  6.1 [?]
[Plot: D2 pIC50 vs cLogP]
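The choice of the 30 starting compounds can be sketched as a window filter followed by a sort on lipophilic ligand efficiency, LLE = pIC50 - logP. The slide does not say in which direction the LLE sort was applied; taking the highest LLE first is an assumption here, as are the function name and the toy compounds.

```python
def select_seed_set(compounds, n=30):
    """compounds: list of (name, pIC50, AlogP). Returns the top n by LLE."""
    in_window = [c for c in compounds
                 if 5 <= c[1] <= 6 and -1 < c[2] < 3.5]
    # LLE = pIC50 - logP; highest first is an assumption
    return sorted(in_window, key=lambda c: c[1] - c[2], reverse=True)[:n]


toy_literature = [
    ("a", 5.5, 1.0),   # LLE 4.5
    ("b", 5.9, 3.0),   # LLE 2.9
    ("c", 6.5, 0.5),   # pIC50 outside the window
    ("d", 5.2, -0.5),  # LLE 5.7
    ("e", 5.8, 4.0),   # AlogP outside the window
]
seed = select_seed_set(toy_literature, n=2)
print([c[0] for c in seed])
```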
D2 worked example – The p-MMPA
Predicted: pIC50 6.1, actual pIC50 7.9
Finding all the MMP SAR that is present and
applying it exhaustively including behind the
Pareto frontier.
[Plot: D2 pIC50 vs cLogP]
Active Learning v2
System under development
[Flowchart: Hits → Compounds with data → P-MMPA (under development) → Compounds with data → Build model with error estimates → Enumerate (by target class, solubility, metabolism) → Select for Explore and Exploit → Synthesise & Test → Compounds with data → Compounds meet criteria? (Yes / No)]
Explore: prioritize high error
Exploit: prioritize high potency & low error
Ratio of explore to exploit varies with stage
Need initial “induction phase” before cyclic
automated active learning can be applied
Like the opening in a chess game
• “The first moves of a chess game are
termed the "opening" or "opening
moves". A good opening will provide
better protection of the King, control
over an area of the board (particularly
the centre), greater mobility for pieces,
and possibly opportunities to capture
opposing pawns and pieces.” A Beginner's
Garden of Chess Openings - David A. Wheeler
• Success or failure of an automated active learning system could be like the first few moves of a chess game: they shape the game…
• Will it always need a human intervention (or ten…)?
[Image caption: …set up for either Queen’s Gambit, King’s Indian Defense, Nimzo-Indian, Bogo-Indian, Queen’s Indian Defense, and Dutch Defense.]
Learning from First Experiments….
• MMPA and RF work together to suggest and rank compound designs
• Strategies explored
– Explore: prioritize high error
– Exploit : prioritize high potency & low error
• Ratio of explore / exploit varies with stage
• The initial phase from a small number of hits is a challenge
– Hit-to-Lead / ADMET Rules did not match compounds in literature
– Victims of what is published
– Requires full datasets
– Process can get “stuck”
• Human intervention may always be required
• Both MMPA and RF can select compounds to make to improve models –
analysis of error.
• Permutative-MMPA works very well (of course)
• Where AI could help is as a compound selector, depending on strategy
• Dr Alexander G. Dossetter
• Managing Director, MedChemica Ltd
• al.dossetter@medchemica.com
• MedChemica
• Lauren Reid
• Jessica Stacey
• Phil De. Sousa
• Shane Montague
• Edward J. Griffen
• Andrew G. Leach
• Available on Slideshare - search for Dossetter
• Twitter @MedChemica
• Twitter #BucketListPapers
• https://www.medchemica.com/bucket-list/
Thank you
Not for Circulation
About MedChemica
>10 years’ experience in building A.I. systems for drug discovery
• Founded in 2012 by AZ AP Medicinal / Computational chemists
to accelerate drug hunting by exploiting data driven knowledge
• Domain leaders in SAR knowledge extraction and knowledge
based design
• > 11 years experience of building AI systems that suggest
actions to chemists (7 years as MedChemica)
• Creators of largest ever documented database of medicinal
chemistry ADMET knowledge
MedChemica Publications
AI Software Platforms
– Complete In-house platform
– Analysis of own data and automated
updating
– Design tool access to all chemists
– Custom fitting (Software-as-a-Service)
One stop GUI
Design tool
Biotech, Universities and
Foundations
Medium to large pharma,
agrochemical and materials
research
– Secure web-based AI design platform
– CHEMBL, Patent data analysed
– Merged into one knowledgebase
Science As A Service (SaaS)
Stages: Target ID → Hit Screening → Lead Identification → Lead Optimisation → Pre-Clinical
Bespoke Advanced Analytics and Computational Chemistry services throughout the research phase:
• AI H2L design sets
• Compound design to solve ADMET and potency issues
• Third party compound assessment
• Directed virtual screening for hit matter
• Library design for novel protein targets
• AI Toxophore assessment
• Patent analysis
• Pharmacophore profiling
• Generating IP for clients [Scaffold hops]
• Collection evaluation and enhancement
October 2020
Not for Circulation
Panel Discussion:
What should the Medicinal Chemistry Discipline be
like in 10 years?
Slideshare - search for Dossetter
Twitter @MedChemica
Twitter @covid_moonshot
Twitter #BucketListPapers
https://www.medchemica.com/bucket-list/

Más contenido relacionado

La actualidad más candente

P. Joshi SBDD and docking (1).ppt
P. Joshi SBDD and docking (1).pptP. Joshi SBDD and docking (1).ppt
P. Joshi SBDD and docking (1).pptpranalpatilPranal
 
Pharmacogenomics, by kk sahu
Pharmacogenomics, by kk sahuPharmacogenomics, by kk sahu
Pharmacogenomics, by kk sahuKAUSHAL SAHU
 
PRINCIPLES OF DRUG DISCOVERY & DEVELOPMENT.pptx
PRINCIPLES OF DRUG DISCOVERY & DEVELOPMENT.pptxPRINCIPLES OF DRUG DISCOVERY & DEVELOPMENT.pptx
PRINCIPLES OF DRUG DISCOVERY & DEVELOPMENT.pptxDharaMehta45
 
Computational Drug Design
Computational Drug DesignComputational Drug Design
Computational Drug Designbaoilleach
 
How to analyse bulk transcriptomic data using Deseq2
How to analyse bulk transcriptomic data using Deseq2How to analyse bulk transcriptomic data using Deseq2
How to analyse bulk transcriptomic data using Deseq2AdamCribbs1
 
Xml in bio medical field
Xml in bio medical fieldXml in bio medical field
Xml in bio medical fieldJuman Ghazi
 
Mapping metabolites against pathway databases
Mapping metabolites against pathway databases Mapping metabolites against pathway databases
Mapping metabolites against pathway databases Dinesh Barupal
 
Structure base drug design
Structure base drug designStructure base drug design
Structure base drug designJayshreeUpadhyay
 
ADMET-Predictor-Webinar_AO-AM-final.pdf
ADMET-Predictor-Webinar_AO-AM-final.pdfADMET-Predictor-Webinar_AO-AM-final.pdf
ADMET-Predictor-Webinar_AO-AM-final.pdfsweed5
 
Lead development and drug discovery
Lead development and drug discoveryLead development and drug discovery
Lead development and drug discoveryApuMarma
 
Superimposition method- ligand based drug design
Superimposition method- ligand based drug designSuperimposition method- ligand based drug design
Superimposition method- ligand based drug designIshpreet Sachdev
 
consensus superiority of the pharmacophore based alignment, over maximum comm...
consensus superiority of the pharmacophore based alignment, over maximum comm...consensus superiority of the pharmacophore based alignment, over maximum comm...
consensus superiority of the pharmacophore based alignment, over maximum comm...Deepak Rohilla
 
Application of proteomics for identification of abiotic stress tolerance in c...
Application of proteomics for identification of abiotic stress tolerance in c...Application of proteomics for identification of abiotic stress tolerance in c...
Application of proteomics for identification of abiotic stress tolerance in c...Vivek Zinzala
 
statistical tools used in qsar analysis
statistical tools used in qsar analysisstatistical tools used in qsar analysis
statistical tools used in qsar analysisSandeep Sahu
 
Drug discovery
Drug discoveryDrug discovery
Drug discoverySaba Ahmed
 

La actualidad más candente (20)

P. Joshi SBDD and docking (1).ppt
P. Joshi SBDD and docking (1).pptP. Joshi SBDD and docking (1).ppt
P. Joshi SBDD and docking (1).ppt
 
Metabolomics Data Analysis
Metabolomics Data AnalysisMetabolomics Data Analysis
Metabolomics Data Analysis
 
Pharmacogenomics, by kk sahu
Pharmacogenomics, by kk sahuPharmacogenomics, by kk sahu
Pharmacogenomics, by kk sahu
 
PRINCIPLES OF DRUG DISCOVERY & DEVELOPMENT.pptx
PRINCIPLES OF DRUG DISCOVERY & DEVELOPMENT.pptxPRINCIPLES OF DRUG DISCOVERY & DEVELOPMENT.pptx
PRINCIPLES OF DRUG DISCOVERY & DEVELOPMENT.pptx
 
Computational Drug Design
Computational Drug DesignComputational Drug Design
Computational Drug Design
 
How to analyse bulk transcriptomic data using Deseq2
How to analyse bulk transcriptomic data using Deseq2How to analyse bulk transcriptomic data using Deseq2
How to analyse bulk transcriptomic data using Deseq2
 
Xml in bio medical field
Xml in bio medical fieldXml in bio medical field
Xml in bio medical field
 
Mapping metabolites against pathway databases
Mapping metabolites against pathway databases Mapping metabolites against pathway databases
Mapping metabolites against pathway databases
 
Structure base drug design
Structure base drug designStructure base drug design
Structure base drug design
 
ADMET-Predictor-Webinar_AO-AM-final.pdf
ADMET-Predictor-Webinar_AO-AM-final.pdfADMET-Predictor-Webinar_AO-AM-final.pdf
ADMET-Predictor-Webinar_AO-AM-final.pdf
 
Drug Design:Discovery, Development and Delivery
Drug Design:Discovery, Development and DeliveryDrug Design:Discovery, Development and Delivery
Drug Design:Discovery, Development and Delivery
 
Pharmacogenomics
Pharmacogenomics Pharmacogenomics
Pharmacogenomics
 
Lead development and drug discovery
Lead development and drug discoveryLead development and drug discovery
Lead development and drug discovery
 
Superimposition method- ligand based drug design
Superimposition method- ligand based drug designSuperimposition method- ligand based drug design
Superimposition method- ligand based drug design
 
consensus superiority of the pharmacophore based alignment, over maximum comm...
consensus superiority of the pharmacophore based alignment, over maximum comm...consensus superiority of the pharmacophore based alignment, over maximum comm...
consensus superiority of the pharmacophore based alignment, over maximum comm...
 
Cadd
CaddCadd
Cadd
 
Application of proteomics for identification of abiotic stress tolerance in c...
Application of proteomics for identification of abiotic stress tolerance in c...Application of proteomics for identification of abiotic stress tolerance in c...
Application of proteomics for identification of abiotic stress tolerance in c...
 
statistical tools used in qsar analysis
statistical tools used in qsar analysisstatistical tools used in qsar analysis
statistical tools used in qsar analysis
 
Drug discovery
Drug discoveryDrug discovery
Drug discovery
 
Antisense therapy
Antisense therapyAntisense therapy
Antisense therapy
 



MedChemica Active Learning - Combining MMPA and ML

  • 1. Exploiting medicinal chemistry knowledge to accelerate projects. October 2020. Not for Circulation.
      Accelerating lead optimisation with Active Learning - joining MMPA ADMET knowledge with Regression Forest machine learning models
      Dr Alexander G. Dossetter, Managing Director, MedChemica Ltd
      Available on Slideshare - search for Dossetter
      Twitter @MedChemica, @covid_moonshot, #BucketListPapers
      https://www.medchemica.com/bucket-list/
  • 2. Agenda
      • Problem statement
      • What is Active Learning?
        – How can it be applied to LI and LO?
      • Generating new ideas with MMPA
        – Enumeration with MMPA (RuleDesign™)
          • “hit-to-lead” / “AllRules” / 3pairtrans
          • Protein class Rule sets
        – Permutative-MMPA (Free Wilson ++)
          • Getting the best ideas from small data sets
      • Regression Forest models for ‘potency’ prediction
        – QSAR revisited with transparent descriptors
        – Analysis of Error
      • Learnings so far
        – The system can ‘get stuck’ at the start…
          • “It’s like the first 8 moves in chess”
  • 3. Problem Statement: …8 years of working with pharma companies
      “Our median number of compounds per LO project is 3000 - this is unsustainable… [it should be] 300” – Director of Chemistry (large pharma)
      “Can we define the text book of medicinal chemistry?” – Director of Comp Chem (large pharma)
      “We are aiming at 300 compounds per project. Currently we are about 400, we will get better” – Exscientia scientist at SCI ‘What can Big Data do for Chemistry’
      “Can you find us hits [leads] and predict potency on this [brand] new protein?” – Many, many people…
      MedChemica: using knowledge extraction techniques to build Artificial Intelligence (AI) systems to reduce the time and cost to critical compounds and candidate drugs.
  • 4. Problem Statement
      “Can you find us hits [leads] and predict potency on this [brand] new protein?”
      Can we automate Lead compound design? The algorithm will:
        – design compounds and explore SAR
        – ‘actively’ select compounds to improve properties
        – AND improve the machine learning models
      Diagram labels: Small amount of data; Matched Molecular Pair Analysis; Explainable QSAR; Awesome leads (pIC50 > 7, good in-vitro PK, SAR, Novelty)
  • 5. Augmenting the Medicinal Chemist
      Diagram labels: Prioritizes options; Sets goals; Makes Decisions; Data is organized and summarized
  • 6. Augmented Chemists proposals
      Diagram labels: RuleDesign™; Permutative MMPA; Missing features; Explainable QSAR models; Alerts; ideas; Score and store; Make & test; SpotDesign™ (SLIDE 27)
  • 7. Augmenting the Chemist: Lessons so far…
      Develop AI constructively
      • Use methods that can be directly connected to chemical structures and data – SpotDesign™, RuleDesign™, Permutative MMPA, Explainable QSAR
      • Ensure that all methods are auditable – see the transformations and underlying data, see the pharmacophore pairs on molecules
      • Automate updates and track metrics – all systems are automated from the start, logging is built in
      • Integrate automated systems and chemists’ ideas
      Principles for Positive Engagement
      • Define common goals
      • Evaluate with directly observable data
      • Expose conflicting views
      • Continuous learning and improvement
      • Place in context
      Chemists: AI Is Here; Unite To Get the Benefits. Griffen, E. J.; Dossetter, A. G.; Leach, A. G. J. Med. Chem. 2020, 63 (16), 8695–8704. https://doi.org/10.1021/acs.jmedchem.0c00163
  • 8. Explainable AI for Medicinal Chemistry Design
      Diagram labels: Data Warehouse; rule finder; Exploitable Knowledge; Molecule problem solving; Explainable QSAR; Automated loader; MMPA; Clean Structures & Data; Property Prediction; Idea ranking; Instant SAR analysis; REST API & GUI
  • 9. Fully Automated Matched Molecular Pair Analysis (MMPA): What is this form of Artificial Intelligence?
      • Matched Molecular Pairs – molecules that differ only by a particular, well-defined structural transformation
      • Capture the change and environment – MMPs can be recorded as transformations from A → B
      • Statistical analysis to define “medicinal chemistry rules” – defined transformations with high probability of improving properties of molecules
      • Store in a high performance database and provide an intuitive user interface
      [Figure: transformation A → B with Δ Data; atoms labelled 1 to 4]
      Level 4 and higher very important to P-MMPA
      Griffen, E. et al. J. Med. Chem. 2011, 54 (22), 7739–7750. Leach et al. J. Chem. Inf. Model. 2017, 57, 2424–2436.
  • 10. From SAR to MMPA…
      pSol A            pSol B            ΔpSol
      -4.3 (48 μM)      -3.2 (700 μM)     1.1
      -6.0 (1.0 μM)     -3.7 (178 μM)     2.3
      -5.7 (2.0 μM)     -4.1 (82 μM)      1.6
      3 pairs, +ve Sol, median ΔpSol 1.6
      Compounds shown: CHEMBL1949790, CHEMBL1949786, CHEMBL3356658, CHEMBL218767, CHEMBL456322, CHEMBL456802
      MCPairs Rule finder required 6 matched pairs for 95% confidence
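The arithmetic on slide 10 can be checked with a few lines of Python. This is only a sketch: it assumes pSol is log10 of the molar solubility (which is consistent with the paired values on the slide, e.g. 48 μM and -4.3), and the names `pairs_um`, `p_sol`, `deltas` and `median_delta` are illustrative, not from the deck.

```python
from math import log10
from statistics import median

# (solubility of A, solubility of B) in micromolar, as printed on slide 10
pairs_um = [(48, 700), (1.0, 178), (2.0, 82)]

def p_sol(conc_um):
    """log10 of molar solubility, rounded to one decimal as on the slide (assumed definition)."""
    return round(log10(conc_um * 1e-6), 1)

# Change in pSol for each matched pair, going from A to B
deltas = [round(p_sol(b) - p_sol(a), 1) for a, b in pairs_um]
median_delta = median(deltas)

print(deltas, median_delta)
```

The three differences reproduce the ΔpSol column, and their median is the value quoted for the three pairs.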
  • 11. The Matched Pairs leading to Rule…
      Actual Rule from MCPairs. Endpoint: Aqueous Solubility at pH 7.4 [CHEMBL2362975]
      n-qual 69; n-qual-up 47; n-qual-down 21; median ΔpSol 0.26; std dev +/- 0.636
      Explainable • Drill back to real world examples and measured data
      Actionable • Clear decision to make the compound
  • 12. Rule selection
      1. Identify and group matching SMIRKS.
      2. Calculate statistical parameters for each unique SMIRKS (n, median, sd, se, n_up/n_down).
      3. Is n ≥ 6? If no: not enough data, ignore transformation.
      4. Is |median| ≤ 0.05 and the intercentile range (10-90%) ≤ 0.3? If yes: transformation is classified as ‘neutral’.
      5. Otherwise, perform a two-tailed binomial test on the transformation to determine the significance of the up/down frequency. Pass: transformation classified as ‘increase’ or ‘decrease’ depending on which direction the property is changing. Fail: transformation classified as ‘NED’ (No Effect Determined).
      [Figure: median data difference axis from -ve through 0 to +ve, with regions Decrease, Neutral, Increase and NED]
      • No assumption of normal distribution
      • Manages ‘censored’ = qualified / out-of-range data
      Leach et al. J. Chem. Inf. Model. 2017, 57, 2424–2436
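The decision flow on slide 12 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the MCPairs implementation: the significance level `alpha=0.05` is assumed (the slides only mention 95% confidence), direction is taken from the majority of up/down counts, pairs with zero change are left out of the binomial counts, and censored data are not handled. The function names are hypothetical.

```python
from math import comb
from statistics import median, quantiles

def binomial_two_tailed_p(n_up, n_down):
    """Two-tailed binomial p-value for an up/down split under a 50:50 null."""
    n = n_up + n_down
    if n == 0:
        return 1.0
    k = min(n_up, n_down)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def classify(deltas, alpha=0.05):
    """Classify one transformation from the property changes of its matched pairs."""
    if len(deltas) < 6:
        return "ignore"  # not enough data
    med = median(deltas)
    deciles = quantiles(deltas, n=10, method="inclusive")
    spread = deciles[-1] - deciles[0]  # 10th to 90th percentile range
    if abs(med) <= 0.05 and spread <= 0.3:
        return "neutral"
    n_up = sum(d > 0 for d in deltas)
    n_down = sum(d < 0 for d in deltas)  # zero changes count for neither side
    if binomial_two_tailed_p(n_up, n_down) < alpha:
        return "increase" if n_up > n_down else "decrease"
    return "NED"  # No Effect Determined

print(classify([1.1, 2.3, 1.6, 0.9, 1.2, 0.7]))
print(binomial_two_tailed_p(6, 0), binomial_two_tailed_p(5, 0))
```

With this test, six pairs all moving in the same direction give p = 0.03125 while five give p = 0.0625, which fits the statement on slide 10 that six matched pairs are needed for 95% confidence.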
  • 13. Molecule Problem Solving - RuleDesign™
      RuleDesign™ (formerly “Compounds From Rules”)
      • Exploitable Knowledge is a Rule database derived from MMPA
      • User puts in a problem molecule with a property they wish to improve, e.g. solubility, metabolism, hERG…
      • System generates potential improved molecules based on data
      Diagram labels: Exploitable Knowledge; Enumerator System; Problem molecule + property to improve; Solution molecules
      Watch RuleDesign™ on YouTube: https://www.youtube.com/watch?v=nQxXddJDTfc
      “..it’s like asking 150 of your peers for ideas in just a few seconds” - Principal Scientist (large pharma)
  • 14. Looking at the results
      • Results sorted in increasing RMM (Mol Weight)
      • Yellow highlight is the overlap with the input compound
      • One column per assay – colour and direction, e.g. LogD decrease, Sol increase
      • Hyperlink to “Drill back” to the original data
  • 15. “Multi-Step” transformations
      [Image: Shibuya Crossing, Tokyo, with points labelled A to F]
      Would you go in steps via A -> B -> C? How would you know to go E -> F? Or go straight there via D, if the data said it was good?
      A Turing test for molecular generators. Green, D.; et al. J. Med. Chem. 2020. https://doi.org/10.1021/acs.jmedchem.0c01148
  • 16. How many pairs? – deeper Goal setting
      Specific Goal settings:
        ‘All Rules’ – all of the Increase and Decrease Rules for all datasets (warning: output can be large and is not suitable for an Excel spreadsheet)
        ‘Hit to Lead’ – most frequent transformations chemists perform
      Non-rules transformations from pair counts (actually Increase, Decrease, Neutral and NED):
        ‘Min 3 pair Trans’ – all transformations with 3 OR MORE matched pairs
        ‘Min 6 pair Trans’ – all transformations with 6 OR MORE matched pairs
  • 17. Broad Rule Sets
      • “Rules” for increasing “potency” are gathered by MMPA
      • Individual assay Rules (numbers in brackets) are grouped as a “Broad” Goal
      • Example: Dopamine Rules number 3548 (screen shot)
      • Therefore new hits for a new Dopamine target can have these Rules applied [What worked in the past?]
  • 18. Permutative MMPA
      • Take all compounds in a data set
      • Find all matched pairs & extract ΔpIC50 and the transforms between them
      • Aggregate transformations with median ΔpIC50 and count of pairs
      • Apply all transformations back to the initial data set (at the most specific environment level). NO R GROUP MAPPING REQUIRED!
      • Predicted pIC50 = substrate pIC50 + median ΔpIC50
      • Remove existing compounds
      • Prioritize new compounds by pIC50 estimate
      Diagram labels: Internal Structures & data (M1 to M5); Extract transforms (t1); Apply transforms; Remove existing compounds; Filter and prioritize; New structures & estimated data (M*)
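The Permutative MMPA steps on slide 18 can be sketched on toy data. A real implementation would fragment chemical structures to find matched pairs; here, purely as a simplifying assumption, each compound is already written as a hypothetical `(context, fragment)` tuple, so two compounds form a matched pair when they share a context. All names and pIC50 values below are invented for illustration.

```python
from collections import defaultdict
from itertools import permutations
from statistics import median

def permutative_mmpa(data):
    """data maps (context, fragment) -> measured pIC50; returns ranked suggestions."""
    by_context = defaultdict(dict)
    for (context, fragment), pic50 in data.items():
        by_context[context][fragment] = pic50

    # Find all matched pairs and record the pIC50 change for each transform a -> b
    deltas = defaultdict(list)
    for fragments in by_context.values():
        for a, b in permutations(fragments, 2):
            deltas[(a, b)].append(fragments[b] - fragments[a])

    # Aggregate each transform to (median change, number of pairs)
    rules = {t: (median(d), len(d)) for t, d in deltas.items()}

    # Apply every transform back to every compound, skipping compounds that already exist
    suggestions = []
    for (context, fragment), pic50 in data.items():
        for (a, b), (med, n_pairs) in rules.items():
            if a != fragment or (context, b) in data:
                continue
            suggestions.append(((context, b), round(pic50 + med, 2), n_pairs))

    # Prioritize by estimated pIC50, highest first
    return sorted(suggestions, key=lambda s: s[1], reverse=True)

# Made-up example data
data = {
    ("coreA", "H"): 5.0, ("coreA", "F"): 5.6, ("coreA", "OMe"): 6.1,
    ("coreB", "H"): 6.0, ("coreB", "F"): 6.4,
    ("coreC", "H"): 7.0,
}
result = permutative_mmpa(data)
for compound, estimate, n_pairs in result:
    print(compound, estimate, n_pairs)
```

Note that the same new compound can be reached from more than one starting compound and so can appear with more than one estimate; the sketch keeps each estimate separately rather than choosing a way to merge them.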
  • 19. Exploit Own or Patent Data
      [Workflow diagram, two branches. External patents & data: Extract transforms -> Apply transforms -> Filter and prioritize. Internal structures & data: Extract transforms -> Apply transforms -> Remove existing compounds -> New structures & estimated data -> Filter and prioritize]
  • 20. Client Oncology PPI project example
      • 386 patent compounds analyzed
      • 6024 pair relationships found (39%, a good number of MMPs)
      • Permutative MMPA process:
        – Apply to own series
        – Then filter: remove undesirable substructures; estimated potency >= 6.5; clogP <= 2.5
      • 52 suggestions
      Measurement = pIC50 from the TR-FRET nucleotide exchange assay, or estimated pIC50 from seed value + ΔpIC50
      Explainable
      • Visible, original real-world compounds and measurements
      Actionable
      • Prioritises ‘realistic’ next-step compounds
      [Plot: PPI pIC50 against cLogP; legend: molecule suggestions yes / no]
  • 21. Regression Forest Models
      • Features are acid, base, hydrogen bond donor, acceptor, hydrophobe, aromatic attachment, aliphatic attachment and halogen. Definitions are highly engineered [SMARTS]
      • Descriptor form: Feature 1 – topological distance – Feature 2
      • Engineered for chemical relevance: features can be superimposed or directly linked, which e.g. enables a group to be both a hydrogen bond acceptor and a base
      • A bit identifies a pharmacophore pair, e.g. Aromatic – 3 bonds – Base
      • Used as unfolded 360-bit fingerprints
      • Regression Forest as the ML method
      • Build models with 10-fold CV; report CV Pearson’s R2 and CV RMSE
      • Build an RF error model, using the same descriptors, to generate a predicted error for each compound
  • 22. MedChemica Advanced Pharmacophore Pairs
      | Feature | Definition |
      |---|---|
      | Basic Group | Atom or group most likely protonated at pH 7.4 |
      | Acidic Group | Atom or group most likely deprotonated at pH 7.4; includes N and C acids |
      | Acceptor | Definitions derived from Taylor, Cosgrove et al. |
      | Donor | Definitions derived from Taylor, Cosgrove et al. |
      | Hydrophobic | C4 or greater cyclic or acyclic alkyl group |
      | Aromatic Attachment | Connection of any group to an aromatic atom, excluding connections within rings |
      | Aliphatic Attachment | Connection of any atom to an aliphatic group not in a ring |
      | Halo | F, Cl, Br, I |
      Reference for donor and acceptor feature definitions: Taylor, R.; Cole, J. C.; Cosgrove, D. A.; Gardiner, E. J.; Gillet, V. J.; Korb, O. J. Comput. Aided Mol. Des. 2012, 26 (4), 451–472.
      Acid & Base definitions are SMARTS (MedChemica definitions): acids include C, N and heteroaromatic acids; bases exclude weak aniline bases and include amidines and guanidines.
      Gobbi, A.; Poppinger, D. Biotechnology and Bioengineering 1998, 61 (1), 47–54.
      Reutlinger, M.; Koch, C. P.; Reker, D.; Todoroff, N.; Schneider, P.; Rodrigues, T.; Schneider, G. Mol. Inf. 2013, 32 (2), 133–138.
  • 23. Regression Forest & Pharmacophore understanding
      • hERG – auditable models
      • Identify important chemical features driving potency
      • Predict hERG potency from RF model [10-fold CV]
      Model statistics (10-fold CV): pharmacophore fp length 280; compounds in training 6196; RMSE 0.37; CV R2 0.51
  • 24. Examples of exact Pharmacophore Pairs
      • HBA-same_group-Base
      • HBA-1_atom-HBD
      • Base-2_atom-Ar
      Topological distances are precisely specified and can be exactly visualized on the molecules, so there is no ambiguity over which features are correlated with activity
      Critically, this enables interrogation and validation of SAR understanding
      Recorded as an unfolded fingerprint of 360 bits: 1 or 0 for presence or absence of a feature-distance-feature pair
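An unfolded feature-distance-feature fingerprint of this kind can be sketched as follows. The deck does not say how the 360 bits are composed; the layout here (8 features giving 36 unordered pairs, at bond distances 0 to 9) is an assumption that happens to give 360. The feature assignment is also assumed to be done already: `atom_features` is a hypothetical input standing in for the SMARTS matching, and the molecule is a plain adjacency dictionary, not a real cheminformatics object.

```python
from collections import deque
from itertools import combinations_with_replacement

FEATURES = ["acid", "base", "donor", "acceptor", "hydrophobe",
            "aromatic_attachment", "aliphatic_attachment", "halogen"]
MAX_DIST = 9  # assumption: topological distances of 0..9 bonds are recorded

# One named bit per unordered feature pair and distance (36 x 10 = 360 bits).
BIT_INDEX = {}
for f1, f2 in combinations_with_replacement(FEATURES, 2):
    for d in range(MAX_DIST + 1):
        BIT_INDEX[(f1, d, f2)] = len(BIT_INDEX)


def bond_distances(adjacency, start):
    """Breadth-first search: number of bonds from `start` to each reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for nbr in adjacency[atom]:
            if nbr not in dist:
                dist[nbr] = dist[atom] + 1
                queue.append(nbr)
    return dist


def pharmacophore_pair_fp(adjacency, atom_features):
    """Return a 0/1 list; atom_features maps atom index -> set of feature names."""
    rank = {f: i for i, f in enumerate(FEATURES)}
    bits = [0] * len(BIT_INDEX)
    for a, feats_a in atom_features.items():
        dist = bond_distances(adjacency, a)
        for b, feats_b in atom_features.items():
            d = dist.get(b)
            if d is None or d > MAX_DIST:
                continue
            for fa in feats_a:
                for fb in feats_b:
                    if a == b and fa == fb:
                        continue  # a feature does not pair with itself
                    f1, f2 = sorted((fa, fb), key=rank.__getitem__)
                    bits[BIT_INDEX[(f1, d, f2)]] = 1
    return bits


# Toy four-atom chain: an aromatic attachment three bonds from a basic acceptor.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = {0: {"aromatic_attachment"}, 3: {"base", "acceptor"}}
fp = pharmacophore_pair_fp(chain, features)
print(len(fp), sum(fp))
```

Because the fingerprint is unfolded, every set bit maps back to one named pair such as ("base", 3, "aromatic_attachment"), which is what makes the model auditable. A group that is both acceptor and base sets a distance-0 bit, in the spirit of the HBA-same_group-Base example.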
  • 25. Regression Forest & Pharmacophore understanding
      • hERG – auditable models
      • Predict hERG potency from RF model [10-fold CV]
      • Example: CHEMBL12713, sertindole. RF prediction pIC50 7.8
      • Colour the structure by the feature-importance-weighted sum of pharmacophore pair fingerprints, to show the chemists where the hotspots are
      • Drill deeper to show the most important positive and negative features:
        – Positive feature: median_with: 5.1, median_without: 4.7, median_diff: 0.4, n_examples_with: 4585, n_examples_without: 1383
        – Negative feature: median_with: 5.1, median_without: 5.3, median_diff: -0.2, n_examples_with: 3106, n_examples_without: 2862
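The per-feature statistics shown on this slide (median_with, median_without, median_diff and the two counts) can be reproduced for any fingerprint bit with a simple split of the training set. The sketch below assumes binary fingerprints stored as lists and uses invented toy values; the function name `bit_effect` is hypothetical.

```python
from statistics import median


def bit_effect(fingerprints, pic50s, bit):
    """Compare measured pIC50 for compounds with and without one fingerprint bit."""
    with_bit = [y for fp, y in zip(fingerprints, pic50s) if fp[bit]]
    without = [y for fp, y in zip(fingerprints, pic50s) if not fp[bit]]
    if not with_bit or not without:
        return None  # the bit is constant in this data set, nothing to compare
    m_with, m_without = median(with_bit), median(without)
    return {
        "median_with": m_with,
        "median_without": m_without,
        "median_diff": round(m_with - m_without, 2),
        "n_examples_with": len(with_bit),
        "n_examples_without": len(without),
    }


# Toy data: five compounds, two fingerprint bits.
fps = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0]]
y = [5.0, 6.0, 4.5, 4.0, 5.5]
print(bit_effect(fps, y, bit=0))
```

Reporting the counts next to the medians matters: a large median_diff backed by a handful of examples deserves less weight than a small one backed by thousands.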
  • 26. Explainable – chemists can see the parts of the molecule that count
      Explainable
      • Highlighted features show the chemist the contribution to the prediction
      Actionable
      • Which parts should be optimized to achieve the Goal?
      Explainable
      • Nearest Neighbours show the original data on which the model is built
      Actionable
      • What weight do I put on this result? How likely is it? Do we test?
  • 27. RF and kNN are good but…
      • The models are good but could be great or even superb
      • Analysis of error identifies the exact “functional groups” that are less accurately predicted
      • A feedback loop could design compounds to improve models -> testing
      • “Either not enough or the wrong sort of data – the downfall of AI in Life Science?” – Dossetter, A.G. https://www.linkedin.com/pulse/either-enough-wrong-sort-data-downfall-ai-life-al-dossetter/
      Using the model RMSE to estimate error: 78% of measured values are in the range prediction +/- RMSE
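The “78% in range prediction +/- RMSE” figure is a coverage check, and it can be computed as below. This is a generic sketch with invented numbers, not the data behind the slide; `rmse` and `fraction_within` are hypothetical helper names. As a point of comparison, zero-mean normally distributed errors would put roughly 68% of values within one RMSE.

```python
from math import sqrt


def rmse(measured, predicted):
    """Root mean squared error between measured and predicted values."""
    return sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured))


def fraction_within(measured, predicted, half_width):
    """Fraction of measured values lying in prediction +/- half_width."""
    hits = sum(abs(m - p) <= half_width for m, p in zip(measured, predicted))
    return hits / len(measured)


# Toy values: three close predictions and one large miss.
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.1, 6.1, 6.9, 7.0]
err = rmse(measured, predicted)
print(round(err, 3), fraction_within(measured, predicted, err))
```

A coverage well above 68% usually means a few large misses dominate the RMSE, which is the situation the per-compound error model is meant to expose.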
  • 28. Overview
      Generate virtual compounds from MCPairs MMPA
      • Hit-to-Lead transformations – the most used medicinal chemistry
      • ADMET transformations for metabolism and solubility
      • Target class transformations learning from target analogues
        – E.g. Dopamine Rule
      Regression forest models
      • Accurate pharmacophore features with topological distance
      • Unfolded fingerprints connect feature importance to pharmacophores
      • Error models give accuracy of prediction for each compound
      Active Learning
      • Explore strategy – predicted high potency, high error
      • Exploit strategy – predicted high potency, low error
  • 29. Active Learning
      [Flow diagram: Hits -> Build model with error estimates -> Enumerate -> Select for Explore and Exploit -> Synthesise & Test -> Compounds with data -> Compounds meet criteria? (Yes / No)]
      STRATEGIES
      • Explore: prioritize high error
      • Exploit: prioritize high potency & low error
      • Ratio of explore to exploit varies with stage
      • Select enumeration strategy by stage: hit-to-lead, target class, solubility, metabolism
      • For in silico simulation, match to known and measured compounds
      System operational
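One way to turn the Explore and Exploit strategies into a selection step is sketched below. The slide does not say how potency and error are combined, so two assumptions are made here: explore picks are ranked by predicted error alone, and exploit picks are ranked by a pessimistic score, predicted pIC50 minus predicted error. The names `select_batch` and `explore_fraction` are hypothetical.

```python
def select_batch(candidates, batch_size, explore_fraction):
    """Split one synthesis batch between explore and exploit picks.

    candidates: list of (name, predicted_pIC50, predicted_error) tuples.
    explore_fraction: share of the batch given to exploration (varies by stage).
    """
    n_explore = round(batch_size * explore_fraction)

    # Explore: prioritise the compounds the model is least sure about.
    by_error = sorted(candidates, key=lambda c: c[2], reverse=True)
    explore = by_error[:n_explore]
    taken = {c[0] for c in explore}

    # Exploit: high predicted potency and low error, scored as pIC50 - error
    # (an assumed way of combining the two criteria).
    remaining = [c for c in candidates if c[0] not in taken]
    by_score = sorted(remaining, key=lambda c: c[1] - c[2], reverse=True)
    exploit = by_score[:batch_size - n_explore]

    return [c[0] for c in explore], [c[0] for c in exploit]


virtual = [
    ("c1", 7.0, 0.2), ("c2", 6.5, 0.9), ("c3", 6.8, 0.3),
    ("c4", 5.0, 1.2), ("c5", 7.2, 0.8),
]
print(select_batch(virtual, batch_size=4, explore_fraction=0.5))
```

Lowering `explore_fraction` as a project matures is one simple way to let the ratio of explore to exploit vary with stage.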
  • 30. Active Learning – V1
      Challenges:
      • How to get started when you only have a few compounds to build a model from
      • Limited synthesis resource
      D2 case study
      • Start with 30 literature compounds: 5 <= pIC50 <= 6, -1 < AlogP < 3.5, selected by LLE sort (the literature contains 5200 compounds)
      • Build RF model: CV-R2 -0.26, small data set
      • Enumerate from all compounds
      • What is the best enumeration strategy?
        – How to pick the (few) compounds to make from the enumerated set?
        – Enumeration is a success if we match literature compounds (a very stringent test)
        – Have we learnt all that the initial set of compounds can teach us?
      | Strategy (MMPA) | Number of compounds generated | Number of matches to D2 known set | Maximum pIC50 (actual) | Maximum pIC50 (predicted [error]) |
      |---|---|---|---|---|
      | Hit-to-Lead | 682 | 10 | 7.8 | 5.5 [0.21] |
      | Dopamine class | 469 | 8 | 7.9 | 5.5 [0.23] |
      | Solubility | 10148 | 10 | 7.8 | 5.5 [0.21] |
      | Metabolism | 12729 | 19 | 7.9 | 5.5 [0.21] |
      | Permutative MMPA (env = 4) | 5 | 3 | 7.9 | 6.1 [?] |
      [Plot: D2 pIC50 against cLogP, Round 1]
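The set-up of this in silico test, picking a starting set by LLE and then counting enumerated compounds that match the known literature set, can be sketched as follows. LLE is taken as pIC50 minus logP, the usual definition. The sort direction (highest LLE first) and matching on a canonical identifier string are assumptions, since the slide does not specify either; the function names are hypothetical.

```python
def pick_starting_set(compounds, n=30):
    """Apply the potency and lipophilicity window, then take the top n by LLE.

    compounds: list of dicts with "id", "pIC50" and "AlogP" keys.
    """
    window = [c for c in compounds
              if 5 <= c["pIC50"] <= 6 and -1 < c["AlogP"] < 3.5]
    # LLE = pIC50 - logP; highest first is an assumed sort direction.
    window.sort(key=lambda c: c["pIC50"] - c["AlogP"], reverse=True)
    return window[:n]


def count_matches(enumerated_ids, known_ids):
    """Number of enumerated compounds that are also in the known, measured set."""
    return len(set(enumerated_ids) & set(known_ids))


literature = [
    {"id": "a", "pIC50": 5.5, "AlogP": 2.0},
    {"id": "b", "pIC50": 6.0, "AlogP": 3.0},
    {"id": "c", "pIC50": 7.0, "AlogP": 1.0},   # too potent for the window
    {"id": "d", "pIC50": 5.2, "AlogP": -0.5},
    {"id": "e", "pIC50": 5.8, "AlogP": 3.5},   # AlogP not below 3.5
]
print([c["id"] for c in pick_starting_set(literature, n=2)])
print(count_matches(["x", "y", "z"], ["y", "z", "w"]))
```

Exact-identifier matching is what makes the test stringent: a close analogue of a literature compound counts as a miss.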
  • 31. D2 worked example – the p-MMPA
      • Predicted pIC50 6.1, actual pIC50 7.9
      • Finding all the MMP SAR that is present and applying it exhaustively, including behind the Pareto frontier
      [Plot: D2 pIC50 against cLogP]
  • 32. Active Learning v2: system under development
      [Flow diagram: Hits -> Compounds with data -> P-MMPA (under development) -> Compounds with data -> Build model with error estimates -> Enumerate -> Select for Explore and Exploit -> Synthesise & Test -> Compounds with data -> Compounds meet criteria? (Yes / No)]
      • Explore: prioritize high error
      • Exploit: prioritize high potency & low error
      • Ratio of explore to exploit varies with stage
      • Enumerate by: target class, solubility, metabolism
      • Need an initial “induction phase” before cyclic automated active learning can be applied
  • 33. Like the opening in a chess game
      • “The first moves of a chess game are termed the "opening" or "opening moves". A good opening will provide better protection of the King, control over an area of the board (particularly the centre), greater mobility for pieces, and possibly opportunities to capture opposing pawns and pieces.” A Beginner's Garden of Chess Openings, David A. Wheeler
      • Success or failure of an automated active learning system could be like the first few moves of a chess game: they shape the game…
      • Will it always need a human intervention (or ten…)?
      [Image caption: …set up for either Queen’s Gambit, King’s Indian Defense, Nimzo-Indian, Bogo-Indian, Queen’s Indian Defense, or Dutch Defense.]
  • 34. Learning from First Experiments…
      • MMPA and RF work together to suggest and rank compound designs
      • Strategies explored
        – Explore: prioritize high error
        – Exploit: prioritize high potency & low error
      • Ratio of explore / exploit varies with stage
      • The initial phase from a small number of hits is a challenge
        – Hit-to-Lead / ADMET Rules did not match compounds in literature
        – Victims of what is published
        – Requires full datasets
        – Process can get “stuck”
      • Human intervention may always be required
      • Both MMPA and RF can select compounds to make to improve models (analysis of error)
      • Permutative-MMPA works very well (of course)
      • Where AI could help is as a compound selector, depending on strategy
  • 35. Thank you
      • Dr Alexander G. Dossetter, Managing Director, MedChemica Ltd
      • al.dossetter@medchemica.com
      • MedChemica: Lauren Reid, Jessica Stacey, Phil De Sousa, Shane Montague, Edward J. Griffen, Andrew G. Leach
      • Available on Slideshare - search for Dossetter
      • Twitter @MedChemica
      • Twitter #BucketListPapers
      • https://www.medchemica.com/bucket-list/
  • 36. About MedChemica: >10 years’ experience in building A.I. systems for drug discovery
  • 38.
      • Founded in 2012 by AZ AP Medicinal / Computational chemists to accelerate drug hunting by exploiting data-driven knowledge
      • Domain leaders in SAR knowledge extraction and knowledge-based design
      • > 11 years’ experience of building AI systems that suggest actions to chemists (7 years as MedChemica)
      • Creators of the largest ever documented database of medicinal chemistry ADMET knowledge
      [Image: MedChemica Publications]
  • 39. AI Software Platforms – Complete In-house platform – Analysis of own data and automated updating – Design tool access to all chemists – Custom fitting (Software-as-a-Service) One stop GUI Design tool Biotech, Universities and Foundations Medium to large pharma, agrochemical and materials research – Secure web-based AI design platform – CHEMBL, Patent data analysed – Merged into one knowledgebase
  • 40. Science As A Service (SaaS)
      Research phases: Target ID -> Hit Screening -> Lead Identification -> Lead Optimisation -> Pre-Clinical
      Bespoke Advanced Analytics and Computational Chemistry services throughout the research phase:
      • AI H2L design sets
      • Compound design to solve ADMET and potency issues
      • Third party compound assessment
      • Directed virtual screening for hit matter
      • Library design for novel protein targets
      • AI Toxophore assessment
      • Patent analysis
      • Pharmacophore profiling
      • Generating IP for clients [Scaffold hops]
      • Collection evaluation and enhancement
  • 41. Panel Discussion: What should the Medicinal Chemistry Discipline be like in 10 years?
      • Slideshare - search for Dossetter
      • Twitter @MedChemica
      • Twitter @covid_moonshot
      • Twitter #BucketListPapers
      • https://www.medchemica.com/bucket-list/

Editor's notes

  1. Visualisations are anonymised data from an active client project.
  2. Feature definitions are pairs from Taylor and Cosgrove, with the addition of a halogen class. Distances are topological distances; fingerprints are binary, not scalar counts of the number of matches. Feature importance is permutation importance, not impurity-based importance.
  3. As note 2.
  4. Everyone wants to be able to spot the weak points in a model so it can be improved. Here, because we can identify where the under-explored regions of pharmacophore space are, we can choose to bias our ‘explore’ synthesis and testing towards improving the model in a transparent and verifiable way. Because we are using the precise pharmacophore definitions and Random Forest modelling, understanding where to focus attention is straightforward.
  5. We can generate good compounds from enumeration; the problem is how to rank them. If we generate a lot of compounds, the initially generated model is not sufficiently discriminating, so generating lots of compounds is not the solution initially! Enumerating from HtL transformations or class transformations is better, but the best approach is to first make sure you’ve got the most out of the data you already have: permutative MMPA. In the D2 example, the m-OMe -> o-OH transformation, if applied to the propyl compound, gives a 1.6 log increase in potency (a known, measured compound not in the training set). Note that env = 4 uses only env 4 transformations from MCPairs, so we only transfer exact SAR and do not generically pepper the compounds with all the substituents, e.g. just m-Cl, not all the chloros.
  6. As note 5.
  7. ‘Under dev’ covers MMS and extensions. It’s where Andy Bell at Exscientia comes in, I think.
  8. You might want to put more of the team on the Thank you slide: E. Griffen, A. Leach, A. Lin, J. Stacey, L. Reid, S. Montague, P De Sousa.