1. Dispensing Processes Profoundly Impact
Biological, Computational and Statistical Analyses
Sean Ekins1, Joe Olechno2 and Antony J. Williams3
1 Collaborations in Chemistry, Fuquay Varina, NC.
2 Labcyte Inc, Sunnyvale, CA.
3 Royal Society of Chemistry, Wake Forest, NC.
Disclaimer: SE and AJW have no affiliation with Labcyte and have
not been engaged as consultants
2. Where do scientists get chemistry/biology data?
Databases
Patents
Papers
Your own lab
Collaborators
Some or all of the above?
What is common to all? Quality issues
“If I have seen further than others, it is by standing upon the shoulders of giants.”
Isaac Newton
3. Data can be found, but … drug structure quality is important
More groups doing in silico repositioning
Target-based or ligand-based
Network and systems biology
integrating or using sets of FDA drugs: if the structures are incorrect, predictions will be too
Need a definitive set of FDA approved drugs with correct structures
Also linkage between in vitro data & clinical data
4. Structure Quality Issues
Database released, and within days hundreds of errors were found in structures
Science Translational Medicine 2011
NPC Browser http://tripod.nih.gov/npc/
DDT 17: 685-701 (2012)
DDT, 16: 747-750 (2011)
5. It's not just structure quality we need to worry about
DDT editorial Dec 2011
This editorial led to the current work http://goo.gl/dIqhU
6. Finding structures of Pharma molecules is hard
NCATS and MRC made molecule identifiers from pharmas available with no structures
Southan et al., DDT, 18: 58-70 (2013)
7. How do you move a liquid?
Plastic leaching
McDonald et al., Science 2008, 322, 917.
Belaiche et al., Clin Chem 2009, 55, 1883-1884
Images courtesy of Bing, Tecan
8. Moving Liquids with sound: Acoustic Droplet Ejection (ADE)
Acoustic energy expels droplets without physical contact
Extremely precise
Extremely accurate
Rapid
Auto-calibrating
Completely touchless
No cross-contamination
No leachates
No binding
[Figure: %CV versus Volume (nL), log scale from 0.1 to 10000 nL. Comley J, Nanolitre Dispensing, Drug Discovery World, Summer 2004, 43-54]
Images courtesy of Labcyte Inc. http://goo.gl/K0Fjz
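The %CV on the figure axis is the coefficient of variation: the standard deviation of replicate transfers expressed as a percentage of their mean. A minimal sketch of that calculation follows. The replicate volumes are hypothetical placeholders, not measurements from the slide or from Comley (2004), and the function name `percent_cv` is introduced here only for illustration.

```python
from statistics import mean, stdev


def percent_cv(volumes_nl):
    """Return the coefficient of variation (%CV) of replicate volumes.

    Uses the sample standard deviation (n - 1 denominator).
    """
    if len(volumes_nl) < 2:
        raise ValueError("need at least two replicates")
    return 100.0 * stdev(volumes_nl) / mean(volumes_nl)


# Hypothetical replicate volumes (nL) for a single nominal transfer size.
# These numbers are placeholders chosen for illustration only.
replicates = [2.45, 2.52, 2.50, 2.48, 2.55]
print(f"%CV = {percent_cv(replicates):.2f}")
```

A lower %CV means the dispensed volumes are more tightly clustered, which is the sense of "precise" used on this slide.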
9. Using literature data from different dispensing methods to generate
computational models
Few molecule structures and corresponding datasets are public
Using data from 2 AstraZeneca patents –
Tyrosine kinase EphB4 pharmacophores (Accelrys Discovery
Studio) were developed using data for 14 compounds
IC50 determined using different dispensing methods
Analyzed correlation with simple descriptors (SAS JMP)
Calculated LogP correlation with log IC50 data for acoustic
dispensing (r2 = 0.34, p < 0.05, N = 14)
Barlaam, B. C.; Ducray, R., WO 2009/010794 A1, 2009
Barlaam, B. C.; Ducray, R.; Kettle, J. G., US 7,718,653 B2, 2010
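As a rough consistency check on the reported statistics, the r2 and N quoted above can be converted to a t statistic and compared with a tabulated critical value. This is only a sketch: it assumes a two-tailed test of a Pearson correlation, and the critical value 2.179 (alpha = 0.05, 12 degrees of freedom) is taken from a standard t table, not from the slides. The original analysis was done in SAS JMP, so this is not the authors' procedure.

```python
import math


def t_statistic(r_squared, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    r = math.sqrt(r_squared)
    return r * math.sqrt((n - 2) / (1.0 - r_squared))


# Values reported on the slide for calculated LogP vs. log IC50 (acoustic).
R_SQUARED = 0.34
N = 14

# Assumption: two-tailed critical t for alpha = 0.05 with N - 2 = 12 degrees
# of freedom, read from a standard t table (not stated in the slides).
T_CRITICAL_DF12 = 2.179

t = t_statistic(R_SQUARED, N)
print(f"t = {t:.2f}, exceeds critical value: {t > T_CRITICAL_DF12}")
```

Under these assumptions the t statistic exceeds the critical value, which is consistent with the p < 0.05 reported for N = 14.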
11. A graph of the log IC50 values for tip-based serial dilution
and dispensing versus acoustic dispensing with direct dilution
shows a poor correlation between techniques (R2 = 0.246).
[Figure: scatter plot of log IC50-acoustic against log IC50-tips. Annotation: acoustic technique always gave a more potent IC50 value]
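The comparison on this slide can be sketched in two steps: a squared Pearson correlation between the paired log IC50 values, and a per-compound fold shift to see which technique reports the more potent value. The numbers below are illustrative placeholders, not the 14 patent values, and the helper names are hypothetical.

```python
def pearson_r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def fold_shifts(log_ic50_tips, log_ic50_acoustic):
    """IC50(tips) / IC50(acoustic) per compound; > 1 means acoustic is more potent."""
    return [10 ** (t - a) for t, a in zip(log_ic50_tips, log_ic50_acoustic)]


# Placeholder log IC50 values for five imaginary compounds (NOT patent data).
log_tips = [0.9, -0.3, 0.4, 1.1, -0.8]
log_acoustic = [-1.2, -2.1, -0.4, -1.9, -2.5]

print(f"R2 = {pearson_r_squared(log_tips, log_acoustic):.3f}")
print("acoustic always more potent:",
      all(f > 1 for f in fold_shifts(log_tips, log_acoustic)))
```

Working in log units keeps the fold shift a simple difference, which is why the plot and the correlation both use log IC50 rather than raw IC50.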
12. Experimental Process
14 structures with data
Generate pharmacophore models for EphB4 receptor (acoustic model, tip-based model)
Test models against new data (acoustic model, tip-based model)
Test models against X-ray crystal structure pharmacophores (acoustic model, tip-based model)
Results
Initial data set of 14: WO2009/010794, US 7,718,653
Independent data set of 12: WO2008/132505
Independent crystallography data: Bioorg Med Chem Lett 18:2776; 18:5717; 20:6242; 21:2207
13. Tyrosine kinase EphB4 Pharmacophores
Generated with Discovery Studio (Accelrys)
Cyan = hydrophobic
Green = hydrogen bond acceptor
Purple = hydrogen bond donor
Each model shows most potent molecule mapping
[Figure: pharmacophore panels labelled Acoustic and Tip based]

Process                     HPF   HBA   HBD   Observed vs. predicted IC50 r
Acoustic mediated process   2     1     1     0.92
Tip-based process           0     2     1     0.80
HPF = hydrophobic features; HBA = hydrogen bond acceptor; HBD = hydrogen bond donor
• Ekins et al., PLoS ONE, In press
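To make the difference between the two models explicit, the feature counts and r values from this slide can be held in a small structure and compared. This is only an illustrative sketch: the dictionary layout and the function `feature_differences` are hypothetical and are not part of Discovery Studio.

```python
# Feature counts and observed-vs-predicted r values as given on this slide.
MODELS = {
    "acoustic": {"HPF": 2, "HBA": 1, "HBD": 1, "r": 0.92},
    "tip_based": {"HPF": 0, "HBA": 2, "HBD": 1, "r": 0.80},
}


def feature_differences(a, b):
    """Return {feature: (count_a, count_b)} for features whose counts differ."""
    return {k: (a[k], b[k]) for k in ("HPF", "HBA", "HBD") if a[k] != b[k]}


diffs = feature_differences(MODELS["acoustic"], MODELS["tip_based"])
print(diffs)

# Squaring r gives the fraction of variance in observed IC50 explained by each model.
for name, model in MODELS.items():
    print(name, "r^2 =", round(model["r"] ** 2, 2))
```

The comparison shows that the models differ in hydrophobic and hydrogen bond acceptor counts, while the hydrogen bond donor count is the same in both.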
14. Test set evaluation of pharmacophores
• An additional 12 compounds from AstraZeneca
Barlaam, B. C.; Ducray, R., WO 2008/132505 A1, 2008
• 10 of these compounds had data for tip based dispensing
and 2 for acoustic dispensing
• Calculated LogP and logD showed low but statistically
significant correlations with tip-based dispensing (r2 =
0.39, p < 0.05 and r2 = 0.24, p < 0.05, N = 36)
• Used as a test set for pharmacophores
• The two compounds analyzed with acoustic liquid
handling were predicted in the top 3 using the ‘acoustic’
pharmacophore
• The ‘Tip-based’ pharmacophore failed to rank the
retrieved compounds correctly
15. Automated receptor-ligand pharmacophore generation
method
Pharmacophores for the tyrosine kinase EphB4 generated from crystal
structures in the Protein Data Bank (PDB) using Discovery Studio version 3.5.5
Cyan = hydrophobic
Green = hydrogen bond acceptor
Purple = hydrogen bond donor
Grey = excluded volumes
Each model shows most potent molecule mapping
Bioorg Med Chem Lett 2010, 20, 6242-6245.
Bioorg Med Chem Lett 2008, 18, 5717-5721.
Bioorg Med Chem Lett 2008, 18, 2776-2780.
Bioorg Med Chem Lett 2011, 21, 2207-2211.
16. Summary
• In the absence of structural data, pharmacophores and other
computational and statistical models are used to guide medicinal
chemistry in early drug discovery.
• Our findings suggest acoustic dispensing methods could improve HTS
results and avoid the development of misleading computational models
and statistical relationships.
• Automated pharmacophores are closer to the pharmacophore generated
with acoustic data: all have hydrophobic features, which are missing from
the tip-based pharmacophore model
• Importance of hydrophobicity seen with logP correlation and
crystal structure interactions
• Public databases should annotate this meta-data alongside biological
data points, to create larger datasets for comparing different
computational methods.
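One way the annotation suggested in the last bullet might look is a data record that carries the dispensing method alongside each activity value, so that datasets can be split by method before modelling. This is a hypothetical sketch: the class names, field names and example records are invented for illustration and do not correspond to any existing database schema.

```python
from dataclasses import dataclass
from enum import Enum


class DispensingMethod(Enum):
    ACOUSTIC = "acoustic"
    TIP_BASED = "tip-based"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class ActivityRecord:
    """One biological data point annotated with how the compound was dispensed."""
    compound_id: str
    target: str
    log_ic50: float
    dispensing: DispensingMethod = DispensingMethod.UNKNOWN


def group_by_method(records):
    """Group records by dispensing method so each method can be modelled separately."""
    groups = {}
    for rec in records:
        groups.setdefault(rec.dispensing, []).append(rec)
    return groups


# Hypothetical records; identifiers and values are placeholders.
records = [
    ActivityRecord("cmpd-1", "EphB4", -1.2, DispensingMethod.ACOUSTIC),
    ActivityRecord("cmpd-1", "EphB4", 0.9, DispensingMethod.TIP_BASED),
    ActivityRecord("cmpd-2", "EphB4", -0.4),
]
for method, recs in group_by_method(records).items():
    print(method.value, len(recs))
```

Defaulting to `UNKNOWN` reflects the current situation described in this deck, where most public data points carry no dispensing information at all.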
17. Acoustic vs. Tip-based Transfers
[Figure: four scatter plots comparing acoustic and tip-based transfers]
Panels (axes as labelled on the slide):
Serial dilution IC50 μM vs. Acoustic IC50 μM (linear scale, 0 to 50)
Acoustic % Inhibition vs. Aqueous % Inhibition (-40 to 100; NOTE DIFFERENT ORIENTATION)
Serial dilution IC50 μM vs. Acoustic IC50 μM (log scale, 10^-3 to 10^4)
Log IC50 tips vs. Log IC50 acoustic (data in this presentation)
Sources for the adapted panels:
Spicer et al., Presentation at Drug Discovery Technology, Boston, MA, August 2005
Wingfield, Presentation at ELRIG2012, Manchester, UK
Wingfield et al., Amer. Drug Disco. 2007, 3(3):24
No previous analysis of molecule properties
18. Strengths and Weaknesses
• Small dataset size – focused on one compound series
• No previous publication describing how data quality can be
impacted by dispensing and how this in turn affects
computational models and downstream decision making.
• No previous comparison of pharmacophores generated from acoustic
dispensing and tip-based dispensing.
• No previous comparison of pharmacophores generated from in
vitro data with pharmacophores automatically generated from
X-ray crystal conformations of inhibitors.
• Severely limited by number of structures in public domain
with data in both systems
• Reluctance of many to accept that this could be an issue
• Ekins et al., PLoS ONE, In press
19. The stuff of nightmares?
How much of the data in databases is generated by tip-based serial
dilution methods?
How much is erroneous?
Do we have to start again?
How does it affect all subsequent science (data mining etc.)?
Does it impact pharma's productivity?
20. Simple Rules for licensing “open” data
As we see a future of increased database integration the licensing of the data may be a hurdle that hampers progress and usability.
Williams, Wilbanks and Ekins. PLoS Comput Biol 8(9): e1002706, 2012
Could data ‘open accessibility’ equal ‘Disruption’
1: NIH and other international scientific funding bodies should mandate …open accessibility for all data generated by publicly funded research immediately
Ekins, Waller, Bradley, Clark and Williams. DDT, 18:265-71, 2013
21. You can find me @... CDD Booth 205
PAPER ID: 13433
PAPER TITLE: “Dispensing processes profoundly impact biological assays and computational and statistical
analyses”
April 8th 8.35am Room 349
PAPER ID: 14750
PAPER TITLE: “Enhancing High Throughput Screening For Mycobacterium tuberculosis Drug Discovery
Using Bayesian Models”
April 9th 1.30pm Room 353
PAPER ID: 21524
PAPER TITLE: “Navigating between patents, papers, abstracts and databases using public sources and
tools”
April 9th 3.50pm Room 350
PAPER ID: 13358
PAPER TITLE: “TB Mobile: Appifying Data on Anti-tuberculosis Molecule Targets”
April 10th 8.30am Room 357
PAPER ID: 13382
PAPER TITLE: “Challenges and recommendations for obtaining chemical structures of industry-provided
repurposing candidates”
April 10th 10.20am Room 350
PAPER ID: 13438
PAPER TITLE: “Dual-event machine learning models to accelerate drug discovery”
April 10th 3.05pm Room 350