Dispensing processes profoundly influence estimates of biological activity of compounds. In this study using published inhibitor data for the tyrosine kinase EphB4, we show that IC50 values obtained via disposable tip-based serial dilution and dispensing versus acoustic dispensing differ by orders of magnitude with no correlation or ranking of datasets. Importantly, the computed EphB4 pharmacophores derived from these data differ for each dataset. Acoustic dispensing correctly highlights multiple hydrophobic features in the pharmacophore and correlates with calculated LogP values. Significantly, the acoustic dispensing-derived pharmacophore correctly identified active compounds in a test set. The subsequent analysis of crystal structures for other published EphB4 inhibitors and automated development of pharmacophores indicated that these pharmacophores were comparable to those developed with acoustic dispensing data. In short, dispensing processes are another important source of error in high-throughput screening that impacts computational and statistical analyses. These findings have far-reaching implications in biological research and in drug discovery.
Dispensing Processes Profoundly Impact Biological Assays and Computational and Statistical Analyses
1. Dispensing Processes Profoundly Impact
Biological, Computational and Statistical Analyses
Sean Ekins1, Joe Olechno2 and Antony J. Williams3
1 Collaborations in Chemistry, Fuquay Varina, NC.
2 Labcyte Inc, Sunnyvale, CA.
3 Royal Society of Chemistry, Wake Forest, NC.
Disclaimer: SE and AJW have no affiliation with Labcyte and have
not been engaged as consultants
2. Where do scientists get chemistry/biology data?
Databases
Patents
Papers
Your own lab
Collaborators
Some or all of the above?
What is common to all? Quality issues
“If I have seen further than others, it is by standing upon the shoulders of giants.” Isaac Newton
3. Data can be found, but drug structure quality is important
More groups doing in silico repositioning
Target-based or ligand-based
Network and systems biology
Integrating or using sets of FDA drugs: if the structures are incorrect, predictions will be too
Need a definitive set of FDA-approved drugs with correct structures
Also linkage between in vitro data & clinical data
4. Structure Quality Issues
Database released, and within days hundreds of errors were found in structures
Science Translational Medicine 2011
NPC Browser http://tripod.nih.gov/npc/
DDT 17: 685-701 (2012)
DDT, 16: 747-750 (2011)
5. It’s not just structure quality we need to worry about
DDT editorial Dec 2011
This editorial led to the current work http://goo.gl/dIqhU
6. Finding structures of Pharma molecules is hard
NCATS and MRC made molecule identifiers from pharmas available with no structures
Southan et al., DDT, 18: 58-70 (2013)
7. How do you move a liquid?
Plastic leaching
McDonald et al., Science 2008, 322, 917.
Belaiche et al., Clin Chem 2009, 55, 1883-1884
Images courtesy of Bing, Tecan
8. Moving Liquids with sound: Acoustic Droplet Ejection (ADE)
Acoustic energy expels droplets without physical contact
Extremely precise
Extremely accurate
Rapid
Auto-calibrating
Completely touchless
No cross-contamination
No leachates
No binding
[Figure: %CV versus volume (nL), log scale from 0.1 to 10000. Comley J, Nanolitre Dispensing, Drug Discovery World, Summer 2004, 43-54]
Images courtesy of Labcyte Inc. http://goo.gl/K0Fjz
9. Using literature data from different dispensing methods to generate
computational models
Few molecule structures and corresponding datasets are public
Using data from 2 AstraZeneca patents:
Tyrosine kinase EphB4 pharmacophores (Accelrys Discovery
Studio) were developed using data for 14 compounds
IC50 determined using different dispensing methods
Analyzed correlation with simple descriptors (SAS JMP)
Calculated LogP correlation with log IC50 data for acoustic
dispensing (r2 = 0.34, p < 0.05, N = 14)
Barlaam, B. C.; Ducray, R., WO 2009/010794 A1, 2009
Barlaam, B. C.; Ducray, R.; Kettle, J. G., US 7,718,653 B2, 2010
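The reported correlation (r2 = 0.34, N = 14) can be checked against the stated significance level with the standard t-test for a Pearson correlation, t = r * sqrt((N - 2) / (1 - r2)). The sketch below is a minimal illustration and not the SAS JMP analysis behind the slide; the function name is hypothetical, and the critical value (2.179, two-tailed, alpha = 0.05, 12 degrees of freedom) is an assumption taken from standard t tables.

```python
import math

def t_from_r_squared(r_squared: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    if n < 3 or not 0.0 <= r_squared < 1.0:
        raise ValueError("need n >= 3 and 0 <= r_squared < 1")
    return math.sqrt(r_squared * (n - 2) / (1.0 - r_squared))

# Values reported on the slide: calculated LogP vs. log IC50, acoustic dispensing.
R_SQUARED = 0.34
N = 14

# Assumption: two-tailed critical t for alpha = 0.05 and df = 12, from a standard t table.
T_CRITICAL_DF12 = 2.179

t_stat = t_from_r_squared(R_SQUARED, N)
print(f"t = {t_stat:.2f}, significant at 0.05: {t_stat > T_CRITICAL_DF12}")
```

With the slide's values the statistic is about 2.49, which exceeds the critical value and is therefore consistent with the reported p < 0.05.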
11. A graph of the log IC50 values for tip-based serial dilution
and dispensing versus acoustic dispensing with direct dilution
shows a poor correlation between techniques (R2 = 0.246).
The acoustic technique always gave a more potent IC50 value.
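One way to check the "always more potent" observation for a set of paired measurements is to compute, per compound, the log10 ratio of the tip-based IC50 to the acoustic IC50: a positive value means the acoustic IC50 is lower, that is, more potent. The sketch below assumes both IC50 lists use the same units; the function name and the three example pairs are hypothetical placeholders, not values from the patents.

```python
import math

def log_potency_shift(tip_ic50, acoustic_ic50):
    """Return log10(tip / acoustic) for each compound pair.

    A positive entry means the acoustic IC50 is lower (more potent).
    """
    if len(tip_ic50) != len(acoustic_ic50):
        raise ValueError("both lists must describe the same compounds")
    if any(v <= 0 for v in list(tip_ic50) + list(acoustic_ic50)):
        raise ValueError("IC50 values must be positive")
    return [math.log10(t / a) for t, a in zip(tip_ic50, acoustic_ic50)]

# Hypothetical placeholder IC50 values (same units), NOT data from the patents.
tip = [1.0, 0.5, 10.0]
acoustic = [0.01, 0.05, 0.1]

shifts = log_potency_shift(tip, acoustic)
acoustic_always_more_potent = all(s > 0 for s in shifts)
print([round(s, 2) for s in shifts], acoustic_always_more_potent)
```

Each unit of shift corresponds to one order of magnitude between the two dispensing methods.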
12. Experimental Process
[Flow diagram, with parallel acoustic and tip-based models at each step:]
Step 1: 14 structures with data. Initial data set of 14 (WO2009/010794, US 7,718,653)
Step 2: Generate pharmacophore models for EphB4 receptor (acoustic model, tip-based model)
Step 3: Test models against new data. Independent data set of 12 (WO2008/132505)
Step 4: Test models against X-ray crystal structure pharmacophores. Independent crystallography data (Bioorg Med Chem Lett 18:2776; 18:5717; 20:6242; 21:2207)
Results
13. Tyrosine kinase EphB4 Pharmacophores
Generated with Discovery Studio (Accelrys)
Cyan = hydrophobic
Green = hydrogen bond acceptor
Purple = hydrogen bond donor
Each model shows most potent molecule mapping
[Figure panels: Acoustic, Tip-based]
Process                     Hydrophobic      Hydrogen bond    Hydrogen bond   Observed vs.
                            features (HPF)   acceptor (HBA)   donor (HBD)     predicted IC50 r
Acoustic mediated process   2                1                1               0.92
Tip-based process           0                2                1               0.80
• Ekins et al., PLoS ONE, In press
14. Test set evaluation of pharmacophores
• An additional 12 compounds from AstraZeneca
Barlaam, B. C.; Ducray, R., WO 2008/132505 A1, 2008
• 10 of these compounds had data for tip-based dispensing
and 2 for acoustic dispensing
• Calculated LogP and logD showed low but statistically
significant correlations with tip-based dispensing (r2 = 0.39,
p < 0.05 and r2 = 0.24, p < 0.05, respectively; N = 36)
• Used as a test set for pharmacophores
• The two compounds analyzed with acoustic liquid
handling were predicted in the top 3 using the ‘acoustic’
pharmacophore
• The ‘Tip-based’ pharmacophore failed to rank the
retrieved compounds correctly
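The "top 3" check above can be expressed as a simple rank test over pharmacophore fit scores. The sketch below is an illustration only: the compound labels and fit values are hypothetical placeholders, and ranking by a single fit score is an assumption, not necessarily how Discovery Studio ordered the retrieved compounds.

```python
def rank_by_fit(fit_scores):
    """Map each compound name to its rank (1 = best) by descending fit score."""
    ordered = sorted(fit_scores, key=fit_scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

def all_in_top_k(fit_scores, compounds, k):
    """True if every listed compound is ranked within the top k."""
    ranks = rank_by_fit(fit_scores)
    return all(ranks[name] <= k for name in compounds)

# Hypothetical placeholder fit scores, NOT results from the study.
fit_scores = {
    "cmpd_A": 3.1,
    "cmpd_B": 2.4,
    "cmpd_C": 2.9,
    "cmpd_D": 1.2,
    "cmpd_E": 0.7,
}
acoustic_tested = ["cmpd_A", "cmpd_C"]  # hypothetical labels

print(all_in_top_k(fit_scores, acoustic_tested, k=3))
```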
15. Automated receptor-ligand pharmacophore generation
method
Pharmacophores for the tyrosine kinase EphB4 generated from crystal
structures in the Protein Data Bank (PDB) using Discovery Studio version 3.5.5
Cyan = hydrophobic
Green = hydrogen bond acceptor
Purple = hydrogen bond donor
Grey = excluded volumes
Each model shows most potent molecule mapping
Bioorg Med Chem Lett 2010, 20, 6242-6245.
Bioorg Med Chem Lett 2008, 18, 5717-5721.
Bioorg Med Chem Lett 2008, 18, 2776-2780.
Bioorg Med Chem Lett 2011, 21, 2207-2211.
16. Summary
• In the absence of structural data, pharmacophores and other
computational and statistical models are used to guide medicinal
chemistry in early drug discovery.
• Our findings suggest acoustic dispensing methods could improve HTS
results and avoid the development of misleading computational models
and statistical relationships.
• Automated pharmacophores are closer to the pharmacophore generated with
acoustic data: all have hydrophobic features, which are missing from the
tip-based pharmacophore model.
• Importance of hydrophobicity seen with logP correlation and
crystal structure interactions.
• Public databases should annotate dispensing-method metadata alongside
biological data points, to create larger datasets for comparing different
computational methods.
17. Acoustic vs. Tip-based Transfers
[Figure, four panels comparing acoustic and tip-based transfers:]
Adapted from Spicer et al., Presentation at Drug Discovery Technology, Boston, MA, August 2005. Axes: Acoustic IC50 μM vs. Serial dilution IC50 μM
Adapted from Wingfield, Presentation at ELRIG2012, Manchester, UK. Axes: Aqueous % Inhibition vs. Acoustic % Inhibition (note different orientation)
Adapted from Wingfield et al., Amer. Drug Disco. 2007, 3(3):24. Axes: Acoustic IC50 μM vs. Serial dilution IC50 μM (log scale)
Data in this presentation. Axes: Log IC50 acoustic vs. Log IC50 tips
No Previous Analysis of molecule properties
18. Strengths and Weaknesses
• Small dataset size – focused on one compound series
• No previous publication describing how data quality can be
impacted by dispensing and how this in turn affects
computational models and downstream decision making.
• No comparison of pharmacophores generated from acoustic
dispensing and tip-based dispensing.
• No previous comparison of pharmacophores generated from in
vitro data with pharmacophores automatically generated from
X-ray crystal conformations of inhibitors.
• Severely limited by number of structures in public domain
with data in both systems
• Reluctance of many to accept that this could be an issue
• Ekins et al., PLOSONE, In press
19. The stuff of nightmares?
How much of the data in databases is generated by tip-based serial
dilution methods? We don’t know: the metadata doesn’t tell us!
How much is erroneous?
Do we have to start again?
How does it affect all subsequent science, such as data mining?
Does it impact Pharma’s productivity?
20. Simple Rules for licensing “open” data
As we see a future of increased database integration the licensing of the data may be a hurdle that hampers progress and usability.
Williams, Wilbanks and Ekins. PLoS Comput Biol 8(9): e1002706, 2012
Could data ‘open accessibility’ equal ‘Disruption’
1: NIH and other international scientific funding bodies should mandate …open accessibility for all data generated by publicly funded research immediately
Ekins, Waller, Bradley, Clark and Williams. DDT, 18:265-71, 2013
21. You can find me @... CDD Booth 205
PAPER ID: 13433
PAPER TITLE: “Dispensing processes profoundly impact biological assays and computational and
statistical analyses”
April 8th 8.35am Room 349
PAPER ID: 14750
PAPER TITLE: “Enhancing High Throughput Screening For Mycobacterium tuberculosis Drug Discovery
Using Bayesian Models”
April 9th 1.30pm Room 353
PAPER ID: 21524
PAPER TITLE: “Navigating between patents, papers, abstracts and databases using public sources and
tools”
April 9th 3.50pm Room 350
PAPER ID: 13358
PAPER TITLE: “TB Mobile: Appifying Data on Anti-tuberculosis Molecule Targets”
April 10th 8.30am Room 357
PAPER ID: 13382
PAPER TITLE: “Challenges and recommendations for obtaining chemical structures of industry-provided
repurposing candidates”
April 10th 10.20am Room 350
PAPER ID: 13438
PAPER TITLE: “Dual-event machine learning models to accelerate drug discovery”
April 10th 3.05 pm Room 350