David Evans and George Papadatos
Lilly Research Centre, Erl Wood Manor, Windlesham, UK

22nd September 2011
• Discover new chemotypes
• Multiobjective space
 •   Isosteres in activity
 •   Improvements in properties

• Want to use multiple tools in same
  environment
 •   But understand what works when
• Open Source Workflow tool – main client is free

• But support is available and can integrate commercial vendors + in-house code as nodes

• Have released many Erl Wood nodes to KNIME community site
    • http://tech.knime.org/community/erlwood
Xedmin
• XED minimization
• 2D -> 3D

Xedex
• Conformational analysis

FieldAlign
• Flexible alignment of query molecules onto template

FieldView
• Launches FieldView
• View field points + energies + other data

All nodes pass SDF
WHY ?
Process is more than just the database search

Company Confidential
Copyright © 2008 Eli Lilly and Company
• Don't want to load all databases onto all users' PCs
• + secure intranet!

[Diagram: command-line search – SOAP Web Service (Apache Tomcat) – node]

Platform-independent communication
• Read in pre-built hypothesis (MOE, Phase)

• Or sketch from template molecule

• Jmol based visualizer

• Can also annotate and filter hits, aids manual inspection

[Figure: non-proprietary structure]
How well do automated pharmacophore
methods do compared to 2D methods?
Maximum Unbiased Validation (MUV) dataset

• 17 targets, total 30 ligands and 15000 decoys per target,
source: PubChem bioactivity data.

• Wide-ranging targets: hormone receptors, kinases, proteases,
GPCRs plus others (e.g. HSP90, HIV RT).

• Unbiased for chemical analogues as MUV ligands pre-clustered with 2D fingerprint
    • 1.16 compounds per scaffold class

                             MUV: J. Chem. Inf. Model., 2009, 49 (2), 169-184.
• Have looked at whole molecule similarity

• Is there more data if we find fragments which maintain
activity?

• Matched Molecular pair analysis (MMP)
•   Fragments compounds and finds pairs where only one fragment differs
The mining and statistical analysis of transformations and
their impact on properties of interest (e.g. solubility or
activity)
  left molecule   right molecule   transformation    ΔSolubility (mg/ml)
  (structure)     (structure)      H → F             -0.8
  (structure)     (structure)      Br → OCH3         +1.2
  (structure)     (structure)      (structure)       +2.4
(*in an automated and unsupervised way)
  It used to be a slow and computationally expensive process...
   •   Pair-wise maximum common substructure extraction – O(N²)

  Recently a much more efficient algorithm was published
        1) Cleave all acyclic single bonds, one by one
        2) Index all the fragments (cf. book index)
        3) Enumerate the values for each key: Mol A >> Mol B

  Hussain and Rea (2010). J. Chem. Inf. Model., 50 (3), 339-348.
  Wagener and Lommerse (2006). J. Chem. Inf. Model., 46 (2), 677-685.
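
Steps 2 and 3 above can be sketched in a few lines of plain Python. This is only an illustration, not the Erl Wood node's code: the fragmentations are hypothetical, hand-written results of step 1 (a toolkit such as RDKit would normally generate them), and the names `FRAGMENTATIONS`, `build_index` and `enumerate_pairs` are invented for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical output of step 1 (single cuts of acyclic single bonds).
# Each molecule ID maps to (context, variable fragment) pairs; [*] marks the cut.
FRAGMENTATIONS = {
    "mol_A": [("[*]c1ccccc1", "[*]F"), ("[*]F", "[*]c1ccccc1")],
    "mol_B": [("[*]c1ccccc1", "[*]Cl"), ("[*]Cl", "[*]c1ccccc1")],
    "mol_C": [("[*]c1ccncc1", "[*]F"), ("[*]F", "[*]c1ccncc1")],
}


def build_index(fragmentations):
    """Step 2: index by context (the key), like the index of a book."""
    index = defaultdict(list)
    for mol_id, cuts in fragmentations.items():
        for context, fragment in cuts:
            index[context].append((mol_id, fragment))
    return index


def enumerate_pairs(index):
    """Step 3: any two entries under the same key form a matched pair."""
    pairs = []
    for context, entries in index.items():
        for (id_l, frag_l), (id_r, frag_r) in combinations(entries, 2):
            if frag_l != frag_r:
                pairs.append((id_l, id_r, f"{frag_l}>>{frag_r}", context))
    return pairs


pairs = enumerate_pairs(build_index(FRAGMENTATIONS))
for pair in pairs:
    print(pair)
```

Because each molecule is fragmented once and pairs are read off the index, no pair-wise substructure comparison is needed. A practical implementation would probably also cap the size of the variable fragment relative to the context, which this sketch does not do.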
In: MolRegnos (IDs), structures (in RDKit format) and
property values

Out: Matched pairs (left and right molecule, IDs,
transformation, property values, ΔP, context,
transformation atom count)

Available as an Erl Wood community contribution node
Find isosteres in chEMBL
chEMBL
       – Database of published medicinal chemistry activity data
       – Using chEMBL_10, total >1,000,000 compounds


Use here just human protein kinase inhibitors
Quality assurance for chEMBL data (SQL statement)
   •     Med. chem. friendly compounds, parent structure, not downgraded,
         confidence score = 9, exact IC50 or Ki values only (converted to
         pIC50/pKi)  ~14K data points
   •     Compare biological values coming from the same assay ID only
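
The conversion to pIC50/pKi mentioned above is the negative base-10 logarithm of the molar concentration. A minimal sketch, assuming the exact IC50 or Ki values are stored in nM (the slide does not state the unit); `to_pactivity` is a name made up for this example.

```python
import math


def to_pactivity(value_nm: float) -> float:
    """Convert an IC50 or Ki in nM to pIC50/pKi = -log10(value in mol/L)."""
    if value_nm <= 0:
        raise ValueError("concentration must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(value_nm * 1e-9) = 9 - log10(value_nm)
    return 9.0 - math.log10(value_nm)


print(to_pactivity(1000.0))  # 1 µM
print(to_pactivity(1.0))     # 1 nM
```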

Aggregate transformations; calculate and bin ΔpIC50s in 3 bins
   •     Good – Bad – Neutral (depending on a cut-off c = 0.4 log units)
• Each transformation has a neutral count

• Absolute value or percentage: NeutralCount%
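
The binning and the NeutralCount% could be computed roughly as follows. This is a sketch under assumptions: the slides do not say how values exactly at ±c are treated (here they count as neutral), and the function names and example ΔpIC50 values are invented for illustration.

```python
from collections import Counter, defaultdict

CUTOFF = 0.4  # log units, the cut-off c quoted above


def classify(delta: float, cutoff: float = CUTOFF) -> str:
    """Bin one ΔpIC50 value; values within ±cutoff are treated as neutral (assumption)."""
    if delta > cutoff:
        return "good"
    if delta < -cutoff:
        return "bad"
    return "neutral"


def neutral_count_percent(pairs):
    """pairs: iterable of (transformation, ΔpIC50). Returns {transformation: NeutralCount%}."""
    counts = defaultdict(Counter)
    for transformation, delta in pairs:
        counts[transformation][classify(delta)] += 1
    return {
        t: 100.0 * c["neutral"] / sum(c.values())
        for t, c in counts.items()
    }


# Made-up values, for illustration only
example = [("A>>B", 0.1), ("A>>B", -0.2), ("A>>B", 0.9), ("A>>B", 0.3), ("C>>D", -1.0)]
result = neutral_count_percent(example)
print(result)
```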
chEMBL workflow outputs isosteric fragments

• How similar are isosteres in 2D fingerprint space?
• In field space?
• Could fields help us find unexpected isosteres?
• 1802 fragment pairs from chEMBL_10 kinase data set

• 481 with no rotatable bonds left or right
    • Simplifies conformational analysis

• For each fragment pair
   1. Swap attachment points for adamantyl
   2. FieldAlign to get field similarity (Use adamantyl to
       constrain overlay)
   3. RDKit fingerprint similarity – topological Daylight-esque
   4. Correct similarities for adamantyl

• Are there isosteric pairs with high field similarity but low RDKit
similarity?
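
The question above amounts to a filter over the fragment-pair table. The sketch below is plain Python and rests on assumptions. The Tanimoto coefficient is assumed as the fingerprint metric, which the slides do not name. The `min_field` and `max_fp` thresholds are arbitrary placeholders. The 60% NeutralCount% cut mirrors the ">60% isosteric examples" filter used on the plots. The adamantyl correction is not shown because its form is not given.

```python
from dataclasses import dataclass


def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 0.0
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


@dataclass
class FragmentPair:
    transformation: str
    field_sim: float    # e.g. from a field-based alignment
    fp_sim: float       # 2D fingerprint similarity
    neutral_pct: float  # NeutralCount%


def non_obvious_isosteres(pairs, min_field=0.7, max_fp=0.3, min_neutral_pct=60.0):
    """Keep pairs that look isosteric in field space but dissimilar in 2D (hypothetical thresholds)."""
    return [
        p for p in pairs
        if p.field_sim >= min_field
        and p.fp_sim <= max_fp
        and p.neutral_pct > min_neutral_pct
    ]


# Made-up numbers, for illustration only
pairs = [
    FragmentPair("frag_1>>frag_2", 0.85, 0.20, 75.0),
    FragmentPair("frag_3>>frag_4", 0.90, 0.80, 90.0),
    FragmentPair("frag_5>>frag_6", 0.80, 0.10, 40.0),
]
hits = non_obvious_isosteres(pairs)
print([p.transformation for p in hits])
```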
[Scatter plot: Field Sim vs RDKit Sim, sized by Neutral Count % (larger = more isosteric). Regions marked: pairs with high field similarity but low 2D similarity; pairs with high field and 2D similarity.]
[Scatter plot: Field Sim vs RDKit Sim, sized by Neutral Count %; only those with >60% isosteric examples. Highlighted: Thiophene -> Phenol]
[Scatter plot: Field Sim vs RDKit Sim, sized by Neutral Count %; only those with >60% isosteric examples. Highlighted: Imidazole -> Morpholine?]
[Scatter plot: Field Sim vs RDKit Sim, sized by Neutral Count %; only those with >60% isosteric examples. Highlighted: some small fragments]
[Figure: WEE1 kinase, PDB 2I06 (non-proprietary structure from PDB), with buried and solvent-exposed regions labelled]
[Scatter plot: Field Sim vs RDKit Sim, sized by Neutral Count %; only those with >60% isosteric examples. Highlighted: Me-tetrazole -> oxadiazole]
[Scatter plot: Field Sim vs RDKit Sim, sized by Neutral Count %; only those with >60% isosteric examples. Highlighted: Thiophene -> phenol]
[Figure: non-proprietary structure (from PDB)]
• 6299 data points from thermodynamic solubility assay

• 423 single-point transformations

• 215 transformations with no rotatable bonds

• Aggregate transformations; calculate and bin ΔlogS in 3 bins
   • Good – Bad – Neutral (c = 0.3 log units)



• Are there transformations which increase solubility with low
field similarity but high RDKit similarity?
[Scatter plot: Field Sim vs RDKit Sim, sized by Good Count %; only those with >60% boosting examples. Highlighted: ring contraction + twist?]
[Scatter plot: Field Sim vs RDKit Sim, sized by Good Count %; only those with >60% boosting examples. Highlighted: big boost from morpholine]
• Can mine chEMBL data for non-obvious isosteres
   • Will other data sets find more?
• Would like to improve workflow to make isostere data set for
3D similarity comparison
   • Improve fragmentation/conformer/alignment handling?
   • Need to include whole molecule?
   • Need 3D binding site data as well to confirm isosterism?

• KNIME platform developing
   • Virtual screening and evaluation environment
   • Rapid experimentation with varied tools
   • http://tech.knime.org/community/erlwood
George Papadatos
Juliette Pradon
Hina Patel
Nikolas Fechner
David Thorner
Michael Bodkin

KNIME, chEMBL + Cresset!
ROC curves for retrieval of >66% isosteric groups

• Field similarity performs better than RDKit
• But AUC = 0.68
• Workflow not optimized for this purpose
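
For reference, an AUC such as the one quoted above can be computed without plotting the curve: it equals the probability that a randomly chosen positive (here, an isosteric group) is scored above a randomly chosen negative. A minimal sketch with made-up scores; `roc_auc` is a name invented for this example, not a function from the workflow.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up similarity scores; 1 marks an isosteric group, 0 a non-isosteric one
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
auc = roc_auc(scores, labels)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.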

Más contenido relacionado

Similar a David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs'

Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextRafał Kuć
 
Gordon2003
Gordon2003Gordon2003
Gordon2003toluene
 
A machine-learning view on heterogeneous catalyst design and discovery
A machine-learning view on heterogeneous catalyst design and discoveryA machine-learning view on heterogeneous catalyst design and discovery
A machine-learning view on heterogeneous catalyst design and discoveryIchigaku Takigawa
 
Energy Aware performance evaluation of WSNs.
Energy Aware performance evaluation of WSNs.Energy Aware performance evaluation of WSNs.
Energy Aware performance evaluation of WSNs.ikrrish
 
Presentation l`aquila new
Presentation l`aquila newPresentation l`aquila new
Presentation l`aquila newikrrish
 
Exploring Spark for Scalable Metagenomics Analysis: Spark Summit East talk by...
Exploring Spark for Scalable Metagenomics Analysis: Spark Summit East talk by...Exploring Spark for Scalable Metagenomics Analysis: Spark Summit East talk by...
Exploring Spark for Scalable Metagenomics Analysis: Spark Summit East talk by...Spark Summit
 
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...LEGATO project
 
Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Valery Tkachenko
 
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...ChemAxon
 
Imac10 Component
Imac10 ComponentImac10 Component
Imac10 ComponentSDTools
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomicsAjit Shinde
 
Overview of DuraMat software tool development
Overview of DuraMat software tool developmentOverview of DuraMat software tool development
Overview of DuraMat software tool developmentAnubhav Jain
 
SFDC Introduction to Apex
SFDC Introduction to ApexSFDC Introduction to Apex
SFDC Introduction to ApexSujit Kumar
 
04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbersYutaka Kawai
 
Metabolic network mapping for metabolomics
Metabolic network mapping for metabolomicsMetabolic network mapping for metabolomics
Metabolic network mapping for metabolomicsDinesh Barupal
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxIvo Andreev
 

Similar a David Evans, Eli-Lilly, 'Field-Aligned Matched Pairs' (20)

Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
 
PraveenBOUT++
PraveenBOUT++PraveenBOUT++
PraveenBOUT++
 
New
NewNew
New
 
Gordon2003
Gordon2003Gordon2003
Gordon2003
 
A machine-learning view on heterogeneous catalyst design and discovery
A machine-learning view on heterogeneous catalyst design and discoveryA machine-learning view on heterogeneous catalyst design and discovery
A machine-learning view on heterogeneous catalyst design and discovery
 
Energy Aware performance evaluation of WSNs.
Energy Aware performance evaluation of WSNs.Energy Aware performance evaluation of WSNs.
Energy Aware performance evaluation of WSNs.
 
Presentation l`aquila new
Presentation l`aquila newPresentation l`aquila new
Presentation l`aquila new
 
Exploring Spark for Scalable Metagenomics Analysis: Spark Summit East talk by...
Exploring Spark for Scalable Metagenomics Analysis: Spark Summit East talk by...Exploring Spark for Scalable Metagenomics Analysis: Spark Summit East talk by...
Exploring Spark for Scalable Metagenomics Analysis: Spark Summit East talk by...
 
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
 
Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...
 
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...
 
Imac10 Component
Imac10 ComponentImac10 Component
Imac10 Component
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomics
 
Overview of DuraMat software tool development
Overview of DuraMat software tool developmentOverview of DuraMat software tool development
Overview of DuraMat software tool development
 
SFDC Introduction to Apex
SFDC Introduction to ApexSFDC Introduction to Apex
SFDC Introduction to Apex
 
Bioinformatica t4-alignments
Bioinformatica t4-alignmentsBioinformatica t4-alignments
Bioinformatica t4-alignments
 
04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers
 
Metabolic network mapping for metabolomics
Metabolic network mapping for metabolomicsMetabolic network mapping for metabolomics
Metabolic network mapping for metabolomics
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackbox
 
Point GEODES
Point GEODESPoint GEODES
Point GEODES
 

Más de Cresset

Selectivity mining – multiple activities in Activity Miner
Selectivity mining – multiple activities in Activity MinerSelectivity mining – multiple activities in Activity Miner
Selectivity mining – multiple activities in Activity MinerCresset
 
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...Cresset
 
Can field based chemistry help us to predict protein-DNA binding sites?
Can field based chemistry help us to predict protein-DNA binding sites?Can field based chemistry help us to predict protein-DNA binding sites?
Can field based chemistry help us to predict protein-DNA binding sites?Cresset
 
Organic converstions: an aid in perspective
Organic converstions: an aid in perspectiveOrganic converstions: an aid in perspective
Organic converstions: an aid in perspectiveCresset
 
Identification of novel potential anti cancer agents using network pharmacolo...
Identification of novel potential anti cancer agents using network pharmacolo...Identification of novel potential anti cancer agents using network pharmacolo...
Identification of novel potential anti cancer agents using network pharmacolo...Cresset
 
Knowledge-based chemical fragment analysis in protein binding sites
Knowledge-based chemical fragment analysis in protein binding sitesKnowledge-based chemical fragment analysis in protein binding sites
Knowledge-based chemical fragment analysis in protein binding sitesCresset
 
Using waterswap to predict and understand binding affinities
Using waterswap to predict and understand binding affinitiesUsing waterswap to predict and understand binding affinities
Using waterswap to predict and understand binding affinitiesCresset
 
Smart drug re-profiling using computational chemistry tools novel biology and...
Smart drug re-profiling using computational chemistry tools novel biology and...Smart drug re-profiling using computational chemistry tools novel biology and...
Smart drug re-profiling using computational chemistry tools novel biology and...Cresset
 
New features in cresst products
New features in cresst productsNew features in cresst products
New features in cresst productsCresset
 
Comparing the electrostatic properties of protein active sites and other cres...
Comparing the electrostatic properties of protein active sites and other cres...Comparing the electrostatic properties of protein active sites and other cres...
Comparing the electrostatic properties of protein active sites and other cres...Cresset
 
Torch for medicinal chemists
Torch for medicinal chemistsTorch for medicinal chemists
Torch for medicinal chemistsCresset
 
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...Cresset
 
Smart drug re profiling using computational chemistry tools novel biology and...
Smart drug re profiling using computational chemistry tools novel biology and...Smart drug re profiling using computational chemistry tools novel biology and...
Smart drug re profiling using computational chemistry tools novel biology and...Cresset
 
Intelligent library design for protein families and beyond sp
Intelligent library design for protein families and beyond spIntelligent library design for protein families and beyond sp
Intelligent library design for protein families and beyond spCresset
 
Intelligent library design for protein families and beyond sp
Intelligent library design for protein families and beyond spIntelligent library design for protein families and beyond sp
Intelligent library design for protein families and beyond spCresset
 
Finding and using activity cliffs in 3D: Gaining more SAR information during ...
Finding and using activity cliffs in 3D: Gaining more SAR information during ...Finding and using activity cliffs in 3D: Gaining more SAR information during ...
Finding and using activity cliffs in 3D: Gaining more SAR information during ...Cresset
 
Tim Cheeseright, Assessing the Similarities of Compound collections using mol...
Tim Cheeseright, Assessing the Similarities of Compound collections using mol...Tim Cheeseright, Assessing the Similarities of Compound collections using mol...
Tim Cheeseright, Assessing the Similarities of Compound collections using mol...Cresset
 
Cresset: 25 year of Fields
Cresset: 25 year of FieldsCresset: 25 year of Fields
Cresset: 25 year of FieldsCresset
 
Mark Mackey, Cresset, 'Meet Molecular Architect, A new product for understand...
Mark Mackey, Cresset, 'Meet Molecular Architect, A new product for understand...Mark Mackey, Cresset, 'Meet Molecular Architect, A new product for understand...
Mark Mackey, Cresset, 'Meet Molecular Architect, A new product for understand...Cresset
 
Tim Cheeseright, Cresset, 'Introducing Fragment Growing in FieldStere and oth...
Tim Cheeseright, Cresset, 'Introducing Fragment Growing in FieldStere and oth...Tim Cheeseright, Cresset, 'Introducing Fragment Growing in FieldStere and oth...
Tim Cheeseright, Cresset, 'Introducing Fragment Growing in FieldStere and oth...Cresset
 

Más de Cresset (20)

Selectivity mining – multiple activities in Activity Miner
Selectivity mining – multiple activities in Activity MinerSelectivity mining – multiple activities in Activity Miner
Selectivity mining – multiple activities in Activity Miner
 
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...
 
Can field based chemistry help us to predict protein-DNA binding sites?
Can field based chemistry help us to predict protein-DNA binding sites?Can field based chemistry help us to predict protein-DNA binding sites?
Can field based chemistry help us to predict protein-DNA binding sites?
 
Organic converstions: an aid in perspective
Organic converstions: an aid in perspectiveOrganic converstions: an aid in perspective
Organic converstions: an aid in perspective
 
Identification of novel potential anti cancer agents using network pharmacolo...
Identification of novel potential anti cancer agents using network pharmacolo...Identification of novel potential anti cancer agents using network pharmacolo...
Identification of novel potential anti cancer agents using network pharmacolo...
 
Knowledge-based chemical fragment analysis in protein binding sites
Knowledge-based chemical fragment analysis in protein binding sitesKnowledge-based chemical fragment analysis in protein binding sites
Knowledge-based chemical fragment analysis in protein binding sites
 
Using waterswap to predict and understand binding affinities
Using waterswap to predict and understand binding affinitiesUsing waterswap to predict and understand binding affinities
Using waterswap to predict and understand binding affinities
 
Smart drug re-profiling using computational chemistry tools novel biology and...
Smart drug re-profiling using computational chemistry tools novel biology and...Smart drug re-profiling using computational chemistry tools novel biology and...
Smart drug re-profiling using computational chemistry tools novel biology and...
 
New features in cresst products
New features in cresst productsNew features in cresst products
New features in cresst products
 
Comparing the electrostatic properties of protein active sites and other cres...
Comparing the electrostatic properties of protein active sites and other cres...Comparing the electrostatic properties of protein active sites and other cres...
Comparing the electrostatic properties of protein active sites and other cres...
 
Torch for medicinal chemists
Torch for medicinal chemistsTorch for medicinal chemists
Torch for medicinal chemists
 
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...
Discovery and optimization of novel small molecule HIV-1 entry inhibitors usi...
 
Smart drug re profiling using computational chemistry tools novel biology and...
Smart drug re profiling using computational chemistry tools novel biology and...Smart drug re profiling using computational chemistry tools novel biology and...
Smart drug re profiling using computational chemistry tools novel biology and...
 
Intelligent library design for protein families and beyond sp
Intelligent library design for protein families and beyond spIntelligent library design for protein families and beyond sp
Intelligent library design for protein families and beyond sp
 
Intelligent library design for protein families and beyond sp
Intelligent library design for protein families and beyond spIntelligent library design for protein families and beyond sp
Intelligent library design for protein families and beyond sp
 
Finding and using activity cliffs in 3D: Gaining more SAR information during ...
Finding and using activity cliffs in 3D: Gaining more SAR information during ...Finding and using activity cliffs in 3D: Gaining more SAR information during ...
Finding and using activity cliffs in 3D: Gaining more SAR information during ...
 
Tim Cheeseright, Assessing the Similarities of Compound collections using mol...
Tim Cheeseright, Assessing the Similarities of Compound collections using mol...Tim Cheeseright, Assessing the Similarities of Compound collections using mol...
Tim Cheeseright, Assessing the Similarities of Compound collections using mol...
 
Cresset: 25 year of Fields
Cresset: 25 year of FieldsCresset: 25 year of Fields
Cresset: 25 year of Fields
 
Mark Mackey, Cresset, 'Meet Molecular Architect, A new product for understand...
Mark Mackey, Cresset, 'Meet Molecular Architect, A new product for understand...Mark Mackey, Cresset, 'Meet Molecular Architect, A new product for understand...
Mark Mackey, Cresset, 'Meet Molecular Architect, A new product for understand...
 
Tim Cheeseright, Cresset, 'Introducing Fragment Growing in FieldStere and oth...
Tim Cheeseright, Cresset, 'Introducing Fragment Growing in FieldStere and oth...Tim Cheeseright, Cresset, 'Introducing Fragment Growing in FieldStere and oth...
Tim Cheeseright, Cresset, 'Introducing Fragment Growing in FieldStere and oth...
 

Último

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...

David Evans, Eli Lilly, 'Field-Aligned Matched Pairs'

  • 1. David Evans and George Papadatos, Lilly Research Centre, Erl Wood Manor, Windlesham, UK. 22nd September 2011
  • 2. • Discover new chemotypes • Multiobjective space • Isosteres in activity • Improvements in properties • Want to use multiple tools in same environment • But understand what works when
  • 3. Open Source Workflow tool – main client is free • But support is available and can integrate commercial vendors + in-house code as nodes • Have released many Erl Wood nodes to KNIME community site • http://tech.knime.org/community/erlwood
  • 4. Nodes: FieldAlign, Xedmin, Xedex, FieldView. Xedmin: • XED minimization • 2D -> 3D. Xedex: • Conformational analysis. FieldView: • Launches FieldView • View field points + energies + other data. All nodes pass SDF
  • 5. FieldAlign • Flexible alignment of query molecules onto template
  • 6. WHY ? Process is more than just the database search Company Confidential Copyright © 2008 Eli Lilly and Company
  • 7. Don’t want to load all databases onto all users’ PCs + secure intranet! Components: command-line search; SOAP Web Service (Apache Tomcat); node. Platform-independent communication
  • 8. Non-proprietary structure • Read in pre-built hypothesis (MOE, Phase) • Or sketch from template molecule • Jmol based visualizer • Can also annotate and filter hits, aids manual inspection
  • 9. How well do automated pharmacophore methods do compared to 2D methods? Maximum Unbiased Validation (MUV) dataset • 17 targets, total 30 ligands and 15000 decoys per target, source: PubChem bioactivity data. • Wide-ranging targets: hormone receptors, kinases, proteases, GPCRs plus others (e.g. HSP90, HIV RT). • Unbiased for chemical analogues as MUV ligands pre-clustered with 2D fingerprint • 1.16 compounds per scaffold class. MUV: J. Chem. Inf. Model., 2009, 49 (2), 169-184.
  • 10.
  • 11. • Have looked at whole molecule similarity • Is there more data if we find fragments which maintain activity? • Matched Molecular pair analysis (MMP) • Fragments compounds and finds pairs where only one fragment differs
  • 12. The mining and statistical analysis of transformations and their impact on properties of interest (e.g. solubility or activity). Table columns: left molecule, right molecule, transformation, ΔSolubility (mg/ml). Transformations shown: H → F, Br → OCH3. ΔSolubility values shown: -0.8, +1.2, +2.4
  • 13. (*in an automated and unsupervised way) It used to be a slow and computationally expensive process: • Pair-wise maximum common substructure extraction – O(N^2). Recently a much more efficient algorithm was published: 1) Cleave all acyclic single bonds, one by one. 2) Index all the fragments (cf. book index). 3) Enumerate the values for each key. (Figure: index entries for Mol A and Mol B.) Hussain and Rea (2010). J. Chem. Inf. and Model., 50 (3), 339-348. Wagener and Lommerse (2006). J. Chem. Inf. and Model., 46 (2), 677-685.
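The index-and-enumerate idea in steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration and not the Erl Wood node or the published implementation: step 1 (bond cleavage) is assumed to have been done already, and the molecule IDs, the fragment strings and the function names `index_fragments` and `enumerate_pairs` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def index_fragments(single_cuts):
    """Step 2: index every cut by its constant part (the 'key' or context)."""
    index = defaultdict(list)
    for mol_id, context, fragment in single_cuts:
        index[context].append((mol_id, fragment))
    return index


def enumerate_pairs(index):
    """Step 3: any two values under the same key form a matched pair."""
    pairs = []
    for context, entries in index.items():
        for (id_a, frag_a), (id_b, frag_b) in combinations(entries, 2):
            # Skip pairs from the same molecule or with identical fragments.
            if id_a != id_b and frag_a != frag_b:
                pairs.append((id_a, id_b, f"{frag_a}>>{frag_b}", context))
    return pairs


# Hypothetical hand-written output of step 1: (molecule ID, context, fragment).
single_cuts = [
    ("A", "[*]c1ccccc1", "[*]F"),
    ("B", "[*]c1ccccc1", "[*]Br"),
    ("C", "[*]C1CCCCC1", "[*]F"),
]
pairs = enumerate_pairs(index_fragments(single_cuts))
print(pairs)
```

Because molecules are only compared when they share a key, the all-against-all comparison of the pair-wise maximum common substructure approach is avoided.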
  • 14. In: MolRegnos (IDs), structures (in RDKit format) and property values Out: Matched pairs (left and right molecule, IDs, transformation, property values, ΔP, context, transformation atom count) Available as an Erl Wood community contribution node
  • 15. Find isosteres in chEMBL. chEMBL – Database of published medicinal chemistry activity data – Using chEMBL_10, total >1,000,000 compounds. Use here just human protein kinase inhibitors. Quality assurance for chEMBL data (SQL statement): • Med. chem. friendly compounds, parent structure, not downgraded, confidence score = 9, exact IC50 or Ki values only (converted to pIC50/pKi) → ~14K data points • Compare biological values coming from the same assay ID only. Aggregate transformations; calculate and bin ΔpIC50s in 3 bins: • Good – Bad – Neutral (depending on a cut-off c = 0.4 log units)
  • 16. • Each transformation has a neutral count • Absolute value or percentage: NeutralCount%
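The binning and the NeutralCount% described on slides 15 and 16 might look roughly like the sketch below. The slides give only the cut-off c = 0.4 log units; the assumption here is that a change above +c counts as Good, below -c as Bad, and anything in between as Neutral. The function names and the example records are hypothetical.

```python
from collections import Counter, defaultdict


def bin_delta(delta, cutoff=0.4):
    """Assumed rule: > +cutoff is good, < -cutoff is bad, otherwise neutral."""
    if delta > cutoff:
        return "good"
    if delta < -cutoff:
        return "bad"
    return "neutral"


def neutral_count_percent(records, cutoff=0.4):
    """Aggregate (transformation, delta) records into NeutralCount% per transformation."""
    counts = defaultdict(Counter)
    for transformation, delta in records:
        counts[transformation][bin_delta(delta, cutoff)] += 1
    return {
        t: 100.0 * c["neutral"] / sum(c.values())
        for t, c in counts.items()
    }


# Hypothetical ΔpIC50 values, each taken within a single assay.
records = [
    ("[*]F>>[*]Cl", 0.1),
    ("[*]F>>[*]Cl", -0.2),
    ("[*]F>>[*]Cl", 0.9),
    ("[*]C>>[*]O", -1.0),
]
result = neutral_count_percent(records)
print(result)
```

A transformation with a high NeutralCount% rarely changes activity by more than the cut-off, which is the sense in which it is treated as isosteric here.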
  • 17. chEMBL workflow outputs isosteric fragments How similar are isosteres in 2D fingerprint space? In field space? Could fields help us find unexpected isosteres?
  • 18. • 1802 fragment pairs from chEMBL_10 kinase data set • 481 with no rotatable bonds left or right • Simplifies conformational analysis • For each fragment pair 1. Swap attachment points for adamantyl 2. FieldAlign to get field similarity (Use adamantyl to constrain overlay) 3. RDKit fingerprint similarity – topological Daylight-esque 4. Correct similarities for adamantyl • Are there isosteric pairs with high field similarity but low RDKit similarity?
  • 19. Plot of Field Sim against RDKit Sim; point size by Neutral Count % (larger = more isosteric). Annotations: pairs with high field similarity but low 2D similarity; pairs with high field and 2D similarity
  • 20. Plot of Field Sim against RDKit Sim; point size by Neutral Count %. Only those with >60% isosteric examples. Highlighted: Thiophene -> Phenol
  • 21. Plot of Field Sim against RDKit Sim; point size by Neutral Count %. Only those with >60% isosteric examples. Highlighted: Imidazole -> Morpholine?
  • 22. Plot of Field Sim against RDKit Sim; point size by Neutral Count %. Only those with >60% isosteric examples. Highlighted: some small fragments
  • 23. Non-proprietary structure (from PDB): WEE1 kinase, PDB 2I06. Labels: Buried; Solvent-exposed
  • 24. Plot of Field Sim against RDKit Sim; point size by Neutral Count %. Only those with >60% isosteric examples. Highlighted: Me-tetrazole -> oxadiazole
  • 25. Plot of Field Sim against RDKit Sim; point size by Neutral Count %. Only those with >60% isosteric examples. Highlighted: Thiophene -> phenol
  • 27. • 6299 data points from thermodynamic solubility assay • 423 single-point transformations • 215 no-rotatable point transformations • Aggregate transformations; calculate and bin ΔlogS in 3 bins • Good – Bad – Neutral (c = 0.3 log units) • Are there transformations which increase solubility with low field similarity but high RDKit similarity?
  • 28. Plot of Field Sim against RDKit Sim; point size by Good Count %. Only those with >60% boosting examples. Highlighted: Ring contraction + twist?
  • 29. Plot of Field Sim against RDKit Sim; point size by Good Count %. Only those with >60% boosting examples. Highlighted: Big boost from morpholine
  • 30. • Can mine chEMBL data for non-obvious isosteres • Will other data sets find more? • Would like to improve workflow to make isostere data set for 3D similarity comparison • Improve fragmentation/conformer/alignment handling? • Need to include whole molecule? • Need 3D binding site data as well to confirm isosterism? • KNIME platform developing • Virtual screening and evaluation environment • Rapid experimentation with varied tools • http://tech.knime.org/community/erlwood
  • 31. George Papadatos, Juliette Pradon, Hina Patel, Nikolas Fechner, David Thorner, Michael Bodkin. KNIME, chEMBL + Cresset!
  • 32. ROC curves for retrieval of >66% isosteric groups. Field similarity performs better than RDKit, but AUC = 0.68. Workflow not optimized for this purpose
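For readers who want to see what the AUC on slide 32 measures, here is a minimal sketch, assuming each fragment pair has a similarity score and a binary label (for example, 1 if the pair belongs to a >66% isosteric group). It computes the area under the ROC curve as the fraction of positive/negative pairs ranked correctly, with ties counted as one half. The function name `roc_auc` and the example scores are hypothetical and are not data from the talk.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical similarity scores and isostere labels for four fragment pairs.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
auc = roc_auc(scores, labels)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect retrieval, so the reported 0.68 indicates a modest enrichment.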