Fragment Database Analysis Using
  Molecular Shape Fingerprints
     John D. MacCuish, Norah E. MacCuish,
     Michael Hawrylycz, and Mitch Chapman

            ACS San Francisco 2010
                     CINF


           john.maccuish@mesaac.com
Outline
• Shape Fingerprints: quasi-Monte
 Carlo integration approach

• 2D substructure commonality
• 3D shape and pharmacophore
 features analogue

• Fragment Database Example
• Future Work
quasi-Monte Carlo Integration (QMC)*
  • Approximate a 3D volume, e.g., a CPK
      van der Waals volume
  • In practice, quasi-randomly generated
      points give the best error convergence in
      low dimensions.
  • Align volumes using binomial sampling
      (which yields the shape fingerprints) and SVD
  • Fast and accurate
*"Quasi-Monte Carlo integration for the fast and effective generation of molecular shape
fingerprints", ACS San Francisco, Wednesday 2:30, COMP 346
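As a rough illustration of the volume-approximation bullet, the sketch below estimates the volume of a union of atom spheres with a Halton quasi-random sequence. The choice of the Halton sequence, the bounding-box sampling, and the names `halton` and `qmc_volume` are assumptions made for this example; the slides do not say which low-discrepancy sequence or sampling region the authors use.

```python
import math


def halton(index, base):
    """Return element `index` (1-based) of the van der Corput sequence in `base`."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def qmc_volume(atoms, n_points=20000):
    """Estimate the volume of a union of spheres from quasi-random points.

    atoms: list of (x, y, z, radius) tuples, all in the same length unit.
    """
    # Axis-aligned bounding box that encloses every sphere.
    lo = [min(a[i] - a[3] for a in atoms) for i in range(3)]
    hi = [max(a[i] + a[3] for a in atoms) for i in range(3)]
    box_volume = math.prod(h - l for l, h in zip(lo, hi))

    inside = 0
    for k in range(1, n_points + 1):
        # Bases 2, 3 and 5 give one 3D Halton point in the unit cube,
        # which is then scaled into the bounding box.
        p = [lo[i] + (hi[i] - lo[i]) * halton(k, b)
             for i, b in enumerate((2, 3, 5))]
        if any((p[0] - x) ** 2 + (p[1] - y) ** 2 + (p[2] - z) ** 2 <= r * r
               for x, y, z, r in atoms):
            inside += 1
    # Fraction of points inside the molecule times the box volume.
    return box_volume * inside / n_points
```

For a single sphere the estimate can be checked against the closed form 4/3·π·r³, which is a convenient sanity test before applying the function to a full set of atom coordinates and van der Waals radii.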
Binomial Sampling
Four subfingerprints
                      Find maximum
                      alignment to other
                      fingerprints of
                      conformations
                      similarly sampled

                      Maximum Tanimoto
                      for best alignment
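The "maximum Tanimoto for best alignment" idea can be sketched as follows. This assumes each conformer carries a small list of subfingerprints (four on the slide) stored as Python integers used as bit sets, and that the best alignment is simply the pair of subfingerprints with the highest Tanimoto coefficient. How the subfingerprints are generated is not covered here, and `tanimoto` and `best_alignment` are illustrative names.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two bit fingerprints stored as Python ints."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union


def best_alignment(subfps_a, subfps_b):
    """Return (max Tanimoto, index in a, index in b) over all subfingerprint pairs."""
    return max(
        (tanimoto(fa, fb), i, j)
        for i, fa in enumerate(subfps_a)
        for j, fb in enumerate(subfps_b)
    )


# Toy 4-bit subfingerprints for two conformers.
score, i, j = best_alignment([0b1100, 0b0011], [0b0011, 0b0111])
print(score, i, j)  # 1.0 1 0
```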
2D Substructure Commonality
• Exploratory visualization tool on a series or a 2D
    cluster of structures.
• Akin to loosening the constraint on a maximal
    common substructure (MCS).
• Path-based 2D fingerprint form: “Stigmata”*
• Key-based 2D fingerprint form: “ChemTattoo”
*"Stigmata: An Algorithm To Determine Structural Commonalities in Diverse Datasets",
Shemetulskis et al., JCIM, 36(4), 1996, pp. 862-871
2D Substructure Commonality
                      2D 768 Key-based Fingerprint




"Substructure commonality analysis and visualization with new key-based binary fingerprinter", Norah MacCuish, ACS
Chicago, CINF Session, March 24-28, 2007.
2D ChemTattoo
•   Generate the modal fingerprint from the input data set

    •   Modal fingerprint is the same length as the input data
        set 2D fingerprints

    •   A bit is set in the Modal if that ‘key’ occurs in at
        least the threshold fraction of input molecules

    •   Compare each input 2D fingerprint against the Modal
        Fingerprint

    •   Calculate atom score (counts) for each atom for each
        input structure that reflects the number of modal keys
        that a given atom participates in. Color code the
        scores and depict the 2D structures.
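The modal-fingerprint steps above can be sketched in a few lines. This assumes fingerprints are equal-length lists of 0/1 values and that the threshold is a fraction of the input molecules. The `key_atoms` mapping (key index to matching atom indices for one molecule) is a hypothetical stand-in for whatever the key-based fingerprinter reports, and the function names are illustrative.

```python
def modal_fingerprint(fingerprints, threshold):
    """Set bit k when at least `threshold` (a fraction in (0, 1]) of the
    input fingerprints have bit k set."""
    n_bits = len(fingerprints[0])
    needed = threshold * len(fingerprints)
    return [
        1 if sum(fp[k] for fp in fingerprints) >= needed else 0
        for k in range(n_bits)
    ]


def atom_scores(n_atoms, key_atoms, modal):
    """Count, for each atom of one molecule, the modal keys it participates in.

    key_atoms: dict mapping key index -> atom indices matched by that key
    (hypothetical fingerprinter output, assumed for this sketch).
    """
    scores = [0] * n_atoms
    for key, atoms in key_atoms.items():
        if modal[key]:
            for atom in set(atoms):
                scores[atom] += 1
    return scores


fps = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 0, 1]]
print(modal_fingerprint(fps, 1.0))  # [0, 0, 0, 0]
print(modal_fingerprint(fps, 0.5))  # [1, 1, 0, 0]
```

The per-atom counts returned by `atom_scores` are what would then be color-coded on the 2D depiction.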
2D ChemTattoo
                        Four 2D fingerprints
Threshold set to 1.0 -- no bit in common among all 4 fingerprints




Threshold set to 0.5 -- some bits in common among at least 2 of the 4 fingerprints
2D ChemTattoo



 PUT CHEMVC EX OUTPUT HERE
Pharmacophore Extension
• Adding a pharmacophore extension to quasi-Monte
  Carlo generated shape fingerprints.
  • Substructure matching of pharmacophore
    features with user-defined SMARTS
  • Create a ChemTattoo analogue in 3D: map
    the features onto the shape fingerprint.
  • Apply to 3D shape clusters, similarity
    searching, etc.
ChemTattoo 3D
•   Allow the pharmacophore features to include standard
    definitions (HBond donor, HBond acceptor, etc.) or
    a user-defined set of SMARTS-based definitions
•   Perform shape fingerprint clustering and analyze the
    resulting clusters to perceive patterns (modal) in the
    pharmacophore feature space
•   Use a known target as the modal and query a database
    to find similar shape (based on shape fingerprints) and
    align the shape hits based on the pharmacophore
    features of the target.
•   Apply these ideas to fragments to find potential
    bioisosteres for an active fragment found from a
    fragment screen
Fragment Database
• ZINC Fragment database, ~500K
  compounds

• Cluster in 2D using 768 MACCS Keys
  Fingerprints

• Select the Representatives to create a 2D
  diverse set

• Generate multiple conformations (< 5 kcal/mol)
  and Shape Fingerprints for all conformers

• Shape Fragment Database contains: 3,265
  structures, 24,029 conformers
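The slide does not name the 2D clustering algorithm, so the sketch below uses a simple leader (sphere-exclusion style) pass purely as a stand-in to show how cluster representatives can yield a 2D-diverse subset. Fingerprints are assumed to be sets of on-bit indices; the 0.6 threshold in the usage lines and the function names are illustrative assumptions.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def leader_representatives(fingerprints, threshold):
    """Single-pass leader clustering (a stand-in, not the authors' method).

    Each fingerprint joins the first representative it is at least
    `threshold` similar to; otherwise it becomes a new representative.
    Returns (representative indices, {representative: member indices}).
    """
    reps = []
    members = {}
    for i, fp in enumerate(fingerprints):
        for r in reps:
            if tanimoto(fp, fingerprints[r]) >= threshold:
                members[r].append(i)
                break
        else:
            # No existing representative is similar enough.
            reps.append(i)
            members[i] = [i]
    return reps, members


fps = [{1, 2, 3}, {1, 2, 3, 4}, {7, 8}, {7, 8, 9}, {20}]
reps, members = leader_representatives(fps, 0.6)
print(reps)  # [0, 2, 4]
```

Only the representatives would then go on to conformer generation and shape fingerprinting, which keeps the 3D database small relative to the full fragment set.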
Bacterial 23S rRNA Fragment Screen
1. Generate conformers for the active fragment
2. Search Fragment Database w/ Shape FP cutoff
   0.6: 324 conformers share the same shape
3. Identify the Pharmacophore Features in Target
   (features can be user defined)
4. Score the Fragment 3D database based on the
   number of modal pharmacophore features for
   fragments within the shape cutoff (require at
   least two pharmacophore matches)
5. Display the highlighted pharmacophore features
   in the target with a surface overlay
6. Align the hits via shape -- slider bars display
   shape matches that also have matching features
   within the slider bar distance
[Figure labels: Kd > 100µM; MS]
      Nature Reviews Drug Discovery v.3 8/04, p. 669
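Steps 2, 4 and 6 of the workflow can be sketched as a filter-and-score pass. This assumes each database conformer already has a shape Tanimoto score against the target and a list of pharmacophore features expressed in the target's aligned coordinate frame. The feature tuples, the 1.5 distance default, and the function names are assumptions for illustration; only the 0.6 shape cutoff and the two-match minimum come from the slide.

```python
import math


def matched_features(target_feats, hit_feats, max_dist):
    """Count target features with a same-type hit feature within max_dist.

    Features are (type, (x, y, z)) tuples in a shared, already aligned frame.
    """
    count = 0
    for t_type, t_xyz in target_feats:
        if any(h_type == t_type and math.dist(t_xyz, h_xyz) <= max_dist
               for h_type, h_xyz in hit_feats):
            count += 1
    return count


def score_hits(target_feats, hits, shape_cutoff=0.6, min_matches=2, max_dist=1.5):
    """hits: list of (name, shape_tanimoto, features).

    Keeps hits at or above the shape cutoff with at least `min_matches`
    matched features; returns [(name, n_matches)], best first.
    max_dist is an assumed value standing in for the slider bar distance.
    """
    scored = []
    for name, shape_sim, feats in hits:
        if shape_sim < shape_cutoff:
            continue
        n = matched_features(target_feats, feats, max_dist)
        if n >= min_matches:
            scored.append((name, n))
    return sorted(scored, key=lambda item: -item[1])
```

Varying `max_dist` plays the role of the slider bar in step 6: a tighter distance keeps only hits whose features sit close to the corresponding target features.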
Bacterial 23S rRNA



      • Pharmacophore features of Active fragment
      • Shape of Active fragment w/ features
        highlighted
      • Hitlist of one conformation of Active
        fragment with highest scoring matches
      • Shapes aligned for the best shape
        fingerprint score
Bacterial 23S rRNA
             Showing hitlist...

             The user moves the slider
             bar on the ‘red’ HBond
             Acceptor to find the
             shape-matched
             fragment that also has
             an HBond Acceptor
             ‘close’ in distance
             space to the target.
Future Work
•   Adding in Thresholding

•   More experiments in industrial
    settings

•   Expanding the pharmacophore
    default feature definitions

•   Better visualization tools
Acknowledgments
•   Open Source
    • JMol
    • Balloon
    • OpenBabel
    • BKChem
•   Databases
    • ZINC
    • PDB
•   Mesa Software
    • ChemTattoo 2D, 3D
    • Shape Fingerprint Module
    • Fingerprint Module 2D
    • Parallel Grouping Module
    • WebflowDD

              john.maccuish@mesaac.com
