Online Resources to Support Open Drug Discovery Systems

This presentation was given at the Opal Events meeting "Drug Discovery Partnerships: Filling the Pipeline". I spoke in a session with Jean-Claude Bradley on "Pre-competitive Collaboration: Sharing Data to Increase Predictability". The presentation discusses some of the work we are doing on Open PHACTS. My thanks especially to Carole Goble, Lee Harland and Sean Ekins for their comments.

Published in: Technology, Business

Online Resources to Support Open Drug Discovery Systems

  1. Online Resources to Support Open Drug Discovery Systems
     Antony Williams, 3rd Annual Drug Discovery Partnership: Filling the Pipeline, October 2011
  2. Open Drug Discovery
     • Pharma companies spend >$50 billion annually on R&D
     • How much historical data/knowledge/information is in the public domain? And where is it?
     • How much generated data is truly competitive?
     • Pre-competitive and public domain data could deliver high value to drug discovery
       • Data mining
       • Model-building
       • Integrating into in-house and online systems
  3. Pharma Information Tombs
     • Internal and external content
     • Built to meet primary use-case
     • Tailored indexes and GUIs
     • Internal unique language & metadata
     • Poor interoperability/integration
     • PowerPoint, Documents, Excel
     • Many suppliers of systems and content in a single workflow
     Diagram labels: Literature, Patents, News, Pipeline, SAR, CSRs, Safety, In vivo, etc.
  4. What could create change?
     • Harvard Business Review (2010)
     • “One change would make a substantial difference [to drug R&D]: the creation of agreed-upon standards for digitally representing drug assets.”
  5. It is so difficult to navigate…
     • What’s the structure?
     • Are they in our file?
     • What’s similar?
     • What’s the target?
     • Pharmacology data?
     • Known pathways?
     • Working on now?
     • Connections to disease?
     • Expressed in right cell type?
     • Competitors?
     • IP?
  6. Where is chemistry online?
     • Encyclopedic articles (Wikipedia)
     • Chemical vendor databases
     • Metabolic pathway databases
     • Property databases
     • Patents with chemical structures
     • Drug discovery data
     • Scientific publications
     • Compound aggregators
     • Blogs/wikis and Open Notebook Science
  7. PubChem
  8. ChEMBL
  9. ChemSpider
  10. SciDBs Wiki
  12. Pharma are accessing, processing, storing & re-processing Public Domain Drug Discovery Data
  13. New trend: Set Data Free on the Web
  14. Open Algorithms, Descriptors and Closed Data – Can We Unlock It?
  15. The Innovative Medicines Initiative
     • EC-funded public-private partnership for pharmaceutical research
     • Focus on key problems
       • Efficacy
       • Safety
       • Education & Training
       • Knowledge Management
  16. Open PHACTS Project
     • Develop a set of robust standards…
     • Implement the standards in a semantic integration hub
     • Deliver services to support drug discovery programs in pharma and public domain
     • 22 partners, 8 pharmaceutical companies, 3 biotechs
     • 36-month project
     Guiding principle is open access, open usage, open source (key to standards adoption)
  17. Open PHACTS Project Partners
  18. Example research questions
     • Give all compounds with IC50 < xxx for target Y in species W and Z plus assay data
     • What substructures are associated with readout X (target, pathway, disease, …)
     • Give all experimental and clinical data for compound X
     • Give all targets for compound X or a compound with a similarity > y%
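
The first research question above can be read as a filter over compound-target activity records. The following is only a minimal sketch of that reading: the `Activity` record, its field names, the `compounds_below_ic50` helper and all data values are hypothetical and are not taken from Open PHACTS or any of its data sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One hypothetical compound-target measurement (field names are assumptions)."""
    compound: str
    target: str
    species: str
    ic50_nm: float   # IC50 in nanomolar
    assay_id: str


def compounds_below_ic50(activities, target, species, max_ic50_nm):
    """Return the records for `target` in any of `species` whose IC50 is below the cutoff."""
    wanted = set(species)
    return [
        a for a in activities
        if a.target == target and a.species in wanted and a.ic50_nm < max_ic50_nm
    ]


# Made-up records, purely for illustration.
ACTIVITIES = [
    Activity("cmpd-1", "target-Y", "human", 12.0, "assay-A"),
    Activity("cmpd-2", "target-Y", "rat", 850.0, "assay-B"),
    Activity("cmpd-3", "target-Y", "mouse", 5.0, "assay-C"),
    Activity("cmpd-4", "target-Q", "human", 3.0, "assay-D"),
]

# "All compounds with IC50 < 100 nM for target Y in human and rat, plus assay data"
hits = compounds_below_ic50(ACTIVITIES, "target-Y", ["human", "rat"], 100.0)
for hit in hits:
    print(hit.compound, hit.ic50_nm, hit.assay_id)
```

A real integration hub would have to answer the same question across several linked sources, which is where shared identifiers and standards become the hard part; the filter itself is the easy part.
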
  19. Prioritised Research Questions Analysis
     • Prevalent concepts
       • Compound
       • Bioassay
       • Target
       • Pathway
       • Disease
     • Prevalent data relationships
       • Compound – target
       • Compound – bioassay
       • Bioassay – target
       • Compound – target – mode of action
       • Target – target classification
       • Target – pathway and disease
     • Required cheminformatics functionality
       • Chemical substructure searching
       • Chemical similarity searching
     • Required bioinformatics functionality
       • Sequence and similarity searching
       • Bioprofile similarity searching
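
Chemical similarity searching is commonly done by comparing structural fingerprints with the Tanimoto coefficient. The slides do not say which measure Open PHACTS uses, so the sketch below is an assumption for illustration: fingerprints are modelled as plain Python sets of feature identifiers, and the library contents and the `similarity_search` helper are made up.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: treat as not similar
    return len(a & b) / len(union)


def similarity_search(query_fp, library, threshold):
    """Return (name, score) pairs scoring at or above `threshold`, best first."""
    scored = ((name, tanimoto(query_fp, fp)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints: each integer stands for one structural feature.
LIBRARY = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 5},
    "C": {7, 8},
}

results = similarity_search({1, 2, 3}, LIBRARY, threshold=0.5)
print(results)
```

The "similarity > y%" in the example research questions would correspond to the `threshold` argument here, expressed as a fraction between 0 and 1.
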
  20. Selection of prioritised data sources
     • Chemistry
       • ChEMBL
       • DrugBank
       • ChEBI
       • PubChem
       • ChemSpider
       • Human Metabolome DB
       • Wombat (commercial)
     • Ontologies
       • AmiGO (The Gene Ontology)
       • KEGG (Kyoto Encyclopedia of Genes and Genomes)
       • OBI (The Ontology for Biomedical Investigations)
       • Bioassay Ontology; EFO (Experimental Factor Ontology)
     • Biology
       • EntrezGene
       • HGNC
       • UniProt
       • InterPro
       • SCOP
       • WikiPathways
       • OMIM
       • IUPHAR
  22. Linking “Flavors” of Chemistry
  23. Improve Linked Data Access…
     • Coordinate effort to clean up chemistry-related data
     • Open tools – require good validation studies
     • Support scientists making data open
     • Support companies/groups promoting software for data sharing
     • Engage community to help create what they want
  24. Openness and Quality Issues
     • Williams and Ekins, DDT, 16: 747-750 (2011)
     • Science Translational Medicine 2011
  25. Chemistry Databases on the Internet
     • Some public databases are “trusted” as primary sources
     • Trust is granted without investigation or understanding of the content
     • What do we know about some of the online resources?
  26. PHYSPROP Database
     • The freely downloadable database under the EPI Suite prediction software
     • Very basic filters suggest data quality issues
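
The slide does not say which filters were applied to PHYSPROP. As a hedged illustration of what "very basic filters" could look like, the sketch below flags records with a missing structure, a non-positive molecular weight, or a repeated name. The record layout (`name`, `smiles`, `mol_weight`), the `basic_quality_flags` helper and the sample data are all assumptions, not the PHYSPROP schema or the author's method.

```python
def basic_quality_flags(records):
    """Map record index -> list of problems found by three simple checks."""
    flags = {}
    first_seen = {}  # normalized name -> index of the first record using it
    for idx, rec in enumerate(records):
        problems = []
        if not rec.get("smiles"):
            problems.append("missing structure")
        weight = rec.get("mol_weight")
        if weight is None or weight <= 0:
            problems.append("invalid molecular weight")
        name = (rec.get("name") or "").strip().lower()
        if name:
            if name in first_seen:
                problems.append("duplicate name")
            else:
                first_seen[name] = idx
        if problems:
            flags[idx] = problems
    return flags


# Made-up records for illustration only.
RECORDS = [
    {"name": "Compound A", "smiles": "CCO", "mol_weight": 46.07},
    {"name": "Compound B", "smiles": "", "mol_weight": 58.08},
    {"name": "compound a", "smiles": "CCO", "mol_weight": 46.07},
    {"name": "Compound C", "smiles": "C", "mol_weight": -1.0},
]

quality_flags = basic_quality_flags(RECORDS)
for index, problems in quality_flags.items():
    print(index, problems)
```

Checks this simple need no chemistry toolkit, which is the point of the slide: problems that surface at this level suggest deeper issues would appear under stricter validation.
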
  27. The Stereochemistry Challenge: 12,500 chemicals with “missed” stereo
  28. Searches on ChemSpider
     • Most searches are text-based: people searching for information about known chemicals
     • Creating accurate name-structure dictionaries is critical
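
A name-structure dictionary can be sketched as a lookup from normalized names to structure strings, with a record of any name that is claimed by more than one structure. This is an illustrative sketch only, not how ChemSpider builds its dictionaries: the `normalize_name` and `build_name_index` helpers and the normalization rule (lowercase, collapsed whitespace) are assumptions, and the last sample entry is deliberately wrong to show a conflict being caught.

```python
def normalize_name(name):
    """Lowercase and collapse whitespace so trivial variants share one key."""
    return " ".join(name.lower().split())


def build_name_index(entries):
    """Build name -> SMILES lookup; collect names mapped to conflicting structures."""
    index = {}
    conflicts = {}
    for name, smiles in entries:
        key = normalize_name(name)
        if key in index and index[key] != smiles:
            # Keep the first mapping, but record every structure claimed for this name.
            conflicts.setdefault(key, {index[key]}).add(smiles)
        else:
            index[key] = smiles
    return index, conflicts


ENTRIES = [
    ("Acetone", "CC(C)=O"),
    ("propan-2-one", "CC(C)=O"),
    ("  ACETONE ", "CC(C)=O"),
    ("Ethanol", "CCO"),
    ("ethanol", "CC(C)=O"),  # deliberately wrong entry, to trigger a conflict
]

name_index, name_conflicts = build_name_index(ENTRIES)
print(name_index.get(normalize_name("acetone")))
print(name_conflicts)
```

Because most searches are text-based, a single bad name-structure pair sends every user of that name to the wrong compound, so the conflict list is as important as the index itself.
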
  29. NIST Webbook
  30. PubChem
  31. NPC Browser
  33. Cyclic Data Sharing
     • Data-sharing between open databases is cyclic
  34. Synonyms on PubChem
     • 1,3-DICHLORO-PROPAN-2-ONE
     • (2R,3R)-Butanediol bis(methanesulfonate)
     • Ethyl-1-propenyl ether, mixture of cis and trans
     • PSS-[2-[(Chloromethyl)phenyl]ethyl]-Heptaisobutyl substituted
     • 1-Chlorobenzylethyl-3,5,7,9,11,13,15-heptaisobutylpentacyclo [,9).1(5,15).1(7,13)]octasiloxane
  35. Synonyms on PubChem
  36. Data Proliferation
  43. ChemSpider…
     • >26 million unique molecules
     • Links together >400 internet resources
     • Linking patents, publications, chemical vendors and online chemical compound databases
     • Crowdsourced depositions and curations
     • A focus on data quality – cleaning data on the web
     • The structure database under Open PHACTS
  44. Acknowledgments
     • Sean Ekins – Collaborations in Chemistry
     • RSC|ChemSpider team
     • Open PHACTS consortium – especially Lee Harland and Carole Goble
     • Data depositors and curators
     • Software providers – ACD/Labs, OpenEye, GGA Software Inc, Open Source Cheminformatics
  45. Thank you
     Email: Twitter: ChemConnector Blog: Personal Blog: SLIDES: