The collaborative project PharmaSea brings European researchers to some of the deepest, coldest and hottest places on the planet. Scientists from the UK, Belgium, Norway, Spain, Ireland, Germany, Italy, Switzerland and Denmark are working together to collect and screen samples of mud and sediment from huge, previously untapped oceanic trenches. The large-scale, four-year project is backed by almost 10 million euros of funding and brings together 24 partners from 13 countries across industry, academia and non-profit organisations. The PharmaSea project focuses on biodiscovery research and on the development and commercialisation of new bioactive compounds from marine organisms, including deep-sea sponges and bacteria, to evaluate their potential as novel drug leads or as ingredients for nutrition or cosmetic applications. The Royal Society of Chemistry is responsible for developing a number of capabilities to support the PharmaSea project, including a chemical registration system for new compounds, dereplication technologies to assist in the identification of new compounds, and search techniques for mass spectrometrists within the project. This presentation will provide an overview of the project and of our progress in contributing chemical information technologies to support the effort.
Applying Royal Society of Chemistry cheminformatics skills to support the PharmaSea project
1. Applying Royal Society of Chemistry
Cheminformatics Skills to Support
the PharmaSea Project
Antony Williams, Alexey Pshenichnov, Valery
Tkachenko, Ken Karapetyan, David Sharpe
ACS San Francisco
August 2014
15. Focus on Marine Natural Products
• RSC cheminformatics support to include:
• Deliver “PharmaSea website”
• Provide access to natural products subset
• Develop “dereplication techniques”
• Searching NMR features against database
• Develop advanced searches for MS data
• Host Open Data from the PharmaSea project
and make available to the community
17. The PharmaSea Website
• RSC is open-sourcing a chemical registry
system as a result of Open PHACTS
• Chemical Registry system used to underpin
the PharmaSea website – behind login
• Will be enhanced with data deposition
capabilities and “dereplication”
28. Extending PharmaSea Site
• PharmaSea website will be extended
• Spectral data handling: Support Dereplication
29. Identifying novel compounds
• Compounds are collected from the ocean
• Extraction via chromatography
• Analytical sciences including:
• UV-Vis data (Lambda-max)
• Mass spectrometry (formula/mass)
• NMR spectroscopy (HNMR/2D)
• Utilized for dereplication…
31. Identifying novel compounds
• 4 Me singlets
• 4 Me doublets
• 1 OMe singlet
• Aromatic protons
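Feature counts like these can serve as a search key against a database of known compounds. The following is a minimal sketch in Python, assuming a hypothetical in-memory table of precomputed 1H NMR feature counts. The compound names, feature labels and counts are invented for illustration and are not taken from MarinLit or the PharmaSea site.

```python
from collections import Counter

# Hypothetical records: compound name -> counts of 1H NMR features.
# All names and numbers are placeholders for illustration only.
KNOWN_COMPOUNDS = {
    "compound_A": Counter({"Me_singlet": 4, "Me_doublet": 4, "OMe_singlet": 1}),
    "compound_B": Counter({"Me_singlet": 2, "Me_doublet": 1}),
    "compound_C": Counter({"Me_singlet": 4, "Me_doublet": 4,
                           "OMe_singlet": 1, "NMe_singlet": 1}),
}


def match_features(observed, database):
    """Return names of compounds whose counts equal every observed count.

    Features that were not observed are ignored, so a compound may carry
    additional features and still be reported as a candidate.
    """
    hits = []
    for name, features in database.items():
        if all(features[key] == count for key, count in observed.items()):
            hits.append(name)
    return sorted(hits)


# Features read from the spectrum on this slide.
observed = {"Me_singlet": 4, "Me_doublet": 4, "OMe_singlet": 1}
candidates = match_features(observed, KNOWN_COMPOUNDS)
print(candidates)
```

If the search returns no candidates, the compound may be novel. If it returns a short list, the remaining data (mass, UV, 2D NMR) can be used to narrow it further.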
32. Identifying novel compounds
2D NMR data will give details
regarding substitutions and
this information can be used in
the dereplication process
33. What we need is…
• If we could have:
• A DB containing known marine natural products
• This would give formula and mass for searching
• The DB has all spectral data available for each
compound
• If experimental data are not available then use
the compound to COMPUTE spectral features
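As an illustration of searching such a database by formula and mass, here is a minimal Python sketch that computes a monoisotopic mass from a molecular formula and looks up entries within a ppm tolerance. The three-entry table is a stand-in, not real database content, and the function names and the 5 ppm default are assumptions. The comparison is on neutral masses, so adducts and charge states would need to be handled separately.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of a few elements.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C8H10N4O2'.

    Brackets, charges and isotope labels are not supported in this sketch.
    """
    mass = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return mass


def search_by_mass(query_mass, formulas, ppm=5.0):
    """Return names whose computed mass lies within `ppm` of the query mass."""
    hits = []
    for name, formula in formulas.items():
        mass = monoisotopic_mass(formula)
        if abs(mass - query_mass) / mass * 1e6 <= ppm:
            hits.append(name)
    return sorted(hits)


# Stand-in table: name -> molecular formula (illustrative entries only).
EXAMPLE_DB = {
    "caffeine": "C8H10N4O2",
    "theobromine": "C7H8N4O2",
    "glucose": "C6H12O6",
}

print(search_by_mass(194.0804, EXAMPLE_DB))
```

A production search would also index the masses rather than recompute them per query, but the tolerance test itself stays the same.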
34. RSC Acquires MarinLit
• All MarinLit chemical compounds in ChemSpider
• MarinLit developers are dereplication experts
35. • Index literature related to marine natural
products: 26K articles and growing
• Structure searchable database
• Data includes taxonomy, location and literature
• “Spectral features” generated algorithmically
• Utilize the spectral features for dereplication
38. PharmaSea Dereplication
• Work in progress:
• Produce “dereplication widget” to embed in
the PharmaSea website
• Generate “structure features” file for every
new compound deposited to PharmaSea
• Ideal would be to utilize spectral data directly
to elucidate structures – “Computer Assisted
Structure Elucidation”. ACD/Labs….
39. CASE-based Elucidation
• Computers can elucidate structures today
with greater efficiency and success than
many scientists – see Patrick Wheeler’s talk
• Natural products specifically can be very
challenging and CASE is well-proven
• ACD/Labs have delivered their CASE
system (ACD/Structure Elucidator) to the
project
40. 1D & 2D NMR Synchronized
Processing
The software displays correlations for assigned spectra and structures, and highlights
correlations that are likely to be erroneous.
42. ChemSpider supporting CASE
RSC delivered entire ChemSpider structure dataset
for inclusion into the Structure Elucidator software.
50. Future Plans
• Roll out tagging on ChemSpider to crowdsource
marine natural products subset
• Implement tagging for further details onto
PharmaSea website
• Collaborate with other natural product sources
• Mass spectrometry fragmentation prediction
54. Modern NMR Approaches To The Structure Elucidation
of Natural Products
Volume 1: Instrumentation and Software
Volume 2: Data Acquisition and Applications to Compound
Classes
Edited by Antony Williams, RSC, Gary Martin, Merck and
David Rovnyak, Bucknell University
To be published: 2015 (RSC)
55. To be published: 2015 (Springer)
Computer-based Structure Elucidation from
Spectral Data
Will include a functional demo version of the
ACD/Structure Elucidator software to teach the
basic approaches to computer-assisted structure
elucidation
Authored by Mikhail Elyashberg, Kirill Blinov and
Antony Williams
56. Acknowledgments
• Alexey Pshenichnov, Ken Karapetyan and
Valery Tkachenko (RSC – US Cheminformatics)
• Marcel Jaspars (University of Aberdeen)
• John Blunt and Murray Munro (MarinLit)
• Serin Dabb (RSC, MarinLit)
• Patrick Wheeler and David Hardy (ACD/Labs)
57. Thank you
Email: williamsa@rsc.org
ORCID: 0000-0002-2668-4821
Twitter: @ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams
Editor's notes
MarinLit is ‘article-centric’, not compound-centric. Compounds are only indexed when they are newly discovered, revised, or new to marine.
Each compound record links to the paper in which the compound was first mentioned. Records are not linked to subsequent articles that describe the compound.