We are all benefiting from a shift towards openness fed by Open Source, Open Standards, Open Data and Open Access. Open Notebook Science is likely the scientific revolution of the near term. As more scientists become comfortable with openly sharing their experiments and data, often in near real time, a growing volume of new data no longer has to be extracted from publications; it is available as data feeds that can be delivered directly to the community. This presentation will provide an overview of how the ChemSpider database from the Royal Society of Chemistry (RSC) supports Open Notebook Science through programmatic access to both data and services, and how ChemSpider ingests data feeds and meshes them with its existing database of over 27 million chemical compounds.
1. Feeding and consuming data to
support Open Notebook Science via
the ChemSpider Platform
Antony Williams, Jean-Claude Bradley, Andrew Lang and
Valery Tkachenko
ACS Philadelphia August 2012
2. Setting the Stage
Chemists want access to tools and data
The more capabilities the better
The more data the better
And give us an API with that…
And it should be free…
And constantly updated…
And all data should be Open…
And make it fully Open Source…
And it needs to be on my mobile…
3. Setting the Stage
Chemists have access to tools and data
The more capabilities the better – we’ll see
The more data the better – changing daily
And give us an API with that… - not just one
And it should be free… - sure
And constantly updated… - indeed, please help!
And all data should be Open… - licensing
And make it fully Open Source… - kinda, sorta
And it needs to be on my mobile… - sure
4. Welcome to ChemSpider
5 years, 28 million chemicals, linking 400 data
sources and growing daily
Hosted by the Royal Society of Chemistry
An important part of our long term strategic vision
Free to access
With lots/most/all (?) of the functionality
necessary to support chemists and Open
Notebook Science…
18. Storing ONS Reactions
Working with JC Bradley to host ONS reactions
Linking directly back to ONS reactions
What if the links decay?
Host all related ONS data – benefits of Openness!
Future applications for RInChIs
19. What we have been asked for
“Allow us to grab data”
“Let us link”
“Give us web services to integrate”
“Can we store our data with you?”
“Can you give us predictions to validate data?”
20. What we have been asked for
“Allow us to grab data”
“Let us link”
“Give us web services to integrate”
“Can we store our data with you?”
“Can you give us predictions to validate data?”
“Can you build us an ELN?”
21. Simple Linking to ChemSpider
Link using ChemSpiderID
http://www.chemspider.com/1234567
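The link pattern on this slide is simple enough to generate programmatically. The following is a minimal Python sketch that assumes only the URL form shown above; the helper name `chemspider_link` is hypothetical, and the ID check is an added safeguard rather than anything the slide specifies.

```python
CHEMSPIDER_BASE = "http://www.chemspider.com"  # base URL as shown on the slide


def chemspider_link(csid):
    """Return a ChemSpider link for a ChemSpiderID (hypothetical helper)."""
    # Accept ints or digit strings; reject anything else so bad IDs fail early.
    text = str(csid).strip()
    if not text.isdigit() or int(text) <= 0:
        raise ValueError(f"not a valid ChemSpiderID: {csid!r}")
    return f"{CHEMSPIDER_BASE}/{int(text)}"


print(chemspider_link(1234567))
```

A notebook page or wiki template could call such a helper wherever a compound is mentioned, so that every record links back to the same ChemSpider entry.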
29. Feeding ONS Data into ChemSpider
ONS data can be deposited into ChemSpider and
linked out to the ONS pages
Simply deposit structure(s) and links
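As an illustration of "structure(s) and links", a deposit can be thought of as a small record pairing each structure with the ONS pages it should link out to. The sketch below is only that: the slide does not describe ChemSpider's actual deposition format, so the class name `OnsDeposit`, its field names and the JSON layout are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from urllib.parse import urlparse


@dataclass
class OnsDeposit:
    """One structure plus its ONS links (hypothetical schema, not ChemSpider's)."""
    structure: str                              # e.g. a SMILES string or molfile text
    links: list = field(default_factory=list)   # URLs of the ONS notebook pages

    def validate(self):
        """Check that the record has a structure and at least one web link."""
        if not self.structure.strip():
            raise ValueError("a deposit needs a structure")
        if not self.links:
            raise ValueError("a deposit needs at least one link back to ONS")
        for url in self.links:
            parts = urlparse(url)
            if parts.scheme not in ("http", "https") or not parts.netloc:
                raise ValueError(f"not a web link: {url!r}")
        return self


def to_payload(deposits):
    """Serialise validated deposits to JSON (illustrative format only)."""
    return json.dumps([asdict(d.validate()) for d in deposits], indent=2)


# example.org is a placeholder, not a real notebook address.
print(to_payload([OnsDeposit("CCO", ["http://example.org/notebook/exp-001"])]))
```

Validating before serialising means an incomplete record is rejected on the depositor's side rather than arriving without a link back to its notebook page.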
33. So isn’t ONS all about ELNs?
Open Notebook Science is about
Making records of research publicly available
online as it is recorded
ONS is enabled by software tools and platforms
Keep the notebook of the researcher online
with all raw and processed data as it is
generated (in or near real time)
Notebooks as Wikis, Commercial or Free ELNs
published to the web (choose public/private –
what data to expose)
34. Feeding ELN Data into ChemSpider
Integrate e-Notebooks into ChemSpider
IDBS e-Workbook plug-in allows direct
deposition of chemical structures
Can be extended to more ELN content
Spectra
Reactions
Properties etc.
Integration Video http://tinyurl.com/9xnprqr
36. How much data is lost?
How many reactions in a thesis never get
published?
How many spectra of common materials could be
shared?
How many properties are measured and lost?
What stands in the way of sharing?
Is it technology?
Permissions? “The Boss”, Licensing?
And yes – there are data quality issues but there
is algorithmic checking and data curation to help
37. What could the future look like?
“Publicly funded” research data flows onto the web
Licensing is clear and NOT a challenge
Machines are picking up data and depositing
EXAMPLE project – Any interest?
Put your spectra/structure in folders (Dropbox)
ChemSpider robot scoops, processes and
deposits – opportunity with JC Bradley
While processing also predicts spectra and
compares for validation
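The EXAMPLE project above can be sketched in a few lines of Python. This is a rough outline under stated assumptions, not the ChemSpider robot itself: the file extensions (`.mol` for structures, `.jdx`/`.dx` for spectra), the pairing-by-file-name rule, the function names and the 0.2 tolerance are all hypothetical, and spectrum prediction is left out because the slide does not say how it is done.

```python
from pathlib import Path

STRUCTURE_EXT = {".mol"}         # assumed structure file type
SPECTRUM_EXT = {".jdx", ".dx"}   # assumed spectrum file types


def pair_files(folder):
    """Pair structure and spectrum files in `folder` that share a base name."""
    structures, spectra = {}, {}
    for path in sorted(Path(folder).iterdir()):
        ext = path.suffix.lower()
        if ext in STRUCTURE_EXT:
            structures[path.stem] = path
        elif ext in SPECTRUM_EXT:
            spectra[path.stem] = path
    # Only names present in both maps are complete enough to deposit.
    shared = sorted(structures.keys() & spectra.keys())
    return [(structures[name], spectra[name]) for name in shared]


def peaks_match(observed, predicted, tolerance=0.2):
    """True if every predicted peak has an observed peak within `tolerance`."""
    return all(
        any(abs(p - o) <= tolerance for o in observed)
        for p in predicted
    )
```

A robot built along these lines would call `pair_files` on the shared folder, run a prediction for each structure, and flag any pair for curation where `peaks_match` returns `False` before depositing.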
38. Leaving the Stage
Chemists have access to tools and data
The more capabilities the better – what’s missing?
The more data the better – anyone want to share?
And give us an API with that… - ask us for help
And it should be free… - it is
And constantly updated… - help annotate/curate
And all data should be Open… - licensing
And make it fully Open Source… - book chapter
And it needs to be on my mobile… - it is