5. Source: Nature Reviews Drug Discovery 11, 191-200 (March 2012) | doi:10.1038/nrd3681
Jack W. Scannell, Alex Blanckley, Helen Boldon & Brian Warrington
6. [Figure; surviving labels: useless, harmful]
Source: Nature Reviews Drug Discovery 3, 711-716 (August 2004) | doi:10.1038/nrd1470
Ismail Kola & John Landis
10. Information Tombs…
¤ Built to primary use-case
¤ Tailored indexes
¤ Tailored GUIs
¤ Unique language & metadata
¤ Poor interoperability/integration
Silos shown: In vivo, Portfolio, Literature, HR, Synthesis, SAR, Docs, Safety, etc.
13. Precompetitive Informatics
Public Domain Drug Discovery Data: Pharma are accessing, processing, storing & re-processing
Sources: Literature, Genbank, Patents, PubChem, Databases, Downloads
Repeated at each company: Firewalled Databases, Data Integration, Data Analysis
Lowering industry firewalls: pre-competitive informatics in drug discovery
Nature Reviews Drug Discovery (2009) 8, 701-708 doi:10.1038/nrd2944
14. The Innovative Medicines Initiative
• EC funded public-private partnership for pharmaceutical research
• Focus on key problems
– Efficacy, Safety, Education & Training, Knowledge Management
The Open PHACTS Project
• Create a semantic integration hub (“Open Pharmacological Space”)…
• Runs 2011-2014
• Deliver services to support on-going drug discovery programs in pharma and public domain
• Leading academics in semantics, pharmacology and informatics, driven by solid industry business requirements
• 23 academic partners, 8 pharmaceutical companies, 3 software SMEs
• Work split into clusters:
– Technical Build
– Scientific Drive
– Community & Sustainability
16. Optimised To Business Questions
Number  sum  Nr of 1  Question
15  12  9  All oxido-reductase inhibitors active <100nM in both human and mouse
18  14  8  Given compound X, what is its predicted secondary pharmacology? What are the on and off-target safety concerns for a compound? What is the evidence and how reliable is that evidence (journal impact factor, KOL) for findings associated with a compound?
24  13  8  Given a target find me all actives against that target. Find/predict polypharmacology of actives. Determine ADMET profile of actives.
32  13  8  For a given interaction profile, give me compounds similar to it.
37  13  8  The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X.
38  13  8  Retrieve all experimental and clinical data for a given list of compounds defined by their chemical structure (with options to match stereochemistry or not).
41  13  8  A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the compounds known to modulate the target directly? What are the compounds that may modulate the target directly? i.e. return all compounds active in assays where the resolution is at least at the level of the target family (i.e. PKC) both from structured assay databases and the literature.
44  13  8  Give me all active compounds on a given target with the relevant assay data
46  13  8  Give me the compound(s) which hit most specifically the multiple targets in a given pathway (disease)
59  14  8  Identify all known protein-protein interaction inhibitors
20. Open PHACTS architecture (Oct. 2012). Components shown:
¤ Applications: Open PHACTS Explorer, 1st Gen Apps, Partner Apps
¤ Linked Data API (RDF/XML, TTL, JSON)
¤ Identity Resolution Service (ConceptWiki), e.g. “Adenosine receptor 2a”
¤ Identifier Management Service (BridgeDb+), e.g. P12374, EC2.43.4, CS4532
¤ Domain Specific Services
¤ Chemistry Normalisation & Q/C (ChemSpider)
¤ Data Cache (Virtuoso Triple Store)
¤ Data Import: Public Ontologies, User Annotations, Public Content, Commercial
31. The 18th International Conference on Knowledge
Engineering and Knowledge Management is
concerned with all aspects of eliciting, acquiring,
modeling and managing knowledge, and its role in
the construction of knowledge-intensive systems and
services for the semantic web, knowledge
management, e-business, natural language
processing, intelligent information integration, etc. The
focus of the 18th edition of EKAW will be on
"Knowledge Engineering and Knowledge
Management that matters".
32. Dynamic Equality
Strict (analysing) to relaxed (browsing)
§ Tuneable (same data, different questions)
§ Domain specific
§ User driven
§ Traceable
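One way to picture tuneable, traceable equality is a set of identifier links, each tagged with the reason it was asserted, plus named "lenses" that decide which reasons count as equal for a given question. The sketch below is an illustration only, not the project's implementation: the lens names, the justification labels and the `equivalents` function are assumptions, and the identifiers are the Interleukin 1A examples that appear elsewhere in this deck.

```python
from collections import defaultdict

# Links between identifiers, each tagged with why it was asserted.
# The justification labels are hypothetical, chosen for illustration.
LINKS = [
    ("http://bio2rdf.org/uniprot:P01583",
     "http://identifiers.org/uniprot/P01583", "same_record"),
    ("http://identifiers.org/uniprot/P01583", "entrez:3552", "protein_gene"),
    ("http://identifiers.org/uniprot/P01583", "uniprot:P01582", "ortholog"),
]

# A lens lists the justifications accepted as "equal" for a task (assumed names).
LENSES = {
    "strict": {"same_record"},                               # analysing
    "relaxed": {"same_record", "protein_gene", "ortholog"},  # browsing
}


def equivalents(start, lens_name):
    """Return (identifiers treated as equal to start, links used to get there)."""
    allowed = LENSES[lens_name]
    graph = defaultdict(list)
    for a, b, why in LINKS:
        if why in allowed:
            graph[a].append((b, why))
            graph[b].append((a, why))
    seen, trace, stack = {start}, [], [start]
    while stack:
        node = stack.pop()
        for nxt, why in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                trace.append((node, nxt, why))  # keeps the result traceable
                stack.append(nxt)
    return seen, trace
```

Calling `equivalents(uri, "strict")` and `equivalents(uri, "relaxed")` on the same data gives different answers, which is the "same data, different questions" point; the returned trace records which link justified each match.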
47. Conclusions
¤ Project designed for the new drug discovery
environment
¤ Timing with RDF/SW is good
¤ Companies eager to see whether it can really make a difference
¤ Challenge: Got to be better than state of the art (in 3 years!)
¤ Funding challenges are formidable
48. Acknowledgements
¤ Many members of the consortium who have contributed to data, use cases, funding, support, documentation, management
¤ EBI: John Overington, Anna Gaulton, Mark Davies
¤ Lundbeck: Sune Askjær
¤ Maastricht: Chris Evelo, Andra Waagmeester, Egon Willighagen
¤ Manchester: Carole Goble, Alasdair Gray, Christian Brenninkmeijer, Steve Pettifer, Ian Dunlop, Rishi Ramgolam, James Eales
¤ NBIC: Barend Mons, Kees Burger
¤ RSC: Antony Williams, Valery Tkachenko
¤ SIB: Christine Chichester
¤ VU: Frank van Harmelen, Paul Groth, Antonis Loizou
¤ OpenLink: Orri Erling, Yrjana Rankka, Hugh Williams
¤ Chem2Bio2RDF: David Wild, Bin Chen
51. Find me the off-target activities of known cancer drugs whose primary target is a cell cycle regulatory kinase
Sources: ChEMBL, DrugBank, Gene Ontology, Wikipathways, ChEBI, Uniprot, UMLS, ConceptWiki, ChemSpider
Connected Using Semantic Technology
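Answering that question means joining facts that live in different sources: which drugs treat cancer, what each drug's primary target is, how that target is classified, and which other targets the drug is active against. The sketch below shows the shape of the join over in-memory tables. It is a toy under stated assumptions: every record is a made-up placeholder (not real pharmacology), the table names only hint at the kind of source each fact might come from, and `off_target_activities` is a hypothetical helper, not an Open PHACTS call.

```python
# All records are invented placeholders for illustration.
DRUG_INDICATION = {"drug_A": "cancer", "drug_B": "cancer"}       # drug database
PRIMARY_TARGET = {"drug_A": "kinase_1", "drug_B": "receptor_9"}  # drug database
TARGET_CLASS = {                                                 # ontology
    "kinase_1": "cell cycle regulatory kinase",
    "kinase_7": "cell cycle regulatory kinase",
    "receptor_9": "GPCR",
}
ACTIVITIES = [                                                   # bioactivity data
    ("drug_A", "kinase_1"),
    ("drug_A", "kinase_7"),
    ("drug_A", "channel_3"),
    ("drug_B", "receptor_9"),
    ("drug_B", "kinase_1"),
]


def off_target_activities(indication, target_class):
    """Map each matching drug to the targets it hits besides its primary one."""
    result = {}
    for drug, primary in PRIMARY_TARGET.items():
        if DRUG_INDICATION.get(drug) != indication:
            continue
        if TARGET_CLASS.get(primary) != target_class:
            continue
        result[drug] = sorted(
            target for d, target in ACTIVITIES if d == drug and target != primary
        )
    return result
```

The join only works because every table uses the same keys for drugs and targets. With real sources each one has its own identifiers, which is why the mapping between identifiers has to be solved first.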
52. Are these Interleukin 1A?
Human Interleukin 1A Protein http://bio2rdf.org/uniprot:P01583
Human Interleukin 1A Protein http://identifiers.org/uniprot/P01583
Human Interleukin 1A Gene Entrez Gene: 3552, Ensembl:ENSG00000115008
Affymetrix probes hIL1A 1076_at, 210118_s_at, 208200_at, 208200_at
Mouse Interleukin 1A Uniprot:P01582
IL1A PDB Structures 1ITA (3D) 2ILA (3D) 2KKI (3D) 2L5X (3D)
….etc
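The first two rows differ only in URI style, so the simplest case of the question can be decided mechanically by reducing each URI to a (namespace, accession) pair. A minimal sketch, assuming only the two URI patterns that appear on this slide; `normalise` and `same_record` are hypothetical helper names, and a real service would need many more patterns plus curated mappings for the gene, probe and structure rows.

```python
import re

# The two URI styles seen on the slide; anything else is left unresolved.
PATTERNS = [
    re.compile(r"^http://bio2rdf\.org/(?P<ns>[a-z]+):(?P<acc>\w+)$"),
    re.compile(r"^http://identifiers\.org/(?P<ns>[a-z]+)/(?P<acc>\w+)$"),
]


def normalise(uri):
    """Return (namespace, accession) for a recognised URI, else None."""
    for pattern in PATTERNS:
        match = pattern.match(uri)
        if match:
            return match.group("ns"), match.group("acc")
    return None


def same_record(uri_a, uri_b):
    """True only when both URIs resolve to the same database record."""
    key_a, key_b = normalise(uri_a), normalise(uri_b)
    return key_a is not None and key_a == key_b
```

This settles whether two URIs name the same database record. Whether the human protein, its gene and the mouse protein should be treated as the same thing is not a syntactic question, so it cannot be answered this way.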
53. “There is lots of data we all use every day, and it’s not part of the web. I
can see my bank statements on the web, and my photographs, and
I can see my appointments in a calendar. But can I see my photos in
a calendar to see what I was doing when I took them? Can I see
bank statement lines in a calendar?
No. Why not? Because we don’t have a web of data. Because data
is controlled by applications and each application keeps it to itself.”
Sir Tim Berners-Lee