Comparing ChEMBL, DrugBank, Human Metabolome db and Therapeutic Target db at the chemistry and protein levels

Comparing ChEMBL, DrugBank, Human
Metabolome db and Therapeutic Target db at the
chemistry and protein levels:

The presentation is based on this paper
http://onlinelibrary.wiley.com/doi/10.1002/minf.201300103/abstract

Chris Southan, Curator for IUPHAR-db/Guide toPHARMACOLOGY
Presented at the IUPHAR Meeting, Paris, October 2013

[1]

So why compare databases?
•
•
•
•
•
•
•
•
•
•
•

Determine what they actually contain
Determine extraction selectivity
Retro-divine the curatorial rules
Assess relationship mapping stringency and fidelity
Understand entity, attribute and relationship distributions
Become aware of declared or cryptic circularity
Detect global error propagation
Evaluate unique content
Judge utility, complementarity, consumability and integratablity
See what to avoid in your own database
Know what to emulate

[3]

Introducing the resources (I)
•

~ 2/3 of ChEMBL is curated from medicinal chemistry papers, mainly
as structure-activity-relationship (SAR) results. The other ~ 1/3 is from
confirmatory PubChem BioAssays. Release 15 (January 2013) 9,570
targets, 1,254,575 distinct compounds, 10,509,572 activities and
48,735 publications

•

DrugBank collates target and mechanism-of-action information. Version
3.0 (January 2011) contains 6,715 drug entries including 1,452 FDAapproved small molecules, 131 biologicals, 86 nutraceuticals and 5,076
experimental compounds. These are mapped to 4,233 protein IDs.

•

TTD is conceptually similar to DrugBank but the compound-to-target
mappings are focussed on primary targets. It ncludes a three-way split
of targets and compounds into marketed, clinical trial and research
phase. The latest version 4.3.02 (August 2011) includes 2,025 targets,
17,816 chemical structures, including 1,540 approved drugs.
[4]

Introducing the resources (II)

•

HMDB collates detailed chemical, clinical and biochemical data on
human metabolites. These are linked to other databases including
enzymes involved in the transformations. Version 3.0 (September
2012) contains 40,437 chemical entries and 5,650 protein sequence
identifiers.

•

IUPHARdb/GTP (you are hearing about this weekend….) 6064 ligands,
559 approved drugs , 894 Unique (UniProt) targets with direct activity
mappings (all ligand types, all species, but excluding kinase screens)
21,774 references

[5]

Chemistry Mw vs
over time

[7]

Resolving content inside PubChem (slice ‘n dice)

ChEMBL
ChEMBL

HMDB

DrugBank

TTD
[8]

Comparative protein attributes (e.g. GO)

HMDB

IUPHARdb

[9]

Comparing UniProt cross-references

[10]

Compare individual curatorial errors

[11]

Protein ID comparisons

Consensi are corroborative (but beware of curatorial circularity)
Human Swiss-Prot IDs

[12]

Differences in curation rules (e.g. atorvastatin)

[13]

Check for false-negatives

[14]

Thanks, questions welcome

Our 2012 paper (but the data was 2010) Mapping between databases of
compounds and protein targets Muresan S, Sitzmann M, Southan C. Methods
Mol Biol. 2012;910, PMID:22821596

Now on
http://figshare.com/articles/Mapping_Between_Databases_of_Compounds_and_Pr
otein_Targets/818979

If enjoyed this presentation, you might also like PMID: 20298516

[15]

Comparing ChEMBL, DrugBank, Human Metabolome db and Therapeutic Target db at the chemistry and protein levels

Recomendados

Recomendados

Más contenido relacionado

La actualidad más candente

La actualidad más candente (18)

Similar a Comparing ChEMBL, DrugBank, Human Metabolome db and Therapeutic Target db at the chemistry and protein levels

Similar a Comparing ChEMBL, DrugBank, Human Metabolome db and Therapeutic Target db at the chemistry and protein levels (20)

Más de Chris Southan

Más de Chris Southan (20)

Último

Último (20)

Comparing ChEMBL, DrugBank, Human Metabolome db and Therapeutic Target db at the chemistry and protein levels