www.guidetopharmacology.org
Will the real drugs and targets please stand up?
Evolving consensus-based curatorial strategies
Chris Southan, IUPHAR/BPS Guide to PHARMACOLOGY Web portal Group, Centre for Integrative
Physiology, School of Biomedical Sciences, University of Edinburgh, Hugh Robson Building, Edinburgh,
EH8 9XD, UK. cdsouthan@hotmail.com
Presented to the Gloriam/GPCRDB Team and the Dept. of Pharmaceutical Sciences,
University of Copenhagen, 6th May 2014
GToPdb: receptors, ligands, targets and drugs
• An expert-curated database overseen by the IUPHAR Nomenclature Committee (NC-IUPHAR)
• >70 subcommittees comprising ~700 international scientists working on individual target families
• 4 full-time curators, 1 part-time admin, 1 developer
• NC-IUPHAR publishes nomenclature recommendations and reviews on various topics in pharmacological journals and through the IUPHAR database
• Subcommittees update their database pages annually
• Continuously expanding to incorporate new data types, new targets and ligands, and new domain committees
• Public database releases every 3-4 months
Content
Detailed annotation
Pharmacological and clinical data
Wellcome Trust Grant 099156/Z/12/Z
• Key objective: “encompass all the human targets of current prescription medicines and the likely targets of future medicines”
• Conceptually familiar from our established receptor/channel-centric database
• But: needed to re-define curatorial approaches, caveats and end-points
• Balance between theoretical rigour and pragmatic utility
• Four foci: grant fulfilment, user value, data mining, data consumption
• Discuss and document changes in curatorial strategies with practical guidelines
• Add enhancements, new relationships and features
• Control activity-mapping stringencies and relationship distributions
• QC legacy content, harmonise and remediate where necessary
• Aim for small, but perfectly-formed, data content vs. complete coverage
Technical implementation
• Restrict relationships to citable/provenanced quantitative mappings (typically IC50, Ki, Kd)
• Formally tag data-supported “primary targets”
• Only data-supported polypharmacology
• Mask nutraceuticals, metabolites and endogenous hormones so they do not bloat the drug > target relationship space
• Limit drug > multiple subunit mappings to direct interactions
• Normalise targets to UniProt IDs (Swiss-Prot for human)
• Normalise drugs and ligands to PubChem compound records (CIDs)
• Extend useful relationships, e.g. drug > prodrug, drug > active metabolite, ligand = target (antibody > cytokine)
• Flexibility to handle edge cases (e.g. heparinoids)
• Options for selective expansion (e.g. kinases, proteases and Alzheimer’s)
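The first rule in this list (keep only citable, quantitative mappings) can be illustrated with a small filter. This is a minimal sketch, not the GToPdb schema: the `ActivityMapping` record, its field names, the fixed parameter set and the accession, CID and PMID values are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the slide says "typically IC50, Ki, Kd"; this fixed set is illustrative only.
QUANTITATIVE_PARAMETERS = {"IC50", "Ki", "Kd"}


@dataclass(frozen=True)
class ActivityMapping:
    """Hypothetical drug > target record (not the GToPdb schema)."""
    ligand_cid: int            # PubChem compound identifier (placeholder values below)
    target_uniprot: str        # UniProt accession (placeholder values below)
    parameter: str             # e.g. "Ki"
    value_nm: Optional[float]  # measured value in nM, None if not reported
    pmid: Optional[int]        # citable source, None if unprovenanced


def is_curatable(m: ActivityMapping) -> bool:
    """Keep only mappings that are both quantitative and provenanced."""
    return (
        m.parameter in QUANTITATIVE_PARAMETERS
        and m.value_nm is not None
        and m.pmid is not None
    )


candidates = [
    ActivityMapping(101, "P00001", "Ki", 2.5, 11111111),         # kept
    ActivityMapping(102, "P00002", "IC50", None, 22222222),      # dropped: no value
    ActivityMapping(103, "P00003", "Kd", 40.0, None),            # dropped: no citation
    ActivityMapping(104, "P00004", "activator", 1.0, 33333333),  # dropped: not quantitative
]
kept = [m for m in candidates if is_curatable(m)]
print([m.ligand_cid for m in kept])
```

Only the first toy record passes, since each of the others fails exactly one of the three checks.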
Defining limits for curation
• The good news: capture of targets and drugs in databases and literature reports is continuously expanding
• The bad news: no one agrees on numbers, relationship definitions, curatorial rules, identifiers, exact molecular structures, choices of primary sources or provenance attribution
• More bad news: source proliferation < “circular” annotation
• Human target range: 186 approved drugs in 2006 (PMID:17139284) < 3,044 in ChEMBL_18
• Approved drug ranges: 1,216 FDA Maximum Daily Dose (PubChem Assay ID 1195) < 2,750 for the NCGC Pharmaceutical Collection (PMID:21525397)
• Outer bioactivity ranges: 8,057 INNs < 928,875 actives in PubChem BioAssays < 6.3 million from GVKBIO with SAR from papers and patents
Evolution of our consensus strategy
Based on many collective years of curatorial engagement and deep source
knowledge we now pursue a consensus approach for the following reasons:
1. Concordant sources are generally more likely to be right than wrong
2. Curatorial efficiency of starting with solid consensus sets
3. Multiple sources are informatically synergistic (if truly independent)
4. Approach is flexible via source updates and testing different filters
5. We control total numbers for matching to curatorial capacity
6. The concept can easily be explained to users
7. The exercise of comparing sources is very informative
8. It forces entity identifier normalisation (via cross-mapping if necessary)
9. Consensus lists per se have value for users (e.g. hosting on website)
Will the real targets please stand up?
• Compared as human Swiss-Prot IDs for 2013 database releases
• The intersect is 351 and the union is 3,046 (i.e. 15% of the 20,265-protein human proteome)
• Lists included approved, clinical and research targets
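The intersect, union and proteome-coverage figures above are plain set operations on accession lists. A minimal sketch, assuming each source has already been reduced to a list of Swiss-Prot accessions; the source names and accessions are made-up placeholders, and only the last two lines use numbers from the slide.

```python
def compare_sources(sources, proteome_size):
    """Return intersect size, union size and union coverage of the proteome."""
    id_sets = [set(ids) for ids in sources.values()]
    intersect = set.intersection(*id_sets)
    union = set.union(*id_sets)
    return len(intersect), len(union), len(union) / proteome_size


# Toy accession lists (placeholders, not real database content).
toy_sources = {
    "source_a": ["P00001", "P00002", "P00003", "P00004"],
    "source_b": ["P00002", "P00003", "P00005"],
    "source_c": ["P00003", "P00002", "P00006"],
}
n_intersect, n_union, coverage = compare_sources(toy_sources, proteome_size=20)
print(n_intersect, n_union, coverage)

# The slide's own figures: a union of 3,046 against a 20,265-protein proteome.
slide_coverage = 3046 / 20265
print(f"{slide_coverage:.0%}")
```

Comparing on a single normalised identifier type is what makes the set arithmetic meaningful; mixed identifiers would understate the intersect.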
Figure 7d from: “Comparing the chemical structure and protein content of ChEMBL, DrugBank, Human Metabolome Database and the Therapeutic Target Database” (PMID: 24533037)
Gene Ontology comparison indicates source selectivity
Use a target consensus to populate the database
• ChEMBL 17: 252 approved
• Mathias Rask-Andersen et al., July 2013: 481 approved
• Southan et al., 2013, 3-way human DrugBank/ChEMBL/TTD: 352
• 3-way or 2-way: 19 + 40 + 143 = 202 Targets Of Approved Drugs (TOADS) set selected for GToP upload
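Selecting targets that appear in all three lists or in any two of them amounts to counting source support per target. The sketch below shows one way to do that; the list names and accessions are placeholders, and the slide does not say which of its three subtotals corresponds to which overlap, so no such mapping is assumed here.

```python
from collections import Counter


def consensus_by_support(source_lists, min_sources=2):
    """Map each target to the number of source lists that contain it,
    keeping only targets supported by at least `min_sources` lists."""
    support = Counter()
    for ids in source_lists.values():
        support.update(set(ids))  # set() so a repeated ID counts once per source
    return {t: n for t, n in support.items() if n >= min_sources}


# Toy lists (placeholder accessions, not the real source content).
toy_lists = {
    "list_1": ["P00001", "P00002", "P00003"],
    "list_2": ["P00002", "P00003", "P00004"],
    "list_3": ["P00003", "P00004", "P00005", "P00003"],
}
selected = consensus_by_support(toy_lists)
three_way = sorted(t for t, n in selected.items() if n == 3)
two_way = sorted(t for t, n in selected.items() if n == 2)
print(three_way, two_way)

# The slide's subtotals do sum to the stated TOADS set size.
print(19 + 40 + 143)
```

Raising `min_sources` to 3 would give the stricter all-source consensus, which is one of the "different filters" such an approach allows.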
Will the real drugs please stand up?
• Work up the following CID triage inside PubChem
• Select DrugBank 1504 “approved” drug structures
• Select two additional sources, TTD and ChEMBL
• Filter to remove salts and mixtures
• Select synonym INN (WHO International Non-proprietary Name)
• The final step was the Boolean intersect between all five
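The triage above was run inside PubChem, but the final Boolean intersect can be sketched offline with sets of CIDs. This assumes the "five" are the three source selections plus the two filters, which is one reading of the slide; the filter names and integer CIDs are hypothetical placeholders.

```python
def boolean_intersect(filters, skip=()):
    """Intersect every named CID set except those listed in `skip`."""
    kept = [cids for name, cids in filters.items() if name not in skip]
    if not kept:
        return set()
    return set.intersection(*kept)


# Toy CID sets; the integers are placeholders, not real PubChem CIDs.
filters = {
    "drugbank_approved": {1, 2, 3, 4, 5, 6},
    "ttd": {1, 2, 3, 7},
    "chembl": {1, 2, 3, 4, 5},
    "not_salt_or_mixture": {1, 2, 4, 5, 6, 7},
    "has_inn_synonym": {1, 2, 4, 6, 7},
}
consensus = boolean_intersect(filters)
without_ttd = boolean_intersect(filters, skip=("ttd",))
print(sorted(consensus), sorted(without_ttd))
```

The `skip` argument makes it cheap to test how much any single source constrains the result, for example by leaving out an older source.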
Observations and caveats
• This set of 923 drugs can be accessed via the MyNCBI open URL http://www.ncbi.nlm.nih.gov/sites/myncbi/collections/public/1Fo7u3apR1bzS_UWr1YhHOTkZ/
• TTD was last submitted in Feb 2012, so drug content is capped to before that date (dropping TTD gives 1117 CIDs)
• Some metabolites (e.g. amino acids) come through the filters
• Older drugs have no INN (e.g. aspirin)
• Some peptide drug CIDs are missing (suggesting low concordance)
• Approved fixed-mixtures are excluded (they do not get an INN)
• The computed CID identity is actually a hash-code match rather than an InChIKey match (but this should give similar numbers)
• Each of the 923 had 76 submissions (SIDs)
• Applying “same (bond) connectivity” gives 18749, but removing the virtual deuterated entries reduces this to 6919 (i.e. the 923 have, on average, 7.5 alternative stereo CIDs)
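The per-drug average in the last bullet follows directly from the stated counts; a short check, using only numbers from the slide (the variable names are illustrative):

```python
drugs = 923
same_connectivity = 18749   # CIDs sharing bond connectivity with the 923
after_deuterated = 6919     # after removing virtual deuterated entries

removed = same_connectivity - after_deuterated
per_drug = after_deuterated / drugs
print(removed, round(per_drug, 1))
```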
Closing consensus drugs > targets
• From Phase 1 (targets > drugs) we have moved to Phase 2 (drugs > targets)
• Current stats = 228 TOADS (inward mapping expanded the set by ~10%)
• Current stats = 996 approved drugs (need to complete the activity mappings)
• Note that antibodies and larger peptides (with no PubChem CIDs) are subsumed in the 996
• 2013 new drug CIDs loaded: http://cdsouthan.blogspot.se/2014/03/the-drugs-of-2013-in-pubchem.html
• Will back-fill 2010-2012 new approvals as ligands, targets and activities (but most already there)
GPCRdb/GToPdb collaborative opportunity
• Inspect which GPCRs are concordant or discordant between the target lists
• Might be able to do a similar exercise for GPCR-active drug/compound lists, depending on what we can find with linkage (e.g. GLIDA)
• Work up a triage for alert triggers for new GPCR ligand structures in PDB (e.g. via MMDB)
References and Acknowledgments
The database team: Adam Pawson, Joanna Sharman, Helen Benson, Elena Faccenda

More Related Content

What's hot

Drug Repositioning Workshop
Drug Repositioning WorkshopDrug Repositioning Workshop
Drug Repositioning Workshop
Genna Gerla
Ā 
A_Pope_RQRM_LeadDisc_June_2016
A_Pope_RQRM_LeadDisc_June_2016A_Pope_RQRM_LeadDisc_June_2016
A_Pope_RQRM_LeadDisc_June_2016
Andrew Pope
Ā 
AllTrials AAAS 2015 - Access to anonymised patient level data
AllTrials AAAS 2015 - Access to anonymised patient level dataAllTrials AAAS 2015 - Access to anonymised patient level data
AllTrials AAAS 2015 - Access to anonymised patient level data
SenseAboutSci
Ā 

What's hot (20)

PubChem for drug discovery and chemical biology
PubChem for drug discovery and chemical biologyPubChem for drug discovery and chemical biology
PubChem for drug discovery and chemical biology
Ā 
Guide to PHARMACOLOGY: a web-Based Compendium for Research and Education
Guide to PHARMACOLOGY: a web-Based Compendium for Research and EducationGuide to PHARMACOLOGY: a web-Based Compendium for Research and Education
Guide to PHARMACOLOGY: a web-Based Compendium for Research and Education
Ā 
Will the correct drugs please stand up?
Will  the correct drugs please stand up?Will  the correct drugs please stand up?
Will the correct drugs please stand up?
Ā 
Assessing GtoPdb ligand content in PubChem
Assessing GtoPdb ligand content in PubChemAssessing GtoPdb ligand content in PubChem
Assessing GtoPdb ligand content in PubChem
Ā 
Drug Repositioning Workshop
Drug Repositioning WorkshopDrug Repositioning Workshop
Drug Repositioning Workshop
Ā 
IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...
IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...
IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...
Ā 
Curatorial data wrangling for the Guide to PHARMACOLGY
Curatorial data wrangling for the Guide to PHARMACOLGY Curatorial data wrangling for the Guide to PHARMACOLGY
Curatorial data wrangling for the Guide to PHARMACOLGY
Ā 
The IUPHAR/BPS Guide to PHARAMCOLOGY in 2018: new features and updates
The IUPHAR/BPS Guide to PHARAMCOLOGY in 2018: new features and updatesThe IUPHAR/BPS Guide to PHARAMCOLOGY in 2018: new features and updates
The IUPHAR/BPS Guide to PHARAMCOLOGY in 2018: new features and updates
Ā 
Update on the Druggable Proteome
Update on the Druggable ProteomeUpdate on the Druggable Proteome
Update on the Druggable Proteome
Ā 
GtoPdb ELIXIR-All Hands 2018
GtoPdb ELIXIR-All Hands 2018GtoPdb ELIXIR-All Hands 2018
GtoPdb ELIXIR-All Hands 2018
Ā 
Capturing BIA-10-2474 and related FAAH inhibitor data
Capturing BIA-10-2474 and related FAAH inhibitor dataCapturing BIA-10-2474 and related FAAH inhibitor data
Capturing BIA-10-2474 and related FAAH inhibitor data
Ā 
Introduction to Discovery Partnerships with Academia (DPAc)
Introduction to Discovery Partnerships with Academia (DPAc)Introduction to Discovery Partnerships with Academia (DPAc)
Introduction to Discovery Partnerships with Academia (DPAc)
Ā 
A_Pope_RQRM_LeadDisc_June_2016
A_Pope_RQRM_LeadDisc_June_2016A_Pope_RQRM_LeadDisc_June_2016
A_Pope_RQRM_LeadDisc_June_2016
Ā 
Epoch Research Institute : Introduction to CR
Epoch Research Institute : Introduction to CREpoch Research Institute : Introduction to CR
Epoch Research Institute : Introduction to CR
Ā 
Transparency in the Data Supply Chain
Transparency in the Data Supply ChainTransparency in the Data Supply Chain
Transparency in the Data Supply Chain
Ā 
IUPHAR/BPS Guide to Pharmacology in 2018
IUPHAR/BPS Guide to Pharmacology in 2018IUPHAR/BPS Guide to Pharmacology in 2018
IUPHAR/BPS Guide to Pharmacology in 2018
Ā 
Patent chemisty big bang: utilities for SMEs
Patent chemisty big bang: utilities for SMEsPatent chemisty big bang: utilities for SMEs
Patent chemisty big bang: utilities for SMEs
Ā 
Finding novel lead compounds in pesticide discovery inspired by pharmaceutica...
Finding novel lead compounds in pesticide discovery inspired by pharmaceutica...Finding novel lead compounds in pesticide discovery inspired by pharmaceutica...
Finding novel lead compounds in pesticide discovery inspired by pharmaceutica...
Ā 
AllTrials AAAS 2015 - Access to anonymised patient level data
AllTrials AAAS 2015 - Access to anonymised patient level dataAllTrials AAAS 2015 - Access to anonymised patient level data
AllTrials AAAS 2015 - Access to anonymised patient level data
Ā 
Data-driven drug discovery for rare diseases - Tales from the trenches (CINF ...
Data-driven drug discovery for rare diseases - Tales from the trenches (CINF ...Data-driven drug discovery for rare diseases - Tales from the trenches (CINF ...
Data-driven drug discovery for rare diseases - Tales from the trenches (CINF ...
Ā 

Viewers also liked

Sorting bioactive wheat from database chaff
Sorting bioactive wheat from database chaffSorting bioactive wheat from database chaff
Sorting bioactive wheat from database chaff
Chris Southan
Ā 
Exploiting Edinburgh's Guide to PHARMACOLOGY database as a source of protein ...
Exploiting Edinburgh's Guide to PHARMACOLOGY database as a source of protein ...Exploiting Edinburgh's Guide to PHARMACOLOGY database as a source of protein ...
Exploiting Edinburgh's Guide to PHARMACOLOGY database as a source of protein ...
Chris Southan
Ā 

Viewers also liked (6)

Chemicalize.org: User-Selected PubChem Source of Structures from Text
Chemicalize.org: User-Selected PubChem Source of Structures from TextChemicalize.org: User-Selected PubChem Source of Structures from Text
Chemicalize.org: User-Selected PubChem Source of Structures from Text
Ā 
Southan real drugs_paris_oct_11_2014
Southan real drugs_paris_oct_11_2014Southan real drugs_paris_oct_11_2014
Southan real drugs_paris_oct_11_2014
Ā 
Sorting bioactive wheat from database chaff
Sorting bioactive wheat from database chaffSorting bioactive wheat from database chaff
Sorting bioactive wheat from database chaff
Ā 
Exploiting Edinburgh's Guide to PHARMACOLOGY database as a source of protein ...
Exploiting Edinburgh's Guide to PHARMACOLOGY database as a source of protein ...Exploiting Edinburgh's Guide to PHARMACOLOGY database as a source of protein ...
Exploiting Edinburgh's Guide to PHARMACOLOGY database as a source of protein ...
Ā 
Analysing targets and drugs to populate the GToP database
Analysing  targets and drugs to populate the GToP databaseAnalysing  targets and drugs to populate the GToP database
Analysing targets and drugs to populate the GToP database
Ā 
Evolution of a Drug Target BACE1
Evolution of a Drug Target BACE1Evolution of a Drug Target BACE1
Evolution of a Drug Target BACE1
Ā 

Similar to Evolving consensus-based curatorial strategies

Analysing the drug targets in the human genome
Analysing the drug targets in the human genomeAnalysing the drug targets in the human genome
Analysing the drug targets in the human genome
Guide to PHARMACOLOGY
Ā 
How to Submit Non-Clinical Data to CBER Using SEND : Understanding New FDA Re...
How to Submit Non-Clinical Data to CBER Using SEND : Understanding New FDA Re...How to Submit Non-Clinical Data to CBER Using SEND : Understanding New FDA Re...
How to Submit Non-Clinical Data to CBER Using SEND : Understanding New FDA Re...
MMS Holdings
Ā 
David-Graham-HGML-presentation-20190424.pptx
David-Graham-HGML-presentation-20190424.pptxDavid-Graham-HGML-presentation-20190424.pptx
David-Graham-HGML-presentation-20190424.pptx
ssuser660bb1
Ā 
David-Graham-HGML-presentation-20190424.pptx
David-Graham-HGML-presentation-20190424.pptxDavid-Graham-HGML-presentation-20190424.pptx
David-Graham-HGML-presentation-20190424.pptx
ssusera155d8
Ā 
GtoPdb: A resource for cell-based perturbogens
GtoPdb:  A resource for cell-based perturbogensGtoPdb:  A resource for cell-based perturbogens
GtoPdb: A resource for cell-based perturbogens
Chris Southan
Ā 

Similar to Evolving consensus-based curatorial strategies (20)

Druggable Proteome sources in UniProt
Druggable Proteome sources in UniProtDruggable Proteome sources in UniProt
Druggable Proteome sources in UniProt
Ā 
Druggable genome in GtoPdb and other dbs
Druggable genome in GtoPdb and other dbsDruggable genome in GtoPdb and other dbs
Druggable genome in GtoPdb and other dbs
Ā 
Amia tbi-14-final
Amia tbi-14-finalAmia tbi-14-final
Amia tbi-14-final
Ā 
Analysing the drug targets in the human genome
Analysing the drug targets in the human genomeAnalysing the drug targets in the human genome
Analysing the drug targets in the human genome
Ā 
GtoPdb teaching slides
GtoPdb teaching slidesGtoPdb teaching slides
GtoPdb teaching slides
Ā 
How to Submit Non-Clinical Data to CBER Using SEND : Understanding New FDA Re...
How to Submit Non-Clinical Data to CBER Using SEND : Understanding New FDA Re...How to Submit Non-Clinical Data to CBER Using SEND : Understanding New FDA Re...
How to Submit Non-Clinical Data to CBER Using SEND : Understanding New FDA Re...
Ā 
Slicing and dicing expert-curated protein targets in the Guide to PHARMACOLGY
Slicing and dicing expert-curated protein targets in the Guide to PHARMACOLGYSlicing and dicing expert-curated protein targets in the Guide to PHARMACOLGY
Slicing and dicing expert-curated protein targets in the Guide to PHARMACOLGY
Ā 
David-Graham-HGML-presentation-20190424.pptx
David-Graham-HGML-presentation-20190424.pptxDavid-Graham-HGML-presentation-20190424.pptx
David-Graham-HGML-presentation-20190424.pptx
Ā 
David-Graham-HGML-presentation-20190424.pptx
David-Graham-HGML-presentation-20190424.pptxDavid-Graham-HGML-presentation-20190424.pptx
David-Graham-HGML-presentation-20190424.pptx
Ā 
David-Graham-HGML-presentation-20190424.pptx
David-Graham-HGML-presentation-20190424.pptxDavid-Graham-HGML-presentation-20190424.pptx
David-Graham-HGML-presentation-20190424.pptx
Ā 
Correct drug structures for pharmacology
Correct drug structures for pharmacologyCorrect drug structures for pharmacology
Correct drug structures for pharmacology
Ā 
Guide to Pharmacology Poster - ELIXIR All Hands 2020
Guide to Pharmacology Poster - ELIXIR All Hands 2020Guide to Pharmacology Poster - ELIXIR All Hands 2020
Guide to Pharmacology Poster - ELIXIR All Hands 2020
Ā 
Presentation at Rare Disease conference in San-Antonio
Presentation at Rare Disease conference in San-AntonioPresentation at Rare Disease conference in San-Antonio
Presentation at Rare Disease conference in San-Antonio
Ā 
Mobilizing informational resources for rare diseases
Mobilizing informational resources for rare diseasesMobilizing informational resources for rare diseases
Mobilizing informational resources for rare diseases
Ā 
The Impact of Real-World Data in Pharmacovigilance and Regulatory Decision-Ma...
The Impact of Real-World Data in Pharmacovigilance and Regulatory Decision-Ma...The Impact of Real-World Data in Pharmacovigilance and Regulatory Decision-Ma...
The Impact of Real-World Data in Pharmacovigilance and Regulatory Decision-Ma...
Ā 
GtoPdb: A resource for cell-based perturbogens
GtoPdb:  A resource for cell-based perturbogensGtoPdb:  A resource for cell-based perturbogens
GtoPdb: A resource for cell-based perturbogens
Ā 
PMED: APPM Workshop: From Real World Data to Real World Evidence - Richard Zi...
PMED: APPM Workshop: From Real World Data to Real World Evidence - Richard Zi...PMED: APPM Workshop: From Real World Data to Real World Evidence - Richard Zi...
PMED: APPM Workshop: From Real World Data to Real World Evidence - Richard Zi...
Ā 
Guide to Immunopharmacology update
Guide to Immunopharmacology updateGuide to Immunopharmacology update
Guide to Immunopharmacology update
Ā 
Computer aided drug designing (cadd)
Computer aided drug designing (cadd)Computer aided drug designing (cadd)
Computer aided drug designing (cadd)
Ā 
Peptide Tribulations in GtoPdb
Peptide Tribulations in GtoPdbPeptide Tribulations in GtoPdb
Peptide Tribulations in GtoPdb
Ā 

More from Chris Southan

Vicissitudes of target validation for BACE1 and BACE2
Vicissitudes of target validation for BACE1 and BACE2 Vicissitudes of target validation for BACE1 and BACE2
Vicissitudes of target validation for BACE1 and BACE2
Chris Southan
Ā 
In silico 360 Analysis for Drug Development
In silico 360 Analysis for Drug DevelopmentIn silico 360 Analysis for Drug Development
In silico 360 Analysis for Drug Development
Chris Southan
Ā 
Quality and noise in big chemistry databases
Quality and noise in big chemistry databasesQuality and noise in big chemistry databases
Quality and noise in big chemistry databases
Chris Southan
Ā 

More from Chris Southan (20)

FAIR connectivity for DARCP
FAIR  connectivity for DARCPFAIR  connectivity for DARCP
FAIR connectivity for DARCP
Ā 
Connectivity > documents > structures > bioactivity
Connectivity > documents > structures > bioactivityConnectivity > documents > structures > bioactivity
Connectivity > documents > structures > bioactivity
Ā 
Peptide tribulations
Peptide tribulationsPeptide tribulations
Peptide tribulations
Ā 
Vicissitudes of target validation for BACE1 and BACE2
Vicissitudes of target validation for BACE1 and BACE2 Vicissitudes of target validation for BACE1 and BACE2
Vicissitudes of target validation for BACE1 and BACE2
Ā 
Guide to Pharmacology database: ELIXIR updae
Guide to Pharmacology database: ELIXIR updaeGuide to Pharmacology database: ELIXIR updae
Guide to Pharmacology database: ELIXIR updae
Ā 
In silico 360 Analysis for Drug Development
In silico 360 Analysis for Drug DevelopmentIn silico 360 Analysis for Drug Development
In silico 360 Analysis for Drug Development
Ā 
Will the correct BACE ORFs please stand up?
Will the correct BACE ORFs please stand up?Will the correct BACE ORFs please stand up?
Will the correct BACE ORFs please stand up?
Ā 
Desperately seeking DARCP
Desperately seeking DARCPDesperately seeking DARCP
Desperately seeking DARCP
Ā 
Seeking glimmers of light in Pharos ā€œTdarkā€ proteins
Seeking glimmers of light in  Pharos ā€œTdarkā€ proteinsSeeking glimmers of light in  Pharos ā€œTdarkā€ proteins
Seeking glimmers of light in Pharos ā€œTdarkā€ proteins
Ā 
5HT2A modulators update for SAFER
5HT2A modulators update for SAFER5HT2A modulators update for SAFER
5HT2A modulators update for SAFER
Ā 
Quality and noise in big chemistry databases
Quality and noise in big chemistry databasesQuality and noise in big chemistry databases
Quality and noise in big chemistry databases
Ā 
Connecting chemistry-to-biology
Connecting chemistry-to-biology Connecting chemistry-to-biology
Connecting chemistry-to-biology
Ā 
GtoPdb June 2019 poster
GtoPdb June 2019 posterGtoPdb June 2019 poster
GtoPdb June 2019 poster
Ā 
PubChem as a source of systems biology perturbagens
PubChem as a source of  systems biology perturbagensPubChem as a source of  systems biology perturbagens
PubChem as a source of systems biology perturbagens
Ā 
Will the real proteins please stand up
Will the real proteins please stand upWill the real proteins please stand up
Will the real proteins please stand up
Ā 
Peptide Tribulations
Peptide TribulationsPeptide Tribulations
Peptide Tribulations
Ā 
Looking at chemistry - protein - papers connectivity in ELIXIR
Looking at chemistry - protein - papers connectivity in ELIXIRLooking at chemistry - protein - papers connectivity in ELIXIR
Looking at chemistry - protein - papers connectivity in ELIXIR
Ā 
Patents in PubChem
Patents in PubChemPatents in PubChem
Patents in PubChem
Ā 
Pub Med to PubChem Connectivity
Pub Med to PubChem ConnectivityPub Med to PubChem Connectivity
Pub Med to PubChem Connectivity
Ā 
The IUPHAR/MMV Guide to Malaria Pharmacology
The  IUPHAR/MMV Guide to Malaria Pharmacology  The  IUPHAR/MMV Guide to Malaria Pharmacology
The IUPHAR/MMV Guide to Malaria Pharmacology
Ā 

Recently uploaded

Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Dipal Arora
Ā 
VIP Service Call Girls Sindhi Colony šŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony šŸ“³ 7877925207 For 18+ VIP Call Girl At Th...VIP Service Call Girls Sindhi Colony šŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony šŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
jageshsingh5554
Ā 
Call Girls in Gagan Vihar (delhi) call me [šŸ” 9953056974 šŸ”] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [šŸ”  9953056974 šŸ”] escort service 24X7Call Girls in Gagan Vihar (delhi) call me [šŸ”  9953056974 šŸ”] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [šŸ” 9953056974 šŸ”] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
Ā 

Recently uploaded (20)

Book Paid Powai Call Girls Mumbai š– ‹ 9930245274 š– ‹Low Budget Full Independent H...
Book Paid Powai Call Girls Mumbai š– ‹ 9930245274 š– ‹Low Budget Full Independent H...Book Paid Powai Call Girls Mumbai š– ‹ 9930245274 š– ‹Low Budget Full Independent H...
Book Paid Powai Call Girls Mumbai š– ‹ 9930245274 š– ‹Low Budget Full Independent H...
Ā 
Top Rated Bangalore Call Girls Richmond Circle āŸŸ 9332606886 āŸŸ Call Me For Ge...
Top Rated Bangalore Call Girls Richmond Circle āŸŸ  9332606886 āŸŸ Call Me For Ge...Top Rated Bangalore Call Girls Richmond Circle āŸŸ  9332606886 āŸŸ Call Me For Ge...
Top Rated Bangalore Call Girls Richmond Circle āŸŸ 9332606886 āŸŸ Call Me For Ge...
Ā 
Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 8250077686 Top Class Call Girl Service Available
Ā 
Call Girls Gwalior Just Call 8617370543 Top Class Call Girl Service Available
Call Girls Gwalior Just Call 8617370543 Top Class Call Girl Service AvailableCall Girls Gwalior Just Call 8617370543 Top Class Call Girl Service Available
Call Girls Gwalior Just Call 8617370543 Top Class Call Girl Service Available
Ā 
Top Rated Bangalore Call Girls Mg Road āŸŸ 9332606886 āŸŸ Call Me For Genuine S...
Top Rated Bangalore Call Girls Mg Road āŸŸ   9332606886 āŸŸ Call Me For Genuine S...Top Rated Bangalore Call Girls Mg Road āŸŸ   9332606886 āŸŸ Call Me For Genuine S...
Top Rated Bangalore Call Girls Mg Road āŸŸ 9332606886 āŸŸ Call Me For Genuine S...
Ā 
Call Girls Bareilly Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Bareilly Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Bareilly Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Bareilly Just Call 8250077686 Top Class Call Girl Service Available
Ā 
Call Girls Agra Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Agra Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Agra Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Agra Just Call 8250077686 Top Class Call Girl Service Available
Ā 
Call Girls Bangalore Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Bangalore Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Bangalore Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Bangalore Just Call 8250077686 Top Class Call Girl Service Available
Ā 
Russian Call Girls Service Jaipur {8445551418} ā¤ļøPALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service  Jaipur {8445551418} ā¤ļøPALLAVI VIP Jaipur Call Gir...Russian Call Girls Service  Jaipur {8445551418} ā¤ļøPALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service Jaipur {8445551418} ā¤ļøPALLAVI VIP Jaipur Call Gir...
Ā 
VIP Service Call Girls Sindhi Colony šŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony šŸ“³ 7877925207 For 18+ VIP Call Girl At Th...VIP Service Call Girls Sindhi Colony šŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony šŸ“³ 7877925207 For 18+ VIP Call Girl At Th...
Ā 
Call Girls Kochi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Kochi Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Kochi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Kochi Just Call 8250077686 Top Class Call Girl Service Available
Ā 
Pondicherry Call Girls Book Now 9630942363 Top Class Pondicherry Escort Servi...
Pondicherry Call Girls Book Now 9630942363 Top Class Pondicherry Escort Servi...Pondicherry Call Girls Book Now 9630942363 Top Class Pondicherry Escort Servi...
Pondicherry Call Girls Book Now 9630942363 Top Class Pondicherry Escort Servi...
Ā 
Night 7k to 12k Navi Mumbai Call Girl Photo šŸ‘‰ BOOK NOW 9833363713 šŸ‘ˆ ā™€ļø night ...
Night 7k to 12k Navi Mumbai Call Girl Photo šŸ‘‰ BOOK NOW 9833363713 šŸ‘ˆ ā™€ļø night ...Night 7k to 12k Navi Mumbai Call Girl Photo šŸ‘‰ BOOK NOW 9833363713 šŸ‘ˆ ā™€ļø night ...
Night 7k to 12k Navi Mumbai Call Girl Photo šŸ‘‰ BOOK NOW 9833363713 šŸ‘ˆ ā™€ļø night ...
Ā 
Call Girls Faridabad Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Faridabad Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Faridabad Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Faridabad Just Call 9907093804 Top Class Call Girl Service Available
Ā 
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Ā 
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Ā 
Best Rate (Guwahati ) Call Girls Guwahati āŸŸ 8617370543 āŸŸ High Class Call Girl...
Best Rate (Guwahati ) Call Girls Guwahati āŸŸ 8617370543 āŸŸ High Class Call Girl...Best Rate (Guwahati ) Call Girls Guwahati āŸŸ 8617370543 āŸŸ High Class Call Girl...
Best Rate (Guwahati ) Call Girls Guwahati āŸŸ 8617370543 āŸŸ High Class Call Girl...
Ā 
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
Ā 
Call Girls in Gagan Vihar (delhi) call me [šŸ” 9953056974 šŸ”] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [šŸ”  9953056974 šŸ”] escort service 24X7Call Girls in Gagan Vihar (delhi) call me [šŸ”  9953056974 šŸ”] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [šŸ” 9953056974 šŸ”] escort service 24X7
Ā 
Best Rate (Hyderabad) Call Girls Jahanuma āŸŸ 8250192130 āŸŸ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma āŸŸ 8250192130 āŸŸ High Class Call Girl...Best Rate (Hyderabad) Call Girls Jahanuma āŸŸ 8250192130 āŸŸ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma āŸŸ 8250192130 āŸŸ High Class Call Girl...
Ā 

Evolving consensus-based curatorial strategies

  • 1. www.guidetopharmacology.org Will the real drugs and targets please stand up? Evolving consensus-based curatorial strategies Chris Southan, IUPHAR/BPS Guide to PHARMACOLOGY Web portal Group, Centre for Integrative Physiology,School of Biomedical Sciences, University of Edinburgh, Hugh Robson Building, Edinburgh, EH8 9XD, UK. cdsouthan@hotmail.com Presented to the Gloriam/GPCRDB Team and the Dept. of Pharmaceutical Sciences, University of Copenhagen, 6th May 2014 1
  • 2. GToPdb: receptors, ligands, targets and drugs ā€¢ An expert-curated database overseen by the IUPHAR Nomenclature Committee (NC-IUPHAR) ā€¢ >70 subcommittees comprising ~700 international scientists working on individual target families. ā€¢ 4 full-time curators, 1 part-time admin, 1 developer. ā€¢ NC-IUPHAR publishes nomenclature recommendations and reviews on various topics in pharmacological journals and through the IUPHAR database. ā€¢ Subcommittees update their database pages annually. ā€¢ Continuously expanding to incorporate new data types, new targets and ligands and new domain committees ā€¢ Public database releases every 3-4 months
  • 6. Wellcome Trust Grant 099156/Z/12/Z
      • Key objective: “encompass all the human targets of current prescription medicines and the likely targets of future medicines”
      • Conceptually familiar from our established receptor/channel-centric database
      • But needed to re-define curatorial approaches, caveats and end-points
      • Balance between theoretical rigour and pragmatic utility
      • Four foci: grant fulfilment, user value, data mining, data consumption
      • Discuss and document changes in curatorial strategies with practical guidelines
      • Add enhancements, new relationships and features
      • Control activity-mapping stringencies and relationship distributions
      • QC legacy content, harmonise and remediate where necessary
      • Aim for small, but perfectly-formed, data content vs. complete coverage
  • 7. Technical implementation
      • Restrict relationships to citable/provenanced quantitative mappings (typically IC50, Ki, Kd)
      • Formally tag data-supported “primary targets”
      • Only data-supported polypharmacology
      • Mask nutraceuticals, metabolites or endogenous hormones from bloating drug > target relationship space
      • Limit drug > multiple subunit mappings to direct interactions
      • Normalise targets to UniProt IDs and Swiss-Prot for human
      • Normalise drugs and ligands to PubChem compound records (CIDs)
      • Extend useful relationships, e.g. drug > prodrug, drug > active metabolite, ligand = target (antibody > cytokine)
      • Flexibility to handle edge cases (e.g. heparinoids)
      • Options for selective expansion (e.g. kinases, proteases and Alzheimer’s)
  • 8. Defining limits for curation
      • The good news: capture of targets and drugs in databases and literature reports is continuously expanding
      • The bad news: no one agrees on numbers, relationship definitions, curatorial rules, identifiers, exact molecular structures, choices of primary sources or provenance attribution
      • More bad news: source proliferation < “circular” annotation
      • Human target range: 186 approved drugs in 2006 (PMID:17139284) < 3,044 in ChEMBL_18
      • Approved drug ranges: 1,216 FDA Maximum Daily Dose (PubChem Assay ID 1195) < 2,750 for the NCGC Pharmaceutical Collection (PMID:21525397)
      • Outer bioactivity ranges: 8,057 INNs < 928,875 actives in PubChem BioAssays < 6.3 million from GVKBIO with SAR from papers and patents
  • 9. Evolution of our consensus strategy
      Based on many collective years of curatorial engagement and deep source knowledge, we now pursue a consensus approach for the following reasons:
      1. Concordant sources are generally more likely to be right than wrong
      2. Curatorial efficiency of starting with solid consensus sets
      3. Multiple sources are informatically synergistic (if truly independent)
      4. Approach is flexible via source updates and testing different filters
      5. We control total numbers for matching to curatorial capacity
      6. The concept can easily be explained to users
      7. The exercise of comparing sources is very informative
      8. It forces entity identifier normalisation (via cross-mapping if necessary)
      9. Consensus lists per se have value for users (e.g. hosting on website)
  • 10. Will the real targets please stand up?
      • Compared as human Swiss-Prot IDs for 2013 database releases
      • Intersect is 351; the union is 3,046 (i.e. 15% of the 20,265 human proteome)
      • Lists included approved, clinical and research targets
      Figure 7d from: “Comparing the chemical structure and protein content of ChEMBL, DrugBank, Human Metabolome Database and the Therapeutic Target Database” PMID: 24533037
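The intersect, union and proteome-coverage figures on slide 10 come down to set operations over normalised identifiers. A minimal sketch, assuming each source's target list has already been mapped to human Swiss-Prot accessions; `compare_sources`, `proteome_coverage`, the source names and the `ACC…` identifiers are illustrative placeholders, not real database content:

```python
from functools import reduce


def compare_sources(sources: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (intersect, union) of the identifier sets of all sources."""
    sets = list(sources.values())
    if not sets:
        raise ValueError("need at least one source")
    intersect = reduce(set.intersection, sets)
    union = reduce(set.union, sets)
    return intersect, union


def proteome_coverage(n_targets: int, proteome_size: int) -> float:
    """Percentage of the proteome covered by n_targets."""
    return 100.0 * n_targets / proteome_size


# Made-up accessions standing in for per-source Swiss-Prot ID lists.
TOY_SOURCES = {
    "source_A": {"ACC1", "ACC2", "ACC3", "ACC4"},
    "source_B": {"ACC2", "ACC3", "ACC5"},
    "source_C": {"ACC2", "ACC3", "ACC4", "ACC7"},
}

toy_intersect, toy_union = compare_sources(TOY_SOURCES)

# The slide's own numbers: a union of 3,046 against a 20,265-protein proteome.
slide_coverage = proteome_coverage(3046, 20265)  # roughly 15, as quoted
```

The same function works for any number of sources, which makes it easy to repeat the comparison after a source update.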
  • 11. Gene Ontology comparison indicates source selectivity
  • 12. Use a target consensus to populate the database
      • ChEMBL 17: 252 approved
      • Mathias Rask-Andersen et al., July 2013: 481 approved
      • Southan et al., 2013: 3-way human DrugBank/ChEMBL/TTD, 352
      • 3-way or 2-way: 19 + 40 + 143 = 202 Targets Of Approved Drugs (TOADS) set selected for GToP upload
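The "3-way or 2-way" selection on slide 12 can be read as keeping any target that appears in at least two of three lists. A minimal sketch of that reading, assuming all lists share one identifier namespace; `consensus_targets`, the list names and the `T…` identifiers are hypothetical, and the slide does not say how the 19 + 40 + 143 split maps onto the individual overlaps:

```python
from collections import Counter


def consensus_targets(sources: dict[str, set[str]], min_sources: int = 2) -> set[str]:
    """Keep identifiers that occur in at least `min_sources` of the lists."""
    counts = Counter(acc for ids in sources.values() for acc in ids)
    return {acc for acc, n in counts.items() if n >= min_sources}


# Made-up identifiers standing in for the three target lists.
TOY_LISTS = {
    "list_A": {"T1", "T2", "T3"},
    "list_B": {"T2", "T3", "T4"},
    "list_C": {"T3", "T5"},
}

two_or_three_way = consensus_targets(TOY_LISTS, min_sources=2)
three_way_only = consensus_targets(TOY_LISTS, min_sources=3)

# The slide's own total for the selected set.
toads_total = 19 + 40 + 143
```

Raising or lowering `min_sources` is one way to control the size of the set so that it matches curatorial capacity.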
  • 13. Will the real drugs please stand up?
      • Work up the following CID triage inside PubChem
      • Select DrugBank 1504 “approved” drug structures
      • Select two additional sources, TTD and ChEMBL
      • Filter to remove salts and mixtures
      • Select synonym INN (WHO International Non-proprietary Name)
      • The final step was the Boolean intersect between all five
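The triage on slide 13 was run inside PubChem itself; the logic of the final five-way Boolean intersect can still be sketched offline. This is an illustration under stated assumptions, not the PubChem interface: `CompoundRecord`, its fields and the CIDs below are hypothetical, and the salt/mixture and INN flags are assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompoundRecord:
    cid: int                    # compound identifier (made-up values below)
    sources: frozenset[str]     # sources that submitted this structure
    is_salt_or_mixture: bool    # assumed precomputed flag
    has_inn: bool               # assumed precomputed: an INN is among the synonyms


REQUIRED_SOURCES = frozenset({"DrugBank", "TTD", "ChEMBL"})


def triage(records: list[CompoundRecord]) -> list[int]:
    """Return CIDs passing all five criteria: three sources, no salt/mixture, INN."""
    return sorted(
        r.cid
        for r in records
        if REQUIRED_SOURCES <= r.sources
        and not r.is_salt_or_mixture
        and r.has_inn
    )


TOY_RECORDS = [
    CompoundRecord(1, frozenset({"DrugBank", "TTD", "ChEMBL"}), False, True),
    CompoundRecord(2, frozenset({"DrugBank", "ChEMBL"}), False, True),         # missing TTD
    CompoundRecord(3, frozenset({"DrugBank", "TTD", "ChEMBL"}), True, True),   # salt or mixture
    CompoundRecord(4, frozenset({"DrugBank", "TTD", "ChEMBL"}), False, False), # no INN
    CompoundRecord(5, frozenset({"DrugBank", "TTD", "ChEMBL", "Other"}), False, True),
]

consensus_cids = triage(TOY_RECORDS)
```

Because the criteria are combined with a plain conjunction, dropping one source (for example a source that is no longer updated) only requires changing `REQUIRED_SOURCES`.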
  • 14. Observations and caveats
      • This set of 923 drugs can be accessed via the MyNCBI open URL http://www.ncbi.nlm.nih.gov/sites/myncbi/collections/public/1Fo7u3apR1bzS_UWr1YhHOTkZ/
      • TTD last submitted in Feb 2012, so drug content is capped to before that date (dropping TTD gives 1117 CIDs)
      • Some metabolites (e.g. amino acids) come through the filters
      • Older drugs have no INN (e.g. aspirin)
      • Some peptide drug CIDs are missing (suggesting low concordance)
      • Approved fixed-mixtures are excluded (they do not get an INN)
      • The computed CID identity is actually a hash-code match, rather than via InChIKey (but this should give similar numbers)
      • Each of the 923 had 76 submissions (SIDs)
      • Applying “same (bond) connectivity” gives 18749, but removing the virtual deuterated entries reduces this to 6919 (i.e. the 923 have, on average, 7.5 alternative stereo CIDs)
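The per-drug average quoted on slide 14 can be checked directly from the slide's own counts. A small sketch; `mean_per_drug` is a hypothetical helper and only the three numbers come from the slide:

```python
def mean_per_drug(total_cids: int, n_drugs: int) -> float:
    """Average number of same-connectivity CIDs per consensus drug."""
    return total_cids / n_drugs


N_DRUGS = 923
SAME_CONNECTIVITY_ALL = 18749    # before removing virtual deuterated entries
SAME_CONNECTIVITY_CLEAN = 6919   # after removing them

mean_before = mean_per_drug(SAME_CONNECTIVITY_ALL, N_DRUGS)
mean_after = mean_per_drug(SAME_CONNECTIVITY_CLEAN, N_DRUGS)  # about 7.5, as quoted
```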
  • 15. Closing consensus drugs > targets
      • From Phase 1 targets > drugs we have moved to Phase 2 for drugs > targets
      • Current stats = 228 TOADS (inward mapping expanded the set by ~10%)
      • Current stats = 996 approved drugs (need to complete the activity mappings)
      • Note that antibodies and larger peptides (with no PubChem CIDs) are subsumed in the 996
      • 2013 new drug CIDs loaded http://cdsouthan.blogspot.se/2014/03/the-drugs-of-2013-in-pubchem.html
      • Will back-fill 2010-2012 new approvals as ligands, targets and activities (but most already there)
  • 16. GPCRdb/GToPdb collaborative opportunity
      • Inspect which GPCRs are concordant or discordant between the target lists
      • Might be able to do a similar exercise for GPCR-active drug/compound lists, depending on what we can find with linkage (e.g. GLIDA)
      • Work up a triage for alert triggers for new GPCR ligand structures in PDB (e.g. via MMDB)
  • 17. References and Acknowledgments
      The database team: Adam Pawson, Joanna Sharman, Helen Benson, Elena Faccenda