3. Divergence of protein identifiers
2. Methods
7. References
Will the real pharmacologically significant
proteins please stand up?
1. Introduction
Even in their more contemplative moments probably few pharmacologists cogitate on
“so how many human proteins actually exist?” Nevertheless, on a practical level their
engagement with names and identifiers (IDs) for pharmacological protein targets and
disease mechanistic components is intense and includes navigating between
databases and the literature. This work addresses three important aspects of protein
equivocality that pharmacologists may be less aware of but that we encounter head-on
during curation of the IUPHAR/BPS Guide to PHARMACOLOGY [1, 2]. These are:
1. Variability in canonical counts, from 19,198 at the HUGO Gene Nomenclature
Committee (HGNC) up to 21,341 in GeneCards, indicating a surprising annotation
discordance for at least 10% of the human proteome.
2. Uncertainty of alternatively spliced (AS) protein existence. While Ensembl predicts
over 100,000 AS mRNAs, the verification of these by proteomics is 30-fold less than
expected, implying that the majority do not exist in vivo [3].
3. Evidence that some canonical Swiss-Prot (SP) entries are not the major isoform
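As a quick check on point 1, the gap between the two quoted counts can be expressed as a percentage of either total. The sketch below uses only the two figures given in the text; the variable names are illustrative and not part of any database interface.

```python
# Counts quoted in point 1 above.
hgnc_count = 19_198       # HGNC canonical count
genecards_count = 21_341  # GeneCards canonical count

difference = genecards_count - hgnc_count

# The choice of denominator is a judgment call, so both are shown.
pct_of_hgnc = 100 * difference / hgnc_count
pct_of_genecards = 100 * difference / genecards_count

print(f"difference: {difference}")
print(f"relative to HGNC: {pct_of_hgnc:.1f}%")
print(f"relative to GeneCards: {pct_of_genecards:.1f}%")
```

With either denominator the gap is at least 10%, consistent with the wording of point 1.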
Using UniProt we ascertained the 4-way intersect between SP protein IDs, HGNC Gene
Symbols, Ensembl genes and NCBI Gene IDs. The four sets were selected using
cross-reference queries from the UniProt interface. We then accessed our internal
protein statistics including the total human UniProt IDs that we had curated into GtoPdb
and those for which we had annotated data-supported and pharmacologically-relevant
ligand interactions. These were compared to the 4-way sequence set. We also counted
proteins for which UniProt had curated splice forms using the query “Alternative splicing
(KW-0025)”. We then compared these with our ligand interaction set. We also
inspected one splice form that has been annotated in GtoPdb and checked the
information in SP. To address the isoform abundance question we queried the
Annotation of principal and alternative splice isoforms (APPRIS) database to check
targets [4].
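The 4-way intersect described above is, in essence, a set operation on identifier lists. The following is a minimal sketch under stated assumptions: the accessions are placeholders, each set is assumed to hold the Swiss-Prot entries that carry a cross-reference to the named resource, and the variable names are hypothetical. It does not reproduce the actual UniProt queries.

```python
# Toy identifier sets (placeholder accessions, not real UniProt IDs).
swissprot = {"P1", "P2", "P3", "P4", "P5"}   # all human Swiss-Prot entries
with_hgnc = {"P1", "P2", "P3"}               # entries cross-referenced to HGNC
with_ensembl = {"P1", "P2", "P4"}            # entries cross-referenced to Ensembl
with_ncbi_gene = {"P1", "P2", "P4"}          # entries cross-referenced to NCBI Gene

# Entries supported by all four sources (the centre of a 4-way Venn diagram).
four_way = swissprot & with_hgnc & with_ensembl & with_ncbi_gene

# Entries found in Swiss-Prot but lacking all three cross-references.
sp_only = swissprot - (with_hgnc | with_ensembl | with_ncbi_gene)

# A curated set (hypothetical) can then be checked against the consensus.
curated = {"P1", "P3"}
outside_consensus = curated - four_way

print(sorted(four_way))
print(sorted(sp_only))
print(sorted(outside_consensus))
```

Entries that land in `outside_consensus` are the ones worth inspecting by hand, since a missing cross-reference in one source is enough to exclude an otherwise well-supported protein.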
1. Harding SD, et al. (2018) Nucl. Acids Res. 46 (Database Issue): D1091-D1106.
2. Southan C, et al. (2018) ACS Omega 3(7). PMID: 30087946.
3. Rodriguez JM, et al. (2018) Nucl. Acids Res. 46 (Database Issue): D213-D217.
4. Tress ML, et al. (2017) Trends Biochem. Sci. 42(2): 98-110.
5. Southan C (2017) F1000Res. 6: 448.
5. Protein alternative splicing
Christopher Southan, Simon D. Harding, Elena Faccenda, Adam J. Pawson and Jamie A. Davies.
IUPHAR/BPS Guide to PHARMACOLOGY, Centre for Discovery Brain Sciences, University of Edinburgh, UK
6. Discussion points
• In addition to AS touched on here, additional sources of protein equivocality and
heterogeneity include alternative initiations and post-translational modifications.
• The multiplexing of these from a canonical set of ~19,000 proteins (itself still
without a consensus) is predicted to run into the millions.
• The significance of this for pharmacology, systems biology and drug discovery is
acknowledged to be high but getting solid experimental data is difficult.
• GtoPdb users are welcome to alert us to potentially curatable papers on differential
ligand interactions related to any forms of protein heterogeneity
www.guidetopharmacology.org enquiries@guidetopharmacology.org @GuidetoPHARM
4. Comparing the consensus with GtoPdb
We especially thank all contributors, collaborators and NC-IUPHAR members
In the Venn diagram on the right, the 4-way intersect shows that these four major
global pipelines concur for fewer than 19,000 protein-coding genes. The most divergent
section is the SP-only set of 829. Inspection established that many of these are
categorised as pseudogenes by HGNC [5]. This surprising result includes some missing
genomic cross-mappings inside SP. However, the consensus is close to the HGNC count
of 19,118 (note that Ensembl and NCBI reciprocally cross-map, hence the empty sections).
Our next step was to compare the 4-way set from the comparison above (blue) with
a) all the human proteins we have entered in GtoPdb (yellow) and b) those proteins
that have a curated interaction (mostly quantitative) against one or more of the
9,405 ligands (green). The results were generally as expected in confirming that the
majority of our proteins are within the 4-way set (i.e. solidly supported). However,
the analysis was valuable in detecting minor anomalies (represented in the segments
of 5, 6 and 23). These are being followed up, but a major factor is that some of these
are missing GeneID cross-references in Swiss-Prot (i.e. they are false negatives for
the blue set).
It is difficult to find papers with solid data showing AS affecting proteins for which we
have curated ligand interactions and that may thus exhibit differential pharmacology. Many
publications indicate that AS transcription a) is widespread, b) affects the majority of the
mammalian proteome and c) is likely to be functionally important in various biological
contexts (e.g. tumours and brain tissue) even if the mechanisms are unclear.
Notwithstanding, there are major uncertainties in proving the existence of AS proteins
since they are difficult to verify in vivo. We approached this question by counting our
interaction proteins with AS sequence variants annotated in Swiss-Prot.
The results of this are shown on the right. The yellow circle indicates that 52% of
human SP entries have at least one AS protein sequence annotated. This rises slightly
to 54% in our interaction set (blue). Importantly, AS in SP is target-class specific,
rising to 70% for kinases but only 14% for GPCRs (since many are single-exon genes).
Note that Ensembl predicts considerably more potential AS sequences than SP curates.
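The per-class percentages quoted above amount to counting, within each target class, the entries that carry an AS annotation. The sketch below illustrates that calculation on invented records; the accessions, class labels and the function name `as_fraction_by_class` are hypothetical, and the toy numbers are not the poster's results.

```python
from collections import defaultdict

# Hypothetical records: (accession, target class, has an AS annotation).
records = [
    ("A1", "kinase", True),
    ("A2", "kinase", True),
    ("A3", "kinase", False),
    ("A4", "GPCR", False),
    ("A5", "GPCR", True),
    ("A6", "GPCR", False),
    ("A7", "GPCR", False),
]


def as_fraction_by_class(rows):
    """Return {target_class: percentage of entries with an AS annotation}."""
    totals = defaultdict(int)
    with_as = defaultdict(int)
    for _accession, target_class, has_as in rows:
        totals[target_class] += 1
        if has_as:
            with_as[target_class] += 1
    return {cls: 100 * with_as[cls] / totals[cls] for cls in totals}


percentages = as_fraction_by_class(records)
for cls, pct in sorted(percentages.items()):
    print(f"{cls}: {pct:.0f}% with at least one AS sequence annotated")
```

Grouping by class before dividing matters here, because a pooled percentage would hide the contrast between classes that the text highlights.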
In GtoPdb we only assign quantitative and differentially-specific AS-ligand interactions
if the papers meet our curatorial stringency. We also need evidence that data-supported
differential binding has pharmacological significance. This is challenging for
many reasons that cannot be expanded on here (but we would be pleased to discuss).
Consequently, we have only one AS entry: the interaction between protein target
2903 (claudin18) and antibody ligand 9209 (below, together with the AS first exon).
The specific case of claudin18, and its extrapolation to other AS proteins in GtoPdb,
raises the question as to which sequence may be quantitatively dominant (i.e. the
principal isoform in vivo). However, there are inherent challenges in quantifying
AS-specific peptides by mass-spec proteomics or estimating surrogate relative
abundances from transcription data. We thus chose the APPRIS database, which
uses a range of computational methods and fold coverage scores to select the most
likely principal isoform. In this case the two SP isoforms scored equally.
