2. Introduction
What is the proteome?
The proteome is the entire complement of
proteins, including the modifications
made to a particular set of proteins,
produced by an organism or system at
a particular time and under particular conditions.
It varies with time and with the distinct
requirements, or stresses, that a cell or
organism undergoes.
3. What is proteomics?
Proteomics is the large-scale study of proteins,
particularly their functions and structures.
A short list of protein modifications that might be
studied under proteomics includes:
1. phosphorylation
2. ubiquitination
3. methylation
4. acetylation
5. glycosylation
6. oxidation
7. nitrosylation, etc.
4. Why proteomics?
• Gives a better understanding of an organism than
genomics alone.
• Limitations of genomics that made proteomics a
better approach:
1. the level of transcription of a gene gives only a
rough estimate of its level of expression into a
protein.
2. many transcripts give rise to more than one
protein, through alternative splicing or alternative
post-translational modifications.
3. many proteins form complexes with other proteins
or RNA molecules, and only function in the
presence of these other molecules.
5. 4. proteins undergo post-translational modifications that
profoundly affect their activities.
5. protein degradation rate plays an important role in
protein content.
Any cell may make different sets of proteins at different
times or under different conditions. Furthermore, any
one protein can undergo a wide range of
post-translational modifications.
Proteomics is therefore a better approach than genomics alone,
but also a more complex one.
6. Branches of proteomics
Proteomics analysis
Determining proteins which are post-translationally modified
Expression proteomics
Profiling of expressed proteins using quantitative
methods
Cell mapping proteomics
Identification of protein complexes
7. Methods
1. Gel-based proteomics (2DE):
◦ Older approach.
◦ Separates proteins according to charge (isoelectric
point) in the first dimension and according to size in
the second dimension.
◦ Proteins are commonly separated using polyacrylamide gel
electrophoresis (PAGE).
◦ Identifies individual proteins in complex
samples or multiple proteins in a single sample.
8. 2. Mass spectrometry-based proteomics:
◦ Measures mass with high accuracy, even for very low-mass molecules.
◦ Proteins are cleaved into peptides with a protease
and the peptide masses are measured with a mass
spectrometer (e.g. time-of-flight, TOF).
◦ The mass spectrum of the peptides is obtained and
converted to a list of peptide masses, which is
searched against protein sequence databases (including
sequences predicted from genomes).
◦ Since each protein has a characteristic peptide mass
fingerprint, the peptide masses can identify the protein in
the database.
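The peptide mass fingerprinting steps above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine works: it assumes trypsin as the protease (cleaving after K or R unless followed by P), uses standard monoisotopic residue masses, and the two-entry `toy_database`, the function names and the 0.01 Da tolerance are all hypothetical choices made for the example.

```python
# Monoisotopic residue masses in daltons (standard values).
MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(MONOISOTOPIC[residue] for residue in peptide) + WATER


def count_matches(observed, sequence, tolerance=0.01):
    """Count observed masses that match a theoretical tryptic peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(mass - t) <= tolerance for t in theoretical)
        for mass in observed
    )


def best_match(observed, database, tolerance=0.01):
    """Return the database entry with the most matching peptide masses."""
    return max(
        database,
        key=lambda name: count_matches(observed, database[name], tolerance),
    )


# Hypothetical database and a simulated spectrum derived from "protB".
toy_database = {"protA": "AKGGR", "protB": "SSKTTR"}
observed_masses = [peptide_mass(p) for p in tryptic_digest(toy_database["protB"])]
print(best_match(observed_masses, toy_database))
```

Real search tools also account for missed cleavages, modifications, charge states and statistical scoring, none of which is modelled here.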
9. 3. Protein arrays
◦ The idea is similar to cDNA arrays.
◦ Substrate is bound to the surface of the array.
◦ Sample is introduced and binding takes place.
◦ Detection and analysis.
◦ Protein-protein, protein-DNA or protein-RNA
interactions can be analysed.
10. Applications
Identification of potential new drugs for the
treatment of diseases. This relies on genome and
proteome information to identify proteins
associated with a disease, which computer
software can then use as targets for new drugs.
Biomarkers
A number of techniques make it possible to test for proteins
produced during a particular disease, which helps to diagnose the
disease quickly.
11. Examples of biomarkers
Alzheimer's disease
In Alzheimer's disease, elevated beta-secretase generates
amyloid beta-protein. Targeting this enzyme decreases
amyloid beta-protein and may slow the progression of the
disease.
Heart disease
Standard protein biomarkers for CVD include interleukin-6,
interleukin-8, serum amyloid A protein, fibrinogen, and
troponins.
13. Introduction – Current State
Many different informational protein
databases available online
Most databases are focused on
protein identification
◦ Research community provides the data
that drives the database contents
◦ Validation of Mass Spec data
Single vs. Multiple Species Support
14. Overview of Databases
NCBI – Protein / Peptidome
Human Gene and Protein Database
(HGPD)
Human Proteinpedia / Human Protein
Reference Database (HPRD)
Dynamic Proteomics
Open Proteomics Database
Global Proteome Machine Database
Peptide Atlas
Proteomics Identifications Database
(PRIDE)
UniProt Knowledgebase
15. NCBI – Protein / Peptidome
Two databases contained in the
Entrez suite
Multi-species result sets
Protein
◦ Provides gene information pertaining to
the expressed protein queried
Peptidome
◦ Mass Spec based protein identification
database
◦ Experiment based result sets
16. Human Gene and Protein
Database (HGPD)
Several cDNA contributors, spanning
the globe
Gateway Expression System
◦ Allows for reproducible clone library.
Clones are available for purchase.
Wheat Germ Cell-free protein
synthesis
◦ Protein Expression portion of the
database. Allows for visualization of the
SDS-PAGE results.
17. Human Proteinpedia / Human
Protein Reference Database
(HPRD)
Modeled after Wikipedia
◦ Users submit and edit the data in the database
◦ Differences
Original submitter expected to provide experimental
evidence for the data
Only the original submitter can edit that specific data later.
Allows several protein features to be annotated
◦ Post-translational modification
◦ Tissue expression
◦ Cell line expression
◦ Subcellular localization
◦ Enzyme substrates
◦ Protein-protein interactions
18. Human Proteinpedia / Human
Protein Reference Database
(HPRD)
No visual protein expression data
Protein amino acid sequence given
Raw and processed mass spec files
are available as experimental
evidence
Provides links to the protein in other
databases
19. Dynamic Proteomics
Different type of database, focusing on the
dynamics of proteins treated with an anti-cancer
drug
Shows different uses for data repositories for
proteomics
◦ Not just all-encompassing data source with generic
data.
◦ Using simple databases and web front ends to make
more specific types of data available to the
community.
Also provides links to other databases
Can compare multiple sequences at once to
search the cDNA library.
20. Dynamic Proteomics
Time lapse microscopy movies that
illustrate the protein dynamics in individual
living human cancer cells in response to an
anti-cancer drug
21. Open Proteomics Database
University of Texas
Multi-species results
Smaller pool of data submitted for
query
22. Global Proteome Machine
Database
Private industry involvement
Mass Spec Validation
Protein Identification
Utilizes data from other databases
◦ Differs from the scheme of just linking to
other protein databases
23. Peptide Atlas
Seattle Proteome Center
Focused on subset of human proteins
◦ Heart, Lung, Blood
Funded by NIH
Part of the Trans-Proteomic Pipeline
software suite
24. Proteomics Identifications
Database (PRIDE)
One of the earlier proteomic
databases
European Bioinformatics Institute
Larger selection of species specific
data
Java based, available for local
deployment
25. UniProt Knowledgebase
Swiss Institute of Bioinformatics
Also curated by European
Bioinformatics Institute
Funded by NIH
◦ Forced the conversion of earlier non-
public versions to become free and open
27. ExPASy Proteomics Server
Swiss Institute of Bioinformatics tool
suite
Protein ID by amino acid sequence
Isoelectric Point Computation
Prediction of post-translational
modifications and amino acid
substitutions
Predicts protein cleavage sites
Protein identification by molecular
weight
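As a rough illustration of what an isoelectric point computation involves, the sketch below estimates the pI of a sequence by finding the pH at which its net charge is zero. It is not the ExPASy implementation: the pKa values are approximate and are an assumption of this example (published scales differ), and the function names are hypothetical.

```python
# Approximate pKa values for ionisable groups. These are an assumption of
# this sketch: published scales differ, and ExPASy tools use their own.
POSITIVE_PKA = {"N_term": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"C_term": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def net_charge(sequence, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    counts = {residue: sequence.count(residue) for residue in "KRHDECY"}
    counts["N_term"] = 1
    counts["C_term"] = 1
    positive = sum(
        counts[group] / (1 + 10 ** (ph - pka))
        for group, pka in POSITIVE_PKA.items()
    )
    negative = sum(
        counts[group] / (1 + 10 ** (pka - ph))
        for group, pka in NEGATIVE_PKA.items()
    )
    return positive - negative


def isoelectric_point(sequence, iterations=60):
    """Bisect between pH 0 and 14 for the pH where the net charge is zero."""
    low, high = 0.0, 14.0
    for _ in range(iterations):
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI is at a higher pH
        else:
            high = mid  # neutral or negative: pI is at a lower pH
    return (low + high) / 2


# A lysine-rich peptide should come out basic, an aspartate-rich one acidic.
print(round(isoelectric_point("KKKK"), 2))
print(round(isoelectric_point("DDDD"), 2))
```

Bisection works here because the net charge decreases monotonically as pH rises, so there is a single crossing point between pH 0 and 14.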
30. Future Considerations
Selection of a few ‘primary’ data
repositories
Consolidation of multiple redundant
efforts being funded by the same agency
◦ Particularly NIH
Data standards to streamline the
submission of results into multiple data
sources.
◦ Reduction of the need to perform many
searches to find information about a protein
◦ mzXML is a start, but only covers mass spec
31. Database References
NCBI
◦ Protein http://www.ncbi.nlm.nih.gov/protein/
◦ Peptidome http://www.ncbi.nlm.nih.gov/pepdome
Human Gene and Protein Database (HGPD)
◦ http://riodb.ibase.aist.go.jp/hgpd/cgi-bin/index.cgi
Human Proteinpedia
◦ http://www.humanproteinpedia.org/index_html
Human Protein Reference Database (HPRD)
◦ http://www.hprd.org/
Dynamic Proteomics
◦ http://alon-serv.weizmann.ac.il/dynamprotb/seqsrch
Open Proteomics Database
◦ http://bioinformatics.icmb.utexas.edu/OPD/
Global Proteome Machine Database
◦ http://thegpm.org
Peptide Atlas
◦ http://www.peptideatlas.org/
Proteomics Identifications Database (PRIDE)
◦ http://www.ebi.ac.uk/pride/
UniProt Knowledgebase
◦ http://www.uniprot.org/
34. Discovery of protein biomarkers
A biomarker can be defined as any laboratory measurement or
physical sign used as a substitute for a clinically meaningful end
point that measures directly how a patient feels, functions or
survives. As applied to proteomics, a biomarker is an identified
protein (or set of proteins) that is unique to a particular disease state.
Biomarkers of drug efficacy and toxicity are becoming a key need in
the drug development process.
Mass spectrometry-based proteomic technologies are well suited to
the discovery of protein biomarkers in the absence of any prior
knowledge of quantitative changes in protein levels.
The success of any biomarker discovery effort will depend upon the
quality of samples analysed, the ability to generate quantitative
information on relative protein levels and the ability to readily
interpret the data generated.
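One simple way to express "quantitative information on relative protein levels" is a log2 fold change between two conditions. The sketch below is only an illustration: the protein names, the intensities and the threshold of 1.0 (a two-fold change) are hypothetical, and a real study would use replicates and statistical testing.

```python
import math


def log2_fold_changes(control, disease):
    """Return log2(disease / control) for proteins quantified in both samples."""
    return {
        protein: math.log2(disease[protein] / control[protein])
        for protein in control
        if protein in disease and control[protein] > 0 and disease[protein] > 0
    }


def candidate_biomarkers(fold_changes, threshold=1.0):
    """Proteins whose absolute log2 fold change meets the threshold."""
    return sorted(
        protein
        for protein, change in fold_changes.items()
        if abs(change) >= threshold
    )


# Hypothetical relative intensities for three proteins.
control = {"P1": 100.0, "P2": 50.0, "P3": 80.0}
disease = {"P1": 400.0, "P2": 55.0, "P3": 20.0}

changes = log2_fold_changes(control, disease)
print(candidate_biomarkers(changes))
```

Working on a log2 scale treats a four-fold increase and a four-fold decrease symmetrically (+2 and -2), which makes up- and down-regulated proteins easier to compare.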
35. Study of Tumor Metastasis
and Cancers
Identifying proteins whose expression correlates with the
metastatic process helps to understand metastatic
mechanisms and thus facilitates the development of strategies for
therapeutic intervention and the clinical management of cancer.
Information contained within proteomic patterns has been
reported to detect ovarian, breast and prostate cancers with
sensitivities and specificities greater than 90%.
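For reference, sensitivity and specificity are computed from the counts of true and false positives and negatives. The sketch below shows the calculation on made-up labels; the function name and the example data are hypothetical and are not taken from the studies referred to above.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels (1 = disease)."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    true_neg = sum(1 for a, p in pairs if not a and not p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one diseased and one healthy sample")
    sensitivity = true_pos / (true_pos + false_neg)  # diseased correctly flagged
    specificity = true_neg / (true_neg + false_pos)  # healthy correctly cleared
    return sensitivity, specificity


# Hypothetical results for seven samples: three with cancer, four without.
actual = [1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(actual, predicted)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```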
36. Field of Neurotrauma
Neurotrauma results in complex alterations to the biological systems
within the nervous system, and these changes evolve over time.
Near-completion of the Human Genome Project has stimulated
scientists to begin looking for the next step in unraveling normal and
abnormal functions within biological systems. Consequently, there is
new focus on the role of proteins in these processes.
Proteomics is a burgeoning field that may provide a valuable
approach to evaluate the post-traumatic central nervous system
(CNS). However, the sensitivity of the tissue and the detection of
potential biomarkers are major concerns.
37. Renal disease diagnosis
Proteomics has also found significant application in studying the effects
of chemical insults on the kidney, particularly as a result of
environmental toxins, drugs and other bioactive agents.
Combining classic analytical techniques such as two-dimensional gel
electrophoresis with more sophisticated techniques such as MS and liquid
chromatography has enabled considerable progress to be made in
cataloguing and quantifying proteins present in urine and various kidney
tissue compartments in both normal and diseased physiological states.
Critical developmental tasks that still need to be accomplished are
completely defining the proteome in the various biological compartments
(e.g. tissues, serum and urine) in both health and disease, which
presents a major challenge given the dynamic range and complexity of
such proteomes; and also achieving the routine ability to accurately
and reproducibly quantify proteomic expression profiles and develop
diagnostic platforms.
38. Neurology
In neurology and neuroscience, many applications of proteomics
have involved neurotoxicology and neurometabolism, as well as
the determination of specific proteomic aspects of individual brain
areas and body fluids in neurodegeneration.
Investigation of brain protein groups in neurodegeneration, such as
enzymes, cytoskeleton proteins, chaperones, synaptosomal proteins
and antioxidant proteins, is in progress as phenotype related
proteomics.
The concomitant detection of several hundred proteins on a gel
provides sufficiently comprehensive data to determine a
pathophysiological protein network and its peripheral
representatives. An additional advantage is that hitherto unknown
proteins have been identified as brain proteins.
39. Autoantibody profiling
Proteomics technologies enable profiling of autoantibody responses
using biological fluids derived from patients with autoimmune
disease.
They provide a powerful tool to characterize autoreactive B-cell
responses in diseases including rheumatoid arthritis, multiple
sclerosis, autoimmune diabetes, and systemic lupus erythematosus.
Autoantibody profiling may serve purposes including classification of
individual patients and subsets of patients based on their
'autoantibody fingerprint', examination of epitope spreading and
antibody isotype usage, discovery and characterization of candidate
autoantigens, and tailoring antigen-specific therapy.
40. Alzheimer's disease
In Alzheimer's disease, elevated beta-secretase generates
amyloid beta-protein, which causes plaque to build up in the
patient's brain. This plaque is thought to play a role in dementia.
Targeting this enzyme decreases amyloid beta-protein and so
may slow the progression of the disease.
One procedure to test for increased amyloid beta-protein is
immunohistochemical staining, in which antibodies bind to
amyloid beta-protein antigens in tissue samples.
41. Heart disease
Heart disease is commonly assessed using several key
protein-based biomarkers. Standard protein biomarkers for CVD include
interleukin-6, interleukin-8, serum amyloid A protein, fibrinogen, and
troponins.
Cardiac troponin I (cTnI) increases in concentration within 3 to 12
hours of initial cardiac injury and can remain elevated for days after
an acute myocardial infarction (MI).
A number of commercial antibody-based assays, as well as other
methods, are used in hospitals as primary tests for acute MI.
42. Future Challenges
There is a need for biomarkers with more accurate diagnostic
capability, particularly for early-stage disease.
There is also a need to add a quality control sample on each chip
array and to normalize spectral data through commercially available or
in-house computer programs.
Another challenge that proteomics techniques face lies largely in the
application of bioinformatics, i.e. spectral data management and
analysis. The vast amount of spectral data generated demands the
implementation of advanced data management and analysis
strategies.
Finally, the obvious challenge, as stated by many investigators, is
the identification of the important proteins and peptides that
contribute to the proteomic analysis.
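As one example of the spectral normalization mentioned on this slide, the sketch below scales each spectrum by its total ion current (the sum of its intensities) so that spectra acquired at different overall signal levels become comparable. This is only one of several normalization schemes, the slides do not say which one is meant, and the function name and intensity values are hypothetical.

```python
def normalize_to_total_ion_current(intensities):
    """Scale intensities so that they sum to 1 (total ion current normalization)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum has no signal to normalize")
    return [value / total for value in intensities]


# Two hypothetical spectra with the same peak pattern but different
# overall signal, as might occur between chips or runs.
spectrum_a = [1.0, 3.0]
spectrum_b = [2.0, 6.0]

print(normalize_to_total_ion_current(spectrum_a))
print(normalize_to_total_ion_current(spectrum_b))
```

After normalization, both spectra have the same relative peak heights, so remaining differences between samples are more likely to reflect biology than acquisition conditions.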
Editor's notes
Proteomics is a systematic research approach aiming to provide the global characterization of protein expression and function under given conditions. Proteomic technology has been widely used in biomarker discovery and pathogenetic studies, including studies of tumor metastasis.
The rapid spread of proteomics technology, which principally consists of two-dimensional gel electrophoresis (2-DE) with in-gel digestion of protein spots and identification by mass spectrometry, has produced an explosion of results.