This document provides an overview of bioinformatics and some of its key applications. It describes bioinformatics as an interdisciplinary field that uses computer science, statistics, and other approaches to analyze large amounts of biological data. It notes that bioinformatics has become necessary because of the explosion of genomic data from projects like the Human Genome Project. The goals and uses it mentions include uncovering biological information from data and applications in molecular medicine, agriculture, and environmental science. The document also briefly describes structural bioinformatics, common biological databases, MASCOT database searching, and scoring schemes used in bioinformatics.
3. What is bioinformatics?
Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data.
It combines computer science, statistics, mathematics, and engineering to analyze and interpret biological data.
5. Why is bioinformatics necessary?
The need for bioinformatics has arisen from the recent explosion of publicly available genomic information, such as that resulting from the Human Genome Project.
It helps to gain a better understanding of gene analysis, taxonomy, and evolution.
It supports efficient rational drug design and reduces the time that manual drug development takes.
6. Goals of Bioinformatics
To uncover the wealth of biological information hidden in the mass of sequence, structure, literature, and other biological data.
It is used now, and will be for the foreseeable future, in molecular medicine.
It has environmental benefits, such as identifying bacteria that can clean up waste.
In agriculture, it can be used to produce high-yield, low-maintenance crops.
7. Where does bioinformatics help?
In experimental molecular biology
In genetics and genomics
In generating biological data
Analysis of gene and protein expression
Comparison of genomic data
Understanding of evolutionary relationships
Understanding biological pathways and networks in systems biology
In simulation and modeling of DNA, RNA, and protein
9. Structural Bioinformatics
Prediction of structure from sequence
◦ secondary structure
◦ homology modelling, threading
◦ ab initio 3D prediction
Analysis of 3D structure
◦ structure comparison/ alignment
◦ prediction of function from structure
◦ molecular mechanics/ molecular dynamics
◦ prediction of molecular interactions, docking
Structure databases (RCSB)
10. List of Databases
DNA Data Bank of Japan (National Institute of Genetics)
EMBL (European Bioinformatics Institute)
GenBank (National Center for Biotechnology Information)
UniProt Universal Protein Resource (EBI, Swiss Institute of Bioinformatics, PIR)
Swiss-Prot Protein Knowledgebase (Swiss Institute of Bioinformatics)
National Center for Biotechnology Information (NCBI), NLM, USA
11. MASCOT Search
Simple MS – measures the molecular weights of the peptides in a mixture.
MS/MS (tandem MS) – yields sequence and structural information by recording the fragment ion spectrum of a peptide.
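To illustrate what a fragment ion spectrum contains, here is a minimal Python sketch that computes the theoretical singly charged b- and y-ion series of a peptide. The function name `fragment_ions` and the example peptide are illustrative assumptions, not part of Mascot. The residue masses are standard monoisotopic values rounded to five decimals.

```python
# Monoisotopic residue masses in daltons (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O
PROTON = 1.00728   # mass of a proton (charge carrier)


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[k] is the b-ion holding the first k+1 residues; y_ions[k] is the
    complementary y-ion holding the remaining residues.
    """
    masses = [MONO[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)          # N-terminal fragment
        y_ions.append(sum(masses[i:]) + WATER + PROTON)  # C-terminal fragment
    return b_ions, y_ions


# Illustrative peptide, not taken from the slides.
b, y = fragment_ions("GAK")
print("b ions:", [round(m, 4) for m in b])
print("y ions:", [round(m, 4) for m in y])
```

Each b-ion and its complementary y-ion together account for the whole peptide, which is why the ladder of fragment masses carries sequence information.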
12. PURPOSE OF MS:
Elemental composition.
Masses of particles and of molecules.
Identify unknown compounds.
Isotopic Composition.
13. Mascot is a software package from Matrix Science (www.matrixscience.com) that interprets mass spectral data into protein identities.
It uses mass spectrometry data to identify proteins from primary sequence databases.
The experimental mass values are compared with peptide masses calculated by applying cleavage rules to the entries in a comprehensive primary sequence database.
If the unknown protein is present in the database, the search retrieves that exact entry; otherwise, it pulls out the entries that exhibit the closest homology, often equivalent proteins from related species.
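The comparison of experimental and calculated peptide masses can be sketched in Python as follows. This is a simplified illustration that only counts matching masses; it is not the MOWSE or Mascot scoring. The toy database, the entry names, and the functions `tryptic_digest`, `peptide_mass`, and `rank_entries` are hypothetical. The cleavage rule assumed here is the common trypsin rule: cut after K or R unless the next residue is P.

```python
# Monoisotopic residue masses in daltons (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER


def rank_entries(experimental, database, tol=0.01):
    """Rank database entries by how many experimental masses they explain."""
    hits = []
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
        matched = sum(
            any(abs(m - t) <= tol for t in theoretical) for m in experimental
        )
        hits.append((name, matched))
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Hypothetical toy database and observed masses, for illustration only.
TOY_DATABASE = {"toy_protein_1": "AGKSVR", "toy_protein_2": "GGKAAR"}
observed = [274.164, 360.212]
print(rank_entries(observed, TOY_DATABASE))
```

A real search engine replaces the raw match count with a statistical score, so that a handful of chance matches against a large database is not mistaken for an identification.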
15. Algorithm used
Mascot is based on the MOWSE algorithm; it also evaluates the probability that experimental and theoretical peptide masses match at random.
16. Two Mascot Choices
Matrix Science offers two choices for users:
A free, open-access web-based system for occasional (1-10) queries.
A locally installed version for heavy use or high-throughput MS (hundreds of queries per day).
20. Parameters used in database searching
Database searched
Taxonomy
Enzyme
Missed cleavages
Fixed versus variable modifications (PTMs)
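As a rough illustration of how these parameters shape a search, the Python sketch below groups them in a small container and shows the effect of the missed cleavages setting on the list of candidate peptides. The class `SearchParameters`, its field names, its default values, and the function `with_missed_cleavages` are hypothetical and do not reproduce Mascot's own parameter names.

```python
from dataclasses import dataclass, field


@dataclass
class SearchParameters:
    """Hypothetical container for the search settings listed above."""
    database: str = "SwissProt"
    taxonomy: str = "All entries"
    enzyme: str = "Trypsin"
    missed_cleavages: int = 1
    fixed_modifications: list = field(
        default_factory=lambda: ["Carbamidomethyl (C)"]
    )
    variable_modifications: list = field(
        default_factory=lambda: ["Oxidation (M)"]
    )


def with_missed_cleavages(fragments, max_missed):
    """Join adjacent fully cleaved fragments, allowing up to max_missed
    uncut sites per peptide."""
    peptides = []
    last = len(fragments) - 1
    for i in range(len(fragments)):
        for j in range(i, min(i + max_missed, last) + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


params = SearchParameters()
fragments = ["AK", "GR", "SK"]  # illustrative fully cleaved peptides
print(with_missed_cleavages(fragments, params.missed_cleavages))
```

Allowing more missed cleavages, or more variable modifications, enlarges the set of candidate peptides, which makes the search more tolerant of incomplete digestion but also increases the chance of random matches.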
21. SCORING SCHEMES
PROBABILITY BASED SCORING
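Mascot incorporates a probability-based implementation of the MOWSE algorithm.
The total score is derived from the absolute probability P that the observed match is a random event; Mascot reports it as -10·log10(P), so a higher score means a less likely random match.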
Advantages:
Different types of matching (peptide masses and fragment ions) can be combined in a single search.
Scores from different searches and on different databases can be compared.
Search parameters can be optimised more readily by iteration.
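Mascot reports the probability P of a random match on a logarithmic scale, as -10·log10(P). The short Python sketch below converts between the two; the function names are illustrative assumptions, and the probabilities are example values rather than results from a real search.

```python
import math


def probability_to_score(p):
    """Convert the probability of a random match into a -10*log10(P) score."""
    return -10 * math.log10(p)


def score_to_probability(score):
    """Invert the conversion: recover the probability from a score."""
    return 10 ** (-score / 10)


# Example probabilities, chosen only to show the scale.
for p in (0.05, 1e-3, 1e-5):
    print(f"P = {p:g}  ->  score = {probability_to_score(p):.1f}")
```

Because the scale is logarithmic, each additional 10 points of score corresponds to a tenfold lower probability that the match arose by chance.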