2. • POSITION-SPECIFIC SCORING MATRICES
• A PSSM is defined as a table that contains probability information of amino acids
or nucleotides at each position of an ungapped multiple sequence alignment.
• Rows represent residue positions.
• Columns represent residue types.
• The values in the table are the log-odds scores of the residues.
• The probability values in a PSSM depend on the
number of sequences used to compile the matrix.
3. • Example of construction of a PSSM from a multiple
alignment of nucleotide sequences
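A minimal Python sketch of how such a matrix can be built is shown below; the toy alignment, the add-one pseudocounts, and the uniform 0.25 background frequency are illustrative assumptions rather than values from the slides.

```python
import math

# Toy ungapped nucleotide alignment (illustrative only)
alignment = [
    "ACGTA",
    "ACGCA",
    "ATGTA",
    "ACGTT",
]

alphabet = "ACGT"
background = 0.25        # uniform background frequency assumed for each nucleotide
n_seqs = len(alignment)

pssm = []                # one row per alignment position, one entry per residue type
for pos in range(len(alignment[0])):
    column = [seq[pos] for seq in alignment]
    row = {}
    for residue in alphabet:
        # add-one pseudocount so unobserved residues do not produce log(0)
        freq = (column.count(residue) + 1) / (n_seqs + len(alphabet))
        row[residue] = round(math.log2(freq / background), 2)   # log-odds score
    pssm.append(row)

for pos, row in enumerate(pssm, start=1):
    print(pos, row)
```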
5. PSI-BLAST
• Profiles can be used in database searching to find
remote sequence homologs
• Part of the NCBI BLAST suite.
• Position-specific iterated BLAST.
• builds profiles and performs database searches in an
iterative fashion.
• Uses a single query protein sequence to perform a normal BLASTP search and generate initial similarity hits.
• The high-scoring hits are used to build a multiple
sequence alignment.
• Then a profile is created from the alignment.
• This method uses a Markov model for score calculation.
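As a rough illustration of this iterative workflow, the sketch below drives the NCBI BLAST+ psiblast program from Python; the query file, database name, and output file names are placeholder assumptions, and a local BLAST+ installation is assumed.

```python
import subprocess

cmd = [
    "psiblast",
    "-query", "query.fasta",          # single query protein sequence
    "-db", "swissprot",               # protein database to search (placeholder name)
    "-num_iterations", "3",           # hits -> alignment -> profile -> re-search, repeated
    "-evalue", "0.001",               # threshold for including hits
    "-out_ascii_pssm", "query.pssm",  # save the profile built from the high-scoring hits
    "-out", "psiblast_hits.txt",
]
subprocess.run(cmd, check=True)
```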
6. representation of a Markov chain
Drawbacks of PSI-BLAST:
• The high sensitivity of PSI-BLAST is also its pitfall; it is associated with low selectivity caused by the false positives generated in the automated profile construction process.
• If unrelated sequences are erroneously included, the profile becomes biased.
• This problem is known as profile drift.
7. Markov Model
• describes a sequence of events that occur one after another in a chain.
• Each event determines the probability of the next event.
• Unidirectional in nature.
• Moves from one state to another with a certain probability, known as the transition probability.
• A good example of a Markov model is the signal change of traffic lights, in which the state of the current signal depends on the state of the previous signal (e.g., the green light switches on after the red light, which switches on after the yellow light).
• Biological sequences written as strings of letters can be described by Markov chains.
8. • Each letter represents a state; the states are linked together by transition probability values.
• This allows the calculation of probability values for a given residue according to the unique distribution frequencies of nucleotides or amino acids.
TYPES OF MARKOV MODELS:
1) Zero-order Markov model.
2) First-order Markov model.
3) Second-order Markov model.
4) Higher-order Markov model.
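A minimal sketch of a first-order Markov chain over DNA, in which the probability of a sequence is the product of an initial probability and the transition probabilities along the chain; all numeric values below are assumptions for illustration.

```python
# Transition probabilities P(next | previous); values are illustrative only
transition = {
    "A": {"A": 0.3, "C": 0.2, "G": 0.3, "T": 0.2},
    "C": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "G": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "T": {"A": 0.2, "C": 0.2, "G": 0.3, "T": 0.3},
}
initial = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

def sequence_probability(seq: str) -> float:
    """P(seq) = P(x1) * product of P(x_i | x_{i-1}) along the chain."""
    p = initial[seq[0]]
    for prev, curr in zip(seq, seq[1:]):
        p *= transition[prev][curr]
    return p

print(sequence_probability("ACGT"))   # 0.25 * 0.2 * 0.3 * 0.25 = 0.00375
```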
9. Hidden Markov Model
• A machine learning technique
• A discrete hill-climbing technique.
• Some non-observed factors influence the state transition calculations.
• An HMM combines two or more Markov chains with only one chain consisting of
observed states and the other chains made up of unobserved (or “hidden”) states that
influence the outcome of the observed states.
10. • The probability of going from one state to another is the transition probability.
• The probability value associated with each symbol in each state is called the emission probability.
• To calculate the total probability of a particular path of the model, both transition and
emission probabilities linking all the “hidden” as well as observed states need to be
taken into account.
• Example: using two states of a partial HMM to represent (or generate) a sequence.
11. HMM involving two interconnected Markov chains with observed and unobserved states.
12. • a character in the alignment can be in one of three types:
1) match.
2) insertion.
3) deletion.
• Matches are observed states.
• Insertions and deletions are hidden states.
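The sketch below encodes, with assumed probabilities, the transitions out of the match, insert, and delete states of a single profile-HMM column; it is a simplified illustration, not the full architecture shown in the figures.

```python
# Transitions out of the states of column 1 (M = match, I = insert, D = delete);
# the probability values are assumptions chosen so each row sums to 1.
transitions = {
    "M1": {"M2": 0.85, "I1": 0.10, "D2": 0.05},  # continue, open an insertion, or skip the next column
    "I1": {"M2": 0.60, "I1": 0.40},              # insertions can extend themselves
    "D1": {"M2": 0.70, "D2": 0.30},              # deletions can run over several columns
}

for state, outgoing in transitions.items():
    assert abs(sum(outgoing.values()) - 1.0) < 1e-9, state
```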
13. illustration of a simplified partial HMM for DNA
sequences with emission and transition probability
values. Both probability values are used to
calculate the total probability of a particular path of
the model. For example, to generate the sequence AG, the model has to progress from A in STATE 1 to G in STATE 2; the probability of this path is 0.80 × 0.40 × 0.32 ≈ 0.102. Obviously,
there are 4 × 4 = 16 different sequences this
simple model can generate. The one that has
the highest probability is AT.
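The path probability quoted above can be reproduced directly; reading 0.80 as the emission of A in STATE 1, 0.40 as the transition from STATE 1 to STATE 2, and 0.32 as the emission of G in STATE 2 is an assumption about which factor plays which role.

```python
emission_state1_A = 0.80   # probability of emitting A in STATE 1 (assumed reading)
transition_1_to_2 = 0.40   # probability of moving from STATE 1 to STATE 2 (assumed reading)
emission_state2_G = 0.32   # probability of emitting G in STATE 2 (assumed reading)

path_probability = emission_state1_A * transition_1_to_2 * emission_state2_G
print(round(path_probability, 3))   # 0.102, the probability of generating the sequence AG
```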
15. architecture of a hidden Markov model representing a multiple sequence alignment
17. Viterbi Algorithm
• Works in a similar fashion as dynamic programming for sequence alignment.
• constructs a matrix with the maximum emission probability values of all the symbols in a
state multiplied by the transition probability for that state.
• It then uses a trace-back procedure going from the lower right corner to the upper left
corner to find the path with the highest values in the matrix.
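A minimal sketch of the standard Viterbi recursion; the two-state model and every probability below are assumptions for the demonstration, not values from the slides.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path and its probability."""
    # best[t][s] = highest probability of any path that ends in state s at step t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (best[t - 1][r] * trans_p[r][s] * emit_p[s][observations[t]], r)
                for r in states
            )
            best[t][s] = prob
            back[t][s] = prev
    # trace back from the best final state to recover the full path
    last = max(best[-1], key=best[-1].get)
    best_prob = best[-1][last]
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        last = back[t][last]
        path.append(last)
    return list(reversed(path)), best_prob

# Illustrative two-state model
states = ("S1", "S2")
start_p = {"S1": 0.6, "S2": 0.4}
trans_p = {"S1": {"S1": 0.7, "S2": 0.3}, "S2": {"S1": 0.4, "S2": 0.6}}
emit_p = {"S1": {"A": 0.5, "G": 0.5}, "S2": {"A": 0.1, "G": 0.9}}

print(viterbi("AG", states, start_p, trans_p, emit_p))
```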
19. Forward Algorithm
• Constructs a matrix using the sum of the emission probabilities over states instead of the maximum, working from the upper left corner of the matrix to the lower right corner; this yields the total probability of the observed sequence summed over all paths.
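A corresponding sketch of the forward recursion on an assumed toy model; instead of taking the maximum at each step, it sums over the possible previous states.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Total probability of the observation sequence, summed over all state paths."""
    f = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        f = {
            s: sum(f[r] * trans_p[r][s] for r in states) * emit_p[s][obs]
            for s in states
        }
    return sum(f.values())

# Illustrative two-state model (same assumptions as the Viterbi sketch above)
states = ("S1", "S2")
start_p = {"S1": 0.6, "S2": 0.4}
trans_p = {"S1": {"S1": 0.7, "S2": 0.3}, "S2": {"S1": 0.4, "S2": 0.6}}
emit_p = {"S1": {"A": 0.5, "G": 0.5}, "S2": {"A": 0.1, "G": 0.9}}

print(forward("AG", states, start_p, trans_p, emit_p))
```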
Issues with HMMs:
• Limited sampling size, which causes overrepresentation of the observed characters while ignoring the unobserved characters.
• This problem is known as overfitting.
20. Tools used to build HMM profiles
• HMMER: a software package for building HMM profiles and HMM-based multiple sequence alignments, developed at Washington University in the USA. Its programs include:
• hmmbuild.
• hmmcalibrate.
• hmmemit.
• hmmsearch.
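A hedged example of how some of these programs are typically invoked; the alignment, profile, and database file names are placeholders, and a local HMMER installation is assumed.

```python
import subprocess

# hmmbuild: build an HMM profile from a multiple sequence alignment
subprocess.run(["hmmbuild", "globins.hmm", "globins.sto"], check=True)

# hmmsearch: search a sequence database with the profile
subprocess.run(["hmmsearch", "globins.hmm", "uniprot.fasta"], check=True)

# hmmemit: generate (emit) sequences from the profile
subprocess.run(["hmmemit", "globins.hmm"], check=True)
```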
21. Applications
• Human identification using gait.
• Human action recognition from time-sequential images.
• Facial expression identification from videos.
• Speech recognition.