Sequence comparison technique
Ms. Ruchi Yadav, Lecturer, Amity Institute of Biotechnology, Amity University, Lucknow (UP)
Sequence comparison technique
Pairwise alignment: local alignment (Smith-Waterman algorithm) and global alignment (Needleman-Wunsch algorithm).
Multiple alignment.
Heuristic methods: rather than struggling to find the optimal alignment, we may save a lot of time by employing heuristic algorithms. Execution time is much faster, but a heuristic may completely miss the optimal alignment. Examples: FASTA and BLAST.
Heuristic Methods: the problem with dynamic programming
A  T  T  G  A  C  T  T  A  A  G
1  1  1  1  1  1  1  1  1  1  1   G
2  2  2  2  1  1  1  1  1  1  1   G
2  2  2  2  2  2  2  2  2  2  1   A
3  3  3  3  3  3  3  3  2  2  1   T
4  4  4  4  4  4  3  3  2  2  1   C
5  5  5  5  4  4  3  3  2  2  1   G
6  5  5  5  5  4  3  3  3  2  1   A
Dynamic programming (DP) computes scores over a large area of the matrix that contributes nothing to the optimal alignment. FASTA focuses on the diagonal area.
Heuristic: a good local alignment should contain some exactly matching subsequence. FASTA focuses on this area.
Heuristic Methods: FASTA and BLAST
FASTA: the first fast sequence searching algorithm for comparing a query sequence against a database.
BLAST: Basic Local Alignment Search Tool. An improvement on FASTA in search speed, ease of use, and statistical rigor.
FASTA ALGORITHM
(a) Find runs of identical words. Identify regions shared by the two sequences that have the highest density of single identities (ktup=1) or two consecutive identities (ktup=2).
(b) Re-score using a PAM matrix. The longest diagonals are scored again using the PAM-250 matrix (or another matrix). The best scores are saved as “init1” scores.
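Step (b) can be illustrated with a minimal sketch that re-scores one diagonal and keeps its best-scoring ungapped sub-run. The names `score_pair` and `best_run_on_diagonal` are hypothetical, and a simple match/mismatch score stands in for PAM-250, so the numbers are illustrative only.

```python
def score_pair(a, b, match=2, mismatch=-1):
    # Assumption: toy scores standing in for a PAM-250 lookup.
    return match if a == b else mismatch


def best_run_on_diagonal(query, subject, offset):
    """Best-scoring ungapped run on one diagonal.

    offset = subject index - query index (0-based).
    Returns (score, (query_start, query_end)) with an exclusive end,
    or (0, None) if no run scores above zero.
    """
    best_score, best_span = 0, None
    running, run_start = 0, None
    lo = max(0, -offset)                           # first query index on the diagonal
    hi = min(len(query), len(subject) - offset)    # one past the last query index
    for i in range(lo, hi):
        if running <= 0:                           # restart the run when it stops paying off
            running, run_start = 0, i
        running += score_pair(query[i], subject[i + offset])
        if running > best_score:
            best_score, best_span = running, (run_start, i + 1)
    return best_score, best_span


print(best_run_on_diagonal("ACGTTGCA", "TTACGTAGCA", offset=2))
```

In this sketch the value returned for the best diagonal plays the role of an “init1” score.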
FASTA Algorithm “init1”  ktup=2
FASTA ALGORITHM
(c) Join segments using gaps and eliminate other segments. Long diagonals that are neighbors are joined. The score for this joined region is “initn”. This score may be lower due to the penalty for a gap.
(d) Use DP to create the optimal alignment. Construct an optimal alignment of the query sequence and the library sequence (SW algorithm). This score is reported as the optimized score.
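The joining in step (c) can be sketched as a small chaining problem: combine segments that follow one another in both sequences, and charge a penalty for each join. This is a simplified illustration, not FASTA's actual joining rule; `Segment`, `initn_score` and the constant `join_penalty` are assumptions made for the example.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    q_start: int   # start in the query (0-based)
    q_end: int     # end in the query (exclusive)
    offset: int    # subject index - query index, i.e. the diagonal
    score: int     # re-scored value of the segment


def initn_score(segments, join_penalty=3):
    """Best total score of a chain of compatible segments.

    Assumption: every join costs the same fixed penalty.
    """
    segs = sorted(segments, key=lambda s: (s.q_start, s.q_start + s.offset))
    best = []  # best[k] = best chain score ending with segs[k]
    for k, seg in enumerate(segs):
        top = seg.score
        for p in range(k):
            prev = segs[p]
            ends_before_in_query = prev.q_end <= seg.q_start
            ends_before_in_subject = prev.q_end + prev.offset <= seg.q_start + seg.offset
            if ends_before_in_query and ends_before_in_subject:
                top = max(top, best[p] + seg.score - join_penalty)
        best.append(top)
    return max(best, default=0)


segments = [Segment(0, 5, 0, 10), Segment(7, 12, 2, 8), Segment(3, 6, 10, 4)]
print(initn_score(segments))
```

With a large enough `join_penalty`, joining no longer pays and the result falls back to the best single segment.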
FASTA Alignments “initn”
FASTA algorithm: find runs of identical words. A lookup table showing the positions of each word of length k (a k-tuple) is constructed for each sequence. The relative positions of each word in the two sequences are then calculated by subtracting the position in the first sequence from that in the second. Words that have the same offset are in phase and reveal a region of alignment between the two sequences.
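A minimal sketch of the lookup table and offset calculation follows. The function names `build_lookup` and `offset_counts` are hypothetical; positions are 1-based to match the slides, and the two sequences are the query and sequence used in the deck's look-up table example.

```python
from collections import Counter, defaultdict


def build_lookup(seq, k):
    """Map each k-tuple to the list of 1-based positions where it starts."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i + 1)
    return table


def offset_counts(first, second, k):
    """Count shared k-tuples per offset (position in second - position in first)."""
    table = build_lookup(first, k)
    counts = Counter()
    for j in range(len(second) - k + 1):
        for i in table.get(second[j:j + k], ()):
            counts[(j + 1) - i] += 1
    return counts


query = "GAATTCAGTTA"
sequence = "GGATCGA"
print(dict(build_lookup(query, 1)))
print(offset_counts(query, sequence, 1).most_common(3))
```

Offsets with the highest counts correspond to the diagonals with the most shared words, which are the candidates for re-scoring.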
Look-up table
FASTA algorithm: use a look-up table
Query:    G A A T T C A G T T A  (positions 1 to 11)
Sequence: G G A T C G A
Look-up table (locations of each base in the query):
A: 2, 3, 7, 11
C: 6
G: 1, 8
T: 4, 5, 9, 10
[Figure: dot matrix of the query against the sequence]
FASTA algorithm: use dynamic programming in a restricted band around the best-scoring initial alignment to look for an alignment that scores higher than it. The width of this band is a parameter.
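The banded step can be sketched as a Smith-Waterman recurrence that fills only cells near a chosen diagonal. This is a minimal illustration under assumed scores (match 2, mismatch -1, linear gap -2); `banded_local_score`, `center` and `width` are hypothetical names, and cells outside the band are treated as zero.

```python
def banded_local_score(a, b, center, width, match=2, mismatch=-1, gap=-2):
    """Best local alignment score using only cells with |(j - i) - center| <= width.

    i indexes a and j indexes b (both 1-based inside the matrix).
    """
    prev = {}  # scores of the previous row, keyed by column j
    best = 0
    for i in range(1, len(a) + 1):
        cur = {}
        lo = max(1, i + center - width)
        hi = min(len(b), i + center + width)
        for j in range(lo, hi + 1):
            diag = prev.get(j - 1, 0) + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev.get(j, 0) + gap        # gap in b
            left = cur.get(j - 1, 0) + gap   # gap in a
            cur[j] = max(0, diag, up, left)
            best = max(best, cur[j])
        prev = cur
    return best


print(banded_local_score("ACGTTACG", "ACGTACG", center=0, width=1))
print(banded_local_score("ACGTTACG", "ACGTACG", center=0, width=0))
```

The band touches roughly (2 × width + 1) cells per row instead of a full row, which is where the saving over full DP comes from; a band that is too narrow, as in the second call, misses the gapped alignment.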
FASTA complexity, steps 1 and 2 (select the 10 best diagonal runs): let n be the length of a sequence from the database and m the length of the query. Step 1 takes O(n) time because it only uses the look-up table, and O(n) << O(mn) for m, n = 100 to 200.
FASTA complexity, partial DP: the cost depends on the size of the restricted area, which is set by the band width parameter, and is less than O(mn). Therefore FASTA is faster than full DP.
Step 1: Finding Seeds
Step 2: Re-scoring Segments, Keeping Top 10
Step 3: Eliminating Unlikely Segments
Step 4: Finding the Best Alignment
Versions of FASTA
FASTA compares a query protein sequence to a protein sequence library to find similar sequences. FASTA also compares a DNA sequence to a DNA sequence library.
TFASTA compares a query protein sequence to a DNA sequence library, after translating the DNA sequence library in all six reading frames.
FASTX and FASTY translate a query DNA sequence in all three forward reading frames and compare all three frames to a protein sequence database.
TFASTX and TFASTY compare a query protein sequence to a DNA sequence database, translating each DNA sequence in all six possible reading frames.
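To make "all six reading frames" concrete, here is a minimal sketch that translates a DNA string in the three forward frames and the three frames of its reverse complement, using the standard genetic code. The function names are hypothetical and the sketch ignores ambiguous bases; it is not how the FASTA programs themselves are implemented.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna):
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def translate(dna, frame):
    """Translate dna starting at offset frame (0, 1 or 2); a trailing partial codon is dropped."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Three forward frames followed by three reverse-complement frames."""
    reverse = reverse_complement(dna)
    return [translate(dna, f) for f in range(3)] + [translate(reverse, f) for f in range(3)]


print(six_frame_translations("ATGGCCTAA"))
```

The first three entries correspond to the forward frames used for a translated query; the full list of six corresponds to translating a DNA library entry in every frame.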
BLAST: Basic Local Alignment Search Tool (Altschul et al. 1990, 1994, 1997)
Publications: ungapped BLAST, Altschul et al., 1990; gapped BLAST and PSI-BLAST, Altschul et al., 1997.
Heuristic method for local alignment. Designed specifically for database searches. Based on the same assumption as FASTA: good alignments contain short stretches of exact matches.
Basic Local Alignment Search Tool (BLAST)
Input: query (target) sequence, either DNA, RNA or protein. Scoring scheme: gap penalties, a substitution matrix for proteins, identity/mismatch scores for DNA/RNA. Word length W: typically W=3 for proteins and W=11 for DNA/RNA.
Output: statistically significant matches.
BLAST ALGORITHM PARAMETERS
Algorithm of BLAST
There are three distinct steps: Step 1: query preprocessing; Step 2: scan the database for hits; Step 3: extension of hits.
BLAST algorithm, Step 1: query preprocessing. Create neighbourhood words for each query word. A query of length L contains at most L - w + 1 words of length w. [Figure: a query word and its neighbourhood words]
BLAST algorithm, Step 1: query preprocessing. Build the list of query words: length 3 for protein (word length 11 is used for DNA sequences).
BLAST query preprocessing: compile the list of short, high-scoring words from the query. The query word length is 3. Words scoring below the threshold are not pursued further.
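Query preprocessing can be sketched as two small functions: one lists the L - w + 1 query words, the other keeps every candidate word whose score against a query word reaches a threshold. This is a minimal illustration; `query_words`, `neighborhood` and `toy_score` are hypothetical names, and the toy score (+4 identity, -1 otherwise) is an assumption standing in for a real substitution matrix.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def query_words(query, w=3):
    """All overlapping words of length w; there are len(query) - w + 1 of them."""
    return [query[i:i + w] for i in range(len(query) - w + 1)]


def toy_score(a, b):
    # Assumption: stand-in for a substitution-matrix lookup.
    return 4 if a == b else -1


def neighborhood(word, threshold, alphabet=AMINO_ACIDS, score=toy_score):
    """Every word over the alphabet scoring at least threshold against word."""
    kept = []
    for candidate in product(alphabet, repeat=len(word)):
        total = sum(score(x, y) for x, y in zip(word, candidate))
        if total >= threshold:
            kept.append("".join(candidate))
    return kept


words = query_words("PQGEFG")
print(words)
print(len(neighborhood("PQG", threshold=7)))
```

Raising the threshold shrinks the neighbourhood (down to the word itself at the maximum score), which trades sensitivity for speed.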
BLAST algorithm, Step 2: scan the database for hits. For each word in the neighbourhood word list, identify all exact matches with the database sequences. The purpose of Steps 1 and 2 is the same as in FASTA. [Figure: Step 1 maps each query word to its neighbourhood word list; Step 2 matches the list against database sequences 1 and 2]
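The database scan can be sketched as a dictionary lookup per database word. The function name `find_hits` and the input layout (a mapping from each neighbourhood word to the query positions it came from) are assumptions made for this example.

```python
def find_hits(word_positions, db_sequences, w=3):
    """Return (sequence_name, query_position, subject_position) for every exact word match.

    word_positions: dict mapping a word to the list of 0-based query positions
    whose neighbourhood contains that word.
    db_sequences: dict mapping a sequence name to its residues.
    """
    hits = []
    for name, seq in db_sequences.items():
        for j in range(len(seq) - w + 1):
            for i in word_positions.get(seq[j:j + w], ()):
                hits.append((name, i, j))
    return hits


word_positions = {"PQG": [0], "PEG": [0], "QGE": [1]}
database = {"seq1": "AAPEGQGE", "seq2": "TTTT"}
print(find_hits(word_positions, database))
```

Each hit is only a seed: it records where a short word matched, and is passed on to the extension step.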
Step 3: Extension of the hits. Every hit that has been generated is now extended in both directions, without gaps, to determine whether it may be part of a longer segment pair with a higher score.
Step 3: Extension of the hits. HSP (high-scoring segment pair): if the extended segment pair has a score greater than or equal to S (set as a parameter of the program), it is called an HSP. MSP (maximal segment pair): in a comparison, for every sequence in the database, the best-scoring HSP is called the MSP.
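A minimal sketch of ungapped extension, the HSP cutoff S and the choice of the MSP follows. The names are hypothetical, the toy score (+4 identity, -1 otherwise) stands in for a substitution matrix, and the `x_drop` stopping rule is an assumption added so the extension terminates early; the slides do not specify a stopping rule.

```python
def toy_score(a, b):
    # Assumption: stand-in for a substitution-matrix lookup.
    return 4 if a == b else -1


def _best_gain(pairs, x_drop):
    """Best cumulative score over prefixes of pairs, and the prefix length achieving it."""
    best, gain, length = 0, 0, 0
    for n, (a, b) in enumerate(pairs, start=1):
        gain += toy_score(a, b)
        if gain > best:
            best, length = gain, n
        elif best - gain > x_drop:   # score fell too far below the best seen
            break
    return best, length


def extend_hit(query, subject, i, j, w, x_drop=5):
    """Extend the word hit query[i:i+w] / subject[j:j+w] in both directions, without gaps.

    Returns (score, query_start, query_end) with an exclusive end.
    """
    seed = sum(toy_score(query[i + k], subject[j + k]) for k in range(w))
    right_gain, right = _best_gain(zip(query[i + w:], subject[j + w:]), x_drop)
    left_gain, left = _best_gain(zip(reversed(query[:i]), reversed(subject[:j])), x_drop)
    return seed + left_gain + right_gain, i - left, i + w + right


def maximal_segment_pair(query, subject, hits, w, s_threshold):
    """Best-scoring HSP (score >= s_threshold) for one database sequence, or None."""
    extended = (extend_hit(query, subject, i, j, w) for i, j in hits)
    hsps = [segment for segment in extended if segment[0] >= s_threshold]
    return max(hsps, default=None)


print(extend_hit("KLPQGEFW", "TTLPQGEFY", i=2, j=3, w=3))
print(maximal_segment_pair("KLPQGEFW", "TTLPQGEFY", [(2, 3)], w=3, s_threshold=20))
```

Here the seed PQG grows into the segment LPQGEF; whether that segment counts as an HSP depends only on how its score compares with S.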
High-scoring segment pair (HSP)
Maximal segment pair (MSP)
Step 2: Extracting Seeds
Step 3: Finding HSPs
Step 4: Combining HSPs
BLAST
Basic BLAST
Specialized BLAST
 Search trace archives
 Find conserved domains in your sequence (cds)
 Find sequences with similar conserved domain architecture (cdart)
 Search sequences that have gene expression  profiles (GEO)

Sequence comparison techniques

  • 1. Sequence comparison technique Ms.ruchiyadavlectureramity institute of biotechnologyamity universitylucknow(up)
  • 2. Sequence comparison technique Pairwise Alignment Local Alignment(Smith WatermanAlgorithm) Global Alignment(Needleman Wunsch Algorithm) Multiple Alignment Heuristic Methods Rather than struggling to find the optimal alignment we may save a lot of time by employing heuristic algorithms Execution time is much faster May completely miss the optimal alignment FASTA and BLAST
  • 3. A T T G A C T T A A G 1 1 1 1 1 1 1 1 1 1 1 G 2 2 2 2 1 1 1 1 1 1 1 G 2 2 2 2 2 2 2 2 2 2 1 A 3 3 3 3 3 3 3 3 2 2 1 T 4 4 4 4 4 4 3 3 2 2 1 C 5 5 5 5 4 4 3 3 2 2 1 G 6 5 5 5 5 4 3 3 3 2 1 A Heuristic Methods Problem of Dynamic Programming D.P. compute the score in a lot of useless area for optimal sequence FASTA focuses on diagonal area
  • 4. Heuristic Heuristic Good local alignment should have some exact match subsequence. FASTA focus on this area
  • 5. Heuristic Methods: FASTA and BLAST FASTA First fast sequence searching algorithm for comparing a query sequence against a database. BLAST Basic Local Alignment Search Technique Improvement of FASTA: Search speed, ease of use, statistical rigor.
  • 6. FASTA ALGORITHM (a)Find runs of identical words Identify regions shared by the two sequences that have the highest density of single identities (ktup=1) or two consecutive identities(ktup=2) (b) Re-score using PAM matrix. Longest diagonals are scored again using the PAM-250 matrix (or other matrix). The best scores are saved as “init1” scores.
  • 8. FASTA ALGORITHM (c) Join segments using gaps and eliminate other segments. Longdiagonals that are neighbors are joined. The score for this joined region is“initn”. This score may be lower due to a penalty for a gap. (d) Use DP to create the optimal alignment. construct an optimal alignment of the query sequence and the library sequence (SW algorithm).This score is reported as the optimized score
  • 10. FASTA Algorithm- Find words of identical words. Lookup table showing the positions of each word of length k, or k-tuple, is constructed for each sequence. The relative positions of each word in the two sequences are then calculated by subtracting the position in the first sequence from that in the second. Words that have the same offset position are in phase and reveal a region of alignment between the two sequences.
  • 12. A T T G A C T T A A G * * G Location Q * * G 2,3,7,11 A * * * * A 6 C * * * * T 1,8 G * C * * G 4,5,9,10 T * * * * A FASTA - Algorithm - Use look-up Table Query : G A A T T C A G T T A Sequence: G G A T C G A Dot—Matrix 1 2 3 4 5 6 7 8 9 10 11 Look-up Table
  • 13. FASTA - Algorithm - Use dynamic programming in a restricted band around the best-scoring alignment to look for an alignment that scores higher than it. The width of this band is a parameter.
  • 14. FASTA - Complexity. Steps 1 and 2 (select the 10 best diagonal runs): let n be the length of a sequence from the database and m the length of the query. Step 1 takes O(n) because it only uses the look-up table, and O(n) << O(mn); typically m, n = 100 to 200.
  • 15. FASTA - Complexity. The cost of the partial D.P. depends on the size of the restricted area, which is less than O(mn). Therefore, FASTA is faster than full D.P. The width of this band is a parameter.
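
The restricted-area D.P. can be illustrated with a banded local alignment score. The sketch below is an assumption-laden toy, not FASTA's own code: the name `banded_local_score`, the scoring values (match +1, mismatch -1, linear gap -2), and the `center` and `width` parameters are all chosen here for illustration. Only cells whose diagonal lies within `width` of `center` are filled.

```python
def banded_local_score(s, t, center=0, width=2, match=1, mismatch=-1, gap=-2):
    """Best local alignment score using only cells with |(j - i) - center| <= width."""
    prev = [0] * (len(t) + 1)
    best = 0
    for i in range(1, len(s) + 1):
        # A full-length row is allocated for simplicity; only the band is filled.
        curr = [0] * (len(t) + 1)
        lo = max(1, i + center - width)
        hi = min(len(t), i + center + width)
        for j in range(lo, hi + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            curr[j] = max(0,
                          prev[j - 1] + sub,   # extend along the diagonal
                          prev[j] + gap,       # gap in t
                          curr[j - 1] + gap)   # gap in s
            best = max(best, curr[j])
        prev = curr
    return best

print(banded_local_score("ACGT", "TTACGT", center=2, width=0))  # 4
print(banded_local_score("ACGT", "TTACGT", center=0, width=0))  # 0
```

With a band of width w, each row fills at most 2w+1 cells, so the number of scored cells is roughly m(2w+1) instead of mn. The second call shows the risk named on the slides: a band placed on the wrong diagonal misses the alignment entirely.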
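  • 16. Step 1: Finding Seeds (figure; axes: sequences s and t)
  • 17. Step 2: Re-scoring Segments, Keeping Top 10 (figure; axes: sequences s and t)
  • 18. Step 3: Eliminating Unlikely Segments (figure; axes: sequences s and t)
  • 19. Step 4: Finding the Best Alignment (figure; axes: sequences s and t)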
  • 20. Versions of FASTA. FASTA compares a query protein sequence to a protein sequence library to find similar sequences; it also compares a DNA sequence to a DNA sequence library. TFASTA compares a query protein sequence to a DNA sequence library after translating the library sequences in all six reading frames. FASTX and FASTY translate a query DNA sequence in all three forward reading frames and compare all three frames to a protein sequence database. TFASTX and TFASTY compare a query protein sequence to a DNA sequence database, translating each DNA sequence in all six possible reading frames.
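
What "translating in all six reading frames" means can be shown with a short sketch. It assumes the standard genetic code and unambiguous A/C/G/T input; the function names and the frame labels (+1 to +3, -1 to -3) are illustrative choices, not the FASTA package's own interface.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order, '*' marks a stop.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frames(dna):
    """Return the three forward and three reverse-strand translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames

frames = six_frames("ATGGCCTAA")
print(frames["+1"])  # MA*
print(frames["-1"])  # LGH
```

A tool that compares a protein query to a DNA library would, in this picture, compare the query against each of the six translated strings; one that translates a DNA query forward only would use the three "+" frames.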
  • 21. BLAST: Basic Local Alignment Search Tool (Altschul et al. 1990, 1994, 1997). Publications: Ungapped BLAST, Altschul et al., 1990; Gapped BLAST and PSI-BLAST, Altschul et al., 1997. A heuristic method for local alignment, designed specifically for database searches. Based on the same assumption as FASTA: good alignments contain short stretches of exact matches.
  • 22. Basic Local Alignment Search Tool (BLAST). Input: a query (target) sequence, either DNA, RNA or protein; a scoring scheme (gap penalties, a substitution matrix for proteins, identity/mismatch scores for DNA/RNA); a word length W (typically W=3 for proteins and W=11 for DNA/RNA). Output: statistically significant matches.
  • 24. Algorithm of BLAST. There are three distinct steps, as follows. Step 1: Query preprocessing. Step 2: Scan the database for hits. Step 3: Extension of hits.
  • 25. BLAST - Algorithm. Step 1: Query preprocessing. Create neighbourhood words for each query word. A query of length L contains at most L-w+1 words of length w. (Figure: a query word and its neighbourhood words.)
  • 26. BLAST - Algorithm. Step 1: Query preprocessing. Make a list of words of length 3 for protein (word length 11 is used for DNA sequences).
  • 27. BLAST - Query preprocessing. Compile the list of short scoring words from the query. The query word length is 3. Words scoring below the threshold are not pursued further.
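
Query preprocessing can be sketched as follows. This is a toy under stated assumptions: real BLAST scores protein words with a substitution matrix, whereas `toy_score` below (+2 for identity, -1 otherwise) and the four-letter alphabet are hypothetical stand-ins chosen to keep the example small, and the threshold value 3 is arbitrary.

```python
from itertools import product

def query_words(query, w=3):
    """All overlapping words of length w; there are len(query) - w + 1 of them."""
    return [query[i:i + w] for i in range(len(query) - w + 1)]

def toy_score(a, b):
    """Hypothetical scoring, not a real substitution matrix."""
    return 2 if a == b else -1

def neighborhood(word, alphabet, score, threshold):
    """Words whose score against `word` is at least `threshold`, best first."""
    hits = []
    for cand in product(alphabet, repeat=len(word)):
        total = sum(score(a, b) for a, b in zip(word, cand))
        if total >= threshold:
            hits.append(("".join(cand), total))
    return sorted(hits, key=lambda x: (-x[1], x[0]))

words = query_words("GAATTCAGTTA", w=3)
neighbors = neighborhood("ACG", "ACGT", toy_score, threshold=3)
print(len(words))      # 9
print(neighbors[0])    # ('ACG', 6)
print(len(neighbors))  # 10
```

With this toy scoring, the word itself scores 6 and each single-mismatch variant scores 3, so a threshold of 3 keeps the word plus its nine one-mismatch neighbours and drops everything else. Raising the threshold shrinks the list, which trades sensitivity for speed.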
  • 28. BLAST - Algorithm. Step 2: Scan the database for hits. For each word in the list, identify all exact matches with the database sequences. (Figure: query word, neighbourhood word list, and database sequences 1 and 2, with Steps 1 and 2 marked.) The purpose of Steps 1 and 2 is the same as in FASTA.
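
The database scan amounts to an exact word lookup. The sketch below indexes the query words and slides a window over each database sequence; the function names, the 0-based positions, the `expand` hook (where a neighbourhood generator could be plugged in), and the two-sequence database are assumptions made for illustration.

```python
from collections import defaultdict

def build_word_index(query, w, expand=lambda word: [word]):
    """Map each (possibly expanded) word to the 0-based query positions it came from."""
    index = defaultdict(list)
    for i in range(len(query) - w + 1):
        for word in expand(query[i:i + w]):
            index[word].append(i)
    return index

def scan_database(index, database, w):
    """Return (sequence name, query position, subject position) for every exact word match."""
    hits = []
    for name, seq in database.items():
        for j in range(len(seq) - w + 1):
            for i in index.get(seq[j:j + w], ()):
                hits.append((name, i, j))
    return hits

database = {"seq1": "GGATCGA", "seq2": "CCAATTCC"}  # hypothetical database
index = build_word_index("GAATTCAGTTA", w=3)
hits = scan_database(index, database, w=3)
print(hits)  # [('seq2', 1, 2), ('seq2', 2, 3), ('seq2', 3, 4)]
```

The three hits in `seq2` all lie on the same diagonal (subject position minus query position equals 1), which is the same signal FASTA looks for in its first step.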
  • 29. Step 3: Extension of the hits. Every hit that has been generated is now extended in both directions, without gaps, to determine whether it may be part of a longer segment pair with a higher score.
  • 30. Step 3: Extension of the hits. HSP (high-scoring segment pair): if the extended segment pair has a score greater than or equal to S (set as a parameter of the program), it is called an HSP. MSP (maximal segment pair): for every sequence in the database, the best-scoring HSP in the comparison is called the MSP.
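
Ungapped extension, the HSP cutoff S, and the MSP can be sketched together. Several pieces here are assumptions not stated on the slides: the stopping rule (extension stops once the running score falls more than `x_drop` below the best score seen), the toy scoring function, and all function names.

```python
def toy_score(a, b):
    """Hypothetical scoring: +2 for identity, -1 otherwise (not a real matrix)."""
    return 2 if a == b else -1

def extend_hit(query, subject, qi, sj, w, score=toy_score, x_drop=5):
    """Extend a word hit in both directions without gaps.

    Returns (score, query start, subject start, length), 0-based.
    """
    seed = sum(score(query[qi + k], subject[sj + k]) for k in range(w))
    best, right, left = seed, 0, 0
    run, k = seed, w
    while qi + k < len(query) and sj + k < len(subject):    # extend to the right
        run += score(query[qi + k], subject[sj + k])
        if run > best:
            best, right = run, k - w + 1
        elif best - run > x_drop:                           # assumed stopping rule
            break
        k += 1
    run, k = best, 1
    while qi - k >= 0 and sj - k >= 0:                      # extend to the left
        run += score(query[qi - k], subject[sj - k])
        if run > best:
            best, left = run, k
        elif best - run > x_drop:
            break
        k += 1
    return best, qi - left, sj - left, left + w + right

def maximal_segment_pair(query, subject, hits, w, s_cutoff, score=toy_score):
    """Keep extensions scoring at least S (the HSPs) and return the best one, or None."""
    hsps = [extend_hit(query, subject, qi, sj, w, score) for qi, sj in hits]
    hsps = [h for h in hsps if h[0] >= s_cutoff]
    return max(hsps, default=None)

query, subject = "GAATTCAGTTA", "CCAATTCC"
word_hits = [(1, 2), (2, 3), (3, 4)]   # (query pos, subject pos) of 3-letter word hits
print(extend_hit(query, subject, 1, 2, w=3))                            # (10, 1, 2, 5)
print(maximal_segment_pair(query, subject, word_hits, 3, s_cutoff=8))   # (10, 1, 2, 5)
```

All three word hits extend to the same five-letter segment AATTC, so they collapse into a single HSP with score 10. Raising `s_cutoff` above 10 would leave this database sequence with no HSP at all.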
  • 33. Step 2: Extracting Seeds (figure; axes: sequences s and t)
  • 34. Step 3: Finding HSPs (figure; axes: sequences s and t)
  • 35. Step 4: Combining HSPs (figure; axes: sequences s and t)
  • 36. BLAST
  • 39. Search trace archives
  • 40. Find conserved domains in your sequence (cds)
  • 41. Find sequences with similar conserved domain architecture (cdart)
  • 42. Search sequences that have gene expression profiles (GEO)
  • 44. Search for SNPs(snp)
  • 45. Screen sequence for vector contamination (vecscreen)
  • 46. Align two (or more) sequences using BLAST (bl2seq)
  • 47. Search protein or nucleotide targets in PubChem BioAssay
  • 48. Search SRA transcript and genomic libraries
  • 49. Constraint Based Protein Multiple Alignment Tool
  • 51. Databases available on BLAST Web server
  • 52. Databases available on BLAST Web server
  • 53. Options and parameter settings available on the BLAST server