5. Needleman-Wunsch-edu.pl

The Score Matrix
----------------
Seq1 (j)         1   2   3   4   5   6   7   8   9  10
Seq2 (i)     *   C   K   H   V   F   C   R   V   C   I
   *         0  -1  -2  -3  -4  -5  -6  -7  -8  -9 -10
   1  C     -1   1   0  -1  -2  -3  -4  -5  -6  -7  -8
   2  K     -2   0   2   1   0  -1  -2  -3  -4  -5  -6
   3  K     -3  -1   1   1   0  -1  -2  -3  -4  -5  -6
   4  C     -4  -2   0   0   0  -1   0  -1  -2  -3  -4
   5  F     -5  -3  -1  -1  -1   1   0  -1  -2  -3  -4
   6  C     -6  -4  -2  -2  -2   0   2   1   0  -1  -2
   7  K     -7  -5  -3  -3  -3  -1   1   1   0  -1  -2
   8  C     -8  -6  -4  -4  -4  -2   0   0   0   1   0
   9  V     -9  -7  -5  -5  -3  -3  -1  -1   1   0   0

Each cell matrix(i,j) is filled from three neighbouring cells:
A: diagonal score = matrix(i-1,j-1) + (MIS)MATCH, where MATCH applies if substr(seq1, j-1, 1) eq substr(seq2, i-1, 1) and MISMATCH otherwise
B: up_score = matrix(i-1,j) + GAP
C: left_score = matrix(i,j-1) + GAP
matrix(i,j) takes the largest of A, B and C.
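The fill rule on this slide can be sketched in a few lines. This is a Python illustration, not the Perl script itself. The function name `nw_score_matrix` is hypothetical, and the scores MATCH = +1, MISMATCH = -1, GAP = -1 are not stated on the slide; they are an assumption inferred from the matrix values, which they reproduce.

```python
def nw_score_matrix(seq1, seq2, match=1, mismatch=-1, gap=-1):
    """Fill a Needleman-Wunsch score matrix (rows follow seq2, columns follow seq1).

    The default scores are assumptions inferred from the slide's matrix.
    """
    rows, cols = len(seq2) + 1, len(seq1) + 1
    matrix = [[0] * cols for _ in range(rows)]

    # First row and first column: leading gaps only.
    for j in range(1, cols):
        matrix[0][j] = j * gap
    for i in range(1, rows):
        matrix[i][0] = i * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # A: diagonal move, scored as a match or a mismatch.
            pair = match if seq1[j - 1] == seq2[i - 1] else mismatch
            diag_score = matrix[i - 1][j - 1] + pair
            # B: move down from the cell above (gap in seq1).
            up_score = matrix[i - 1][j] + gap
            # C: move right from the cell to the left (gap in seq2).
            left_score = matrix[i][j - 1] + gap
            matrix[i][j] = max(diag_score, up_score, left_score)
    return matrix


score_matrix = nw_score_matrix("CKHVFCRVCI", "CKKCFCKCV")
for row in score_matrix:
    print(" ".join(f"{value:3d}" for value in row))
```

The bottom-right cell holds the score of the best global alignment; recovering the alignment itself needs a traceback, which this sketch leaves out.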
14. FastA (http://www.ebi.ac.uk/fasta33/)
  Default matrix: BLOSUM50.
  Use a lower PAM or a higher BLOSUM matrix to detect closely related sequences.
  Use a higher PAM or a lower BLOSUM matrix to detect distant sequences.
  Gap opening penalty: -12 for proteins and -16 for DNA by default.
  Gap extension penalty: -2 for proteins and -4 for DNA by default.
  The larger the word length, the faster but less sensitive the search will be.
  The maximum number of scores and alignments is 100.
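To show how an opening and an extension penalty combine, here is a small Python sketch using the default values from this slide. The names `FASTA_DEFAULT_GAPS` and `gap_cost` are hypothetical, and the formula is an assumption: it charges the opening penalty once plus the extension penalty for every gapped residue. Some programs and versions instead fold the first residue into the opening penalty, so check the documentation of the version in use.

```python
# Default (gap opening, gap extension) penalties quoted on the slide.
FASTA_DEFAULT_GAPS = {
    "protein": (-12, -2),
    "dna": (-16, -4),
}


def gap_cost(length, seq_type="protein"):
    """Cost of one gap of `length` residues under an affine penalty.

    Assumed convention: opening penalty once, extension penalty per residue.
    """
    if length <= 0:
        return 0
    gap_open, gap_extend = FASTA_DEFAULT_GAPS[seq_type]
    return gap_open + length * gap_extend


for n in (1, 3, 10):
    print(n, gap_cost(n, "protein"), gap_cost(n, "dna"))
```

Under this convention one long gap is cheaper than several short ones of the same total length, which is the point of separating the two penalties.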
15. FastA Output
  Database code, hyperlinked to the SRS database at EBI
  Accession number
  Description
  Length
  initn, init1, opt and z-score, calculated during the run
  E score (expectation value): the number of hits expected to be found by chance with such a score when comparing this query to this database
  E() does not represent the % similarity
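The meaning of the expectation value can be illustrated with a simplified Python sketch. It is not how FastA fits its statistics. It assumes that every comparison against a database entry reaches the given score by chance with the same probability, and the names `expected_chance_hits`, `p_value` and `database_size` are hypothetical.

```python
def expected_chance_hits(p_value, database_size):
    """Expected number of chance hits at a given score.

    Simplifying assumption: each of the `database_size` comparisons
    reaches the score by chance with probability `p_value`.
    """
    return p_value * database_size


# The same per-comparison probability gives a different E in a larger database.
small_db = expected_chance_hits(1e-6, 100_000)
large_db = expected_chance_hits(1e-6, 10_000_000)
print(small_db, large_db)
```

Because the value is a count of expected chance hits that grows with database size, it can exceed 1 and says nothing about the percentage of identical residues.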
20. BLAST - Basic Local Alignment Search Tool
22. The big red button: "Do My Job"
  It is dangerous to hide too much of the underlying complexity from the scientists.
29-30. [Figure: score plotted against length of extension, with threshold S marked; labels: "Trim to max", "indexed*"]
  *Two non-overlapping HSPs on a diagonal within distance A
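The "trim to max" idea in this figure can be sketched in Python as an ungapped extension to the right of a hit. This is an illustration under assumptions, not BLAST's implementation: the function name `extend_right`, the +1/-1 scoring and the stopping rule (give up once the running score falls `x_drop` below the best score seen) are all hypothetical choices that do not appear on the slide.

```python
def extend_right(query, subject, q_start, s_start,
                 match=1, mismatch=-1, x_drop=3):
    """Extend an ungapped hit to the right and trim it to its maximum score.

    Returns (best_score, best_length). Scoring and x_drop are assumptions.
    """
    score = 0
    best_score = 0
    best_length = 0
    offset = 0
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += match if same else mismatch
        offset += 1
        if score > best_score:
            # New maximum: remember how far the extension has come.
            best_score, best_length = score, offset
        elif best_score - score >= x_drop:
            # Score has dropped too far below the maximum: stop extending.
            break
    # Everything after best_length is discarded ("trim to max").
    return best_score, best_length


best_score, best_length = extend_right("ACGTACGTTTTT", "ACGTACGAAAAA", 0, 0)
print(best_score, best_length)
```

A caller would then compare `best_score` with the threshold S to decide whether to keep the extended segment.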