A T A C G T T C A G A T T G C C A G C A C G T T

Algorithms & Complexity Evaluation

Fast Search in
Biological Sequences
using Multiple Hash
Functions
We are going to deal with a very tiny alphabet
representing nucleotides in a genetic sequence.

ADENINE
THYMINE

Searching a sequence for
multiple patterns.

GUANINE
CYTOSINE

After verifying matches,
advance the window: pos++

search window

T G A G C
A G G C A
T G T C G

patterns to
search

T G A G C

shift window
by 1 position

A T G A C G A C T

A G G C A
T G T C G

A T G A C G A C T

Grasping the
problem
string matching???
what’s this?

DNA sequence

Presentation by Simone Tino - All rights reserved. Authored from November 2012 to December 2012 - University of Catania - Faculty of Computer Science - Algoritmi e Complessità
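The sliding-window idea from the slide above (check every pattern at the current position, then advance by one) can be sketched in Python. This is a minimal illustration rather than the presenter's code: the function name `naive_multi_search` is hypothetical, while the text and the three patterns are the ones shown on the slide.

```python
def naive_multi_search(text, patterns):
    """Return (position, pattern) pairs for every occurrence in text.

    The window always advances by exactly one position (pos++),
    so every pattern is compared at every position of the text.
    """
    hits = []
    for pos in range(len(text)):
        for p in patterns:
            # str.startswith(p, pos) compares p against text[pos:pos+len(p)]
            if text.startswith(p, pos):
                hits.append((pos, p))
    return hits


# Text and patterns taken from the slide.
hits = naive_multi_search("ATGACGACT", ["TGAGC", "AGGCA", "TGTCG"])
print(hits)
```

Because the window never moves by more than one position, the work grows with the text length times the total pattern length, which is the cost the shift tables on the following slides try to avoid.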
a pattern
NOT A TEXT!!!

now... a text!

T G A G C A C T G
q-gram size: q = 3

T G A G C A C T G
extracting the first q-gram

T G A

First we have the
pre-processing stage...

F[HASH('CTG')] = patterns[cur]
feeding the hash function
with the extracted q-gram,
hash is returned:
0 <= hash <= MAX

HASH ( T G A ) = #@!*$%£&?
calculated hash is
used as index in
shift array

value used to
shift the window

sh[ #@!*$%£&? ] = shift
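The pre-processing stage described above can be sketched in Python as follows. Several details are assumptions, because the slide does not give them: the 2-bit nucleotide encoding, the value of `MAX`, the names `qgram_hash` and `preprocess`, and the default shift `m - q + 1` for q-grams that occur in no pattern (the usual Wu-Manber choice). The per-position shift `m - q - i` is the one used later in the deck.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit encoding per nucleotide
MAX = (1 << 16) - 1                       # assumed upper bound: 0 <= hash <= MAX


def qgram_hash(gram):
    """Pack the q-gram into an integer, 2 bits per character, clipped to MAX."""
    h = 0
    for ch in gram:
        h = ((h << 2) | CODE[ch]) & MAX
    return h


def preprocess(patterns, q):
    """Build the shift array sh and the pattern table F."""
    m = min(len(p) for p in patterns)  # window size = shortest pattern length
    sh = [m - q + 1] * (MAX + 1)       # default: q-gram occurs in no pattern
    F = {}
    for p in patterns:
        for i in range(m - q + 1):
            h = qgram_hash(p[i:i + q])
            # distance from the end of this q-gram to the end of the window
            sh[h] = min(sh[h], m - q - i)
        # patterns are indexed by the hash of their last q-gram in the window
        F.setdefault(qgram_hash(p[m - q:m]), []).append(p)
    return m, sh, F


m, sh, F = preprocess(["TGAGCACTG"], q=3)
print(sh[qgram_hash("TGA")], sh[qgram_hash("CTG")])
```

For the slide's pattern `TGAGCACTG` with q = 3, the first q-gram `TGA` gets shift 6 and the last q-gram `CTG` gets shift 0, so `CTG` is the key under which the pattern is stored in `F`.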

Let's talk
about
Wu &
Manber
don’t worry!
It’s not a
magic spell...
it’s just an
algorithm

A G T C C T G T A A A G A G G A C C T C T G A C G T G G G G T C C A A T G G G C A C A C

Then we
can move
to real
search...

window size =
pattern size = m

A C A A C T G G C
extracting the
last q-gram only

G G C

hash function gets the
q-gram, hash returned:
0 <= hash <= MAX

HASH ( G G C ) = ^@!*%£$?#
(the hash is the shift index)

shift = sh[ ^@!*%£$?# ]

shift == 0? If true: NAIVE CHECK
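The search loop described above can be sketched as a self-contained Python program. It is an illustration under stated assumptions, not the presenter's implementation: the 2-bit encoding, `MAX`, the default shift `m - q + 1`, and the names `qgram_hash`, `preprocess` and `wm_search` are all hypothetical. Only the last q-gram of each window is hashed, and the naive check runs only when the looked-up shift is 0.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit encoding
MAX = (1 << 16) - 1                       # assumed hash range


def qgram_hash(gram):
    h = 0
    for ch in gram:
        h = ((h << 2) | CODE[ch]) & MAX
    return h


def preprocess(patterns, q):
    m = min(len(p) for p in patterns)
    sh = [m - q + 1] * (MAX + 1)
    F = {}
    for p in patterns:
        for i in range(m - q + 1):
            h = qgram_hash(p[i:i + q])
            sh[h] = min(sh[h], m - q - i)
        F.setdefault(qgram_hash(p[m - q:m]), []).append(p)
    return m, sh, F


def wm_search(text, patterns, q):
    """Return (position, pattern) pairs, scanning with a window of size m."""
    m, sh, F = preprocess(patterns, q)
    hits = []
    pos = 0  # left edge of the window
    while pos + m <= len(text):
        h = qgram_hash(text[pos + m - q:pos + m])  # last q-gram only
        shift = sh[h]
        if shift == 0:
            # candidate position: verify each pattern stored under this hash
            for p in F.get(h, ()):
                if text.startswith(p, pos):
                    hits.append((pos, p))
            pos += 1
        else:
            pos += shift
    return hits


patterns = ["TGAGC", "AGGCA", "TGTCG"]
print(wm_search("ATGAGCAGGCATTGTCG", patterns, q=3))
```

The three patterns are the ones from the earlier slide; the text here is made up so that each pattern occurs once.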

T G A

k = Math.floor(w/q);

[Figure: bit strings with spans labelled k and w]

W-M limit: cannot increase them both...

Decrease number of false positives

Increase q: more text to analyze

Increase k: more bits per char
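The trade-off on this slide can be made concrete with a small Python sketch. It assumes that w is the number of bits available for the hash value and that each of the q characters contributes its k lowest bits; the slide does not spell this out, so both that reading and the names `bits_per_char` and `packed_hash` are assumptions.

```python
def bits_per_char(w, q):
    """k = floor(w / q): bits each character may contribute to a w-bit hash."""
    return w // q


def packed_hash(gram, w):
    """Pack the k low bits of each character code into one integer below 2**w."""
    k = bits_per_char(w, len(gram))
    mask = (1 << k) - 1
    h = 0
    for ch in gram:
        h = (h << k) | (ord(ch) & mask)
    return h


# With w fixed, a larger q leaves fewer bits per character.
for q in (3, 4, 8):
    print(q, bits_per_char(16, q))

# With k = 2 the low bits of 'C' (0b1000011) and 'G' (0b1000111) coincide,
# so different q-grams collide and cause false positives.
print(packed_hash("CCCCCCCC", 16) == packed_hash("GGGGGGGG", 16))
```

With w = 16, choosing q = 3 gives k = 5 bits per character and keeps the four nucleotides distinct, while q = 8 gives k = 2 and, under this particular bit selection, maps C and G to the same value.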

Enhancing
W-M...

T G A G C A C T G

γ =1

γ =2

T G A G C A C T G

pre-processing

T G A G C A C T G

HASH(’CTG’) = h1

HASH(’GCA’) = h2

HASH ( T G A ) = #@!*$%

HASH ( T G A ) = #@!*$%

sh1[ #@!*$% ] = m-q-i

sh2[ #@!*$% ] = m-2q-i

to be
continued...

h = ( h1 << 1) + h2

F[h] = patterns[cur]
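The enhanced pre-processing for γ = 2 can be sketched in Python as follows. The formulas `m-q-i`, `m-2q-i` and `h = (h1 << 1) + h2` come from the slide; everything else is an assumption made for illustration: the 2-bit encoding, `MAX`, the default values of the two shift arrays, skipping negative values of `m-2q-i`, and the names `qgram_hash` and `preprocess2`.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit encoding
MAX = (1 << 16) - 1                       # assumed hash range


def qgram_hash(gram):
    h = 0
    for ch in gram:
        h = ((h << 2) | CODE[ch]) & MAX
    return h


def preprocess2(patterns, q):
    """Build sh1, sh2 and F for the last two q-grams of the window."""
    m = min(len(p) for p in patterns)
    if m < 2 * q:
        raise ValueError("two q-grams need a window of at least 2*q characters")
    sh1 = [m - q + 1] * (MAX + 1)      # assumed default for unseen q-grams
    sh2 = [m - 2 * q + 1] * (MAX + 1)  # assumed default for unseen q-grams
    F = {}
    for p in patterns:
        for i in range(m - q + 1):
            h = qgram_hash(p[i:i + q])
            sh1[h] = min(sh1[h], m - q - i)
            if m - 2 * q - i >= 0:     # assumption: negative shifts are ignored
                sh2[h] = min(sh2[h], m - 2 * q - i)
        h1 = qgram_hash(p[m - q:m])          # last q-gram, e.g. 'CTG'
        h2 = qgram_hash(p[m - 2 * q:m - q])  # the q-gram before it, e.g. 'GCA'
        F.setdefault((h1 << 1) + h2, []).append(p)
    return m, sh1, sh2, F


m, sh1, sh2, F = preprocess2(["TGAGCACTG"], q=3)
print(sh1[qgram_hash("CTG")], sh2[qgram_hash("GCA")])
```

For the slide's pattern `TGAGCACTG`, both printed shifts are 0: `CTG` is the last q-gram and `GCA` is the one immediately before it, which is exactly the pair combined into the key of `F`.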

...Enhancing
W-M

search window

A G T C C T G T A A A G A G G A C C T C T G A C G T G G G G T C C A A T G G G C A C A C

shift1 = sh1[ §+!#*£$?% ]

HASH ( G G C ) = §+!#*£$?%

a text

h1

A C A A C T G G C

...now you
can’t go back

In the end...

h = ( h1 << 1) + h2
if (shift1 == 0 && shift2 == 0)
    foreach (p in F[h])
        checkOccurrInWin(p);

h2

HASH ( A C T ) = ^@!*%£$?#
shift2 = sh2[ ^@!*%£$?# ]
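The enhanced search with two q-grams can be sketched end to end in Python. The test `shift1 == 0 && shift2 == 0` and the key `h = (h1 << 1) + h2` follow the slide. The slide does not say how far to move when the test fails, so advancing by `max(shift1, shift2)` is an assumption here, as are the 2-bit encoding, `MAX`, the table defaults and all function names.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit encoding
MAX = (1 << 16) - 1                       # assumed hash range


def qgram_hash(gram):
    h = 0
    for ch in gram:
        h = ((h << 2) | CODE[ch]) & MAX
    return h


def preprocess2(patterns, q):
    m = min(len(p) for p in patterns)
    if m < 2 * q:
        raise ValueError("two q-grams need a window of at least 2*q characters")
    sh1 = [m - q + 1] * (MAX + 1)
    sh2 = [m - 2 * q + 1] * (MAX + 1)
    F = {}
    for p in patterns:
        for i in range(m - q + 1):
            h = qgram_hash(p[i:i + q])
            sh1[h] = min(sh1[h], m - q - i)
            if m - 2 * q - i >= 0:
                sh2[h] = min(sh2[h], m - 2 * q - i)
        h1 = qgram_hash(p[m - q:m])
        h2 = qgram_hash(p[m - 2 * q:m - q])
        F.setdefault((h1 << 1) + h2, []).append(p)
    return m, sh1, sh2, F


def wm2_search(text, patterns, q):
    """Search using the last two q-grams of each window."""
    m, sh1, sh2, F = preprocess2(patterns, q)
    hits, pos = [], 0
    while pos + m <= len(text):
        end = pos + m
        h1 = qgram_hash(text[end - q:end])           # last q-gram
        h2 = qgram_hash(text[end - 2 * q:end - q])   # q-gram before it
        shift1, shift2 = sh1[h1], sh2[h2]
        if shift1 == 0 and shift2 == 0:
            for p in F.get((h1 << 1) + h2, ()):
                if text.startswith(p, pos):          # naive verification
                    hits.append((pos, p))
            pos += 1
        else:
            pos += max(shift1, shift2)  # assumption: take the larger safe shift
    return hits


print(wm2_search("GGTGAGCACTGACAACTGGCAT", ["TGAGCACTG", "ACAACTGGC"], q=3))
```

The two patterns are the sequences shown on the slides; the text is made up so that each occurs once. Requiring both shifts to be 0 before verifying is what reduces the number of naive checks compared with a single q-gram.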

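Complexities

Pre-processing
Space requirement: $O(\mathrm{MAX}\,(1 + \ldots) + r)$ (the term after "1 +" did not survive in the slide text)
Time requirement: $O(\mathrm{MAX} + r\,m\,q)$

Search phase
Time requirement: $O(m^{(1)}\,n)$, where $m^{(1)} = \sum_{i=1}^{r} \mathrm{len}(p_i)$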

Experimental results

[Figure: three plots of execution time against w (8 to 128), one each for |P| = 100, |P| = 1000 and |P| = 10000, comparing MBNDM with the best WM(q,γ); points are labelled with the best configuration, such as WM(6,1), WM(4,2), WM(8,1), WM(8,2) and WM(8,3).]

Comparison of execution times between WM(q,γ) and one of the fastest algorithms currently in the literature
A T A C G T T C A G A T T G C C A G C A C G T T

The End
