SlideShare una empresa de Scribd logo
1 de 11
Least Common Ancestors in
      Constant Time
Motivation: GO Analysis on
     the Dendrogram
GO Analysis on Dendrogram

• For each GO term
   •   Pull out the skeleton sub-
     tree for this term
   •   Test for significance only
     at the nodes of this skeleton
     tree


• How does one pull out the
  skeleton sub-tree in time
  proportional to the size of
  that subtree?
LCA Queries

• Preprocess the tree in linear
  time, so..

• Given two nodes, the least
  common ancestor can be
  returned in O(1) time
LCA Queries on a Line Graph

• Rank nodes in order from top
  to bottom

• Take the min of the two
  ranks
Linearizing a Tree

• Label each node with its
  distance from the root

• Euler Tour to linearize nodes
  in an array (size 2n)




Find the node with the least label in
            this range
The Range Minimum Problem

• Given an array of size 2n,
  preprocess it in linear time
  so…



• Given a range, the min item in
  that range can be returned in
  O(1) time
Divide and Conquer

• Split into blocks at various
  levels of granularity

   – For each block, compute all
     prefix and suffix mins

   – Total space/preprocessing time
     O(nlog n)
Query Handling

• Given the query range

   – Determine the granularity level
     at which this range straddles
     adjacent blocks
       •   First bit diff in O(1) time


   – Look up the appropriate
     prefix/suffix mins in each block
       •   Look up precomputed tables in O(1)
           time
Reducing Preprocessing Time

•   Consider blocks of size Δ=log n/3

•   Two blocks are said to be equivalent
    if all within-block queries return the
    same min-index for the two blocks

•   How many equivalence classes are                 +1-1+1-1-1-1-1+1-1+1
    there:
     –   Recall Euler Tour, adjacent nos differ in
         +/-1, so 2Δ = n1/3

     –   For each distinct class and each of the
         possible log2n/9 queries precompute
         answers and store


     –   This takes time O(n1/3 log2n) = O(n)
Overall Preprocessing

•   Compute the O(n1/3 log2n) data
    structure for blocks of size
    =log n/3
     – Within block queries can be
       answered using this

•   Create a new array of size
    2n/(log n/3) = 6n/log n by
    replacing each block by just its
    minimum item

•   Preprocess this array as
    before, but now in O(n) time
    and space because the size of
    this array is just O(n/log n).

Más contenido relacionado

Similar a Least common ancestors in constant time

K means and dbscan
K means and dbscanK means and dbscan
K means and dbscanYan Xu
 
Single Shot Multibox Detector
Single Shot Multibox DetectorSingle Shot Multibox Detector
Single Shot Multibox DetectorNamHyuk Ahn
 
ConstraintSatisfaction.ppt
ConstraintSatisfaction.pptConstraintSatisfaction.ppt
ConstraintSatisfaction.pptMdFazleRabbi53
 
Deep Learning Theory Seminar (Chap 1-2, part 1)
Deep Learning Theory Seminar (Chap 1-2, part 1)Deep Learning Theory Seminar (Chap 1-2, part 1)
Deep Learning Theory Seminar (Chap 1-2, part 1)Sangwoo Mo
 
Module 2 Design Analysis and Algorithms
Module 2 Design Analysis and AlgorithmsModule 2 Design Analysis and Algorithms
Module 2 Design Analysis and AlgorithmsCool Guy
 
ICPC 2015, Tsukuba : Unofficial Commentary
ICPC 2015, Tsukuba: Unofficial CommentaryICPC 2015, Tsukuba: Unofficial Commentary
ICPC 2015, Tsukuba : Unofficial Commentaryirrrrr
 
A Quantitative analysis and performance study for similarity search methods i...
A Quantitative analysis and performance study for similarity search methods i...A Quantitative analysis and performance study for similarity search methods i...
A Quantitative analysis and performance study for similarity search methods i...Jungyeol
 
clustering_hierarchical ckustering notes.pdf
clustering_hierarchical ckustering notes.pdfclustering_hierarchical ckustering notes.pdf
clustering_hierarchical ckustering notes.pdfp_manimozhi
 
shell and merge sort
shell and merge sortshell and merge sort
shell and merge sorti i
 
Lec03-CS110 Computational Engineering
Lec03-CS110 Computational EngineeringLec03-CS110 Computational Engineering
Lec03-CS110 Computational EngineeringSri Harsha Pamu
 
Zvi random-walks-slideshare
Zvi random-walks-slideshareZvi random-walks-slideshare
Zvi random-walks-slideshareZvi Lotker
 
Lecture 11.2 : sorting
Lecture 11.2 :  sortingLecture 11.2 :  sorting
Lecture 11.2 : sortingVivek Bhargav
 

Similar a Least common ancestors in constant time (20)

K means and dbscan
K means and dbscanK means and dbscan
K means and dbscan
 
Data Mining Lecture_8(b).pptx
Data Mining Lecture_8(b).pptxData Mining Lecture_8(b).pptx
Data Mining Lecture_8(b).pptx
 
sorting
sortingsorting
sorting
 
Single Shot Multibox Detector
Single Shot Multibox DetectorSingle Shot Multibox Detector
Single Shot Multibox Detector
 
sorting-160810203705.pptx
sorting-160810203705.pptxsorting-160810203705.pptx
sorting-160810203705.pptx
 
Quicksort
QuicksortQuicksort
Quicksort
 
ConstraintSatisfaction.ppt
ConstraintSatisfaction.pptConstraintSatisfaction.ppt
ConstraintSatisfaction.ppt
 
Deep Learning Theory Seminar (Chap 1-2, part 1)
Deep Learning Theory Seminar (Chap 1-2, part 1)Deep Learning Theory Seminar (Chap 1-2, part 1)
Deep Learning Theory Seminar (Chap 1-2, part 1)
 
densematrix.ppt
densematrix.pptdensematrix.ppt
densematrix.ppt
 
Ch07 linearspacealignment
Ch07 linearspacealignmentCh07 linearspacealignment
Ch07 linearspacealignment
 
Module 2 Design Analysis and Algorithms
Module 2 Design Analysis and AlgorithmsModule 2 Design Analysis and Algorithms
Module 2 Design Analysis and Algorithms
 
ICPC 2015, Tsukuba : Unofficial Commentary
ICPC 2015, Tsukuba: Unofficial CommentaryICPC 2015, Tsukuba: Unofficial Commentary
ICPC 2015, Tsukuba : Unofficial Commentary
 
A Quantitative analysis and performance study for similarity search methods i...
A Quantitative analysis and performance study for similarity search methods i...A Quantitative analysis and performance study for similarity search methods i...
A Quantitative analysis and performance study for similarity search methods i...
 
clustering_hierarchical ckustering notes.pdf
clustering_hierarchical ckustering notes.pdfclustering_hierarchical ckustering notes.pdf
clustering_hierarchical ckustering notes.pdf
 
PAM.ppt
PAM.pptPAM.ppt
PAM.ppt
 
shell and merge sort
shell and merge sortshell and merge sort
shell and merge sort
 
Lec03-CS110 Computational Engineering
Lec03-CS110 Computational EngineeringLec03-CS110 Computational Engineering
Lec03-CS110 Computational Engineering
 
binary tree
binary treebinary tree
binary tree
 
Zvi random-walks-slideshare
Zvi random-walks-slideshareZvi random-walks-slideshare
Zvi random-walks-slideshare
 
Lecture 11.2 : sorting
Lecture 11.2 :  sortingLecture 11.2 :  sorting
Lecture 11.2 : sorting
 

Más de Strand Life Sciences Pvt Ltd (11)

Strand genomics features in CIO review
Strand genomics features in CIO reviewStrand genomics features in CIO review
Strand genomics features in CIO review
 
Rules of a Quantum World
Rules of  a Quantum WorldRules of  a Quantum World
Rules of a Quantum World
 
Introduction to statistics iii
Introduction to statistics iiiIntroduction to statistics iii
Introduction to statistics iii
 
Introduction to statistics ii
Introduction to statistics iiIntroduction to statistics ii
Introduction to statistics ii
 
Dynamic programming for simd
Dynamic programming for simdDynamic programming for simd
Dynamic programming for simd
 
Complex numbers polynomial multiplication
Complex numbers polynomial multiplicationComplex numbers polynomial multiplication
Complex numbers polynomial multiplication
 
Converting High Dimensional Problems to Low Dimensional Ones
Converting High Dimensional Problems to Low Dimensional OnesConverting High Dimensional Problems to Low Dimensional Ones
Converting High Dimensional Problems to Low Dimensional Ones
 
Searching using Quantum Rules
Searching using Quantum RulesSearching using Quantum Rules
Searching using Quantum Rules
 
Randomized algorithms
Randomized algorithmsRandomized algorithms
Randomized algorithms
 
Suffix arrays
Suffix arraysSuffix arrays
Suffix arrays
 
Alignment of raw reads in Avadis NGS
Alignment of raw reads in Avadis NGSAlignment of raw reads in Avadis NGS
Alignment of raw reads in Avadis NGS
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 

Último (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

Least common ancestors in constant time

  • 1. Least Common Ancestors in Constant Time
  • 2. Motivation: GO Analysis on the Dendrogram
  • 3. GO Analysis on Dendrogram • For each GO term • Pull out the skeleton sub- tree for this term • Test for significance only at the nodes of this skeleton tree • How does one pull out the skeleton sub-tree in time proportional to the size of that subtree?
  • 4. LCA Queries • Preprocess the tree in linear time, so.. • Given two nodes, the least common ancestor can be returned in O(1) time
  • 5. LCA Queries on a Line Graph • Rank nodes in order from top to bottom • Take the min of the two ranks
  • 6. Linearizing a Tree • Label each node with its distance from the root • Euler Tour to linearize nodes in an array (size 2n) Find the node with the least label in this range
  • 7. The Range Minimum Problem • Given an array of size 2n, preprocess it in linear time so… • Given a range, the min item in that range can be returned in O(1) time
  • 8. Divide and Conquer • Split into blocks at various levels of granularity – For each block, compute all prefix and suffix mins – Total space/preprocessing time O(nlog n)
  • 9. Query Handling • Given the query range – Determine the granularity level at which this range straddles adjacent blocks • First bit diff in O(1) time – Look up the appropriate prefix/suffix mins in each block • Look up precomputed tables in O(1) time
  • 10. Reducing Preprocessing Time • Consider blocks of size Δ=log n/3 • Two blocks are said to be equivalent if all within-block queries return the same min-index for the two blocks • How many equivalence classes are +1-1+1-1-1-1-1+1-1+1 there: – Recall Euler Tour, adjacent nos differ in +/-1, so 2Δ = n1/3 – For each distinct class and each of the possible log2n/9 queries precompute answers and store – This takes time O(n1/3 log2n) = O(n)
  • 11. Overall Preprocessing • Compute the O(n1/3 log2n) data structure for blocks of size =log n/3 – Within block queries can be answered using this • Create a new array of size 2n/(log n/3) = 6n/log n by replacing each block by just its minimum item • Preprocess this array as before, but now in O(n) time and space because the size of this array is just O(n/log n).