Top-k Approach for Compact Storage Structure

Guided by: Dr. Radha Senthilkumar, Assistant Professor, Department of IT
By: S. Meenakshi, 2011611009, M.Tech I.T
Problem Definition
• Evaluating the tree edit distance for large XML trees is
  difficult.
• The best known tree edit distance algorithms have cubic run time
  and quadratic space complexity, so they do not scale.
• A core problem is to efficiently prune subtrees.
Literature Survey
• Nikolaus Augsten, Denilson Barbosa, Michael M. Böhlen, and Themis
  Palpanas, “Efficient Top-k Approximate Subtree Matching in Small
  Memory,” IEEE Transactions on Knowledge and Data Engineering,
  vol. 22, no. 8, August 2011.

• TASM (top-k approximate subtree matching) finds the top-k approximate
  matches of a small query tree Q within a large document tree.
• A prefix ring buffer allows subtrees to be pruned efficiently.
• TASM is portable because it relies on the postorder queue structure,
  which can be implemented by any XML processing system that allows an
  efficient postorder traversal of trees.
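The postorder queue idea can be illustrated with a minimal Python sketch. It assumes a tree is given as a nested (label, children) tuple and that each queue entry is a (label, subtree size) pair; the names postorder_queue and candidate_subtrees and the max_size threshold are hypothetical, and the size filter only hints at pruning. It is not the prefix ring buffer itself.

```python
from collections import deque
from typing import Deque, List, Tuple

# Assumed tree representation: (label, [child trees]).
Tree = Tuple[str, list]


def postorder_queue(tree: Tree) -> Deque[Tuple[str, int]]:
    """Return (label, subtree_size) pairs in postorder."""
    queue: Deque[Tuple[str, int]] = deque()

    def visit(node: Tree) -> int:
        label, children = node
        # Children are visited (and enqueued) before their parent.
        size = 1 + sum(visit(child) for child in children)
        queue.append((label, size))
        return size

    visit(tree)
    return queue


def candidate_subtrees(queue, max_size: int) -> List[Tuple[int, int]]:
    """Return (start, end) postorder index ranges of subtrees no larger than max_size.

    A subtree rooted at postorder position i with size s occupies
    positions i - s + 1 .. i, so no pointers back into the document are needed.
    """
    ranges = []
    for i, (_label, size) in enumerate(queue):
        if size <= max_size:  # larger subtrees are skipped (pruned)
            ranges.append((i - size + 1, i))
    return ranges


doc = ("a", [("b", [("d", []), ("e", [])]), ("c", [])])
queue = postorder_queue(doc)
print(list(queue))
print(candidate_subtrees(queue, max_size=3))
```

Because every subtree is a contiguous run of the queue, a single sequential pass over the document is enough to enumerate candidates.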
Literature Survey (cont.)
• Jiaheng Lu, Pierre Senellart, Chunbin Lin, Xiaoyong Du, Shan
  Wang, and Xinxing Chen, “Optimal Top-k Generation of Attribute
  Combinations Based on Ranked Lists,” Proc. ACM SIGMOD Int’l
  Conf. on Management of Data, pp. 1-12, May 2012.

• The paper introduces a novel top-k query type, called top-k,m queries.
• Suppose we are given a set of groups, where each group contains a set
  of attributes, each of which is associated with a ranked list of tuples.
• All lists are ranked in decreasing order of tuple scores. We
  want the top-k combinations of attributes according to the
  corresponding top-m tuples with matching IDs.
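A brute-force Python sketch may help make the top-k,m definition concrete. It assumes that a combination picks one attribute per group, that a tuple's score in a combination is the sum of its scores in the chosen lists, and that the combination score is the sum of its m best tuple scores; the function name top_k_m and the data layout are hypothetical, and the exhaustive enumeration is not the optimized method of the cited paper.

```python
import heapq
from itertools import product
from typing import Dict, List, Tuple

# Assumed layout: group name -> attribute name -> {tuple id: score}.
Groups = Dict[str, Dict[str, Dict[int, float]]]


def top_k_m(groups: Groups, k: int, m: int) -> List[Tuple[float, Tuple[str, ...]]]:
    """Return the k best attribute combinations as (score, combination) pairs."""
    group_names = sorted(groups)
    scored = []
    # Enumerate every combination that takes one attribute from each group.
    for combo in product(*(sorted(groups[g]) for g in group_names)):
        lists = [groups[g][a] for g, a in zip(group_names, combo)]
        # Only tuples whose ID appears in every chosen list can match.
        shared_ids = set(lists[0]).intersection(*lists[1:])
        tuple_scores = [sum(lst[i] for lst in lists) for i in shared_ids]
        # Combination score: sum of the m best matching tuples (assumed aggregate).
        scored.append((sum(heapq.nlargest(m, tuple_scores)), combo))
    return heapq.nlargest(k, scored)


groups = {
    "G1": {"A1": {1: 9, 2: 5}, "A2": {1: 2, 2: 8}},
    "G2": {"B1": {1: 7, 2: 6}, "B2": {2: 9}},
}
print(top_k_m(groups, k=2, m=1))
```

The enumeration grows with the product of the group sizes, which is the cost that a ranked-list algorithm would try to avoid.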
Literature Survey (cont.)
• K.-C. Tai, “The Tree-to-Tree Correction Problem,” J. ACM, vol. 26,
  no. 3, pp. 422-433, 1979.

• The string-to-string correction problem is to determine the
  distance between two strings, measured by the minimum-cost
  sequence of edit operations needed to transform one string into the
  other.
• The tree-to-tree correction problem generalizes this to trees, with
  three edit operations: changing one node of a tree into another node,
  deleting one node from a tree, or inserting a node into a tree.
• For strings, prior work presented an algorithm that computes the
  distance between two strings in time O(m * n), where m and n are the
  lengths of the two given strings.
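The O(m * n) bound for strings corresponds to the classic dynamic program over prefixes. The following Python sketch assumes unit costs for insert, delete, and change; the function name edit_distance is chosen here for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of unit-cost insertions, deletions, and changes turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # change ca into cb (free if equal)
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

Each of the m * n table cells is filled in constant time, and keeping only two rows reduces the memory to O(n).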
Objective
• To implement the concept of dominating queries using the
  Top-k Approximate Subtree Matching (TASM) approach.
• To evaluate the performance of dominating queries in the
  compact storage structure.
Dominating Queries
• The number of results is controllable.
• The result is scaling invariant.
• No user-defined ranking function is required.
• Each point is assigned an intuitive score, which determines
  its rank.
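These properties can be seen in a small Python sketch of a top-k dominating query. It assumes the usual formulation in which a point's score is the number of points it dominates, and that smaller values are better in every dimension; the names dominates and top_k_dominating are hypothetical.

```python
import heapq
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """p dominates q if p is no worse in every dimension and better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def top_k_dominating(points: Sequence[Point], k: int) -> List[Tuple[int, Point]]:
    """Return the k points with the highest domination score as (score, point) pairs."""
    # Score of p = number of points that p dominates (quadratic scan for clarity).
    scored = [(sum(dominates(p, q) for q in points), p) for p in points]
    return heapq.nlargest(k, scored, key=lambda item: item[0])


points = [(1, 1), (2, 3), (3, 2), (4, 4)]
print(top_k_dominating(points, k=1))  # [(3, (1, 1))]
```

The score depends only on comparisons within each dimension, so rescaling a dimension by a positive factor leaves the ranking unchanged, and k directly controls the result size.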

TASM:
• The problem of ranking the k best approximate matches of
  a small query tree in a large document tree.
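The ranking step can be sketched in Python as a bounded heap that keeps only the k closest candidates seen so far. The sketch assumes each candidate subtree arrives with a precomputed distance to the query; the function name k_best_matches and the candidate identifiers are hypothetical, and the distance computation itself is left out.

```python
import heapq
from typing import Iterable, List, Tuple


def k_best_matches(candidates: Iterable[Tuple[str, float]], k: int) -> List[Tuple[float, str]]:
    """Return the k candidates with the smallest distance, closest first."""
    if k <= 0:
        return []
    heap: List[Tuple[float, str]] = []  # max-heap on distance via negation
    for name, dist in candidates:
        if len(heap) < k:
            heapq.heappush(heap, (-dist, name))
        elif -heap[0][0] > dist:
            # The new candidate beats the current worst of the k best.
            heapq.heapreplace(heap, (-dist, name))
    return sorted((-neg, name) for neg, name in heap)


stream = [("s1", 4), ("s2", 1), ("s3", 3), ("s4", 0), ("s5", 2)]
print(k_best_matches(stream, k=3))  # [(0, 's4'), (1, 's2'), (2, 's5')]
```

Only k entries are held at any time, so the memory needed for the ranking does not grow with the size of the document.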
References
• N. Augsten, D. Barbosa, M.M. Böhlen, and T. Palpanas, “Efficient
  Top-k Approximate Subtree Matching in Small Memory,” IEEE Trans.
  Knowledge and Data Eng., vol. 22, no. 8, August 2011.
• J. Lu, P. Senellart, C. Lin, X. Du, S. Wang, and X. Chen, “Optimal
  Top-k Generation of Attribute Combinations Based on Ranked Lists,”
  Proc. ACM SIGMOD Int’l Conf. on Management of Data, pp. 1-12,
  May 2012.
• N. Augsten, M.H. Böhlen, C.E. Dyreson, and J. Gamper,
  “Approximate Joins for Data-Centric XML,” Proc. IEEE 24th
  Int’l Conf. Data Eng. (ICDE), pp. 814-823, 2008.
• K.-C. Tai, “The Tree-to-Tree Correction Problem,” J. ACM, vol. 26,
  no. 3, pp. 422-433, 1979.
Timeline Chart
PHASE     REVIEW I                REVIEW II                  REVIEW III
PHASE I   Learning to work with   Implement the concept of   Evaluate the dominating
          TASM (July)             dominating queries         queries in compact storage
                                  (August-September)         structure (October and
                                                             November)
Thank You

