MD-HBase: A Scalable Multi-dimensional Data Infrastructure for Location Aware Services
Shoji Nishimura (NEC Service Platforms Labs.), Sudipto Das, Divyakant Agrawal, Amr El Abbadi (University of California, Santa Barbara)
Overview
Motivating Scenario: Mobile Coupon Distribution
[Figure: subscribers report their current location to a mobile coupon distributor, which sends back a coupon]
Motivating Scenario: Mobile Coupon Distribution
125,000,000 subscribers in Japan
[Figure: many subscribers report their current location; the distributor sends coupons]
Large amounts of data, high throughput: system scalability
Multi-dimensional query, nearest neighbors query: efficient complex queries
Existing Technologies
Key-Value Stores
Relational DBs, Spatial DBs: commercial products, but expensive
What we want, at a reasonable price: open source products, scalability, multi-dimensional queries
Ordered Key-Value Stores (e.g., BigTable, HBase)
Sorted by key; good at 1-D range queries.
[Figure: an index (key00, key11, ..., keynn) pointing to buckets of sorted key-value pairs (key00..key0X, key11..key1Y, ..., keynn)]
But our target is multi-dimensional (longitude, latitude, time)…
Naïve Solution: Linearization
Projects n-D space to 1-D space by applying a Z-ordering curve. Simple, but problematic…
[Figure: index and buckets as on the previous slide, keyed by Z-value]
Z-values on the 4x4 grid (x increases left to right, y increases bottom to top):
 5  7 13 15
 4  6 12 14
 1  3  9 11
 0  2  8 10
Problem: False positive scans
[Figure: the same 4x4 Z-ordered grid, with the Z-values 2 and 9 marked]
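The bullet text of this slide did not survive extraction, so the following is only a sketch of the problem the title names. It assumes the marked values 2 and 9 are the end points of a rectangular query on the 4x4 grid, and that keys interleave the x bit before the y bit (as in the interleaving example later in the deck). The function name `z_value` and the query bounds are illustrative, not taken from the slides.

```python
def z_value(x: int, y: int, bits: int = 2) -> int:
    """Interleave the bits of x and y (x bit first) into one Z-order key."""
    z = 0
    for i in reversed(range(bits)):
        z = (z << 1) | ((x >> i) & 1)
        z = (z << 1) | ((y >> i) & 1)
    return z

# Assumed query rectangle: x in [1, 2], y in [0, 1].
x_lo, x_hi, y_lo, y_hi = 1, 2, 0, 1

# Cells that really lie inside the rectangle.
wanted = sorted(
    z_value(x, y)
    for x in range(x_lo, x_hi + 1)
    for y in range(y_lo, y_hi + 1)
)

# A 1-D range scan must read every key between the two corner Z-values.
scanned = list(range(z_value(x_lo, y_lo), z_value(x_hi, y_hi) + 1))

# Keys read by the scan that are outside the rectangle.
false_positives = [z for z in scanned if z not in wanted]

print(wanted, scanned, false_positives)
```

Under these assumptions the scan from 2 to 9 reads eight keys, of which only four fall inside the queried rectangle.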
Our Approach: MD-HBase
[Figure: MD-HBase adds a multi-dimensional index on top of the single-dimensional index of an ordered key-value store (e.g., BigTable, HBase, …)]
Introduce Multi-dimensional Index
[Figure labels: "Divide into", "Organize as"]
Space Partition by the K-d Tree
[Figure: the binary Z-ordering space, and the same space partitioned by the K-d tree; both axes run 00, 01, 10, 11]
How do we represent these subspaces?
Bitwise interleaving, e.g., x = 00, y = 11 -> 0101
Z-values on the 4x4 grid (x increases left to right, y increases bottom to top):
0101 0111 1101 1111
0100 0110 1100 1110
0001 0011 1001 1011
0000 0010 1000 1010
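A minimal sketch of the bitwise interleaving shown on this slide, working directly on bit strings. The x-bit-first order is inferred from the slide's example (x = 00, y = 11 gives 0101); the function names `interleave` and `deinterleave` are illustrative.

```python
def interleave(x: str, y: str) -> str:
    """Interleave two equal-length bit strings, taking the x bit first."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of bits")
    return "".join(xb + yb for xb, yb in zip(x, y))


def deinterleave(key: str) -> tuple[str, str]:
    """Split an interleaved key back into its (x, y) bit strings."""
    return key[0::2], key[1::2]


print(interleave("00", "11"))   # the slide's example
print(deinterleave("1101"))
```

Because the mapping is reversible, the coordinates of any cell can be recovered from its key alone, which the prefix naming scheme relies on.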
Key Idea: The longest common prefix naming scheme
Subspaces are represented as the longest common prefix of their keys, e.g., 000* and 1***.
Example for 1***:
Left-bottom corner: * -> 0 gives 1000, i.e., (10, 00)
Right-top corner: * -> 1 gives 1111, i.e., (11, 11)
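The corner reconstruction on this slide can be sketched as follows: fill the wildcards with 0 for the left-bottom corner and with 1 for the right-top corner, then de-interleave. The function name `subspace_corners` is illustrative, and the x-bit-first layout is assumed from the interleaving example.

```python
def subspace_corners(prefix: str) -> tuple[tuple[str, str], tuple[str, str]]:
    """Return ((x_min, y_min), (x_max, y_max)) for a prefix such as '1***'."""
    low = prefix.replace("*", "0")    # left-bottom corner key
    high = prefix.replace("*", "1")   # right-top corner key
    # Even positions hold x bits, odd positions hold y bits.
    return (low[0::2], low[1::2]), (high[0::2], high[1::2])


print(subspace_corners("1***"))
print(subspace_corners("000*"))
```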
Build an index with the longest common prefix of keys
Index: 000*, 001*, 01**, 1***
Buckets: allocated per subspace (000*, 001*, 01**, 1***)
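One way to use such an index, sketched under the assumption that the subspaces partition the whole key space and that the index is kept sorted: replacing each wildcard with 0 gives the smallest key of a subspace, and a binary search over those smallest keys finds the bucket for any point. The helper `bucket_for` is hypothetical and not part of MD-HBase's API.

```python
from bisect import bisect_right

# Index entries from the slide, in key order.
index = ["000*", "001*", "01**", "1***"]

# Smallest key covered by each subspace; sorted because the index is sorted.
lows = [p.replace("*", "0") for p in index]


def bucket_for(key: str) -> str:
    """Return the subspace whose key range contains the given 4-bit key."""
    pos = bisect_right(lows, key) - 1
    return index[pos]


for key in ["0000", "0011", "0110", "1101"]:
    print(key, "->", bucket_for(key))
```

Fixed-width bit strings compare lexicographically in the same order as their numeric values, so plain string comparison is enough here.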
Multi-dimensional Range Query
Index: 000*, 001*, 01**, 10**, 11**
Scan 0010 - 1001 on the index.
Filter (subspace pruning): reconstruct the boundary information and check whether each subspace intersects the queried area.
Scan the remaining subspaces: 001*, 10**
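The two steps on this slide, an index scan followed by subspace pruning, can be sketched as below. The query rectangle (x from 01 to 10, y from 00 to 01) is inferred from the scan range 0010 to 1001; all function names are illustrative, and this is a sketch of the idea rather than the authors' implementation.

```python
def corners(prefix: str):
    """((x_min, y_min), (x_max, y_max)) of a subspace, x bit first."""
    low, high = prefix.replace("*", "0"), prefix.replace("*", "1")
    return (low[0::2], low[1::2]), (high[0::2], high[1::2])


def intersects(prefix: str, x_lo: str, x_hi: str, y_lo: str, y_hi: str) -> bool:
    """True if the subspace rectangle overlaps the query rectangle."""
    (sx_lo, sy_lo), (sx_hi, sy_hi) = corners(prefix)
    return sx_lo <= x_hi and x_lo <= sx_hi and sy_lo <= y_hi and y_lo <= sy_hi


index = ["000*", "001*", "01**", "10**", "11**"]
x_lo, x_hi, y_lo, y_hi = "01", "10", "00", "01"   # assumed query rectangle

# Z-values of the query's left-bottom and right-top corners.
z_lo = "".join(a + b for a, b in zip(x_lo, y_lo))
z_hi = "".join(a + b for a, b in zip(x_hi, y_hi))

# Step 1: index scan. Keep subspaces whose key range overlaps [z_lo, z_hi].
candidates = [
    p for p in index
    if p.replace("*", "1") >= z_lo and p.replace("*", "0") <= z_hi
]

# Step 2: subspace pruning. Drop candidates that miss the query rectangle.
to_scan = [p for p in candidates if intersects(p, x_lo, x_hi, y_lo, y_hi)]

print(candidates, to_scan)
```

In this example the subspace 01** lies inside the scanned key range but entirely above the queried rectangle, so pruning removes it before any bucket is read.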
K Nearest Neighbors Query
[Figure with labels 1 to 5]
Variations of Storage Layer
Experimental Results: Multi-dimensional Range Query
Experimental Results: k Nearest Neighbors Query
Experimental Results: Insert
Conclusions

Editor's notes

  1. Animate
  2. Scalability for data size (number of users; continuously generated data). High insertion throughput (number of users; data collection frequency). Efficient complex query performance (complex queries: multi-dimensional range queries, k nearest neighbor queries). Near real-time (data goes stale quickly).
  3. Synchronize text and figures
  4. Put an example