MD-HBase: A Scalable Multi-dimensional Data Infrastructure for Location Aware Services
5. Existing Technologies. Key-Value Stores: at a reasonable price. Relational DBs and Spatial DBs: commercial products, but expensive. What We Want: open source products, scalability, multi-dimensional queries.
6. Ordered Key-Value Stores: sorted by key; good at 1-D range queries (e.g., BigTable, HBase). [Figure: an index over keys key00, key11, ..., keynn pointing to buckets of key-sorted key-value pairs.] But our target is multi-dimensional (longitude, latitude, time)…
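To make the 1-D range query concrete, here is a minimal in-memory sketch of an ordered key-value store. The class `OrderedKVStore` and its `put`/`scan` methods are hypothetical names for illustration, not the BigTable or HBase API; the sketch only shows why a key-sorted layout makes a range scan cheap.

```python
import bisect


class OrderedKVStore:
    """Toy stand-in for an ordered key-value store (hypothetical, not the HBase API)."""

    def __init__(self):
        self._keys = []    # keys kept in sorted order
        self._values = {}  # key -> value

    def put(self, key, value):
        # Insert the key at its sorted position the first time it is seen.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start, stop):
        """Yield (key, value) pairs with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


store = OrderedKVStore()
for k in ["key11", "key00", "key12", "key01"]:
    store.put(k, "value" + k[3:])

# Two binary searches locate the range; everything in between is contiguous.
result = list(store.scan("key01", "key12"))
```

Because the keys are sorted on a single attribute, this only helps when the query constrains that one attribute, which is the limitation the slide points at.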
7. Naïve Solution: Linearization. Projects n-D space to 1-D space by applying a Z-ordering curve. Simple, but problematic… [Figure: an index over keys key00, key11, ..., keynn pointing to buckets of key-value pairs, next to a 4x4 grid whose cells are numbered 0 to 15 in Z-order.]
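A small sketch of Z-order linearization for the 2-D, 2-bit case in the slide's grid. The function names and the bit order (x bit first, then y bit) are assumptions, chosen so that the output covers the same sixteen Z-values. The last lines illustrate one commonly cited problem with plain linearization: a 1-D scan between the Z-values of a rectangle's corners also visits cells outside the rectangle.

```python
def z_encode(x, y, bits=2):
    """Interleave the bits of x and y (x bit first) into one Z-value."""
    z = 0
    for i in reversed(range(bits)):
        z = (z << 1) | ((x >> i) & 1)
        z = (z << 1) | ((y >> i) & 1)
    return z


def z_decode(z, bits=2):
    """Inverse of z_encode: split a Z-value back into (x, y)."""
    x = y = 0
    for i in reversed(range(bits)):
        x = (x << 1) | ((z >> (2 * i + 1)) & 1)
        y = (y << 1) | ((z >> (2 * i)) & 1)
    return x, y


# 4x4 grid of Z-values, rows printed from y = 3 (top) down to y = 0.
grid = [[z_encode(x, y) for x in range(4)] for y in reversed(range(4))]

# Query rectangle: 1 <= x <= 2 and 0 <= y <= 1.
z_min, z_max = z_encode(1, 0), z_encode(2, 1)
false_positives = [
    z for z in range(z_min, z_max + 1)
    if not (1 <= z_decode(z)[0] <= 2 and 0 <= z_decode(z)[1] <= 1)
]
```

In this example the scan covers Z-values 2 through 9, but only four of those eight cells lie inside the queried rectangle.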
13. Build an index with the longest common prefix of keys. Index entries: 000*, 001*, 01**, 1***. Buckets are allocated per subspace. [Figure: 4x4 grid with both axes labeled 00 to 11 and cells labeled with 4-bit Z-values 0000 to 1111, partitioned into the subspaces 000*, 001*, 01**, 1***.]
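The prefix index can be sketched as a sorted list of entries, each covering every key that shares its prefix. This is an illustrative sketch only: `INDEX`, `lower_bound`, and `find_bucket` are hypothetical names, and the lookup assumes the entries partition the whole key space, as the four entries on the slide do.

```python
import bisect

# Index entries from the slide: each is the longest common prefix of one subspace.
INDEX = ["000*", "001*", "01**", "1***"]


def lower_bound(entry):
    """Smallest Z-value covered by an entry (wildcards replaced by 0)."""
    return int(entry.replace("*", "0"), 2)


LOWS = [lower_bound(e) for e in INDEX]  # sorted, because INDEX is sorted


def find_bucket(z):
    """Return the index entry whose subspace contains Z-value z."""
    return INDEX[bisect.bisect_right(LOWS, z) - 1]


# Allocate one bucket per subspace and place all sixteen cells.
buckets = {}
for z in range(16):
    buckets.setdefault(find_bucket(z), []).append(format(z, "04b"))
```

Since the entries are themselves ordered keys, they can live in the same ordered key-value store as the data, and a lookup is a single binary search.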
14. Multi-dimensional Range Query: scan 0010-1001 on the index, then apply subspace pruning: reconstruct the boundary info of each candidate subspace and check whether it intersects the queried area. Scan the remaining buckets and filter their points. [Figure: 4x4 grid of 4-bit Z-values with axes labeled 00 to 11 and index entries 000*, 001*, 01**, 10**, 11**; after pruning, the subspaces 001* and 10** are scanned.]
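The query steps on this slide can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the MD-HBase implementation: all function names are hypothetical, the bit order (x bit first) is assumed, and every grid cell is treated as one stored point.

```python
INDEX = ["000*", "001*", "01**", "10**", "11**"]  # entries shown on the slide


def z_encode(x, y, bits=2):
    z = 0
    for i in reversed(range(bits)):
        z = (z << 2) | (((x >> i) & 1) << 1) | ((y >> i) & 1)
    return z


def z_decode(z, bits=2):
    x = y = 0
    for i in reversed(range(bits)):
        x = (x << 1) | ((z >> (2 * i + 1)) & 1)
        y = (y << 1) | ((z >> (2 * i)) & 1)
    return x, y


def z_range(entry):
    """Smallest and largest Z-value covered by a prefix entry."""
    return int(entry.replace("*", "0"), 2), int(entry.replace("*", "1"), 2)


def bounding_box(entry):
    """Reconstruct the subspace boundary (x_lo, y_lo, x_hi, y_hi) from the prefix."""
    lo, hi = z_range(entry)
    return z_decode(lo) + z_decode(hi)


def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_query(x_min, y_min, x_max, y_max):
    query = (x_min, y_min, x_max, y_max)
    z_min, z_max = z_encode(x_min, y_min), z_encode(x_max, y_max)
    # 1. Scan the index for entries overlapping [z_min, z_max].
    candidates = [e for e in INDEX
                  if z_range(e)[0] <= z_max and z_range(e)[1] >= z_min]
    # 2. Subspace pruning: drop entries whose box misses the queried area.
    survivors = [e for e in candidates if intersects(bounding_box(e), query)]
    # 3. Scan the surviving buckets and filter individual points.
    hits = []
    for e in survivors:
        lo, hi = z_range(e)
        for z in range(lo, hi + 1):
            x, y = z_decode(z)
            if x_min <= x <= x_max and y_min <= y <= y_max:
                hits.append(format(z, "04b"))
    return candidates, survivors, hits


candidates, survivors, hits = range_query(1, 0, 2, 1)  # corners 0010 and 1001
```

For the query with corners 0010 and 1001, the index scan yields three candidate subspaces, pruning discards 01**, and only the buckets for 001* and 10** are read.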
Editor's notes
Animate
Scalability for data size: number of users; data is continuously generated. High insertion throughput: number of users; data collection frequency. Efficient complex query performance: complex queries (multi-dimensional range queries, k-nearest-neighbor queries); near real-time, because data goes stale quickly.