Master's presentation by Mike Argyriou at the Technological University of Crete on
branch-and-bound nearest neighbor searching over unbalanced trie-structured overlays.
4. DHT Frameworks Evolution
2003: PGrid
• Rectangular queries support
• Peers only on leaves
• High-dimensional queries support with space-filling curves
2006: VBI
• Height-balanced search tree limitation
2008: GRaSP
• No height-balanced search tree limitation
• Abstract types of data and queries
• Data: point, rectangular
• Queries: point, 3-sided, n-d rectangular
8. Related Work
1. Naïve algorithm: a central peer collects the data and
performs k-NN searching
2. k-NN search algorithm over CAN
3. k-NN search algorithm over a distributed quadtree-based index
mapped to Chord: each quadtree block is uniquely identified
by its centroid
11. GRaSP
Building the trie ...
Hierarchical space partition:
1 Peer p joins
2 Finds a bootstrapping peer q
3 Space region s(q) splits into s(q0) and s(q1)
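The join steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Peer` class, the `join` function, the midpoint cut, and the rule of cycling through the axes by trie depth are all hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    peer_id: str   # path in the binary trie, e.g. "01"
    lo: tuple      # lower corner of the region s(peer)
    hi: tuple      # upper corner of the region s(peer)


def join(q: Peer):
    """Split the region of bootstrapping peer q into s(q0) and s(q1).

    Assumptions (not from the slides): the cut is at the midpoint of the
    region, and the split axis cycles with the depth of q in the trie.
    """
    axis = len(q.peer_id) % len(q.lo)
    mid = (q.lo[axis] + q.hi[axis]) / 2
    hi0 = q.hi[:axis] + (mid,) + q.hi[axis + 1:]
    lo1 = q.lo[:axis] + (mid,) + q.lo[axis + 1:]
    # q keeps one half as q0, the joining peer p takes the other half as q1.
    return Peer(q.peer_id + "0", q.lo, hi0), Peer(q.peer_id + "1", lo1, q.hi)


root = Peer("", (0.0, 0.0), (1.0, 1.0))
q0, q1 = join(root)      # first join splits along axis 0
q10, q11 = join(q1)      # second join splits s(q1) along axis 1
```

Repeated joins on the same side of the space produce longer identifiers there, which is how the trie becomes unbalanced.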
16. GRaSP
Data Insertion
A key k is inserted into every peer whose region
contains k
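A minimal sketch of this rule, assuming axis-aligned rectangular regions and keys given as (lower corner, upper corner) pairs, so that a point key has equal corners. The names `overlaps`, `insert`, `regions` and `storage` are hypothetical. For a rectangular key, "contains" is read here as "overlaps", which is an assumption.

```python
def overlaps(key_lo, key_hi, reg_lo, reg_hi):
    """True if the key box and the region box intersect (closed boundaries)."""
    return all(kl <= rh and rl <= kh
               for kl, kh, rl, rh in zip(key_lo, key_hi, reg_lo, reg_hi))


def insert(key_lo, key_hi, regions, storage):
    """Store the key at every peer whose region it touches; return those peers."""
    owners = [pid for pid, (lo, hi) in regions.items()
              if overlaps(key_lo, key_hi, lo, hi)]
    for pid in owners:
        storage.setdefault(pid, []).append((key_lo, key_hi))
    return owners


# Two peers splitting the unit square along x = 0.5.
regions = {"0": ((0.0, 0.0), (0.5, 1.0)), "1": ((0.5, 0.0), (1.0, 1.0))}
storage = {}
point_owners = insert((0.2, 0.3), (0.2, 0.3), regions, storage)   # point key
rect_owners = insert((0.4, 0.4), (0.6, 0.6), regions, storage)    # rectangular key
```

The point key lands on a single peer, while the rectangle straddles the cut and is replicated on both.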
17. GRaSP
Routing Tables
Each peer knows a peer in
each complementary subtrie ...
Example: the complementary subtries of peer 0100 are
1, 00, 011 and 0101
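The complementary subtries of a peer can be derived from its identifier alone: for each bit position, keep the preceding bits and flip that bit. The helper name `complementary_prefixes` below is hypothetical.

```python
def complementary_prefixes(peer_id: str) -> list:
    """Return the prefix of each complementary subtrie, shortest first."""
    return [peer_id[:i] + ("1" if bit == "0" else "0")
            for i, bit in enumerate(peer_id)]


prefixes = complementary_prefixes("0100")
```

A peer at depth d therefore keeps d routing entries, one per prefix.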
18. GRaSP
Routing
“In order to route a message from peer p to peer q, the message is
forwarded from p to a neighbor peer r included in a known subtrie closer
to peer q. From r it is recursively forwarded to q.”
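The routing rule can be simulated locally as follows. This is a sketch under assumptions: peer identifiers are the leaves of a complete binary trie, every complementary subtrie contains at least one peer, and each routing entry simply points to the lexicographically first peer of its subtrie. The function names and the example peer set are hypothetical.

```python
def build_routing_table(peer_id, all_ids):
    """One known peer per complementary subtrie (entry i flips bit i)."""
    table = []
    for i, bit in enumerate(peer_id):
        prefix = peer_id[:i] + ("1" if bit == "0" else "0")
        # Hypothetical choice: pick the lexicographically first peer in the subtrie.
        table.append(min(x for x in all_ids if x.startswith(prefix)))
    return table


def route(src, dst, all_ids):
    """Return the list of peers a message visits on its way from src to dst."""
    path = [src]
    current = src
    while current != dst:
        # First bit where the current peer and the destination disagree.
        i = next(k for k in range(min(len(current), len(dst)))
                 if current[k] != dst[k])
        # The entry for bit i lies in the subtrie that shares i + 1 bits with dst.
        current = build_routing_table(current, all_ids)[i]
        path.append(current)
    return path


ids = ["00", "0100", "0101", "011", "10", "11"]
path = route("0100", "11", ids)
```

Each hop strictly lengthens the prefix shared with the destination, so the loop terminates after at most as many hops as the destination identifier has bits.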
20. Searching Algorithm
Branch-and-bound algorithm
Fringe: priority queue PQ of candidate peers that may hold an answer
better than the k-th answer found so far
1. Branch Step: expand PQ
2. Bound Step: prune PQ
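The two steps can be illustrated with a centralized, best-first sketch. It is not the distributed algorithm of the presentation: the space is modeled as a nested dictionary of boxes, the whole structure is visible to one process, and the names `mindist` and `knn` are hypothetical. The branch step pops the closest entry and pushes its children; the bound step stops once the closest remaining entry cannot beat the current k-th answer.

```python
import heapq
import itertools
import math


def mindist(q, box):
    """Smallest Euclidean distance from point q to an axis-aligned box."""
    lo, hi = box
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, lo, hi)))


def knn(root, q, k):
    """Best-first branch-and-bound k-NN; returns [(distance, point), ...]."""
    tie = itertools.count()                  # tie-breaker so dicts are never compared
    pq = [(mindist(q, root["box"]), next(tie), root)]
    best = []                                # max-heap of (-distance, point)
    while pq:
        d, _, node = heapq.heappop(pq)
        kth = -best[0][0] if len(best) == k else math.inf
        if d >= kth:
            break                            # bound: nothing left can improve the answer
        if "points" in node:                 # leaf: a peer holding data
            for p in node["points"]:
                dist = math.dist(q, p)
                if len(best) < k:
                    heapq.heappush(best, (-dist, p))
                elif dist < -best[0][0]:
                    heapq.heapreplace(best, (-dist, p))
        else:                                # branch: expand PQ with the children
            for child in node["children"]:
                heapq.heappush(pq, (mindist(q, child["box"]), next(tie), child))
    return sorted((-nd, p) for nd, p in best)


tree = {
    "box": ((0.0, 0.0), (1.0, 1.0)),
    "children": [
        {"box": ((0.0, 0.0), (0.5, 1.0)), "points": [(0.1, 0.1), (0.4, 0.9)]},
        {"box": ((0.5, 0.0), (1.0, 1.0)), "points": [(0.6, 0.5), (0.9, 0.9)]},
    ],
}
nearest = knn(tree, (0.7, 0.5), 1)
two_nearest = knn(tree, (0.7, 0.5), 2)
```

For k = 1 the left box is pruned without being visited, because its minimum distance to the query already exceeds the best answer found in the right box.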
21. Searching Algorithm
Parallel Searching vs Iterative Searching
Parallel Searching requires huge message state!
Iterative Searching prunes larger regions of the data space!
32. Low dimensions: Data FI vs Space Partition
Which space partition is the best?
Greece ...
[Figure panels: data-balanced partition, volume-balanced partition]
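The slides do not define the two partition schemes. The sketch below assumes the usual reading: a volume-balanced split cuts a region at its midpoint, while a data-balanced split cuts it at the median of the stored keys. All function names and the sample points are hypothetical.

```python
import statistics


def volume_balanced_split(lo, hi, axis):
    """Cut position that halves the region's extent along the axis."""
    return (lo[axis] + hi[axis]) / 2


def data_balanced_split(points, axis):
    """Cut position that halves the stored keys along the axis."""
    return statistics.median(p[axis] for p in points)


def side_counts(points, axis, cut):
    """Number of keys falling on each side of the cut."""
    left = sum(1 for p in points if p[axis] < cut)
    return left, len(points) - left


# Skewed data: three keys clustered near x = 0, one far away.
points = [(0.10, 0.5), (0.15, 0.2), (0.20, 0.8), (0.90, 0.4)]
v_cut = volume_balanced_split((0.0, 0.0), (1.0, 1.0), axis=0)
d_cut = data_balanced_split(points, axis=0)
v_counts = side_counts(points, 0, v_cut)
d_counts = side_counts(points, 0, d_cut)
```

On skewed data the midpoint cut leaves the two peers with unequal load, whereas the median cut equalizes the load at the price of unequal region volumes.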
33. Low dimensions: Latency vs Space Partition
Which space partition is the best?
Greece, k=1 ...
[Figure panels: data-balanced partition, volume-balanced partition]
34. Low dimensions: Fringe Size vs Space Partition
Which space partition is the best?
Greece, k=1 ...
[Figure panels: data-balanced partition, volume-balanced partition]
35. Low dimensions: Max Throughput vs Space Partition
Which space partition is the best?
Greece, k=1 ...
[Figure panels: data-balanced partition, volume-balanced partition]
59. High dimensions
Curse of dimensionality
“When the dimensionality increases,
the volume of the space
increases so fast that the available data becomes sparse.”
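A small numeric illustration of the quoted effect, not taken from the slides: in the unit hypercube, a sub-cube that covers a fixed fraction of the volume needs an edge of length fraction^(1/d), which approaches the full extent of the space as d grows. The function name `edge_length` is hypothetical.

```python
def edge_length(fraction: float, d: int) -> float:
    """Edge of a sub-cube of [0, 1]^d that covers the given volume fraction."""
    return fraction ** (1.0 / d)


# Edge needed to enclose 1% of the volume in 2, 10 and 100 dimensions.
edges = {d: edge_length(0.01, d) for d in (2, 10, 100)}
```

With uniformly distributed data, a query region meant to capture 1% of the points must therefore span almost the whole range of every dimension once d is large, which limits how much of the space a search can prune.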