A Geometric Distance Oracle for Large Real-World
Graphs
Hong, Ong Xuan
Data Science School
November 16, 2017
Hong, Ong Xuan (Data Science School) A Geometric Distance Oracle for Large Real-World GraphsNovember 16, 2017 1 / 30
Contents
1 Introduction
2 Background
3 Related works
4 Proposed method
5 Evaluation
6 Results
7 Discussion
Introduction
Explosion of available information → mining information about
interactions between subscribers, groups, people, objects, etc.
A fundamental graph computation is finding the shortest-path
distance between arbitrary nodes, but:
Computing and querying distances is slow.
Memory for storing the graph is limited.
How can this analysis be done efficiently?
Background
Graph theory.
Distance oracle.
Approximate distance.
Metric space: Euclidean, Hyperbolic.
δ - hyperbolic metric space.
Graph theory
Let G(V, E) be an undirected, weighted graph with n = |V| nodes and
m = |E| edges. What is the distance between nodes s and t?
Dijkstra's algorithm: O(m + n log n) per query with a Fibonacci heap,
requires no precomputed storage.
All-pairs distance matrix: query time O(1), requires O(n^2) extra space.
Floyd-Warshall algorithm: returns all-pairs shortest paths, initialized
in time O(n^3).
How can we use less than O(n^2) space and answer queries in less than
O(m + n log n) time?
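As a point of reference for the per-query baseline, here is a minimal sketch of Dijkstra's algorithm answering a single s-t query. It assumes nonnegative edge weights and uses Python's binary heap (`heapq`), which gives O((m + n) log n) rather than the Fibonacci-heap bound quoted above. The names `dijkstra_distance` and `example_graph` are illustrative, not from the slides.

```python
import heapq

def dijkstra_distance(adj, s, t):
    """Shortest-path distance from s to t, or inf if t is unreachable.

    adj maps each node to a list of (neighbor, weight) pairs; weights are
    assumed to be nonnegative.
    """
    dist = {s: 0.0}
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d  # the first time t is popped, its distance is final
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path to u was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

# Small undirected example: every edge is listed in both directions.
example_graph = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("s", 1.0), ("b", 2.0), ("t", 6.0)],
    "b": [("s", 4.0), ("a", 2.0), ("t", 3.0)],
    "t": [("a", 6.0), ("b", 3.0)],
}
print(dijkstra_distance(example_graph, "s", "t"))
```

Every query repeats this search from scratch, which is the cost a distance oracle tries to avoid.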
Distance oracle
A distance oracle is a data structure that is cheap to precompute and fast
to query (ideally in constant time). It should satisfy four properties:
Preprocessing time should be O(n) or O(n log n).
Storage should be less than O(n^2).
Query time should be less than O(m + n log n).
Fidelity: approximate distances should be as close as possible to the
actual distances.
Approximate distance oracles
Using spanning trees and distance labeling for approximating distances
(Thorup and Zwick):
Preprocessing time: O(k m n^{1/k}).
Storage: O(k n^{1+1/k}).
Query time: O(k).
Fidelity: the ratio of estimated distance to actual distance lies in [1, 2k − 1].
Note: k ranges over 1, 2, ..., log n; higher values of k do not improve the
space or preprocessing time.
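To get a feel for the trade-off controlled by k, the sketch below simply evaluates the asymptotic expressions above with all hidden constants dropped. The graph size (n, m) is a hypothetical example, and the printed numbers indicate orders of magnitude only, not measured costs.

```python
import math

def thorup_zwick_bounds(n, m, k):
    """Evaluate the bounds listed above, ignoring constant factors."""
    return {
        "k": k,
        "preprocessing": k * m * n ** (1.0 / k),
        "storage": k * n ** (1.0 + 1.0 / k),
        "query": k,
        "max_stretch": 2 * k - 1,
    }

# Hypothetical graph size, chosen only for illustration.
n, m = 10**6, 10**7
for k in (1, 2, 3, int(math.log2(n))):
    b = thorup_zwick_bounds(n, m, k)
    print(f"k={b['k']:2d}  storage~{b['storage']:.2e}  "
          f"preprocessing~{b['preprocessing']:.2e}  stretch<={b['max_stretch']}")
```

With k = 1 the structure degenerates to a full O(n^2) table with exact answers; larger k trades accuracy for space.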
Metric space
Ordered pair (M, d) where M is a set and d is a metric
d : M × M → R
∀x, y, z ∈ M, the following holds:
d(x, y) ≥ 0
d(x, y) = 0 ⇐⇒ x = y
d(x, y) = d(y, x)
d(x, z) ≤ d(x, y) + d(y, z)
Euclidean distance
\[
d(p, q) = d(q, p) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \cdots + (q_n - p_n)^2}
= \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}
\]
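The metric axioms and the Euclidean distance can be checked against each other numerically. The sketch below tests the four axioms on a finite sample of points; passing on a sample does not prove a function is a metric, but a failure disproves it. The helper names and the tolerance `tol` are assumptions made for this illustration.

```python
import itertools
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

def satisfies_metric_axioms(points, d, tol=1e-9):
    """Check the four metric axioms for d on a finite list of points."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0:                       # non-negativity
            return False
        if (d(x, y) <= tol) != (x == y):      # d(x, y) = 0 iff x = y
            return False
        if abs(d(x, y) - d(y, x)) > tol:      # symmetry
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:  # triangle inequality
            return False
    return True

sample_points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (-2.0, 5.0)]
print(satisfies_metric_axioms(sample_points, euclidean))
```

Squared Euclidean distance is a useful counterexample: on the collinear points 0, 1, 2 it violates the triangle inequality, so the same check rejects it.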
Hyperbolic distance
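\[
d(\langle x_1, y_1 \rangle, \langle x_2, y_2 \rangle)
= \operatorname{arcosh}\bigl(\cosh y_1 \cosh(x_2 - x_1) \cosh y_2 - \sinh y_1 \sinh y_2\bigr)
\]
Where:
\(\sinh x = \frac{e^{x} - e^{-x}}{2}\) (hyperbolic sine).
\(\cosh x = \frac{e^{x} + e^{-x}}{2}\) (hyperbolic cosine).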
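The two-dimensional hyperbolic distance above translates directly into code. This is a minimal sketch; the function name is illustrative, and the clamp to 1.0 is an added safeguard because floating-point rounding can push the arcosh argument slightly below its domain.

```python
import math

def hyperbolic_distance_2d(p1, p2):
    """Hyperbolic distance between points given as (x, y) coordinate pairs."""
    x1, y1 = p1
    x2, y2 = p2
    arg = (math.cosh(y1) * math.cosh(x2 - x1) * math.cosh(y2)
           - math.sinh(y1) * math.sinh(y2))
    # arcosh is only defined for arguments >= 1; clamp away rounding error.
    return math.acosh(max(1.0, arg))

print(hyperbolic_distance_2d((0.0, 0.0), (2.0, 0.0)))
```

Two sanity checks follow from the formula: along y = 0 the distance reduces to |x2 − x1|, and for equal x it reduces to |y2 − y1|, since cosh a cosh b − sinh a sinh b = cosh(a − b).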
δ - hyperbolic metric space
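A metric space (V, d) embeds into a tree metric iff the 4-point condition
holds. For all w, x, y, z ∈ V, relabel the points so that:
S := S(w, x, y, z) = d(w, x) + d(y, z)
M := M(w, x, y, z) = d(x, y) + d(w, z)
L := L(w, x, y, z) = d(x, z) + d(w, y)
S ≤ M ≤ L
The 4-point condition requires L = M for every such quadruple.
The space is δ-hyperbolic, for some δ ≥ 0, if (L − M)/2 ≤ δ for every quadruple.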
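The four-point definition can be evaluated by brute force on small graphs. The sketch below assumes an unweighted, connected graph given as an adjacency dictionary, computes all-pairs distances with BFS, and returns the largest (L − M)/2 over all quadruples. It runs in O(n^4) time, so it is only a way to build intuition, not a method for large graphs.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def four_point_delta(adj):
    """Smallest delta for which the graph metric is delta-hyperbolic."""
    d = {u: bfs_distances(adj, u) for u in adj}
    delta = 0.0
    for w, x, y, z in combinations(adj, 4):
        # Sort the three pairwise sums so that sums = [S, M, L].
        sums = sorted([d[w][x] + d[y][z],
                       d[x][y] + d[w][z],
                       d[x][z] + d[w][y]])
        delta = max(delta, (sums[2] - sums[1]) / 2)
    return delta

path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}           # a tree
cycle_graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}    # a 4-cycle
print(four_point_delta(path_graph), four_point_delta(cycle_graph))
```

The path is a tree, so L = M for its only quadruple and δ = 0; the 4-cycle has sums 2, 2, 4 and therefore δ = 1.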
Related works
Theoretical results provide guaranteed approximation bounds for
specific graph classes:
Distance labeling in hyperbolic graphs
A Note on Distance Approximating Trees in Graphs
Additive spanners and distance and routing labeling schemes for
hyperbolic graphs
A compact routing scheme and approximate distance oracle for
power-law graphs
Reconstructing approximate tree metrics
Essays in Group Theory
Diameters, centers, and approximating trees of δ-hyperbolic geodesic
spaces and graphs
But these results have not been empirically evaluated on real-world graphs.
Related works
Spanning trees
Quick query O(nlogn).
Reduce space storage.
Related works
Developing approximate distance oracles on empirical graphs: small-world
graphs, hypergrid graphs, Facebook, telecom, Google News graph, web
graph, etc.
Efficient Shortest Paths on Massive Social Graphs
Fast fully dynamic landmark-based estimation of shortest path
distances in very large graphs
Querying Shortest Path Distance with Bounded Errors in Large
Graphs
Orion: shortest path estimation for large social graphs
Approximating Shortest Paths in Social Graphs
Fast exact shortest-path distance queries on large networks by pruned
landmark labeling
Toward a distance oracle for billion-node graphs
Heuristics lack a theoretical foundation.
Proposed method
Hyperbolicity-based Breadth-First Search (HyperBFS). Uses the notion of graph
hyperbolicity on real-world networks to build spanning trees:
Height: O(log n).
Distance queries: O(log n).
Storage: O(n) words of space for an n-node graph.
Algorithm
Hyperbolicity-based Tree Oracle: constructing the geometric oracle
Choose a highly central vertex (centrality measured on shortest paths)
as the root. In practice, out-degree is used instead, because in
power-law networks the two are correlated.
Build 1-10 BFS trees with distinct roots, chosen in degree order, to
improve the approximation → the distance labelings can be computed in
parallel.
The distance between x and y is the minimum of their distances over the
constructed trees.
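The steps above can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: it assumes an unweighted, undirected, connected graph, uses plain degree as the stand-in for out-degree, and answers a tree query by walking both nodes up to their lowest common ancestor, which costs time proportional to the tree height. All function names are hypothetical.

```python
from collections import deque

def bfs_tree(adj, root):
    """Return (parent, depth) maps of a BFS spanning tree rooted at root."""
    parent, depth = {root: None}, {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in depth:
                parent[v], depth[v] = u, depth[u] + 1
                queue.append(v)
    return parent, depth

def tree_distance(tree, x, y):
    """Distance between x and y inside one tree, via their common ancestor."""
    parent, depth = tree
    steps = 0
    while depth[x] > depth[y]:          # lift the deeper node first
        x, steps = parent[x], steps + 1
    while depth[y] > depth[x]:
        y, steps = parent[y], steps + 1
    while x != y:                       # then lift both until they meet
        x, y, steps = parent[x], parent[y], steps + 2
    return steps

def build_oracle(adj, num_trees=3):
    """Build BFS trees rooted at the num_trees highest-degree nodes."""
    roots = sorted(adj, key=lambda u: len(adj[u]), reverse=True)[:num_trees]
    return [bfs_tree(adj, r) for r in roots]

def estimate_distance(oracle, x, y):
    """Upper bound on the true distance: the minimum over all trees."""
    return min(tree_distance(t, x, y) for t in oracle)

example_adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 4], 3: [0, 4], 4: [2, 3]}
print(estimate_distance(build_oracle(example_adj, 1), 1, 4),
      estimate_distance(build_oracle(example_adj, 2), 1, 4))
```

On this small graph a single tree rooted at node 0 estimates the distance between nodes 1 and 4 as 3, while adding a second tree rooted at node 2 recovers the true distance 2. Tree paths are also graph paths, so an estimate is never smaller than the true distance.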
Algorithm
Set 1: Embedding graph into multi-dimensional geometric space
Mapping the nodes of the graph into points in the hyperbolic space.
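The distance between two d-dimensional points x = (x_1, x_2, ..., x_d) and
y = (y_1, y_2, ..., y_d) is defined as follows:
\[
\operatorname{arcosh}\left( \sqrt{\Bigl(1 + \sum_{i=1}^{d} x_i^2\Bigr)\Bigl(1 + \sum_{i=1}^{d} y_i^2\Bigr)} - \sum_{i=1}^{d} x_i y_i \right) \cdot |c|
\]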
Note: no guarantees on the distance estimation error
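A minimal sketch of the d-dimensional distance used by the embedding follows. It assumes the product of the two sums sits under a square root (the hyperboloid-model form, which makes the distance from a point to itself zero) and treats c as a scaling constant, since the slide does not define it. The function name and the default c = 1 are assumptions.

```python
import math

def hyperboloid_distance(x, y, c=1.0):
    """Hyperbolic distance between two coordinate vectors, scaled by |c|."""
    sx = sum(xi * xi for xi in x)
    sy = sum(yi * yi for yi in y)
    dot = sum(xi * yi for xi, yi in zip(x, y))
    arg = math.sqrt((1.0 + sx) * (1.0 + sy)) - dot
    # Clamp to the domain of arcosh to absorb floating-point rounding.
    return math.acosh(max(1.0, arg)) * abs(c)

origin = (0.0, 0.0)
point = (math.sinh(1.5), 0.0)
print(hyperboloid_distance(origin, point))
```

From the origin the argument reduces to sqrt(1 + sinh^2 r) = cosh r, so a point with coordinate sinh(r) lies at distance r. Evaluating the distance takes O(d) time per query once the coordinates are stored.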
Algorithm
Set 2: Gromov-type tree contraction: improves the accuracy of distance
estimates.
Partition the tree into i-level connected components (coalescing multiple
edges into a single edge).
The additive error is guaranteed not to exceed 2δ log n, where δ is the
hyperbolic constant of the graph.
Evaluation
Four methods benchmarked:
Gromov-type contraction-based tree.
Steiner trees with proven multiplicative bound.
Rigel: landmark-based approach.
HyperBFS: centrality-based spanning tree oracle.
Setup
2.4 GHz Intel(R) Xeon(R) processor with 190GB of RAM.
Calculate distortion: Let x, y be vertices of a graph G, let d_G be their true
distance in G, and let d_A be the distance approximated by a distance oracle:
Additive distortion: d_G − d_A.
Absolute distortion: |d_G − d_A|.
Multiplicative distortion: |d_G − d_A| / d_G.
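The three distortion measures are simple to compute once true and approximated distances are available for a set of sampled pairs. The sketch below is an illustration with hypothetical function names; it assumes the two vertices of each pair are distinct, so that d_G > 0 and the multiplicative distortion is defined.

```python
def distortions(d_true, d_approx):
    """Return (additive, absolute, multiplicative) distortion for one pair."""
    if d_true <= 0:
        raise ValueError("d_true must be positive (distinct, connected vertices)")
    additive = d_true - d_approx
    absolute = abs(additive)
    return additive, absolute, absolute / d_true

def average_distortions(pairs):
    """Average each distortion over a list of (d_true, d_approx) pairs."""
    rows = [distortions(dg, da) for dg, da in pairs]
    return tuple(sum(col) / len(rows) for col in zip(*rows))

# Hypothetical sample: (true distance, oracle estimate).
sample = [(4, 6), (2, 2)]
print(average_distortions(sample))
```

For an oracle built from spanning trees the estimate is never below the true distance, so the additive distortion as defined here is zero or negative.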
Figure: Computational Time of Hyper BFS on Call Graph II.
Average absolute error
Figure: Average absolute error on various real-world graphs.
Average additive and multiplicative error
Figure: Average additive and multiplicative error on SantaBarbara Facebook
graph.
Discussion
Exact and approximate algorithms for computing the hyperbolicity of
large-scale graphs (N. Cohen, D. Coudert, A. Lancin)
Indexing and space O(nm) vs O(n).
Query O(n) vs O(logn).
Exact distance vs error bound 2δlogn.
Extending metrics:
Local clustering coefficient:
\[
C_i = \frac{2\,|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}|}{k_i (k_i - 1)}
\]
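The local clustering coefficient counts how many pairs of a node's neighbors are themselves connected. A minimal sketch for an undirected graph stored as a dictionary of neighbor sets is shown below; returning 0.0 for nodes with fewer than two neighbors is a convention assumed here, since the formula is undefined in that case.

```python
from itertools import combinations

def local_clustering(adj, i):
    """Local clustering coefficient C_i of node i in an undirected graph."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention: undefined case reported as 0
    # Count edges e_jk whose endpoints are both neighbors of i.
    links = sum(1 for j, l in combinations(neighbors, 2) if l in adj[j])
    return 2.0 * links / (k * (k - 1))

# A triangle 0-1-2 with a pendant node 3 attached to node 0.
triangle_with_tail = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(local_clustering(triangle_with_tail, 0))
```

Node 0 has three neighbors and only one edge among them, giving 1/3, while node 1 has both of its neighbors connected, giving 1.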
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLkantirani197
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)Areesha Ahmad
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptxAlMamun560346
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Servicenishacall1
 
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)AkefAfaneh2
 
Unit5-Cloud.pptx for lpu course cse121 o
Unit5-Cloud.pptx for lpu course cse121 oUnit5-Cloud.pptx for lpu course cse121 o
Unit5-Cloud.pptx for lpu course cse121 oManavSingh202607
 

Último (20)

Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
 
Unit5-Cloud.pptx for lpu course cse121 o
Unit5-Cloud.pptx for lpu course cse121 oUnit5-Cloud.pptx for lpu course cse121 o
Unit5-Cloud.pptx for lpu course cse121 o
 

Distance oracle - Fast distance queries between any two points on a graph

  • 1. A Geometric Distance Oracle for Large Real-World Graphs. Hong, Ong Xuan, Data Science School, November 16, 2017.
  • 2. Contents: 1 Introduction, 2 Background, 3 Related works, 4 Proposed method, 5 Evaluation, 6 Results, 7 Discussion.
  • 3. Introduction
    - The explosion of available information motivates mining the interactions between subscribers, groups, people, objects, etc.
    - A fundamental graph computation is the shortest-path distance between arbitrary nodes, but:
      - computing and querying distances is slow;
      - memory for storing the graph is limited.
    - How can this analysis be done efficiently?
  • 5. Background: graph theory; distance oracles; approximate distance; metric spaces (Euclidean, hyperbolic); δ-hyperbolic metric spaces.
  • 6. Graph theory
    Let G(V, E) be an undirected, weighted graph with n = |V| nodes and m = |E| edges. What is the distance between nodes s and t?
    - Dijkstra's algorithm: O(m + n log n) per query with a Fibonacci heap; requires no precomputed storage.
    - Precomputed all-pairs distance matrix: O(1) query time, but O(n^2) extra space.
    - Floyd-Warshall algorithm: returns all-pairs shortest paths after O(n^3) initialization time.
    - How can we use less than O(n^2) space and answer queries in less than O(m + n log n) time?
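To make the query-time baseline concrete, here is a minimal sketch of Dijkstra's algorithm. It assumes a connected graph stored as an adjacency dictionary (a hypothetical layout chosen for illustration) and uses Python's binary heap from `heapq`, so it runs in O((m + n) log n) rather than the Fibonacci-heap bound quoted on the slide.

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest-path distances.

    adj maps each node to a list of (neighbor, weight) pairs with
    non-negative weights (assumed layout, not from the slides).
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Small undirected example: every edge is listed in both directions.
example = {
    0: [(1, 1.0), (2, 4.0)],
    1: [(0, 1.0), (2, 2.0), (3, 6.0)],
    2: [(0, 4.0), (1, 2.0), (3, 1.0)],
    3: [(1, 6.0), (2, 1.0)],
}
print(dijkstra(example, 0)[3])
```

Each query reruns the whole search, which is the cost a distance oracle is meant to avoid.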
  • 7. Distance oracle
    A distance oracle is a data structure that is cheap to build and fast to query (ideally in constant time). It should satisfy four properties:
    - Preprocessing time: O(n) or O(n log n).
    - Storage: less than O(n^2).
    - Query time: less than O(m + n log n).
    - Fidelity: approximate distances as close as possible to the actual distances.
  • 8. Approximate distance oracles
    Thorup and Zwick approximate distances using spanning trees and distance labeling. For an integer parameter k ≥ 1:
    - Preprocessing time: $O(k m n^{1/k})$.
    - Storage: $O(k n^{1+1/k})$.
    - Query time: O(k).
    - Fidelity: the ratio of estimated to actual distance lies in [1, 2k − 1].
    Note: the cases of interest are k = 1, 2, ..., log n; higher values of k do not improve the space or preprocessing time.
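As a worked illustration of these bounds (plain arithmetic on the stated formulas, with logarithms taken base 2 as an assumption), three values of k give:

```latex
\begin{align*}
k = 1 &: \quad O(mn) \text{ preprocessing}, \quad O(n^{2}) \text{ storage}, \quad \text{ratio} \le 1 \text{ (exact)} \\
k = 2 &: \quad O(m\sqrt{n}) \text{ preprocessing}, \quad O(n^{3/2}) \text{ storage}, \quad \text{ratio} \le 3 \\
k = \log_2 n &: \quad n^{1/\log_2 n} = 2 \;\Rightarrow\; O(m \log n) \text{ preprocessing}, \quad O(n \log n) \text{ storage}, \quad \text{ratio} \le 2\log_2 n - 1
\end{align*}
```

Beyond k = log n the factor $n^{1/k}$ is already below 2 while the leading factor k keeps growing, which is why larger k brings no further saving.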
  • 9. Metric space
    An ordered pair (M, d), where M is a set and d : M × M → R is a metric, such that for all x, y, z ∈ M:
    - d(x, y) ≥ 0 (non-negativity)
    - d(x, y) = 0 ⟺ x = y (identity of indiscernibles)
    - d(x, y) = d(y, x) (symmetry)
    - d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality)
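The four axioms can be checked by brute force on a small finite set. The sketch below is illustrative only; the function name `is_metric` and the tolerance parameter are assumptions, and the input points are assumed to be distinct.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Return True if d satisfies the four metric axioms on the finite set `points`."""
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol:                        # non-negativity
            return False
        if (abs(d(x, y)) <= tol) != (x == y):     # d(x, y) = 0 iff x = y
            return False
        if abs(d(x, y) - d(y, x)) > tol:          # symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:     # triangle inequality
            return False
    return True

print(is_metric([0, 1, 2, 5], lambda a, b: abs(a - b)))    # absolute difference
print(is_metric([0, 1, 2], lambda a, b: (a - b) ** 2))     # squared difference
```

The squared difference fails because d(0, 2) = 4 exceeds d(0, 1) + d(1, 2) = 2.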
  • 10. Euclidean distance
    $d(p, q) = d(q, p) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \dots + (q_n - p_n)^2} = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$
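A direct transcription of the Euclidean formula, as a small sketch; the function name is an assumption for illustration.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two points given as equal-length sequences."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

print(euclidean_distance((0, 0), (3, 4)))
```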
  • 11. Hyperbolic distance
    $d(\langle x_1, y_1 \rangle, \langle x_2, y_2 \rangle) = \operatorname{arcosh}(\cosh y_1 \cosh(x_2 - x_1) \cosh y_2 - \sinh y_1 \sinh y_2)$
    where $\sinh x = \frac{e^{x} - e^{-x}}{2}$ (hyperbolic sine) and $\cosh x = \frac{e^{x} + e^{-x}}{2}$ (hyperbolic cosine).
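The formula above can be evaluated with the standard `math` module. This is a sketch; the clamp to 1.0 is an added safeguard (an assumption, not part of the formula) against rounding pushing the argument just below the domain of `acosh`.

```python
import math

def hyperbolic_distance(p1, p2):
    """Hyperbolic distance between two points given as (x, y) coordinate pairs."""
    x1, y1 = p1
    x2, y2 = p2
    arg = (math.cosh(y1) * math.cosh(x2 - x1) * math.cosh(y2)
           - math.sinh(y1) * math.sinh(y2))
    # Clamp: floating-point error can give a value slightly below 1.
    return math.acosh(max(1.0, arg))

# With y1 = y2 = 0 the formula reduces to |x2 - x1|.
print(hyperbolic_distance((0.0, 0.0), (2.0, 0.0)))
```

When x1 = x2 the expression collapses to cosh(y1 − y2), so the distance is |y1 − y2|, which is a quick sanity check on the signs.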
  • 12. δ-hyperbolic metric space
    For any four points w, x, y, z ∈ V of a metric space (V, d), form the three pairwise sums d(w, x) + d(y, z), d(x, y) + d(w, z), and d(x, z) + d(w, y), and label them S ≤ M ≤ L in increasing order.
    - (V, d) embeds into a tree metric iff the 4-point condition holds, that is, L = M for every quadruple.
    - (V, d) is δ-hyperbolic, for δ ≥ 0, if (L − M)/2 ≤ δ for every quadruple.
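The definition translates into a brute-force computation of the smallest admissible δ. The sketch below assumes the metric is given as a full distance matrix (a hypothetical input format) and checks every quadruple, so it costs O(n^4) and is only practical for small examples.

```python
from itertools import combinations

def four_point_delta(dist):
    """Smallest delta such that (L - M) / 2 <= delta for every quadruple.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    """
    delta = 0.0
    for w, x, y, z in combinations(range(len(dist)), 4):
        sums = sorted([dist[w][x] + dist[y][z],
                       dist[x][y] + dist[w][z],
                       dist[x][z] + dist[w][y]])
        _, M, L = sums          # the two largest sums decide the value
        delta = max(delta, (L - M) / 2)
    return delta

path = [[abs(i - j) for j in range(4)] for i in range(4)]   # path graph, a tree
cycle = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]  # 4-cycle
print(four_point_delta(path), four_point_delta(cycle))
```

The path is a tree, so its value is 0; the 4-cycle gives 1 because its sums are 2, 2, and 4.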
  • 14. Related works
    Theoretical results provide guaranteed approximation bounds for specific graph classes:
    - Distance labeling in hyperbolic graphs
    - A Note on Distance Approximating Trees in Graphs
    - Additive spanners and distance and routing labeling schemes for hyperbolic graphs
    - A compact routing scheme and approximate distance oracle for power-law graphs
    - Reconstructing approximate tree metrics
    - Essays in Group Theory
    - Diameters, centers, and approximating trees of δ-hyperbolic geodesic spaces and graphs
    However, these results have not been empirically evaluated on real-world graphs.
  • 15. Related works: spanning trees
    - Quick queries: O(n log n).
    - Reduced storage space.
  • 16. Related works
    Approximate distance oracles developed on empirical graphs (small-world graphs, hypergrid graphs, Facebook, telecom, Google News graph, web graph, etc.):
    - Efficient Shortest Paths on Massive Social Graphs
    - Fast fully dynamic landmark-based estimation of shortest path distances in very large graphs
    - Querying Shortest Path Distance with Bounded Errors in Large Graphs
    - Orion: shortest path estimation for large social graphs
    - Approximating Shortest Paths in Social Graphs
    - Fast exact shortest-path distance queries on large networks by pruned landmark labeling
    - Toward a distance oracle for billion-node graphs
    However, these heuristics lack a theoretical foundation.
  • 17. Related works
  • 19. Proposed method
    Hyperbolicity-based Breadth-First Search (HyperBFS) uses the notion of graph hyperbolicity in real-world networks to build spanning trees with:
    - Height: O(log n).
    - Distance queries: O(log n).
    - Storage: O(n) words of space for an n-node graph.
  • 20. Algorithm
    Hyperbolicity-based tree oracle: constructing the geometric oracle.
    - Choose a highly central vertex as the root, where centrality is measured from shortest paths. In practice out-degree is used instead, because the two are correlated in power-law networks.
    - Build 1 to 10 BFS trees with distinct roots, taken in decreasing order of degree; the trees are independent, so the distance labeling can be computed in parallel.
    - The estimated distance between x and y is the minimum of their distances over the constructed trees.
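The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the presented implementation: the graph is unweighted, connected, and stored as an adjacency dictionary with sortable node ids; ties in degree are broken by node id; and all function names are hypothetical. Each tree stores only a parent and a depth per node, and a query walks both nodes up to their lowest common ancestor.

```python
from collections import deque

def bfs_tree(adj, root):
    """BFS spanning tree from root, stored as parent and depth maps."""
    parent, depth = {root: None}, {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in depth:
                parent[v], depth[v] = u, depth[u] + 1
                queue.append(v)
    return parent, depth

def tree_distance(tree, x, y):
    """Number of tree edges between x and y, found by climbing to their common ancestor."""
    parent, depth = tree
    steps = 0
    while depth[x] > depth[y]:
        x, steps = parent[x], steps + 1
    while depth[y] > depth[x]:
        y, steps = parent[y], steps + 1
    while x != y:
        x, y, steps = parent[x], parent[y], steps + 2
    return steps

def build_oracle(adj, num_trees=3):
    """One BFS tree per root, roots taken in decreasing order of degree."""
    roots = sorted(adj, key=lambda v: (-len(adj[v]), v))[:num_trees]
    return [bfs_tree(adj, r) for r in roots]

def estimate_distance(oracle, x, y):
    """Minimum tree distance over all trees; never below the true graph distance."""
    return min(tree_distance(t, x, y) for t in oracle)

cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(estimate_distance(build_oracle(cycle, 1), 2, 3))
print(estimate_distance(build_oracle(cycle, 4), 2, 3))
```

Every tree path is also a path in the graph, so each tree gives an upper bound and taking the minimum over more trees can only tighten the estimate. On the 4-cycle, one tree overestimates the distance between adjacent nodes 2 and 3, while four trees recover it.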
  • 21. Algorithm set 1: embedding the graph into a multi-dimensional geometric space
    - Map the nodes of the graph to points in hyperbolic space.
    - The distance between two d-dimensional points x = (x_1, x_2, ..., x_d) and y = (y_1, y_2, ..., y_d) is defined as:
      $\operatorname{arcosh}\left(\sqrt{\left(1 + \sum_{i=1}^{d} x_i^2\right)\left(1 + \sum_{i=1}^{d} y_i^2\right)} - \sum_{i=1}^{d} x_i y_i\right) \cdot |c|$
    - Note: there are no guarantees on the distance estimation error.
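A sketch of evaluating this embedded distance, assuming the square-root (hyperboloid) form of the formula and treating c as a plain scaling constant supplied by the caller; the function name and the default c = 1 are assumptions for illustration.

```python
import math

def embedded_distance(x, y, c=1.0):
    """Hyperbolic distance between two d-dimensional embedded points, scaled by |c|."""
    sx = sum(v * v for v in x)
    sy = sum(v * v for v in y)
    dot = sum(a * b for a, b in zip(x, y))
    arg = math.sqrt((1.0 + sx) * (1.0 + sy)) - dot
    # Clamp: rounding can push the argument slightly below 1.
    return math.acosh(max(1.0, arg)) * abs(c)

# In one dimension, points sinh(a) and sinh(b) are |a - b| apart when c = 1.
print(embedded_distance((math.sinh(1.0),), (math.sinh(3.0),)))
```

Once coordinates are stored, a query costs only O(d) arithmetic, which is the appeal of embedding approaches despite the missing error guarantee.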
  • 22. Algorithm set 2: Gromov-type tree contraction
    - Improves the accuracy of distance estimates.
    - Partitions the tree into i-level connected components, coalescing multiple edges into a single edge.
    - The additive error is guaranteed not to exceed 2δ log n, where δ is the hyperbolicity constant of the graph.
  • 24. Evaluation
    Four methods are benchmarked:
    - Gromov-type contraction-based tree.
    - Steiner trees with a proven multiplicative bound.
    - Rigel: landmark-based approach.
    - HyperBFS: centrality-based spanning tree oracle.
  • 25. Setup
    - Hardware: 2.4 GHz Intel(R) Xeon(R) processor with 190 GB of RAM.
    - Distortion: let x, y be vertices of a graph G, let d_G be their distance in G, and let d_A be the distance approximated by a distance oracle.
      - Additive distortion: $d_G - d_A$.
      - Absolute distortion: $|d_G - d_A|$.
      - Multiplicative distortion: $\frac{|d_G - d_A|}{d_G}$.
    - Figure: Computational time of HyperBFS on Call Graph II.
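The three distortion measures are simple to compute per pair and to average over a sample of pairs. The sketch below assumes each sample is a (d_G, d_A) tuple with d_G > 0; the function names are hypothetical.

```python
def distortions(d_true, d_approx):
    """Additive, absolute, and multiplicative distortion for one vertex pair."""
    additive = d_true - d_approx
    absolute = abs(additive)
    multiplicative = absolute / d_true   # requires d_true > 0
    return additive, absolute, multiplicative

def average_distortions(pairs):
    """Average each distortion measure over a list of (d_true, d_approx) pairs."""
    rows = [distortions(t, a) for t, a in pairs]
    return tuple(sum(col) / len(rows) for col in zip(*rows))

print(average_distortions([(4, 5), (2, 2)]))
```

An oracle that only overestimates, such as a spanning-tree oracle, always has a non-positive additive distortion under this sign convention.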
  • 27. Average absolute error. Figure: average absolute error on various real-world graphs.
  • 28. Average additive and multiplicative error. Figure: average additive and multiplicative error on the SantaBarbara Facebook graph.
  • 30. Discussion
    - Exact and approximate algorithms for computing the hyperbolicity of large-scale graphs (N. Cohen, D. Coudert, A. Lancin).
    - Indexing and space: O(nm) vs O(n).
    - Query: O(n) vs O(log n).
    - Exact distance vs error bound 2δ log n.
    - Extending metrics: local clustering coefficient, $C_i = \frac{2\,|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}|}{k_i (k_i - 1)}$
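The local clustering coefficient counts the edges among a node's neighbors and divides by the number of possible neighbor pairs. A minimal sketch, assuming an undirected graph stored as a dictionary of neighbor sets and returning 0.0 for nodes with fewer than two neighbors (a convention chosen here, since the formula is undefined in that case):

```python
from itertools import combinations

def local_clustering(adj, i):
    """Local clustering coefficient C_i of node i; adj maps nodes to neighbor sets."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0   # assumed convention: the formula divides by zero here
    # Count edges e_jk whose endpoints are both neighbors of i.
    links = sum(1 for j, l in combinations(neighbors, 2) if l in adj[j])
    return 2 * links / (k * (k - 1))

triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(local_clustering(triangle, 0), local_clustering(star, 0))
```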