Statistical Network Models: How to Leverage Network Structure to Improve Estimation & Prediction

2nd Workshop on Offline & Online Evaluation of Interactive Systems @ KDD 2019

Graphs – rich and powerful data representation
• Social network
• Human Disease Network [Barabasi 2007]
• Food Web [2007]
• Terrorist Network [Krebs 2002]
• Internet (AS) [2005]
• Gene Regulatory Network [Decourty 2008]
• Protein Interactions [breast cancer]
• Political blogs
• Power grid

Heterogeneous Graphs
Graphs with multiple node/link types, strength attributes, weights …
• User-items
• Document-Words
• Question-Answer
• Wiki-talk, user-edits, …
Example link types: User A – User B, User A – Item K, User A – Webpage
What is the impact of heterogeneous structure on prediction and error?

Many examples (from online interactive systems):
emails, user logs, weblogs, transactions, online social networks, news feeds, Q&A forums …
[Figure: interactions User A – User B, User A – Item K, User A – Webpage observed over time (t1, t2, ⋯)]
How to handle concept drift, partial observability, and higher-order relationships across time?

(1) Statistical Online/Streaming Methods
(2) Graph Representation Learning

Discrete-time models: represent a dynamic network as a sequence of static snapshot graphs, each aggregated over a user-defined time interval
e.g., snapshot t=1 covers [1, 5] and snapshot t=2 covers [6, 10]
Streaming Model

Discrete Time Model:
Very coarse representation with noise/error problems
Difficult to manage at a large scale
Missing higher-order dependencies
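To make the discrete-time representation concrete, here is a minimal sketch of how a timestamped edge stream could be bucketed into snapshot graphs. The function name `snapshots`, the `(u, v, t)` tuple layout, and the toy stream are illustrative assumptions, not part of the talk.

```python
from collections import defaultdict

def snapshots(edge_stream, interval):
    """Group timestamped edges (u, v, t) into {snapshot index: set of edges}.

    Timestamps are assumed to start at 1, so snapshot 1 covers [1, interval],
    snapshot 2 covers [interval + 1, 2 * interval], and so on.
    """
    graphs = defaultdict(set)
    for u, v, t in edge_stream:
        index = (t - 1) // interval + 1
        graphs[index].add((u, v))
    return dict(graphs)

# Hypothetical toy stream: A-B is observed twice inside the first interval.
stream = [("A", "B", 1), ("A", "C", 4), ("A", "B", 5), ("B", "C", 6), ("C", "D", 10)]
result = snapshots(stream, interval=5)
print(result)
```

Note that the repeated A-B interaction collapses into a single edge of snapshot 1, and the ordering of events inside an interval is lost. This is one way the coarseness mentioned above shows up.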

Statistical Randomized Methods with Theoretical Guarantees
Graph Stream → Sketching/Sampling/Summarization → Sketch/Sample/Summary Graph (S)
§ Not possible to store the entire data stream
§ Faster/convenient to work with a compact summary
§ Incremental & online updates
Turn large streaming data into small manageable & useful data
+ Statistical Estimation
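As a generic illustration of a fixed-memory summary with incremental updates, the sketch below uses classic uniform reservoir sampling over an edge stream. This is a standard textbook technique swapped in for illustration, not the sampling scheme proposed in the talk; the function name and the toy stream are assumptions.

```python
import random

def reservoir_sample(stream, k, rng):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    sample = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            sample.append(item)       # fill the reservoir first
        else:
            j = rng.randrange(n)      # uniform integer in [0, n)
            if j < k:
                sample[j] = item      # replace a stored item with probability k/n
    return sample

# Hypothetical stream of 1000 distinct edges, consumed one at a time.
edges = [(i, i + 1) for i in range(1000)]
summary = reservoir_sample(iter(edges), k=10, rng=random.Random(42))
print(len(summary))
```

Only `k` edges are ever stored, and each arrival costs O(1) work, which matches the "compact summary" and "incremental & online updates" requirements listed above.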

Sketch/Sample/Summary Graph (S) → Online ML Algorithms → Model Learning
• Estimate Network Structure
• Network Parameter Estimation
• Feature Representation
• Higher-order Dependencies
• …

Sampling/Sketching design considerations:
• Query Requirements: accuracy, aggregates, top-k ranks, speed, privacy
• Resource Constraints: bandwidth, storage, memory, access constraints
• Data Characteristics: heavy-tailed distributions, correlations, clusters, rare events

Goal: Parameter Estimation, Network Estimation
Data Collection → Sample/Sketch → Learning a Model

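[Figure: bias-variance tradeoff. x-axis: Statistical Estimator Complexity; y-axis: Estimation Error; curves: Bias², Variance, MSE; regions: High Bias, High Variance; the Optimal Estimator sits at the minimum of the MSE curve]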
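The curves in this figure follow from the standard decomposition of mean squared error. The symbols $\hat{\theta}$ (an estimator) and $\theta$ (the true value) are notation introduced here for illustration:

```latex
\begin{align}
\mathrm{MSE}(\hat{\theta})
  &= E\big[(\hat{\theta} - \theta)^2\big] \\
  &= \underbrace{\big(E[\hat{\theta}] - \theta\big)^2}_{\text{Bias}^2}
   + \underbrace{E\big[(\hat{\theta} - E[\hat{\theta}])^2\big]}_{\text{Variance}}
\end{align}
```

An unbiased estimator makes the first term zero but may leave the second term large, which is why a slightly biased estimator can have lower MSE.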

§ Main idea: we define the selection estimator for an edge
Edge Estimator (Horvitz-Thompson, unbiased):
$\hat{S}_{i,t} = I(i \in \hat{K}_t) / p_i$, with $E[\hat{S}_{i,t}] = S_{i,t}$
where $\hat{K}_t$ is the edge sample at time $t$ and $p_i$ is the probability that edge $i$ is in the sample
Subgraph Estimator (unbiased):
For each subgraph $J \subset [t]$, we define the sequence of subgraph estimators as
$\hat{S}_{J,t} = \prod_{i \in J} \hat{S}_{i,t}$, with $E[\hat{S}_{J,t}] = S_{J,t}$
Ahmed et al. KDD14, TKDD14, VLDB17, IJCAI18
See paper for proofs

Unbiased Edge Estimator (Horvitz-Thompson Estimator)
The sample Gs is a proxy for the input graph stream G at time t
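The following sketch illustrates the product-form subgraph estimator on a single triangle. It assumes each edge is sampled independently with a known probability, which is a simplification chosen so that unbiasedness can be checked by enumerating every possible sample; the sampling scheme in the cited papers is different. All names and probabilities are hypothetical.

```python
from itertools import product

def ht_subgraph_estimate(subgraph, sample, prob):
    """Product of inverse inclusion probabilities if every edge of the
    subgraph is in the sample, otherwise 0."""
    estimate = 1.0
    for edge in subgraph:
        if edge not in sample:
            return 0.0
        estimate /= prob[edge]
    return estimate

def expected_estimate(subgraph, prob):
    """Exact expectation under independent edge sampling, by enumeration."""
    edges = list(prob)
    total = 0.0
    for included in product([False, True], repeat=len(edges)):
        sample = {e for e, inc in zip(edges, included) if inc}
        p_outcome = 1.0
        for e, inc in zip(edges, included):
            p_outcome *= prob[e] if inc else 1.0 - prob[e]
        total += p_outcome * ht_subgraph_estimate(subgraph, sample, prob)
    return total

# Hypothetical inclusion probabilities for the three edges of one triangle.
prob = {("a", "b"): 0.5, ("b", "c"): 0.25, ("a", "c"): 0.8}
triangle = list(prob)

full = ht_subgraph_estimate(triangle, set(triangle), prob)
partial = ht_subgraph_estimate(triangle, {("a", "b"), ("b", "c")}, prob)
mean = expected_estimate(triangle, prob)
print(full, partial, mean)
```

When all three edges are sampled the triangle is counted with weight 1/(0.5 · 0.25 · 0.8), otherwise with weight 0, and the expectation over all samples equals 1, the true count of that triangle.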

Adaptive Graph Priority Sampling, APS(m)
Input: edge stream k1, k2, ..., k, ...
Output: sampled edge stream $\hat{K}$
Stored state: $m = O(|\hat{K}|)$
For each edge k:
  Generate a random number: $u(k) \sim \mathrm{Uni}(0, 1]$
  Compute edge weight: $w(k) = W(k, \hat{K})$
  Compute edge priority: $r(k) = w(k)/u(k)$
  $\hat{K} = \hat{K} \cup \{k\}$
See paper for algorithm details

Adaptive Graph Priority Sampling, APS(m) (continued)
For each edge k:
  Find edge with lowest priority: $k^* = \arg\min_{k' \in \hat{K}} r(k')$
  Update sample threshold: $z^* = \max\{z^*, r(k^*)\}$
  Remove lowest priority edge: $\hat{K} = \hat{K} \setminus \{k^*\}$
Use a priority queue with O(log m) updates
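The steps on these two slides can be sketched roughly as follows. This is a simplified illustration rather than the authors' implementation: it assumes the lowest-priority edge is evicted whenever the sample exceeds m edges, that stream edges are distinct, and that a weight is fixed when its edge arrives (the adaptive re-weighting of already-sampled edges is omitted). The weight function `shared_endpoint_weight` is a hypothetical stand-in for $W(k, \hat{K})$.

```python
import heapq
import random

def priority_sample(edge_stream, m, weight_fn, rng):
    """Keep the m highest-priority edges; return (sample, threshold z*)."""
    sample = {}    # edge -> priority r(k)
    heap = []      # min-heap of (priority, edge)
    z_star = 0.0
    for k in edge_stream:
        u = 1.0 - rng.random()            # uniform on (0, 1]
        w = weight_fn(k, sample)          # weight depends on the current sample
        r = w / u                         # priority r(k) = w(k) / u(k)
        sample[k] = r
        heapq.heappush(heap, (r, k))
        if len(sample) > m:               # assumption: evict once over budget
            r_min, k_min = heapq.heappop(heap)
            z_star = max(z_star, r_min)   # update the sample threshold
            del sample[k_min]
    return sample, z_star

def shared_endpoint_weight(k, sample):
    """Hypothetical weight: 1 + number of sampled edges sharing an endpoint."""
    u, v = k
    return 1.0 + sum(1 for (a, b) in sample if {a, b} & {u, v})

# Hypothetical stream: all 15 edges of a 6-node complete graph.
edges = [(i, j) for i in range(6) for j in range(i + 1, 6)]
sample, z_star = priority_sample(edges, 5, shared_endpoint_weight, random.Random(7))
print(len(sample), z_star)
```

The heap gives O(log m) insertion and eviction, as noted on the slide. Every surviving edge ends with a priority at least as large as the threshold z*, which is the property the estimators rely on.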

§ A shrinkage estimator pulls (shrinks) a raw estimate toward a target value
§ Main idea: an (unbiased) estimator could be improved by combining it
with other information
• e.g., combining with prior knowledge, other simple estimators, etc.
James & Stein 1992
Gruber 2017

§ Why would shrunk estimates be better?
• Shrinkage introduces bias, but can significantly decrease the variance.
• If the reduction in variance outweighs the squared bias introduced, the estimation error (MSE) decreases.

Estimation Error Reduction using James-Stein Shrinkage
Unbiased estimators of local subgraph counts are subject to high relative variance when the counts are small, because in this case the individual count estimates, scaled by the inverse probabilities, are less smoothed by aggregation. More generally, James and Stein originated the observation that unbiased estimators do not necessarily minimize mean square error. In their study, unbiased estimates of a high-dimensional Gaussian random variable are adjusted through scaling-based regularization and linear combination with dimensional averages. Here, shrinkage is applied to the unbiased count estimate $\hat{n}_k$ by convex combination with the unnormalized count $w_k$ formed by the edge sampling weight.
Define $\hat{\eta}_w = \lambda \hat{n} + \bar{\lambda} w$, with $\bar{\lambda} = 1 - \lambda$, where $w$ represents the weight $w_k$ of any edge $k$, i.e., its unnormalized subgraph count as maintained in Algorithm 1. The loss $L_w(\lambda)$ is the mean squared error (MSE):
$L_w(\lambda) = \mathrm{Var}(\hat{\eta}_w) + (E[\hat{\eta}_w] - n)^2 = \lambda^2 \mathrm{Var}(\hat{n}) + \bar{\lambda}^2 \mathrm{Var}(w) + 2\lambda\bar{\lambda}\,\mathrm{Cov}(\hat{n}, w) + \bar{\lambda}^2 (E[\hat{n} - w])^2$
A straightforward computation shows that $L_w(\lambda)$ is minimized when $1 - \lambda = \mathrm{Cov}(\hat{n} - w, \hat{n}) / E[(\hat{n} - w)^2]$. A plug-in estimator $\hat{\lambda}_w$ for $\lambda$ is obtained by substituting $(\hat{n} - w)^2$ in the denominator, and an estimate for $\mathrm{Cov}(\hat{n} - w, \hat{n})$ in the numerator.
Edge weights: the edge weight $w_i$ is a function of the sampled graph (the motifs that contain edge $i$). In the static case, the weights can be computed by counting the number of motifs incident to each edge; in the streaming model the entire graph cannot be stored, so the weight of a sampled edge is set to 1 initially and then adapted as the stream evolves. Edges are indexed by arrival time; e.g., if $J_1 = \{4, 20, 100\}$ is a triangle, edge 100 is its last arriving edge, and $J_1$ is counted if and only if edges 4 and 20 are already in the sample.
The unnormalized subgraph estimator is $\hat{I}_{J,t} = I(J \subset \hat{K}_t)$; the covariances amongst the $\hat{n}_{k,t}$ and $w_{k,t}$ are needed to compute the shrinkage intensity.
Minimize the mean squared error (MSE) over convex combinations of:
Unbiased Estimator ($\hat{n}$) · Simple Biased Estimator ($w$) → Shrinkage/Combined Estimator ($\hat{\eta}_w$)
Shrinkage intensity (for each sampled edge)
See paper for proofs
Ahmed et al. 2019, arXiv:1908.01087
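A small numeric sketch of the shrinkage loss and its minimizer may help. It assumes the population moments Var(n̂), Var(w), Cov(n̂, w) and E[n̂ − w] are known, whereas in practice they would be replaced by plug-in estimates as described in the paper. The function names and numbers are illustrative only.

```python
def shrinkage_loss(lam, var_n, var_w, cov_nw, mean_diff):
    """MSE of eta = lam * n_hat + (1 - lam) * w, where n_hat is unbiased
    and mean_diff = E[n_hat - w]."""
    lam_bar = 1.0 - lam
    return (lam ** 2 * var_n
            + lam_bar ** 2 * var_w
            + 2.0 * lam * lam_bar * cov_nw
            + lam_bar ** 2 * mean_diff ** 2)

def optimal_lambda(var_n, var_w, cov_nw, mean_diff):
    """Solve 1 - lam = Cov(n_hat - w, n_hat) / E[(n_hat - w)^2]."""
    cov_diff_n = var_n - cov_nw                                  # Cov(n_hat - w, n_hat)
    second_moment = var_n + var_w - 2.0 * cov_nw + mean_diff ** 2  # E[(n_hat - w)^2]
    return 1.0 - cov_diff_n / second_moment

# Hypothetical moments: a noisy unbiased estimate and a stable but biased weight.
moments = dict(var_n=100.0, var_w=4.0, cov_nw=5.0, mean_diff=10.0)
lam_opt = optimal_lambda(**moments)
loss_opt = shrinkage_loss(lam_opt, **moments)
loss_unbiased = shrinkage_loss(1.0, **moments)   # lam = 1 keeps only n_hat
print(lam_opt, loss_opt, loss_unbiased)
```

With these numbers the unbiased estimator alone has MSE 100, while the combined estimator has MSE of roughly 53.5, even though it is biased.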

[Figure: local edge triangle count (0 to 700) vs. top-k edges (log scale, 10^0 to 10^6) on soc-livejournal. Unbiased Estimator panels: Exact, APS f=0.40, APS UB, APS LB. Shrinkage Estimator panels: Exact, APS JS f=0.40, APS JS UB, APS JS LB.]
Goal: Estimate the higher-order motif-weighted network at any time t, e.g., the triangle-weighted graph
Ahmed et al. 2019
arXiv:1908.01087

[Figure: edge weight (0 to 1200) vs. top-k edges (log scale, 10^0 to 10^5) on stackoverflow. Legend entries: ground truth, GPS f = 0.10 (first panel); APS (second panel). Diagram: User A and User B joined by an edge with weight W(A,B).]
Goal: Estimate the aggregated weighted graph at any time t
e.g., W(A,B) = # communications between A & B
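For reference, the exact quantity being estimated can be computed as below when the whole stream fits in memory. The sampling methods above aim to approximate these weights without storing every pair. The function name and the toy interaction list are assumptions for illustration.

```python
from collections import Counter

def aggregate_weights(interactions):
    """W(A, B) = number of communications between A and B (undirected)."""
    weights = Counter()
    for a, b in interactions:
        weights[frozenset((a, b))] += 1
    return weights

# Hypothetical interaction stream between users.
interactions = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "B")]
W = aggregate_weights(interactions)
print(W[frozenset(("A", "B"))])
```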

(1) Statistical Online/Streaming Methods
(2) Graph Representation Learning✓

§ Goal: Learn representation (features) for a set of graph
elements (nodes, edges, etc.)
§ Key intuition: Map the graph elements (e.g., nodes) to a
d-dimensional space
§ Use the features for any downstream prediction task

Communities: cohesive subsets of nodes → Proximity
Roles: represent structural patterns → Structural Similarity
- two nodes belong to the same role if they have similar structural patterns
TKDE 2015
AAAI 2017

e.g., DeepWalk, GraRep, node2vec, LINE, etc.
No guarantee that nearby vertices are structurally similar
Ahmed et al.
IJCAI-StarAI 2018

Role2vec: a general, principled framework for learning graph embeddings that capture structural similarity (roles)
$P\big[\langle x_{c_i} \rangle \mid \langle x_i \rangle\big] = \prod_{j \in c_i} P\big(\langle x_j \rangle \mid \langle x_i \rangle\big)$
16.5% avg. improvement in accuracy
Space Efficient with 850x space savings
[Figure: subgraph patterns G1 to G9, with node positions numbered 1 to 15]
Ahmed et al.
IJCAI-StarAI 2018

Attributed Random Walk (Role2vec)
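A rough sketch of the attributed (feature-based) walk idea: instead of recording node identities along a random walk, record a type derived from each node's features, so that structurally similar nodes share context even when they are far apart. The degree-based type function, the toy graph, and all names below are illustrative assumptions, not the Role2vec implementation.

```python
import random

def attributed_walks(adj, node_type, walk_length, walks_per_node, rng):
    """Random walks that emit node types instead of node identities."""
    walks = []
    for start in sorted(adj):
        for _ in range(walks_per_node):
            node = start
            walk = [node_type(start)]
            for _ in range(walk_length - 1):
                neighbors = adj[node]
                if not neighbors:
                    break
                node = rng.choice(neighbors)
                walk.append(node_type(node))
            walks.append(walk)
    return walks

# Hypothetical graph: two star-like hubs (0 and 4) joined through node 3.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3, 5, 6], 5: [4], 6: [4]}

def degree_type(v):
    """Assumed type function: nodes with equal degree share a type."""
    return f"deg{len(adj[v])}"

walks = attributed_walks(adj, degree_type, 5, 2, random.Random(0))
print(walks[0])
```

Nodes 0 and 4 are not adjacent, yet both map to the same type, so any model trained on these type sequences treats them alike. The resulting sequences could then be fed to a skip-gram style model to learn one embedding per type.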

§ Statistical online methods for streaming graphs in real-time
and interactive systems
• Evaluation challenges: bias and variance & their tradeoffs
• Unbiased estimators
• Shrinkage estimators to reduce variance
§ Graph representation learning
• Structural similarity vs. proximity
• Introduced notion of feature-based walks & a framework for
generalizing existing methods based on it

§ Estimation with quadratic loss. [James & Stein] – In Breakthroughs in
statistics, 1992
§ Improving Efficiency by Shrinkage: The James–Stein and Ridge Regression
Estimators. [Gruber] – Routledge, 2017
§ Graph Sample and Hold. [Ahmed et al.] – ACM SIGKDD 2014
§ Network Sampling: From Static to Streaming Graphs. [Ahmed et al.] – ACM
TKDD 2014
§ On Sampling from Massive Graph Streams. [Ahmed et al.] – VLDB 2017
§ Sampling for approximate bipartite network projection. [Ahmed et al.] –
IJCAI 2018
§ Network Shrinkage Estimation. [Ahmed et al.] – arXiv:1908.01087 2019
§ Learning Role-based Graph Embeddings. [Ahmed et al.] – IJCAI-StarAI 2018
§ Role Discovery in Networks. [Rossi and Ahmed] – TKDE 2015
