OECD bibliometric indicators: Selected highlights, April 2024
Absorbing Random Walk Centrality
1. Absorbing
Random
Walk
Centrality
Theory
and
Algorithms
1
Harry
Mavroforakis,
Boston
University
Michael
Mathioudakis,
Aalto
University
ArisAdes
Gionis,
Aalto
University
Helsinki
-‐
September
15th
2015
2. 2
submit
query
to
twiKer
e.g.
‘#ferguson’
want
to
see
messages
from
few,
central
users
many
and
different
groups
of
people
might
be
posAng
about
it
3. one
approach...
represent
acAvity
with
graph
3
users
in
the
results
(query
nodes)
other
users
connecAons
4. 4
select
k
central
query
nodes
what
is
‘central’?
many
measures
have
been
studied
here,
we
use
random
walks
for
robustness
absorbing
random
walk
centrality
5. absorbing
random
walk
centrality
5
model
start
from
query
node
q
w.p.
s(q)
re-‐start
with
probability
α
at
each
step
transiAons
follow
edges
at
random
from
one
node
to
another
absorpAon
we
can
designate
set
of
absorbing
nodes
no
escape
from
absorbing
nodes
centrality
of
nodes
S
expected
Ame
(number
of
steps)
unAl
absorbed
by
S
6. problem
input
graph
,
query
nodes
,
α
&
candidate
nodes
(e.g.
query
nodes
or
all
nodes)
select
k
candidate
nodes
with
minimum
absorpAon
Ame
6
7. 7
select
nodes
that
are
central
w.r.t.
the
query
nodes
k
=
1
8. 8
select
nodes
that
are
central
w.r.t.
the
query
nodes
k
=
3
9. 9
select
nodes
that
are
central
w.r.t.
the
query
nodes
k
≥|Q|
12. approximaAon
centrality
measure
monotonicity
absorpAon
Ame
decreases
with
more
absorbing
nodes
supermodularity
diminishing
returns
12
13. approximaAon
centrality
gain
mc:
min
centrality
for
k=1
gain
=
mc
-‐
centrality,
k>1
non-‐negaAve,
non-‐
decreasing,
submodular
13
S
υ {u}
S
υ
{u,v}
S
greedy
algorithm
(1-1/e)-‐approximaAon
guarantee
for
gain
14. greedy
S
=
empty
for
i
=
1..k
for
u
in
V
-‐
S
calculate
centrality
of
S
υ {u}
(*)
update
S
:=
S
υ {best
u}
14
boKleneck
is
in
line
(*)
one
matrix
inversion
(super-‐quadraAc)
for
step
(*)
use
sherman-morrison
to
perform
(*)
in
O(n2)
with
one
(1)
inversion
for
first
node
however...
sAll
O(kn3)
15. heurisAcs
&
baselines
• personalized
pagerank
with
same
α
• spectralQ
&
spectralD
– spectral
embedding
of
nodes
– k-‐means
on
embedding
of
query
nodes
– select
candidates
close
to
centers
– spectral
Q
selects
more
nodes
from
larger
clusters
• spectralC
– similar
to
spectralQ
but
clustering
on
candidates
• degree
&
distance
centrality
15
16. evaluaAon
16
TABLE I: Dataset statistics
small
Dataset |V | |E|
karate 34 78
dolphins 62 159
lesmis 77 254
adjnoun 112 425
football 115 613
large
Dataset |V | |E|
kddCoauthors 2 891 2 891
livejournal 3 645 4 141
ca-GrQc 5 242 14 496
ca-HepTh 9 877 25 998
roadnet 10 199 13 932
oregon-1 11 174 23 409
Degree and distance centrality. Finally, we consider the
standard degree and distance centrality measures.
Degree returns the k highest-degree nodes. Note that this
baseline is oblivious to query nodes Q.
with q
datasets
Final
starting
Implem
with ex
Intel X
C. Res
Figu
algorith
better).
of the fi
other tw
data
cannot
run
greedy
on
these
17. input
graphs
from
previous
datasets
query
nodes:
planted
spheres
k
spheres
(k
=
1)
radius
ρ
(ρ
=
2
or
3)
s
special
nodes
inside
spheres
(s
=
10
or
20)
17
α
=
0.15
24. conclusions
complex
problem,
greedy
algorithm
is
expensive
personalized
pagerank
is
good
alternaAve
future
work
expansion
strategies
comparison
with
more
alternaAves
parallel
implementaAon
24
26. 26
where the inequality comes from the fact that a path in GX
passing from Z and being absorbed by X corresponds to a
shorter path in GY being absorbed by Y .
B. Proposition 5
Proposition Let Ci 1 be a set of i 1 absorbing nodes,
Pi 1 the corresponding transition matrix, and Fi 1 = (I
Pi 1) 1
. Let Ci = Ci 1 [ {u}. Given Fi 1, the centrality
score acQ(Ci) can be computed in time O(n2
).
The proof makes use of the following lemma.
Lemma 1 (Sherman-Morrison Formula [7]) Let M be a
square n⇥n invertible matrix and M 1
its inverse. Moreover,
let a and b be any two column vectors of size n. Then, the
following equation holds
(M + abT
) 1
= M 1
M 1
abT
M 1
/(1 + bT
M 1
a).
By a direct a
can compute
cost of O(n2
Fi =
We have thu
and therefore
C. Propositio
Proposition
correspondin
C0
= C {
score acQ(C
Proof: T
Again we ass
Puu = 0 for
two sets of a
path in GX
sponds to a
bing nodes,
i 1 = (I
e centrality
t M be a
. Moreover,
. Then, the
By a direct application of Lemma 1, it is easy to see that we
can compute Fi from Fi 1 with the following formula, at a
cost of O(n2
) operations.
Fi = Fi 1 (Fi 1a)(bT
Fi 1)/(1 + bT
(Fi 1a))
We have thus shown that, given Fi 1, we can compute Fi,
and therefore acQ
(Ci) as well, in O(n2
).
C. Proposition 6
Proposition Let C be a set of absorbing nodes, P the
corresponding transition matrix, and F = (I P) 1
. Let
C0
= C {v} [ {u}, for u, v 2 C. Given F, the centrality
score acQ(C0
) can be computed in time O(n2
).
Proof: The proof is similar to the proof of Proposition 5.
score acQ(Ci) can be computed in time O(n2
).
The proof makes use of the following lemma.
Lemma 1 (Sherman-Morrison Formula [7]) Let M be a
square n⇥n invertible matrix and M 1
its inverse. Moreover,
let a and b be any two column vectors of size n. Then, the
following equation holds
(M + abT
) 1
= M 1
M 1
abT
M 1
/(1 + bT
M 1
a).
Proof: (Proposition 5) Without loss of generality, let the
set of absorbing nodes be Ci 1 = {1, 2, . . . , i 1}. For
simplicity, assume no self-loops for non-absorbing nodes, i.e.,
Pu,u = 0 for u 2 V Ci 1. As in Section VI, the expected
number of steps before absorption is given by the formulas
acQ
(Ci 1) = sT
Q
Fi 11,
with Fi 1 = A 1
i 1 and Ai 1 = I Pi 1.
We proceed to show how to increase the set of absorbing nodes
by one and calculate the new absorption time by updating Fi 1
in O(n2
). Without loss of generality, suppose we add node i
to the absorbing nodes Ci 1, so that
Ci = Ci 1 [ {i} = {1, 2, . . . , i 1, i}.
Let Pi be the transition matrix over G with absorbing nodes
C. Proposition 6
Proposition Let C be
corresponding transition
C0
= C {v} [ {u}, f
score acQ(C0
) can be c
Proof: The proof is
Again we assume no sel
Puu = 0 for u 2 V
two sets of absorbing no
C = {
C0
= {
Let P0
be the transition
absorbing centrality for t
C0
is expressed as a fun
F = A 1
,
F0
= A0
Notice that
A0
A = (I
=
2
6
6
4
0(i 1
pi,1 . . .
pi+1,0 . . .
0(n i
where pi,j denotes the
node j in a transition ma