Dynamic PageRank using Evolving Teleportation
1. Dynamic PageRank using Evolving Teleportation
Ryan A. Rossi
David F. Gleich
[Figure: Tunisia, Egypt, and Libya shown at successive points in time]
2. Problem: Importance of nodes is NOT static
(static PageRank)
Evolving in reality!
Ryan Rossi (Purdue) Dynamic PageRank
3. Problem: Importance of nodes is NOT static
Formulate PageRank as a dynamical system!
Evolving in reality!
[Figure: importance of 100 nodes changing over time]
Dynamic generalization of PageRank
Helps in prediction!
5. Static PageRank Model. At a node, a random surfer can:
1. follow edges uniformly with probability α, and
2. randomly jump with probability 1 − α (for now, assume v_i = 1/n)
[Figure: example graph with nodes 1 to 5]
The nodes that are visited most often are important!
Induces a Markov chain model (random walk)
Or the linear system
$(I - \alpha P)\,x = (1 - \alpha)\,v$
where P is the column-stochastic random-walk transition matrix of the graph, x is the PageRank vector, and v is the teleportation vector.
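The random-surfer model on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the 5-node edge list is hypothetical (the slide's figure did not survive), α = 0.85 is an assumed conventional value, and dangling nodes are assumed to jump according to v.

```python
def pagerank(out_links, alpha=0.85, v=None, tol=1e-12, max_iter=1000):
    """Static PageRank by power iteration: x <- alpha * P x + (1 - alpha) * v."""
    n = len(out_links)
    v = v if v is not None else [1.0 / n] * n  # uniform teleportation, v_i = 1/n
    x = list(v)
    for _ in range(max_iter):
        new = [(1.0 - alpha) * vi for vi in v]  # random-jump term
        for u, targets in enumerate(out_links):
            if targets:
                # follow each out-edge of u uniformly
                share = alpha * x[u] / len(targets)
                for w in targets:
                    new[w] += share
            else:
                # assumption: a dangling node jumps according to v
                for w in range(n):
                    new[w] += alpha * x[u] * v[w]
        delta = sum(abs(a - b) for a, b in zip(new, x))
        x = new
        if delta < tol:
            break
    return x


# Hypothetical 5-node graph: out_links[u] lists the nodes that u links to.
graph = [[1, 2], [3], [3, 4], [4], [0]]
scores = pagerank(graph)
```

The returned scores sum to 1 and can be read as the long-run fraction of time the surfer spends at each node.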
6. Static PageRank Model: too simplistic!
Graph & attributes evolve!
Importance continuously changes!
7. Majority of work focuses on static networks!
Combine PageRank with crawling process
S. Abiteboul, M. Preda, & G. Cobena:
Adaptive on-line page importance computation
Walks on dynamic graphs
P. Grindrod, D. Higham, M. Parsons, & E. Estrada:
Communicability Across Evolving Networks
Other work:
J. O’Madadhain & P. Smyth,
EventRank: A framework for ranking time-varying networks
8. None of these techniques is placed in the context of a dynamical system
We want to gain additional flexibility by recasting these problems as continuous dynamical systems
9. Evolving teleportation (e.g. pageviews)
Teleportation values at six nodes over three successive time steps:
t1   t2   t3
96   113  103
105  139  125
281  397  331
42   64   53
11   16   12
27   21   39
Importance continuously changes as the external influence evolves!
Dynamic PageRank
12. Changes in PageRank
values evolve
Dynamical System
Dynamic Teleportation
13. Dynamic Teleportation Model
Generalization of static PageRank. If v(t) = v stops changing, then we recover the original PageRank vector x as the steady-state solution:
$(I - \alpha P)\,x = (1 - \alpha)\,v$
where P is the random-walk transition matrix of the graph.
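The model equation on this slide was an image and did not survive. Assuming the dynamic teleportation model is the linear ODE below (with P the column-stochastic random-walk matrix, α the edge-following probability, and v(t) the time-dependent teleportation vector), the steady-state claim can be checked directly:

```latex
\begin{align*}
% Assumed form of the dynamic teleportation model
x'(t) &= (1 - \alpha)\, v(t) - (I - \alpha P)\, x(t) \\
% If v(t) = v is constant, a steady state satisfies x'(t) = 0
0 &= (1 - \alpha)\, v - (I - \alpha P)\, x \\
\Longrightarrow \quad (I - \alpha P)\, x &= (1 - \alpha)\, v
\end{align*}
```

Under this assumed form, every eigenvalue of $-(I - \alpha P)$ has negative real part, because the spectral radius of $\alpha P$ is at most $\alpha < 1$. The trajectory therefore converges to the static PageRank vector once v(t) stops changing.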
14. A principled dynamical system framework for studying these problems
Flexibility to choose our algorithm to solve it
Determines the effective length scale
Seamlessly generalizes PageRank for dynamics
We can easily and naturally incorporate the complete set of dynamic components
15. Evolve the dynamical system: select any standard method!
Classical methods: forward Euler; family of Runge-Kutta methods (RK2, …, RK4, …)
Adaptive methods
Many others!
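Two of the classical integrators named on this slide can be sketched as follows. This is an illustrative sketch only: it assumes the system has the form x'(t) = (1 − α) v − (I − αP) x(t), holds v constant within a step, and uses a hypothetical 5-node graph with no dangling nodes; the step size h = 0.5 and α = 0.85 are arbitrary choices.

```python
def apply_P(out_links, x):
    """Return P x for the column-stochastic random-walk matrix (no dangling nodes)."""
    y = [0.0] * len(x)
    for u, targets in enumerate(out_links):
        share = x[u] / len(targets)
        for w in targets:
            y[w] += share
    return y


def rhs(out_links, x, v, alpha):
    """Right-hand side of the assumed ODE: (1 - alpha) v - (I - alpha P) x."""
    px = apply_P(out_links, x)
    return [(1 - alpha) * vi - xi + alpha * pi for vi, xi, pi in zip(v, x, px)]


def euler_step(out_links, x, v, alpha, h):
    """One forward Euler step of size h."""
    k = rhs(out_links, x, v, alpha)
    return [xi + h * ki for xi, ki in zip(x, k)]


def rk4_step(out_links, x, v, alpha, h):
    """One classical fourth-order Runge-Kutta step, with v held constant over the step."""
    k1 = rhs(out_links, x, v, alpha)
    k2 = rhs(out_links, [xi + 0.5 * h * k for xi, k in zip(x, k1)], v, alpha)
    k3 = rhs(out_links, [xi + 0.5 * h * k for xi, k in zip(x, k2)], v, alpha)
    k4 = rhs(out_links, [xi + h * k for xi, k in zip(x, k3)], v, alpha)
    return [xi + h / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


# Hypothetical graph and constant teleportation vector.
graph = [[1, 2], [3], [3, 4], [4], [0]]
v = [0.2] * 5
x_euler, x_rk4 = list(v), list(v)
for _ in range(400):
    x_euler = euler_step(graph, x_euler, v, 0.85, 0.5)
    x_rk4 = rk4_step(graph, x_rk4, v, 0.85, 0.5)
```

With v held fixed, both integrators settle on the same steady state. They differ in how accurately they track x(t) while v(t) is still changing.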
17. How we map updates to v into the dynamical system time determines the effective length-scale that we are looking at
Relationship between the time-scale of the dynamical system and the time-scale of the application?
x(1)? 1 sec, 1 min, ...?
18. How we map updates to v into the dynamical system time determines the effective length-scale that we are looking at
Examples with hourly updates to v (application time-scale = 1 hour):
h = 1, t = 1 (1 min of application time): 60 iterations between each hour. Equivalent to running the power method until convergence each hour!
h = 1, t = 1 (20 min of application time): 3 iterations after each hourly change
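The link between iterations and the power method can be made explicit with a short worked step. It assumes the dynamical system has the form x'(t) = (1 − α) v − (I − αP) x(t) and is integrated with forward Euler at step size h:

```latex
\begin{align*}
x_{k+1} &= x_k + h\left[(1 - \alpha)\, v - (I - \alpha P)\, x_k\right] \\
        &= \alpha P\, x_k + (1 - \alpha)\, v \qquad \text{for } h = 1
\end{align*}
```

Under that assumption, each Euler step with h = 1 is one power-method (Richardson) iteration for static PageRank with the current v. Running many steps between updates to v approaches full convergence each hour, whereas only a few steps leave x(t) in a transient state.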
19. v(t) changes at fixed intervals
Better idea might be to smooth out these “jumps”!
Feature of the new model!
Utilize this information from the evolution
[Figure: convergence measure (0 to 0.2) versus iteration (0 to 50); h = 1, t = 1 (12 min), application time-scale = 1 hour, 5 iterations after each hourly change]
20. Transient: instantaneous values of x(t)
Summary & Cumulative: any summary function s(⋅) of the time-series (integral, min, max, variance)
Difference Rank
Among many others...
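The ranking variants listed on this slide can be sketched as summary functions over each node's score time-series. The node names and values below are hypothetical. Treating difference rank as the maximum minus the minimum is an assumption, since the slide's formula did not survive.

```python
from statistics import pvariance


def summarize(series, dt=1.0):
    """Summary statistics of one node's Dynamic PageRank time-series."""
    integral = sum(0.5 * (a + b) * dt for a, b in zip(series, series[1:]))
    return {
        "transient": series[-1],   # instantaneous value at the latest time
        "cumulative": integral,    # trapezoid-rule integral over time
        "min": min(series),
        "max": max(series),
        "variance": pvariance(series),
        "difference": max(series) - min(series),  # assumed definition
    }


def rank_nodes(all_series, key):
    """Order node names by the chosen summary statistic, largest first."""
    stats = {node: summarize(s) for node, s in all_series.items()}
    return sorted(stats, key=lambda node: stats[node][key], reverse=True)


# Hypothetical scores for two nodes over four time steps.
all_series = {
    "steady": [0.30, 0.31, 0.30, 0.29],
    "bursty": [0.05, 0.06, 0.40, 0.10],
}
top_by_difference = rank_nodes(all_series, "difference")
top_by_cumulative = rank_nodes(all_series, "cumulative")
```

In this toy data the cumulative score favors the consistently important node, while the difference score favors the node whose importance fluctuates the most.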
21. Wikipedia
— Hyperlink graph
— Hourly pageviews
Twitter
— Who-follows-whom
— Tweet rates (monthly)
Dataset    Nodes      Edges       t_max  Period  Average p_i  Max p_i
Wikipedia  4,143,840  72,718,664  20     hours   1.3225       334,650
Twitter    465,022    835,424     6      months  0.5569       1,056
22. Nope, pageviews and degree uncorrelated! (correlation = 0.02)
[Figure: scatter plot of in-degree (log) versus total pageviews (log), with regions marked “High degree, low pageviews” and “High pageviews, low degree”]
23. Main Finding: Combining the external influence with the graph produces something new that is not captured by the other methods
24. Forecasting setup
Learn the model using an exponential moving average of the time-series
Predict p(t+1) from the learned model
Evaluate models by their total error
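The formulas on this slide were images and are missing, so the following is only a generic sketch of exponential-moving-average forecasting, not the authors' model. The smoothing weight `theta` and the symmetric percentage error used for scoring are assumptions. The sample pageviews are the hourly Earthquake counts quoted later in the deck.

```python
def ema(series, theta):
    """Exponential moving average: s_t = theta * p_t + (1 - theta) * s_{t-1}."""
    smoothed = [float(series[0])]
    for value in series[1:]:
        smoothed.append(theta * value + (1.0 - theta) * smoothed[-1])
    return smoothed


def one_step_forecasts(series, theta):
    """Forecast p(t+1) by the smoothed value at time t; aligned with series[1:]."""
    return ema(series, theta)[:-1]


def smape(actual, predicted):
    """Mean symmetric absolute percentage error (an assumed error measure)."""
    terms = [abs(a - p) / (abs(a) + abs(p))
             for a, p in zip(actual, predicted) if abs(a) + abs(p) > 0]
    return sum(terms) / len(terms) if terms else 0.0


pageviews = [132, 172, 764, 3406]
forecasts = one_step_forecasts(pageviews, theta=0.5)
error = smape(pageviews[1:], forecasts)
```

A comparison in the spirit of the deck would fit one forecaster on pageviews alone and another that also sees the Dynamic PageRank time-series, then compare their errors.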
25. Base Model. Only pageviews (or tweet-rates)
Dynamic PageRank. Pageviews and Dynamic PageRank time-series
Dataset    Forecasting     Dynamic PageRank  Base Model
Wikipedia  Non-stationary  0.4349            0.5028
Wikipedia  Stationary      0.3672            0.4373
Twitter    Non-stationary  0.4852            1.2333
Twitter    Stationary      0.6690            0.9180
Main Finding. Dynamic PageRank time-series provides valuable information for forecasting future pageviews (or tweet-rates)
26. Many applications, such as:
• Actively adapting caches in large DB systems
• Dynamically recommending pages
27. Top 100 pages that fluctuate the most!
Dynamic PageRank identifies interesting pages that pertain to recent external interest.
28. Top 100 pages that fluctuate the most!
Pages related to a recent Australian earthquake!
29. Top 100 pages that fluctuate the most!
Just-released movie “Watchmen”
30. Top 100 pages that fluctuate the most!
Famous co-host/musician who died
31. Top 100 pages that fluctuate the most!
Recent “American Idol” gossip
32. Top 100 pages that fluctuate the most!
A remembrance of Eve Carson from a contestant on “American Idol”
33. Top 100 pages that fluctuate the most!
Main Finding. These examples reveal the ability of our Dynamic PageRank to mesh the network structure with changes in external interest!
34. Clustering PageRank trends
Granger Causality
Better algorithms (RK4,…)
Put more theoretical teeth behind these results
37. Allows us to identify nodes that become important around similar times (nodes w/ similar trends of importance may be related)
[Figure: normalized Dynamic PageRank (0 to 0.25) versus time (0 to 20) for the centroids of Temporal Patterns 1 to 5]
[Figure: two numbered lists of truncated page titles from the clusters; one begins TimAllen, KrisAllen, TheOffice(, AsherRoth, SaraPaxton, the other begins Chile, WorldWarII, Iraq, Jew, Brazil]
38. Question: Does an earthquake at time t cause people to visit the Richter magnitude page at t+1?
Earthquake → Richter Mag. (causes?)
Statement on Granger Causality (Stronger version)
1. cause must occur before the effect
2. cause contains information about the effect
3. cause and effect must be linked in the graph
39. Multivariate regression with lagged terms: a vector of response variables, regression coefficients to estimate, and a vector of errors
Granger causality exists if the error when using the time-series x in the forecast model is smaller than the error without considering x.
Significance of the difference in error is measured using the F-test
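The regression on this slide was an image and is missing. As a point of reference, the textbook Granger test for a single pair of series compares a restricted and an unrestricted lagged regression. The lag order p, the number of usable observations T, and the coefficient names below are notation introduced here, not taken from the slides:

```latex
\begin{align*}
\text{restricted:}   \quad y_t &= a_0 + \sum_{i=1}^{p} a_i\, y_{t-i} + \varepsilon_t \\
\text{unrestricted:} \quad y_t &= a_0 + \sum_{i=1}^{p} a_i\, y_{t-i}
                                   + \sum_{i=1}^{p} b_i\, x_{t-i} + \eta_t \\
F &= \frac{(\mathrm{RSS}_r - \mathrm{RSS}_u) / p}{\mathrm{RSS}_u / (T - 2p - 1)}
\end{align*}
```

Here $\mathrm{RSS}_r$ and $\mathrm{RSS}_u$ are the residual sums of squares of the two fits. Under the null hypothesis $b_1 = \dots = b_p = 0$, the statistic follows an $F(p,\, T - 2p - 1)$ distribution, and a small p-value indicates that lagged x improves the forecast of y.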
40. Earthquake → Richter Mag.: p-value 0.000406*** (Significant!)
Caused by Earthquake in Australia  p-value
Earthquake preparedness            0.000607***
Aftershock                         0.009619**
Asperity                           0.001601**
Stick-slip phenomenon              0.002312**
Landslide dam                      0.004820**
p-value < 0.05 (*), 0.01 (**), 0.001 (***)
41. Main Finding. Allows us to identify the pages that influence the others with regard to how users find information
42. Introduced a dynamical system framework for PageRank
Stated a dynamic generalization of PageRank
Dynamic PageRank can help in prediction
Useful for many other applications
43. Thanks!
Questions?
rrossi@purdue.edu
http://www.cs.purdue.edu/homes/rrossi
45. Hourly pageviews
[Figure: Earthquake, Earthquake Preparedness, Richter Mag., and Charles Richter pages with hourly pageviews over time]
Earthquake: 132, 172
Richter Mag.: 35, 31
46. Earthquake: 132, 172, 764
Richter Mag.: 35, 31, 56
Spike in the number of pageviews for that given hour!
47. ΔPR importance substantially increases!
48. After a few iterations, importance diffuses from Earthquake to Richter Mag.!
Direct result of meshing graph with pageviews!
49. Richter Mag. becomes important at this time
Hence, Richter magnitude receives a high dynamic PageRank score, becoming increasingly important at this time, while its pageviews are not significantly increasing.
50. Earthquake: 132, 172, 764, 3406
Richter Mag.: 35, 31, 56, 1447
In the next hour, we find that the pageviews of Richter spike!
Reinforcing the importance!
51. Dynamic PageRank is predictive (by definition)!
Importance of Richter magnitude captured by dynamic PageRank an hour earlier than when it actually became important (spike in pageviews)
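The diffusion described in this walkthrough can be imitated on a toy graph. Everything structural here is hypothetical: the links among the four pages, the pageviews for Charles Richter and Earthquake Preparedness, α = 0.85, and the choice to leave teleportation unnormalised. Only the Earthquake counts (172, then 764) and Richter's 31 come from the slides. Richter's own pageviews are deliberately held fixed so that any gain in its score comes through the graph alone.

```python
ALPHA = 0.85  # assumed edge-following probability

PAGES = ["Earthquake", "Richter Mag.", "Charles Richter", "Earthquake Preparedness"]
# Hypothetical links: OUT_LINKS[u] lists the pages that page u links to.
OUT_LINKS = [[1, 3], [0, 2], [1], [0]]


def euler_step(x, v, h=1.0):
    """One forward Euler step of the assumed ODE x' = (1 - ALPHA) v - (I - ALPHA P) x."""
    px = [0.0] * len(x)
    for u, targets in enumerate(OUT_LINKS):
        for w in targets:
            px[w] += x[u] / len(targets)
    return [xi + h * ((1 - ALPHA) * vi - xi + ALPHA * pi)
            for xi, vi, pi in zip(x, v, px)]


def evolve(x, v, steps):
    """Apply a fixed number of Euler steps with teleportation v held constant."""
    for _ in range(steps):
        x = euler_step(x, v)
    return x


# Unnormalised hourly pageviews used as teleportation (20 and 25 are made up).
v_before = [172.0, 31.0, 20.0, 25.0]
v_spike = [764.0, 31.0, 20.0, 25.0]   # only Earthquake spikes

x_before = evolve(list(v_before), v_before, 500)  # settle on the old steady state
x_after = evolve(x_before, v_spike, 3)            # a few iterations after the spike
richter_gain = x_after[1] - x_before[1]
```

Because Richter's teleportation entry never changes, a positive `richter_gain` reflects importance flowing along the assumed Earthquake-to-Richter link, which is the effect the slides attribute to meshing the graph with pageviews.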
52. Real-world networks are naturally dynamic
— Information Networks (e.g., Wikipedia: article-links-article)
— Social Networks (e.g., Twitter: who-follows-whom)
— Biological Networks
…
⇒ Importance changes!
Static methods fail to capture the temporal flow of information
This can lead to misleading or simply incorrect conclusions
53. Dynamic networks
[Figure: sequence of graph snapshots over time]