Spacey Random Walks on
Higher-Order Markov Chains
David F. Gleich!
Purdue University!
Joint work with 
Austin Benson,
Lek-Heng Lim,
supported by "
NSF CAREER
CCF-1149756
IIS-1422918 
Spacey walk !
on Google Images
From Film.com
WARNING!!
This talk presents the “forward” explicit
derivation (i.e. lots of little steps)
rather than the implicit “backwards”
derivation (i.e. big intuitive leaps)
PageRank: The initial condition
My dissertation: Models & Algorithms for PageRank Sensitivity
The essence of PageRank!
Take any Markov chain P; PageRank creates a related chain with great “utility”:
•  Unique stationary distribution
•  Fast convergence
•  Modeling flexibility
(I − αP)x = (1 − α)v
PageRank beyond the Web (arXiv:1407.5107)
by Jessica Leber
Fast Magazine
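
A minimal sketch of the PageRank system above, assuming a small, made-up column-stochastic P; the iteration x ← αPx + (1 − α)v converges to the solution of (I − αP)x = (1 − α)v.

```python
import numpy as np

# Hypothetical 3-node, column-stochastic transition matrix (each column sums to 1).
P = np.array([[0.0, 0.5, 0.3],
              [0.5, 0.0, 0.7],
              [0.5, 0.5, 0.0]])
alpha = 0.85
v = np.ones(3) / 3                      # teleportation distribution

x = v.copy()
for _ in range(1000):
    x_next = alpha * (P @ x) + (1 - alpha) * v
    if np.abs(x_next - x).sum() < 1e-12:
        x = x_next
        break
    x = x_next

# x now satisfies (I - alpha*P) x = (1 - alpha) v up to the tolerance.
print(x, np.allclose((np.eye(3) - alpha * P) @ x, (1 - alpha) * v))
```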
Be careful about what you
discuss after a talk…
I gave a talk!
at the Univ. of Chicago and visited Lek-heng Lim!
He told me about a new idea!
in Markov chain analysis and tensor eigenvalues
Approximate stationary distributions
of higher-order Markov chains
A higher order Markov chain!
depends on the last few states.

These become Markov chains on the product state space."
But that’s usually too large for stationary distributions. 

The approximation!
is that we form a rank-1 approximation of that stationary
distribution object. 
Due to Michael Ng and collaborators 
P(X_{t+1} = i | history) = P(X_{t+1} = i | X_t = j, X_{t-1} = k)

Rank-1 approximation: P(X = [i, j]) = x_i x_j
Exact stationary distribution: P(X = [i, j]) = X_{i,j}
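
A small sketch of the two objects above, on assumed random data: the exact stationary distribution lives on the product state space of pairs, and the rank-1 form x x^T is the approximation. The tensor, pair-chain construction, and iteration counts here are illustrative, not from the talk.

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
# Illustrative order-3 stochastic tensor: P[i, j, k] = Pr(next = i | current = j, previous = k),
# so every column P[:, j, k] sums to one.
P = rng.random((n, n, n))
P /= P.sum(axis=0, keepdims=True)

# Exact object: a Markov chain on the product state space of pairs (current, previous).
# The pair (j, k) moves to (i, j) with probability P[i, j, k].
T = np.zeros((n * n, n * n))
for j in range(n):
    for k in range(n):
        for i in range(n):
            T[i * n + j, j * n + k] = P[i, j, k]

pi = np.ones(n * n) / (n * n)
for _ in range(2000):                   # power iteration on the (large) pair chain
    pi = T @ pi
X = pi.reshape(n, n)                    # X[i, j] = Pr(current = i, previous = j)

# Rank-1 approximation: replace the matrix X by an outer product of a single vector.
x = X.sum(axis=1)                       # marginal occupancy of single states
print("gap between X and the rank-1 surrogate:", np.abs(X - np.outer(x, x)).max())
```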
Why?
[Figure: “Multidimensional, multi-faceted data from informatics and simulations”]
We want to analyze 
higher-order relationships 
and multi-way data and …

Things like 

•  Enron emails
•  Regular hypergraphs


And there are three or more indices!
So it’s a higher-order Markov chain
Approximate stationary distributions
of higher-order Markov chains
The new problem!
of computing an approx. stationary dist. is a tensor eigenvector


x_i = Σ_{jk} P_{ijk} x_j x_k,   or   x = Px²

The new problem’s properties!
•  existence is guaranteed under mild conditions
•  uniqueness and convergence … require heroic algebra (and are hard to check)
Due to Michael Ng and collaborators
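
A hedged sketch of the tensor-eigenvector equation as an operator, using a randomly generated stochastic tensor. The naive fixed-point iteration below is only illustrative; as the slide notes, convergence is not guaranteed in general.

```python
import numpy as np

def apply_Pxx(P, x):
    """Evaluate (P x^2)_i = sum_{j,k} P[i, j, k] x[j] x[k]."""
    return np.einsum('ijk,j,k->i', P, x, x)

n = 4
rng = np.random.default_rng(1)
P = rng.random((n, n, n))
P /= P.sum(axis=0, keepdims=True)       # make each column P[:, j, k] stochastic

# Naive fixed-point iteration for x = P x^2.  Only a sketch: convergence is NOT
# guaranteed in general, which is exactly the difficulty noted on this slide.
x = np.ones(n) / n
for _ in range(500):
    x = apply_Pxx(P, x)
    x /= x.sum()                        # keep x a probability vector

print("residual ||x - P x^2||_1 =", np.abs(x - apply_Pxx(P, x)).sum())
```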
Some small quick notes
A stochastic matrix M is a Markov chain
A stochastic hypermatrix / tensor / probability table P is a higher-order Markov chain
[Figure: “Multidimensional, multi-faceted data from informatics and simulations”]
PageRank to the rescue!

What if we looked at these approx. stat.
distributions of a PageRank modified higher-
order chain?
Multilinear PageRank!

•  Formally the Li & Ng approx. stat. dist. of the
PageRank modified higher order Markov chain
•  Guaranteed existence!
•  Fast convergence? (when alpha < 1/order)
•  Uniqueness? (when alpha < 1/order)

x = αPx² + (1 − α)v

Multilinear PageRank, Gleich, Lim, Yu, arXiv:1409.1465
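
A minimal sketch of the multilinear PageRank fixed point x = αPx² + (1 − α)v, assuming a random third-order stochastic tensor and uniform v; the simple iteration is used here only for illustration and is reliable for small enough α (the “alpha < 1/order” regime noted above).

```python
import numpy as np

def multilinear_pagerank(P, alpha, v, tol=1e-12, max_iter=10000):
    """Fixed-point sketch of x <- alpha * (P x^2) + (1 - alpha) * v."""
    x = v.copy()
    for _ in range(max_iter):
        x_next = alpha * np.einsum('ijk,j,k->i', P, x, x) + (1 - alpha) * v
        if np.abs(x_next - x).sum() < tol:
            return x_next
        x = x_next
    return x

n = 4
rng = np.random.default_rng(2)
P = rng.random((n, n, n))
P /= P.sum(axis=0, keepdims=True)       # stochastic along the first mode
v = np.ones(n) / n
x = multilinear_pagerank(P, alpha=0.45, v=v)
print(x, x.sum())                       # a probability vector solving the fixed point
```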
One nagging question …!
Is there a stochastic process that
underlies this approximation?
Meanwhile … "
Spectral clustering of tensors
Austin Benson (a colleague) asked"
if there were any interesting method to “cluster” tensors.
“Recall” spectral clustering on graphs!

SIAM Data Mining 2015, arXiv:1502.05058
graph → random walk → second eigenvector → sweep cut partition
M^T y = λ₂ y

min_S φ(S) = min_S #(edges cut between S and S̄) / min(vol(S), vol(S̄))
Meanwhile … "
Spectral clustering of tensors
Austin Benson (a colleague) asked"
if there were any interesting method to “cluster” tensors.
“Conjecture” spectral clustering on tensors!

SIAM Data Mining 2015, arXiv:1502.05058
graph/tensor → higher-order random walk → second eigenvector → sweep cut partition
??????!
We tried many
•  a priori good and
•  retrospectively bad
ideas for the second eigenvector
Austin and I were talking one day …
... about the problem of finding the underlying process. (He was using Multilinear
PageRank as the “first” eigenvector.) He observed that

One of the five algorithms !
for multilinear PageRank uses a seq. of Markov chains.


Is there some way to turn this into a random walk?
x_{k+1} = stat. dist. of Markov chain based on α, v, P, and x_k
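
One plausible reading of that algorithm, written as a sketch: the inner chain is taken to be M(x) = α Σ_k P(:, :, k) x_k + (1 − α) v 1^T (the matrix that appears later in the talk), and each outer step replaces x with that chain’s stationary distribution. The tensor and parameters here are assumptions for illustration.

```python
import numpy as np

def stationary(M, iters=1000):
    """Stationary distribution of a column-stochastic matrix by power iteration."""
    y = np.ones(M.shape[0]) / M.shape[0]
    for _ in range(iters):
        y = M @ y
    return y

def sequence_of_chains(P, alpha, v, outer_iters=100, tol=1e-10):
    """x_{k+1} = stationary distribution of M(x_k), with
    M(x) = alpha * sum_k P[:, :, k] * x[k] + (1 - alpha) * v 1^T (an assumed reading)."""
    n = len(v)
    x = v.copy()
    for _ in range(outer_iters):
        M = alpha * np.einsum('ijk,k->ij', P, x) + (1 - alpha) * np.outer(v, np.ones(n))
        x_next = stationary(M)
        if np.abs(x_next - x).sum() < tol:
            return x_next
        x = x_next
    return x

n = 4
rng = np.random.default_rng(3)
P = rng.random((n, n, n))
P /= P.sum(axis=0, keepdims=True)
v = np.ones(n) / n
print(sequence_of_chains(P, alpha=0.85, v=v))
```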
EUREKA!
The spacey random walk
Consider a higher-order Markov chain.

If we were perfect, we’d figure out the stationary
distribution of that. But we are spacey!
•  On arriving at state j, we promptly "
“space out” and forget we came from k. 
•  But we still believe we are “higher-order”
•  So we invent a state k by drawing a random
state from our history.
P(X_{t+1} = i | history) = P(X_{t+1} = i | X_t = j, X_{t-1} = k)
The spacey random walk 

This is a vertex-reinforced random walk! e.g., a Pólya urn.
Pemantle, 1992; Benaïm, 1997; Pemantle 2007
P(X_{t+1} = i | X_t = j and the right filtration on history) = Σ_k P_{i,j,k} C_k(t) / (t + n)

Let C_k(t) = 1 + Σ_{s=1}^{t} Ind{X_s = k}
How often we’ve visited
state k in the past
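
A simulation sketch of this walk, assuming a random third-order stochastic tensor: the counts C_k(t) start at one per state, the fake previous state k is drawn with probability C_k(t)/(t + n), and the next state follows P(:, j, k).

```python
import numpy as np

def spacey_walk(P, steps=100000, seed=4):
    """Simulate a spacey random walk on a third-order stochastic tensor P.
    On arriving at state j we forget the true previous state and instead draw a
    fake previous state k from the visit history, then transition with P[:, j, k]."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    counts = np.ones(n)                          # the (1 + #visits) weights; sum = t + n
    occupancy = np.zeros(n)
    j = rng.integers(n)
    for _ in range(steps):
        k = rng.choice(n, p=counts / counts.sum())   # invented previous state
        j = rng.choice(n, p=P[:, j, k])              # actual transition
        counts[j] += 1
        occupancy[j] += 1
    return occupancy / steps

n = 4
rng = np.random.default_rng(5)
P = rng.random((n, n, n))
P /= P.sum(axis=0, keepdims=True)
print(spacey_walk(P))    # empirical occupancy; compare with a multilinear PageRank-style x
```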
Stationary distributions of vertex
reinforced random walks
A vertex-reinforced random walk at time t transitions
according to a Markov matrix M given the observed
frequencies.



This has a stationary distribution, iff the dynamical system 


converges.
dx/dt = π[M(x)] − x

P(X_{t+1} = i | X_t = j and the right filtration on history) = [M(t)]_{i,j} = [M(c(t))]_{i,j}

π[M] is the map to the stationary distribution.
M. Benaïm 1997
The Markov matrix for Spacey Random Walks



A necessary condition for a stationary distribution


(otherwise makes no sense)

Property B. Let P be an order-m, n-dimensional probability table. Then P has
property B if there is a unique stationary distribution associated with all stochastic
combinations of the last m − 2 modes. That is, M = Σ_{k,ℓ,...} P(:, :, k, ℓ, ...) γ_{k,ℓ,...}
defines a Markov chain with a unique Perron root when all the γ’s are positive and sum to one.

dx/dt = π[M(x)] − x

M = Σ_k P(:, :, k) x_k
This is the transition probability associated
with guessing the last state based on history!
We have all sorts of cool results on spacey
random walks… e.g.
Suppose you have a Pólya urn with memory…
Then it always has a stationary distribution!
Back to Multilinear PageRank
The Multilinear PageRank problem is what we call a
spacey random surfer model.
•  This is a spacey random walk
•  We add random jumps with probability (1-alpha)
It’s also a vertex-reinforced random walk.
Thus, it has a stationary probability if 


converges.
dx/dt = π[M(x)] − x,   where   M(x) = α Σ_k P(:, :, k) x_k + (1 − α)v

which occurs when α < 1/order!
Some interesting notes about vertex
reinforced random walks
•  The power method is NOT the natural algorithm! The natural algorithm is to evolve the ODE (a sketch follows this list).
•  It’s unclear if there are any structural
properties that guarantee a stationary
distribution (except for something like the
Multilinear PageRank equation)
•  Can be tough to analyze the resulting ODEs
•  Asymptotically creates a Markov chain!
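
A forward-Euler sketch of evolving dx/dt = π[M(x)] − x for the spacey random surfer chain M(x) = α Σ_k P(:, :, k) x_k + (1 − α) v 1^T; the step size, iteration counts, and test tensor are assumptions for illustration.

```python
import numpy as np

def stationary(M, iters=1000):
    """Stationary distribution of a column-stochastic matrix by power iteration."""
    y = np.ones(M.shape[0]) / M.shape[0]
    for _ in range(iters):
        y = M @ y
    return y

def evolve_ode(P, alpha, v, h=0.1, steps=2000):
    """Forward-Euler sketch of dx/dt = pi[M(x)] - x, with
    M(x) = alpha * sum_k P[:, :, k] * x[k] + (1 - alpha) * v 1^T."""
    n = len(v)
    x = v.copy()
    for _ in range(steps):
        M = alpha * np.einsum('ijk,k->ij', P, x) + (1 - alpha) * np.outer(v, np.ones(n))
        x = x + h * (stationary(M) - x)
        x = np.maximum(x, 0.0)
        x /= x.sum()                     # guard against round-off drift
    return x

n = 4
rng = np.random.default_rng(6)
P = rng.random((n, n, n))
P /= P.sum(axis=0, keepdims=True)
v = np.ones(n) / n
print(evolve_ode(P, alpha=0.85, v=v))
```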
… back to spectral clustering …
Meanwhile … "
Spectral clustering of tensors
Austin Benson (a colleague) asked"
if there were any interesting method to “cluster” tensors.
“Conjecture” spectral clustering on tensors!

SIAM Data Mining 2015, arXiv:1502.05058
graph/tensor → higher-order random walk → second eigenvector → sweep cut partition
??????!
Meanwhile … "
Spectral clustering of tensors
Austin Benson (a colleague) asked"
if there were any interesting method to “cluster” tensors.
“Conjecture” spectral clustering on tensors!

SIAM Data Mining 2015, arXiv:1502.05058
graph/tensor → higher-order random walk → second eigenvector → sweep cut partition
M(x)^T y = λ₂ y
Use the asymptotic
Markov matrix!
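
A sketch of that recipe: take the asymptotic Markov matrix, compute the eigenvector of M^T for the second-largest eigenvalue, and order vertices by it as a candidate sweep-cut ordering. The random matrix below merely stands in for M(x); it is not the construction from the paper.

```python
import numpy as np

def second_eigenvector_ordering(M):
    """Order vertices by the eigenvector of M^T for the second-largest eigenvalue,
    the analogue of the spectral sweep-cut ordering on graphs."""
    vals, vecs = np.linalg.eig(M.T)
    idx = np.argsort(-vals.real)         # eigenvalues sorted by real part
    y = vecs[:, idx[1]].real             # eigenvector for the second-largest one
    return np.argsort(y)                 # candidate sweep-cut ordering of the vertices

# A random column-stochastic matrix standing in for the asymptotic M(x); in the talk
# this would be built from the tensor P and the spacey stationary vector x.
rng = np.random.default_rng(7)
M = rng.random((6, 6))
M /= M.sum(axis=0, keepdims=True)
print(second_eigenvector_ordering(M))
```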
Problem! Current methods only consider edges
… and that is not enough for current problems








In social networks, we want to penalize cutting triangles more than
cutting edges. The triangle motif represents stronger social ties.
Problem! Current methods only consider edges
[Figure: transcription network example with nodes SPT16, HO, CLN1, CLN2, SWI4_SWI6]
In transcription networks, the “feedforward loop” motif represents
biological function. Thus, we want to look for clusters of this structure.
An example with a layered flow network
[Figure: a layered network on nodes 0–11]
§  The network “flows” downward
§  Use directed 3-cycles to model flow
[Figure: four directed 3-cycle patterns on nodes i, j, k, with weights 1, 1, 1, 2]
§  Tensor spectral clustering: {0,1,2,3}, {4,5,6,7}, {8,9,10,11}
§  Standard spectral: {0,1,2,3,4,5,6,7}, {8,10,11}, {9}
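
A hedged sketch of one way to turn directed 3-cycles like these into a third-order transition tensor; the exact construction in the SDM 2015 paper may differ, and the edge list here is a toy example.

```python
import numpy as np

# Toy edge set; a directed 3-cycle k -> j -> i -> k contributes to the transition (j, k) -> i.
edges = {(0, 1), (1, 2), (2, 0), (1, 0), (2, 1), (0, 2), (2, 3), (3, 4), (4, 2)}
n = 5
T = np.zeros((n, n, n))
for i in range(n):
    for j in range(n):
        for k in range(n):
            if (k, j) in edges and (j, i) in edges and (i, k) in edges:
                T[i, j, k] += 1.0

# Normalize every column T[:, j, k]; columns touching no 3-cycle fall back to uniform.
col_sums = T.sum(axis=0)
for j in range(n):
    for k in range(n):
        if col_sums[j, k] > 0:
            T[:, j, k] /= col_sums[j, k]
        else:
            T[:, j, k] = 1.0 / n

print(T[:, 1, 0])   # transition probabilities out of the pair (current=1, previous=0)
```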
WAW 2015 – EURANDOM – Eindhoven – Netherlands
Workshop on Algorithms and Models for the Web Graph
(but it’s grown to be all types of network analysis)
December 10–11

Winter School on Complex Network and Graph Models
December 7–8

Submissions due July 25th!
Time for Lots of Questions!
Manuscripts!
Li, Ng. On the limiting probability distribution of a transition
probability tensor. Linear & Multilinear Algebra 2013.
Gleich. PageRank beyond the Web. (accepted at SIAM Review)
Gleich, Lim, Yu. Multilinear PageRank. (under review…)
Benson, Gleich, Leskovec. Tensor Spectral Clustering for partitioning higher order
network structures. SDM 2015, arXiv:1502.05058. https://github.com/arbenson/tensor-sc
Benson, Gleich, Leskovec. Forthcoming. (Much better method…)
Benson, Gleich, Lim. The Spacey Random Walk. In prep.