A New Algorithm Model for Massive-Scale Streaming Graph Analysis
Chunxing Yin and Jason Riedy
Georgia Institute of Technology
ICIAM, 16 July 2019
Outline
Motivation and Applications
New Algorithm Model
Streaming Analysis
Closing
New Model for Streaming Graphs — ICIAM 2019 1/20
Motivation and Applications
(insert prefix here)-scale data analysis
Cyber-security: Identify anomalies, malicious actors
Health care: Finding outbreaks, population epidemiology
Social networks: Advertising, searching, grouping
Intelligence: Decisions at scale, regulating markets, smart & sustainable cities
Systems biology: Understanding interactions, drug design
Power grid: Disruptions, conservation
Simulation: Discrete events, cracking meshes
Changes are important. Cannot stop the world...
Potential Applications
• Social Networks
  • Identify communities, influences, bridges, trends, anomalies (trends before they happen)...
  • Potential to help social sciences, city planning, and others with large-scale data.
• Cybersecurity
  • Determine if new connections can access a device or represent a new threat in < 5 ms...
  • Is the transfer by a virus / persistent threat?
• Bioinformatics, health
  • Construct gene sequences, analyze protein interactions, map brain interactions
• Credit fraud forensics ⇒ detection ⇒ monitoring
  • Real-time integration of all the customer’s data
Streaming graph data
Network data rates:
• Gigabit ethernet: 81k – 1.5M packets per second
• Over 130 000 flows per second on 10 GigE (< 7.7 µs)
Person-level data rates:
• 500M posts per day on Twitter (6k / sec) [1]
• 3M posts per minute on Facebook (50k / sec) [2]
Should analyze only changes and not entire graph.
Throughput & latency trade off and expose different levels of concurrency.
[1] www.internetlivestats.com/twitter-statistics/
[2] www.jeffbullas.com/2015/04/17/21-awesome-facebook-facts-and-statistics-you-need-to-check-out/
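The per-second figures quoted above follow from simple unit conversion. A minimal sketch of that arithmetic, where the helper name `per_second` is made up for illustration:

```python
# Convert the quoted data rates to per-second figures and a per-flow time budget.
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(count, period_seconds):
    """Average number of events per second over the given period."""
    return count / period_seconds


twitter_rate = per_second(500e6, SECONDS_PER_DAY)  # roughly 5.8k posts/s, quoted as 6k
facebook_rate = per_second(3e6, 60)                # 50k posts/s
flow_budget_us = 1e6 / 130_000                     # microseconds available per flow

print(round(twitter_rate), round(facebook_rate), round(flow_budget_us, 2))
```

At 130 000 flows per second, each flow leaves just under 7.7 µs of processing time, which is why the slide argues for analyzing only the changes.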
Streaming graph analysis
Terminology (not universal):
• Streaming changes into a massive, evolving graph
• Need to handle deletions as well as insertions
Previous STINGER performance results (x86-64):
Data ingest: >2M upd/sec [Ediger, McColl, Poovey, Campbell, & B 2014]
Clustering coefficients: >100K upd/sec [Riedy, Meyerhenke, B, E, & Mattson 2012]
Connected comp.: >1M upd/sec [McColl, Green, & B 2013]
Community clustering: >100K upd/sec∗ [R & B 2013]
PageRank: Up to 40× latency improvement [R 2016]
New Algorithm Model
Starting incremental / streaming algorithms
• Incremental and streaming algorithms start somewhere.
• Initial, static computation can take a rather long time...
• During which the graph cannot change?
• What about supporting many simultaneous analyses?
[Figure: Data ingest rates, R-MAT into R-MAT, scales 24 & 30. x-axis: batch size (1 to 1e+05); y-axis: update rate (upd/s, 1e+02 to 1e+06); platforms: Power8, Haswell, Haswell-30.]
What can we run while the graph changes?
What if we don’t hold up changes?
When is an algorithm valid?
Analyze concurrently with the graph changes, and
produce a result correct for the starting graph and
some subset of concurrent changes.
• No locking beyond atomic operations.
• No versioned data structure.
• No stopping.
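One way to write the validity condition above symbolically, as a sketch only. The symbols G_0, C, C', f and the operator ⊕ are notation introduced here for illustration and do not come from the slides.

```latex
% Assumed notation: G_0 is the graph when the analysis starts, C is the set of
% changes that arrive while it runs, G \oplus C' applies the changes C' to G,
% and f is the analysis being computed.
\[
  \text{a concurrent run is valid}
  \iff
  \exists\, C' \subseteq C :\quad \text{output} = f(G_0 \oplus C')
\]
```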
Sample of other execution models
• Put in a query, wait for sufficient data [Phillips, et al. at Sandia]
  • Different but very interesting model.
• Evolving: Sample, accurate with high probability.
  • Difficult to generalize into graph results (e.g. shortest path tree).
• Classical: dynamic algorithms, versioned data
  • Can require drastically more storage, possibly a copy of the graph per property, or more overhead for techniques like read-copy-update.
These generally do not address the latency of computing the “static” starting point.
Algorithm validity in our model: Example.
Can you compute degrees in an undirected graph (no self
loops) concurrently with changes?
Algorithm: Iterate over vertices, count the number of
neighbors.
[Diagram: compute deg(v1) = 1; the edge is then deleted; compute deg(v2) = 0. Reported degrees: (1, 0).]
Cannot correspond to an undirected graph at all!
Valid for our model? No!
Not incorrect, just not valid for our model.
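The interleaving above can be replayed in a few lines. This is a minimal single-threaded sketch: the concurrent deletion is simulated by a callback, and the names `degrees_by_vertex` and `delete_edge` are hypothetical.

```python
# Simulate vertex-by-vertex degree counting with a change arriving mid-run.
def degrees_by_vertex(adj, order, change_after=None):
    """Count neighbors per vertex; apply change_after[v] right after visiting v."""
    deg = {}
    for v in order:
        deg[v] = len(adj[v])
        if change_after and v in change_after:
            change_after[v](adj)  # stands in for a concurrent update
    return deg


def delete_edge(u, v):
    """Build a change that removes the undirected edge {u, v}."""
    def apply(adj):
        adj[u].discard(v)
        adj[v].discard(u)
    return apply


adj = {"v1": {"v2"}, "v2": {"v1"}}
deg = degrees_by_vertex(adj, ["v1", "v2"], {"v1": delete_edge("v1", "v2")})
print(deg)
```

The reported degrees sum to an odd number, so no undirected graph has them: the result matches neither the starting graph nor the starting graph plus the deletion.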
Algorithm validity in our model: Example.
Can you compute degrees in an undirected graph (no self
loops) concurrently with changes?
Algorithm: Iterate over edges, increment the degrees of
the endpoints.
[Diagram: increment deg(v1) and deg(v2), giving (1, 1); the edge is deleted later, and the reported degrees stay (1, 1).]
Corresponds to the beginning graph plus a subset of
concurrent changes.
Valid for our model? Yes!
Undirected stored as directed: skip edges with v1 ≥ v2.
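A minimal single-threaded sketch of the edge-based count, again simulating the concurrent deletion with a callback. The function names are hypothetical, and a real implementation would use atomic increments for the degree updates.

```python
# Edge-based degree counting: each edge is examined once and updates both endpoints.
def degrees_by_edge(adj, after_edge=None):
    deg = {v: 0 for v in adj}
    for u in sorted(adj):
        for v in sorted(adj[u]):      # neighbors as seen when u is visited
            if u >= v:
                continue              # undirected stored as directed: count once
            deg[u] += 1
            deg[v] += 1
            if after_edge is not None:
                after_edge(adj, u, v)  # stands in for a concurrent update
    return deg


def delete_edge(adj, u, v):
    adj[u].discard(v)
    adj[v].discard(u)


# Deletion lands after the edge was counted: result matches the starting graph.
late = degrees_by_edge({"v1": {"v2"}, "v2": {"v1"}}, after_edge=delete_edge)

# Deletion lands before the edge is visited: result matches start plus the change.
early_graph = {"v1": {"v2"}, "v2": {"v1"}}
delete_edge(early_graph, "v1", "v2")
early = degrees_by_edge(early_graph)
print(late, early)
```

Either outcome corresponds to the starting graph plus some subset of the concurrent changes, which is what the model asks for.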
Algorithm validity in our model
[Figure: source vertex s with edges e1 and e2; w(e1) = 10, w(e2) changes from 5 to 1 during the run; ∆ = 4.]
• What is valid?
  • Typical BFS
  • Shiloach-Vishkin connected components
  • PageRank, Katz via Jacobi
  • Making a copy! (Vertex-induced subgraph)
• What is invalid?
  • Making a decision twice in implementations
    • ∆-stepping SSSP: Decrease a weight below ∆
    • Degree optimization: Cross threshold, miss vertex
  • Applying old or different information
    • Multiple searches: Betweenness centrality
    • Labeling in S. Kahan’s components alg
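The ∆-stepping case can be illustrated with a heavily simplified sketch. It keeps only the light/heavy classification (light means weight at most ∆), which is tested once per pass, and omits buckets and distances entirely. The edge names and weights follow the slide (w(e1) = 10, w(e2) going from 5 to 1, ∆ = 4); the function name and the two-pass structure are assumptions.

```python
# "Making a decision twice": the light/heavy test runs in two separate passes.
DELTA = 4


def relax_both_phases(weights, mutate=None):
    """Return edges relaxed by a light pass followed by a heavy pass."""
    relaxed = []
    for e in sorted(weights):
        if weights[e] <= DELTA:      # first decision: is the edge light?
            relaxed.append(e)
    if mutate is not None:
        mutate(weights)              # stands in for a concurrent weight change
    for e in sorted(weights):
        if weights[e] > DELTA:       # second decision: is the edge heavy?
            relaxed.append(e)
    return relaxed


def decrease_e2(weights):
    weights["e2"] = 1


start = relax_both_phases({"e1": 10, "e2": 5})                  # starting graph
end = relax_both_phases({"e1": 10, "e2": 1})                    # graph after the change
mixed = relax_both_phases({"e1": 10, "e2": 5}, decrease_e2)     # change between passes
print(start, end, mixed)
```

In the mixed run e2 is heavy during the light pass and light during the heavy pass, so it is never relaxed. That outcome matches neither the starting graph nor the changed graph.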
Example: PageRank, Katz Centrality
PageRank: distribution of random walks
$(I - \alpha D^{-1} A^T)\,x = \mathbf{1}/|V|$
Katz Centrality: count of number of walks
$(I - \alpha A^T)\,x = \mathbf{1}$
$A$: row → col adjacency matrix
$D$: diagonal matrix of out-degrees
$|V|$: number of vertices, $\mathbf{1}$: all-1 vector
Both can be solved by Jacobi iteration, e.g. for Katz:
$(I - \alpha A^T)\,x = \mathbf{1} \;\Rightarrow\; x^{(k+1)} = \alpha A^T x^{(k)} + \mathbf{1}$
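The Katz iteration above can be sketched in plain Python on a static graph. The edge-list representation and the function name `katz_jacobi` are assumptions made for illustration. Convergence requires α to be smaller than the reciprocal of the spectral radius of A.

```python
# Jacobi iteration for Katz centrality: x <- alpha * A^T x + 1.
def katz_jacobi(edges, n, alpha, tol=1e-10, max_iter=1000):
    """edges holds (i, j) pairs for directed edges i -> j, i.e. A[i][j] = 1."""
    x = [0.0] * n
    for _ in range(max_iter):
        x_new = [1.0] * n
        for i, j in edges:
            x_new[j] += alpha * x[i]  # (A^T x)[j] sums x over in-neighbors of j
        diff = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if diff < tol:
            break
    return x


# Directed path 0 -> 1 -> 2 with alpha = 0.5.
katz = katz_jacobi([(0, 1), (1, 2)], 3, 0.5)
print(katz)
```

On this path, vertex 0 has no incoming walks beyond the constant term, vertex 1 adds one walk of length one, and vertex 2 adds walks of length one and two.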
Jacobi can be valid for our model
Core loop of Jacobi iteration for Katz centrality:
while $\|r^{(k)}\| \ge \epsilon$
1. $x^{(k+1)} = \alpha A^T x^{(k)} + \mathbf{1}$
2. $r^{(k+1)} = \mathbf{1} - (I - \alpha A^T)\,x^{(k+1)}$
3. $k = k + 1$
Except this is not valid. Residual $r^{(k+1)}$ may use a different graph / adjacency matrix $A$.
Jacobi can be valid for our model
Core loop of Jacobi iteration for Katz centrality:
do
1. $x^{(k+1)} = \alpha A^T x^{(k)} + \mathbf{1}$ and $r^{(k)} = \mathbf{1} - (I - \alpha A^T)\,x^{(k)}$
2. $k = k + 1$
until $\|r^{(k-1)}\| < \epsilon$
Must use the same graph for all requirements.
Will need $r^{(k-1)}$ later!
This also affects convergence speed.
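A minimal sketch of the fused loop, assuming an edge-list graph and a hypothetical function name. The graph is read in exactly one pass per iteration, and both the next iterate and the residual of the current iterate are derived from that single pass.

```python
# Fused Jacobi step: one pass over the edges yields y = alpha * A^T x, from which
# both x^(k+1) = y + 1 and r^(k) = 1 - x^(k) + y are formed.
def katz_jacobi_fused(edges, n, alpha, eps=1e-10, max_iter=1000):
    x = [0.0] * n
    k = 0
    while True:
        y = [0.0] * n
        for i, j in edges:               # the only graph read in this iteration
            y[j] += alpha * x[i]
        x_next = [1.0 + yj for yj in y]
        r = [1.0 - xj + yj for xj, yj in zip(x, y)]  # residual of the current x
        k += 1
        if max(abs(v) for v in r) < eps or k >= max_iter:
            return x, r, k               # x and r describe the same graph view
        x = x_next


x_fused, r_fused, iters = katz_jacobi_fused([(0, 1), (1, 2)], 3, 0.5)
print(x_fused, r_fused, iters)
```

Since $r^{(k)} = x^{(k+1)} - x^{(k)}$ here, the stopping test costs one extra iteration compared with checking the residual of the newest iterate, which is one way the change affects convergence speed. The returned residual is the one an update step would reuse.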
Fun properties for one-shot queries
Due to Chunxing Yin [3], under sensible assumptions:
1. You can produce a single-change stream to demonstrate invalidity.
   • Proof idea: Start with a graph that incorporates all the visible changes, introduce the one change at the right time.
2. Algorithms producing a subgraph of the input cannot be guaranteed to run concurrently with changes and always produce moment-in-time outputs.
   • Proof idea: Any time a snapshot result could happen, delete then re-insert an edge from the output.
[3] Yin, Riedy, et al. A New Algorithmic Model for Graph Analysis of Streaming Data. 14th International Workshop on Mining and Learning with Graphs (MLG), May 2018.
Streaming Analysis
On to streaming...
Can we update graph metrics as new data arrives without
just re-running?
• Track what changed during the one-shot query.
• Update locally around those changes, while other changes are occurring.
• If the update is valid, can repeat to follow a streaming graph.
[Timeline: Initial, then Upd. w/∆0, then Upd. w/∆1, ...; batches ∆0, ∆1, ∆2 accumulate while each step runs.]
Examples: PageRank & Katz via iterative refinement; connected components by maintaining a spanning tree.
Early results with PageRank
(Will explain the algorithm in a moment.)
Synchronous: Ingest delays will increase with # kernels.
Concurrent: Constant-rate ingest! Detects sudden structural change?
[Figures: red dot: ingested batch; blue dot: PageRank kernel begins; vertical axis: # of iterations.]
Aside: Diameter Inside a “Batch”
Updating PageRank and Katz Centrality
Essentially, iterative refinement.
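Algorithm 1: Update x and r with given ∆
Input: graph $H_i$, previous solution $x_i$, batch $\Delta_i$
Output: updated Katz centrality vector $x_{i+1}$
1. $r_i = \mathbf{1} - (I - \alpha H_i)\,x_i$ (saved from previous round)
2. $\tilde{r}_{i+1} = r_i + \alpha \Delta_i x_i$
3. $\Delta x = \mathrm{JACOBI}(I - \alpha H_{i+1},\ \tilde{r}_{i+1},\ \mathrm{tol})$
4. $x_{i+1} = x_i + \Delta x$
5. $\Delta r = \alpha \Delta_i x_i - (I - \alpha H_{i+1})\,\Delta x$ (saved from Jacobi)
6. $r_{i+1} = r_i + \Delta r$
7. return $x_{i+1}$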
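A minimal single-threaded sketch of the update above, assuming H is the transposed adjacency matrix stored as a weighted edge list and that a batch carries +1 for an inserted edge and -1 for a deleted one. All function names are hypothetical, and the concurrency aspects of the talk are not modeled.

```python
# Iterative-refinement update of Katz centrality after a batch of edge changes.
def scaled_matvec(edges, x, alpha, n):
    """Return alpha * H x with H = A^T; edges are (i, j, w) for A[i][j] = w."""
    y = [0.0] * n
    for i, j, w in edges:
        y[j] += alpha * w * x[i]
    return y


def jacobi(edges, rhs, alpha, n, tol=1e-12, max_iter=1000):
    """Solve (I - alpha H) dx = rhs by iterating dx <- alpha H dx + rhs."""
    dx = [0.0] * n
    for _ in range(max_iter):
        y = scaled_matvec(edges, dx, alpha, n)
        nxt = [b + v for b, v in zip(rhs, y)]
        done = max(abs(a - b) for a, b in zip(nxt, dx)) < tol
        dx = nxt
        if done:
            break
    return dx


def katz_update(edges_next, delta, x, r, alpha, n, tol=1e-12):
    """edges_next is the graph after the batch; delta is the batch itself."""
    d = scaled_matvec(delta, x, alpha, n)            # alpha * Delta x
    r_tilde = [a + b for a, b in zip(r, d)]          # residual of old x on new graph
    dx = jacobi(edges_next, r_tilde, alpha, n, tol)  # correction to the solution
    x_next = [a + b for a, b in zip(x, dx)]
    h_dx = scaled_matvec(edges_next, dx, alpha, n)
    # r_next = r + alpha*Delta*x - (I - alpha*H_next) dx
    r_next = [rt - v + h for rt, v, h in zip(r_tilde, dx, h_dx)]
    return x_next, r_next


# Path 0 -> 1 -> 2 already solved (alpha = 0.5), then edge 0 -> 2 is inserted.
x_old, r_old = [1.0, 1.5, 1.75], [0.0, 0.0, 0.0]
batch = [(0, 2, 1.0)]
graph_next = [(0, 1, 1.0), (1, 2, 1.0)] + batch
x_upd, r_upd = katz_update(graph_next, batch, x_old, r_old, 0.5, 3)
print(x_upd, r_upd)
```

Only vertex 2 changes here, because the right-hand side of the correction solve is nonzero only where the batch touched the graph. That locality is what makes the update cheaper than re-running from scratch.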
Closing
Open issues
Difficult problems: Updating triangle counts efficiently!
• Option: re-counting a region around changes, stopping once counts do not change.
• Possibly incorrect on the region’s border, but only at changes.
• Next run can fix those... A looser model?
Some algorithms essentially copy subgraphs.
• What are the size bounds?
• Can they characterize algorithms / properties?
• Can we formalize what needs to be kept for updating results?
Closing
• Summary
  • Analysis concurrent with graph change can work.
  • But not all methods are valid.
  • Avoid evaluating conditions or exploring the graph more than once.
  • Save information necessary for updates.
• Future work
  • Track subgraphs / communities for “slow” analyses
  • Develop more valid updating methods.
  • Explore approximation results related to concurrent analysis.
New Model for Streaming Graphs — ICIAM 2019 20/20

Más contenido relacionado

La actualidad más candente

Big Data LDN 2016: Data Warehouse Automation: Solve integration challenges, s...
Big Data LDN 2016: Data Warehouse Automation: Solve integration challenges, s...Big Data LDN 2016: Data Warehouse Automation: Solve integration challenges, s...
Big Data LDN 2016: Data Warehouse Automation: Solve integration challenges, s...
Matt Stubbs
 

La actualidad más candente (10)

06 - HAMS implementation
06 - HAMS implementation06 - HAMS implementation
06 - HAMS implementation
 
Graph-Powered Machine Learning
Graph-Powered Machine LearningGraph-Powered Machine Learning
Graph-Powered Machine Learning
 
Sparklyr: Big Data enabler for R users
Sparklyr: Big Data enabler for R usersSparklyr: Big Data enabler for R users
Sparklyr: Big Data enabler for R users
 
Big Data LDN 2016: Data Warehouse Automation: Solve integration challenges, s...
Big Data LDN 2016: Data Warehouse Automation: Solve integration challenges, s...Big Data LDN 2016: Data Warehouse Automation: Solve integration challenges, s...
Big Data LDN 2016: Data Warehouse Automation: Solve integration challenges, s...
 
A data science observatory based on RAMP - rapid analytics and model prototyping
A data science observatory based on RAMP - rapid analytics and model prototypingA data science observatory based on RAMP - rapid analytics and model prototyping
A data science observatory based on RAMP - rapid analytics and model prototyping
 
Denis Reznik Data driven future
Denis Reznik Data driven futureDenis Reznik Data driven future
Denis Reznik Data driven future
 
Augmented reality meets computer vision data generation for driving scenes.
Augmented reality meets computer vision data generation for driving scenes.  Augmented reality meets computer vision data generation for driving scenes.
Augmented reality meets computer vision data generation for driving scenes.
 
Applocation of Numerical Methods
Applocation of Numerical MethodsApplocation of Numerical Methods
Applocation of Numerical Methods
 
Time Delayed Recurrent Neural Network for Multi-Step Prediction
Time Delayed Recurrent Neural Network for Multi-Step PredictionTime Delayed Recurrent Neural Network for Multi-Step Prediction
Time Delayed Recurrent Neural Network for Multi-Step Prediction
 
Probabilistic Forecasting: How and Why?
Probabilistic Forecasting: How and Why?Probabilistic Forecasting: How and Why?
Probabilistic Forecasting: How and Why?
 

Similar a ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis

DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESDyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
Subhajit Sahu
 
STINGER: Multi-threaded Graph Streaming
STINGER: Multi-threaded Graph StreamingSTINGER: Multi-threaded Graph Streaming
STINGER: Multi-threaded Graph Streaming
Jason Riedy
 

Similar a ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis (20)

Graph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesGraph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New Architectures
 
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESDyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
 
Scalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
Scalable and Efficient Algorithms for Analysis of Massive, Streaming GraphsScalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
Scalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
 
STIC-D: algorithmic techniques for efficient parallel pagerank computation on...
STIC-D: algorithmic techniques for efficient parallel pagerank computation on...STIC-D: algorithmic techniques for efficient parallel pagerank computation on...
STIC-D: algorithmic techniques for efficient parallel pagerank computation on...
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PC
 
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
 
A Performance Analysis of Self-* Evolutionary Algorithms on Networks with Cor...
A Performance Analysis of Self-* Evolutionary Algorithms on Networks with Cor...A Performance Analysis of Self-* Evolutionary Algorithms on Networks with Cor...
A Performance Analysis of Self-* Evolutionary Algorithms on Networks with Cor...
 
Graph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraGraph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear Algebra
 
# Can we trust ai. the dilemma of model adjustment
# Can we trust ai. the dilemma of model adjustment# Can we trust ai. the dilemma of model adjustment
# Can we trust ai. the dilemma of model adjustment
 
Fast Incremental Community Detection on Dynamic Graphs : NOTES
Fast Incremental Community Detection on Dynamic Graphs : NOTESFast Incremental Community Detection on Dynamic Graphs : NOTES
Fast Incremental Community Detection on Dynamic Graphs : NOTES
 
Big Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big GraphsBig Stream Processing Systems, Big Graphs
Big Stream Processing Systems, Big Graphs
 
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
 
Fast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareFast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA Hardware
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
PL-4089, Accelerating and Evaluating OpenCL Graph Applications, by Shuai Che,...
PL-4089, Accelerating and Evaluating OpenCL Graph Applications, by Shuai Che,...PL-4089, Accelerating and Evaluating OpenCL Graph Applications, by Shuai Che,...
PL-4089, Accelerating and Evaluating OpenCL Graph Applications, by Shuai Che,...
 
Dynamic Community Detection for Large-scale e-Commerce data with Spark Stream...
Dynamic Community Detection for Large-scale e-Commerce data with Spark Stream...Dynamic Community Detection for Large-scale e-Commerce data with Spark Stream...
Dynamic Community Detection for Large-scale e-Commerce data with Spark Stream...
 
Trivento summercamp masterclass 9/9/2016
Trivento summercamp masterclass 9/9/2016Trivento summercamp masterclass 9/9/2016
Trivento summercamp masterclass 9/9/2016
 
STINGER: Multi-threaded Graph Streaming
STINGER: Multi-threaded Graph StreamingSTINGER: Multi-threaded Graph Streaming
STINGER: Multi-threaded Graph Streaming
 
20151130
2015113020151130
20151130
 
Time-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity ClustersTime-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity Clusters
 

Más de Jason Riedy

Más de Jason Riedy (20)

Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoF
 
LAGraph 2021-10-13
LAGraph 2021-10-13LAGraph 2021-10-13
LAGraph 2021-10-13
 
Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoF
 
Graph analysis and novel architectures
Graph analysis and novel architecturesGraph analysis and novel architectures
Graph analysis and novel architectures
 
GraphBLAS and Emus
GraphBLAS and EmusGraphBLAS and Emus
GraphBLAS and Emus
 
Reproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureReproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to Architecture
 
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
 
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
 
Novel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and BeyondNovel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and Beyond
 
Characterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with MicrobenchmarksCharacterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with Microbenchmarks
 
CRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery UpdateCRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery Update
 
Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsHigh-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs
 
Updating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsUpdating PageRank for Streaming Graphs
Updating PageRank for Streaming Graphs
 
Network Challenge: Error and Sensitivity Analysis
Network Challenge: Error and Sensitivity AnalysisNetwork Challenge: Error and Sensitivity Analysis
Network Challenge: Error and Sensitivity Analysis
 
Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014
Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014
Graph Analysis Trends and Opportunities -- CMG Performance and Capacity 2014
 
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsSTING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
 

Último

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
shivangimorya083
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
shivangimorya083
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
shivangimorya083
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
shambhavirathore45
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 

Último (20)

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 

ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis

  • 1. A New Algorithm Model for Massive-Scale Streaming Graph Analysis Chunxing Yin and Jason Riedy Georgia Institute of Technology ICIAM, 16 July 2019
  • 2. Outline Motivation and Applications New Algorithm Model Streaming Analysis Closing New Model for Streaming Graphs — ICIAM 2019 1/20
  • 4. (insert prefix here)-scale data analysis Cyber-security Identify anomalies, malicious actors Health care Finding outbreaks, population epidemiology Social networks Advertising, searching, grouping Intelligence Decisions at scale, regulating markets, smart & sustainable cities Systems biology Understanding interactions, drug design Power grid Disruptions, conservation Simulation Discrete events, cracking meshes Changes are important. Cannot stop the world... New Model for Streaming Graphs — ICIAM 2019 2/20
  • 5. Potential Applications
    • Social Networks
      • Identify communities, influences, bridges, trends, anomalies (trends before they happen)...
      • Potential to help social sciences, city planning, and others with large-scale data.
    • Cybersecurity
      • Determine if new connections can access a device or represent new threat in < 5ms...
      • Is the transfer by a virus / persistent threat?
    • Bioinformatics, health
      • Construct gene sequences, analyze protein interactions, map brain interactions
    • Credit fraud forensics ⇒ detection ⇒ monitoring
      • Real-time integration of all the customer’s data
  • 6. Streaming graph data
    Network data rates:
    • Gigabit ethernet: 81k – 1.5M packets per second
    • Over 130 000 flows per second on 10 GigE (< 7.7 µs)
    Person-level data rates:
    • 500M posts per day on Twitter (6k / sec) [1]
    • 3M posts per minute on Facebook (50k / sec) [2]
    Should analyze only changes and not entire graph. Throughput & latency trade off and expose different levels of concurrency.
    [1] www.internetlivestats.com/twitter-statistics/
    [2] www.jeffbullas.com/2015/04/17/21-awesome-facebook-facts-and-statistics-you-need-to-check-out/
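
The per-second figures quoted on this slide follow from simple unit conversion. A quick check in Python (the variable names are illustrative, not from the talk):

```python
# Convert the quoted rates into per-second figures and a per-flow time budget.
SECONDS_PER_DAY = 24 * 60 * 60

twitter_per_sec = 500e6 / SECONDS_PER_DAY   # 500M posts per day
facebook_per_sec = 3e6 / 60                 # 3M posts per minute
flow_budget_us = 1e6 / 130_000              # microseconds available per flow at 130k flows/s

print(f"Twitter:  about {twitter_per_sec:,.0f} posts/s")
print(f"Facebook: about {facebook_per_sec:,.0f} posts/s")
print(f"10 GigE:  about {flow_budget_us:.1f} us per flow")
```

The Twitter rate rounds to roughly 6k posts per second, and 130 000 flows per second leaves under 7.7 µs to handle each flow.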
  • 7. Streaming graph analysis
    Terminology (not universal):
    • Streaming changes into a massive, evolving graph
    • Need to handle deletions as well as insertions
    Previous STINGER performance results (x86-64):
    • Data ingest: >2M upd/sec [Ediger, McColl, Poovey, Campbell, & B 2014]
    • Clustering coefficients: >100K upd/sec [Riedy, Meyerhenke, B, E, & Mattson 2012]
    • Connected comp.: >1M upd/sec [McColl, Green, & B 2013]
    • Community clustering: >100K upd/sec∗ [R & B 2013]
    • PageRank: Up to 40× latency improvement [R 2016]
  • 9. Starting incremental / streaming algorithms
    • Incremental and streaming algorithms start somewhere.
    • Initial, static computation can take a rather long time...
      • During which the graph cannot change?
      • What about supporting many simultaneous analyses?
    [Figure: Data ingest rates, R-MAT into R-MAT, scales 24 & 30. Update rate (upd/s) versus batch size for platforms Power8, Haswell, and Haswell-30.]
    What can we run while the graph changes?
  • 10. What if we don’t hold up changes? When is an algorithm valid?
    Analyze concurrently with the graph changes, and produce a result correct for the starting graph and some subset of concurrent changes.
    • No locking beyond atomic operations.
    • No versioned data structure.
    • No stopping.
  • 11. Sample of other execution models
    • Put in a query, wait for sufficient data [Phillips, et al. at Sandia]
      • Different but very interesting model.
    • Evolving: Sample, accurate w/high-prob.
      • Difficult to generalize into graph results (e.g. shortest path tree).
    • Classical: dynamic algorithms, versioned data
      • Can require drastically more storage, possibly a copy of the graph per property, or more overhead for techniques like read-copy-update.
    Generally do not address the latency of computing the “static” starting point.
  • 12. Algorithm validity in our model: Example.
    Can you compute degrees in an undirected graph (no self loops) concurrently with changes?
    Algorithm: Iterate over vertices, count the number of neighbors.
    [Figure: compute deg(v1) = 1; the edge is deleted; compute deg(v2) = 0.]
    Cannot correspond to an undirected graph at all!
    Valid for our model? No! Not incorrect, just not valid for our model.
  • 13. Algorithm validity in our model: Example.
    Can you compute degrees in an undirected graph (no self loops) concurrently with changes?
    Algorithm: Iterate over edges, increment the degrees of the endpoints.
    [Figure: increment deg(v1) and deg(v2), giving 1 and 1; (later...) the edge is deleted and the counts remain 1 and 1.]
    Corresponds to the beginning graph plus a subset of concurrent changes.
    Valid for our model? Yes!
    Undirected stored as directed: skip edges with v1 ≥ v2.
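
The two degree-counting examples can be replayed deterministically. The sketch below is an illustration, not the authors' code: it simulates the concurrent deletion with a hypothetical `concurrent_change` callback fired after the first vertex is processed, and every function name is an assumption made for this example. A real implementation would use atomic increments instead of a single-threaded loop.

```python
def make_graph():
    # One undirected edge {1, 2}, stored as two directed entries.
    return {1: {2}, 2: {1}}

def delete_edge(adj, u, v):
    adj[u].discard(v)
    adj[v].discard(u)

def degrees_by_vertex(adj, concurrent_change):
    """Per-vertex neighbor count; each undirected edge is examined twice."""
    deg = {}
    for i, v in enumerate(sorted(adj)):
        deg[v] = len(adj[v])
        if i == 0:
            concurrent_change(adj)  # the change lands between two vertex visits
    return deg

def degrees_by_edge(adj, concurrent_change):
    """Per-edge increments; each undirected edge is examined once (skip u >= v)."""
    deg = {v: 0 for v in adj}
    for i, u in enumerate(sorted(adj)):
        for v in sorted(adj[u]):
            if u >= v:
                continue
            deg[u] += 1
            deg[v] += 1
        if i == 0:
            concurrent_change(adj)  # same timing as in the vertex-based run
    return deg

def has_even_degree_sum(deg):
    # Handshake lemma: the degrees of any undirected graph sum to an even number.
    return sum(deg.values()) % 2 == 0

def change(adj):
    delete_edge(adj, 1, 2)

by_vertex = degrees_by_vertex(make_graph(), change)
by_edge = degrees_by_edge(make_graph(), change)
print(by_vertex, has_even_degree_sum(by_vertex))
print(by_edge, has_even_degree_sum(by_edge))
```

The vertex-based run reports degrees 1 and 0, an odd sum that no undirected graph can produce. The edge-based run reports 1 and 1, which matches the starting graph with the deletion not yet applied.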
  • 14. Algorithm validity in our model
    [Figure: source s with edge weights w(e1) = 10 and w(e2) = 5 → 1; ∆ = 4.]
    • What is valid?
      • Typical BFS
      • Shiloach-Vishkin connected components
      • PageRank, Katz via Jacobi
      • Making a copy! (Vertex-induced subgraph)
    • What is invalid?
      • Making a decision twice in implementations
        • ∆-stepping SSSP: Decrease a weight below ∆
        • Degree optimization: Cross threshold, miss vertex
      • Applying old or different information
        • Multiple searches: Betweenness centrality
        • Labeling in S. Kahan’s components alg
  • 15. Example: PageRank, Katz Centrality
    PageRank: Distribution of rand. walks: $(I - \alpha D^{-1} A^T) x = \mathbf{1}/|V|$
    Katz Centrality: Count of number of walks: $(I - \alpha A^T) x = \mathbf{1}$
    $A$: row → col adjacency matrix; $D$: diagonal matrix of out-degrees; $|V|$: number of vertices; $\mathbf{1}$: all-1 vector
    Both can be solved by Jacobi iteration, e.g. for Katz: $(I - \alpha A^T) x = \mathbf{1} \Rightarrow x^{(k+1)} = \alpha A^T x^{(k)} + \mathbf{1}$
  • 16. Jacobi can be valid for our model
    Core loop of Jacobi iteration for Katz centrality:
    while $r^{(k)} \ge \epsilon$
      1. $x^{(k+1)} = \alpha A^T x^{(k)} + \mathbf{1}$
      2. $r^{(k+1)} = \mathbf{1} - (I - \alpha A^T) x^{(k+1)}$
      3. $k = k + 1$
    Except this is not valid. Residual $r^{(k+1)}$ may use a different graph / adjacency matrix $A$.
  • 17. Jacobi can be valid for our model
    Core loop of Jacobi iteration for Katz centrality:
    do
      1. $x^{(k+1)} = \alpha A^T x^{(k)} + \mathbf{1}$ and $r^{(k)} = \mathbf{1} - (I - \alpha A^T) x^{(k)}$
      2. $k = k + 1$
    until $r^{(k-1)} < \epsilon$
    Must use the same graph for all requirements. Will need $r^{(k-1)}$ later! This also affects convergence speed.
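
A minimal sketch of the valid loop, assuming an edge-list graph and a max-norm stopping test (both are choices made for this example; the slides do not specify a data structure or a norm). Since r(k) = 1 − (I − αAT)x(k) equals x(k+1) − x(k), a single pass over the edges yields both the next iterate and the residual, so both come from the same view of the graph. The function name `katz_jacobi` is hypothetical.

```python
def katz_jacobi(edges, n, alpha, eps, max_iter=1000):
    """Jacobi iteration for (I - alpha A^T) x = 1.

    edges holds a pair (u, v) for each directed edge u -> v.
    Returns the last iterate x^(k) and the residual r^(k-1) of the previous one.
    Assumes max_iter >= 1 and alpha small enough for the iteration to converge.
    """
    x = [0.0] * n
    for _ in range(max_iter):
        y = [0.0] * n
        for u, v in edges:              # the only pass over the graph this iteration
            y[v] += alpha * x[u]        # accumulates alpha * A^T x^(k)
        x_new = [yi + 1.0 for yi in y]  # x^(k+1) = alpha A^T x^(k) + 1
        r = [xn - xo for xn, xo in zip(x_new, x)]  # r^(k), from the same pass
        x = x_new
        if max(abs(ri) for ri in r) < eps:
            break
    return x, r

# Directed path 0 -> 1 -> 2 with alpha = 0.5.
x, r = katz_jacobi([(0, 1), (1, 2)], n=3, alpha=0.5, eps=1e-12)
print(x, r)
```

Returning the residual alongside the solution keeps it available for a later update step, which is what "Will need r(k−1) later!" asks for.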
  • 18-20. Fun properties for one-shot queries
    Due to Chunxing Yin [3], under sensible assumptions:
    1. You can produce a single-change stream to demonstrate invalidity.
       • Proof idea: Start with a graph that incorporates all the visible changes, introduce the one change at the right time.
    2. Algorithms producing a subgraph of the input cannot be guaranteed to run concurrently with changes and always produce moment-in-time outputs.
       • Proof idea: Any time a snapshot result could happen, delete then re-insert an edge from the output.
    [3] Yin, Riedy, et al. A New Algorithmic Model for Graph Analysis of Streaming Data. 14th International Workshop on Mining and Learning with Graphs (MLG), May 2018.
  • 22. On to streaming...
    Can we update graph metrics as new data arrives without just re-running?
    • Track what changed during the one-shot query.
    • Update locally around those changes, while other changes are occurring.
    • If the update is valid, can repeat to follow a streaming graph.
    [Figure: timeline with stages Initial, Upd. w/∆0, Upd. w/∆1 and change batches ∆0, ∆1, ∆2.]
    Examples: PageRank & Katz, iterative refinement. Connected components, maintain a spanning tree.
  • 23. Early results with PageRank
    (Will explain the algorithm in a moment.)
    Synchronous: Ingest delays will increase with # kernels.
    Red dot: ingested batch. Blue dot: PR kernel begins. Vertical: # of iterations
  • 24. Early results with PageRank
    (Will explain the algorithm in a moment.)
    Concurrent: Constant-rate ingest! Detects sudden structural change?
    Red dot: ingested batch. Blue dot: PR kernel begins. Vertical: # of iterations
  • 25. Aside: Diameter Inside a “Batch”
  • 26. Updating PageRank and Katz Centrality
    Essentially, iterative refinement.
    Algorithm 1: Update x and r with given ∆
    Input: Graph $H_i$, previous solution $x_i$, batch $\Delta_i$
    Output: updated Katz centrality vector $x_{i+1}$
    1: $r_i = \mathbf{1} - (I - \alpha H_i) x_i$ ← saved from previous round
    2: $\tilde{r}_{i+1} = r_i + \alpha \Delta_i x_i$
    3: $\Delta x = \mathrm{JACOBI}(I - \alpha H_{i+1}, \tilde{r}_{i+1}, \mathrm{tol})$
    4: $x_{i+1} = x_i + \Delta x$
    5: $\Delta r = \alpha \Delta_i x_i - (I - \alpha H_{i+1}) \Delta x$ ← saved from Jacobi
    6: $r_{i+1} = r_i + \Delta r$
    7: return $x_{i+1}$
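
The update in Algorithm 1 can be sketched in a few lines. The slide does not define H or ∆ precisely, so this example assumes H is the transposed adjacency matrix AT held as an edge list, and that the batch ∆ is a list of signed edge changes (+1 insert, −1 delete). The helper names and the max-norm tolerance test are also assumptions.

```python
def apply_alpha_H(edges, x, alpha):
    """Return alpha * H x, where edges holds (u, v) for each directed edge u -> v."""
    y = [0.0] * len(x)
    for u, v in edges:
        y[v] += alpha * x[u]
    return y

def jacobi(edges, b, alpha, tol, max_iter=1000):
    """Approximately solve (I - alpha H) dx = b.

    Also returns the final residual b - (I - alpha H) dx, which is saved for step 6.
    """
    dx = [0.0] * len(b)
    res = list(b)
    for _ in range(max_iter):
        if max(abs(ri) for ri in res) < tol:
            break
        # Jacobi step: dx_new = alpha H dx + b, which equals dx + res.
        dx = [d + ri for d, ri in zip(dx, res)]
        hy = apply_alpha_H(edges, dx, alpha)
        res = [bi - di + hi for bi, di, hi in zip(b, dx, hy)]
    return dx, res

def katz_update(edges_new, x, r, batch, alpha, tol):
    """One round of the update; batch holds (u, v, sign) with sign +1 or -1."""
    r_tilde = list(r)
    for u, v, sign in batch:
        r_tilde[v] += sign * alpha * x[u]            # step 2: r~ = r + alpha * Delta * x
    dx, r_new = jacobi(edges_new, r_tilde, alpha, tol)  # step 3
    x_new = [xi + di for xi, di in zip(x, dx)]       # step 4
    return x_new, r_new                              # r_new equals r + Delta r (steps 5-6)

# Exact Katz solution for the single edge 0 -> 1 with alpha = 0.5, residual zero.
x0, r0 = [1.0, 1.5, 1.0], [0.0, 0.0, 0.0]
# Insert edge 1 -> 2 and update instead of recomputing from scratch.
x1, r1 = katz_update([(0, 1), (1, 2)], x0, r0, [(1, 2, +1)], alpha=0.5, tol=1e-12)
print(x1, r1)
```

Only entries of the right-hand side touched by the batch are nonzero at the start, so the Jacobi solve works on a correction that is typically much smaller than the full solution.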
  • 28. Open issues
    Difficult problems: Updating triangle counts efficiently!
    • Option: re-counting a region around changes, stopping once counts do not change.
    • Possibly incorrect on the region’s border, but only at changes.
    • Next run can fix those...
    A looser model? Some algorithms essentially copy subgraphs.
    • What are the size bounds?
    • Can they characterize algorithms / properties?
    • Can we formalize what needs to be kept for updating results?
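
One way to picture the "re-count a region around changes" option is the simplified sketch below. It is not the method from the talk: it assumes the batch has already been applied, takes the region to be the changed edges' endpoints plus their current neighbors, and does not model concurrent changes or the "stop once counts do not change" expansion. All names are hypothetical.

```python
from itertools import combinations

def triangles_at(adj, v):
    """Count triangles through v: pairs of v's neighbors that are themselves adjacent."""
    return sum(1 for a, b in combinations(sorted(adj[v]), 2) if b in adj[a])

def recount_region(adj, counts, changed_edges):
    """Recount per-vertex triangle counts near the changed edges; return the region."""
    region = set()
    for u, v in changed_edges:
        region |= {u, v} | adj[u] | adj[v]
    for w in region:
        counts[w] = triangles_at(adj, w)
    return region

# Undirected path 0-1-2-3-4 stored as symmetric adjacency sets; no triangles yet.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
counts = {v: triangles_at(adj, v) for v in adj}

# Apply one change (insert edge 0-2), then recount only around it.
adj[0].add(2)
adj[2].add(0)
region = recount_region(adj, counts, [(0, 2)])
print(sorted(region), counts)
```

Vertex 4 lies outside the region and is never revisited. Under concurrent changes the counts on the region's border could be stale, which is the issue the slide raises.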
  • 29. Closing
    • Summary
      • Analysis concurrent with graph change can work.
      • But not all methods are valid.
      • Avoid evaluating conditions or exploring the graph more than once.
      • Save information necessary for updates.
    • Future work
      • Track subgraphs / communities for “slow” analyses
      • Develop more valid updating methods.
      • Explore approximation results related to concurrent analysis.