Flink Forward Europe
8 October 2019
VASILIKI KALAVRI
vasia@apache.org
SELF-MANAGED AND
AUTOMATICALLY RECONFIGURABLE
STREAM PROCESSING
@vkalavri
[Figure: timeline of stream processing systems with marks at 1992, 2000, 2004, 2013, and now]
▸ Stream Database Systems: Tapestry, NiagaraCQ, Aurora, TelegraphCQ, STREAM
  Single-node execution; synopses and sketches
▸ 2004: MapReduce
▸ Dataflow Systems: S4, Storm, Millwheel, Flink, Samza, Spark Streaming, Naiad, Google Dataflow
  Distributed execution; partitioned state
▸ Next-gen streaming: Re-configurable Systems
  Automatic scaling, adaptive scheduling, straggler mitigation, query optimization
[Figure: control loop with an instrumented stream processor, a Profiler, and an Analyzer; labels: performance metrics, decision, invoke, re-configure job]
SNAILTRAIL: GENERALIZING CRITICAL
PATHS FOR ONLINE ANALYSIS OF
DISTRIBUTED DATAFLOWS
NSDI’18
CONVENTIONAL PROFILING TELLS ONLY PART OF THE STORY
▸ Duration
▸ Aggregate data exchange
▸ Dataflow graph
▸ Custom aggregate metrics
PROFILING SPARK SCHEDULING
[Figure: timeline of the driver and workers W1, W2, W3 with processing and scheduling activities; plots of CP and %weight per snapshot (0 to 15) for processing and scheduling]
[Figure: activity timeline for workers 1, 2, and 3. Activities: receive message, deserialization, processing, serialization, send message, waiting]
OPTIMIZING PROCESSING… INCREASED WAITING
CRITICAL PATH ANALYSIS
CRITICAL PATH: LONGEST EXECUTION PATH
(not considering waiting activities)
[Figure: timeline of workers W1, W2, W3 with activities a, b, c, d]
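The definition above can be sketched as a longest-path computation over an activity graph. This is a minimal illustration, not the SnailTrail implementation: the edge format `(u, v, duration, is_waiting)` and the function name `critical_path` are assumptions made for this example, and waiting activities are modeled by giving them zero weight.

```python
from collections import defaultdict

def critical_path(edges, source, sink):
    """Longest source-to-sink path in a DAG of activities.

    edges: iterable of (u, v, duration, is_waiting) tuples (hypothetical format).
    Waiting activities contribute zero to the path length.
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v, duration, is_waiting in edges:
        weight = 0.0 if is_waiting else duration
        succ[u].append((v, weight))
        indeg[v] += 1
        nodes.update((u, v))

    # Topological order (Kahn's algorithm).
    ready = [n for n in nodes if indeg[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for v, _ in succ[n]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    # Dynamic programming: best[n] is the longest distance from source to n.
    best = {n: float("-inf") for n in nodes}
    best[source] = 0.0
    prev = {}
    for n in order:
        if best[n] == float("-inf"):
            continue  # not reachable from source
        for v, weight in succ[n]:
            if best[n] + weight > best[v]:
                best[v] = best[n] + weight
                prev[v] = n

    # Walk back from the sink to recover the path.
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return best[sink], path[::-1]

# Two routes from s to t: one with 5 units of work, one with 4 units of work
# plus 10 units of waiting that do not count.
example = [
    ("s", "x", 2.0, False), ("x", "t", 3.0, False),
    ("s", "y", 10.0, True), ("y", "t", 4.0, False),
]
length, path = critical_path(example, "s", "t")
```

In this example the route through `x` is reported as critical, even though the route through `y` takes longer in wall-clock time, because waiting is excluded.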
OPTIMIZING CRITICAL ACTIVITIES CAN REDUCE LATENCY
[Figure: timeline of workers W1, W2, W3 with activities a, b, c, d, before and after optimizing a critical activity]
Reduced execution time
ONLINE CRITICAL PATH ANALYSIS
ONLINE ANALYSIS OF TRACE SNAPSHOTS
[Figure: a job with an input stream and an output stream; labels: periodic snapshot, trace snapshot stream, analyzer, performance summaries stream]
[Figure: snapshot between ts and te of the timeline of workers W1, W2, W3 with activities a, b, c, d, x, u, v, z]
All paths are potentially part of an evolving critical path
▸ All paths have the same length: te - ts
▸ Choosing a random path might miss critical activities
How to rank activities with regard to criticality?
Intuition: the more paths an activity appears on, the more likely it is to be critical
[Figure: the snapshot of workers W1, W2, W3 between ts and te, with labels 1 to 9 and the values 9, 0, 0, 6, 6]
CRITICAL PARTICIPATION (CP METRIC)
An estimation of the activity’s participation in the critical path

Definition 8. Transient Path Centrality: Let $P = \{\vec{p}_1, \vec{p}_2, \ldots, \vec{p}_N\}$ be the set of $N$ transient paths of snapshot $G_{[t_s,t_e]}$. The transient path centrality of an edge $e \in G_{[t_s,t_e]}$ is defined as

$$c(e) = \sum_{i=1}^{N} c_i(e), \quad \text{where } c_i(e) = \begin{cases} 0 & : e \notin \vec{p}_i \\ 1 & : e \in \vec{p}_i \end{cases}$$

The following holds:

$$CP_a = \frac{TPC(a) \cdot a_w}{N\,(t_e - t_s)} \qquad (3)$$

▸ $N$: total number of paths in the snapshot
▸ $a_w$: activity duration (edge weight)
▸ $TPC(a)$: centrality, the number of paths this activity appears on
Can be computed without path enumeration!
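One way to see why path enumeration is unnecessary: in a directed acyclic snapshot graph, the number of paths through an edge (u, v) equals the number of paths reaching u times the number of paths leaving v. The sketch below uses that standard counting argument to evaluate the CP formula. It is an illustration under assumptions, not the authors' code: the function name, the `{(u, v): weight}` edge format, and the convention that nodes without incoming or outgoing edges are the snapshot's start and end points are all chosen for this example.

```python
from collections import defaultdict

def critical_participation(edges, t_s, t_e):
    """CP value per edge of a snapshot DAG, without enumerating paths.

    edges: dict mapping (u, v) -> activity duration (hypothetical format).
    """
    succ, pred = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    # Topological order (Kahn's algorithm).
    indeg = {n: len(pred[n]) for n in nodes}
    ready = [n for n in nodes if indeg[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for v in succ[n]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    # paths_to[n]: number of paths from any start node to n.
    paths_to = {n: 0 for n in nodes}
    for n in order:
        if not pred[n]:
            paths_to[n] = 1
        for v in succ[n]:
            paths_to[v] += paths_to[n]

    # paths_from[n]: number of paths from n to any end node.
    paths_from = {n: 0 for n in nodes}
    for n in reversed(order):
        if not succ[n]:
            paths_from[n] = 1
        for u in pred[n]:
            paths_from[u] += paths_from[n]

    # N: total number of start-to-end paths in the snapshot.
    n_paths = sum(paths_from[n] for n in nodes if not pred[n])

    # CP_a = TPC(a) * a_w / (N * (t_e - t_s)), with TPC(a) = paths_to[u] * paths_from[v].
    return {
        (u, v): paths_to[u] * paths_from[v] * w / (n_paths * (t_e - t_s))
        for (u, v), w in edges.items()
    }

# Snapshot [0, 4]: every start-to-end path has length 4.
snapshot = {
    ("s1", "m"): 2.0, ("s2", "m"): 2.0,
    ("m", "e1"): 2.0, ("m", "e2"): 2.0,
    ("s1", "e1"): 4.0,
}
cp = critical_participation(snapshot, 0.0, 4.0)
```

Because every path in a snapshot has length te - ts, the CP values of all edges sum to 1, which gives a quick sanity check on any implementation.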
SNAILTRAIL IN ACTION
reference application → SnailTrail
▸ Reference application: profiling, trace generation (Apache Flink, Apache Spark, TensorFlow, Heron, Timely Dataflow, ...)
▸ Trace streams
▸ SnailTrail (Timely): trace ingestion, PAG construction, CP computation and activity ranking
▸ Output: CP-based performance summaries
DRIVER SCHEDULING IS CRITICAL
[Figure: timeline of the driver and workers W1, W2, W3 with processing and scheduling activities; plots of CP and %weight per snapshot (0 to 15)]
SNAILTRAIL V.2 DEMO
FAST AND ACCURATE
AUTOMATIC SCALING DECISIONS
FOR DISTRIBUTED STREAMING DATAFLOWS
OSDI’18
Streaming systems must be capable of adapting the level of parallelism when conditions change at runtime
[Figure: plots of events/s over time comparing input rate and throughput; labels: data loss, SLO violations, idle resources]
AUTOMATIC SCALING OVERVIEW
Scaling controller:
▸ metrics: detect symptoms
▸ policy: decide whether to scale
▸ scaling action: decide how much to scale
HEURISTIC SCALING APPROACHES
▸ Metrics: CPU utilization, backlog, tuples/s, backpressure signal
  Problematic under interference, multi-tenancy
▸ Policy: threshold and rule-based, e.g. if CPU > 80% => scale
  Sensitive to noise, manual, hard to tune
▸ Scaling action: small changes, one operator at a time
  Non-predictive, speculative steps
Systems: Borealis, StreamCloud, Seep, IBM Streams, Spark Streaming, Google Dataflow, Dhalion
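A threshold rule of the kind listed above can be sketched in a few lines. This is a hypothetical illustration, not the policy of any of the systems named on the slide: the function name, the 0.8 default threshold (taken from the "CPU > 80%" example), the choice to scale the busiest operator, and the step of one instance are assumptions.

```python
def threshold_policy(cpu_utilization, parallelism, threshold=0.8, step=1):
    """Return a new parallelism map after at most one scaling action.

    cpu_utilization: operator name -> utilization in [0, 1].
    parallelism: operator name -> current number of instances.
    """
    overloaded = {op: u for op, u in cpu_utilization.items() if u > threshold}
    if not overloaded:
        return dict(parallelism)  # no symptom detected, nothing to do
    # One operator at a time: pick the busiest one (an assumption of this sketch).
    busiest = max(overloaded, key=overloaded.get)
    updated = dict(parallelism)
    updated[busiest] += step
    return updated

new_plan = threshold_policy({"o1": 0.95, "o2": 0.90}, {"o1": 1, "o2": 1})
```

The rule reacts to a symptom without estimating how many instances are needed, so reaching a target rate can take several speculative steps.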
Effect of Dhalion’s scaling actions in an initially under-provisioned wordcount dataflow
[Figure: dataflow src → o1 → o2 under back-pressure; target: 40 rec/s; annotated rates: 10 rec/s and 100 rec/s]
▸ Which operator is the bottleneck?
▸ What if we scale o1 x 4?
▸ How much to scale o2?
[Figure: timelines of src, o1, o2. Case 1: o1 cannot keep up (labels: waiting for output, waiting for input). Case 2: o2 cannot keep up]
THE DS2 MODEL
Intuition: use the dataflow graph to extract operator dependencies and system instrumentation to collect accurate, representative metrics.
[Figure: src → o1 → o2 over a timeline of seconds 1 to 4; target: 40 rec/s; annotations: 10 recs, 100 recs, 0.5s]
▸ o1: x4 instances to keep up with src rate
▸ o2: true rate = 200 recs/s (100 recs in 0.5s); x2 instances to keep up with x4 o1 instances
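The arithmetic on this slide can be reproduced with a short sketch. It assumes a linear pipeline, and it reads the slide's figures as follows: o1 processes 10 rec/s per instance and emits 10 output records per input record (the "10 rec/s, 100 rec/s" annotation), while o2 processes 100 records in 0.5 s of busy time. The function names and the selectivity parameter are introduced for illustration only and are not taken from the DS2 code.

```python
import math

def true_rate(records, busy_seconds):
    """Records per second of time actually spent processing."""
    return records / busy_seconds

def ds2_parallelism(source_rate, operators):
    """Instances needed per operator so that each keeps up with its input.

    operators: list of (name, true_rate_per_instance, selectivity) in pipeline
    order, where selectivity is output records per input record (assumed).
    """
    plan = {}
    rate = source_rate
    for name, rate_per_instance, selectivity in operators:
        plan[name] = math.ceil(rate / rate_per_instance)
        rate *= selectivity  # input rate seen by the next operator
    return plan

plan = ds2_parallelism(
    source_rate=40.0,
    operators=[
        ("o1", true_rate(10, 1.0), 10.0),   # 10 rec/s in, 100 rec/s out (assumed reading)
        ("o2", true_rate(100, 0.5), 1.0),   # true rate = 200 rec/s
    ],
)
```

With these inputs the plan is 4 instances for o1 and 2 for o2, matching the "x4" and "x2" annotations. Both values come out of a single pass over the dataflow graph, rather than from scaling one operator at a time.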
DS2 MAKES LINEAR PREDICTIONS
If operator scaling is linear, then:
▸ no overshoot when scaling up
▸ no undershoot when scaling down
Ideal rates act as an upper bound when scaling up and as a lower bound when scaling down:
▸ DS2 will converge monotonically to the target rate
[Figure: two plots of rate vs. parallelism (scaling up and scaling down), showing initial rate, target, prediction, and actual rate at p0, p1, and p’]
DS2 MINIMIZES THE ERROR UNTIL CONVERGENCE
[Figure: rate vs. parallelism, showing initial rate, target, prediction, actual rate, and error at p0 and p1; a new prediction at p1’]
Gradually minimizes error
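The convergence behaviour can be illustrated with a small simulation. It is a sketch under assumptions: the operator's real scaling curve `sublinear` is made up for the example, and the loop simply repeats the linear prediction with fresh measurements until the target is met. None of the names below come from the DS2 code.

```python
import math

def converge(target, p0, actual_rate, max_steps=10):
    """Repeat linear predictions until the measured rate reaches the target.

    actual_rate: function parallelism -> measured aggregate rate (hypothetical).
    Returns the sequence of parallelism values that were tried.
    """
    p = p0
    history = [p]
    for _ in range(max_steps):
        measured = actual_rate(p)
        if measured >= target:
            break  # converged
        per_instance = measured / p          # rate per instance at the current p
        p_next = math.ceil(target / per_instance)  # linear prediction
        if p_next == p:
            break
        p = p_next
        history.append(p)
    return history

def sublinear(p):
    # Made-up scaling curve: each added instance helps a little less.
    return 100.0 * p ** 0.9

history = converge(target=1000.0, p0=2, actual_rate=sublinear)
```

Because the real curve stays below the linear prediction, each step lands at or below the required parallelism and the next measurement shrinks the remaining error. Here the sequence is 2, 11, 13, and 13 is the smallest parallelism that reaches the target on this curve, so there is no overshoot.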
EVALUATION
[Figure: evaluation setup]
▸ Instrumented stream processor (Timely dataflow, Apache Flink): report metrics
▸ Metrics Repository
▸ Scaling Manager: monitor, invoke, re-scale job
▸ Scaling Policy: pull metrics, decision
DS2 VS. STATE-OF-THE-ART ON HERON
Initially under-provisioned wordcount dataflow
Target rate: 16,700 rec/s
▸ DS2 converges in a single step for both operators, and converges in 60s, as soon as it receives the Heron metrics
▸ Dhalion scales one operator at a time, needs six steps in total, and converges in 2000s
[Figure: plots of the scaling steps, numbered 1 to 6; annotations: +10 counts, +12 mappers]
DS2 ON APACHE FLINK
Initially under-provisioned wordcount
Target rate: 2,000,000 rec/s, drops to half at 800s
▸ DS2 converges in 2 steps for both operators
▸ DS2 reacts within 3s when the target rate drops
▸ Transient underprovisioning by 1 instance
github.com/strymon-system
Kalavri V, Liagouris J, Hoffmann M, Dimitrova D, Forshaw M, Roscoe T. Three steps is all you need: fast, accurate, automatic scaling decisions for distributed streaming dataflows. OSDI ’18.
Hoffmann M, Lattuada A, Liagouris J, Kalavri V, Dimitrova D, Wicki S, Chothia Z, Roscoe T. Snailtrail: Generalizing critical paths for online analysis of distributed dataflows. NSDI ’18.
github.com/li1/snailtrail
Zaheer Chothia, Andrea Lattuada, Timothy Roscoe, Moritz Hoffmann, Desislava Dimitrova, John Liagouris, Malte Sandstede, Matthew Forshaw, Sebastian Wicki
strymon.systems.ethz.ch
Let’s work on streaming research together
www.bu.edu/cs/phd-program/phd/
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Último (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Self-managed and automatically reconfigurable stream processing

  • 1. Self-Managed and Automatically Reconfigurable Stream Processing. Vasiliki Kalavri (vasia@apache.org, @vkalavri). Flink Forward Europe, 8 October 2019.
  • 2. Timeline of stream processing from 1992 to now, with markers at 1992, 2000, 2004 (MapReduce), and 2013. Systems shown: Tapestry, NiagaraCQ, Aurora, TelegraphCQ, STREAM; Naiad, Spark Streaming, Samza, Flink, Millwheel, Storm, S4, Google Dataflow; next-gen streaming.
  • 3. Stream database systems (Tapestry, NiagaraCQ, Aurora, TelegraphCQ, STREAM): single-node execution; synopses and sketches.
  • 4. Dataflow systems (Naiad, Spark Streaming, Samza, Flink, Millwheel, Storm, S4, Google Dataflow): distributed execution; partitioned state.
  • 6. Next-gen streaming: re-configurable systems, with automatic scaling, adaptive scheduling, straggler mitigation, and query optimization. Diagram labels: instrumented stream processor, performance metrics, profiler, analyzer, decision, invoke, re-configure job.
  • 7. SnailTrail: Generalizing Critical Paths for Online Analysis of Distributed Dataflows (NSDI '18).
  • 8. Conventional profiling tells only part of the story: duration, aggregate data exchange, dataflow graph.
  • 9. Conventional profiling tells only part of the story (continued): custom aggregate metrics.
  • 11. Profiling Spark scheduling. Diagram: driver and workers W1, W2, W3, with processing and scheduling activities. Plots: CP per snapshot and % weight per snapshot, for processing and scheduling.
  • 12. Activity timeline for workers 1, 2, and 3. Activity types: receive message, deserialization, processing, serialization, send message, waiting.
  • 15. Optimizing processing increased waiting (workers 1, 2, and 3).
  • 17. Critical path: the longest execution path, not considering waiting activities. Example: workers W1, W2, W3 with activities a, b, c, d.
  • 24. Optimizing critical activities can reduce latency (workers W1, W2, W3; activities a, b, c, d).
  • 25. Optimizing critical activities can reduce latency: reduced execution time.
  • 27. Online analysis of trace snapshots: input stream, output stream.
  • 28. Online analysis of trace snapshots (continued): periodic snapshot, trace snapshot stream, analyzer, performance summaries stream.
  • 29. Example snapshot over the interval from ts to te: workers W1, W2, W3 with activities a, b, c, d, x, u, v, z.
  • 30. All paths are potentially part of an evolving critical path.
  • 31. All paths have the same length: te - ts.
  • 33. Choosing a random path might miss critical activities.
  • 36. How to rank activities with regard to criticality?
  • 37. Intuition: the more paths an activity appears on, the more probable it is that this activity is critical.
  • 40. Figure: the example snapshot (W1, W2, W3; activities a, b, c, d, x, u, v, z; interval ts to te) with its paths numbered 1 to 9 and edge labels 9, 0, 0, 6, 6.
  • 41. Critical participation (CP metric): an estimation of the activity's participation in the critical path. Definition 8 (transient path centrality): let $P = \{\vec{p}_1, \vec{p}_2, \ldots, \vec{p}_N\}$ be the set of $N$ transient paths of snapshot $G_{[t_s,t_e]}$. The transient path centrality of an edge $e \in G_{[t_s,t_e]}$ is defined as $c(e) = \sum_{i=1}^{N} c_i(e)$, where $c_i(e) = 0$ if $e \notin \vec{p}_i$ and $c_i(e) = 1$ if $e \in \vec{p}_i$. The following holds: $CP_a = \frac{TPC(a) \cdot a_w}{N (t_e - t_s)}$ (3), where $TPC(a)$ is the centrality (the number of paths activity $a$ appears on), $a_w$ is the activity duration (edge weight), and $N$ is the total number of paths in the snapshot.
  • 42. The CP metric can be computed without path enumeration.
  • 44. SnailTrail pipeline. A reference application (Apache Flink, Apache Spark, TensorFlow, Heron, Timely Dataflow, ...) is profiled and generates trace streams. SnailTrail, running on Timely, performs trace ingestion, PAG construction, and CP computation and activity ranking, and outputs CP-based performance summaries.
  • 45. Driver scheduling is critical. Diagram: driver and workers W1, W2, W3, with processing and scheduling activities.
  • 46. Driver scheduling is critical (continued). Plots: CP per snapshot and % weight per snapshot.
  • 49. Fast and Accurate Automatic Scaling Decisions for Distributed Streaming Dataflows (OSDI '18).
  • 50. Streaming systems must be capable of adapting the level of parallelism when conditions change at runtime. Plots of events/s over time (input rate vs. throughput) illustrate three cases: data loss, SLO violations, and idle resources.
  • 51. Automatic scaling overview: a scaling controller takes metrics, applies a policy, and issues a scaling action. It has to detect symptoms, decide whether to scale, and decide how much to scale.
  • 52. Heuristic scaling approaches (Borealis, StreamCloud, Seep, IBM Streams, Spark Streaming, Google Dataflow, Dhalion). Metrics: CPU utilization, backlog, tuples/s, backpressure signal. Policy: threshold and rule-based, e.g. if CPU > 80% => scale. Scaling action: small changes, one operator at a time.
  • 53. Drawbacks of heuristic scaling approaches: problematic under interference and multi-tenancy; sensitive to noise, manual, hard to tune; non-predictive, speculative steps.
  • 54. Effect of Dhalion's scaling actions in an initially under-provisioned wordcount dataflow.
  • 55. Example dataflow src, o1, o2 under back-pressure. Target: 40 rec/s.
  • 56. Observed rates in the example: 10 rec/s and 100 rec/s.
  • 57. Which operator is the bottleneck? What if we scale o1 x4? How much to scale o2?
  • 59. Case 1: o1 cannot keep up. Operators are shown waiting for output or waiting for input (src, o1, o2).
  • 60. Case 2: o2 cannot keep up (src, o1, o2).
  • 62. Intuition: use the dataflow graph to extract operator dependencies and system instrumentation to collect accurate, representative metrics. Example (src, o1, o2; target: 40 rec/s): figure labels 10 recs, 10 recs, 100 rec, 100 recs, 0.5s.
  • 63. o1 needs x4 instances to keep up with the src rate.
  • 64. True rate = 200 recs/s.
  • 65. o2 needs x2 instances to keep up with x4 o1 instances.
  • 66. DS2 makes linear predictions. If operator scaling is linear, then there is no overshoot when scaling up and no undershoot when scaling down. Plots: rate vs. parallelism for scaling up and scaling down, showing the initial rate, the target, and the prediction at parallelism p0 and p1.
  • 68. Ideal rates act as an upper bound when scaling up and as a lower bound when scaling down, so DS2 will converge monotonically to the target rate.
  • 69. The plots add the actual rates next to the predictions.
  • 70. DS2 minimizes the error until convergence. Plot: rate vs. parallelism, showing the initial rate, target, prediction, actual rate, and error at p0 and p1.
  • 71. DS2 minimizes the error until convergence (continued): new prediction.
  • 72. DS2 minimizes the error until convergence (continued): new prediction at p1'; the error is gradually minimized.
  • 74. Architecture. Components: scaling manager, scaling policy, metrics repository, and an instrumented stream processor (Timely dataflow, Apache Flink). Diagram labels: report metrics, monitor, pull metrics, invoke, decision, re-scale job.
  • 75. DS2 vs. state of the art on Heron. Initially under-provisioned wordcount dataflow; target rate: 16,700 rec/s.
  • 78. DS2 converges in a single step for both operators.
  • 79. DS2 converges in 60s, as soon as it receives the Heron metrics.
  • 80. Dhalion scales one operator at a time and needs six steps in total (steps numbered 1 to 6 in the plot).
  • 81. Dhalion converges in 2000s.
  • 82. Plot annotations: +10 counts, +12 mappers.
  • 83. DS2 on Apache Flink. Initially under-provisioned wordcount; target rate: 2,000,000 rec/s, drops to half at 800s.
  • 84. DS2 converges in 2 steps for both operators.
  • 85. DS2 reacts within 3s when the target rate drops.
  • 86. Transient under-provisioning by 1 instance.
  • 87. github.com/strymon-system
 Kalavri V, Liagouris J, Hoffmann M, Dimitrova D, Forshaw M, Roscoe T. Three steps is all you need: fast, accurate, automatic scaling decisions for distributed streaming dataflows. OSDI '18.
 Hoffmann M, Lattuada A, Liagouris J, Kalavri V, Dimitrova D, Wicki S, Chothia Z, Roscoe T. SnailTrail: Generalizing critical paths for online analysis of distributed dataflows. NSDI '18. github.com/li1/snailtrail
  • 88. Zaheer Chothia, Andrea Lattuada, Timothy Roscoe, Moritz Hoffmann, Desislava Dimitrova, John Liagouris, Malte Sandstede, Matthew Forshaw, Sebastian Wicki. strymon.systems.ethz.ch
  • 90. Self-Managed and Automatically Reconfigurable Stream Processing. Vasiliki Kalavri (vasia@apache.org, @vkalavri). Flink Forward Europe, 8 October 2019.
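
Slides 41 and 42 define the CP metric and note that it can be computed without path enumeration. The following is a minimal sketch of one way to do that, assuming the snapshot is a directed acyclic graph whose edges are activities weighted by duration. It uses a standard path-counting trick (paths into an edge's source times paths out of its target) and is not taken from the SnailTrail code. The function name `critical_participation`, the toy graph, and its node names are hypothetical, and the sketch does not treat waiting activities specially.

```python
from collections import defaultdict
from fractions import Fraction
from functools import lru_cache


def critical_participation(edges, t_s, t_e):
    """Return {(src, dst): CP} for a snapshot given as (src, dst, weight) edges.

    Assumes a DAG with at most one edge per (src, dst) pair (hypothetical input format).
    """
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v, _ in edges:
        succ[u].append(v)
        pred[v].append(u)
    nodes = set(succ) | set(pred)

    @lru_cache(maxsize=None)
    def paths_to(node):
        # Number of paths from any start node (no predecessors) to `node`.
        return sum(paths_to(p) for p in pred[node]) if pred[node] else 1

    @lru_cache(maxsize=None)
    def paths_from(node):
        # Number of paths from `node` to any end node (no successors).
        return sum(paths_from(s) for s in succ[node]) if succ[node] else 1

    # N: total number of start-to-end paths in the snapshot.
    n_paths = sum(paths_from(n) for n in nodes if not pred[n])

    # CP_a = TPC(a) * a_w / (N * (t_e - t_s)), with TPC(a) obtained by counting
    # paths on both sides of the edge instead of enumerating them.
    return {
        (u, v): Fraction(paths_to(u) * paths_from(v) * w, n_paths * (t_e - t_s))
        for u, v, w in edges
    }


# Toy snapshot with two workers over [0, 4]; every path has length 4.
edges = [
    ("s1", "m1", 2),  # activity on worker 1
    ("m1", "e1", 2),  # activity on worker 1
    ("s2", "m2", 3),  # activity on worker 2
    ("m2", "e2", 1),  # activity on worker 2
    ("m1", "m2", 1),  # message from worker 1 to worker 2
]
cp = critical_participation(edges, t_s=0, t_e=4)
for edge, value in sorted(cp.items(), key=lambda kv: kv[1], reverse=True):
    print(edge, value)
```

In this toy snapshot there are three paths, the edge `("s1", "m1")` lies on two of them and ranks highest with CP = 1/3, and the CP values of all edges sum to 1 because every path has length te - ts.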
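
The worked example on slides 62 to 65 (src, o1, o2 with a target of 40 rec/s) can be reproduced with a few lines of arithmetic. The sketch below is not DS2's implementation. It assumes a linear pipeline, per-instance metrics, and linear operator scaling, and it reads the slide figure as follows: o1 consumes 10 records and emits 100, and o2 processes 100 records in 0.5 s of useful time (which gives the true rate of 200 recs/s on slide 64). The useful time of 1.0 s for o1 is an assumption (o1 is taken to be busy for the whole one-second window), and the names `OperatorMetrics` and `plan_parallelism` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class OperatorMetrics:
    name: str
    records_in: int     # records consumed in the observation window
    records_out: int    # records emitted in the observation window
    useful_time: float  # seconds spent processing (excludes waiting)


def plan_parallelism(source_target_rate, chain):
    """Return {operator name: instances} for a linear pipeline, upstream first."""
    plan = {}
    input_rate = source_target_rate  # rec/s the next operator has to absorb
    for op in chain:
        # True rate: what one instance processes per second of useful work.
        true_rate = op.records_in / op.useful_time
        plan[op.name] = math.ceil(input_rate / true_rate)
        # Output rate once this operator keeps up with its input.
        input_rate *= op.records_out / op.records_in
    return plan


chain = [
    # o1: 10 records in, 100 out; useful_time=1.0 is an assumption (fully busy).
    OperatorMetrics("o1", records_in=10, records_out=100, useful_time=1.0),
    # o2: 100 records in 0.5 s of useful time; its output is not used here.
    OperatorMetrics("o2", records_in=100, records_out=0, useful_time=0.5),
]
plan = plan_parallelism(source_target_rate=40, chain=chain)
print(plan)
```

Under these assumptions the plan matches the slides: four instances of o1 to keep up with the source, and two instances of o2 to keep up with the four o1 instances. Computing every operator's parallelism in one pass over the dataflow graph is what distinguishes this style of policy from the one-operator-at-a-time heuristics on slide 52.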