This presentation, held at inovex GmbH in Munich in November 2015, gives a general introduction to the stream processing space, an overview of Apache Flink, and use cases of production users as presented at Flink Forward.
11. What is stream processing?
Real-world data is unbounded and is pushed to systems
Right now, people are using the batch paradigm for stream analysis (there was no good stream processor available)
New systems (Flink, Kafka) embrace the streaming nature of data
(Diagram: web server → Kafka topic → stream processing)
12. Flink is a stream processor with many faces
Streaming dataflow runtime
14. Requirements for a stream processor
Low latency
• Fast results (milliseconds)
High throughput
• Handle large data volumes (millions of events per second)
Exactly-once guarantees
• Correct results, also in failure cases
Programmability
• Intuitive APIs
15. Pipelining
Basic building block to “keep the data moving”
• Low latency
• Operators push data forward
• Data shipping as buffers, not tuple-wise
• Natural handling of back-pressure
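The buffer-based shipping above can be sketched in a few lines of plain Scala (an illustration of the idea, not Flink's actual network stack; the buffer size of 4 is an arbitrary choice for the example):

```scala
// Illustration (plain Scala, not Flink API): an operator that ships records
// downstream in fixed-size buffers instead of one record at a time.
object BufferedShipping {
  // Group a stream of records into buffers of at most `bufferSize` elements.
  def ship[T](records: Iterator[T], bufferSize: Int): Iterator[Seq[T]] =
    records.grouped(bufferSize)
}
```

Shipping buffers rather than individual tuples amortizes per-record overhead, which is where the high throughput comes from; a downstream operator that stops pulling buffers naturally slows the upstream one down (back-pressure).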
16. Fault Tolerance in streaming
At-least-once: ensure that all operators see all events
• Storm: replay the stream in the failure case
Exactly-once: ensure that operators do not perform duplicate updates to their state
• Flink: distributed snapshots
• Spark: micro-batches on a batch runtime
17. Flink’s Distributed Snapshots
A lightweight approach to storing the state of all operators without pausing the execution
• High throughput, low latency
Implemented using barriers flowing through the topology
(Diagram: a barrier flows through the data stream between a Kafka consumer, offset = 162, and an element counter, value = 152. Elements before the barrier are part of the snapshot; elements after the barrier are not, and are backed up until the next snapshot.)
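The barrier mechanism can be modeled in a few lines of plain Scala (a toy illustration, not Flink's implementation; the `Element`/`Barrier` event types are made up for the example):

```scala
// Toy model of barrier-based snapshots: an operator counts elements; when a
// barrier arrives, it records its current state. Elements before the barrier
// are covered by that snapshot; elements after it belong to the next one.
sealed trait Event
case class Element(value: String) extends Event
case object Barrier extends Event

object SnapshotDemo {
  // Returns (final count, state snapshots taken at each barrier).
  def run(stream: Seq[Event]): (Int, Seq[Int]) =
    stream.foldLeft((0, Vector.empty[Int])) {
      case ((count, snaps), Element(_)) => (count + 1, snaps)
      case ((count, snaps), Barrier)    => (count, snaps :+ count)
    }
}
```

For example, `SnapshotDemo.run(Seq(Element("a"), Element("b"), Barrier, Element("c")))` snapshots the counter at 2 and then keeps processing, which is exactly why no pause is needed.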
18. DataStream API
case class WordCount(word: String, count: Int)

val text: DataStream[String] = …

text
  .flatMap { line => line.split(" ") }
  .map { word => WordCount(word, 1) }
  .keyBy("word")
  .window(GlobalWindows.create())
  .trigger(new EOFTrigger())
  .sum("count")

Batch Word Count in the DataStream API
19. Batch Word Count in the DataSet API

DataStream API:

case class WordCount(word: String, count: Int)

val text: DataStream[String] = …

text
  .flatMap { line => line.split(" ") }
  .map { word => WordCount(word, 1) }
  .keyBy("word")
  .window(GlobalWindows.create())
  .trigger(new EOFTrigger())
  .sum("count")

DataSet API:

val text: DataSet[String] = …

text
  .flatMap { line => line.split(" ") }
  .map { word => WordCount(word, 1) }
  .groupBy("word")
  .sum("count")
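The logic both programs share can be reproduced on a plain Scala collection (illustration only, not the Flink APIs; `WC` is a local stand-in for the `WordCount` case class): split lines into words, pair each word with 1, group by word, sum the counts.

```scala
// Plain-Scala word count mirroring the DataStream/DataSet programs above.
case class WC(word: String, count: Int)

object WordCountDemo {
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))              // split each line into words
      .map(w => WC(w, 1))                 // pair each word with a count of 1
      .groupBy(_.word)                    // group by the word field
      .map { case (w, wcs) => w -> wcs.map(_.count).sum } // sum the counts
}
```

The only structural difference between the two Flink programs is how grouping is expressed: `keyBy` plus a global window with an end-of-input trigger on a stream, versus a plain `groupBy` on a bounded data set.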
20. Best of all worlds for streaming
Low latency
• Thanks to the pipelined engine
Exactly-once guarantees
• Distributed snapshots
High throughput
• Controllable checkpointing overhead
Programmability
• APIs similar to those known from the batch world
21. Throughput of distributed grep
(Chart: data generator → “grep” operator; 30 machines, 120 cores. Bars for Flink without fault tolerance, Flink with exactly-once (5 s checkpoints), Storm without fault tolerance, and Storm with micro-batches.)
• Flink: aggregate throughput of 175 million elements per second
• Storm: aggregate throughput of 9 million elements per second
• Flink achieves 20x higher throughput
• Flink throughput is almost the same with and without exactly-once
22. Aggregate throughput for stream record grouping
(Chart: 30 machines, 120 cores, with network transfer. Bars for Flink without fault tolerance, Flink with exactly-once, Storm without fault tolerance, and Storm with at-least-once.)
• Flink: aggregate throughput of 83 million elements per second
• Annotated values: 8.6 million elements/s and 309k elements/s
• Flink achieves 260x higher throughput with fault tolerance
23. Latency in stream record grouping
(Setup: data generator → grouping → receiver measuring throughput and latency.)
• Measure the time for a record to travel from source to sink
(Charts: median latency and 99th-percentile latency for Flink without fault tolerance, Flink with exactly-once, and Storm with at-least-once; annotated values: medians of 25 ms and 1 ms, 99th percentile of 50 ms.)
30. The Flink Stack
(Diagram: the DataSet (Java/Scala) and DataStream (Java/Scala) APIs sit on top of the streaming dataflow runtime; an experimental Python API is also available. Both APIs compile to an API-independent dataflow graph representation via a batch optimizer and a graph builder. Example plan: data sources orders.tbl and lineitem.tbl → filter and map → hybrid hash join (build HT / probe, hash-partitioned on [0]) → sorted group-reduce → forward.)
31. Batch is a special case of streaming
Batch: run a bounded stream (a data set) on a stream processor
Form a global window over the entire data set for join or grouping operations
32. Batch-specific optimizations
Managed memory on- and off-heap
• Operators (join, sort, …) with out-of-core support
• Optimized serialization stack for user types
Cost-based optimizer
• The job execution plan depends on the data size
34. FlinkML: Machine Learning
API for ML pipelines inspired by scikit-learn
Collection of packaged algorithms
• SVM, multiple linear regression, optimization, ALS, …
val trainingData: DataSet[LabeledVector] = ...
val testingData: DataSet[Vector] = ...
val scaler = StandardScaler()
val polyFeatures = PolynomialFeatures().setDegree(3)
val mlr = MultipleLinearRegression()
val pipeline = scaler.chainTransformer(polyFeatures).chainPredictor(mlr)
pipeline.fit(trainingData)
val predictions: DataSet[LabeledVector] = pipeline.predict(testingData)
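The chaining idea behind `chainTransformer`/`chainPredictor` can be sketched as plain function composition in Scala (a conceptual illustration, not the FlinkML API; the scaler and polynomial-features stages here are simplified stand-ins for the real operators):

```scala
// Conceptual sketch: a pipeline is a composition of stages, each stage's
// output feeding the next, just like scaler -> polyFeatures -> predictor.
object PipelineDemo {
  // Scale a feature vector into [0, 1] by dividing by its maximum.
  val scaler: Vector[Double] => Vector[Double] =
    v => { val m = v.max; v.map(_ / m) }

  // Append degree-2 terms (squares) to the feature vector.
  val polyFeatures: Vector[Double] => Vector[Double] =
    v => v ++ v.map(x => x * x)

  // Chaining the stages is just function composition.
  val pipeline: Vector[Double] => Vector[Double] =
    scaler andThen polyFeatures
}
```

For instance, `PipelineDemo.pipeline(Vector(1.0, 2.0, 4.0))` first scales to `Vector(0.25, 0.5, 1.0)` and then appends the squares, yielding `Vector(0.25, 0.5, 1.0, 0.0625, 0.25, 1.0)`.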
36. Flink Stack += Gelly, ML
(Diagram: the Gelly and ML libraries sit on top of the DataSet (Java/Scala) API, next to DataStream, all on the streaming dataflow runtime.)
37. Integration with other systems
(Diagram: SAMOA, Hadoop M/R, Google Dataflow, Cascading, Storm, and Zeppelin plug into the DataSet and DataStream APIs.)
• Hadoop M/R: use Hadoop Input/Output Formats, Mapper/Reducer implementations, and Hadoop’s FileSystem implementations
• Google Dataflow: run applications implemented against Google’s Dataflow API on premise with Flink
• Cascading: run Cascading jobs on Flink with almost no code change, and benefit from Flink’s vastly better performance than MapReduce
• Zeppelin: interactive, web-based data exploration
• SAMOA: machine learning on data streams
• Storm: compatibility layer for running Storm code; FlinkTopologyBuilder is a one-line replacement for existing jobs, with wrappers for Storm Spouts and Bolts; exactly-once support is coming soon
• Built-in serializer support for Hadoop’s Writable type
47. What is currently happening?
Features for the upcoming 0.10 release:
• Master high availability
• Vastly improved monitoring GUI
• Watermarks / event-time processing / windowing rework (the next talk, by Matthias, covers this)
• Stable streaming API (no more “beta” flag)
Next release: 1.0
• API stability
48. How do I get started?
Mailing lists: (news | user | dev)@flink.apache.org
Twitter: @ApacheFlink
Blogs: flink.apache.org/blog, data-artisans.com/blog/
IRC channel: irc.freenode.net#flink
Start Flink on YARN in 4 commands:
# get the hadoop2 package from the Flink download page at
# http://flink.apache.org/downloads.html
wget <download url>
tar xvzf flink-0.9.1-bin-hadoop2.tgz
cd flink-0.9.1/
./bin/flink run -m yarn-cluster -yn 4 ./examples/flink-java-examples-0.9.1-WordCount.jar
49. flink.apache.org
• Check out the slides: http://flink-forward.org/?post_type=session
• Video recordings on YouTube, “Flink Forward” channel
57.

case class Path(from: Long, to: Long)

val tc = edges.iterate(10) { paths: DataSet[Path] =>
  val next = paths
    .join(edges)
    .where("to")
    .equalTo("from") { (path, edge) => Path(path.from, edge.to) }
    .union(paths)
    .distinct()
  next
}

(Diagram: pre-flight (client): the program passes through the optimizer and the type extraction stack to produce a dataflow graph. The JobManager handles task scheduling and dataflow metadata, deploys operators, and tracks intermediate results. The TaskManagers execute the plan: data sources orders.tbl and lineitem.tbl → filter and map → hybrid hash join (build HT / probe, hash-partitioned on [0]) → sorted group-reduce → forward. Cluster: YARN, standalone, or local.)
58. Iterative processing in Flink
Flink offers built-in iterations and delta iterations to execute ML and graph algorithms efficiently
(Diagram: an iterative dataflow with map, join, and sum operators over partitions ID1, ID2, ID3.)
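The transitive-closure iteration from the code slide can be mirrored in plain Scala (illustration only, not the Flink API; like `edges.iterate(10)`, it stops at a fixpoint or after 10 rounds):

```scala
// Plain-Scala version of the transitive-closure iteration: repeatedly join
// known paths with the edges, union the results, and deduplicate, until no
// new paths appear (a fixpoint) or 10 rounds have run.
case class Path(from: Long, to: Long)

object TransitiveClosure {
  def compute(edges: Set[Path]): Set[Path] = {
    var paths   = edges
    var round   = 0
    var changed = true
    while (changed && round < 10) {
      val next = paths ++ (for {
        p <- paths
        e <- edges
        if p.to == e.from          // join on paths.to == edges.from
      } yield Path(p.from, e.to))  // extend the path by one edge
      changed = next != paths
      paths = next
      round += 1
    }
    paths
  }
}
```

On the chain 1 → 2 → 3 → 4 this converges after two rounds to six paths, including the derived Path(1, 4).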
There are two fundamentally different data processing problems. The first: you have collected some data and want to make sense of it, for example to build a report. The second: you want to react to events happening in the world, for example a credit card transaction. The first is akin to polling, while the second is push-based. There are many tools for the first kind of problem, including Apache Flink. But although the second is just as important and common, for a long time we have not had good tools to deal with it. This is now changing.
GCE: 30 instances with 4 cores and 15 GB of memory each.
Flink master from July 24th, Storm 0.9.3.
All the code used for the evaluation can be found here.
Flink: 1.5 million elements per second per core; aggregate throughput in the cluster: 182 million elements per second.
Storm: 82,000 elements per second per core; aggregate: 0.57 million elements per second.
Storm with acknowledgements: 4,700 elements per second per core, latency 30-120 milliseconds.
Trident: 75,000 elements per second per core.
Flink: 720,000 events per second per core; 690,000 with checkpointing activated.
Storm with at-least-once: 2,600 events per second per core.
Flink with a buffer timeout of 0: median latency 0 ms, 99th percentile 20 ms, at 24,500 events per second per core.
People previously made the case that high throughput and low latency are mutually exclusive.