Gearpump is an Akka-based real-time streaming engine that uses Actors to model everything. It offers high performance and flexibility: 18 million messages/second throughput with 8 ms latency on a cluster of 4 machines.
2. 2
What is Gearpump
• Akka-based, lightweight, real-time data processing platform
• Apache License, http://gearpump.io, version 0.7
• Built on Akka for communication, concurrency, isolation, and fault tolerance
• Simple and powerful: message-level streaming, long-running daemons
3. 3
What is Akka?
• Micro-service (Actor) oriented
• Message-driven
• Lock-free
• Location-transparent
It is like human society, which is driven by messages and scales to a population of 7 billion!
4. 4
Micro-service oriented: a higher abstraction
• Break your application into micro-services instead of objects.
• Throw away locks.
• Exchange information between micro-services with immutable, asynchronous messages instead of shared objects.
5. 5
Gearpump in Big Data Stack
[Diagram: the big-data stack: storage engine; analytics (batch and stream engines, SQL layers such as Catalyst, StreamSQL, and Impala, machine learning, GraphX, Storm, and Gearpump); visualization & management (Cloudera Manager, cluster manager, monitor/alert/notify, data exploration).]
6. 6
Why another streaming platform?
• Existing platforms do not fully meet our requirements.
• A higher abstraction, such as micro-service orientation, can largely simplify the problem.
7. 7
What we want
• Meet "The 8 Requirements of Real-Time Stream Processing" (Stonebraker et al., 2005)
• Flexible: anywhere, any size, any source, any use case; dynamic DAG; ② StreamSQL
• Volume: high throughput; ⑦ scale linearly
• Speed: ① in-stream processing with near-zero latency; ⑥ HA; ⑧ responsive
• Accuracy: exactly-once; ③ handle message loss, delay, and out-of-order arrival; ④ predictable
• Visual: easy to debug; WYSIWYG
9. 9
DAG representation and API
Low-level Graph API syntax:
Graph(A~>B~>C~>D, B~>E~>D)
[Diagram: the DAG of processors A -> B -> C -> D and B -> E -> D, with shuffle and field-grouping partitioning on the edges.]
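The chain syntax above is just a compact way to list edges. A minimal Python sketch (conceptual, not Gearpump's implementation) of turning such chains into an adjacency list:

```python
# Conceptual sketch: a Graph built from edge chains such as
# A~>B~>C~>D and B~>E~>D can be stored as an adjacency list.
def build_dag(*chains):
    """Each chain is a sequence of node names; consecutive pairs are edges."""
    adjacency = {}
    for chain in chains:
        for src, dst in zip(chain, chain[1:]):
            adjacency.setdefault(src, set()).add(dst)
            adjacency.setdefault(dst, set())
    return adjacency

dag = build_dag(["A", "B", "C", "D"], ["B", "E", "D"])
# B fans out to C and E; both paths rejoin at D.
assert dag["B"] == {"C", "E"}
assert dag["E"] == {"D"}
```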
10. 10
Architecture - Actor Hierarchy
• The client hooks in and queries state.
• Runs as a general service, optionally on YARN.
• Each application has one isolated AppMaster and uses an Actor supervision tree for error handling.
• The Master cluster has an HA design.
12. 12
Feature Highlights
• Akka / Akka Streams / Storm compatible
• Throughput: 14 million messages/s (*), 2 ms latency (*)
• Exactly-once delivery
• Dynamic DAG
• Out-of-order message handling
• Flexible DSL
• DAG visualization
• Internet of Things support
[*] Test environment: Intel® Xeon™ Processor E5-2690, 4 nodes; each node has 32 CPU cores, 64 GB memory, and a 10 GbE network. We used the default configuration of the Gearpump 0.7 release; see the backup page for more details. We used the SOL workload with a message size of 100 bytes (https://github.com/intel-hadoop/storm-benchmark); tests were carried out by the Intel team.
14. 14
Three steps to use it
1. Download binary from http://gearpump.io
2. Submit jar by UI
3. Monitor Status
15. 15
Application Submission Flow
[Diagram: four stages, each showing the Master, the Workers, and later the AppMaster and Executors.]
1. Submit a jar
2. Create the AppMaster
3. The AppMaster launches Executors on Workers
4. Executors report back to the AppMaster
Works with YARN or without YARN.
16. 16
Low level Graph API - WordCount
val context = new ClientContext()
val split = Processor[Split](splitParallelism)
val sum = Processor[Sum](sumParallelism)
val app = StreamApplication("wordCount", Graph(split ~> sum), UserConfig.empty)
val appId = context.submit(app)
context.close()

class Split(taskContext: TaskContext, conf: UserConfig) extends Task(taskContext, conf) {
  override def onNext(msg: Message): Unit = { /* split the line into words */ }
}

class Sum(taskContext: TaskContext, conf: UserConfig) extends Task(taskContext, conf) {
  var counts = Map.empty[String, Long] // count of words
  override def onNext(msg: Message): Unit = { /* aggregate the count per word */ }
}
17. 17
High Level DSL API - WordCount
val context = ClientContext()
val app = new StreamApp("dsl", context)
val data = "This is a good start, bingo!! bingo!!"
app.fromCollection(data.lines)
  // line => words
  .flatMap(line => line.split("\\s+"))
  // word => (word, count = 1)
  .map((_, 1))
  // (word, count1), (word, count2) => (word, count1 + count2)
  .groupByKey().sum.log
val appId = context.submit(app)
context.close()
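For readers unfamiliar with the DSL, the same word-count semantics can be mirrored in plain, single-process Python (a conceptual sketch; the real DSL runs distributed across tasks):

```python
# Plain-Python mirror of the DSL pipeline's semantics.
from collections import Counter

data = "This is a good start, bingo!! bingo!!"
# fromCollection + flatMap: lines => words
words = [w for line in data.splitlines() for w in line.split()]
# map: word => (word, 1)
pairs = [(w, 1) for w in words]
# groupByKey().sum: (word, c1), (word, c2) => (word, c1 + c2)
counts = Counter()
for word, n in pairs:
    counts[word] += n
assert counts["bingo!!"] == 2
```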
18. 18
Akka-stream API – WordCount
implicit val system = ActorSystem("akka-test")
implicit val materializer = new GearpumpMaterializer(system)
val echo = system.actorOf(Props(new Echo()))
val sink = Sink.actorRef(echo, "COMPLETE")
val source = GearSource.from[String](new CollectionDataSource(lines))
source.mapConcat { line => line.split(" ").toList }
  .groupBy2(x => x)
  .map(word => (word, 1))
  .reduce { (a, b) => (a._1, a._2 + b._2) }
  .log("word-count")
  .runWith(sink)

Available in the akkastream branch: https://github.com/gearpump/gearpump/tree/akkastream
19. 19
UI Portal – DAG Visualization (DAG page)
Tracks the global min-clock of all messages in the DAG:
• Node size reflects throughput
• Edge width represents flow rate
• A red node means something has gone wrong
20. 20
UI Portal – Processor Detail (Processor page)
Shows the data-skew distribution and per-task throughput and latency.
22. 22
Throughput and Latency
• Throughput: 11 million messages/second
• Latency: 17 ms under full load
• SOL shuffle test, 32 tasks -> 32 tasks
[*] Test environment: Intel® Xeon™ Processor E5-2680, 4 nodes; each node has 32 CPU cores, 64 GB memory, and a 10 GbE network. We used the default configuration of the Gearpump 0.2 release and the SOL workload with a message size of 100 bytes (https://github.com/intel-hadoop/storm-benchmark); tests were carried out by the Intel team.
23. 23
Scalability
• Test run on 100 nodes (*) and 3000 tasks
• Gearpump performance scales linearly up to 100 nodes
[*] We used 8 machines to simulate 100 worker nodes. Test environment: Intel® Xeon™ Processor E5-2680; each node has 32 CPU cores, 64 GB memory, and a 10 GbE network. We used the default configuration of the Gearpump 0.3.5 release and the SOL workload with a message size of 100 bytes (https://github.com/intel-hadoop/storm-benchmark); tests were carried out by the Intel team.
24. 24
Fault-Tolerance: Recovery Time
[*] Recovery time is the interval between a) the failure happening and b) all tasks in the topology resuming data processing.
91 worker nodes, 1000 tasks (we used 7 machines to simulate 91 worker nodes).
Test environment: Intel® Xeon™ Processor E5-2680; each node has 32 CPU cores, 64 GB memory, and a 10 GbE network. We used the default configuration of the Gearpump 0.3.5 release and the SOL workload (https://github.com/intel-hadoop/storm-benchmark); tests were carried out by the Intel team.
25. 25
High-Performance Messaging Layer
• An Akka remote message carries a large overhead (the full sender and receiver addresses).
• Gearpump reduces this overhead by ~95% (from ~400 bytes to ~20 bytes per message) by converting full addresses to short addresses, synced with the other executors, combined with effective batching.
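The short-address idea can be sketched as an interning table (conceptual Python with hypothetical names; Gearpump's actual wire format differs):

```python
# Sketch: intern full actor paths into small integer ids so each network
# message carries a short id (~20 bytes total) instead of a ~400-byte path.
class AddressTable:
    def __init__(self):
        self.path_to_id = {}
        self.id_to_path = []

    def to_short(self, path):
        # Assign a small id the first time a full path is seen; the table
        # itself would be synced with other executors out of band.
        if path not in self.path_to_id:
            self.path_to_id[path] = len(self.id_to_path)
            self.id_to_path.append(path)
        return self.path_to_id[path]

    def to_full(self, short_id):
        return self.id_to_path[short_id]

table = AddressTable()
sid = table.to_short("akka.tcp://app@host1:3000/user/task1")
assert table.to_full(sid) == "akka.tcp://app@host1:3000/user/task1"
```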
26. 26
Effective Batching
• Network idle: flush as fast as we can.
• Network busy: batch smartly until the network is open again.
• Effective network bandwidth is doubled for 100-byte messages.
This feature is ported from STORM-297; the test environment is the same as in STORM-297.
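The flush policy above can be modeled in a few lines (a simplified conceptual sketch; the real implementation also considers batch sizes and timers):

```python
# Sketch of the batching policy: flush immediately while the network is
# idle, and accumulate a batch while the network is busy.
class Batcher:
    def __init__(self):
        self.buffer = []
        self.sent_batches = []

    def send(self, msg, network_busy):
        self.buffer.append(msg)
        if not network_busy:
            self.flush()          # idle: flush as fast as we can

    def flush(self):
        if self.buffer:
            self.sent_batches.append(list(self.buffer))
            self.buffer.clear()

b = Batcher()
b.send("m1", network_busy=False)   # idle -> flushed alone
b.send("m2", network_busy=True)    # busy -> batched
b.send("m3", network_busy=True)
b.send("m4", network_busy=False)   # network open again -> flush the batch
assert b.sent_batches == [["m1"], ["m2", "m3", "m4"]]
```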
27. 27
High-Performance Flow Control
• Back-pressure is passed level by level through the DAG, using a sliding window between adjacent tasks.
• About ~1% throughput impact.
• No central ack nodes.
• Each level knows the network status, so it can optimize the network locally.
Another option (not used): big-loop-feedback flow control.
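A sliding-window scheme of this kind can be sketched as follows (conceptual Python, names hypothetical): each sender may only have a bounded number of unacknowledged messages in flight to a downstream task, so pressure propagates upstream level by level.

```python
# Sketch: per-channel sliding-window back-pressure between two tasks.
class Channel:
    def __init__(self, window):
        self.window = window
        self.in_flight = 0        # messages sent but not yet acknowledged

    def try_send(self, msg):
        if self.in_flight >= self.window:
            return False          # window full: back-pressure, caller waits
        self.in_flight += 1
        return True

    def on_ack(self, count=1):
        self.in_flight -= count   # acks slide the window forward

ch = Channel(window=2)
assert ch.try_send("m1") and ch.try_send("m2")
assert not ch.try_send("m3")      # pressure propagates to the upstream level
ch.on_ack()
assert ch.try_send("m3")
```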
29. 29
What is a good use case for Gearpump?
• When you want exactly-once message processing with millisecond latency.
• When you want to integrate with Akka and Akka Streams transparently.
• When you want dynamic modification of an online DAG.
• When you want to connect IoT edge devices with location transparency.
• When you want a simple scheduler to distribute custom applications, such as log collection or a distributed cron.
• When you want to use monoid state, such as a Count-Min Sketch.
Besides, it can integrate with YARN, Kafka, Storm, and more.
30. 30
IoT Transparent Cloud
Target problem: the large gap between edge devices and the data center.
• Location-transparent, with a unified programming model across the boundary.
• Part of the DAG runs on the device side, part in the data center.
Case: an intelligent traffic system with 3000 traffic lights (travel time, overspeed detection, and more).
31. 31
Exactly-once: Financial Use Cases
Target problem: both real-time processing and accuracy are important.
[Diagram: real-time data sources (stock index, crawlers, transactions, accounts, other sources) flow into processing governed by rules, producing alerts, actions, and reports, e.g. for program trading.]
32. 32
Transformers: Dynamic DAG
Target problem: there is no existing way to manipulate the DAG on the fly.
It can:
• Add, delete, or replace processors dynamically
• Change parallelism online to scale out without message loss
• Add and remove source and sink processors dynamically
Each processor can have its own independent jar.
33. 33
Eve: Online Machine Learning
Target problem: training and predicting online, for real-time decision support.
• Decide immediately based on online learning results.
[Diagram: sensor input flows through Learn, Predict, and Decide stages to the output.]
TAP analytics platform integration: http://trustedanalytics.org/
35. 35
General Ideas
• Every message in the system carries an application timestamp (its birth time): Message(timestamp).
• A min-clock service tracks the minimum timestamp of all pending messages in the system, now and in the future.
• The source is replayable, and state is kept alongside the DAG.
[Diagram: the normal flow through the replayable source, the DAG, state, and the min-clock service.]
37. 37
1. Detect message loss
2. Pause the clock at Tp when a message is lost
3. Replay from clock Tp
4. Exactly-once
38. 38
Detect Failures in Time
• AckRequest and Ack messages are used to detect message loss.
• Easy to troubleshoot: when an error happens, we know when, where, and why.
[Diagram: supervision hierarchy Master -> AppMaster -> Executor -> Task, with failures reported upward at each level.]
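The AckRequest/Ack check can be modeled roughly as a count comparison (a conceptual sketch; the real protocol is richer, with per-session sequence numbers):

```python
# Sketch: a sender counts sent messages and periodically emits an
# AckRequest; the receiver replies with how many it actually received.
# A mismatch means messages were lost in between.
class LossDetector:
    def __init__(self):
        self.sent = 0

    def record_send(self):
        self.sent += 1

    def on_ack(self, receiver_count):
        # Number of messages lost since the last reset.
        return self.sent - receiver_count

d = LossDetector()
for _ in range(5):
    d.record_send()
assert d.on_ack(receiver_count=5) == 0   # all delivered
assert d.on_ack(receiver_count=3) == 2   # two messages lost
```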
41. 41
1. Detect message loss
2. Pause the clock at Tp when a message is lost
3. Replay from clock Tp
4. Exactly-once
42. 42
Application's Clock Service
Definition: a task's min-clock is the minimum of
• the minimum timestamp of the pending messages in the current task, and
• the min-clocks of all upstream tasks.
Tasks report their min-clocks level by level, and the application clock is ever-incrementing.
[Diagram: DAG levels A through E with clocks 1000, 800, 600, 400, ordered from later to earlier.]
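The recursive definition above can be sketched directly (conceptual Python; the graph and timestamps are illustrative, not from the slides' chart):

```python
# Sketch: a task's min-clock is the minimum of its own pending-message
# timestamps and the min-clocks of all its upstream tasks.
def min_clock(task, upstreams, pending):
    own = min(pending[task], default=float("inf"))
    up = [min_clock(u, upstreams, pending) for u in upstreams.get(task, [])]
    return min([own] + up)

# Linear DAG A -> B -> C, with pending message timestamps per task.
upstreams = {"B": ["A"], "C": ["B"]}
pending = {"A": [1000], "B": [800], "C": [600]}
# C's min-clock is bounded by its own messages and everything upstream.
assert min_clock("C", upstreams, pending) == 600
pending["A"] = [400]   # an older message appears at the source
assert min_clock("C", upstreams, pending) == 400
```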
43. 43
1. Detect message loss
2. Pause the clock at Tp when a message is lost
3. Replay from clock Tp
4. Exactly-once
45. 45
Source-based Message Replay (recovery flow)
Replay from the very beginning of the source:
① The global clock service resolves the source offset for timestamp Tp.
② The source replays from that offset.
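Resolving a replay offset from Tp can be sketched as a search over a time-ordered log (conceptual; a real replayable source such as Kafka provides its own timestamp-to-offset lookup):

```python
# Sketch: given (offset, timestamp) pairs ordered by time, find the first
# offset whose timestamp is at or after the paused clock Tp.
import bisect

def resolve_offset(log, tp):
    """log: list of (offset, timestamp); return first offset with ts >= tp."""
    timestamps = [ts for _, ts in log]
    i = bisect.bisect_left(timestamps, tp)
    return log[i][0] if i < len(log) else None

log = [(0, 100), (1, 250), (2, 400), (3, 700)]
assert resolve_offset(log, 250) == 1   # replay exactly from Tp
assert resolve_offset(log, 300) == 2   # first message at or after Tp
```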
46. 46
1. Detect message loss
2. Pause the clock at Tp when a message is lost
3. Replay from clock Tp
4. Exactly-once
47. 47
Checkpoint Store
Key: ensure that State(t) contains only messages with timestamp <= t; this is what enables exactly-once message processing.
How? The DAG runtime writes to an append-only checkpoint store.
49. 49
Exactly-once Message Processing (recovery flow)
As messages flow through the streaming system, two states are kept:
• one state that accepts only messages with t < Tc, and
• one state that accepts all messages.
The first state is written to the checkpoint store, and checkpoints are recovered on failures.
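One way to model the two-state scheme (a loose conceptual sketch, not Gearpump's actual state API; names are hypothetical):

```python
# Sketch: the checkpointed state absorbs only messages with timestamp
# below the checkpoint time Tc; the live state accepts everything. On
# failure we restore the checkpoint and replay from Tc, so messages older
# than Tc are never double-counted.
class ExactlyOnceState:
    def __init__(self, tc):
        self.tc = tc
        self.checkpointed = 0     # State(t): only messages with t < Tc
        self.current = 0          # live state: accepts all messages

    def on_message(self, value, timestamp):
        self.current += value
        if timestamp < self.tc:
            self.checkpointed += value

    def recover(self):
        # On failure, restart from the checkpoint and replay from Tc.
        self.current = self.checkpointed

s = ExactlyOnceState(tc=500)
s.on_message(1, timestamp=400)
s.on_message(1, timestamp=600)
s.recover()
assert s.current == 1    # only the pre-Tc message survives recovery
```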
50. 50
How to do Dynamic DAG?
Target problem: replace processor B with B' at time Tc.
Multiple-version DAG: DAG(version = 0), A -> B -> C, transitions to DAG(version = 1), A -> B' -> C. Messages with time < Tc go to B; messages with time >= Tc go to B'. There is no message loss during the transition.
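The timestamp-based routing during the transition can be sketched as:

```python
# Sketch: during the version-0 -> version-1 transition at time Tc, each
# message is routed by its timestamp, so B and B' never both process the
# same message and none is dropped.
def route(message_time, tc):
    return "B" if message_time < tc else "B_prime"

tc = 1000
assert route(999, tc) == "B"         # old DAG (version 0)
assert route(1000, tc) == "B_prime"  # new DAG (version 1)
```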
55. 55
Latest Performance Evaluation on Gearpump 0.7
This is the latest performance test, run on version 0.7. Please see the embedded document for the configuration details.