JStorm introduction-0.9.6
1. An Introduction to JStorm
Longda Feng (zhongyan.feng@alibaba-inc.com)
2. Agenda
Background
Basic Concept & Scenarios
Why start JStorm?
JStorm vs Storm
Question and Answer
Longda Feng
Alibaba
3. Who are we?
The JStorm team was among the
earliest Storm users in China.
Storm 0.5.1/0.5.4/0.6.0/0.6.2/0.7.0/0.7.1
JStorm 0.7.1/0.9.0/0.9.1/0.9.2/0.9.3/…
Our Duties
Application Development
JStorm System Development
JStorm System Operation
4. Who are Using JStorm
Many small Chinese companies are using
JStorm
5. How Big?
More than 3000 servers
More than 3 trillion messages per day
6. What is JStorm?
JStorm is a distributed programming
framework
Similar to Hadoop MapReduce but designed
for real-time/in-memory scenarios
Users can build powerful distributed
applications from very simple APIs
7. What is JStorm?
A redesign of Storm in Java.
Proven stable in very large clusters.
Much faster
Much more powerful
9. Advantage 1
Easy to learn:
Simple building blocks: Topology/Spout/Bolt
APIs
Out-of-the-box RPC, fault tolerance, and
real-time data grouping and combining
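The three building blocks can be pictured with a small plain-Java sketch. The interfaces below (SketchSpout, SketchBolt) and the run/wordCount helpers are hypothetical stand-ins invented for illustration; they are not the JStorm or Storm API, and the sketch runs single-threaded and in batches purely to keep it short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

// Hypothetical stand-ins for the building blocks; NOT the JStorm API.
interface SketchSpout {
    String nextTuple(); // returns null when the source is exhausted
}

interface SketchBolt {
    void execute(String tuple, Consumer<String> emit);
}

class TopologySketch {
    // Feeds every tuple from the spout through the bolts in order and
    // returns whatever the last bolt emits.
    static List<String> run(SketchSpout spout, List<SketchBolt> bolts) {
        List<String> current = new ArrayList<>();
        for (String t = spout.nextTuple(); t != null; t = spout.nextTuple()) {
            current.add(t);
        }
        for (SketchBolt bolt : bolts) {
            List<String> next = new ArrayList<>();
            for (String t : current) {
                bolt.execute(t, next::add);
            }
            current = next;
        }
        return current;
    }

    // A word-count "topology": sentence spout -> split bolt -> count bolt.
    static Map<String, Integer> wordCount(List<String> sentences) {
        Iterator<String> it = sentences.iterator();
        SketchSpout spout = () -> it.hasNext() ? it.next() : null;
        SketchBolt split = (t, emit) -> {
            for (String w : t.split(" ")) {
                emit.accept(w);
            }
        };
        Map<String, Integer> counts = new TreeMap<>();
        SketchBolt count = (w, emit) -> counts.merge(w, 1, Integer::sum);
        run(spout, Arrays.asList(split, count));
        return counts;
    }
}
```

In a real topology the spout and bolts would run as parallel tasks on different workers; the sketch only shows how the pieces connect.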
10. Advantage 2
Excellent Scalability
Horizontally Scalable
DAG-based
Adjustable parallelism of each component
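One way to picture adjustable parallelism: a component runs as N tasks, and each tuple is routed to one of them. The sketch below uses hash-based routing, similar in spirit to a fields-style grouping; the class and method names are hypothetical, and this is an illustration rather than the scheduler's actual routing code.

```java
class GroupingSketch {
    // Maps a key to one of `parallelism` task indexes. The same key always
    // lands on the same task, so per-key state can stay local to that task.
    static int taskFor(String key, int parallelism) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), parallelism);
    }
}
```

Raising the parallelism changes only the modulus, which is why a component can be scaled without touching its processing logic (keys are redistributed when the task count changes).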
11. Stable
Guarantees Fault-Tolerance
No Single Point of Failure
• Nimbus HA
• Any Supervisor can be shut down
A new worker is spawned automatically to
replace a failed one
12. Accuracy
The acking framework guarantees no loss of
data
The transaction framework guarantees data
accuracy.
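Storm documents an XOR-based technique for tracking whether every tuple derived from a root message has been acknowledged. The sketch below illustrates that idea under the assumption that JStorm's acking framework follows the same approach; AckTrackerSketch and its methods are hypothetical names, and tuple ids are assumed to be non-zero and unique within a tree.

```java
import java.util.HashMap;
import java.util.Map;

class AckTrackerSketch {
    // rootId -> XOR of the ids of all tuples emitted but not yet acked.
    private final Map<Long, Long> pending = new HashMap<>();

    // Record that a tuple anchored to rootId was emitted.
    void emitted(long rootId, long tupleId) {
        pending.merge(rootId, tupleId, (a, b) -> a ^ b);
    }

    // Record an ack. Returns true once every emitted tuple has been acked,
    // i.e. when the running XOR for this root returns to zero.
    boolean acked(long rootId, long tupleId) {
        long value = pending.merge(rootId, tupleId, (a, b) -> a ^ b);
        if (value == 0L) {
            pending.remove(rootId);
            return true;
        }
        return false;
    }
}
```

Because each id is XORed in once on emit and once on ack, the tracker needs only one 64-bit value per root message regardless of how large the tuple tree grows. A root whose value never reaches zero before a timeout would be treated as failed and replayed.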
13. Scenarios
Stateless Computation
All data comes from tuples
Use Cases:
Log Analysis
Pipelined System
Message converter
Statistical Analysis
Real-time Recommendation Algorithm
14. Why start JStorm
The Storm community is not as active as we
had expected
Tailored for enterprise environment
Fixed critical bugs in Storm
Provided professional technical support,
improved app development pace.
Reduced operational cost.
16. JStorm is a superset of Storm
Programs that run on Storm can run on
JStorm without code changes
17. More stable (1) -- nimbus HA
Nimbus HA
Dual-Nimbus HA
18. More stable (2) -- RPC
Netty supports 2 RPC modes
Async
Sync
• Sending speed is matched to the receiving speed,
so the data flow is more stable.
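The effect of matching sending speed to receiving speed can be sketched with a bounded buffer: the sender blocks whenever the receiver falls behind. This is only an analogy built on java.util.concurrent, assuming the Sync mode behaves like backpressure; it is not JStorm's Netty code, and SyncSendSketch is a hypothetical name.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class SyncSendSketch {
    // Sends `messages` items through a buffer holding at most `capacity`
    // in-flight items and returns how many the receiver got.
    static int transfer(int messages, int capacity) {
        BlockingQueue<Integer> inFlight = new ArrayBlockingQueue<>(capacity);
        Thread sender = new Thread(() -> {
            try {
                for (int i = 0; i < messages; i++) {
                    inFlight.put(i); // blocks while the buffer is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();
        int received = 0;
        try {
            for (int i = 0; i < messages; i++) {
                inFlight.take(); // blocks while the buffer is empty
                received++;
            }
            sender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while receiving", e);
        }
        return received;
    }
}
```

The sender can never get more than `capacity` messages ahead of the receiver, so memory use stays bounded and nothing is dropped.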
19. More stable(3) – resource isolation
A misbehaving worker cannot interfere with
others
Supports CPU isolation with cgroups
Supports memory isolation
Resource quotas can be enforced on each
group (before 0.9.5)
20. More stable(4) -- Monitor
Monitor every component in your
Topology
Many more metrics (70+) than Storm
Supports user-defined metrics
Supports user-defined alerts
21. More stable (5) – CPU usage
Better CPU utilization
Improved disruptor implementation
• CPU usage drops from 300% to 10% when the
processing queue is full
Avoids CPU spin-waiting
• Moved nextTuple/ack/fail work to a separate
thread
22. More stable(6) -- more catch
Added try-catch wherever failures can occur:
Nimbus/supervisor main thread
Spout/bolt initialization/cleanup
All I/O operations and serialization/deserialization
All ZK operations
23. More stable(7) -- ZK
Reduced unnecessary ZK usage:
Removed unneeded watchers
Increased ZK heartbeat frequency
Detects failed workers without a full scan of the
entire ZK directory
24. More stable(8) -- other
Improved GC tuning.
Guarantees that all workers are killed after a kill
command is issued
Guarantees a single supervisor/nimbus per
instance
Avoids excessive use of local ports by the
Netty client
...
25. More powerful scheduler
Balances tasks with regard to:
CPU
Memory
Network
26. CPU assignment
By default, each worker is assigned a single
CPU slot
Applications can be configured to use
more slots
Why:
In Alimama, some tasks create extra threads for
other work, so one CPU slot does not meet the
requirement
27. Memory Usage
Default worker memory is 2 GB
Applications can be configured to use
more memory slots
Why:
In the Alipay Mdrill application, the Solr bolt
requires much more memory
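Slot-based assignment of CPU and memory can be sketched as picking a node with enough free slots of both kinds and reserving them. The class below is a hypothetical illustration; the "most free slots wins" rule and every name in it are assumptions, not the JStorm scheduler's actual policy.

```java
import java.util.List;

class SlotSchedulerSketch {
    static class Node {
        final String name;
        int freeCpu; // free CPU slots
        int freeMem; // free memory slots
        Node(String name, int freeCpu, int freeMem) {
            this.name = name;
            this.freeCpu = freeCpu;
            this.freeMem = freeMem;
        }
    }

    // Reserves cpuSlots and memSlots on the node with the most free slots
    // that can fit the request. Returns the node name, or null if none fits.
    static String assign(List<Node> nodes, int cpuSlots, int memSlots) {
        Node best = null;
        for (Node n : nodes) {
            if (n.freeCpu < cpuSlots || n.freeMem < memSlots) {
                continue; // not enough room on this node
            }
            if (best == null
                    || n.freeCpu + n.freeMem > best.freeCpu + best.freeMem) {
                best = n;
            }
        }
        if (best == null) {
            return null;
        }
        best.freeCpu -= cpuSlots;
        best.freeMem -= memSlots;
        return best.name;
    }
}
```

A worker with default settings would request one CPU slot and one memory slot; a memory-hungry bolt would simply request more memory slots and be placed only where they are available.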
28. Smarter Balancing
With JStorm Scheduler:
Tasks that exchange data heavily tend to be
assigned to the same worker to avoid
networking cost.
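The idea of co-locating tasks that exchange a lot of data can be sketched as a greedy placement: each task goes to the worker that already hosts the tasks it talks to most. The traffic matrix, the capacity limit, and the greedy rule are all assumptions made for illustration; the slides do not describe the real algorithm.

```java
class ColocationSketch {
    // traffic[i][j] = non-negative message volume between tasks i and j.
    // Places tasks 0..n-1 in order onto `workers` workers that hold at most
    // `capacity` tasks each, and returns the worker index chosen per task.
    static int[] place(long[][] traffic, int workers, int capacity) {
        int n = traffic.length;
        int[] workerOf = new int[n];
        int[] load = new int[workers];
        for (int t = 0; t < n; t++) {
            int best = -1;
            long bestAffinity = -1;
            for (int w = 0; w < workers; w++) {
                if (load[w] >= capacity) {
                    continue; // worker is full
                }
                long affinity = 0;
                for (int p = 0; p < t; p++) {
                    if (workerOf[p] == w) {
                        affinity += traffic[t][p];
                    }
                }
                // Prefer higher affinity; break ties toward the emptier worker.
                if (affinity > bestAffinity
                        || (affinity == bestAffinity && load[w] < load[best])) {
                    best = w;
                    bestAffinity = affinity;
                }
            }
            if (best < 0) {
                throw new IllegalArgumentException("not enough worker capacity");
            }
            workerOf[t] = best;
            load[best]++;
        }
        return workerOf;
    }
}
```

Tuples passed between tasks in the same worker stay in process, which is where the saving in network cost comes from.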
29. User Defined Scheduler
Users can pin a task to a designated
worker
Users can set how many CPU slots and memory
slots are used
Why:
In the Taobao TAE project, some bolts need to
run on user-defined nodes
30. Task on Different Node
Tasks of one component can be scheduled
to run on different nodes
Why:
In Alipay Mdrill, Solr bolts must run on different
nodes
31. Task on Single Node
All tasks can be scheduled to run on a
single node.
Why:
In Taobao TLog there are many small jobs. To
reduce network cost, all tasks of one
job must run on a single node.
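The "different node" and "single node" policies are two opposite placement constraints. A minimal sketch of both, with hypothetical method names and node lists, and no claim about how JStorm expresses them in configuration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PlacementSketch {
    // "Different node" policy: every task gets its own node.
    static List<String> spread(int tasks, List<String> nodes) {
        if (tasks > nodes.size()) {
            throw new IllegalArgumentException("need at least one node per task");
        }
        return new ArrayList<>(nodes.subList(0, tasks));
    }

    // "Single node" policy: every task lands on the same node.
    static List<String> pack(int tasks, String node) {
        return new ArrayList<>(Collections.nCopies(tasks, node));
    }
}
```

Spreading trades network cost for isolation between heavy tasks; packing does the reverse for many small jobs.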
32. Old Assignment
“Last Assignment Policy”
By default, a task runs on the machine it
ran on previously
Why:
In Alibaba CDO, when restarting an application,
users wanted to reuse the old workers
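A "last assignment" policy can be sketched as: reuse the previous node for each task if that node is still alive, otherwise fall back to some other rule. The round-robin fallback and all names below are assumptions for illustration only.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LastAssignmentSketch {
    // Returns task -> node. A task keeps its previous node when that node is
    // still alive; otherwise it is placed round-robin over the alive nodes.
    static Map<Integer, String> assign(List<Integer> tasks,
                                       Map<Integer, String> previous,
                                       List<String> aliveNodes) {
        if (aliveNodes.isEmpty()) {
            throw new IllegalArgumentException("no alive nodes");
        }
        Map<Integer, String> result = new TreeMap<>();
        int next = 0;
        for (int task : tasks) {
            String old = previous.get(task);
            if (old != null && aliveNodes.contains(old)) {
                result.put(task, old); // reuse the old placement
            } else {
                result.put(task, aliveNodes.get(next % aliveNodes.size()));
                next++;
            }
        }
        return result;
    }
}
```

Reusing the old placement means a restarted application can pick up local state or warm caches left on the same machine.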
33. Pluggable
Able to run on:
Hadoop YARN (more stable than Storm)
Alibaba Apsara Cloud System
Alibaba Elastic Resource Pool
35. More convenient UI
More useful stats collected and displayed.
Browse Worker Log in UI
36. Support libjar
No need to assemble all dependency jars
into one jar
Submit library jars with the libjar parameter
Supports worker.classpath