1. Apache Tez: Accelerating Hadoop Query Processing
Jeff Markham
Technical Director, APAC
Hortonworks
2. Tez – Introduction
• Distributed execution
framework targeted towards
data-processing applications.
• Based on expressing a
computation as a dataflow
graph.
• Built on top of YARN – the
resource management
framework for Hadoop.
• Open source Apache incubator
project and Apache licensed.
© Hortonworks Inc. 2013
3. YARN: Taking Hadoop Beyond Batch
[Figure: two stacks side by side.
HADOOP 1.0 (MapReduce as Base): Pig (data flow), Hive (SQL), and others (Cascading) run on MapReduce (cluster resource management & data processing), on top of HDFS (redundant, reliable storage).
HADOOP 2.0 (Apache Tez as Base): batch MapReduce; data flow (Pig), SQL (Hive), and others (Cascading) on Tez (execution engine); online data processing (HBase, Accumulo); real-time stream processing (Storm); all on YARN (cluster resource management), on top of HDFS2 (redundant, reliable storage).]
4. Apache Tez (“Speed”)
• Replaces MapReduce as primitive for Pig, Hive, Cascading etc.
– Lower latency for interactive queries
– Higher throughput for batch queries
– 22 contributors: Hortonworks (13), Facebook, Twitter, Yahoo, Microsoft
Task with pluggable Input, Processor and Output
Tez Task - <Input, Processor, Output>
YARN ApplicationMaster to run DAG of Tez Tasks
5. Tez: Building blocks for scalable data processing
• Classical ‘Map’: HDFS Input -> Map Processor -> Sorted Output
• Classical ‘Reduce’: Shuffle Input -> Reduce Processor -> HDFS Output
• Intermediate ‘Reduce’ for Map-Reduce-Reduce: Shuffle Input -> Reduce Processor -> Sorted Output
6. Hive-on-MR vs. Hive-on-Tez
Tez avoids
unneeded writes to
HDFS
SELECT a.x, AVG(b.y) AS avg
FROM a JOIN b ON (a.id = b.id) GROUP BY a.x
UNION SELECT x, AVG(y) AS avg
FROM c GROUP BY x
ORDER BY avg;
[Figure: one query plan drawn twice, as Hive – MR and as Hive – Tez, using map (M) and reduce (R) stages. Stage labels: SELECT a.state, c.itemId; SELECT b.id; SELECT c.price; JOIN (a, c); JOIN (a, b) with GROUP BY a.state, COUNT(*), AVERAGE(c.price). The Hive – MR plan shows HDFS writes between its stages.]
7. Tez Sessions
… because Map/Reduce query startup is expensive
• Tez Sessions
– Hot containers ready for immediate use
– Removes task and job launch overhead (~5s – 30s)
• Hive
– Session launch/shutdown in background (seamless, user not
aware)
– Submits query plan directly to Tez Session
Native Hadoop service, not ad-hoc
8. Tez Delivers Interactive Query - Out of the Box!
Feature | Description | Benefit
Tez Session | Overcomes Map-Reduce job-launch latency by pre-launching the Tez AppMaster. | Latency
Tez Container Pre-Launch | Overcomes Map-Reduce latency by pre-launching hot containers ready to serve queries. | Latency
Tez Container Re-Use | Finished maps and reduces pick up more work rather than exiting. Reduces latency and eliminates difficult split-size tuning. Out of box performance! | Latency
Runtime re-configuration of DAG | Runtime query tuning by picking aggregation parallelism using online query statistics. | Throughput
Tez In-Memory Cache | Hot data kept in RAM for fast access. | Latency
Complex DAGs | Tez Broadcast Edge and Map-Reduce-Reduce pattern improve query scale and throughput. | Throughput
9. Tez – Design Themes
• Empowering End Users
• Execution Performance
10. Tez – Empowering End Users
• Expressive dataflow definition APIs
• Flexible Input-Processor-Output runtime model
• Data type agnostic
• Simplifying deployment
11. Tez – Empowering End Users
• Expressive dataflow definition APIs
– Enable definition of complex data flow pipelines using simple
graph connection APIs. Tez expands the logical plan at runtime.
– Targeted towards data processing applications like Hive/Pig but
not limited to them. Hive/Pig query plans naturally map to Tez dataflow
graphs with no translation impedance.
[Figure: a dataflow graph expanded into tasks TaskA-1, TaskA-2, TaskB-1, TaskB-2, TaskC-1, TaskC-2, TaskD-1, TaskD-2, TaskE-1, TaskE-2.]
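The idea of declaring a logical graph and letting the framework expand it into tasks can be sketched in a few lines of Python. This is a toy model, not the Tez API: `LogicalDAG`, `add_vertex`, `add_edge`, and `expand` are hypothetical names, and the edges chosen here (A and B feed D, C feeds E) are assumed purely for illustration.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class LogicalDAG:
    """Toy logical plan: named vertices with a parallelism, plus edges."""
    parallelism: dict = field(default_factory=dict)  # vertex -> number of tasks
    upstream: dict = field(default_factory=dict)     # vertex -> set of producers

    def add_vertex(self, name, parallelism):
        self.parallelism[name] = parallelism
        self.upstream.setdefault(name, set())
        return self

    def add_edge(self, producer, consumer):
        self.upstream.setdefault(consumer, set()).add(producer)
        return self

    def expand(self):
        """Expand each logical vertex into physical tasks, producers first."""
        order = TopologicalSorter(self.upstream).static_order()
        return [f"Task{v}-{i}"
                for v in order
                for i in range(1, self.parallelism[v] + 1)]


dag = (LogicalDAG()
       .add_vertex("A", 2).add_vertex("B", 2).add_vertex("C", 2)
       .add_vertex("D", 2).add_vertex("E", 2)
       .add_edge("A", "D").add_edge("B", "D").add_edge("C", "E"))
tasks = dag.expand()
print(tasks)
```

The user only states vertices, parallelism, and connections; the task list is derived, which is the part a real framework can revisit at runtime.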
12. Tez – Empowering End Users
• Expressive dataflow definition APIs
[Figure: Distributed Sort. Preprocessor Stage (Task-1, Task-2) sends Samples to a Sampler; the Sampler sends Ranges to the Partition Stage (Task-1, Task-2), which feeds the Aggregate Stage (Task-1, Task-2).]
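A single-process Python sketch of the sample, range, partition, aggregate pattern named in the figure may help. It is an illustration of range-partitioned sorting in general, not Tez code; the function names, the sample size, and the seed are assumptions.

```python
import bisect
import random


def sample_ranges(partitions, num_ranges, sample_size=3, seed=0):
    """Sampler: take a few records from each input split and pick split points."""
    rng = random.Random(seed)
    samples = sorted(
        x for part in partitions
        for x in rng.sample(part, min(sample_size, len(part)))
    )
    # num_ranges - 1 boundaries, evenly spaced through the sorted samples.
    step = len(samples) / num_ranges
    return [samples[int(step * i)] for i in range(1, num_ranges)]


def partition(records, boundaries):
    """Partition stage: route each record to the range that contains it."""
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for r in records:
        buckets[bisect.bisect_right(boundaries, r)].append(r)
    return buckets


def distributed_sort(partitions, num_ranges):
    boundaries = sample_ranges(partitions, num_ranges)
    # Each input task partitions its own records using the shared boundaries.
    routed = [partition(p, boundaries) for p in partitions]
    # Aggregate stage: task i collects bucket i from every input and sorts it.
    return [sorted(x for r in routed for x in r[i]) for i in range(num_ranges)]


splits = [[5, 1, 9, 3], [8, 2, 7], [6, 4, 0]]
outputs = distributed_sort(splits, num_ranges=2)
flat = [x for out in outputs for x in out]
print(outputs)
```

Because every record in bucket i is smaller than every record in bucket i + 1, concatenating the per-task outputs yields a globally sorted result without a final merge.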
13. Tez – Empowering End Users
• Flexible Input-Processor-Output runtime model
– Construct physical runtime executors dynamically by connecting
different inputs, processors and outputs.
– End goal is to have a library of inputs, outputs and processors that
can be programmatically composed to generate useful tasks.
• Mapper: HDFSInput -> MapProcessor -> FileSortedOutput
• Reducer: ShuffleInput -> ReduceProcessor -> HDFSOutput
• PairwiseJoin: Input1, Input2 -> JoinProcessor -> FileSortedOutput
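To make the composition idea concrete, here is a minimal Python sketch in which a task is nothing more than inputs, a processor, and an output wired together. Everything in it is hypothetical: the `Task` class and the helper functions only stand in for the named Tez components and do not reproduce their interfaces.

```python
from itertools import groupby
from operator import itemgetter


class Task:
    """Toy <Input, Processor, Output> triple; all three parts are plain callables."""

    def __init__(self, inputs, processor, output):
        self.inputs, self.processor, self.output = inputs, processor, output

    def run(self):
        return self.output(self.processor(*[read() for read in self.inputs]))


# Hypothetical library of inputs, processors, and outputs.
def list_input(records):            # stands in for HDFSInput / ShuffleInput
    return lambda: list(records)


def map_processor(fn):
    return lambda records: [fn(r) for r in records]


def reduce_processor(fn):
    def process(pairs):
        ordered = sorted(pairs, key=itemgetter(0))
        return [(k, fn(v for _, v in group))
                for k, group in groupby(ordered, key=itemgetter(0))]
    return process


def join_processor(left, right):    # pairwise join on the key
    index = dict(right)
    return [(k, (v, index[k])) for k, v in left if k in index]


def sorted_output(pairs):           # stands in for FileSortedOutput
    return sorted(pairs)


def collect_output(pairs):          # stands in for HDFSOutput
    return list(pairs)


mapper = Task([list_input(["a", "b", "a"])],
              map_processor(lambda w: (w, 1)), sorted_output)
reducer = Task([list_input(mapper.run())],
               reduce_processor(sum), collect_output)
counts = reducer.run()
joined = Task([list_input(counts), list_input([("a", "x")])],
              join_processor, sorted_output).run()
print(counts, joined)
```

The same three-slot shape covers the mapper, the reducer, and a two-input join; only the pieces plugged in differ.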
14. Tez – Empowering End Users
• Data type agnostic
– Tez is only concerned with the movement of data: files and
streams of bytes.
– Does not impose any data format on the user application. An MR
application can use Key-Value pairs on top of Tez. Hive and Pig
can use tuple oriented formats that are natural and native to them.
[Figure: a Tez Task moves Bytes to and from a File or Stream; User Code interprets those bytes as Key Value pairs or Tuples.]
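The split between "framework moves bytes" and "application picks the format" can be sketched as follows. This is an illustrative Python model, not Tez code: the length-prefixed framing and both codecs are assumptions chosen for brevity.

```python
import json
import struct


def frame(payloads):
    """Framework side: concatenate opaque byte payloads with a length prefix."""
    return b"".join(struct.pack(">I", len(p)) + p for p in payloads)


def unframe(stream):
    """Framework side: split the byte stream back into opaque payloads."""
    offset, payloads = 0, []
    while offset < len(stream):
        (size,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        payloads.append(stream[offset:offset + size])
        offset += size
    return payloads


# User side, option 1: key-value pairs (MapReduce style).
def encode_kv(key, value):
    return f"{key}\t{value}".encode("utf-8")


def decode_kv(payload):
    key, value = payload.decode("utf-8").split("\t", 1)
    return key, value


# User side, option 2: tuples (Hive/Pig style).
def encode_tuple(row):
    return json.dumps(list(row)).encode("utf-8")


def decode_tuple(payload):
    return tuple(json.loads(payload.decode("utf-8")))


kv_stream = frame([encode_kv("a", "1"), encode_kv("b", "2")])
kv = [decode_kv(p) for p in unframe(kv_stream)]

row_stream = frame([encode_tuple((1, "x", 2.5))])
rows = [decode_tuple(p) for p in unframe(row_stream)]
print(kv, rows)
```

`frame` and `unframe` never inspect the payloads, so two applications with different record formats can share the same transport.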
15. Tez – Empowering End Users
• Simplifying deployment
– Tez is a completely client side application.
– No deployments to do. Simply upload to any accessible
FileSystem and change local Tez configuration to point to that.
– Enables running different versions concurrently. Easy to test new
functionality while keeping stable versions for production.
– Leverages YARN local resources.
[Figure: Tez Lib 1 and Tez Lib 2 are stored on HDFS. Each Client Machine runs its own TezClient, and TezTasks run on Node Managers.]
16. Tez – Empowering End Users
• Expressive dataflow definition APIs
• Flexible Input-Processor-Output runtime model
• Data type agnostic
• Simplifying usage
With great power APIs come great responsibilities :)
Tez is a framework on which end user applications can
be built
17. Tez – Execution Performance
• Performance gains over Map Reduce
• Optimal resource management
• Plan reconfiguration at runtime
• Dynamic physical data flow decisions
18. Tez – Execution Performance
• Performance gains over Map Reduce
– Eliminate replicated write barrier between successive
computations.
– Eliminate job launch overhead of workflow jobs.
– Eliminate extra stage of map reads in every workflow job.
– Eliminate queue and resource contention suffered by workflow
jobs that are started after a predecessor job completes.
[Figure: Pig/Hive - MR compared with Pig/Hive - Tez.]
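The four bullets above can be read as a simple counting argument. The Python sketch below is a back-of-the-envelope model, not a benchmark: the function names are hypothetical, and it assumes a linear workflow in which each stage is one MapReduce job.

```python
def mapreduce_workflow_costs(num_stages):
    """Count overheads when each stage is a separate MapReduce job."""
    return {
        "job_launches": num_stages,
        # Every job except the last writes replicated output to HDFS for the next.
        "intermediate_hdfs_writes": num_stages - 1,
        # Every job after the first needs a map stage just to read that output.
        "extra_map_read_stages": num_stages - 1,
    }


def tez_dag_costs(num_stages):
    """Same stages as vertices of one DAG: one launch, no HDFS hand-offs.

    num_stages is accepted only for symmetry; in this model it does not
    change the counts.
    """
    return {
        "job_launches": 1,
        "intermediate_hdfs_writes": 0,
        "extra_map_read_stages": 0,
    }


mr = mapreduce_workflow_costs(4)
tez = tez_dag_costs(4)
print(mr, tez)
```

In this model the MapReduce overheads grow with the length of the workflow, while the single-DAG counts stay constant.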
19. Tez – Execution Performance
• Plan reconfiguration at runtime
– Dynamic runtime concurrency control based on data size, user
operator resources, available cluster resources and locality.
– Advanced changes in dataflow graph structure.
– Progressive graph construction in concert with user optimizer.
[Figure: as planned, Stage 1 runs 50 maps producing 100 partitions and Stage 2 runs 100 reducers. At runtime Stage 1 produces only 10 GB of data, so Stage 2 is reconfigured from 100 reducers to 10. Inputs to the decision: HDFS blocks and YARN resources.]
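One way to picture the 100-to-10 reducer change is a sizing rule applied once the real output size is known. The Python sketch below is only that: the 1 GB-per-reducer target and both function names are assumptions, not Tez settings.

```python
import math

GB = 1024 ** 3


def choose_reducers(observed_bytes, planned_reducers, bytes_per_reducer=GB):
    """Shrink the planned reducer count to fit the data actually produced."""
    needed = max(1, math.ceil(observed_bytes / bytes_per_reducer))
    return min(planned_reducers, needed)


def merge_partitions(num_partitions, num_reducers):
    """Assign contiguous partition ranges to at most num_reducers reducers."""
    per_reducer = math.ceil(num_partitions / num_reducers)
    return [list(range(i, min(i + per_reducer, num_partitions)))
            for i in range(0, num_partitions, per_reducer)]


reducers = choose_reducers(observed_bytes=10 * GB, planned_reducers=100)
assignment = merge_partitions(num_partitions=100, num_reducers=reducers)
print(reducers, assignment[0])
```

The map side still writes 100 partitions; only the grouping of partitions into reducers changes, so the decision can be made after Stage 1 has run.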
20. Tez – Execution Performance
• Optimal resource management
– Reuse YARN containers to launch new tasks.
– Reuse YARN containers to enable shared objects across tasks.
[Figure: the Tez Application Master sends Start Task to a TezTask Host running in a YARN Container. After Task Done it sends Start Task again, so TezTask1 and TezTask2 run in the same container and can use Shared Objects.]
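A small Python simulation shows why reuse pays off twice: fewer container launches, and expensive objects built once per container instead of once per task. The `Container` class, `run_tasks`, and the lookup-table example are all hypothetical and are not the YARN or Tez APIs.

```python
class Container:
    """Toy container that can host several tasks one after another."""

    def __init__(self):
        self.shared_objects = {}  # survives across tasks in this container

    def run(self, task_name, load_table):
        # Build the expensive object only for the first task in this container.
        if "lookup" not in self.shared_objects:
            self.shared_objects["lookup"] = load_table()
        size = len(self.shared_objects["lookup"])
        return f"{task_name} used a table with {size} entries"


def run_tasks(task_names, load_table, max_containers):
    """Application-master side: reuse containers instead of launching new ones."""
    pool, results = [], []
    for i, name in enumerate(task_names):
        slot = i % max_containers
        if slot == len(pool):       # first wave: nothing to reuse yet
            pool.append(Container())
        results.append(pool[slot].run(name, load_table))
    return results, len(pool)


loads = []


def load_table():
    loads.append(1)                 # record every expensive load
    return {"k1": "v1", "k2": "v2"}


results, launched = run_tasks(
    ["TezTask1", "TezTask2", "TezTask3", "TezTask4"], load_table, max_containers=2)
print(launched, len(loads))
```

Four tasks run with two container launches and two table loads; without reuse both counts would be four.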
21. Tez – Execution Performance
• Dynamic physical data flow decisions
– Decide the type of physical byte movement and storage on the fly.
– Store intermediate data on distributed store, local store or in-memory.
– Transfer bytes via blocking files or streaming and the spectrum in
between.
[Figure: at runtime, a Producer hands its output to a Consumer through a Local File, or In-Memory when the output is small.]
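The runtime choice between the three storage options can be expressed as a small decision function. The Python sketch below is illustrative only: the `Storage` enum, the `must_survive_node_loss` flag, and the 64 MB memory budget are assumptions, not Tez behavior.

```python
from enum import Enum


class Storage(Enum):
    IN_MEMORY = "in-memory"
    LOCAL_FILE = "local file"
    DISTRIBUTED_STORE = "distributed store"


MB = 1024 ** 2


def choose_storage(output_bytes, memory_budget_bytes=64 * MB,
                   must_survive_node_loss=False):
    """Pick where a producer's intermediate output should live."""
    if must_survive_node_loss:
        # Pay for replication only when recomputation would be too costly.
        return Storage.DISTRIBUTED_STORE
    if output_bytes <= memory_budget_bytes:
        # Small outputs can be handed to the consumer without touching disk.
        return Storage.IN_MEMORY
    return Storage.LOCAL_FILE


small = choose_storage(8 * MB)
large = choose_storage(2048 * MB)
durable = choose_storage(8 * MB, must_survive_node_loss=True)
print(small, large, durable)
```

Because the output size is an input, the decision can only be made once the producer has run, which is the point of deferring it to runtime.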
22. Tez – Sessions
• Key for interactive queries
• Analogous to database
sessions and represents a
connection between the user
and the cluster
• Run multiple DAGs / queries in
the same session
• Maintains a pool of reusable
containers for low latency
execution of tasks within and
across queries
• Takes care of data locality and
releasing resources when idle
• Session cache in the
Application Master and in the
container pool reduces recomputation and re-initialization
[Figure: a Client starts a session and submits DAGs to the Application Master (Task Scheduler, Container Pool); containers are pre-warmed JVMs with a Shared Object Registry.]
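The latency effect of keeping containers warm across queries can be modeled in a few lines of Python. This is a toy model, not the Tez client API: the `Session` class is hypothetical, and the 5-second launch cost and 1-second task cost are assumed values.

```python
class Session:
    """Toy session: a pool of warm containers that outlives individual DAGs."""

    def __init__(self, launch_cost_s=5.0, task_cost_s=1.0):
        self.launch_cost_s = launch_cost_s  # assumed cost to start containers
        self.task_cost_s = task_cost_s      # assumed cost to run one task
        self.warm_containers = 0

    def submit_dag(self, num_tasks):
        """Run one DAG whose tasks all run in parallel; return latency in seconds."""
        cold = max(0, num_tasks - self.warm_containers)
        self.warm_containers += cold        # launched containers stay in the pool
        # Cold containers start in parallel, so the launch cost is paid at most once.
        return (self.launch_cost_s if cold else 0.0) + self.task_cost_s


session = Session()
first = session.submit_dag(4)    # pool is empty: pays the launch cost
second = session.submit_dag(4)   # all four containers are already warm
third = session.submit_dag(6)    # two extra containers must be launched
print(first, second, third)
```

Only queries that need more containers than the pool already holds pay the launch cost again.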
23. Tez – Benchmark Performance
Significant (but not all) speed-ups due to Tez:
• DAG support and runtime graph reconfiguration enable utilizing the
parallelism of the cluster
• Tez Session and container re-use enable
efficient and low latency execution
24. Tez – Performance Analysis
[Figure: execution timeline for TPC-DS – Query 27 with Hive on Tez, annotated as follows.]
• Tez Session populates container pool (AM)
• Dimension table calculation and HDFS split generation in parallel
• Dimension tables broadcast to Hive MapJoin tasks
• Final Reducer pre-launched and fetches completed inputs
25. Tez – Current status
• Apache Incubator Project
– Rapid development. Over 600 JIRAs opened. Over 400 resolved.
– Growing community of contributors and users.
• Focus on stability
– Testing and quality are highest priority.
– Code ready and deployed on multi-node environments.
• Support for a vast topology of DAGs
– Already functionally equivalent to Map Reduce. Existing Map
Reduce jobs can be executed on Tez with few or no changes.
– Hive re-targeted to use Tez for execution of queries (HIVE-4660).
– Work started on Pig to use Tez for execution of scripts (PIG-3446).
26. Tez – Roadmap
• Richer DAG support
– Support for co-scheduling and streaming
– Better fault tolerance with checkpoints
• Performance optimizations
– More efficiencies in transfer of data
– Improve session performance
• Usability
– Stability and testability
– Recovery and history
– Tools for performance analysis and debugging
27. Tez – Key Takeaways
• Distributed execution framework that works on
computations represented as dataflow graphs
• Naturally maps to execution plans produced by query
optimizers
• Customizable execution architecture designed to
enable dynamic performance optimizations at runtime
• Works out of the box with the platform figuring out
the hard stuff
• Spans the spectrum of interactive latency to batch
• Open source Apache project – your use-cases and
code are welcome
• It works and is already being used by Hive and Pig