What's new in Hadoop Common and HDFS 1. Copyright©2016 NTT corp. All Rights Reserved.
What’s new in
Hadoop Common and HDFS
@Hadoop Summit Tokyo 2016
Tsuyoshi Ozawa
NTT Software Innovation Center
2016/10/26
2.
• Tsuyoshi Ozawa
• Researcher & Engineer @ NTT
Twitter: @oza_x86_64
• Apache Hadoop Committer and PMC
• Wrote “Introduction to Hadoop, 2nd Edition” (Japanese), Chapter 22 (YARN)
• Online article: gihyo.jp “Why and How does Hadoop work?”
About me
3.
• What’s new in Hadoop 3 Common and HDFS?
• Build
• Compiling source code with JDK 8
• Common
• Better Library Management
• Client-Side Classpath Isolation
• Dependency Upgrade
• Support for Azure Data Lake Storage
• Shell script rewrite
• metrics2 sink plugin for Apache Kafka HADOOP-10949
• HDFS
• Erasure Coding Phase 1 HADOOP-11264
• MR and YARN -> covered in Junping's talk!
Agenda
5.
• The minimum JDK was upgraded to JDK 8
• HADOOP-11858
• Oracle JDK 7 reached EoL in April 2015!!
• Moving forward to use new features of JDK 8
• Hadoop 2.6.x
• JDK 6, 7, 8 or later
• Hadoop 2.7.x/2.8.x/2.9.x
• JDK 7, 8 or later
• Hadoop 3.0.x
• JDK 8 or later
Apache Hadoop 3.0.0 runs on JDK 8 or later
7.
• Jersey: 1.9 to 1.19
• a root element whose content is an empty collection is now rendered as an empty object ({}) instead of null
• grizzly-http-servlet: 2.1.2 to 2.2.21
• Guice: 3.0 to 4.0
• cglib: 2.2 to 3.2.0
• asm: 3.2 to 5.0.4
Dependency Upgrade
8.
Client-side classpath isolation
Problem
• The application’s dependency versions can conflict with Hadoop’s
Solution
• Separate server-side jars from client-side jars
• Like hbase-client, dependencies are shaded
HADOOP-11656/HADOOP-13070
[Diagram: today, a single jar file puts Hadoop’s older commons on the same classpath as user code that needs a newer commons — conflicts!; with a shaded hadoop-client, user code can use the newer commons safely]
9.
• FileSystem API supports various storages
• HDFS
• Amazon S3
• Azure Blob Storage
• OpenStack Swift
• 3.0.0 officially supports Azure Data Lake Storage
Support for Azure Data Lake Storage
10.
• The CLI scripts are renewed!
• To fix bugs (e.g. HADOOP_CONF_DIR was only honored sometimes)
• To introduce new features
E.g.
• To launch daemons, use the “{hadoop,yarn,hdfs} --daemon” command instead of the {hadoop,yarn,hdfs}-daemons.sh scripts
• To print various environment variables, Java options, classpath, etc., the “{hadoop,yarn,hdfs} --debug” option is supported
• Please check documents
• https://hadoop.apache.org/docs/current/hadoop-project-
dist/hadoop-common/CommandsManual.html
• https://issues.apache.org/jira/browse/HADOOP-9902
Shell script rewrite
11.
• Metrics System 2 is the collector of daemon metrics
• Hadoop daemon metrics can now be dumped into Apache Kafka
metrics2 sink plugin for Apache Kafka
[Diagram: DataNode, NameNode, and NodeManager metrics flow into Metrics System 2, which now ships them to the new Apache Kafka sink]
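Wiring the sink up is a metrics2 configuration change. The fragment below is only a sketch of hadoop-metrics2.properties: the sink class and property keys are my assumptions, so check the HADOOP-10949 documentation for the exact names.

```properties
# Illustrative only — verify class and key names against HADOOP-10949
*.sink.kafka.class=org.apache.hadoop.metrics2.sink.KafkaSink
namenode.sink.kafka.broker_list=broker1:9092,broker2:9092
namenode.sink.kafka.topic=hadoop-metrics
```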
13.
• Before: 1 active – 1 standby NameNode
• Must recover immediately after the active NN fails
• After: 1 active – N standby NameNodes can be chosen
• Lets you trade off machine costs vs. operation costs
NameNode Multi-Standby
[Diagram: before — one Active NN paired with one Standby NN; after — one Active NN with multiple Standby NNs]
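Adding standbys is a matter of listing more NameNode IDs in hdfs-site.xml. A minimal sketch (the nameservice and host names are made up; each nn* also needs its address properties):

```xml
<!-- Sketch: one nameservice with three NameNodes; nn3 is the extra standby -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2,nn3</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn3</name>
  <value>nn3.example.com:8020</value>
</property>
```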
15.
• Background
• HDFS uses chain replication for higher throughput and strong consistency
• Example: replication factor 3
• Pros
• Simplicity
• Network traffic between the client and the replicas is kept low
• Cons
• High latency
• 33% storage efficiency (3N bytes on disk for N bytes of data)
Replication – the traditional HDFS way
[Diagram: the client writes data to DataNode 1, which forwards it to DataNode 2 and then DataNode 3; ACKs flow back along the chain]
16.
• Erasure coding is another way to save storage while keeping fault tolerance
• Used in RAID 5/6
• Uses “parity” instead of “copies” to recover
• Reed-Solomon coding is used
• If data is lost, recovery is done with an inverse matrix
Erasure Coding
[Figure: (4 data bits, 2 parity bits) Reed-Solomon. The generator matrix — a 4×4 identity stacked on two coding rows (X00…X03, X10…X13) — multiplied by the data vector (d1, d2, d3, d4) yields the data bits d1…d4 plus the parity bits c0, c1. These values are stored instead of only the data!]
Source: FAST ’09, “A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries for Storage”
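The inverse-matrix recovery above can be sketched in plain Python. Real Reed-Solomon arithmetic happens in a finite field such as GF(2^8); this toy uses exact rationals, a (4 data, 2 parity) layout matching the figure, and made-up coding coefficients, so it only illustrates the idea, not HDFS's actual codec:

```python
from fractions import Fraction

def solve(A, y):
    """Solve A x = y by Gauss-Jordan elimination over exact rationals."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(y[i])]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n] for row in M]

# Generator matrix: four identity rows keep the data verbatim,
# two coding rows produce the parities c0 and c1.
G = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [1, 1, 1, 1],   # parity c0
     [1, 2, 3, 4]]   # parity c1

data = [5, 7, 2, 9]
stored = [sum(g * d for g, d in zip(row, data)) for row in G]  # 6 values

# Lose any two stored values, e.g. d1 and d3 (rows 0 and 2); the four
# surviving rows of G still form an invertible matrix here.
alive = [1, 3, 4, 5]
recovered = solve([G[i] for i in alive], [stored[i] for i in alive])
print([int(v) for v in recovered])  # -> [5, 7, 2, 9]
```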
17.
• Erasure coding is flexible:
the number of data bits and parity bits can be tuned
• e.g. 6 data bits, 3 parity bits
• 3-replication vs. (6, 3) Reed-Solomon
Effect of Erasure Coding
                              3-replication    (6, 3) Reed-Solomon
Maximum fault tolerance       2                3
Disk usage (N bytes of data)  3N               1.5N
HDFS Erasure Coding Design Document:
https://issues.apache.org/jira/secure/attachment/12697210/HDFSEra
sureCodingDesign-20150206.pdf
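The table's numbers follow from simple arithmetic; a quick sketch (illustrative Python, not Hadoop code):

```python
def disk_usage(n_bytes, scheme):
    """Bytes written to disk for n_bytes of user data."""
    if scheme == "3-replication":
        return 3 * n_bytes              # three full copies -> 3N
    if scheme == "(6,3)-RS":
        return n_bytes * (6 + 3) / 6    # 6 data + 3 parity cells -> 1.5N
    raise ValueError(scheme)

def max_fault_tolerance(scheme):
    """How many lost copies/blocks can be survived."""
    return {"3-replication": 2,         # 2 of the 3 copies may die
            "(6,3)-RS": 3}[scheme]      # any 3 of the 9 blocks may die

print(disk_usage(100, "3-replication"))  # -> 300
print(disk_usage(100, "(6,3)-RS"))       # -> 150.0
```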
18.
• 2 approaches
• Striping: splitting blocks into smaller cells
• Pros: effective for small files
• Cons: less data locality when reading a block
• Contiguous: creating parities from whole blocks
• Pros: better locality
• Cons: small files cannot be handled efficiently
Possible EC designs in HDFS
[Figure: striping splits data into 1MB cells plus 1MB parity cells; contiguous keeps 64MB blocks and computes 64MB parity blocks across them]
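Why striping suits small files can be sketched in a few lines of Python (cell size and round-robin layout per the figure; the function name is mine):

```python
CELL = 1 * 1024 * 1024  # 1MB cell size used by striping

def stripe_cells(length, data_units=6):
    """Map a file of `length` bytes onto (datanode_index, cell_index)
    pairs, round-robin across the 6 data blocks of a (6, 3) stripe."""
    n_cells = (length + CELL - 1) // CELL
    return [(i % data_units, i // data_units) for i in range(n_cells)]

# A 4MB file occupies only 4 cells on 4 DataNodes — no 64MB block
# is wasted, which is why striping handles small files well.
print(stripe_cells(4 * 1024 * 1024))  # -> [(0, 0), (1, 0), (2, 0), (3, 0)]
```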
19.
• According to the fsimage analysis report, over 90% of files are smaller than the HDFS block size (64MB)
• Figure 3 source: fsimage Analysis
https://issues.apache.org/jira/secure/attachment/12690129/fsimage-
analysis-20150105.pdf
Which is better, striping or contiguous?
[Figure 3: file-size distributions for Cluster 1 and Cluster 3 (1 group = 6 blocks)]
20.
• Striping was chosen first, to deal with small files
• Hadoop 3.0.0 implements Phase 1.1 and Phase 1.2
Apache Hadoop’s decision
HDFS Erasure Coding Design Document:
https://issues.apache.org/jira/secure/attachment/12697210/HDFSEra
sureCodingDesign-20150206.pdf
21.
• What’s changed?
• How data is stored on DataNodes
• How metadata is stored on the NameNode
• Client write path
• Client read path
Erasure Coding in HDFS (ver. 2016)
HDFS Erasure Coding Design Document:
https://issues.apache.org/jira/secure/attachment/12697210/HDFSEra
sureCodingDesign-20150206.pdf
22.
• Cell size: 1MB (not the 64MB block size)
• Parity bits are calculated client-side, at write time
• Writes happen in parallel (not chain replication)
How to preserve data in HDFS (write path)
HDFS Erasure Coding Design Document:
https://issues.apache.org/jira/secure/attachment/12697210/HDFSEra
sureCodingDesign-20150206.pdf
23.
• A stripe consists of 9 small blocks (6 data + 3 parity)
• If no data is lost, the client reads only the 6 data blocks and never touches the parities
How to retrieve data - (6, 3) Reed Solomon-
[Diagram: the client reads the six 1MB data blocks from their DataNodes; the three parity blocks are not read]
24.
• Pros
• Low latency thanks to parallel writes/reads
• Good for small files
• Cons
• Requires high network bandwidth between client and servers
Network traffic
Workload        3-replication        (6, 3) Reed-Solomon
Read 1 block    1 LN                 1/6 LN + 5/6 RR
Write 1 block   1 LN + 1 LR + 1 RR   1/6 LN + 1/6 LR + 7/6 RR
LN: Local Node, LR: Local Rack, RR: Remote Rack
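The table entries can be reproduced by counting where each byte travels. The helper functions below are illustrative only; they encode the table's placement assumptions (one replica or cell local, one local-rack, the rest remote-rack):

```python
from fractions import Fraction

def read_traffic(scheme):
    """Fraction of a block's bytes read from each location."""
    if scheme == "3-replication":
        return {"LN": Fraction(1), "LR": Fraction(0), "RR": Fraction(0)}
    # (6,3)-RS striping: 1 of the 6 data cells is local, 5 are remote-rack.
    return {"LN": Fraction(1, 6), "LR": Fraction(0), "RR": Fraction(5, 6)}

def write_traffic(scheme):
    """Fraction of a block's bytes written to each location."""
    if scheme == "3-replication":
        # One copy local, one in the local rack, one in a remote rack.
        return {"LN": Fraction(1), "LR": Fraction(1), "RR": Fraction(1)}
    # (6,3)-RS: of 9 cells, 1 local, 1 local-rack, 7 remote-rack.
    return {"LN": Fraction(1, 6), "LR": Fraction(1, 6), "RR": Fraction(7, 6)}

print(write_traffic("(6,3)-RS"))  # 1/6 LN + 1/6 LR + 7/6 RR, as in the table
```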
25.
• The write path and read path are changed!
• How much network traffic do you have?
• How many small files?
• If network traffic is very high, replication seems preferable
• If you have cold data and most of it is small, EC is a good option
Operation Points
26.
• Build
• Minimum JDK upgraded to JDK 8
• Common
• Be careful about your project’s dependency management if you write hand-coded MapReduce
• The shell script rewrite makes operation easier
• Kafka Metrics2 sink
• New FileSystem backend: Azure Data Lake Storage
• HDFS
• Multiple standby NameNodes make operation more flexible
• Erasure coding
• More efficient disk usage than replication
• Much operational know-how will change!
Summary
27.
• Kai Zheng’s slides are a good reference
• http://www.slideshare.net/HadoopSummit/debunking-the-myths-
of-hdfs-erasure-coding-performance
• HDFS Erasure Coding Design Document
• https://issues.apache.org/jira/secure/attachment/12697210/HDF
SErasureCodingDesign-20150206.pdf
• Fsimage Analysis
• https://issues.apache.org/jira/secure/attachment/12690129/fsim
age-analysis-20150105.pdf
• Hadoop 3.0.0-alpha release notes
• http://hadoop.apache.org/docs/r3.0.0-alpha1/hadoop-project-
dist/hadoop-common/release/3.0.0-alpha1/CHANGES.3.0.0-
alpha1.html
References
28.
• Thanks to all users, contributors, committers, and PMC members of Apache Hadoop!
• Especially Andrew Wang, who made a great effort to release 3.0.0-alpha!
• Thanks to Kota Tsuyuzaki, an OpenStack Swift developer, for reviewing my EC-related slides!
Acknowledgement