Copyright © 2014 NTT DATA Corporation
Masaru Dobashi (NTT DATA)
Spark on a large Hadoop cluster, and its evaluation from the
viewpoint of an enterprise Hadoop user and developer
Who am I?
I'm Masaru Dobashi, Chief Engineer of NTT DATA in Japan.
My team focuses on Open Source Software solutions.
I have been integrating Hadoop systems for 5+ years; in recent years we have also utilized Spark, Storm, and so on.
NTT DATA is one of the leading solution providers in Japan; the largest system is a 1000+ node cluster.
(Photo: an elephant on a resort island)
Position in NTT Group
NTT Group (as of Mar. 31, 2013):
- Total assets: ¥19.6536 trillion
- Operating revenues: ¥10.7007 trillion
- Number of employees: 227,168
- Number of consolidated subsidiaries: 827

NIPPON TELEGRAPH AND TELEPHONE CORPORATION (holding company): plans management strategies for the NTT Group and encourages fundamental R&D efforts.
【】 denotes NTT's voting rights ratio (as of Mar. 31, 2013):
- NIPPON TELEGRAPH AND TELEPHONE EAST CORPORATION 【100%】: regional communications business
- NIPPON TELEGRAPH AND TELEPHONE WEST CORPORATION 【100%】: regional communications business
- NTT Communications Corporation 【100%】: long-distance and international communications business
- Dimension Data Holdings plc. 【100%】: long-distance and international communications business
- NTT DOCOMO, INC. 【66.7%】: mobile communications business
- NTT DATA CORPORATION 【54.2%】: data communications business
  - Net sales: USD 13.2 billion (June 2014; USD 1 = JPY 102)
  - Employees: 75,000 (January 2014)
Agenda
- Our motivation and expectations for Spark
- Characteristics of its performance with GBs, TBs, and tens of TBs of data
- Tips for people who are planning to use Spark
Motivation
Hadoop has become the base of data processing
We started to use Hadoop 6 years ago. Hadoop enabled us to process massive data daily and hourly.
(System image from 6 years ago: a data loader feeds Hadoop (MapReduce, HDFS), with Hive and Pig on top, running daily/hourly batches whose results go to an outer system.)
Demands for data processing have diversified
Handle a variety of requirements for data processing:
- Both high throughput and low latency (this should be achieved by Spark)
- APIs useful for data analysis
Make data management simple:
- We want to run different types of frameworks on one architecture which accesses one HDFS, because multiple clusters themselves impose complexity and inefficiency in data management (this should be achieved by Hadoop 2.x and YARN)
Recent architecture for massive data processing
Spark and other data frameworks collaborate with each other.
(Diagram: messaging layers (RabbitMQ, Kafka, Flume, Fluentd) carry dataflow from outside services into Hadoop (YARN, HDFS); Storm and Spark Streaming handle stream processing, MapReduce with Hive and Pig handles batch, Spark handles on-memory processing, and HBase serves as KVS/cache; results go to outside services and visualization services.)
Spark assumes the role of rapid on-memory batch processing.
I will not speak about Spark internals
I'm sorry, but I will not speak about Spark internals. For material in Japanese, search Google for "hadoop ソースコードリーディング spark" ("hadoop source code reading spark").
The evaluation of Spark on YARN
Four essential points we wanted to evaluate
Basic viewpoint: understand the basic characteristics of scale-out, especially with TBs and tens of TBs of data.

# | Point we wanted to evaluate | App used for evaluation
1 | Capability to process tens of TBs of data without an unpredictable decrease of performance or an unexpected halt | WordCount
2 | Keep reasonable performance when data is bigger than the total memory available for caching | SparkHdfsLR (logistic regression)
3 | Keep reasonable performance of the shuffle process with tens of TBs of data | GroupByTest (large shuffle process)
4 | Easy to implement multi-stage jobs (from our business use case) | PoC of a certain project

The applications we used are slightly modified versions of the official Spark examples.
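For reference, the WordCount application is close to the following sketch of the official example; the input/output paths are illustrative, not the ones used in our runs.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// A sketch close to the official Spark WordCount example.
// Paths are illustrative placeholders.
object WordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("WordCount"))

    val lines = sc.textFile("hdfs:///benchmark/input")  // tens of TBs of text
    val counts = lines
      .flatMap(_.split("\\s+"))    // map side: split each line into words
      .map(word => (word, 1L))
      .reduceByKey(_ + _)          // reduce side: shuffle data is small here,
                                   // because counts are pre-aggregated per mapper
    counts.saveAsTextFile("hdfs:///benchmark/output")
    sc.stop()
  }
}
```

Because `reduceByKey` combines counts on the map side, the shuffle carries far less data than the input; this is why the workload is mapper-bound (see Point 1).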
The specification of the cluster
Per-server hardware:
- CPU: E5-2620 (6 cores × 2 sockets)
- Memory: 64 GB, 1.3 GHz
- NW interface: 10GBase-T × 2 ports (bonding)
- Disk: 3 TB SATA 6 Gb/s, 7200 rpm
Network: 10G to the top-of-rack switch, 10G to the core switch.
Software stack: CentOS 6.5, HDFS & YARN (CDH 5.0.1), Spark 1.0.0.
Total cluster size: 4k+ cores, 10 TB+ RAM.
Point 1: Process time of WordCount
(Chart: process time vs. input data size; a vertical line marks the total heap size of the executors.)
We ran tests twice per data size; the OS buffer cache was cleared before each execution started.
The process time seemed to be linear in the input data size. We found reasonable performance even when all of the data could not be held in memory.
Point 1: Resource usage of a certain slave node (input data: 27 TB)
(Charts: CPU usage (blue: user, green: system); network usage (red: in, black: out); disk I/O (black: read, pink: write).)
Map phase: 448 s; reduce phase: 98 s. Disk usage was about 400 MB/s per server. Because of data locality there was very little network I/O; this is the ideal situation.
Point 1: Summary
WordCount's performance depends on the mappers; the reducers are unlikely to be the bottleneck, because the mappers output only small data.
On this task, we confirmed reasonable performance even when the input data exceeded the total amount of memory.
When tasks had data locality, we observed stable throughput (i.e., time vs. data processed). I will talk later about a case in which tasks lost locality.
Point 2: Process time of SparkHdfsLR
We ran the logistic regression with 3 cycles of computation and found an ideal difference between cycles: the difference in process time between cycle 1 and the later cycles seemed to depend on the amount of cached data.
Available cache size per server: 16 GB × 3 × 0.6 ≈ 26 GB.
(Chart annotations: runs with 100%, 73%, 33%, and 16% of the data cached. The cache is effective even if the executors cannot hold all of the data; when the cache overflows, both memory and disk are used.)
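The cycle-to-cycle difference comes from RDD caching: cycle 1 pays the cost of reading HDFS and filling the cache, while later cycles read from the cache. A minimal sketch of the pattern (paths and the aggregate computed are illustrative; the MEMORY_AND_DISK storage level shown here is one way to let overflow partitions go to local disk instead of being recomputed):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object IterativeSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("IterativeSketch"))

    // Parse once, then cache; partitions that do not fit in memory
    // overflow to local disk rather than being recomputed each cycle.
    val points = sc.textFile("hdfs:///benchmark/lr-input")
      .map(_.split(' ').map(_.toDouble))
      .persist(StorageLevel.MEMORY_AND_DISK)

    for (cycle <- 1 to 3) {
      // Cycle 1 reads HDFS and fills the cache;
      // cycles 2 and 3 read from memory (and disk for the overflow).
      val sum = points.map(_.sum).reduce(_ + _)
      println(s"cycle $cycle: $sum")
    }
    sc.stop()
  }
}
```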
Point 2: Resource usage of a certain slave node
(Charts: CPU usage (blue: user, green: system); network usage (red: in, black: out); disk I/O (black: read, pink: write).)
With 8 GB of input per server, cycle 1 took 600 s and the later cycles took 85 to 89 s; the cache mechanism prevented disk usage.
With 16 GB of input per server, cycle 1 took 1229 s and the later cycles took 297 to 316 s; part of the data used disk (visible as I/O wait), because the cache could not hold all of it.
Point 2: Summary
The cache mechanism of Spark worked well for iterative applications.
The RDD cache mechanism works consistently, and enhances throughput even while the amount of input data is bigger than the total memory available for caching.
But we found that it is important to minimize boxing overhead when storing data objects into the cache.
Point 3: Process time of GroupByTest
We did not find drastic changes in process time. At some point the fetched shuffle data started spilling to disk, but this had no impact on the total elapsed time.
In this test, the mappers' output does not decrease compared with their input, so a large exchange of data occurs in the shuffle process.
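GroupByTest generates random key-value pairs so that the map output is as large as the map input, which forces a full-size shuffle. A hedged sketch of that shape, close to the official example (the sizes are illustrative, not the parameters of our runs):

```scala
import java.util.Random
import org.apache.spark.{SparkConf, SparkContext}

object GroupBySketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("GroupBySketch"))
    val numMappers  = 1000     // illustrative sizes, not the actual
    val numKVPairs  = 100000   // parameters of our tests
    val valSize     = 1000
    val numReducers = numMappers

    val pairs = sc.parallelize(0 until numMappers, numMappers).flatMap { _ =>
      val ranGen = new Random
      // Each mapper emits as many bytes as it generates: unlike WordCount,
      // nothing is aggregated before the shuffle.
      Array.fill(numKVPairs) {
        val bytes = new Array[Byte](valSize)
        ranGen.nextBytes(bytes)
        (ranGen.nextInt(Int.MaxValue), bytes)
      }
    }
    println(pairs.groupByKey(numReducers).count())
    sc.stop()
  }
}
```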
Point 3: Network usage of a certain slave node
(Charts: network resource usage of certain slave nodes while running various test patterns, from job start to job finish.)
(1) Small shuffle, no spill, within one rack: 500 to 600 MB/s during the shuffle phase (a trial to confirm the 10G bandwidth; CPU utilization was 100% at this time).
(2) Large shuffle with spill, within one rack: 100 to 200 MB/s during the shuffle phase.
(3) Large shuffle with spill, across the cluster: 100 to 120 MB/s during the shuffle phase (the weak point).
Actually, the throughput seemed to be restricted by disk I/O as well as network I/O. This is typical of shuffle tests whose mappers generate large data.
Point 3: Summary
The process time seemed to be linear in the input size of the shuffle.
When the shuffle data spills to disk, disk access competes among shuffle-related tasks such as ShuffleMapTask (write) and Fetcher (read), and this competition deteriorates the performance. This situation may be improved by activities like SPARK-2044.
Additionally, Java heap tuning is difficult, because executors keep a variety of buffers and cached data in the same heap. We met an OOM in the worst case. I will talk about it later, if time permits.
But actually, we should also care about skew.
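In Spark 1.0, the cache (MemoryStore) and the in-memory shuffle buffers share the executor heap, split by fraction settings. A sketch of the knobs involved; the values below are illustrative, not tuned recommendations:

```scala
import org.apache.spark.SparkConf

// Spark 1.0-era heap-related settings. Cached RDD blocks and shuffle
// buffers live in the same executor heap, divided by these fractions;
// mis-sizing either share is one way to hit an executor OOM.
val conf = new SparkConf()
  .set("spark.executor.memory", "16g")          // heap per executor
  .set("spark.storage.memoryFraction", "0.6")   // share usable for cached RDDs
  .set("spark.shuffle.memoryFraction", "0.2")   // share for in-memory shuffle buffers
  .set("spark.shuffle.spill", "true")           // spill shuffle data to disk instead of OOM
```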
Point 4: Example application of the PoC
We categorized the existing Hadoop applications of a certain project and made mock applications which represent the project's major business logic.
This is a multi-stage job which includes computations that differ from application to application. It resembles log analysis that finds the features of web users.
Example of the process flow:
(1) Distribute + Sort, which groups by different categories
(2) Compute / (3) Compute
(4-1) Group + Count / (4-2) Group + Count
(5-1) Join + Group + Sum / (5-2) Join + Group + Sum
Point 4: Lessons learned from this PoC
Tips:
- Use the cache mechanism efficiently (today's topic)
- Prevent skew of task allocation at the start (today's topic)
- Prevent too-large partition sizes
Practices:
- Heap tuning: use RDDs to manage data rather than your own arrays
- Implementation of DISTRIBUTE BY
Issues:
- Missing data locality of tasks
Use the cache mechanism efficiently
We can use the cache mechanism efficiently by minimizing the size of the objects stored in the MemoryStore, the data store of the cache mechanism.
Convenience and data-size efficiency may be in a trade-off relationship, but, for example, Scala's "implicit conversion" can help in certain cases: instead of caching a rich data format (like a case class) and using simple functions, cache a simple data format (Byte, Int, …) and use rich functions. You can still use a variety of functions to be sent to the slave nodes.
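For example, instead of caching an RDD of case-class instances (each a boxed object with header overhead), one can cache primitive tuples and recover the convenient interface through an implicit conversion at the point of use. A sketch under assumed field names (Access, userId, bytes, isHeavy are all hypothetical):

```scala
// Rich format: convenient, but each cached element is a boxed object.
case class Access(userId: Int, bytes: Long)

// Simple format: cache plain (Int, Long) tuples, and recover the rich
// interface with an implicit conversion when the data is used.
implicit class RichAccess(val t: (Int, Long)) extends AnyVal {
  def userId: Int = t._1
  def bytes: Long = t._2
  def isHeavy: Boolean = t._2 > 1000000L
}

// An RDD[(Int, Long)] is cached instead of an RDD[Access]:
//   rdd.cache()
//   rdd.filter(_.isHeavy).count()   // reads like the case-class version
```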
Prevent skew of task allocation at the start (1)
It takes a while to start all of the containers when we run large jobs on a large YARN cluster. In this case, the allocation of tasks starts before all containers are available, so some tasks are allocated to non-data-local executors.
Our workaround was to insert a short sleep right after creating the SparkContext:

val sc = new SparkContext(sparkConf)
Thread.sleep(sleeptime)

This reduces the total processing time as a result, but it is really just a workaround. Ultimately, a threshold for starting task allocation should be implemented; for example, the percentage of containers ready for use may be useful for this purpose. (You can also use the spark.locality.wait parameters.)
Prevent skew of task allocation at the start (2)
Example of a slave node's resource usage (input data: 27 TB)
(Charts: CPU usage (blue: user, green: system); network usage (red: in, black: out); disk I/O (black: read, pink: write).)
Map phase: 609 s; reduce phase: 103 s. Because of the lack of locality, data transfer over the network occurred.
Future work and conclusion
Future work
Find a good collaboration between Spark and YARN. Here are some issues to be resolved:
- Overhead of starting containers
- Avoiding skew of task allocation when starting applications
- If I/O resource management becomes available in the future, it will enable rigorous resource management
Ensure traceability from a statement of an application down to the Spark framework; this is useful for performance tuning and debugging.
Conclusion
Expectation 1: Scalably process tens of TBs of data without an unpredictable decrease of performance or an unexpected halt.
Impression: Good! …but we need some techniques for scale-out.
Expectation 2: Keep reasonable performance when data is bigger than the total memory available for caching.
Impression: Good! …but we need some techniques to use the cache efficiently.
Expectation 3: Capability to run an application on YARN.
Impression: We are evaluating it now, and this area is still under development.
Spark is a young product and has some issues to be solved, but these issues should be resolved by the great community members. We will contribute too!
If you have any questions, I can talk with you in the lobby individually.
Application of postgre sql to large social infrastructure jpApplication of postgre sql to large social infrastructure jp
Application of postgre sql to large social infrastructure jp
 
Application of postgre sql to large social infrastructure
Application of postgre sql to large social infrastructureApplication of postgre sql to large social infrastructure
Application of postgre sql to large social infrastructure
 
商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと
 
今からはじめるPuppet 2016 ~ インフラエンジニアのたしなみ ~
今からはじめるPuppet 2016 ~ インフラエンジニアのたしなみ ~今からはじめるPuppet 2016 ~ インフラエンジニアのたしなみ ~
今からはじめるPuppet 2016 ~ インフラエンジニアのたしなみ ~
 
Hadoopエコシステムの最新動向とNTTデータの取り組み (OSC 2016 Tokyo/Spring 講演資料)
Hadoopエコシステムの最新動向とNTTデータの取り組み (OSC 2016 Tokyo/Spring 講演資料)Hadoopエコシステムの最新動向とNTTデータの取り組み (OSC 2016 Tokyo/Spring 講演資料)
Hadoopエコシステムの最新動向とNTTデータの取り組み (OSC 2016 Tokyo/Spring 講演資料)
 

Spark 1.0 evaluation: expectations for Spark from the viewpoint of Hadoop users and developers (Hadoop Conference Japan 2014)

  • 6. Hadoop has become the base of data processing
We started to use Hadoop 6 years ago. Hadoop enabled us to process massive data daily and hourly.
[Diagram: the system image 6 years ago; daily/hourly batches with Hive and Pig on Hadoop (MapReduce, HDFS), fed by a data loader from an outer system]
  • 7. Demands for data processing have diversified
We need to handle a variety of requirements for data processing: both high throughput and low latency, plus APIs useful for data analysis. This should be achieved by Spark.
We also want to keep data management simple by running different types of frameworks on one architecture that accesses one HDFS, because multiple clusters impose complexity and inefficiency in data management. This should be achieved by Hadoop 2.x and YARN.
  • 8. Recent architecture for massive data processing
Spark and other data frameworks collaborate with each other; Spark assumes the role of rapid on-memory batch processing.
[Diagram: dataflow from messaging layers (RabbitMQ, Kafka, Flume, Fluentd) and outside services into Hadoop (YARN, HDFS); Storm and Spark Streaming handle stream processing, MapReduce, Hive and Pig handle batch, Spark handles on-memory processing, HBase serves as KVS/cache, and outside visualization services consume the results]
  • 9. I will not speak about Spark internals
I'm sorry, but Spark internals are out of scope for this talk. For material in Japanese, please google “hadoop ソースコードリーディング spark”.
  • 10. The evaluation of Spark on YARN
  • 11. Four essential points we wanted to evaluate
Basic viewpoint: understand the basic characteristics of scale-out, especially with TBs and tens of TBs of data.
1. Capability to process tens of TBs of data without unpredictable performance decrease or unexpected stalls (app: WordCount)
2. Keep reasonable performance when the data is bigger than the total memory available for caching (app: SparkHdfsLR, logistic regression)
3. Keep reasonable shuffle performance with tens of TBs of data (app: GroupByTest, a large shuffle process)
4. Ease of implementing multi-stage jobs, drawn from our business use case (app: PoC of a certain project)
The applications we used are slightly modified versions of the official Spark examples.
  • 12. The specification of the cluster
CPU: E5-2620, 6 cores x 2 sockets
Memory: 64GB (1.3GHz)
Network: 10GBase-T x 2 ports (bonded); 10G links to the top-of-rack switch and the core switch
Disk: 3TB SATA 6Gb/s, 7200rpm
Software stack: CentOS 6.5, HDFS & YARN (CDH 5.0.1), Spark 1.0.0
Total cluster size: 4k+ cores, 10TB+ RAM
  • 13. Point 1: process time of WordCount
We ran each test twice per data size, clearing the OS buffer cache before each execution. Plotted against the total heap size of the executors, the process time appeared to be linear in input data size: we found reasonable performance even when the data could not all be held in memory or cache.
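The WordCount used here is a slightly modified version of the official Spark example. A minimal sketch of that shape follows; the object name, paths, and arguments are illustrative, not the exact benchmark code. Note that in Spark 1.0 the `SparkContext._` import was needed to get pair-RDD operations such as `reduceByKey`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._  // pair-RDD functions (required in Spark 1.0)

// Minimal WordCount in the style of the official Spark example.
// args(0)/args(1) are HDFS input/output paths (placeholders).
object WordCountSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("WordCountSketch"))

    val counts = sc.textFile(args(0))     // e.g. a directory holding tens of TBs
      .flatMap(_.split("\\s+"))           // map side: split lines into words
      .map(word => (word, 1L))
      .reduceByKey(_ + _)                 // map-side combining keeps reduce input small

    counts.saveAsTextFile(args(1))
    sc.stop()
  }
}
```

Because `reduceByKey` combines partial counts on the map side, the reduce phase receives only small data, which matches the resource profile on the next slide.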
  • 14. Point 1: resource usage of a certain slave node (input data: 27TB)
[Graphs: CPU usage (blue: user, green: system), network usage (red: in, black: out), disk I/O (black: read, pink: write). Map phase: 448s; Reduce phase: 98s. Disk throughput was about 400MB/s per server. Because of data locality there was little network I/O; this is the ideal situation.]
  • 15. Point 1: summary
WordCount's performance depends on the mappers; the reducers are unlikely to be the bottleneck because the mappers output only small data. In this task we confirmed reasonable performance even when the input data exceeded the total amount of memory. When tasks had data locality, we observed stable throughput (time vs. data processed). I will talk later about a case in which tasks lost locality.
  • 16. Point 2: process time of SparkHdfsLR
We ran logistic regression with 3 cycles of computation and found the expected difference between cycles: the gap between cycle 1 and the later cycles depended on the amount of cached data (available cache size per server: 16GB * 3 * 0.6 = 26GB).
[Chart annotations: with 100% of the data cached, the cache is fully effective; at 73%, 33% and 16% cached, Spark uses both memory and disk, and the cache is still effective even though the executors cannot hold all of the data.]
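The speed-up measured here comes from caching the parsed input RDD across iterations. A hedged sketch of that pattern, not the exact SparkHdfsLR code: the parsing logic and feature dimension are made up, while `StorageLevel.MEMORY_AND_DISK` mirrors the "using memory and disk" behaviour annotated on the chart.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Sketch of the iterative-caching pattern behind SparkHdfsLR (Spark 1.0 API).
object IterativeLRSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("IterativeLRSketch"))

    // Parse once and cache; later cycles read from cache, spilling to disk
    // only for partitions that do not fit in memory.
    val points = sc.textFile(args(0))
      .map { line =>
        val t = line.split(' ').map(_.toDouble)
        (t.tail, t.head)                       // (features, label); layout is illustrative
      }
      .persist(StorageLevel.MEMORY_AND_DISK)

    var w = Array.fill(10)(0.0)                // illustrative dimension
    for (_ <- 1 to 3) {                        // 3 cycles, as in the test above
      val grad = points.map { case (x, y) =>
        val dot = (w, x).zipped.map(_ * _).sum
        x.map(_ * (1.0 / (1.0 + math.exp(-y * dot)) - 1.0) * y)
      }.reduce((a, b) => (a, b).zipped.map(_ + _))
      w = (w, grad).zipped.map(_ - _)
    }
    sc.stop()
  }
}
```

Only the first cycle pays the HDFS read and parse cost; that is the gap between cycle 1 and the later cycles seen above.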
  • 17. Point 2: resource usage of a certain slave node
[Graphs: CPU usage (blue: user, green: system), network usage (red: in, black: out), disk I/O (black: read, pink: write).
With 8GB of input per server, the cache mechanism prevented disk usage; the three cycles took 600s, 85s and 89s.
With 16GB of input per server, part of the data used the disk because the cache could not hold it all, causing I/O wait; the three cycles took 1229s, 297s and 316s.]
  • 18. Point 2: summary
The cache mechanism of Spark worked well for iterative applications. RDD caching works consistently and enhances throughput even while the amount of input data is bigger than the total memory available for caching. However, we found that it is important to minimize boxing overhead when storing data objects in the cache.
  • 19. Point 3: process time of GroupByTest
We did not find any drastic change in process time: fetched shuffle data started spilling to disk, but with no impact on total elapsed time. In this test the mappers' output is not reduced relative to their input, so a large data exchange has to occur in the shuffle process.
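GroupByTest exercises exactly this kind of shuffle: each mapper emits random key-value pairs as large as its input, and `groupByKey` moves essentially all of it across the network. A sketch of the core (all sizes are illustrative parameters, not the benchmark's actual values):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._  // pair-RDD functions (required in Spark 1.0)

// Core of a GroupByTest-style shuffle workload (Spark 1.0 API).
object GroupBySketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("GroupBySketch"))
    val numMappers = 1000; val numKVPairs = 100000
    val valSize = 1000; val numReducers = 1000

    val pairs = sc.parallelize(0 until numMappers, numMappers).flatMap { _ =>
      val rand = new java.util.Random
      Array.fill(numKVPairs) {
        val value = new Array[Byte](valSize)  // map output is as large as map input
        rand.nextBytes(value)
        (rand.nextInt(Int.MaxValue), value)
      }
    }
    // groupByKey performs no map-side reduction, so the whole dataset is shuffled
    println(pairs.groupByKey(numReducers).count())
    sc.stop()
  }
}
```

Unlike `reduceByKey` in the WordCount case, `groupByKey` cannot combine on the map side, which is why shuffle-data spilling shows up here.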
  • 20. Point 3: network usage of a certain slave node
We ran a variety of shuffle patterns to try to reach the 10G network bandwidth. [Graphs of network usage from job start to job finish, shuffle phase highlighted: (1) a small shuffle with no spill within one rack reached 500-600MB/s (CPU utilization was 100% at this point); (2) a large shuffle with spill within one rack reached only 100-200MB/s; (3) a large shuffle with spill across the cluster reached 100-120MB/s.]
Actually, throughput seemed to be restricted by disk I/O as well as network I/O. This is typical of shuffle tests whose mappers generate large data.
  • 21. Point 3: summary
The process time seemed to be linear in the shuffle input size. When shuffle data spills to disk, disk access competes among shuffle-related tasks such as ShuffleMapTask (write) and Fetcher (read), and the contention degrades performance. This situation may be improved by activities like SPARK-2044. Additionally, Java heap tuning is difficult because executors keep a variety of buffers and cached data in the same heap; in the worst case we hit an OutOfMemoryError. I will talk about it later if time permits. But actually, we should also care about skew.
  • 22. Point 4: example application of the PoC
We categorized the existing Hadoop applications of a certain project and made mock applications representing its major business logic. The example process flow is a multi-stage job: (1) distribute + sort, (2) compute, (3) compute, (4-1)(4-2) group + count, grouped by different categories, (5-1)(5-2) join + group + sum. The computations differ from application to application, and the job resembles log analysis for finding the features of web users.
  • 23. Point 4: lessons learned from this PoC
Tips:
- Use the cache mechanism efficiently (today's topic)
- Prevent skew of task allocation at start (today's topic)
- Prevent too-large partition sizes
- Practices for heap tuning
- Use RDDs to manage data rather than your own arrays
- Practices for implementing DISTRIBUTE BY
Issues:
- Missing data locality of tasks (today's topic)
  • 24. Use the cache mechanism efficiently
We can use the cache mechanism efficiently by minimizing the objects stored in MemoryStore, the data store behind the cache mechanism. Convenience and data-size efficiency may be a trade-off, but Scala's “implicit conversion” can help in certain cases: instead of caching a rich data format (like a case class) and sending simple functions, cache a simple data format (Byte, Int, ...) and recover rich functions via implicit conversion, so you can still use a variety of functions sent to the slave nodes.
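One way to realize the "simple data format + rich function" side of this trade-off is an implicit conversion that wraps the primitive only when a method is called. The sketch below is a made-up illustration (the event-packed-into-a-Long layout is not from the PoC); a value class avoids allocating a wrapper object per call.

```scala
import scala.language.implicitConversions

// Cache compact primitives, recover rich behaviour via implicit conversion.
// The "user event packed into a Long" layout is a hypothetical example.
class PackedEvent(val bits: Long) extends AnyVal {
  def userId: Int = (bits >>> 32).toInt         // high 32 bits: user id
  def action: Int = (bits & 0xffffffffL).toInt  // low 32 bits: action code
}

object PackedEvent {
  implicit def longToEvent(bits: Long): PackedEvent = new PackedEvent(bits)

  def pack(userId: Int, action: Int): Long =
    (userId.toLong << 32) | (action.toLong & 0xffffffffL)
}
```

With `import PackedEvent._` in scope, an `RDD[Long]` (no per-record object headers in MemoryStore) can still be processed with readable code on the slave side, e.g. `events.filter(e => e.action == 1).map(_.userId)`.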
  • 25. Prevent skew of task allocation at start (1)
On a large YARN cluster, it takes a while to start all of the containers for a large job. Task allocation therefore starts before all containers are available, and some tasks get allocated to non-data-local executors. Our workaround was to insert a short sleep:
val sc = new SparkContext(sparkConf)
Thread.sleep(sleeptime)
This reduced total processing time as a result, but it is really just a workaround. Ultimately, a threshold for starting task allocation should be implemented; for example, the percentage of containers ready for use may be useful for this purpose. (You can also use the spark.locality.wait parameters.)
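Putting the workaround and the `spark.locality.wait` knobs together, a sketch of the driver-side setup follows. The wait values and sleep time are illustrative, not recommendations; in Spark 1.0 these properties were plain millisecond numbers.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: give the scheduler more time to wait for data-local executors
// before falling back to rack-local / any. Values are illustrative.
object LocalityAwareDriver {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
      .setAppName("LocalityAwareJob")
      .set("spark.locality.wait", "10000")       // ms; default was 3000 in Spark 1.0
      .set("spark.locality.wait.node", "10000")  // node-local wait, also in ms

    val sc = new SparkContext(sparkConf)
    val sleeptime = 60 * 1000L                   // illustrative; tune to container startup
    Thread.sleep(sleeptime)                      // workaround: let containers register
                                                 // before tasks are allocated
    // ... job body ...
    sc.stop()
  }
}
```

The sleep attacks the root cause (containers not yet registered), while the wait parameters only delay the locality downgrade per task; in practice the two can be combined.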
  • 26. Prevent skew of task allocation at start (2)
Example of a slave node's resource usage (input data: 27TB).
[Graphs: CPU usage (blue: user, green: system), network usage (red: in, black: out), disk I/O (black: read, pink: write). Map phase: 609s; Reduce phase: 103s. Because of the lack of locality, network data transfer occurred.]
  • 27. Future work and conclusion
  • 28. Future work
Find good collaboration between Spark and YARN. Issues to be resolved:
- Overhead of starting containers
- Avoiding skew of task allocation when starting applications
- I/O resource management: if it becomes available in the future, it will enable rigorous resource control
- Ensuring traceability from a statement in an application down to the Spark framework, for performance tuning and debugging
  • 29. Conclusion
Expectation 1: scalable processing of tens of TBs of data without unpredictable performance decrease or unexpected stalls. Impression: Good! ...but we need some techniques for scale-out.
Expectation 2: reasonable performance when the data is bigger than the total memory available for caching. Impression: Good! ...but we need some techniques to use the cache efficiently.
Expectation 3: capability to run an application on YARN. Impression: we are evaluating it now, and it is still under development.
  • 30. Spark is a young product and has some issues to be solved, but these should be resolved by the great community members. We will also contribute! If you have any questions, I can talk with you in the lobby individually.