Scaling Spark – Vertically: The mantra of Spark technology is divide and conquer, especially for problems too big for a single computer. The more you divide a problem across worker nodes, the more total memory and processing parallelism you can exploit. But this comes with a trade-off: splitting applications and data across multiple nodes is nontrivial, and more distribution means more network traffic, which becomes a bottleneck. Can you achieve scale and parallelism without those costs?
We’ll show results from a variety of Spark application domains, including structured data, graph processing, and common machine learning workloads, on a single high-capacity scaled-up system versus a more distributed approach, and discuss how virtualization can be used to define node size flexibly to achieve the best balance for Spark performance.
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
1. SCALE | SIMPLIFY | OPTIMIZE | EVOLVE
Comparing a Virtual Supercomputer with a Cluster for Spark In-Memory Computations
Ike Nassi
Ike.nassi@tidalscale.com
2. Why Run Spark?
Spark originated as an in-memory alternative to Hadoop
Run huge analytics on clusters of commodity servers
Enjoy the hardware economy of “scale-out”
Apply a rich set of transformations and actions
Operate out of memory as much as possible
3. Today’s Conundrum: Scale Up vs. Scale Out?
                      Scale Up   Scale Out
Software Simplicity      ✔           ✗
HW Cost                  ✗           ✔
4. TidalScale – The Best of Both
Software Simplicity   HW Cost
         ✔               ✔
Easy to say, but this is a ridiculously difficult problem!
5. Key takeaways
Simplicity of scale-up:
• We allow the simplicity of scale-up – you can run multi-terabyte analytics on a single Spark node.
Scale-out “under the hood”:
• We offer a new class of virtual supercomputers to host Spark – we hide the complexity of scale-out “under the hood”.
6. Traditional Spark in two layers
Programming Paradigm
RDD – Resilient Distributed Dataset / DataFrame
Parallel in-memory execution
Lazy, repeatable evaluation thanks to recorded lineage (narrow and “wide” dependencies)
Rich set of operators beyond just Map-Reduce
Implementation Plumbing
Clusters – standalone, Mesos, YARN
Data – HDFS, DataFrames
Memory management
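To make the paradigm layer concrete, here is a minimal, hedged Scala sketch of the style of program the bullets above describe: lazy transformations recorded as lineage, executed only when an action runs. The input path and word-count logic are illustrative, not part of the original deck.

// Sketch of the RDD programming paradigm: transformations are lazy and
// recorded as lineage; nothing runs until an action forces evaluation.
// The input path is hypothetical.
import org.apache.spark.{SparkConf, SparkContext}

object ParadigmSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("paradigm-sketch"))

    val counts = sc.textFile("hdfs:///data/events.txt")   // hypothetical dataset
      .flatMap(_.split("\\s+"))                           // transformation (narrow)
      .map(word => (word, 1))                             // transformation (narrow)
      .reduceByKey(_ + _)                                 // wide dependency: needs a shuffle

    // The action triggers evaluation of the whole lineage, in memory where possible.
    counts.take(10).foreach(println)

    sc.stop()
  }
}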
7. Alternative Spark in two layers
Programming Paradigm
RDD – Resilient Distributed Dataset / DataFrame
Parallel in-memory execution
Lazy, repeatable evaluation thanks to recorded lineage (narrow and “wide” dependencies)
Rich set of operators beyond just Map-Reduce
TidalScale as alternate plumbing!
8. Today’s Spark cluster with multiple nodes
[Diagram: the Spark application runs on a manager node (cluster manager on its own operating system and hardware) and dispatches work to executors on worker nodes, each with its own OS and hardware.]
9. Virtual Supercomputer running Spark
[Diagram: the Spark application, cluster manager, and a single standard operating system (Linux, FreeBSD, or Windows) run on top of a layer of HyperKernels spanning the physical servers. The application draws from a pool of processors and JVMs in a single coherent memory space; the OS sees a collection of cores, disks, and networks in one huge address space.]
10. A tale of two approaches
Feature                        Scale out under the hood              Scale out with worker nodes
Organization                   One super-node                        Cluster of worker nodes
Cross-connect                  10Gb Ethernet                         TCP/IP
Shared variables and shuffle   Across JVMs in one address space      Across distinct nodes
RDD partitioning               See shuffle                           See shuffle
Scale out                      Add servers “under the hood”          Add servers to the cluster
Scale up                       Scale-out creates a bigger computer   None
Reuse                          Run any application                   Other cluster techs like Hadoop
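The “shared variables and shuffle” row is the key performance difference between the two approaches, so here is a short, hedged Scala sketch of the two mechanisms it refers to: a broadcast variable shared across executor JVMs, and a shuffle that moves records between partitions. The data and names are illustrative only.

// Illustration of the two cross-partition mechanisms named in the table:
// broadcast variables and shuffles. Data is made up.
import org.apache.spark.{SparkConf, SparkContext}

object ShareAndShuffleSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("share-and-shuffle"))

    // Broadcast: one read-only copy per executor JVM. In one coherent address
    // space this stays local; across worker nodes it crosses the network.
    val weights = sc.broadcast(Map("a" -> 0.5, "b" -> 1.5))

    val scored = sc.parallelize(Seq(("a", 2.0), ("b", 3.0), ("a", 4.0)))
      .map { case (k, v) => (k, v * weights.value.getOrElse(k, 1.0)) }

    // Shuffle: reduceByKey repartitions by key, moving records between partitions.
    scored.reduceByKey(_ + _).collect().foreach(println)

    sc.stop()
  }
}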
11. Experiment Setup
SynthBenchmark from the Apache Spark examples (Apache.org)
• git://git.apache.org/spark.git (spark-1.6.1-bin-hadoop2.6.tgz)
• Applies the PageRank algorithm to a generated graph
• Benchmark scaled from 15GB to 150GB by varying the number of vertices
Scale Out Spark Configuration on EC2:
• 1 Master: EC2 r3.2xlarge (8 vCPUs, 61GB)
• 5 Workers: EC2 r3.xlarge (4 vCPUs, 28.5GB)
• 4 Intel E5-2670 CPUs × 5 servers = 20 CPUs total allocated to Spark
Scale Up Spark Configuration on TidalScale:
• TidalScale TidalPod with 5 nodes
• 20 Intel E5-2643 v3 CPUs allocated to Spark
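For orientation, the following is a hedged Scala/GraphX sketch of the shape of the benchmarked workload: generate a synthetic log-normal graph and run a fixed number of PageRank iterations. It is not the SynthBenchmark source itself; the vertex count, iteration count, and partition count are placeholders.

// Sketch only: synthetic graph + PageRank, approximating the benchmarked job.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.PartitionStrategy
import org.apache.spark.graphx.util.GraphGenerators

object PageRankSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("pagerank-sketch"))

    // numEParts is the "edge partitions" knob discussed on the next slide.
    val graph = GraphGenerators
      .logNormalGraph(sc, numVertices = 1000000, numEParts = 10)
      .partitionBy(PartitionStrategy.EdgePartition2D)
      .cache()

    // Fixed-iteration PageRank; the action at the end forces evaluation.
    val ranks = graph.staticPageRank(numIter = 10).vertices
    println(s"Rank sum: ${ranks.map(_._2).sum()}")

    sc.stop()
  }
}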
13. Experiment Setup
* Note: The number of edge partitions in this example has been set to a fixed constant across all workload sizes for illustrative purposes. Normal practice is to vary the number of edge partitions based on workload size.
                         A: EC2 Small   B: EC2 Big   C: TidalScale Big   D: TidalScale Bigger
Cluster Configuration    10 nodes       5 nodes      1 TidalPod          1 TidalPod
Edge partitions *        10             10           10                  20
Spark.worker.instances   10             5            1                   1
Spark.worker.cores       20             20           20                  20
Spark Memory per Node    10GB           28GB         140GB               300GB
Total Spark Memory       100GB          140GB        140GB               300GB
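As a rough illustration (not from the deck) of how a configuration like column B could be expressed on the application side, the sketch below uses the standard spark.executor.memory and spark.cores.max properties; the worker-instance and worker-core counts in the table are normally set on the standalone workers themselves, and the values here are placeholders.

// Illustrative only: application-side settings approximating configuration B.
import org.apache.spark.{SparkConf, SparkContext}

object ConfigSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("pagerank-config-sketch")
      .set("spark.executor.memory", "28g") // per-node Spark memory, as in column B
      .set("spark.cores.max", "20")        // cap on total cores claimed by the app

    val sc = new SparkContext(conf)
    // ... build the synthetic graph and run PageRank as in the earlier sketch ...
    sc.stop()
  }
}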
17. Experiment Observations
Tuning Spark is complex
• We spent most of our time tuning Spark parameters
• We are not sure we’ve tuned optimally for either the EC2 distributed Spark cluster or the TidalScale standalone case, but the parameters were the same in both
Choice of the number of data partitions really matters
• A suboptimal choice can have 2-3x performance impact
• We used 10 edge partitions for both the EC2 and TidalScale configurations (see the sketch below)
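A hedged Scala sketch of the knob these bullets describe: the same aggregation run with the partition count passed explicitly to the shuffle. The counts and data are made up; the point is only that the value is a deliberate choice with a real performance impact.

// Illustration of choosing the number of partitions for a shuffle.
import org.apache.spark.{SparkConf, SparkContext}

object PartitionChoiceSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("partition-choice"))

    val pairs = sc.parallelize(1 to 1000000).map(i => (i % 1000, 1L))

    // Too few partitions limits parallelism; too many adds scheduling and
    // shuffle overhead. Here the count is passed explicitly to the shuffle.
    val coarse = pairs.reduceByKey(_ + _, numPartitions = 10)
    val fine   = pairs.reduceByKey(_ + _, numPartitions = 200)

    println(s"coarse partitions: ${coarse.getNumPartitions}, fine: ${fine.getNumPartitions}")
    sc.stop()
  }
}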
18. Possible mixed model with multi-terabyte manager
[Diagram: a mixed model. Conventional worker nodes (each with its own OS and hardware) run executors, while the Spark application and cluster manager run on a multi-terabyte “super manager”: a single operating system hosted by HyperKernels spanning several physical servers.]
19. Conclusions & Recommendations
Spark standalone on TidalScale performs similarly to a cluster
Without TidalScale, larger workloads can run out of memory without careful Spark tuning
We recommend using both scale up and scale out
20. Key messages – more obvious now?
A new class of virtual supercomputers to host Spark
Run multi-terabyte analytics on a single Spark node
21. Value Proposition
Scale:
• Aggregates compute resources for large scale in-memory analysis and decision support
• Scales like a cluster using commodity hardware at linear cost
• Allows customers to grow gradually as their needs develop
Simplify:
• Dramatically simplifies application development
• No need to distribute work across servers
• Existing applications run as a single instance, without modification, as if on a highly flexible mainframe
Optimize:
• Automatic dynamic hierarchical resource optimization
Evolve:
• Applicable to modern and emerging microprocessors, memories, interconnects, persistent storage & networks
I’m here to present a different approach to large-scale, in-memory computations.
Ike’s contact is on the last slide
With a little more time (25 mins.) we can set the context and thereby frame the discussion.
Scale-out – not much for me to add to Ike…
Spark has some 80 transformations (within a partition) and actions (often across partitions) that greatly enhance the original MapReduce of tools like Hadoop
It may seem sacrilegious (you’ll find your word) to address a group of Spark enthusiasts on the theme of a single huge node, but it’s a different way of thinking about the method
The remainder of this talk is about this different approach
…which we claim is completely in line with Spark’s direction
We like Spark and actively support the technology.
We think it’s useful to distinguish the powerful programming paradigm from the underlying implementation.
Our message is that there are different ways to achieve the same end.
[It may be a red herring here, but the power of the “80 operators” applied to RDDs is what makes Spark cool. This talk may not want to explore that.]
Here’s a stock Spark diagram…
The “driver program” runs on the manager, dispatching tasks to the executors.
[We try to give the generic message without being too Sales-y, YET.]
The moral of this table is that you can have your cake (parallelization with in-memory processing) and eat it (solve non-Spark problems), too.
The issue is where to scale out -- under the hood, invisibly to the operating system; or at the server level, over a network.
Spark shares variables and shuffles data across partitions – a key performance issue
The punchline is that when you scale up you get the benefit of reuse, the opportunity to run any demanding application, which is beneficial for experimentation.
You can show this slide or just talk through these points while showing the next slide:
SynthBenchmark benchmark from Apache.org
git://git.apache.org/spark.git (spark-1.6.1-bin-hadoop2.6.tgz)
Applies the PageRank algorithm to a generated graph
Benchmark scaled from 15GB to 150GB by number of vertices
Scale Out Spark Configuration on EC2:
1 Master: ec2 r3.2xlarge (8 cpus, 61G)
5 Workers: r3.xlarge (4 cpus, 28.5G)
4 Intel E5-2670 CPUs x 5 servers = 20 CPUs total allocated to Spark
Scale Up Spark Configuration on TidalScale:
TidalScale TidalPod with 5 nodes
20 Intel E5-2643 v3 CPUs allocated to Spark
Note: The TidalPod is booted with 224GB total – 200GB for Spark and 24GB for the OS. This means each physical node hosts about 45GB of the guest’s memory.
We ran the PageRank workloads in four tests:
A “EC2 Small” - 10 node EC2 using 15GB servers (total spark memory = 100GB)
B “EC2 Big” - 5 node EC2 using 31GB servers (total spark memory = 140GB)
C “TidalScale Big” - 5 node TidalPod with hardware equivalent to B
D “TidalScale Bigger” – 5 node TidalPod booted at 2.5TB
B “EC2 Big” and C “TidalScale Big” are the two to compare directly.
Time in seconds is on the Y axis; size of workload in memory is on the X axis.
The plot is log-log on both axes.
These two lines show the effect of more sharding – the 10 node EC2 config is slower than the 5 node EC2 config
At the larger sizes the jobs fail with Out of Memory errors on the worker nodes (denoted by the red box on each line).
The TidalPod “Big” result (“Big” meaning case C, configured with equivalent HW to the EC2 5 nodes config).
This shows similar performance between the two 5 node EC2 and TidalScale configurations (“B” – “EC2 Big” versus “C” – “TidalScale Big”).
For fun we tested a larger TidalScale single Spark instance to see if we could get further up the workload size – the config we show here is a 400GB Spark worker on a very large TidalPod with 20 edge partitions instead of 10.
The shape of the performance result line is different because of the effect of the greater number of edge partitions.
The job does NOT fail because of Out of Memory but because of a different Spark standalone-mode issue (according to one forum, a bug).
Tuning is time consuming
TidalScale can help you address Out of Memory challenges!
We’re committed to big data analytics as carried out in a variety of environments
More memory can expedite the assimilation of data from the workers
A more extreme example has virtual supercomputers for worker nodes
We give you flexibility in how you deploy node size in your spark applications
[Repeat the mantra in slightly different form to reinforce the message.]
Given the Spark context, here are some ground rules.
We see huge opportunity in the 80% solution up to 15TB. We’ll talk at the end about the realm of hundreds of terabytes and challenge problems.
One of the ground rules is to maintain the economy of scale-out. A multi-million dollar HPC-class machine is another conversation.
Goals that we’ve added to the discussion are simplicity of deployment and use, especially for one-off experiments, but also the versatility to support different problems that arise.
[I removed many words for readability. Still too many but the point isn’t to read every one.]
Our work here is the outcome of years of development
This is the punchline of my talk
TidalScale technology is where scale-up meets scale-out
Spark provides an excellent, if at first surprising, context for this conversation
Spark is migrating from its original model of multiple JVMs on distributed machines
…to a more bare-metal approach of JIT compiled code operating on memory allocated C-style