This presentation describes the journey we went through to containerize Spark workloads into multiple elastic Spark clusters in a multi-tenant Kubernetes environment. Initially we deployed the Spark binaries onto a host-level filesystem, so that the Spark drivers, executors, and masters could transparently migrate to run inside Docker containers by automatically mounting host-level volumes. In this environment, no specific Spark image needs to be prepared in order to run Spark workloads in containers. We then used Kubernetes Helm charts to deploy a Spark cluster, on which the administrator can create a Spark instance group for each tenant. A Spark instance group, which is akin to the Spark notion of a tenant, is logically an independent kingdom for a tenant’s Spark applications, with its own dedicated Spark masters, history server, shuffle service, and notebooks. Once a Spark instance group is created, its image is generated automatically and committed to a specified repository. Meanwhile, from Kubernetes’ perspective, each Spark instance group is a first-class deployment, so the administrator can scale it up or down according to the tenant’s SLA and demand. In a cloud-based data center, each Spark cluster can provide Spark as a service while sharing the Kubernetes cluster; each tenant registered with the service gets a fully isolated Spark instance group. In an on-prem Kubernetes cluster, each Spark cluster can map to a business unit (BU), so each user in the BU gets a dedicated Spark instance group. The next step on this journey will address resource sharing across Spark instance groups by leveraging new Kubernetes features (Kubernetes #31068/9), as well as elastic workload containers driven by job demand (SPARK-18278). Demo: https://www.youtube.com/watch?v=eFYu6o3-Ea4&t=5s
2. Myself
• “How High”
• Software Architect
• IBM Spectrum Computing
• Toronto Canada
IBM Spectrum Computing
3. Agenda
• Why containers?
• Migrate Spark workloads to containers
• Spark instance on Kubernetes
– Architecture
– Workflow
– Multi-tenancy
• Future work
4. Why use containers?
• To enforce the CPU and memory bounds.
– CPU shares are proportional to the allocated slots
– spark.driver.memory & spark.executor.memory
• To completely isolate the file system
– Solve the dependency conflicts
• To create and ship images
– Develop once and run everywhere
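The CPU and memory bounds above come from standard Spark sizing properties. A minimal, illustrative submission might look like the following; the master URL, sizes, and application jar are assumptions, not values from the presentation:

```shell
# The container's CPU shares are derived from the allocated slots/cores,
# and its memory limit from the driver/executor memory settings below.
spark-submit \
  --master spark://spark-master:7077 \
  --conf spark.driver.memory=2g \
  --conf spark.executor.memory=4g \
  --conf spark.executor.cores=2 \
  my-app.jar
```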
5. No prebuilt Spark image
• A running container needs an application image
– Independent of Spark versions
• Seamlessly migrate Spark workloads to a
container based environment
– Assume: Spark is distributed onto the host file system
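With Spark already distributed onto the host file system, a container can run Spark without any prebuilt Spark image by bind-mounting the host-level installation. A hedged sketch, in which the paths, base image, and ports are assumptions for illustration:

```shell
# Run a Spark master in a generic JRE base image; no Spark image is
# built or pulled. The host's Spark install is mounted read-only, and
# a host directory is mounted for work/log output.
docker run --rm \
  --volume /opt/spark:/opt/spark:ro \
  --volume /var/spark/work:/var/spark/work \
  openjdk:8-jre \
  /opt/spark/bin/spark-class org.apache.spark.deploy.master.Master \
    --host 0.0.0.0 --port 7077
```

Because the image contains no Spark bits, the same container setup works transparently across Spark versions, as the bullet above notes.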
11. Spark Instance on Kubernetes
• Increase resource utilization
– Share nodes between Spark and surrounding ecosystem
• Isolate tenants and enforce resource limits
– Each tenant gets a dedicated Spark working instance
– Tenant price plan can directly map to its resource quota
• Simplify deployment and rollout
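The mapping from a tenant's price plan to a resource quota can be expressed with a standard Kubernetes ResourceQuota per tenant namespace. A sketch, where the namespace name and limits are illustrative assumptions:

```shell
# Give each tenant a namespace, then cap its aggregate CPU/memory so
# the price plan maps directly to enforceable resource bounds.
kubectl create namespace tenant-a
kubectl apply -n tenant-a -f - <<'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 32Gi
    limits.cpu: "16"
    limits.memory: 64Gi
EOF
```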
12. Architecture
[Architecture diagram: IBM Spectrum Conductor with Spark. The SparkaaS level comprises the Admin Portal and the Spark Information Hub, which handles image and deployment. The tenant level comprises the Spark instance groups, each with its own Spark master, history server, notebook, and shuffle service.]
13. Architecture
• A Spark instance group is an independent deployment in Kubernetes.
• A Docker image is built automatically based on the Spark version, configuration, notebook edition, and user application dependencies.
• Initially, a single container runs the Spark services of the Spark instance group.
• Dynamic scalability based on workload demand.
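Since each Spark instance group is a first-class Kubernetes deployment, its lifecycle and scaling follow the usual Helm/kubectl flow. A sketch in which the chart path, release name, values, and deployment name are assumptions for illustration:

```shell
# Create a Spark instance group from its Helm chart; the chart values
# select the Spark version and tenant, from which the image is built.
helm install spark-instance-group ./cws-spark-chart \
  --set sparkVersion=2.1.1 \
  --set tenant=tenant-a

# Scale the instance group up or down to match the tenant's SLA/demand.
kubectl scale deployment spark-instance-group --replicas=3
```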
14. Architecture
• IBM Spectrum Conductor with Spark endpoints:
• Admin manages the Spark instance group life cycle
• Tenant accesses Spark workloads and notebooks
• Deployed by a Helm chart and exposed as a service with one single container: the CWS master
• Multiple deployments in a Kubernetes cluster:
• Cloud: one deployment per SparkaaS
• On-prem: one deployment per BU
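Exposing the CWS master as a service can be done with a standard Kubernetes Service object. A minimal sketch, where the deployment name, service name, and port are illustrative assumptions rather than the product's actual values:

```shell
# Expose the single CWS master container so the admin portal and
# tenant notebooks are reachable from outside the cluster.
kubectl expose deployment cws-master \
  --name=cws-master-svc --type=NodePort --port=8080
```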