7. 7
Benefits
MULTI-TENANCY
Each note and/or user has its own container for interpreters
SCALABILITY
A single host no longer runs all interpreters
SECURITY
Each container is isolated (filesystem, processes, etc.)
8. 8
Usage
Run
$ kubectl apply -f ${ZEPPELIN_HOME}/k8s/zeppelin-server.yaml
Available in 0.9.0-SNAPSHOT
http://zeppelin.apache.org/docs/0.9.0-SNAPSHOT/quickstart/kubernetes.html
* Until 0.9.0 is released, you need to build your own Zeppelin and Spark Docker images:
1. Build the Zeppelin distribution package: mvn package -Pbuild-distr …
2. Build the Zeppelin Docker image: cd scripts/docker/zeppelin/bin; docker build -t …
3. Build the Spark Docker image: <spark-distribution>/bin/docker-image-tool.sh -m -t 2.4.0 build
9. 9
Zeppelin Roadmap
- Zeppelin on Kubernetes
  - Apply a network policy to isolate the Interpreter Pod
  - Schedule notes in the background as Jobs in Kubernetes
  - Run extra applications such as a terminal or TensorBoard, the same way the Spark UI works
- Modernize front-end stack
  - Currently AngularJS
  - Dark theme?
- Visualization
  - Realtime data visualization
  - Pivot on the backend instead of in the front end, which requires transferring all data to the front end
- Sidebar
  - Sidebar with widgets, such as ToC (table of contents), list of data, etc.
  - Online widget registry (Helium)
- Collaboration
  - Multi-cursor edit
  - Comment!
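As an illustration of the network policy item above, here is a minimal sketch that builds a Kubernetes NetworkPolicy manifest admitting ingress to interpreter pods only from the Zeppelin server pod. The policy name, the namespace, and both label sets are hypothetical and not taken from Zeppelin's own manifests; only the NetworkPolicy field names come from the Kubernetes API.

```python
import json


def interpreter_network_policy(namespace, interpreter_labels, server_labels):
    """Build a NetworkPolicy manifest that admits ingress only from the server pod."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # The policy name is a placeholder chosen for this sketch.
        "metadata": {"name": "isolate-interpreter", "namespace": namespace},
        "spec": {
            # Pods the policy applies to (the interpreter pods).
            "podSelector": {"matchLabels": interpreter_labels},
            "policyTypes": ["Ingress"],
            # Only pods matching server_labels may open connections to them.
            "ingress": [
                {"from": [{"podSelector": {"matchLabels": server_labels}}]}
            ],
        },
    }


policy = interpreter_network_policy(
    namespace="default",
    interpreter_labels={"app": "zeppelin-interpreter"},  # hypothetical label
    server_labels={"app": "zeppelin-server"},  # hypothetical label
)
manifest = json.dumps(policy, indent=2)
print(manifest)
```

kubectl accepts JSON manifests as well as YAML, so the printed output could be saved to a file and applied with kubectl apply -f. The sketch restricts ingress only; egress from the interpreter pods is left open.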
16. 16
Problem
- The entire result dataset needs to be transferred to the browser, even though not all of it is rendered.
- Browser CPU and memory limit how much data can be transformed and rendered.
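One way to picture the alternative is to aggregate on the backend and send only the pivoted result to the browser. The sketch below is an assumption-laden illustration, not Zeppelin code: the function name pivot_sum and the sample columns are hypothetical.

```python
from collections import defaultdict


def pivot_sum(rows, key, group, value):
    """Aggregate rows into {key: {group: sum(value)}} before sending them on."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[key]][row[group]] += row[value]
    # Convert nested defaultdicts to plain dicts so the result serializes cleanly.
    return {k: dict(v) for k, v in table.items()}


# Hypothetical sample data standing in for a large result dataset.
rows = [
    {"age": 30, "marital": "single", "balance": 100.0},
    {"age": 30, "marital": "married", "balance": 250.0},
    {"age": 30, "marital": "single", "balance": 50.0},
    {"age": 40, "marital": "married", "balance": 400.0},
]
pivoted = pivot_sum(rows, key="age", group="marital", value="balance")
print(pivoted)
# {30: {'single': 150.0, 'married': 250.0}, 40: {'married': 400.0}}
```

Four input rows collapse into three cells here; on a real dataset the browser would receive one value per chart cell rather than every row.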
20. 20
Related work
- Streaming data update (without refreshing the notebook)
- Separate transfer of the result dataset and the note to the browser
- Partial data fetch for table display
- Extending TableData API
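Partial data fetch for a table display could look roughly like the following sketch, where the browser asks for one page at a time. The function fetch_page and its response fields are hypothetical and are not part of the TableData API.

```python
def fetch_page(rows, offset, limit):
    """Return one page of rows plus the metadata a table display needs."""
    page = rows[offset:offset + limit]
    return {
        "rows": page,
        "offset": offset,
        "total": len(rows),
        # True while rows remain beyond the end of this page.
        "has_more": offset + len(page) < len(rows),
    }


# Hypothetical result dataset of ten rows.
result = [{"id": i} for i in range(10)]
first = fetch_page(result, offset=0, limit=4)
last = fetch_page(result, offset=8, limit=4)
print(first["rows"], first["has_more"])
print(last["rows"], last["has_more"])
```

The table can render the first page immediately and request further pages only when the user scrolls or changes page.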
24. 24
TOC (table of contents) widget
One of the most popular features in Jupyter.
Google Colab also supports it.
Zeppelin has a SPELL for it.
See https://www.npmjs.com/package/zeppelin-toc-spell
Contents
1. This is notebook
   a. First
   b. Second
2. Next
   a. Next
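A widget like this could derive its entries from Markdown headings in the note's paragraphs. The sketch below assumes every paragraph is plain Markdown text and is not the implementation used by zeppelin-toc-spell; build_toc and render_toc are hypothetical names.

```python
import re

# A Markdown ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def build_toc(paragraphs):
    """Collect (level, title, paragraph_index) for every heading found."""
    toc = []
    for index, text in enumerate(paragraphs):
        for line in text.splitlines():
            match = HEADING.match(line)
            if match:
                toc.append((len(match.group(1)), match.group(2), index))
    return toc


def render_toc(toc):
    """Indent each title by two spaces per heading level below the first."""
    return "\n".join("  " * (level - 1) + title for level, title, _ in toc)


# Hypothetical note mirroring the mock-up above.
paragraphs = [
    "# This is notebook\n## First\n## Second",
    "# Next\n## Next",
]
toc = build_toc(paragraphs)
print(render_toc(toc))
```

Keeping the paragraph index with each entry lets the widget scroll to the right paragraph when an entry is clicked.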
25. 25
Table data widget
Displays the list of tables, the schema of each table, and a preview of the data recognized by the Interpreter

Tables
Name     Temporary
table1   no
bank     yes

Schema
Column   Type
age      INT
job      TEXT

Preview
26. 26
Clipboard
Drag and drop a paragraph onto the clipboard.
Then, in the same or another notebook, drag and drop the paragraph from the clipboard.
Drop paragraph here
Paragraph a
Paragraph b