2. Hi, I am Gagan
Now: Data Scientist, Duke Office of Information Technology
Ex: Duke Alum (Masters in Engineering Management),
Software engineer
LinkedIn - gagan--kaur
4. The problem
• Data science projects are complex
• Collaboration is a challenge
• Existing solutions – Plotly, Dash, R Shiny
5. Panel
• Panel is new, but built on
Bokeh and Param
• Lets you turn your notebooks
into apps or dashboards
• Any plotting library, image
or object
• Fast Iteration
• Broad library of components
• Layouts and styling
• Deployment
Source - https://medium.com/@philipp.jfr/panel-announcement-2107c2b15f52
6. Comparison to other dashboarding libraries
Heavily inspired by existing tools
Shiny
• Powerful and well-polished framework for building web
applications
• Constraints imposed by a different language
Jupyter ipywidgets
• Provides the foundation for building interactive components and
embedding them in a notebook, within the Jupyter ecosystem
• Panel apps work equally well inside and outside of Jupyter
contexts
Dash
• Dash is (by design) focused specifically on support for Plotly
plots, while Panel is agnostic about what objects are being
displayed
• Requires much more detailed knowledge of low-level web
development
7. Using Panel at Duke
• New data science project for Anomaly detection
• Campus energy usage per building - large dataset, complex time
series modeling
• Trend, seasonality, local and global anomalies
• Statistical methods slow
• Unsupervised ML problem
• Tuning and evaluating the model
• Feedback from technical and non-technical stakeholders
8. Bad data detection solution
• Currently an offline solution
that takes in a chunk of data,
• visualizes anomalies per
building,
• lets you adjust the
percentage of expected
anomalies,
• removes the anomalies and lets
you download the clean data.
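As a rough illustration of the "percentage of expected anomalies" control, here is a minimal Python sketch. It is not the model used in the project (the slides do not name one); it swaps in a simple distance-from-median ranking, and the names `flag_anomalies`, `remove_anomalies`, `expected_fraction` and the sample readings are all hypothetical.

```python
from statistics import median


def flag_anomalies(readings, expected_fraction=0.05):
    """Return the indices of the readings that sit furthest from the median.

    expected_fraction is the share of points we assume to be anomalous;
    it plays the role of the adjustable percentage on the dashboard.
    """
    if not readings:
        return []
    center = median(readings)
    n_flag = round(len(readings) * expected_fraction)
    # Rank every index by how far its reading is from the median, largest first.
    ranked = sorted(
        range(len(readings)),
        key=lambda i: abs(readings[i] - center),
        reverse=True,
    )
    return sorted(ranked[:n_flag])


def remove_anomalies(readings, expected_fraction=0.05):
    """Return a copy of the readings with the flagged points dropped."""
    flagged = set(flag_anomalies(readings, expected_fraction))
    return [value for i, value in enumerate(readings) if i not in flagged]


# Made-up meter readings with one obvious spike at index 4.
readings = [10, 11, 9, 10, 250, 10, 11, 9, 10, 10]
flagged = flag_anomalies(readings, expected_fraction=0.1)
clean = remove_anomalies(readings, expected_fraction=0.1)
```

Raising `expected_fraction` flags more points and lowering it flags fewer, which is the trade-off the stakeholders tune on the dashboard.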
9. • Interactive graphs with zoom in, hover
labels, dynamic axes
• Deployed on a server to provide access to
the Facilities team
Thank you all for being here for the talk. Thanks Alice and Mary Clair for giving me this opportunity.
I hope this is helpful and please feel free to stop me at any point and ask questions.
All of us who have worked on data science projects know very well that they typically require a high degree of collaboration among various cross-functional teams to be successful.
This is for all the analysts who have a plot, image, or interactive visualization in their Jupyter notebooks and want to share it with their stakeholders or their boss (who might not necessarily have the environment to run those notebooks).
I work with the Duke Office of Information Technology, which is 10 miles from here. But I am originally from Ludhiana, India, which is 7700 miles from this place. I checked!
I moved to the US two years ago to pursue a Masters in Engineering Management from Duke University.
Before coming to the US, I worked as a software engineer with a communications solution provider company in Bangalore, India.
I was born and raised in Punjab, the state known for its amazing food - butter chicken, chicken tikka to name a few.
This talk is twofold:
First, I’ll talk about the problem a little bit.
Second, we’ll look at how we are using Panel in data analytics projects at Duke OIT, followed by a short demo showing how to build a dashboard yourself.
Before we begin, can I see a show of hands –
How many of you work with reporting, dashboarding?
How many of you use Python in your reporting workflow?
What tools do you use?
I am not an expert in Panel, but when I came across this tool, I thought: this is exactly what I was looking for.
So I am gonna be looking at my notes a lot, so that I don’t misspeak!
In my one year of experience at Duke OIT, I have repeatedly seen the pain involved in turning some analysis code into insights that can be easily shared with decision makers within an organization or the general public.
Because the technologies involved often require distinct skill sets, different teams may be involved in prototyping, developing and deploying an app to be used by non-technical people.
This introduces a huge amount of friction as minor tweaks need to be communicated between teams, increasing the length of the iteration cycle exponentially.
This is especially challenging if you are working with sensitive data and wanna share insights with stakeholders who don’t have the same environment.
Existing solutions – there are some existing solutions to build dashboards in Python, and we’ll see a comparison later. For me personally, I didn’t wanna throw myself into the weeds with front-end stuff with Dash, nor did I want to learn R. (Since this is PyData, I am counting on the safety of the mob when I say I didn’t wanna learn R.)
I came across this library in July last year. At that time, I had been using Tableau for most of my reporting and sharing of insights. I have also been using another tool from this ecosystem, HoloViews, for a year now, and it has quickly become my go-to charting library in Python. I love the built-in interactivity of the plots, hover and so on.
Panel is a new open-source Python library that lets you create custom interactive web apps and dashboards by connecting user-defined widgets to plots, images, tables, or text.
Secondly, Panel aims to make it trivial to go from prototyping an app to deploying it internally within an organization, or sharing it publicly with the entire internet.
Architecture
Panel is built on top of two main libraries:
Bokeh provides the model-view-controller framework on which Panel is built, along with many of the core components such as the widgets and layout engine
Param provides a framework for reactive parameters which are used to define all Panel components.
Fast iteration
Panel supports fast iteration by quickly making insights accessible to a wide audience.
Most importantly, Panel apps can easily be built entirely within a Jupyter notebook, where a lot of modern data science happens.
Broad and expanding library of components
Panel ships with a wide array of components, providing a large set of widgets, and a number of powerful layout components.
Layouts and styling
This makes it possible to build dashboards that resize reactively to the current size of the browser window, or to extensively customize the visual appearance.
Deployment
There are detailed guides to explain the Python-server deployment procedure on different platforms, including AWS, Google Cloud, Heroku, and Anaconda Enterprise.
In many cases, Panel objects can also be exported to static, standalone HTML/JavaScript files that no longer need a Python server, and can be distributed on websites or in emails without a running Python process.
To sum up nice things about Panel:
It’s reactive (updates automatically!)
It’s declarative (readable code)
Supports different layouts (flexible)
Fully deployable to a server (shareable)
Jupyter Notebook compatible
Comparison to other dashboarding and widget libraries
Panel is a very new library in this space but it is heavily inspired by existing concepts and technologies that have in many cases been around for decades.
Shiny - For anyone who performs analysis in the R programming language, it actually sets an incredibly high bar!
Jupyter/ipywidgets
Within the Jupyter ecosystem, the ipywidgets library has provided the foundation for building interactive components and embedding them in a notebook.
The main difference between Panel and ipywidgets is that the Panel architecture is not closely coupled to the IPython kernel that runs interactive computations in Jupyter.
Panel is based on a generalized Python/JS communication method, making Panel apps work equally well inside and outside of Jupyter contexts.
Dash
Like Panel, Plotly’s Dash library allows building very complex and highly polished applications straight from Python.
Dash is also built on a reactive programming model that (along with Shiny) was a big inspiration for some of the features in Panel.
I was very conscious of the lessons I learnt about collaboration from my first project.
The new project
Meters are not robust, and anomalies appear when they restart, during peak loads, or after external changes.
This affects the energy consumption forecasts.
Forecasts at lower levels are highly sensitive to these anomalies.
Each one of these widgets is tied to something directly in the plot using only the computation that is needed. It’s all done locally in the browser, and is not recomputing anything.
If you change what is actually being plotted or how it is aggregated, then it has to do a computation, and that will take some time.
Otherwise you would have to write massive code and callbacks to tie which part goes where, but here it takes very little code.
As you can observe in the code, Panel provides a number of different abstraction layers to write this kind of app.
Users can choose between these three different approaches to ensure they have just the right level of control needed for a particular use case, while being able to build dashboards of any complexity.