PyData talk March 2020

Gagan Kaur

Converting Jupyter notebooks to live interactive dashboards

Converting Jupyter notebooks into dashboards
Gagandeep Kaur (Gagan)
Data Scientist
Duke Office of Information Tech

Hi, I am Gagan
Now: Data Scientist, Duke Office of Information Technology
Ex: Duke alum (Master's in Engineering Management), software engineer
LinkedIn - gagan--kaur

Agenda
• The problem
• What is Panel
• How we are using Panel at Duke
• Demo
• Q/A

The problem
• Data science projects are complex
• Collaboration is a challenge
• Existing solutions – Plotly, Dash, R Shiny

Panel
• Panel is new, but built on Bokeh and Param
• Lets you turn your notebooks into apps or dashboards
• Works with any plotting library, image or object
• Fast iteration
• Broad library of components
• Layouts and styling
• Deployment
Source - https://medium.com/@philipp.jfr/panel-announcement-2107c2b15f52
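
As a minimal sketch of what "turning a notebook into an app" can look like, the snippet below ties a slider to a plain Python function with Panel. The function `describe_threshold`, the `build_app` helper, the widget name and the value range are hypothetical and chosen only for illustration. `pn.bind` assumes a Panel release newer than the one available when this talk was given.

```python
def describe_threshold(threshold: float) -> str:
    """Plain Python function; in a real notebook this could return a plot or a table."""
    return f"Flagging readings above {threshold:.2f}"


def build_app():
    """Wire a slider to the function above. Requires Panel (`pip install panel`)."""
    import panel as pn  # imported lazily so describe_threshold works without Panel

    pn.extension()
    slider = pn.widgets.FloatSlider(
        name="Threshold", start=0.0, end=1.0, step=0.05, value=0.5
    )
    # pn.bind re-runs describe_threshold whenever the slider value changes
    return pn.Column(slider, pn.bind(describe_threshold, slider))
```

In a notebook cell, `build_app()` displays the app inline. In a script, calling `build_app().servable()` and running `panel serve` on that file serves the same layout from a Python server.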

Comparison to other dashboarding libraries
Heavily inspired by existing tools
Shiny
• Powerful and well polished framework for building web applications
• Constraints imposed by a different language (R)
Jupyter ipywidgets
• Provides the foundation for building interactive components and embedding them in a notebook, within the Jupyter ecosystem
• Panel apps work equally well inside and outside of Jupyter contexts
Dash
• Dash is (by design) focused specifically on support for Plotly plots, while Panel is agnostic about what objects are being displayed
• Dash requires much more detailed knowledge of low-level web development

Using Panel at Duke
• New data science project for anomaly detection
• Campus energy usage per building - large dataset, complex time series modeling
• Trend, seasonality, local and global anomalies
• Statistical methods are slow
• Unsupervised ML problem
• Tuning and evaluating the model
• Feedback from technical and non-technical stakeholders

Bad data detection solution
• Currently an offline solution that takes in a chunk of data,
• visualizes anomalies per building,
• lets users adjust the percentage of expected anomalies,
• and removes the anomalies so the clean data can be downloaded.
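
The "percentage of expected anomalies" setting can be illustrated with a small standard-library sketch. This is not the model used in the project (the slides do not name one). It simply scores each reading by its distance from the median and flags the most extreme share. The names `flag_anomalies`, `remove_anomalies` and `expected_fraction`, as well as the sample readings, are hypothetical.

```python
from statistics import median


def flag_anomalies(readings, expected_fraction):
    """Return one boolean per reading, True for the most extreme readings.

    expected_fraction is the share of points assumed to be anomalous,
    e.g. 0.1 flags the 10% of readings farthest from the median.
    """
    if not 0 <= expected_fraction <= 1:
        raise ValueError("expected_fraction must be between 0 and 1")
    center = median(readings)
    scores = [abs(r - center) for r in readings]
    n_flag = round(len(readings) * expected_fraction)
    # Indices of the n_flag largest deviations from the median
    worst = sorted(range(len(readings)), key=lambda i: scores[i], reverse=True)[:n_flag]
    flagged = set(worst)
    return [i in flagged for i in range(len(readings))]


def remove_anomalies(readings, expected_fraction):
    """Drop the flagged readings, leaving the clean data."""
    flags = flag_anomalies(readings, expected_fraction)
    return [r for r, bad in zip(readings, flags) if not bad]


# Hypothetical hourly kWh readings for one building, with two bad values
readings = [10.1, 10.4, 9.8, 55.0, 10.2, 10.0, 0.0, 10.3, 9.9, 10.2]
clean = remove_anomalies(readings, expected_fraction=0.2)
```

In a dashboard, `expected_fraction` would be the value a stakeholder adjusts with a slider, and `clean` is the data offered for download.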

• Interactive graphs with zoom in, hover labels, dynamic axes
• Deployed on a server to provide access to the Facilities team

10. Demo

11. Questions

Editor's notes

Thank you all for being here for the talk. Thanks Alice and Mary Clair for giving me this opportunity. I hope this is helpful, and please feel free to stop me at any point and ask questions. All of us who have worked on data science projects know very well that they typically require a high degree of collaboration among various cross-functional teams to be successful. This is for all the analysts who have a plot, image, or interactive visualization in their Jupyter notebooks and want to share it with their stakeholders or their boss (who might not necessarily have the environment to run those notebooks).
I work with the Duke Office of Information Technology, which is 10 miles from here. But I am originally from Ludhiana, India, which is 7700 miles from this place (I checked). I moved to the US two years ago to pursue a Master’s in Engineering Management from Duke University. Before coming to the US, I worked as a software engineer with a communications solution provider company in Bangalore, India. I was born and raised in Punjab, the state known for its amazing food: butter chicken and chicken tikka, to name a few.
This talk is twofold. First, I’ll talk about the problem a little bit. Second, we’ll look at how we are using Panel in data analytics projects at Duke OIT, followed by a short demo showing how to build a dashboard yourself. Before we begin, can I see a show of hands: how many of you work with reporting or dashboarding? How many of you use Python in your reporting workflow? What tools do you use? I am not an expert in Panel, but when I came across this tool, I thought this is exactly what I was looking for. So I am gonna be looking at my notes a lot, so that I don’t misspeak!
In my one year of experience at Duke OIT, I have repeatedly seen the pain involved in turning some analysis code into insights that can be easily shared with decision makers within an organization or the general public. Because the technologies involved often require distinct skill sets, different teams may be involved in prototyping, developing and deploying an app to be used by non-technical people. This introduces a huge amount of friction, as minor tweaks need to be communicated between teams, which lengthens the iteration cycle considerably. This is especially challenging if you are working with sensitive data and wanna share insights with stakeholders who don’t have the same environment. Existing solutions: there are some existing solutions to build dashboards in Python, and we’ll see a comparison later. For me personally, I didn’t wanna throw myself into the weeds with front-end stuff with Dash, neither did I want to learn R. (Since this is PyData, I am counting on the safety of the mob when I say I didn’t wanna learn R.)
I came across this library last year in July. At that time, I had been using Tableau for most of my reporting and sharing of insights. I have also been using another tool from this ecosystem, HoloViews, for a year now, and it has quickly become my go-to charting library in Python. I love the built-in interactivity of the plots, hover etc.
Panel is a new open-source Python library that lets you create custom interactive web apps and dashboards by connecting user-defined widgets to plots, images, tables, or text. Secondly, Panel aims to make it trivial to go from prototyping an app to deploying it internally within an organization, or sharing it publicly with the entire internet.
Architecture: Panel is built on top of two main libraries. Bokeh provides the model-view-controller framework on which Panel is built, along with many of the core components such as the widgets and layout engine. Param provides a framework for reactive parameters, which are used to define all Panel components.
Fast iteration: quickly make insights accessible to a wide audience. Most importantly, Panel apps can easily be built entirely within a Jupyter notebook, where a lot of modern data science happens.
Broad and expanding library of components: Panel ships with a wide array of components, providing a large set of widgets and a number of powerful layout components.
Layouts and styling: you can extensively customize the visual appearance. This makes it possible to build dashboards that resize reactively to the current size of the browser window.
Deployment: there are detailed guides that explain the Python-server deployment procedure on different platforms, including AWS, Google Cloud, Heroku, and Anaconda Enterprise. In many cases, Panel objects can also be exported to static, standalone HTML/JavaScript files that no longer need a Python server, so they can be distributed on websites or in emails without a running Python process.
To sum up the nice things about Panel:
• It’s reactive (updates automatically!)
• It’s declarative (readable code)
• Supports different layouts (flexible)
• Fully deployable to a server (shareable)
• Jupyter Notebook compatible
Comparison to other dashboarding and widget libraries: Panel is a very new library in this space, but it is heavily inspired by existing concepts and technologies that have in many cases been around for decades.
Shiny: it actually sets an incredibly high bar! For anyone who performs analysis in the R programming language, it is a powerful and well polished framework for building web applications.
Jupyter/ipywidgets: within the Jupyter ecosystem, the ipywidgets library has provided the foundation for building interactive components and embedding them in a notebook. The main difference between Panel and ipywidgets is that the Panel architecture is not closely coupled to the IPython kernel that runs interactive computations in Jupyter. Panel is based on a generalized Python/JS communication method, making Panel apps work equally well inside and outside of Jupyter contexts.
Dash: like Panel, Plotly’s Dash library allows building very complex and highly polished applications straight from Python. Dash is also built on a reactive programming model that (along with Shiny) was a big inspiration for some of the features in Panel.
I was very conscious of the lessons I learnt about collaboration from my first project. The new project: the meters are not robust and produce anomalies when they restart, under peak loads, or after external changes. This affects forecasts of energy consumption. Forecasts at lower levels are highly sensitive to these anomalies.
Each one of these widgets is tied to something directly in the plot, using only the computation that is needed. It’s all done locally in the browser, and it is not recomputing anything. If you change what I am actually plotting or how you are aggregating it, then it has to do a computation, and that will take some time. Normally you would have to write a lot of code and callbacks to tie each part to the right place, but here it takes very little code. As you can observe in the code, Panel provides a number of different abstraction layers to write this kind of app. Users can choose between these three different approaches to ensure they have just the right level of control needed for a particular use case, while being able to build dashboards of any complexity.
https://www.youtube.com/watch?v=L91rd1D6XTA&t=1137s
https://www.youtube.com/watch?v=VtchVpoSdoQ
https://www.youtube.com/watch?v=AXpjbJUVeb4
