SlideShare una empresa de Scribd logo
1 de 34
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
QCon 2018
San Francisco
November 07, 2018
Chakri Cherukuri
Senior Researcher
Quantitative Financial Research Group
Jupyter Notebooks: Interactive
Visualization Approaches
InfoQ.com: News & Community Site
• Over 1,000,000 software developers, architects and CTOs read the site world-
wide every month
• 250,000 senior developers subscribe to our weekly newsletter
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• 2 dedicated podcast channels: The InfoQ Podcast, with a focus on
Architecture and The Engineering Culture Podcast, with a focus on building
• 96 deep dives on innovative topics packed as downloadable emags and
minibooks
• Over 40 new content items per week
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
ml-models-jupyter
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
Presented at QCon San Francisco
www.qconsf.com
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Outline
• Introduction
• Overview of interactive widgets
• Example with code walk-through
— Normal Distribution
• Case studies
— Server log dashboard
— Twitter sentiment analysis
— Tools for deep learning
• Q & A
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Interactive widgets
Visual Representation
• Graphics, Interaction
• JavaScript implementation hidden from the user
Visual RepresentationPython Object
• Interface the user sees
• Its attributes (traits) can
send events when updated
IntSlider(description=‘slider,
value=50)
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Interactive widgets
• ipywidgets: core UI controls (text boxes, sliders, button etc.)
• bqplot: 2D plotting widgets (built on top of the ipywidgets framework)
• Key takeaways:
— ipywidgets and bqplot can be combined to build interactive visualizations
— Attributes of widgets (traits) can be linked using callbacks
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Server Logs Dashboard
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Server Logs
• Information we can glean:
— Timestamp of requests/events
— IP address
— HTTP status codes
— Agent
— URL
• URLs can be parsed to obtain search parameters:
— Product IDs
— Category IDS
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Dashboard Components
• Time series plots of events
— Trends
— Seasonality
— Outliers
• Daily/hourly aggregation of events
• Plots of events broken down by
— Products
— Categories
— Status codes
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Server log dashboard
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Performance/Scaling
• Data ingestion from streaming sources like Kafka
• Notebook servers running on spark/hadoop clusters
• Streaming data frames
• dask dataframes
• pandas with chunking
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Twitter Sentiment Analysis
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
News/Twitter Sentiment
• Social sentiment from the raw news story or tweet. Data is:
— Unstructured
— Highly time-sensitive
• Story-level sentiment
• Company-level sentiment
• Sentiment score can be used as a trading signal
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Twitter Sentiment Classification
Problem statement:
Predict the sentiment (negative, neutral, positive) of a tweet for a company
Ex: “$CTIC Rated strong buy by three WS analysts. Increased target from $5 to $8.” = Positive
Three way classification problem
— Input: raw tweets
— Output: sentiment label ∑ {negative, neutral, positive}
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Methodology
• We are given labeled training and test data sets
• Train classifier on training data set
• Predict labels on test data and evaluate performance
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
One vs. rest Logistic Regression
• Train three binary classifiers for each label
— Model 1: Negative vs. Not Negative
— Model 2: Positive vs. Not Positive
— Model 3: Neutral vs. Not Neutral
• Get probabilities (measures of confidence) for each label
• Output the label associated with the highest probability
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Classifier performance analysis
• Look at misclassifications
— Confusion Matrix
• Understand model predicted probabilities
— Triangle visualization
• Fix data issues
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Confusion Matrix
K x K matrix where K = number of classes
Cell[i, j] = number of samples whose:
Actual label = i
Predicted label = j
Diagonal entries: correct predictions
Off diagonal entries: misclassifications
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Triangle Visualization
• Model returns 3 probabilities (which sum to 1)
• How can we visualize these 3 numbers?
— Points inside an equilateral triangle
Not sure
Very positive
Negative / Neutral
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Performance analysis dashboard
Use the dashboard to:
— Analyze misclassifications (using confusion matrix)
— Improve model by adding more features (by looking at model coefficients)
— Fix data issues (using triangle and lasso)
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Analyze Misclassifications
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Analyze Misclassifications
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Analyze Misclassifications
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Use Lasso To Find Data Issues
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Use Lasso To Find Data Issues
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Tools for Deep Learning
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Deep Learning Tools
Graphical Wizard
— Select network parameters
— Build network architecture
— Training plots
• Real time loss/accuracy curves
• Distribution of weights/biases/activations
— Diagnostic plots
• Residual vs. Predicted Values
• Confusion Matrix
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Network Parameters
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Network Architecture
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Loss/Accuracy Curves
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Distributions of Weights/Biases/Activations
© 2018 Bloomberg Finance L.P. All rights reserved.
© 2018 Bloomberg Finance L.P. All rights reserved.
Diagnostic Plots
© 2018 Bloomberg Finance L.P. All rights reserved.
Resources
• Widget libraries used to build the applications:
ipywidgets: https://github.com/jupyter-widgets/ipywidgets
bqplot: https://github.com/bloomberg/bqplot
• Machine Learning libraries
scikit-learn: https://http://scikit-learn.org
tensorflow: https://www.tensorflow.org
keras: https://keras.io
• Link to the notebooks/code: https://github.com/ChakriCherukuri/qcon_2018
• Tech At Bloomberg blog: www.TechAtBloomberg.com
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
ml-models-jupyter

Más contenido relacionado

Más de C4Media

Más de C4Media (20)

Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery Teams
 
High Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechHigh Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in Adtech
 
Rust's Journey to Async/await
Rust's Journey to Async/awaitRust's Journey to Async/await
Rust's Journey to Async/await
 
Opportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaOpportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven Utopia
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 
Are We Really Cloud-Native?
Are We Really Cloud-Native?Are We Really Cloud-Native?
Are We Really Cloud-Native?
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL Database
 
A Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinA Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with Brooklin
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Jupyter Notebooks: Interactive Visualization Approaches

  • 1. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. QCon 2018 San Francisco November 07, 2018 Chakri Cherukuri Senior Researcher Quantitative Financial Research Group Jupyter Notebooks: Interactive Visualization Approaches
  • 2. InfoQ.com: News & Community Site • Over 1,000,000 software developers, architects and CTOs read the site world- wide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ ml-models-jupyter
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  • 4. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Outline • Introduction • Overview of interactive widgets • Example with code walk-through — Normal Distribution • Case studies — Server log dashboard — Twitter sentiment analysis — Tools for deep learning • Q & A
  • 5. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Interactive widgets Visual Representation • Graphics, Interaction • JavaScript implementation hidden from the user Visual RepresentationPython Object • Interface the user sees • Its attributes (traits) can send events when updated IntSlider(description=‘slider, value=50)
  • 6. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Interactive widgets • ipywidgets: core UI controls (text boxes, sliders, button etc.) • bqplot: 2D plotting widgets (built on top of the ipywidgets framework) • Key takeaways: — ipywidgets and bqplot can be combined to build interactive visualizations — Attributes of widgets (traits) can be linked using callbacks
  • 7. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Server Logs Dashboard
  • 8. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Server Logs • Information we can glean: — Timestamp of requests/events — IP address — HTTP status codes — Agent — URL • URLs can be parsed to obtain search parameters: — Product IDs — Category IDS
  • 9. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Dashboard Components • Time series plots of events — Trends — Seasonality — Outliers • Daily/hourly aggregation of events • Plots of events broken down by — Products — Categories — Status codes
  • 10. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Server log dashboard
  • 11. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Performance/Scaling • Data ingestion from streaming sources like Kafka • Notebook servers running on spark/hadoop clusters • Streaming data frames • dask dataframes • pandas with chunking
  • 12. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Twitter Sentiment Analysis
  • 13. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. News/Twitter Sentiment • Social sentiment from the raw news story or tweet. Data is: — Unstructured — Highly time-sensitive • Story-level sentiment • Company-level sentiment • Sentiment score can be used as a trading signal
  • 14. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Twitter Sentiment Classification Problem statement: Predict the sentiment (negative, neutral, positive) of a tweet for a company Ex: “$CTIC Rated strong buy by three WS analysts. Increased target from $5 to $8.” = Positive Three way classification problem — Input: raw tweets — Output: sentiment label ∑ {negative, neutral, positive}
  • 15. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Methodology • We are given labeled training and test data sets • Train classifier on training data set • Predict labels on test data and evaluate performance
  • 16. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. One vs. rest Logistic Regression • Train three binary classifiers for each label — Model 1: Negative vs. Not Negative — Model 2: Positive vs. Not Positive — Model 3: Neutral vs. Not Neutral • Get probabilities (measures of confidence) for each label • Output the label associated with the highest probability
  • 17. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Classifier performance analysis • Look at misclassifications — Confusion Matrix • Understand model predicted probabilities — Triangle visualization • Fix data issues
  • 18. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Confusion Matrix K x K matrix where K = number of classes Cell[i, j] = number of samples whose: Actual label = i Predicted label = j Diagonal entries: correct predictions Off diagonal entries: misclassifications
  • 19. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Triangle Visualization • Model returns 3 probabilities (which sum to 1) • How can we visualize these 3 numbers? — Points inside an equilateral triangle Not sure Very positive Negative / Neutral
  • 20. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Performance analysis dashboard Use the dashboard to: — Analyze misclassifications (using confusion matrix) — Improve model by adding more features (by looking at model coefficients) — Fix data issues (using triangle and lasso)
  • 21. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Analyze Misclassifications
  • 22. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Analyze Misclassifications
  • 23. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Analyze Misclassifications
  • 24. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Use Lasso To Find Data Issues
  • 25. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Use Lasso To Find Data Issues
  • 26. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Tools for Deep Learning
  • 27. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Deep Learning Tools Graphical Wizard — Select network parameters — Build network architecture — Training plots • Real time loss/accuracy curves • Distribution of weights/biases/activations — Diagnostic plots • Residual vs. Predicted Values • Confusion Matrix
  • 28. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Network Parameters
  • 29. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Network Architecture
  • 30. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Loss/Accuracy Curves
  • 31. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Distributions of Weights/Biases/Activations
  • 32. © 2018 Bloomberg Finance L.P. All rights reserved. © 2018 Bloomberg Finance L.P. All rights reserved. Diagnostic Plots
  • 33. © 2018 Bloomberg Finance L.P. All rights reserved. Resources • Widget libraries used to build the applications: ipywidgets: https://github.com/jupyter-widgets/ipywidgets bqplot: https://github.com/bloomberg/bqplot • Machine Learning libraries scikit-learn: https://http://scikit-learn.org tensorflow: https://www.tensorflow.org keras: https://keras.io • Link to the notebooks/code: https://github.com/ChakriCherukuri/qcon_2018 • Tech At Bloomberg blog: www.TechAtBloomberg.com
  • 34. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ ml-models-jupyter