Data visualization tools in Python
Roman Merkulov
Data Scientist at InData Labs
r_merkulov@indatalabs.com
merkylovecom@mail.ru
Content
- why dataviz is important
- dataviz libraries in python
- facets tool
- interactive maps
- Apache Superset
data visualization
- EDA & understanding the data
- fix data
- show insights
- model validation
- analytics & reporting
Plots vs descriptive statistics
Anscombe's quartet
*https://en.wikipedia.org/wiki/Anscombe%27s_quartet
Property                      Value             Accuracy
Mean of x                     9                 exact
Sample variance of x          11                exact
Mean of y                     7.5               to 2 decimal places
Sample variance of y          4.125             ±0.003
Correlation coefficient       0.816             to 3 decimal places
Linear regression line        y = 3.00 + 0.5x   to 2 decimal places
Coefficient of determination  0.67              to 2 decimal places
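The summary statistics of Anscombe's quartet can be checked directly. Below is a minimal sketch using only the Python standard library; the data values are the published Anscombe (1973) quartet, while the helper names `pearson` and `summarize` are introduced here purely for illustration.

```python
from statistics import mean, variance

# Anscombe's quartet: datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """Descriptive statistics matching the rows of the table above."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),  # sample variance (n - 1 in the denominator)
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "corr": pearson(xs, ys),
    }


SUMMARY = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARY.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

All four datasets agree on these numbers to the accuracy listed in the table, which is why only a plot reveals how different they are.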
*http://blog.revolutionanalytics.com/2017/05/the-datasaurus-dozen.html
matplotlib
*https://matplotlib.org/gallery.html
Pros:
- very powerful
- large community, long history
Doesn’t look simple enough...
Cons:
- imperative API
- poor support for interactivity
Just to add a popup...
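To make the "imperative API" point concrete, here is a minimal sketch of a static matplotlib scatter plot. It is not the code from the slides; it assumes matplotlib is installed, uses dataset I of Anscombe's quartet as sample data, and writes the PNG to an in-memory buffer (swap in a filename to save to disk). Every visual element is set by its own call.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Sample data: dataset I of Anscombe's quartet.
xs = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]

# Imperative style: build the figure step by step, one call per element.
fig, ax = plt.subplots(figsize=(5, 4))
ax.scatter(xs, ys, color="tab:blue", label="observations")
ax.plot([4, 14], [3.0 + 0.5 * 4, 3.0 + 0.5 * 14],
        color="tab:red", label="y = 3.00 + 0.5x")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Anscombe's quartet, dataset I")
ax.legend()

# Render to PNG bytes; the result is a static image with no interactivity.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=100)
png_bytes = buffer.getvalue()
```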
matplotlib based solutions
*https://speakerdeck.com/jakevdp/pythons-visualization-landscape-pycon-2017
http://yhat.github.io/ggpy/
http://scitools.org.uk/cartopy/docs/latest/gallery.html
https://seaborn.pydata.org/examples/index.html
https://networkx.github.io/documentation/networkx-1.9.1/examples/drawing/random_geometric_graph.html
javascript based solutions
*https://speakerdeck.com/jakevdp/pythons-visualization-landscape-pycon-2017
folium
bqplot
plotly
*https://plot.ly/python/
Pros:
- interactivity
- lots of visualization types
- both declarative and imperative capabilities
Cons:
- paid features
bokeh
Pros:
- interactivity
- lots of visualization types
- both declarative and imperative capabilities
Cons:
- limited vector graphic export
Datashader
when you have millions and billions of points
NYC Taxi
US Census 2010
*https://datashader.readthedocs.io/en/latest/
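The idea that makes very large point sets plottable is to aggregate points onto a fixed-size grid first and only then render the grid, so the cost of drawing depends on the number of pixels rather than the number of points. The following is a plain-Python sketch of that binning step only; it does not use Datashader's API, and `rasterize` is a hypothetical helper written for this illustration.

```python
import random


def rasterize(points, x_range, y_range, width, height):
    """Count how many points fall into each cell of a width x height grid."""
    (x0, x1), (y0, y1) = x_range, y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # points outside the viewport are dropped
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        grid[row][col] += 1
    return grid


random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100_000)]
grid = rasterize(points, (-4, 4), (-4, 4), width=40, height=20)

# Crude text rendering: map each cell count to a character by density.
shades = " .:-=+*#%@"
peak = max(max(row) for row in grid)
for row in reversed(grid):  # print the largest y at the top
    print("".join(shades[count * len(shades) // (peak + 1)] for count in row))
```

However many points go in, the output is always a 40 x 20 grid, which is the property that lets this approach scale.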
Altair
(based on Vega-Lite)
Fully declarative paradigm
*https://altair-viz.github.io/#
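"Fully declarative" means the chart is described as data (what to show) rather than as a sequence of drawing calls. As a rough sketch, here is a hand-written Vega-Lite specification for a scatter plot, built as a plain Python dictionary with the standard library only. In Altair the equivalent chart would be written roughly as `alt.Chart(df).mark_point().encode(x="x", y="y")`; the sample rows and the v5 schema URL are assumptions chosen for this example.

```python
import json

# Sample rows: dataset I of Anscombe's quartet.
xs = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
rows = [{"x": x, "y": y} for x, y in zip(xs, ys)]

# The whole chart is one declarative description: data, mark type, encodings.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": rows},
    "mark": "point",
    "encoding": {
        "x": {"field": "x", "type": "quantitative"},
        "y": {"field": "y", "type": "quantitative"},
    },
}

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The JSON can be pasted into any Vega-Lite renderer; nothing in it says how to draw, only which fields map to which visual channels.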
Facets
Overview
Dive
Quick Draw Dataset https://pair-code.github.io/facets/quickdraw.html
*https://pair-code.github.io/facets/
https://github.com/PAIR-code/facets
https://research.googleblog.com/2017/07/facets-open-source-visualization-tool.html
Folium
*https://github.com/python-visualization/folium
https://indatalabs.com/discover-hong-kong-through-the-lense-of-instagram/
https://indatalabs.com/brands-on-london-instagram
Visualization of the week according to InsideBigData
https://insidebigdata.com/2016/02/03/visualization-of-the-week-hong-kong-social-media-data-map/
Apache Superset
*https://superset.incubator.apache.org/
Supported databases: any, as long as a SQLAlchemy dialect is available for your DB
*https://github.com/apache/incubator-superset
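In practice a database is described to SQLAlchemy, and therefore to Superset, by a connection URI of the form `dialect+driver://user:password@host:port/database`. The sketch below only illustrates that shape by splitting such URIs with the standard library; it is not how SQLAlchemy parses them, and the hosts, credentials, and the `describe_uri` helper are made up for the example.

```python
from urllib.parse import urlsplit


def describe_uri(uri):
    """Split a SQLAlchemy-style connection URI into its visible parts."""
    parts = urlsplit(uri)
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,          # which database family, e.g. postgresql
        "driver": driver or None,    # optional DBAPI driver, e.g. psycopg2
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
    }


# Hypothetical connection strings, for illustration only.
EXAMPLE_URIS = [
    "postgresql+psycopg2://analyst:secret@localhost:5432/warehouse",
    "mysql://analyst:secret@db.example.com/sales",
]

for uri in EXAMPLE_URIS:
    print(describe_uri(uri))
```

Whether a given URI works depends on the matching dialect and driver packages being installed alongside Superset.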
Who uses Apache Superset:
Airbnb, Amino, Brilliant.org, Clark.de, Digit Game Studios, Douban, Endress+Hauser, FBK - ICT center, Faasos, GfK Data Lab, InData Labs, Maieutical Labs, Qunar, Shopkick, Tails.com, Tobii, Tooploox, Udemy, Yahoo!, Zalando
Formerly named Panoramix, then Caravel, now Superset
*https://github.com/apache/incubator-superset
Article on Superset benefits and limitations
https://indatalabs.com/blog/data-strategy/open-source-data-visualization-tool-superset
Roaring Elephant podcast, Episode 41
https://roaringelephant.org/2017/04/25/episode-41-news-news-and-some-more-news/
Thanks for your attention!
Some of the examples shown are available here:
https://github.com/merkylove/data_visualisations_for_datathon_2017
https://www.slideshare.net/RomanMerkulov/data-visualization-tools-in-python/1
Contacts:
r_merkulov@indatalabs.com
merkylovecom@mail.ru
https://www.linkedin.com/in/roman-merkulov-a61804a4/

Roman Merkulov. InData Labs. Applied data visualization tools in Python.