Analyzing MLB data with ggplot
Greg Lamp
ggplot
● What is it?
● Alternatives
● How it works
● Why should I use it?
● Brief case study
● Questions
Here I am on
the Internet.
Founder/CTO @ Yhat
Hi, I’m Greg!
What is ggplot?
DSL for graphics
scatterplot
histogram
labels
color
shape
What about matplotlib?
a quick example
matplotlib ggplot
it’s not all bad!
matplotlib
maturity, ipython, customization, community
syntax, api, default themes, learning curve
What about d3.js?
d3.js
ggplot
ggplot d3.js
How it works
Format
ggplot
data frame
“aesthetics”
Aesthetics
color
shape
size
...fill, alpha, slope, intercept, ymin, ymax, ...
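To make the idea of "aesthetics" concrete, here is a minimal sketch in plain Python. It only illustrates the concept of mapping data frame columns to visual properties; it is not how the ggplot package implements it. The helper `resolve_aesthetics`, the column names, and the sample row are all hypothetical.

```python
def resolve_aesthetics(mapping, row):
    """Look up, for one row of data, the value behind each aesthetic.

    `mapping` pairs an aesthetic name (x, y, color, ...) with a column name.
    """
    return {aesthetic: row[column] for aesthetic, column in mapping.items()}


# Hypothetical mapping: which column drives which visual property.
mapping = {"x": "inning", "y": "speed", "color": "pitch_type"}

# Hypothetical row, standing in for one record of a data frame.
row = {"inning": 1, "speed": 95.0, "pitch_type": "FF"}

resolved = resolve_aesthetics(mapping, row)
print(resolved)
```

The point of the design is that the plot describes which column feeds which property once, and the same mapping is then applied to every row.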
Geoms,
Stats, &
Scales
geom_point
geom_area
...there are many
stat_smooth
...there are a few
scale_color_brewer
scale_color_gradient
...there are many
Layers
ggplot() +
geom_point() +
stat_smooth()
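The layering above can be sketched with a toy model of the `+` operator. This is an illustration under stated assumptions, not the ggplot package's actual implementation: the `Plot` and `Layer` classes are hypothetical stand-ins, and the column names are made up.

```python
class Layer:
    """One plot component, such as a geom or a stat."""

    def __init__(self, name):
        self.name = name


class Plot:
    """Toy stand-in for ggplot(): aesthetics, data, and an ordered set of layers."""

    def __init__(self, aesthetics, data, layers=()):
        self.aesthetics = dict(aesthetics)
        self.data = data
        self.layers = tuple(layers)

    def __add__(self, layer):
        # Return a new Plot so a partially built plot can be reused.
        return Plot(self.aesthetics, self.data, self.layers + (layer,))

    def describe(self):
        return [layer.name for layer in self.layers]


def geom_point():
    return Layer("geom_point")


def stat_smooth():
    return Layer("stat_smooth")


# Hypothetical column names; the data is left empty because nothing is drawn.
base = Plot({"x": "pitch_number", "y": "speed"}, data=[])
plot = base + geom_point() + stat_smooth()
print(plot.describe())
```

Because each `+` returns a new object, `base` stays untouched and can be combined with a different set of layers later.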
Why is this
good?
Makes “reasonable assumptions”
not real colors
matplotlib freaks
still not real colors
...but I can guess what you mean
Concise yet expressive
Looks pretty good
(and is easy to customize)
Seaborn: github.com/mwaskom/seaborn
Case Study
pitch speed
103.4 mph
Load ggplot and pandas
Read in our pitch f/x data
define the x-axis
pass in your data frame
add a histogram
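The five steps above might look roughly like the following with the yhat ggplot package for Python. This is a sketch, not the code from the slides: the function name `pitch_speed_histogram`, the `csv_path` argument, and the `start_speed` column name are assumptions, and the package is no longer maintained, so it may not install cleanly alongside recent pandas releases.

```python
def pitch_speed_histogram(csv_path, speed_column="start_speed"):
    """Return a histogram plot of pitch speed (hypothetical helper)."""
    # Load ggplot and pandas. The imports sit inside the function so the
    # sketch can be read and imported even without the packages installed.
    import pandas as pd
    from ggplot import ggplot, aes, geom_histogram

    # Read in the pitch f/x data (csv_path is an assumed file location).
    df = pd.read_csv(csv_path)

    # Define the x-axis via aes(), pass in the data frame, add a histogram.
    return ggplot(aes(x=speed_column), data=df) + geom_histogram()
```

The whole plot is a single expression: the aesthetic mapping and the data frame go to `ggplot()`, and the histogram is added as a layer.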
How does fatigue impact velocity?
...not helpful
What about at the individual level?
Justin Verlander
ggplot lets you fail quicker
Finding Help
/tagged/python-ggplot
http://ggplot.yhathq.com
What’s next?
Thanks!
@theglamp
greg@yhathq.com
