SlideShare una empresa de Scribd logo
1 de 23
Descargar para leer sin conexión
Fairy tales in the land of data
Or - do I know what I’m doing?
By @przemur from
http://about.me/przemek.maciolek
A story
http://yamao.deviantart.com/art/Cleric-comm-343786321 https://www.flickr.com/photos/jsjgeology/8359854092/
Suspense
<

?
“The hammers
from the new
provider are no
good, sayr.”
What would you do?
New hammers
since this month
install.packages('ggplot2')
require('ggplot2')
setwd("/Users/pmm/Desktop/hammer")
all <- read.csv(file="all.csv")
!
qplot(all$month_sequence, all$dwarfs) + geom_smooth()
qplot(all$month_sequence, all$production) + geom_smooth()
!
all$prod_per_dwarf <- all$production / all$dwarfs
qplot(all$month_sequence, all$prod_per_dwarf) + geom_smooth()
Number of dwarfs working in the mine
The hammers from the new
provider started being
distributed to the new miners.
Total production of gold
Per-dwarf average production
Who sees any problem?
Lets look at the production of each
dwarf, relative to the time one applied…
Dwarfs which are using the
OLD hammer design
Dwarfs which are using the
NEW hammer design
new <- read.csv(file="new_relative.csv")
old <- read.csv(file="old_relative.csv")
!
qplot(new$relative_month, new$production)
ggplot(new, aes(x=relative_month, y=production)) + geom_point(shape=19,
position=position_jitter(width=.5,height=0), alpha=.2)

# This will look much better!

old$type='old'
new$type='new'
old_and_new = rbind(old,new)
ggplot(old_and_new, aes(x=relative_month, y=production, color=type)) + geom_point(shape=19,
position=position_jitter(width=.5,height=0), alpha=.2)
Scatterplot showing relative production
done using old and new hammers
What now?
ggplot(old_and_new, aes(x=relative_month, y=production, color=type)) + geom_point(shape=19,
position=position_jitter(width=.5,height=0), alpha=.1) + geom_smooth(method=lm)
The new hammers
wear much faster!
How much did the dwarfs lost?
old_m = lm(production ~ relative_month, old)
new$possible_production <- predict(old_m, new)
sum(new$possible_production) - sum(new$production)
(sum(new$possible_production) - sum(new$production))/sum(new$production)
0.5%
Now, taking into account the price of
hammer, one can select the optimal
strategy… but that’s another story…
Lessons learned …?
• Don’t trust the data blindly, ask questions
• Try to understand underlying rules of the system
• Don’t be shy with trying various models
• If using R, go for ggplot2

Más contenido relacionado

Similar a Fairy tale from the land of data

Benchmarking Perl (Chicago UniForum 2006)
Benchmarking Perl (Chicago UniForum 2006)Benchmarking Perl (Chicago UniForum 2006)
Benchmarking Perl (Chicago UniForum 2006)
brian d foy
 
03 - Average Rates of Changec Cameron 1 Sara Hill.pdf
03 - Average Rates of Changec Cameron 1 Sara Hill.pdf03 - Average Rates of Changec Cameron 1 Sara Hill.pdf
03 - Average Rates of Changec Cameron 1 Sara Hill.pdf
Dr Robert Craig PhD
 
Benchmarking Perl Lightning Talk (NPW 2007)
Benchmarking Perl Lightning Talk (NPW 2007)Benchmarking Perl Lightning Talk (NPW 2007)
Benchmarking Perl Lightning Talk (NPW 2007)
brian d foy
 
bioinfolec_5th_20070713
bioinfolec_5th_20070713bioinfolec_5th_20070713
bioinfolec_5th_20070713
sesejun
 
How To Create Custom DSLs By PHP
How To Create Custom DSLs By PHPHow To Create Custom DSLs By PHP
How To Create Custom DSLs By PHP
Atsuhiro Kubo
 
JSplash - Adobe MAX 2009
JSplash - Adobe MAX 2009JSplash - Adobe MAX 2009
JSplash - Adobe MAX 2009
gyuque
 

Similar a Fairy tale from the land of data (13)

Perlで任意精度計算
Perlで任意精度計算Perlで任意精度計算
Perlで任意精度計算
 
Potential gotchas in making a backbone app
Potential gotchas in making a backbone appPotential gotchas in making a backbone app
Potential gotchas in making a backbone app
 
Benchmarking Perl (Chicago UniForum 2006)
Benchmarking Perl (Chicago UniForum 2006)Benchmarking Perl (Chicago UniForum 2006)
Benchmarking Perl (Chicago UniForum 2006)
 
The Ring programming language version 1.7 book - Part 64 of 196
The Ring programming language version 1.7 book - Part 64 of 196The Ring programming language version 1.7 book - Part 64 of 196
The Ring programming language version 1.7 book - Part 64 of 196
 
03 - Average Rates of Changec Cameron 1 Sara Hill.pdf
03 - Average Rates of Changec Cameron 1 Sara Hill.pdf03 - Average Rates of Changec Cameron 1 Sara Hill.pdf
03 - Average Rates of Changec Cameron 1 Sara Hill.pdf
 
Benchmarking Perl Lightning Talk (NPW 2007)
Benchmarking Perl Lightning Talk (NPW 2007)Benchmarking Perl Lightning Talk (NPW 2007)
Benchmarking Perl Lightning Talk (NPW 2007)
 
Clojure: Practical functional approach on JVM
Clojure: Practical functional approach on JVMClojure: Practical functional approach on JVM
Clojure: Practical functional approach on JVM
 
Ganglia Overview-v2
Ganglia Overview-v2Ganglia Overview-v2
Ganglia Overview-v2
 
XS Japan 2008 Xen Mgmt Japanese
XS Japan 2008 Xen Mgmt JapaneseXS Japan 2008 Xen Mgmt Japanese
XS Japan 2008 Xen Mgmt Japanese
 
The Ring programming language version 1.5.4 book - Part 59 of 185
The Ring programming language version 1.5.4 book - Part 59 of 185The Ring programming language version 1.5.4 book - Part 59 of 185
The Ring programming language version 1.5.4 book - Part 59 of 185
 
bioinfolec_5th_20070713
bioinfolec_5th_20070713bioinfolec_5th_20070713
bioinfolec_5th_20070713
 
How To Create Custom DSLs By PHP
How To Create Custom DSLs By PHPHow To Create Custom DSLs By PHP
How To Create Custom DSLs By PHP
 
JSplash - Adobe MAX 2009
JSplash - Adobe MAX 2009JSplash - Adobe MAX 2009
JSplash - Adobe MAX 2009
 

Último

Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
JohnnyPlasten
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
shivangimorya083
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
shambhavirathore45
 

Último (20)

CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 

Fairy tale from the land of data