Graphical Data
Exploration
Eli Bressert
@astrobiased
Stitch Fix / Data Labs
1 graphics &
exploration
2 statistical
design
What we
[data scientists]
do
1. obtain data
2. explore
3. do research/create data product
4. fine tune project and release
5. rinse and repeat
2. explore
    basic statistics
    simple graphics
    formulate hypotheses
    assess best models & approaches
1 graphics &
exploration
graphic importance
Anscombe’s Quartet
     I            II           III          IV
 x     y      x     y      x     y      x     y
10   8.04    10   9.14    10   7.46     8   6.58
 8   6.95     8   8.14     8   6.77     8   5.76
13   7.58    13   8.74    13  12.74     8   7.71
 9   8.81     9   8.77     9   7.11     8   8.84
11   8.33    11   9.26    11   7.81     8   8.47
14   9.96    14   8.10    14   8.84     8   7.04
 6   7.24     6   6.13     6   6.08     8   5.25
 4   4.26     4   3.10     4   5.39    19  12.50
12  10.84    12   9.13    12   8.15     8   5.56
 7   4.82     7   7.26     7   6.42     8   7.91
 5   5.68     5   4.74     5   5.73     8   6.89
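import seaborn as sns  # awesome package
from scipy.optimize import curve_fit

def func(x, a, b):
    # straight line with intercept a and slope b
    return a + b * x

df = sns.load_dataset("anscombe")  # columns: dataset, x, y

# the summary statistics come out nearly identical for all four datasets
for name, tmp in df.groupby("dataset"):
    popt, pcov = curve_fit(func, tmp.x, tmp.y)  # popt = [a, b]
    print(name, tmp.x.mean(), tmp.y.mean(), tmp.x.var(), tmp.y.var(),
          tmp.x.corr(tmp.y), popt)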
Mean x: 9.0
Mean y: 7.5
Variance x: 11.00
Variance y: 4.13
Correlation between x and y: 0.816
Linear regression coefficients: y = 3.00 + 0.50x
http://goo.gl/Zuw4Qe
[Figure: four scatter plots of y against x, one panel per dataset (I, II, III, IV)]
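The four panels can be redrawn with a few lines of matplotlib. This is a minimal sketch, not the code behind the original figure: the data values are the ones listed on the earlier slide, while the variable names (`quartet`, `fits`), the 2x2 layout, and the output file name `anscombe.png` are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Anscombe's quartet, values as listed on the earlier slide.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(8, 6))
fits = {}  # dataset name -> (intercept, slope)
line_x = np.array([2, 20])
for ax, (name, (x, y)) in zip(axes.flat, quartet.items()):
    slope, intercept = np.polyfit(x, y, 1)  # least-squares straight line
    fits[name] = (intercept, slope)
    ax.scatter(x, y)
    ax.plot(line_x, intercept + slope * line_x)  # the (nearly) shared fit
    ax.set_title(f"dataset {name}")
    ax.set_xlabel("x")
    ax.set_ylabel("y")

fig.tight_layout()
fig.savefig("anscombe.png")  # hypothetical output file name
```

Every panel gets almost the same fitted line, yet the point clouds look nothing alike, which is the case for plotting before modeling.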
complexity
“Now if the function of man is an activity of soul in accordance with, or not
without, rational principle, and if we say a so-and-so and a good so-and-so
have a function which is the same in kind, e.g. a lyre-player and a good lyre-
player, and so without qualification in all cases, eminence in respect of
excellence being added to the function (for the function of a lyre-player is
to play the lyre, and that of a good lyre-player is to do so well): if this is the
case, [and we state the function of man to be a certain kind of life, and this
to be an activity or actions of the soul implying a rational principle, and the
function of a man to be the good and noble performance of these, and if
any action is well performed when it is performed in accordance with the
appropriate excellence: if this is the case,] human good turns out to be
activity of soul in conformity with excellence, and if there are more than
one excellence, in conformity with the best and most complete.”
Nicomachean Ethics, Aristotle
ಠ_ಠ
What did it all mean?
Virtue
overly complex graphics are
analogous to a run-on sentence
[Figure: plot with both axes running from 0 to 250 and a color scale from −1.700 to 1.468]
(╯°□°)╯︵ ┻━┻
simplicity
[Figure: grid of 24 small panels, one each for Feature 1 through Feature 24]
[Figure: heatmap of Feature 1 to Feature 23 against Feature 2 to Feature 24, color scale from −1.0 to 1.0]
[Figure: scatter plot of PC1 (x axis, −4 to 6) against PC2 (y axis, −4 to 5)]
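The progression on these slides, from 24 separate features to a single two-axis scatter plot, can be sketched with a correlation matrix followed by principal component analysis. This is a rough sketch under stated assumptions: the slides do not show their data or method, so the data here is synthetic, the reading of the axis labels as principal components PC1 and PC2 is an inference, and all variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the slides' data (an assumption): 200 rows and
# 24 features driven by 2 hidden factors plus noise.
n_rows, n_features = 200, 24
factors = rng.normal(size=(n_rows, 2))
loadings = rng.normal(size=(2, n_features))
X = factors @ loadings + 0.3 * rng.normal(size=(n_rows, n_features))

# Pairwise correlations between features: what a feature-by-feature
# heatmap with a -1 to 1 color scale displays.
corr = np.corrcoef(X, rowvar=False)  # shape (24, 24)

# PCA via SVD of the standardized data.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
scores = Z @ Vt[:2].T            # column 0 is PC1, column 1 is PC2
explained = S**2 / np.sum(S**2)  # fraction of variance per component

print(f"PC1 + PC2 explain {explained[:2].sum():.0%} of the variance")
```

Plotting `scores[:, 0]` against `scores[:, 1]` then gives one simple graphic in place of 24 separate panels.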
awesome D3.js tools
JavaScript SVG Canvas
D3.js
Vega
Lyra
Vega-Lite
Voyager Polestar
Credit: Jeff Heer
github.com/uwdata
EDA results will affect all that follows
2 statistical
design
processing speed
faster technology
bigger data
Pushing
Boundaries
You have two options
design your
data sample
plan and
execute
hit the big red
button and wait
for the process
to finish
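The trade-off between the two options can be sketched with a simple random sample. This is only an illustration using the standard library: the synthetic population, its size, the sample size of 2,000, and the variable names are all assumptions, and a real sampling design may need stratification or weighting.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical "big" dataset: 200,000 synthetic measurements.
population = [rng.gauss(50.0, 10.0) for _ in range(200_000)]

# Option 1: design a sample. Here, a simple random sample of 2,000 rows.
sample = rng.sample(population, 2_000)
sample_mean = statistics.fmean(sample)
sample_sd = statistics.stdev(sample)
std_error = sample_sd / len(sample) ** 0.5  # uncertainty of the sample mean

# Option 2: hit the big red button and process everything.
full_mean = statistics.fmean(population)

print(f"sample mean = {sample_mean:.2f} +/- {std_error:.2f}")
print(f"full mean   = {full_mean:.2f}")
```

The sample uses 1% of the rows and still reports its own uncertainty, so each explore, hypothesize, model cycle can run on the sample before anything is launched on the full data.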
attention span
?
time cost
hit red button
design and sample
explore, hypothesize, model
explore, hypothesize, model
time
fail frequently
learn fast
?
