The document discusses various methods for visualizing univariate continuous data using R, including boxplots, histograms, density plots, and QQ plots. It shows examples of visualizing birth weight data using these different plots both individually and combined. Specifically, it demonstrates creating boxplots, histograms, density lines, adjusted boxplots, and layouts combining histograms, densities and boxplots to visualize the distribution of birth weight data.
10. Histogram
Univariate Continuous
Histogram of birth weight
[Figure: histogram; x-axis "Weight (kg)" from 2.0 to 3.5, y-axis "Frequency" from 0 to 400; bars labeled with bin counts 5, 35, 127, 249, 271, 346, 387, 393, 281, 176, 87, 95, 108, 88, 45, 6, 1]
hist(bw, xlab="Weight (kg)", ylab="Frequency", labels=TRUE, ...)
Baan Bapat Data Visualization with R
11. Histogram & density
Univariate Continuous
Histogram & Density line: Birth weight
[Figure: histogram with an overlaid density line; x-axis "Weight (kg)" from 2.0 to 3.5, y-axis "Probability" from 0.0 to 1.5]
hist(..., freq=FALSE)
lines(density(bw), ...)
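The histogram-plus-density overlay can be tried end to end with simulated data. This is a minimal sketch, assuming a stand-in vector `bw` drawn from a normal distribution (the talk's birth-weight data is not reproduced here); the `main`, `col` and `lwd` settings are illustrative choices, not taken from the slides.

```r
# Stand-in for the birth-weight vector `bw` (kg); simulated, not the talk's data.
set.seed(42)
bw <- rnorm(2700, mean = 2.8, sd = 0.3)

# freq = FALSE puts bar heights on the density scale, so the total bar area is 1
# and a kernel density estimate can be drawn on the same axes.
h <- hist(bw, freq = FALSE, xlab = "Weight (kg)", ylab = "Probability",
          main = "Histogram & Density line: Birth weight")
d <- density(bw)
lines(d, col = "red", lwd = 2)

# Total area of the histogram bars on the density scale.
bar_area <- sum(h$density * diff(h$breaks))
```

With the default `freq=TRUE` the bars would show counts, and the density line would sit almost flat along the bottom of the plot.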
13. Histogram & density
Univariate Continuous
Histogram & Density lines: Birth weight mixtures?
[Figure: histogram with density lines; x-axis "Weight (kg)" from 2.0 to 3.5, y-axis "Probability" from 0.0 to 1.5]
More breaks in hist
14. Histogram & density
Univariate Continuous
Histogram & Density lines: Birth weight mixtures?
[Figure: histogram with density lines; x-axis "Weight (kg)" from 2.0 to 3.5, y-axis "Probability" from 0.0 to 1.5]
More breaks in hist
Density mixture
16. Histogram & adjusted boxplot
Univariate Continuous
Histogram, Density & Adjbox
[Figure: histogram of bw with a density line (y-axis "Density" from 0.0 to 1.5) stacked above an adjusted boxplot; shared x-axis "Weight (kg)" from 2.0 to 3.5]
mat <- matrix(c(1,2))
layout(mat, heights=c(0.8, 0.2))
par(mar=c(0,4,3,1), bty="n")
hist(..., axes=FALSE)
axis(2)
boxplot()
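A runnable sketch of the two-panel layout follows. It assumes simulated data in place of `bw` and uses the base `boxplot()` rather than an adjusted boxplot, which would need an extra package; the second `par(mar=...)` call and the shared `xr` range are additions made here so that the two panels line up.

```r
# Stand-in for the birth-weight vector `bw` (kg); simulated, not the talk's data.
set.seed(42)
bw <- rnorm(2700, mean = 2.8, sd = 0.3)
xr <- range(bw)                       # shared x-range so both panels line up

# Two stacked panels: 80% of the height for the histogram, 20% for the boxplot.
mat <- matrix(c(1, 2))
n_panels <- layout(mat, heights = c(0.8, 0.2))

# Panel 1: histogram without its own x-axis, plus a density line.
par(mar = c(0, 4, 3, 1), bty = "n")
hist(bw, freq = FALSE, xlim = xr, axes = FALSE,
     main = "Histogram, Density & boxplot", xlab = "")
axis(2)
lines(density(bw))

# Panel 2: horizontal boxplot carrying the shared x-axis.
par(mar = c(4, 4, 0, 1))
boxplot(bw, horizontal = TRUE, ylim = xr, xlab = "Weight (kg)")

layout(1)                             # restore the single-panel layout
```

`layout()` returns the number of panels it set up, which is a quick check that the matrix was read as intended.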
24. Univariate Categorical
Topics most visited on English Wikipedia on 31 May 2013
Topic                               No. hits
Cult                                  291439
Rituparno Ghosh                       215843
Cat anatomy                           102960
Facebook                               93181
Fast & Furious 6                       84014
Liberace                               73162
Game of Thrones                        70599
Jean-Claude Romand                     70144
Game of Thrones (season 3)             69752
Arrested Development (TV series)       69573
25. Barplot
Univariate Categorical
[Figure: horizontal bar plot of hits per topic, from Arrested Development (TV) at the top to Cult at the bottom; x-axis from 0 to 250000; each bar labeled with its hit count, from 69573 to 291439]
n <- length(wiki)
bp <- barplot(wiki, horiz=TRUE, col=topo.colors(n))
text(y=bp, x=wiki, labels=wiki, cex=0.8, pos=2)
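The slide code relies on a named vector `wiki` that is not shown. A minimal sketch that builds it from the hit counts in the table of most-visited topics, assuming that is how `wiki` was defined; the margin, `las` and `cex.names` settings are additions made here so the long topic names fit.

```r
# Hit counts from the table of most-visited topics (English Wikipedia, 31 May 2013).
wiki <- c("Cult" = 291439, "Rituparno Ghosh" = 215843, "Cat anatomy" = 102960,
          "Facebook" = 93181, "Fast & Furious 6" = 84014, "Liberace" = 73162,
          "Game of Thrones" = 70599, "Jean-Claude Romand" = 70144,
          "Game of Thrones (season 3)" = 69752,
          "Arrested Development (TV series)" = 69573)

n <- length(wiki)
par(mar = c(4, 14, 1, 1))             # wide left margin for the topic names
# barplot() returns the bar midpoints, which text() reuses to place the labels.
bp <- barplot(wiki, horiz = TRUE, las = 1, col = topo.colors(n), cex.names = 0.7)
text(y = bp, x = wiki, labels = wiki, cex = 0.8, pos = 2)
```

With `horiz=TRUE` the first element is drawn at the bottom, so a vector sorted in decreasing order puts the largest bar last when read from the top.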
30. Pie
Univariate Categorical
[Figure: pie chart with one labeled slice per topic, from Cult to Arrested Development (TV)]
pie(wiki, init.angle=90)
31. Exploded pie
Univariate Categorical
[Figure: exploded 3D pie chart with one labeled slice per topic, from Cult to Arrested Development (TV)]
require(plotrix)
pie3D(wiki, labels=names(wiki), explode=0.1)
33. Dotchart
Univariate Categorical
[Figure: dot chart of hits per topic, from Arrested Development (TV) at the top to Cult at the bottom; x-axis ticks at 100000, 200000, 300000]
dotchart(wiki, pch=19, col=rainbow(n))
35. Barplot
Bivariate Categorical
Stacked bar plot
[Figure: one stacked bar per hair color (Black, Brown, Red, Blond); x-axis "Hair color", y-axis "Eye color frequency" from 0 to 140; legend: Brown, Blue, Hazel, Green]
barplot(HairEyeColor)
legend(x="topright", legend=attr(HairEyeColor, "dimnames")$Eye, pch=18, col=mycols)
38. Barplot
Bivariate Categorical
Grouped bar plot
[Figure: bars grouped by hair color (Black, Brown, Red, Blond), one bar per eye color; x-axis "Hair color", y-axis "Eye color frequency" from 0 to 60; legend: Brown, Blue, Hazel, Green]
barplot(..., beside=TRUE)
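`HairEyeColor` is a three-way table (Hair x Eye x Sex), while `barplot()` expects a vector or matrix, so some reduction step is needed before plotting. The sketch below sums over Sex; that choice is an assumption, since the slides do not show how the third dimension was handled, and the palette is a hypothetical stand-in for `mycols`.

```r
# HairEyeColor is a 3-way table (Hair x Eye x Sex); barplot() needs a matrix.
# Assumption: sum over Sex. The slides do not show how this step was done.
counts <- apply(HairEyeColor, c(2, 1), sum)   # rows = Eye, columns = Hair

# Hypothetical palette standing in for the slide's `mycols`.
mycols <- c("saddlebrown", "steelblue", "darkkhaki", "darkgreen")

op <- par(mfrow = c(1, 2))
# Stacked: one bar per hair color, segments by eye color.
barplot(counts, col = mycols, xlab = "Hair color", ylab = "Eye color frequency",
        main = "Stacked bar plot")
legend("topright", legend = rownames(counts), pch = 18, col = mycols)
# Grouped: the same matrix with beside = TRUE.
barplot(counts, beside = TRUE, col = mycols, xlab = "Hair color",
        ylab = "Eye color frequency", main = "Grouped bar plot")
par(op)
```

Rows of the matrix become the segments (or the bars within a group) and columns become the bar positions, which is why the table is arranged as Eye by Hair here.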
42. Cars data
Fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models), ... Motor Trend
dataset: mtcars
43. Boxplot
Bivariate Continuous Vs Categorical
Car Mileage ~ cylinders
[Figure: boxplots of miles per gallon by number of cylinders; x-axis "Cylinders" (4, 6, 8), y-axis "Miles per gallon" from 10 to 30]
boxplot(mpg ~ cyl, data=mtcars)
54. Lots of data points
Bivariate Continuous Vs Continuous
Alpha transparency
[Figure: dense scatter plot drawn with semi-transparent points; x-axis "x" from 4 to 16, y-axis "y" from 10 to 20]
rgb(0.2, 0.2, 0.8, 0.2)
plot(..., col = rgb(0, 0.4, 0, 0.2))
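The effect of alpha transparency is easiest to see on a large simulated point cloud. A minimal sketch, assuming stand-in vectors `x` and `y` (the talk's data is not shown) and a solid plotting symbol chosen here for illustration:

```r
# Simulated, heavily overplotted data; `x` and `y` are stand-ins, not the talk's data.
set.seed(1)
x <- rnorm(20000, mean = 10, sd = 2)
y <- 5 + x + rnorm(20000, sd = 1.5)

# The fourth argument of rgb() is alpha (opacity) on the same 0-1 scale as the
# color channels; 0.2 means each point is 20% opaque.
pt_col <- rgb(0, 0.4, 0, 0.2)
plot(x, y, pch = 16, col = pt_col, main = "Alpha transparency")
```

Where many points overlap, their opacities accumulate, so dense regions look darker than sparse ones.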
55. Lots of data points – hexbin
Bivariate Continuous Vs Continuous
Hexagonal Binning
[Figure: hexbin plot; x-axis "x" from 4 to 16, y-axis "y" from 10 to 20; color legend "Counts" from 1 to 73]
require(hexbin)
bin <- hexbin(x, y, xbins=50)
plot(bin, colramp=BTY, colorcut=seq(0,1,1/16))
64. The plot method
Appropriately plots the object passed to it!
Timeseries – decomposition
nino3: Sea surface temperature of El Niño
plot(decompose(nino3))
[Figure: "Decomposition of additive time series"; stacked panels observed, trend, seasonal, random; x-axis "Time" from 1950 to 2000]
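The same dispatch can be reproduced with a series that ships with R. This sketch swaps in the built-in `co2` dataset for `nino3`, which is not part of base R, so the resulting panels will differ from the slide.

```r
# `co2` (monthly Mauna Loa CO2, shipped with R) stands in for `nino3`,
# which is not part of base R.
dec <- decompose(co2)       # additive decomposition by default

# decompose() returns an object of class "decomposed.ts"; plot() dispatches on
# that class and draws four stacked panels: observed, trend, seasonal, random.
plot(dec)
```

The point of the slide is that `plot()` is generic: the caller never names the four-panel layout, the class of the object selects it.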
67. The plot method
Appropriately plots the object passed to it!
Cluster cars based on their attributes
hc <- hclust(dist(mtcars))
plot(hc); rect.hclust(hc, k=4)
[Figure: "Cluster Dendrogram" of the 32 mtcars car models, from hclust (*, "complete") on dist(mtcars); y-axis "Height" from 0 to 400]
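The clustering example runs as-is on the built-in `mtcars` data. The sketch below repeats it and adds `cutree()`, which is not on the slide, to show the cluster membership that `rect.hclust()` draws as boxes.

```r
# Complete-linkage clustering on Euclidean distances (the hclust() default method).
hc <- hclust(dist(mtcars))

# plot() on an "hclust" object draws the dendrogram;
# rect.hclust() boxes the k clusters on the existing plot.
plot(hc)
rect.hclust(hc, k = 4)

# cutree() returns the same 4-cluster assignment as a named integer vector.
groups <- cutree(hc, k = 4)
table(groups)
```

Because `dist(mtcars)` uses the raw columns, variables on large scales such as `disp` and `hp` dominate the distances; scaling the data first is a common variation.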
68. The plot method
Appropriately plots the object passed to it!
Decision tree: Given mileage, how many cylinders does the car have?
require(rpart); require(rpart.plot)
rp1 <- rpart(factor(cyl) ~ mpg, data=mtcars)
prp(rp1)
[Figure: decision tree with splits mpg >= 21 and mpg >= 18 (yes/no branches) and leaves 4, 6, 8]
69. Financial timeseries
Multivariate: Continuous Vs Time
OHLC data of stock price
[Figure: chart of YHOO [2013-04-01/2013-06-20]; price panel from 23 to 27, Last 25.35; volume panel in millions from 10 to 40, Volume 18,811,400; x-axis dates from Apr 01 2013 to Jun 20 2013]
require(quantmod)
getSymbols("YHOO", from="2013-04-01")
chartSeries(YHOO)
73. Complex data plotting
Multivariate data, mixed modes
lattice: based on the idea of conditioning on the values taken on by one or more of the variables in a data set.
  xyplot
  levelplot
  panel functions
ggplot2: implementation of the grammar of graphics in R.
  data
  aesthetics
  geometry
  statistical operation
  scales
  facets
  coordinates
  options
76. xyplot
Multivariate: Continuous Vs Continuous, by categorical
Car resale price as a function of mileage and model
[Figure: lattice scatter plots of Price (20000 to 60000) against Mileage (0 to 50000), one panel per model: 9_3, 9_3 HO, 9_5, 9_5 HO, 9-2X AWD, AVEO, Bonneville, Cavalier, Century, Classic, Cobalt, Corvette, CST-V, CTS, Deville, G6, Grand Am, Grand Prix, GTO, Impala, Ion, L Series, Lacrosse, Lesabre, Malibu, Monte Carlo, Park Avenue, STS-V6, STS-V8, Sunfire, Vibe, XLR-V8]
require(lattice)
xyplot(Price ~ Mileage | Model, data=ca)
84. xyplot
Multivariate: Continuous Vs Several Categorical
[Figure: lattice plot of barley yield by variety, one panel per site (Grand Rapids, Duluth, University Farm, Morris, Crookston, Waseca); x-axis "variety" (Svansota, No. 462, Manchuria, No. 475, Velvet, Peatland, Glabron, No. 457, Wisconsin No. 38, Trebi); y-axis "Barley Yield (bushels/acre)" from 20 to 60; legend: 1931, 1932]
xyplot(yield ~ variety | site, data = barley, groups = year,
       pch = c(21,22), layout = c(1,6), stack = TRUE,
       auto.key = list(space = "right"),
       ylab = "Barley Yield (bushels/acre)",
       scales = list(x = list(rot = 45)))
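The conditioning formula can be tried directly, because the `barley` data ships with lattice. A minimal sketch that leaves out the `pch` and `stack` arguments from the slide to keep the call short; storing the result in `p` is an addition made here to show that lattice returns an object rather than drawing immediately.

```r
library(lattice)   # ships with standard R distributions; `barley` comes with it

# `|` conditions on site (one panel per site); `groups` distinguishes years
# within each panel.
p <- xyplot(yield ~ variety | site, data = barley, groups = year,
            layout = c(1, 6), auto.key = list(space = "right"),
            ylab = "Barley Yield (bushels/acre)",
            scales = list(x = list(rot = 45)))

# lattice functions return a "trellis" object; printing it draws the plot.
print(p)
```

At the console the object prints automatically, but inside a function, loop or sourced script the explicit `print()` is needed for the plot to appear.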
91. ggplot
qplot: iris
Multivariate data iris: flower measurements of 3 species
qplot(Sepal.Length, Petal.Length, data = iris, color = Species, size=Petal.Width, alpha=I(0.7))
[Figure: scatter plot; x-axis Sepal.Length from 5 to 8, y-axis Petal.Length from 2 to 6; legends: Species (setosa, versicolor, virginica) and log(Petal.Width) from -2 to 0]
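`qplot()` has been deprecated since ggplot2 3.4.0, so the same plot is sketched here with the layered `ggplot()` interface. This assumes ggplot2 is installed; the mappings mirror the `qplot()` call and nothing else is taken from the slides.

```r
library(ggplot2)

# Same mappings as the qplot() call, written as explicit layers.
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length,
                      colour = Species, size = Petal.Width)) +
  geom_point(alpha = 0.7)   # constant alpha, the counterpart of alpha = I(0.7)

print(p)
```

Setting `alpha` outside `aes()` applies one fixed value to every point, which is what `I(0.7)` expressed in the `qplot()` form.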
92. qplot
several time series
Growth of Orange trees
qplot(age, circumference, data = Orange, geom = c("point", "line"), colour = Tree)
[Figure: points joined by lines, one series per tree; x-axis age from 400 to 1600, y-axis circumference from 50 to 200; legend Tree: 3, 1, 5, 2, 4]
93. qplot
Diamond: price, carat & cut
qplot(carat, price, data=diamonds, colour=cut, geom=c("point", "smooth"))
defaults + layers + scales + coordinate system
Layer = data + mapping + geom + stat + position
96. Human Development and Corruption
UNDP Corruption Perception and Human Development Data
HDI Vs CPI
[Figure: scatter plot of countries colored by Region (Americas, APAC, East Eur & Central Asia, EU & West Eur, MENA, SSA); x-axis "Corruption Perception Index 2011 (10 = least)" from 2.5 to 7.5, y-axis "Human Development Index 2011 (1 = best)" from 0.4 to 1.0; labeled countries: Chad, Afghanistan, Nigeria, Bhutan, India, Cape Verde, Indonesia, China, Ecuador, Saint Lucia, Kuwait, Bahrain, Italy, Hong Kong, United States, Australia, Norway]
Country
Region
97. Human Development and Corruption
[Figure: scatter plot of HDI (0.4 to 0.8) against CPI (2.5 to 7.5), points colored by Region (Americas, APAC, East Eur & Central Asia, EU & West Eur, MENA, SSA)]
pc1 <- ggplot(dat, aes(x=CPI, y=HDI, color=Region))
pc1 <- pc1 + geom_point(shape=9)
99. Human Development and Corruption
[Figure: scatter plot of HDI (0.4 to 0.8) against CPI (2.5 to 7.5), colored by Region, with text labels on selected countries: Chad, Afghanistan, Nigeria, Bhutan, India, Cape Verde, Indonesia, China, Ecuador, Saint Lucia, Kuwait, Bahrain, Italy, Hong Kong, United States, Australia, Norway]
labs <- c("Chad",...
pc2 <- pc1 + geom_text(aes(label = Country), color="black", size=3, hjust=1.1, data=dat[dat$Country %in% labs,])
104. Human Development and Corruption
[Figure: labeled scatter plot of HDI (0.4 to 1.0) against CPI (2.5 to 7.5), colored by Region, with a black fitted trend line]
pc3 <- pc2 + geom_smooth(method="lm", color="black", formula = y ~ poly(x, 2))
108. Human Development and Corruption
[Figure: final plot with black-and-white theme; x-axis "Corruption Perception Index 2011 (10 = least)" from 2.5 to 7.5, y-axis "Human Development Index 2011 (1 = best)" from 0.4 to 1.0; horizontal Region legend on top]
pc4 <- pc3 + theme_bw() +
  scale_x_continuous(...) +
  scale_y_continuous(...) +
  theme(legend.position = "top", legend.direction = "horizontal")
117. sp
Plotting spatial data
Swiss Language Regions
[Figure: map of Swiss regions colored by language; legend: french, german, italian]
require(sp)
url("http://gadm.org/data/rda/CHE_adm1.RData")
language <- c("german", ...)
col = terrain.colors(...)
spplot(gadm, "language", col.regions=col)
123. Interactive Visualization
R + Ggobi
  require(rggobi)
  ggobi(iris)
R + iplots
shiny
  ui.R
  server.R
  runApp(<directory>)
R + Google Chart Tools
  require(googleVis)
  demo(googleVis)
  World Bank country indicators data
  gvisMotionChart(...)
127. Success Stories
Flying in the USA
Glaciers melt as mountains warm
Soda, pop, coke, ...?