R Programming: Transform/Reshape Data In R

Learn to transform/reshape data in R. This is part of the Working With Data module of the R Programming course by r-squared.


1. 1. r-squared Slide 1 www.r-squared.in/rprogramming R Programming Learn the fundamentals of data analysis with R.
2. Course Modules: ✓ Introduction ✓ Elementary Programming ✓ Working With Data ✓ Selection Statements ✓ Loops ✓ Functions ✓ Debugging ✓ Unit Testing
3. Working With Data: ✓ Data Types ✓ Data Structures ✓ Data Creation ✓ Data Info ✓ Data Subsetting ✓ Comparing R Objects ✓ Importing Data ✓ Exporting Data ✓ Data Transformation ✓ Numeric Functions ✓ String Functions ✓ Mathematical Functions
4. Data Transformation: In this section, we will explore built-in R functions that can be used for transforming/reshaping data. The section is divided into 4 sub-sections: ● Reorder Data ● Subset/Filter Data ● Combine Data ● Transform Data
5. Reorder Data: When analyzing data, it is sometimes necessary to reorder it because it cannot be used in its original layout. Sorting is the most common example of such reordering. In this section, we will learn the following functions: ✓ t (transpose) ✓ order ✓ sort ✓ rank
6. t()
    Description: t() returns the transpose of a matrix or data frame.
    Syntax: t(matrix / data frame)
    Returns: The transpose of the matrix or data frame.
    Documentation: help(t)
7. t() Examples
        > # example 1
        > m <- matrix(1:6, nrow = 2)
        > dim(m)
        [1] 2 3
        > dim(t(m))
        [1] 3 2
        > m        # 2 x 3 matrix
             [,1] [,2] [,3]
        [1,]    1    3    5
        [2,]    2    4    6
        > t(m)     # t() returns a 3 x 2 matrix
             [,1] [,2]
        [1,]    1    2
        [2,]    3    4
        [3,]    5    6
8. t() Examples
        > # example 2
        > data <- mtcars
        > head(data)
                           mpg cyl disp  hp drat    wt  qsec vs am gear carb
        Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
        Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
        Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
        Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
        Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
        Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
        > data_transpose <- t(data)
        > head(data_transpose)    # only the first six columns are shown here
             Mazda RX4 Mazda RX4 Wag Datsun 710 Hornet 4 Drive Hornet Sportabout Valiant
        mpg      21.00        21.000      22.80         21.400             18.70   18.10
        cyl       6.00         6.000       4.00          6.000              8.00    6.00
        disp    160.00       160.000     108.00        258.000            360.00  225.00
        hp      110.00       110.000      93.00        110.000            175.00  105.00
        drat      3.90         3.900       3.85          3.080              3.15    2.76
        wt        2.62         2.875       2.32          3.215              3.44    3.46
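One detail the t() examples do not spell out is the type of the result. The sketch below uses two small made-up data frames (num_df and mixed_df are illustrative names, not from the slides) to show that t() on a data frame goes through a matrix, so a data frame with a character column comes back as a character matrix.

```r
# t() on a data frame returns a matrix, not a data frame
num_df <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
num_t <- t(num_df)
class(num_t)      # a matrix
dim(num_t)        # rows and columns are swapped

# With mixed column types every value is coerced to character
mixed_df <- data.frame(id = 1:2, label = c("x", "y"))
mixed_t <- t(mixed_df)
typeof(mixed_t)
```

If a data frame is needed afterwards, wrapping the result in as.data.frame() is one option.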
9. order()
    Description: order() returns the permutation of indices that sorts a vector. Indexing the vector (or the rows of a data frame) with these indices puts it in sorted order.
    Syntax: order(vector, decreasing = FALSE)
    Returns: The indices that arrange the object in sorted order.
    Documentation: help(order)
10. order() Examples
        > # example 1
        > x <- sample(1:10)
        > x
         [1] 10  9  6  2  4  3  8  1  7  5
        > # let us sort x using the indices
        > x[c(8, 4, 6, 5, 10, 3, 9, 7, 2, 1)]
         [1]  1  2  3  4  5  6  7  8  9 10
        > order(x)
         [1]  8  4  6  5 10  3  9  7  2  1
        > x[order(x)]
         [1]  1  2  3  4  5  6  7  8  9 10
11. order() Examples
        > # example 2
        > data_ascending <- data[order(data$mpg), ]
        > data_descending <- data[order(-data$mpg), ]
        > head(data_ascending)
                             mpg cyl disp  hp drat    wt  qsec vs am gear carb
        Cadillac Fleetwood  10.4   8  472 205 2.93 5.250 17.98  0  0    3    4
        Lincoln Continental 10.4   8  460 215 3.00 5.424 17.82  0  0    3    4
        Camaro Z28          13.3   8  350 245 3.73 3.840 15.41  0  0    3    4
        Duster 360          14.3   8  360 245 3.21 3.570 15.84  0  0    3    4
        Chrysler Imperial   14.7   8  440 230 3.23 5.345 17.42  0  0    3    4
        Maserati Bora       15.0   8  301 335 3.54 3.570 14.60  0  1    5    8
        > head(data_descending)
                        mpg cyl  disp  hp drat    wt  qsec vs am gear carb
        Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
        Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
        Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
        Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
        Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
        Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
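order() also accepts several keys, which the examples above do not show. A minimal sketch, assuming we want mtcars sorted by cyl in ascending order and by mpg in descending order within each cyl group (the name sorted is illustrative):

```r
data <- mtcars

# First key: cyl ascending. Second key: mpg descending (negated).
sorted <- data[order(data$cyl, -data$mpg), ]

head(sorted[, c("mpg", "cyl")])
```

Negating a key only works for numeric columns; for other types, the decreasing argument of order() is the usual alternative.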
12. sort()
    Description: sort() sorts the elements of a vector or factor in ascending or descending order.
    Syntax: sort(vector / factor, decreasing = FALSE)
    Returns: The sorted vector or factor.
    Documentation: help(sort)
13. sort() Examples
        > # example 1
        > x <- sample(1:10)
        > x
         [1] 10  9  6  2  4  3  8  1  7  5
        > sort(x)                       # ascending order
         [1]  1  2  3  4  5  6  7  8  9 10
        > # example 2
        > x <- sample(1:10)
        > x
         [1] 10  9  6  2  4  3  8  1  7  5
        > sort(x, decreasing = TRUE)    # descending order
         [1] 10  9  8  7  6  5  4  3  2  1
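The sort() examples use complete integer vectors. The sketch below, with made-up vectors, shows two behaviours worth knowing: missing values are dropped unless na.last is set, and character vectors can be sorted as well.

```r
x <- c(3, NA, 1, 2)

dropped <- sort(x)                  # NA is removed by default
kept    <- sort(x, na.last = TRUE)  # NA is kept and placed at the end

words <- c("pear", "apple", "fig")
words_desc <- sort(words, decreasing = TRUE)

dropped
kept
words_desc
```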
14. rank()
    Description: rank() returns the sample ranks of the values in a vector.
    Syntax: rank(vector)
    Returns: Sample ranks of the values in a vector.
    Documentation: help(rank)
15. rank() Examples
        > # example 1
        > x <- sample(1:10)
        > x
         [1]  7  9  1  8  6  5  3  2 10  4
        > rank(x)
         [1]  7  9  1  8  6  5  3  2 10  4
        > # example 2
        > x2 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)
        > order(x2)
         [1]  2  4  7  1 10  3  5  9 11  8  6
        > sort(x2)
         [1] 1 1 2 3 3 4 5 5 5 6 9
        > (r2 <- rank(x2))    # ties are averaged
         [1]  4.5  1.5  6.0  1.5  8.0 11.0  3.0 10.0  8.0  4.5  8.0
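Averaging is only the default treatment of ties. As a small sketch, the same x2 vector can be ranked with the ties.method argument of rank(); the names r_min and r_first are illustrative.

```r
x2 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)

# "min": every tied value gets the smallest rank of its group
r_min <- rank(x2, ties.method = "min")

# "first": ties are broken by position, so all ranks are distinct
r_first <- rank(x2, ties.method = "first")

r_min
r_first
```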
16. Subset/Filter Data: In this section, we will look at functions that can be used for subsetting/filtering data. ✓ subset ✓ which ✓ with ✓ drop ✓ droplevels
17. subset()
    Description: subset() can be used to subset data from vectors and data frames.
    Syntax: subset(vector / data frame, condition)
    Returns: Vector or data frame.
    Documentation: help(subset)
18. subset() Examples
        > # example 1
        > # subsetting vectors
        > x
         [1]  7  9  1  8  6  5  3  2 10  4
        > subset(x, x > 5)
        [1]  7  9  8  6 10
        > subset(x, x == 4)
        [1] 4
        > subset(x, x > 4 & x < 7)
        [1] 6 5
19. subset() Examples
        > # example 2
        > # subsetting data frames
        > subset(mtcars, mpg >= 23 & mpg <= 27)
                       mpg cyl  disp hp drat   wt qsec vs am gear carb
        Merc 240D     24.4   4 146.7 62 3.69 3.19 20.0  1  0    4    2
        Porsche 914-2 26.0   4 120.3 91 4.43 2.14 16.7  0  1    5    2
        > subset(mtcars, mpg >= 23 & mpg <= 27, select = c(cyl, hp))
                      cyl hp
        Merc 240D       4 62
        Porsche 914-2   4 91
        > subset(mtcars, cyl == 4 & hp > 100, select = mpg:wt)
                      mpg cyl  disp  hp drat    wt
        Lotus Europa 30.4   4  95.1 113 3.77 1.513
        Volvo 142E   21.4   4 121.0 109 4.11 2.780
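subset() and bracket indexing differ when the condition evaluates to NA. The sketch below uses a small made-up data frame (df, id and score are illustrative names) to compare the two; it is meant as an illustration, not as part of the original slides.

```r
df <- data.frame(id = 1:4, score = c(10, NA, 30, 5))

# subset() silently drops rows where the condition is NA
via_subset <- subset(df, score > 8)

# Logical bracket indexing keeps an all-NA row for each NA in the condition
via_bracket <- df[df$score > 8, ]

# Wrapping the condition in which() drops the NA rows again
via_which <- df[which(df$score > 8), ]

nrow(via_subset)
nrow(via_bracket)
nrow(via_which)
```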
20. which()
    Description: which() takes a logical vector or array (usually the result of a condition such as x > 4) and returns the indices of the elements that are TRUE.
    Syntax: which(condition)
    Returns: Indices of the values for which the condition evaluates to TRUE.
    Documentation: help(which)
21. which() Examples
        > # example 1
        > x
         [1]  7  9  1  8  6  5  3  2 10  4
        > which(x == 5)    # returns the index of the value 5
        [1] 6
        > which(x > 4)     # returns the indices of all values greater than 4
        [1] 1 2 4 5 6 9
        > # example 2
        > # using a data frame
        > which(data$mpg > 20)              # returns indices of values greater than 20
         [1]  1  2  3  4  8  9 18 19 20 21 26 27 28 32
        > data$mpg[which(data$mpg > 20)]    # returns the values greater than 20
         [1] 21.0 21.0 22.8 21.4 24.4 22.8 32.4 30.4 33.9 21.5 27.3 26.0 30.4 21.4
22. which() Examples
        > # example 3
        > m
             [,1] [,2] [,3]
        [1,]    1    3    5
        [2,]    2    4    6
        > which(m > 5)
        [1] 6
        > which(m == 5)
        [1] 5
        > which(letters == "r")    # r is the 18th letter of the alphabet
        [1] 18
        > div_by_3 <- m %% 3 == 0
        > div_by_3
              [,1]  [,2]  [,3]
        [1,] FALSE  TRUE FALSE
        [2,] FALSE FALSE  TRUE
        > which(div_by_3)          # positions of the values in m that are divisible by 3
        [1] 3 6
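For a matrix, which() returns positions counted down the columns. When row and column positions are more useful, the arr.ind argument can be set, as in this short sketch based on the same matrix m (idx is an illustrative name):

```r
m <- matrix(1:6, nrow = 2)

# arr.ind = TRUE returns one (row, col) pair per TRUE element
idx <- which(m %% 3 == 0, arr.ind = TRUE)
idx

# The pairs can be used to index the matrix directly
m[idx]
```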
23. droplevels()
    Description: droplevels() drops all unused levels from a factor, or from every factor in a data frame.
    Syntax: droplevels(factor / data frame)
    Returns: Factor (or data frame) without unused levels.
    Documentation: help(droplevels)
24. droplevels() Examples
        > # example 1
        > # cyl must be a factor for levels() to work; this step is not shown on the slide
        > data$cyl <- as.factor(data$cyl)
        > data_cyl <- subset(data, cyl == 4 | cyl == 6)
        > levels(data_cyl$cyl)
        [1] "4" "6" "8"
        > droplevels(data_cyl$cyl)
         [1] 6 6 4 6 6 4 4 6 6 4 4 4 4 4 4 4 6 4
        Levels: 4 6
        > levels(data_cyl$cyl)    # the original object is unchanged
        [1] "4" "6" "8"
        > summary(droplevels(data_cyl$cyl))
         4  6
        11  7
25. droplevels() Examples
        > # example 2
        > aq <- transform(airquality, Month = factor(Month, labels = month.abb[5:9]))
        > aq <- subset(aq, Month != "Jul")
        > table(aq$Month)
        May Jun Jul Aug Sep
         31  30   0  31  30
        > table(droplevels(aq)$Month)
        May Jun Aug Sep
         31  30  31  30
        > droplevels(data_cyl)$cyl
         [1] 6 6 4 6 6 4 4 6 6 4 4 4 4 4 4 4 6 4
        Levels: 4 6
        > table(droplevels(data_cyl)$cyl)
         4  6
        11  7
        > table(data_cyl$cyl)
         4  6  8
        11  7  0
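The droplevels() examples rely on cyl being a factor. A self-contained sketch of the whole sequence, assuming the conversion is done with factor() before subsetting, could look like this:

```r
data <- mtcars
data$cyl <- factor(data$cyl)          # levels "4", "6", "8"

# Filtering rows does not remove the now-unused level "8"
data_cyl <- subset(data, cyl != 8)
levels(data_cyl$cyl)

# droplevels() returns a new factor; assign it back to keep the change
data_cyl$cyl <- droplevels(data_cyl$cyl)
levels(data_cyl$cyl)
```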
26. Combine Data: In this section, we will look at functions that combine data. ✓ append ✓ merge ✓ cbind ✓ rbind ✓ interaction
27. append()
    Description: append() adds elements to a vector. We can specify the index after which the elements are added.
    Syntax: append(vector, elements, after)
    Returns: A vector with the appended elements.
    Documentation: help(append)
28. append() Examples
        > # example 1
        > x1 <- sample(1:10)
        > x2 <- sample(1:5)
        > append(x1, x2)
         [1]  8  7  4  9  5 10  6  1  2  3  2  5  3  1  4
        > # example 2
        > x1 <- sample(1:10)
        > x2 <- sample(1:5)
        > append(x1, x2, after = 2)
         [1]  8  7  2  5  3  1  4  4  9  5 10  6  1  2  3
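Because the examples above use sample(), their output changes from run to run. A deterministic sketch with a fixed vector may make the after argument easier to follow (the names prepended and inserted are illustrative):

```r
x <- c(10, 20, 30)

# after = 0 inserts before the first element
prepended <- append(x, 99, after = 0)

# after = 1 inserts right after the first element
inserted <- append(x, c(1, 2), after = 1)

prepended
inserted
```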
29. merge()
    Description: merge() merges two data frames by common column or row names.
    Syntax: merge(dataframe1, dataframe2, by)
    Returns: Data frame.
    Documentation: help(merge)
30. merge() Examples
        > # example 1
        > name <- c("John", "Jane", "Tom", "Jennifer")
        > age <- c(20, 25, 30, 28)
        > gender <- factor(c("male", "female", "male", "female"))
        > data_1 <- data.frame(name, age)
        > data_2 <- data.frame(name, gender)
        > data_3 <- merge(data_1, data_2, by = "name")
        > head(data_3)
              name age gender
        1     Jane  25 female
        2 Jennifer  28 female
        3     John  20   male
        4      Tom  30   male
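In the example above every name appears in both data frames. The sketch below, using two made-up data frames (left and right are illustrative names), shows what happens when the keys only partly overlap: by default merge() keeps matching rows only, while all = TRUE keeps every row and fills the gaps with NA.

```r
left  <- data.frame(name = c("John", "Jane", "Tom"),
                    age = c(20, 25, 30))
right <- data.frame(name = c("Jane", "Tom", "Jennifer"),
                    gender = c("female", "male", "female"))

# Default: only names present in both data frames
inner <- merge(left, right, by = "name")

# all = TRUE: every name from either data frame
outer <- merge(left, right, by = "name", all = TRUE)

inner
outer
```

The all.x and all.y arguments of merge() give the one-sided variants.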
31. cbind()
    Description: cbind() combines objects by columns.
    Syntax: cbind(object1, object2)
    Returns: Matrix / Data frame
    Documentation: help(cbind)
32. cbind() Examples
        > # example 1
        > cbind(1, 1:4)
             [,1] [,2]
        [1,]    1    1
        [2,]    1    2
        [3,]    1    3
        [4,]    1    4
        > # example 2
        > m1 <- matrix(1:4, nrow = 2)
        > m2 <- matrix(5:8, nrow = 2)
        > cbind(m1, m2)
             [,1] [,2] [,3] [,4]
        [1,]    1    3    5    7
        [2,]    2    4    6    8
33. cbind() Examples
        > # example 3
        > name <- c("John", "Jane", "Tom", "Jennifer")
        > age <- c(20, 25, 30, 28)
        > gender <- factor(c("male", "female", "male", "female"))
        > data_1 <- data.frame(name, age)
        > data_2 <- data.frame(name, gender)
        > data_3 <- merge(data_1, data_2, by = "name")
        > # income is not defined on the slide; these values are taken from the output below
        > income <- c(25000, 30000, 35000, 40000)
        > data_4 <- cbind(data_3, income)
        > head(data_4)
              name age gender income
        1     Jane  25 female  25000
        2 Jennifer  28 female  30000
        3     John  20   male  35000
        4      Tom  30   male  40000
34. rbind()
    Description: rbind() combines objects by rows.
    Syntax: rbind(object1, object2)
    Returns: Matrix / Data frame
    Documentation: help(rbind)
35. rbind() Examples
        > # example 1
        > rbind(1, 1:4)    # the shorter argument is recycled to 4 columns
             [,1] [,2] [,3] [,4]
        [1,]    1    1    1    1
        [2,]    1    2    3    4
        > # example 2
        > m1 <- matrix(1:4, nrow = 2)
        > m2 <- matrix(5:8, nrow = 2)
        > rbind(m1, m2)
             [,1] [,2]
        [1,]    1    3
        [2,]    2    4
        [3,]    5    7
        [4,]    6    8
36. rbind() Examples
        > # example 3
        > name <- c("John", "Jane", "Tom", "Jennifer")
        > age <- c(20, 25, 30, 28)
        > gender <- factor(c("male", "female", "male", "female"))
        > data_1 <- data.frame(name, age)
        > data_2 <- data.frame(name, gender)
        > data_3 <- merge(data_1, data_2, by = "name")
        > data_4 <- data_3
        > data_rbind <- rbind(data_3, data_4)
        > head(data_rbind)
              name age gender
        1     Jane  25 female
        2 Jennifer  28 female
        3     John  20   male
        4      Tom  30   male
        5     Jane  25 female
        6 Jennifer  28 female
37. interaction()
    Description: interaction() creates interaction variables, that is, a factor with one level for each combination of the levels of the input factors.
    Syntax: interaction(factor1, factor2)
    Returns: Interaction variable.
    Documentation: help(interaction)
38. interaction() Examples
        > # example 1
        > mtcars$gear <- as.factor(mtcars$gear)
        > mtcars$cyl <- as.factor(mtcars$cyl)
        > interaction(mtcars$cyl, mtcars$gear)
         [1] 6.4 6.4 4.4 6.3 8.3 6.3 8.3 4.4 4.4 6.4 6.4 8.3 8.3 8.3 8.3 8.3 8.3 4.4 4.4 4.4
        [21] 4.3 8.3 8.3 8.3 8.3 4.4 4.5 4.5 8.5 6.5 8.5 4.4
        Levels: 4.3 6.3 8.3 4.4 6.4 8.4 4.5 6.5 8.5
        > # example 2
        > mtcars$am <- as.factor(mtcars$am)
        > mtcars$cyl <- as.factor(mtcars$cyl)
        > interaction(mtcars$cyl, mtcars$am)
         [1] 6.1 6.1 4.1 6.0 8.0 6.0 8.0 4.0 4.0 6.0 6.0 8.0 8.0 8.0 8.0 8.0 8.0 4.1 4.1 4.1
        [21] 4.0 8.0 8.0 8.0 8.0 4.1 4.1 4.1 8.1 6.1 8.1 4.1
        Levels: 4.0 6.0 8.0 4.1 6.1 8.1
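In example 1 the level 8.4 is listed although no car has 8 cylinders and 4 gears. A short sketch, working on copies so that mtcars itself is not modified, shows how the drop and sep arguments of interaction() change the result (the variable names are illustrative):

```r
cyl  <- factor(mtcars$cyl)
gear <- factor(mtcars$gear)

# Default: one level for every possible combination
all_levels <- interaction(cyl, gear)

# drop = TRUE keeps only the combinations that occur in the data
used_levels <- interaction(cyl, gear, drop = TRUE)

# sep changes the separator used in the level labels
labelled <- interaction(cyl, gear, sep = "_")

nlevels(all_levels)
nlevels(used_levels)
head(levels(labelled))
```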
39. Reshape Data: In this section, we will look at functions that transform/reshape data. ✓ transform ✓ cut ✓ diff ✓ replace ✓ scale ✓ split ✓ with ✓ within ✓ by
40. transform()
    Description: transform() is used to transform variables in a data frame.
    Syntax: transform(data frame, expression)
    Returns: Data frame with transformed variables.
    Documentation: help(transform)
41. transform() Examples
        > # example 1
        > data <- mtcars
        > head(data)
                           mpg cyl disp  hp drat    wt  qsec vs am gear carb
        Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
        Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
        Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
        Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
        Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
        Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
        > head(transform(data, mpg = -mpg, disp = disp / wt))
                            mpg cyl      disp  hp drat    wt  qsec vs am gear carb
        Mazda RX4         -21.0   6  61.06870 110 3.90 2.620 16.46  0  1    4    4
        Mazda RX4 Wag     -21.0   6  55.65217 110 3.90 2.875 17.02  0  1    4    4
        Datsun 710        -22.8   4  46.55172  93 3.85 2.320 18.61  1  1    4    1
        Hornet 4 Drive    -21.4   6  80.24883 110 3.08 3.215 19.44  1  0    3    1
        Hornet Sportabout -18.7   8 104.65116 175 3.15 3.440 17.02  0  0    3    2
        Valiant           -18.1   6  65.02890 105 2.76 3.460 20.22  1  0    3    1
42. transform() Examples
        > # example 2
        > data <- mtcars
        > head(transform(data, wtdrat = wt * drat))
                           mpg cyl disp  hp drat    wt  qsec vs am gear carb  wtdrat
        Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 10.2180
        Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 11.2125
        Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  8.9320
        Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1  9.9022
        Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 10.8360
        Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1  9.5496
43. cut()
    Description: cut() divides the range of a numeric vector into intervals and assigns each value to its interval.
    Syntax: cut(object, breaks)
    Returns: A factor whose levels are the intervals.
    Documentation: help(cut)
44. cut() Examples
        > # example 1
        > x <- jitter(sample(1:100))
        > x
          [1] 69.8630359 44.8489212 61.1385505 50.8402454 26.0350033 97.0510463 42.8749472
          [8]  2.0313452 69.0713593  9.1714536 64.8470522 62.8787114 98.9115336 58.8020429
         [15] 81.9908416 87.1495953 16.9303676 11.9593307 38.0015233 20.1833953 14.0838761
         ………………………………………………………………………………………………………………………………………
         ……………………………………………………………………………………………
         [85]  3.0042776 33.9052141 97.8309652 47.1207229 77.1890815 41.8063134 39.9223398
         [92] 27.8306122 80.0271128 18.1951342 85.1410689 23.1750646  6.1861739 27.0493739
         [99] 36.9679664 18.9148518
        > c <- cut(x, breaks = 10)
        > table(c)
        c
        (0.827,10.8]  (10.8,20.8]  (20.8,30.7]  (30.7,40.6]  (40.6,50.5]  (50.5,60.4]  (60.4,70.3]
                  10           10           10           10           10           10           10
         (70.3,80.3]  (80.3,90.2]   (90.2,100]
                  10           10           10
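breaks can also be a vector of cut points, and the labels argument names the resulting intervals. A minimal sketch on mtcars$mpg follows; the cut points 10, 20, 30, 40 and the labels are chosen here purely for illustration.

```r
# Intervals are (10,20], (20,30] and (30,40]; right = TRUE is the default
mpg_group <- cut(mtcars$mpg,
                 breaks = c(10, 20, 30, 40),
                 labels = c("low", "medium", "high"))

table(mpg_group)
```

Values outside the outermost cut points would become NA, so the breaks should cover the full range of the data.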
45. diff()
    Description: diff() returns lagged and iterated differences of a vector.
    Syntax: diff(object, lag = 1, differences = 1)
    Returns: Vector of lagged differences.
    Documentation: help(diff)
46. diff() Examples
        > # example 1
        > diff(1:10, 2)       # lag = 2
        [1] 2 2 2 2 2 2 2 2
        > diff(1:10, 2, 2)    # lag = 2, differences = 2
        [1] 0 0 0 0 0 0
        > x <- cumsum(cumsum(1:10))
        > x
         [1]   1   4  10  20  35  56  84 120 165 220
        > diff(x, lag = 2)
        [1]   9  16  25  36  49  64  81 100
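The differences argument applies diff() repeatedly. The sketch below reuses the vector x from the example to show that differences = 2 gives the same result as calling diff() twice (d1 and d2 are illustrative names).

```r
x <- cumsum(cumsum(1:10))

d1 <- diff(x)                   # first differences: x[i + 1] - x[i]
d2 <- diff(x, differences = 2)  # differences of the first differences

d1
d2
identical(d2, diff(diff(x)))
```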
47. replace()
    Description: replace() replaces the elements of an object at the indices given in list with the given values.
    Syntax: replace(object, list, values)
    Returns: New object with replaced values.
    Documentation: help(replace)
48. replace() Examples
        > # example 1
        > x <- sample(1:10)
        > x
         [1]  6  2  7  9  1  5  4  8 10  3
        > replace(x, 5, 10)    # replace the value at index 5 of x with the value 10
         [1]  6  2  7  9 10  5  4  8 10  3
        > # example 2
        > x <- sample(1:10)
        > x
         [1]  6  2  7  9  1  5  4  8 10  3
        > replace(x, 3:5, c(2, 4, 6))
         [1]  6  2  2  4  6  5  4  8 10  3
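The index passed to replace() can also be a logical condition, and the original vector is left untouched. A small sketch with the values of x from the example, fixed here so that the result is reproducible:

```r
x <- c(6, 2, 7, 9, 1, 5, 4, 8, 10, 3)

# Replace every value greater than 8 with NA
y <- replace(x, x > 8, NA)

y
x   # unchanged: replace() returns a modified copy
```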
49. scale()
    Description: scale() centers and scales the columns of a numeric matrix. By default, each column has its mean subtracted and is then divided by its standard deviation.
    Syntax: scale(numeric matrix)
    Returns: Matrix with centered and scaled columns.
    Documentation: help(scale)
50. scale() Examples
        > # example 1
        > m <- matrix(1:9, nrow = 3)
        > m
             [,1] [,2] [,3]
        [1,]    1    4    7
        [2,]    2    5    8
        [3,]    3    6    9
        > scale(m)
             [,1] [,2] [,3]
        [1,]   -1   -1   -1
        [2,]    0    0    0
        [3,]    1    1    1
        attr(,"scaled:center")
        [1] 2 5 8
        attr(,"scaled:scale")
        [1] 1 1 1
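To make the default behaviour of scale() concrete, the sketch below computes the same result by hand for a made-up matrix: subtract each column's mean, then divide by its standard deviation. The helper name manual is illustrative.

```r
m <- matrix(c(2, 4, 6, 1, 5, 9), nrow = 3)

# Standardize each column by hand
manual <- apply(m, 2, function(col) (col - mean(col)) / sd(col))

# scale() additionally stores the centers and scales as attributes,
# so attributes are ignored in the comparison
all.equal(manual, scale(m), check.attributes = FALSE)
```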
51. split()
    Description: split() divides the data in the vector x into the groups defined by f.
    Syntax: split(x, f)
    Returns: x split into the groups defined by f.
    Documentation: help(split)
52. split() Examples
        > # example 1
        > x <- split(data$mpg, data$cyl)
        > x
        $`4`
         [1] 22.8 24.4 22.8 32.4 30.4 33.9 21.5 27.3 26.0 30.4 21.4

        $`6`
        [1] 21.0 21.0 21.4 18.1 19.2 17.8 19.7

        $`8`
         [1] 18.7 14.3 16.4 17.3 15.2 10.4 10.4 14.7 15.5 15.2 13.3 19.2 15.8 15.0

        > sapply(x, mean)
               4        6        8
        26.66364 19.74286 15.10000
        > unsplit(x, data$cyl)
         [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4 14.7 32.4
        [19] 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7 15.0 21.4
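split() works on whole data frames as well as on vectors. As a brief sketch, mtcars can be split into one data frame per cylinder count and then summarized group by group (groups is an illustrative name):

```r
# One data frame per value of cyl
groups <- split(mtcars, mtcars$cyl)

names(groups)
sapply(groups, nrow)                       # rows per group
sapply(groups, function(g) mean(g$hp))     # mean horsepower per group
```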
53. with()
    Description: with() evaluates an expression using the columns of a data frame (or the elements of a list) as variables, so they can be referred to by name.
    Syntax: with(object, expression)
    Returns: Result of the expression.
    Documentation: help(with)
54. with() Examples
        > # example 1
        > with(mtcars, table(cyl))
        cyl
         4  6  8
        11  7 14
        > with(mtcars, summary(mpg))
           Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
          10.40   15.42   19.20   20.09   22.80   33.90
        > with(mtcars, lm(mpg ~ hp))

        Call:
        lm(formula = mpg ~ hp)

        Coefficients:
        (Intercept)           hp
           30.09886     -0.06823
55. within()
    Description: within() applies an expression to an object and returns a copy of the modified object.
    Syntax: within(object, expression)
    Returns: Copy of the modified object.
    Documentation: help(within)
56. within() Examples
        > # example 1
        > data <- mtcars
        > data <- within(data, mpg_cyl <- mpg * cyl)
        > head(data)
                           mpg cyl disp  hp drat    wt  qsec vs am gear carb mpg_cyl
        Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   126.0
        Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   126.0
        Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    91.2
        Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   128.4
        Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   149.6
        Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   108.6
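The difference between with() and within() is easiest to see side by side. In this sketch, with() returns the value of the expression, while within() returns a modified copy of the data frame; the column names hp_per_wt and heavy and the threshold 3.5 are made up for illustration.

```r
# with(): the result is whatever the expression evaluates to
ratio <- with(mtcars, mean(hp / wt))

# within(): the result is a copy of mtcars with the new columns added.
# Braces allow several assignments in one call.
data <- within(mtcars, {
  hp_per_wt <- hp / wt
  heavy <- wt > 3.5
})

ratio
head(data[, c("hp", "wt", "hp_per_wt", "heavy")])
```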
57. by()
    Description: by() splits an object into the groups defined by a factor and applies a function to each group.
    Syntax: by(object, factor, function)
    Returns: Result of the function applied to each level of the factor.
    Documentation: help(by)
58. by() Examples
        > # example 1
        > by(mtcars$mpg, mtcars$cyl, summary)
        mtcars$cyl: 4
           Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
          21.40   22.80   26.00   26.66   30.40   33.90
        ----------------------------------------------------------------
        mtcars$cyl: 6
           Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
          17.80   18.65   19.70   19.74   21.00   21.40
        ----------------------------------------------------------------
        mtcars$cyl: 8
           Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
          10.40   14.40   15.20   15.10   16.25   19.20
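Base R offers a few other ways to compute the same group-wise summaries. The sketch below is an aside rather than part of the original slides; it compares by() with tapply() and aggregate() for the mean mpg per cylinder count.

```r
# by(): prints one block per group
res_by <- by(mtcars$mpg, mtcars$cyl, mean)

# tapply(): returns a named vector
res_tapply <- tapply(mtcars$mpg, mtcars$cyl, mean)

# aggregate(): returns a data frame, using a formula interface
res_agg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

res_by
res_tapply
res_agg
```

Which form is most convenient depends on what happens next: a data frame is easier to merge or plot, a named vector is easier to index.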
59. Next Steps...: In the next unit, we will explore the following numeric functions: ● signif() ● jitter() ● format() ● formatC() ● abs() ● round() ● ceiling() ● floor()
60. Connect With Us: Visit r-squared for tutorials on: ● R Programming ● Business Analytics ● Data Visualization ● Web Applications ● Package Development ● Git & GitHub