Data Visualization With R

Learn the basics of data visualization in R. In this module, we explore the Graphics package and learn to build basic plots in R. In addition, learn to add title, axis labels and range. Modify the color, font and font size. Add text annotations and combine multiple plots. Finally, learn how to save the plots in different formats.

  1. 1. www.r-squared.in/git-hub dataCrunch Data Visualization With R
  2. 2. dataCrunchCourse Material Slide 2 All the material related to this course is available on our website. Slides can be viewed on SlideShare, scripts can be downloaded from GitHub, and videos can be viewed on our YouTube channel.
  3. 3. dataCrunchLearning Objectives Slide 3 By the end of this module, → Understand ➢ What is data visualization? ➢ Why visualize data? ➢ Graphics Systems In R → Be able to: ➢ Build basic plot ➢ Add title, labels & text annotations ➢ Modify Axis Ranges ➢ Modify colors, font style & size ➢ Combine multiple plots ➢ Save graphs in multiple formats
  4. 4. dataCrunchWhat is data visualization? Slide 4 Data visualization is the representation of data in graphical format
  5. 5. dataCrunchWhy visualize data? Slide 5 ● Explore: Visualization helps in exploring and explaining patterns and trends. ● Detect: Patterns or anomalies in data can be detected by looking at graphs. ● Make Sense: Large amounts of data can be understood efficiently and quickly. ● Communicate: Insights from the data are easy to communicate and share.
  6. 6. dataCrunchWhy visualize data? Slide 6 A picture is worth a thousand words.
     > by(mtcars$mpg, mtcars$cyl, summary)
     mtcars$cyl: 4
        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       21.40   22.80   26.00   26.66   30.40   33.90
     -----------------------------------------------
     mtcars$cyl: 6
        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       17.80   18.65   19.70   19.74   21.00   21.40
     -----------------------------------------------
     mtcars$cyl: 8
        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       10.40   14.40   15.20   15.10   16.25   19.20
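To make the "picture versus numbers" point concrete, here is a minimal sketch that puts the same comparison side by side: the medians as numbers, then one box per cylinder count. The variable name `mpg_by_cyl` and the axis labels are illustrative choices, not part of the original slides.

```r
# Numeric view: median mpg for each cylinder count (base R only).
mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, median)
print(mpg_by_cyl)

# Visual view: the same comparison as one box per cylinder count.
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Number of cylinders",
        ylab = "Miles per gallon")
```

The box plot shows the downward trend in mileage and the spread within each group at a glance, which the table of summaries only reveals after reading every row.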
  7. 7. dataCrunchR Graphics System Slide 7 R has 3 main packages for data visualization: ● Graphics It is part of R installation and is the fundamental package for visualizing data. It has a lot of good features and we can create all the basic plots using this package. ● ggplot2 ggplot2 was created by Hadley Wickham and is based on the Grammar Of Graphics written by Leland Wilkinson. It has a structured approach to data visualization and builds upon the features available in Graphics and Lattice packages. ● Lattice The Lattice package is inspired by Trellis Graphics and was created by Deepayan Sarkar who is part of the R core group. It is a very powerful data visualization system with an emphasis on multivariate data.
  8. 8. dataCrunch Slide 8 Plot Basics
  9. 9. dataCrunchplot() Slide 9 The plot() function is the fundamental tool to build plots in the Graphics package. It is a generic function and creates the appropriate plot based on the input received from the user. In this section, we will explore the plot() function by using different types of data and observing the corresponding plots created. We will use the mtcars data set throughout this section. The documentation for the plot() function and the mtcars data set can be viewed using the help function. help(plot) help(mtcars)
  10. 10. dataCrunchmtcars Slide 10 Let us take a quick look at the mtcars data set as we will be using it throughout this section: > head(mtcars) mpg cyl disp hp drat wt qsec vs am gear carb Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 > str(mtcars) 'data.frame': 32 obs. of 11 variables: $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ... $ cyl : num 6 6 4 6 8 6 8 4 4 6 ... $ disp: num 160 160 108 258 360 ... $ hp : num 110 110 93 110 175 105 245 62 95 123 ... $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ... $ wt : num 2.62 2.88 2.32 3.21 3.44 ... $ qsec: num 16.5 17 18.6 19.4 17 ... $ vs : num 0 0 1 1 0 1 0 1 1 1 ... $ am : num 1 1 1 0 0 0 0 0 0 0 ... $ gear: num 4 4 4 3 3 3 3 4 4 4 ... $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
  11. 11. dataCrunchExplore plot() Slide 11 Next, we will begin exploring the plot() function. The following data will be used as an input: ● Case 1: One continuous variable ● Case 2: One categorical variable ● Case 3: Two continuous variables ● Case 4: Two categorical variables ● Case 5: One categorical variable (X axis) & one continuous variable (Y axis) ● Case 6: One continuous variable (X axis) & one categorical variable (Y axis) Case 5 and 6 might look similar but the difference lies in the variables being assigned to the X and Y axis.
  12. 12. dataCrunchCase 1: One continuous variable Slide 12 We will use the variable mpg (Miles Per Gallon) for this example. # plot a single continuous variable plot(mtcars$mpg) The plot() function creates an index plot when a single continuous variable is used as the input: each value of mpg is plotted on the Y axis against its position (row index) in the data on the X axis. Since the X axis only reflects the order of the observations, we cannot infer much from this plot. Let us plot a categorical variable and see what happens.
  13. 13. dataCrunchCase 2: One categorical variable Slide 13 Let us use the cyl (number of cylinders) variable for this data as we need a categorical variable. But before that we need to convert it to type factor using as.factor # check the data type of cyl class(mtcars$cyl) [1] "numeric" # coerce to type factor mtcars$cyl <- as.factor(mtcars$cyl) # plot a single categorical variable plot(mtcars$cyl) The plot() function creates a bar plot when the data is categorical in nature.
  14. 14. dataCrunchCase 3: Two continuous variables Slide 14 Till now we had used only one variable as the input but from this example, we will be using two variables; one for the X axis and another for the Y axis. In this example, we will look at the relationship between the displacement and mileage of the cars. The disp and mpg variables are used and disp is plotted on X axis while mpg is plotted on the Y axis. # plot two continuous variables plot(mtcars$disp, mtcars$mpg) A Scatter plot is created when we use two continuous variables as the input for the plot function but in this case, we can interpret the plot as it represents the relationship between two variables.
  15. 15. dataCrunchCase 4: Two categorical variables Slide 15 In this example, we will use two categorical variables am (transmission type) and cyl (number of cylinders). We will convert am to type factor before creating the plot. Transmission type will be plotted on X axis and number of cylinders on Y axis. # coerce am to type factor mtcars$am <- as.factor(mtcars$am) # coerce cyl to type factor mtcars$cyl <- as.factor(mtcars$cyl) # plot two categorical variables plot(mtcars$am, mtcars$cyl) A spine plot is created when we use two categorical variables as the input for the plot function. It is a stacked bar plot in which the width of each bar is proportional to the number of observations in that level of the X axis variable. In the next two examples, we will use both continuous and categorical variables.
  16. 16. dataCrunchCase 5: Continuous/Categorical Variables Slide 16 In this example, we will plot a categorical variable cyl on the X axis and a continuous variable mpg on the Y axis. # coerce cyl to type factor mtcars$cyl <- as.factor(mtcars$cyl) # categorical/continuous variables plot(mtcars$cyl, mtcars$mpg) A box plot is created when we use a categorical variable and continuous variable as input for the plot function. But in this case, the categorical variable was plotted on the X axis and the continuous variable on the Y axis. What happens if we flip this?
  17. 17. dataCrunchCase 6: Categorical/Continuous Variables Slide 17 In this example, the continuous variable is plotted on the X axis and the categorical variable on the Y axis. # coerce cyl to type factor mtcars$cyl <- as.factor(mtcars$cyl) # continuous vs categorical variables plot(mtcars$mpg, mtcars$cyl) A scatter plot is created but since the Y axis variable is discrete, we can observe lines of points for each level of the discrete variable. We can compare the range of the X axis variable for each level of the Y axis variable.
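The six cases above can be drawn in one frame to compare what plot() produces for each input type. This is a sketch, not the deck's own code: it works on a copy named `cars` (an assumed name) so that mtcars itself is left unchanged, and the panel titles and the "automatic"/"manual" labels are added here for readability.

```r
# Work on a copy so the built-in mtcars keeps its numeric columns.
cars <- mtcars
cars$cyl <- factor(cars$cyl)
cars$am  <- factor(cars$am, labels = c("automatic", "manual"))  # 0 = automatic, 1 = manual

op <- par(mfrow = c(2, 3))  # 2 x 3 grid, filled by rows; op holds the old setting
plot(cars$mpg,            main = "Case 1: one continuous")
plot(cars$cyl,            main = "Case 2: one categorical")
plot(cars$disp, cars$mpg, main = "Case 3: two continuous")
plot(cars$am, cars$cyl,   main = "Case 4: two categorical")
plot(cars$cyl, cars$mpg,  main = "Case 5: categorical X")
plot(cars$mpg, cars$cyl,  main = "Case 6: categorical Y")
par(op)                     # restore the previous layout
```

Because plot() is generic, the only thing that changes between the six calls is the class of the arguments; the function name stays the same.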
  18. 18. dataCrunch Title, Axis Labels & Range Slide 18
  19. 19. dataCrunchTitle & Labels Slide 19 In this section, we will learn to enhance the plot by adding/modifying the following features: ● Title ● Subtitle ● Axis Labels ● Axis Range
  20. 20. dataCrunchIntroduction Slide 20 In the previous section, we created plots which did not have any title or labels. A new user of a plot should know what the X and Y axis represent as well as the primary information being communicated by the plot. The title, axis labels and range play an important role in making the plot holistic. There are two ways of adding the above mentioned elements to a plot: ● Use the relevant arguments within the plot() function. ● Use the title() function. We will explore both the methods one by one and you can choose the method that you find most convenient. Let us begin by adding the features using the plot() function.
  21. 21. dataCrunchSyntax Slide 21
      Feature        Argument   Value            Example
      Title          main       String           "Scatter Plot"
      Subtitle       sub        String           "Displacement vs Miles Per Gallon"
      X Axis Label   xlab       String           "Displacement"
      Y Axis Label   ylab       String           "Miles Per Gallon"
      X Axis Range   xlim       Numeric Vector   c(0, 500)
      Y Axis Range   ylim       Numeric Vector   c(0, 50)
  22. 22. dataCrunchTitle Slide 22 # create a basic plot and add a title plot(mtcars$disp, mtcars$mpg, main = "Displacement vs Miles Per Gallon") Description Use the main argument in the plot() function to add a title to the plot. The title must be enclosed in quotes as it is a string. Code
  23. 23. dataCrunchSubtitle Slide 23 # create a basic plot and add a subtitle plot(mtcars$disp, mtcars$mpg, sub = "Displacement vs Miles Per Gallon") Description Use the sub argument in the plot() function to add a subtitle to the plot. The subtitle must be enclosed in quotes as it is a string. Code
  24. 24. dataCrunchAxis Labels Slide 24 # create a basic plot and add axis labels plot(mtcars$disp, mtcars$mpg, xlab = "Displacement", ylab = "Miles Per Gallon") Description In all plots created till now, the axis labels appear as mtcars$disp and mtcars$mpg. It is not the most elegant way of labeling the axis and it would make more sense to use appropriate names. Use the xlab and ylab arguments in the plot() function to add the X and Y axis labels. Code
  25. 25. dataCrunchAxis Range Slide 25 # create a basic plot and modify axis range plot(mtcars$disp, mtcars$mpg, xlim = c(0, 600), ylim = c(0, 50)) Description The range of each axis can be modified using the xlim and ylim arguments in the plot() function. The lower and upper limits of an axis must be supplied as a numeric vector of length two. Code
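The limits on the slide are typed in by hand. As an alternative sketch, the limits can be derived from the data with extendrange() from the grDevices package, which ships with R. The 10% padding (`f = 0.1`) and the names `x_limits` and `y_limits` are assumptions made for this illustration.

```r
# Pad the observed range by 10% on each side instead of hard-coding limits.
x_limits <- extendrange(mtcars$disp, f = 0.1)
y_limits <- extendrange(mtcars$mpg,  f = 0.1)

plot(mtcars$disp, mtcars$mpg,
     xlim = x_limits, ylim = y_limits,
     xlab = "Displacement", ylab = "Miles Per Gallon")
```

Deriving the limits keeps every point inside the plot region even if the data changes, whereas fixed limits such as c(0, 600) silently hide points that fall outside them.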
  26. 26. dataCrunchSo far... Slide 26 # create a plot with title, subtitle, axis labels and range plot(mtcars$disp, mtcars$mpg, main = "Displacement vs Miles Per Gallon", sub = "Scatter Plot", xlab = "Displacement", ylab = "Miles Per Gallon", xlim = c(0, 600), ylim = c(0, 50)) Description Let us create a plot that has all the features that we have learnt so far: Code
  27. 27. dataCrunchtitle() Function Slide 27 # create a basic plot plot(mtcars$disp, mtcars$mpg) # add title and label using title() function title(main = "Displacement vs Miles Per Gallon", sub = "Scatter Plot", xlab = "Displacement", ylab = "Miles Per Gallon") Description We will add/modify the same set of features using the title() function instead of the arguments in the plot() function, while continuing to use the plot() function to create the basic plot. Code Oops.. We have a problem. If you observe carefully, the title() function has drawn its axis labels on top of the default axis labels created by the plot() function, so the two sets of labels overlap.
  28. 28. dataCrunchtitle() Function Slide 28 # create a basic plot plot(mtcars$disp, mtcars$mpg, ann = FALSE) # add title and label using title() function title(main = "Displacement vs Miles Per Gallon", sub = "Scatter Plot", xlab = "Displacement", ylab = "Miles Per Gallon") Description Use the ann argument in the plot() function and set it to FALSE to ensure that the plot() function does not add the default labels. Code Note: The range of the axis cannot be modified using the title() function.
  29. 29. dataCrunch Color Slide 29
  30. 30. dataCrunchColor Slide 30 In this section, we will learn to add colors to the following using the col argument: ● Plot Symbol ● Title & Subtitle ● Axis ● Axis Labels ● Foreground
  31. 31. dataCrunchSyntax Slide 31
      Feature      Argument   Example
      Symbol       col        "blue"
      Title        col.main   "#0000ff"
      Subtitle     col.sub    rgb(0, 0, 1)
      Axis         col.axis   "red"
      Label        col.lab    "#ff0000"
      Foreground   fg         rgb(1, 0, 0)
      Every argument accepts a color given as a string (color name), a hexadecimal code or an RGB value created with rgb(). The col prefix is combined with main, sub, axis and lab (col.main, col.sub, col.axis, col.lab) to specify the color of the title, subtitle, axes and the labels.
  32. 32. dataCrunchColor: Symbol Slide 32 # modify color of the plot plot(mtcars$disp, mtcars$mpg, col= "red") OR plot(mtcars$disp, mtcars$mpg, col = "#ff0000") OR plot(mtcars$disp, mtcars$mpg, col = rgb(1, 0, 0)) Description Let us begin by adding color to the symbol in the plot using the col argument in the plot() function. Code
  33. 33. dataCrunchColor: Title Slide 33 # modify the color of the title plot(mtcars$disp, mtcars$mpg, main = "Displacement vs Miles Per Gallon", col.main = "blue") OR plot(mtcars$disp, mtcars$mpg, main = "Displacement vs Miles Per Gallon", col.main = "#0000ff") OR plot(mtcars$disp, mtcars$mpg, main = "Displacement vs Miles Per Gallon", col.main = rgb(0, 0, 1)) Description The color of the title can be modified by using the col.main argument in the plot() function. Code
  34. 34. dataCrunchColor: Subtitle Slide 34 # modify the color of the subtitle plot(mtcars$disp, mtcars$mpg, sub= "Displacement vs Miles Per Gallon", col.sub = "blue") OR plot(mtcars$disp, mtcars$mpg, sub= "Displacement vs Miles Per Gallon", col.sub = "#0000ff") OR plot(mtcars$disp, mtcars$mpg, sub= "Displacement vs Miles Per Gallon", col.sub = rgb(0, 0, 1)) Description The color of the subtitle can be modified by using the col.sub argument in the plot() function. Code
  35. 35. dataCrunchColor: Axis Slide 35 # modify the color of the axis plot(mtcars$disp, mtcars$mpg, col.axis = "blue") OR plot(mtcars$disp, mtcars$mpg, col.axis = "#0000ff") OR plot(mtcars$disp, mtcars$mpg, col.axis = rgb(0, 0, 1)) Description The color of the axis can be modified by using the col.axis argument in the plot() function. Code
  36. 36. dataCrunchColor: Labels Slide 36 # modify the color of the labels plot(mtcars$disp, mtcars$mpg, xlab = "Displacement", ylab = "Miles Per Gallon", col.lab = "blue") Description The color of the labels can be modified by using the col.lab argument in the plot() function. Code
  37. 37. dataCrunchColor: Foreground Slide 37 # modify the color of the foreground plot(mtcars$disp, mtcars$mpg, fg= "red") Description The color of the foreground can be modified by using the fg argument in the plot() function. Code
  38. 38. dataCrunchColor: Using title() function Slide 38 # Create a basic plot plot(mtcars$disp, mtcars$mpg, ann= FALSE) # modify color using title() function title(main = "Displacement vs Miles Per Gallon", xlab = "Displacement", ylab = "Miles Per Gallon", col.main = "blue", col.lab = "red") Description The colors of the title, subtitle and the labels can be modified using the title() function as well. Let us try it out: Code
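The three ways of writing a color used in this section are interchangeable. The sketch below checks that with col2rgb(), which converts any color specification to its red, green and blue components. The variable names are illustrative.

```r
# Three spellings of the same color.
red_name <- "red"
red_hex  <- "#ff0000"
red_rgb  <- rgb(1, 0, 0)   # rgb() returns a hexadecimal string

# col2rgb() converts each spelling to red/green/blue components (0-255).
same_color <- all(col2rgb(red_name) == col2rgb(red_hex)) &&
  all(col2rgb(red_hex) == col2rgb(red_rgb))
print(same_color)

# Any of the three can be passed to a color argument.
plot(mtcars$disp, mtcars$mpg, col = red_rgb,
     main = "Displacement vs Miles Per Gallon", col.main = red_hex)
```

Color names are the easiest to read, while hexadecimal codes and rgb() give access to shades that have no name.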
  39. 39. dataCrunch Font Slide 39
  40. 40. dataCrunchFont Slide 40 The font prefix can be combined with main, sub, axis and lab to specify the font of the title, subtitle, axes and the labels.
      Feature    Argument
      Title      font.main
      Subtitle   font.sub
      Axis       font.axis
      Labels     font.lab
      The font arguments take values from 1 to 5. The font type represented by each value is shown in the table below.
      Value   Font Type
      1       Plain
      2       Bold
      3       Italic
      4       Bold Italic
      5       Symbol
  41. 41. dataCrunchFont: Title Slide 41 # modify the font of the title plot(mtcars$disp, mtcars$mpg, main = "Displacement vs Miles Per Gallon", font.main = 1) Description The font of the title can be modified using the font.main argument in the plot() function. Code
  42. 42. dataCrunchFont: Title Slide 42 The below plot depicts the appearance of the title when different options for font type are applied:
  43. 43. dataCrunchFont: Subtitle Slide 43 # modify the font of the subtitle plot(mtcars$disp, mtcars$mpg, sub= "Displacement vs Miles Per Gallon", font.sub = 3) Description The font of the subtitle can be modified using the font.sub argument in the plot() function. Code
  44. 44. dataCrunchFont: Subtitle Slide 44 The below plot depicts the appearance of the subtitle when different options for font type are applied:
  45. 45. dataCrunchFont: Axis Slide 45 # modify the font of the axis plot(mtcars$disp, mtcars$mpg, font.axis = 3) Description The font of the axis can be modified using the font.axis argument in the plot() function. Code
  46. 46. dataCrunchFont: Axis Slide 46 The below plot depicts the appearance of the axis when different options for font type are applied:
  47. 47. dataCrunchFont: Labels Slide 47 # modify the font of the labels plot(mtcars$disp, mtcars$mpg, xlab = "Displacement", ylab = "Miles Per Gallon", font.lab = 3) Description The font of the labels can be modified using the font.lab argument in the plot() function. Code
  48. 48. dataCrunchFont: Labels Slide 48 The below plot depicts the appearance of the labels when different options for font type are applied:
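The comparison plots on the slides above did not survive as text. A sketch along the following lines draws one panel per font value so the styles can be compared; the 2 x 2 arrangement and the name `font_names` are assumptions, and the symbol font (value 5) is left out because it would render the title in Greek letters.

```r
# Font values 1 to 4 and the style each one selects.
font_names <- c("plain", "bold", "italic", "bold italic")

op <- par(mfrow = c(2, 2))          # one panel per font value
for (f in seq_along(font_names)) {
  plot(mtcars$disp, mtcars$mpg,
       main = paste0("font.main = ", f, " (", font_names[f], ")"),
       font.main = f)
}
par(op)                             # restore the previous layout
```

Replacing font.main with font.sub, font.axis or font.lab in the loop gives the same comparison for the subtitle, the axis annotation or the axis labels.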
  49. 49. dataCrunch Font Size Slide 49
  50. 50. dataCrunchFont Size Slide 50 The cex prefix can be combined with main, sub, axis and lab to specify the font size of the title, subtitle, axes and the labels.
      Feature    Argument
      Title      cex.main
      Subtitle   cex.sub
      Axis       cex.axis
      Labels     cex.lab
      The values taken by the cex arguments are relative to 1, i.e., the default size is represented by 1. If the supplied value is less than 1, the font will be smaller than the default, and if the supplied value is greater than 1, the font will be bigger than the default.
  51. 51. dataCrunchFont Size: Title Slide 51 # modify the font size of the title plot(mtcars$disp, mtcars$mpg, main = "Scatter Plot", cex.main = 1.5) Description The font size of the title can be modified using the cex.main argument in the plot() function. Code
  52. 52. dataCrunchFont Size: Title Slide 52 The below plot depicts the appearance of the title when different options for font size are applied:
  53. 53. dataCrunchFont Size: Subtitle Slide 53 # modify the font size of the subtitle plot(mtcars$disp, mtcars$mpg, sub= "Scatter Plot", cex.sub = 1.5) Description The font size of the subtitle can be modified using the cex.sub argument in the plot() function. Code
  54. 54. dataCrunchFont Size: Subtitle Slide 54 The below plot depicts the appearance of the subtitle when different options for font size are applied:
  55. 55. dataCrunchFont Size: Axis Slide 55 # modify the font size of the axis plot(mtcars$disp, mtcars$mpg, cex.axis = 1.5) Description The font size of the axis can be modified using the cex.axis argument in the plot() function. Code
  56. 56. dataCrunchFont Size: Axis Slide 56 The below plot depicts the appearance of the axis when different options for font size are applied:
  57. 57. dataCrunchFont Size: Labels Slide 57 # modify the font size of the labels plot(mtcars$disp, mtcars$mpg, xlab = "Displacement", ylab = "Miles Per Gallon", cex.lab = 1.5) Description The font size of the labels can be modified using the cex.lab argument in the plot() function. Code
  58. 58. dataCrunchFont Size: Labels Slide 58 The below plot depicts the appearance of the labels when different options for font size are applied:
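Since cex values are multipliers of the default size, a quick way to choose one is to draw the same plot at several values. The three values below are arbitrary picks for illustration, not values taken from the slides.

```r
# Scaling factors to compare: smaller than, equal to and larger than the default.
cex_values <- c(0.75, 1, 1.5)

op <- par(mfrow = c(1, 3))          # three panels in one row
for (size in cex_values) {
  plot(mtcars$disp, mtcars$mpg,
       xlab = "Displacement", ylab = "Miles Per Gallon",
       main = paste("cex.lab =", size),
       cex.lab = size)              # only the axis labels are rescaled
}
par(op)                             # restore the previous layout
```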
  59. 59. dataCrunch Text Annotations Slide 59
  60. 60. dataCrunchText Annotations: Objectives Slide 60 In this section, we will learn to: Add text annotations to the plots using ● text() function ● mtext() function
  61. 61. dataCrunchText Annotations: Introduction Slide 61 The text() and the mtext() functions allow the user to add text annotations to the plots. While the text() function places the text inside the plot, the mtext() function places the text on the margins of the plot. Below is the syntax for both the functions: # the text function text(x, y = NULL, labels = seq_along(x), adj = NULL, pos = NULL, offset = 0.5, vfont = NULL, cex = 1, col = NULL, font = NULL, ...) # the mtext function mtext(text, side = 3, line = 0, outer = FALSE, at = NA, adj = NA, padj = NA, cex = NA, col = NA, font = NA, ...) Let us explore each function and its arguments one by one:
  62. 62. dataCrunchText Annotations: text() Slide 62 To add text annotations using the text() function, the following 3 arguments must be supplied: ● x: x axis coordinate ● y: y axis coordinate ● labels: the text to be added to the plot Below is a simple example: # create a basic plot plot(mtcars$disp, mtcars$mpg) # add text text(340, 30, "Sample Text") The text appears at the coordinates (340, 30) on the plot. Ensure that the text is enclosed in quotes and the coordinates provided are within the range of the X and Y axis variables.
  63. 63. dataCrunchtext(): Color Slide 63 # create a basic plot plot(mtcars$disp, mtcars$mpg) # modify the color of the text text(340, 30, "Sample Text", col = "red") Description The color of the text can be modified using the col argument in the text() function. Code
  64. 64. dataCrunchtext(): Color Slide 64 The below plot depicts the appearance of the text when different options for col are applied:
  65. 65. dataCrunchtext(): Font Slide 65 # create a basic plot plot(mtcars$disp, mtcars$mpg) # modify the font of the text text(340, 30, "Sample Text", font = 2) Description The font of the text can be modified using the font argument in the text() function. Code
  66. 66. dataCrunchtext(): Font Slide 66 The below plot depicts the appearance of the text when different options for font are applied:
  67. 67. dataCrunchtext(): Font Family Slide 67 # create a basic plot plot(mtcars$disp, mtcars$mpg) # modify the font family of the text text(340, 30, "Sample Text", family = "mono") Description The font family of the text can be modified using the family argument in the text() function. The family name is a string and must be enclosed in quotes. Code
  68. 68. dataCrunchtext(): Font Family Slide 68 The below plot depicts the appearance of the text when different options for font family are applied:
  69. 69. dataCrunchtext(): Font Size Slide 69 # create a basic plot plot(mtcars$disp, mtcars$mpg) # modify the size of the text text(340, 30, "Sample Text", cex = 1.5) Description The size of the text can be modified using the cex argument in the text() function. Code
  70. 70. dataCrunchtext(): Font Size Slide 70 The below plot depicts the appearance of the text when different options for font size are applied:
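The slides place a fixed string at fixed coordinates. A common practical use of text() is to label a specific observation, so here is a sketch that labels the car with the highest mileage. The choice of point, the name `best` and the use of pos = 4 (draw the label to the right of the point) are illustrative.

```r
# Row index of the car with the highest miles per gallon.
best <- which.max(mtcars$mpg)

plot(mtcars$disp, mtcars$mpg,
     xlab = "Displacement", ylab = "Miles Per Gallon")

# Label that point with the car's name, taken from the row names.
text(mtcars$disp[best], mtcars$mpg[best],
     labels = rownames(mtcars)[best],
     pos = 4,          # place the label to the right of the point
     col = "red", cex = 0.8)
```

Taking the coordinates from the data guarantees that the label lands inside the axis ranges, which the slide asks the reader to check by hand.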
  71. 71. dataCrunchmtext(): Introduction Slide 71 The mtext() function places text annotations on the margins of the plot instead of placing them inside the plot. It allows the user to modify the location of the text in multiple ways and we will explore them one by one. Below is a simple example: # create a basic plot plot(mtcars$disp, mtcars$mpg) # add text mtext("Sample Text") As you can see, the text is placed on the margin of the plot and not inside the plot. Next, we will learn to specify the margin where the text should be placed.
  72. 72. dataCrunchmtext(): Margin Slide 72 # create a basic plot plot(mtcars$disp, mtcars$mpg) # specify the margin on which the text should appear mtext("Sample Text", side = 1) Description The margin on which we want to place the text can be specified using the side argument. It takes 4 values from 1-4 each representing one side of the plot. Code
  73. 73. dataCrunchmtext(): Margin Options Slide 73 The side argument can be used to specify the margin on which the text should be placed.
      side   Margin
      1      Bottom
      2      Left
      3      Top
      4      Right
  74. 74. dataCrunchmtext(): Margin Slide 74 The below plot depicts the appearance of the text when different options for side are applied:
  75. 75. dataCrunchmtext(): Line Slide 75 # create a basic plot plot(mtcars$disp, mtcars$mpg) # place the text one line out from the edge of the plot mtext("Sample Text", line = 1) Description The line argument places the text at the specified distance, measured in lines of text, from the edge of the plot region. The default value is 0. As the value increases, the text moves outward into the margin and away from the plot; negative values move the text inside the plot region. Code
  76. 76. dataCrunchmtext(): Line Slide 76 # create a basic plot plot(mtcars$disp, mtcars$mpg) # place the text inside the plot region mtext("Sample Text", line = -1) Description The line argument places the text inside the plot when the value is less than zero. Code
  77. 77. dataCrunchmtext(): Line Slide 77 The below plot depicts the appearance of the text when different options for line are applied:
  78. 78. dataCrunchmtext(): adj Slide 78 # create a basic plot plot(mtcars$disp, mtcars$mpg) # align the text to the left mtext("Sample Text", adj= 0) Description The adj argument is used for horizontal alignment of the text. If set to 0, the text will be left aligned and at 1, it will be right aligned. Code
  79. 79. dataCrunchmtext(): adj Slide 79 # create a basic plot plot(mtcars$disp, mtcars$mpg) # align the text to the right mtext("Sample Text", adj= 1) Description When the value is set to 1, the text will be right aligned. Code
  80. 80. dataCrunchmtext(): adj Slide 80 The below plot depicts the appearance of the text when different options for adj are applied:
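The side, line and adj arguments can be seen together in one plot. The sketch below writes a label on each of the four margins and adds left and right aligned text on the top margin. The wider right margin set through par(mar = ...) is an assumption made here so that the text on side 4 is not clipped, and `side_names` is an illustrative name.

```r
side_names <- c("bottom", "left", "top", "right")

# Widen the right margin (default is about 2 lines) so side 4 has room.
op <- par(mar = c(5, 4, 4, 4) + 0.1)
plot(mtcars$disp, mtcars$mpg, ann = FALSE)

# One label per margin, two lines out from the plot region.
for (s in 1:4) {
  mtext(paste0("side = ", s, " (", side_names[s], ")"), side = s, line = 2)
}

# adj controls alignment along the margin: 0 = left, 1 = right.
mtext("adj = 0", side = 3, line = 0.5, adj = 0)
mtext("adj = 1", side = 3, line = 0.5, adj = 1)
par(op)   # restore the previous margins
```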
  81. 81. dataCrunch Layout Slide 81
  82. 82. dataCrunchLayout: Objectives Slide 82 In this section, we will learn to: Combine multiple graphs in a single frame using the following functions: ● par() function ● layout() function
  83. 83. dataCrunchLayout: Introduction Slide 83 Often, it is useful to have multiple plots in the same frame as it allows us to get a comprehensive view of a particular variable or compare among different variables. The Graphics package offers two methods to combine multiple plots. The par() function can be used to set graphical parameters regarding plot layout using the mfcol and mfrow arguments. The layout() function serves the same purpose but offers more flexibility by allowing us to modify the height and width of rows and columns.
  84. 84. dataCrunchLayout: par() Slide 84 The par() function allows us to customize the graphical parameters (title, axis, font, color, size) for a particular session. For combining multiple plots, we can use the graphical parameters mfrow and mfcol. These two parameters create a matrix of plots filled by rows and columns respectively. Let us combine plots using both the above parameters.
      Option   Description       Arguments
      mfrow    Fill by rows      Number of rows and columns
      mfcol    Fill by columns   Number of rows and columns
  85. 85. dataCrunchLayout: par(mfrow) Slide 85 (a) mfrow mfrow combines plots filled by rows, i.e., it takes a vector of two values, the number of rows and the number of columns, and then fills in the plots row by row. mfrow is a graphical parameter rather than a function, so it is set through par(). Below is the syntax for mfrow: # mfrow syntax par(mfrow = c(number of rows, number of columns))
  86. 86. dataCrunchRecipe 1: Code Slide 86 Let us begin by combining 4 plots in 2 rows and 2 columns. The plots will be filled by rows as we are using the mfrow parameter: # store the current parameter settings in init init <- par(no.readonly=TRUE) # specify that 4 graphs to be combined and filled by rows par(mfrow = c(2, 2)) # specify the graphs to be combined plot(mtcars$mpg) plot(mtcars$disp, mtcars$mpg) hist(mtcars$mpg) boxplot(mtcars$mpg) # restore the setting stored in init par(init)
  87. 87. dataCrunchRecipe 1: Plot Slide 87
  88. 88. dataCrunchRecipe 2: Code Slide 88 Combine 2 plots in 1 row and 2 columns. # store the current parameter settings in init init <- par(no.readonly=TRUE) # specify that 2 graphs to be combined and filled by rows par(mfrow = c(1, 2)) # specify the graphs to be combined hist(mtcars$mpg) boxplot(mtcars$mpg) # restore the setting stored in init par(init)
  89. 89. dataCrunchRecipe 2: Plot Slide 89
  90. 90. dataCrunchRecipe 3: Code Slide 90 Combine 2 plots in 2 rows and 1 column # store the current parameter settings in init init <- par(no.readonly=TRUE) # specify that 2 graphs to be combined and filled by rows par(mfrow = c(2, 1)) # specify the graphs to be combined hist(mtcars$mpg) boxplot(mtcars$mpg) # restore the setting stored in init par(init)
  91. 91. dataCrunchRecipe 3: Plot Slide 91
  92. 92. dataCrunchRecipe 4: Code Slide 92 Combine 3 plots in 1 row and 3 columns # store the current parameter settings in init init <- par(no.readonly=TRUE) # specify that 3 graphs to be combined and filled by rows par(mfrow = c(1, 3)) # specify the graphs to be combined plot(mtcars$disp, mtcars$mpg) hist(mtcars$mpg) boxplot(mtcars$mpg) # restore the setting stored in init par(init)
  93. 93. dataCrunchRecipe 4: Plot Slide 93
  94. 94. dataCrunchRecipe 5: Code Slide 94 Combine 3 plots in 3 rows and 1 column # store the current parameter settings in init init <- par(no.readonly=TRUE) # specify that 3 graphs to be combined and filled by rows par(mfrow = c(3, 1)) # specify the graphs to be combined plot(mtcars$disp, mtcars$mpg) hist(mtcars$mpg) boxplot(mtcars$mpg) # restore the setting stored in init par(init)
  95. 95. dataCrunchRecipe 5: Plot Slide 95
  96. 96. dataCrunchLayout: par(mfcol) Slide 96 (b) mfcol mfcol combines plots filled by columns, i.e., it takes a vector of two values, the number of rows and the number of columns, and then fills in the plots column by column. Like mfrow, it is a graphical parameter set through par(). Below is the syntax for mfcol: # mfcol syntax par(mfcol = c(number of rows, number of columns))
  97. 97. dataCrunchRecipe 6: Code Slide 97 Combine 4 plots in 2 rows and 2 columns # store the current parameter settings in init init <- par(no.readonly=TRUE) # specify that 4 graphs to be combined and filled by columns par(mfcol = c(2, 2)) # specify the graphs to be combined plot(mtcars$mpg) plot(mtcars$disp, mtcars$mpg) hist(mtcars$mpg) boxplot(mtcars$mpg) # restore the setting stored in init par(init)
  98. 98. dataCrunchRecipe 6: Plot Slide 98
  99. 99. dataCrunchSpecial Cases Slide 99 What happens if we supply fewer or more graphs than the layout has cells? In the next two examples, we will supply fewer or more graphs than we ask the par() function to combine. Let us see what happens in such instances: Case 1: Fewer graphs supplied We will specify that 4 plots need to be combined in 2 rows and 2 columns but provide only 3 graphs. Case 2: Extra graphs supplied We will specify that 4 plots need to be combined in 2 rows and 2 columns but provide 6 graphs instead of 4.
  100. 100. dataCrunchSpecial Case 1: Code Slide 100 # store the current parameter settings in init init <- par(no.readonly=TRUE) # specify that 4 graphs to be combined and filled by rows par(mfrow = c(2, 2)) # specify the graphs to be combined plot(mtcars$disp, mtcars$mpg) hist(mtcars$mpg) boxplot(mtcars$mpg) # restore the setting stored in init par(init)
  101. 101. dataCrunchSpecial Case 1: Plot Slide 101
  102. 102. dataCrunchSpecial Case 2: Code Slide 102
    # store the current parameter settings in init
    init <- par(no.readonly = TRUE)
    # specify that 4 graphs are to be combined and filled by rows
    par(mfrow = c(2, 2))
    # specify the graphs to be combined (6 are provided)
    plot(mtcars$mpg)
    plot(mtcars$disp, mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
    plot(mtcars$disp, mtcars$mpg)
    boxplot(mtcars$mpg)
    # restore the settings stored in init
    par(init)
  103. 103. dataCrunchSpecial Case 2: Plot Slide 103 Frame 1 Frame 2
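The two special cases come down to simple arithmetic: a grid holds rows x columns plots per frame, unused cells stay empty, and plots beyond the last cell start a new frame. The helpers below, pages_needed() and empty_cells(), are hypothetical names introduced only to illustrate that arithmetic.

```r
# number of frames (pages) needed for n_plots in a rows x cols grid
pages_needed <- function(n_plots, rows, cols) {
  ceiling(n_plots / (rows * cols))
}

# number of cells left empty on the last frame
empty_cells <- function(n_plots, rows, cols) {
  pages_needed(n_plots, rows, cols) * rows * cols - n_plots
}

# Case 1: 3 plots in a 2 x 2 grid
case1 <- c(pages = pages_needed(3, 2, 2), empty = empty_cells(3, 2, 2))

# Case 2: 6 plots in a 2 x 2 grid
case2 <- c(pages = pages_needed(6, 2, 2), empty = empty_cells(6, 2, 2))
```

Case 1 fits on one frame with one empty cell; Case 2 needs two frames, with the second frame holding only two plots.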
  104. 104. r-squaredCombining Graphs: layout() Slide 104 At the core of the layout() function is a matrix. We communicate the structure in which the plots must be combined using a matrix. As such, the layout() function is more flexible than the par() function. Let us begin by combining 4 plots in a 2 row/2 column structure. We do this by creating a layout using the matrix() function.
    Option    Description                           Value
    mat       Matrix specifying location of plots   Matrix
    widths    Widths of columns                     Vector
    heights   Heights of rows                       Vector
  105. 105. dataCrunchRecipe 7: Code Slide 105 Combine 4 plots in 2 rows/2 columns filled by rows
    # specify the layout
    # 4 plots to be combined in 2 rows/2 columns and arranged by row
    layout(matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE))
    # specify the 4 plots
    plot(mtcars$disp, mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
    plot(mtcars$mpg)
  106. 106. dataCrunchRecipe 7: Plot Slide 106
  107. 107. dataCrunchRecipe 8: Code Slide 107 Combine 4 plots in 2 rows/2 columns filled by columns. To fill the plots by column, we specify byrow = FALSE in the matrix.
    # specify the layout
    # 4 plots to be combined in 2 rows/2 columns and filled by columns
    layout(matrix(c(1, 2, 3, 4), nrow = 2, byrow = FALSE))
    # specify the 4 plots
    plot(mtcars$disp, mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
    plot(mtcars$mpg)
  108. 108. dataCrunchRecipe 8: Plot Slide 108
  109. 109. dataCrunchRecipe 9: Code Slide 109 Combine 3 plots in 2 rows/2 columns filled by rows. The magic of the layout() function begins here. We want to combine 3 plots so that the first plot occupies both columns in row 1 and the next 2 plots sit in row 2. In the matrix below, 1 is specified twice, and since the matrix is filled by row, both cells of the first row carry the number 1. As a result, the first plot occupies the entire first row. It will be crystal clear when you see the plot.
    # specify the matrix
    matrix(c(1, 1, 2, 3), nrow = 2, byrow = TRUE)
    #      [,1] [,2]
    # [1,]    1    1
    # [2,]    2    3
    # 3 plots to be combined in 2 rows/2 columns and arranged by row
    layout(matrix(c(1, 1, 2, 3), nrow = 2, byrow = TRUE))
    # specify the 3 plots
    plot(mtcars$disp, mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
  110. 110. dataCrunchRecipe 9: Plot Slide 110
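Before drawing real plots into an irregular layout such as the one in Recipe 9, it can help to preview the regions. The sketch below assumes a null PDF device so nothing is saved; it uses the value returned by layout(), which is the number of figures, and layout.show(), which outlines and numbers each region. The names m and n_figs are illustrative.

```r
# open a null PDF device: the preview is processed but no file is created
pdf(NULL)

# same matrix as Recipe 9: plot 1 spans the whole first row
m <- matrix(c(1, 1, 2, 3), nrow = 2, byrow = TRUE)

# layout() invisibly returns the number of figures in the layout
n_figs <- layout(m)

# outline and label each of the numbered regions
layout.show(n_figs)

invisible(dev.off())
```

On a screen device (drop the pdf(NULL) and dev.off() lines) the same calls display the outlined regions, which makes it easy to check a matrix before committing to it.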
  111. 111. dataCrunchRecipe 10: Code Slide 111 Combine 3 plots in 2 rows/2 columns filled by rows. The plots must be filled by rows and the third plot must occupy both columns of the second row, while the other two plots are placed in the first row. The matrix would look like this:
    # specify the matrix
    matrix(c(1, 2, 3, 3), nrow = 2, byrow = TRUE)
    #      [,1] [,2]
    # [1,]    1    2
    # [2,]    3    3
    # 3 plots to be combined in 2 rows/2 columns and arranged by row
    layout(matrix(c(1, 2, 3, 3), nrow = 2, byrow = TRUE))
    # specify the 3 plots
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
    plot(mtcars$disp, mtcars$mpg)
  112. 112. dataCrunchRecipe 10: Plot Slide 112
  113. 113. dataCrunchRecipe 11: Code Slide 113 Combine 3 plots in 2 rows/2 columns filled by columns. The plots must be filled by columns and the first plot must occupy both rows of the first column, while the other two plots are placed in the two rows of the second column. The matrix would look like this:
    # specify the matrix
    matrix(c(1, 1, 2, 3), nrow = 2, byrow = FALSE)
    #      [,1] [,2]
    # [1,]    1    2
    # [2,]    1    3
    # 3 plots to be combined in 2 rows/2 columns and arranged by columns
    layout(matrix(c(1, 1, 2, 3), nrow = 2, byrow = FALSE))
    # specify the 3 plots
    hist(mtcars$mpg)
    plot(mtcars$disp, mtcars$mpg)
    boxplot(mtcars$mpg)
  114. 114. dataCrunchRecipe 11: Plot Slide 114
  115. 115. dataCrunchRecipe 12: Code Slide 115 Combine 3 plots in 2 rows/2 columns filled by columns. The plots must be filled by columns and the third plot must occupy both rows of the second column, while the other two plots are placed in the two rows of the first column. The matrix would look like this:
    # specify the matrix
    matrix(c(1, 2, 3, 3), nrow = 2, byrow = FALSE)
    #      [,1] [,2]
    # [1,]    1    3
    # [2,]    2    3
    # 3 plots to be combined in 2 rows/2 columns and arranged by columns
    layout(matrix(c(1, 2, 3, 3), nrow = 2, byrow = FALSE))
    # specify the 3 plots
    boxplot(mtcars$mpg)
    plot(mtcars$disp, mtcars$mpg)
    hist(mtcars$mpg)
  116. 116. dataCrunchRecipe 12: Plot Slide 116
  117. 117. dataCrunchlayout(): Widths Slide 117 Widths In all the layouts created so far, we have kept the size of the rows and columns equal. What if you want to modify the width and height of the columns and rows? The widths and heights arguments in the layout() function address the above mentioned issue. Let us check them out one by one: The widths argument is used for specifying the width of the columns. Based on the number of columns in the layout, you can specify the width of each column. Let us look at some examples.
  118. 118. dataCrunchRecipe 13: Code Slide 118 Width of the 2nd column is three times the width of the 1st column
    # specify the matrix
    matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE)
    #      [,1] [,2]
    # [1,]    1    2
    # [2,]    3    4
    # 4 plots to be combined in 2 rows/2 columns and arranged by rows
    layout(matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE), widths = c(1, 3))
    # specify the plots
    plot(mtcars$disp, mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
    plot(mtcars$mpg)
  119. 119. dataCrunchRecipe 13: Plot Slide 119
  120. 120. dataCrunchRecipe 14: Code Slide 120 Width of the 2nd column is twice that of the first and last columns
    # specify the matrix
    matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)
    #      [,1] [,2] [,3]
    # [1,]    1    2    3
    # [2,]    4    5    6
    # 6 plots to be combined in 2 rows/3 columns and filled by rows
    layout(matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE), widths = c(1, 2, 1))
    # specify the plots
    plot(mtcars$disp, mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
    plot(mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
  121. 121. dataCrunchRecipe 14: Plot Slide 121
  122. 122. dataCrunchlayout(): Heights Slide 122 Heights The heights argument is used to modify the height of the rows. Based on the number of rows specified in the layout, we can specify the height of each row. Height of the 2nd row is twice that of the first row
    # 4 plots to be combined in 2 rows/2 columns and filled by rows
    layout(matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE), heights = c(1, 2))
    # specify the 4 plots
    plot(mtcars$disp, mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
    plot(mtcars$mpg)
  123. 123. dataCrunchRecipe 15: Plot Slide 123
  124. 124. dataCrunchRecipe 16: Code Slide 124 Height of the 3rd row is three times that of the 1st and 2nd rows
    # specify the matrix
    matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, byrow = TRUE)
    #      [,1] [,2]
    # [1,]    1    2
    # [2,]    3    4
    # [3,]    5    6
    # 6 plots to be combined in 3 rows/2 columns and arranged by rows
    layout(matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, byrow = TRUE), heights = c(1, 1, 3))
    # specify the plots
    plot(mtcars$disp, mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
    plot(mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
  125. 125. dataCrunchRecipe 16: Plot Slide 125
  126. 126. dataCrunchPutting it all together... Slide 126 Before we end this section, let us combine plots using both the widths and heights options.
    # specify the matrix
    matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, byrow = TRUE)
    #      [,1] [,2]
    # [1,]    1    2
    # [2,]    3    4
    # [3,]    5    6
    # 6 plots to be combined in 3 rows/2 columns and arranged by rows
    layout(matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, byrow = TRUE), heights = c(1, 2, 1), widths = c(2, 1))
    # specify the 6 plots
    plot(mtcars$disp, mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
    plot(mtcars$mpg)
    hist(mtcars$mpg)
    boxplot(mtcars$mpg)
  127. 127. dataCrunchPlot Slide 127
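Numeric widths and heights passed to layout() are relative, so what matters is each value's share of the total. The sketch below works out those shares for the combined example with heights = c(1, 2, 1) and widths = c(2, 1). The variable names are illustrative, and the shares describe the figure regions only approximately, since margins inside each region are ignored.

```r
heights <- c(1, 2, 1)
widths  <- c(2, 1)

# share of the device height taken by each row
row_share <- heights / sum(heights)

# share of the device width taken by each column
col_share <- widths / sum(widths)

# share of the device area taken by each cell (rows x columns)
cell_share <- outer(row_share, col_share)
round(cell_share, 3)
```

The middle row of the first column is the largest cell because it combines the tallest row with the widest column.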
  128. 128. dataCrunch Saving Graphs Slide 128
  129. 129. dataCrunchSaving Graphs Slide 129 Once created, the graphs can be saved in multiple formats. The table below lists the formats in which graphs can be saved and the function that opens each type of file.
    Type           Function
    pdf            pdf("graph_name.pdf")
    png            png("graph_name.png")
    jpeg           jpeg("graph_name.jpeg")
    bmp            bmp("graph_name.bmp")
    postscript     postscript("graph_name.ps")
    win.metafile   win.metafile("graph_name.wmf")
  130. 130. dataCrunchSaving Graphs: Example Slide 130 Call dev.off() after the plot has been drawn. It closes the file device, which completes the file, and returns output to the previously active device. Note that win.metafile() is available only on Windows.
    # saving graph in pdf file
    pdf("plot_pdf.pdf")
    plot(mtcars$disp, mtcars$mpg)
    dev.off()
    # saving graph in png file
    png("plot_png.png")
    plot(mtcars$disp, mtcars$mpg)
    dev.off()
    # saving graph in jpeg file
    jpeg("plot_jpeg.jpg")
    plot(mtcars$disp, mtcars$mpg)
    dev.off()
    # saving graph in bmp file
    bmp("plot_bmp.bmp")
    plot(mtcars$disp, mtcars$mpg)
    dev.off()
    # saving graph in postscript file
    postscript("plot_postscript.ps")
    plot(mtcars$disp, mtcars$mpg)
    dev.off()
    # saving graph in windows metafile (Windows only)
    win.metafile("plot_wmf.wmf")
    plot(mtcars$disp, mtcars$mpg)
    dev.off()
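The open device / draw / close device pattern can be checked end to end by writing to a temporary file and confirming that the file exists afterwards. This is a minimal sketch: out_file and saved are illustrative names, the temporary path stands in for a real file name, and the width and height values (in inches for pdf()) are arbitrary choices.

```r
# temporary path used only for this illustration
out_file <- tempfile(fileext = ".pdf")

# open the pdf device; width and height are given in inches
pdf(out_file, width = 7, height = 5)

# everything drawn now goes to the file instead of the screen
plot(mtcars$disp, mtcars$mpg)

# close the device so that the file is completed
invisible(dev.off())

# confirm that a non-empty file was written
saved <- file.exists(out_file) && file.size(out_file) > 0
```

If dev.off() is forgotten, the device stays open and later plots keep going to the same file, so closing the device belongs to every save.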
  131. 131. dataCrunchNext Steps... Slide 131 Learn to → Create basic Line Graph → Add multiple lines → Modify line type/color → Add legend
  132. 132. dataCrunchNext Steps... Slide 132 Visit dataCrunch for tutorials on: → R Programming → Business Analytics → Data Visualization → Web Applications → Package Development → Git & GitHub
