
QMC: Undergraduate Workshop, Tutorial on 'R' Software - Yawen Guan, Feb 26, 2018


  1. R Basics and Simulation file:///F/Homework/Feb%20Ugrad%20Presentations/RTutorial_YawenGuan.html[3/6/2018 9:02:11 PM]

     About R

     R is a free software environment for statistical computing and graphics. It provides a wide variety of statistical and graphical techniques, and many classical and modern statistical techniques have been implemented. A few of these are built into the base R environment, but many are supplied as packages.

     A convenient interface is RStudio, an integrated development environment (IDE) for R. It includes a console and a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management.

     Open RStudio

     Frequently Used Data Types and R-Objects

     Variables are not declared with a data type. Instead, a variable is assigned an R-object, and the data type of that R-object becomes the data type of the variable.

     Data Type: Numeric, Integer, Character

         ## Numeric
         ## Assign a number to variable num
         num <- 3.14
         print(num)
         ## [1] 3.14

         ## simple calculation by calling the variable
         print(num + 1)
         ## [1] 4.14

         ## Let's check the data type that has been assigned to num
         print(class(num))
         ## [1] "numeric"
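As a small illustration of the point that a variable takes its type from the object assigned to it, the sketch below reassigns one name to two different objects. The variable names (`v`, `class_before`, and so on) are made up for this example and are not part of the original tutorial.

```r
# A variable takes on the class of whatever object it currently holds.
v <- 3.14
class_before <- class(v)      # class while v holds a number

v <- "three point one four"   # reassign the same name to a string
class_after <- class(v)       # class while v holds a string

# typeof() reports the internal storage type. A plain number is stored
# as a double unless the integer suffix L is used.
type_plain  <- typeof(3)
type_suffix <- typeof(3L)

print(c(class_before, class_after, type_plain, type_suffix))
```

No declaration is needed at any point; the second assignment simply replaces the object that `v` refers to.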
  2. Data Type: Numeric, Integer, Character (continued)

         ## Integer
         ## Assign the integer part of num to variable num.int
         num.int <- as.integer(num)
         num.int
         ## [1] 3

         ## Let's check the data type
         class(num.int)
         ## [1] "integer"

         ## Character
         ## Assign a string to variable char
         char <- "Hello"
         char
         ## [1] "Hello"

         ## Let's check the data type
         class(char)
         ## [1] "character"

     R-Objects: Vectors, Matrices, Data Frames

         # Create a vector with more than one element
         # We use the c() function, which combines the elements into a vector
         # create a vector of characters
         col <- c('red','green',"yellow")
         col
         ## [1] "red"    "green"  "yellow"

         # create a vector of numerics
         num <- c(1,2,3)
         num
         ## [1] 1 2 3
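The vectors created above can be indexed in several ways besides a single position. The following is a minimal sketch using only base R; the result names (`first_two`, `without_1`, and so on) are illustrative. It also shows that `as.integer()` truncates toward zero instead of rounding.

```r
num <- c(1, 2, 3)
col <- c("red", "green", "yellow")

first_two  <- num[1:2]        # positions 1 and 2
without_1  <- num[-1]         # a negative index drops that position
big_values <- num[num > 1]    # a logical index keeps elements where TRUE
picked_col <- col[num == 2]   # index one vector by a condition on another

# as.integer() drops the fractional part (truncation toward zero)
trunc_pos <- as.integer(3.99)
trunc_neg <- as.integer(-3.99)

print(first_two)
print(without_1)
print(big_values)
print(picked_col)
print(c(trunc_pos, trunc_neg))
```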
  3. R-Objects: Vectors, Matrices, Data Frames (continued)

         # extract elements from vectors
         num[1]
         ## [1] 1

         # Create matrices from vectors
         # There are several ways to do this
         # Use the cbind() function, which means column combine
         Mcol <- cbind(num,num,num)
         Mcol
         ##      num num num
         ## [1,]   1   1   1
         ## [2,]   2   2   2
         ## [3,]   3   3   3

         # Use the rbind() function, which means row combine
         Mrow <- rbind(num,num,num)
         Mrow
         ##     [,1] [,2] [,3]
         ## num    1    2    3
         ## num    1    2    3
         ## num    1    2    3

         # Use the matrix() function to fill in each element
         M <- matrix(1:9,nrow=3,ncol=3)
         M
         ##      [,1] [,2] [,3]
         ## [1,]    1    4    7
         ## [2,]    2    5    8
         ## [3,]    3    6    9

         # Now let's try combining a numeric vector and a character vector into a matrix
         Mtry <- rbind(num,col)
         class(Mtry) # Do you notice what has been changed here?
         ## [1] "matrix"
         # (R >= 4.0.0 prints "matrix" "array" here)

         # extract elements from a matrix
         M[1,3]
         ## [1] 7

  4. R-Objects: Vectors, Matrices, Data Frames (continued)

         # Create a data frame
         # stringsAsFactors = TRUE reproduces the factor output shown below;
         # the default changed to FALSE in R 4.0.0
         df <- data.frame(x = col, y = num, stringsAsFactors = TRUE)
         df
         ##        x y
         ## 1    red 1
         ## 2  green 2
         ## 3 yellow 3

         # extract elements from a data frame
         df$x
         ## [1] red    green  yellow
         ## Levels: green red yellow
         df$y[3]
         ## [1] 3

     Calculation with R: Multiplication, Log, Exponential, Power and Some Useful Statistics

         # for a scalar
         x <- 2
         x*x
         ## [1] 4

         # for a vector
         num
         ## [1] 1 2 3
         num + num
         ## [1] 2 4 6
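The question attached to `Mtry` earlier is about type coercion: a matrix can hold only one type, while a data frame keeps one type per column. Here is a minimal sketch of that difference, with illustrative result names. `stringsAsFactors = TRUE` is set explicitly as an assumption so the factor column matches the output shown in the slides.

```r
num <- c(1, 2, 3)
col <- c("red", "green", "yellow")

# A matrix holds a single type, so the numbers are coerced to character
Mtry <- rbind(num, col)
matrix_type <- typeof(Mtry)
first_entry <- Mtry["num", 1]   # a string, no longer a number

# A data frame keeps one type per column
df <- data.frame(x = col, y = num, stringsAsFactors = TRUE)
column_classes <- sapply(df, class)

# Arithmetic still works on the numeric column alone
y_squared <- df$y^2

print(matrix_type)
print(first_entry)
print(column_classes)
print(y_squared)
```

Selecting the numeric column before doing arithmetic avoids the kind of problem that arises when an operation is applied to a whole data frame containing a non-numeric column.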
  5. Calculation with R (continued)

         y <- c(0,1,2,3,4)
         log(y)
         ## [1]      -Inf 0.0000000 0.6931472 1.0986123 1.3862944

         # for a matrix
         M <- matrix(1:9,ncol=3,nrow=3)
         exp(M)
         ##           [,1]      [,2]     [,3]
         ## [1,]  2.718282  54.59815 1096.633
         ## [2,]  7.389056 148.41316 2980.958
         ## [3,] 20.085537 403.42879 8103.084

         # for a data frame
         df
         ##        x y
         ## 1    red 1
         ## 2  green 2
         ## 3 yellow 3
         df^2 # why is there a warning message?
         ## Warning in Ops.factor(left, right): '^' not meaningful for factors
         ##       x y
         ## [1,] NA 1
         ## [2,] NA 4
         ## [3,] NA 9

         # Useful statistics
         mean(M)
         ## [1] 5
         sum(M)
         ## [1] 45

     Compute Deterministic Function in R: Sine(x), Polynomial x^2 + 3*x
  6. Compute Deterministic Function in R (continued)

         # define the range of x values
         x <- seq(0,10,length.out = 100)
         x
         ##   [1]  0.0000000  0.1010101  0.2020202  0.3030303  0.4040404  0.5050505
         ##   [7]  0.6060606  0.7070707  0.8080808  0.9090909  1.0101010  1.1111111
         ##  [13]  1.2121212  1.3131313  1.4141414  1.5151515  1.6161616  1.7171717
         ##  [19]  1.8181818  1.9191919  2.0202020  2.1212121  2.2222222  2.3232323
         ##  [25]  2.4242424  2.5252525  2.6262626  2.7272727  2.8282828  2.9292929
         ##  [31]  3.0303030  3.1313131  3.2323232  3.3333333  3.4343434  3.5353535
         ##  [37]  3.6363636  3.7373737  3.8383838  3.9393939  4.0404040  4.1414141
         ##  [43]  4.2424242  4.3434343  4.4444444  4.5454545  4.6464646  4.7474747
         ##  [49]  4.8484848  4.9494949  5.0505051  5.1515152  5.2525253  5.3535354
         ##  [55]  5.4545455  5.5555556  5.6565657  5.7575758  5.8585859  5.9595960
         ##  [61]  6.0606061  6.1616162  6.2626263  6.3636364  6.4646465  6.5656566
         ##  [67]  6.6666667  6.7676768  6.8686869  6.9696970  7.0707071  7.1717172
         ##  [73]  7.2727273  7.3737374  7.4747475  7.5757576  7.6767677  7.7777778
         ##  [79]  7.8787879  7.9797980  8.0808081  8.1818182  8.2828283  8.3838384
         ##  [85]  8.4848485  8.5858586  8.6868687  8.7878788  8.8888889  8.9898990
         ##  [91]  9.0909091  9.1919192  9.2929293  9.3939394  9.4949495  9.5959596
         ##  [97]  9.6969697  9.7979798  9.8989899 10.0000000

         y <- sin(x)
         plot(x,y)

         z <- x^2 + 3*x
         plot(x,z, type="l")
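The calls `sin(x)` and `x^2 + 3*x` above work on the whole vector at once. As a rough sketch of what that means, the code below compares the vectorized result with an explicit loop over the same grid. The names `y_vectorized`, `y_loop` and `step` are illustrative only.

```r
x <- seq(0, 10, length.out = 100)

# The grid has 100 points, so there are 99 equal steps between 0 and 10
step <- x[2] - x[1]

# Vectorized evaluation: one call handles every element of x
y_vectorized <- sin(x)

# The same computation written as an explicit loop
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  y_loop[i] <- sin(x[i])
}

same_result <- isTRUE(all.equal(y_vectorized, y_loop))

# The polynomial is evaluated element by element in the same way
z <- x^2 + 3 * x

print(step)
print(same_result)
print(range(z))
```

The vectorized form is shorter and is usually the more idiomatic choice in R.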
  7. Simulate Random Variables in R

         # generate uniform random variables between 0 and 1
         u <- runif(10)
         # plot to see what it looks like
         plot(u)
         hist(u)
  8. Simulate Random Variables in R (continued)

         # generate more data
         u <- runif(1000)
         plot(u)
         hist(u) # Do you see what has changed?
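One way to put a number on what changes between 10 and 1000 draws is to compare the share of draws falling in each histogram bin. The sketch below does this without plotting. The seed value, the ten equal-width bins and the variable names are all assumptions chosen for illustration.

```r
set.seed(1)  # arbitrary seed so the run is repeatable

u_small <- runif(10)
u_large <- runif(1000)

# Ten equal-width bins on [0, 1]; plot = FALSE returns the counts only
breaks <- seq(0, 1, by = 0.1)
counts_small <- hist(u_small, breaks = breaks, plot = FALSE)$counts
counts_large <- hist(u_large, breaks = breaks, plot = FALSE)$counts

# Proportion of draws in each bin; a uniform variable puts 0.1 in each
prop_small <- counts_small / length(u_small)
prop_large <- counts_large / length(u_large)

# Gap between the fullest and the emptiest bin
spread_small <- max(prop_small) - min(prop_small)
spread_large <- max(prop_large) - min(prop_large)

print(round(prop_small, 2))
print(round(prop_large, 2))
print(c(spread_small, spread_large))
```

With more draws the bin proportions settle closer to 0.1 each, which is why the second histogram looks flatter.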
  9. Simulate Random Variables in R (continued)

         # sample from a vector
         # Type help(sample) to see the function arguments
         sample(x = 1:100, size=1, replace=FALSE)
         ## [1] 35

         # Normal random variable, a very useful random variable in statistics
         n1 <- rnorm(1)
         n1
         ## [1] -1.251055
         n2 <- rnorm(1)
         n2
         ## [1] 0.650681

         # Generate a larger sample to see its distribution
         n <- rnorm(1000)
         plot(n)
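The `replace` argument of `sample()` used on this slide decides whether a value can be drawn more than once. The following is a minimal sketch of both settings; the seed and the variable names are illustrative assumptions.

```r
set.seed(7)  # arbitrary seed so the run is repeatable

# Without replacement: each value appears at most once, so sampling
# all ten values gives a random permutation
perm <- sample(1:10)

# With replacement: values may repeat, so the sample can be larger
# than the population
draws <- sample(1:3, size = 10, replace = TRUE)

# Asking for more values than exist without replacement is an error;
# tryCatch() captures the message instead of stopping the script
err <- tryCatch(
  sample(1:3, size = 10),
  error = function(e) conditionMessage(e)
)

print(perm)
print(draws)
print(err)
```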
  10. Simulate Random Variables in R (continued)

         hist(n)
         plot(density(n), main="Density of n",xlab="n") # remember the shape of the distribution
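Besides looking at the histogram and the density plot, the shape of the sample can be checked numerically against the standard normal distribution. This is a rough sketch, assuming an arbitrary seed and illustrative variable names; `pnorm()` is the standard normal cumulative distribution function from base R.

```r
set.seed(42)  # arbitrary seed so the run is repeatable
n <- rnorm(1000)

# The sample mean and standard deviation should be near 0 and 1
sample_mean <- mean(n)
sample_sd   <- sd(n)

# Share of draws within one and two standard deviations of zero
within_1sd <- mean(abs(n) < 1)
within_2sd <- mean(abs(n) < 2)

# Theoretical shares for a standard normal distribution
theory_1sd <- pnorm(1) - pnorm(-1)
theory_2sd <- pnorm(2) - pnorm(-2)

print(round(c(sample_mean, sample_sd), 3))
print(round(c(within_1sd, theory_1sd), 3))
print(round(c(within_2sd, theory_2sd), 3))
```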
  11. Simulate Random Variables in R (continued)

         # set seed to generate the same random number
         set.seed(123)
         n1 <- rnorm(1)
         n1
         ## [1] -0.5604756
         set.seed(123)
         n2 <- rnorm(1)
         n2
         ## [1] -0.5604756

     Why is Normal Distribution so Useful?

         # Let's look at some examples
         # Suppose we flip 30 coins and count the number of heads.
         # What do you think the distribution of the count will look like?
         # simulate 30 coin flips
         x <- sample(c("head","tail"),30,replace = TRUE)
         x
         ##  [1] "head" "tail" "tail" "head" "tail" "tail" "tail" "head" "tail" "head"
         ## [11] "tail" "tail" "head" "tail" "head" "head" "head" "tail" "tail" "tail"
         ## [21] "tail" "tail" "tail" "tail" "tail" "tail" "head" "head" "tail" "tail"

         # count the number of heads
         x == "head"
         ##  [1]  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE
         ## [12] FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
         ## [23] FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE

  12. Why is Normal Distribution so Useful? (continued)

         sum( x == "head" )
         ## [1] 10

         # repeat this 1000 times
         headcount <- numeric(1000) # preallocate the result vector
         for (i in 1:1000){
           x <- sample(c("head","tail"),30,replace = TRUE)
           headcount[i] <- sum( x == "head" )
         }
         hist(headcount,main="Head Count in 30 Coin Flips") # plot the distribution, what do you see?

         # How about simulating from a different distribution?
         # simulate from the uniform distribution
         x <- runif(30)
         sum(x)
         ## [1] 14.14213

         # repeat this 1000 times
         sumunif <- numeric(1000) # preallocate the result vector
         for (i in 1:1000){
           x <- runif(30)
           sumunif[i] <- sum(x)
         }

  13. Why is Normal Distribution so Useful? (continued)

         hist(sumunif,main="Sum of Uniform Random Variables") # plot the distribution, what do you see?

         # The bean machine
         # install.packages("animation")
         # library(animation)
         # balls = 200
         # layers = 15
         # ani.options(nmax=balls+layers-2)
         # quincunx(balls, layers)
         # We will illustrate this example during the hands-on session

     Writing Your Own Function

         h <- function(x)(sin(x)^2+cos(x)^3)^(3/2)
         # Define the range of x values
         x <- seq(0,10,length.out = 100)
         h(x)
         ##   [1] 1.0000000000 0.9924415229 0.9708663012 0.9383916390 0.8996454821
         ##   [6] 0.8600539168 0.8250760835 0.7995255648 0.7870556343 0.7898022779
         ##  [11] 0.8081168229 0.8403199035 0.8824779760 0.9283032379 0.9693440650
         ##  [16] 0.9956215190 0.9967769375 0.9636598362 0.8901501922 0.7749162064
         ##  [21] 0.6227902584 0.4455330451 0.2620251351 0.0988159073 0.0002787684
         ##  [26]          NaN          NaN          NaN          NaN          NaN
         ##  [31]          NaN          NaN          NaN          NaN          NaN
         ##  [36]          NaN          NaN          NaN          NaN 0.0712840321
         ##  [41] 0.2261336483 0.4079475371 0.5882393238 0.7466649092 0.8700178635
         ##  [46] 0.9520957115 0.9930696121 0.9982195163 0.9762294564 0.9373522986
         ##  [51] 0.8917535712 0.8482638203 0.8136393063 0.7922914495 0.7863422320
         ##  [56] 0.7958337805 0.8189731051 0.8523909547 0.8914739366 0.9308440204
         ##  [61] 0.9650074350 0.9891120153 0.9996831561 0.9951835470 0.9762685521
         ##  [66] 0.9456767896 0.9077819385 0.8679102229 0.8315735142 0.8037642898
         ##  [71] 0.7884053128 0.7879647488 0.8031771984 0.8327963036 0.8733616611
         ##  [76] 0.9190612631 0.9618485403 0.9919789195 0.9990544240 0.9735346297
         ##  [81] 0.9085333538 0.8016159861 0.6562765079 0.4828356883 0.2987209003
         ##  [86] 0.1287563460 0.0103723359          NaN          NaN          NaN
         ##  [91]          NaN          NaN          NaN          NaN          NaN
         ##  [96]          NaN          NaN          NaN          NaN          NaN

  14. Writing Your Own Function (continued)

         plot(x, h(x),type="l") # notice what happened to some h(x) values

     Use functions written by others

         # install.packages("mcsm")
         library("mcsm")
         # see what functions are in the mcsm package
         ls("package:mcsm")
         # see how to use a particular function
         help(mcsm)
         # some packages also come with very neat examples
         demo(package = "mcsm")
         demo(Chapter.2)
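Returning to the `h(x)` plot on this slide, the missing stretches of the curve correspond to the `NaN` entries in the printed output. A likely explanation, sketched below, is that the inner expression `sin(x)^2 + cos(x)^3` is negative at those points, and a negative real number raised to the power 3/2 is not a real number. The names `inner`, `vals` and `nan_where_negative` are illustrative.

```r
h <- function(x) (sin(x)^2 + cos(x)^3)^(3/2)
x <- seq(0, 10, length.out = 100)

# The quantity that gets raised to the power 3/2 inside h
inner <- sin(x)^2 + cos(x)^3
vals  <- h(x)

# NaN should appear exactly where the inner expression is negative
nan_where_negative <- identical(is.nan(vals), inner < 0)
n_nan <- sum(is.nan(vals))

# The same effect on a single negative number
single_case <- (-0.5)^(3/2)

print(nan_where_negative)
print(n_nan)
print(single_case)
```

`plot()` leaves gaps in the line wherever a value is `NaN`, so the curve is drawn only over the ranges of x where `h` is defined.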
