1. Stat405 Simulation
Hadley Wickham
Thursday, 23 September 2010
2. 1. Homework comments
2. Mathematical approach
3. More randomness
4. Random number generators
3. Homework
Just graded your organisation and code, and
focused my comments there.
Biggest overall tip: use floating figures (the LaTeX
figure environment) with captions. Use \ref{} to refer
to the figure in the text.
Captions should start with brief description of plot
(including bin width if applicable) and finish with
brief description of what the plot reveals.
Will grade captions more aggressively in the future.
4. Code
Gives explicit technical details.
Your comments should remind you why
you did what you did.
Most readers will not look at it, but it’s
very important to include it, because it
means that others can check your work.
5. Mathematical approach
Why are we doing this simulation? Could
work out the expected value and variance
mathematically. So let’s do it!
Simplifying assumption: slots are iid.
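Under the iid assumption the calculation can be written out as below. This is a sketch in notation that is not from the slides: p(s) is assumed to be the probability that a single slot shows symbol s, and g(w1, w2, w3) the prize paid for a combination.

```latex
\begin{align*}
P(w_1, w_2, w_3) &= p(w_1)\, p(w_2)\, p(w_3) \\
\mathrm{E}[g] &= \sum_{w_1, w_2, w_3} p(w_1)\, p(w_2)\, p(w_3)\, g(w_1, w_2, w_3) \\
\mathrm{Var}[g] &= \sum_{w_1, w_2, w_3} p(w_1)\, p(w_2)\, p(w_3)\, g(w_1, w_2, w_3)^2 - \mathrm{E}[g]^2
\end{align*}
```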
9. Your turn
How can you calculate the probability of each
combination?
(Hint: think about subsetting. Another hint:
think about the table and character
subsetting. Final hint: you can do this in one
line of code)
Then work out the expected value (the payoff).
10. poss$prob <- with(poss,
dist[w1] * dist[w2] * dist[w3])
(poss_mean <- with(poss, sum(prob * prize)))
# How do we determine the variance of this
# estimator?
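One way to approach the variance question is to apply the same weighted-sum idea to the squared deviations. The sketch below is self-contained and uses made-up inputs: the three-symbol dist, the poss table built with expand.grid, and the 10-for-three-of-a-kind prize rule are all hypothetical stand-ins for the course data.

```r
# Hypothetical symbol probabilities for one slot (not the course data)
dist <- c(A = 0.5, B = 0.3, C = 0.2)

# All possible combinations of three iid slots
poss <- expand.grid(w1 = names(dist), w2 = names(dist), w3 = names(dist),
                    stringsAsFactors = FALSE)

# Hypothetical payoff rule: 10 for three of a kind, 0 otherwise
poss$prize <- ifelse(poss$w1 == poss$w2 & poss$w2 == poss$w3, 10, 0)

# Probability of each combination, by character subsetting as above
poss$prob <- with(poss, dist[w1] * dist[w2] * dist[w3])

# Expected value and variance of the payoff from a single play
poss_mean <- with(poss, sum(prob * prize))
poss_var <- with(poss, sum(prob * (prize - poss_mean)^2))
c(mean = poss_mean, var = poss_var)
```

If the simulation averages n independent plays, the variance of that average is poss_var / n.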
12. Sample
Very useful for selecting from a discrete
set (vector) of possibilities.
Four arguments: x, size, replace, prob
13. How can you?
Choose 1 from vector
Choose n from vector, with replacement
Choose n from vector, without replacement
Perform a weighted sample
Put a vector in random order
Put a data frame in random order
14. # Choose 1 from vector
sample(letters, 1)
# Choose n from vector, without replacement
sample(letters, 10)
sample(letters, 40)  # error: only 26 letters, so cannot draw 40 without replacement
# Choose n from vector, with replacement
sample(letters, 40, replace = TRUE)
# Perform a weighted sample
sample(names(dist), prob = dist)
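A quick way to check a weighted sample is to draw many values and compare the observed frequencies with the weights. This is a minimal sketch; the three-symbol dist is hypothetical, not the course data.

```r
set.seed(405)

# Hypothetical weights for three symbols
dist <- c(A = 0.5, B = 0.3, C = 0.2)

# Draw 10,000 symbols with replacement, weighted by dist
draws <- sample(names(dist), size = 10000, replace = TRUE, prob = dist)

# Observed relative frequencies should be close to dist
freq <- table(draws) / length(draws)
round(freq, 2)
```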
15. # Put a vector in random order
sample(letters)
# Put a data frame in random order
slots[sample(1:nrow(slots)), ]
16. Your turn
Source of randomness in random_prize is
sample. Other options are:
runif, rbinom, rnbinom, rpois, rnorm,
rt, rcauchy
What sort of random variables do they
generate and what are their parameters?
Practice generating numbers from them.
17.
Function   Distribution        Parameters
runif      Uniform             min, max
rbinom     Binomial            size, prob
rnbinom    Negative binomial   size, prob
rpois      Poisson             lambda
rnorm      Normal              mean, sd
rt         t                   df
rcauchy    Cauchy              location, scale
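For practice, each generator in the table can be called with its parameters named explicitly. The parameter values below are arbitrary choices for illustration.

```r
set.seed(405)

# Five draws from each distribution; the first argument is the number of draws
draws <- list(
  runif   = runif(5, min = 0, max = 1),
  rbinom  = rbinom(5, size = 10, prob = 0.3),
  rnbinom = rnbinom(5, size = 3, prob = 0.5),
  rpois   = rpois(5, lambda = 4),
  rnorm   = rnorm(5, mean = 0, sd = 1),
  rt      = rt(5, df = 3),
  rcauchy = rcauchy(5, location = 0, scale = 1)
)
str(draws)
```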
18. Distributions
Other functions
• r to generate random numbers
• d to compute density f(x)
• p to compute distribution F(x)
• q to compute inverse distribution F⁻¹(x)
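The four prefixes can be tried on the normal distribution. A small sketch, using the standard normal because its values are easy to check by hand:

```r
# d: density f(x) at x = 0, which equals 1 / sqrt(2 * pi)
dens0 <- dnorm(0)

# p: distribution function F(x), the probability of a value <= 1.96
p196 <- pnorm(1.96)

# q: inverse distribution function, the x with F(x) = 0.975
q975 <- qnorm(0.975)

# q and p undo each other
round_trip <- pnorm(qnorm(0.975))

c(dens0 = dens0, p196 = p196, q975 = q975, round_trip = round_trip)
```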
19. # Easy to combine random variables
n <- rpois(10000, lambda = 10)
x <- rbinom(10000, size = n, prob = 0.3)
qplot(x, binwidth = 1)
p <- runif(10000)
x <- rbinom(10000, size = 10, prob = p)
qplot(x, binwidth = 0.1)
# cf.
qplot(runif(10000), binwidth = 0.1)
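Both combinations above have known answers, which makes them a useful check on the simulation. A Poisson(10) count thinned with probability 0.3 is Poisson(3), so its mean and variance should both be near 3. A binomial with size 10 and a uniform success probability is uniform on 0, 1, ..., 10. The sketch below compares numeric summaries instead of plots, so it does not need ggplot2.

```r
set.seed(405)

# Poisson number of trials, then binomial successes: Poisson(10 * 0.3)
n <- rpois(10000, lambda = 10)
x <- rbinom(10000, size = n, prob = 0.3)
c(mean = mean(x), var = var(x))  # both should be close to 3

# Uniform success probability, then binomial: uniform on 0..10
p <- runif(10000)
y <- rbinom(10000, size = 10, prob = p)
y_freq <- table(factor(y, levels = 0:10)) / length(y)
round(y_freq, 3)  # each value should be close to 1/11
```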
20. # Simulation is a powerful tool for exploring
# distributions. Easy to do computationally; hard
# to do analytically
qplot(1 / rpois(10000, lambda = 20))
qplot(1 / runif(10000, min = 0.5, max = 2))
qplot(rnorm(10000) ^ 2)
qplot(rnorm(10000) / rnorm(10000))
# http://www.johndcook.com/distribution_chart.html
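Two of these transformations have known results: the square of a standard normal is chi-squared with 1 degree of freedom, and the ratio of two independent standard normals is standard Cauchy. A rough sketch that compares simulated quartiles with the theoretical ones (the sample size of 1e5 is an arbitrary choice):

```r
set.seed(405)
probs <- c(0.25, 0.5, 0.75)

# Square of a standard normal vs. chi-squared with 1 df
z2 <- rnorm(1e5)^2
rbind(simulated = quantile(z2, probs), theory = qchisq(probs, df = 1))

# Ratio of two independent standard normals vs. standard Cauchy
ratio <- rnorm(1e5) / rnorm(1e5)
rbind(simulated = quantile(ratio, probs), theory = qcauchy(probs))
```

Quantiles are used instead of means because the Cauchy distribution has no mean.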
24. How do computers generate random numbers?
They don’t! They actually produce pseudo-random sequences.
Common approach: X_{n+1} = (a X_n + c) mod m
(http://en.wikipedia.org/wiki/Linear_congruential_generator)
25. next_val <- function(x, a, c, m) {
(a * x + c) %% m
}
x <- 1001
(x <- next_val(x, 1664525, 1013904223, 2^32))
# http://en.wikipedia.org/wiki/List_of_pseudorandom_number_generators
# R uses the Mersenne Twister by default:
# http://en.wikipedia.org/wiki/Mersenne_twister
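Repeating next_val gives a whole pseudo-random sequence, and dividing by m rescales it to [0, 1). The sketch below wraps that loop in a helper; the name lcg is hypothetical, and the constants are the ones used above. It illustrates the recurrence only and is not how R's own generator works.

```r
next_val <- function(x, a, c, m) {
  (a * x + c) %% m
}

# Hypothetical helper: n values from a linear congruential generator,
# rescaled to [0, 1)
lcg <- function(n, seed, a = 1664525, c = 1013904223, m = 2^32) {
  out <- numeric(n)
  x <- seed
  for (i in seq_len(n)) {
    x <- next_val(x, a, c, m)
    out[i] <- x
  }
  out / m
}

u <- lcg(1000, seed = 1001)
head(u)
```

The same seed always gives the same sequence, which is what makes the sequence pseudo-random and not truly random.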
26. # Random numbers are reproducible!
set.seed(1)
runif(10)
set.seed(1)
runif(10)
# Very useful when required to make a reproducible
# example that involves randomness
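The claim can be checked directly with identical(). A short sketch, where the variable names are only for illustration:

```r
set.seed(1)
first <- runif(10)

set.seed(1)
second <- runif(10)

# Without resetting the seed, the generator carries on from where it stopped
third <- runif(10)

identical(first, second)  # same seed, same numbers
identical(first, third)   # no reset, different numbers
```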
27. True randomness
Atmospheric radio noise: http://www.random.org. Use from R with
the random package.
Not really important unless you’re running
a lottery. (With a pseudo-random generator, someone
who observes a long enough sequence can predict the
next value.)