1. Stat310 Fin
Hadley Wickham
Saturday, 24 April 2010
2. Thank you!
To those of you who bought your
textbooks from my Amazon link.
To the textbook publishers who
generously sent me free copies of books.
To Kensey for suggesting Chick-fil-A.
3. 1. Eat!
2. Final & help sessions
3. Finish off hypothesis testing
4. Other statistics opportunities
5. Feedback (TA & me)
5. Final
Take home. Two hours long.
Three (double-sided) pages of notes.
Available Wednesday April 28 9am.
Due Wednesday May 5, 5pm,
under my door.
Ten small questions of approximately
equal weight. Similar to questions from the
homework/book.
6. Common themes
Probability of an event.
Independence & conditioning.
Distributions: pdf/pmf, cdf, mgf, named.
Transformations.
Sampling distribution of mean and variance.
Estimation and testing.
Philosophy of grading
7. Help sessions
Mon, Tue, Wed, Thurs, Fri, Sat, Sun?
Morning or afternoon?
One-on-one help, plus brief revision of
topics of particular interest. Suggest and
vote at http://goo.gl/mod/joIx
8. Honour code
Remember to pledge your exam, and
note the time at which you started and
ended.
You may refer only to your note sheets,
not to the text book or old homeworks
etc.
10. Course grades
Assume I took a random sample of 20
students from each year, and that
course grades are normally distributed with
variance 80.
What is the distribution of the difference of
the two group means?
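One way to work this out, as a sketch: assume the two samples are independent and that both years share the stated variance of 80. The symbols X̄ and Ȳ (the two sample means) and μx, μy (the two population means) are notation chosen here for illustration.

```latex
\begin{aligned}
\bar{X} &\sim \mathrm{N}\!\left(\mu_x,\ \tfrac{80}{20}\right), \qquad
\bar{Y} \sim \mathrm{N}\!\left(\mu_y,\ \tfrac{80}{20}\right) \\
\bar{Y} - \bar{X} &\sim \mathrm{N}\!\left(\mu_y - \mu_x,\ \tfrac{80}{20} + \tfrac{80}{20}\right)
  = \mathrm{N}\!\left(\mu_y - \mu_x,\ 8\right)
\end{aligned}
```

The variances add because the samples are assumed independent, so the standard deviation of the difference is √8 ≈ 2.83.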
11. Your turn
The average grade from 2009 was 85 and
the average grade from 2010 was 90.
What is the p-value? (The probability that
you’d see a difference this large or larger if
there really was no difference in the
population means)
12. 1. Write down Ho and Ha
(positions of defence and prosecution)
2. Figure out good test statistic
(what numeric summary?)
3. Work out null distribution
(distribution of innocents)
4. Calculate p-value by comparing actual value
to null distribution (what proportion of true
innocents look more guilty than the suspect)
5. Reject Ho if p-value smaller than cutoff
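The five steps can be sketched in Python for the course-grades example (averages 85 and 90, 20 students per year, variance 80). This is a minimal sketch, not necessarily the test used in class: it assumes a two-sided alternative and a known, shared variance, and the function name `two_sample_z_test` and the default cutoff of 0.05 are choices made here for illustration.

```python
import math

def two_sample_z_test(mean_x, mean_y, variance, n, cutoff=0.05):
    """Five-step test for two independent samples of size n with a
    known, shared variance (an assumption of this sketch)."""
    # 1. Ho: the population means are equal. Ha: they differ (two-sided).
    # 2. Test statistic: difference in sample means, in standard errors.
    standard_error = math.sqrt(variance / n + variance / n)
    z = (mean_y - mean_x) / standard_error
    # 3. Null distribution: under Ho, z is standard normal.
    # 4. p-value: P(|Z| >= |z|), which equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # 5. Reject Ho if the p-value is smaller than the cutoff.
    return z, p_value, p_value < cutoff

z, p_value, reject = two_sample_z_test(85, 90, variance=80, n=20)
print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject Ho: {reject}")
# prints: z = 1.77, p-value = 0.077, reject Ho: False
```

Under these assumptions the p-value is about 0.08, so Ho is not rejected at a cutoff of 0.05 but would be at a cutoff of 0.10.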
13.            Say is guilty       Say is innocent
Is guilty      Correct             False acquittal
Is innocent    False conviction    Correct
14. Your turn
Which type of error is more expensive/
more costly/worse in the criminal justice
system?
15.         Reject Ho       Accept Ho
Ho false    Correct         Type II error
Ho true     Type I error    Correct
16. Rates
For a given test,
P(false conviction) = α = significance level
P(false acquittal) = 1 - β
β = power
What do you think happens to β if you try to
make α smaller?
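To explore that question numerically, here is a rough sketch that computes the power (the quantity this slide calls β) of a two-sided z-test for several values of α. It assumes a true difference in means of 5 (borrowed from the μx=80, μy=85 example), 20 students per group and variance 80. The function name `power_two_sided` is made up for this sketch, and `NormalDist` comes from Python's standard `statistics` module (Python 3.8+).

```python
from statistics import NormalDist

def power_two_sided(true_difference, standard_error, alpha):
    """P(reject Ho) for a two-sided z-test when the true difference in
    means is `true_difference`."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)
    shift = true_difference / standard_error
    # Reject when |Z + shift| > critical, where Z is standard normal.
    return std_normal.cdf(shift - critical) + std_normal.cdf(-shift - critical)

standard_error = (80 / 20 + 80 / 20) ** 0.5
powers = {alpha: power_two_sided(5, standard_error, alpha)
          for alpha in (0.10, 0.05, 0.01)}
for alpha, power in powers.items():
    print(f"alpha = {alpha:.2f}: power = {power:.2f}")
```

In this setup the power falls as α shrinks: a stricter standard for conviction means more false acquittals.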
18. Cut off
Choose cut-off based on rate of false
convictions.
If you want a 5% rate of false
convictions, reject Ho if the p-value is less
than 0.05. (This is the industry standard
rate)
Can work out power.
19. μx=80, μy=85
[Figure: scatterplot of points labelled x and y; horizontal axis 20 to 100, vertical axis 76 to 90]
23. μx=80, μy=85
[Figure: p-value (vertical axis, 0.0 to 0.8) against a horizontal axis running 20 to 100]
Correctly reject null 39% of the time
24. μx=μy=80
[Figure: scatterplot of points labelled x and y; horizontal axis 20 to 100, vertical axis 76 to 84]
28. μx=μy=80
[Figure: p-value (vertical axis, 0.0 to 0.8) against a horizontal axis running 20 to 100]
Incorrectly reject null 6% of the time
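The two scenarios above (μx=80, μy=85 and μx=μy=80) can be re-created with a small simulation. This is only a sketch under stated assumptions: normal grades with variance 80, 20 students per group, a two-sided z-test with known variance and a cutoff of 0.05. The function name `rejection_rate`, the seed and the number of simulated experiments are all choices made here, and the test behind the slides may have differed (for example, a t-test with estimated variance), so the rates need not match 39% and 6% exactly.

```python
import math
import random

def rejection_rate(mu_x, mu_y, variance=80, n=20, cutoff=0.05,
                   n_experiments=10_000, seed=310):
    """Fraction of simulated experiments in which Ho (equal means) is rejected."""
    rng = random.Random(seed)
    sd = math.sqrt(variance)
    standard_error = math.sqrt(2 * variance / n)
    rejections = 0
    for _ in range(n_experiments):
        # Draw one sample of n grades per group and compare the means.
        mean_x = sum(rng.gauss(mu_x, sd) for _ in range(n)) / n
        mean_y = sum(rng.gauss(mu_y, sd) for _ in range(n)) / n
        z = (mean_y - mean_x) / standard_error
        p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
        rejections += p_value < cutoff
    return rejections / n_experiments

power = rejection_rate(80, 85)                  # Ho false: correct rejections
false_rejection_rate = rejection_rate(80, 80)   # Ho true: false convictions
print(f"mu_x=80, mu_y=85: reject {power:.0%} of the time")
print(f"mu_x=mu_y=80:     reject {false_rejection_rate:.0%} of the time")
```

When Ho is true, the rejection rate should sit near the cutoff. When Ho is false, it estimates the power.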
29. Your turn
The average grade from 2009 was 85 and
the average grade from 2010 was 90.
Would you reject the null hypothesis that
the average grade was the same?
30. Connection to confidence intervals
If you construct a 90% confidence
interval, and it doesn’t include the
parameter under the null, then the p-value
must be < 1 - 0.9 = 0.1.
If the p-value is 0.08, then a 92% or
greater confidence interval would include
the null parameter, and an interval with a
lower confidence level would not.
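A quick numerical check of this connection, using the course-grades example (observed difference 90 - 85 = 5, 20 students per group, variance 80). This is a sketch assuming a two-sided z-test with known variance; the function name `confidence_interval` is chosen here for illustration.

```python
from statistics import NormalDist

def confidence_interval(difference, standard_error, level):
    """Normal-theory interval for a difference in means at the given level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return difference - z * standard_error, difference + z * standard_error

difference = 90 - 85
standard_error = (80 / 20 + 80 / 20) ** 0.5
p_value = 2 * (1 - NormalDist().cdf(abs(difference) / standard_error))

for level in (0.90, 0.95):
    low, high = confidence_interval(difference, standard_error, level)
    includes_zero = low <= 0 <= high
    print(f"{level:.0%} interval: ({low:.2f}, {high:.2f}), "
          f"includes 0: {includes_zero}, "
          f"p-value < {1 - level:.2f}: {p_value < 1 - level}")
```

Under these assumptions the 90% interval excludes 0 and the p-value is below 0.1, while the 95% interval includes 0 and the p-value is above 0.05.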
32. Majoring
3 required stat classes (Stat310, Stat405, Stat410)
+ 6 stat electives
+ calc, linear algebra, computing
+ design project
Makes for a great double major.
Particularly useful if you’re thinking about
grad school. (Appealing to employers too)
http://statistics.rice.edu/ShowInterior.aspx?id=58
33. Minoring
From next year
Three required:
Track A: stat310, stat405, stat400/410
Track B: stat100, stat280, stat385
Three elective:
300 level+, one outside stat if it has
strong statistical component
34. Stat410
Introduction to linear models
Powerful and general statistical tool.
Theory and data.
Offered in Fall.
35. Stat405
Project based introduction to data
analysis. Lots of computing and hardly
any maths.
http://had.co.nz/stat405
Offered in Fall, and next year in Spring.
36. Electives
SOCI 436 (Houston area survey), 313
(demography)
ECON 340/440 (game theory), 400
(econometrics), 475 (optimisation), 477 (math of
economics), 479 (modelling)
STAT 385, 431 (more theory), 420 (process
control), 421 (time series), 422 (Bayesian data
analysis), 423 (bioinformatics), 453
(biostatistics), 485 (environmental)
37. Feedback
One form for me.
One form for Xin Zhao, who most of you
never met but who was the TA in charge of
your grading.
No form for Garrett.