Introduction to Statistics

Lecture 11 - April 21st, 2010

Probability

✤   How we express likelihood mathematically

✤   For an event “A”, the probability of A occurring is denoted “P(A)”

✤   Always a number between 0 and 1

    ✤   P(A) = 0 means that A never happens

    ✤   P(A) = 1 means that A always happens

Independence & Exclusivity

✤   independence - A and B are independent if the occurrence of one does
    not affect the probability of the other:

    ✤   P(A|B) = P(A) = P(A|not B)

    ✤   P(B|A) = P(B) = P(B|not A)

✤   mutually exclusive - A and B are mutually exclusive if it is impossible
    for both of them to occur:

    ✤   P(A and B) = 0
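
Both definitions can be checked by counting outcomes. The sketch below is only an illustration: the two-dice setup and the four events (first die even, sum is 7, sum is 2, sum is 12) are assumptions chosen for the example, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on (die1, die2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

# Hypothetical events, picked only for illustration.
first_even = lambda o: o[0] % 2 == 0      # A
sum_is_7 = lambda o: o[0] + o[1] == 7     # B
sum_is_2 = lambda o: o[0] + o[1] == 2     # C
sum_is_12 = lambda o: o[0] + o[1] == 12   # D

# Independence: P(A|B) = P(A) = P(A|not B)
p_a = prob(first_even)
p_a_given_b = cond_prob(first_even, sum_is_7)
p_a_given_not_b = cond_prob(first_even, lambda o: not sum_is_7(o))

# Mutual exclusivity: P(C and D) = 0
p_c_and_d = prob(lambda o: sum_is_2(o) and sum_is_12(o))

print(p_a, p_a_given_b, p_a_given_not_b, p_c_and_d)
```

Using `Fraction` keeps the probabilities exact, so the equalities can be compared directly instead of up to rounding.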

Probability Rules

✤   Probability of not happening is 1 minus probability of occurring

    ✤   P(not A) = 1 - P(A)

✤   When A and B are independent:

    ✤   P(A and B) = P(A) × P(B)

✤   P(A or B) = P(A) + P(B) - P(A and B)
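
A quick way to convince yourself of these rules is to count faces on a single fair die. This is a minimal sketch; the events "even" and "greater than 4" are assumptions picked for the example (they happen to be independent, so the product rule applies).

```python
from fractions import Fraction

faces = range(1, 7)  # one fair six-sided die

def prob(event):
    """Probability of an event given as a predicate on a face value."""
    return Fraction(sum(1 for f in faces if event(f)), 6)

is_even = lambda f: f % 2 == 0   # A (hypothetical event)
is_high = lambda f: f > 4        # B (hypothetical event)

p_a = prob(is_even)
p_b = prob(is_high)
p_not_a = prob(lambda f: not is_even(f))
p_a_and_b = prob(lambda f: is_even(f) and is_high(f))
p_a_or_b = prob(lambda f: is_even(f) or is_high(f))

print(p_not_a == 1 - p_a)                 # complement rule
print(p_a_and_b == p_a * p_b)             # product rule (A, B independent)
print(p_a_or_b == p_a + p_b - p_a_and_b)  # addition rule
```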
Probability Fundamentals

✤   Sum of probabilities of all possible outcomes is 1

✤   Flip a coin and you get either heads or tails:

    ✤   P(heads) + P(tails) = 1 = P(heads or tails)

✤   With mutually exclusive outcomes A, B, C, and D

    ✤   P(A) + P(B) + P(C) + P(D) = 1 = P(A or B or C or D)

Conditional Probability

✤   With non-independent events, knowing one has happened may change
    the likelihood of the other occurring

✤   Conditional probability - what is the probability of A given that B has
    already happened?

    ✤   P(A|B)

✤   Bayes Rule for conditional probability:

    ✤   P(A|B) = P(A and B) ÷ P(B) = P(B|A) × P(A) ÷ P(B)
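
As a sanity check, Bayes Rule in the form P(A|B) = P(B|A) × P(A) ÷ P(B) can be compared against the direct definition P(A and B) ÷ P(B). The sketch below assumes one fair die with A = "roll is even" and B = "roll is a 6"; both events are hypothetical choices for illustration.

```python
from fractions import Fraction

faces = range(1, 7)  # one fair six-sided die

def prob(event):
    """Probability of an event given as a predicate on a face value."""
    return Fraction(sum(1 for f in faces if event(f)), 6)

is_even = lambda f: f % 2 == 0   # A (hypothetical event)
is_six = lambda f: f == 6        # B (hypothetical event)

p_a = prob(is_even)
p_b = prob(is_six)
p_a_and_b = prob(lambda f: is_even(f) and is_six(f))

# Direct definition of conditional probability.
p_a_given_b = p_a_and_b / p_b
p_b_given_a = p_a_and_b / p_a

# Bayes Rule: flip the conditioning.
p_a_given_b_bayes = p_b_given_a * p_a / p_b

print(p_a_given_b, p_a_given_b_bayes)
```

Because P(A) and P(B) differ here, swapping them in the formula would give a different (wrong) number, which makes this a useful check on the order of the terms.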
Conditional Probability Hoedown

✤   At John Jay, 62.5% of all students hate statistics while 25% of all
    students hate statistics and passed the class. What is the probability
    that a student passes stats given that the student hates statistics?

✤   Two fair dice are rolled. What is the (conditional) probability that
    exactly one die’s value is a 1 or 2 given that they show different
    numbers?
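
One way to work both problems is sketched below: the first uses P(A|B) = P(A and B) ÷ P(B) directly, and the second counts outcomes of two dice. The variable names are illustrative, not from the lecture.

```python
from fractions import Fraction
from itertools import product

# Problem 1: P(passes | hates stats) = P(hates and passes) / P(hates)
p_hates = Fraction(625, 1000)           # 62.5% hate statistics
p_hates_and_passes = Fraction(25, 100)  # 25% hate statistics and passed
p_pass_given_hates = p_hates_and_passes / p_hates

# Problem 2: condition on the two dice showing different numbers.
rolls = list(product(range(1, 7), repeat=2))
different = [r for r in rolls if r[0] != r[1]]
# "Exactly one die is a 1 or 2": one die is low and the other is not.
exactly_one_low = [r for r in different if (r[0] <= 2) != (r[1] <= 2)]
p_one_low_given_different = Fraction(len(exactly_one_low), len(different))

print(p_pass_given_hates)         # 2/5
print(p_one_low_given_different)  # 8/15
```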

Something Really Important

✤   Classic stats problem emerged from the game show Let’s Make a Deal,
    often called the Monty Hall Problem after the show’s host

✤   Has ended many friendships and caused bitter internet arguments

The Game

✤   There are 3 doors labeled “1”, “2”, and “3”, behind one of these doors
    is a fabulous prize that Monty has hidden

✤   You get to choose a door, which may or may not have the prize

✤   Monty opens another door without revealing the prize

✤   You now have the option to stay with your door or switch to the other
    unopened door. Should you stick with your original choice or switch?

Choosing The First Door

✤   Three doors and one prize so you’ll pick the right door one out of
    three times, i.e. P(right first choice) = 1/3

✤   Likewise, you’ll pick the wrong door with P(wrong first choice) = 2/3

The Reveal

✤   No matter how you choose, there are two other doors and only one prize. This
    means there is at least one of the two unchosen doors with nothing
    behind it.

✤   Monty knows where the prize is and opens one of the other doors that
    DOESN’T have the prize behind it.

✤   This leaves your door and one other. One of them has the prize and
    the other doesn’t, should you switch?

To Switch, or Not To Switch

✤   You don’t know if you have the right door!

✤   What’s the probability that your door has the prize?

✤   What’s the probability that the other door has the prize?

✤   What’s the probability that your door doesn’t have the prize?

Example of the Game

✤   As an example, the prize is hidden behind door “3”.

✤   If you choose door “3” initially, switching can only lose you the prize

✤   If you choose door “2” initially, Monty must open door “1” and
    switching will get you the prize

✤   If you choose door “1” initially, Monty must open door “2” and
    switching will get you the prize
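
The three cases above can be enumerated mechanically. This is a small sketch under the slide's assumption that the prize is behind door 3; the function name `play` is made up for the example.

```python
DOORS = (1, 2, 3)
prize = 3  # as in the example above

def play(choice, prize):
    """Return (doors Monty may open, door you get by switching, switch wins?)."""
    # Monty can only open a door that is neither your pick nor the prize.
    openable = [d for d in DOORS if d != choice and d != prize]
    # If two doors are openable (you picked the prize), either one leads
    # to the same result for switching, so take the first.
    monty_opens = openable[0]
    switch_to = next(d for d in DOORS if d not in (choice, monty_opens))
    return openable, switch_to, switch_to == prize

results = {choice: play(choice, prize) for choice in DOORS}
for choice, (openable, switch_to, wins) in results.items():
    print(f"pick {choice}: Monty can open {openable}, "
          f"switch to {switch_to}, switching wins: {wins}")
```

Switching wins for two of the three equally likely first picks, which is where the 2/3 comes from.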

Switch Already!

✤   Switching is a way of saying “I don’t think the prize is behind this
    door”

✤   Since the probability is 1/3 that the prize is behind any one door, the
    probability is 2/3 that the prize is not behind that door

✤   Always switch and you’ll win 2/3 of the time!
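
If the argument still feels wrong, a simulation is a reasonable tie-breaker. The sketch below plays the game many times with a fixed random seed; the seed, the trial count, and the rule that Monty picks at random when he has two doors to choose from are all assumptions of this example.

```python
import random

def switch_wins(rng):
    """Play one game and report whether switching wins the prize."""
    doors = [1, 2, 3]
    prize = rng.choice(doors)
    choice = rng.choice(doors)
    # Monty opens a door that is neither your pick nor the prize.
    monty = rng.choice([d for d in doors if d != choice and d != prize])
    # Switching means taking the one door that is left.
    switched = next(d for d in doors if d not in (choice, monty))
    return switched == prize

rng = random.Random(2010)  # fixed seed (arbitrary) so runs are repeatable
trials = 20_000
switch_rate = sum(switch_wins(rng) for _ in range(trials)) / trials
stay_rate = 1 - switch_rate  # staying wins exactly when switching loses

print(f"switch wins: {switch_rate:.3f}, stay wins: {stay_rate:.3f}")
```

The switch rate should land close to 2/3 and the stay rate close to 1/3.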

Expected Values

✤   Probability can be used to estimate rewards in a game of chance

    ✤   Expected Value = P(A)×Reward(A) + P(B)×Reward(B) + ...

✤   Silly coin-flipping game: If you can flip a coin three times and have
    exactly one Heads, you get a dollar. If not, you give me a dollar.

✤   Should you take the bet?
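
One way to answer is to list all eight equally likely sequences of three flips and apply the expected value formula. A minimal sketch, assuming a fair coin:

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely sequences of three fair coin flips.
flips = list(product("HT", repeat=3))

# You win a dollar only with exactly one Heads.
p_win = Fraction(sum(1 for f in flips if f.count("H") == 1), len(flips))
p_lose = 1 - p_win

# Expected Value = P(win) x (+1 dollar) + P(lose) x (-1 dollar)
expected_value = p_win * 1 + p_lose * (-1)

print(p_win, expected_value)  # 3/8 -1/4
```

A negative expected value means you lose money on average, about 25 cents per game here.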

Normal Distribution

✤   The distribution is

    ✤   unimodal

    ✤   symmetric

    ✤   “light tailed”

✤   Notation: X ~ N(μ, σ) means “the random variable X has a normal
    distribution with mean μ and standard deviation σ“

Area Under the Curve Equals 1



[Figure: density curves f(x) for N(−3, 0.5), N(2, 1), and N(−1, 3), plotted for x from −4 to 4]

Rules of Thumb

✤   P(within one standard deviation) = 0.68

✤   P(within 1.96 standard deviations) = 0.95

✤   P(within three standard deviations) = 0.997

✤   With “real” normal distributions, you just don’t get outliers!
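
These probabilities can be computed from the standard normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `within` is made up for the example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within(k):
    """P(-k < Z < k): probability of landing within k standard deviations."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 1.96, 2, 3):
    print(f"within {k} standard deviations: {within(k):.4f}")
```

Exactly 95% corresponds to about 1.96 standard deviations; the common "two standard deviations" shortcut gives slightly more than 95%.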

Standard Normal Distribution

✤   standard normal distribution is the normal distribution with mean μ = 0
    and standard deviation σ = 1: Z ~ N(0, 1)

✤   Any normal distribution can be transformed into a standard normal
    distribution. If X ~ N(μ, σ), then:
    ✤   Z = (X − μ) ÷ σ

Z - Scores & the Standard Normal

✤   Each observation has an associated z-score, which is the number of
    standard deviations that observation is away from the mean

✤   Converting a sample from a normal distribution to z-scores transforms
    it to a standard normal distribution

    ✤   z-score = (observation - mean) ÷ standard deviation

✤   If the observation is above the mean then the Z-score is positive, if
    below then the Z-score is negative
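
The z-score formula is a one-liner in code. In this sketch the IQ-style numbers (mean 100, standard deviation 15) are assumed values for illustration only.

```python
def z_score(observation, mean, std_dev):
    """Number of standard deviations the observation is from the mean."""
    return (observation - mean) / std_dev

# Hypothetical scale with mean 100 and standard deviation 15.
z_high = z_score(130, 100, 15)  # above the mean, so positive
z_low = z_score(85, 100, 15)    # below the mean, so negative

print(z_high, z_low)  # 2.0 -1.0
```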

Interval Estimation

✤   We might estimate the mean for an entire population using the mean
    for a small sample; this is called a point estimate.

✤   A confidence interval gives a range of “plausible” values for the
    population mean

       ✤   Usually reported as "mean ± wiggle room"

✤   Each interval has an associated level of confidence, usually written as
    a percent (95% being the most common)

       ✤   "I am 95% confident that the population mean is in this range,
           with the sample mean being the most likely guess"
Two-Sided: 1.96 Std. Dev.’s

Normal Critical Deviates

✤   Critical normal deviate: if you wanted to find the middle X% of the
    distribution, how many standard deviations would you have to travel
    in each direction?

    ✤   Define zγ to be the point for which the area under the normal curve to
        the right is γ.

    ✤   In more mathematical notation, zγ is the point for which:

        P(Z > zγ) = γ

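
Critical deviates can be looked up with the inverse of the standard normal CDF. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later); the function names are illustrative.

```python
from statistics import NormalDist

def z_crit(gamma):
    """The point z with P(Z > z) = gamma for a standard normal Z."""
    # Area gamma to the right means area 1 - gamma to the left.
    return NormalDist().inv_cdf(1 - gamma)

def middle(percent):
    """Standard deviations to travel each way to cover the middle percent%."""
    gamma = (1 - percent / 100) / 2  # leftover area in each tail
    return z_crit(gamma)

for percent in (90, 95, 99):
    print(f"middle {percent}%: ±{middle(percent):.3f}")
```

The middle 95% leaves 2.5% in each tail, so the critical deviate is z with γ = 0.025, which is about 1.96.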
Interpreting Confidence Intervals

✤   The width of a confidence interval indicates precision

    ✤   An observation's z-score can flag whether it is unusual compared to
        the others: a z-score beyond ±1.96 occurs only about 5% of the time
        under a normal distribution

✤   95% confidence intervals are by far the most common, but any level of
    confidence interval can be computed (for an interval around a sample
    mean, the relevant spread is the standard error of the mean,
    standard deviation ÷ √n):

    ✤   90%: mean ± (1.645 × standard error)

    ✤   95%: mean ± (1.96 × standard error)

    ✤   99%: mean ± (2.58 × standard error)
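
Here is a minimal sketch of computing such an interval for a population mean. It assumes the spread used is the standard error of the mean (sample standard deviation ÷ √n) and uses normal critical values as above; the sample data and function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, level=0.95):
    """Normal-based confidence interval for the population mean."""
    n = len(sample)
    centre = mean(sample)
    std_error = stdev(sample) / sqrt(n)            # assumed: standard error
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # e.g. about 1.96 for 95%
    return centre - z * std_error, centre + z * std_error

# Hypothetical measurements, for illustration only.
sample = [98, 102, 95, 110, 104, 99, 101, 97, 105, 100]

low95, high95 = confidence_interval(sample, 0.95)
low99, high99 = confidence_interval(sample, 0.99)
print(f"95%: ({low95:.2f}, {high95:.2f})")
print(f"99%: ({low99:.2f}, {high99:.2f})")
```

With a sample this small, a critical value from the t distribution would usually replace the normal one; the normal values are kept here only to match the numbers listed above.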
Components of Confidence

✤   How might a confidence interval change as:

    ✤   Ȳ increases

    ✤   σ increases

    ✤   n increases

    ✤   the confidence level increases (e.g., from 95% to 99%)
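
A way to explore these questions is to compute the "wiggle room" directly. The sketch assumes the interval has the form Ȳ ± z × σ ÷ √n; the function name and the numbers plugged in are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def margin(sigma, n, level=0.95):
    """Half-width of the interval: z x sigma / sqrt(n) (assumed form)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z * sigma / sqrt(n)

base = margin(sigma=10, n=25, level=0.95)
bigger_sigma = margin(sigma=20, n=25, level=0.95)
bigger_n = margin(sigma=10, n=100, level=0.95)
higher_level = margin(sigma=10, n=25, level=0.99)

print(f"baseline:     ±{base:.2f}")
print(f"larger sigma: ±{bigger_sigma:.2f}")
print(f"larger n:     ±{bigger_n:.2f}")
print(f"99% level:    ±{higher_level:.2f}")
```

Ȳ does not appear in `margin` at all: changing the sample mean moves the centre of the interval without changing its width.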

Conflicting Hypotheses

✤   In statistical inference, there are always two conflicting hypotheses:

    ✤   null hypothesis “H0” - often states “no effect” or “no difference”.
        This is the hypothesis that we will assume to be true unless we
        have convincing evidence to the contrary.

    ✤   alternative hypothesis “H1” or “Ha” - The hypothesis that we will
        believe only if the evidence strongly supports it.

✤   The null hypothesis typically has “=” in it
Hypothesis as Metaphor

✤   Hypothesis tests are like U.S. criminal trials

✤   The judicial system is structured such that the accused person is
    presumed innocent until proven guilty. In such a system the absence
    of convincing evidence (“beyond a reasonable doubt”) results in the
    person being set free.

    ✤   H0: innocent

    ✤   Ha: guilty


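P-values

✤   In each hypothesis testing situation we will compute a p-value. This is
    the probability of seeing data at least as extreme as what we observed,
    assuming the null hypothesis is true. It is not the probability that
    the null hypothesis is correct.

    ✤   Fail to reject H0 if the p-value is large

    ✤   Reject H0 if the p-value is small, go with Ha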

✤   How small is small enough? It depends... (usually p < 0.05)
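
As a sketch of where a p-value comes from, the example below assumes the test statistic is a z-score that follows a standard normal distribution when H0 is true. It computes the two-sided probability of a z-score at least that far from zero; the function name is illustrative.

```python
from statistics import NormalDist

def two_sided_p(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. a two-sided p-value."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

for z in (0.5, 1.96, 3.0):
    print(f"z = {z}: p = {two_sided_p(z):.4f}")
```

A z-score of 1.96 gives p of about 0.05, which is why 1.96 keeps showing up alongside the 0.05 cutoff.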

Notes on Hypothesis Testing

✤   “Statistical significance” is not the same as “clinical significance”. A
    tiny effect may be “statistically significant” if the sample size is huge.

✤   The p-value does not describe the magnitude of the effect!

✤   When reporting analysis results, a confidence interval should always
    be provided along with the results of a hypothesis test.

✤   The choice of 0.05 is arbitrary. (p = 0.051 and p = 0.049 should lead to
    similar conclusions, in practice they often do not)

✤   Never report results as “p < 0.05”, report the p-value and let the
    reader decide if they agree with your interpretation.
• Type I Error: Reject H0 when H0 is actually true.
  – For example, to conclude there is an effect (or a difference)
    when there really isn’t one.
  – Also called “false positive”.
• Type II Error: Accept H0 when H0 is actually false.
  – For example, to fail to find an effect (or a difference) when
    there really is one.
  – Also called “false negative”.

                               State of nature
              Decision     H0 is true   Ha is true
             Accept H0      Correct      Type II
             Reject H0      Type I       Correct

Probabilities of Errors of Type I and Type II

✤   Each of the errors has an associated probability:

    ✤   α = P(Type I Error)

    ✤   β = P(Type II Error)

✤   Hypothesis testing is set up to control Type I error rate (α)

    ✤   The experimenter chooses α - everything else follows from this!

✤   Most common (by far) choice for α is 0.05.

    ✤   (Also, 0.01 and 0.10 on occasion)

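
To see what "controlling the Type I error rate" means, the simulation sketch below repeatedly draws data for which H0 is true and counts how often a test at α = 0.05 rejects anyway. The setup is an assumption for illustration: a two-sided z-test of "mean = 0" on normal data with known standard deviation 1, with an arbitrary seed, sample size, and trial count.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96

def rejects_h0(rng, n=30):
    """Draw a sample with H0 true (mean 0, sd 1) and run a two-sided z-test."""
    sample = [rng.gauss(0, 1) for _ in range(n)]
    z = mean(sample) / (1 / sqrt(n))  # (sample mean - 0) / standard error
    return abs(z) > Z_CRIT

rng = random.Random(11)  # fixed seed (arbitrary) so runs are repeatable
trials = 5_000
type_i_rate = sum(rejects_h0(rng) for _ in range(trials)) / trials

print(f"fraction of true nulls rejected: {type_i_rate:.3f}")
```

Every rejection in this simulation is a false positive, so the printed fraction should sit close to the chosen α.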
Comparing Means

✤   Tests:

    ✤   Single group versus a fixed mean

    ✤   Two groups with the same variable

    ✤   Two groups with pairwise observation

✤   Hypotheses:

    ✤   H0 : the two groups have equal means ( mean A = mean B )

    ✤   Ha : the means of the groups are different
Assumptions for t-Tests

    ✤   The group (sample) is the Independent Variable (dichotomous)

    ✤   The outcome of interest is the Dependent Variable

✤   t-Tests are only valid if these assumptions are not violated:

    ✤   The research question DOES involve the comparison of 2 means

    ✤   The Dependent Variable is a quantitative scale

    ✤   The distribution of the Dependent Variable is normal

    ✤   Independent Variable assigned randomly (independently)
Met Assumptions, but Which Test?

✤   Only one group with data: One-Sample t-Test

✤   Two groups:

    ✤   Not related to each other: Independent-Samples t-Test

    ✤   Related samples (e.g. before & after): Paired-Samples t-Test

One-Sample t-Test

✤   Compares a sample mean to a known population mean.

✤   Need to know the population mean!

✤   Example: Is there a difference between the population mean IQ (100)
    and the mean IQ for a sample of 50 John Jay students (125)?
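
The test statistic for this example can be sketched as t = (sample mean − population mean) ÷ (sample standard deviation ÷ √n). The slide does not give the sample standard deviation, so the value 15 below is an assumption for illustration, as is the function name.

```python
from math import sqrt

def one_sample_t(sample_mean, sample_sd, n, population_mean):
    """t statistic comparing a sample mean to a fixed population mean."""
    std_error = sample_sd / sqrt(n)
    return (sample_mean - population_mean) / std_error

# Sample mean 125 and n = 50 come from the example; sd = 15 is assumed.
t_stat = one_sample_t(sample_mean=125, sample_sd=15, n=50, population_mean=100)
degrees_of_freedom = 50 - 1

print(f"t = {t_stat:.3f} with {degrees_of_freedom} degrees of freedom")
```

The p-value would then be read from a t distribution with n − 1 degrees of freedom, using a table or statistics software.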

Paired-Samples t-Test

✤   Sometimes we have two sets of measurements that are related:

    ✤   Each subject is measured before and after treatment

    ✤   With pairs of identical twins

    ✤   Subject has different treatment on left & right arms

✤   For each observation in one group there is exactly one closely related
    observation in the other group (can make pairs, one from each group)
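
A paired test reduces to a one-sample test on the within-pair differences, with H0 saying the mean difference is 0. The sketch below computes that t statistic; the before and after measurements are made-up numbers for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after measurements for the same five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 16]

# Work with one difference per pair.
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
std_error = stdev(diffs) / sqrt(n)
t_paired = mean(diffs) / std_error  # compares the mean difference to 0

print(f"mean difference = {mean(diffs)}, t = {t_paired:.3f}, df = {n - 1}")
```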
Independent-Samples t-Test

✤   Compares the means of two groups or samples.

✤   One of the most common situations in statistical inference is that of
    comparing two means from independent samples

    ✤   Clinical trials - treatment group vs. placebo group

    ✤   Exposed vs. unexposed

    ✤   Males vs. females

    ✤   General population vs. specific subpopulation
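
For two unrelated groups, the t statistic compares the difference in means to its standard error. The lecture does not say which variant it uses; the sketch below uses Welch's version, which does not assume equal variances, and the two groups are made-up data.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    std_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / std_error

# Hypothetical outcomes for two independent groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]

t_indep = welch_t(group_a, group_b)
print(f"t = {t_indep:.3f}")
```

The sign only reflects which group is listed first; a two-sided test looks at the size of t.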
Review: Hypotheses

✤   Null Hypothesis: there is no relationship between the independent
    and dependent variables

✤   p-value: the probability of data at least as extreme as ours, assuming
    the null hypothesis (H0) is true

    ✤   Reject H0 if p is too small (usually p < 0.05)

    ✤   If we reject H0, we must instead choose the alternative (Ha)

Review: t-Tests

✤   Compare the means of exactly two groups

✤   Only one group (with data) compared to a fixed number:

    ✤   One-Sample t-Test

✤   Two groups (with data):

    ✤   Not related to each other: Independent-Samples t-Test

    ✤   Related samples (e.g. before & after): Paired-Samples t-Test
