2. Topics
• Proportions
• Rates
– Change rates
– Measures of morbidity and mortality
– Standardization of rates
• Ratios
– Relative risk
– Odds Ratio
March 28, 2013 2
3. Categorical Data
Data that can be classified as belonging to a
distinct number of categories.
– Binary – data can be classified into one of 2
possible categories (yes/no, positive/negative)
– Ordinal – data that can be classified into
categories that have a natural ordering (e.g.,
levels of pain: none, moderate, intense)
– Nominal – data that can be classified into more than 2
categories with no natural ordering (e.g.,
race: Arab, African, and other)
4. Data examples
• What type of data would result from these
questions?
– How old are you? ____________
– How old are you?
• A. under 18
• B. 18-35
• C. 36-49
• D. 50 and over
5. Proportions
• Numbers by themselves may be misleading: they are on
different scales and need to be reduced to a standard
basis in order to compare them.
• We most frequently use proportions: that is, the fraction
of items that satisfy some property, such as having a
disease or being exposed to a dangerous chemical.
• "Proportions" are the same thing as fractions or
percentages. In every case you need to know what you
are taking a proportion of: that is, what is the
DENOMINATOR in the proportion.
$$p = \frac{x}{n}, \qquad \text{percent} = \frac{x}{n} \times 100$$
6. Proportions and Probabilities
• We often interpret proportions as
probabilities. If the proportion with a
disease is 1/10 then we also say that the
probability of getting the disease is 1/10, or
1 in 10.
• Proportions are usually quoted for samples
- probabilities are almost always quoted for
populations.
7. Workers Example
Smoking   Workers   Cases   Controls
No        Yes          11       35
          No           50      203
Yes       Yes          84       45
          No          313      270
• For the cases (among smokers, 84 + 313 = 397):
– Proportion of exposure = 84/397 = 0.212 or 21.2%
• For the controls (among smokers, 45 + 270 = 315):
– Proportion of exposure = 45/315 = 0.143 or 14.3%
8. Prevalence
Disease Prevalence = the proportion of people with a given
disease at a given time.
$$\text{disease prevalence} = \frac{\text{number of diseased persons at a given time}}{\text{total number of persons examined at that time}}$$
Prevalence is usually quoted per 100,000
people, so the above proportion should be
multiplied by 100,000.
9. Interpretation
At time t:
$$\text{Prevalence} = \frac{\text{cases (old + new)}}{\text{total}}$$
Problem of exposure; consequently,
not a comparable measurement:
Old = duration of the disease
New = speed of the disease
10. Screening Tests
• Through screening tests people are
classified as healthy or as falling into one
or more disease categories.
• These tests are not 100% accurate and
therefore misclassification is unavoidable.
• There are 2 proportions that are used to
evaluate these types of diagnostic
procedures.
11. Screening Tests
[Figure: diagram of the general population showing the diseased group and the group with positive test results.]
12. Sensitivity and Specificity
• Sensitivity and specificity are terms used to
describe the effectiveness of screening tests. They
describe how good a test is in two ways: avoiding
false negatives and avoiding false positives.
• Sensitivity is the proportion of the diseased who
screen positive for the disease.
• Specificity is the proportion of the healthy who screen
negative.
13. Sensitivity and Specificity
                 Condition Present      Condition Absent
Test Positive    True Positive (TP)     False Positive (FP)
Test Negative    False Negative (FN)    True Negative (TN)
Test Sensitivity (Sn) is defined as the probability that the test is positive
when given to a group of patients who have the disease.
Sn= (TP/(TP+FN))x100.
It can be viewed as, 1-the false negative rate.
The Specificity (Sp) of a screening test is defined as the probability that the
test will be negative among patients who do not have the disease.
Sp = (TN/(TN+FP))X100.
It can be understood as 1-the false positive rate.
14. Positive & Negative Predictive Values
• The positive predictive value (PPV) of a
test is the probability that a patient who
tested positive for the disease actually has
the disease. PPV = (TP/(TP+FP))X 100.
• The negative predictive value (NPV) of a
test is the probability that a patient who
tested negative for a disease does not have
the disease. NPV = (TN/(TN+FN))X100.
15. The Efficiency
• The efficiency (EFF) of a test is the
probability that the test result and the
diagnosis agree.
• It is calculated as:
EFF = ((TP+TN)/(TP+TN+FP+FN)) X 100
16. Example
• A cytological test was undertaken to screen
women for cervical cancer.
                 Actually Positive   Actually Negative   Total
Test Positive    TP = 154            FP = 225               379
Test Negative    FN = 362            TN = 23,362         23,724
Total            TP+FN = 516         FP+TN = 23,587      24,103
• Sensitivity =?
• Specificity = ?
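The definitions of Sn, Sp, PPV, NPV and EFF given earlier can be checked against this example with a short script. This is only a sketch: the function name `screening_metrics` is an illustrative choice, and the counts are read from the table as TP = 154, FP = 225, FN = 362, TN = 23,362.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return the five screening-test measures as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),          # Sn = TP / (TP + FN)
        "specificity": 100 * tn / (tn + fp),          # Sp = TN / (TN + FP)
        "ppv": 100 * tp / (tp + fp),                  # PPV = TP / (TP + FP)
        "npv": 100 * tn / (tn + fn),                  # NPV = TN / (TN + FN)
        "efficiency": 100 * (tp + tn) / (tp + tn + fp + fn),
    }


# Counts assumed from the cervical cancer screening table.
metrics = screening_metrics(tp=154, fp=225, fn=362, tn=23_362)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

In this reading the test has low sensitivity (about 30%) but high specificity (about 99%).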
17. Displaying Proportions
• Types of charts that can be used:
– Histogram
– Pie Chart
– Line Graph
• BEWARE of the type of display you use –
some charts are better at displaying
certain types of data than others.
18. Displaying Proportions
[Figure: two charts. "Distribution of Race in Sudan" (African 70%, Arab 18%, Mixed 8%, Other 4%), with an axis from 0 to 80. "Percent of Children Living in Crack/cocaine Households" by race (Black, White, American Indian, Other), with an axis from 0% to 80%.]
19. Displaying Proportions
[Figure: two charts, both titled "Distribution of Race in Sudan", showing the categories African, Arab, foreigners and others; one has an axis from 0 to 80.]
20. Displaying Proportions
Cause of Death             # of Deaths   Proportion of Deaths
Heart Disease                   12,278   0.38
Cancer                           6,448   0.20
Cerebrovascular Disease          3,958   0.12
Accidents                        1,814   0.06
Other                            8,088   0.25
[Figure: pie chart "Causes of Death" with slices for Heart Disease, Cancer, Cerebrovascular disease, Accidents and Other.]
21. Rates
• The term rate is often used interchangeably
with the term proportion although
sometimes it refers to a quantity of a very
different nature.
• Types of rates we will cover:
– Incidence rate
– Change rates
– Death rate
– Follow-up death rate
23. Definition
The incidence rate describes the production of new cases in a
population. It measures the number of new cases per unit
of time, i.e. the average speed at which new cases
appear in a given population.
There are three measures of incidence:
1. Incidence rate (= new cases / time of participation)
2. Instantaneous incidence
3. Cumulative incidence
24. In an incidence study we are interested in the
occurrence of an event (disease) over time.
Such a study follows up each subject and records
the moment (ti) at which each event (disease) occurs.
The problem with incidence data is the existence of
incomplete observations: some subjects are still not
affected at the moment of analysis.
25. To study the incidence rate we should know the
following information for each subject:
Date of origin
The date on which an individual enters the study.
Date of last information
The most recent date on which we received information about
the status of the subject. If the subject is affected, the date
of last information is the date of getting the disease.
26. Duration of follow up
The time between the date of origin and the date
of last information.
Date of point
The date on which we decide to stop collecting
information about subjects (the cutoff date).
Loss of view
A subject whose status we do not know at the date
of point is called a loss of view (lost to follow-up).
27. Time of participation t i
[Figure: timeline from time 0 to the date of point, showing participation times t1, t2, t3, …. For an affected subject the time of participation is marked on the timeline; for a loss of view it ends at the date of last information, and the time between the date of last information and the date of point is not considered.]
28. Example
The following example follows 30 individuals from 1982 to 1988 for the disease D.
Individual #   Date of origin   Date of disease   Date of last information   Time of participation (months)
1 11-82 1-84 12-88 14
2 11-82 10-87 12-88 59
3 6-82 2-86 6-87 44
4 11-82 11-86 12-88 36
5 11-82 3-85 12-88 28
6 11-82 1-88 12-88 62
7 6-83 3-84 6-85 9
8 1-83 12-86 12-87 47
9 1-84 1-87 12-88 36
10 11-82 10-85 12-88 35
11 12-82 8-86 12-88 44
12 1-83 6-83 7-85 5
13 6-83 11-87 7-88 53
14 11-82 8-84 2-87 21
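The time of participation and the incidence rate (new cases / time of participation) can be sketched as follows. This assumes the dates are month-year pairs and that, for an affected subject, participation runs from the date of origin to the date of disease, which reproduces the values listed for individuals 1 to 3. The helper names are illustrative only.

```python
def months_between(start, end):
    """Months from `start` to `end`, both given as 'M-YY' strings in the same century."""
    start_month, start_year = (int(part) for part in start.split("-"))
    end_month, end_year = (int(part) for part in end.split("-"))
    return (end_year - start_year) * 12 + (end_month - start_month)


def incidence_rate(new_cases, participation_times):
    """New cases divided by the total time of participation."""
    return new_cases / sum(participation_times)


# (date of origin, date of disease) for individuals 1 to 3 of the table.
rows = [("11-82", "1-84"), ("11-82", "10-87"), ("6-82", "2-86")]
times = [months_between(origin, disease) for origin, disease in rows]
print(times)                                # participation in months
print(incidence_rate(len(rows), times))     # cases per person-month
```

Only three rows are used here to keep the sketch short; the same calculation extends to all subjects, with non-affected subjects contributing time but no case.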
30. Sometimes it is difficult to know the exact date of origin of a case or even the duration of
follow up; this typically happens when the population under study is open. In this
case what we do know is:
1. The number of new cases, but not their exact time of participation = m
2. The number of subjects present from the beginning to the end of the study = N
The calculation using the same method as before is impossible, because too much
information is missing.
The assumption on which we build the calculation is that the subjects who enter
or leave the study, and the cases, are distributed uniformly over the period of follow up, i.e. the
time of participation for each of these subjects is taken to be half of the period of follow up of
the study (Δt/2).
The calculation is as follows:
Item                       Number   Time of participation
Not diseased               N        N*Δt
Entered during the study   Ne       Ne*Δt/2
Left during the study      Ns       Ns*Δt/2
Diseased                   m        m*Δt/2
$$IR = \frac{m}{\frac{\Delta t}{2}\,(2N + N_e + N_s + m)}$$
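A minimal sketch of this open-population calculation, summing the four participation-time terms from the table. The function name and the input numbers are hypothetical and chosen only to show the arithmetic.

```python
def open_population_incidence_rate(m, n_stable, n_entered, n_left, delta_t):
    """IR = m / ((delta_t / 2) * (2N + Ne + Ns + m)).

    n_stable  -- N, subjects followed for the whole period without disease
    n_entered -- Ne, subjects who entered during the study
    n_left    -- Ns, subjects who left during the study
    m         -- new cases; delta_t -- length of the follow-up period
    """
    person_time = (delta_t / 2) * (2 * n_stable + n_entered + n_left + m)
    return m / person_time


# Hypothetical numbers: 20 cases over a 2-year study.
ir = open_population_incidence_rate(m=20, n_stable=1000, n_entered=100,
                                    n_left=50, delta_t=2)
print(f"{ir:.5f} cases per person-year")
```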
31. Change Rates
• These types of rates are used to describe
changes after a certain period of time.
$$\text{change rate (\%)} = \frac{\text{new value} - \text{old value}}{\text{old value}} \times 100$$
• Example: A total of 35,238 new AIDS cases
were reported in 1989 compared to 32,196
reported during 1988.
– The change rate for new AIDS cases:
$$\frac{35{,}238 - 32{,}196}{32{,}196} \times 100 = 9.4\%$$
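The same calculation as a small function, offered as a sketch (the name `change_rate` is illustrative):

```python
def change_rate(old_value, new_value):
    """Percentage change from old_value to new_value."""
    return (new_value - old_value) / old_value * 100


# New AIDS cases reported in 1988 and 1989.
aids_change = change_rate(old_value=32_196, new_value=35_238)
print(f"{aids_change:.1f}%")
```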
32. Measures of Morbidity and Mortality
$$\text{crude death rate} = \frac{\text{number of deaths in a calendar year}}{\text{population in that year}}$$
$$\text{incidence rate} = \frac{\text{number of people who developed the disease over a defined period of time (e.g. a year)}}{\text{number of people at risk who were followed for the defined period of time (e.g. a year)}}$$
$$\text{follow-up death rate} = \frac{\text{number of deaths}}{\text{total person-years}}$$
33. Crude Death Rate
• Example: The 1980 population in
California was 23,000,000 (as estimated
on 1 July) and there were 190,237 deaths
during that year.
– Crude death rate
=(190,237/23,000,000)*1,000
= 8.3 deaths per 1,000 per year
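As a quick check of the arithmetic, a sketch with an illustrative function name; the `per` argument is the reporting base (1,000 here):

```python
def crude_death_rate(deaths, population, per=1_000):
    """Deaths in a calendar year per `per` persons in the mid-year population."""
    return deaths / population * per


# California, 1980 (population estimated on 1 July).
california_1980 = crude_death_rate(deaths=190_237, population=23_000_000)
print(f"{california_1980:.1f} deaths per 1,000 per year")
```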
34. Displaying Proportions over Time
Female Death Rates (1984-1987)
Year   Death Rate per 100,000
1984   793
1985   807
1986   809
1987   813
[Figure: line graph of the death rate per 100,000 by year, 1984 to 1987.]
35. Standardization of Rates
• Crude rates are used to describe a population, but
comparisons of crude rates are often invalid because
the populations may differ with respect to important
characteristics (e.g. age, gender, race).
• To account for these differences, adjusted rates are
used in the comparison.
$$\text{adjusted rate} = \frac{\text{number of deaths expected}}{\text{number in standard population}} \times 100{,}000$$
36.
                      Group A                                 Group B
Age group   Deaths   Persons   Deaths/100,000   Deaths   Persons     Deaths/100,000
0-4            162    40,000        405.0         2,049     546,000        375.3
5-19           107   128,000         83.6         1,195   1,982,000         60.3
20-44          449   172,000        261.0         5,097   2,676,000        190.5
45-64          451    58,000        777.6        19,904   1,807,000       1101.5
65+            444     9,000       4933.3        63,505   1,444,000       4397.9
Totals       1,613   407,000        396.3        91,750   8,455,000       1085.2
37. Using the X population for 1970 as a standard, we get:
                        Group A                        Group B
Age group   Standard    Age-spec. rate   Exp. deaths   Age-spec. rate   Exp. deaths
0-4            84,416        405.0           342            375.3           317
5-19          294,353         83.6           246             60.3           177
20-44         316,744        261.0           827            190.5           603
45-64         205,745        777.6          1600           1101.5          2266
65+            98,742       4933.3          4871           4397.9          4343
Totals      1,000,000                       7886                           7706
Expected deaths for Group A for age group 65+ = (98,742)(4933.3)/100,000 = 4871
Age-adjusted rate for Group A = 100,000*(7886/1,000,000) = 788.6
Age-adjusted rate for Group B = 100,000*(7706/1,000,000) = 770.6
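The standardization above can be reproduced with a few lines of code. This is a sketch under the assumption that the standard population and age-specific rates are exactly those in the table; the function name is illustrative. Because the code does not round the expected deaths in each age group, its results can differ from the slide's figures in the last decimal.

```python
# Standard population and age-specific death rates per 100,000, by age group
# (0-4, 5-19, 20-44, 45-64, 65+), as read from the tables.
standard = [84_416, 294_353, 316_744, 205_745, 98_742]
rates_a = [405.0, 83.6, 261.0, 777.6, 4933.3]
rates_b = [375.3, 60.3, 190.5, 1101.5, 4397.9]


def age_adjusted_rate(standard_pop, age_specific_rates, per=100_000):
    """Direct standardization: expected deaths in the standard population,
    divided by the size of the standard population, per `per` persons."""
    expected = [pop * rate / per
                for pop, rate in zip(standard_pop, age_specific_rates)]
    return per * sum(expected) / sum(standard_pop)


adjusted_a = age_adjusted_rate(standard, rates_a)
adjusted_b = age_adjusted_rate(standard, rates_b)
print(f"Group A: {adjusted_a:.1f} per 100,000")
print(f"Group B: {adjusted_b:.1f} per 100,000")
```

After adjustment the two groups are close, even though the crude rates (396.3 vs. 1085.2) are very different, which is the point of standardizing.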
38. Relative Risk
• Relative risks are the ratio of risks for two different
populations (ratio=a/b).
$$\text{Relative Risk} = \frac{\text{disease incidence in group 1}}{\text{disease incidence in group 2}}$$
• If the risk (or proportion) of having the outcome is 1/10
in one population and 2/10 in a second population, then
the relative risk is: (2/10) / (1/10) = 2.0
• A relative risk >1 indicates increased risk for the group
in the numerator and a relative risk <1 indicates
decreased risk for the group in the numerator.
39. Odds Ratio and Relative Risk
• Odds ratios are better to use in case-
control studies (cases and controls are
selected and level of exposure is
determined retrospectively)
• Relative risks are better for cohort
studies (exposed and unexposed subjects
are chosen and are followed to determine
disease status - prospective)
40. Odds Ratio and Relative Risk
• When we have a two-way classification of
exposure and disease we can approximate the
relative risk by the odds ratio
                    Disease
                    Yes    No
Exposure    Yes     A      B      A+B
            No      C      D      C+D
• Relative Risk = A/(A+B) divided by C/(C+D)
• Odds Ratio = A/B divided by C/D = AD/BC
41. Relationship Between the Two
Measures
$$RR = \frac{A}{A+B} \div \frac{C}{C+D} = \frac{A(C+D)}{C(A+B)}$$
If the number of subjects classified as disease positive
is small compared to those classified as disease negative, then:
$$C + D \cong D, \qquad A + B \cong B$$
Therefore the relative risk can be approximated by:
$$RR \cong \frac{A \cdot D}{B \cdot C} = \frac{A/B}{C/D}$$
42. Case Control Study Example
• Disease: Pancreatic Cancer
• Exposure: Cigarette Smoking
Disease
Yes No
Exposure Yes 38 81 119
No 2 56 58
43. Example Continued
• Relative risk for exposed vs. non-exposed
– Numerator- proportion of exposed people that
have the disease
– Denominator-proportion of non-exposed that
have the disease
– Relative Risk= (38/119)/(2/58)=9.26
44. Example Continued
• Odds ratio for exposed vs. non-exposed
– Numerator: ratio of diseased vs.
non-diseased in the exposed group
– Denominator: ratio of diseased vs.
non-diseased in the non-exposed group
– Odds ratio = (38/81)/(2/56) = (38*56)/(2*81)
= 13.14
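Both measures can be computed from the cells A, B, C, D of the two-way table. The sketch below uses the pancreatic cancer counts (A = 38, B = 81, C = 2, D = 56); the function names are illustrative.

```python
def relative_risk(a, b, c, d):
    """RR = [A / (A + B)] / [C / (C + D)] for a 2x2 exposure-by-disease table."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR = (A / B) / (C / D) = AD / BC."""
    return (a * d) / (b * c)


# Exposed: 38 diseased, 81 not; non-exposed: 2 diseased, 56 not.
rr = relative_risk(38, 81, 2, 56)
odds = odds_ratio(38, 81, 2, 56)
print(f"RR = {rr:.2f}, OR = {odds:.2f}")
```

Here the disease is not rare in the exposed group (38 of 119), so the odds ratio is noticeably larger than the relative risk.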
45. Relative Risk
• Relative risk – the chance that a member of a group
receiving some exposure will develop a disease relative to
the chance that a member of an unexposed group will
develop the same disease.
$$RR = \frac{P(\text{disease} \mid \text{exposed})}{P(\text{disease} \mid \text{unexposed})}$$
• Recall: a RR of 1.0 indicates that the probabilities of
disease in the exposed and unexposed groups are
identical – an association between exposure and disease
does not exist.
46. Relative Risk
• When we have a two-way classification of
exposure and disease we can calculate
the relative risk
Disease
Yes No
Yes A B A+B
Exposure No C D C+D
47. Case Control Study Example
• Disease: Pancreatic Cancer
• Exposure: Cigarette Smoking
Disease
Yes No
Exposure Yes 38 81 119
No 2 56 58
48. Data Interpretation
• Consideration:
1. Accuracy
1. critical view of the data
2. investigating evidence of the results
3. consider other studies’ results
4. peripheral data analysis
5. conduct power analysis: type I & type II
                    H0 actually true   H0 actually false
H0 judged true      Correct            Type II
H0 judged false     Type I             Correct
49. Types of Errors
If you…                      When the null hypothesis is…             Then you have…
Reject the null hypothesis   True (there really is no difference)     Made a Type I error
Reject the null hypothesis   False (there really is a difference)     ☻
Accept the null hypothesis   False (there really is a difference)     Made a Type II error
Accept the null hypothesis   True (there really is no difference)     ☻
50. • α (alpha): the level of significance, i.e. the
probability of a Type I error
• β: the probability of a Type II error
• 1 – β: the probability of obtaining
significant results (power)
• Effect size: the magnitude of the
difference made by the intervention
51. 2. Meaning of the results
- translating the results and making them
understandable
3. Importance:
- translating the significant findings into
practical findings
4. Generalizability:
- how the findings can be made useful for the whole
population
5. Implication:
- what we have learned in relation to what was
used during the study
52. POWER--Uses and Misuses
• Sources
– Cohen, Statistical Power Analysis for the
Behavioral Sciences (gold standard for power)
– Kraemer & Thiemann, How Many Subjects?
(also a good review)
53. Needed Parameters
• Alpha--chance of a Type I error
• Beta--chance of a Type II error
• Power = 1 - beta
• Effect size--difference between groups or
amount of variance explained or how
much relationship there is between the DV
and the IVs
54. Remember this in English?
• Type I error is when you say there is a
difference or relationship and there is not
• Type II error is when you say there is no
difference or relationship and there really
is
55. What Affects Power?
• Size of the difference in means or amount
of variance explained (ES)
• alpha
• Unexplained variance
• N
56. Which is more important?
• Type I error more important if possibility of
harm or lethal effect
• Type II error more important in relatively
unexplored areas of research
• In some studies, Type I and Type II errors
may be equally important
57. How to Increase Power
1. Increase the n
2. Decrease the unexplained variance--control by design or statistics
(e.g. ANCOVA)
3. Increase alpha (controversial)
4. Use a one tailed test (directional hypothesis)--puts the zone of
rejection all in one tail; same effect as increasing alpha
5. Use parametric statistics as long as you meet the assumptions. If
not, parametric statistics are LESS powerful
6. Decrease measurement error (decrease unexplained variance)--use
more reliable instruments, standardize measurement protocol,
frequent calibration of physiologic instruments, improve inter-rater
reliability
58. What is good power?
By tradition, “good” power is 80%
The correct answer is it depends on the nature of
the phenomenon and which kind of error is most
important in your study. This is a theoretical
argument that you have to make.
Using convention (alpha = .05 and power = .80,
beta = .20) you are saying that Type I error is
_________ as serious as a Type II error
59. Effect Size
How large an effect do I expect exists in the
population if the null is false?
OR
How much of a difference do I want to be
able to detect?
The larger the effect, the fewer the cases
needed to see it. (The difference is so big
you can trip on it.)
60. The World According to Power
Kraemer & Thiemann
• The more stringent the significance level, the greater the
necessary sample size. More subjects are needed for a
1% level than a 5% level
• Two tailed tests require larger sample sizes than one
tailed tests. Assessing two directions at the same time
requires a greater investment.
• The smaller the effect size, the larger the necessary
sample size. Subtle effects require greater efforts.
• The larger the power required, the larger the necessary
sample size. Greater protection from failure requires
greater effort.
• The smaller the sample size, the smaller the power, ie
the greater the chance of failure
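These relationships can be illustrated numerically. The sketch below is not taken from Kraemer & Thiemann: it assumes the common normal-approximation formula for comparing two group means, n per group = 2((z_alpha + z_power) / d)^2, where d is a standardized effect size. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate subjects per group for a two-group comparison of means
    (normal approximation; effect_size is the standardized difference d)."""
    z = NormalDist()
    tail_area = alpha / 2 if two_tailed else alpha
    z_alpha = z.inv_cdf(1 - tail_area)   # critical value for the chosen alpha
    z_power = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


baseline = n_per_group(0.5)
print("baseline (d=0.5, alpha=.05, power=.80):", baseline)
print("stricter alpha (.01):", n_per_group(0.5, alpha=0.01))
print("one-tailed test:", n_per_group(0.5, two_tailed=False))
print("smaller effect (d=0.2):", n_per_group(0.2))
print("higher power (.90):", n_per_group(0.5, power=0.90))
```

Each variation moves the required n in the direction the bullets describe; exact values from dedicated power tables may differ slightly from this approximation.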
61. The World According to Power
Kraemer & Thiemann
• If one proposed to go with a sample size
of 20 or fewer, you have to be willing to
have a high risk of failure or a huge effect
size
• To achieve 99% power for an effect size of .01,
you need > 150,000 subjects
62. Test Yourself
Keeping the other parameters the same:
• As ES decreases, needed n ____
• As alpha decreases, needed n ____
• Higher power requires _____ n
63. Power for each test
• You do a power analysis for each statistic
you are going to use.
• Choose the sample size based on the
highest number of subjects from the power
analysis.
• Use the most conservative power
analysis--guarantees you the most
subjects
64. What about multiple time points?
• More time points requires fewer subjects
since more is known about the subjects
from prior time points as compared to a
cross sectional study
• In other words, less variance is
unexplained since you have baseline
information
• How many fewer? It depends
65. Power analysis and secondary
analysis
If you have a set sample size, your power
analysis then works backward. You set the
n, alpha and ES and determine the power
given the first three parameters.
66. Determining ES
If you want to determine effect size from a
completed study, you have the n, alpha
and power and can work backwards to
determine the ES.
Especially important in relatively unexplored
areas
67. Power and MR
• ES is the amount of explained variance
expected since there may not be group
differences, based on past research
• Increasing the number of independent
variables _______ sample size needed to
achieve adequate power.
68. Sampling Distribution
• A sample statistic is often unequal to the value of the
corresponding population parameter because of
sampling error.
• Sampling error reflects the tendency for statistics to
fluctuate from one sample to another.
• The amount of sampling error is the difference between
the obtained sample value and the population
parameter.
• Inferential statistics allow researchers to estimate how
close to the population value the calculated statistic is
likely to be.
• The concept of sampling distributions, which are actually probability
distributions, is central to estimates of sampling error.
69. Characteristics of Sampling
Distribution
• Sampling error= sample mean-population mean.
• Every sample size has a different sampling distribution of
the mean.
• Sampling distributions are theoretical, because in
practice, no one draws an infinite number of samples
from a population.
• Their characteristics can be modeled mathematically and
are determined by a formulation known as the central
limit theorem.
• This theorem stipulates that the mean of the sampling
distribution is identical to the population mean.
• The average sampling error, the mean of the (mean − μ)s,
would always equal zero.
70. Standard Error of the Mean
• The standard deviation of a sampling
distribution of the mean has a special
name: the standard error of the mean
(SEM).
• The smaller the SEM, the more accurate
are the sample means as estimates of the
population value.
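A small simulation can make the sampling distribution concrete. This is a sketch with hypothetical population values (μ = 100, σ = 15, samples of n = 25); it compares the standard deviation of the simulated sample means with σ/√n, the usual formula for the SEM, which is not stated on the slide.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_sample_means(mu, sigma, n, n_samples, seed=0):
    """Draw n_samples samples of size n from a normal population
    and return the mean of each sample."""
    rng = random.Random(seed)
    return [mean(rng.gauss(mu, sigma) for _ in range(n))
            for _ in range(n_samples)]


# Hypothetical population: mu = 100, sigma = 15; samples of size 25.
sample_means = simulate_sample_means(mu=100, sigma=15, n=25, n_samples=2000)
print("mean of sample means:", round(mean(sample_means), 2))
print("SD of sample means (empirical SEM):", round(stdev(sample_means), 2))
print("sigma / sqrt(n):", 15 / sqrt(25))
```

The mean of the sample means should land close to μ, and their standard deviation close to σ/√n = 3.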
71. • Estimation
• Hypothesis Testing
Both activities use sample statistics (for
example, X̅) to make inferences about a
population parameter (μ).
72. • Why don’t we just use a single number (a point
estimate) like, say, X̅ to estimate a population
parameter, μ?
• The problem with using a single point (or
value) is that it will very probably be wrong. In
fact, with a continuous random variable, the
probability that the variable is equal to a
particular value is zero. So, P(X̅ = μ) = 0.
• This is why we use an interval estimator.
• We can examine the probability that the
interval includes the population parameter.
73. Types of Statistical Inference
• Parameter estimation:
– It is used to estimate a population value, such as a
mean, relative risk index or a mean difference
between two groups.
– Estimation can take two forms:
• Point estimation: involves calculating a single statistic to
estimate the parameter. E.g. mean and median.
– Disadvantages: they offer no context for interpreting their
accuracy and a point estimate gives no information regarding
the probability that it is correct or close to the population value.
• Interval estimation: is to estimate a range of values that has
a high probability of containing the population value .
74. • How wide should the interval be? That depends
upon how much confidence you want in the
estimate.
• For instance, say you wanted a confidence interval
estimator for the mean income of a college
graduate:
You might have       That the mean income is between
100% confidence      $0 and $∞
95% confidence       $35,000 and $41,000
90% confidence       $36,000 and $40,000
80% confidence       $37,500 and $38,500
…                    …
0% confidence        $38,000 (a point estimate)
• The wider the interval, the greater the confidence
you will have in it as containing the true population
parameter μ.
75. Interval Estimation
• For example, it is more likely the population
height mean lies between 165-175cm.
• Interval estimation involves constructing a
confidence interval (CI) around the point
estimate.
• The upper and lower limits of the CI are called
confidence limits.
• A CI around a sample mean communicates a
range of values for the population value, and the
probability of being right. That is, the estimate is
made with a certain degree of confidence of
capturing the parameter.
76. Confidence Intervals around a
Mean
• 95% CI = mean ± (1.96 × SEM)
• This statement indicates that we can be 95% confident that the
population mean lies between the confidence limits, and that these
limits are equal to 1.96 times the true standard error, above and
below the sample mean.
• E.g. if the mean = 61 inches and SEM = 1, what is the 95% CI?
– Solution: 95% CI = 61 ± (1.96 × 1)
95% CI = 61 ± 1.96
95% CI = 59.04 < μ < 62.96
• E.g. if the mean = 61 inches and SEM = 1, what is the 99% CI?
– Solution: 99% CI = 61 ± (2.58 × 1)
99% CI = 61 ± 2.58
99% CI = 58.42 < μ < 63.58
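The two worked examples can be reproduced with the standard normal quantile instead of the tabled 1.96 and 2.58. A minimal sketch, with an illustrative function name:

```python
from statistics import NormalDist


def confidence_interval(sample_mean, sem, level=0.95):
    """Normal-based CI: mean ± z × SEM, with z taken from the standard normal."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return sample_mean - z * sem, sample_mean + z * sem


low95, high95 = confidence_interval(61, 1, level=0.95)
low99, high99 = confidence_interval(61, 1, level=0.99)
print(f"95% CI: {low95:.2f} < mu < {high95:.2f}")
print(f"99% CI: {low99:.2f} < mu < {high99:.2f}")
```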
77. Confidence Intervals and the t distribution
• When the sample size is small, we cannot base the
confidence interval around the mean on the normal distribution;
instead, we construct confidence intervals using the t-distribution.
• The t-distribution is similar to a normal distribution in
standard form.
• The exact shape of the t-distribution is influenced by the
number of cases in the sample.
• Statisticians have developed tables for the area under
the t-distribution for different sample sizes and probability
levels.
• To use this table, we must enter at the appropriate row
based on the number of degrees of freedom.
78. Confidence Intervals and the t distribution
• 95% CI = mean ± (t × SEM)
– Where mean = the sample mean
t = tabled t value at 95% CI for df = N-1
SEM = the calculated SEM for the sample data
• E.g. SEM = 1, mean = 61, N = 25, df = 25-1 = 24, t for the
95% CI with 24 df is 2.06
– Solution:
95% CI = 61 ± (2.06 × 1)
95% CI = 61 ± 2.06
95% CI = 58.94 < μ < 63.06
To compute CIs around a mean with SPSS:
Analyze------descriptive stat----explore then click on the statistics
pushbutton.
79. Types of Statistical Inference
• Hypothesis testing:
– Hypothesis testing is a second approach to inferential statistics.
– Hypothesis testing involves using sampling distributions and the
laws of probability to make an objective decision about whether
to accept or reject the null hypothesis.
– The sample may deviate from the defined population’s true
nature by certain amount.
– This deviation is called sampling error.
– Drawing the wrong conclusion is called an error of inference.
– There are two types of errors of inference defined in terms of the
null hypothesis:
• Type I error
• Type II error
80. • Testing a Claim: Companies often make claims
about products. For example, a frozen yogurt
company may claim that its product has no more
than 90 calories per cup. This claim is about a
parameter – i.e., the population mean number of
calories per cup (μ).
• The claim is tested by taking a sample - say, 100
cups - and determining the sample mean. If the
sample mean is 90 calories or less we have no
evidence that the company has lied. Even if the
sample mean is greater than 90 calories, it is
possible the company is still telling the truth
(sampling error). However, at some point –
perhaps, say, a sample average of 500 calories per
cup – it will be clear that the company has not been
completely truthful about its product.
81. • A hypothesis is made about the value of a
parameter, but the only facts available to estimate
the true parameter are those provided by the
sample. If the statistic differs (and of course it will)
from the hypothesis stated about the parameter, a
decision must be made as to whether or not this
difference is significant. If it is, the hypothesis is
rejected. If not, it cannot be rejected.
• H0: The null hypothesis. This contains the
hypothesized parameter value which will be
compared with the sample value.
• H1: The alternative hypothesis. This will be
“accepted” only if H0 is rejected.
Technically speaking, we never accept H0. What we actually say is that we do
not have the evidence to reject it.
82. • Two types of errors may occur: α (alpha) and β
(beta). The α error is often referred to as a Type
I error and β error as a Type II error.
– You are guilty of an alpha error if you reject H0 when it
really is true.
– You commit a beta error if you “accept” H0 when it is
false.
83. • This alpha error is related to the (1- α) we just
learned about when constructing confidence
intervals. We will soon see that an α error of .05
in testing a hypothesis (two-tail test) is
equivalent to a confidence of 95% in
constructing a two-sided interval estimator.
[Figure: standard normal curve with a rejection region of area α/2 in each tail, below -Zα/2 and above Zα/2.]
84. TRADEOFF!
•There is a tradeoff between the alpha and beta errors.
We cannot simply reduce both types of error. As one
goes down, the other rises.
•As we lower the α error, the β error goes up: reducing
the error of rejecting H0 (the error of rejection) increases
the error of “Accepting” H0 when it is false (the error of
acceptance).
•This is similar (in fact exactly the same) to the problem
we had earlier with confidence intervals. Ideally, we
would love a very narrow interval, with a lot of
confidence. But, practically, we can never have both:
there is a tradeoff.
85. • Our legal system understands this tradeoff very well.
– If we make it extremely difficult to convict criminals
because we do not want to incarcerate any innocent
people we will probably have a legal system in which
no one gets convicted.
– On the other hand, if we make it very easy to convict,
then we will have a legal system in which many
innocent people end up behind bars.
– This is why our legal system does not require a guilty
verdict to be “beyond a shadow of a doubt” (i.e.,
complete certainty) but “beyond reasonable doubt.”
86. • Quality Control.
– A company purchases chips for its smart phones, in
batches of 50,000. The company is willing to live with
a few defects per 50,000 chips. How many defects?
– If the firm randomly samples 100 chips from each
batch of 50,000 and rejects the entire shipment if there
are ANY defects, it may end up rejecting too many
shipments (error of rejection). If the firm is too liberal
in what it accepts and assumes everything is
“sampling error,” it is likely to make the error of
acceptance.
– This is why government and industry generally work
with an alpha error of .05
87. 1.Formulate H0 and H1. H0 is the null hypothesis, a hypothesis about the value
of a parameter, and H1 is an alternative hypothesis.
– e.g., H0: µ=12.7 years; H1: µ≠12.7 years
2.Specify the level of significance (α) to be used. This level of significance tells
you the probability of rejecting H0 when it is, in fact, true. (Normally,
significance level of 0.05 or 0.01 are used)
3.Select the test statistic: e.g., Z, t, F, etc. So far, we have been using the Z
distribution. We will be learning about the t-distribution (used for small
samples) later on.
4.Establish the critical value or values of the test statistic needed to reject H0.
DRAW A PICTURE!
5.Determine the actual value (computed value) of the test statistic.
6.Make a decision: Reject H0 or Do Not Reject H0.
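The six steps can be sketched for a Z test of a mean. The function name and the sample numbers below are hypothetical (they are not from the slides) and only illustrate the H0: µ = 12.7 years example; the critical value is taken from the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, s, n, alpha=0.05, tail="two"):
    """Return (z, critical value, reject H0?) for a Z test of a mean.

    tail: "two" for H1: mu != mu0, "left" for H1: mu < mu0,
          "right" for H1: mu > mu0.
    """
    z = (sample_mean - mu0) / (s / sqrt(n))       # step 5: computed value
    std_normal = NormalDist()
    if tail == "two":                             # step 4: critical value(s)
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    elif tail == "left":
        critical = std_normal.inv_cdf(alpha)
        reject = z < critical
    elif tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return z, critical, reject                    # step 6: decision


# Hypothetical sample: mean 13.1 years, s = 1.6, n = 64, tested at alpha = .05.
z, critical, reject = z_test(sample_mean=13.1, mu0=12.7, s=1.6, n=64)
print(f"Z = {z:.2f}, critical = ±{critical:.2f}, reject H0: {reject}")
```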
88. •When we Formulate H0 and H1, we have to decide
whether to use a one-tail or two-tail test.
•With a “two-tail” hypothesis test, α is split into two
and put in both tails. H1 then includes two
possibilities: μ > # OR μ < #. This is why the
region of rejection is divided into two tails. Note
that the region of rejection always corresponds to
H1.
• With a “one-tail” hypothesis test, the α is entirely
in one of the tails.
89. •For example, if the company claims that a certain
product has exactly 1 mg of aspirin, that would result in
a two-tail test. Note words like “exactly” suggest two tail
tests. There are problems with too much aspirin and too
little aspirin in a drug.
•On the other hand, if a firm claims that a box of its
raisin bran cereal contains at least 100 raisins, a one-tail
test has to be used. If the sample mean is more than
100, everything is ok. The problems arise only if the
sample mean is less than 100. The question will be
whether we are looking at sampling error or perhaps the
company is lying and the true (population) mean is less
than 100 raisins.
90. • A company claims that its soda vending machines deliver exactly 8 ounces of
soda. Clearly, you do not want the vending machines to deliver too much or
too little soda. How would you formulate this?
Answer:
H0: µ = 8 ounces
H1: µ ≠ 8 ounces
If you are testing at α = .01, the .01 is split into two: .005 in the left tail and
.005 in the right tail. The critical values are ±2.575.
91. •A company claims that its bolts have a circumference
of exactly 12.50 inches. (If the bolts are too wide or
narrow, they will not fit properly):
Answer:
H0: µ = 12.50 inches
H1: µ ≠ 12.50 inches
•A company claims that a slice of its bread has exactly 2
grams of fiber. Formulate this:
Answer:
H0: µ = 2 grams
H1: µ ≠ 2 grams
92. •A company claims that its batteries have an average life of at least 500
hours. How would you formulate this?
Answer:
H0: µ ≧ 500 hours
H1: µ < 500 hours
If you are testing at α = .05, the entire .05 is in the left tail (hint: H1 points to
where the rejection region should be). The critical value is -1.645.
93. A company claims that its overpriced, bottled spring water has no more than 1
mcg of benzene (poison). How would you formulate this:
Answer:
H0: µ ≦ 1 mcg. benzene
H1: µ > 1 mcg. benzene
If you are testing at α = .05, the entire .05 is in the right tail (hint: H1 points
to where the rejection region should be). The critical value is +1.645.
94. A pharmaceutical company claims that each of its pills contains exactly 20.00
milligrams of Coumadin (a blood thinner). You sample 64 pills and find that the
sample mean X̅ = 20.50 mg and s = .80 mg. Should the company’s claim be
rejected? Test at α = 0.05.
•Formulate the hypotheses
H0: µ =20.00 mg
H1: µ ≠ 20.00 mg
•Choose the test statistic and find the critical values; draw region of rejection
Test statistic: Z
At α = 0.05, the critical values are ±1.96.
•Use the data to get the calculated value of the test statistic
Z = (20.50 − 20.00) / (.80/√64) = .50/.10 = 5 [ .80/√64 = .10 is the standard error of the mean. ]
•Come to a Conclusion: Reject H0 or Do Not Reject H0
The computed Z value of 5 is deep in the region of rejection.
Thus, reject H0 at p < .05.
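The arithmetic in this example can be checked with a short Python sketch. It uses the sample figures from the slide and also computes a two-tail p-value, which the slide does not report; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample data from the slide.
n, sample_mean, s = 64, 20.50, 0.80
mu0 = 20.00                      # H0: mu = 20.00 mg

se = s / sqrt(n)                 # standard error of the mean
z = (sample_mean - mu0) / se     # computed test statistic

# Two-tail p-value: probability of a Z at least this extreme in either tail.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"standard error = {se:.2f}, Z = {z:.2f}, p-value = {p_value:.1e}")
```

The p-value is far below .05, which is what "deep in the region of rejection" means in probability terms.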
95. • Suppose we took the above data, ignored the hypothesis, and
constructed a 95% confidence interval estimator.
20.50 ± 1.96(.10)
95% CIE: 20.304 mg ←→ 20.696 mg
• We note that 20.00 mg is not in this interval.
• As you can see, hypothesis testing and CIE are virtually the
same exercise; they are merely two sides of the same coin.
Both rely on the sample evidence.
• If a claim is made about a parameter, do a hypothesis test. If no
claim is made and a company wants to use sample evidence to
estimate a parameter (perhaps to determine what claims may be
made in the future about a parameter), construct a confidence
interval estimator.
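To see the "two sides of the same coin" point numerically, here is a small Python sketch that builds the 95% confidence interval from the pill data and checks whether the claimed 20.00 mg falls inside it. It is an illustration with assumed variable names, not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

# Sample data from the pill example.
n, sample_mean, s = 64, 20.50, 0.80
claimed_mean = 20.00

se = s / sqrt(n)                        # standard error of the mean
z_crit = NormalDist().inv_cdf(0.975)    # about 1.96 for 95% confidence

lower = sample_mean - z_crit * se
upper = sample_mean + z_crit * se
contains_claim = lower <= claimed_mean <= upper

print(f"95% CIE: {lower:.3f} mg to {upper:.3f} mg")
print("Claimed mean inside interval:", contains_claim)
```

The claimed mean lies outside the interval, which matches the decision to reject H0 in the two-tail test at α = 0.05.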
96. • A company claims that its LED bulbs will last at least
8,000 hours. You sample 100 bulbs and find that X̅ = 7,800
hours and s = 800 hours. Should the
company’s claim be rejected? Test at α = 0.05.
• H0: µ ≧ 8,000 hours
H1: µ < 8,000 hours
The entire 5% is in the left tail; the critical value is -1.645.
• Z = (7,800 – 8,000) / (800/√100) = -200/80 = -2.50
• [800/√100 = 80, the standard error of the mean]
• The computed Z value of -2.50 is in the region of
rejection. Thus, reject H0 at p < .05
– Note: When testing a hypothesis, we often have to perform a one-tail test if the
claim requires it. However, we will always use only two-sided confidence interval
estimators when using sample statistics to estimate population parameters.
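The one-tail bulb test can be reproduced in a few lines of Python. This sketch uses the figures from the slide and adds the left-tail p-value, which the slide does not state; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample data from the LED bulb example.
n, sample_mean, s = 100, 7800, 800
mu0 = 8000                               # H0: mu >= 8,000 hours
alpha = 0.05

std_normal = NormalDist()
se = s / sqrt(n)                         # standard error of the mean
z = (sample_mean - mu0) / se             # computed test statistic

critical = std_normal.inv_cdf(alpha)     # left-tail critical value
p_value = std_normal.cdf(z)              # area to the left of the computed Z

reject = z < critical
print(f"Z = {z:.2f}, critical = {critical:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Because H1 points to the left, only the left tail is used for both the critical value and the p-value.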
97. • In estimating µ based on sample statistics,
how large a sample do we need for the
level of precision we want?
– To determine the sample size we need, we
must know the (1) desired precision and (2) σ.
Precision: e = Zα σ/√n, from the confidence interval estimator X̅ ± Zα σ/√n
• e, the half-width of the confidence interval
estimator, is the precision with which we
are estimating. e is also called sampling
error.
98. …continued
We use e to solve for n:
If e = Zσ/√n, then √n = Zσ/e,
and so n = Z²σ²/e²
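The formula n = Z²σ²/e² translates directly into code. The sketch below is a minimal illustration: the function name is made up, and the inputs (σ = 800 hours, desired precision e = 100 hours, 95% confidence) are hypothetical values chosen only to show the calculation. The result is rounded up, since a fractional observation cannot be sampled.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence=0.95):
    """Sample size needed to estimate mu within +/- e at the given confidence."""
    # Two-sided Z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = Z^2 * sigma^2 / e^2, rounded up to a whole observation.
    return ceil((z * sigma / e) ** 2)

# Hypothetical inputs, not taken from the slides.
n_needed = sample_size_for_mean(sigma=800, e=100)
print("Required sample size:", n_needed)
```

Because e is squared in the denominator, halving e roughly quadruples the required sample size.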
100. • Similarly, taking e (precision) from the formula
for the half-width of a confidence interval
estimator for P:
n = Z²P(1 − P)/e²
• Q: If we are trying to estimate the
population proportion, P, what do we use
for P in this formula?
101. Suppose a pollster wants a maximum error
of e = .01 with 95% confidence.
We assume that the variance is the highest
possible, so we use P = .5. This is the way
we ensure that the sampling error will be within
±.01 of the true population proportion.
Then, n = 1.96²(.5)(1 − .5)/.01² = 9,604
That is a VERY large sample.
102. …continued
Let’s try that again with e = .03.
n = 1.96²(.5)(1 − .5)/.03² ≈ 1,067
This is the sample size that most pollsters
work with.
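Both polling calculations can be reproduced with a short Python sketch. The function name is illustrative, and P = .5 is the worst-case assumption described on the previous slide. The function returns the unrounded value so that the rounding choice stays visible.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(e, p=0.5, confidence=0.95):
    """Unrounded sample size for estimating a proportion within +/- e."""
    # Two-sided Z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = Z^2 * P(1 - P) / e^2; P = 0.5 gives the largest possible variance.
    return z ** 2 * p * (1 - p) / e ** 2

n_01 = sample_size_for_proportion(0.01)
n_03 = sample_size_for_proportion(0.03)

print(f"e = .01: n = {n_01:.1f}")
print(f"e = .03: n = {n_03:.1f} (nearest: {round(n_03)}, rounded up: {ceil(n_03)})")
```

Rounded to the nearest whole number these give the 9,604 and 1,067 shown on the slides. Rounding up, which is the conservative convention, gives 1,068 for e = .03.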