This document provides an introduction to probability and statistics concepts over 2 weeks. It covers basic probability topics like sample spaces, events, probability definitions and axioms. Conditional probability and the multiplication rule for conditional probability are explained. Bayes' theorem relating prior, likelihood and posterior probabilities is introduced. Examples on probability calculations for coin tosses, dice rolls and medical testing are provided. Key terms around experimental units, populations, descriptive and inferential statistics are also defined.
2. Definitions
A variable is a characteristic that changes or varies over time and/or for different individuals or objects under consideration.
Experimental units are the items or objects on which measurements are taken.
A measurement results when a variable is actually measured on an experimental unit.
A population is the WHOLE set of all possible measurements.
A sample is a subset of the population.
5. Examples
• Hair color
– Variable = Hair color
– Experimental unit = Person
– Typical Measurements = Brown, black, blonde
6. Descriptive Statistics
• When we can enumerate the whole population, we use
• DESCRIPTIVE STATISTICS: Procedures used to summarize and describe the set of measurements.
7. Inferential Statistics
• When we cannot enumerate the whole population, we use
• INFERENTIAL STATISTICS: Procedures used to draw conclusions or inferences about the population from information contained in the sample.
8. Objective of Inferential Statistics
• To make inferences about a population
from information contained in a sample.
• The statistician’s job is to find the best
way to do this.
9. But, our conclusions could be incorrect…
consider this internet opinion poll…
Who makes the best burgers?             Votes   Percent
McDonalds                                 123       13%
Burger King                               384       39%
Wendy’s                                   304       31%
All three have equally good burgers        72        7%
None of these have good burgers            98       10%
We’ll PAY CASH For Your Opinions!
(as much as $50,000 ) Click Here and
sign up FREE!
• We need a measure of reliability.
10. The Steps in Inferential Statistics
• Define the objective of the experiment and
the population of interest
• Determine the design of the experiment
and the sampling plan to be used
• Collect and analyze the data
• Make inferences about the population from
information in the sample
• Determine the goodness or reliability of the
inference.
12. Mathematical Terms
Theorem
• A statement that has been proven to be true.
Axioms
• Assumptions (often unproven) defining the structures about which we are reasoning.
Rules of inference
• Patterns of logically valid deductions from hypotheses to conclusions.
Conjecture
• A statement whose truth value has not been proven.
(A conjecture may be widely believed to be true, regardless.)
Theory
• The set of all theorems that can be proven from a given set of axioms.
13. Basic Probability
• Random Experiments
- The results will vary from one
performance of the experiment to
the next even though most of the
conditions are the same
• Example
- If we toss a coin, the result of the experiment is that it will either come up “tails” or “heads”
- Unless you are “Harvey Dent”….
14. Basic Probability
• Sample Spaces
- Sample space: A set S that consists of all possible
outcomes of a random experiment
- Sample point: An outcome of a random experiment
in a sample space
• Example
- If we toss a die, one sample space, or set of all
possible outcomes, is given by {1, 2, 3, 4, 5, 6}
while another is {odd, even}.
- It is clear, however, that the latter would not be
adequate to determine, for example, whether an
outcome is divisible by 3.
15. Toss a coin twice. What is the sample space?
S={(H, H), (H, T), (T, H), (T,T)}
Toss a die twice. What is the sample space?
16. You have a box containing six cards. You select two cards. What is the
sample space if
(1) You do not return the card after the selection
(2) You return the card after the selection
(1)
(2)
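The sample spaces asked for on the last two slides can be listed by brute force. The sketch below is one possible reading, not the slide's own answer: it assumes outcomes are ordered pairs (first draw, second draw) and that the six cards are labelled 1 to 6, which the slide does not state.

```python
from itertools import permutations, product

# Two coin tosses: ordered pairs of H/T.
coin_twice = list(product("HT", repeat=2))

# Two die tosses: ordered pairs (first roll, second roll).
die_twice = list(product(range(1, 7), repeat=2))

# Six cards; the labels 1..6 are an assumption made for illustration.
cards = range(1, 7)
without_replacement = list(permutations(cards, 2))  # (1) card is not returned
with_replacement = list(product(cards, repeat=2))   # (2) card is returned

print(coin_twice)
print(len(die_twice), len(without_replacement), len(with_replacement))
```

If the order of the two selected cards is ignored, the counts change, so the answer depends on how an outcome is defined.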
17. Basic Probability
• Graphical representation of a sample space
- If we toss a coin twice and use 0 to represent tails and 1 to represent heads, the sample space can be portrayed by the points (0, 0), (0, 1), (1, 0), and (1, 1) in the plane
18. Basic Probability
• Event
- Event: A subset A of the sample space S (a set of
possible outcomes)
- Elementary event: An event consisting of a single
point of S
- Sure (certain) event: S itself
- Impossible event: An empty set
20. Basic Probability
• The Concept of Probability
- CLASSICAL APPROACH: If an event can occur in h
different ways out of a total number of n possible
ways, all of which are equally likely, then the
probability of the event is h/n
- FREQUENCY APPROACH (Empirical probability): If
after n repetitions of an experiment, where n is very
large, an event is observed to occur in h of these,
then the probability of the event is h/n
- AXIOMATIC APPROACH: Since both the classical
and frequency approaches have serious drawbacks,
mathematicians have been led to an axiomatic
approach to probability
21. Basic Probability
Tossing a coin
No. of Exp. H T H ratio (%) T ratio (%)
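The frequency approach can be illustrated with a small simulation. This is a sketch that assumes a fair coin and uses Python's pseudo-random generator in place of real tosses; the function name `heads_ratio` and the seed are illustrative choices, and the numbers it prints are not the data from the original table.

```python
import random

def heads_ratio(n_tosses, seed=0):
    """Simulate n_tosses of a fair coin and return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

for n in (10, 100, 1_000, 10_000, 100_000):
    h = heads_ratio(n)
    print(f"{n:>7} tosses: H ratio = {100 * h:.1f}%, T ratio = {100 * (1 - h):.1f}%")
```

As the number of tosses grows, the H ratio tends to settle near 50%, which is the idea behind empirical probability.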
22. Basic Probability
• How to mathematically express probability?
With a sample space S and an event A:
• P: probability function
• P(A): probability of the event A
23. Basic Probability
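• AXIOMATIC APPROACH
- Axiom 1: For every event A, P(A) ≥ 0
- Axiom 2: For the sure event S, P(S) = 1
- Axiom 3: For mutually exclusive events A1, A2, …, P(A1 ∪ A2 ∪ …) = P(A1) + P(A2) + …
If these three things are true, P(A) is defined as the probability for event A.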
24. Basic Probability
Example
• Exclusive?
[Figure: Venn diagram of the sample space S = {(H,H), (H,T), (T,H), (T,T)} with events A, B, and C, marking which events are mutually exclusive]
25. Independent ≠ mutually exclusive
• Events A and ~A are mutually exclusive, but they are NOT independent.
• P(A&~A) = 0
• P(A)*P(~A) ≠ 0 (whenever 0 < P(A) < 1)
Conceptually, once A has happened, ~A is impossible; thus, they are completely dependent.
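The distinction can be checked by enumeration on a single die roll. The events below are illustrative assumptions, not taken from the slides: A = "even number", and B = {1, 2}, chosen because it happens to be independent of A.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(S))

A = {2, 4, 6}   # even number
not_A = S - A   # complement of A
B = {1, 2}      # hypothetical event used only for this illustration

# Mutually exclusive but dependent: the joint probability is 0, the product is not.
print(prob(A & not_A), prob(A) * prob(not_A))

# Independent but not mutually exclusive: the joint probability equals the product.
print(prob(A & B), prob(A) * prob(B))
```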
28. Basic Probability
• Assignment of Probabilities
- Assignment: if we assume equal probabilities for all simple events (A1, A2, … , An): P(Ak) = 1/n, k = 1, 2, … , n
- and if A is any event made up of h such simple events, we have P(A) = h/n
- which is equivalent to the classical approach to probability.
29. Basic Probability
• Fundamental Principle of Counting
- Combinatorial Analysis: When the number of
sample points in a sample space is very large,
direct counting becomes a practical impossibility. In
this case, a combinatorial analysis is required.
- Principle of Counting: If one thing can be
accomplished in n1 different ways and after this a
second thing can be accomplished in n2 different
ways, . . . , and finally a kth thing can be
accomplished in nk different ways, then all k things
can be accomplished in the specified order in
n1n2…nk different ways.
31. Using a probability tree
Mendel example: What’s the chance of having a heterozygote
child (Dd) if both parents are heterozygote (Dd)?
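Mother’s allele    Father’s allele    Child’s outcome
P(♀D) = .5         P(♂D) = .5         P(DD) = .5*.5 = .25
                   P(♂d) = .5         P(Dd) = .5*.5 = .25
P(♀d) = .5         P(♂D) = .5         P(dD) = .5*.5 = .25
                   P(♂d) = .5         P(dd) = .5*.5 = .25
                                      Total = 1.0
P(heterozygote child) = P(Dd) + P(dD) = .25 + .25 = .5
Rule of thumb: in probability, “and” means multiply (for independent events), “or” means add (for mutually exclusive events)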
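The probability tree for the Mendel example can be reproduced by enumerating both parents' alleles. This is a minimal sketch that assumes each parent passes on D or d with probability 1/2, independently of the other parent; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)
mother = {"D": half, "d": half}  # allele passed on by the mother
father = {"D": half, "d": half}  # allele passed on by the father

# "and" -> multiply along each branch of the tree
outcomes = {}
for (m, pm), (f, pf) in product(mother.items(), father.items()):
    outcomes[m + f] = pm * pf

# "or" -> add the two mutually exclusive heterozygote branches
p_heterozygote = outcomes["Dd"] + outcomes["dD"]

print(outcomes)
print(p_heterozygote)
```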
32. Basic Probability
• Permutations
- Suppose that we are given n distinct objects and wish to arrange r of these objects in a line. Since there are n ways of choosing the 1st object, and after this is done, n - 1 ways of choosing the 2nd object, . . . , and finally n - r + 1 ways of choosing the rth object: nPr = n(n - 1)(n - 2) … (n - r + 1) = n!/(n - r)!
- nPr : The number of permutations of n objects taken r at a time.
33. Basic Probability
• Permutations
- The number of different permutations of a set consisting of n objects of which n1 are of one type (i.e., indistinguishable from each other), n2 are of a second type, . . . , nk are of a kth type (with n1 + n2 + … + nk = n): n!/(n1! n2! … nk!)
• Example.
- The number of different arrangements, or permutations, consisting of 3 letters each that can be formed from the 7 letters A, B, C, D, E, F, G is 7P3 = 7!/4! = 7 · 6 · 5 = 210
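Both permutation counts can be checked in a few lines. The sketch below uses `math.perm` (Python 3.8+) and brute-force enumeration for the 7-letter example; `multiset_permutations` is a hypothetical helper name for the count with indistinguishable objects, and the word MISSISSIPPI is an illustrative input that does not come from the slides.

```python
from itertools import permutations
from math import factorial, perm

letters = "ABCDEFG"

# 3-letter arrangements of 7 distinct letters: 7!/(7-3)!
n_arrangements = perm(len(letters), 3)
assert n_arrangements == len(list(permutations(letters, 3)))  # brute-force check

def multiset_permutations(*counts):
    """Arrangements of n = sum(counts) objects with counts[i] alike of type i."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)  # every intermediate quotient is an exact integer
    return total

print(n_arrangements)
# Illustrative input: MISSISSIPPI has 1 M, 4 I, 4 S, 2 P.
print(multiset_permutations(1, 4, 4, 2))
```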
34. Basic Probability
• Combinations
- A combination is a selection of objects in which the order does not matter.
- nCr : The total number of combinations of r objects selected from n (also called the combinations of n things taken r at a time): nCr = n!/(r!(n - r)!) = nPr/r!
35. Basic Probability
• Combinations
- Binomial Coefficient: the numbers nCr are often called binomial coefficients because they arise in the binomial expansion (x + y)^n = x^n + nC1 x^(n-1) y + nC2 x^(n-2) y^2 + … + y^n
• Example
- The number of ways in which 3 cards can be chosen or selected from a total of 8 different cards is: 8C3 = (8 · 7 · 6)/3! = 56
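The card example and the link to the binomial expansion can be verified numerically. This sketch uses `math.comb` (Python 3.8+); the exponent 4 and the values x = 2, y = 3 are arbitrary choices for the check.

```python
from itertools import combinations
from math import comb

# Ways to choose 3 cards from 8 different cards, ignoring order.
n_selections = comb(8, 3)
assert n_selections == len(list(combinations(range(8), 3)))  # brute-force check

# Binomial coefficients are the coefficients of (x + y)^n; here n = 4.
n = 4
coefficients = [comb(n, r) for r in range(n + 1)]

# Spot-check the expansion at arbitrary integer values.
x, y = 2, 3
expansion = sum(comb(n, r) * x ** (n - r) * y ** r for r in range(n + 1))

print(n_selections, coefficients, expansion == (x + y) ** n)
```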
36. Basic Probability
• Stirling’s Approximation to n!
- An approximation formula for n!: n! ~ √(2πn) n^n e^(-n) when n is large
- James Stirling (1692 ~ 1770) was a Scottish mathematician. The Stirling numbers and Stirling's approximation are named after him.
- Example
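A quick numerical comparison shows how Stirling's formula, n! ≈ √(2πn) (n/e)^n, behaves as n grows. This is an illustrative sketch, not the example from the original slide; the values of n are arbitrary.

```python
import math

def stirling(n):
    """Stirling's approximation to n!: sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

for n in (5, 10, 20, 50):
    exact = math.factorial(n)
    approx = stirling(n)
    # The ratio approaches 1 as n grows, although the absolute error does not shrink.
    print(f"n = {n:>2}: exact = {exact:.6e}, Stirling = {approx:.6e}, ratio = {approx / exact:.5f}")
```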
38. Conditional Probability
• Example
- Find the probability that a single toss of a die will
result in a number less than 4 if (a) no other
information is given and (b) it is given that the toss
resulted in an odd number.
The added knowledge that the toss results in an odd number raises the
probability from 1/2 to 2/3
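The two answers can be reproduced by counting outcomes and applying P(A | B) = P(A ∩ B)/P(B). This is a minimal sketch that assumes a fair six-sided die; the helper names `prob` and `cond_prob` are illustrative.

```python
from fractions import Fraction

S = set(range(1, 7))                  # one toss of a fair die
less_than_4 = {x for x in S if x < 4}
odd = {x for x in S if x % 2 == 1}

def prob(event):
    """Classical probability on the equally likely sample space S."""
    return Fraction(len(event), len(S))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b), defined only when P(b) > 0."""
    return prob(a & b) / prob(b)

p_no_information = prob(less_than_4)        # part (a)
p_given_odd = cond_prob(less_than_4, odd)   # part (b)
print(p_no_information, p_given_odd)
```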
40. Conditional Probability
• Theorems
- Theorem 1: For any three events A1, A2, A3, we have P(A1 ∩ A2 ∩ A3) = P(A1) P(A2 | A1) P(A3 | A1 ∩ A2)
- In words, the probability that A1 and A2 and A3 all occur is equal to the probability that A1 occurs times the probability that A2 occurs given that A1 has occurred times the probability that A3 occurs given that both A1 and A2 have occurred.
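As an illustration of the multiplication rule for three events, consider a hypothetical example that is not in the slides: drawing three cards from a standard 52-card deck without replacement, with Ak = "the kth card is an ace". The sketch multiplies the conditional probabilities and cross-checks the result by counting.

```python
from fractions import Fraction
from math import comb

# Hypothetical example: three aces in a row, drawn without replacement.
p_a1 = Fraction(4, 52)            # P(A1): 4 aces among 52 cards
p_a2_given_a1 = Fraction(3, 51)   # P(A2 | A1): 3 aces left among 51 cards
p_a3_given_a1_a2 = Fraction(2, 50)  # P(A3 | A1 and A2): 2 aces left among 50 cards

p_all_aces = p_a1 * p_a2_given_a1 * p_a3_given_a1_a2

# Cross-check by counting: choose 3 of the 4 aces out of all 3-card selections.
p_by_counting = Fraction(comb(4, 3), comb(52, 3))

print(p_all_aces, p_by_counting)
```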
42. Conditional Probability
• Additional theorems
For events A and B and C (P(C) > 0)
(1) P(∅ | C) = 0
(2) A, B : mutually exclusive ⇒ P(A∪B | C) = P(A | C) + P(B | C)
(3) P(Aᶜ | C) = 1 - P(A | C)
(4) A ⊂ B ⇒ P(B-A | C) = P(B | C) - P(A | C), P(A | C) ≤ P(B | C)
(5) P(A∪B | C) = P(A | C) + P(B | C) - P(A ∩ B | C)
(6) P(A∪B | C) ≤ P(A | C) + P(B | C)
43. Conditional Probability
• Independent Event:
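- Independent Event: If P(B|A) = P(B), i.e., the probability of B occurring is not affected by the occurrence or non-occurrence of A, then A and B are independent and P(A ∩ B) = P(A) P(B)
- For three events A1, A2, A3 to be independent, they must be pairwise independent, P(Aj ∩ Ak) = P(Aj) P(Ak) for j ≠ k, and also satisfy P(A1 ∩ A2 ∩ A3) = P(A1) P(A2) P(A3)
- Conditional probability of independent events: P(A | B) = P(A) and P(B | A) = P(B)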
44. Conditional Probability
• Prior and Posterior Probability
- Prior Probability : The probability of an event before
the result is known
- Posterior Probability : The probability of an event
after the result is known
- The posterior probability may be larger or smaller than the prior probability, depending on the result observed.
- Computation of posterior probability in more than
one stage
45. Practice problem
If HIV has a prevalence of 3% in San
Francisco, and a particular HIV test has
a false positive rate of .001 and a false
negative rate of .01, what is the
probability that a random person
selected off the street will test positive?
46. Answer
Marginal probability of carrying the virus: P(+) = .03
Conditional probability (the probability of testing + given that a person is +): P(test + | +) = .99
Joint probability of being + and testing +: P(+, test +) = .0297

True status    Test result              Joint probability
P(+) = .03     P(test + | +) = .99      P(+, test +) = .0297
               P(test - | +) = .01      P(+, test -) = .0003
P(-) = .97     P(test + | -) = .001     P(-, test +) = .00097
               P(test - | -) = .999     P(-, test -) = .96903
                                        Total = 1.0
Marginal probability of testing positive:
∴ P(test +) = .0297 + .00097 = .03067
P(+ & test +) ≠ P(+)*P(test +)
.0297 ≠ .03*.03067 (= .00092)
∴ Dependent!
47. Law of total probability
P(test +) = P(test + | HIV+) P(HIV+) + P(test + | HIV-) P(HIV-)
HIV+ and HIV- are mutually exclusive and collectively exhaustive: one of them has to be true, so P(HIV+) and P(HIV-) sum to 1.0.
P(test +) = .99(.03) + .001(.97) = .03067
48. Law of total probability
• Formal Rule: Marginal probability for event A =
P(A) = P(A | B1) P(B1) + P(A | B2) P(B2) + … + P(A | Bk) P(Bk)
Where: P(B1) + P(B2) + … + P(Bk) = 1.0 and P(Bi & Bj) = 0 for i ≠ j (mutually exclusive)
[Figure: Venn diagram of event A within the partition B1, B2, B3]
P(A) = (50%)(25%) + (0)(50%) + (50%)(25%) = 25%
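The formal rule translates directly into a weighted sum. The sketch below is one way to write it; the function name `total_probability` and the check that the priors sum to 1 are choices made for illustration. It is applied to the HIV test numbers and to the three-part example above.

```python
def total_probability(priors, likelihoods):
    """P(A) = sum over i of P(A | Bi) * P(Bi), for a partition B1..Bk.

    priors      -- P(Bi) for each part of the partition (must sum to 1)
    likelihoods -- P(A | Bi) in the same order
    """
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("the P(Bi) must sum to 1 (collectively exhaustive)")
    return sum(p * l for p, l in zip(priors, likelihoods))

# HIV example: B1 = HIV+, B2 = HIV-, A = "test +"
p_test_positive = total_probability([0.03, 0.97], [0.99, 0.001])

# Three-part example: P(B) = 25%, 50%, 25% and P(A | B) = 50%, 0, 50%
p_a = total_probability([0.25, 0.50, 0.25], [0.5, 0.0, 0.5])

print(round(p_test_positive, 5), p_a)
```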
49. Example 2
• A 54-year old woman has an abnormal
mammogram; what is the chance that she
has breast cancer?
50. Example: Mammography
True status      Test result                            Joint probability
P(BC+) = .003    P(test + | BC+) = .90 (sensitivity)    P(+, test +) = .0027
                 P(test - | BC+) = .10                  P(+, test -) = .0003
P(BC-) = .997    P(test + | BC-) = .11                  P(-, test +) = .10967
                 P(test - | BC-) = .89 (specificity)    P(-, test -) = .88733
                                                        Total = 1.0
P(BC+) and P(BC-) are the marginal probabilities of breast cancer (prevalence among all 54-year olds).
P(BC | test +) = .0027/(.0027 + .10967) = 2.4%
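The mammography calculation depends on only three inputs: prevalence, sensitivity, and specificity. A minimal sketch follows; the function name `positive_predictive_value` is an illustrative choice, and the inputs are the values from the tree above.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(disease | test +) for a binary test."""
    true_positive = sensitivity * prevalence               # P(disease, test +)
    false_positive = (1 - specificity) * (1 - prevalence)  # P(no disease, test +)
    return true_positive / (true_positive + false_positive)

# Values from the mammography tree: prevalence .003, sensitivity .90, specificity .89
ppv = positive_predictive_value(prevalence=0.003, sensitivity=0.90, specificity=0.89)
print(f"{100 * ppv:.1f}%")
```

Even with a fairly sensitive test, the low prevalence keeps the posterior probability small, because false positives from the large healthy group outnumber true positives.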
51. Conditional Probability
• Bayes’ Theorem (theorem on the probability of
causes):
- Suppose that A1, A2, … , An are mutually exclusive events whose union is the sample space S, i.e., one of the events must occur. Then if A is any event: P(Ak | A) = P(Ak) P(A | Ak) / [P(A1) P(A | A1) + P(A2) P(A | A2) + … + P(An) P(A | An)]
52. Bayes’ Rule: derivation
• Definition:
Let A and B be two events with P(B) ≠ 0.
The conditional probability of A given B is:
P(A | B) = P(A & B) / P(B)
The idea: if we are given that the event B occurred, the relevant sample space is reduced to B {P(B)=1 because we know B is true} and conditional probability becomes a probability measure on B.
53. Bayes’ Rule: derivation
P(A | B) = P(A & B) / P(B)
can be re-arranged to:
P(A & B) = P(A | B) P(B)
and, since also:
P(B | A) = P(A & B) / P(A)   ∴ P(A & B) = P(B | A) P(A)
P(A | B) P(B) = P(A & B) = P(B | A) P(A)
P(A | B) P(B) = P(B | A) P(A)
∴ P(A | B) = P(B | A) P(A) / P(B)
54. Bayes’ Rule:
P(A | B) = P(B | A) P(A) / P(B)
OR
P(A | B) = P(B | A) P(A) / [P(B | A) P(A) + P(B | ~A) P(~A)]
(the denominator comes from the “Law of Total Probability”)
55. Bayes’ Rule:
• Why do we care??
• Why is Bayes’ Rule useful??
• It turns out that sometimes it is very useful
to be able to “flip” conditional probabilities.
That is, we may know the probability of A
given B, but the probability of B given A
may not be obvious. An example will
help…
56. Conditional Probability
• Example
- Three different machines (M1, M2, M3) were used
for producing a large batch of similar items (M1 –
20%, M2 – 30%, M3 – 50%)
- (a) Suppose that 1 % from M1 are defective, 2%
from M2 are defective, 3% from M3 are defective.
- (b) You picked one, which was found to be defective
- Question: Probability that this item was produced by
machine M2.
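The machine example is a direct application of Bayes' theorem with three mutually exclusive causes. A sketch of the computation with exact fractions follows; the dictionary names are illustrative.

```python
from fractions import Fraction

# Share of the batch produced by each machine: P(Mi)
priors = {"M1": Fraction(20, 100), "M2": Fraction(30, 100), "M3": Fraction(50, 100)}
# Defect rate of each machine: P(defective | Mi)
defect_rates = {"M1": Fraction(1, 100), "M2": Fraction(2, 100), "M3": Fraction(3, 100)}

# Law of total probability: P(defective)
p_defective = sum(priors[m] * defect_rates[m] for m in priors)

# Bayes' theorem: P(Mi | defective) for every machine
posterior = {m: priors[m] * defect_rates[m] / p_defective for m in priors}

print(p_defective)
print(posterior["M2"], float(posterior["M2"]))
```

Under these numbers the defective item came from M2 with probability 6/23, or roughly 26%.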
57. In-Class Exercise
• If HIV has a prevalence of 3% in San Francisco,
and a particular HIV test has a false positive rate
of .001 and a false negative rate of .01, what is
the probability that a random person who tests
positive is actually infected (also known as
“positive predictive value”)?
58. Answer: using probability tree
True status    Test result              Joint probability
P(+) = .03     P(test + | +) = .99      P(+, test +) = .0297
               P(test - | +) = .01      P(+, test -) = .0003
P(-) = .97     P(test + | -) = .001     P(-, test +) = .00097
               P(test - | -) = .999     P(-, test -) = .96903
                                        Total = 1.0
A positive test places one on either of the two “test +” branches.
But only the top branch also fulfills the event “true infection.”
Therefore, the probability of being infected is the probability of being on the top branch given that you are on one of the two “test +” branches.
P(+ | test +) = P(test + & true +) / P(test +) = .0297 / (.0297 + .00097) = 96.8%
59. Answer: using Bayes’ rule
P(true + | test +) = P(test + | true +) P(true +) / [P(test + | true +) P(true +) + P(test + | true -) P(true -)]
= .99(.03) / [.99(.03) + .001(.97)] = 96.8%
60. Practice problem
An insurance company believes that drivers can be
divided into two classes—those that are of high risk
and those that are of low risk. Their statistics show
that a high-risk driver will have an accident at some
time within a year with probability .4, but this probability
is only .1 for low risk drivers.
a) Assuming that 20% of the drivers are high-risk, what is the
probability that a new policy holder will have an accident within
a year of purchasing a policy?
b) If a new policy holder has an accident within a year of
purchasing a policy, what is the probability that he is a high-
risk type driver?
61. Answer to (a)
Assuming that 20% of the drivers are of high-risk, what is
the probability that a new policy holder will have an
accident within a year of purchasing a policy?
Use law of total probability:
P(accident)=
P(accident/high risk)*P(high risk) +
P(accident/low risk)*P(low risk) =
.40(.20) + .10(.80) = .08 + .08 = .16
62. Answer to (b)
If a new policy holder has an accident within a year of
purchasing a policy, what is the probability that he is a high-risk
type driver?
P(high risk | accident) =
P(accident | high risk)*P(high risk)/P(accident)
= .40(.20)/.16 = 50%
Or use tree:
Risk class            Accident within a year        Joint probability
P(high risk) = .20    P(accident | HR) = .4         P(accident, high risk) = .08
                      P(no accident | HR) = .6      P(no accident, high risk) = .12
P(low risk) = .80     P(accident | LR) = .1         P(accident, low risk) = .08
                      P(no accident | LR) = .9      P(no accident, low risk) = .72
                                                    Total = 1.0
P(high risk | accident) = .08/.16 = 50%
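Both answers can also be sanity-checked by simulation, which ties the calculation back to the frequency approach. This is a rough Monte Carlo sketch; the function name `simulate`, the number of simulated drivers, and the seed are arbitrary choices, and the estimates only approximate .16 and 50%.

```python
import random

def simulate(n_drivers=200_000, seed=1):
    """Estimate P(accident) and P(high risk | accident) by simulation."""
    rng = random.Random(seed)
    accidents = 0
    high_risk_accidents = 0
    for _ in range(n_drivers):
        high_risk = rng.random() < 0.20          # 20% of drivers are high risk
        p_accident = 0.4 if high_risk else 0.1   # yearly accident probability
        if rng.random() < p_accident:
            accidents += 1
            high_risk_accidents += high_risk     # True counts as 1
    return accidents / n_drivers, high_risk_accidents / accidents

p_accident, p_high_risk_given_accident = simulate()
print(f"P(accident) ~ {p_accident:.3f}")
print(f"P(high risk | accident) ~ {p_high_risk_given_accident:.3f}")
```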