SlideShare una empresa de Scribd logo
1 de 12
Descargar para leer sin conexión
The Logit Model:
Estimation, Testing and Interpretation
Herman J. Bierens
October 25, 2008

1
1.1

Introduction to maximum likelihood estimation
The likelihood function

Consider a random sample Y1 , ..., Yn from the Bernoulli distribution:
Pr[Yj = 1] = p0
Pr[Yj = 0] = 1 − p0 ,
where p0 is unknown. For example, toss n times a coin for which you suspect
that it is unfair: p0 6= 0.5, and for each tossing j assign Yj = 1 if the outcome
is heads and Yj = 0 if the outcome is tails. The question is how to estimate
p0 and how to test the null hypothesis that the coin is fair: p0 = 0.5.
The probability function involved can be written as
f(y|p0 ) = Pr[Yj = y]
=

py
0

1−y

(1 − p0 )

=

(

p0
if y = 1,
1 − p0 if y = 0.

Next, let y1 , ..., yn be a given sequence of zeros and ones. Thus, each yj is either 0 or 1. The joint probability function of the random sample Y1 , Y2 , ..., Yn
is defined as
fn (y1 , ..., yn |p0 ) = Pr[Y1 = y1 and Y2 = y2 ..... and Yn = yn ].
1
Because the random variables Y1 , Y2 , ..., Yn are independent, we can write
Pr[Y1 = y1 and Y2 = y2 ..... and Yn = yn ]
= Pr[Y1 = y1 ] × Pr[Y2 = y2 ] × ... × Pr[Yn = yn ]
= f (y1 |p0 ) × f (y2 |p0 ) × ... × f (yn |p0 )
=

n
Y

j=1

f(yj |p0 ),

hence
fn (y1 , ..., yn |p0 ) =
=

n
Y

j=1

⎛

y

p0j (1 − p0 )1−yj
⎞⎛

n
Y

n
yj ⎠ ⎝ Y
⎝
p0
(1
j=1
j=1

Pn

= p0

j=1

yj

⎞

1−yj ⎠

− p0 )
Pn

(1 − p0 )n−

j=1

yj

.

Replacing the given non-random sequence y1 , ..., yn by the random sample
Y1 , Y2 , ..., Yn and the unknown probability p0 by a variable p in the interval
(0, 1) yields the likelihood function
Pn

Ln (p) = fn (Y1 , ..., Yn |p) = p

j=1

Yj

Pn

(1 − p)n−

j=1

Yj

For the case p = p0 the likelihood function can be interpreted as the joint
probability that we draw a particular sample Y1 , ..., Yn .

1.2

Maximum likelihood estimation

The idea of maximum likelihood (ML) estimation is now to choose p such
that Ln (p) is maximal. In other words, choose p such that the probability of
drawing this particular sample Y1 , ..., Yn is maximal.
Note that maximizing Ln (p) is equivalent to maximizing ln (Ln (p)) , i.e.,
⎛

ln (Ln (p)) = ⎝

n
X

j=1

³

⎞

⎛

Yj ⎠ ln(p) + ⎝n −

n
X

j=1

⎞

Yj ⎠ ln(1 − p)
´

= n Y ln(p) + (1 − Y ) ln(1 − p) ,
2
where

n
1X
Y =
Yj
n j=1

b
is the sample mean. Therefore, the ML estimator p in this case can be
obtained from the first-order condition for a maximum of ln (Ln (p)) in p =
b
p:
Ã

b
b
b
d ln (Ln (p))
d ln(p)
d ln(1 − p)
=n Y
+ (1 − Y )
0 =
b
b
b
dp
dp
dp
Ã
!
b
b
b
d ln(p)
d ln(1 − p) d(1 − p)
= n Y
+ (1 − Y )
×
b
b
b
dp
d(1 − p)
dp
Ã
!
1
1
× (−1)
= n Y + (1 − Y )
b
b
p
1−p

= n

Ã
Ã

Y
1−Y
−
b
b
p
1−p

b
Y −p
= n
b
b
p (1 − p)

!

!

⎛

= n⎝

³

b
b
Y (1 − p) − p 1 − Y
b
b
p (1 − p)

!

´⎞
⎠

where we have used the fact that d ln(x)/dx = 1/x. Thus, in this case the
b
ML estimator p of p0 is the sample mean:
b
p =Y.

b
Note that this is an unbiased estimator: E (p) =

1.3

1
n

Pn

j=1

E (Yj ) = p0 .

Large sample statistical inference

It can be shown (but this√
requires advanced probability theory) that if the
b
sample size n is large then n (p − p0 ) is approximately normally distributed,
i.e.,
n
√
1 X
2
b
n (p − p0 ) = √
(Yj − p0 ) ∼ N[0, σ0 ],
n j=1
where

h

2
σ0 = var(Yj ) = E (Yj − p0 )2

i

= (1 − p0 )2 p0 + (−p0 )2 (1 − p0 )
= p0 (1 − p0 ).
3
Thus, for large sample size n,
√
b
n (p − p0 )
q

p0 (1 − p0 )

∼ N[0, 1].

(1)

This result can be used to test hypotheses about p0 . In particular, under
the null hypothesis that the coin is fair, p0 = 0.5, we have
√
√
b
n (p − 0.5)
b
2 n (p − 0.5) = √
∼ N[0, 1],
0.5 × 0.5
√
b
Therefore, 2 n (p − 0.5) can be used as the test statistic of the standard
normal test of the null hypothesis p0 = 1/2, as follows. Recall that for a
standard normal random variable U , Pr [|U | > 1.96] = 0.05. Thus, under the
null hypothesis p0 = 1/2 one would expect that
¯
h¯ √
i
¯
¯
b
Pr ¯2 n (p − 0.5)¯ > 1.96 = 0.05
¯
h¯ √
i
¯
¯
b
Pr ¯2 n (p − 0.5)¯ ≤ 1.96 = 0.95
√
b
If |2 n (p − 0.5)| > 1.96 then we reject the null hypothesis p0 = 1/2 at
the 5% significance level, because√
this is not what one would expect if the
b
null hypothesis is true, and if |2 n (p − 0.5)| ≤ 1.96 then we accept this
null hypothesis, as this result is then in accordance with the null hypothesis
p0 = 1/2.
The result (1) can also be used to endow the unknown probability p0 with
a confidence interval, for example the 95% confidence interval, as follows. The
result (1) implies
¯
⎡¯ √
⎤
¯
¯
¯ n (p − p0 ) ¯
b
¯ ≤ 1.96⎦ = 0.95,
Pr ⎣¯ q
¯
¯
¯ p0 (1 − p0 ) ¯
which, after some straightforward calculations, can be shown to be equivalent
to
h
i
Pr pn ≤ p0 ≤ pn = 0.95
where

pn =

q

b
b
b
n.p + (1.96)2 /2 − 1.96 n.p (1 − p) + (1.96)2 /4

n + (1.96)2
q

2

pn =

b
b
b
n.p + (1.96) /2 + 1.96 n.p (1 − p) + (1.96)2 /4

n + (1.96)2
4
h

i

The interval pn , pn is now the 95% confidence interval for p0 .

1.4

An application election polls

Consider a presidential election with two candidates, candidate A and candidate B, and let p0 be the fraction of likely voters who favor candidate A,
just before the election is held. To predict the outcome of the election, a
polling agency draws a random sample of size n = 3000, for example, from
the population of likely voters.1 Suppose that 1800 of the respondents express a preference for candidate A. Thus, the fraction of respondents favoring
b
b
candidate A is p = 0.6. Substituting n = 3000 and p = 0.6 in the formulas
for pn and pn yields
pn = 0.58, pn = 0.62
Thus, the 95% confidence interval of 100 × p0 is [58, 62]. The polling results
are therefore stated as: 60% of the likely voters will vote for candidate A,
with a margin of error of ±2 points.

2

Motivation for maximum likelihood estimation

A more formal motivation for ML estimation is based on the fact that for
0 < x < 1 and x > 1,
ln(x) < x − 1.
This is illustrated in the following picture:
1

How to draw such a sample is beyond the scope of this lecture note.

5
ln(x) ≤ x − 1.
The inequality ln(x) < x − 1 is strict for x 6= 1, and ln(1) = 0. Consequently, taking x = f(Yj |p)/f(Yj |p0 ), we have the inequality
Ã

f (Yj |p)
ln
f(Yj |p0 )

!

≤

f(Yj |p)
− 1.
f (Yj |p0 )

Taking expectations, it follows that
"

Ã

f (Yj |p)
E ln
f (Yj |p0 )

!#

≤
=
=
=

"

#

f(Yj |p)
E
−1
f (Yj |p0 )
f(1|p)
f (0|p)
Pr[Yj = 1] +
Pr[Yj = 0] − 1
f (1|p0 )
f(0|p0 )
1−p
p
p0 +
(1 − p0 ) − 1
p0
1 − p0
p + 1 − p − 1 = 0,
(2)

hence

"

Ã

f(Yj |p)
E [ln (f (Yj |p))] − E [ln (f (Yj |p0 ))] = E ln
f (Yj |p0 )

!#

≤ 0,

and therefore,
E [ln (Ln (p))] ≤ E [ln (Ln(p0 ))] .

(3)

Thus, E [ln (Ln (p))] is maximal for p = p0 , and it can be shown that this
maximum is unique.
6
3

Maximum likelihood estimation of the Logit
model

3.1

The Logit model with one explanatory variable

Next, let (Y1 , X1 ), ..., (Yn , Xn ) be a random sample from the conditional Logit
distribution:
1
,
1 + exp (−α0 − β0 Xj )
Pr[Yj = 0|Xj ] = 1 − Pr[Yj = 1|Xj ]
exp (−α0 − β0 Xj )
=
1 + exp (−α0 − β0 Xj )
Pr[Yj = 1|Xj ] =

(4)

where the Xj ’s are the explanatory variables and α0 and β0 are unknown
parameters to be estimated. This model is called a Logit model, because
Pr[Yj = 1|Xj ] = F (α0 + β0 Xj )
where
F (x) =

1
1 + exp(−x)

(5)

(6)

is the distribution function of the logistic (Logit) distribution.
The conditional probability function involved is
f(y|Xj , α0 , β0 ) = Pr[Yj = y|Xj ]
= F (α0 + β0 Xj )y (1 − F (α0 + β0 Xj ))1−y
(
F (α0 + β0 Xj )
if y = 1,
=
1 − F (α0 + β0 Xj ) if y = 0.
Now the conditional log-likelihood function is
ln (Ln (α, β)) =

n
X

j=1

=

n
X

ln (f (Yj |Xj , α, β))

Yj ln (F (α + βXj )) +

j=1

=−

n
X

(1 − Yj ) ln (1 − F (α + βXj ))

j=1
n
X

n
X

(1 − Yj ) (α + βXj ) −

j=1

7

j=1

ln (1 + exp (−α − βXj )) .

(7)
Similar to (3) we have
E [ln (Ln (α, β))| X1 , ..., Xn ] ≤ E [ln (Ln (α0 , β0 ))| X1 , ..., Xn ] .
Again, this result motivates to estimate α0 and β0 by maximizing ln (Ln (α, β))
to α and β:
³
´
b b
ln Ln (α, β) = max ln (Ln (α, β)) .
α,β

b
b
However, there is no longer an explicit solution for α and β. These ML
estimators have to be solved numerically. Your econometrics software will do
that for you.

3.2

Pseudo t-values

It can be shown that if the sample size n is large then
´
√
√ ³b
b
n (α − α0 ) ∼ N (0, σ 2 ), n β − β0 ∼ N (0, σ 2 ).
α
β

bα
bβ
Given consistent estimators σ 2 and σ 2 of the unknown variances σ 2 and σ 2 ,
α
β
respectively (which are computed by your econometrics software), we then
have
´
√ ³b
√
n β − β0
b − α0 )
n (α
∼ N (0, 1),
∼ N(0, 1).
b
b
σα
σβ
These results can be used to test whether the coefficients α0 and β0 are zero
or not. In particular the null hypothesis β0 = 0 is of interest, because this
hypothesis implies that the conditional probability Pr[Yj = 1|Xj ] does not
depend on Xj . Under the null hypothesis β0 = 0 we have
√ b
bβ = nβ ∼ N(0, 1).
t
b
σβ

Recall that the 5% critical value of the two-sided standard normal test is
1.96. Thus, for example, the null hypothesis β0 = 0 is rejected¯ at¯ the 5%
¯b ¯
significance level in favor of the alternative hypothesis β0 6= 0 if ¯t β ¯ > 1.96,
¯

¯

¯b ¯
and accepted if ¯t β ¯ ≤ 1.96.

b
b
The statistic t β is called the pseudo t-value of β because it is used in the
b
same way as the t-value in linear regression, and σ β is called the standard
b Your econometric software will report the ML estimators together
error of β.
with their corresponding pseudo t-values and/or standard errors.

8
3.3

The general Logit model

The general Logit model takes the form

Pr[Yj = 1|X1j , ...Xk,j ] =
=

1
1+

0
exp (−β1 X1j

³

1 + exp −

0
− ... − βk Xkj )

1
Pk

i=1

βi0 Xij

(8)

´,

where one of the Xij equals 1 for the constant term, for example, let Xkj = 1,
and the βi0 ’s are the true parameter values. This model can be estimated by
ML in the same way as before. Thus, the log-likelihood function is
ln (Ln (β1 , ..., βk )) = −

n
X

j=1

(1 − Yj )

k
X
i=1

βi Xij −

n
X

j=1

Ã

Ã

ln 1 + exp −

k
X

βi Xij

i=1

!!

,

(9)

b
b
and the ML estimators β 1 , ..., β k are obtained by maximizing ln (Ln (β1 , ..., βk )):
³

´

b
b
ln Ln (β 1 , ..., β k ) = max ln (Ln (β1 , ..., βk )) .
β1 ,...,βk

Again, it can be shown that if n is large then for i = 1, ..., k,
´
√ ³b
2
n β i − βi0 ∼ N[0, σi ].

2
bi
Given consistent estimators σ 2 of the variances σi , it follows then that

´
√ ³b
n β i − βi0
b
σi

∼ N[0, 1]

for i = 1, ..., k. Your econometrics software will report the ML estimators
√ b
b
b
b
β i together with their corresponding pseudo t-values ti = nβ i /σ i and/or
b i.
standard errors σ

3.4

Testing joint significance

Now suppose you want to test the joint null hypothesis
0
0
0
H0 : β1 = 0, β2 = 0, ..., βm = 0,

9

(10)
where m < k.
There are two ways to do that. One way is akin to the F test in linear
regression: Re-estimate the Logit model under the null hypothesis:
³

´

e
e
ln Ln (0, .., 0, β m+1 , ..., β k ) =

max

βm+1 ,...,βk

ln (Ln (0, .., 0, βm+1 , ..., βk )) .

and compare the log-likelihoods2 . It can be shown that under the null hypothesis (10) and for large samples,
Ã

e
e
Ln (0, .., 0, β m+1 , ..., β k )
LRm = −2 ln
b
b
Ln (β 1 , ..., β k )

!

∼ χ2 ,
m

where the degrees of freedom m corresponds to the number of restrictions
imposed under the null hypothesis. This is the so-called likelihood ratio test,
which is conducted right-sided. For example, choose the 5% significance
level, look up in the table of the χ2 distribution the critical value c such
that for a χ2 distributed random variable Zm , Pr[Zm > c] = 0.05. Then the
m
null hypothesis (10) is rejected at the 5% significance level if LRm > c and
accepted if LRm ≤ c.
An alternative test of the null hypothesis (10) is the Wald test, which is
conducted in the same way as for linear regression models.3 Under the null
hypothesis (10) the Wald test statistic has also a χ2 distribution.
m

4

Interpretation of the coefficients of the Logit
model

4.1

Marginal effects

Consider the Logit model (5). If β0 > 0 then Pr[Yj = 1|Xj ] = F (α0 + β0 Xj )
is an increasing function of Xj :
dP [Yj = 1|Xj ]
= β0 .F 0 (α0 + β0 Xj ),
dXj
where F 0 is the derivative of (6):
2
3

Your econometric software will report the log-likelihood function value.
In EasyReg International the Wald test can be conducted simply by point-and-click.

10
exp(−x)
1 + exp(−x)
1
2 =
2 −
(1 + exp(−x))
(1 + exp(−x))
(1 + exp(−x))2
1
1
=
−
= F (x) − F (x)2
1 + exp(−x) (1 + exp(−x))2
= F (x) (1 − F (x)) .

F 0 (x) =

Therefore, the marginal effect of Xj on Pr[Yj = 1|Xj ] depends on Xj :
dP [Yj = 1|Xj ]
= β0 .F (α0 + β0 Xj ) (1 − F (α0 + β0 Xj )) ,
dXj
which renders the interpretation of β0 difficult.
However, the coefficient β0 can be interpreted in terms of relative changes
in odds.

4.2

Odds and odds ratios

The odds is the ratio of the probability that something is true divided by the
probability that it is not true. Thus, in the Logit case (4),
Odds (X) =

Pr[Yj = 1|Xj ]
F (α0 + β0 Xj )
=
= exp(α0 + β0 Xj ).
Pr[Yj = 0|Xj ]
1 − F (α0 + β0 Xj )

(11)

The odds ratio is the ratio of two odds for different values of Xj , say
Xj = x and Xj = x + ∆x:
exp(α + βx + β∆x)
Odds (x + ∆x)
=
= exp(β∆x),
Odds (x)
exp(α + βx)
where ∆x is a small change in x. Then
Ã

!

Odds (x + ∆x) − Odds (x)
exp(β0 ∆x) − 1
= lim
∆x→0
Odds (x)
∆x
¯
exp(β0 ∆x) − 1
d exp(u) ¯
¯
¯
= β0 lim
= β0 ×
= β0 exp(0) = β0 .
β0 ∆x→0
β0 ∆x
du ¯u=0
1
lim
∆x→0 ∆x

Thus, β0 may be interpreted as the relative change in the odds due to a small
change ∆x in Xj :
Odds (x + ∆x) − Odds (x)
Odds (x + ∆x)
=
− 1 ≈ β0 ∆x
Odds (x)
Odds (x)
11

(12)
If Xj is a binary variable itself, Xj = 0 or Xj = 1, then the only reasonable
choices for x + ∆x and x are 1 and 0, respectively, so that then
Odds (1) − Odds (0)
Odds (1)
−1=
= exp(β0 ) − 1.
Odds (0)
Odds (0)
Only if β0 is small we may then use the approximation exp(β0 ) − 1 ≈ β0 . If
not, one has to interpret β0 in terms of the log of the odds ratio involved:
Ã

Odds (1)
ln
Odds (0)

!

= β0 .

The interpretation of the coefficients βi0 , i = 1, ..., k − 1 in the general
Logit model (8) is similar as in the case (12):
Odds (X1j , ..., Xi−1,j , Xi,j + ∆Xi,j , Xi+1,j , ..., Xk,j )
− 1 ≈ βi0 ∆Xi,j
Odds (X1j , ..., Xi−1,j , Xi,j , Xi+1,j , ..., Xk,j )
if ∆Xi,j is small. For example, βi0 may be interpreted as the percentage
change in Odds(X1j , .., Xk,j ) due to a small percentage change 100×∆Xi,j = 1
in Xi,j .

12

Más contenido relacionado

La actualidad más candente

効率的反実仮想学習
効率的反実仮想学習効率的反実仮想学習
効率的反実仮想学習Masa Kato
 
Introduction to Anova
Introduction to AnovaIntroduction to Anova
Introduction to AnovaJi Li
 
PG STAT 531 Lecture 6 Test of Significance, z Test
PG STAT 531 Lecture 6 Test of Significance, z TestPG STAT 531 Lecture 6 Test of Significance, z Test
PG STAT 531 Lecture 6 Test of Significance, z TestAashish Patel
 
Logistic regression with SPSS
Logistic regression with SPSSLogistic regression with SPSS
Logistic regression with SPSSLNIPE
 
Cunningham slides-ch2
Cunningham slides-ch2Cunningham slides-ch2
Cunningham slides-ch2cunningjames
 
PG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability DistributionPG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability DistributionAashish Patel
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spssDr Nisha Arora
 
P G STAT 531 Lecture 7 t test and Paired t test
P G STAT 531 Lecture 7 t test and Paired t testP G STAT 531 Lecture 7 t test and Paired t test
P G STAT 531 Lecture 7 t test and Paired t testAashish Patel
 
Regression analysis by Muthama JM
Regression analysis by Muthama JMRegression analysis by Muthama JM
Regression analysis by Muthama JMJapheth Muthama
 
P G STAT 531 Lecture 8 Chi square test
P G STAT 531 Lecture 8 Chi square testP G STAT 531 Lecture 8 Chi square test
P G STAT 531 Lecture 8 Chi square testAashish Patel
 
Student's T-test, Paired T-Test, ANOVA & Proportionate Test
Student's T-test, Paired T-Test, ANOVA & Proportionate TestStudent's T-test, Paired T-Test, ANOVA & Proportionate Test
Student's T-test, Paired T-Test, ANOVA & Proportionate TestAzmi Mohd Tamil
 
P G STAT 531 Lecture 9 Correlation
P G STAT 531 Lecture 9 CorrelationP G STAT 531 Lecture 9 Correlation
P G STAT 531 Lecture 9 CorrelationAashish Patel
 
One-way ANOVA for Completely Randomized Design (CRD)
One-way ANOVA for Completely Randomized Design (CRD)One-way ANOVA for Completely Randomized Design (CRD)
One-way ANOVA for Completely Randomized Design (CRD)Siti Nur Adila Hamzah
 
Regression (Linear Regression and Logistic Regression) by Akanksha Bali
Regression (Linear Regression and Logistic Regression) by Akanksha BaliRegression (Linear Regression and Logistic Regression) by Akanksha Bali
Regression (Linear Regression and Logistic Regression) by Akanksha BaliAkanksha Bali
 
Logistic Regression in Case-Control Study
Logistic Regression in Case-Control StudyLogistic Regression in Case-Control Study
Logistic Regression in Case-Control StudySatish Gupta
 
Applied statistics lecture_7
Applied statistics lecture_7Applied statistics lecture_7
Applied statistics lecture_7Daria Bogdanova
 
Statistics Case Study - Stepwise Multiple Regression
Statistics Case Study - Stepwise Multiple RegressionStatistics Case Study - Stepwise Multiple Regression
Statistics Case Study - Stepwise Multiple RegressionSharad Srivastava
 

La actualidad más candente (20)

効率的反実仮想学習
効率的反実仮想学習効率的反実仮想学習
効率的反実仮想学習
 
Introduction to Anova
Introduction to AnovaIntroduction to Anova
Introduction to Anova
 
PG STAT 531 Lecture 6 Test of Significance, z Test
PG STAT 531 Lecture 6 Test of Significance, z TestPG STAT 531 Lecture 6 Test of Significance, z Test
PG STAT 531 Lecture 6 Test of Significance, z Test
 
Logistic regression with SPSS
Logistic regression with SPSSLogistic regression with SPSS
Logistic regression with SPSS
 
Cunningham slides-ch2
Cunningham slides-ch2Cunningham slides-ch2
Cunningham slides-ch2
 
PG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability DistributionPG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability Distribution
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spss
 
Z And T Tests
Z And T TestsZ And T Tests
Z And T Tests
 
T test statistics
T test statisticsT test statistics
T test statistics
 
P G STAT 531 Lecture 7 t test and Paired t test
P G STAT 531 Lecture 7 t test and Paired t testP G STAT 531 Lecture 7 t test and Paired t test
P G STAT 531 Lecture 7 t test and Paired t test
 
Regression analysis by Muthama JM
Regression analysis by Muthama JMRegression analysis by Muthama JM
Regression analysis by Muthama JM
 
P G STAT 531 Lecture 8 Chi square test
P G STAT 531 Lecture 8 Chi square testP G STAT 531 Lecture 8 Chi square test
P G STAT 531 Lecture 8 Chi square test
 
Lecture 4
Lecture 4Lecture 4
Lecture 4
 
Student's T-test, Paired T-Test, ANOVA & Proportionate Test
Student's T-test, Paired T-Test, ANOVA & Proportionate TestStudent's T-test, Paired T-Test, ANOVA & Proportionate Test
Student's T-test, Paired T-Test, ANOVA & Proportionate Test
 
P G STAT 531 Lecture 9 Correlation
P G STAT 531 Lecture 9 CorrelationP G STAT 531 Lecture 9 Correlation
P G STAT 531 Lecture 9 Correlation
 
One-way ANOVA for Completely Randomized Design (CRD)
One-way ANOVA for Completely Randomized Design (CRD)One-way ANOVA for Completely Randomized Design (CRD)
One-way ANOVA for Completely Randomized Design (CRD)
 
Regression (Linear Regression and Logistic Regression) by Akanksha Bali
Regression (Linear Regression and Logistic Regression) by Akanksha BaliRegression (Linear Regression and Logistic Regression) by Akanksha Bali
Regression (Linear Regression and Logistic Regression) by Akanksha Bali
 
Logistic Regression in Case-Control Study
Logistic Regression in Case-Control StudyLogistic Regression in Case-Control Study
Logistic Regression in Case-Control Study
 
Applied statistics lecture_7
Applied statistics lecture_7Applied statistics lecture_7
Applied statistics lecture_7
 
Statistics Case Study - Stepwise Multiple Regression
Statistics Case Study - Stepwise Multiple RegressionStatistics Case Study - Stepwise Multiple Regression
Statistics Case Study - Stepwise Multiple Regression
 

Similar a Logit model testing and interpretation

Probability cheatsheet
Probability cheatsheetProbability cheatsheet
Probability cheatsheetSuvrat Mishra
 
Multivriada ppt ms
Multivriada   ppt msMultivriada   ppt ms
Multivriada ppt msFaeco Bot
 
Data mining assignment 2
Data mining assignment 2Data mining assignment 2
Data mining assignment 2BarryK88
 
Deep learning .pdf
Deep learning .pdfDeep learning .pdf
Deep learning .pdfAlHayyan
 
Normal Distribution, Binomial Distribution, Poisson Distribution
Normal Distribution, Binomial Distribution, Poisson DistributionNormal Distribution, Binomial Distribution, Poisson Distribution
Normal Distribution, Binomial Distribution, Poisson DistributionQ Dauh Q Alam
 
Poisson Distribution.pdf
Poisson Distribution.pdfPoisson Distribution.pdf
Poisson Distribution.pdfssuser69c0bb1
 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptxfuad80
 
Confidence Intervals––Exact Intervals, Jackknife, and Bootstrap
Confidence Intervals––Exact Intervals, Jackknife, and BootstrapConfidence Intervals––Exact Intervals, Jackknife, and Bootstrap
Confidence Intervals––Exact Intervals, Jackknife, and BootstrapFrancesco Casalegno
 
Introduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksIntroduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksFederico Cerutti
 

Similar a Logit model testing and interpretation (20)

Unit II PPT.pptx
Unit II PPT.pptxUnit II PPT.pptx
Unit II PPT.pptx
 
Propensity albert
Propensity albertPropensity albert
Propensity albert
 
Probability cheatsheet
Probability cheatsheetProbability cheatsheet
Probability cheatsheet
 
Multivriada ppt ms
Multivriada   ppt msMultivriada   ppt ms
Multivriada ppt ms
 
Probability Cheatsheet.pdf
Probability Cheatsheet.pdfProbability Cheatsheet.pdf
Probability Cheatsheet.pdf
 
Data mining assignment 2
Data mining assignment 2Data mining assignment 2
Data mining assignment 2
 
Deep learning .pdf
Deep learning .pdfDeep learning .pdf
Deep learning .pdf
 
Normal Distribution, Binomial Distribution, Poisson Distribution
Normal Distribution, Binomial Distribution, Poisson DistributionNormal Distribution, Binomial Distribution, Poisson Distribution
Normal Distribution, Binomial Distribution, Poisson Distribution
 
Statistical Method In Economics
Statistical Method In EconomicsStatistical Method In Economics
Statistical Method In Economics
 
Poisson Distribution.pdf
Poisson Distribution.pdfPoisson Distribution.pdf
Poisson Distribution.pdf
 
Lecture 1 review
Lecture 1   reviewLecture 1   review
Lecture 1 review
 
Nested sampling
Nested samplingNested sampling
Nested sampling
 
Slides ineq-2
Slides ineq-2Slides ineq-2
Slides ineq-2
 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptx
 
Confidence Intervals––Exact Intervals, Jackknife, and Bootstrap
Confidence Intervals––Exact Intervals, Jackknife, and BootstrapConfidence Intervals––Exact Intervals, Jackknife, and Bootstrap
Confidence Intervals––Exact Intervals, Jackknife, and Bootstrap
 
Introduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksIntroduction to Evidential Neural Networks
Introduction to Evidential Neural Networks
 
Inequality, slides #2
Inequality, slides #2Inequality, slides #2
Inequality, slides #2
 
Inequalities #2
Inequalities #2Inequalities #2
Inequalities #2
 
PTSP PPT.pdf
PTSP PPT.pdfPTSP PPT.pdf
PTSP PPT.pdf
 
stochastic processes assignment help
stochastic processes assignment helpstochastic processes assignment help
stochastic processes assignment help
 

Último

M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.Aaiza Hassan
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation SlidesKeppelCorporation
 
Pharma Works Profile of Karan Communications
Pharma Works Profile of Karan CommunicationsPharma Works Profile of Karan Communications
Pharma Works Profile of Karan Communicationskarancommunications
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMANIlamathiKannappan
 
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service DewasVip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewasmakika9823
 
VIP Call Girls In Saharaganj ( Lucknow ) 🔝 8923113531 🔝 Cash Payment (COD) 👒
VIP Call Girls In Saharaganj ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment (COD) 👒VIP Call Girls In Saharaganj ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment (COD) 👒
VIP Call Girls In Saharaganj ( Lucknow ) 🔝 8923113531 🔝 Cash Payment (COD) 👒anilsa9823
 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsApsara Of India
 
RE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechRE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechNewman George Leech
 
Catalogue ONG NƯỚC uPVC - HDPE DE NHAT.pdf
Catalogue ONG NƯỚC uPVC - HDPE DE NHAT.pdfCatalogue ONG NƯỚC uPVC - HDPE DE NHAT.pdf
Catalogue ONG NƯỚC uPVC - HDPE DE NHAT.pdfOrient Homes
 
Regression analysis: Simple Linear Regression Multiple Linear Regression
Regression analysis:  Simple Linear Regression Multiple Linear RegressionRegression analysis:  Simple Linear Regression Multiple Linear Regression
Regression analysis: Simple Linear Regression Multiple Linear RegressionRavindra Nath Shukla
 
GD Birla and his contribution in management
GD Birla and his contribution in managementGD Birla and his contribution in management
GD Birla and his contribution in managementchhavia330
 
Eni 2024 1Q Results - 24.04.24 business.
Eni 2024 1Q Results - 24.04.24 business.Eni 2024 1Q Results - 24.04.24 business.
Eni 2024 1Q Results - 24.04.24 business.Eni
 
Ensure the security of your HCL environment by applying the Zero Trust princi...
Ensure the security of your HCL environment by applying the Zero Trust princi...Ensure the security of your HCL environment by applying the Zero Trust princi...
Ensure the security of your HCL environment by applying the Zero Trust princi...Roland Driesen
 
The Coffee Bean & Tea Leaf(CBTL), Business strategy case study
The Coffee Bean & Tea Leaf(CBTL), Business strategy case studyThe Coffee Bean & Tea Leaf(CBTL), Business strategy case study
The Coffee Bean & Tea Leaf(CBTL), Business strategy case studyEthan lee
 
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...Dipal Arora
 
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779Best VIP Call Girls Noida Sector 40 Call Me: 8448380779
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779Delhi Call girls
 
Socio-economic-Impact-of-business-consumers-suppliers-and.pptx
Socio-economic-Impact-of-business-consumers-suppliers-and.pptxSocio-economic-Impact-of-business-consumers-suppliers-and.pptx
Socio-economic-Impact-of-business-consumers-suppliers-and.pptxtrishalcan8
 
It will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 MayIt will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 MayNZSG
 
Call Girls in Gomti Nagar - 7388211116 - With room Service
Call Girls in Gomti Nagar - 7388211116  - With room ServiceCall Girls in Gomti Nagar - 7388211116  - With room Service
Call Girls in Gomti Nagar - 7388211116 - With room Servicediscovermytutordmt
 

Último (20)

M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
 
Pharma Works Profile of Karan Communications
Pharma Works Profile of Karan CommunicationsPharma Works Profile of Karan Communications
Pharma Works Profile of Karan Communications
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMAN
 
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service DewasVip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
 
VIP Call Girls In Saharaganj ( Lucknow ) 🔝 8923113531 🔝 Cash Payment (COD) 👒
VIP Call Girls In Saharaganj ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment (COD) 👒VIP Call Girls In Saharaganj ( Lucknow  ) 🔝 8923113531 🔝  Cash Payment (COD) 👒
VIP Call Girls In Saharaganj ( Lucknow ) 🔝 8923113531 🔝 Cash Payment (COD) 👒
 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
 
RE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechRE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman Leech
 
Catalogue ONG NƯỚC uPVC - HDPE DE NHAT.pdf
Catalogue ONG NƯỚC uPVC - HDPE DE NHAT.pdfCatalogue ONG NƯỚC uPVC - HDPE DE NHAT.pdf
Catalogue ONG NƯỚC uPVC - HDPE DE NHAT.pdf
 
Best Practices for Implementing an External Recruiting Partnership
Best Practices for Implementing an External Recruiting PartnershipBest Practices for Implementing an External Recruiting Partnership
Best Practices for Implementing an External Recruiting Partnership
 
Regression analysis: Simple Linear Regression Multiple Linear Regression
Regression analysis:  Simple Linear Regression Multiple Linear RegressionRegression analysis:  Simple Linear Regression Multiple Linear Regression
Regression analysis: Simple Linear Regression Multiple Linear Regression
 
GD Birla and his contribution in management
GD Birla and his contribution in managementGD Birla and his contribution in management
GD Birla and his contribution in management
 
Eni 2024 1Q Results - 24.04.24 business.
Eni 2024 1Q Results - 24.04.24 business.Eni 2024 1Q Results - 24.04.24 business.
Eni 2024 1Q Results - 24.04.24 business.
 
Ensure the security of your HCL environment by applying the Zero Trust princi...
Ensure the security of your HCL environment by applying the Zero Trust princi...Ensure the security of your HCL environment by applying the Zero Trust princi...
Ensure the security of your HCL environment by applying the Zero Trust princi...
 
The Coffee Bean & Tea Leaf(CBTL), Business strategy case study
The Coffee Bean & Tea Leaf(CBTL), Business strategy case studyThe Coffee Bean & Tea Leaf(CBTL), Business strategy case study
The Coffee Bean & Tea Leaf(CBTL), Business strategy case study
 
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
 
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779Best VIP Call Girls Noida Sector 40 Call Me: 8448380779
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779
 
Socio-economic-Impact-of-business-consumers-suppliers-and.pptx
Socio-economic-Impact-of-business-consumers-suppliers-and.pptxSocio-economic-Impact-of-business-consumers-suppliers-and.pptx
Socio-economic-Impact-of-business-consumers-suppliers-and.pptx
 
It will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 MayIt will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 May
 
Call Girls in Gomti Nagar - 7388211116 - With room Service
Call Girls in Gomti Nagar - 7388211116  - With room ServiceCall Girls in Gomti Nagar - 7388211116  - With room Service
Call Girls in Gomti Nagar - 7388211116 - With room Service
 

Logit model testing and interpretation

  • 1. The Logit Model: Estimation, Testing and Interpretation Herman J. Bierens October 25, 2008 1 1.1 Introduction to maximum likelihood estimation The likelihood function Consider a random sample Y1 , ..., Yn from the Bernoulli distribution: Pr[Yj = 1] = p0 Pr[Yj = 0] = 1 − p0 , where p0 is unknown. For example, toss n times a coin for which you suspect that it is unfair: p0 6= 0.5, and for each tossing j assign Yj = 1 if the outcome is heads and Yj = 0 if the outcome is tails. The question is how to estimate p0 and how to test the null hypothesis that the coin is fair: p0 = 0.5. The probability function involved can be written as f(y|p0 ) = Pr[Yj = y] = py 0 1−y (1 − p0 ) = ( p0 if y = 1, 1 − p0 if y = 0. Next, let y1 , ..., yn be a given sequence of zeros and ones. Thus, each yj is either 0 or 1. The joint probability function of the random sample Y1 , Y2 , ..., Yn is defined as fn (y1 , ..., yn |p0 ) = Pr[Y1 = y1 and Y2 = y2 ..... and Yn = yn ]. 1
  • 2. Because the random variables Y1 , Y2 , ..., Yn are independent, we can write Pr[Y1 = y1 and Y2 = y2 ..... and Yn = yn ] = Pr[Y1 = y1 ] × Pr[Y2 = y2 ] × ... × Pr[Yn = yn ] = f (y1 |p0 ) × f (y2 |p0 ) × ... × f (yn |p0 ) = n Y j=1 f(yj |p0 ), hence fn (y1 , ..., yn |p0 ) = = n Y j=1 ⎛ y p0j (1 − p0 )1−yj ⎞⎛ n Y n yj ⎠ ⎝ Y ⎝ p0 (1 j=1 j=1 Pn = p0 j=1 yj ⎞ 1−yj ⎠ − p0 ) Pn (1 − p0 )n− j=1 yj . Replacing the given non-random sequence y1 , ..., yn by the random sample Y1 , Y2 , ..., Yn and the unknown probability p0 by a variable p in the interval (0, 1) yields the likelihood function Pn Ln (p) = fn (Y1 , ..., Yn |p) = p j=1 Yj Pn (1 − p)n− j=1 Yj For the case p = p0 the likelihood function can be interpreted as the joint probability that we draw a particular sample Y1 , ..., Yn . 1.2 Maximum likelihood estimation The idea of maximum likelihood (ML) estimation is now to choose p such that Ln (p) is maximal. In other words, choose p such that the probability of drawing this particular sample Y1 , ..., Yn is maximal. Note that maximizing Ln (p) is equivalent to maximizing ln (Ln (p)) , i.e., ⎛ ln (Ln (p)) = ⎝ n X j=1 ³ ⎞ ⎛ Yj ⎠ ln(p) + ⎝n − n X j=1 ⎞ Yj ⎠ ln(1 − p) ´ = n Y ln(p) + (1 − Y ) ln(1 − p) , 2
  • 3. where n 1X Y = Yj n j=1 b is the sample mean. Therefore, the ML estimator p in this case can be obtained from the first-order condition for a maximum of ln (Ln (p)) in p = b p: à b b b d ln (Ln (p)) d ln(p) d ln(1 − p) =n Y + (1 − Y ) 0 = b b b dp dp dp à ! b b b d ln(p) d ln(1 − p) d(1 − p) = n Y + (1 − Y ) × b b b dp d(1 − p) dp à ! 1 1 × (−1) = n Y + (1 − Y ) b b p 1−p = n à à Y 1−Y − b b p 1−p b Y −p = n b b p (1 − p) ! ! ⎛ = n⎝ ³ b b Y (1 − p) − p 1 − Y b b p (1 − p) ! ´⎞ ⎠ where we have used the fact that d ln(x)/dx = 1/x. Thus, in this case the b ML estimator p of p0 is the sample mean: b p =Y. b Note that this is an unbiased estimator: E (p) = 1.3 1 n Pn j=1 E (Yj ) = p0 . Large sample statistical inference It can be shown (but this√ requires advanced probability theory) that if the b sample size n is large then n (p − p0 ) is approximately normally distributed, i.e., n √ 1 X 2 b n (p − p0 ) = √ (Yj − p0 ) ∼ N[0, σ0 ], n j=1 where h 2 σ0 = var(Yj ) = E (Yj − p0 )2 i = (1 − p0 )2 p0 + (−p0 )2 (1 − p0 ) = p0 (1 − p0 ). 3
  • 4. Thus, for large sample size n, √ b n (p − p0 ) q p0 (1 − p0 ) ∼ N[0, 1]. (1) This result can be used to test hypotheses about p0 . In particular, under the null hypothesis that the coin is fair, p0 = 0.5, we have √ √ b n (p − 0.5) b 2 n (p − 0.5) = √ ∼ N[0, 1], 0.5 × 0.5 √ b Therefore, 2 n (p − 0.5) can be used as the test statistic of the standard normal test of the null hypothesis p0 = 1/2, as follows. Recall that for a standard normal random variable U , Pr [|U | > 1.96] = 0.05. Thus, under the null hypothesis p0 = 1/2 one would expect that ¯ h¯ √ i ¯ ¯ b Pr ¯2 n (p − 0.5)¯ > 1.96 = 0.05 ¯ h¯ √ i ¯ ¯ b Pr ¯2 n (p − 0.5)¯ ≤ 1.96 = 0.95 √ b If |2 n (p − 0.5)| > 1.96 then we reject the null hypothesis p0 = 1/2 at the 5% significance level, because√ this is not what one would expect if the b null hypothesis is true, and if |2 n (p − 0.5)| ≤ 1.96 then we accept this null hypothesis, as this result is then in accordance with the null hypothesis p0 = 1/2. The result (1) can also be used to endow the unknown probability p0 with a confidence interval, for example the 95% confidence interval, as follows. The result (1) implies ¯ ⎡¯ √ ⎤ ¯ ¯ ¯ n (p − p0 ) ¯ b ¯ ≤ 1.96⎦ = 0.95, Pr ⎣¯ q ¯ ¯ ¯ p0 (1 − p0 ) ¯ which, after some straightforward calculations, can be shown to be equivalent to h i Pr pn ≤ p0 ≤ pn = 0.95 where pn = q b b b n.p + (1.96)2 /2 − 1.96 n.p (1 − p) + (1.96)2 /4 n + (1.96)2 q 2 pn = b b b n.p + (1.96) /2 + 1.96 n.p (1 − p) + (1.96)2 /4 n + (1.96)2 4
  • 5. h i The interval pn , pn is now the 95% confidence interval for p0 . 1.4 An application election polls Consider a presidential election with two candidates, candidate A and candidate B, and let p0 be the fraction of likely voters who favor candidate A, just before the election is held. To predict the outcome of the election, a polling agency draws a random sample of size n = 3000, for example, from the population of likely voters.1 Suppose that 1800 of the respondents express a preference for candidate A. Thus, the fraction of respondents favoring b b candidate A is p = 0.6. Substituting n = 3000 and p = 0.6 in the formulas for pn and pn yields pn = 0.58, pn = 0.62 Thus, the 95% confidence interval of 100 × p0 is [58, 62]. The polling results are therefore stated as: 60% of the likely voters will vote for candidate A, with a margin of error of ±2 points. 2 Motivation for maximum likelihood estimation A more formal motivation for ML estimation is based on the fact that for 0 < x < 1 and x > 1, ln(x) < x − 1. This is illustrated in the following picture: 1 How to draw such a sample is beyond the scope of this lecture note. 5
  • 6. ln(x) ≤ x − 1. The inequality ln(x) < x − 1 is strict for x 6= 1, and ln(1) = 0. Consequently, taking x = f(Yj |p)/f(Yj |p0 ), we have the inequality à f (Yj |p) ln f(Yj |p0 ) ! ≤ f(Yj |p) − 1. f (Yj |p0 ) Taking expectations, it follows that " à f (Yj |p) E ln f (Yj |p0 ) !# ≤ = = = " # f(Yj |p) E −1 f (Yj |p0 ) f(1|p) f (0|p) Pr[Yj = 1] + Pr[Yj = 0] − 1 f (1|p0 ) f(0|p0 ) 1−p p p0 + (1 − p0 ) − 1 p0 1 − p0 p + 1 − p − 1 = 0, (2) hence " à f(Yj |p) E [ln (f (Yj |p))] − E [ln (f (Yj |p0 ))] = E ln f (Yj |p0 ) !# ≤ 0, and therefore, E [ln (Ln (p))] ≤ E [ln (Ln(p0 ))] . (3) Thus, E [ln (Ln (p))] is maximal for p = p0 , and it can be shown that this maximum is unique. 6
  • 7. 3 Maximum likelihood estimation of the Logit model 3.1 The Logit model with one explanatory variable Next, let (Y1 , X1 ), ..., (Yn , Xn ) be a random sample from the conditional Logit distribution: 1 , 1 + exp (−α0 − β0 Xj ) Pr[Yj = 0|Xj ] = 1 − Pr[Yj = 1|Xj ] exp (−α0 − β0 Xj ) = 1 + exp (−α0 − β0 Xj ) Pr[Yj = 1|Xj ] = (4) where the Xj ’s are the explanatory variables and α0 and β0 are unknown parameters to be estimated. This model is called a Logit model, because Pr[Yj = 1|Xj ] = F (α0 + β0 Xj ) where F (x) = 1 1 + exp(−x) (5) (6) is the distribution function of the logistic (Logit) distribution. The conditional probability function involved is f(y|Xj , α0 , β0 ) = Pr[Yj = y|Xj ] = F (α0 + β0 Xj )y (1 − F (α0 + β0 Xj ))1−y ( F (α0 + β0 Xj ) if y = 1, = 1 − F (α0 + β0 Xj ) if y = 0. Now the conditional log-likelihood function is ln (Ln (α, β)) = n X j=1 = n X ln (f (Yj |Xj , α, β)) Yj ln (F (α + βXj )) + j=1 =− n X (1 − Yj ) ln (1 − F (α + βXj )) j=1 n X n X (1 − Yj ) (α + βXj ) − j=1 7 j=1 ln (1 + exp (−α − βXj )) . (7)
  • 8. Similar to (3) we have E [ln (Ln (α, β))| X1 , ..., Xn ] ≤ E [ln (Ln (α0 , β0 ))| X1 , ..., Xn ] . Again, this result motivates to estimate α0 and β0 by maximizing ln (Ln (α, β)) to α and β: ³ ´ b b ln Ln (α, β) = max ln (Ln (α, β)) . α,β b b However, there is no longer an explicit solution for α and β. These ML estimators have to be solved numerically. Your econometrics software will do that for you. 3.2 Pseudo t-values It can be shown that if the sample size n is large then ´ √ √ ³b b n (α − α0 ) ∼ N (0, σ 2 ), n β − β0 ∼ N (0, σ 2 ). α β bα bβ Given consistent estimators σ 2 and σ 2 of the unknown variances σ 2 and σ 2 , α β respectively (which are computed by your econometrics software), we then have ´ √ ³b √ n β − β0 b − α0 ) n (α ∼ N (0, 1), ∼ N(0, 1). b b σα σβ These results can be used to test whether the coefficients α0 and β0 are zero or not. In particular the null hypothesis β0 = 0 is of interest, because this hypothesis implies that the conditional probability Pr[Yj = 1|Xj ] does not depend on Xj . Under the null hypothesis β0 = 0 we have √ b bβ = nβ ∼ N(0, 1). t b σβ Recall that the 5% critical value of the two-sided standard normal test is 1.96. Thus, for example, the null hypothesis β0 = 0 is rejected¯ at¯ the 5% ¯b ¯ significance level in favor of the alternative hypothesis β0 6= 0 if ¯t β ¯ > 1.96, ¯ ¯ ¯b ¯ and accepted if ¯t β ¯ ≤ 1.96. b b The statistic t β is called the pseudo t-value of β because it is used in the b same way as the t-value in linear regression, and σ β is called the standard b Your econometric software will report the ML estimators together error of β. with their corresponding pseudo t-values and/or standard errors. 8
  • 9. 3.3 The general Logit model The general Logit model takes the form Pr[Yj = 1|X1j , ...Xk,j ] = = 1 1+ 0 exp (−β1 X1j ³ 1 + exp − 0 − ... − βk Xkj ) 1 Pk i=1 βi0 Xij (8) ´, where one of the Xij equals 1 for the constant term, for example, let Xkj = 1, and the βi0 ’s are the true parameter values. This model can be estimated by ML in the same way as before. Thus, the log-likelihood function is ln (Ln (β1 , ..., βk )) = − n X j=1 (1 − Yj ) k X i=1 βi Xij − n X j=1 Ã Ã ln 1 + exp − k X βi Xij i=1 !! , (9) b b and the ML estimators β 1 , ..., β k are obtained by maximizing ln (Ln (β1 , ..., βk )): ³ ´ b b ln Ln (β 1 , ..., β k ) = max ln (Ln (β1 , ..., βk )) . β1 ,...,βk Again, it can be shown that if n is large then for i = 1, ..., k, ´ √ ³b 2 n β i − βi0 ∼ N[0, σi ]. 2 bi Given consistent estimators σ 2 of the variances σi , it follows then that ´ √ ³b n β i − βi0 b σi ∼ N[0, 1] for i = 1, ..., k. Your econometrics software will report the ML estimators √ b b b b β i together with their corresponding pseudo t-values ti = nβ i /σ i and/or b i. standard errors σ 3.4 Testing joint significance Now suppose you want to test the joint null hypothesis 0 0 0 H0 : β1 = 0, β2 = 0, ..., βm = 0, 9 (10)
  • 10. where m < k. There are two ways to do that. One way is akin to the F test in linear regression: Re-estimate the Logit model under the null hypothesis: ³ ´ e e ln Ln (0, .., 0, β m+1 , ..., β k ) = max βm+1 ,...,βk ln (Ln (0, .., 0, βm+1 , ..., βk )) . and compare the log-likelihoods2 . It can be shown that under the null hypothesis (10) and for large samples, Ã e e Ln (0, .., 0, β m+1 , ..., β k ) LRm = −2 ln b b Ln (β 1 , ..., β k ) ! ∼ χ2 , m where the degrees of freedom m corresponds to the number of restrictions imposed under the null hypothesis. This is the so-called likelihood ratio test, which is conducted right-sided. For example, choose the 5% significance level, look up in the table of the χ2 distribution the critical value c such that for a χ2 distributed random variable Zm , Pr[Zm > c] = 0.05. Then the m null hypothesis (10) is rejected at the 5% significance level if LRm > c and accepted if LRm ≤ c. An alternative test of the null hypothesis (10) is the Wald test, which is conducted in the same way as for linear regression models.3 Under the null hypothesis (10) the Wald test statistic has also a χ2 distribution. m 4 Interpretation of the coefficients of the Logit model 4.1 Marginal effects Consider the Logit model (5). If β0 > 0 then Pr[Yj = 1|Xj ] = F (α0 + β0 Xj ) is an increasing function of Xj : dP [Yj = 1|Xj ] = β0 .F 0 (α0 + β0 Xj ), dXj where F 0 is the derivative of (6): 2 3 Your econometric software will report the log-likelihood function value. In EasyReg International the Wald test can be conducted simply by point-and-click. 10
  • 11. exp(−x) 1 + exp(−x) 1 2 = 2 − (1 + exp(−x)) (1 + exp(−x)) (1 + exp(−x))2 1 1 = − = F (x) − F (x)2 1 + exp(−x) (1 + exp(−x))2 = F (x) (1 − F (x)) . F 0 (x) = Therefore, the marginal effect of Xj on Pr[Yj = 1|Xj ] depends on Xj : dP [Yj = 1|Xj ] = β0 .F (α0 + β0 Xj ) (1 − F (α0 + β0 Xj )) , dXj which renders the interpretation of β0 difficult. However, the coefficient β0 can be interpreted in terms of relative changes in odds. 4.2 Odds and odds ratios The odds is the ratio of the probability that something is true divided by the probability that it is not true. Thus, in the Logit case (4), Odds (X) = Pr[Yj = 1|Xj ] F (α0 + β0 Xj ) = = exp(α0 + β0 Xj ). Pr[Yj = 0|Xj ] 1 − F (α0 + β0 Xj ) (11) The odds ratio is the ratio of two odds for different values of Xj , say Xj = x and Xj = x + ∆x: exp(α + βx + β∆x) Odds (x + ∆x) = = exp(β∆x), Odds (x) exp(α + βx) where ∆x is a small change in x. Then à ! Odds (x + ∆x) − Odds (x) exp(β0 ∆x) − 1 = lim ∆x→0 Odds (x) ∆x ¯ exp(β0 ∆x) − 1 d exp(u) ¯ ¯ ¯ = β0 lim = β0 × = β0 exp(0) = β0 . β0 ∆x→0 β0 ∆x du ¯u=0 1 lim ∆x→0 ∆x Thus, β0 may be interpreted as the relative change in the odds due to a small change ∆x in Xj : Odds (x + ∆x) − Odds (x) Odds (x + ∆x) = − 1 ≈ β0 ∆x Odds (x) Odds (x) 11 (12)
  • 12. If Xj is a binary variable itself, Xj = 0 or Xj = 1, then the only reasonable choices for x + ∆x and x are 1 and 0, respectively, so that then Odds (1) − Odds (0) Odds (1) −1= = exp(β0 ) − 1. Odds (0) Odds (0) Only if β0 is small we may then use the approximation exp(β0 ) − 1 ≈ β0 . If not, one has to interpret β0 in terms of the log of the odds ratio involved: Ã Odds (1) ln Odds (0) ! = β0 . The interpretation of the coefficients βi0 , i = 1, ..., k − 1 in the general Logit model (8) is similar as in the case (12): Odds (X1j , ..., Xi−1,j , Xi,j + ∆Xi,j , Xi+1,j , ..., Xk,j ) − 1 ≈ βi0 ∆Xi,j Odds (X1j , ..., Xi−1,j , Xi,j , Xi+1,j , ..., Xk,j ) if ∆Xi,j is small. For example, βi0 may be interpreted as the percentage change in Odds(X1j , .., Xk,j ) due to a small percentage change 100×∆Xi,j = 1 in Xi,j . 12