Week 1
Intro & Stats I Review
Applied Statistical Analysis II
Jeffrey Ziegler, PhD
Assistant Professor in Political Science & Data Science
Trinity College Dublin
Spring 2023
Road map for today
Welcome and introduction
  - About module, structure
  - Review of last term
  - Bridging terms
Alternative derivations of least squares
Frequentist model of likelihood
Intro to GLMs & MLE
Next time: Framework of generalised linear models (GLMs)
By next week, please...
  - Fork GitHub repository
  - Read assigned chapters
General Info About Course
Instructor: Jeffrey Ziegler, PhD
Email: zieglerj@tcd.ie
In-Person Sessions: Mondays 16:00-18:00, AP2.05
Office Hours: W/Th 13:00-14:00
Review of Tools: Necessity of R
R is the statistical programming language
Perform stats analysis, data manipulation, plotting, etc.
Notes on R code:
  - ALWAYS comment your code
      Worry more about saying too little than too much
  - Use indentation to visually clarify blocks of code, such as multiple lines for one command or multiple commands that produce one logical step
  - Plan for your code to be run from source files
      Interactive analysis is great for exploration, but terrible for (re)analysis, on which your job and/or reputation may depend
  - Create command source files with all your analysis (.R file)
  - Final plots go to files, not the screen (close the graphics device with dev.off())
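As a minimal sketch of the last two points, the script below could be saved as a source file and run non-interactively. The simulated data, the file name, and the use of a temporary directory are assumptions made purely for illustration.

```r
# Hypothetical example script (e.g. saved as analysis.R and run with source()).
# The data are simulated and the output path is a temporary file; both are
# assumptions made for illustration.
set.seed(123)
x <- rnorm(100)
y <- 2 + 3 * x + rnorm(100)

plot_file <- file.path(tempdir(), "scatter_example.pdf")

pdf(plot_file, width = 7, height = 5)  # open a PDF graphics device
plot(x, y, main = "Simulated data", xlab = "x", ylab = "y")
abline(lm(y ~ x), col = "blue")        # add the fitted regression line
dev.off()                              # close the device so the file is written
```

Because the plot is written to a file rather than the screen, re-running the script reproduces the figure without any manual steps.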
Review of Tools: GitHub & LaTeX
LaTeX is the word processor
Input R output, figures, and code into your document
Using Word will result in a deduction
GitHub is how we’ll share our work with each other
Fork repository
Keep up-to-date
Keep organised
As a reminder, you can access the syllabus here...
Direct link to syllabus, or find on course website
Required Materials
Texts: All readings are provided, don’t buy books!
R and LaTeX: Should already be installed
RStudio and TeXstudio: Should already be installed
Course Evaluation/Assessment
Problem Sets (3/4): 50%
Exam: 25%
Replication: 25%
Problem Sets (50%)
Typically assigned every other week(ish), and you will generally have two weeks to do each assignment (this will vary)
Problem sets require R and should be written in LaTeX (must include tables, figures, and code within text)
All problem sets will be posted on GitHub
Evaluated by me or Martyn
I will publish correct answers each week, but look at others’ GitHubs so we can learn from each other
The lowest PS grade will be dropped, so “I have been so busy with other classes” is not a legitimate reason to not turn in a problem set!
Exam (25%)
In class: February 27
Exam is cumulative
Exam is multiple choice and open-response questions
Exam is graded by me, the instructor
You will be allowed a formula sheet
Make-up exams are only allowed by written approval
Replication (25%): Example
Does Having Daughters Cause Judges to Rule for Women’s Issues?
Table: Number of Children and Girls for U.S. Courts of Appeals Judges Participating in Gender-Related Cases, 1996-2002

Number of children    0    1    2    3    4    5    6    7    8    9
Democrat             12   13   33   24   15    4    0    1    0    1
Republican           13    8   44   30   15    7    3    0    1    0

Number of girls       0    1    2    3    4    5    6    7    8    9
Democrat             26   35   29   10    1    2    -    -    -    -
Republican           36   43   31    9    2    0    -    -    -    -

Source: Adam Glynn & Maya Sen, “Does Having Daughters Cause Judges to Rule for Women’s Issues?” (AJPS, 2015)
Data: Daughters
[Figure: Number of Girls by Partisan Leaning. x-axis: Quantity (0 to 5); y-axis: Conditional Percent (0.0 to 1.0); groups: Democrat, Republican]
Data: Judge Demography
Table: Demographics of U.S. Courts of Appeals judges who voted on gender-related cases (1996-2002)

                               All   Democrats Republicans       Women         Men
Mean No. Children             2.47        2.40        2.54        1.58        2.66
Mean No. Girls                1.24        1.33        1.16        0.71        1.34
0 children                    0.11        0.12        0.11        0.29        0.08
1 child                       0.09        0.13        0.07        0.21        0.07
2 children                    0.34        0.32        0.36        0.26        0.36
3 children                    0.24        0.23        0.25        0.13        0.26
4 children                    0.13        0.15        0.12        0.08        0.15
5 children or more            0.08        0.06        0.09        0.03        0.05
Proportion Female             0.17        0.26        0.09        1.00        0.00
Proportion Republican         0.54        0.00        1.00        0.29        0.59
Proportion White              0.91        0.78        0.99        0.93        0.91
Mean Year Born             1932.55     1931.23     1933.43     1938.57     1931.49
Data: Judge Demography
[Figure: Percent of Female Judges by Party Affiliation. x-axis: Democrat, Republican; y-axis: Percent of Party (0.00 to 1.00)]
Data: Cases
Table: Distribution of the number of gender-related cases heard per judge, 1996-2002

                 Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
All Judges       1.00     5.00     8.00    11.10    14.00    46.00
Democrats        1.00     5.00     7.00    10.12    13.00    39.00
Republicans      1.00     5.00     9.00    11.94    14.00    46.00
Data: Cases
[Figure: Proportion of Cases Decided in a Feminist Direction, on a scale from 0.0 (Less Feminist) to 1.0 (More Feminist); groups: Republicans, Democrats, All]
Model
Predict the probability that a judge will vote in a feminist direction in any given gender-related case
$\Pr(y_i = 1) = \operatorname{logit}^{-1}(\beta_0 + \beta_k X_i)$
$y_i$: judge-level votes in individual cases
$X_i$: vector of individual-level predictors
Main covariate of interest is # of biological daughters, conditioned on total # of children (categorical variable)
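To make the inverse logit concrete, here is a small sketch of how a linear predictor is mapped to a probability. The values of the linear predictor are hypothetical and are not taken from the article; `inv_logit` is a helper name introduced here for illustration.

```r
# Inverse logit: maps a linear predictor (log-odds) to a probability.
inv_logit <- function(eta) 1 / (1 + exp(-eta))

# Hypothetical values of the linear predictor beta_0 + beta_k * X_i
eta <- c(-2, 0, 0.5, 2)
p <- inv_logit(eta)
round(p, 3)

# Base R's plogis() computes the same quantity
all.equal(p, plogis(eta))
```

A linear predictor of 0 corresponds to a probability of 0.5, and larger values push the probability towards 1 without ever reaching it.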
Model: Base comparison
    # R code presented in AJPS online replication files
    library(Zelig)  # provides zelig(); assumed to be installed
    base_model <- zelig(progressive.vote ~ as.factor(girls) + as.factor(child),
                        model = "logit",
                        data = subset(women.cases, child < 5 & child > 0))
    summary(base_model) # number of observations = 1974

Dependent variable: progressive.vote
as.factor(girls)1    0.384*** (0.128)
Observations         1,974
Note: *p<0.1; **p<0.05; ***p<0.01
Model: Casewise delete
    ## case-wise delete to get same n in primary model ##
    # subset for judges with at least 1 child,
    # but fewer than 5 (n=2448)
    women.subset <- women.cases[women.cases$child < 5 & women.cases$child > 0, ]
    # keep cases where the number of girls is not missing (n=1975)
    women.subset <- women.subset[complete.cases(women.subset$girls), ]
    # keep cases with a non-missing progressive.vote value (n=1974)
    women.subset <- women.subset[complete.cases(women.subset$progressive.vote), ]
Model: Casewise delete (same results)
    # re-run with subsetted data
    case_delete <- glm(progressive.vote ~ as.factor(girls) + as.factor(child),
                       family = binomial(link = "logit"),
                       data = women.subset)
    summary(case_delete) # number of observations = 1974

Table: Re-run with subset
Dependent variable: progressive.vote
as.factor(girls)1    0.384*** (0.128)
Observations         1,974
Note: *p<0.1; **p<0.05; ***p<0.01
Diagnostics: Basic model
    ### check Pearson residuals ###
    sum(residuals(base_model, type = "pearson")^2)
    # check deviance
    pchisq(deviance(base_model), df.residual(base_model), lower.tail = FALSE)
    # resulting p-value = 0, not good model fit
Diagnostics: Basic model
    # get preds for all subjects and take inverse logit
    # (ilogit() is from the faraway package; base R's plogis() is equivalent)
    predValues <- ilogit(predict(base_model))
    # mean likelihood of judge voting for plaintiff
    mean(predValues) # (original mean lib vote = 0.433)
    # create binary prediction: above or below the mean lib vote
    predBinary <- ifelse(predValues > 0.433, 1, 0)
    # create table to show predictive error
    # (observeBinary, the observed votes, is assumed to be defined beforehand)
    table(observeBinary, predBinary)
    # 1109/1974 (56.2%) accurately predicted

Table: Estimation of judge below or above mean lib vote
              predicted 0   predicted 1
observed 0            858           333
observed 1            532           251
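The accuracy figure quoted in the code comment can be reproduced directly from the four counts in the table. The sketch below simply re-enters those counts by hand, with rows as observed and columns as predicted outcomes; the object names are introduced here for illustration.

```r
# Counts copied from the table (rows: observed, columns: predicted)
confusion <- matrix(c(858, 333,
                      532, 251),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(observed = c("0", "1"),
                                    predicted = c("0", "1")))

n_total   <- sum(confusion)        # all classified votes
n_correct <- sum(diag(confusion))  # observed 0 & predicted 0, plus observed 1 & predicted 1
accuracy  <- n_correct / n_total
round(accuracy, 3)
```

The diagonal cells are the correct predictions (858 + 251 = 1109), so the share correctly classified is 1109/1974.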
Missingness: Multiple imputation
Reported in article: number of observations = 1507, β̂=0.42 (0.15)
Table: Estimated coefficient with multiple imputation, model 4
Columns: iterations (m)

seed          m = 50         m = 75        m = 100
1234   0.257 (0.131)  0.225 (0.128)  0.241 (0.130)
1      0.220 (0.129)  0.221 (0.131)  0.232 (0.133)
555    0.213 (0.130)  0.218 (0.129)  0.233 (0.131)
1989   0.227 (0.131)  0.229 (0.135)  0.249 (0.132)
Other course policies to consider
Absences for religious holidays are excused
Talk to me ASAP if you have any illness or family emergencies
All students with special accommodations should notify me as soon as possible
  - Documentation from the Trinity Office of Disability Services is required
The schedule posted on the syllabus is tentative and subject to change
Reminder: Approach Toward Learning
Preparation + synthesis + practice = learning
Individual preparedness: Read and review lectures before class
In class:
  - Discussion and Q&A on important concepts
  - Tutorial: Advanced theoretical problems
Office hours: Review and correct mistakes
Problem sets: Individual homework assignments
Exam and replication: Showcase knowledge
Review: Last Term Final
This term: Extending Modelling & Estimation
What is a model?
How do we estimate its parameters?
What are the properties of the estimator?
Use what we learned, and extend it to non-continuous outcomes
Social Science and Parametric Models
Goal of social science is parsimonious explanation of social
phenomena
  - Parsimonious because we can never explain every detail
  - Explanation because we want more than mere description
Compare with non-parametric approach
Non-parametric regression smoother
[Figure: 2021-2022 Sinn Fein polls with a lowess nonparametric fit. x-axis: Date (2021 to 2022); y-axis: Percent of Respondents that Support Party (ticks from 0.26 to 0.32)]
How did I do that in R?
    # load data
    polling_data <- read.csv("https://raw.githubusercontent.com/ASDS-TCD/StatsII_Spring2023/main/datasets/long_IPI.csv")
    attach(polling_data)
    Date <- as.Date(Date, "%d/%m/%Y")
    # open up non-parametric plot
    pdf("../graphics/nonparameter_example.pdf", width = 9.25)
    plot(Date, SF, type = "n",
         main = "2021-2022 Sinn Fein polls\n with a lowess nonparametric fit",
         xlab = "Date", ylab = "Percent of Respondents that Support Party")
    points(Date, jitter(SF), pch = 1, cex = .6, col = "red")
    lines(lowess(SF ~ Date, f = 1/10), col = "blue")
    abline(0, 0)
    dev.off()
    # Open an empty plot: type = "n" suppresses points and lines but
    # scales axes correctly for the x and y variables.
    # The "\n" in the main title is the line-break command to split
    # the title across two lines.
    # Plot the points, adding a little noise to reduce overprinting of
    # data, use plot character 1 (open circle), set the size
    # to .6 of normal and color the points red
Non-parametric Models: Virtues and Vices
Benefits:
Very flexible, can fit any pattern of data
Makes minimal (virtually no) assumptions about data
Can reveal unexpected patterns and departures from linear
assumptions
Drawbacks:
Too flexible, sensitive to overfitting
Without parameters there is no simple interpretation of
effects
Hard to incorporate substantive theory and tests
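The overfitting drawback can be illustrated with a small simulation. The sketch below assumes simulated data whose true relationship is a straight line, and compares two lowess spans; the roughness measure (sum of squared second differences of the fitted curve) and the helper name `rough` are ad hoc choices made for this example only.

```r
set.seed(1)
x <- seq(0, 10, length.out = 200)
y <- 0.5 * x + rnorm(200)             # truth: a straight line plus noise

fit_wide   <- lowess(x, y, f = 2/3)   # wide span (the default): smooth curve
fit_narrow <- lowess(x, y, f = 1/20)  # narrow span: the curve chases noise

# Ad hoc roughness measure: sum of squared second differences of the fit
rough <- function(fit) sum(diff(fit$y, differences = 2)^2)
c(wide = rough(fit_wide), narrow = rough(fit_narrow))
```

With the narrow span the fitted curve wiggles far more than the wide-span fit, even though the underlying relationship is linear, and there is no single parameter to interpret in either case.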
A non-parametric future?
A great deal of research on modern nonparametric methods
is going on, lots of new developments
But for social scientists, perhaps not the wave of the future
The reason is that parametric models can do a lot for us
What is a parametric model?
We begin with specification of a specific distribution
describing behaviour under study
Specification requires theoretical understanding
Specification also requires making assumptions explicit
While this places a considerable burden on our theory, it
forces us to confront limits of our knowledge and helps
avoid making implicit and unwarranted assumptions
Specification should make our assumptions clear to all,
including ourselves
Specification
We are concerned with the estimation of parametric models of the form:
$y_i \sim f(\theta, x_i)$
where:
$\theta$ is a vector of parameters
$x_i$ is a vector of exogenous characteristics of the $i$th observation
The specific functional form, $f$, provides an almost unlimited choice of specific models
Examples of specific models
Poisson:
$y_i \sim f(k; \lambda) = \Pr(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
where:
$k$ is the # of occurrences ($k = 0, 1, 2, \ldots$)
$e$ is Euler’s number ($e = 2.71828\ldots$)
$!$ is the factorial function
$\lambda$ is a positive real number, equal to the expected value of $X$ and also to its variance
$\lambda = E(X) = \mathrm{Var}(X)$
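As a quick check of the Poisson formula, the sketch below evaluates it by hand and compares it with R's built-in `dpois()`, then uses simulated draws to illustrate that the mean and variance are both close to λ. The value λ = 3 is an arbitrary assumption.

```r
lambda <- 3          # hypothetical rate, chosen for illustration
k <- 0:10

pmf_formula <- lambda^k * exp(-lambda) / factorial(k)  # formula written out
pmf_builtin <- dpois(k, lambda)                        # built-in Poisson pmf
all.equal(pmf_formula, pmf_builtin)

set.seed(42)
draws <- rpois(1e5, lambda)
c(mean = mean(draws), variance = var(draws))  # both should be near lambda
```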
Estimation: Maximum Likelihood
Likelihood: proportional to probability of observing data,
treating parameters of distribution as variables and data as
fixed (and assuming independent observations)
$L(\theta \mid Y) \propto \prod_{i=1}^{N} p(Y_i \mid \theta)$
Maximum likelihood estimate is that value of parameter θ
for which likelihood of observed sample is a maximum
Alternatively, ML estimate is mode of likelihood function
ML estimator turns out to have several useful properties, as
we shall see
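A minimal sketch of maximum likelihood in practice, assuming simulated Poisson counts: the log-likelihood is the sum of log probabilities of the observations, and we search numerically for the λ that maximises it. For the Poisson the maximiser is known to be the sample mean, which gives a convenient check. The true λ = 4 and the search interval are assumptions.

```r
set.seed(2023)
y <- rpois(500, lambda = 4)   # simulated counts; true lambda = 4 is an assumption

# Log-likelihood of an independent Poisson sample, as a function of lambda
loglik <- function(lambda) sum(dpois(y, lambda, log = TRUE))

# Search for the value of lambda with the highest log-likelihood
fit <- optimize(loglik, interval = c(0.01, 20), maximum = TRUE)
c(mle = fit$maximum, sample_mean = mean(y))
```

Working with the log of the likelihood turns the product over observations into a sum, which is numerically more stable and has the same maximiser.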
We aren’t doing Bayesian inference!
Big difference from what we’re doing with MLE!
Given that $y \sim p(y, \theta)$, how can we make inferences about the value of $\theta$?
The Bayesian approach reverses the usual probability problem (given $\theta$, what can we say about the distribution of $y$?)
  - Sometimes called inverse probability
Instead, it seeks the distribution $p(\theta \mid y)$: the distribution of the unknown parameter conditional on the observed data
Minimising least squares, assumptions?
We required no assumptions about the distribution of $y$, $x$ or $u$ in order to compute least squares coefficients
  - If all we care about is fitting data, then we can stop here
If we want to make inferences about $\theta$, however, we need some more assumptions
For example, what is the relationship between $\hat{\theta}$ and $\theta$ in the population model?
To this point, none whatsoever!
  - If we want to talk about $\theta$, as opposed to $\hat{\theta}$, we need some more assumptions
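To illustrate that computing least squares coefficients is pure algebra, the sketch below solves the normal equations on simulated data with deliberately non-normal errors and compares the result with `lm()`. The data-generating values are assumptions made for illustration.

```r
set.seed(7)
n <- 100
x <- runif(n, 0, 10)        # no distributional assumption is needed for x ...
u <- rexp(n) - 1            # ... or for u (here deliberately non-normal)
y <- 1 + 2 * x + u

X <- cbind(1, x)                             # design matrix with an intercept
theta_hat <- solve(t(X) %*% X, t(X) %*% y)   # solves (X'X) theta = X'y

cbind(normal_equations = as.numeric(theta_hat),
      lm = as.numeric(coef(lm(y ~ x))))
```

The coefficients are computed without reference to any distribution; assumptions only enter once we ask how the estimates relate to the population parameters.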
Reminder: Gauss-Markov Assumptions
1. $y_i = x_i\theta + u_i$
2. $x$ is fixed and full rank (linear independence)
3. $E(u_i) = E(\bar{u}) = \mu_u = 0$
4. $E(u_i^2) = \sigma^2$
5. $E(u_i u_j) = 0, \ \forall i \neq j$
6. $u \sim$ normal
Why make G-M assumptions?
In order to do inference, we must say how data are generated (Assumptions 1 and 6)
Must specify the parameterization of data generating process (Assumptions 1, 3–6)
Must prove that the estimator has desirable properties (1 is crucial, while 3–6 are necessary for hypothesis testing)
What to notice about OLS
Most of the assumptions are about the inherently unobservable term, $u_i$
Only assumption explicitly about $y_i$ is the first
Specification strongly encourages us to think of $u_i$ as “error”, rather than intrinsic variability in outcomes, $y_i$
Key idea: Minimize sum of squared errors
  - This only indirectly considers the data generating process that creates the observed $y_i$
Properties of LS estimators come as an after-thought
  - We must derive them for each case as assumptions differ (think GLS vs. OLS, for example)
New way of thinking: ML models
We specify the distribution of the outcome variable; this shift in focus is conceptual, but powerful
ML requires an explicit choice of distribution
  - While some may be ruled out easily, final choice is inherently subjective and uncertain
In defending our choices, we are forced to think through nature of data generating process, which is at core of our substantive theory
Outcome Variable
For least squares application, we wrote
$y_i = x_i\theta + u_i$
and
$u \sim N(0, \sigma^2 I)$
but we never said anything about distribution of $y_i$!
Seems odd, since we have more substantive knowledge about $y_i$ than we can possibly have about the unobservable $u_i$
Outcome Variable
Implicitly, however, we have said something about the distribution of $y_i$
Because $u_i$ is normally distributed, and because $y_i$ is a linear function of $u_i$ (with $x_i\theta$ acting as a constant, since $x$ is fixed), we can conclude that $y_i$ is also normally distributed
Recall a theorem from intro stats: If $u \sim N(\mu, \sigma^2)$ and $a$, $b$ are constants, then a linear function of $u$
$v = a + bu$
is normal also
$v \sim N(a + b\mu, b^2\sigma^2)$
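A quick simulation can check the rule that a linear function $v = a + bu$ of a normal variable has mean $a + b\mu$ and variance $b^2\sigma^2$. All numerical values below are hypothetical.

```r
set.seed(99)
mu <- 2; sigma <- 3      # hypothetical parameters of u
a <- 1; b <- 4           # hypothetical constants

u <- rnorm(1e5, mean = mu, sd = sigma)
v <- a + b * u           # linear function of a normal variable

c(sim_mean = mean(v), theory_mean = a + b * mu)   # theory: 1 + 4 * 2 = 9
c(sim_sd = sd(v), theory_sd = abs(b) * sigma)     # theory: 4 * 3 = 12
```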
Outcome Variable
We could write the usual linear model as
$y_i \sim N(\mu_i, \sigma^2)$
and
$\mu_i = x_i\theta$
Now notice that least squares model and this model are
exactly same thing
Hence we can express usual OLS model as an equivalent ML
model by focusing on distribution of yi and data generating
process, rather than on minimizing squared error
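The equivalence can be checked numerically. The sketch below, on simulated data, maximises the normal log-likelihood with `optim()` and compares the resulting coefficients with those from `lm()`. The data-generating values, starting values, and the log-scale parameterisation of σ are assumptions made for this illustration.

```r
set.seed(11)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

# Negative log-likelihood of y_i ~ N(theta0 + theta1 * x_i, sigma^2);
# sigma is parameterised on the log scale to keep it positive
negloglik <- function(par) {
  mu <- par[1] + par[2] * x
  -sum(dnorm(y, mean = mu, sd = exp(par[3]), log = TRUE))
}

ml_fit  <- optim(c(0, 0, 0), negloglik, method = "BFGS")
ols_fit <- lm(y ~ x)

rbind(ml = ml_fit$par[1:2], ols = as.numeric(coef(ols_fit)))
```

The coefficient estimates agree up to numerical tolerance. The variance estimates differ slightly, because ML divides the residual sum of squares by n while the usual OLS estimate divides by n minus the number of coefficients.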
Specifying Data Generating Process
Specification of distribution of y is most crucial and
controversial step in maximum likelihood modeling
First, it is a decision which is to some extent subjective
Second, it matters a lot for results (especially predicted
values)
Ex: Parliamentary Committees
A reasonable question to ask about parliamentary committees is how ‘productive’ they are
The number of bills voted out of a congressional committee gives some hints as to an appropriate distribution
Do committees vary in how much legislation they process?
  - How do we model this?
Ex: Parliamentary Committees
The number of bills is discrete and non-negative
  - This means we can rule out any distribution which is either continuous or which allows negative values
Are there any systematic features that account for this variation?
How do historical changes in committee rules or structure affect productivity within and between committees?
Thus, normal distribution cannot be a candidate for describing this process, since normal is defined over real number line from $-\infty$ to $+\infty$
Ex: Parliamentary Committees
Binomial, for example, might be one candidate
The committee considers $N$ bills each session
From these it reports $y$ bills out
If probability of reporting each bill is $p$, then probability model is
$y_i \sim \binom{n}{k} p^k (1 - p)^{n-k}$
where
  - $k$ successes (here, the $y$ bills reported out)
  - $n$ independent Bernoulli trials (here, the $N$ bills considered)
  - $\binom{n}{k} = \frac{n!}{k!(n-k)!}$
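The binomial formula can be verified against R's built-in `dbinom()`. In the sketch below, the number of bills considered (n = 20) and the reporting probability (p = 0.3) are hypothetical values chosen only for illustration.

```r
n <- 20      # hypothetical number of bills considered in a session
p <- 0.3     # hypothetical probability that each bill is reported out
k <- 0:n     # possible numbers of bills reported

pmf_formula <- choose(n, k) * p^k * (1 - p)^(n - k)   # formula written out
all.equal(pmf_formula, dbinom(k, size = n, prob = p))

sum(pmf_formula)                              # probabilities sum to 1
c(mean = sum(k * pmf_formula), np = n * p)    # expected count equals n * p
```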
Ex: Parliamentary Committees
But are bills really limited to $N$?
If we think supply of bills is effectively unlimited, because MEPs will find bills to sponsor if there is slack in system, then we might wish to model the process as... a Poisson distribution
$y_i \sim \frac{\lambda^k e^{-\lambda}}{k!}$
This distribution is also discrete and non-negative
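One way to see the link between the two candidates is that a binomial with many trials and a small success probability is well approximated by a Poisson with the same mean. The sketch below holds the expected count fixed at a hypothetical λ = 3 and lets the number of trials grow; `max_gap` is a helper name introduced here for illustration.

```r
lambda <- 3     # hypothetical expected number of bills reported
k <- 0:15

# Largest gap between Binomial(n, lambda / n) and Poisson(lambda) over k
max_gap <- function(n) {
  max(abs(dbinom(k, size = n, prob = lambda / n) - dpois(k, lambda)))
}

n_values <- c(10, 100, 10000)
gaps <- sapply(n_values, max_gap)
data.frame(n = n_values, max_difference = signif(gaps, 3))
```

As the pool of potential bills grows, the gap between the two distributions shrinks, which is one way to motivate the Poisson when the supply of bills is effectively unlimited.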
Issues with MLE
Choice of a particular distribution is not always clear
  - Yet that choice must be made, for without it there is no model
Some criticize ML for this
  - They point to the subjective and somewhat arbitrary choice of distribution, and to the fact that if you pick the wrong distribution you are estimating a misspecified model
This is a lot of assumptions, and perhaps social science theory is not up to the challenge
  - This is a valid concern
  - In an ideal world, we would have better knowledge of appropriate distribution and would not have so much discretion
Necessary Choices in Applied Stats
Let’s not delude ourselves into thinking that there’s an escape from these dilemmas of statistical modeling
Any statistical model must specify both structure and distribution of its variables
Those who rely on OLS are then actually doing ML, but are assuming that every model is a continuous, normal model
Surely it is preferable to adopt the most persuasive ML specification, even if it is subjective, than to always adopt this particular ML regression model regardless of the substance of the problem!
Class business
Read required (and suggested) online materials
Fork GitHub repository
These slides are available on the course website
Next time, we’ll talk about GLMs!

Más contenido relacionado

Similar a 1_Introduction_printable.pdf

Data Science Interview Questions | Data Science Interview Questions And Answe...
Data Science Interview Questions | Data Science Interview Questions And Answe...Data Science Interview Questions | Data Science Interview Questions And Answe...
Data Science Interview Questions | Data Science Interview Questions And Answe...
Simplilearn
 
Educational Psychology 565 Practice Quiz(use α = .05 unl.docx
Educational Psychology 565 Practice Quiz(use α = .05 unl.docxEducational Psychology 565 Practice Quiz(use α = .05 unl.docx
Educational Psychology 565 Practice Quiz(use α = .05 unl.docx
toltonkendal
 
1. What type of research uses numeric measurement data (Points .docx
1. What type of research uses numeric measurement data (Points  .docx1. What type of research uses numeric measurement data (Points  .docx
1. What type of research uses numeric measurement data (Points .docx
jackiewalcutt
 
www1.cs.columbia.edu
www1.cs.columbia.eduwww1.cs.columbia.edu
www1.cs.columbia.edu
butest
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
butest
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
butest
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
butest
 
2016 Symposium Poster - statistics - Final
2016 Symposium Poster - statistics - Final2016 Symposium Poster - statistics - Final
2016 Symposium Poster - statistics - Final
Brian Lin
 
unit classification.pptx
unit  classification.pptxunit  classification.pptx
unit classification.pptx
ssuser908de6
 
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
AASTHA76
 

Similar a 1_Introduction_printable.pdf (20)

Basic concepts of probability
Basic concepts of probability Basic concepts of probability
Basic concepts of probability
 
Database Research Principles Revealed
Database Research Principles RevealedDatabase Research Principles Revealed
Database Research Principles Revealed
 
Data Science Interview Questions | Data Science Interview Questions And Answe...
Data Science Interview Questions | Data Science Interview Questions And Answe...Data Science Interview Questions | Data Science Interview Questions And Answe...
Data Science Interview Questions | Data Science Interview Questions And Answe...
 
Educational Psychology 565 Practice Quiz(use α = .05 unl.docx
Educational Psychology 565 Practice Quiz(use α = .05 unl.docxEducational Psychology 565 Practice Quiz(use α = .05 unl.docx
Educational Psychology 565 Practice Quiz(use α = .05 unl.docx
 
1. What type of research uses numeric measurement data (Points .docx
1. What type of research uses numeric measurement data (Points  .docx1. What type of research uses numeric measurement data (Points  .docx
1. What type of research uses numeric measurement data (Points .docx
 
1624.pptx
1624.pptx1624.pptx
1624.pptx
 
Forms of learning in ai
Forms of learning in aiForms of learning in ai
Forms of learning in ai
 
www1.cs.columbia.edu
www1.cs.columbia.eduwww1.cs.columbia.edu
www1.cs.columbia.edu
 
Practice Test 1
Practice Test 1Practice Test 1
Practice Test 1
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
 
Probability
ProbabilityProbability
Probability
 
probability.pptx
probability.pptxprobability.pptx
probability.pptx
 
2016 Symposium Poster - statistics - Final
2016 Symposium Poster - statistics - Final2016 Symposium Poster - statistics - Final
2016 Symposium Poster - statistics - Final
 
unit classification.pptx
unit  classification.pptxunit  classification.pptx
unit classification.pptx
 
Intro to SPSS.ppt
Intro to SPSS.pptIntro to SPSS.ppt
Intro to SPSS.ppt
 
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
 
Paper sharing_Explaining Data-Driven Decisions made by AI Systems_The Counter...
Paper sharing_Explaining Data-Driven Decisions made by AI Systems_The Counter...Paper sharing_Explaining Data-Driven Decisions made by AI Systems_The Counter...
Paper sharing_Explaining Data-Driven Decisions made by AI Systems_The Counter...
 
Questions for R language.pdf
Questions for R language.pdfQuestions for R language.pdf
Questions for R language.pdf
 

Último

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
heathfieldcps1
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
negromaestrong
 

Último (20)

Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural ResourcesEnergy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
 
Asian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptxAsian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 

1_Introduction_printable.pdf

  • 1. Week 1 Intro & Stats I Review Applied Statistical Analysis II Jeffrey Ziegler, PhD Assistant Professor in Political Science & Data Science Trinity College Dublin Spring 2023
  • 2. Road map for today Welcome and introduction I About module, structure I Review of last term I Bridging terms Alternative derivations of least squares Frequentist model of likelihood Intro to GLMs & MLE Next time: Framework of generalised linear models (GLMs) By next week, please... I Fork GitHub repository I Read assigned chapters 1 53
  • 3. General Info About Course Instructor Jeffrey Ziegler, PhD Email zieglerj@tcd.ie In-Person Sessions 16:00 - 18:00 Mondays AP2.05 Office Hours W/Th 13:00-14:00 2 53
  • 4. Review of Tools: Necessity of R R is the statistical programming language Perform stats analysis, data manipulation, plotting, etc. Notes on R code: I ALWAYS comment your code Worry more about saying too little rather than too much I Use indentation to visually clarify blocks of code, such as multiple lines for one command or multiple commands that produce one logical step I Plan for your code to be run from source files Interactive analysis is great for exploration, terrible for (re)analysis, which your job and/or reputation may depend I Create command source files with all your analysis (.R file) I Final plots go to files, not screen (dev.off()) 3 53
  • 5. Review of Tools: GitHub, & LaTex LaTex is the word processor Input output, figures, and code from R Using Word will result in a deduction GitHub is how we’ll share our work with each other Fork repository Keep up-to-date Keep organised 4 53
  • 6. As a reminder, you can access the syllabus here... Direct link to syllabus, or find on course website 5 53
  • 7. Required Materials Texts All readings are provided, don’t buy books! R and LaTeX Should have installed Rstudio and TexStudio Should have installed 6 53
  • 8. Course Evaluation/Assessment Problem Sets (3/4): 50% Exam: 25% Replication: 25% 7 53
  • 9. Problem Sets (50%) Typically assigned every other week(ish), and you will generally have two weeks to do assignment (this will vary) Problem sets require R and should be written in LaTex (must include tables, figures, and code within text) All problem sets will be posted on GitHub Evaluated by me or Martyn I will publish correct answers each week, but look at others’ GitHubs so we can learn from each other The lowest PS grade will be dropped, so “I have been so busy with other classes” is not a legitimate reason to not turn in problem set! 8 53
  • 10. Exams (50%) In class: February, 27 Exam is cumulative Exam is multiple choice and open-response questions Exam is graded by me, instructor You will be allowed a formula sheet Make-up exams are only allowed by written approval 9 53
  • 11. Replication (25%): Example Does Having Daughters Cause Judges to Rule for Women’s Issues? Table: Number of Children and Girls for U.S. Courts of Appeals Judges Participating in Gender-Related Cases, 1996-2002 1 Number of Children 0 1 2 3 4 5 6 7 8 9 Democrat 12 13 33 24 15 4 0 1 0 1 Republican 13 8 44 30 15 7 3 0 1 0 Number of girls 0 1 2 3 4 5 6 7 8 9 Democrat 26 35 29 10 1 2 - - - - Republican 36 43 31 9 2 0 - - - - 1 Does Having Daughters Cause Judges to Rule for Women’s Issues? Adam Glynn & Maya Sen (AJPS, 2015) 10 53
  • 12. Data: Daughters. [Figure: Number of Girls by Partisan Leaning; x-axis: Quantity; y-axis: Conditional Percent; groups: Democrat, Republican]
  • 13. Data: Judge Demography
      Table: Demographics of U.S. Court of Appeal Judges who voted on gender-related cases (1996-2002)
                                   All  Democrats  Republicans    Women      Men
      Mean No. Children           2.47       2.40         2.54     1.58     2.66
      Mean No. Girls              1.24       1.33         1.16     0.71     1.34
      0 children                  0.11       0.12         0.11     0.29     0.08
      1 child                     0.09       0.13         0.07     0.21     0.07
      2 children                  0.34       0.32         0.36     0.26     0.36
      3 children                  0.24       0.23         0.25     0.13     0.26
      4 children                  0.13       0.15         0.12     0.08     0.15
      5 children or more          0.08       0.06         0.09     0.03     0.05
      Proportion Female           0.17       0.26         0.09     1.00     0.00
      Proportion Republican       0.54       0.00         1.00     0.29     0.59
      Proportion White            0.91       0.78         0.99     0.93     0.91
      Mean Year Born           1932.55    1931.23      1933.43  1938.57  1931.49
  • 14. Data: Judge Demography. [Figure: Percent of Female Judges by Party Affiliation; y-axis: Percent of Party (0.00 to 1.00); groups: Democrat, Republican]
  • 15. Data: Cases
      Table: Distribution of the number of gender-related cases heard per judge, 1996-2002
                     Min.  1st Qu.  Median   Mean  3rd Qu.   Max.
      All Judges     1.00     5.00    8.00  11.10    14.00  46.00
      Democrats      1.00     5.00    7.00  10.12    13.00  39.00
      Republicans    1.00     5.00    9.00  11.94    14.00  46.00
  • 16. Data: Cases. [Figure: Proportion of Cases Decided in a Feminist Direction, on a scale from 0.0 (less feminist) to 1.0 (more feminist); groups: Republicans, Democrats, All]
  • 17. Model. Predict the probability that a judge will vote in a feminist direction in any given gender-related case: $\Pr(y_i = 1) = \operatorname{logit}^{-1}(\beta_0 + \beta_k X_i)$, where $y_i$ are judge-level votes in individual cases and $X_i$ is a vector of individual-level predictors. Main covariate of interest is # of biological daughters, conditioned on total # of children (categorical variable).
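To make the inverse-logit link concrete, here is a minimal sketch on simulated data. The variable names, sample size, and coefficient values are assumptions chosen for illustration; none of them come from the replication files. It fits a logit model with `glm()` and then recovers the fitted probabilities by applying the inverse logit to the linear predictor by hand.

```r
# Simulated data: all names and numbers are illustrative assumptions
set.seed(1)
n <- 500
x <- rbinom(n, size = 1, prob = 0.5)      # hypothetical binary predictor
beta0 <- -0.3                             # hypothetical intercept
beta1 <- 0.4                              # hypothetical slope
p <- plogis(beta0 + beta1 * x)            # inverse logit: 1 / (1 + exp(-eta))
y <- rbinom(n, size = 1, prob = p)        # simulated binary outcome

# Fit the logit model
fit <- glm(y ~ x, family = binomial(link = "logit"))

# Recover Pr(y = 1) by hand from the estimated linear predictor
eta_hat <- coef(fit)[1] + coef(fit)[2] * x
p_hat <- 1 / (1 + exp(-eta_hat))

# The hand-computed probabilities should match glm's fitted values
all.equal(unname(p_hat), unname(fitted(fit)))
```

Coefficients are on the log-odds scale, so a probability statement always requires passing the linear predictor through the inverse logit.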
  • 18. Model: Base comparison
      # R code presented in AJPS online replication files
      library(Zelig) # zelig() comes from the Zelig package
      base_model <- zelig(progressive.vote ~ as.factor(girls)
                          + as.factor(child), model = "logit",
                          data = subset(women.cases, child < 5 & child > 0))
      summary(base_model) # number of observations = 1974

      Dependent variable: progressive.vote
      as.factor(girls)1    0.384∗∗∗ (0.128)
      Observations         1,974
      Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
  • 19. Model: Casewise delete
      ## case-wise delete to get same n in primary model ##
      # subset for judges with at least 1 child,
      # but less than 5 (n=2448)
      women.subset <- women.cases[women.cases$child < 5
                                  & women.cases$child > 0, ]
      # keep rows with a non-missing value for girls (n=1975)
      women.subset <- women.subset[complete.cases(women.subset$girls), ]
      # keep rows with a non-missing progressive.vote value
      # (n=1974)
      women.subset <- women.subset[complete.cases(
        women.subset$progressive.vote), ]
  • 20. Model: Casewise delete (same results)
      # re-run with subsetted data
      case_delete <- glm(progressive.vote ~ as.factor(girls)
                         + as.factor(child), family = binomial(link = "logit"),
                         data = women.subset)
      summary(case_delete) # number of observations = 1974

      Table: Re-run with subset
      Dependent variable: progressive.vote
      as.factor(girls)1    0.384∗∗∗ (0.128)
      Observations         1,974
      Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
  • 21. Diagnostics: Basic model
      ### check Pearson residuals ###
      sum(residuals(base_model, type = "pearson")^2)
      # check deviance
      pchisq(deviance(base_model), df.residual(base_model),
             lower.tail = FALSE)
      # resulting p-value = 0, not good model fit
  • 22. Diagnostics: Basic model
      # get preds for all subjects and take inverse logit
      # (ilogit() comes from the faraway package; base R's plogis() is equivalent)
      predValues <- ilogit(predict(base_model))
      # mean likelihood of judge voting for plaintiff
      mean(predValues) # (original mean lib vote = 0.433)
      # create binary outcome: predicted above or below the mean lib vote
      predBinary <- ifelse(predValues > 0.433, 1, 0)
      # create table to show predictive error (rows: observed, columns: predicted)
      table(observeBinary, predBinary)
      # 1109/1974 (56.2%) accurately predicted

      Table: Estimation of judge below or above mean lib vote
             0     1
      0    858   333
      1    532   251
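The accuracy figure on this slide can be checked directly from the four cell counts in the table. The sketch below rebuilds the confusion matrix by hand (the object names are assumptions, and rows are taken to be observed outcomes with columns as predictions, following the order of arguments in the `table()` call) and computes the share of correctly classified votes.

```r
# Confusion matrix from the slide: rows = observed, columns = predicted
conf_mat <- matrix(c(858, 333,
                     532, 251),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(observed = c("0", "1"),
                                   predicted = c("0", "1")))

correct <- sum(diag(conf_mat))   # cases where prediction matches observation
total <- sum(conf_mat)           # all classified votes
accuracy <- correct / total

correct
total
round(100 * accuracy, 1)
```

An accuracy only slightly above one half is consistent with the poor model fit flagged by the deviance check.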
  • 24. Missingness: Multiple imputation. Reported in article: number of observations = 1507, $\hat{\beta}$ = 0.42 (0.15)
      Table: Estimated coefficient with multiple imputation, model 4
                          iterations (m)
      seed     50              75              100
      1234     0.257 (0.131)   0.225 (0.128)   0.241 (0.130)
      1        0.220 (0.129)   0.221 (0.131)   0.232 (0.133)
      555      0.213 (0.130)   0.218 (0.129)   0.233 (0.131)
      1989     0.227 (0.131)   0.229 (0.135)   0.249 (0.132)
  • 25. Other course policies to consider. Absences for religious holidays are excused. Talk to me ASAP if you have any illness or family emergencies. All students with special accommodations should notify me as soon as possible (documentation from the Trinity Office of Disability Services is required). The schedule posted on the syllabus is tentative and subject to change.
  • 26. Reminder: Approach Toward Learning. Preparation + synthesis + practice = learning. Individual preparedness: read and review lectures before class. In class: discussion and Q&A on important concepts; tutorial on advanced theoretical problems. Office hours: review and correct mistakes. Problem sets: individual homework assignments. Exam and replication: showcase knowledge.
  • 27. Review: Last Term Final
  • 28. This term: Extending Modelling & Estimation. What is a model? How do we estimate its parameters? What are the properties of an estimator? Use what we learned, extend to non-continuous outcomes.
  • 29. Social Science and Parametric Models. Goal of social science is parsimonious explanation of social phenomena: parsimonious because we can never explain every detail, explanation because we want more than mere description. Compare with the non-parametric approach.
  • 30. Non-parametric regression smoother. [Figure: 2021-2022 Sinn Fein polls with a lowess nonparametric fit; x-axis: Date; y-axis: Percent of Respondents that Support Party]
  • 31. How did I do that in R?
      # load data
      polling_data <- read.csv("https://raw.githubusercontent.com/ASDS-TCD/StatsII_Spring2023/main/datasets/long_IPI.csv")
      attach(polling_data)
      Date <- as.Date(Date, "%d/%m/%Y")
      # open up non-parametric plot
      pdf("../graphics/nonparameter_example.pdf", width = 9.25)
      # Open an empty plot: type = "n" suppresses points and lines but
      # scales axes correctly for the x and y variables.
      # The "\n" in the main title is the line-break command to split
      # the title across two lines.
      plot(Date, SF, type = "n",
           main = "2021-2022 Sinn Fein polls \n with a lowess nonparametric fit",
           xlab = "Date", ylab = "Percent of Respondents that Support Party")
      # Plot the points, adding a little noise to reduce overprinting of
      # data, use plot character 1 (open circle), set the size
      # to .6 of normal and color the points red
      points(Date, jitter(SF), pch = 1, cex = .6, col = "red")
      # add the lowess fit, using 1/10 of the data in each local window
      lines(lowess(SF ~ Date, f = 1/10), col = "blue")
      abline(0, 0)
      dev.off()
  • 32. Non-parametric Models: Virtues and Vices. Benefits: very flexible, can fit any pattern of data; makes minimal (virtually no) assumptions about data; can reveal unexpected patterns and departures from linear assumptions. Drawbacks: too flexible, prone to overfitting; without parameters there is no simple interpretation of effects; hard to incorporate substantive theory and tests.
  • 33. A non-parametric future? A great deal of research on modern nonparametric methods is going on, with lots of new developments. But for social scientists, perhaps not the wave of the future. The reason is that parametric models can do a lot for us.
  • 34. What is a parametric model? We begin with specification of a specific distribution describing the behaviour under study. Specification requires theoretical understanding. Specification also requires making assumptions explicit. While this places a considerable burden on our theory, it forces us to confront the limits of our knowledge and helps avoid making implicit and unwarranted assumptions. Specification should make our assumptions clear to all, including ourselves.
  • 35. Specification. We are concerned with the estimation of parametric models of the form $y_i \sim f(\theta, x_i)$, where $\theta$ is a vector of parameters and $x_i$ is a vector of exogenous characteristics of the $i$th observation. The specific functional form, $f$, provides an almost unlimited choice of specific models.
  • 36. Examples of specific models. Poisson: $y_i \sim f(k; \lambda) = \Pr(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$, where $k$ is the # of occurrences ($k = 0, 1, 2, \ldots$), $e$ is Euler’s number ($e = 2.71828...$), $!$ is the factorial function, and $\lambda$ is a positive real number. $\lambda$ is equal to the expected value of $X$ and also to its variance: $\lambda = E(X) = \mathrm{Var}(X)$.
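As a quick check of the Poisson formula, the sketch below evaluates the probability mass function by hand and compares it with R's built-in `dpois()`. The rate `lambda <- 3` is an arbitrary value chosen for illustration. It also confirms numerically that the mean and variance both equal the rate.

```r
lambda <- 3    # hypothetical rate, chosen only for illustration
k <- 0:10      # counts at which to evaluate the pmf

# Poisson pmf written out from the formula: lambda^k * exp(-lambda) / k!
pmf_manual <- lambda^k * exp(-lambda) / factorial(k)
pmf_builtin <- dpois(k, lambda)
all.equal(pmf_manual, pmf_builtin)

# Mean and variance computed over a support wide enough
# that the omitted tail probability is negligible
k_all <- 0:200
pois_mean <- sum(k_all * dpois(k_all, lambda))
pois_var <- sum((k_all - pois_mean)^2 * dpois(k_all, lambda))
c(mean = pois_mean, variance = pois_var)
```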
  • 37. Estimation: Maximum Likelihood. Likelihood: proportional to the probability of observing the data, treating the parameters of the distribution as variables and the data as fixed (and assuming independent observations): $L(\theta \mid Y) \propto \prod_{i=1}^{N} p(Y_i \mid \theta)$. The maximum likelihood estimate is the value of the parameter $\theta$ for which the likelihood of the observed sample is a maximum. Alternatively, the ML estimate is the mode of the likelihood function. The ML estimator turns out to have several useful properties, as we shall see.
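A minimal sketch of maximum likelihood in practice, assuming simulated Poisson counts (the sample size and true rate are arbitrary choices for illustration). The log-likelihood is the sum of log densities over independent observations, and a one-dimensional optimiser searches for the rate that maximises it. For the Poisson, the maximiser is known analytically to be the sample mean, which gives a check on the numerical answer.

```r
set.seed(42)
y <- rpois(200, lambda = 4)   # simulated counts; true rate is an assumption

# Log-likelihood: log of the product of densities = sum of log densities
loglik <- function(lambda, y) {
  sum(dpois(y, lambda, log = TRUE))
}

# Search for the rate that maximises the log-likelihood
opt <- optimize(loglik, interval = c(0.01, 20), y = y,
                maximum = TRUE, tol = 1e-8)
lambda_hat <- opt$maximum

# Numerical MLE next to the analytical MLE (the sample mean)
c(numerical = lambda_hat, analytical = mean(y))
```

Working with the log-likelihood rather than the likelihood itself avoids numerical underflow from multiplying many small probabilities, and it leaves the location of the maximum unchanged.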
  • 38. We aren’t doing Bayesian inference! Big difference to what we’re doing with MLE! Given that $y \sim p(y, \theta)$, how can we make inferences about the value of $\theta$? The usual probability problem is: given $\theta$, what can we say about the distribution of $y$? The Bayesian approach is the reverse of this problem (sometimes called inverse probability): it seeks the distribution $p(\theta \mid y)$, the distribution of the unknown parameter conditional on the observed data.
  • 39. Minimising least squares, assumptions? We required no assumptions about the distribution of $y$, $x$ or $u$ in order to compute least squares coefficients. If all we care about is fitting data, then we can stop here. If we want to make inferences about $\theta$, however, we need some more assumptions. For example, what is the relationship between $\hat{\theta}$ and $\theta$ in the population model? To this point, none whatsoever! If we want to talk about $\theta$, as opposed to $\hat{\theta}$, we need some more assumptions.
  • 40. Reminder: Gauss-Markov Assumptions
      1. $y_i = x_i\theta + u_i$
      2. $x$ is fixed and full rank (linear independence)
      3. $E(u_i) = E(\bar{u}) = \mu_u = 0$
      4. $E(u_i^2) = \sigma^2$
      5. $E(u_i u_j) = 0, \ \forall i \neq j$
      6. $u \sim$ normal
  • 41. Why make G-M assumptions? In order to do inference, we must say how data are generated (Assumptions 1 and 6). Must specify the parameterization of the data generating process (Assumptions 1, 3–6). Must prove that the estimator has desirable properties (1 is crucial, while 3–6 are necessary for hypothesis testing).
  • 42. What to notice about OLS. Most of the assumptions are about an inherently unobservable term, $u_i$. The only assumption explicitly about $y_i$ is the first. The specification strongly encourages us to think of $u_i$ as “error”, rather than intrinsic variability in the outcomes, $y_i$. Key idea: minimize the sum of squared errors. This only indirectly considers the data generating process that creates the observed $y_i$. Properties of LS estimators come as an after-thought: we must derive them for each case as assumptions differ (think GLS vs. OLS, for example).
  • 43. New way of thinking: ML models. ML starts from the specification of the distribution of the outcome variable; this shift in focus is conceptual, but powerful. ML requires an explicit choice of distribution. While some may be ruled out easily, the final choice is inherently subjective and uncertain. In defending our choices, we are forced to think through the nature of the data generating process, which is at the core of our substantive theory.
  • 44. Outcome Variable. For the least squares application, we wrote $y_i = x_i\theta + u_i$ and $u \sim N(0, \sigma^2 I)$, but we never said anything about the distribution of $y_i$! This seems odd: we have more substantive knowledge about $y_i$ than we can possibly have about the unobservable $u_i$.
  • 45. Outcome Variable. Implicitly, however, we have said something about the distribution of $y_i$. Because $u_i$ is normally distributed, and because $y_i$ is a linear function of $u_i$ (shifted by $x_i\theta$, which is constant since $x$ is fixed), we can conclude that $y_i$ is also normally distributed. Recall a theorem from intro stats: if $u \sim N(\mu, \sigma^2)$ and $a$, $b$ are constants, then the linear function $v = a + bu$ is also normal, with $v \sim N(a + b\mu, b^2\sigma^2)$.
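The moments in the theorem, and its application to the linear model, can be written out in two short steps. This is a sketch of the standard argument using the same symbols as the slides: the first two lines give the mean and variance of a linear function of a normal variable, and the last line substitutes $a = x_i\theta$, $b = 1$, $\mu = 0$.

```latex
\begin{align*}
E(v) &= E(a + bu) = a + b\,E(u) = a + b\mu \\
\mathrm{Var}(v) &= \mathrm{Var}(a + bu) = b^2\,\mathrm{Var}(u) = b^2\sigma^2 \\
y_i = x_i\theta + u_i,\quad u_i \sim N(0, \sigma^2)
  &\;\Longrightarrow\; y_i \sim N(x_i\theta,\ \sigma^2)
\end{align*}
```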
  • 46. Outcome Variable. We could write the usual linear model as $y_i \sim N(\mu_i, \sigma^2)$ with $\mu_i = x_i\theta$. Now notice that the least squares model and this model are exactly the same thing. Hence we can express the usual OLS model as an equivalent ML model by focusing on the distribution of $y_i$ and the data generating process, rather than on minimizing squared error.
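The equivalence can be illustrated numerically. The sketch below simulates data from a hypothetical linear model (all coefficient values are assumptions), fits it once by least squares with `lm()` and once by maximising the normal likelihood with `optim()`, and compares the coefficient estimates.

```r
set.seed(7)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 1.5)   # hypothetical data generating process

# Least squares fit
ols <- lm(y ~ x)

# Negative log-likelihood of y_i ~ N(theta0 + theta1 * x_i, sigma^2).
# sigma is parameterised on the log scale so that it stays positive.
negloglik <- function(par) {
  mu <- par[1] + par[2] * x
  -sum(dnorm(y, mean = mu, sd = exp(par[3]), log = TRUE))
}

# Maximise the likelihood by minimising its negative
ml <- optim(c(0, 0, 0), negloglik, method = "BFGS")

# Coefficient estimates side by side
cbind(OLS = coef(ols), ML = ml$par[1:2])
```

The coefficient estimates agree up to optimiser tolerance. The two approaches differ only in the variance estimate: ML divides the residual sum of squares by $n$, while the usual OLS estimate divides by $n$ minus the number of coefficients.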
  • 47. Specifying Data Generating Process. Specification of the distribution of $y$ is the most crucial and controversial step in maximum likelihood modeling. First, it is a decision which is to some extent subjective. Second, it matters a lot for results (especially predicted values).
  • 48. Ex: Parliamentary Committees. A reasonable question to ask about parliamentary committees is how “productive” they are. The # of bills voted out of a congressional committee gives some hints as to an appropriate distribution. Do committees vary in how much legislation they process? How do we model this?
  • 49. Ex: Parliamentary Committees. The # of bills is discrete and non-negative. This means we can rule out any distribution which is either continuous or which allows negative values. Thus, the normal distribution cannot be a candidate for describing this process, since the normal is defined over the real number line from $-\infty$ to $+\infty$. Are there any systematic features that account for this variation? How do historical changes in committee rules or structure affect productivity within and between committees?
  • 50. Ex: Parliamentary Committees. The binomial, for example, might be one candidate. The committee considers $N$ bills each session. From these it reports $y$ bills out. If the probability of reporting each bill is $p$, then the probability model is $y_i \sim f(k; n, p) = \binom{n}{k} p^k (1 - p)^{n-k}$, where $k$ is the number of successes (bills reported out), $n$ is the number of independent Bernoulli trials (here, the $N$ bills considered), and $\binom{n}{k} = \frac{n!}{k!(n-k)!}$.
  • 51. Ex: Parliamentary Committees. But are bills really limited to $N$? If we think the supply of bills is effectively unlimited, because MEPs will find bills to sponsor if there is slack in the system, then we might wish to model the process as a Poisson distribution: $y_i \sim f(k; \lambda) = \frac{\lambda^k e^{-\lambda}}{k!}$. This distribution is also discrete and non-negative.
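One way to see the link between the two candidate distributions is that the Poisson is the limit of the binomial as the number of trials grows while the expected count stays fixed. The sketch below assumes a hypothetical average of five bills reported out per session and compares binomial probabilities, for increasing numbers of bills considered, against the Poisson probabilities with the same mean.

```r
lambda <- 5          # hypothetical mean number of bills reported out
k <- 0:15            # counts at which to compare the two distributions
pois <- dpois(k, lambda)

# Binomial with n trials and success probability lambda / n has mean lambda.
# As n grows, its probabilities approach the Poisson probabilities.
ns <- c(10, 100, 10000)
max_gap <- sapply(ns, function(n) {
  max(abs(dbinom(k, size = n, prob = lambda / n) - pois))
})

# Largest pointwise difference between binomial and Poisson for each n
data.frame(n = ns, max_gap = max_gap)
```

With a small, fixed pool of bills the binomial and Poisson give noticeably different probabilities, so the substantive question of whether supply is bounded matters for the choice of model.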
  • 52. Issues with MLE. The choice of a particular distribution is not always clear, yet that choice must be made, for without it there is no model. Some criticize ML for this: they point to the subjective and somewhat arbitrary choice of distribution, and to the fact that if you pick the wrong distribution you are estimating a misspecified model. This is a lot of assumptions, and perhaps social science theory is not up to the challenge. This is a valid concern. In an ideal world, we would have better knowledge of the appropriate distribution and would not have so much discretion.
  • 53. Necessary Choices in Applied Stats. Don’t delude ourselves into thinking that there’s an escape from these dilemmas of statistical modeling. Any statistical model must specify both the structure and the distribution of its variables. Those who rely on OLS are actually doing ML, but are assuming that every model is a continuous, normal model. Surely it is preferable to adopt the most persuasive ML specification, even if it is subjective, than to always adopt this particular ML regression model regardless of the substance of the problem!
  • 54. Class business. Read required (and suggested) online materials. Fork the GitHub repository. These slides are available on the course website. Next time, we’ll talk about GLMs!