How to Design Research from Ilm Ideas on Slide Share

What to research and how?
Research questions, sampling and all that
Faisal Bari
Associate Prof. of Economics, LUMS
Associate Fellow, IDEAS
(With contribution from Dr. Farooq Naseer, IDEAS)

Outline
• What does the TNA tell us?
• Framing of research issues, questions and tools. This appears simpler than it is... it demands reflexivity
• Sampling and related issues... all about power

Ilm-Ideas TNA: Research Tools
[Figure: bar chart, "Research tools used most often" (legend: 1 to 5; axis: 0 to 18). Tools shown: Focus Groups, Interviews, HH Surveys, School level survey, Case studies, Secondary analysis of data, Statistical models, Other]

Ilm-Ideas: Tools Required
[Figure: bar chart, Priority vs Non-Priority counts by research task]
• Conducting a research needs assessment and/or defining research objectives: 11 Priority, 6 Non-Priority
• Identifying priority research questions: 13 Priority, 4 Non-Priority
• Selecting research sites and developing criteria for the selection: 12 Priority, 5 Non-Priority
• Selecting and justifying the sampling strategy and target numbers: 15 Priority, 2 Non-Priority
• Sampling – selecting the research target group: 13 Priority, 4 Non-Priority
• Conducting desk research to identify good practice examples within and outside the country of similar researches undertaken: 11 Priority, 6 Non-Priority
• Developing research indicators: 14 Priority, 3 Non-Priority
• Developing research tools and instruments: 14 Priority, 3 Non-Priority
• Piloting the research instruments: 10 Priority, 7 Non-Priority
• Conducting qualitative and quantitative research tools and instruments: 15 Priority, 2 Non-Priority
• Management of data collection/fieldwork, including the control, supervision and debriefings of field workers/interviewers: 14 Priority, 3 Non-Priority
• Use of data analysis software and systems: 17 Priority, 0 Non-Priority
• Conduct data interpretation and analysis: 14 Priority, 3 Non-Priority
• Report writing: 13 Priority, 4 Non-Priority

Research Capacity
[Figure: bar chart, "Organizational Capacity - Research": High, Medium and Low counts by research task]
• Conducting research needs assessment and/or defining research objectives: 8 High, 8 Medium, 1 Low
• Identifying priority research questions: 7 High, 10 Medium, 0 Low
• Selecting research sites and developing criteria for the selection: 7 High, 8 Medium, 2 Low
• Selecting and justifying the sampling strategy and target numbers: 6 High, 8 Medium, 3 Low
• Sampling – selecting the research target group: 7 High, 8 Medium, 2 Low
• Conducting desk research to identify good practice examples: 10 High, 5 Medium, 2 Low
• Developing research indicators: 7 High, 8 Medium, 2 Low
• Developing qualitative and quantitative research tools and instruments: 10 High, 7 Medium, 0 Low
• Piloting the research instruments: 12 High, 3 Medium, 2 Low
• Conducting qualitative and quantitative research tools and instruments: 10 High, 5 Medium, 2 Low
• Management of data collection/fieldwork, including the control, supervision and debriefings of field workers/interviewers: 10 High, 5 Medium, 2 Low
• Use of data analysis software and systems: 3 High, 7 Medium, 7 Low
• Conduct data interpretation and analysis: 9 High, 5 Medium, 3 Low
• Report writing: 11 High, 4 Medium, 1 Low

Main Challenges in Policy Research
• Getting concerned institutions engaged and motivated
• Data management
• Interpreting data
• Report writing
• Availability of updated data
• Accessing policy documents
• Low experience/expertise in conducting policy research
• Access and availability of public expenditure documents
• Discrepancy in government data/inaccurate govt. data
• Dearth of qualitative research experts in the country
• Lack of interest within policy circles
• Shortage of sector experts
• Community based research
• Sometimes funding agency and govt. interests don’t match

Framing Research
• Can you tell whether you are drinking Coca Cola?
• For a single person: coke or not
• For a single person: coke or other colas
• For many people: coke or not
• For many people: coke or other colas
• Trivial? Think of a cancer treatment: a total cure for 10% of patients versus a 50% improvement (but no cure) for 50% of patients

Framing Research: Examples
• Public-Private Partnerships in Education: Adopt-a-School programme
• Importance and need: Article 25-A and quality issues
• Variation in legal frameworks: Punjab and Sindh
• Variation in models: PEN, CARE, SEF
• Variation across time: do models mature? What is the exit strategy?

• Remedial education for teachers (will come back to this at the end too)
• DSD reports, PEC results... content knowledge of teachers is a significant issue
• How to remedy that? CPD already in place
• Something that is scalable also
• Using DTEs to reach teachers (Maths and Science)
• Use technology to reduce cost

Framing Research: Examples MFN
• Post-fact impact evaluation... one way: the MFN paper
• Introductory paras set the context and question
• Issue of composite effect... rather than isolating contributions. Child friendly (teacher training, materials, parental involvement)... better learning

• Propensity score matching (not the gold standard... but the best available here)
• Two-stage matching: schools and then children (need both school and children/family characteristics)
• School-level matching: geography, medium, level of school

• Within school blocks... child matching
• Robustness
• Children joining or dropping out... selection bias
• Treatment and control children... good match on average
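The child-matching step can be sketched in a few lines. This is only a minimal illustration, not the method used in the paper: the child records, school blocks and propensity scores are hypothetical, and the caliper (maximum allowed score gap) and matching with replacement are assumptions added for the example. In practice the score would come from a model of programme participation on child and family characteristics.

```python
# Each child: (child_id, school_block, treated?, propensity_score).
# All values are hypothetical; school blocks are assumed to be already matched.
children = [
    ("c1", "A", True, 0.62), ("c2", "A", True, 0.35),
    ("c3", "A", False, 0.60), ("c4", "A", False, 0.30), ("c5", "A", False, 0.90),
    ("c6", "B", True, 0.55),
    ("c7", "B", False, 0.50), ("c8", "B", False, 0.10),
]

def match_within_blocks(children, caliper=0.1):
    """Nearest-neighbour matching (with replacement) inside each school block."""
    matches = {}
    for cid, block, treated, score in children:
        if not treated:
            continue
        # Candidate controls: untreated children in the same school block.
        controls = [c for c in children if c[1] == block and not c[2]]
        if not controls:
            continue
        best = min(controls, key=lambda c: abs(c[3] - score))
        if abs(best[3] - score) <= caliper:  # drop poor matches
            matches[cid] = best[0]
    return matches

matches = match_within_blocks(children)
print(matches)
```

Comparing average outcomes of matched treated and control children then gives the estimated effect; checking that the two groups look alike on observed characteristics is the "good match on average" step.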

• Mining... Item Response Theory (IRT)
• Possibility of leakage (teacher transfers, student transfers)
• No non-cognitive testing... where gains might be large too
• Could we check if the effect was different on the weakest/strongest students?

• Tahir Andrabi and the recent education recovery paper
• Distance from the fault line as the independent variable
• How is that established? And what is its importance?
• The results are insightful... the ‘hey, wait a minute’ moment

Sampling Issues:
• Statistics Refresher: Summarizing data
• Sampling:
– Minimizing error
– Representativeness
• Hypothesis testing
• Power

Data: Summarizing
• Variation is what we study: variation is King
• Statistics helps us summarize data by using two important features of a dataset:
– the average (mean, center)
• what is the average age of participants in this room?
• Is it important?... not only a technical issue (The Deer Hunter)
– the variance (variability, spread)
• by how much does age vary across participants?
• Again... is it important, and when? (50 or 0/100)
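The "50 or 0/100" point can be made concrete with a small sketch. It assumes the contrast is between a group where everyone scores 50 and a group split evenly between 0 and 100; the data are hypothetical. Both groups have the same average, so only the variance tells them apart.

```python
from statistics import mean, pvariance

# Hypothetical scores: both groups average 50, but the spread differs completely.
everyone_at_50 = [50] * 10
half_0_half_100 = [0] * 5 + [100] * 5

for name, scores in [("all 50", everyone_at_50), ("0/100 split", half_0_half_100)]:
    # pvariance treats the list as the whole population (divides by n).
    print(f"{name}: mean = {mean(scores)}, variance = {pvariance(scores)}")
```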

Population and Sample
• Measuring the population gives us the truth!
(assuming there is no measurement error)
– But we usually cannot survey the entire
population
– Hence we must draw a sample
• How do we choose the sample?
• How large should the sample be?

Sample
• The sample must be representative of the population:
– Draw a random sample
– Jute example, skulls, Indian census
• But still, the sample is not a fixed subset of the population, so each sample will be different!
• This is called “sampling error.” How to reduce it?
– Draw a larger sample.
– But how large? (Depends on the hypothesis of interest and the sampling error... we want to maximize the “power” to reject incorrect hypotheses)
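Sampling error can be seen directly by drawing many samples from the same population and watching how much the sample average moves. The sketch below uses a made-up population of ages (uniform between 20 and 60) and arbitrary sample sizes of 25 and 400; none of these numbers come from the slides.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: ages of 10,000 people, uniform between 20 and 60.
population = [rng.uniform(20, 60) for _ in range(10_000)]

def spread_of_sample_means(n, draws=500):
    """Draw `draws` random samples of size n; return the std. dev. of their means."""
    means = [fmean(rng.sample(population, n)) for _ in range(draws)]
    return pstdev(means)

small = spread_of_sample_means(25)
large = spread_of_sample_means(400)
print(f"n=25:  sample means vary with std. dev. {small:.2f}")
print(f"n=400: sample means vary with std. dev. {large:.2f}")
```

The larger sample gives averages that cluster much more tightly around the population average, which is what "reducing sampling error" means here.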

Simple Random Sampling
• List every individual in the population of interest (population size: N)
• Decide on a sample size based on ‘power’ calculations (sample size: n < N)... to be discussed
• Randomly pick n individuals from the population such that each individual has an equal chance of being picked
• Examples:
– Toss a coin
– Draw lots out of a basket
– Use computer software
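As one way of "using computer software", the draw can be done with Python's standard library. The sampling frame and the sample size of 50 below are hypothetical placeholders; in a real study the frame is the actual population list and n comes from a power calculation.

```python
import random

rng = random.Random(42)  # fixed seed, for a reproducible draw

# Hypothetical sampling frame: every individual in the population, N = 1000.
population = [f"person_{i:04d}" for i in range(1000)]
n = 50  # sample size; in practice this comes from a power calculation

# random.sample draws without replacement, giving each individual
# the same chance (n / N) of ending up in the sample.
sample = rng.sample(population, n)
print(sample[:5])
```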

Stratified Random Sampling
• Mark separate sub-groups (or strata) in the population list before drawing a random sample from each
• Stratified Sampling
– For adequate representation of different sub-groups (i.e. strata) in the population
– For a given sample size, reduces the sampling error compared to un-stratified simple random sampling
• Trade-off between the cost of doing stratification and the smaller sample size needed
• Fraction sampled could be different across strata; improves across-group comparisons
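A stratified draw is a simple random sample taken separately inside each stratum. The sketch below assumes a hypothetical frame of 900 rural and 100 urban schools and made-up sampling fractions, with the smaller urban stratum deliberately over-sampled so that the two groups can be compared.

```python
import random
from collections import defaultdict

rng = random.Random(7)  # fixed seed, for a reproducible draw

# Hypothetical frame: 900 rural and 100 urban schools.
frame = [(f"school_{i}", "rural" if i < 900 else "urban") for i in range(1000)]

# Sampling fraction per stratum; urban schools are deliberately over-sampled.
fractions = {"rural": 0.05, "urban": 0.30}

# Group the frame by stratum, then draw a simple random sample within each.
by_stratum = defaultdict(list)
for unit, stratum in frame:
    by_stratum[stratum].append(unit)

sample = {
    stratum: rng.sample(units, round(len(units) * fractions[stratum]))
    for stratum, units in by_stratum.items()
}
for stratum, units in sample.items():
    print(stratum, len(units))
```

When the fractions differ across strata, population-level averages need to be weighted back by each stratum's share of the population.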

Two Nice Results
• Before we turn to hypothesis testing and the concept of statistical power, it is important to recognize that the sample average behaves well in large samples
• Law of Large Numbers
– The sample average will approach the true
population average as the sample size increases
• Central Limit Theorem
– The sample average will tend to be normally
distributed, around the true population average
value, as the sample size increases
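Both results can be checked by simulation. The sketch below assumes a deliberately skewed population (exponential, with true mean 1 and standard deviation 1) to show that the sample average still behaves as the two theorems say; the sample sizes and number of repetitions are arbitrary choices.

```python
import random
from statistics import fmean

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical skewed population: exponential with true mean 1 and std. dev. 1.
TRUE_MEAN, TRUE_SD = 1.0, 1.0

def sample_mean(n):
    """Average of n fresh draws from the population."""
    return fmean(rng.expovariate(1.0) for _ in range(n))

# Law of Large Numbers: the sample average closes in on the true mean.
for n in (10, 1_000, 100_000):
    print(f"n={n:>6}: sample mean = {sample_mean(n):.3f}")

# Central Limit Theorem: under a normal distribution about 95% of draws fall
# within 1.96 standard errors of the mean, so check how often sample means do.
n, draws = 200, 2_000
se = TRUE_SD / n ** 0.5
inside = sum(abs(sample_mean(n) - TRUE_MEAN) <= 1.96 * se for _ in range(draws))
coverage = inside / draws
print(f"share of sample means within 1.96 SE: {coverage:.3f}")
```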

Normal Distribution
Mean: $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$
St. dev: $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2}$

Hypothesis Testing
• Suppose the average pre-training knowledge of M&E
in the population is 3/10 points on a standardized
test
• How can we empirically test whether this course
improves M&E knowledge?
• In statistical terms, this test can be stated as follows:
– H0 or the null hypothesis: This hypothesis states what you
would like to disprove i.e. “no effect”.
– H1 or the alternative hypothesis: The course improves
M&E knowledge i.e. “positive effect”.

Hypothesis Testing
• Ex-post, administer the test to multiple cohorts of course participants, or use statistical theory to decide based on just one cohort
• When is the average test score of course participants in a cohort “significantly” (i.e. statistically) higher than 3?
• That is, allowing for sampling error, when can we be “confident” that we are observing a real improvement in M&E scores?
• Depends on the sampling error in the average test score
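The one-cohort version of this test can be sketched as follows. The cohort scores are simulated and purely hypothetical, and the test uses a normal approximation; for a small cohort a t distribution would be the more exact reference. The question is how many standard errors the cohort average lies above the null value of 3.

```python
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(3)  # fixed seed so the run is repeatable

# Hypothetical post-course scores (0-10 scale) for one cohort of 40 participants.
scores = [min(10.0, max(0.0, rng.gauss(3.8, 1.5))) for _ in range(40)]

mu_0 = 3.0                               # population average under H0
se = stdev(scores) / len(scores) ** 0.5  # estimated sampling error of the mean
z = (mean(scores) - mu_0) / se           # distance from H0 in standard errors

# One-sided p-value: chance of a z this large if the course had no effect.
# Normal approximation; a t distribution would be more exact for small cohorts.
p_value = 1 - NormalDist().cdf(z)
print(f"mean = {mean(scores):.2f}, z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value (conventionally below 0.05) means the observed average would be unlikely under "no effect", so H0 is rejected in favour of a positive effect.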

Hypothesis Testing
• Suppose you want to test a promising intervention designed to improve (M&E) education. Question: Is the intervention (“treatment”) effective?

Statistical Power
• The power of a test is the probability of correctly rejecting the null hypothesis, that is, rejecting it when it is false
• In other words, power is the probability of correctly declaring the treatment as beneficial
• Hence, Statistical Power = 1 – Prob(Type-II error)

Importance of getting power right
Testing a new ‘miracle’ cure for cancer
– Power too low; missed a large treatment effect
– Power too high; wasted resources in doing a large study to
declare a tiny, clinically irrelevant effect as statistically
significant
– Power just right; have a good chance of detecting
reasonably sized effects, but not tiny ones

Power: Main Ingredients
For a given significance level, power depends on
the following:
1. Sample Size
2. Assumed Effect Size under H1
3. Variance of outcome in the study population
4. Proportion of sample in T vs C
5. Clustering
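The first four ingredients can be put into a single approximate formula. The sketch below is one common textbook approximation, not necessarily the calculation used in the workshop: it assumes a two-sided test comparing treatment and control means, a normal approximation, and individual-level randomization (no clustering). The function name, parameter names and all numbers are illustrative.

```python
from statistics import NormalDist

def power(n, effect, sd, share_treated=0.5, alpha=0.05):
    """Approximate power of a two-sided test comparing treatment and control means.

    Normal approximation, individual-level randomization (no clustering).
    """
    n_t = n * share_treated
    n_c = n * (1 - share_treated)
    se = sd * (1 / n_t + 1 / n_c) ** 0.5          # std. error of the difference
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    return NormalDist().cdf(effect / se - z_crit)

base = power(n=200, effect=0.3, sd=1.0)
print(f"baseline:            {base:.2f}")
print(f"larger sample:       {power(n=800, effect=0.3, sd=1.0):.2f}")
print(f"larger effect:       {power(n=200, effect=0.6, sd=1.0):.2f}")
print(f"noisier outcome:     {power(n=200, effect=0.3, sd=2.0):.2f}")
print(f"unequal T/C (90/10): {power(n=200, effect=0.3, sd=1.0, share_treated=0.9):.2f}")
```

Power rises with sample size and effect size, and falls with outcome variance and with an unbalanced split between treatment and control.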

Power: Sample Size
• Increasing the sample size reduces the
sampling error (i.e. sample-to-sample
variation) in the sample average
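Turning this around gives the usual planning question: how large a sample is needed to reach a target power? The sketch below uses a standard normal-approximation formula for comparing two means with an equal treatment/control split; the 80% power target, 5% significance level and effect sizes are conventional but assumed values, and `required_n` is a name made up for this example.

```python
from math import ceil
from statistics import NormalDist

def required_n(effect, sd, power=0.80, alpha=0.05):
    """Total sample size for a 50/50 treatment-control split (normal approximation)."""
    z = NormalDist().inv_cdf
    per_arm = 2 * ((z(1 - alpha / 2) + z(power)) * sd / effect) ** 2
    return 2 * ceil(per_arm)

for effect in (0.5, 0.25, 0.1):
    print(f"effect of {effect} s.d.: total n = {required_n(effect, sd=1.0)}")
```

Halving the effect size to be detected roughly quadruples the required sample, which is why the assumed effect size matters so much for study cost.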

Variance
• The sampling variance of the sample average, $\sigma^2/n$, is directly proportional to the (“natural”) variance in the outcome variable in the population
• There is sometimes very little we can do to reduce the noise
• The underlying variance is what it is
• We can try to “absorb” variance:
– by controlling for other variables
Clustering:
• You want to know how close the upcoming
national elections will be
• Method 1: Randomly select 50 people from the
entire population
• Method 2: Randomly select 10 families, and ask
five members of each family their opinion
• Method 2 will yield relatively imprecise/noisy
estimates if the political opinion within families
does not tend to vary a lot (high “intra-cluster
correlation”)
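The loss of precision from clustering is often summarized by the design effect, 1 + (m − 1) × ICC, where m is the cluster size. The sketch below applies that standard formula to Method 2; the intra-cluster correlation values are hypothetical, and the function names are made up for this example.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from sampling whole clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent respondents the clustered sample is worth."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)

# Method 2: 10 families, 5 members each (50 interviews).
for icc in (0.0, 0.5, 0.9):  # hypothetical intra-cluster correlations
    ess = effective_sample_size(n_clusters=10, cluster_size=5, icc=icc)
    print(f"ICC = {icc}: 50 interviews are worth about {ess:.1f} independent ones")
```

If family members always agree (ICC of 1), the 50 interviews carry no more information than 10 independent ones, one per family.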

Sampling Frames for Examples Used
• For PPP
• For remedial education for teachers
• Why did MFN go the way he did?

And last but not least
• Happy hunting
• Thank you
