This document provides an overview of small area estimation techniques, also known as poverty mapping. The goal is to understand the spatial distribution of poverty in a country using statistical methods to combine limited household survey data with more comprehensive census data. The method involves identifying explanatory variables common to both datasets, estimating a model of expenditures using survey data, predicting expenditures in census data using the model, and simulating poverty measures at different geographic levels of disaggregation with standard errors.
1. An Overview of Small Area Estimation
(aka Poverty Mapping)
David Stifel
Lafayette College
IFPRI Addis Ababa
Central Statistical Agency
Addis Ababa, 29 May 2012
2. What is the goal?
• To understand the spatial distribution of poverty in a country / region.
3. What is the problem?
• Main source of information on distributional
outcomes (e.g. household surveys) permit only
limited disaggregation
o e.g. HICES/WMS – urban/rural within region
• Very large data sources (e.g. census) typically
collect very limited information on welfare
outcomes
o Usually no data on income or consumption at all
4. How to solve this problem?
1. Collect larger samples
• Expensive
• There is a quantity-quality trade-off
2. Combine limited information in census into some
sort of proxy of welfare (e.g. “basic needs index”,
factor analysis asset index, etc)
• ad hoc
• disputed
• interpretation?
5. How to solve this problem?
3. Use statistical, small-area estimation (SAE)
techniques
• Readily interpretable results
o Uses exactly the same concept of welfare as traditional
survey-based analysis
• Statistical precision can be gauged
• Encouraging results to date
6. SAE Poverty Maps
• Brainchild of…
o Peter Lanjouw (World Bank)
o Jean Lanjouw (UC Berkeley, deceased)
o Chris Elbers (Free University, Amsterdam)
o Jesko Hentschel (World Bank)
7. SAE Poverty Maps
Goal: To produce disaggregated estimates of welfare
that are accurate and easily calculated
• Called “Poverty Maps”, but not necessarily maps
• Highly disaggregated databases of welfare
• Poverty
• Inequality
• Average consumption
8. SAE Poverty Maps
Terminology: Map
• Mathematical term
Map from one set to another
• Geographical term
Graphically represent data using a map
We use both terms here.
9. Data Requirements
• Nationally or regionally representative
household budget survey
Does include household consumption
• National census
Does NOT include household consumption
• Comparable correlates of HH consumption in
both survey and census (causality does not matter)
• External data can also be merged with survey & census
(e.g. GPS recordings – meteorological data)
10. Poverty Mapping - Basics
1. Identify explanatory variables common to both
expenditure survey & census (Stage 0)
2. Estimate model of per capita (pc) or per adult equivalent (AE)
expenditures using expenditure survey at the lowest level of
representation – stepwise regression (Stage 1)
3. Predict pc expenditures at household level in
target data using the parameters from Stage 1
(Stage 2)
4. Calculate poverty (and/or) inequality measures
at desired level of disaggregation
11. Poverty Mapping - Basics
Estimate the following model in the sample (stepwise)…
(Stage 1)
$$\ln c_{ci} = X_{ci}^{\text{survey}} \beta + u_{ci}$$
Using the estimated parameters, predict in the population…
(Stage 2)
$$\ln \hat{c}_{ci} = X_{ci}^{\text{census}} \hat{\beta} + \hat{u}_{ci}$$
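The two stages above can be sketched in Python with NumPy. This is a minimal illustration on synthetic data, not the procedure used in the presentation: it fits plain OLS without stepwise selection and draws a homoskedastic normal error for the census households, both of which are simplifying assumptions. The function names `fit_stage1` and `predict_stage2` and all data values are hypothetical.

```python
import numpy as np

def fit_stage1(X_survey, ln_c_survey):
    """Stage 1: OLS of log consumption on survey covariates."""
    beta_hat, *_ = np.linalg.lstsq(X_survey, ln_c_survey, rcond=None)
    residuals = ln_c_survey - X_survey @ beta_hat
    return beta_hat, residuals

def predict_stage2(X_census, beta_hat, u_hat):
    """Stage 2: predicted log consumption, X * beta_hat + u_hat."""
    return X_census @ beta_hat + u_hat

rng = np.random.default_rng(0)
n_survey, n_census = 500, 5000
true_beta = np.array([9.0, 0.5, -0.3])  # hypothetical coefficients

# Synthetic survey: covariates and log consumption are both observed.
X_survey = np.column_stack([np.ones(n_survey), rng.normal(size=(n_survey, 2))])
ln_c_survey = X_survey @ true_beta + rng.normal(scale=0.4, size=n_survey)

# Synthetic census: the same covariates, but no consumption.
X_census = np.column_stack([np.ones(n_census), rng.normal(size=(n_census, 2))])

beta_hat, residuals = fit_stage1(X_survey, ln_c_survey)

# Assumption: errors are normal with the residual standard deviation.
sigma_hat = residuals.std(ddof=X_survey.shape[1])
u_hat = rng.normal(scale=sigma_hat, size=n_census)

c_census = np.exp(predict_stage2(X_census, beta_hat, u_hat))
```

The only link between the two datasets is the set of covariates, which is why they must be defined comparably in the survey and the census.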
12. Poverty Estimates
Use predicted values of expenditure (c) to
predict poverty measures (e.g. FGT measures)…
$$\hat{P}_{\alpha} = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{z - \hat{c}_{ci}}{z} \right)^{\alpha} \mathbf{1}(z > \hat{c}_{ci})$$
Run 100 simulations (draws from the error term
and β distributions), and report average poverty
measure & standard errors.
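As a small worked example of the FGT measures, the sketch below computes the headcount (α = 0), poverty gap (α = 1), and severity (α = 2) for four made-up consumption values. The function name `fgt` and the numbers are illustrative assumptions.

```python
import numpy as np

def fgt(c, z, alpha):
    """FGT poverty measure P_alpha for consumption values c and poverty line z."""
    c = np.asarray(c, dtype=float)
    poor = c < z                              # indicator 1(z > c)
    gap = np.where(poor, (z - c) / z, 0.0)    # normalized shortfall, 0 for the non-poor
    return float(np.mean(np.where(poor, gap ** alpha, 0.0)))

c_hat = np.array([50.0, 100.0, 150.0, 200.0])  # hypothetical predicted consumption
z = 100.0                                      # hypothetical poverty line

headcount = fgt(c_hat, z, 0)     # 0.25: one of four households is below z
poverty_gap = fgt(c_hat, z, 1)   # 0.125: shortfall of 0.5 averaged over four households
severity = fgt(c_hat, z, 2)      # 0.0625: squared shortfall of 0.25 averaged over four
```

A household exactly at the poverty line is counted as non-poor here because the indicator uses a strict inequality.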
13. Why include the predicted error?
• Because $X\hat{\beta}$ explains only a portion of the observed
consumption.
• This may be due to:
Unobserved factors which also explain the variation in the
observed consumption, but which are not included in the
model
Model misspecification
Measurement error in the observed consumption
To account for the first two factors, an estimate of the
error term is added to the predicted consumption.
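The effect of leaving out the error term can be illustrated with synthetic data. In the sketch below the true coefficients are taken as known, which isolates the role of the error, and all parameter values are hypothetical. Predicting with the systematic part alone compresses the consumption distribution, so with a poverty line below the median the headcount is understated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
beta0, beta1, sigma_u = 9.0, 0.5, 0.7   # hypothetical model parameters
z = np.exp(8.3)                         # hypothetical poverty line, below the median

x = rng.normal(size=n)
ln_c_actual = beta0 + beta1 * x + rng.normal(scale=sigma_u, size=n)

def headcount(ln_c, z):
    """Share of households with consumption below the poverty line."""
    return float(np.mean(np.exp(ln_c) < z))

p_actual = headcount(ln_c_actual, z)

# Prediction from the systematic part only: too little dispersion.
p_no_error = headcount(beta0 + beta1 * x, z)

# Prediction with a fresh draw of the error term added back.
u_draw = rng.normal(scale=sigma_u, size=n)
p_with_error = headcount(beta0 + beta1 * x + u_draw, z)
```

With these values `p_no_error` falls well below `p_actual`, while `p_with_error` lands close to it, even though no individual household's consumption is predicted any better.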
14. Actual vs. Predicted Expenditures
[Figure: cumulative distributions of actual and predicted annual per AE consumption. Vertical axis: share of population (0.0 to 1.0). Horizontal axis: annual per AE consumption (0 to 40,000), with z marked.]
15. Error Term
$$u_{ci} = \eta_{c} + \varepsilon_{ci}$$
Location component ($\eta_{c}$): Allows for spatial correlation
Household component ($\varepsilon_{ci}$): Allows for individual
differences in the error term (heteroskedasticity)
These error components are drawn from
distributions, the variances of which are functions of the
data.
So… although the heteroskedastic functional form is
assumed constant, the actual distribution is a function of
the data.
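A minimal sketch of drawing the two error components follows. It assumes normal distributions for both components and uses a made-up household-level standard deviation as a stand-in for a variance that is a function of the data. The function name `draw_errors` and all values are hypothetical.

```python
import numpy as np

def draw_errors(cluster_ids, sigma_eta, sigma_eps, rng):
    """Draw a location-plus-household error for every household.

    sigma_eps may be a scalar or a per-household array (heteroskedasticity).
    Returns the total error and the location part expanded to households.
    """
    cluster_ids = np.asarray(cluster_ids)
    clusters, idx = np.unique(cluster_ids, return_inverse=True)
    eta = rng.normal(scale=sigma_eta, size=clusters.size)  # one draw per location
    eps = rng.normal(size=cluster_ids.size) * sigma_eps    # one draw per household
    return eta[idx] + eps, eta[idx]

rng = np.random.default_rng(2)
cluster_ids = np.repeat(np.arange(200), 25)   # 200 locations, 25 households each

# Hypothetical household-specific standard deviations.
sigma_eps = 0.3 + 0.2 * rng.uniform(size=cluster_ids.size)

u, eta_part = draw_errors(cluster_ids, sigma_eta=0.25, sigma_eps=sigma_eps, rng=rng)
```

Households in the same location share one location draw, which is what induces correlation among their errors.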
16. Poverty Mapping - Basics
Stage 2 – Repeated simulations for different draws from the
distributions of β and distribution of…
$$u_{ci} = \eta_{c} + \varepsilon_{ci}$$
To get multiple distributions of predicted consumption…
$$\ln \hat{c}_{ci} = X_{ci}^{\text{census}} \hat{\beta} + \hat{u}_{ci}$$
For each simulation, calculate welfare indicators...
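The repeated-simulation step can be sketched as a loop that redraws the coefficients and both error components, recomputes the headcount for each area, and then summarizes across simulations. This is an illustration under stated assumptions: normal draws, a homoskedastic household error, and a coefficient covariance matrix supplied by hand. The function name `simulate_headcount` and all inputs are hypothetical.

```python
import numpy as np

def simulate_headcount(X, area_ids, cluster_ids, beta_hat, cov_beta,
                       sigma_eta, sigma_eps, z, n_sims, rng):
    """Mean and standard deviation across simulations of each area's headcount."""
    areas, area_idx = np.unique(area_ids, return_inverse=True)
    clusters, cl_idx = np.unique(cluster_ids, return_inverse=True)
    households_per_area = np.bincount(area_idx)
    results = np.empty((n_sims, areas.size))
    for r in range(n_sims):
        beta_r = rng.multivariate_normal(beta_hat, cov_beta)     # coefficient draw
        eta_r = rng.normal(scale=sigma_eta, size=clusters.size)  # location draws
        eps_r = rng.normal(scale=sigma_eps, size=X.shape[0])     # household draws
        ln_c = X @ beta_r + eta_r[cl_idx] + eps_r
        poor = (np.exp(ln_c) < z).astype(float)
        results[r] = np.bincount(area_idx, weights=poor) / households_per_area
    return areas, results.mean(axis=0), results.std(axis=0, ddof=1)

rng = np.random.default_rng(3)
n_hh = 6000
X = np.column_stack([np.ones(n_hh), rng.normal(size=n_hh)])  # synthetic census covariates
cluster_ids = np.repeat(np.arange(120), 50)                  # 120 clusters of 50 households
area_ids = cluster_ids // 40                                 # 3 areas of 40 clusters each
beta_hat = np.array([9.0, 0.5])                              # hypothetical Stage 1 estimates
cov_beta = np.diag([0.02 ** 2, 0.02 ** 2])                   # hypothetical covariance

areas, p_mean, p_se = simulate_headcount(
    X, area_ids, cluster_ids, beta_hat, cov_beta,
    sigma_eta=0.2, sigma_eps=0.5, z=np.exp(8.5), n_sims=100, rng=rng)
```

The mean across simulations serves as the point estimate for each area, and the standard deviation across simulations as its standard error.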
21. Sources of Error
1. Idiosyncratic Error
$E[P(x, \beta, u; z)]$ vs. $E[P(c; z)]$
Larger target sample → smaller error
Better prediction from $x_{ci}$ → smaller error
2. Model Error
$E[P(x, \hat{\beta}, \hat{u}; z)]$ vs. $E[P(x, \beta, u; z)]$
Careful specification of the model → smaller error
22. Sources of Error
3. Computation Error
Simulations generate computation error
More simulations → smaller error
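The link between the number of simulations and computation error can be illustrated with a toy experiment. Each simulated headcount is modeled here as a normal draw around 0.30, which is an assumption made only for this illustration. The spread of the simulation average shrinks roughly with the square root of the number of simulations.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulated_estimate(n_sims, rng):
    """Average of n_sims simulated headcounts (toy normal draws around 0.30)."""
    draws = rng.normal(loc=0.30, scale=0.05, size=n_sims)
    return draws.mean()

# Repeat the whole exercise 500 times to see how much the reported average varies.
spread_10 = np.std([simulated_estimate(10, rng) for _ in range(500)])
spread_1000 = np.std([simulated_estimate(1000, rng) for _ in range(500)])

ratio = spread_10 / spread_1000   # roughly sqrt(1000 / 10) = 10
```

Unlike idiosyncratic and model error, this component can be made as small as desired by spending more computation.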
23. Review of Poverty Mapping Basics
1. Identify explanatory variables common to both
expenditure survey & census (Stage 0)
2. Estimate model of pc expenditures using
expenditure survey at the lowest level of
representation (Stage 1)
3. Predict pc expenditures at household level in
target data using the parameters from Stage 1
(Stage 2)
4. Calculate poverty (and/or) inequality measures
at desired level of disaggregation