sisvsp2012_sessione9_giusti_marchetti_pratesi_

Official data and poverty indicators: small area estimates
in local governance

Monica Pratesi, Stefano Marchetti, Caterina Giusti, Nicola Salvati

Department of Statistics and Mathematics Applied to Economics, University of Pisa

SISVSP 2012
Rome, 19-20 April 2012

M. Pratesi (DSMAE, Pisa), Official data and poverty indicators, 19-20 April 2012

Structure of the Presentation

1 Motivation

2 Poverty indicators and SAE methods

3 Oversampling and Small Area Estimation: A Comparison

4 Application of small area M-quantile models to poverty mapping in Tuscany

5 Concluding remarks


Part I

Motivation



Problem: to estimate some key statistics for poverty at the small area level to
drive local governance
We focus on small area estimation of Laeken poverty indicators, such as head
count ratio and poverty gap

Proposed methodology
Use M-quantile models to estimate poverty indicators and to provide an
estimator of the corresponding mean squared errors

Opportunity
Comparing model-based estimates with direct estimates computed with an
EU-SILC oversampling of households



Available data to measure poverty and living conditions in Italy come mainly
from sample surveys, such as the Survey on Income and Living Conditions
(EU-SILC)
However, EU-SILC data can be used to produce accurate estimates only at
the NUTS 2 level (that is, regional level)
To satisfy the increasing demand from official and private institutions for
statistical estimates on poverty and living conditions in smaller
domains (LAU 1 and LAU 2 levels, that is, Provinces and Municipalities),
small area methodologies are needed
We focus on the estimation of poverty measures at the small area level. For
this purpose we use data coming from the EU-SILC survey 2008 and from the
Population Census 2001


Part II

Poverty indicators and SAE methods



Poverty Measures

Denoting by t the poverty line and by y a measure of welfare, the Foster et al.
(1984) poverty measures (FGT) for a small area d can be defined as

$$Z_d(\alpha, t) = N_d^{-1} \left[ \sum_{j \in s_d} z_{jd}(\alpha, t) + \sum_{k \in r_d} z_{kd}(\alpha, t) \right],$$

where for a generic unit i in area d

$$z_{id}(\alpha, t) = \left( \frac{t - y_{id}}{t} \right)^{\alpha} I(y_{id} \le t), \quad i = 1, \ldots, N_d$$

$z_{jd}(\alpha, t)$ is known for $j \in s_d$
$z_{kd}(\alpha, t)$ is unknown for $k \in r_d$ and should be predicted
Setting α = 0 defines the Head Count Ratio whereas setting α = 1 defines the
Poverty Gap.
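
As a minimal sketch of the FGT definition when the welfare variable is observed for every unit, the Python snippet below computes the HCR (α = 0) and the PG (α = 1). The function name `fgt`, the incomes and the poverty line are hypothetical illustration values, not data from the study.

```python
import numpy as np

def fgt(y, t, alpha):
    """FGT poverty measure for welfare values y, poverty line t and parameter alpha."""
    y = np.asarray(y, dtype=float)
    poor = y <= t  # indicator I(y <= t)
    # Relative shortfall (t - y) / t, set to zero for units above the line
    gap = np.where(poor, (t - y) / t, 0.0)
    # For alpha = 0 use the indicator directly, so we never rely on 0 ** 0
    z = poor.astype(float) if alpha == 0 else gap ** alpha
    return float(z.mean())

incomes = [5000.0, 8000.0, 12000.0, 20000.0]  # hypothetical incomes
line = 10000.0                                # hypothetical poverty line
hcr = fgt(incomes, line, 0)  # share of units at or below the line
pg = fgt(incomes, line, 1)   # mean relative shortfall over all units
```

Here two of the four units are poor, so the HCR is 0.5, while the PG averages the relative shortfalls 0.5 and 0.2 over all four units.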




The HCR indicator is a widely used measure of poverty because of its ease of
construction and interpretation: it is based on a count of the individuals
with income below the poverty line. At the same time, this indicator
treats all poor individuals as if they were in the same situation. For example, the
easiest way of reducing the headcount index is by targeting benefits to people
just below the poverty line, because they are the cheapest to
move across the line. Hence, policies based on the headcount index might be
sub-optimal.
For this reason we also obtain estimates of the PG indicator. The PG can be
interpreted as the average shortfall of poor people. It shows how much would
have to be transferred on average to the poor to bring their income up to
the poverty line.



M-quantile models

With regression models we model the mean of the variable of interest (y)
given the covariates (x)
A more complete picture is offered, however, by modeling not only the mean
of y given x but also other quantiles, such as the median and the
25th and 75th percentiles. This is known as quantile regression
M-quantile regression extends this idea through an influence function ψ. An
M-quantile regression model for quantile q is

$$Q_q = x_{jd}^T \beta_\psi(q)$$

Main features of these models
No hypothesis of normal distribution
Robust methods (influence function of the M-quantile regression)
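
One common way to fit a linear M-quantile model is iteratively reweighted least squares. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes a Huber influence function with tuning constant c = 1.345 and a MAD-based residual scale, and the function name `mq_fit` and the simulated data are hypothetical.

```python
import numpy as np

def mq_fit(X, y, q=0.5, c=1.345, max_iter=200, tol=1e-8):
    """Linear M-quantile regression of order q via IRLS (Huber psi assumed)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # least squares starting values
    for _ in range(max_iter):
        r = y - X @ beta
        # Robust scale: median absolute deviation of the residuals
        s = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        u = r / s
        # Huber weight psi(u) / u, equal to 1 inside [-c, c]
        w = np.minimum(1.0, c / np.maximum(np.abs(u), 1e-12))
        # Asymmetric tilt: 2q for positive residuals, 2(1 - q) otherwise
        w = w * 2.0 * np.where(u > 0, q, 1.0 - q)
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Simulated data (hypothetical): y = 1 + 2x + standard normal noise
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 500)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, 500)
b25, b50, b75 = (mq_fit(X, y, q=q) for q in (0.25, 0.50, 0.75))
```

With q = 0.5 the weights are symmetric and the fit reduces to a Huber robust regression; moving q away from 0.5 shifts the fitted line towards the lower or upper part of the conditional distribution.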



Using M-quantile models to measure area effects

Central idea: area effects can be described by estimating an area-specific q value
($\hat\theta_d$) for each area (group) of a hierarchical dataset (Chambers & Tzavidis 2006)

Estimate the area-specific target parameter by fitting an M-quantile model
for each area at $\hat\theta_d$

$$\hat{y}_{jd} = x_{jd}^T \hat\beta_\psi(\hat\theta_d)$$

Mixed effects models use random effects to capture the dissimilarity between
domains. M-quantile models attempt to capture this dissimilarity via the
domain-specific M-quantile coefficients $\hat\theta_d$
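
The slides do not spell out how the area-specific q value is obtained. One approach, following our reading of Chambers & Tzavidis (2006), is to find for each sampled unit the order q at which the fitted M-quantile equals the observed value, and then to average these unit-level values within each area. The sketch below assumes the fitted values on a grid of q are already available; the function names, the grid and the numbers are hypothetical.

```python
import numpy as np

def unit_q(y, fitted, q_grid):
    """Unit-level q: order at which the fitted M-quantile equals the observed y.

    fitted[i, g] is the fitted M-quantile of order q_grid[g] at unit i's covariates.
    """
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    q_grid = np.asarray(q_grid, dtype=float)
    # Linear interpolation along the grid; np.interp needs each row of
    # `fitted` to be increasing in q and clamps values outside the grid
    return np.array([np.interp(y[i], fitted[i], q_grid) for i in range(len(y))])

def area_q(q_units, area):
    """Area-level coefficient: mean of the unit-level q values in each area."""
    area = np.asarray(area)
    return {d.item(): float(q_units[area == d].mean()) for d in np.unique(area)}

q_grid = [0.25, 0.50, 0.75]            # hypothetical grid of orders
fitted = [[8.0, 10.0, 12.0],           # hypothetical fitted M-quantiles
          [8.0, 10.0, 12.0],
          [18.0, 20.0, 22.0]]
y_obs = [9.0, 10.0, 22.0]              # hypothetical observed values
q_units = unit_q(y_obs, fitted, q_grid)
theta = area_q(q_units, ["A", "A", "B"])
```

An area whose units tend to lie above the median fit gets a value above 0.5, which plays a role similar to a positive random effect in a mixed model.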



SAE Poverty Measures Estimators

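Using a smearing-type predictor that follows the same idea as the Chambers and
Dunstan (1986) distribution function estimator, we can predict the $z_{kd}(\alpha, t)$ values

$$\hat{z}_{kd}(\alpha, t) = n_d^{-1} \sum_{j \in s_d} \left( \frac{t - \hat{y}_{kjd}}{t} \right)^{\alpha} I(\hat{y}_{kjd} \le t), \quad k \in r_d, \ j \in s_d$$

$$\hat{y}_{kjd} = x_{kd}^T \hat\beta_\psi(\hat\theta_d) + e_{jd}$$

$$e_{jd} = y_{jd} - x_{jd}^T \hat\beta_\psi(\hat\theta_d)$$

Finally, the small area estimator of FGT poverty measures is

$$\hat{Z}_d(\alpha, t) = N_d^{-1} \left[ \sum_{j \in s_d} z_{jd}(\alpha, t) + \sum_{k \in r_d} \hat{z}_{kd}(\alpha, t) \right].$$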
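
The smearing-type predictor can be sketched for a single area as follows. The code assumes the area-specific coefficient vector has already been estimated and is passed in as `beta`; the function names and the toy intercept-only data are hypothetical.

```python
import numpy as np

def fgt_unit(y, t, alpha):
    """Unit-level FGT contribution ((t - y) / t) ** alpha * I(y <= t)."""
    y = np.asarray(y, dtype=float)
    poor = y <= t
    if alpha == 0:
        return poor.astype(float)
    return np.where(poor, (t - y) / t, 0.0) ** alpha

def fgt_area_estimate(X_s, y_s, X_r, beta, t, alpha):
    """Small area FGT estimate: observed z for sampled units, smeared z otherwise."""
    X_s, y_s, X_r, beta = (np.asarray(a, dtype=float) for a in (X_s, y_s, X_r, beta))
    e = y_s - X_s @ beta                        # sample residuals e_jd
    # Pseudo-values for every (non-sampled k, sampled j) pair, shape (n_r, n_s)
    y_kj = (X_r @ beta)[:, None] + e[None, :]
    z_r_hat = fgt_unit(y_kj, t, alpha).mean(axis=1)  # average over sample residuals
    z_s = fgt_unit(y_s, t, alpha)
    N_d = len(y_s) + X_r.shape[0]
    return float((z_s.sum() + z_r_hat.sum()) / N_d)

# Toy area (hypothetical): intercept-only model, 3 sampled and 2 non-sampled units
X_s = np.ones((3, 1))
y_s = np.array([6.0, 10.0, 14.0])
X_r = np.ones((2, 1))
beta = np.array([10.0])
hcr_d = fgt_area_estimate(X_s, y_s, X_r, beta, t=8.0, alpha=0)
pg_d = fgt_area_estimate(X_s, y_s, X_r, beta, t=8.0, alpha=1)
```

In this toy case each non-sampled unit inherits the three sample residuals (-4, 0, 4), so one third of its pseudo-values fall below the line.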



A Mean Squared Error Estimator of the Poverty Measures
Estimators

To estimate the mean squared error of the M-quantile poverty estimators we can
use the bootstrap proposed by Tzavidis et al. (2010) and Marchetti et al. (2012).
Let b = 1, ..., B, where B is the number of bootstrap populations
Let r = 1, ..., R, where R is the number of bootstrap samples
Let Ω = {(y_k, x_k), k = 1, ..., N} be the target population
A superscript * denotes bootstrap quantities
$\hat{Z}_d(\alpha, t)$ denotes the FGT poverty measures estimator of the small area d
Let y be the study variable that is known only for sampled units and let x be
the vector of auxiliary variables that is known for all the population units
Let s = (1, ..., n) be a within-area simple random sample of the finite
population Ω = {1, ..., N}



Estimator

Fit the M-quantile regression model on sample s, $\hat{y}_{jd} = x_{jd}^T \hat\beta_\psi(\hat\theta_d)$
Compute the residuals, $e_{jd} = y_{jd} - \hat{y}_{jd}$
Generate B bootstrap populations of dimension N, $\Omega^{*b}$
1 $y^*_{kd} = x_{kd}^T \hat\beta_\psi(\hat\theta_d) + e^*_{kd}$, k = 1, ..., N
2 $e^*_{kd}$ are obtained by sampling with replacement from the residuals $e_{jd}$
3 residuals can be sampled from the empirical distribution function or from a
smoothed distribution function
4 we can consider all the residuals ($e_j$, j = 1, ..., n), that is the unconditional
approach, or only area residuals ($e_{jd}$, j = 1, ..., $n_d$), that is the conditional
approach.
From every bootstrap population draw R samples of size n without
replacement
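
The generation of one bootstrap population and one bootstrap sample can be sketched as below. This is an illustration only: it resamples from the empirical distribution of the residuals (not a smoothed one), and the function names, area labels and values are hypothetical.

```python
import numpy as np

def bootstrap_population(fitted_pop, area_pop, resid, area_resid, rng, conditional=False):
    """One bootstrap population: fitted values plus residuals resampled with replacement."""
    fitted_pop = np.asarray(fitted_pop, dtype=float)
    area_pop = np.asarray(area_pop)
    resid = np.asarray(resid, dtype=float)
    area_resid = np.asarray(area_resid)
    if not conditional:
        # Unconditional approach: draw from the residuals of all areas
        return fitted_pop + rng.choice(resid, size=fitted_pop.size, replace=True)
    e_star = np.empty_like(fitted_pop)
    for d in np.unique(area_pop):
        in_d = area_pop == d
        # Conditional approach: draw only from the residuals of area d
        e_star[in_d] = rng.choice(resid[area_resid == d], size=int(in_d.sum()), replace=True)
    return fitted_pop + e_star

def draw_sample(area_pop, n_by_area, rng):
    """Indices of a within-area simple random sample drawn without replacement."""
    area_pop = np.asarray(area_pop)
    parts = [rng.choice(np.flatnonzero(area_pop == d), size=n_d, replace=False)
             for d, n_d in n_by_area.items()]
    return np.concatenate(parts)

rng = np.random.default_rng(1)
area_pop = np.array([0] * 5 + [1] * 5)       # hypothetical area labels
fitted_pop = np.full(10, 100.0)              # hypothetical fitted values
resid = np.array([-1.0, 1.0, -10.0, 10.0])   # hypothetical sample residuals
area_resid = np.array([0, 0, 1, 1])
y_star = bootstrap_population(fitted_pop, area_pop, resid, area_resid, rng, conditional=True)
idx = draw_sample(area_pop, {0: 2, 1: 3}, rng)
```

Repeating the first call B times and the second call R times per population gives the B × R bootstrap samples on which the estimator is recomputed.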



Estimator
Using the B bootstrap populations and the R samples drawn from every
bootstrap population we can estimate the mean squared error of the FGT
estimator
Bias

$$\hat{E}\left[ \hat{Z}(\alpha, t)^{*} - Z(\alpha, t)^{*} \right] = B^{-1} \sum_{b=1}^{B} R^{-1} \sum_{r=1}^{R} \left( \hat{Z}(\alpha, t)^{*br} - Z(\alpha, t)^{*b} \right)$$

Variance

$$\widehat{\mathrm{Var}}\left[ \hat{Z}(\alpha, t)^{*} - Z(\alpha, t)^{*} \right] = B^{-1} \sum_{b=1}^{B} R^{-1} \sum_{r=1}^{R} \left( \hat{Z}(\alpha, t)^{*br} - \bar{\hat{Z}}(\alpha, t)^{*br} \right)^{2}$$

where
$Z(\alpha, t)^{*b}$ is the FGT of the bth bootstrap population
$\hat{Z}(\alpha, t)^{*br}$ is the FGT estimate of $Z(\alpha, t)^{*b}$ computed from the rth sample
drawn from the bth bootstrap population
$\bar{\hat{Z}}(\alpha, t)^{*br} = R^{-1} \sum_{r=1}^{R} \hat{Z}(\alpha, t)^{*br}$
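
Given the bootstrap quantities, the bias and variance terms are simple averages. The sketch below also combines them as variance plus squared bias, which is the usual decomposition and is an assumption here, since the slides do not state the combination explicitly. The function name and the numbers are hypothetical.

```python
import numpy as np

def bootstrap_mse(z_hat, z_true):
    """Bootstrap bias, variance and MSE of an estimator.

    z_hat[b, r] is the estimate from the r-th sample of the b-th bootstrap
    population; z_true[b] is the known value in the b-th bootstrap population.
    """
    z_hat = np.asarray(z_hat, dtype=float)
    z_true = np.asarray(z_true, dtype=float)
    # Average over samples and populations of (estimate - bootstrap truth)
    bias = float((z_hat - z_true[:, None]).mean())
    # Average squared deviation from the per-population mean estimate
    var = float(((z_hat - z_hat.mean(axis=1, keepdims=True)) ** 2).mean())
    return bias, var, var + bias ** 2

z_hat = [[0.10, 0.14], [0.20, 0.24]]  # hypothetical estimates, B = 2 and R = 2
z_true = [0.11, 0.21]                 # hypothetical bootstrap population values
bias, var, mse = bootstrap_mse(z_hat, z_true)
```

For these toy values the bias is 0.01 and the variance is 0.0004, so the combined mean squared error is 0.0005.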

Part III

Poverty Mapping in the Province of Pisa: Oversampling
vs. Small Area Estimation


Oversampling and Small Area Estimation: A Comparison


When direct estimates are unreliable there are two possible solutions:
Increase the sample size in the domains of interest so that direct
estimates become reliable (oversampling solution)
Use small area methods (small area solution)
To compare these alternatives we use data from an EU-SILC 2008 oversample of
households for the Province of Pisa, a side result of the SAMPLE project
(www.sample-project.eu).
Sample size for the Province of Pisa, EU-SILC 2008: 149 households
Sample size for the Province of Pisa, oversample: 675 households (including
the 149 households of the EU-SILC survey)
Remark: the oversample was managed by ISTAT, which guarantees the high
quality of the data



SAE methods for poverty indicators in Tuscany Provinces

Data on the equivalised income in 2007 are available from the EU-SILC
survey 2008 for 1495 households in the 10 Tuscany Provinces
To better compare the living conditions in these areas we estimate the
indicators considering the gender of the head of the household
A set of explanatory variables is available for each unit in the population from
the Population Census 2001
We employ an M-quantile model to estimate Head Count Ratio (HCR) and
Poverty Gap (PG) for the Provinces by gender of the head of the household
(HH), for a total of 20 areas
National poverty line: 9310.74 Euros (equivalised household income)



Model Specifications

The selection of covariates to fit the small area models relies on prior studies
of poverty assessment
The following covariates have been selected:
household size (integer value)
ownership of dwelling (owner/tenant)
age of the head of the household (integer value)
years of education of the head of the household (integer value)
working position of the head of the household (employed / unemployed in the
previous week)



We estimate the Head Count Ratio (HCR) and the Poverty Gap (PG) in the
Province of Pisa considering the gender of the Head of the Household (HH) using:

Direct estimators based on the EU-SILC survey data
Direct estimators based on the Oversampling data
M-quantile small area estimators based on the EU-SILC survey data

Table: Direct estimates (without and with oversampling) and MQ/CD estimates of the HCR
and PG with corresponding estimated Root Mean Squared Errors (in brackets) and number of
sampled households (h) in the Province of Pisa, by gender of the Head of the Household (HH).

Estimates                              HH gender     h    HCR % (RMSE)    PG % (RMSE)
Direct estimates                       Female       44     9.88 (4.28)    4.48 (2.56)
                                       Male        105     6.62 (2.24)    2.25 (0.91)
MQ/CD estimates                        Female       44    20.72 (3.13)    8.64 (2.00)
                                       Male        105     9.02 (1.63)    2.91 (0.74)
Direct estimates (with oversampling)   Female      193    23.57 (4.92)    6.64 (2.77)
                                       Male        482     8.21 (1.61)    2.40 (0.60)


Part IV

Application of small area M-quantile models to poverty
mapping in Tuscany


Estimates of the HCR at small area level in Tuscany

[Figure: maps of the HCR estimates for the ten Tuscany Provinces (MS, LU, PT, PO, FI, AR, PI, LI, SI, GR) by gender of the HH: males (left) and females (right). Legend breaks: 8.48, 10.17, 16.76, 24.04, 31.63]

Estimates of the PG at small area level in Tuscany

[Figure: maps of the PG estimates for the ten Tuscany Provinces (MS, LU, PT, PO, FI, AR, PI, LI, SI, GR) by gender of the HH: males (left) and females (right). Legend breaks: 2.69, 3.31, 6.37, 10.39, 15.05]

Part V

Concluding remarks



Concluding remarks and ongoing research

Main results
Focus on small area estimators of poverty indicators
Small area methods play a crucial role in providing poverty measures for local
governance
Small area estimates are very close to the oversampling estimates and they are
(almost) costless
Ongoing and future research
Consider non-monetary measures of poverty (Cheli and Lemmi, 1995)
Enhance the fitting of the models, considering non-parametric models and
spatial models
Compare with alternative methods
Take into account the survey weights



Essential bibliography

Breckling, J. and Chambers, R. (1988). M-quantiles. Biometrika, 75, 761–771.

Chambers, R. and Dunstan, R. (1986). Estimating distribution functions from survey data. Biometrika, 73, 597–604.

Chambers, R. and Tzavidis, N. (2006). M-quantile models for small area estimation. Biometrika, 93, 255–268.

Chambers, R., Chandra, H. and Tzavidis, N. (2007). On robust mean squared error estimation for linear predictors for domains. CCSR Working Paper 2007-10, University of Manchester.

Cheli, B. and Lemmi, A. (1995). A Totally Fuzzy and Relative Approach to the Multidimensional Analysis of Poverty. Economic Notes, 24, 115–134.

Foster, J., Greer, J. and Thorbecke, E. (1984). A class of decomposable poverty measures. Econometrica, 52, 761–766.

Giusti, C., Pratesi, M. and Salvati, N. (2009). Estimation of poverty indicators: a comparison of small area methods at LAU1-2 level in Tuscany. Abstract Book, NTTS - Conferences on New Techniques and Technologies for Statistics, Brussels, 18-20 February 2009.

Hall, P. and Maiti, T. (2006). On parametric bootstrap methods for small area prediction. Journal of the Royal Statistical Society: Series B, 68, 2, 221–238.

Lombardia, M.J., Gonzalez-Manteiga, W. and Prada-Sanchez, J.M. (2003). Bootstrapping the Chambers-Dunstan estimate of finite population distribution function. Journal of Statistical Planning and Inference, 116, 367–388.

Marchetti, S., Tzavidis, N. and Pratesi, M. (2012). Non-parametric bootstrap mean squared error estimation for M-quantile estimators of small area averages, quantiles and poverty indicators. Computational Statistics and Data Analysis, doi:10.1016/j.csda.2012.01.023

Royall, R. and Cumberland, W.G. (1978). Variance Estimation in Finite Population Sampling. Journal of the American Statistical Association, 73, 351–358.

Tzavidis, N., Marchetti, S. and Chambers, R. (2010). Robust estimation of small area means and quantiles. Australian and New Zealand Journal of Statistics, 52, 2, 167–186.

Tzavidis, N., Salvati, N., Pratesi, M. and Chambers, R. (2007). M-quantile models for poverty mapping. Statistical Methods & Applications, 17, 393–411.
