Alari Paulus. Tax evasion and measurement error. An analysis of income survey data linked with tax records
1. Introduction
Methodology
Data
Findings
Tax evasion and measurement error
An analysis of income survey data linked with tax records
Alari Paulus
Institute for Social & Economic Research (ISER)
University of Essex
Eesti Pank
Nov 19, 2013
Summary
Motivation
Tax evasion undermines the role of taxation and affects
resource allocation
Who evades? The extent of evasion?
Empirical evidence on income tax evasion at micro-level:
(random) audits - only partial detection
surveys - self-reported non-compliance; indirect methods
focus on the scale
experiments - difficult to link compliance decisions with the real
income distribution and job characteristics, or with the actual scale
Still limited, as data availability is a challenge!
Very low non-compliance wrt salaries/wages due to
third-party reporting
employees and employers can collude
mainly based on audits
Research questions and contribution
Which individuals are more likely to evade taxes on
employment income in Estonia? How extensive?
Individual-level survey data linked to tax records
The only other example is Baldini et al. (2009), who ignore
measurement error
Studies on income measurement error ignore tax evasion,
e.g. Bound et al. (1994), Kapteyn & Ypma (2007)
A new econometric approach to model these jointly
Extend evidence for post-socialist countries
Rich information on individual characteristics
Link between true income and tax evasion
Find substantial underreporting of earnings (10-15%)
Existing literature: theory
The deterrence model (Allingham & Sandmo 1972,
Srinivasan 1973, Yitzhaki 1974)
probability of detection and penalty rate
marginal tax rate (ambiguous)
total income - evasion increases in absolute terms,
ambiguous in relative terms
risk aversion
Endogenous income (Andersen 1977, Pencavel 1979)
Interactions with tax authority (see Andreoni et al. 1998)
Behavioural economics (see Hashimzade et al. 2012)
Kleven et al. (2009): third-party reporting and firm size
Existing literature: empirics
Audits
MTR (mixed)
income (mixed)
married people ↑ and elderly ↓ (Clotfelter 1983, Feinstein
1991, Martinez-Vazquez and Rider 2005)
Experiments
support for auditing and penalties
MTR (mixed)
income (mixed)
males ↑ and elderly ↓ (Baldry 1987, Pudney et al. 2000)
Survey
people in smaller firms, construction, non-Estonians, men,
less education, young and elderly more likely to evade (Kriz
et al. 2008, Meriküll & Staehr 2010)
60% of income underreported by households with business
income (Kukk & Staehr 2013)
Employee vs employer
Incentives for both sides (except for public sector
employers) and requires co-operation
Employer in a stronger position if few alternative
employment options
But employers more likely to be punished when found out
→ smaller risk of exposure for smaller companies
Disincentives for employee: no health insurance, lower
pension entitlement, difficult to obtain bank loan/mortgages
Evidence of declaring part of earnings (close to the
minimum wage)
Assume final decision up to the individual (but controlling
for a few firm characteristics)
Model structure
Three dependent variables:
1. An individual $i$ has employment income $y_i^T$ (true earnings)
2. Declares to the tax authority $y_i^r$ (reported earnings)
   none, some or all of $y_i^T$ (but not more)
3. States in the survey $y_i^s$ (survey earnings)
   noisy measure of true earnings
   can understate or overstate (even $y_i^s > 0$ while $y_i^T = 0$)
Highly non-linear system → specify functional forms and error
distributions and estimate parameters with the Maximum
Likelihood method
Part 1: True earnings $y_i^T$
Allow for the possibility that actual income is zero
Assume log-normal distribution (if positive)
\[
\begin{cases}
\ln y_i^T \sim N(x_i \beta, \sigma^2) & \text{with probability } p \\
y_i^T = 0 & \text{with probability } 1 - p
\end{cases}
\]
where $x_i$ is a vector of personal characteristics
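A minimal simulation sketch of this mixture may help fix ideas. It assumes illustrative values for the probability of positive earnings, the linear index (the product of the characteristics and their coefficients) and the standard deviation; none of them are estimates from the study, and the function name `draw_true_earnings` is hypothetical.

```python
import math
import random

def draw_true_earnings(xb, sigma, p, rng):
    """Draw one value of true earnings.

    With probability p earnings are positive and their log is N(xb, sigma^2);
    with probability 1 - p earnings are exactly zero.
    xb stands for the linear index x_i * beta.
    """
    if rng.random() < p:
        return math.exp(rng.gauss(xb, sigma))
    return 0.0

# Illustrative values only (assumptions, not estimates from the study).
rng = random.Random(0)
draws = [draw_true_earnings(xb=2.0, sigma=0.5, p=0.9, rng=rng) for _ in range(10_000)]
share_positive = sum(y > 0 for y in draws) / len(draws)
```

In a large simulated sample, `share_positive` should be close to the chosen value of `p`.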
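Part 2: Reported earnings $y_i^r$
A two-limit Tobit model
The proportion of true earnings reported (multiplicative), a
function of true earnings and personal characteristics ($z_i$):
\[
y_i^r =
\begin{cases}
0 & \text{if } y_i^T = 0 \quad \text{(no earnings)} \\
0 & \text{if } y_i^T > 0 \text{ and } r_i^* \le 0 \quad \text{(full evasion)} \\
r_i^* \cdot y_i^T & \text{if } y_i^T > 0 \text{ and } 0 < r_i^* < 1 \quad \text{(part evasion)} \\
y_i^T & \text{if } y_i^T > 0 \text{ and } r_i^* \ge 1 \quad \text{(no evasion)}
\end{cases}
\]
where
\[
r_i^* =
\begin{cases}
0 & \text{if } y_i^T = 0 \\
\theta y_i^T + z_i \gamma + u_i & \text{if } y_i^T > 0
\end{cases}
\qquad \text{and } u_i \sim N(0, \sigma_u^2)
\]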
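The censoring rule of the multiplicative specification can be sketched as a small function. This is an illustration under stated assumptions rather than the author's code: `reported_earnings` is a hypothetical name, `z_gamma` stands for the product of the personal characteristics and their coefficients, and `u` is a single draw of the normal error term.

```python
def reported_earnings(y_true, theta, z_gamma, u):
    """Reported earnings under the multiplicative two-limit Tobit rule.

    The latent reporting share is r_star = theta * y_true + z_gamma + u.
    It is censored at 0 (full evasion) and at 1 (no evasion); values in
    between mean that only part of true earnings is declared.
    """
    if y_true <= 0:
        return 0.0                        # no earnings, nothing to report
    r_star = theta * y_true + z_gamma + u
    share = min(max(r_star, 0.0), 1.0)    # clamp the latent share to [0, 1]
    return share * y_true

# Illustrative values only: half of true earnings of 10.0 is declared.
example = reported_earnings(y_true=10.0, theta=0.0, z_gamma=0.5, u=0.0)
```

By construction, the function never returns more than `y_true`, which mirrors the restriction that people declare none, some or all of their true earnings but not more.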
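Part 2: Reported earnings $y_i^r$
A two-limit Tobit model
Reported earnings in levels (additive), a function of true
earnings and personal characteristics ($z_i$):
\[
y_i^r =
\begin{cases}
0 & \text{if } y_i^T = 0 \quad \text{(no earnings)} \\
0 & \text{if } y_i^T > 0 \text{ and } y_i^{*r} \le 0 \quad \text{(full evasion)} \\
y_i^{*r} & \text{if } y_i^T > 0 \text{ and } 0 < y_i^{*r} < y_i^T \quad \text{(part evasion)} \\
y_i^T & \text{if } y_i^T > 0 \text{ and } y_i^{*r} \ge y_i^T \quad \text{(compliance)}
\end{cases}
\]
where
\[
y_i^{*r} =
\begin{cases}
0 & \text{if } y_i^T = 0 \\
\theta y_i^T + z_i \gamma + u_i & \text{if } y_i^T > 0
\end{cases}
\qquad \text{and } u_i \sim N(0, \sigma_u^2)
\]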
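Because the error term is normal, the probability of each regime in the additive specification follows from the normal distribution function once true earnings are given. The sketch below assumes illustrative parameter values; `regime_probabilities` is a hypothetical helper, and `z_gamma` again stands for the product of the personal characteristics and their coefficients.

```python
from statistics import NormalDist

def regime_probabilities(y_true, theta, z_gamma, sigma_u):
    """Probabilities of full evasion, part evasion and compliance.

    Conditional on y_true > 0, the latent report is
    y_star = theta * y_true + z_gamma + u with u ~ N(0, sigma_u^2),
    censored below at 0 and above at y_true.
    """
    if y_true <= 0:
        raise ValueError("regimes are defined only for positive true earnings")
    latent = NormalDist(mu=theta * y_true + z_gamma, sigma=sigma_u)
    p_full = latent.cdf(0.0)              # P(y_star <= 0)
    p_comply = 1.0 - latent.cdf(y_true)   # P(y_star >= y_true)
    return p_full, 1.0 - p_full - p_comply, p_comply

# Illustrative values only (assumptions, not estimates from the study).
probs = regime_probabilities(y_true=8.0, theta=0.9, z_gamma=0.5, sigma_u=2.0)
```

The three probabilities sum to one; the two outer ones are the censoring masses that a two-limit Tobit likelihood places on the limits.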
Part 3: Survey earnings $y_i^s$
In log-terms, conditional on $y_i^s > 0$
Assumed to be a function of log-true earnings and personal
characteristics ($w_i$)
A shift parameter ($\delta_0$) if true earnings are zero
\[
\ln y_i^s = \rho \ln y_i^T \cdot 1(y_i^T > 0) + \delta_0 \cdot 1(y_i^T = 0) + w_i \delta + v_i
\]
where $1(\cdot)$ is an indicator function and $v_i \sim N(0, \sigma_v^2)$
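The survey equation can be written as a short function that switches between the two cases. This is a sketch with hypothetical names: `w_delta` stands for the product of the characteristics and their coefficients, and `v` is one draw of the measurement error.

```python
import math

def log_survey_earnings(y_true, w_delta, rho, delta0, v):
    """Log of survey earnings for a respondent who states a positive amount.

    rho scales log true earnings when y_true > 0; delta0 is the shift that
    applies instead when true earnings are zero.
    """
    if y_true > 0:
        return rho * math.log(y_true) + w_delta + v
    return delta0 + w_delta + v

# With rho = 1 and no other terms, the survey reproduces true earnings.
exact_report = log_survey_earnings(y_true=math.e, w_delta=0.0, rho=1.0, delta0=-1.0, v=0.0)
```

The zero-earnings branch is what allows positive survey earnings to coexist with zero true earnings.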
Maximum likelihood estimation
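Probability density:
\[
\begin{aligned}
f(y_i^r, y_i^s \mid x_i, z_i, w_i)
&= \int_{y_i^r}^{\infty} f(y_i^r, y_i^s \mid x_i, z_i, w_i, y^T)\, f(y^T \mid x_i)\, dy^T \\
&= \int_{y_i^r}^{\infty} f(y_i^r \mid x_i, z_i, w_i, y^T)\, f(y_i^s \mid x_i, z_i, w_i, y^T)\, f(y^T \mid x_i)\, dy^T
\end{aligned}
\]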
assuming independence of random terms
Semi-infinite integrals:
solved numerically using Gauss-Hermite quadrature
nodes and weights as calculated in Steen et al. (1969)
15 quadrature points
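To illustrate the idea of integrating out unobserved true earnings, the sketch below approximates an expectation over a log-normal variable with 15 quadrature points. It rests on several assumptions that differ from the slides: it uses NumPy's full-range Gauss-Hermite rule (`numpy.polynomial.hermite.hermgauss`) rather than the nodes and weights of Steen et al. (1969), it integrates over all positive earnings rather than from the reported amount upwards, and the function `g` is only a stand-in for the product of conditional densities. The name `lognormal_expectation` is hypothetical.

```python
import numpy as np

def lognormal_expectation(g, mu, sigma, n_nodes=15):
    """Approximate E[g(Y)] for ln Y ~ N(mu, sigma^2) by Gauss-Hermite quadrature.

    Substituting ln y = mu + sqrt(2) * sigma * t turns the expectation into
    (1 / sqrt(pi)) * integral of exp(-t^2) * g(exp(mu + sqrt(2) * sigma * t)) dt,
    which is the weighted form that the Gauss-Hermite rule integrates.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    y = np.exp(mu + np.sqrt(2.0) * sigma * nodes)
    return float(np.sum(weights * g(y)) / np.sqrt(np.pi))

# Check against the closed-form mean of a log-normal variable.
approx = lognormal_expectation(lambda y: y, mu=2.0, sigma=0.5)
exact = float(np.exp(2.0 + 0.5 ** 2 / 2))
```

For a smooth integrand like this one, 15 nodes already reproduce the closed-form mean very closely; an integrand with a hard lower limit would be less well behaved under this full-range rule.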
Identification
Two observed income measures and three equations
Key identifying assumptions:
People working in the public sector cannot evade taxes
No differences between public and private sector wrt
processes underlying true income and survey income
Intuition:
Constrained sample (largely) identifies parameters in the
true earnings and measurement error equation
Unconstrained sample identifies the compliance equation
Simultaneous estimation
consistent estimates
shift parameters for the (un)constrained sector
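The intuition for the constrained sample can be illustrated with simulated data. The sketch below assumes illustrative parameter values and omits covariates: in the constrained sector reported earnings equal true earnings, so a simple regression of log survey earnings on log reported earnings recovers the slope of the survey equation. Variable names and the helper `ols_slope` are hypothetical, and least squares stands in here for the joint maximum likelihood estimation.

```python
import random

rng = random.Random(1)
rho, sigma_v = 0.8, 0.3   # illustrative values, not study estimates

# Constrained sector: reported earnings equal true earnings by assumption.
log_y_true = [rng.gauss(2.0, 0.5) for _ in range(5_000)]
log_y_reported = log_y_true
log_y_survey = [rho * t + rng.gauss(0.0, sigma_v) for t in log_y_true]

def ols_slope(x, y):
    """Slope from a least-squares regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rho_hat = ols_slope(log_y_reported, log_y_survey)
```

In the unconstrained sector the same regression would mix measurement error with evasion, which is why the compliance equation needs the parameters pinned down by the constrained sample.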
Data
1
Estonian Social Survey 2008
basis for the Estonian part of the EU-SILC
2007 incomes by source (earnings either gross or net)
2
Individual tax reports for 2007
personal declaration if submitted, otherwise company
declarations for employees
gross incomes by source and taxes (withheld, final)
3
Linkage
by Statistics Estonia, no consent required from the respondents
based on unique PIN (available for sampled persons, asked
for other household members, matched for the rest)
very few non-matches, possibly some incorrect matches
Evolution of the sample
                                  Number of persons
Sample                               Total     A00     A0s     Ar0     Ars  Omitted at each step
Initial sample of ESU 2008          15,123
Linked with tax records             15,048                                                    75
Aged 16 or older (a)                12,861                                                 2,187
Responded in ESU (b)                10,951                                                 1,910
Earnings reported in ESU            10,237   3,672     341     846   5,378                   714
Employed (c)                         6,570       5     341     846   5,378                 3,667
Employment duration reported         5,496       5     280       7   5,204                 1,074
Worked full time for whole year      4,171       1     139       -   4,031                 1,325
- constrained sector (d)               927       -      12       -     915                     -
- unconstrained sector               3,244       1     127       -   3,116                     -
Notes: (a) that is being subject to a personal interview; (b) for new
sample members the number of non-respondents includes only
sampled persons without other household members; (c) employment
reported as the main activity at least for one month in the income
reference period and/or having positive earnings in either data
source; (d) constrained sector sub-sample includes public sector
employees, except those who changed jobs or have a second job.
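The group labels used in the sample table can be reproduced with a small classification rule. The sketch below assumes that both earnings figures are available as non-missing numbers for each linked person; the function name `earnings_group` is hypothetical.

```python
def earnings_group(tax_earnings, survey_earnings):
    """Label a linked person by which source shows positive earnings.

    'r' marks positive earnings in the tax records, 's' positive earnings
    in the survey, and '0' the absence of positive earnings in that source.
    """
    r = "r" if tax_earnings > 0 else "0"
    s = "s" if survey_earnings > 0 else "0"
    return "A" + r + s

# A person with earnings in the tax records but none stated in the survey.
label = earnings_group(tax_earnings=5.2, survey_earnings=0.0)
```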
Mean earnings
                                            Survey     Tax records      Difference
Sample                                   b      se       b      se       b      se       N
ESU non-respondents with positive earnings in the tax records
Unit non-response                                     8.48    0.64                   1,114
Item non-response                                     8.76    0.43                     563
ESU respondents with positive earnings in either source
Positive earnings in ESU (A0s)        6.78    0.44    0.00            6.78    0.44     341
Pos... in the tax records (Ar0)       0.00            1.41    0.12   -1.41    0.12     846
Pos... in both sources (Ars)          7.46    0.11    7.32    0.13    0.13    0.08   5,378
- constrained sector                  7.37    0.20    8.24    0.27   -0.88    0.15   1,048
- unconstrained sector                7.48    0.12    7.05    0.14    0.43    0.10   4,330
Notes: annual (net) earnings in thousand EEK divided by 12;
estimations take into account design weights and clustering at the
household level; constrained sector sub-sample includes public sector
employees, except those who changed jobs or have a second job.
Distribution of earnings
Figure: density of earnings, in two panels (Survey; Tax records). Horizontal axis: annual (net) earnings divided by 12, in thousand EEK (0 to 40); vertical axis: density.
Notes: final estimation sample (i.e. worked full time for whole year) excluding those with zero earnings or monthly earnings above
40 thousand EEK, N = 4,012; bandwidth 0.5; vertical line shows monthly minimum net wage (3,175 EEK).
Regression estimates (p < 0.05)
1. True income equation:
age (-), age sq (-), male (+), Estonian nationality (+),
education (+), region, industry, occupation (+), firm size (+),
hours in main job (+), health status (+)
Non-significant: marital status, rural area, studying, 2nd job, hours in 2nd
job, constrained sector
2. Tax compliance equation:
age (+), male (-), Est. nationality (+), education (+), married
(+), north-east region (-), industry, occupation, firm size (+),
lease loan (+), studying (+)
Non-significant: age sq, rural area, hours in main job, 2nd job, hours in 2nd
job, mortgage
3. Survey income equation:
age (-), age sq (-), male (+), Est. nationality (+), education
(+), # of waves (-), interview month (+), interview rating (+),
how responded
Non-significant: persons present at interview, constrained sector
Sensitivity analysis
Alternative
samples
definitions for the constrained sector
set of co-variates
parameter constraints (i.e. differences between the
constrained and the unconstrained sector)
survey design (weights, clustering)
Results appear robust
parameter estimates
proportion of compliant
proportion of earnings not declared
Concluding points
First study analysing tax evasion and measurement error jointly
Measurement error (ME) instrumental for income differences
Broad set of characteristics associated with tax evasion (in
line with previous studies)
Scale and distribution of undeclared earnings
part vs full evasion
high income earners
Substantial underreporting of wages/salaries despite
third-party reporting