Introduction Question design Modeling measurement error Estimating measurement error Predicting measurement error Concl
Predicting the quality of a survey question
from its design characteristics: SQP
Daniel Oberski
(joint work with Willem Saris)
Universitat Pompeu Fabra
Predicting the quality of a survey question from its design characteristics: SQP Daniel Oberski
[Figure: measurement and representation sides of the survey process (Groves et al. 2004). Measurement side: Construct → Measurement → Response → Edited data, with validity, measurement error, and processing error arising at the successive steps. Representation side: Inferential population → Target population → Sampling frame → Sample → Respondents, with coverage error, sampling error, and nonresponse error arising at the successive steps. Both sides lead to the survey statistic.]
• Assume the step from construct to measurement is already acceptable
  → Assume that the question measures the intended construct: the respondent knows the answer, can interpret the question, ...
  → The respondent's reaction to the question depends on some unobserved value or opinion, which is in turn a measure of the construct.
• We focus only on the degree to which the response is a good measure of this unobserved score or opinion: “measurement error”.
• (NOT the degree to which the question is interpretable, measures some construct, etc.)
Reasons to study measurement error
• Reliability is an upper bound on validity; responses can never measure the underlying construct better than the single indicator.
• Unreliability increases the variance of estimators:
  • $\operatorname{var}(\hat{\mu}) = \kappa^{-1} \sigma^2 / n$, where $\kappa \in (0, 1)$ is the reliability.
• Unreliability reduces the apparent strength of relationships between variables:
  • $\rho_{xy} = \kappa_x \cdot \kappa_y \cdot \rho_{XY}$, where $\rho_{XY}$ is the true correlation, $\rho_{xy}$ the observed correlation, and $\kappa_x$, $\kappa_y$ are taken as reliability coefficients (the square roots of the reliabilities).
• Correlated measurement errors make variables look more related than they really are; e.g. “How many minutes does it take to...” questions correlate partly because they are all asked in the same way.
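The two formulas above can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, the variance formula takes κ as the reliability (ratio of true-score variance to observed variance), and the correlation formula is assumed to use reliability coefficients (square roots of the reliabilities), which is the classical attenuation result. The simulation assumes normally distributed true scores with unit variance and independent random errors.

```python
import math
import random


def variance_of_mean(true_variance, reliability, n):
    """var(mu_hat) = true_variance / (reliability * n), with reliability in (0, 1)."""
    if not 0.0 < reliability < 1.0:
        raise ValueError("reliability must lie strictly between 0 and 1")
    return true_variance / (reliability * n)


def attenuated_correlation(true_corr, coef_x, coef_y):
    """Observed correlation implied by two reliability coefficients (assumed
    here to be square roots of the reliabilities)."""
    return coef_x * coef_y * true_corr


def pearson(xs, ys):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_observed_correlation(true_corr, coef_x, coef_y, n=20000, seed=1):
    """Simulate true scores with correlation true_corr, add independent random
    error, and return the correlation between the observed scores."""
    rng = random.Random(seed)
    # Error s.d. chosen so that var(true) / var(observed) equals coef ** 2.
    err_sd_x = math.sqrt(1.0 / coef_x ** 2 - 1.0)
    err_sd_y = math.sqrt(1.0 / coef_y ** 2 - 1.0)
    xs, ys = [], []
    for _ in range(n):
        true_x = rng.gauss(0.0, 1.0)
        true_y = true_corr * true_x + math.sqrt(1.0 - true_corr ** 2) * rng.gauss(0.0, 1.0)
        xs.append(true_x + rng.gauss(0.0, err_sd_x))
        ys.append(true_y + rng.gauss(0.0, err_sd_y))
    return pearson(xs, ys)
```

Comparing `simulate_observed_correlation(0.5, 0.9, 0.8)` with `attenuated_correlation(0.5, 0.9, 0.8)` should give values close to each other, both well below the true correlation of 0.5.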
Public health ranking: Correction of regression coefficients for κ
[Figure: educational differentials in subjective health with 2 s.e. interval (axis from -0.4 to 0.0), by country: GR, CZ, PT, SI, FI, HU, PL, SK, LU, ES, EE, DK, DE, TR, IS, NO, CH, BE, IE, FR, UA, AT, NL, SE. For each country the plot shows the uncorrected regression coefficient and the measurement error-corrected coefficient. The 24 values annotated on the plot, in the order extracted, are 0.82, 0.85, 0.78, 0.73, 0.56, 0.75, 0.71, 0.81, 0.86, 0.85, 0.95, 0.84, 0.91, 0.70, 0.81, 0.87, 0.81, 0.82, 0.92, 0.85, 0.91, 0.81, 0.93, 0.99.]
Design characteristics of questions
• Social desirability
• Centrality
• Reference period
• Question formulation
• WH word used
• Use of gradation
• Balance of the request
• Encouragement
• Showcards present
• Showcards have pictures
• Emphasis on subjective opinion in request
• Information about the opinion of other people
• Use of stimulus or statement in the question
• Absolute or comparative judgment
• Response scale: basic choice
• Number of categories
• Labels: full, partial, or none
• Labels full sentences
• Knowledge provided
• Survey mode
• Order of the labels
• Correspondence between labels and numbers of the scale
• Theoretical range of the scale
• Neutral category
• Number of fixed reference points
• Don’t know option
• Interviewer instruction
• Respondent instruction
• Extra motivation, info or definition available?
• Agree-disagree scale
• ...
(Saris & Gallhofer 2007)
Question design choices
• A great many question design characteristics have at some point been found, or suggested, to influence the response;
• Any question in a questionnaire represents a series of choices (conscious or not) on those characteristics: a method of asking the question;
• What counts as a good method clearly depends strongly on the topic, for example:
  • The frequency and importance of an event or series of events asked about determine reasonable reference periods, reasonable categories (wide or deep), and whether to ask approximately or exactly (Tourangeau et al. 2000).
• But are some methods generally better than others?
• If so, what about those methods makes them better?
Talk outline
1 Question design
  The influence of the method
  Variation in influence of the method
2 Modeling measurement error
  Definitions
  Formal model and assumptions
3 Estimating measurement error
  Design requirements
  Estimation of the model
4 Predicting measurement error
  Description of the data
  Meta-analysis of the MTMM experiments
  Program demonstration
The influence of the method
The method influences the answers
The influence of the method
European Social Survey, 2002
Method A:
ENTER START TIME:
A1 TvTot
CARD 1 On an average weekday, how much time, in total, do you spend watching television? Please use this card to answer.
  No time at all                       00   GO TO A3
  Less than ½ hour                     01
  ½ hour to 1 hour                     02
  More than 1 hour, up to 1½ hours     03
  More than 1½ hours, up to 2 hours    04   ASK A2
  More than 2 hours, up to 2½ hours    05
  More than 2½ hours, up to 3 hours    06
  More than 3 hours                    07
  (Don’t know)                         88
A2 TvPol
STILL CARD 1 And again on an average weekday, how much of your time watching television is spent watching news or programmes about politics and current affairs? Still use this card.

Method B:
On an average weekday, how much time, in total, do you spend watching television?
  WRITE IN HOURS: ___ AND MINUTES: ___
On an average weekday, how much time, in total, do you spend listening to the radio?
  WRITE IN HOURS: ___ AND MINUTES: ___
On an average weekday, how much time, in total, do you spend reading the newspapers?
  WRITE IN HOURS: ___ AND MINUTES: ___
TV watching: method A versus method B
[Figure: hours of TV watching on the categorical scale (0; h<0.5; 0.5<=h<=1; 1<h<=1.5; 1.5<h<=2; 2<h<=2.5; 2.5<h<=3; h>3; count axis from 0 to 8000) next to hours of TV watching written in as hours and minutes (axis from 0 to 15, with many individually plotted points).]
Radio listening: method A versus method B
[Figure: hours of radio listening on the categorical scale (0; h<0.5; 0.5<=h<=1; 1<h<=1.5; 1.5<h<=2; 2<h<=2.5; 2.5<h<=3; h>3; count axis from 0 to 8000) next to hours of radio listening written in as hours and minutes (axis from 0 to 15, with many individually plotted points).]
Newspaper reading: method A versus method B
[Figure: hours of newspaper reading on the categorical scale (0; h<0.5; 0.5<=h<=1; 1<h<=1.5; 1.5<h<=2; 2<h<=2.5; 2.5<h<=3; h>3; count axis from 0 to 12000) next to hours of newspaper reading written in as hours and minutes (axis ticks from 0 to 10000, with three individually plotted points).]
TV watching: method A versus method B
[Figure: two panels over the same categories (0; h<0.5; 0.5<=h<=1; 1<h<=1.5; 1.5<h<=2; 2<h<=2.5; 2.5<h<=3; h>3): hours of TV watching on the categorical scale, and hours of TV watching written in as hours and minutes, recoded. Vertical axes run from 0.00 to 0.20.]
Radio listening: method A versus method B
[Figure: two panels over the same categories (0; h<0.5; 0.5<=h<=1; 1<h<=1.5; 1.5<h<=2; 2<h<=2.5; 2.5<h<=3; h>3): hours of radio listening on the categorical scale, and hours of radio listening written in as hours and minutes, recoded. Vertical axes run from 0.00 to 0.25.]
Newspaper reading: method A versus method B
[Figure: two panels over the same categories (0; h<0.5; 0.5<=h<=1; 1<h<=1.5; 1.5<h<=2; 2<h<=2.5; 2.5<h<=3; h>3): hours of newspaper reading on the categorical scale, and hours of newspaper reading written in as hours and minutes, recoded. Vertical axes run from 0.0 to 0.4.]
Variation in influence of the method
Do people answer methods differently?
• The numeric method clearly produces many outliers, as well as very high values that may or may not be outliers.
• To the extent that this is due to confusion of hours and minutes, version C may remedy that problem.
• The distributions of hours with method A and method B (recoded) are similar but not the same:
  • Far fewer people report watching very little TV with method B (9% versus 4% of 40,355 respondents),
  • Numeric method B has more people who watch a lot of TV.
  • Numeric method B has a spike at exactly 1 hour for radio and newspaper.
• Overall, it is clear that the method has some influence on average over all 40,355 respondents.
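Comparing the two methods requires recoding each write-in answer (hours and minutes) onto the answer categories of method A. A minimal Python sketch of such a recoding follows. It assumes the category boundaries of the method A card (no time at all; less than ½ hour; ½ hour to 1 hour; and so on up to more than 3 hours). The function names and the integer codes 0 to 7 are illustrative assumptions, not the coding used by the authors.

```python
from collections import Counter

# Illustrative labels for codes 0..7, following the method A answer categories.
CATEGORY_LABELS = [
    "0", "h<0.5", "0.5<=h<=1", "1<h<=1.5",
    "1.5<h<=2", "2<h<=2.5", "2.5<h<=3", "h>3",
]


def recode_write_in(hours, minutes=0):
    """Map a write-in answer to a category code 0..7.

    Works in whole minutes to avoid floating-point comparisons at the
    category boundaries.
    """
    total = hours * 60 + minutes
    if total < 0:
        raise ValueError("time spent cannot be negative")
    if total == 0:
        return 0
    if total < 30:
        return 1
    # Upper bounds (in minutes) of categories 2..6; anything above is code 7.
    for code, upper in enumerate((60, 90, 120, 150, 180), start=2):
        if total <= upper:
            return code
    return 7


def category_shares(answers):
    """Share of respondents per category code, given (hours, minutes) pairs."""
    counts = Counter(recode_write_in(h, m) for h, m in answers)
    n = sum(counts.values())
    return {code: counts[code] / n for code in sorted(counts)}
```

The shares returned by `category_shares` can then be set beside the shares observed with the categorical question, which is the kind of comparison summarised in the bullets above.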
Variation in influence of the method
Is the difference between methods the same for all
respondents?
The same people were asked both versions. This allows us to
show variation in answers to the numeric question, within
categories of the categorical question.
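Because each respondent answered both versions, the numeric answers can be grouped by the category the same person chose. The sketch below shows one way to summarise that within-category variation in Python. The function name, the choice of summary statistics, and the example data mentioned afterwards are illustrative assumptions and not taken from the survey.

```python
from collections import defaultdict
from statistics import median


def numeric_by_category(pairs):
    """Summarise numeric answers within each categorical answer.

    `pairs` is an iterable of (category_code, numeric_hours), one pair per
    respondent who answered both question versions.
    """
    groups = defaultdict(list)
    for category, hours in pairs:
        groups[category].append(hours)
    return {
        category: {
            "n": len(values),
            "median": median(values),
            "min": min(values),
            "max": max(values),
        }
        for category, values in sorted(groups.items())
    }
```

For instance, with made-up pairs such as `[(1, 0.25), (1, 0.5), (1, 2.0)]`, a wide gap between `min` and `max` inside one category would indicate respondents whose two answers disagree.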
[Figure: density of the numeric value given (horizontal axis from 0 to 4, density axis from 0.0 to 1.0), in one panel per category of the categorical question: No time at all; Less than 0.5 hour; 0.5 hour to 1 hour; More than 1 hour, up to 1.5 hours; More than 1.5 hours, up to 2 hours; More than 2 hours, up to 2.5 hours; More than 2.5 hours, up to 3 hours.]
Variation in influence of the method
Do people answer methods differently?
• Not only does the method influence the distribution of
answers,
• the method effect also depends on the person.
Predicting the quality of a survey question from its design characteristics: SQP Daniel Oberski
Introduction Question design Modeling measurement error Estimating measurement error Predicting measurement error Concl
Variation in influence of the method
Do people respond to the methods differently?
• Not only does the method influence the distribution of
answers,
• the method effect also depends on the person.
Definitions
Traits, Methods, and Persons
• We can imagine the same question (“Trait”) being asked in
different ways (“Methods”);
• We can imagine the same method being used to ask different
questions;
• A response to a survey question is then a particular person’s
answer to one Trait-Method combination.
Definitions
Measurement error model
1. Responses are a measure of some underlying score
(“trait”), so that if a person’s memory were erased and the
person re-interviewed, they should give a similar answer.
2. Responses are influenced by random variation: errors,
such as mistaking minutes for hours, but also variation in
the information retrieved from memory.
3. The method influences the answers on average: for example,
there might be more social desirability bias in one method than
in another, or the scale may suggest some unspoken norm.
4. The influence of the method differs between people:
there is random variation in the differences between methods.
Definitions
Modeling measurement error
Definitions
Quasi-equation
Response =
Trait + Trait × Person
(Responses are a measure of some underlying score (“trait”), so that if a person’s memory were erased and the person re-interviewed, they should give a similar answer.)
+ Person × Moment
(Responses are influenced by random variation: errors, such as mistaking minutes for hours, but also variation in information retrieved from memory.)
+ Method + Method × Trait
(The method influences the answers on average, e.g. there might be more social desirability bias in one method than another, the scale may suggest some unspoken norm, etc.)
+ Method × Person
(Influence of method is different for different people: random variation in the differences between methods.)
Response = Trait + Method + Trait × Method
+ Trait × Person + Method × Person
+ Person × Moment
Definitions
Interpretation of the model
If persons are a random sample from a population $U$, consider
Person a random factor.
1. The residual (“rest”) variance is called “random measurement error”.
2. The proportion of residual variance in the total variance is called
“unreliability” ($1 - r^2$).
3. The proportion of Method×Person variance in the non-residual
variance is called “common method variance” (sometimes “invalidity”),
($1 - v^2$).
4. The proportion of Trait×Person variance in the total variance is called
the “quality” of the question ($q^2$ or $\kappa$).
5. “Quality” ($q^2$ or $\kappa$) will equal $v^2 \cdot r^2$.
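As a small worked illustration of these definitions, the sketch below splits a total variance into the three shares. The variance values are hypothetical, and the function name `quality_from_components` is introduced here only for illustration. It assumes that validity is taken relative to the non-residual variance, which is the reading under which quality equals the product of validity and reliability.

```python
def quality_from_components(var_trait, var_method, var_residual):
    """Return (r2, v2, q2) for given Trait x Person, Method x Person and residual variances."""
    total = var_trait + var_method + var_residual
    systematic = var_trait + var_method   # non-residual variance
    r2 = systematic / total               # reliability: 1 - share of residual variance
    v2 = var_trait / systematic           # validity: 1 - share of method variance in the systematic part
    q2 = var_trait / total                # quality: share of Trait x Person variance in the total
    return r2, v2, q2


# Hypothetical variance components, chosen only for illustration.
r2, v2, q2 = quality_from_components(var_trait=0.60, var_method=0.05, var_residual=0.35)
print(f"r2={r2:.3f}  v2={v2:.3f}  q2={q2:.3f}  r2*v2={r2 * v2:.3f}")
```

With these definitions the product of r2 and v2 reproduces q2 for any positive variance components, not only for the values used here.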
Formal model and assumptions
Equation model
$$Y_{ijk} = \tau_{ijk} + \eta_{ij} + \xi_{ik} + \epsilon_{ijk},$$
where
$i$ indexes persons;
$j$ indexes traits;
$k$ indexes methods.
Equation with Trait×Method interaction with Trait×Person
$$Y_{ijk} = \tau_{ijk} + \lambda_{jk} \eta_{ij} + \xi_{ik} + \epsilon_{ijk},$$
where
$i$ indexes persons;
$j$ indexes traits;
$k$ indexes methods.
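To make the equation concrete, the following sketch simulates responses from it. All parameter values are hypothetical. As simplifying assumptions for the illustration, the fixed part is held constant over persons (written `tau[j, k]`), every `lam[j, k]` is set to 1 (which gives the model without the interaction), and the person effects are drawn as independent normal variables.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_traits, n_methods = 2000, 3, 3

# Hypothetical parameter values, chosen only for illustration.
tau = rng.normal(size=(n_traits, n_methods))   # fixed Trait, Method and Trait x Method part
lam = np.ones((n_traits, n_methods))           # lambda_jk = 1: no Trait x Method scaling
sd_trait, sd_method, sd_resid = 1.0, 0.4, 0.6

eta = rng.normal(scale=sd_trait, size=(n_persons, n_traits))             # Trait x Person
xi = rng.normal(scale=sd_method, size=(n_persons, n_methods))            # Method x Person
eps = rng.normal(scale=sd_resid, size=(n_persons, n_traits, n_methods))  # Person x Moment

# Y[i, j, k] = tau[j, k] + lam[j, k] * eta[i, j] + xi[i, k] + eps[i, j, k]
Y = tau[None, :, :] + lam[None, :, :] * eta[:, :, None] + xi[:, None, :] + eps

expected_var = sd_trait**2 + sd_method**2 + sd_resid**2
print("sample variance of Y[:, 0, 0]:", round(Y[:, 0, 0].var(), 3))
print("variance implied by the parameters:", round(expected_var, 3))
```

Changing entries of `lam` away from 1 lets the trait effect weigh differently in each Trait-Method combination, which is what the second equation adds.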
Formal model and assumptions
Assumptions in the model
1. The (interaction) effects do not depend on other
Method×Trait combinations a person might receive
(“no carry-over effects”, “SUTVA”, “independence
assumption”).
2. There is no separate Person main effect: Trait and Method
within Person already capture all within-person correlation
(“method variance is the only systematic
variance”, $\mathrm{Cov}_U(\epsilon_{ijk}, \xi_{ik}) = 0$ and
$\mathrm{Cov}_U(\epsilon_{ijk}, \eta_{ij}) = 0$).
Assumption 2 can sometimes be relaxed (Oberski et al. in Salzborn, Davidov
& Reinecke (eds.), 2012).
Formal model and assumptions
The parameters of interest in the model are
• The variance over persons in the Trait effect;
• The variance over persons in the Method effect.
Expressed as proportions of the total variance over persons of
$Y_{jk}$, these two quantities equal, respectively,
• The reliability $\kappa_{jk}$ of a question asking Trait $j$ with Method $k$;
• The correlation between two different questions that is
purely due to them being measured with the same method.
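For the simplest form of the model ($\lambda_{jk} = 1$), these two quantities can be written out as below. This is a sketch under two additional assumptions that are not stated on the slide: trait and method effects are uncorrelated, and the residuals of different questions are uncorrelated. The symbol $\rho^{\text{method}}_{jk,j'k}$ is introduced here only for illustration.

```latex
\begin{align*}
\operatorname{Var}(Y_{jk}) &= \operatorname{Var}(\eta_{j}) + \operatorname{Var}(\xi_{k}) + \operatorname{Var}(\epsilon_{jk}), \\
\kappa_{jk} &= \frac{\operatorname{Var}(\eta_{j})}{\operatorname{Var}(Y_{jk})}, \\
\operatorname{Cov}(Y_{jk}, Y_{j'k}) &= \operatorname{Cov}(\eta_{j}, \eta_{j'}) + \operatorname{Var}(\xi_{k}), \qquad j \neq j', \\
\rho^{\text{method}}_{jk,j'k} &= \frac{\operatorname{Var}(\xi_{k})}{\sqrt{\operatorname{Var}(Y_{jk})\,\operatorname{Var}(Y_{j'k})}}.
\end{align*}
```

The last line is the part of the correlation between two questions that would remain even if their traits were uncorrelated.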
Estimation of measurement error with the MTMM design
Design requirements
What design is needed to estimate this model?
Response = Trait + Method + Trait × Method
+ Trait × Person + Method × Person
+ Person × Moment
$$Y_{ijk} = \tau_{ijk} + \eta_{ij} + \xi_{ik} + \epsilon_{ijk},$$
where $i$ indexes persons, $j$ indexes traits, and $k$ indexes methods.
• The model suggests that a Person×Method×Trait factorial
experiment would allow for the estimation of the reliability
and method variance.
• The residual (“measurement error”) term Person × Moment is
estimated by the Person × Trait × Method interaction.
• A Person×Method×Trait factorial experiment would ask
the same question in different ways (Methods) and use
the same method to ask different questions (Traits), within each
person;
• Campbell and Fiske introduced such designs in 1959
under the name “Multitrait-multimethod” (MTMM)
experiment.
• Not all Trait-Method combinations are necessary, but at
least one repetition within each person is required (Saris,
Satorra & Coenders, 2004).
• Under the model and assumptions 1 and 2, the MTMM
design will provide data that allow for the estimation of the
reliability and method variance (“invalidity”).
Design requirements
Example of an MTMM experiment
On an average weekday, how much time, in total...
T = 1 ...do you spend watching television?
T = 2 ...do you spend listening to the radio?
T = 3 ...do you spend reading the newspapers?
Scales:
M = 1: 8-point scale (hours)
M = 2: Write in hours and minutes
M = 3: 7-point scale with vague quantifiers
Each respondent answered all three questions in two different
ways.
The repetition was given at the end of the interview (after
approximately 50 minutes had passed).
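The structure of this experiment can be sketched by enumerating the Trait-Method combinations. The slide does not say which two methods each respondent received, so the split into method pairs below is a hypothetical illustration, and the helper `questions_for` is a name introduced here.

```python
from itertools import combinations, product

traits = {1: "watching television", 2: "listening to the radio", 3: "reading the newspapers"}
methods = {1: "8-point scale (hours)", 2: "write in hours and minutes", 3: "7-point scale, vague quantifiers"}

# All nine Trait-Method combinations (j, k).
cells = list(product(traits, methods))

# Hypothetical assignment: each respondent answers all traits with two of the three methods.
method_pairs = list(combinations(methods, 2))


def questions_for(pair):
    """Trait-Method combinations answered by a respondent assigned to this pair of methods."""
    return [(j, k) for k in pair for j in traits]


for pair in method_pairs:
    print(pair, questions_for(pair))
```

Each respondent then supplies six of the nine combinations, so every trait is repeated once within the person.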
Estimation of the model
Estimation issues
$$Y_{ijk} = \tau_{ijk} + \lambda_{jk} \eta_{ij} + \xi_{ik} + \epsilon_{ijk}.$$
• The model can be estimated with regression (with Person
a random factor);
• This is not flexible enough: it gives little control over the
covariance structure, and $\lambda_{jk}$ is not possible.
• The model can also be recognized as a factor analysis or,
more generally, as a structural equation model (SEM),
• or, through transformation, as an IRT or latent class model.
• The SEM framework allows enough flexibility to estimate
the parameters of interest: trait, method and residual
variance, or $r^2$, $v^2$, and quality $q^2$.
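The sketch below shows why an MTMM covariance matrix carries enough information to separate the three variance components. It is a moment-based illustration, not the SEM estimation described above. It assumes the simple model with all $\lambda_{jk} = 1$, uncorrelated trait and method effects, and uncorrelated method effects across methods. All population values are hypothetical.

```python
import numpy as np

# Hypothetical population values, chosen only for illustration.
trait_cov = np.array([[1.0, 0.3, 0.2],
                      [0.3, 1.0, 0.4],
                      [0.2, 0.4, 1.0]])    # Cov(eta_j, eta_j')
method_var = np.array([0.10, 0.05, 0.20])  # Var(xi_k)
resid_var = np.full((3, 3), 0.5)           # Var(eps_jk)


def implied_cov(trait_cov, method_var, resid_var):
    """Model-implied covariances S[j, k, j2, k2] = Cov(Y_jk, Y_j2k2)."""
    n_traits, n_methods = resid_var.shape
    S = np.zeros((n_traits, n_methods, n_traits, n_methods))
    for j in range(n_traits):
        for k in range(n_methods):
            for j2 in range(n_traits):
                for k2 in range(n_methods):
                    S[j, k, j2, k2] = trait_cov[j, j2]       # shared trait part
                    if k == k2:
                        S[j, k, j2, k2] += method_var[k]     # shared method part
                    if j == j2 and k == k2:
                        S[j, k, j2, k2] += resid_var[j, k]   # residual, variances only
    return S


def recover_components(S):
    """Recover trait, method and residual variances from the MTMM covariances."""
    n_traits, n_methods = S.shape[:2]
    traits, methods = range(n_traits), range(n_methods)
    # Same trait, different methods: only the trait variance is shared.
    trait_var = np.array([
        np.mean([S[j, k, j, k2] for k in methods for k2 in methods if k != k2])
        for j in traits
    ])
    # Different traits: same-method covariance minus different-method covariance.
    method_var = np.array([
        np.mean([S[j, k, j2, k] for j in traits for j2 in traits if j != j2])
        - np.mean([S[j, k, j2, k2] for j in traits for j2 in traits if j != j2
                   for k2 in methods if k2 != k])
        for k in methods
    ])
    total_var = np.array([[S[j, k, j, k] for k in methods] for j in traits])
    resid_var = total_var - trait_var[:, None] - method_var[None, :]
    return trait_var, method_var, resid_var, total_var


S = implied_cov(trait_cov, method_var, resid_var)
est_trait, est_method, est_resid, total = recover_components(S)
quality = est_trait[:, None] / total   # share of trait variance per question
print(np.round(quality, 3))
```

With real data, `S` would be replaced by a sample covariance matrix and the recovered values would be noisy. The comparison of same-method and different-method covariances is the same logic that motivates the MTMM design.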
Estimation of the model
The model as a SEM (or IRT or latent class) model
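[Path diagram: method factors M1, M2, M3 and trait factors T1, T2, T3, with nine observed variables y11, y21, y31, y12, y22, y32, y13, y23, y33 (first index: trait, second index: method).]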
Estimation of the model
Another example
Table 4: Experiment 2 of round 2

Main questionnaire (“A/D”)
  Introduction: Using this card, please tell me how true each of the following statements is about your current job.
  Statements:
  - There is a lot of variety in my work
  - My job is secure
  - My health or safety is at risk because of my work
  Answer categories: not at all true; a little true; quite true; very true

SC group 1 (IS)
  Introduction: The next 3 questions are about your current job.
  Statements:
  - Please choose one of the following to describe how varied your work is.
  - Please choose one of the following to describe how secure your job is.
  - Please choose one of the following to say how much, if at all, your work puts your health and safety at risk.
  Answer categories: not at all varied; a little varied; quite varied; very varied (same type of response scale using terms secure and safe instead of varied)

SC group 2 (IS)
  Statements:
  - Please indicate, on a scale of 0 to 10, how varied your work is, where 0 is not at all varied and 10 is very varied.
  - Now please indicate, on a scale of 0 to 10, how secure your job is, where 0 is not at all secure and 10 is very secure.
  - Please indicate, on a scale of 0 to 10, how much your health and safety is at risk from your work, where 0 is not at all at risk and 10 is very much at risk.
  Answer categories: Horizontal 11 point scale only labelled at the end points

Source: Révilla, Saris & Krosnick (2010)
Estimation of the model
Results from another example
Table 5: The mean reliability, validity and quality of the three questions of experiment 2 in Round 2 of the ESS across 10 countries for the
different methods (standard deviations in brackets)

         Reliability $r^2$      Validity $v^2$         Quality $q^2$
Method   Q1     Q2     Q3       Q1     Q2     Q3       Q1     Q2     Q3
A/D(4)   .65    .59    .61      .99    .98    .99      .64    .58    .60
         (.09)  (.18)  (.15)    (.02)  (.03)  (.03)    (.10)  (.18)  (.15)
IS(4)    .80    .80    .80      1      1      1        .80    .80    .80
         (.14)  (.13)  (.14)    (0)    (0)    (0)      (.14)  (.13)  (.14)
IS(11)   .81    .83    .77      .98    .98    .98      .80    .82    .76
         (.09)  (.11)  (.12)    (.03)  (.03)  (.04)    (.10)  (.12)  (.14)
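As a quick consistency check on Table 5, the sketch below multiplies the mean reliability by the mean validity and compares the product with the reported mean quality. The table gives averages over countries, so the product of the means does not have to match the mean quality exactly. The dictionary layout and the helper `max_gap` are introduced here only for illustration.

```python
# Mean coefficients from Table 5 for questions Q1, Q2, Q3.
table5 = {
    "A/D(4)": {"r2": [.65, .59, .61], "v2": [.99, .98, .99], "q2": [.64, .58, .60]},
    "IS(4)":  {"r2": [.80, .80, .80], "v2": [1, 1, 1],       "q2": [.80, .80, .80]},
    "IS(11)": {"r2": [.81, .83, .77], "v2": [.98, .98, .98], "q2": [.80, .82, .76]},
}


def max_gap(rows):
    """Largest absolute difference between r2 * v2 and the reported q2."""
    return max(abs(r * v - q) for r, v, q in zip(rows["r2"], rows["v2"], rows["q2"]))


for method, rows in table5.items():
    products = [round(r * v, 3) for r, v in zip(rows["r2"], rows["v2"])]
    print(method, "r2*v2 =", products, "reported q2 =", rows["q2"],
          f"max gap = {max_gap(rows):.3f}")
```

For every method the gap stays below .01, in line with quality being the product of reliability and validity up to rounding and averaging.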
[...] using a truth scale with the same number of categories for all three questions (around .7 to .9 versus .5 to .6). The position of the IS scale in the supplementary questionnaire is not an issue as the better quality of the IS scale is also observed both when it comes first and when it comes later.
Possibly the order of the observations with the different scale types has an impact on the size of the differences since we see fewer differences in this second experiment than in the first, but this may also be linked to the subject matter of the experiments or to other characteristics of the methods used (such as the number of points). More research is needed to determine this, however the important point here is that in different combinations, the superiority of the IS in terms of [...]
[...] scale with 11 categories was also better than the IS scale with 4 categories. So, not only might the kind of scale (IS versus A/D) impact the total quality of a measure, but so might the length of the scale (number of response categories). However, it seems that this effect varies across countries.
Experiments in Round 3 of the ESS
In round 3 of the ESS again two SB-MTMM experiments have been done which allow the comparison of the IS scales with A/D scales. The attraction of these experiments is that [...]
• It looks like there is much more measurement error
(residual variance) in the agree-disagree questions than
there is in the item-specific scales.
• This was true over all countries (shown is the average over
countries).
• We still wonder whether the same would be found with other
topics, under other conditions, and with other
combinations of methods.
Are some types of questions better than others?
Description of the data
• The examples given so far come from a much larger series
of MTMM experiments;
• In the European Social Survey (ESS), about six MTMM
experiments are done every round;
• So far there have been five rounds (2002, 2004, 2006, 2008, and 2010).
• The experiments are done in 20-30 European countries
every two years;
• Effective sample size per country is at least 1500.
• Each experiment usually estimates the quality for 9
questions (Method-Trait combinations).
• The range of topics is reasonably diverse, though factual
questions are underrepresented.
• In total about 5000 questions are available, but only 3000 of
those will be used here for various reasons.
Description of the data
• In addition to the ESS, an older series of experiments also
exists (F. Andrews; Költringer; Saris; Billiet, 1990s);
• These add another 1089 questions for which reliability and
validity coefficients have been estimated;
• Combining the two datasets (ESS question qualities and
old experiment qualities), we created a database of 3011
questions with their reliability and validity estimates.
Description of the data
Reliability and validity estimates of 3011 questions
[Figure: two histograms. First panel: reliability coefficient (x-axis from 0.4 to 1.0, frequency axis from 0 to 800). Second panel: validity coefficient (x-axis from 0.2 to 1.0, frequency axis from 0 to 1500).]
Description of the data
Logit transform of reliability and validity estimates
[Figure: two histograms. First panel: reliability coefficient, logit (x-axis from 0 to 6, frequency axis from 0 to 800). Second panel: validity coefficient, logit (x-axis from 0 to 6, frequency axis from 0 to 500).]
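The logit transform behind these histograms can be sketched as follows. This is a minimal illustration, assuming the standard definition logit(p) = log(p / (1 - p)) applied directly to a coefficient strictly between 0 and 1; the function names are ours and are not taken from the slides.

```python
import math

def logit(p: float) -> float:
    """Map a coefficient in the open interval (0, 1) to the real line."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is only defined for 0 < p < 1")
    return math.log(p / (1.0 - p))

def inv_logit(x: float) -> float:
    """Map a logit value back to the (0, 1) coefficient scale."""
    return 1.0 / (1.0 + math.exp(-x))

# A coefficient of 0.5 maps to 0; coefficients near 1 map to large positive values.
for coefficient in (0.5, 0.8, 0.9, 0.99):
    print(f"{coefficient:.2f} -> {logit(coefficient):.3f}")
```

One practical consequence: a prediction made on the logit scale always lands strictly between 0 and 1 once it is transformed back with `inv_logit`.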
Description of the data
Coding design characteristics of the 3011 questions
• For each of the 3011 questions in all countries, a team of
coders coded 40 design characteristics of the question;
• Some codes were generated automatically by natural
language processing software (numbers of syllables, words, etc.);
• Coders were students, assistants to the local coordinators
of the ESS, and two experts;
• For the English source version, the experts double-coded
the questions independently, then created consensus codes;
• Non-expert codes were quality-controlled by detailed
comparison with the consensus codes for the English source;
• In meetings between the experts and each of the other coders,
the discrepancies were discussed and either corrected or
left in as true differences.
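The comparison of non-expert codes with the consensus codes can be sketched as below. This is an illustration only: the data layout, the function name, and the example codes are hypothetical, and the slides do not say how the comparison was actually implemented. The sketch only lists candidate discrepancies; deciding whether one is a coding error or a true difference between the translated and the source question remains a matter for discussion.

```python
from typing import Dict, List, Tuple

Codes = Dict[str, Dict[str, object]]  # question id -> {characteristic: code}

def find_discrepancies(consensus: Codes, coder: Codes) -> List[Tuple[str, str, object, object]]:
    """List every (question, characteristic) where the coder differs from the consensus."""
    differences = []
    for question_id, expert_codes in consensus.items():
        coder_codes = coder.get(question_id, {})
        for characteristic, expert_value in expert_codes.items():
            coder_value = coder_codes.get(characteristic)  # None if the coder left it uncoded
            if coder_value != expert_value:
                differences.append((question_id, characteristic, expert_value, coder_value))
    return differences

# Hypothetical codes for two questions; the values are illustrative only.
consensus_codes = {
    "Q1": {"ncategories": 11, "dont know": 0, "usedshowcard": 1},
    "Q2": {"ncategories": 5, "dont know": 1, "usedshowcard": 0},
}
coder_codes = {
    "Q1": {"ncategories": 11, "dont know": 0, "usedshowcard": 1},
    "Q2": {"ncategories": 4, "dont know": 1, "usedshowcard": 0},
}

discrepancies = find_discrepancies(consensus_codes, coder_codes)
for question_id, characteristic, expert_value, coder_value in discrepancies:
    print(f"{question_id} {characteristic}: consensus={expert_value}, coder={coder_value}")
```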
Description of the data
• absolute
• avgabs intro
• avgabs total
• avgsy total
• avgwrd intro
• avgwrd total
• balance
• centrality
• computer.assisted
• concept
• country
• domain
• dont know
• encourage
• fixrefpoints
• form basic
• future
• labels
• instr interv
• instr respon
• interviewer
• intr request
• intropresent
• knowledge
• labels gramm
• labels order
• language
• motivation
• opinionother
• past
• position
• questiontype
• scal neutral
• scale basic
• scale corres
• scale trange
• scale urange
• showc boxes
• showc horiz
• showc letter
• showc over
• showc quest
• showc start
• socdesir
• stimulus
• subjectiveop
• symmetry
• used WH word
• usedshowcard
• visual
• from
Description of the data
Domain of question         # questions
International politics              64
Health                             190
Living conditions                  453
Other beliefs                      292
Work                               469
Personal relations                 320
Consumer behavior                   34
Leisure activities                 131
National government                141
Institutions                       284
Political parties                   30
Trade unions                        12
Economy                            237
Other                              354
Description of the data
Concept of question        # questions
Evaluative belief                  713
Feeling                            903
Importance                          96
Expectation                         39
Facts, behavior                     63
Judgement                          123
Relationship                         8
Evaluation                         704
Norm                                57
Policy                             250
Right                                4
Action tendency                     51
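As a quick consistency check, the counts in the domain and concept tables can be summed; each table should account for all 3011 questions in the database. The counts below are copied from the two tables, while the variable names are ours.

```python
domain_counts = {
    "International politics": 64, "Health": 190, "Living conditions": 453,
    "Other beliefs": 292, "Work": 469, "Personal relations": 320,
    "Consumer behavior": 34, "Leisure activities": 131, "National government": 141,
    "Institutions": 284, "Political parties": 30, "Trade unions": 12,
    "Economy": 237, "Other": 354,
}
concept_counts = {
    "Evaluative belief": 713, "Feeling": 903, "Importance": 96,
    "Expectation": 39, "Facts, behavior": 63, "Judgement": 123,
    "Relationship": 8, "Evaluation": 704, "Norm": 57,
    "Policy": 250, "Right": 4, "Action tendency": 51,
}

domain_total = sum(domain_counts.values())
concept_total = sum(concept_counts.values())
print(domain_total, concept_total)

# Share of questions per concept, largest first, to see how uneven the coverage is.
for concept, count in sorted(concept_counts.items(), key=lambda item: -item[1]):
    print(f"{concept:<20}{count / concept_total:6.1%}")
```

Both totals come to 3011, and the shares make the imbalance explicit: factual and behavioral questions are only a small fraction of the database.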
Meta-analysis of the MTMM experiments
Meta-analysis dataset
• For each of the 3011 questions, we have in the database:
  • The estimated quality (reliability and validity coefficients)
  • About 50 design characteristics (through hand- and
    automatic coding)
• The next step was to relate the design characteristics to
the quality estimates:
  • Can the quality estimates be predicted from the design
    characteristics?
Meta-analysis of the MTMM experiments
Meta-analysis
• Prediction by random forests of regression trees (Breiman
2001);
• Two separate models: one for the validity coefficients and one for
the reliability coefficients;
• Missing data are multiply imputed using the MICE
algorithm (van Buuren & Groothuis-Oudshoorn 2011).
Meta-analysis of the MTMM experiments
Example regression tree for logit(reliability coefficient)
[Figure: regression tree. Splits are on domain, gradation, position, concept, and ncategories. The root node has a mean logit of 1.955 (n=1988); its first split, on domain (codes 3, 4, 7, 11, 13, 14, 112 versus codes 6, 101, 103, 120), gives child nodes with means 1.724 (n=1303) and 2.394 (n=685). Node means range from 0.4959 (n=36) to 3.364 (n=54).]
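How such a tree turns a design characteristic into a predicted reliability can be sketched with its first split only. This is a simplified illustration, not the fitted model. It assumes the usual plotting convention that the first-listed domain codes (3, 4, 7, 11, 13, 14, 112) form the branch with node mean 1.724 and the remaining codes (6, 101, 103, 120) the branch with node mean 2.394. The function names are hypothetical.

```python
import math

FIRST_BRANCH_DOMAINS = {3, 4, 7, 11, 13, 14, 112}   # first branch label in the figure
SECOND_BRANCH_DOMAINS = {6, 101, 103, 120}          # second branch label in the figure

def predict_logit_reliability(domain: int) -> float:
    """Depth-one version of the example tree: return a node mean on the logit scale."""
    if domain in FIRST_BRANCH_DOMAINS:
        return 1.724   # node with n=1303 (assumed to belong to the first branch)
    if domain in SECOND_BRANCH_DOMAINS:
        return 2.394   # node with n=685 (assumed to belong to the second branch)
    return 1.955       # root mean (n=1988) for domain codes that do not appear in the tree

def inv_logit(x: float) -> float:
    """Back-transform a logit-scale prediction to the coefficient scale."""
    return 1.0 / (1.0 + math.exp(-x))

for domain in (3, 6):
    logit_value = predict_logit_reliability(domain)
    print(domain, logit_value, round(inv_logit(logit_value), 3))
```

The full tree refines these two predictions with further splits on gradation, position, concept, and ncategories; a random forest then averages the predictions of many such trees.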
Meta-analysis of the MTMM experiments
Meta-analysis with random forests
• R² based on the out-of-bag (cross-validation) mean squared
error is 85% for the validity coefficient and 60% for the reliability
coefficient.
• Importance measures indicate that domain, number of
categories, concept, position in the questionnaire, number
of syllables, country, number of words, fixed reference
points, and other linguistic complexity measures are the
most influential for reliability.
• For validity, in addition to the above, the order of the labels
(positive-negative), centrality of the trait, and other
characteristics are also important.
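The out-of-bag R² can be illustrated with a deliberately small stand-in for a random forest. The sketch below bags depth-one regression trees (stumps) on synthetic data. Unlike Breiman's random forests, it does not subsample predictors at each split, and neither the data nor the function names come from the study. Each observation is predicted only by trees whose bootstrap sample did not contain it, and R² is one minus the out-of-bag mean squared error divided by the variance of the outcome.

```python
import random
from statistics import mean, pvariance

def fit_stump(rows, targets):
    """Fit a depth-one regression tree by minimising the squared error of one split."""
    best = None
    for feature in range(len(rows[0])):
        values = sorted({row[feature] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [t for row, t in zip(rows, targets) if row[feature] < threshold]
            right = [t for row, t in zip(rows, targets) if row[feature] >= threshold]
            left_mean, right_mean = mean(left), mean(right)
            sse = (sum((t - left_mean) ** 2 for t in left)
                   + sum((t - right_mean) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, feature, threshold, left_mean, right_mean)
    if best is None:                      # all predictors constant: predict the mean
        overall = mean(targets)
        return lambda row: overall
    _, feature, threshold, left_mean, right_mean = best
    return lambda row: left_mean if row[feature] < threshold else right_mean

def out_of_bag_r2(rows, targets, n_trees=100, seed=1):
    """Bag stumps and score each observation only with trees that did not see it."""
    rng = random.Random(seed)
    n = len(rows)
    sums, counts = [0.0] * n, [0] * n
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]     # bootstrap sample
        stump = fit_stump([rows[i] for i in in_bag], [targets[i] for i in in_bag])
        for i in set(range(n)) - set(in_bag):             # out-of-bag observations
            sums[i] += stump(rows[i])
            counts[i] += 1
    scored = [i for i in range(n) if counts[i] > 0]
    mse = mean((targets[i] - sums[i] / counts[i]) ** 2 for i in scored)
    return 1 - mse / pvariance([targets[i] for i in scored])

# Synthetic data: the outcome jumps when the first predictor exceeds 0.5;
# the second predictor is pure noise.
data_rng = random.Random(0)
rows = [(i / 60, data_rng.random()) for i in range(60)]
targets = [(2.0 if x > 0.5 else 1.0) + data_rng.gauss(0, 0.05) for x, _ in rows]
print(round(out_of_bag_r2(rows, targets), 3))
```

Because every prediction comes from trees that never saw the observation, this R² behaves like a cross-validated measure of fit rather than an in-sample one.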
Predicting the quality of a survey question from its design characteristics: SQP Daniel Oberski
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP

Más contenido relacionado

La actualidad más candente

Mixed Effects Models - Signal Detection Theory
Mixed Effects Models - Signal Detection TheoryMixed Effects Models - Signal Detection Theory
Mixed Effects Models - Signal Detection TheoryScott Fraundorf
 
Question Answering System using machine learning approach
Question Answering System using machine learning approachQuestion Answering System using machine learning approach
Question Answering System using machine learning approachGarima Nanda
 
Causal Inference in Data Science and Machine Learning
Causal Inference in Data Science and Machine LearningCausal Inference in Data Science and Machine Learning
Causal Inference in Data Science and Machine LearningBill Liu
 

La actualidad más candente (7)

Mixed Effects Models - Signal Detection Theory
Mixed Effects Models - Signal Detection TheoryMixed Effects Models - Signal Detection Theory
Mixed Effects Models - Signal Detection Theory
 
On Impact in Software Engineering Research
On Impact in Software Engineering ResearchOn Impact in Software Engineering Research
On Impact in Software Engineering Research
 
On impact in Software Engineering Research (ICSE 2018 New Faculty Symposium)
On impact in Software Engineering Research (ICSE 2018 New Faculty Symposium)On impact in Software Engineering Research (ICSE 2018 New Faculty Symposium)
On impact in Software Engineering Research (ICSE 2018 New Faculty Symposium)
 
Twittering Dissent
Twittering DissentTwittering Dissent
Twittering Dissent
 
Question Answering System using machine learning approach
Question Answering System using machine learning approachQuestion Answering System using machine learning approach
Question Answering System using machine learning approach
 
On Impact in Software Engineering Research (Dagstuhl 2020)
On Impact in Software Engineering Research (Dagstuhl 2020)On Impact in Software Engineering Research (Dagstuhl 2020)
On Impact in Software Engineering Research (Dagstuhl 2020)
 
Causal Inference in Data Science and Machine Learning
Causal Inference in Data Science and Machine LearningCausal Inference in Data Science and Machine Learning
Causal Inference in Data Science and Machine Learning
 

Destacado

A measure to evaluate latent variable model fit by sensitivity analysis
A measure to evaluate latent variable model fit by sensitivity analysisA measure to evaluate latent variable model fit by sensitivity analysis
A measure to evaluate latent variable model fit by sensitivity analysisDaniel Oberski
 
Multidirectional survey measurement errors: the latent class MTMM model
Multidirectional survey measurement errors: the latent class MTMM modelMultidirectional survey measurement errors: the latent class MTMM model
Multidirectional survey measurement errors: the latent class MTMM modelDaniel Oberski
 
Complex sampling in latent variable models
Complex sampling in latent variable modelsComplex sampling in latent variable models
Complex sampling in latent variable modelsDaniel Oberski
 
lavaan.survey: An R package for complex survey analysis of structural equatio...
lavaan.survey: An R package for complex survey analysis of structural equatio...lavaan.survey: An R package for complex survey analysis of structural equatio...
lavaan.survey: An R package for complex survey analysis of structural equatio...Daniel Oberski
 
How good are administrative register data and what can we do about it?
How good are administrative register data and what can we do about it?How good are administrative register data and what can we do about it?
How good are administrative register data and what can we do about it?Daniel Oberski
 
ESRA2015 course: Latent Class Analysis for Survey Research
ESRA2015 course: Latent Class Analysis for Survey ResearchESRA2015 course: Latent Class Analysis for Survey Research
ESRA2015 course: Latent Class Analysis for Survey ResearchDaniel Oberski
 
Predicting the quality of a survey question from its design characteristics
Predicting the quality of a survey question from its design characteristicsPredicting the quality of a survey question from its design characteristics
Predicting the quality of a survey question from its design characteristicsDaniel Oberski
 
Detecting local dependence in latent class models
Detecting local dependence in latent class modelsDetecting local dependence in latent class models
Detecting local dependence in latent class modelsDaniel Oberski
 

Destacado (9)

A measure to evaluate latent variable model fit by sensitivity analysis
A measure to evaluate latent variable model fit by sensitivity analysisA measure to evaluate latent variable model fit by sensitivity analysis
A measure to evaluate latent variable model fit by sensitivity analysis
 
Multidirectional survey measurement errors: the latent class MTMM model
Multidirectional survey measurement errors: the latent class MTMM modelMultidirectional survey measurement errors: the latent class MTMM model
Multidirectional survey measurement errors: the latent class MTMM model
 
Complex sampling in latent variable models
Complex sampling in latent variable modelsComplex sampling in latent variable models
Complex sampling in latent variable models
 
lavaan.survey: An R package for complex survey analysis of structural equatio...
lavaan.survey: An R package for complex survey analysis of structural equatio...lavaan.survey: An R package for complex survey analysis of structural equatio...
lavaan.survey: An R package for complex survey analysis of structural equatio...
 
How good are administrative register data and what can we do about it?
How good are administrative register data and what can we do about it?How good are administrative register data and what can we do about it?
How good are administrative register data and what can we do about it?
 
ESRA2015 course: Latent Class Analysis for Survey Research
ESRA2015 course: Latent Class Analysis for Survey ResearchESRA2015 course: Latent Class Analysis for Survey Research
ESRA2015 course: Latent Class Analysis for Survey Research
 
Predicting the quality of a survey question from its design characteristics
Predicting the quality of a survey question from its design characteristicsPredicting the quality of a survey question from its design characteristics
Predicting the quality of a survey question from its design characteristics
 
Detecting local dependence in latent class models
Detecting local dependence in latent class modelsDetecting local dependence in latent class models
Detecting local dependence in latent class models
 
Using Standardized Instruments
Using Standardized InstrumentsUsing Standardized Instruments
Using Standardized Instruments
 

Similar a Predicting the quality of a survey question from its design characteristics: SQP

Statistical methods for questionnaire development: Questionnaire reliability ...
Statistical methods for questionnaire development: Questionnaire reliability ...Statistical methods for questionnaire development: Questionnaire reliability ...
Statistical methods for questionnaire development: Questionnaire reliability ...Ahmed Negida
 
Psychometric Studies in the Development of an Inkjet Printer
Psychometric Studies in the Development of an Inkjet PrinterPsychometric Studies in the Development of an Inkjet Printer
Psychometric Studies in the Development of an Inkjet PrinterDavid Lee
 
In Quest for Requirements Engineering Oracles: Dependent Variables and Measur...
In Quest for Requirements Engineering Oracles: Dependent Variables and Measur...In Quest for Requirements Engineering Oracles: Dependent Variables and Measur...
In Quest for Requirements Engineering Oracles: Dependent Variables and Measur...Daniel Mendez
 
Ryan Ripley - The #NoEstimatesMovement
Ryan Ripley - The #NoEstimatesMovementRyan Ripley - The #NoEstimatesMovement
Ryan Ripley - The #NoEstimatesMovementProjectCon
 
2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)
2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)
2015EDM: A Framework for Multifaceted Evaluation of Student Models (Polygon)Yun Huang
 
'A critique of testing' UK TMF forum January 2015
'A critique of testing' UK TMF forum January 2015 'A critique of testing' UK TMF forum January 2015
'A critique of testing' UK TMF forum January 2015 Georgina Tilby
 
reliability and validity psychology 1234
reliability and validity psychology 1234reliability and validity psychology 1234
reliability and validity psychology 1234MajaAiraBumatay
 
Advancing Testing Using Axioms
Advancing Testing Using AxiomsAdvancing Testing Using Axioms
Advancing Testing Using AxiomsSQALab
 
Chapter 2 The Science of Psychological Measurement (Alivio, Ansula).pptx
Chapter 2 The Science of Psychological Measurement (Alivio, Ansula).pptxChapter 2 The Science of Psychological Measurement (Alivio, Ansula).pptx
Chapter 2 The Science of Psychological Measurement (Alivio, Ansula).pptxHazelLansula1
 
Measurement and scaling
Measurement and scalingMeasurement and scaling
Measurement and scalingAli Syed
 
Doing observation and Data Analysis for Qualitative Research
Doing observation and Data Analysis for Qualitative ResearchDoing observation and Data Analysis for Qualitative Research
Doing observation and Data Analysis for Qualitative ResearchAhmad Johari Sihes
 
Lecture slides stats1.13.l06.air
Lecture slides stats1.13.l06.airLecture slides stats1.13.l06.air
Lecture slides stats1.13.l06.airatutor_te
 
Model evaluation 201606
Model evaluation 201606Model evaluation 201606
Model evaluation 201606Town Peterson
 
Research 101: Scale Validity & Reliability
Research 101: Scale Validity & ReliabilityResearch 101: Scale Validity & Reliability
Research 101: Scale Validity & ReliabilityHarold Gamero
 
Instrument development and psychometric validation 030222
Instrument development and psychometric validation 030222Instrument development and psychometric validation 030222
Instrument development and psychometric validation 030222Roger Watson
 
Assessment and individual differences
Assessment and individual differencesAssessment and individual differences
Assessment and individual differencesSullivan Turner
 
Hypothesise like you Mean it!
Hypothesise like you Mean it!Hypothesise like you Mean it!
Hypothesise like you Mean it!Chris Massey
 
Scale development
Scale developmentScale development
Scale developmentmichaelsony
 
Saliency-based Models of Image Content and their Application to Auto-Annotati...
Saliency-based Models of Image Content and their Application to Auto-Annotati...Saliency-based Models of Image Content and their Application to Auto-Annotati...
Saliency-based Models of Image Content and their Application to Auto-Annotati...Jonathon Hare
 

Similar a Predicting the quality of a survey question from its design characteristics: SQP (20)

Statistical methods for questionnaire development: Questionnaire reliability ...
Statistical methods for questionnaire development: Questionnaire reliability ...Statistical methods for questionnaire development: Questionnaire reliability ...
Statistical methods for questionnaire development: Questionnaire reliability ...
 
Psychometric Studies in the Development of an Inkjet Printer
Psychometric Studies in the Development of an Inkjet PrinterPsychometric Studies in the Development of an Inkjet Printer
Psychometric Studies in the Development of an Inkjet Printer
 
In Quest for Requirements Engineering Oracles: Dependent Variables and Measur...
In Quest for Requirements Engineering Oracles: Dependent Variables and Measur...In Quest for Requirements Engineering Oracles: Dependent Variables and Measur...
In Quest for Requirements Engineering Oracles: Dependent Variables and Measur...
 
Ryan Ripley - The #NoEstimatesMovement
Predicting the quality of a survey question from its design characteristics: SQP

  • 1. Predicting the quality of a survey question from its design characteristics: SQP. Daniel Oberski (joint work with Willem Saris), Universitat Pompeu Fabra. Section bar: Introduction, Question design, Modeling measurement error, Estimating measurement error, Predicting measurement error, Concl
  • 2. [Diagram, after Groves et al. 2004]
      Measurement: Construct → Measurement → Response → Edited data. Error sources: validity, measurement error, processing error.
      Representation: Inferential population → Target population → Sampling frame → Sample → Respondents. Error sources: coverage error, sampling error, nonresponse error.
      Outcome: Survey statistic.
  • 3. [Diagram, measurement side only] Construct → Measurement → Response → Edited data. Error sources: validity, measurement error, processing error.
  • 4-8.
      • Assume the step from construct to measurement is already acceptable.
          → Assume that the question measures an intended construct: the respondent knows the answer, can interpret the question, ...
          → The reaction of the respondent to the question depends on some unobserved value/opinion, which is in turn a measure of the construct.
      • We focus only on the degree to which the response is a good measure of this unobserved score/opinion: “measurement error”.
      • (NOT the degree to which the question is interpretable, measures some construct, etc.)
  • 9-14. Reasons to study measurement error
      • Reliability is an upper bound on validity; responses can never measure the underlying construct better than the single indicator.
      • Unreliability increases the variance of estimators:
          • $\operatorname{var}(\hat{\mu}) = \kappa^{-1} \sigma^2 / n$, where $\kappa \in (0, 1)$ is the reliability.
      • Unreliability reduces the apparent strength of relationships between variables:
          • $\rho_{xy} = \kappa_x \cdot \kappa_y \cdot \rho_{XY}$, where $\rho_{XY}$ is the true correlation and $\rho_{xy}$ the observed correlation.
      • Correlated measurement errors will make variables look more related than they really are; e.g. “How many minutes does it take to...” questions correlate partly because they are all asked in the same way.
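The two formulas above can be tried out numerically. The following is a minimal sketch that takes them at face value; the function names and the example numbers are illustrative assumptions, not part of the talk. Note that in the correlation formula κ behaves like a reliability coefficient (the correlation between response and true score); if κ is instead read as a proportion of variance, the classical attenuation factor would be the square root of the product κx · κy.

```python
def variance_of_mean(sigma2: float, n: int, kappa: float) -> float:
    """Variance of the sample mean under the slide's formula:
    var(mu_hat) = sigma^2 / (kappa * n).

    kappa = 1 is allowed here as the error-free baseline (an assumption;
    the slide states kappa in the open interval (0, 1)).
    """
    if not 0.0 < kappa <= 1.0:
        raise ValueError("kappa must lie in (0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma2 / (kappa * n)


def observed_correlation(rho_true: float, kappa_x: float, kappa_y: float) -> float:
    """Observed correlation under the slide's formula:
    rho_xy = kappa_x * kappa_y * rho_XY.
    """
    return kappa_x * kappa_y * rho_true


if __name__ == "__main__":
    # Hypothetical numbers: sigma^2 = 4, n = 1000 respondents, kappa = 0.8.
    with_error = variance_of_mean(4.0, 1000, 0.8)
    without_error = variance_of_mean(4.0, 1000, 1.0)
    print(f"var(mean) with error:    {with_error:.5f}")
    print(f"var(mean) without error: {without_error:.5f}")
    # Hypothetical true correlation 0.5 measured with kappa_x = 0.9, kappa_y = 0.8.
    print(f"observed correlation:    {observed_correlation(0.5, 0.9, 0.8):.3f}")
```

Under the variance formula, a survey of n respondents with reliability κ is as precise as an error-free survey of κ · n respondents, so κ = 0.8 costs roughly a fifth of the effective sample.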
  • 15. Public health ranking: Correction of regression coefficients for κ
      [Figure: educational differentials in subjective health with 2 s.e. interval (axis from -0.4 to 0.0), by country: GR, CZ, PT, SI, FI, HU, PL, SK, LU, ES, EE, DK, DE, TR, IS, NO, CH, BE, IE, FR, UA, AT, NL, SE. Two series: uncorrected regression coefficient and measurement error-corrected coefficient. Values printed in the figure, in extraction order: 0.82, 0.85, 0.78, 0.73, 0.56, 0.75, 0.71, 0.81, 0.86, 0.85, 0.95, 0.84, 0.91, 0.70, 0.81, 0.87, 0.81, 0.82, 0.92, 0.85, 0.91, 0.81, 0.93, 0.99.]
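Why a correction can change a country ranking may be easier to see with a toy calculation. The sketch below is not the procedure behind the figure: it assumes, purely for illustration, that an observed coefficient equals κ times the true coefficient, and it uses three hypothetical countries with made-up coefficients.

```python
def correct_coefficient(observed: float, kappa: float) -> float:
    """Undo a simple attenuation, ASSUMING observed = kappa * true.

    This attenuation model is an illustrative assumption, not the
    correction method used in the talk.
    """
    if not 0.0 < kappa <= 1.0:
        raise ValueError("kappa must lie in (0, 1]")
    return observed / kappa


def rank_most_negative_first(coefficients: dict) -> list:
    """Order country labels from the most negative coefficient upward."""
    return sorted(coefficients, key=coefficients.get)


# Hypothetical countries: (observed coefficient, kappa). All values invented.
observed = {"A": (-0.20, 0.95), "B": (-0.18, 0.56), "C": (-0.25, 0.99)}

uncorrected = {country: b for country, (b, _) in observed.items()}
corrected = {country: correct_coefficient(b, k) for country, (b, k) in observed.items()}

if __name__ == "__main__":
    print("uncorrected ranking:", rank_most_negative_first(uncorrected))
    print("corrected ranking:  ", rank_most_negative_first(corrected))
```

In this toy setup, country B has the weakest observed differential but the strongest one after correction, because its measurement quality is the lowest. Differences in κ across countries can therefore reorder a ranking even when the observed coefficients look similar.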
  • 16. Design characteristics of questions (Saris & Gallhofer 2007)
      • Social desirability
      • Centrality
      • Reference period
      • Question formulation
      • WH word used
      • Use of gradation
      • Balance of the request
      • Encouragement
      • Showcards present
      • Showcards have pictures
      • Emphasis on subjective opinion in request
      • Information about the opinion of other people
      • Use of stimulus or statement in the question
      • Absolute or comparative judgment
      • Response scale: basic choice
      • Number of categories
      • Labels full, partial, or no
      • Labels full sentences
      • Knowledge provided
      • Survey mode
      • Order of the labels
      • Correspondence between labels and numbers of the scale
      • Theoretical range of the scale
      • Neutral category
      • Number of fixed reference points
      • Don’t know option
      • Interviewer instruction
      • Respondent instruction
      • Extra motivation, info or definition available?
      • Agree-disagree scale
      • ...
  • 17-23. Question design choices
      • There are a great number of question design characteristics for which it has at some point been found or suggested that they influence the response;
      • Any question in a questionnaire represents a series of choices (conscious or not) on those characteristics: a method of asking the question;
      • It is clear that what is a good method depends strongly on the topic, for example:
          • The frequency and importance of an event or series of events asked about determine: reasonable reference periods; reasonable categories (wide or deep); approximately or exactly (Tourangeau et al. 2000).
      • But are some methods generally better than others?
      • If so, what about those methods makes them better?
  • 24. Talk outline
      1 Question design
          The influence of the method
          Variation in influence of the method
      2 Modeling measurement error
          Definitions
          Formal model and assumptions
      3 Estimating measurement error
          Design requirements
          Estimation of the model
      4 Predicting measurement error
          Description of the data
          Meta-analysis of the MTMM experiments
          Program demonstration
  • 25. The influence of the method: the method influences the answers
  • 26. The influence of the method: European Social Survey, 2002
  • 27-28. The influence of the method: European Social Survey, 2002
      Method A (answer categories on a showcard):
          ENTER START TIME:
          A1 TvTot, CARD 1. On an average weekday, how much time, in total, do you spend watching television? Please use this card to answer.
              00 No time at all (GO TO A3)
              01 Less than ½ hour
              02 ½ hour to 1 hour
              03 More than 1 hour, up to 1½ hours
              04 More than 1½ hours, up to 2 hours
              05 More than 2 hours, up to 2½ hours
              06 More than 2½ hours, up to 3 hours
              07 More than 3 hours
              88 (Don’t know)
              (ASK A2 is printed alongside codes 01 to 07.)
          A2 TvPol, STILL CARD 1. And again on an average weekday, how much of your time watching television is spent watching news or programmes about politics and current affairs? Still use this card.
      Method B (answer written in as hours and minutes):
          • On an average weekday, how much time, in total, do you spend watching television?
          • On an average weekday, how much time, in total, do you spend listening to the radio?
          • On an average weekday, how much time, in total, do you spend reading the newspapers?
  • 29. The influence of the method. TV watching: method A versus method B. [Figure: hours of TV watching written in as hours and minutes (method B) against hours of TV watching on the categorical scale (method A: 0; h<0.5; 0.5<=h<=1; 1<h<=1.5; 1.5<h<=2; 2<h<=2.5; 2.5<h<=3; h>3).]
  • 30. The influence of the method. Radio listening: method A versus method B. [Figure: hours of radio listening written in as hours and minutes (method B) against hours of radio listening on the categorical scale (method A: 0; h<0.5; 0.5<=h<=1; 1<h<=1.5; 1.5<h<=2; 2<h<=2.5; 2.5<h<=3; h>3).]
  • 31. The influence of the method. Newspaper reading: method A versus method B. [Figure: hours of newspaper reading on the categorical scale (0, h<0.5, 0.5<=h<=1, 1<h<=1.5, 1.5<h<=2, 2<h<=2.5, 2.5<h<=3, h>3) plotted against the answer written in as hours and minutes.]
  • 32. The influence of the method. TV watching: method A versus method B. [Figure: two panels over the same categories (0, h<0.5, 0.5<=h<=1, 1<h<=1.5, 1.5<h<=2, 2<h<=2.5, 2.5<h<=3, h>3), one for the categorical scale and one for the write-in answers in hours and minutes, recoded; vertical axes run from 0.00 to 0.20.]
  • 33. The influence of the method. Radio listening: method A versus method B. [Figure: two panels over the same categories (0, h<0.5, 0.5<=h<=1, 1<h<=1.5, 1.5<h<=2, 2<h<=2.5, 2.5<h<=3, h>3), one for the categorical scale and one for the write-in answers in hours and minutes, recoded; vertical axes run from 0.00 to 0.25.]
  • 34. The influence of the method. Newspaper reading: method A versus method B. [Figure: two panels over the same categories (0, h<0.5, 0.5<=h<=1, 1<h<=1.5, 1.5<h<=2, 2<h<=2.5, 2.5<h<=3, h>3), one for the categorical scale and one for the write-in answers in hours and minutes, recoded; vertical axes run from 0.0 to 0.4.]
  • 41. Variation in influence of the method. Do people answer methods differently?
      • The numeric method clearly produces many outliers, as well as very high values that may or may not be outliers.
      • To the extent that this is due to confusion of hours and minutes, version C may remedy that problem.
      • The distributions of hours with methods A and B (recoded) are similar but not the same:
          • Far fewer people report watching very little TV with method B (4%, versus 9% with method A, of 40,355 respondents);
          • Numeric method B has more people who watch a lot of TV;
          • Numeric method B has a spike at exactly 1 hour for radio and newspaper.
      • Overall, it is clear that the method has some influence on average over all 40,355 respondents.
  • 43. Variation in influence of the method. Is the difference between methods the same for all respondents? The same people were asked both versions. This lets us show the variation in answers to the numeric question within each category of the categorical question.
  • 44. Variation in influence of the method. Is the difference between methods the same for all respondents? [Figure: seven density plots of the numeric value given (horizontal axis 0 to 4; vertical axis: density, 0.0 to 1.0), one panel per category of the categorical question: “No time at all”; “Less than 0,5 hour”; “0,5 hour to 1 hour”; “More than 1 hour, up to 1,5 hours”; “More than 1,5 hours, up to 2 hours”; “More than 2 hours, up to 2,5 hours”; “More than 2,5 hours, up to 3 hours”.]
  • 46. Variation in influence of the method. Do people answer methods differently?
      • Not only does the method influence the distribution of answers,
      • the method effect also depends on the person.
  • 49. Definitions. Traits, Methods, and Persons
      • We can imagine the same question (“Trait”) being asked in different ways (“Methods”);
      • We can imagine the same method being used to ask different questions;
      • Responses to survey questions are then different persons’ answers to Trait-Method combinations.
  • 53. Definitions. Measurement error model
      1 Responses are a measure of some underlying score (“trait”), so that if a person’s memory were erased and the person re-interviewed, they should give a similar answer.
      2 Responses are influenced by random variation: errors, such as mistaking minutes for hours, but also variation in the information retrieved from memory.
      3 The method influences the answers on average, e.g. there might be more social desirability bias in one method than in another, the scale may suggest some unspoken norm, etc.
      4 The influence of the method is different for different people: random variation in the differences between methods.
  • 54. Definitions. Modeling measurement error
  • 55. Definitions. Quasi-equation. Response =
      • Trait + Trait × Person: responses are a measure of some underlying score (“trait”), so that if a person’s memory were erased and the person re-interviewed, they should give a similar answer;
      • + Person × Moment: responses are influenced by random variation: errors, such as mistaking minutes for hours, but also variation in the information retrieved from memory;
      • + Method + Method × Trait: the method influences the answers on average, e.g. there might be more social desirability bias in one method than in another, the scale may suggest some unspoken norm, etc.;
      • + Method × Person: the influence of the method is different for different people: random variation in the differences between methods.
  • 56. Definitions. Quasi-equation: Response = Trait + Method + Trait × Method + Trait × Person + Method × Person + Person × Moment
  • 61. Definitions. Interpretation of the model. If persons are a random sample from a population U, consider Person a random factor.
      1 The “Rest” (residual) variance is called “random measurement error”.
      2 The proportion of residual variance in the total is called “unreliability”, $1 - r^2$.
      3 The proportion of Method×Person variance in the total is called “common method variance” (sometimes “invalidity”), $1 - v^2$.
      4 The proportion of Trait×Person variance in the total is called the “quality” of the question ($q^2$ or $\kappa$).
      5 “Quality” ($q^2$ or $\kappa$) will equal $v^2 \cdot r^2$.
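As a rough numerical illustration of the decomposition above, the sketch below turns three variance components into $r^2$, $v^2$ and $q^2$. The variance values are invented, and the function name `quality_coefficients` is hypothetical. It assumes the convention in which $v^2$ is computed relative to the systematic (trait plus method) variance rather than the total variance; that is the convention under which $q^2 = r^2 \cdot v^2$ holds exactly.

```python
def quality_coefficients(trait_var: float, method_var: float, residual_var: float) -> dict:
    """Return reliability r2, validity v2 and quality q2 from variance components."""
    total = trait_var + method_var + residual_var
    systematic = trait_var + method_var   # everything except random error
    r2 = systematic / total               # 1 - r2 is the residual share ("unreliability")
    v2 = trait_var / systematic           # assumption: validity relative to systematic variance
    q2 = trait_var / total                # share of Trait x Person variance in the total
    return {"r2": r2, "v2": v2, "q2": q2}


# Invented variance components, for illustration only.
coef = quality_coefficients(trait_var=6.0, method_var=1.0, residual_var=3.0)
print(coef)
print("r2 * v2 =", coef["r2"] * coef["v2"])
```

With these numbers the residual takes 30% of the total variance, so $r^2 = 0.7$, and the trait takes 60% of the total, so $q^2 = 0.6$.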
  • 62. Formal model and assumptions. Equation model: $Y_{ijk} = \tau_{ijk} + \eta_{ij} + \xi_{ik} + \epsilon_{ijk}$, where $i$ indexes persons, $j$ indexes traits, and $k$ indexes methods.
  • 63. Formal model and assumptions. Model
      Response = Trait + Method + Trait × Method + Trait × Person + Method × Person + Person × Moment
      $Y_{ijk} = \tau_{ijk} + \eta_{ij} + \xi_{ik} + \epsilon_{ijk}$,
      where $i$ indexes persons, $j$ indexes traits, and $k$ indexes methods.
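The model above can be explored with a small simulation. The sketch below draws answers from $Y_{ijk} = \tau + \eta_{ij} + \xi_{ik} + \epsilon_{ijk}$ for two traits and two methods and compares sample covariances with the values the model implies. Everything numeric here is an assumption made for illustration: the standard deviations, the constant $\tau$, the normal distributions, and in particular the independence of the two trait scores (real traits will usually correlate).

```python
import random

random.seed(12345)

N_PERSONS = 20000
TRAIT_SD = 1.0    # sd over persons of the Trait x Person effect (eta)
METHOD_SD = 0.5   # sd over persons of the Method x Person effect (xi)
ERROR_SD = 0.7    # sd of the residual (epsilon)
TAU = 2.0         # fixed part (tau), set equal for all items for simplicity


def covariance(x, y):
    """Sample covariance of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


# y[(j, k)] holds the answers of all persons to trait j asked with method k.
y = {(j, k): [] for j in range(2) for k in range(2)}
for _ in range(N_PERSONS):
    eta = [random.gauss(0, TRAIT_SD) for _ in range(2)]   # one score per trait
    xi = [random.gauss(0, METHOD_SD) for _ in range(2)]   # one effect per method
    for j in range(2):
        for k in range(2):
            eps = random.gauss(0, ERROR_SD)
            y[(j, k)].append(TAU + eta[j] + xi[k] + eps)

same_trait = covariance(y[(0, 0)], y[(0, 1)])       # items share only eta_0
same_method = covariance(y[(0, 0)], y[(1, 0)])      # items share only xi_0
nothing_shared = covariance(y[(0, 0)], y[(1, 1)])   # items share nothing
total_var = covariance(y[(0, 0)], y[(0, 0)])

print(f"same trait, different method: {same_trait:.2f} (model value {TRAIT_SD**2:.2f})")
print(f"same method, different trait: {same_method:.2f} (model value {METHOD_SD**2:.2f})")
print(f"nothing shared:               {nothing_shared:.2f} (model value 0.00)")
print(f"variance of one item:         {total_var:.2f} "
      f"(model value {TRAIT_SD**2 + METHOD_SD**2 + ERROR_SD**2:.2f})")
```

The covariance between two questions asked with the same method is positive even though the simulated traits are unrelated, which is the kind of method-induced association the model is meant to isolate.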
  • 64. Formal model and assumptions. Equation with Trait×Method interaction with Trait×Person: $Y_{ijk} = \tau_{ijk} + \lambda_{jk}\,\eta_{ij} + \xi_{ik} + \epsilon_{ijk}$, where $i$ indexes persons, $j$ indexes traits, and $k$ indexes methods.
  • 68. Formal model and assumptions. Assumptions in the model
      1 The (interaction) effects do not depend on other Method×Trait combinations a person might receive (“no carry-over effects”, “SUTVA”, “independence assumption”).
      2 There is no separate Person main effect: Trait and Method within Person already capture all within-person correlation (“method variance is the only systematic variance”, $\mathrm{COV}_U(\epsilon_{ijk}, \xi_{ik}) = 0$ and $\mathrm{COV}_U(\epsilon_{ijk}, \eta_{ij}) = 0$).
      Assumption 2 can sometimes be relaxed (Oberski et al., in Salzborn, Davidov & Reinecke (eds), 2012).
  • 72. Formal model and assumptions. The parameters of interest in the model are
      • the variance over persons in the Trait effect;
      • the variance over persons in the Method effect.
      Expressed as proportions of the total variance over persons of $Y_{jk}$, these two quantities equal, respectively,
      • the reliability $\kappa_{jk}$ of a question asking Trait $j$ with Method $k$;
      • the correlation between two different questions that is purely due to them being measured with the same method.
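To make the second quantity concrete, the sketch below uses a decomposition often applied with this kind of model; it is an assumption here, not something stated on the slides. For two questions $a$ and $b$ asked with the same method, the observed correlation is taken to be $q_a\,\rho\,q_b + r_a m_a m_b r_b$, where $\rho$ is the correlation between the two traits, $q = r \cdot v$, and $m = \sqrt{1 - v^2}$. The function names and the coefficient values are hypothetical.

```python
import math


def common_method_correlation(r2_a, v2_a, r2_b, v2_b):
    """Correlation between two same-method questions that is due to the method alone."""
    m_a = math.sqrt(1.0 - v2_a)   # method effect coefficient of question a
    m_b = math.sqrt(1.0 - v2_b)   # method effect coefficient of question b
    return math.sqrt(r2_a) * m_a * m_b * math.sqrt(r2_b)


def observed_correlation(trait_corr, r2_a, v2_a, r2_b, v2_b):
    """Observed correlation implied by a given correlation between the two traits."""
    q_a = math.sqrt(r2_a * v2_a)  # quality coefficient of question a
    q_b = math.sqrt(r2_b * v2_b)  # quality coefficient of question b
    return q_a * trait_corr * q_b + common_method_correlation(r2_a, v2_a, r2_b, v2_b)


# Invented coefficients, for illustration only.
cmv = common_method_correlation(0.80, 0.90, 0.70, 0.85)
obs = observed_correlation(0.50, 0.80, 0.90, 0.70, 0.85)
print(f"method-induced correlation: {cmv:.3f}")
print(f"observed correlation for a trait correlation of 0.50: {obs:.3f}")
```

Under this decomposition, random error attenuates the trait correlation while the shared method adds to it, so the observed correlation can end up either below or above the correlation between the traits.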
  • 73. Estimation of measurement error with the MTMM design
  • 74. Design requirements. What design is needed to estimate this model?
      Response = Trait + Method + Trait × Method + Trait × Person + Method × Person + Person × Moment
      $Y_{ijk} = \tau_{ijk} + \eta_{ij} + \xi_{ik} + \epsilon_{ijk}$, where $i$ indexes persons, $j$ indexes traits, and $k$ indexes methods.
      • The model suggests that a Person×Method×Trait factorial experiment would allow for the estimation of the reliability and method variance.
      • The residual or “measurement error” term, Person × Moment, is estimated by the Person × Trait × Method interaction.
  • 75. Design requirements. What design is needed to estimate this model?
      • A Person×Method×Trait factorial experiment would ask the same question in different ways (Methods) and use different methods to ask the same questions, within each person;
      • Campbell and Fiske introduced such designs in 1959 under the name “multitrait-multimethod” (MTMM) experiment.
      • Not all Trait-Method combinations are necessary, but at least one repetition within each person is required (Saris, Satorra & Coenders, 2004).
      • Under the model and assumptions 1 and 2, the MTMM design will provide data that allow for the estimation of the reliability and method variance (“invalidity”).
  • 76. Design requirements. Example of an MTMM experiment
      On an average weekday, how much time, in total...
          T = 1: ...do you spend watching television?
          T = 2: ...do you spend listening to the radio?
          T = 3: ...do you spend reading the newspapers?
      Scales:
          M = 1: 8pt (hours)
          M = 2: Write in hours and minutes
          M = 3: 7pts vague quantifiers
      Each respondent answered all three questions in two different ways. The repetition was given at the end of the interview (after approximately 50 minutes had passed).
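The structure of this experiment can be written out as a small enumeration. The sketch below lists the Trait-Method items each respondent would answer if every respondent group received one pair of the three methods. The assignment of method pairs to groups is hypothetical: the slide only says that each respondent answered all three questions in two different ways. The helper `items_for` is likewise an illustrative name.

```python
from itertools import combinations

TRAITS = {1: "watching television", 2: "listening to the radio", 3: "reading the newspapers"}
METHODS = {1: "8-point scale (hours)", 2: "write in hours and minutes", 3: "7-point vague quantifiers"}

# Hypothetical split: one respondent group per pair of methods.
groups = list(combinations(sorted(METHODS), 2))


def items_for(group):
    """All (trait, method) items answered by a group: every trait with both of its methods."""
    return [(trait, method) for method in group for trait in sorted(TRAITS)]


for group in groups:
    print(f"methods {group}: {items_for(group)}")

covered = {item for group in groups for item in items_for(group)}
print(f"{len(covered)} of {len(TRAITS) * len(METHODS)} Trait-Method combinations are used")
```

In this layout each respondent answers six items and each trait is repeated once within the person.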
  • 77. Estimation of the model. Estimation issues
      $Y_{ijk} = \tau_{ijk} + \lambda_{jk}\,\eta_{ij} + \xi_{ik} + \epsilon_{ijk}$.
      • The model can be estimated with regression (with Person a random factor);
          • This is not flexible enough: it gives little influence over the covariance structure, and $\lambda_{jk}$ is not possible.
      • The model can also be recognized as a factor analysis or, more generally, as a structural equation model (SEM),
          • or, through transformation, as an IRT or latent class model.
      • The SEM framework allows enough flexibility to estimate the parameters of interest: trait, method and residual variance, or $r^2$, $v^2$, and quality $q^2$.
  • 78. Estimation of the model. The model as a SEM (or IRT or latent class) model. [Path diagram: method factors M1, M2, M3; trait factors T1, T2, T3; observed variables y11, y21, y31, y12, y22, y32, y13, y23, y33.]
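Read as a factor model, the diagram implies a particular covariance matrix for the nine observed variables. The sketch below builds that matrix under simplifying assumptions that are not taken from the slides: all loadings equal 1, method factors are uncorrelated with each other and with the traits, and all items share one residual variance. The parameter values are invented, and `implied_covariance` is a hypothetical helper name.

```python
TRAITS = range(3)
METHODS = range(3)
# Item order follows the slide: y11, y21, y31, y12, y22, y32, y13, y23, y33.
ITEMS = [(j, k) for k in METHODS for j in TRAITS]


def implied_covariance(trait_cov, method_var, residual_var):
    """Model-implied covariance matrix of the nine items (unit loadings assumed)."""
    sigma = []
    for (j, k) in ITEMS:
        row = []
        for (j2, k2) in ITEMS:
            value = trait_cov[j][j2]       # trait variance or covariance between traits
            if k == k2:
                value += method_var[k]     # same method: shared Method x Person variance
            if (j, k) == (j2, k2):
                value += residual_var      # residual variance only on the diagonal
            row.append(value)
        sigma.append(row)
    return sigma


# Invented parameter values, for illustration only.
trait_cov = [[1.0, 0.3, 0.2],
             [0.3, 1.0, 0.4],
             [0.2, 0.4, 1.0]]
method_var = [0.10, 0.25, 0.05]
sigma = implied_covariance(trait_cov, method_var, residual_var=0.5)

for row in sigma:
    print(" ".join(f"{value:5.2f}" for value in row))
```

Each 3×3 diagonal block (same method) is the trait covariance matrix shifted up by that method's variance, while the off-diagonal blocks (different methods) contain the trait covariances alone. This contrast is what lets an MTMM design separate trait variance from method variance.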
  • 79. Estimation of the model. Another example
      Table 4: Experiment 2 of round 2
          Main questionnaire (“A/D”)
              Introduction: Using this card, please tell me how true each of the following statements is about your current job.
              Statements: There is a lot of variety in my work; My job is secure; My health or safety is at risk because of my work
              Answer categories: not at all true; a little true; quite true; very true
          SC group 1 (IS)
              Introduction: The next 3 questions are about your current job.
              Statements: Please choose one of the following to describe how varied your work is; Please choose one of the following to describe how secure your job is; Please choose one of the following to say how much, if at all, your work puts your health and safety at risk.
              Answer categories: not at all varied; a little varied; quite varied; very varied (same type of response scale using terms secure and safe instead of varied)
          SC group 2 (IS)
              Statements: Please indicate, on a scale of 0 to 10, how varied your work is, where 0 is not at all varied and 10 is very varied; Now please indicate, on a scale of 0 to 10, how secure your job is, where 0 is not at all secure and 10 is very secure; Please indicate, on a scale of 0 to 10, how much your health and safety is at risk from your work, where 0 is not at all at risk and 10 is very much at risk.
              Answer categories: horizontal 11-point scale, labelled only at the end points
      Table 5 (caption and column headings only on this slide): the mean reliability ($r^2$), validity ($v^2$) and quality ($q^2$) of the three questions of experiment 2 in Round 2 of the ESS across 10 countries for the different methods (standard deviations in brackets).
      Source: Revilla, Saris & Krosnick (2010), “Comparing questions with agree/disagree response options to questions with item-specific response options”, p. 69.
  • 80. Estimation of the model. Results from another example
      Table 5: The mean reliability, validity and quality of the three questions of experiment 2 in Round 2 of the ESS across 10 countries for the different methods (standard deviations in brackets)

                   Reliability r2          Validity v2             Quality q2
      Method       Q1     Q2     Q3        Q1     Q2     Q3        Q1     Q2     Q3
      A/D(4)       .65    .59    .61       .99    .98    .99       .64    .58    .60
                  (.09)  (.18)  (.15)     (.02)  (.03)  (.03)     (.10)  (.18)  (.15)
      IS(4)        .80    .80    .80       1      1      1         .80    .80    .80
                  (.14)  (.13)  (.14)     (0)    (0)    (0)       (.14)  (.13)  (.14)
      IS(11)       .81    .83    .77       .98    .98    .98       .80    .82    .76
                  (.09)  (.11)  (.12)     (.03)  (.03)  (.04)     (.10)  (.12)  (.14)

      Excerpt from the source text shown on the slide (cut off at both ends): “... using a truth scale with the same number of categories for all three questions (around .7 to .9 versus .5 to .6). The position of the IS scale in the supplementary questionnaire is not an issue as the better quality of the IS scale is also observed both when it comes first and when it comes later. Possibly the order of the observations with the different scale types has an impact on the size of the differences since we see fewer differences in this second experiment than in the first, but this may also be linked to the subject matter of the experiments or to other characteristics of the methods used (such as the number of points). More research is needed to determine this, however the important point here is that in different combinations, the superiority of the IS in terms of [...] scale with 11 categories was also better than the IS scale with 4 categories. So, not only might the kind of scale (IS versus A/D) impact the total quality of a measure, but so might the length of the scale (number of response categories). However, it seems that this effect varies across countries. Experiments in Round 3 of the ESS: In round 3 of the ESS again two SB-MTMM experiments have been done which allow the comparison of the IS scales with A/D scales. The attraction of these experiments is that ...”
  • 81. Estimation of the model: Results from another example

                  Quality q²
      Method      Q1     Q2     Q3
      A/D(4)      .64    .58    .60
                  (.10)  (.18)  (.15)
      IS(4)       .80    .80    .80
                  (.14)  (.13)  (.14)
      IS(11)      .80    .82    .76
                  (.10)  (.12)  (.14)
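The quality column of Table 5 can be checked against the reliability and validity columns. The sketch below assumes the usual MTMM definition of total quality as the product of reliability and validity, q² = r² · v²; the variable and function names are illustrative. Because the table reports means across 10 countries, the product of the mean r² and mean v² only approximates the mean q².

```python
# Mean coefficients per question (Q1, Q2, Q3), copied from Table 5.
table5 = {
    "A/D(4)": {"r2": [0.65, 0.59, 0.61], "v2": [0.99, 0.98, 0.99], "q2": [0.64, 0.58, 0.60]},
    "IS(4)":  {"r2": [0.80, 0.80, 0.80], "v2": [1.00, 1.00, 1.00], "q2": [0.80, 0.80, 0.80]},
    "IS(11)": {"r2": [0.81, 0.83, 0.77], "v2": [0.98, 0.98, 0.98], "q2": [0.80, 0.82, 0.76]},
}


def quality(r2, v2):
    """Total quality under the assumed definition q^2 = r^2 * v^2."""
    return r2 * v2


# Largest absolute difference between the computed and the reported quality.
max_gap = 0.0
for method, row in table5.items():
    for question, (r2, v2, q2) in enumerate(zip(row["r2"], row["v2"], row["q2"]), start=1):
        computed = quality(r2, v2)
        max_gap = max(max_gap, abs(computed - q2))
        print(f"{method:7s} Q{question}: r2*v2 = {computed:.3f}, reported q2 = {q2:.2f}")

print(f"largest gap: {max_gap:.4f}")
```

Validity is close to 1 for every method here, so the gap in quality between the A/D and IS questions comes almost entirely from reliability.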
  • 84. Estimation of the model: Results from another example
      • It looks like there is much more measurement error (residual variance) in the agree-disagree questions than in the item-specific scales.
      • This was true in all countries (shown is the average over countries).
      • It remains an open question whether the same would be found with other topics, under other conditions, and with other combinations of methods.
  • 85. Are some types of questions better than others?
  • 93. Description of the data
      • The examples given so far come from a much larger series of MTMM experiments.
      • In the European Social Survey (ESS), about six MTMM experiments are done in every round.
      • So far there have been five rounds (2002, 2004, 2006, 2008, and 2010).
      • The experiments are done in 20-30 European countries every two years.
      • Effective sample size per country is at least 1500.
      • Each experiment usually estimates the quality of 9 questions (method-trait combinations).
      • The range of topics is reasonably diverse, though factual questions are underrepresented.
      • In total about 5000 questions are available, but only 3000 of those will be used here, for various reasons.
  • 96. Description of the data
      • In addition to the ESS, an older series of experiments also exists (F. Andrews; Költringer; Saris; Billiet, 1990s).
      • These add another 1089 questions for which reliability and validity coefficients are estimated.
      • Combining the two datasets (ESS question qualities and old experiment qualities), we created a database of 3011 questions with their reliability and validity estimates.
  • 97. Description of the data: Reliability and validity estimates of 3011 questions
      [Figure: two histograms. Left panel: reliability coefficient (x-axis from 0.4 to 1.0). Right panel: validity coefficient (x-axis from 0.2 to 1.0). Y-axis: frequency.]
  • 98. Description of the data: Logit transform of reliability and validity estimates
      [Figure: two histograms. Left panel: reliability coefficient, logit. Right panel: validity coefficient, logit. X-axes from 0 to 6. Y-axis: frequency.]
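As a minimal sketch of the transform named on this slide, assuming the standard logit, logit(p) = log(p / (1 - p)), applied to a coefficient strictly between 0 and 1: the function names are illustrative and not taken from the slides.

```python
import math


def logit(p):
    """Map a coefficient in (0, 1) to the real line: log(p / (1 - p))."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is only defined for 0 < p < 1")
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Map a logit-scale value back to a coefficient in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Round trip for a few coefficient values.
for coefficient in (0.5, 0.8, 0.95):
    z = logit(coefficient)
    print(f"coefficient {coefficient:.2f} -> logit {z:.3f} -> back {inv_logit(z):.2f}")
```

The transform stretches the region near 1, where most coefficients lie, and any prediction made on the logit scale maps back to a value strictly between 0 and 1.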
  • 104. Description of the data: Coding design characteristics of the 3011 questions
      • For each of the 3011 questions in all countries, a team of coders coded 40 design characteristics of the question.
      • Some codes were automatically generated by Natural Language Processing software (syllables, words, etc.).
      • Coders were students, assistants to the local coordinators of the ESS, and two experts.
      • For the English source version, the experts double-coded the questions independently, then created consensus codes.
      • Non-expert codes were quality-controlled by detailed comparison with the consensus codes for the English source.
      • In a meeting between the experts and each of the other coders, the discrepancies were discussed and either corrected or left in as true differences.
  • 105. Description of the data
      absolute, avgabs intro, avgabs total, avgsy total, avgwrd intro, avgwrd total, balance, centrality, computer.assisted, concept, country, domain, dont know, encourage, fixrefpoints, form basic, future, labels, instr interv, instr respon, interviewer, intr request, intropresent, knowledge, labels gramm, labels order, language, motivation, opinionother, past, position, questiontype, scal neutral, scale basic, scale corres, scale trange, scale urange, showc boxes, showc horiz, showc letter, showc over, showc quest, showc start, socdesir, stimulus, subjectiveop, symmetry, used WH word, usedshowcard, visual, from
  • 106. Description of the data

      Domain of question        # questions
      International politics    64
      Health                    190
      Living conditions         453
      Other beliefs             292
      Work                      469
      Personal relations        320
      Consumer behavior         34
      Leisure activities        131
      National government       141
      Institutions              284
      Political parties         30
      Trade unions              12
      Economy                   237
      Other                     354
  • 107. Description of the data

      Concept of question       # questions
      Evaluative belief         713
      Feeling                   903
      Importance                96
      Expectation               39
      Facts, behavior           63
      Judgement                 123
      Relationship              8
      Evaluation                704
      Norm                      57
      Policy                    250
      Right                     4
      Action tendency           51
  • 113. Meta-analysis of the MTMM experiments: Meta-analysis dataset
      • For each of the 3011 questions, we have in the database:
          • The estimated quality (reliability and validity coefficients)
          • About 50 design characteristics (through hand- and automatic coding)
      • The next step was to relate the design characteristics to the quality estimates:
          • Can the quality estimates be predicted from the design characteristics?
  • 116. Meta-analysis of the MTMM experiments: Meta-analysis
      • Prediction by random forests of regression trees (Breiman 2001).
      • Two separate models: one for the validity coefficients and one for the reliability coefficients.
      • Missing data are multiply imputed using the MICE algorithm (van Buuren & Groothuis-Oudshoorn 2011).
  • 117. Meta-analysis of the MTMM experiments: Example regression tree for logit(reliability coefficient)
      [Figure: example regression tree for the reliability coefficient. Root node: mean 1.955, n=1988. Splits are made on domain, gradation, position, concept, and ncategories. Node means range from 0.4959 (n=36) to 3.364 (n=54).]
  • 120. Meta-analysis of the MTMM experiments: Meta-analysis with random forests
      • R² based on the out-of-bag (cross-validation) mean squared error is 85% for the validity coefficient and 60% for the reliability coefficient.
      • Importance measures indicate that domain, number of categories, concept, position in the questionnaire, number of syllables, country, number of words, fixed reference points, and other linguistic complexity measures are the most influential for reliability.
      • For validity, in addition to the above, the order of the labels (positive-negative), the centrality of the trait, and other characteristics are also important.
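To make the out-of-bag R² concrete, here is a small sketch rather than the analysis behind the slides. It swaps the random forest for bagged one-split regression trees (stumps) on a single made-up numeric predictor, with no random feature selection, and computes R² as 1 minus the out-of-bag mean squared error divided by the outcome variance. The data, function names, and settings are all hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree; return (threshold, left_mean, right_mean)."""
    uniq = sorted(set(xs))
    best = None
    for a, b in zip(uniq, uniq[1:]):
        t = (a + b) / 2  # candidate split halfway between neighbouring values
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # only one distinct x value: always predict the mean
        m = sum(ys) / len(ys)
        return (float("inf"), m, m)
    return best[1:]


def predict(stump, x):
    t, left_mean, right_mean = stump
    return left_mean if x < t else right_mean


def oob_r2(xs, ys, n_trees=200, seed=0):
    """R^2 = 1 - (out-of-bag MSE) / (variance of y) for bagged stumps."""
    rng = random.Random(seed)
    n = len(xs)
    sums, counts = [0.0] * n, [0] * n
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        stump = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # predict only cases this tree did not see
                sums[i] += predict(stump, xs[i])
                counts[i] += 1
    scored = [i for i in range(n) if counts[i] > 0]
    mse = sum((ys[i] - sums[i] / counts[i]) ** 2 for i in scored) / len(scored)
    y_bar = sum(ys[i] for i in scored) / len(scored)
    var = sum((ys[i] - y_bar) ** 2 for i in scored) / len(scored)
    return 1.0 - mse / var


# Hypothetical data: a logit-scale outcome that jumps at x = 20, plus small noise.
xs = list(range(40))
ys = [(1.0 if x < 20 else 3.0) + 0.1 * ((7 * x) % 5 - 2) for x in xs]
r2 = oob_r2(xs, ys)
print(f"out-of-bag R^2: {r2:.2f}")
```

Each case is predicted only by trees whose bootstrap sample did not contain it, so the resulting R² behaves like a cross-validated measure of fit and not an in-sample one.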