ITC Measurement equivalence

Measurement equivalence of the Business-focused Inventory of Personality:
A comparison of European language versions

Tao Li
Hogrefe Ltd. UK

The 7th Conference of the International Test Commission
Hong Kong July 2010

To set the scene

• An international organisation wishes to use a
personality test to select managers globally for
expatriate assignments

– Does the test measure the same traits for the
candidates?

– Are the scores comparable across countries?

Measurement equivalence
• The relative comparability of the wording, scaling,
and scoring of constructs across groups

– A prerequisite for valid group comparison

– Implicitly assumed but RARELY examined

Levels of Measurement Equivalence
• Structural
– The constructs have the same basic factor structure across groups
– The constructs have similar meaning

• Metric
– The strength of the relationships between items and the constructs being measured is equivalent
– The constructs have the same meaning

• Scalar
– The constructs are measured on the same scale
– The groups use the response scale in a similar way
– Complete comparability of scores

Partial Invariance

• Full invariance: ideal but often impossible

• Partial invariance: some, but not all, of the items are
equivalent across groups

Business-focused Inventory of Personality (BIP)

• A work-based personality test developed in Germany
and adapted into all major European languages
– R. Hossiep & M. Paschen, 1998, 2003 © by Hogrefe

• A combined etic-emic approach to adaptation
– Etic: e.g. English, Portuguese, Dutch, Danish
– Emic: e.g. French, Spanish

The analyses

• Testing measurement equivalence
– Multi-group exploratory structural equation modelling (ESEM)

• Identifying differential item functioning (DIF)
– Multiple indicators-multiple causes approach (MIMIC)

• Testing measurement equivalence involving emic items
– Missing data technique

Testing measurement equivalence

• Data: German, English, Danish and Portuguese versions
– Equal structure: structural equivalence
– Equal loading: metric equivalence
– Equal intercept: scalar equivalence
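The three nested constraints can be written out for a multi-group factor model. This is a generic sketch in standard notation, not taken from the slides: the symbols (x for item responses, τ for intercepts, Λ for loadings, η for the latent traits, κ for latent means, g for the language group) are assumptions introduced here for illustration.

```latex
% Measurement model for person i in language group g
x_{ig} = \tau_g + \Lambda_g \, \eta_{ig} + \varepsilon_{ig}

% Structural equivalence: same number of factors and same loading pattern in every group
% Metric equivalence:     \Lambda_g = \Lambda               for all g
% Scalar equivalence:     \Lambda_g = \Lambda, \ \tau_g = \tau   for all g

% Under scalar equivalence the expected item scores differ across groups
% only through the latent means \kappa_g:
E[x_{ig}] = \tau + \Lambda \, \kappa_g
```

Each level adds constraints to the previous one, which is why scalar equivalence is needed before observed score differences can be read as trait differences.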

Results
• Fit indices:
– Comparative Fit Index (CFI): >0.95 good fit
– The root mean square error of approximation (RMSEA): <0.08 good fit

• Structural equivalence: all scales
– min CFI = 0.956; max RMSEA = 0.070
• Metric equivalence: 11/14 scales
• Scalar equivalence: none
• Partial equivalence: all scales
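The fit indices above can be computed from model chi-square statistics. The sketch below uses the common textbook single-group formulas and the cut-offs quoted on this slide; the function names and the input numbers are hypothetical, and SEM software may apply variants (for example N instead of N - 1, or a multi-group correction for RMSEA).

```python
import math

def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative Fit Index from model and baseline (null) chi-square values."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_baseline - df_baseline, d_model, 0.0)
    if d_base == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - d_model / d_base

def rmsea(chi2_model, df_model, n):
    """Root mean square error of approximation (single-group form, N - 1 version)."""
    return math.sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))

def good_fit(cfi_value, rmsea_value):
    """Apply the cut-offs quoted on the slide: CFI > 0.95 and RMSEA < 0.08."""
    return cfi_value > 0.95 and rmsea_value < 0.08

# Hypothetical chi-square values, for illustration only (not BIP results).
example_cfi = cfi(85.0, 50, 1500.0, 66)
example_rmsea = rmsea(85.0, 50, 500)
print(round(example_cfi, 3), round(example_rmsea, 3), good_fit(example_cfi, example_rmsea))
```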

Fully invariant items

Scale                     Number of fully invariant items
Achievement Motivation    5
Power Motivation          3
Leadership Motivation     3
Conscientiousness         3
Flexibility               3
Action Orientation        4
Social Sensitivity        4
Openness to Contact       4
Sociability               3
Team Orientation          2
Assertiveness             3
Emotional Stability       2
Working under Pressure    3
Self-confidence           4

Differential item functioning
• People from different groups with the same
underlying ability/trait level have a different
probability of endorsing an item

• MIMIC approach to DIF detection
– Modelling DIF and latent mean difference
simultaneously
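The definition of DIF above can be illustrated with a toy item response function. This is a minimal sketch assuming a simple logistic (two-parameter) item and a uniform shift in the item threshold for one group; the function name, parameter values and the logistic form are illustrative assumptions, not the model used in the BIP analyses.

```python
import math

def endorse_prob(theta, discrimination, threshold, dif_shift=0.0):
    """Probability of endorsing an item at trait level theta.

    dif_shift moves the item threshold for one group only; a non-zero
    value means the item functions differently for that group.
    """
    z = discrimination * (theta - threshold - dif_shift)
    return 1.0 / (1.0 + math.exp(-z))

theta = 0.0  # two respondents with the SAME underlying trait level

# Hypothetical item parameters, for illustration only.
p_reference = endorse_prob(theta, discrimination=1.2, threshold=0.0)
p_focal = endorse_prob(theta, discrimination=1.2, threshold=0.0, dif_shift=0.5)

# With DIF, equal trait levels lead to unequal endorsement probabilities.
print(p_reference, p_focal)
```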

MIMIC approach to DIF detection

[Path diagram: a Country covariate, the latent Construct, and its five indicator Items]
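The MIMIC model in the diagram can be written as two equations. This is a generic sketch of the standard MIMIC formulation for uniform DIF; the symbols (z for the country dummy, γ for the latent mean difference, β for the direct effect of country on an item) are assumptions introduced here, not notation from the slides.

```latex
% Structural part: country z_i (0/1) shifts the latent construct
\eta_i = \gamma \, z_i + \zeta_i

% Measurement part: item j depends on the construct and, if DIF is present,
% directly on country
x_{ij} = \tau_j + \lambda_j \, \eta_i + \beta_j \, z_i + \varepsilon_{ij}

% \gamma        : latent mean difference between the two countries
% \beta_j \neq 0 : item j shows (uniform) DIF
% \beta_j = 0    : item j is invariant and anchors the comparison
```

Because γ and the β terms are estimated in the same model, the latent mean difference is adjusted for the items flagged as DIF rather than being distorted by them.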

Example
• BIP Openness to Contact scale
• Portuguese vs. German.

                          Assume no DIF    DIF effect modelled
Latent mean difference    0.20             0.52
Effect size               Small            Medium
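The effect size labels in this example are consistent with Cohen's conventional benchmarks for standardised mean differences (0.2 small, 0.5 medium, 0.8 large). The sketch below assumes that these are the cut-offs behind the labels and that the latent mean differences are in standardised units; the function name is hypothetical.

```python
def cohen_label(d):
    """Label a standardised mean difference using Cohen's conventional benchmarks."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"

# The two latent mean differences reported in the example.
for diff in (0.20, 0.52):
    print(diff, cohen_label(diff))
```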

Measurement equivalence involving emic items

• Etic vs. Emic
– using the “same” items vs. using culturally specific items

• How to compare combined etic-emic instruments?

Missing data technique
• Introducing “imaginary” observed items
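[Diagram: in each country the Construct is measured by three common items, the country's own emic item, and an “imaginary” item standing in for the other country's emic item]

                Country A    Country B
Common items    observed     observed
Emic item A     observed     imaginary
Emic item B     imaginary    observed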
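The data layout behind this technique can be sketched as follows: the two country samples are stacked into one data set with the union of all items, and each country's responses to the other country's emic item are left missing. The item names and responses are hypothetical; how the model is then estimated is not stated on the slide (an SEM program that handles missing data, for example by full-information maximum likelihood, is one option and is an assumption here).

```python
def stack_with_imaginary_items(groups):
    """Stack per-country response records over the union of all items.

    groups maps a country label to a list of {item: response} dicts.
    Items a country never administered become None ("imaginary" items).
    """
    all_items = []
    for rows in groups.values():
        for row in rows:
            for item in row:
                if item not in all_items:
                    all_items.append(item)

    stacked = []
    for label, rows in groups.items():
        for row in rows:
            record = {"country": label}
            for item in all_items:
                record[item] = row.get(item)  # None = missing by design
            stacked.append(record)
    return all_items, stacked

# Hypothetical responses, for illustration only.
country_a = [{"c1": 4, "c2": 3, "c3": 5, "emic_a": 2},
             {"c1": 2, "c2": 2, "c3": 3, "emic_a": 4}]
country_b = [{"c1": 3, "c2": 4, "c3": 4, "emic_b": 5},
             {"c1": 1, "c2": 3, "c3": 2, "emic_b": 1}]

items, data = stack_with_imaginary_items({"A": country_a, "B": country_b})
for record in data:
    print(record)
```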

Example
• BIP Flexibility scale: Spanish vs. German
– 12 common items
– 2 items unique to German version
– 1 item unique to Spanish version

Model fit: CFI = 0.978; RMSEA = 0.046
Latent mean difference: 0.10

Summary

• Measurement equivalence
– All BIP scales demonstrated structural invariance
– Most scales showed metric invariance
– No scales presented scalar invariance
– Fully invariant items were identified for each scale

Implications

• Common items make it possible to equate
scores across versions in the presence of DIF

• Comparing instruments involving emic items is
possible and necessary

Thank you

tao.li@hogrefe.co.uk
