Manipulating and measuring model interpretability
1. Manipulating and Measuring Model Interpretability
Microsoft Research NYC
Forough Poursabzi-Sangdeh
Dan Goldstein
Jake Hofman
Jenn Wortman Vaughan
Hanna Wallach
6. DIFFERENT SCENARIOS, DIFFERENT PEOPLE, DIFFERENT NEEDS
Needs:
• Explain a prediction
• Understand model
• Make better decisions
• Debug model
• De-bias model
• Inspire trust
People:
• CEOs
• Data scientists
• Laypeople
• Regulators
Approaches:
• Approach A
• Approach B
• Approach C
12. Interpretability
INTERPRETABILITY AS A LATENT PROPERTY
We need interdisciplinary approaches
Properties of model and system design:
• number of features
• linearity
• black-box vs. clear
• visualizations
• types of features
• …
Properties of human behavior:
• trust
• ability to debug
• ability to simulate
• ability to explain
• ability to detect mistakes
• …
13. Interpretability
FOCUS ON LAYPEOPLE
Randomized human-subject experiments
Properties of model and system design:
• number of features
• linearity
• black-box vs. clear
• visualizations
• types of features
• …
Properties of human behavior:
• trust
• ability to debug
• ability to simulate
• ability to explain
• ability to detect mistakes
• …
14. USER EXPERIMENT, PREDICTIVE TASK
• Predict the price of apartments in NYC with the help of a model
23. USER INTERFACE AND INTERACTIONS
• Training phase: participants get familiar with the model
• Testing phase step 1: simulate the model’s prediction
Simulate the model
24. USER INTERFACE AND INTERACTIONS
• Testing phase step 2: observe the model’s prediction and guess the price
Predict actual selling price
25. PRE-REGISTERED HYPOTHESES
• CLEAR-2 feature will be easiest for participants to simulate
• Participants will trust CLEAR-2 feature more than BB-8 feature
• Participants’ behaviors will vary when they see unusual examples where the model makes inaccurate predictions
https://aspredicted.org/xy5s6.pdf
29. SIMULATION ERROR
CLEAR-2 feature will be easiest for participants to simulate
[Figure: mean simulation error by condition (CLEAR-2, CLEAR-8, BB-2, BB-8); y-axis from $0k to $200k; annotated with m and u_m]
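The per-condition comparison plotted on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code. It assumes that simulation error is the absolute difference between the model's prediction (m) and the participant's guess of that prediction (u_m). The trial records and the function name `mean_simulation_error` are made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trial records: (condition, model_prediction, participant_simulation).
# The dollar values are invented for illustration only.
TRIALS = [
    ("CLEAR-2", 1_000_000, 950_000),
    ("CLEAR-2", 800_000, 820_000),
    ("BB-8", 1_000_000, 700_000),
    ("BB-8", 800_000, 1_000_000),
]


def mean_simulation_error(trials):
    """Return {condition: mean |m - u_m|}, assuming absolute error as the metric."""
    errors = defaultdict(list)
    for condition, m, u_m in trials:
        errors[condition].append(abs(m - u_m))
    return {condition: mean(values) for condition, values in errors.items()}


if __name__ == "__main__":
    for condition, err in mean_simulation_error(TRIALS).items():
        print(f"{condition}: ${err:,.0f}")
```

A lower value means participants in that condition were better at anticipating what the model would predict.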
33. TRUST (DEVIATION FROM THE MODEL)
Participants will trust CLEAR-2 feature more than BB-8 feature
[Figure: mean deviation from the model by condition (CLEAR-2, CLEAR-8, BB-2, BB-8); y-axis from $0k to $150k; annotated with m and u_a]
38. DETECTION OF MISTAKES
Participants’ behaviors will vary when they see unusual examples where the model makes inaccurate predictions
[Figure: mean deviation from the model for apartment 12 (1 bed, 3 bath) by condition (CLEAR-2, CLEAR-8, BB-2, BB-8); y-axis from $0k to $300k; annotated with m and u_a]
When participants see unusual examples, they are less likely to correct inaccurate predictions made by clear models than black-box models
45. USER INTERFACE AND INTERACTIONS
• We remove potential anchors
46. PRE-REGISTERED HYPOTHESES
• Explicit attention checks on unusual inputs will affect participants’ ability to detect the model’s mistakes
• Model transparency affects participants’ ability to detect the model’s mistakes, both with and without attention checks
https://aspredicted.org/5xy8y.pdf
55. DETECTION OF MISTAKES
• Attention checks improve users’ ability to correct model’s mistakes
• No attention checks: clear models lower users’ ability to correct model’s mistakes
• With attention checks, there is no difference between clear and black-box
[Figure: mean participant prediction compared with the model's prediction for CLEAR and BB models, with and without attention check; panels: Apartment 6 (1 bed, 3 bath, 726 sq ft) and Apartment 8 (1 bed, 3 bath, 350 sq ft); y-axis from $0M to $1.5M]
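The 2×2 comparison in the figure (model type by attention check) can be tabulated as in the sketch below. Everything here is an assumption made for illustration: the record layout, the function name `cell_means`, the constant `MODEL_PREDICTION`, and all dollar values are invented and are not data from the study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical responses for one unusual apartment:
# (model_type, attention_check_shown, participant_prediction_in_dollars).
RESPONSES = [
    ("CLEAR", False, 1_200_000),
    ("CLEAR", False, 1_100_000),
    ("CLEAR", True, 700_000),
    ("BB", False, 900_000),
    ("BB", True, 700_000),
    ("BB", True, 600_000),
]
MODEL_PREDICTION = 1_300_000  # invented value for the model's inaccurate prediction


def cell_means(responses):
    """Mean participant prediction for each (model_type, attention_check) cell."""
    cells = defaultdict(list)
    for model_type, attention_check, prediction in responses:
        cells[(model_type, attention_check)].append(prediction)
    return {cell: mean(values) for cell, values in cells.items()}


if __name__ == "__main__":
    for (model_type, check), avg in sorted(cell_means(RESPONSES).items()):
        label = "with check" if check else "no check"
        gap = MODEL_PREDICTION - avg
        print(f"{model_type:5s} {label:10s} mean=${avg:,.0f} gap to model=${gap:,.0f}")
```

A small gap to the model's prediction in a cell suggests participants in that cell mostly followed the model rather than correcting it.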
56. SUMMARY OF RESULTS
• A clear model with a small number of features is easier for participants to simulate
- People have a better understanding of simple and transparent models
• No significant difference in participants’ trust in the model
- Contrary to intuition, people do not necessarily trust simple and transparent models more
• Participants were less able to correct inaccurate predictions of a clear model than a black-box model
- Too much transparency can be harmful
- Design implications (e.g., highlighting unusual inputs, displaying model internals on demand)
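One of the design implications listed above, highlighting unusual inputs, could look something like the following. This is only a sketch under stated assumptions: the rule (more bathrooms than bedrooms) and the names `Apartment` and `unusual_input_warnings` are hypothetical, chosen because the unusual apartments in the experiment had 1 bedroom and 3 bathrooms.

```python
from dataclasses import dataclass


@dataclass
class Apartment:
    bedrooms: int
    bathrooms: int
    square_feet: int


def unusual_input_warnings(apartment: Apartment) -> list[str]:
    """Return human-readable warnings for feature combinations worth a second look.

    The single rule below is a hypothetical heuristic, not the study's method.
    """
    warnings = []
    if apartment.bathrooms > apartment.bedrooms:
        warnings.append(
            f"{apartment.bathrooms} bathrooms for {apartment.bedrooms} bedroom(s) is unusual"
        )
    return warnings


if __name__ == "__main__":
    print(unusual_input_warnings(Apartment(bedrooms=1, bathrooms=3, square_feet=350)))
    print(unusual_input_warnings(Apartment(bedrooms=2, bathrooms=1, square_feet=900)))
```

An interface could show these warnings next to the model's prediction, which plays a role similar to the attention checks used in the second experiment.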
57. TAKEAWAYS
• Interpretability is not a purely computational problem
- We need interdisciplinary research to understand interpretability
• Our surprising results underscore that interpretability research is much more complicated
- We need more empirical studies
- Other scenarios, domains, models, factors, outcomes
58. Thanks!
https://csel.cs.colorado.edu/~fopo5620/
forough.poursabzi@microsoft.com