Usability Testing for Survey Research: How To and Best Practices

Nov 9, 2016
QDET2 short course | Miami, FL
Usability Testing for Survey Research:
How To and Best Practices
Jen Romano-Bergstrom
Sr. UX Researcher
Facebook
jenrb@fb.com
@romanocog
Emily Geisen
Usability/Cog Lab Manager
Survey Methodologist
RTI
egeisen@rti.org
#QDET2

Agenda
Usability Testing and Survey Research (2:00 – 3:45)
• What is usability and usability testing?
• Why do we need it in survey research?
• What to test and when
How to Conduct Usability Testing (4:00 – 5:30)
• Planning
• Conducting sessions
• Analyzing results

Activity
• How long did it take you to get here?
• What is today’s date?

Why is Design Important?
Image source: Geisen & Romano Bergstrom, 2017

Is this position 0 or missing?
Why is Design Important?


Usability defined
“The extent to which a product can
be used by specified users to achieve
specified goals with effectiveness,
efficiency, and satisfaction in a
specified context of use.”
- ISO 9241-11


What does usability mean for your survey?
• Product
• Users
• Goals
• Context
• Effectiveness, Efficiency, Satisfaction

What does usability mean for surveys?
Product: Web sites, web surveys, paper surveys, apps
Users: Our respondents (mostly), interviewers
Goals: Respondents must be able to provide their correct and accurate opinions, stories, facts, predictions
Context of use: In their homes, offices, out and about
Effectiveness: Completing the questions and survey with accurate answers
Efficiency: Completing the questions and survey quickly, with as few steps/clicks as possible
Satisfaction: Having a pleasant experience

Effectiveness: achieve a goal
Efficiency: appropriate time/effort
Efficient + effective = useful

Satisfaction is more complex.
• Did it allow them to provide their accurate answers?
• Did they enjoy the experience?
• Did it require too much time to complete?
• Did they find the instrument easy to use?
• Did they find it easy to learn how to use?

Other factors may be important.
• Easy to remember how to use (memorability)
• Error frequency and severity
• Accessibility
• And most crucial for surveys
• Data quality
• Respondent burden

Usability testing is watching a user try to
achieve the goal
• Participants represent real users
• Participants do real tasks
• You observe and record what participants do
• You think about what you saw:
• Analyze data,
• Diagnose problems,
• Recommend changes.
• Make changes and test again


Surveys are not web sites.
…so why is usability testing
needed for survey research?

Usability Testing Goals
Surveys
• Improve data quality
• Reduce respondent burden
Websites
• Increase traffic/revenue (e.g., sell more product)
• Disseminate information

Users are not trained interviewers
• Web surveys go to the general (or specific) public
• Varying levels of computer expertise
• Varying levels of literacy
• Likely to be in a hurry, interrupted, distracted
• There is no interviewer
• No one to interpret the questions
• No one to navigate around the instrument

Presser et al. (2004): pretesting focuses on a
“broader concern for improving data quality so that
measurements meet a survey’s objective”

Cognitive Testing vs Usability
• Cognitive Testing: Do people understand it?
• What are your feelings towards Obamacare? vs.
• What are your feelings towards the Affordable Care Act?
• Usability Testing: Can people use it?

Usability Model for Surveys
1. Interpreting the design:
a. What meaning do respondents assign to visual design and layout?
b. How do respondents believe the survey works?
2. Completing actions and navigating:
a. How well does the survey support respondents’ ability to
complete tasks and goals?
b. How well do respondents follow navigational cues and
instructions?
3. Processing feedback:
a. How do respondents interpret and react to the survey feedback in
response to their actions?
b. How well does the survey help respondents identify, interpret, and
resolve errors?
Source: Geisen & Romano Bergstrom, 2017

Example Usability Study
Romano & Chen, 2011

Navigation Usability Study
Method
• Lab-based usability study
• TA read introduction and left letter on desk
• Separate rooms
• R read letter and logged in to survey
• Think Aloud
• Eye Tracking
• Satisfaction Questionnaire
• Debriefing
* p < 0.0001
Romano & Chen, 2011

Eye Tracking
Romano & Chen, 2011
• Participants looked at Previous and Next in PN conditions
• Many participants looked at Previous in the N_P conditions
• Couper et al. (2011): Previous gets used more when it is on the right.

Debriefing Interview
• N_P version
• Counterintuitive
• Don’t like the “buttons being flipped.”
• Next on the left is “really irritating.”
• Order is “opposite of what most people would design.”
• PN version
• “Pretty standard, like what you typically see.”
• The location is “logical.”
Romano & Chen, 2011

Don’t forget about mobile
• “If you’re doing a web survey, you’re doing a mobile
survey.” - Michael Link, 2013 AAPOR
• The share of respondents on mobile devices is as high as
30% or more for some surveys (Lugtig, Toepoel & Amin,
2016; Saunders, 2015).

Romano Bergstrom, QDET2, 2016
Mobile Usability Study
V1: long list of items: grid on desktop; drop down to select
response on mobile

Mobile Usability Study
V1: long list of items; drop down to select response
V2: each question on
separate screens
Romano Bergstrom, QDET2, 2016

Usability Testing Demo


What can be tested?

Exploratory Testing Example
• Background
• GSS collects data about graduate students and postdocs in
different fields of study
• NSF wanted to modify GSS to capture data at a more
consistent and detailed level (e.g., programs instead of
departments)
• The design approach
• Redesigned parts of survey using this model
• Conducted usability testing

Created a Hierarchy to Collect Data
• School/College: The Graduate school
• Department: Biological Sciences
• Program: Cell Biology
• Provide counts of graduate students in cell biology by
race/ethnicity, sex, etc
• Program: Botany
• Provide counts of graduate students in botany by race/ethnicity,
sex, etc
• Department: Physics
• Program: Atmospheric physics
• Provide counts of graduate students in atmospheric physics by
race/ethnicity, sex, etc

Survey Framework Did Not Fit Users
• No common terminology
• What is a department vs program?
• Used other terminology altogether:
division, concentration, track, field, subject
• No common hierarchy or structure
• Departments within programs and vice versa
• No departments, just programs
• Some departments had programs, some didn’t
• Information not available at level desired

Assessment & Verification:
• Low-fidelity prototypes
• Paper or computer mockups
• Wireframe
• High-fidelity prototypes
• Early interactive prototype / selected
interactive questions
• Finished product

Low-Fidelity Testing
• Methods: simple drawings or illustrations,
Word/Excel/Visio, simple screen shots, website shell
• Uses: new questions or surveys, redesigns, evaluating
information architecture, visual aspect of survey, web-
centric features
• Benefits: allows for quick feedback without spending
too much time or money on programming, can apply
results to other aspects of survey
• Don’t forget mobile testing (if using) at this stage!

What can I test: Paper Mock-Ups

What can I test: Mobile Paper Prototype
Image source: Craig, 2016; Geisen & Romano Bergstrom, 2017

What can I test: Computer mockup

What can I test: Low-Fidelity Wireframe
Romano Bergstrom et al., 2011

What can I test: High-Fidelity Wireframe
Romano Bergstrom et al., 2011

Design-based and iterative
Romano Bergstrom & Strohl, 2014

When to test
• Start as early as possible
• Testing should be integrated into the programming
schedule, not conducted after
• Test in stages as web survey is being developed
• Test until all serious problems are resolved or you stop
learning anything new (ideally)
• Iterative testing benefits from more rounds, fewer
people

Reasons for more rounds, fewer people
• Identify more issues: 2 rounds of 5 users will likely
identify more issues than 1 round of 10 users
• Diminishing returns from more users in each round
• Can be hard for users to see past the big glaring problems
to other more subtle problems
• Allows you to test solutions
• Good balance between testing resources and revision
resources
• Quicker to summarize results and revise testing

Smaller rounds support collaboration
• Include stakeholders in testing
• Have programmers, clients, decision-makers observe
testing live or remotely
• Direct observation is more exciting than reading a report
• Collaborative process
• Conduct tests in the morning, meet to discuss over a long
lunch, recommendations for changes ready in the
afternoon
• Report can then summarize findings
and changes instead of findings and
recommendations

Iterative Testing: Example
One box, prompt inside box
One box, prompt below box: resulted in
more complete names
Separate boxes, prompt below: even
more complete names
Geisen, Olmsted, Goerman & Lakhe, 2014


Start with web & survey best practices
• User-centered evaluation includes best
practices/findings from literature
• Abundance of literature
• Designing Effective Web Surveys (Couper)
• Internet, Mail, and Mixed-Mode Surveys (Dillman et al.)
• Jakob Nielsen

Build off the literature before doing
usability testing
• Usability testing will show you how well or how easily
people can do method A
• Will not necessarily show you that method A is
definitively better than method B
• Not a replacement for large, probability-based
methodological experiments
• Don’t reinvent the wheel

Building off the literature: Example
• Concern: Want to know best method for providing
definitions in web surveys
• Ask: Has this been done before?
• Start with the literature:
• Conrad, Couper, Tourangeau, Peytchev (2006)
• Peytchev, Conrad, Couper, Tourangeau (2007)
• Peytchev, Conrad, Couper, Tourangeau (2010)

Building off the literature: Example (cont)
• Methods
• Experiment 1: one-click, two-clicks, click and scroll
• Experiment 2: roll-over, one-click, two-clicks
• Experiment 3: roll-over vs. always included
• Conclusions:
• Reading definitions probably improves accuracy
• Less effort required, more likely to read definitions


Start with the literature, but decide
what’s relevant for your study
• The literature may not focus on the study population
needed for your survey
• May not be any literature on the particular topic or
issue your survey has
• And sometimes the experts just don’t agree,
then what?


Obstacles to Testing
• “There is no time.”
• Start early in development process.
• One morning a month with 3 users (Krug)
• 12 people in 3 days (Anderson Riemer)
• 12 people in 2 days (Lebson & Romano Bergstrom)
• “I can’t find representative users.”
• Everyone is important, and something is better than
nothing.
• Remote testing or travel
• “We don’t have a lab.”
• You can test anywhere.

Planning
• Participant Selection and Recruitment
• Testing Location and Equipment
• Identifying Testing Focus/Concerns
• Identifying Measures to Collect
• Decide on Testing Roles
• Preparing Test Materials

Participants: determining target audience
• Recruit people who are like your target users
• Who is the survey for?
• Consider participants’ jobs and other roles
• Recruit diverse participants
• Age
• Income
• Education
• Is location important?

Participants: How many?
Adapted from: Nielsen and Landauer (1993)
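The relationship between the number of participants and the share of problems found is often summarized with the Nielsen and Landauer (1993) model, in which n participants are expected to expose 1 − (1 − L)^n of the problems. The sketch below is an illustration only and is not taken from the slides: the function names are hypothetical, and the default per-participant detection rate L = 0.31 is the cross-project average Nielsen reports, assumed here rather than measured for any survey.

```python
import math


def share_of_problems_found(n_users: int, detect_rate: float = 0.31) -> float:
    """Expected share of problems exposed at least once by n_users.

    Model: 1 - (1 - L)^n. The default L = 0.31 is an assumed average;
    your own surveys may have a very different rate.
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 < detect_rate < 1.0:
        raise ValueError("detect_rate must be strictly between 0 and 1")
    return 1.0 - (1.0 - detect_rate) ** n_users


def users_needed(target_share: float, detect_rate: float = 0.31) -> int:
    """Smallest number of participants expected to reach target_share."""
    if not 0.0 <= target_share < 1.0:
        raise ValueError("target_share must be in [0, 1)")
    if not 0.0 < detect_rate < 1.0:
        raise ValueError("detect_rate must be strictly between 0 and 1")
    # Solve 1 - (1 - L)^n >= target for n, then round up.
    return math.ceil(math.log(1.0 - target_share) / math.log(1.0 - detect_rate))


if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        print(f"{n:>2} participants: {share_of_problems_found(n):.0%} expected")
    print("Participants for an 80% target:", users_needed(0.80))
```

Under these assumptions the curve flattens quickly, which is one reason small rounds are usually preferred over a single large one.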

Participant recruitment
• Existing Participant Lists (Existing Frame)
• Target respondents; small percentage of successful recruits
• No Participant Lists (Constructed Frame)
• Research firm with database
• Reliable; may be professional participants
• Hang fliers nearby
• Target locals; bit of work walking around
• Online social media ads
• Target specific criteria; social media users
• Classifieds
• Lots of responses quickly, non-Internet users; may be professional
• Snowball (word-of-mouth)
• Good for specific populations; They may know each other

Participant recruiting tips
• Recruit “floaters” (for no-shows and cancellations)
• Talk to your participants early
• Ask about specific behaviors relevant to your study (e.g.
mobile usage, time spent online)
• Talk about what they’ll do and build rapport
• Get an email address and a contact number
• Schedule sessions ASAP (e.g., 3 weeks ahead)
• Remind them the day before

Location: Lab, Remote, In the Field
Laboratory
• Controlled environment
• All participants have the same experience
• Record and communicate from control room
• Observers watch from control room and provide additional probes (via moderator) in real time
• Incorporate physiological measures (e.g., eye tracking, EDA)
• No travel costs
Remote
• Participants in their natural environments (e.g., home, work)
• Use video chat (moderated sessions) or online programs (unmoderated)
• Conduct many sessions quickly
• Recruit participants in many locations (e.g., states, countries)
In the Field
• Participants tend to be more comfortable in their natural environments
• Recruit hard-to-reach populations (e.g., children, doctors)
• Moderator travels to various locations
• Bring equipment (e.g., eye tracker)
• Natural observations

Lab-Based Usability Testing

Remote moderated Usability Testing

Remote unmoderated testing example
Great for quickly assessing different designs: gather large amounts of data quickly
[Figure: first-click heat maps. Percentage of participants who made the first click to these areas of interest (Only Me, Friends, Confirm, Change, progress bar, back on phone, outside area), for V1 (N=2000), V2 (N=1898), and Pre-UX (N=864).]
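A first-click measure like the one in this example can be tabulated in a few lines. The sketch below is a minimal illustration with made-up data; the function name and the sample clicks are hypothetical, and only the area-of-interest labels echo the slide.

```python
from collections import Counter


def first_click_shares(first_clicks: list[str]) -> dict[str, float]:
    """Share of participants whose first click landed in each area of interest.

    first_clicks holds one area label per participant. Areas are returned
    from most to least clicked.
    """
    if not first_clicks:
        raise ValueError("need at least one recorded first click")
    counts = Counter(first_clicks)
    total = len(first_clicks)
    return {area: count / total for area, count in counts.most_common()}


if __name__ == "__main__":
    # Hypothetical first clicks from six participants (not real study data).
    clicks = ["Only Me", "Friends", "Only Me", "Confirm", "Only Me", "outside area"]
    for area, share in first_click_shares(clicks).items():
        print(f"{area}: {share:.2f}")
```

Running the same tabulation per design version gives the kind of side-by-side comparison the figure describes.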

In-the-field usability testing

Testing location: summary
Your location / Facility
Pros
• Use your equipment (e.g., one-way mirror, recording software)
• Controlled setting with no/few interruptions
Cons
• Not true to real life
• More burden to user: more no shows and cancelations
• Using your computer/device, not the user’s computer
User’s location
Pros
• Simulates real user experience
• Users have access to info
• Easier to schedule/accommodate users
Cons
• Need portable equipment or do without
• Interviewer travel: increased cost to researcher
• Safety matters
• Harder to schedule observers

Equipment: video/audio recording
• Helps with note-taking
• Reduces need to take notes or have a note-taker during
interview
• Can take more nuanced notes afterwards if you can start
and stop the video
• Helps after the interview
• Useful during debriefing to replay parts of video
• Accommodates observers who could not make it to the
actual session

Equipment: Screen-sharing for observers
• Fosters collaboration
• Can accommodate observers from any location
• Facilitate discussions in conference setting
• Improved schedule
• Stakeholders get information immediately
• No waiting for recorded videos or report
• Cheaper
• Inexpensive compared to travel costs
• However, watching from their desks leads to less
engagement than in-person – the best is always
having observers in-person, in the observation room

Identify Testing Focus and Concerns
• General focus: Can they complete the survey?
• More specific concern:
We are worried about the definitions.
• More specific:
Are hover-overs an effective way of providing definitions?
• More specific: Do participants know definitions are available?
• More specific: Do participants understand what’s hover-overable?
• More specific: How helpful/unhelpful are the definitions?

More examples of specific concerns
• How well do people understand the instructions?
• Do people read the entire question and response options before responding? If not,
what do they read?
• Can people use the Next and Previous navigation buttons correctly?
• Do people know what to do on each screen?
• How easily do people find the information they need to answer the questions?
• When people do not understand something or have a question, do they use the
FAQs?
• Are the FAQs helpful/sufficient? What is missing?
• Are people able to correctly select their job from a long list of potential jobs?
• When do people use the left navigation, if at all?
• Can people use sliders correctly to select the desired response?

Identifying measures to collect
• Observational metrics tell us how participants navigate and interact.
• Self-report metrics tell us why participants focus on certain site aspects.
• Eye tracking tells us what, how long, and how often participants focus on design elements.
• The combination of observational, self-report, and implicit data allows us to accurately measure
the user experience. We do not use eye tracking in isolation.

Include eye tracking?
Consider using eye tracking if you want to:
• Observe what attracts attention
• Discover potential areas of confusion/interest
• Watch as users learn to interact with an interface over time
• Validate/invalidate design changes
Usability Testing
• Can users complete the survey and individual items?
• Direct observation of users’ behaviors
• Analysis: users’ conceptual model vs. survey model
• Evaluates usability
• Supports improved ease of use
Usability Testing with Eye Tracking
• Do users see things that aid/hurt completion?
• Look patterns: locations, duration, path
• Analysis: intended visual hierarchy vs. actual look pattern
• Evaluates user experience
• Supports improved ease of use and increased engagement

Eye tracking enables researchers to assess
attention to motivational language and
brand, which may impact response rate.
Walton, Romano Bergstrom, Hawkins & Pierce, 2014

Eye tracking example: People read pages with
questions on them differently than other pages.
Jarrett & Romano Bergstrom, 2014
The F-shaped eye-tracking pattern of the
block of text at the top of the page is
completely different from the eye-
tracking pattern on the question and
answer spaces at the bottom of the page.

Eye tracking example: People don’t read important
parts of survey invitation letters.
Olmsted-Hawala, Wang, Willimack, Burke & Lakhe, 2016

When NOT to track
Jarrett & Romano Bergstrom, 2014

Slot-In Survey Answers

Gathered Survey Answers

Created Survey Answers

Third-Party Survey Answers

Include eye tracking? – Summary

Plan your measurements
• Examples of performance measures
• Success rate and/or speed for tasks
• Requests for help/assistance
• Number and types of errors that occurred (e.g., incorrect
selections, menu choices)
• Count of features used (e.g., help menu, hover-over
definitions, calculate button)
• Examples of preference measures
• Do you prefer A or B? Why?
• How easy or difficult was it to do … Very easy, easy…
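The performance measures listed above can be tallied from simple session records. The following is a minimal sketch, assuming one record per participant per task; the `TaskAttempt` fields, the function name, and the sample data are all hypothetical and not part of the original course materials.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class TaskAttempt:
    """One participant's attempt at one task (hypothetical record layout)."""
    participant: str
    task: str
    completed: bool
    seconds: float
    errors: int
    help_requests: int


def summarize_task(attempts: list[TaskAttempt], task: str) -> dict:
    """Tally success, time on task, errors, and help requests for one task."""
    rows = [a for a in attempts if a.task == task]
    if not rows:
        raise ValueError(f"no attempts recorded for task {task!r}")
    successful = [a for a in rows if a.completed]
    return {
        "attempts": len(rows),
        "successes": len(successful),
        # Time on task is summarized for successful attempts only.
        "median_seconds_successful": (
            median([a.seconds for a in successful]) if successful else None
        ),
        "total_errors": sum(a.errors for a in rows),
        "help_requests": sum(a.help_requests for a in rows),
    }


if __name__ == "__main__":
    # Made-up data for three participants.
    data = [
        TaskAttempt("P1", "log in", True, 40.0, 0, 0),
        TaskAttempt("P2", "log in", True, 60.0, 1, 0),
        TaskAttempt("P3", "log in", False, 120.0, 3, 1),
    ]
    print(summarize_task(data, "log in"))
```

Keeping successes as a count alongside the number of attempts makes it easy to report results as "2 out of 3" rather than as a percentage.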

Organize roles
• Meet and greet
• Observers
• Test facilitator
• Note taker
• Videographer

Develop your test materials
• Develop consent forms, screeners
• Instructions/directions for participants
• Prepare written tasks/scenarios (on index cards)
• Pretest/posttest questionnaires
• Observer note sheets

Tasks, Scenarios, Probes
• Scenario – a real-life situation that you ask
participants to put themselves in to test the
instrument
• Task – something you want the participant to
accomplish
• Probe – questions asked of the user to elicit additional
information and feedback

A scenario brings the data together into a
coherent story
• Keep scenarios short and simple
• Scenarios should reflect things participants might
actually do
• Use vignettes to test rare/unusual situations
• Use participants’ words, not researchers’
• May need to prepare fake data to answer questions
(e.g., SSN, phone number)

For some products, you need a task
• These are things that you want the user to do
• Often as simple as: “Please fill out this survey as you
would at home”
• You may need specific tasks to match your test focus
and concerns

Example 1 – Scenario
Romano Bergstrom, Childs, Olmsted-Hawala & Jurgenson, 2013
• Participants imagined they were at the respondent’s door


• To assess if the Information Sheet worked well, scripts
were used to ensure interviewers could record
difficult-to-record households.

Example 2 - Scenario and Task
Scenario: Your graduate school will include the following PhD
programs this year:
• Biology
• Chemistry
• Marine, Earth, and Atmospheric Sciences
• Physics
• Spanish
Task: Please update the list of departments, programs, and
research units that should be included in the survey for this
year.

Don’t confuse participants with
too many scenarios
• Whittle lists of tasks/scenarios to manageable number
• Prepare tasks to give to participants (index cards are
useful)
• Tasks should flow in order of the survey
• Okay to change tasks between rounds


The day before the test
• Send out reminders
• Phone or email to respondents
• Email to stakeholders
• Equipment/Facility
• Check the computers and software (remote sharing, video
recording), keyboard, mouse
• Make sure the room you’ll use is tidy
• Make sure your meet/greet person
has the final list of participants’ names
• Make sure incentives are available

The day before the test

Set-Up for Mobile

Set-Up for Mobile with Eye Tracking
Fors Marsh Group UX Lab; Facebook UX Lab

Moderating Technique: Think Aloud
• Getting respondents to verbalize their thoughts
• Can be concurrent or retrospective
• Implementing
• Explain “thinking aloud” at the start
• Get the participant to try an example
• Remind them periodically (What are you thinking?)
• Snags:
• Thinking aloud is not natural for some people
• Others will start well, then forget

Think Aloud: concurrent vs retrospective
Concurrent
• Immediate thoughts
(good recall)
• Procedural comments
• May affect task
performance and
usability metrics.
• Can interfere with eye-
tracking data
• Shorter session length
• Less natural
Retrospective
• Relies on memory (recall
failure)
• Explanatory comments
• No effect on task
performance or usability
metrics
• Accurate Eye-tracking
data
• Session length increases
• More natural

Users may need help with thinking aloud
• Prompt as needed
• “What are you thinking?”
• “Tell me what you’re doing.”
• “Tell me what you’re looking at.”
• “Keep talking.”
• “Tell me more about that.”
• Show you’re listening
• Be Patient
• Give reminders

Moderating Technique: Verbal Probing
• Ask targeted questions (probes) about content or
functionality
• Explore content in more depth
• Concurrent vs retrospective
• Scripted vs spontaneous

Verbal Probing: Concurrent vs Retrospective
Concurrent
• Immediate thoughts (good
recall) and more detail
• May be biased
• Affects task performance and
usability metrics
• Ideal for exploratory tests and
cognitive/usability combined
tests
• Better for participants with
low cognitive ability
Retrospective
• Relies on memory (recall
failure), less detail
• Less biased
• No effect on task
performance or usability
metrics
• Can be used in any stage
of testing

Verbal Probe Examples 1
Immediate thoughts or
reactions
• What are your thoughts
on this [screen]?
• What are you thinking?
• What are you doing?
• What are you looking at?
• What are you trying to
do?
Does functionality match
expectations?
• What do you expect to
happen when you [click
that link/button]?
• How did you expect that
to work?

Verbal Probe Examples 2
Understand user
• What do you want to
accomplish?
• Can you describe the steps
you are taking now?
• How did you feel about
that process to [complete
task]?
• What’s going through your
mind right now?
Probing further
• Echoing
• Can you tell me more
about that?
• Can you provide an
example of [X]?

Probing Tips
• Avoid yes/no questions; people tend to be
acquiescent
• Bad: “Was this task difficult to complete?”
• Good: “How easy or difficult was that task to complete?”
• Ask unbiased questions
• Bad: “Are you looking at the X link?”
• Good: “Can you tell me what you are looking at?”
• Be quiet and wait
• Bad: Impatiently asking “what’s happening?”
• Good: Count to 20 before jumping in. Or to 30.

When you hear yourself asking a leading
question, balance it
Leading question: “So you think that’s difficult then?”
Balanced question: “...or was it easy?”

Choosing a Moderating Technique
• Can the participant work completely alone?
• Will you need time on task and accuracy data?
• Are the tasks multilayered, and/or do they require
concentration?
• Will you be conducting eye tracking?

Moderating Techniques: Summary
Concurrent Think Aloud
Advantages
• Feedback in real-time
• Good recall
• Procedural comments
• Shorter session length
• Unbiased feedback
• Easy for moderators to learn
Disadvantages
• Slight effect on task performance (vs. RTA)
• May affect usability metrics
• Some interference with eye-tracking data
• Less natural
• Hard for some participants
Retrospective Think Aloud
Advantages
• Explanatory comments
• No effect on task performance or usability metrics
• Accurate eye-tracking data
• More natural
• Unbiased feedback
• Easy for moderators to learn
Disadvantages
• Recall failure
• Longer session length
• Hard for some participants
• Requires heavy cueing
Geisen & Romano Bergstrom, 2017

Moderating Techniques: Summary 2
Concurrent Verbal Probing
Advantages
• Feedback in real-time
• Good recall
• Ask targeted questions
• More detailed comments
• Works well for exploratory tests, cognitive/usability combined tests
• Easiest for participants, especially with low cognitive ability
Disadvantages
• May introduce bias
• Negative effect on task performance and usability metrics
• Hardest for moderators to learn
• Longest session lengths
Retrospective Verbal Probing
Advantages
• Less biased
• Ask targeted questions
• No effect on task performance or usability metrics
• Can be used in any stage of testing
• Easier for participants
Disadvantages
• Recall failure
• Requires some cueing
• Less detailed comments
• Hard for moderators to learn (vs CTA, RTA)

Moderating Tips
• Maintain objective viewpoint
• Be prepared for surprises
• Report accurately
• Redirect participants to keep them on task
• Avoid coaching
• Don’t help participants
• Don’t ask if they would do “anything else”
• Don’t suggest: “Let’s try this”

Moderating Tips (Continued)
• Participant is silent
• Be patient. Then if necessary, ask “What are you
thinking?”
• Participant asks you, “Is this right?”
• “We just want to see how you do it.”
• Participant asks you for help
• “What would you do if I wasn’t here?”
• Participant blames himself/herself
• “A lot of people have had this problem.”
• “Your feedback helps us learn what we need to
improve.”

Provide neutral feedback
• Provide praise/feedback for every task, successful or
unsuccessful
• Keep it neutral
• “mm hmmm”
• “uh huh”
• “That’s interesting”
• “That’s helpful”
• If you are writing notes then write down everything
• If you only write bad things, the participant will notice it’s
biased


Analyzing Results
Analyze
• Collect all of your data together
• Summarize/Reduce to meaningful chunks
• Understand what it means
Revise
• What can/should we do about it?
Test again
Barnum, 2011

Collect All of Your Data Together
• Self-reported
• Verbalizations
• Satisfaction and difficulty ratings from questionnaires
• Observational
• Usability metrics
• Click patterns
• Behavior and other observations
• Implicit:
• Eye-tracking data

“Focus ruthlessly on only the most
serious problems”
Krug, 2010

Focus on the most serious problems
• Run your usability test
• Have a meeting with the key stakeholders
• Decide on most important problems (and problems
that are easily fixed)
• Go away and fix them
• Ignore everything else
• Test again and repeat

Determining the problems to target
• Frequency of the problem (e.g., 5 out of 5 users)
• How likely are others to have this problem?
• What’s the impact on the survey (e.g., causes break-
offs, inaccurate data)
• How much of the survey does it affect (e.g., local vs.
global finding)?
• How easy/difficult is it to fix (low-hanging fruit)
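The criteria above can be combined into a rough ranking to bring to the stakeholder debrief. This is a minimal sketch, not a method from the slides: the `Finding` fields, the 1 to 3 rating scales, the 1.5 weight for global findings, and the scoring formula are all assumptions chosen for illustration, and the resulting order should inform the discussion rather than replace it.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One usability finding (hypothetical structure and scales)."""
    description: str
    users_affected: int
    users_tested: int
    impact: int       # assumed scale: 1 = cosmetic ... 3 = break-offs or inaccurate data
    is_global: bool   # True if the problem affects much of the survey
    effort: int       # assumed scale: 1 = easy to fix ... 3 = hard to fix


def priority(finding: Finding) -> float:
    """Higher means fix sooner. The weights are illustrative assumptions."""
    frequency = finding.users_affected / finding.users_tested
    scope = 1.5 if finding.is_global else 1.0
    return frequency * finding.impact * scope / finding.effort


def rank_findings(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from highest to lowest priority."""
    return sorted(findings, key=priority, reverse=True)


if __name__ == "__main__":
    # Made-up findings for illustration.
    findings = [
        Finding("Typo in instructions", 1, 5, 1, False, 1),
        Finding("Next button overlooked", 5, 5, 3, True, 1),
        Finding("Job list hard to search", 3, 5, 3, False, 3),
    ]
    for f in rank_findings(findings):
        print(f"{priority(f):.2f}  {f.description}")
```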

Focus on findings that improve quality
• When usability testing a survey, focus on
• Improving data quality
• Reducing respondent burden

Determining what and how to fix
problems is harder.
• Group debrief after the test to discuss
• Set priorities
• Most serious problems (ignore “nice to haves”)
• Problems that are easy to fix (e.g., typos, wording)
• How long will it take to fix?
• Will fixing it cause other potential problems?
• Recommendations should be specific/doable within
timeframe and budget
• Not everything will get fixed

Weigh effect on data vs Effort to fix

With few users, each one really matters
• Are there outliers in small studies?
• How representative is each user?
• Are others likely to have this problem?
• Caveats for reporting data with few users
• Report numbers (4 out of 5) rather than percentages
• Report with numbers rather than words (most, usually,
almost all)
• With a small number of users, you will find your
biggest problems
• Iterate and/or conduct remote unmoderated testing, if
possible
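The reporting caveat above (counts, not percentages or vague words) is easy to build into whatever generates a findings summary. The helper below is a minimal sketch; its name and wording are hypothetical.

```python
def report_count(n_with_issue: int, n_tested: int, what: str) -> str:
    """Phrase a small-sample finding as 'k out of n', never as a percentage."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    if not 0 <= n_with_issue <= n_tested:
        raise ValueError("n_with_issue must be between 0 and n_tested")
    return f"{n_with_issue} out of {n_tested} participants {what}"


if __name__ == "__main__":
    print(report_count(4, 5, "did not notice the hover-over definitions"))
```

Stating the denominator every time lets readers judge for themselves how much weight a finding from five participants can bear.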

Example Consent Form

Example Moderator’s Guide

Example Satisfaction Questionnaire

Example Debriefing Interview
