Measuring Performance Metrics

Chapter 4
Performance Metrics

Presenter: 00335011 魏傳諺

Agenda

• Preface
• Task Success
• Time-on-Task
• Errors
• Efficiency
• Learnability

Preface of Performance Metrics

• Based on specific user behaviors
– User behaviors
– The use of scenarios or tasks
• How well users are actually using a product
• Useful to estimate the magnitude of a specific usability issue
– How many people are likely to encounter the same issue after the product is released?
– How many users are able to successfully complete a core set of tasks using a product?
• Not a magic elixir for every situation
– Needs an adequate sample size
– Takes time and money
– Tells you the “what” very effectively but not the “why”

Five Basic Types

Task Success • How effectively users are able to complete a given set of tasks (the most widely used performance metric)

Time-on-Task • How much time is required to complete a task

Errors • Reflect the mistakes made during a task

Efficiency • The amount of effort a user expends to complete a task

Learnability • How performance changes over time

Task Success

• The most common usability metric
• As long as the user has a well-defined task, you can measure
success

Collecting Any Type of Success Metric

• Each task must have a clear end-state
– Define the success criteria → Data collection
• Find the current price for a share of Google stock (clear end-state)
• Research ways to save for your retirement (not a clear end-state)

• Ways to collect success data
– Verbally articulate the answer after completing the task
– Provide their answers in a more structured way
• Try to avoid write-in answers if possible

• In some cases the correct solution to a task may not be verifiable
– It depends on the user's specific situation
– Testing is not being performed in person

Binary Success

• Participants either complete a task successfully or they don't
• How to Collect and Measure
– Score each task as 1 (success) or 0 (failure)
• How to Analyze and Present
– By individual task
– By user or type of user
• Frequency of use
• Previous experience using the product
• Domain expertise
• Age group
• Can calculate the percentage of tasks that each participant successfully completed
– Binary data → Continuous data

• Calculating Confidence Intervals
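The slide does not say which interval method to use. As a minimal sketch, the example below assumes the adjusted Wald method, which is often suggested for small-sample binary data, and a 95% confidence level (z = 1.96). The function name and the data are hypothetical.

```python
import math

def adjusted_wald_interval(successes, n, z=1.96):
    """Return (low, high) for a binary success rate using the adjusted Wald method.

    Assumption: z = 1.96 corresponds to a 95% confidence level.
    """
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# Hypothetical data: 1 = success, 0 = failure, one entry per participant.
task_results = [1, 1, 0, 1, 1, 0, 1, 1, 1, 1]

rate = sum(task_results) / len(task_results)
low, high = adjusted_wald_interval(sum(task_results), len(task_results))
print(f"Success rate: {rate:.0%}, 95% CI: {low:.0%} to {high:.0%}")
```

With only ten participants the interval is wide, which is why a success rate is more informative when reported together with its interval.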

Levels of Success

• Partially completing a task?
– Coming close to fully completing a task may provide value to the participant
– Helpful for you to know
• Why some participants failed to complete a task
• With which particular tasks they needed help

Levels of Success (cont’d)

• How to Collect and Measure
– Must define the various levels
– Based on the extent or degree to which a participant completed the task
• Complete Success, Partial Success, and Failure
• What constitutes “giving assistance” to the participant
• Assign a numeric value for each level
• Does not differentiate between different types of failure
– Based on the experience in completing a task
• No Problem, Minor Problem, Major Problem, and Failure/Gave up
• Ordinal data → No average score
– Based on the participant accomplishing the task in different ways
• Depending on the quality of the answer (no numeric score needed)

Levels of Success (cont’d)

• How to Analyze and Present
– To create a stacked bar chart
– To report a “usability score”
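Both presentations can be sketched from the same data. The example below applies to the Complete Success / Partial Success / Failure scheme, where a numeric value is assigned to each level. The values 1.0, 0.5 and 0.0 are an assumption, as are the task names and results. It prints the percentage of participants at each level (the input to a stacked bar chart) and a mean score per task.

```python
from collections import Counter

# Assumed numeric values per level; the slides do not specify them.
LEVEL_SCORES = {"complete": 1.0, "partial": 0.5, "failure": 0.0}

# Hypothetical results: task name -> one level per participant.
results = {
    "Task 1": ["complete", "complete", "partial", "failure"],
    "Task 2": ["partial", "partial", "complete", "complete"],
}

def level_percentages(levels):
    """Percentage of participants at each level (one stacked bar)."""
    counts = Counter(levels)
    return {level: 100 * counts[level] / len(levels) for level in LEVEL_SCORES}

def usability_score(levels):
    """Mean of the numeric values assigned to each participant's level."""
    return sum(LEVEL_SCORES[level] for level in levels) / len(levels)

for task, levels in results.items():
    print(task, level_percentages(levels), f"score={usability_score(levels):.3f}")
```

Averaging is only reasonable here because each level was given an explicit numeric value. For purely ordinal schemes such as No Problem / Minor Problem / Major Problem, report the frequencies instead.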

Issues in Measuring Success

• How to define whether a task was successful?
– When unexpected situations arise
• Make note of them
• Afterward try to reach a consensus

• How or when to end a task
– Stopping rule
• Complete task / Reach the point at which they would give up or seek assistance
• “Three strikes and you're out”
• Set a time limit
– If the participant is becoming particularly frustrated or agitated

Time-on-Task

• Way to measure the efficiency of any product
– The faster a participant can complete a task, the better the experience
• Exceptions to the assumption that faster is better
– Games
– Learning

Importance of Measuring Time-on-Task

• Particularly important for products
– where tasks are performed repeatedly by the user
• The side benefits of measuring time-on-task
– Increasing Efficiency → Cost Savings → Actual ROI

How to Collect and Measure Time-on-Task

• The time elapsed between the start of a task and the end of a task
– In minutes
– In seconds
• Measure by any time-keeping device
– Start time & End time
– Two people record the times
• Automated Tools for Measuring Time-on-Task
– less error-prone
– Much less obtrusive
• Turning on and off the Clock
– Rules about how to measure time
• Start the clock as soon as they finish reading the task
• Stop the clock when the participant hits the “answer” button
• Stop timing when the participant has stopped interacting with the product
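When start and end times are logged by hand, time-on-task is simply the difference between the two readings. A minimal sketch follows, assuming readings are written as HH:MM:SS on the same day. The log format, participant IDs and function name are hypothetical.

```python
from datetime import datetime

def time_on_task_seconds(start, end):
    """Elapsed seconds between two 'HH:MM:SS' clock readings on the same day."""
    fmt = "%H:%M:%S"
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if elapsed.total_seconds() < 0:
        raise ValueError("end time is before start time")
    return elapsed.total_seconds()

# Hypothetical log entries: (participant, task, start time, end time).
log = [
    ("P1", "Task 1", "10:02:15", "10:03:40"),
    ("P2", "Task 1", "10:05:00", "10:07:30"),
]

durations = {
    (participant, task): time_on_task_seconds(start, end)
    for participant, task, start, end in log
}
print(durations)
```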

How to Collect and Measure Time-on-Task (cont’d)

• Tabulating Time Data

Analyzing and Presenting Time-on-Task Data

• Ways to present
– Mean
– Median
– Geometric mean
• Ranges
– Time interval
• Thresholds
– Whether users can complete certain tasks within an acceptable amount of
time
• Distributions and Outliers
– Exclude outliers (> 3 SD above the mean)
– Set up thresholds
– determine the fastest possible time
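The summary statistics above can be sketched with the standard library. The times and the 45-second threshold are hypothetical, and the use of the sample standard deviation for the 3 SD rule is an assumption.

```python
import statistics

def exclude_outliers(times, sd_limit=3.0):
    """Drop times more than sd_limit sample standard deviations above the mean."""
    cutoff = statistics.mean(times) + sd_limit * statistics.stdev(times)
    return [t for t in times if t <= cutoff]

def summarize(times, threshold):
    """Mean, median, geometric mean, and share of times within the threshold."""
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "geometric_mean": statistics.geometric_mean(times),
        "share_within_threshold": sum(t <= threshold for t in times) / len(times),
    }

# Hypothetical time-on-task data in seconds; 300 is an obvious outlier.
times = [34, 41, 38, 52, 45, 39, 47, 36, 43, 40, 44, 37, 300]

kept = exclude_outliers(times)
summary = summarize(kept, threshold=45)
print(summary)
```

Time data is usually skewed to the right, so the median and geometric mean are less affected by a few slow participants than the mean is.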

Issues to Consider When Using Time Data

• Only Successful Tasks or All Tasks?
– Advantage of only including successful tasks
• A cleaner measure of efficiency
– Advantage of including all tasks
• A more accurate reflection of the overall user experience
• An independent measure in relation to the task success data
– If the participant always decided when to end a task → include all times
– If the moderator sometimes decided when to end a task → include only successful tasks
• Using a Think-Aloud Protocol?
– Think-aloud protocol: to gain important insight
– Have an impact on the time-on-task data
– Retrospective probing technique
• Should You Tell the Participants about the Time Measurement?
– Ask participants to perform the tasks as quickly and accurately as possible

Errors

• Usability issue vs. Error
– A usability issue is the underlying cause of a problem
– One or more errors are a possible outcome
• Errors
– incorrect actions that may lead to task failure

When to Measure Errors

• When you want to understand the specific action or set of actions
that may result in task failure
• Errors can tell
– How many mistakes were made
– Where they were made within the product
– How various designs produce different frequencies and types of errors
– How usable something really is
• Three general situations where measuring errors might be useful
– When an error will result in a significant loss in efficiency
– When an error will result in significant costs
– When an error will result in task failure

What Constitutes an Error?

• No widely accepted definition of what constitutes an error
• Based on many different types of incorrect actions by the user
– Entering incorrect data into a form field
– Making the wrong choice in a menu or drop-down list
– Taking an incorrect sequence of actions
– Failing to take a key action
• Determine what constitutes an error
– Make a list of all the possible actions
– Define many of the different types of errors that can be made

What Constitutes an Error? (cont’d)

Collecting and Measuring Errors

• Not always easy
– Need to know what the correct (set of) action(s) should be
• Consideration
– Only a single error opportunity
– Multiple error opportunities
• Way of organizing error data
– Record the number of errors for each task and each user
– Each count ranges from 0 to the maximum number of error opportunities

Analyzing and Presenting Errors

• Tasks with a Single Error Opportunity
– Look at the frequency of the error for each task
• Frequency of errors
• Percentage of participants who made an error for each task
– From an aggregate perspective
• Average the error rates for each task into a single error rate
• Take an average of all the tasks that had a certain number of errors
• Establish maximum acceptable error rates for each task
• Tasks with Multiple Error Opportunities
– Look at the frequency of errors for each task → error rate
– The average number of errors made by each participant for each task
– Which tasks fall above or below a threshold
– Weight each type of error with a different value and then calculate an “error score”
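For tasks with multiple error opportunities, the calculations above can be sketched as follows. The error counts, the number of opportunities and the severity weights are all hypothetical. The error rate here is taken as total errors divided by total error opportunities across participants.

```python
def error_rate(errors_per_participant, opportunities):
    """Total errors divided by total error opportunities across all participants."""
    total_opportunities = opportunities * len(errors_per_participant)
    return sum(errors_per_participant) / total_opportunities

def mean_errors(errors_per_participant):
    """Average number of errors made by each participant on one task."""
    return sum(errors_per_participant) / len(errors_per_participant)

def error_score(error_counts, weights):
    """Weighted sum of errors, with one weight per error type."""
    return sum(weights[kind] * count for kind, count in error_counts.items())

# Hypothetical task with 6 error opportunities and 5 participants.
errors = [0, 2, 1, 3, 0]
print(error_rate(errors, opportunities=6))   # share of opportunities that became errors
print(mean_errors(errors))

# Assumed weights: more serious error types count more.
weights = {"minor": 1, "moderate": 2, "serious": 3}
counts = {"minor": 4, "moderate": 1, "serious": 1}
print(error_score(counts, weights))
```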

Issues to Consider When Using Error Metrics

• Make sure you are not double-counting errors
• Need to know
– An error rate, and
– Why different errors are occurring
• When an error is the same as failing to complete a task
– Report errors as task failure

Efficiency

• Often measured with time-on-task
• Can also look at the amount of effort required to complete a task
– In most products, the goal is to minimize the amount of effort
– Two types of effort
• Cognitive
– Finding the right place to perform an action
– Deciding what action is necessary
– Interpreting the results of the action
• Physical
– The physical activity required to take action

Collecting and Measuring Efficiency

• Identify the action(s) to be measured
• Define the start and end of an action
• Count the actions
• Actions must be meaningful
– Incremental increase in cognitive effort
– Incremental increase in physical effort
• Look only at successful tasks

Analyzing and Presenting Efficiency Data

Analyzing and Presenting Efficiency Data (cont’d)

Efficiency as a Combination of Task Success and Time

• Task Success + Time-on-Task
• Core measure of efficiency
– The ratio of the task completion rate to the mean time per task
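The ratio can be worked through with a short example. The completion rates and mean times below are hypothetical. With the completion rate in percent and time in minutes, the result reads as percent of tasks completed per minute.

```python
def efficiency(completion_rate_percent, mean_time_minutes):
    """Ratio of the task completion rate to the mean time per task."""
    return completion_rate_percent / mean_time_minutes

# Hypothetical data: task -> (completion rate in %, mean time in minutes).
tasks = {
    "Task 1": (65, 1.5),
    "Task 2": (90, 3.0),
}

for name, (rate, minutes) in tasks.items():
    print(f"{name}: {efficiency(rate, minutes):.1f} % per minute")
```

In this made-up data, Task 2 has the higher completion rate but the lower efficiency, because it takes twice as long.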

Learnability

• Most products, especially new ones, require some amount of learning
• Experience
– Based on the amount of time spent using a product
– Based on the variety of tasks performed
• Learning
– Sometimes quick and painless
– At other times quite arduous and time consuming
• Learnability
– The extent to which something can be learned
– How much time and effort are required to become proficient
– When learning happens over a short period of time → users maximize efficiency
– When learning happens over a longer period of time → users rely heavily on memory

Collecting and Measuring Learnability Data

• Collection methods are basically the same as for the other performance metrics
• Collect the data at multiple times
– Based on expected frequency of use
• Decide which metrics to use → Decide how much time to allow between trials
• Alternatives
– Trials within the same session
– Trials within the same session but with breaks between tasks
– Trials between sessions

Analyzing and Presenting Learnability Data

• By examining a specific performance metric
• Interpret the chart
– Notice the slope of the line(s)
– Notice the point of asymptote, or essentially where the line starts to flatten out
– Look at the difference between the highest and lowest values on the y-axis
• Compare learnability across different conditions
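The three readings of the chart (slope, flattening point, and range) can also be computed directly. This is a minimal sketch with hypothetical per-trial mean times. The 5% tolerance used to decide that the curve has flattened is an assumption, not a standard cutoff.

```python
# Hypothetical mean time-on-task (seconds) for each trial, in order.
trial_means = [80.0, 62.0, 51.0, 47.0, 46.0, 45.5]

def trial_to_trial_change(values):
    """Change between consecutive trials (the slope of each line segment)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

def plateau_trial(values, tolerance=0.05):
    """First trial (1-indexed) whose relative change from the previous trial
    is within the tolerance, or None if performance never levels off."""
    for i in range(1, len(values)):
        if abs(values[i] - values[i - 1]) / values[i - 1] <= tolerance:
            return i + 1
    return None

changes = trial_to_trial_change(trial_means)
plateau = plateau_trial(trial_means)
value_range = max(trial_means) - min(trial_means)

print("Change per trial:", changes)
print("Flattens out at trial:", plateau)
print("Highest minus lowest:", value_range)
```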

Issues to Consider When Measuring Learnability

• What Is a Trial?
– In some situations learning is continuous, without breaks in time
• Memory is much less a factor in this situation
• More about developing and modifying different strategies to complete a set
of tasks
• Take measurements at specified time intervals

• Number of Trials
– There must be at least two
– In most cases there should be at least three or four
– You should err on the side of more trials than you think you might need
to reach stable performance.
