LeanUX: Online Design of Experiments
1. LeanUXDenver 2012 L6σ
LeanUX
Multivariate Testing Using
Design of Experiments (DOE)
Scott Leek
Sigma Consulting Resources, LLC
LeanUXDenver
September 21, 2012
© 2012 Sigma Consulting Resources, LLC 1
2. Purpose L6σ
Objectives
• Strategies and tactics for testing theories, advantages and disadvantages
• Fundamental approach and iterative nature of experimentation
• Properties of a good experimental design
• Basic DOE terminology
• Design types and uses
• Full factorial designs
• Concepts
• How to
• Example
• Fractional factorial designs
3. LeanUX Notional Scenario L6σ
Objective
Test landing experience factors to increase
landing page conversion rate
4. LeanUX Approaches L6σ
Strategy
Retrospective (Passive Observation)
[Diagram: candidate causes (Methods, Buttons, Measurements, Layout, Offers, Colors) feeding into an Effect. Observe the effect, then search for the cause.]
5. LeanUX Approaches L6σ
Strategy
Prospective (Experimentation)
[Diagram: candidate causes (Methods, Materials, Measurements, People, Machines, Environment) feeding into an Effect. Change one or more factors to create the effect.]
6. LeanUX Tactics L6σ
Options
• Historical data
• One factor at a time
• All factors at the same time
• A/B Testing
• Design of Experiments (DOE)/Multivariate Testing (MVT)
7. LeanUX Tactics L6σ
Historical Data
Description
Analyze historical (retrospective) data to find correlations and/or build
predictive models (ANOVA, Regression, GLM, et cetera).
[Figure: conversion probability plotted against load time]
8. LeanUX Tactics L6σ
Historical Data
Advantages:
• Timely and efficient use of data
• Logistically simple
• Effective predictive models
Disadvantages:
• Large data sets
• Background variables uncontrolled
• Potential lurking variables
• Interactions can be problematic
• Factor testing range too narrow
• Important factors not tested
• Errors in the data, incomplete data
9. LeanUX Tactics L6σ
Proving Cause
“To find out what happens to a system when you interfere
with it, you have to interfere with it (not just passively
observe it).”
George E. P. Box
10. LeanUX Tactics L6σ
One Factor At A Time
Description
Start from baseline factor settings and change one factor. If the result is
better, retain the change; if not, return to the baseline. Repeat with the
next factor.
Factor 1 Factor 2 Factor 3 Factor 4 Factor 5
11. LeanUX Tactics L6σ
One Factor At A Time
• Baseline for five factors
• Change factor 1
• If improved, retain the change; change factor 2
• If not improved, do not retain the change; change factor 3
12. LeanUX Tactics L6σ
One Factor At A Time
Advantages:
• Fast and simple to execute
• Little planning required
• You can get lucky
Disadvantages:
• Confounded by random variation
• Logistically problematic
• No information on main effects
• No information on interactions
• Factor combinations not tested
• Background variables uncontrolled
• Potential lurking variables
13. LeanUX Tactics L6σ
All Factors At The Same Time
Description
Start from baseline factor settings and change multiple (or all) factors
simultaneously.
Baseline for five
factors
Change multiple
factors
14. LeanUX Tactics L6σ
All Factors At The Same Time
Advantages:
• Fast and simple to execute
• Little planning required
• You can get lucky
Disadvantages:
• Effects are confounded
• Logistically problematic
• Factor combinations not tested
• Background variables uncontrolled
• Potential lurking variables
15. LeanUX Tactics L6σ
A/B Testing
Description
A simple designed experiment randomly exposing users to either a
control (A) or a treatment (B). The treatment can vary one factor on a
landing page, or vary multiple factors in the landing experience.
[Figure: revenue for Landing Page 1 versus Landing Page 2]
16. LeanUX Tactics L6σ
A/B Testing
Advantages:
• Relatively simple
• Efficient use of data
• Effective results
• Protect against lurking variables
• Plan for background variables
Disadvantages:
• Limited number of comparisons
• Limited information on main effects
• No information on interactions
• Increased probability of Type I error*
* Pairwise comparisons of seven factors, two at a time, result in 21 tests (7!/(2! × 5!)). Assuming 95% confidence for each test, the probability of
at least one Type I error increases from 5% (1 − 0.95) to 66% (1 − 0.95^21).
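The footnote's arithmetic can be checked in a few lines. This is a minimal sketch that assumes the 21 tests are independent and that each one uses a 5% significance level; the variable names are illustrative only.

```python
from math import comb

factors = 7     # factors compared two at a time
alpha = 0.05    # per-test Type I error rate (95% confidence)

# Number of pairwise comparisons: 7! / (2! * 5!)
tests = comb(factors, 2)

# Probability of at least one false positive across all tests,
# assuming the tests are independent.
familywise = 1 - (1 - alpha) ** tests

print(tests)                 # 21
print(round(familywise, 2))  # 0.66
```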
17. LeanUX Tactics L6σ
Design of Experiments (DOE)
Description
Similar to A/B testing but multiple factors are tested simultaneously
allowing for precise estimates of main effects and interaction effects.
[Figure: cube of the experimental space with axes Button Size (Small, Large), Offering Graphic (Icon, Photo), and Discount Field (No, Yes)]
18. LeanUX Tactics L6σ
Design of Experiments (DOE)
Advantages:
• Relatively simple
• Efficient use of data
• Effective results
• Protect against lurking variables
• Plan for background variables
• Estimates for main effects
• Estimates for interaction effects
• Predictive model
Disadvantages:
• Can be logistically complicated*
• Requires planning and discipline
* Crook, Thomas, Frasca, Brian, Kohavi, Ron, Longbotham, Roger, “Seven Pitfalls to Avoid when Running Controlled
Experiments on the Web,” http://www.exp-platform.com/Pages/ExPpitfalls.aspx.
19. Design of Experiments (DOE) L6σ
LeanUX Experimentation
[Figure: knowledge grows from the current state to a decision/action through repeated cycles of theory, UX experiment, and data]
20. Design of Experiments (DOE) L6σ
Properties of a Good Experimental Design*
1. Actionable well-defined objective(s)
2. Conducted sequentially to build knowledge
3. Variation in the response variables can be allocated to factors,
background variables, and lurking variables
4. Experiments are conducted over as wide a range of conditions as
possible to improve confidence (degree of belief)
5. As simple as possible while satisfying the first four properties
* Adapted from Moen, Ronald D., Nolan, Thomas W., Provost, Lloyd P., (1991): Improving Quality
Through Planned Experimentation, McGraw-Hill, New York.
21. Design of Experiments (DOE) L6σ
Terms
• Response Variable – also called a dependent variable,
or overall evaluation criterion (OEC). A response
variable is a measure that the experiment is trying to
maximize, minimize, or optimize – e.g., click-through
rate, dwell time, et cetera.
• Factor – also called an independent variable (variant).
Factors are changed in a planned way during the
experiment to observe the effect on the response – e.g.,
button position, headline type, offer graphic, et cetera.
• Level – a setting for a factor that can be qualitative or
quantitative – e.g., button position of top or bottom, offer
graphic of icon or photo, response time, et cetera.
22. Design of Experiments (DOE) L6σ
Terms
• Background Variable – a variable that potentially
affects the response variable but is not of interest to
study as a factor – e.g., browser type, server response
time, volumes, time (day, week, month, year), et cetera.
Background variables are managed in one of three
ways: holding constant, blocking, or measuring.
• Lurking Variable – a variable potentially affecting the
response variable that is unknown at the time the
experiment is planned. Lurking variables are mitigated
through randomization.
23. Design of Experiments (DOE) L6σ
Terms
• Experimental Unit – the smallest unit receiving different
combinations of factor levels (treatments) – e.g., people,
batches, projects, parts, et cetera.
• Run (Test or Trial) – a set of factor level combinations
(treatments) tested in the experiment – e.g., button size
= large, offering graphic = icon, discount field = yes.
• Effect – the change in the response variable when
factor levels are changed – e.g., conversion rates
increase when the offering graphic is a photo of a
person versus an icon. There are main effects and
interaction effects.
24. Design of Experiments (DOE) L6σ
Design Types & Uses
Knowledge: Low → High (left to right)
Design Type | Screening | Fractional Factorials | Full Factorials | Response Surface
# of Factors | >5 | 5–10 | 2–8 | 2–8
Purpose | Identify important factors | Identify main effects + some interactions | Identify main effects + interactions | Optimize factor settings
26. Design of Experiments (DOE) L6σ
2^k Full Factorial Designs
• The experimental trials are performed for all possible
combinations of factor levels.
• Full factorial designs are frequently called n^k designs
n = number of factor levels
k = number of factors
• A common factorial design is the 2^k design, which is simple and
powerful.
• Disadvantages of the 2^k design: the basic design assumes linear
relationships, so non-linear ones can be missed, and the number of
trials increases quickly as factors are added.
27. 2^k Full Factorial Designs L6σ
Why 2^k Designs?
# of Factors (k) | 2^k | 3^k
2 | 4 | 9
3 | 8 | 27
4 | 16 | 81
5 | 32 | 243
6 | 64 | 729
7 | 128 | 2,187
8 | 256 | 6,561
2^k designs require significantly fewer trials than 3^k designs,
and the gap widens as the number of factors increases.
28. 2^k Full Factorial Designs L6σ
Risks?
[Figure: conversion probability versus load time (Lo to Hi). A basic 2^k design assumes a linear relationship; the actual relationship may be non-linear.]
Options for dealing with non-linear relationships: add center points,
add factor levels, or use Response Surface Methodology.
29. 2^k Full Factorial Designs L6σ
Factors & Levels
Factor | Levels
Button Size | Small, Large
Offering Graphic | Icon, Photo
Discount Field | Yes, No
Three factors, each at two levels: 2^3 = 8 trials
(runs) in the full factorial design.
30. 2^k Full Factorial Designs L6σ
Notation & Standard Order
Standard Order | Button Size | Offering Graphic | Discount Field
1 | Small (-) | Icon (-) | No (-)
2 | Large (+) | Icon (-) | No (-)
3 | Small (-) | Photo (+) | No (-)
4 | Large (+) | Photo (+) | No (-)
5 | Small (-) | Icon (-) | Yes (+)
6 | Large (+) | Icon (-) | Yes (+)
7 | Small (-) | Photo (+) | Yes (+)
8 | Large (+) | Photo (+) | Yes (+)
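The standard order above can be generated rather than typed by hand. The sketch below is one possible way to do it, assuming the first factor should change fastest (as in the table); the function name `standard_order` is hypothetical.

```python
from itertools import product

# Factor names with their (low, high) levels, as listed on the slide.
factors = {
    "Button Size": ("Small", "Large"),
    "Offering Graphic": ("Icon", "Photo"),
    "Discount Field": ("No", "Yes"),
}

def standard_order(factors):
    """Return all runs in standard order: the first factor changes fastest."""
    names = list(factors)
    runs = []
    # product() varies its last position fastest, so each combination is
    # reversed to make the first factor the fastest-changing one.
    for combo in product((0, 1), repeat=len(names)):
        levels = combo[::-1]
        runs.append({name: factors[name][i] for name, i in zip(names, levels)})
    return runs

design = standard_order(factors)
for number, run in enumerate(design, start=1):
    print(number, run)
```

Adding a fourth entry to `factors` would produce the 16 runs of a 2^4 design in the same order.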
31. 2^k Full Factorial Designs L6σ
Visualizing the Experimental Space
• A cube helps visualize the experimental space with 3 factors
• Each corner represents one of the 2^3 = 8 trials (runs)
• A Full Factorial design covers the entire experimental space
[Figure: cube with axes Button Size (Small, Large), Offering Graphic (Icon, Photo), and Discount Field (No, Yes). One labeled corner is Button Size = Small, Offering Graphic = Icon, Discount Field = No; the opposite corner is Button Size = Large, Offering Graphic = Photo, Discount Field = Yes.]
32. Design of Experiments (DOE) L6σ
Steps (1 – 4 of 10)
1. Define the objective(s)
2. Summarize relevant background information
3. Identify the response variable(s)
4. Identify the factors and levels
33. LeanUX DOE L6σ
Plan
1. Objective(s)
Test landing page factors to increase conversion rate
2. Background Information
A series of prior experiments concluded that there are 3 significant factors
out of the 8 tested
3. Response Variable(s)
Conversion rate
4. Factors Levels
Button Size Small Large
Offer Graphic Icon Photo
Discount Field No Yes
34. Design of Experiments (DOE) L6σ
Controlling Background Variables
• Hold constant
• Measure and include as a covariate
• Run the experiment in Blocks (groups of experimental
units receiving similar treatments)
35. Design of Experiments (DOE) L6σ
Steps (5 – 7 of 10)
5. Identify the background variables and method of control
6. Select the design including replication
7. Randomize trials (runs)
36. LeanUX DOE L6σ
Plan
5. Background Variable(s) Method of Control
Browser Type Measure
Operating System Measure
Time (Day, Week, Month, Year) Measure (could run in blocks)
6. Design and Replication
2^3 Full Factorial = 8 trials × 2 Replicates = 16 trials
7. Randomization
Users randomly assigned to treatments. All assignments are re-directs. The
assignment and redirecting process will be tested offline.
37. Design of Experiments (DOE) L6σ
Replication
• Repetition of experimental treatments so that
experimental error (common cause variation) can be
estimated
• A 2^3 Full Factorial 8-run design with 2 replicates requires
16 trials (runs)
• All trials, including replicates, should be randomized
• Include replication if resources allow (estimate error,
estimate response variability, calculate statistical
significance)
38. Design of Experiments (DOE) L6σ
Randomization
• Creating a random sequence to run the experimental trials (runs) or
randomly assign users to treatments
• Random means the probability of each event is equal
[Figure: the trials listed in standard order and in random order]
* Crook et al. recommend conducting A/A testing prior to experimentation to validate the randomization process. See Crook,
Thomas, Frasca, Brian, Kohavi, Ron, Longbotham, Roger, “Seven Pitfalls to Avoid when Running Controlled Experiments
on the Web,” http://www.exp-platform.com/Pages/ExPpitfalls.aspx.
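A random run order for the replicated 2^3 design can be sketched as follows. This is an illustration, not the process used in the talk: the function name `randomized_runs` and the seed value are assumptions, and the seed is fixed only so the order can be reproduced.

```python
import random
from itertools import product

def randomized_runs(n_factors=3, replicates=2, seed=2012):
    """Build a replicated 2^k design in coded units (-1/+1) and shuffle it."""
    # Standard order: the first factor changes fastest.
    base = [combo[::-1] for combo in product((-1, 1), repeat=n_factors)]
    # One entry per trial: (standard order number, replicate number, levels).
    runs = [(std + 1, rep + 1, levels)
            for rep in range(replicates)
            for std, levels in enumerate(base)]
    rng = random.Random(seed)   # arbitrary fixed seed (assumption)
    rng.shuffle(runs)           # all trials, including replicates, are shuffled
    return runs

runs = randomized_runs()
for position, (std, rep, levels) in enumerate(runs, start=1):
    print(position, std, rep, levels)
```

For online tests where users rather than runs are randomized, the same idea applies to the assignment of each user to one of the eight treatments.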
39. Design of Experiments (DOE) L6σ
Why Randomize?
• The response of interest is conversion rate.
• The graph depicts the conversion rate over a typical day.
• Why did the conversion rate trend down over the course of a day?
40. Design of Experiments (DOE) L6σ
Why Randomize?
• A new landing page is tested against a control, but assignments are
not randomized
• The control is tested during the first half of the day and the
treatment is tested during the second half of the day
Treatment
Control
41. Design of Experiments (DOE) L6σ
Why Randomize?
• When the control and the treatment are tested randomly throughout the day,
the effect of the lurking variable is averaged over both the treatment and the control
• Randomization provides protection against lurking variables and is
known as the “experimenter’s insurance”
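The time-of-day example can be illustrated with a small numeric sketch. Everything in it is hypothetical: the baseline conversion rate is assumed to drift down linearly over 24 hours, the treatment is given no real effect, and hours (rather than users) are the units being assigned.

```python
import random

hours = list(range(24))
# Hypothetical lurking variable: the baseline conversion rate drifts down
# during the day. The treatment has NO real effect in this sketch.
baseline = [0.20 - 0.005 * h for h in hours]

def mean(values):
    return sum(values) / len(values)

def apparent_effect(treatment_hours):
    """Treatment mean minus control mean for a given assignment of hours."""
    chosen = set(treatment_hours)
    treated = [baseline[h] for h in hours if h in chosen]
    control = [baseline[h] for h in hours if h not in chosen]
    return mean(treated) - mean(control)

# Not randomized: control in the first half of the day, treatment in the second.
sequential = apparent_effect(range(12, 24))

# Randomized: 12 of the 24 hours are picked at random for the treatment.
rng = random.Random(0)  # arbitrary seed for reproducibility
randomized = apparent_effect(rng.sample(hours, 12))

print(round(sequential, 3))   # -0.06
print(round(randomized, 3))
```

The sequential split attributes the whole daily drift to the treatment, while any random split spreads the drift across both groups and produces an apparent effect that is no larger in magnitude.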
42. Design of Experiments (DOE) L6σ
Steps (8 – 10 of 10)
8. Conduct the experiment and collect data
9. Analyze data
10. Draw conclusions and action plans
43. Design of Experiments (DOE) L6σ
Conducting the Experiment
• During the experiment plan to collect information about
events and outcomes that are not part of the
experimental plan
44. Analyzing a 2^k Design L6σ
Model
• The analysis of a 2^k design results in a model
Y = b0 + b1X1 + b2X2 + … + bnXn + e
• The b0 term is the constant (overall average)
• A full factorial design begins by examining all possible
terms that might be included in a model. For example, in
a 2^3 design there are three main effects (A, B, C), three
two-factor interaction effects (AB, AC, BC), and one
three-factor interaction (ABC)
• The “e” term represents the model error or residual
45. Design of Experiments (DOE) L6σ
Analyzing a 2^k Design
1. Test the model
Data errors and lurking variables
Assumptions
2. Identify significant main and interaction effects
3. Create appropriate graphical summaries
46. Analyzing a 2^k Design L6σ
Test for Data Errors & Lurking Variables
• A simple time series plot is used to look for obvious data errors
(missing values, outliers caused by data entry)
• Test for lurking variables by examining the time series plots for
trends or time related cycles or patterns
47. Analyzing a 2^k Design L6σ
Residuals
• All models contain residual, or “left over” variation that is
not explained by the terms (factors) in the model
Residual = Observed - Predicted
Button Size | Offer Graphic | Discount Field | Conversion Rate | Predicted | Residual
Large | Photo | No | 23 | 26 | -3
Small | Photo | No | 20 | 20.5 | -0.5
Large | Photo | Yes | 20 | 19 | 1
Small | Icon | No | 13 | 10.5 | 2.5
Small | Photo | Yes | 21 | 19.5 | 1.5
Large | Icon | Yes | 10 | 9 | 1
Large | Photo | No | 29 | 26 | 3
Large | Photo | Yes | 18 | 19 | -1
Small | Icon | Yes | 5 | 6.5 | -1.5
Small | Photo | Yes | 18 | 19.5 | -1.5
Small | Icon | No | 8 | 10.5 | -2.5
Small | Photo | No | 21 | 20.5 | 0.5
Large | Icon | Yes | 8 | 9 | -1
Large | Icon | No | 15 | 16 | -1
Small | Icon | Yes | 8 | 6.5 | 1.5
Large | Icon | No | 17 | 16 | 1
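The Predicted column equals the average of the two replicates of each treatment, which is what a model containing every main effect and interaction predicts. A short sketch that reproduces the residuals from the 16 observations (variable names are illustrative):

```python
from collections import defaultdict

# (button size, offer graphic, discount field, conversion rate) from the table.
observations = [
    ("Large", "Photo", "No", 23), ("Small", "Photo", "No", 20),
    ("Large", "Photo", "Yes", 20), ("Small", "Icon", "No", 13),
    ("Small", "Photo", "Yes", 21), ("Large", "Icon", "Yes", 10),
    ("Large", "Photo", "No", 29), ("Large", "Photo", "Yes", 18),
    ("Small", "Icon", "Yes", 5), ("Small", "Photo", "Yes", 18),
    ("Small", "Icon", "No", 8), ("Small", "Photo", "No", 21),
    ("Large", "Icon", "Yes", 8), ("Large", "Icon", "No", 15),
    ("Small", "Icon", "Yes", 8), ("Large", "Icon", "No", 17),
]

# Group the replicates of each treatment and average them.
by_treatment = defaultdict(list)
for *treatment, rate in observations:
    by_treatment[tuple(treatment)].append(rate)
predicted = {t: sum(r) / len(r) for t, r in by_treatment.items()}

# Residual = Observed - Predicted
residuals = [rate - predicted[tuple(treatment)] for *treatment, rate in observations]
sse = sum(e ** 2 for e in residuals)   # sum of squared residuals

print(residuals[:4])   # [-3.0, -0.5, 1.0, 2.5]
print(sse)             # 46.0
```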
48. Analyzing a 2^k Design L6σ
Residual Assumptions
• An independent random variable that is normally
distributed with a mean of 0
• Constant variance over the range of experimental
conditions
• Stable over time
• Not correlated to the factors
49. Analyzing a 2^k Design L6σ
Testing Assumptions
50. Design of Experiments (DOE) L6σ
Analyzing a 2^k Design
✔1. Test the model
Data errors and lurking variables
Assumptions
2. Identify significant main and interaction effects and
assess the quality of the model
3. Create appropriate graphical summaries
51. Analyzing a 2^k Design L6σ
Significant Effects
• Main Effect – the change in the response variable that
results when a factor level is changed.
• Interaction Effect – the change in the response
variable that results when a factor level is changed and
the effect is a function of the level of a second factor.
52. Analyzing a 2^k Design L6σ
Main Effect
18.25
Discount Field Effect
13.5 - 18.25 = -4.75
13.5
53. Analyzing a 2^k Design L6σ
Main Effect
The average change (increase or decrease) in the response variable
when changing a factor level from low (high) to high (low).
Main Effect = (Average High (+) Level) – (Average Low (-) Level)
Discount Field Main Effect = (19.5 + 19.0 + 9.0 + 6.5)/4 − (20.5 + 26.0 + 16.0 + 10.5)/4 = 54/4 − 73/4 = 13.5 − 18.25 = −4.75
Discount Field Yes (+)
Discount Field No (-)
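The same calculation can be applied to every factor. The sketch below works from the eight treatment averages (the Predicted column of the residual table) in coded units, assuming -1 = Small, Icon, No and +1 = Large, Photo, Yes; the function name `main_effect` is illustrative.

```python
# Treatment averages keyed by coded levels:
# (button size, offer graphic, discount field), -1 = low, +1 = high.
cell_means = {
    (-1, -1, -1): 10.5, (+1, -1, -1): 16.0,
    (-1, +1, -1): 20.5, (+1, +1, -1): 26.0,
    (-1, -1, +1): 6.5,  (+1, -1, +1): 9.0,
    (-1, +1, +1): 19.5, (+1, +1, +1): 19.0,
}

def main_effect(factor_index):
    """Average response at the high level minus average at the low level."""
    high = [y for x, y in cell_means.items() if x[factor_index] == +1]
    low = [y for x, y in cell_means.items() if x[factor_index] == -1]
    return sum(high) / len(high) - sum(low) / len(low)

for name, index in [("Button Size", 0), ("Offer Graphic", 1), ("Discount Field", 2)]:
    print(name, main_effect(index))
# Button Size 3.25
# Offer Graphic 10.75
# Discount Field -4.75
```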
54. Analyzing a 2^k Design L6σ
Interaction Effect
Conversion declines when a discount
field is added; the amount of the decline
depends on the button size.
55. Analyzing a 2^k Design L6σ
Interaction Effect
The average change in the response variable when a factor level is
changed from a low to a high level, and the effect depends on the level
of another factor.
Discount Field × Button Size Interaction Effect = {[(19 + 9)/2 − (26 + 16)/2] − [(19.5 + 6.5)/2 − (20.5 + 10.5)/2]} / 2 = ([14 − 21] − [13 − 15.5]) / 2 = −4.5 / 2 = −2.25
Discount Field Yes (+), Large Button (+)
Discount Field No (-), Large Button (+)
Discount Field Yes (+), Small Button (-)
Discount Field No (-), Small Button (-)
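The interaction calculation can be written as "half the difference between the effect of one factor at the high and low levels of the other". A minimal sketch using the treatment averages in coded units (assuming -1 = Small, Icon, No and +1 = Large, Photo, Yes; function names are illustrative):

```python
# Treatment averages keyed by coded (button size, offer graphic, discount field).
cell_means = {
    (-1, -1, -1): 10.5, (+1, -1, -1): 16.0,
    (-1, +1, -1): 20.5, (+1, +1, -1): 26.0,
    (-1, -1, +1): 6.5,  (+1, -1, +1): 9.0,
    (-1, +1, +1): 19.5, (+1, +1, +1): 19.0,
}

def effect_at(i, j, level_j):
    """Effect of factor i (high minus low) with factor j held at level_j."""
    high = [y for x, y in cell_means.items() if x[i] == +1 and x[j] == level_j]
    low = [y for x, y in cell_means.items() if x[i] == -1 and x[j] == level_j]
    return sum(high) / len(high) - sum(low) / len(low)

def interaction_effect(i, j):
    """Half the change in the effect of factor i between the levels of factor j."""
    return (effect_at(i, j, +1) - effect_at(i, j, -1)) / 2

BUTTON, DISCOUNT = 0, 2
print(effect_at(DISCOUNT, BUTTON, +1))        # -7.0  (large button)
print(effect_at(DISCOUNT, BUTTON, -1))        # -2.5  (small button)
print(interaction_effect(DISCOUNT, BUTTON))   # -2.25
```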
56. Analyzing a 2^k Design L6σ
Significant Effects
• Effects (main or interaction) are deemed significant based upon a
statistical hypothesis test (e.g., t-test) that results in a p-value
• The p-value is the probability of observing an effect at least as large
as the one measured if the Null Hypothesis were true. It is compared with
alpha, the accepted probability of a Type I error (alpha = 1 − confidence
level); commonly, if p < 0.05 the Null Hypothesis is rejected in favor of
the Alternative Hypothesis:
Null Hypothesis (H0): AverageControl – AverageTreatment = 0
Alternative Hypothesis (H1): AverageControl – AverageTreatment ≠ 0
• Most software creates a table with a variety of statistics (effect,
coefficient, t-statistic, p-value, et cetera) related to each effect;
some software provides charts that graphically identify significant
effects
57. Analyzing a 2^k Design L6σ
Significant Effects
• Three factors are statistically significant: Button Size, Offer Graphic,
and Discount Field.
• None of the interactions are significant.
[Figure: chart of the t-statistic for each effect, with a reference line at p-value = 0.05]
58. Analyzing a 2^k Design L6σ
Significant Effects
Factorial Fit: Conversion Rate versus Button Size, Offer Graphic, ...
Estimated Effects and Coefficients for Conversion Rate (coded units)
Term | Effect | Coef | SE Coef | T | P
Constant | | 15.875 | 0.5995 | 26.48 | 0.000
Button Size | 3.250 | 1.625 | 0.5995 | 2.71 | 0.027
Offer Graphic | 10.750 | 5.375 | 0.5995 | 8.97 | 0.000
Discount Field | -4.750 | -2.375 | 0.5995 | -3.96 | 0.004
Button Size*Offer Graphic | -0.750 | -0.375 | 0.5995 | -0.63 | 0.549
Button Size*Discount Field | -2.250 | -1.125 | 0.5995 | -1.88 | 0.097
Offer Graphic*Discount Field | 0.750 | 0.375 | 0.5995 | 0.63 | 0.549
Button Size*Offer Graphic*Discount Field | -0.750 | -0.375 | 0.5995 | -0.63 | 0.549
S = 2.39792 PRESS = 184 R-Sq = 93.11% R-Sq(pred) = 72.44% R-Sq(adj) = 87.08%
• Effect = change in the response variable when a factor is changed from a low level to a high level.
• Coefficient = if factors are coded, the coefficient is half the value of the effect.
• t-Statistic = the coefficient divided by its standard error; it is the test statistic used to determine the p-value.
• P-value: if < 0.05 the term is statistically significant (the p-value is the probability of seeing an effect this large when the true effect is zero; 0.05 is the accepted probability of a Type I error).
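The T column can be reproduced from the coefficients and S. This sketch assumes a balanced two-level design in coded units, where every coefficient has standard error S / sqrt(number of runs), and it uses the tabulated two-sided 5% critical value of t with 8 error degrees of freedom (2.306) instead of computing p-values.

```python
from math import sqrt

s = 2.39792          # S: standard deviation of the residuals, from the output
n_runs = 16          # 8 treatments x 2 replicates
t_critical = 2.306   # two-sided 5% point of t with 16 - 8 = 8 error degrees of freedom

coefficients = {
    "Button Size": 1.625,
    "Offer Graphic": 5.375,
    "Discount Field": -2.375,
    "Button Size*Offer Graphic": -0.375,
    "Button Size*Discount Field": -1.125,
    "Offer Graphic*Discount Field": 0.375,
    "Button Size*Offer Graphic*Discount Field": -0.375,
}

# In a balanced two-level design every coefficient shares one standard error.
se_coef = s / sqrt(n_runs)

t_stats = {term: coef / se_coef for term, coef in coefficients.items()}
significant = [term for term, t in t_stats.items() if abs(t) > t_critical]

for term, t in t_stats.items():
    print(f"{term:42s} T = {t:6.2f}")
print("Significant at the 5% level:", significant)
```

Only the three main effects exceed the critical value, which agrees with the P column.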
59. Analyzing a 2^k Design L6σ
Assessing Model Quality
Factorial Fit: Conversion Rate versus Button Size, Offer Graphic, ...
Estimated Effects and Coefficients for Conversion Rate (coded units)
Term | Effect | Coef | SE Coef | T | P
Constant | | 15.875 | 0.5995 | 26.48 | 0.000
Button Size | 3.250 | 1.625 | 0.5995 | 2.71 | 0.027
Offer Graphic | 10.750 | 5.375 | 0.5995 | 8.97 | 0.000
Discount Field | -4.750 | -2.375 | 0.5995 | -3.96 | 0.004
Button Size*Offer Graphic | -0.750 | -0.375 | 0.5995 | -0.63 | 0.549
Button Size*Discount Field | -2.250 | -1.125 | 0.5995 | -1.88 | 0.097
Offer Graphic*Discount Field | 0.750 | 0.375 | 0.5995 | 0.63 | 0.549
Button Size*Offer Graphic*Discount Field | -0.750 | -0.375 | 0.5995 | -0.63 | 0.549
S = 2.39792 PRESS = 184 R-Sq = 93.11% R-Sq(pred) = 72.44% R-Sq(adj) = 87.08%
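• S = standard deviation of the residuals.
• PRESS = prediction sum of squares (the sum of squared prediction errors, each computed with that observation left out of the fit).
• R-Sq = simple R^2.
• R-Sq(pred) = R^2 for model predictions, calculated from PRESS.
• R-Sq(adj) = R^2 adjusted for the number of terms in the model, used to compare models with different numbers of terms.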
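These summary statistics can be recomputed from the data. The sketch below takes the 16 conversion rates and the sum of squared residuals (46) from the residual table. It assumes the standard definitions of adjusted and predicted R^2, and it uses the fact that in this balanced design with two replicates every run has the same leverage (terms / observations), so each prediction error is the residual divided by (1 - leverage).

```python
rates = [23, 20, 20, 13, 21, 10, 29, 18, 5, 18, 8, 21, 8, 15, 8, 17]
sse = 46.0      # sum of squared residuals from the residual table
terms = 8       # constant + 3 main effects + 4 interactions

n = len(rates)
mean = sum(rates) / n
sst = sum((y - mean) ** 2 for y in rates)   # total variation about the mean

leverage = terms / n                        # 0.5 for every run in this design
s = (sse / (n - terms)) ** 0.5
press = sse / (1 - leverage) ** 2
r_sq = 1 - sse / sst
r_sq_adj = 1 - (sse / (n - terms)) / (sst / (n - 1))
r_sq_pred = 1 - press / sst

print(round(s, 5), press)            # 2.39792 184.0
print(round(100 * r_sq, 2))          # 93.11
print(round(100 * r_sq_pred, 2))     # 72.44
print(round(100 * r_sq_adj, 2))      # 87.08
```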
60. Assessing Model Quality L6σ
The R^2 Statistic
• R^2 is the percent of variation in the response explained by the
factor(s)
R^2 = (Explained Variation / Total Variation) × 100
[Figure: total variation (100%) and the percentage explained]
61. Analyzing a 2^k Design L6σ
Significant Effects & Assessing Model Quality
• After assessing the initial model, remove non-significant terms and
rerun the model
62. Design of Experiments (DOE) L6σ
Analyzing a 2^k Design
✔1. Test the model
Data errors and lurking variables
Assumptions
✔2. Identify significant main and interaction effects and
assess the quality of the model
3. Create appropriate graphical summaries
66. Design of Experiments (DOE) L6σ
Prediction Equation
• The prediction equation includes a constant (overall average) in the
equation.
• The coefficients for discrete factors are the amount added or subtracted
from the overall average.
• The coefficients for continuous factors are slopes if they are not coded.
• Whether an effect is added or subtracted depends on whether the effect is
negative or positive, and how the factor was coded (e.g., no= –1, yes= +1).
Conversion = 15.875 + (Button Size * 1.625) + (Offer Graphic * 5.375) + (Discount Field * -2.375)
Large Button, Photo, No Discount
Conversion = 15.875 + (1 * 1.625) + (1 * 5.375) + (-1 * -2.375) = 25.25
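The prediction equation translates directly into a small function. This sketch assumes the coding used on the earlier slides (-1 = Small, Icon, No; +1 = Large, Photo, Yes); the function name `predict_conversion` is illustrative.

```python
def predict_conversion(button_size, offer_graphic, discount_field):
    """Predicted conversion rate from the reduced (main effects only) model.

    Each argument is a coded level: -1 (Small, Icon, No) or +1 (Large, Photo, Yes).
    """
    return (15.875
            + 1.625 * button_size
            + 5.375 * offer_graphic
            - 2.375 * discount_field)

print(predict_conversion(+1, +1, -1))   # 25.25  (large button, photo, no discount)
print(predict_conversion(-1, -1, +1))   # 6.5    (small button, icon, discount)
```

The first call reproduces the worked example above, which is also the best of the eight combinations under this model.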
67. Design of Experiments (DOE) L6σ
Conclusions & Action Plans
• Summarize findings in simple language
• Present how conclusions have been (or will be)
validated
• Use simple graphical displays to communicate important
concepts
• Make recommendations concrete and actionable
• The appropriate action may include conducting another
experiment
68. Design of Experiments (DOE) L6σ
Reducing Experimental Trials
Fractional Factorial Designs
69. Design of Experiments (DOE) L6σ
Reducing the Size of a Factorial Design
Standard Order | Button Size | Offering Graphic | Discount Field
1 | - | - | -
2 | + | - | -
3 | - | + | -
4 | + | + | -
5 | - | - | +
6 | + | - | +
7 | - | + | +
8 | + | + | +
[Figure: cube with axes Button Size (Small, Large), Offering Graphic (Icon, Photo), and Discount Field (No, Yes)]
If only 4 trials can be run (half of the full factorial) which 4 trials
should be chosen?
70. Fractional Factorial Designs L6σ
Selecting the Half Fraction
[Table and cube figure: standard-order runs 1 to 4 selected, all with Discount Field = No (-)]
The Discount Field is only tested at the “no”(-) level resulting in no
measure of the effect of the Discount Field.
71. Fractional Factorial Designs L6σ
Selecting the Half Fraction
[Table and cube figure: standard-order runs 3, 4, 5, and 6 selected]
The effects of Discount Field (yes, +) and Offering Graphic (icon, -) are
confounded (Discount Field (yes) and Offering Graphic (icon) are always
tested together, as are Discount Field (no) and Offering Graphic (photo))
72. Fractional Factorial Designs L6σ
Selecting the Half Fraction
[Table and cube figure: four of the eight standard-order runs selected, forming a balanced half fraction]
• Two comparisons are made for each of the 3 factors (balanced)
• Covers the most experimental space using four trials
• Collapses into a full factorial if one of the factors is found not significant
73. Fractional Factorial Designs L6σ
Selecting the Half Fraction
[Table and cube figure: an alternative selection of four of the eight standard-order runs]
• This will also work.
74. Fractional Factorial Designs L6σ
Notation
Fractional 2^k factorial designs use the following notation:
2^(k-p)_R (R is written as a subscript Roman numeral, for example 2^(3-1)_III)
Where
k = number of factors
p = degree of fractionation (p = 1 is a ½ fraction, p = 2 is a ¼ fraction)
R = resolution
75. Fractional Factorial Designs L6σ
Confounding
• Reducing the number of runs improves efficiency. The cost is a reduction
in the quantity of information provided, which is due to confounding.
• Confounding means that effects are mixed up and cannot be estimated
separately. How the effects are confounded depends on the resolution of the
Fractional Factorial design.
• Fractional Factorial designs are structured so that confounding involves
higher-order interactions, which are typically uncommon.
• Using the Conversion Rate example, the 2^(3-1)_III design results in the
following confounding:
• Button Size + (Offering Graphic * Discount Field)
• Offering Graphic + (Button Size * Discount Field)
• Discount Field + (Button Size * Offering Graphic)
• The 2^(3-1)_III is not a very useful design because, at resolution III,
every main effect is confounded with a two-factor interaction.
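The confounding pattern can be seen by constructing a half fraction and comparing columns. The slides do not state which generator was used, so this sketch assumes the common choice C = A*B (defining relation I = ABC), with A = Button Size, B = Offering Graphic, and C = Discount Field in coded units.

```python
from itertools import product

# Full 2^3 design in standard order (first factor changes fastest).
full = [combo[::-1] for combo in product((-1, 1), repeat=3)]

# Half fraction: keep the runs where C equals the product A*B.
half = [(number, run) for number, run in enumerate(full, start=1)
        if run[2] == run[0] * run[1]]

for number, (a, b, c) in half:
    print(number, (a, b, c), "AB =", a * b, "AC =", a * c, "BC =", b * c)

# In every selected run each main-effect column is identical to a
# two-factor interaction column, so the two cannot be separated.
aliased = all(c == a * b and b == a * c and a == b * c for _, (a, b, c) in half)
print("Main effects aliased with two-factor interactions:", aliased)
```

Under this assumption the selected runs are standard-order runs 2, 3, 5, and 8, and each factor is tested twice at each level.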
76. Fractional Factorial Designs L6σ
Resolution
• Resolution is a measure of the degree of confounding.
• The higher the resolution, the more likely it is that important main effects
and two-factor interactions are confounded only with higher-order interactions.
• A full factorial design is full resolution.
Resolution | Confounding
III | Main effects + 2-factor (and higher) interactions (1+2)
IV | Main effects + 3-factor (and higher) interactions (1+3); 2-factor interactions + 2-factor (and higher) interactions (2+2)
V | Main effects + 4-factor (and higher) interactions (1+4); 2-factor interactions + 3-factor (and higher) interactions (2+3)
78. Fractional Factorial Designs L6σ
Design Types & Resolution
Knowledge: Low → High (left to right)
Design Type | Screening | Fractional Factorials | Full Factorials | Response Surface
# of Factors | >5 | 5–10 | 2–8 | 2–8
Purpose | Identify important factors | Identify main effects + some interactions | Identify main effects + interactions | Optimize factor settings
Resolution | III | IV+ | Full | Full
79. 2^(5-1) Fractional Factorial Design L6σ
Factors & Levels
Factor | Levels
Button Size | Small, Large
Offering Graphic | Icon, Photo
Discount Field | Yes, No
Background | Blue, Gray
Heading | G Format, H Format
80. 2^(5-1) Fractional Factorial Design L6σ
Factors & Levels
Five factors: the full factorial requires 2^5 = 32 runs and the half fraction requires 2^(5-1) = 16 runs
81. 2^(5-1) Fractional Factorial Design L6σ
Analysis Confounding Structure
These Effects | Are Confounded With These Effects
Overall Average | Button Size * Offering Graphic * Discount Field * Background * Heading
Button Size | Offering Graphic * Discount Field * Background * Heading
Offering Graphic | Button Size * Discount Field * Background * Heading
Discount Field | Button Size * Offering Graphic * Background * Heading
Background | Button Size * Offering Graphic * Discount Field * Heading
Heading | Button Size * Offering Graphic * Discount Field * Background
Button Size * Offering Graphic | Discount Field * Background * Heading
Button Size * Discount Field | Offering Graphic * Background * Heading
Button Size * Background | Offering Graphic * Discount Field * Heading
Button Size * Heading | Offering Graphic * Discount Field * Background
Offering Graphic * Discount Field | Button Size * Background * Heading
Offering Graphic * Background | Button Size * Discount Field * Heading
Offering Graphic * Heading | Button Size * Discount Field * Background
Discount Field * Background | Button Size * Offering Graphic * Heading
Discount Field * Heading | Button Size * Offering Graphic * Background
Background * Heading | Button Size * Offering Graphic * Discount Field
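The table follows one rule: with the overall average confounded with the five-factor interaction, the alias of any effect is the product of the factors that do not appear in it. A short sketch that regenerates the 16 rows (the function name `alias` is illustrative):

```python
from itertools import combinations

FACTORS = ["Button Size", "Offering Graphic", "Discount Field", "Background", "Heading"]

def alias(effect):
    """Alias of an effect when the defining word is the product of all five factors.

    Multiplying an effect by the defining word cancels every factor that
    appears twice, which leaves the factors that are not in the effect.
    """
    return tuple(f for f in FACTORS if f not in effect)

for size in (0, 1, 2):
    for effect in combinations(FACTORS, size):
        left = " * ".join(effect) or "Overall Average"
        print(left, "|", " * ".join(alias(effect)))
```

Main effects are paired with four-factor interactions and two-factor interactions with three-factor interactions, which is the resolution V pattern.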
82. Design of Experiments (DOE) L6σ
Other Issues
• Statistical control and process predictability
• Sample representativeness (bias)
• Power (ability to detect a difference) and sample size
• “Exercise the experimentation system” (A/A) testing
• Significant differences in browser redirects
83. Design of Experiments (DOE) L6σ
Summary
• DOE is a planned approach to testing; designs have a known number of
trials that can be budgeted
• Important main/interaction effects identified
• Multiple factors evaluated simultaneously
• Background variables managed by controlling, measuring, or blocking
• Lurking variables mitigated by randomization
• Replication enables estimation of experimental error
• Prediction equations
• The number of trials in full factorial designs can be reduced with fractional
factorials
84. Design of Experiments (DOE) L6σ
References
Box, George E. P., Hunter, William G., Hunter, J. Stuart, (1978): Statistics for
Experimenters, John Wiley & Sons, New York.
Crook, Thomas, Frasca, Brian, Kohavi, Ron, Longbotham, Roger, “Seven Pitfalls to Avoid
when Running Controlled Experiments on the Web,”
http://www.exp-platform.com/Pages/ExPpitfalls.aspx.
Kohavi, Ron, Longbotham, Roger, “Unexpected Results in Online Controlled Experiments,”
http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/k/Kohavi:Ron.html.
Kohavi, Ron, Henne, Randal M., Sommerfield, Dan, “Practical Guide to Controlled
Experiments on the Web: Listen to Your Customers not the HiPPO,”
http://exp-platform.com/hippo.aspx.
Moen, Ronald D., Nolan, Thomas W., Provost, Lloyd P., (1991): Improving Quality Through
Planned Experimentation, McGraw-Hill, New York.