The math behind A/B testing
How to perform a non-biased test
A/B testing
Not a replacement for common sense

It only gives you a level of confidence

Helps you achieve only local maxima
A/B experiment: Toss a coin
    Heads = successful conversion. Tails = no conversion.

 Hypothesis: Wearing a red t-shirt will increase conversion
46 heads out of 100

54 heads out of 100
Conversion increased by 17%


 Changing your t-shirt to red increases conversion
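
The 17% on this slide is a relative lift, not a difference in percentage points. A minimal sketch of the arithmetic, assuming the 46/100 run is the baseline and the 54/100 run is the red t-shirt (the slides do not label the two runs, and the variable names are illustrative):

```python
# Assumption: 46/100 is the baseline run and 54/100 is the red t-shirt run.
baseline_heads, red_heads, tosses = 46, 54, 100

baseline_rate = baseline_heads / tosses   # observed conversion, baseline
red_rate = red_heads / tosses             # observed conversion, red t-shirt

absolute_lift = red_rate - baseline_rate        # difference in conversion rate
relative_lift = absolute_lift / baseline_rate   # difference relative to the baseline

print(f"absolute lift: {absolute_lift:.2f}")
print(f"relative lift: {relative_lift:.1%}")
```

An 8-point absolute difference on a 46% baseline is what produces the "17%" headline.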
What's wrong
    Conversion is never a single number. It's a range.

    [Figure: probability curve centered on µ; its spread is labeled "Variance/Noise"]
What's wrong
 Sample = 100 tosses.

 [Figure labels: µ = 0.5 (∞ coin tosses); µ = 0.46]
What's wrong
 Sample = 100 tosses.

 [Figure labels: µ = 0.5 (∞ coin tosses); µ = 0.46 (100 coin tosses)]

 Sample mean ≠ population mean
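
A small simulation can make this concrete. The sketch below tosses a fair coin (true rate 0.5) in batches of 100 and once in a much larger batch. The function name `sample_mean`, the seed, and the batch sizes are illustrative choices, not from the slides.

```python
import random

def sample_mean(n_tosses, p_heads=0.5, rng=random):
    """Fraction of heads in n_tosses tosses of a coin with P(heads) = p_heads."""
    heads = sum(1 for _ in range(n_tosses) if rng.random() < p_heads)
    return heads / n_tosses

rng = random.Random(42)  # fixed seed so reruns give the same numbers

# Ten independent experiments of 100 tosses each: the sample means scatter around 0.5.
small_runs = [sample_mean(100, rng=rng) for _ in range(10)]

# One much larger experiment: the sample mean settles close to the true 0.5.
large_run = sample_mean(200_000, rng=rng)

print("ten runs of 100 tosses:", small_runs)
print("one run of 200,000 tosses:", round(large_run, 4))
```

Each 100-toss run is one possible "46 out of 100". Only the large run approaches the population mean.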
The role of chance

  [Figure: Red and Blue samples]

  Comparison between two noisy samples
Statistical significance

  [Figure: Red and Blue samples]

  Standard Error (SE) = Square root of (p * (1-p) / n)
  p = conversion rate, n = sample size

  How much deviation from average conversion rate (p)
  can be expected if this experiment is repeated multiple
  times.
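
As a rough illustration, the formula can be applied to the coin-toss numbers from the earlier slides. The helper name `standard_error` is an assumption for this sketch.

```python
from math import sqrt

def standard_error(p, n):
    """Standard error of a conversion rate p measured on n trials: sqrt(p * (1 - p) / n)."""
    return sqrt(p * (1 - p) / n)

se_46 = standard_error(0.46, 100)
se_54 = standard_error(0.54, 100)
se_46_large = standard_error(0.46, 400)  # same rate, four times the sample

print(f"SE at p = 0.46, n = 100: {se_46:.4f}")
print(f"SE at p = 0.54, n = 100: {se_54:.4f}")
print(f"SE at p = 0.46, n = 400: {se_46_large:.4f}")
```

With 100 tosses the standard error is about 0.05, so an observed rate can easily be several points away from the true rate. Quadrupling the sample size halves it.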
Statistical significance
 95% confidence:
 True conversion rate lies within this range: p ± 2 * SE
  http://visualwebsiteoptimizer.com/ab-split-significance-calculator/
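
The range p ± 2 * SE can be computed for both coin-toss runs to see whether they are distinguishable. The sketch below also adds a pooled two-proportion z-test, which is a standard technique but not one named on the slides. All function names are illustrative, and the linked calculator may use a different method.

```python
from math import sqrt
from statistics import NormalDist

def standard_error(p, n):
    return sqrt(p * (1 - p) / n)

def confidence_interval(p, n, z=2.0):
    """Range p ± z * SE; z = 2 is the slides' rule of thumb for 95% confidence."""
    se = standard_error(p, n)
    return p - z * se, p + z * se

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test (added here, not from the slides)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_diff = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se_diff
    return 2 * (1 - NormalDist().cdf(abs(z)))

low_a, high_a = confidence_interval(0.46, 100)
low_b, high_b = confidence_interval(0.54, 100)
overlap = low_b <= high_a and low_a <= high_b
p_value = two_proportion_p_value(46, 100, 54, 100)

print(f"46/100: {low_a:.3f} to {high_a:.3f}")
print(f"54/100: {low_b:.3f} to {high_b:.3f}")
print("ranges overlap:", overlap)
print(f"two-sided p-value: {p_value:.3f}")
```

The two ranges overlap heavily and the p-value is far above 0.05, so 46 versus 54 heads out of 100 is consistent with chance alone.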
Sample size
  Standard Error (SE) = Square root of (p * (1-p) / n)

 Min. sample size to calculate the statistical significance

   Statistical confidence
   Existing conversion rate of website
   Difference in conversion rate you want to detect
   Number of variations you want to test

   http://www.testsignificance.com/
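
One common way to turn the four inputs above into a number is the normal-approximation formula for comparing two proportions. This is a sketch under stated assumptions, not necessarily what the linked calculator does. It needs a statistical power, which the slides do not mention (0.80 is assumed here), and it handles several variations with a Bonferroni split of the error rate, which is also an assumption. The function and parameter names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variation(baseline, min_detectable_diff,
                              confidence=0.95, power=0.80, n_variations=1):
    """Approximate visitors needed per variation (two-sided two-proportion z-test).

    power and the Bonferroni split across variations are assumptions,
    not taken from the slides.
    """
    alpha = (1 - confidence) / n_variations      # Bonferroni correction for several variations
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p1 = baseline
    p2 = baseline + min_detectable_diff
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / min_detectable_diff ** 2)

# Coin-toss example: baseline 0.46, hoping to detect a rise to 0.54.
n_coin = sample_size_per_variation(0.46, 0.08)
print("tosses needed per t-shirt:", n_coin)
```

Under these assumptions the coin-toss experiment would need on the order of 600 tosses per t-shirt, several times the 100 that were used. Smaller differences or more variations push the number up further.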
Ideal test
Determine the sample size

Check the results only once you have reached the sample size

Determine the statistical significance

Pick based on long term plan if no clear winner
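
The second step matters because checking early and stopping at the first "significant" result inflates false positives. The sketch below simulates A/A tests, where both arms share the same true rate so any declared winner is a false alarm. It compares testing at every checkpoint with testing once at the planned sample size. The pooled z-test, the function names, and all parameter values are assumptions chosen for illustration.

```python
import random
from math import sqrt

def is_significant(conv_a, conv_b, n, z_crit=1.96):
    """Pooled two-proportion z-test with n visitors in each arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return False  # no variation at all, nothing to test
    se_diff = sqrt(pooled * (1 - pooled) * 2 / n)
    return abs(conv_a - conv_b) / n / se_diff > z_crit

def false_positive_rates(n_experiments=1000, planned_n=500, peek_every=50,
                         rate=0.5, seed=0):
    """Share of A/A tests that declare a winner: (with peeking, final check only)."""
    rng = random.Random(seed)
    peeking_hits = final_hits = 0
    for _ in range(n_experiments):
        conv_a = conv_b = 0
        flagged_at_peek = False
        for visitor in range(1, planned_n + 1):
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            # Peeking policy: test at every checkpoint, stop at the first "win".
            if visitor % peek_every == 0 and is_significant(conv_a, conv_b, visitor):
                flagged_at_peek = True
        peeking_hits += flagged_at_peek
        final_hits += is_significant(conv_a, conv_b, planned_n)
    return peeking_hits / n_experiments, final_hits / n_experiments

peeking_rate, final_rate = false_positive_rates()
print(f"false positives with peeking:    {peeking_rate:.1%}")
print(f"false positives, final check only: {final_rate:.1%}")
```

Because the last checkpoint coincides with the planned sample size, the peeking rate can never be lower than the final-only rate, and in runs like this it is typically well above it.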
Thanks
