Hypothesis	Testing:		
How	to	Eliminate	Ideas	as	Soon	as	Possible

Roman	Zykov	
Retail	Rocket	
Boston,	RecSys	2016
Contents
• Intro	
• Offline	vs	Online	testing	
• Make	offline	testing	shorter	
• Artificial	diversity	metric	
• Online	tests
Retail	Rocket
• Personalised	real-time	recommendations	
• E-commerce	only	
• Multiple	channels	(site,	email,	…)	
• Founded	in	2012	
• Offices:	Amsterdam,	Barcelona,	Milan,	Moscow	
• 1000+	retail	partners	
• 100+	million	daily	events
Why is testing important?
• Highly	competitive	market	
• It's not hard to create your own recommendation system
• Constant	changes	in	the	product	and	algorithms	
• Fast	and	reliable	decisions
Offline	vs	Online	testing
Offline	testing		forecasts	online	testing	results	
• Relatively	fast,	testing	of	minor	changes	requires	hours	
• Few	resources:	data,	computational	resources,	code,	1	dev	
• Hard	to	forecast	online	metrics	in	some	cases	
• Influence	of	an	algorithm	on	users'	behaviour	is	ignored	
• Bad	values	of	offline	metrics	prevent	online	implementation	
Online	test	-	final	decision	point	
• Requires	much	time.	At	least	two	cycles	of	decision	making	
• Requires	many	resources:	design,	onsite	production,	etc
Testing	facts
• Nine	out	of	ten	ideas	do	not	improve	anything	
• Most	ideas	have	minor	impact:	
o add	new	data:	extracted	from	text,	images,	etc	
o adjust	parameters	of	algorithm
Offline	testing
Offline	predicts	Online
Major	changes	or	new	algorithm	
• Always	check	by	online	experiment	
• Find an appropriate offline metric afterwards
• Try	different	definitions	of	users’	sessions	
• Try	different	events	sequences	
Minor	changes	
• Use offline tests if you have a proven offline metric
Make offline testing shorter
What	we	did	
• Functional programming in Scala on Spark. Four languages (Python, Java, Pig, Hive) had been used previously.
• Research in Scala/Spark notebooks with added R integration for graphics
• Offline evaluation framework with metric calculations for all of our tasks. The most complicated project at Retail Rocket
What	we	got	
• It takes hours to prove or disprove any simple idea, whereas previously it could take days
• Research is limited by the power of our cluster and the number of data scientists
Scala/Spark	notebook	with	R
Offline	framework
• Scala	on	Spark	
• Deals	with	existing	web	logs	
• Implicit	feedback	
• Major	metrics:	
o Recall,	Diversity,	Recall	with	NN,	Empty	Recs	
• Minor	metrics:	
o Serendipity,	Novelty,	Coverage	
• Different	types	of	events	sequences	
• Different	definitions	of	users’	sessions	
• Personalised	/	Non-personalised	recommendations	
• Adjustable	TOP	of	viewable	recommendations		
• Test	panel	of	sites	from	different	domains
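The slides name the metrics but do not define them, so the following is only a minimal sketch under assumed definitions: Recall as the share of logged (source item, next item) pairs whose next item appears in the top K recommendations for the source item, and Empty Recs as the share of pairs for which no recommendations exist at all. The function names, the `recs_by_item` structure and the sample data are hypothetical.

```python
def recall_at_k(pairs, recs_by_item, k=4):
    """Share of (source, target) pairs whose target is in the top-k recs for source."""
    if not pairs:
        return 0.0
    hits = sum(
        1 for source, target in pairs
        if target in recs_by_item.get(source, [])[:k]
    )
    return hits / len(pairs)


def empty_recs_rate(pairs, recs_by_item):
    """Share of pairs whose source item has no recommendations (assumed definition)."""
    if not pairs:
        return 0.0
    empty = sum(1 for source, _ in pairs if not recs_by_item.get(source))
    return empty / len(pairs)


# Hypothetical test pairs taken from logs and a hypothetical recommendation table.
pairs = [("view1", "cart1"), ("view2", "cart1"), ("view3", "cart2"), ("view4", "cart2")]
recs_by_item = {
    "view1": ["cart1", "x"],
    "view2": ["x", "y", "cart1"],
    "view3": [],
}

recall_top2 = recall_at_k(pairs, recs_by_item, k=2)
recall_top3 = recall_at_k(pairs, recs_by_item, k=3)
empty_rate = empty_recs_rate(pairs, recs_by_item)
```

Changing `k` corresponds to the adjustable TOP of viewable recommendations: a larger visible block can only keep or raise this recall value.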
Offline	events	sequences
[Figure: example session with the events view1, view2, view3, cart1, cart2, view4, view5, view6, purchase1]
View2View: view1 -> view2, view2 -> view3, view3 -> view4, view4 -> view5, view5 -> view6
View2Cart: view1 -> cart1, view2 -> cart1, view3 -> cart1, view4 -> cart1, view5 -> cart2, view6 -> cart2
View2Purchase: view1 -> purchase1, view2 -> purchase1, view3 -> purchase1, view4 -> purchase1, view5 -> purchase1, view6 -> purchase1
Cart2Purchase: cart1 -> purchase1, cart2 -> purchase1
Cart2Cart: cart1 -> cart2
*	Events:	product	view,	add	to	cart,	purchase,	main	page	view,	search,	catalog	page,	…
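One way the pair types above could be derived from a single session is sketched below. It is an illustration, not the framework's actual code: the rules (consecutive events for View2View and Cart2Cart, "first later event of the target type" for the others) are assumptions, and the event order in `session` is inferred from the listed pairs because the diagram's exact order is not recoverable from the slide text.

```python
from collections import defaultdict


def build_pairs(events):
    """Build event-sequence pairs from a chronological list of (kind, item) events."""
    pairs = defaultdict(list)

    # Same-type sequences: link each event to the next event of the same kind.
    for kind, name in (("view", "View2View"), ("cart", "Cart2Cart")):
        items = [item for k, item in events if k == kind]
        pairs[name] = list(zip(items, items[1:]))

    # Cross-type sequences: link each source event to the first later target event.
    for src, dst, name in (
        ("view", "cart", "View2Cart"),
        ("view", "purchase", "View2Purchase"),
        ("cart", "purchase", "Cart2Purchase"),
    ):
        for i, (kind, item) in enumerate(events):
            if kind != src:
                continue
            target = next((it for k, it in events[i + 1:] if k == dst), None)
            if target is not None:
                pairs[name].append((item, target))

    return dict(pairs)


# Assumed chronological order, chosen so the output matches the pairs on the slide.
session = [
    ("view", "view1"), ("view", "view2"), ("view", "view3"), ("view", "view4"),
    ("cart", "cart1"),
    ("view", "view5"), ("view", "view6"),
    ("cart", "cart2"),
    ("purchase", "purchase1"),
]

result = build_pairs(session)
```

Each pair type then serves as ground truth for a different recommendation block: the left item is the one being shown, the right item is what the recommendations are expected to contain.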
Offline	metric	examples	
What	Customers	Buy	After	Viewing	This	Item	
• View2Cart	
• View2Purchase	
• …	
Customers	Who	Bought	This	Item	Also	Bought		
• Cart2Cart	
• Cart2Purchase	
• View2Cart	
• …
Case:	Artificial	diversification
Artificial	diversification
[Figure: recommendations, Original vs. After]
Problem: It's not possible to use Recall for evaluating this
Recall with Nearest Neighbours (NN)
[Figure: top 4 recs, each with its nearest neighbours by content-based similarity (example values from 0.3 to 0.9), compared against the real item from the session]
Scores in the example: direct hit = 1.0, indirect hit = 0.5, no hit = 0.0
Metric = Average over all sessions
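A minimal sketch of how such a metric could be computed, under stated assumptions: a direct hit means the real item is among the recommendations, an indirect hit means it is among the content-based nearest neighbours of a recommended item, and an indirect hit always scores a fixed 0.5 as in the slide's example (the real metric may weight it differently, for instance by similarity). The function name and data structures are hypothetical.

```python
def recall_with_nn(sessions, neighbours, indirect_score=0.5):
    """Average session score: 1.0 direct hit, indirect_score indirect hit, 0.0 no hit.

    sessions:   list of (recommended_items, real_item) tuples
    neighbours: dict mapping an item to the set of its content-based nearest neighbours
    """
    scores = []
    for recs, real_item in sessions:
        if real_item in recs:
            scores.append(1.0)  # direct hit
        elif any(real_item in neighbours.get(rec, ()) for rec in recs):
            scores.append(indirect_score)  # indirect hit via a similar item
        else:
            scores.append(0.0)  # no hit
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical data: "c" is a content-based neighbour of "a".
neighbours = {"a": {"c"}, "b": {"d"}}
sessions = [
    (["a", "b"], "a"),  # direct hit
    (["a", "b"], "c"),  # indirect hit
    (["a", "b"], "z"),  # no hit
]

metric = recall_with_nn(sessions, neighbours)
```

Unlike plain Recall, this score still rewards a diversified list when it shows an item close to the one the user actually chose.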
Online	A/B	testing
AA/BB tests
[Figure: the control group split into two A groups, the test group split into two B groups]
[Figure: A, A, B, B results in an "Ideal" test and in a "Dirty" test]
Bayesian	approach
• Conversion	rates	
o Beta	distribution	with	normal	priors		
• Average	Order	Values	
o Normal	distribution	(after	log)	with	normal	priors	
• Priors	from	historical	data	before	experiment	
Anything	may	be	done	with	posteriors.	
E.g.: There is a 95% chance that A has a 1% lift over B
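A statement like the one above can be read off posterior samples. The sketch below is a simplified illustration for conversion rates only, and it departs from the slide in one labelled way: it uses a uniform Beta(1, 1) prior instead of priors built from historical data. The function name, the sample counts and the reading of "lift" as a relative difference are assumptions.

```python
import random


def prob_lift(conv_a, n_a, conv_b, n_b, min_lift=0.0, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_A > rate_B * (1 + min_lift)).

    Assumes a Beta(1, 1) prior for each group, so each posterior is
    Beta(1 + conversions, 1 + non-conversions).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_a > rate_b * (1 + min_lift):
            wins += 1
    return wins / draws


# Hypothetical experiment: 10,000 visitors per group.
p_a_better = prob_lift(600, 10000, 500, 10000)
p_a_lift_1pct = prob_lift(600, 10000, 500, 10000, min_lift=0.01)
p_equal_groups = prob_lift(500, 10000, 500, 10000)
```

The same sampling loop works for an A/A comparison: two groups with the same treatment should give a probability near 0.5, as `p_equal_groups` does here.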
Conclusion
• Offline	testing	can	predict	online	results	
• One	programming	language	for	R&D	reduces	the	test	time	
• The	Scala	language	is	a	good	alternative	for	ML	tasks	
• Different	event	sequences	for	offline	metrics	
• Recall	with	Nearest	Neighbours	(NN)	metric
Thank	you!
Roman	Zykov	
Retail	Rocket		
rzykov@retailrocket.net	
https://github.com/RetailRocket/SparkMultiTool
