Recommendations @ Instacart
Sharath Rao
Data Scientist
Catalog, Search and Discovery
v
The Instacart Value Proposition
Groceries from stores you love, delivered to your doorstep in as little as an hour
v
Customer Experience
Select a Store
Shop for Groceries
Checkout
Select Delivery Time
Delivered to Doorstep
v
Shopper Experience
Accept Order
Find the Groceries
Out for Delivery
Delivered to Doorstep
Scan Barcode
v
Four Sided Marketplace
Customers
Shoppers
Products (Advertisers)
Stores (Retailers)
Search, Advertising, Shopping, Delivery, Customer Service, Inventory, Picking, Loyalty
v
What this talk is about
A new collaborative filtering algorithm
• A case-study
• live end to end recommendation system
• one person month
• hundreds of millions of transactions
v
Online grocery vs Traditional e-commerce
[Figure: online grocery compared with traditional e-commerce over Week 1, Week 2, Week 3]
v
Grocery Shopping in “Low Dimensional Space”
Search
Restock
Explore
+
+
=
v
Why recommendations at Instacart
Your store
Everybody’s store
v
Repeat purchases increase LTV of recommendations
Today: $5.49
A year later: 1 + … + 100, $549
v
Different recommendation systems address different needs
v
Personalized Top N recommendations
Promote broad-based discovery in a dynamic catalog
Including from stores where customers may have never shopped
v
Replacement Product Recommendations
Mitigate adverse impact of last-minute out of stocks
v
“Frequently bought with” Recommendations
Not necessarily consumed together
Help customers shop for complementary products and try alternatives
Probably consumed together
v
Post Checkout Recommendations
Accommodate last-minute requests for that “just one more thing”
v
Personalized Top N Recommendations
v
Learning from feedback
Traditionally collaborative filtering used explicit feedback to predict ratings
There may still be bias in whether the user chooses to rate
Explicit Feedback Implicit Feedback
v
Learning from Explicit Feedback
• Explicit feedback may be more reliable but there is much less of it
• Less reliable if users rate based on aspirations instead of true preferences
vs
v
Implicit Feedback - trade-off quality and quantity
Strength of evidence
Number of Events
v
Architecture
[Diagram: Event Data (shown twice); Generate User-Product Matrix; ALS (Spark/EMR); User/Product Factors; Candidate Selection; Score and Select Top N (Spark/EMR); Run-time ranking for diversity]
v
A Matrix Factorization Formulation for Implicit Feedback
N Products
M Users
User Product Matrix R; (M x N), entries in extraction order: 1, -, -, 9, -, -, -, 3, 20
Binary preferences, same order: 1, 0, 0, 1, 0, 0, 0, 1, 1
Preference Matrix R; (M x N)
“Collaborative Filtering for Implicit Feedback” - Hu et al.
v
A Matrix Factorization Formulation for Implicit Feedback
Preference Matrix R; (M x N) ~ User Factors X^T (M x k) x Product Factors Y (k x N)
Binary preferences shown: 1, 0, 0, 1, 0, 0, 0, 1, 1
v
Matrix Factorization from Implicit Feedback - The Intuition
#Purchases   Preference p   Confidence c
0            0              Low
1            1              Low
>>1          1              High
• Confidence increases linearly with purchases r
• c = 1 + alpha * r
• alpha controls the marginal rate of learning from user purchases
• Key questions
• How should the unobserved events be treated?
• How should one trade off the observed against the unobserved?
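The mapping from purchase counts to preference and confidence can be sketched in a few lines. This is a minimal illustration of c = 1 + alpha * r from the bullets above; the function names and the value of `ALPHA` are assumptions chosen for the example, not values from the talk.

```python
def preference(purchases: int) -> int:
    """Binary preference p: 1 if the user ever bought the product, else 0."""
    return 1 if purchases > 0 else 0


def confidence(purchases: int, alpha: float) -> float:
    """Confidence c = 1 + alpha * r, growing linearly with purchase count r."""
    return 1.0 + alpha * purchases


ALPHA = 40.0  # assumed value, for illustration only

for r in (0, 1, 20):
    print(r, preference(r), confidence(r, ALPHA))
```

Note that an unobserved pair (r = 0) still gets the minimum confidence of 1, so it contributes to the loss as weak evidence of no preference rather than being ignored.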
v
Regularized Weighted Squared Loss
Terms annotated in the loss: Confidence, User Factors Matrix, Product Factors Matrix, Preference Matrix, Regularization
Solve using Alternating Least Squares
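The equation on this slide did not survive extraction. Assuming it follows the formulation in the cited Hu et al. paper, the loss would read roughly as below. The symbols x_u (user factors), y_i (product factors), and lambda (regularization weight) follow that paper's notation and are an assumption about the slide.

```latex
\min_{X,\,Y} \; \sum_{u,i} c_{ui}\,\bigl(p_{ui} - x_u^{\top} y_i\bigr)^2
  \;+\; \lambda \Bigl( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \Bigr),
\qquad c_{ui} = 1 + \alpha\, r_{ui}, \quad
p_{ui} = \begin{cases} 1 & r_{ui} > 0 \\ 0 & r_{ui} = 0 \end{cases}

% With Y held fixed, each user vector has a closed-form solution
% (C^u is the diagonal matrix of confidences c_{ui}, p(u) the user's preference vector):
x_u = \bigl( Y^{\top} C^{u} Y + \lambda I \bigr)^{-1} Y^{\top} C^{u}\, p(u)
```

Alternating between this update for users and the symmetric one for products is what makes the problem solvable by alternating least squares.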
v
Architecture
Generate
User-Product
Matrix
ALS
(Spark/EMR)
Score and
Select Top N
(Spark/EMR)
User/Product Factors
Run-time
ranking for
diversity
Candidate
Selection
Event Data
Event Data
v
Spark ALS Hyper-parameter Tuning
• rank k - diminishing returns after 150
• alpha - controls rate of learning from observed events
• iterations - ALS tends to converge within 5, seldom more than 10
• lambda - regularization parameter
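To make the four hyperparameters concrete, here is a toy, pure-Python alternating least squares loop for a confidence-weighted, regularized squared loss on binary preferences. It is a sketch for intuition only, not Spark's implementation; Spark ML's `ALS` exposes analogous settings (`rank`, `maxIter`, `regParam`, `alpha`, `implicitPrefs`). The 3 x 3 purchase matrix and all parameter values are made up.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def update(fixed, count_rows, alpha, lam, k):
    """Re-solve one side's factors while the other side (fixed) is held constant."""
    out = []
    for row in count_rows:
        a = [[lam if i == j else 0.0 for j in range(k)] for i in range(k)]
        b = [0.0] * k
        for f, r in zip(fixed, row):
            c = 1.0 + alpha * r          # confidence
            p = 1.0 if r > 0 else 0.0    # binary preference
            for i in range(k):
                b[i] += c * p * f[i]
                for j in range(k):
                    a[i][j] += c * f[i] * f[j]
        out.append(solve(a, b))
    return out


def loss(x, y, counts, alpha, lam):
    total = lam * (sum(dot(v, v) for v in x) + sum(dot(v, v) for v in y))
    for u, row in enumerate(counts):
        for i, r in enumerate(row):
            p = 1.0 if r > 0 else 0.0
            total += (1.0 + alpha * r) * (p - dot(x[u], y[i])) ** 2
    return total


def als(counts, rank, alpha, lam, iterations, seed=0):
    rng = random.Random(seed)
    x = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in counts]
    y = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in counts[0]]
    by_item = [list(col) for col in zip(*counts)]
    history = [loss(x, y, counts, alpha, lam)]
    for _ in range(iterations):
        x = update(y, counts, alpha, lam, rank)   # users, products fixed
        y = update(x, by_item, alpha, lam, rank)  # products, users fixed
        history.append(loss(x, y, counts, alpha, lam))
    return x, y, history


counts = [[1, 0, 0], [9, 0, 0], [0, 3, 20]]  # toy purchase counts (users x products)
x, y, history = als(counts, rank=2, alpha=1.0, lam=0.1, iterations=5)
print([round(h, 4) for h in history])
```

Each half-step minimizes the loss exactly for one side, so the recorded loss never increases; printing `history` is a quick way to see how few iterations are needed on a given dataset.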
v
Architecture
Generate
User-Product
Matrix
ALS Matrix
Factorization
(Spark/EMR)
Candidate
Selection
Score and
Select Top N
(Spark/EMR)
User/Product Factors
Run-time
ranking for
diversity
Event Data
Event Data
v
Scoring user and products
With millions of products and users, scoring every (user, product) pair is prohibitive
Two goals in selecting products to score
• Products that have an a priori high purchase rate (popular)
• Long tail which have not been discovered
Exclude previously purchased products
v
Candidate Product Selection
We start with simple stratified sampling
For each user, score N products
Sample h products from Head
Sample t products from Tail
N ~ 10000
h ~ 3000
t ~ 7000
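A minimal sketch of this stratified sampling, assuming the catalog is already sorted by popularity and that the head is simply the first `head_size` products. That cut-off, the function name, and the toy sizes are assumptions; the talk does not say how head and tail are defined.

```python
import random


def select_candidates(products_by_popularity, purchased, head_size, h, t, seed=None):
    """Stratified sample: h products from the popular head, t from the long tail.

    products_by_popularity: product ids sorted from most to least popular.
    purchased: set of product ids this user already bought (excluded).
    head_size: how many top products count as the head (assumed cut-off).
    """
    rng = random.Random(seed)
    head = [p for p in products_by_popularity[:head_size] if p not in purchased]
    tail = [p for p in products_by_popularity[head_size:] if p not in purchased]
    return rng.sample(head, min(h, len(head))) + rng.sample(tail, min(t, len(tail)))


catalog = list(range(100))  # toy catalog, already sorted by popularity
candidates = select_candidates(catalog, purchased={0, 5, 50}, head_size=20, h=3, t=7, seed=1)
print(candidates)
```

Only the sampled candidates are then scored per user, which keeps the work proportional to N rather than to the full catalog.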
v
Architecture
Generate
User-Product
Matrix
ALS
(Spark/EMR)
Score and
Select Top N
(Spark/EMR)
User/Product Factors
Run-time
diversity ranking
Candidate
Selection
Event Data
Event Data
v
Offline evaluation
• Ideally we want to evaluate user response to recommendations
• But we will only know this from a live A/B test
• Recall-based metrics are an offline proxy (albeit not the best)
• Recall: “Fraction of purchased products covered among Top N recommendations”
• We only use this for hyperparameter tuning
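The recall definition above can be written as a small function. This is a sketch that assumes `purchased` holds a user's held-out purchases and `recommended` is the ranked list; both names and the sample data are illustrative.

```python
def recall_at_n(recommended, purchased, n):
    """Fraction of held-out purchased products that appear in the top-n recommendations."""
    purchased = set(purchased)
    if not purchased:
        return 0.0
    top_n = set(recommended[:n])
    return len(top_n & purchased) / len(purchased)


recs = ["hummus", "salsa", "chips", "kale", "milk"]
held_out = {"salsa", "milk", "eggs", "bread"}
print(recall_at_n(recs, held_out, n=3))  # 0.25
print(recall_at_n(recs, held_out, n=5))  # 0.5
```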
v
Tuning Spark For ALS
Understanding Spark execution model and its implementation of ALS helps
• Training is communication heavy [1], set partitions <= #CPU cores
• Scoring is memory intensive
• Broad guidelines [2]
• Limit executor memory to 64GB
• 5 cores per executor
• Set executors based on data size
[1] http://apache-spark-user-list.1001560.n3.nabble.com/Error-No-space-left-on-device-tp9887p9896.html
[2] http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-1/
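As a rough illustration, the guidelines above could be turned into Spark properties like this. The property names are standard Spark configuration keys, but mapping the slide's "partitions" to `spark.default.parallelism` is an assumption, as is the helper itself.

```python
def spark_settings(num_executors, requested_partitions, executor_memory_gb):
    """Build Spark properties following the broad guidelines on the slide."""
    cores_per_executor = 5                   # "5 cores per executor"
    memory_gb = min(executor_memory_gb, 64)  # "Limit executor memory to 64GB"
    total_cores = num_executors * cores_per_executor
    return {
        "spark.executor.instances": str(num_executors),
        "spark.executor.cores": str(cores_per_executor),
        "spark.executor.memory": f"{memory_gb}g",
        # Keep the partition count at or below the number of CPU cores.
        "spark.default.parallelism": str(min(requested_partitions, total_cores)),
    }


settings = spark_settings(num_executors=4, requested_partitions=100, executor_memory_gb=128)
for key, value in settings.items():
    print(f"{key}={value}")
```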
v
What better promotes broad-based discovery?
vs
v
Online ranking for diversity
“Diversity within sessions, Novelty across sessions”
“Establish trust in a fresh and comprehensive catalog”
“Less is more”
Cached list of
~1000 products
per user
Final list of
<100 products
promote diversity
v
Diversity
Top K products - ranked by score
Rank product categories by their median product score
> > >
v
Weighted sampling for diversity
Sample category in
proportion to score
Within category, sample in
proportion to product score
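The two-level sampling can be sketched as follows. It assumes positive scores, uses the median product score as the category score, and samples products without replacement so the final list has no duplicates; the last point, and recomputing the category score over the remaining products, are assumptions beyond what the slides state.

```python
import random
from statistics import median


def diverse_sample(scored, list_size, seed=None):
    """Pick list_size products: sample a category by its score, then a product within it.

    scored: dict mapping category -> {product: score}, all scores positive.
    """
    rng = random.Random(seed)
    pools = {cat: dict(items) for cat, items in scored.items() if items}
    picked = []
    while pools and len(picked) < list_size:
        cats = list(pools)
        cat_weights = [median(pools[c].values()) for c in cats]
        cat = rng.choices(cats, weights=cat_weights, k=1)[0]
        products = list(pools[cat])
        weights = [pools[cat][p] for p in products]
        product = rng.choices(products, weights=weights, k=1)[0]
        picked.append((cat, product))
        del pools[cat][product]  # without replacement
        if not pools[cat]:
            del pools[cat]
    return picked


scored = {
    "dips": {"hummus": 0.9, "salsa": 0.7},
    "dairy": {"milk": 0.6, "yogurt": 0.5},
    "produce": {"kale": 0.4, "limes": 0.3},
}
picked = diverse_sample(scored, list_size=4, seed=7)
print(picked)
```

Because sampling is random rather than a strict sort by score, repeated calls with different seeds yield different lists, which is one way to get novelty across sessions.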
v
A/B Test Setup
Generate
User-Product
Matrix
ALS
(Spark/EMR)
Score and
Select Top N
(Spark/EMR)
User/Product Factors
Run-time
diversity ranking
Candidate
Selection
Event Data
Event Data
Weekly for past N
months data
Weekly for users with
recent activity
v
A/B Test Results
• Statistically significant increases
• Items per order
• GMV per order
• Total product sales spread over more categories
v
Ok, we have a recommendation system
Where do we go from here?
v
What else do you do with user and product factors?
Score (user, product) pair on demand
Get Top N similar users
Get Top N similar products
As features in other models
v
Products similar to “Haigs Spicy Hummus”
More “Spicy Hummus”
Spicy Salsas
Generated using Approximate Nearest Neighbor
(“annoy” from Spotify)
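For intuition, here is an exact, brute-force version of the similar-products lookup using cosine similarity over product factors. It stands in for the approximate nearest neighbor index used in the talk and does not use the annoy API; the product names and factor values are made up.

```python
import math


def cosine(u, v):
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def most_similar(target, factors, n):
    """Exact top-n neighbours of target by cosine similarity of product factors."""
    others = [
        (cosine(factors[target], vec), name)
        for name, vec in factors.items()
        if name != target
    ]
    return [name for _, name in sorted(others, reverse=True)[:n]]


product_factors = {  # hypothetical 3-dimensional product factors
    "spicy hummus": [0.9, 0.1, 0.8],
    "classic hummus": [0.8, 0.2, 0.3],
    "spicy salsa": [0.2, 0.1, 0.9],
    "oat milk": [0.1, 0.9, 0.0],
}
print(most_similar("spicy hummus", product_factors, n=2))
```

Brute force compares the target against every product, which is fine for a toy catalog; an approximate index trades a little accuracy for much faster lookups at catalog scale.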
v
Ensembles
Use different types of evidence and/or product metadata to easily create ensembles
User x Products Purchased
User x Products Viewed
User x Brands Purchased
Model or Linear
Combination
…
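The linear-combination option can be sketched as a weighted sum of scores from separately trained models. The model names mirror the three matrices listed above; the weights and scores are hypothetical.

```python
def ensemble_score(scores_by_model, weights):
    """Linear combination of per-model scores for one (user, product) pair."""
    return sum(weights[name] * score for name, score in scores_by_model.items())


weights = {"purchased": 0.6, "viewed": 0.3, "brands": 0.1}  # hypothetical weights
scores = {"purchased": 0.8, "viewed": 0.5, "brands": 0.2}   # made-up model outputs
print(ensemble_score(scores, weights))
```

In practice the weights would be tuned, or replaced by a model that takes the individual scores as features.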
v
What next
• Improve candidate selection by leveraging user and product factors
• Make recommendations more contextual
• Address cold-start problems, particularly for users
• Explain recommendations (“Because you did X”)
v
Replacement Product Recommendation
v
Fulfillment in Traditional E-commerce
• Manage inventory in warehouses optimized for quick fulfillment
• Users only specify the “What” they want
• Disallow users from ordering out of stock products
• Set expectations
• “3 day shipping” but will ship in 10 business days
v
Fulfillment for on-demand delivery from local retailers
• Shoppers navigate a complex environment where products
• may have run out
• may be misplaced
• may be damaged
• User specifies “What”, “When” and “Where from”
• Improvise under uncertainty
v
Addressing new challenges in on-demand delivery
• Tight technology integrations help improve tracking of in-store availability
• Complemented by predictive models that estimate availability in real-time
• Last minute out of stocks can still happen
v
v
What makes a replacement acceptable?
Flavor, Packing, Size, Brand, Price, Diet Info
• Several product attributes matter
• Context matters, might benefit from personalization
• Must scale to millions of products
• Not always symmetric
• May be ok to replace X with gluten-free X but not the other way around
v
Replacement Recommendations for Shoppers
• Shoppers are trained to pick replacements
• But shoppers can benefit from algorithmic suggestions
• Many unfamiliar products in a vast catalog
• Validation for common products
• Finding replacements fast improves operational efficiency
v
Replacement Recommendations for Customers
• Customers can specify replacements while placing the order
• Can choose to communicate with the shopper in store to verify
v
How do we algorithmically generate replacements?
COME BUILD IT :)
WE’RE HIRING!


@sharathrao

Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 

Último (20)

From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

DataEngConf SF16 - Recommendations at Instacart

  • 1. Recommendations @ Instacart Sharath Rao Data Scientist Catalog, Search and Discovery
  • 2. The Instacart Value Proposition: groceries from stores you love, delivered to your doorstep, in as little as an hour
  • 3. Customer Experience: Select a Store, Shop for Groceries, Checkout, Select Delivery Time, Delivered to Doorstep
  • 4. Shopper Experience: Accept Order, Find the Groceries, Out for Delivery, Delivered to Doorstep, Scan Barcode
  • 5. Four Sided Marketplace: Customers, Shoppers, Products (Advertisers), Stores (Retailers). Functions: Search, Advertising, Shopping, Delivery, Customer Service, Inventory, Picking, Loyalty
  • 6. What this talk is about • A new collaborative filtering algorithm • A case study • Live end-to-end recommendation system • One person-month • Hundreds of millions of transactions
  • 7. Online grocery vs Traditional e-commerce (figure: Online Grocery and Traditional e-commerce over Week 1, Week 2, Week 3)
  • 8. Grocery Shopping in “Low Dimensional Space”: Search + Restock + Explore
  • 9. Why recommendations at Instacart: Your store, Everybody’s store
  • 10. Repeat purchases increase LTV of recommendations: $5.49 today, $549 a year later (1 +….+ 100)
  • 11. Different recommendation systems address different needs
  • 12. Personalized Top N recommendations: promote broad-based discovery in a dynamic catalog, including from stores customers may have never shopped
  • 13. Replacement Product Recommendations: mitigate adverse impact of last-minute out of stocks
  • 14. “Frequently bought with” Recommendations: help customers shop for complementary products and try alternatives. Example labels: not necessarily consumed together; probably consumed together
  • 15. Post Checkout Recommendations: accommodate last-minute requests for that “just one more thing”
  • 16. Personalized Top N Recommendations
  • 17. Learning from feedback (Explicit Feedback, Implicit Feedback): Traditionally, collaborative filtering used explicit feedback to predict ratings. There may still be bias in whether the user chooses to rate.
  • 18. Learning from Explicit Feedback • Explicit feedback may be more reliable, but there is much less of it • Less reliable if users rate based on aspirations instead of true preferences
  • 19. Implicit Feedback - trade-off quality and quantity (figure axes: Strength of evidence, Number of Events)
  • 20. Architecture (pipeline diagram): Event Data; Generate User-Product Matrix; ALS (Spark/EMR); User/Product Factors; Candidate Selection; Score and Select Top N (Spark/EMR); Run-time ranking for diversity
  • 21. A Matrix Factorization Formulation for Implicit Feedback. User-Product Matrix R (M users x N products), rows: [1 - -], [9 - -], [- 3 20]. Binary Preference Matrix P (M x N), rows: [1 0 0], [1 0 0], [0 1 1]. Source: “Collaborative Filtering for Implicit Feedback” - Hu et al.
  • 22. A Matrix Factorization Formulation for Implicit Feedback. Preference Matrix P (M x N), rows [1 0 0], [1 0 0], [0 1 1], is approximated by X^T x Y, where X^T holds the User Factors (M x k) and Y holds the Product Factors (k x N)
  • 23. Matrix Factorization from Implicit Feedback - The Intuition
        #Purchases | Preference p | Confidence c
        0          | 0            | Low
        1          | 1            | Low
        >>1        | 1            | High
      • Confidence increases linearly with purchases r • c = 1 + alpha * r • alpha controls the marginal rate of learning from user purchases • Key questions • How should the unobserved events be treated? • How should one trade off the observed and the unobserved?
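The mapping from purchase counts to preference and confidence can be sketched in a few lines of Python. This is only an illustration: the 3 x 3 layout of the counts is an assumed reading of the matrix on slide 21, and `alpha=40.0` is an arbitrary illustrative value, not one stated in the talk.

```python
def preference(purchases):
    """Binary preference p: 1 if the user ever bought the product, else 0."""
    return 1 if purchases > 0 else 0


def confidence(purchases, alpha=40.0):
    """Confidence c = 1 + alpha * r, growing linearly with the purchase count r.

    alpha=40.0 is an illustrative default, not a value from the talk.
    """
    return 1.0 + alpha * purchases


# Purchase counts: rows are users, columns are products; 0 means unobserved.
# Layout assumed from the flattened matrix on slide 21.
counts = [
    [1, 0, 0],
    [9, 0, 0],
    [0, 3, 20],
]

P = [[preference(r) for r in row] for row in counts]
C = [[confidence(r) for r in row] for row in counts]
```

Unobserved pairs keep a preference of 0 but still carry the minimum confidence of 1, which is how this formulation lets the model learn a little from what users did not buy.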
  • 24. Regularized Weighted Squared Loss (equation annotated with: Confidence, User Factors Matrix, Product Factors Matrix, Preference Matrix, Regularization). Solve using Alternating Least Squares
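The equation itself is not in the transcript. For reference, the objective in the Hu et al. paper cited on slide 21 has the following form, which matches the annotations listed for this slide. Here x_u is the factor vector of user u, y_i the factor vector of product i, p_ui the binary preference, c_ui = 1 + alpha * r_ui the confidence, and lambda the regularization weight; whether the slide used exactly these symbols is an assumption.

```latex
\min_{X,\,Y} \; \sum_{u,i} c_{ui}\,\bigl(p_{ui} - x_u^{\top} y_i\bigr)^2
  \;+\; \lambda \Bigl( \sum_{u} \lVert x_u \rVert^2 + \sum_{i} \lVert y_i \rVert^2 \Bigr)
```

With the product factors fixed, the objective is quadratic in each x_u and has a closed-form minimizer, and symmetrically for each y_i, which is what makes alternating least squares applicable.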
  • 25. Architecture (pipeline diagram): Event Data; Generate User-Product Matrix; ALS (Spark/EMR); User/Product Factors; Candidate Selection; Score and Select Top N (Spark/EMR); Run-time ranking for diversity
  • 26. Spark ALS Hyper-parameter Tuning • rank k - diminishing returns after 150 • alpha - controls rate of learning from observed events • iterations - ALS tends to converge within 5, seldom more than 10 • lambda - regularization parameter
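To make the four hyper-parameters concrete, here is a small pure-Python sketch of implicit-feedback ALS on a toy matrix. It is not Spark's implementation and not the code used in the talk; the function names, default values, and toy data are all assumptions chosen for illustration. In Spark ML's `ALS`, the corresponding parameters are `rank`, `alpha`, `maxIter`, and `regParam`, with `implicitPrefs=True`.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update(rows, fixed, rank, alpha, lam):
    """Re-solve one side's factors while the other side (fixed) is held constant."""
    out = []
    for row in rows:
        a = [[lam if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for r, f in zip(row, fixed):
            c = 1.0 + alpha * r          # confidence
            p = 1.0 if r > 0 else 0.0    # binary preference
            for i in range(rank):
                b[i] += c * p * f[i]
                for j in range(rank):
                    a[i][j] += c * f[i] * f[j]
        out.append(solve(a, b))
    return out


def loss(counts, users, items, alpha, lam):
    """Regularized, confidence-weighted squared loss over all (user, product) pairs."""
    total = lam * sum(v * v for f in users + items for v in f)
    for u, row in enumerate(counts):
        for i, r in enumerate(row):
            pred = sum(x * y for x, y in zip(users[u], items[i]))
            p = 1.0 if r > 0 else 0.0
            total += (1.0 + alpha * r) * (p - pred) ** 2
    return total


def implicit_als(counts, rank=2, alpha=40.0, lam=0.1, iterations=10, seed=0):
    """Alternate between solving for user factors and product factors."""
    rng = random.Random(seed)
    items = [[rng.random() for _ in range(rank)] for _ in counts[0]]
    by_item = [list(col) for col in zip(*counts)]
    users = []
    for _ in range(iterations):
        users = update(counts, items, rank, alpha, lam)
        items = update(by_item, users, rank, alpha, lam)
    return users, items
```

Each half-step exactly minimizes the loss for one side, so the loss never increases from one iteration to the next, which is consistent with the slide's remark that a handful of iterations is usually enough.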
  • 27. Architecture (pipeline diagram): Event Data; Generate User-Product Matrix; ALS Matrix Factorization (Spark/EMR); User/Product Factors; Candidate Selection; Score and Select Top N (Spark/EMR); Run-time ranking for diversity
  • 28. Scoring users and products: With millions of products and users, scoring every (user, product) pair is prohibitive. Two goals in selecting products to score • Products that have an a priori high purchase rate (popular) • Long-tail products that have not been discovered. Exclude previously purchased products
  • 29. Candidate Product Selection: We start with simple stratified sampling. For each user, score N products • Sample h products from the head • Sample t products from the tail • N ~ 10000, h ~ 3000, t ~ 7000
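A minimal sketch of this stratified sampling, assuming the catalog is given as a list sorted by popularity and that the head is simply its first `head_size` entries. The talk does not say how head and tail are defined, so `head_size`, the function name, and the way past purchases are excluded are all hypothetical.

```python
import random


def select_candidates(products_by_popularity, purchased, head_size, h, t, seed=None):
    """Sample h products from the popular head and t from the long tail.

    Previously purchased products are excluded before sampling.
    """
    rng = random.Random(seed)
    head = [p for p in products_by_popularity[:head_size] if p not in purchased]
    tail = [p for p in products_by_popularity[head_size:] if p not in purchased]
    picked_head = rng.sample(head, min(h, len(head)))
    picked_tail = rng.sample(tail, min(t, len(tail)))
    return picked_head + picked_tail


# Toy usage: 100 products ranked by popularity, user already bought 0 and 50.
catalog = list(range(100))
candidates = select_candidates(catalog, {0, 50}, head_size=10, h=3, t=7, seed=1)
```

At the scale on the slide the same call would use h ~ 3000 and t ~ 7000, so each user is scored against roughly 10000 products instead of the full catalog.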
  • 30. Architecture (pipeline diagram): Event Data; Generate User-Product Matrix; ALS (Spark/EMR); User/Product Factors; Candidate Selection; Score and Select Top N (Spark/EMR); Run-time diversity ranking
  • 31. Offline evaluation • Ideally we want to evaluate user response to recommendations • But we will only know this from a live A/B test • Recall-based metrics are an offline proxy (albeit not the best) • Recall: “Fraction of purchased products covered among Top N recommendations” • We only use this for hyper-parameter tuning
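The recall definition quoted on this slide can be written down directly. The sketch below assumes `recommended` is a ranked list and `purchased` is the set of products the user bought in a held-out period; the held-out split itself is an assumption, since the slide does not describe how evaluation data was prepared.

```python
def recall_at_n(recommended, purchased, n):
    """Fraction of purchased products that appear among the top-n recommendations."""
    if not purchased:
        return 0.0
    top_n = set(recommended[:n])
    return sum(1 for p in purchased if p in top_n) / len(purchased)
```

Averaging this value over users gives a single number that can be compared across settings of rank, alpha, and lambda.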
  • 32. Tuning Spark For ALS: Understanding Spark execution model and its implementation of ALS helps • Training is communication heavy [1], set partitions <= #CPU cores • Scoring is memory intensive • Broad guidelines [2] • Limit executor memory to 64GB • 5 cores per executor • Set executors based on data size • [1] http://apache-spark-user-list.1001560.n3.nabble.com/Error-No-space-left-on-device-tp9887p9896.html • [2] http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-1/
  • 33. What better promotes broad-based discovery? (side-by-side image comparison)
  • 34. Online ranking for diversity: “Diversity within sessions, Novelty across sessions”; “Establish trust in a fresh and comprehensive catalog”; “Less is more”. Cached list of ~1000 products per user. Final list of <100 products to promote diversity
  • 35. Diversity: Top K products - ranked by score. Rank product categories by their median product score
  • 36. Weighted sampling for diversity: Sample category in proportion to score. Within category, sample in proportion to product score
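The two-level sampling on slides 35 and 36 could look roughly like the sketch below. It assumes all scores are positive (raw factor dot products may not be, so some rescaling would be needed in practice), samples without replacement, and recomputes each category's median after every pick. Those choices, and the function name, are assumptions rather than details from the talk.

```python
import random
from statistics import median


def diverse_sample(scored, k, seed=None):
    """Pick up to k products from (product, category, score) triples.

    A category is drawn in proportion to its median product score, then a
    product is drawn within that category in proportion to its own score.
    """
    rng = random.Random(seed)
    by_cat = {}
    for product, category, score in scored:
        by_cat.setdefault(category, []).append((product, score))
    picked = []
    while len(picked) < k and by_cat:
        cats = list(by_cat)
        cat_weights = [median(s for _, s in by_cat[c]) for c in cats]
        cat = rng.choices(cats, weights=cat_weights)[0]
        items = by_cat[cat]
        idx = rng.choices(range(len(items)), weights=[s for _, s in items])[0]
        picked.append(items.pop(idx)[0])
        if not items:
            del by_cat[cat]  # category exhausted
    return picked
```

Because the draw is random rather than a strict sort, repeated calls over the same cached list return different mixes, which fits the "novelty across sessions" goal on slide 34.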
  • 37. A/B Test Setup (pipeline diagram): Event Data; Generate User-Product Matrix; ALS (Spark/EMR); User/Product Factors; Candidate Selection; Score and Select Top N (Spark/EMR); Run-time diversity ranking. Schedule annotations: weekly for past N months data; weekly for users with recent activity
  • 38. A/B Test Results • Statistically significant increases • Items per order • GMV per order • Total product sales spread over more categories
  • 39. Ok, we have a recommendation system. Where do we go from here?
  • 40. What else do you do with user and product factors? • Score (user, product) pair on demand • Get Top N similar users • Get Top N similar products • As features in other models
  • 41. Products similar to “Haigs Spicy Hummus”: more “Spicy Hummus”, Spicy Salsas. Generated using Approximate Nearest Neighbor (“annoy” from Spotify)
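As a rough illustration of "similar products" from factor vectors, the sketch below does an exact, brute-force search by cosine similarity. This is a stand-in, not the approach on the slide: the talk used the approximate nearest-neighbor library annoy, and the choice of cosine similarity, the product names, and the two-dimensional vectors here are all made up for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n_similar(query, factors, n):
    """Return the n products whose factor vectors are closest to the query's."""
    q = factors[query]
    scored = [(cosine(q, v), name) for name, v in factors.items() if name != query]
    scored.sort(reverse=True)
    return [name for _, name in scored[:n]]


# Hypothetical product factors, for illustration only.
factors = {
    "spicy hummus A": [0.9, 0.1],
    "spicy hummus B": [0.8, 0.2],
    "spicy salsa": [0.5, 0.5],
    "whole milk": [0.0, 1.0],
}
similar = top_n_similar("spicy hummus A", factors, 2)
```

Brute force costs one comparison per product per query, which is why an approximate index becomes attractive for a catalog with millions of products.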
  • 42. Ensembles: Use different types of evidence and/or product metadata to easily create ensembles • User x Products Purchased • User x Products Viewed • User x Brands Purchased • … combined by a Model or Linear Combination
  • 43. What next • Improve candidate selection by leveraging user and product factors • Make recommendations more contextual • Address cold-start problems, particularly for users • Explain recommendations (“Because you did X”)
  • 45. Fulfillment in Traditional E-commerce • Manage inventory in warehouses optimized for quick fulfillment • Users only specify the “What” they want • Disallow users from ordering out of stock products • Set expectations • “3 day shipping” but will ship in 10 business days
  • 46. Fulfillment for on-demand delivery from local retailers • Shoppers navigate a complex environment where products • may have run out • may be misplaced • may be damaged • User specifies “What”, “When” and “Where from” • Improvise under uncertainty
  • 47. Addressing new challenges in on-demand delivery • Tight technology integrations help improve tracking of in-store availability • Complemented by predictive models that estimate availability in real-time • Last minute out of stocks can still happen
  • 49. What makes a replacement acceptable? Attributes shown: Flavor, Packing, Size, Brand, Price, Diet Info • Several product attributes matter • Context matters, might benefit from personalization • Must scale to millions of products • Not always symmetric • May be ok to replace X with gluten free X but not the other way around
  • 50. Replacement Recommendations for Shoppers • Shoppers are trained to pick replacements • But shoppers can benefit from algorithmic suggestions • Many unfamiliar products in a vast catalog • Validation for common products • Finding replacements fast improves operational efficiency
  • 51. Replacement Recommendations for Customers • Customers can specify replacements while placing the order • Can choose to communicate with the shopper in store to verify
  • 52. How do we algorithmically generate replacements?
  • 53. COME BUILD IT :) WE’RE HIRING! 
 @sharathrao