Deep Learning for
Recommender Systems
Yves Raimond & Justin Basilico
January 25, 2018
Re·Work Deep Learning Summit San Francisco
@moustaki @JustinBasilico
The value of recommendations
● A few seconds to find something
great to watch…
● Can only show a few titles
● Enjoyment directly impacts
customer satisfaction
● Generates over $1B per year of
Netflix revenue
● How? Personalize everything
Deep learning for
recommendations: a first try
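Traditional Recommendation Setup
0 1 0 1 0
0 0 1 1 0
1 0 0 1 1
0 1 0 0 0
0 0 0 0 1
(binary interaction matrix; axes: Users, Items)
A Matrix Factorization view
R ≈ U V
A Feed-Forward Network view
(diagram: embeddings U and V feeding a feed-forward network)
A (deeper) feed-forward view
(diagram: embeddings U and V feeding a deeper network; mean squared loss?)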
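The two views can be contrasted in a few lines. This is only a minimal sketch: the function names, the single hidden ReLU layer, and the random weights are assumptions made for illustration, not the architecture used in the talk. In the matrix factorization view the score is a dot product of the user and item embeddings; in the feed-forward view the same embeddings are concatenated and passed through learned layers.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mf_score(user_emb, item_emb):
    """Matrix factorization view: score is the dot product of the embeddings."""
    return dot(user_emb, item_emb)

def ff_score(user_emb, item_emb, hidden_weights, output_weights):
    """Feed-forward view (assumed shape): concatenate the embeddings,
    apply one hidden ReLU layer, then a linear output unit."""
    x = user_emb + item_emb  # list concatenation
    hidden = [max(0.0, dot(row, x)) for row in hidden_weights]
    return dot(output_weights, hidden)

# Hypothetical sizes and random parameters, for illustration only.
rng = random.Random(0)
rank, hidden_units = 3, 4
u = [rng.gauss(0, 1) for _ in range(rank)]
v = [rng.gauss(0, 1) for _ in range(rank)]
W1 = [[rng.gauss(0, 1) for _ in range(2 * rank)] for _ in range(hidden_units)]
w2 = [rng.gauss(0, 1) for _ in range(hidden_units)]
print(mf_score(u, v), ff_score(u, v, W1, w2))
```

Stacking more hidden layers inside ff_score would give the deeper variant.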
A quick & dirty experiment
● MovieLens-20M
○ Binarized ratings
○ Two weeks validation, two weeks test
● Comparing two models
○ ‘Standard’ MF, with hyperparameters:
■ L2 regularization
■ Rank
○ Feed-forward net, with hyperparameters:
■ L2 regularization (for all layers + embeddings)
■ Embeddings dimensionality
■ Number of hidden layers
■ Hidden layer dimensionalities
■ Activations
● After hyperparameter search for both models, what do we get?
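The data preparation described above can be sketched as follows. The slides do not say how ratings were binarized or where the split boundaries fall, so the threshold of 4.0 and the choice to count the two-week windows back from the last event are assumptions; the function names are hypothetical.

```python
from datetime import datetime, timedelta

def binarize(rating, threshold=4.0):
    """Assumed rule: ratings at or above the threshold count as positives."""
    return 1 if rating >= threshold else 0

def temporal_split(events, validation_days=14, test_days=14):
    """Split (user, item, rating, timestamp) events by time.

    The last `test_days` go to test, the `validation_days` before that go to
    validation, and everything earlier goes to training.
    """
    last = max(ts for _, _, _, ts in events)
    test_start = last - timedelta(days=test_days)
    validation_start = test_start - timedelta(days=validation_days)
    train, validation, test = [], [], []
    for user, item, rating, ts in events:
        row = (user, item, binarize(rating), ts)
        if ts > test_start:
            test.append(row)
        elif ts > validation_start:
            validation.append(row)
        else:
            train.append(row)
    return train, validation, test

# Made-up events, not MovieLens data.
events = [
    ("u1", "i1", 5.0, datetime(2015, 1, 1)),
    ("u1", "i2", 3.0, datetime(2015, 3, 10)),
    ("u2", "i1", 4.0, datetime(2015, 3, 31)),
]
train, validation, test = temporal_split(events)
print(len(train), len(validation), len(test))
```

Splitting by time rather than at random keeps future interactions out of training.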
What’s going on?
● Very similar models: representation learning
through embeddings, MSE loss, gradient-based
optimization
● Main difference is that we can learn a different
embedding combination than a dot product
● … but embeddings are arbitrary representations
● … and capturing pairwise interactions through a
feed-forward net requires a very large model
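The shared ingredients named above (embeddings, MSE loss, gradient-based optimization) can be sketched for the matrix factorization case. This is an illustrative toy, not the experiment from the talk: it uses the small 5x5 matrix from the earlier slide, treats every cell as observed, and the rank, learning rate, L2 strength, and epoch count are arbitrary assumptions.

```python
import random

R = [
    [0, 1, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mse(R, U, V):
    """Mean squared error between R and the dot-product reconstruction."""
    errors = [(r - dot(U[i], V[j])) ** 2
              for i, row in enumerate(R) for j, r in enumerate(row)]
    return sum(errors) / len(errors)

def train_mf(R, rank=3, lr=0.05, l2=0.01, epochs=200, seed=0):
    """Plain SGD on squared error with L2 regularization of both factor sets."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in R]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in R[0]]
    for _ in range(epochs):
        for i, row in enumerate(R):
            for j, r in enumerate(row):
                err = r - dot(U[i], V[j])
                for f in range(rank):
                    u_if, v_jf = U[i][f], V[j][f]
                    U[i][f] += lr * (err * v_jf - l2 * u_if)
                    V[j][f] += lr * (err * u_if - l2 * v_jf)
    return U, V

U0, V0 = train_mf(R, epochs=0)  # untrained embeddings, for comparison
U1, V1 = train_mf(R)
print(mse(R, U0, V0), mse(R, U1, V1))
```

Replacing dot(U[i], V[j]) with a small network over the same embeddings gives the feed-forward counterpart, trained with the same loss and the same kind of updates.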
Conclusion?
● In the ‘traditional’ recommendation setup, a deep
model offers little benefit over a properly tuned
MF model
● … Is this talk over?
Breaking the ‘traditional’ recsys setup
● Adding extra data / inputs
● Modeling different facets of users and items
● Alternative framings of the problem
Alternative data
Content-based side information
● VBPR: helping cold-start by augmenting item
factors with visual factors from CNNs [He et al.,
2015]
● Content2Vec [Nedelec et al., 2017]
● Learning to approximate MF item embeddings
from content [Dieleman, 2014]
Metadata-based side information
● Factorization Machines [Rendle, 2010] with
side-information
○ Extending the factorization framework to an arbitrary
number of inputs
● Meta-Prod2Vec [Vasile et al., 2016]
○ Regularize item embeddings using side-information
● DCF [Li et al., 2016]
● Using associated textual information for
recommendations [Bansal et al., 2016]
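As a rough illustration of how a factorization machine extends factorization to an arbitrary number of inputs, the sketch below scores a feature vector with a bias, linear weights, and factorized pairwise interactions. The pairwise term uses the standard linear-time identity from Rendle's formulation. The function name and the toy parameters are assumptions for illustration.

```python
def fm_predict(x, w0, w, V):
    """Factorization machine score for one feature vector.

    x:  feature values (user, item and side-information features together)
    w0: global bias
    w:  one linear weight per feature
    V:  one k-dimensional factor vector per feature
    """
    n, k = len(x), len(V[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        # Equals the sum over i < j of V[i][f] * V[j][f] * x[i] * x[j].
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise

# Toy example: three active features (e.g. a user, an item, a metadata tag).
x = [1.0, 1.0, 1.0]
w0 = 0.5
w = [0.1, 0.2, 0.3]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(fm_predict(x, w0, w, V))
```

With only a one-hot user feature and a one-hot item feature active, the pairwise term reduces to a single user-item dot product, which is the matrix factorization case.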
YouTube Recommendations
● Two-stage ranker:
candidate generation
(shrinking the set of items to
rank) and ranking
(classifying actual
impressions)
● Two fully connected
feed-forward networks with
hundreds of features
[Covington et al., 2016]
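The two-stage structure can be sketched as a cheap candidate generator followed by a more expensive ranker. The stand-ins here are assumptions: candidates come from a dot product over made-up embeddings, and the ranker is an arbitrary scoring function, whereas the system described above uses a neural network in each stage.

```python
import heapq

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def generate_candidates(user_vec, item_vecs, k):
    """Stage 1: shrink the catalog to the k items closest to the user."""
    scored = ((dot(user_vec, vec), item) for item, vec in item_vecs.items())
    return [item for _, item in heapq.nlargest(k, scored)]

def rank(candidates, ranking_score):
    """Stage 2: order the few surviving candidates with a richer scorer."""
    return sorted(candidates, key=ranking_score, reverse=True)

# Hypothetical embeddings and a hypothetical second-stage score.
item_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
second_stage = {"a": 0.2, "b": 0.8, "c": 0.5, "d": 0.9}

candidates = generate_candidates([1.0, 0.0], item_vecs, k=2)
ranked = rank(candidates, lambda item: second_stage[item])
print(candidates, ranked)
```

The split lets the second stage use many features per item, since it only ever sees the short candidate list.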
Alternative models
Restricted Boltzmann Machines
● RBMs for Collaborative
Filtering [Salakhutdinov,
Mnih & Hinton, 2007]
● Part of the ensemble that
won the $1M Netflix Prize
● Used in our rating
prediction system for
several years
Auto-encoders
● RBMs are hard to train
● CF-NADE [Zheng et al., 2016]
○ Define (random) orderings over conditionals
and model with a neural network
● Denoising auto-encoders: CDL [Wang et
al., 2015], CDAE [Wu et al., 2016]
● Variational auto-encoders [Liang et al.,
2017]
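The denoising idea can be illustrated with the input corruption step alone: some of a user's observed interactions are hidden, and the model is trained to reconstruct the full vector. This is a sketch under assumptions (random dropping of positives with a fixed probability, hypothetical function name); the cited models differ in their exact corruption and architecture.

```python
import random

def corrupt(interactions, drop_prob, rng):
    """Randomly hide observed interactions (1s); zeros are left unchanged."""
    return [0 if (x == 1 and rng.random() < drop_prob) else x
            for x in interactions]

rng = random.Random(0)
user_row = [0, 1, 0, 1, 1]
noisy_input = corrupt(user_row, drop_prob=0.5, rng=rng)
# A denoising auto-encoder would be trained on (noisy_input, user_row) pairs:
# it reads the corrupted vector and is penalized for failing to reconstruct
# the original one.
print(noisy_input, user_row)
```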
(*)2Vec
● Prod2Vec [Grbovic et al., 2015],
Item2Vec [Barkan & Koenigstein,
2016], Pin2Vec [Ma, 2017]
● Item-item co-occurrence
factorization (instead of user-item
factorization)
● The two approaches can be
blended [Liang et al., 2016]
prod2vec (Skip-gram)
user2vec (Continuous Bag of Words)
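The item-item co-occurrence view can be made concrete by generating skip-gram style (target, context) pairs from a session, the kind of training pairs a word2vec-like model consumes. The window size and the example sessions are assumptions for illustration.

```python
from collections import Counter

def skipgram_pairs(session, window=2):
    """Pair each item with the items up to `window` positions around it."""
    pairs = []
    for i, target in enumerate(session):
        lo = max(0, i - window)
        hi = min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, session[j]))
    return pairs

# Made-up sessions of item identifiers.
sessions = [["a", "b", "c"], ["b", "c", "d"]]
cooccurrence = Counter(
    pair for session in sessions for pair in skipgram_pairs(session, window=1)
)
print(cooccurrence.most_common(3))
```

Factorizing these item-item counts, rather than the user-item matrix, is what yields the item embeddings.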
Wide + Deep models
● Wide model:
memorize
sparse, specific
rules
● Deep model:
generalize to
similar items via
embeddings
[Cheng et al., 2016]
Deep
Wide (many parameters due to cross product)
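A minimal sketch of how the two parts combine: the wide part is a linear model over sparse cross-product features, the deep part is a small network over embeddings, and their logits are added before a sigmoid. The feature names, the single hidden layer, and all weights below are assumptions for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def wide_logit(active_crosses, cross_weights):
    """Wide part: one weight per memorized cross-product feature."""
    return sum(cross_weights.get(cross, 0.0) for cross in active_crosses)

def deep_logit(user_emb, item_emb, hidden_weights, output_weights):
    """Deep part: embeddings pass through one hidden ReLU layer (assumed)."""
    x = user_emb + item_emb
    hidden = [max(0.0, dot(row, x)) for row in hidden_weights]
    return dot(output_weights, hidden)

def wide_and_deep(active_crosses, cross_weights, user_emb, item_emb,
                  hidden_weights, output_weights, bias=0.0):
    logit = (wide_logit(active_crosses, cross_weights)
             + deep_logit(user_emb, item_emb, hidden_weights, output_weights)
             + bias)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical cross feature: "user watched A AND candidate is B".
cross_weights = {("watched=A", "candidate=B"): 1.0}
p = wide_and_deep(
    active_crosses=[("watched=A", "candidate=B")],
    cross_weights=cross_weights,
    user_emb=[0.0, 0.0], item_emb=[0.0, 0.0],
    hidden_weights=[[1.0, 1.0, 1.0, 1.0]], output_weights=[1.0],
)
print(p)
```

With the embeddings zeroed out as in this call, the prediction comes entirely from the memorized rule in the wide part.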
Alternative framings
Sequence prediction
● Treat recommendations as a
sequence classification problem
○ Input: sequence of user actions
○ Output: next action
● E.g. GRU4Rec [Hidasi et al., 2016]
○ Input: sequence of items in a
session
○ Output: next item in the session
● Also co-evolution: [Wu et al., 2017],
[Dai et al., 2017]
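The framing above can be sketched as a data transformation: each session becomes a set of (prefix, next item) training examples for a sequence classifier. The function name and the example session are assumptions; the recurrent model that would consume these examples is not shown.

```python
def next_item_examples(session):
    """Turn one session into (items so far, next item) training examples."""
    return [(session[:i], session[i]) for i in range(1, len(session))]

# Made-up session of item identifiers.
examples = next_item_examples(["a", "b", "c", "d"])
for prefix, target in examples:
    print(prefix, "->", target)
```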
Contextual sequence prediction
● Input: sequence of contextual user actions, plus
current context
● Output: probability of next action
● E.g. “Given all the actions a user has taken so far,
what’s the most likely video they’re going to play right
now?”
● E.g. [Smirnova & Vasile, 2017], [Beutel et al., 2018]
Contextual sequence data
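(figure: per-user sequence of context and action pairs over time, at timestamps)
2017-12-10 15:40:22
2017-12-23 19:32:10
2017-12-24 12:05:53
2017-12-27 22:40:22
2017-12-29 19:39:36
2017-12-30 20:42:13
(the next action, marked “?”, is what is to be predicted)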
Time-sensitive sequence prediction
● Recommendations are actions at a moment in time
○ Proper modeling of time and system dynamics is critical
● Experiment on a Netflix internal dataset
○ Context:
■ Discrete time
● Day-of-week: Sunday, Monday, …
● Hour-of-day
■ Continuous time (Timestamp)
○ Predict next play (temporal split data)
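The context features listed above can be derived from a raw timestamp as in the sketch below. It reuses timestamps in the format shown on the earlier slide; treating them as UTC and the names of the returned fields are assumptions.

```python
from datetime import datetime, timezone

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def time_context(timestamp_text):
    """Build discrete and continuous time features from one timestamp."""
    ts = datetime.strptime(timestamp_text, "%Y-%m-%d %H:%M:%S")
    ts = ts.replace(tzinfo=timezone.utc)  # assumption: timestamps are UTC
    return {
        "day_of_week": DAY_NAMES[ts.weekday()],  # discrete time
        "hour_of_day": ts.hour,                  # discrete time
        "timestamp": ts.timestamp(),             # continuous time, in seconds
    }

first = time_context("2017-12-10 15:40:22")
later = time_context("2017-12-27 22:40:22")
print(first["day_of_week"], first["hour_of_day"])
print(later["timestamp"] - first["timestamp"])  # elapsed seconds
```

The discrete features capture periodic habits such as weekday evenings, while the continuous timestamp lets a model account for how much time has passed between actions.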
Results
Other framings
● Causality in recommendations
○ Explicitly modeling the consequences of a recommender system’s intervention
[Schnabel et al., 2016]
● Recommendation as question answering
○ E.g. “I loved Billy Madison, My Neighbor Totoro, Blades of Glory, Bio-Dome,
Clue, and Happy Gilmore. I’m looking for a Music movie.” [Dodge et al., 2016]
● Deep Reinforcement Learning for
recommendations [Zhao et al., 2017]
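For the causality framing, one common tool is inverse propensity scoring: each observed interaction is reweighted by the inverse of the probability that it was observed at all, which corrects for the recommender having chosen what users saw. The sketch below is an illustration under assumptions (known propensities, squared error, hypothetical names), not a reproduction of any cited method.

```python
def ips_mse(observed, predictions, propensities, num_users, num_items):
    """Inverse-propensity-weighted squared error over observed pairs.

    observed:     {(user, item): true value} for pairs that were observed
    predictions:  {(user, item): predicted value}
    propensities: {(user, item): probability that the pair was observed}
    """
    total = 0.0
    for pair, truth in observed.items():
        total += (truth - predictions[pair]) ** 2 / propensities[pair]
    # Normalize by the full user-item grid, not only the observed pairs.
    return total / (num_users * num_items)

# Made-up values for a 2 x 2 user-item grid with two observed pairs.
observed = {(0, 0): 1.0, (1, 1): 0.0}
predictions = {(0, 0): 0.5, (1, 1): 0.5}
propensities = {(0, 0): 0.5, (1, 1): 0.25}
print(ips_mse(observed, predictions, propensities, num_users=2, num_items=2))
```

Pairs that were unlikely to be shown get larger weights, so they stand in for the many similar pairs that were never observed.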
Conclusion
Takeaways
● Deep Learning can work well for Recommendations... when you go
beyond the classic problem definition
● Similarities between DL and MF are a good thing: Lots of MF work
can be translated to DL
● Lots of open areas to improve recommendations using deep
learning
● Think beyond solving existing problems with new tools and instead
ask what new problems they can solve
More Resources
● RecSys 2017 tutorial by Karatzoglou and Hidasi
● RecSys Summer School slides by Hidasi
● DLRS Workshop 2016, 2017
● Recommenders Shallow/Deep by Sudeep Das
● Survey paper by Zhang, Yao & Sun
● GitHub repo of papers by Nandi
Thank you.
@moustaki @JustinBasilico
Yves Raimond & Justin Basilico
Yes, we’re hiring...
