RNNs for
Recommendation &
Personalization
Nick Pentreath
Principal Engineer
@MLnick
DBG / May 2, 2018 / © 2018 IBM Corporation
About
@MLnick on Twitter & GitHub
Principal Engineer, IBM
CODAIT - Center for Open-Source Data &
AI Technologies
Machine Learning & AI
Apache Spark committer & PMC
Author of Machine Learning with Spark
Various conferences & meetups
Center for Open Source Data
and AI Technologies
CODAIT
codait.org
CODAIT aims to make AI solutions
dramatically easier to create, deploy,
and manage in the enterprise
Relaunch of the Spark Technology
Center (STC) to reflect expanded
mission
Improving Enterprise AI Lifecycle in Open Source
[Figure: AI lifecycle stages (Gather Data, Analyze Data, Machine Learning, Deep Learning, Deploy Model, Maintain Model) with open-source tools: Python Data Science Stack, Pandas, Scikit-Learn, Apache Spark, Jupyter, Keras + TensorFlow, Fabric for Deep Learning (FfDL), MLeap + PFA, Model Asset eXchange]
Agenda
Recommender systems overview
Deep learning and RNNs
RNNs for recommendations
Challenges and future directions
Recommender Systems
Users and Items
Recommender Systems
Events
Recommender Systems
Implicit preference data
– Online: page view, click, app interaction
– Commerce: cart, purchase, return
– Media: preview, watch, listen
Explicit preference data
– Ratings, reviews
Intent
– Search query
Social
– Like, share, follow, unfollow, block
Context
Recommender Systems
Prediction
Recommender Systems
Prediction is ranking
– Given a user and context, rank the available items in order
of likelihood that the user will interact with them
Sort
items
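A minimal sketch of prediction as ranking, assuming some model has already produced one score per item for a given user and context. The scores, the `exclude` argument and the helper name `rank_items` are hypothetical and only illustrate the sort step.

```python
import numpy as np

def rank_items(scores, k=3, exclude=()):
    """Return indices of the top-k items by descending score.

    `scores` is assumed to hold one model score per item for a single
    user/context; `exclude` lists item indices to drop (e.g. already seen).
    """
    scores = np.asarray(scores, dtype=float).copy()
    if len(exclude) > 0:
        scores[list(exclude)] = -np.inf          # never recommend excluded items
    order = np.argsort(-scores, kind="stable")   # sort items, highest score first
    return order[:k].tolist()

# Hypothetical scores for five items; item 1 is excluded
top = rank_items([0.1, 0.9, 0.4, 0.7, 0.2], k=3, exclude=[1])
```

With a large catalogue a full sort is usually avoided in favour of a partial top-k selection, but the idea is the same.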
Matrix Factorization
Recommender Systems
The de facto standard model
– Represent user ratings as a user-item matrix
– Find two smaller matrices (called the factor
matrices) that approximate the full matrix
– Minimize the reconstruction error (i.e. rating
prediction / completion)
– Efficient, scalable algorithms
• Gradient Descent
• Alternating Least Squares (ALS)
– Prediction is simple
– Can handle implicit data
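The factorization idea can be sketched with plain stochastic gradient descent in NumPy. This is an illustration under assumptions, not the method of any particular library: the toy ratings, the hyperparameters and the names `factorize` and `mse` are all made up for the example, and only observed entries contribute to the error.

```python
import numpy as np

def factorize(ratings, rank=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Fit user and item factor matrices to (user, item, rating) triples."""
    rng = np.random.default_rng(seed)
    n_users = max(u for u, _, _ in ratings) + 1
    n_items = max(i for _, i, _ in ratings) + 1
    P = rng.normal(scale=0.1, size=(n_users, rank))   # user factors
    Q = rng.normal(scale=0.1, size=(n_items, rank))   # item factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]                     # error on one observed entry
            pu = P[u].copy()                          # keep old value for the Q update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q

def mse(ratings, P, Q):
    """Mean squared reconstruction error over the observed entries."""
    return float(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings]))

# Hypothetical toy data: (user, item, rating)
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = factorize(ratings)
predicted = P[0] @ Q[2]   # prediction for an unobserved (user, item) pair
```

Prediction is a single dot product between a user vector and an item vector, which is what makes scoring simple once the factors are learned.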
Cold Start
Recommender Systems
New items
– No historical interaction data
– Typically use baselines (e.g. popularity) or item content
New (or unknown) users
– Previously unseen or anonymous users have no user
profile or historical interactions
– Have context data (but possibly very limited)
– Cannot directly use collaborative filtering models
• Item-similarity for current item
• Represent session as aggregation of items
• Contextual models can incorporate short-term history
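One way to picture "represent session as aggregation of items" is to average the vectors of the items seen so far and score the catalogue by cosine similarity. The sketch below assumes item vectors already exist (for example from a factorization model) and that none is all zeros; the vectors and the name `recommend_for_session` are hypothetical.

```python
import numpy as np

def recommend_for_session(item_vectors, session_items, k=2):
    """Score all items by cosine similarity to the mean vector of the session."""
    V = np.asarray(item_vectors, dtype=float)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)   # unit-normalize each item vector
    session_vec = V[session_items].mean(axis=0)        # aggregate the session's items
    scores = V @ session_vec
    scores[session_items] = -np.inf                    # skip items already in the session
    return np.argsort(-scores, kind="stable")[:k].tolist()

# Hypothetical 2-d item vectors: items 0-2 point one way, items 3-4 another
item_vectors = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [0.0, 1.0], [0.1, 0.9]]
recs = recommend_for_session(item_vectors, session_items=[0, 1], k=2)
```

No user profile is needed, so the same function works for anonymous or previously unseen users.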
Deep Learning and
Recurrent Neural Networks
Overview
Deep Learning
Original theory from 1940s; computer models
originated around 1960s; fell out of favor in
1980s/90s
Recent resurgence due to
– Bigger (and better) data; standard datasets (e.g. ImageNet)
– Better hardware (GPUs)
– Improvements to algorithms, architectures and optimization
Leading to new state-of-the-art results in
computer vision (images and video);
speech/text; language translation and more
Source: Wikipedia
Modern Neural Networks
Deep Learning
Deep (multi-layer) networks
Computer vision
– Convolutional neural networks (CNNs)
– Image classification, object detection, segmentation
Sequences and time-series
– Recurrent neural networks (RNNs)
– Machine translation, text generation
– LSTMs, GRUs
Embeddings
– Text, categorical features
Deep learning frameworks
– Flexibility, computation graphs, auto-differentiation, GPUs
Source: Stanford CS231n
Recurrent Neural Networks
Deep Learning
Neural Network on Sequences …
– … sequence of neural network (layers)
– Hidden layers (state) dependent on previous state as well as
current input
– “memory” of what came before
Source: Stanford CS231n
– Share weights across all time steps
– Training using backpropagation through time (BPTT)
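The recurrence described above can be sketched as a forward pass in NumPy. The sizes, the random weights and the name `rnn_forward` are assumptions for illustration; the point is that the same weight matrices are reused at every time step and each state depends on the previous state and the current input.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h, h0=None):
    """Run a simple tanh RNN over a sequence, sharing weights across time steps."""
    h = np.zeros(W_hh.shape[0]) if h0 is None else h0
    states = []
    for x in xs:                                  # one time step per input vector
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)    # new state from input and previous state
        states.append(h)
    return np.stack(states)                       # shape: (steps, hidden_dim)

rng = np.random.default_rng(0)
input_dim, hidden_dim, steps = 4, 3, 5            # illustrative sizes
W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)
xs = rng.normal(size=(steps, input_dim))
states = rnn_forward(xs, W_xh, W_hh, b_h)
```

Training with BPTT would differentiate through this loop; the sketch covers the forward pass only.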
Recurrent Neural Networks
Source: Andrej Karpathy
Deep Learning
Recurrent Neural Networks
Deep Learning
Issues
– Exploding gradients - clip / scale gradients
– Vanishing gradients
Source: Stanford CS231n
Solutions
– Truncated BPTT
– Restrict sequence length
– Cannot encode very long term memory
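Clipping (scaling) gradients to tame exploding gradients can be sketched as follows. The gradient values and the name `clip_by_global_norm` are hypothetical; deep learning frameworks ship their own equivalents, which would normally be used instead.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Scale a list of gradient arrays so their joint L2 norm is at most max_norm."""
    total = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    scale = min(1.0, max_norm / (total + 1e-12))   # leave small gradients untouched
    return [g * scale for g in grads], total

# Hypothetical gradients with global norm sqrt(9 + 16 + 144) = 13
grads = [np.array([3.0, 4.0]), np.array([12.0])]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
```

Scaling all gradients by the same factor keeps the update direction and only limits its size.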
Recurrent Neural Networks
Deep Learning
Long Short Term Memory (LSTM)
– Replace simple RNN layer (activation) with an LSTM cell
– Cell has 3 gates - Input (i), Forget (f), Output (o)
– Activation (g)
– Backpropagation through the cell state depends only on elementwise
operations (no matrix multiplication by W)
Gated Recurrent Unit (GRU)
– Effectively a simplified version of LSTM
– 2 gates instead of 3 - the input and forget gates are combined into an
update gate, alongside a reset gate. No output gate
GRU has fewer parameters, LSTM may be more
expressive
Source: Stanford CS231n; Hochreiter et al.
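A single LSTM step with the three gates (i, f, o) and the activation (g) can be sketched in NumPy. The stacked weight layout, the sizes and the name `lstm_step` are assumptions made for the example rather than any framework's API.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W maps [h_prev, x] to stacked pre-activations of i, f, o, g."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    o = sigmoid(z[2 * H:3 * H])    # output gate
    g = np.tanh(z[3 * H:4 * H])    # candidate activation
    c = f * c_prev + i * g         # cell state: elementwise operations only
    h = o * np.tanh(c)             # hidden state exposed to the next layer
    return h, c

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3       # illustrative sizes
W = rng.normal(scale=0.1, size=(4 * hidden_dim, hidden_dim + input_dim))
b = np.zeros(4 * hidden_dim)
h, c = lstm_step(rng.normal(size=input_dim), np.zeros(hidden_dim), np.zeros(hidden_dim), W, b)
```

The cell-state line uses only elementwise products and sums, so the gradient flowing from one cell state to the previous one is scaled by the forget gate rather than repeatedly multiplied by W.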
Recurrent Neural Networks
Deep Learning
Variants
– Multi-layer (deep) RNNs
– Bi-directional
– Deep bi-directional
– Attention
Source: Stanford CS231n; Denny Britz
RNNs for Recommendations
Deep Learning for Recommenders Overview
RNNs for Recommendations
Most approaches have focused on combining
– Performance of collaborative filtering models
(especially matrix factorization)
• Embeddings with appropriate loss = MF
– Power of deep learning for feature extraction
• CNNs for image content, audio, etc.
• Embeddings for categorical features
• Linear models for interactions
• RNNs for text
Source: Spotify / Sander Dieleman
Google Research
Session-based recommendation
RNNs for Recommendations
Apply the advances in sequence modeling
from deep learning
– RNN architectures trained on the sequence of
user events in a session (e.g. products viewed,
purchased) to predict next item in session
– Adjustments for domain
• Item encoding (1-of-N, weighted average)
• Parallel mini-batch processing
• Ranking losses – BPR, TOP1
• Negative item sampling per mini-batch
– Report 20-30% accuracy gain over baselines
Source: Hidasi, Karatzoglou, Baltrunas, Tikk
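The two pairwise ranking losses named above can be sketched for a single target item scored against sampled negatives. The formulas follow the usual statement of BPR and TOP1 in the session-based RNN paper listed in the references; the function names and example scores are hypothetical, and a real implementation would work on mini-batches inside a deep learning framework.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bpr_loss(pos_score, neg_scores):
    """BPR: negative mean log-sigmoid of (positive score - negative score)."""
    neg_scores = np.asarray(neg_scores, dtype=float)
    return float(-np.mean(np.log(sigmoid(pos_score - neg_scores))))

def top1_loss(pos_score, neg_scores):
    """TOP1: relative-rank term plus a term pushing negative scores toward zero."""
    neg_scores = np.asarray(neg_scores, dtype=float)
    return float(np.mean(sigmoid(neg_scores - pos_score) + sigmoid(neg_scores ** 2)))

# Hypothetical scores: the target item against two sampled negative items
good = bpr_loss(2.0, [0.0, -1.0])   # target ranked well above the negatives
bad = bpr_loss(0.0, [0.0, -1.0])    # target barely above the negatives
```

Both losses only compare scores, which suits ranking: the absolute score values matter less than the order they induce.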
Contextual Session-based models
RNNs for Recommendations
Add contextual data to the RNN architecture
– Context included time, time since last event,
event type
– Combine context data with input / output layer
– Also combine context with the RNN layers
– About 3-6% improvement (in Recall@10 metric)
over simple RNN baseline
– Importantly, model is even better at predicting
sales (vs view, add to cart events) and at
predicting new / fresh items (vs items the user
has already seen)
Source: Smirnova, Vasile
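For reference, Recall@K in next-item evaluation is commonly computed as the share of test events whose true next item appears in the top K of the ranked list. The sketch below assumes that definition; the ranked lists, targets and the name `recall_at_k` are made up for the example.

```python
def recall_at_k(ranked_lists, targets, k=10):
    """Fraction of test events whose target item is among the top-k ranked items."""
    hits = sum(1 for ranked, target in zip(ranked_lists, targets)
               if target in ranked[:k])
    return hits / len(targets)

# Hypothetical ranked item ids for three test events and their true next items
ranked_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
targets = [2, 6, 0]
r2 = recall_at_k(ranked_lists, targets, k=2)
r3 = recall_at_k(ranked_lists, targets, k=3)
```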
Content and Session-based models
RNNs for Recommendations
Add content data to the RNN architecture
– Parallel RNN (p-RNN)
– Follows trend in combining DL architectures for
content feature extraction with CF models for
interaction data
• CNN for image data
• BOW for text (alternatives are Word2Vec-style models
and RNN language models)
– Some training tricks
• Alternating – keep one subnet fixed, train other
• Residual – subnets trained on residual error
• Interleaved – alternating training per mini-batch
Source: Hidasi, Quadrana, Karatzoglou, Tikk
3D CNNs for Session-based Recommendation
RNNs for Recommendations
As we’ve seen in text / NLP, CNNs can also
be effective in modeling sequences
– 3D convolutional models have been applied in
video classification
– Potentially faster to train, easier to understand
– Use character-level encoding of IDs and item
features (name, description, categories)
• Compact representation
• No embedding layer
– “ResNet” style architecture
– Show improvement over p-RNN
Source: Tuan, Phuong
Challenges
Challenges particular to recommendation
models
– Data size and dimensionality (input & output)
• Sampling
– Extreme sparsity
• Embeddings & compressed representations
– Wide variety of specialized settings
– Combining session, content, context and
preference data
– Model serving is difficult – ranking, large number of
items, computationally expensive
– Metrics – model accuracy and its relation to real-
world outcomes and behaviors
– Need for standard, open, large-scale datasets that
have time and session data and are content- and
context-rich
• RecSys 2015 Challenge – YooChoose dataset
– Evaluation – watch your baselines!
• When Recurrent Neural Networks meet the
Neighborhood for Session-Based Recommendation
Challenges and Future Directions
Future Directions
Challenges and Future Directions
Most recent and future directions in research
& industry
– Improved RNNs
• Cross-session models (e.g. Hierarchical RNN)
• Further research on contextual models, as well as
content and metadata
• Attention models
– Combine sequence and historical models (long-
and short-term user profiles)
• Personalizing session-based models
– Applications at scale
• Dimensionality reduction techniques (e.g.
Bloom embeddings for large input/output
spaces)
• Compressed encodings for users and items
• Distributed training
• Efficient model serving for complex
architectures
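The idea behind Bloom-style embeddings for large input/output spaces can be sketched loosely: each item id is hashed to a few positions in a much smaller vector, and items are scored at decode time from the model's outputs at those positions. The salted MD5 hashing, the sizes and all function names below are assumptions for illustration, not the scheme of any specific paper or library.

```python
import hashlib
import numpy as np

def bloom_positions(item_id, m, k):
    """Map an item id to k positions in [0, m) using k salted hashes."""
    return [int(hashlib.md5(f"{j}:{item_id}".encode()).hexdigest(), 16) % m
            for j in range(k)]

def bloom_encode(item_ids, m, k):
    """Encode a set of item ids as an m-dimensional binary vector."""
    vec = np.zeros(m)
    for item_id in item_ids:
        vec[bloom_positions(item_id, m, k)] = 1.0
    return vec

def bloom_decode_scores(probs, n_items, k):
    """Score each item by the summed log-probability at its hashed positions."""
    m = len(probs)
    logp = np.log(np.asarray(probs, dtype=float) + 1e-12)
    return np.array([logp[bloom_positions(i, m, k)].sum() for i in range(n_items)])

# Hypothetical sizes: 100 items compressed into a 64-dimensional space
vec = bloom_encode([7], m=64, k=3)
probs = 0.9 * vec + 0.01              # stand-in for a model's output layer
scores = bloom_decode_scores(probs, n_items=100, k=3)
```

Hash collisions mean several items can share positions, so the compression trades some accuracy for a much smaller input and output layer.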
Summary
Challenges and Future Directions
DL for recommendation is just getting started
(again)
– Huge increase in interest, research papers.
Already many new models and approaches
– DL approaches have generally yielded
incremental % gains
• But that can translate to significant $$$
• More pronounced in session-based
– Cold start scenarios benefit from multi-modal
nature of DL models and explicit modeling of
sequences
– Flexibility of DL frameworks helps a lot
– Benefits from advances in DL for images, video,
NLP etc.
– Open-source libraries appearing (e.g. Spotlight)
– Check out DLRS workshops & tutorials @
RecSys 2016 / 2017, and upcoming in Oct, 2018
– RecSys challenges
Thank you!
codait.org
twitter.com/MLnick
github.com/MLnick
developer.ibm.com/code
FfDL
Sign up for IBM Cloud and try Watson Studio!
https://ibm.biz/BdZgcx
https://datascience.ibm.com/
MAX
Links & References
Wikipedia: Perceptron
Stanford CS231n Convolutional Neural Networks for Visual
Recognition
Stanford CS231n – RNN Slides
Recurrent Neural Networks Tutorial
The Unreasonable Effectiveness of Recurrent Neural
Networks
Understanding LSTM Networks
Learning Phrase Representations using RNN Encoder-
Decoder for Statistical Machine Translation
Long short-term memory
Attention and Augmented Recurrent Neural Networks
Links & References
Deep Content-based Music Recommendation
Google’s Wide and Deep Learning Model
Deep Learning for Recommender Systems Workshops @
RecSys
Deep Learning for Recommender Systems Tutorial @
RecSys 2017
Session-based Recommendations with Recurrent Neural
Networks
Recurrent Neural Networks with Top-k Gains for Session-
based Recommendations
Sequential User-based Recurrent Neural Network
Recommendations
Links & References
Personalizing Session-based Recommendations with
Hierarchical Recurrent Neural Networks
Parallel Recurrent Neural Network Architectures for Feature-
rich Session-based Recommendations
Contextual Sequence Modeling for Recommendation with
Recurrent Neural Networks
When Recurrent Neural Networks meet the Neighborhood
for Session-Based Recommendation
3D Convolutional Networks for Session-based
Recommendation with Content Features
Spotlight: Recommendation models in PyTorch
RecSys 2015 Challenge – YooChoose Dataset

Más contenido relacionado

La actualidad más candente

Graphs and Financial Services Analytics
Graphs and Financial Services AnalyticsGraphs and Financial Services Analytics
Graphs and Financial Services AnalyticsNeo4j
 
Neo4j MeetUp - Graph Exploration with MetaExp
Neo4j MeetUp - Graph Exploration with MetaExpNeo4j MeetUp - Graph Exploration with MetaExp
Neo4j MeetUp - Graph Exploration with MetaExpAdrian Ziegler
 
Fast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareFast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareTigerGraph
 
Graph Data Science at Scale
Graph Data Science at ScaleGraph Data Science at Scale
Graph Data Science at ScaleNeo4j
 
Graph Gurus Episode 17: Seven Key Data Science Capabilities Powered by a Nati...
Graph Gurus Episode 17: Seven Key Data Science Capabilities Powered by a Nati...Graph Gurus Episode 17: Seven Key Data Science Capabilities Powered by a Nati...
Graph Gurus Episode 17: Seven Key Data Science Capabilities Powered by a Nati...TigerGraph
 
Large-Scale Machine Learning at Twitter
Large-Scale Machine Learning at TwitterLarge-Scale Machine Learning at Twitter
Large-Scale Machine Learning at Twitternep_test_account
 
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...dbpublications
 
Vertex Perspectives | AI Optimized Chipsets | Part II
Vertex Perspectives | AI Optimized Chipsets | Part IIVertex Perspectives | AI Optimized Chipsets | Part II
Vertex Perspectives | AI Optimized Chipsets | Part IIVertex Holdings
 
Neo4j Graph Data Science Training - June 9 & 10 - Slides #6 Graph Algorithms
Neo4j Graph Data Science Training - June 9 & 10 - Slides #6 Graph AlgorithmsNeo4j Graph Data Science Training - June 9 & 10 - Slides #6 Graph Algorithms
Neo4j Graph Data Science Training - June 9 & 10 - Slides #6 Graph AlgorithmsNeo4j
 
FPGA Implementation of High Speed 8bit Vedic Multiplier using Barrel Shifter
FPGA Implementation of High Speed 8bit Vedic Multiplier using Barrel ShifterFPGA Implementation of High Speed 8bit Vedic Multiplier using Barrel Shifter
FPGA Implementation of High Speed 8bit Vedic Multiplier using Barrel Shifterdbpublications
 
A Survey on Graph Database Management Techniques for Huge Unstructured Data
A Survey on Graph Database Management Techniques for Huge Unstructured Data A Survey on Graph Database Management Techniques for Huge Unstructured Data
A Survey on Graph Database Management Techniques for Huge Unstructured Data IJECEIAES
 
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetGraph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetTigerGraph
 
Plume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryPlume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryTigerGraph
 
Graph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise GraphGraph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise GraphTigerGraph
 
Vertex Perspectives | AI Optimized Chipsets | Part IV
Vertex Perspectives | AI Optimized Chipsets | Part IVVertex Perspectives | AI Optimized Chipsets | Part IV
Vertex Perspectives | AI Optimized Chipsets | Part IVVertex Holdings
 
Deep Learning for Autonomous Driving
Deep Learning for Autonomous DrivingDeep Learning for Autonomous Driving
Deep Learning for Autonomous DrivingJan Wiegelmann
 
Scaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analyticsScaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analyticsConnected Data World
 

La actualidad más candente (20)

Graphs and Financial Services Analytics
Graphs and Financial Services AnalyticsGraphs and Financial Services Analytics
Graphs and Financial Services Analytics
 
Neo4j MeetUp - Graph Exploration with MetaExp
Neo4j MeetUp - Graph Exploration with MetaExpNeo4j MeetUp - Graph Exploration with MetaExp
Neo4j MeetUp - Graph Exploration with MetaExp
 
Fast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareFast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA Hardware
 
Graph Data Science at Scale
Graph Data Science at ScaleGraph Data Science at Scale
Graph Data Science at Scale
 
Graph Gurus Episode 17: Seven Key Data Science Capabilities Powered by a Nati...
Graph Gurus Episode 17: Seven Key Data Science Capabilities Powered by a Nati...Graph Gurus Episode 17: Seven Key Data Science Capabilities Powered by a Nati...
Graph Gurus Episode 17: Seven Key Data Science Capabilities Powered by a Nati...
 
Large-Scale Machine Learning at Twitter
Large-Scale Machine Learning at TwitterLarge-Scale Machine Learning at Twitter
Large-Scale Machine Learning at Twitter
 
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
 
Vertex Perspectives | AI Optimized Chipsets | Part II
Vertex Perspectives | AI Optimized Chipsets | Part IIVertex Perspectives | AI Optimized Chipsets | Part II
Vertex Perspectives | AI Optimized Chipsets | Part II
 
Neo4j Graph Data Science Training - June 9 & 10 - Slides #6 Graph Algorithms
Neo4j Graph Data Science Training - June 9 & 10 - Slides #6 Graph AlgorithmsNeo4j Graph Data Science Training - June 9 & 10 - Slides #6 Graph Algorithms
Neo4j Graph Data Science Training - June 9 & 10 - Slides #6 Graph Algorithms
 
FPGA Implementation of High Speed 8bit Vedic Multiplier using Barrel Shifter
FPGA Implementation of High Speed 8bit Vedic Multiplier using Barrel ShifterFPGA Implementation of High Speed 8bit Vedic Multiplier using Barrel Shifter
FPGA Implementation of High Speed 8bit Vedic Multiplier using Barrel Shifter
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
A Survey on Graph Database Management Techniques for Huge Unstructured Data
A Survey on Graph Database Management Techniques for Huge Unstructured Data A Survey on Graph Database Management Techniques for Huge Unstructured Data
A Survey on Graph Database Management Techniques for Huge Unstructured Data
 
Graph Analytics
Graph AnalyticsGraph Analytics
Graph Analytics
 
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 DatasetGraph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
Graph Gurus Episode 37: Modeling for Kaggle COVID-19 Dataset
 
Big Data Analytics With MATLAB
Big Data Analytics With MATLABBig Data Analytics With MATLAB
Big Data Analytics With MATLAB
 
Plume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis LibraryPlume - A Code Property Graph Extraction and Analysis Library
Plume - A Code Property Graph Extraction and Analysis Library
 
Graph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise GraphGraph Gurus Episode 1: Enterprise Graph
Graph Gurus Episode 1: Enterprise Graph
 
Vertex Perspectives | AI Optimized Chipsets | Part IV
Vertex Perspectives | AI Optimized Chipsets | Part IVVertex Perspectives | AI Optimized Chipsets | Part IV
Vertex Perspectives | AI Optimized Chipsets | Part IV
 
Deep Learning for Autonomous Driving
Deep Learning for Autonomous DrivingDeep Learning for Autonomous Driving
Deep Learning for Autonomous Driving
 
Scaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analyticsScaling up business value with real-time operational graph analytics
Scaling up business value with real-time operational graph analytics
 

Similar a RNNs for Recommendations and Personalization

Deep Learning for Recommender Systems with Nick pentreath
Deep Learning for Recommender Systems with Nick pentreathDeep Learning for Recommender Systems with Nick pentreath
Deep Learning for Recommender Systems with Nick pentreathDatabricks
 
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in VacouverNGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in VacouverScott Shadley, MBA,PMC-III
 
Search and Recommendations: 3 Sides of the Same Coin
Search and Recommendations: 3 Sides of the Same CoinSearch and Recommendations: 3 Sides of the Same Coin
Search and Recommendations: 3 Sides of the Same CoinNick Pentreath
 
STOCK MARKET PREDICTION USING NEURAL NETWORKS
STOCK MARKET PREDICTION USING NEURAL NETWORKSSTOCK MARKET PREDICTION USING NEURAL NETWORKS
STOCK MARKET PREDICTION USING NEURAL NETWORKSIRJET Journal
 
Data-driven AI for Self-Adaptive Software Systems
Data-driven AI for Self-Adaptive Software SystemsData-driven AI for Self-Adaptive Software Systems
Data-driven AI for Self-Adaptive Software SystemsAndreas Metzger
 
Clustering of Big Data Using Different Data-Mining Techniques
Clustering of Big Data Using Different Data-Mining TechniquesClustering of Big Data Using Different Data-Mining Techniques
Clustering of Big Data Using Different Data-Mining TechniquesIRJET Journal
 
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...Databricks
 
The Bitter Lesson of ML Pipelines
The Bitter Lesson of ML Pipelines The Bitter Lesson of ML Pipelines
The Bitter Lesson of ML Pipelines Jim Dowling
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2Mohit Garg
 
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliL'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliData Driven Innovation
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationDenodo
 
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...Alex Liu
 
Spark Summit EU talk by Bas Geerdink
Spark Summit EU talk by Bas GeerdinkSpark Summit EU talk by Bas Geerdink
Spark Summit EU talk by Bas GeerdinkSpark Summit
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine LearningYuriy Guts
 
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...MaximilianHoffmann7
 
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...Srivatsan Ramanujam
 
The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...NECST Lab @ Politecnico di Milano
 
2013 International Conference on Knowledge, Innovation and Enterprise Presen...
2013  International Conference on Knowledge, Innovation and Enterprise Presen...2013  International Conference on Knowledge, Innovation and Enterprise Presen...
2013 International Conference on Knowledge, Innovation and Enterprise Presen...oj08
 

Similar a RNNs for Recommendations and Personalization (20)

Deep Learning for Recommender Systems with Nick pentreath
Deep Learning for Recommender Systems with Nick pentreathDeep Learning for Recommender Systems with Nick pentreath
Deep Learning for Recommender Systems with Nick pentreath
 
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in VacouverNGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
 
Search and Recommendations: 3 Sides of the Same Coin
Search and Recommendations: 3 Sides of the Same CoinSearch and Recommendations: 3 Sides of the Same Coin
Search and Recommendations: 3 Sides of the Same Coin
 
STOCK MARKET PREDICTION USING NEURAL NETWORKS
STOCK MARKET PREDICTION USING NEURAL NETWORKSSTOCK MARKET PREDICTION USING NEURAL NETWORKS
STOCK MARKET PREDICTION USING NEURAL NETWORKS
 
Data-driven AI for Self-Adaptive Software Systems
Data-driven AI for Self-Adaptive Software SystemsData-driven AI for Self-Adaptive Software Systems
Data-driven AI for Self-Adaptive Software Systems
 
Clustering of Big Data Using Different Data-Mining Techniques
Clustering of Big Data Using Different Data-Mining TechniquesClustering of Big Data Using Different Data-Mining Techniques
Clustering of Big Data Using Different Data-Mining Techniques
 
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
 
Shikha fdp 62_14july2017
Shikha fdp 62_14july2017Shikha fdp 62_14july2017
Shikha fdp 62_14july2017
 
The Bitter Lesson of ML Pipelines
The Bitter Lesson of ML Pipelines The Bitter Lesson of ML Pipelines
The Bitter Lesson of ML Pipelines
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliL'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
 
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
 
Spark Summit EU talk by Bas Geerdink
Spark Summit EU talk by Bas GeerdinkSpark Summit EU talk by Bas Geerdink
Spark Summit EU talk by Bas Geerdink
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine Learning
 
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...
 
BigData Analysis
BigData AnalysisBigData Analysis
BigData Analysis
 
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ...
 
The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...
 
2013 International Conference on Knowledge, Innovation and Enterprise Presen...
2013  International Conference on Knowledge, Innovation and Enterprise Presen...2013  International Conference on Knowledge, Innovation and Enterprise Presen...
2013 International Conference on Knowledge, Innovation and Enterprise Presen...
 

RNNs for Recommendations and Personalization

  • 1. RNNs for Recommendation & Personalization Nick Pentreath Principal Engineer @MLnick DBG / May 2, 2018 / © 2018 IBM Corporation
  • 2. About: @MLnick on Twitter & GitHub; Principal Engineer, IBM; CODAIT - Center for Open-Source Data & AI Technologies; Machine Learning & AI; Apache Spark committer & PMC; Author of Machine Learning with Spark; Various conferences & meetups
  • 3. CODAIT: Center for Open Source Data and AI Technologies (codait.org). CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise. Relaunch of the Spark Technology Center (STC) to reflect expanded mission. [Diagram: Improving Enterprise AI Lifecycle in Open Source. Stages: Gather Data, Analyze Data, Machine Learning, Deep Learning, Deploy Model, Maintain Model. Projects shown: Python Data Science Stack, Fabric for Deep Learning (FfDL), MLeap + PFA, Scikit-Learn, Pandas, Apache Spark, Jupyter, Model Asset eXchange, Keras + TensorFlow.]
  • 4. Agenda: Recommender systems overview; Deep learning and RNNs; RNNs for recommendations; Challenges and future directions
  • 5. Recommender Systems
  • 6. Recommender Systems: Users and Items
  • 7. Recommender Systems: Events. Implicit preference data – Online: page view, click, app interaction – Commerce: cart, purchase, return – Media: preview, watch, listen. Explicit preference data – Ratings, reviews. Intent – Search query. Social – Like, share, follow, unfollow, block
  • 8. Recommender Systems: Context
  • 9. Recommender Systems: Prediction. Prediction is ranking – Given a user and context, rank the available items in order of likelihood that the user will interact with them (sort items)
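A minimal sketch of "prediction is ranking", assuming some model has already produced a score per item for one user and context. The item ids, the scores and the helper name `rank_items` are hypothetical and only illustrate the sort step.

```python
import heapq

def rank_items(scores, k=3):
    """Return the k item ids with the highest predicted score, best first."""
    # scores: dict mapping item id -> predicted likelihood of interaction
    return heapq.nlargest(k, scores, key=scores.get)

# Made-up scores for a single user and context
scores = {"item_a": 0.12, "item_b": 0.87, "item_c": 0.45, "item_d": 0.66}
top = rank_items(scores, k=3)
print(top)
```

Using a partial sort such as `heapq.nlargest` avoids fully sorting the catalogue when only a short list is shown to the user.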
  • 10. Recommender Systems: Matrix Factorization. The de facto standard model – Represent user ratings as a user-item matrix – Find two smaller matrices (called the factor matrices) that approximate the full matrix – Minimize the reconstruction error (i.e. rating prediction / completion) – Efficient, scalable algorithms • Gradient Descent • Alternating Least Squares (ALS) – Prediction is simple – Can handle implicit data
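The factorization idea on this slide can be sketched with plain stochastic gradient descent on the observed ratings. This is an illustrative toy under stated assumptions, not the ALS implementation found in Apache Spark: the function names, the hyperparameters (`rank`, `lr`, `reg`, `epochs`) and the tiny ratings list are all made up for the example.

```python
import random

def factorize(ratings, n_users, n_items, rank=2, lr=0.05, reg=0.01,
              epochs=500, seed=0):
    """Learn user factors U and item factors V so that U[u] . V[i] ~ rating."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for f in range(rank):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on squared error with L2 regularization
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V

def predict(U, V, u, i):
    """Predicted rating is the dot product of the two factor vectors."""
    return sum(a * b for a, b in zip(U[u], V[i]))

# (user, item, rating) triples; every other cell of the matrix is unobserved
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
U, V = factorize(ratings, n_users=3, n_items=3)
print(round(predict(U, V, 0, 2), 2))  # score for a cell with no observed rating
```

"Prediction is simple" shows up in `predict`: scoring a user-item pair is one dot product, so ranking a catalogue is a matrix-vector product followed by a sort.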
  • 11. Recommender Systems: Cold Start. New items – No historical interaction data – Typically use baselines (e.g. popularity) or item content. New (or unknown) users – Previously unseen or anonymous users have no user profile or historical interactions – Have context data (but possibly very limited) – Cannot directly use collaborative filtering models • Item-similarity for current item • Represent session as aggregation of items • Contextual models can incorporate short-term history
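One way to read "represent session as aggregation of items" is to average the vectors of the items seen so far and rank the remaining items by cosine similarity to that average. The sketch below assumes item vectors already exist (for example from a factorization model or from content features); the vectors and item names are invented for illustration.

```python
import math

def mean_vector(vectors):
    """Element-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend_for_session(session_items, item_vectors, k=2):
    """Rank unseen items by similarity to the averaged session vector."""
    session_vec = mean_vector([item_vectors[i] for i in session_items])
    candidates = [i for i in item_vectors if i not in session_items]
    candidates.sort(key=lambda i: cosine(session_vec, item_vectors[i]),
                    reverse=True)
    return candidates[:k]

# Hypothetical 2-dimensional item vectors
item_vectors = {
    "shoes": [1.0, 0.0], "socks": [0.9, 0.1], "boots": [0.8, 0.2],
    "laptop": [0.0, 1.0], "mouse": [0.1, 0.9],
}
recs = recommend_for_session(["shoes", "socks"], item_vectors)
print(recs)
```

No user profile is needed: an anonymous visitor gets recommendations as soon as the session contains one item, which is also the "item-similarity for current item" case.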
  • 12. Deep Learning and Recurrent Neural Networks
  • 13. Deep Learning: Overview. Original theory from 1940s; computer models originated around 1960s; fell out of favor in 1980s/90s. Recent resurgence due to – Bigger (and better) data; standard datasets (e.g. ImageNet) – Better hardware (GPUs) – Improvements to algorithms, architectures and optimization. Leading to new state-of-the-art results in computer vision (images and video); speech/text; language translation and more. Source: Wikipedia
  • 14. Deep Learning: Modern Neural Networks. Deep (multi-layer) networks. Computer vision – Convolutional neural networks (CNNs) – Image classification, object detection, segmentation. Sequences and time-series – Recurrent neural networks (RNNs) – Machine translation, text generation – LSTMs, GRUs. Embeddings – Text, categorical features. Deep learning frameworks – Flexibility, computation graphs, auto-differentiation, GPUs. Source: Stanford CS231n
  • 15. Deep Learning: Recurrent Neural Networks. Neural network on sequences … – … sequence of neural network (layers) – Hidden layers (state) dependent on previous state as well as current input – “memory” of what came before – Share weights across all time steps – Training using backpropagation through time (BPTT). Source: Stanford CS231n
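A small sketch of the recurrence described on this slide, assuming the common vanilla form h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h). The weight names and the toy numbers are illustrative choices, not taken from the talk.

```python
import math

def rnn_step(x, h_prev, W_xh, W_hh, b_h):
    """One time step: the new state depends on the input and the previous state."""
    h = []
    for j in range(len(h_prev)):
        z = b_h[j]
        z += sum(W_xh[j][k] * x[k] for k in range(len(x)))
        z += sum(W_hh[j][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(z))
    return h

def rnn_forward(xs, h0, W_xh, W_hh, b_h):
    """Run over a whole sequence, reusing the same weights at every step."""
    h, states = h0, []
    for x in xs:
        h = rnn_step(x, h, W_xh, W_hh, b_h)
        states.append(h)
    return states

# Toy weights: 2-dimensional inputs, 2-dimensional hidden state
W_xh = [[0.5, -0.3], [0.1, 0.8]]
W_hh = [[0.2, 0.0], [0.0, 0.2]]
b_h = [0.0, 0.0]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = rnn_forward(xs, [0.0, 0.0], W_xh, W_hh, b_h)
print(states[-1])  # final hidden state, a summary ("memory") of the sequence
```

Because `rnn_forward` threads `h` through every step, a gradient computed at the last step flows back through all earlier calls to `rnn_step`, which is what backpropagation through time unrolls.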
  • 16. Deep Learning: Recurrent Neural Networks. Source: Andrej Karpathy
  • 17. Deep Learning: Recurrent Neural Networks. Issues – Exploding gradients: clip / scale gradients – Vanishing gradients. Solutions – Truncated BPTT – Restrict sequence length – Cannot encode very long term memory. Source: Stanford CS231n
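The "clip / scale gradients" remedy for exploding gradients is often implemented by rescaling the whole gradient vector when its norm exceeds a threshold. A minimal sketch, assuming the gradients have been flattened into one list of floats; `max_norm` is a hypothetical tuning parameter.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their L2 norm is at most max_norm; direction is kept."""
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm:
        return list(grads)  # already small enough, leave unchanged
    scale = max_norm / total
    return [g * scale for g in grads]

clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)  # original norm is 5
print(clipped)
```

Clipping only addresses exploding gradients. Vanishing gradients need an architectural change such as the gated cells on the next slide.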
  • 18. Deep Learning: Recurrent Neural Networks. Long Short Term Memory (LSTM) – Replace simple RNN layer (activation) with an LSTM cell – Cell has 3 gates: Input (i), Forget (f), Output (o) – Activation (g) – Backpropagation depends only on elementwise operations (no matrix operations over W). Gated Recurrent Unit (GRU) – Effectively a simplified version of LSTM – 2 gates instead of 3: the input and forget gates are combined into an update gate, and there is no output gate. GRU has fewer parameters; LSTM may be more expressive. Source: Stanford CS231n; Hochreiter et al.
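For reference, the LSTM cell update in the compact notation commonly used in the Stanford CS231n material can be written as below. Bias terms are omitted, and the symbols c_t (cell state), h_t (hidden state) and x_t (input) are notation assumed here rather than quoted from the slide.

```latex
\begin{aligned}
\begin{pmatrix} i \\ f \\ o \\ g \end{pmatrix}
  &= \begin{pmatrix} \sigma \\ \sigma \\ \sigma \\ \tanh \end{pmatrix}
     W \begin{pmatrix} h_{t-1} \\ x_t \end{pmatrix} \\
c_t &= f \odot c_{t-1} + i \odot g \\
h_t &= o \odot \tanh(c_t)
\end{aligned}
```

The path from c_t back to c_{t-1} involves only the elementwise product with f, which is the sense in which backpropagation along the cell state avoids repeated multiplication by W.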
  • 19. Deep Learning: Recurrent Neural Networks. Variants – Multi-layer (deep) RNNs – Bi-directional – Deep bi-directional – Attention. Source: Stanford CS231n; Denny Britz
  • 20. RNNs for Recommendations
  • 21. RNNs for Recommendations: Deep Learning for Recommenders Overview. Most approaches have focused on combining – Performance of collaborative filtering models (especially matrix factorization) • Embeddings with appropriate loss = MF – Power of deep learning for feature extraction • CNNs for image content, audio, etc. • Embeddings for categorical features • Linear models for interactions • RNNs for text. Source: Spotify / Sander Dieleman; Google Research
  • 22. RNNs for Recommendations: Session-based recommendation. Apply the advances in sequence modeling from deep learning – RNN architectures trained on the sequence of user events in a session (e.g. products viewed, purchased) to predict next item in session – Adjustments for domain • Item encoding (1-of-N, weighted average) • Parallel mini-batch processing • Ranking losses: BPR, TOP1 • Negative item sampling per mini-batch – Report 20-30% accuracy gain over baselines. Source: Hidasi, Karatzoglou, Baltrunas, Tikk
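The two ranking losses named on this slide can be sketched for a single training example as follows, using the forms given in the session-based RNN paper by Hidasi et al.: BPR averages -log σ(r_pos - r_neg) over sampled negatives, and TOP1 averages σ(r_neg - r_pos) + σ(r_neg²). The function names and example scores are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bpr_loss(pos_score, neg_scores):
    """Pairwise loss: small when the true next item outscores every negative."""
    return -sum(math.log(sigmoid(pos_score - s)) for s in neg_scores) / len(neg_scores)

def top1_loss(pos_score, neg_scores):
    """Ranking term plus a penalty that pushes negative scores toward zero."""
    return sum(sigmoid(s - pos_score) + sigmoid(s * s) for s in neg_scores) / len(neg_scores)

# In the mini-batch setup, the negatives are typically the target items of the
# other sessions in the same batch; here they are just made-up scores.
good = bpr_loss(2.0, [0.0, -1.0])  # positive item ranked above the negatives
bad = bpr_loss(0.0, [2.0, 1.0])    # positive item ranked below the negatives
print(good < bad)
```

Both losses only compare the target against a sample of negatives, so the output layer does not need to be evaluated for the full catalogue at every step.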
  • 23. RNNs for Recommendations: Contextual Session-based models. Add contextual data to the RNN architecture – Context included time, time since last event, event type – Combine context data with input / output layer – Also combine context with the RNN layers – About 3-6% improvement (in Recall@10 metric) over simple RNN baseline – Importantly, model is even better at predicting sales (vs view, add to cart events) and at predicting new / fresh items (vs items the user has already seen). Source: Smirnova, Vasile
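Recall@10 in this setting is usually the fraction of test events for which the true next item appears among the top 10 recommendations. A minimal sketch under that assumption; the ranked lists and targets below are made up.

```python
def recall_at_k(ranked_lists, targets, k=10):
    """Share of test cases whose target item appears in the top-k ranking."""
    hits = sum(1 for ranked, target in zip(ranked_lists, targets)
               if target in ranked[:k])
    return hits / len(targets)

# Two test events: each has a model ranking and the item actually chosen next
ranked_lists = [["a", "b", "c"], ["c", "a", "b"]]
targets = ["b", "b"]
score = recall_at_k(ranked_lists, targets, k=2)
print(score)
```

Recall@K ignores where inside the top K the hit lands; a rank-aware metric such as MRR is often reported alongside it for that reason.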
  • 24. RNNs for Recommendations: Content and Session-based models. Add content data to the RNN architecture – Parallel RNN (p-RNN) – Follows trend in combining DL architectures for content feature extraction with CF models for interaction data • CNN for image data • BOW for text (alternatives are Word2Vec-style models and RNN language models) – Some training tricks • Alternating: keep one subnet fixed, train other • Residual: subnets trained on residual error • Interleaved: alternating training per mini-batch. Source: Hidasi, Quadrana, Karatzoglou, Tikk
  • 25. RNNs for Recommendations: 3D CNNs for Session-based Recommendation. As we’ve seen in text / NLP, CNNs can also be effective in modeling sequences – 3D convolutional models have been applied in video classification – Potentially faster to train, easier to understand – Use character-level encoding of IDs and item features (name, description, categories) • Compact representation • No embedding layer – “ResNet” style architecture – Show improvement over p-RNN. Source: Tuan, Phuong
  • 26. Challenges and Future Directions: Challenges. Challenges particular to recommendation models – Data size and dimensionality (input & output) • Sampling – Extreme sparsity • Embeddings & compressed representations – Wide variety of specialized settings – Combining session, content, context and preference data – Model serving is difficult: ranking, large number of items, computationally expensive – Metrics: model accuracy and its relation to real-world outcomes and behaviors – Need for standard, open, large-scale datasets that have time and session data and are content- and context-rich • RecSys 15 Challenge – YooChoose dataset – Evaluation: watch your baselines! • When Recurrent Neural Networks meet the Neighborhood for Session-Based Recommendation
  • 27. Challenges and Future Directions: Future Directions. Most recent and future directions in research & industry – Improved RNNs • Cross-session models (e.g. Hierarchical RNN) • Further research on contextual models, as well as content and metadata • Attention models – Combine sequence and historical models (long- and short-term user profiles) • Personalizing session-based models – Applications at scale • Dimensionality reduction techniques (e.g. Bloom embeddings for large input/output spaces) • Compressed encodings for users and items • Distributed training • Efficient model serving for complex architectures
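The Bloom embedding idea mentioned above replaces a 1-of-N item encoding with a much shorter binary vector in which each item switches on a few hashed positions. The sketch below shows only that encoding step; the use of SHA-256 with a per-hash prefix and the sizes `m` and `k` are assumptions made for illustration, not the construction from any particular paper or library.

```python
import hashlib

def bloom_encode(item_id, m=16, k=3):
    """Encode an item id as an m-dimensional binary vector using k hashes."""
    bits = [0] * m
    for seed in range(k):
        # Prefixing the seed gives k different hash functions from one primitive
        digest = hashlib.sha256(f"{seed}:{item_id}".encode("utf-8")).hexdigest()
        bits[int(digest, 16) % m] = 1
    return bits

vec = bloom_encode("item_12345")
print(vec)
```

With m much smaller than the catalogue size, the input and output layers shrink accordingly, at the cost of occasional collisions between items that hash to the same positions.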
  • 28. Challenges and Future Directions: Summary. DL for recommendation is just getting started (again) – Huge increase in interest, research papers. Already many new models and approaches – DL approaches have generally yielded incremental % gains • But that can translate to significant $$$ • More pronounced in session-based – Cold start scenarios benefit from multi-modal nature of DL models and explicit modeling of sequences – Flexibility of DL frameworks helps a lot – Benefits from advances in DL for images, video, NLP etc. – Open-source libraries appearing (e.g. Spotlight) – Check out DLRS workshops & tutorials @ RecSys 2016 / 2017, and upcoming in Oct 2018 – RecSys challenges
  • 29. Thank you! codait.org; twitter.com/MLnick; github.com/MLnick; developer.ibm.com/code; FfDL; MAX. Sign up for IBM Cloud and try Watson Studio! https://ibm.biz/BdZgcx https://datascience.ibm.com/
  • 30. Links & References: Wikipedia: Perceptron; Stanford CS231n Convolutional Neural Networks for Visual Recognition; Stanford CS231n – RNN Slides; Recurrent Neural Networks Tutorial; The Unreasonable Effectiveness of Recurrent Neural Networks; Understanding LSTM Networks; Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation; Long short-term memory; Attention and Augmented Recurrent Neural Networks
  • 31. Links & References: Deep Content-based Music Recommendation; Google’s Wide and Deep Learning Model; Deep Learning for Recommender Systems Workshops @ RecSys; Deep Learning for Recommender Systems Tutorial @ RecSys 2017; Session-based Recommendations with Recurrent Neural Networks; Recurrent Neural Networks with Top-k Gains for Session-based Recommendations; Sequential User-based Recurrent Neural Network Recommendations
  • 32. Links & References: Personalizing Session-based Recommendations with Hierarchical Recurrent Neural Networks; Parallel Recurrent Neural Network Architectures for Feature-rich Session-based Recommendations; Contextual Sequence Modeling for Recommendation with Recurrent Neural Networks; When Recurrent Neural Networks meet the Neighborhood for Session-Based Recommendation; 3D Convolutional Networks for Session-based Recommendation with Content Features; Spotlight: Recommendation models in PyTorch; RecSys 2015 Challenge – YooChoose Dataset