Survey of Recommendation Systems
Outline
• Introduction
• Collaborative Filtering Algorithm
• Challenges
• Experiments (demo)
• Summary
• Future work
Introduction
• What is a recommendation system?
– Recommend related items
– Personalized experiences
• How to build a recommendation system?
– Content-Based
– Collaborative Filtering Algorithm
• Examples
– Amazon
– Youa
Examples
Browsing a book
Recommendations
Rating?
CF Algorithm
• Memory-Based
– User-Based
– Item-Based
• Model-Based
– Bayes
– Clustering
User-Based CF Algorithm
User by Item Matrix:
Table 1: An example of user-item matrix
Table 2: A simple example of ratings matrix
User-Based CF Algorithm
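Voting: $v_{i,j}$ is the vote of user i on item j.
Mean vote:
$$\bar{v}_i = \frac{1}{|I_i|} \sum_{j \in I_i} v_{i,j}$$
where $I_i$ is the set of items on which user i voted.
Predicted vote:
$$p_{a,j} = \bar{v}_a + \kappa \sum_{i=1}^{n} w(a,i)\,(v_{i,j} - \bar{v}_i)$$
where $w(a,i)$ are the weights of the n similar users and $\kappa$ is a normalizer.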
Similarity Computation
Vector Cosine-Based Similarity
Correlation-Based Similarity (Pearson)
Other Similarities
Vector Cosine-Based Similarity
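Vector cosine similarity:
$$w_{i,j} = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\| \, \|\vec{j}\|}$$
Adjusted cosine similarity (subtracts each user's average rating $\bar{r}_u$ to account for users with different rating scales):
$$w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_u)^2}}$$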
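As a rough illustration of the two similarity measures on this slide, the sketch below computes a plain vector cosine and an adjusted cosine that subtracts each user's mean rating. The function names, the dictionary layout (user -> item -> rating) and the sample ratings are assumptions made for this example, not something taken from the slides.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

def adjusted_cosine(ratings, i, j):
    """Adjusted cosine between items i and j.

    ratings: dict user -> dict item -> rating (hypothetical layout).
    Only users who rated both items contribute; each rating is centred
    on that user's mean over all of their ratings.
    """
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            mean_u = sum(user_ratings.values()) / len(user_ratings)
            d_i = user_ratings[i] - mean_u
            d_j = user_ratings[j] - mean_u
            num += d_i * d_j
            den_i += d_i * d_i
            den_j += d_j * d_j
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / math.sqrt(den_i * den_j)

# Made-up ratings: the two users rank items "a" and "c" in opposite order.
sample = {
    "u1": {"a": 5, "b": 3, "c": 1},
    "u2": {"a": 1, "b": 3, "c": 5},
}
```

With this sample data the adjusted cosine between items "a" and "c" comes out at -1, since whenever a user rates one of them above their own average they rate the other below it.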
Correlation-Based Similarity
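Pearson correlation:
$$w_{u,v} = \frac{\sum_{i \in I} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{i \in I} (r_{v,i} - \bar{r}_v)^2}}$$
where I is the set of items rated by both users u and v, and $\bar{r}_u$ is the average rating of user u over those co-rated items.
Thus in the example in Table 2, we have $w_{1,5} = 0.756$.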
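A minimal sketch of the Pearson computation follows. The ratings matrix is an assumption: the table itself is not reproduced in this text, so the values below are illustrative and were picked so that the weight between U1 and U5 agrees with the 0.756 quoted above. Means are taken over co-rated items only, which is also an assumption of this sketch.

```python
import math

# Illustrative ratings (user -> item -> rating); assumed, not copied from the slides.
RATINGS = {
    "U1": {"I1": 4, "I3": 5, "I4": 5},
    "U2": {"I1": 4, "I2": 2, "I3": 1},
    "U3": {"I1": 3, "I3": 2, "I4": 4},
    "U4": {"I1": 4, "I2": 4},
    "U5": {"I1": 2, "I2": 1, "I3": 3, "I4": 5},
}

def pearson(ratings, u, v):
    """Pearson correlation between users u and v over their co-rated items."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if not common:
        return 0.0
    mean_u = sum(ratings[u][i] for i in common) / len(common)
    mean_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den_u = math.sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((ratings[v][i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0  # correlation is undefined without variance; treated as 0 here
    return num / (den_u * den_v)

w_15 = pearson(RATINGS, "U1", "U5")
print(round(w_15, 3))
```

Returning 0 when a user has no variance on the co-rated items is a design choice for this sketch; other conventions (for example, skipping such neighbours entirely) are equally plausible.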
Prediction Computation
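Weighted Sum of Others’ Ratings:
$$P_{a,i} = \bar{r}_a + \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u) \, w_{a,u}}{\sum_{u \in U} |w_{a,u}|}$$
where $\bar{r}_a$ and $\bar{r}_u$ are the average ratings of users a and u, $w_{a,u}$ is the weight between users a and u, and U is the set of users who have rated item i.
For the simple example in Table 4, the user-based CF algorithm uses this formula to predict the rating for U1 on I2.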
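The weighted-sum prediction can be sketched as below, under several assumptions: the prediction is the active user's mean plus the weighted, normalized deviations of the neighbours' ratings; weights are Pearson correlations over co-rated items; each neighbour's mean is taken over the items co-rated with the active user; and the ratings matrix is illustrative rather than copied from the slides.

```python
import math

# Illustrative ratings (user -> item -> rating); assumed, not taken from the slides.
RATINGS = {
    "U1": {"I1": 4, "I3": 5, "I4": 5},
    "U2": {"I1": 4, "I2": 2, "I3": 1},
    "U3": {"I1": 3, "I3": 2, "I4": 4},
    "U4": {"I1": 4, "I2": 4},
    "U5": {"I1": 2, "I2": 1, "I3": 3, "I4": 5},
}

def pearson(ratings, u, v):
    """Pearson correlation over co-rated items; 0 when it is undefined."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    mean_u = sum(ratings[u][i] for i in common) / len(common)
    mean_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den = math.sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common)) * \
          math.sqrt(sum((ratings[v][i] - mean_v) ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, active, item):
    """Predict the active user's rating of item from users who rated it."""
    mean_a = sum(ratings[active].values()) / len(ratings[active])
    num = den = 0.0
    for user, user_ratings in ratings.items():
        if user == active or item not in user_ratings:
            continue
        common = set(ratings[active]) & set(user_ratings)
        if not common:
            continue
        w = pearson(ratings, active, user)
        # Neighbour mean over the items co-rated with the active user (assumption).
        mean_u = sum(user_ratings[i] for i in common) / len(common)
        num += (user_ratings[item] - mean_u) * w
        den += abs(w)
    return mean_a if den == 0 else mean_a + num / den

p_12 = predict(RATINGS, "U1", "I2")
print(round(p_12, 2))
```

Falling back to the active user's own mean when no neighbour carries any weight is a choice made for this sketch.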
Recommendations I
Rating Prediction Algorithm:
a) Calculate $P_{a,i}$ for each item i using the prediction computation formula.
b) Recommend the top-N items with the highest predicted ratings
that the active user a has not purchased.
Recommendations II
K Nearest Neighbors Algorithm:
a) Find k most similar users (KNN).
b) Identify a set of items, C, purchased by the
group together with their frequency.
c) Recommend the top-N most frequent items in
C that the active user has not purchased.
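The three steps above can be sketched as follows. The function name, the input layout (a purchase set per user and a precomputed similarity to the active user) and the sample data are hypothetical; ties are broken alphabetically only to keep the output deterministic.

```python
from collections import Counter

def recommend_most_frequent(purchases, similarity, active, k, n):
    """Top-N most frequent items among the k users most similar to `active`.

    purchases:  dict user -> set of purchased items (hypothetical layout)
    similarity: dict user -> similarity to the active user
    """
    # a) k most similar users
    neighbours = sorted(
        (u for u in purchases if u != active),
        key=lambda u: similarity.get(u, 0.0),
        reverse=True,
    )[:k]
    # b) candidate set C with frequencies, skipping items already purchased
    counts = Counter(
        item
        for u in neighbours
        for item in purchases[u]
        if item not in purchases[active]
    )
    # c) top-N by frequency, ties broken by item name
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]

# Made-up data for illustration.
purchases = {
    "a": {"x"},
    "b": {"x", "y", "z"},
    "c": {"y"},
    "d": {"w"},
}
similarity = {"b": 0.9, "c": 0.8, "d": 0.1}
top = recommend_most_frequent(purchases, similarity, "a", k=2, n=2)
```

Here the two nearest neighbours are "b" and "c", so "y" (bought by both) ranks ahead of "z", and "w" is never considered because "d" is not among the neighbours.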
Item-Based CF Algorithm
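Correlation-Based Similarity:
$$w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2} \, \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}}$$
where U is the set of users who rated both items i and j, $r_{u,i}$ is the rating of user u on item i, and $\bar{r}_i$ is the average rating of the ith item by
those users.
[Figure: user-item matrix]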
Prediction Computation
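Simple Weighted Average:
$$P_{u,i} = \frac{\sum_{n \in N} r_{u,n} \, w_{i,n}}{\sum_{n \in N} |w_{i,n}|}$$
where N is the set of other items rated by user u, $w_{i,n}$ is the weight between items i and n, and $r_{u,n}$ is the rating for
user u on item n.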
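A minimal sketch of the simple weighted average, assuming the prediction is the sum of the user's ratings weighted by item-to-item similarity, divided by the sum of absolute weights. The function name, argument layout and numbers are made up for the example.

```python
def predict_item_based(user_ratings, weights_to_target):
    """Weighted average of a user's ratings, weighted by similarity to the target item.

    user_ratings:      dict item n -> rating r_{u,n} given by the user
    weights_to_target: dict item n -> weight w_{i,n} between the target item i and n
    Returns None when no rated item has a known, non-zero weight.
    """
    num = den = 0.0
    for item, rating in user_ratings.items():
        w = weights_to_target.get(item)
        if w is None:
            continue  # no similarity available for this item
        num += rating * w
        den += abs(w)
    return None if den == 0 else num / den

# Hypothetical example: the user rated I1 and I3; the target item is closer to I1.
prediction = predict_item_based({"I1": 4, "I3": 2}, {"I1": 0.8, "I3": 0.2})
```

With these numbers the prediction is (4 * 0.8 + 2 * 0.2) / (0.8 + 0.2) = 3.6, pulled toward the rating of the more similar item.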
Extensions
• Default Voting
• Inverse User Frequency
• Case Amplification
Default Voting
Problem:
• Pair-wise similarity is computed only from the ratings in
the intersection of the items both users have rated.
• There are too few votes at the beginning.
Solution:
• Assuming some default voting values for the missing
ratings can improve the CF prediction performance.
• Dimension reduction, such as SVD or PCA.
Inverse User Frequency
Definition:
$$f_j = \log(n / n_j)$$
where $n_j$ is the number of users who have rated item j and
n is the total number of users.
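As a small illustration, the sketch below computes an inverse user frequency for every item, assuming the definition f_j = log(n / n_j) and a natural logarithm (the slide does not state the base). The function name and data layout are hypothetical.

```python
import math

def inverse_user_frequency(ratings):
    """Return dict item j -> log(n / n_j).

    ratings: dict user -> dict item -> rating (hypothetical layout)
    n is the total number of users, n_j the number of users who rated item j.
    """
    n = len(ratings)
    counts = {}
    for user_ratings in ratings.values():
        for item in user_ratings:
            counts[item] = counts.get(item, 0) + 1
    return {item: math.log(n / n_j) for item, n_j in counts.items()}

# Made-up data: "popular" is rated by everyone, "niche" by a single user.
sample = {
    "u1": {"popular": 5, "niche": 4},
    "u2": {"popular": 3},
    "u3": {"popular": 4},
    "u4": {"popular": 2},
}
iuf = inverse_user_frequency(sample)
```

An item rated by every user gets a frequency of 0 and so contributes nothing when ratings are scaled by it, while rarely rated items get larger values.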
Case Amplification
$$w'_{i,j} = w_{i,j} \cdot |w_{i,j}|^{\rho - 1}$$
where ρ is the case amplification power, ρ ≥ 1, and a
typical choice of ρ is 2.5. Case amplification reduces
noise in the data.
It tends to favor high weights, as small values raised to a
power become negligible.
For example, if $w_{i,j} = 0.9$, it remains high ($0.9^{2.5} \approx 0.8$);
if $w_{i,j} = 0.1$, it becomes negligible ($0.1^{2.5} \approx 0.003$).
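The arithmetic in this example can be checked with a short sketch, assuming the transform is w' = w * |w|^(ρ - 1), which keeps the sign of the weight and equals w^ρ for positive weights. The function name is made up.

```python
def amplify(weight, rho=2.5):
    """Case amplification: keeps the sign of the weight, raises its magnitude to rho."""
    return weight * abs(weight) ** (rho - 1)

high = amplify(0.9)  # about 0.77, still a substantial weight
low = amplify(0.1)   # about 0.003, effectively negligible
print(round(high, 2), round(low, 3))
```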
Model-Based CF Algorithm
• Simple Bayesian CF Algorithm
• Clustering CF Algorithm
Simple Bayesian CF Algorithm
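Simple Bayesian:
$$\text{class} = \arg\max_{j \in \text{classSet}} \; p(\text{class}_j) \prod_{o} P(X_o = x_o \mid \text{class}_j)$$
Laplace Estimator:
$$P(X_i = x_i \mid Y = y) = \frac{\#(X_i = x_i, Y = y) + 1}{\#(Y = y) + |X_i|}$$
where $|X_i|$ is the number of possible values of $X_i$.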
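A rough sketch of a simple Bayesian classifier with Laplace smoothing follows. It assumes add-one smoothing of the conditional probabilities, an unsmoothed class prior, and complete feature vectors with no missing ratings; real CF data would need missing values handled. All names and the toy data are hypothetical.

```python
from collections import Counter

def laplace(count_xy, count_y, n_values):
    """Add-one (Laplace) estimate of P(X = x | Y = y)."""
    return (count_xy + 1) / (count_y + n_values)

def simple_bayes_predict(rows, labels, query, classes, n_values):
    """Return the class maximizing prior * product of smoothed conditionals.

    rows:     list of feature tuples (e.g. other users' ratings of the same items)
    labels:   class of each row (e.g. the rating to be predicted)
    query:    feature tuple to classify
    n_values: number of distinct values each feature can take
    """
    label_counts = Counter(labels)
    best_class, best_score = None, -1.0
    for c in classes:
        score = label_counts[c] / len(labels)  # prior p(class)
        for o, x in enumerate(query):
            matches = sum(1 for row, y in zip(rows, labels) if y == c and row[o] == x)
            score *= laplace(matches, label_counts[c], n_values)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Toy data: two binary features, two classes.
rows = [(1, 1), (1, 2), (2, 2)]
labels = ["like", "like", "dislike"]
guess = simple_bayes_predict(rows, labels, (1, 1), ["like", "dislike"], n_values=2)
```

The smoothing keeps a class from being ruled out just because one feature value was never observed with it in a small sample.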
Simple Bayesian CF Algorithm
Example in Table 4, to produce the rating for U1 on I2 using the
Simple Bayesian CF algorithm and the Laplace Estimator:
Clustering CF Algorithm
For two data objects, $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$, the popular Minkowski distance is defined as
$$d(X, Y) = \left( \sum_{i=1}^{n} |x_i - y_i|^q \right)^{1/q}$$
where n is the dimension number of the object, and q is a positive integer.
When q = 1, d is the Manhattan distance; when
q = 2, d is the Euclidean distance.
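A short sketch of the distance, assuming the usual definition (the q-th root of the sum of q-th powers of absolute coordinate differences). The function name is made up.

```python
def minkowski(x, y, q):
    """Minkowski distance of order q between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)

manhattan = minkowski((0, 0), (3, 4), 1)  # |3| + |4| = 7
euclidean = minkowski((0, 0), (3, 4), 2)  # sqrt(9 + 16) = 5
```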
Evaluation Metrics
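Mean Absolute Error and Normalized Mean Absolute Error:
$$\text{MAE} = \frac{\sum_{\{i,j\}} |p_{i,j} - r_{i,j}|}{n}, \qquad \text{NMAE} = \frac{\text{MAE}}{r_{max} - r_{min}}$$
where n is the number of predicted ratings, $p_{i,j}$ is the predicted rating for user i on item j, $r_{i,j}$ is the actual rating, and $r_{max}$ and $r_{min}$ are the upper and lower bounds of the ratings.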
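Both metrics can be sketched in a few lines, assuming MAE is the average absolute difference between predicted and actual ratings and NMAE divides it by the width of the rating scale. Function names and sample values are made up.

```python
def mae(predicted, actual):
    """Mean absolute error over paired predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(predicted)

def nmae(predicted, actual, r_min, r_max):
    """MAE normalized by the rating range, so different scales are comparable."""
    return mae(predicted, actual) / (r_max - r_min)

# Made-up predictions on a 1-5 scale.
error = mae([4, 3, 5], [5, 3, 3])                  # (1 + 0 + 2) / 3 = 1.0
normalized = nmae([4, 3, 5], [5, 3, 3], 1, 5)      # 1.0 / 4 = 0.25
```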
Challenges
• Data sparsity
• Scalability
• Synonymy
• Gray Sheep
• Shilling Attacks
Demo
• Tools: Mahout - Scalable machine learning and data
mining library, http://mahout.apache.org/
• Data: MovieLens, http://www.movielens.org/
Conclusions
CF category: Memory-based CF
Representative techniques: Item-based/user-based top-N recommendations
Main advantages:
1. easy implementation
2. new data can be added easily and incrementally
3. need not consider the content of the items being recommended
4. scale well with co-rated items
Main shortcomings:
1. are dependent on human ratings
2. performance decreases when data are sparse
3. cannot recommend for new users and items
4. have limited scalability for large datasets
Conclusions
CF category: Model-based CF
Representative techniques:
1. Bayesian belief nets CF
2. Clustering CF
3. CF using dimensionality reduction techniques (SVD, PCA)
Main advantages:
1. better address the sparsity, scalability and other problems
2. improve prediction performance
3. give an intuitive rationale for recommendations
Main shortcomings:
1. expensive model-building
2. trade-off between prediction performance and scalability
3. dimensionality reduction techniques lose useful information
Future work
• Scalability
• Real-time
Q & A
References
• J. Breese, D. Heckerman, and C. Kadie, “Empirical analysis of predictive
algorithms for collaborative filtering,” in Proceedings of the 14th
Conference on Uncertainty in Artificial Intelligence (UAI ’98), 1998.
• B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Item-based collaborative
filtering recommendation algorithms,” in Proc. of the WWW Conference,
2001.
• K. Miyahara and M. J. Pazzani, “Collaborative filtering with the simple
Bayesian classifier,” in Proceedings of the 6th Pacific Rim International
Conference on Artificial Intelligence, pp. 679–689, 2000.
• L. H. Ungar and D. P. Foster, “Clustering methods for collaborative
filtering,” in Proceedings of the Workshop on Recommendation Systems,
AAAI Press, 1998.
• X. Su and T. M. Khoshgoftaar, “A survey of collaborative
filtering techniques,” Advances in Artificial Intelligence, vol. 2009,
Article ID 421425, 19 pages.
