Recommender
Systems
@ASAI 2011
Lic. Ernesto Mislej
ernesto@7puentes.com - @fetnelio

ASAI 2011 – 40 JAIIO
August 2011 – Córdoba – Argentina
                  7puentes.com
Outline



     • Introduction: What are recommender systems for?
     • Paradigms: How do they work?
          • Collaborative Filtering
          • Content-based Filtering
          • Knowledge-Based Recommendations


     • Summary, Issues and Challenges


  Source: Tutorial: Recommender Systems, D. Jannach, G. Friedrich, IJCAI 2011.
Introduction
Problem domain

  There is an extensive class of Web applications that involve
  predicting user responses to options. Such a facility is called a
  recommendation system.

      RS are software agents that elicit the interests and
      preferences of individual consumers . . . and make
      recommendations accordingly.
      They have the potential to support and improve the
      quality of the decision consumers make while searching
      for and selecting products online.


                                   — Xiao & Benbasat, MISQ, 2007
Different perspectives

  Retrieval perspective
    • Reduce search costs
    • Provide correct proposals
    • Users know in advance what they want


  Recommendation perspective
    • Serendipity: identify items from the Long Tail
    • Users did not know the items existed


  Serendipity (a.k.a. chiripa, Spanish for a lucky fluke) is when
  someone finds something that they weren’t expecting to find.
Different perspectives cont.

  Prediction perspective
    • Predict to what degree users like an item
    • Most popular evaluation scenario in research


  Interaction perspective
    • Give users a good feeling
    • Educate users about the product domain
    • Convince/persuade users - explain


  Conversion perspective
    • Commercial situations
    • Increase hit, click-through, and lookers-to-bookers rates
    • Optimize sales margins and profit
The Long Tail

   • Recommend widely unknown items that users might actually
     like!
   • 20 % of items accumulate 74 % of all positive ratings
RS seen as a function



  Given:
    • User model (e.g. ratings, preferences, demographics,
      situational context)
    • Items (with or without description of item characteristics)


  Find:
    • Relevance score. Used for ranking.
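
  The "function" view can be sketched in a few lines of Python. This is
  only an illustration: the names recommend and genre_overlap, the toy
  catalog and the genre-overlap scoring rule are assumptions made for
  the example, not something taken from the talk.

```python
def recommend(user, items, score, k=3):
    """Rank items by the relevance score and return the k best ones."""
    ranked = sorted(items, key=lambda item: score(user, item), reverse=True)
    return ranked[:k]


def genre_overlap(user, item):
    """Hypothetical relevance score: number of genres the user likes."""
    return float(len(user["liked_genres"] & item["genres"]))


# Toy user model and catalog (made up for the example).
alice = {"liked_genres": {"fantasy", "adventure"}}
catalog = [
    {"title": "A", "genres": {"fantasy", "adventure"}},
    {"title": "B", "genres": {"romance"}},
    {"title": "C", "genres": {"fantasy"}},
]

top = recommend(alice, catalog, genre_overlap, k=2)
print([item["title"] for item in top])
```

  Any of the paradigms in the following slides can be plugged in as the
  score function; only the way the relevance score is estimated changes.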
Paradigms
Paradigms of recommender systems

  Recommender systems reduce information overload by
  estimating relevance.

    • Personalized recommendations
    • Collaborative: Tell me what’s popular among my peers
    • Content-based: Show me more of what I’ve liked
    • Knowledge-based: Tell me what fits based on my needs
    • Hybrid: combinations of various inputs and/or composition of
      different mechanisms
Collaborative Filtering
Collaborative Filtering (CF)

  The most prominent approach to generate recommendations:
    • used by large, commercial e-commerce sites
    • well-understood, various algorithms and variations exist
    • applicable in many domains (books, movies, DVDs, ...)

  Approach:
    • use the wisdom of the crowd to recommend items

  Basic assumption and idea:
    • Users give ratings to catalog items (implicitly or explicitly)
    • Customers who had similar tastes in the past will have
      similar tastes in the future
User-based nearest-neighbor CF


  Given an active user (Alice) and an item i not yet seen by Alice,
  the goal is to estimate Alice’s rating for this item, e.g., by:

    • finding a set of users (peers) who liked the same items as Alice
      in the past and who have rated item i
    • using, e.g., the average of their ratings to predict whether
      Alice will like item i
    • doing this for all items Alice has not seen and recommending
      the best-rated
User-based nearest-neighbor CF

  Some questions:
    • How do we measure similarity?
    • How many neighbors should we consider?
    • How do we generate a prediction from the neighbors’ ratings?


  Utility Matrix:

                    Item 1   Item 2   Item 3   Item 4   Item 5
         Alice         5        3        4        4        ?
         Berta         3        1        2        3        3
         Clara         4        3        4        3        5
         Diana         5        1        4        3        4
         Erika         1        5        5        2        1
Measuring user similarity

  Pearson correlation:

  $$
  sim(a, b) = \frac{\sum_{p \in P} (r_{a,p} - \bar{r}_a)(r_{b,p} - \bar{r}_b)}
                   {\sqrt{\sum_{p \in P} (r_{a,p} - \bar{r}_a)^2} \, \sqrt{\sum_{p \in P} (r_{b,p} - \bar{r}_b)^2}}
  $$

  a, b: users
  r_{a,p}: rating of user a for item p
  P: set of items rated by both a and b
  \bar{r}_a, \bar{r}_b: the users’ average ratings
  Possible similarity values lie between -1 and 1

                Item 1       Item 2      Item 3           Item 4   Item 5        Sim
     Alice         5            3           4                4        ?            -
     Berta         3            1           2                3        3         0.852
     Clara         4            3           4                3        5         0.707
     Diana         5            1           4                3        4         0.956
     Erika         1            5           5                2        1         -0.792
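
  The Sim column can be checked with a short Python sketch of the
  Pearson formula. One assumption is made explicit here: each user's
  average is taken over the co-rated items P only, which is the choice
  that reproduces the values in the table (Berta comes out as 0.8528,
  shown as 0.852 above). The dictionary layout and the function name
  pearson are illustrative.

```python
from math import sqrt

# Utility matrix from the slide; Alice has not rated Item 5.
ratings = {
    "Alice": {"Item 1": 5, "Item 2": 3, "Item 3": 4, "Item 4": 4},
    "Berta": {"Item 1": 3, "Item 2": 1, "Item 3": 2, "Item 4": 3, "Item 5": 3},
    "Clara": {"Item 1": 4, "Item 2": 3, "Item 3": 4, "Item 4": 3, "Item 5": 5},
    "Diana": {"Item 1": 5, "Item 2": 1, "Item 3": 4, "Item 4": 3, "Item 5": 4},
    "Erika": {"Item 1": 1, "Item 2": 5, "Item 3": 5, "Item 4": 2, "Item 5": 1},
}


def pearson(a, b):
    """Pearson correlation of users a and b over their co-rated items."""
    common = sorted(set(ratings[a]) & set(ratings[b]))
    if not common:
        return 0.0
    # Assumption: averages are computed over the co-rated items only.
    mean_a = sum(ratings[a][p] for p in common) / len(common)
    mean_b = sum(ratings[b][p] for p in common) / len(common)
    num = sum((ratings[a][p] - mean_a) * (ratings[b][p] - mean_b) for p in common)
    den = sqrt(sum((ratings[a][p] - mean_a) ** 2 for p in common)) * sqrt(
        sum((ratings[b][p] - mean_b) ** 2 for p in common)
    )
    return num / den if den else 0.0


sims = {user: pearson("Alice", user) for user in ratings if user != "Alice"}
for user, sim in sims.items():
    print(f"{user}: {sim:.3f}")
```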
Making predictions


   • A common prediction function, where N is the set of neighbors
     considered:

     $$
     pred(a, p) = \bar{r}_a + \frac{\sum_{b \in N} sim(a, b) \cdot (r_{b,p} - \bar{r}_b)}{\sum_{b \in N} sim(a, b)}
     $$

   • Calculate whether the neighbors’ ratings for the unseen
     item p are higher or lower than their average
   • Combine the rating differences, using the similarity as a
     weight
   • Add/subtract the neighbors’ bias from the active user’s
     average and use this as a prediction
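
  As a worked example, the prediction function can be applied to Alice
  and Item 5 using the utility matrix and the Sim column shown earlier.
  The sketch below makes two assumptions that the slides leave open:
  the neighborhood N consists of the two most similar users (Diana and
  Berta), and each neighbor's average is taken over all of her ratings.
  It also assumes the neighbors have positive similarity, so the
  denominator is not zero.

```python
def predict(active_mean, neighbors):
    """Mean-centered weighted prediction.

    neighbors: list of (similarity, neighbor_rating_for_item, neighbor_mean).
    """
    num = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    den = sum(sim for sim, _, _ in neighbors)
    return active_mean + num / den if den else active_mean


alice_mean = (5 + 3 + 4 + 4) / 4          # Alice's average over Items 1-4
neighbors = [
    (0.956, 4, (5 + 1 + 4 + 3 + 4) / 5),  # Diana: sim, rating for Item 5, mean
    (0.852, 3, (3 + 1 + 2 + 3 + 3) / 5),  # Berta: sim, rating for Item 5, mean
]

prediction = predict(alice_mean, neighbors)
print(round(prediction, 2))  # 4.6
```

  Both neighbors rated Item 5 exactly 0.6 above their own average, so
  the prediction is Alice's average of 4.0 plus 0.6.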
Improving the metrics / prediction function

  Not all neighbor ratings might be equally valuable:
  Agreement on commonly liked items is less informative than
  agreement on controversial items.

  Value of number of co-rated items:
  Use significance weighting, e.g., by linearly reducing the weight
  when the number of co-rated items is low.

  Case amplification:
  Give more weight to very similar neighbors, i.e., where the
  similarity value is close to 1.

  Neighborhood selection:
  Use a similarity threshold or a fixed number of neighbors.
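
  The last three refinements might be sketched as follows. The function
  names and the default values (a cutoff of 50 co-rated items, an
  amplification exponent rho = 2.5) are illustrative assumptions; in
  practice they are tunable parameters.

```python
def significance_weight(sim, n_corated, threshold=50):
    """Linearly shrink the similarity when few items are co-rated."""
    return sim * min(n_corated, threshold) / threshold


def case_amplify(sim, rho=2.5):
    """Emphasize similarities close to 1 (or -1) and damp the rest."""
    return sim * abs(sim) ** (rho - 1)


def select_neighbors(sims, k=None, min_sim=None):
    """Keep neighbors above a similarity threshold and/or the k most similar."""
    candidates = [(u, s) for u, s in sims.items() if min_sim is None or s >= min_sim]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates if k is None else candidates[:k]


# Similarities to Alice, copied from the Sim column of the earlier table.
sims = {"Berta": 0.852, "Clara": 0.707, "Diana": 0.956, "Erika": -0.792}
print(select_neighbors(sims, k=2))
print(select_neighbors(sims, min_sim=0.8))
```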
Memory-based and model-based approaches

  User-based CF is said to be memory-based:
    • The rating matrix is directly used to find neighbors / make
      predictions
    • does not scale for most real-world scenarios
    • large e-commerce sites have tens of millions of customers
      and millions of items

  Model-based approaches:
    • based on an offline pre-processing or model-learning phase
    • at run-time, only the learned model is used to make
      predictions
    • models are updated / re-trained periodically
Item-based CF



  Item-based collaborative filtering recommendation algorithms, B.
  Sarwar et al., WWW 2001
    • Scalability issues arise with user-to-user (U2U) CF when there
      are many more users than items (m ≫ n, m = |users|, n = |items|)
    • High sparsity leads to few common ratings between two
      users.
    • Basic idea: Item-based CF exploits relationships between
      items first, instead of relationships between users
Item-based CF

  Use the similarity between items (and not users) to make
  predictions.
    • Look for items that are similar to Item 5.
    • Take Alice’s ratings for these items to predict the rating for
      Item 5.

                 Item 1    Item 2   Item 3    Item 4   Item 5
        Alice       5         3        4         4        ?
        Berta       3         1        2         3        3
        Clara       4         3        4         3        5
        Diana       3         3        1         5        4
        Erika       1         5        5         2        1
Item-based CF

  The cosine similarity measure:
    • Produces better results in item-to-item filtering.
    • Ratings are seen as vectors in an n-dimensional space.
    • Similarity is calculated based on the angle between the
      vectors.

  Pre-processing for item-based filtering:
    • Calculate all pair-wise item similarities in advance.
    • The neighborhood to be used at run-time is typically rather
      small, because only items that the user has rated are taken
      into account.
    • Item similarities are supposed to be more stable than user
      similarities.
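
  A small Python sketch of item-based CF on the item-based utility
  matrix shown earlier. It uses plain cosine similarity between item
  columns, restricted to users who rated both items. The prediction
  step, a similarity-weighted average of Alice's ratings for the k most
  similar items, is an assumption: the slides do not give a formula for
  it. Adjusted cosine, which subtracts each user's average first, is a
  common variant that is not shown here.

```python
from math import sqrt

# Utility matrix from the item-based slide; None marks a missing rating.
ratings = {
    "Alice": [5, 3, 4, 4, None],
    "Berta": [3, 1, 2, 3, 3],
    "Clara": [4, 3, 4, 3, 5],
    "Diana": [3, 3, 1, 5, 4],
    "Erika": [1, 5, 5, 2, 1],
}


def item_cosine(i, j):
    """Cosine similarity of item columns i and j over users who rated both."""
    pairs = [(r[i], r[j]) for r in ratings.values()
             if r[i] is not None and r[j] is not None]
    dot = sum(x * y for x, y in pairs)
    norm_i = sqrt(sum(x * x for x, _ in pairs))
    norm_j = sqrt(sum(y * y for _, y in pairs))
    return dot / (norm_i * norm_j) if norm_i and norm_j else 0.0


def predict_item_based(user, target, k=2):
    """Weighted average of the user's ratings for the k most similar items."""
    rated = [j for j, r in enumerate(ratings[user])
             if r is not None and j != target]
    neighbors = sorted(((item_cosine(target, j), j) for j in rated),
                       reverse=True)[:k]
    den = sum(sim for sim, _ in neighbors)
    if not den:
        return None
    return sum(sim * ratings[user][j] for sim, j in neighbors) / den


ITEM_5 = 4  # zero-based column index of Item 5
for j in range(4):
    print(f"sim(Item 5, Item {j + 1}) = {item_cosine(ITEM_5, j):.2f}")
prediction = predict_item_based("Alice", ITEM_5, k=2)
print(f"Predicted rating: {prediction:.2f}")
```

  With this data, Items 1 and 4 are the most similar to Item 5, so
  Alice's ratings for those two items dominate the prediction.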
More on Ratings

  Explicit ratings:
    • Most commonly used (1 to 5 or 1 to 7 Likert response scales)
    • Optimal scale is an open question; a 10-point scale is
      accepted in the movie domain.
    • Multidimensional ratings (multiple ratings per movie).
    • Users are not always willing to rate many items.
    • How to stimulate users to rate more items?

  Implicit ratings:
    • Clicks, page views, time spent on some page, demo
      downloads, ...
    • Can be used in addition to explicit ones; whether they are
      interpreted correctly is an open question.
Data sparsity problems

  Cold start problem:
    • How to recommend new items?
    • What to recommend to new users?

  Straightforward approaches:
    • Ask/force users to rate a set of items.
    • Use another method (e.g., content-based, demographic or
      simply non-personalized) in the initial phase

  Alternatives:
    • Assume transitivity of neighborhoods.
    • Recursive CF.
    • Graph-based methods.
More model-based approaches

  A plethora of different techniques has been proposed in recent years:
    • Matrix factorization techniques, statistics:
        • Singular value decomposition (SVD), principal component
          analysis (PCA).
    • Association rule mining:
        • Shopping basket analysis.
    • Probabilistic models:
        • Clustering models, Bayesian networks, probabilistic Latent
          Semantic Analysis.
    • Various other machine learning approaches.
Collaborative Filtering Issues


  PROs:
    • Well-understood.
    • Works well in some domains.
    • No knowledge engineering required.

  CONs:
    • Requires user community
    • Sparsity problems.
    • No integration of other knowledge sources.
    • No explanation of results.
Content-based
recommendation
Content-based recommendation

  Idea: CF methods do not require any information about the
  items, but it might be reasonable to exploit such information,
  e.g., to recommend fantasy novels to people who liked fantasy
  novels in the past.

  What do we need:
   • some information about the available items such as the
     genre (content).
   • some sort of user profile describing what the user likes (the
     preferences).

  The task:
    • learn user preferences.
    • locate/recommend items that are similar to the user
      preferences.
What is the content?


  The genre is actually not part of the content of a book.

    • Most CB-recommendation methods originate from the
      Information Retrieval (IR) field.
    • The goal is to find and rank interesting text documents
      (news articles, web pages)
    • The item descriptions are usually automatically extracted
      (important words)
Content representation and item similarities


  IR methods:
    • Compute the similarity of an unseen item with the user
      profile based on the keyword overlap (Dice coefficient):

      $$
      sim(a, b) = \frac{2 \cdot |keywords(a) \cap keywords(b)|}{|keywords(a)| + |keywords(b)|}
      $$

    • Standard measure: TF-IDF.
    • Vector space model. Vectors are usually long and sparse.
    • Preprocessing: remove stop words, apply stemming, keep only
      the top tokens, ...
    • Cosine similarity.
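
  Both similarity measures can be sketched in plain Python. The keyword
  overlap function follows the formula above. The TF-IDF variant shown
  here (term frequency normalized by document length, idf = log(N / df))
  is one common choice among several and is an assumption, as are the
  toy documents and function names.

```python
from collections import Counter
from math import log, sqrt


def keyword_overlap(keywords_a, keywords_b):
    """Keyword-overlap similarity: 2 * |A & B| / (|A| + |B|)."""
    a, b = set(keywords_a), set(keywords_b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse {term: weight} TF-IDF vectors."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({t: (c / len(doc)) * log(n / df[t])
                        for t, c in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy, already tokenized documents (made up for the example).
docs = [
    ["dragon", "magic", "quest"],
    ["dragon", "magic", "school"],
    ["stock", "market", "report"],
]
vecs = tfidf_vectors(docs)
print(keyword_overlap(docs[0], docs[1]))
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

  Terms shared by many documents get a low idf weight, so the two
  fantasy documents are less similar under TF-IDF than under the raw
  keyword overlap.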
Recommending items


  Simple method: nearest neighbors
    • Given a set of documents D already rated by the user
      (like/dislike)
        • Find the n nearest neighbors of a not-yet-seen item i in D
        • Take these ratings to predict a rating/vote for i
        • (Variations: neighborhood size, lower/upper similarity
          thresholds, ...)
    • Good for modeling short-term interests / follow-up stories
    • Used in combination with a method that models long-term
      preferences
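
  The nearest-neighbor method could look like this in Python. It is a
  sketch under several assumptions: documents are represented as keyword
  sets, similarity is the keyword overlap measure, ratings are
  like/dislike votes, and the prediction is a simple majority vote among
  the n most similar rated documents. The sample documents are made up.

```python
def keyword_overlap(a, b):
    """Keyword-overlap similarity: 2 * |A & B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 0.0


def predict_like(item_keywords, rated_docs, n=3):
    """Majority vote among the n rated documents most similar to the item.

    rated_docs: list of (keyword_set, liked) pairs, liked being True/False.
    Neighbors with zero similarity carry no information and are ignored.
    """
    scored = sorted(
        ((keyword_overlap(item_keywords, kw), liked) for kw, liked in rated_docs),
        key=lambda pair: pair[0],
        reverse=True,
    )[:n]
    votes = sum(1 if liked else -1 for sim, liked in scored if sim > 0)
    return votes > 0


# D: documents the user has already rated (hypothetical data).
rated = [
    ({"election", "senate", "vote"}, True),
    ({"election", "poll", "vote"}, True),
    ({"football", "goal"}, False),
    ({"tennis", "match"}, False),
]

print(predict_like({"election", "vote", "debate"}, rated))
print(predict_like({"football", "match", "goal"}, rated))
```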
Recommending items


  Query-based retrieval: Rocchio’s method
    • The SMART System: Users are allowed to rate
      (relevant/irrelevant) retrieved documents (feedback)
    • Document collections D+ and D−
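
  Rocchio's relevance feedback moves the query (or profile) vector
  towards the centroid of D+ and away from the centroid of D−. The
  sketch below uses sparse term-weight dictionaries. The weights
  alpha, beta and gamma, their default values, and the clipping of
  negative term weights are assumptions chosen for illustration.

```python
def rocchio(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return alpha*query + beta*centroid(D+) - gamma*centroid(D-).

    All vectors are sparse {term: weight} dicts; terms whose updated
    weight is not positive are dropped.
    """
    terms = set(query)
    for doc in relevant + irrelevant:
        terms |= set(doc)

    updated = {}
    for term in terms:
        pos = sum(d.get(term, 0.0) for d in relevant) / len(relevant) if relevant else 0.0
        neg = sum(d.get(term, 0.0) for d in irrelevant) / len(irrelevant) if irrelevant else 0.0
        weight = alpha * query.get(term, 0.0) + beta * pos - gamma * neg
        if weight > 0:
            updated[term] = weight
    return updated


# Hypothetical feedback: two documents marked relevant, one irrelevant.
query = {"camera": 1.0}
d_plus = [{"camera": 1.0, "zoom": 1.0}, {"camera": 1.0, "lens": 1.0}]
d_minus = [{"camera": 1.0, "phone": 1.0}]

new_query = rocchio(query, d_plus, d_minus)
print(sorted(new_query.items()))
```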

  Probabilistic methods:
    • Recommendation as a classical text classification problem.
    • 2 classes: hot/cold; simple Boolean document
      representation.
    • Calculate the probability that a document is hot or cold
      using Bayes’ theorem.
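
  A minimal sketch of such a classifier, assuming a naive Bayes model
  over Boolean keyword features with add-one (Laplace) smoothing. The
  slides only mention Bayes' theorem, so the naive independence
  assumption, the smoothing, the function names and the training data
  are all choices made for this example. Both classes must appear in
  the training data.

```python
from math import log


def train(docs):
    """docs: list of (keyword_set, label), label being 'hot' or 'cold'."""
    vocab = set().union(*(kw for kw, _ in docs))
    model = {}
    for label in ("hot", "cold"):
        class_docs = [kw for kw, lab in docs if lab == label]
        prior = len(class_docs) / len(docs)
        # P(term present | class) with add-one smoothing.
        cond = {t: (sum(t in kw for kw in class_docs) + 1) / (len(class_docs) + 2)
                for t in vocab}
        model[label] = (prior, cond)
    return vocab, model


def classify(keywords, vocab, model):
    """Return the class with the highest posterior (computed in log space)."""
    scores = {}
    for label, (prior, cond) in model.items():
        score = log(prior)
        for term in vocab:
            p = cond[term]
            score += log(p if term in keywords else 1 - p)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training documents rated by the user.
training = [
    ({"recommender", "agent"}, "hot"),
    ({"recommender", "learning"}, "hot"),
    ({"football", "score"}, "cold"),
    ({"football", "transfer"}, "cold"),
]
vocab, model = train(training)
print(classify({"recommender", "filtering"}, vocab, model))
print(classify({"football"}, vocab, model))
```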
Limitations of CB recommendation methods

  Keywords alone may not be sufficient to judge quality/relevance
  of a document or web page:
    • up-to-dateness, usability, aesthetics, writing style
    • content may also be limited / too short
    • content may not be automatically extractable (multimedia)

  Ramp-up phase required:
    • Some training data is still required.
    • Web 2.0: Use other sources to learn the user preferences

  Overspecialization:
    • Algorithms tend to propose more of the same
    • Or: too similar news items
Knowledge-Based
Recommender Systems
KB Recommendation

  Explicit domain knowledge:
    • Sales knowledge elicitation from domain experts
    • System mimics the behavior of an experienced sales assistant
    • Best-practice sales interactions
    • Can guarantee correct recommendations (determinism)
      with respect to expert knowledge

  Conversational interaction strategy:
    • Opposed to one-shot interaction
    • Elicitation of user requirements
    • Transfer of product knowledge: a phase that educates users

  Hybridization:
    • E.g. merging explicit knowledge with community data
    • Can ensure some policies based on e.g. availability, user
      context or profit margin
KB Recommendation

  Constraint-based recommendation:
    • A knowledge-based RS formulated as a constraint satisfaction
      problem
    • What if no solution exists?
    • Find minimal relaxations

  Ask user:
    • Computation of minimal revisions of requirements.
    • “Do you want to relax your brand preference? Accept the
      Panasonic brand instead of Canon?”
    • Idea that users navigate a product space.
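
  The relaxation idea can be illustrated with a brute-force Python
  sketch. Requirements are modeled as named predicates over a small
  product catalog; when no product satisfies all of them, the sketch
  tries dropping ever larger sets of requirements and reports the
  smallest sets whose removal yields at least one product. The catalog,
  the requirement names and the exhaustive search are assumptions for
  illustration; real systems use more efficient diagnosis algorithms.

```python
from itertools import combinations

# Hypothetical camera catalog.
cameras = [
    {"name": "Canon A", "brand": "Canon", "price": 320, "zoom": 5},
    {"name": "Panasonic B", "brand": "Panasonic", "price": 240, "zoom": 10},
    {"name": "Canon C", "brand": "Canon", "price": 180, "zoom": 3},
]

# User requirements as named constraints.
requirements = {
    "brand": lambda c: c["brand"] == "Canon",
    "price": lambda c: c["price"] <= 250,
    "zoom": lambda c: c["zoom"] >= 8,
}


def smallest_relaxations(catalog, reqs):
    """Return (dropped_requirements, matching_product_names) pairs.

    Only relaxations of the smallest possible size are returned.
    """
    names = list(reqs)
    for size in range(len(names) + 1):
        found = []
        for dropped in combinations(names, size):
            kept = [n for n in names if n not in dropped]
            hits = [c["name"] for c in catalog if all(reqs[n](c) for n in kept)]
            if hits:
                found.append((set(dropped), hits))
        if found:
            return found
    return []


for dropped, hits in smallest_relaxations(cameras, requirements):
    print(f"Relax {sorted(dropped)} -> {hits}")
```

  Here no camera satisfies all three requirements, and the sketch
  proposes relaxing either the brand preference or the zoom
  requirement, which mirrors the question put to the user above.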
Conversational strategies


  Process consisting of multiple conversational moves:
    • Resembles natural sales interactions
    • Not all user requirements known beforehand
    • Customers are rarely satisfied with the initial
      recommendations

  Different styles of preference elicitation:
    • Free text query interface.
    • Asking technical/generic properties
    • Images / inspiration
    • Proposing and Critiquing
Summary, Issues and
Challenges
Summary

 Collaborative Filtering (CF):
 PROs: Nearly no ramp-up effort, serendipity of results, learns market
 segments
 CONs: Requires some form of rating feedback, cold start for new users
 and new items

 Content-based:
 PROs: No community required, comparison between items possible
 CONs: Content-descriptions necessary, cold start for new users, no
 surprises

 Knowledge-based:
 PROs: Deterministic recommendations, assured quality, no cold-start,
 can resemble sales dialogue.
 CONs: Knowledge engineering effort to bootstrap, basically static,
 does not react to short-term trends.
Issues and Challenges
  Evaluating Recommender Systems
    • Is an RS efficient with respect to specific criteria such as
      accuracy, user satisfaction, response time, serendipity, online
      conversion, ramp-up effort, ...?
    • Do customers like/buy recommended items?
    • Do customers buy items they otherwise would not have?
    • Are they satisfied with a recommendation after purchase?

  Recommendation is viewed as an information retrieval (IR) task.
    • Precision and Recall, F1

      $$
      F_1 = 2 \cdot \frac{precision \cdot recall}{precision + recall}
      $$

    • Mean Absolute Error (MAE), Root Mean Square Error (RMSE).
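
  These metrics are short enough to write out directly. In the sketch
  below, precision, recall and F1 compare a recommendation list with
  the set of items the user actually found relevant, while MAE and RMSE
  compare predicted with actual ratings. The function names and sample
  data are illustrative.

```python
from math import sqrt


def precision_recall_f1(recommended, relevant):
    """Set-based precision, recall and F1 of a recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean square error; penalizes large deviations more than MAE."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["b", "d", "e"])
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
print(f"MAE={mae([4, 3, 5], [5, 3, 3]):.2f} RMSE={rmse([4, 3, 5], [5, 3, 3]):.2f}")
```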
Issues and Challenges

  Hybridization Strategies
    • Idea of crossing two (or more) implementations
    • Parallel and pipelined invocation of different systems
    • Netflix competition: stacking recommender systems;
      weighted design based on more than 100 predictors

  Explanations in recommender systems:
    • “The digital camera Profishot is a must-buy for you because ...”
    • Additional information to explain the system’s output
      following some objectives
    • Knowledgeable explanations significantly increase the
      users’ perceived utility
Thank you for
your attention!

Más contenido relacionado

La actualidad más candente

How to build a recommender system?
How to build a recommender system?How to build a recommender system?
How to build a recommender system?blueace
 
Recommendation system
Recommendation systemRecommendation system
Recommendation systemRishabh Mehta
 
Recommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringRecommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringChangsung Moon
 
Survey of Recommendation Systems
Survey of Recommendation SystemsSurvey of Recommendation Systems
Survey of Recommendation Systemsyoualab
 
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation SystemsTrieu Nguyen
 
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System ExplainedCrossing Minds
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation SystemsRobin Reni
 
Introduction to Recommendation System
Introduction to Recommendation SystemIntroduction to Recommendation System
Introduction to Recommendation SystemMinha Hwang
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender systemStanley Wang
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation SystemAnamta Sayyed
 
Recommendation system
Recommendation systemRecommendation system
Recommendation systemAkshat Thakar
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringViet-Trung TRAN
 
Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systemsNAVER Engineering
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender SystemsLior Rokach
 
Recommender systems
Recommender systemsRecommender systems
Recommender systemsTamer Rezk
 
Movie lens recommender systems
Movie lens recommender systemsMovie lens recommender systems
Movie lens recommender systemsKapil Garg
 
Recommendation engines
Recommendation enginesRecommendation engines
Recommendation enginesGeorgian Micsa
 
Datajob 2013 - Construire un système de recommandation
Datajob 2013 - Construire un système de recommandationDatajob 2013 - Construire un système de recommandation
Datajob 2013 - Construire un système de recommandationDjamel Zouaoui
 
Past present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry PerspectivePast present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry PerspectiveXavier Amatriain
 

La actualidad más candente (20)

How to build a recommender system?
How to build a recommender system?How to build a recommender system?
How to build a recommender system?
 
Recommendation system
Recommendation systemRecommendation system
Recommendation system
 
Recommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringRecommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative Filtering
 
Survey of Recommendation Systems
Survey of Recommendation SystemsSurvey of Recommendation Systems
Survey of Recommendation Systems
 
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation Systems
 
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System Explained
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
 
Introduction to Recommendation System
Introduction to Recommendation SystemIntroduction to Recommendation System
Introduction to Recommendation System
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender system
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation System
 
Recommendation system
Recommendation systemRecommendation system
Recommendation system
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filtering
 
Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systems
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Recommender systems
Recommender systemsRecommender systems
Recommender systems
 
Movie lens recommender systems
Movie lens recommender systemsMovie lens recommender systems
Movie lens recommender systems
 
Recommendation engines
Recommendation enginesRecommendation engines
Recommendation engines
 
Datajob 2013 - Construire un système de recommandation
Datajob 2013 - Construire un système de recommandationDatajob 2013 - Construire un système de recommandation
Datajob 2013 - Construire un système de recommandation
 
Past present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry PerspectivePast present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry Perspective
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 

Similar a Recommender Systems! @ASAI 2011

Recommender Systems.pptx
Recommender Systems.pptxRecommender Systems.pptx
Recommender Systems.pptxNamrata Vij
 
Big data certification training mumbai
Big data certification training mumbaiBig data certification training mumbai
Big data certification training mumbaiTejaspathiLV
 
Best data science courses in pune
Best data science courses in puneBest data science courses in pune
Best data science courses in puneprathyusha1234
 
Top data science institutes in hyderabad
Top data science institutes in hyderabadTop data science institutes in hyderabad
Top data science institutes in hyderabadprathyusha1234
 
best online data science courses
best online data science coursesbest online data science courses
best online data science coursesprathyusha1234
 
Demystifying Recommendation Systems
Demystifying Recommendation SystemsDemystifying Recommendation Systems
Demystifying Recommendation SystemsRumman Chowdhury
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyKris Jack
 
Recsys 2014 Tutorial - The Recommender Problem Revisited
Recsys 2014 Tutorial - The Recommender Problem RevisitedRecsys 2014 Tutorial - The Recommender Problem Revisited
Recsys 2014 Tutorial - The Recommender Problem RevisitedXavier Amatriain
 
Content based recommendation systems
Content based recommendation systemsContent based recommendation systems
Content based recommendation systemsAravindharamanan S
 
Ronny lempelyahooindiabigthinkerapril2013
Ronny lempelyahooindiabigthinkerapril2013Ronny lempelyahooindiabigthinkerapril2013
Ronny lempelyahooindiabigthinkerapril2013Muthusamy Chelliah
 
Recommender systems for E-commerce
Recommender systems for E-commerceRecommender systems for E-commerce
Recommender systems for E-commerceAlexander Konduforov
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyMaya Hristakeva
 
Recommendation Systems Roadtrip
Recommendation Systems RoadtripRecommendation Systems Roadtrip
Recommendation Systems RoadtripThe Real Dyl
 
A recommendation engine for your php application
A recommendation engine for your php applicationA recommendation engine for your php application
A recommendation engine for your php applicationMichele Orselli
 

Similar a Recommender Systems! @ASAI 2011 (20)

Filtering content bbased crs
Filtering content bbased crsFiltering content bbased crs
Filtering content bbased crs
 
Recommender Systems.pptx
Recommender Systems.pptxRecommender Systems.pptx
Recommender Systems.pptx
 
Big data certification training mumbai
Big data certification training mumbaiBig data certification training mumbai
Big data certification training mumbai
 
Best data science courses in pune
Best data science courses in puneBest data science courses in pune
Best data science courses in pune
 
Top data science institutes in hyderabad
Top data science institutes in hyderabadTop data science institutes in hyderabad
Top data science institutes in hyderabad
 
best online data science courses
best online data science coursesbest online data science courses
best online data science courses
 
Demystifying Recommendation Systems
Demystifying Recommendation SystemsDemystifying Recommendation Systems
Demystifying Recommendation Systems
 
Recommenders.ppt
Recommenders.pptRecommenders.ppt
Recommenders.ppt
 
Recommenders.ppt
Recommenders.pptRecommenders.ppt
Recommenders.ppt
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in Mendeley
 
Recsys 2014 Tutorial - The Recommender Problem Revisited
Recsys 2014 Tutorial - The Recommender Problem RevisitedRecsys 2014 Tutorial - The Recommender Problem Revisited
Recsys 2014 Tutorial - The Recommender Problem Revisited
 
Content based recommendation systems
Content based recommendation systemsContent based recommendation systems
Content based recommendation systems
 
Ronny lempelyahooindiabigthinkerapril2013
Ronny lempelyahooindiabigthinkerapril2013Ronny lempelyahooindiabigthinkerapril2013
Ronny lempelyahooindiabigthinkerapril2013
 
Fashiondatasc
FashiondatascFashiondatasc
Fashiondatasc
 
Recommender systems for E-commerce
Recommender systems for E-commerceRecommender systems for E-commerce
Recommender systems for E-commerce
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in Mendeley
 
Lec7 collaborative filtering
Lec7 collaborative filteringLec7 collaborative filtering
Lec7 collaborative filtering
 
Recommender lecture
Recommender lectureRecommender lecture
Recommender lecture
 
Recommendation Systems Roadtrip
Recommendation Systems RoadtripRecommendation Systems Roadtrip
Recommendation Systems Roadtrip
 
A recommendation engine for your php application
A recommendation engine for your php applicationA recommendation engine for your php application
A recommendation engine for your php application
 

Más de Ernesto Mislej

Tópicos de Big Data - Introducción
Tópicos de Big Data - IntroducciónTópicos de Big Data - Introducción
Tópicos de Big Data - IntroducciónErnesto Mislej
 
Tópicos de Big Data - Sistemas de Recomendación
Tópicos de Big Data - Sistemas de RecomendaciónTópicos de Big Data - Sistemas de Recomendación
Tópicos de Big Data - Sistemas de RecomendaciónErnesto Mislej
 
Tópicos de Big Data - Link Analysis
Tópicos de Big Data - Link AnalysisTópicos de Big Data - Link Analysis
Tópicos de Big Data - Link AnalysisErnesto Mislej
 
Tópicos de Big Data - Items Similares
Tópicos de Big Data - Items SimilaresTópicos de Big Data - Items Similares
Tópicos de Big Data - Items SimilaresErnesto Mislej
 
Data Science & Big Data
Data Science & Big DataData Science & Big Data
Data Science & Big DataErnesto Mislej
 
Innovación en Big Data
Innovación en Big DataInnovación en Big Data
Innovación en Big DataErnesto Mislej
 
Dime qué tuiteas y te diré quién eres. DataFest 2013
Dime qué tuiteas y te diré quién eres. DataFest 2013Dime qué tuiteas y te diré quién eres. DataFest 2013
Dime qué tuiteas y te diré quién eres. DataFest 2013Ernesto Mislej
 
A quienes les gustó esta charla también les gustó... Cómo los Sistemas de Rec...
A quienes les gustó esta charla también les gustó... Cómo los Sistemas de Rec...A quienes les gustó esta charla también les gustó... Cómo los Sistemas de Rec...
A quienes les gustó esta charla también les gustó... Cómo los Sistemas de Rec...Ernesto Mislej
 
Opinion Mining #datafestAr
Opinion Mining #datafestArOpinion Mining #datafestAr
Opinion Mining #datafestArErnesto Mislej
 
Curso de Nivelación de Algoritmos - Clase 4
Curso de Nivelación de Algoritmos - Clase 4Curso de Nivelación de Algoritmos - Clase 4
Curso de Nivelación de Algoritmos - Clase 4Ernesto Mislej
 
Curso de Nivelación de Algoritmos - Clase 3
Curso de Nivelación de Algoritmos - Clase 3Curso de Nivelación de Algoritmos - Clase 3
Curso de Nivelación de Algoritmos - Clase 3Ernesto Mislej
 
Curso de Nivelación de Algoritmos - Clase 2
Curso de Nivelación de Algoritmos - Clase 2Curso de Nivelación de Algoritmos - Clase 2
Curso de Nivelación de Algoritmos - Clase 2Ernesto Mislej
 
Curso de Nivelación de Algoritmos - Clase 1
Curso de Nivelación de Algoritmos - Clase 1Curso de Nivelación de Algoritmos - Clase 1
Curso de Nivelación de Algoritmos - Clase 1Ernesto Mislej
 
Curso de Nivelación de Algoritmos - Clase 5
Curso de Nivelación de Algoritmos - Clase 5Curso de Nivelación de Algoritmos - Clase 5
Curso de Nivelación de Algoritmos - Clase 5Ernesto Mislej
 
Análisis Inteligente de Textos
Análisis Inteligente de TextosAnálisis Inteligente de Textos
Análisis Inteligente de TextosErnesto Mislej
 

Recommender Systems! @ASAI 2011

  • 1. Recommender Systems @ASAI 2011. Lic. Ernesto Mislej, ernesto@7puentes.com, @fetnelio. ASAI 2011 – 40 JAIIO, August 2011, Córdoba, Argentina. 7puentes.com
  • 2. Outline • Introduction: What are recommender systems for? • Paradigms: How do they work? • Collaborative Filtering • Content-based Filtering • Knowledge-Based Recommendations • Summary, Issues and Challenges Source: Tutorial: Recommender Systems, D. Jannach, G. Friedrich, IJCAI 2011.
  • 4. Problem domain There is an extensive class of Web applications that involve predicting user responses to options. Such a facility is called a recommendation system. RS are software agents that elicit the interests and preferences of individual consumers . . . and make recommendations accordingly. They have the potential to support and improve the quality of the decision consumers make while searching for and selecting products online. — Xiao & Benbasat, MISQ, 2007
  • 5. Different perspectives Retrieval perspective • Reduce search costs • Provide correct proposals • Users know in advance what they want Recommendation perspective • Serendipity: identify items from the Long Tail • Users did not know the items existed Serendipity (a.k.a. chiripa) is when someone finds something that they weren’t expecting to find.
  • 6. Different perspectives cont. Prediction perspective • Predict to what degree users like an item • Most popular evaluation scenario in research Interaction perspective • Give users a good feeling • Educate users about the product domain • Convince/persuade users - explain Conversion perspective • Commercial situations • Increase hit, click-through, and lookers-to-bookers rates • Optimize sales margins and profit
  • 7. The Long Tail • Recommend widely unknown items that users might actually like! • 20 % of items accumulate 74 % of all positive ratings
  • 8. RS seen as a function Given: • User model (e.g. ratings, preferences, demographics, situational context) • Items (with or without description of item characteristics) Find: • Relevance score. Used for ranking.
  • 10. Paradigms of recommender systems Recommender systems reduce information overload by estimating relevance.
  • 11. Paradigms of recommender systems Personalized recommendations
  • 12. Paradigms of recommender systems Collaborative: Tell me what’s popular among my peers
  • 13. Paradigms of recommender systems Content-based: Show me more of what I’ve liked
  • 14. Paradigms of recommender systems Knowledge-based: Tell me what fits based on my needs
  • 15. Paradigms of recommender systems Hybrid: combinations of various inputs and/or composition of different mechanisms
  • 17. Collaborative Filtering (CF) The most prominent approach to generate recommendations: • used by large, commercial e-commerce sites • well-understood, various algorithms and variations exist • applicable in many domains (books, movies, DVDs, . . . ) Approach: • use the wisdom of the crowd to recommend items Basic assumption and idea: • Users give ratings to catalog items (implicitly or explicitly) • Customers who had similar tastes in the past will have similar tastes in the future
  • 18. User-based nearest-neighbor CF Given an active user (Alice) and an item i not yet seen by Alice. The goal is to estimate Alice’s rating for this item, e.g., by: • finding a set of users (peers) who liked the same items as Alice in the past and who have rated item i • using, e.g., the average of their ratings to predict whether Alice will like item i • doing this for all items Alice has not seen and recommending the best-rated
  • 19. User-based nearest-neighbor CF Some questions: • How do we measure similarity? • How many neighbors should we consider? • How do we generate a prediction from the neighbors’ ratings? Utility Matrix:

             Item 1  Item 2  Item 3  Item 4  Item 5
      Alice    5       3       4       4       ?
      Berta    3       1       2       3       3
      Clara    4       3       4       3       5
      Diana    5       1       4       3       4
      Erika    1       5       5       2       1
  • 20. Measuring user similarity Pearson correlation:

      \[ \mathrm{sim}(a, b) = \frac{\sum_{p \in P} (r_{a,p} - \bar{r}_a)(r_{b,p} - \bar{r}_b)}{\sqrt{\sum_{p \in P} (r_{a,p} - \bar{r}_a)^2} \, \sqrt{\sum_{p \in P} (r_{b,p} - \bar{r}_b)^2}} \]

    a, b: users; r_{a,p}: rating of user a for item p; P: set of items rated by both a and b; \bar{r}_a, \bar{r}_b: the users’ average ratings. Possible similarity values lie between -1 and 1.

             Item 1  Item 2  Item 3  Item 4  Item 5   Sim
      Alice    5       3       4       4       ?       -
      Berta    3       1       2       3       3      0.852
      Clara    4       3       4       3       5      0.707
      Diana    5       1       4       3       4      0.956
      Erika    1       5       5       2       1     -0.792
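The Pearson correlation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's code. It assumes that each user's average is taken over the items co-rated with Alice, which is the choice that matches the Sim column of the table up to rounding. The names `ratings`, `pearson` and `sims` are hypothetical.

```python
from math import sqrt

# Ratings from the slide's utility matrix (None marks Alice's unknown rating).
ratings = {
    "Alice": [5, 3, 4, 4, None],
    "Berta": [3, 1, 2, 3, 3],
    "Clara": [4, 3, 4, 3, 5],
    "Diana": [5, 1, 4, 3, 4],
    "Erika": [1, 5, 5, 2, 1],
}

def pearson(a, b):
    """Pearson correlation over the items rated by both users."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    # Assumption: averages are computed over the co-rated items only.
    mean_a = sum(x for x, _ in pairs) / len(pairs)
    mean_b = sum(y for _, y in pairs) / len(pairs)
    num = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    den = sqrt(sum((x - mean_a) ** 2 for x, _ in pairs)) * sqrt(
        sum((y - mean_b) ** 2 for _, y in pairs)
    )
    return num / den if den else 0.0

# Similarity of every other user to Alice.
sims = {u: pearson(ratings["Alice"], r) for u, r in ratings.items() if u != "Alice"}
```

Diana comes out as Alice's closest peer, while Erika's negative value marks opposite tastes.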
  • 21. Making predictions • A common prediction function:

      \[ \mathrm{pred}(a, p) = \bar{r}_a + \frac{\sum_{b \in N} \mathrm{sim}(a, b) \cdot (r_{b,p} - \bar{r}_b)}{\sum_{b \in N} \mathrm{sim}(a, b)} \]

    where N is the set of neighbors of a who have rated item p. • Calculate whether the neighbors’ ratings for the unseen item p are higher or lower than their average • Combine the rating differences, using the similarity as a weight • Add/subtract the neighbors’ bias from the active user’s average and use this as a prediction
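As a rough sketch, the prediction function can be applied to Alice and Item 5 from the earlier utility matrix. Two assumptions are made here that the slides do not state: the neighborhood N is the two most similar users (Diana and Berta), and each neighbor's average is taken over all five of her ratings. The names `predict` and `pred_item5` are hypothetical.

```python
def predict(active_mean, neighbors):
    """Similarity-weighted deviation from the mean.

    neighbors: list of (similarity, neighbor's rating for the item, neighbor's mean).
    """
    weight_sum = sum(sim for sim, _, _ in neighbors)
    if weight_sum == 0:
        return active_mean  # no usable neighbors: fall back to the user's average
    bias = sum(sim * (r - mean) for sim, r, mean in neighbors) / weight_sum
    return active_mean + bias

alice_mean = (5 + 3 + 4 + 4) / 4  # Alice's average over her four known ratings

# Assumed neighborhood: the two most similar users from the similarity table.
neighbors = [
    (0.956, 4, (5 + 1 + 4 + 3 + 4) / 5),  # Diana: sim, rating for Item 5, mean
    (0.852, 3, (3 + 1 + 2 + 3 + 3) / 5),  # Berta: sim, rating for Item 5, mean
]

pred_item5 = predict(alice_mean, neighbors)
```

Both neighbors rated Item 5 about 0.6 above their own average, so the prediction lands about 0.6 above Alice's average of 4.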
  • 22. Improving the metrics / prediction function • Not all neighbor ratings might be equally valuable: agreement on commonly liked items is not as informative as agreement on controversial items. • Value of number of co-rated items: use significance weighting, e.g., by linearly reducing the weight when the number of co-rated items is low. • Case amplification: give more weight to very similar neighbors, i.e., where the similarity value is close to 1. • Neighborhood selection: use a similarity threshold or a fixed number of neighbors.
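Two of these adjustments are easy to illustrate. The sketch below shows one common way to write significance weighting and case amplification. The threshold of 50 co-rated items and the exponent 2.5 are assumed illustrative values, not taken from the slides, and both function names are hypothetical.

```python
def significance_weight(sim, n_corated, threshold=50):
    """Linearly shrink the similarity when fewer than `threshold` items are co-rated."""
    return sim * min(n_corated, threshold) / threshold

def case_amplify(sim, rho=2.5):
    """Push similarities away from 1 toward 0, so near-1 neighbors dominate; keeps the sign."""
    return sim * abs(sim) ** (rho - 1)
```

With these defaults, a similarity of 0.8 based on only 5 co-rated items drops to 0.08, whereas a similarity of 1.0 is left unchanged by case amplification.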
  • 23. Memory-based and model-based approaches User-based CF is said to be memory-based: • The rating matrix is directly used to find neighbors / make predictions • does not scale for most real-world scenarios • large e-commerce sites have tens of millions of customers and millions of items Model-based approaches: • based on an offline pre-processing or model-learning phase • at run-time, only the learned model is used to make predictions • models are updated / re-trained periodically
  • 24. Item-based CF Item-based collaborative filtering recommendation algorithms, B. Sarwar et al., WWW 2001 • Scalability issues arise with user-to-user (U2U) CF if there are many more users than items (m ≫ n, m = |users|, n = |items|) • High sparsity leads to few common ratings between two users. • Basic idea: Item-based CF exploits relationships between items first, instead of relationships between users
  • 25. Item-based CF Use the similarity between items (and not users) to make predictions. • Look for items that are similar to Item 5. • Take Alice’s ratings for these items to predict the rating for Item 5.

             Item 1  Item 2  Item 3  Item 4  Item 5
      Alice    5       3       4       4       ?
      Berta    3       1       2       3       3
      Clara    4       3       4       3       5
      Diana    3       3       1       5       4
      Erika    1       5       5       2       1
  • 26. Item-based CF The cosine similarity measure: • Produces better results in item-to-item filtering. • Ratings are seen as vectors in n-dimensional space. • Similarity is calculated based on the angle between the vectors. Pre-processing for item-based filtering: • Calculate all pair-wise item similarities in advance. • The neighborhood to be used at run-time is typically rather small, because only items that the user has rated are taken into account. • Item similarities are supposed to be more stable than user similarities.
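A minimal sketch of item-based CF on the item-based utility matrix follows. Each item is represented by the ratings of the four users who rated every item (Berta, Clara, Diana, Erika), and Alice's rating for Item 5 is predicted as a similarity-weighted average of her own ratings. Using plain (unadjusted) cosine and all four rated items as the neighborhood are assumptions made for illustration; the names `items`, `alice`, `cosine`, `item_sims` and `pred_item5` are hypothetical.

```python
from math import sqrt

# Columns of the utility matrix: ratings by Berta, Clara, Diana, Erika.
items = {
    "Item 1": [3, 4, 3, 1],
    "Item 2": [1, 3, 3, 5],
    "Item 3": [2, 4, 1, 5],
    "Item 4": [3, 3, 5, 2],
    "Item 5": [3, 5, 4, 1],
}
alice = {"Item 1": 5, "Item 2": 3, "Item 3": 4, "Item 4": 4}

def cosine(u, v):
    """Cosine of the angle between two rating vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

# Similarity of every other item to Item 5 (could be pre-computed offline).
item_sims = {
    name: cosine(items["Item 5"], col) for name, col in items.items() if name != "Item 5"
}

# Weighted average of Alice's own ratings, weighted by item similarity.
pred_item5 = sum(item_sims[i] * r for i, r in alice.items()) / sum(
    item_sims[i] for i in alice
)
```

Item 1 turns out to be the item most similar to Item 5, so Alice's rating of 5 for Item 1 carries the most weight in the prediction.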
  • 27. More on Ratings Explicit ratings: • Most commonly used (1 to 5, 1 to 7 Likert response scales) • Optimal scale; 10-point scale is accepted in movie domain. • Multidimensional ratings (multiple ratings per movie). • Users not always willing to rate many items. • How to stimulate users to rate more items? Implicit ratings: • Clicks, page views, time spent on some page, demo downloads . . . • Can be used in addition to explicit ones; question of correctness of interpretation.
  • 28. Data sparsity problems Cold start problem: • How to recommend new items? • What to recommend to new users? Straightforward approaches: • Ask/force users to rate a set of items. • Use another method (e.g., content-based, demographic or simply non-personalized) in the initial phase Alternatives: • Assume transitivity of neighborhoods. • Recursive CF. • Graph-based methods.
  • 29. More model-based approaches A plethora of different techniques has been proposed in recent years: • Matrix factorization techniques, statistics. • Singular value decomposition (SVD), principal component analysis (PCA). • Association rule mining. • Shopping basket analysis. • Probabilistic models. • Clustering models, Bayesian networks, probabilistic Latent Semantic Analysis. • Various other machine learning approaches.
  • 30. Collaborative Filtering Issues PROs: • Well-understood. • Works well in some domains. • No knowledge engineering required. CONs: • Requires user community • Sparsity problems. • No integration of other knowledge sources. • No explanation of results.
  • 32. Content-based recommendation Idea: CF methods do not require any information about the items, but it might be reasonable to exploit such information, e.g., to recommend fantasy novels to people who liked fantasy novels in the past. What do we need: • some information about the available items such as the genre (content). • some sort of user profile describing what the user likes (the preferences). The task: • learn user preferences. • locate/recommend items that are similar to the user preferences.
  • 33. What is the content? The genre is actually not part of the content of a book. • Most CB-recommendation methods originate from Information Retrieval (IR) field. • The goal is to find and rank interesting text documents (news articles, web pages) • The item descriptions are usually automatically extracted (important words)
  • 34. Content representation and item similarities IR methods: • Compute the similarity of an unseen item with the user profile based on the keyword overlap:

      \[ \mathrm{sim}(a, b) = \frac{2 \cdot |\mathrm{keywords}(a) \cap \mathrm{keywords}(b)|}{|\mathrm{keywords}(a)| + |\mathrm{keywords}(b)|} \]

    • Standard measure: TF-IDF. • Vector space model. Vectors are usually long and sparse. • Remove stopwords, stemming, top-tokens . . . • Cosine similarity.
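The keyword-overlap formula above is the Dice coefficient on keyword sets. A minimal sketch, with a hypothetical function name and made-up keyword sets:

```python
def dice(keywords_a, keywords_b):
    """Dice coefficient: twice the shared keywords over the total keyword count."""
    a, b = set(keywords_a), set(keywords_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty descriptions
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical example: a user profile and an unseen book.
profile = {"fantasy", "magic", "dragon"}
book = {"fantasy", "dragon", "war", "king"}
overlap = dice(profile, book)  # 2 shared keywords, 3 + 4 keywords in total
```

The measure is 1.0 for identical keyword sets and 0.0 for disjoint ones, and it ignores how often or how distinctively a keyword occurs, which is what TF-IDF weighting addresses.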
  • 35. Recommending items Simple method: nearest neighbors • Given a set of documents D already rated by the user (like/dislike) • Find the n nearest neighbors of a not-yet-seen item i in D • Take these ratings to predict a rating/vote for i • (Variations: neighborhood size, lower/upper similarity thresholds, ...) • Good to model short-term interests / follow-up stories • Used in combination with a method to model long-term preferences
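The nearest-neighbor method above can be sketched as a majority vote among the n most similar rated documents. The similarity used here is the keyword-overlap (Dice) measure, which is one possible choice; the documents, the default n = 3 and the names `overlap`, `knn_vote` and `rated` are all hypothetical.

```python
def overlap(a, b):
    """Dice coefficient on two keyword sets."""
    return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 0.0

def knn_vote(new_item, rated_docs, n=3):
    """Return True if most of the n nearest rated documents were liked.

    rated_docs: list of (keyword_set, liked) pairs already rated by the user.
    """
    nearest = sorted(rated_docs, key=lambda d: overlap(new_item, d[0]), reverse=True)[:n]
    likes = sum(1 for _, liked in nearest if liked)
    return likes * 2 > len(nearest)

# Hypothetical reading history of one user.
rated = [
    ({"election", "senate", "vote"}, True),
    ({"election", "poll", "candidate"}, True),
    ({"football", "goal", "league"}, False),
    ({"tennis", "open", "final"}, False),
    ({"senate", "budget", "vote"}, True),
]

politics_story = knn_vote({"election", "vote", "budget"}, rated)
sports_story = knn_vote({"football", "final", "goal"}, rated)
```

In this example the politics story is recommended and the sports story is not, because their nearest rated documents were mostly liked and mostly disliked respectively.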
  • 36. Recommending items Query-based retrieval: Rocchio’s method • The SMART System: Users are allowed to rate (relevant/irrelevant) retrieved documents (feedback) • Document collections D+ and D− Probabilistic methods: • Recommendation as classical text classification problem. • 2 classes: hot/cold; simple Boolean document representation. • calculate probability that document is hot/cold based on Bayes theorem.
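Rocchio's relevance feedback is usually written as moving the query (or profile) vector toward the centroid of the relevant documents D+ and away from the centroid of the irrelevant documents D−. The sketch below uses sparse term-weight dictionaries. The coefficient values are commonly cited defaults and are assumed here rather than taken from the slides, as is the clipping of negative weights; the function name is hypothetical.

```python
from collections import defaultdict

def rocchio(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return alpha*query + beta*centroid(relevant) - gamma*centroid(irrelevant)."""
    updated = defaultdict(float)
    for term, w in query.items():
        updated[term] += alpha * w
    for docs, coeff in ((relevant, beta), (irrelevant, -gamma)):
        for doc in docs:
            for term, w in doc.items():
                updated[term] += coeff * w / len(docs)
    # Assumption: terms with non-positive weight are dropped from the profile.
    return {t: w for t, w in updated.items() if w > 0}

# Hypothetical feedback round: the user meant the animal, not the car.
query = {"jaguar": 1.0}
d_plus = [{"jaguar": 1.0, "cat": 1.0}]
d_minus = [{"jaguar": 1.0, "car": 1.0}]
new_query = rocchio(query, d_plus, d_minus)
```

After one round the term "cat" enters the profile and "car" is excluded, which steers later retrieval toward the user's intended meaning.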
  • 37. Limitations of CB recommendation methods Keywords alone may not be sufficient to judge quality/relevance of a document or web page: • up-to-dateness, usability, aesthetics, writing style • content may also be limited / too short • content may not be automatically extractable (multimedia) Ramp-up phase required: • Some training data is still required. • Web 2.0: Use other sources to learn the user preferences Overspecialization: • Algorithms tend to propose more of the same • Or: too similar news items
  • 39. KB Recommendation Explicit domain knowledge: • Sales knowledge elicitation from domain experts • System mimics the behavior of an experienced sales assistant • Best-practice sales interactions • Can guarantee correct recommendations (determinism) with respect to expert knowledge Conversational interaction strategy: • Opposed to one-shot interaction • Elicitation of user requirements • Transfer of product knowledge (educating users) Hybridization: • E.g. merging explicit knowledge with community data • Can ensure some policies based on e.g. availability, user context or profit margin
  • 40. KB Recommendation Constraint-based recommendation: • A knowledge-based RS formulated as a constraint satisfaction problem • What if no solution exists? • Find minimal relaxations Ask user: • Computation of minimal revisions of requirements. • Do you want to relax your brand preference? Accept Panasonic instead of the Canon brand? • Idea that users navigate a product space.
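The relaxation idea can be sketched with a brute-force search: try all requirements first, then drop one, then two, and stop at the first size that yields products. This finds the smallest sets of requirements to give up, which is a simplification of the minimal-relaxation computation named on the slide. The catalog, the requirements and all names below are hypothetical.

```python
from itertools import combinations

# Hypothetical camera catalog.
catalog = [
    {"name": "P1", "brand": "Panasonic", "price": 180, "zoom": 10},
    {"name": "C1", "brand": "Canon", "price": 320, "zoom": 12},
    {"name": "C2", "brand": "Canon", "price": 150, "zoom": 3},
]

# Hypothetical user requirements, each a named predicate over a product.
requirements = {
    "brand": lambda p: p["brand"] == "Canon",
    "price": lambda p: p["price"] <= 200,
    "zoom": lambda p: p["zoom"] >= 8,
}

def matches(reqs, names):
    """Products satisfying every requirement listed in `names`."""
    return [p for p in catalog if all(reqs[n](p) for n in names)]

def minimal_relaxations(reqs):
    """Smallest sets of requirements whose removal yields at least one product."""
    names = list(reqs)
    for k in range(len(names) + 1):
        found = []
        for dropped in combinations(names, k):
            kept = [n for n in names if n not in dropped]
            result = matches(reqs, kept)
            if result:
                found.append((set(dropped), [p["name"] for p in result]))
        if found:
            return found
    return []

relaxations = minimal_relaxations(requirements)
```

Here no camera satisfies all three requirements, and each single requirement can be relaxed on its own; relaxing the brand preference, for instance, surfaces the Panasonic model, which corresponds to the question the system would ask the user.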
  • 41. Conversational strategies Process consisting of multiple conversational moves: • Resembles natural sales interactions • Not all user requirements known beforehand • Customers are rarely satisfied with the initial recommendations Different styles of preference elicitation: • Free text query interface. • Asking technical/generic properties • Images / inspiration • Proposing and Critiquing
  • 43. Summary Collaborative Filtering (CF): PROs: Nearly no ramp-up effort, serendipity of results, learns market segments CONs: Requires some form of rating feedback, cold start for new users and new items Content-based: PROs: No community required, comparison between items possible CONs: Content-descriptions necessary, cold start for new users, no surprises Knowledge-based: PROs: Deterministic recommendations, assured quality, no cold-start, can resemble sales dialogue. CONs: Knowledge engineering effort to bootstrap, basically static, does not react to short-term trends.
  • 44. Issues and Challenges Evaluating Recommender Systems • Is an RS efficient with respect to specific criteria such as accuracy, user satisfaction, response time, serendipity, online conversion, ramp-up effort, . . . ? • Do customers like/buy recommended items? • Do customers buy items they otherwise would not have? • Are they satisfied with a recommendation after purchase? Recommendation is viewed as an information retrieval (IR) task. • Precision and Recall, F1:

      \[ F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \]

    • Mean Absolute Error (MAE), Root Mean Square Error (RMSE).
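The evaluation measures listed above follow their standard definitions and can be sketched as below. The function names and the example data are hypothetical.

```python
from math import sqrt

def precision_recall_f1(recommended, relevant):
    """Set-based precision, recall and F1 for one recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def mae(predicted, actual):
    """Mean Absolute Error between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root Mean Square Error; penalizes large errors more than MAE."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical example: 4 recommended items, 3 truly relevant, 2 in common.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"])
```

Precision, recall and F1 judge a ranked list against the items the user actually liked, while MAE and RMSE judge predicted ratings against true ratings, so the two groups fit the IR view and the prediction view respectively.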
  • 45. Issues and Challenges Hybridization Strategies • Idea of crossing two (or more) implementations • Parallel and pipelined invocation of different systems • Netflix competition - stacking recommender systems: weighted design based on > 100 predictors Explanations in recommender systems: • The digital camera Profishot is a must-buy for you because . . . • Additional information to explain the system’s output following some objectives • Knowledgeable explanations significantly increase the users’ perceived utility
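A weighted, parallel hybrid is the simplest of these strategies to illustrate: each recommender scores the items independently and the scores are combined linearly. The sketch below assumes that scores are already on a comparable scale and that an item missing from one recommender contributes 0 from it. The scores, weights and function name are hypothetical.

```python
def weighted_hybrid(score_lists, weights):
    """Rank items by the weighted sum of the scores from several recommenders."""
    combined = {}
    for scores, w in zip(score_lists, weights):
        for item, s in scores.items():
            combined[item] = combined.get(item, 0.0) + w * s
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical relevance scores from a collaborative and a content-based recommender.
cf_scores = {"A": 0.9, "B": 0.4, "C": 0.1}
cb_scores = {"A": 0.2, "B": 0.8, "C": 0.3}
ranking = weighted_hybrid([cf_scores, cb_scores], [0.5, 0.5])
```

In practice the weights would be learned from held-out data rather than fixed by hand.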
  • 46. Thank you for your attention! Lic. Ernesto Mislej, ernesto@7puentes.com, @fetnelio. 7puentes.com