RecSys: Recommender Systems
            Tran The Truyen
            http://truyen.vietlabs.com
The world is an over-crowded place
They all want to get our attention
We are overloaded
• Thousands of news articles
  and blog posts each day
• Millions of movies, books
  and music tracks online
• In Hanoi, > 50 TV channels,
  thousands of programs
  each day
• In New York, several
  thousands of ad messages
  sent to us per day
But we really need and
consume only a few of them!
Sometimes, all we need is this
Or, just this


DON’T DISTURB!
Help me!
Can Google help?
• Yes, but only when we really know what
 we are looking for
• What if I just want some interesting music
 tracks?
  – By the way, what does “interesting” mean?
Can Facebook help?
• Yes, I tend to find my friends’ stuff
  interesting
• What if I have only a few friends, and what
  they like does not always appeal to me?
Can experts help?
• Yes, but it won’t scale well
  – Everyone receives exactly the same advice!

• It is what they like, not what I like!
  – As with movies, expert approval does
    not guarantee the attention of the masses
OK, here is the idea called RecSys:
                                I like these bits
• To recommend to us
  something we may like
  – It may not be popular
  – The world is long-tailed
• How?
  – Based on our history of
    using services
  – Based on other people
    like us
  – Ever heard of “collective
    intelligence”?
Hang on, what is long-tailed?
• Popularised by Chris Anderson, Wired 2004

[Figure: three panels showing the short-tailed distribution, the bell-shaped distribution and the long-tailed distribution]
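A rough way to see what a long tail implies is the sketch below. It assumes, purely for illustration, that demand for the item at popularity rank r is proportional to 1/r (a Zipf-like law); the function name `tail_share` and the catalogue sizes are hypothetical and not taken from the slides.

```python
def tail_share(n_items: int, head_size: int, exponent: float = 1.0) -> float:
    """Fraction of total demand that falls outside the `head_size` most popular items.

    Assumption: demand at popularity rank r is proportional to 1 / r**exponent.
    """
    weights = [1.0 / rank ** exponent for rank in range(1, n_items + 1)]
    return sum(weights[head_size:]) / sum(weights)


# A hypothetical catalogue of 100,000 items where only the top 1,000 are "hits".
share = tail_share(n_items=100_000, head_size=1_000)
print(f"Share of demand outside the hits: {share:.0%}")
```

Under this assumption a sizeable part of the demand sits outside the hits, which is the part a popularity chart never shows.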
Ever heard of
• GroupLens?
• Amazon recommendation?
• Netflix Cinematch?
• Google News personalization?
• Netflix Prize $1mil challenge?
• Strands?
• TiVo?
• Findory?
Want some evidence?
             (Celma & Lamere, ISMIR 2007)


• Netflix:
  – 2/3 of rented movies come from recommendations

• Google News
  – 38% more click-throughs are due to
    recommendations

• Amazon
  – 35% of sales come from recommendations
What can be recommended?
• Advertising messages • Tags
• Investment choices   • News articles
• Restaurants          • Online mates (Dating services)
• Cafes                • Future friends (Social network sites)
• Music tracks         • Courses in e-learning
• Movies               • Drug components
• TV programs          • Research papers
• Books                • Citations
• Clothes              • Code modules
• Supermarket goods    • Programmers
But, what do recommender
         systems do, exactly?
1. Predict how much you may like a certain
   product/service
2. Compose a list of N best items for you
3. Compose a list of N best users for a certain
   product/service
4. Explain to you why these items are recommended to
   you
5. Adjust the prediction and recommendation based on
   your feedback and that of other people
Graph representation

[Figure: graph linking the users Me, My friend, You and Another guy to the movies Titanic, Taken and Panda; a “?” marks an unknown link]
We must also take good care of

• Data normalisation
• Removal or reduction of noise
• Protection of users’ privacy
• Attack: someone just doesn’t like your
 system
Task 1: Preference prediction
• Collaborative filtering
  – User-based method
  – Item-based method
  – Matrix Factorization
• Content-based filtering
• Hybrid:
  – Linear/sequential/switching combination
  – Semi-Restricted Boltzmann Machines
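The linear and switching combinations can be sketched in a few lines. This is only an illustration: the function names, the default weight of 0.7 and the threshold of 5 ratings are assumptions, not values from the slides.

```python
def linear_hybrid(cf_score: float, content_score: float, weight: float = 0.7) -> float:
    """Blend a collaborative score and a content-based score with a fixed weight."""
    return weight * cf_score + (1.0 - weight) * content_score


def switching_hybrid(cf_score, content_score, n_ratings: int, min_ratings: int = 5):
    """Use the collaborative score only when the user has enough history.

    Falls back to the content-based score for new users (cold start)
    or when no collaborative score is available.
    """
    if cf_score is not None and n_ratings >= min_ratings:
        return cf_score
    return content_score
```

In practice the weight or the switching rule would be tuned on held-out data rather than fixed by hand.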
Collaborative filtering (1)
• User-based method (1994,
  GroupLens)
  – Many people liked “Kung Fu Panda”
  – Can you tell how much I like it?
  – The idea is to pick about 20-50 people who share
    similar taste with me; how much I like it then
    depends on how much THEY liked it.
  – In short: you may like it because your “friends” liked it
[Figure: user × item rating matrix, users 1-8 by items 1-8, sparsely filled with ratings from 1 to 5]
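A minimal sketch of the user-based idea follows. It assumes ratings are stored as a dict of dicts, uses Pearson correlation as the similarity and a mean-centred weighted average as the prediction; these are common choices, not necessarily the exact GroupLens formulation, and all names are illustrative.

```python
from math import sqrt


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation over the items both users have rated."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a = mean(a[i] for i in common)
    mean_b = mean(b[i] for i in common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict_user_based(ratings, user, item, k=30):
    """Predict `user`'s rating of `item` from the k most similar users who rated it."""
    own_mean = mean(ratings[user].values())
    neighbours = []
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = pearson(ratings[user], theirs)
        if sim > 0:  # keep only people with similar taste
            neighbours.append((sim, other))
    neighbours = sorted(neighbours, reverse=True)[:k]
    if not neighbours:
        return own_mean  # no evidence: fall back to the user's average
    num = sum(sim * (ratings[o][item] - mean(ratings[o].values())) for sim, o in neighbours)
    den = sum(sim for sim, _ in neighbours)
    return own_mean + num / den


ratings = {
    "me": {"Titanic": 5, "Taken": 2, "Up": 4},
    "ann": {"Titanic": 5, "Taken": 1, "Up": 4, "Panda": 5},
    "bob": {"Titanic": 1, "Taken": 5, "Up": 2, "Panda": 1},
}
print(predict_user_based(ratings, "me", "Panda"))
```

Here "ann" agrees with "me" and "bob" does not, so only ann's opinion of "Panda" influences the prediction.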
Collaborative filtering (2)
• Item-based method (2001,
  deployed at Amazon)
   – I have watched so many good &
     bad movies
   – Would you recommend that I
     watch “Taken”?
   – The idea is to pick from my previous list 20-50 movies
     that share a similar audience with “Taken”; how much I
     will like it then depends on how much I liked those
     earlier movies
   – In short: I tend to watch this movie
     because I have watched those
     movies … or
   – People who have watched those
     movies also liked this movie
     (Amazon style)
[Figure: user × item rating matrix, users 1-8 by items 1-8, sparsely filled with ratings from 1 to 5]
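The item-based idea can be sketched the same way. The code below assumes a dict-of-dicts rating store and uses adjusted cosine similarity (ratings centred on each user's mean), which is one common choice for item-item similarity; the function names and the toy data are illustrative only.

```python
from math import sqrt


def user_means(ratings):
    return {u: sum(r.values()) / len(r) for u, r in ratings.items()}


def adjusted_cosine(ratings, means, i, j):
    """Similarity of items i and j over users who rated both, after removing user means."""
    common = [u for u, r in ratings.items() if i in r and j in r]
    if not common:
        return 0.0
    num = sum((ratings[u][i] - means[u]) * (ratings[u][j] - means[u]) for u in common)
    den = sqrt(sum((ratings[u][i] - means[u]) ** 2 for u in common)) * sqrt(
        sum((ratings[u][j] - means[u]) ** 2 for u in common)
    )
    return num / den if den else 0.0


def predict_item_based(ratings, user, item, k=30):
    """Predict from the user's own ratings of the k items most similar to `item`."""
    means = user_means(ratings)
    scored = []
    for other_item, rating in ratings[user].items():
        if other_item == item:
            continue
        sim = adjusted_cosine(ratings, means, item, other_item)
        if sim > 0:
            scored.append((sim, rating))
    top = sorted(scored, reverse=True)[:k]
    if not top:
        return None  # nothing similar in the user's history
    return sum(sim * rating for sim, rating in top) / sum(sim for sim, _ in top)


ratings = {
    "me": {"Bourne": 5, "Titanic": 2},
    "ann": {"Bourne": 5, "Taken": 5, "Titanic": 1},
    "bob": {"Bourne": 1, "Taken": 2, "Titanic": 5},
    "cat": {"Bourne": 4, "Taken": 5, "Titanic": 2},
}
print(predict_item_based(ratings, "me", "Taken"))
```

Item-item similarities change slowly, so in a deployment they are usually precomputed offline and only looked up at request time.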
Collaborative filtering (3)
• Matrix Factorization (2006, Netflix
  challenge)
  – You may have watched thousands of movies
  – But perhaps I can tell these movies belong to
    10 groups, like Action, Sci-Fi, Animation,
    etc.
  – So 10 numbers are enough to describe your
    taste, e.g. ~ [0.1 0.3 0.2 0.9 0.5 0.4 0.7 0.3 0.8 1.5]
  – Likewise, “Titanic” has been watched by
    millions of people, but perhaps 10 numbers
    are enough to describe its features
  – Magic: these hidden aspects can be
    discovered automatically by Matrix
    Factorization!
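One way to discover those hidden numbers is stochastic gradient descent on the squared rating error. The sketch below is a plain-Python illustration under assumed hyperparameters (10 factors, learning rate 0.02, regularisation 0.02); it is not the specific model used by any Netflix Prize team.

```python
import random
from math import sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def factorize(triples, n_factors=10, n_epochs=500, lr=0.02, reg=0.02, seed=0):
    """Learn one vector of `n_factors` numbers per user and per item.

    `triples` is a list of (user, item, rating); a rating is modelled
    as the dot product of the user vector and the item vector.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in triples})
    items = sorted({i for _, i, _ in triples})
    P = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(n_epochs):
        for u, i, r in triples:
            err = r - dot(P[u], Q[i])
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularisation on both vectors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(triples, P, Q):
    return sqrt(sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in triples) / len(triples))


triples = [
    ("me", "Titanic", 5), ("me", "Taken", 2),
    ("ann", "Titanic", 5), ("ann", "Taken", 1), ("ann", "Panda", 5),
    ("bob", "Titanic", 1), ("bob", "Taken", 5), ("bob", "Panda", 1),
]
P, Q = factorize(triples)
print("training RMSE:", rmse(triples, P, Q))
print("predicted rating of Panda for me:", dot(P["me"], Q["Panda"]))
```

The learned factors carry no labels such as "Action" or "Sci-Fi"; any such interpretation has to be read into them afterwards.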
Problems with collaborative filtering
• Scale
   – Netflix (2007): 5M users, 50K movies, 1.4B ratings

• Sparse data
   – I have rated only one book at Amazon!

• Cold-Start
   – New users and items do not have history

• Popularity bias
   – Everyone reads “Harry Potter”

• Hacking
   – Someone who reads “Harry Potter” also reads “Kama Sutra”
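The scale and sparsity points can be made concrete with the Netflix figures quoted above. The arithmetic below only restates those numbers; the helper name `density` is illustrative.

```python
def density(n_users: int, n_items: int, n_ratings: int) -> float:
    """Fraction of the user x item matrix that actually holds a rating."""
    return n_ratings / (n_users * n_items)


n_users, n_movies, n_ratings = 5_000_000, 50_000, 1_400_000_000

print(f"cells in the full matrix: {n_users * n_movies:,}")
print(f"filled cells:             {density(n_users, n_movies, n_ratings):.2%}")
print(f"ratings per user (avg):   {n_ratings / n_users:.0f}")
```

Even with 1.4B ratings, well under 1% of the matrix is observed, which is why neighbours with enough items in common can be hard to find.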
Content-based method
• Web page: words, hyperlinks, images, tags, comments,
  titles, URL, topic
• Music: genre, rhythm, melody, harmony, lyrics, meta data,
  artists, bands, press releases, expert reviews, loudness,
  energy, time, spectrum, duration, frequency, pitch, key,
  mode, mood, style, tempo
• User: age, sex, job, location, time, income, education,
  language, family status, hobbies, general interests, Web
  usage, computer usage, fan club membership, opinion,
  comments, tags, mobile usage
• Context: time, location, mobility, activity, socializing,
  emotion
Content-based method (2)
• Can we acquire those content pieces
  automatically?
  – Fairly easy for text
  – Difficult for music and video, except for digital signals.
    E.g. music genre classification 60-80% accuracy
  – A lot of noise, e.g. misplaced tags
  – Attacks
• What can we do with these?
  – Compute similarity between items or users
  – Query items that are similar to a given item
  – Match item’s content and user’s profile
Content-based method (3)
• Measuring similarity
  – Cosine, TF-IDF as in standard Information
    Retrieval
  – KL-divergence for probability-oriented guys
  – Euclidean, dimensionality reduction if you
    want
  – Anything you can imagine!
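As an illustration of the first option, here is a small TF-IDF plus cosine sketch in plain Python. It assumes whitespace tokenisation and the basic tf × log(N/df) weighting; real Information Retrieval systems use more careful tokenisation and weighting variants, and the item descriptions below are made up.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document name to a sparse {term: weight} TF-IDF vector."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    num = sum(weight * b[term] for term, weight in a.items() if term in b)
    den = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return num / den if den else 0.0


docs = {
    "Kung Fu Panda": "kung fu panda animation comedy",
    "Up": "animation comedy family adventure",
    "Taken": "action thriller kidnapping paris",
}
vectors = tfidf_vectors(docs)
print(cosine(vectors["Kung Fu Panda"], vectors["Up"]))
print(cosine(vectors["Kung Fu Panda"], vectors["Taken"]))
```

The same `cosine` function can match an item against a user profile if the profile is built as a vector over the same terms.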
Hybrid: Semi-Restricted Boltzmann
       Machines (2009, IMPCA)
• A probabilistic combination of
   –   Item-based method
   –   User-based method
   –   Matrix Factorization
   –   (Maybe) content-based method

• It looks like a Neural Network
   – But it is not really one ☺

• It really is a type of Markov
  random field, which is, in turn, a
  type of graphical model
   – Self-advertising: I work on this
     stuff for a living!
[Figure: network diagram with User A, User B, User C and Item X]
Task 2,3: Top-N recommendation

• Top-N item list:
   – Find similar users, collect what they like
   – Filter out those the user has rated
   – Rank the remaining items by considering
      •   The number of times each item is liked by those users
      •   The popularity of the item
      •   The associated ratings
      •   The similarity between each item in the list and what the user
          has rated

• Swapping the roles of items and users gives a
  top-N user list
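The top-N item list can be sketched as follows. For brevity the sketch assumes binary "likes" stored as sets, Jaccard overlap as the user similarity, and a score that sums the similarities of the neighbours who liked each item; popularity and rating values, also listed above, are left out. All names are illustrative.

```python
def top_n_items(likes, user, n=3, k=20):
    """Recommend up to n items liked by the k users most similar to `user`."""
    mine = likes[user]
    sims = []
    for other, theirs in likes.items():
        if other == user:
            continue
        union = mine | theirs
        sim = len(mine & theirs) / len(union) if union else 0.0  # Jaccard overlap
        if sim > 0:
            sims.append((sim, other))
    neighbours = sorted(sims, reverse=True)[:k]

    scores = {}
    for sim, other in neighbours:
        for item in likes[other] - mine:  # filter out what the user already has
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


likes = {
    "me": {"Titanic", "Up"},
    "ann": {"Titanic", "Up", "Panda"},
    "bob": {"Titanic", "Taken", "Panda"},
    "cat": {"Saw"},
}
print(top_n_items(likes, "me", n=2))
```

Calling the same function on the transposed data (item → set of users) would give a top-N user list for an item.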
Task 4: Explanation
• This is a current hit …
• More on this artist …
• Try something from similar artists …
• Someone similar to you also likes this …
• As you listened to that, you may want this …
• These two go together …
• This is most popular in your group …
• This is highly rated …
• Try something new …
Task 4: Explanation (2)
• Examples from Strands.com
  –   Welcome back (recently viewed)
  –   For you today
  –   New for you
  –   Hot / Most popular of this type
  –   Other people also do this …
  –   Similar or related products
  –   Complementary accessories
  –   This goes with this …
  –   Gift idea
  –   Shopping assistant
Task 5: Online updating
• New items and users come each hour or minute
• The two worlds:
  – Most songs and books are still interesting for a long
    time (the tail is really long)
  – Most news articles are read on the day and forgotten
    the next day
     • But tracking back is useful to follow an event or scandal

• Updating large-scale neighbour-based
  systems online is NOT easy at all
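One simple way to model the difference between the two worlds is an exponentially decayed popularity score, where each click counts less the older it is. This is only a sketch: the half-life values and the function name are assumptions, not something the slides prescribe.

```python
def decayed_score(event_ages_hours, half_life_hours):
    """Sum of past events, each halved in weight every `half_life_hours`."""
    return sum(0.5 ** (age / half_life_hours) for age in event_ages_hours)


clicks = [0, 24, 48, 72]  # ages of four clicks, in hours

# Hypothetical half-lives: one day for news, one year for songs and books.
print("as a news article:", decayed_score(clicks, half_life_hours=24))
print("as a song or book:", decayed_score(clicks, half_life_hours=24 * 365))
```

With a short half-life yesterday's clicks fade quickly; with a long one the item keeps almost all of its accumulated interest.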
Evaluation
• How do we know the recommendation is
  good?
  – How good is good?
  – Measures should be automated
• Practice: training/testing split (e.g. 80/20)
• Popular criteria
  – Prediction error: ZOE, MAE, RMSE
  – Hit recall/precision/F-measure, rank utility,
    ROC curve
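A few of these measures can be sketched in plain Python. ZOE is read here as zero-one error (the fraction of predictions that miss the true rating after rounding), which is an assumption about the abbreviation; the helper names and the fixed seed are illustrative.

```python
import random
from math import sqrt


def train_test_split(ratings, test_fraction=0.2, seed=0):
    """Shuffle a list of ratings and hold out a fraction for testing."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mae(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def zero_one_error(predicted, actual):
    """Fraction of predictions whose rounded value differs from the true rating."""
    return sum(round(p) != a for p, a in zip(predicted, actual)) / len(actual)


def precision_recall_f1(recommended, relevant):
    """Hit-based measures for a recommended list against the items the user liked."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1
```

RMSE penalises large misses more than MAE does, so the two can rank the same pair of systems differently.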
Evaluation (2)
• Yet little on
  – Relevance
  – Usefulness
  – % Increase in purchase
  – % Reduction in cost
  – Novelty/surprise/long-tails
  – Diversity
  – Coverage
  – Explainability
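Of these, coverage is the easiest to automate. The sketch below uses one common definition, the fraction of the catalogue that shows up in at least one user's recommendation list; the function name and the toy data are illustrative.

```python
def catalogue_coverage(recommendations, catalogue):
    """Fraction of catalogue items that appear in at least one recommendation list.

    `recommendations` maps each user to their recommended items.
    """
    recommended = set()
    for items in recommendations.values():
        recommended.update(items)
    return len(recommended & set(catalogue)) / len(catalogue)


recs = {"u1": ["a", "b"], "u2": ["b", "c"]}
print(catalogue_coverage(recs, catalogue={"a", "b", "c", "d"}))
```

A system that recommends only best-sellers scores low here even when its prediction error is small.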
A question: Can we
make use of these
information sources?
• Blogs
• Social Media
• Online comments
• Online stores
• Review sites
• Locations
• Mobility
A case-study: Strands
• Services for any online retailer
   – Retailers send product and purchase information to
     the Strands server (one retailer per account) through
     APIs
   – Strands returns recommendations for each visitor
• The same logic for social media servers
• moneyStrands for personal financial
  management (e.g. investment recommendation)
• MyStrands for music personalization
Want more practical hints?
• New books:
  – Toby Segaran, Programming Collective
    Intelligence, O'Reilly, 2007
  – Satnam Alag, Collective Intelligence in
    Action, Manning Publications, 2009
• Check out for real deployments:
  – TechCrunch
  – ReadWriteWeb
Want more state of the art?
• Research in recommender systems is becoming
  mainstream, as evidenced by the recent ACM RecSys
  conference.
• Other places:
  –   ICWSM: Weblog and Social Media
  –   WebKDD: Web Knowledge Discovery and Data Mining
  –   WWW: The original WWW conference
  –   SIGIR: Information Retrieval
  –   ACM KDD: Knowledge Discovery and Data Mining
  –   ICML: Machine Learning
Questions left to you
• Will you trust such Recommender
 Systems?
• Will you implement and deploy it here?
• Will you do research?
  – PhD scholarships available (as of 19/4/09)
  – See http://truyen.vietlabs.com/scholarship.html
  – Warning: you are going to waste 3-5 years of your
    youth!

 
Recommender Systems! @ASAI 2011
Recommender Systems! @ASAI 2011Recommender Systems! @ASAI 2011
Recommender Systems! @ASAI 2011
 
2.social recommedation
2.social recommedation2.social recommedation
2.social recommedation
 
Construindo Sistemas de Recomendação com Python
Construindo Sistemas de Recomendação com PythonConstruindo Sistemas de Recomendação com Python
Construindo Sistemas de Recomendação com Python
 
Олександр Обєдніков “Рекомендательные системы”
Олександр Обєдніков “Рекомендательные системы”Олександр Обєдніков “Рекомендательные системы”
Олександр Обєдніков “Рекомендательные системы”
 

Último

Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 

Último (20)

Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 

Recommender Systems

  • 1. RecSys: Recommender Systems Tran The Truyen http://truyen.vietlabs.com
  • 2. The world is an over-crowded place
  • 3. They all want to get our attention
  • 4. We are overloaded • Thousands of news articles and blog posts each day • Millions of movies, books and music tracks online • In Hanoi, more than 50 TV channels and thousands of programs each day • In New York, several thousand ad messages sent to us per day
  • 5. But we really need and consume only a few of them!
  • 6. Sometimes, all we need is this
  • 7. Or, just this: DON’T DISTURB!
  • 9. Can Google help? • Yes, but only when we really know what we are looking for • What if I just want some interesting music tracks? – Btw, what does “interesting” mean?
  • 10. Can Facebook help? • Yes, I tend to find my friends’ stuff interesting • What if I have only a few friends, and what they like does not always appeal to me?
  • 11. Can experts help? • Yes, but it won’t scale well – Everyone receives exactly the same advice! • It is what they like, not what I like! – As with movies, what gets expert approval is not guaranteed to attract the masses
  • 12. OK, here is the idea called RecSys: • To recommend to us something we may like – It may not be popular – The world is long-tailed • How? – Based on our history of using services – Based on other people like us – Ever heard of “collective intelligence”? [Figure callout: “I like these bits”]
  • 13. Hang on, what is long-tailed? • Popularised by Chris Anderson, Wired 2004 [Figure: three curves labelled “The short-tailed distribution”, “The bell-shaped distribution” and “The long-tailed distribution”]
  • 14. Ever heard of • GroupLens? • Amazon recommendation? • Netflix Cinematch? • Google News personalization? • Netflix Prize $1mil challenge? • Strands? • TiVo? • Findory?
  • 15.
  • 16. Want some evidence? (Celma & Lamere, ISMIR 2007) • Netflix: – 2/3 of rented movies come from recommendations • Google News – 38% more click-throughs are due to recommendations • Amazon – 35% of sales come from recommendations
  • 17. What can be recommended? • Advertising messages • Tags • Investment choices • News articles • Restaurants • Online mates (Dating services) • Cafes • Future friends (Social network sites) • Music tracks • Courses in e-learning • Movies • Drug components • TV programs • Research papers • Books • Citations • Clothes • Code modules • Supermarket goods • Programmers
  • 18. But, what do recommender systems do, exactly? 1. Predict how much you may like a certain product/service 2. Compose a list of N best items for you 3. Compose a list of N best users for a certain product/service 4. Explain to you why these items are recommended to you 5. Adjust the prediction and recommendation based on your feedback and that of other people
  • 19. Graph representation [Figure: graph of users (Me, My friend, You, Another guy) and movies (Titanic, Taken, Panda), with one unknown link marked “?”]
  • 20. We must also take good care of • Data normalisation • Removal or reduction of noise • Protection of users’ privacy • Attacks: someone just doesn’t like your system
  • 21. Task 1: Preference prediction • Collaborative filtering – User-based method – Item-based method – Matrix Factorization • Content-based filtering • Hybrid: – Linear/sequential/switching combination – Semi-Restricted Boltzmann Machines
  • 22. Collaborative filtering (1) • User-based method (1994, GroupLens) – Many people liked “Kungfu Panda” – Can you tell how much I will like it? – The idea is to pick about 20-50 people who share similar taste with me; how much I like it then depends on how much THEY liked it – In short: you may like it because your “friends” liked it [Figure: user × item rating matrix, 8 users by 8 items, with many missing entries]
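The user-based idea on this slide can be sketched in a few lines. This is only an illustrative sketch, not the GroupLens implementation: the toy ratings, the names `cosine_sim` and `predict_user_based`, and the choice of cosine similarity with a similarity-weighted average are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "me":    {"Titanic": 5, "Taken": 2},
    "alice": {"Titanic": 5, "Taken": 1, "Kungfu Panda": 5},
    "bob":   {"Titanic": 1, "Taken": 5, "Kungfu Panda": 2},
    "carol": {"Titanic": 4, "Taken": 2, "Kungfu Panda": 4},
}

def cosine_sim(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict_user_based(ratings, user, item, k=2):
    """Similarity-weighted average of the k most similar users' ratings of `item`."""
    neighbours = [
        (cosine_sim(ratings[user], r), r[item])
        for other, r in ratings.items()
        if other != user and item in r
    ]
    neighbours.sort(reverse=True)          # most similar users first
    top = neighbours[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None                        # no usable neighbours
    return sum(sim * r for sim, r in top) / total

print(predict_user_based(ratings, "me", "Kungfu Panda", k=2))
```

A real system would use the 20-50 neighbours mentioned on the slide and would usually subtract each user's mean rating before averaging.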
  • 23. Collaborative filtering (2) • Item-based method (2001, deployed at Amazon) – I have watched so many good and bad movies – Would you recommend that I watch “Taken”? – The idea is to pick from my previous list 20-50 movies that share a similar audience with “Taken”; how much I will like it then depends on how much I liked those earlier movies – In short: I tend to watch this movie because I have watched those movies … or – People who have watched those movies also liked this movie (Amazon style) [Figure: user × item rating matrix, 8 users by 8 items, with many missing entries]
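The item-based variant swaps the roles: similarity is computed between items (through the users who rated both), and the prediction averages my own past ratings. Again, this is a minimal sketch under assumed toy data; `item_vector`, `cosine_sim` and `predict_item_based` are hypothetical names, and Amazon's deployed method is not reproduced here.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "me":    {"Titanic": 2, "Kungfu Panda": 5},
    "alice": {"Titanic": 5, "Kungfu Panda": 2, "Taken": 1},
    "bob":   {"Titanic": 1, "Kungfu Panda": 5, "Taken": 5},
    "carol": {"Titanic": 2, "Kungfu Panda": 4, "Taken": 4},
}

def item_vector(ratings, item):
    """All ratings given to one item: user -> rating."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine_sim(a, b):
    """Cosine similarity over the users who rated both items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def predict_item_based(ratings, user, item, k=2):
    """Average the user's own ratings, weighted by item-to-item similarity."""
    target = item_vector(ratings, item)
    scored = [
        (cosine_sim(target, item_vector(ratings, other)), r)
        for other, r in ratings[user].items()
        if other != item
    ]
    scored.sort(reverse=True)              # most similar items first
    top = scored[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None
    return sum(sim * r for sim, r in top) / total

print(predict_item_based(ratings, "me", "Taken", k=2))
```

Because item-to-item similarities change slowly, they can be precomputed offline, which is one reason this variant is attractive for large catalogues.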
  • 24. Collaborative filtering (3) • Matrix Factorization (2006, Netflix challenge) – You may have watched thousands of movies – But perhaps I can tell these movies belong to 10 groups, like Action, Sci-Fi, Animation, etc. – So 10 numbers are enough to describe your taste, e.g. ~ [0.1 0.3 0.2 0.9 0.5 0.4 0.7 0.3 0.8 1.5] – Likewise, “Titanic” has been watched by millions of people, but perhaps … 10 numbers are enough to describe its features – Magic: these hidden aspects can be discovered automatically by Matrix Factorization!
  • 25. Problems with collaborative filtering • Scale – Netflix (2007): 5M users, 50K movies, 1.4B ratings • Sparse data – I have rated only one book at Amazon! • Cold-Start – New users and items do not have history • Popularity bias – Everyone reads “Harry Potter” • Hacking – Someone who reads “Harry Potter” also reads the “Kama Sutra”
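To make the "a few numbers per user and per item" idea concrete, here is a small sketch that learns such numbers by stochastic gradient descent on the observed ratings. It is one common way to fit a factorization, not necessarily what any Netflix Prize team did. The toy triples, the function names, and the hyperparameters (2 hidden factors instead of the 10 on the slide, learning rate, regularisation) are assumptions for illustration.

```python
import random
from math import sqrt

# Hypothetical (user, item, rating) triples; items: 0=Titanic, 1=Taken, 2=Kungfu Panda.
# User 1's rating of item 2 is deliberately left out, as the one to predict.
triples = [
    (0, 0, 5), (0, 1, 1), (0, 2, 4),
    (1, 0, 4), (1, 1, 1),
    (2, 0, 1), (2, 1, 5), (2, 2, 2),
    (3, 0, 2), (3, 1, 4), (3, 2, 1),
]

def predict(P, Q, u, i):
    """Predicted rating = dot product of user factors and item factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def factorize(triples, n_users, n_items, k=2, lr=0.02, reg=0.02,
              epochs=1000, seed=0):
    """Learn k hidden numbers per user (P) and per item (Q) by SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in triples:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularisation.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(triples, P, Q):
    """Root mean squared error on a set of rating triples."""
    return sqrt(sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in triples)
                / len(triples))

P, Q = factorize(triples, n_users=4, n_items=3)
print("training RMSE:", rmse(triples, P, Q))
print("predicted rating of user 1 for Kungfu Panda:", predict(P, Q, 1, 2))
```

The rows of `P` play the role of the taste vector on the slide, and the rows of `Q` describe each movie in the same hidden space.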
  • 26. Content-based method • Web page: words, hyperlinks, images, tags, comments, titles, URL, topic • Music: genre, rhythm, melody, harmony, lyrics, meta data, artists, bands, press releases, expert reviews, loudness, energy, time, spectrum, duration, frequency, pitch, key, mode, mood, style, tempo • User: age, sex, job, location, time, income, education, language, family status, hobbies, general interests, Web usage, computer usage, fan club membership, opinion, comments, tags, mobile usage • Context: time, location, mobility, activity, socializing, emotion
  • 27. Content-based method (2) • Can we acquire those content pieces automatically? – Fairly easy for text – Difficult for music and video, except for digital signals. E.g. music genre classification 60-80% accuracy – A lot of noise, e.g. misplaced tags – Attacks • What can we do with these? – Compute similarity between items or users – Query items that are similar to a given item – Match item’s content and user’s profile
  • 28. Content-based method (3) • Measuring similarity – Cosine, TF-IDF as in standard Information Retrieval – KL-divergence for probability-oriented guys – Euclidean, dimensionality reduction if you want – Anything you can imagine!
  • 29. Hybrid: Semi-Restricted Boltzmann Machines (2009, IMPCA) • A probabilistic combination of – Item-based method – User-based method – Matrix Factorization – (Maybe) content-based method • It looks like a Neural Network – But it is not really one ☺ • It really is a type of Markov random field, which is, again, a type of Graphical Model – Self-advertising: I work on this stuff for a living! [Figure: User A, User B, User C, Item X]
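As a rough illustration of the cosine/TF-IDF option, the sketch below turns short item descriptions into TF-IDF vectors and compares them. The descriptions are invented for the example, and the weighting (term frequency times log(N/df)) is one standard variant among several; `tfidf_vectors` and `cosine` are hypothetical helper names.

```python
import math
from collections import Counter

# Hypothetical tokenised item descriptions.
docs = [
    "action thriller rescue".split(),
    "action thriller revenge".split(),
    "animation comedy panda".split(),
]

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # shares "action" and "thriller"
print(cosine(vecs[0], vecs[2]))  # no shared terms
```

The same `cosine` function works for matching an item's content against a user profile, as long as both are expressed as term-weight vectors.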
  • 30. But, what do recommender systems do, exactly? 1. Predict how much you may like a certain product/service 2. Compose a list of N best items for you 3. Compose a list of N best users for a certain product/service 4. Explain to you why these items are recommended to you 5. Adjust the prediction and recommendation based on your feedback and that of other people
  • 31. Task 2,3: Top-N recommendation • Top-N item list: – Find similar users, collect what they like – Filter out those the user has rated – Rank the remaining items by considering • The number of times each item is liked by those users • The popularity of the item • The associated ratings • The similarity between each item in the list and what the user has rated • Switching the role of item to user, we may have top-N user list
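The top-N steps on this slide (find similar users, collect what they like, drop what the user already has, rank the rest) might look like the sketch below. It uses only one of the ranking signals listed, namely how often similar users like an item, weighted by their similarity. The data, the item names "Movie A" and "Movie B", the Jaccard similarity, and the function names are assumptions for illustration.

```python
# Hypothetical data: user -> set of items the user liked.
likes = {
    "me":    {"Titanic", "Taken"},
    "alice": {"Titanic", "Taken", "Kungfu Panda"},
    "bob":   {"Titanic", "Kungfu Panda", "Movie A"},
    "carol": {"Movie B"},
}

def jaccard(a, b):
    """Overlap between two sets of liked items, between 0 and 1."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def top_n(likes, user, n=2, k=2):
    """Recommend up to n unseen items liked by the k most similar users."""
    mine = likes[user]
    # Step 1: find the k most similar users.
    neighbours = sorted(
        ((jaccard(mine, liked), other)
         for other, liked in likes.items() if other != user),
        reverse=True,
    )[:k]
    # Steps 2 and 3: collect what they like, skipping what the user already has.
    scores = {}
    for sim, other in neighbours:
        for item in likes[other] - mine:
            scores[item] = scores.get(item, 0.0) + sim
    # Step 4: rank by similarity-weighted like count (ties broken by name).
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n]

print(top_n(likes, "me"))  # ['Kungfu Panda', 'Movie A']
```

Swapping users and items in `likes` gives the top-N user list for a given item, as the last bullet of the slide suggests.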
  • 32. But, what do recommender systems do, exactly? 1. Predict how much you may like a certain product/service 2. Compose a list of N best items for you 3. Compose a list of N best users for a certain product/service 4. Explain to you why these items are recommended to you 5. Adjust the prediction and recommendation based on your feedback and that of other people
  • 33. Task 4: Explanation • This is a current hit … • More on this artist … • Try something from similar artists … • Someone similar to you also likes this … • As you listened to that, you may want this … • These two go together … • This is most popular in your group … • This is highly rated … • Try something new …
  • 34. Task 4: Explanation (2) • Examples from Strands.com – Welcome back (recently viewed) – For you today – New for you – Hot / Most popular of this type – Other people also do this … – Similar or related products – Complementary accessories – This goes with this … – Gift idea – Shopping assistant
  • 35. But, what do recommender systems do, exactly? 1. Predict how much you may like a certain product/service 2. Compose a list of N best items for you 3. Compose a list of N best users for a certain product/service 4. Explain to you why these items are recommended to you 5. Adjust the prediction and recommendation based on your feedback and that of other people
  • 36. Task 5: Online updating • New items and users arrive every hour or minute • The two worlds: – Most songs and books stay interesting for a long time (the tail is really long) – Most news articles are read on the day and forgotten the next day • But tracking back is useful to follow an event or scandal • Online updating of large-scale neighbour-based systems is NOT easy at all
  • 37. Evaluation • How do we know the recommendation is good? – How good is good? – Measures should be automated • Practice: training/testing split (e.g. 80/20) • Popular criteria – Prediction error: ZOE (zero-one error), MAE (mean absolute error), RMSE (root mean squared error) – Hit recall/precision/F-measure, rank utility, ROC curve
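The three prediction-error measures on this slide are short enough to write out. The sketch below assumes held-out true ratings and predicted ratings as two parallel lists; the numbers are made up, and counting a prediction as a "hit" when it rounds to the true rating is one reasonable reading of zero-one error, stated here as an assumption.

```python
from math import sqrt

def zoe(actual, predicted):
    """Zero-one error: fraction of predictions that miss after rounding."""
    return sum(round(p) != a for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalises large misses more than MAE does."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical held-out ratings (the 20% test split) and model predictions.
actual = [5, 3, 4, 1]
predicted = [4, 3, 5, 3]

print(zoe(actual, predicted))   # 0.75
print(mae(actual, predicted))   # 1.0
print(rmse(actual, predicted))  # sqrt(1.5), about 1.22
```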
  • 38. Evaluation (2) • Yet little on – Relevance – Usefulness – % Increase in purchase – % Reduction in cost – Novelty/surprise/long-tails – Diversity – Coverage – Explainability
  • 39. A question: Can we make use of these information sources? • Blogs • Social Media • Online comments • Online stores • Review sites • Locations • Mobility
  • 40. A case study: Strands • Services for any online retailer – Retailers send product and purchase information to the Strands server (one retailer per account) through APIs – Strands returns recommendations for each visitor • The same logic applies to social media servers • moneyStrands for personal financial management (e.g. investment recommendation) • MyStrands for music personalization
  • 41. Want more practical hints? • New books: – Toby Segaran, Programming Collective Intelligence, O'Reilly, 2007 – Satnam Alag, Collective Intelligence in Action, Manning Publications, 2009 • For real deployments, check out: – TechCrunch – ReadWriteWeb
  • 42. Want more state of the art? • Research in Recommender Systems is becoming mainstream, as evidenced by the recent ACM RecSys conference. • Other places: – ICWSM: Weblog and Social Media – WebKDD: Web Knowledge Discovery and Data Mining – WWW: The original WWW conference – SIGIR: Information Retrieval – ACM KDD: Knowledge Discovery and Data Mining – ICML: Machine Learning
  • 43. Questions left to you • Will you trust such Recommender Systems? • Will you implement and deploy one here? • Will you do research? – PhD scholarships available (as of 19/4/09) – See http://truyen.vietlabs.com/scholarship.html – Warning: you are going to waste 3-5 years of your youth!