WE KNOW YOU WILL LIKE THIS
                                 Introduction to Recommendation Engines




Monday, January 14, 13
ML
                         Supervised (inputs X with labels Y)
                            Regression: Y is numeric, e.g. Turnout = 30, 12, 25
                            Classification: Y is categorical, e.g. Class = Spam, Not Spam, Spam

                         Unsupervised (inputs X only)
                            Clustering
                            Hierarchical Clustering
                                   Maraboo   Karnaf   Ima Adama Liv
                         Idan         5        ?         3        ?
                         Shahar       4        3         ?        2
                         Gadi        ?         1         ?        5




                         Content/Model-Based (predicting the rating)

                         (Agnostic, Behavioural)

                                                   Recommendation
Preference Problem (Ads)




                         Rating Problem (Movies)




Related problem: Ranking




                                   Maraboo   Karnaf   Ima Adama Liv
                         Idan         1        ?         1        ?
                         Shahar       1        1         ?        1
                         Gadi        ?         1         ?        1




                                  Maraboo   Karnaf   Ima Adama Liv
                         Idan         5        ?         3        ?
                         Shahar       4        3         ?        2
                         Gadi        ?         1         ?        5




                                   Maraboo   Karnaf   Ima Adama Liv
                         Idan         1        ?         1        ?
                         Shahar       1        1         ?        1
                         Gadi        ?         1         ?        1




                                   Maraboo   Karnaf   Ima Adama Liv
                         Idan         5        ?         3        ?
                         Shahar       4        3         ?        2
                         Gadi        ?         1         ?        5




User-based Collaborative Filtering




                         Jaccard Distance: “We share 5 preferences out of 7!”

                         Euclidean Distance

                         Cosine Similarity

                         Pearson’s Correlation Distance (1 - correlation): “Our preferences go in the same direction!”
                            (but only 2 such preferences do...)

                         Log-Likelihood Ratio: Measure of “Surprise” at correlation
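The measures listed on this slide can be sketched in a few lines of Python on the example rating matrix. This is a minimal illustration, not the talk's implementation: the function names are made up here, missing "?" entries are left out of each user's dictionary, and Euclidean and Pearson are computed over co-rated items only (one common convention, assumed here).

```python
import math

# Ratings copied from the example matrix; a "?" entry is simply left out.
ratings = {
    "Idan":   {"Maraboo": 5, "Ima Adama": 3},
    "Shahar": {"Maraboo": 4, "Karnaf": 3, "Liv": 2},
    "Gadi":   {"Karnaf": 1, "Liv": 5},
}

def shared_items(a, b):
    """Items rated by both users, in a stable order."""
    return sorted(set(a) & set(b))

def jaccard_distance(a, b):
    """1 - (items rated by both) / (items rated by either)."""
    union = set(a) | set(b)
    if not union:
        return 0.0
    return 1.0 - len(shared_items(a, b)) / len(union)

def euclidean_distance(a, b):
    """Distance over co-rated items only (0.0 when nothing is shared)."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in shared_items(a, b)))

def cosine_similarity(a, b):
    """Cosine of the two rating vectors, treating missing ratings as 0."""
    dot = sum(a[i] * b[i] for i in shared_items(a, b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def pearson_distance(a, b):
    """1 - Pearson correlation over co-rated items; None if undefined."""
    items = shared_items(a, b)
    if len(items) < 2:
        return None
    mean_a = sum(a[i] for i in items) / len(items)
    mean_b = sum(b[i] for i in items) / len(items)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in items)
    var_a = sum((a[i] - mean_a) ** 2 for i in items)
    var_b = sum((b[i] - mean_b) ** 2 for i in items)
    if var_a == 0 or var_b == 0:
        return None
    return 1.0 - cov / math.sqrt(var_a * var_b)

print(jaccard_distance(ratings["Shahar"], ratings["Gadi"]))
print(pearson_distance(ratings["Shahar"], ratings["Gadi"]))
```

Shahar and Gadi share only two rated items, so their Pearson correlation is forced to +1 or -1 regardless of how similar their tastes are, which is the caveat the slide raises about correlations resting on very few preferences.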

Item-Based Collaborative Filtering

          Usually bounded




Case study: Amazon
                         100,000,000 users

                         2,000,000 items

                         Each user expresses preference for 10 items

                         Each item has 500 reviews
                         User-Based CF:

                            100,000,000 x 100,000,000 similarity matrix

                            2,000,000 x 500 sum terms

                         Item-Based CF:

                            2,000,000 x 2,000,000 similarity matrix

                            2,000,000 x 10 sum terms
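The arithmetic behind this comparison can be checked with a short script. The variable names are illustrative, and reading "sum terms" as the number of terms added up when scoring every item for one user is an assumption about what the slide means.

```python
# Figures taken from the case-study slide.
users = 100_000_000
items = 2_000_000
prefs_per_user = 10
reviews_per_item = 500

# Both ways of counting give the same total number of preferences.
total_prefs_by_user = users * prefs_per_user
total_prefs_by_item = items * reviews_per_item

# Size of the (dense) similarity matrix in each approach.
user_similarity_entries = users * users
item_similarity_entries = items * items

# Assumed reading of "sum terms": scoring all items for one user.
# User-based: each item is scored by summing over its ~500 raters.
# Item-based: each item is scored by summing over the user's ~10 items.
user_based_sum_terms = items * reviews_per_item
item_based_sum_terms = items * prefs_per_user

print(f"total preferences: {total_prefs_by_user:,} / {total_prefs_by_item:,}")
print(f"user-user entries: {user_similarity_entries:,}")
print(f"item-item entries: {item_similarity_entries:,}")
print(f"matrix size ratio: {user_similarity_entries // item_similarity_entries:,}x")
print(f"sum terms ratio:   {user_based_sum_terms // item_based_sum_terms:,}x")
```

Under these numbers the item-item matrix is 2,500 times smaller than the user-user matrix, and scoring needs 50 times fewer terms.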

Interpretability




                         “People who go to La Colombe Torrefaction & FourSquare HQ tend to go here”

                         “Coffee Shop connoisseurs tend to come here”


Evaluation
                         Rating Problem: Predictive accuracy (regression) metrics

                            RMSE, MAE, etc.

                         Preference (Binary) Problem: Classification accuracy (IR) metrics

                            Accuracy, Precision, Recall, F-1, ROC, etc.

                            Benchmark vs. ‘random’ and ‘popular’

                         Ranking accuracy metrics: Similarity of permutations

                            Pearson’s correlation, Spearman’s rho, Kendall’s tau
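One metric from each family above, written as a small standard-library sketch. The function names and the sample inputs are illustrative only, and the Kendall's tau version assumes the two rankings contain no ties.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between true and predicted ratings."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def precision_recall_f1(recommended, relevant):
    """IR-style metrics for a set of recommended items vs. truly liked items."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def kendall_tau(rank_a, rank_b):
    """Kendall's tau for two rankings of the same items (no ties assumed)."""
    n = len(rank_a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Made-up sample data, for illustration only.
print(rmse([5, 3, 4], [4, 3, 2]), mae([5, 3, 4], [4, 3, 2]))
print(precision_recall_f1({"Karnaf", "Liv", "Maraboo"}, {"Liv", "Ima Adama"}))
print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))
```

To benchmark against the 'random' and 'popular' baselines, the same functions would be applied to recommendations drawn at random and to the most frequently liked items.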

Challenges

                         Cold-start problems (new item, new user)

                         “Black” and “Grey” sheep

                         Exploration-exploitation and reinforcement learning

                         Scale




Advanced Topics

                         Dimensionality Reduction

                         Map-Reducible calculations

                         Content-based (feature-based)

                         Multiple models




MapReduce Similarity Calculation
                                          “User-based”

              A * u_i = A u_i (user similarity vector), with u_i = Gadi's preference vector

                        Maraboo  Karnaf  Ima Adama  Liv
              Idan         1       ?         1       ?      Maraboo   ?     Idan      0
              Shahar       1       1         ?       1   *  Karnaf    1  =  Shahar    2
              Gadi         ?       1         ?       1      Ima Adama ?     Gadi      2
                                                            Liv       1

              A^T * (A u_i) = A^T(A u_i)

                        Idan  Shahar  Gadi
              Maraboo    1      1      ?       Idan      0     Maraboo   2
              Karnaf     ?      1      1    *  Shahar    2  =  Karnaf    4
              Ima Adama  1      ?      ?       Gadi      2     Ima Adama 0
              Liv        ?      1      1                       Liv       4
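The two products on this slide can be reproduced in plain Python. This is a sketch under one assumption: every "?" entry is treated as 0. The helper names `matvec` and `transpose` are illustrative.

```python
users = ["Idan", "Shahar", "Gadi"]
items = ["Maraboo", "Karnaf", "Ima Adama", "Liv"]

# Binary preference matrix A (users x items); "?" is treated as 0 (assumption).
A = [
    [1, 0, 1, 0],  # Idan
    [1, 1, 0, 1],  # Shahar
    [0, 1, 0, 1],  # Gadi
]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]

# u_i is Gadi's row of A, written as a column vector over items.
u_gadi = A[users.index("Gadi")]

# Step 1: A u_i counts the items each user shares with Gadi.
user_similarity = matvec(A, u_gadi)

# Step 2: A^T (A u_i) scores each item by the similarity of the users who like it.
item_scores = matvec(transpose(A), user_similarity)

print(dict(zip(users, user_similarity)))
print(dict(zip(items, item_scores)))
```

The first printed vector is the user similarity vector (0, 2, 2) and the second is the item score vector (2, 4, 0, 4) shown on the slide.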




MapReduce Similarity Calculation
                                          “Item-Based”

              A^T * A = A^T A (item similarity matrix)

              A^T:
                        Idan  Shahar  Gadi
              Maraboo    1      1      ?
              Karnaf     ?      1      1
              Ima Adama  1      ?      ?
              Liv        ?      1      1

              A:
                        Maraboo  Karnaf  Ima Adama  Liv
              Idan         1       ?         1       ?
              Shahar       1       1         ?       1
              Gadi         ?       1         ?       1

              A^T A (item similarity matrix):
                        Maraboo  Karnaf  Ima Adama  Liv
              Maraboo      2       1         1       1
              Karnaf       1       2         0       2
              Ima Adama    1       0         1       0
              Liv          1       2         0       2

              (A^T A) * u_i = (A^T A) u_i, with u_i = Gadi's preference vector

                        Maraboo  Karnaf  Ima Adama  Liv
              Maraboo      2       1         1       1      Maraboo   ?     Maraboo   2
              Karnaf       1       2         0       2   *  Karnaf    1  =  Karnaf    4
              Ima Adama    1       0         1       0      Ima Adama ?     Ima Adama 0
              Liv          1       2         0       2      Liv       1     Liv       4




                                                        Similarity of item x to item y is the inner product <i_x, i_y>
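The item-based variant can be checked the same way: build A^T A first, then apply it to the user's preference vector. As before, "?" is treated as 0 (an assumption), and the helper names are illustrative.

```python
users = ["Idan", "Shahar", "Gadi"]
items = ["Maraboo", "Karnaf", "Ima Adama", "Liv"]

# Binary preference matrix A (users x items); "?" is treated as 0 (assumption).
A = [
    [1, 0, 1, 0],  # Idan
    [1, 1, 0, 1],  # Shahar
    [0, 1, 0, 1],  # Gadi
]

def inner(x, y):
    """Inner product <x, y> of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

# Each item is represented by its column of A (one entry per user).
item_columns = [list(column) for column in zip(*A)]

# Entry (x, y) of A^T A is the inner product of item columns x and y.
item_similarity = [[inner(ix, iy) for iy in item_columns] for ix in item_columns]

# Score every item for Gadi: (A^T A) u_i.
u_gadi = A[users.index("Gadi")]
item_scores = [inner(row, u_gadi) for row in item_similarity]

for name, row in zip(items, item_similarity):
    print(f"{name:<10}", row)
print(dict(zip(items, item_scores)))
```

The scores match the user-based result because matrix multiplication is associative: (A^T A) u_i = A^T (A u_i). The difference is which intermediate (user similarity vector or item similarity matrix) gets materialised.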

MapReduce Similarity Calculation
                            Recall row outer-product matrix multiplication:

              A^T A = u_Idan u_Idan^T + u_Shahar u_Shahar^T + u_Gadi u_Gadi^T

              A^T A:
                        Maraboo  Karnaf  Ima Adama  Liv
              Maraboo      2       1         1       1
              Karnaf       1       2         0       2
              Ima Adama    1       0         1       0
              Liv          1       2         0       2

              u_Idan u_Idan^T:
                        Maraboo  Karnaf  Ima Adama  Liv
              Maraboo      1       0         1       0
              Karnaf       0       0         0       0
              Ima Adama    1       0         1       0
              Liv          0       0         0       0

              u_Shahar u_Shahar^T:
                        Maraboo  Karnaf  Ima Adama  Liv
              Maraboo      1       1         0       1
              Karnaf       1       1         0       1
              Ima Adama    0       0         0       0
              Liv          1       1         0       1

              u_Gadi u_Gadi^T:
                        Maraboo  Karnaf  Ima Adama  Liv
              Maraboo      0       0         0       0
              Karnaf       0       1         0       1
              Ima Adama    0       0         0       0
              Liv          0       1         0       1

                                         Only one user’s list of items is used every time!
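Because each outer product needs only one user's list of items, the sum maps naturally onto a map step per user and a reduce step per item pair. The sketch below simulates that in a single process; `map_user` and `reduce_counts` are hypothetical names, not part of any MapReduce framework.

```python
from collections import defaultdict
from itertools import product

# Each user's list of liked items (the non-"?" entries of the binary matrix).
user_items = {
    "Idan":   ["Maraboo", "Ima Adama"],
    "Shahar": ["Maraboo", "Karnaf", "Liv"],
    "Gadi":   ["Karnaf", "Liv"],
}

def map_user(liked):
    """Map step: emit ((item_x, item_y), 1) for every pair in ONE user's list.

    This is that user's outer product u u^T written as sparse entries.
    """
    for pair in product(liked, repeat=2):
        yield pair, 1

def reduce_counts(emitted):
    """Reduce step: sum the emitted values per (item_x, item_y) key."""
    totals = defaultdict(int)
    for key, value in emitted:
        totals[key] += value
    return dict(totals)

emitted = (kv for liked in user_items.values() for kv in map_user(liked))
cooccurrence = reduce_counts(emitted)

print(cooccurrence[("Karnaf", "Liv")])
print(cooccurrence.get(("Karnaf", "Ima Adama"), 0))
```

Pairs that never co-occur are simply never emitted, so the result is a sparse form of A^T A. In a real cluster the mapper would run on each user's record independently and the framework would group by key before the reducer.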

MapReduce Similarity Calculation

                         All of the classic similarity functions are
                         made up of 3 stages:

                            Preprocess (uses only one ELEMENT)

                            Norm (can be done in reduce on one VECTOR)

                            Similarity utilizes the A^T A matrix joined with norm entries
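As an illustration of the three stages, here is cosine similarity between items decomposed that way. The choice of cosine, the binarising preprocess step, and all names are assumptions made for this sketch; other similarity functions would plug different functions into the same three slots.

```python
import math

items = ["Maraboo", "Karnaf", "Ima Adama", "Liv"]

# Raw ratings (users x items); None stands for a "?" entry.
ratings = [
    [5, None, 3, None],   # Idan
    [4, 3, None, 2],      # Shahar
    [None, 1, None, 5],   # Gadi
]

def preprocess(element):
    """Stage 1: looks at a single ELEMENT. Here: binarise the rating."""
    return 0 if element is None else 1

A = [[preprocess(x) for x in row] for row in ratings]
item_columns = [list(column) for column in zip(*A)]

def norm(vector):
    """Stage 2: looks at a single VECTOR. Here: its Euclidean length."""
    return math.sqrt(sum(v * v for v in vector))

norms = [norm(column) for column in item_columns]

# A^T A: inner products of item columns (the co-occurrence counts).
dots = [[sum(a * b for a, b in zip(ix, iy)) for iy in item_columns]
        for ix in item_columns]

def similarity(x, y):
    """Stage 3: join an A^T A entry with the two norms."""
    if norms[x] == 0 or norms[y] == 0:
        return 0.0
    return dots[x][y] / (norms[x] * norms[y])

cosine = [[similarity(x, y) for y in range(len(items))] for x in range(len(items))]
print(cosine[items.index("Karnaf")][items.index("Liv")])
print(cosine[items.index("Maraboo")][items.index("Karnaf")])
```

Stages 1 and 2 never need more than one element or one vector at a time, which is what makes them easy to distribute; only stage 3 needs the joined A^T A entries.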


Bibliography
                         Google News Personalization: Scalable Online Collaborative Filtering - Das, Datar, Garg, Rajaram, WWW2007

                         Logistic Regression and Collaborative Filtering for Sponsored Search Term Recommendation - Bartz, Murthi, Sebastian, EC2006

                         Evaluating Collaborative Filtering Recommender Systems - Herlocker, Konstan, Terveen, Riedl, ACM TOIS 2004

                         A Survey of Collaborative Filtering Techniques - Su, Khoshgoftaar, AAI2009

                         An Introduction to Information Retrieval - Manning, Raghavan, Schutze, Cambridge Press

                         Mahout in Action - Friedman, Dunning, Anil, Owen, Manning Publications

                         Lessons from the Netflix Prize Challenge - Bell, Koren, KDD2009

                         Factorization meets the Neighbourhood: a Multifaceted Collaborative Filtering Model - Koren, KDD2008

                         Accurate Methods for the Statistics of Surprise and Coincidence - Dunning, ACL1993

                         Item-Based Collaborative Filtering Recommendation Algorithms - Sarwar, Konstan, Karypis, Riedl, WWW2001

                         Matrix Factorization Techniques for Recommender Systems - Koren, Bell, Volinsky, IEEE2009

                         recommenderlab: A Framework for Developing and Testing Recommendation Algorithms - Hahsler, 2011

                         Scalable Similarity-Based Neighbourhood Methods with MapReduce - Schelter, Boden, Markl, RecSys2012



Thanks!


                         Nimrod Priell
                         nimrod.priell@gmail.com
                         @nimrodpriell
                         http://www.educated-guess.com




PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Collaborative filtering intro - Full

  • 1. WE KNOW YOU WILL LIKE THIS Introduction to Recommendation Engines Monday, January 14, 13
  • 2. ML: Supervised vs. Unsupervised
       Supervised: Regression, e.g. Y = Turnout (numeric): 30, 12, 25; Classification, e.g. Y = Class (categorical): Spam, Not Spam, Spam
       Unsupervised: Clustering; Hierarchical Clustering
       [Diagram labels: X; X + Y]
  • 3. Recommendation
                   Maraboo  Karnaf  Ima Adama  Liv
         Idan         5        ?        3       ?
         Shahar       4        3        ?       2
         Gadi         ?        1        ?       5
       Content/Model-Based (predicting the rating); (Agnostic, Behavioural)
  • 12. Preference Problem (Ads); Rating Problem (Movies)
  • 15. Preferences (binary):
                   Maraboo  Karnaf  Ima Adama  Liv
         Idan         1        ?        1       ?
         Shahar       1        1        ?       1
         Gadi         ?        1        ?       1
       Ratings:
                   Maraboo  Karnaf  Ima Adama  Liv
         Idan         5        ?        3       ?
         Shahar       4        3        ?       2
         Gadi         ?        1        ?       5
  • 16.
                   Maraboo  Karnaf  Ima Adama  Liv
         Idan         1        ?        1       ?
         Shahar       1        1        ?       1
         Gadi         ?        1        ?       1
  • 17.
                   Maraboo  Karnaf  Ima Adama  Liv
         Idan         5        ?        3       ?
         Shahar       4        3        ?       2
         Gadi         ?        1        ?       5
  • 20. Similarity measures:
       Jaccard Distance: “We share 5 preferences out of 7!”
       Euclidean Distance
       Cosine Similarity
       Pearson’s Correlation Distance (1 - correlation): “Our preferences go in the same direction!” (but only 2 such preferences do...)
       Log-Likelihood Ratio: Measure of “Surprise” at correlation
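The measures on this slide can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the deck's own code: the function names are made up here, unknown entries ("?") are treated as 0 in the binary vectors, and Pearson's correlation is computed only over items both users rated. The log-likelihood ratio is left out.

```python
import math

def jaccard_distance(a, b):
    """1 - |intersection| / |union| of two sets of preferred items."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def euclidean_distance(x, y):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))

def cosine_similarity(x, y):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x) * sum(q * q for q in y))
    return dot / norm if norm else 0.0

def pearson_correlation(x, y):
    """Pearson's correlation of two equal-length rating lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom else 0.0

# Shahar and Gadi from the example tables (item order: Maraboo, Karnaf, Ima Adama, Liv).
shahar_items = {"Maraboo", "Karnaf", "Liv"}
gadi_items = {"Karnaf", "Liv"}
shahar_binary = [1, 1, 0, 1]  # "?" treated as 0 (assumption)
gadi_binary = [0, 1, 0, 1]

# Ratings on the two items both users rated (Karnaf, Liv).
shahar_ratings = [3, 2]
gadi_ratings = [1, 5]

jaccard = jaccard_distance(shahar_items, gadi_items)
cosine = cosine_similarity(shahar_binary, gadi_binary)
pearson = pearson_correlation(shahar_ratings, gadi_ratings)
```

With only two co-rated items, Pearson's correlation can only come out as +1 or -1 (or be undefined), which is the caveat the slide hints at with "but only 2 such preferences do...".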
  • 21. Item-Based Collaborative Filtering Usually bounded
  • 22. Case study: Amazon
       100,000,000 users; 2,000,000 items
       Each user expresses preference for 10 items; each item has 500 reviews
       User-Based CF: 100,000,000 x 100,000,000 similarity matrix; 2,000,000 x 500 sum terms
       Item-Based CF: 2,000,000 x 2,000,000 similarity matrix; 2,000,000 x 10 sum terms
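The arithmetic behind this case study can be checked directly. The sketch below only multiplies out the figures quoted on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide.
users = 100_000_000
items = 2_000_000
prefs_per_user = 10
reviews_per_item = 500

# Both views describe the same number of (user, item) preferences.
total_prefs_by_user = users * prefs_per_user
total_prefs_by_item = items * reviews_per_item

# Size of the full similarity matrix under each approach.
user_sim_entries = users * users
item_sim_entries = items * items
size_ratio = user_sim_entries // item_sim_entries

# Sum terms per prediction pass, as listed on the slide.
user_based_sum_terms = items * reviews_per_item
item_based_sum_terms = items * prefs_per_user
sum_ratio = user_based_sum_terms // item_based_sum_terms

print(f"user-user matrix entries: {user_sim_entries:.1e}")
print(f"item-item matrix entries: {item_sim_entries:.1e}")
print(f"user-based is {size_ratio}x larger, with {sum_ratio}x more sum terms")
```

On these numbers the item-item matrix is 2,500 times smaller than the user-user matrix, and the item-based sums involve 50 times fewer terms.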
  • 23. Interpretability
       “People who go to La Colombe Torrefaction & FourSquare HQ tend to go here”
       “Coffee Shop connoisseurs tend to come here”
  • 24. Evaluation
       Rating Problem: Predictive accuracy (regression) metrics: RMSE, MAE, etc.
       Preference (Binary) Problem: Classification accuracy (IR) metrics: Accuracy, Precision, Recall, F-1, ROC, etc. Benchmark vs. ‘random’ and ‘popular’
       Ranking accuracy metrics: Similarity of permutations: Pearson’s correlation, Spearman’s rho, Kendall’s tau
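A minimal sketch of the first two metric families, assuming predictions and held-out truth are already available as plain Python lists and sets. The function names and the toy data are illustrative and do not come from the deck.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def precision_recall_f1(recommended, relevant):
    """IR-style metrics for a set of recommended items vs. truly relevant items."""
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Rating problem: hypothetical predicted vs. held-out ratings.
predicted_ratings = [4, 3, 5]
actual_ratings = [5, 3, 2]
rating_rmse = rmse(predicted_ratings, actual_ratings)
rating_mae = mae(predicted_ratings, actual_ratings)

# Preference problem: hypothetical recommended vs. relevant item sets.
recommended = {"a", "b", "c"}
relevant = {"b", "c", "d", "e"}
precision, recall, f1 = precision_recall_f1(recommended, relevant)
```

The same functions can score the ‘random’ and ‘popular’ baselines the slide mentions, so the recommender's numbers have something to be compared against.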
  • 26. Challenges: Cold-start problems (new item, new user); “Black” and “Grey” sheep; Exploration-exploitation and reinforcement learning; Scale
  • 27. Advanced Topics: Dimensionality Reduction; Map-Reducible calculations; Content-based (feature-based); Multiple models
  • 28. MapReduce Similarity Calculation: “User-based”
       Preference matrix $A$ (users x items):
                   Maraboo  Karnaf  Ima Adama  Liv
         Idan         1        ?        1       ?
         Shahar       1        1        ?       1
         Gadi         ?        1        ?       1
       User vector $u_i$ for Gadi: Maraboo ?, Karnaf 1, Ima Adama ?, Liv 1
       User similarity vector $A u_i$: Idan 0, Shahar 2, Gadi 2
       $A^T$ (items x users):
                     Idan  Shahar  Gadi
         Maraboo       1      1      ?
         Karnaf        ?      1      1
         Ima Adama     1      ?      ?
         Liv           ?      1      1
       Item scores $A^T (A u_i)$: Maraboo 2, Karnaf 4, Ima Adama 0, Liv 4
  • 29. MapReduce Similarity Calculation: “Item-Based”
       $A^T$ (items x users) times $A$ (users x items):
                     Idan  Shahar  Gadi
         Maraboo       1      1      ?
         Karnaf        ?      1      1
         Ima Adama     1      ?      ?
         Liv           ?      1      1

                   Maraboo  Karnaf  Ima Adama  Liv
         Idan         1        ?        1       ?
         Shahar       1        1        ?       1
         Gadi         ?        1        ?       1
       Item similarity matrix $A^T A$:
                     Maraboo  Karnaf  Ima Adama  Liv
         Maraboo        2       1         1       1
         Karnaf         1       2         0       2
         Ima Adama      1       0         1       0
         Liv            1       2         0       2
       User vector $u_i$ for Gadi: Maraboo ?, Karnaf 1, Ima Adama ?, Liv 1
       Item scores $(A^T A) u_i$: Maraboo 2, Karnaf 4, Ima Adama 0, Liv 4
       Similarity of item x to item y is $\langle i_x, i_y \rangle$
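The user-based and item-based calculations on these two slides can be reproduced in plain Python. This is a sketch under one assumption: unknown entries ("?") are treated as 0, which matches the numbers shown on the slides. The helper names are illustrative.

```python
items = ["Maraboo", "Karnaf", "Ima Adama", "Liv"]
users = ["Idan", "Shahar", "Gadi"]

# Preference matrix A (users x items); "?" treated as 0 (assumption).
A = [
    [1, 0, 1, 0],  # Idan
    [1, 1, 0, 1],  # Shahar
    [0, 1, 0, 1],  # Gadi
]

def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

At = transpose(A)
u_gadi = A[users.index("Gadi")]

# Item-based: build the item similarity matrix A^T A, then apply it to u_i.
item_similarity = matmul(At, A)
item_based_scores = matvec(item_similarity, u_gadi)

# User-based: compute the user similarity vector A u_i, then apply A^T.
user_similarity = matvec(A, u_gadi)
user_based_scores = matvec(At, user_similarity)

for name, score in zip(items, item_based_scores):
    print(f"{name}: {score}")
```

Both routes give the same scores because matrix multiplication is associative: $(A^T A) u_i = A^T (A u_i)$. They differ only in which intermediate result is materialized.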
  • 30. MapReduce Similarity Calculation
       Recall row outer-product matrix multiplication: $A^T A = u_{Idan} u_{Idan}^T + u_{Shahar} u_{Shahar}^T + u_{Gadi} u_{Gadi}^T$
       $A^T A$:
                     Maraboo  Karnaf  Ima Adama  Liv
         Maraboo        2       1         1       1
         Karnaf         1       2         0       2
         Ima Adama      1       0         1       0
         Liv            1       2         0       2
       $u_{Idan} u_{Idan}^T$:
                     Maraboo  Karnaf  Ima Adama  Liv
         Maraboo        1       0         1       0
         Karnaf         0       0         0       0
         Ima Adama      1       0         1       0
         Liv            0       0         0       0
       $u_{Shahar} u_{Shahar}^T$:
                     Maraboo  Karnaf  Ima Adama  Liv
         Maraboo        1       1         0       1
         Karnaf         1       1         0       1
         Ima Adama      0       0         0       0
         Liv            1       1         0       1
       $u_{Gadi} u_{Gadi}^T$:
                     Maraboo  Karnaf  Ima Adama  Liv
         Maraboo        0       0         0       0
         Karnaf         0       1         0       1
         Ima Adama      0       0         0       0
         Liv            0       1         0       1
       Only one user’s list of items is used every time!
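The outer-product view maps naturally onto a map step and a reduce step. The sketch below simulates both in a single process with plain Python; it is not tied to any particular MapReduce framework, and the function names are illustrative. Each map call sees only one user's list of items, as the slide points out.

```python
from collections import defaultdict
from itertools import product

# Each user's list of preferred items (the non-"?" entries of the binary table).
preferences = {
    "Idan": ["Maraboo", "Ima Adama"],
    "Shahar": ["Maraboo", "Karnaf", "Liv"],
    "Gadi": ["Karnaf", "Liv"],
}

def map_user(items_liked):
    """Map step: emit ((item_x, item_y), 1) for every pair in one user's list.

    This is the sparse form of the outer product u u^T for that user.
    """
    for pair in product(items_liked, repeat=2):
        yield pair, 1

def reduce_counts(emitted):
    """Reduce step: sum the emitted values per (item_x, item_y) key."""
    totals = defaultdict(int)
    for key, value in emitted:
        totals[key] += value
    return dict(totals)

emitted = [kv for liked in preferences.values() for kv in map_user(liked)]
cooccurrence = reduce_counts(emitted)  # non-zero entries of A^T A
```

Pairs that never co-occur (for example Karnaf and Ima Adama) are simply absent from the result, which keeps the output sparse.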
  • 31. MapReduce Similarity Calculation
       All of the classic similarity functions are made up of 3 stages:
       Preprocess (uses only one ELEMENT)
       Norm (Can be done in reduce on one VECTOR)
       Similarity (utilizes the $A^T A$ matrix joined with norm entries)
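As one concrete instance of the three stages, cosine similarity between items can be sketched as follows. The assumptions here are that the preprocess step for cosine leaves each element unchanged, that "?" is treated as 0, and that the function and variable names are illustrative only.

```python
import math

items = ["Maraboo", "Karnaf", "Ima Adama", "Liv"]

# Preference matrix A (users x items); "?" treated as 0 (assumption).
A = [
    [1, 0, 1, 0],  # Idan
    [1, 1, 0, 1],  # Shahar
    [0, 1, 0, 1],  # Gadi
]

# Stage 1, preprocess: works on one element at a time.
# For cosine this sketch assumes the element is passed through unchanged.
def preprocess(value):
    return float(value)

P = [[preprocess(v) for v in row] for row in A]

# Stage 2, norm: works on one item vector (one column of A) at a time.
columns = list(zip(*P))
norms = [math.sqrt(sum(v * v for v in col)) for col in columns]

# Stage 3, similarity: the dot product is an entry of A^T A,
# joined with the two norms computed in stage 2.
def cosine(i, j):
    dot = sum(x * y for x, y in zip(columns[i], columns[j]))
    if norms[i] == 0 or norms[j] == 0:
        return 0.0
    return dot / (norms[i] * norms[j])

karnaf_liv = cosine(items.index("Karnaf"), items.index("Liv"))
maraboo_karnaf = cosine(items.index("Maraboo"), items.index("Karnaf"))
karnaf_ima = cosine(items.index("Karnaf"), items.index("Ima Adama"))
```

Other measures would change the preprocess and norm functions while reusing the same $A^T A$ entries in the last stage.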
  • 33. Thanks! Nimrod Priell, nimrod.priell@gmail.com, @nimrodpriell, http://www.educated-guess.com