Institut de Recherche en Informatique de Toulouse (IRIT) - UMR 5505




                  Bridging the gap between users and systems

      Laurent CANDILLIER – Max CHEVALIER – Damien DUDOGNON – Josiane MOTHE




27/10/11
Diversity in recommender systems
  How to recommend documents for a visited one
              Maximizing the chances of retrieving at least one relevant
               document per user [Santos et al., 2010]
              Cover a large range of users’ interests


  Context
     Blog platform
              Unknown user => no profile
              Diversity of users, diversity of their expectations




     => Diversify the recommendations
27/10/11                  Candillier L. – Chevalier M. – Dudognon D. – Mothe J.
What is diversity?
  Definitions from the literature
    Topicality
              Related to a particular topic [Xu and Chen, 2006]


        Diversity
              Topical diversity
                Extrinsic: solve ambiguity [Radlinski et al., 2009]

                Intrinsic: avoid redundancy [Clarke et al., 2008]



              Serendipity
                Attractive and surprising documents [Herlocker et al., 2004]



Approaches to diversify IR results
  Topical diversity
     Clustering
              Identify aspects
              Reorder depending on the aspects covered


        Examples
              K-Means [Bi et al., 2009]
              Hierarchical Clustering [Meij et al., 2010]
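
The reordering step can be illustrated with a minimal sketch. It assumes the aspects have already been identified by some clustering step (the `aspect_of` mapping is hypothetical) and uses a simple round-robin over aspects, which is only one possible reordering policy and not necessarily the one used in the cited systems.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def reorder_by_aspect(ranked_docs: Sequence[str],
                      aspect_of: dict[str, Hashable]) -> list[str]:
    """Round-robin over aspects, keeping the original order inside each aspect."""
    # Group documents by aspect; dicts keep the first-seen order of aspects.
    groups: dict[Hashable, list[str]] = defaultdict(list)
    for doc in ranked_docs:
        groups[aspect_of[doc]].append(doc)

    reordered: list[str] = []
    queues = [list(docs) for docs in groups.values()]
    # Take one document per aspect in turn until every queue is empty.
    while any(queues):
        for queue in queues:
            if queue:
                reordered.append(queue.pop(0))
    return reordered


# Hypothetical ranking and aspect labels, for illustration only.
ranked = ["d1", "d2", "d3", "d4", "d5"]
aspects = {"d1": "A", "d2": "A", "d3": "B", "d4": "A", "d5": "C"}
print(reorder_by_aspect(ranked, aspects))
```

With this policy the top of the list covers aspects A, B and C before any aspect is repeated.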




Approaches to diversify IR results
  Topical diversity
     Sliding Windows
              Reorder the retrieved documents
              Select documents using metrics
                Similarity with the visited document

                Similarity with the current recommended document list




        Examples
              MMR [Carbonell and Goldstein, 1998]
              Intra-list similarity [Ziegler et al., 2005]
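
As an illustration of the two metrics above, here is a minimal sketch of greedy MMR selection, with the visited document playing the role of the query. The similarity function, the toy texts and the value of `lam` are assumptions made for the example.

```python
from typing import Callable, Sequence


def mmr_rerank(visited: str,
               candidates: Sequence[str],
               sim: Callable[[str, str], float],
               k: int,
               lam: float = 0.7) -> list[str]:
    """Greedy MMR: trade off similarity to the visited document against
    similarity to the documents already selected."""
    selected: list[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(doc: str) -> float:
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((sim(doc, s) for s in selected), default=0.0)
            return lam * sim(visited, doc) - (1.0 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy documents (hypothetical); "v" is the visited document.
texts = {
    "v": "python web framework",
    "a": "python web framework tutorial",
    "b": "python web framework guide",
    "c": "python data analysis",
}


def jaccard(x: str, y: str) -> float:
    """Word-set Jaccard similarity, used here only as a stand-in metric."""
    a, b = set(texts[x].split()), set(texts[y].split())
    return len(a & b) / len(a | b)


print(mmr_rerank("v", ["a", "b", "c"], jaccard, k=2, lam=0.7))
print(mmr_rerank("v", ["a", "b", "c"], jaccard, k=2, lam=0.3))
```

A high `lam` favours documents close to the visited one, while a low `lam` penalises candidates that resemble those already in the list.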



Approaches to diversify IR results
  Serendipity
     Alternative to topical diversity
     Similarity not only based on the content


        Examples
              Organizational similarity [Cabanac et al., 2007]
              Temporal diversity [Lathia et al., 2010]




Analysis of the TREC Web 2009 results
  Hypothesis
    Diversity of approaches
              No single approach fits all users’ needs
              Approaches are complementary
              Valuable to combine them


  Goals
     Analyse results obtained with approaches that have
            The same goal
            Similar performance

           => Determine whether diversity exists

Analysis of the TREC Web 2009 results
  Experimental framework
     Reference IR corpus (TREC Web 2009)


        Two IR contexts
          Adhoc task

          Diversity task



        Compare results (runs) of the 4 best approaches of each task
          Similar performance according to IR metrics

             MAP for adhoc task

             NDCG for diversity task

          Overlap for each pair of runs, as an indicator of diversity
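
The overlap measure is not defined on the slide. The sketch below assumes the overlap of two runs is the share of documents common to their top-k lists, averaged over all pairs of runs. The run names and document ids are hypothetical.

```python
from itertools import combinations


def overlap_at_k(run_a, run_b, k=10):
    """Share of documents common to the top-k of two ranked lists."""
    top_a, top_b = set(run_a[:k]), set(run_b[:k])
    return len(top_a & top_b) / k


def mean_pairwise_overlap(runs, k=10):
    """Average overlap_at_k over every pair of runs."""
    pairs = list(combinations(runs.values(), 2))
    return sum(overlap_at_k(a, b, k) for a, b in pairs) / len(pairs)


# Hypothetical runs for a single topic.
runs = {
    "run1": ["d1", "d2", "d3", "d4"],
    "run2": ["d3", "d5", "d1", "d6"],
    "run3": ["d7", "d8", "d2", "d9"],
}
print(mean_pairwise_overlap(runs, k=4))
```

In a TREC-style setting this would be computed per topic and then averaged over topics.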


Analysis of the TREC Web 2009 results
  Adhoc Task




    Top 10 documents
      Overlap: 22.4%

      Precision: 0.384



    Overlap max < 30%




Analysis of the TREC Web 2009 results
  Diversity Task




    Top 10 documents
      Overlap: 6.3%



    Overlap max < 15%




Analysis of the TREC Web 2009 results
  Conclusions
     Two distinct approaches are unlikely to return the same
      (relevant) documents
              Low average overlap

        Diversity of approaches
          No approach significantly better than others
          A combination can be valuable



        TREC tasks focused on topicality and topical diversity
          Cannot be used to evaluate other types of diversity
          A user study is necessary [Hayes et al., 2002]


Users’ Study
  Our intuitions
    Most of the time, users want topicality
              Get focused information


        Sometimes, they want diversity
              Topical diversity
                Broaden the subject

              Serendipity
                Discover new information




Users’ Study
  Goals
     Verify our intuitions
     Show that diversified recommendations meet a wider range
      of users’ needs

  Experimental context
     34 M.Sc. students (management field)
     Blog post recommendations




Users’ Study
  Experimental Framework
     Select a document




Users’ Study
  Experimental Framework
     Read the selected document




Users’ Study
  Experimental Framework
     Compute the recommendation lists


       [Diagram: Approach 1 to Approach 5 on the left; List 1 (random) and List 2 (fused) on the right]
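
The slides do not say how the two lists are built. A minimal sketch under two assumptions: List 1 is the output of one approach picked at random, and List 2 interleaves the outputs of all approaches round-robin while skipping duplicates. Function names and sample lists are hypothetical.

```python
import random
from typing import Mapping, Sequence


def fuse_round_robin(lists: Mapping[str, Sequence[str]], k: int) -> list[str]:
    """Interleave the approaches' lists, skipping documents already taken."""
    fused: list[str] = []
    depth = max((len(docs) for docs in lists.values()), default=0)
    for rank in range(depth):
        for docs in lists.values():
            if len(fused) >= k:
                return fused
            if rank < len(docs) and docs[rank] not in fused:
                fused.append(docs[rank])
    return fused


def pick_random_list(lists: Mapping[str, Sequence[str]], k: int,
                     rng: random.Random) -> tuple[str, list[str]]:
    """Use the top-k of one approach chosen at random."""
    name = rng.choice(sorted(lists))
    return name, list(lists[name][:k])


# Hypothetical outputs of three approaches for one visited document.
lists = {
    "searchsim": ["d1", "d2", "d3"],
    "mlt": ["d1", "d4", "d5"],
    "kmeans": ["d6", "d2", "d7"],
}
print(fuse_round_robin(lists, k=5))
print(pick_random_list(lists, k=2, rng=random.Random(0)))
```

Round-robin interleaving gives every approach the same weight. A weighted variant would be a natural refinement.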



Users’ Study
  Experimental Framework
     Present recommendation lists for assessment
                   Which list best meets your needs?




                   Which list is the most diversified?




Users’ Study
  Experimental Framework
     Assessment of all documents




Users’ Study
  Approaches used
     Topicality
         searchsim
               Vector-space model
               Document title as query
         mlt
               Apache Solr MoreLikeThis module
               Document content as query
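
To make the searchsim idea concrete, here is a minimal sketch of a vector-space search that uses the visited document's title as the query. It assumes raw term-frequency weights and cosine similarity, whereas a production engine would typically add IDF weighting and text normalisation. It does not reproduce the Solr MoreLikeThis module, and the corpus is hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (lowercased, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * v[term] for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search_by_title(title, corpus, k=3):
    """Rank corpus documents against the visited document's title."""
    query = tf_vector(title)
    scored = [(cosine(query, tf_vector(body)), doc_id)
              for doc_id, body in corpus.items()]
    # Highest score first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:k] if score > 0]


# Hypothetical blog posts.
corpus = {
    "d1": "easy bread recipe with whole wheat flour",
    "d2": "travel notes from toulouse",
    "d3": "sourdough bread starter tips",
}
print(search_by_title("Homemade bread recipe", corpus))
```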




     Topical diversity
         kmeans
               K-means clustering
               One element per cluster
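
A minimal sketch of the kmeans idea: cluster the candidate documents, then keep one document per cluster. The slides do not say which element is kept. Picking the member closest to the centroid, seeding the centroids with the first k points and the 2-D toy vectors are all assumptions.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels, centroids


def one_per_cluster(doc_ids, points, k):
    """Return, for each cluster, the document closest to its centroid."""
    labels, centroids = kmeans(points, k)
    picks = []
    for c in range(k):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:
            best = min(members, key=lambda i: math.dist(points[i], centroids[c]))
            picks.append(doc_ids[best])
    return picks


# Hypothetical document vectors forming two obvious groups.
doc_ids = ["d1", "d2", "d3", "d4", "d5", "d6"]
points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
print(one_per_cluster(doc_ids, points, k=2))
```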




     Serendipity
         blogart
               Random selection from the same blog
         topcateg
               Popular documents in the same category
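
Both serendipity approaches can be sketched in a few lines. The `Post` structure and the use of a view count as the popularity signal are assumptions, since the slides do not define how popularity is measured.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: str
    blog: str
    category: str
    views: int  # hypothetical popularity signal


def blogart(visited, posts, k, rng):
    """Random posts from the same blog as the visited one."""
    pool = [p for p in posts
            if p.blog == visited.blog and p.post_id != visited.post_id]
    return rng.sample(pool, min(k, len(pool)))


def topcateg(visited, posts, k):
    """Most viewed posts in the same category as the visited one."""
    pool = [p for p in posts
            if p.category == visited.category and p.post_id != visited.post_id]
    return sorted(pool, key=lambda p: p.views, reverse=True)[:k]


# Hypothetical posts; the first one is the visited document.
posts = [
    Post("p1", "blog1", "food", 10),
    Post("p2", "blog1", "travel", 5),
    Post("p3", "blog2", "food", 50),
    Post("p4", "blog2", "food", 20),
    Post("p5", "blog1", "food", 1),
]
visited = posts[0]
print([p.post_id for p in topcateg(visited, posts, k=2)])
print(sorted(p.post_id for p in blogart(visited, posts, k=2, rng=random.Random(0))))
```

Neither function looks at the content of the visited post, which is what lets them surface documents a content-based approach would miss.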



Users’ Study
  Approaches used



    Same analysis as in the TREC experiments
      Same results

      Overlap is low (< 10%)

     => High diversity




Users’ Study
     Results
        Distribution of relevant documents
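   [Charts: each approach compared with the fused list; percentages as extracted, the third value is unlabeled in the source]

   Approach      Approach share   Fused share   Third value
   blogart           35%             65%           0%
   kmeans            52.5%           21.3%         26.2%
   mlt               54.7%           32.8%         12.5%
   searchsim         52.4%           38.9%         8.7%
   topcateg          8.8%            91.2%         0%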
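
One plausible reading of these figures is a three-way split of the relevant documents: those found only by the single approach, those found only by the fused list, and those found by both. Under that assumption, which the slides do not state explicitly, the split could be computed as in the sketch below. The function name and document ids are hypothetical.

```python
def relevance_split(approach_relevant, fused_relevant):
    """Percentages of relevant documents found only by the approach,
    only by the fused list, and by both."""
    a, f = set(approach_relevant), set(fused_relevant)
    total = len(a | f)
    if total == 0:
        return {"approach_only": 0.0, "fused_only": 0.0, "both": 0.0}
    return {
        "approach_only": 100 * len(a - f) / total,
        "fused_only": 100 * len(f - a) / total,
        "both": 100 * len(a & f) / total,
    }


# Hypothetical relevance judgements for one comparison.
split = relevance_split(["d1", "d2", "d3"], ["d3", "d4"])
print(split)
```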


Users’ Study
  Results
     Distribution of relevant documents
              Relevant documents are mainly retrieved by topical approaches
              But at least 20% are retrieved only by the fused list


        The fused list matches a larger range of needs




Conclusions and future work
  Conclusions
     Diversity of users’ expectations
              No one approach to rule them all
              A diversity of approaches
                Complementary

                Fused




        Diversity helps recommender systems (RS) fit more users’ needs




Conclusions and future work
  Future work
     Full-scale experiment
              OverBlog platform


        Repeat the user study
              More users (international call for participation)
              Avoid the biases revealed by this study
                e.g. with a more detailed form

               => Deeper analysis




Conclusions and future work
  Future work
     Improve the model
              Refining the fusing process
              Adding a learning process to weight each approach
                For every visited document

                   Find the proportion of documents coming from each
                    approach (log analysis)
                Better match with the real users’ needs
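
As a sketch of what such a log analysis might look like, the code below computes, for every visited document, the proportion of clicked recommendations coming from each approach. The log format (pairs of visited document and originating approach) is hypothetical, and using these proportions directly as fusion weights is an assumption.

```python
from collections import Counter, defaultdict


def weights_per_visited(click_log):
    """For each visited document, the share of clicked recommendations
    that came from each approach."""
    counts = defaultdict(Counter)
    for visited_id, approach in click_log:
        counts[visited_id][approach] += 1
    return {
        visited_id: {name: n / sum(by_approach.values())
                     for name, n in by_approach.items()}
        for visited_id, by_approach in counts.items()
    }


# Hypothetical click log: (visited document, approach behind the clicked item).
click_log = [
    ("post1", "mlt"),
    ("post1", "mlt"),
    ("post1", "kmeans"),
    ("post1", "topcateg"),
    ("post2", "searchsim"),
]
print(weights_per_visited(click_log))
```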




Thank you for your attention

                               Questions?




References
 W. Bi, X. Yu, Y. Liu, F. Guan, Z. Peng, H. Xu, and X. Cheng, “ICTNET at Web Track 2009 diversity task”, Text REtrieval Conf., 2009

 G. Cabanac, M. Chevalier, C. Chrisment, and C. Julien, “An Original Usage-based Metrics for Building a Unified View of Corporate Documents”,
 Inter. Conf. on Database and Expert Systems Applications, LNCS vol. 4653, 2007, pp. 202–212

 J. Carbonell and J. Goldstein, “The use of MMR, diversity-based reranking for reordering documents and producing summaries”, ACM Conf. on
 Research and Development in Information Retrieval, 1998, pp. 335-336

 C. L. A. Clarke, M. Kolla, G. V. Cormack, O. Vechtomova, A. Ashkan, S. Büttcher, and I. MacKinnon, “Novelty and Diversity in Information
 Retrieval Evaluation”, ACM Conf. on Research and Development in Information Retrieval, 2008, pp. 659-666

 C. Hayes, P. Massa, P. Avesani, and P. Cunningham, “An online evaluation framework for recommender systems”, Workshop on Personalization
 and Recommendation in E-Commerce, 2002

 J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl, “Evaluating Collaborative Filtering Recommender Systems”, ACM Trans. Information
 Systems, 22(1), 2004, pp. 5-53

 N. Lathia, S. Hailes, L. Capra, and X. Amatriain, “Temporal diversity in recommender systems”, ACM Conf. on Research and Development in
 Information Retrieval, 2010, pp. 210-217

 E. Meij, J. He, W. Weerkamp, and M. de Rijke, “Topical Diversity and Relevance Feedback”, Text REtrieval Conf., 2010

 F. Radlinski, P. N. Bennett, B. Carterette, and T. Joachims, “Redundancy, diversity and interdependent document relevance”, SIGIR Forum, 43(2),
 2009, pp. 46–52

 R. L. T. Santos, C. Macdonald, and I. Ounis, “Selectively Diversifying Web Search Results”, ACM Inter. Conf. on Information and Knowledge
 Management, 2010

 Y. C. Xu and Z. Chen, “Relevance judgment: What do information users consider beyond topicality”, Journal of the American Society for
 Information Science and Technology, 57(7), 2006, pp. 961–973

 C. Ziegler, S. McNee, J. A. Konstan, and G. Lausen, “Improving recommendation lists through topic diversification”, Inter. Conf. on World Wide
 Web, 2005, pp. 22–32

Diversity in recommender systems - Bridging the gap between users and systems

  • 1. Institut de Recherche en Informatique de Toulouse (IRIT) - UMR 5505
      • Bridging the gap between users and systems
      • Laurent CANDILLIER – Max CHEVALIER – Damien DUDOGNON – Josiane MOTHE
      • 27/10/11
  • 2-3. Diversity in recommender systems
      • How to recommend documents for a visited one
          • Maximizing the chances of retrieving at least one relevant document per user [Santos et al., 2010]
          • Cover a large range of users’ interests
      • Context
          • Blog platform
              • Unknown user => no profile
              • Diversity of users, diversity of their expectations
      • => Diversify the recommendations
      • 27/10/11 Candillier L. – Chevalier M. – Dudognon D. – Mothe M.
  • 4. What is diversity?
      • Definitions from the literature
          • Topicality
              • Related to a particular topic [Xu and Chen, 2006]
          • Diversity
              • Topical diversity
                  • Extrinsic: solve ambiguity [Radlinski et al., 2009]
                  • Intrinsic: avoid redundancy [Clarke et al., 2008]
              • Serendipity
                  • Attractive and surprising documents [Herlocker et al., 2004]
  • 5. Approaches to diversify IR results
      • Topical diversity
          • Clustering
              • Identify aspects
              • Reorder depending on the aspects covered
          • Examples
              • K-Means [Bi et al., 2009]
              • Hierarchical Clustering [Meij et al., 2010]
  • 6. Approaches to diversify IR results
      • Topical diversity
          • Sliding Windows
              • Reorder the retrieved documents
              • Select documents using metrics
                  • Similarity with the visited document
                  • Similarity with the current recommended document list
          • Examples
              • MMR [Carbonell and Goldstein, 1998]
              • Intra-list similarity [Ziegler et al., 2005]
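The two metrics on slide 6 (similarity with the visited document, similarity with the list built so far) are what MMR-style re-ranking trades off. The following is a minimal sketch of greedy MMR selection, not the authors' implementation: the function name `mmr`, the trade-off parameter `lam`, and the way similarities are passed in are all assumptions made for illustration.

```python
def mmr(candidates, sim_to_visited, sim_between, k, lam=0.7):
    """Greedy MMR-style selection (illustrative sketch).

    candidates     -- list of document ids
    sim_to_visited -- dict: doc id -> similarity with the visited document
    sim_between    -- function (doc_a, doc_b) -> similarity between two documents
    k              -- number of documents to recommend
    lam            -- assumed trade-off: 1.0 = pure relevance, 0.0 = pure novelty
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(doc):
            # Redundancy = highest similarity with anything already selected.
            redundancy = max((sim_between(doc, s) for s in selected), default=0.0)
            return lam * sim_to_visited[doc] - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: "a" and "b" are near-duplicates, "c" is less relevant but different.
relevance = {"a": 0.9, "b": 0.85, "c": 0.5}
pairwise = {frozenset("ab"): 0.95, frozenset("ac"): 0.1, frozenset("bc"): 0.1}
similarity = lambda x, y: pairwise[frozenset((x, y))]

print(mmr(["a", "b", "c"], relevance, similarity, k=3, lam=0.5))
```

With `lam=0.5` the near-duplicate "b" is pushed behind "c"; with `lam=1.0` the order falls back to plain relevance.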
  • 7. Approaches to diversify IR results
      • Serendipity
          • Alternative to topical diversity
          • Similarity not only based on the content
          • Examples
              • Organizational similarity [Cabanac et al., 2007]
              • Temporal diversity [Lathia et al., 2010]
  • 8. Analysis of the TREC Web 2009 results
      • Hypothesis
          • Diversity of approaches
              • No single approach covers all users’ needs
          • Approaches are complementary
              • Valuable to combine them
      • Goals
          • Analyse results obtained with approaches having
              • The same goal
              • Similar performance
          • => To identify whether diversity exists
  • 9. Analysis of the TREC Web 2009 results
      • Experimental framework
          • Reference IR corpus (TREC Web 2009)
          • Two IR contexts
              • Adhoc task
              • Diversity task
          • Compare results (runs) of the 4 best approaches of each task
              • Similar performance according to IR metrics
                  • MAP for adhoc task
                  • NDCG for diversity task
              • Overlap for each pair of runs, as an indicator of diversity
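The slides do not give the exact overlap formula. One plausible reading, assumed here, is the share of the top-k documents that two runs have in common, averaged over all pairs of runs. The sketch below uses hypothetical function names and toy document ids.

```python
from itertools import combinations


def overlap_at_k(run_a, run_b, k=10):
    """Fraction of the top-k documents shared by two ranked runs (assumed definition)."""
    top_a, top_b = set(run_a[:k]), set(run_b[:k])
    return len(top_a & top_b) / k


def mean_pairwise_overlap(runs, k=10):
    """Average overlap@k over every pair of runs; `runs` maps run name -> ranked list."""
    pairs = list(combinations(runs, 2))
    return sum(overlap_at_k(runs[a], runs[b], k) for a, b in pairs) / len(pairs)


# Toy runs for a single topic (document ids are made up).
runs = {
    "run1": [1, 2, 3, 4],
    "run2": [3, 4, 5, 6],
    "run3": [7, 8, 9, 10],
}
print(overlap_at_k(runs["run1"], runs["run2"], k=4))  # 2 shared documents out of 4
print(mean_pairwise_overlap(runs, k=4))
```

In a TREC-style setting this would be computed per topic and then averaged over topics; that outer loop is omitted here.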
  • 10. Analysis of the TREC Web 2009 results
      • Adhoc Task
          • Top 10 documents
              • Overlap: 22.4%
              • Precision: 0.384
          • Overlap max < 30%
  • 11. Analysis of the TREC Web 2009 results
      • Diversity Task
          • Top 10 documents
              • Overlap: 6.3%
          • Overlap max < 15%
  • 12. Analysis of the TREC Web 2009 results
      • Conclusions
          • Two distinct approaches are unlikely to return the same (relevant) documents
              • Low average overlap
          • Diversity of approaches
              • No approach significantly better than others
              • A combination can be valuable
          • TREC tasks focused on topicality and topical diversity
              • Can’t be used to evaluate other types of diversity
              • Users’ study necessary [Hayes et al., 2002]
  • 13. Users’ Study
      • Our intuitions
          • Most of the time, users want topicality
              • Get focused information
          • Sometimes, they want diversity
              • Topical diversity
                  • Broaden the subject
              • Serendipity
                  • Discover new information
  • 14. Users’ Study
      • Goals
          • Verify our intuitions
          • Show that diversified recommendations answer a larger range of users’ needs
      • Context of experimentation
          • 34 M.Sc. students (management field)
          • Blog post recommendations
  • 15. Users’ Study
      • Experimental Framework
          • Select a document
  • 16. Users’ Study
      • Experimental Framework
          • Read the selected document
  • 17-21. Users’ Study
      • Experimental Framework
          • Compute the recommendation lists
          • [Diagram: Approaches 1 to 5, List 1 (random), List 2 (fused)]
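The transcript only names the two lists ("random" and "fused") without saying how they are built. The sketch below rests on two loudly stated assumptions: List 1 is taken from a single approach picked at random, and List 2 interleaves the five ranked lists round-robin while skipping duplicates. Function names and document ids are hypothetical, and the actual fusion used in the study may differ.

```python
import random
from itertools import zip_longest


def random_list(lists_by_approach, rng):
    """List 1 (assumption): the ranked list of one approach picked at random."""
    name = rng.choice(sorted(lists_by_approach))
    return name, list(lists_by_approach[name])


def fused_list(lists_by_approach, size):
    """List 2 (assumption): round-robin interleaving of all lists, duplicates skipped."""
    if size <= 0:
        return []
    fused, seen = [], set()
    # zip_longest walks rank 1 of every approach, then rank 2, and so on.
    for same_rank in zip_longest(*lists_by_approach.values()):
        for doc in same_rank:
            if doc is not None and doc not in seen:
                seen.add(doc)
                fused.append(doc)
                if len(fused) == size:
                    return fused
    return fused


lists = {
    "searchsim": ["d1", "d2"],
    "mlt": ["d1", "d3"],
    "kmeans": ["d4"],
}
print(random_list(lists, random.Random(0)))
print(fused_list(lists, size=4))
```

Round-robin gives every approach the same share of the fused list; a weighted variant would be the natural next step once per-approach weights are available.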
  • 22. Users’ Study
      • Experimental Framework
          • Present recommendation lists for assessment
              • Which list best meets your needs?
  • 23. Users’ Study
      • Experimental Framework
          • Present recommendation lists for assessment
              • Which list is the most diversified?
  • 24. Users’ Study
      • Experimental Framework
          • Assessment of all documents
  • 25. Users’ Study
      • Approaches used: topicality
          • searchsim
              • Vector-space model
              • Document title as query
          • mlt
              • Apache Solr MoreLikeThis module
              • Document content as query
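As an illustration of the searchsim idea (vector-space model, document title as query), here is a minimal TF-IDF and cosine-similarity sketch in plain Python. The whitespace tokenizer, the `log(n/df) + 1` weighting, and the function names are assumptions; the slides do not specify the weighting scheme or the search engine actually used.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """One sparse TF-IDF vector (dict term -> weight) per text; idf = log(n/df) + 1."""
    tokenized = [text.lower().split() for text in texts]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    return [
        {term: count * (math.log(n / df[term]) + 1) for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def rank_by_title(visited_title, corpus):
    """Rank corpus documents (dict id -> text) against the visited document's title."""
    ids = list(corpus)
    # The title is vectorised together with the corpus so that idf values are shared.
    vectors = tfidf_vectors([visited_title] + [corpus[i] for i in ids])
    query, docs = vectors[0], vectors[1:]
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in zip(ids, docs)]
    return [doc_id for score, doc_id in sorted(scored, key=lambda p: -p[0]) if score > 0]


corpus = {
    "a": "recommender systems and diversity",
    "b": "cooking pasta at home",
    "c": "diversity in search results",
}
print(rank_by_title("diversity in recommender systems", corpus))
```

Documents sharing no term with the title get a zero score and are dropped from the ranking.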
  • 26. Users’ Study
      • Approaches used: topical diversity
          • kmeans
              • K-means clustering
              • One element per cluster
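A minimal sketch of the "one element per cluster" idea follows. It assumes documents are already represented as numeric vectors, initialises centroids with the first k points to stay deterministic, and picks the member closest to each centroid. The slides do not say how the representative of each cluster is chosen, so that selection rule, like the function names, is an assumption.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (deterministic, assumed)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(xs) / len(members) for xs in zip(*members)]
    return assignment, centroids


def one_per_cluster(doc_ids, points, k):
    """Return one document per non-empty cluster: the member nearest the centroid."""
    assignment, centroids = kmeans(points, k)
    picks = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            best = min(members, key=lambda i: math.dist(points[i], centroids[c]))
            picks.append(doc_ids[best])
    return picks


doc_ids = ["d0", "d1", "d2", "d3", "d4", "d5"]
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(one_per_cluster(doc_ids, points, k=2))
```

`math.dist` requires Python 3.8 or later.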
  • 27. Users’ Study
      • Approaches used: serendipity
          • blogart
              • Random selection from the same blog
          • topcateg
              • Popular documents in the same category
  • 28. Users’ Study
      • Approaches used
          • Same analysis as for the TREC experiments
          • Same results
              • Overlap is low (< 10%) => High diversity
  • 29-31. Users’ Study
      • Results
          • Distribution of relevant documents
          • [Figure: charts comparing each approach (blogart, kmeans, mlt, searchsim, topcateg) with the fused list. Values in transcript order: 35%, 65%, 52.5%, 21.3%, 0%, 26.2%, 54.7%, 32.8%, 12.5%, 52.4%, 38.9%, 8.8%, 91.2%, 8.7%, 0%]
  • 32. Users’ Study
      • Results
          • Distribution of relevant documents
              • Relevant documents are mainly retrieved by topical approaches
              • But at least 20% are retrieved only by fused
              • Fused matches a larger range of needs
  • 33. Conclusions and future work
      • Conclusions
          • Diversity of users’ expectations
              • No one approach to rule them all
          • A diversity of approaches
              • Complementary
              • Fused
          • Diversity helps recommender systems fit more users’ needs
  • 34. Conclusions and future work
      • Future work
          • Real-scale experiment
              • OverBlog platform
          • Repeat the user survey
              • More users (international call for participation)
              • Avoid the biases revealed by this study
                  • e.g. a more detailed form => deeper analysis
  • 35. Conclusions and future work
      • Future work
          • Improve the model
              • Refining the fusing process
                  • Adding a learning process to weight each approach
                  • For every visited document
                      • Find the proportion of documents coming from each approach (log analysis)
                      • Better match with the real users’ needs
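The log analysis mentioned on slide 35 is future work and is not specified further. As a rough sketch under stated assumptions, suppose the log records, for each clicked recommendation, the visited document and the approach that produced it. The weight of an approach for a given visited document could then be its share of the clicks. The log format, the function name, and the uniform fallback are all hypothetical.

```python
from collections import Counter


def approach_weights(click_log, visited_doc, approaches):
    """Share of clicked recommendations per approach for one visited document.

    click_log -- iterable of (visited_doc, approach) pairs (assumed log format)
    Falls back to uniform weights when the document has no logged clicks.
    """
    counts = Counter(
        approach
        for doc, approach in click_log
        if doc == visited_doc and approach in approaches
    )
    total = sum(counts.values())
    if total == 0:
        return {a: 1 / len(approaches) for a in approaches}
    return {a: counts[a] / total for a in approaches}


approaches = ["searchsim", "mlt", "kmeans", "blogart", "topcateg"]
log = [
    ("post1", "mlt"),
    ("post1", "mlt"),
    ("post1", "kmeans"),
    ("post2", "blogart"),
    ("post1", "mlt"),
]
print(approach_weights(log, "post1", approaches))
print(approach_weights(log, "post3", approaches))
```

Such weights could then decide how many slots each approach receives in the fused list.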
  • 36. Thank you for your attention
      • Questions?
  • 37. References
      • W. Bi, X. Yu, Y. Liu, F. Guan, Z. Peng, H. Xu, and X. Cheng, “ICTNET at Web Track 2009 diversity task”, Text REtrieval Conf., 2009
      • G. Cabanac, M. Chevalier, C. Chrisment, and C. Julien, “An Original Usage-based Metrics for Building a Unified View of Corporate Documents”, Inter. Conf. on Database and Expert Systems Applications, LNCS vol. 4653, 2007, pp. 202–212
      • J. Carbonell and J. Goldstein, “The use of MMR, diversity-based reranking for reordering documents and producing summaries”, ACM Conf. on Research and Development in Information Retrieval, 1998, pp. 335–336
      • C. L. A. Clarke, M. Kolla, G. V. Cormack, O. Vechtomova, A. Ashkan, S. Büttcher, and I. MacKinnon, “Novelty and Diversity in Information Retrieval Evaluation”, ACM Conf. on Research and Development in Information Retrieval, 2008, pp. 659–666
      • C. Hayes, P. Massa, P. Avesani, and P. Cunningham, “An online evaluation framework for recommender systems”, Workshop on Personalization and Recommendation in E-Commerce, 2002
      • J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl, “Evaluating Collaborative Filtering Recommender Systems”, ACM Trans. Information Systems, 22(1), 2004, pp. 5–53
      • N. Lathia, S. Hailes, L. Capra, and X. Amatriain, “Temporal diversity in recommender systems”, ACM Conf. on Research and Development in Information Retrieval, 2010, pp. 210–217
      • E. Meij, J. He, W. Weerkamp, and M. de Rijke, “Topical Diversity and Relevance Feedback”, Text REtrieval Conf., 2010
      • F. Radlinski, P. N. Bennett, B. Carterette, and T. Joachims, “Redundancy, diversity and interdependent document relevance”, SIGIR Forum, 43(2), 2009, pp. 46–52
      • R. L. T. Santos, C. Macdonald, and I. Ounis, “Selectively Diversifying Web Search Results”, ACM Inter. Conf. on Information and Knowledge Management, 2010
      • Y. C. Xu and Z. Chen, “Relevance judgment: What do information users consider beyond topicality”, Journal of the American Society for Information Science and Technology, 57(7), 2006, pp. 961–973
      • C. Ziegler, S. McNee, J. A. Konstan, and G. Lausen, “Improving recommendation lists through topic diversification”, Inter. Conf. on World Wide Web, 2005, pp. 22–32