The document provides an overview of a tutorial on context-awareness in information retrieval and recommender systems. It discusses information overload and two families of solutions to it: information retrieval (e.g., search engines) and recommender systems (e.g., movie recommendations). It then covers context and context-awareness, with examples of how recommendations may change based on location, time, user intent, and so on, and discusses how incorporating context-awareness into information retrieval and recommender systems can improve recommendations.
Tutorial: Context In Recommender Systems (Yong Zheng)
This document provides an overview of a tutorial on context-aware recommender systems. The tutorial will cover traditional recommendation techniques, context-aware recommendation which incorporates additional contextual information such as time and location, and context suggestion. It includes an agenda with topics, background information on recommender systems and evaluation metrics, and descriptions of techniques for context-aware recommendation including context filtering and modeling.
Talk with Yves Raimond at the GPU Tech Conference on March 28, 2018 in San Jose, CA.
Abstract:
In this talk, we will survey how Deep Learning methods can be applied to personalization and recommendations. We will cover why standard Deep Learning approaches don't perform better than typical collaborative filtering techniques. Then we will go over recently published research at the intersection of Deep Learning and recommender systems, looking at how it integrates new types of data, explores new models, or changes the recommendation problem statement. We will also highlight some of the ways that neural networks are used at Netflix and how GPUs can be used to train recommender systems. Finally, we will highlight promising new directions in this space.
[PhD Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re... (Gabriel Moreira)
Presentation of the Ph.D. thesis defense of Gabriel de Souza Pereira Moreira at Instituto Tecnológico de Aeronáutica (ITA), on Dec. 09, 2019, in São José dos Campos, Brazil.
Abstract:
Recommender systems have become increasingly popular in assisting users with their choices, thus enhancing their engagement and overall satisfaction with online services. Over the last decade, recommender systems have become a topic of growing interest among machine learning, human-computer interaction, and information retrieval researchers.
News recommender systems aim to personalize users' experiences and help them discover relevant articles from a large and dynamic search space, which makes news a challenging scenario for recommendation. Large publishers release hundreds of news articles daily, so they must deal with fast-growing numbers of items that quickly become outdated and irrelevant to most readers. News readers also exhibit more unstable consumption behavior than users in other domains such as entertainment, and external events, like breaking news, shift readers' interests. In addition, the news domain suffers from extreme sparsity, as most users are anonymous, with no past behavior tracked.
Since 2016, Deep Learning methods and techniques have been explored in Recommender Systems research. In general, they can be divided into methods for: Deep Collaborative Filtering, Learning Item Embeddings, Session-based Recommendations using Recurrent Neural Networks (RNN), and Feature Extraction from Items' Unstructured Data such as text, images, audio, and video.
The main contribution of this research is CHAMELEON, a meta-architecture designed to tackle the specific challenges of news recommendation. It consists of a modular reference architecture that can be instantiated using different neural building blocks.
Because information about users' past interactions is scarce in the news domain, the user context (e.g., time, location, device, and the sequence of clicks within the session) and static and dynamic article features (such as the article's textual content, popularity, and recency) are explicitly modeled in a hybrid session-based recommendation approach using RNNs.
The recommendation task addressed in this work is next-item prediction for user sessions, i.e., "what is the next most likely article a user might read in a session?". A temporal offline evaluation is used for a realistic offline assessment of this task, considering factors that affect global readership interests, like popularity, recency, and seasonality.
Experiments performed with two large datasets have shown the effectiveness of CHAMELEON for news recommendation on many quality factors, such as accuracy, item coverage, novelty, and a reduced item cold-start problem, when compared to other traditional and state-of-the-art session-based algorithms.
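The temporal offline evaluation described above (train on the past, predict the next item in future sessions) can be sketched in a few lines. This is an illustrative Python sketch, not code from the thesis; the record format, the popularity baseline, and the hit-rate metric are assumptions made for the example.

```python
from collections import Counter

def temporal_split(interactions, split_time):
    """Split (session_id, item_id, timestamp) records at a time boundary,
    mimicking a temporal offline evaluation: train on the past, test on the future."""
    train = [r for r in interactions if r[2] < split_time]
    test = [r for r in interactions if r[2] >= split_time]
    return train, test

def hit_rate_at_k(train, test_sessions, k=2):
    """Next-item prediction with a popularity baseline: recommend the k most
    popular training items and check whether each test item is among them."""
    popularity = Counter(item for _, item, _ in train)
    top_k = [item for item, _ in popularity.most_common(k)]
    hits = sum(1 for session in test_sessions for item in session if item in top_k)
    total = sum(len(s) for s in test_sessions)
    return hits / total if total else 0.0
```

A real evaluation would replace the popularity baseline with the trained model and compute ranking metrics, but the time-based split is the essential ingredient: it prevents the recommender from "seeing the future" during training.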
Context-aware Recommendation: A Quick View (Yong Zheng)
Context-aware recommender systems take into account contextual information beyond just the user and item, such as time, location, and companion. Two main approaches are contextual prefiltering, which splits items or users based on context, and contextual modeling, which directly integrates context into models like matrix factorization. CARSKit, an open-source Java library for building context-aware recommender systems, is also introduced.
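To make contextual prefiltering concrete, here is a minimal Python sketch (not code from the tutorial): ratings are first filtered down to the target context, and only then handed to a conventional recommender, represented here by a deliberately simple item-mean predictor.

```python
def prefilter(ratings, context):
    """Contextual prefiltering: keep only ratings observed in the target context,
    then hand the reduced data to any conventional recommender.
    `ratings` is a list of (user, item, context, rating) tuples."""
    return [(u, i, r) for (u, i, c, r) in ratings if c == context]

def predict_item_mean(filtered, item):
    """A deliberately simple 'recommender' on the prefiltered data:
    predict the mean rating of the item within the chosen context."""
    scores = [r for (_, _, r) in filtered if _ is not None and True for r in [r]] if False else [r for (_, i, r) in filtered if i == item]
    return sum(scores) / len(scores) if scores else None
```

Contextual modeling, by contrast, would keep all the data and make the context a first-class input of the model itself.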
Deep learning techniques are increasingly being used for recommender systems. Neural network models such as word2vec, doc2vec and prod2vec learn embedding representations of items from user interaction data that capture their relationships. These embeddings can then be used to make recommendations by finding similar items. Deep collaborative filtering models apply neural networks to matrix factorization techniques to learn joint representations of users and items from rating data.
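The "find similar items by embedding" step can be sketched briefly, assuming the item vectors have already been learned by a word2vec-style model over interaction sequences. The toy embeddings and the function name are illustrative, not from the source.

```python
import numpy as np

def most_similar(embeddings, item, k=2):
    """Given item embeddings (e.g. learned by a word2vec-style model over
    user interaction sequences), recommend the k nearest items by cosine similarity."""
    target = embeddings[item]
    scores = {}
    for other, vec in embeddings.items():
        if other == item:
            continue  # never recommend the query item itself
        cos = np.dot(target, vec) / (np.linalg.norm(target) * np.linalg.norm(vec))
        scores[other] = cos
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

In production this nearest-neighbor lookup is usually done with an approximate index rather than a full scan, but the scoring rule is the same.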
This document discusses recommendation systems and provides examples of different types of recommendation approaches. It introduces collaborative filtering and content-based filtering as the main recommendation techniques. For collaborative filtering, it provides an example of item-based collaborative filtering using the R programming language on a Last.fm music dataset. Content-based filtering recommends items based on their properties and features. Hybrid systems combine collaborative and content-based filtering to generate recommendations.
The document discusses recommender systems and describes several techniques used in collaborative filtering recommender systems including k-nearest neighbors (kNN), singular value decomposition (SVD), and similarity weights optimization (SWO). It provides examples of how these techniques work and compares kNN to SWO. The document aims to explain state-of-the-art recommender system methods.
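As a hedged illustration of the SVD idea mentioned above (this is not code from the document, which walks through kNN and SWO examples instead): a truncated SVD keeps only the k strongest latent factors of a filled-in rating matrix, and the low-rank reconstruction serves as the prediction.

```python
import numpy as np

def svd_predict(ratings, k=2):
    """Low-rank reconstruction of a dense rating matrix via truncated SVD.
    Missing entries are assumed to have been pre-filled (e.g. with row means);
    the rank-k reconstruction gives predicted scores for every user-item pair."""
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```

Note that practical systems rarely fill in the matrix and factorize it directly; they minimize error only over observed entries, but the low-rank intuition is the same.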
Deep Natural Language Processing for Search and Recommender Systems (Huiji Gao)
Tutorial for KDD 2019:
Search and recommender systems process rich natural-language text data, such as user queries and documents. Achieving high-quality search and recommendation results requires processing and understanding such information effectively and efficiently, which is where natural language processing (NLP) technologies are widely deployed. In recent years, the rapid development of deep learning models has proven successful at improving various NLP tasks, indicating their great potential for improving search and recommender systems.
In this tutorial, we summarize the current effort of deep learning for NLP in search and recommender systems. We first give an overview of search/recommender systems with NLP, then introduce the basic concepts of deep learning for NLP, covering state-of-the-art technologies in both language understanding and language generation. After that, we share our hands-on experience with LinkedIn applications. In the end, we highlight several important future trends.
Slides by Amaia Salvador at the UPC Computer Vision Reading Group.
Source document on GDocs with clickable links:
https://docs.google.com/presentation/d/1jDTyKTNfZBfMl8OHANZJaYxsXTqGCHMVeMeBe5o1EL0/edit?usp=sharing
Based on the original work:
Ren, Shaoqing, Kaiming He, Ross Girshick, and Jian Sun. "Faster R-CNN: Towards real-time object detection with region proposal networks." In Advances in Neural Information Processing Systems, pp. 91-99. 2015.
Recommender Systems represent one of the most widespread and impactful applications of predictive machine learning models.
Amazon, YouTube, Netflix, Facebook, and many other companies generate an important fraction of their revenues thanks to their ability to model and accurately predict users' ratings and preferences.
In this presentation we cover the following points:
→ introduction to recommender systems
→ working with explicit vs implicit feedback
→ content-based vs collaborative filtering approaches
→ user-based and item-item methods
→ machine learning and deep learning models
→ pros & cons of the methods: scalability, accuracy, explainability
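As a small illustration of the item-item method listed above (the presentation itself does not include this code), unseen items can be scored by cosine similarity to the items the user has already rated:

```python
import numpy as np

def item_item_scores(R, user):
    """Item-item collaborative filtering on a user-item matrix R (0 = unrated):
    score unseen items by similarity-weighted sums over the user's rated items.
    Assumes every item column has at least one nonzero rating."""
    norms = np.linalg.norm(R, axis=0)
    sim = (R.T @ R) / np.outer(norms, norms)  # cosine similarity between item columns
    np.fill_diagonal(sim, 0.0)                # an item is not its own neighbor
    rated = R[user] > 0
    scores = sim[:, rated] @ R[user, rated]   # similarity-weighted rating sum
    scores[rated] = -np.inf                   # do not re-recommend seen items
    return scores
```

User-based CF is the mirror image: similarities are computed between user rows instead of item columns. Item-item is often preferred in practice because item similarities are more stable and can be precomputed.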
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018 (Massimo Quadrana)
Slides of the Tutorial on Sequence Aware Recommenders held at ACM RecSys 2018 in Vancouver.
Link to the website: https://sites.google.com/view/seq-recsys-tutorial
Link to the hands-on: https://github.com/mquad/sars_tutorial
Wajdi Khattel presented a proposal for a terrorist detection model in social networks. The model uses a multi-dimensional network as input and consists of three sub-models: a text classification model, image classification model, and general information classification model. The sub-models each output a score that is then used by a decision making module to classify a user as a terrorist or not based on a threshold. The implementation involved collecting offline training data from banned Twitter accounts, Google images, and a public dataset. Online data was also collected from Facebook, Instagram, and Twitter using their APIs. Several machine learning models were tested for each sub-model and the proposed full model uses a neural network for text, CNN with data augmentation and
INTRODUCTION TO INFORMATION RETRIEVAL
This lecture will introduce the information retrieval (IR) problem and its terminology, and provide a history of IR. In particular, the history of the web and its impact on IR will be discussed. Special attention will be given to the concept of relevance in IR and the critical role it has played in the development of the subject. The lecture will end with a conceptual explanation of the IR process, its relationships with other domains, and current research developments.
INFORMATION RETRIEVAL MODELS
This lecture will present the models that have been used to rank documents according to their estimated relevance to user queries, where the most relevant documents are shown ahead of the less relevant ones. Many of these models form the basis of the ranking algorithms used in past and present search applications. The lecture will describe IR models such as Boolean retrieval, vector space, probabilistic retrieval, language models, and logical models. Relevance feedback, a technique that either implicitly or explicitly modifies user queries in light of their interaction with retrieval results, will also be discussed, as it is particularly relevant to web search and personalization.
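The vector space model mentioned above can be illustrated with a minimal TF-IDF ranker. This is a simplified sketch (raw term frequency, no length normalization), not code from the lecture:

```python
import math
from collections import Counter

def tfidf_rank(docs, query):
    """Vector space model in miniature: rank documents by a TF-IDF weighted
    overlap with the query terms (a simplified dot-product score)."""
    n = len(docs)
    # document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc.split()))

    def score(doc):
        tf = Counter(doc.split())
        return sum(tf[t] * math.log(n / df[t]) for t in query.split() if t in df)

    return sorted(range(n), key=lambda i: score(docs[i]), reverse=True)
```

Real systems add document-length normalization (as in BM25) and an inverted index so that only documents containing query terms are scored.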
Image classification with Deep Neural Networks (Yogendra Tamang)
This document discusses image classification using deep neural networks. It provides background on image classification and convolutional neural networks. The document outlines techniques like activation functions, pooling, dropout and data augmentation to prevent overfitting. It summarizes a paper on ImageNet classification using CNNs with multiple convolutional and fully connected layers. The paper achieved state-of-the-art results on ImageNet in 2010 and 2012 by training CNNs on a large dataset using multiple GPUs.
Challenges and Solutions in Group Recommender Systems (Ludovico Boratto)
The document discusses group recommender systems. It begins with an overview of recommender systems principles and introduces the concept of group recommendation. It then outlines several key tasks in group recommendation systems, including defining different types of groups, acquiring preferences, modeling groups, predicting ratings, helping groups reach consensus, and explaining recommendations to groups. The document provides examples of approaches used in existing systems for each of these tasks. It also surveys common techniques for modeling groups, such as additive utilitarian, multiplicative utilitarian, Borda count, and Copeland rule strategies.
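The additive and multiplicative utilitarian strategies mentioned above can be sketched as follows; this toy aggregator is illustrative, not taken from the survey:

```python
def aggregate_group(ratings, strategy="additive"):
    """Two classic group-aggregation strategies: additive utilitarian sums the
    members' ratings per item; multiplicative utilitarian multiplies them
    (so one very low rating can veto an item). `ratings` maps each member
    to an {item: rating} dict; missing ratings default to 0."""
    items = set().union(*(r.keys() for r in ratings.values()))
    scores = {}
    for item in items:
        vals = [r.get(item, 0) for r in ratings.values()]
        if strategy == "additive":
            scores[item] = sum(vals)
        else:  # multiplicative
            prod = 1
            for v in vals:
                prod *= v
            scores[item] = prod
    return scores
```

Rank-based strategies such as Borda count and the Copeland rule work on each member's item ranking rather than raw ratings, but follow the same aggregate-then-rank pattern.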
Skeleton-based Human Action Recognition with Recurrent Neural Network (Luong Vo)
This document presents a thesis on using recurrent neural networks for skeleton-based human action recognition. The proposed method uses two RNNs - a temporal RNN to model the temporal dynamics of joints over time, and a spatial RNN to model the dependencies between joints spatially. The RNNs are trained on skeleton data extracted from video datasets like NTU RGB+D and Kinetics. Experimental results show the method achieves state-of-the-art accuracy on the NTU datasets and can recognize actions in real-time from new video inputs. Future work involves exploring more advanced temporal modeling and evaluating on larger datasets.
Recsys 2014 Tutorial - The Recommender Problem Revisited (Xavier Amatriain)
This document summarizes Xavier Amatriain's presentation on recommender systems. It discusses traditional recommendation methods like collaborative filtering, content-based recommendations, and hybrid approaches. It also covers newer methods that go beyond traditional techniques, such as learning to rank, deep learning, social recommendations, and context-aware recommendations. Throughout the presentation, Amatriain discusses challenges like cold starts, popularity bias, and limitations of different recommendation approaches. He also shares lessons learned from the Netflix Prize competition, including how SVD and RBM models were used.
Past, present, and future of Recommender Systems: an industry perspective (Xavier Amatriain)
Keynote for the ACM Intelligent User Interfaces (IUI) conference in 2016 in Sonoma, CA. I start with the past by talking about the Recommender Problem and the Netflix Prize. Then I move into the present and the future by talking about approaches that go beyond rating prediction and ranking, and finish with some of the most important lessons learned over the years. Throughout the talk I put special emphasis on the relation between algorithms and the user interface.
With the explosive growth of online information, recommender systems have become an effective tool to overcome information overload and promote sales. In recent years, deep learning's revolutionary advances in speech recognition, image analysis, and natural language processing have gained significant attention. Meanwhile, recent studies also demonstrate its efficacy in coping with information retrieval and recommendation tasks. Applying deep learning techniques to recommender systems has been gaining momentum due to its state-of-the-art performance. In this talk, I will present recent developments in deep-learning-based recommender models and highlight some future challenges and open issues in this research field.
1. Deep learning techniques such as convolutional neural networks, recurrent neural networks, and autoencoders can be applied to recommender systems.
2. Convolutional neural networks are commonly used to extract features from images, audio, and video that can then be used for recommendation. Recurrent neural networks can model user sessions as sequences of clicks.
3. Autoencoders learn lower-dimensional representations of items that capture similarities and can be used to make recommendations, especially for cold start problems where little is known about new users or items.
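To make point 3 concrete, here is a deliberately tiny linear autoencoder on a binary interaction matrix. It is an illustrative sketch, not an architecture from the document (real systems use nonlinear, regularized autoencoders and train only on observed entries):

```python
import numpy as np

def autoencoder_scores(R, k=2, lr=0.02, epochs=2000, seed=0):
    """A minimal linear autoencoder over a binary user-item matrix R:
    encode each user into k latent dimensions, decode back to item scores,
    training both maps by gradient descent on the reconstruction error.
    Reconstructed scores for unseen items can then be ranked as recommendations."""
    rng = np.random.default_rng(seed)
    n_items = R.shape[1]
    W = rng.normal(scale=0.1, size=(n_items, k))  # encoder weights
    V = rng.normal(scale=0.1, size=(k, n_items))  # decoder weights
    for _ in range(epochs):
        H = R @ W                        # latent user representations
        R_hat = H @ V                    # reconstruction
        G = 2.0 * (R_hat - R) / R.size   # gradient of mean squared error
        dV = H.T @ G
        dW = R.T @ (G @ V.T)
        V -= lr * dV
        W -= lr * dW
    return R @ W @ V
```

The cold-start benefit comes from the bottleneck: a new user's few interactions are compressed into the same k-dimensional space learned from everyone else, so plausible scores emerge even from sparse input.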
Tutorial on Object Detection (Faster R-CNN) - Hwa Pyung Kim
The document describes Faster R-CNN, an object detection method that uses a Region Proposal Network (RPN) to generate region proposals from feature maps, pools features from each proposal into a fixed size using RoI pooling, and then classifies and regresses bounding boxes for each proposal using a convolutional network. The RPN outputs objectness scores and bounding box adjustments for anchor boxes sliding over the feature map, and non-maximum suppression is applied to reduce redundant proposals.
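The non-maximum suppression step mentioned at the end can be written compactly; this is a standard greedy NMS sketch in NumPy, not code from the tutorial:

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression, as applied to region proposals in
    Faster R-CNN: repeatedly keep the highest-scoring box and discard the
    remaining boxes that overlap it too much. Boxes are (x1, y1, x2, y2)."""
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(scores)[::-1]  # indices by descending score
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # intersection of `best` with each remaining box
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.maximum(0.0, x2 - x1) * np.maximum(0.0, y2 - y1)
        iou = inter / (areas[best] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]  # drop heavily overlapping boxes
    return keep
```

In the RPN, this pruning runs over thousands of anchor-derived proposals per image, which is why vectorized implementations matter.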
This document discusses recommender systems, including:
1. It provides an overview of recommender systems, their history, and common tasks like top-N recommendation and rating prediction.
2. It then discusses what makes a good recommender system, including experiment methods like offline, user surveys, and online experiments, as well as evaluation metrics like prediction accuracy, diversity, novelty, and user satisfaction.
3. Key metrics for evaluating recommender systems are discussed, such as user satisfaction, prediction accuracy, coverage, diversity, novelty, serendipity, trust, robustness, and response time. The document emphasizes selecting metrics based on business goals.
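A few of the listed metrics are easy to state as code. The sketches below (precision@k, catalog coverage, and a self-information notion of novelty) are common illustrative formulations, not definitions taken from the document:

```python
import math

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that the user actually liked."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k

def catalog_coverage(all_recommendations, catalog):
    """Share of the catalog that appears in at least one recommendation list."""
    shown = {item for recs in all_recommendations for item in recs}
    return len(shown) / len(catalog)

def novelty(recommended, popularity, total_interactions):
    """Mean self-information of recommended items: rarer items score higher."""
    return sum(-math.log2(popularity[i] / total_interactions)
               for i in recommended) / len(recommended)
```

The trade-offs discussed in the document show up directly here: pushing popular items raises accuracy-style metrics while depressing coverage and novelty.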
This document provides an overview of transformers in computer vision. It discusses how transformers were originally developed for natural language processing, using attention mechanisms instead of recurrent connections. Vision transformers apply this approach to images by treating patches as tokens and using self-attention. Early vision transformers achieved strong results on image classification tasks. Recent developments include Swin Transformers, which compute self-attention within shifted local windows for efficiency, and models that combine convolutional and transformer architectures. Transformers are also being applied to video understanding tasks. The document explores different transformer architectures and applications of vision transformers.
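The "patches as tokens" idea can be shown in a few lines of NumPy. This sketch only covers the tokenization step; the linear projection, positional embeddings, and self-attention that follow in a vision transformer are omitted:

```python
import numpy as np

def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches:
    the 'tokens' a vision transformer feeds into self-attention.
    Assumes H and W are divisible by patch_size."""
    h, w, c = image.shape
    p = patch_size
    patches = image.reshape(h // p, p, w // p, p, c)
    patches = patches.transpose(0, 2, 1, 3, 4)   # group by patch, not by row
    return patches.reshape(-1, p * p * c)        # one flat vector per patch
```

For a 224x224x3 image with 16x16 patches this yields 196 tokens of dimension 768, which is exactly the sequence length and width of the original ViT-Base configuration.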
Learning to Rank for Recommender Systems - ACM RecSys 2013 tutorial (Alexandros Karatzoglou)
The slides from the Learning to Rank for Recommender Systems tutorial given at ACM RecSys 2013 in Hong Kong by Alexandros Karatzoglou, Linas Baltrunas and Yue Shi.
This paper proposes a similarity-based approach for contextual modeling in context-aware recommender systems. It introduces three methods for representing context similarity (independent, latent, and multidimensional) and applies them to context-aware matrix factorization and sparse linear models. Experimental results on four datasets show that the multidimensional context similarity approach outperforms deviation-based contextual modeling and independent context modeling. The paper concludes that similarity-based contextual modeling provides a general way to incorporate contexts, and recommends exploring ways to reduce the cost of multidimensional modeling and applying other base recommender algorithms.
Context-aware recommender systems (CARS) help improve the effectiveness of recommendations by adapting to users' preferences in different contextual situations. One approach to CARS that has been shown to be particularly effective is Context-Aware Matrix Factorization (CAMF). CAMF incorporates contextual dependencies into the standard matrix factorization (MF) process, where users and items are represented as collections of weights over various latent factors. In this paper, we introduce another CARS approach based on an extension of matrix factorization, namely, the Sparse Linear Method (SLIM). We develop a family of deviation-based contextual SLIM (CSLIM) recommendation algorithms by learning rating deviations in different contextual conditions. Our CSLIM approach is better at explaining the underlying reasons behind contextual recommendations, and our experimental evaluations over five context-aware data sets demonstrate that these CSLIM algorithms outperform the state-of-the-art CARS algorithms in the top-N recommendation task. We also discuss the criteria for selecting the appropriate CSLIM algorithm in advance based on the underlying characteristics of the data.
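The deviation-based idea behind CAMF and CSLIM (a context-free score adjusted by a learned rating deviation for each active contextual condition) can be sketched in one function; the deviation values below are illustrative placeholders, not learned parameters:

```python
def predict_with_deviations(base_score, context, deviations):
    """Deviation-based contextual modeling in miniature: start from a
    context-free predicted score and add the rating deviation associated
    with each active contextual condition. In CAMF/CSLIM these deviations
    are learned from data; here they are supplied directly."""
    return base_score + sum(deviations.get(cond, 0.0) for cond in context)
```

This additive structure is what makes deviation-based models easy to explain: each contextual condition contributes a visible, signed adjustment to the final score.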
Slides by Amaia Salvador at the UPC Computer Vision Reading Group.
Source document on GDocs with clickable links:
https://docs.google.com/presentation/d/1jDTyKTNfZBfMl8OHANZJaYxsXTqGCHMVeMeBe5o1EL0/edit?usp=sharing
Based on the original work:
Ren, Shaoqing, Kaiming He, Ross Girshick, and Jian Sun. "Faster R-CNN: Towards real-time object detection with region proposal networks." In Advances in Neural Information Processing Systems, pp. 91-99. 2015.
Recommender Systems represent one of the most widespread and impactful applications of predictive machine learning models.
Amazon, YouTube, Netflix, Facebook and many other companies generate an important fraction of their revenues thanks to their ability to model and accurately predict users ratings and preferences.
In this presentation we cover the following points:
→ introduction to recommender systems
→ working with explicit vs implicit feedback
→ content-based vs collaborative filtering approaches
→ user-based and item-item methods
→ machine learning and deep learning models
→ pros & cons of the methods: scalability, accuracy, explainability
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018Massimo Quadrana
Slides of the Tutorial on Sequence Aware Recommenders held at ACM RecSys 2018 in Vancouver.
Link to the website: https://sites.google.com/view/seq-recsys-tutorial
Link to the hands-on: https://github.com/mquad/sars_tutorial
Wajdi Khattel presented a proposal for a terrorist detection model in social networks. The model uses a multi-dimensional network as input and consists of three sub-models: a text classification model, image classification model, and general information classification model. The sub-models each output a score that is then used by a decision making module to classify a user as a terrorist or not based on a threshold. The implementation involved collecting offline training data from banned Twitter accounts, Google images, and a public dataset. Online data was also collected from Facebook, Instagram, and Twitter using their APIs. Several machine learning models were tested for each sub-model and the proposed full model uses a neural network for text, CNN with data augmentation and
INTRODUCTION TO INFORMATION RETRIEVAL
This lecture will introduce the information retrieval problem, introduce the terminology related to IR, and provide a history of IR. In particular, the history of the web and its impact on IR will be discussed. Special attention and emphasis will be given to the concept of relevance in IR and the critical role it has played in the development of the subject. The lecture will end with a conceptual explanation of the IR process, and its relationships with other domains as well as current research developments.
INFORMATION RETRIEVAL MODELS
This lecture will present the models that have been used to rank documents according to their estimated relevance to user given queries, where the most relevant documents are shown ahead to those less relevant. Many of these models form the basis for many of the ranking algorithms used in many of past and today’s search applications. The lecture will describe models of IR such as Boolean retrieval, vector space, probabilistic retrieval, language models, and logical models. Relevance feedback, a technique that either implicitly or explicitly modifies user queries in light of their interaction with retrieval results, will also be discussed, as this is particularly relevant to web search and personalization.
Image classification with Deep Neural NetworksYogendra Tamang
This document discusses image classification using deep neural networks. It provides background on image classification and convolutional neural networks. The document outlines techniques like activation functions, pooling, dropout and data augmentation to prevent overfitting. It summarizes a paper on ImageNet classification using CNNs with multiple convolutional and fully connected layers. The paper achieved state-of-the-art results on ImageNet in 2010 and 2012 by training CNNs on a large dataset using multiple GPUs.
Challenges and Solutions in Group Recommender SystemsLudovico Boratto
The document discusses group recommender systems. It begins with an overview of recommender systems principles and introduces the concept of group recommendation. It then outlines several key tasks in group recommendation systems, including defining different types of groups, acquiring preferences, modeling groups, predicting ratings, helping groups reach consensus, and explaining recommendations to groups. The document provides examples of approaches used in existing systems for each of these tasks. It also surveys common techniques for modeling groups, such as additive utilitarian, multiplicative utilitarian, Borda count, and Copeland rule strategies.
Skeleton-based Human Action Recognition with Recurrent Neural Network - Luong Vo
This document presents a thesis on using recurrent neural networks for skeleton-based human action recognition. The proposed method uses two RNNs - a temporal RNN to model the temporal dynamics of joints over time, and a spatial RNN to model the dependencies between joints spatially. The RNNs are trained on skeleton data extracted from video datasets like NTU RGB+D and Kinetics. Experimental results show the method achieves state-of-the-art accuracy on the NTU datasets and can recognize actions in real-time from new video inputs. Future work involves exploring more advanced temporal modeling and evaluating on larger datasets.
Recsys 2014 Tutorial - The Recommender Problem Revisited - Xavier Amatriain
This document summarizes Xavier Amatriain's presentation on recommender systems. It discusses traditional recommendation methods like collaborative filtering, content-based recommendations, and hybrid approaches. It also covers newer methods that go beyond traditional techniques, such as learning to rank, deep learning, social recommendations, and context-aware recommendations. Throughout the presentation, Amatriain discusses challenges like cold starts, popularity bias, and limitations of different recommendation approaches. He also shares lessons learned from the Netflix Prize competition, including how SVD and RBM models were used.
Past, present, and future of Recommender Systems: an industry perspective - Xavier Amatriain
Keynote for the ACM Intelligent User Interface conference in 2016 in Sonoma, CA. I start with the past by talking about the Recommender Problem, and the Netflix Prize. Then I go into the Present and the Future by talking about approaches that go beyond rating prediction and ranking and by finishing with some of the most important lessons learned over the years. Throughout my talk I put special emphasis on the relation between algorithms and the User Interface.
With the explosive growth of online information, recommender systems have become an effective tool to overcome information overload and promote sales. In recent years, deep learning's revolutionary advances in speech recognition, image analysis and natural language processing have gained significant attention. Meanwhile, recent studies also demonstrate its efficacy in coping with information retrieval and recommendation tasks. Applying deep learning techniques to recommender systems has been gaining momentum due to its state-of-the-art performance. In this talk, I will present recent developments in deep learning based recommender models and highlight some future challenges and open issues in this research field.
1. Deep learning techniques such as convolutional neural networks, recurrent neural networks, and autoencoders can be applied to recommender systems.
2. Convolutional neural networks are commonly used to extract features from images, audio, and video that can then be used for recommendation. Recurrent neural networks can model user sessions as sequences of clicks.
3. Autoencoders learn lower-dimensional representations of items that capture similarities and can be used to make recommendations, especially for cold start problems where little is known about new users or items.
Tutorial on Object Detection (Faster R-CNN) - Hwa Pyung Kim
The document describes Faster R-CNN, an object detection method that uses a Region Proposal Network (RPN) to generate region proposals from feature maps, pools features from each proposal into a fixed size using RoI pooling, and then classifies and regresses bounding boxes for each proposal using a convolutional network. The RPN outputs objectness scores and bounding box adjustments for anchor boxes sliding over the feature map, and non-maximum suppression is applied to reduce redundant proposals.
This document discusses recommender systems, including:
1. It provides an overview of recommender systems, their history, and common problems like top-N recommendation and rating prediction.
2. It then discusses what makes a good recommender system, including experimental methods such as offline experiments, user surveys, and online experiments, as well as evaluation metrics like prediction accuracy, diversity, novelty, and user satisfaction.
3. Key metrics that are important to evaluate recommender systems are discussed, such as user satisfaction, prediction accuracy, coverage, diversity, novelty, serendipity, trust, robustness, and response time. The document emphasizes selecting metrics based on business goals.
This document provides an overview of transformers in computer vision. It discusses how transformers were originally developed for natural language processing using attention mechanisms instead of recurrent connections. Vision transformers apply this approach to images by treating patches as tokens and using self-attention. Early vision transformers achieved strong results on image classification tasks. Recent developments include Swin Transformers, which compute self-attention within shifted local windows, and models that combine convolutional and transformer architectures. Transformers are also being applied to video understanding tasks. The document explores different transformer architectures and applications of vision transformers.
Learning to Rank for Recommender Systems - ACM RecSys 2013 tutorial - Alexandros Karatzoglou
The slides from the Learning to Rank for Recommender Systems tutorial given at ACM RecSys 2013 in Hong Kong by Alexandros Karatzoglou, Linas Baltrunas and Yue Shi.
This paper proposes a similarity-based approach for contextual modeling in context-aware recommender systems. It introduces three methods for representing context similarity - independent, latent, and multidimensional - and applies them to context-aware matrix factorization and sparse linear models. Experimental results on four datasets show the multidimensional context similarity approach outperforms deviation-based contextual modeling and independent context modeling. The paper concludes similarity-based contextual modeling provides a general way to incorporate contexts and recommends exploring solutions to reduce costs in multidimensional modeling and applying other base recommender algorithms.
Context-aware recommender systems (CARS) help improve the effectiveness of recommendations by adapting to users' preferences in different contextual situations. One approach to CARS that has been shown to be particularly effective is Context-Aware Matrix Factorization (CAMF). CAMF incorporates contextual dependencies into the standard matrix factorization (MF) process, where users and items are represented as collections of weights over various latent factors. In this paper, we introduce another CARS approach based on an extension of matrix factorization, namely, the Sparse Linear Method (SLIM). We develop a family of deviation-based contextual SLIM (CSLIM) recommendation algorithms by learning rating deviations in different contextual conditions. Our CSLIM approach is better at explaining the underlying reasons behind contextual recommendations, and our experimental evaluations over five context-aware data sets demonstrate that these CSLIM algorithms outperform the state-of-the-art CARS algorithms in the top-N recommendation task. We also discuss the criteria for selecting the appropriate CSLIM algorithm in advance based on the underlying characteristics of the data.
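The deviation-based idea behind CSLIM and CAMF can be illustrated independently of either algorithm: a context-aware prediction adds a learned rating deviation for each active contextual condition to a context-free estimate. In the sketch below, the base prediction and the deviation values are made up for illustration, not learned from data as in the paper.

```python
# Context-free estimate for a (user, item) pair, e.g. from matrix
# factorization. The value is illustrative.
base_prediction = {("u1", "movie1"): 3.8}

# Contextual rating deviations: how much each condition shifts ratings
# on average. In CSLIM/CAMF these are learned; here they are made up.
deviation = {
    ("time", "weekend"): +0.4,
    ("time", "weekday"): -0.2,
    ("companion", "family"): +0.3,
    ("companion", "alone"): -0.1,
}

def predict_in_context(user, item, context):
    """Deviation-based contextual prediction: base rating plus the
    deviation of every active contextual condition."""
    r = base_prediction[(user, item)]
    return r + sum(deviation.get(cond, 0.0) for cond in context)

r_ctx = predict_in_context("u1", "movie1",
                           [("time", "weekend"), ("companion", "family")])
```

With an empty context the function simply returns the context-free estimate, which makes the contextual shift easy to inspect and explain.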
[RIIT 2017] Identifying Grey Sheep Users By The Distribution of User Similarities In Collaborative Filtering - YONG ZHENG
Yong Zheng, Mayur Agnani, Mili Singh. “Identifying Grey Sheep Users By The Distribution of User Similarities In Collaborative Filtering”. Proceedings of The 6th ACM Conference on Research in Information Technology (RIIT), Rochester, NY, USA, October, 2017
[RecSys 2014] Deviation-Based and Similarity-Based Contextual SLIM Recommendation Algorithms - YONG ZHENG
Yong Zheng. "Deviation-Based and Similarity-Based Contextual SLIM Recommendation Algorithms". ACM RecSys Doctoral Symposium, Proceedings of the 8th ACM Conference on Recommender Systems (ACM RecSys 2014), pp. 437-440, Silicon Valley, CA, USA, Oct 2014 [Doctoral Symposium, Acceptance rate: 47%]
[UMAP 2015] Integrating Context Similarity with Sparse Linear Recommendation Models - YONG ZHENG
This document summarizes a research paper on integrating context similarity with sparse linear recommendation models. It discusses contextual modeling approaches, including independent contextual modeling using tensor factorization and dependent contextual modeling using deviation-based and similarity-based approaches. It presents the sparse linear method (SLIM) and a contextual extension (CSLIM) that incorporates context similarity. Four methods for modeling context similarity - independent, latent, weighted Jaccard, and multidimensional - are described. Experimental evaluations on limited context-aware datasets are conducted to compare baseline algorithms like tensor factorization to the new similarity-based CSLIM approaches.
[SAC 2015] Improve General Contextual SLIM Recommendation Algorithms By Facto... - YONG ZHENG
This document summarizes a research paper that improves on a previous context-aware recommender system algorithm called GCSLIM by factorizing contexts to address its sparsity problem. The paper introduces GCSLIM and its drawback of measuring context deviations in pairs, which can result in unknown deviations when new context combinations are encountered. To solve this, the paper represents each context as a vector and calculates deviations as the Euclidean distance between vectors. Experimental results on a restaurant dataset show improved precision and MAP over baselines. The conclusions discuss how factorizing contexts can alleviate but not fully solve sparsity, and future work to address cold start issues.
This paper proposes a method called user-oriented context suggestion that suggests contexts to users based on their preferences. It aims to maximize user experience by recommending not just good items, but appropriate contexts for those items. Two algorithms are developed: one based on contextual rating deviations that identifies how a user's ratings change across contexts, and another that adapts techniques from item-oriented context suggestion. An evaluation on a music dataset finds the tensor factorization approach performs best, with the contextual rating deviations method also outperforming a simple baseline. Future work includes collecting better evaluation data and trying other contextual recommendation algorithms.
[ADMA 2017] Identification of Grey Sheep Users By Histogram Intersection In R... - YONG ZHENG
The document proposes a new approach to identify "grey sheep users" in recommender systems. Grey sheep users have unusual tastes and low correlations with other users. The approach represents each user as a histogram of their similarities to other users. It then uses outlier detection on the histograms to identify grey sheep users as the outliers with low similarities. The approach is tested on movie rating data and is shown to better identify grey sheep users compared to other methods. Future work involves applying this approach to other datasets and improving recommendations for identified grey sheep users.
[SAC2014] Splitting Approaches for Context-Aware Recommendation: An Empirical ... - YONG ZHENG
This document describes an empirical study that compares different context-aware recommendation approaches. It evaluates three context-aware splitting approaches (item splitting, user splitting, and UI splitting) on several datasets using different recommendation algorithms and impurity criteria for splitting. The results show that UI splitting generally performs the best when used with matrix factorization as the recommendation algorithm. The document also compares the splitting approaches to other context-aware recommendation methods like differential context modeling and context-aware matrix factorization. The goal is to better understand how different context-aware techniques compare and which may be most appropriate depending on the data and application.
[IUI 2017] Criteria Chains: A Novel Multi-Criteria Recommendation Approach - YONG ZHENG
This paper proposes a novel approach called Criteria Chains for multi-criteria recommender systems. Criteria Chains predicts ratings across multiple criteria in a chain, using previous predictions as context. It outperforms baselines by better utilizing relationships between criteria. The best method is to rank criteria by information gain to generate the chain, then use predicted criteria as context (CCC approach) to estimate the overall rating. Future work includes optimizing chain generation beyond information gain.
[Decisions2013@RecSys] The Role of Emotions in Context-aware Recommendation - YONG ZHENG
The document discusses the role of emotions in context-aware recommender systems (CARS). It explores two classes of CARS algorithms: context-aware splitting approaches and differential context modeling. For context-aware splitting approaches, it examines which emotional contexts are most frequently used to split items or users. For differential context modeling, it analyzes which emotional dimensions are selected or weighted most highly for different algorithm components. The experimental results found that end emotion and dominant emotion were the most influential emotional dimensions across approaches. User splitting also generally outperformed item splitting.
[IUI2015] A Revisit to The Identification of Contexts in Recommender Systems - YONG ZHENG
This document proposes a framework for identifying contexts in context-aware recommender systems (CARS). It defines contexts as any information that characterizes a user's situation. The framework models activities as having subjects (users), objects (items or other users), and actions (interactions). It provides three rules for context identification: 1) attributes of actions are contexts, 2) some dynamic attributes in user profiles can be contexts, and 3) some attributes of user objects can be contexts in social networks. The framework aims to clarify what should be considered contexts versus item content to improve CARS development and analysis.
[EMPIRE 2016] Adapt to Emotional Reactions In Context-aware Personalization - YONG ZHENG
This document discusses using emotions as context in recommender systems. It proposes two models that utilize emotional reactions data from movie ratings to improve context-aware recommender system algorithms. The models apply emotional regularization techniques to matrix factorization. One model regularizes based on similar emotional users, while another also considers original user similarities. Tests on a movie rating dataset show improvements over baselines, with emotional state during consumption more effective than after. Future work could explore emotional transitions over time.
Matrix Factorization In Recommender Systems - YONG ZHENG
The document discusses matrix factorization techniques for recommender systems. It begins with an overview of recommender systems and their use of matrix factorization for dimensionality reduction. Principal component analysis and singular value decomposition are described as early linear algebra techniques used for this purpose. The document then focuses on how these techniques evolved into basic and extended matrix factorization methods in recommender systems, using the Netflix Prize competition as an example.
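A minimal sketch of the SVD-based dimensionality reduction described above: factor a small, dense, toy rating matrix and use its rank-2 reconstruction as the predicted ratings. The matrix is invented; real systems, including the Netflix Prize models the document mentions, instead learn factors from observed entries only, with regularization and gradient descent.

```python
import numpy as np

# Toy user-item rating matrix (4 users x 4 items), illustrative only.
R = np.array([[5.0, 3.0, 4.0, 1.0],
              [4.0, 3.0, 4.0, 1.0],
              [1.0, 1.0, 2.0, 5.0],
              [1.0, 2.0, 1.0, 4.0]])

# Full singular value decomposition: R = U * diag(s) * Vt.
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Keep only the top-k singular values: a rank-k approximation whose
# entries serve as predicted ratings.
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# How closely the low-rank model reproduces the original ratings.
rmse = float(np.sqrt(((R - R_hat) ** 2).mean()))
```

Because the toy matrix has two clear taste groups (rows 1-2 vs. rows 3-4), the rank-2 reconstruction is already close to the original, which is the intuition behind latent-factor models.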
Recommendation systems provide users with information they may be interested in based on their preferences and interests. They help address the problem of information overload by retrieving desired information for the user based on their preferences or those of similar users. The two main types of recommendation systems are personalized and non-personalized systems. Common techniques used include collaborative filtering, which finds users with similar tastes, and content-based filtering, which recommends items similar to those a user has liked based on item attributes.
Recommender systems are useful for online businesses such as Amazon, or Netflix. This set of slides provides a brief overview on recommender systems and their challenges.
An introduction to system-oriented evaluation in Information Retrieval - Mounia Lalmas-Roelleke
Slides for my lecture on IR evaluation, presented at 11th European Summer School in Information Retrieval (ESSIR 2017) at Universitat Pompeu Fabra, Barcelona.
These slides were based on
1. Evaluation lecture @ QMUL; Thomas Roelleke & Mounia Lalmas
3. Lecture 8: Evaluation @ Stanford University; Pandu Nayak & Prabhakar Raghavan
4. Retrieval Evaluation @ University of Virginia; Hongning Wang
5. Lectures 11 and 12 on Evaluation @ Berkeley; Ray Larson
6. Evaluation of Information Retrieval Systems @ Penn State University; Lee Giles
Textbooks:
1. Information Retrieval, 2nd edition, C.J. van Rijsbergen (1979)
2. Introduction to Information Retrieval, C.D. Manning, P. Raghavan & H. Schuetze (2008)
3. Modern Information Retrieval: The Concepts and Technology behind Search, 2nd ed; R. Baeza-Yates & B. Ribeiro-Neto (2011)
Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit - SC CTSI at USC and CHLA
This 50-minute presentation introduces r/SampleSize, a community on the website Reddit that allows for online participant recruitment without compulsory or immediate payment. It will provide an overview of best practices for recruiting participants on r/SampleSize. It will also compare r/SampleSize to Amazon Mechanical Turk (MTurk), a widely used crowdsourcing platform for recruiting research participants.
Master Thesis: The Design of a Rich Internet Application for Exploratory Sear... - Roman Atachiants
Users who cannot formulate a precise query but know there must be a good answer somewhere often rely on exploratory search. This requires an interactive and responsive system, or else the user will soon give up. As databases are becoming larger, more specialized, and more distributed, this calls for a Rich Internet Application fast enough to keep pace with the user's explorations. This thesis studies and implements a system, called MultiMap, which computes similarity maps in real time. This entailed: (1) precomputing every data structure that does not change after the initial query, (2) optimizing algorithms for zooming and map generation, and (3) providing a cognitively appropriate visualization of high-dimensional space. Applied to a very large movie database, it resulted in a highly responsive, satisfying, usable system.
Improving Semantic Search Using Query Log Analysis - Stuart Wrigley
Despite the attention Semantic Search is continuously gaining, several challenges affecting tool performance and user experience remain unsolved. Among these are: matching user terms with the search space, adopting view-based interfaces in the Open Web, and supporting users while building their queries. This paper proposes an approach to move a step forward towards tackling these challenges by creating models of usage of Linked Data concepts and properties, extracted from semantic query logs as a source of collaborative knowledge. We use two sets of query logs from the USEWOD workshops to create our models and show the potential of using them in the mentioned areas.
Statistical Analysis of Results in Music Information Retrieval: Why and How - Julián Urbano
This document summarizes an introduction given by Julián Urbano and Arthur Flexer on statistical analysis in music information retrieval. It discusses why evaluation is important, including addressing questions about how good a system is and comparing different systems. It describes how the Cranfield paradigm addresses these questions by using fixed test collections with documents, topics and relevance judgments to simulate users and allow reproducible experiments. Finally, it outlines different types of music information retrieval tasks like retrieval, annotation and extraction and how the evaluation approach may differ depending on the specific task or use case.
The first part of a workshop on user experience surveys. Topics: (1) how to improve the questions in surveys and (2) how to assess UX using a survey.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
This document discusses evaluation methods for information retrieval systems. It begins by outlining different types of evaluation, including retrieval effectiveness, efficiency, and user-based evaluation. It then focuses on retrieval effectiveness, describing commonly used measures like precision, recall, and discounted cumulative gain. It discusses how these measures are calculated and their limitations. The document also introduces other evaluation metrics like R-precision, average precision, and normalized discounted cumulative gain that provide single value assessments of system performance.
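The effectiveness measures named above (precision, recall, average precision, DCG and nDCG) can each be computed for a single ranked result list in a few lines. The ranking and binary relevance judgments below are a made-up example, not taken from the document.

```python
import math

relevant = {"d1", "d3", "d5", "d7"}            # ground-truth relevant set
ranking = ["d1", "d2", "d3", "d4", "d5"]       # top-5 retrieved, in order

hits = [1 if d in relevant else 0 for d in ranking]

# Set-based measures over the top-5 cutoff.
precision_at_5 = sum(hits) / len(ranking)
recall_at_5 = sum(hits) / len(relevant)

# Average precision: mean precision at each rank holding a relevant
# document, divided by the total number of relevant documents.
precisions = [sum(hits[: i + 1]) / (i + 1) for i, h in enumerate(hits) if h]
average_precision = sum(precisions) / len(relevant)

# DCG with binary gains and a log2 rank discount; nDCG normalizes by
# the DCG of the ideal (relevant-first) ordering.
dcg = sum(h / math.log2(i + 2) for i, h in enumerate(hits))
ideal = sorted(hits, reverse=True)
idcg = sum(h / math.log2(i + 2) for i, h in enumerate(ideal))
ndcg = dcg / idcg
```

Note the limitation the document mentions: with 4 relevant documents but only 3 retrieved in the top 5, average precision and recall are capped below 1 no matter how the retrieved documents are ordered.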
Search & Recommendation: Birds of a Feather? - Toine Bogers
In just a little over half a century, the field of information retrieval has experienced spectacular growth and success, with IR applications such as search engines becoming a billion-dollar industry in the past decades. Recommender systems have seen an even more meteoric rise to success with wide-scale application by companies like Amazon, Facebook, and Netflix. But are search and recommendation really two different fields of research that address different problems with different sets of algorithms in papers published at distinct conferences?
In my talk, I want to argue that search and recommendation are more similar than they have been treated in the past decade. By looking more closely at the tasks and problems that search and recommendation try to solve, at the algorithms used to solve these problems and at the way their performance is evaluated, I want to show that there is no clear black and white division between the two. Instead, search and recommendation are part of a much more fluid continuum of methods and techniques for information access.
(Keynote at "Mind The Gap '14" workshop at the iConference 2014 in Berlin, Germany)
June 18, 2014
NISO Virtual Conference: Transforming Assessment: Alternative Metrics and Other Trends
NISO Altmetrics Initiative: A Project Update
- Martin Fenner, Technical Lead for the PLOS Article-Level Metrics project
Magnetic - Query Categorization at Scale - Alex Dorman
presented 09/23/14 at NYC Search, Discovery & Analytics meetup
Classification of short text into a predefined hierarchy of categories is a challenge. The need to categorize short texts arises in multiple domains: keywords and queries in online advertising, improvement of search engine results, analysis of tweets or messages in social networks, etc. We leverage community-moderated, freely-available data sets (Wikipedia, DBPedia, Freebase) and open-source tools (Hadoop, Solr) to build a flexible and extensible classification model.
Magnetic is an online advertising company specializing in search retargeting and applying data science to online search behavior. We create custom real-time audience segments based on what users have searched for across the web. Targeting individual keywords found in user search history is a great way to build an audience, but manually selecting keywords can present an operational challenge. The ability to classify queries and keywords helps to create larger audiences with less effort and better accuracy. Other use cases for keyword classification in online advertising include reporting on the size of inventory available by category and campaign performance optimization.
We will share our experiences building a real-world data science system that scales to production data volumes of more than 20 million keyword classifications per hour, and will touch on some aspects of knowledge discovery such as language detection, n-gram extraction, and entity recognition.
About the speaker: Alex Dorman, CTO at Magnetic.
Alex has used Hadoop technologies since 2007. Before joining Magnetic, Alex built big data platforms and teams at Proclivity Media and ContextWeb/PulsePoint.
Surveyance or Surveillance? Data Ethics in Library Technologies - Shea Swauger
Libraries are increasingly investing in systems that can track and correlate user behavior. This proposal seeks to interrogate the methods and ethical implications of these technologies, questions how well positioned libraries are to advocate changes to those technologies, and seeks to establish guidelines for how to operationalize interrogation of technology.
Joshua White presents the findings of his PhD research applying social network analysis to online datasets. He developed tools to collect Twitter data and detect botnet command networks, phishing websites, and malware infection vectors. His work also identified influential actors during events and classified users based on their social roles. Future areas of research include building an ontology for semantic social network analysis.
The document introduces recommender systems and discusses their application to technology enhanced learning (TEL). It defines recommender systems as tools that use opinions from a community of users to help individuals identify interesting content from many options. The document outlines common tasks of recommender systems like providing annotations, finding good items, or recommending item sequences. It also discusses modeling techniques, evaluation challenges, and open research issues for applying recommender systems in TEL.
A Brief (and Practical) Introduction to Information Architecture - Louis Rosenfeld
Keynote presentation by Louis Rosenfeld at the Usability and Accessibility for the Web International Seminar; 26 July 2007, Monterrey, Nuevo Leon, Mexico
Discovering Common Motifs in Cursor Movement Data - Yandex
The document discusses research on discovering common motifs in mouse cursor movement data. It summarizes prior work on modeling post-click user behavior on search result pages. The researchers aim to automatically discover meaningful patterns (motifs) in cursor movement data without pre-defining complex features. They describe a pipeline to generate motif candidates, find frequent candidates, de-duplicate motifs, and apply various optimizations. Experimental results show motifs can improve relevance prediction and search result ranking. Motifs are also useful for characterizing attention patterns and predicting cognitive impairment.
Similar to Tutorial: Context-awareness In Information Retrieval and Recommender Systems (20)
[WI 2014] Context Recommendation Using Multi-label Classification - YONG ZHENG
This document proposes a new type of recommender system called a context recommender that recommends appropriate contexts (e.g. time, location, companion) for users to consume items. It discusses how context recommenders are different than traditional and context-aware recommenders. It also presents the framework for context recommenders including algorithms using multi-label classification to directly predict contexts. The document reports on experiments comparing these algorithms on several datasets and finds that personalized algorithms outperform non-personalized ones and that certain multi-label classification algorithms like label powerset using support vector machines achieve the best performance.
[UMAP2013] Recommendation with Differential Context Weighting - YONG ZHENG
Context-aware recommender systems (CARS) adapt their recommendations to users’ specific situations. In many recommender systems, particularly those based on collaborative filtering, the contextual constraints may lead to sparsity: fewer matches between the current user context and previous situations. Our earlier work proposed an approach called differential context relaxation (DCR), in which different subsets of contextual features were applied in different components of a recommendation algorithm. In this paper, we expand on our previous work on DCR, proposing a more general approach — differential context weighting (DCW), in which contextual features are weighted. We compare DCR and DCW on two real-world datasets, and DCW demonstrates improved accuracy over DCR with comparable coverage. We also show that particle swarm optimization (PSO) can be used to efficiently determine the weights for DCW.
[SOCRS2013] Differential Context Modeling in Collaborative Filtering - YONG ZHENG
This document discusses differential context modeling (DCM) in collaborative filtering recommender systems. DCM is a framework that separates recommender algorithms into components and applies differential context constraints to each component to maximize contextual effects. The document applies DCM using differential context relaxation and weighting to item-based collaborative filtering and Slope One recommender algorithms. Experimental results on movie and food rating datasets show that differential context weighting improves predictive performance over baselines and differential context relaxation. Future work involves expanding DCM to additional recommender algorithms and optimizing performance.
This document provides an overview of slope one recommender algorithms and their implementation in distributed systems using Hadoop and Mahout. It discusses slope one and weighted slope one recommenders, how they are implemented in Mahout, and how Mahout runs them in a distributed manner on Hadoop using mappers and reducers. It then describes experiments run on MovieLens data using this distributed slope one implementation and analyzes the results.
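The weighted Slope One scheme described above is simple enough to sketch directly: precompute the average pairwise rating deviation and the co-rating frequency for every item pair, then predict with a frequency-weighted, deviation-adjusted average. The ratings below are a toy example; none of the Hadoop/Mahout distribution machinery is shown.

```python
# user -> {item: rating}; illustrative data only.
ratings = {
    "alice": {"a": 5.0, "b": 3.0, "c": 2.0},
    "bob":   {"a": 3.0, "b": 4.0},
    "carol": {"b": 2.0, "c": 5.0},
}

# dev[x][y]: average of r(x) - r(y) over co-raters;
# freq[x][y]: number of users who rated both x and y.
dev, freq = {}, {}
for prefs in ratings.values():
    for x in prefs:
        for y in prefs:
            if x == y:
                continue
            dev.setdefault(x, {}).setdefault(y, 0.0)
            freq.setdefault(x, {}).setdefault(y, 0)
            dev[x][y] += prefs[x] - prefs[y]
            freq[x][y] += 1
for x in dev:
    for y in dev[x]:
        dev[x][y] /= freq[x][y]

def predict(user, item):
    """Weighted Slope One: average (dev + known rating), weighted by
    how many users co-rated each item pair."""
    prefs = ratings[user]
    num = den = 0.0
    for j, r in prefs.items():
        if j != item and item in dev and j in dev[item]:
            num += (dev[item][j] + r) * freq[item][j]
            den += freq[item][j]
    return num / den if den else None

p = predict("bob", "c")   # bob rated a and b; predict his rating of c
```

The deviation and frequency tables are exactly the per-item-pair aggregates that map well onto the mapper/reducer decomposition the document describes.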
The document provides an outline for a manual on writing a Ph.D. dissertation. It discusses introducing the dissertation, how to write and organize it, dissertation style, and good habits for writing a dissertation. Key sections include outlining the dissertation process and milestones, differences between papers/theses, common dissertation skeleton structures, principles for organizing sections, and tips for writing early and getting feedback.
A topic trend can be inferred from the usage of tags -- we name this attention. Time-series analysis for tagging prediction can indicate the evolution of attention flow. This slide deck takes political analysis as an example, applying time-series techniques to discover interesting patterns.
[CARS2012@RecSys] Optimal Feature Selection for Context-Aware Recommendation u... - YONG ZHENG
This document summarizes a research paper on optimal feature selection for context-aware recommendation systems using differential relaxation. The paper proposes a differential context relaxation (DCR) model that applies different context relaxations to different components of a recommendation algorithm to maximize their contributions. It uses binary particle swarm optimization to efficiently find optimal context relaxations and outperforms exhaustive search. Experimental results on a food preference dataset show the effects of different contexts and context-linked features. The paper discusses limitations and opportunities for future work to address sparsity issues.
[ECWEB2012] Differential Context Relaxation for Context-Aware Travel Recommendation - YONG ZHENG
Context-aware recommendation (CARS) has been shown to be an effective approach to recommendation in a number of domains. However, the problem of identifying appropriate contextual variables remains: using too many contextual variables risks a drastic increase in dimensionality and a loss of accuracy in recommendation. In this paper, we propose a novel treatment of context – identifying influential contexts for different algorithm components instead of for the whole algorithm. Based on this idea, we take traditional user-based collaborative filtering (CF) as an example, decompose it into three context-sensitive components, and propose a hybrid contextual approach. We then identify appropriate relaxations of contextual constraints for each algorithm component. The effectiveness of context relaxation is demonstrated by comparison of three algorithms using a travel data set: a context-ignorant approach, contextual pre-filtering, and our hybrid contextual algorithm. The experiments show that choosing an appropriate relaxation of the contextual constraints for each component of an algorithm outperforms strict application of the context.
[HetRec2011@RecSys]Experience Discovery: Hybrid Recommendation of Student Act...YONG ZHENG
The aim of the Experience Discovery project is to recommend extracurricular activities to high school and middle school students in urban areas. In implementing this system, we have been able to make use of both usage data and data drawn from a social networking site. Using pilot data, we are able to show that very simple aggregation techniques applied to the social network can improve recommendation accuracy.
The Ipsos - AI - Monitor 2024 Report.pdfSocial Samosa
According to Ipsos AI Monitor's 2024 report, 65% Indians said that products and services using AI have profoundly changed their daily life in the past 3-5 years.
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Kaxil Naik
Navigating today's data landscape isn't just about managing workflows; it's about strategically propelling your business forward. Apache Airflow has stood out as the benchmark in this arena, driving data orchestration forward since its early days. As we dive into the complexities of our current data-rich environment, where the sheer volume of information and its timely, accurate processing are crucial for AI and ML applications, the role of Airflow has never been more critical.
In my journey as the Senior Engineering Director and a pivotal member of Apache Airflow's Project Management Committee (PMC), I've witnessed Airflow transform data handling, making agility and insight the norm in an ever-evolving digital space. At Astronomer, our collaboration with leading AI & ML teams worldwide has not only tested but also proven Airflow's mettle in delivering data reliably and efficiently—data that now powers not just insights but core business functions.
This session is a deep dive into the essence of Airflow's success. We'll trace its evolution from a budding project to the backbone of data orchestration it is today, constantly adapting to meet the next wave of data challenges, including those brought on by Generative AI. It's this forward-thinking adaptability that keeps Airflow at the forefront of innovation, ready for whatever comes next.
The ever-growing demands of AI and ML applications have ushered in an era where sophisticated data management isn't a luxury—it's a necessity. Airflow's innate flexibility and scalability are what makes it indispensable in managing the intricate workflows of today, especially those involving Large Language Models (LLMs).
This talk isn't just a rundown of Airflow's features; it's about harnessing these capabilities to turn your data workflows into a strategic asset. Together, we'll explore how Airflow remains at the cutting edge of data orchestration, ensuring your organization is not just keeping pace but setting the pace in a data-driven future.
Session in https://budapestdata.hu/2024/04/kaxil-naik-astronomer-io/ | https://dataml24.sessionize.com/session/667627
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
Open Source Contributions to Postgres: The Basics POSETTE 2024ElizabethGarrettChri
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeWalaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Tutorial: Context-awareness In Information Retrieval and Recommender Systems
1. Tutorial: Context-Awareness In Information
Retrieval and Recommender Systems
Yong Zheng
School of Applied Technology
Illinois Institute of Technology, Chicago
Time: 2:00 PM – 5:00 PM, Oct 13, 2016
Location: Omaha Hilton, Omaha, NE, USA
The 16th IEEE/WIC/ACM Conference on Web Intelligence, Omaha, USA
2. Introduction
Yong Zheng
School of Applied Technology
Illinois Institute of Technology, Chicago, USA
2016, PhD in CIS, DePaul University, USA
Dissertation: Context-awareness In Recommender Systems
Research: Data Science for Web Intelligence
Short Tutorial (2 hrs, non-technical):
Schedule: 2:00 PM – 5:00 PM, Oct 13, 2016
Break: 3:00 PM – 3:30 PM (Outside Merchants)
Location: Herndon, Omaha Hilton, Omaha, USA
2
3. Topics in this Tutorial
• Information Overload
• Solution: Information Retrieval (IR)
e.g., Google Search Engine
• Solution: Recommender Systems (RecSys)
e.g., Movie recommender by Netflix
• Context and Context-awareness
e.g., Mobile computing and smart home devices
• Context-awareness in IR and RecSys
• Extended Topics: Trends, Challenges and Future
3
12. Information Overload
12
• Information overload refers to the difficulty a
person has in understanding an issue and making
decisions, caused by the presence of too much
information.
• The term was popularized by Alvin Toffler in his
bestselling 1970 book “Future Shock”
13. Information Overload
13
• We are living in the information age
• But, you may want to LEAVE (escape from) the
overloaded information age right now.
14. Alleviating Information Overload
14
Some solutions (Meyer 1998)
• Chunking: Deal with groups of things, not individuals
• Omission: Ignore some information
• Queuing: Put information aside & catch up later
• Capitulation: Escape from the task
• Filtering: Ignore irrelevant information
• And so forth …
Let’s take Emails for example
21. Information Retrieval
Task In Information Retrieval:
• Given a query
• Retrieve a list of documents related to the query/intent
The query could be a term:
21
22. Information Retrieval
Task In Information Retrieval:
• Given a query
• Retrieve a list of documents related to the query/intent
The query could be a sentence:
22
23. Information Retrieval
Task In Information Retrieval:
• Given a query
• Retrieve a list of documents related to the query/intent
The query could be a picture:
23
24. Information Retrieval
Task In Information Retrieval:
• Given a query
• Retrieve a list of documents related to the query/intent
The query could be an audio/voice:
24
25. Information Retrieval
Task In Information Retrieval:
• Given a query
• Retrieve a list of documents related to the query
The query could even be anything!
Thanks to contributions from:
• Multimedia
• Natural language processing (NLP)
25
26. Information Retrieval
Key issues in IR
26
[Figure: the information lifecycle. Creation (creating, authoring, modifying, organizing, indexing, storing), utilization (retrieval, distribution, networking, accessing, filtering, using), and searching & ranking, across active, semi-active, and inactive records (retention/mining, disposition, discard).]
34. Recommender Systems
34
• RecSys provide item recommendations
tailored to users’ preferences inferred from their history.
The notion of “item” may vary from domain to domain
E-Commerce: single item, a bundle of items, etc
Movies: each movie, director, actor, movie genre, etc
Music: single track, album, artist, playlist, etc
Travel: tour, flight, hotel, car rental, travel package, etc
Social networks: tweets, user accounts, groups, etc
42. 42
Task and Eval (1): Rating Prediction
User Item Rating
U1 T1 4
U1 T2 3
U1 T3 3
U2 T2 4
U2 T3 5
U2 T4 5
U3 T4 4
U1 T4 3
U2 T1 2
U3 T1 3
U3 T2 3
U3 T3 4
Train
Test
Task: P(U, T) in testing set
Prediction error: e = R(U, T) – P(U, T)
Mean Absolute Error (MAE) = Σ|e| / n (n = # of ratings in the testing set)
Other evaluation metrics:
• Root Mean Square Error (RMSE)
• Coverage
• and more …
43. 43
Task and Eval (1): Rating Prediction
User Item Rating
U1 T1 4
U1 T2 3
U1 T3 3
U2 T2 4
U2 T3 5
U2 T4 5
U3 T4 4
U1 T4 3
U2 T1 2
U3 T1 3
U3 T2 3
U3 T3 4
Train
Test
Task: P(U, T) in testing set
1. Build a model, e.g., P(U, T) = Avg (T)
2. Process of Rating Prediction
P(U1, T4) = Avg(T4) = (5+4)/2 = 4.5
P(U2, T1) = Avg(T1) = 4/1 = 4
P(U3, T1) = Avg(T1) = 4/1 = 4
P(U3, T2) = Avg(T2) = (3+4)/2 = 3.5
P(U3, T3) = Avg(T3) = (3+5)/2 = 4
3. Evaluation by Metrics
Mean Absolute Error (MAE) = Σ|ei| / n
ei = R(U, T) – P(U, T)
MAE = (|3 – 4.5| + |2 - 4| + |3 - 4| +
|3 – 3.5| + |4 - 4|) / 5 = 1
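The item-average baseline and the MAE computation above can be sketched in a few lines of Python. This is a minimal illustration using the toy ratings from this slide; the `train`/`test` lists and variable names are assumptions of the sketch, not part of any library:

```python
from collections import defaultdict

# Toy data from the slide: (user, item, rating)
train = [("U1", "T1", 4), ("U1", "T2", 3), ("U1", "T3", 3),
         ("U2", "T2", 4), ("U2", "T3", 5), ("U2", "T4", 5),
         ("U3", "T4", 4)]
test = [("U1", "T4", 3), ("U2", "T1", 2), ("U3", "T1", 3),
        ("U3", "T2", 3), ("U3", "T3", 4)]

# Step 1: build the model, P(U, T) = Avg(T) over the training ratings
sums, counts = defaultdict(float), defaultdict(int)
for _, item, r in train:
    sums[item] += r
    counts[item] += 1
avg = {item: sums[item] / counts[item] for item in sums}

# Steps 2-3: predict each test rating and compute MAE
mae = sum(abs(r - avg[item]) for _, item, r in test) / len(test)
print(mae)  # 1.0
```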
44. 44
Task and Eval (2): Top-N Recommendation
User Item Rating
U1 T1 4
U1 T2 3
U1 T3 3
U2 T2 4
U2 T3 5
U2 T4 5
U3 T4 4
U1 T4 3
U2 T1 2
U3 T1 3
U3 T2 3
U3 T3 4
Train
Test
Task: Top-N Items to a user U3
Predicted Rank: T4, T3, T1, T2
Real Rank: T3, T2, T1
Then compare the two lists:
Precision@N = # of hits/N
Other evaluation metrics:
• Recall
• Mean Average Precision (MAP)
• Normalized Discounted Cumulative Gain (NDCG)
• Mean Reciprocal Rank (MRR)
• and more …
45. 45
Task and Eval (2): Top-N Recommendation
User Item Rating
U1 T1 4
U1 T2 3
U1 T3 3
U2 T2 4
U2 T3 5
U2 T4 5
U3 T4 4
U1 T4 3
U2 T1 2
U3 T1 3
U3 T2 3
U3 T3 4
Train
Test
Task: Top-N Items to user U3
1. Build a model, e.g., P(U, T) = Avg (T)
2. Process of Rating Prediction
P(U3, T1) = Avg(T1) = 4/1 = 4
P(U3, T2) = Avg(T2) = (3+4)/2 = 3.5
P(U3, T3) = Avg(T3) = (3+5)/2 = 4
P(U3, T4) = Avg(T4) = (4+5)/2 = 4.5
Predicted Rank: T4, T3, T1, T2
Real Rank: T3, T2, T1
3. Evaluation Based on the two lists
Precision@N = # of hits/N
Precision@1 = 0/1
Precision@2 = 1/2
Precision@3 = 2/3
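The Top-N evaluation above can be sketched as follows: rank all items for U3 by the item-average prediction (note Avg(T4) = (4+5)/2 = 4.5) and compute Precision@N against the items U3 rated in the test set. A minimal illustration on the toy data from the slide; the data structures are assumptions of this sketch:

```python
from collections import defaultdict

train = [("U1", "T1", 4), ("U1", "T2", 3), ("U1", "T3", 3),
         ("U2", "T2", 4), ("U2", "T3", 5), ("U2", "T4", 5),
         ("U3", "T4", 4)]
test = [("U1", "T4", 3), ("U2", "T1", 2), ("U3", "T1", 3),
        ("U3", "T2", 3), ("U3", "T3", 4)]

# Item-average model, as in the rating-prediction example
sums, counts = defaultdict(float), defaultdict(int)
for _, item, r in train:
    sums[item] += r
    counts[item] += 1
avg = {item: sums[item] / counts[item] for item in sums}

# Rank all items for U3 by predicted rating; relevant = U3's test items
ranked = sorted(avg, key=avg.get, reverse=True)
relevant = {item for user, item, _ in test if user == "U3"}

def precision_at_n(ranked, relevant, n):
    """Precision@N = # of hits in the top N / N."""
    return sum(item in relevant for item in ranked[:n]) / n

for n in (1, 2, 3):
    print(n, precision_at_n(ranked, relevant, n))
```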
47. Traditional Recommendation Algorithms
47
Content-Based Recommendation Algorithms
The user will be recommended items similar to the ones the
user preferred in the past, such as book/movie recsys
Collaborative Filtering Based Recommendation Algorithms
The user will be recommended items that people with similar
tastes and preferences liked in the past, e.g., movie recsys
Hybrid Recommendation Algorithms
Combine content-based and collaborative filtering based
algorithms to produce item recommendations.
58. What is Context?
58
The most common contextual variables:
Time and Location
User intent or purpose
User emotional states
Devices
Topics of interest, e.g., apple vs. Apple
Others: companion, weather, budget, etc
Usually, the selection/definition of contexts is a domain-specific problem
59. Outline
• Context and Context-awareness
What is context and examples
What is context-awareness and examples
Context collections
59
60. Context-Awareness
60
• Context-Awareness = adapting to changes in the
contextual situation in order to build smart
applications
• It has been successfully applied to:
– Ubiquitous computing
– Mobile computing
– Information Retrieval
– Recommender Systems
– And so forth…
61. Example: Smart Home with Remote Controls
61
https://www.youtube.com/watch?v=jB7iuBKcfZw
62. Example: Smart Home with Context-awareness
62
https://www.youtube.com/watch?v=UQWYRsXkbAM
63. Content vs Context
63
• Content-Based Approaches
• Context-Driven Applications and Approaches
66. When Contexts Take Effect?
• Contexts can be useful at different points in time
66
[Figure: a timeline of when contexts take effect]
Past context → historical data or knowledge: context modeling, context mining
Current context → most applications: context matching, context adaptation
Future context → ubiquitous computing: context prediction, context adaptation
67. Outline
• Context and Context-awareness
What is context and examples
What is context-awareness and examples
Context collections
67
68. How to Collect Contexts
• Sensors
e.g., the application of smart homes
• User Inputs
e.g., survey or user interactions
• Inference
e.g., from user reviews
68
69. How to Collect Contexts
• Sensors, e.g., the application of smart homes
69
70. How to Collect Contexts
• User Inputs, e.g., survey or user interactions
70
71. How to Collect Contexts
• Inference, e.g., from user reviews
71
[Figure: annotated review excerpts with inferred contexts: family trip, early arrival, season and family trip]
72. Short Summary
• Information Overload
• Solution: Information Retrieval (IR)
e.g., Google Search Engine
• Solution: Recommender Systems (RecSys)
e.g., Movie recommender by Netflix
• Context and Context-awareness
e.g., Mobile computing and smart home devices
72
73. Next
• Coffee Break
– Time: 3:00 PM to 3:30 PM
– Location: Outside Merchants
• Context-awareness in IR and RecSys
• Extended Topics: Trends, Challenges and Future
73
75. 75
• Search in Google (by time)
Context-awareness in IR: Examples
76. 76
• Search in Google Map (by location)
Context-awareness in IR: Examples
77. Context in IR
77
• Searches should be processed in the context of the
information surrounding them, allowing more
accurate search results that better reflect the
user’s actual intentions. (Finkelstein, 2001)
• Context, in IR, refers to the whole data, metadata,
applications and cognitive structures embedded in
situations of retrieval or information seeking.
(Tamine, et al., 2010)
• This information usually has an impact on the
user’s behavior and perception of relevance.
80. Context-awareness in IR
80
The development of Context-awareness in IR
Interactive and Proactive by Gareth J.F. Jones, 2001
Other Frameworks or Models
Temporal Models
Semantic Models
Topic Models
Multimedia as inputs: IR based on voice or audios
And so forth…
81. Terminologies in IR
81
The Author: the author of the documents, i.e., the information provider
The End User: who issues the queries, or whose
context information is captured
Information Recipient: who finally receives the
retrieved information
We assume the end user and the information recipient
are the same person.
82. Context-awareness in IR: Interactive Applications
82
Interactive app
• User-driven approach
• Users explicitly issue a request (along with context
information) to retrieve relevant documents
• Examples: What are the comfortable hotels near
the Omaha Zoo (assume there are no automatic
location detectors or sensors)
• Contexts are included in the query; or finer-grained
query can be derived from related key words
83. Context-awareness in IR: Proactive Applications
83
Proactive App
• Author-driven approach
• Each document is associated with a trigger context. The
documents are retrieved to the user if the trigger context
matches user’s current context.
• Example-1: (Location, Time) = trigger contexts for each
restaurant; open Yelp, input Chinese dish, Yelp will return a
list of Chinese restaurants nearby and valid opening hours at
the current moment. [search with queries]
• Example-2: (Location, Time) = trigger contexts for each
restaurant; open Yelp, Yelp will deliver a list of Chinese
restaurants nearby and valid opening hours at the current
moment. [retrieval or recommendation without queries]
84. Context-awareness in IR
84
Other Frameworks or Models
Temporal Models
Image retrieval, NYC pictures: old, new? Summer, winter?
Semantic Models
Text retrieval, apple vs Apple?
Topic Models
Academic papers retrieval, AI, ML, DM, RecSys?
Multimedia as inputs: IR based on voice or audios
Bird singing, which birds? real birds? emotional reactions?
And so forth…
85. Context-awareness in IR
85
Interactive
User must give a query
The context information is included in the user inputs
It is a process from user contexts to relevant documents
Proactive
User may or may not give a query
Contexts are captured automatically
It is a process of matching trigger contexts with user contexts
It is a process from documents to users
87. Outline
• Context-aware Recommendation
Intro: Does context matter?
Definition: What is Context in RecSys?
Collection: Context Acquisition
Selection: How to identify the relevant context?
Context Incorporation
Context Filtering
Context Modeling
Other Challenges and CARSKit
87
88. Non-context vs Context
88
• Decision Making = Rational + Contextual
• Examples:
Travel destination: in winter vs in summer
Movie watching: with children vs with partner
Restaurant: quick lunch vs business dinner
Music: workout vs study
89. What is Context?
89
• “Context is any information that can be used to characterize the
situation of an entity” by Anind K. Dey, 2001
• Representative Context: Fully Observable and Static
• Interactive Context: Non-Fully observable and Dynamic
90. Interactive Context Adaptation
90
• Interactive Context: Non-fully observable and Dynamic
List of References:
M Hosseinzadeh Aghdam, N Hariri, B Mobasher, R Burke. "Adapting
Recommendations to Contextual Changes Using Hierarchical Hidden
Markov Models", ACM RecSys 2015
N Hariri, B Mobasher, R Burke. "Adapting to user preference changes
in interactive recommendation", IJCAI 2015
N Hariri, B Mobasher, R Burke. "Context adaptation in interactive
recommender systems", ACM RecSys 2014
N Hariri, B Mobasher, R Burke. "Context-aware music
recommendation based on latent topic sequential patterns", ACM
RecSys 2012
91. CARS With Representative Context
91
• Observed Context:
Contexts are those variables which may change when the same
activity is performed again and again.
• Examples:
Watching a movie: time, location, companion, etc
Listening to music: time, location, emotions, occasions, etc
Party or Restaurant: time, location, occasion, etc
Travels: time, location, weather, transportation condition, etc
92. What is Representative Context?
92
Activity Structure:
1). Subjects: group of users
2). Objects: group of items/users
3). Actions: the interactions within the activities
Which variables could be context?
1). Attributes of the actions
Watching a movie: time, location, companion
Listening to music: time, occasions, etc
2). Dynamic attributes or status from the subjects
User emotions
Yong Zheng. "A Revisit to The
Identification of Contexts in
Recommender Systems", IUI 2015
93. Context-aware RecSys (CARS)
93
• Traditional RS: Users × Items → Ratings
• Contextual RS: Users × Items × Contexts → Ratings
Example of Multi-dimensional Context-aware Data set
User Item Rating Time Location Companion
U1 T1 3 Weekend Home Kids
U1 T2 5 Weekday Home Partner
U2 T2 2 Weekend Cinema Partner
U2 T3 3 Weekday Cinema Family
U1 T3 ? Weekend Cinema Kids
94. Terminology in CARS
94
• Example of Multi-dimensional Context-aware Data set
Context Dimension: time, location, companion
Context Condition: Weekend/Weekday, Home/Cinema
Context Situation: {Weekend, Home, Kids}
User Item Rating Time Location Companion
U1 T1 3 Weekend Home Kids
U1 T2 5 Weekday Home Partner
U2 T2 2 Weekend Cinema Partner
U2 T3 3 Weekday Cinema Family
U1 T3 ? Weekend Cinema Kids
95. Context Acquisition
95
How to Collect the context and user preferences in contexts?
• By User Surveys or Explicitly Asking for User Inputs
Predefine context & ask users to rate items in these situations;
Or directly ask users about their contexts in user interface;
• By Usage data
The log data usually contains time and location (at least);
User behaviors can also infer context signals;
• By User reviews
Text mining or opinion mining could be helpful to infer context
information from user reviews
104. Context Relevance and Context Selection
104
Apparently, not all contexts are relevant or influential
• By User Surveys
e.g., which ones are important for you in this domain
• By Feature Selection
e.g., Principal Component Analysis (PCA)
e.g., Linear Discriminant Analysis (LDA)
• By Statistical Analysis or Detection on Contextual Ratings
Statistical test, e.g., Freeman-Halton Test
Other methods: information gain, mutual information, etc
Reference: Odic, Ante, et al. "Relevant context in a movie
recommender system: Users’ opinion vs. statistical detection."
CARS Workshop@ACM RecSys 2012
105. Context-aware Data Sets
105
Public Data Set for Research Purpose
• Food: AIST Japan Food, Mexico Tijuana Restaurant Data
• Movies: AdomMovie, DePaulMovie, LDOS-CoMoDa Data
• Music: InCarMusic
• Travel: TripAdvisor, South Tyrol Suggests (STS)
• Mobile: Frappe
Frappe is a large data set, others are either small or sparse
Downloads and References:
https://github.com/irecsys/CARSKit/tree/master/context-aware_data_sets
106. 106
• Once we collect context information and identify
the most influential or relevant contexts, the next step
is to incorporate the contexts into the recommender
system.
Context Incorporation
• Traditional RS: Users × Items → Ratings
• Contextual RS: Users × Items × Contexts → Ratings
107. 107
• There are three ways to build algorithms for CARS
Context-aware RecSys (CARS)
108. 108
• Next, we focus on the following CARS algorithms:
Contextual Filtering: Use Context as Filter
Contextual Modeling: Independent vs Dependent
Context-aware RecSys (CARS)
111. 111
• Data Sparsity Problem in Contextual Rating
Differential Context Modeling
User Movie Time Location Companion Rating
U1 Titanic Weekend Home Girlfriend 4
U2 Titanic Weekday Home Girlfriend 5
U3 Titanic Weekday Cinema Sister 4
U1 Titanic Weekday Home Sister ?
Context Matching Only profiles given in <Weekday, Home, Sister>
Context Relaxation Use a subset of context dimensions to match
Context Weighting Use all profiles, but weighted by context similarity
112. 112
• Context Relaxation
Differential Context Modeling
User Movie Time Location Companion Rating
U1 Titanic Weekend Home Girlfriend 4
U2 Titanic Weekday Home Girlfriend 5
U3 Titanic Weekday Cinema Sister 4
U1 Titanic Weekday Home Sister ?
Use {Time, Location, Companion} 0 record matched!
Use {Time, Location} 1 record matched!
Use {Time} 2 records matched!
Note: a balance is required between relaxation and accuracy
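Context relaxation can be sketched as progressively dropping context dimensions until enough matching profiles are found. A toy illustration of the table above; the data structures and function name are assumptions of this sketch:

```python
# Rating profiles from the slide: (user, item, context, rating)
profiles = [
    ("U1", "Titanic", {"Time": "Weekend", "Location": "Home", "Companion": "Girlfriend"}, 4),
    ("U2", "Titanic", {"Time": "Weekday", "Location": "Home", "Companion": "Girlfriend"}, 5),
    ("U3", "Titanic", {"Time": "Weekday", "Location": "Cinema", "Companion": "Sister"}, 4),
]
# Target context of the rating to be predicted
target = {"Time": "Weekday", "Location": "Home", "Companion": "Sister"}

def matched_ratings(dims):
    """Ratings whose context agrees with the target on the given dimensions."""
    return [r for _, _, ctx, r in profiles
            if all(ctx[d] == target[d] for d in dims)]

print(len(matched_ratings(["Time", "Location", "Companion"])))  # 0 records matched
print(len(matched_ratings(["Time", "Location"])))               # 1 record matched
print(len(matched_ratings(["Time"])))                           # 2 records matched
```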
113. 113
• Context Weighting
Differential Context Modeling
User Movie Time Location Companion Rating
U1 Titanic Weekend Home Girlfriend 4
U2 Titanic Weekday Home Girlfriend 5
U3 Titanic Weekday Cinema Sister 4
U1 Titanic Weekday Home Sister ?
c and d are two contexts (the two red regions in the table above).
σ is the weighting vector <w1, w2, w3> for the three dimensions.
Assume equal weights: w1 = w2 = w3 = 1.
J(c, d, σ) = sum of weights over matched dimensions / sum of all weights
With equal weights: # of matched dimensions / # of all dimensions = 2/3
Similarity of contexts is measured by weighted Jaccard similarity
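The weighted Jaccard similarity above can be sketched in a few lines; with equal weights it reproduces the 2/3 from the example (the tuples and function name are assumptions of this sketch):

```python
def weighted_jaccard(c, d, w):
    """Weighted Jaccard similarity of two context tuples:
    sum of weights on matched dimensions / sum of all weights."""
    matched = sum(wi for ci, di, wi in zip(c, d, w) if ci == di)
    return matched / sum(w)

c = ("Weekday", "Home", "Girlfriend")  # U2's context in the table
d = ("Weekday", "Home", "Sister")      # the target context
print(weighted_jaccard(c, d, (1, 1, 1)))  # 2 of 3 dimensions match
```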
114. 114
• Notion of “differential”
In short, we apply different context relaxation and context weighting to
each component
Differential Context Modeling
1. Neighbor Selection   2. Neighbor Contribution
3. User Baseline   4. User Similarity
115. 115
• Workflow
Step-1: We decompose an algorithm to different components;
Step-2: We try to find optimal context relaxation/weighting:
In context relaxation, we select optimal context dimensions
In context weighting, we find optimal weights for each dimension
• Optimization Problem
Assume there are 4 components and 3 context dimensions
Differential Context Modeling
Dimension:    1    2    3  |  4    5    6   |  7    8    9   |  10   11   12
DCR (binary): 1    0    0  |  0    1    1   |  1    1    0   |  1    1    1
DCW (weights):0.2  0.3  0  |  0.1  0.2  0.3 |  0.5  0.1  0.2 |  0.1  0.5  0.2
Component:    1st          |  2nd           |  3rd           |  4th
117. 117
• How does PSO work?
Differential Context Modeling
Swarm = a group of birds
Particle = each bird ≈ a search entity in the algorithm
Vector = a bird’s position in the space ≈ the vectors we need in DCR/DCW
Goal = the distance to the location of the pizza ≈ prediction error
So, how do we reach the goal by swarm intelligence?
1. Looking for the pizza
Assume a machine can tell the distance
2. Each iteration is an attempt or move
3. Cognitive learning from the particle itself
Am I closer to the pizza compared with
my “best” locations in previous history?
4. Social learning from the swarm
Hey, my distance is 1 mile. It is the closest!
Follow me!! Then the other birds move towards here.
DCR – Feature selection – Modeled by binary vectors – Binary PSO
DCW – Feature weighting – Modeled by real-number vectors – PSO
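A minimal binary PSO sketch for the DCR-style feature selection described above. This is a toy illustration, not the tutorial's actual implementation: the fitness function here simply counts mismatches against a known target vector, standing in for the prediction error that would be used in DCR:

```python
import math
import random

def binary_pso(fitness, n_bits, n_particles=15, iters=60,
               w=0.7, c1=1.5, c2=1.5, vmax=6.0):
    """Minimize `fitness` over binary vectors of length n_bits."""
    pos = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's best position
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # the swarm's best position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n_bits):
                r1, r2 = random.random(), random.random()
                # cognitive learning (pbest) + social learning (gbest)
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-vmax, min(vmax, v))
                # sigmoid of the velocity gives the probability of bit = 1
                pos[i][d] = 1 if random.random() < 1 / (1 + math.exp(-vel[i][d])) else 0
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Toy fitness: Hamming distance to a hypothetical optimal relaxation vector
target = [1, 0, 0, 0, 1, 1]
best, best_fit = binary_pso(lambda b: sum(x != t for x, t in zip(b, target)),
                            len(target))
```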
118. 118
• Summary
Pros: Alleviate data sparsity problem in CARS
Cons: Computational complexity in optimization
Cons: Local optimum by non-linear optimizer
Our Suggestion:
We may just run these optimizations offline to find optimal
context relaxation or context weighting solutions; And those
optimal solutions can be obtained periodically;
Differential Context Modeling
122. 122
• Tensor Factorization
Independent Contextual Modeling
Multi-dimensional space: Users × Items × Contexts → Ratings
Each context variable is modeled as an individual and independent
dimension in addition to the user & item dimensions. Thus we can
create a multidimensional space, where the rating is the value in
the space.
125. 125
• Tensor Factorization
Pros: Straightforward; easy to incorporate contexts into the model
Cons: 1). Ignores the dependence between contexts and the user/item dims
2). Increased computational cost with more context dimensions
There is some research on efficiency improvements for TF,
such as reusing GPU computations, and so forth…
Independent Contextual Modeling
127. 127
• Dependence between Every two Contexts
Deviation-Based: rating deviation between two contexts
Similarity-Based: similarity of rating behaviors in two contexts
Dependent Contextual Modeling
128. 128
• Notion: Contextual Rating Deviation (CRD)
CRD(Di): how users’ ratings deviate from context c1 to c2 in dimension Di
CRD(D1) = 0.5: users’ ratings on Weekdays are generally higher than
their ratings on Weekends by 0.5
CRD(D2) = -0.1: users’ ratings in the Cinema are generally lower than
their ratings at Home by 0.1
Deviation-Based Contextual Modeling
Context D1: Time D2: Location
c1 Weekend Home
c2 Weekday Cinema
CRD(Di) 0.5 -0.1
130. 130
• Build a deviation-based contextual modeling approach
Assume Ø is a special situation: without considering context
Assume Rating (U, T, Ø) = Rating (U, T) = 4
Predicted Rating (U, T, c2) = 4 + 0.5 -0.1 = 4.4
Deviation-Based Contextual Modeling
Context D1: Time D2: Location
Ø UnKnown UnKnown
c2 Weekday Cinema
CRD(Di) 0.5 -0.1
In other words, F(U, T, C) = P(U, T) + Σ_{i=1}^{N} CRD(i)
131. 131
• Build a deviation-based contextual modeling approach
Note: P(U, T) could be a rating prediction by any traditional
recommender systems, such as matrix factorization
Deviation-Based Contextual Modeling
Simplest model: F(U, T, C) = P(U, T) + Σ_{i=1}^{N} CRD(i)
User-personalized model: F(U, T, C) = P(U, T) + Σ_{i=1}^{N} CRD(i, U)
Item-personalized model: F(U, T, C) = P(U, T) + Σ_{i=1}^{N} CRD(i, T)
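The deviation-based model above can be sketched in a few lines. The CRD values are the illustrative 0.5 and -0.1 from the slides; the dictionary and function name are assumptions of this sketch:

```python
# Contextual rating deviations per (dimension, condition), relative to Ø
crd = {("Time", "Weekday"): 0.5, ("Location", "Cinema"): -0.1}

def predict(p_ut, context):
    """Simplest model: F(U, T, C) = P(U, T) + sum of CRDs for the context."""
    return p_ut + sum(crd.get(pair, 0.0) for pair in context.items())

# P(U, T) = 4 without context; predicted rating in (Weekday, Cinema):
print(predict(4.0, {"Time": "Weekday", "Location": "Cinema"}))  # 4.4
```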
132. 132
• Build a similarity-based contextual modeling approach
Assume Ø is a special situation: without considering context
Assume Rating (U, T, Ø) = Rating (U, T) = 4
Predicted Rating (U, T, c2) = 4 × Sim(Ø, c2)
Similarity-Based Contextual Modeling
Context D1: Time D2: Location
Ø UnKnown UnKnown
c2 Weekday Cinema
Sim(Di) 0.5 0.1
In other words, F(U, T, C) = P(U, T) × Sim(Ø, C)
133. 133
• Challenge: how to model context similarity, Sim(c1,c2)
We propose three representations:
• Independent Context Similarity (ICS)
• Latent Context Similarity (LCS)
• Multidimensional Context Similarity (MCS)
Similarity-Based Contextual Modeling
134. 134
• Sim(c1, c2): Independent Context Similarity (ICS)
Sim(c1, c2) = Π_{i=1}^{N} sim(Di) = 0.5 × 0.1 = 0.05
Similarity-Based Contextual Modeling
Context D1: Time D2: Location
c1 Weekend Home
c2 Weekday Cinema
Sim(Di) 0.5 0.1
Generally, in ICS: Sim(c1, c2) = Π_{i=1}^{N} sim(Di)
Pairwise condition similarities (a, b, c, d are learned values):
          Weekend  Weekday  Home  Cinema
Weekend   1        b        —     —
Weekday   a        1        —     —
Home      —        —        1     c
Cinema    —        —        d     1
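ICS can be sketched as a product of per-dimension similarities looked up from a pairwise table. The table values below are the illustrative 0.5 and 0.1 from the slide, standing in for learned similarities:

```python
# Hypothetical learned similarities between context conditions
sim_table = {("Weekend", "Weekday"): 0.5, ("Home", "Cinema"): 0.1}

def sim_condition(a, b):
    """Similarity of two conditions; identical conditions have similarity 1."""
    if a == b:
        return 1.0
    return sim_table.get((a, b), sim_table.get((b, a), 0.0))

def ics(c1, c2):
    """Independent Context Similarity: product over all dimensions."""
    s = 1.0
    for a, b in zip(c1, c2):
        s *= sim_condition(a, b)
    return s

print(ics(("Weekend", "Home"), ("Weekday", "Cinema")))  # 0.5 × 0.1 = 0.05
```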
135. 135
• Sim(c1, c2): Latent Context Similarity (LCS)
In training, we learnt (home, cinema), (work, cinema)
In testing, we need (home, work)
Similarity-Based Contextual Modeling
Generally, in LCS: Sim(c1, c2) = Π_{i=1}^{N} sim(Di),
where sim(Di) = dotProduct(Vi1, Vi2)
Vector representation (latent factors):
        f1    f2     …   fN
home    0.1   -0.01  …   0.5
work    0.01  0.2    …   0.01
cinema  0.3   0.25   …   0.05
136. 136
• Sim(c1, c2): Multidimensional Context Similarity (MCS)
Each context condition is an individual axis in the space.
For each axis, there are only two values: 0 and 1.
1 means this condition is selected; otherwise, not selected.
When value is 1, each condition is associated with a weight
c1 = <Weekday, Cinema, with Kids>
c2 = <Weekend, Home, with Family>
They can be mapped as two points in the space
Similarity-Based Contextual Modeling
In MCS: DisSim(c1, c2) = distance between the two points
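A sketch of the MCS representation, assuming unit weights on every axis (in the actual model these weights would be learned):

```python
import math

# Every context condition gets its own 0/1 axis; a context maps to a
# point in the space, and dissimilarity is the distance between points.
AXES = ["Weekday", "Weekend", "Home", "Cinema", "with Kids", "with Family"]
WEIGHT = {axis: 1.0 for axis in AXES}  # hypothetical per-condition weights

def to_point(conditions):
    """Map a context (set of selected conditions) to a point in the space."""
    return [WEIGHT[a] if a in conditions else 0.0 for a in AXES]

def dissim(c1, c2):
    """DisSim(c1, c2) = distance between the two contexts' points."""
    return math.dist(to_point(c1), to_point(c2))

c1 = {"Weekday", "Cinema", "with Kids"}
c2 = {"Weekend", "Home", "with Family"}
print(dissim(c1, c2))  # sqrt(6) ≈ 2.449 under unit weights
```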
139.
Recommendation Library
• Motivations to Build a Recommendation Library
1) Standard implementations for popular algorithms
2) Standard platform for benchmarks or evaluations
3) Helpful for both research purposes and industry practice
4) Helpful as a tool in teaching and learning
141.
CARSKit: A Java-based Open-source
Context-aware Recommendation Library
CARSKit: https://github.com/irecsys/CARSKit
Users × Items × Contexts → Ratings
User Guide: http://arxiv.org/abs/1511.03780
142.
CARSKit: A Short User Guide
1. Download the JAR library, i.e., CARSKit.jar
2. Prepare your data
3. Setting: setting.conf
4. Run: java -jar CARSKit.jar -c setting.conf
148.
Challenges
• There could be many other challenges in context-awareness in IR and RecSys:
Numeric vs. Categorical Context Information
Explanations by Context
New User Interfaces and Interactions
User Intent Prediction or Inference in IR and RecSys
Cold-Start and Data Sparsity Problems in CARS
149.
Challenges: Numeric Context
• List of Categorical Context
Time: morning, evening, weekend, weekday, etc.
Location: home, cinema, work, party, etc.
Companion: family, kid, partner, etc.
• How about numeric context?
Time: 2016, 6:30 PM, 2 PM to 6 PM (time-aware recsys)
Temperature: 12°C, 38°C
Principal components from PCA: numeric values
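One common workaround, sketched below with illustrative bin edges that are not from the tutorial, is to discretize numeric context into categorical conditions that existing CARS techniques can already handle:

```python
def bin_temperature(celsius):
    """Map a numeric temperature to a categorical context condition.
    The bin edges are hypothetical, chosen only for illustration."""
    if celsius < 10:
        return "cold"
    if celsius < 25:
        return "mild"
    return "hot"

print(bin_temperature(12))  # mild
print(bin_temperature(38))  # hot
```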
150.
Challenges: Explanation
• Recommendation using social networks (by Netflix)
The improvement is not significant
unless we explicitly explain it to the end users
• IR and RecSys using context (open research)
Similar issues could arise in context-aware IR & RecSys:
How to use contexts to explain information filtering
How to design new user interfaces for explanations
How to introduce user-centric evaluations
151.
Challenges: User Interface
• Potential Research Problems in User Interface
New UI to collect context;
New UI to interact with users in a friendly and smooth way;
New UI to explain context-aware IR and RecSys;
New UI to avoid debates over user privacy;
User privacy problems in context collection & usage
152.
Challenges: Cold-Start and Data Sparsity
• Cold-start Problems
Cold-start user: no rating history by this user
Cold-start item: no rating history on this item
Cold-start context: no rating history within this context
• Solution: Hybrid Method by Matthias Braunhofer, et al.
153.
Challenges: User Intent
• User intent could be the most influential context
How to better predict it
How to better design the UI to capture it
How to balance user intent against limited resources
154.
Trends and Future
• Context-awareness enables new applications: context
suggestion, or context-driven UIs/applications
Context Suggestion
155. Context Suggestion
• Task: Suggest a list of contexts to users (on items)
(Diagram: Traditional Rec, Contextual Rec, and Context Rec compared)
156.
Context Suggestion: Motivations
• Motivation-1: Maximize user experience
User Experience (UX) refers to a person's emotions and
attitudes about using a particular product, system or
service.
162.
References
L. Baltrunas, M. Kaminskas, F. Ricci, et al. Best Usage Context Prediction for Music Tracks. CARS@ACM RecSys, 2010.
Y. Zheng, B. Mobasher, R. Burke. Context Recommendation Using Multi-label Classification. IEEE/WIC/ACM WI, 2014.
Y. Zheng. Context Suggestion: Solutions and Challenges. ICDM Workshop, 2015.
Y. Zheng. Context-Driven Mobile Apps Management and Recommendation. ACM SAC, 2016.
Y. Zheng, B. Mobasher, R. Burke. User-Oriented Context Suggestion. ACM UMAP, 2016.
163. Tutorial: Context-Awareness In Information
Retrieval and Recommender Systems
Yong Zheng
School of Applied Technology
Illinois Institute of Technology, Chicago
The 16th IEEE/WIC/ACM Conference on Web Intelligence, Omaha, USA