This document summarizes a presentation on additive smoothing for relevance-based language modelling of recommender systems. It discusses using pseudo-relevance feedback and relevance models for collaborative filtering recommendations. Specifically, it examines how different collection-based smoothing techniques like Dirichlet priors, Jelinek-Mercer, and absolute discounting can demote the desired IDF effect, which promotes less popular items. The document proposes using additive smoothing, which does not demote the IDF effect. Experiments on movie recommendation datasets show additive smoothing achieves better accuracy, diversity, and novelty than other smoothing methods.
Additive Smoothing for Relevance-Based Language Modelling of Recommender Systems [CERI '16 Slides]
1. CERI 2016, GRANADA, SPAIN
ADDITIVE SMOOTHING FOR RELEVANCE-BASED
LANGUAGE MODELLING OF RECOMMENDER SYSTEMS
Daniel Valcarce, Javier Parapar, Álvaro Barreiro
@dvalcarce @jparapar @AlvaroBarreiroG
Information Retrieval Lab
@IRLab_UDC
University of A Coruña
Spain
2. Outline
1. Recommender Systems
2. Pseudo-Relevance Feedback
3. Relevance-Based Language Modelling of Recommender
Systems
4. IDF Effect and Additive Smoothing
5. Experiments
6. Conclusions and Future Directions
4. Recommender Systems
Recommender systems generate personalised suggestions for
items that may be of interest to the users.
Top-N Recommendation: create a ranking of the N most
relevant items for each user.
Collaborative filtering: exploit only user-item interactions
(ratings, clicks, etc.).
6. Pseudo-Relevance Feedback (I)
In Information Retrieval, Pseudo-Relevance Feedback (PRF) is
an automatic query expansion method.
The goal is to expand the original query with new terms to
improve the quality of the search results.
These new terms are extracted automatically from a first
retrieval using the original query.
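The PRF loop described on this slide can be sketched in a few lines. This is a minimal illustration, not the method from the talk: `search` is a hypothetical function returning ranked documents as term lists, and `k` / `n_terms` are illustrative parameters.

```python
from collections import Counter

def expand_query(query, search, k=10, n_terms=5):
    """Pseudo-relevance feedback: expand `query` with the most frequent
    terms of the top-k documents from a first retrieval."""
    top_docs = search(query)[:k]          # first retrieval, assumed relevant
    counts = Counter()
    for doc in top_docs:
        counts.update(t for t in doc if t not in query)
    expansion = [t for t, _ in counts.most_common(n_terms)]
    return list(query) + expansion        # expanded query for re-retrieval
```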
18. Relevance-Based Language Models (RM)
Relevance-Based Language Models or Relevance Models (RM)
are a state-of-the-art PRF technique (Lavrenko & Croft, SIGIR
2001).
Two models: RM1 and RM2.
RM1 works better than RM2 in retrieval.
Relevance Models have been recently adapted to collaborative
filtering (Parapar et al., IPM 2013).
For recommendation, RM2 is the preferred method.
19. Relevance Models for Collaborative Filtering
RM2: p(i|Ru) ∝ p(i) ∏_{j∈Iu} ∑_{v∈Vu} (p(i|v) p(v) / p(i)) p(j|v)
Iu is the set of items rated by the user u.
Vu is the neighbourhood of the user u. This is computed using a clustering algorithm.
p(i) and p(v) are the item and user priors.
p(i|u) is computed by smoothing the maximum likelihood estimate with the probability in the collection.
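As a toy illustration of the RM2 ranking formula above, here is a direct transcription with uniform priors. The neighbourhood Vu is taken as given (the paper obtains it with clustering), `p_cond` is any pluggable estimate of p(i|v), and all names and data are illustrative, not from the paper.

```python
import math

def rm2_score(i, u, ratings, neighbours, p_cond, n_items, n_users):
    """log p(i|Ru) up to a constant: p(i) * prod_{j in Iu} sum_{v in Vu}
    p(i|v) * p(v)/p(i) * p(j|v), with uniform priors p(i) and p(v)."""
    p_i = 1.0 / n_items                   # uniform item prior
    p_v = 1.0 / n_users                   # uniform user prior
    log_score = math.log(p_i)
    for j in ratings[u]:                  # j ranges over Iu, the items rated by u
        s = sum(p_cond(i, v) * (p_v / p_i) * p_cond(j, v) for v in neighbours)
        log_score += math.log(s)          # assumes s > 0, i.e. a smoothed p_cond
    return log_score
```

With a pure maximum-likelihood `p_cond` the inner sum can be zero, which is exactly where the smoothing choice discussed on the following slides enters.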
27. Collection-based Smoothing Techniques (II)
Absolute Discounting, Jelinek-Mercer and Dirichlet Priors have
been studied in the context of:
Text Retrieval (Zhai & Lafferty, ACM TOIS 2004)
◦ Absolute Discounting performs very poorly.
◦ Dirichlet Priors is the most popular approach.
◦ Jelinek-Mercer is a bit better for long queries.
Collaborative Filtering (Valcarce et al., ECIR 2015)
◦ Absolute Discounting is the best smoothing method.
Can we do better?
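In rating terms, the three collection-based estimates compared above can be sketched as follows. This is my own transcription, not code from the paper: r_ui is the rating r(u, i), `user_total` is the sum of u's ratings, `p_i_C` is the item's share of all ratings in the collection, and the parameter defaults are arbitrary.

```python
def p_jelinek_mercer(r_ui, user_total, p_i_C, lam=0.5):
    """Linear interpolation of the ML estimate with the collection model."""
    return (1 - lam) * (r_ui / user_total) + lam * p_i_C

def p_dirichlet(r_ui, user_total, p_i_C, mu=100.0):
    """Bayesian smoothing with a Dirichlet prior of mass mu."""
    return (r_ui + mu * p_i_C) / (user_total + mu)

def p_absolute_discounting(r_ui, user_total, n_rated, p_i_C, delta=0.5):
    """Subtract delta from each seen rating; redistribute that mass via p(i|C)."""
    return max(r_ui - delta, 0) / user_total + delta * n_rated / user_total * p_i_C
```

All three pull the user's estimate towards the collection model p(i|C), i.e. towards globally popular items, which is the root of the IDF-effect problem analysed next.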
30. Axiomatic Analysis of the IDF Effect in IR
A recent work performed an axiomatic analysis of several PRF
methods (Hazimeh & Zhai, ICTIR 2015).
They found out that RM1 with Dirichlet Priors and
Jelinek-Mercer smoothing methods demote the IDF effect.
The IDF effect is a desirable property that, intuitively,
promotes documents with very specific terms.
Can we use this result in recommendation?
What is the IDF effect in recommendation? Is it a desirable
property?
They studied RM1, what about RM2?
31. The IDF Effect in Recommendation (I)
This retrieval idea is related to novelty in recommendation.
Definition (IDF effect)
A recommender system supports the IDF effect if p(i1|Ru) > p(i2|Ru) when
two items i1 and i2
have the same ratings r(v, i1) = r(v, i2) for all v ∈ Vu
and different popularity p(i1|C) < p(i2|C).
In simple words: if we have the same feedback for two items, we should recommend the less popular one.
34. The IDF Effect in Recommendation (II)
We performed an axiomatic analysis of RM2¹ using the
following smoothing methods:
Dirichlet Priors
Jelinek-Mercer
Absolute Discounting
Additive Smoothing
pγ(i|u) = (r(u, i) + γ) / (∑_{j∈Iu} r(u, j) + γ|I|)
¹ Math proofs in the paper!
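The additive smoothing estimate on this slide, as a one-function sketch (names are illustrative):

```python
def p_additive(i, user_ratings, all_items, gamma=0.1):
    """Additive (Laplace/Lidstone) smoothing:
    p_gamma(i|u) = (r(u, i) + gamma) / (sum_j r(u, j) + gamma * |I|)."""
    num = user_ratings.get(i, 0) + gamma          # r(u, i) + gamma
    den = sum(user_ratings.values()) + gamma * len(all_items)
    return num / den
```

Because the same pseudo-count γ is added to every item, two items with identical ratings from u receive identical estimates regardless of their collection popularity; this is the sense in which additive smoothing neither demotes nor promotes the IDF effect.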
42. Conclusions
The IDF effect from IR is related to the novelty of the
recommendations.
The use of collection-based smoothing methods with RM2
demotes the IDF effect.
Additive smoothing is a simple method that does not demote
(nor promote) the IDF effect.
Additive smoothing provides better accuracy, diversity and
novelty figures than collection-based smoothing methods.
43. Future work
Envision new ways of enhancing the IDF effect in RM2:
Design smoothing methods that actively promote the IDF
effect.
Use non-uniform prior estimates.
Study axiomatically other IR properties that can be useful in
recommendation.