Recommender Systems in the Linked Data era

ROBERTO MIRIZZI, PHD
roberto.mirizzi@gmail.com

Outline
What is a Recommender System?
◦ A definition
◦ Types
What is Linked Data?
◦ LOD
◦ DBpedia
Some Recommender Systems (RS):
◦ A content-based RS (memory-based)
◦ A mobile content-based RS (memory-based)
◦ A content-based RS (model-based)
◦ A hybrid RS (model-based)

Recommender Systems in the Linked Data Era – HP Labs, Palo Alto, CA
7/12/2013
What is a Recommender System?
Recommender Systems (RSs) are software tools and techniques providing suggestions for items
to be of use to a user.
[F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors. Recommender Systems Handbook. Springer, 2011.]
Input Data:
A set of users $U = \{u_1, \dots, u_M\}$
A set of items $I = \{i_1, \dots, i_N\}$
The preference matrix $R = [r_{u,i}]$
Problem Definition:
Given user $u$ and target item $i$
Predict the preference $r_{u,i}$

Recommender Systems: types
Content-based (CB): recommendations are based on the assumption that if a user liked items with particular features in the past, they will likely prefer items with similar characteristics.
[Figure: example item features: animation, fairytale, ogre, castle]
Collaborative-filtering (CF): recommendations are based on the assumption that users with similar histories are more likely to have similar tastes and needs.
Hybrid: as the name suggests, these combine content-based and collaborative-filtering techniques.

What is Linked Data?
A collection of interrelated datasets on the Web
Principles:
1. Use HTTP URIs to identify things
2. Leverage standards such as RDF and SPARQL to provide information about things
3. Link related things by relationships
[http://linkeddata.org/]

DBpedia: a Nucleus for a Web of Open Data
http://dbpedia.org
DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web.
DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data.
[Auer et al., DBpedia: A Nucleus for a Web of Open Data. ISWC+ASWC 2007]
[Bizer et al., A crystallization point for the Web of Data. Journal of Web Semantics, 2009]

Querying DBpedia: SPARQL
DBpedia exposes a SPARQL endpoint (http://dbpedia.org/sparql) to query the dataset.
Results can be provided in several formats (e.g., JSON, XML, N-Triples, etc.)
SPARQL is an RDF query language. Its queries consist of triple patterns, conjunctions, disjunctions and optional patterns.
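As a rough illustration of what such a query might look like, the sketch below builds a SPARQL query and the corresponding request URL for the endpoint above. The property names (`dbo:starring`, `dbo:director`, `dct:subject`), the resource URI and the `format` request parameter are assumptions made for the example, not details taken from the slides. The code only builds the URL and does not contact the endpoint.

```python
from urllib.parse import urlencode, urlparse, parse_qs

DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named on the slide


def build_query(resource_uri: str) -> str:
    """Build a SPARQL query for the actors, directors and categories of a resource."""
    # Assumption: these three properties describe a film on DBpedia.
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?property ?value WHERE {\n"
        f"  <{resource_uri}> ?property ?value .\n"
        "  FILTER(?property IN (dbo:starring, dbo:director, dct:subject))\n"
        "}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as a GET request asking for JSON results (assumed parameter)."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return f"{DBPEDIA_ENDPOINT}?{urlencode(params)}"


# Hypothetical resource URI, used only for illustration.
query = build_query("http://dbpedia.org/resource/Ocean's_Eleven")
url = build_request_url(query)
print(url)
```

The resulting URL could then be fetched with any HTTP client; each result row would pair a property with one of its values.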

A graph of knowledge
Why don’t we use all this information to foster recommender systems?
[Figure: knowledge graph with the nodes Ocean’s Eleven, Ocean’s Twelve, George Clooney, Brad Pitt, Catherine Zeta-Jones, Steven Soderbergh, "2000s crime films", "American criminal comedy films", "Crime films" and "Crime"]

[Figure: the same graph, extended with two "likes" edges]

A content-based RS (memory-based)

The good old Vector Space Model
[http://en.wikipedia.org/wiki/File:Vector_space_model.jpg]
The Vector Space Model is an algebraic model for representing both text documents and queries as vectors of index-term weights $w_{t,d}$ that are positive and non-binary.

$$v_d = [w_{1,d}, w_{2,d}, \dots, w_{N,d}]^T$$

$$w_{t,d} = tf_{t,d} \cdot idf_t$$

$$tf_{t,d} = \frac{n_{t,d}}{\sum_k n_{k,d}}$$

$$idf_t = \log \frac{|D|}{|\{d' \in D : t \in d'\}|}$$

$$sim(d_j, q) = \frac{d_j \cdot q}{\|d_j\| \, \|q\|} = \frac{\sum_{i=1}^{N} w_{i,j} \, w_{i,q}}{\sqrt{\sum_{i=1}^{N} w_{i,j}^2} \, \sqrt{\sum_{i=1}^{N} w_{i,q}^2}}$$

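To make the Vector Space Model concrete, here is a minimal sketch of TF-IDF weighting and cosine similarity over a toy corpus. The function names and the three tiny "documents" are invented for the example, and vectors are stored as sparse dictionaries rather than dense arrays.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values())
        vectors.append({
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in counts.items()
        })
    return vectors


def cosine(v1, v2):
    """Cosine similarity between two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(w * v2.get(term, 0.0) for term, w in v1.items())
    norm1 = math.sqrt(sum(w * w for w in v1.values()))
    norm2 = math.sqrt(sum(w * w for w in v2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


# Toy corpus, invented for illustration.
docs = [
    ["crime", "heist", "casino"],
    ["crime", "heist", "europe"],
    ["ogre", "castle", "fairytale"],
]
vecs = tf_idf_vectors(docs)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Documents that share rare terms end up close to each other, while documents with no terms in common have similarity zero.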
Semantic Vector Space Model (i)
[Figure: Ocean’s Eleven and Ocean’s Twelve connected to George Clooney, Brad Pitt and Catherine Zeta-Jones (starring), Steven Soderbergh (director), "2000s crime films", "Crime films" and "American criminal…" (subject/broader), and Crime (genre); the starring slice is shown separately]
Each item is expressed as a tensor in a multi-dimensional space where each dimension corresponds to a specific property of the considered datasets (e.g., starring, subject/broader, director, genre, …)

Semantic Vector Space Model (ii)
[Figure: the starring property. Actors: George Clooney [gc] (38 movies), Catherine Z. Jones [czj] (22 movies), Brad Pitt [bp] (35 movies). Movies: Ocean’s Eleven [o11] (13 actors), Ocean’s Twelve [o12] (15 actors)]

starring              | George Clooney [gc] | Catherine Z. Jones [czj] | Brad Pitt [bp]
Ocean’s Eleven [o11]  | $w_{gc,o_{11}}$     | $w_{czj,o_{11}}$         | $w_{bp,o_{11}}$
Ocean’s Twelve [o12]  | $w_{gc,o_{12}}$     | $w_{czj,o_{12}}$         | $w_{bp,o_{12}}$

$$w_{actor_x, movie_y} = tf_{actor_x, movie_y} \cdot idf_{actor_x}$$

We can now compute the normalized scalar product (cosine) between the two vectors to get their similarity…

Semantic Vector Space Model (iii)
$$sim^{starring}(o_{12}, o_{11}) = \frac{w_{gc,o_{12}} \, w_{gc,o_{11}} + w_{czj,o_{12}} \, w_{czj,o_{11}} + w_{bp,o_{12}} \, w_{bp,o_{11}}}{\sqrt{w_{gc,o_{12}}^2 + w_{czj,o_{12}}^2 + w_{bp,o_{12}}^2} \cdot \sqrt{w_{gc,o_{11}}^2 + w_{czj,o_{11}}^2 + w_{bp,o_{11}}^2}}$$

…and then combine all the similarities for each property:

$$sim(o_{12}, o_{11}) = \alpha_{starring} \cdot sim^{starring}(o_{12}, o_{11}) + \alpha_{director} \cdot sim^{director}(o_{12}, o_{11}) + \alpha_{subject} \cdot sim^{subject}(o_{12}, o_{11})$$

Soon we will see how to compute the $\alpha_p$ coefficients.
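The per-property similarity and its weighted combination can be sketched as follows. This is only an illustration under several assumptions: the term frequency is taken to be binary (1 if the actor stars in the movie), the total number of movies is a made-up figure, the vectors are restricted to the three actors shown on the slide, and the per-property weights (called `alphas` here) are arbitrary. The movie counts per actor are the ones shown on the slide.

```python
import math

TOTAL_MOVIES = 1000  # hypothetical size of the movie catalogue
MOVIES_PER_ACTOR = {"gc": 38, "czj": 22, "bp": 35}  # counts shown on the slide
# Cast restricted to the three actors above.
CAST = {"o11": {"gc", "bp"}, "o12": {"gc", "czj", "bp"}}


def weight(actor, movie):
    """TF-IDF weight of an actor for a movie, with a binary tf (assumption)."""
    tf = 1.0 if actor in CAST[movie] else 0.0
    idf = math.log(TOTAL_MOVIES / MOVIES_PER_ACTOR[actor])
    return tf * idf


def sim_starring(m1, m2):
    """Cosine similarity of two movies along the starring property."""
    actors = list(MOVIES_PER_ACTOR)
    dot = sum(weight(a, m1) * weight(a, m2) for a in actors)
    norm1 = math.sqrt(sum(weight(a, m1) ** 2 for a in actors))
    norm2 = math.sqrt(sum(weight(a, m2) ** 2 for a in actors))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


def combined_similarity(per_property_sims, alphas):
    """Linear combination of per-property similarities."""
    return sum(alphas[p] * s for p, s in per_property_sims.items())


sims = {"starring": sim_starring("o12", "o11"), "director": 1.0}  # same director
alphas = {"starring": 0.5, "director": 0.5}  # illustrative weights
print(combined_similarity(sims, alphas))
```

Because the idf factor grows as an actor appears in fewer movies, sharing a rarely cast actor contributes more to the similarity than sharing a very prolific one.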

Ready for our first Content-based RS
Given a user profile, defined as:

$$profile(u) = \{\langle m_j, r_j \rangle \mid r_j = 1 \text{ if } u \text{ likes } m_j,\ r_j = -1 \text{ otherwise}\}$$

or as:

$$profile(u) = \{\langle m_j, r_j \rangle \mid r_j \dots\}$$

We predict the rating using a Nearest Neighbor Classifier (memory-based) where the similarity measure is a linear combination of local similarities:

$$r(u, m_i) = \frac{\sum_{m_j \in profile(u)} r_j \cdot \frac{\sum_p \alpha_p \cdot sim^p(m_j, m_i)}{P}}{|profile(u)|}$$

where $P$ is the number of properties.
[Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, Davide Romito, Markus Zanker. Linked Open Data to support Content-based Recommender Systems. 8th International Conference on Semantic Systems (I-SEMANTICS 2012) – best paper]
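A minimal sketch of this memory-based prediction is shown below, assuming binary ratings (+1 for liked, -1 otherwise) and precomputed per-property similarities. The data layout (`sims[p][(m_j, m_i)]`), the weight dictionary `alphas` and all numbers are hypothetical choices made for the example.

```python
def predict_rating(profile, candidate, sims, alphas):
    """Average, over the rated movies, of rating times combined similarity.

    profile: {movie: rating}, with rating +1 (liked) or -1 (not liked)
    sims:    {property: {(rated_movie, candidate): similarity}}
    alphas:  {property: weight}
    """
    if not profile:
        return 0.0
    n_properties = len(alphas)
    total = 0.0
    for movie, rating in profile.items():
        combined = sum(
            alphas[p] * sims[p].get((movie, candidate), 0.0) for p in alphas
        ) / n_properties
        total += rating * combined
    return total / len(profile)


# Toy data, invented for illustration.
profile = {"a": 1, "b": -1}
alphas = {"starring": 1.0, "director": 1.0}
sims = {
    "starring": {("a", "c"): 0.5, ("b", "c"): 0.0},
    "director": {("a", "c"): 1.0, ("b", "c"): 0.5},
}
print(predict_rating(profile, "c", sims, alphas))
```

A positive score means the candidate is closer to the liked movies than to the disliked ones.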

How do we compute the $\alpha_p$ coefficients?
We need to identify the best possible values for the coefficients $\alpha_p$, that is, the weights associated with each property. There are several ways to do that.
Depending on the nature of the user ratings (Likert or binary), we can treat rating prediction as a regression problem (linear regression) or as a classification problem (logistic regression), and minimize a loss function $J(\alpha)$.
In the former case we can minimize the least-squares loss, and in the latter case the cross-entropy loss. In both cases we can use gradient descent with learning rate $\eta$:

$$\alpha_p \leftarrow \alpha_p - \eta \, \frac{\partial J}{\partial \alpha_p}$$

Another possible approach is to use a genetic algorithm to minimize a non-smooth loss function, such as the number of misclassification errors.
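For the regression case, the gradient-descent update might be sketched as follows. It assumes a plain linear model in which each training row holds one similarity-derived value per property, and a mean least-squares loss. The learning rate, number of epochs and training data are illustrative choices rather than values from the talk.

```python
def gradient_descent(rows, targets, learning_rate=0.5, epochs=200):
    """Fit per-property weights by batch gradient descent on a least-squares loss.

    rows:    list of feature lists, one value per property
    targets: list of observed ratings
    """
    n_rows, n_props = len(rows), len(rows[0])
    weights = [0.0] * n_props
    for _ in range(epochs):
        # Prediction error for every training example.
        errors = [
            sum(w * x for w, x in zip(weights, row)) - target
            for row, target in zip(rows, targets)
        ]
        # Gradient of J = (1 / 2n) * sum(error^2) with respect to each weight.
        gradient = [
            sum(e * row[p] for e, row in zip(errors, rows)) / n_rows
            for p in range(n_props)
        ]
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


# Toy data generated from weights (0.7, 0.3), invented for illustration.
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [0.7, 0.3, 1.0]
learned = gradient_descent(rows, targets)
print(learned)
```

For binary ratings, the same loop would apply with a sigmoid on the prediction and the cross-entropy loss in place of least squares.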

A mobile content-based RS (memory-based)

Let’s go Mobile
(e.g., recommend movies in theaters)
[Vito Claudio Ostuni, Giosia Gentile, Tommaso Di Noia, Roberto Mirizzi, Davide Romito, Eugenio Di Sciascio. Mobile Movie Recommendations with Linked Data. Human-Computer Interaction & Knowledge Discovery @ CD-ARES’13 (HCI-KDD 2013)]
This time the user profile is context-dependent and is defined as:

$$profile(u, cmp) = \{\langle m_j, r_j \rangle \mid r_j = 1 \text{ if } u \text{ likes } m_j \text{ with companion } cmp,\ r_j = -1 \text{ otherwise}\}$$

The prediction is made of two parts, contextual pre-filtering and contextual post-filtering:

$$r(u, m_i, cmp) = \alpha_{preFilter} \cdot r_{preFilter}(u, m_i, cmp) + \alpha_{postFilter} \cdot r_{postFilter}(u)$$

$$r_{preFilter}(u, m_i, cmp) = \frac{\sum_{m_j \in profile(u, cmp)} r_j \cdot sim(m_j, m_i)}{|profile(u, cmp)|}$$

$$r_{postFilter}(u) = \frac{h + c + cl + ar + ap}{5}$$

h (hierarchy): 1 if the theater is in the same city, 0 otherwise
c (cluster): 1 if the theater is a multiplex, 0 otherwise
cl (co-location): 1 if the theater is close to other POIs, 0 otherwise
ar (association-rule): 1 if the ticket price is known, 0 otherwise
ap (anchor-point proximity): 1 if the theater is close to the user home or office, 0 otherwise
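The two-part prediction can be sketched as below. The function names, the equal weighting of the two parts and the toy similarity values are assumptions for illustration. The five binary indicators follow the definitions on the slide.

```python
def post_filter_score(h, c, cl, ar, ap):
    """Average of the five binary context indicators for a theater."""
    return (h + c + cl + ar + ap) / 5


def pre_filter_score(profile, candidate, sim):
    """Rating-weighted average similarity to movies rated with this companion.

    profile: {movie: rating}, with rating +1 (liked) or -1 (not liked)
    sim:     function (rated_movie, candidate) -> similarity
    """
    if not profile:
        return 0.0
    total = sum(rating * sim(movie, candidate) for movie, rating in profile.items())
    return total / len(profile)


def contextual_rating(pre_score, post_score, w_pre=0.5, w_post=0.5):
    """Weighted sum of the two parts; the weights here are illustrative."""
    return w_pre * pre_score + w_post * post_score


# Toy data, invented for illustration.
toy_sims = {("a", "c"): 0.8, ("b", "c"): 0.2}
profile_with_friends = {"a": 1, "b": -1}
pre = pre_filter_score(profile_with_friends, "c", lambda m, n: toy_sims[(m, n)])
post = post_filter_score(h=1, c=0, cl=1, ar=1, ap=0)
print(contextual_rating(pre, post))
```

Pre-filtering selects which ratings are used (only those given with the current companion), while post-filtering adjusts the score using properties of the theater.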

A content-based RS (model-based)

Time for a Model-based CB-RS
[Figure: Ocean’s Eleven [o11] and Ocean’s Twelve [o12] linked to feature values grouped by property. starring: George Clooney [gc], Catherine Z. Jones [czj], Brad Pitt [bp]; director: Steven Soderbergh [ss]; subject: 2000s crime films [2cf], Crime films [cf], American criminal comedy [acc]. Each link carries a weight $w_{f,o_{11}}$ or $w_{f,o_{12}}$ for feature $f$]
This time each item is represented by a feature vector, where each feature corresponds to a property value.
The user profile is defined as:

$$profile(u) = \{\langle m_j, r_j \rangle \mid r_j = 1 \text{ if } u \text{ likes } m_j,\ r_j = -1 \text{ otherwise}\}$$

[Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, Davide Romito. Exploiting the Web of Data in Model-based Recommender Systems. 6th ACM Conference on Recommender Systems (RecSys 2012)]

Training the system with an SVM classifier
[https://en.wikipedia.org/wiki/File:Svm_max_sep_hyperplane_with_margin.png]
Support Vector Machines (SVMs) are known to work well for text classification. Our problem of learning the user profile has a lot in common with it, such as the sparse nature of the feature vector and the high dimensionality of the input space.
Main advantages:
1. Feature selection is often not needed (SVMs are robust to over-fitting and scale up well)
2. No need to tune per-property weights as in the memory-based approach
We then fit a logistic model to the SVM output to obtain a ranked list of items.

Let’s continue with a Hybrid RS
[Vito Claudio Ostuni, Tommaso Di Noia, Eugenio Di Sciascio, Roberto Mirizzi. Top-N Recommendations from Implicit Feedback leveraging Linked Open Data. 7th ACM Conference on Recommender Systems (RecSys 2013)]
We want to recommend items i to user u, exploiting both the LOD knowledge base and other users’ interactions.
The ultimate goal of this recommender system is to rank in the top-N positions the items most likely to be relevant for the user, in the presence of implicit feedback.
Given the nature of the problem, the user profile is defined as:

$$profile(u) = \{i \mid i \text{ is relevant for } u\}$$

Path-based features
We define $x_{ui} \in \mathbb{R}^D$ as the feature vector encoding all the interactions between user u and item i. Each component of this vector represents the relevance score between u and i with respect to a particular feature, and is defined as:

$$x_{ui}(j) = \frac{\#path_{ui}(j)}{\sum_{d=1}^{D} \#path_{ui}(d)}$$

The paths can be content-based, collaborative or hybrid.
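As a small illustration of these path-based features, the sketch below counts the paths of each type found between a user and an item and normalizes the counts so that they sum to one. The path-type labels and the input format (one label per path found) are hypothetical. Enumerating the paths in the graph is outside the scope of the example.

```python
from collections import Counter


def path_features(paths_found, path_types):
    """Normalized path counts between one user and one item.

    paths_found: list with one path-type label per path found in the graph
    path_types:  ordered list of the D path types that define the features
    """
    counts = Counter(paths_found)
    total = sum(counts[t] for t in path_types)
    if total == 0:
        return [0.0] * len(path_types)
    return [counts[t] / total for t in path_types]


# Hypothetical path types and paths, invented for illustration.
PATH_TYPES = ["starring", "director", "collaborative"]
found = ["starring", "starring", "director", "collaborative"]
features = path_features(found, PATH_TYPES)
print(features)
```

The resulting vector could then be used as the input of a ranking function for the user-item pair.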

Learning the ranking function
To predict the ranking and form the top-N recommendation lists, we treat the task as a learning-to-rank problem and adopt a point-wise approach.
In particular, we use a combination of Random Forests and Gradient Boosted Regression Trees (GBRT).
